Horribly confused about caps

I don’t understand if I’m just dumb or what is going on.

My receiver pipeline is

udpsrc port=%d buffer-size=4194304 caps="application/x-rtp, media=video, encoding-name=H264, payload=96" ! rtph264depay ! h264parse config-interval=1 ! amcviddec-c2qtiavcdecoderlowlatency ! appsink name=unity emit-signals=true

The caps for amcviddec-c2qtiavcdecoderlowlatency are

Src caps: 
video/x-raw(memory:GLMemory), format=(string)RGBA, texture-target=(string)external-oes;

video/x-raw, format=(string)I420, width=(int)[ 1, 2147483647 ], height=(int)[ 1, 2147483647 ], framerate=(fraction)[ 0/1, 2147483647/1 ];

video/x-raw, format=(string)NV12, width=(int)[ 1, 2147483647 ], height=(int)[ 1, 2147483647 ], framerate=(fraction)[ 0/1, 2147483647/1 ];

video/x-raw, format=(string)P010_10LE, width=(int)[ 1, 2147483647 ], height=(int)[ 1, 2147483647 ], framerate=(fraction)[ 0/1, 2147483647/1 ]
Sink caps: 
video/x-h264, width=(int)[ 16, 4096 ], height=(int)[ 16, 4096 ], framerate=(fraction)[ 0/1, 2147483647/1 ], parsed=(boolean)true, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)constrained-high;

video/x-h264, width=(int)[ 16, 4096 ], height=(int)[ 16, 4096 ], framerate=(fraction)[ 0/1, 2147483647/1 ], parsed=(boolean)true, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)high;

video/x-h264, width=(int)[ 16, 4096 ], height=(int)[ 16, 4096 ], framerate=(fraction)[ 0/1, 2147483647/1 ], parsed=(boolean)true, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)main;

video/x-h264, width=(int)[ 16, 4096 ], height=(int)[ 16, 4096 ], framerate=(fraction)[ 0/1, 2147483647/1 ], parsed=(boolean)true, stream-format=(string)byte-stream, alignment=(string)au, profile=(string)constrained-baseline;

video/x-h264, width=(int)[ 16, 4096 ], height=(int)[ 16, 4096 ], framerate=(fraction)[ 0/1, 2147483647/1 ], parsed=(boolean)true, stream-format=(str (cut off for unknown reasons but not important)

If I run the pipeline as-is, it negotiates NV12 with appsink, which is not what I want; I want GLMemory. I updated my pipeline as follows:

udpsrc port=%d buffer-size=4194304 caps="application/x-rtp, media=video, encoding-name=H264, payload=96" ! rtph264depay ! h264parse config-interval=1 ! amcviddec-c2qtiavcdecoderlowlatency ! video/x-raw(memory:GLMemory) ! appsink name=unity emit-signals=true

which should only allow GLMemory to flow from the decoder. Now it no longer negotiates:

[GStreamer] upstream tags: taglist, video-codec=(string)"H.264\ \(Main\ Profile\)";
[GStreamer] creating caps event video/x-raw, format=(string)NV12, width=(int)320, height=(int)240, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, colorimetry=(string)bt601, framerate=(fraction)30/1
[GStreamer] caps video/x-raw, format=(string)NV12, width=(int)320, height=(int)240, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, colorimetry=(string)bt601, framerate=(fraction)30/1 not accepted
[GStreamer] caps video/x-raw, format=(string)NV12, width=(int)320, height=(int)240, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, colorimetry=(string)bt601, framerate=(fraction)30/1 not accepted
[GStreamer] could not send sticky events
[GStreamer] caps video/x-raw, format=(string)NV12, width=(int)320, height=(int)240, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, colorimetry=(string)bt601, framerate=(fraction)30/1 not accepted
[GStreamer] caps video/x-raw, format=(string)NV12, width=(int)320, height=(int)240, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, colorimetry=(string)bt601, framerate=(fraction)30/1 not accepted
[GStreamer] First buffer since flush took 0:00:03.545317437 to produce
[GStreamer] caps video/x-raw, format=(string)NV12, width=(int)320, height=(int)240, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, colorimetry=(string)bt601, framerate=(fraction)30/1 not accepted
[GStreamer] error: Internal data stream error.
[GStreamer] error: streaming stopped, reason not-negotiated (-4)
[GStreamer] posting message: Internal data stream error.
[GStreamer] posted error message: Internal data stream error.
[GStreamer] caps video/x-raw, format=(string)NV12, width=(int)320, height=(int)240, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, colorimetry=(string)bt601, framerate=(fraction)30/1 not accepted
[GStreamer] caps video/x-raw, format=(string)NV12, width=(int)320, height=(int)240, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, colorimetry=(string)bt601, framerate=(fraction)30/1 not accepted
[GStreamer] caps video/x-raw, format=(string)NV12, width=(int)320, height=(int)240, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, colorimetry=(string)bt601, framerate=(fraction)30/1 not accepted

Which is extremely unhelpful. The decoder can output GLMemory, so there should be no problem. The NV12 format and the small width in these logs come from the videotestsrc sender, but I don’t see how that’s relevant to anything going on here. The caps on appsink should accept anything, so I don’t see why there should be an issue.

Why the heck won’t it accept my GLMemory caps? What information about caps am I missing?

The decoder will only enable GL if the application has shared a GL context. The code to create an internal context is commented out, for reasons I don’t know.

Do you reply to the GL-specific bus messages in your app?

The code I’m referring to:

I think so but I’m not at all confident that it’s working correctly. I’m not good enough at OpenGL to figure out what’s working and what isn’t and it’s made harder due to it running on a Quest 3 headset and any debugging tools seem to be lacking or non existent on Linux.

static gboolean
sync_bus_call (GstBus *bus, GstMessage *msg, gpointer data)
{
  switch (GST_MESSAGE_TYPE (msg)) {
    case GST_MESSAGE_NEED_CONTEXT:
    {
      const gchar *context_type;
      GstContext *context = NULL;
     
      gst_message_parse_context_type (msg, &context_type);
      Logger::logFormat("Got need context %s", context_type);

      if (g_strcmp0 (context_type, "gst.gl.app_context") == 0) {
        GstGLContext *gl_context = sharedContext;
        GstStructure *s;

        Logger::log("Setting context...");
        context = gst_context_new ("gst.gl.app_context", TRUE);
        Logger::log("Made new context");
        s = gst_context_writable_structure (context);
        Logger::log("Made writable context");
        gst_structure_set (s, "context", GST_TYPE_GL_CONTEXT, gl_context, NULL);
        Logger::log("Set GL context");
        gst_element_set_context (GST_ELEMENT (msg->src), context);
        Logger::log("Set context");
      }
      if (context) {
        gst_context_unref (context);
      }
      break;
    }
    default:
      break;
  }

  return FALSE;
}

I create and activate the context and such in a different function

void GLInit() {
  g_unityDisplay = eglGetCurrentDisplay();
  g_unityContext = eglGetCurrentContext();

  Logger::logFormat("Unity context %p", g_unityContext);
  Logger::logFormat("Unity display %p", g_unityDisplay);

  GstGLDisplayEGL *display = gst_gl_display_egl_new_with_egl_display(g_unityDisplay);

  sharedContext = gst_gl_context_new_wrapped(
    GST_GL_DISPLAY_CAST(display),
    (guintptr)g_unityContext,
    GST_GL_PLATFORM_EGL,
    GST_GL_API_GLES2
  );

  gst_gl_context_activate(sharedContext, TRUE);
  GError *error = NULL;
  if (!gst_gl_context_fill_info (sharedContext, &error)) {
    Logger::logFormat("Failed to retrieve context info: %s",
        error ? error->message : "unknown error");
    g_clear_error (&error);
    gst_gl_context_activate (sharedContext, FALSE);
    return;
  }

  gst_gl_context_activate (sharedContext, FALSE);

  Logger::log("GL has been init successfully.");
}

I have another thread going at Shared context failing to fill info - #3 by bluewave41 relating to my Unity context sharing struggles. I can’t find any good examples of this that work correctly.

I can’t tell if there is anything else to be done. Some helpful data points would be to test similar code on pure Android, and to test whether it somehow depends on glimagesink.

Probably a better idea. Let me ask a different question about caps, though.

Take this pipeline

udpsrc port=%d buffer-size=4194304 caps="application/x-rtp, media=video, encoding-name=H264, payload=96" ! rtph264depay ! h264parse config-interval=1 ! amcviddec-c2qtiavcdecoderlowlatency ! video/x-raw(memory:GLMemory) ! glcolorconvert ! video/x-raw(memory:GLMemory),format=RGBA,texture-format=2D ! appsink name=unity emit-signals=true

I believe that since https://gstreamer.freedesktop.org/documentation/opengl/glcolorconvert.html?gi-language=c lists texture-target: { (string)2D, (string)rectangle, (string)external-oes } on both the sink and src pads, it’s capable of converting an OES texture to 2D. When I run my pipeline everything links together, but

    GstBuffer *buffer = gst_sample_get_buffer(instance->sample);
    GstMemory *mem = gst_buffer_peek_memory(buffer, 0);
    if (!gst_is_gl_memory(mem)) {
      Logger::log("Not GL memory.");
      return;
    }
    // Cast is safe after the gst_is_gl_memory() check above.
    GstGLMemory *glMem = GST_GL_MEMORY_CAST(mem);
    GstGLTextureTarget target = gst_gl_memory_get_texture_target(glMem);
    Logger::logFormat("srcTex target = 0x%x", target);

This prints that the target is 0x3, which https://gstreamer.freedesktop.org/documentation/gl/gstgl_enums.html?gi-language=c#GstGLTextureTarget shows is still an OES texture.

Based on your knowledge does what I’m thinking make sense or am I missing something relating to forcing the output I want?

It’s possible that the serialization got confused. It’s a shot in the dark, but try:

...,texture-target=(string)2D

That did it. So I should just be explicit about everything in the future.

I’ve fixed up my context sharing and everything and I have it working on the headset now. Time for optimizing and such to make it even better. Thanks for all your insights,


A bit of context: since caps fields are not pre-declared, the field’s type isn’t known by the deserializer, so it infers the type from the content. In this case I was tipped off by the fact that the first character is a number. A similar issue happens if you are not explicit with the H.264/5/6 level field.