Associating additional data with a frame - webrtc

I am trying to associate additional data (server-generated ID, timestamp, 6dof “pose”) with frames being sent between two processes using gstreamer and webrtcbin. (I have control over both ends, browser interop is not important.)

I already have a data_channel set up between them, but that does not synchronize with the actual frames sent. I would be able to send the rest of the data over the data channel if I could get a 64-bit ID or timestamp with the frame (“re-assembling” the data from the two sources after the fact).
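A minimal sketch of that re-assembly step, assuming the 64-bit ID can be recovered on the receiver somehow. The `FrameMeta` layout, the 7-float pose, and the slot count are all illustrative assumptions rather than anything from the actual project. Metadata arriving on the data channel is parked in a small table keyed by frame ID and claimed when the matching frame is decoded:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record sent over the data channel for each frame. */
typedef struct {
    uint64_t frame_id;        /* the ID that also travels with the video frame */
    int64_t  server_time_ns;  /* server-side timestamp */
    float    pose[7];         /* assumed layout: position xyz + quaternion xyzw */
    bool     valid;
} FrameMeta;

#define META_SLOTS 64  /* assumption: more slots than frames ever in flight */

typedef struct {
    FrameMeta slots[META_SLOTS];
} MetaTable;

/* Called when a metadata message arrives on the data channel.
 * An older entry that maps to the same slot is overwritten. */
static void meta_table_put(MetaTable *t, const FrameMeta *m)
{
    FrameMeta *slot = &t->slots[m->frame_id % META_SLOTS];
    *slot = *m;
    slot->valid = true;
}

/* Called when a frame carrying frame_id comes out of the decoder.
 * Returns false if the metadata has not arrived (or was overwritten). */
static bool meta_table_take(MetaTable *t, uint64_t frame_id, FrameMeta *out)
{
    FrameMeta *slot = &t->slots[frame_id % META_SLOTS];
    if (!slot->valid || slot->frame_id != frame_id)
        return false;
    *out = *slot;
    slot->valid = false;
    return true;
}
```

Since the data channel and the media stream are not ordered relative to each other, a `false` return can simply mean the metadata is late, so the caller may want to hold the frame briefly and retry.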

I’m not sure the best way to do this. I’m already populating a timestamp on the server end, but can’t figure out how to get that out on the receiving end. I am not sure if my use of hardware decoders on the receiver is complicating things there. For reference, here’s where the server side is pushing that timestamp: src/xrt/auxiliary/gstreamer/gst_sink.c · main · Monado / Monado · GitLab

Any advice would be greatly appreciated. I had heard suggestions for or started looking at the following:

  • RTP header extensions (would it be a custom one?)
  • Some kind of custom buffer muxed into RTP?
  • H264 custom user data (less preferable since it locks us to h264/265) GstVideo SEI Unregistered User Data

I assume I’m on the very edge of common usages, because finding examples of some of these is a little tricky. However, example code would be very helpful if anyone knows some that can be linked.



What we did in a project for this purpose was putting additional data into a custom H264 SEI on the GStreamer side (via a small custom element that is placed between encoder and payloader, in case of webrtcsink this can be done via the request-encoded-filter signal), and on the browser side extracted it via the Insertable Streams API.

I don’t think it’s possible to access the RTP packets in a web browser, so using an RTP header extension (or even a custom payload format) is out of the question unless you only need to have it work between non-browser peers. If that’s the case you can do whatever you want, obviously.


If you decide to use Unregistered User Data SEI messages, the injection has to be done manually with an element or a pad probe after the parser. We are working on adding support in the parser to do it using the attached VideoSEIUserDataUnregisteredMeta of the incoming buffers so that it’s easier to inject SEI messages simply by attaching this meta to a buffer.

Here is a quick example from memory of how to do it; I’m not sure it compiles, though.

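GstPadProbeReturn example_inject_sei_cb (GstPad* pad, GstPadProbeInfo* info, gpointer user_data)
{
    GstBuffer* buffer;
    GstBuffer* new_buffer;
    GstH264SEIMessage sei_msg;
    GstH264UserDataUnregistered* udu;
    GstMemory* sei_memory;
    GArray* sei_data;

    buffer = GST_PAD_PROBE_INFO_BUFFER(info);

    /* Describe a single "user data unregistered" SEI message. */
    memset(&sei_msg, 0, sizeof(GstH264SEIMessage));
    sei_msg.payloadType = GST_H264_SEI_USER_DATA_UNREGISTERED;
    udu = &sei_msg.payload.user_data_unregistered;
    /* uuid is a 16-byte array, so it has to be copied rather than assigned. */
    memcpy(udu->uuid, /* 16-byte UUID for your custom messages */, sizeof(udu->uuid));
    udu->data = /* Serialized JSON data or any other format */;
    udu->size = /* Size of the data */;

    /* Build the SEI NAL in AVC format with a 4-byte NAL length prefix. */
    sei_data = g_array_new(FALSE, FALSE, sizeof(sei_msg));
    g_array_append_vals(sei_data, &sei_msg, 1);
    sei_memory = gst_h264_create_sei_memory_avc(4, sei_data);
    g_array_unref(sei_data);
    if (sei_memory == NULL)
        return GST_PAD_PROBE_OK;

    /* m_parser is a GstH264NalParser* created elsewhere (for example a member of the owning object). */
    new_buffer = gst_h264_parser_insert_sei_avc(m_parser, 4, buffer, sei_memory);
    gst_memory_unref(sei_memory);
    if (new_buffer != NULL) {
        /* Swap the probed buffer for the copy that contains the SEI and drop the old one. */
        gst_buffer_unref(buffer);
        info->data = new_buffer;
        info->size = gst_buffer_get_size(new_buffer);
    }

    return GST_PAD_PROBE_OK;
}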
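For the data and size fields of the SEI message above, any serialization works. As a rough sketch of a compact binary alternative to JSON, the following packs a frame ID, a timestamp, and a pose into a fixed 44-byte big-endian blob. The `FramePayload` struct, the 7-float pose layout, and the field order are assumptions made up for this illustration, and it assumes `float` is a 32-bit IEEE 754 value on both ends:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POSE_FLOATS 7  /* assumed: position xyz + orientation quaternion xyzw */
#define FRAME_PAYLOAD_SIZE (8 + 8 + POSE_FLOATS * 4)

/* Hypothetical per-frame payload. */
typedef struct {
    uint64_t frame_id;
    uint64_t timestamp_ns;
    float    pose[POSE_FLOATS];
} FramePayload;

/* Write the low nbytes of v to p, most significant byte first. */
static void put_be(uint8_t *p, uint64_t v, int nbytes)
{
    for (int i = 0; i < nbytes; i++)
        p[i] = (uint8_t)(v >> (8 * (nbytes - 1 - i)));
}

/* Read nbytes from p as a big-endian unsigned integer. */
static uint64_t get_be(const uint8_t *p, int nbytes)
{
    uint64_t v = 0;
    for (int i = 0; i < nbytes; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Serialize the payload; out must hold FRAME_PAYLOAD_SIZE bytes. */
static void frame_payload_pack(const FramePayload *in, uint8_t *out)
{
    put_be(out, in->frame_id, 8);
    put_be(out + 8, in->timestamp_ns, 8);
    for (int i = 0; i < POSE_FLOATS; i++) {
        uint32_t bits;
        memcpy(&bits, &in->pose[i], sizeof bits);  /* copy the raw float bits */
        put_be(out + 16 + 4 * i, bits, 4);
    }
}

/* Inverse of frame_payload_pack, for the receiving side. */
static void frame_payload_unpack(const uint8_t *in, FramePayload *out)
{
    out->frame_id = get_be(in, 8);
    out->timestamp_ns = get_be(in + 8, 8);
    for (int i = 0; i < POSE_FLOATS; i++) {
        uint32_t bits = (uint32_t)get_be(in + 16 + 4 * i, 4);
        memcpy(&out->pose[i], &bits, sizeof bits);
    }
}
```

The packed buffer would be what `udu->data` points at, with `udu->size` set to `FRAME_PAYLOAD_SIZE`. A fixed layout keeps the SEI small, at the cost of needing a version field or a new UUID if the format ever changes.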

I am working on exactly this now, so I have some additional questions.

“We are working on adding support in the parser to do it using the attached VideoSEIUserDataUnregisteredMeta of the incoming buffers”

This would be why invoking “gst_buffer_add_video_sei_user_data_unregistered_meta” on a buffer seems to have no effect, right?

“the injection has to be done manually with an element or a pad probe after the parser.”

I have used two approaches: first, a funnel that sends an H.264 (or H.265) SEI NALU which I join with the video; second, an appsink/appsrc combination where I append the SEI NALU to the other NALUs traveling through the pipeline.

However when using H264Parse or H265Parse I get a TON of warnings.
“Too small payload size 10
0:01:55.570267000 1816 0x60000132de00 WARN h264parse gsth264parse.c:638:gst_h264_parse_process_sei: failed to parse one or more SEI message”
but I can extract the NALU no problem on the other side and I can see the video flowing into a player downstream from the h26(5/4)parse.

I have encoded the SEI data with emulation prevention bytes, my NALU uses the correct identifiers (3x0 +1), and my SEI is correctly identified when parsing the NALUs downstream in an appsink.
Is this expected at this point?
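For comparison, here is a rough sketch of how an H.264 user_data_unregistered SEI NAL (payload type 5) can be assembled by hand. In the H.264 syntax that payload begins with a 16-byte UUID, so a payload size of 10 is too short to hold one, which may be what the “Too small payload size” warning is reporting; I have not checked that against the parser source. The function names are made up for this example, the output has no start code or length prefix, and H.265 (which uses a two-byte NAL header) is not covered:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy src to dst, inserting an emulation prevention byte (0x03) whenever
 * two zero bytes would be followed by a byte <= 0x03.
 * Returns the number of bytes written, or 0 if dst is too small. */
static size_t write_escaped(uint8_t *dst, size_t dst_cap,
                            const uint8_t *src, size_t n)
{
    size_t o = 0;
    int zeros = 0;
    for (size_t i = 0; i < n; i++) {
        if (zeros >= 2 && src[i] <= 0x03) {
            if (o >= dst_cap) return 0;
            dst[o++] = 0x03;
            zeros = 0;
        }
        if (o >= dst_cap) return 0;
        dst[o++] = src[i];
        zeros = (src[i] == 0x00) ? zeros + 1 : 0;
    }
    return o;
}

/* Build one H.264 SEI NAL holding a user_data_unregistered message.
 * Returns the NAL size in bytes, or 0 on error. */
static size_t build_sei_udu_nal(const uint8_t uuid[16],
                                const uint8_t *data, size_t data_len,
                                uint8_t *out, size_t out_cap)
{
    uint8_t rbsp[512];
    size_t n = 0;
    size_t rest = 16 + data_len;  /* payload size counts the UUID too */

    if (data_len > 256 || out_cap < 1)
        return 0;                 /* keeps the unescaped RBSP within rbsp[] */

    rbsp[n++] = 0x05;             /* payloadType 5: user_data_unregistered */
    while (rest >= 255) {         /* payloadSize: 0xFF bytes, then remainder */
        rbsp[n++] = 0xFF;
        rest -= 255;
    }
    rbsp[n++] = (uint8_t)rest;
    memcpy(rbsp + n, uuid, 16);
    n += 16;
    if (data_len > 0)
        memcpy(rbsp + n, data, data_len);
    n += data_len;
    rbsp[n++] = 0x80;             /* rbsp_trailing_bits: stop bit + padding */

    out[0] = 0x06;                /* NAL header: nal_ref_idc 0, type 6 (SEI) */
    size_t esc = write_escaped(out + 1, out_cap - 1, rbsp, n);
    return esc ? esc + 1 : 0;
}
```

The caller still has to prepend either an Annex B start code or a NAL length prefix, depending on the stream-format negotiated with the parser.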

Yeah, I do not care about web browsers in this case: just using WebRTC for the neatly packaged “everything else” solution between two gstreamer-powered things. (I mean on some level it would be nice to have the client run in a browser instead of a native app but that seems unlikely to be able to achieve the same goals)

Right now I’m primarily concerned with a proof of concept, so definitely not concerned about browsers.

fwiw, you can use a similar mechanism as the one @slomo mentioned to get called back before WebRTCSrc hands off the incoming frame to the decoder by connecting to this signal.


That’s the reason: right now, the parser does not handle incoming VideoSEIUserDataUnregisteredMeta yet.

SEI NAL units must appear before any VCL NAL within the AU; you can’t append them just anywhere in the stream. Our approach does the injection after h264parse with alignment=au, which outputs a single buffer containing all the NALs of the AU. The injection is done with gst_h264_parser_insert_sei_avc, which takes care of inserting the SEI NAL in the correct position.
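To illustrate the ordering constraint only (gst_h264_parser_insert_sei_avc already handles this for you), here is a small sketch that scans an AU in AVC format with 4-byte length prefixes and returns the offset of the first VCL NAL (nal_unit_type 1 to 5), which is where an SEI NAL would have to go at the latest. The function name is made up for this example and it is not how the GStreamer helper is implemented:

```c
#include <stddef.h>
#include <stdint.h>

/* Scan a length-prefixed (AVC format, 4-byte big-endian lengths) H.264 AU.
 * Returns the byte offset of the length prefix of the first VCL NAL,
 * au_len if the AU contains no VCL NAL, or SIZE_MAX if it is malformed. */
static size_t find_sei_insert_offset(const uint8_t *au, size_t au_len)
{
    size_t pos = 0;

    while (pos + 4 <= au_len) {
        uint32_t nal_len = ((uint32_t)au[pos] << 24) |
                           ((uint32_t)au[pos + 1] << 16) |
                           ((uint32_t)au[pos + 2] << 8) |
                           (uint32_t)au[pos + 3];

        /* The NAL must be non-empty and fit inside the remaining bytes. */
        if (nal_len == 0 || nal_len > au_len - pos - 4)
            return SIZE_MAX;

        /* The low five bits of the first NAL byte are nal_unit_type. */
        uint8_t nal_type = au[pos + 4] & 0x1F;
        if (nal_type >= 1 && nal_type <= 5)
            return pos;  /* first coded slice: SEI must come before this */

        pos += 4 + (size_t)nal_len;
    }

    /* Leftover bytes that cannot form a length prefix mean a broken AU. */
    return pos == au_len ? au_len : SIZE_MAX;
}
```

Inserting the SEI at the returned offset places it after any AUD, SPS, and PPS NALs and before the first slice.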
