Associating additional data with a frame - webrtc

rpavlik · October 2, 2023, 10:15pm

I am trying to associate additional data (server-generated ID, timestamp, 6dof “pose”) with frames being sent between two processes using gstreamer and webrtcbin. (I have control over both ends, browser interop is not important.)

I already have a data_channel set up between them, but that does not synchronize with the actual frames sent. I would be able to send the rest of the data over the data channel if I could get a 64-bit ID or timestamp with the frame (“re-assembling” the data from the two sources after the fact).
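
A minimal sketch of the “re-assembling” step described above, assuming each data channel message carries the same 64-bit ID that arrives with the video frame. The FrameInfo layout, the table size, and the function names are all made up for illustration; none of this is GStreamer API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical side-channel record; the field layout is an assumption. */
typedef struct {
    uint64_t frame_id;     /* ID that also travels with the video frame */
    int64_t  timestamp_ns; /* server-side timestamp */
    float    pose[7];      /* position xyz + orientation quaternion xyzw */
    bool     used;         /* slot currently holds a pending record */
} FrameInfo;

#define PENDING_SLOTS 64

typedef struct {
    FrameInfo slots[PENDING_SLOTS];
} PendingTable;

/* Store a record that arrived over the data channel. An older record that
 * maps to the same slot is overwritten. */
static void pending_put(PendingTable *t, const FrameInfo *info)
{
    assert(t != NULL && info != NULL);
    FrameInfo *slot = &t->slots[info->frame_id % PENDING_SLOTS];
    *slot = *info;
    slot->used = true;
}

/* Called when a frame tagged with frame_id shows up. On a match the record
 * is copied to out, the slot is released, and true is returned. */
static bool pending_take(PendingTable *t, uint64_t frame_id, FrameInfo *out)
{
    assert(t != NULL && out != NULL);
    FrameInfo *slot = &t->slots[frame_id % PENDING_SLOTS];
    if (!slot->used || slot->frame_id != frame_id)
        return false;
    *out = *slot;
    slot->used = false;
    return true;
}
```

The fixed-size table is a deliberate simplification: with monotonically increasing IDs it keeps roughly the last 64 records, and a frame whose record was dropped or has not arrived yet simply fails the lookup.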

I’m not sure of the best way to do this. I’m already populating a timestamp on the server end, but I can’t figure out how to get it out on the receiving end. I am not sure whether my use of hardware decoders on the receiver is complicating things. For reference, here’s where the server side is pushing that timestamp: src/xrt/auxiliary/gstreamer/gst_sink.c · main · Monado / Monado · GitLab

Any advice would be greatly appreciated. I had heard suggestions for or started looking at the following:

RTP header extensions (would it be a custom one?)
Some kind of custom buffer muxed into RTP?
H264 custom user data (less preferable since it locks us to h264/265): GstVideo SEI Unregistered User Data

I assume I’m on the very edge of common usages, because finding examples of some of these is a little tricky. However, example code would be very helpful if anyone knows some that can be linked.

Thanks!

slomo · October 3, 2023, 5:39am

What we did in a project for this purpose was putting additional data into a custom H264 SEI on the GStreamer side (via a small custom element that is placed between encoder and payloader, in case of webrtcsink this can be done via the request-encoded-filter signal), and on the browser side extracted it via the Insertable Streams API.

I don’t think it’s possible to access the RTP packets in a web browser, so using an RTP header extension (or even a custom payload format) is out of the question unless you only need it to work between non-browser peers. If that’s the case you can do whatever you want, obviously.

ylatuya · October 3, 2023, 8:37am

Hi,

If you decide to use Unregistered User Data SEI messages, the injection has to be done manually with an element or a pad probe after the parser. We are working on adding support in the parser to do it using the attached VideoSEIUserDataUnregisteredMeta of the incoming buffers so that it’s easier to inject SEI messages simply by attaching this meta to a buffer.

Here is a quick example from memory on how to do it, not sure it compiles though

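/* m_parser is a GstH264NalParser owned by the caller (e.g. created with gst_h264_nal_parser_new()) */
GstPadProbeReturn example_inject_sei_cb (GstPad* pad, GstPadProbeInfo* info, gpointer user_data)
{
    GstBuffer* buffer;
    GstBuffer* new_buffer;
    GstH264SEIMessage sei_msg;
    GstH264UserDataUnregistered* udu;
    GstMemory* sei_memory;
    GArray* sei_data;

    buffer = GST_PAD_PROBE_INFO_BUFFER(info);

    /* Describe a single user_data_unregistered SEI message */
    memset(&sei_msg, 0, sizeof(GstH264SEIMessage));
    sei_msg.payloadType = GST_H264_SEI_USER_DATA_UNREGISTERED;
    udu = &sei_msg.payload.user_data_unregistered;
    /* uuid is a 16-byte array, so it has to be copied rather than assigned */
    memcpy(udu->uuid, /* UUID for your custom messages */, 16);
    udu->data = /* Serialized JSON data or any other format */;
    udu->size = /* Size of the data */;

    /* Serialize the message into an SEI NAL with a 4-byte length prefix */
    sei_data = g_array_new(FALSE, FALSE, sizeof(sei_msg));
    g_array_append_vals(sei_data, &sei_msg, 1);
    sei_memory = gst_h264_create_sei_memory_avc(4, sei_data);
    g_array_unref(sei_data);

    /* Returns a copy of the AU with the SEI NAL inserted at the right position */
    new_buffer = gst_h264_parser_insert_sei_avc(m_parser, 4, buffer, sei_memory);
    /* The SEI memory is not consumed by the call above, so release it here */
    gst_memory_unref(sei_memory);
    if (new_buffer != NULL)
    {
        /* Swap the buffer carried by the probe for the one containing the SEI */
        info->data = new_buffer;
        info->size = gst_buffer_get_size(new_buffer);
        gst_buffer_unref(buffer);
    }

    return GST_PAD_PROBE_OK;
}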
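
For the “serialized data” placeholder in the example above, here is one possible sketch of packing a 64-bit frame ID and a 64-bit timestamp into a byte buffer. The 16-byte layout, the big-endian byte order, and the names frame_tag_pack and FRAME_TAG_SIZE are assumptions chosen for illustration, not anything defined by GStreamer or H.264.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_TAG_SIZE 16 /* 8-byte ID followed by 8-byte timestamp */

/* Write a 64-bit value in big-endian byte order. */
static void put_be64(uint8_t *dst, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        dst[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Serialize a hypothetical frame tag into out.
 * Returns the number of bytes written, or 0 if out is too small. */
static size_t frame_tag_pack(uint8_t *out, size_t out_size,
                             uint64_t frame_id, uint64_t timestamp_ns)
{
    assert(out != NULL);
    if (out_size < FRAME_TAG_SIZE)
        return 0;
    put_be64(out, frame_id);
    put_be64(out + 8, timestamp_ns);
    return FRAME_TAG_SIZE;
}
```

With this layout, udu->data would point at the packed buffer and udu->size would be 16. Writing the bytes explicitly avoids depending on struct padding or host endianness on either end.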

lennel · October 3, 2023, 1:38pm

I am working on exactly this now, so I have some additional questions.

“We are working on adding support in the parser to do it using the attached VideoSEIUserDataUnregisteredMeta of the incoming buffers”

This would be why invoking “gst_buffer_add_video_sei_user_data_unregistered_meta” on a buffer seems to have no effect, right?

“the injection has to be done manually with an element or a pad probe after the parser.”

I have used two approaches: first, a funnel that sends an H264 (or H265) SEI NALU which I join with the video, and second, an appsink/appsrc combination where I append the NALU to the other NALUs traveling through the pipeline.

However, when using h264parse or h265parse I get a TON of warnings:
“Too small payload size 10
0:01:55.570267000 1816 0x60000132de00 WARN h264parse gsth264parse.c:638:gst_h264_parse_process_sei: failed to parse one or more SEI message”
but I can extract the NALU with no problem on the other side, and I can see the video flowing into a player downstream of h264parse/h265parse.

I have encoded the SEI data with emulation prevention bytes, my NALU uses the correct identifiers (3x0 +1), and my SEI is correctly identified when parsing the NALUs downstream in an appsink.
Is this expected at this point?
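
For reference, a sketch of hand-building such a NALU body. In the H.264 spec, a user_data_unregistered payload (payload type 5) always begins with a 16-byte UUID, so a payload size of 10 is too short to contain one; that may be what the “Too small payload size” warning is about, though this is a guess from the message text. The function names below are hypothetical, and the sketch leaves out the NAL header byte and the start code or length prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build the RBSP of an SEI NAL carrying one user_data_unregistered message:
 * payload_type (5), payload_size, 16-byte UUID, user bytes, trailing bits.
 * Returns the number of bytes written, or 0 if out is too small. */
static size_t build_udu_sei_rbsp(const uint8_t uuid[16],
                                 const uint8_t *data, size_t data_size,
                                 uint8_t *out, size_t out_size)
{
    assert(uuid != NULL && out != NULL);
    size_t payload_size = 16 + data_size;        /* the UUID is part of the payload */
    size_t size_bytes = payload_size / 255 + 1;  /* 0xFF run plus one final byte */
    if (1 + size_bytes + payload_size + 1 > out_size)
        return 0;

    size_t o = 0;
    out[o++] = 5;                                /* user_data_unregistered */
    size_t remaining = payload_size;
    while (remaining >= 255) {
        out[o++] = 0xFF;
        remaining -= 255;
    }
    out[o++] = (uint8_t)remaining;
    memcpy(out + o, uuid, 16);
    o += 16;
    if (data_size > 0)
        memcpy(out + o, data, data_size);
    o += data_size;
    out[o++] = 0x80;                             /* rbsp_trailing_bits */
    return o;
}

/* Insert an emulation prevention byte (0x03) wherever two zero bytes would
 * otherwise be followed by a byte in the range 0x00..0x03.
 * Returns the escaped size, or 0 if out is too small. */
static size_t add_emulation_prevention(const uint8_t *in, size_t in_size,
                                       uint8_t *out, size_t out_size)
{
    assert(in != NULL && out != NULL);
    size_t o = 0;
    int zeros = 0;
    for (size_t i = 0; i < in_size; i++) {
        if (zeros >= 2 && in[i] <= 0x03) {
            if (o >= out_size)
                return 0;
            out[o++] = 0x03;
            zeros = 0;
        }
        if (o >= out_size)
            return 0;
        out[o++] = in[i];
        zeros = (in[i] == 0x00) ? zeros + 1 : 0;
    }
    return o;
}
```

Escaping is applied to the finished RBSP, after the payload size has been written, so the size field counts unescaped bytes.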

rpavlik · October 3, 2023, 3:28pm

Yeah, I do not care about web browsers in this case: just using WebRTC for the neatly packaged “everything else” solution between two gstreamer-powered things. (I mean on some level it would be nice to have the client run in a browser instead of a native app but that seems unlikely to be able to achieve the same goals)

Right now I’m primarily concerned with a proof of concept, so definitely not concerned about browsers.

fengalin · October 4, 2023, 10:20am

fwiw, you can use a mechanism similar to the one @slomo mentioned to get called back before WebRTCSrc hands off the incoming frame to the decoder by connecting to this signal.

ylatuya · October 5, 2023, 10:54am

That’s the reason: right now, the parser does not yet handle incoming VideoSEIUserDataUnregisteredMeta.

SEI NAL units must appear before any VCL NAL within the AU, so you can’t append them just anywhere in the stream. Our approach does the injection after h264parse with alignment=au, which outputs a single buffer containing all the NALs of the AU. The injection is done with gst_h264_parser_insert_sei_avc, which takes care of inserting the SEI NAL in the correct position.
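
To illustrate the ordering constraint, here is a rough sketch that walks an AU made of length-prefixed (AVC-style) NAL units and reports where the first VCL NAL (nal_unit_type 1 to 5) starts, which is the latest point at which an SEI could go. find_sei_insert_offset is a hypothetical helper, not a GStreamer function; in a real pipeline gst_h264_parser_insert_sei_avc already does the placement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scan an access unit of length-prefixed NAL units. On success, stores in
 * offset_out the byte offset of the first VCL NAL (types 1..5), or size if
 * the AU contains none. Returns false if the data is malformed. */
static bool find_sei_insert_offset(const uint8_t *au, size_t size,
                                   unsigned nal_length_size, size_t *offset_out)
{
    assert(au != NULL && offset_out != NULL);
    assert(nal_length_size >= 1 && nal_length_size <= 4);

    size_t pos = 0;
    while (pos < size) {
        if (size - pos < nal_length_size)
            return false;                 /* truncated length prefix */

        /* The length prefix is big-endian. */
        size_t nal_size = 0;
        for (unsigned i = 0; i < nal_length_size; i++)
            nal_size = (nal_size << 8) | au[pos + i];

        size_t payload = pos + nal_length_size;
        if (nal_size == 0 || nal_size > size - payload)
            return false;                 /* empty or truncated NAL */

        /* nal_unit_type is the low 5 bits of the NAL header byte. */
        unsigned nal_type = au[payload] & 0x1F;
        if (nal_type >= 1 && nal_type <= 5) {
            *offset_out = pos;
            return true;
        }
        pos = payload + nal_size;
    }

    *offset_out = size;                   /* no VCL NAL found */
    return true;
}
```

For an AU laid out as AUD, SPS, IDR slice with 4-byte prefixes, the reported offset is the start of the IDR slice's length prefix.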

galaxycatto · July 4, 2024, 2:13pm

Currently trying this out. I have custom H264 SEI being sent with video frames on the GStreamer side, but on the browser side, I’m having trouble extracting it with the Insertable Streams API. Would it be extracted as an extra media track? So far I’ve seen mostly explanations on how to mutate a track using the API. Some guidance on this would be great

slomo · July 4, 2024, 2:19pm

You’ll have to parse the H264 bitstream in the browser via that JavaScript API. That needs the encodedInsertableStreams flag in the rtcConfig, getting the encoded streams via createEncodedStreams() on the event passed via onRemoteTrack and then inserting your own transform worker into that stream for parsing the bitstream, extracting whatever you need and passing it back via messages to your main web app.

galaxycatto · July 4, 2024, 8:03pm

Is onRemoteTrack part of that JavaScript API? I have not found it anywhere. Also, I am testing this with a stream that uses the SEI timecodes in H264. I used a worker to extract the data from the stream through a TransformStream and set the flag in the config. I believe I have the frame data logged in my console, but I am unsure exactly how to extract the timecode from it.

slomo · July 5, 2024, 5:30am

Sorry, I meant the ontrack event on the peer connection.

galaxycatto · July 5, 2024, 12:17pm

Thank you very much! Yes, I got the encoded stream from the ontrack event. Is there any documentation on how GStreamer embeds the timestamp into the SEI message? I used the timecodestamper with h264parse update-timecode=true in my pipeline.

slomo · July 5, 2024, 12:20pm

That’s explained in detail in the H264 specification. What you’re looking for is the picture timing SEI message.

philn · July 5, 2024, 3:36pm

There’s a proposal about this, GitHub - w3c/webrtc-rtptransport: Repository for the RTPTransport specification of the WebRTC Working Group

leidix · December 10, 2024, 11:37am

Did you manage to get this working? I would appreciate it if you could share how you solved this.

perukas · February 25, 2025, 11:35am

Hi and thanks for that. What would be the parsing end on the client side?
I have a pad probe callback on an h264parse node (src pad, i.e. after the sample is parsed), and I get the error gst_h264_parser_parse_sei_message: failed to read uint8 for 'payload_type_byte', nbits: 8.

The snippet is the following:

static GstPadProbeReturn buffer_sei_pull_probe_callback(GstPad* pad, GstPadProbeInfo* info, gpointer user_data)
{
	GstBuffer* buffer = GST_PAD_PROBE_INFO_BUFFER(info);
	if (!buffer)
		return GST_PAD_PROBE_OK;

	GstH264NalParser* parser = (GstH264NalParser*)user_data;
	GstH264NalUnit nal;
	GstMapInfo map;

	if (!gst_buffer_map(buffer, &map, GST_MAP_READ))
	{
		LOG_ERROR << "stream:: Failed to map buffer for reading SEI.";
		return GST_PAD_PROBE_OK;
	}



	if (gst_h264_parser_identify_nalu_avc(parser, map.data, 0, map.size, 4, &nal) == GST_H264_PARSER_OK)
	{
		if (nal.type == GST_H264_NAL_AU_DELIMITER)
		{
			GArray* sei_data;
			GstH264ParserResult res = gst_h264_parser_parse_sei(parser, &nal, &sei_data);
			LOG_INFO << nal.data;
			if (sei_data)
			{

			}
		}
	}

	return GST_PAD_PROBE_OK;
}

Just trying to understand how the whole thing works.
Thanks for your time.

ylatuya · February 25, 2025, 12:33pm

On the client side, h264parse/h265parse parses SEI messages and attaches them as GstVideoSEIUserDataUnregisteredMeta to the output buffers. You should add a buffer probe after the parser and read the buffer metas trying to find a GstVideoSEIUserDataUnregisteredMeta that matches your UUID.

static const guint8 FLU_TIMING_UUID[] = {0xCF, 0x84, 0x82, 0x78, 0xEE, 0x23, 0x30, 0x6C,
                                         0x92, 0x65, 0xE8, 0xFE, 0xF2, 0x00, 0x01, 0x02};

GstVideoSEIUserDataUnregisteredMeta* flu_timing_metadata_meta_get(GstBuffer* buffer)
{
    GstVideoSEIUserDataUnregisteredMeta* sei_meta = NULL;
    gpointer iter = NULL;

    /* We loop over the SEIUserDataUnregistered metas until we find the first meta with our UUID */
    while ((sei_meta = (GstVideoSEIUserDataUnregisteredMeta*)gst_buffer_iterate_meta_filtered(
                buffer, &iter, GST_VIDEO_SEI_USER_DATA_UNREGISTERED_META_API_TYPE)))
    {
        if (memcmp(&sei_meta->uuid, &FLU_TIMING_UUID, 16) == 0)
        {
           return sei_meta;
        }
    }
    return NULL;
}

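GstPadProbeReturn flu_buffer_cb (GstPad* pad, GstPadProbeInfo* info, gpointer user_data)
{
    GstBuffer* buffer;
    GstVideoSEIUserDataUnregisteredMeta* sei_meta;
    FluTimingMetadata* metadata;

    buffer = GST_PAD_PROBE_INFO_BUFFER(info);
    sei_meta = flu_timing_metadata_meta_get(buffer);

    /* No SEI with our UUID on this buffer: let it pass through untouched */
    if (!sei_meta)
        return GST_PAD_PROBE_OK;

    /* sei_meta->data holds the payload bytes that follow the UUID */
    metadata = (FluTimingMetadata*)sei_meta->data;
    [...]

    return GST_PAD_PROBE_OK;
}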
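
As an alternative to casting sei_meta->data to a struct, the payload bytes can be decoded field by field, which avoids depending on struct padding or host endianness. This sketch assumes a made-up payload layout of a 64-bit frame ID followed by a 64-bit timestamp, both big-endian; frame_tag_unpack is a hypothetical helper that would be fed sei_meta->data and sei_meta->size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 64-bit big-endian value. */
static uint64_t get_be64(const uint8_t *src)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | src[i];
    return v;
}

/* Decode the assumed 16-byte layout (ID, then timestamp).
 * Returns false if the payload is missing or too short. */
static bool frame_tag_unpack(const uint8_t *data, size_t size,
                             uint64_t *frame_id, uint64_t *timestamp_ns)
{
    assert(frame_id != NULL && timestamp_ns != NULL);
    if (data == NULL || size < 16)
        return false;
    *frame_id = get_be64(data);
    *timestamp_ns = get_be64(data + 8);
    return true;
}
```

Checking the size before reading matters here because the payload comes off the network and may not be what the receiver expects.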

perukas · February 26, 2025, 7:57am

Thank you so much! I didn’t realize that if SEI metadata were injected correctly in an H264 stream, GstVideoSEIUserDataUnregisteredMeta objects are automatically present on the client side.

Thanks again.