Recording an H.264 stream to disk with one GOP per file

Hello!

I am trying to save an H.264 stream into many MP4 files, where each file contains exactly one keyframe, at the beginning (that is, each file is a single, closed group of pictures). So if this is the encoded stream:

IPBBBPIPBBBPIPBBBP...

I want to save it to the following files:

IPBBBP
IPBBBP
IPBBBP
...

I use nvcudah264enc for encoding. I set gop-size to the frame rate of the stream and then use splitmuxsink with a max-size-time of 1 second. I expected every file to have the same duration, but they differ slightly. Is there a better way to do this? I have experimented with async-finalize, other muxers, multipartmux, multipart and the options on those.

I see there are split-now and split-after signals I can send to splitmuxsink in my application to trigger a flush, but I am not sure where I would even send them. Is there a standard signal in GStreamer for end of GOP coming out of encoders? Am I misunderstanding this somehow? Thank you for your time!

Best wishes,

Konstanty

Do your files contain a single GOP (IPBBBP)? Are you using constant rate encoding?


Oh, amazing! Setting the bitrate property on nvcudah264enc made the files have the same length, with 1 GOP in each.

With this pipeline:

videotestsrc ! videorate ! video/x-raw, framerate=15/1 ! nvcudah264enc gop-size=15 bitrate=120 b-frames=3 ! h264parse ! queue ! splitmuxsink location=/tmp/1gop1file/%d.mp4 max-size-time=1000000000 max-files=200

running ffprobe -show_frames /tmp/1gop1file/1.mp4 2>/dev/null | grep pict_type shows a GOP of IBBBPBBBPBBBPBP (15 frames), and pts_time + pkt_duration_time = 1.0. This holds for each of the 200 files I recorded this way. The same pipeline without bitrate produces a file with two I-frames and a duration slightly over 1.0 s. Weirdly, ffprobe also shows that the bitrate is not constant in either of these scenarios. I am not sure this setting even applies, but it does seem to change something!
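The per-file check described above could be automated along these lines. This is only a sketch: it assumes the input is the `pict_type=...` lines that ffprobe prints, and the function names are made up for illustration. It only confirms that the first frame is an I-frame and that no later frame is one. It cannot tell whether that I-frame is an IDR frame or whether the GOP is truly closed.

```python
def pict_types(ffprobe_lines):
    """Extract frame types from lines such as 'pict_type=I'."""
    types = []
    for line in ffprobe_lines:
        key, sep, value = line.strip().partition("=")
        if sep and key == "pict_type":
            types.append(value)
    return types


def has_single_leading_keyframe(types):
    """True if the first frame is an I-frame and no later frame is."""
    return bool(types) and types[0] == "I" and "I" not in types[1:]


# The frame sequence reported for the files recorded with bitrate set.
print(has_single_leading_keyframe(list("IBBBPBBBPBBBPBP")))
```

To use it on real files, the ffprobe output for each file can be read line by line (for example with `subprocess.run(..., capture_output=True, text=True).stdout.splitlines()`) and passed to `pict_types`.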

I’m curious though, is this the best way? It seems weirdly indirect to achieve this by synchronising framerate, gop-size and max-size-time in three different elements. I thought the encoder would do its thing with whatever settings and I would catch a signal that a GOP is ready and flush it to disk on that signal. I know I can send split-now to splitmuxsink but I don’t know how to use it.
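The coupling between the three settings is plain arithmetic: one GOP lasts gop-size divided by the frame rate, and max-size-time has to match that duration in nanoseconds. A small sketch of the calculation, with a hypothetical helper name and assuming a constant frame rate:

```python
from fractions import Fraction

NS_PER_SECOND = 1_000_000_000


def gop_duration_ns(gop_size, fps_num, fps_den=1):
    """Duration of one GOP in nanoseconds for a constant frame rate.

    gop_size is the number of frames per GOP; the frame rate is the
    fraction fps_num/fps_den, as written in GStreamer caps.
    """
    duration = Fraction(gop_size * fps_den, fps_num) * NS_PER_SECOND
    return round(duration)  # GStreamer times are integer nanoseconds


# Values from the pipeline above: gop-size=15 at framerate=15/1.
print(gop_duration_ns(15, 15))  # 1000000000
```

If the frame rate or gop-size changes, max-size-time would have to be recomputed the same way, which is what makes this approach feel indirect.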

If what I have is the only way, is there something that can go wrong with that pipeline? Are there any more settings you would recommend for robustness?

One more weird thing: all these files are playable in gst-play, ffplay, VLC and Chromium, but not in Firefox. It is probably something wrong with Firefox, but strangely, files I encoded with nvcudah264enc without any settings work well there.

You could maybe detect keyframes by looking at the buffer flags of the buffers going into splitmuxsink. IDR frames should have the DELTA_UNIT flag cleared; on all other frames it should be set. You could check that from a pad probe on, or just before, the splitmuxsink, and call split-now just before the next keyframe (start of GOP) goes into the splitmuxsink. I don’t know whether this will actually work given how splitmuxsink works internally, but it is perhaps something to try.
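A rough, untested sketch of that idea in Python (PyGObject) follows. The class and function names are made up for illustration, and skipping the first keyframe is an assumption, made because there is no earlier GOP to close at that point. Whether split-now or split-after yields the intended file boundary would need to be checked against real output.

```python
class GopSplitter:
    """Decides when to request a split, based on keyframe arrivals."""

    def __init__(self):
        self.seen_keyframe = False

    def on_buffer(self, is_delta_unit):
        """Return True if a split should be requested before this buffer."""
        if is_delta_unit:
            return False  # not a keyframe: stay in the current file
        first = not self.seen_keyframe
        self.seen_keyframe = True
        return not first  # split on every keyframe except the first


def install_probe(upstream_src_pad, splitmuxsink, signal_name="split-now"):
    """Attach a buffer probe that asks splitmuxsink to split at keyframes.

    PyGObject is imported here so GopSplitter stays usable without it.
    """
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    splitter = GopSplitter()

    def on_probe(pad, info):
        buf = info.get_buffer()
        is_delta = buf.has_flags(Gst.BufferFlags.DELTA_UNIT)
        if splitter.on_buffer(is_delta):
            # Action signal on splitmuxsink; the name is configurable so
            # that "split-after" can be tried as well.
            splitmuxsink.emit(signal_name)
        return Gst.PadProbeReturn.OK

    return upstream_src_pad.add_probe(Gst.PadProbeType.BUFFER, on_probe)
```

It would be wired up with something like `install_probe(queue.get_static_pad("src"), sink)`, where `queue` and `sink` are the queue and splitmuxsink elements of the pipeline (hypothetical variable names), presumably with max-size-time left unset so that only the signal triggers a split.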
