Missing data at the start of new MP4 files

I’m working on an embedded device that incorporates a relatively complex GStreamer pipeline, including interpipe elements from RidgeRun. We’re running GStreamer 1.20.3 on an NXP i.MX 8M Mini SoC with a Linux 5.15.71 kernel.

Here’s a stripped-down version of the pipeline that (I hope) can be used to reproduce the problem:

The main app pipeline looks like this:

v4l2src device=/dev/video0
! video/x-raw, width=1280, height=720, framerate=30/1, format=GRAY8
! tee name=t1
  t1. ! appsink max-buffers=1 drop=true async=false
  t1. ! queue max-size-buffers=1 max-size-bytes=0 leaky=downstream
      ! videoconvert
      ! cairooverlay
      ! videoconvert
      ! vpuenc_h264
      ! tee name=t2
        t2. ! queue max-size-buffers=1 max-size-bytes=0 leaky=downstream
            ! rtph264pay
            ! udpsink host=host.example.com port=5000 sync=false async=false
        t2. ! h264parse ! mpegtsmux
            ! interpipesink name=logger_sink sync=false async=false

Where:

  • The appsink is used to shunt video frames into application code for processing. No problems with that part (see the sketch after this list).
  • The cairooverlay is used to draw data based on the frames that have been processed. We’re having no problems with that part either. (There is some latency, since the appsink processes frames faster than they can go through the overlay; we have code to deal with that.)
  • The overlaid video is then sent to our SoC’s hardware H.264 encoder (vpuenc_h264).
  • The first branch of t2 wraps the raw H.264 video in RTP and sends it out over UDP, where we can stream it using VLC. No problems with this, other than some latency (to be expected).
  • The second branch of t2 wraps the H.264 video into an MPEG-TS stream. In our product, we mux in audio as well (not shown here) and send the resulting data to a RidgeRun interpipesink. This is so we can save the stream to an MPEG file, but selectively decide when to turn the file-saving on and off without halting the rest of the stream.
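For context, here is roughly what the appsink side looks like. This is a minimal sketch, not our actual application code; the function names and the frame handling are placeholders:

#include <gst/gst.h>
#include <gst/app/gstappsink.h>

/* Sketch: with "emit-signals" enabled on the appsink, each frame
 * arrives via the "new-sample" signal. The processing here is a
 * placeholder for our real application logic. */
static GstFlowReturn
on_new_sample (GstAppSink *sink, gpointer user_data)
{
  GstSample *sample = gst_app_sink_pull_sample (sink);
  GstBuffer *buf;
  GstMapInfo map;

  if (!sample)
    return GST_FLOW_EOS;

  buf = gst_sample_get_buffer (sample);
  if (gst_buffer_map (buf, &map, GST_MAP_READ)) {
    /* map.data holds the GRAY8 pixels (1280x720, one byte per pixel).
     * Real frame processing would happen here. */
    g_print ("frame: %" G_GSIZE_FORMAT " bytes\n", map.size);
    gst_buffer_unmap (buf, &map);
  }
  gst_sample_unref (sample);
  return GST_FLOW_OK;
}

/* Wiring it up, given the appsink element from the pipeline: */
static void
attach_appsink_handler (GstElement *appsink)
{
  g_object_set (appsink, "emit-signals", TRUE, NULL);
  g_signal_connect (appsink, "new-sample",
      G_CALLBACK (on_new_sample), NULL);
}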

We have a second pipeline, which receives the output of that interpipe and writes the data to a file:

interpipesrc listen-to=logger_sink allow-renegotiation=false accept-events=false format=time
! filesink sync=false async=false

When we want to start saving the MPEG data to a file, we set the filesink’s location property to the filename and then set the secondary pipeline to the PLAYING state. When we want to stop saving, we send an EOS event to the secondary pipeline, wait for the bus to report EOS, and then set its state to NULL.
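In code form, that start/stop logic is essentially the following (a minimal sketch; file_pipeline and filesink stand for the secondary pipeline and its filesink, and the filename is whatever we choose at runtime):

#include <gst/gst.h>

static void
start_recording (GstElement *file_pipeline, GstElement *filesink,
                 const gchar *filename)
{
  g_object_set (filesink, "location", filename, NULL);
  gst_element_set_state (file_pipeline, GST_STATE_PLAYING);
}

static void
stop_recording (GstElement *file_pipeline)
{
  GstBus *bus = gst_element_get_bus (file_pipeline);
  GstMessage *msg;

  /* Send EOS so the filesink can finalize the file, then wait for
   * EOS (or an error) to come back on the bus before tearing the
   * pipeline down. */
  gst_element_send_event (file_pipeline, gst_event_new_eos ());
  msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
      GST_MESSAGE_EOS | GST_MESSAGE_ERROR);
  if (msg)
    gst_message_unref (msg);
  gst_object_unref (bus);

  gst_element_set_state (file_pipeline, GST_STATE_NULL);
}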

In general, this all works.

We can stream live video (via RTP) from the UDP port. We see all this come through without problem and without interruption.

When we turn on the file-save pipeline, we get MP4 files. And we can play them to see the same video that we saw on the live stream.

But we find that some of the saved MP4 files begin with a period of black frames (no image) before the video content starts. This period ranges from nothing up to about 3 seconds, after which the video is all good.

On an older build, using GStreamer 1.20.0 on Linux kernel 5.15.31, we see the same problem, but instead of black video during that initial period we get a single frame from the video stream, repeated for up to about 3 seconds.

And just to reiterate: the output of the H.264 encoder is not getting messed up, because the RTP stream never shows these glitches, and the MPEG-TS mux is never stopped. The only thing that changes when we record new files is starting and stopping that secondary pipeline.

I’m not sure what could be causing this. I suspect it has something to do with the stream generating key frames only about once every 3 seconds, so the video file can’t show anything until it hits the first key frame. But that’s just a suspicion. I’m not certain how to debug this problem, let alone how to go about fixing it.
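One way to check the key-frame spacing (just an idea on my part, not something we have wired in) would be a buffer probe on the encoder’s src pad: buffers without the DELTA_UNIT flag are key frames, so logging their timestamps would confirm or refute the ~3 s GOP theory.

#include <gst/gst.h>

/* Hypothetical debugging aid: log the PTS of every key frame coming
 * out of the encoder, to measure the actual key-frame interval. */
static GstPadProbeReturn
keyframe_probe (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
{
  GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER (info);

  /* Key frames are the buffers NOT flagged as delta units. */
  if (!GST_BUFFER_FLAG_IS_SET (buf, GST_BUFFER_FLAG_DELTA_UNIT))
    g_print ("key frame at PTS %" GST_TIME_FORMAT "\n",
        GST_TIME_ARGS (GST_BUFFER_PTS (buf)));

  return GST_PAD_PROBE_OK;
}

/* Attach it to the vpuenc_h264 element's src pad: */
static void
add_keyframe_probe (GstElement *encoder)
{
  GstPad *pad = gst_element_get_static_pad (encoder, "src");
  gst_pad_add_probe (pad, GST_PAD_PROBE_TYPE_BUFFER,
      keyframe_probe, NULL, NULL);
  gst_object_unref (pad);
}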

Alternatively, is there a better approach to saving the MPEG data to files? Maybe it can be done without the interpipe? Would that change anything?

Any ideas?

I think this could easily be due to buffers getting dropped.

Doing some more testing, I found that if I add a non-leaky queue in the second branch of t2, just before the h264parse element, the problem is lessened. I still see some video files beginning with a second or two of black, but it’s not as bad as it was before.

The use of leaky queues with max-size-buffers=1 is to try to minimize latency. More specifically, certain operations (like the videoconvert prior to the cairooverlay) consume a lot of CPU (because they’re converting the image from grayscale to color) and can’t keep up with the full 30 fps on our hardware. Without the leaky queue, buffers back up all the way to the v4l2src, which reports that it is dropping frames. That backup greatly reduces the frame rate and increases the latency of frames delivered to the appsink.

The second leaky queue (before the rtph264pay) is there for similar reasons: if there are network delays, I don’t want them to back up into the upstream elements.

The problem has been determined to be that the saved MPEG-TS video data starts at a random point in the stream instead of at a key frame.
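One direct mitigation, sketched under the assumption that vpuenc_h264 honors upstream force-key-unit events (I haven’t verified that on the i.MX), would be to ask the encoder for an immediate key frame whenever the file-save pipeline starts, so the recording doesn’t have to wait out the rest of the current GOP:

#include <gst/gst.h>
#include <gst/video/video.h>

/* Hypothetical mitigation: request an immediate key frame from the
 * encoder when recording starts, instead of waiting up to a full
 * GOP (~3 s here). "encoder" is the vpuenc_h264 element from the
 * main pipeline. Whether vpuenc_h264 honors force-key-unit events
 * is an assumption that needs verifying on the target. */
static void
request_keyframe_now (GstElement *encoder)
{
  GstEvent *event =
      gst_video_event_new_upstream_force_key_unit (GST_CLOCK_TIME_NONE,
      TRUE /* all-headers: resend SPS/PPS too */, 0 /* count */);

  /* Upstream events are injected via the element's src pad. */
  GstPad *srcpad = gst_element_get_static_pad (encoder, "src");
  gst_pad_send_event (srcpad, event);
  gst_object_unref (srcpad);
}

An alternative would be a blocking pad probe that discards buffers until the first key frame arrives, though since this branch carries already-muxed MPEG-TS, that’s harder to do cleanly at the interpipe.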

See this thread for a more useful discussion of how it might be fixed: