Crash when switching H264 RTP streams using input-selector / pipeline restart (v4l2h264dec)

Hi all,

I’m relatively new to GStreamer and still have a limited understanding of it. I’m trying to build an application that receives H264 RTP streams over UDP and displays them using a hardware-accelerated decoder (v4l2h264dec). However, I’m facing instability and crashes when switching between two streams.


Setup

I have two independent RTP H264 streams:

Stream 1:

gst-launch-1.0 \
  udpsrc port=8081 \
  caps="application/x-rtp,media=video,encoding-name=H264,clock-rate=90000,payload=96" \
  ! rtph264depay \
  ! h264parse \
  ! v4l2h264dec \
  ! kmssink sync=false force-modesetting=true

Stream 2:

gst-launch-1.0 \
  udpsrc port=8082 \
  caps="application/x-rtp,media=video,encoding-name=H264,clock-rate=90000,payload=96" \
  ! rtph264depay \
  ! h264parse \
  ! v4l2h264dec \
  ! kmssink sync=false force-modesetting=true

Each pipeline works correctly when run independently.


Requirement

I need to dynamically switch between these two streams (e.g., based on user interaction or events).


What I Tried

1. Pipeline restart approach

  • Stop pipeline (stream on port 8081)
  • Start pipeline (stream on port 8082)

This works initially but leads to random crashes after multiple switches.


2. input-selector approach

  • Single pipeline with two inputs
  • Switch active pad dynamically

This improves behavior but still results in instability after some time.
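
For illustration, a single-pipeline layout of this kind can be sketched as below. This is an assumed layout, not necessarily the exact pipeline used: the selector placement (ahead of h264parse, so only one decoder instance runs), the pad names sink_0/sink_1 and the helper name selector_pipeline are all illustrative. gst-launch-1.0 cannot change properties while running, so the actual switch has to come from application code that sets the selector's active-pad property.

```shell
#!/bin/sh
# Sketch only: element order and pad names are assumptions for illustration.
CAPS='application/x-rtp,media=video,encoding-name=H264,clock-rate=90000,payload=96'

# selector_pipeline is a hypothetical helper. It prints a pipeline description
# in which both RTP inputs feed one input-selector ahead of a single decoder.
selector_pipeline() {
  echo "input-selector name=sel ! h264parse ! v4l2h264dec ! kmssink sync=false force-modesetting=true" \
       "udpsrc port=8081 caps=$CAPS ! rtph264depay ! sel.sink_0" \
       "udpsrc port=8082 caps=$CAPS ! rtph264depay ! sel.sink_1"
}

# Usage (left unquoted on purpose so the shell splits it into arguments):
#   gst-launch-1.0 $(selector_pipeline)
```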


Observed Errors

Kernel / driver logs:

vdec 30210000.video-codec: wave5_vpu_enc_finish_encode: encoded buffer (1) was not in ready queue 0. 
vdec 30210000.video-codec: wave5_vpu_enc_finish_encode: no source buffer with index: 1 found 
vdec 30210000.video-codec: wave5_vpu_firmware_command_queue_error_check: result not ready: 0x800
vdec 30210000.video-codec: wave5_vpu_dec_finish_decode: could not get output info.

GStreamer warning:

v4l2h264dec0: Too old frames, bug in decoder -- please file a bug

Questions

  1. Is repeatedly stopping/starting pipelines with v4l2h264dec expected to cause instability?

  2. Is input-selector the recommended way to switch between RTP streams with hardware decoding?

  3. What does the "Too old frames" warning indicate in this context?


Goal

A stable solution to:

  • Keep both RTP streams active
  • Switch between them dynamically
  • Avoid decoder crashes or desynchronization

Any suggestions or pointers would be greatly appreciated.

Thanks!

I do not have any experience with v4l2h264dec, but I am working on a similar project. Since you are OK with keeping both streams live at the same time, I would suggest removing the input-selector and using a compositor downstream of the decoders. With this approach you can change settings on the compositor sink pads to show or hide a particular stream on the fly.

Thanks for the suggestion!

The reason I’m sticking with input-selector here is the hardware constraint on my embedded board. It has a single shared VPU with limited concurrent decode capacity, so running both v4l2h264dec instances simultaneously, even with one hidden, puts both hardware decode slots under load all the time.

The other issue specific to v4l2h264dec is that the two UDP sources have independent RTP clocks. The compositor tries to synchronise frames from both inputs by PTS, and with two unrelated clock sources that causes one stream to stall waiting on the other, which is actually what was causing my freeze issue in the first place.

On question 1 (is repeatedly stopping/starting pipelines expected to cause instability?): No, absolutely not. This is basically the brute-force approach. If it doesn’t work, that indicates big problems in the GStreamer decoder element and/or the driver/firmware.

On question 2 (is input-selector the recommended way to switch?): Yes, in principle this should work fine, provided the decoder element supports caps changes at runtime, as it should. But there may be issues with this specific element and/or the driver/firmware.

I know this is not particularly helpful, just saying that I don’t think you’re doing anything wrong here at the GStreamer level in your application. You could check if things work better with a software decoder like avdec_h264, just for comparison.
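
A minimal sketch of that comparison, reusing the receive pipeline from the original post and swapping only the decoder. The helper name rtp_pipeline is hypothetical, and the videoconvert after avdec_h264 is an assumption in case kmssink does not accept the software decoder's output format directly.

```shell
#!/bin/sh
# Sketch: same RTP receive pipeline, with the decoder passed in as a parameter.
CAPS='application/x-rtp,media=video,encoding-name=H264,clock-rate=90000,payload=96'

# rtp_pipeline is a hypothetical helper, not a GStreamer tool.
#   $1 = UDP port
#   $2 = decoder chain, e.g. "v4l2h264dec" or "avdec_h264 ! videoconvert"
rtp_pipeline() {
  echo "udpsrc port=$1 caps=$CAPS ! rtph264depay ! h264parse ! $2 ! kmssink sync=false force-modesetting=true"
}

# Usage (left unquoted on purpose so the shell splits it into arguments):
#   gst-launch-1.0 $(rtp_pipeline 8081 "v4l2h264dec")
#   gst-launch-1.0 $(rtp_pipeline 8081 "avdec_h264 ! videoconvert")
```

If the stop/start cycle stays stable with avdec_h264 but not with v4l2h264dec, that points at the hardware decoder path rather than the application.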

A couple of things. I have implemented something very similar, with multiple compositor inputs originating from multiple RTP sources, all with different clocks, and I do not have issues. You just need to modify the compositor’s start-time property, which I think is inherited from GstAggregator.
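
As a rough sketch of what that could look like: GstAggregator exposes a start-time-selection property (values zero, first, set), and I am assuming that is the property meant here and that first is a sensible value to try. The helper name compositor_pipeline, the queue and videoconvert elements, and the alpha values used to show one input and hide the other are illustrative choices, not a tested configuration.

```shell
#!/bin/sh
# Sketch: two decoded RTP inputs mixed by one compositor.
CAPS='application/x-rtp,media=video,encoding-name=H264,clock-rate=90000,payload=96'

# compositor_pipeline is a hypothetical helper.
#   $1 = value for start-time-selection (zero, first or set); "first" is an
#        assumption about what the property should be changed to.
compositor_pipeline() {
  echo "compositor name=mix start-time-selection=$1 sink_0::alpha=1 sink_1::alpha=0 ! videoconvert ! kmssink sync=false" \
       "udpsrc port=8081 caps=$CAPS ! rtph264depay ! h264parse ! v4l2h264dec ! queue ! mix.sink_0" \
       "udpsrc port=8082 caps=$CAPS ! rtph264depay ! h264parse ! v4l2h264dec ! queue ! mix.sink_1"
}

# Usage (left unquoted on purpose so the shell splits it into arguments):
#   gst-launch-1.0 $(compositor_pipeline first)
```

Showing or hiding a stream at runtime would then mean changing the alpha property on the corresponding compositor sink pad from application code.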

Also, I have seen others with more experience than me avoid input-selector. One option is to use a funnel and valve combination instead. The funnel should be placed either after the decoder, to funnel the raw video, or after the caps filter and before the depayloader. The depayloader is aware of SSRC switches, so as long as you don’t have an SSRC collision, this should work in theory. Also, hopefully you have a small GOP size set at the encoders, assuming you don’t have an out-of-band method to request key frames.

If you funnel after the decoder then I would put the valve just before it to help with quick switches.

If you funnel before the depayloader then place your valves just before the funnel.

Also consider adding a jitterbuffer (rtpjitterbuffer) after each RTP caps filter.

If this works, also consider adding queues to each branch leading into the funnel. My understanding is that in one-to-many or many-to-one scenarios it is best to have queues to decouple the processing. Let me know how the above works and we can look at the queues if you get that far.
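
To make the second variant concrete (funnel before the depayloader, one valve per input, a jitterbuffer after each caps), here is a rough sketch. The helper name funnel_pipeline and the element names f, v1 and v2 are made up for illustration. gst-launch-1.0 can only set the initial drop values, so switching while running would mean toggling the drop property on the two valves from application code.

```shell
#!/bin/sh
# Sketch: two RTP inputs gated by valves and merged by a funnel ahead of a
# single depayloader/decoder chain.
CAPS='application/x-rtp,media=video,encoding-name=H264,clock-rate=90000,payload=96'

# funnel_pipeline is a hypothetical helper.
#   $1 = port of the stream to show initially (8081 or 8082);
#        the valve on the other branch starts with drop=true.
funnel_pipeline() {
  if [ "$1" = 8082 ]; then
    drop1=true; drop2=false
  else
    drop1=false; drop2=true
  fi
  echo "funnel name=f ! rtph264depay ! h264parse ! v4l2h264dec ! kmssink sync=false force-modesetting=true" \
       "udpsrc port=8081 caps=$CAPS ! rtpjitterbuffer ! valve name=v1 drop=$drop1 ! f." \
       "udpsrc port=8082 caps=$CAPS ! rtpjitterbuffer ! valve name=v2 drop=$drop2 ! f."
}

# Usage (left unquoted on purpose so the shell splits it into arguments):
#   gst-launch-1.0 $(funnel_pipeline 8081)
```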

Please note that we usually use the term crash to refer to an execution fault (segmentation fault, illegal instruction, etc.). If this is the case, it’s always helpful to capture a backtrace using gdb.
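
One way to capture such a backtrace non-interactively is sketched below. The wrapper name capture_backtrace and the program name ./my_player are placeholders for your own application.

```shell
#!/bin/sh
# capture_backtrace is a hypothetical wrapper: it runs the given program under
# gdb in batch mode and, once the program stops (for example on a segmentation
# fault), prints a backtrace for every thread.
capture_backtrace() {
  gdb -batch -ex run -ex "thread apply all bt" --args "$@"
}

# Usage (./my_player stands in for your application):
#   capture_backtrace ./my_player > backtrace.txt 2>&1
```

Backtraces are much more readable when debug symbols for GStreamer and the application are installed.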

From the kernel trace you shared, though, it seems you are also hitting issues with your kernel driver. It’s a good habit to verify that your kernel includes all the recent driver fixes. This driver in particular has had major fixes in the last few kernel versions, including some threading fixes and some fixes included in the PR I’ve sent for inclusion in the media tree. Since this is off topic for GStreamer, I strongly suggest watching the mailing list and the latest released kernels.
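
A quick way to collect what is needed for that comparison is sketched below: the running kernel release plus the wave5 driver messages. The helper name driver_report is hypothetical, and reading dmesg may need root on some systems.

```shell
#!/bin/sh
# driver_report is a hypothetical helper: it prints the running kernel release
# and any kernel log lines that mention the wave5 driver.
driver_report() {
  echo "kernel: $(uname -r)"
  dmesg | grep -i 'wave5' || echo "no wave5 messages in dmesg"
}

# Usage:
#   driver_report > driver_report.txt
```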