Mp4mux deadlock when trying to EOS with video and audio pads

I have an application that sometimes streams audio to the mp4mux in addition to a video stream (i.e., the application requests both pads and feeds them with data). In that configuration, I consistently have problems sending an EOS to the mp4mux. This is on GStreamer 1.22.8, though we have seen it since before 1.20. The mp4mux is configured with its default properties, the sources are “live”, and for the sake of my testing the muxer is linked to a fakesink. When I test without requesting an audio pad and feeding it a stream, everything is fine: nothing blocks when I try to EOS the video side. I only see this behavior when both audio and video are present.

When the application goes to end the file, the gst_pad_send_event(sink, gst_event_new_eos()) call blocks regardless of whether it is called from the streaming thread, a probe callback, or a task dispatched via gst_element_call_async. In every case, I can see in the debugger that the mp4mux aggregator base class is blocked on SRC_WAIT (self) at line 880, in gst_aggregator_wait_and_check(). If I send an EOS on the audio pad, it sometimes unblocks and allows the EOS through on the video pad (and sometimes the gst_pad_send_event(...) call to the audio pad also blocks). I can greatly increase the likelihood that neither of those calls blocks by first sending a flush start followed by an immediate flush stop (no reset time). However, that is not 100% successful and carries the downside of destroying whatever is queued in the pipeline, whereas I want the application to finish processing the data it has.

Now, when the mp4mux is in this SRC_WAIT state, all upstream elements’ pads are stuck in send_unchecked and blocked because their stream locks are held. Since EOS is a serialized event, trying to send one down the pad results in the call blocking as it waits to enter the queue. I’ve tried various pad-related APIs to see whether the upstream logic can detect that, deep down in the hierarchy, the mp4mux and all associated pads are in this deadlocked state, but I have come up with nothing. Furthermore, since the EOS cannot propagate down, no pad probes I’ve added will actually be called until the muxer comes out of SRC_WAIT, so I can see no way to use a probe to react to this state. Instead, all I can do is treat this branch of the pipeline as a “special case” that requires flushing before it can be EOS’d. That strikes me as an anti-pattern, considering that other elements generally seem to handle EOS events being pushed without blocking at gst_pad_send_event (especially if called from an element’s thread pool).

I do not have a simplified pipeline example that I can share at this time. I have looked through the GStreamer codebase for tests that use the mp4mux with both the video and audio sink pads, but I have not found one yet. I have also not found documentation explaining the correct procedure for sending EOS to the mp4mux element in that configuration.

My questions then are:

  1. What is the correct procedure for submitting an EOS to the mp4mux element when it is handling both audio and video streams?
  2. Is there a way to determine at runtime that an element is in this state, where flushing is required to actually end the stream?