RTSPSRC Set State Null Deadlock

I’m on Ubuntu 20.04 and stuck with GStreamer 1.16.3, so I’m guessing the fix for this issue isn’t in my version (rtspsrc: deadlock on set_state(NULL) (#900) · Issues · GStreamer / gst-plugins-good · GitLab).

Is there any workaround for setting the rtspsrc element to the NULL state without deadlocking? I have a pipeline that runs hardware decoding on RTSP streams, and I want to dynamically remove the rtspsrc when a failure/error occurs.

In this case I am just trying to catch the “on-sender-timeout” signal from the rtpbin manager, then add an IDLE probe and perform the dynamic reconnect. Inside of the dynamic reconnect method:

    # Remove the old rtspsrc
    print('Unlinking and removal process...')
    pad.unlink(pad_peer)
    pad.unref()
    source_bin.remove(rtspsrc)
    children = manager.children
    for child in children[0:-1]:
        print(child, type(child))
        child.set_state(Gst.State.NULL)
        print('child set to null...')
    print('nulling old rtspsrc')
    rtspsrc.set_state(Gst.State.NULL)
    rtspsrc.unref()
    print('unlinking and removal done...')

The code hangs indefinitely at rtspsrc.set_state(Gst.State.NULL). I was looking at the linked issue, which is why I added the child iteration step, but that did not help. Also, when trying to set all child elements to NULL, I receive this error when attempting to set the RtpSession to NULL:

<__gi__.GstRtpSession object at 0x7f336ee24480 (GstRtpSession at 0x7f3368066140)> <class '__gi__.GstRtpSession'>

(python:73122): GLib-ERROR **: 15:41:15.402: file ../../../glib/gthread-posix.c: line 1353 (g_system_thread_wait): error 'Resource deadlock avoided' during 'pthread_join (pt->system_thread, NULL)'
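
For context on the message above: `Resource deadlock avoided` is the text for EDEADLK, which `pthread_join` returns when a thread tries to join itself (or when two threads try to join each other). One possible reading, which is an assumption and not confirmed anywhere in this thread, is that the NULL transition is being requested from one of the element's own streaming threads, for example from inside a probe or signal callback. The GStreamer-free sketch below only illustrates that general pattern with Python's `threading` module; `run_demo`, `streaming_thread`, and `teardown_requests` are hypothetical names.

```python
import queue
import threading


def run_demo():
    """Show that a thread cannot join itself, but another thread can join it."""
    results = {}
    teardown_requests = queue.Queue()

    # Hypothetical stand-in for an element's streaming thread.
    def streaming_thread():
        me = threading.current_thread()
        try:
            # Analogous to asking for a teardown (which joins this thread)
            # from inside the thread itself.
            me.join()
        except RuntimeError as err:
            results["self_join_error"] = str(err)
        # Instead, hand the teardown request to a different thread.
        teardown_requests.put(me)

    worker = threading.Thread(target=streaming_thread)
    worker.start()

    # The calling thread plays the role of the application thread:
    # it receives the request and performs the join safely.
    target = teardown_requests.get(timeout=5)
    target.join(timeout=5)
    results["joined_from_other_thread"] = not target.is_alive()
    return results
```

In a GStreamer application the analogous move would be to schedule the teardown onto the application's main loop or another thread instead of running it inside the callback. Whether that avoids the hang on 1.16.3 is untested here.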

Any options to get around this, or do things a different way? If I simply try to ignore the step for setting the state and unref() then I get the following error after the probe completes:

(python:74821): GStreamer-CRITICAL **: 16:10:57.198: 
Trying to dispose element rtspsrc_0, but it is in PLAYING instead of the NULL state.
You need to explicitly set elements to the NULL state before
dropping the final reference, to allow them to clean up.
This problem may also be caused by a refcounting bug in the
application or some element.

g_mutex_clear() called on uninitialised or locked mutex

Sorry for all the edits…
Could switching to a uridecodebin/uridecodebin3-based pipeline work, assuming those elements can be set to NULL while performing the dynamic changes?
Would additional elements be needed to mimic the function of rtspsrc (a jitterbuffer, or others)?

Update: Using uridecodebin, I notice that an rtspsrc element is still created internally. I also still hit a deadlock when trying to set the uridecodebin to NULL.

I have added a callback connected to the before-send signal in case the issue was somehow related to this:
"NOTE: rtspsrc will send a PAUSE command to the server if you set the element to the PAUSED state, and will send a PLAY command if you set it to the PLAYING state.

Unfortunately, going to the NULL state involves going through PAUSED, so rtspsrc does not know the difference and will send a PAUSE when you wanted a TEARDOWN. The workaround is to hook into the before-send signal and return FALSE in this case. [rtspsrc]"

The callback looks like this:

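import gi
gi.require_version("Gst", "1.0")
gi.require_version("GstRtsp", "1.0")
from gi.repository import Gst, GstRtsp

def on_before_send(rtspsrc: Gst.Element, rtspmsg):
    # Returning False drops the message; returning True lets rtspsrc send it.
    print('BEFORE SEND')
    # print(rtspmsg) -- > <GstRtsp.RTSPMessage object at 0x7f032701cb80 (GstRTSPMessage at 0x7f031d9a0bd0)>
    # print(rtspmsg.type) --> <enum GST_RTSP_MESSAGE_REQUEST of type GstRtsp.RTSPMsgType>
    # Only requests are of interest; let everything else pass.
    if rtspmsg.type != GstRtsp.RTSPMsgType.REQUEST:
        return True
    res, method, uri, version = rtspmsg.parse_request()
    # Drop the PAUSE that rtspsrc issues on its way down to NULL.
    if method == GstRtsp.RTSPMethod.PAUSE:
        print(f"Caught {method}")
        return False
    return True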

However, this still doesn’t seem to help. The call never returns, and this is all I see in the console:

Unlinking and removal process...
nulling old rtspsrc
BEFORE SEND
Caught <flags GST_RTSP_PAUSE of type GstRtsp.RTSPMethod>

Is there anything I’m doing wrong? Again, I just want to be able to dynamically delete an rtspsrc element and add back another in its place.

The procedure I use to replace a sink while the pipeline is running is as follows:

  1. pause the pipeline
  2. flush it
  3. set the state to NULL
  4. unlink it from the pipeline
  5. recreate it and link it.

You can also delete the whole pipeline and rebuild it.


Thanks for the reply! In my case I am adding several rtspsrc elements to the pipeline, which are then connected to nvstreammux for batch processing.
Based on my understanding of dynamic pipeline changes, I was hoping to leave the rest of the pipeline's rtspsrc elements running and perform the dynamic relink only on those that emit the timeout signal (e.g. if one of my RTSP feeds/IP cameras gets disconnected). However, calling set_state(Gst.State.NULL) on the rtspsrc just deadlocks and never returns. Maybe there are some special steps to take when dealing with live source elements, or with rtspsrc specifically?

If you only have a few elements to change, simply delete the pipeline and rebuild it. It is quick.
