Methodology for Processing Multiple Live RTSP Sources

I’m wondering what the pros and cons are of two approaches to processing data from a large network of cameras. In this situation I expect:

  • 50+ RTSP sources
  • Unstable network at times leading to packet drop
  • Cameras disconnecting leading to timeouts/EOS (rtpbin)
  • Maintaining a certain level of synchronization (+/- 1 second between cameras)

From what I have seen there could be two basic approaches:

  1. Running a separate GStreamer pipeline for each source
  2. Running a monolithic pipeline that creates multiple streaming elements
    (3. As a bonus, I suppose you could do something in between, with each pipeline handling a group of RTSP sources?)
  • Are there any methodologies surrounding the development of this sort of GStreamer pipeline(s)?

  • From what I know, method 1 would likely be easier to handle when EOS/unstable network issues occur, as the affected pipeline could simply be torn down and spun up again. However, would creating 50 separate GStreamer pipelines place a significant burden on the system compared to 1 monolithic pipeline handling all the streams? In Python especially, I’ve noticed that starting several GStreamer MainLoops in separate processes is not a great approach, as you lose the ability to easily communicate with the pipelines.

  • Using method 2 seems like the more intuitive approach to me. The main thing is that handling EOS/errors becomes trickier, although still possible, as sources would need to be removed and re-added carefully in a dynamic manner. With method 2 I imagine synchronization would be significantly better/easier to manage, as all elements are slaved to the same pipeline clock.
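For method 2, one piece of the "handle it carefully" problem is working out which camera a bus error belongs to, since the message often comes from an element buried inside rtspsrc. Below is a minimal sketch, assuming each camera lives in its own bin added directly to the pipeline. The helper name `owning_bin` and the bin layout are hypothetical; only `get_parent()` is taken from the GStreamer object API. Actually unlinking and replacing the bin while the pipeline runs is not shown.

```python
def owning_bin(element, pipeline):
    """Walk up from the element that posted a message to the direct
    child of `pipeline` that contains it (assumed: one bin per camera).

    Returns None if `element` is the pipeline itself or is not inside it.
    """
    node = element
    while node is not None and node.get_parent() != pipeline:
        node = node.get_parent()
    return node


def on_error(bus, msg, pipeline):
    """Bus "message::error" handler sketch: report which camera bin failed."""
    failed = owning_bin(msg.src, pipeline)
    if failed is not None:
        # Only this bin needs to be torn down and rebuilt; how to do that
        # safely on a running pipeline is left out of this sketch.
        print("error in camera bin:", failed.get_name())
    return failed
```

The handler would be connected with something like `bus.connect("message::error", on_error, pipeline)` after `bus.add_signal_watch()`.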

Anyways, I’m mainly just wondering what other people think as I am fairly new to creating pipelines that would need to handle this type of scale.


I’m not the biggest GStreamer expert; I only recently started working with it. But from my experience there isn’t much of a choice here. Every rtspsrc element (the source of your stream) has exactly one location, so you need one rtspsrc per camera either way. It is probably possible to handle 50+ RTSP streams in one pipeline with a videomixer or similar element, but I really don’t understand why you would want a single pipeline for this. Suppose you need to restart one stream (because it hangs, or there are camera or network issues): in that case all the other streams will “suffer”. With one pipeline per source, the problem stays isolated. In terms of system resources (CPU, memory) I don’t see any difference (maybe I’m missing something?). Also, multiple pipelines need only a single MainLoop: one process with multiple pipeline threads. I work with GStreamer from C#, which has a normal threading model; in Python there could be problems (at least in 2.x).
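The "one process, one MainLoop, many pipelines" idea can be sketched roughly as follows in Python. This is only an illustration under assumptions: the camera URLs, the `fakesink` placeholder, the `latency=200` value and the helper names (`describe`, `backoff_seconds`, `run_all`) are all made up for the example, and a real application would put its decoding/processing chain where `fakesink` is.

```python
# Hypothetical camera list; replace with real RTSP URLs.
CAMERAS = {
    "cam01": "rtsp://192.0.2.10/stream",
    "cam02": "rtsp://192.0.2.11/stream",
}


def describe(url):
    """gst-launch style description for one camera (fakesink is a placeholder)."""
    return f'rtspsrc location="{url}" latency=200 ! fakesink sync=false'


def backoff_seconds(failures, base=1, cap=30):
    """Exponential restart delay: 1, 2, 4, ... seconds, capped at `cap`."""
    return min(cap, base * 2 ** max(0, failures - 1))


def run_all():
    # Imported here so the helpers above work without GStreamer installed.
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import GLib, Gst

    Gst.init(None)
    loop = GLib.MainLoop()
    failures = {name: 0 for name in CAMERAS}

    def start(name):
        pipeline = Gst.parse_launch(describe(CAMERAS[name]))
        bus = pipeline.get_bus()
        bus.add_signal_watch()  # messages are dispatched by the shared main loop
        bus.connect("message", on_message, name, pipeline)
        pipeline.set_state(Gst.State.PLAYING)
        return False  # one-shot when scheduled via GLib.timeout_add_seconds

    def on_message(bus, msg, name, pipeline):
        if msg.type in (Gst.MessageType.ERROR, Gst.MessageType.EOS):
            # Tear down only this camera's pipeline; the others keep running.
            bus.remove_signal_watch()
            pipeline.set_state(Gst.State.NULL)
            failures[name] += 1
            GLib.timeout_add_seconds(backoff_seconds(failures[name]), start, name)

    for name in CAMERAS:
        start(name)
    loop.run()
```

Calling `run_all()` starts every camera and blocks in the main loop. The failure counter is never reset in this sketch, so a long-running camera that fails occasionally would eventually always wait the full 30 seconds; resetting it once data is flowing again is left to the application.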

Maintaining a certain level of synchronization (+/- 1 second between cameras)

You can sync them all from a single NTP server.
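As a rough sketch of what that could look like on the receiving side: rtspsrc has an `ntp-sync` property, which as far as I understand asks it to align streams using the NTP timestamps carried in the cameras’ RTCP sender reports. This assumes the cameras and the receiving host all discipline their clocks from the same NTP server and that the cameras actually send sender reports. The helper names, the latency value and the way capture times are collected are assumptions made for the example.

```python
def ntp_synced_source(url, latency_ms=500):
    """Source fragment for one camera with NTP-based stream alignment enabled."""
    return f'rtspsrc location="{url}" latency={latency_ms} ntp-sync=true'


def stragglers(latest_capture_time, tolerance=1.0):
    """Return the cameras whose newest frame is more than `tolerance`
    seconds behind the freshest camera.

    `latest_capture_time` maps camera name -> capture time in seconds,
    however the application chooses to record it (assumption).
    """
    if not latest_capture_time:
        return []
    newest = max(latest_capture_time.values())
    return sorted(
        name
        for name, t in latest_capture_time.items()
        if newest - t > tolerance
    )
```

A periodic check such as `stragglers({"cam01": 100.0, "cam02": 98.0})` can then be used to flag or restart cameras that have drifted outside the +/- 1 second window.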
