shmsrc/shmsink vs tee: which is best for building a complex/dynamic pipeline?

I’ve been trying to understand how to build a pipeline that takes a single stream and outputs multiple streams, for example to decode/display and record at the same time.
I’ve been using tee so far, but I’ve stumbled upon the shmsrc/shmsink elements. I wonder whether they are more efficient than using tee, and also whether there is any advantage to using tee over shm? It seems like tee uses shm behind the scenes.
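For reference, this is roughly how I’ve seen shmsink/shmsrc wired up across two processes (socket path, caps, and sizes here are just placeholders):

```shell
# Process A: writes raw frames into a shared-memory segment
gst-launch-1.0 videotestsrc ! \
  video/x-raw,format=I420,width=640,height=480,framerate=30/1 ! \
  shmsink socket-path=/tmp/shm-demo shm-size=10000000 wait-for-connection=false

# Process B: reads from the segment; the caps must be restated by hand,
# since shmsrc does not transmit them over the control socket
gst-launch-1.0 shmsrc socket-path=/tmp/shm-demo is-live=true ! \
  video/x-raw,format=I420,width=640,height=480,framerate=30/1 ! \
  videoconvert ! autovideosink
```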

Best regards,

shmsink/src are really designed to send data between processes. If you are using some kind of hardware acceleration, you may want to look at unixfdsrc/sink instead for the multi-process use-case.

If you are inside the same process, there are multiple options to split a pipeline: you can use inter{video,audio,sub}{src,sink}, which pass the buffers between pipelines as pointers, or proxysrc/proxysink if you want a tighter integration.
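As a rough sketch of the inter elements: the producer and consumer sides are tied together by an arbitrary channel name, and both pipelines must live in the same process (so with gst-launch you’d put both halves in one invocation; the channel name below is made up):

```shell
# Two unlinked branches inside one process: the intervideosink pushes frames
# to the named channel, and the intervideosrc pulls them out of it.
gst-launch-1.0 \
  videotestsrc is-live=true ! intervideosink channel=cam0 \
  intervideosrc channel=cam0 ! videoconvert ! autovideosink
```

In a real application you would typically build the two pipelines separately via the API; the channel name is the only coupling between them. proxysrc/proxysink can’t be expressed in gst-launch at all, since the two elements have to be linked programmatically.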

The main reason to keep everything in a single pipeline (using a tee) is to keep everything synchronized, but if you have a large complex pipeline, it may make sense to split it. It very much depends on what you are trying to accomplish and what your constraints are.

Thank you for this answer, I understand better now. But I am wondering why unixfdsrc/unixfdsink are superior to shmsrc/shmsink for media IPC when it comes to HW acceleration (which here means NVIDIA or Intel specifically, I believe, with elements like nvvidconv and such; please correct me if I’m wrong). As I understand HW support in GStreamer plugins, both use cases should work, and shared memory should even be superior, since access to RAM is the fastest?

Most HW acceleration requires a special allocation, which is most often associated with a FD. If you use shmsrc/shmsink, the data will be memcpy’d into the shared memory, but with unixfdsrc/sink, it will just send the FD, allowing for a zero-copy pipeline.
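A minimal two-process sketch with the unixfd elements (available since GStreamer 1.24; the socket path is a placeholder):

```shell
# Process A: passes each buffer's FD-backed memory over a Unix socket
# (buffers that aren't already FD-backed get copied into memfd first)
gst-launch-1.0 videotestsrc ! unixfdsink socket-path=/tmp/unixfd-demo

# Process B: imports the FDs, so FD-backed buffers cross without a memcpy;
# unlike shmsrc, caps are carried over the socket and need not be restated
gst-launch-1.0 unixfdsrc socket-path=/tmp/unixfd-demo ! videoconvert ! autovideosink
```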

That said, this applies to Intel, and to everything ARM when using the upstream kernels, but NVIDIA has its own abstraction, and our NVMM memory support currently doesn’t implement exporting/importing to a FD, so you’ll end up doing some kind of copy there too, I believe.

To be more precise, for CUDA, you want to use the CUDA IPC elements.

Sadly, the hardware I use (Jetson Orin) does not support CUDA IPC. However, I believe it is worth a shot to try unixfdsrc/unixfdsink (especially since it is compatible with ARM)? Currently I am bridging the gap by using OpenCV to read frames from a shm and then passing them to GStreamer, but this is not efficient.