How to set up an RTSP/RTP receiving pipeline for different encodings?

Beginner question, but I’m wondering if it’s possible to receive, in the same GStreamer application and through the same pipeline (maybe with decodebin), multiple video streams that are encoded differently (H264 and JPEG, for example).
I know there are rtpbin and decodebin; I have tried multiple combinations, but to no avail.

Yes, of course, you can use multiple rtpbins / decodebins in the same pipeline, but the details will depend on your exact inputs.
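For instance, a single gst-launch invocation can run two independent RTP branches side by side; a minimal sketch (the ports, payload numbers and decoders here are placeholders to adapt to your actual streams):

gst-launch-1.0 \
  udpsrc port=5000 caps="application/x-rtp,media=video,encoding-name=H264,payload=96" \
    ! rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! autovideosink \
  udpsrc port=5001 caps="application/x-rtp,media=video,encoding-name=JPEG,payload=26" \
    ! rtpjpegdepay ! jpegdec ! videoconvert ! autovideosink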

I have the following idea: a class that can deal with RTP streams. The goal is to handle different RTP streams using the same class, but right now I have to obtain information about each RTP stream one way or another. I was wondering if the pipeline could deduce the caps of the RTP streams by itself:

#include "RTPPipelineStrategy.hpp"
#include "StreamThread.hpp"

#include <gst/gst.h>
#include <iostream>
#include <stdexcept>

void RTPPipelineStrategy::init_elements(StreamThread &streamThread) {
  std::cout << "initializing elements of RTPPipelineStrategy" << std::endl;

  streamThread.source = gst_element_factory_make("udpsrc", "source");
  streamThread.queue1 = gst_element_factory_make("queue", "queue1rtp");
  if (streamThread.videoSource.encoding == EncodingType::H264) {
    streamThread.depay = gst_element_factory_make("rtph264depay", "depay");
    streamThread.parse = gst_element_factory_make("h264parse", "parse");
    streamThread.decode = gst_element_factory_make("avdec_h264", "decode");
  } else if (streamThread.videoSource.encoding == EncodingType::MJPEG) {
    streamThread.depay = gst_element_factory_make("rtpjpegdepay", "depay");
    // JPEG needs no parser between depayloader and decoder.
    streamThread.parse = nullptr;
    streamThread.decode = gst_element_factory_make("jpegdec", "decode");
  } else {
    g_printerr("Unsupported encoding type.\n");
    throw std::runtime_error("Unsupported encoding type");
  }
  streamThread.queue2 = gst_element_factory_make("queue", "queue2rtp");
  streamThread.queue3 = gst_element_factory_make("queue", "queue3rtp");
  streamThread.convert = gst_element_factory_make("videoconvert", "convert");
  streamThread.sink = gst_element_factory_make("tee", "sink");

  // parse is only created on the H264 path, so check it separately.
  if (!streamThread.source || !streamThread.queue1 || !streamThread.depay ||
      !streamThread.queue2 || !streamThread.decode || !streamThread.queue3 ||
      !streamThread.convert || !streamThread.sink ||
      (streamThread.videoSource.encoding == EncodingType::H264 &&
       !streamThread.parse)) {
    g_printerr("Not all elements could be created.\n");
    throw std::runtime_error("Element creation failed");
  }

  streamThread.pipeline = gst_pipeline_new("udp-pipeline");
  if (!streamThread.pipeline) {
    g_printerr("Pipeline could not be created.\n");
    throw std::runtime_error("Pipeline creation failed");
  }
}

void RTPPipelineStrategy::build_pipeline(StreamThread &streamThread) {
  g_object_set(G_OBJECT(streamThread.source), "port",
               streamThread.videoSource.port, NULL);
  // Set the appropriate caps based on the encoding type
  GstCaps *caps = nullptr;
  if (streamThread.videoSource.encoding == EncodingType::H264) {
    caps = gst_caps_from_string(
        "application/x-rtp,media=video,encoding-name=H264,payload=96");
  } else if (streamThread.videoSource.encoding == EncodingType::MJPEG) {
    caps = gst_caps_from_string(
        "application/x-rtp,media=video,encoding-name=JPEG,payload=26");
  } else {
    g_printerr("Unsupported encoding type.\n");
    throw std::runtime_error("Unsupported encoding type");
  }

  if (caps) {
    g_object_set(G_OBJECT(streamThread.source), "caps", caps, NULL);
    gst_caps_unref(caps);
  } else {
    g_printerr("Failed to create caps.\n");
    throw std::runtime_error("Caps creation failed");
  }

  // parse is NULL on the JPEG path, and a NULL element must not be passed
  // to gst_bin_add_many() / gst_element_link_many(), so add and link it
  // separately. queue2 and queue3, created above, are wired in as well.
  gst_bin_add_many(GST_BIN(streamThread.pipeline), streamThread.source,
                   streamThread.queue1, streamThread.depay,
                   streamThread.queue2, streamThread.decode,
                   streamThread.queue3, streamThread.convert,
                   streamThread.sink, NULL);

  gboolean linked =
      gst_element_link_many(streamThread.source, streamThread.queue1,
                            streamThread.depay, NULL) &&
      gst_element_link_many(streamThread.queue2, streamThread.decode,
                            streamThread.queue3, streamThread.convert,
                            streamThread.sink, NULL);
  if (streamThread.parse) {
    gst_bin_add(GST_BIN(streamThread.pipeline), streamThread.parse);
    linked = linked &&
             gst_element_link_many(streamThread.depay, streamThread.parse,
                                   streamThread.queue2, NULL);
  } else {
    linked = linked &&
             gst_element_link(streamThread.depay, streamThread.queue2);
  }

  if (!linked) {
    g_printerr("Elements could not be linked.\n");
    throw std::runtime_error("Pipeline linking failed");
  }

  if (!streamThread.dataflow(streamThread.sink)) {
    g_printerr("Failed to add initial queue to pipeline.\n");
    throw std::runtime_error("Adding initial queue failed");
  }
}

As you can see, I have to choose the correct type of stream, between JPEG packets and H264 packets. It would be amazing if rtpbin/decodebin could deduce it, but all my attempts have failed…
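The closest workaround I can think of is sniffing the payload type myself with a pad probe on udpsrc before picking the branch; a rough, untested sketch (only statically assigned payload types such as 26 = JPEG can be identified this way; dynamic ones like 96 still need out-of-band info such as an SDP):

#include <gst/gst.h>
#include <gst/rtp/gstrtpbuffer.h> // link against gstreamer-rtp-1.0

// Inspect the payload type of the first RTP packet, then remove the probe.
static GstPadProbeReturn probe_first_rtp_packet(GstPad *pad,
                                                GstPadProbeInfo *info,
                                                gpointer user_data) {
  GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER(info);
  GstRTPBuffer rtp = GST_RTP_BUFFER_INIT;

  if (gst_rtp_buffer_map(buf, GST_MAP_READ, &rtp)) {
    guint8 pt = gst_rtp_buffer_get_payload_type(&rtp);
    gst_rtp_buffer_unmap(&rtp);
    g_print("first RTP packet has payload type %u\n", pt);
    // pt == 26 -> JPEG branch (rtpjpegdepay ! jpegdec)
    // pt >= 96 -> dynamic payload type, still ambiguous without an SDP
  }
  return GST_PAD_PROBE_REMOVE; // one packet is enough
}

// Attached to the udpsrc src pad, e.g.:
//   GstPad *src = gst_element_get_static_pad(streamThread.source, "src");
//   gst_pad_add_probe(src, GST_PAD_PROBE_TYPE_BUFFER,
//                     probe_first_rtp_packet, NULL, NULL);
//   gst_object_unref(src);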

rtspsrc ! parsebin ! decodebin ! autovideoconvert ! autovideosink should work. I tried mostly with H264 and H265 streams. There is one camera that streams H263, and that works too. I don’t know about JPEG…
Also, I had lots of problems when I tried to create elements, add them to the pipeline and link them together, so I’m creating the pipeline with gst_parse_launch instead…
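Roughly like this, in case it helps (the URL is a placeholder; error handling kept minimal):

#include <gst/gst.h>

int main(int argc, char *argv[]) {
  gst_init(&argc, &argv);

  GError *error = NULL;
  // Placeholder URL; point it at your own camera / server.
  GstElement *pipeline = gst_parse_launch(
      "rtspsrc location=rtsp://192.168.1.10:8554/stream ! "
      "parsebin ! decodebin ! autovideoconvert ! autovideosink",
      &error);
  if (error) {
    g_printerr("Parse error: %s\n", error->message);
    g_clear_error(&error);
    if (!pipeline)
      return -1;
  }

  gst_element_set_state(pipeline, GST_STATE_PLAYING);

  // Block until an error or EOS reaches the bus.
  GstBus *bus = gst_element_get_bus(pipeline);
  GstMessage *msg = gst_bus_timed_pop_filtered(
      bus, GST_CLOCK_TIME_NONE,
      (GstMessageType)(GST_MESSAGE_ERROR | GST_MESSAGE_EOS));
  if (msg)
    gst_message_unref(msg);
  gst_object_unref(bus);

  gst_element_set_state(pipeline, GST_STATE_NULL);
  gst_object_unref(pipeline);
  return 0;
}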


You may try the uridecodebin element (or uridecodebin3):

gst-launch-1.0 -v uridecodebin uri=rtsp://your_rtsp_server:port/path ! queue ! autovideoconvert ! autovideosink

# In some cases on Jetson the command above might fail. In that case, try:
gst-launch-1.0 -v uridecodebin uri=rtsp://your_rtsp_server:port/path ! queue ! videoconvert ! autovideosink

It can also handle a file (uri=file://absolute_path_to_file, such as uri=file:///home/user/Videos/test.mp4).

I’ve been trying to figure out which URIs it actually handles. I was wondering if it could handle an RTP stream sent via TCP? It seems that might not be the case; it would be nice to see a detailed list of its capabilities, but sadly there is nothing about that in the docs…
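The best I’ve found so far is asking the registry which source elements register URI protocols; a small sketch (it lists what the local install provides, which is not necessarily everything uridecodebin can decode):

#include <gst/gst.h>

// Print every source element factory together with the URI protocols it
// registers (rtsp, file, http, ...). The result depends on installed plugins.
int main(int argc, char *argv[]) {
  gst_init(&argc, &argv);

  GList *factories = gst_element_factory_list_get_elements(
      GST_ELEMENT_FACTORY_TYPE_SRC, GST_RANK_NONE);

  for (GList *l = factories; l != NULL; l = l->next) {
    GstElementFactory *f = GST_ELEMENT_FACTORY(l->data);
    const gchar *const *protocols = gst_element_factory_get_uri_protocols(f);
    if (protocols == NULL || *protocols == NULL)
      continue; // factory has no URI handler
    g_print("%s:", GST_OBJECT_NAME(f));
    for (; *protocols != NULL; protocols++)
      g_print(" %s", *protocols);
    g_print("\n");
  }

  gst_plugin_feature_list_free(factories);
  return 0;
}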

Also, I tried to parse the following pipeline:
gst-launch-1.0 v4l2src device=/dev/video0 ! "image/jpeg, width=1280, height=720, framerate=30/1" ! jpegdec ! videoconvert ! videoscale ! "video/x-raw, width=1760, height=990" ! jpegenc ! rtpjpegpay ! rtpstreampay ! tcpserversink port=5000 host=127.0.0.1

The receiving end works well with this:
gst-launch-1.0 tcpclientsrc host=127.0.0.1 port=5000 ! "application/x-rtp-stream,encoding-name=JPEG" ! rtpstreamdepay ! rtpjpegdepay ! jpegdec ! videoconvert ! autovideosink

However, I cannot replace it with parsebin and decodebin:
tcpclientsrc host=127.0.0.1 port=5000 ! "application/x-rtp,encoding-name=JPEG" ! parsebin ! decodebin ! videoconvert ! autovideosink

It tells me I’m missing a plugin, or that there is a linking problem:

Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
0:00:00.038119991 10654 0xaaab0d8e42a0 FIXME                default gstutils.c:3980:gst_pad_create_stream_id_internal:<tcpclientsrc0:src> Creating random stream-id, consider implementing a deterministic way of creating a stream-id
0:00:00.069696490 10654 0xaaab0d8e42a0 WARN                 default descriptions.c:1233:gst_pb_utils_get_codec_description: No description available for media type: application/x-rtp
0:00:00.069972492 10654 0xaaab0d8e42a0 WARN                 default descriptions.c:1233:gst_pb_utils_get_codec_description: No description available for media type: application/x-rtp
0:00:00.070000780 10654 0xaaab0d8e42a0 WARN                parsebin gstparsebin.c:3486:gst_parse_bin_expose:<parsebin0> error: no suitable plugins found:
Missing parser: application/x-rtp (application/x-rtp, encoding-name=(string)JPEG)

Missing element: application/x-rtp decoder
ERROR: from element /GstPipeline:pipeline0/GstParseBin:parsebin0: Your GStreamer installation is missing a plug-in.
Additional debug info:
gstparsebin.c(3486): gst_parse_bin_expose (): /GstPipeline:pipeline0/GstParseBin:parsebin0:
no suitable plugins found:
Missing parser: application/x-rtp (application/x-rtp, encoding-name=(string)JPEG)

0:00:00.070138125 10654 0xaaab0d8e42a0 WARN                 basesrc gstbasesrc.c:3072:gst_base_src_loop:<tcpclientsrc0> error: Internal data stream error.
ERROR: pipeline doesn't want to preroll.
Setting pipeline to NULL ...
0:00:00.070161997 10654 0xaaab0d8e42a0 WARN                 basesrc gstbasesrc.c:3072:gst_base_src_loop:<tcpclientsrc0> error: streaming stopped, reason not-linked (-1)
Freeing pipeline ...

But even when I just replace jpegdec with decodebin, it fails:
gst-launch-1.0 tcpclientsrc host=127.0.0.1 port=5000 ! "application/x-rtp-stream,encoding-name=JPEG" ! rtpstreamdepay ! rtpjpegdepay ! decodebin ! videoconvert ! autovideosink

...ing a deterministic way of creating a stream-id
0:00:00.128126479 10705 0xaaaaf8281300 FIXME           videodecoder gstvideodecoder.c:946:gst_video_decoder_drain_out:<nvjpegdec0> Sub-class should implement drain()
0:00:00.128260880 10705 0xaaaaf8281300 FIXME           videodecoder gstvideodecoder.c:946:gst_video_decoder_drain_out:<nvjpegdec0> Sub-class should implement drain()
NvMMLiteBlockCreate : Block : BlockType = 256
[JPEG Decode] BeginSequence Display WidthxHeight 1760x992
0:00:00.137935366 10705 0xaaaaf8281300 WARN           basetransform gstbasetransform.c:1362:gst_base_transform_setcaps:<videoconvert0> transform could not transform video/x-raw(memory:NVMM), format=(string)I420, width=(int)1760, height=(int)992, interlace-mode=(string)progressive, multiview-mode=(string)mono, multiview-flags=(GstVideoMultiviewFlagsSet)0:ffffffff:/right-view-first/left-flipped/left-flopped/right-flipped/right-flopped/half-aspect/mixed-mono, pixel-aspect-ratio=(fraction)1/1, chroma-site=(string)mpeg2, colorimetry=(string)1:4:0:0, framerate=(fraction)0/1 in anything we support
0:00:00.138060775 10705 0xaaaaf8281300 WARN                GST_PADS gstpad.c:4231:gst_pad_peer_query:<decodebin0:src_0> could not send sticky events
(the "transform could not transform video/x-raw(memory:NVMM) …" and "could not send sticky events" warnings repeat several more times)

Note: I am using GStreamer 1.16

For the error above, I think it is because decodebin has selected either nvjpegdec or nvv4l2decoder (mjpeg=1), which both output into NVMM memory.
You could try using autovideoconvert instead of videoconvert (or nvvidconv if you don’t have autovideoconvert available).
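For example, something like this (untested sketch; nvvidconv is the Jetson converter, and the video/x-raw caps force the copy out of NVMM memory):

gst-launch-1.0 tcpclientsrc host=127.0.0.1 port=5000 ! "application/x-rtp-stream,encoding-name=JPEG" ! rtpstreamdepay ! rtpjpegdepay ! decodebin ! nvvidconv ! video/x-raw ! videoconvert ! autovideosink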

For this topic, I’m afraid you’re reinventing the wheel: RTSP already handles this negotiation, so if you have access to the sender, you may just set up an RTSP server around your RTP stream, and uridecodebin on the receiver side would then be able to decode it.
If your concern is about using TCP transport instead of UDP, with RTSP the receiver can specify TCP transport, such as:

uridecodebin uri=rtsp://server:port/stream source::protocols=tcp ! ...
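Or, equivalently, with rtspsrc directly, which exposes a protocols property:

rtspsrc location=rtsp://server:port/stream protocols=tcp ! parsebin ! decodebin ! videoconvert ! autovideosink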