Hi everyone,
I have a scenario where I receive audio samples from OpenAI Realtime that I want to play back on a remote computer. Initially, I tried using webrtcsink, but I couldn’t get it to work.
In theory, it should be possible to use a pipeline such as:
appsrc caps="audio/x-raw,channels=2,rate=24000,format=F32LE" ! webrtcsink
and then:
webrtcsrc ! alsasink sync=false
However, this didn’t work as expected. Interestingly, replacing appsrc with audiotestsrc works fine.
Can the WebRTC pipeline work with an appsrc that doesn’t push data continuously?
I managed to get it working by combining appsrc ! audiomixer with audiotestsrc wave=silence ! audiomixer, but the received audio wasn’t rendered properly. Could this be due to improper timestamping of the audio samples from OpenAI?
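To make the timestamping question concrete, here is a minimal sketch in plain Python (no GStreamer dependency) of how a PTS and duration could be derived from a running frame count for the caps above. The helper name buffer_timing and the chunk sizes are hypothetical, and the code assumes interleaved F32LE, 2 channels, 24000 Hz. The returned values are what I would expect to assign to each buffer's pts and duration before pushing it into appsrc.

```python
from typing import Tuple

# Assumptions taken from the appsrc caps above: F32LE, 2 channels, 24000 Hz,
# interleaved layout. Adjust if the real stream differs.
NSECONDS_PER_SECOND = 1_000_000_000
RATE = 24000
CHANNELS = 2
BYTES_PER_SAMPLE = 4  # one 32-bit float per sample


def buffer_timing(frames_so_far: int, chunk: bytes) -> Tuple[int, int, int]:
    """Hypothetical helper: return (pts_ns, duration_ns, new_frame_count)."""
    frame_size = CHANNELS * BYTES_PER_SAMPLE
    if len(chunk) % frame_size:
        raise ValueError("chunk does not contain a whole number of frames")
    frames = len(chunk) // frame_size
    # Derive both ends from the frame counter so rounding errors never
    # accumulate across buffers.
    pts = frames_so_far * NSECONDS_PER_SECOND // RATE
    end = (frames_so_far + frames) * NSECONDS_PER_SECOND // RATE
    return pts, end - pts, frames_so_far + frames


# Usage: two consecutive chunks of 100 ms and 20 ms of audio.
count = 0
for chunk in (bytes(8 * 2400), bytes(8 * 480)):
    pts, duration, count = buffer_timing(count, chunk)
    print(pts, duration)
```

Timestamps computed this way are contiguous only while data keeps arriving. If the source pauses, the counter would also have to advance by the length of the gap (or the gap be filled with silence), which may be related to what I am seeing.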
For now, I’ve switched to a simple udpsrc/udpsink setup:
gst-launch-1.0 appsrc ! audioconvert ! audioresample ! opusenc ! rtpopuspay ! udpsink host=127.0.0.1
gst-launch-1.0 udpsrc ! "application/x-rtp,media=audio,encoding-name=OPUS,payload=96" ! rtpjitterbuffer ! rtpopusdepay ! queue ! opusdec ! audioconvert ! audioresample ! alsasink sync=false
While this works, I’d prefer using WebRTC since it’s already being used to stream microphone input from the remote computer to OpenAI Realtime.
Before settling on udpsrc/udpsink, I also tried rtpsrc/rtpsink, but that didn’t work, even with audiotestsrc. On the receiver side, I got these warnings:
0:00:02.326823961 110052 0x7f7f20000d30 WARN rtpsource rtpsource.c:1134:calculate_jitter: cannot get clock-rate for pt 96
0:00:02.326839070 110052 0x7f7f20000d30 WARN rtpjitterbuffer gstrtpjitterbuffer.c:3754:gst_rtp_jitter_buffer_chain:<rtpjitterbuffer0> No clock-rate in caps!, dropping buffer
I used the following commands:
gst-launch-1.0 audiotestsrc ! audioconvert ! audioresample ! opusenc ! rtpopuspay ! rtpsink address=127.0.0.1
gst-launch-1.0 rtpsrc ! "application/x-rtp,media=audio,encoding-name=OPUS,payload=96,clock-rate=48000" ! rtpjitterbuffer ! rtpopusdepay ! queue ! opusdec ! audioconvert ! audioresample ! autoaudiosink
If anyone has experience with this or suggestions for improving the setup, I’d love to hear your thoughts!
Thanks in advance!