SIGSEGV with PyGObject + GStreamer under concurrent pipelines (Python)
Summary
A segmentation fault (SIGSEGV, exit code 139) occurs when running multiple concurrent GStreamer pipelines using PyGObject in a Python application.
The crash occurs at an unpredictable point after some runtime, and it reproduces every time more than two sessions run concurrently.
No Python exception is raised; the process terminates inside native code.
Environment
- OS: Ubuntu 22.04 (Docker)
- Python: 3.10.12
- GStreamer: 1.20.3
- PyGObject: 3.42.1
- Architecture: x86_64
Application overview
- Receives live WebM/Opus audio over WebSocket
- Uses GStreamer (PyGObject) to:
  - Convert WebM Opus → Ogg Opus or PCM
- Each WebSocket session creates:
  - Its own GStreamer pipeline
  - Its own GLib.MainLoop running in a Python thread
- Output is consumed by Azure Cognitive Services Speech SDK
Steps to reproduce
- Run the app inside an Ubuntu 22.04 Docker container.
- Start the Python server using PyGObject + GStreamer.
- Open more than two concurrent WebSocket connections.
- Stream audio continuously.
- After some time, the process crashes with SIGSEGV.
Reproducibility: Always under concurrency, but timing varies.
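The steps above could be driven without a WebSocket client by a small harness like the sketch below. The names `run_concurrent_sessions` and `converter_factory` are hypothetical and not part of the application; the sketch only assumes that a converter exposes `feed`, `read_ogg` and `close` as in the reproduction code, and that `chunks` is a valid WebM stream already split into byte chunks in order (not supplied here).

```python
import threading


def run_concurrent_sessions(converter_factory, chunks, sessions=3):
    """Run `sessions` converters in parallel threads, feeding each the same chunks.

    Returns the Python exceptions raised by the workers. A SIGSEGV would
    terminate the whole process instead of showing up in this list.
    """
    errors = []

    def worker():
        conv = None
        try:
            conv = converter_factory()
            for chunk in chunks:
                conv.feed(chunk)
                conv.read_ogg()
        except Exception as exc:
            errors.append(exc)
        finally:
            if conv is not None:
                conv.close()

    threads = [threading.Thread(target=worker) for _ in range(sessions)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

With the reproduction code this might be called as `run_concurrent_sessions(get_webm_to_ogg_converter, chunks, sessions=3)`, which matches the "more than two concurrent connections" condition.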
Minimal reproduction code
import threading

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst, GLib

Gst.init(None)


class WebmGstreamerConverter:
    def __init__(self, pipeline):
        # Create the GStreamer pipeline
        self.pipeline = Gst.parse_launch(pipeline)
        # Get source and sink elements
        self.src = self.pipeline.get_by_name("src")
        self.sink = self.pipeline.get_by_name("sink")
        # Connect appsink signal
        self.sink.connect("new-sample", self.on_new_sample)
        # Start GLib MainLoop in a separate thread
        self.loop = GLib.MainLoop()
        self.loop_thread = threading.Thread(target=self.loop.run, daemon=True)
        self.loop_thread.start()
        self.output = []
        # Start pipeline
        self.pipeline.set_state(Gst.State.PLAYING)

    def on_new_sample(self, sink):
        """Callback when a new Ogg sample is available from appsink."""
        sample = sink.emit("pull-sample")
        buf = sample.get_buffer()
        success, map_info = buf.map(Gst.MapFlags.READ)
        if success:
            self.output.append(map_info.data)
            buf.unmap(map_info)
        return Gst.FlowReturn.OK

    def feed(self, webm_chunk: bytes):
        """Feed WebM data into the GStreamer pipeline."""
        buf = Gst.Buffer.new_allocate(None, len(webm_chunk), None)
        buf.fill(0, webm_chunk)
        # Set timestamps to avoid pipeline stalling
        buf.pts = buf.dts = Gst.util_uint64_scale(len(webm_chunk), Gst.SECOND, 8192)
        self.src.emit("push-buffer", buf)

    def read_ogg(self) -> bytes:
        """Retrieve accumulated Ogg data."""
        data = b"".join(self.output)
        self.output.clear()
        return data

    def close(self):
        """Gracefully stop the pipeline."""
        self.src.emit("end-of-stream")
        self.pipeline.set_state(Gst.State.NULL)
        self.loop.quit()


def get_webm_to_ogg_converter() -> WebmGstreamerConverter:
    return WebmGstreamerConverter(
        "appsrc name=src is-live=true format=time do-timestamp=true ! "
        "matroskademux name=demux ! "
        "queue ! opusparse ! muxer. "
        "oggmux name=muxer ! appsink name=sink emit-signals=true sync=false"
    )


def get_webm_to_pcm_converter() -> WebmGstreamerConverter:
    return WebmGstreamerConverter(
        "appsrc name=src is-live=true format=time do-timestamp=true ! "
        "matroskademux name=demux ! "
        "queue ! opusdec ! audioconvert ! audioresample ! "
        "audio/x-raw,format=S16LE,channels=1,rate=16000 ! "
        "appsink name=sink emit-signals=true sync=false"
    )
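One detail of the code above that may be worth ruling out: appsink emits "new-sample" from a GStreamer streaming thread, so on_new_sample and read_ogg can interleave, and a chunk appended between the join and the clear in read_ogg would be dropped. The sketch below shows a lock-protected replacement for the plain list. `ChunkBuffer` is a hypothetical helper, not part of the application, and there is no claim that this race is the cause of the SIGSEGV.

```python
import threading


class ChunkBuffer:
    """Hypothetical thread-safe replacement for the `self.output` list."""

    def __init__(self):
        self._lock = threading.Lock()
        self._chunks = []

    def append(self, chunk) -> None:
        # bytes() copies a memoryview or bytearray; a bytes object is
        # passed through unchanged. Either way, nothing stored here
        # refers to memory owned by a mapped buffer.
        data = bytes(chunk)
        with self._lock:
            self._chunks.append(data)

    def drain(self) -> bytes:
        # Swap the list out under the lock so no chunk can arrive
        # between reading and clearing.
        with self._lock:
            chunks, self._chunks = self._chunks, []
        return b"".join(chunks)
```

In the converter, on_new_sample would call `append(map_info.data)` before `buf.unmap(map_info)`, and read_ogg would return `drain()`.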
Actual behavior
- Process crashes with:
  Segmentation fault (core dumped)
  Exit code: 139
- No Python traceback
- Crash occurs in native threads (GLib / GStreamer / Azure SDK)
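Since no Python traceback is produced, one low-cost diagnostic is the standard library's faulthandler module, which dumps the Python stack of every thread when the process receives a fatal signal such as SIGSEGV. A minimal sketch, assuming it is placed at the very top of the server's entry point:

```python
import faulthandler

# On SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL, write the Python
# traceback of all threads to stderr before the process dies.
faulthandler.enable(all_threads=True)
```

The same effect is available without code changes by starting the interpreter with `python -X faulthandler` or setting `PYTHONFAULTHANDLER=1` in the container. This only shows Python frames; threads that are purely native (GStreamer streaming threads, SDK worker threads) would still need a native backtrace from the core dump.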
Expected behavior
- Safe handling of multiple concurrent pipelines
- No memory corruption or process crash
Observations / questions
- Each session runs its own GLib.MainLoop
- Crash happens in non-Python threads
- Is it safe to:
  - Run multiple GLib.MainLoop instances in one process?
  - Use multiple concurrent GStreamer pipelines from Python?
- Is a single global GLib main loop required?
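For comparison, the single-loop variant asked about above could look like the following sketch. `SharedMainLoop` and its `loop_factory` parameter are hypothetical names introduced for illustration; the default factory creates a GLib.MainLoop on the default context, which is what the reproduction code does once per session. Whether switching to one loop changes the crash behavior is untested.

```python
import threading


def _new_glib_loop():
    # Imported lazily so this module loads even where PyGObject is absent.
    from gi.repository import GLib
    return GLib.MainLoop()


class SharedMainLoop:
    """Hypothetical helper: one main loop in one thread for the whole process."""

    def __init__(self, loop_factory=_new_glib_loop):
        self._loop_factory = loop_factory
        self._lock = threading.Lock()
        self._loop = None
        self._thread = None

    def ensure_running(self):
        """Start the loop thread on first call; later calls reuse it."""
        with self._lock:
            if self._loop is None:
                self._loop = self._loop_factory()
                self._thread = threading.Thread(
                    target=self._loop.run, name="glib-main-loop", daemon=True
                )
                self._thread.start()
            return self._loop

    def stop(self):
        """Quit the loop and wait briefly for its thread (process shutdown)."""
        with self._lock:
            loop, thread = self._loop, self._thread
            self._loop = self._thread = None
        if loop is not None:
            loop.quit()
            thread.join(timeout=5)
```

Each converter would call `ensure_running()` on a module-level instance instead of creating its own loop and thread, and would no longer call `loop.quit()` in close(). Note that the reproduction code attaches nothing to its loops (no bus watch, for example), and the appsink callbacks run on streaming threads rather than on the loop.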
Any guidance or confirmation of known limitations would be appreciated.