Nvcudah264enc vs nvh264enc sinks & performance

tadeaustria · March 21, 2024, 9:39am

Although both encoders offer the same features, they have different requirements for their sink pads. nvcudah264enc seems to require its input to already be in a YUV format, while nvh264enc also accepts RGBA formats. This restriction on nvcudah264enc means an additional cudaconvert element in the pipeline.
However, measurements show that RGBA → nvh264enc is more performant than RGBA → cudaconvert → nvcudah264enc.
Is there a specific reason nvcudah264enc does not provide the same video format interface as nvh264enc?

seungha · March 21, 2024, 10:50am

The reason is that NVENC does not expose any parameters related to RGB → YUV conversion, so the conversion is not controllable.

NVENC launches a CUDA kernel regardless of the input format (I guess it does linear → tiled conversion or similar). So doing it all at once, as nvh264enc does, might be more performant, yes.

Can you share your performance measurement results? I might need to consider adding RGB format support to the new encoders, depending on the perf. difference.

tadeaustria · March 21, 2024, 12:26pm

Yes, that’s true, the conversion is then kinda hidden.

Our tests were done with gst-launch-1.0:

gst-launch-1.0 videotestsrc num-buffers=100 ! video/x-raw,width=3200,height=1200,framerate=60/1,format=RGBA ! cudaupload ! cudaconvert ! "video/x-raw(memory:CUDAMemory),format=NV12" ! nvcudah264enc ! h264parse ! mp4mux ! fakesink

tadeaustria · March 21, 2024, 12:27pm

gst-launch-1.0 videotestsrc num-buffers=100 ! video/x-raw,width=3200,height=1200,framerate=60/1,format=RGBA ! nvh264enc ! h264parse ! mp4mux ! fakesink

So doing the conversion in one step seems to take only a third of the time compared with doing it separately.
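
For anyone who wants to reproduce the comparison, here is a rough sketch of how the two pipelines above could be timed side by side. It is not how the numbers in this thread were taken. The helper name elapsed_ms and the CAPS variable are made up for illustration, and it assumes GNU date (for %N nanoseconds). It measures wall-clock time for the whole gst-launch-1.0 process, so startup and plugin loading are included.

```shell
#!/bin/sh
# Sketch only: assumes GNU date, which supports %N (nanoseconds).

# Hypothetical helper: run a command, discard its output,
# and print the elapsed wall-clock time in milliseconds.
elapsed_ms() {
    _start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    _end=$(date +%s%N)
    echo $(( (_end - _start) / 1000000 ))
}

# Same source caps as in the two pipelines above.
CAPS='video/x-raw,width=3200,height=1200,framerate=60/1,format=RGBA'

# Only run the comparison if gst-launch-1.0 is installed.
if command -v gst-launch-1.0 >/dev/null 2>&1; then
    # Separate conversion: cudaupload + cudaconvert, then nvcudah264enc.
    separate=$(elapsed_ms gst-launch-1.0 videotestsrc num-buffers=100 ! "$CAPS" \
        ! cudaupload ! cudaconvert ! 'video/x-raw(memory:CUDAMemory),format=NV12' \
        ! nvcudah264enc ! h264parse ! mp4mux ! fakesink)

    # RGBA fed straight into nvh264enc.
    combined=$(elapsed_ms gst-launch-1.0 videotestsrc num-buffers=100 ! "$CAPS" \
        ! nvh264enc ! h264parse ! mp4mux ! fakesink)

    echo "cudaconvert + nvcudah264enc: ${separate} ms"
    echo "nvh264enc (RGBA input):      ${combined} ms"
fi
```

Note that the helper prints a time even if the pipeline fails to start (for example, when an element is missing), so it is worth running each pipeline once by hand first and repeating the measurement a few times.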