[FFmpeg-devel] Sharing cuda context between transcode sessions to reduce initialization overhead

Mon Jun 12 23:38:29 EEST 2017

Hi,

Currently, in the case of a 1 -> N transcode (1 SW decode -> N NVENC encodes) without the HW upload filter, we end up allocating multiple CUDA contexts for the N transcode sessions, all for the same underlying GPU device. Each one incurs the CUDA context initialization overhead (~100 ms per context creation on a 4th-gen i5 with a GTX 1080 on Ubuntu 16.04). The same issue shows up with M * (1 -> N) fully HW-accelerated transcodes, where the CUDA context is not shared between the M transcode sessions. Sharing the context would greatly reduce the initialization time, which matters for short clip transcodes.

I currently have a global array in libavutil/hwcontext_cuda.c which keeps track of the CUDA contexts created so far and reuses an existing context when a hwdevice context create request comes in. This is shared in the attached patch. Please check the approach and let me know if there is a better/cleaner way to do this. Thanks.
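
For readers without the attachment, the idea of a global table that hands out an existing context on repeated create requests can be sketched roughly as below. This is not the code from the patch: every name here (FakeCudaContext, shared_ctx_acquire, shared_ctx_release, MAX_DEVICES) is hypothetical, the expensive driver call is replaced by a stand-in that only counts invocations, and the per-device reference count and mutex are assumptions about how such a table could be kept safe, not something the message states.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_DEVICES 16  /* hypothetical upper bound on GPU indices */

/* Stand-in for a real CUDA context handle (assumption, not the driver type). */
typedef struct FakeCudaContext {
    int device_idx;
} FakeCudaContext;

/* One slot per device: the shared context and how many sessions hold it. */
typedef struct SharedCtxEntry {
    FakeCudaContext *ctx;
    int refcount;
} SharedCtxEntry;

static SharedCtxEntry  ctx_table[MAX_DEVICES];
static FakeCudaContext ctx_storage[MAX_DEVICES];
static pthread_mutex_t ctx_table_lock = PTHREAD_MUTEX_INITIALIZER;
static int ctx_create_count;  /* counts how often the "expensive" path ran */

/* Placeholder for the costly context creation (~100 ms in the report above). */
static FakeCudaContext *expensive_ctx_create(int device_idx)
{
    ctx_create_count++;
    ctx_storage[device_idx].device_idx = device_idx;
    return &ctx_storage[device_idx];
}

/* Return the context for device_idx, creating it only on first use. */
FakeCudaContext *shared_ctx_acquire(int device_idx)
{
    FakeCudaContext *ctx;

    if (device_idx < 0 || device_idx >= MAX_DEVICES)
        return NULL;

    pthread_mutex_lock(&ctx_table_lock);
    if (!ctx_table[device_idx].ctx)
        ctx_table[device_idx].ctx = expensive_ctx_create(device_idx);
    ctx_table[device_idx].refcount++;
    ctx = ctx_table[device_idx].ctx;
    pthread_mutex_unlock(&ctx_table_lock);

    return ctx;
}

/* Drop one reference; the slot is cleared when the last user releases it. */
void shared_ctx_release(int device_idx)
{
    if (device_idx < 0 || device_idx >= MAX_DEVICES)
        return;

    pthread_mutex_lock(&ctx_table_lock);
    if (ctx_table[device_idx].refcount > 0 &&
        --ctx_table[device_idx].refcount == 0) {
        /* A real implementation would destroy the driver context here. */
        ctx_table[device_idx].ctx = NULL;
    }
    pthread_mutex_unlock(&ctx_table_lock);
}
```

With a table like this, N encode sessions on the same device would pay the creation cost once rather than N times. Whether the context should be torn down when the last session releases it, or kept alive for the lifetime of the process, is a design choice this sketch does not settle.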

Regards

Ganapathy

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Share-cuda-context-across-multiple-transcode-session.patch
Type: text/x-patch
Size: 6032 bytes
Desc: 0001-Share-cuda-context-across-multiple-transcode-session.patch
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20170612/f829c297/attachment.bin>