[FFmpeg-devel] h264_qsv decoder speed

Mark Thompson sw at jkqxz.net
Thu Aug 18 01:13:13 EEST 2016

On 17/08/16 20:47, Chao Liu wrote:
> Hi there,
> I compared the h264_qsv decoder from ffmpeg to sample_decode from the Intel
> Media SDK. There is a pretty big speed gap. I wonder whether I did something
> wrong or there are really some problems with ffmpeg's implementation.
> The test video was captured from a 3MP(2048x1536) camera. The commands I
> used:
> -  ffmpeg -c:v h264_qsv -async_depth 10 -i test.h264 -c:v rawvideo -f null /dev/null
> -  sample_decode h264 -i test.h264
> Both use 100% CPU (a full core). ffmpeg got 170 FPS. sample_decode got
> 370 FPS.
> I haven't had time to debug this. Sending this out to see whether you
> guys might have something in mind.

I think in both cases you must be bound by something other than the decode, because the hardware goes a lot faster than either of those for me.  Perhaps you are downloading all of the output frames to normal memory in order to write them to a null device output, and one of the cases is doing that less efficiently somehow?

Using vaapi on a low-power Haswell mobile chip (i.e. the same Quick Sync hardware that libmfx uses), I can decode a single 2048x1536 stream at around 800fps with less than 50% CPU.
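
As a rough sanity check on the download theory, the sketch below estimates how much data per second would have to be copied to normal memory at each of the frame rates mentioned in this thread. It assumes the decoded frames are 8-bit NV12 (12 bits per pixel); that layout and the helper names frame_bytes and copy_mb_per_s are illustrative assumptions, not something taken from either tool.

```shell
#!/bin/sh
# Assumption: decoded frames are 8-bit NV12, i.e. width * height * 3 / 2 bytes each.

# Bytes in one NV12 frame of the given width and height.
frame_bytes() {
    echo $(( $1 * $2 * 3 / 2 ))
}

# Copy bandwidth in decimal MB/s (rounded down) for width, height and frame rate.
copy_mb_per_s() {
    echo $(( $(frame_bytes "$1" "$2") * $3 / 1000000 ))
}

# Frame rates quoted in the thread: ffmpeg h264_qsv, sample_decode, vaapi.
for fps in 170 370 800; do
    echo "$fps fps: $(copy_mb_per_s 2048 1536 "$fps") MB/s"
done
```

Under that assumption a 2048x1536 frame is about 4.7 MB, so 170, 370 and 800 fps correspond to roughly 0.8, 1.7 and 3.8 GB/s of frame data. Those figures are only what a download path would need to sustain if every frame were copied; they say nothing about where the time actually goes.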

- Mark

(My command to compare is:

./ffmpeg_g -vaapi_device /dev/dri/renderD128 -hwaccel vaapi -hwaccel_output_format vaapi -i input.mp4 -an -vf 'format=nv12|vaapi,hwupload' -f null -

The nasty filtering there is contrived to do nothing, even with the inconvenient stream reinitialisation.  I think libmfx might also work somehow with "-c:v h264_qsv -hwaccel qsv", but I'm not sure and I don't have anything to try it on right now.)
