[FFmpeg-trac] #7690(undetermined:new): FFmpeg QSV decode + VPP performance is just a fraction of what one gets with VA-API and MediaSDK

FFmpeg trac at avcodec.org
Wed Jan 23 17:35:00 EET 2019


#7690: FFmpeg QSV decode + VPP performance is just a fraction of what one gets
with VA-API and MediaSDK
-------------------------------------+-------------------------------------
             Reporter:  eero-t       |                    Owner:
                 Type:  defect       |                   Status:  new
             Priority:  normal       |                Component:
              Version:  git-master   |  undetermined
             Keywords:  qsv          |               Resolution:
             Blocking:               |               Blocked By:
Analyzed by developer:  0            |  Reproduced by developer:  0
-------------------------------------+-------------------------------------

Comment (by eero-t):

 More testing with KBL-i7 GT2 (and latest Git versions of everything).

 With 8-bit 1920x540 HEVC decode, QSV is clearly faster than VA-API:
 {{{
 ffmpeg -hwaccel qsv -qsv_device /dev/dri/renderD128 -c:v hevc_qsv -i 1920x540_60_yuv420p_4800.h265 -f null -
 ...
 ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -i 1920x540_60_yuv420p_4800.h265 -y -f null -
 }}}

 With (8-bit and 10-bit) 1920x1080 HEVC decode, however, QSV was clearly
 slower than VA-API.
 Both results also hold when the GPU is forced to run at max speed.
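
 For reference, one way to force the GPU to max speed on i915 is through
 the sysfs frequency controls.  This is only a sketch: the card0 path is an
 assumption (the card index can differ per machine), force_gpu_max is a
 hypothetical helper name, and it is not necessarily how the numbers above
 were obtained.  Writing these files needs root.

```shell
# Sketch: pin the i915 GPU frequency range to its highest regular frequency.
# Assumption: the GPU is exposed as /sys/class/drm/card0 (pass another
# directory as the first argument if it is not).
force_gpu_max() {
    drm_dir="${1:-/sys/class/drm/card0}"
    # gt_RP0_freq_mhz reports the highest non-boost frequency of the GPU.
    rp0=$(cat "$drm_dir/gt_RP0_freq_mhz") || return 1
    # Raise the ceiling first, then the floor, so min never exceeds max.
    echo "$rp0" > "$drm_dir/gt_max_freq_mhz" || return 1
    echo "$rp0" > "$drm_dir/gt_min_freq_mhz" || return 1
    echo "GPU min/max frequency pinned to ${rp0} MHz"
}
```

 After sourcing the function as root, calling force_gpu_max without
 arguments uses the default card0 path.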

 In the 1920x540 case, QSV becomes (clearly) slower than VA-API as soon as
 something is done to the decoded data.  If I do a VPP upscale from the
 decoded 1920x540 to 1920x1080, QSV performance is, unlike with plain
 1920x540 decoding, also impacted by kernel power management (perf is much
 lower when the GPU isn't forced to max).
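
 The exact upscale command lines are not listed here, so the following is
 only a sketch of what such pipelines could look like, using the scale_qsv
 and scale_vaapi filters.  The helper name upscale_cmd is hypothetical, and
 depending on the FFmpeg version the QSV variant may additionally need
 "-hwaccel_output_format qsv" to keep frames on the GPU.

```shell
# Sketch: print a decode + upscale-to-1080p command for one backend.
# Assumptions: the filter choice (scale_qsv / scale_vaapi) and the default
# render node are illustrative, not the commands used for the measurements.
upscale_cmd() {
    backend="$1"
    input="$2"
    dev="${3:-/dev/dri/renderD128}"
    case "$backend" in
        qsv)
            echo "ffmpeg -hwaccel qsv -qsv_device $dev -c:v hevc_qsv -i $input -vf scale_qsv=w=1920:h=1080 -f null -"
            ;;
        vaapi)
            # -hwaccel_output_format vaapi keeps decoded frames as VA surfaces
            # so that scale_vaapi can consume them without a download.
            echo "ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -vaapi_device $dev -i $input -vf scale_vaapi=w=1920:h=1080 -f null -"
            ;;
        *)
            echo "unknown backend: $backend" >&2
            return 1
            ;;
    esac
}
```

 For example, "upscale_cmd qsv 1920x540_60_yuv420p_4800.h265" prints the QSV
 variant, which can then be inspected or run by hand.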

 To summarize findings so far:
 * Resolution determines whether (HEVC) decoding is slower with the QSV or
 the VA-API backend
 * At larger resolutions, VPP operations are slower with the QSV backend
 than with VA-API
 * Depending on what the pipeline does and at what resolutions, the QSV
 backend can fool the kernel into lowering the GPU speed so that it's
 '''much''' slower than VA-API (when only a single pipeline is running at
 a time). IMHO this is the larger of the issues, but it could be related
 to the VPP issue
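
 To check whether the kernel really lowers the GPU speed during a run, the
 frequency can be sampled from sysfs while the pipeline is active.  A
 minimal sketch, assuming the i915 gt_act_freq_mhz file under card0 (the
 card index may differ); sample_gpu_freq is a hypothetical helper name.

```shell
# Sketch: print the GPU frequency (MHz) COUNT times, INTERVAL seconds apart.
# Assumption: i915 exposes gt_act_freq_mhz under /sys/class/drm/card0.
sample_gpu_freq() {
    count="${1:-10}"
    interval="${2:-1}"
    drm_dir="${3:-/sys/class/drm/card0}"
    i=0
    while [ "$i" -lt "$count" ]; do
        # One sample per line; abort if the file cannot be read.
        cat "$drm_dir/gt_act_freq_mhz" || return 1
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

 Running "sample_gpu_freq 30 1" in a second terminal during the QSV and the
 VA-API runs makes it possible to compare the frequencies the two backends
 end up with.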

 > I suppose this an issue of qsv hwdownloading. FFmpeg-vaapi can get an
 image directly via vaDeriveImage()
 (https://github.com/FFmpeg/FFmpeg/blob/master/libavutil/hwcontext_vaapi.c#L788).
 But possibly MSDK is via vaGetImage()
 (https://github.com/FFmpeg/FFmpeg/blob/master/libavutil/hwcontext_vaapi.c#L815)
 thus making a copy cause performance drop
 ...
 > I will take a deeper look and go back to this issue.

 Any conclusions?  Something like that might explain the operations being
 synchronous enough that kernel power management doesn't consider the
 use-case to be GPU bound (enough).

--
Ticket URL: <https://trac.ffmpeg.org/ticket/7690#comment:12>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker

