[FFmpeg-trac] #7690(undetermined:new): FFmpeg QSV decode + VPP performance is just a fraction of what one gets with VA-API and MediaSDK
FFmpeg
trac at avcodec.org
Mon Jan 21 15:18:36 EET 2019
#7690: FFmpeg QSV decode + VPP performance is just a fraction of what one gets
with VA-API and MediaSDK
-------------------------------------+-------------------------------------
Reporter: eero-t | Type: defect
Status: new | Priority: normal
Component: undetermined | Version: git-master
Keywords: | Blocked By:
Blocking: | Reproduced by developer: 0
Analyzed by developer: 0 |
-------------------------------------+-------------------------------------
Summary of the bug: 10-bit HEVC decode + downscale + download (hwdownload)
with the QSV backend reaches only 20-30% of the performance of the VA-API
backend, or of the Intel MediaSDK sample application.
Setup:
* Distro: Ubuntu 18.04
* FFmpeg: latest compiled from Git
* MediaSDK & its deps: latest compiled from Git
* HW: tested on KBL GT2, KBL GT3e and CFL GT2
Steps to reproduce:
1. Encode a 1080p 10-bit HEVC test input video with libx265:
{{{
$ ffmpeg -i 4k_uhd_hevc_10bit_60fps.mkv -frames 4800 -s 1920x1080 -pix_fmt
yuv420p10le -x265-params level=5.2 1920x1080_10bit_60fps.h265
}}}
2. export LIBVA_DRIVER_NAME=iHD
3. Decode + VPP with QSV for that video:
{{{
$ ffmpeg -hwaccel qsv -qsv_device /dev/dri/renderD128 -c:v hevc_qsv -i
1920x1080_10bit_60fps.h265 -vf
scale_qsv=w=300:h=300,hwdownload,format=p010 -f null -
}}}
4. Decode + VPP with VA-API:
{{{
$ ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128
-hwaccel_output_format vaapi -i 1920x1080_10bit_60fps.h265 -vf
scale_vaapi=w=300:h=300,hwdownload,format=p010 -f null -
}}}
5. Decode + VPP with MediaSDK sample app:
{{{
$ sample_decode -hw h265 -w 300 -h 300 -p010 -i 1920x1080_10bit_60fps.h265
-o /dev/null
}}}
Expected output:
* Similar performance for all 3 cases
Actual outcome:
* MediaSDK and VA-API performance is within a few percent of each other
* QSV performance is only a fraction of MediaSDK and VA-API performance,
at best ~30%
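The percentages above can be cross-checked from the "speed=" figure that
ffmpeg prints in its progress line. A minimal sketch, assuming the stderr
of each run was saved to a log file; the helper names last_speed and
percent_of and the log file names are hypothetical, not part of FFmpeg:

```shell
# Print the last "speed=" multiplier found in a saved ffmpeg stderr log.
# Progress updates are separated by carriage returns, so split on them first.
last_speed() {
    tr '\r' '\n' < "$1" \
        | grep -o 'speed= *[0-9.]*x' \
        | tail -n 1 \
        | sed 's/^speed= *//; s/x$//'
}

# Print value $1 as a rounded percentage of reference value $2.
percent_of() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", 100 * a / b }'
}
```

For example, append "2> qsv.log" to the QSV command and "2> vaapi.log" to
the VA-API command, then run
percent_of "$(last_speed qsv.log)" "$(last_speed vaapi.log)".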
GPU information during the runs:
* The GPU runs at minimum frequency with QSV, but at maximum frequency
with the other two
* Despite this, the video engine is only half utilized with QSV, but
fully utilized with the other two
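The frequency observation above can be reproduced by sampling the i915
sysfs frequency files while a pipeline is running. This is only a sketch:
the ticket does not say which tool was used, the card0 path is an
assumption (the card index can differ per system), and gpu_freq_mhz is a
hypothetical helper name:

```shell
# Print the minimum, current and maximum GPU frequency (MHz) exposed by
# the i915 driver. $1 is the DRM card directory (assumed default: card0).
gpu_freq_mhz() {
    card_dir="${1:-/sys/class/drm/card0}"
    for f in gt_min_freq_mhz gt_cur_freq_mhz gt_max_freq_mhz; do
        # Skip files that do not exist or are not readable on this system.
        if [ -r "$card_dir/$f" ]; then
            printf '%s=%s\n' "$f" "$(cat "$card_dir/$f")"
        fi
    done
}
```

Calling gpu_freq_mhz once per second during the QSV and VA-API runs
should show whether gt_cur_freq_mhz stays near the minimum in one case
and near the maximum in the other.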
As QSV uses less CPU than the other two cases (CPU utilization percentage
is nearly the same, but according to RAPL, CPU core power usage is much
smaller with QSV), the issue could be some extra synchronization between
CPU and GPU in the FFmpeg QSV backend.
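The RAPL comparison above can be approximated from the Linux powercap
sysfs energy counters. A minimal sketch, assuming the core domain is
exposed as intel-rapl:0:0 (check its "name" file; the domain layout varies
between systems, and reading energy_uj may require root). The helper name
avg_power_watts is hypothetical, and counter wrap-around is not handled:

```shell
# Average power in watts from two energy_uj readings (microjoules) taken
# $3 seconds apart: (e1 - e0) / 1e6 / dt.
avg_power_watts() {
    awk -v e0="$1" -v e1="$2" -v dt="$3" \
        'BEGIN { printf "%.2f\n", (e1 - e0) / 1e6 / dt }'
}

# Example use while a pipeline is running (domain path is an assumption):
#   dom=/sys/class/powercap/intel-rapl:0:0
#   e0=$(cat "$dom/energy_uj"); sleep 10; e1=$(cat "$dom/energy_uj")
#   avg_power_watts "$e0" "$e1" 10
```

Comparing the result for the QSV run against the VA-API run gives the
core power difference independently of the CPU utilization percentage.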
QSV outputs the following errors at the beginning:
{{{
Stream mapping:
Stream #0:0 -> #0:0 (hevc (hevc_qsv) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
[NULL @ 0x556f79e64840] missing picture in access unit with size 1
Last message repeated 1 times
[hevc_qsv @ 0x556f79e75dc0] A decode call did not consume any data: expect
more data at input (-10)
[NULL @ 0x556f79e64840] missing picture in access unit with size 1
[hevc_qsv @ 0x556f79e75dc0] A decode call did not consume any data: expect
more data at input (-10)
[NULL @ 0x556f79e64840] missing picture in access unit with size 1
[hevc_qsv @ 0x556f79e75dc0] A decode call did not consume any data: expect
more data at input (-10)
[NULL @ 0x556f79e64840] missing picture in access unit with size 1
Last message repeated 2 times
Output #0, null, to 'pipe:':
}}}
And a large number of the following warnings during the rest of the run:
{{{
[NULL @ 0x556f79e64840] missing picture in access unit with size
1peed=7.51x
Last message repeated 224 times
[NULL @ 0x556f79e64840] missing picture in access unit with size
1peed=7.49x
Last message repeated 262 times
}}}
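To put a number on the warnings, the occurrences can be totalled from a
saved stderr log, including the "Last message repeated N times" lines. A
rough sketch: count_missing_picture is a hypothetical helper name, and it
assumes a repeat line refers to the most recent bracketed log message:

```shell
# Count "missing picture in access unit" messages in a saved ffmpeg log,
# adding the repeat counts that directly follow such a message.
count_missing_picture() {
    tr '\r' '\n' < "$1" | awk '
        /missing picture in access unit/ { total++; last = 1; next }
        /Last message repeated [0-9]+ times/ {
            if (last) {
                # The number follows the word "repeated".
                for (i = 1; i < NF; i++)
                    if ($i == "repeated") total += $(i + 1)
            }
            next
        }
        # Any other bracketed log message ends the current run of repeats.
        /^\[/ { last = 0 }
        END { print total + 0 }
    '
}
```

Dividing the total by the number of decoded frames would show whether the
warning is emitted once per frame or more often.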
--
Ticket URL: <https://trac.ffmpeg.org/ticket/7690>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker