[FFmpeg-trac] #7797(undetermined:new): AVC->MPEG-2 transcoding with VA-API 2-3x slower than with QSV

FFmpeg trac at avcodec.org
Thu Mar 14 18:41:25 EET 2019


#7797: AVC->MPEG-2 transcoding with VA-API 2-3x slower than with QSV
-------------------------------------+-------------------------------------
             Reporter:  eero-t       |                     Type:  defect
               Status:  new          |                 Priority:  normal
            Component:  undetermined |                  Version:  git-master
             Keywords:               |               Blocked By:
             Blocking:               |  Reproduced by developer:  0
Analyzed by developer:  0            |
-------------------------------------+-------------------------------------
 Setup:
 - Ubuntu 18.04 with drm-tip 5.x kernel from Git
 - iHD media driver, MediaSDK and FFmpeg built from Git

 Summary of the bug:
 - VA-API transcoding performance is poor: AVC -> MPEG-2 transcoding with
 QSV is 2-3x faster than with VA-API.

 How to reproduce:
 {{{
 $ export LIBVA_DRIVER_NAME=iHD
 $ ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 \
     -hwaccel_output_format vaapi -i 720x480p_30.00_4mb_h264_cabac_180s.264 \
     -c:v mpeg2_vaapi -b:v 2000K -compression_level 4 -y output.mpg
 ffmpeg version N-93330-g7ff89574c7 Copyright (c) 2000-2019 the FFmpeg
 developers
   built with gcc 7 (Ubuntu 7.3.0-27ubuntu1~18.04)
 ...
 Input #0, h264, from 'input/720x480p_30.00_4mb_h264_cabac_180s.264':
   Duration: N/A, bitrate: N/A
     Stream #0:0: Video: h264 (High), 1 reference frame, yuv420p(tv,
 smpte170m, progressive, left), 720x480 [SAR 10:11 DAR 15:11], 30 fps, 30
 tbr, 1200k tbn, 60 tbc
 Stream mapping:
   Stream #0:0 -> #0:0 (h264 (native) -> mpeg2video (mpeg2_vaapi))
 Press [q] to stop, [?] for help
 [h264 @ 0x55a62883c380] Reinit context to 720x480, pix_fmt: vaapi_vld
 [graph 0 input from stream 0:0 @ 0x55a628872f40] w:720 h:480
 pixfmt:vaapi_vld tb:1/1200000 fr:30/1 sar:10/11 sws_param:flags=2
 [mpeg2_vaapi @ 0x55a62883eb80] Input surface format is nv12.
 [mpeg2_vaapi @ 0x55a62883eb80] Using VAAPI profile VAProfileMPEG2Main (1).
 [mpeg2_vaapi @ 0x55a62883eb80] Using VAAPI entrypoint VAEntrypointEncSlice
 (6).
 [mpeg2_vaapi @ 0x55a62883eb80] Using VAAPI render target format YUV420
 (0x1).
 [mpeg2_vaapi @ 0x55a62883eb80] RC mode: VBR.
 [mpeg2_vaapi @ 0x55a62883eb80] RC target: 50% of 4000000 bps over 500 ms.
 [mpeg2_vaapi @ 0x55a62883eb80] RC buffer: 2000000 bits, initial fullness
 1500000 bits.
 [mpeg2_vaapi @ 0x55a62883eb80] RC framerate: 30/1 (30.00 fps).
 [mpeg2_vaapi @ 0x55a62883eb80] Using intra, P- and B-frames (supported
 references: 1 / 1).
 [mpeg2_vaapi @ 0x55a62883eb80] Driver does not support some wanted packed
 headers (wanted 0x3, found 0x10).
 [mpeg2_vaapi @ 0x55a62883eb80] Sample aspect ratio 10:11 is not
 representable, signalling square pixels instead.
 [mpeg @ 0x55a62883a580] VBV buffer size not set, using default size of
 230KB
 If you want the mpeg file to be compliant to some specification
 Like DVD, VCD or others, make sure you set the correct buffer size
 Output #0, mpeg, to 'output/0039_SD03MP2_1.0.mpg':
   Metadata:
     encoder         : Lavf58.26.101
     Stream #0:0: Video: mpeg2video (mpeg2_vaapi) (Main), vaapi_vld,
 720x480 [SAR 10:11 DAR 15:11], q=-1--1, 2000 kb/s, 30 fps, 90k tbn, 30 tbc
     Metadata:
       encoder         : Lavc58.47.103 mpeg2_vaapi
 ...
 }}}

 And QSV:
 {{{
 $ ffmpeg -hwaccel qsv -qsv_device /dev/dri/renderD128 -c:v h264_qsv \
     -i 720x480p_30.00_4mb_h264_cabac_180s.264 \
     -c:v mpeg2_qsv -b:v 2000K -compression_level 4 -y output.mpg
 ...
 [AVHWDeviceContext @ 0x565392bd4280] Initialize MFX session: API version
 is 1.28, implementation version is 1.28
 [AVHWDeviceContext @ 0x565392bd4280] MFX compile/runtime API: 1.28/1.28
 [AVHWDeviceContext @ 0x565392bf2f00] VAAPI driver: Intel iHD driver -
 1.0.0.
 [AVHWDeviceContext @ 0x565392bf2f00] Driver not found in known nonstandard
 list, using standard behaviour.
 [graph 0 input from stream 0:0 @ 0x565392d785c0] w:720 h:480 pixfmt:qsv
 tb:1/1200000 fr:30/1 sar:10/11 sws_param:flags=2
 [mpeg2_qsv @ 0x565392bd1f40] Using the variable bitrate (VBR) ratecontrol
 method
 [AVHWDeviceContext @ 0x565392cfc340] VAAPI driver: Intel iHD driver -
 1.0.0.
 [AVHWDeviceContext @ 0x565392cfc340] Driver not found in known nonstandard
 list, using standard behaviour.
 [mpeg2_qsv @ 0x565392bd1f40] profile: main; level: 8
 [mpeg2_qsv @ 0x565392bd1f40] GopPicSize: 250; GopRefDist: 4; GopOptFlag:
 closed ; IdrInterval: 0
 [mpeg2_qsv @ 0x565392bd1f40] TargetUsage: 4; RateControlMethod: VBR
 [mpeg2_qsv @ 0x565392bd1f40] BufferSizeInKB: 500; InitialDelayInKB: 500;
 TargetKbps: 2000; MaxKbps: 2000; BRCParamMultiplier: 1
 [mpeg2_qsv @ 0x565392bd1f40] NumSlice: 30; NumRefFrame: 0
 [mpeg2_qsv @ 0x565392bd1f40] RateDistortionOpt: unknown
 [mpeg2_qsv @ 0x565392bd1f40] RecoveryPointSEI: unknown IntRefType: 0;
 IntRefCycleSize: 0; IntRefQPDelta: 0
 [mpeg2_qsv @ 0x565392bd1f40] MaxFrameSize: 0; MaxSliceSize: 0;
 [mpeg2_qsv @ 0x565392bd1f40] BitrateLimit: unknown; MBBRC: unknown;
 ExtBRC: unknown
 [mpeg2_qsv @ 0x565392bd1f40] Trellis: auto
 [mpeg2_qsv @ 0x565392bd1f40] VDENC: OFF
 [mpeg2_qsv @ 0x565392bd1f40] RepeatPPS: unknown; NumMbPerSlice: 0;
 LookAheadDS: unknown
 [mpeg2_qsv @ 0x565392bd1f40] AdaptiveI: unknown; AdaptiveB: unknown;
 BRefType: auto
 [mpeg2_qsv @ 0x565392bd1f40] MinQPI: 0; MaxQPI: 0; MinQPP: 0; MaxQPP: 0;
 MinQPB: 0; MaxQPB: 0
 [mpeg2_qsv @ 0x565392bd1f40] FrameRateExtD: 1; FrameRateExtN: 30
 [mpeg @ 0x565392bd1500] VBV buffer size not set, using default size of
 230KB
 If you want the mpeg file to be compliant to some specification
 Like DVD, VCD or others, make sure you set the correct buffer size
 Output #0, mpeg, to 'output/0039_SD03MP2_1.0.mpg':
     Metadata:
       encoder         : Lavc58.47.103 mpeg2_qsv
     Side data:
       cpb: bitrate max/min/avg: 0/0/2000000 buffer size: 0 vbv_delay: -1
 ...
 }}}
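
 For anyone wanting to put a number on the gap, here is a minimal sketch of
 how the two runs could be timed and compared.  The helper names `time_run`
 and `speedup` are made up for illustration and are not part of FFmpeg;
 `date +%s` only gives whole-second resolution, which should be enough for a
 180 s clip but not for short ones.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the two transcodes; not part of FFmpeg.

# time_run CMD...: run CMD and print the elapsed wall-clock seconds.
# The command's own stdout/stderr is discarded so only the number is printed.
time_run() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# speedup SLOW FAST: print SLOW/FAST with two decimals ("n/a" if FAST is 0).
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN {
        if (fast == 0) { print "n/a"; exit }
        printf "%.2f\n", slow / fast
    }'
}

# Example, using the commands from this ticket (needs the same hardware
# and input file):
#   vaapi=$(time_run ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 \
#       -hwaccel_output_format vaapi -i 720x480p_30.00_4mb_h264_cabac_180s.264 \
#       -c:v mpeg2_vaapi -b:v 2000K -compression_level 4 -y output.mpg)
#   qsv=$(time_run ffmpeg -hwaccel qsv -qsv_device /dev/dri/renderD128 \
#       -c:v h264_qsv -i 720x480p_30.00_4mb_h264_cabac_180s.264 \
#       -c:v mpeg2_qsv -b:v 2000K -compression_level 4 -y output.mpg)
#   echo "QSV is $(speedup "$vaapi" "$qsv")x faster than VA-API"
```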

 GPU is running at full speed in both cases, so this isn't related to
 ticket #7690.  It could be related to regression #7706, but I can't test
 it because ticket #7650 ("invalid RC mode") was fixed only after that
 regression.

 Looking at CPU utilization and power usage, QSV uses more CPU and also
 shows more iowait; correspondingly, it draws more CPU and GPU power than
 VA-API.  Maybe VA-API isn't running asynchronously enough?
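
 The CPU and iowait shares can be compared with something like the sketch
 below, which diffs two snapshots of the aggregate `cpu` line in /proc/stat.
 This is only an illustration, not the tooling used for the numbers above:
 `cpu_snapshot` and `cpu_shares` are hypothetical names, and the counters are
 system-wide, so the machine should be otherwise idle.

```shell
#!/bin/sh
# Hypothetical helpers; Linux only (reads /proc/stat).

# cpu_snapshot: print the aggregate "cpu" line from /proc/stat.
cpu_snapshot() {
    grep '^cpu ' /proc/stat
}

# cpu_shares BEFORE AFTER: given two aggregate "cpu" lines, print the share
# of elapsed jiffies that was busy (not idle, not iowait) and in iowait.
# Fields: cpu user nice system idle iowait irq softirq steal ...
cpu_shares() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= 9; i++) a[i] = $i }
        NR == 2 {
            for (i = 2; i <= 9; i++) { d[i] = $i - a[i]; total += d[i] }
            if (total == 0) { print "busy=0.0% iowait=0.0%"; exit }
            busy = total - d[5] - d[6]
            printf "busy=%.1f%% iowait=%.1f%%\n",
                   100 * busy / total, 100 * d[6] / total
        }'
}

# Example: wrap one transcode run between two snapshots.
#   before=$(cpu_snapshot)
#   ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 \
#       -hwaccel_output_format vaapi -i 720x480p_30.00_4mb_h264_cabac_180s.264 \
#       -c:v mpeg2_vaapi -b:v 2000K -compression_level 4 -y output.mpg
#   after=$(cpu_snapshot)
#   cpu_shares "$before" "$after"
```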

 There are also single-stream (AVC) transcode cases where VA-API is slower,
 but there the gap is much smaller, and when multiple processes are run in
 parallel, VA-API is actually slightly faster.  In this AVC -> MPEG-2 case,
 VA-API is slower also with multiple parallel transcode processes.
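
 A rough sketch of how the multi-process case can be driven is below.
 `run_parallel` and the `INSTANCE` environment variable are hypothetical
 names chosen for this example; each copy needs its own output file, which is
 what `INSTANCE` is for.

```shell
#!/bin/sh
# Hypothetical helper; not part of FFmpeg.

# run_parallel N CMD...: start N copies of CMD in the background, each with
# INSTANCE set to its index (0..N-1), wait for all of them, and print how
# many exited with a non-zero status.
run_parallel() {
    n=$1
    shift
    pids=""
    i=0
    while [ "$i" -lt "$n" ]; do
        INSTANCE=$i "$@" >/dev/null 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed"
}

# Example: four VA-API transcodes at once, each writing its own file.
#   run_parallel 4 sh -c 'ffmpeg -hwaccel vaapi \
#       -vaapi_device /dev/dri/renderD128 -hwaccel_output_format vaapi \
#       -i 720x480p_30.00_4mb_h264_cabac_180s.264 -c:v mpeg2_vaapi \
#       -b:v 2000K -compression_level 4 -y "output_$INSTANCE.mpg"'
```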

 I'm seeing a similar perf gap on all the Core devices [1] currently
 supported by iHD: BDW, SKL, KBL & CFL, on both GT2 & GT3e devices, i.e.
 the issue isn't platform specific.

 [1] This test-case doesn't work on the only GEN9+ non-Core device I have
 (BXT/APL).

 Extra info:
 * With a larger 1280x720p_29.97_10mb_h264_cabac input, the performance gap
 was still about the same, >2x.
 * With an even larger 1920x1080i_29.97_20mb_mpeg2_high input, the gap
 decreased to ~25%, but performance with both APIs had also dropped to a
 fraction of what it was with the smaller inputs.

--
Ticket URL: <https://trac.ffmpeg.org/ticket/7797>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker

