[FFmpeg-devel] [PATCH v2 3/6] lavc/qsvdec: Replace current parser with MFXVideoDECODE_DecodeHeader()

Li, Zhong zhong.li at intel.com
Thu Feb 21 07:43:12 EET 2019

> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On Behalf
> Of Mark Thompson
> Sent: Thursday, February 21, 2019 5:32 AM
> To: ffmpeg-devel at ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH v2 3/6] lavc/qsvdec: Replace current
> parser with MFXVideoDECODE_DecodeHeader()
> On 20/02/2019 02:58, Zhong Li wrote:
> > Using MSDK parser can improve qsv decoder pass rate in some cases (E.g:
> > sps declares a wrong level_idc, smaller than it should be).
> > And it is necessary for adding new qsv decoders such as MJPEG and VP9
> > since current parser can't provide enough information.
> Can you explain the problem with level_idc?  Why would the libmfx parser
> determine a different answer?

The detailed discussion is here: https://github.com/Intel-Media-SDK/MediaSDK/issues/582
"Some clips declare a wrong level_idc, smaller than it should be". For example, a clip declares level_idc = 1.0, but other SPS/PPS parameters such as the resolution or the number of reference frames are out of range for that level.
I believe this is a very common issue: many clips don't declare a correct level, and they should still be decoded, with the decoder detecting the error.
Currently the internal parser just reads whatever level_idc is declared, with no error handling, and that causes MSDK decoding to fail.
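
To make the mismatch concrete, here is a minimal sketch of the kind of consistency check I mean, assuming an H.264 stream. It is not what libmfx or the FFmpeg parser actually does. All names are hypothetical, and the table is only a small subset of the MaxFS / MaxDpbMbs limits from Table A-1 of the H.264 spec (please verify the values there). Because the table is a subset, the result is coarse, and bitrate/MBPS limits are ignored.

```c
#include <stddef.h>

/* Hypothetical helper type; limits are in macroblocks. */
typedef struct SketchLevelLimit {
    int level_idc;    /* 10 means level 1.0, 40 means level 4.0, ... */
    int max_fs;       /* max frame size in macroblocks */
    int max_dpb_mbs;  /* max decoded picture buffer size in macroblocks */
} SketchLevelLimit;

/* Abbreviated subset of H.264 Table A-1 (assumption: check against the spec). */
static const SketchLevelLimit sketch_levels[] = {
    { 10,    99,    396 },
    { 30,  1620,   8100 },
    { 31,  3600,  18000 },
    { 40,  8192,  32768 },
    { 51, 36864, 184320 },
};

/* Smallest level in the table that can hold the stream, or -1 if none can. */
static int sketch_min_level(int width, int height, int num_ref_frames)
{
    int mbs = ((width + 15) / 16) * ((height + 15) / 16);
    size_t n = sizeof(sketch_levels) / sizeof(sketch_levels[0]);

    for (size_t i = 0; i < n; i++) {
        if (mbs <= sketch_levels[i].max_fs &&
            mbs * num_ref_frames <= sketch_levels[i].max_dpb_mbs)
            return sketch_levels[i].level_idc;
    }
    return -1;
}

/* Nonzero if the declared level_idc is large enough for the other parameters. */
static int sketch_level_is_plausible(int declared_level_idc, int width,
                                     int height, int num_ref_frames)
{
    int min_level = sketch_min_level(width, height, num_ref_frames);
    return min_level >= 0 && declared_level_idc >= min_level;
}
```

With this sketch, a 1920x1080 clip with 4 reference frames that declares level_idc 10 is flagged as implausible, since the smallest level in the table that fits it is 40.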

> Given that you need the current parser anyway (see previous mail), it would
> likely be more useful to extend it to supply any information which is missing.
> > Actually using MFXVideoDECODE_DecodeHeader() was discussed at
> > https://ffmpeg.org/pipermail/ffmpeg-devel/2015-July/175734.html and
> > merged as commit 1acb19d, but was overwritten when merged libav
> > patches (commit: 1f26a23) without any explanation.
> I'm not sure where the explanation for this went; maybe it was only
> discussed on IRC.
> The reason for using the internal parsers is that you need the information
> before libmfx is initialized at all in the hw_frames_ctx case (i.e. before the
> get_format callback which will supply the hardware context information),
> and once you require that anyway there isn't much point in parsing things
> twice for the same information.

As I see it, very little information is needed before initializing a libmfx session (we must, and only need to, initialize a libmfx session before calling MFXVideoDECODE_DecodeHeader()): the resolution and the pix_fmt.
As you can see from my current implementation, I don't call the internal parser before initializing the session. We assume a resolution (it may be provided by libavformat, but that is not guaranteed, as Hendrik commented) and a pix_fmt (we assume NV12), and correct them after MFXVideoDECODE_DecodeHeader().
This probably means we need to initialize the session twice for the first decoding call (e.g. for HEVC/VP9 10-bit clips, or when the resolution is not provided by something such as libavformat), but that only happens on the first call (if the header information does not change afterwards) and only when the assumed resolution/pix_fmt is incorrect.
I think this is more efficient than parsing twice on every decoding call, and it is not a workaround, since we still need to handle resolution/pix_fmt changes anyway.
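
The flow I am describing can be sketched roughly as below. This is only an illustration of the control flow, not the code in the patch: every type and function name here is a hypothetical stand-in (nothing is a real FFmpeg or libmfx API), and the `header` argument simply represents whatever MFXVideoDECODE_DecodeHeader() would report for the packet.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are FFmpeg or libmfx types. */
enum SketchPixFmt { SKETCH_FMT_NV12, SKETCH_FMT_P010 };

typedef struct SketchParams {
    int width, height;
    enum SketchPixFmt pix_fmt;
} SketchParams;

typedef struct SketchDecoder {
    bool         initialized;
    SketchParams params;      /* parameters the session was initialized with */
    int          init_count;  /* how many times the session was (re)initialized */
} SketchDecoder;

static bool sketch_params_equal(const SketchParams *a, const SketchParams *b)
{
    return a->width == b->width && a->height == b->height &&
           a->pix_fmt == b->pix_fmt;
}

/* Stand-in for creating (or re-creating) the decode session. */
static void sketch_init_session(SketchDecoder *d, const SketchParams *p)
{
    d->params      = *p;
    d->initialized = true;
    d->init_count++;
}

/*
 * guess:  what we assume before any header was parsed (e.g. NV12 plus
 *         whatever resolution the caller happens to know).
 * header: what the header parser reported for this packet.
 */
static void sketch_on_packet(SketchDecoder *d, const SketchParams *guess,
                             const SketchParams *header)
{
    /* First call: a session is needed before the header can be parsed. */
    if (!d->initialized)
        sketch_init_session(d, guess);

    /* Re-init only if the guess was wrong or the stream changed later. */
    if (!sketch_params_equal(&d->params, header))
        sketch_init_session(d, header);
}
```

In this sketch a 10-bit clip guessed as NV12 costs two initializations on the first packet and none afterwards, while a correctly guessed clip costs one. A mid-stream resolution or pix_fmt change goes through the same comparison, so no separate path is needed for it.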

> It's probably fine to parse it twice if you want, but the two cases really
> should be returning the same information.
