[FFmpeg-devel] [PATCH 5/5] avformat/matroskaenc: update AV1 support

James Almer jamrial at gmail.com
Mon Aug 6 18:04:09 EEST 2018

On 8/6/2018 3:39 AM, Steve Lhomme wrote:
> On 31/07/2018 17:19, James Almer wrote:
>> On 7/31/2018 3:37 AM, Steve Lhomme wrote:
>>> On 30/07/2018 17:13, James Almer wrote:
>>>> On 7/30/2018 2:03 AM, Steve Lhomme wrote:
>>>>> On 26/07/2018 03:11, James Almer wrote:
>>>>>> Make sure to not write forbidden OBUs to CodecPrivate, and do the
>>>>>> same
>>>>>> with
>>>>>> unnecessary OBUs for packets.
>>>>> Does this include reordering the OBUs ? The first one must be the
>>>>> Sequence Header OBU. Also it must be the only one of that kind.
>>>> No, it doesn't. I can make that change, but why does it matter if it's
>>>> the first? Why can't a metadata OBU be before a Seq Header? isobmff
>>> Because if the spec say it's the first and its enforced, the OBU parser
>>> needed at the container level may be simpler. The muxing side can have
>>> more complexity if needed.
>> Nothing in the spec says it's the first, afaics, just that in a temporal
>> unit if present it must prepend a frame. I may be misreading it, though.
> As we're talking about the Matroska muxer the current document says:
> The first OBU MUST be the first |Sequence Header OBU| and be the only
> OBU of type |OBU_SEQUENCE_HEADER| in the |CodecPrivate|

Right, i misunderstood and thought you were talking about the av1 spec,
not the Matroska mapping.

>> In any case, my suggestion in the isobmff issue includes ordering of
>> OBUs in av1C.
> Good. Otherwise we can create an ISOBMFF file with 1000000 metadata OBUs
> and then the Sequence Header OBU and it's still valid.
>>>> doesn't care what's first, only what's in.
>>> I requested that the same constraint be applied in ISOBMFF
>>> https://github.com/AOMediaCodec/av1-isobmff/pull/47#issuecomment-408598719
>>>> A parser is currently forced to know how to read OBUs anyway, and it
>>>> could just skip any metadata obu before the sequence header.
>>>> For that matter, i do agree with
>>>> https://github.com/AOMediaCodec/av1-isobmff/issues/7#issuecomment-406257234
>>>> and
>>>> https://github.com/AOMediaCodec/av1-isobmff/issues/46#issuecomment-406516226
>>>> that perhaps av1C should, much like hvcC, avcC and in it's own way also
>>>> vpcC, contain a bunch of header bytes listing some basic information
>>>> (and therefore the entire thing, sans atom size and fourcc, be included
>>>> in CodecPrivate verbatim). Things like profile, level, and maybe number
>>>> of OBUs, before the raw OBUs.
>>>> As an extra benefit than simply removing the requirement for
>>>> demuxers to
>>>> parse OBUs that may even only be in sample data just to see if the file
>>>> can be decoded (firing up a hardware decoder can be expensive), it
>>>> would
>>>> also give us a way to identify the source of the bitstream (mp4/mkv vs
>>>> ivf/raw/annexb/etc), for example for the purpose of inserting the
>>>> extradata Seq Header if not present on key frames (Matroska allows to
>>>> remove it, but isobmff in section 2.5 seems to want a sync sample to be
>>>> a fully compliant RAP without the need of extradata obus, despite the
>>>> description in section 2.4.4), achieved by writing and using an
>>>> av1_mp4tolobf bsf or similar, which would then be included where needed
>>>> instead of the naive dump_extradata one from patch 2/5 of this set, for
>>>> example.
>>> I totally support this. In the end storing OBUs in extra data is
>>> becoming a PITA, especially if it needs to be prepended to some frames
>>> but not all. If we stick to a simpler structure with what's needed to
>>> identify decoder properties (mostly the profile) it would make things a
>>> lot easier.
>> The process of assembling a bitstream for the purpose of decoding or
>> storing is up to the implementation. The requirement for the global Seq
>> Headers to be prepended to key frames for raw bitstream purposes is
>> essentially the same as sps/pps/vps from h264 and h265.
> I did not know that.
>> ffmpeg's decoders keep global headers in extradata to use them as they
>> see fit (native decoders), or request the aid of bitstream filters like
>> h264_mp4toannexb to assemble the bitstream using said extradata if
>> required (hardware decoders, some external decoders). Similarly, muxers
>> for non global header containers (ivf, raw, etc) do the same on their
>> own, or with the aid of a bitstream filter.
> It seems odd that it's done at the decoder level and not the container
> level, since it's a feature of a particular mapping. Or is it done
> through a flag that tells that the extradata need to be prepended for
> keyframes ?

are used to signal that.
Demuxers propagate packets as they are stored, so for h264 raw
bitstreams the parameter sets will in any sane stream be part of them.
For containers like mp4/mkv, they will most likely not be there, but
will be propagated in extradata as extracted from avcC/CodecPrivate.
The internal decoder will always look at extradata for these, but other
decoders, like libopenh264 and pretty much all hardware implementations,
only support raw bitstreams, so with those we use the h264_mp4toannexb
bitstream filter which will append parameter sets to keyframes.

Similarly, muxers for containers that require raw bitstream will use the
same bitstream filter as part of the muxing process. Muxers for
containers that expect global headers will set the aforementioned flag
to signal the encoder to propagate parameter sets in extradata and not
bother with them in packets.

More information about the ffmpeg-devel mailing list