[FFmpeg-devel] [PATCH] read header in FLAC demuxer

Justin Ruggles justin.ruggles
Tue Feb 3 03:00:05 CET 2009


Michael Niedermayer wrote:
> On Mon, Feb 02, 2009 at 05:21:45PM -0500, Justin Ruggles wrote:
>> Justin Ruggles wrote:
>>> Justin Ruggles wrote:
>>>> Michael Niedermayer wrote:
>>>>> On Sat, Jan 31, 2009 at 10:47:53AM -0500, Justin Ruggles wrote:
>>>>>> M?ns Rullg?rd wrote:
>>>>>>> Justin Ruggles <justin.ruggles at gmail.com> writes:
>>>>>>>
>>>>>>>> M?ns Rullg?rd wrote:
>>>>>>>>> Michael Niedermayer <michaelni at gmx.at> writes:
>>>>>>>>>
>>>>>>>>>> On Fri, Jan 30, 2009 at 10:36:10PM -0500, Justin Ruggles wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Here is a patch to read the header in the raw FLAC demuxer.  Currently
>>>>>>>>>>> the header is passed through in the output AVPackets, which is not good
>>>>>>>>>>> and requires some ugliness on the decoder side.  After this patch is
>>>>>>>>>>> applied, the decoder can be changed so it doesn't have to handle the
>>>>>>>>>>> header except in the extradata.  The extradata handling in the decoder
>>>>>>>>>>> will also be simplified.
>>>>>>>>>> is it allowed to place the header in the stream instead of extradata in
>>>>>>>>>> flac in avi?
>>>>>>>>> If it is physically possible, someone will do it.
>>>>>>>> Should we really support everything that is physically possible?
>>>>>>> Of course not.
>>>>>>>
>>>>>>>> FLAC in avi and wav should not have the header in the stream.
>>>>>>> Flac in avi and wav simply shouldn't IMO.  In fact, avi shouldn't at
>>>>>>> all, and wav is only useful for PCM.
>>>>>>>
>>>>>>>>>> flac in mkv?
>>>>>>>>> No.
>>>>>>>>>
>>>>>>>>>> flac in * ?
>>>>>>>>> There's always Ogg...
>>>>>>>> Ogg doesn't use the raw FLAC header. It splits it up in its own funny
>>>>>>>> way, which we already support.
>>>>>>>>
>>>>>>>> Do we have any other format that is header+frames that we allow the
>>>>>>>> header to be in the stream?
>>>>>>> It's possible to do all sorts of funny things with MPEG-PS/TS.  Some
>>>>>>> variants of AAC encapsulation put (parts of) the header outside the
>>>>>>> main stream, some put it all inline.
>>>>>> But raw FLAC doesn't do any funny things... There is a header, which
>>>>>> contains metadata, information about the stream, seek table, etc... It
>>>>>> is always completely separate from the audio frames.  If we only handle
>>>>>> it in the parser or decoder, we lose the ability to use that information
>>>>>> in the demuxer, right?
>>>>> i dont see why handling the seektable or others in the raw demuxer would
>>>>> affect if its passed on or not to the parser, of course theres no point
>>>>> on passing the seektable to the parser but still one could if one wanted
>>>> That is true. The issue is code duplication. It's either in lavf where
>>>> we can better utilize it, lavc where we can't, or both. It would require
>>>> messy workarounds to get it to work in one place with both and maybe
>>>> even unavoidable code duplication.  Passing the header would also mean
>>>> that a FLAC parser would have to handle the header as well.
>>>>
>>>>> About the rest of the header i dont know how or where to handle it best
>>>>> but what you apparently propose is not adding any features its not making
>>>>> code simpler it just removes support for having the headers inline
>>>> This does add features. It reads the metadata and sets the stream
>>>> timebase and duration.
>>>>
>>>>> If you want to add support for something iam very much in favor for that
>>>>> and if you explain how that might conflict with inline headers i also
>>>>> dont mind loosing support for them assuming such files are practically
>>>>> non existent but 
>>>>> Just removing it (or even worse in this patch duplicating it first)
>>>>> with no real explanation why is something that feels odd ...
>>>>> also moving code should never be split in duplication and removial
>>>> Ok, I can update the patch to remove support in the decoder at the same
>>>> time.
>>> Here is an updated patch which also removes reading of the header in the
>>> decoder. The whole header can still be handled as extradata, but that is
>>> much simpler because it just extracts the STREAMINFO, which has a fixed
>>> size and position.
>>>
> 
>>> I tried to implement a solution which allowed for reading the header
>>> within the demuxer and inline in the decoder.  I got it working, but it
>>> was very messy (mainly due to ByteIOContext vs. GetBitContext) and I
> 
> for()
>     get_buffer()
>     readfrombuffer()
> 
> vs.
> for()
>     readfrombuffer()
> 
> isnt messy

The issue is that in the demuxer, I have to read each metadata block
header to know how much data to get.  So I'm essentially already
processing the header at that point.  The demuxer and decoder do
different things with the metadata, so the only function that can really
be shared between them is already shared, ff_flac_parse_streaminfo().  I
tried to share more code and just ended up with more code without really
getting rid of much duplication.

>>> could not avoid some code duplication.  I don't see an advantage to
>>> handling the header inline within the decoder, as the necessary
>>> information should be passed in the extradata.  It is, however, useful
>>> in the demuxer for reading the metadata and stream parameters.  If I
>>> decide to attempt seeking support, reading the seek table in the demuxer
>>> will be simple this way as well.
>> oops... old patch. here is a new one against current SVN.
>>
>> -Justin
>>
> 
>> Index: libavcodec/flacdec.c
>> ===================================================================
>> --- libavcodec/flacdec.c	(revision 16957)
>> +++ libavcodec/flacdec.c	(working copy)
>> @@ -96,7 +96,6 @@
>>  }
>>  
>>  static void allocate_buffers(FLACContext *s);
>> -static int metadata_parse(FLACContext *s);
>>  
>>  static av_cold int flac_decode_init(AVCodecContext *avctx)
>>  {
>> @@ -105,16 +104,22 @@
>>  
>>      avctx->sample_fmt = SAMPLE_FMT_S16;
>>  
>> -    if (avctx->extradata_size > 4) {
>> +    if (avctx->extradata_size >= FLAC_STREAMINFO_SIZE) {
>>          /* initialize based on the demuxer-supplied streamdata header */
>> -        if (avctx->extradata_size == FLAC_STREAMINFO_SIZE) {
>> +        int streaminfo_start = 0;
>> +        if (avctx->extradata_size >= FLAC_STREAMINFO_SIZE + 8 &&
>> +                AV_RB32(avctx->extradata) == MKBETAG('f','L','a','C')) {
>> +            int type = avctx->extradata[4] & 0x7F;
>> +            int size = AV_RB24(&avctx->extradata[5]);
>> +            if (type != FLAC_METADATA_TYPE_STREAMINFO || size != FLAC_STREAMINFO_SIZE) {
>> +                av_log(avctx, AV_LOG_ERROR, "invalid FLAC extradata\n");
>> +                return -1;
>> +            }
>> +            streaminfo_start = 8;
>> +        }
>>              ff_flac_parse_streaminfo(avctx, (FLACStreaminfo *)s,
>> -                                     avctx->extradata);
>> +                                     &avctx->extradata[streaminfo_start]);
>>              allocate_buffers(s);
>> -        } else {
>> -            init_get_bits(&s->gb, avctx->extradata, avctx->extradata_size*8);
>> -            metadata_parse(s);
>> -        }
>>      }
>>  
>>      return 0;
> 
> why are these changes needed?

Currently, the decoder has a function to read through all the metadata
blocks, but only really uses the STREAMINFO.  It has to read through
them all for when the header is inline and the GetBitContext needs to be
moved past the header to get to the first frame.  The same function is
currently used for extradata, but it's not really necessary to read the
whole header in that case.  So the change just extracts the STREAMINFO
for the case when the whole header is passed in the extradata.

> also this patch must be tested against all odd flac in whatever files we
> have to make sure there are no regressions, this change has a high potential
> to break things

We don't really have much odd FLAC that I'm aware of, but I have tried
the files I could find in our samples collection and have tried creating
some odd files of my own to test with. I haven't had any issues so far
that don't also occur with the current decoder...for different reasons.
 I've tested with large metadata, small metadata, ogg, mkv...  I don't
have any samples for caf/qt but I found how it's supposed to be handled
in the XiphQT code, and the headers are not sent inline. The
extradata/"cookie" can contain either just the STREAMINFO or all metadata.

I haven't tried many broken/fuzzed files.  The two I did try exited
cleanly with "I/O error occurred. Usually that means that input file is
truncated and/or corrupted."

I'm open to suggestions or other samples to test.

> 
> [...]
>> @@ -304,6 +306,85 @@
>>      return 0;
>>  }
>>  
>> +#ifdef CONFIG_FLAC_DEMUXER
> 
> my script says this is wrong and should be a #if
> also static functions that are unused should not need #if unless they refer
> to unavailable things and thus could fail compilation

sorry... forgot to fix this one. fixed locally.

> 
>> +static int flac_read_header(AVFormatContext *s, AVFormatParameters *ap)
>> +{
>> +    int err, metadata_done=0, type, length, found_streaminfo=0, buffer_size=0;
>> +    uint32_t header;
>> +    uint8_t *buffer=NULL;
>> +    FLACStreaminfo si;
>> +    AVStream *st;
>> +
>> +    err = audio_read_header(s, ap);
>> +    if (err)
>> +        return err;
>> +
>> +    st = s->streams[0];
>> +
>> +    /* if fLaC marker is not found, assume there is no header */
>> +    if (get_le32(s->pb) != MKTAG('f','L','a','C'))
>> +        return 0;
>> +
>> +    /* process metadata blocks */
>> +    while (!url_feof(s->pb) && !metadata_done) {
>> +        header = get_be32(s->pb);
>> +        metadata_done = (header & 0x80000000);
>> +        type          = (header & 0x7f000000) >> 24;
>> +        length        = (header & 0x00ffffff);
>> +
>> +        if (!buffer || buffer_size < length) {
> 
>> +            buffer = av_realloc(buffer, length + FF_INPUT_BUFFER_PADDING_SIZE);
>> +            if (!buffer)
>> +                return AVERROR(ENOMEM);
> 
> memleak

fixed locally.

> 
>> +            buffer_size = length;
>> +        }
>> +        if (get_buffer(s->pb, buffer, length) != length) {
>> +            av_free(buffer);
>> +            return AVERROR(EIO);
>> +        }
>> +
>> +        if (type == FLAC_METADATA_TYPE_STREAMINFO) {
>> +            if (length != FLAC_STREAMINFO_SIZE) {
>> +                av_free(buffer);
>> +                return AVERROR(EINVAL);
>> +            }
>> +            found_streaminfo = 1;
>> +            st->codec->extradata      = buffer;
>> +            st->codec->extradata_size = length;
> 
> is there anything that prevents this from being executed several times and
> thus leaking?

no.. fixed locally. STREAMINFO must only occur once.

> [...]
>> +    if (buffer)
>> +        av_free(buffer);
> 
> useless if()

ah, i forgot about that. changed locally.

After looking at this again, it's probably simpler to just allocate the
buffer for each block instead of using av_realloc(), and free it if it's
not used for extradata.

I'll send a new patch after your response.

Thanks,
Justin






More information about the ffmpeg-devel mailing list