[FFmpeg-devel] [PATCH] speex in ogg muxer

Justin Ruggles justin.ruggles
Sat Sep 19 20:31:13 CEST 2009


Justin Ruggles wrote:

> Justin Ruggles wrote:
> 
>> Justin Ruggles wrote:
>>
>>> Justin Ruggles wrote:
>>>
>>>> Justin Ruggles wrote:
>>>>
>>>>> Justin Ruggles wrote:
>>>>>
>>>>>> Justin Ruggles wrote:
>>>>>>
>>>>>>> Baptiste Coudurier wrote:
>>>>>>>> Justin Ruggles wrote:
>>>>>>>>> Baptiste Coudurier wrote:
>>>>>>>>>> Hi Justin,
>>>>>>>>>>
>>>>>>>>>> Justin Ruggles wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> This patch adds speex support to the ogg muxer.  It basically does the
>>>>>>>>>>> same thing as Ogg/FLAC, in that the 1st packet is a global header from
>>>>>>>>>>> extradata and the 2nd packet is vorbiscomment metadata.
>>>>>>>>>>>
>>>>>>>>>>> This seems to work just fine for speex-to-speex stream copy, but
>>>>>>>>>>> probably would not work for flv-to-speex because flv doesn't seem to have any
>>>>>>>>>>> speex extradata from what I can tell.  I guess a header could be
>>>>>>>>>>> constructed, but that would be a separate patch to the flv demuxer.
>>>>>>>>>>>
>>>>>>>>>>> This patch is a precursor to libspeex encoding support, which I'll be
>>>>>>>>>>> sending shortly.
>>>>>>>>>>>
>>>>>>>>>>> -Justin
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>>>>
>>>>>>>>>>> Index: libavformat/oggenc.c
>>>>>>>>>>> ===================================================================
>>>>>>>>>>> --- libavformat/oggenc.c	(revision 19244)
>>>>>>>>>>> +++ libavformat/oggenc.c	(working copy)
>>>>>>>>>>> @@ -104,17 +125,39 @@
>>>>>>>>>>>      bytestream_put_byte(&p, 0x00); // streaminfo
>>>>>>>>>>>      bytestream_put_be24(&p, 34);
>>>>>>>>>>>      bytestream_put_buffer(&p, streaminfo, FLAC_STREAMINFO_SIZE);
>>>>>>>>>>> -    oggstream->header_len[1] = 1+3+4+strlen(vendor)+4;
>>>>>>>>>>> -    oggstream->header[1] = av_mallocz(oggstream->header_len[1]);
>>>>>>>>>>> -    p = oggstream->header[1];
>>>>>>>>>>> +    p = ogg_write_vorbiscomment(4, bitexact, &oggstream->header_len[1]);
>>>>>>>>>>> +    if (!p)
>>>>>>>>>>> +        return -1;
>>>>>>>>>> AVERROR(ENOMEM)
>>>>>>>>> fixed.
>>>>>>>>>
>>>>>>>>>>> @@ -144,6 +188,12 @@
>>>>>>>>>>>                  av_log(s, AV_LOG_ERROR, "Extradata corrupted\n");
>>>>>>>>>>>                  av_freep(&st->priv_data);
>>>>>>>>>>>              }
>>>>>>>>>>> +        } else if (st->codec->codec_id == CODEC_ID_SPEEX) {
>>>>>>>>>>> +            if (ogg_build_speex_headers(st->codec, oggstream,
>>>>>>>>>>> +                                        st->codec->flags & CODEC_FLAG_BITEXACT) < 0) {
>>>>>>>>>>> +                av_log(s, AV_LOG_ERROR, "error writing Speex headers\n");
>>>>>>>>>>> +                av_freep(&st->priv_data);
>>>>>>>>>>> +            }
>>>>>>>>>> return error here with the return code of the func :>
>>>>>>>>>> Yes, it seems flac misses it too; this needs a fix.
>>>>>>>>>>
>>>>>>>>>> patch fine otherwise, maybe a micro bump for avformat would be nice.
>>>>>>>>> fixed. new patch attached. the new patch also differs in that it
>>>>>>>>> overrides the extra_headers field in the Speex header to be 0 since only
>>>>>>>>> the 2 required headers are written.
>>>>>>>>>
>>>>>>>> patch ok if it works :>
>>>>>> Ok, back to square one.
>>>>>>
>>>>>>> Hmm... I've done several more tests and it does not quite work as-is for
>>>>>>> all samples.  Here is what I have run into.  The tests so far are for
>>>>>>> ogg-to-ogg stream copy.
>>>>>>>
>>>>>>> - When the source has more than 1 frame per packet, the resulting copy
>>>>>>> plays fine with ffmpeg/ffplay but is quick and choppy with speexdec.  I
>>>>>>> was able to fix this by modifying the ogg/speex demuxer to set
>>>>>>> avctx->frame_size to the number of samples in a packet instead of in a
>>>>>>> frame.  I also had to update the libspeex decoder accordingly.  Maybe
>>>>>>> this is the wrong way to go about it though.  I'm guessing it is a
>>>>>>> timestamp/granulepos issue, but I don't know enough about Ogg to tell
>>>>>>> more than that.
>>>>>> This is now corrected after much discussion. :)
>>>>>>
>>>>>>> - Even with the fix and even with 1 frame per packet, 2 short samples
>>>>>>> I've tested so far have a single soft pop when the stream-copied file is
>>>>>>> decoded with speexdec, but it's fine with ffmpeg/ffplay.
>>>>>>>
>>>>>>> Maybe someone else might have an idea of what could be going wrong?
>>>>>> Now I think I know what is going wrong, and there is nothing we can do
>>>>>> about it I think.  speexenc does some weird things with granule
>>>>>> positions.  It starts out for a long time with granulepos=0 even though
>>>>>> it is encoding audio, then when it starts writing granule positions it
>>>>>> is not always in sync with the start of the stream.  Below is a little
>>>>>> snippet from a comparison of an original spx file to a copied spx file.
>>>>>> Each packet should be 320 samples.
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>> -00:00:00.000: serialno 1626088319, calc. gpos 0, packetno 57
>>>>>> +00:00:01.120: serialno 0000000000, granulepos 17920, packetno 57
>>>>>>
>>>>>> -00:00:00.000: serialno 1626088319, calc. gpos 0, packetno 58
>>>>>> +00:00:01.140: serialno 0000000000, granulepos 18240, packetno 58
>>>>>>
>>>>>> -00:00:00.000: serialno 1626088319, calc. gpos 0, packetno 59
>>>>>> +00:00:01.160: serialno 0000000000, granulepos 18560, packetno 59
>>>>>>
>>>>>> -00:00:01.171: serialno 1626088319, granulepos 18737, packetno 60
>>>>>> +00:00:01.180: serialno 0000000000, granulepos 18880, packetno 60
>>>>>>
>>>>>> -00:00:01.191: serialno 1626088319, calc. gpos 19057, packetno 61
>>>>>> +00:00:01.191: serialno 0000000000, granulepos 19057, packetno 61
>>>>>>
>>>>>> -00:00:01.211: serialno 1626088319, calc. gpos 19377, packetno 62
>>>>>> +00:00:01.211: serialno 0000000000, granulepos 19377, packetno 62
>>>>> So... I figured it out, but you may not want to know the answer. ;)
>>>>>
>>>>> The first packet is supposed to be interpreted as shorter than the
>>>>> full frame size.  To get the delay, calculate what the granulepos of
>>>>> the first page would normally be, then subtract the page's actual
>>>>> granulepos from it.
>>>>>
>>>>> From above, this is the last packet in the first page. There are 59
>>>>> packets per page in this stream (the first 2 packets are headers, hence
>>>>> the packetno of 60).
>>>>>> -00:00:01.171: serialno 1626088319, granulepos 18737, packetno 60
>>>>>> +00:00:01.180: serialno 0000000000, granulepos 18880, packetno 60
>>>>> speexdec interprets the first packet as having a delay of
>>>>> 18880-18737=143 samples.  So the first packet should be 320-143=177
>>>>> samples long, and the decoder discards the first 143 samples of the
>>>>> first frame.
>>>>>
>>>>> None of this is documented except for in the speexenc and speexdec
>>>>> source code.  From analyzing a Speex-in-FLV sample, it appears that the
>>>>> way Adobe handles this in Flash Media Server is to do what our ogg
>>>>> demuxer does and interpret the first page as if each frame is 320
>>>>> samples, then resync timestamps with the source after the first page,
>>>>> causing a skip in timestamps after the first page instead of at the
>>>>> beginning of the stream.
>>>>>
>>>>> I'm still not sure what to do about this though...
>>>> This patch makes it so that all the pts and durations are correct for
>>>> Ogg/Speex.  It basically just changes the durations of the first and
>>>> last packets.
>>> nevermind. this doesn't quite work. i'm still working on it. damn ogg
>>> and its craziness!
>> Ok, now this patch should work correctly.
> 
> ping.

ping2. Ogg demuxer maintainer?

My Speex-related patch queue is stuck... I am waiting until this issue
is corrected before submitting updated patches for Speex muxing in Ogg,
FLV, and NUT so that we don't end up creating files with invalid
timestamps.  And muxing support is needed before submitting my new patch
for libspeex encoding.
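
For reference, the first-page arithmetic quoted above can be sketched in C
roughly as follows.  This is only a minimal illustration: the helper names
are hypothetical and are not taken from FFmpeg, speexenc, or speexdec.  It
assumes a fixed number of samples per packet and a delay smaller than one
packet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg or libspeex function).
 * Returns the number of samples a decoder would discard at the start of
 * the stream: the granulepos the first data page would nominally have,
 * minus the granulepos it actually carries. */
static int64_t speex_first_page_delay(int64_t page_granulepos,
                                      int packets_in_page,
                                      int samples_per_packet)
{
    int64_t nominal;

    assert(packets_in_page > 0 && samples_per_packet > 0);
    nominal = (int64_t)packets_in_page * samples_per_packet;
    return nominal - page_granulepos;
}

/* Hypothetical helper: effective length of the first packet once the
 * delay has been removed.  Assumes 0 <= delay < samples_per_packet. */
static int64_t speex_first_packet_samples(int64_t delay,
                                          int samples_per_packet)
{
    return samples_per_packet - delay;
}
```

With the numbers from the log above (granulepos 18737, 59 data packets in
the first page, 320 samples per packet), this gives a delay of 143 samples
and a first packet of 177 samples.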

-Justin



