[FFmpeg-devel] audio gop_size

Michael Niedermayer michaelni
Wed Aug 25 11:21:13 CEST 2010


On Tue, Aug 24, 2010 at 05:59:05PM -0400, Justin Ruggles wrote:
> Michael Niedermayer wrote:
> 
> > On Sun, Aug 22, 2010 at 04:40:54PM -0400, Justin Ruggles wrote:
> >> Hi,
> >>
> >> There are several audio codecs (such as MLP, ALS, and Speex) which
> >> utilize the concept of groups of frames.  For decoding, it is simple
> >> enough to either decode the whole group at once or decode
> >> frame-by-frame, but for encoding there are other issues.  This is how
> >> Thilo and I have implemented it in the ALS encoder we've been working
> >> on, and I wanted to run it by the list to make sure we're on the right
> >> track.
> >>
> >> 1) Make gop_size an audio option as well.
> >>
> >> alternative: The downside is that it defaults to 12, which may not be an
> >> appropriate default for some audio codecs.  Would changing the gop_size
> >> default have API implications?  Should we instead add another field for
> >> audio group of frames size that would default to 0 or -1?
> >>
> >> 2) Use an internal buffer in the encoder to store each encoded frame
> >> until it has enough for a group, then set coded_frame->pts appropriately
> >> and output the whole group.  The reason that the whole group needs to be
> >> encoded at once is because the container packet should always contain a
> >> whole group (at least this is the case for ALS and Speex).
> > 
> > If the container needs a "GOP" per packet and the encoder outputs a whole
> > GOP at a time, then this is semantically different from video GOPs. It
> > appears to be more of an encoder-internal structuring, like MP3 or AAC
> > using short instead of long windows.
> 
> Well, it can be handled completely internally, but then there are some
> other issues.  I think the best way to describe this is to give specific
> examples.
> 
> MPEG-4 ALS has a concept of random access (RA) units, where an RA frame
> is the first frame in an RA unit.  RA frames do not rely on any samples
> from the previous frame in the linear prediction, while the remaining
> frames in the RA unit do rely on previous samples.  One weird situation is
> that there is such a thing as a stream with no RA frames, in which case the
> whole file contains a single packet (the first frame pretends there was a previous
> frame with all zero samples).
> 
> Demuxing : A single MP4 packet contains the whole RA unit, and there is
> no way to parse out single frames.
> 
> Decoding : The decoder only decodes a single frame at a time, requiring
> multiple decoding calls for each input packet.  This is fine according
> to our audio API.
> 
> Encoding : The encoder-side equivalent of what the decoder does would be
> like #2 in my original email.  Encode a single frame at a time and
> buffer the frames internally until a whole RA unit is done, then output
> the whole thing at once.  But for the encoder, we need a way for the
> user to specify how many frames to put in an RA unit.  If gop_size is
> not really a logical equivalent I guess we need a new field.
> 
> Muxing : If the demuxer and encoder both output full RA units then there
> is no issue here as far as muxing is concerned.
> 
> 
> Speex is a similar situation to ALS, but has some other oddities due to
> individual frames not being byte-aligned.  In this case, the encoder
> could just as easily encode a whole packet at once.  But the user still
> needs to be able to specify how many frames to put in a packet.
> 
> 
> So, I think both situations can be handled slightly differently and
> still work correctly, but a new field is needed.  Something like
> audio_frames_per_packet or similar?
> 
> If that sounds ok I can send a patch.

Probably, unless someone has a better idea.
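
For illustration, the internal buffering described under "Encoding" above could look roughly like the sketch below. It is only a sketch under stated assumptions: GroupBuffer, group_buffer_push and the fixed buffer size are made-up names for this example and are not FFmpeg API, and frames_per_packet stands in for the proposed audio_frames_per_packet field. Timestamp handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GROUP_BUF_SIZE 4096  /* arbitrary capacity for this sketch */

/* Hypothetical encoder-private state; none of these names are FFmpeg API. */
typedef struct GroupBuffer {
    int     frames_per_packet;  /* assumed user option (audio_frames_per_packet) */
    int     frames_buffered;    /* encoded frames held so far */
    size_t  size;               /* bytes held so far */
    uint8_t data[GROUP_BUF_SIZE];
} GroupBuffer;

/*
 * Append one already-encoded frame to the current group.
 * Returns 0 while the group is incomplete, the packet size in bytes once
 * the whole group has been copied to out, or -1 if a buffer is too small.
 */
static int group_buffer_push(GroupBuffer *gb,
                             const uint8_t *frame, size_t frame_size,
                             uint8_t *out, size_t out_size)
{
    int packet_size;

    assert(gb->frames_per_packet > 0);

    /* Refuse frames that would overflow the internal buffer. */
    if (frame_size > sizeof(gb->data) - gb->size)
        return -1;
    memcpy(gb->data + gb->size, frame, frame_size);
    gb->size += frame_size;
    gb->frames_buffered++;

    /* Group not complete yet: the caller gets no packet for this frame. */
    if (gb->frames_buffered < gb->frames_per_packet)
        return 0;

    /* Group complete: emit all buffered frames as a single packet. */
    if (gb->size > out_size)
        return -1;
    memcpy(out, gb->data, gb->size);
    packet_size = (int)gb->size;
    gb->frames_buffered = 0;
    gb->size = 0;
    return packet_size;
}
```

With frames_per_packet set to 3, the first two calls return 0 and the third returns the combined size of the three frames, so the muxer only ever sees whole groups.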

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

When you are offended at any man's fault, turn to yourself and study your
own failings. Then you will forget your anger. -- Epictetus