[FFmpeg-soc] AAC Encoding - Where we stand, what's left

Alex Converse alex.converse at gmail.com
Wed Jul 8 06:05:50 CEST 2009


On Tue, Jul 7, 2009 at 4:45 AM, Kostya<kostya.shishkov at gmail.com> wrote:
> On Mon, Jul 06, 2009 at 09:14:00PM -0400, Alex Converse wrote:
>> I'd like to take a minute to discuss the status of the AAC encoder and
>> where it is going.
>>
>> In SoC svn:
>> --Applies cleanly to SVN HEAD
>> --The most egregious of the artifacting is gone (sections being
>> replaced by silence or having the wrong volume, etc.)
>> --Lacks TNS
>
>> --Lacks multichannel support
>
> Ahem, I added it a long time ago.
>

$ ./ffmpeg -i ../../Canyon-5.1-48khz-448kbit.ac3 canyon5.1.m4a
FFmpeg version git-04fe5e6, Copyright (c) 2000-2009 Fabrice Bellard, et al.
  configuration: --enable-gpl --disable-ffserver
  libavutil     50. 3. 0 / 50. 3. 0
  libavcodec    52.32. 0 / 52.32. 0
  libavformat   52.36. 0 / 52.36. 0
  libavdevice   52. 2. 0 / 52. 2. 0
  libswscale     0. 7. 1 /  0. 7. 1
  built on Jul  7 2009 23:49:58, gcc: 4.3.3
Input #0, ac3, from '../../Canyon-5.1-48khz-448kbit.ac3':
  Duration: 00:00:37.98, bitrate: 448 kb/s
    Stream #0.0: Audio: ac3, 48000 Hz, 5.1, s16, 448 kb/s
File 'canyon5.1.m4a' already exists. Overwrite ? [y/N] y
Output #0, ipod, to 'canyon5.1.m4a':
    Stream #0.0: Audio: aac, 48000 Hz, 5.1, s16, 64 kb/s
Stream mapping:
  Stream #0.0 -> #0.0
Press [q] to stop encoding
Segmentation fault

>> --Lacks rate control
>> --Lacks SBR
>> --Produces illegal bitstreams by violating the maximum frame size
>
> This one could be fixed.
>

It could be fixed, but the fix depends on rate control.
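
For reference, here is a rough sketch of the check involved. The AAC spec
caps a frame at 6144 bits per channel. The helper names below are made up
for illustration and are not from the encoder, and which channels count
toward the limit is left to the caller. Detecting an oversized frame is the
easy part; shrinking it means requantizing, and that is where rate control
comes in.

```c
/* AAC limit: at most 6144 bits per channel in one frame. */
#define AAC_MAX_BITS_PER_CHANNEL 6144

/* Hypothetical helper: largest legal frame size in bits.
 * `channels` is whatever channel count the caller treats as
 * counting toward the limit (an assumption of this sketch). */
static int aac_max_frame_bits(int channels)
{
    return AAC_MAX_BITS_PER_CHANNEL * channels;
}

/* Hypothetical helper: nonzero if a coded frame is legal. */
static int aac_frame_fits(int frame_bits, int channels)
{
    return frame_bits <= aac_max_frame_bits(channels);
}
```

When aac_frame_fits() fails, the encoder has to spend fewer bits on that
frame and try again, which is the rate control dependency.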

>> --Below faac quality
>> --Well below the quality of competitive encoders
>>
>> In my tree:
>> --Ruggles' PARCOR
>> --Rudimentary TNS support based on ISO 13818-7 Annex C
>> --TNS coefficient compressor
>> --Various performance opts
>> --Different value for CLIPPED_ESCAPE (165113.5f * IQ)
>> --Substantial rate control related re-factoring
>> --Pseudo ABR rate control
>> --Maximum frame size enforcement
>
>> --VBR rate control that forces comically high bitrate output.
>
> Heh, do you mean it's always maximum frame size?
>

Not quite but many frames do saturate or get close.

>> TNS is not helpful at the moment. Sharp attacks are losing most of
>> their power before we get to the TNS stage. I believe this may be
>> psy model related.
>>
>> To be frank, at this point it seems like it might be prudent for me to
>> stop working on this and move to either replacing the
>> non-redistributable parts of faac (to get something legal and faac
>> quality) or improving the 3GPP code (to get something awesome but not
>> distributable). At this point both code bases offer better quality and
>> more features (including SBR support from 3GPP). Dsputil is awesome
>> but developing this encoder inside ffmpeg is constricting to say the
>> least.
>
> I'm strongly against it. It seems to me that it's easier to backport FAAC
> psy model and codebook selection to our encoder to get comparable
> output - IIRC non-LGPL parts of libfaac are exactly the basic stuff I
> implemented.
>

Needing replacement: bitstream.[ch], channels.c, filtbank.c,
huffman.[ch], tns.[ch]
Can be eliminated: backpred.[ch], ltp.[ch]

--Alex

