[FFmpeg-soc] AAC Encoding - Where we stand, what's left

Tue Jul 7 04:38:55 CEST 2009

On Mon, Jul 6, 2009 at 9:28 PM, Diego Biurrun<diego at biurrun.de> wrote:
> On Mon, Jul 06, 2009 at 09:14:00PM -0400, Alex Converse wrote:
>> I'd like to take a minute to discuss the status of the AAC encoder and
>> where it is going.
>>
>> In SoC svn:
>> --Lacks multichannel support
>> --Lacks SBR
>
> These are likely low priority.
>

All the other AAC encoders out there worth their salt support these.
It's 2009; SBR is no longer a fringe extension to AAC that major
implementations don't support. Microsoft and Apple have both moved to
supporting HE-AAC. 14496-3:2009 will include the HE-AAC profile in the
main body (not an amendment). SBR is absolutely necessary to be
competitive at low bitrates.

>> --Produces illegal bitstreams by violating the maximum frame size
>
> This is bad.
>
>> In my tree:
>> --Ruggles' PARCOR
>
> What's that?

A routine for computing PARCOR (reflection) coefficients. It's a
prerequisite for TNS.
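
For readers unfamiliar with the term: PARCOR (partial correlation)
coefficients are the reflection coefficients of a linear predictor,
and TNS transmits its filter in that form. A minimal sketch of the
usual way to get them, the Levinson-Durbin recursion over an
autocorrelation sequence, is below. This is not the Flake routine
mentioned in this thread; the function name and argument layout are
assumptions made for illustration, and sign conventions differ between
implementations.

```python
def parcor_from_autocorr(r, order):
    """Levinson-Durbin recursion (illustrative sketch, hypothetical name).

    r     -- autocorrelation values r[0..order]
    order -- predictor order
    Returns the list of reflection (PARCOR) coefficients k[0..order-1],
    using the convention x_hat[n] = sum_j a[j] * x[n - j].
    """
    a = [0.0] * (order + 1)   # predictor coefficients; a[0] is unused
    k = [0.0] * order         # reflection coefficients
    err = r[0]                # prediction error energy so far

    for i in range(1, order + 1):
        if err <= 0.0:
            # Degenerate input (e.g. silence): leave remaining k at zero.
            break

        # Correlation left unexplained by the order-(i-1) predictor.
        acc = r[i]
        for j in range(1, i):
            acc -= a[j] * r[i - j]

        ki = acc / err
        k[i - 1] = ki

        # Order-update of the predictor coefficients.
        new_a = a[:]
        new_a[i] = ki
        for j in range(1, i):
            new_a[j] = a[j] - ki * a[i - j]
        a = new_a

        # Each stage shrinks the error by (1 - k^2).
        err *= 1.0 - ki * ki

    return k
```

For example, calling parcor_from_autocorr([1.0, 0.5, 0.25], 2) on the
autocorrelation of a first-order process gives a first coefficient of
0.5 and a second of zero, since one predictor tap already explains the
signal.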

>
>> --Maximum frame size enforcement
>
> Could you try to get this merged next?
>

It depends on the rate control stuff.

>> To be frank, at this point it seems like it might be prudent for me to
>> stop working on this
>
> Uh, why?
>

Getting faac free (by dropping long-forgotten profiles and
reimplementing things from spec) seems like less effort than getting
FFmpeg to faac quality (running around trying to fix bugs in someone
else's codebase). Building on 3GPP TS 26.410 v8.0.0 is attractive
because it is already better quality than FFmpeg and faac and includes
a working SBR implementation, which would require tons of work to add
to FFmpeg or faac.

>> Dsputil is awesome but developing this encoder inside ffmpeg is
>> constricting to say the least.
>
> Why?  Elaborate..
>

Well, our source control is one major paradigm shift behind (the use
of svn:externals definitely makes using the git mirror more painful,
and branches aren't even exposed on the git mirror, but we've
discussed this a thousand times and I don't see any chance of it
changing in the near future). This is especially painful because
essentially I'm playing in someone else's sandbox here.

Trying to minimize the diff vs HEAD outside of the core encoder files
(to ease merging) is diametrically opposed to a variety of things that
need to be done. A variety of codec-specific options are needed (see
previous). Splitting out the windowing and psymodel code to make a
fake encoder to tune the psymodel is intrusive OR leads to lots of
duplication. Either way it's pretty nasty and requires big adjustments
to both the AAC encoder and decoder.

If I'm the only one working on this, implementing it outside of
libavcodec allows me to make the changes I feel are necessary without
having to justify them to anyone else.

I'm not the only one who's wondered if FFmpeg is really the best place
to implement a high-quality encoder. FFmpeg lacks a VC-1 encoder, an
H.264 encoder, and an MP3 encoder. x264 is developed outside of FFmpeg
despite sharing some code. Aften and Flake (that PARCOR routine is
actually from Flake) are developed outside of FFmpeg and periodically
have features backported. AAC itself is older than FFmpeg (not some
johnny-come-lately format) and we still lack a working encoder for it.

 --Alex