[FFmpeg-devel] [Patch] AAC encoder improvements

Sun May 5 19:06:54 CEST 2013

On Sat, May 04, 2013 at 10:54:25PM -0300, Claudio Freire wrote:
> On Sat, May 4, 2013 at 10:31 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> >> >>              adjust_frame_information(cpe, chans);
> >> >> +            if (cpe->ms_mode) {
> >> >> +                /* Re-evaluate psy model and quantization selection based on
> >> >> +                   MS-transformed channels */
> >> >> +                s->psy.model->analyze(&s->psy, start_ch, coeffs, wi);
> >> >> +                for (ch = 0; ch < chans; ch++) {
> >> >> +                    s->cur_channel = start_ch * 2 + ch;
> >> >> +                    s->coder->search_for_quantizers(avctx, s, &cpe->ch[ch], s->lambda);
> >> >> +                }
> >> >> +            }
> >> >
> >> > shouldnt this and the previous hunks be in seperate patches or is
> >> > there some dependance ?
> >>
> >> Well, they're related, as they both pertain to joint stereo, but
> >> there's no hard dependency between them, that's true. I did begin with
> >> the first hunk, feeling that those artifacts were due to bad choices,
> >> and it's not enough on its own. The really important hunk is the
> >> second, re-doing quantization.
> >
> > how can it be tested ?
> > there doesnt seem to be a difference in the output with the second
> > hunk applied
> 
> I imagine you enabled joint stereo? (-stereo_mode auto in ffmpeg)
> Without it, the code doesn't do anything. And, of course, stereo input
> that can benefit from it (any real track with vocals should).
> 
> 
> > Also how can i test the improvment the first hunk produces ?
> 
> It's subtle. There's a test clip in lame's test sample set,
> castanets[0], that shows some subtle improvement from it.
> 
> > can you resubmit the patch without the unrelated changes ?
> 
> Attached.
> 
> [0] http://lame.sourceforge.net/download/samples/castanets.wv
> [0] http://lame.sourceforge.net/download/samples/spahm.wv

>  aaccoder.c |   17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 4418289496c7fc0adda7e3d80897119ce1cc31e8  0004-aac-twoloop-quant.patch
> From ab748e42e893c696b72c533adfc654026a85cc0e Mon Sep 17 00:00:00 2001
> From: Claudio Freire <klaussfreire at gmail.com>
> Date: Sat, 4 May 2013 18:38:11 -0300
> Subject: [PATCH 4/5] Several improvements to the AAC encoder:
>    * Fix twoloop's initial upper limit estimation. Threshold information
>      was lost when clipping quantizer range, and it resulted in poorer
>      decisions in the optimization loop. Preserving threshold information
>      better preserves the psychoacoustic detail while searching the
>      optimum quantization offset. Forced range was experimentally derived
>      using Lame's test samples.
> 
> ---
>  libavcodec/aaccoder.c |   21 +++++++++++++--------
>  1 files changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/libavcodec/aaccoder.c b/libavcodec/aaccoder.c
> index 65a5554..86be276 100644
> --- a/libavcodec/aaccoder.c
> +++ b/libavcodec/aaccoder.c
> @@ -716,7 +716,8 @@ static void search_for_quantizers_twoloop(AVCodecContext *avctx,
>      int fflag, minscaler;
>      int its  = 0;
>      int allz = 0;
> -    float minthr = INFINITY;
> +    float minthr = INFINITY, maxthr = 0.0f;
> +    float iminthr, ilogmaxthr;
>  
>      // for values above this the decoder might end up in an endless loop
>      // due to always having more bits than what can be encoded.
> @@ -738,18 +739,22 @@ static void search_for_quantizers_twoloop(AVCodecContext *avctx,
>              }
>              uplims[w*16+g] = uplim *512;
>              sce->zeroes[w*16+g] = !nz;
> -            if (nz)
> +            if (nz) {
>                  minthr = FFMIN(minthr, uplim);
> +                maxthr = FFMAX(maxthr, uplim);
> +            }
>              allz |= nz;
>          }
>      }
> +    iminthr = 1.0f / minthr;
> +    ilogmaxthr = 12.0f / log2f(FFMAX(2.0f,maxthr*512*iminthr));
>      for (w = 0; w < sce->ics.num_windows; w += sce->ics.group_len[w]) {
>          for (g = 0;  g < sce->ics.num_swb; g++) {
>              if (sce->zeroes[w*16+g]) {
>                  sce->sf_idx[w*16+g] = SCALE_ONE_POS;
>                  continue;
>              }
> -            sce->sf_idx[w*16+g] = SCALE_ONE_POS + FFMIN(log2f(uplims[w*16+g]/minthr)*4,59);
> +            sce->sf_idx[w*16+g] = SCALE_ONE_POS + FFMIN(log2f(uplims[w*16+g]*iminthr)*ilogmaxthr+59-12-1,59);

the iminthr change is a unrelated optimization that should be in a
seperate patch.

>          }
>      }
>  
> @@ -814,12 +819,12 @@ static void search_for_quantizers_twoloop(AVCodecContext *avctx,
>                  }
>              }
>              if (tbits > destbits) {
>                  for (i = 0; i < 128; i++)
>                      if (sce->sf_idx[i] < 218 - qstep)
>                          sce->sf_idx[i] += qstep;
> -            } else {
> +            } else if (tbits < destbits) {

> -                for (i = 0; i < 128; i++)
> +                for (i = 0; i < 128; i++)

trailing whitespace (which cannot be commited as its rejected by
our git server)

> -                    if (sce->sf_idx[i] > 60 - qstep)
> +                    if (sce->sf_idx[i] > 60 + qstep) 

this also looks like it could be in a seperate patch

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The misfortune of the wise is better than the prosperity of the fool.
-- Epicurus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20130505/7eab2780/attachment.asc>