[Ffmpeg-devel] Fixed point arithmetic RealAudio G2 (cook) decoder

Benjamin Larsson banan
Thu Mar 8 22:29:45 CET 2007


Ian Braithwaite wrote:
> Hi,
> 
> 
> OK - here's my fixed point patch for the cook audio decoder.
> 
> 
>>> I know that Rich is always interested in integer implementation for
>>> his slow K6-III+. You're gonna make at least one user happy ;-)
>> not only rich, 25x speedup for ARM is interesting, though we of course
>> cannot unconditionally replace the float decoder so the patch must be
>> clean to be accepted in svn, clean here means that common code
>> is not duplicated and that both float and integer can be selected by the
>> users (at compile time is ok)
> 
> Just to be clear - the fixed point decoder is _slower_ than the floating
> point version, by about a factor of two, on my Athlon.
> So, no, replacing the float decoder isn't an option.
> 
> The 25x speedup on ARM is perhaps not so surprising, since to run the float
> version, it has to use a software FPU emulator.
> 
> 
> To avoid duplication of common code, here's what I did.
> 
> The current decoder consists of two files, "cook.c" and "cookdata.h".
> I factored out the floating point math from "cook.c", and put it into
> inline functions in a new file, "cook_float.h".
> Then I implemented fixed point versions, and put them in "cook_fixpoint.h".
> Then I repeated the same trick for "cookdata.h".
> Finally, I conditionally include the appropriate two files in "cook.c".
> There's a define in "cook.c" to choose between float and fixed -
> presumably this could be integrated into the configuration system.
> 
> 
> Some of the patch isn't directly related to the fixed point implementation,
> so I've split it up into 4 (in my opinion!) improvements, and then the
> patch itself. The improvements came about by me figuring out how
> the whole thing works.
> 
> Here's a run down of the 5 patches.
> 
> 
> Patch 1.
> Don't output the first two frames, since they don't contain valid audio.
> This also eases comparison of decoded output with Real's binary decoder.
> 
>  cook.c |    3 +++
>  1 file changed, 3 insertions(+)

Applied.

> 
> 
> Patch 2.
> Simplify gain block handling.
> No effect on output.
> 

Applied.

>  cook.c |  152 ++++++++++++++++++++++-------------------------------------------
>  1 file changed, 52 insertions(+), 100 deletions(-)
> 
> 
> Patch 3.
> Replace custom modified discrete cosine transform with ffmpeg's own.
> This does alter the decoded output, although not to my ears.
> The new output is closer to that produced by Real's "official" decoder,
> and the decoder runs slightly faster.
> 
>  cook.c |   98 +++++++++++++++++++----------------------------------------------
>  1 file changed, 30 insertions(+), 68 deletions(-)
> 

Merge the scale factor into the window. That saves one multiply per sample.
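
To illustrate the idea, here is a minimal sketch. It assumes the scale
factor is a constant (such as the sqrt(2/MLT_size) mentioned in the removed
twiddle code), so it can be folded into the window table once at init time.
The function names are hypothetical and not taken from cook.c.

```c
#include <stddef.h>

/* One-time setup: fold a constant scale factor into the window table.
 * Hypothetical helper, not part of cook.c. */
static void merge_scale_into_window(float *window, size_t n, float scale)
{
    for (size_t j = 0; j < n; j++)
        window[j] *= scale;
}

/* Per-sample work before the merge: two multiplies per sample. */
static void apply_window_then_scale(float *out, const float *in,
                                    const float *window, size_t n,
                                    float scale)
{
    for (size_t j = 0; j < n; j++)
        out[j] = in[j] * window[j] * scale;
}

/* Per-sample work after the merge: one multiply per sample. */
static void apply_merged_window(float *out, const float *in,
                                const float *merged_window, size_t n)
{
    for (size_t j = 0; j < n; j++)
        out[j] = in[j] * merged_window[j];
}
```

The two apply functions give the same result up to float rounding; only
the second one is needed once the table has been scaled.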

Oh btw, great job porting the mdct to the ffmpeg one. I failed when I tried.

> 
> Patch 4.
> Simplify subband coefficient handling.
> No effect on output.
> 
>  cook.c |   95 +++++++++++++++++++++++++++++------------------------------------
>  1 file changed, 43 insertions(+), 52 deletions(-)
> 

Change this to use av_random.

> 
> Patch 5.
> Fixed point decoder.
> A define in "cook.c" chooses between fixed and floating point
> implementations.
> With floating point, no effect on output.
> With fixed point, output difference is about 0.25 RMS, max +-1, at 16 bit.

+-1 rounding errors are OK. IMO the float and fixed-point code and tables
can share the same file with the help of defines. The fixed-point math
operations should be moved to some generic fixedpoint.h file for
possible use in other codecs. The fixed-point MDCT should be moved to
mdct.h.
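
A rough sketch of what I mean, assuming a 16.16 fixed-point format. All
names here (fixp_t, FIXP_FRAC_BITS, COOK_USE_FIXPOINT, COOK_MUL and so on)
are made up for illustration and do not come from the patch; the real
format and helper set would have to follow what the decoder needs.

```c
#include <stdint.h>

/* --- what a generic fixedpoint.h might contain (hypothetical) --- */
#define FIXP_FRAC_BITS 16          /* assumed 16.16 format */
typedef int32_t fixp_t;

/* Convert a double to fixed point, rounding to nearest. */
static inline fixp_t fixp_from_double(double x)
{
    return (fixp_t)(x * (1 << FIXP_FRAC_BITS) + (x >= 0 ? 0.5 : -0.5));
}

/* Multiply two fixed-point values using a 64-bit intermediate.
 * Relies on >> of a negative value being an arithmetic shift, which is
 * implementation-defined in C but holds on the usual compilers. */
static inline fixp_t fixp_mul(fixp_t a, fixp_t b)
{
    return (fixp_t)(((int64_t)a * b) >> FIXP_FRAC_BITS);
}

/* --- shared decoder code selects the arithmetic with a define --- */
#ifdef COOK_USE_FIXPOINT
typedef fixp_t cook_real;
#define COOK_MUL(a, b) fixp_mul((a), (b))
#define COOK_CONST(x)  fixp_from_double(x)
#else
typedef float cook_real;
#define COOK_MUL(a, b) ((a) * (b))
#define COOK_CONST(x)  ((float)(x))
#endif

/* Common code is written once against cook_real and COOK_MUL. */
static inline cook_real cook_apply_gain(cook_real sample, cook_real gain)
{
    return COOK_MUL(sample, gain);
}
```

With something like this, building with -DCOOK_USE_FIXPOINT switches the
shared code to integer math, and the fixp_* helpers stay usable by other
codecs.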

> 
>  cook.c              |  279 +++++---------------------
>  cook_fixp_mdct.h    |  545 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  cook_fixpoint.h     |  243 +++++++++++++++++++++++
>  cook_float.h        |  226 +++++++++++++++++++++
>  cookdata.h          |   66 ------
>  cookdata_fixpoint.h |  433 +++++++++++++++++++++++++++++++++++++++++
>  cookdata_float.h    |  113 ++++++++++
>  7 files changed, 1614 insertions(+), 291 deletions(-)
> 
> 
> regards,
> Ian
> 
> 
> ------------------------------------------------------------------------
> 
> diff -upN -x 'ff*' gain/cook.c mlt/cook.c
> --- gain/cook.c	2007-03-07 13:58:47.000000000 +0100
> +++ mlt/cook.c	2007-03-08 08:54:51.000000000 +0100
> @@ -90,15 +90,9 @@ typedef struct {
>      int                 random_state;
>  

[...]

>  
>      /* Initialize the MLT window: simple sine window. */
> -    alpha = M_PI / (2.0 * (float)q->mlt_size);
> -    for(j=0 ; j<q->mlt_size ; j++) {
> -        q->mlt_window[j] = sin((j + 512.0/(float)q->mlt_size) * alpha);
> +    alpha = M_PI / (2.0 * (float)mlt_size);
> +    for(j=0 ; j<mlt_size ; j++) {
> +        q->mlt_window[j] = sin((j + 0.5) * alpha);

Complete the window.

>      }
>  
> -    /* pre/post twiddle factors */
> -    for (j=0 ; j<q->mlt_size/2 ; j++){
> -        q->mlt_precos[j] = cos( ((j+0.25)*M_PI)/q->mlt_size);
> -        q->mlt_presin[j] = sin( ((j+0.25)*M_PI)/q->mlt_size);
> -        q->mlt_postcos[j] = (float)sqrt(2.0/(float)q->mlt_size)*cos( ((float)j*M_PI) /q->mlt_size); //sqrt(2/MLT_size) = scalefactor
> +    /* Initialize the MDCT. */
> +    if (ff_mdct_init(&q->mdct_ctx, av_log2(mlt_size)+1, 1)) {
> +      av_free(q->mlt_window);
> +      return -1;

[...]

> -
> -    /* window and reorder */
> -    for(i=0 ; i<q->mlt_size/2 ; i++){
> -        outbuffer[i] = mlt_tmp[q->mlt_size/2-1-i] * q->mlt_window[i];
> -        outbuffer[q->mlt_size-1-i]= mlt_tmp[q->mlt_size/2-1-i] *
> -                                    q->mlt_window[q->mlt_size-1-i];
> -        outbuffer[q->mlt_size+i]= mlt_tmp[q->mlt_size/2+i] *
> -                                  q->mlt_window[q->mlt_size-1-i];
> -        outbuffer[2*q->mlt_size-1-i]= -(mlt_tmp[q->mlt_size/2+i] *
> -                                      q->mlt_window[i]);
> +    ff_imdct_calc(&q->mdct_ctx, outbuffer, inbuffer, q->mdct_tmp);
> +
> +    for(i = 0; i < q->samples_per_channel; i++){
> +        float tmp = outbuffer[i];
> +        
> +        outbuffer[i] =
> +          q->mlt_window[i] * gain * outbuffer[q->samples_per_channel + i];
> +        outbuffer[q->samples_per_channel + i] =
> +          q->mlt_window[q->samples_per_channel - 1 - i] * gain * -tmp;

Use the dsp vector mul function here with the completed window. The
negation (-tmp) might need to be merged into the completed window.
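
Something along these lines, as a sketch only. vector_mul() below is a
plain C stand-in for the dsp function, not the actual dsputil API, and
complete_window() and window_block() are hypothetical names. The gain is
left out for brevity; a constant factor could be folded into the table
the same way as the sign.

```c
#include <stddef.h>

/* Stand-in for a DSP vector multiply: dst[i] *= src[i]. */
static void vector_mul(float *dst, const float *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] *= src[i];
}

/* Build the completed 2*n window from the half window w[0..n-1].
 * The second half is mirrored and negated, so the "-tmp" from the
 * loop in the patch lives in the table instead of in the loop. */
static void complete_window(float *full, const float *half, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        full[i]     =  half[i];
        full[n + i] = -half[n - 1 - i];
    }
}

/* Swap the two halves of buf[0..2n-1], then window the whole block
 * with a single vector multiply. */
static void window_block(float *buf, const float *full, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float tmp  = buf[i];
        buf[i]     = buf[n + i];
        buf[n + i] = tmp;
    }
    vector_mul(buf, full, 2 * n);
}
```

With gain equal to 1 this should produce the same samples as the loop in
the patch; the swap could be avoided if the buffer layout allows it.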

[...]

> 
> ------------------------------------------------------------------------
> 
> diff -upN -x 'ff*' mlt/cook.c coef/cook.c
> --- mlt/cook.c	2007-03-08 08:54:51.000000000 +0100
> +++ coef/cook.c	2007-03-08 08:59:45.000000000 +0100
> @@ -124,6 +124,20 @@ typedef struct {
>      float               decode_buffer_2[1024];
>  } COOKContext;
>  
> +
> +/**
> + * Random bit stream generator.
> + * Should we be using av_random()?
> + */

Yes.

> +static int inline cook_random(COOKContext *q)
> +{
> +    q->random_state =
> +      q->random_state * 214013 + 2531011; /* typical RNG numbers */
> +
> +    return (q->random_state/0x1000000)&1;  /*>>31*/
> +}
> +
> +

[...]
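
For reference, the cook_random() helper quoted above is a linear
congruential generator. Note that dividing by 0x1000000 picks out bit 24
of the state, not bit 31 as the trailing comment suggests. Below is a
standalone sketch of the same recurrence using unsigned arithmetic, which
avoids signed overflow. lcg_random_bit() is a hypothetical name, and the
result may differ from the quoted code when the signed state goes
negative, so treat it as an illustration rather than a drop-in.

```c
#include <stdint.h>

/* Hypothetical standalone version of the quoted generator.
 * Advances the state and returns one pseudo-random bit. */
static inline int lcg_random_bit(uint32_t *state)
{
    /* Unsigned arithmetic wraps modulo 2^32, so overflow is well defined. */
    *state = *state * 214013u + 2531011u;

    /* Take bit 24 of the new state, mirroring the division by 0x1000000. */
    return (int)((*state >> 24) & 1u);
}
```

Switching to av_random would replace this state update with the shared
generator and keep only the bit extraction in the decoder.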

Please post one patch per mail. It makes the code easier to review.


MvH
Benjamin Larsson



