[FFmpeg-devel] Fixpoint FFT optimization, with MDCT and IMDCT wrappers for audio optimization

Marc Hoffman mmhoffm
Sun Jul 29 13:43:05 CEST 2007


On 7/29/07, Marc Hoffman <mmhoffm at gmail.com> wrote:
> On 7/29/07, Michael Niedermayer <michaelni at gmx.at> wrote:
> > Hi
> >
> > On Sat, Jul 28, 2007 at 10:17:53PM -0400, Marc Hoffman wrote:
> > > On 7/27/07, Michael Niedermayer <michaelni at gmx.at> wrote:
> > > > Hi
> > > >
> > > > On Fri, Jul 27, 2007 at 05:40:14PM -0400, mmh wrote:
> > > > [...]
> > > >
> > > > > +static FFTComplex16 *stwids (FFTComplex *w, int n) {
> > > > > +    int i;
> > > > > +    FFTComplex16 *v = av_malloc (sizeof (short)*n);
> > > > > +    for (i = 0; i < n/2; i++) {
> > > > > +        v[i].re = w[i].re*32767;
> > > > > +        v[i].im = w[i].im*32767;
> > > > > +    }
> > > >
> > > > this should be 32768 with proper cliping
> > > >
> > >
> > > Sorry, to make this so much work for you. I guess if you look at this
> > > what your asking for is reasonable but you do realize that because of
> > > overflow 32767 doesn't overflow like 32768 would.  If you multiply
> > > 1*32768 you get -1 here which is not right not sure how proper
> > > clipping would help either unless you determine the sign prior to the
> > > computation which is a mess in C.
> >
> > proper cliping: (there are of course hundreads of other ways to do it)
> > FFMIN(w[i].re*32768, 32767)
> >
> >
> > >  So I think it should be left alone
> > > otherwise this code becomes fairly complex when its not needed.
> > >
> > > So the proper way to do this is to multiply by almost 1 which is
> > > 0x7fff and not -1.  BTW because of the way I truncate the result
> > > 0x7fff or 0x8000 produces the identical values except for the case
> > > 1*0x8000 which is -1.
> > >
> > > Thoughts?
> >
> > yes fix your code, this is getting annoying
> > scaling factor is 32768 as you use >>16, if you use 32767 you would have to
> > divide by 32767 in the innermost loop
> > and as you mention truncation, please replace this by proper rounding with
> > lrinf() or equivalent
> > you do proper rounding in the inner most loop but in the code which is
> > excuted just once you dont?!
> >
> > and keep in mind the whole code already uses small coefficients, so this
> > makes already inacurate code even less accurate
> >
>
> is this what you want?
>
> static FFTComplex16 *stwids (FFTComplex *w, int n) {
>     int i;
>     FFTComplex16 *v = av_malloc (sizeof (short)*n);
>     for (i = 0; i < n/2; i++) {
>         v[i].re = lrintf(w[i].re*32768);
>         v[i].im = lrintf(w[i].im*32768);
>     }
>     return v;
> }
>

That's what you want, no problem. Sorry to be annoying. Here it is with
rounding and clipping to the 16-bit range:

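/**
 * Generate a 16-bit coefficient table from a predefined floating point table.
 * @param w     - floating point twiddle factors
 * @param n     - number of twiddle factors
 * @return newly allocated table, or NULL if the allocation fails
 */
static FFTComplex16 *stwids (FFTComplex *w, int n) {
    int i;
    /* n shorts, i.e. n/2 entries if FFTComplex16 is a re/im pair of shorts */
    FFTComplex16 *v = av_malloc (sizeof (short)*n);
    if (!v)
        return NULL;
    for (i = 0; i < n/2; i++) {
        /* scale by 2^15, clip to [-32768, 32767], then round to nearest */
        v[i].re = lrintf(FFMAX(FFMIN(w[i].re*32768,32767),-32768));
        v[i].im = lrintf(FFMAX(FFMIN(w[i].im*32768,32767),-32768));
    }
    return v;
}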
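
For anyone following along, here is a minimal standalone sketch of the per-value conversion discussed above: scale by 32768, round, and saturate so that 1.0 maps to 32767 instead of wrapping to -32768. The helper name to_q15 is hypothetical and not part of the patch. To stay free of libm it rounds half away from zero by hand, which is a stand-in for lrintf() (lrintf follows the current rounding mode, so exact ties can differ).

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch: convert a float in [-1, 1]
 * to a signed 16-bit fixed-point value with 15 fractional bits. */
static int16_t to_q15(float x)
{
    /* Scale by 2^15 and round half away from zero
     * (manual stand-in for lrintf, see the note above). */
    float   scaled = x * 32768.0f;
    int32_t v      = (int32_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));

    /* Saturate: +1.0 would give 32768, which does not fit in int16_t. */
    if (v >  32767) v =  32767;
    if (v < -32768) v = -32768;
    return (int16_t)v;
}
```

With this, to_q15(1.0f) gives 32767 and to_q15(-1.0f) gives -32768, while to_q15(0.5f) gives exactly 16384. Scaling by 32767 with truncation would give 16383 for 0.5, which is the small accuracy loss raised in the review.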
