[FFmpeg-devel] Fixpoint FFT optimization, with MDCT and IMDCT wrappers for audio optimization

Marc Hoffman mmhoffm
Sun Jul 29 13:33:47 CEST 2007


On 7/29/07, Michael Niedermayer <michaelni at gmx.at> wrote:
> Hi
>
> On Sat, Jul 28, 2007 at 10:17:53PM -0400, Marc Hoffman wrote:
> > On 7/27/07, Michael Niedermayer <michaelni at gmx.at> wrote:
> > > Hi
> > >
> > > On Fri, Jul 27, 2007 at 05:40:14PM -0400, mmh wrote:
> > > [...]
> > >
> > > > +static FFTComplex16 *stwids (FFTComplex *w, int n) {
> > > > +    int i;
> > > > +    FFTComplex16 *v = av_malloc (sizeof (short)*n);
> > > > +    for (i = 0; i < n/2; i++) {
> > > > +        v[i].re = w[i].re*32767;
> > > > +        v[i].im = w[i].im*32767;
> > > > +    }
> > >
> > > this should be 32768 with proper clipping
> > >
> >
> > Sorry to make this so much work for you. What you're asking for looks
> > reasonable, but you do realize that 32767 doesn't overflow the way
> > 32768 would. If you multiply 1*32768 you get -1 here, which is not
> > right. I'm not sure how proper clipping would help either, unless you
> > determine the sign prior to the computation, which is a mess in C.
>
> proper clipping (there are of course hundreds of other ways to do it):
> FFMIN(w[i].re*32768, 32767)
>
>
> > So I think it should be left alone,
> > otherwise this code becomes fairly complex when it's not needed.
> >
> > So the proper way to do this is to multiply by almost 1, which is
> > 0x7fff, and not by -1. BTW, because of the way I truncate the result,
> > 0x7fff and 0x8000 produce identical values except for the case
> > 1*0x8000, which is -1.
> >
> > Thoughts?
>
> yes fix your code, this is getting annoying
> the scaling factor is 32768 as you use >>16; if you use 32767 you would
> have to divide by 32767 in the innermost loop
> and as you mention truncation, please replace this by proper rounding with
> lrintf() or equivalent
> you do proper rounding in the innermost loop but in the code which is
> executed just once you don't?!
>
> and keep in mind the whole code already uses small coefficients, so this
> makes already inaccurate code even less accurate
>

Is this what you want?

static FFTComplex16 *stwids (FFTComplex *w, int n) {
    int i;
    FFTComplex16 *v = av_malloc (sizeof (short)*n);
    for (i = 0; i < n/2; i++) {
        v[i].re = lrintf(w[i].re*32768);
        v[i].im = lrintf(w[i].im*32768);
    }
    return v;
}



