[FFmpeg-devel] [PATCH] Common ACELP routines (2/3) - filters

Vladimir Voroshilov voroshil
Fri Apr 25 05:18:58 CEST 2008


Michael Niedermayer wrote: 
> On Fri, Apr 25, 2008 at 08:22:15AM +0700, Vladimir Voroshilov wrote:
> > On Fri, Apr 25, 2008 at 1:17 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > > On Fri, Apr 25, 2008 at 12:07:15AM +0700, Vladimir Voroshilov wrote:
> > >  > On Thu, Apr 24, 2008 at 10:13 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > >  [...]
> > >
> > >
> > > > >  > > > +        filter_data[10+n] = out[n] = sum;
> > >  > >  > >
> > >  > >  > > This duplicated storage is unacceptable.
> > >  > >  >
> > >  > >  > First of all, the values assigned to filter_data will be used later in the loop.
> > >  > >  > Thus filter_data cannot be eliminated.
> > >  > >  > I can't use "out" instead, because of the necessary 10 items
> > >  > >  > at the top (with data from the previous subframe).
> > >  > >  > Extending out with 10 items at the top will require another temporary buffer
> > >  > >  > and one memcpy somewhere later (because I will not be able to use the output
> > >  > >  > buffer directly).
> > >  > >
> > >  > >  The double write is definitely useless after the first 10 iterations, as
> > >  > >  after that you can just work in the out buffer.
> > >  > >
> > >  > >  foobar_filter(filter_data+10, 10);
> > >  > >  memcpy(out, filter_data+10, 10);
> > >  > >  foobar_filter(out+10, N-10);
> > >  > >
> > >  > >  should work fine and will, for large N (dunno how large it is, so maybe
> > >  > >  this isn't worth it ...), be faster. It also allows filter_data to be smaller.
> > >  >
> > >  > ... and code will look like :(
> > >  >
> > >  > if(foobar_filter(filter_data+10, 10)!=OVERFLOW)
> > >  > {
> > >  >   memcpy(out, filter_data+10, 10);
> > >  >   if(foobar_filter(out+10, N-10)==OVERFLOW)
> > >  >   {
> > >  >      for(i=0;i<len;i++) out>>=2;
> > >  >      foobar_filter(filter_data+10, 10);
> > >  >      memcpy(out, filter_data+10, 10);
> > >  >      foobar_filter(out+10, N-10);
> > >  >   }
> > >  > }
> > >  > else
> > >  > {
> > >  >      for(i=0;i<len;i++) out>>=2;
> > >  >      foobar_filter(filter_data+10, 10);
> > >  >      memcpy(out, filter_data+10, 10);
> > >  >      foobar_filter(out+10, N-10);
> > >  > }
> > >
> > >  for(;;){
> > >     overflow= foobar_filter(filter_data+10, 10);
> > >
> > >     memcpy(out, filter_data+10, 10);
> > >     overflow|= foobar_filter(out+10, N-10);
> > >     if(!overflow)
> > >         break;
> > >
> > >     for(i=0;i<len;i++) out>>=2;
> > >  }
> > 
> > This will change filter_data even if an overflow occurred,
> > which causes a wrong synthesis result on the second iteration.
> > The current code, in the overflow case, just downscales
> > the excitation signal (without touching the filter data).
> 
> Well, it's a matter of adding if(!overflow)

I'm afraid you misunderstand me.

The synthesis filter can be described as
speech[i] = sum(speech[i-j] * coeff[j]), j=0..10
or, in the Z-domain:
A(z) = 1/(sum(coeff[j]*z^(-j)))

Thus each loop iteration uses the 10 previous
samples (already filtered!).
The filter_data array holds those 10 samples (already filtered!)
from the previous speech data.
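
To make the dependency on the 10 previous output samples concrete, here is a
minimal sketch of such a recursive filter. It is not the code from the patch:
the name synth_filter, the Q12 coefficient scaling, the explicit excitation
input "in" and the sign convention (out[n] = in[n] - sum(coeff[j-1]*out[n-j]),
j=1..10) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define LP_ORDER 10 /* filter order discussed in this thread */

/*
 * Hypothetical all-pole filter sketch (not the patch's implementation).
 * coeff[] is assumed to be in Q12 (4096 == 1.0).
 * out must point into a buffer that has LP_ORDER valid, already filtered
 * samples stored immediately before out[0].
 * Returns 1 as soon as a sample does not fit into int16_t, 0 otherwise.
 */
static int synth_filter(int16_t *out, const int16_t *coeff,
                        const int16_t *in, int len)
{
    assert(len >= 0);
    for (int n = 0; n < len; n++) {
        /* 64-bit accumulator so the sum of products cannot wrap */
        int64_t sum = (int64_t)in[n] * 4096;

        /* each output sample depends on the 10 previous OUTPUT samples */
        for (int j = 1; j <= LP_ORDER; j++)
            sum -= (int64_t)coeff[j - 1] * out[n - j];

        sum /= 4096; /* back from Q12, truncating toward zero */
        if (sum < INT16_MIN || sum > INT16_MAX)
            return 1; /* overflow: out[n] is left unwritten */
        out[n] = (int16_t)sum;
    }
    return 0;
}
```

Because out[n - j] reaches back before out[0] during the first 10 iterations,
the caller has to provide that history somewhere, which is exactly the role
of filter_data here.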

So to avoid updating filter_data in case of overflow
we should either:
1. Use the current code.
or
2. Create an internal temporary array 20 samples long,
memcpy the filter data into it, run the filter for the
first 10 samples using this temporary buffer, memcpy the
data from the temp buffer to out, and run the filter for the rest.
Finally, memcpy the tail of the out buffer back to filter_data
if necessary (only this last step can be done outside the filter, IMHO).
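
Option 2 could look roughly like the sketch below. The names filter_core,
synth_split and synth_update_history, the Q12 coefficient format and the
explicit excitation input "in" are assumptions for illustration only, not
the code from the patch. Note that the memcpy sizes are given in bytes via
sizeof, which the pseudocode quoted above leaves out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LP_ORDER 10 /* filter order discussed in this thread */

/* Hypothetical filter core: needs LP_ORDER filtered samples before out[0];
 * coefficients assumed Q12. Returns 1 on int16_t overflow, 0 otherwise. */
static int filter_core(int16_t *out, const int16_t *coeff,
                       const int16_t *in, int len)
{
    for (int n = 0; n < len; n++) {
        int64_t sum = (int64_t)in[n] * 4096;
        for (int j = 1; j <= LP_ORDER; j++)
            sum -= (int64_t)coeff[j - 1] * out[n - j];
        sum /= 4096;
        if (sum < INT16_MIN || sum > INT16_MAX)
            return 1;
        out[n] = (int16_t)sum;
    }
    return 0;
}

/* Split filter: filter_data (the 10 samples of history) is only read,
 * so it stays intact if an overflow is reported. */
static int synth_split(int16_t *out, const int16_t *filter_data,
                       const int16_t *coeff, const int16_t *in, int len)
{
    int16_t tmp[2 * LP_ORDER]; /* 10 history samples + 10 new samples */

    assert(len >= LP_ORDER);
    memcpy(tmp, filter_data, LP_ORDER * sizeof(*tmp));

    /* first 10 samples: the history lives in the temporary buffer */
    if (filter_core(tmp + LP_ORDER, coeff, in, LP_ORDER))
        return 1;
    memcpy(out, tmp + LP_ORDER, LP_ORDER * sizeof(*out));

    /* remaining samples: the history is now the start of out itself */
    return filter_core(out + LP_ORDER, coeff, in + LP_ORDER, len - LP_ORDER);
}

/* Done by the caller, and only after a run without overflow. */
static void synth_update_history(int16_t *filter_data,
                                 const int16_t *out, int len)
{
    memcpy(filter_data, out + len - LP_ORDER, LP_ORDER * sizeof(*filter_data));
}
```

With this layout the caller can downscale the excitation and call synth_split
again after an overflow, since the history has not been modified yet.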

> Also note that a memcpy is likely faster than writing element by element
> in the loop. So if you dislike splitting it in 2 like above, then a separate
> memcpy is definitely preferred over the double writing.

Well, I don't dislike splitting.
But I think splitting the loop inside the filter, as in [2], will make
the filter cleaner to use.

About memcpy:
The first versions of the decoder used three memcpy calls
(one filter_data->temp, a second temp->out, and a third for the filter_data shift).
In http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2008-March/044227.html
you suggested replacing them with double writing.
Did I miss something there?

I'll try to bring those memcpy calls back
and split the loop inside the filter. This will require a tmp_buf 20 samples long
instead of the 10+MAX_SUBFRAME_SIZE of the first version.
OK?

-- 
Regards,
Vladimir Voroshilov mailto:voroshil at gmail.com
Omsk State University
JID: voroshil at jabber.ru
ICQ: 95587719



