[FFmpeg-devel] [RFC/PATCH] More flexible variant of float_to_int16, WMA optimization, Vorbis
Tue Jul 15 00:21:17 CEST 2008
On Mon, 14 Jul 2008, Siarhei Siamashka wrote:
> For example, it is possible to get rid of "memcpy(saved, buf+blocksize/4,
> blocksize/4*sizeof(float))" and probably "vc->buf", performing output
> directly to "vc->ret" and "vc->saved" from "fft.imdct_half".
> It should further improve both performance and L1 cache use, making vorbis
> decoder even better than it is now.
It's not that clear cut. I can remove vc->buf (overwriting some other
buffer that's not used at the time, like channel_residues). But
eliminating the memcpy requires increasing the amount of memory used,
since you then need to keep one saved array per channel plus one for the
current block to be pointer-swapped. This is faster if the data still
fits in L1 after that expansion, but slower if you have an old CPU with
a small cache.
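
To make the trade-off concrete, here is a rough sketch of the two ways to keep the tail of a block for the next overlap. It is not the actual vorbis decoder code: OverlapState, MAX_CH, HALF, keep_tail_memcpy and keep_tail_swap are names invented for illustration, and the transform itself is left out. The copy variant needs only one saved array per channel, while the swap variant assumes the transform has already written the tail into a spare buffer, so it costs one extra array.

```c
#include <assert.h>
#include <string.h>

#define MAX_CH 2   /* hypothetical channel count */
#define HALF   4   /* hypothetical tail length, tiny for illustration */

/* Hypothetical state for the pointer-swap variant: one saved array per
 * channel plus one spare that receives the tail of the current block. */
typedef struct {
    float  storage[MAX_CH + 1][HALF];
    float *saved[MAX_CH];
    float *spare;
} OverlapState;

static void overlap_init(OverlapState *s)
{
    memset(s->storage, 0, sizeof(s->storage));
    for (int ch = 0; ch < MAX_CH; ch++)
        s->saved[ch] = s->storage[ch];
    s->spare = s->storage[MAX_CH];
}

/* Copy variant: the tail lives in a shared work buffer and is copied
 * into the per-channel saved array after each block. */
static void keep_tail_memcpy(float *saved, const float *tail)
{
    memcpy(saved, tail, HALF * sizeof(float));
}

/* Swap variant: the tail was written straight into s->spare, so only
 * two pointers are exchanged. The old saved array becomes the spare
 * and can be reused for the next channel. */
static void keep_tail_swap(OverlapState *s, int ch)
{
    assert(ch >= 0 && ch < MAX_CH);
    float *tmp   = s->saved[ch];
    s->saved[ch] = s->spare;
    s->spare     = tmp;
}
```

With the swap variant the working set grows from MAX_CH to MAX_CH + 1 arrays of HALF floats, which is the expansion that may or may not still fit in L1.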