[FFmpeg-devel] [PATCH] lavc/lpc: exploit even symmetry of window function

Wed Mar 9 10:47:25 CET 2016

On 9 March 2016 at 07:22, Reimar Döffinger <Reimar.Doeffinger at gmx.de> wrote:

> On 09.03.2016, at 04:16, Ganesh Ajjanagadde <gajjanag at gmail.com> wrote:
>
> > Yields 2x improvement in function performance, and boosts aac encoding
> > speed by ~ 4% overall. Sample benchmark (Haswell+GCC under -march=native):
> > after:
> > ffmpeg -i sin.flac -acodec aac -y sin_new.aac  5.22s user 0.03s system 105% cpu 4.970 total
> >
> > before:
> > ffmpeg -i sin.flac -acodec aac -y sin_new.aac  5.40s user 0.05s system 105% cpu 5.162 total
> >
> > Big shame that len-1 is -1 mod 4; 0 mod 4 would have yielded a further 2x
> > through additional symmetry. Of course, one could approximate with the
> > 0 mod 4 variant, error would essentially be ~ 1/len in the worst case.
>
> Note that I have no idea why we are using double here (is there a good
> reason?)
> It doesn't really matter for the rest of the code, but cosf is also at
> least twice as fast as cos...
> Probably has smaller error than fudging for symmetry and should be enough
> to push the speed cost of this function close to negligible.

Yes, double is used because most of the existing code already worked
internally in doubles (though it accepted only integer samples and did an
integer->double conversion when windowing). So only a single function was
needed to use the existing code with a float input.
The code already has assembly optimizations, so it should be decently fast.
I'm not sure if templating for doubles and floats would be worth it.