[FFmpeg-devel] [PATCH] aacenc_utils: unroll loops to allow compiler to use SIMD.
Reimar.Doeffinger at gmx.de
Mon Mar 7 08:54:59 CET 2016
On 07.03.2016, at 04:04, Ganesh Ajjanagadde <gajjanag at gmail.com> wrote:
> On Sun, Mar 6, 2016 at 1:43 PM, Reimar Döffinger
> <Reimar.Doeffinger at gmx.de> wrote:
>> On Sun, Mar 06, 2016 at 07:35:58PM +0100, Reimar Döffinger wrote:
>>> Approximately 10% faster transcode from mp3 to aac
>>> with default settings.
>> Note to anyone wanting to optimize it further:
>> There is almost 25% on the table if you can replace
>> the pow() and cos() function uses by something more
> So I did try one thing, namely in lavc/aacenc_utils, replace powf in
> find_form_factor by a conditional checking for 2.0f, squaring if it
> is, powf otherwise (see lavc/aaccoder_twoloop for the calls, one is
> with 2.0f, other without), but it yields essentially nothing.
> Likewise, an even more trivial one is line 125 of aaccoder_twoloop:
> powf can be replaced here by sqrtf(sqrtf()), but this also yields
Probably those cases are already optimized by the implementation.
> Can you be more specific, and are you sure about this?
Just run your favourite performance analysis tool and you'll see.
Since it is non-inlined libc code, I'm fairly sure the numbers are accurate enough.