[FFmpeg-devel] [PATCH] lavc/aacenc_utils: replace powf(x, y) by expf(logf(x), y)
Ronald S. Bultje
rsbultje at gmail.com
Fri Mar 11 15:30:14 CET 2016
On Thu, Mar 10, 2016 at 9:23 PM, Ganesh Ajjanagadde <gajjanag at gmail.com>
> On Thu, Mar 10, 2016 at 8:56 AM, Ronald S. Bultje <rsbultje at gmail.com>
> > Hi,
> > On Thu, Mar 10, 2016 at 2:37 AM, Reimar Döffinger <
> Reimar.Doeffinger at gmx.de>
> > wrote:
> >> On 10.03.2016, at 03:06, Ganesh Ajjanagadde <gajjanag at gmail.com> wrote:
> >> > On Wed, Mar 9, 2016 at 2:16 AM, Reimar Döffinger
> >> > <Reimar.Doeffinger at gmx.de> wrote:
> >> >> On 08.03.2016, at 04:48, Ganesh Ajjanagadde <gajjanag at gmail.com>
> >> >>
> >> >>> + nzl += expf(logf(s / ethresh) * nzslope);
> >> >>
> >> >> Shouldn't log2f/exp2f be faster?
> >> >> log2f at least has CPU support on x86 AFAICT.
> >> >
> >> > I had tested this, and no, though it is still faster than powf.
> >> >
> >> > It still seems to rely on libm, note that we don't use -ffast-math and
> >> > a look at
> >> https://github.com/lattera/glibc/tree/master/sysdeps/x86_64/fpu
> >> > as well seems to say no. Problem is, GNU people like to prioritize
> >> > "correctly rounded" behavior over fast, reasonably accurate code,
> >> > sometimes to ludicruous degrees.
> >> >
> >> > Personally, I don't know why we don't use -ffast-math, not many seem
> >> > to care that heavily on strict IEEE semantics. Maybe it leads to too
> >> > much variation across platforms?
> >> You lose some guarantees. In particular, the compiler will assume NaNs
> >> not happen and you cannot predict which code path (after a comparison
> >> example) they take.
> >> But some code for either security or correctness reasons needs them to
> >> handled a certain way.
> >> I guess in theory you could try to make sure fisnan is used in all those
> >> cases, but then you need to find them, and I think if you take
> >> description literally there is no guarantee that even fisnan continues
> >> work... I am also not sure none of the code relies on order of
> >> to get the precision it needs.
> >> So it is simply too dangerous.
> >> Some more specific options might be possible to use though (but I think
> >> even full -ffast-math gains you almost nothing? Does it even help
> Yes, sorry, I meant some specific things from -ffast-math. I checked
> configure, most of the unambiguously clear ones are already being
> turned on. As such, it seems ok.
> > One could also consider writing some customized assembly (calling the
> > relevant instructions instead of C wrappers) in cases where it is
> > speed-sensitive. It's sort of the inverse of what Ganesh is suggesting, I
> > guess, maybe some more effort involved but it can't be that much. You
> > even use av_always_inline functions and inline assembly to call the
> > relevant instruction and otherwise keep things in C. That's identical to
> > what -ffast-math does but turns on only when specifically calling the new
> > API function name...
> So seems like everything wrt this patch is fine, right?
Not really. Your patch still does two things, and I don't like the explicit
exp(log(a)*b). What I'm thinking is that you should have a static inline
function, let's call it fast_pow(a, b), which can internally (in the C
version) be implemented as exp+log. Just as you found for pow, we might
find that for exp/log, the system lib is not very optimized and we can do
it faster ourselves by doing whatever -ffast-math is doing for these
functions. Those would be more specifically optimized and that would be
part of the fast_pow implementation. This way, the code in aacenc remains
easy to follow and the optimization is accessible for other parts of ffmpeg
More information about the ffmpeg-devel