[FFmpeg-devel] [PATCH] lavc/aacenc_utils: replace powf(x, y) by expf(logf(x) * y)

Ronald S. Bultje rsbultje at gmail.com
Thu Mar 10 14:56:10 CET 2016


On Thu, Mar 10, 2016 at 2:37 AM, Reimar Döffinger <Reimar.Doeffinger at gmx.de> wrote:

> On 10.03.2016, at 03:06, Ganesh Ajjanagadde <gajjanag at gmail.com> wrote:
> > On Wed, Mar 9, 2016 at 2:16 AM, Reimar Döffinger
> > <Reimar.Doeffinger at gmx.de> wrote:
> >> On 08.03.2016, at 04:48, Ganesh Ajjanagadde <gajjanag at gmail.com> wrote:
> >>
> >>> +                    nzl += expf(logf(s / ethresh) * nzslope);
> >>
> >> Shouldn't log2f/exp2f be faster?
> >> log2f at least has CPU support on x86 AFAICT.
> >
> > I had tested this, and no, though it is still faster than powf.
> >
> > It still seems to rely on libm, note that we don't use -ffast-math and
> > a look at
> > https://github.com/lattera/glibc/tree/master/sysdeps/x86_64/fpu
> > as well seems to say no. Problem is, GNU people like to prioritize
> > "correctly rounded" behavior over fast, reasonably accurate code,
> > sometimes to ludicrous degrees.
> >
> > Personally, I don't know why we don't use -ffast-math; not many seem
> > to care that much about strict IEEE semantics. Maybe it leads to too
> > much variation across platforms?
> You lose some guarantees. In particular, the compiler will assume NaNs do
> not happen and you cannot predict which code path (after a comparison for
> example) they take.
> But some code for either security or correctness reasons needs them to be
> handled a certain way.
> I guess in theory you could try to make sure fisnan is used in all those
> cases, but then you need to find them, and I think if you take -ffast-math
> description literally there is no guarantee that even fisnan continues to
> work... I am also not sure none of the code relies on order of operations
> to get the precision it needs.
> So it is simply too dangerous.
> Some more specific options might be possible to use though (but I think
> even full -ffast-math gains you almost nothing? Does it even help here?).

One could also consider writing some customized assembly (calling the
relevant instructions directly instead of going through the C wrappers)
in the speed-sensitive cases. It's sort of the inverse of what Ganesh is
suggesting, I guess; maybe some more effort is involved, but it can't be
that much. You could even use av_always_inline functions with inline
assembly to call the relevant instruction and otherwise keep things in C.
That gives the same effect as -ffast-math, but only where the new API
function is called explicitly.