[FFmpeg-devel] [PATCHv4] avutil/common: add av_rint64_clip

Ronald S. Bultje rsbultje at gmail.com
Sat Nov 14 01:17:38 CET 2015


Hi,

On Fri, Nov 13, 2015 at 6:16 PM, Ganesh Ajjanagadde <gajjanag at mit.edu>
wrote:

> On Fri, Nov 13, 2015 at 4:52 PM, Ronald S. Bultje <rsbultje at gmail.com>
> wrote:
> > Hi,
> >
> > On Fri, Nov 13, 2015 at 4:28 PM, Ganesh Ajjanagadde <gajjanag at mit.edu>
> > wrote:
> >
> >> On Fri, Nov 13, 2015 at 1:28 PM, Ronald S. Bultje <rsbultje at gmail.com>
> >> wrote:
> >> > Hi,
> >> >
> >> > On Fri, Nov 13, 2015 at 12:17 PM, Ganesh Ajjanagadde <
> gajjanag at mit.edu>
> >> > wrote:
> >> >
> >> >> On Fri, Nov 13, 2015 at 12:10 PM, Ronald S. Bultje <
> rsbultje at gmail.com>
> >> >> wrote:
> >> >> > Hi Ganesh,
> >> >> > On Nov 13, 2015 12:02 PM, "Ganesh Ajjanagadde" <
> >> gajjanagadde at gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> The rationale for this function is reflected in the documentation
> for
> >> >> >> it, and is copied here:
> >> >> >>
> >> >> >> Clip a double value into the long long amin-amax range.
> >> >> >> This function is needed because conversion of floating point to
> >> integers
> >> >> > when
> >> >> >> it does not fit in the integer's representation does not
> necessarily
> >> >> > saturate
> >> >> >> correctly (usually converted to a cvttsd2si on x86) which
> saturates
> >> >> > numbers
> >> >> >> > INT64_MAX to INT64_MIN. The standard marks such conversions as
> >> >> undefined
> >> >> >> behavior, allowing this sort of mathematically bogus conversions.
> >> This
> >> >> > provides
> >> >> >> a safe alternative that is slower obviously but assures safety and
> >> >> better
> >> >> >> mathematical behavior.
> >> >> >> API:
> >> >> >> @param a value to clip
> >> >> >> @param amin minimum value of the clip range
> >> >> >> @param amax maximum value of the clip range
> >> >> >> @return clipped value
> >> >> >>
> >> >> >> Note that a priori if one can guarantee from the calling side that
> >> the
> >> >> >> double is in range, it is safe to simply do an explicit/implicit
> >> cast,
> >> >> >> and that will be far faster. However, otherwise this function
> should
> >> be
> >> >> >> used.
> >> >> >>
> >> >> >> avutil minor version is bumped.
> >> >> >>
> >> >> >> Reviewed-by: Ronald S. Bultje <rsbultje at gmail.com>
> >> >> >> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
> >> >> >> ---
> >> >> >>  libavutil/common.h  | 30 ++++++++++++++++++++++++++++++
> >> >> >>  libavutil/version.h |  2 +-
> >> >> >>  2 files changed, 31 insertions(+), 1 deletion(-)
> >> >> >>
> >> >> >> diff --git a/libavutil/common.h b/libavutil/common.h
> >> >> >> index 6f0f582..f4687ab 100644
> >> >> >> --- a/libavutil/common.h
> >> >> >> +++ b/libavutil/common.h
> >> >> >> @@ -298,6 +298,33 @@ static av_always_inline av_const double
> >> >> > av_clipd_c(double a, double amin, double
> >> >> >>      else               return a;
> >> >> >>  }
> >> >> >>
> >> >> >> +/**
> >> >> >> + * Clip and convert a double value into the long long amin-amax
> >> range.
> >> >> >> + * This function is needed because conversion of floating point
> to
> >> >> > integers when
> >> >> >> + * it does not fit in the integer's representation does not
> >> necessarily
> >> >> > saturate
> >> >> >> + * correctly (usually converted to a cvttsd2si on x86) which
> >> saturates
> >> >> > numbers
> >> >> >> + * > INT64_MAX to INT64_MIN. The standard marks such conversions
> as
> >> >> > undefined
> >> >> >> + * behavior, allowing this sort of mathematically bogus
> conversions.
> >> >> > This provides
> >> >> >> + * a safe alternative that is slower obviously but assures safety
> >> and
> >> >> > better
> >> >> >> + * mathematical behavior.
> >> >> >> + * @param a value to clip
> >> >> >> + * @param amin minimum value of the clip range
> >> >> >> + * @param amax maximum value of the clip range
> >> >> >> + * @return clipped value
> >> >> >> + */
> >> >> >> +static av_always_inline av_const int64_t av_rint64_clip_c(double
> a,
> >> >> > int64_t amin, int64_t amax)
> >> >> >> +{
> >> >> >> +#if defined(HAVE_AV_CONFIG_H) && defined(ASSERT_LEVEL) &&
> >> ASSERT_LEVEL
> >> >> >>= 2
> >> >> >> +    if (amin > amax) abort();
> >> >> >> +#endif
> >> >> >> +    // INT64_MAX+1,INT64_MIN are exactly representable as IEEE
> >> doubles
> >> >> >> +    if (a >=  9223372036854775808.0 || llrint(a) >= amax)
> >> >> >> +        return amax;
> >> >> >> +    if (a <= -9223372036854775808.0 || llrint(a) <= amin)
> >> >> >> +        return amin;
> >> >> >
> >> >> > Doesn't this allow negative overflows in the max check? I think you
> >> need
> >> >> > both overflow checks before the min/max checks. Try the next float
> val
> >> >> > smaller than int64_min as input with a small amax, eg 0. I bet it
> >> >> returns 0
> >> >> > instead of amin.
> >> >>
> >> >> They are needed. As you and others can clearly see, I am quite bad
> >> >> with this stuff.
> >> >
> >> >
> >> > Hm, so, getting back to my computer, I wanted to test this, and I have
> >> this
> >> > problem: llrint() works correctly for me for the "undefined" cases,
> i.e.,
> >> > it already does what you're trying to fix in av_rint64_clip_c.
> >> >
> >> > llrint(-10223372056756029440.000000) returns -9223372036854775808
> >> > llrint(10223372056756029440.000000) returns 9223372036854775807
> >> >
> >> > So, how do I reproduce that llrint() overflows?
> >>
> >> The link I gave originally
> >>
> http://blog.frama-c.com/index.php?post/2013/10/09/Overflow-float-integer
> >> gives an illustration. Maybe the weird behavior happens only on
> >> 9223372036854775808.0. This happens because INT64_MAX+1 is not
> >> representable in long long, and hence signed overflow occurs yielding
> >> INT64_MIN (of course undefined). Here is a minimal test program:
> >> #include <stdio.h>
> >> #include <math.h>
> >>
> >> int main(void) {
> >>     printf("%lld\n", llrint(9223372036854775808.0));
> >>     return 0;
> >> }
> >>
> >
> > 9223372036854775807
> >
> > Seems apple's libc got one thing right :)
>
> I personally am not that charitable, looking more carefully at your
> asm shows a cmplesd, suggesting slowdown. Here is a source reference:
> https://opensource.apple.com/source/Libm/Libm-2026/Source/ARM/llrint.c.
> As usual, Apple dumps many implementations of llrint and it is unclear
> which is actually being used on OS X at the moment (see e.g other
> https://opensource.apple.com/source/Libm/Libm-92/i386.subproj/llrint.c),
> but I digress.
>
> They essentially all put special case code like the patch above. Thus
> their function is inherently slower than the conformant GNU libm
> implementation. A client may very well want a branch free llrint for
> speed. Apple offers no performance choice here, forcing a fast llrint
> to use cvt2dsi inline or equivalent. Don't know if FFmpeg is affected
> by this slowdown.


I think FFmpeg should consider using Apple's version as a x86
implementation for av_rint64_clip :)

Ronald


More information about the ffmpeg-devel mailing list