[FFmpeg-devel] [PATCHv4] avutil/common: add av_rint64_clip

Ganesh Ajjanagadde gajjanag at mit.edu
Sat Nov 14 22:38:08 CET 2015


On Sat, Nov 14, 2015 at 4:30 PM, Hendrik Leppkes <h.leppkes at gmail.com> wrote:
> On Sat, Nov 14, 2015 at 10:27 PM, Ganesh Ajjanagadde <gajjanag at mit.edu> wrote:
>> On Sat, Nov 14, 2015 at 4:03 PM, Ganesh Ajjanagadde <gajjanag at mit.edu> wrote:
>>> On Sat, Nov 14, 2015 at 3:28 PM, Hendrik Leppkes <h.leppkes at gmail.com> wrote:
>>>> On Sat, Nov 14, 2015 at 3:51 AM, Ganesh Ajjanagadde <gajjanag at mit.edu> wrote:
>>>>> On Fri, Nov 13, 2015 at 7:17 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Fri, Nov 13, 2015 at 6:16 PM, Ganesh Ajjanagadde <gajjanag at mit.edu>
>>>>>> wrote:
>>>>>>
>>>>>>> On Fri, Nov 13, 2015 at 4:52 PM, Ronald S. Bultje <rsbultje at gmail.com>
>>>>>>> wrote:
>>>>>>> > Hi,
>>>>>>> >
>>>>>>> > On Fri, Nov 13, 2015 at 4:28 PM, Ganesh Ajjanagadde <gajjanag at mit.edu>
>>>>>>> > wrote:
>>>>>>> >
>>>>>>> >> On Fri, Nov 13, 2015 at 1:28 PM, Ronald S. Bultje <rsbultje at gmail.com>
>>>>>>> >> wrote:
>>>>>>> >> > Hi,
>>>>>>> >> >
>>>>>>> >> > On Fri, Nov 13, 2015 at 12:17 PM, Ganesh Ajjanagadde <
>>>>>>> gajjanag at mit.edu>
>>>>>>> >> > wrote:
>>>>>>> >> >
>>>>>>> >> >> On Fri, Nov 13, 2015 at 12:10 PM, Ronald S. Bultje <
>>>>>>> rsbultje at gmail.com>
>>>>>>> >> >> wrote:
>>>>>>> >> >> > Hi Ganesh,
>>>>>>> >> >> > On Nov 13, 2015 12:02 PM, "Ganesh Ajjanagadde" <
>>>>>>> >> gajjanagadde at gmail.com>
>>>>>>> >> >> > wrote:
>>>>>>> >> >> >>
>>>>>>> >> >> >> The rationale for this function is reflected in the documentation
>>>>>>> for
>>>>>>> >> >> >> it, and is copied here:
>>>>>>> >> >> >>
>>>>>>> >> >> >> Clip a double value into the long long amin-amax range.
>>>>>>> >> >> >> This function is needed because conversion of floating point to
>>>>>>> >> integers
>>>>>>> >> >> > when
>>>>>>> >> >> >> it does not fit in the integer's representation does not
>>>>>>> necessarily
>>>>>>> >> >> > saturate
>>>>>>> >> >> >> correctly (usually converted to a cvttsd2si on x86) which
>>>>>>> saturates
>>>>>>> >> >> > numbers
>>>>>>> >> >> >> > INT64_MAX to INT64_MIN. The standard marks such conversions as
>>>>>>> >> >> undefined
>>>>>>> >> >> >> behavior, allowing this sort of mathematically bogus conversions.
>>>>>>> >> This
>>>>>>> >> >> > provides
>>>>>>> >> >> >> a safe alternative that is slower obviously but assures safety and
>>>>>>> >> >> better
>>>>>>> >> >> >> mathematical behavior.
>>>>>>> >> >> >> API:
>>>>>>> >> >> >> @param a value to clip
>>>>>>> >> >> >> @param amin minimum value of the clip range
>>>>>>> >> >> >> @param amax maximum value of the clip range
>>>>>>> >> >> >> @return clipped value
>>>>>>> >> >> >>
>>>>>>> >> >> >> Note that a priori if one can guarantee from the calling side that
>>>>>>> >> the
>>>>>>> >> >> >> double is in range, it is safe to simply do an explicit/implicit
>>>>>>> >> cast,
>>>>>>> >> >> >> and that will be far faster. However, otherwise this function
>>>>>>> should
>>>>>>> >> be
>>>>>>> >> >> >> used.
>>>>>>> >> >> >>
>>>>>>> >> >> >> avutil minor version is bumped.
>>>>>>> >> >> >>
>>>>>>> >> >> >> Reviewed-by: Ronald S. Bultje <rsbultje at gmail.com>
>>>>>>> >> >> >> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
>>>>>>> >> >> >> ---
>>>>>>> >> >> >>  libavutil/common.h  | 30 ++++++++++++++++++++++++++++++
>>>>>>> >> >> >>  libavutil/version.h |  2 +-
>>>>>>> >> >> >>  2 files changed, 31 insertions(+), 1 deletion(-)
>>>>>>> >> >> >>
>>>>>>> >> >> >> diff --git a/libavutil/common.h b/libavutil/common.h
>>>>>>> >> >> >> index 6f0f582..f4687ab 100644
>>>>>>> >> >> >> --- a/libavutil/common.h
>>>>>>> >> >> >> +++ b/libavutil/common.h
>>>>>>> >> >> >> @@ -298,6 +298,33 @@ static av_always_inline av_const double
>>>>>>> >> >> > av_clipd_c(double a, double amin, double
>>>>>>> >> >> >>      else               return a;
>>>>>>> >> >> >>  }
>>>>>>> >> >> >>
>>>>>>> >> >> >> +/**
>>>>>>> >> >> >> + * Clip and convert a double value into the long long amin-amax
>>>>>>> >> range.
>>>>>>> >> >> >> + * This function is needed because conversion of floating point
>>>>>>> to
>>>>>>> >> >> > integers when
>>>>>>> >> >> >> + * it does not fit in the integer's representation does not
>>>>>>> >> necessarily
>>>>>>> >> >> > saturate
>>>>>>> >> >> >> + * correctly (usually converted to a cvttsd2si on x86) which
>>>>>>> >> saturates
>>>>>>> >> >> > numbers
>>>>>>> >> >> >> + * > INT64_MAX to INT64_MIN. The standard marks such conversions
>>>>>>> as
>>>>>>> >> >> > undefined
>>>>>>> >> >> >> + * behavior, allowing this sort of mathematically bogus
>>>>>>> conversions.
>>>>>>> >> >> > This provides
>>>>>>> >> >> >> + * a safe alternative that is slower obviously but assures safety
>>>>>>> >> and
>>>>>>> >> >> > better
>>>>>>> >> >> >> + * mathematical behavior.
>>>>>>> >> >> >> + * @param a value to clip
>>>>>>> >> >> >> + * @param amin minimum value of the clip range
>>>>>>> >> >> >> + * @param amax maximum value of the clip range
>>>>>>> >> >> >> + * @return clipped value
>>>>>>> >> >> >> + */
>>>>>>> >> >> >> +static av_always_inline av_const int64_t av_rint64_clip_c(double
>>>>>>> a,
>>>>>>> >> >> > int64_t amin, int64_t amax)
>>>>>>> >> >> >> +{
>>>>>>> >> >> >> +#if defined(HAVE_AV_CONFIG_H) && defined(ASSERT_LEVEL) &&
>>>>>>> >> ASSERT_LEVEL
>>>>>>> >> >> >>= 2
>>>>>>> >> >> >> +    if (amin > amax) abort();
>>>>>>> >> >> >> +#endif
>>>>>>> >> >> >> +    // INT64_MAX+1,INT64_MIN are exactly representable as IEEE
>>>>>>> >> doubles
>>>>>>> >> >> >> +    if (a >=  9223372036854775808.0 || llrint(a) >= amax)
>>>>>>> >> >> >> +        return amax;
>>>>>>> >> >> >> +    if (a <= -9223372036854775808.0 || llrint(a) <= amin)
>>>>>>> >> >> >> +        return amin;
>>>>>>> >> >> >
>>>>>>> >> >> > Doesn't this allow negative overflows in the max check? I think you
>>>>>>> >> need
>>>>>>> >> >> > both overflow checks before the min/max checks. Try the next float
>>>>>>> val
>>>>>>> >> >> > smaller than int64_min as input with a small amax, eg 0. I bet it
>>>>>>> >> >> returns 0
>>>>>>> >> >> > instead of amin.
>>>>>>> >> >>
>>>>>>> >> >> They are needed. As you and others can clearly see, I am quite bad
>>>>>>> >> >> with this stuff.
>>>>>>> >> >
>>>>>>> >> >
>>>>>>> >> > Hm, so, getting back to my computer, I wanted to test this, and I have
>>>>>>> >> this
>>>>>>> >> > problem: llrint() works correctly for me for the "undefined" cases,
>>>>>>> i.e.,
>>>>>>> >> > it already does what you're trying to fix in av_rint64_clip_c.
>>>>>>> >> >
>>>>>>> >> > llrint(-10223372056756029440.000000) returns -9223372036854775808
>>>>>>> >> > llrint(10223372056756029440.000000) returns 9223372036854775807
>>>>>>> >> >
>>>>>>> >> > So, how do I reproduce that llrint() overflows?
>>>>>>> >>
>>>>>>> >> The link I gave originally
>>>>>>> >>
>>>>>>> http://blog.frama-c.com/index.php?post/2013/10/09/Overflow-float-integer
>>>>>>> >> gives an illustration. Maybe the weird behavior happens only on
>>>>>>> >> 9223372036854775808.0. This happens because INT64_MAX+1 is not
>>>>>>> >> representable in long long, and hence signed overflow occurs yielding
>>>>>>> >> INT64_MIN (of course undefined). Here is a minimal test program:
>>>>>>> >> #include <stdio.h>
>>>>>>> >> #include <math.h>
>>>>>>> >>
>>>>>>> >> int main(void) {
>>>>>>> >>     printf("%lld\n", llrint(9223372036854775808.0));
>>>>>>> >>     return 0;
>>>>>>> >> }
>>>>>>> >>
>>>>>>> >
>>>>>>> > 9223372036854775807
>>>>>>> >
>>>>>>> > Seems apple's libc got one thing right :)
>>>>>>>
>>>>>>> I personally am not that charitable, looking more carefully at your
>>>>>>> asm shows a cmplesd, suggesting slowdown. Here is a source reference:
>>>>>>> https://opensource.apple.com/source/Libm/Libm-2026/Source/ARM/llrint.c.
>>>>>>> As usual, Apple dumps many implementations of llrint and it is unclear
>>>>>>> which is actually being used on OS X at the moment (see e.g other
>>>>>>> https://opensource.apple.com/source/Libm/Libm-92/i386.subproj/llrint.c),
>>>>>>> but I digress.
>>>>>>>
>>>>>>> They essentially all put special case code like the patch above. Thus
>>>>>>> their function is inherently slower than the conformant GNU libm
>>>>>>> implementation. A client may very well want a branch free llrint for
>>>>>>> speed. Apple offers no performance choice here, forcing a fast llrint
>>>>>>> to use cvt2dsi inline or equivalent. Don't know if FFmpeg is affected
>>>>>>> by this slowdown.
>>>>>>
>>>>>>
>>>>>> I think FFmpeg should consider using Apple's version as a x86
>>>>>> implementation for av_rint64_clip :)
>>>>>
>>>>> I don't agree with this: it is a far less readable implementation with
>>>>> many more lines of code, and worse yet only handles the llrint aspect
>>>>> and not the clipping. Regardless, belongs to a separate patch/thread.
>>>>> Pushed. Thanks all for reviews.
>>>>>
>>>>
>>>> This change broke building on VS2012, llrint is apprently not available there.
>>>> Note that this is a public header, so our compat headers ala
>>>> avutil/libm.h cannot be included there.
>>>
>>> Hmm, so I could create a local avpriv_llrint with some ifdefry - e.g
>>> for GNU_C, use llrint, else use a slower implementation on the lines
>>> of Apple's.
>>> Any cleaner solutions?
>>
>> Possibly better idea: use floor(f + 0.5) as a hack (c89) on things
>> lacking llrint (via a HAVE_LLRINT check). This won't result in
>> identical output past a sufficiently large power of 2, but is still a
>> safe API. It is also clearer and smaller. Idea inspired by
>> avcodec/mpegaudio_tablegen.h (where this hack may be removed).
>>
>
> This code is in a public header, public headers don't have access to
> config.h, so no HAVE_* checks.
> You could make it non-inline, then you have all the freedom in the world.

I do understand this is breakage from my end, but just a request that
may help in resolving this:
@Hendrik or others with Windows: could you do the needful?
Rationale:
1. I lack Windows myself, so I can't test if my proposed solution even works.
2. As seen above, my understanding of these things is vague, and in
more experienced hands this will be resolved faster and more
efficiently.

>
> - Hendrik
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


More information about the ffmpeg-devel mailing list