[FFmpeg-devel] [PATCH] lavu/common: Fix AV_CEIL_RSHIFT for unsigned LHS

Thu Nov 14 01:59:06 EET 2024

Hi Nuo Mi

On Sun, Nov 10, 2024 at 08:17:06PM +0800, Nuo Mi wrote:
> On Mon, Oct 21, 2024 at 2:30 AM Michael Niedermayer <michael at niedermayer.cc>
> wrote:
> 
> > On Fri, Oct 18, 2024 at 06:48:40PM +0100, Frank Plowman wrote:
> > > On 15/10/2024 21:23, Frank Plowman wrote:
> > > > On 14/10/2024 23:26, Michael Niedermayer wrote:
> > > >> On Sat, Oct 05, 2024 at 03:38:05PM -0700, Frank Plowman wrote:
> > > >>> The first branch of this ternary expression was intended to avoid
> > > >>> having two shift operations in the case the RHS is not known at
> > > >>> compile time.  It only works if the LHS has a signed type however,
> > > >>> otherwise the result is invalid.
> > > >>
> > > >> If the expression is faster, why not check if its signed ?
> >
> uint16_t is okfor the RHS because -uint16_t will be promoted to int, but
> uint32_t and uint64_t will not.
> 
> see "6.3.1.1 Boolean, characters, and integers" in the "ISO/IEC 9899"
> "If an int can represent all values of the original type (as restricted by
> the width, for a bit-field), the value is converted to an int;
> otherwise, it is converted to an unsigned int. These are called the integer
> promotions. 59) All other types are unchanged by the integer promotions."
> 
> Without the help of typeof() in C23, we cannot provide an effective way to
> detect uint32_t and uint64_t.
> Let us fix dvb clip issue with

> https://patchwork.ffmpeg.org/project/ffmpeg/patch/20241110121314.12540-1-nuomi2021@gmail.com/

patchwork seems under maintainance currently

> first.
> 
> Thank you.
> 
> > >>
> > > >> if its not faster, then the argument could be that its not
> > > >> faster and removes complexity
> > > >>
> > > >> thx
> > > >>
> > > >
> > > > In a quick microbenchmark (clang 16, AArch64), the bithack is 10%
> > faster
> > > > with -O0 but there is no significant difference with -O1 and up.
> > >
> > > It has been drawn to my attention that this is causing a failure for
> > > VVC_HDR_UHDTV1_OpenGOP_3840x2160_50fps_HLG10_mosaic from the DVB VVC
> > > test suite, due to the macro's use at
> > > libavcodec/cbs_h266_syntax_template.c:1206
> > >
> >
> > > In light of the lack of performance difference between the two
> > > implementations
> >
> > Thats a bit of a jump from
> > "In a quick microbenchmark (clang 16, AArch64), the bithack is 10% faster
> >  with -O0 but there is no significant difference with -O1 and up."
> > to a general "lack of performance difference"
> >
> > Also if there is a slowdown such slowdowns accumulate and multiple reports
> > suggest that FFmpeg has become slower over the years. We should not be
> > sloppy
> > with changes that could negatively affect speed. Noone ever said (s)he
> > liked
> > the slowdown FFmpeg had over the years
> >
> > Your test covers clang and aarch, your suggested changes covers all
> > compilers and all arhcitectures.
> > Its alot of work to test the main affected cases, which i agree is annoying
> >
> > But even if one ignores the question about speed loss. The change is
> > actually still introducing other issues
> >
> > -#define AV_CEIL_RSHIFT(a,b) (!av_builtin_constant_p(b) ? -((-(a)) >> (b))
> > \
> > -                                                       : ((a) + (1<<(b))
> > - 1) >> (b))
> > +#define AV_CEIL_RSHIFT(a,b) (((a) + (1<<(b)) - 1) >> (b))
> >
> > Here the argument was
> > -((-(a)) >> (b))
> > only works with signed numbers, ok true, even though iam not sure why
> > we would need to use this with unsigned numbers.
> >
> > ((a) + (1<<(b)) - 1) >> (b)
> > only works with ints
> > it fails with int64_t as (1<<(b)) can overflow
> > also it fails with many more cases
> > for example, something like this
> > AV_CEIL_RSHIFT(0x7FFFFFFF, 30)
> > will overflow (a) + (1<<(b)), this will be fine with -((-(a)) >> (b))
> >
> Overflow may be a separate issue; the caller must ensure there is no
> overflow.

The concern here is, that the overflow depends on the implementation
of AV_CEIL_RSHIFT(), its not the input or output to AV_CEIL_RSHIFT()
that overflows. Its the intermediate of one of the 2 possible
implementations which overflows

at the least that needs to be documented, if its not easy to make
the code work for all valid input/output

thx

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Nations do behave wisely once they have exhausted all other alternatives. 
-- Abba Eban
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20241114/9ff2cd8e/attachment.sig>