[FFmpeg-devel] [PATCH 1/2] lavu/bswap: remove some inline assembler
Michael Niedermayer
michael at niedermayer.cc
Tue Jun 11 19:17:13 EEST 2024
On Tue, Jun 11, 2024 at 01:08:04PM -0300, James Almer wrote:
> On 6/11/2024 12:57 PM, Michael Niedermayer wrote:
> > On Tue, Jun 11, 2024 at 12:38:37PM -0300, James Almer wrote:
> > > On 6/11/2024 10:15 AM, Michael Niedermayer wrote:
> > > > On Fri, Jun 07, 2024 at 09:19:46PM +0300, Rémi Denis-Courmont wrote:
> > > > > C code or compiler built-ins are preferable over inline assembler for
> > > > > byte-swaps as it allows for better optimisations (e.g. instruction
> > > > > scheduling) which would otherwise be impossible.
> > > > >
> > > > > As with f64c2e710fa1a7b59753224e717f57c48462076f for x86 and Arm,
> > > > > this removes the inline assembler on GCC (and Clang) since we now
> > > > > require recent enough compiler versions (this indeed seems to work on
> > > > > AArch64).
> > > > > ---
> > > > > libavutil/aarch64/bswap.h | 56 ---------------------------------------
> > > > > libavutil/avr32/bswap.h | 44 ------------------------------
> > > > > libavutil/bswap.h | 8 +-----
> > > > > libavutil/sh4/bswap.h | 48 ---------------------------------
> > > >
> > > > As you are writing that this is preferable for better optimisations,
> > > > please provide benchmarks (for sh4, avr32)
> > >
> > > This is a ridiculous request, considering nobody has such hardware at all.
> >
> > Then I think it's a ridiculous claim that this optimizes the code
> >
> > I mean, at some point there was hardware and these optimisations did improve
> > speed.
> >
> > This patch is not removing the code because it's a rare (or dead) platform, it removes
> > it with the claim that this "allows for better optimisations".
> > I am sorry, but I do not see why asking for the claim in the commit message
> > to be backed up with facts is ridiculous.
> > The claim in the commit message may be ridiculous
>
> Compilers have come a long way since 20 years ago when this code was added.
> See https://godbolt.org/z/jPose4rj3, where new GCC generates the same code
> for sh4. And no inline assembly means instruction scheduling will take these
> functions into account.
Thanks for checking.
Please add a note to the commit message that this was checked for sh-4;
that resolves my concern about sh-4.
thx
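
For readers following the thread, here is a minimal sketch of the two
alternatives to inline assembler that the patch description refers to: a
byte swap written in plain C, and one using the GCC/Clang built-in
__builtin_bswap32. The function names (bswap32_c, bswap32_builtin,
bswap_selfcheck) are hypothetical and chosen for illustration only; this is
not the code in libavutil/bswap.h. Whether a given compiler and target turn
either form into a single swap instruction is best checked in the generated
code, as was done for sh4 in the godbolt link above.

```c
#include <assert.h>
#include <stdint.h>

/* Plain C 32-bit byte swap (hypothetical name, for illustration).
 * The compiler sees ordinary shifts and masks, so it is free to
 * pattern-match, inline and schedule them like any other C code. */
static inline uint32_t bswap32_c(uint32_t x)
{
    return (x >> 24) |
           ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) |
           (x << 24);
}

/* Same operation via the compiler built-in when available
 * (assumption: GCC or Clang), falling back to the C version otherwise. */
static inline uint32_t bswap32_builtin(uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(x);
#else
    return bswap32_c(x);
#endif
}

/* Small self-check: both variants must agree, and swapping twice
 * must give back the original value. */
static inline void bswap_selfcheck(void)
{
    assert(bswap32_c(0x11223344u) == 0x44332211u);
    assert(bswap32_builtin(0x11223344u) == bswap32_c(0x11223344u));
    assert(bswap32_c(bswap32_c(0xdeadbeefu)) == 0xdeadbeefu);
}
```

Unlike an asm statement, neither variant is opaque to the compiler, which is
what lets the instruction scheduler take these functions into account.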
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
If you drop bombs on a foreign country and kill a hundred thousand
innocent people, expect your government to call the consequence
"unprovoked inhuman terrorist attacks" and use it to justify dropping
more bombs and killing more people. The technology changed, the idea is old.