[FFmpeg-devel] [PATCH] lavc/aarch64/fdct: add neon-optimized fdct for aarch64
Martin Storsjö
martin at martin.st
Wed Feb 14 11:42:45 EET 2024
Hi,
On Sun, 4 Feb 2024, Ramiro Polla wrote:
> The code is imported from libjpeg-turbo-3.0.1. The neon registers used
> have been changed to avoid modifying v8-v15.
> ---
I don't remember whether there are any extra steps we need to take when
importing foreign code under a different license. The license here seems
fine in any case, though.
This seems to work fine in all my test environments. And thanks for making
sure it doesn't use v8-v15!
I'm not so familiar with these DSP functions, or whether it is the norm to
add a new constant like FF_DCT_NEON, but it seems to match the pattern of
the existing code.
I presume the main case that tests this is "make fate-dct8x8", which
builds and executes libavcodec/tests/dct? How much work would it be to
integrate testing of these routines into checkasm? That way we could rest
assured that the assembly passes all such ABI checks that we do there,
including what registers must not be clobbered.
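To illustrate the output-comparison half of that idea, here is a minimal
standalone sketch, not actual checkasm code. It runs a reference and an
optimized routine on the same pseudo-random 8x8 blocks and counts how often
the results differ. The helper name count_fdct_mismatches and the stub_*
functions are hypothetical stand-ins (the stubs are not real DCTs); the
void (*)(int16_t *block) signature is assumed to match the fdct function
pointer. A plain C harness like this cannot detect clobbered registers such
as v8-v15; that part needs checkasm's own call wrappers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed signature: in-place transform of one 8x8 block of coefficients. */
typedef void (*fdct_fn)(int16_t *block);

/* Hypothetical helper: feed identical pseudo-random input to both
 * implementations and count the blocks whose outputs differ. */
static int count_fdct_mismatches(fdct_fn ref, fdct_fn opt,
                                 int iterations, unsigned seed)
{
    int16_t in[64], out_ref[64], out_opt[64];
    int mismatches = 0;

    assert(ref && opt && iterations >= 0);
    srand(seed);
    for (int i = 0; i < iterations; i++) {
        for (int j = 0; j < 64; j++)
            in[j] = (int16_t)(rand() % 512 - 256); /* range [-256, 255] */
        memcpy(out_ref, in, sizeof(in));
        memcpy(out_opt, in, sizeof(in));
        ref(out_ref);
        opt(out_opt);
        if (memcmp(out_ref, out_opt, sizeof(out_ref)) != 0)
            mismatches++;
    }
    return mismatches;
}

/* Stand-in "reference": doubles every coefficient (not a real DCT). */
static void stub_fdct_ref(int16_t *block)
{
    for (int i = 0; i < 64; i++)
        block[i] = (int16_t)(block[i] * 2);
}

/* Stand-in "optimized" version with the same result as the reference. */
static void stub_fdct_same(int16_t *block)
{
    for (int i = 0; i < 64; i++)
        block[i] = (int16_t)(block[i] + block[i]);
}

/* Stand-in "optimized" version with a deliberate bug in coefficient 0. */
static void stub_fdct_off_by_one(int16_t *block)
{
    stub_fdct_ref(block);
    block[0] = (int16_t)(block[0] + 1);
}
```

Calling count_fdct_mismatches(stub_fdct_ref, stub_fdct_same, 100, 1) should
report no mismatches, while passing stub_fdct_off_by_one instead flags every
block. In a real test the two pointers would be the C reference and the
NEON routine.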
The assembly uses a different indentation width than the rest of our
assembly. I recently spent some effort on cleaning that up so that our
code is mostly consistent, so I'd prefer not to add new code that deviates
from it. It primarily looks like you'd need to add 4 spaces at the start
of each line.
I've used a script for mostly automatically reindenting our arm assembly;
you can grab it at https://martin.st/temp/ffmpeg-asm-indent.pl and run it as
"cat file.S | ./ffmpeg-asm-indent.pl > tmp; mv tmp file.S". It's not 100%
accurate, but it mostly gets you there, so it's good to manually check the
result afterwards as well.
// Martin