[FFmpeg-devel] [aarch64] improve performance of ff_hscale_8_to_15_neon

Jean-Baptiste Kempf jb at videolan.org
Tue Nov 26 00:18:21 EET 2019


On Mon, Nov 25, 2019, at 22:59, Sebastian Pop wrote:
> This patch implements ff_hscale_8_to_15_neon with NEON fused multiply accumulate
> and bumps the vectorization factor from 2 to 4. I have seen speedups up to 15%
> on Graviton A1 instances based on A-72 cpus.

Why add a new version, in intrinsics, instead of changing the existing implementation?
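
For context, here is a minimal portable C sketch of what "bumping the vectorization factor from 2 to 4" could mean for this routine. It is not the patch under discussion and uses no NEON at all. It assumes the usual scalar semantics of an 8-bit to 15-bit horizontal scale (sum of products, shift right by 7, clamp to 15 bits). The names hscale_8_to_15_ref, hscale_8_to_15_x4 and clip15 are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: clamp to the 15-bit output range (upper bound only). */
static int16_t clip15(int32_t v)
{
    return (int16_t)(v > 0x7FFF ? 0x7FFF : v);
}

/* Reference form: one output sample per outer iteration. */
static void hscale_8_to_15_ref(int16_t *dst, int dstW, const uint8_t *src,
                               const int16_t *filter, const int32_t *filterPos,
                               int filterSize)
{
    for (int i = 0; i < dstW; i++) {
        int32_t val = 0;
        for (int j = 0; j < filterSize; j++)
            val += (int32_t)src[filterPos[i] + j] * filter[filterSize * i + j];
        dst[i] = clip15(val >> 7);
    }
}

/* Same computation with four independent accumulators per outer iteration.
 * In a SIMD version each accumulator would map to a multiply-accumulate
 * chain; here they are plain scalars (an assumption made for clarity). */
static void hscale_8_to_15_x4(int16_t *dst, int dstW, const uint8_t *src,
                              const int16_t *filter, const int32_t *filterPos,
                              int filterSize)
{
    int i = 0;
    for (; i + 4 <= dstW; i += 4) {
        int32_t acc[4] = {0, 0, 0, 0};
        for (int j = 0; j < filterSize; j++)
            for (int k = 0; k < 4; k++)
                acc[k] += (int32_t)src[filterPos[i + k] + j] *
                          filter[filterSize * (i + k) + j];
        for (int k = 0; k < 4; k++)
            dst[i + k] = clip15(acc[k] >> 7);
    }
    /* Tail: remaining outputs when dstW is not a multiple of 4. */
    for (; i < dstW; i++) {
        int32_t val = 0;
        for (int j = 0; j < filterSize; j++)
            val += (int32_t)src[filterPos[i] + j] * filter[filterSize * i + j];
        dst[i] = clip15(val >> 7);
    }
}
```

Both functions should produce identical output for the same inputs; only the loop structure differs, which is the part a change to the existing implementation would also have to touch.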


Jean-Baptiste Kempf - President
+33 672 704 734
