[FFmpeg-devel] [PATCH] sws/aarch64: add ff_hscale_8_to_15_neon

Ronald S. Bultje rsbultje at gmail.com
Thu Mar 24 14:35:01 CET 2016


Hi,

On Mar 24, 2016 8:28 AM, "Clément Bœsch" <u at pkh.me> wrote:
>
> From: Clément Bœsch <clement at stupeflix.com>
>
> ./ffmpeg -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null -
>
>     before: t:0.489726 avg:0.489883 max:0.491852 min:0.489482
>     after:  t:0.256515 avg:0.256458 max:0.256999 min:0.253755
> ---
> Changes:
> - FIX: not using the v8-v15 registers
> - writing directly from the SIMD register (thx Martin)
> - misc reordering
>
> I'm looking at the vscale part now.
> ---
>  libswscale/aarch64/Makefile   |  6 +++--
>  libswscale/aarch64/hscale.S   | 59 +++++++++++++++++++++++++++++++++++++++++++
>  libswscale/aarch64/swscale.c  | 37 +++++++++++++++++++++++++++
>  libswscale/swscale.c          |  2 ++
>  libswscale/swscale_internal.h |  1 +
>  libswscale/utils.c            |  4 ++-
>  6 files changed, 106 insertions(+), 3 deletions(-)
>  create mode 100644 libswscale/aarch64/hscale.S
>  create mode 100644 libswscale/aarch64/swscale.c

Do you intend to create special versions for specific filter widths (e.g.
x86 has special versions for filter_width=4 and 8)? On x86, that helped
speed up the default filters (bicubic) a little more.
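
To illustrate what a width-specific version buys, here is a minimal scalar
sketch in C. It is modeled on the usual 8-bit to 15-bit horizontal scaling
loop (multiply-accumulate, shift right by 7, clamp to 15 bits); the function
names and the simplified signatures are hypothetical and are neither the code
from this patch nor the x86 implementation. The specialized variant assumes
the filter size is exactly 4, so the inner loop disappears and the four taps
can map onto a single SIMD multiply-accumulate.

```c
#include <stdint.h>

/* Generic path: works for any filter_size, pays for an inner loop.
 * Hypothetical name and signature, for illustration only. */
static void hscale_8_to_15_generic(int16_t *dst, int dst_w,
                                   const uint8_t *src, const int16_t *filter,
                                   const int32_t *filter_pos, int filter_size)
{
    for (int i = 0; i < dst_w; i++) {
        const uint8_t *s = src + filter_pos[i];      /* first source pixel */
        const int16_t *f = filter + i * filter_size; /* taps for output i  */
        int val = 0;
        for (int j = 0; j < filter_size; j++)
            val += s[j] * f[j];
        val >>= 7;                                   /* 14-bit taps -> 15-bit output */
        dst[i] = val > 32767 ? 32767 : (int16_t)val; /* clamp positive overshoot */
    }
}

/* Specialized path: assumes filter_size == 4 (an assumption of this sketch),
 * so the inner loop is fully unrolled. */
static void hscale_8_to_15_x4(int16_t *dst, int dst_w,
                              const uint8_t *src, const int16_t *filter,
                              const int32_t *filter_pos)
{
    for (int i = 0; i < dst_w; i++) {
        const uint8_t *s = src + filter_pos[i];
        const int16_t *f = filter + i * 4;
        int val = s[0] * f[0] + s[1] * f[1] + s[2] * f[2] + s[3] * f[3];
        val >>= 7;
        dst[i] = val > 32767 ? 32767 : (int16_t)val;
    }
}
```

A dispatcher would pick the specialized function when the filter size matches
and fall back to the generic one otherwise; both must produce identical output
for the same inputs.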

This version looks OK already for the default case.

Ronald

