[FFmpeg-devel] [PATCH] libavcodec/hevcdsp: port SIMD idct functions from 32-bit.
Josh Dekker
josh at itanimul.li
Tue Jan 12 14:24:07 EET 2021
Hi,
On 2021-01-08 21:36, Reimar.Doeffinger at gmx.de wrote:
> From: Reimar Döffinger <Reimar.Doeffinger at gmx.de>
>
> Makes SIMD-optimized 8x8 and 16x16 idcts for 8 and 10 bit depth
> available on aarch64.
> For a UHD HDR (10 bit) sample video these were consuming the most time
> and this optimization reduced overall decode time from 19.4s to 16.4s,
> approximately 15% speedup.
> Test sample was the first 300 frames of "LG 4K HDR Demo - New York.ts",
> running on Apple M1.
> ---
> libavcodec/aarch64/Makefile | 2 +
> libavcodec/aarch64/hevcdsp_idct_neon.S | 426 ++++++++++++++++++++++
> libavcodec/aarch64/hevcdsp_init_aarch64.c | 45 +++
> libavcodec/hevcdsp.c | 2 +
> libavcodec/hevcdsp.h | 1 +
> 5 files changed, 476 insertions(+)
> create mode 100644 libavcodec/aarch64/hevcdsp_idct_neon.S
> create mode 100644 libavcodec/aarch64/hevcdsp_init_aarch64.c
>
> [...]
AS libavcodec/aarch64/hevcdsp_idct_neon.o
libavcodec/aarch64/hevcdsp_idct_neon.S: Assembler messages:
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch --
`mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.16b, v28.16b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch --
`mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.16b, v28.16b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch --
`mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.16b, v28.16b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Error: operand mismatch --
`mov v29.4S,v28.4S'
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: did you mean this?
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.8b, v28.8b
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: other valid variant(s):
libavcodec/aarch64/hevcdsp_idct_neon.S:418: Info: mov v29.16b, v28.16b
This doesn't build with the GNU assembler (GNU Binutils for Ubuntu) 2.34
on aarch64. Thanks for porting this. I was in the process of writing HEVC
assembly (see my set on the ML) and would be interested in rebasing this
on top of that set.
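
The operand mismatch above comes from the arrangement specifier: as the
assembler's own hints show, GNU as accepts the vector mov alias only with
byte arrangements. A minimal sketch of a possible fix, assuming line 418 is
meant to copy the whole 128-bit register (untested, and the surrounding
code is not shown here):

```
        // Copy all 128 bits of v28 into v29. The vector mov alias
        // (orr with both sources equal) takes .8b or .16b in GNU as;
        // .16b covers the same width as the original .4s form.
        mov             v29.16b, v28.16b
```

The .8b variant that the assembler suggests first would copy only the low
64 bits, so it is probably not what is wanted here.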
--
Josh