[FFmpeg-devel] [PATCH] swscale: aarch64: Fix yuv2rgb with negative strides

Martin Storsjö martin at martin.st
Thu Oct 27 13:18:26 EEST 2022


On Tue, 25 Oct 2022, Martin Storsjö wrote:

> Treat the 32 bit stride registers as signed.
>
> Alternatively, we could make the stride arguments ptrdiff_t instead
> of int, and change all of the assembly to operate on these
> registers with their full 64 bit width, but that's a much larger
> and more intrusive change (and risks missing some operation, which
> would still truncate the intermediates to 32 bit).
>
> Fixes: https://trac.ffmpeg.org/ticket/9985
>
> Signed-off-by: Martin Storsjö <martin at martin.st>
> ---
> libswscale/aarch64/yuv2rgb_neon.S | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/libswscale/aarch64/yuv2rgb_neon.S b/libswscale/aarch64/yuv2rgb_neon.S
> index f4b220fb60..f341268c5d 100644
> --- a/libswscale/aarch64/yuv2rgb_neon.S
> +++ b/libswscale/aarch64/yuv2rgb_neon.S
> @@ -118,8 +118,8 @@
> .endm
>
> .macro increment_yuv422p
> -    add                 x6,  x6,  w7, UXTW                              // srcU += incU
> -    add                 x13, x13, w14, UXTW                             // srcV += incV
> +    add                 x6,  x6,  w7, SXTW                              // srcU += incU
> +    add                 x13, x13, w14, SXTW                             // srcV += incV
> .endm
>
> .macro compute_rgba r1 g1 b1 a1 r2 g2 b2 a2
> @@ -189,8 +189,8 @@ function ff_\ifmt\()_to_\ofmt\()_neon, export=1
>     st4                 {v16.8B,v17.8B,v18.8B,v19.8B}, [x2], #32
>     subs                w8, w8, #16                                     // width -= 16
>     b.gt                2b
> -    add                 x2, x2, w3, UXTW                                // dst  += padding
> -    add                 x4, x4, w5, UXTW                                // srcY += paddingY
> +    add                 x2, x2, w3, SXTW                                // dst  += padding
> +    add                 x4, x4, w5, SXTW                                // srcY += paddingY
>     increment_\ifmt
>     subs                w1, w1, #1                                      // height -= 1
>     b.gt                1b
> -- 
> 2.37.0 (Apple Git-136)
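
For readers unfamiliar with the extended-register forms of add: the
difference between UXTW and SXTW can be modelled in C roughly as below.
This is only a sketch of the semantics, not code from libswscale; the
helper names add_uxtw and add_sxtw are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper modelling "add xD, xN, wM, UXTW":
 * the 32-bit stride is zero-extended to 64 bits before the add,
 * so a negative stride turns into a large positive offset. */
static uint64_t add_uxtw(uint64_t base, int32_t stride)
{
    return base + (uint64_t)(uint32_t)stride;
}

/* Hypothetical helper modelling "add xD, xN, wM, SXTW":
 * the 32-bit stride is sign-extended to 64 bits before the add,
 * so a negative stride moves the pointer backwards as intended. */
static uint64_t add_sxtw(uint64_t base, int32_t stride)
{
    return base + (uint64_t)(int64_t)stride;
}
```

With an assumed base of 0x10000 and a stride of -64, add_uxtw() gives
0x10000ffc0 (the pointer jumps almost 4 GB forward), while add_sxtw()
gives 0xffc0, i.e. 64 bytes before the base.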

Will push later today, and backport to some older branches where relevant 
(a bit later).

// Martin

