[FFmpeg-devel] [PATCH 5/6] x86: lossless audio: SSE4 madd 32bits

James Almer jamrial at gmail.com
Wed Apr 20 01:42:46 CEST 2016


On 4/18/2016 6:25 PM, Christophe Gisquet wrote:
> 2016-04-18 21:18 GMT+02:00 Michael Niedermayer <michael at niedermayer.cc>:
>> > this breaks (only noise)
>> > [CCCP]_Mega_Weird_Audio_Test.mkv track 23
> Worthwhile sample.
> 
> I rewrote the patch to reduce code duplication, and I fixed the issue
> (misread a shift).
> 
> -- Christophe
> 
> 
> 0005-x86-lossless-audio-SSE4-madd-32bits.patch
> 
> 
> From a0d4a96c032d73bc0e34fec320497aefafba3c28 Mon Sep 17 00:00:00 2001
> From: Christophe Gisquet <christophe.gisquet at gmail.com>
> Date: Mon, 18 Apr 2016 13:20:07 +0200
> Subject: [PATCH 5/7] x86: lossless audio: SSE4 madd 32bits
> 
> The unique user so far is wmalossless 24bits. The few samples tested show an
> order of 8, so more unrolling or an avx2 version do not make sense.
> 
> Timings: 72 -> 49 cycles
> ---
>  libavcodec/x86/lossless_audiodsp.asm    | 31 +++++++++++++++++++++++++------
>  libavcodec/x86/lossless_audiodsp_init.c |  7 +++++++
>  2 files changed, 32 insertions(+), 6 deletions(-)
> 
> diff --git a/libavcodec/x86/lossless_audiodsp.asm b/libavcodec/x86/lossless_audiodsp.asm
> index 5597dad..d00869b 100644
> --- a/libavcodec/x86/lossless_audiodsp.asm
> +++ b/libavcodec/x86/lossless_audiodsp.asm
> @@ -22,13 +22,17 @@
>  
>  SECTION .text
>  
> -%macro SCALARPRODUCT 0
> +%macro SCALARPRODUCT 1
>  ; int ff_scalarproduct_and_madd_int16(int16_t *v1, int16_t *v2, int16_t *v3,
>  ;                                     int order, int mul)
> -cglobal scalarproduct_and_madd_int16, 4,4,8, v1, v2, v3, order, mul
> -    shl orderq, 1
> +; int ff_scalarproduct_and_madd_int32(int32_t *v1, int32_t *v2, int32_t *v3,
> +;                                     int order, int mul)
> +cglobal scalarproduct_and_madd_int %+ %1, 4,4,8, v1, v2, v3, order, mul
> +    shl orderq, (%1/16)

order is an int, so it would probably be better to use orderd here, to make
sure the upper half of the register is cleared on x86_64.
I wonder why this was never an issue until now, though.
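
To illustrate the concern, here is a rough C model of the two shift forms, not the actual asm. The helper names (shl_q, shl_d, order_shift) and the garbage value in the upper half are made up for the example. It assumes the caller left junk above bit 31 of the register holding order, which is allowed for a 32-bit int argument on x86_64.

```c
#include <stdint.h>

/* Hypothetical register contents: order = 8 in the low 32 bits,
 * made-up garbage in the upper 32 bits. */
#define DIRTY_ORDER_REG 0xDEADBEEF00000008ULL

/* Models the shift count in `shl orderq, (%1/16)`:
 * 16-bit samples -> shift by 1 (order * 2 bytes),
 * 32-bit samples -> shift by 2 (order * 4 bytes). */
static unsigned order_shift(unsigned sample_bits)
{
    return sample_bits / 16;
}

/* Models `shl orderq, n`: a 64-bit shift, so whatever sits in the
 * upper half of the register stays in the result. */
static uint64_t shl_q(uint64_t reg, unsigned n)
{
    return reg << n;
}

/* Models `shl orderd, n`: a 32-bit shift whose result is
 * zero-extended, so the upper half of the register is cleared. */
static uint64_t shl_d(uint64_t reg, unsigned n)
{
    return (uint64_t)(uint32_t)((uint32_t)reg << n);
}
```

With the dirty value above, shl_d() gives the expected byte count (16 for int16, 32 for int32), while shl_q() gives a huge offset unless the upper half happened to be zero already.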


