[FFmpeg-devel] [PATCH 6/6] lossless audio dsp: unroll

James Almer jamrial at gmail.com
Mon Apr 18 19:15:26 CEST 2016


On 4/18/2016 10:07 AM, Christophe Gisquet wrote:
> The loop counts are always a multiple of 8, so unrolling by two is
> safe, and it allows better use of the execution ports.
> 
> For int32 version: 72 -> 57c.

What compiler are you using, and what cpu at configure time?

We currently enable tree vectorization for gcc 4.9 or newer on x86,
and at least with gcc 5.3.0 on mingw-w64 the resulting code now seems worse.
I didn't benchmark it, but after this patch the code is no longer vectorized.
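
For anyone who wants to check this on their own toolchain, here is a minimal standalone sketch of the two loop shapes for the int16 case. The function names (madd_rolled, madd_unrolled, variants_agree) and the test data are hypothetical and not part of lossless_audiodsp.c; only the loop bodies are copied from the patch. The unrolled form assumes order is even and non-zero, which is the guarantee the commit message relies on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Loop shape before the patch: one element per iteration. */
static int32_t madd_rolled(int16_t *v1, const int16_t *v2,
                           const int16_t *v3, int order, int mul)
{
    int res = 0;

    while (order--) {
        res   += *v1 * *v2++;
        *v1++ += mul * *v3++;
    }
    return res;
}

/* Loop shape after the patch: two elements per iteration.
 * Only valid when order is even and greater than zero. */
static int32_t madd_unrolled(int16_t *v1, const int16_t *v2,
                             const int16_t *v3, int order, int mul)
{
    int res = 0;

    assert(order > 0 && order % 2 == 0);
    do {
        res   += *v1 * *v2++;
        *v1++ += mul * *v3++;
        res   += *v1 * *v2++;
        *v1++ += mul * *v3++;
    } while (order -= 2);
    return res;
}

/* Hypothetical helper: runs both variants on identical small inputs and
 * returns 1 if the return values and the updated v1 buffers match. */
static int variants_agree(int order, int mul)
{
    int16_t a[64], b[64], v2[64], v3[64];
    int i;

    assert(order > 0 && order <= 64);
    for (i = 0; i < order; i++) {
        a[i]  = b[i] = (int16_t)(i - 3);
        v2[i] = (int16_t)(2 * i + 1);
        v3[i] = (int16_t)(5 - i);
    }

    int32_t r1 = madd_rolled(a, v2, v3, order, mul);
    int32_t r2 = madd_unrolled(b, v2, v3, order, mul);

    return r1 == r2 && memcmp(a, b, (size_t)order * sizeof(a[0])) == 0;
}
```

Compiling a file like this with gcc at -O3 (which turns on -ftree-vectorize) and adding -fopt-info-vec or -fopt-info-vec-missed should make gcc report which of the two loops it vectorized, so the before/after behaviour can be compared without reading the disassembly. The sketch only checks that both forms compute the same thing; it says nothing about their speed.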

> ---
>  libavcodec/lossless_audiodsp.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/libavcodec/lossless_audiodsp.c b/libavcodec/lossless_audiodsp.c
> index 55495d0..17a61cd 100644
> --- a/libavcodec/lossless_audiodsp.c
> +++ b/libavcodec/lossless_audiodsp.c
> @@ -29,10 +29,12 @@ static int32_t scalarproduct_and_madd_int16_c(int16_t *v1, const int16_t *v2,
>  {
>      int res = 0;
>  
> -    while (order--) {
> +    do {
>          res   += *v1 * *v2++;
>          *v1++ += mul * *v3++;
> -    }
> +        res   += *v1 * *v2++;
> +        *v1++ += mul * *v3++;
> +    } while (order-=2);
>      return res;
>  }
>  
> @@ -42,10 +44,12 @@ static int32_t scalarproduct_and_madd_int32_c(int32_t *v1, const int32_t *v2,
>  {
>      int res = 0;
>  
> -    while (order--) {
> +    do {
> +        res   += *v1 * *v2++;
> +        *v1++ += mul * *v3++;
>          res   += *v1 * *v2++;
>          *v1++ += mul * *v3++;
> -    }
> +    } while (order-=2);
>      return res;
>  }
>  
> 


