[FFmpeg-devel] [PATCH] aacenc_utils: unroll loops to allow compiler to use SIMD.

James Almer jamrial at gmail.com
Sun Mar 6 19:49:00 CET 2016


On 3/6/2016 3:35 PM, Reimar Döffinger wrote:
> Approximately 10% faster transcode from mp3 to aac
> with default settings.
> 
> Signed-off-by: Reimar Döffinger <Reimar.Doeffinger at gmx.de>
> ---
>  libavcodec/aacenc_utils.h | 47 ++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 38 insertions(+), 9 deletions(-)
> 
> diff --git a/libavcodec/aacenc_utils.h b/libavcodec/aacenc_utils.h
> index b9bd6bf..1639021 100644
> --- a/libavcodec/aacenc_utils.h
> +++ b/libavcodec/aacenc_utils.h
> @@ -36,15 +36,29 @@
>  #define ROUND_TO_ZERO 0.1054f
>  #define C_QUANT 0.4054f
>  
> +#define ABSPOW(inv, outv) \
> +do { \
> +    float a = (inv); \
> +    a = fabsf(a); \
> +    (outv) = sqrtf(a * sqrtf(a)); \
> +} while(0)
> +
>  static inline void abs_pow34_v(float *out, const float *in, const int size)
>  {
>      int i;
> -    for (i = 0; i < size; i++) {
> -        float a = fabsf(in[i]);
> -        out[i] = sqrtf(a * sqrtf(a));
> +    for (i = 0; i < size - 3; i += 4) {
> +        ABSPOW(in[i], out[i]);
> +        ABSPOW(in[i+1], out[i+1]);
> +        ABSPOW(in[i+2], out[i+2]);
> +        ABSPOW(in[i+3], out[i+3]);
> +    }

Are you sure this wasn't vectorized already? I remember checking, and it mostly
was, at least with gcc 5.3 mingw-w64 at default settings.


