[FFmpeg-devel] [PATCH] [WIP] swr: rewrite resample_common/linear_float_sse/avx in yasm.

James Almer jamrial at gmail.com
Sat Jun 21 02:06:43 CEST 2014


On 19/06/14 9:37 PM, Ronald S. Bultje wrote:
> DO NOT MERGE. Speed not tested, avx not yet tested.
> ---
>  configure                            |   3 +-
>  libswresample/resample_template.c    |  12 +-
>  libswresample/x86/Makefile           |   1 +
>  libswresample/x86/resample.asm       | 327 +++++++++++++++++++++++++++++++++++
>  libswresample/x86/resample_mmx.h     | 118 -------------
>  libswresample/x86/resample_x86_dsp.c |  34 ++--
>  6 files changed, 346 insertions(+), 149 deletions(-)
>  create mode 100644 libswresample/x86/resample.asm

[...]

> +.inner_loop:
> +    movu              m1, [srcptrq+filter_lenq*4]
> +    mulps             m1, [filterq+filter_lenq*4]
> +    addps             m0, m1
> +    add      filter_lenq, mmsize/4
> +    js .inner_loop
> +
> +%if cpuflag(avx)
> +    vextractf128     xm1, m0, 0x1
> +    addps            xm0, xm1
> +%endif
> +
> +    ; horizontal sum
> +    movhlps          xm1, xm0
> +    addps            xm0, xm1
> +    movss            xm1, xm0
> +    shufps           xm0, xm0, q0001

You can do shufps xm1, xm0, xm0, q0001 here and remove the movss.
The same applies to the linear variant.
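
For illustration, here is a rough C sketch of the horizontal sum with the
movss folded into a three-operand shuffle. It is not part of the patch. The
helper name hsum4 is made up, the final scalar add is assumed (the quoted
hunk is cut off before it), and _MM_SHUFFLE(0, 0, 0, 1) is taken to be the
intrinsic equivalent of q0001. The scalar fallback only exists so the sketch
builds on non-SSE targets.

```c
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper: sum the four floats at v[0..3]. */
static float hsum4(const float *v)
{
#if defined(__SSE__)
    __m128 acc = _mm_loadu_ps(v);          /* unaligned load of 4 floats */
    __m128 hi  = _mm_movehl_ps(acc, acc);  /* like movhlps xm1, xm0 */
    acc = _mm_add_ps(acc, hi);             /* lane 0 = v0+v2, lane 1 = v1+v3 */

    /* Like shufps xm1, xm0, xm0, q0001: lane 0 of the result is lane 1 of
     * acc, written to a separate register, so no prior copy is needed. */
    __m128 lane1 = _mm_shuffle_ps(acc, acc, _MM_SHUFFLE(0, 0, 0, 1));

    acc = _mm_add_ss(acc, lane1);          /* assumed final scalar add */
    return _mm_cvtss_f32(acc);
#else
    /* Scalar fallback with the same summation order. */
    return (v[0] + v[2]) + (v[1] + v[3]);
#endif
}
```

With input {1, 2, 3, 4} the pairwise adds give 4 and 6, and the final add
gives 10. On non-AVX targets the three-operand form still needs a register
copy somewhere, so the gain there is mostly in readability.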

