[FFmpeg-devel] [PATCH 1/9] SBR DSP x86: implement SSE qmf_pre_shuffle

Michael Niedermayer michaelni at gmx.at
Fri Apr 5 15:35:06 CEST 2013


On Thu, Apr 04, 2013 at 07:45:45PM +0000, Christophe Gisquet wrote:
> From 253 to 70c on Arrandale and Win64.
> ---
>  libavcodec/x86/sbrdsp.asm    | 33 +++++++++++++++++++++++++++++++++
>  libavcodec/x86/sbrdsp_init.c |  2 ++
>  2 files changed, 35 insertions(+)
> 
> diff --git a/libavcodec/x86/sbrdsp.asm b/libavcodec/x86/sbrdsp.asm
> index 1b7f3a8..2029b45 100644
> --- a/libavcodec/x86/sbrdsp.asm
> +++ b/libavcodec/x86/sbrdsp.asm
> @@ -220,3 +220,36 @@ cglobal sbr_qmf_post_shuffle, 2,3,4,W,z
>      cmp               zq, r2q
>      jl             .loop
>      REP_RET
> +
> +INIT_XMM sse
> +cglobal sbr_qmf_pre_shuffle, 1,4,7,z
> +%define OFFSET  (32*4-2*mmsize)
> +    mov       r3q, OFFSET
> +    lea       r1q, [zq + (32+1)*4]
> +    lea       r2q, [zq + 64*4]
> +    mova       m6, [ps_neg]
> +.loop:
> +    movu       m0, [r1q]
> +    movu       m2, [r1q + mmsize]
> +    movu       m1, [zq + r3q + 4 + mmsize]
> +    movu       m3, [zq + r3q + 4]
> +    xorps      m2, m6
> +    xorps      m0, m6

> +    shufps     m2, m2, q0123
> +    shufps     m0, m0, q0123

With pshufd instead of shufps, the code goes from 61 to 55 cycles on
Sandy Bridge.
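
For reference, the substitution is possible because both shufps operands
in the quoted lines are the same register. Below is a minimal scalar
sketch in C of the lane selection performed by the two instructions. It
is an illustration only, not code from the patch: model_shufps,
model_pshufd, shuffles_agree and the Q0123 constant are hypothetical
names, with Q0123 assumed to be the 0x1B immediate that x86inc's q0123
expands to.

```c
#include <stdint.h>
#include <string.h>

/* Assumed value of x86inc's q0123: (0<<6)|(1<<4)|(2<<2)|3. */
#define Q0123 0x1B

/* Scalar model of "shufps dst, src, imm8": the two low result lanes are
 * picked from dst, the two high result lanes from src. */
static void model_shufps(float dst[4], const float src[4], uint8_t imm)
{
    float a[4], b[4];
    memcpy(a, dst, sizeof(a));   /* dst is overwritten below */
    memcpy(b, src, sizeof(b));   /* src may alias dst */
    dst[0] = a[(imm >> 0) & 3];
    dst[1] = a[(imm >> 2) & 3];
    dst[2] = b[(imm >> 4) & 3];
    dst[3] = b[(imm >> 6) & 3];
}

/* Scalar model of "pshufd dst, src, imm8": all four result lanes are
 * picked from src. Floats stand in for plain 32-bit lanes here. */
static void model_pshufd(float dst[4], const float src[4], uint8_t imm)
{
    float b[4];
    memcpy(b, src, sizeof(b));   /* src may alias dst */
    for (int i = 0; i < 4; i++)
        dst[i] = b[(imm >> (2 * i)) & 3];
}

/* Returns 1 if both models give the same lanes for every immediate
 * when the destination and source are the same register. */
static int shuffles_agree(void)
{
    for (int imm = 0; imm < 256; imm++) {
        float x[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
        float y[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
        model_shufps(x, x, (uint8_t)imm);
        model_pshufd(y, y, (uint8_t)imm);
        if (memcmp(x, y, sizeof(x)) != 0)
            return 0;
    }
    return 1;
}
```

With the Q0123 immediate, both models reverse the four lanes. Note that
pshufd is an SSE2 instruction, whereas the function in the patch is
declared under INIT_XMM sse.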

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

It's not that you shouldn't use gotos, but rather that you should write
readable code, and code with gotos often, but not always, is less readable.