[FFmpeg-devel] [PATCH + RFC] Faster ff_celp_lp_synthesis_filterf() (and failed SSE SIMD version)

Michael Niedermayer michaelni
Mon Dec 14 01:49:06 CET 2009


On Sun, Dec 13, 2009 at 08:55:08PM +0100, Vitor Sessak wrote:
[...]
> +            old_out3 = old_out2;
> +            old_out2 = old_out1;
> +            old_out1 = old_out0;
> +            old_out0 = out[-i-1];
> +
> +            val = filter_coeffs[i];
> +
> +            out0 -= val * old_out0;
> +            out1 -= val * old_out1;
> +            out2 -= val * old_out2;
> +            out3 -= val * old_out3;

old_out3 = out[-i-1];

val = filter_coeffs[i];
out0 -= val * old_out3;
out1 -= val * old_out0;
out2 -= val * old_out1;
out3 -= val * old_out2;

Similarly, you can get rid of the other copies if you unroll the loop further.
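
To illustrate the idea, here is a minimal sketch (not the code from the patch). The function names, the acc[] array, and the first/len parameters are hypothetical and chosen only for this example. It assumes that every tap in the range reads only samples that precede the current block of four outputs (first >= 3), and that the number of taps handled is a multiple of 4. The first function shifts the window with copies as in the quoted hunk; the second unrolls four taps and rotates which variable receives the new sample, so no copies remain.

```c
#include <assert.h>

/* Hypothetical helper: window shifted with explicit copies, as in the quoted hunk.
 * Applies taps first..len-1 to four accumulators; out[-1], out[-2], ... are
 * previously computed output samples. */
static void taps_with_copies(float acc[4], const float *coeffs,
                             const float *out, int first, int len)
{
    assert(first >= 3);
    float old0 = out[-first], old1 = out[-first + 1], old2 = out[-first + 2], old3;

    for (int i = first; i < len; i++) {
        float val = coeffs[i];

        old3 = old2;            /* three register copies per tap */
        old2 = old1;
        old1 = old0;
        old0 = out[-i - 1];     /* one new history sample per tap */

        acc[0] -= val * old0;
        acc[1] -= val * old1;
        acc[2] -= val * old2;
        acc[3] -= val * old3;
    }
}

/* Hypothetical helper: same arithmetic, unrolled by four. Each step overwrites
 * the variable whose sample is no longer needed, so the roles rotate and are
 * back in their starting positions after four taps. */
static void taps_rotated(float acc[4], const float *coeffs,
                         const float *out, int first, int len)
{
    assert(first >= 3 && (len - first) % 4 == 0);
    float old0 = out[-first], old1 = out[-first + 1], old2 = out[-first + 2], old3;

    for (int i = first; i < len; i += 4) {
        float val;

        old3 = out[-i - 1];  val = coeffs[i];
        acc[0] -= val * old3; acc[1] -= val * old0;
        acc[2] -= val * old1; acc[3] -= val * old2;

        old2 = out[-i - 2];  val = coeffs[i + 1];
        acc[0] -= val * old2; acc[1] -= val * old3;
        acc[2] -= val * old0; acc[3] -= val * old1;

        old1 = out[-i - 3];  val = coeffs[i + 2];
        acc[0] -= val * old1; acc[1] -= val * old2;
        acc[2] -= val * old3; acc[3] -= val * old0;

        old0 = out[-i - 4];  val = coeffs[i + 3];
        acc[0] -= val * old0; acc[1] -= val * old1;
        acc[2] -= val * old2; acc[3] -= val * old3;
    }
}
```

Both functions perform the same multiply-subtract operations in the same order for each accumulator; only the bookkeeping differs. Whether the rotated form is actually faster depends on the compiler and the target, so it would need benchmarking.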

I didn't look at the SSE code.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In a rich man's house there is no place to spit but his face.
-- Diogenes of Sinope