[FFmpeg-devel] [PATCH] Add x86-optimized function ac3_or_abs_int16() and use in log2_tab().

Justin Ruggles justin.ruggles
Sat Feb 12 02:10:15 CET 2011


On 02/11/2011 07:55 PM, Justin Ruggles wrote:

> +%macro PABSW2_MMX 6 ; dst1, dst2, src1, src2, temp1, temp2
> +    mova    %1, %3
> +    mova    %2, %4
> +    mova    %5, %1
> +    mova    %6, %2
> +    psraw   %5, 15
> +    psraw   %6, 15
> +    pxor    %1, %5
> +    pxor    %2, %6
> +    psubw   %1, %5
> +    psubw   %2, %6
> +%endmacro


In case anyone is wondering why I used 2 temp registers and interleaved the
instructions instead of using 1 temp register: it is faster on Atom.

 MMX: 7367 vs. 8966
SSE2: 4228 vs. 4838
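
For readers not fluent in MMX, here is a rough scalar C model of what the
macro does to each 16-bit lane: build a sign mask (what psraw by 15 yields),
xor with it, then subtract it. The function name abs16_lane is made up for
illustration and is not part of the patch. The mask is formed with a
comparison rather than a right shift, since right-shifting a negative value
is implementation-defined in C. It does not model the register interleaving,
only the arithmetic.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane of PABSW2_MMX.
 * Returns the lane's bit pattern as an unsigned value. */
static uint16_t abs16_lane(int16_t x)
{
    uint16_t ux   = (uint16_t)x;
    /* psraw reg, 15 replicates the sign bit: all ones if negative, else 0 */
    uint16_t mask = (x < 0) ? 0xFFFF : 0x0000;
    /* pxor flips all bits when negative; psubw then adds 1 (subtracting -1),
     * completing the two's-complement negation. Wraps modulo 2^16. */
    return (uint16_t)((ux ^ mask) - mask);
}
```

Note that, like the wrapping psubw, this leaves -32768 as bit pattern 0x8000,
since +32768 does not fit in a signed 16-bit lane.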

-Justin


