[FFmpeg-devel] [PATCH] Add x86-optimized function ac3_or_abs_int16() and use in log2_tab().
Justin Ruggles
justin.ruggles
Sat Feb 12 02:10:15 CET 2011
On 02/11/2011 07:55 PM, Justin Ruggles wrote:
> +%macro PABSW2_MMX 6 ; dst1, dst2, src1, src2, temp1, temp2
> + mova %1, %3
> + mova %2, %4
> + mova %5, %1
> + mova %6, %2
> + psraw %5, 15
> + psraw %6, 15
> + pxor %1, %5
> + pxor %2, %6
> + psubw %1, %5
> + psubw %2, %6
> +%endmacro
If anyone is wondering why I used 2 temp registers and interleaved the
instructions instead of using 1 temp register... it is faster on Atom.
MMX: 7367 vs. 8966
SSE2: 4228 vs. 4838
-Justin
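
For readers following the quoted macro, here is a minimal scalar sketch in C of what each 16-bit lane computes: build a sign mask (the psraw by 15), XOR it in, then subtract it. The names abs16_lane and abs16_array are hypothetical and are not part of the patch. The sketch assumes wrapping 16-bit arithmetic like psubw, so -32768 maps to itself, and it relies on the usual two's-complement behavior when converting back to int16_t. It does not model the two-temp interleaving, which only affects instruction scheduling and not the result.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scalar model of one 16-bit lane of PABSW2_MMX. */
static int16_t abs16_lane(int16_t x)
{
    /* Equivalent of "psraw tmp, 15": all ones if x is negative, else zero. */
    uint16_t mask = (x < 0) ? 0xFFFFu : 0x0000u;
    uint16_t v    = (uint16_t)x;

    v ^= mask;                 /* pxor:  flips every bit when x is negative */
    v  = (uint16_t)(v - mask); /* psubw: subtracting 0xFFFF adds 1, mod 2^16 */

    /* Assumes two's-complement conversion, as on mainstream compilers. */
    return (int16_t)v;
}

/* Hypothetical helper: apply the lane operation to a whole buffer. */
static void abs16_array(int16_t *dst, const int16_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = abs16_lane(src[i]);
}
```

For a negative input the XOR gives the one's complement and the subtraction of the all-ones mask adds 1, which together negate the value; for a non-negative input the mask is zero and both steps are no-ops.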