[FFmpeg-devel] [PATCH] Add x86-optimized function ac3_or_abs_int16() and use in log2_tab().

Loren Merritt lorenm
Sat Feb 12 06:52:36 CET 2011


>+%macro PABSW2_MMX 6 ; dst1, dst2, src1, src2, temp1, temp2
>+    mova    %1, %3
>+    mova    %2, %4
>+    mova    %5, %1
>+    mova    %6, %2
>+    psraw   %5, 15
>+    psraw   %6, 15
>+    pxor    %1, %5
>+    pxor    %2, %6
>+    psubw   %1, %5
>+    psubw   %2, %6
>+%endmacro
>+
>+%macro PABSW2_SSSE3 6 ; dst1, dst2, src1, src2, unused, unused
>+    pabsw   %1, %3
>+    pabsw   %2, %4
>+%endmacro

Already in x86util.asm

But you don't actually want to compute (bit-or of abs), right? You want
to compute (log2 of max of abs). Since MMX has no abs, but signed
min/max instructions (pminsw/pmaxsw) are available from MMX2 on, try
running signed min/max over the data first and doing abs only once in
the tail.
That might be faster in C too, on CPUs with scalar cmov/min/max and
without scalar abs.
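
A rough C sketch of that suggestion, for illustration only. The names
max_abs_int16(), or_abs_int16() and ilog2() are made up here and are not
the functions from the patch or from FFmpeg. It assumes the caller only
needs the position of the highest set bit, in which case the bit-or of
the absolute values and their maximum give the same answer.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: largest absolute value in src[0..len-1].
 * The loop tracks signed min and max only; abs is done once at the end. */
static int max_abs_int16(const int16_t *src, int len)
{
    int min = 0, max = 0;
    for (int i = 0; i < len; i++) {
        int v = src[i];
        if (v < min) min = v;   /* candidates for cmov or scalar min/max */
        if (v > max) max = v;
    }
    /* min <= 0 <= max, so max(|x|) is the larger of max and -min.
     * Computed in int, so -(-32768) does not overflow. */
    return -min > max ? -min : max;
}

/* Hypothetical reference: bit-or of the per-element absolute values. */
static int or_abs_int16(const int16_t *src, int len)
{
    int v = 0;
    for (int i = 0; i < len; i++)
        v |= abs(src[i]);       /* src[i] is promoted to int before abs() */
    return v;
}

/* Hypothetical helper: floor(log2(v)) for v > 0, and 0 for v == 0. */
static int ilog2(int v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}
```

For any input, ilog2(max_abs_int16(src, len)) and
ilog2(or_abs_int16(src, len)) agree, because or-ing in smaller values
cannot set a bit above the highest bit of the largest one.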

--Loren Merritt


