[FFmpeg-devel] [PATCH] Add x86-optimized function ac3_or_abs_int16() and use in log2_tab().

Måns Rullgård mans
Sat Feb 12 13:48:23 CET 2011


Loren Merritt <lorenm at u.washington.edu> writes:

>>+%macro PABSW2_MMX 6 ; dst1, dst2, src1, src2, temp1, temp2
>>+    mova    %1, %3
>>+    mova    %2, %4
>>+    mova    %5, %1
>>+    mova    %6, %2
>>+    psraw   %5, 15
>>+    psraw   %6, 15
>>+    pxor    %1, %5
>>+    pxor    %2, %6
>>+    psubw   %1, %5
>>+    psubw   %2, %6
>>+%endmacro
>>+
>>+%macro PABSW2_SSSE3 6 ; dst1, dst2, src1, src2, unused, unused
>>+    pabsw   %1, %3
>>+    pabsw   %2, %4
>>+%endmacro
>
> Already in x86util.asm
>
> But you don't actually want to compute (bit-or of abs), right? You
> want to compute (log2 of max of abs). Since MMX has min/max
> instructions and doesn't have abs, try running signed min/max first
> and doing abs only once in the tail.
> That way might be faster in C too, on cpus with scalar cmov/min/max
> and without scalar abs.

So the description could be made more general, allowing both approaches.
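
To illustrate why the two approaches are interchangeable here, below is a
rough scalar C sketch, not the code from the patch. The names or_abs_int16(),
max_abs_int16() and ilog2() are hypothetical and chosen for this example only.
It assumes the caller only needs the position of the highest set bit, in which
case the bit-or of the absolute values and the maximum absolute value give the
same log2.

```c
#include <stdint.h>
#include <stdlib.h>

/* Approach 1: OR together abs() of every sample.
 * The result is not the maximum, but its highest set bit is the same. */
static int or_abs_int16(const int16_t *src, int len)
{
    int v = 0;
    for (int i = 0; i < len; i++)
        v |= abs(src[i]);       /* int16_t promotes to int, so -32768 is safe */
    return v;
}

/* Approach 2: track the signed minimum and maximum,
 * and take the absolute value only once in the tail. */
static int max_abs_int16(const int16_t *src, int len)
{
    int lo = 0, hi = 0;
    for (int i = 0; i < len; i++) {
        if (src[i] < lo) lo = src[i];
        if (src[i] > hi) hi = src[i];
    }
    return -lo > hi ? -lo : hi;
}

/* Hypothetical helper: floor(log2(v)) for v > 0, and 0 for v == 0. */
static int ilog2(unsigned v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}
```

The two functions generally return different values, but ilog2() of either
result is the same, so a more general description would let an implementation
pick whichever is cheaper on the target.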


-- 
Måns Rullgård
mans at mansr.com


