[FFmpeg-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

Vitor Sessak vitor1001 at gmail.com
Sun Aug 28 10:46:59 CEST 2011


On Sun, Aug 28, 2011 at 2:37 AM, Loren Merritt <lorenm at u.washington.edu> wrote:
> On Sat, 27 Aug 2011, Vitor Sessak wrote:
>
>> %macro PSHUFD_AVX 3
>>     shufps %1, %2, %2, %3
>> %endmacro
>
> This can serve as sse1 too.

Fixed.
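
A scalar C model of the two shuffles illustrates why the shufps form needs nothing beyond SSE1. It is only a sketch, not code from the patch: the type xmm_t and all function names are made up for illustration, and the "copy, then shufps with itself" sequence is an assumption about how a two-operand fallback would be written. With both inputs of shufps holding the same value, every output lane can pick any input lane, which is exactly what pshufd does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a 128-bit XMM register as four 32-bit lanes
 * (hypothetical type, for illustration only). */
typedef struct { uint32_t lane[4]; } xmm_t;

/* pshufd dst, src, imm8 (SSE2): every destination lane selects any
 * source lane through a 2-bit field of imm8. */
static xmm_t model_pshufd(xmm_t src, unsigned imm8)
{
    xmm_t dst;
    assert(imm8 <= 0xFF);
    for (int i = 0; i < 4; i++)
        dst.lane[i] = src.lane[(imm8 >> (2 * i)) & 3];
    return dst;
}

/* shufps dst, src, imm8 (SSE1, two-operand form): lanes 0-1 are selected
 * from the old value of dst, lanes 2-3 from src. */
static xmm_t model_shufps(xmm_t dst, xmm_t src, unsigned imm8)
{
    xmm_t out;
    assert(imm8 <= 0xFF);
    out.lane[0] = dst.lane[(imm8 >> 0) & 3];
    out.lane[1] = dst.lane[(imm8 >> 2) & 3];
    out.lane[2] = src.lane[(imm8 >> 4) & 3];
    out.lane[3] = src.lane[(imm8 >> 6) & 3];
    return out;
}

/* Assumed SSE1 fallback: copy src into dst, then shuffle dst with src. */
static xmm_t model_pshufd_sse1(xmm_t src, unsigned imm8)
{
    xmm_t dst = src;                      /* movaps dst, src       */
    return model_shufps(dst, src, imm8);  /* shufps dst, src, imm8 */
}

/* Returns 1 if both models give the same result for every imm8 value. */
static int shuffles_agree(xmm_t v)
{
    for (unsigned imm8 = 0; imm8 < 256; imm8++) {
        xmm_t a = model_pshufd(v, imm8);
        xmm_t b = model_pshufd_sse1(v, imm8);
        if (memcmp(&a, &b, sizeof a) != 0)
            return 0;
    }
    return 1;
}
```

Calling shuffles_agree() on any four-lane value checks all 256 immediates, so the equivalence does not depend on the particular shuffle pattern used.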

>>>> %macro SWAP_64BITS 2
>>>> %ifdef ARCH_X86_64
>>>>    SWAP %1, %2
>>>> %endif
>>>> %endmacro
>>>
>>> What good is this doing? There's no %else, so the code must also work
>>> (with no extra instructions) if you don't swap...?
>>
>> I was hoping that swapping the temp variable in code like
>>
>> mova m5, m0
>> addps m5, m1
>> mulps m2, m5
>>
>> SWAP_64BITS m5, m10
>>
>> mova m5, m3
>> addps m5, m6
>> mulps m7, m5
>>
>> would allow an x86_64 CPU to use out-of-order execution to interleave
>> the two blocks of instructions in any order.
>
> Unnecessary. Every x86 cpu that supports out of order execution also
> supports register renaming.
> Equivalently, the x86 pipeline really uses static-single-assignment, with
> the output value of every instruction remaining available even if some
> later instruction overwrites the same variable name.

Ok, removed it.
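
The renaming argument can be checked with a toy C model. This is a simplified sketch, not a description of any real pipeline: it assumes an unlimited supply of physical registers and names each one after the instruction that wrote it, which is the static-single-assignment view described above. The types, function names, and the thread_example table are all hypothetical. Only true read-after-write dependencies are recorded, so reusing m5 as the temporary in the second block creates no link back to the first block.

```c
#include <assert.h>

#define NUM_ARCH_REGS 16

/* Two-operand x86-style instruction on architectural registers. */
typedef struct {
    int dst;        /* destination register number (m<dst>)              */
    int src;        /* source register number (m<src>)                   */
    int reads_dst;  /* 1 if dst is also an input (addps, mulps), 0 for mova */
} insn_t;

/* Index of the instruction that produced each input, or -1 if the value
 * was already live before the sequence started. */
typedef struct {
    int src_producer;
    int dst_producer;
} deps_t;

/* Rename: every write creates a fresh value identified by the index of
 * the writing instruction; every read looks up the latest writer. */
static void rename_registers(const insn_t *code, int n, deps_t *deps)
{
    int latest[NUM_ARCH_REGS];
    for (int r = 0; r < NUM_ARCH_REGS; r++)
        latest[r] = -1;
    for (int i = 0; i < n; i++) {
        assert(code[i].dst >= 0 && code[i].dst < NUM_ARCH_REGS);
        assert(code[i].src >= 0 && code[i].src < NUM_ARCH_REGS);
        deps[i].src_producer = latest[code[i].src];
        deps[i].dst_producer = code[i].reads_dst ? latest[code[i].dst] : -1;
        latest[code[i].dst] = i;  /* old value of dst stays untouched */
    }
}

/* Returns 1 if no instruction in [split, n) reads a value produced by an
 * instruction in [0, split). */
static int second_block_is_independent(const deps_t *deps, int split, int n)
{
    for (int i = split; i < n; i++) {
        int s = deps[i].src_producer;
        int d = deps[i].dst_producer;
        if ((s >= 0 && s < split) || (d >= 0 && d < split))
            return 0;
    }
    return 1;
}

/* The two blocks quoted in this thread, both using m5 as the temporary. */
static const insn_t thread_example[6] = {
    { 5, 0, 0 },  /* mova  m5, m0 */
    { 5, 1, 1 },  /* addps m5, m1 */
    { 2, 5, 1 },  /* mulps m2, m5 */
    { 5, 3, 0 },  /* mova  m5, m3 */
    { 5, 6, 1 },  /* addps m5, m6 */
    { 7, 5, 1 },  /* mulps m7, m5 */
};
```

Running rename_registers() over thread_example and then second_block_is_independent() with split = 3 reports the blocks as independent in this model, without swapping m5 for another register name.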

-Vitor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-mpegaudiodec-add-SSE-optimized-imdct36.patch
Type: text/x-patch
Size: 11857 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20110828/2cfd1d90/attachment.bin>

