[FFmpeg-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

Loren Merritt lorenm at u.washington.edu
Sun Aug 28 02:37:48 CEST 2011


On Sat, 27 Aug 2011, Vitor Sessak wrote:

> %macro PSHUFD_AVX 3
>     shufps %1, %2, %2, %3
> %endmacro

This can serve as sse1 too.

>>> %macro SWAP_64BITS 2
>>> %ifdef ARCH_X86_64
>>>    SWAP %1, %2
>>> %endif
>>> %endmacro
>>
>> What good is this doing? There's no %else, so the code must also work
>> (with no extra instructions) if you don't swap...?
>
> I was hoping that swapping the temp variable in code like
>
> mova m5, m0
> addps m5, m1
> mulps m2, m5
>
> SWAP_64BITS m5, m10
>
> mova m5, m3
> addps m5, m6
> mulps m7, m5
>
> would allow an x86_64 CPU to use out-of-order execution to interleave
> the two blocks of instructions in any order.

Unnecessary. Every x86 CPU that supports out-of-order execution also
supports register renaming. Equivalently, the x86 pipeline effectively
uses static single assignment: the output value of every instruction
remains available even if some later instruction overwrites the same
variable name.
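
To make the renaming argument concrete, here is a toy model in Python. It
is only a sketch: the rename() function, the "pN" physical register names
and the tuple encoding of instructions are all made up for illustration,
and real hardware is far more involved. It just shows that giving every
write a fresh physical register leaves the two blocks above with no
dependency through m5, even without the SWAP.

```python
# Toy register renamer (illustrative only, not how any real CPU is built).
# Each instruction is (mnemonic, dst, srcs) using architectural names.
# x86 two-operand ops such as "addps m5, m1" read and write their dst,
# so dst also appears in srcs for those.

def rename(program):
    mapping = {}    # architectural register -> current physical register
    next_phys = 0
    renamed = []
    for op, dst, srcs in program:
        phys_srcs = []
        for s in srcs:
            if s not in mapping:
                # Live-in value: give it a physical name on first use.
                mapping[s] = f"p{next_phys}"
                next_phys += 1
            phys_srcs.append(mapping[s])
        # Every write gets a fresh physical register (SSA-style).
        mapping[dst] = f"p{next_phys}"
        next_phys += 1
        renamed.append((op, mapping[dst], tuple(phys_srcs)))
    return renamed

# The two blocks from the quoted mail, both using m5 as the temporary.
program = [
    ("mova",  "m5", ("m0",)),
    ("addps", "m5", ("m5", "m1")),
    ("mulps", "m2", ("m2", "m5")),
    ("mova",  "m5", ("m3",)),
    ("addps", "m5", ("m5", "m6")),
    ("mulps", "m7", ("m7", "m5")),
]

renamed = rename(program)
block1, block2 = renamed[:3], renamed[3:]

# Physical registers written by the first block.
written_by_block1 = {dst for _, dst, _ in block1}

# True if no instruction in the second block reads a first-block result.
blocks_independent = all(
    src not in written_by_block1
    for _, _, srcs in block2
    for src in srcs
)

for insn in renamed:
    print(insn)
print("blocks independent:", blocks_independent)
```

In this model the second "mova m5, m3" gets a different physical register
than the first block's m5, so nothing in the second block has to wait on
the first.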

--Loren Merritt

