[FFmpeg-devel] [PATCH] SSE2 Xvid idct

Michael Niedermayer michaelni
Sun Apr 13 12:26:41 CEST 2008


On Sun, Apr 13, 2008 at 05:35:01AM -0400, Alexander Strange wrote:
>
> On Apr 12, 2008, at 8:15 AM, Michael Niedermayer wrote:
[...]
>>>    "psubsw   %%xmm6, %%xmm5          \n\t" \
>>>    "movdqa   "ROW0", %%xmm4          \n\t" \
>>>    "movdqa   "ROW4", %%xmm6          \n\t" \
>>>    "movdqa   %%xmm2, "spill"         \n\t" \
>>>    "movdqa   %%xmm4, %%xmm2          \n\t" \
>>>    "psubsw   %%xmm6, %%xmm4          \n\t" \
>>>    "paddsw   %%xmm2, %%xmm6          \n\t" \
>>>    "movdqa   %%xmm6, %%xmm2          \n\t" \
>>>    "psubsw   %%xmm7, %%xmm6          \n\t" \
>>>    "paddsw   %%xmm2, %%xmm7          \n\t" \
>>>    "movdqa   %%xmm4, %%xmm2          \n\t" \
>>>    "psubsw   %%xmm5, %%xmm4          \n\t" \
>>>    "paddsw   %%xmm2, %%xmm5          \n\t" \
>>>    "movdqa   %%xmm5, %%xmm2          \n\t" \
>>>    "psubsw   %%xmm0, %%xmm5          \n\t" \
>>>    "paddsw   %%xmm2, %%xmm0          \n\t" \
>>>    "movdqa   %%xmm4, %%xmm2          \n\t" \
>>>    "psubsw   %%xmm3, %%xmm4          \n\t" \
>>>    "paddsw   %%xmm2, %%xmm3          \n\t" \
>>>    "movdqa  "spill", %%xmm2          \n\t" \
>>
>> #ifdef ARCH_X86_64
>> # define XMMS   "%%xmm12"
>> #else
>> # define XMMS   "%%xmm2"
>> #endif
>> s/%%xmm2/XMMS/
>>
>> #ifndef ARCH_X86_64
>> "movdqa   %%xmm2, "spill"         \n\t" \
>> #endif
>> ...
>> #ifndef ARCH_X86_64
>> "movdqa  "spill", %%xmm2          \n\t" \
>> #endif
>>
>> or a
>> MOV_ONLY_ON32" %%xmm2, ...
>>
>>
>> And I think something similar can be done with ROW*
>
> Done. The row part is already optimal on 64 since pshufhw handles it.

I meant these loads:
>     "movdqa   "ROW2", %%xmm4          \n\t" \
>     "movdqa   "ROW6", %%xmm6          \n\t" \
[...]
>     "movdqa   "ROW0", %%xmm4          \n\t" \
>     "movdqa   "ROW4", %%xmm6          \n\t" \

They are unneeded on x86-64.
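
For illustration, the conditional-spill idea quoted above can be sketched as a
small standalone C file that only builds the instruction text and never
executes it. This is a sketch under stated assumptions, not the patch itself:
SKETCH_X86_64 is a hypothetical stand-in for ARCH_X86_64, the argument shape of
MOV_ONLY_ON32 is assumed (only its name appears above), and the operand
"16(%0)" and the three-instruction body are made up for the example. The ROW
loads could presumably be wrapped in the same way if the rows stay in the
extra registers on x86-64.

```c
#include <string.h>

/* Hypothetical stand-in for ARCH_X86_64, which the real build system defines. */
#if defined(__x86_64__) || defined(_M_X64)
# define SKETCH_X86_64 1
#else
# define SKETCH_X86_64 0
#endif

#if SKETCH_X86_64
/* 16 XMM registers: use a spare one as scratch, so no spill is emitted. */
# define XMMS "%%xmm12"
# define MOV_ONLY_ON32(src, dst)
#else
/* 8 XMM registers: reuse xmm2 as scratch and save/restore it around the block. */
# define XMMS "%%xmm2"
# define MOV_ONLY_ON32(src, dst) "movdqa   " src ", " dst "\n\t"
#endif

/* Assumed three-instruction body; only the scratch register and spill vary. */
#define SKETCH_BUTTERFLY(spill)              \
    MOV_ONLY_ON32("%%xmm2", spill)           \
    "movdqa   %%xmm4, " XMMS "\n\t"          \
    "psubsw   %%xmm6, %%xmm4\n\t"            \
    "paddsw   " XMMS ", %%xmm6\n\t"          \
    MOV_ONLY_ON32(spill, "%%xmm2")

/* The text that would be handed to an asm statement; "16(%0)" is made up. */
static const char sketch_butterfly_text[] = SKETCH_BUTTERFLY("16(%0)");

/* Count instructions by counting the "\n\t" terminators in the text. */
static int sketch_count_insns(const char *text)
{
    int n = 0;
    const char *p;
    for (p = strstr(text, "\n\t"); p != NULL; p = strstr(p + 2, "\n\t"))
        n++;
    return n;
}
```

With these definitions the 64-bit build yields three instructions and the
32-bit build five, the extra two being the save and restore of xmm2.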

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The worst form of inequality is to try to make unequal things equal.
-- Aristotle