[FFmpeg-devel] [PATCH] Add check for Athlon64 and similar AMD processors with slow SSE2.

Jason Garrett-Glaser jason
Sun Feb 6 04:04:14 CET 2011


On Sat, Feb 5, 2011 at 5:46 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> Hi,
>
> On Fri, Feb 4, 2011 at 1:03 PM, Justin Ruggles <justin.ruggles at gmail.com> wrote:
>> On 02/04/2011 12:27 PM, Ronald S. Bultje wrote:
>>> I'm not against the original idea of reusing SSE2SLOW, just make sure
>>> it's properly documented.
>>> - SSE2 - CPU supports good SSE2
>>> - SSE2SLOW (core1 etc.) - CPU supports SSE2 in theory but it's almost
>>> always slower - only set SSE2 functions if explicitly tested to be
>>> faster
>>> - SSE2|SSE2SLOW (athlon64 etc.) - CPU supports SSE2 but it's
>>> occasionally slower - don't set SSE2 functions if explicitly tested to
>>> be slower
>>>
>>> And I thought that's what your patch did.
>>
>>
>> It did. But I think it made one of the flag checks more complicated.
>>
>> all sse2:
>> flags & (SSE2 | SSE2SLOW)
>>
>> exclude core 1 only:
>> flags & SSE2
>>
>> exclude core 1 and athlon64:
>> (flags & SSE2) && !(flags & SSE2SLOW)
>> or
>> (flags & (SSE2 | SSE2SLOW)) ^ SSE2SLOW
>
> (flags & (SSE2|SSE2SLOW)) == SSE2,
>
> (^ SSE2SLOW only flips the slow bit, so the result is still non-zero
> whenever the SSE2 bit is set, Athlon64 included.)
>
>> exclude athlon64 only:
>> (flags & (SSE2 | SSE2SLOW)) && !(flags & SSE2 && flags & SSE2SLOW)
>> or
>> (flags & (SSE2 | SSE2SLOW)) ^ (SSE2 | SSE2SLOW)
>>
>> The first 3 are self-explanatory, but the last case is not.
>
> I don't think it matters. When would you ever want to exclude
> Athlon64, but not Core1?

Almost any SSE2 function?
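
To make the four cases from the quoted discussion concrete, here is a
minimal sketch in C. The names FLAG_SSE2, FLAG_SSE2SLOW, the CPU_*
constants and the predicate functions are hypothetical, and the bit
values are made up for illustration; they are not FFmpeg's actual
definitions. Each case is written as an explicit comparison rather
than an XOR, so the intent can be read off directly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for illustration only; not FFmpeg's values. */
enum {
    FLAG_SSE2     = 1 << 0,
    FLAG_SSE2SLOW = 1 << 1,
    FLAG_SSE2_ANY = FLAG_SSE2 | FLAG_SSE2SLOW
};

/* CPU classes as encoded by the scheme described in the thread. */
enum {
    CPU_NO_SSE2  = 0,
    CPU_CORE1    = FLAG_SSE2SLOW,             /* SSE2 almost always slower */
    CPU_ATHLON64 = FLAG_SSE2 | FLAG_SSE2SLOW, /* SSE2 occasionally slower  */
    CPU_FAST     = FLAG_SSE2                  /* good SSE2                 */
};

/* All SSE2-capable CPUs. */
bool sse2_all(int flags)
{
    return (flags & FLAG_SSE2_ANY) != 0;
}

/* Exclude Core 1 only. */
bool sse2_not_core1(int flags)
{
    return (flags & FLAG_SSE2) != 0;
}

/* Exclude Core 1 and Athlon64: exactly the SSE2 bit, nothing else. */
bool sse2_fast_only(int flags)
{
    return (flags & FLAG_SSE2_ANY) == FLAG_SSE2;
}

/* Exclude Athlon64 only: at least one bit set, but not both. */
bool sse2_not_athlon64(int flags)
{
    int bits = flags & FLAG_SSE2_ANY;
    return bits != 0 && bits != FLAG_SSE2_ANY;
}

/* Self-check against the table implied by the thread. */
void check_sse2_predicates(void)
{
    assert(sse2_all(CPU_CORE1) && sse2_all(CPU_ATHLON64) && sse2_all(CPU_FAST));
    assert(!sse2_not_core1(CPU_CORE1) && sse2_not_core1(CPU_ATHLON64));
    assert(sse2_fast_only(CPU_FAST) && !sse2_fast_only(CPU_ATHLON64));
    assert(sse2_not_athlon64(CPU_CORE1) && !sse2_not_athlon64(CPU_ATHLON64));
}
```

Note that sse2_not_athlon64() also returns false for a CPU with neither
bit set, whereas the XOR form
(flags & (SSE2 | SSE2SLOW)) ^ (SSE2 | SSE2SLOW) evaluates to non-zero
in that case.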

Jason


