[Ffmpeg-devel] [REQUEST] MMX/MMX2 and SSE optimizations for H.264 decoding

Loren Merritt lorenm
Thu Sep 15 18:52:42 CEST 2005

On Thu, 15 Sep 2005, Martin Boehme wrote:
> Gamester17 wrote:
>> Yes, there already are some MMX integer optimizations for H.264, but what 
>> about SSE (Streaming SIMD Extensions) optimizations? Isn't SSE supposed to 
>> be much more powerful than MMX (and in fact be the thing that replaces MMX)?
> Well, for a start, SSE has registers that are 128 bits wide, while MMX's 
> registers are 64 bits. As long as you're operating only on the registers 
> (i.e. you're CPU-bound, not memory bandwidth limited) that's an instant 
> factor of 2 speedup.

On AMD, most SSE2 instructions take exactly twice as long as the 
equivalent MMX instructions. Any speedups are due only to scheduling.
In x264, we have a bunch of SSE2 functions, but most of them are _slower_ 
than the MMX versions on AMD.
On Intel, yes, SSE2 is faster, but still not by a full factor of 2, even 
before you count memory bandwidth.
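
As a rough illustration of the kind of function being compared (this is not 
code from x264 or FFmpeg), here is a minimal sketch of a 16x16 sum of absolute 
differences with a scalar reference and an SSE2 version. The names 
sad16x16_c, sad16x16_sse2 and sad16x16 are hypothetical, and the choice of SAD 
as the example kernel is an assumption. The SSE2 loop issues one 16-byte 
psadbw per row where an MMX loop would issue two 8-byte ones, so if each 
128-bit instruction costs twice as much as its 64-bit counterpart, the total 
work per row comes out the same.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Scalar reference: sum of absolute differences over a 16x16 block. */
static unsigned sad16x16_c(const uint8_t *a, int a_stride,
                           const uint8_t *b, int b_stride)
{
    unsigned sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++) {
            int d = a[y * a_stride + x] - b[y * b_stride + x];
            sum += (unsigned)(d < 0 ? -d : d);
        }
    }
    return sum;
}

#if defined(__SSE2__)
/* SSE2 version: one 128-bit psadbw covers a whole 16-pixel row.
 * An MMX version would need two 64-bit psadbw per row instead. */
static unsigned sad16x16_sse2(const uint8_t *a, int a_stride,
                              const uint8_t *b, int b_stride)
{
    __m128i acc = _mm_setzero_si128();
    for (int y = 0; y < 16; y++) {
        /* Unaligned loads, so the sketch makes no alignment assumptions. */
        __m128i va = _mm_loadu_si128((const __m128i *)(a + y * a_stride));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + y * b_stride));
        acc = _mm_add_epi64(acc, _mm_sad_epu8(va, vb));
    }
    /* psadbw leaves one partial sum in each 64-bit half; add the two. */
    return (unsigned)_mm_cvtsi128_si32(acc) +
           (unsigned)_mm_cvtsi128_si32(_mm_srli_si128(acc, 8));
}
#endif

/* Compile-time dispatch: SSE2 where available, scalar otherwise. */
static unsigned sad16x16(const uint8_t *a, int a_stride,
                         const uint8_t *b, int b_stride)
{
#if defined(__SSE2__)
    return sad16x16_sse2(a, a_stride, b, b_stride);
#else
    return sad16x16_c(a, a_stride, b, b_stride);
#endif
}
```

Timing both versions on the target CPU is the only reliable way to tell which 
one wins; instruction count alone does not settle it.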

--Loren Merritt
