[Ffmpeg-devel] [REQUEST] MMX/MMX2 and SSE optimizations for H.264 decoding

Martin Boehme boehme
Fri Sep 16 14:20:42 CEST 2005

Loren Merritt wrote:
> On Thu, 15 Sep 2005, Martin Boehme wrote:
>> Gamester17 wrote:
>>> Yes, there already are some MMX integer optimizations for H.264, but what 
>>> about SSE (Streaming SIMD Extensions) optimizations? Isn't SSE 
>>> supposed to be much more powerful than MMX (and in fact the thing 
>>> that replaces MMX)?
>> Well, for a start, SSE has registers that are 128 bits wide, while 
>> MMX's registers are 64 bits. As long as you're operating only on the 
>> registers (i.e. you're CPU-bound, not memory bandwidth limited) that's 
>> an instant factor of 2 speedup.
> On AMD, most SSE2 instructions take exactly twice as long as the 
> equivalent MMX instruction. Any speedups are due only to scheduling.
> In x264, we have a bunch of SSE2 functions, but most of them are 
> _slower_ than the MMX versions on AMD.

Interesting -- I wasn't aware of that. I would assume that the AMD 
processors only have 64-bit-wide execution units and have to perform 
each 128-bit SSE operation in two passes?
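
To make the register-width point concrete, here is a minimal sketch in C of a 16-byte sum of absolute differences (SAD), the kind of kernel a video codec spends much of its time in. The function names sad_scalar and sad16 are hypothetical and are not taken from ffmpeg or x264. The SSE2 path uses one 128-bit psadbw (via the _mm_sad_epu8 intrinsic), which produces two partial sums, one per 64-bit half of the register. That layout shows how a 128-bit operation can be viewed as two independent 64-bit operations; whether a given CPU actually executes it that way is an assumption this sketch does not test.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Scalar reference: sum of absolute differences over n bytes. */
static unsigned sad_scalar(const uint8_t *a, const uint8_t *b, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (a[i] > b[i]) ? (unsigned)(a[i] - b[i])
                             : (unsigned)(b[i] - a[i]);
    return sum;
}

/* SAD of one 16-byte row (hypothetical helper, for illustration only). */
static unsigned sad16(const uint8_t *a, const uint8_t *b)
{
#if defined(__SSE2__)
    /* Unaligned loads, so the caller need not align the buffers. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* One 128-bit psadbw: the result holds two partial sums,
     * one in bits 0..15 (low 8 bytes), one in bits 64..79 (high 8 bytes). */
    __m128i s = _mm_sad_epu8(va, vb);

    /* Add the two 64-bit halves' partial sums together. */
    return (unsigned)_mm_cvtsi128_si32(s)
         + (unsigned)_mm_extract_epi16(s, 4);
#else
    /* No SSE2 available at compile time: fall back to the scalar loop. */
    return sad_scalar(a, b, 16);
#endif
}
```

An MMX-width version of the same kernel would need two 8-byte psadbw operations plus an add to cover the row, which is where the nominal factor of 2 comes from. As the figures quoted above suggest, the measured gain depends on how the CPU executes the 128-bit form.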

> On Intel, yes SSE2 is faster, but still not a full factor of 2 even 
> before you count memory bandwidth.

Thanks for the info!


Martin Böhme
Inst. f. Neuro- and Bioinformatics
Ratzeburger Allee 160, D-23538 Luebeck
Phone: +49 451 500 5514
Fax:   +49 451 500 5502
boehme at inb.uni-luebeck.de
