[FFmpeg-devel] Some IWMMXT functions for libavcodec

Dmitry Antipov dmantipov
Sat May 17 13:13:34 CEST 2008

Siarhei Siamashka wrote:

> Does Intel contradict itself? Or there is some variation between different
> revisions of XScale cores and they have different optimization rules? Can you
> provide a direct link to the document you are using?

There are currently two generations of WMMX hardware: WMMX inside PXA27x cores and WMMX2 inside
PXA3xx cores (read the PXA genealogy at http://en.wikipedia.org/wiki/XScale if you're not
familiar with it).

The specification at http://www.intel.com/design/intelxscale/314510.htm describes WMMX2, but
there is another (older) specification of WMMX. I can't find a direct link on Intel's sites,
but you can grab my copy at

I'm using hardware based on the PXA310 (http://www.marvell.com/products/cellular/application/pxa310.jsp).
But the PXA27x cores are not obsolete; in fact, they make up the mainstream of today's end-user hardware,
and PXA3xx hardware is going to replace them in the near future.

As I understand it, WMMX2 is a strict superset of WMMX in terms of instruction semantics: it
adds new instructions, but the rest is fully backward compatible. However, WMMX and WMMX2 differ
(by how much, I don't know) at the hardware level, so code that is perfectly tuned for WMMX2 may
not be optimal on WMMX.

> One more interesting issue with WLDRD instruction is that it should support
> register offset addressing mode according to the manual. So you should
> have been able to use:
>     wldrd wr2, [%1, #8]
>     wldrd wr1, [%1], %2
> instead of
>     wldrd wr1, [%1]
>     wldrd wr2, [%1, #8]
>     add %1, %1, %2
> But the toolchain I'm using (also tried gcc 4.3 and binutils 2.18) seems 
> to silently ignore register offset and generates wrong instruction here
> (without register postincrement). Either I'm misunderstanding something, 
> or it is a bug in binutils. Could you please try to investigate it further
> and submit a bugreport to binutils if needed?

I'm using an ancient (but proven to be stable) gcc 3.4.3 and binutils 2.15.94 (dated 20041215).
This toolchain understands constant post-increment like

     wldrd wr0, [%0], #8

but not register post-increment like

     wldrd wr0, [%0], %1

An attempt to assemble the last example produces an error from as:

test.s:36:Error: # or { expected after comma -- `wldrd wr0,[r4],r6'

Indeed, this is strange, and I'll try to investigate it.
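
To make clear what the two sequences from the quoted text are supposed to compute, here is a
small sketch in plain C. It is only a model, not IWMMXT code: load64(), load_row_add(),
load_row_postinc() and forms_agree() are hypothetical names made up for this illustration, and
it assumes that register post-increment means "load from the base address, then add the
register to the base", which is what the three-instruction form does by hand.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one WLDRD: read 8 bytes from addr. */
static uint64_t load64(const uint8_t *addr)
{
    uint64_t v;
    memcpy(&v, addr, sizeof v);
    return v;
}

/* Three-instruction form:
 *     wldrd wr1, [p]
 *     wldrd wr2, [p, #8]
 *     add   p, p, stride
 */
static const uint8_t *load_row_add(const uint8_t *p, long stride,
                                   uint64_t *wr1, uint64_t *wr2)
{
    *wr1 = load64(p);
    *wr2 = load64(p + 8);
    return p + stride;
}

/* Two-instruction form (assumed semantics of register post-increment):
 *     wldrd wr2, [p, #8]
 *     wldrd wr1, [p], stride
 */
static const uint8_t *load_row_postinc(const uint8_t *p, long stride,
                                       uint64_t *wr1, uint64_t *wr2)
{
    *wr2 = load64(p + 8);   /* immediate offset, base unchanged */
    *wr1 = load64(p);       /* load from the old base ...       */
    return p + stride;      /* ... then write back base+stride  */
}

/* Returns 1 if both forms load the same data and advance the pointer
 * identically over `rows` rows of 16 bytes each, 0 otherwise. */
static int forms_agree(const uint8_t *buf, long stride, int rows)
{
    const uint8_t *pa = buf, *pb = buf;
    assert(buf != NULL);
    for (int r = 0; r < rows; r++) {
        uint64_t a1, a2, b1, b2;
        pa = load_row_add(pa, stride, &a1, &a2);
        pb = load_row_postinc(pb, stride, &b1, &b2);
        if (a1 != b1 || a2 != b2 || pa != pb)
            return 0;
    }
    return 1;
}
```

If the assembler really drops the register operand, the generated code would behave like
load_row_postinc() without the final pointer update, so every iteration would reload the
same row.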

