# [Ffmpeg-devel] Re: about mmx instructions

Pascal Massimino skal
Sat Sep 3 09:09:43 CEST 2005

```	Howdy everybody!

(hey don't you guys never take holidays?)
(or: do you spare funny questions *for* holidays? ;)

On Thu, 2005-09-01 at 16:07, Michael Niedermayer wrote:

> anyone got a nicer derivation/proof?

Well, for a more generic approach, one should remember
that it's all about emulating the propagation of
carry/borrow.

Let's take for example the addition of two 8-bits values,
followed by a shift by 2:  (x+y)>>2

The intermediate value (x+y) doesn't fit in 8bits (because
of the carry), but one can split the addition in two parts:
the 2 lowest bits are added, and the resulting carry is reported
to the result of adding the highest 6 bits. Namely:

(x+y)>>2 = [ (x>>2) + (y>>2) ] + [((x&3) + (y&3)) >> 2)]

The first term is the carry-less addition of the 6 hi bits, the
second terms adds the 2 lowest bits and only keeps the carry.

Now, note that you can also retrieve the carry from the
result itself, noting that:
(x+y)>>2 = (x>>2) + (y>>2) + { ((x+y)>>2) ^ (x>>2) ^ (y>>2) }
(x-y)>>2 = (x>>2) - (y>>2) - { ((x-y)>>2) ^ (x>>2) ^ (y>>2) }
These relations may sound silly, since we're using the result
(x+y)>>2 ... to find the result (x+y)>>2. But recall that the
problem is to make all the intermediate values fit into 8 bits.

Ok, same goes on for difference, and one can also replace the
arithmetic ops by logical ones, if one recalls that, in
boolean algebra, the '&' logical operator is equivalent to the
arithmetic multiplication, and the '^' operator equivalent to

(x+y)>>1 = (x>>1) + (y>>1) + { x&y }
(x+y+1)>>1 = (x>>1) + (y>>1) + { x|y }
(x-y)>>1 = (x>>1) - (y>>1) - { y&~x }
(x-y+1)>>1 = (x>>1) - (y>>1) + { x&~y }

where, for short, {a} means 'lsb of a', that is: {a} = a&1.
Note that: {a+b} = {a}^{b}, {a*b} = {a}^{b}, {a+a}={0}, etc..

Ok, now we can stir the bit soup, using : x=(p0+(q1>>2)+1)
and y=(q0+(p1>>2)+1.

First we have:

(p1-q1)>>2 = (p1>>2) - (q1>>2) - Epsilon

Where Epsilon={0,1}  is { ((p1-q1)>>2) ^ (p1>>2) ^ (q1>>2) }

So that:

(q0-p0+(p1-q1)>>2 + 1) = [q0+(p1>>2)+1] - [p0+(q1>>2)+1]  + 1-Epsilon

Now, you can use the generic relation:

(x-y + e)>>1 = [ (x>>1) + { x&~y&e } ] - [ (y>>1) + { y&~x&~e } ]

for e = 0 or 1.

Replacing e by 1-Epsilon = 1^ { ((p1-q1)>>2) ^ (p1>>2) ^ (q1>>2) }
will lead to the result after some boolean stirring, if you recall,
for instance, that:  { p0+(q1>>2) } = {p0}^{(q1>>2)} as far as the
LSB is concerned.

> or even a faster implementation?

Hint: (d&a) and (d&~a) can be simplified to not use 'a'
twice. Left as exercise to the reader...

Hope it helps,
bye!
Skal

```