[FFmpeg-devel] [PATCH] fix add_bytes_mmx and add_bytes_l2_mmx for w <= 15

Reimar Döffinger Reimar.Doeffinger
Mon Jun 23 18:52:48 CEST 2008


Hello,
On Mon, Jun 23, 2008 at 01:08:32AM +0200, Michael Niedermayer wrote:
> On Sun, Jun 22, 2008 at 09:34:08AM +0200, Reimar Döffinger wrote:
> > Like in the attached patch? Unfortunately the benchmark numbers seem
> > completely unrealistic to me; going by them there would be a 4x speedup in
> > some cases...
> > Though I tested with png images, maybe they are a horrible testcase.
> > 
> > Example numbers:
> > previous code:
> > 39350 dezicycles in blub, 1 runs, 0 skips
> > 24925 dezicycles in blub, 2 runs, 0 skips
> > 16242 dezicycles in blub, 4 runs, 0 skips
> > 11603 dezicycles in blub, 8 runs, 0 skips
> > 9407 dezicycles in blub, 16 runs, 0 skips
> 
> 16 runs? Don't you have a file that uses that code more than 16 times?

Well, the pngs created with e.g. mplayer -vo png do not use it at all,
but I used some other png files.

> and patch ok if it works (same and correct output)

Well, in that respect it works, but with a "proper" benchmark the new code
seems to be slightly slower. I strongly suspect that the png code hardly ever
uses counts larger than 15. I guess a better way to benchmark is necessary,
and there's also the question of how important the tiny-size case is.
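
One way to check the suspicion about the counts would be to tally the widths
that actually reach the function. The following is only a rough sketch in plain
C, not the MMX code from the patch: add_bytes_ref, add_bytes_counted and the two
counters are hypothetical names made up for illustration, and the 16-byte step
is an assumption taken from the "w <= 15" in the subject line.

```c
#include <stdint.h>

/* Hypothetical counters for illustration, not part of FFmpeg. */
unsigned long calls_small;  /* calls with w <= 15 */
unsigned long calls_large;  /* calls with w >= 16 */

/* Plain C stand-in for add_bytes: dst[i] += src[i] (mod 256) for i in [0, w). */
void add_bytes_ref(uint8_t *dst, const uint8_t *src, int w)
{
    int i = 0;

    /* 16 bytes per step, assumed to mirror the block size of a SIMD loop.
     * The bound is checked before the first block, so w <= 15 never enters. */
    for (; i + 16 <= w; i += 16)
        for (int k = 0; k < 16; k++)
            dst[i + k] = (uint8_t)(dst[i + k] + src[i + k]);

    /* Scalar tail: handles the remaining 0..15 bytes. */
    for (; i < w; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Wrapper that records whether a call was a tiny-size one, then does the work. */
void add_bytes_counted(uint8_t *dst, const uint8_t *src, int w)
{
    if (w <= 15)
        calls_small++;
    else
        calls_large++;
    add_bytes_ref(dst, src, w);
}
```

Printing calls_small and calls_large after decoding a few files would show how
much of the workload is the tiny-size case before deciding what to optimize for.
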
These are the results, each time the lowest values from 10 runs:
old code:
7140 dezicycles in blub, 1 runs, 0 skips
7075 dezicycles in blub, 2 runs, 0 skips
6865 dezicycles in blub, 4 runs, 0 skips
6806 dezicycles in blub, 8 runs, 0 skips
6851 dezicycles in blub, 16 runs, 0 skips
12182 dezicycles in blub, 32 runs, 0 skips
14793 dezicycles in blub, 64 runs, 0 skips
16509 dezicycles in blub, 128 runs, 0 skips
17970 dezicycles in blub, 256 runs, 0 skips

new code:
7510 dezicycles in blub, 1 runs, 0 skips
7085 dezicycles in blub, 2 runs, 0 skips
7050 dezicycles in blub, 4 runs, 0 skips
6887 dezicycles in blub, 8 runs, 0 skips
7094 dezicycles in blub, 16 runs, 0 skips
12722 dezicycles in blub, 32 runs, 0 skips
15341 dezicycles in blub, 64 runs, 0 skips
16824 dezicycles in blub, 128 runs, 0 skips
18241 dezicycles in blub, 256 runs, 0 skips
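
To make the old/new comparison easier to read, the relative difference per row
can be computed from the numbers above. This is just a small helper sketch; the
array and function names are made up for illustration, and the values are copied
from the two lists.

```c
#include <stdio.h>

/* Values copied from the two lists above, in dezicycles. */
const int runs[]     = {1, 2, 4, 8, 16, 32, 64, 128, 256};
const int old_code[] = {7140, 7075, 6865, 6806, 6851, 12182, 14793, 16509, 17970};
const int new_code[] = {7510, 7085, 7050, 6887, 7094, 12722, 15341, 16824, 18241};

/* Relative change from before to after, in percent (positive = slower). */
double percent_change(int before, int after)
{
    return 100.0 * (after - before) / before;
}

/* Print one line per row of the lists. */
void print_comparison(void)
{
    for (int i = 0; i < 9; i++)
        printf("%3d runs: %+.1f%%\n", runs[i],
               percent_change(old_code[i], new_code[i]));
}
```

Going by these numbers the new code is slower in every row, by roughly 0.1% to
5.2%.
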

Greetings,
Reimar Döffinger



