[FFmpeg-devel] [PATCH] fix add_bytes_mmx and add_bytes_l2_mmx for w <= 15
Mon Jun 23 01:08:32 CEST 2008
On Sun, Jun 22, 2008 at 09:34:08AM +0200, Reimar Döffinger wrote:
> On Sun, Jun 22, 2008 at 03:16:14AM +0200, Michael Niedermayer wrote:
> > On Sat, Jun 21, 2008 at 08:40:02PM +0200, Reimar Döffinger wrote:
> > > As is noticeable when decoding small PNG images, these two functions do
> > > not work correctly and cause a segfault.
> > > Attached is one possible solution. I think another would be to change
> > > the jb to js and to jmp to the comparison before the first loop.
> > I am OK with the solution that is faster and, if they are the same speed,
> > the one that is smaller.
> In my quick tests (I am not going to do extensive benchmarks on code that will
> be changed later anyway) the version using jmp is smaller and usually faster,
> so I applied that.
> > Besides, the cmp is unneeded and can be removed.
> Like in the attached patch? Unfortunately the benchmark numbers seem
> completely unrealistic to me; going by them, there would be a 4x speedup in
> some cases...
> Though I tested with PNG images, maybe they are a horrible test case.
> Example numbers:
> previous code:
> 39350 dezicycles in blub, 1 runs, 0 skips
> 24925 dezicycles in blub, 2 runs, 0 skips
> 16242 dezicycles in blub, 4 runs, 0 skips
> 11603 dezicycles in blub, 8 runs, 0 skips
> 9407 dezicycles in blub, 16 runs, 0 skips
16 runs? Don't you have a file that uses that code more than 16 times?
And the patch is OK if it works (same and correct output).
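
For reference, the w <= 15 problem discussed in this thread can be illustrated
with a minimal sketch in plain C. This is not the MMX assembly from the patch:
the name add_bytes_sketch is hypothetical, and the sketch assumes the routine
handles 16 bytes per block iteration followed by a byte-wise tail loop. What it
tries to show is that the block loop's bound has to be tested before the first
iteration, and as a signed comparison, so that a width below 16 never enters
the block loop at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the routine discussed in this thread.
 * Portable C only; it models the loop structure, not the MMX code. */
static void add_bytes_sketch(uint8_t *dst, const uint8_t *src, int w)
{
    int i = 0;

    assert(dst != NULL && src != NULL);

    /* Block loop (assumed 16 bytes per iteration). The condition is
     * checked before the first iteration and compared as a signed int,
     * so for w <= 15 the bound w - 15 is <= 0 and the loop is skipped. */
    for (; i < w - 15; i += 16) {
        for (int j = 0; j < 16; j++)
            dst[i + j] += src[i + j];
    }

    /* Byte-wise tail loop for the remaining 0..15 bytes. */
    for (; i < w; i++)
        dst[i] += src[i];
}
```

With w = 3, for example, only the tail loop runs and exactly three bytes of dst
are modified. If the bound were treated as unsigned, or the block body ran once
before the test, the same call would touch memory past the end of the buffers.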
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Concerning the gods, I have no means of knowing whether they exist or not
or of what sort they may be, because of the obscurity of the subject, and
the brevity of human life -- Protagoras