[Ffmpeg-devel] yet another silly int vs. float benchmark

Måns Rullgård mru
Sat May 21 19:42:00 CEST 2005


Michael Niedermayer <michaelni at gmx.at> writes:

> Hi
>
> here's another benchmark proggy; advantages over the others:
> 1. pure C
> 2. ~40 lines of code (could easily be done in fewer, I know ...)
> 3. tries to test both the case where each instruction depends upon the
> previous one and where the instructions are a little more independent
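
For readers without the program at hand, the two cases in point 3 might
look roughly like the sketch below. This is not Michael's benchmark: the
function names dependent_chain and independent_chain are made up for
illustration, the statements are modelled on the ones printed in the
results, and unsigned is used only so that wrap-around is well defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: each statement consumes the value the previous
 * statement just wrote, so the operations form one serial chain. */
unsigned dependent_chain(unsigned iv[2], size_t rounds)
{
    assert(iv != NULL);
    for (size_t i = 0; i < rounds; i++) {
        iv[0] += iv[1];
        iv[1] += iv[0];
    }
    return iv[1];
}

/* Hypothetical helper: within one round no statement reads a value
 * written earlier in that round, so the five additions can overlap.
 * The only true dependencies are between consecutive rounds. */
unsigned independent_chain(unsigned iv[6], size_t rounds)
{
    assert(iv != NULL);
    for (size_t i = 0; i < rounds; i++) {
        iv[0] += iv[1];
        iv[1] += iv[2];
        iv[2] += iv[3];
        iv[3] += iv[4];
        iv[4] += iv[5];
    }
    return iv[0];
}
```

On a pipelined CPU the first form is bound by the latency of a single
operation, while the second can get closer to the throughput limit.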

I tried it on my Alpha PCA56, but I keep getting a SIGFPE on the first
float operation.  I disabled the first float add test, and got these
numbers:

100 ; needed     3 cycles ->     3 cycles per operation
100 iv[0]+=iv[1];iv[1]+=iv[0]; needed   229 cycles ->   114 cycles per operation
100 iv[0]*=iv[1];iv[1]*=iv[0]; needed  1827 cycles ->   913 cycles per operation
100 fv[0]*=fv[1];fv[1]*=fv[0]; needed   804 cycles ->   402 cycles per operation
100 iv[0]+=iv[1];iv[1]+=iv[2];iv[2]+=iv[3];iv[3]+=iv[4];iv[4]+=iv[5]; needed   282 cycles ->    56 cycles per operation
100 iv[0]*=iv[1];iv[1]*=iv[2];iv[2]*=iv[3];iv[3]*=iv[4];iv[4]*=iv[5]; needed  2032 cycles ->   406 cycles per operation
100 fv[0]+=fv[1];fv[1]+=fv[2];fv[2]+=fv[3];fv[3]+=fv[4];fv[4]+=fv[5]; needed   511 cycles ->   102 cycles per operation
100 fv[0]*=fv[1];fv[1]*=fv[2];fv[2]*=fv[3];fv[3]*=fv[4];fv[4]*=fv[5]; needed   511 cycles ->   102 cycles per operation

Can someone (Falk?) explain the FPE?  It goes away with -mieee, but
doing so slows things down a little:

100 ; needed     3 cycles ->     3 cycles per operation
100 iv[0]+=iv[1];iv[1]+=iv[0]; needed   205 cycles ->   102 cycles per operation
100 iv[0]*=iv[1];iv[1]*=iv[0]; needed  1804 cycles ->   902 cycles per operation
100 fv[0]+=fv[1];fv[1]+=fv[0]; needed 13892 cycles ->  6946 cycles per operation
100 fv[0]*=fv[1];fv[1]*=fv[0]; needed   822 cycles ->   411 cycles per operation
100 iv[0]+=iv[1];iv[1]+=iv[2];iv[2]+=iv[3];iv[3]+=iv[4];iv[4]+=iv[5]; needed   262 cycles ->    52 cycles per operation
100 iv[0]*=iv[1];iv[1]*=iv[2];iv[2]*=iv[3];iv[3]*=iv[4];iv[4]*=iv[5]; needed  2011 cycles ->   402 cycles per operation
100 fv[0]+=fv[1];fv[1]+=fv[2];fv[2]+=fv[3];fv[3]+=fv[4];fv[4]+=fv[5]; needed   676 cycles ->   135 cycles per operation
100 fv[0]*=fv[1];fv[1]*=fv[2];fv[2]*=fv[3];fv[3]*=fv[4];fv[4]*=fv[5]; needed   676 cycles ->   135 cycles per operation
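
To put "a little" into numbers: the per-operation column appears to be
the total cycle count divided, with truncation, by the number of
statements in the test (e.g. 511 / 5 = 102). The sketch below redoes that
arithmetic and computes the relative slowdown between two runs. The
helper names are invented here, and dividing by 1 for the empty test is
only a guess based on the "3 cycles -> 3 cycles" line.

```c
#include <assert.h>

/* Per-operation figure as it seems to be printed: integer division of
 * the total by the statement count (assumed: empty test divides by 1). */
unsigned cycles_per_op(unsigned total_cycles, unsigned ops)
{
    return total_cycles / (ops ? ops : 1);
}

/* Slowdown of `with_flag` relative to `without_flag`, in whole percent,
 * truncated toward zero. Negative means the second run was faster. */
int slowdown_percent(unsigned without_flag, unsigned with_flag)
{
    assert(without_flag != 0);
    return (int)(((long)with_flag - (long)without_flag) * 100
                 / (long)without_flag);
}
```

With the numbers above, slowdown_percent(511, 676) gives 32 for the
five-statement float chains, and slowdown_percent(804, 822) gives 2 for
the dependent float multiply.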


-- 
Måns Rullgård
mru at inprovide.com




