[Ffmpeg-devel] int vs. float: Hard Numbers

Michael Niedermayer michaelni
Sat May 21 01:04:05 CEST 2005


Hi

On Friday 20 May 2005 23:35, Attila Kinali wrote:
> Heyo,
>
> On Fri, 20 May 2005 12:13:24 -0600
>
> Mike Melanson <mike at multimedia.cx> wrote:
> > integer_adder() (10 adds) returned 50, 36 cycles used
> > float_adder() (10 adds) returned 50.000000, 36 cycles used
> > integer_mult() (10 mults) returned 9765625, 115 cycles used
> > float_mult() (10 mults) returned 9765625.000000, 36 cycles used
>
> integer_adder() (10 adds) returned 50, 46 cycles used
> integer_adder_nl() (10 adds) returned 20, 47 cycles used
> float_adder() (10 adds) returned 50.000000, 65 cycles used
> integer_mult() (10 mults) returned 9765625, 87 cycles used
> float_mult() (10 mults) returned 9765625.000000, 85 cycles used
>
> This was performed on a PentiumM:
> processor       : 0
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 13
> model name      : Intel(R) Pentium(R) M processor 1.70GHz
> stepping        : 6
> cpu MHz         : 1699.061
> cache size      : 2048 KB
> fdiv_bug        : no
> hlt_bug         : no
> f00f_bug        : no
> coma_bug        : no
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 2
> wp              : yes
> flags           : fpu vme de pse tsc msr mce cx8 sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss tm pbe est tm2
> bogomips        : 3358.72
>
> integer_adder_nl is a small hack to see how much the result
> depends on the register stall, i.e. I replaced the addition
> loop by:
> ---
>   add   ecx, 5
>   add   eax, 5
>   add   edx, 5
>   add   ecx, 5
>   add   eax, 5
>   add   edx, 5
>   add   ecx, 5
>   add   eax, 5
>   add   edx, 5
>   add   ecx, 5
> ----
>
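For reference, the difference between the two adder variants can be sketched
in C. This is only an illustration of the dependency structure, not the
benchmark source: the names chained_add and split_add and the step parameter
are made up here, and an optimizing compiler is free to collapse either
version, which is presumably why the test used hand-written asm. Unlike the
asm above, the sketch sums the three accumulators at the end so that both
variants return the same value.

```c
/* Ten adds on a single accumulator: every add has to wait for the
 * result of the one before it (one long dependency chain). */
int chained_add(int step)
{
    int a = 0;
    a += step; a += step; a += step; a += step; a += step;
    a += step; a += step; a += step; a += step; a += step;
    return a;
}

/* The same ten adds spread over three accumulators, in the same
 * a/b/c rotation as the ecx/eax/edx sequence quoted above.
 * Consecutive adds do not depend on each other, so the CPU can
 * overlap them. */
int split_add(int step)
{
    int a = 0, b = 0, c = 0;
    a += step; b += step; c += step;
    a += step; b += step; c += step;
    a += step; b += step; c += step;
    a += step;
    /* Two extra adds to combine the partial sums; not in the asm version. */
    return a + b + c;
}
```

With step = 5, the first accumulator alone ends at 20 after its four adds,
which is consistent with the result reported for integer_adder_nl.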
> For another test I added a for loop in front of each test function
> with 1000 iterations to check for cache dependencies. Interestingly,
> I got exactly the same numbers as before.

I guess the function was inlined, so the copy being timed wasn't in the code
cache after the loop; only the inlined copy inside the loop was. But that's
just a guess.
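One way to check that guess would be to forbid inlining, so that the warm-up
loop and the measured call run the very same copy of the code. A minimal
sketch, assuming GCC or Clang (the noinline attribute is theirs); the body of
integer_adder and the helper warm_then_call are stand-ins invented here, not
the code from the benchmark:

```c
/* noinline is a GCC/Clang function attribute; on other compilers this
 * macro expands to nothing and inlining is not prevented. */
#if defined(__GNUC__) || defined(__clang__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Stand-in for the benchmarked function: ten adds of 5. */
NOINLINE int integer_adder(void)
{
    volatile int step = 5;  /* volatile read keeps the sum from being folded */
    int sum = 0;
    for (int i = 0; i < 10; i++)
        sum += step;
    return sum;
}

/* Call the function `iterations` times to pull its code into the
 * instruction cache, then call it once more. The cycle counter reads
 * would go around that final call. */
int warm_then_call(int iterations)
{
    for (int i = 0; i < iterations; i++)
        (void)integer_adder();
    return integer_adder();
}
```

If the numbers still do not change with the function kept out of line, then
inlining was not the explanation.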


> I assume that the 2MB L2 
> cache still contains the content of the program from the loading
> operation of the OS.

Yes, probably, but L2 != L1, and L1 is normally split between data and code,
so the code must have been executed to be in the L1 code cache.

[...]
-- 
Michael




