[FFmpeg-devel] [RFC] Loop unrolling in C code for 'vector_fmul_*' functions

Michael Niedermayer michaelni
Mon Jan 14 04:13:44 CET 2008

On Sun, Jan 13, 2008 at 06:05:48PM +0100, Vadim Lebedev wrote:
> I'm running your program as follows:
> gcc 4.1.2 -O3 -fomit-frame-pointer -msse -o vector_fmul_test vector_fmul_test.c
> ./vector_fmul_test 2000
> And the output is:
> Function: 'vector_fmul_c', time=73.910 (cycles/element=288.713)
> Function: 'vector_fmul_c_unrolled', time=73.010 (cycles/element=285.195)
> Function: 'vector_fmul_c_other_unrolled', time=72.999 (cycles/element=285.152)
> Function: 'vector_fmul_c_simd', time=0.141 (cycles/element=0.552)
> Any idea why it is so slow (except simd case)?

If I had to guess ...
maybe something to do with overflows and floating-point exceptions.
Try setting the arrays to 1.0.
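
A rough sketch of what I mean, not taken from the test program: it assumes
the scalar routine multiplies in place (dst[i] *= src[i]) and that the
benchmark calls it repeatedly on the same arrays. The helper names
(is_normal_value, count_non_normal) and LEN are made up for illustration.
With any start value other than 1.0 the products run off to infinity or
shrink toward zero after enough passes, and such values can be much slower
on the scalar FPU path.

```c
#include <float.h>

#define LEN 64

/* Assumed shape of the scalar routine under test: in-place multiply. */
static void vector_fmul_c(float *dst, const float *src, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] *= src[i];
}

/* Hypothetical helper: true only for ordinary (normal) float values.
   NaN, infinity, zero and subnormals all fail the range check. */
static int is_normal_value(float x)
{
    float a = x < 0 ? -x : x;
    return a >= FLT_MIN && a <= FLT_MAX;
}

/* Hypothetical helper: fill both arrays with `init`, run `iters` passes,
   and count how many results are no longer normal values. */
static int count_non_normal(float init, int iters)
{
    float dst[LEN], src[LEN];
    int bad = 0;

    for (int i = 0; i < LEN; i++)
        dst[i] = src[i] = init;

    for (int n = 0; n < iters; n++)
        vector_fmul_c(dst, src, LEN);

    for (int i = 0; i < LEN; i++)
        if (!is_normal_value(dst[i]))
            bad++;

    return bad;
}
```

Calling count_non_normal(1.0f, 2000) leaves every element at 1.0, while a
start value of 2.0f or 0.5f ends with every element outside the normal
range after the same 2000 passes.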

Also, there's another flaw in the test: the arrays should be global or
volatile or so, otherwise gcc could optimize the functions out
completely (yeah, if it had a microscopic speck of intelligence it would
realize they are never read ...)
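
Something along these lines is what I have in mind; it is only a sketch
under assumptions, not the actual test. The names g_dst, g_src, g_sink and
run_benchmark are hypothetical, and the in-place form of vector_fmul_c is
assumed. The arrays have external linkage and the result is written to a
volatile variable, so the compiler has to treat the computation as
observable. This is a common precaution rather than a hard guarantee
(whole-program optimization could still see through the globals).

```c
#define LEN 256

/* Global arrays with external linkage: the compiler must assume that
   other translation units may read them, so the stores have to happen. */
float g_dst[LEN];
float g_src[LEN];

/* Volatile sink: every write to it counts as an observable side effect. */
volatile float g_sink;

/* Assumed shape of the routine under test: in-place multiply. */
static void vector_fmul_c(float *dst, const float *src, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] *= src[i];
}

/* Hypothetical benchmark body: initialize to 1.0, run `iters` passes,
   then consume the result so the work cannot be discarded as dead code. */
float run_benchmark(int iters)
{
    float sum = 0.0f;

    for (int i = 0; i < LEN; i++)
        g_dst[i] = g_src[i] = 1.0f;

    for (int n = 0; n < iters; n++)
        vector_fmul_c(g_dst, g_src, LEN);

    for (int i = 0; i < LEN; i++)
        sum += g_dst[i];

    g_sink = sum;
    return sum;
}
```

The timing calls would go around the middle loop only; the final sum is
there purely so the results are read by something.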

Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Dictatorship naturally arises out of democracy, and the most aggravated
form of tyranny and slavery out of the most extreme liberty. -- Plato