[FFmpeg-devel] [RFC] Loop unrolling in C code for 'vector_fmul_*' functions
Fri Jan 11 00:44:07 CET 2008
On Jan 11, 2008 7:39 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Tue, Jan 08, 2008 at 02:20:07AM +0200, Siarhei Siamashka wrote:
> > But at least for ARM, it looks like the compiler is quite stupid and can't
> > schedule instructions properly, as seen from the benchmark results (just
> > unrolling the loop is not enough and some extra tweaks are needed
> > in 'vector_fmul_c_other_unrolled'). The VFP coprocessor has a high result
> > latency (8 cycles), though throughput is quite good (1 cycle), and some other
> > features which can improve performance exist (documentation for VFP can be
> > found at http://www.arm.com). The compiler (gcc) does not even try to reorder
> > instructions and the pipeline is just stalled most of the time. I would not be
> > surprised if the compiler screwed up and generated something suboptimal for
> > more complicated floating point stuff as well (fft and imdct).
> Please submit reports to the gcc devels for every case of suboptimal code
> generated by gcc you stumble across!
> It's much better if gcc is improved instead of everyone having to
> hand-schedule C code.
> > By tweaking the C code, performance can be improved quite a lot
> > ('vector_fmul_c_other_unrolled' vs. 'vector_fmul_c_unrolled').
> > But unnecessarily cluttering the code because of inefficient compilers is not
> > a good option. Anyway, probably at least the loops can be unrolled to let
> > the compiler do its job? The compiler itself does not know that 'len is a
> > multiple of 8', so manual loop unrolling seems to be reasonable.
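
For reference, here is a minimal sketch of the kind of manual unrolling
being discussed. It assumes that vector_fmul multiplies dst by src
element-wise in place and that len is a positive multiple of 8 (so also of
4; the loops are unrolled by four only to keep the sketch short). The
function names are made up for illustration and are not the benchmarked
'vector_fmul_c_unrolled' / 'vector_fmul_c_other_unrolled' code. The last
variant groups the loads ahead of the multiplies, which is one possible way
to give independent operations room to overlap on a unit with a long result
latency.

```c
/* Reference loop. Assumed semantics: dst[i] *= src[i]. */
static void fmul_ref(float *dst, const float *src, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] *= src[i];
}

/* Hypothetical plain unrolling by four; len must be a multiple of 4. */
static void fmul_unrolled4(float *dst, const float *src, int len)
{
    for (int i = 0; i < len; i += 4) {
        dst[i + 0] *= src[i + 0];
        dst[i + 1] *= src[i + 1];
        dst[i + 2] *= src[i + 2];
        dst[i + 3] *= src[i + 3];
    }
}

/* Hypothetical hand-scheduled variant: do all loads first, then the
 * four independent multiplies, then all stores. len must be a
 * multiple of 4. */
static void fmul_sched4(float *dst, const float *src, int len)
{
    for (int i = 0; i < len; i += 4) {
        float a0 = dst[i + 0], a1 = dst[i + 1];
        float a2 = dst[i + 2], a3 = dst[i + 3];
        float b0 = src[i + 0], b1 = src[i + 1];
        float b2 = src[i + 2], b3 = src[i + 3];

        a0 *= b0;
        a1 *= b1;
        a2 *= b2;
        a3 *= b3;

        dst[i + 0] = a0;
        dst[i + 1] = a1;
        dst[i + 2] = a2;
        dst[i + 3] = a3;
    }
}
```

All three compute the same result; whether the last one is actually faster
depends on the compiler and the target, so it would need benchmarking.
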
> Add an assert((len & 7) == 0); and the compiler can know it.
That is a really interesting statement. Are you saying that gcc will
optimize better when such an assert is added? This is the first I have
heard of this. Such code annotations could probably help in many places.
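
As far as I understand the idea, it would look something like the sketch
below (the function name is made up, and the dst[i] *= src[i] semantics are
an assumption). With NDEBUG undefined, assert() expands to a check whose
failure path aborts and never returns, so on the code after it the compiler
is entitled to assume (len & 7) == 0. Whether a given gcc version actually
exploits that is something to check in the generated assembly; the sketch
does not guarantee it.

```c
#include <assert.h>

/* Illustrative name. Assumed semantics: dst[i] *= src[i]. */
static void fmul_hinted(float *dst, const float *src, int len)
{
    /* If the check fails the program aborts, so past this line the
     * compiler may treat len as a multiple of 8. */
    assert((len & 7) == 0);

    for (int i = 0; i < len; i++)
        dst[i] *= src[i];
}
```

Note that building with NDEBUG defined removes the assert entirely, and
with it whatever information the compiler could have taken from it.
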
> > Well, I will do the rest of ARM VFP optimizations for all
> > these 'vector_fmul_*' functions anyway :)
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> In a rich man's house there is no place to spit but his face.
> -- Diogenes of Sinope