[FFmpeg-cvslog] r16207 - trunk/libavcodec/h264.c

Michael Niedermayer michaelni
Thu Dec 18 12:18:52 CET 2008


On Thu, Dec 18, 2008 at 04:27:34AM +0000, Måns Rullgård wrote:
> Michael Niedermayer <michaelni at gmx.at> writes:
> 
> > On Thu, Dec 18, 2008 at 03:57:54AM +0000, Måns Rullgård wrote:
> >> Michael Niedermayer <michaelni at gmx.at> writes:
> >> 
> >> > On Thu, Dec 18, 2008 at 02:57:17AM +0000, Måns Rullgård wrote:
> >> >> michael <subversion at mplayerhq.hu> writes:
> >> >> 
> >> >> > Author: michael
> >> >> > Date: Thu Dec 18 03:53:18 2008
> >> >> > New Revision: 16207
> >> >> >
> >> >> > Log:
> >> >> > Use the new idct functions (except chroma as it was slower in benchmarks)
> >> >> > cathedral +0.5% speed
> >> >> > aladin +0.6% speed [note aladin has been cat-ed 10 times to reduce the influence
> >> >> > of init time]
> >> >> > Speedup also verified via START/STOP_TIMER (difference was very significant
> >> >> > for the changed parts)
> >> >> 
> >> >> How much does this hurt on architectures that don't yet have the new
> >> >> SIMD functions?
> >> >
> >> > there are no really new SIMD functions.
> >> > I just moved the loops like
> >> > for(i=0; i<16; i++)
> >> >     dsp->idct4x4_add(blah blah);
> >> >
> >> > into dsputil so they are
> >> >
> >> > for(i=0; i<16; i++)
> >> >     idct4x4_add_simdwhatever(blah blah);
> >> >
> >> > that way gcc can inline the function and avoids up to 15 calls through dsp->
> >> >
> >> > adding support for this to your favorite architecture is a matter of copy
> >> > & paste and adjusting the function names.
> >> 
> >> I can see how it can be done.  I'm asking how much of an impact this
> >> has on performance until it's been done.  What percentage of the old
> >> calls are affected?
> >
> > That depends on the video. If I have to guess, and it is just a guess, I'd
> > say more than 50% of the idct work will go to the new code.
> 
> That's enough to give a significant slowdown on ARM.  How about a
> little coordination next time?

I am not sure what you are suggesting.

I could of course send out a warning like
"I will, in an hour (if it passes tests and benchmarks), commit code that will
 require arch-specific optimizations to be updated; until that update they
 will be slower. I will update x86 myself."

But I don't think that would have helped you.
And waiting for ppc, sparc, alpha, sh4, ... to be updated is like not
committing the change at all ...

Besides, if the code above had failed the benchmarks, the warning would have
been a false alarm ...
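
For readers following the thread, the loop move quoted above can be sketched
roughly as below. This is only an illustration under stated assumptions: the
names DSPSketch, idct4x4_add_c, idct4x4_add16_c, decode_mb_old and
decode_mb_new are hypothetical, and the kernel is a placeholder that adds
coefficients with clipping, not the real H.264 transform. It shows 16 indirect
calls being replaced by one indirect call whose loop sits next to the kernel,
so the compiler is free to inline it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Clamp an int to the 0..255 pixel range. */
static inline uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical stand-in for an arch-specific 4x4 kernel: adds 16 coefficients
 * to a 4x4 pixel block. NOT the real H.264 inverse transform. */
static inline void idct4x4_add_c(uint8_t *dst, const int16_t *block, int stride)
{
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            dst[y * stride + x] = clip_u8(dst[y * stride + x] + block[y * 4 + x]);
}

/* Hypothetical dispatch table, loosely modelled on the idea of dsp->. */
typedef struct DSPSketch {
    void (*idct4x4_add)(uint8_t *dst, const int16_t *block, int stride);
    void (*idct4x4_add16)(uint8_t *dst, const int *block_offset,
                          const int16_t *block, int stride);
} DSPSketch;

/* New style: the loop lives beside the kernel, so the 16 inner calls are
 * direct and can be inlined by the compiler. */
static void idct4x4_add16_c(uint8_t *dst, const int *block_offset,
                            const int16_t *block, int stride)
{
    assert(stride >= 4);
    for (int i = 0; i < 16; i++)
        idct4x4_add_c(dst + block_offset[i], block + i * 16, stride);
}

/* Fill the table with the C versions; an arch port would plug in its own. */
static void dsp_sketch_init(DSPSketch *dsp)
{
    dsp->idct4x4_add   = idct4x4_add_c;
    dsp->idct4x4_add16 = idct4x4_add16_c;
}

/* Old style: the decoder loops and pays 16 calls through the pointer. */
static void decode_mb_old(const DSPSketch *dsp, uint8_t *dst,
                          const int *block_offset, const int16_t *block, int stride)
{
    for (int i = 0; i < 16; i++)
        dsp->idct4x4_add(dst + block_offset[i], block + i * 16, stride);
}

/* New style: the decoder makes a single call through the pointer. */
static void decode_mb_new(const DSPSketch *dsp, uint8_t *dst,
                          const int *block_offset, const int16_t *block, int stride)
{
    dsp->idct4x4_add16(dst, block_offset, block, stride);
}
```

In this sketch an architecture that only fills in idct4x4_add would need a
matching idct4x4_add16 wrapper around its own kernel to benefit, which is the
"copy & paste and adjusting the function names" step mentioned in the quote.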


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

When the tyrant has disposed of foreign enemies by conquest or treaty, and
there is nothing more to fear from them, then he is always stirring up
some war or other, in order that the people may require a leader. -- Plato