[FFmpeg-cvslog] r16207 - trunk/libavcodec/h264.c

Måns Rullgård mans
Thu Dec 18 05:27:34 CET 2008


Michael Niedermayer <michaelni at gmx.at> writes:

> On Thu, Dec 18, 2008 at 03:57:54AM +0000, Måns Rullgård wrote:
>> Michael Niedermayer <michaelni at gmx.at> writes:
>> 
>> > On Thu, Dec 18, 2008 at 02:57:17AM +0000, Måns Rullgård wrote:
>> >> michael <subversion at mplayerhq.hu> writes:
>> >> 
>> >> > Author: michael
>> >> > Date: Thu Dec 18 03:53:18 2008
>> >> > New Revision: 16207
>> >> >
>> >> > Log:
>> >> > Use the new idct functions (except chroma as it was slower in benchmarks)
>> >> > cathedral +0.5% speed
>> >> > aladin +0.6% speed [note aladin has been cat-ed 10 times to reduce the influence
>> >> > of init time]
>> >> > Speedup also verified via START/STOP_TIMER (difference was very significant
>> >> > for the changed parts)
>> >> 
>> >> How much does this hurt on architectures that don't yet have the new
>> >> SIMD functions?
>> >
>> > there are no really new SIMD functions.
>> > I just moved the loops like
>> > for(i=0; i<16; i++)
>> >     dsp->idct4x4_add(blah blah);
>> >
>> > into dsputil so they are
>> >
>> > for(i=0; i<16; i++)
>> >     idct4x4_add_simdwhatever(blah blah);
>> >
>> > that way gcc can inline the function and avoids up to 15 calls through dsp->
>> >
>> > adding support for this to your favorite architecture is a matter of copy
>> > & paste and adjusting the function names.
>> 
>> I can see how it can be done.  I'm asking how much of an impact this
>> has on performance until it's been done.  What percentage of the old
>> calls are affected?
>
> depends on the video; if I have to guess, and that's just a guess, I'd say
> more than 50% of the idct work will go to new code

That's enough to give a significant slowdown on ARM.  How about a
little coordination next time?

-- 
Måns Rullgård
mans at mansr.com
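
For readers following the thread, the restructuring described in the quoted text can be sketched roughly as below. This is only an illustration under stated assumptions: ToyDSPContext, toy_idct4x4_add_c, toy_idct4x4_add16_c and the decode_mb_* helpers are hypothetical names, the signatures are not the ones in dsputil, and the "transform" is a clamped add standing in for the real IDCT. It shows 16 indirect calls through a function-pointer table being replaced by one indirect call to a routine that contains the loop, so the compiler can inline the per-block kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Clamp an int to the 0..255 pixel range. */
static inline uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical per-block kernel. NOT the real H.264 IDCT: it only adds
 * each coefficient to the destination pixel, to keep the sketch short. */
static inline void toy_idct4x4_add_c(uint8_t *dst, const int16_t *block, int stride)
{
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            dst[y * stride + x] = clip_u8(dst[y * stride + x] + block[y * 4 + x]);
}

/* Batched variant: the loop sits next to the kernel, so the call inside
 * it is direct and can be inlined by the compiler. */
static void toy_idct4x4_add16_c(uint8_t *dst, const int *block_offset,
                                const int16_t *block, int stride)
{
    assert(stride >= 4);
    for (int i = 0; i < 16; i++)
        toy_idct4x4_add_c(dst + block_offset[i], block + i * 16, stride);
}

/* Hypothetical function-pointer table, standing in for a DSP context. */
typedef struct ToyDSPContext {
    void (*idct4x4_add)(uint8_t *dst, const int16_t *block, int stride);
    void (*idct4x4_add16)(uint8_t *dst, const int *block_offset,
                          const int16_t *block, int stride);
} ToyDSPContext;

static void toy_dsp_init(ToyDSPContext *c)
{
    c->idct4x4_add   = toy_idct4x4_add_c;
    c->idct4x4_add16 = toy_idct4x4_add16_c;
}

/* Old pattern: 16 indirect calls through the table per macroblock. */
static void decode_mb_old(const ToyDSPContext *dsp, uint8_t *dst,
                          const int *block_offset, const int16_t *block, int stride)
{
    for (int i = 0; i < 16; i++)
        dsp->idct4x4_add(dst + block_offset[i], block + i * 16, stride);
}

/* New pattern: a single indirect call; the loop runs inside the callee. */
static void decode_mb_new(const ToyDSPContext *dsp, uint8_t *dst,
                          const int *block_offset, const int16_t *block, int stride)
{
    dsp->idct4x4_add16(dst, block_offset, block, stride);
}
```

In this sketch, an architecture that fills in only the per-block pointer with an optimised routine would still get the plain C loop behind the batched pointer, which is the kind of regression the question above is about.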
