[FFmpeg-devel] [PATCH] VC-1 MMX DSP functions
Sat Oct 20 17:05:58 CEST 2007
Michael Niedermayer wrote:
> of course it is, unless the codec does not get faster overall. its possible
That's a very good point, since I'm optimizing functions that account
for less than 10% of CPU time on the sequence I use. Overall timings are
given below, measured with Andreas's program over 20 runs once the
stddev was low enough.
text section size: 373454
time: avg: 3.865 stddev: 0.026 med: 3.868
text section size: 373214
time: avg: 3.811 stddev: 0.019 med: 3.806
text section size:
time: avg: 3.830 stddev: 0.016 med: 3.826
This is on a Core2 running in 32-bit mode, with gcc 4.2.1.
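For anyone wanting to reproduce figures of this shape, here is a minimal sketch of how avg/stddev/med can be derived from a list of wall-clock run times. It is not Andreas's program; the names `summarize_runs` and `speedup_percent` are made up for illustration, and the use of the population standard deviation is an assumption.

```python
import statistics


def summarize_runs(times):
    """Return (avg, stddev, med) for a list of run times in seconds."""
    if not times:
        raise ValueError("need at least one run time")
    avg = statistics.mean(times)
    # Population standard deviation; the real tool may use the sample form.
    stddev = statistics.pstdev(times)
    med = statistics.median(times)
    return avg, stddev, med


def speedup_percent(base, new):
    """Relative improvement of `new` over `base`, in percent."""
    return 100.0 * (base - new) / base
```

With this, `speedup_percent(3.865, 3.811)` would compare the first two averages listed above.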
> in principle (though i dont think thats the case here) that one function gets
> faster but the increase in code size would make the codec overall slower due
> to code cache issues, but again i dont think thats the case here, 10% speedup
> is great!
I've tested the code on a 64-bit system, and the speedup for the
unrolled version was lower. I guess the speedup has partly to do with
> if you could make the motion compensation code from h.264 10% faster iam sure
> you would get a lot of fans ;)
More than with VC-1, for sure. From what I've seen, there are some
special cases I wouldn't have thought of that make this code already
faster than what I would probably have come up with.
>> Attached patch allows to test/verify/report those figures.
> iam glad its just for test/verify/report
> one patch less to review :)
> or did you want a review?
I sent the earlier patch so that people could confirm the timings I
measured, or point out anything they spotted hurting performance when
comparing one version to another.
As no one did that, I guess the final version should have the unrolled
version and not the special case. Anyway, since you are my reviewer, I'm
waiting for your decision on which it should be before submitting that
final version.