[FFmpeg-devel] [PATCH] 'vorbis_residue_decode' optimizations

Michael Niedermayer michaelni
Sat Aug 30 23:55:39 CEST 2008


On Sat, Aug 30, 2008 at 11:42:31PM +0300, Siarhei Siamashka wrote:
> On Saturday 30 August 2008, Loren Merritt wrote:
> > On Sat, 30 Aug 2008, Siarhei Siamashka wrote:
> > > This trivial patch improves overall vorbis decoding performance by ~3% on
> > > Pentium-M with gcc 4.2.3
> >
> > vorbis_residue_decode_type# are superfluous. Just inline
> > vorbis_residue_decode_internal into vorbis_residue_decode.
> 
> Theoretically they are superfluous (inlining vorbis_residue_decode_internal
> into vorbis_residue_decode was the first thing that I tried). But in practice
> code is consistently faster this way. Probably it is easier for gcc to
> optimize 3 independent functions than everything bundled into a huge one. Let
> me know if you get different results.

Well, I do get different results:

[...]
> --------------------
> callgrind simulation for './ffmpeg_g.1huge' (L1 data cache is 32K):
> I   refs:      85,817,091
> D   refs:      43,457,905  (28,888,575 rd + 14,569,330 wr)
> D1  misses:       785,564  (   583,645 rd +    201,919 wr)
> D1  miss rate:        1.8% (       2.0%   +        1.3%  )
> callgrind simulation for './ffmpeg_g.3func' (L1 data cache is 32K):
> I   refs:      85,085,997
> D   refs:      42,653,212  (28,454,961 rd + 14,198,251 wr)
> D1  misses:       782,978  (   581,685 rd +    201,293 wr)
> D1  miss rate:        1.8% (       2.0%   +        1.4%  )
> 
> The difference is visible both for the total number of instructions and for 
> the number of memory accesses.

loren:
I   refs:      5,663,789,738
I1  misses:        3,515,218
I1  miss rate:          0.06%
D   refs:      1,889,318,408  (1,365,757,445 rd   + 523,560,963 wr)
D1  misses:       32,073,499  (   22,443,938 rd   +   9,629,561 wr)
D1  miss rate:           1.6% (          1.6%     +         1.8%  )

siar:
I   refs:      5,670,795,747
I1  misses:        3,488,120
I1  miss rate:          0.06%
D   refs:      1,896,279,210  (1,372,731,243 rd   + 523,547,967 wr)
D1  misses:       32,096,476  (   22,464,805 rd   +   9,631,671 wr)
D1  miss rate:           1.6% (          1.6%     +         1.8%  )
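
To put the two runs side by side, here is a small sketch of the arithmetic. It assumes "loren" is the variant with vorbis_residue_decode_internal inlined and "siar" is the three-function variant; the helper name rel_diff_percent is made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: relative difference of `other` against `base`,
 * expressed in percent. Positive means `other` is larger than `base`. */
double rel_diff_percent(uint64_t base, uint64_t other)
{
    assert(base != 0); /* a zero baseline has no meaningful relative difference */
    return ((double)other - (double)base) * 100.0 / (double)base;
}
```

Called with the I refs totals above, rel_diff_percent(5663789738, 5670795747) gives roughly 0.12, and with the D refs totals, rel_diff_percent(1889318408, 1896279210) gives roughly 0.37. So in this run the "siar" build executes slightly more instructions and memory accesses than the "loren" build, not fewer.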


I'll commit the clean version without the dummy functions in a day or two
unless someone objects or has an idea of how to improve it.
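
For readers who have not looked at the patch, the two layouts under discussion can be sketched roughly as below. This is only an illustration under stated assumptions: decode_internal, decode_type0/1/2, decode_3func and decode_inlined are hypothetical names, the signature is heavily simplified compared to the real vorbis_residue_decode_internal, and the per-type arithmetic is a placeholder rather than actual residue decoding.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__)
#define ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define ALWAYS_INLINE inline
#endif

/* Hypothetical stand-in for vorbis_residue_decode_internal(). When `type`
 * is a compile-time constant at the call site, the compiler can drop the
 * branches that do not apply. */
static ALWAYS_INLINE int decode_internal(float *vec, size_t len, int type)
{
    assert(vec != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (type == 2)
            vec[i] += 2.0f;   /* placeholder for type 2 handling */
        else if (type == 1)
            vec[i] += 1.0f;   /* placeholder for type 1 handling */
        else
            vec[i] += 0.5f;   /* placeholder for type 0 handling */
    }
    return 0;
}

/* Layout A: one small wrapper per residue type (the "dummy functions"). */
static int decode_type0(float *vec, size_t len) { return decode_internal(vec, len, 0); }
static int decode_type1(float *vec, size_t len) { return decode_internal(vec, len, 1); }
static int decode_type2(float *vec, size_t len) { return decode_internal(vec, len, 2); }

int decode_3func(float *vec, size_t len, int type)
{
    switch (type) {
    case 0:  return decode_type0(vec, len);
    case 1:  return decode_type1(vec, len);
    case 2:  return decode_type2(vec, len);
    default: return -1;       /* unknown residue type */
    }
}

/* Layout B: no wrappers; the helper is expanded directly in the
 * dispatcher, once per constant type. */
int decode_inlined(float *vec, size_t len, int type)
{
    if (type == 2)
        return decode_internal(vec, len, 2);
    else if (type == 1)
        return decode_internal(vec, len, 1);
    else if (type == 0)
        return decode_internal(vec, len, 0);
    return -1;                /* unknown residue type */
}
```

Both layouts compute the same result; they differ only in how the specialized code is split across functions, which is why the choice between them comes down to measurements like the ones above.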

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

It is not what we do, but why we do it that matters.