[FFmpeg-devel] [PATCH] Altivec version of h264_idct_add

David Conrad umovimus
Sat Jun 2 00:52:57 CEST 2007


Hi,

This is an updated version of ff_h264_idct_add_altivec, based on a
patch by Mauricio Alvarez [1]. It's about 1.9 times faster than the
scalar version on my G4. Regression tests pass except for seektest,
which is currently broken for me with vanilla SVN (should it work?).

170 dezicycles in ff_h264_idct_add_altivec, 1 runs, 0 skips
150 dezicycles in ff_h264_idct_add_altivec, 2 runs, 0 skips
287 dezicycles in ff_h264_idct_add_altivec, 4 runs, 0 skips
203 dezicycles in ff_h264_idct_add_altivec, 8 runs, 0 skips
131 dezicycles in ff_h264_idct_add_altivec, 16 runs, 0 skips
79 dezicycles in ff_h264_idct_add_altivec, 32 runs, 0 skips
53 dezicycles in ff_h264_idct_add_altivec, 64 runs, 0 skips
33 dezicycles in ff_h264_idct_add_altivec, 128 runs, 0 skips
23 dezicycles in ff_h264_idct_add_altivec, 256 runs, 0 skips
18 dezicycles in ff_h264_idct_add_altivec, 512 runs, 0 skips
15 dezicycles in ff_h264_idct_add_altivec, 1024 runs, 0 skips
14 dezicycles in ff_h264_idct_add_altivec, 2048 runs, 0 skips
14 dezicycles in ff_h264_idct_add_altivec, 4096 runs, 0 skips
13 dezicycles in ff_h264_idct_add_altivec, 8192 runs, 0 skips
14 dezicycles in ff_h264_idct_add_altivec, 16384 runs, 0 skips
14 dezicycles in ff_h264_idct_add_altivec, 32768 runs, 0 skips
14 dezicycles in ff_h264_idct_add_altivec, 65536 runs, 0 skips

210 dezicycles in ff_h264_idct_add_c, 1 runs, 0 skips
215 dezicycles in ff_h264_idct_add_c, 2 runs, 0 skips
180 dezicycles in ff_h264_idct_add_c, 4 runs, 0 skips
145 dezicycles in ff_h264_idct_add_c, 8 runs, 0 skips
94 dezicycles in ff_h264_idct_add_c, 16 runs, 0 skips
74 dezicycles in ff_h264_idct_add_c, 32 runs, 0 skips
53 dezicycles in ff_h264_idct_add_c, 64 runs, 0 skips
40 dezicycles in ff_h264_idct_add_c, 128 runs, 0 skips
33 dezicycles in ff_h264_idct_add_c, 256 runs, 0 skips
30 dezicycles in ff_h264_idct_add_c, 512 runs, 0 skips
28 dezicycles in ff_h264_idct_add_c, 1024 runs, 0 skips
28 dezicycles in ff_h264_idct_add_c, 2048 runs, 0 skips
28 dezicycles in ff_h264_idct_add_c, 4096 runs, 0 skips
27 dezicycles in ff_h264_idct_add_c, 8190 runs, 2 skips
27 dezicycles in ff_h264_idct_add_c, 16381 runs, 3 skips
27 dezicycles in ff_h264_idct_add_c, 32764 runs, 4 skips
28 dezicycles in ff_h264_idct_add_c, 65528 runs, 8 skips

190 dezicycles in ff_h264_idct_add_altivec, 1 runs, 0 skips
160 dezicycles in ff_h264_idct_add_altivec, 2 runs, 0 skips
147 dezicycles in ff_h264_idct_add_altivec, 4 runs, 0 skips
101 dezicycles in ff_h264_idct_add_altivec, 8 runs, 0 skips
82 dezicycles in ff_h264_idct_add_altivec, 16 runs, 0 skips
60 dezicycles in ff_h264_idct_add_altivec, 32 runs, 0 skips
39 dezicycles in ff_h264_idct_add_altivec, 64 runs, 0 skips
27 dezicycles in ff_h264_idct_add_altivec, 128 runs, 0 skips
21 dezicycles in ff_h264_idct_add_altivec, 256 runs, 0 skips
17 dezicycles in ff_h264_idct_add_altivec, 512 runs, 0 skips
15 dezicycles in ff_h264_idct_add_altivec, 1024 runs, 0 skips
14 dezicycles in ff_h264_idct_add_altivec, 2048 runs, 0 skips
14 dezicycles in ff_h264_idct_add_altivec, 4096 runs, 0 skips
15 dezicycles in ff_h264_idct_add_altivec, 8192 runs, 0 skips
14 dezicycles in ff_h264_idct_add_altivec, 16384 runs, 0 skips
14 dezicycles in ff_h264_idct_add_altivec, 32768 runs, 0 skips
15 dezicycles in ff_h264_idct_add_altivec, 65533 runs, 3 skips

240 dezicycles in ff_h264_idct_add_c, 1 runs, 0 skips
215 dezicycles in ff_h264_idct_add_c, 2 runs, 0 skips
152 dezicycles in ff_h264_idct_add_c, 4 runs, 0 skips
100 dezicycles in ff_h264_idct_add_c, 8 runs, 0 skips
73 dezicycles in ff_h264_idct_add_c, 16 runs, 0 skips
57 dezicycles in ff_h264_idct_add_c, 32 runs, 0 skips
43 dezicycles in ff_h264_idct_add_c, 64 runs, 0 skips
35 dezicycles in ff_h264_idct_add_c, 128 runs, 0 skips
32 dezicycles in ff_h264_idct_add_c, 256 runs, 0 skips
29 dezicycles in ff_h264_idct_add_c, 512 runs, 0 skips
28 dezicycles in ff_h264_idct_add_c, 1024 runs, 0 skips
27 dezicycles in ff_h264_idct_add_c, 2048 runs, 0 skips
27 dezicycles in ff_h264_idct_add_c, 4096 runs, 0 skips
26 dezicycles in ff_h264_idct_add_c, 8192 runs, 0 skips
26 dezicycles in ff_h264_idct_add_c, 16383 runs, 1 skips
27 dezicycles in ff_h264_idct_add_c, 32766 runs, 2 skips
27 dezicycles in ff_h264_idct_add_c, 65531 runs, 5 skips
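For anyone who wants to check the ratio against the logs above, here is a
minimal sketch in Python. It is not part of the patch. The helper names
(steady_state, speedup) are hypothetical, and it assumes that the line with
the most runs for each function is the steady-state figure. Pairing each
altivec block with the C block that follows it (also an assumption), the
final lines give 28/14 = 2.0 and 27/15 = 1.8, which brackets the quoted 1.9.

```python
import re

# Matches timer lines such as:
#   14 dezicycles in ff_h264_idct_add_altivec, 65536 runs, 0 skips
LINE_RE = re.compile(r"(\d+) dezicycles in (\w+), (\d+) runs, (\d+) skips")


def steady_state(log_text):
    """Return {function name: dezicycles} taken from the line with the most runs.

    Assumption: the measurement with the largest run count is the most
    representative, since early lines include warm-up effects.
    """
    best = {}
    for match in LINE_RE.finditer(log_text):
        cost = int(match.group(1))
        name = match.group(2)
        runs = int(match.group(3))
        if name not in best or runs >= best[name][0]:
            best[name] = (runs, cost)
    return {name: cost for name, (_, cost) in best.items()}


def speedup(log_text,
            baseline="ff_h264_idct_add_c",
            candidate="ff_h264_idct_add_altivec"):
    """Ratio of baseline cost to candidate cost (greater than 1 means faster)."""
    costs = steady_state(log_text)
    return costs[baseline] / costs[candidate]


# Last two lines of the first altivec block and the first C block above.
sample = """\
14 dezicycles in ff_h264_idct_add_altivec, 32768 runs, 0 skips
14 dezicycles in ff_h264_idct_add_altivec, 65536 runs, 0 skips
27 dezicycles in ff_h264_idct_add_c, 32764 runs, 4 skips
28 dezicycles in ff_h264_idct_add_c, 65528 runs, 8 skips
"""

print(speedup(sample))  # 28 / 14 -> 2.0
```

Feeding it one altivec block plus one C block at a time keeps the pairs
separate; passing all four blocks at once would keep only one figure per
function.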

[1] http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2006-February/007211.html

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: h264_idct_add_altivec.txt
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070601/9c0fac32/attachment.txt>


