[Ffmpeg-devel] [PATCH] fix mpegaudiodec on ARM and benchmark

Aurelien Jacobs aurel
Thu Aug 24 00:08:22 CEST 2006


On Wed, 23 Aug 2006 17:21:29 +0200
Michael Niedermayer <michaelni at gmx.at> wrote:

> Hi
> 
> On Wed, Aug 23, 2006 at 02:12:40PM +0200, Aurelien Jacobs wrote:
> > Hi,
> > 
> > After the recent optimisation in mpegaudiodec, I've benchmarked mp3 on ARM.
> > But first, mpegaudiodec.c didn't compile, so I fixed it.
> > I guess I should commit the attached patch ?
> > 
> > Here is how I benchmarked:
> >  ./mplayer -quiet -ac ffmp3 -ao pcm:fast:file=/dev/null -benchmark a.mp3
> > 
> > And here are the results with various lavc revisions (Xscale IXP420).
> > 
> > r6036
> > BENCHMARKs: VC:   0.000s VO:   0.000s A: 216.931s Sys:   0.414s =  217.346s
> > BENCHMARK%: VC:  0.0000% VO:  0.0000% A: 99.8093% Sys:  0.1907% = 100.0000%
> > 
> > r6037
> > BENCHMARKs: VC:   0.000s VO:   0.000s A: 212.347s Sys:   0.412s =  212.759s
> > BENCHMARK%: VC:  0.0000% VO:  0.0000% A: 99.8062% Sys:  0.1938% = 100.0000%
> > 
> > r6039
> > BENCHMARKs: VC:   0.000s VO:   0.000s A: 212.703s Sys:   0.411s =  213.114s
> > BENCHMARK%: VC:  0.0000% VO:  0.0000% A: 99.8070% Sys:  0.1930% = 100.0000%
> > 
> > r6050 (patched)
> > BENCHMARKs: VC:   0.000s VO:   0.000s A: 170.642s Sys:   0.411s =  171.053s
> > BENCHMARK%: VC:  0.0000% VO:  0.0000% A: 99.7597% Sys:  0.2403% = 100.0000%
> > 
> > Overall about 21% less decoding time (216.9s down to 170.6s), which is not so bad :-)
> 
> was this with or without  --disable-libavcodec_mpegaudio_hp ?

All my tests are done *not* using --disable-libavcodec_mpegaudio_hp.

> > Index: mpegaudiodec.c
> > ===================================================================
> > --- mpegaudiodec.c	(revision 6050)
> > +++ mpegaudiodec.c	(working copy)
> > @@ -59,13 +59,13 @@
> >  #   define MULL(a, b) \
> >          ({  int lo, hi;\
> >              asm("smull %0, %1, %2, %3     \n\t"\
> > -                "mov   %0, %0,     lsr #%4\n\t"\
> > -                "add   %1, %0, %1, lsl #%5\n\t"\
> > -            : "=r"(lo), "=r"(hi)\
> > +                "mov   %0, %0,     lsr %4\n\t"\
> > +                "add   %1, %0, %1, lsl %5\n\t"\
> > +            : "=&r"(lo), "=&r"(hi)\
> >              : "r"(b), "r"(a), "i"(FRAC_BITS), "i"(32-FRAC_BITS));\
> >           hi; })
> >  #   define MUL64(a,b) ((int64_t)(a) * (int64_t)(b))
> > -#   define MULH(a, b) ({ int lo, hi; asm ("smull %0, %1, %2, %3" : "=r"(lo), "=r"(hi) : "r"(b),"r"(a)); hi; })
> > +#   define MULH(a, b) ({ int lo, hi; asm ("smull %0, %1, %2, %3" : "=&r"(lo), "=&r"(hi) : "r"(b),"r"(a)); hi; })
> 
> I think not all 4 of the & are needed, but I am not sure ...

If I remove any one of them, I get a load of messages like this one:
{standard input}: Assembler messages:
{standard input}:630: rdhi, rdlo and rm must all be different
Note that I'm cross-compiling with gcc-4.1 if that's relevant.
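
For anyone following the diff without an ARM box at hand, here is a rough
portable C sketch of what the two macros are meant to compute. The assembler
message above reflects the smull restriction that RdLo, RdHi and Rm must be
different registers, and "=&r" (earlyclobber) is how GCC is told not to reuse
an input register for an output. The function names below are made up for
illustration, and FRAC_BITS = 23 is an assumption; neither comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration only; the real one is set in mpegaudiodec.c. */
#define FRAC_BITS 23

/* What MULH is meant to return: the high 32 bits of the signed 64-bit product.
 * Relies on arithmetic right shift of negative values, as gcc provides. */
static int32_t mulh_ref(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* What MULL is meant to return: the product shifted down by FRAC_BITS,
 * truncated to 32 bits. */
static int32_t mull_ref(int32_t a, int32_t b)
{
    return (int32_t)(uint32_t)(((int64_t)a * (int64_t)b) >> FRAC_BITS);
}

/* Step-by-step emulation of the three ARM instructions in the MULL macro. */
static int32_t mull_emulated(int32_t a, int32_t b)
{
    assert(FRAC_BITS > 0 && FRAC_BITS < 32);

    uint64_t prod = (uint64_t)((int64_t)a * (int64_t)b);
    uint32_t lo = (uint32_t)prod;          /* smull: low word  (%0) */
    uint32_t hi = (uint32_t)(prod >> 32);  /* smull: high word (%1) */

    lo = lo >> FRAC_BITS;                  /* mov %0, %0, lsr #FRAC_BITS */
    hi = lo + (hi << (32 - FRAC_BITS));    /* add %1, %0, %1, lsl #(32-FRAC_BITS) */

    return (int32_t)hi;                    /* the macro evaluates to hi */
}
```

With FRAC_BITS = 23, mull_emulated(1 << 23, 3 << 22) multiplies 1.0 by 1.5 in
fixed point and gives 3 << 22, the same as mull_ref.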

> also, please try exchanging a and b; some ARM CPUs need less time to do
> multiplications if the right one of these is small, but I don't know which
> one it was ...

The current order seems to be the fastest, but the difference is very slight.

> and another idea, try to set -mcpu -march -mtune correctly for the cpu

When setting -march=armv4 or armv4t or armv5 or armv5t it doesn't even compile:

arm-linux-gnu-gcc -DHAVE_AV_CONFIG_H -I.. -I../libavutil -Wdeclaration-after-statement -march=armv5t -D_REENTRANT -I/usr/include -I/usr/src/DVB/ost/include -I/usr/include/dxr2 -I/usr/local/include/cdda -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_ISOC9X_SOURCE    -c -o armv4l/dsputil_arm_s.o armv4l/dsputil_arm_s.S
armv4l/dsputil_arm_s.S: Assembler messages:
armv4l/dsputil_arm_s.S:77: Error: selected processor does not support `pld [r1]'
armv4l/dsputil_arm_s.S:88: Error: selected processor does not support `pld [r1]'
[...]

Setting -march=armv5te (which is exactly what my Xscale is) is noticeably
slower, and I don't understand why:
BENCHMARKs: VC:   0.000s VO:   0.000s A: 206.553s Sys:   0.438s =  206.991s
BENCHMARK%: VC:  0.0000% VO:  0.0000% A: 99.7882% Sys:  0.2118% = 100.0000%


Now I also benchmarked libmad. It's still "slightly" faster !
BENCHMARKs: VC:   0.000s VO:   0.000s A:  54.212s Sys:   0.407s =   54.618s
BENCHMARK%: VC:  0.0000% VO:  0.0000% A: 99.2554% Sys:  0.7446% = 100.0000%

Then I benchmarked ffmp3 r6050 (patched) with --disable-libavcodec_mpegaudio_hp
BENCHMARKs: VC:   0.000s VO:   0.000s A:  78.751s Sys:   0.419s =   79.171s
BENCHMARK%: VC:  0.0000% VO:  0.0000% A: 99.4702% Sys:  0.5298% = 100.0000%
Pretty impressive ! Not so far from libmad !
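
To put the numbers above side by side, here is a small C sketch of the
arithmetic. The helper names are made up for illustration; the inputs are the
"A:" times from the BENCHMARKs lines in this mail.

```c
#include <assert.h>

/* Percentage of decode time saved when going from `before` to `after` seconds. */
static double percent_saved(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}

/* How many times longer a run of `t` seconds takes than `baseline` seconds. */
static double slowdown_ratio(double t, double baseline)
{
    assert(baseline > 0.0);
    return t / baseline;
}
```

percent_saved(216.931, 170.642) comes out at a bit over 21% (r6036 versus
patched r6050), slowdown_ratio(170.642, 54.212) is about 3.1 (patched r6050
versus libmad), and slowdown_ratio(78.751, 54.212) is about 1.45 (the
--disable-libavcodec_mpegaudio_hp build versus libmad).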

Aurel



