[FFmpeg-devel] [PATCH] activate ac3 decoder

Michael Niedermayer michaelni
Fri Aug 10 15:59:11 CEST 2007


Hi

On Fri, Aug 10, 2007 at 10:14:27AM +0200, Guillaume Poirier wrote:
[...]
> > our fft is crap, all of it ...
> > the C fft is a very slow cooley tukey fft
> > our SSE fft contains unneeded instructions (see 
> > MPlayer-svn/mplayer/liba52/imdct.c for another inefficient cooley tukey
> > SSE fft but at least with fewer SSE instructions, note this one was
> > written by me ...)
> > 
> > so to solve this and make our code faster
> > 1. implement a plain C split radix fft
> 
> 
> You seem to be putting a very big focus on getting the FFT that has
> the fewest number of computations. Indeed, split-radix is the most
> efficient in terms of using the fewest number of add and mul.
> 
> However, after reading some different papers on FFTs and googling
> around, I'm not that convinced. This paper:
> http://www.fftw.org/fftw-paper-icassp.pdf
> by FFTW people show that in practice, there are other things to take
> into account (e.g. CPU cache) than just the number of computations.

if you search for fftw and djbfft you find that:
http://cr.yp.to/djbfft/bench-notes.html

which doesn't strengthen my trust in the benchmarks by the fftw authors,
though it's kinda old ...

also when you read
http://cr.yp.to/djbfft/faq.html
you will see that djbfft has been designed with cache and such in mind


and yes, the operation counts are not the only thing which matters, but
here
1. fact is liba52 with a split radix fft is faster than our code with cooley
   tukey fft
2. fewer operations does not necessarily mean that other things like
   the memory accesses would be more random/slower

> 
> 
> There seem to be so many FFT libs out there that I wonder if we really
> need to write our own, except, of course, if there isn't any decent one
> that's LGPL-compatible.
> 
> Heck, if we didn't care about the license, Intel's IPP is clearly the
> best choice out there ;-)
> J/K
> 
> 
> > 2. implement djbfft support (this should be trivial considering that
> > liba52 needs something like 5 lines of code for it)
> 
> According to the website, djbfft uses a split-radix-2/4 FFT. This looks
> like what you are after, isn't it?
> Why don't we just cannibalize the relevant parts of djbfft and make do
> without step 1 of your checklist (1. implement a plain C split radix fft)?
> I couldn't figure out what's the license of djbfft so I don't know if
> that's possible though :-(

djbfft has no proper license AFAIK, it just says
on http://cr.yp.to/djbfft/faq.html

"Can I use djbfft in my own code?

Yes. Please tell me what programs you're using it in so that I can let NSF know."

maybe someone should _politely_ ask the author if he would mind the code being
used in ffmpeg (requiring LGPL)

but before asking we should benchmark the code, it would be silly to ask and
then realize that some other code which is under LGPL/BSD is faster
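
A minimal sketch of what such a benchmark could look like, assuming each
candidate FFT is wrapped in a callback with the same signature. The names
bench_fn, seconds_per_call and count_call are hypothetical, and clock() is
only a coarse CPU-time measure, so many iterations are needed for a usable
number.

```c
#include <time.h>

/* Hypothetical callback type: one call runs one transform on ctx. */
typedef void (*bench_fn)(void *ctx);

/* Average CPU seconds per call of fn over the given number of iterations.
 * One untimed warm-up call is made first to fill caches and tables. */
double seconds_per_call(bench_fn fn, void *ctx, int iterations)
{
    clock_t start, end;

    fn(ctx); /* warm-up, not timed */

    start = clock();
    for (int i = 0; i < iterations; i++)
        fn(ctx);
    end = clock();

    return (double)(end - start) / CLOCKS_PER_SEC / iterations;
}

/* Stand-in workload for illustration: just counts how often it was called.
 * A real comparison would wrap each FFT implementation like this. */
void count_call(void *ctx)
{
    ++*(int *)ctx;
}
```

Running every candidate through the same wrapper, on the same input sizes,
keeps the comparison fair.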

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No great genius has ever existed without some touch of madness. -- Aristotle