[Libav-user] av_rdft_calc() data[0] is always the largest?

Ricky Huang rhuang.work at gmail.com
Fri Jun 6 04:01:57 CEST 2014


Hello all,

I have been following these two guides, http://aubio.org/news/20091111-2339_shazam and http://blog.pkh.me/p/6-las-lossy-audio-spotter.html, in an attempt to understand the Shazam audio fingerprinting algorithm.  The first link explains the Shazam methodology, and the second explains how to run some of the FFmpeg FFT functions.

One critical component is to pick the bin with the highest amplitude, i.e., the largest re^2 + im^2, and to use the bin numbers together with the time deltas as the hash during audio search.
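
To make the peak-picking step concrete, here is a minimal sketch of what I mean. It assumes the transform output is stored as interleaved (re, im) pairs, so that bin k sits at data[2*k] and data[2*k + 1]. The name peak_bin and its first_bin parameter are hypothetical and not part of the FFmpeg API. Whether the first pair returned by av_rdft_calc() really is an ordinary complex bin is an assumption on my part, which is why the caller can choose to skip it.

```c
#include <stddef.h>

/* Hypothetical helper (not an FFmpeg function): return the index of the bin
 * with the largest squared magnitude re^2 + im^2.
 *
 * Assumption: `data` holds n_bins interleaved (re, im) pairs, so bin k is
 * stored at data[2*k] (real part) and data[2*k + 1] (imaginary part).
 * Bins below first_bin are ignored; pass 1 to skip the first pair. */
static size_t peak_bin(const float *data, size_t n_bins, size_t first_bin)
{
    size_t best = first_bin;
    float best_power = -1.0f; /* below any possible squared magnitude */

    for (size_t k = first_bin; k < n_bins; k++) {
        float re = data[2 * k];
        float im = data[2 * k + 1];
        float power = re * re + im * im;

        if (power > best_power) {
            best_power = power;
            best = k;
        }
    }
    return best;
}
```

Calling peak_bin(data, n_bins, 0) searches every pair, while peak_bin(data, n_bins, 1) leaves out the first one, which makes it easy to compare the two results on the same clip.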

The problem is that my bins 0 and 1 always have the highest amplitude.  I have tried this on five or so clips, and that is always the case.  Does this make sense, or did I miss something fundamental about FFTs and audio processing?


Thanks in advance.


