[Libav-user] Calculate spectrogram from the audio channel

wm4 nfxjfg at googlemail.com
Sat May 3 16:24:23 CEST 2014


On Fri, 2 May 2014 17:48:37 -0700
Ricky Huang <rhuang.work at gmail.com> wrote:

> Hello all,
> 
> I am trying to reproduce the Shazam algorithm as outlined in Avery Wang's paper "An Industrial-Strength Audio Search Algorithm" (http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf).  One of the steps is to convert the audio to a spectrogram and identify the spectrogram peaks.  I am wondering whether building a custom audio filter for ffmpeg would be the correct way to go.  If so, does anyone have any pointers on converting the audio data to a spectrogram?  (Algorithm to use, things to note, etc.)
> 
> 
> Any help would be appreciated.  Thanks.

No idea about the algorithm, but if you want to see an example of how
to integrate such a filter into libavfilter, have a look at
libavfilter/avf_showspectrum.c. This filter visualizes the computed data.

If you actually want to export the computed data instead of visualizing
it audio-player style, you could do what vf_cropdetect.c does, and
attach the data to the output AVFrames as frame metadata.

(If you just want to convert the data, my reply is probably not helpful
at all.)

