[FFmpeg-user] resolution of the waterfall diagram of typical mp3 file
florin at andrei.myip.org
Sun Aug 7 23:17:17 EEST 2016
Consider an mp3 file, mono (single channel), 44.1 kHz, encoded at 128
kb/s constant bitrate (to keep things simple) with your encoder of
choice using average settings (let's say whatever ffmpeg uses as
defaults for this case).
Think of the full 3D representation of the spectrum of the whole file,
with time being one dimension, frequency another dimension, and relative
amplitude the 3rd dimension. Or the waterfall diagram - again time is
one dimension, frequency the other, and the relative amplitude is
encoded as color or intensity.
For that particular file, the resolution of the time dimension is pretty
clear: it's 44100 samples per second. What's less clear to me are the
resolutions of the other two dimensions. If I were to build the full 3D
representation, what resolutions should I choose on the other two
dimensions to achieve, overall, a similar amount of information as that
contained in the original mp3 file?
For the frequency dimension, what are the limits? Is it 20 Hz and 20
kHz? And how many frequency "buckets" do I need to keep things
comparable to the original mp3 file?
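
As a rough point of reference for the frequency axis, here is a minimal
Python sketch. It assumes the file is MPEG-1 Layer III using long blocks
only, where each granule of 576 input samples becomes 576 frequency lines
spanning 0 Hz up to the Nyquist frequency. The 20 Hz and 20 kHz limits are
a hearing-range convention, not something the format defines. The function
and constant names are made up for illustration and do not come from ffmpeg.

```python
# Assumptions (not stated in the post): MPEG-1 Layer III, long blocks only.
SAMPLE_RATE_HZ = 44_100
LINES_PER_GRANULE = 576  # frequency lines per granule in Layer III


def frequency_grid(sample_rate_hz=SAMPLE_RATE_HZ, lines=LINES_PER_GRANULE):
    """Return (nyquist_hz, bin_width_hz, granule_ms) for a uniform grid."""
    nyquist_hz = sample_rate_hz / 2           # highest representable frequency
    bin_width_hz = nyquist_hz / lines         # spacing between frequency lines
    granule_ms = 1000 * lines / sample_rate_hz  # time span covered by one granule
    return nyquist_hz, bin_width_hz, granule_ms


nyquist_hz, bin_width_hz, granule_ms = frequency_grid()
print(f"frequency range : 0 .. {nyquist_hz:.0f} Hz")
print(f"bucket width    : {bin_width_hz:.2f} Hz ({LINES_PER_GRANULE} buckets)")
print(f"time step       : {granule_ms:.2f} ms per granule")
```

Under these assumptions the natural time step of the diagram is one granule
rather than one sample, since each granule yields one column of buckets.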
For the relative amplitude, how many bits do I need to capture more or
less the same amount of info as the original mp3 file? 8 bit? 16 bit?
Keep in mind this is the completely rolled out waterfall representation,
not the encoded mp3 stream.
I think all these questions are ultimately tied to the total amount of
information contained in the mp3 file. I'm only looking for
reasonable estimates of these parameters.
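
The "total amount of information" idea can be turned into a back-of-the-envelope
bit budget. The sketch below is only an estimate under stated assumptions: 128
kb/s is taken as 128000 bits per second, the frame size of 1152 samples is that
of MPEG-1 Layer III, and the transform is assumed to be critically sampled, so
the number of frequency coefficients per second equals the sample rate. The
variable names are illustrative.

```python
# Assumptions (not from the post): MPEG-1 Layer III, 1152 samples per frame,
# critically sampled transform (one coefficient per input sample).
BITRATE_BPS = 128_000
SAMPLE_RATE_HZ = 44_100
SAMPLES_PER_FRAME = 1152

# Coefficients produced per second of audio under the assumption above.
coefficients_per_second = SAMPLE_RATE_HZ

# Average number of bits available per time/frequency cell. This is an upper
# bound, because frame headers and side information also use some of the bits.
bits_per_coefficient = BITRATE_BPS / coefficients_per_second

# Average number of bits carried by one frame at constant bitrate.
bits_per_frame = BITRATE_BPS * SAMPLES_PER_FRAME / SAMPLE_RATE_HZ

print(f"bits per coefficient (average): {bits_per_coefficient:.2f}")
print(f"bits per frame (average)      : {bits_per_frame:.1f}")
```

The average comes out far below 8 or 16 bits per cell, which suggests that an
uncompressed waterfall grid at 8 or 16 bits would hold considerably more raw
data than the mp3 stream it was built from.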