[FFmpeg-user] Sound level measuring on the 2nd audio stream

Alex win2000rus at hotmail.com
Mon Jun 29 13:15:24 EEST 2020


Hi All!
I ran into difficulties while trying to measure the sound level of a
multimedia file with multiple audio streams. Here is the background:

1) ffmpeg 4.2.2, used on different operating systems (Windows, FreeBSD).

2) The source file:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'test1.mov':
  Metadata:
    major_brand     : qt
    minor_version   : 537199360
    compatible_brands: qt
    creation_time   : 2020-05-05T09:48:02.000000Z
  Duration: 01:01:15.04, start: 0.000000, bitrate: 29791 kb/s
    Stream #0:0(eng): Video: mpeg2video (Main) (xdvc / 0x63766478),
yuv420p(tv, bt709, top coded first (swapped)), 1920x1080 [SAR 1:1 DAR
16:9], 26715 kb/s, 25 fps, 25 tbr, 25 tbn, 50 tbc (default)
    Metadata:
      creation_time   : 2020-05-05T09:48:02.000000Z
      handler_name    : Apple Video Media Handler
      encoder         : XDCAM EX 1080i50 (35 Mb/s VBR)
      timecode        : 00:00:00:00
    Stream #0:1(eng): Audio: pcm_s16le (sowt / 0x74776F73), 48000 Hz,
stereo, s16, 1536 kb/s (default)
    Metadata:
      creation_time   : 2020-05-05T09:48:02.000000Z
      handler_name    : Apple Sound Media Handler
      timecode        : 00:00:00:00
    Stream #0:2(eng): Audio: pcm_s16le (sowt / 0x74776F73), 48000 Hz,
stereo, s16, 1536 kb/s (default)
    Metadata:
      creation_time   : 2020-05-05T09:48:02.000000Z
      handler_name    : Apple Sound Media Handler
      timecode        : 00:00:00:00
    Stream #0:3(eng): Data: none (tmcd / 0x64636D74) (default)
    Metadata:
      creation_time   : 2020-05-05T11:14:47.000000Z
      handler_name    : Time Code Media Handler
      timecode        : 00:00:00:00
Unsupported codec with id 0 for input stream 3

Note that stream #0:2 is empty (no sound at all)! This is important.

3) The method I used to measure the sound level:
https://stackoverflow.com/questions/38056970/ffmpeg-txt-from-audio-levels

The documentation (https://ffmpeg.org/ffmpeg-filters.html#astats-1) says
that the key can contain a channel number (starting from 1) or the string
'Overall' for the combined value. I decided to print the levels for the
1st and 2nd audio streams separately, followed by the overall level. Here
is the command:

ffprobe -hide_banner -f lavfi -i
amovie=test1.mov,astats=metadata=1:reset=1 -show_entries
frame=pkt_pts_time:frame_tags=lavfi.astats.1.Peak_level,lavfi.astats.2.Peak_level,lavfi.astats.Overall.Peak_level
-of csv=p=0

And here is what I see:

[mov,mp4,m4a,3gp,3g2,mj2 @ 0x80557b600] st: 0 edit list: 1 Missing key
frame while searching for timestamp: 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x80557b600] st: 0 edit list 1 Cannot find an
index entry before timestamp: 0.
Input #0, lavfi, from
'amovie=/mnt/playout6/Playout/Trinity/Exxxotica/Media/test1.mov,astats=metadata=1:reset=1':
  Duration: N/A, start: 0.000000, bitrate: 1536 kb/s
    Stream #0:0: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
0.000000,-inf,-inf,-inf
0.021333,-inf,-inf,-inf
0.042667,-inf,-inf,-inf
0.064000,-inf,-inf,-inf
0.085333,-inf,-inf,-inf
0.106667,-inf,-inf,-inf
0.128000,-inf,-inf,-inf
0.149333,-inf,-inf,-inf
0.170667,-inf,-inf,-inf
0.192000,-inf,-inf,-inf
0.213333,-inf,-inf,-inf
0.234667,-inf,-inf,-inf
0.256000,-inf,-inf,-inf
0.277333,-inf,-inf,-inf
0.298667,-inf,-inf,-inf
0.320000,-inf,-inf,-inf
0.341333,-40.124683,-48.232659,-40.124683
0.362667,-24.450328,-20.442361,-20.442361
0.384000,-15.920450,-15.235541,-15.235541
0.405333,-17.108509,-13.644516,-13.644516
0.426667,-15.176011,-13.778167,-13.778167
0.448000,-14.284777,-14.921187,-14.284777
0.469333,-14.353691,-13.147619,-13.147619
0.490667,-15.612737,-13.749723,-13.749723
0.512000,-15.577617,-14.215043,-14.215043
0.533333,-15.476248,-14.472115,-14.472115
0.554667,-15.115377,-11.835565,-11.835565
0.576000,-15.372937,-13.422919,-13.422919
0.597333,-5.541789,-4.610649,-4.610649
0.618667,-13.954783,-10.754262,-10.754262
0.640000,-12.198213,-13.178989,-12.198213
0.661333,-13.894198,-14.257364,-13.894198
0.682667,-14.118886,-12.640048,-12.640048
0.704000,-13.659833,-14.141813,-13.659833
0.725333,-17.131342,-16.504812,-16.504812
0.746667,-18.004467,-18.494126,-18.004467
0.768000,-14.940412,-16.608238,-14.940412
0.789333,-13.574658,-13.259134,-13.259134
0.810667,-12.985351,-13.042276,-12.985351
0.832000,-9.460374,-9.366353,-9.366353
0.853333,-13.081630,-10.817579,-10.817579
0.874667,-14.097363,-15.270840,-14.097363
0.896000,-15.432269,-13.685421,-13.685421
0.917333,-16.315447,-14.210959,-14.210959
0.938667,-14.378635,-13.564543,-13.564543
...
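
The CSV rows above can also be checked programmatically. This is a minimal sketch in Python, assuming that each row follows the -show_entries order (time, key 1, key 2, Overall) and that only data rows are passed in. The function names parse_row and first_audible_time are hypothetical.

```python
import math

def parse_row(line):
    """Split one CSV row into (time, [levels]); '-inf' becomes float('-inf')."""
    fields = line.strip().split(",")
    return float(fields[0]), [float(v) for v in fields[1:]]

def first_audible_time(lines):
    """Return the timestamp of the first row with any finite level, or None."""
    for line in lines:
        t, levels = parse_row(line)
        if any(math.isfinite(v) for v in levels):
            return t
    return None

# Three rows copied from the output above.
sample = [
    "0.320000,-inf,-inf,-inf",
    "0.341333,-40.124683,-48.232659,-40.124683",
    "0.362667,-24.450328,-20.442361,-20.442361",
]
print(first_audible_time(sample))
```

For these rows the first finite levels appear at 0.341333, and both numbered columns become finite in that same row.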

This is wrong. The second audio stream is totally silent from
beginning to end. Also, note that ffprobe reported only one audio
stream, and it was #0:0. Why?
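
A minimal sketch, not a confirmed diagnosis: it assumes that amovie opens only one audio stream per instance (which would match the single "Stream #0:0" that ffprobe lists) and that the numbers in the astats keys index channels inside that stream. Under those assumptions, a given stream could be requested through amovie's streams option (short name s), which takes a stream specifier. The helper name build_astats_probe and its parameters are hypothetical; the sketch only builds the argument list and does not run ffprobe.

```python
def build_astats_probe(path, stream_index, stat="Peak_level", channels=(1, 2)):
    """Build an ffprobe argument list that measures one stream of `path`.

    stream_index is the absolute stream index in the file (2 for '#0:2').
    Assumption: `path` contains no characters that need filtergraph escaping.
    """
    # amovie reads the selected stream; astats attaches per-frame metadata.
    graph = f"amovie={path}:s={stream_index},astats=metadata=1:reset=1"
    # Assumed here: one key per channel of that stream, plus the Overall key.
    keys = [f"lavfi.astats.{c}.{stat}" for c in channels]
    keys.append(f"lavfi.astats.Overall.{stat}")
    tags = ",".join(keys)
    return [
        "ffprobe", "-hide_banner", "-f", "lavfi", "-i", graph,
        "-show_entries", f"frame=pkt_pts_time:frame_tags={tags}",
        "-of", "csv=p=0",
    ]

# Example: the second audio stream of the file above is '#0:2'.
cmd = build_astats_probe("test1.mov", 2)
print(" ".join(cmd))
```

Passing the list to subprocess.run would then measure stream #0:2 on its own. File names containing characters that are special in filtergraphs would need extra escaping, which this sketch does not handle.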

I decided to test audio streams separately and copied them to separate
files, then checked the sound level. For the first stream:

ffmpeg -hide_banner -i test1.mov -ss 00:01:00 -t 00:01:00 -map 0:a:0
test1.mp3

ffprobe -hide_banner -f lavfi -i
amovie=test1.mp3,astats=metadata=1:reset=1 -show_entries
frame=pkt_pts_time:frame_tags=lavfi.astats.1.RMS_level -of csv=p=0

The result is as expected:
[mp3float @ 0000000
Input #0, lavfi, fr
  Duration: N/A, st
    Stream #0:0: Au
0.000000,-50.248303
0.024000,-53.384511
0.048000,-53.646786
0.072000,-48.221139
0.096000,-51.244874
0.120000,-53.856832
0.144000,-54.184842
0.168000,-52.673918
0.192000,-53.253474
0.216000,-49.317896
0.240000,-44.537543
0.264000,-43.784163
0.288000,-47.155750
0.312000,-37.411110
0.336000,-45.422727
0.360000,-39.557636
0.384000,-35.931526
0.408000,-27.129281
0.432000,-27.934319
0.456000,-30.063120
0.480000,-34.268751
0.504000,-32.415779
0.528000,-28.112756
...

For the second stream:

ffmpeg -hide_banner -i test1.mov -ss 00:01:00 -t 00:01:00 -map 0:a:1
test2.mp3

ffprobe -hide_banner -f lavfi -i
amovie=test2.mp3,astats=metadata=1:reset=1 -show_entries
frame=pkt_pts_time:frame_tags=lavfi.astats.1.RMS_level -of csv=p=0
0.000000,-inf
0.024000,-inf
0.048000,-inf
0.072000,-inf
0.096000,-inf
0.120000,-inf
0.144000,-inf
0.168000,-inf
0.192000,-inf
0.216000,-inf
0.240000,-inf
0.264000,-inf
0.288000,-inf
0.312000,-inf
0.336000,-inf
0.360000,-inf
0.384000,-inf
0.408000,-inf
0.432000,-inf
0.456000,-inf
0.480000,-inf
0.504000,-inf
0.528000,-inf
...
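
The "all rows are -inf" check can be automated instead of read by eye. This is a minimal sketch in Python, assuming the csv=p=0 layout shown above (time first, then one or more levels). The function name is_silent is hypothetical. Lines that are not data rows, such as log lines or a trailing "...", are skipped.

```python
import math

def is_silent(lines):
    """True if at least one data row was seen and every level is non-finite."""
    saw_row = False
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) < 2:
            continue  # not a "time,level[,level...]" row
        try:
            levels = [float(v) for v in fields[1:]]
        except ValueError:
            continue  # log output or other non-numeric text
        saw_row = True
        if any(math.isfinite(v) for v in levels):
            return False
    return saw_row

# Rows copied from the two listings above.
print(is_silent(["0.000000,-inf", "0.024000,-inf", "..."]))
print(is_silent(["0.000000,-50.248303", "0.024000,-53.384511"]))
```

An empty input returns False on purpose, so that a failed ffprobe run is not mistaken for a silent stream.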

Yes, this one really is empty.
But when I copied both audio streams into a WMA file and tested it, I got
the wrong result again. It seems that ffprobe cannot measure the 2nd audio
stream and that this is a bug. Am I right?

WBR Alex


