[FFmpeg-user] 5% of audio samples missing when capturing audio on a mac

Norbert Pozar bertapozar at gmail.com
Sun Sep 13 04:03:06 EEST 2020


> > I am attempting to capture a webcam with audio on a MacBook Pro (Catalina
> > 10.15.6), but I am having trouble with the audio stream. The video part is
> > fine, but audio seems to be missing about 5% of the expected samples. This
> > simple command illustrates the problem:
> >
> > ffmpeg -v 9 -loglevel 99 -y -f avfoundation -i ":0" -t 10 out.wav
>
> This is missing the console output of:
> ffmpeg -i out.wav -f null -

Thanks for having a look, and sorry about that. Here is the console output of
both commands (the length of the recorded file changes from run to run,
depending on how many samples are missing):

$ ffmpeg -v 9 -loglevel 99 -y -f avfoundation -i ":0" -t 10 out.wav
ffmpeg version git-2020-09-12-1c09456 Copyright (c) 2000-2020 the FFmpeg
developers
  built with Apple clang version 11.0.0 (clang-1100.0.33.17)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/HEAD-1c09456_1
--enable-shared --enable-pthreads --enable-version3 --enable-avresample
--cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls
--enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d
--enable-libmp3lame --enable-libopus --enable-librav1e
--enable-librubberband --enable-libsnappy --enable-libsrt
--enable-libtesseract --enable-libtheora --enable-libvidstab
--enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264
--enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma
--enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass
--enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg
--enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox
--disable-libjack --disable-indev=jack
  libavutil      56. 58.100 / 56. 58.100
  libavcodec     58.105.100 / 58.105.100
  libavformat    58. 54.100 / 58. 54.100
  libavdevice    58. 11.101 / 58. 11.101
  libavfilter     7. 87.100 /  7. 87.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
  libpostproc    55.  8.100 / 55.  8.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with
argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging
level) with argument '99'.
Reading option '-y' ... matched as option 'y' (overwrite output files) with
argument '1'.
Reading option '-f' ... matched as option 'f' (force format) with argument
'avfoundation'.
Reading option '-i' ... matched as input url with argument ':0'.
Reading option '-t' ... matched as option 't' (record or transcode
"duration" seconds of audio/video) with argument '10'.
Reading option 'out.wav' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url :0.
Applying option f (force format) with argument avfoundation.
Successfully parsed a group of options.
Opening an input file: :0.
[avfoundation @ 0x7f872c814000] audio device 'Built-in Microphone' opened
[avfoundation @ 0x7f872c814000] All info found
[avfoundation @ 0x7f872c814000] stream 0: start_time: 38114.7 duration: NOPTS
[avfoundation @ 0x7f872c814000] format: start_time: 38114.7 duration: NOPTS (estimate from bit rate) bitrate=2822 kb/s
Input #0, avfoundation, from ':0':
  Duration: N/A, start: 38114.693333, bitrate: 2822 kb/s
    Stream #0:0, 1, 1/1000000: Audio: pcm_f32le, 44100 Hz, stereo, flt,
2822 kb/s
Successfully opened the file.
Parsing a group of options: output url out.wav.
Applying option t (record or transcode "duration" seconds of audio/video)
with argument 10.
Successfully parsed a group of options.
Opening an output file: out.wav.
[file @ 0x7f872be0e480] Setting default whitelist 'file,crypto,data'
Successfully opened the file.
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_f32le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if
it occurs once at the start per stream)
detected 4 logical cores
[graph_0_in_0_0 @ 0x7f872bc392c0] Setting 'time_base' to value '1/44100'
[graph_0_in_0_0 @ 0x7f872bc392c0] Setting 'sample_rate' to value '44100'
[graph_0_in_0_0 @ 0x7f872bc392c0] Setting 'sample_fmt' to value 'flt'
[graph_0_in_0_0 @ 0x7f872bc392c0] Setting 'channel_layout' to value '0x3'
[graph_0_in_0_0 @ 0x7f872bc392c0] tb:1/44100 samplefmt:flt samplerate:44100
chlayout:0x3
[format_out_0_0 @ 0x7f872bd3a840] Setting 'sample_fmts' to value 's16'
[format_out_0_0 @ 0x7f872bd3a840] auto-inserting filter 'auto_resampler_0'
between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[AVFilterGraph @ 0x7f872be2a0c0] query_formats: 5 queried, 9 merged, 3
already done, 0 delayed
[auto_resampler_0 @ 0x7f872bd25080] [SWR @ 0x7f872bf69000] Using fltp
internally between filters
[auto_resampler_0 @ 0x7f872bd25080] ch:2 chl:stereo fmt:flt r:44100Hz ->
ch:2 chl:stereo fmt:s16 r:44100Hz
Output #0, wav, to 'out.wav':
  Metadata:
    ISFT            : Lavf58.54.100
    Stream #0:0, 0, 1/44100: Audio: pcm_s16le ([1][0][0][0] / 0x0001),
44100 Hz, stereo, s16, 1411 kb/s
    Metadata:
      encoder         : Lavc58.105.100 pcm_s16le
[out_0_0 @ 0x7f872bc394c0] EOF on sink link out_0_0:default.=   1x
No more output streams to write to, finishing.
size=    1619kB time=00:00:10.00 bitrate=1326.1kbits/s speed=0.998x
video:0kB audio:1619kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 0.004706%
Input file #0 (:0):
  Input stream #0:0 (audio): 811 packets read (3321856 bytes); 811 frames decoded (415232 samples);
  Total: 811 packets (3321856 bytes) demuxed
Output file #0 (out.wav):
  Output stream #0:0 (audio): 810 frames encoded (414376 samples); 810 packets muxed (1657504 bytes);
  Total: 810 packets (1657504 bytes) muxed
811 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x7f872be1db00] Statistics: 4 seeks, 9 writeouts




$ ffmpeg -i out.wav -f null -
ffmpeg version git-2020-09-12-1c09456 Copyright (c) 2000-2020 the FFmpeg
developers
  built with Apple clang version 11.0.0 (clang-1100.0.33.17)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/HEAD-1c09456_1
--enable-shared --enable-pthreads --enable-version3 --enable-avresample
--cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls
--enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d
--enable-libmp3lame --enable-libopus --enable-librav1e
--enable-librubberband --enable-libsnappy --enable-libsrt
--enable-libtesseract --enable-libtheora --enable-libvidstab
--enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264
--enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma
--enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass
--enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg
--enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox
--disable-libjack --disable-indev=jack
  libavutil      56. 58.100 / 56. 58.100
  libavcodec     58.105.100 / 58.105.100
  libavformat    58. 54.100 / 58. 54.100
  libavdevice    58. 11.101 / 58. 11.101
  libavfilter     7. 87.100 /  7. 87.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
  libpostproc    55.  8.100 / 55.  8.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'out.wav':
  Metadata:
    encoder         : Lavf58.54.100
  Duration: 00:00:09.40, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz,
stereo, s16, 1411 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    encoder         : Lavf58.54.100
    Stream #0:0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
    Metadata:
      encoder         : Lavc58.105.100 pcm_s16le
size=N/A time=00:00:09.39 bitrate=N/A speed=1.23e+03x
video:0kB audio:1619kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: unknown
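
As an independent cross-check of the "Duration" line above, the frame count can also be read straight from the WAV header. The following is only a minimal sketch using Python's standard `wave` module (Python is an assumption here, not part of the original test); the helper names `wav_duration_seconds` and `missing_fraction` are made up for illustration.

```python
import wave


def wav_duration_seconds(path):
    """Return (frame_count, sample_rate, duration_in_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()   # sample frames per channel, from the header
        rate = wav.getframerate()   # e.g. 44100 Hz
    return frames, rate, frames / rate


def missing_fraction(path, requested_seconds):
    """Fraction of the requested recording time that is absent from the file."""
    _, _, actual = wav_duration_seconds(path)
    return 1.0 - actual / requested_seconds
```

Calling `missing_fraction("out.wav", 10)` on a file from a `-t 10` capture gives the shortfall directly. For the 414376 frames reported in the log above, that would come out at roughly 0.06.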



I am also attaching the console output of a run with -t 100 (only the parts
whose timings differ from the above), to show that this does not seem to be
a warm-up problem.

$ ffmpeg -v 9 -loglevel 99 -y -f avfoundation -i ":0" -t 100 out.wav
...
size=   16163kB time=00:01:40.00 bitrate=1324.0kbits/s speed=   1x
video:0kB audio:16163kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 0.000471%
Input file #0 (:0):
  Input stream #0:0 (audio): 8083 packets read (33107968 bytes); 8083 frames decoded (4138496 samples);
  Total: 8083 packets (33107968 bytes) demuxed
Output file #0 (out.wav):
  Output stream #0:0 (audio): 8082 frames encoded (4137616 samples); 8082 packets muxed (16550464 bytes);
  Total: 8082 packets (16550464 bytes) muxed
8083 frames successfully decoded, 0 decoding errors

$ ffmpeg -i out.wav -f null -
...
  Metadata:
    encoder         : Lavf58.54.100
  Duration: 00:01:33.82, bitrate: 1411 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz,
stereo, s16, 1411 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    encoder         : Lavf58.54.100
    Stream #0:0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
    Metadata:
      encoder         : Lavc58.105.100 pcm_s16le
size=N/A time=00:01:33.82 bitrate=N/A speed=2.21e+03x
video:0kB audio:16163kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: unknown
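
To make the comparison between the two runs explicit, here is a small sketch that turns the "frames encoded (... samples)" figures from the logs into a captured duration and a shortfall. The sample counts and the 44100 Hz rate are copied from the output above; the `shortfall` helper and the use of Python are my own additions for illustration.

```python
SAMPLE_RATE = 44100  # Hz, from the stream info in the logs

# label -> (requested seconds, samples encoded), copied from the two runs
RUNS = {
    "-t 10": (10, 414376),
    "-t 100": (100, 4137616),
}


def shortfall(requested_seconds, samples, rate=SAMPLE_RATE):
    """Return (captured seconds, missing fraction) for one run."""
    actual = samples / rate
    return actual, 1.0 - actual / requested_seconds


for label, (requested, samples) in RUNS.items():
    actual, missing = shortfall(requested, samples)
    print(f"{label}: {actual:.2f} s captured, {missing:.1%} missing")
```

The captured durations match the 9.40 s and 93.82 s that ffmpeg reports for the files. The missing fraction is about 6% in both runs, so the loss scales with the recording length and is not confined to the start of the capture.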

