[FFmpeg-user] Unable to capture audio and video synced

Leonardo Soares Müller leozinho29_eu at hotmail.com
Tue May 23 20:11:25 EEST 2017


I am trying to record my computer's screen together with the sound from
the speakers and (occasionally) the microphone. The recording completes
successfully, but there is a problem: playback in VLC is fine, but in
Kdenlive or on YouTube audio and video are out of sync.

I am using Xubuntu 16.04.2 with ffmpeg built from source. To record the
screen, I use x11grab. To record the speakers and microphone, I use
pulse. The command I used to record was:

env PULSE_LATENCY_MSEC=20 ffmpeg -vaapi_device /dev/dri/renderD128 
-hwaccel vaapi -hwaccel_output_format yuv420p -thread_queue_size 16384 
-f pulse -sample_rate 44100 -channels 2 -i 
alsa_output.pci-0000_00_1f.3.analog-stereo.monitor -thread_queue_size 
16384 -f x11grab -s 1366x768 -r 30 -i :0.0 -acodec libfdk_aac -b:a 160k 
-vf format=nv12,hwupload,scale_vaapi=w=960:h=540 -vcodec h264_vaapi -qp 
20 -f flv -shortest TEST001.flv

The output of the command was:

ffmpeg version N-86126-ge434840 Copyright (c) 2000-2017 the FFmpeg 
   built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.4) 20160609
   configuration: --enable-shared --enable-avresample --enable-avisynth 
--enable-gnutls --enable-ladspa --enable-libass --enable-libbluray 
--enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite 
--enable-libfontconfig --enable-libfreetype --enable-libfribidi 
--enable-libgme --enable-libgsm --enable-libmodplug --enable-libmp3lame 
--enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librtmp 
--enable-libschroedinger --enable-libshine --enable-libsnappy 
--enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora 
--enable-libtwolame --enable-libvorbis --enable-libvpx 
--enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxvid 
--enable-libzvbi --enable-openal --enable-opengl --enable-libdc1394 
--enable-libiec61883 --enable-libzmq --enable-frei0r --enable-libx264 
--enable-libopencv --enable-libfdk-aac --enable-libmfx --enable-vaapi 
--enable-nonfree --enable-gpl --enable-libxcb --enable-libxcb-shm 
--enable-libxcb-xfixes --enable-libxcb-shape
   libavutil      55. 63.100 / 55. 63.100
   libavcodec     57. 96.101 / 57. 96.101
   libavformat    57. 72.101 / 57. 72.101
   libavdevice    57.  7.100 / 57.  7.100
   libavfilter     6. 90.100 /  6. 90.100
   libavresample   3.  6.  0 /  3.  6.  0
   libswscale      4.  7.101 /  4.  7.101
   libswresample   2.  8.100 /  2.  8.100
   libpostproc    54.  6.100 / 54.  6.100
libva info: VA-API version 0.39.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_39
libva info: va_openDriver() returns 0
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, pulse, from 'alsa_output.pci-0000_00_1f.3.analog-stereo.monitor':
   Duration: N/A, start: 1495555903.842161, bitrate: 1411 kb/s
     Stream #0:0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
[x11grab @ 0x18612e0] Stream #0: not enough frames to estimate rate; 
consider increasing probesize
Input #1, x11grab, from ':0.0':
   Duration: N/A, start: 1495555904.112324, bitrate: N/A
     Stream #1:0: Video: rawvideo (BGR[0] / 0x524742), bgr0, 1366x768, 
30 fps, 1000k tbr, 1000k tbn, 1000k tbc
Stream mapping:
   Stream #1:0 -> #0:0 (rawvideo (native) -> h264 (h264_vaapi))
   Stream #0:0 -> #0:1 (pcm_s16le (native) -> aac (libfdk_aac))
Press [q] to stop, [?] for help
[swscaler @ 0x187b200] Warning: data is not aligned! This can lead to a 
Output #0, flv, to 'TEST001.flv':
     encoder         : Lavf57.72.101
     Stream #0:0: Video: h264 (h264_vaapi) (High) ([7][0][0][0] / 
0x0007), vaapi_vld(progressive), 960x540, q=0-31, 30 fps, 1k tbn, 30 tbc
       encoder         : Lavc57.96.101 h264_vaapi
     Stream #0:1: Audio: aac (libfdk_aac) ([10][0][0][0] / 0x000A), 
44100 Hz, stereo, s16, 160 kb/s
       encoder         : Lavc57.96.101 libfdk_aac
frame=  681 fps= 30 q=-0.0 Lsize=    2607kB time=00:00:22.64 bitrate= 
943.1kbits/s speed=0.999x
video:2134kB audio:443kB subtitle:0kB other streams:0kB global 
headers:0kB muxing overhead: 1.161332%

After recording, I split the resulting file into a video-only file and
an audio-only file (using the recording as input, with -an -vcodec copy
and -vn -acodec copy for the respective outputs). The video-only file
was opened in Kdenlive and the audio file was opened in Audacity.
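
A minimal sketch of that split, for reference. Only TEST001.flv comes
from the recording above; the two output file names and the
split_streams helper are hypothetical, chosen for illustration:

```shell
#!/bin/sh
# Input is the recording; the output names below are assumed, not the
# ones actually used.
input="TEST001.flv"
video_out="TEST001_video.flv"
audio_out="TEST001_audio.flv"

split_streams() {
    # Video only: -an drops the audio, -vcodec copy avoids re-encoding.
    ffmpeg -i "$input" -an -vcodec copy "$video_out" || return 1
    # Audio only: -vn drops the video, -acodec copy avoids re-encoding.
    ffmpeg -i "$input" -vn -acodec copy "$audio_out"
}

# Run the split only when the recording and ffmpeg are both available.
if [ -f "$input" ] && command -v ffmpeg >/dev/null 2>&1; then
    split_streams
fi
```

Because both outputs are stream copies, any timing difference between
them should come from the recording itself and not from re-encoding.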

Given the command, I would expect a difference of at most 20 ms (0,02 s)
between audio and video, but the difference in length was about 460 ms
(0,46 s). Three different lengths are reported for this file, so it is
possible that different multimedia implementations see the file with
different lengths.

ffmpeg (from command line): 22,64 s
Kdenlive: 22,22 s
Audacity: 22,686 s
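
As a quick check of the arithmetic, the pairwise differences between the
three lengths above can be computed as follows. The variable names are
only illustrative, and the ffmpeg and Kdenlive values are padded to
milliseconds with a trailing zero, which is an assumption because those
tools only show hundredths of a second:

```shell
#!/bin/sh
# Lengths reported by each tool, in integer milliseconds.
# ffmpeg and Kdenlive show only hundredths; the last digit is assumed 0.
ffmpeg_ms=22640
kdenlive_ms=22220
audacity_ms=22686

# Pairwise differences in milliseconds.
audacity_vs_kdenlive=$((audacity_ms - kdenlive_ms))
ffmpeg_vs_kdenlive=$((ffmpeg_ms - kdenlive_ms))
audacity_vs_ffmpeg=$((audacity_ms - ffmpeg_ms))

echo "Audacity - Kdenlive: ${audacity_vs_kdenlive} ms"
echo "ffmpeg   - Kdenlive: ${ffmpeg_vs_kdenlive} ms"
echo "Audacity - ffmpeg:   ${audacity_vs_ffmpeg} ms"
```

The largest gap is between the audio length in Audacity and the video
length in Kdenlive, which is the difference of roughly 460 ms mentioned
above.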

Millisecond precision is only shown by Audacity, so I don't know the
exact milliseconds for ffmpeg and Kdenlive.

I have run this test several times, and the results appear to be random.
Sometimes the delay is 80 ms, sometimes 1000 ms. The difference is not
constant, so correcting the command with a fixed offset is not an
option, unfortunately.

I would like to know what I should do to capture the screen with the
sound in sync, as the -shortest option was ineffective in this case
because the inputs don't have an end.

Thank you.
