[FFmpeg-user] Real-time, synchronized capture of multiple V4L sources

Ochi ochi at arcor.de
Mon Jan 15 21:47:18 EET 2018


Hello,

what I am trying to achieve is the following: I have multiple V4L 
sources with possibly different time bases (different internal start 
times, different fps) and I would like to capture them live and in sync 
to separate files. In my particular case those devices are two HDMI 
grabbers at 60 FPS and two webcams capturing at ~30 fps.

Ideally, I would also like to preview the sources at the same time using 
e.g. an SDL or OpenGL window showing the inputs using overlay or {v,h}stack.

If necessary, frames should be dropped or duplicated in order to 
maintain real-time even when capturing for a long period (say, 2-5 
hours). The resulting videos should be constant frame rate (30 and 60 
FPS, respectively, or all 60 FPS if that should be necessary).

The main problem that I have is that the different video steams are not 
in sync right from the beginning and I cannot find a way to make them 
synchronized.

As I will explain, I think the underlying problem (or solution) is quite 
simple, however to give you an idea how a minimal, naive approach could 
look like, consider the following example:


ffmpeg -y \
     -video_size .. -input_format .. -framerate 60 -i /dev/video0 \
     -video_size .. -input_format .. -framerate 60 -i /dev/video1  \
     -video_size .. -input_format .. -framerate 30 -i /dev/video2 \
     -video_size .. -input_format .. -framerate 30 -i /dev/video3 \
     -filter_complex "
         [0:v] format=abgr, vflip, split [hdmi0a][hdmi0b];
         [1:v] format=abgr, vflip, split [hdmi1a][hdmi1b];
         [2:v] format=abgr, split [cam0a][cam0b];
         [3:v] format=abgr, split [cam1a][cam1b];
         [hdmi0a] scale=.. [tmp0], [hdmi1a] scale=.., [tmp0] hstack 
[hdmistack];
         [cam0a] scale=.. [tmp1], [cam1a] scale=.., [tmp1] hstack 
[camstack];
         [hdmistack][camstack] vstack [preview]
     " \
     -map "[preview]" -f opengl - \
     -map "[hdmi0b]" -c:v h264_nvenc -qp 23 /tmp/hdmi0.mkv \
     -map "[hdmi1b]" -c:v h264_nvenc -qp 23 /tmp/hdmi1.mkv \
     -map "[cam0b]" -c:v h264 -qp 23 /tmp/cam0.mkv \
     -map "[cam1b]" -c:v h264 -qp 23 /tmp/cam1.mkv


When the V4L devices are initialized, the start timestamps of the input 
steams are all either a bit different (I suppose due to the fact that 
the devices are initialized in a particular order) and some start times 
may even be zero, supposedly due to a bug in Magewell HDMI capture boxes:


ffmpeg version 3.4.1 Copyright (c) 2000-2017 the FFmpeg developers
   built with gcc 7.2.1 (GCC) 20171224
   configuration: --prefix=/usr --disable-debug --disable-static 
--disable-stripping --enable-avisynth --enable-avresample 
--enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl 
--enable-ladspa --enable-libass --enable-libbluray --enable-libfreetype 
--enable-libfribidi --enable-libgsm --enable-libiec61883 
--enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb 
--enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus 
--enable-libpulse --enable-libsoxr --enable-libspeex --enable-libssh 
--enable-libtheora --enable-libv4l2 --enable-libvidstab 
--enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 
--enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid 
--enable-shared --enable-version3 --enable-opengl --enable-opencl
   libavutil      55. 78.100 / 55. 78.100
   libavcodec     57.107.100 / 57.107.100
   libavformat    57. 83.100 / 57. 83.100
   libavdevice    57. 10.100 / 57. 10.100
   libavfilter     6.107.100 /  6.107.100
   libavresample   3.  7.  0 /  3.  7.  0
   libswscale      4.  8.100 /  4.  8.100
   libswresample   2.  9.100 /  2.  9.100
   libpostproc    54.  7.100 / 54.  7.100
[video4linux2,v4l2 @ 0x5606648ea320] Dequeued v4l2 buffer contains 
corrupted data (0 bytes).
Input #0, video4linux2,v4l2, from '/dev/video0':
   Duration: N/A, start: 0.000000, bitrate: 2985984 kb/s
     Stream #0:0: Video: rawvideo (BGR[24] / 0x18524742), bgr24, 
1920x1080, 2985984 kb/s, 60 fps, 60 tbr, 1000k tbn, 1000k tbc
Input #1, video4linux2,v4l2, from '/dev/video1':
   Duration: N/A, start: 122510.232758, bitrate: 2985984 kb/s
     Stream #1:0: Video: rawvideo (BGR[24] / 0x18524742), bgr24, 
1920x1080, 2985984 kb/s, 60 fps, 60 tbr, 1000k tbn, 1000k tbc
Input #2, video4linux2,v4l2, from '/dev/video2':
   Duration: N/A, start: 122510.619448, bitrate: N/A
     Stream #2:0: Video: mjpeg, yuvj422p(pc, bt470bg/unknown/unknown), 
640x360, 30 fps, 30 tbr, 1000k tbn, 1000k tbc
Input #3, video4linux2,v4l2, from '/dev/video3':
   Duration: N/A, start: 122510.997742, bitrate: N/A
     Stream #3:0: Video: mjpeg, yuvj422p(pc, bt470bg/unknown/unknown), 
640x360, 30 fps, 30 tbr, 1000k tbn, 1000k tbc
Stream mapping:
   Stream #0:0 (rawvideo) -> format
   Stream #1:0 (rawvideo) -> format
   Stream #2:0 (mjpeg) -> format
   Stream #3:0 (mjpeg) -> format
   vstack -> Stream #0:0 (rawvideo)
   split:output1 -> Stream #1:0 (h264_nvenc)
   split:output1 -> Stream #2:0 (h264_nvenc)
   split:output1 -> Stream #3:0 (libx264)
   split:output1 -> Stream #4:0 (libx264)


The result is that I do get a window showing a 2x2 grid of the inputs 
and the files are being written out to disk. However, the inputs are not 
in sync by 0.5-1 seconds. I have tried all kinds of combinations of 
input and output frame rate settings (-r), "vsync" settings, the "fps" 
filter and the "setpts" filter but I have never achieved what I was 
looking for.

I'm sometimes using OBS Studio to do recordings from multiple sources by 
placing the sources next to each other in a large canvas, recording the 
resulting video in 60 FPS and later on splitting the parts within the 
video to separate videos for editing. What OBS seemingly does is to 
simply pull the latest frame available from all input devices, rendering 
them together into the canvas and writing the resulting frame out to 
disk. This kind of synchronization is what I would like to achieve with 
ffmpeg to record to separate files directly.

Now to the core of my question: Is there any way how I can fetch the 
latest frame from all input devices at a fixed interval (60 fps), 
dropping all possibly buffered frames as necessary, to just keep the 
last available frame from each device and writing that out? I would like 
to ignore all timestamps that any device gives me and just fetch data as 
quickly as possible and write that out at a fixed rate starting from a 
common timestamp zero.

I thought the "fps" video filter would do something like that but no 
matter what I tried, ffmpeg always seems to try to do magic to 
synchronize things in a way which do not correspond to real-time. I 
would imagine a hypothetical "live sync" filter that gets n inputs, 
fetches the latest frame available from all devices and sends the result 
(roughly corresponding to the complete rendered image of OBS, just in 
separate streams) out to disk, or rather to the encoders. Or maybe there 
is some solution based on separate (ffmpeg?) processes which provide 
real-time buffers to an aggregating instance of ffmpeg, but I wasn't 
successful with such an approach yet either.

I would be grateful for any advice.

Best regards
Ochi


More information about the ffmpeg-user mailing list