[FFmpeg-user] Synchronizing live sources across multiple / separate FFmpeg commands

Gabriel Balaich roderrooder at gmail.com
Sat May 8 08:21:56 EEST 2021


Hey there, I'm wondering if anybody has any ideas regarding synchronizing
live sources across multiple / separate FFmpeg commands.

Obviously the best-case scenario would be handling all live sources in a
single command, but I just can't find the performance needed to do that.
For example, here's roughly what such a command looks like:
ffmpeg -y `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -video_size 3840x2160 -framerate 60 `
-pixel_format nv12 -i video="OBS Virtual Camera":audio="ADAT (7+8) (RME
Digiface USB)" `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -video_size 3440x1440 -framerate 60 `
-pixel_format nv12 -i video="Video (00-1 Pro Capture Dual HDMI
4K+)":audio="ADAT (3+4) (RME Digiface USB)" `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -i audio="ADAT (1+2) (RME Digiface USB)" `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -video_size 3840x2160 `
-pixel_format nv12 -i video="Game Capture 4K60 Pro MK.2":audio="Game
Capture 4K60 Pro MK.2 Audio" `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -video_size 3840x2160 `
-pixel_format nv12 -i video="Game Capture 4K60 Pro MK.2 (2)":audio="ADAT
(31+32) (RME Digiface USB)" `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -i audio="ADAT (27+28) (RME Digiface USB)" `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -i audio="ADAT (5+6) (RME Digiface USB)" `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -i audio="ADAT (15+16) (RME Digiface USB)" `
-map 0 -map 1 -map 2 -map 3 -map 4 -map 5 -map 6 -map 7 `
-c:v h264_nvenc -gpu 1 -preset p1 -pix_fmt nv12 -r 60 -rc-lookahead 120
-strict_gop 1 -flags +cgop -g 120 -forced-idr 1 `
-sc_threshold 0 -force_key_frames "expr:gte(t,n_forced*2)" -b:v 288M
-minrate 288M -maxrate 288M -bufsize 288M -c:a mp3 -ac 2 `
-ar 44100 -b:a 320k -af "aresample=async=250" -vsync 1
-max_muxing_queue_size 9999 `
-f segment -segment_time 2 -segment_wrap 5400 -segment_list
"C:\Users\gabri\Videos\FFmpeg\Segments\all.m3u8" `
-segment_list_size 5400 -reset_timestamps 1 -segment_format_options
max_delay=0 `
"C:\Users\gabri\Videos\FFmpeg\Segments\all%02d.ts"


When I have everything in a single command like the above, I can use a
filter complex with adelay, tpad, and other A/V filters to synchronize all
the sources. This isn't perfect, I still observe 0-5 frames of variance
between the video sources, but I can live with that. The problem is that
the above command fully saturates a single thread on my CPU (an AMD 1950X)
no matter how I spin it up; there are simply too many sources at too high a
quality. I can confirm that this is the bottleneck by observing per-thread
CPU usage in Task Manager, and I'm positive I don't have other bottlenecks
because if I split the above command into multiple commands, everything
works fine in the performance department. Even excluding just one of the
4K60 video streams lets the command run in real time, but three 4K60
sources appears to be the limit. The only real "solution" I can find is to
split the relevant sources into their own commands, as mentioned before,
and then join all the files back up afterwards:
ffmpeg -y `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -video_size 3840x2160 -framerate 60 `
-pixel_format nv12 -i video="OBS Virtual Camera":audio="ADAT (7+8) (RME
Digiface USB)" `
-map 0 `
-c:v h264_nvenc -gpu 1 -preset p1 -pix_fmt nv12 -r 60 -rc-lookahead 120
-strict_gop 1 -flags +cgop -g 120 -forced-idr 1 `
-sc_threshold 0 -force_key_frames "expr:gte(t,n_forced*2)" -b:v 288M
-minrate 288M -maxrate 288M -bufsize 288M -c:a mp3 -ac 2 `
-ar 44100 -b:a 320k -af "aresample=async=250" -vsync 1
-max_muxing_queue_size 9999 `
-f segment -segment_time 2 -segment_wrap 5400 -segment_list
"C:\Users\gabri\Videos\FFmpeg\Segments\OBS.m3u8" `
-segment_list_size 5400 -reset_timestamps 1 -segment_format_options
max_delay=0 `
"C:\Users\gabri\Videos\FFmpeg\Segments\OBS%02d.ts"

ffmpeg -y `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -video_size 3440x1440 -framerate 60 `
-pixel_format nv12 -i video="Video (00-1 Pro Capture Dual HDMI
4K+)":audio="ADAT (3+4) (RME Digiface USB)" `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -i audio="ADAT (1+2) (RME Digiface USB)" `
-map 0 -map 1 `
-c:v h264_nvenc -gpu 1 -preset p1 -pix_fmt nv12 -r 60 -rc-lookahead 120
-strict_gop 1 -flags +cgop -g 120 -forced-idr 1 `
-sc_threshold 0 -force_key_frames "expr:gte(t,n_forced*2)" -b:v 288M
-minrate 288M -maxrate 288M -bufsize 288M -c:a mp3 -ac 2 `
-ar 44100 -b:a 320k -af "aresample=async=250" -vsync 1
-max_muxing_queue_size 9999 `
-f segment -segment_time 2 -segment_wrap 5400 -segment_list
"C:\Users\gabri\Videos\FFmpeg\Segments\Primary.m3u8" `
-segment_list_size 5400 -reset_timestamps 1 -segment_format_options
max_delay=0 `
"C:\Users\gabri\Videos\FFmpeg\Segments\Primary%02d.ts"

ffmpeg -y `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -video_size 3840x2160 `
-pixel_format nv12 -i video="Game Capture 4K60 Pro MK.2":audio="Game
Capture 4K60 Pro MK.2 Audio" `
-map 0 -map 1 `
-c:v h264_nvenc -gpu 1 -preset p1 -pix_fmt nv12 -r 60 -rc-lookahead 120
-strict_gop 1 -flags +cgop -g 120 -forced-idr 1 `
-sc_threshold 0 -force_key_frames "expr:gte(t,n_forced*2)" -b:v 288M
-minrate 288M -maxrate 288M -bufsize 288M -c:a mp3 -ac 2 `
-ar 44100 -b:a 320k -af "aresample=async=250" -vsync 1
-max_muxing_queue_size 9999 `
-f segment -segment_time 2 -segment_wrap 5400 -segment_list
"C:\Users\gabri\Videos\FFmpeg\Segments\Secondary.m3u8" `
-segment_list_size 5400 -reset_timestamps 1 -segment_format_options
max_delay=0 `
"C:\Users\gabri\Videos\FFmpeg\Segments\Secondary%02d.ts"

ffmpeg -y `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -video_size 3840x2160 `
-pixel_format nv12 -i video="Game Capture 4K60 Pro MK.2 (2)":audio="ADAT
(31+32) (RME Digiface USB)" `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -i audio="ADAT (27+28) (RME Digiface USB)" `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -i audio="ADAT (5+6) (RME Digiface USB)" `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow
-rtbufsize 2147.48M -i audio="ADAT (15+16) (RME Digiface USB)" `
-map 0 -map 1 -map 2 -map 3 `
-c:v h264_nvenc -gpu 1 -preset p1 -pix_fmt nv12 -r 60 -rc-lookahead 120
-strict_gop 1 -flags +cgop -g 120 -forced-idr 1 `
-sc_threshold 0 -force_key_frames "expr:gte(t,n_forced*2)" -b:v 288M
-minrate 288M -maxrate 288M -bufsize 288M -c:a mp3 -ac 2 `
-ar 44100 -b:a 320k -af "aresample=async=250" -vsync 1
-max_muxing_queue_size 9999 `
-f segment -segment_time 2 -segment_wrap 5400 -segment_list
"C:\Users\gabri\Videos\FFmpeg\Segments\Camera.m3u8" `
-segment_list_size 5400 -reset_timestamps 1 -segment_format_options
max_delay=0 `
"C:\Users\gabri\Videos\FFmpeg\Segments\Camera%02d.ts"
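
After the fact, the separate recordings get joined back into one
multi-track file. A minimal sketch of that remux, reading the segment
lists produced above (the output name here is just an example), looks
like:

```
ffmpeg -y `
-i "C:\Users\gabri\Videos\FFmpeg\Segments\OBS.m3u8" `
-i "C:\Users\gabri\Videos\FFmpeg\Segments\Primary.m3u8" `
-i "C:\Users\gabri\Videos\FFmpeg\Segments\Secondary.m3u8" `
-i "C:\Users\gabri\Videos\FFmpeg\Segments\Camera.m3u8" `
-map 0 -map 1 -map 2 -map 3 `
-c copy "C:\Users\gabri\Videos\FFmpeg\Combined.mkv"
```

Since every stream was already encoded by the capture commands, -c copy
keeps the join step cheap; the sync problem described below is the only
thing this step can't fix.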


The problem with this approach is that I can't synchronize these sources
in any meaningful way. Even when I start them all at the same time
programmatically, with something like a PowerShell script, there's a large
sync variance. As I said, with a single command for everything plus
filters I can get every source within 5 frames of each other, but once I
split everything into multiple commands I'm looking at variances of up to
half a second even with filters and such, which is just too large.

The first output / source is my OBS feed via OBS's virtual camera, which
contains several of the other capture cards at any given time. For
example, the "camera" feed may be overlaid on top of the "primary" feed in
the OBS composite. The idea is to have a multi-track file that contains
all the sources alongside the composite from OBS, so I can mute certain
audio tracks or pull up a single video feed in full quality. For example,
if I only have the OBS feed and want to blow up my camera full screen, it
becomes super pixelated, because I'm blowing up 1/8 of the overall image
(when it's overlaid on top of another source). But if I record the camera
as a separate source alongside the OBS feed, I have access to a "raw",
full-resolution feed of the camera. It's important to keep each feed as
synchronized as possible because I use other FFmpeg commands to edit the
output on the fly; having to synchronize sources in post is just too much
overhead.
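
For context, the "start them all at the same time" attempt is essentially
this kind of PowerShell sketch, where each hypothetical .ps1 wraps one of
the ffmpeg commands above:

```
# Hypothetical wrapper scripts, one per ffmpeg command above
$scripts = "obs.ps1", "primary.ps1", "secondary.ps1", "camera.ps1"
foreach ($s in $scripts) {
    # Start-Process returns immediately, so all four launch back to back
    Start-Process powershell -ArgumentList "-File", $s
}
```

Even launched back to back like this, each ffmpeg instance opens and
negotiates its devices on its own schedule, which is presumably where the
up-to-half-a-second offset comes from.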

So once again, I'm wondering if anyone has a trick up their sleeve for a
problem like this. I'm thinking that if there were some way to pause each
command, and/or discard data until everything is running and then resume,
maybe I could sync things up better. Any ideas?
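
One direction I haven't tested, sketched here as an assumption rather
than a known fix: stamping each dshow input with wall-clock timestamps via
the generic demuxer option -use_wallclock_as_timestamps, so the separate
recordings carry a shared clock that can be compared afterwards. The OBS
command's input section would become something like:

```
ffmpeg -y `
-guess_layout_max 0 -thread_queue_size 9999 -indexmem 9999 -f dshow `
-use_wallclock_as_timestamps 1 `
-rtbufsize 2147.48M -video_size 3840x2160 -framerate 60 `
-pixel_format nv12 -i video="OBS Virtual Camera":audio="ADAT (7+8) (RME
Digiface USB)" `
...
```

With wall-clock PTS on every recording, the per-command start offsets at
least become measurable, and the recordings could in principle be trimmed
to a common start time. How this interacts with -reset_timestamps 1 in
the segment muxer is an open question, so treat it as a sketch only.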

