[FFmpeg-trac] #7182(avformat:new): Asynchronity when muxing Opus in Matroska

FFmpeg trac at avcodec.org
Wed May 2 18:34:59 EEST 2018


#7182: Asynchronity when muxing Opus in Matroska
-----------------------------------+--------------------------------------
             Reporter:  mkver      |                     Type:  defect
               Status:  new        |                 Priority:  normal
            Component:  avformat   |                  Version:  git-master
             Keywords:  mkv, opus  |               Blocked By:
             Blocking:             |  Reproduced by developer:  0
Analyzed by developer:  0          |
-----------------------------------+--------------------------------------
 Muxing Opus into Matroska currently desynchronizes the audio from the
 other tracks because the muxer doesn't account for the fact that
 Matroska's CodecDelay element already implies a timestamp offset.
 Before turning to the detailed explanation, let me say that I used this
 version of ffmpeg (the latest of Zeranoe's builds, built today; I'm
 declaring the version to be git-master although git-master is ahead by
 one completely unrelated commit):
 {{{
 ffmpeg version N-90920-ge07b1913fc Copyright (c) 2000-2018 the FFmpeg
 developers
   built with gcc 7.3.0 (GCC)
   configuration: --disable-static --enable-shared --enable-gpl --enable-
 version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls
 --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype
 --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb
 --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy
 --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx
 --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265
 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp
 --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-
 libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-
 libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va
 --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth
   libavutil      56. 18.100 / 56. 18.100
   libavcodec     58. 19.100 / 58. 19.100
   libavformat    58. 13.100 / 58. 13.100
   libavdevice    58.  4.100 / 58.  4.100
   libavfilter     7. 21.100 /  7. 21.100
   libswscale      5.  2.100 /  5.  2.100
   libswresample   3.  2.100 /  3.  2.100
   libpostproc    55.  2.100 / 55.  2.100
 }}}
 The nullsrc and anullsrc filters create tracks whose timestamps (both pts
 and dts) start at zero:
 {{{
 ffmpeg.exe -f lavfi -i nullsrc -f lavfi -i anullsrc -t 0.2 -f framehash
 -hash crc32 -

 #format: frame checksums
 #version: 2
 #hash: CRC32
 #software: Lavf58.13.100
 #tb 0: 1/25
 #media_type 0: video
 #codec_id 0: rawvideo
 #dimensions 0: 320x240
 #sar 0: 1/1
 #tb 1: 1/44100
 #media_type 1: audio
 #codec_id 1: pcm_s16le
 #sample_rate 1: 44100
 #channel_layout 1: 3
 #channel_layout_name 1: stereo
 #stream#, dts,        pts, duration,     size, hash
 0,          0,          0,        1,   115200, 2a01c517
 1,          0,          0,     1024,     4096, c71c0011
 1,       1024,       1024,     1024,     4096, c71c0011
 0,          1,          1,        1,   115200, 2a01c517
 1,       2048,       2048,     1024,     4096, c71c0011
 1,       3072,       3072,     1024,     4096, c71c0011
 0,          2,          2,        1,   115200, 2a01c517
 1,       4096,       4096,     1024,     4096, c71c0011
 1,       5120,       5120,     1024,     4096, c71c0011
 0,          3,          3,        1,   115200, 2a01c517
 1,       6144,       6144,     1024,     4096, c71c0011
 0,          4,          4,        1,   115200, 2a01c517
 1,       7168,       7168,     1024,     4096, c71c0011
 1,       8192,       8192,      628,     2512, 3f99da8d
 }}}
 If one encodes the audio, the pts and dts of the audio are shifted by the
 number of samples of encoder delay that the encoding process entails, so
 that the output audio that actually corresponds to input samples has the
 same timestamps as those input samples:
 {{{
 ffmpeg.exe -f lavfi -i nullsrc -f lavfi -i anullsrc -c:a libopus -t 0.5 -f
 framehash -hash crc32 -

 #format: frame checksums
 #version: 2
 #hash: CRC32
 #extradata 1,                              19, ea5d642a
 #software: Lavf58.13.100
 #tb 0: 1/25
 #media_type 0: video
 #codec_id 0: rawvideo
 #dimensions 0: 320x240
 #sar 0: 1/1
 #tb 1: 1/48000
 #media_type 1: audio
 #codec_id 1: opus
 #sample_rate 1: 48000
 #channel_layout 1: 3
 #channel_layout_name 1: stereo
 #stream#, dts,        pts, duration,     size, hash
 1,       -312,       -312,      960,        3, 8abe71cf
 0,          0,          0,        1,   115200, 2a01c517
 1,        648,        648,      960,        3, 8abe71cf
 1,       1608,       1608,      960,        3, 8abe71cf
 0,          1,          1,        1,   115200, 2a01c517
 1,       2568,       2568,      960,        3, 8abe71cf
 1,       3528,       3528,      960,        3, 8abe71cf
 0,          2,          2,        1,   115200, 2a01c517
 1,       4488,       4488,      960,        3, 8abe71cf
 1,       5448,       5448,      960,        3, 8abe71cf
 0,          3,          3,        1,   115200, 2a01c517
 1,       6408,       6408,      960,        3, 8abe71cf
 1,       7368,       7368,      960,        3, 8abe71cf
 0,          4,          4,        1,   115200, 2a01c517
 1,       8328,       8328,      960,        3, 8abe71cf
 1,       9288,       9288,      960,        3, 8abe71cf
 0,          5,          5,        1,   115200, 2a01c517
 1,      10248,      10248,      960,        3, 8abe71cf
 1,      11208,      11208,      960,        3, 8abe71cf
 0,          6,          6,        1,   115200, 2a01c517
 1,      12168,      12168,      960,        3, 8abe71cf
 1,      13128,      13128,      960,        3, 8abe71cf
 0,          7,          7,        1,   115200, 2a01c517
 1,      14088,      14088,      960,        3, 8abe71cf
 1,      15048,      15048,      960,        3, 8abe71cf
 0,          8,          8,        1,   115200, 2a01c517
 1,      16008,      16008,      960,        3, 8abe71cf
 1,      16968,      16968,      960,        3, 8abe71cf
 0,          9,          9,        1,   115200, 2a01c517
 1,      17928,      17928,      960,        3, 8abe71cf
 1,      18888,      18888,      960,        3, 8abe71cf
 0,         10,         10,        1,   115200, 2a01c517
 1,      19848,      19848,      960,        3, 8abe71cf
 1,      20808,      20808,      960,        3, 8abe71cf
 0,         11,         11,        1,   115200, 2a01c517
 1,      21768,      21768,      960,        3, 8abe71cf
 1,      22728,      22728,      960,        3, 8abe71cf
 0,         12,         12,        1,   115200, 2a01c517
 1,      23688,      23688,      312,        3, 8abe71cf, S=1,       10,
 6ba9ada3
 }}}
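 The audio timestamps in the listing above follow a simple pattern. As a
 minimal sketch (not ffmpeg code), assuming a fixed frame size of 960
 samples and an encoder delay of 312 samples, both read off the listing,
 the following Python reproduces the pts and duration columns. The function
 name expected_packets is made up for illustration.

```python
SAMPLE_RATE = 48000     # audio time base in the listing is 1/48000
FRAME_SIZE = 960        # samples per packet (20 ms), from the duration column
ENCODER_DELAY = 312     # samples; assumption read off the first pts (-312)

def expected_packets(duration_s):
    """Return (pts, duration) pairs for an encode of duration_s seconds."""
    total = round(duration_s * SAMPLE_RATE)   # number of input samples
    packets = []
    pts = -ENCODER_DELAY                      # first packet starts before zero
    while pts < total:
        # The last packet is cut short so that the stream ends at `total`.
        packets.append((pts, min(FRAME_SIZE, total - pts)))
        pts += FRAME_SIZE
    return packets

packets = expected_packets(0.5)
print(packets[0], packets[1], packets[-1])
print(ENCODER_DELAY * 1000 / SAMPLE_RATE, "ms of encoder delay")
```

 For -t 0.5 this yields 26 packets, the first at pts -312 and the last at
 pts 23688 with a duration of 312, as in the framehash output.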
 If one now muxes this into Matroska (in order to use a codec that is valid
 in Matroska, I encoded the video with libx264 and -tune zerolatency so as
 not to run into #4536), the encoder delay of 312 samples (6.5ms) from
 above leads to a shift of all timestamps by that amount so that all
 timestamps become non-negative; this happens with every audio codec and is
 not Opus-specific:
 {{{
 ffmpeg.exe -f lavfi -i nullsrc -f lavfi -i anullsrc -c:v libx264 -tune
 zerolatency -c:a libopus -t 0.5 -f matroska test.mkv
 mkvinfo -s test.mkv
 Track 1: video, codec ID: V_MPEG4/ISO/AVC (h.264 profile: High @L1.3),
 mkvmerge/mkvextract track ID: 0, language: und, default duration: 40.000ms
 (25.000 frames/fields per second for a video track), pixel width: 320,
 pixel height: 240
 Track 2: audio, codec ID: A_OPUS, mkvmerge/mkvextract track ID: 1,
 language: und, channels: 2, sampling freq: 48000, bits per sample: 16
 I frame, track 2, timestamp 00:00:00.000000000, size 3, adler 0x05f302fa
 I frame, track 1, timestamp 00:00:00.007000000, size 812, adler 0x080a17e4
 I frame, track 2, timestamp 00:00:00.021000000, size 3, adler 0x05f302fa
 I frame, track 2, timestamp 00:00:00.041000000, size 3, adler 0x05f302fa
 P frame, track 1, timestamp 00:00:00.047000000, size 51, adler 0xa07a11ec
 I frame, track 2, timestamp 00:00:00.061000000, size 3, adler 0x05f302fa
 I frame, track 2, timestamp 00:00:00.081000000, size 3, adler 0x05f302fa
 P frame, track 1, timestamp 00:00:00.087000000, size 61, adler 0x76721649
 I frame, track 2, timestamp 00:00:00.101000000, size 3, adler 0x05f302fa
 I frame, track 2, timestamp 00:00:00.121000000, size 3, adler 0x05f302fa
 P frame, track 1, timestamp 00:00:00.127000000, size 65, adler 0x23a11875
 I frame, track 2, timestamp 00:00:00.141000000, size 3, adler 0x05f302fa
 I frame, track 2, timestamp 00:00:00.161000000, size 3, adler 0x05f302fa
 P frame, track 1, timestamp 00:00:00.167000000, size 65, adler 0x249f181b
 I frame, track 2, timestamp 00:00:00.181000000, size 3, adler 0x05f302fa
 I frame, track 2, timestamp 00:00:00.201000000, size 3, adler 0x05f302fa
 P frame, track 1, timestamp 00:00:00.207000000, size 65, adler 0x334918bd
 I frame, track 2, timestamp 00:00:00.221000000, size 3, adler 0x05f302fa
 I frame, track 2, timestamp 00:00:00.241000000, size 3, adler 0x05f302fa
 P frame, track 1, timestamp 00:00:00.247000000, size 65, adler 0x34021860
 I frame, track 2, timestamp 00:00:00.261000000, size 3, adler 0x05f302fa
 I frame, track 2, timestamp 00:00:00.281000000, size 3, adler 0x05f302fa
 P frame, track 1, timestamp 00:00:00.287000000, size 65, adler 0x42ac1902
 I frame, track 2, timestamp 00:00:00.301000000, size 3, adler 0x05f302fa
 I frame, track 2, timestamp 00:00:00.321000000, size 3, adler 0x05f302fa
 P frame, track 1, timestamp 00:00:00.327000000, size 65, adler 0x085c17a3
 I frame, track 2, timestamp 00:00:00.341000000, size 3, adler 0x05f302fa
 I frame, track 2, timestamp 00:00:00.361000000, size 3, adler 0x05f302fa
 P frame, track 1, timestamp 00:00:00.367000000, size 65, adler 0x17061845
 I frame, track 2, timestamp 00:00:00.381000000, size 3, adler 0x05f302fa
 I frame, track 2, timestamp 00:00:00.401000000, size 3, adler 0x05f302fa
 P frame, track 1, timestamp 00:00:00.407000000, size 65, adler 0x180417eb
 I frame, track 2, timestamp 00:00:00.421000000, size 3, adler 0x05f302fa
 I frame, track 2, timestamp 00:00:00.441000000, size 3, adler 0x05f302fa
 P frame, track 1, timestamp 00:00:00.447000000, size 65, adler 0x2669188a
 I frame, track 2, timestamp 00:00:00.461000000, size 3, adler 0x05f302fa
 I frame, track 2, timestamp 00:00:00.481000000, size 3, adler 0x05f302fa
 P frame, track 1, timestamp 00:00:00.487000000, size 65, adler 0x2722182d
 I frame, track 2, timestamp 00:00:00.501000000, size 3, adler 0x05f302fa
 }}}
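 The millisecond values printed by mkvinfo can be reproduced from the
 encoder timestamps. The sketch below is a model, not the muxer's code: it
 assumes that timestamps are first rescaled to the 1/1000 time base with
 halves rounded away from zero and then shifted so that the smallest one
 becomes zero. That order is inferred from the listing (21ms rather than
 20ms for the second Opus block); the helper name rescale_near_inf is
 hypothetical.

```python
from fractions import Fraction

def rescale_near_inf(value, src_tb, dst_tb):
    """Rescale value from src_tb to dst_tb, rounding halves away from zero."""
    exact = value * src_tb / dst_tb          # exact rational result
    sign = -1 if exact < 0 else 1
    return sign * int(abs(exact) + Fraction(1, 2))

AUDIO_TB = Fraction(1, 48000)   # Opus time base from the framehash output
VIDEO_TB = Fraction(1, 25)      # video time base from the framehash output
MKV_TB = Fraction(1, 1000)      # 1 ms, as seen in the Matroska listings

audio_pts = [-312 + 960 * n for n in range(4)]   # first four Opus packets
video_pts = [0, 1, 2]                            # first three video frames

audio_ms = [rescale_near_inf(p, AUDIO_TB, MKV_TB) for p in audio_pts]
video_ms = [rescale_near_inf(p, VIDEO_TB, MKV_TB) for p in video_pts]

# Shift everything so that the earliest timestamp lands on zero.
shift = -min(audio_ms + video_ms)
muxed_audio = [t + shift for t in audio_ms]
muxed_video = [t + shift for t in video_ms]
print(muxed_audio)
print(muxed_video)
```

 Under these assumptions the audio blocks land at 0, 21, 41 and 61ms and
 the video frames at 7, 47 and 87ms, which agrees with the mkvinfo output.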
 So the encoder delay gets baked into the usual timestamps. But for Opus
 the encoder delay is also signalled via the CodecDelay element in the
 Opus track header. The semantics of this element imply that the first
 6.5ms of audio should be discarded and that the audio for time t has
 Matroska time t+6.5ms (i.e. the second Opus block at 20ms actually has a
 timestamp of 13.5ms). This means that the synchronization between the Opus
 track and the other tracks is shifted by the encoder delay, as can be seen
 e.g. in the output of the Matroska demuxer:
 {{{
 ffmpeg.exe -copyts -i test.mkv -c copy -f framehash -hash crc32 -

 #format: frame checksums
 #version: 2
 #hash: CRC32
 #extradata 0,                              40, 8237cd92
 #extradata 1,                              19, ea5d642a
 #software: Lavf58.13.100
 #tb 0: 1/1000
 #media_type 0: video
 #codec_id 0: h264
 #dimensions 0: 320x240
 #sar 0: 1/1
 #tb 1: 1/1000
 #media_type 1: audio
 #codec_id 1: opus
 #sample_rate 1: 48000
 #channel_layout 1: 3
 #channel_layout_name 1: stereo
 #stream#, dts,        pts, duration,     size, hash
 1,         -7,         -7,       20,        3, 8abe71cf
 0,          7,          7,       40,      812, dbac8e3e
 1,         14,         14,       20,        3, 8abe71cf
 1,         34,         34,       20,        3, 8abe71cf
 0,         47,         47,       40,       51, 4885e758
 1,         54,         54,       20,        3, 8abe71cf
 1,         74,         74,       20,        3, 8abe71cf
 0,         87,         87,       40,       61, 5c29c696
 1,         94,         94,       20,        3, 8abe71cf
 1,        114,        114,       20,        3, 8abe71cf
 0,        127,        127,       40,       65, 2832137b
 1,        134,        134,       20,        3, 8abe71cf
 1,        154,        154,       20,        3, 8abe71cf
 0,        167,        167,       40,       65, 985e3247
 1,        174,        174,       20,        3, 8abe71cf
 1,        194,        194,       20,        3, 8abe71cf
 0,        207,        207,       40,       65, 85567570
 1,        214,        214,       20,        3, 8abe71cf
 1,        234,        234,       20,        3, 8abe71cf
 0,        247,        247,       40,       65, c623be44
 1,        254,        254,       20,        3, 8abe71cf
 1,        274,        274,       20,        3, 8abe71cf
 0,        287,        287,       40,       65, db2bf973
 1,        294,        294,       20,        3, 8abe71cf
 1,        314,        314,       20,        3, 8abe71cf
 0,        327,        327,       40,       65, 49d46f1e
 1,        334,        334,       20,        3, 8abe71cf
 1,        354,        354,       20,        3, 8abe71cf
 0,        367,        367,       40,       65, 54dc2829
 1,        374,        374,       20,        3, 8abe71cf
 1,        394,        394,       20,        3, 8abe71cf
 0,        407,        407,       40,       65, 584b1113
 1,        414,        414,       20,        3, 8abe71cf
 1,        434,        434,       20,        3, 8abe71cf
 0,        447,        447,       40,       65, 0aa1a42a
 1,        454,        454,       20,        3, 8abe71cf
 1,        474,        474,       20,        3, 8abe71cf
 0,        487,        487,       40,       65, f52f7718
 1,        494,        494,       20,        3, 8abe71cf, S=1,       10,
 6ba9ada3
 }}}
 (Without -copyts the timestamps would be shifted to make them non-
 negative.)
 As one sees, this is essentially a shift by the encoder delay. If one
 makes repeated demuxer->muxer roundtrips, the tracks get further out of
 sync with each roundtrip.
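
 The accumulation over roundtrips can be illustrated with a toy model. It
 is a simplification under stated assumptions, not the actual muxer or
 demuxer logic: muxing shifts both tracks so that the earliest timestamp is
 zero, demuxing subtracts the CodecDelay (taken as 7ms, i.e. 6.5ms after
 rounding, as in the listing above) from the Opus track only, and only the
 first timestamp of each track is followed. All function names are
 hypothetical.

```python
CODEC_DELAY_MS = 7   # assumption: 6.5 ms encoder delay after rounding to 1 ms

def mux(audio_start, video_start):
    """Shift both tracks so that no timestamp is negative."""
    shift = -min(audio_start, video_start, 0)
    return audio_start + shift, video_start + shift

def demux(audio_start, video_start):
    """Subtract CodecDelay from the Opus track only."""
    return audio_start - CODEC_DELAY_MS, video_start

def offsets_after_roundtrips(n):
    """Return video_start - audio_start (in ms) after each roundtrip."""
    # Encoder output: the first Opus packet precedes the first video frame.
    audio, video = -CODEC_DELAY_MS, 0
    history = []
    for _ in range(n):
        audio, video = demux(*mux(audio, video))
        history.append(video - audio)
    return history

print(offsets_after_roundtrips(3))
```

 In the encoder output the first video frame is 7ms after the first Opus
 packet. In this model the distance is 14ms after one roundtrip (matching
 the -7 and 7 in the demuxer output above), then 21ms and 28ms, so every
 roundtrip adds one more encoder delay.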

--
Ticket URL: <https://trac.ffmpeg.org/ticket/7182>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker

