[FFmpeg-user] Non-monotonous DTS in output stream

Pieter Venter pietventer at gmail.com
Wed Feb 10 02:19:58 EET 2021


Hello,

I'm trying to convert some AVI files that originally came from a Sony
Handycam to MP4, and for most of them that works.
However, a couple of files are giving me trouble and I've spent days trying
to figure out why.

I'm new here but have done my homework as best I could:
* I downloaded the latest ffmpeg build I could find.
* Searched the forums for solutions/switches (-af
aresample=async=1, -fflags +igndts, -fflags +sortdts).
* Tried other tools to extract the audio.
* Pasted the full command and output for your reference.
* Will try to not "top post". That seems to be a thing here.

If I run the following command, audio is processed correctly up to about
the 43s mark, then becomes slower than expected (i.e. voices are deep,
audio in "slow mo" but video plays normal).

The original file plays correctly with VLC and Video on Linux. It does not
play correctly with mplayer (same slowed down audio issue past 43 seconds).

ffmpeg -i input.avi -c:v libx264 -preset fast -crf 21 output.mp4
ffmpeg version N-55863-g9f38fac053-static https://johnvansickle.com/ffmpeg/
 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --enable-gpl --enable-version3 --enable-static
--disable-debug --disable-ffplay --disable-indev=sndio
--disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r
--enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom
--enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype
--enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb
--enable-libopenjpeg --enable-librubberband --enable-libsoxr
--enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus
--enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc
--enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265
--enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi
--enable-libzimg
  libavutil      56. 64.100 / 56. 64.100
  libavcodec     58.119.100 / 58.119.100
  libavformat    58. 65.101 / 58. 65.101
  libavdevice    58. 11.103 / 58. 11.103
  libavfilter     7.100.100 /  7.100.100
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
  libpostproc    55.  8.100 / 55.  8.100
[avi @ 0x620c340] Switching to NI mode, due to poor interleaving
Input #0, avi, from 'input.avi':
  Duration: 00:07:00.00, start: 0.000000, bitrate: 28806 kb/s
    Stream #0:0: Video: dvvideo, yuv420p, 720x576 [SAR 16:15 DAR 4:3],
25000 kb/s, 25 fps, 25 tbr, 25 tbn, 25 tbc
    Stream #0:1: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
    Stream #0:2: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (dvvideo (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 0x6233580] using SAR=16/15
[libx264 @ 0x6233580] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
AVX FMA3 BMI2 AVX2
[libx264 @ 0x6233580] profile High, level 3.0, 4:2:0, 8-bit
[libx264 @ 0x6233580] 264 - core 161 r3040 35417dc - H.264/MPEG-4 AVC codec
- Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1
ref=2 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=6 psy=1 psy_rd=1.00:0.00
mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11
fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2
sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0
constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1
weightb=1 open_gop=0 weightp=1 keyint=250 keyint_min=25 scenecut=40
intra_refresh=0 rc_lookahead=30 rc=crf mbtree=1 crf=21.0 qcomp=0.60 qpmin=0
qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output.mp4':
  Metadata:
    encoder         : Lavf58.65.101
    Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv420p(bottom coded
first (swapped)), 720x576 [SAR 16:15 DAR 4:3], q=2-31, 25 fps, 12800 tbn
    Metadata:
      encoder         : Lavc58.119.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
    Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 32000 Hz, stereo,
fltp, 128 kb/s
    Metadata:
      encoder         : Lavc58.119.100 aac
frame=10500 fps=116 q=-1.0 Lsize=  168876kB time=00:07:00.02
bitrate=3293.7kbits/s speed=4.63x
video:158898kB audio:9545kB subtitle:0kB other streams:0kB global
headers:0kB muxing overhead: 0.257381%
[libx264 @ 0x6233580] frame I:93    Avg QP:20.65  size: 54098
[libx264 @ 0x6233580] frame P:3607  Avg QP:23.10  size: 22302
[libx264 @ 0x6233580] frame B:6800  Avg QP:24.65  size: 11358
[libx264 @ 0x6233580] consecutive B-frames: 13.4%  0.6%  0.7% 85.3%
[libx264 @ 0x6233580] mb I  I16..4:  1.2% 98.2%  0.6%
[libx264 @ 0x6233580] mb P  I16..4:  0.5% 32.9%  0.4%  P16..4: 35.4% 17.7%
 9.8%  0.0%  0.0%    skip: 3.2%
[libx264 @ 0x6233580] mb B  I16..4:  2.2% 25.7%  0.2%  B16..8: 22.8%  8.7%
 0.7%  direct:28.9%  skip:10.8%  L0:40.2% L1:32.6% BI:27.2%
[libx264 @ 0x6233580] 8x8 transform intra:93.9% inter:82.3%
[libx264 @ 0x6233580] coded y,uvDC,uvAC intra: 82.4% 88.7% 30.9% inter:
40.1% 65.9% 1.2%
[libx264 @ 0x6233580] i16 v,h,dc,p: 21% 19% 34% 26%
[libx264 @ 0x6233580] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 14% 40%  5%  5%
 5%  5%  6%  6%
[libx264 @ 0x6233580] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu:  9% 54% 15%  4%  4%
 3%  4%  3%  4%
[libx264 @ 0x6233580] i8c dc,h,v,p: 59% 19% 19%  3%
[libx264 @ 0x6233580] Weighted P-Frames: Y:8.2% UV:1.4%
[libx264 @ 0x6233580] ref P L0: 56.9% 43.1%
[libx264 @ 0x6233580] ref B L0: 74.0% 26.0%
[libx264 @ 0x6233580] ref B L1: 95.3%  4.7%
[libx264 @ 0x6233580] kb/s:3099.25
[aac @ 0x6234fc0] Qavg: 189.772

To narrow down the problem area, I encoded only up to the 43s mark and
dumped the audio.
If I run it for the whole file, there are hundreds, if not thousands, of
entries like "Non-monotonous DTS..."

ffmpeg -t 00:00:43 -i input.avi -map 0:a:0 -c:a aac output.avi
ffmpeg version N-55863-g9f38fac053-static https://johnvansickle.com/ffmpeg/
 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --enable-gpl --enable-version3 --enable-static
--disable-debug --disable-ffplay --disable-indev=sndio
--disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r
--enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom
--enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype
--enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb
--enable-libopenjpeg --enable-librubberband --enable-libsoxr
--enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus
--enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc
--enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265
--enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi
--enable-libzimg
  libavutil      56. 64.100 / 56. 64.100
  libavcodec     58.119.100 / 58.119.100
  libavformat    58. 65.101 / 58. 65.101
  libavdevice    58. 11.103 / 58. 11.103
  libavfilter     7.100.100 /  7.100.100
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
  libpostproc    55.  8.100 / 55.  8.100
[avi @ 0x5b922c0] Switching to NI mode, due to poor interleaving
Input #0, avi, from 'input.avi':
  Duration: 00:07:00.00, start: 0.000000, bitrate: 28806 kb/s
    Stream #0:0: Video: dvvideo, yuv420p, 720x576 [SAR 16:15 DAR 4:3],
25000 kb/s, 25 fps, 25 tbr, 25 tbn, 25 tbc
    Stream #0:1: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
    Stream #0:2: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
File 'output.avi' already exists. Overwrite? [y/N] y
Stream mapping:
  Stream #0:1 -> #0:0 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
Output #0, avi, to 'output.avi':
  Metadata:
    ISFT            : Lavf58.65.101
    Stream #0:0: Audio: aac (LC) ([255][0][0][0] / 0x00FF), 32000 Hz,
stereo, fltp, 128 kb/s
    Metadata:
      encoder         : Lavc58.119.100 aac
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1341,
current: 1339; changing to 1342. This may result in incorrect timestamps in
the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1342,
current: 1340; changing to 1343. This may result in incorrect timestamps in
the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1343,
current: 1340; changing to 1344. This may result in incorrect timestamps in
the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1344,
current: 1341; changing to 1345. This may result in incorrect timestamps in
the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1345,
current: 1342; changing to 1346. This may result in incorrect timestamps in
the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1346,
current: 1342; changing to 1347. This may result in incorrect timestamps in
the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1347,
current: 1343; changing to 1348. This may result in incorrect timestamps in
the output file.
[avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1348,
current: 1343; changing to 1349. This may result in incorrect timestamps in
the output file.
size=     714kB time=00:00:43.16 bitrate= 135.6kbits/s speed=96.2x
video:0kB audio:676kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 5.592729%
[aac @ 0x5bb9980] Qavg: 225.541
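
The warnings above come from the AVI muxer on the output side. To see
whether the timestamps already go backwards in the input, something like
the following might help. It is only a sketch: find_backward_dts is a
helper name I made up, the sample numbers are invented to show the output
format, and I am assuming ffprobe prints one integer DTS per line with
these options (packets without a DTS would show up as N/A).

```shell
#!/bin/sh
# Print every line whose DTS is not greater than the previous one.
# Reads one integer DTS per line on stdin.
find_backward_dts() {
    awk 'NR > 1 && $1 <= prev { printf "packet %d: dts %s after %s\n", NR, $1, prev }
         { prev = $1 }'
}

# Intended use (assumes ffprobe is installed and input.avi is the problem file):
#   ffprobe -v error -select_streams a:0 -show_entries packet=dts \
#           -of csv=p=0 input.avi | find_backward_dts

# Invented sample values, only to show what a hit looks like.
printf '%s\n' 100 101 102 101 103 | find_backward_dts
```

An empty result on the real file would suggest the input timestamps are in
order and the problem is introduced later in the chain.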

The only workaround I have found so far is to:
a. extract all the audio using ffmpeg (total audio file length is 10m09s)
b. use Audacity to carefully select the audio from 0m43s to the end and
"shrink" it so the total comes to 7m00s (the original file length)
c. create a new video file from the processed audio stream
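
Steps a to c could probably be scripted along these lines instead of done
by hand. This is an untested sketch under several assumptions: audio.wav
(the audio extracted in step a) and fixed.wav are hypothetical file names,
the 43s, 7m00s and 10m09s figures are the rounded values from above, and I
am assuming the "shrink" should change speed and pitch together (the
voices sound deep), which is why it uses asetrate rather than atempo. The
script only prints the ffmpeg commands so they can be checked before
running them.

```shell
#!/bin/sh
# Durations taken from the message (seconds).
good=43        # audio sounds right up to here
video_len=420  # 7m00s, length of the original file
audio_len=609  # 10m09s, length of the extracted audio
rate=32000     # sample rate ffmpeg reports for the input audio

# Factor by which the audio after the 43 s mark has to be sped up.
factor=$(awk -v a="$audio_len" -v v="$video_len" -v g="$good" \
    'BEGIN { printf "%.4f", (a - g) / (v - g) }')

# asetrate relabels the samples with a higher rate (raises speed and pitch);
# aresample converts back to the original rate so concat accepts both parts.
new_rate=$(awk -v r="$rate" -v f="$factor" 'BEGIN { printf "%d", r * f + 0.5 }')

filter="[0:a]asplit[a][b];\
[a]atrim=end=${good}[head];\
[b]atrim=start=${good},asetpts=PTS-STARTPTS,asetrate=${new_rate},aresample=${rate}[tail];\
[head][tail]concat=n=2:v=0:a=1[out]"

echo "speed factor: $factor"
# Step b: fix the audio (audio.wav is the hypothetical output of step a).
printf '%s\n' "ffmpeg -i audio.wav -filter_complex '$filter' -map '[out]' fixed.wav"
# Step c: encode the video again and mux in the fixed audio.
printf '%s\n' "ffmpeg -i input.avi -i fixed.wav -map 0:v:0 -map 1:a:0 -c:v libx264 -preset fast -crf 21 -c:a aac output.mp4"
```

With the rounded durations the factor comes out close to 1.5, which is also
the ratio 48000/32000. Whether that means anything for this file is a
guess on my part.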

Is there a way to troubleshoot, ignore, or correct the Non-monotonous DTS
error?
Thanks for your help.

