[FFmpeg-trac] #5033(undetermined:new): Incorrect duration when converting WAV/MP3 files to AAC
FFmpeg
trac at avcodec.org
Tue Nov 24 00:58:06 CET 2015
#5033: Incorrect duration when converting WAV/MP3 files to AAC
--------------------------------------+----------------------------------
 Reporter: ausjjtkd                   | Type: defect
 Status: new                          | Priority: normal
 Component: undetermined              | Version: 2.8.1
 Keywords:                            | Blocked By:
 Blocking:                            | Reproduced by developer: 0
 Analyzed by developer: 0             |
--------------------------------------+----------------------------------
Summary of the bug:
The reported duration of the encoded audio is wrong after converting WAV
or MP3 input (the MP3 was created from the same WAV file) to AAC.
I assume this also affects the duration of MP4 videos created from a
still image plus audio, which is the reason I'm reporting this bug.
How to reproduce:
(espeak is a TTS engine; its -w option saves the output speech to a WAV
file.)
{{{
$ espeak -s 80 -w /tmp/in.wav 'This product is meant for educational
purposes only. Any resemblance to real persons, living or dead is purely
coincidental. Void where prohibited. Some assembly required. List each
check separately by bank number. Batteries not included. Contents may
settle during shipment. Use only as directed. No other warranty expressed
or implied. Do not use while operating a motor vehicle or heavy equipment.
Postage will be paid by addressee. Subject to CARB approval.'
$ ffmpeg -y -i /tmp/in.wav -vn -c:a aac -strict -2 -ab 24k -ar 16000 \
    /tmp/out.aac
ffmpeg version 2.8.2-static http://johnvansickle.com/ffmpeg/ Copyright
(c) 2000-2015 the FFmpeg developers
built with gcc 5.2.1 (Debian 5.2.1-23) 20151028
configuration: --enable-gpl --enable-version3 --disable-shared
--disable-debug --enable-runtime-cpudetect --enable-libmp3lame --enable-
libx264 --enable-libx265 --enable-libwebp --enable-libspeex --enable-
libvorbis --enable-libvpx --enable-libfreetype --enable-fontconfig
--enable-libxvid --enable-libopencore-amrnb --enable-libopencore-amrwb
--enable-libtheora --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-
gray --enable-libopenjpeg --enable-libopus --enable-libass --enable-gnutls
--enable-libvidstab --enable-libsoxr --enable-frei0r --enable-libfribidi
--disable-indev=sndio --disable-outdev=sndio --cc=gcc
libavutil 54. 31.100 / 54. 31.100
libavcodec 56. 60.100 / 56. 60.100
libavformat 56. 40.101 / 56. 40.101
libavdevice 56. 4.100 / 56. 4.100
libavfilter 5. 40.101 / 5. 40.101
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 2.101 / 1. 2.101
libpostproc 53. 3.100 / 53. 3.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from '/tmp/in.wav':
Duration: 00:01:00.33, bitrate: 352 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 22050 Hz, 1
channels, s16, 352 kb/s
Output #0, adts, to '/tmp/out.aac':
Metadata:
encoder : Lavf56.40.101
Stream #0:0: Audio: aac, 16000 Hz, mono, fltp, 24 kb/s
Metadata:
encoder : Lavc56.60.100 aac
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
size= 195kB time=00:01:00.35 bitrate= 26.5kbits/s
video:0kB audio:189kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 3.422557%
$ for i in /tmp/in.wav /tmp/out.aac; do
    ffprobe -i "$i" 2>&1 | grep Duration
  done
Duration: 00:01:00.33, bitrate: 352 kb/s
Duration: 00:03:19.68, bitrate: 8 kb/s
}}}
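A possible reading of the 00:03:19.68 figure, offered as an assumption and
not as a confirmed diagnosis: a raw ADTS stream has no container-level
duration field, so ffprobe may be estimating the duration from the file
size and a guessed bitrate. The sketch below only redoes that arithmetic
with the rounded numbers printed in the log above (size= 195kB, bitrate:
8 kb/s). `estimate_duration_seconds` and `format_hms` are hypothetical
helpers, not FFmpeg functions, and the unit conventions are assumed.

```python
def estimate_duration_seconds(size_kib: float, bitrate_kbps: float) -> float:
    """Duration implied by a file size and a constant bitrate.

    Assumption: "kB" in the ffmpeg log means 1024 bytes and "kb/s"
    means 1000 bits per second.
    """
    if bitrate_kbps <= 0:
        raise ValueError("bitrate must be positive")
    size_bits = size_kib * 1024 * 8
    return size_bits / (bitrate_kbps * 1000)


def format_hms(seconds: float) -> str:
    """Format seconds as HH:MM:SS.cc, the layout ffprobe prints."""
    centis = round(seconds * 100)
    hours, rem = divmod(centis, 360000)
    minutes, rem = divmod(rem, 6000)
    secs, cs = divmod(rem, 100)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{cs:02d}"


if __name__ == "__main__":
    # Numbers copied from the log: size= 195kB, ffprobe bitrate: 8 kb/s
    print(format_hms(estimate_duration_seconds(195, 8)))
    # Same size with the 26.5kbits/s that ffmpeg printed after encoding
    print(format_hms(estimate_duration_seconds(195, 26.5)))
```

With the log's numbers the first call gives 00:03:19.68, the value ffprobe
printed for /tmp/out.aac. Using the 26.5kbits/s from the end of the encode
instead, the same size works out to roughly 60.3 seconds, close to the
input length. If this reading is right, the long duration would come from
a low bitrate guess and not from extra encoded audio.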
If you encode the WAV file to MP3 and then encode that MP3 to AAC, the
AAC output is again reported as longer than its input:
{{{
$ ffmpeg -y -i /tmp/in.wav -vn -strict -2 -ab 24k -ar 16000 -f mp3 \
    /tmp/out.mp3
...
Stream #0:0 -> #0:0 (pcm_s16le (native) -> mp3 (libmp3lame))
...
size= 177kB time=00:01:00.33 bitrate= 24.1kbits/s
$ ffmpeg -y -i /tmp/out.mp3 -vn -c:a aac -strict -2 -ab 24k -ar 16000 \
    /tmp/out.aac
$ for i in /tmp/out.mp3 /tmp/out.aac; do
    ffprobe -i "$i" 2>&1 | grep Duration
  done
Duration: 00:01:00.41, start: 0.069063, bitrate: 24 kb/s
Duration: 00:01:06.31, bitrate: 23 kb/s
}}}
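To compare such Duration lines without reading them by eye, a small
parser is enough. This is a minimal sketch, assuming the
"Duration: HH:MM:SS.cc" layout shown in the ffprobe output above;
`parse_duration` is a hypothetical helper name, not part of FFmpeg.

```python
import re

# Matches the "Duration: HH:MM:SS.cc" field in ffprobe's banner output.
DURATION_RE = re.compile(r"Duration:\s*(\d+):(\d{2}):(\d{2}(?:\.\d+)?)")


def parse_duration(line: str) -> float:
    """Return the duration in seconds from one ffprobe 'Duration:' line."""
    match = DURATION_RE.search(line)
    if match is None:
        # Also covers lines such as "Duration: N/A"
        raise ValueError(f"no parsable duration in: {line!r}")
    hours, minutes, seconds = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


if __name__ == "__main__":
    # The two lines reported above for /tmp/out.mp3 and /tmp/out.aac
    mp3 = parse_duration(
        "Duration: 00:01:00.41, start: 0.069063, bitrate: 24 kb/s")
    aac = parse_duration("Duration: 00:01:06.31, bitrate: 23 kb/s")
    print(f"mp3={mp3:.2f}s aac={aac:.2f}s diff={aac - mp3:.2f}s")
```

For the MP3 to AAC case above this reports a difference of 5.90 seconds.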
Interestingly, when I ran espeak with a different input ('This is some
sample text to test audio encoding. ' repeated 10 times), the difference
between the durations of the encoded output files was smaller.
I also tried OGG/libvorbis input files; the duration of the output AAC
file did not seem to be affected.
For MP4s:
{{{
$ ffprobe -i /tmp/in.wav 2>&1 | grep Duration
Duration: 00:01:00.33, bitrate: 352 kb/s
$ ffmpeg -y -framerate 1 -r 1 -loop 1 -i /tmp/in.jpg -i /tmp/in.wav \
    -c:v libx264 -preset veryfast -tune stillimage -c:a aac -ab 32k \
    -ar 16000 -strict experimental -shortest -pix_fmt yuv420p \
    -movflags faststart -f mp4 /tmp/out.mp4
ffmpeg version 2.8.2-static http://johnvansickle.com/ffmpeg/ Copyright
(c) 2000-2015 the FFmpeg developers
built with gcc 5.2.1 (Debian 5.2.1-23) 20151028
configuration: --enable-gpl --enable-version3 --disable-shared
--disable-debug --enable-runtime-cpudetect --enable-libmp3lame --enable-
libx264 --enable-libx265 --enable-libwebp --enable-libspeex --enable-
libvorbis --enable-libvpx --enable-libfreetype --enable-fontconfig
--enable-libxvid --enable-libopencore-amrnb --enable-libopencore-amrwb
--enable-libtheora --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-
gray --enable-libopenjpeg --enable-libopus --enable-libass --enable-gnutls
--enable-libvidstab --enable-libsoxr --enable-frei0r --enable-libfribidi
--disable-indev=sndio --disable-outdev=sndio --cc=gcc
libavutil 54. 31.100 / 54. 31.100
libavcodec 56. 60.100 / 56. 60.100
libavformat 56. 40.101 / 56. 40.101
libavdevice 56. 4.100 / 56. 4.100
libavfilter 5. 40.101 / 5. 40.101
libswscale 3. 1.101 / 3. 1.101
libswresample 1. 2.101 / 1. 2.101
libpostproc 53. 3.100 / 53. 3.100
[mjpeg @ 0x4b1daa0] Changeing bps to 8
Input #0, image2, from '/tmp/in.jpg':
Duration: 00:00:01.00, start: 0.000000, bitrate: 272 kb/s
Stream #0:0: Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown),
300x300 [SAR 1:1 DAR 1:1], 1 fps, 1 tbr, 1 tbn, 1 tbc
Guessed Channel Layout for Input Stream #1.0 : mono
Input #1, wav, from '/tmp/in.wav':
Duration: 00:01:00.33, bitrate: 352 kb/s
Stream #1:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 22050 Hz, 1
channels, s16, 352 kb/s
[swscaler @ 0x4b59e20] deprecated pixel format used, make sure you did set
range correctly
[libx264 @ 0x4b7eba0] using SAR=1/1
[libx264 @ 0x4b7eba0] using cpu capabilities: MMX2 SSE2Fast LZCNT
[libx264 @ 0x4b7eba0] profile High, level 1.2
[libx264 @ 0x4b7eba0] 264 - core 148 r209 7599210 - H.264/MPEG-4 AVC codec
- Copyleft 2003-2015 - http://www.videolan.org/x264.html - options:
cabac=1 ref=1 deblock=1:-3:-3 analyse=0x3:0x113 me=hex subme=2 psy=1
psy_rd=2.00:0.70 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=1
cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=0 threads=6
lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0
bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1
b_bias=0 direct=1 weightb=1 open_gop=0 weightp=1 keyint=250 keyint_min=1
scenecut=40 intra_refresh=0 rc_lookahead=10 rc=crf mbtree=1 crf=23.0
qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.20
Output #0, mp4, to '/tmp/out.mp4':
Metadata:
encoder : Lavf56.40.101
Stream #0:0: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p,
300x300 [SAR 1:1 DAR 1:1], q=-1--1, 1 fps, 16384 tbn, 1 tbc
Metadata:
encoder : Lavc56.60.100 libx264
Stream #0:1: Audio: aac ([64][0][0][0] / 0x0040), 16000 Hz, mono,
fltp, 32 kb/s
Metadata:
encoder : Lavc56.60.100 aac
Stream mapping:
Stream #0:0 -> #0:0 (mjpeg (native) -> h264 (libx264))
Stream #1:0 -> #0:1 (pcm_s16le (native) -> aac (native))
Press [q] to stop, [?] for help
[mp4 @ 0x4b38d40] Starting second pass: moving the moov atom to the
beginning of the file
frame= 84 fps=0.0 q=-1.0 Lsize= 298kB time=00:01:22.00 bitrate=
29.8kbits/s
video:48kB audio:243kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 2.420870%
[libx264 @ 0x4b7eba0] frame I:1 Avg QP:13.91 size: 46896
[libx264 @ 0x4b7eba0] frame P:21 Avg QP:11.16 size: 31
[libx264 @ 0x4b7eba0] frame B:62 Avg QP:16.35 size: 15
[libx264 @ 0x4b7eba0] consecutive B-frames: 1.2% 0.0% 3.6% 95.2%
[libx264 @ 0x4b7eba0] mb I I16..4: 0.6% 5.5% 93.9%
[libx264 @ 0x4b7eba0] mb P I16..4: 0.0% 0.0% 0.0% P16..4: 0.7% 0.0%
0.0% 0.0% 0.0% skip:99.3%
[libx264 @ 0x4b7eba0] mb B I16..4: 0.0% 0.0% 0.0% B16..8: 0.0% 0.0%
0.0% direct: 0.0% skip:100.0%
[libx264 @ 0x4b7eba0] 8x8 transform intra:5.5% inter:6.7%
[libx264 @ 0x4b7eba0] coded y,uvDC,uvAC intra: 97.9% 97.5% 94.7% inter:
0.0% 0.1% 0.0%
[libx264 @ 0x4b7eba0] i16 v,h,dc,p: 0% 0% 50% 50%
[libx264 @ 0x4b7eba0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 8% 51% 14% 4% 2%
1% 9% 5% 6%
[libx264 @ 0x4b7eba0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 14% 22% 16% 6% 8%
5% 8% 7% 13%
[libx264 @ 0x4b7eba0] i8c dc,h,v,p: 41% 27% 18% 14%
[libx264 @ 0x4b7eba0] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x4b7eba0] kb/s:4.62
$ ffprobe -i /tmp/out.mp4 2>&1 | grep Duration
Duration: 00:01:24.00, start: 0.064000, bitrate: 29 kb/s
}}}
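The encode log above reports frame= 84 at 1 fps, and 84 frames at 1 fps
is 84 seconds, which matches the 00:01:24.00 that ffprobe prints for the
file. The sketch below just spells out that arithmetic. It assumes a
constant frame rate and uses the input WAV duration (00:01:00.33) as a
stand-in for the encoded audio length, which the log does not state
directly; both function names are illustrative.

```python
def video_duration_seconds(frame_count: int, fps: float) -> float:
    """Length of a constant-frame-rate video stream in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_count / fps


def overshoot_seconds(video_seconds: float, audio_seconds: float) -> float:
    """How much longer the video stream runs than the audio stream."""
    return video_seconds - audio_seconds


if __name__ == "__main__":
    video = video_duration_seconds(84, 1)  # frame= 84, 1 fps (from the log)
    audio = 60.33                          # input WAV: 00:01:00.33 (assumed)
    print(f"video {video:.2f}s, audio {audio:.2f}s, "
          f"overshoot {overshoot_seconds(video, audio):.2f}s")
```

Under those assumptions the video stream outlasts the audio by about
23.67 seconds, so here the extra length would sit in the video stream
despite -shortest.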
I talked on #ffmpeg @ freenode with another user who has the same problem
with MP4s:
{{{
<user> llogan: http://sprunge.us/JYOS
<user> when i run that with a 4m45s audio stream, the first one has a
duration of 4:50, the second one 4:56
<user> s/4:56/4:46/
<user> it's also not specific to aac or mp4
<user> and also according to mediainfo it's the video stream which is too
long, not the audio stream
<user> and finally, increasing -r reduces the offset
<user> i assume it's an issue with the image2 muxer, but i don't know for
sure
<user> and by for sure i of course mean at all
}}}
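The remark that increasing -r reduces the offset would fit a simple
model in which the video stream overshoots by a roughly fixed number of
frames, so that the surplus in seconds shrinks as the frame rate grows.
That model is an assumption and has not been confirmed here. The sketch
below applies it to the figures from my own run (about 23.67 seconds of
surplus at 1 fps); `rescale_overshoot` is a hypothetical helper.

```python
def rescale_overshoot(overshoot_s: float, measured_fps: float,
                      new_fps: float) -> float:
    """Predict the surplus at new_fps if it is a fixed number of frames.

    Assumption: the surplus measured at measured_fps corresponds to a
    constant frame count that does not depend on the frame rate.
    """
    if measured_fps <= 0 or new_fps <= 0:
        raise ValueError("frame rates must be positive")
    surplus_frames = overshoot_s * measured_fps
    return surplus_frames / new_fps


if __name__ == "__main__":
    for fps in (1, 5, 25):
        print(f"-r {fps}: about {rescale_overshoot(23.67, 1, fps):.2f}s")
```

If the model holds, the same surplus would drop to just under one second
at 25 fps. Re-running the MP4 command with a few -r values and comparing
the ffprobe durations would show whether the offset really scales this way.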
--
Ticket URL: <https://trac.ffmpeg.org/ticket/5033>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker