[FFmpeg-user] Encoding MP4/AAC audio from pcm: issues with packets, duration, and pts/dts (especially when using -movflags empty_moov)

Eric Amram eric.amram at gmail.com
Thu Apr 6 23:12:28 EEST 2017


Hello,

I've run into several issues while encoding PCM audio to MP4/AAC.
I recompiled the latest nightly build to make sure they had not already been fixed.

Here are the FFmpeg command lines I ran to encode an 8192-byte raw s16le PCM file (4096 samples) into MP4/AAC:

ffmpeg -nostdin -hide_banner -loglevel debug \
 -f s16le -channel_layout mono -vn -ac 1 -i test-8192.raw \
 -f mp4 -acodec aac -movflags empty_moov -ac 1 -ar 44100 -b:a 128000 \
 result.mp4 

(same without empty_moov)
ffmpeg -nostdin -hide_banner -loglevel debug \
 -f s16le -channel_layout mono -vn -ac 1 -i test-8192.raw \
 -f mp4 -acodec aac -ac 1 -ar 44100 -b:a 128000 \
 result.mp4 
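
As a sanity check on the input size, here is a minimal sketch of the arithmetic. It assumes mono s16le (2 bytes per sample), a 44100 Hz sample rate (the value reported for the output stream; the input command does not set one explicitly), and the AAC frame size of 1024 samples. The constant names are only illustrative.

```python
# Figures taken from the post: 8192 bytes of mono s16le PCM.
INPUT_BYTES = 8192
BYTES_PER_SAMPLE = 2      # s16le = signed 16-bit little-endian
CHANNELS = 1              # mono
SAMPLE_RATE = 44100       # assumed: matches the rate shown for the output
AAC_FRAME_SAMPLES = 1024  # samples per AAC frame

samples = INPUT_BYTES // (BYTES_PER_SAMPLE * CHANNELS)
frames = samples // AAC_FRAME_SAMPLES
duration_s = samples / SAMPLE_RATE

print(f"samples:  {samples}")
print(f"frames:   {frames}")
print(f"duration: {duration_s:.5f} s")
```

So the input should fill exactly 4 AAC frames, with no partial frame left over.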



1/ Why is there an empty packet added to the MP4?

When I run ffmpeg, I get the following logs:

video:0kB audio:2kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 50.496140%
Input file #0 (test-8192.raw):
  Input stream #0:0 (audio): 4 packets read (8192 bytes); 4 frames decoded (4096 samples); 
  Total: 4 packets (8192 bytes) demuxed
Output file #0 (result.mp4):
  Output stream #0:0 (audio): 4 frames encoded (4096 samples); 5 packets muxed (1814 bytes); 
  Total: 5 packets (1814 bytes) muxed
4 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x25eb300] Statistics: 0 seeks, 4 writeouts


--> *5* packets are muxed, although only 4 frames were decoded from the input file and 4 frames were encoded.

When decoded, this additional packet is a series of 2048 bytes of pure zeros (1024 samples of 0).

However, it takes up 536 bytes in the MP4 file. Why such a waste?


Moreover, with the empty_moov flag, players report a LONGER DURATION for the MP4 file,
and they play 23 ms of silence at the start of the file.
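
To put numbers on the extra packet, here is a small sketch that uses only the figures from the log above (536 bytes, 1814 bytes muxed, 5 packets) plus the AAC frame size of 1024 samples at 44100 Hz. The variable names are illustrative.

```python
SAMPLE_RATE = 44100
FRAME_SAMPLES = 1024       # samples per AAC frame

extra_packet_bytes = 536   # size of the all-zero first packet
total_muxed_bytes = 1814   # "5 packets muxed (1814 bytes)"

# Share of the muxed payload spent on the silent packet.
share = extra_packet_bytes / total_muxed_bytes

# Time added by one extra frame, in milliseconds.
extra_ms = FRAME_SAMPLES / SAMPLE_RATE * 1000

# Duration of the audio with 4 frames versus 5 frames.
duration_4 = 4 * FRAME_SAMPLES / SAMPLE_RATE
duration_5 = 5 * FRAME_SAMPLES / SAMPLE_RATE

print(f"extra packet share: {share:.1%}")
print(f"extra time:         {extra_ms:.2f} ms")
print(f"4 frames: {duration_4:.4f} s, 5 frames: {duration_5:.4f} s")
```

One extra 1024-sample frame accounts for the 23 ms difference.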




2/ PTS/DTS bug with EMPTY_MOOV on this first packet


Running ffprobe on result.mp4, the pts/dts values seem wrong when using -movflags empty_moov.

# ffprobe -hide_banner -pretty -show_packets result.mp4
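
To compare the two files without reading the listings by eye, something like the following sketch could pull the fields out of ffprobe's default [PACKET] ... [/PACKET] output. The function name parse_packets is hypothetical, the sample text is abridged from the first listing, and side-data fields are simply merged into the packet's dictionary.

```python
def parse_packets(text):
    """Parse ffprobe -show_packets default output into a list of dicts."""
    packets = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "[PACKET]":
            current = {}
        elif line == "[/PACKET]":
            if current is not None:
                packets.append(current)
            current = None
        elif current is not None and "=" in line:
            # key=value line; [SIDE_DATA] markers contain no "=" and are skipped
            key, value = line.split("=", 1)
            current[key] = value
    return packets


# Abridged sample, copied from the output without empty_moov.
SAMPLE = """\
[PACKET]
codec_type=audio
stream_index=0
pts=-1024
dts=-1024
duration=1024
size=536 byte
flags=KD
[SIDE_DATA]
side_data_type=Skip Samples
skip_samples=1024
[/SIDE_DATA]
[/PACKET]
[PACKET]
codec_type=audio
stream_index=0
pts=0
dts=0
duration=1024
[/PACKET]
"""

packets = parse_packets(SAMPLE)
first_pts = int(packets[0]["pts"])
print(len(packets), "packets, first pts =", first_pts)
```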



WITHOUT empty_moov, the first packet (the empty one with pure zeros) has negative
pts/dts values, so the next packet, which carries the actual sound, starts at 0:00:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'result.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2mp41
    encoder         : Lavf57.72.100
  Duration: 00:00:00.12, start: 0.000000, bitrate: 176 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 124 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
[PACKET]
codec_type=audio
stream_index=0
pts=-1024
pts_time=0:00:-0.023220
dts=-1024
dts_time=0:00:-0.023220
duration=1024
duration_time=0:00:00.023220
convergence_duration=N/A
convergence_duration_time=N/A
size=536 byte
pos=44
flags=KD
[SIDE_DATA]
side_data_type=Skip Samples
skip_samples=1024
discard_padding=0
skip_reason=0
discard_reason=0
[/SIDE_DATA]
[/PACKET]
[PACKET]
codec_type=audio
stream_index=0
pts=0
pts_time=0:00:00.000000
dts=0
dts_time=0:00:00.000000
duration=1024
duration_time=0:00:00.023220
...



But WITH -movflags empty_moov, the first packet starts at pts/dts 0, so
MP4 players see a LONGER file, with 23 ms of silence at the start:

[PACKET]
codec_type=audio
stream_index=0
pts=0
pts_time=0:00:00.000000
dts=0
dts_time=0:00:00.000000
duration=N/A
duration_time=N/A
convergence_duration=N/A
convergence_duration_time=N/A
size=536 byte
pos=849
flags=K_
[/PACKET]
[PACKET]
codec_type=audio
stream_index=0
pts=1024
pts_time=0:00:00.023220
dts=1024
dts_time=0:00:00.023220
duration=1024
duration_time=0:00:00.023220
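
The difference between the two listings can be expressed as a timestamp shift. The sketch below assumes a stream time base of 1/44100 (one tick per sample), which is consistent with pts=1024 being shown as 0.023220 s. The helper pts_to_seconds is illustrative.

```python
from fractions import Fraction

# Assumed time base: one tick per sample at 44100 Hz.
TIME_BASE = Fraction(1, 44100)


def pts_to_seconds(pts):
    """Convert a pts/dts value in time-base ticks to seconds."""
    return float(pts * TIME_BASE)


# pts of the first two packets, copied from the two ffprobe listings.
without_empty_moov = [-1024, 0]
with_empty_moov = [0, 1024]

# How far the whole stream is shifted when empty_moov is used.
shift_ticks = with_empty_moov[0] - without_empty_moov[0]
shift_ms = pts_to_seconds(shift_ticks) * 1000

for label, pts_list in (("without", without_empty_moov),
                        ("with", with_empty_moov)):
    times = [f"{pts_to_seconds(p):+.6f}" for p in pts_list]
    print(f"{label:>7} empty_moov: {times}")
print(f"shift: {shift_ticks} ticks = {shift_ms:.2f} ms")
```

With empty_moov, every packet sits exactly one frame later, so the silent packet lands at 0 instead of before it.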




Here are the details of my FFmpeg version:

ffmpeg version N-85272-gc901ae9 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 4.8.5 (GCC) 20150623 (Red Hat 4.8.5-11)
  configuration: --prefix=/opt/ffmpeg_build --extra-cflags=-I/opt/ffmpeg_build/include --extra-ldflags='-L/opt/ffmpeg_build/lib -ldl' --bindir=/usr/local/bin --pkg-config-flags=--static --enable-gpl --enable-libfreetype
  libavutil      55. 59.100 / 55. 59.100
  libavcodec     57. 91.100 / 57. 91.100
  libavformat    57. 72.100 / 57. 72.100
  libavdevice    57.  7.100 / 57.  7.100
  libavfilter     6. 84.100 /  6. 84.100
  libswscale      4.  7.100 /  4.  7.100
  libswresample   2.  8.100 /  2.  8.100
  libpostproc    54.  6.100 / 54.  6.100



Any help understanding why there is an additional first packet filled with zeros,
and why the timing goes wrong with empty_moov, would be much appreciated!

Thank you!



