[FFmpeg-user] Transcoding with different codec (libfdk_aac) alters duration, how to fix?

Fri Dec 27 04:47:07 CET 2013

Hi,

I have built a transcoding workflow that is meant to run in a distributed
grid computing environment. The idea is to reduce transcoding latency by
first splitting input files with the segment muxer
(http://www.ffmpeg.org/ffmpeg-formats.html#segment_002c-stream_005fsegment_002c-ssegment),
then transcoding the segments individually on different nodes (into mpegts
containers), and finally joining the transcoded segments with the concat
protocol (http://ffmpeg.org/ffmpeg-protocols.html#concat).
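
For reference, the three steps can be sketched roughly as below. This is only
an outline, not the exact commands from my test case: the file names input.mov
and joined.ts, the 2-second segment time and the concat_url helper are made up
for illustration. The split and transcode steps are left as comments, and the
join command is printed rather than executed.

```shell
# Hypothetical file names and segment length; adjust to the real workflow.
# Step 1 (split with the segment muxer, stream copy) and step 2 (transcode
# one segment per node) are comments because they need the real input file:
#   ffmpeg -i input.mov -f segment -segment_time 2 -c copy input_audio%02d.mov
#   ffmpeg -y -i input_audio00.mov -f mpegts -mpegts_copyts 1 -acodec libfdk_aac output_audio00.ts

# Step 3: build the URL for the concat protocol, "concat:a.ts|b.ts|...".
# The function body runs in a subshell so the IFS change does not leak out.
concat_url() (
  IFS='|'
  printf 'concat:%s\n' "$*"
)

# Print (rather than run) the join command for two transcoded segments.
# When actually running it, keep the URL quoted so the shell does not
# interpret the "|" as a pipe.
echo ffmpeg -i "$(concat_url output_audio00.ts output_audio01.ts)" -c copy joined.ts
```

The helper only joins its arguments with "|" and prefixes "concat:", which is
the form the concat protocol expects for a list of files.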

This works in principle but causes problems in the resulting output file at
the boundaries of the joined segments. Specifically, there is an audible
"gap" in the audio stream. I have created a simple test case to reproduce
the issue:

https://gist.github.com/sbalko/fcc930d4f6f5f7b03ed6

While trying to fix this issue, I identified a possible cause. When
transcoding a segment into AAC (using libfdk_aac), the timing of the audio
frames changes. In a ~2 sec input segment, the first frame has
pkt_pts_time=0.000000, pkt_duration_time=0.021333 and the last frame has
pkt_pts_time=1.984000, pkt_duration_time=0.018000 (resulting in a total
duration of 2.002 seconds). After transcoding the segment like so:

ffmpeg -y -i input_audio00.mov -f mpegts -mpegts_copyts 1 -acodec libfdk_aac output_audio00.ts

the output file's first and last audio frames have pkt_pts_time=0.000000,
pkt_duration_time=0.021333 and pkt_pts_time=2.026667,
pkt_duration_time=0.021333, respectively, resulting in an overall duration
of 2.048 seconds. I also found that the input file has 94 audio frames,
whereas the output file has 96 frames.
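
As a sanity check on these numbers, here is a small shell sketch of the
arithmetic. It assumes that the end time of a stream is the last frame's pts
plus its duration, and that every frame before the last one has the same
duration (0.021333 s). The helper names end_time and frame_count are made up
for this example.

```shell
# End time of a stream: pts of the last frame plus that frame's duration.
end_time() {
  awk -v pts="$1" -v dur="$2" 'BEGIN { printf "%.3f\n", pts + dur }'
}

# Frame count, assuming all frames before the last share the duration "dur":
# round(last_pts / dur) frames precede the last one, plus the last frame.
# Adding 1.5 and truncating with %d does the rounding and the +1 in one step.
frame_count() {
  awk -v pts="$1" -v dur="$2" 'BEGIN { printf "%d\n", pts / dur + 1.5 }'
}

end_time 1.984000 0.018000      # input segment
end_time 2.026667 0.021333      # transcoded segment
frame_count 1.984000 0.021333   # input segment
frame_count 2.026667 0.021333   # transcoded segment
```

With the values above this gives 2.002 and 2.048 seconds and 94 and 96
frames, i.e. each transcoded segment comes out about 0.046 seconds longer
than its input.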

I realize that all this may have been introduced by re-encoding the audio
stream. In an effort to rectify the deviating timestamps, I tried the
aresample filter like:

ffmpeg -y -i input_audio00.mov -f mpegts -mpegts_copyts 1 -acodec libfdk_aac -af aresample=async=1000 output_audio00.ts

This had no effect. I had hoped that aresample would somehow retrieve the
timestamps from the input file and stretch or squeeze the output to match
them. My understanding of how aresample would retrieve the original
timestamps is limited, so perhaps something is fundamentally wrong
here (?)

Any ideas are greatly appreciated!

Thanks,
Soeren