[FFmpeg-user] Create an AAC stream matching the Core Media Audio packet format / priming etc?
mwjburton at gmail.com
Thu Jun 29 20:46:18 EEST 2017
> On 28 Jun 2017, at 01:12, Sasi Inguva <isasi-at-google.com at ffmpeg.org> wrote:
> I have been helping Mark test Marton's patch. I looked at the test file Mark
> was using to test the sync. There are multiple reasons for the audio being
> out of sync:
> i) That file doesn't contain a non-zero edit list or 'sgpd' atom, as
> suggests to put. For this kind of file, the spec says to use the
> historical solution of assuming a delay of 2112 samples. This is what
> QuickTime Player and iMovie on my Mac seem to do. However, in
> FFmpeg we don't assume a delay of 2112 samples. If there is no edit list,
> we assume it is zero.
> Hence, when we transcode the video using ffmpeg, we add 2112 samples
> of silence to the transcoded file (as actual audio data).
> ii) On top of that, the ffmpeg AAC encoder itself introduces 1024 samples of
> silence, and ffmpeg then uses an edit list to mark that as the encoder delay.
> However, the spec says that along with the edit list we should also set the
> 'sgpd' atom (which is what Marton's patch does).
> So to fix this, I hacked the ffmpeg MOV demuxer to assume a 2112-sample
> delay for AAC, and combined it with Marton's patch. I hoped that the file
> transcoded by an ffmpeg built with these two patches would correctly match
> the original test file when decoded with Apple tools (iMovie).
> However it was not to be. It seems like even Apple tools don't respect the
> new way of setting the encoder delay. When I decode the file using iMovie, I
> observe that 2112 samples from the beginning are gone, indicating that Apple
> is still assuming 2112 samples of delay for AAC.
> I am attaching the original test file, and the file I generated.
Thanks Sasi. So the short of it is that it's unclear whether it is in fact possible to create a MOV file that Apple tools will decode using the ‘new’ method.
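To put rough numbers on the two mismatches described above, here is a small Python sketch of the arithmetic. The 2112 and 1024 figures come from this thread; the 48 kHz sample rate, the function names, and the assumption that the source file really carries 2112 priming samples are mine, for illustration only.

```python
# Priming figures quoted in this thread.
APPLE_ASSUMED_PRIMING = 2112   # implicit AAC delay Apple tools appear to assume
FFMPEG_ENCODER_PRIMING = 1024  # delay introduced by ffmpeg's AAC encoder

SAMPLE_RATE_HZ = 48000  # assumption for illustration, not taken from the test file


def residual_offset(actual_priming: int, trimmed_priming: int) -> int:
    """Offset in samples left after a reader trims `trimmed_priming` samples.

    Positive: leftover priming plays as leading silence (audio late).
    Negative: real audio is cut from the start (audio early).
    """
    return actual_priming - trimmed_priming


def to_ms(samples: int, rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Convert a sample count to milliseconds at the given sample rate."""
    return samples * 1000.0 / rate_hz


# Case i): file with 2112 priming samples, read by a demuxer assuming zero delay.
case_i = residual_offset(APPLE_ASSUMED_PRIMING, 0)

# Sasi's iMovie test: ffmpeg-encoded file (1024 priming), reader trims 2112 anyway.
case_apple = residual_offset(FFMPEG_ENCODER_PRIMING, APPLE_ASSUMED_PRIMING)

print(f"ffmpeg reading Apple-style file: {case_i:+d} samples ({to_ms(case_i):+.2f} ms)")
print(f"Apple reading ffmpeg file:       {case_apple:+d} samples ({to_ms(case_apple):+.2f} ms)")
```

At other sample rates the millisecond figures change, but the sample counts do not.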
If Apple tools will always reliably assume the 2112-sample delay when decoding, would it not make more sense to add a new option to ffmpeg to encode this way (as the QuickTime encoders do), and so ensure the file decodes accurately in the format's own decoder? If the current methods either do not fully meet the spec or are simply ignored by QuickTime, then having a way to encode with a 2112-sample delay would seem the most reliable way to stay compatible with the format itself.
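As a rough idea of what such an option might amount to, here is a Python sketch. It is not ffmpeg code and no such option exists as far as I know; the function names are hypothetical. It assumes the encoder's own delay stays at 1024 samples and that the remainder up to 2112 would be made up by prepending silence to the PCM before encoding (mono samples, for simplicity).

```python
from typing import List

TARGET_PRIMING = 2112   # delay Apple tools appear to assume for AAC
ENCODER_PRIMING = 1024  # delay ffmpeg's AAC encoder introduces on its own


def extra_leading_silence(target: int = TARGET_PRIMING,
                          encoder: int = ENCODER_PRIMING) -> int:
    """Samples of silence to prepend so total priming equals `target`."""
    if encoder > target:
        raise ValueError("encoder delay already exceeds the target priming")
    return target - encoder


def pad_pcm(samples: List[float], pad: int) -> List[float]:
    """Return a new mono PCM buffer with `pad` zero samples at the front."""
    return [0.0] * pad + samples


pad = extra_leading_silence()
padded = pad_pcm([0.5, -0.5], pad)
print(f"prepend {pad} samples of silence; buffer grows to {len(padded)} samples")
```

A reader that trims 2112 samples would then discard exactly the encoder delay plus the padding, leaving the real audio aligned at sample zero.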