[FFmpeg-trac] #6722(avcodec:new): XMA/WMAPro decoder: gapless problems
FFmpeg
trac at avcodec.org
Sat Oct 7 17:19:55 EEST 2017
#6722: XMA/WMAPro decoder: gapless problems
-------------------------------------+-------------------------------------
Reporter: bnnm | Type: defect
Status: new | Priority: normal
Component: avcodec | Version: git-master
Keywords: xma, wmapro |
Blocking: | Blocked By:
Analyzed by developer: 0 | Reproduced by developer: 0
-------------------------------------+-------------------------------------
The first and last samples of decoded XMA output are slightly incorrect,
which notably breaks gapless files. I haven't tested WMAPro files as much,
but I believe the same applies.
----
From tests with Microsoft's XMA encoder (xmaencode.exe):
- FFmpeg's output starts 128(?) samples late, whereas xmaencode adds 128
samples at the start of its output (1 subframe of "setup_samples"?). This
makes the last subframe of a file incorrect in FFmpeg. Basically, the first
and last parts of the output are off (samples in the middle look fine).
- FFmpeg ignores "start skip" (samples to discard at the beginning) and
"end skip" (samples to discard at the end), see wmaprodec.c at line 1443.
xmaencode uses both when decoding and applies them after the first 128
"extra" samples. start_skip is usually set to frame_samples and end_skip to
less than frame_samples, but if they are changed manually (e.g. to 100 or
0), xmaencode honors the new values. A value larger than the frame size
(512 in XMA) is clamped to it, and end_skip seems to always be included,
even if it is 0.
E.g. total samples output for 10 frames: FFmpeg = 10*512; xmaencode = 128 +
1*512 - start_skip + 9*512 - end_skip
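
The sample-count arithmetic above can be sketched in a few lines of Python.
This is only an illustration of the formula as observed with xmaencode, not
code from FFmpeg or from Microsoft's tools. The function and constant names
are made up for this example, and the 128 "setup" samples are the uncertain
figure mentioned above.

```python
FRAME_SAMPLES = 512  # XMA frame size, as stated in this ticket
SETUP_SAMPLES = 128  # extra samples xmaencode emits first (uncertain, see above)


def expected_xmaencode_samples(num_frames, start_skip, end_skip,
                               frame_samples=FRAME_SAMPLES,
                               setup_samples=SETUP_SAMPLES):
    """Samples xmaencode is observed to output for a file of num_frames frames.

    Hypothetical helper: it only restates the formula from the ticket.
    """
    # xmaencode appears to clamp each skip value to the frame size.
    start_skip = min(start_skip, frame_samples)
    end_skip = min(end_skip, frame_samples)
    return setup_samples + num_frames * frame_samples - start_skip - end_skip


def current_ffmpeg_samples(num_frames, frame_samples=FRAME_SAMPLES):
    """Samples FFmpeg currently outputs: every frame, with no skips applied."""
    return num_frames * frame_samples


if __name__ == "__main__":
    # One frame with start_skip=512 and end_skip=108.
    print(expected_xmaencode_samples(1, 512, 108))
    print(current_ffmpeg_samples(1))
```

With one frame, start_skip=512 and end_skip=108 (the values reported for
test_20.xma) this gives 128 + 512 - 512 - 108 = 20 samples, against 512 from
FFmpeg. The sketch does not guard against skip values that would make the
total negative.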
----
Example files:
https://mega.nz/#!DBAFGY4C!Jb0Y8gtDpm_V12DSqz5LP63k7xkqq_L9fMNn0Fc0Qv4
test_20.xma (20 PCM samples)
- xmaencode encodes it as a 1-frame (512 samples) file
- xmaencode decodes 128 + 512 - 512 start_skip (leaving the last 128 from
the frame) - 108 end_skip = 20 samples
- FFmpeg just outputs 512 samples, and the waveform isn't correct (the last
128 are wrong)
Screenshot: original PCM vs xmaencode vs FFmpeg vs xmaencode after manually
removing the skips from the file.
test_322.xma: same with 2 frames. Note how FFmpeg starts "late", and the
second frame now decodes the way the first frame should in test_20.xma.
Both files are smaller than one packet, so this isn't a bug in the bit
reservoir.
I encoded these files myself, but AFAIK the issues are present in all "real"
files.
----
Console output:
{{{
% ffmpeg -i test_20.xma test_20.wav -v debug
ffmpeg version N-87353-g183fd30 Copyright (c) 2000-2017 the FFmpeg
developers
built with gcc 7.2.0 (GCC)
configuration: --enable-gpl --enable-version3 --enable-cuda --enable-cuvid
--enable-d3d11va --enable-dxva2 --enable-libmfx --enable-nvenc
--enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r
--enable-gnutls --enable-iconv --enable-libass --enable-libbluray
--enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme
--enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame
--enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264
--enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libsnappy
--enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame
--enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis
--enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264
--enable-libx265 --enable-libxavs --enable-libxvid --enable-libxml2
--enable-libzimg --enable-lzma --enable-zlib
libavutil 55. 76.100 / 55. 76.100
libavcodec 57.106.101 / 57.106.101
libavformat 57. 82.101 / 57. 82.101
libavdevice 57. 8.101 / 57. 8.101
libavfilter 6.105.100 / 6.105.100
libswscale 4. 7.103 / 4. 7.103
libswresample 2. 8.100 / 2. 8.100
libpostproc 54. 6.100 / 54. 6.100
Splitting the commandline.
Reading option '-i' ... matched as input url with argument 'test_20.xma'.
Reading option 'test_20.wav' ... matched as output url.
Reading option '-v' ... matched as option 'v' (set logging level) with
argument 'debug'.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument debug.
Successfully parsed a group of options.
Parsing a group of options: input url test_20.xma.
Successfully parsed a group of options.
Opening an input file: test_20.xma.
[NULL @ 00367800] Opening 'test_20.xma' for reading
[file @ 00367f20] Setting default whitelist 'file,crypto'
[wav @ 00367800] Format wav probed with size=2048 and score=99
[wav @ 00367800] Before avformat_find_stream_info() pos: 60 bytes
read:2128 seeks:0 nb_streams:1
[wav @ 00367800] parser not found for codec xma1, packets or times may be
invalid.
[xma1 @ 03231580] extradata:
[xma1 @ 03231580] [d6] [10] [0] [0] [1] [0] [0] [2] [80] [2d] [7] [0] [44]
[ac] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [1] [0]
[wav @ 00367800] parser not found for codec xma1, packets or times may be
invalid.
[wav @ 00367800] After avformat_find_stream_info() pos: 2128 bytes
read:2128 seeks:0 frames:1
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'test_20.xma':
Duration: N/A, bitrate: N/A
Stream #0:0, 1, 1/44100: Audio: xma1 (e[1][0][0] / 0x0165), 44100 Hz,
mono, fltp
Successfully opened the file.
Parsing a group of options: output url test_20.wav.
Successfully parsed a group of options.
Opening an output file: test_20.wav.
[file @ 0036df00] Setting default whitelist 'file,crypto'
Successfully opened the file.
[xma1 @ 03231c00] extradata:
[xma1 @ 03231c00] [d6] [10] [0] [0] [1] [0] [0] [2] [80] [2d] [7] [0] [44]
[ac] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [1] [0]
Stream mapping:
Stream #0:0 -> #0:0 (xma1 (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
cur_dts is invalid (this is harmless if it occurs once at the start per
stream)
detected 2 logical cores
[graph_0_in_0_0 @ 032b5260] Setting 'time_base' to value '1/44100'
[graph_0_in_0_0 @ 032b5260] Setting 'sample_rate' to value '44100'
[graph_0_in_0_0 @ 032b5260] Setting 'sample_fmt' to value 'fltp'
[graph_0_in_0_0 @ 032b5260] Setting 'channels' to value '1'
[graph_0_in_0_0 @ 032b5260] tb:1/44100 samplefmt:fltp samplerate:44100
chlayout:(null)
[format_out_0_0 @ 032b56e0] Setting 'sample_fmts' to value 's16'
[format_out_0_0 @ 032b56e0] auto-inserting filter 'auto_resampler_0'
between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[AVFilterGraph @ 032b4a00] query_formats: 4 queried, 6 merged, 3 already
done, 0 delayed
[auto_resampler_0 @ 032b5e20] [SWR @ 032b6ea0] Using fltp internally
between filters
[auto_resampler_0 @ 032b5e20] ch:1 chl:1 channels fmt:fltp r:44100Hz ->
ch:1 chl:1 channels fmt:s16 r:44100Hz
Output #0, wav, to 'test_20.wav':
Metadata:
ISFT : Lavf57.82.101
Stream #0:0, 0, 1/44100: Audio: pcm_s16le ([1][0][0][0] / 0x0001),
44100 Hz, 1 channels, s16, 705 kb/s
Metadata:
encoder : Lavc57.106.101 pcm_s16le
[out_0_0 @ 032b55a0] EOF on sink link out_0_0:default.
No more output streams to write to, finishing.
size= 1kB time=00:00:00.01 bitrate= 759.3kbits/s speed= 5.8x
video:0kB audio:1kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 7.617188%
Input file #0 (test_20.xma):
Input stream #0:0 (audio): 1 packets read (2048 bytes); 1 frames decoded
(512 samples);
Total: 1 packets (2048 bytes) demuxed
Output file #0 (test_20.wav):
Output stream #0:0 (audio): 1 frames encoded (512 samples); 1 packets
muxed (1024 bytes);
Total: 1 packets (1024 bytes) muxed
1 frames successfully decoded, 0 decoding errors
[AVIOContext @ 03290060] Statistics: 4 seeks, 4 writeouts
[AVIOContext @ 0036ca40] Statistics: 2128 bytes read, 0 seeks
}}}
--
Ticket URL: <https://trac.ffmpeg.org/ticket/6722>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker