[FFmpeg-trac] #6722(avcodec:new): XMA/WMAPro decoder: gapless problems

FFmpeg trac at avcodec.org
Sat Oct 7 17:19:55 EEST 2017


#6722: XMA/WMAPro decoder: gapless problems
-------------------------------------+-------------------------------------
             Reporter:  bnnm         |                     Type:  defect
               Status:  new          |                 Priority:  normal
            Component:  avcodec      |                  Version:  git-master
             Keywords:  xma, wmapro  |
             Blocking:               |               Blocked By:
Analyzed by developer:  0            |  Reproduced by developer:  0
-------------------------------------+-------------------------------------
 XMA first/last output is slightly incorrect, which notably breaks gapless
 files. I haven't tested WMAPro files as much, but I believe the same applies.

 ----

 From tests with Microsoft's XMA encoder (xmaencode.exe):
 - FFmpeg decodes 128(?) samples late, and xmaencode adds 128 samples at
 the start of its output (1 subframe of "setup_samples"?), making the last
 subframe in a file incorrect. Basically, the first and last output is off
 (samples in the middle look fine).
 - FFmpeg ignores "start skip" (samples to discard at the beginning) and
 "end skip" (samples to discard at the end), see wmaprodec.c at 1443.
 xmaencode uses both when decoding and applies them after the first 128
 "extra" samples. start_skip is usually set to frame_samples and end_skip
 to less than frame_samples, but if they are changed manually (e.g. to 100
 or 0) xmaencode honors the new values. If a value is larger than the frame
 size (512 in XMA), xmaencode clamps it, and end_skip seems to be always
 included even if 0.

 e.g. total samples output for 10 frames: FFmpeg = 10*512; xmaencode = 128 +
 1*512 - start_skip + 9*512 - end_skip
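
 The arithmetic above can be written as a small Python sketch. This is not
 FFmpeg or xmaencode code: the function names are hypothetical, the constants
 128 and 512 are the values observed in this report, and clamping both skip
 values to the frame size is an assumption based on the behaviour described
 above.

```python
FRAME_SAMPLES = 512  # XMA frame size, as stated in the report
SETUP_SAMPLES = 128  # extra samples xmaencode emits at output start (observed)


def xmaencode_output_samples(frames, start_skip, end_skip):
    """Expected sample count from xmaencode, per the formula in the report.

    Assumption: both skip values are clamped to the frame size.
    """
    if frames < 1:
        return 0
    start_skip = min(start_skip, FRAME_SAMPLES)
    end_skip = min(end_skip, FRAME_SAMPLES)
    # 128 + 1*512 - start_skip + (frames-1)*512 - end_skip
    return SETUP_SAMPLES + frames * FRAME_SAMPLES - start_skip - end_skip


def ffmpeg_output_samples(frames):
    """Sample count FFmpeg currently outputs: every frame in full, no skips."""
    return frames * FRAME_SAMPLES
```

 For a 1-frame file with start_skip=512 and end_skip=108, the first function
 gives 20 samples while the second gives 512.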

 ----

 Example files:
 https://mega.nz/#!DBAFGY4C!Jb0Y8gtDpm_V12DSqz5LP63k7xkqq_L9fMNn0Fc0Qv4

 test_20.xma (20 PCM samples)
 - xmaencode encodes a file with 1 frame (512 samples)
 - xmaencode decodes 128 + 512 - 512 start_skip (leaving last 128 from the
 frame) - 108 end_skip = 20 samples
 - FFmpeg just outputs 512, and the waveform isn't correct (last 128 are
 wrong)
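
 As an illustration of which samples survive in the test_20.xma case, here is
 a minimal Python sketch of the trimming step. It assumes the skips are applied
 to the whole decoded stream, including the 128 extra samples at the start,
 which is my reading of the xmaencode behaviour and not confirmed by any
 specification. The function name is hypothetical.

```python
def trim_gapless(samples, start_skip, end_skip, frame_samples=512):
    """Drop start_skip samples from the front and end_skip from the back.

    Assumptions: `samples` already contains the 128 setup samples followed
    by all decoded frames, and both skips are clamped to the frame size.
    """
    start_skip = min(start_skip, frame_samples)
    end_skip = min(end_skip, frame_samples)
    end = max(len(samples) - end_skip, start_skip)  # never slice backwards
    return samples[start_skip:end]


# test_20.xma: 128 setup samples + 1 frame of 512, numbered by position
stream = list(range(128 + 512))
kept = trim_gapless(stream, start_skip=512, end_skip=108)
```

 Under these assumptions `kept` holds the 20 samples at positions 512 to 531
 of the 640-sample stream, i.e. the start of the last 128 samples.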

 Screenshot: original PCM vs xmaencode vs ffmpeg vs xmaencode manually
 removing the skips from the file.

 test_322.xma: same with 2 frames. Note how FFmpeg starts "late", and the
 second frame now decodes like the first frame should in test_20.xma.

 Both files are smaller than one packet, so it isn't a bug in the bit
 reservoir.
 I encoded these myself, but AFAIK the issues are present in all "real" files.

 ----

 Console output:
 {{{
 % ffmpeg -i test_20.xma test_20.wav -v debug
 ffmpeg version N-87353-g183fd30 Copyright (c) 2000-2017 the FFmpeg
 developers
   built with gcc 7.2.0 (GCC)
   configuration: --enable-gpl --enable-version3 --enable-cuda --enable-
 cuvid --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-nvenc
 --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r
 --enable-gnutls --enable-iconv --enable-libass --enable-libbluray
 --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme
 --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame
 --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264
 --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libsnappy
 --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame
 --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-
 libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-
 libx265 --enable-libxavs --enable-libxvid --enable-libxml2 --enable-
 libzimg --enable-lzma --enable-zlib
   libavutil      55. 76.100 / 55. 76.100
   libavcodec     57.106.101 / 57.106.101
   libavformat    57. 82.101 / 57. 82.101
   libavdevice    57.  8.101 / 57.  8.101
   libavfilter     6.105.100 /  6.105.100
   libswscale      4.  7.103 /  4.  7.103
   libswresample   2.  8.100 /  2.  8.100
   libpostproc    54.  6.100 / 54.  6.100
 Splitting the commandline.
 Reading option '-i' ... matched as input url with argument 'test_20.xma'.
 Reading option 'test_20.wav' ... matched as output url.
 Reading option '-v' ... matched as option 'v' (set logging level) with
 argument 'debug'.
 Finished splitting the commandline.
 Parsing a group of options: global .
 Applying option v (set logging level) with argument debug.
 Successfully parsed a group of options.
 Parsing a group of options: input url test_20.xma.
 Successfully parsed a group of options.
 Opening an input file: test_20.xma.
 [NULL @ 00367800] Opening 'test_20.xma' for reading
 [file @ 00367f20] Setting default whitelist 'file,crypto'
 [wav @ 00367800] Format wav probed with size=2048 and score=99
 [wav @ 00367800] Before avformat_find_stream_info() pos: 60 bytes
 read:2128 seeks:0 nb_streams:1
 [wav @ 00367800] parser not found for codec xma1, packets or times may be
 invalid.
 [xma1 @ 03231580] extradata:
 [xma1 @ 03231580] [d6] [10] [0] [0] [1] [0] [0] [2] [80] [2d] [7] [0] [44]
 [ac] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [1] [0]
 [wav @ 00367800] parser not found for codec xma1, packets or times may be
 invalid.
 [wav @ 00367800] After avformat_find_stream_info() pos: 2128 bytes
 read:2128 seeks:0 frames:1
 Guessed Channel Layout for Input Stream #0.0 : mono
 Input #0, wav, from 'test_20.xma':
   Duration: N/A, bitrate: N/A
     Stream #0:0, 1, 1/44100: Audio: xma1 (e[1][0][0] / 0x0165), 44100 Hz,
 mono, fltp
 Successfully opened the file.
 Parsing a group of options: output url test_20.wav.
 Successfully parsed a group of options.
 Opening an output file: test_20.wav.
 [file @ 0036df00] Setting default whitelist 'file,crypto'
 Successfully opened the file.
 [xma1 @ 03231c00] extradata:
 [xma1 @ 03231c00] [d6] [10] [0] [0] [1] [0] [0] [2] [80] [2d] [7] [0] [44]
 [ac] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [1] [0]
 Stream mapping:
   Stream #0:0 -> #0:0 (xma1 (native) -> pcm_s16le (native))
 Press [q] to stop, [?] for help
 cur_dts is invalid (this is harmless if it occurs once at the start per
 stream)
 detected 2 logical cores
 [graph_0_in_0_0 @ 032b5260] Setting 'time_base' to value '1/44100'
 [graph_0_in_0_0 @ 032b5260] Setting 'sample_rate' to value '44100'
 [graph_0_in_0_0 @ 032b5260] Setting 'sample_fmt' to value 'fltp'
 [graph_0_in_0_0 @ 032b5260] Setting 'channels' to value '1'
 [graph_0_in_0_0 @ 032b5260] tb:1/44100 samplefmt:fltp samplerate:44100
 chlayout:(null)
 [format_out_0_0 @ 032b56e0] Setting 'sample_fmts' to value 's16'
 [format_out_0_0 @ 032b56e0] auto-inserting filter 'auto_resampler_0'
 between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
 [AVFilterGraph @ 032b4a00] query_formats: 4 queried, 6 merged, 3 already
 done, 0 delayed
 [auto_resampler_0 @ 032b5e20] [SWR @ 032b6ea0] Using fltp internally
 between filters
 [auto_resampler_0 @ 032b5e20] ch:1 chl:1 channels fmt:fltp r:44100Hz ->
 ch:1 chl:1 channels fmt:s16 r:44100Hz
 Output #0, wav, to 'test_20.wav':
   Metadata:
     ISFT            : Lavf57.82.101
     Stream #0:0, 0, 1/44100: Audio: pcm_s16le ([1][0][0][0] / 0x0001),
 44100 Hz, 1 channels, s16, 705 kb/s
     Metadata:
       encoder         : Lavc57.106.101 pcm_s16le
 [out_0_0 @ 032b55a0] EOF on sink link out_0_0:default.
 No more output streams to write to, finishing.
 size=       1kB time=00:00:00.01 bitrate= 759.3kbits/s speed= 5.8x
 video:0kB audio:1kB subtitle:0kB other streams:0kB global headers:0kB
 muxing overhead: 7.617188%
 Input file #0 (test_20.xma):
   Input stream #0:0 (audio): 1 packets read (2048 bytes); 1 frames decoded
 (512 samples);
   Total: 1 packets (2048 bytes) demuxed
 Output file #0 (test_20.wav):
   Output stream #0:0 (audio): 1 frames encoded (512 samples); 1 packets
 muxed (1024 bytes);
   Total: 1 packets (1024 bytes) muxed
 1 frames successfully decoded, 0 decoding errors
 [AVIOContext @ 03290060] Statistics: 4 seeks, 4 writeouts
 [AVIOContext @ 0036ca40] Statistics: 2128 bytes read, 0 seeks
 }}}

--
Ticket URL: <https://trac.ffmpeg.org/ticket/6722>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker

