[FFmpeg-trac] #3701(avcodec:new): adpcm-ima_qt encoder's trellis support is broken

FFmpeg trac at avcodec.org
Thu Jun 5 03:26:03 CEST 2014


#3701: adpcm-ima_qt encoder's trellis support is broken
--------------------------------------+---------------------------------
               Reporter:  Timothy_Gu  |                  Owner:
                   Type:  defect      |                 Status:  new
               Priority:  normal      |              Component:  avcodec
                Version:  git-master  |               Keywords:  adpcm
             Blocked By:              |               Blocking:
Reproduced by developer:  0           |  Analyzed by developer:  0
--------------------------------------+---------------------------------
 [[PageOutline(2-3,Contents)]]

 == Summary of the bug ==
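 The `adpcm_ima_qt` encoder's trellis support is broken in two ways:
 - it does not produce reproducible output: two identical invocations yield
 different MD5 sums
 - it significantly degrades the output: PSNR drops from 37.20 to 17.84 on
 the test sample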

 == How to reproduce ==
 === Non-reproducible output ===
 First encoding process:
 {{{
 timothy_gu at ubuntu-lenovo:~/ffmpeg$ ./ffmpeg -cpuflags 0 -threads 1 -flags
 +bitexact -fflags +bitexact -i tests/data/asynth-44100-2.wav -threads 1
 -b:a 128k -c adpcm_ima_qt -trellis 5 -flags +bitexact -fflags +bitexact
 -nostats -f md5 -
 ffmpeg version N-63714-g1a426d5 Copyright (c) 2000-2014 the FFmpeg
 developers
   built on Jun  4 2014 17:34:36 with gcc 4.8 (Ubuntu 4.8.2-19ubuntu1)
   configuration:
   libavutil      52. 89.100 / 52. 89.100
   libavcodec     55. 66.100 / 55. 66.100
   libavformat    55. 42.100 / 55. 42.100
   libavdevice    55. 13.101 / 55. 13.101
   libavfilter     4.  5.100 /  4.  5.100
   libswscale      2.  6.100 /  2.  6.100
   libswresample   0. 19.100 /  0. 19.100
 Guessed Channel Layout for  Input Stream #0.0 : stereo
 Input #0, wav, from 'tests/data/asynth-44100-2.wav':
   Duration: 00:00:06.00, bitrate: 1411 kb/s
     Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2
 channels, s16, 1411 kb/s
 Output #0, md5, to 'pipe:':
     Stream #0:0: Audio: adpcm_ima_qt, 44100 Hz, stereo, s16p, 352 kb/s
     Metadata:
       encoder         : Lavc adpcm_ima_qt
 Stream mapping:
   Stream #0:0 -> #0:0 (pcm_s16le -> adpcm_ima_qt)
 Press [q] to stop, [?] for help
 MD5=06391007776121799859126bd4d848f3
 size=       0kB time=00:00:06.00 bitrate=   0.0kbits/s
 video:0kB audio:275kB subtitle:0kB other streams:0kB global headers:0kB
 muxing overhead: unknown
 }}}

 Second encoding process:
 {{{
 timothy_gu at ubuntu-lenovo:~/ffmpeg$ ./ffmpeg -cpuflags 0 -threads 1 -flags
 +bitexact -fflags +bitexact -i tests/data/asynth-44100-2.wav -threads 1
 -b:a 128k -c adpcm_ima_qt -trellis 5 -flags +bitexact -fflags +bitexact
 -nostats -f md5 -
 ffmpeg version N-63714-g1a426d5 Copyright (c) 2000-2014 the FFmpeg
 developers
   built on Jun  4 2014 17:34:36 with gcc 4.8 (Ubuntu 4.8.2-19ubuntu1)
   configuration:
   libavutil      52. 89.100 / 52. 89.100
   libavcodec     55. 66.100 / 55. 66.100
   libavformat    55. 42.100 / 55. 42.100
   libavdevice    55. 13.101 / 55. 13.101
   libavfilter     4.  5.100 /  4.  5.100
   libswscale      2.  6.100 /  2.  6.100
   libswresample   0. 19.100 /  0. 19.100
 Guessed Channel Layout for  Input Stream #0.0 : stereo
 Input #0, wav, from 'tests/data/asynth-44100-2.wav':
   Duration: 00:00:06.00, bitrate: 1411 kb/s
     Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 2
 channels, s16, 1411 kb/s
 Output #0, md5, to 'pipe:':
     Stream #0:0: Audio: adpcm_ima_qt, 44100 Hz, stereo, s16p, 352 kb/s
     Metadata:
       encoder         : Lavc adpcm_ima_qt
 Stream mapping:
   Stream #0:0 -> #0:0 (pcm_s16le -> adpcm_ima_qt)
 Press [q] to stop, [?] for help
 MD5=353699581c94f150671616ecfc357c09
 size=       0kB time=00:00:06.00 bitrate=   0.0kbits/s
 video:0kB audio:275kB subtitle:0kB other streams:0kB global headers:0kB
 muxing overhead: unknown
 }}}

 The MD5 changed from `06391007776121799859126bd4d848f3` to
 `353699581c94f150671616ecfc357c09` even though both commands, the binary,
 and the input file are identical. None of the other ADPCM encoders shows
 this behavior.
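
 The comparison above can be automated. What follows is only a minimal sketch, assuming Python 3 is available; the helper names `stdout_digests` and `is_reproducible` are hypothetical and not part of FFmpeg. It runs an arbitrary command several times and compares a hash of each run's standard output (SHA-256 here, purely as a convenient digest; any stable hash would do).

```python
import hashlib
import subprocess
import sys


def stdout_digests(cmd, runs=2):
    """Run `cmd` `runs` times; return one SHA-256 hex digest of stdout per run."""
    digests = []
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,  # the ffmpeg banner goes to stderr; ignore it
            check=True,                 # raise if the command itself fails
        )
        digests.append(hashlib.sha256(result.stdout).hexdigest())
    return digests


def is_reproducible(cmd, runs=2):
    """True if every run produced byte-identical stdout."""
    return len(set(stdout_digests(cmd, runs))) == 1


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to run.
    command = sys.argv[1:]
    for digest in stdout_digests(command):
        print(digest)
    print("reproducible" if is_reproducible(command) else "NOT reproducible")
```

 Passing the `./ffmpeg ... -f md5 -` command from this section as the arguments would hash the `MD5=...` line that ffmpeg prints on stdout, so differing digests indicate differing encoder output.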

 === Significantly degraded output ===
 The encoding logs are omitted here because they contain nothing of
 interest.

 Encoding/decoding without trellis:
 {{{
 ./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact \
   -i tests/data/asynth-44100-2.wav -threads 1 -b:a 128k -c adpcm_ima_qt \
   -flags +bitexact -fflags +bitexact -nostats nontrellis.aiff
 ./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact \
   -i nontrellis.aiff -threads 1 -flags +bitexact -fflags +bitexact \
   -nostats nontrellis.wav
 }}}

 Encoding/decoding with trellis:
 {{{
 ./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact \
   -i tests/data/asynth-44100-2.wav -threads 1 -b:a 128k -c adpcm_ima_qt \
   -trellis 5 -flags +bitexact -fflags +bitexact trellis.aiff
 ./ffmpeg -cpuflags 0 -threads 1 -flags +bitexact -fflags +bitexact \
   -i trellis.aiff -threads 1 -flags +bitexact -fflags +bitexact trellis.wav
 }}}

 Finding PSNR:
 {{{
 timothy_gu at ubuntu-lenovo:~/ffmpeg$ tests/tiny_psnr
 tests/data/asynth-44100-2.wav nontrellis.wav 2
 stddev:  904.76 PSNR: 37.20 MAXDIFF:34029 bytes:  1058400/  1058560
 timothy_gu at ubuntu-lenovo:~/ffmpeg$ tests/tiny_psnr
 tests/data/asynth-44100-2.wav trellis.wav 2
 stddev: 8399.21 PSNR: 17.84 MAXDIFF:64623 bytes:  1058400/  1058560
 }}}

 For reference, with this specific sample, enabling trellis in any of the
 other ADPCM encoders raises PSNR by about 2 dB, whereas here it drops by
 more than 19 dB.
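
 The `stddev` and `PSNR` columns reported above can be cross-checked by hand. The sketch below assumes that `stddev` is close to the RMS error and that the peak value for 2-byte samples is 65535, which reproduces the reported figures to within rounding; the function name `psnr_from_rms` is hypothetical and this is not a copy of the `tiny_psnr` source.

```python
import math


def psnr_from_rms(rms_error, sample_bytes=2):
    """PSNR in dB for a given RMS error, assuming full-scale unsigned peak."""
    peak = (1 << (8 * sample_bytes)) - 1  # 65535 for 16-bit samples (assumption)
    return 20 * math.log10(peak / rms_error)


# stddev values taken from the tiny_psnr output in this ticket
nontrellis_psnr = psnr_from_rms(904.76)
trellis_psnr = psnr_from_rms(8399.21)

# How much larger the error becomes when trellis is enabled
error_ratio = 8399.21 / 904.76

print(f"without trellis: {nontrellis_psnr:.2f} dB")
print(f"with trellis:    {trellis_psnr:.2f} dB")
print(f"RMS error ratio: {error_ratio:.1f}x")
```

 Under these assumptions the drop of roughly 19 dB corresponds to an RMS error that is about nine times larger with trellis enabled.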

--
Ticket URL: <https://trac.ffmpeg.org/ticket/3701>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker

