[FFmpeg-devel] [PATCH] E-AC-3 spectral extension

Michael Niedermayer michaelni
Mon Jun 1 12:20:54 CEST 2009


On Sat, May 30, 2009 at 11:43:12PM -0400, Justin Ruggles wrote:
> Michael Niedermayer wrote:
> > On Sun, May 17, 2009 at 02:23:34PM -0400, Justin Ruggles wrote:
> >> Hi,
> >>
> >> I was recently made aware that some French TV station(s) will soon (if
> >> not already) start using E-AC-3 streams in their broadcasts which
> >> utilize spectral extension.  I was also given some samples (thanks j-b
> >> and Anthony), which I uploaded to mphq:
> >> http://samples.mplayerhq.hu/A-codecs/AC3/eac3/csi_miami_*
> >>
> >> So I decided to revisit my SPX patch.  The previous version was done
> >> with all integer arithmetic, but it turns out that it's really not
> >> accurate enough for spectral extension processing.  The resulting decoded
> >> output had a max bandwidth of about 2kHz less when using 24-bit fixed
> >> point vs. floating point, and was only slightly higher than without any
> >> SPX processing at all.  Making just the square roots floating point
> >> raised the bandwidth about 1kHz, and making the rest (noise/signal
> >> scaling, spx coords, and notch filters) floating point added about
> >> another 1kHz.
> >>
> >> I was able to compare the output to Nero's E-AC-3 decoder (thanks
> >> madshi), and the results are very close considering that AC-3 uses
> >> random noise for zero-bit mantissas:
> > 
> >> stddev:  131.16 PSNR: 53.96
> > 
> > I wouldn't call 131.16 close
> 
> Well, considering I don't know how the Nero decoder differs, it's not
> bad.  I don't know how the Nero decoder ends up with higher bandwidth
> than it should, it very likely uses a different random noise generator,
> and it could do dithering in the float-to-int16 conversion.

Dither in the float-to-int conversion might account for a stddev of about
1.0, but we are two orders of magnitude above that.

About the PRNG: just decode an AC-3 stream with two different PRNGs and
compare how much the outputs differ.

Also, you can take Nero's output and ours and create a WAV file from the
sample-wise differences.
Looking at that or listening to it might provide a hint about what it is
that differs.
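
A minimal sketch of the difference step, assuming both decodes are already
loaded as 16-bit PCM with the same channel layout and are aligned sample for
sample. The function names are hypothetical; writing the result out as a WAV
file is left to whatever tool is at hand.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturate an int to the signed 16-bit range. */
static int16_t clip_int16(int v)
{
    if (v < INT16_MIN)
        return INT16_MIN;
    if (v > INT16_MAX)
        return INT16_MAX;
    return (int16_t)v;
}

/*
 * Write ref[i] - test[i] into out[i] for n samples.
 * Returns how many differences had to be clipped to fit in 16 bits.
 */
static size_t pcm_diff_s16(const int16_t *ref, const int16_t *test,
                           int16_t *out, size_t n)
{
    size_t i, clipped = 0;

    for (i = 0; i < n; i++) {
        int d = (int)ref[i] - (int)test[i];
        if (d < INT16_MIN || d > INT16_MAX)
            clipped++;
        out[i] = clip_int16(d);
    }
    return clipped;
}
```

A non-zero return value means the two decodes disagree by more than the
16-bit range somewhere, which would itself be worth a closer look.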


> 
> >> PEAQ ODG: -0.44
> > 
> > what is PEAQ ODG ?
> 
> PEAQ is an ITU standard for perceptual evaluation of audio quality.  ODG
> is the objective difference grade.  It tries to objectively estimate
> what results might be from a subjective listening test by using a
> psychoacoustic model.  The method has its flaws, but it's a heck of a
> lot simpler than setting up a multi-user double-blind listening test for
> each change.
> 
>  0 = Imperceptible
> -1 = Perceptible, but not annoying
> -2 = Slightly annoying
> -3 = Annoying
> -4 = Very annoying

ok, understood


> 
> > btw, have you tested your code with our trasher / some fuzzer to make sure
> > it doesn't segfault?
> 
> Yes, and it does not segfault even with -er 0.  The values read from the
> stream which affect reading/writing from memory are bounds checked.  The
> (E)AC-3 decoder already does fairly well with damaged streams, and that
> is no different after this change.
> 
> > 
> >> One thing I'm unsure about is whether I should add optional runtime
> >> generation of the attenuation table rather than always hardcoding it.
> > 
> > i think due to the relatively small size there is little point
> 
> ok.
> 
> > 
> > [...]
> > 
> >> diff --git a/libavcodec/ac3dec.c b/libavcodec/ac3dec.c
> >> index c176cb3..e6d7a9d 100644
> >> --- a/libavcodec/ac3dec.c
> >> +++ b/libavcodec/ac3dec.c
> >> @@ -825,14 +825,94 @@ static int decode_audio_block(AC3DecodeContext *s, int blk)
> >>  
> >>      /* spectral extension strategy */
> >>      if (s->eac3 && (!blk || get_bits1(gbc))) {
> >> -        if (get_bits1(gbc)) {
> >> -            ff_log_missing_feature(s->avctx, "Spectral extension", 1);
> >> -            return -1;
> >> +        s->spx_in_use = get_bits1(gbc);
> >> +        if (s->spx_in_use) {
> >> +            int begf, endf;
> >> +            int spx_end_subband;
> >> +
> >> +            /* determine which channels use spx */
> >> +            if (s->channel_mode == AC3_CHMODE_MONO) {
> >> +                s->channel_in_spx[1] = 1;
> >> +            } else {
> >> +                for (ch = 1; ch <= fbw_channels; ch++)
> >> +                    s->channel_in_spx[ch] = get_bits1(gbc);
> >> +            }
> >> +
> >> +            s->spx_copy_start_freq = get_bits(gbc, 2) * 12 + 25;
> >> +            begf = get_bits(gbc, 3);
> >> +            endf = get_bits(gbc, 3);
> >> +            s->spx_start_subband = begf < 6 ? begf+2 : 2*begf-3;
> >> +            spx_end_subband      = endf < 4 ? endf+5 : 2*endf+3;
> >> +            if (s->spx_start_subband >= spx_end_subband) {
> >> +                av_log(s->avctx, AV_LOG_ERROR, "invalid spectral extension range (%d >= %d)\n",
> >> +                       s->spx_start_subband, spx_end_subband);
> >> +                return -1;
> >> +            }
> >> +            s->spx_start_freq    = s->spx_start_subband * 12 + 25;
> >> +            s->spx_end_freq      = spx_end_subband      * 12 + 25;
> >> +            if (s->spx_copy_start_freq >= s->spx_start_freq) {
> >> +                av_log(s->avctx, AV_LOG_ERROR, "invalid spectral extension copy start bin (%d >= %d)\n",
> >> +                       s->spx_copy_start_freq, s->spx_start_freq);
> >> +                return -1;
> >> +            }
> > 
> > you know, i always have a bad feeling when various variables are updated
> > first and checked afterwards but left in an invalid state anyway
> > 
> > are you sure this is all free of buffer overflows? (I've not checked, so
> > it may very well be ok ...)
> 
> No further blocks are read in the frame after a block decode fails.
> Each frame is independent, so the next frame is not affected by an
> invalid state.  Also, it was previously discussed and agreed upon that
> trying to read subsequent blocks after a failed block is pointless since
> there are no known streams which use the block start info.

ok
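
For illustration, the "check first, update afterwards" ordering I had in mind
could look like the sketch below. It reuses the arithmetic from the quoted
hunk, but SpxRange and spx_parse_range() are hypothetical names and not part
of the patch; the three arguments stand for the 2-bit and two 3-bit fields
read from the bitstream.

```c
/* Hypothetical container for the parsed range; not AC3DecodeContext. */
typedef struct SpxRange {
    int copy_start_freq;
    int start_subband;
    int end_subband;
    int start_freq;
    int end_freq;
} SpxRange;

/*
 * Compute the SPX range into a local copy and validate it.
 * *dst is written only if every check passes, so a failed parse
 * leaves the previous (valid) state untouched.
 * Returns 0 on success, -1 on an invalid range.
 */
static int spx_parse_range(SpxRange *dst, int copy_startf, int begf, int endf)
{
    SpxRange tmp;

    /* field widths: 2 bits, 3 bits, 3 bits */
    if (copy_startf < 0 || copy_startf > 3 ||
        begf < 0 || begf > 7 || endf < 0 || endf > 7)
        return -1;

    tmp.copy_start_freq = copy_startf * 12 + 25;
    tmp.start_subband   = begf < 6 ? begf + 2 : 2 * begf - 3;
    tmp.end_subband     = endf < 4 ? endf + 5 : 2 * endf + 3;
    if (tmp.start_subband >= tmp.end_subband)
        return -1;

    tmp.start_freq = tmp.start_subband * 12 + 25;
    tmp.end_freq   = tmp.end_subband   * 12 + 25;
    if (tmp.copy_start_freq >= tmp.start_freq)
        return -1;

    *dst = tmp; /* commit only after all checks have passed */
    return 0;
}
```

Since no further blocks are read after a failed block, the patch as posted
does not need this; it is just the pattern that avoids the question.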

[..]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Those who are best at talking realize last or never when they are wrong.