[FFmpeg-devel] [PATCH] E-AC-3 spectral extension

Michael Niedermayer michaelni
Thu Oct 22 14:14:20 CEST 2009


On Wed, Aug 05, 2009 at 06:14:39PM -0400, Justin Ruggles wrote:
> Michael Niedermayer wrote:
> 
> > On Mon, Aug 03, 2009 at 06:08:52PM -0400, Justin Ruggles wrote:
> >> Michael Niedermayer wrote:
> > [...]
> >>>> +    int8_t spx_atten_code[AC3_MAX_CHANNELS];    ///< spx attenuation code                   (spxattencod)
> >>>> +    int spx_start_subband;                      ///< spx beginning frequency band           (spxbegf)
> >>>> +    int spx_start_freq;                         ///< spx start frequency bin
> >>>> +    int spx_end_freq;                           ///< spx end frequency bin
> >>> what are frequency bins?
> >> MDCT coefficient index.  Frequency bin is the term that the (E-)AC-3
> >> spec uses throughout.
> > 
> > Please explain these terms somewhere, like at the top of the file.
> > And dont hesitate to add diagrams and ascii art to make all the relations
> > clear.
> 
> 
> I have added a summary of the bin/subband/band grouping of MDCT
> coefficients used in (E-)AC-3...with an ascii art diagram. :)
> 
> > 
> > 
> >>>> +    int spx_copy_start_freq;                    ///< spx starting frequency for copying     (copystartmant)
> >>> frequency in hz ?
> >> bin number. fixed.
> > 
> > next clarify what starting frequency is
> > there surely is a source and a destination to every copy ...
> 
> the summary noted above defines what a frequency bin is.  i also added a
> comment here stating that the copy range ends at the start of the spx range.
> 
> > 
> >>
> >>>> +    int num_spx_bands;                          ///< number of spx bands                    (nspxbnds)
> > 
> >>>> +    uint8_t spx_band_struct[SPX_MAX_BANDS];     ///< spectral extension band structure      (spxbndstrc)
> >>> what is a spectral extension band structure?
> >>
> >> determines how many subbands are in each band.  for each subband, 1
> >> means combine with previous band, 0 means start new band.  this is the
> >> same format as the coupling band structure, hence the shared function to
> >> decode it.  I just added more information to the current documentation
> >> for decode_band_structure().
> > 
> > Could you please explain this in the doxy of the variable,
> > someone reading the code who sees "spx_band_struct" would look up
> > the doxy in the struct first and only if the doxy that is supposed
> > to explain does not do that at all would he read the code using the
> > variable
> > you dont have to write all the details in the doxy but
> > "spectral extension band structure" says nothing unless one happens to
> > know the ac3 spec by heart and id love to not require that.
> > A person knowing how MDCT+VLC/RLE based audio coding works should be
> > able to make sense of such decoders in lavc and not have to read
> > through each individual spec
> 
> I hope the recent changes I've made help this.  I removed all the
> *_band_struct[] from the AC3DecodeContext.  So what defines the band
> structure is the number of bands and size of each band, which is much
> more intuitive to someone not already knowing the spec.
> 
> New patch attached.
> 
> Thanks for the suggestions.
[...]
> diff --git a/libavcodec/ac3dec.c b/libavcodec/ac3dec.c
> index f2cadf0..492388f 100644
> --- a/libavcodec/ac3dec.c
> +++ b/libavcodec/ac3dec.c
> @@ -812,14 +812,104 @@ static int decode_audio_block(AC3DecodeContext *s, int blk)
>  
>      /* spectral extension strategy */
>      if (s->eac3 && (!blk || get_bits1(gbc))) {
> -        if (get_bits1(gbc)) {
> -            av_log_missing_feature(s->avctx, "Spectral extension", 1);
> -            return -1;
> +        s->spx_in_use = get_bits1(gbc);
> +        if (s->spx_in_use) {
> +            int copy_start, start_subband, end_subband, start_freq, end_freq;
> +
> +            /* determine which channels use spx */
> +            if (s->channel_mode == AC3_CHMODE_MONO) {
> +                s->channel_uses_spx[1] = 1;
> +            } else {
> +                for (ch = 1; ch <= fbw_channels; ch++)
> +                    s->channel_uses_spx[ch] = get_bits1(gbc);
> +            }
> +
> +            /* get the frequency bins of the spx copy region and the spx start
> +               and end subbands */
> +            copy_start    = get_bits(gbc, 2);
> +            start_subband = get_bits(gbc, 3) + 2;
> +            if (start_subband > 7)
> +                start_subband += start_subband - 7;
> +            end_subband   = get_bits(gbc, 3) + 5;
> +            if (end_subband   > 7)
> +                end_subband   += end_subband   - 7;
> +            copy_start    = copy_start    * 12 + 25;
> +            start_freq    = start_subband * 12 + 25;
> +            end_freq      = end_subband   * 12 + 25;
> +
> +            /* check validity of spx ranges */
> +            if (start_subband >= end_subband) {
> +                av_log(s->avctx, AV_LOG_ERROR, "invalid spectral extension "
> +                       "range (%d >= %d)\n", start_subband, end_subband);
> +                return -1;
> +            }
> +            if (copy_start >= start_freq) {
> +                av_log(s->avctx, AV_LOG_ERROR, "invalid spectral extension "
> +                       "copy start bin (%d >= %d)\n", copy_start, start_freq);
> +                return -1;
> +            }
> +
> +            s->spx_copy_start_freq = copy_start;
> +            s->spx_start_freq      = start_freq;
> +            s->spx_end_freq        = end_freq;
> +
> +            decode_band_structure(gbc, blk, s->eac3, 0,
> +                                  start_subband, end_subband,
> +                                  ff_eac3_default_spx_band_struct,
> +                                  &s->num_spx_bands,
> +                                  s->spx_band_sizes);

its not in this patch but decode_band_structure():
    if (num_bands || band_sizes ) {
        n_bands = n_subbands;
        bnd_sz[0] = ecpl ? 6 : 12;
        for (bnd = 0, subbnd = 1; subbnd < n_subbands; subbnd++) {

this relly is hard to read, abbreviations are so randomly mixed
num_bands and n_bands, i assume n is an abbreviation for num but then
whats their difference.
bnd_sz i assume supposed to mean band_size why is it abbreviated here
while not band_sizes.

also
    if (num_bands || band_sizes ) {
[...]
    }

    /* set optional output params */
    if (num_bands)
        *num_bands = n_bands;
    if (band_sizes)
        memcpy(band_sizes, bnd_sz, n_bands);

can be simplifed to:
    if (num_bands || band_sizes ) {
[...]
        /* set optional output params */
        if (num_bands)
            *num_bands = n_bands;
        if (band_sizes)
            memcpy(band_sizes, bnd_sz, n_bands);
    }



[...]

> +void ff_eac3_apply_spectral_extension(AC3DecodeContext *s)
> +{
> +    int bin, bnd, ch, i;
> +    uint8_t wrapflag[SPX_MAX_BANDS]={1,0,}, num_copy_sections, copy_sizes[SPX_MAX_BANDS];
> +    float rms_energy[SPX_MAX_BANDS];
> +

> +    /* Set copy index mapping table. Set wrap flags to apply a notch filter at
> +       wrap points later on. */
> +    bin = s->spx_copy_start_freq;
> +    num_copy_sections = 0;
> +    for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
> +        int copysize;
> +        int bandsize = s->spx_band_sizes[bnd];
> +        if (bin + bandsize > s->spx_start_freq) {
> +            copy_sizes[num_copy_sections++] = bin - s->spx_copy_start_freq;
> +            bin = s->spx_copy_start_freq;
> +            wrapflag[bnd] = 1;
> +        }
> +        for (i = 0; i < bandsize; i += copysize) {
> +            if (bin == s->spx_start_freq) {
> +                copy_sizes[num_copy_sections++] = bin - s->spx_copy_start_freq;
> +                bin = s->spx_copy_start_freq;
> +            }
> +            copysize = FFMIN(bandsize - i, s->spx_start_freq - bin);
> +            bin += copysize;
> +        }
> +    }

this code is quite messy ...
not a complaint, just an observation


> +    copy_sizes[num_copy_sections++] = bin - s->spx_copy_start_freq;
> +
> +    for (ch = 1; ch <= s->fbw_channels; ch++) {
> +        if (!s->channel_uses_spx[ch])
> +            continue;
> +
> +        /* Copy coeffs from normal bands to extension bands */
> +        bin = s->spx_start_freq;
> +        for (i = 0; i < num_copy_sections; i++) {
> +            memcpy(&s->transform_coeffs[ch][bin],
> +                   &s->transform_coeffs[ch][s->spx_copy_start_freq],
> +                   copy_sizes[i]*sizeof(float));
> +            bin += copy_sizes[i];
> +        }
> +

> +        /* Calculate RMS energy for each SPX band. */
> +        bin = s->spx_start_freq;
> +        for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
> +            int bandsize = s->spx_band_sizes[bnd];
> +            float accum = 0.0f;
> +            for (i = 0; i < bandsize; i++) {
> +                float coeff = s->transform_coeffs[ch][bin++];
> +                accum += coeff * coeff;
> +            }
> +            rms_energy[bnd] = sqrtf(accum / bandsize);
> +        }

if i understand the code correctly, the same source coefficients can be
copied to several destinations, thus it would be more efficient to not
calculate this for all destination but rather for the source and copy
as needed to destination bands


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you think the mosad wants you dead since a long time then you are either
wrong or dead since a long time.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20091022/f636448e/attachment.pgp>



More information about the ffmpeg-devel mailing list