[FFmpeg-devel] [PATCH] E-AC-3 spectral extension (bump)

Justin Ruggles justin.ruggles
Sat Mar 20 14:13:36 CET 2010


Michael Niedermayer wrote:

> On Fri, Mar 19, 2010 at 01:03:16PM +0100, Christophe Gisquet wrote:
>> Hi,
>>
>> in the sake of restarting the review process, I'm restarting a thread left here:
>> http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2009-October/077490.html
>>
>> The one complain remaining on the patch at that time was the following
>> (sorry for the shoddy editing):
>>
>>>> +    copy_sizes[num_copy_sections++] = bin - s->spx_copy_start_freq;
>>>> +
>>>> +    for (ch = 1; ch <= s->fbw_channels; ch++) {
>>>> +        if (!s->channel_uses_spx[ch])
>>>> +            continue;
>>>> +
>>>> +        /* Copy coeffs from normal bands to extension bands */
>>>> +        bin = s->spx_start_freq;
>>>> +        for (i = 0; i < num_copy_sections; i++) {
>>>> +            memcpy(&s->transform_coeffs[ch][bin],
>>>> +                   &s->transform_coeffs[ch][s->spx_copy_start_freq],
>>>> +                   copy_sizes[i]*sizeof(float));
>>>> +            bin += copy_sizes[i];
>>>> +        }
>>>> +
>>>> +        /* Calculate RMS energy for each SPX band. */
>>>> +        bin = s->spx_start_freq;
>>>> +        for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
>>>> +            int bandsize = s->spx_band_sizes[bnd];
>>>> +            float accum = 0.0f;
>>>> +            for (i = 0; i < bandsize; i++) {
>>>> +                float coeff = s->transform_coeffs[ch][bin++];
>>>> +                accum += coeff * coeff;
>>>> +            }
>>>> +            rms_energy[bnd] = sqrtf(accum / bandsize);
>>>> +        }
>>> if i understand the code correctly, the same source coefficients can be
>>> copied to several destinations, thus it would be more efficient to not
>>> calculate this for all destination but rather for the source and copy
>>> as needed to destination bands
>> The RMS computations occur on strictly non-overlapping samples from
>> band to band, and always on the SPX bands, never on the original
>> source:
>> s->spx_copy_start_freq+sum(copy_sizes[i]) <= s->spx_start_freq
>>
>> If one would output some info over the 2 samples available at:
>> http://samples.mplayerhq.hu/A-codecs/AC3/eac3/
>> one would see for the 5.1 sample:
>> Channel 1: Copying 72 elements to bin 109 from bin 25
>> RMS for band 0 on 109-121 elements
>> RMS for band 1 on 121-133 elements
>> RMS for band 2 on 133-145 elements
>> RMS for band 3 on 145-157 elements
>> RMS for band 4 on 157-181 elements
>> (and so on for each channel and call to that function)
>> For the stereo sample:
>> Channel 1: Copying 24 elements to bin 133 from bin 25
>> RMS for band 0 on 133-145 elements
>> RMS for band 1 on 145-157 elements
>> (repeat for each channel and call)
>>
>> So:
>> - no computation is remade
> 
> this is not neccessarily so. Actually ill bet its very very hard
> to proof the lack of redundant computations even for much simpler
> algorihms
> 
> 
>> - the number of samples for each iteration is for the available
>> samples 12 and 24
> 
>> - each band may start on an unaligned position (for SSE code at
>> least), so scalarproduct & co functions can't be reused directly
> 
> you bring up an interresting point, do these sometimes or always start
> at an unaligned address?, if its always x*12*sizeof(float)+1 we can
> probably do something about it

Yes, it will always be that.  The spec says "Transform coefficients #25
through #228 are grouped into 17 sub-bands of 12 coefficients each."

> 
>> - ff_eac3_apply_spectral_extension is 0.5% of execution time at best
>> (from a oprofile log)
>>
>> Attached is the patch rediff'ed against current SVN. I declare myself
>> incompetent for more theoretical discussions, though.
> 
> iam incompetent as well, but you seem interrested. Maybe we can together
> resolve this, because one or both of us is not understanding the code
> completely and its quite complicated code though luckily much cleaner than
> our aac code.
> 
> for example:
>> +        /* Copy coeffs from normal bands to extension bands */
>> +        bin = s->spx_start_freq;
>> +        for (i = 0; i < num_copy_sections; i++) {
>> +            memcpy(&s->transform_coeffs[ch][bin],
>> +                   &s->transform_coeffs[ch][s->spx_copy_start_freq],
>> +                   copy_sizes[i]*sizeof(float));
>> +            bin += copy_sizes[i];
>> +        }
> 
> this clearly copies the same area several times if num_copy_sections is
> greater than 1.
> Can it be greater than 1? or to ask differently does this happen for the
> samples we have?
> if its guranteed to be ==1 the code can be simplified alot.
> if not, how representative are our samples? are these random recordings or
> from some reference/conformance suite?
> 
> now if its copied 2 or more times i think the rms calculation is done
> several times over the same data
> it maybe just 0.5% for the samples we have but first it feels wrong if
> we do calculations redundantly and second it maybe more than 0.5% for
> other samples

num_copy_sections can definitely be >1.  That is the whole point of it.
 It might not be so with our current samples, but the spec is clear
about how this is done.  The copy region can be smaller than the spx
region, and in that case you have to wrap back around to the beginning
of the copy region to keep copying coefficients to the spx region, hence
the wrap flags and notch filter.

Also, each band has multiple sub-bands of 12 coeffs each.  The copying
is done in multiples of 12, but the wrapping can occur in the middle of
a band.  If we want to reduce calculations, we could accumulate sum of
squared coeffs for each sub-band, then add them together for each band
before dividing by bandsize and taking the square root.

> [...]
>> @@ -43,6 +66,7 @@
>>  #define AC3_MAX_COEFS   256
>>  #define AC3_BLOCK_SIZE  256
>>  #define MAX_BLOCKS        6
>> +#define SPX_MAX_BANDS    17
>>  
>>  typedef struct {
>>      AVCodecContext *avctx;                  ///< parent context
>> @@ -89,6 +113,22 @@
>>      int cpl_coords[AC3_MAX_CHANNELS][18];   ///< coupling coordinates                   (cplco)
>>  ///@}
>>  
>> +///@defgroup spx spectral extension
>> +///@{
>> +    int spx_in_use;                             ///< spectral extension in use              (spxinu)
>> +    uint8_t channel_uses_spx[AC3_MAX_CHANNELS]; ///< channel uses spectral extension        (chinspx)
>> +    int8_t spx_atten_code[AC3_MAX_CHANNELS];    ///< spx attenuation code                   (spxattencod)
> 
>> +    int spx_start_freq;                         ///< spx start frequency bin
>> +    int spx_end_freq;                           ///< spx end frequency bin
>> +    int spx_copy_start_freq;                    ///< spx starting frequency bin for copying (copystartmant)
>> +                                                ///< the copy region ends at the start of the spx region.
> 
> start/end/copy_start are not ideal names, they should contain "src"/"dst"
> so its clear from where to where things are copied

spx_src_start_freq, spx_dst_start_freq, spx_dst_end_freq ?


-Justin



More information about the ffmpeg-devel mailing list