[FFmpeg-devel] [PATCH] use MUL64 in ac3dec.c

Måns Rullgård mans at mansr.com
Tue Jan 12 00:11:43 CET 2010


Reimar Döffinger <Reimar.Doeffinger at gmx.de> writes:

> Hello,
> a real 64x64 multiply is really slow on 32-bit processors.
> The patch below speeds up AC3 decoding on an Atom by 5% overall
> (fastest out of five runs: from 3.183 to 3.024 for my sample).
>
> Index: libavcodec/ac3dec.c
> ===================================================================
> --- libavcodec/ac3dec.c	(revision 21153)
> +++ libavcodec/ac3dec.c	(working copy)
> @@ -420,10 +420,9 @@
>          int band_end = bin + s->cpl_band_sizes[band];
>          for (ch = 1; ch <= s->fbw_channels; ch++) {
>              if (s->channel_in_cpl[ch]) {
> -                int64_t cpl_coord = s->cpl_coords[ch][band];
> +                int cpl_coord = s->cpl_coords[ch][band];
>                  for (bin = band_start; bin < band_end; bin++) {
> -                    s->fixed_coeffs[ch][bin] = ((int64_t)s->fixed_coeffs[CPL_CH][bin] *
> -                                                cpl_coord) >> 23;
> +                    s->fixed_coeffs[ch][bin] = MUL64(s->fixed_coeffs[CPL_CH][bin], cpl_coord) >> 23;
>                  }
>                  if (ch == 2 && s->phase_flags[band]) {
>                      for (bin = band_start; bin < band_end; bin++)

That looks correct to me.
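
For illustration, here is a minimal standalone sketch of why the two
forms should agree. It assumes the generic C fallback definition of
MUL64 (architecture-specific versions may be written differently);
couple_orig and couple_mul64 are hypothetical helper names, not
functions from ac3dec.c. Both compute the same 64-bit product; the
MUL64 form only makes it explicit that both operands are 32-bit, so the
compiler can use a 32x32->64 multiply.

```c
#include <stdint.h>

/* Assumed generic C fallback; real per-architecture versions may differ. */
#define MUL64(a, b) ((int64_t)(a) * (int64_t)(b))

/* Original form: the coordinate is held in a 64-bit variable, which can
 * push the compiler towards a full 64x64 multiply on 32-bit targets. */
static int32_t couple_orig(int32_t coeff, int32_t coord)
{
    int64_t c = coord;
    return (int32_t)(((int64_t)coeff * c) >> 23);
}

/* Patched form: both operands stay 32-bit and are widened only for the
 * multiply itself. */
static int32_t couple_mul64(int32_t coeff, int coord)
{
    return (int32_t)(MUL64(coeff, coord) >> 23);
}
```

The right shift of a negative 64-bit value is implementation-defined in
C, but it is the same operation in both functions, so they agree on any
given compiler.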

> Even faster (though possibly wrong, even though I hear no issues with
> my samples) should be the variant with MULH, though I could not really
> measure a difference:

Make sure the output is actually identical.

> Index: libavcodec/ac3dec.c
> ===================================================================
> --- libavcodec/ac3dec.c	(revision 21153)
> +++ libavcodec/ac3dec.c	(working copy)
> @@ -420,10 +420,9 @@
>          int band_end = bin + s->cpl_band_sizes[band];
>          for (ch = 1; ch <= s->fbw_channels; ch++) {
>              if (s->channel_in_cpl[ch]) {
> -                int64_t cpl_coord = s->cpl_coords[ch][band];
> +                int cpl_coord = s->cpl_coords[ch][band] << 9;

Is this certain not to overflow?

>                  for (bin = band_start; bin < band_end; bin++) {
> -                    s->fixed_coeffs[ch][bin] = ((int64_t)s->fixed_coeffs[CPL_CH][bin] *
> -                                                cpl_coord) >> 23;
> +                    s->fixed_coeffs[ch][bin] = MULH(s->fixed_coeffs[CPL_CH][bin], cpl_coord);
>                  }

Provided the shift above is safe, that had better work, since the
destination is 32-bit, or the original would be suffering some serious
truncation problems.
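
A rough sketch of how both questions could be checked outside the
decoder. It assumes MULH returns the high 32 bits of the signed 64-bit
product (the generic C behaviour) and that right shifts of negative
values are arithmetic; mulh, shift9_is_safe, couple_ref, couple_mulh and
count_mismatches are hypothetical names. The shift by 9 fits in a 32-bit
int only for -2^22 <= cpl_coord < 2^22; whether cpl_coords is actually
bounded that way is not established in this thread. Within that range,
(a * (c << 9)) >> 32 equals (a * c) >> 23 exactly, because the extra
factor of 2^9 cancels against the larger shift.

```c
#include <stdint.h>

/* Assumed generic behaviour of MULH: high 32 bits of the 64-bit product. */
static int32_t mulh(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* Returns 1 if coord << 9 is representable in a 32-bit int. */
static int shift9_is_safe(int32_t coord)
{
    return coord >= -(1 << 22) && coord < (1 << 22);
}

/* Reference: the original 64-bit multiply followed by >> 23. */
static int32_t couple_ref(int32_t coeff, int32_t coord)
{
    return (int32_t)(((int64_t)coeff * coord) >> 23);
}

/* MULH variant; the caller must ensure shift9_is_safe(coord).
 * Multiplying by 512 avoids left-shifting a negative value. */
static int32_t couple_mulh(int32_t coeff, int32_t coord)
{
    return mulh(coeff, coord * 512);
}

/* Counts differing results over a sparse sweep of the coefficient range. */
static long count_mismatches(int32_t coord)
{
    long bad = 0;
    for (int64_t coeff = INT32_MIN; coeff <= INT32_MAX; coeff += 65521) {
        if (couple_ref((int32_t)coeff, coord) !=
            couple_mulh((int32_t)coeff, coord))
            bad++;
    }
    return bad;
}
```

A sweep like this is no substitute for comparing the decoder output on
real samples, but it narrows the question down to the range of
cpl_coord.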

-- 
Måns Rullgård
mans at mansr.com


