[FFmpeg-devel] [PATCH 10/11] dca: factorize scaling in inverse ADPCM

Michael Niedermayer michaelni at gmx.at
Fri Feb 7 13:07:02 CET 2014


On Thu, Feb 06, 2014 at 12:42:01AM +0000, Christophe Gisquet wrote:
> The codeblock affected accounted for around 4% of the runtime on x86_64
> (measured using oprofile on a Penryn).
> Timings for Arrandale (gcc 4.6.1 tdm64-1 for windows):
> win32: 341 -> 331
> win64: 321 -> 120
> Part of the gain also comes from the adpcm values being converted to float
> outside of the loops.
> ---
>  libavcodec/dcadec.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/libavcodec/dcadec.c b/libavcodec/dcadec.c
> index 03c5d6f..7f956d1 100644
> --- a/libavcodec/dcadec.c
> +++ b/libavcodec/dcadec.c
> @@ -1346,15 +1346,15 @@ static int dca_subsubframe(DCAContext *s, int base_channel, int block_index)
>              if (s->prediction_mode[k][l]) {
>                  int n;
>                  for (m = 0; m < 8; m++) {
> +                    float sum = 0;
>                      for (n = 1; n <= 4; n++)
>                          if (m >= n)
> -                            subband_samples[k][l][m] += (adpcm_vb[s->prediction_vq[k][l]][n - 1] * subband_samples[k][l][m - n] / 8192);
> +                            sum +=                       adpcm_vb[s->prediction_vq[k][l]][n - 1] * subband_samples[k][l][m - n];
>                          else if (s->predictor_history)
> -                            subband_samples[k][l][m] += (adpcm_vb[s->prediction_vq[k][l]][n - 1] * s->subband_samples_hist[k][l][m - n + 4] / 8192);
> +                            sum                      +=  adpcm_vb[s->prediction_vq[k][l]][n - 1] * s->subband_samples_hist[k][l][m - n + 4];
> +                    subband_samples[k][l][m] += sum / 8192;

should be ok, though there's some further potential for optimizations:
the first iteration could be done with sum = instead of +=, and
adpcm_vb[s->prediction_vq[k][l]], &subband_samples[k][l][m] and
&s->subband_samples_hist[k][l][m+4]
could be factored out

the / 8192 could be changed to * (1.0 / 8192)

some compilers will do some of these automatically though 
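
A minimal sketch of what the suggestions above might look like, reduced to a
single subband. The function names and the flat array parameters are
hypothetical stand-ins for the DCAContext fields, not the actual dcadec.c code.
The sketch uses 1.0f / 8192 rather than 1.0 / 8192 to keep the arithmetic in
float; that is a choice made here, not something stated in the review. The
"sum =" first-iteration change is not shown.

```c
#include <stdint.h>

/* Loop shape from the patch, reduced to one subband.
 * coeffs  stands in for adpcm_vb[s->prediction_vq[k][l]]
 * samples stands in for subband_samples[k][l]
 * hist    stands in for s->subband_samples_hist[k][l]
 * (flat arrays are an assumption made for this sketch) */
static void inverse_adpcm_ref(float samples[8], const float hist[4],
                              const int16_t coeffs[4], int use_history)
{
    int m, n;

    for (m = 0; m < 8; m++) {
        float sum = 0;
        for (n = 1; n <= 4; n++) {
            if (m >= n)
                sum += coeffs[n - 1] * samples[m - n];
            else if (use_history)
                sum += coeffs[n - 1] * hist[m - n + 4];
        }
        samples[m] += sum / 8192;
    }
}

/* Same computation with two of the suggestions applied. The array
 * parameters already play the role of the factored-out base pointers. */
static void inverse_adpcm_opt(float samples[8], const float hist[4],
                              const int16_t coeffs[4], int use_history)
{
    /* 8192 is 2^13, so its reciprocal is exactly representable. */
    const float scale = 1.0f / 8192;
    float c[4];
    int m, n;

    /* Convert the coefficients to float once, outside both loops. */
    for (n = 0; n < 4; n++)
        c[n] = coeffs[n];

    for (m = 0; m < 8; m++) {
        float sum = 0;
        for (n = 1; n <= 4; n++) {
            if (m >= n)
                sum += c[n - 1] * samples[m - n];
            else if (use_history)
                sum += c[n - 1] * hist[m - n + 4];
        }
        /* Multiply by the reciprocal instead of dividing. */
        samples[m] += sum * scale;
    }
}
```

Because the divisor is a power of two, swapping the division for a multiply
should not change the results in this sketch; for other divisors it generally
would.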

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Rewriting code that is poorly written but fully understood is good.
Rewriting code that one doesn't understand is a sign that one is less smart
than the original author, trying to rewrite it will not make it better.