[FFmpeg-soc] [soc]: r704 - dirac/libavcodec/dirac.c

Michael Niedermayer michaelni at gmx.at
Sat Aug 11 23:28:24 CEST 2007


On Sat, Aug 11, 2007 at 11:20:23PM +0200, marco wrote:
> Author: marco
> Date: Sat Aug 11 23:20:23 2007
> New Revision: 704
> 
> Log:
> optimize loops for the 9/7 IDWT
> 
> Modified:
>    dirac/libavcodec/dirac.c
> 
> Modified: dirac/libavcodec/dirac.c
> ==============================================================================
> --- dirac/libavcodec/dirac.c	(original)
> +++ dirac/libavcodec/dirac.c	Sat Aug 11 23:20:23 2007
> @@ -1757,7 +1757,7 @@ STOP_TIMER("idwt53")
>  static int dirac_subband_idwt_97(AVCodecContext *avctx,
>                                   int *data, int level) {
>      DiracContext *s = avctx->priv_data;
> -    int *synth;
> +    int *synth, *synthline;
>      int x, y;
>      int width = subband_width(avctx, level);
>      int height = subband_height(avctx, level);
> @@ -1799,90 +1799,101 @@ START_TIMER
>      */
>  
>      /* Vertical synthesis: Lifting stage 1.  */
> +    synthline = synth;
>      for (x = 0; x < synth_width; x++)
> +        synthline[x] -= (    synthline[x + synth_width]
> +                                     + synthline[x + synth_width]
>                                       + 2) >> 2;
> +    synthline = synth + (synth_width << 1);
>      for (y = 1; y < height - 1; y++) {
>          for (x = 0; x < synth_width; x++) {
> +            synthline[x] -= (    synthline[x - synth_width]
> +                                     + synthline[x + synth_width]
>                                       + 2) >> 2;
>          }
> +        synthline += synth_width << 1;
>      }
> +    synthline = synth + (synth_height - 2) * synth_width;
>      for (x = 0; x < synth_width; x++)
> +        synthline[x] -= (    synthline[x - synth_width]
> +                                     + synthline[x + synth_width]
>                                       + 2) >> 2;
>  
>      /* Vertical synthesis: Lifting stage 2.  */
> +    synthline = synth + synth_width;
>      for (x = 0; x < synth_width; x++)
> +        synthline[x] += (     -synthline[x - synth_width]
> +                                   + 9 * synthline[x - synth_width]
> +                                   + 9 * synthline[x + synth_width]
> +                                   -     synthline[x + 3 * synth_width]
>                                     + 8) >> 4;
> +    synthline = synth + (synth_width << 1);
>      for (y = 1; y < height - 2; y++) {
>          for (x = 0; x < synth_width; x++) {

Performing lifting pass X over the whole image and then pass X+1 over the
whole image is not very cache friendly.
It would be better to perform lifting pass X for one line, then pass X+1 for
whatever line(s) can be processed with the data that has just become
available, then pass X for the next line, and so on.
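
A rough sketch of the interleaving described above, to make the ordering
concrete. The names (stage1_row, stage2_row, idwt97_vertical_interleaved) are
made up for illustration and are not from dirac.c. The taps are taken from the
quoted diff. Clamping of row indices at both edges and an even number of rows
are assumptions, since the quoted diff is cut off before the bottom edge.

```c
/* Clamp an even-row index k to the valid range [0, n - 1]. */
static int clamp_idx(int k, int n)
{
    return k < 0 ? 0 : (k >= n ? n - 1 : k);
}

/* Stage 1 for even row 2k: uses odd rows 2k-1 and 2k+1.
 * At the top edge (k == 0) the missing row above is replaced by row 1. */
static void stage1_row(int *img, int w, int k)
{
    int *dst        = img + (2 * k) * w;
    const int *down = img + (2 * k + 1) * w;
    const int *up   = k > 0 ? img + (2 * k - 1) * w : down;
    for (int x = 0; x < w; x++)
        dst[x] -= (up[x] + down[x] + 2) >> 2;
}

/* Stage 2 for odd row 2k+1: uses even rows 2(k-1), 2k, 2(k+1), 2(k+2),
 * with out-of-range even rows clamped (assumption). n = number of even rows. */
static void stage2_row(int *img, int w, int n, int k)
{
    int *dst      = img + (2 * k + 1) * w;
    const int *e0 = img + 2 * clamp_idx(k - 1, n) * w;
    const int *e1 = img + 2 * k * w;
    const int *e2 = img + 2 * clamp_idx(k + 1, n) * w;
    const int *e3 = img + 2 * clamp_idx(k + 2, n) * w;
    for (int x = 0; x < w; x++)
        dst[x] += (-e0[x] + 9 * e1[x] + 9 * e2[x] - e3[x] + 8) >> 4;
}

/* Reference: each stage runs over the whole image before the next starts. */
static void idwt97_vertical_two_pass(int *img, int w, int h)
{
    int n = h / 2;
    for (int k = 0; k < n; k++)
        stage1_row(img, w, k);
    for (int k = 0; k < n; k++)
        stage2_row(img, w, n, k);
}

/* Interleaved: as soon as even row k has finished stage 1, odd row k-2 has
 * all the stage-1 output it needs, so stage 2 runs on it while those rows
 * are still likely to be in cache. */
static void idwt97_vertical_interleaved(int *img, int w, int h)
{
    int n = h / 2;
    for (int k = 0; k < n; k++) {
        stage1_row(img, w, k);
        if (k >= 2)
            stage2_row(img, w, n, k - 2);
    }
    /* The last (up to) two odd rows only depend on clamped even rows. */
    for (int k = n > 2 ? n - 2 : 0; k < n; k++)
        stage2_row(img, w, n, k);
}
```

Both functions should produce the same output; the interleaved one just
touches each row again shortly after it was written. Stage 1 for even row k
reads odd rows k-1 and k before stage 2 has modified them, which is why stage 2
has to lag behind by two rows.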

Also look at snow.c::lift(); maybe something like that could be used to
simplify the code.
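
To illustrate the kind of simplification meant here, below is a hypothetical
helper. It is not the actual snow.c code and its signature is invented. The
idea is one routine parameterized by multiplier, rounding offset and shift,
with the edge rows handled by choosing which row pointers to pass in rather
than by separate loops. It assumes synth_height is even.

```c
/* Hypothetical two-tap lifting step for one line (not snow.c's lift()):
 *   v = (mul * (a[x] + b[x]) + add) >> shift
 *   dst[x] -= v if subtract is non-zero, else dst[x] += v */
static void lift_line(int *dst, const int *a, const int *b, int width,
                      int mul, int add, int shift, int subtract)
{
    for (int x = 0; x < width; x++) {
        int v = (mul * (a[x] + b[x]) + add) >> shift;
        if (subtract)
            dst[x] -= v;
        else
            dst[x] += v;
    }
}

/* Vertical lifting stage 1 from the quoted diff as a single loop.
 * The top row has no row above it, so the row below is used twice. */
static void vertical_stage1(int *synth, int synth_width, int synth_height)
{
    for (int y = 0; y < synth_height; y += 2) {
        int *line        = synth + y * synth_width;
        const int *below = line + synth_width;
        const int *above = y > 0 ? line - synth_width : below;
        lift_line(line, above, below, synth_width, 1, 2, 2, 1);
    }
}
```

With this shape the three separate loops for the first, middle and last rows
collapse into one, and the only edge-specific part is the choice of the
"above" pointer.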


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Dictatorship naturally arises out of democracy, and the most aggravated
form of tyranny and slavery out of the most extreme liberty. -- Plato