[FFmpeg-soc] Review of dirac code

Fri Aug 17 00:12:57 CEST 2007

On Thu, Aug 16, 2007 at 11:30:37PM +0200, Marco Gerards wrote:
> Michael Niedermayer <michaelni at gmx.at> writes:
> 
> Hi,
> 
> > On Thu, Aug 16, 2007 at 07:37:24PM +0200, Marco Gerards wrote:
> > [...]
> >> > [...]
> >> >
> >> > dirac_arith.h:
> >> > [...]
> >> >> typedef struct dirac_arith_state {
> >> >>     /* Arithmetic decoding.  */
> >> >>     unsigned int low;
> >> >>     unsigned int range;
> >> >>     unsigned int code;
> >> >>     unsigned int bits_left;
> >> >>     int carry;
> >> >>     unsigned int contexts[ARITH_CONTEXT_COUNT];
> >> >> 
> >> >>     GetBitContext *gb;
> >> >>     PutBitContext *pb;
> >> >
> >> > GetBitContext gb;
> >> > PutBitContext pb;
> >> > would avoid a pointer dereference if these 2 are used in the innermost
> >> > loops, though maybe their use can be avoided by reading/writing 8bit
> >> > at a time similar to how our cabac reader and range coder works
> >> 
> >> Can I just copy the GetBitContext, PutBitContext at initialization
> >> time, or is there a more efficient way to do this?
> >
> > no, its ok
> 
> Do you mean it is ok to copy them?

yes its ok to copy for the arith coder


[...]
> > [...]
> >> >
> >> > this is VERY inefficient, most of what is done here does not need to be done
> >> > per pixel or can be precalculated
> >> 
> >> Absolutely!  gprof agrees with you ;-)
> >> 
> >> Actually, this comment applies to all code above, I assume.
> >> 
> >>  time   seconds   seconds    calls  ms/call  ms/call  name    
> >>  40.85      2.41     2.41 88939320     0.00     0.00  upconvert
> >>  29.49      4.15     1.74       50    34.80   113.59  dirac_decode_frame
> >>  10.34      4.76     0.61   417867     0.00     0.00  motion_comp_block1ref
> >>   6.78      5.16     0.40      540     0.74     0.74  dirac_subband_idwt_53
> >>   4.24      5.41     0.25  8689089     0.00     0.00  coeff_unpack
> >>   4.07      5.65     0.24  8707719     0.00     0.00  dirac_arith_read_uint
> >> 
> >> Luca and I talked about if I would concentrate on further
> >> optimizations or on the encoder and he said he preferred having me
> >> working on the encoder for now, because I know Dirac quite well by now
> >> (Luca, please tell me if I am wrong in any way).
> >> 
> >> So there is quite some room for optimizations.  In the specification,
> >> the used pseudo code looped over the pixels.  My current code at least
> >> doesn't do that.  But I agree, the MC code is where a lot of
> >> optimizations can be made:
> >> 
> >> - The qpel/eightpel interpolation code is very inefficient and perhaps
> >>   has to be merged with motion_comp_block1ref and
> >>   motion_comp_block2ref.  In that case it can be optimized.  We
> >>   especially have to take care of border cases to avoid clipping, more
> >>   or less like I did for halfpel interpolation.
> >
> > simply making upconvert() work not with 1 pixel but with a block of pixels
> > should make it alot more efficient
> 
> I moved this into motion_comp_block1ref/motion_comp_block2ref.  This
> works quite well and boosts the speed of the decoder by 50% and makes
> further optimizations easier.  Hopefully you do not have problems with
> this approach.

iam fine with it


[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Good people do not need laws to tell them to act responsibly, while bad
people will find a way around the laws. -- Plato
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-soc/attachments/20070817/7e18396e/attachment.pgp>