[FFmpeg-devel] [PATCH] RV40 loop filter, try 2

Michael Niedermayer michaelni
Fri Oct 31 01:18:12 CET 2008


On Thu, Oct 30, 2008 at 04:51:19PM +0200, Kostya wrote:
> $subj
> 
> I've tried to generalize, clean and document it.

its better but this still needs a lot more work to be readable.
the bit masks still are practically unreadable.
Its not so much a question of what is added into them but what the
variables actually are.
what i mean is things like

cbp // these are 16bits representing the coded block flags of the 4x4 luma
       blocks like:
    0  1  2  3
    4  5  6  7
    8  9 10 11
   12 13 14 15 where 0 is lsb and 15 is msb
   (msb)3x3, 2x3, 1x3, 0x3, 3x2, ...

this is especially absolutely essential for all the trickier loop filter
masks. mvmask, these h/v masks, ...

I approximately know by now what the variable contain but i do not know it
with enough precission and detail to really understand the code and
review it properly. Now i surely could "reverse engeneer" this but i
really think that this should not be needed, the code should be easier
to understand, it shouldnt require every reader to analyze the code to find
out what everything is ...


> Index: libavcodec/rv40.c
> ===================================================================
> --- libavcodec/rv40.c	(revision 15732)
> +++ libavcodec/rv40.c	(working copy)
> @@ -247,7 +247,427 @@
>      return 0;
>  }
>  
> +#define CLIP_SYMM(a, b) av_clip(a, -(b), b)
>  /**
> + * weaker deblocking very similar to the one described in 4.4.2 of JVT-A003r1
> + */
> +static inline void rv40_weak_loop_filter(uint8_t *src, const int step,
> +                                         const int filter_outer, const int filter_inner,
> +                                         const int alpha,
> +                                         const int lim_inner, const int lim_outer,
> +                                         const int difflim, const int beta,
> +                                         const int S0, const int S1,
> +                                         const int S2, const int S3)
> +{
> +    uint8_t *cm = ff_cropTbl + MAX_NEG_CROP;
> +    int t, u, diff;
> +
> +    t = src[0*step] - src[-1*step];
> +    if(!t){
> +        return;
> +    }
> +    u = (alpha * FFABS(t)) >> 7;
> +    if(u > 3 - (filter_inner && filter_outer)){
> +        return;
> +    }
> +
> +    t <<= 2;
> +    if(filter_inner && filter_outer)
> +        t += src[-2*step] - src[1*step];
> +    diff = CLIP_SYMM((t + 4) >> 3, difflim);
> +    src[-1*step] = cm[src[-1*step] + diff];
> +    src[ 0*step] = cm[src[ 0*step] - diff];
> +    if(FFABS(S2) <= beta && filter_outer){
> +        t = (S0 + S2 - diff) >> 1;
> +        src[-2*step] = cm[src[-2*step] - CLIP_SYMM(t, lim_outer)];
> +    }
> +    if(FFABS(S3) <= beta && filter_inner){
> +        t = (S1 + S3 + diff) >> 1;
> +        src[ 1*step] = cm[src[ 1*step] - CLIP_SYMM(t, lim_inner)];
> +    }
> +}

inner and outer?
src[0*step]/src[-1*step] would be inner and src[-2*step]/src[ 1*step]
outer relative to the edge, so iam confused by the naming ...



> +
> +/**
> + * This macro is used for calculating 25*x0+26*x1+26*x2+26*x3+25*x4
> + * or 25*x0+26*x1+51*x2+26*x3
> + *
> + * @param idx25 index of the value with coefficient = 25 (could be at the end of
> +                coefficients or at the start)
> + * @param idx25_51 index of the value that takes coefficient 25 when this index
> +                is before strt index or after (start+3) and 51 when it falls into
> +                that range
> + */

strt?


> +#define RV40_STRONG_FILTER(src, step, start, idx25_51, idx25) \
> +     26*(src[start    *step] + src[(start+1)*step]  + src[(start+2)*step] \
> +       + src[(start+3)*step] + src[idx25    *step]) - src[idx25_51 *step] \
> +       - src[idx25    *step]
> +

> +/**
> + * Deblocking filter, the altered version from JVT-A003r1 H.26L draft.
> + */
> +static inline void rv40_adaptive_loop_filter(uint8_t *src, const int step,
> +                                             const int stride, const int dmode,
> +                                             const int lim_inner, const int lim_outer,
> +                                             const int alpha,
> +                                             const int beta, const int beta2,
> +                                             const int chroma, const int edge)
> +{

> +    int diffs[4][4];

the second index is always a litteral number so this could be as well
4 arrays with individual and better names.


[...]
> +static int rv40_set_deblock_coef(RV34DecContext *r)
> +{
> +    MpegEncContext *s = &r->s;
> +    int mvmask = 0, i, j;
> +    int midx = s->mb_x * 2 + s->mb_y * 2 * s->b8_stride;
> +    int16_t (*motion_val)[2] = s->current_picture_ptr->motion_val[0][midx];
> +    if(s->pict_type == FF_I_TYPE)
> +        return 0;
> +    for(i = 0; i < 2; i++){
> +        if(!s->first_slice_line && check_mv_difference(motion_val + i, s->b8_stride)){
> +            mvmask |= 0x03 << (i*2);
> +        }
> +        if(check_mv_difference(motion_val + s->b8_stride + i, s->b8_stride)){
> +            mvmask |= 0x03 << (i*2 + 8);
> +        }
> +    }
> +    for(j = 0; j < 2; j++){
> +        if(s->mb_x && check_mv_difference(motion_val, 1)){
> +            mvmask |= 0x11 << (j*8);
> +        }
> +        if(check_mv_difference(motion_val + 1, 1)){
> +            mvmask |= 0x11 << (2 + j*8);
> +        }
> +        motion_val += s->b8_stride;
> +    }
> +    return mvmask;
> +}

the s->first_slice_line and s->mb_x checks are still inside the loops.
also the pict_type check is still there
besides these the *8 can be merged in the for loop and the C<<(D+E) can be
simplified by merging the effect of D into C


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Old school: Use the lowest level language in which you can solve the problem
            conveniently.
New school: Use the highest level language in which the latest supercomputer
            can solve the problem without the user falling asleep waiting.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20081031/0bc12571/attachment.pgp>



More information about the ffmpeg-devel mailing list