[FFmpeg-devel] [PATCH] RV40 Loop Filter

Michael Niedermayer michaelni
Mon Oct 27 09:07:31 CET 2008


On Sun, Oct 26, 2008 at 03:41:09PM +0200, Kostya wrote:
> On Sat, Oct 25, 2008 at 11:14:25AM +0200, Michael Niedermayer wrote:
> > On Sat, Oct 25, 2008 at 10:08:44AM +0300, Kostya wrote:
> > > On Wed, Oct 22, 2008 at 10:53:23AM +0200, Michael Niedermayer wrote:
> > > > On Tue, Oct 21, 2008 at 09:23:21AM +0300, Kostya wrote:
> > [...]
> > > > [...]
> > > > > +static int rv40_set_deblock_coef(RV34DecContext *r)
> > > > > +{
> > > > > +    MpegEncContext *s = &r->s;
> > > > > +    int mvmask = 0, i, j, dx, dy;
> > > > > +    int midx = s->mb_x * 2 + s->mb_y * 2 * s->b8_stride;
> > > > 
> > > > > +    if(s->pict_type == FF_I_TYPE)
> > > > > +        return 0;
> > > > 
> > > > why is this even called for i frames?
> > > 
> > > I intend to use it for calculating macroblock-specific deblock
> > > strength in RV30.
> > 
> > fine but how is that related to having the pict_type check inside the
> > function compared to outside?
>  
> For RV30 setting deblock coefficients would be performed for
> I-frames as well.

so there are 2 different functions

if(rv30)
    rv30_set_deblock_coef()
else if(!I)
    rv40_set_deblock_coef()

clean, simple, fast, ...
vs.

ctx->func_ptr()

init(){
    if(rv30)
        ctx->func_ptr= func30
    else
        ctx->func_ptr= func40
}

func40(){
    if(I)
        return;
}

This is not simple, and calling functions that just return is IMHO also
not clean.
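
A rough C sketch of the first variant, in case it helps. The context struct, the enum and the two stub functions are made up for illustration and are not the real rv34 code; only the shape of the dispatch matters.

```c
#include <assert.h>

/* Hypothetical stand-ins for the decoder state; not the real RV34DecContext. */
enum PictType { PICT_I, PICT_P, PICT_B };

typedef struct DecCtx {
    int is_rv30;              /* nonzero when decoding RV30 */
    enum PictType pict_type;  /* type of the current picture */
} DecCtx;

/* Stubs standing in for the per-codec coefficient functions. */
static int rv30_coef_stub(const DecCtx *c) { (void)c; return 0x30; }
static int rv40_coef_stub(const DecCtx *c) { (void)c; return 0x40; }

/*
 * Codec and picture type are both decided at the call site, so the RV40
 * helper is never entered just to return immediately on I-frames.
 */
static int set_deblock_coef(const DecCtx *c)
{
    assert(c);
    if (c->is_rv30)
        return rv30_coef_stub(c);
    if (c->pict_type != PICT_I)
        return rv40_coef_stub(c);
    return 0;
}
```

With this shape the RV40 function needs no pict_type test of its own, and RV30 still gets called for I-frames.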


>  
> > [...]
> > > > > +                if(dx > 3 || dy > 3){
> > > > > +                    mvmask |= 0x03 << (i*2 + j*8);
> > > > > +                }
> > > > > +            }
> > > > > +        }
> > > > > +        midx += s->b8_stride;
> > > > > +    }
> > > > 
> > > > i think the if() can be moved out of the loop like
> > > > if(first_slice_line)
> > > >     mvmask &= 123;
> > > 
> > > IMO it can't.
> > > It constructs mask based on motion vectors difference in the
> > > horizontal/vertical neighbouring blocks after all. 
> > 
> > one way (there surely are a thousand others)
> > 
> > get_mask(int delta)
> >     for()
> >         for()
> >             v0= motion_val[x+y*stride]
> >             v1= motion_val[x+y*stride+delta]
> >             if(FFABS(v0[0]-v1[0])>3 || FFABS(v0[1]-v1[1])>3)
> >                 mask |= 1<<(2*x+8*y);
> >     return mask
> > 
> > hmask= get_mask(1     );
> > vmask= get_mask(stride);
> > if(!mb_x)
> >     hmask &= 0x...
> > if(first_slice_line)
> >     vmask &= 0x...
> > mask = hmask | (hmask<<1) | vmask | (vmask<<4);
> > 
> > besides, the way mask bits are combined looks strange/wrong
> 
> Per my understanding it sets edges for 2x2 groups of 4x4 subblocks. 
>  
> > >  
> > > > > +    return mvmask;
> > > > > +}
> > > > > +
> > > > > +static void rv40_loop_filter(RV34DecContext *r)
> > > > > +{
> > > > > +    MpegEncContext *s = &r->s;
> > > > > +    int mb_pos;
> > > > > +    int i, j;
> > > > > +    uint8_t *Y, *C;
> > > > > +    int alpha, beta, betaY, betaC;
> > > > > +    int q;
> > > > > +    // 0 - cur block, 1 - top, 2 - left, 3 - bottom
> > > > > +    int btype[4], clip[4], mvmasks[4], cbps[4], uvcbps[4][2];
> > > > > +
> > > > 
> > > > > +    if(s->pict_type == FF_B_TYPE)
> > > > > +        return;
> > > > 
> > > > why is this even called for b frames?
> > > 
> > > Because the spec says so :)
> > > RV40 has many special cases for B-frame loop filter which
> > > I didn't care to implement.
> > 
> > :/
> > i hope it cannot use B frames as reference?
> 
> Looks like it does not 
>  
> > [...]
> > > [lots of loop filter invoking] 
> > > > 
> > > > the word mess is probably the best way to describe this
> > > > as far as i can tell you are packing all the bits related to deblocking
> > > > and then later duplicate code each with hardcoded masks to extract them
> > > > again.
> > > 
> > > We have a saying here, "To make a candy from crap", which I think describes
> > > the current situation. I'd like to shoot the group of men who proposed the loop
> > > filter in the form RV40 has it.
> > 
> > there aren't many codecs around that are cleanly designed ...
> > Some things here and there are ok, but terrible messes like this are more
> > common.
> > We don't have too much of a choice; to support things, the mess has to be
> > implemented. If it can be done cleaner/simpler, that's a big advantage in the
> > long term: easier to maintain, understand, optimize; smaller and faster, ...
> 
> Also I think that forcing someone to understand it counts as
> a psychological abuse and the sentence on it should be
> debugging X8 frames or implementing interlaced mode in VC-1
> (sorry, can't remember more evil codecs).
>  
> > > 
> > > The problem is that edges should be filtered in that order, with clipping
> > > values selected depending on whether the neighbouring block is coded or
> > > not and whether it belongs to the same MB or not.
> > > It's possible to put all of it into a loop, but it will have too many additional
> > > conditions for my taste. I've merged some of them though.
> > 
> > I am not suggesting building a complex and ugly loop, rather something like
> > storing all the numbers that might differ in a 2d array and then
> > having a loop go over this.
> > The mb edge flags, coded info and all that would be in the array so that
> > reading it is a matter of coded[y][x], mb_edge[y][x], mb_type[y][x].
> > I think this would be cleaner.
> 
> done
>  
> > I'll review the new patch soon
> 
> here it is
> 
> > [...]
> > -- 
> > Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

> Index: libavcodec/rv40.c
> ===================================================================
> --- libavcodec/rv40.c	(revision 15305)
> +++ libavcodec/rv40.c	(working copy)
> @@ -247,7 +247,462 @@
>      return 0;
>  }
>  
> +#define CLIP_SYMM(a, b) av_clip(a, -(b), b)
>  /**
> + * weaker deblocking very similar to the one described in 4.4.2 of JVT-A003r1
> + */
> +static inline void rv40_weak_loop_filter(uint8_t *src, const int step,
> +                                         const int flag0, const int flag1,
> +                                         const int alpha,
> +                                         const int lim0, const int lim1,
> +                                         const int difflim, const int beta,
> +                                         const int S0, const int S1,
> +                                         const int S2, const int S3)
> +{
> +    uint8_t *cm = ff_cropTbl + MAX_NEG_CROP;
> +    int t, u, diff;
> +
> +    t = src[0*step] - src[-1*step];
> +    if(!t){
> +        return;
> +    }
> +    u = (alpha * FFABS(t)) >> 7;
> +    if(u > 3 - (flag0 && flag1)){
> +        return;
> +    }
> +
> +    t <<= 2;
> +    if(flag0 && flag1)
> +        t += src[-2*step] - src[1*step];
> +    diff = CLIP_SYMM((t + 4) >> 3, difflim);
> +    src[-1*step] = cm[src[-1*step] + diff];
> +    src[ 0*step] = cm[src[ 0*step] - diff];
> +    if(FFABS(S2) <= beta && flag0){
> +        t = (S0 + S2 - diff) >> 1;
> +        src[-2*step] = cm[src[-2*step] - CLIP_SYMM(t, lim1)];
> +    }
> +    if(FFABS(S3) <= beta && flag1){
> +        t = (S1 + S3 + diff) >> 1;
> +        src[ 1*step] = cm[src[ 1*step] - CLIP_SYMM(t, lim0)];
> +    }
> +}

rename flag0/1 to filter_first / filter_last or some other name that is
related to what they do!



> +
> +/**
> + * This macro is used for calculating 25*x0+26*x1+26*x2+26*x3+25*x4
> + * or 25*x0+26*x1+51*x2+26*x3

> + * @param  sub - index of the value with coefficient = 25

idx25 maybe


> + * @param last - index of the value with coefficient 25 or 51

idx25_51

but the doxy is still not sufficient to understand what the macro
does, or how the 2 index parameters behave and are used when they overlap.
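
To make the two cases concrete, here is a small sketch that writes the macro's expression as a function and spells out the weights it yields. All names are hypothetical. When last == start+4 and sub == start, the weights are 25,26,26,26,25 over five consecutive samples. When last lies inside the four-sample window and sub is another sample of that window, last ends up with weight 51 and sub with 25. Both sets sum to 128, which fits the >> 7 used by the callers.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as RV40_STRONG_FILTER in the patch, as a function
 * (name and signature are made up for this sketch). */
static int strong_sum(const uint8_t *src, int step, int start, int last, int sub)
{
    assert(step != 0);
    return 26 * (src[ start      * step] + src[(start + 1) * step]
               + src[(start + 2) * step] + src[(start + 3) * step]
               + src[ last       * step])
           - src[last * step] - src[sub * step];
}

/* Case 1: last == start + 4, sub == start -> 25,26,26,26,25 */
static int sum_25_26_26_26_25(const uint8_t *src, int step, int start)
{
    return 25 * src[ start      * step] + 26 * src[(start + 1) * step]
         + 26 * src[(start + 2) * step] + 26 * src[(start + 3) * step]
         + 25 * src[(start + 4) * step];
}

/* Case 2 as used with (start, last, sub) = (0, 2, 0) -> 25,26,51,26 */
static int sum_25_26_51_26(const uint8_t *src, int step, int start)
{
    return 25 * src[ start      * step] + 26 * src[(start + 1) * step]
         + 51 * src[(start + 2) * step] + 26 * src[(start + 3) * step];
}
```

Note that the other four-tap call in the patch, (start, last, sub) = (-4, -3, -1), gives 26,51,26,25 by the same rule, so the order in the doxy line only covers one of the two uses.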


> + */
> +#define RV40_STRONG_FILTER(src, step, start, last, sub) \
> +     26*(src[start    *step] + src[(start+1)*step]  + src[(start+2)*step] \
> +       + src[(start+3)*step] + src[last     *step]) - src[last     *step] \
> +       - src[sub      *step]
> +
> +/**
> + * Deblocking filter, the altered version from JVT-A003r1 H.26L draft.
> + */
> +static inline void rv40_adaptive_loop_filter(uint8_t *src, const int step,
> +                                             const int stride, const int dmode,
> +                                             const int lim0, const int lim1,
> +                                             const int alpha,
> +                                             const int beta, const int beta2,
> +                                             const int chroma, const int edge)
> +{
> +    int diffs[4][4];
> +    int s0 = 0, s1 = 0, s2 = 0, s3 = 0;
> +    uint8_t *ptr;
> +    int flag0 = 1, flag1 = 1;
> +    int strength0 = 3, strength1 = 3;
> +    int i;
> +    int lims;
> +
> +    for(i = 0, ptr = src; i < 4; i++, ptr += stride){
> +        diffs[i][0] = ptr[-2*step] - ptr[-1*step];
> +        diffs[i][1] = ptr[ 1*step] - ptr[ 0*step];
> +        s0 += diffs[i][0];
> +        s1 += diffs[i][1];
> +    }
> +    if(FFABS(s0) >= (beta<<2)){
> +        strength0 = 1;
> +    }
> +    if(FFABS(s1) >= (beta<<2)){
> +        strength1 = 1;
> +    }
> +    if(strength0 + strength1 <= 2){
> +        return;
> +    }
> +
> +    for(i = 0, ptr = src; i < 4; i++, ptr += stride){
> +        diffs[i][2] = ptr[-2*step] - ptr[-3*step];
> +        diffs[i][3] = ptr[ 1*step] - ptr[ 2*step];
> +        s2 += diffs[i][2];
> +        s3 += diffs[i][3];
> +    }
> +
> +    if(!edge)
> +        flag0 = flag1 = 0;
> +    else{
> +        flag0 = (strength0 == 3) && (FFABS(s2) < beta2);
> +        flag1 = (strength1 == 3) && (FFABS(s3) < beta2);
> +    }
> +
> +    lims = (lim0 + lim1 + strength0 + strength1) >> 1;
> +    if(flag0 && flag1){ /* strong filtering */
> +        for(i = 0; i < 4; i++, src += stride){
> +            int diff[2], sflag, p0, p1;
> +            int t = src[0*step] - src[-1*step];
> +
> +            if(!t) continue;
> +            sflag = (alpha * FFABS(t)) >> 7;
> +            if(sflag > 1) continue;
> +
> +            p0 = (RV40_STRONG_FILTER(src, step, -3, 1, -3) + rv40_dither_l[dmode + i]) >> 7;
> +            p1 = (RV40_STRONG_FILTER(src, step, -2, 2, -2) + rv40_dither_r[dmode + i]) >> 7;
> +            diff[0] = src[-1*step];
> +            diff[1] = src[ 0*step];
> +            src[-1*step] = sflag ? av_clip(p0, src[-1*step] - lims, src[-1*step] + lims) : p0;
> +            src[ 0*step] = sflag ? av_clip(p1, src[ 0*step] - lims, src[ 0*step] + lims) : p1;
> +            diff[0] -= src[-1*step];
> +            diff[1] -= src[ 0*step];
> +            p0 = (RV40_STRONG_FILTER(src, step, -4, 0, -4) + rv40_dither_l[dmode + i] + diff[1]*25) >> 7;
> +            p1 = (RV40_STRONG_FILTER(src, step, -1, 3, -1) + rv40_dither_r[dmode + i] + diff[0]*25) >> 7;
> +            src[-2*step] = sflag ? av_clip(p0, src[-2*step] - lims, src[-2*step] + lims) : p0;
> +            src[ 1*step] = sflag ? av_clip(p1, src[ 1*step] - lims, src[ 1*step] + lims) : p1;
> +            if(!chroma){
> +                src[-3*step] = (RV40_STRONG_FILTER(src, step, -4, -3, -1) + 64) >> 7;
> +                src[ 2*step] = (RV40_STRONG_FILTER(src, step,  0,  2,  0) + 64) >> 7;
> +            }
> +        }
> +    }else if(strength0 == 3 && strength1 == 3){
> +        for(i = 0; i < 4; i++, src += stride)
> +            rv40_weak_loop_filter(src, step, 1, 1, alpha, lim0, lim1, lims, beta,
> +                                  diffs[i][0], diffs[i][1], diffs[i][2], diffs[i][3]);
> +    }else{
> +        for(i = 0; i < 4; i++, src += stride)
> +            rv40_weak_loop_filter(src, step, strength0==3, strength1==3,
> +                                  alpha, lim0>>1, lim1>>1, lims>>1, beta,
> +                                  diffs[i][0], diffs[i][1], diffs[i][2], diffs[i][3]);
> +    }
> +}
> +
> +static void rv40_v_loop_filter(uint8_t *src, int stride, int dmode, int lim0, int lim1,
> +                               int alpha, int beta, int beta2, int chroma, int edge){
> +    rv40_adaptive_loop_filter(src, 1, stride, dmode, lim0, lim1, alpha, beta, beta2, chroma, edge);
> +}
> +static void rv40_h_loop_filter(uint8_t *src, int stride, int dmode, int lim0, int lim1,
> +                               int alpha, int beta, int beta2, int chroma, int edge){
> +    rv40_adaptive_loop_filter(src, stride, 1, dmode, lim0, lim1, alpha, beta, beta2, chroma, edge);
> +}
> +

> +static int check_mv(int16_t (*motion_val)[2], int step)
> +{
> +    int d;
> +    d = motion_val[0][0] - motion_val[-step][0];
> +    if(d < -3 || d > 3)
> +        return 1;
> +    d = motion_val[0][1] - motion_val[-step][1];
> +    if(d < -3 || d > 3)
> +        return 1;
> +    return 0;        
> +}

the name check_mv() is too generic


> +
> +static int rv40_set_deblock_coef(RV34DecContext *r)
> +{
> +    MpegEncContext *s = &r->s;
> +    int mvmask = 0, i, j, dx, dy;
> +    int midx = s->mb_x * 2 + s->mb_y * 2 * s->b8_stride;
> +    int16_t (*motion_val)[2] = s->current_picture_ptr->motion_val[0][midx];
> +    if(s->pict_type == FF_I_TYPE)
> +        return 0;
> +    for(j = 0; j < 2; j++){
> +        for(i = 0; i < 2; i++){
> +            if(i || s->mb_x){
> +                if(check_mv(motion_val, 1)){
> +                    mvmask |= 0x11 << (i*2 + j*8);
> +                }
> +            }
> +            if(j || !s->first_slice_line){
> +                if(check_mv(motion_val, s->b8_stride)){
> +                    mvmask |= 0x03 << (i*2 + j*8);
> +                }
> +            }
> +        }
> +        motion_val += s->b8_stride;
> +    }
> +    return mvmask;
> +}

this is still doing the s->mb_x and first_slice_line checks in the inner loop
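
A minimal sketch of how the edge checks could move out of the loops, along the lines of the get_mask() idea quoted above. Assumptions: the motion vector plane has at least one block of padding to the left of and above the macroblock so the unconditional reads stay in bounds, the neighbour is read at a negative offset as in check_mv(), and the bits are combined the way the patch does it (0x11 for a left difference, 0x03 for a top difference). The function names are invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * mv points at the top-left 8x8 block of the current macroblock.
 * delta selects the neighbour: 1 = left, stride = above.
 * Bit 2*x + 8*y is set when either vector component differs by more than 3.
 */
static int mv_diff_mask(int16_t (*mv)[2], int stride, int delta)
{
    int x, y, mask = 0;

    assert(stride > 0);
    for (y = 0; y < 2; y++)
        for (x = 0; x < 2; x++) {
            const int16_t *v0 = mv[x + y * stride];
            const int16_t *v1 = mv[x + y * stride - delta];
            if (abs(v0[0] - v1[0]) > 3 || abs(v0[1] - v1[1]) > 3)
                mask |= 1 << (2 * x + 8 * y);
        }
    return mask;
}

static int mvmask_for_mb(int16_t (*mv)[2], int stride,
                         int first_col, int first_slice_line)
{
    int hmask = mv_diff_mask(mv, stride, 1);
    int vmask = mv_diff_mask(mv, stride, stride);

    /* border handling is done once here, not inside the loops */
    if (first_col)
        hmask &= ~0x0101;  /* x == 0 blocks: bits 0 and 8 */
    if (first_slice_line)
        vmask &= ~0x0005;  /* y == 0 blocks: bits 0 and 2 */

    /* same bit patterns as the patch: 0x11 << n and 0x03 << n */
    return hmask | (hmask << 4) | vmask | (vmask << 1);
}
```

Whether the reads at the picture border are actually safe depends on how motion_val is allocated, which this sketch does not check.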


> +
> +/** This structure holds conditions on applying loop filter to some edge */
> +typedef struct RV40LoopFilterCond{
> +    int x;              ///< x coordinate of edge start
> +    int y;              ///< y coordinate of edge start

> +    int dir;            ///< edge filtering direction (horizontal or vertical)

and what value does dir have for each?


> +    int filt_mask;      ///< mask specifying what deblock pattern bit should be tested for filtering
> +    int edge_mbtype;    ///< edge condition testing - number of neighbouring mbtype or -1
> +    int nonedge_mbtype; ///< not at edge condition testing - number of neighbouring mbtype or -1
> +    int next_clip_mask; ///< mask specifying bit to test to select neighbour block clip value
> +    int dither;         ///< dither parameter for the current loop filtering
> +}RV40LoopFilterCond;
> +
> +#define RV40_LUMA_LOOP_FIRST 13
> +static const RV40LoopFilterCond rv40_loop_cond_luma_first_row[RV40_LUMA_LOOP_FIRST] = {
> +    {  0,  4, 0, 0x0010, -1, -1, 0x0001,  0 }, // subblock 0
> +    {  0,  0, 1, 0x0001, -1,  2, 0x0008,  0 },
> +    {  0,  0, 0, 0x0001,  1, -1, 0x1000,  0 },
> +    {  0,  0, 1, 0x0001,  2, -1, 0x0008,  0 },
> +    {  4,  4, 0, 0x0020, -1, -1, 0x0002,  4 }, // subblocks 1-3
> +    {  4,  0, 1, 0x0002, -1, -1, 0x0001,  4 },
> +    {  4,  0, 0, 0x0002,  1, -1, 0x2000,  4 },
> +    {  8,  4, 0, 0x0040, -1, -1, 0x0004,  8 },
> +    {  8,  0, 1, 0x0004, -1, -1, 0x0002,  8 },
> +    {  8,  0, 0, 0x0004,  1, -1, 0x4000,  8 },
> +    { 12,  4, 0, 0x0080, -1, -1, 0x0008, 12 },
> +    { 12,  0, 1, 0x0008, -1, -1, 0x0004, 12 },
> +    { 12,  0, 0, 0x0008,  1, -1, 0x8000, 12 }
> +};
> +
> +#define RV40_LUMA_LOOP_NEXT 9
> +static const RV40LoopFilterCond rv40_loop_cond_luma_next_rows[RV40_LUMA_LOOP_NEXT] = {
> +    {  0,  4, 0, 0x0010, -1, -1, 0x0001, 0 }, // first subblock of the row
> +    {  0,  0, 1, 0x0001,  2, -1, 0x0008, 0 },
> +    {  0,  0, 1, 0x0001, -1,  2, 0x0008, 0 },
> +    {  4,  4, 0, 0x0020, -1, -1, 0x0002, 1 }, // the rest of subblocks
> +    {  4,  0, 1, 0x0002, -1, -1, 0x0001, 1 },
> +    {  8,  4, 0, 0x0040, -1, -1, 0x0004, 2 },
> +    {  8,  0, 1, 0x0004, -1, -1, 0x0002, 2 },
> +    { 12,  4, 0, 0x0080, -1, -1, 0x0008, 3 },
> +    { 12,  0, 1, 0x0008, -1, -1, 0x0004, 3 }
> +};
> +
> +#define RV40_CHROMA_LOOP 12
> +static const RV40LoopFilterCond rv40_loop_cond_chroma[RV40_CHROMA_LOOP] = {
> +    { 0, 4, 0, 0x04, -1, -1, 0x01, 0 }, // subblock 0
> +    { 0, 0, 1, 0x01, -1,  2, 0x02, 0 },
> +    { 0, 0, 0, 0x01,  1, -1, 0x04, 0 },
> +    { 0, 0, 1, 0x01,  2, -1, 0x02, 0 },
> +    { 4, 4, 0, 0x08, -1, -1, 0x02, 8 }, // subblock 1
> +    { 4, 4, 1, 0x02, -1, -1, 0x01, 0 },
> +    { 4, 4, 0, 0x02,  1, -1, 0x08, 8 },
> +    { 0, 8, 0, 0x10, -1, -1, 0x04, 0 }, // subblock 2
> +    { 0, 4, 1, 0x04, -1,  2, 0x08, 8 },
> +    { 0, 4, 1, 0x04,  2, -1, 0x08, 8 },
> +    { 4, 8, 0, 0x20, -1, -1, 0x08, 8 }, // subblock 3
> +    { 4, 4, 1, 0x08, -1, -1, 0x04, 8 },
> +};
> +
> +static void rv40_loop_filter(RV34DecContext *r)
> +{
> +    MpegEncContext *s = &r->s;
> +    int mb_pos;
> +    int i, j, k;
> +    uint8_t *Y, *C;
> +    int alpha, beta, betaY, betaC;
> +    int q;
> +    // 0 - cur block, 1 - top, 2 - left, 3 - bottom
> +    int mbtype[4], clip[4], mvmasks[4], cbp[4], uvcbp[4][2];
> +
> +    if(s->pict_type == FF_B_TYPE)
> +        return;
> +
> +    for(s->mb_y = 0; s->mb_y < s->mb_height; s->mb_y++){
> +        mb_pos = s->mb_y * s->mb_stride;
> +        for(s->mb_x = 0; s->mb_x < s->mb_width; s->mb_x++, mb_pos++){
> +            int btype = s->current_picture_ptr->mb_type[mb_pos];
> +            if(IS_INTRA(btype) || IS_SEPARATE_DC(btype)){
> +                r->cbp_luma  [mb_pos] = 0xFFFF;
> +            }
> +            if(IS_INTRA(btype)){
> +                r->cbp_chroma[mb_pos] = 0xFF;
> +            }
> +        }
> +    }
> +    for(s->mb_y = 0; s->mb_y < s->mb_height; s->mb_y++){
> +        mb_pos = s->mb_y * s->mb_stride;
> +        for(s->mb_x = 0; s->mb_x < s->mb_width; s->mb_x++, mb_pos++){
> +            int y_h_deblock, y_v_deblock;
> +            int c_v_deblock[2], c_h_deblock[2];
> +
> +            ff_init_block_index(s);
> +            ff_update_block_index(s);
> +            Y = s->dest[0];
> +            q = s->current_picture_ptr->qscale_table[mb_pos];
> +            alpha = rv40_alpha_tab[q];
> +            beta  = rv40_beta_tab [q];
> +            betaY = betaC = beta * 3;
> +            if(s->width * s->height <= 0x6300){
> +                betaY += beta;
> +            }
> +
> +            mvmasks[0] = r->deblock_coefs[mb_pos];
> +            mbtype [0] = s->current_picture_ptr->mb_type[mb_pos];
> +            cbp    [0] = r->cbp_luma[mb_pos];
> +            uvcbp[0][0] = r->cbp_chroma[mb_pos] & 0xF;
> +            uvcbp[0][1] = r->cbp_chroma[mb_pos] >> 4;
> +            for(i = 1; i < 4; i++){
> +                mvmasks[i] = 0;
> +                mbtype [i] = mbtype[0];
> +                cbp    [i] = 0;
> +                uvcbp[1][0] = uvcbp[1][1] = 0;
> +            }
> +            if(s->mb_y){
> +                mvmasks[1] = r->deblock_coefs[mb_pos - s->mb_stride] & 0xF000;
> +                mbtype [1] = s->current_picture_ptr->mb_type[mb_pos - s->mb_stride];
> +                cbp    [1] = r->cbp_luma[mb_pos - s->mb_stride] & 0xF000;
> +                uvcbp[1][0] =  r->cbp_chroma[mb_pos - s->mb_stride]       & 0xC;
> +                uvcbp[1][1] = (r->cbp_chroma[mb_pos - s->mb_stride] >> 4) & 0xC;
> +            }
> +            if(s->mb_x){
> +                mvmasks[2] = r->deblock_coefs[mb_pos - 1] & 0x8888;
> +                mbtype [2] = s->current_picture_ptr->mb_type[mb_pos - 1];
> +                cbp    [2] = r->cbp_luma[mb_pos - 1] & 0x8888;
> +                uvcbp[2][0] =  r->cbp_chroma[mb_pos - 1]       & 0xA;
> +                uvcbp[2][1] = (r->cbp_chroma[mb_pos - 1] >> 4) & 0xA;
> +            }
> +            if(s->mb_y < s->mb_height - 1){
> +                mvmasks[3] = r->deblock_coefs[mb_pos + s->mb_stride] & 0x000F;
> +                mbtype [3] = s->current_picture_ptr->mb_type[mb_pos + s->mb_stride];
> +                cbp    [3] = r->cbp_luma[mb_pos + s->mb_stride] & 0x000F;
> +                uvcbp[3][0] =  r->cbp_chroma[mb_pos + s->mb_stride]       & 0x3;
> +                uvcbp[3][1] = (r->cbp_chroma[mb_pos + s->mb_stride] >> 4) & 0x3;
> +            }
> +            for(i = 0; i < 4; i++){
> +                mbtype[i] = (IS_INTRA(mbtype[i]) || IS_SEPARATE_DC(mbtype[i])) ? 2 : 1;
> +                clip[i] = rv40_filter_clip_tbl[mbtype[i]][q];
> +            }
> +            y_h_deblock = cbp[0] | ((cbp[0] << 4) & ~0x000F) | (cbp[1] >> 12)
> +                        | ((cbp[3] << 20) & ~0x000F) | (cbp[3] << 16)
> +                        | mvmasks[0] | (mvmasks[3] << 16);
> +            y_v_deblock = ((cbp[0] << 1) & ~0x1111) | (cbp[2] >> 3)
> +                        | cbp[0] | (cbp[3] << 16)
> +                        | mvmasks[0] | (mvmasks[3] << 16);
> +            if(!s->mb_x){
> +                y_v_deblock &= ~0x1111;
> +            }
> +            if(!s->mb_y){
> +                y_h_deblock &= ~0x000F;
> +            }
> +            if(s->mb_y == s->mb_height - 1 || (mbtype[0] == 2 || mbtype[3] == 2)){
> +                y_h_deblock &= ~0xF0000;
> +            }
> +            cbp[0] = cbp[0] | (cbp[3] << 16)
> +                   | mvmasks[0] | (mvmasks[3] << 16);
> +            for(i = 0; i < 2; i++){
> +                c_v_deblock[i] = ((uvcbp[0][i] << 1) & ~0x5) | (uvcbp[2][i] >> 1)
> +                               | (uvcbp[3][i] << 4) | uvcbp[0][i];
> +                c_h_deblock[i] = (uvcbp[3][i] << 4) | uvcbp[0][i] | (uvcbp[1][i] >> 2)
> +                               | (uvcbp[3][i] << 6) | (uvcbp[0][i] << 2);
> +                uvcbp[0][i] = (uvcbp[3][i] << 4) | uvcbp[0][i];
> +                if(!s->mb_x){
> +                    c_v_deblock[i] &= ~0x5;
> +                }
> +                if(!s->mb_y){
> +                    c_h_deblock[i] &= ~0x3;
> +                }
> +                if(s->mb_y == s->mb_height - 1 || mbtype[0] == 2 || mbtype[3] == 2){
> +                    c_h_deblock[i] &= ~0x30;
> +                }
> +            }
> +
> +            for(j = 0; j < RV40_LUMA_LOOP_FIRST; j++){
> +                RV40LoopFilterCond *loop = rv40_loop_cond_luma_first_row + j;
> +                int cond, edgecond = 1, nonedgecond = 1, clip_cur, clip_next;
> +                Y = s->dest[0] + loop->x + loop->y * s->linesize;
> +                cond = (loop->dir ? y_v_deblock : y_h_deblock) & loop->filt_mask;
> +                if(loop->edge_mbtype != -1){
> +                    edgecond = (mbtype[0] == 2 || mbtype[loop->edge_mbtype] == 2);
> +                }
> +                if(loop->nonedge_mbtype != -1){
> +                    nonedgecond = !(mbtype[0] == 2 || mbtype[loop->nonedge_mbtype] == 2);
> +                }
> +                clip_cur = cbp[0] & loop->filt_mask ? clip[0] : 0;
> +                if(!loop->x && loop->dir){
> +                    clip_next = (cbp[2] | mvmasks[2]) & loop->next_clip_mask ? clip[2] : 0;
> +                }else if(!loop->y && !loop->dir){
> +                    clip_next = (cbp[1] | mvmasks[1]) & loop->next_clip_mask ? clip[1] : 0;
> +                }else{
> +                    clip_next = cbp[0] & loop->next_clip_mask ? clip[0] : 0;
> +                }
> +                if(cond && edgecond && nonedgecond){
> +                    if(loop->dir){
> +                        rv40_v_loop_filter(Y, s->linesize, loop->dither,
> +                                           clip_cur, clip_next,
> +                                           alpha, beta, betaY, 0, loop->edge_mbtype != -1);
> +                    }else{
> +                        rv40_h_loop_filter(Y, s->linesize, loop->dither,
> +                                           clip_cur, clip_next,
> +                                           alpha, beta, betaY, 0, loop->edge_mbtype != -1);
> +                    }
> +                }
> +            }
> +            for(j = 4; j < 12; j++){
> +                for(k = 0; k < RV40_LUMA_LOOP_NEXT; k++){
> +                    RV40LoopFilterCond *loop = rv40_loop_cond_luma_next_rows + k;
> +                    int cond, edgecond = 1, nonedgecond = 1, clip_cur, clip_next;
> +                    Y = s->dest[0] + loop->x + (loop->y + j) * s->linesize;
> +                    cond = (loop->dir ? y_v_deblock : y_h_deblock) & (loop->filt_mask << j);
> +                    if(loop->edge_mbtype != -1){
> +                        edgecond = (mbtype[0] == 2 || mbtype[loop->edge_mbtype] == 2);
> +                    }
> +                    if(loop->nonedge_mbtype != -1){
> +                        nonedgecond = !(mbtype[0] == 2 || mbtype[loop->nonedge_mbtype] == 2);
> +                    }
> +                    clip_cur = cbp[0] & (loop->filt_mask << j) ? clip[0] : 0;
> +                    if(!loop->x && loop->dir){
> +                        clip_next = (cbp[2] | mvmasks[2]) & (loop->next_clip_mask << j) ? clip[2] : 0;
> +                    }else{
> +                        clip_next = cbp[0] & (loop->next_clip_mask << j) ? clip[0] : 0;
> +                    }
> +                    if(cond && edgecond && nonedgecond){
> +                        if(loop->dir){
> +                            rv40_v_loop_filter(Y, s->linesize, loop->dither + j,
> +                                               clip_cur, clip_next,
> +                                               alpha, beta, betaY, 0, loop->edge_mbtype != -1);
> +                        }else{
> +                            rv40_h_loop_filter(Y, s->linesize, loop->dither + j,
> +                                               clip_cur, clip_next,
> +                                               alpha, beta, betaY, 0, loop->edge_mbtype != -1);
> +                        }
> +                    }
> +                }
> +            }
> +            for(i = 0; i < 2; i++){
> +                for(j = 0; j < RV40_CHROMA_LOOP; j++){
> +                    RV40LoopFilterCond *loop = rv40_loop_cond_chroma + j;
> +                    int cond, edgecond = 1, nonedgecond = 1, clip_cur, clip_next;
> +                    C = s->dest[i+1] + loop->x + loop->y * s->uvlinesize;
> +                    cond = (loop->dir ? c_v_deblock[i] : c_h_deblock[i]) & loop->filt_mask;
> +                    if(loop->edge_mbtype != -1){
> +                        edgecond = (mbtype[0] == 2 || mbtype[loop->edge_mbtype] == 2);
> +                    }
> +                    if(loop->nonedge_mbtype != -1){
> +                        nonedgecond = !(mbtype[0] == 2 || mbtype[loop->edge_mbtype] == 2);
> +                    }
> +                    clip_cur = uvcbp[0][i] & loop->filt_mask ? clip[0] : 0;
> +                    if(!loop->x && loop->dir){
> +                        clip_next = uvcbp[2][i] & loop->next_clip_mask ? clip[2] : 0;
> +                    }else if(!loop->y && !loop->dir){
> +                        clip_next = uvcbp[1][i] & loop->next_clip_mask ? clip[1] : 0;
> +                    }else{
> +                        clip_next = uvcbp[0][i] & loop->next_clip_mask ? clip[0] : 0;
> +                    }
> +                    if(cond && edgecond && nonedgecond){
> +                        if(loop->dir){
> +                            rv40_v_loop_filter(C, s->uvlinesize, loop->dither,
> +                                               clip_cur, clip_next,
> +                                               alpha, beta, betaC, 1, loop->edge_mbtype != -1);
> +                        }else{
> +                            rv40_h_loop_filter(C, s->uvlinesize, loop->dither,
> +                                               clip_cur, clip_next,
> +                                               alpha, beta, betaC, 1, loop->edge_mbtype != -1);
> +                        }
> +                    }
> +                }
> +            }
> +        }
> +    }
> +}

I will not accept this mess, sorry.
If you don't (or can't) clean this up I will try eventually, but that might not
be soon.
As is, this is too much of a mess, and I am unwilling to believe that the H.264
drafts required such a mess.
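
A minimal sketch of the 2d-array idea from earlier in the thread, limited to the luma edges between horizontally adjacent 4x4 blocks and to the coded-block terms only (no motion vector masks, no clip selection). The 5x5 layout and the function name are assumptions made for illustration, not something taken from the patch.

```c
#include <assert.h>

/*
 * coded[y][x], with x and y in 0..4 (hypothetical layout):
 *   column 0        = rightmost 4x4 blocks of the macroblock to the left,
 *   columns/rows 1-4 = 4x4 luma blocks of the current macroblock.
 * Row 0 would hold the macroblock above; it is not used by this function.
 *
 * Returns one bit per 4x4 block (bit x + 4*y, zero based) saying whether
 * the edge on the left side of that block should be filtered.
 */
static int left_edges_to_filter(unsigned char coded[5][5], int have_left_mb)
{
    int x, y, mask = 0;

    assert(coded);
    for (y = 1; y <= 4; y++)
        for (x = 1; x <= 4; x++)
            if (coded[y][x] || coded[y][x - 1])
                mask |= 1 << ((x - 1) + 4 * (y - 1));

    /* picture border handled once, outside the loops */
    if (!have_left_mb)
        mask &= ~0x1111;
    return mask;
}
```

As far as I can tell this yields the same bits as the cbp terms in the low 16 bits of y_v_deblock, but reading coded[y][x - 1] makes the "either side is coded" rule visible without hardcoded shift and mask pairs.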

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In a rich man's house there is no place to spit but his face.
-- Diogenes of Sinope