[Ffmpeg-devel] H.264 encoder

Michael Niedermayer michaelni
Wed Oct 4 13:11:07 CEST 2006


Hi

On Wed, Oct 04, 2006 at 02:29:18AM +0200, Panagiotis Issaris wrote:
> Hi,
> 
> Michael Niedermayer wrote:
> >[...]
> 
> Attached to these e-mails are patches providing a basic H.264 encoder. The first
> patch contains the basic encoder, the second adds MMX optimisations. This to
> make the basic patch smaller and thus ease reviewing.
[...]
> >
> > [...]
> >
> >
> >>>> code duplication, ive not checked too carefully but loop filter & motion
> >>>> compensation at least appear to be duplicated from the h264 decoder
> >>>>
> >>>
> >>>
> >> We didn't find a way to easily reuse the loop filter from the decoder, which
> >> is why we wrote the routine ourselves. The motion compensation code in
> >>
> >
> > hmm, where is the problem?

ping ...

anyway if theres significant code duplication then the patch wont be accepted


> >
> >
> >
> >> libavcodec depends on the MpegEncoderContext, which we're not using. For
> >> this
> >> reason, we wrote a simple motion compensation scheme ourselves, using the
> >> optimized DspContext functions.
> >>
> >
> > hmm, i dont remember any h264 MC functions which depend on MpegEncContext

ping2 ...

same as above, use existing code or explain why you cant or patch rejected


> >
> >
> > [...]
> >
> >> I'm pretty sure our codec is simpler then x264, if only for the simple
> >> reason
> >> that it is _much_ smaller in codesize and number of features as we only
> >> started working on it a few months ago  :)
> >>
> >
> > well, is that true even when its compared against old x264 versions?
> >
> I would hope so :-) I can't know for sure, as I do not dare reading the x264
> code as it is GPL licensed and we want our code to be LGPL licensed.  I'd think
> reading code of other H.264 encoders might cause me to use similar constructs,
> which might not be liked by f.e. the x264 authors.

not looking at others code will lead only to one thing, lower quality of your
encoder
also for simplicity comparission simply looking at the object and source code
sizes will do just fine, no need to read the code


> 
> >> What specifically needs to be done now to get the patch integrated in
> >> libavcodec?
> >>
> >
> > alot, the code looks somewhat messy, more details are below,
> > also note that if there where many similar issues i generally
> > mentioned just one but for acceptance all need to be fixed
> >
> I hope it looks a bit less messy now.

yes, but theres still some work left to do before it can be applied
for example all the issues i mentioned which you "ignored", below
is some quick and incomplete review, ill do a more complete one
when you dealt with all issues either by fixing or explaining why
you cant fix ...


[...]
> >> +
> >> +// we'll always work with transposed input blocks, to avoid having to make a distinction between
> >> +// C and mmx implementations
> >> +void ff_h264_transform_dct_quant_c(int16_t block[4][4], int QP, int dontscaleDC) // y,x indexing
> >> +{
> >> +    static const int16_t MF[6][4][4] =
> >> +    {
> >> +        { { 13107, 8066, 13107, 8066}, {  8066, 5243,  8066, 5243}, { 13107, 8066, 13107, 8066}, {  8066, 5243,  8066, 5243} },
> >> +        { { 11916, 7490, 11916, 7490}, {  7490, 4660,  7490, 4660}, { 11916, 7490, 11916, 7490}, {  7490, 4660,  7490, 4660} },
> >> +        { { 10082, 6554, 10082, 6554}, {  6554, 4194,  6554, 4194}, { 10082, 6554, 10082, 6554}, {  6554, 4194,  6554, 4194} },
> >> +        { {  9362, 5825,  9362, 5825}, {  5825, 3647,  5825, 3647}, {  9362, 5825,  9362, 5825}, {  5825, 3647,  5825, 3647} },
> >> +        { {  8192, 5243,  8192, 5243}, {  5243, 3355,  5243, 3355}, {  8192, 5243,  8192, 5243}, {  5243, 3355,  5243, 3355} },
> >> +        { {  7282, 4559,  7282, 4559}, {  4559, 2893,  4559, 2893}, {  7282, 4559,  7282, 4559}, {  4559, 2893,  4559, 2893} }
> >> +    };
> >> +    int32_t qbits = 15 + QP/6;
> >> +    int32_t f = (1<<qbits)/3;
> >> +    int mod = QP%6;
> >>
> >
> > maybe the unused quantize_c() from h264.c could be used or merged?
> >
> >
> > [...]
> >
> >
> >> +void ff_h264_transform_inverse_quant_dct_add_c(int16_t block[4][4], int QP, int dontscaleDC, uint8_t *dst, int stride) // y,x indexing
> >> +{
> >> +    static const int16_t V[6][4][4] =
> >> +    {
> >> +        { { 10*16, 13*16, 10*16, 13*16}, { 13*16, 16*16, 13*16, 16*16}, { 10*16, 13*16, 10*16, 13*16}, { 13*16, 16*16, 13*16, 16*16} },
> >> +        { { 11*16, 14*16, 11*16, 14*16}, { 14*16, 18*16, 14*16, 18*16}, { 11*16, 14*16, 11*16, 14*16}, { 14*16, 18*16, 14*16, 18*16} },
> >> +        { { 13*16, 16*16, 13*16, 16*16}, { 16*16, 20*16, 16*16, 20*16}, { 13*16, 16*16, 13*16, 16*16}, { 16*16, 20*16, 16*16, 20*16} },
> >> +        { { 14*16, 18*16, 14*16, 18*16}, { 18*16, 23*16, 18*16, 23*16}, { 14*16, 18*16, 14*16, 18*16}, { 18*16, 23*16, 18*16, 23*16} },
> >> +        { { 16*16, 20*16, 16*16, 20*16}, { 20*16, 25*16, 20*16, 25*16}, { 16*16, 20*16, 16*16, 20*16}, { 20*16, 25*16, 20*16, 25*16} },
> >> +        { { 18*16, 23*16, 18*16, 23*16}, { 23*16, 29*16, 23*16, 29*16}, { 18*16, 23*16, 18*16, 23*16}, { 23*16, 29*16, 23*16, 29*16} }
> >> +    };
> >> +    DCTELEM elem[4][4];
> >> +    int mod = QP%6;
> >> +
> >> +    if (QP >= 24)
> >> +    {
> >> +        int shift = QP/6-4;
> >> +
> >> +        if (dontscaleDC)
> >> +            elem[0][0] = block[0][0];
> >> +        else
> >> +            elem[0][0] = ((int32_t)block[0][0]*V[mod][0][0]) << shift;
> >> +
> >> +        elem[0][1] = ((int32_t)block[0][1]*V[mod][0][1]) << shift;
> >> +        elem[0][2] = ((int32_t)block[0][2]*V[mod][0][2]) << shift;
> >> +        elem[0][3] = ((int32_t)block[0][3]*V[mod][0][3]) << shift;
> >> +
> >> +        elem[1][0] = ((int32_t)block[1][0]*V[mod][1][0]) << shift;
> >> +        elem[1][1] = ((int32_t)block[1][1]*V[mod][1][1]) << shift;
> >> +        elem[1][2] = ((int32_t)block[1][2]*V[mod][1][2]) << shift;
> >> +        elem[1][3] = ((int32_t)block[1][3]*V[mod][1][3]) << shift;
> >> +
> >> +        elem[2][0] = ((int32_t)block[2][0]*V[mod][2][0]) << shift;
> >> +        elem[2][1] = ((int32_t)block[2][1]*V[mod][2][1]) << shift;
> >> +        elem[2][2] = ((int32_t)block[2][2]*V[mod][2][2]) << shift;
> >> +        elem[2][3] = ((int32_t)block[2][3]*V[mod][2][3]) << shift;
> >> +
> >> +        elem[3][0] = ((int32_t)block[3][0]*V[mod][3][0]) << shift;
> >> +        elem[3][1] = ((int32_t)block[3][1]*V[mod][3][1]) << shift;
> >> +        elem[3][2] = ((int32_t)block[3][2]*V[mod][3][2]) << shift;
> >> +        elem[3][3] = ((int32_t)block[3][3]*V[mod][3][3]) << shift;
> >>
> >
> > maybe the init_dequant* stuff from h264.c could be used?
> >
[...]
> > and cliping is bad, VERY bad, you should change the mb type, qp or whatever
> > if you cannot encode a coefficient, of coarse if you encode several MB types
> > and choose the one which has the best rate+distortion at the end then cliping
> > isnt an issue
> 
> 
[...]
> 
> > [...]
> >
> >
> >> +static inline int h264cavlc_get_lookup_table(int na, int nb)
> >> +{
> >> +    int nc = 0;
> >> +    int8_t lookup_table[8] = {0, 0, 1, 1, 2, 2, 2, 2};
> >> +
> >> +    if (na >= 0 && nb >= 0)
> >> +    {
> >> +        nc = na+nb+1;
> >> +        nc >>= 1;
> >> +    }
> >> +    else
> >> +    {
> >> +        if (na >= 0) // nB < 0
> >> +            nc = na;
> >> +        else if (nb >= 0) // nA < 0
> >> +            nc = nb;
> >> +    }
> >> +
> >> +    return (nc < 8) ? lookup_table[nc] : 3;
> >> +}
> >>
> >
> > pred_non_zero_count() could be used for the first part
> >
> >
> > [...]
> >
[...]
> 
> > [...]
> >
> >
> >> +    for (i = 0 ; i < rbsplen ; i++)
> >> +    {
> >> +        if (i + 2 < rbsplen && (rbsp[i] == 0 && rbsp[i+1] == 0 && rbsp[i+2] < 4))
> >> +        {
> >> +            dest[destpos++] = rbsp[i++];
> >> +            dest[destpos++] = rbsp[i];
> >> +            dest[destpos++] = 0x03; // emulation prevention byte
> >> +        }
> >> +        else
> >> +            dest[destpos++] = rbsp[i];
> >> +    }
> >>
> >
> > instead of a loop which checks every byte, you could check just every 2nd
> > also see encode_nal() un h264.c maybe you can use some parts from that
> >
> >
> > [...]
> >
> >> +
> >> +// inblock is transposed, outblock isn't
> >> +void ff_h264_dct_c(DCTELEM inblock[4][4],DCTELEM outblock[4][4])
> >> +{
> >> +    DCTELEM pieces[4][4];
> >> +
> >> +    pieces[0][0] = inblock[0][0]+inblock[1][0]+inblock[2][0]+inblock[3][0];
> >> +    pieces[0][1] = inblock[0][1]+inblock[1][1]+inblock[2][1]+inblock[3][1];
> >> +    pieces[0][2] = inblock[0][2]+inblock[1][2]+inblock[2][2]+inblock[3][2];
> >> +    pieces[0][3] = inblock[0][3]+inblock[1][3]+inblock[2][3]+inblock[3][3];
> >> +
> >> +    pieces[1][0] = (inblock[0][0]<<1)+inblock[1][0]-inblock[2][0]-(inblock[3][0]<<1);
> >> +    pieces[1][1] = (inblock[0][1]<<1)+inblock[1][1]-inblock[2][1]-(inblock[3][1]<<1);
> >> +    pieces[1][2] = (inblock[0][2]<<1)+inblock[1][2]-inblock[2][2]-(inblock[3][2]<<1);
> >> +    pieces[1][3] = (inblock[0][3]<<1)+inblock[1][3]-inblock[2][3]-(inblock[3][3]<<1);
> >> +
> >> +    pieces[2][0] = inblock[0][0]-inblock[1][0]-inblock[2][0]+inblock[3][0];
> >> +    pieces[2][1] = inblock[0][1]-inblock[1][1]-inblock[2][1]+inblock[3][1];
> >> +    pieces[2][2] = inblock[0][2]-inblock[1][2]-inblock[2][2]+inblock[3][2];
> >> +    pieces[2][3] = inblock[0][3]-inblock[1][3]-inblock[2][3]+inblock[3][3];
> >> +
> >> +    pieces[3][0] = inblock[0][0]-(inblock[1][0]<<1)+(inblock[2][0]<<1)-inblock[3][0];
> >> +    pieces[3][1] = inblock[0][1]-(inblock[1][1]<<1)+(inblock[2][1]<<1)-inblock[3][1];
> >> +    pieces[3][2] = inblock[0][2]-(inblock[1][2]<<1)+(inblock[2][2]<<1)-inblock[3][2];
> >> +    pieces[3][3] = inblock[0][3]-(inblock[1][3]<<1)+(inblock[2][3]<<1)-inblock[3][3];
> >> +
> >> +    outblock[0][0] = pieces[0][0]+pieces[0][1]+pieces[0][2]+pieces[0][3];
> >> +    outblock[0][1] = pieces[1][0]+pieces[1][1]+pieces[1][2]+pieces[1][3];
> >> +    outblock[0][2] = pieces[2][0]+pieces[2][1]+pieces[2][2]+pieces[2][3];
> >> +    outblock[0][3] = pieces[3][0]+pieces[3][1]+pieces[3][2]+pieces[3][3];
> >> +
> >> +    outblock[1][0] = (pieces[0][0] << 1)+pieces[0][1]-pieces[0][2]-(pieces[0][3]<<1);
> >> +    outblock[1][1] = (pieces[1][0] << 1)+pieces[1][1]-pieces[1][2]-(pieces[1][3]<<1);
> >> +    outblock[1][2] = (pieces[2][0] << 1)+pieces[2][1]-pieces[2][2]-(pieces[2][3]<<1);
> >> +    outblock[1][3] = (pieces[3][0] << 1)+pieces[3][1]-pieces[3][2]-(pieces[3][3]<<1);
> >> +
> >> +    outblock[2][0] = pieces[0][0]-pieces[0][1]-pieces[0][2]+pieces[0][3];
> >> +    outblock[2][1] = pieces[1][0]-pieces[1][1]-pieces[1][2]+pieces[1][3];
> >> +    outblock[2][2] = pieces[2][0]-pieces[2][1]-pieces[2][2]+pieces[2][3];
> >> +    outblock[2][3] = pieces[3][0]-pieces[3][1]-pieces[3][2]+pieces[3][3];
> >> +
> >> +    outblock[3][0] = pieces[0][0]-(pieces[0][1]<<1)+(pieces[0][2]<<1)-pieces[0][3];
> >> +    outblock[3][1] = pieces[1][0]-(pieces[1][1]<<1)+(pieces[1][2]<<1)-pieces[1][3];
> >> +    outblock[3][2] = pieces[2][0]-(pieces[2][1]<<1)+(pieces[2][2]<<1)-pieces[2][3];
> >> +    outblock[3][3] = pieces[3][0]-(pieces[3][1]<<1)+(pieces[3][2]<<1)-pieces[3][3];
> >> +}
> >>
> >
> > h264.c h264_diff_dct_c() looks so much nicer, is this faster?
> >
> > [...]
> >
> >
[...]
> > [...]
> >
> >> +        int x,y;
> >> +
> >> +        for (y = 0 ; y < 4 ; y++)
> >> +            for (x = 0 ; x < 4 ; x++)
> >> +                mb->Y_nonzero[y][x] = 0;
> >>
> >
> > memset() ?
> >
> >
> >
> >> +    }
> >> +
> >> +    if (CodedBlockPatternChroma == 0)
> >> +    {
> >> +        int x,y;
> >> +
> >> +        for (y = 0 ; y < 2 ; y++)
> >> +        {
> >> +            for (x = 0 ; x < 2 ; x++)
> >> +            {
> >> +                mb->U_nonzero[y][x] = 0;
> >> +                mb->V_nonzero[y][x] = 0;
> >>
> >
> > again memset() should be useable here, also look at fill_rectangle() from
> > h264.c
> >
> >
> >
[...]
> 
> > [...]
> >
> >
> >
> >> +
> >> +#ifdef H264_ENABLE_QPEL
> >> +
> >> +#define H264_QPEL_6TAP(a,b,c,d,e,f) ((a)-5*(b)+20*(c)+20*(d)-5*(e)+(f))
> >> +#define H264_QPEL_AVG(a,b) (((a)+(b)+1)>>1)
> >> +
> >> +static inline void ff_h264_calc_Yqpelpixels(const uint8_t *src, int srcstride, uint8_t *dst, int dststride1,int dststride2)
> >> +{
> >> +    int b1,h1,s1,m1;
> >> +    int aa,bb,gg,hh;
> >> +    int j1,j,b,h;
> >> +    const uint8_t *pos = src - 2 - 2*srcstride;
> >> +    const uint8_t *pos2 = pos + 2;
> >> +    int s,m;
> >> +    int a,c,d,n,f,i,k,q;
> >> +    int e,g,p,r;
> >> +
> >> +    aa = H264_QPEL_6TAP((int)pos[0],(int)pos[1],(int)pos[2],(int)pos[3],(int)pos[4],(int)pos[5]);
> >> +    pos += srcstride;
> >> +    bb = H264_QPEL_6TAP((int)pos[0],(int)pos[1],(int)pos[2],(int)pos[3],(int)pos[4],(int)pos[5]);
> >> +    pos += srcstride;
> >> +    b1 = H264_QPEL_6TAP((int)pos[0],(int)pos[1],(int)pos[2],(int)pos[3],(int)pos[4],(int)pos[5]);
> >> +    pos += srcstride;
> >> +    s1 = H264_QPEL_6TAP((int)pos[0],(int)pos[1],(int)pos[2],(int)pos[3],(int)pos[4],(int)pos[5]);
> >> +    pos += srcstride;
> >> +    gg = H264_QPEL_6TAP((int)pos[0],(int)pos[1],(int)pos[2],(int)pos[3],(int)pos[4],(int)pos[5]);
> >> +    pos += srcstride;
> >> +    hh = H264_QPEL_6TAP((int)pos[0],(int)pos[1],(int)pos[2],(int)pos[3],(int)pos[4],(int)pos[5]);
> >> +    h1 = H264_QPEL_6TAP((int)pos2[0],(int)pos2[srcstride],(int)pos2[srcstride*2],(int)pos2[srcstride*3],
> >> +                   (int)pos2[srcstride*4],(int)pos2[srcstride*5]);
> >> +    pos2++;
> >> +    m1 = H264_QPEL_6TAP((int)pos2[0],(int)pos2[srcstride],(int)pos2[srcstride*2],(int)pos2[srcstride*3],
> >> +                   (int)pos2[srcstride*4],(int)pos2[srcstride*5]);
> >> +
> >> +    j1 = H264_QPEL_6TAP(aa,bb,b1,s1,gg,hh);
> >> +    b = clip_uint8((b1+16)>>5);
> >> +    h = clip_uint8((h1+16)>>5);
> >> +    j = clip_uint8((j1+512)>>10);
> >> +    s = clip_uint8((s1+16)>>5);
> >> +    m = clip_uint8((m1+16)>>5);
> >> +
> >> +    a = H264_QPEL_AVG((int)src[0],b);
> >> +    c = H264_QPEL_AVG((int)src[1],b);
> >> +    d = H264_QPEL_AVG((int)src[0],h);
> >> +    n = H264_QPEL_AVG((int)src[srcstride],h);
> >> +    f = H264_QPEL_AVG(b,j);
> >> +    i = H264_QPEL_AVG(h,j);
> >> +    k = H264_QPEL_AVG(j,m);
> >> +    q = H264_QPEL_AVG(j,s);
> >> +
> >> +    e = H264_QPEL_AVG(b,h);
> >> +    g = H264_QPEL_AVG(b,m);
> >> +    p = H264_QPEL_AVG(h,s);
> >> +    r = H264_QPEL_AVG(m,s);
> >> +
> >> +    dst[0] = src[0];
> >> +    dst[dststride1] = a;
> >> +    dst[dststride1*2] = b;
> >> +    dst[dststride1*3] = c;
> >> +    dst += dststride2;
> >> +    dst[0] = d;
> >> +    dst[dststride1] = e;
> >> +    dst[dststride1*2] = f;
> >> +    dst[dststride1*3] = g;
> >> +    dst += dststride2;
> >> +    dst[0] = h;
> >> +    dst[dststride1] = i;
> >> +    dst[dststride1*2] = j;
> >> +    dst[dststride1*3] = k;
> >> +    dst += dststride2;
> >> +    dst[0] = n;
> >> +    dst[dststride1] = p;
> >> +    dst[dststride1*2] = q;
> >> +    dst[dststride1*3] = r;
> >> +}
> >>
> >
> > this looks like it duplicates put_h264_qpel_pixels_tab or similar from dsputil
> I'll try to understand that function and see if it does the same.
> 
> >
> > [...]
> >
> >
> >> +
> >> +    while (!done && numsteps < MAXSEARCHSTEPS)
> >> +    {
> >> +        int startx = curx - SEARCHWIDTH;
> >> +        int starty = cury - SEARCHWIDTH;
> >> +        int stopx = curx + SEARCHWIDTH + 1;
> >> +        int stopy = cury + SEARCHWIDTH + 1;
> >> +        int foundbetter = 0;
> >> +        int scanx, scany;
> >> +
> >> +        if (startx < 0)
> >> +            startx = 0;
> >> +        if (starty < 0)
> >> +            starty = 0;
> >> +        if (stopx > t->refframe_width - 16 + 1)
> >> +            stopx = t->refframe_width - 16 + 1;
> >> +        if (stopy > t->refframe_height - 16 + 1)
> >> +            stopy = t->refframe_height -16 + 1;
> >> +
> >> +        for(scany = starty; scany < stopy; scany++)
> >> +        {
> >> +            for(scanx = startx; scanx < stopx; scanx++)
> >> +            {
> >> +                if (!(curx == scanx && cury == scany))
> >> +                {
> >> +                    int xvec = (scanx-x)*4-pred_mvx; // it's actually this difference which will be encoded!
> >> +                    int yvec = (scany-y)*4-pred_mvy;
> >> +                    int bitsize;
> >> +                    int xmod = scanx%2;
> >> +                    int ymod = scany%2;
> >> +                    int absnum = xmod+ymod*2;
> >> +                    int sae = t->dspcontext.pix_abs[0][0](0,targetmb->Y[0],
> >> +                        refframe->reconstructed_picture.data[0]    + scany * refframe->reconstructed_picture.linesize[0] + scanx,
> >> +                        refframe->reconstructed_picture.linesize[0], 16);
> >> +
> >> +                    sae += t->dspcontext.pix_abs[1][absnum](0,targetmb->U[0],
> >> +                        refframe->reconstructed_picture.data[1]    + (scany/2) * refframe->reconstructed_picture.linesize[1] + scanx/2,
> >> +                        refframe->reconstructed_picture.linesize[1], 8);
> >> +                    sae += t->dspcontext.pix_abs[1][absnum](0,targetmb->V[0],
> >> +                        refframe->reconstructed_picture.data[2]    + (scany/2) * refframe->reconstructed_picture.linesize[2] + scanx/2,
> >> +                        refframe->reconstructed_picture.linesize[2], 8);
> >>
> >
> > it would be nice if the comparission function where user selectable instead
> > of hardcoded like in the rest of lavc too
> >
> > [...]
> I do not know yet how this works in the rest of lavc, but I'll have a look at
> this.
> 
[...]
> 
> >
> >> +
> >> +// Adjust the values of mvx and mvy based on the prediction from the neighbouring macroblocks
> >> +static void ff_h264_estimate_motion_vectors(MacroBlock *destmb, int *mvpred_x, int *mvpred_y, int *mvpred_x2, int *mvpred_y2)
> >> +{
> >> +    int mvAx = 0, mvAy = 0;
> >> +    int mvBx = 0, mvBy = 0;
> >> +    int mvCx = 0, mvCy = 0;
> >> +    int mvDx = 0, mvDy = 0;
> >> +    int Aavail = 0;
> >> +    int Bavail = 0;
> >> +    int Cavail = 0;
> >> +    int Davail = 0;
> >> +
> >> +    if (destmb->leftblock != NULL && destmb->leftblock->available)
> >> +    {
> >> +        Aavail = 1;
> >> +        mvAx = destmb->leftblock->mv_x;
> >> +        mvAy = destmb->leftblock->mv_y;
> >> +    }
> >> +    if (destmb->topblock != NULL)
> >> +    {
> >> +        MacroBlock *topblock = destmb->topblock;
> >> +
> >> +        if (topblock->available)
> >> +        {
> >> +            Bavail = 1;
> >> +            mvBx = topblock->mv_x;
> >> +            mvBy = topblock->mv_y;
> >> +        }
> >> +        if (topblock->leftblock != NULL && topblock->leftblock->available)
> >> +        {
> >> +            Davail = 1;
> >> +            mvDx = topblock->leftblock->mv_x;
> >> +            mvDy = topblock->leftblock->mv_y;
> >> +        }
> >> +        if (topblock->rightblock != NULL && topblock->rightblock->available)
> >> +        {
> >> +            Cavail = 1;
> >> +            mvCx = topblock->rightblock->mv_x;
> >> +            mvCy = topblock->rightblock->mv_y;
> >> +        }
> >> +    }
> >> +
> >> +    if (!Cavail)
> >> +    {
> >> +        Cavail = Davail;
> >> +        mvCx = mvDx;
> >> +        mvCy = mvDy;
> >> +    }
> >> +
> >> +    if (!Bavail && !Cavail && Aavail)
> >> +    {
> >> +        mvBx = mvAx;
> >> +        mvBy = mvAy;
> >> +        mvCx = mvAx;
> >> +        mvCy = mvAy;
> >> +    }
> >> +
> >> +    *mvpred_x = ff_h264_median(mvAx,mvBx,mvCx);
> >> +    *mvpred_y = ff_h264_median(mvAy,mvBy,mvCy);
> >> +
> >> +    if (!Aavail || !Bavail || (Aavail && mvAx == 0 && mvAy == 0) || (Bavail && mvBx == 0 && mvBy == 0))
> >> +    {
> >> +        *mvpred_x2 = 0;
> >> +        *mvpred_y2 = 0;
> >> +    }
> >> +    else
> >> +    {
> >> +        *mvpred_x2 = *mvpred_x;
> >> +        *mvpred_y2 = *mvpred_y;
> >> +    }
> >> +}
> >>
> >
> > this looks like its a duplicate of h264.c pred_motion()
> I'll try to understand pred_motion() and see if it does the same.
> 
> 
[...]
> +                if (val > 0)
> +                    val--;
> +                else // < 0
> +                    val++;

val -= (val>>31)|1;


[...]
> +void ff_h264_hadamard_quant_4x4_c(DCTELEM Y[4][4], int QP)
> +{
> +    int qbits = 15 + div6[QP];
> +    int f2 = ((1 << qbits) / 3)*2;

always replace division by constants by multiplications by the reciprocal
>>something unless the code is irrelevant speedwise (init code)


[...]
> +    buf = (uint8_t *)av_malloc(s);

useless cast


[...]
> +    for (y = 0 ; y < 2 ; y++)
> +    {
> +        for (x = 0 ; x < 2 ; x++)
> +        {
> +            copy_mb->U_nonzero[y][x] = 16;
> +            copy_mb->V_nonzero[y][x] = 16;
> +        }
> +    }

memset()


[...]
> +#define H264_JUST_CLIP(x) \
> +{\
> +     int16_t val = x;\
> +     if (val > 2047)\
> +             x = 2047; \
> +     else if (val < -2047) \
> +             x = -2047;\

clip()


[...]
> +    static const int8_t mbtype_map[4][3][2] =
> +    {
> +        {
> +            {  1, 13 },  // 0 0 0, 0 0 1
> +            {  5, 17 },  // 0 1 0, 0 1 1
> +            {  9, 21 }   // 0 2 0, 0 2 1
> +        },
> +        {
> +            {  2, 14 },  // 1 0 0, 1 0 1
> +            {  6, 18 },  // 1 1 0, 1 1 1
> +            { 10, 22 }   // 1 2 0, 1 2 1
> +        },
> +        {
> +            {  3, 15 },  // 2 0 0, 2 0 1
> +            {  7, 19 },  // 2 1 0, 2 1 1
> +            { 11, 23 }   // 2 2 0, 2 2 1
> +        },
> +        {
> +            {  4, 16 },  // 3 0 0, 3 0 1
> +            {  8, 20 },  // 3 1 0, 3 1 1
> +            { 12, 24 }   // 3 2 0, 3 2 1
> +        }
> +    };

mbtype_map[a][b][c] -> 1 + a + 4*(b + 3*c)


[...]
> +                int x = (i%2)+X;
> +                int y = (i/2)+Y;

&1 and >>1


> +
> +                int k;
> +
> +                for (k = 0 ; k < 15 ; k++)
> +                    coefficients[k] = residual->part4x4Y[y][x][zigzagy[k+1]][zigzagx[k+1]];
> +                ff_h264_neighbour_count_nonzero(mb,NEIGHBOUR_SUBTYPE_Y,x,y,&nA,&nB);
> +                mb->Y_nonzero[y][x] = h264cavlc_encode(b,coefficients,15,nA,nB,0);
> +            }
> +        }
> +    }
> +    else
> +        memset(mb->Y_nonzero, 0, sizeof(mb->Y_nonzero));
> +
> +    if (CodedBlockPatternChroma == 0)
> +    {
> +        memset(mb->U_nonzero, 0, sizeof(mb->U_nonzero));
> +        memset(mb->V_nonzero, 0, sizeof(mb->V_nonzero));
> +        return;
> +    }
> +
> +    if (CodedBlockPatternChroma != 0)
> +    {
> +        H264_ENCODE_INTRA16X16_RESIDUAL_COEFFICIENTS(UD);
> +        H264_ENCODE_INTRA16X16_RESIDUAL_COEFFICIENTS(VD);
> +    }
> +
> +    if (CodedBlockPatternChroma == 2)
> +    {
> +        for (i = 0 ; i < 4 ; i++)
> +        {
> +            int x = (i%2);
> +            int y = (i/2);

&1 and >>1


> +
> +            int k;
> +
> +            for (k = 0 ; k < 15 ; k++)
> +                coefficients[k] = residual->part4x4U[y][x][zigzagy[k+1]][zigzagx[k+1]];
> +            ff_h264_neighbour_count_nonzero(mb,NEIGHBOUR_SUBTYPE_U,x,y,&nA,&nB);
> +            mb->U_nonzero[y][x] = h264cavlc_encode(b,coefficients,15,nA,nB,0);
> +        }
> +
> +        for (i = 0 ; i < 4 ; i++)
> +        {
> +            int x = (i%2);
> +            int y = (i/2);

&1 and >>1


[...]
> +        for (y = 0 ; y < 2 ; y++)
> +        {
> +            for (x = 0 ; x < 2 ; x++)
> +            {
> +                mb->U_nonzero[y][x] = 0;
> +                mb->V_nonzero[y][x] = 0;
> +            }
> +        }

memset()


[...]
> +        if (topavail && w == 16 && h == 16 && srcleft->topblock != 0 && srcleft->topblock->available)
> +        {
> +            // Plane prediction

use h264.c:pred*x*_plane_c and the other ones from h264.c


[...]
> +#ifdef CONFIG_ENCODERS
> +AVCodec h264_encoder = {
> +    "ffh264",
> +    CODEC_TYPE_VIDEO,
> +    CODEC_ID_FFH264,
> +    sizeof(H264Context),
> +    ff_h264_encoder_init,
> +    ff_h264_encode,
> +    ff_h264_encoder_close,

list of supported pix_fmts missing


[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is




More information about the ffmpeg-devel mailing list