[FFmpeg-devel] [PATCHv4] VP4 video decoder

Reimar Döffinger Reimar.Doeffinger at gmx.de
Tue May 21 22:30:54 EEST 2019


On Tue, May 21, 2019 at 05:44:20PM +1000, Peter Ross wrote:
> +    if (bits < 0x100) {
> +        skip_bits(gb, 1);
> +    } else if (bits < 0x180) {
> +        skip_bits(gb, 2);
> +        v += 1;
> +    }
> +#define body(n) { \
> +    skip_bits(gb, 2 + n); \
> +    v += (1 << n) + get_bits(gb, n); }
> +#define else_if(thresh, n) else if (bits < thresh) body(n)

Not sure I think the defines are great for readability,
but if you want to fully encode the logic, you could go for
e.g.
#define else_if(n) else if (bits < (0x200 - (0x80 >> n))) body(n)
Also as to the earlier discussed early bailout for the +257 case:
it seems sensible values can't really be larger than yuv_macroblock_count
and I think FFmpeg has defines for maximum frame width/height that
you could thus use to have a non-arbitrary bailout value?

> +    has_partial = 0;
> +    bit         = get_bits1(gb);
> +    current_run = vp4_read_mb_value(gb);
> +
> +    for (i = 0; i < s->yuv_macroblock_count; i++) {
> +        if (!current_run) {
> +            bit ^= 1;
> +            current_run = vp4_read_mb_value(gb);
> +        }
> +        s->superblock_coding[i] = 2 * bit;
> +        has_partial |= bit == 0;
> +        current_run--;
> +    }

This code doesn't quite look right to me.
For both of the vp4_read_mb_value weird stuff
seems to happen when it returns 0,
in the first case it directly flips bit
and reads a new value, which is stupid wasteful
encoding, in the second case current_run underflows
on current_run--, which is undefined behaviour
- or at least weird?

Except for that, isn't that the same as
   bit         = get_bits1(gb);
   for (i = 0; i < s->yuv_macroblock_count; ) {
       current_run = vp4_read_mb_value(gb);
       if (current_run > s->yuv_macroblock_count - i) -> report error?
       if (current_run == 0) current_run = s->yuv_macroblock_count - i; // maybe??
       memset(s->superblock_coding + i, 2*bit, current_run);
       has_partial |= bit == 0;
       i += current_run;
       bit ^= 1;
   }

Hm, admittedly this doesn't really make much
sense as you can't apply this trick to the has_partial
case.
But still maybe the current_run too large and 0
cases could be clarified at least.

> +                    int mb_y = 2 * sb_y + (((j >> 1) + j) & 1);

Is ^ potentially clearer than + here?

> +                    for (k = 0; k < 4; k++) {
> +                        if (BLOCK_X >= fragment_width || BLOCK_Y >= fragment_height)
> +                            continue;
> +                        fragment = s->fragment_start[plane] + BLOCK_Y * fragment_width + BLOCK_X;
> +
> +                        coded = pattern & (1 << (3 - k));

coded = pattern & (8 >> k);
maybe?

> +    if (last_motion < 0)
> +        v = -v;
> +    return v;

I'd probably be partial to using ?: here, but your decision.

> +                    if (coding_mode == 2) { /* VP4 */
> +                        motion_x[0] = vp4_get_mv(s, gb, 0, last_gold_motion_x);
> +                        motion_y[0] = vp4_get_mv(s, gb, 1, last_gold_motion_y);
> +                        last_gold_motion_x = motion_x[0];
> +                        last_gold_motion_y = motion_y[0];

Could write as
last_gold_motion_x =
motion_x[0] = vp4_get_mv(s, gb, 0, last_gold_motion_x);
I think?
But no strong opinion either.

> @@ -1012,8 +1214,8 @@ static int unpack_vlcs(Vp3DecodeContext *s, GetBitContext *gb,
>              bits_to_get = coeff_get_bits[token];
>              if (bits_to_get)
>                  bits_to_get = get_bits(gb, bits_to_get);
> -            coeff = coeff_tables[token][bits_to_get];
>
> +            coeff = coeff_tables[token][bits_to_get];

cosmetic?

> +            eob_run = eob_run_base[token];
> +            if (eob_run_get_bits[token])
[...]
> +            zero_run = zero_run_base[token];
> +            if (zero_run_get_bits[token])

If _run_base and _run_get_bits are always used together like this,
wouldn't it for readability and cache locality be better to
make it an array of structs so they are next to each other in memory?

> +            vp4_dc_predictor_reset(&dc_pred[j * 6 + i + 7]);
> +        s->dc_pred_row[sb_x * 4 + i] = dc_pred[25 + i];
> +        dc_pred[6 * i] = dc_pred[6 * i + 4];

If there's an easy way to make those constants like 6, 7 and 25
more obvious that might be a good idea.

> +    if (dc_pred[idx - 6].type == type) {
> +        dc += dc_pred[idx - 6].dc;
> +        count++;
> +    }
> +
> +    if (dc_pred[idx + 6].type == type) {
> +        dc += dc_pred[idx + 6].dc;
> +        count++;
> +    }
> +
> +    if (count != 2 && dc_pred[idx - 1].type == type) {
> +        dc += dc_pred[idx - 1].dc;
> +        count++;
> +    }
> +
> +    if (count != 2 && dc_pred[idx + 1].type == type) {
> +        dc += dc_pred[idx + 1].dc;
> +        count++;
> +    }

Maybe do dc_pred += idx at the start and then
only dc_pred[-6], dc_pred[6] etc?

> +#define loop_stride 12
> +    uint8_t loop[12 * loop_stride];

Hm, at 144 bytes, might it make sense to have in context
instead of on stack?

> +#define SHIFT(v, shift) ((v) >> (shift))
> +#define ABS_SHIFT(v, shift) ((v) > 0 ? SHIFT(v, shift) : -SHIFT(-v, shift))

Don't we have something like that already?
I think this should rather be:
(v - (v >> 31)) >> shift
?

Best regards,
Reimar Döffinger


More information about the ffmpeg-devel mailing list