[FFmpeg-devel] [PATCH][RFC] Indeo3 replacement

Maxim max_pole
Sun Jul 26 23:16:22 CEST 2009


Michael Niedermayer wrote:
> [...]  
>> /* FIXME: I know we already have a bitreader in ffmpeg */
>> /* it should be adapted to read ahead only one byte */
>> /* otherwise it won't work for indeo3 !!! */
>>     
>
> elaborate please
>   

OK. The frame data interleaves the binary tree, which is coded as a
bitstream, with the cell data, which is coded as a bytestream.
Below is an example of a typical parsing sequence.
Consider the following frame data:

Addr
0000   0x11,
0001   0x6C,
0002   0x0,
0003   0x1,
0004   0xFD,
0005   0xFB,
0006   0x3,
0007   0x11

We parse them as follows:

data_ptr points to the next unread byte of the frame data
read_bintree_code() reads 2-bit codes, starting with the MSB

bs_cache = *data_ptr++; // feed the bitstream reader with the first byte
                        // = 0x11 = %00010001
code = read_bintree_code(); // we'll get %00 = H_SPLIT, bs_cache = %010001
...process this code...
code = read_bintree_code(); // we'll get %01 = V_SPLIT, bs_cache = %0001
...process this code...
code = read_bintree_code(); // we'll get %00 = H_SPLIT, bs_cache = %01
...process this code...
code = read_bintree_code(); // we'll get %01 = V_SPLIT, bs_cache = empty
bs_cache = *data_ptr++; // feed the bitstream reader with the 2nd byte
                        // = 0x6C = %01101100
...process "code"...
code = read_bintree_code(); // we'll get %01 = V_SPLIT, bs_cache = %101100
...process this code...
code = read_bintree_code(); // we'll get %10 = INTRA, bs_cache = %1100
...process this code...
code = read_bintree_code(); // we'll get %11 = VQ_DATA, bs_cache = %00

At this point we call the decode_cell() routine with data_ptr pointing
to the cell data, which starts at address 0002. From here on the data is
treated as a bytestream. After decoding the cell, we set data_ptr to the
first byte following the cell data.

data_ptr = decode_cell(data_ptr);

// data_ptr points to addr 0007 now

code = read_bintree_code(); // we'll get %00 = H_SPLIT, bs_cache = empty
bs_cache = *data_ptr++; // feed the bitstream reader with the byte at addr 0007
                        // = 0x11 = %00010001

... continue processing...

So the code would have to manipulate the internal structs of FFmpeg's
bitreader to achieve the same behaviour, which is not really safe
IMHO. Maybe there is a smarter way to do it. I don't know...
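
To make the requirement concrete, here is a minimal sketch of a reader
that never holds more than one byte and shares its data pointer with the
bytestream code. It follows the sequence traced above (refill as soon as
the cache runs empty). The struct, its fields and bintree_reader_init()
are names assumed for illustration only; they are not taken from the
patch, and the bounds check is an addition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state; field names are assumptions, not patch code. */
typedef struct BinTreeReader {
    const uint8_t *data_ptr;  /* next unread byte, shared with the cell decoder */
    const uint8_t *data_end;  /* one past the last valid byte */
    uint8_t        bs_cache;  /* unread bits of the current byte, MSB-aligned */
    int            bits_left; /* valid bits in bs_cache: 0, 2, 4, 6 or 8 */
} BinTreeReader;

/* Load exactly one byte into the cache, if any data is left. */
static void bintree_refill(BinTreeReader *r)
{
    if (r->data_ptr < r->data_end) {
        r->bs_cache  = *r->data_ptr++;
        r->bits_left = 8;
    }
}

static void bintree_reader_init(BinTreeReader *r, const uint8_t *data, size_t size)
{
    assert(data != NULL);
    r->data_ptr  = data;
    r->data_end  = data + size;
    r->bs_cache  = 0;
    r->bits_left = 0;
    bintree_refill(r); /* feed the reader with the first byte */
}

/* Return the next 2-bit code (MSB first), or -1 if no bits are left.
 * The cache is refilled only once it is empty, so data_ptr is never more
 * than one byte ahead of the bits already handed out. */
static int read_bintree_code(BinTreeReader *r)
{
    int code;

    if (r->bits_left < 2)
        return -1;
    code          = r->bs_cache >> 6;
    r->bs_cache   = (uint8_t)(r->bs_cache << 2);
    r->bits_left -= 2;
    if (r->bits_left == 0)
        bintree_refill(r);
    return code;
}
```

With this layout a cell decoder can simply take r->data_ptr, consume its
bytes, and store the advanced pointer back; the two bits still waiting in
bs_cache are not disturbed.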


> [...]
>
>
>>         prim_delta   = &delta_tabs   [prim_indx]  [0];
>>         prim_sel     = &selector_tabs[prim_indx]  [0];
>>         second_delta = &delta_tabs   [second_indx][0];
>>         second_sel   = &selector_tabs[second_indx][0];
>>     } else {
>>         vq_index += ctx->cb_offset;
>>         assert(vq_index <= 23);
>>
>>         prim_delta   = &delta_tabs   [vq_index][0];
>>         prim_sel     = &selector_tabs[vq_index][0];
>>         second_delta = prim_delta;
>>         second_sel   = prim_sel;
>>     }
>>
>>     /* requantize the prediction if VQ index of this cell differs from VQ index */
>>     /* of the predicted cell in order to avoid overflows. */
>>     /* FIXME: if (vq_index >= 8 && (mode == 0 || mode == 3 || mode == 10) [win32] */
>>     if (vq_index >= 8) {
>>         for (x = 0; x < cell->width << 2; x++)
>>             ref_block[x] = requant_tab[vq_index & 7][ref_block[x]];
>>     }
>>
>>     /* convert the pixel offset into 4x4 block one */
>>     row_offset     = plane->pitch >> 2;
>>     blk_row_offset = (plane->pitch - cell->width) << 2;
>>
>>     rle_blocks = 0;  // reset RLE block counter
>>
>>     switch (mode) {
>>         case 0: /*------------------ MODES 0 & 1 (4x4 block processing) --------------------*/
>>         case 1:
>>             skip_flag = 0;
>>
>>             for (y = 0; y < cell->height; y++) {
>>                 for (x = 0; x < cell->width; x++) {
>>                     /* address 4 pixels as one 32bit integer */
>>                     ref32 = (int32_t *)ref_block;
>>                     src32 = (int32_t *)block;
>>
>>                     if (rle_blocks > 0) {
>>                         /* apply 0 delta to whole next block */
>>                         if (cell->mv_ptr || !skip_flag)
>>                             copy_32(src32, ref32, 4, row_offset);
>>                         rle_blocks--;
>>                     } else {
>>                         for (line = 0; line < 4;) {
>>                             num_lines = 1;
>>
>>                             code = *data_ptr++;
>>                             /* select primary VQ table for odd, secondary for even lines */
>>                             delta_tab = (line & 1) ? prim_delta : second_delta;
>>
>>                             /* switch on code type: dyad, quad or RLE escape codes */
>>                             switch ((line & 1) ? prim_sel[code] : second_sel[code]) {
>>                                 case DELTA_DYAD: /* apply VQ delta to two dyads (2+2 pixels) using softSIMD */
>>                                     if (((line & 1) ? prim_sel[*data_ptr] : second_sel[*data_ptr]) != DELTA_DYAD) {
>>                                         av_log(avctx, AV_LOG_ERROR, "Mode 0/1: invalid VQ data!\n");
>>                                         return -1;
>>                                     }
>>                                     ref16 = (int16_t *)ref32;
>>                                     src16 = (int16_t *)src32;
>>                                     src16[0] = ref16[0] + delta_tab[*data_ptr++];
>>                                     src16[1] = ref16[1] + delta_tab[code];
>>                                     break;
>>
>>                                 case DELTA_QUAD: /* apply VQ delta to 4 pixels at once using softSIMD */
>>                                     src32[0] = ref32[0] + delta_tab[code];
>>                                     break;
>>
>>                                 case RLE_ESC_FF: /* apply null delta to all lines up to the 2nd line */
>>                                     //assert(line < 1);
>>                                     copy_32(src32, ref32, 2, row_offset);
>>                                     num_lines = 2;
>>                                     break;
>>
>>                                 case RLE_ESC_FE: /* apply null delta to all lines up to the 3rd line */
>>                                     //assert(line < 2);
>>                                     copy_32(src32, ref32, 3 - line, row_offset);
>>                                     num_lines = 3 - line;
>>                                     break;
>>
>>                                 case RLE_ESC_FC:
>>                                     /* apply null delta to all remaining lines of this block
>>                                        and to whole next block */
>>                                     skip_flag  = 0;
>>                                     rle_blocks = 1;
>>
>>                                 case RLE_ESC_FD: /* apply null delta to all remaining lines of this block */
>>                                     copy_32(src32, ref32, 4 - line, row_offset);
>>                                     num_lines = 4 - line; /* go to process next block */
>>                                     break;
>>
>>                                 case RLE_ESC_FB: /* apply null delta to n blocks/skip n blocks */
>>                                     /* get next byte after the escape code 0xFB */
>>                                     code = *data_ptr++;
>>                                     rle_blocks = (code & 0x1F) - 1; /* set the block counter */
>>                                     if (code >= 64 || rle_blocks < 0) {
>>                                         av_log(avctx, AV_LOG_ERROR, "Mode 0/1: RLE-FB invalid counter: %d!\n", code);
>>                                         return -1;
>>                                     }
>>                                     skip_flag = code & 0x20;
>>                                     if (cell->mv_ptr || !skip_flag)
>>                                         copy_32(src32, ref32, 4 - line, row_offset);
>>                                     num_lines = 4 - line; /* go to process next block */
>>                                     break;
>>
>>                                 case RLE_ESC_F9: /* skip this block and the next one */
>>                                     skip_flag  = 1;
>>                                     rle_blocks = 1;
>>
>>                                 case RLE_ESC_FA: /* skip this block (INTRA) or copy the reference block (INTER) */
>>                                     assert(!line);
>>                                     if (cell->mv_ptr)
>>                                         copy_32(src32, ref32, 4, row_offset);
>>                                     num_lines = 4;
>>                                     break;
>>
>>                                 default:
>>                                     av_log(avctx, AV_LOG_ERROR, "Mode 0/1: unsupported RLE code: %d!\n",
>>                                           (line & 1) ? prim_sel[code] : second_sel[code]);
>>                                     return(-1);
>>                             }// switch code
>>
>>                             /* move forward num_lines */
>>                             line  += num_lines;
>>                             ref32 += row_offset * num_lines;
>>                             src32 += row_offset * num_lines;
>>                         }// for line
>>                     }// if/else
>>
>>                     /* move to next block horizontal */
>>                     ref_block += 4;
>>                     block     += 4;
>>                 }// for x
>>
>>                 /* move to next line of blocks */
>>                 ref_block += blk_row_offset;
>>                 block     += blk_row_offset;
>>             }// for y
>>             break;
>>
>>         case 3: /*------------------ MODES 3 & 4 (4x8 block processing) --------------------*/
>>         case 4:
>>             if (cell->mv_ptr) {
>>                 av_log(avctx, AV_LOG_ERROR, "Trying to use Mode 3/4 for an INTER cell!\n");
>>                 return -1;
>>             }
>>             block32        = (int32_t *)block;
>>             blk_row_offset = (row_offset << 3) - cell->width;
>>             skip_flag      = 0;
>>
>>             for (y = 0, is_first_row = 1; y < cell->height; y += 2) {
>>                 for (x = 0; x < cell->width; x++) {
>>                     /* address 4 pixels as one 32bit integer */
>>                     ref32 = &block32[-row_offset];
>>                     src32 = &block32[row_offset];
>>
>>                     if (rle_blocks > 0) {
>>                         /* apply 0 delta to whole next block */
>>                         if (!skip_flag)
>>                             copy_32(block32, ref32, 8, row_offset);
>>                         rle_blocks--;
>>                     } else {
>>                         for(line = 0; line < 4;) {
>>                             num_lines      = 1;
>>                             is_top_of_cell = is_first_row & (!line);
>>
>>                             code = *data_ptr++;
>>                             /* select primary VQ table for odd, secondary for even lines */
>>                             delta_tab = (line & 1) ? prim_delta : second_delta;
>>
>>                             /* switch on code type: dyad, quad or RLE escape codes */
>>                             switch ((line & 1) ? prim_sel[code] : second_sel[code]) {
>>                                 case DELTA_DYAD: /* apply VQ delta to two dyads (2+2 pixels) using softSIMD */
>>                                     if (((line & 1) ? prim_sel[*data_ptr] : second_sel[*data_ptr]) != DELTA_DYAD) {
>>                                         av_log(avctx, AV_LOG_ERROR, "Mode 3/4: invalid VQ data!\n");
>>                                         return -1;
>>                                     }
>>                                     ref16 = (int16_t *)ref32;
>>                                     src16 = (int16_t *)src32;
>>                                     src16[0] = ref16[0] + delta_tab[*data_ptr++];
>>                                     src16[1] = ref16[1] + delta_tab[code];
>>
>>                                     /* odd lines are not coded but rather interpolated/replicated */
>>                                     /* first line of the cell on the top of image? - replicate */
>>                                     /* otherwise - interpolate */
>>                                     if (is_top_of_cell && !cell->ypos) {
>>                                         src32[-row_offset] = src32[0];
>>                                     } else
>>                                         INTERPOLATE_32(src32 -row_offset, src32, ref32);
>>                                     break;
>>
>>                                 case DELTA_QUAD: /* apply VQ delta to 4 pixels at once using softSIMD */
>>                                     src32[0] = ref32[0] + delta_tab[code];
>>                                     if (is_top_of_cell && !cell->ypos) {
>>                                         src32[-row_offset] = src32[0];
>>                                     } else
>>                                         INTERPOLATE_32(src32 -row_offset, src32, ref32);
>>                                     break;
>>
>>                                 case RLE_ESC_FF: /* apply null delta to all lines up to the 2nd line */
>>                                     assert(line < 1);
>>                                     copy_32(src32 - row_offset, ref32, 4, row_offset);
>>                                     num_lines = 2;
>>                                     break;
>>
>>                                 case RLE_ESC_FE: /* apply null delta to all lines up to the 3rd line */
>>                                     assert(line < 2);
>>                                     copy_32(src32 - row_offset, ref32, (3 - line) << 1, row_offset);
>>                                     num_lines = 3 - line;
>>                                     break;
>>
>>                                 case RLE_ESC_FC:
>>                                     /* apply null delta to all remaining lines of this block
>>                                     and to whole next block */
>>                                     skip_flag  = 0;
>>                                     rle_blocks = 1;
>>
>>                                 case RLE_ESC_FD: /* apply null delta to all remaining lines of this block */
>>                                     copy_32(src32 - row_offset, ref32, (4 - line) << 1, row_offset);
>>                                     num_lines = 4 - line; /* go to process next block */
>>                                     break;
>>
>>                                 case RLE_ESC_FB: /* apply null delta to n blocks/skip n blocks */
>>                                     /* get next byte after the escape code 0xFB */
>>                                     code = *data_ptr++;
>>                                     rle_blocks = (code & 0x1F) - 1; /* set the block counter */
>>                                     if (code >= 64 || rle_blocks < 0) {
>>                                         av_log(avctx, AV_LOG_ERROR, "Mode 3/4: RLE-FB invalid counter: %d!\n", code);
>>                                         return -1;
>>                                     }
>>                                     skip_flag = code & 0x20;
>>                                     if (!skip_flag)
>>                                         copy_32(src32 - row_offset, ref32, (4 - line) << 1, row_offset);
>>                                     num_lines = 4 - line; /* go to process next block */
>>                                     break;
>>
>>                                 case RLE_ESC_F9: /* skip this block and the next one */
>>                                     skip_flag  = 1;
>>                                     rle_blocks = 1;
>>
>>                                 case RLE_ESC_FA: /* skip this block */
>>                                     assert(!line);
>>                                     num_lines = 4;
>>                                     break;
>>
>>                                 default:
>>                                     av_log(avctx, AV_LOG_ERROR, "Mode 3/4: unsupported RLE code: %d!\n",
>>                                            (line & 1) ? prim_sel[code] : second_sel[code]);
>>                                     return(-1);
>>                             }// switch code
>>
>>                             /* move to num_lines (even) */
>>                             line  += num_lines;
>>                             ref32 += row_offset * (num_lines << 1);
>>                             src32 += row_offset * (num_lines << 1);
>>                         }// for line
>>                     }// if/else
>>
>>                     /* move to next block horizontal */
>>                     block32++;
>>                 }// for x
>>
>>                 /* move to next line of blocks */
>>                 block32      += blk_row_offset;
>>                 is_first_row  = 0;
>>             }// for y
>>             break;
>>     
>
> looks very similar to the 4x4 code ...
>   

Yes, they look similar, but they actually aren't the same... Modes 3/4
process twice as many lines as modes 0/1. Correct me if I'm wrong, but
merging the code isn't worth it IMHO, because I would need to add a lot
of "if mode xx then" statements to reflect the mode-specific behaviour.
Moreover, all these modes are built on the same principle. The only
parts that differ are the DYAD and QUAD processing. The handling of the
RLE codes is mostly the same (with or without interpolation), so one
could think about creating one big generalized loop for all modes and
then adding internal switches for each mode-specific DYAD/QUAD case. I
don't know how that would affect performance, but I don't see any other
possibility for refactoring, do you?
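
As a rough illustration of the part that could be shared: in the quoted
code the escapes FF, FE and FD cover the same number of coded lines in
both mode groups, and modes 3/4 just touch twice as many pixel rows. The
sketch below expresses that with a per-mode shift. The enum values, the
function names and the line_shift parameter are made up for this example
and are not part of the patch.

```c
#include <assert.h>

/* Stand-ins for three of the patch's escape codes; values are arbitrary here. */
enum { RLE_ESC_FF, RLE_ESC_FE, RLE_ESC_FD };

/* Number of coded lines an escape covers, starting at the given line
 * (0..3) of a block; -1 for codes this sketch does not handle. */
static int rle_coded_lines(int esc, int line)
{
    assert(line >= 0 && line < 4);
    switch (esc) {
    case RLE_ESC_FF: return 2;
    case RLE_ESC_FE: return 3 - line;
    case RLE_ESC_FD: return 4 - line;
    default:         return -1;
    }
}

/* Number of pixel rows to copy: line_shift is 0 for modes 0/1 (4x4 blocks)
 * and 1 for modes 3/4 (4x8 blocks, every coded line spans two rows). */
static int rle_pixel_rows(int esc, int line, int line_shift)
{
    int n = rle_coded_lines(esc, line);
    return n < 0 ? n : n << line_shift;
}
```

The DYAD and QUAD cases would still need their own per-mode code, so this
only shows where a common helper might remove some duplication.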

Regards
Maxim


