[FFmpeg-devel] [PATCH] VP8: correctly use optimal epel functions for splitmv mode

David Conrad lessen42
Mon Jun 28 09:04:13 CEST 2010


On Jun 27, 2010, at 1:57 PM, Ronald S. Bultje wrote:

> Hi,
> 
> On Sat, Jun 26, 2010 at 8:19 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
>> currently, we apply MC/epel for splitmv coding as 4x4 subblocks (of
>> 4x4px each) in the 16x16px MB. This is suboptimal, because the MVs are
>> actually shared between multiple subblocks, so applying epel in
>> 16x8/8x16/8x8 would be more optimal, particularly if we use SSE2/SSSE3
>> optimizations.
>> 
>> The attached patch tries to improve the situation.
>> 
>> Once the SSE2/MMX patches are applied, this leads to about 10% speedup
>> for splitmv MBs (5937 to 5486 cycles per whole splitmv-MB for sample
>> 15 in the vector testsuite). Of course this depends on the coding of
>> the MB and thus on the sample. With SSSE3 it probably leads to even
>> better speedups, but I can't test that because my CPU is old.
> 
> New patch against SVN after David's bilinear filter addition.
> 
> Ronald
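
To make the motivation above concrete, here is a minimal sketch (not part of the patch) that counts how many luma MC calls cover one 16x16 MB for a given partition size. The helper name luma_mc_calls is hypothetical and exists only for illustration.

```c
#include <assert.h>

/* Hypothetical helper, not from the patch: number of luma MC calls
 * needed to cover one 16x16 macroblock when motion compensation is
 * applied per partition of part_w x part_h pixels. */
static int luma_mc_calls(int part_w, int part_h)
{
    /* Partition sizes must be positive and tile the macroblock evenly. */
    assert(part_w > 0 && part_h > 0);
    assert(16 % part_w == 0 && 16 % part_h == 0);
    return (16 / part_w) * (16 / part_h);
}
```

Under these assumptions, per-subblock MC (4x4) needs 16 calls per MB, while 8x8 needs 4 and 16x8 or 8x16 need 2, each call working on a wider block that suits the SIMD epel functions better.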

> Index: ffmpeg-svn/libavcodec/vp8.c
> ===================================================================
> --- ffmpeg-svn.orig/libavcodec/vp8.c	2010-06-26 21:50:39.000000000 -0400
> +++ ffmpeg-svn/libavcodec/vp8.c	2010-06-27 13:56:47.000000000 -0400
> @@ -943,6 +943,38 @@
>      mc_func[my_idx][mx_idx](dst, linesize, src, linesize, block_h, mx, my);
>  }
>  
> +static void vp8_mc_part(VP8Context *s, uint8_t *dst[3], AVFrame *ref_frame,
> +                        int x_off, int y_off, int bx_off, int by_off,
> +                        int block_w, int block_h,
> +                        int width, int height, VP56mv *mv)

This function should be marked inline.
(Someone, maybe me, should experiment with manually forcing inlining in the VP8 decoder.)
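
A rough sketch of what forcing inlining could look like. The macro my_always_inline is a local stand-in for FFmpeg's av_always_inline, defined here only so the example compiles on its own; luma_tab_index and the two wrappers are hypothetical names, not code from the patch.

```c
/* Local stand-in for FFmpeg's av_always_inline, so that this sketch
 * is self-contained. */
#if defined(__GNUC__)
#    define my_always_inline __attribute__((always_inline)) inline
#elif defined(_MSC_VER)
#    define my_always_inline __forceinline
#else
#    define my_always_inline inline
#endif

/* Hypothetical helper mirroring the "block_w == 8" table index used
 * for luma in the quoted hunk. */
static my_always_inline int luma_tab_index(int block_w)
{
    return block_w == 8;
}

/* Each wrapper passes a compile-time constant, so once the helper is
 * inlined the compiler can fold the comparison away entirely. */
static int index_for_8x8(void)  { return luma_tab_index(8);  }
static int index_for_16x8(void) { return luma_tab_index(16); }
```

The idea is that callers which pass constant block sizes get a specialized copy of the function body; whether that actually helps here would need benchmarking.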

> +{
> +    VP56mv uvmv = *mv;
> +
> +    /* Y */
> +    vp8_mc(s, 1, dst[0] + by_off * s->linesize + bx_off,
> +           ref_frame->data[0], mv, x_off + bx_off, y_off + by_off,
> +           block_w, block_h, width, height, s->linesize,
> +           s->put_pixels_tab[block_w == 8]);
> +
> +    /* U/V */
> +    if (s->profile == 3) {
> +        uvmv.x &= ~7;
> +        uvmv.y &= ~7;
> +    }
> +    x_off   >>= 1; y_off   >>= 1;
> +    bx_off  >>= 1; by_off  >>= 1;
> +    width   >>= 1; height  >>= 1;
> +    block_w >>= 1; block_h >>= 1;
> +    vp8_mc(s, 0, dst[1] + by_off * s->uvlinesize + bx_off,
> +           ref_frame->data[1], &uvmv, x_off + bx_off, y_off + by_off,
> +           block_w, block_h, width, height, s->uvlinesize,
> +           s->put_pixels_tab[1 + (block_w == 4)]);
> +    vp8_mc(s, 0, dst[2] + by_off * s->uvlinesize + bx_off,
> +           ref_frame->data[2], &uvmv, x_off + bx_off, y_off + by_off,
> +           block_w, block_h, width, height, s->uvlinesize,
> +           s->put_pixels_tab[1 + (block_w == 4)]);
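
For reference, a small standalone sketch of the chroma handling in the hunk above. struct mv is a hypothetical stand-in for VP56mv, and the reading that the low three bits of the vector are its fractional part is an assumption based on the "&= ~7" mask.

```c
/* Hypothetical stand-in for VP56mv. */
struct mv { int x, y; };

/* Chroma MV as derived in the quoted hunk: for profile 3 the low
 * three bits are cleared (assumed to be the fractional part). */
static struct mv chroma_mv(struct mv m, int profile)
{
    if (profile == 3) {
        m.x &= ~7;
        m.y &= ~7;
    }
    return m;
}

/* Chroma table index as in the quoted hunk: the luma width is halved
 * first, then a 4-wide chroma block selects index 2, anything else 1. */
static int chroma_tab_index(int luma_block_w)
{
    int block_w = luma_block_w >> 1;
    return 1 + (block_w == 4);
}
```

So a 16-wide luma partition gives an 8-wide chroma block and index 1, and an 8-wide luma partition gives a 4-wide chroma block and index 2.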

> 
> @@ -112,7 +119,7 @@
>        { -6, -7 }                            // '110', '111'
>  };
>  
> -static const uint8_t vp8_mbsplits[4][16] = {
> +static const uint8_t vp8_mbsplits[5][16] = {
>      {  0,  0,  0,  0,  0,  0,  0,  0,
>         1,  1,  1,  1,  1,  1,  1,  1  },
>      {  0,  0,  1,  1,  0,  0,  1,  1,
> @@ -120,7 +127,9 @@
>      {  0,  0,  1,  1,  0,  0,  1,  1,
>         2,  2,  3,  3,  2,  2,  3,  3  },
>      {  0,  1,  2,  3,  4,  5,  6,  7,
> -       8,  9, 10, 11, 12, 13, 14, 15  }
> +       8,  9, 10, 11, 12, 13, 14, 15  },
> +    {  0,  0,  0,  0,  0,  0,  0,  0,
> +       0,  0,  0,  0,  0,  0,  0,  0  }
>  };
>  
>  static const uint8_t vp8_mbfirstidx[4][16] = {

Is this hunk related to the rest of the patch?

OK otherwise


