[FFmpeg-devel] [PATCH] MMX acceleration in libswscale for YUV -> BGRA32

Michael Niedermayer michaelni
Mon Sep 24 09:31:52 CEST 2007


Hi

On Mon, Sep 24, 2007 at 08:08:45AM +0200, Peter Schlaile wrote:
> Hi,
> 
> While updating Blender to the latest version of FFmpeg, I noticed that
> this particular conversion function isn't MMX accelerated. (And Blender
> really needs just this one all the time :)
> 
> A patch that adds it to libswscale is attached.
> 
> Sincerely,
> Peter Schlaile
> 
> 

> Index: yuv2rgb.c
> ===================================================================
> --- yuv2rgb.c	(revision 12118)
> +++ yuv2rgb.c	(working copy)
> @@ -619,6 +619,7 @@
>  #if defined(HAVE_MMX2) || defined(HAVE_MMX)
>      if (c->flags & SWS_CPU_CAPS_MMX2){
>          switch(c->dstFormat){
> +	case PIX_FMT_BGR32:  return yuv420_bgr32_MMX2;
>          case PIX_FMT_RGB32:  return yuv420_rgb32_MMX2;

Tabs are forbidden in FFmpeg SVN; please indent with spaces instead.


>          case PIX_FMT_BGR24:  return yuv420_rgb24_MMX2;
>          case PIX_FMT_BGR565: return yuv420_rgb16_MMX2;
> @@ -627,6 +628,7 @@
>      }
>      if (c->flags & SWS_CPU_CAPS_MMX){
>          switch(c->dstFormat){
> +	case PIX_FMT_BGR32:  return yuv420_bgr32_MMX;
>          case PIX_FMT_RGB32:  return yuv420_rgb32_MMX;
>          case PIX_FMT_BGR24:  return yuv420_rgb24_MMX;
>          case PIX_FMT_BGR565: return yuv420_rgb16_MMX;
> Index: yuv2rgb_template.c
> ===================================================================
> --- yuv2rgb_template.c	(revision 12118)
> +++ yuv2rgb_template.c	(working copy)
> @@ -536,3 +536,89 @@
>      __asm__ __volatile__ (EMMS);
>      return srcSliceH;
>  }
> +

> +static inline int RENAME(yuv420_bgr32)(SwsContext *c, uint8_t* src[], int srcStride[], int srcSliceY,
> +                                       int srcSliceH, uint8_t* dst[], int dstStride[]){
> +    int y, h_size;
> +
> +    if(c->srcFormat == PIX_FMT_YUV422P){
> +        srcStride[1] *= 2;
> +        srcStride[2] *= 2;
> +    }
> +
> +    h_size= (c->dstW+7)&~7;
> +    if(h_size*4 > FFABS(dstStride[0])) h_size-=8;
> +
> +    __asm__ __volatile__ ("pxor %mm4, %mm4;" /* zero mm4 */ );
> +
> +    for (y= 0; y<srcSliceH; y++ ) {
> +        uint8_t *_image = dst[0] + (y+srcSliceY)*dstStride[0];
> +        uint8_t *_py = src[0] + y*srcStride[0];
> +        uint8_t *_pu = src[1] + (y>>1)*srcStride[1];
> +        uint8_t *_pv = src[2] + (y>>1)*srcStride[2];

Please use normal variable names. Also, this is code duplication.
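
A minimal sketch of one way the duplicated setup could be shared between the
rgb32 and bgr32 variants, so that only the MMX inner loop differs. It is not
the code that is in libswscale: the names RowPtrs, padded_width and
row_pointers are hypothetical, and abs() stands in for FFABS so the example
compiles on its own. The arithmetic itself is taken from the quoted patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper type: none of these names exist in libswscale. */
typedef struct RowPtrs {
    uint8_t       *image; /* start of the destination row             */
    const uint8_t *py;    /* luma row                                  */
    const uint8_t *pu;    /* chroma U row, shared by two luma rows     */
    const uint8_t *pv;    /* chroma V row, shared by two luma rows     */
} RowPtrs;

/*
 * Round the width up to a multiple of 8 pixels, then drop the last block
 * of 8 if the padded row would not fit into the destination stride.
 * Mirrors the h_size computation in the quoted patch (which uses 4 bytes
 * per pixel and FFABS instead of abs).
 */
static int padded_width(int dst_w, int dst_stride, int bytes_per_pixel)
{
    int h_size = (dst_w + 7) & ~7;

    assert(dst_w > 0 && bytes_per_pixel > 0);
    if (h_size * bytes_per_pixel > abs(dst_stride))
        h_size -= 8;
    return h_size;
}

/*
 * Compute the per-row pointers for slice row y. The chroma planes are
 * subsampled vertically, so two consecutive luma rows share one chroma row.
 */
static RowPtrs row_pointers(uint8_t *const src[], const int src_stride[],
                            uint8_t *const dst[], const int dst_stride[],
                            int src_slice_y, int y)
{
    RowPtrs r;

    r.image = dst[0] + (y + src_slice_y) * dst_stride[0];
    r.py    = src[0] + y * src_stride[0];
    r.pu    = src[1] + (y >> 1) * src_stride[1];
    r.pv    = src[2] + (y >> 1) * src_stride[2];
    return r;
}
```

With helpers along these lines, each output-format function would call
padded_width() once and row_pointers() once per row, and keep only its own
asm block.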

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Opposition brings concord. Out of discord comes the fairest harmony.
-- Heraclitus