[FFmpeg-devel] [PATCH 07/11] avcodec/mips: loongson optimize h264qpel with mmi v2

Michael Niedermayer michael at niedermayer.cc
Mon May 23 21:47:30 CEST 2016


On Tue, May 17, 2016 at 03:08:13PM +0800, 周晓勇 wrote:
> avcodec/mips/h264qpel_mmi: Version 2 of the optimizations for loongson mmi
>     
>     1. no longer use the register names directly and optimized code format
>     2. to be compatible with O32, specify type of address variable with mips_reg and handle the address variable with PTR_ operator
>     3. temporarily commented out put_(avg_)h264_qpel16_hv_lowpass_mmi and related functions, which fail FATE testing under the O32 ABI
>     4. use uld and mtc1 to workaround cpu 3A2000 gslwlc1 bug (gslwlc1 instruction extension bug in O32 ABI)
>     5. put_pixels_ and avg_pixels_ functions use hpeldsp optimizations instead

[...]
> @@ -1373,161 +1412,589 @@ static void put_h264_qpel4_hv_lowpass_mmi(uint8_t *dst, const uint8_t *src,
>      }
>  }
>  
> -static void put_h264_qpel8_hv_lowpass_mmi(uint8_t *dst, const uint8_t *src,
> -        int dstStride, int srcStride)
> -{
> -    int16_t _tmp[104];
> -    int16_t *tmp = _tmp;
> -    int i;
> -    src -= 2*srcStride;
> +static inline void put_h264_qpel8or16_hv1_lowpass_mmi(int16_t *tmp,
> +        const uint8_t *src, ptrdiff_t tmpStride, ptrdiff_t srcStride, int size)
> +{
> +    int w = (size + 8) >> 2;
> +    double ftmp[11];
> +    uint64_t tmp0;
> +    uint64_t low32;
> +
> +    src -= 2 * srcStride + 2;
[...]

> +        src8  += 2L * src8Stride;
> +        src16 += 48;
> +        dst   += 2L * dstStride;

Why does this use long constants (2L) instead of plain ints, when the
rest of the code uses ints?
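
To make the question concrete, here is a minimal sketch, assuming the
strides in that hunk are ptrdiff_t like the parameters of the other new
functions (the declaration is not visible in the quoted context). Under
that assumption the L suffix should not change the result, because the
multiply is already carried out in the wider of int and ptrdiff_t. The
helper names below are hypothetical and not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: step two rows using an int
 * constant. The usual arithmetic conversions bring the constant and the
 * stride to a common type before the multiply, so nothing is truncated
 * when stride is ptrdiff_t. */
static const uint8_t *advance_int(const uint8_t *p, ptrdiff_t stride)
{
    return p + 2 * stride;
}

/* The same step written with a long constant, as in the quoted hunk. */
static const uint8_t *advance_long(const uint8_t *p, ptrdiff_t stride)
{
    return p + 2L * stride;
}

/* Returns 1 when both spellings land on the same address.
 * stride must stay within -64..64 so the pointers remain inside buf
 * (or one past its end). */
static int same_step(ptrdiff_t stride)
{
    static const uint8_t buf[256];
    const uint8_t *base = buf + 128;

    return advance_int(base, stride) == advance_long(base, stride);
}
```

If the strides were plain int instead, 2L would only widen the multiply
on ABIs where long is wider than int, which is not the case for O32.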

> +    } while (h -= 2);
> +}
> +
> +static void put_h264_qpel16_h_lowpass_l2_mmi(uint8_t *dst, const uint8_t *src,
> +        const uint8_t *src2, ptrdiff_t dstStride, ptrdiff_t src2Stride)
> +{
> +    put_h264_qpel8_h_lowpass_l2_mmi(dst, src, src2, dstStride, src2Stride);
> +    put_h264_qpel8_h_lowpass_l2_mmi(dst + 8, src + 8, src2 + 8, dstStride,
> +            src2Stride);
> +
> +    src += 8 * dstStride;
> +    dst += 8 * dstStride;
> +    src2 += 8 * src2Stride;



[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I do not agree with what you have to say, but I'll defend to the death your
right to say it. -- Voltaire