[FFmpeg-devel] [PATCH 3/4] avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for VP9 idct functions

Ronald S. Bultje rsbultje at gmail.com
Thu Jul 16 17:11:35 CEST 2015


Hi,

On Thu, Jul 9, 2015 at 9:15 AM, <shivraj.patil at imgtec.com> wrote:

> +void ff_idct_idct_16x16_add_msa(uint8_t *dst, ptrdiff_t stride,
> +                                int16_t *block, int eob)
> +{
> +    vp9_idct16x16_colcol_addblk_msa(block, dst, stride);
> +    memset(block, 0, 16 * 16 * sizeof(*block));
> +}


This comment applies to all the code in this file: you're not using the eob
parameter anywhere. Admittedly, for the iadst variants the eob value is
generally quite high, so it won't help much there, but for idct_idct, eob
is typically low (possibly even 1), and you can use that to do
sub-idcts. Look at the C code for an example of a dc-only idct_idct, and look
at the x86 simd for examples of sub-idcts. They give great speedups on top
of the regular speedup expected from simd vectorization, especially for the
bigger transforms (16x16, 32x32).
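
To make the dc-only case concrete, here is a minimal scalar sketch of an
eob dispatch for the 16x16 idct_idct. It is modelled on the dc-only branch
of the C code, not on anything in the patch. The names
idct_idct_16x16_add_sketch, full_idct16x16_add, full_path_calls and clip_u8
are hypothetical, the full transform is only a stub, and the constants
(11585, the 14-bit and 6-bit rounding shifts) should be checked against the
C reference before relying on them. Sub-idcts for small eob > 1 are not
shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the full 16x16 transform (the patch calls
 * vp9_idct16x16_colcol_addblk_msa here). Stubbed out: it only counts calls
 * so the dispatch can be observed. */
static int full_path_calls;

static void full_idct16x16_add(int16_t *block, uint8_t *dst, ptrdiff_t stride)
{
    (void)block; (void)dst; (void)stride;
    full_path_calls++;
}

/* Clamp to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Sketch of an eob-aware 16x16 idct_idct add (assumed structure). */
void idct_idct_16x16_add_sketch(uint8_t *dst, ptrdiff_t stride,
                                int16_t *block, int eob)
{
    if (eob == 1) {
        /* Only the DC coefficient is non-zero, so each 1-D pass reduces to
         * a multiply by 11585 / 2^14 (about cos(pi/4)) with rounding.
         * Relies on arithmetic right shift of negative ints, as usual. */
        int t = (((block[0] * 11585 + (1 << 13)) >> 14) * 11585
                 + (1 << 13)) >> 14;
        int add = (t + (1 << 5)) >> 6;  /* final 6-bit rounding shift */
        int x, y;

        block[0] = 0;                   /* only one coefficient to clear */
        for (y = 0; y < 16; y++)
            for (x = 0; x < 16; x++)
                dst[y * stride + x] = clip_u8(dst[y * stride + x] + add);
        return;
    }

    /* General case: full transform, then clear the whole block. */
    full_idct16x16_add(block, dst, stride);
    memset(block, 0, 16 * 16 * sizeof(*block));
}
```

With a flat block of 100s and block[0] = 1024, the dc-only branch adds 8 to
every pixel without touching the full transform. A real MSA version would
presumably splat the value into a vector and do saturating adds per row.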

Ronald

