[FFmpeg-devel] [PATCH] avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for VP8 functions
Ronald S. Bultje
rsbultje at gmail.com
Mon Aug 3 20:46:18 CEST 2015
Hi,
On Mon, Jul 27, 2015 at 6:01 AM, <shivraj.patil at imgtec.com> wrote:
> +++ b/libavcodec/mips/vp8dsp_init_mips.c
>
Is there a reason the init code lives in a different file than the
implementations? It seems to me all symbols could be static if the init
code lived in the same file as the implementation. This isn't a big deal,
just wondering.
> +++ b/libavcodec/mips/vp8dsp_msa.c
>
Please split this file in 3, one for loopfilter, one for MC and one for
idct-related stuff. I agree the x86 code is a bad example in this respect,
but it's a good idea to improve this going forward.
> +void ff_put_vp8_epel4_h4_msa(uint8_t *dst, ptrdiff_t dst_stride,
> + uint8_t *src, ptrdiff_t src_stride,
> + int height, int mx, int my)
> +{
> + const int8_t *filter = subpel_filters_msa[mx - 1];
> +
> + if (2 == height) {
> + common_hz_4t_4x2_msa(src, src_stride, dst, dst_stride, filter);
>
Is this ever true? I don't think blocksize goes below 4 in either
dimension, ever.
> +void ff_put_vp8_epel4_v4_msa(uint8_t *dst, ptrdiff_t dst_stride,
> + uint8_t *src, ptrdiff_t src_stride,
> + int height, int mx, int my)
> +{
> + const int8_t *filter = subpel_filters_msa[my - 1];
> +
> + if (2 == height) {
> + common_vt_4t_4x2_msa(src, src_stride, dst, dst_stride, filter);
>
Same.
> +void ff_put_vp8_pixels8_msa(uint8_t *dst, ptrdiff_t dst_stride,
> + uint8_t *src, ptrdiff_t src_stride,
> + int height, int mx, int my)
> +{
> + int32_t cnt;
> + uint64_t out0, out1, out2, out3, out4, out5, out6, out7;
> + v16u8 src0, src1, src2, src3, src4, src5, src6, src7;
> +
> + if (0 == height % 12) {
>
Is this ever true? My impression was blocksize could ever only be a power
of two (4, 8, 16) in vp8.
> + } else if (0 == height % 2) {
>
Same as above, blocksize should never be 2. I don't think this code ever
executes.
> +void ff_put_vp8_pixels16_msa(uint8_t *dst, ptrdiff_t dst_stride,
> + uint8_t *src, ptrdiff_t src_stride,
> + int height, int mx, int my)
> +{
> + int32_t cnt;
> + v16u8 src0, src1, src2, src3, src4, src5, src6, src7;
> +
> + if (0 == height % 12) {
>
Same as above.
The remainder of the patch looks OK to me.
> Ronald
More information about the ffmpeg-devel
mailing list