[FFmpeg-devel] [PATCH] swscale/ppc: VSX-optimize non-full-chroma yuv2rgb_2

Lauri Kasanen cand at gmx.com
Thu Apr 11 09:07:38 EEST 2019


On Fri, 5 Apr 2019 11:41:19 +0300
Lauri Kasanen <cand at gmx.com> wrote:

> ./ffmpeg -f lavfi -i yuvtestsrc=duration=1:size=1200x1440 -sws_flags fast_bilinear \
>         -s 1200x720 -f null -vframes 100 -pix_fmt $i -nostats \
>         -cpuflags 0 -v error -
>
> Uses 32-bit multiplies, so POWER8 only.
>
> ~2x speedup (in each pair, the first line is before the patch, the second after):
>
> rgb24
>   24431 UNITS in yuv2packed2,   16384 runs,      0 skips
>   13783 UNITS in yuv2packed2,   16383 runs,      1 skips
> bgr24
>   24396 UNITS in yuv2packed2,   16384 runs,      0 skips
>   14059 UNITS in yuv2packed2,   16384 runs,      0 skips
> rgba
>   26815 UNITS in yuv2packed2,   16383 runs,      1 skips
>   12797 UNITS in yuv2packed2,   16383 runs,      1 skips
> bgra
>   27060 UNITS in yuv2packed2,   16384 runs,      0 skips
>   13138 UNITS in yuv2packed2,   16384 runs,      0 skips
> argb
>   26998 UNITS in yuv2packed2,   16384 runs,      0 skips
>   12728 UNITS in yuv2packed2,   16381 runs,      3 skips
> abgr
>   26651 UNITS in yuv2packed2,   16384 runs,      0 skips
>   13124 UNITS in yuv2packed2,   16384 runs,      0 skips
>
> This is a modest speedup, but the x86 MMX version also gets only ~2x. The MMX version
> is also quite inaccurate, while the VSX version is highly accurate.
>
> Signed-off-by: Lauri Kasanen <cand at gmx.com>
> ---
>  libswscale/ppc/swscale_vsx.c | 188 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 188 insertions(+)

Applying.
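
For reference, a small sketch (not part of the patch) that recomputes the
per-format speedup from the cycle counts quoted above. It assumes the first
line of each pair is the run without the patch and the second the run with
it; the name `speedup` is only illustrative.

```python
# Cycle counts ("UNITS in yuv2packed2") copied from the quoted message.
# Assumption: each tuple is (label, before patch, after patch).
RESULTS = [
    ("rgb24", 24431, 13783),
    ("bgr24", 24396, 14059),
    ("rgba", 26815, 12797),
    ("bgra", 27060, 13138),
    ("argb", 26998, 12728),
    ("6th entry", 26651, 13124),
]


def speedup(before, after):
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before / after


if __name__ == "__main__":
    ratios = []
    for label, before, after in RESULTS:
        ratio = speedup(before, after)
        ratios.append(ratio)
        print(f"{label:>10}: {ratio:.2f}x")
    print(f"{'mean':>10}: {sum(ratios) / len(ratios):.2f}x")
```

Under that assumption, the 24-bit formats come out a little below 2x and the
32-bit formats a little above it, which matches the "~2x" summary.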

- Lauri

