[FFmpeg-devel] [PATCH 1/6] x86: huffyuvdsp: port mmx add_bytes to yasm
James Almer
jamrial at gmail.com
Thu May 29 21:16:45 CEST 2014
On 29/05/14 2:37 PM, Christophe Gisquet wrote:
> +.1:
> + mova m0, [dstq + sizeq]
> + mova m1, [srcq + sizeq]
> + mova m2, [dstq + sizeq + mmsize]
> + mova m3, [srcq + sizeq + mmsize]
> + paddb m1, m0
> + paddb m3, m2
> + mova [dstq + sizeq], m1
> + mova [dstq + sizeq + mmsize], m3
> + add sizeq, 2*mmsize
> + jl .1
Why not something like this instead?
mova m0, [dstq + sizeq]
mova m1, [dstq + sizeq + mmsize]
paddb m0, [srcq + sizeq]
paddb m1, [srcq + sizeq + mmsize]
mova [dstq + sizeq], m0
mova [dstq + sizeq + mmsize], m1
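This uses the src loads as memory operands of paddb, so each iteration
needs two fewer instructions and two fewer registers. I didn't benchmark
it, but I assume it should be faster, and similar code is already used in
lavu's float_dsp.asm.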
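
For reference, both versions of the loop should compute the same thing: a byte-wise dst[i] += src[i] with wraparound, which is what paddb does per lane. A minimal scalar sketch in C follows. The name add_bytes_ref is hypothetical, and the pointer setup (advancing both pointers by size and counting a negative index up to zero) is an assumption inferred from the "add sizeq, 2*mmsize / jl .1" loop tail, since the quoted hunk does not show it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference, not the FFmpeg implementation:
 * adds src to dst byte by byte, wrapping modulo 256 like paddb. */
static void add_bytes_ref(uint8_t *dst, const uint8_t *src, ptrdiff_t size)
{
    assert(size >= 0);

    /* Assumed setup: point both buffers past their end and index with a
     * negative offset that counts up to zero, so the loop condition is a
     * plain sign check (the asm's "add sizeq, ... / jl .1"). */
    dst += size;
    src += size;

    for (ptrdiff_t i = -size; i < 0; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}
```

A reference like this can be used to check that the original and the shortened SIMD loop produce identical output on the same input buffers.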