[FFmpeg-devel] [PATCH] MMX2/SSSE3 VC1 loop filter
Ronald S. Bultje
rsbultje
Mon Jul 5 22:30:29 CEST 2010
Hi,
On Mon, Jul 5, 2010 at 1:44 AM, David Conrad <lessen42 at gmail.com> wrote:
> Updated to patch cleanly, compile, and added mmx/sse2 versions
[..]
> +SECTION_RODATA
> +pw_4: times 8 dw 4
> +pw_5: times 8 dw 5
Maybe use cextern pw_4 and cextern pw_5 (i.e. the constants already
defined in dsputil_mmx.c) instead of defining them again here?
> +; low, high (src), zero
> +%macro UNPACK2 4
> + mova m%2, m%3
> + punpckh%1 m%3, m%4
> + punpckl%1 m%2, m%4
> +%endmacro
This looks like a duplicate of SBUTTERFLY in x86util.asm; maybe use
that instead?
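For reference, a scalar C sketch of what the quoted UNPACK2 does when
instantiated with "bw" on an 8-byte MMX register. The function names are
made up for illustration and are not part of the patch or of x86util.asm.
It only shows that the macro is a low/high interleave pair against a
second register, which is why an existing interleave macro may be able to
replace it.

```c
#include <stdint.h>
#include <string.h>

/* Model of punpcklbw: interleave the low four bytes of a and b,
 * giving a0 b0 a1 b1 a2 b2 a3 b3. dst may alias a or b. */
static void punpcklbw_model(uint8_t dst[8], const uint8_t a[8],
                            const uint8_t b[8])
{
    uint8_t tmp[8];
    for (int i = 0; i < 4; i++) {
        tmp[2 * i]     = a[i];
        tmp[2 * i + 1] = b[i];
    }
    memcpy(dst, tmp, 8);
}

/* Model of punpckhbw: same, but taking the high four bytes. */
static void punpckhbw_model(uint8_t dst[8], const uint8_t a[8],
                            const uint8_t b[8])
{
    uint8_t tmp[8];
    for (int i = 0; i < 4; i++) {
        tmp[2 * i]     = a[4 + i];
        tmp[2 * i + 1] = b[4 + i];
    }
    memcpy(dst, tmp, 8);
}

/* Model of "UNPACK2 bw, low, high, zero": on entry `high` holds the
 * source; on exit `low` and `high` hold its two interleaved halves. */
static void unpack2_bw_model(uint8_t low[8], uint8_t high[8],
                             const uint8_t zero[8])
{
    memcpy(low, high, 8);               /* mova      low,  high */
    punpckhbw_model(high, high, zero);  /* punpckhbw high, zero */
    punpcklbw_model(low, low, zero);    /* punpcklbw low,  zero */
}
```

With an all-zero second register, each source byte ends up followed by a
zero byte, i.e. the bytes are widened to little-endian words.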
> +%macro STORE_4_WORDS_MMX 6
> + movd %6, %5
> +%if mmsize==16
> + psrldq %5, 4
> +%else
> + psrlq %5, 32
> +%endif
> + mov %1, %6w
> + shr %6, 16
> + mov %2, %6w
> + movd %6, %5
> + mov %3, %6w
> + shr %6, 16
> + mov %4, %6w
> +%endmacro
For the VP8 H loopfilter, I keep the two neighbouring pixels (p1/q1)
around and write all four pixels of each row out as a dword, using movd
straight from the mm register. Have you tried that, and if so, is it
faster? (I'm not asking you to rewrite it if you haven't.)
(I suppose this isn't very practical because of the SSE4 version below...)
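To make the comparison concrete, here is a scalar C sketch of both store
patterns. store_4_words_model follows the MMX branch of the quoted
STORE_4_WORDS_MMX step by step; store_row_dword is a hypothetical model of
the dword variant, assuming each row is laid out as p1 p0 q0 q1 around the
edge and only p0/q0 change. All names are invented for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Store the low 16 bits of v at dst, low byte first (as x86 does). */
static void store16(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xff);
    dst[1] = (uint8_t)((v >> 8) & 0xff);
}

/* Model of the MMX branch of STORE_4_WORDS_MMX: the four words of a
 * 64-bit register go to four separate addresses through a GPR. */
static void store_4_words_model(uint8_t *d0, uint8_t *d1, uint8_t *d2,
                                uint8_t *d3, uint64_t reg)
{
    uint32_t gpr = (uint32_t)reg;   /* movd  %6, %5  */
    reg >>= 32;                     /* psrlq %5, 32  */
    store16(d0, gpr);               /* mov   %1, %6w */
    gpr >>= 16;                     /* shr   %6, 16  */
    store16(d1, gpr);               /* mov   %2, %6w */
    gpr = (uint32_t)reg;            /* movd  %6, %5  */
    store16(d2, gpr);               /* mov   %3, %6w */
    gpr >>= 16;                     /* shr   %6, 16  */
    store16(d3, gpr);               /* mov   %4, %6w */
}

/* Hypothetical dword variant: one 4-byte store per row that rewrites
 * the unchanged neighbours p1/q1 together with the filtered p0/q0.
 * `edge` points at q0, the first pixel right of the edge. */
static void store_row_dword(uint8_t *edge, uint8_t p1, uint8_t p0,
                            uint8_t q0, uint8_t q1)
{
    const uint8_t px[4] = { p1, p0, q0, q1 };
    memcpy(edge - 2, px, 4);
}
```

Both variants leave the same bytes in memory; the difference is only the
number and width of the stores, so which one is faster has to be measured.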
> +%macro STORE_4_WORDS_SSE4 6
> + pextrw %1, %5, %6+0
> + pextrw %2, %5, %6+1
> + pextrw %3, %5, %6+2
> + pextrw %4, %5, %6+3
> +%endmacro
[..]
> +%macro VC1_H_LOOP_FILTER 1-2
> + movq m0, [r0 -4]
> + movq m1, [r0+ r1-4]
> + movq m2, [r0+2*r1-4]
> + movq m3, [r0+ r3-4]
> +%if %1 > 4
> + movq m4, [r4 -4]
> + movq m5, [r4+ r1-4]
> + movq m6, [r4+2*r1-4]
> + movq m7, [r4+ r3-4]
> + punpcklbw m0, m1
> + punpcklbw m2, m3
> + punpcklbw m4, m5
> + punpcklbw m6, m7
> + SWAP 1, 2
> + SWAP 2, 4
> + SWAP 3, 6
> + SBUTTERFLY wd, 0, 1, 4
> + SBUTTERFLY wd, 2, 3, 4
> + SBUTTERFLY dq, 0, 2, 4
> + SBUTTERFLY dq, 1, 3, 4
> +%else
> + SBUTTERFLY bw, 0, 1, 4
> + SBUTTERFLY bw, 2, 3, 4
> + SBUTTERFLY wd, 0, 2, 4
> + SBUTTERFLY wd, 1, 3, 4
> +%endif
Could TRANSPOSE4x4W / TRANSPOSE4x4B be used here instead?
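As a reminder of why a run of interleaves amounts to a transpose, here is
a scalar C sketch of a 4x4 byte transpose built from one round of byte
interleaves and one round of word interleaves on 8-byte registers. The
helper names are invented for this sketch; it is not the TRANSPOSE4x4B
macro itself, only the same idea under the assumption that each row sits
in the low four bytes of its register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of punpckl/punpckh on an 8-byte register: interleave elements of
 * `size` bytes (1 = bw, 2 = wd) from the low or high half of a and b. */
static void interleave(uint8_t dst[8], const uint8_t a[8],
                       const uint8_t b[8], int size, int high)
{
    uint8_t tmp[8];
    int n    = 4 / size;        /* elements taken from each source */
    int base = high ? 4 : 0;    /* which half of the sources to read */

    assert(size == 1 || size == 2);
    for (int i = 0; i < n; i++) {
        memcpy(tmp + (2 * i) * size,     a + base + i * size, size);
        memcpy(tmp + (2 * i + 1) * size, b + base + i * size, size);
    }
    memcpy(dst, tmp, 8);
}

/* Transpose a 4x4 byte block; only bytes 0..3 of each input row are used. */
static void transpose4x4b_model(uint8_t out[4][4], const uint8_t in[4][8])
{
    uint8_t ab[8], cd[8], lo[8], hi[8];

    interleave(ab, in[0], in[1], 1, 0);  /* a0 b0 a1 b1 a2 b2 a3 b3 */
    interleave(cd, in[2], in[3], 1, 0);  /* c0 d0 c1 d1 c2 d2 c3 d3 */
    interleave(lo, ab, cd, 2, 0);        /* a0 b0 c0 d0 a1 b1 c1 d1 */
    interleave(hi, ab, cd, 2, 1);        /* a2 b2 c2 d2 a3 b3 c3 d3 */

    memcpy(out[0], lo,     4);           /* column 0 of the input */
    memcpy(out[1], lo + 4, 4);           /* column 1 */
    memcpy(out[2], hi,     4);           /* column 2 */
    memcpy(out[3], hi + 4, 4);           /* column 3 */
}
```

The byte round pairs up rows, the word round pairs up the pairs, and each
resulting dword is one column of the input.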
> +cglobal vc1_h_loop_filter8_sse4, 3,5,8
Should this (and others like it) be under #ifdef X86_64? I got compile
errors when I tried to use xmm8-15 on x86_32.
Ronald