[FFmpeg-devel] [Patch]x86/hevc : new idct + ASM

James Almer jamrial at gmail.com
Wed Jun 4 18:00:55 CEST 2014


On 04/06/14 8:21 AM, Pierre Edouard Lepere wrote:
> +%endif
> +    lea             dstq, [dstq+strideq]

"add dstq, strideq" is probably faster.

[...]

> +INIT_XMM sse2
> +
> +TRANSFORM_DC_ADD 8, 8
> +TRANSFORM_DC_ADD 16, 8
> +
> +TRANSFORM_DC_ADD 8, 10
> +
> +INIT_MMX mmx

Needs to be mmxext, as I mentioned in my previous email.
You're using CLIPW, which expands to pmaxsw and pminsw, both integer SSE instructions (AKA mmxext).
Not to mention that with plain mmx, SPLATW would expand into four instructions instead of only one.
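
To illustrate what those instructions do per 16-bit lane, here is a rough scalar C model. It is only a sketch: the function names simply mirror the instruction mnemonics, the four-lane layout assumes a 64-bit MMX register, and clipw() is a hypothetical stand-in for the pmaxsw/pminsw pair mentioned above, not the actual macro from x86util.asm.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4 /* a 64-bit MMX register holds four 16-bit words */

/* pmaxsw: per-lane signed maximum, result written to dst */
static void pmaxsw(int16_t dst[LANES], const int16_t src[LANES])
{
    for (int i = 0; i < LANES; i++)
        if (src[i] > dst[i])
            dst[i] = src[i];
}

/* pminsw: per-lane signed minimum, result written to dst */
static void pminsw(int16_t dst[LANES], const int16_t src[LANES])
{
    for (int i = 0; i < LANES; i++)
        if (src[i] < dst[i])
            dst[i] = src[i];
}

/* Hypothetical model of a clip to [lo, hi]: one pmaxsw, then one pminsw */
static void clipw(int16_t v[LANES], const int16_t lo[LANES],
                  const int16_t hi[LANES])
{
    for (int i = 0; i < LANES; i++)
        assert(lo[i] <= hi[i]); /* the clamp range must be well formed */
    pmaxsw(v, lo);
    pminsw(v, hi);
}

/* pshufw: dst word i = src word selected by bits [2i+1:2i] of imm8.
 * An immediate of 0x00 therefore broadcasts word 0 to all four lanes,
 * which is why a word splat can be a single instruction on mmxext. */
static void pshufw(int16_t dst[LANES], const int16_t src[LANES], uint8_t imm)
{
    int16_t tmp[LANES];
    for (int i = 0; i < LANES; i++)
        tmp[i] = src[(imm >> (2 * i)) & 3];
    for (int i = 0; i < LANES; i++)
        dst[i] = tmp[i];
}
```

None of pmaxsw, pminsw or pshufw exists in the original MMX instruction set, so a function that relies on them has to be declared for mmxext or later.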

No more comments from me. Maybe wait a bit for someone else to comment on the new idct or such
before resending the patch with the above changes.

