[FFmpeg-devel] [Patch]x86/hevc : new idct + ASM
jamrial at gmail.com
Wed Jun 4 18:00:55 CEST 2014
On 04/06/14 8:21 AM, Pierre Edouard Lepere wrote:
> + lea dstq, [dstq+strideq]
"add dstq, strideq" is probably faster.
> +INIT_XMM sse2
> +TRANSFORM_DC_ADD 8, 8
> +TRANSFORM_DC_ADD 16, 8
> +TRANSFORM_DC_ADD 8, 10
> +INIT_MMX mmx
Needs to be mmxext, as I mentioned in my previous email.
You're using CLIPW, which expands to pmaxsw and pminsw, both integer SSE instructions (AKA, mmxext).
Not to mention that with plain mmx, SPLATW would expand into four instructions instead of only one.
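
For anyone following along, here is a rough scalar sketch in C of what the pmaxsw/pminsw pair does to each signed 16-bit word. It is only an illustration: the function names are made up for this example and are not part of FFmpeg, and the max-then-min order is my assumption about how the macro is laid out.

```c
#include <stdint.h>

/* Scalar model of one 16-bit lane of pmaxsw: signed maximum. */
static int16_t pmaxsw_lane(int16_t a, int16_t b)
{
    return a > b ? a : b;
}

/* Scalar model of one 16-bit lane of pminsw: signed minimum. */
static int16_t pminsw_lane(int16_t a, int16_t b)
{
    return a < b ? a : b;
}

/* Hypothetical helper: clamp n words to [lo, hi] by taking the signed
 * maximum against lo and then the signed minimum against hi, which is
 * the assumed order of the pmaxsw/pminsw sequence. */
static void clipw_model(int16_t *v, int n, int16_t lo, int16_t hi)
{
    for (int i = 0; i < n; i++)
        v[i] = pminsw_lane(pmaxsw_lane(v[i], lo), hi);
}
```

With lo = 0 and hi = 255 this is the usual clamp to the 8-bit pixel range. The real instructions do the same thing on every word of the register at once.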
No more comments from me. Maybe wait a bit for someone else to comment on the new idct or such
before resending the patch with the above changes.