[FFmpeg-devel] [PATCH 0/2] x86: hevc_mc: use proxy functions

Christophe Gisquet christophe.gisquet at gmail.com
Sat Oct 4 10:28:08 CEST 2014


Hi,

2014-10-04 1:19 GMT+02:00 James Almer <jamrial at gmail.com>:
> Or how everything is declared as sse4 even though less than half the code
> actually uses sse4 instructions.

The incorrect insn patch actually caught another issue in qpel_hv
(where packusdw is required), which made the macro mess even
messier.
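
For readers unfamiliar with the instruction, here is a scalar C model of
what one lane of packusdw computes, next to the signed pack that SSE2
already offers. The helper names are made up for illustration and are
not from the patch. The difference in saturation range is presumably
why a function relying on it cannot simply be relabelled as ssse3.

```c
#include <stdint.h>

/* Scalar model of one lane of packusdw (SSE4.1): signed 32-bit input,
 * unsigned 16-bit output, saturated to [0, 65535].
 * Hypothetical helper name, for illustration only. */
static uint16_t pack_usdw_lane(int32_t v)
{
    if (v < 0)
        return 0;
    if (v > 65535)
        return 65535;
    return (uint16_t)v;
}

/* Scalar model of one lane of packssdw (available with SSE2): signed
 * 32-bit input, signed 16-bit output, saturated to [-32768, 32767]. */
static int16_t pack_ssdw_lane(int32_t v)
{
    if (v < -32768)
        return -32768;
    if (v > 32767)
        return 32767;
    return (int16_t)v;
}
```

An unsigned intermediate above 32767 survives the first pack but is
clipped by the second, so an SSE2/SSSE3 version would need extra
instructions to get the same result.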

> You already tried to deal with this in "x86: hevc_mc: port to SSSE3 v2", but
> it got blocked by one of the patches that broke x86_32. Maybe it's worth
> looking at again.

Well, I've tried this, but:
- you still need sse4 for weighted prediction (WP) - maybe not that
useful now, but that was a 10% hit
- having sse4 versions where needed, even when reusing the ssse3
functions, increased the total object size to nearly 500K, hence this
patchset
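
As a rough illustration of the proxying idea, and only under the
assumption that a proxy is a thin wrapper that reuses a narrower kernel
instead of instantiating a full function per block width, a C sketch
could look like the following. The names put_pixels8 and put_pixels16
and the arithmetic are stand-ins, not the code from the patchset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an 8-pixel-wide asm kernel: widens 8 pixels
 * per row into a 16-bit destination. Strides are in elements. */
static void put_pixels8(int16_t *dst, ptrdiff_t dststride,
                        const uint8_t *src, ptrdiff_t srcstride,
                        int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < 8; x++)
            dst[x] = (int16_t)(src[x] << 6);
        dst += dststride;
        src += srcstride;
    }
}

/* Proxy for width 16: no new kernel, just two calls to the 8-wide one
 * on adjacent column blocks. */
static void put_pixels16(int16_t *dst, ptrdiff_t dststride,
                         const uint8_t *src, ptrdiff_t srcstride,
                         int height)
{
    for (int x = 0; x < 16; x += 8)
        put_pixels8(dst + x, dststride, src + x, srcstride, height);
}
```

The wider entry point costs only a small wrapper rather than another
full copy of the kernel, which is the kind of trade-off that keeps the
object size down.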

> Hell, quite a few are sse2, even, but the macros are kinda messy and it's
> much easier declaring everything as ssse3/sse4 than micromanaging stuff.

Indeed. I bet most of the sse2 versions are for the 10+ bit versions. I
wouldn't have expected main10 to gain such wide acceptance, given how
avc's "hi10p" fared, but it did, so people may find this sufficiently
desirable to invest what is needed.

In the end, all of this (clean proxying, ssse3/sse2) is what one would
want in order to have neat code, but apart from avx2, none of it would
matter to a non-negligible percentage of ffmpeg's users. One might even
argue that having 32-bit asm would be more important.

-- 
Christophe

