[FFmpeg-devel] [PATCH 1/7] x86: hevc_mc: add AVX2 optimizations
christophe.gisquet at gmail.com
Fri Feb 6 08:41:24 CET 2015
2015-02-06 1:15 GMT+01:00 James Almer <jamrial at gmail.com>:
> On 05/02/15 4:20 PM, Christophe Gisquet wrote:
>> From: plepere <pierre-edouard.lepere at insa-rennes.fr>
> This should probably be changed to Pierre Edouard Lepere.
Yeah, I amended with --author=lepere and that's what I got: it's what
appears in 9ba6b17add2, 942e22c651, 92cccb7bcd and in fact any he
But he's no longer in INSA, so the mail part is indeed less relevant.
>> +%if cpuflag(avx2) && (%0 == 3)
>> + vextracti128 xm10, m0, 1
>> + vinserti128 m10, m1, xm10, 0
>> + vinserti128 m0, m0, xm1, 1
>> + mova m1, m10
>> + vextracti128 xm10, m2, 1
>> + vinserti128 m10, m3, xm10, 0
>> + vinserti128 m2, m2, xm3, 1
>> + mova m3, m10
>> + vextracti128 xm10, m4, 1
>> + vinserti128 m10, m5, xm10, 0
>> + vinserti128 m4, m4, xm5, 1
>> + mova m5, m10
>> + vextracti128 xm10, m6, 1
>> + vinserti128 m10, m7, xm10, 0
>> + vinserti128 m6, m6, xm7, 1
>> + mova m7, m10
> I didn't check but i think these can be simplified using vperm2i128.
> It can be done in a separate patch anyway.
I'd prefer that, because I don't know avx2, so I can neither apply your
comments nor review the result.
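
For reference, the lane movement under discussion can be checked with a
small model. The sketch below is only an illustration in Python, not part
of the patch: it treats a ymm register as a (low, high) pair of 128-bit
lanes, and the helper names swap_patch and swap_vperm are made up here.
It assumes the documented semantics of vextracti128, vinserti128 and
vperm2i128, and suggests that each four-instruction group could be
expressed as two vperm2i128 (immediates 0x31 and 0x20) plus a move.
Whether that is actually faster would need benchmarking.

```python
# Toy model: a 256-bit ymm register is a (low_lane, high_lane) tuple,
# each lane standing for 128 bits. Only lane movement is modelled.

def vextracti128(src, imm):
    """Return the 128-bit lane of src selected by bit 0 of imm."""
    return src[imm & 1]

def vinserti128(src1, xmm, imm):
    """Copy src1, then overwrite the lane selected by bit 0 of imm with xmm."""
    lanes = list(src1)
    lanes[imm & 1] = xmm
    return tuple(lanes)

def vperm2i128(src1, src2, imm8):
    """Build each destination lane from any lane of src1/src2.

    imm8[1:0] selects the low lane and imm8[5:4] the high lane
    (0 = src1 low, 1 = src1 high, 2 = src2 low, 3 = src2 high);
    bits 3 and 7 zero the low and high lane respectively.
    """
    pool = (src1[0], src1[1], src2[0], src2[1])
    low = 0 if imm8 & 0x08 else pool[imm8 & 3]
    high = 0 if imm8 & 0x80 else pool[(imm8 >> 4) & 3]
    return (low, high)

def swap_patch(m0, m1):
    """The four-instruction sequence from the patch, for one register pair."""
    xm10 = vextracti128(m0, 1)        # vextracti128 xm10, m0, 1
    m10 = vinserti128(m1, xm10, 0)    # vinserti128  m10, m1, xm10, 0
    m0 = vinserti128(m0, m1[0], 1)    # vinserti128  m0, m0, xm1, 1
    m1 = m10                          # mova         m1, m10
    return m0, m1

def swap_vperm(m0, m1):
    """Hypothetical replacement: two vperm2i128 plus a move."""
    m10 = vperm2i128(m0, m1, 0x31)    # high lanes of m0 and m1
    m0 = vperm2i128(m0, m1, 0x20)     # low lanes of m0 and m1
    m1 = m10
    return m0, m1
```

In both variants m0 ends up holding the two low lanes and m1 the two high
lanes, so the temporary m10 is still needed because m0 is overwritten
before m1 is produced.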
One thing you may also look at is that QPEL_HV lacks the shuffling that
QPEL has for 8 bits. Consequently, there's no qpel_hv 16-wide avx2
version. That may also explain why OpenHEVC didn't get much speed
improvement from avx2 on 8 bits.
I don't know if it is feasible.
> It would be nice if all this was compressed to a couple macros like with SSE4. But that's
> cosmetics and not a blocker.
Yeah, I did tell myself it was for another patch.
> Should be ok if it passes fate
I think both you and Mickael validated it.
> and compiles with yasm <= 1.1.0 (there are C wrappers
> and those usually need more strict checks for HAVE_AVX2_EXTERNAL because dead code
> elimination doesn't seem to trigger until after pre-processing is done).
Is it equivalent to setting HAVE_AVX2_EXTERNAL to 0/!yes in config.*?
Because doing so results in no avx2 function and no link issue, as
should be the case, I guess.