[FFmpeg-devel] avfilter/x86/vf_blend : add avx2 for 8b func (v2)

Martin Vignali martin.vignali at gmail.com
Wed Jan 17 22:13:49 EET 2018


Hello,


New patch attached, with modifications to average, grain extract, multiply,
screen and grain merge.


-- blend Average --
Prev patch :
average_c: 15605.4
average_sse2: 1205.9
average_avx2: 772.4

New patch :
average_c: 15604.4
average_sse2: 490.9
average_avx2: 265.2

With 3-operand pxor (on AVX), using :
%if cpuflag(avx)
    pxor m0, m2, [topq + xq]
    pxor m1, m2, [bottomq + xq]
%else
    movu           m0, [topq + xq]
    movu           m1, [bottomq + xq]
    pxor           m0, m2
    pxor           m1, m2
%endif

average_c: 15615.5
average_sse2: 456.2
average_avx: 553.7
average_avx2: 387.0
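
For readers following the pxor lines above, here is a small C sketch of why XORing both inputs before an average can work. It rests on assumptions that are not stated in this mail: that m2 holds all-ones (so pxor acts as a bitwise NOT), that the XORed values are then fed to a rounding-up byte average such as pavgb, and that the result is XORed with m2 once more. Under those assumptions the result equals the truncating average (a + b) >> 1. The function names are made up for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding-up byte average, the per-lane behaviour of x86 pavgb. */
uint8_t avg_round_up(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Truncating average built from the rounding-up one:
 * NOT both inputs, average, NOT the result.
 * Assumption: m2 in the asm is all-ones, so "pxor x, m2" is NOT x. */
uint8_t avg_floor_via_xor(uint8_t a, uint8_t b)
{
    uint8_t na = (uint8_t)(a ^ 0xFF);
    uint8_t nb = (uint8_t)(b ^ 0xFF);
    return (uint8_t)(avg_round_up(na, nb) ^ 0xFF);
}

/* Count the (a, b) pairs for which the trick differs from (a + b) >> 1. */
int count_mismatches(void)
{
    int bad = 0;
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            if (avg_floor_via_xor((uint8_t)a, (uint8_t)b) != ((a + b) >> 1))
                bad++;
    return bad;
}
```

Calling count_mismatches() walks all 65536 input pairs, so it can serve as a quick check of the identity independent of any SIMD code.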


For grain extract, multiply, screen and grain merge, each loop iteration now
processes mmsize bytes (instead of mmsize / 2) :

-- Grain extract --
Prev :
grainextract_c: 22182.9
grainextract_sse2: 1158.9
grainextract_avx2: 777.6

New :
grainextract_c: 22206.5
grainextract_sse2: 964.8
grainextract_avx2: 485.3

-- Multiply --
Prev :
multiply_c: 41347.8
multiply_sse2: 1376.0
multiply_avx2: 840.0

New :
multiply_c: 40432.5
multiply_sse2: 1248.0
multiply_avx2: 635.0

-- Screen --
Prev :
screen_c: 21635.8
screen_sse2: 1801.5
screen_avx2: 1069.8

New :
screen_c: 21617.0
screen_sse2: 1625.7
screen_avx2: 840.2

-- Grain merge --
Prev :
grainmerge_c: 25233.5
grainmerge_sse2: 1158.0
grainmerge_avx2: 775.7

New :
grainmerge_c: 25246.7
grainmerge_sse2: 967.4
grainmerge_avx2: 487.7
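
The "mmsize per loop instead of mmsize / 2" change can be modelled in plain C. This is only a sketch of the loop structure, not the asm from the patch. It assumes that the 8-bit inputs are widened to 16-bit lanes for the arithmetic, so that one register of bytes yields a low half and a high half which are both processed and then packed back with unsigned saturation. BLOCK is a hypothetical stand-in for mmsize, the function names are illustrative, and grain merge is taken to be clip(top + bottom - 128).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK 16 /* hypothetical stand-in for mmsize */

/* Saturate an intermediate to 0..255, as packuswb does per lane. */
uint8_t pack_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
}

/* Scalar reference: one pixel at a time. */
void grainmerge_ref(const uint8_t *top, const uint8_t *bottom,
                    uint8_t *dst, size_t width)
{
    for (size_t x = 0; x < width; x++)
        dst[x] = pack_u8(top[x] + bottom[x] - 128);
}

/* Each iteration consumes BLOCK input bytes: the low and the high half
 * are widened to 16 bits, processed, then packed into BLOCK output bytes.
 * Assumes width is a multiple of BLOCK. */
void grainmerge_full_width(const uint8_t *top, const uint8_t *bottom,
                           uint8_t *dst, size_t width)
{
    for (size_t x = 0; x + BLOCK <= width; x += BLOCK) {
        int16_t lo[BLOCK / 2], hi[BLOCK / 2];
        for (size_t i = 0; i < BLOCK / 2; i++) {
            lo[i] = (int16_t)(top[x + i] + bottom[x + i] - 128);
            hi[i] = (int16_t)(top[x + BLOCK / 2 + i]
                              + bottom[x + BLOCK / 2 + i] - 128);
        }
        for (size_t i = 0; i < BLOCK / 2; i++) {
            dst[x + i]             = pack_u8(lo[i]);
            dst[x + BLOCK / 2 + i] = pack_u8(hi[i]);
        }
    }
}

/* Compare both versions on a deterministic buffer; returns the number
 * of differing output bytes. */
int grainmerge_mismatches(void)
{
    enum { W = 4 * BLOCK };
    uint8_t top[W], bottom[W], ref[W], out[W];
    int bad = 0;
    for (int i = 0; i < W; i++) {
        top[i]    = (uint8_t)(i * 37);
        bottom[i] = (uint8_t)(255 - i * 11);
    }
    grainmerge_ref(top, bottom, ref, W);
    grainmerge_full_width(top, bottom, out, W);
    for (int i = 0; i < W; i++)
        bad += ref[i] != out[i];
    return bad;
}
```

With the previous structure, only BLOCK / 2 bytes would be loaded and stored per iteration, so the same row needs twice as many iterations. The model above shows that handling both halves in one iteration yields the same output as the scalar reference.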


Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-avfilter-x86-vf_blend-avfilter-x86-vf_blend-add-AVX2.patch
Type: application/octet-stream
Size: 11957 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20180117/48224057/attachment.obj>

