[FFmpeg-devel] [PATCH] SSE2 version of vf_idet's filter_line()

Pascal Massimino pascal.massimino at gmail.com
Wed Sep 3 08:38:28 CEST 2014


On Tue, Sep 2, 2014 at 10:26 PM, Reimar Döffinger <Reimar.Doeffinger at gmx.de> wrote:

> On 03.09.2014, at 00:49, Pascal Massimino <pascal.massimino at gmail.com>
> wrote:
> > On Tue, Sep 2, 2014 at 9:39 AM, Michael Niedermayer <michaelni at gmx.at>
> > wrote:
> >
> >
> > [ahem: ffmpeg doesn't feel like using intrinsics, by chance?]
> I tried that about 5 months back, once more.
> It still results in code that is slower than the plain C version, even
> when using SIMD, on trivial NEON audio format conversion (same thing in asm
> was about 8x faster).
> So you can get the same effect with less effort by just disabling
> asm code.

Strange. I used intrinsics exclusively for libwebp (x86, but also
neon/aarch64) and was pretty pleased with the result (say <2% perf loss,
but 10x easier maintenance and friendliness to non-guru contributors).
Agreed, the coding style is weird, with all these 'const __m128i var = ...'.
My 2c.
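
For readers who have not seen that style, here is a minimal sketch of
what SSE2 intrinsics code tends to look like. It is not taken from
libwebp or from vf_idet; the helper name abs_diff_sum16 is made up for
illustration, and the plain C fallback is only there so the snippet
builds on targets without SSE2.

```c
#include <stdint.h>
#include <stdlib.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: returns the sum of |a[i] - b[i]| over 16 bytes. */
static int abs_diff_sum16(const uint8_t *a, const uint8_t *b)
{
#if defined(__SSE2__)
    /* Unaligned loads, so callers need no special alignment. */
    const __m128i va  = _mm_loadu_si128((const __m128i *)a);
    const __m128i vb  = _mm_loadu_si128((const __m128i *)b);
    /* PSADBW leaves one partial sum in each 64-bit half. */
    const __m128i sad = _mm_sad_epu8(va, vb);
    /* Move the upper half down and add it to the lower half. */
    const __m128i hi  = _mm_srli_si128(sad, 8);
    const __m128i sum = _mm_add_epi32(sad, hi);
    return _mm_cvtsi128_si32(sum);
#else
    /* Plain C fallback with the same result. */
    int i, sum = 0;
    for (i = 0; i < 16; i++)
        sum += abs(a[i] - b[i]);
    return sum;
#endif
}
```

Every intermediate value gets its own 'const __m128i' name, which is
where the unusual look comes from; the compiler is left to do the
register allocation and scheduling that hand-written asm does explicitly.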


