[FFmpeg-devel] [PATCH] SSE2 version of vf_idet's filter_line()

James Almer jamrial at gmail.com
Wed Sep 3 08:55:59 CEST 2014

On 03/09/14 3:38 AM, Pascal Massimino wrote:
> Hi,
> On Tue, Sep 2, 2014 at 10:26 PM, Reimar Döffinger <Reimar.Doeffinger at gmx.de>
> wrote:
>> On 03.09.2014, at 00:49, Pascal Massimino <pascal.massimino at gmail.com>
>> wrote:
>>> On Tue, Sep 2, 2014 at 9:39 AM, Michael Niedermayer <michaelni at gmx.at>
>>> wrote:
>>> [ahem: ffmpeg doesn't feel like using intrinsics, by chance?]
>> I tried that about 5 months back, once more.
>> It still results in code that is slower than the plain C version, even
>> when using SIMD, on trivial NEON audio format conversion (same thing in asm
>> was about 8x faster).
>> So you can get the same effect with less effort by just disabling
>> asm code.
> strange. I exclusively used intrinsics for libwebp (x86, but also
> neon/aarch64) and was pretty
> pleased with the result (say <2% perf loss, but 10x easier maintenance and
> friendliness to non-guru contributors).
> Agreed, coding style is weird, with all these 'const __m128i var = ...',
> but...
> My 2c.
> /skal

Adding to Michael's reasons, there's also the fact that you need to compile
the files with instruction-set-specific flags, which would be a PITA to handle
from the build system.
GCC 4.8 and above added some features to work around this, but we support other
compilers as well as older GCC versions.
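
To illustrate the kind of workaround meant here, below is a minimal sketch that assumes a reasonably recent GCC or Clang on x86. It is not code from the patch or from FFmpeg: sad_row(), sad_row_sse2() and sad_row_c() are hypothetical names, and the kernel is a plain sum of absolute differences chosen only to keep the example short. The target attribute enables SSE2 for a single function, so the file itself needs no -msse2 flag, and __builtin_cpu_supports() selects the path at run time. Other compilers and older GCC versions take the C fallback, which is exactly the portability problem described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Portable reference version: sum of absolute differences of two byte rows. */
static unsigned sad_row_c(const uint8_t *a, const uint8_t *b, size_t w)
{
    unsigned sum = 0;
    for (size_t x = 0; x < w; x++)
        sum += (unsigned)abs(a[x] - b[x]);
    return sum;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <emmintrin.h>
#define HAVE_SSE2_SKETCH 1

/* Hypothetical helper: only this function is compiled with SSE2 enabled,
 * so no instruction-set flag is needed on the command line
 * (assumption: the compiler accepts intrinsics under a target attribute). */
__attribute__((target("sse2")))
static unsigned sad_row_sse2(const uint8_t *a, const uint8_t *b, size_t w)
{
    __m128i acc = _mm_setzero_si128();
    size_t x = 0;

    /* 16 pixels per iteration; psadbw yields two 64-bit partial sums. */
    for (; x + 16 <= w; x += 16) {
        const __m128i va = _mm_loadu_si128((const __m128i *)(a + x));
        const __m128i vb = _mm_loadu_si128((const __m128i *)(b + x));
        acc = _mm_add_epi64(acc, _mm_sad_epu8(va, vb));
    }

    /* Fold the low and high halves of the accumulator. */
    unsigned sum = (unsigned)_mm_cvtsi128_si32(acc) +
                   (unsigned)_mm_cvtsi128_si32(_mm_srli_si128(acc, 8));

    /* Scalar tail for the remaining w % 16 pixels. */
    for (; x < w; x++)
        sum += (unsigned)abs(a[x] - b[x]);
    return sum;
}
#endif

/* Run-time dispatch between the SSE2 and the C version. */
unsigned sad_row(const uint8_t *a, const uint8_t *b, size_t w)
{
#ifdef HAVE_SSE2_SKETCH
    if (__builtin_cpu_supports("sse2"))
        return sad_row_sse2(a, b, w);
#endif
    return sad_row_c(a, b, w);
}
```

Every compiler that lacks these extensions would still need per-file flags or the plain C path, so this does not remove the build system issue for us.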
