[FFmpeg-devel] [PATCH 0/3] showcqt x86 optimization using intrinsic
Reimar.Doeffinger at gmx.de
Thu Mar 10 20:28:33 CET 2016
On 10.03.2016, at 12:01, Ismail Donmez <ismail at i10z.com> wrote:
> On Thu, Mar 10, 2016 at 12:04 PM, wm4 <nfxjfg at googlemail.com> wrote:
>> On Thu, 10 Mar 2016 16:53:12 +0700
>> Muhammad Faiz <mfcc64 at gmail.com> wrote:
>>> I use intrinsic because writing asm using nasm or inline asm
>>> is difficult task for me.
>>> [PATCH 1/3] configure: add x86 intrinsic support
>>> [PATCH 2/3] avfilter/avf_showcqt: cqt_calc x86 optimization
>>> [PATCH 3/3] avfilter/avf_showcqt: draw_bar x86 optimization
>> We generally don't accept intrinsic in ffmpeg.
> Given this policy has roots from gcc 2.x times, it might be a good
> idea to discuss it again in the context of gcc5 and clang 3.8 and
I think the last time I tried it, on some gcc 4.x, the intrinsics generated code that was significantly slower than the non-SIMD code (asm was about 4x faster, and it was a trivial raw audio format conversion loop).
So from my point of view, with intrinsics you still have to expect a > 4x performance variation, which for me is at the "it might be better to just not optimize at all" level.
Maybe in another 5 years... But honestly, it seems to me autovectorization might get there before intrinsics do...
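
For illustration only, a loop of the kind meant above might look like the sketch below: a signed 16-bit to float conversion, once as plain C and once with SSE2 intrinsics. This is not the code that was actually measured; the function names, the s16-to-float choice and the scale factor are assumptions made for the example. When SSE2 is not available at compile time, the second function simply falls back to the scalar loop.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical scalar reference: signed 16-bit PCM to float in [-1, 1). */
static void s16_to_flt_c(float *dst, const int16_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * (1.0f / 32768.0f);
}

/* Hypothetical intrinsics version of the same loop (8 samples per step). */
static void s16_to_flt_simd(float *dst, const int16_t *src, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128 scale = _mm_set1_ps(1.0f / 32768.0f);
    for (; i + 8 <= n; i += 8) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        /* Sign-extend 16 -> 32 bit: pair each sample with itself,
         * then arithmetic shift right by 16. */
        __m128i lo = _mm_srai_epi32(_mm_unpacklo_epi16(v, v), 16);
        __m128i hi = _mm_srai_epi32(_mm_unpackhi_epi16(v, v), 16);
        _mm_storeu_ps(dst + i,     _mm_mul_ps(_mm_cvtepi32_ps(lo), scale));
        _mm_storeu_ps(dst + i + 4, _mm_mul_ps(_mm_cvtepi32_ps(hi), scale));
    }
#endif
    /* Scalar tail, and the whole loop when SSE2 is not compiled in. */
    for (; i < n; i++)
        dst[i] = src[i] * (1.0f / 32768.0f);
}
```

Both functions should produce identical output, since every step is exact in single precision. How fast the intrinsics version ends up is left to the compiler's register allocation and scheduling, which is exactly the part that varies between compiler versions.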