[FFmpeg-devel] [PATCH] avfilter: add hflip x86 SIMD

James Almer jamrial at gmail.com
Sun Dec 3 20:41:09 EET 2017


On 12/3/2017 3:09 PM, Martin Vignali wrote:
>> 2017-12-03 17:46 GMT+01:00 Paul B Mahol <onemda at gmail.com>:
>>
>>> On 12/3/17, Martin Vignali <martin.vignali at gmail.com> wrote:
>>>> Hello,
>>>>
>>>> Maybe you can use a macro for the byte and short versions;
>>>> only a few lines differ between them.
>>>
>>> Sure, feel free to send patches.
>>>
>>> I'm not very macro proficient.
>>>
>>
>> OK, I will take a look.
>>
>> Martin
>>
> 
> I wrote a basic checkasm test. It seems the byte version is slower than C:
> 
> hflip_byte_c: 31.8
> hflip_byte_ssse3: 108.1
> hflip_short_c: 300.1
> hflip_short_ssse3: 139.8
> 
> (checkasm patch attached if you want to test)
> 
> Martin

$ tests/checkasm/checkasm.exe --test=vf_hflip --bench
benchmarking with native FFmpeg timers
nop: 32.0
hflip_byte_c: 362.0
hflip_byte_ssse3: 96.0
hflip_short_c: 374.0
hflip_short_ssse3: 121.0

Guess your compiler is really good at optimizing this code, or something
funny is going on.
Can you post a disassembly of hflip_byte_c?
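
For reference while comparing disassemblies, here is a minimal sketch of what
a scalar horizontal flip looks like. It is not copied from the patch: the
names hflip_byte_sketch and hflip_short_sketch, the DEFINE_HFLIP macro, and
the convention that src points at the last pixel of the row are assumptions
for illustration only. The macro is a C analogy of the suggestion above to
share the byte and short versions; the actual proposal concerns the x86
assembly.

```c
#include <stdint.h>

/*
 * Hypothetical helper: generates one scalar hflip routine per pixel type.
 * Assumed convention: `src` points at the LAST pixel of the input row,
 * `dst` at the first pixel of the output row, and `w` is the width in pixels.
 */
#define DEFINE_HFLIP(name, type)                          \
    static void name(const type *src, type *dst, int w)   \
    {                                                     \
        for (int j = 0; j < w; j++)                       \
            dst[j] = src[-j]; /* walk the source backwards */ \
    }

DEFINE_HFLIP(hflip_byte_sketch,  uint8_t)   /* 8-bit samples  */
DEFINE_HFLIP(hflip_short_sketch, uint16_t)  /* 16-bit samples */
```

A loop of this shape is simple enough that an optimizing compiler may
vectorize it on its own, which would be one possible explanation for a fast
C result; the disassembly would confirm or rule that out.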

