[FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

James Almer jamrial at gmail.com
Wed Feb 25 18:25:50 CET 2015


On 25/02/15 9:41 AM, Ronald S. Bultje wrote:
> Hi,
> 
> On Tue, Feb 24, 2015 at 8:05 PM, James Almer <jamrial at gmail.com> wrote:
>>
>> +#if HAVE_FAST_POPCNT
>> +#if AV_GCC_VERSION_AT_LEAST(4,5)
>> +#ifndef av_popcount
>> +    #define av_popcount   __builtin_popcount
>> +#endif /* av_popcount */
>> +#if HAVE_FAST_64BIT
>> +#ifndef av_popcount64
>> +    #define av_popcount64 __builtin_popcountll
>> +#endif /* av_popcount64 */
>> +#endif /* HAVE_FAST_64BIT */
>> +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */
>> +#endif /* HAVE_FAST_POPCNT */
>>
> 
> Is this just to get the sse4 popcnt instruction if we compile with
> -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet
> configure still does an arch/cpu check. I'd expect the built-in/compiler to
> do that for us based on -mcpu, and we could always unconditionally use this
> (as long as gcc >= 4.5); alternatively, you could use inline asm and then
> have the configure check (HAVE_FAST_POPCNT). But doing both seems a little
> odd. I have no objection to it, patch is still fine, just odd.
> 
> Ronald

I purposely added both the gcc 4.5 check and the configure check for CPUs that
support popcnt, because otherwise __builtin_popcount (at least gcc's) is slower
than our generic av_popcount_c function from lavu/common.h.
When the CPU supports popcnt, the builtin becomes a single inlined instruction.
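
To illustrate the double check described above, here is a minimal sketch rather
than the attached patch. It assumes the compiler-defined __POPCNT__ macro (set by
gcc and clang when the target enables popcnt) as a stand-in for configure's
HAVE_FAST_POPCNT. The names popcount_generic and example_popcount are
hypothetical, and the fallback is a textbook bit-counting routine, not a copy of
av_popcount_c.

```c
#include <stdint.h>

/* Portable fallback: counts set bits with parallel partial sums.
 * Hypothetical stand-in for a generic C implementation. */
static inline int popcount_generic(uint32_t x)
{
    x -= (x >> 1) & 0x55555555u;                      /* 2-bit sums  */
    x  = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums  */
    x  = (x + (x >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit sums  */
    return (int)((x * 0x01010101u) >> 24);             /* add 4 bytes */
}

/* Use the builtin only when the target is known to have popcnt;
 * otherwise the builtin may expand to a slower library call.
 * Assumption: __POPCNT__ replaces the configure-time check here. */
#if defined(__GNUC__) && defined(__POPCNT__)
#   define example_popcount(x) __builtin_popcount(x)
#else
#   define example_popcount(x) popcount_generic(x)
#endif
```

Built with something like -mpopcnt, example_popcount should reduce to the single
instruction; without it, the portable routine is used instead.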

I tried the __asm__ approach, but the code generated by the builtin seemed better.
And I agree it looks odd and maybe way too specific, which is why I said I can add
this to a new header in the x86/ folder instead.

Patch attached. I don't have clang, so I can't test it, nor do I know how to check
for a version that supports the builtin.
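
One possible way to handle the clang question is feature detection instead of a
version check. The sketch below is untested against FFmpeg's headers and relies
on clang's __has_builtin extension (newer gcc releases provide it as well). The
names popcount_fallback and example_popcount are hypothetical. It only detects
that the builtin exists; it does not tell whether the target CPU has popcnt.

```c
#include <stdint.h>

/* Compilers without __has_builtin get a definition that always
 * answers "no", so the check below still compiles. */
#ifndef __has_builtin
#   define __has_builtin(x) 0
#endif

/* Hypothetical fallback: clear the lowest set bit until none remain. */
static inline int popcount_fallback(uint32_t x)
{
    int n = 0;
    for (; x; x &= x - 1)
        n++;
    return n;
}

#if __has_builtin(__builtin_popcount)
#   define example_popcount(x) __builtin_popcount(x)
#else
#   define example_popcount(x) popcount_fallback(x)
#endif
```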
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-libavutil-add-x86-optimized-av_popcount.patch
Type: text/x-patch
Size: 2230 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20150225/e7e2a0e8/attachment.bin>