[FFmpeg-devel] 6c6ac9cb "avutil/x86/intmath: Use tzcnt in place of bsf."

Matt Oliver protogonoi at gmail.com
Tue Nov 17 08:02:34 CET 2015


On 17 November 2015 at 12:12, Hans Wennborg <hans at chromium.org> wrote:

> Hello ffmpeg developers,
>
> This commit [1] is causing problems when compiling with Clang on Windows:
>
> ..\..\third_party\ffmpeg\libavutil/x86/intmath.h(53,33) :  error:
> always_inline function '__tzcnt_u32' requires target feature 'bmi',
> but would be inlined into function 'ff_ctzll_x86' that is compiled
> without support for 'bmi'
>     return ((uint32_t)v == 0) ? _tzcnt_u32((uint32_t)(v >> 32)) + 32 :
> _tzcnt_u32((uint32_t)v);
>                                 ^
>
> Essentially the compiler is saying that it won't allow using this
> intrinsic unless compiling for a target that supports BMI.
>
> Is there a performance reason for using __tzcnt_u32 instead of
> _BitScanForward, or was it mainly to simplify the code?
>
> We're working around this in Chromium by #define'ing __tzcnt_u32 to
> __builtin_ctz at the moment, but it would be good if we could find a
> nicer solution that could be applied upstream.
>
> Cheers,
> Hans
>
>  [1].
> https://github.com/FFmpeg/FFmpeg/commit/6c6ac9cb17c4944514bde833f2fa8aa8dafa974a
>

tzcnt was used instead of bsf as it has performance advantages on any CPU
that supports the tzcnt instruction. Although tzcnt is a newer instruction
that is part of the BMI instruction set, the opcode actually generated for
it is equivalent to 'rep bsf'. So on any older CPU that doesn't support
tzcnt it still executes, just as a plain bsf, and for any non-zero input
the result is the same. It was used because it gives the best performance
on both newer and older CPUs.
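
To illustrate the point about the two instructions agreeing, here is a rough software model of what each one computes. This is only a sketch for illustration: the function names are made up, nothing here is FFmpeg code, and the -1 return is just my way of marking the case where bsf leaves its destination undefined.

```c
#include <stdint.h>

/* Hypothetical model of tzcnt on a 32-bit operand:
 * counts trailing zero bits and returns 32 when the input is 0. */
static int model_tzcnt32(uint32_t v)
{
    int n = 0;
    if (v == 0)
        return 32;
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Hypothetical model of bsf on a 32-bit operand:
 * index of the lowest set bit. The real instruction leaves the
 * destination undefined for a zero input; -1 marks that case here. */
static int model_bsf32(uint32_t v)
{
    int n = 0;
    if (v == 0)
        return -1;
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
}
```

The two models only differ for a zero input, which is why falling back to bsf on older CPUs is harmless as long as the caller never passes 0.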

Clang clearly does not allow this particular optimisation. I don't know of
a command-line option that makes it ignore this, so probably the best fix
is to just disable this code when using clang. That goes back to the
previous behavior, which will use __builtin_ctz, assuming clang on Windows
still defines the GCC version check macros.
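
Roughly, the kind of guard I mean looks like the following. This is a minimal sketch and not the attached patch: the sketch_* names are hypothetical, and it assumes clang on Windows defines both _MSC_VER and __clang__ and provides the GCC-style builtins.

```c
#include <stdint.h>

#if defined(_MSC_VER) && !defined(__clang__) && \
    (defined(_M_IX86) || defined(_M_X64))
/* Real MSVC on x86: the intrinsic is emitted as 'rep bsf'. */
#include <immintrin.h>
static inline int sketch_ctz32(uint32_t v)
{
    return (int)_tzcnt_u32(v);
}
#elif defined(__GNUC__) || defined(__clang__)
/* GCC and clang (including clang on Windows): use the builtin.
 * The result is undefined for v == 0, so callers must not pass 0. */
static inline int sketch_ctz32(uint32_t v)
{
    return __builtin_ctz(v);
}
#else
/* Portable fallback for any other compiler; v must be non-zero. */
static inline int sketch_ctz32(uint32_t v)
{
    int n = 0;
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
}
#endif

/* 64-bit count built from two 32-bit halves, in the same shape as
 * the expression quoted in the error message above. */
static inline int sketch_ctzll(uint64_t v)
{
    return ((uint32_t)v == 0) ? sketch_ctz32((uint32_t)(v >> 32)) + 32
                              : sketch_ctz32((uint32_t)v);
}
```

With a guard like this clang never sees the BMI intrinsic, so the target feature check that produced the error is not triggered.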

The attached patch should fix it, but I don't have clang for Windows to
test it.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-avutil-x86-intmath-Fix-compilation-with-clang-on-win.patch
Type: application/octet-stream
Size: 881 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20151117/a612f784/attachment.obj>

