[FFmpeg-devel] [PATCH 08/11] avcodec/v210enc: add AVX-512 10-bit line pack function

James Darnley jdarnley at obe.tv
Fri Nov 10 23:13:53 EET 2017

On 2017-11-10 14:32, James Darnley wrote:
> I mentioned previously that using ZMM registers will cause the CPU to
> reduce its frequency.
> Gramner said on IRC that a user should spend 20-30% of time in
> AVX-512/ZMM code for it to be a net gain in speed.
> From ffmpeg-devel IRC on 2017-10-26
>> https://lists.ffmpeg.org/pipermail/ffmpeg-devel-irc/2017-October/004622.html
>> [18:49:26 CEST] <Gramner> J_Darnley: be aware that using zmm registers induces significant frequency drops which reduces performance of everything else, so if you want to use 512-bit vectors you better go all in on it to make up for it. you probably want to spend at least 20-30% of overall runtime in avx-512 code
>> [18:50:00 CEST] <Gramner> the alternative is to stay in 256-bit mode and just leverage new instructions and opmasks
> This means any cycles you might save by using longer registers, fewer
> instructions, better instructions, whatever, will be lost because the
> frequency drops, meaning everything takes longer to execute overall.

Some details about this can be found in one of Intel's documents: Intel®
64 and IA-32 Architectures Optimization Reference Manual
Order Number: 248966-038
October 2017
> https://software.intel.com/sites/default/files/managed/9e/bc/64-ia-32-architectures-optimization-manual.pdf
Specifically section 15.26 "SKYLAKE SERVER POWER MANAGEMENT"

Earlier on the ffmpeg-devel IRC channel I posted a link to Cloudflare's
blog in which they discuss the effects of running just a few (my words)
AVX-512/ZMM instructions.
> https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/

In the worst cases on some of the new processors the frequency drop can
be 1 GHz.  In Cloudflare's case, spending just about 2.5% of time in a
cryptography function using AVX-512 caused a 10% drop in their
overall performance (requests served per second).
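
A toy model makes the trade-off concrete.  The sketch below is only an
illustration, not a measurement: it assumes the whole process runs at
the reduced clock (real CPUs recover their frequency after a delay), and
the function names, the 2x cycle speedup and the 10% clock drop are all
invented for the example.

```python
def relative_runtime(fraction, speedup, freq_ratio):
    """Runtime relative to a baseline of 1.0 (hypothetical model).

    fraction   -- share of baseline runtime spent in the code that gets
                  rewritten with ZMM registers (0.0 to 1.0)
    speedup    -- cycle-count speedup of that code (2.0 = half the cycles)
    freq_ratio -- throttled clock divided by normal clock, assumed to
                  apply to the whole process
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if speedup <= 0.0 or freq_ratio <= 0.0:
        raise ValueError("speedup and freq_ratio must be positive")
    # Cycles needed after the rewrite, relative to the baseline cycles.
    cycles = (1.0 - fraction) + fraction / speedup
    # Fewer cycles, but every cycle now takes 1/freq_ratio times longer.
    return cycles / freq_ratio


def break_even_fraction(speedup, freq_ratio):
    """Share of runtime that must be vectorised for the rewrite to break even.

    Solves (1 - f) + f / speedup == freq_ratio for f.
    """
    if speedup <= 1.0:
        return float("inf")  # no cycle savings, so it can never pay off
    return (1.0 - freq_ratio) / (1.0 - 1.0 / speedup)


if __name__ == "__main__":
    # Assumed numbers: code runs in half the cycles, clock drops by 10%.
    print(break_even_fraction(2.0, 0.9))
    # A small hot spot under the same assumptions ends up slower overall.
    print(relative_runtime(0.025, 2.0, 0.9))
```

With those assumed numbers the break-even point comes out at about 20%
of runtime, and a function that accounts for only a few percent of
runtime leaves the program slower than before, because the cycles saved
there cannot pay for the lower clock everywhere else.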

After seeing this and the discussion on IRC I won't commit any of the
function patches.  The functions are not very impressive and are likely
to make everything else slower.

The IRC log should appear at the link below.
> https://lists.ffmpeg.org/pipermail/ffmpeg-devel-irc/2017-November/004651.html
