[FFmpeg-devel] avcodec/x86/bswapdsp : convert pb_bswap32 to ymm constant in order to simplify code

James Almer jamrial at gmail.com
Mon Nov 27 19:22:01 EET 2017

On 11/27/2017 2:17 PM, Martin Vignali wrote:
> 2017-11-27 17:59 GMT+01:00 Henrik Gramner <henrik at gramner.com>:
>> On Sat, Nov 25, 2017 at 9:53 PM, Martin Vignali
>> <martin.vignali at gmail.com> wrote:
>>> Hello,
>>> Attached is a patch that converts pb_bswap32 to a ymm constant
>>> and removes the vbroadcasti128 part.
>>> Speed seems similar to me.
>> This just wastes cache for no reason. A tiny amount, sure, but minor
>> things tend to add up eventually.
>> 128-bit broadcasts are the same speed as 256-bit loads on Intel CPUs
>> and twice as fast as 256-bit loads on AMD CPUs.
>> A better solution if you want to avoid ifdeffery would be to create a
>> macro that uses vbroadcasti128 when mmsize == 32 and mova otherwise.
> Hello,
> Thanks for your comments.
> Do you have a suggestion for the name of this macro?

It doesn't exist yet, so look at the existing ones in x86util.asm
and add one for vbroadcasti128.
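
For illustration only, here is a scalar C model (not the asm itself) of
what such a macro would do: keep a single 16-byte constant and copy it
into both lanes when mmsize == 32, or load it as-is otherwise. The names
load_const128, shuffle_bytes and bswap32_block are hypothetical, and the
mask values are an assumed byte-reversal pattern, not taken from the
patch.

```c
#include <stdint.h>
#include <string.h>

/* Assumed 16-byte shuffle control that reverses the bytes of each 32-bit
 * word; the values are illustrative, not copied from the patch. */
static const uint8_t bswap32_mask16[16] = {
    3, 2, 1, 0,  7, 6, 5, 4,  11, 10, 9, 8,  15, 14, 13, 12
};

/* Hypothetical model of the proposed macro: with 32-byte registers the
 * 16-byte constant is copied into both lanes (like vbroadcasti128),
 * otherwise it is loaded as-is (like mova). */
static void load_const128(uint8_t *reg, const uint8_t src[16], int mmsize)
{
    memcpy(reg, src, 16);
    if (mmsize == 32)
        memcpy(reg + 16, src, 16);
}

/* Model of pshufb/vpshufb: every 16-byte lane is shuffled on its own.
 * A set high bit in a control byte zeroes the output byte. */
static void shuffle_bytes(uint8_t *dst, const uint8_t *src,
                          const uint8_t *mask, int mmsize)
{
    for (int lane = 0; lane < mmsize; lane += 16)
        for (int i = 0; i < 16; i++) {
            uint8_t m = mask[lane + i];
            dst[lane + i] = (m & 0x80) ? 0 : src[lane + (m & 0x0F)];
        }
}

/* Byte-swap mmsize/4 32-bit words using only the 16-byte constant.
 * mmsize must be 16 or 32. */
static void bswap32_block(uint32_t *dst, const uint32_t *src, int mmsize)
{
    uint8_t mask[32], in[32], out[32];

    load_const128(mask, bswap32_mask16, mmsize);
    memcpy(in, src, mmsize);
    shuffle_bytes(out, in, mask, mmsize);
    memcpy(dst, out, mmsize);
}
```

Calling bswap32_block(dst, src, 32) swaps eight words and
bswap32_block(dst, src, 16) swaps four, both from the same 16 bytes of
constant data. Because the shuffle works per 128-bit lane, the wider
path needs nothing beyond the same mask repeated in each lane.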

> Regarding the earlier, similar patch under discussion:
> avcodec/x86/exrdsp : use ymm constant for pb_80 instead of vbroadcasti128
> Do you think we should avoid the YMM constant (declared in constants.h/c)
> and instead convert the constants to XMM in this file, loaded with vbroadcasti128?

There's no need to convert them back to xmm to use broadcasts.
