[FFmpeg-devel] avcodec/x86/bswapdsp : convert pb_bswap32 to ymm constant in order to simplify code
henrik at gramner.com
Mon Nov 27 18:59:22 EET 2017
On Sat, Nov 25, 2017 at 9:53 PM, Martin Vignali
<martin.vignali at gmail.com> wrote:
> Attached is a patch to convert pb_bswap32 to a ymm constant
> and remove the vbroadcasti128 part.
> Speed seems similar to me.
This just wastes cache for no reason. A tiny amount, sure, but minor
things tend to add up eventually.
128-bit broadcasts are the same speed as 256-bit loads on Intel CPUs
and twice as fast as 256-bit loads on AMD CPUs.
A better solution if you want to avoid ifdeffery would be to create a
macro that uses vbroadcasti128 when mmsize == 32 and mova otherwise.
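A minimal sketch of such a macro, assuming x86inc.asm is included (it
supplies mmsize and mova). The macro name LOAD_128_CONST is hypothetical
and is not taken from the patch or the tree:

```nasm
; Hypothetical helper (the name is an assumption): load a 128-bit constant
; into a full-width register without keeping a 256-bit copy in memory.
; Relies on x86inc.asm for mmsize and mova.
%macro LOAD_128_CONST 2 ; dst register, 128-bit memory source
%if mmsize == 32
    vbroadcasti128 %1, %2 ; ymm: replicate the 16 bytes into both lanes
%else
    mova           %1, %2 ; xmm: plain aligned 128-bit load
%endif
%endmacro
```

A call site could then read "LOAD_128_CONST m2, [pb_bswap32]" in both the
xmm and ymm versions of the function (m2 is only an example register),
with pb_bswap32 kept as a 16-byte constant.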