[FFmpeg-devel] [PATCH 2/2] avutil/float_dsp: add ff_vector_dmul_{sse2, avx}

James Almer jamrial at gmail.com
Fri Sep 14 16:26:46 EEST 2018

On 9/14/2018 9:57 AM, Henrik Gramner wrote:
> On Thu, Sep 13, 2018 at 3:08 PM, James Almer <jamrial at gmail.com> wrote:
>> +    lea       lenq, [lend*8 - mmsize*4]
> Is len guaranteed to be a multiple of mmsize/8? Otherwise this would
> cause misalignment. It will also break if len < mmsize/2.

len must be a multiple of 16 as per the doxy, so yes.
The only way for len to be < mmsize/2 is if we add an avx512 version.

> Also if you want a 32-bit result from lea it should be written as "lea
> lend, [lenq*8 - mmsize*4]" which is equivalent but has a shorter
> opcode (e.g. always use native sizes within brackets).

len is an int, so I assume this is only possible here because it's an
argument passed in a reg and not stack? Otherwise, the upper 32bits
would probably make a mess with the multiplication. See for example
vector_fmul_add where len is the fifth argument.

More information about the ffmpeg-devel mailing list