[FFmpeg-devel] [PATCH 5/5] aarch64: me_cmp: Don't do uaddlv once per iteration

Martin Storsjö martin at martin.st
Sat Jul 16 00:25:37 EEST 2022

On Fri, 15 Jul 2022, Michael Niedermayer wrote:

> On Fri, Jul 15, 2022 at 10:56:03PM +0300, Martin Storsjö wrote:
>> On Fri, 15 Jul 2022, Swinney, Jonathan wrote:
>>> If the max height is just 16, then this should be fine. I assumed that h
>>> could have a much higher value (>1024), but if that is not the case,
>>> then this is a useful optimization.
>> At least according to the me_cmp.h header, which says:
>> /* Motion estimation:
>>  * h is limited to { width / 2, width, 2 * width },
>>  * but never larger than 16 and never smaller than 2.
>>  * Although currently h < 4 is not used as functions with
>>  * width < 8 are neither used nor implemented. */
> These rules were written with support for encoding of all
> standard formats in mind at the time that was written.
> Today it may make sense to extend these rules to cover the
> things which were created since then.

Right, but if that suddenly changes, such a change must also expect that
it might need to update all the assembly implementations that currently
implement that interface. Right now, both the de facto case (all callers in
the codebase) and the explicit documentation say that it can't be called
with parameters outside of that range.

Even if the limit is raised from the current h <= 16, this particular
optimization should be fine as long as h <= 256, which should be enough for
at least all current-gen mainstream codecs, I think.
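
As a rough sanity check of that bound, here is a minimal sketch in C. It
assumes (this is my assumption for illustration, not something read off the
patch) that each 16-bit accumulator lane receives a fixed number of 8-bit
absolute differences per row, and that the lanes are only widened and summed
once after the loop. The helper names max_safe_rows() and lane_sum_fits() are
hypothetical and do not exist in FFmpeg.

```c
#include <stdint.h>

/* Hypothetical helper: largest h for which a 16-bit lane cannot overflow,
 * assuming each row adds diffs_per_lane_per_row absolute differences of
 * 8-bit pixels (each at most 255) to that lane. */
static unsigned max_safe_rows(unsigned diffs_per_lane_per_row)
{
    if (diffs_per_lane_per_row == 0)
        return 0; /* nothing is accumulated, so there is no meaningful bound */
    return UINT16_MAX / (UINT8_MAX * diffs_per_lane_per_row);
}

/* Hypothetical helper: nonzero if the worst-case lane sum over h rows
 * still fits in 16 bits under the same assumption. */
static int lane_sum_fits(unsigned h, unsigned diffs_per_lane_per_row)
{
    uint32_t worst = (uint32_t)UINT8_MAX * diffs_per_lane_per_row * h;
    return worst <= UINT16_MAX;
}
```

With one difference per lane per row, the worst case for h = 256 is
255 * 256 = 65280, which still fits in 16 bits. If a lane instead took two
differences per row, the same arithmetic would give a limit of 128 rows, so
the exact bound depends on how the accumulators are laid out.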

// Martin
