[FFmpeg-devel] [PATCH 1/2] x86inc: Extend FMA_INSTR functionality

James Almer jamrial at gmail.com
Sun Feb 16 00:29:53 CET 2014


On 15/02/14 4:49 PM, Michael Niedermayer wrote:
> On Sat, Feb 15, 2014 at 04:05:34PM -0300, James Almer wrote:
>> On 14/02/14 10:46 PM, Michael Niedermayer wrote:
>>> On Fri, Feb 14, 2014 at 11:50:15AM -0300, James Almer wrote:
>>>> On 14/02/14 5:57 AM, Christophe Gisquet wrote:
>>>>> 2014-02-13 James Almer <jamrial at gmail.com>:
>>>>>> You're right, a fifth parameter is probably the proper way. See
>>>>>> FMULADD_PS in x86util. It would allow actual non-destructive emulation
>>>>>> of these XOP instructions if it's ever needed.
>>>>>> It's not for now, but changing it will not hurt and it will probably have
>>>>>> to be done at some point anyway.
>>>>>
>>>>> You can probably make it optional (haven't looked at FMULADD_PS), ie
>>>>> make the macro use 4-8 arguments, and if %0 == 8, use %8, otherwise
>>>>> use %2.
>>>>>
>>>>
>>>> Someone else committed the opposite to x264, making the %1 = %2 * %3 + %1 case
>>>> unsupported instead, so I'm not sure at this point if this should be in x86inc
>>>> or added as a local macro in a given asm file like I originally did.
>>>
>>> <michaelni> Skyler_, should i revert 23a8c63452009df21b3f184936b343593d4ccb04 (x86inc: Extend FMA_INSTR functionality) and apply "Warn about not supported emulation of some XOP instructions. Also add pmacsdql emulation." from x264 ?
>>> <michaelni> i think it would be best if a 5th argument, that is a temporary for emulation could be specified
>>> <Skyler_> yeah, that might be a better solution than both, but in that case we should move it out of x86inc and into x86util and capitalize it
>>> <Skyler_> because its semantics don't match the original instruction
>>> <Skyler_> e.g. like PALIGNR vs palignr
>>
>> Sounds good to me.
> 
> agree, do you volunteer to implement it ?

Alright.

Should each XOP macro accept both four and five arguments, or only five?
I ask because I can't find a way to make the former work.
What Christophe mentioned above does not work.

The latter is easy and I have it working, but of course it requires passing
a value for the fifth argument even when it's not needed.
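
To make the trade-off concrete, here is a rough C model of the two emulation
shapes discussed in this thread. It is only a sketch, not x86inc or x86util
code: the vec4 type and the mul32, add32, macs_emul4 and macs_emul5 names are
hypothetical, and it assumes an XOP-style multiply-accumulate on four 32-bit
lanes emulated as a multiply followed by an add. It shows why the
%1 = %2 * %3 + %1 case needs a temporary: without one, the multiply
overwrites the accumulator before the add reads it.

```c
#include <stdint.h>

#define LANES 4

/* Hypothetical stand-in for a 128-bit register with four 32-bit lanes. */
typedef struct { int32_t v[LANES]; } vec4;

/* Lane-wise multiply keeping the low 32 bits of each product. */
static void mul32(vec4 *dst, const vec4 *a, const vec4 *b)
{
    for (int i = 0; i < LANES; i++)
        dst->v[i] = (int32_t)((uint32_t)a->v[i] * (uint32_t)b->v[i]);
}

/* Lane-wise wrapping add. */
static void add32(vec4 *dst, const vec4 *a, const vec4 *b)
{
    for (int i = 0; i < LANES; i++)
        dst->v[i] = (int32_t)((uint32_t)a->v[i] + (uint32_t)b->v[i]);
}

/*
 * Four-argument shape: dst = a * b, then dst += c.
 * If dst and c are the same register, the multiply clobbers c first,
 * so the result is a * b + a * b instead of a * b + c.
 */
static void macs_emul4(vec4 *dst, const vec4 *a, const vec4 *b, const vec4 *c)
{
    mul32(dst, a, b);
    add32(dst, dst, c);
}

/*
 * Five-argument shape: the caller supplies a scratch register.
 * tmp = a * b, then dst = tmp + c, which stays correct when dst == c.
 */
static void macs_emul5(vec4 *dst, const vec4 *a, const vec4 *b,
                       const vec4 *c, vec4 *tmp)
{
    mul32(tmp, a, b);
    add32(dst, tmp, c);
}
```

In this model the five-argument form always needs a scratch register from
the caller, even when dst and c differ and the four-argument form would have
been enough, which is the cost mentioned above.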

