[FFmpeg-devel] [PATCH] h264_i386: Optimize decode_significance_8x8_x86 for 64 bit.

Reimar Döffinger Reimar.Doeffinger at gmx.de
Wed Dec 3 09:00:39 CET 2014


On 03.12.2014, at 01:40, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Sat, Nov 22, 2014 at 02:09:01PM +0100, Reimar Döffinger wrote:
>> On Mon, Nov 17, 2014 at 01:41:13PM +0100, Michael Niedermayer wrote:
>>> On Mon, Nov 17, 2014 at 08:19:32AM +0100, Reimar Döffinger wrote:
>>>> On 17.11.2014, at 02:37, Michael Niedermayer <michaelni at gmx.at> wrote:
>>>>> On Sat, Nov 15, 2014 at 06:16:03PM +0100, Reimar Döffinger wrote:
>>>>>> 11674 -> 10877 decicycles on my Phenom II.
>>>>>> Overall speedup was unfortunately within measurement error.
>>>>> 
>>>>> here it's 10153 -> 10135
>>>> 
>>>> I suspect it also depends a bit on the compiler and how it changes the surrounding code.
>>>> Note that I also tested with PIC actually.
>>>> 
>>>>> but I have a slightly odd feeling about the changes to the asm code;
>>>>> I am not sure all assemblers will be happy about the changed
>>>>> code
>>>> 
>>>> Do you mean particularly the movzbl change?
>>> 
>>> yes and the k stuff
>>> 
>>> 
>>>> I am also unsure about that, I think there was a reason for that %k6 mess...
>>>> But this as well as movzx seemed to work for me...
>>> 
>>> It works here too; I just have the feeling it might fail on some odd
>>> assembler or platform. That's not meant to keep you from pushing this,
>>> just that it might need to be reverted or fixed if such
>>> problems actually occur.
>> 
>> I pushed it.
>> If anyone sees issues please tell me and I'll look into it!
> 
> I think these FATE failures are caused by it, but that's based only
> on the other commits in the range looking unlikely:
> 
> http://fate.ffmpeg.org/report.cgi?time=20141122231657&slot=x86_64-darwin-clang-3.5-O3
> http://fate.ffmpeg.org/report.cgi?time=20141122223720&slot=x86_64-darwin-clang-3.5

That's annoying. I only expected compile errors; this looks more like a compiler bug.
Can someone run tests?
Does just using the "m" constraint instead of "r", as on 32 bit, fix it?
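
To make the terms above concrete, here is a minimal sketch of the three things discussed in this thread: an index passed with the "r" constraint, the same index passed with the "m" constraint, and the %k operand modifier used with movzbl. This is not the code from the patch. The helper names (load_byte_reg, load_byte_mem, load_byte_k) and the HAVE_X86_64_INLINE_ASM macro are made up for illustration, and a plain C fallback is used when not building for x86-64 with a GCC-compatible compiler.

```c
#include <stdint.h>

/* Hypothetical guard macro, not from FFmpeg: use inline asm only on
 * x86-64 with a GCC-compatible compiler (gcc, clang). */
#if defined(__x86_64__) && defined(__GNUC__)
#define HAVE_X86_64_INLINE_ASM 1
#else
#define HAVE_X86_64_INLINE_ASM 0
#endif

/* "r" constraint: the compiler places idx in a register, so it can be
 * used directly as the index of the addressing mode. */
static unsigned load_byte_reg(const uint8_t *table, intptr_t idx)
{
#if HAVE_X86_64_INLINE_ASM
    unsigned out;
    __asm__ ("movzbl (%1, %2), %0"
             : "=r"(out)
             : "r"(table), "r"(idx)
             : "memory");           /* the asm reads table[] behind the compiler's back */
    return out;
#else
    return table[idx];
#endif
}

/* "m" constraint: idx stays in memory, so the asm has to load it into a
 * scratch register itself. This saves a register for the compiler at the
 * cost of an extra load. */
static unsigned load_byte_mem(const uint8_t *table, intptr_t idx)
{
#if HAVE_X86_64_INLINE_ASM
    unsigned out;
    intptr_t tmp;
    __asm__ ("mov %3, %1\n\t"
             "movzbl (%2, %1), %0"
             : "=&r"(out), "=&r"(tmp)   /* early-clobber: written before inputs are done */
             : "r"(table), "m"(idx)
             : "memory");
    return out;
#else
    return table[idx];
#endif
}

/* %k modifier: print the 32-bit name of a register that holds a 64-bit
 * operand. Writing the 32-bit register zero-extends into the upper half
 * on x86-64, so the full 64-bit result is well defined. */
static uint64_t load_byte_k(const uint8_t *p)
{
#if HAVE_X86_64_INLINE_ASM
    uint64_t out;
    __asm__ ("movzbl %1, %k0"
             : "=r"(out)
             : "m"(*p));
    return out;
#else
    return *p;
#endif
}
```

All three helpers should return the same byte value for the same input; the difference is only in what the compiler is allowed to do with the operands, which is where assembler or compiler quirks like the ones reported above can show up.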

