[FFmpeg-cvslog] r12144 - trunk/libavcodec/mpeg12.c

Loren Merritt lorenm
Wed Feb 20 05:23:18 CET 2008


On Tue, 19 Feb 2008, Rich Felker wrote:
> On Tue, Feb 19, 2008 at 08:24:22PM +0200, Ivan Kalvachev wrote:
>>
>> Even gcc 2.95.3 won't issue a divide for an exact power of two. The code
>> would check the sign, add a correction, and then do the shift.
>> Doing an unsigned >> is still faster :)
>
> The conditional is probably at least as slow as the divide...
> Unless it has a nice trick with flags/carry..

gcc 4.1.2, core2

int shift3(int x) { return x>>3; }
int div8(int x) { return x/8; }

<shift3>:
mov    eax,[esp+4]
sar    eax,0x3
ret

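<div8>:
mov    eax,[esp+4]      ; eax = x
test   eax,eax          ; set the sign flag from x
lea    edx,[eax+7]      ; edx = x + 7 (lea leaves the flags untouched)
cmovs  eax,edx          ; if x < 0, eax = x + 7
sar    eax,0x3          ; arithmetic shift right by 3
ret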

... or if you forbid cmov (-march=i386), then

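<div8>:
mov    edx,[esp+4]      ; edx = x
mov    eax,edx
sar    eax,0x1f         ; eax = 0 if x >= 0, -1 if x < 0
shr    eax,0x1d         ; logical shift: eax = 0 or 7
add    eax,edx          ; eax = x + (0 or 7)
sar    eax,0x3          ; arithmetic shift right by 3
ret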

respective latencies (inlined) for shift3, div8 with cmov, and div8
without cmov: 1, 4, 5 cycles
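
The extra instructions in div8 exist because x/8 rounds toward zero while
x>>3 rounds toward negative infinity, so the two differ for negative x that
is not a multiple of 8. The following is a minimal sketch of that difference,
not part of the original message. div8_bias and check_range are hypothetical
names introduced here; div8_bias mirrors the sar/shr/add/sar sequence, and it
assumes a 32-bit int and a compiler that implements >> on negative ints as an
arithmetic shift (true of gcc on x86, but implementation-defined in C).

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift: rounds toward negative infinity (assumed behaviour). */
static int shift3(int x) { return x >> 3; }

/* Signed division: rounds toward zero. */
static int div8(int x) { return x / 8; }

/* Hypothetical branch-free equivalent of div8, following the i386 listing. */
static int div8_bias(int x)
{
    uint32_t sign = (uint32_t)(x >> 31);  /* sar 31: 0 or 0xFFFFFFFF */
    int bias = (int)(sign >> 29);         /* shr 29: 0 or 7 */
    return (x + bias) >> 3;               /* add, then sar 3 */
}

/* Hypothetical helper: compares the three functions over [lo, hi]. */
static void check_range(int lo, int hi)
{
    for (int x = lo; x <= hi; x++) {
        assert(div8_bias(x) == div8(x));
        /* Shift and divide agree except for negative non-multiples of 8. */
        if (x >= 0 || x % 8 == 0)
            assert(shift3(x) == div8(x));
        else
            assert(shift3(x) == div8(x) - 1);
    }
}
```

For example, shift3(-1) is -1 while div8(-1) and div8_bias(-1) are both 0;
calling check_range(-1000, 1000) exercises the same relationship over a
wider range.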

--Loren Merritt



