[FFmpeg-cvslog] r12144 - trunk/libavcodec/mpeg12.c
Loren Merritt
lorenm
Wed Feb 20 05:23:18 CET 2008
On Tue, 19 Feb 2008, Rich Felker wrote:
> On Tue, Feb 19, 2008 at 08:24:22PM +0200, Ivan Kalvachev wrote:
>>
>> Even gcc 2.95.3 won't issue a divide for an exact power of two. The code
>> would check the sign, add a correction, and then do the shift.
>> Doing an unsigned >> is still faster :)
>
> The conditional is probably at least as slow as the divide...
> Unless it has a nice trick with flags/carry..
gcc 4.1.2, on a core2:
int shift3(int x) { return x>>3; }
int div8(int x) { return x/8; }
<shift3>:
mov eax,[esp+4]
sar eax,0x3
ret
<div8>:
mov eax,[esp+4]
test eax,eax
lea edx,[eax+7]
cmovs eax,edx
sar eax,0x3
ret
... or if you forbid cmov (-march=i386), then
<div8>:
mov edx,[esp+4]
mov eax,edx
sar eax,0x1f
shr eax,0x1d
add eax,edx
sar eax,0x3
ret
Respective latencies (inlined) for shift3, div8 with cmov, and div8 without cmov: 1, 4, 5 cycles
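For reference, the two div8 listings can be written back out in C. This is only a sketch, assuming a 32-bit int and an arithmetic right shift on negative values (what gcc does on x86, though C leaves it implementation-defined). The names div8_cmov_style, div8_shift_style and check_div8_variants are made up here for illustration; they are not from gcc or FFmpeg. The point is that x>>3 rounds toward negative infinity while x/8 rounds toward zero, so the compiler adds 7 to negative inputs before shifting.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Plain arithmetic shift: rounds toward negative infinity. */
static int shift3(int x) { return x >> 3; }

/* Signed division: C requires rounding toward zero. */
static int div8(int x) { return x / 8; }

/* Mirrors the test/lea/cmovs/sar listing:
   use x+7 only when x is negative, then shift. */
static int div8_cmov_style(int x)
{
    int t = (x < 0) ? x + 7 : x;   /* cmovs eax,edx */
    return t >> 3;                 /* sar eax,3 */
}

/* Mirrors the -march=i386 listing (assumes 32-bit int):
   build the bias 0 or 7 from the sign bit without a branch. */
static int div8_shift_style(int x)
{
    uint32_t sign = (uint32_t)(x >> 31); /* sar 31: 0 or 0xFFFFFFFF */
    int bias = (int)(sign >> 29);        /* shr 29: 0 or 7 */
    return (x + bias) >> 3;              /* add, then sar 3 */
}

/* Compare both emulations against x / 8 near zero and at the extremes. */
static void check_div8_variants(void)
{
    for (int x = -64; x <= 64; x++) {
        assert(div8_cmov_style(x) == x / 8);
        assert(div8_shift_style(x) == x / 8);
    }
    assert(div8_cmov_style(INT_MIN) == INT_MIN / 8);
    assert(div8_shift_style(INT_MIN) == INT_MIN / 8);
    assert(div8_cmov_style(INT_MAX) == INT_MAX / 8);
    assert(div8_shift_style(INT_MAX) == INT_MAX / 8);
}
```

For example, shift3(-9) is -2 while div8(-9) is -1, which is why the bare sar can only stand in for the divide when x is known to be non-negative (or the type is unsigned).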
--Loren Merritt