[FFmpeg-devel] [PATCH] simplify GET_UTF8 to use ff_log2_tab
Tue Dec 11 01:01:48 CET 2007
On Mon, Dec 10, 2007 at 09:09:37PM +0000, Måns Rullgård wrote:
> "Ivan Kalvachev" <ikalvachev at gmail.com> writes:
> > On Dec 10, 2007 12:14 AM, Rich Felker <dalias at aerifal.cx> wrote:
> >> On Sun, Dec 09, 2007 at 11:09:25AM +0100, Reimar Döffinger wrote:
> >> > Hello,
> >> > currently GET_UTF8 calls av_log2 which is simply overkill,
> >> > since we only care about the lowest 8 bits.
> >> > This may be intentional since my suggestion would be problematic
> >> > if GET_UTF8 should become part of the public API, since ff_log2_tab
> >> > is not public.
> >> > A possibility would be to at least use av_log2_16bit or better add a
> >> > public av_log2_8bit.
> >> > Comments?
> >> Counting the number of ones is useless. Instead, just left shift at
> >> each iteration and check the high bit as your "loop counter".
> >> Thankfully gcc even optimizes this correctly as (shl ; js) on i386,
> >> even in my ancient gcc 2.95.
> > Ever heard of "bsr" and "bsf" instructions?
> > I think BitScanReverse has huge latency (i386=10+3*n, P1=72, Athlon=14),
> > but it still should be faster than loop or cache miss.
> What the hell is wrong with that architecture? On ARM, the CLZ (count
> leading zeros) instruction takes all of ONE cycle. I believe MIPS is
On Core 2 its latency=2, throughput=1.
My data for P2/P3 is incomplete for bsf/bsr, but I suspect it's the same as
P4 has latency=4, throughput=1/2 (only half of Core 2 ...)
P4E has latency=16, throughput=1/4 (no comment)
amd64 needs 8-10 depending on 16/32/64 bit
and my docs list 7-73 for the P1.
Anyway, the truth is that bsf/bsr are not very important instructions (if you
disagree, tell me how libav* could be faster on Core 2 by using bsf/bsr).
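
For reference, the two ways of counting the leading one bits of a UTF-8 lead
byte discussed in this thread could be sketched roughly as below. This is only
an illustration: the function names are hypothetical and not part of libav*,
the plain loop in log2_8bit stands in for a table such as ff_log2_tab, and the
"7 - log2(inverted byte)" form is an assumption about how a log2-based variant
would look, not a copy of GET_UTF8.

```c
#include <stdint.h>

/* Hypothetical helper: count leading 1 bits by shifting left and
 * testing the high bit on each iteration (the "shift and check" idea). */
static int leading_ones_shift(uint8_t b)
{
    int n = 0;
    while (b & 0x80) {
        b = (uint8_t)(b << 1);  /* drop the bit just tested */
        n++;
    }
    return n;
}

/* Hypothetical helper: floor(log2(v)) for an 8-bit value, 0 for v == 0.
 * A 256-entry lookup table would normally replace this loop. */
static int log2_8bit(uint8_t v)
{
    int n = 0;
    while (v > 1) {
        v = (uint8_t)(v >> 1);
        n++;
    }
    return n;
}

/* Hypothetical helper: the same count via log2 of the inverted byte.
 * The highest 0 bit of b is the highest 1 bit of ~b. */
static int leading_ones_log2(uint8_t b)
{
    uint8_t inv = (uint8_t)~b;
    if (inv == 0)
        return 8;               /* 0xFF has no zero bit at all */
    return 7 - log2_8bit(inv);
}
```

Both functions should give 0 for an ASCII byte such as 0x41, 2 for 0xC3 and
3 for 0xE2; whether either beats a bsr-based version on a given CPU is exactly
the open question above.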
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
I have often repented speaking, but never of holding my tongue.