[FFmpeg-devel] [PATCH] h264_cabac.c: branchless (amvd>2)+(amvd>32)

Michael Niedermayer michaelni
Sat Feb 27 00:10:23 CET 2010


On Fri, Feb 26, 2010 at 11:28:32PM +0800, Zhou Zongyi wrote:
> Hi Michael,
> 
> in commit 22032:
> >switch back to (amvd>2)+(amvd>32), its 5 cpu cycles faster now.
> 
> On x86 it seems gcc uses the following way to get (amvd>2)
> xor reg, reg
> cmp reg, 2
> setg regb
> 
> This introduces partial register access, which is slow on most CPUs.

There should be no partial-register stall after an xor of the larger (full-width) register.
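
For readers following along, here is a minimal C sketch of the expression in the subject line, next to one possible way to compute the same value without setcc at all. The function names and the sign-bit variant are assumptions made for illustration; this is not the patch posted in this thread and not the code in h264_cabac.c. The sign-bit form assumes amvd is a 32-bit int that is never close to INT_MIN, so the subtractions cannot overflow.

```c
#include <stdint.h>

/* Compare form, as written in the subject line: yields 0, 1 or 2.
 * A compiler may turn each comparison into xor + cmp + setg. */
static inline int amvd_ctx_compare(int32_t amvd)
{
    return (amvd > 2) + (amvd > 32);
}

/* Hypothetical sign-bit form (illustrative only).
 * (amvd - 3) is negative exactly when amvd <= 2, so its top bit
 * is 1 in that case and 0 otherwise; likewise for (amvd - 33).
 * Assumes amvd >= INT32_MIN + 33 so the subtraction cannot overflow. */
static inline int amvd_ctx_signbit(int32_t amvd)
{
    uint32_t below3  = (uint32_t)(amvd - 3)  >> 31;  /* 1 if amvd <= 2  */
    uint32_t below33 = (uint32_t)(amvd - 33) >> 31;  /* 1 if amvd <= 32 */
    return 2 - (int)below3 - (int)below33;
}
```

Both functions return 0 for amvd <= 2, 1 for 3 <= amvd <= 32, and 2 above that. Whether the second form is actually faster depends on the compiler and CPU and would need to be benchmarked.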

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In fact, the RIAA has been known to suggest that students drop out
of college or go to community college in order to be able to afford
settlements. -- The RIAA


