[FFmpeg-devel] [PATCH] Unroll base64 decode loop.

Michael Niedermayer michaelni at gmx.at
Sat Jan 21 16:33:07 CET 2012


On Sat, Jan 21, 2012 at 04:26:30PM +0100, Reimar Döffinger wrote:
> On Sat, Jan 21, 2012 at 03:58:45PM +0100, Michael Niedermayer wrote:
> > On Sat, Jan 21, 2012 at 12:51:58PM +0100, Reimar Döffinger wrote:
> > > On Sat, Jan 21, 2012 at 12:45:09PM +0100, Reimar Döffinger wrote:
> > > > Around 50% faster.
> > > > decode:       374139 -> 248852 decicycles
> > > > syntax check: 236955 -> 123854 decicycles
> > > 
> > > Note that this is despite gcc failing completely and utterly,
> > > randomly deciding to make the "goto out" path the "fast" path
> > > and sometimes not.
> > > The code the optimizer creates IMO simply makes no sense.
> > > I did not try it with this code, but using the __builtin_expect
> > > cluebat did not help one bit on the previous try (which did
> > > not use the larger table and thus resulted in even messier code).
> > > The numbers mean that it still needs about 24 cycles per byte on
> > > the Phenom2. Not sure if I should consider that good or bad...
> > 
> > I'd consider it bad if it were a human who wrote the asm :)
> > 
> > Also, it can probably be improved by making the table signed and making
> > invalid values negative. Then, if the bits get ORed together, the
> > final value will be negative if any input was invalid, so fewer checks
> > could be used.
> 
> I don't think so, not without requiring padding.
> Admittedly with valid input there should be enough padding with =
> but I don't think we can assume that.

Hmm, it might be worth trying strlen() to determine the size beforehand,
but I don't know if the optimization would still be enough of a win
then.
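
A rough sketch of the idea, to make it concrete. This is not the patch
under discussion: dec_table, init_table() and decode_groups() are
hypothetical names, the table uses -1 for every invalid character, and
'=' padding plus trailing partial groups are left out. It ORs the four
raw table values for the validity test (one branch per 4 input
characters), a slight variant of ORing the shifted bits, since left
shifting a negative int is undefined in C. The length comes from
strlen() up front, so no padding has to be assumed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical signed table: 0..63 for valid characters, -1 otherwise. */
static int8_t dec_table[256];

void init_table(void)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    int i;

    memset(dec_table, -1, sizeof(dec_table));
    for (i = 0; i < 64; i++)
        dec_table[(unsigned char)alphabet[i]] = (int8_t)i;
}

/*
 * Decode only the complete 4-character groups of a NUL-terminated string.
 * Returns the number of bytes written, or -1 if a group contains an
 * invalid character. '=' padding is treated as invalid in this sketch.
 */
int decode_groups(uint8_t *out, const char *in)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t groups = strlen(in) / 4;   /* size determined before the loop */
    uint8_t *dst = out;
    size_t g;

    for (g = 0; g < groups; g++, p += 4) {
        int a = dec_table[p[0]];
        int b = dec_table[p[1]];
        int c = dec_table[p[2]];
        int d = dec_table[p[3]];
        uint32_t v;

        /* The sign bit survives the OR, so one test covers all four. */
        if ((a | b | c | d) < 0)
            return -1;

        v = (uint32_t)a << 18 | (uint32_t)b << 12 | (uint32_t)c << 6 | (uint32_t)d;
        *dst++ = (uint8_t)(v >> 16);
        *dst++ = (uint8_t)(v >> 8);
        *dst++ = (uint8_t)v;
    }
    return (int)(dst - out);
}
```

After init_table(), decode_groups(buf, "TWFu") writes the three bytes
"Man" to buf. Whether the extra strlen() pass pays for itself would
have to be measured.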

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Concerning the gods, I have no means of knowing whether they exist or not
or of what sort they may be, because of the obscurity of the subject, and
the brevity of human life -- Protagoras