[FFmpeg-devel] [PATCH] Further optimization of base64 decode using AV_WB32.

Michael Niedermayer michaelni at gmx.at
Sat Jan 21 17:56:32 CET 2012


On Sat, Jan 21, 2012 at 05:52:27PM +0100, Reimar Döffinger wrote:
> This is somewhat questionable.
> The biggest issue is that av_bswap32 is not replaced
> with our asm version on gcc 4.5 or newer.
> This causes gcc to generate horrible code that is slower
> than the unoptimized variant.
> Old:                                  248852 decicycles
> New with gcc's attempt at av_bswap32: 256576 decicycles
> New with our bswap32:                 200260 decicycles
[...]
> diff --git a/libavutil/x86/bswap.h b/libavutil/x86/bswap.h
> index 52ffb4d..aa39d97 100644
> --- a/libavutil/x86/bswap.h
> +++ b/libavutil/x86/bswap.h
> @@ -37,7 +37,7 @@ static av_always_inline av_const unsigned av_bswap16(unsigned x)
>  }
>  #endif /* !AV_GCC_VERSION_AT_LEAST(4,1) */
>  
> -#if !AV_GCC_VERSION_AT_LEAST(4,5)
> +#if 1 || !AV_GCC_VERSION_AT_LEAST(4,5)
>  #define av_bswap32 av_bswap32
>  static av_always_inline av_const uint32_t av_bswap32(uint32_t x)
>  {

Also make sure -cpu/arch/tune is set so that gcc is allowed to use bswap.
The instruction is 486+, so gcc cannot use it when targeting strict x86.
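
To illustrate the point, here is a minimal sketch, not the actual libavutil
code. It contrasts a shift-and-mask byte swap, which needs no bswap
instruction, with the compiler builtin, which gcc can only lower to a single
bswap when the target CPU is known to be a 486 or newer. The hyp_* names are
hypothetical and exist only for this example. hyp_store_be32 writes a
big-endian 32-bit value byte by byte, in the spirit of AV_WB32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable byte swap using shifts and masks; works on any target. */
static inline uint32_t hyp_bswap32_portable(uint32_t x)
{
    return (x >> 24) |
           ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) |
           (x << 24);
}

/* Prefer the compiler builtin where it exists (gcc 4.3+ and clang).
 * Whether it becomes a single bswap depends on the target flags. */
static inline uint32_t hyp_bswap32(uint32_t x)
{
#if defined(__clang__) || \
    (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)))
    return __builtin_bswap32(x);
#else
    return hyp_bswap32_portable(x);
#endif
}

/* Store v at dst in big-endian order, one byte at a time.
 * Endian-independent, so no byte swap is needed here at all. */
static inline void hyp_store_be32(uint8_t *dst, uint32_t v)
{
    assert(dst != NULL);
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)v;
}
```

One way to check what gcc does with the builtin is to compile a 32-bit build
to assembly (-S) once with -march=i386 and once with -march=i486 or higher,
then compare the code generated for hyp_bswap32. The exact output depends on
the gcc version and optimization level.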

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Why not whip the teacher when the pupil misbehaves? -- Diogenes of Sinope