[FFmpeg-devel] [PATCH] h264.c/decode_cabac_residual optimization

Jason Garrett-Glaser darkshikari
Tue Jul 1 19:57:35 CEST 2008

> - gcc pointlessly unrolls the "while( coeff_abs < 15 && get_cabac( CC, ctx )
> )" loop into taking up about half of the compiled function when x86 asm is
> used, so it could be rewritten in asm to fix that.

I've noticed this when doing CABAC encoding/decoding work in
both x264 and ffmpeg: GCC has a tendency to unroll certain loops,
making a function much larger, and then, when inlining that function,
it does not revisit the unrolling choice.  This results in cases where
inlining an apparently trivial function can increase code size by 20
kilobytes or more, and when one investigates, one finds that the function
had an absurd amount of loop unrolling done to it.

I suspect there could be a significant speed boost gained from
reducing the amount of unrolling and inlining going on, allowing
inlining to be targeted more carefully.
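
As a rough illustration of what "reducing the unrolling and inlining"
could look like at the source level, here is a minimal sketch. It is
not the h264.c code: ToyBitReader, toy_get_bit and toy_read_coeff_abs
are hypothetical stand-ins for the CABAC context and get_cabac(), and
the loop only copies the shape of the one quoted above. The noinline
attribute and "#pragma GCC unroll" are real GCC features, but the
pragma only exists in GCC 8 and later, so it is guarded here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a CABAC context: a plain MSB-first bit reader. */
typedef struct {
    const uint8_t *buf;
    size_t         size; /* buffer size in bytes */
    size_t         pos;  /* current position in bits */
} ToyBitReader;

/* Returns the next bit, or 0 once the buffer is exhausted. */
static int toy_get_bit(ToyBitReader *r)
{
    assert(r != NULL);
    if (r->pos >= r->size * 8)
        return 0;
    int bit = (r->buf[r->pos >> 3] >> (7 - (r->pos & 7))) & 1;
    r->pos++;
    return bit;
}

/* noinline asks the compiler to keep a single out-of-line copy, so an
 * unrolled loop body is not duplicated into every caller. */
#if defined(__GNUC__)
#define TOY_NOINLINE __attribute__((noinline))
#else
#define TOY_NOINLINE
#endif

/* Same shape as the quoted loop: count leading 1 bits, capped at 15.
 * The starting value 0 is an arbitrary choice for this sketch. */
static TOY_NOINLINE int toy_read_coeff_abs(ToyBitReader *r)
{
    int coeff_abs = 0;

    /* Request no unrolling for this one loop (GCC 8 and later only). */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 8
#pragma GCC unroll 1
#endif
    while (coeff_abs < 15 && toy_get_bit(r))
        coeff_abs++;

    return coeff_abs;
}
```

On compilers without the pragma, building the file with
-fno-unroll-loops is the coarser alternative. Whether either change
actually helps speed would have to be measured; the sketch only shows
where the knobs are.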

Dark Shikari
