[FFmpeg-devel] [PATCH] h264.c/decode_cabac_residual optimization
Tue Jul 1 19:57:35 CEST 2008
> - gcc pointlessly unrolls the
>   "while( coeff_abs < 15 && get_cabac( CC, ctx ) )" loop into taking
>   up about half of the compiled function when x86 asm is used, so it
>   could be rewritten in asm to fix that.
I've noticed the same thing in CABAC encoding/decoding code in both
x264 and ffmpeg: GCC tends to unroll certain loops, which makes the
containing function much larger, and when it later inlines that
function it does not revisit the unrolling decision. As a result,
inlining an apparently trivial function can increase code size by 20
kilobytes or more, and on investigation that function turns out to
have had an absurd amount of loop unrolling applied to it.
I suspect there could be a significant speed boost gained from
reducing the amount of unrolling and inlining going on, allowing
inlining to be targeted more carefully.
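
As a rough illustration of what "reducing the unrolling and inlining"
could look like at the source level, here is a minimal sketch rather
than a patch against h264.c. ToyBinSource, toy_get_bin(),
toy_decode_coeff_abs(), TOY_NOINLINE and the "start" parameter are
all hypothetical stand-ins for the real CABAC state and get_cabac().
It assumes a GCC new enough to accept "#pragma GCC unroll" (GCC 8 or
later); the noinline attribute is much older. Whether either hint
helps speed would have to be measured.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the CABAC decoder state: a plain bin string. */
typedef struct {
    const uint8_t *bits;  /* one already-decoded bin (0 or 1) per byte */
    size_t pos;           /* index of the next bin to hand out */
    size_t len;           /* number of bins available */
} ToyBinSource;

/* Hypothetical replacement for get_cabac(): next bin, or 0 when exhausted. */
static int toy_get_bin(ToyBinSource *s)
{
    return s->pos < s->len ? s->bits[s->pos++] : 0;
}

/* Keep the loop's code in one place instead of copying it into callers. */
#if defined(__GNUC__)
#define TOY_NOINLINE __attribute__((noinline))
#else
#define TOY_NOINLINE
#endif

/* Same loop shape as the one quoted above; "start" is an assumed
 * parameter so the sketch does not depend on the real initial value. */
TOY_NOINLINE int toy_decode_coeff_abs(ToyBinSource *s, int start)
{
    int coeff_abs = start;

    /* Ask GCC not to unroll this loop (a value of 1 blocks unrolling). */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 8
#pragma GCC unroll 1
#endif
    while (coeff_abs < 15 && toy_get_bin(s))
        coeff_abs++;

    return coeff_abs;
}
```

Comparing the object size (for example with "size" or objdump) with
and without the two hints is one way to check whether the unrolling
is really what inflates the function.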