[FFmpeg-devel] LIBMPEG2_BITSTREAM_READER vs. golomb.h
Mon Jul 14 02:14:30 CEST 2008
On Monday 14 July 2008, Måns Rullgård wrote:
> >> This is all annoying because LIBMPEG2_BITSTREAM_READER is slightly
> >> faster on ARM.
> > What about just using ALT_BITSTREAM_READER for ARMv6 and newer (cores
> > that support unaligned memory accesses)?
> I tried enabling HAVE_FAST_UNALIGNED, and it didn't make any
> significant difference.
> > It could be the fastest bitstream reader when implementing unaligned
> > 32-bit bigendian load as:
> > setend be
> > ldr ...
> > setend le
> ldr; rev is only two instructions.
But it takes 6 cycles on ARM11: an unaligned read has 4 cycles of latency, and
the rev instruction needs its argument as an 'early reg' (+1 more cycle of
penalty). The "ldr" + "rev" sequence is a dependency chain that you can't do
much about, so it's a bad choice.
On the other hand, the "setend be"/"ldr"/"setend le" sequence takes 3 cycles,
plus some latency before the load result becomes available. In the worst case
it is 5 cycles, which is already better than what you suggest. And you still
have some freedom to reorder instructions for better results.
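
For reference, here is a minimal C sketch of the operation both instruction
sequences implement: an unaligned 32-bit big-endian load. The function names
(swap32, load_be32_le_host, load_be32_ref) are made up for illustration and are
not taken from FFmpeg. The second function assumes a little-endian host; on
ARMv6 with unaligned access enabled, a compiler may turn it into the "ldr" +
"rev" pair, but that depends on the compiler and flags.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reverse the byte order of a 32-bit word,
 * which is what the "rev" instruction does. */
static uint32_t swap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Unaligned big-endian load, ASSUMING a little-endian host:
 * a native unaligned load ("ldr") followed by a byte swap ("rev").
 * The swap depends on the load result, hence the dependency chain. */
static uint32_t load_be32_le_host(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));   /* unaligned native-endian load */
    return swap32(v);           /* convert to big-endian interpretation */
}

/* Endian-independent reference: assemble the word byte by byte. */
static uint32_t load_be32_ref(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

The "setend be"/"ldr"/"setend le" variant computes the same value but has no
portable C equivalent, since it changes the data endianness of the core around
the load instead of swapping the loaded word afterwards.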