[FFmpeg-devel] [RFC] use ff_avc_find_startcode in ff_find_start_code
Tue Feb 19 15:45:40 CET 2008
On Tue, Feb 19, 2008 at 03:22:21PM +0100, Michael Niedermayer wrote:
> On Tue, Feb 19, 2008 at 01:30:53PM +0100, Reimar Döffinger wrote:
> > Hello,
> > > On Tue, Feb 19, 2008 at 10:22:09AM +0100, Reimar Döffinger wrote:
> > > On Tue, Feb 19, 2008 at 03:57:07AM +0100, Michael Niedermayer wrote:
> > > > > On Mon, Feb 18, 2008 at 11:07:38PM +0100, Reimar Döffinger wrote:
> > > > > First I wonder if that ff_find_start_code function is
> > > > > not quite buggy anyway, or is it intentional that it searches for
> > > > > 00 00 01 00 in the part involving the state but for 00 00 01 ?? in the
> > > > > > later code? If so, could somebody document the code?
> > > > > Anyway, this is a quite ugly patch that makes the function use
> > > > > ff_avc_find_startcode (since that is in lavf, it can't be used as is of
> > > > > course).
> > > > > It probably also breaks the original use of ff_avc_find_startcode,
> > > > > though I found the current behaviour a bit strange as well, and this
> > > > > function is undocumented, too.
> > > > > This causes at least a 6% speedup when decoding
> > > > > http://samples.mplayerhq.hu/GXF/THX_Science_FLT_1920.gxf (I only tested
> > > > > with MPlayer though).
> > > >
> > > > current code:
> > > > 7054 dezicycles in findsc, 262131 runs, 13 skips
> > > > 7082 dezicycles in findsc, 262134 runs, 10 skips
> > > >
> > > > your code:
> > > > 11371 dezicycles in findsc, 262119 runs, 25 skips
> > > > 11624 dezicycles in findsc, 262115 runs, 29 skips
> > > >
> > > > gcc: 4.3.0 20080127 (experimental)
> > > > 800mhz duron
> > > > ffmpeg -v 9 -i matrixbench_mpeg2.mpg -vcodec copy -an -y test.avi
> > >
> > > Please use MPlayer, for some reason it gives these numbers:
> > > current code:
> > > 1039 dezicycles in test, 4096 runs, 0 skips
> > > 1031 dezicycles in test, 8192 runs, 0 skips
> > > 1022 dezicycles in test, 16384 runs, 0 skips
> > >
> > > my code:
> > > 623 dezicycles in test, 4096 runs, 0 skips
> > > 624 dezicycles in test, 8192 runs, 0 skips
> > > 631 dezicycles in test, 16384 runs, 0 skips
> > I think I found the reason for the discrepancies: the current code seems
> > about 25% faster with the parser, whereas the decoder is about the same
> > amount slower...
> > Can someone help me find out why, or should we just use two different
> > implementations?
> As I've already said, the decoder only searches 3 bytes, and what you print
> above is a 40 CPU cycle difference. First, I don't understand what causes that;
> second, 40 cycles per slice, 36 slices in 576 lines and 25 fps are 36k cycles
> per second. That is just 0.0072% on a 500 MHz system. This has absolutely no
> chance of causing any measurable difference.
> Also, the 3-minute matrixbench_mpeg2 has 3*60*25*36 slices, that is 162000;
> your number of 16384 runs looks very strange (not to mention that 40 cycles
> in code run 16384 times does not matter ...).
> Still, it would be interesting to know why there's a 40-cycle difference ...
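
For reference, the arithmetic in the quoted paragraph can be checked with a
few lines of C. This is only a sketch: the function names overhead_percent
and slices_in_clip are made up for illustration, and the inputs are the
figures quoted above (40 cycles per slice, 36 slices per frame, 25 fps,
a 500 MHz CPU, a 3-minute clip).

```c
#include <stdio.h>

/* Percentage of one CPU second spent on a fixed per-slice cost.
 * Hypothetical helper, not part of FFmpeg. */
static double overhead_percent(double cycles_per_slice, double slices_per_frame,
                               double fps, double cpu_hz)
{
    double cycles_per_second = cycles_per_slice * slices_per_frame * fps;
    return 100.0 * cycles_per_second / cpu_hz;
}

/* Number of slices in a clip of the given length.
 * Hypothetical helper, not part of FFmpeg. */
static long slices_in_clip(long minutes, long fps, long slices_per_frame)
{
    return minutes * 60 * fps * slices_per_frame;
}

/* Print both figures for the values used in the mail. */
static void print_estimate(void)
{
    printf("%.4f%% of a 500 MHz CPU\n", overhead_percent(40, 36, 25, 500e6));
    printf("%ld slices in 3 minutes\n", slices_in_clip(3, 25, 36));
}
```

Calling print_estimate() from main() prints the percentage and the slice
count; both should agree with the numbers given in the quoted text.
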
Here's a benchmark from just the call in the decoder (duron & matrixbench_mpeg2):
ffmpeg -v 9 -i matrixbench_mpeg2.mpg -an -y -f rawvideo /dev/null
2504 dezicycles in A2, 131069 runs, 3 skips
2832 dezicycles in A2, 131068 runs, 4 skips
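
To make the "00 00 01 ??" pattern from the start of the thread concrete, here
is a minimal sketch of a search for the three-byte 00 00 01 prefix in a
buffer. It is not the code of ff_find_start_code or ff_avc_find_startcode:
find_start_code_prefix is a hypothetical name, and the state that the real
function carries across calls is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

/* Return a pointer to the first 00 00 01 prefix in [p, end), or end if
 * there is none. Hypothetical helper for illustration only. */
static const uint8_t *find_start_code_prefix(const uint8_t *p, const uint8_t *end)
{
    while (end - p >= 3) {
        if (p[2] > 1) {
            /* p[2] is neither 00 nor 01, so no prefix can cover it. */
            p += 3;
        } else if (p[2] == 1) {
            if (p[0] == 0 && p[1] == 0)
                return p;            /* found 00 00 01 */
            /* A prefix starting at p+1 or p+2 would need p[2] == 0. */
            p += 3;
        } else {
            /* p[2] == 0: a prefix may start at p+1 (if p[1] == 0) or p+2. */
            p += (p[1] == 0) ? 1 : 2;
        }
    }
    return end;
}
```

The byte after a match (the "??") is the start code value, so a caller would
read p[3] after checking that it still lies inside the buffer.
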
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Many that live deserve death. And some that die deserve life. Can you give
it to them? Then do not be too eager to deal out death in judgement. For
even the very wise cannot see all ends. -- Gandalf