[FFmpeg-devel] Fix MMX dct_quantize for non zigzag_direct scans

Ramiro Polla ramiro
Thu May 15 00:25:43 CEST 2008


>>> and id like to see benchmarks as well, so we can be sure this doesnt
>>> slow the code down
>> Adding custom permutation code for ff_alternate_vertical_scan the same way 
>> it's done for ff_zigzag_direct gives these results (the source file is 
>> properly cached in memory, 10 runs removing the highest and lowest):
>> ./ffmpeg_g -benchmark -s 352x288 -i paris.cif -flags +alt -vcodec mpeg4 -y 
>> -f rawvideo /dev/null
>>       ref     new
>> avg   2.7905  2.7390
>> stdev 0.0318  0.0287
> I do not care at all about alternate_scan speed. I care about zigzag_direct
> speed! Its whats used 99% of the time.
> even a 0.1% speedloss for zigzag means rejected patch!

Tested with a bigger cif sample.

       ref     new
avg   7.9774  7.9652
stdev 0.0433  0.0396

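I ran the tests in single-user mode (init 1) with everything that I
could stop (no network, no log daemon, no fs journaling, no cron, no
other daemons), and still couldn't get the stdev below the 0.1% you
want. In the run above the stdev is about 0.5% of the average, while
the two averages differ by only about 0.15%.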
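
To put the table above in relative terms, here is a minimal sketch in C.
The struct and helper names are made up for illustration, and the "within
noise" test is only a crude criterion (difference smaller than the larger
stdev), not a proper significance test.

```c
/* Average and standard deviation of one benchmark series, in seconds. */
typedef struct DemoRun {
    double avg;
    double stdev;
} DemoRun;

/* x expressed as a percentage of base. */
static double demo_percent_of(double x, double base)
{
    return 100.0 * x / base;
}

/* Returns 1 when the gap between the two averages is smaller than the
 * larger of the two standard deviations, i.e. when this crude criterion
 * cannot tell the two series apart. */
static int demo_within_noise(DemoRun a, DemoRun b)
{
    double diff  = a.avg > b.avg ? a.avg - b.avg : b.avg - a.avg;
    double noise = a.stdev > b.stdev ? a.stdev : b.stdev;
    return diff < noise;
}
```

Fed with the figures above (ref 7.9774 / 0.0433, new 7.9652 / 0.0396),
the stdev comes out at roughly half a percent of the average, so a 0.1%
change cannot be resolved with this setup.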

Benchmarking with START/STOP_TIMER isn't very reliable here, since the
time the runs take varies with last_non_zero. Also, the patch changes
not only the MMX code but also removes the hack in mpegvideo_enc.c.

The changes for direct zigzag are basically:
- read the inverse table from the context instead of from a static
array (shouldn't slow anything down).
- remove an if(s->alternate_scan) from mpegvideo_enc.c.
- add an if(s->intra_scantable.scantable == ff_zigzag_direct) to the MMX code.
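
To illustrate the kind of check the last item adds, here is a rough
sketch, not the code from the patch. DemoContext, demo_zigzag and the
helper functions are hypothetical stand-ins; only the idea of comparing
the scantable pointer against the direct zigzag table comes from the
list above.

```c
#include <stdint.h>

/* Classic 8x8 zigzag order, standing in for ff_zigzag_direct. */
static const uint8_t demo_zigzag[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

/* Hypothetical, trimmed-down context: only the field the check needs.
 * It plays the role of s->intra_scantable.scantable. */
typedef struct DemoContext {
    const uint8_t *scantable;
} DemoContext;

enum { DEMO_PATH_HARDCODED_ZIGZAG, DEMO_PATH_GENERIC };

/* The check compares pointers, not table contents, so it costs one
 * compare and one branch per call. */
static int demo_pick_path(const DemoContext *s)
{
    return s->scantable == demo_zigzag ? DEMO_PATH_HARDCODED_ZIGZAG
                                       : DEMO_PATH_GENERIC;
}

/* Generic fallback: index, in scan order, of the last nonzero
 * coefficient, or -1 when the block is all zero. */
static int demo_last_non_zero(const int16_t block[64], const uint8_t *scantable)
{
    for (int i = 63; i >= 0; i--)
        if (block[scantable[i]])
            return i;
    return -1;
}
```

For a block whose only nonzero coefficient is at raster position 8,
demo_last_non_zero() gives 2 with the zigzag table, since position 8 is
the third entry in zigzag order.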

I think this is a good solution that shows no measurable speed loss. I
can test more if someone knows a way to do extremely accurate
benchmarking (<0.1% stdev).

We could also have two versions of each function, one hardcoded to
direct zigzag and another more generic, with the choice made in
MPV_common_init_mmx() (under #ifdef CONFIG_SMALL). I attached an example
of what could be done. Alternatively, if my first patch is accepted, we
could have an av_always_inline dct_quantize_xxx_template(...,
direct_zigzag) and create dct_quantize_xxx_direct and
dct_quantize_xxx_normal from it.
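
The template idea could look roughly like the sketch below. This is not
the attached patch: all demo_* names, the toy quantizer and the
DEMO_SMALL switch are assumptions made for illustration, and
demo_always_inline only imitates what av_always_inline does on GCC-like
compilers.

```c
#include <stdint.h>

#if defined(__GNUC__)
#define demo_always_inline inline __attribute__((always_inline))
#else
#define demo_always_inline inline
#endif

/* Classic 8x8 zigzag order, standing in for ff_zigzag_direct. */
static const uint8_t demo_zigzag[64] = {
     0,  1,  8, 16,  9,  2,  3, 10,
    17, 24, 32, 25, 18, 11,  4,  5,
    12, 19, 26, 33, 40, 48, 41, 34,
    27, 20, 13,  6,  7, 14, 21, 28,
    35, 42, 49, 56, 57, 50, 43, 36,
    29, 22, 15, 23, 30, 37, 44, 51,
    58, 59, 52, 45, 38, 31, 39, 46,
    53, 60, 61, 54, 47, 55, 62, 63
};

/* Shared body. direct_zigzag is a compile-time constant in both
 * wrappers, so after inlining the compiler can drop the unused branch.
 * qscale must be nonzero. Returns the scan index of the last nonzero
 * coefficient, or -1 when every coefficient quantizes to zero. */
static demo_always_inline int
demo_quantize_template(int16_t *block, int qscale,
                       const uint8_t *scantable, int direct_zigzag)
{
    const uint8_t *scan = direct_zigzag ? demo_zigzag : scantable;
    int last_non_zero = -1;
    for (int i = 0; i < 64; i++) {
        int pos = scan[i];
        block[pos] = (int16_t)(block[pos] / qscale); /* toy quantizer */
        if (block[pos])
            last_non_zero = i;
    }
    return last_non_zero;
}

/* Specialised for the direct zigzag table; scantable is ignored. */
static int demo_quantize_direct(int16_t *block, int qscale,
                                const uint8_t *scantable)
{
    (void)scantable;
    return demo_quantize_template(block, qscale, demo_zigzag, 1);
}

/* Works with whatever scan table the caller passes in. */
static int demo_quantize_normal(int16_t *block, int qscale,
                                const uint8_t *scantable)
{
    return demo_quantize_template(block, qscale, scantable, 0);
}

typedef int (*demo_quantize_fn)(int16_t *, int, const uint8_t *);

/* Chosen once at init time. Defining DEMO_SMALL keeps only the generic
 * version in use, which is one possible reading of the CONFIG_SMALL
 * remark above. */
static demo_quantize_fn demo_select_quantizer(const uint8_t *scantable)
{
#ifdef DEMO_SMALL
    (void)scantable;
    return demo_quantize_normal;
#else
    return scantable == demo_zigzag ? demo_quantize_direct
                                    : demo_quantize_normal;
#endif
}
```

With this layout the per-call branch on the scan table disappears: the
decision is made once when the function pointer is set, at the cost of
carrying two copies of the quantizer body.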

Ramiro Polla
-------------- next part --------------
A non-text attachment was scrubbed...
Name: direct_zigzag.diff
Type: text/x-patch
Size: 6089 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080514/79fd40ff/attachment.bin>
