[FFmpeg-devel] [PATCH] MMX VP3 Loop Filter

David Conrad lessen42
Sat Oct 11 10:53:24 CEST 2008


On Oct 8, 2008, at 1:59 AM, David Conrad wrote:

> On Oct 7, 2008, at 5:43 AM, Jason Garrett-Glaser wrote:
>
>>> Here's an 8-bit version. However, checking for the C fallback negates
>>> the small speedup on my Penryn compared to the 16-bit version.
>>
>> Most of the code is still 16-bit.  Are you sure this can't be done
>> x264-style with emulation of extra bits and 8-bit math (reference for
>> an example of how to do this: common/x86/deblock-a.asm in x264 tree)?
>> This would eliminate the need for all unpacks, all packs, and all
>> multiplication, and probably increase speed dramatically.  I strongly
>> suspect that it can be done, as the deblocking formulas seem very
>> similar to those used in H.264.
>
> It seems like you're right; the only difference between x264's
> DEBLOCK_P0_Q0 and the VP3 filter is that VP3 uses a *3 where H.264
> uses a *4.
> I don't fully understand x264's implementation yet, so it'll take
> a bit longer to adapt it.
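
For reference, the *3 vs. *4 difference can be sketched in scalar C. This
is a rough sketch of the two formulas as I understand them, not code from
the patch: the function names and the p1/p0/q0/q1 naming (two pixels on
each side of the edge) are made up for illustration, and the clamping each
codec applies afterwards (VP3's bounding table, H.264's tc clip) is left
out.

```c
/* Unclamped filter delta across an edge: p1 p0 | q0 q1.
 * Assumes >> on a negative int is an arithmetic shift, which holds on
 * the usual compilers but is implementation-defined in C. */

/* VP3-style delta: the inner difference is weighted by 3. */
static int vp3_delta(int p1, int p0, int q0, int q1)
{
    return ((p1 - q1) + 3 * (q0 - p0) + 4) >> 3;
}

/* H.264-style delta: same shape, but the inner difference is weighted by 4. */
static int h264_delta(int p1, int p0, int q0, int q1)
{
    return ((p1 - q1) + 4 * (q0 - p0) + 4) >> 3;
}
```

In both cases the result is added to p0 and subtracted from q0 after
clamping, which is presumably why the same 8-bit approach carries over.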

And here's an entirely 8-bit implementation. It is ~3 cycles (about 30
dezicycles) faster than the last patch I posted.
I'm not sure of the best way to avoid duplicating the ff_pb_1/3/7
constants; there aren't enough registers to pass the addresses of all
the constants I need.

old:
590 dezicycles in ff_vp3_v_loop_filter_mmx, 4194251 runs, 53 skips
582 dezicycles in ff_vp3_v_loop_filter_mmx, 8388520 runs, 88 skips
577 dezicycles in ff_vp3_v_loop_filter_mmx, 16777013 runs, 203 skips
585 dezicycles in ff_vp3_v_loop_filter_mmx, 33554073 runs, 359 skips

new:
557 dezicycles in ff_vp3_v_loop_filter_mmx, 4194266 runs, 38 skips
550 dezicycles in ff_vp3_v_loop_filter_mmx, 8388539 runs, 69 skips
546 dezicycles in ff_vp3_v_loop_filter_mmx, 16777064 runs, 152 skips
557 dezicycles in ff_vp3_v_loop_filter_mmx, 33554058 runs, 374 skips
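
As a sanity check on the "~3 cycles" figure, the four readings in each
group can be averaged and converted to cycles. The sketch below assumes a
dezicycle is a tenth of a cycle; the array and function names are made up
for illustration.

```c
#include <stddef.h>

/* Timer readings copied from the output above, in dezicycles. */
static const int old_runs[] = { 590, 582, 577, 585 };
static const int new_runs[] = { 557, 550, 546, 557 };

/* Mean of n dezicycle readings, converted to cycles
 * (assuming 1 dezicycle = 0.1 cycle). */
static double mean_cycles(const int *dezicycles, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += dezicycles[i];
    return sum / (double)n / 10.0;
}
```

With these numbers the means come out to 58.35 cycles (old) and 55.25
cycles (new), a difference of 3.1 cycles, or roughly 5%.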

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 03-mmx-vp3-loop-filter-all8bit.txt
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20081011/53f1de28/attachment.txt>