[FFmpeg-devel] [PATCH] move H264 IDCT to yasm

Ronald S. Bultje rsbultje
Mon Sep 6 23:00:12 CEST 2010


Hi,

this patch moves the H264 IDCT (the LGPL part) to yasm. Performance of
most loop-heavy functions improves quite a bit, some by up to 50%,
because gcc does a remarkably poor job of setting up loops (I'm not
joking here). Performance of one particular function (intra16_mmx2) is
mildly worse (a few cycles), and I don't quite understand why, since the
code is identical. This might be related to alignment (gcc pads the
targets it jmps to with nops, and I don't yet know how to do that in
yasm); otherwise I don't really know. Let me know if you want detailed
performance statistics for each function.

Ronald
-------------- next part --------------
A non-text attachment was scrubbed...
Name: yamsify-h264_idct.patch
Type: application/octet-stream
Size: 38431 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100906/dfb7347a/attachment.obj>