[Ffmpeg-devel] a little optim for a SSE version of H263_LOOP_FILTER

skal skal65535
Mon Nov 6 07:35:24 CET 2006


  Hi everybody,

> Message of 05/11/06 16:50
> >
> >  In case it is of interest, it seems to me an SSE version of
> >  H263_LOOP_FILTER is possible by replacing
> >        "psubusb %%mm4, %%mm2           \n\t"\
> >        "movq %%mm2, %%mm3              \n\t"\
> >        "psubusb %%mm4, %%mm3           \n\t"\
> >        "psubb %%mm3, %%mm2             \n\t"\
> >  at dsputil_mmx.c:587 (fresh cvs), by:
> >        "psubusb %%mm4, %%mm2           \n\t"\
> >        "pminub %%mm4, %%mm2            \n\t"\
> >
> >  plus maybe a small reorganization of the loop (mm3 is no longer needed).
> 
> Please send patch, I'll try to benchmark the speed change.
> 
> Note that movq is very slow on P4, so any code that removes
> mov(q|dqu|..) provides an interesting speed-up.
> 
> 
> >  Well, this is just for the fun of it, since the speed-up
> >  (if any) might not be worth a special version...
> 
> Once I have a patch to play with, I can benchmark it on P4, PM, and K8... :)

   Sure, attached is the diff (for testing only!)
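
   For anyone who wants to check the substitution without the patch, here
   is a minimal scalar sketch of the two sequences, one byte lane at a
   time. The helper names (seq_mmx, seq_sse, count_mismatches,
   verify_equivalence) are made up for this illustration and are not from
   dsputil_mmx.c; the three small functions only model the documented
   per-byte behaviour of psubusb, psubb and pminub.

```c
#include <assert.h>
#include <stdint.h>

/* Per-byte models of the instructions (a single lane of the register). */
static uint8_t psubusb(uint8_t dst, uint8_t src) /* unsigned saturating subtract */
{
    return dst > src ? (uint8_t)(dst - src) : 0;
}

static uint8_t psubb(uint8_t dst, uint8_t src)   /* wrapping subtract */
{
    return (uint8_t)(dst - src);
}

static uint8_t pminub(uint8_t dst, uint8_t src)  /* unsigned minimum */
{
    return dst < src ? dst : src;
}

/* Current four-instruction sequence; uses mm3 as scratch. */
static uint8_t seq_mmx(uint8_t mm2, uint8_t mm4)
{
    uint8_t mm3;
    mm2 = psubusb(mm2, mm4);
    mm3 = mm2;                /* movq %%mm2, %%mm3 */
    mm3 = psubusb(mm3, mm4);
    mm2 = psubb(mm2, mm3);
    return mm2;
}

/* Proposed two-instruction sequence; no scratch register. */
static uint8_t seq_sse(uint8_t mm2, uint8_t mm4)
{
    mm2 = psubusb(mm2, mm4);
    mm2 = pminub(mm2, mm4);
    return mm2;
}

/* Number of (mm2, mm4) byte pairs on which the two sequences differ. */
static int count_mismatches(void)
{
    int bad = 0;
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            if (seq_mmx((uint8_t)a, (uint8_t)b) != seq_sse((uint8_t)a, (uint8_t)b))
                bad++;
    return bad;
}

/* Aborts if the two sequences ever disagree. */
static void verify_equivalence(void)
{
    assert(count_mismatches() == 0);
}
```

   The reason it works: with t = psubusb(mm2, mm4), the old code computes
   t - psubusb(t, mm4), and subtracting "t minus mm4, clamped at 0" from t
   leaves exactly min(t, mm4). Calling verify_equivalence() from a small
   main() checks all 65536 byte pairs.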

> 
> > (gotta love these saturated instructions. All of h263's
> > UpDownRamp() with 2 instructions is quite fun)
> 
> Mmmm... grep -r "UpDownRamp" libav* returns nothing here, and neither
> does Google Code Search.
> What kind of code are you referring to?

    It's the name used in the ITU-T H.263 spec.
    ( e.g. :  http://nova.postech.ac.kr/~dkim/course/cs703a/h263.pdf ,
    says Google)
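
    As a rough sketch of the "2 instructions" remark: the C below compares
    the ramp, written the way I recall it from the deblocking filter annex
    of H.263 (please check the formula against the spec), with a
    psubusb + pminub pair applied to the magnitude. It assumes one register
    holds 2*STRENGTH in every byte and the other holds abs(x); the sign is
    assumed to be handled separately. All function names here are
    hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

static int imax(int a, int b) { return a > b ? a : b; }

/* UpDownRamp as recalled from the spec (assumption, verify before use):
 *   SIGN(x) * MAX(0, abs(x) - MAX(0, 2 * (abs(x) - STRENGTH)))          */
static int up_down_ramp(int x, int strength)
{
    int a = abs(x);
    int m = imax(0, a - imax(0, 2 * (a - strength)));
    return x < 0 ? -m : m;
}

/* Magnitude of the ramp with two saturated byte operations.
 * strength2 is assumed to already hold 2*STRENGTH. */
static uint8_t ramp_2ops(uint8_t abs_x, uint8_t strength2)
{
    uint8_t t = strength2 > abs_x ? (uint8_t)(strength2 - abs_x) : 0; /* psubusb */
    return t < abs_x ? t : abs_x;                                     /* pminub  */
}

/* Number of (STRENGTH, abs(x)) pairs on which the two forms differ. */
static int count_ramp_mismatches(void)
{
    int bad = 0;
    for (int s = 0; s <= 127; s++)        /* 2*s must still fit in a byte */
        for (int a = 0; a <= 255; a++)
            if (up_down_ramp(a, s) != ramp_2ops((uint8_t)a, (uint8_t)(2 * s)))
                bad++;
    return bad;
}
```

    The ramp rises as abs(x) up to STRENGTH, falls as 2*STRENGTH - abs(x)
    up to 2*STRENGTH, and is zero beyond that, which is the minimum of
    abs(x) and the saturated difference 2*STRENGTH - abs(x).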


   bye!

Skal
-------------- next part --------------
A non-text attachment was scrubbed...
Name: /home/massimin/h263_loopfilter_sse_test_only.diff
Type: application/octet-stream
Size: 655 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20061106/1996760f/attachment.obj>


