[Ffmpeg-devel] a little optim for a SSE version of H263_LOOP_FILTER

Guillaume POIRIER poirierg
Mon Nov 6 14:48:19 CET 2006


Hi,

On 11/6/06, skal <skal65535 at orange.fr> wrote:
>   Hi everybody,
>
> > Message of 05/11/06 16:50
> > >
> > >  in case, it seems to me a SSE version of
> > >  H263_LOOP_FILTER is possible by replacing
> > >        "psubusb %%mm4, %%mm2           \n\t"\
> > >        "movq %%mm2, %%mm3              \n\t"\
> > >        "psubusb %%mm4, %%mm3           \n\t"\
> > >        "psubb %%mm3, %%mm2             \n\t"\
> > >  at dsputil_mmx.c:587 (fresh cvs), by:
> > >        "psubusb %%mm4, %%mm2           \n\t"\
> > >        "pminub %%mm4, %%mm2            \n\t"\
> > >
> > >  +maybe a little re-org of the loop (mm3 is gone).
> >
> > Please send patch, I'll try to benchmark the speed change.
> >
> > Note that movq is very slow on P4, so any code that removes
> > mov(q|dqu|..) provides an interesting speed-up.
> >
> >
> > >  Well, this is just for the fun of it, since the speed-up
> > >  (if any) might not be worth a special version...
> >
> > Once I have a patch to play with, I can benchmark it on P4, PM, and K8... :)
>
>    sure, attached is the diff (test only!)

Mmm.. I thought it would be slightly more complicated since you said
"+maybe a little re-org of the loop (mm3 is gone)."
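
For reference, here is a minimal scalar C sketch of why the two
sequences should agree, modelling a single byte lane. The helper names
(sub_sat_u8, tail_mmx, tail_pminub, count_mismatches, check_equivalence)
are made up for illustration and are not in dsputil_mmx.c; x stands for
a byte of mm2 and s for the matching byte of mm4.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned saturating byte subtraction: what psubusb does per byte lane. */
uint8_t sub_sat_u8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* Original tail: psubusb / movq / psubusb / psubb (x = mm2, s = mm4). */
uint8_t tail_mmx(uint8_t x, uint8_t s)
{
    uint8_t y = sub_sat_u8(x, s);   /* psubusb mm4, mm2 */
    uint8_t z = sub_sat_u8(y, s);   /* movq mm2, mm3 ; psubusb mm4, mm3 */
    return (uint8_t)(y - z);        /* psubb mm3, mm2 (z <= y, so no wrap) */
}

/* Proposed tail: psubusb / pminub. */
uint8_t tail_pminub(uint8_t x, uint8_t s)
{
    uint8_t y = sub_sat_u8(x, s);   /* psubusb mm4, mm2 */
    return y < s ? y : s;           /* pminub mm4, mm2 (unsigned byte min) */
}

/* Number of (x, s) byte pairs for which the two tails differ. */
int count_mismatches(void)
{
    int bad = 0;
    for (int x = 0; x < 256; x++)
        for (int s = 0; s < 256; s++)
            if (tail_mmx((uint8_t)x, (uint8_t)s) !=
                tail_pminub((uint8_t)x, (uint8_t)s))
                bad++;
    return bad;
}

/* Exhaustive check over all 65536 byte pairs. */
void check_equivalence(void)
{
    assert(count_mismatches() == 0);
}
```

If count_mismatches() returns 0, both tails compute min(x -sat s, s)
for every byte pair, which is all the replacement relies on. Keep in
mind that pminub on MMX registers needs the SSE integer extensions
(MMX2), so it cannot go into the plain MMX code path.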

Ok, I've tried your patch. Regression tests pass; however, I have
trouble testing your patch beyond that. It seems I lack a sample that
actually uses the in-loop filter.
I've tried this sample
http://samples.mplayerhq.hu/V-codecs/h263/100374.mov and a couple of
others, without any luck...
Maybe I'm not benchmarking the relevant parts of the in-loop filter
(see the attachment for what I was benchmarking)?
Or could you provide a sample to run my benchmarks on?

> > > (gotta love these saturated instructions. All of h263's
> > > UpDownRamp() with 2 instructions is quite fun)
> >
> > Mmmm... grep -r "UpDownRamp" libav* doesn't return anything here, as
> > well as in google code search.
> > What kind of code are you referring to?
>
>     It's the name used in the h263 ISO spec.
>     ( e.g. :  http://nova.postech.ac.kr/~dkim/course/cs703a/h263.pdf ,
>     says Google)

Whoops, I've just been bitten by the "Google is your friend" line that
I love to send to everyone who asks obvious questions ;-)

Guillaume
-- 
With DADVSI (http://en.wikipedia.org/wiki/DADVSI), France finally has
a lead on the USA in selling out individuals' rights to corporations!
Vive la France!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bench_h263.diff
Type: text/x-diff
Size: 1605 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20061106/4403cb2c/attachment.diff>


