[FFmpeg-devel] [PATCH] SSE RDFT

Alex Converse alex.converse
Sat Mar 20 22:43:21 CET 2010


2010/3/20 Måns Rullgård <mans at mansr.com>

> Jason Garrett-Glaser <darkshikari at gmail.com> writes:
>
> > On Sun, Mar 14, 2010 at 3:23 PM, Alex Converse <alex.converse at gmail.com> wrote:
> >> I'm sure I've made some embarrassingly amateurish mistakes here.
> >> Feedback is more than welcome.
> >>
> >> --Alex
> >
> > In the interests of getting away from discussions about yasm and into
> > actually reviewing the asm...
> >
> > +///sign mask of RDFT sine terms
> >
> > Three / ?
> >
> > Looking at the asm overall, it looks like there's a huge amount of
> > moving stuff around and very little actual calculation.  Is there no
> > better way to organize it?
> >
> > +        "movlps     (%4,%0,4), %%xmm4     \n\t"
> > +        "unpcklps      %%xmm4, %%xmm4     \n\t"
> > +        "movlps     (%5,%0,4), %%xmm3     \n\t"
> > +        "unpcklps      %%xmm3, %%xmm3     \n\t"
> >
> > This looks like a candidate for movsldup in an SSE3 version.
>
> Well?
>

Sorry, I've been a little tied up trying to finish up PS.

There is a lot of data shuffling in here. One way to reduce it would be to
reorganize the trig tables, but keeping extra trig tables around is always
a bit controversial.
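
To make the trade-off concrete, here is a rough sketch in C intrinsics rather
than inline asm. It assumes one possible reorganization: a second table with
every entry stored twice, so that a single load replaces the movlps +
unpcklps pair. The function names and the duplicated layout are hypothetical
and only illustrate the idea; they are not taken from the patch.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Same effect as the movlps + unpcklps pair quoted above:
 * load tab[i], tab[i+1] and produce {tab[i], tab[i], tab[i+1], tab[i+1]}. */
static __m128 load_dup_shuffle(const float *tab, int i)
{
    __m128 v = _mm_loadl_pi(_mm_setzero_ps(), (const __m64 *)(tab + i));
    return _mm_unpacklo_ps(v, v);
}

/* Hypothetical extra table: each entry of tab stored twice, so
 * dup_tab[2*k] == dup_tab[2*k+1] == tab[k]. Costs twice the memory. */
static void build_dup_table(float *dup_tab, const float *tab, int n)
{
    for (int k = 0; k < n; k++) {
        dup_tab[2 * k]     = tab[k];
        dup_tab[2 * k + 1] = tab[k];
    }
}

/* With the duplicated table, one (unaligned) load gives the same vector
 * and the shuffle disappears. */
static __m128 load_dup_table(const float *dup_tab, int i)
{
    return _mm_loadu_ps(dup_tab + 2 * i);
}
```

The shuffle goes away, but at the price of a table twice the size, which is
the "extra trig tables" cost mentioned above.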


