[FFmpeg-devel] [PATCH] SSE RDFT

Måns Rullgård mans
Sat Mar 20 22:34:06 CET 2010


Jason Garrett-Glaser <darkshikari at gmail.com> writes:

> On Sun, Mar 14, 2010 at 3:23 PM, Alex Converse <alex.converse at gmail.com> wrote:
>> I'm sure I've made some embarrassingly amateurish mistakes here.
>> Feedback is more than welcome.
>>
>> --Alex
>
> In the interests of getting away from discussions about yasm and into
> actually reviewing the asm...
>
> +///sign mask of RDFT sine terms
>
> Three / ?
>
> Looking at the asm overall, it looks like there's a huge amount of
> moving stuff around and very little actual calculation.  Is there no
> better way to organize it?
>
> +        "movlps     (%4,%0,4), %%xmm4     \n\t"
> +        "unpcklps      %%xmm4, %%xmm4     \n\t"
> +        "movlps     (%5,%0,4), %%xmm3     \n\t"
> +        "unpcklps      %%xmm3, %%xmm3     \n\t"
>
> This looks like a candidate for movsldup in an SSE3 version.

Well?
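
For reference, here is a rough scalar model in C of the shuffles being
compared. It is only a sketch: the model_* helpers are hypothetical
names, not part of the patch, and each one just mimics the lane
behaviour of the instruction it is named after. It assumes at least
four consecutive floats are readable at the load address.

```c
#include <string.h>

/* Scalar models of SSE/SSE3 lane semantics. All helper names are
 * hypothetical and exist only for this illustration. */

/* movlps mem, xmm: load two floats into lanes 0-1; lanes 2-3 are kept. */
static void model_movlps(float xmm[4], const float *mem)
{
    xmm[0] = mem[0];
    xmm[1] = mem[1];
}

/* unpcklps src, dst (AT&T operand order): dst = { dst0, src0, dst1, src1 }.
 * With src == dst this duplicates the two low lanes. */
static void model_unpcklps(float dst[4], const float src[4])
{
    float r[4] = { dst[0], src[0], dst[1], src[1] };
    memcpy(dst, r, sizeof r);
}

/* movsldup src, dst (SSE3): dst = { src0, src0, src2, src2 }. */
static void model_movsldup(float dst[4], const float src[4])
{
    float r[4] = { src[0], src[0], src[2], src[2] };
    memcpy(dst, r, sizeof r);
}

/* movshdup src, dst (SSE3): dst = { src1, src1, src3, src3 }. */
static void model_movshdup(float dst[4], const float src[4])
{
    float r[4] = { src[1], src[1], src[3], src[3] };
    memcpy(dst, r, sizeof r);
}
```

With { 1, 2, 3, 4 } in memory, the movlps + unpcklps pair yields
{ 1, 1, 2, 2 }, while movsldup on a full 16-byte load yields
{ 1, 1, 3, 3 } and movshdup yields { 2, 2, 4, 4 }. So an SSE3 version
would not be a drop-in swap; the surrounding code would have to be
arranged to consume the even/odd layout.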

-- 
Måns Rullgård
mans at mansr.com


