[Ffmpeg-devel] [PATCH] SSE counterpart of ff_imdct_calc_3dn2

Luca Barbato lu_zero
Thu Aug 24 19:47:27 CEST 2006


Loren Merritt wrote:
> On Thu, 24 Aug 2006, Luca Barbato wrote:
> 
>> Zuxy Meng wrote:
>>
>>> +    z += n8;
>>
>> [...]
>>> +    for(k = 0; k < n8; k += 2) {
>> [...]
>>> +        asm (
>>> +            "movaps          %4, %%xmm0 \n\t"   // xmm0 = 0 1 2 3
>>> +            "movaps          %5, %%xmm1 \n\t"   // xmm1 = 4 5 6 7
>> [...]
>>> +            :"m"(z[k]), "m"(z[-2 - k])
>>
>> Am I missing something, or could it be unaligned?
>> z is 8 bytes, not 16.
> 
> The array index is even.
I know.

> In order for n8 to be odd you'd need an 8
> element fft.

I need an odd multiple of 8.

> Nothing in ffmpeg does one that small, and the simd code
> would break for more reasons than just alignment.
> 
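
To spell out the alignment argument, here is a rough sketch, not taken from
the patch. It assumes the base of z is 16-byte aligned, that one element is
two floats (8 bytes), and that n8 = n / 8 with n a power of two. The names
cplx, is_aligned16 and loop_is_aligned are hypothetical helpers made up for
illustration only.

```c
#include <stddef.h>

/* Assumption: one complex element is two floats, i.e. 8 bytes. */
typedef struct { float re, im; } cplx;

/* idx is an element index relative to a 16-byte-aligned base.
 * Returns 1 if a movaps access to that element would be 16-byte aligned. */
static int is_aligned16(long idx)
{
    long off = idx * (long)sizeof(cplx);   /* byte offset from the base */
    return (off % 16) == 0;
}

/* Check both memory operands of the loop, z[k] and z[-2 - k],
 * after z has been advanced by n8 elements. */
static int loop_is_aligned(int n)
{
    int n8 = n >> 3;
    for (int k = 0; k < n8; k += 2) {
        if (!is_aligned16(n8 + k) || !is_aligned16(n8 - 2 - k))
            return 0;
    }
    return 1;
}
```

Under those assumptions loop_is_aligned(8) fails because n8 is 1, while any
larger power of two gives an even n8 and, with k even, every operand lands
on a 16-byte boundary.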

lu

-- 

Luca Barbato

Gentoo/linux Gentoo/PPC
http://dev.gentoo.org/~lu_zero
