[FFmpeg-devel] [PATCH 2/2] SSE optimized mp3 windowing

Vitor Sessak vitor1001
Wed Jun 23 22:33:21 CEST 2010


On 06/17/2010 10:56 PM, Vitor Sessak wrote:
> On 06/17/2010 09:56 PM, Loren Merritt wrote:
>> On Thu, 17 Jun 2010, Vitor Sessak wrote:
>>
>>> + "movaps (%0,%5), %%xmm1 \n\t"
>>> + "movaps (%2,%5), %%xmm2 \n\t"
>>> + "movaps (%1,%5), %%xmm3 \n\t"
>>
>> One of these can be a memory arg to mulps.
>
> Already addressed in my latest patch in the same thread.
>
>>> + "mulps %%xmm2, %%xmm1 \n\t"
>>> + "subps %%xmm1, %%xmm0 \n\t"
>>> + "mulps %%xmm2, %%xmm3 \n\t"
>>> + "subps %%xmm3, %%xmm4 \n\t"
>>> [repeated lots of times]
>>
>> Looks like a place for a macro.
>
> Good point. Used macros also for the other block.
>
>>> + if (incr == 1) {
>>
>> Does output really need to be interleaved?
>
> It's a known TODO to allow codecs to output some kind of
> SAMPLE_FMT_PLANAR_FLOAT.
>
>>> + "movups 52(%4), %%xmm0 \n\t"
>>> + "shufps $0x1b, %%xmm0, %%xmm0 \n\t"
>>> + "movaps (%1), %%xmm1 \n\t"
>>
>> memory arg
>
> Fixed
>
>>> + "subps %%xmm1, %%xmm0 \n\t"
>>> + "movaps %%xmm0, (%0) \n\t"
>>> +
>>> + "movups 4(%3), %%xmm0 \n\t"
>>> + "movaps 48(%2), %%xmm1 \n\t"
>>> + "shufps $0x1b, %%xmm0, %%xmm0 \n\t"
>>> + "addps %%xmm1, %%xmm0 \n\t"
>>> + "movaps %%xmm0, 112(%0) \n\t"
>>
>> Why do you alternate between two schedules?
>
> No good reason, fixed.
>
> New patch attached.
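
For anyone following the review points above without the patch at hand, here is a rough sketch of the two patterns in C intrinsics rather than inline asm. It is not the code from the patch: MULSUB4, mulsub_rows and reverse_sub4 are hypothetical names, and unaligned loads are used so the sketch works with any pointer (the patch uses movaps on aligned data). Whether a load ends up folded into mulps as a memory operand is left to the compiler here.

```c
#include <assert.h>
#include <xmmintrin.h>

/* Hypothetical macro for the repeated mulps/subps pair:
 * acc -= win[0..3] * buf[0..3] */
#define MULSUB4(acc, win, buf) \
    ((acc) = _mm_sub_ps((acc), \
                        _mm_mul_ps(_mm_loadu_ps(win), _mm_loadu_ps(buf))))

/* Subtract n window*sample products from the four running sums,
 * four products per step. n must be a multiple of 4. */
static void mulsub_rows(float *sum, const float *win, const float *buf, int n)
{
    assert(n % 4 == 0);
    __m128 acc = _mm_loadu_ps(sum);
    for (int i = 0; i < n; i += 4)
        MULSUB4(acc, win + i, buf + i);
    _mm_storeu_ps(sum, acc);
}

/* out[0..3] = reverse(rev[0..3]) - fwd[0..3].
 * The 0x1b immediate reverses the four floats, as in "shufps $0x1b". */
static void reverse_sub4(float *out, const float *rev, const float *fwd)
{
    __m128 r = _mm_loadu_ps(rev);
    r = _mm_shuffle_ps(r, r, 0x1b);
    _mm_storeu_ps(out, _mm_sub_ps(r, _mm_loadu_ps(fwd)));
}
```

Using one macro for every multiply-subtract step also keeps a single instruction schedule throughout, which is the other point raised above.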

ping

-Vitor


