[FFmpeg-devel] r9017 breaks WMA decoding on Intel Macs

Guillaume Poirier gpoirier
Sun May 27 15:08:42 CEST 2007


Hi,

On 27 May 07 at 14:52, Guillaume POIRIER wrote:

> On 5/27/07, Guillaume POIRIER <poirierg at gmail.com> wrote:
>> Any vorbis should do the trick. I'll try to narrow down the  
>> problem to
>> see which part of the patch broke it.
>
> This hunk is what causes the regression:

Of course this should read: "applying this hunk fixes the regression".


> Index: fft_sse.c
> ===================================================================
> --- fft_sse.c	(revision 9017)
> +++ fft_sse.c	(revision 6577)
> @@ -100,33 +100,20 @@
>              i = nloops*8;
>              asm volatile(
>                  "1: \n\t"
> -                "sub $32, %0 \n\t"
> +                "sub $16, %0 \n\t"
>                  "movaps    (%2,%0), %%xmm1 \n\t"
>                  "movaps    (%1,%0), %%xmm0 \n\t"
> -                "movaps  16(%2,%0), %%xmm5 \n\t"
> -                "movaps  16(%1,%0), %%xmm4 \n\t"
>                  "movaps     %%xmm1, %%xmm2 \n\t"
> -                "movaps     %%xmm5, %%xmm6 \n\t"
>                  "shufps      $0xA0, %%xmm1, %%xmm1 \n\t"
>                  "shufps      $0xF5, %%xmm2, %%xmm2 \n\t"
> -                "shufps      $0xA0, %%xmm5, %%xmm5 \n\t"
> -                "shufps      $0xF5, %%xmm6, %%xmm6 \n\t"
>                  "mulps   (%3,%0,2), %%xmm1 \n\t" //  cre*re cim*re
>                  "mulps 16(%3,%0,2), %%xmm2 \n\t" // -cim*im cre*im
> -                "mulps 32(%3,%0,2), %%xmm5 \n\t" //  cre*re cim*re
> -                "mulps 48(%3,%0,2), %%xmm6 \n\t" // -cim*im cre*im
>                  "addps      %%xmm2, %%xmm1 \n\t"
> -                "addps      %%xmm6, %%xmm5 \n\t"
>                  "movaps     %%xmm0, %%xmm3 \n\t"
> -                "movaps     %%xmm4, %%xmm7 \n\t"
>                  "addps      %%xmm1, %%xmm0 \n\t"
>                  "subps      %%xmm1, %%xmm3 \n\t"
> -                "addps      %%xmm5, %%xmm4 \n\t"
> -                "subps      %%xmm5, %%xmm7 \n\t"
>                  "movaps     %%xmm0, (%1,%0) \n\t"
>                  "movaps     %%xmm3, (%2,%0) \n\t"
> -                "movaps     %%xmm4, 16(%1,%0) \n\t"
> -                "movaps     %%xmm7, 16(%2,%0) \n\t"
>                  "jg 1b \n\t"
>                  :"+r"(i)
>                  :"r"(p), "r"(p + nloops), "r"(cptr)
>
>
> We're quite lucky, it's the shorter of the 2 hunks.
>
> Now I need to figure out what's wrong in that hunk.

There's nothing wrong with the r9017 version of this loop!
It just duplicates the original loop body and uses "original register
number" + 4 for the copy.
Why on earth would it break on OSX and not on Linux?
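
To make the "it just duplicates the code" claim concrete, here is a minimal scalar sketch of what the loop computes, going by the comments in the asm (t = c * q, then p' = p + t and q' = p - t). This is not the FFmpeg code: the cpx type and all function names are made up for illustration, the twiddle factors are kept in a plain array instead of the interleaved layout the SSE code reads, and the element count is assumed to be a multiple of the unroll factor. It only shows that the 2-per-pass and 4-per-pass loops are arithmetically the same; it says nothing about why the asm version misbehaves on OSX.

```c
#include <stddef.h>

/* Hypothetical complex type for this sketch: two packed floats, 8 bytes. */
typedef struct { float re, im; } cpx;

/* One butterfly: t = c * q, then p' = p + t and q' = p - t. */
static void butterfly(cpx *p, cpx *q, const cpx *c)
{
    float tre = c->re * q->re - c->im * q->im;   /* cre*re - cim*im */
    float tim = c->im * q->re + c->re * q->im;   /* cim*re + cre*im */
    cpx a = *p;
    p->re = a.re + tre;  p->im = a.im + tim;
    q->re = a.re - tre;  q->im = a.im - tim;
}

/* Counterpart of the "sub $16" loop: 2 complex values (16 bytes) per
 * pass, walking down from the end. Assumes n is a multiple of 2. */
static void pass_x2(cpx *p, cpx *q, const cpx *c, size_t n)
{
    for (size_t i = n; i > 0; i -= 2) {
        butterfly(&p[i - 2], &q[i - 2], &c[i - 2]);
        butterfly(&p[i - 1], &q[i - 1], &c[i - 1]);
    }
}

/* Counterpart of the "sub $32" loop: the same body duplicated, so
 * 4 complex values (32 bytes) per pass. Assumes n is a multiple of 4. */
static void pass_x4(cpx *p, cpx *q, const cpx *c, size_t n)
{
    for (size_t i = n; i > 0; i -= 4) {
        butterfly(&p[i - 4], &q[i - 4], &c[i - 4]);
        butterfly(&p[i - 3], &q[i - 3], &c[i - 3]);
        butterfly(&p[i - 2], &q[i - 2], &c[i - 2]);
        butterfly(&p[i - 1], &q[i - 1], &c[i - 1]);
    }
}

static float absf(float x) { return x < 0.0f ? -x : x; }

/* Runs both variants on the same small-integer input (so every float
 * operation is exact) and returns the largest absolute difference
 * between their outputs. */
static float demo_max_diff(void)
{
    enum { N = 8 };
    cpx p1[N], q1[N], p2[N], q2[N], c[N];
    for (int k = 0; k < N; k++) {
        p1[k] = p2[k] = (cpx){ (float)(k + 1), (float)(2 - k) };
        q1[k] = q2[k] = (cpx){ (float)(3 * k), (float)(k - 4) };
        c[k]  = (cpx){ (float)(k % 3), (float)(1 - k) };
    }
    pass_x2(p1, q1, c, N);
    pass_x4(p2, q2, c, N);

    float d = 0.0f;
    for (int k = 0; k < N; k++) {
        float e;
        e = absf(p1[k].re - p2[k].re); if (e > d) d = e;
        e = absf(p1[k].im - p2[k].im); if (e > d) d = e;
        e = absf(q1[k].re - q2[k].re); if (e > d) d = e;
        e = absf(q1[k].im - q2[k].im); if (e > d) d = e;
    }
    return d;
}
```

Calling demo_max_diff() from a test program and comparing the result against zero is enough to check that the two loops agree. The one assumption that differs between them is the multiple-of-4 requirement on the count in the unrolled version.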

Is there some qualified guru out there who could enlighten me
here?

Guillaume


