[FFmpeg-devel] r9017 breaks WMA decoding on Intel Macs

Guillaume POIRIER poirierg
Wed May 30 00:28:58 CEST 2007


On 5/29/07, Zuxy Meng <zuxy.meng at gmail.com> wrote:

> These warnings come from the assembler, not the compiler, and concern cases
> like 16+(%esi). The FSF as treats this as equivalent to 16+0(%esi) ==
> 16(%esi) (hence the assumed 0). If the Apple as treats it
> differently without even a warning, then the result is catastrophic...
> > I'm attaching both versions (the one with the -vanilla postfix is the one
> > from trunk, which is getting miscompiled).
> >
> > I'll have a look at it tonight, but in the meantime, if someone wants
> > to look at it....
> >
> > Guillaume
> >
> >        .const
> >        .align 4
> [...]
> >L32:
> >       addl    $4, %ebx
> >       movaps          (%eax), %xmm0
> >       movaps       16+(%eax), %xmm4
> >       movlps          (%ecx), %xmm1
> >       movlps        8+(%ecx), %xmm5
> Actually gcc -S can't help diagnose this: we don't know how
> "16+(%eax)" will be assembled, so you should have done an 'objdump -d'
> instead.

OK, that's great advice. I attached both versions, as
produced on an x86/Linux machine and on OSX/x86.

Here are the relevant details (x86/Linux first, then OSX/x86):

 1bd:   0f 28 02                movaps (%edx),%xmm0
 1c0:   0f 28 19                movaps (%ecx),%xmm3
 1c3:   0f 28 62 f0             movaps 0xfffffff0(%edx),%xmm4
 1c7:   0f 28 79 10             movaps 0x10(%ecx),%xmm7
 1cb:   0f 12 0f                movlps (%edi),%xmm1
 1ce:   0f 12 13                movlps (%ebx),%xmm2
 1d1:   0f 12 6f 08             movlps 0x8(%edi),%xmm5
 1d5:   0f 12 73 08             movlps 0x8(%ebx),%xmm6
 1d9:   0f c6 c0 5f             shufps $0x5f,%xmm0,%xmm0
 1dd:   0f c6 db a0             shufps $0xa0,%xmm3,%xmm3
 1e1:   0f c6 e4 5f             shufps $0x5f,%xmm4,%xmm4
 1e5:   0f c6 ff a0             shufps $0xa0,%xmm7,%xmm7
 1e9:   0f 14 ca                unpcklps %xmm2,%xmm1
 1ec:   0f 14 ee                unpcklps %xmm6,%xmm5
 1ef:   0f 28 d1                movaps %xmm1,%xmm2
 1f2:   0f 28 f5                movaps %xmm5,%xmm6
 1f5:   0f 57 15 00 00 00 00    xorps  0x0,%xmm2
 1fc:   0f 57 35 00 00 00 00    xorps  0x0,%xmm6
 203:   0f 59 c1                mulps  %xmm1,%xmm0
 206:   0f 59 e5                mulps  %xmm5,%xmm4
 209:   0f c6 d2 b1             shufps $0xb1,%xmm2,%xmm2
 20d:   0f c6 f6 b1             shufps $0xb1,%xmm6,%xmm6
 211:   0f 59 da                mulps  %xmm2,%xmm3
 214:   0f 59 fe                mulps  %xmm6,%xmm7
 217:   0f 58 c3                addps  %xmm3,%xmm0
 21a:   0f 58 e7                addps  %xmm7,%xmm4
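
The displacement that ended up in the Linux object file can also be read directly from the encoded bytes in the listing above. The following is a minimal sketch in Python, assuming only the simple "0F opcode, ModRM, disp8" form seen in these lines; the function name and the register table are illustrative and not part of FFmpeg or objdump.

```python
# 32-bit general-purpose registers in ModRM encoding order.
GPR32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_modrm_disp8(encoding_hex):
    """Decode a 4-byte instruction of the form: 0F <opcode> <ModRM> <disp8>.

    Returns (reg, rm, displacement), or None when the bytes do not use the
    [register + 8-bit displacement] addressing form.
    """
    raw = bytes.fromhex(encoding_hex)
    if len(raw) != 4 or raw[0] != 0x0F:
        return None
    modrm = raw[2]
    mod = modrm >> 6          # addressing mode; 01 means a disp8 follows
    reg = (modrm >> 3) & 7    # here: the XMM register number
    rm = modrm & 7            # base register; 100 would mean a SIB byte
    if mod != 0b01 or rm == 0b100:
        return None
    # The displacement is a signed 8-bit value.
    disp = raw[3] - 256 if raw[3] >= 0x80 else raw[3]
    return reg, rm, disp


# Byte sequences copied from the Linux listing above.
for enc in ("0f 28 62 f0", "0f 28 79 10", "0f 12 6f 08"):
    reg, rm, disp = decode_modrm_disp8(enc)
    print(f"{enc}: {disp}(%{GPR32[rm]}), %xmm{reg}")
```

The trailing byte of each encoding is the offset the assembler kept, so a non-zero last byte here means the FSF assembler did encode the displacement.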

000001d7        movaps  (%ebx),%xmm0
000001da        movaps  (%edi),%xmm3
000001dd        movaps  0x00(%ebx),%xmm4
000001e1        movaps  0x00(%edi),%xmm7
000001e5        movlps  0x00(%ebp),%xmm1
000001e9        movlps  (%eax),%xmm2
000001ec        movlps  0x00(%ebp),%xmm5
000001f0        movlps  0x00(%eax),%xmm6
000001f4        shufps  $0x5f,%xmm0,%xmm0
000001f8        shufps  $0xa0,%xmm3,%xmm3
000001fc        shufps  $0x5f,%xmm4,%xmm4
00000200        shufps  $0xa0,%xmm7,%xmm7
00000204        unpcklps        %xmm2,%xmm1
00000207        unpcklps        %xmm6,%xmm5
0000020a        movaps  %xmm1,%xmm2
0000020d        movaps  %xmm5,%xmm6
00000210        xorps   0x00000500,%xmm2
00000217        xorps   0x00000500,%xmm6
0000021e        mulps   %xmm1,%xmm0
00000221        mulps   %xmm5,%xmm4
00000224        shufps  $0xb1,%xmm2,%xmm2
00000228        shufps  $0xb1,%xmm6,%xmm6
0000022c        mulps   %xmm2,%xmm3
0000022f        mulps   %xmm6,%xmm7
00000232        addps   %xmm3,%xmm0
00000235        addps   %xmm7,%xmm4

As you can clearly see, that damn OSX manages to lose the offset.
Zuxy, do you know of another syntax, other than the one you suggested,
that wouldn't confuse OSX's assembler?
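
The comparison can also be done mechanically rather than by eye. Below is a rough Python sketch that pulls the displacement out of every memory operand in the first eight loads of each listing above and reports where they differ. The regular expression and the helper name are assumptions made for illustration, and the sketch only handles the plain "disp(%reg)" operand form that appears in these listings.

```python
import re

# AT&T-syntax memory operand: optional hex displacement followed by (%reg).
MEM_OPERAND = re.compile(r"(0x[0-9a-fA-F]+)?\(%\w+\)")


def displacements(listing):
    """Return one signed 32-bit displacement per memory operand, in order."""
    result = []
    for line in listing.splitlines():
        for match in MEM_OPERAND.finditer(line):
            value = int(match.group(1), 16) if match.group(1) else 0
            if value >= 0x80000000:  # objdump prints -16 as 0xfffffff0
                value -= 0x100000000
            result.append(value)
    return result


# First eight loads from the x86/Linux disassembly.
linux = """\
movaps (%edx),%xmm0
movaps (%ecx),%xmm3
movaps 0xfffffff0(%edx),%xmm4
movaps 0x10(%ecx),%xmm7
movlps (%edi),%xmm1
movlps (%ebx),%xmm2
movlps 0x8(%edi),%xmm5
movlps 0x8(%ebx),%xmm6
"""

# The same eight loads from the OSX/x86 disassembly.
osx = """\
movaps  (%ebx),%xmm0
movaps  (%edi),%xmm3
movaps  0x00(%ebx),%xmm4
movaps  0x00(%edi),%xmm7
movlps  0x00(%ebp),%xmm1
movlps  (%eax),%xmm2
movlps  0x00(%ebp),%xmm5
movlps  0x00(%eax),%xmm6
"""

pairs = zip(displacements(linux), displacements(osx))
for index, (a, b) in enumerate(pairs):
    if a != b:
        print(f"load {index}: Linux displacement {a}, OS X displacement {b}")
```

Register allocation differs between the two builds, so only the displacements are compared, not the base registers.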

Looks like a bug in OSX's assembler anyhow. I reported the bug to
Apple: https://bugreport.apple.com/cgi-bin/WebObjects/RadarWeb.woa/4/wo/Ft9Qh72xYtq9KglegN5jzg/7.22

> I'm sorry, I don't have access to any OS X boxes, so I can't really help you
> much in tracking down the problem, although it's me who introduced this
> regression. I hope you don't feel annoyed.

Nope, I'm learning a great deal of things thanks to this, so I'm happy. :)

So now we know that it's the fault of Apple's AS, not yours :-)

What to do now?

Y'a pas de gonzesse hooligan,
Imbécile et meurtrière
Y'en a pas même en grande Bretagne
A part bien sûr Madame Thatcher
  -- Renaud (sur "Miss Maggie")
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fft_sse_linux.asm.bz2
Type: application/x-bzip2
Size: 3668 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070530/1fd2581a/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fft_sse_osx.asm.bz2
Type: application/x-bzip2
Size: 2134 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070530/1fd2581a/attachment-0001.bin>
