[FFmpeg-devel] [PATCH] ff_h[yc]scale_fast_mmext always clobbers %rbx

Michael Niedermayer michaelni at gmx.at
Thu May 7 03:03:43 CEST 2015


On Wed, May 06, 2015 at 04:08:09PM -0700, Nick Lewycky wrote:
> On 6 May 2015 at 15:06, Michael Niedermayer <michaelni at gmx.at> wrote:
> 
> > Hi
> >
> > On Wed, May 06, 2015 at 11:52:59AM -0700, Nick Lewycky wrote:
> > > Newer versions of clang will allocate %rbx as one of the inline asm
> > inputs,
> >
> > Thats great
> >
> >
> > > even in PIC. This occurs when building ffmpeg with clang
> > -fsanitize=address
> > > -O1 -fPIE. Because the asm does clobber %bx whether PIC is on or off,
> > just
> > > include %bx in the clobber list regardless of whether PIC is on or off.
> >
> > you cant include *bx in the clobber list with PIC
> > it breaks compilers that are less great, like gcc
> >
> 
> Doh!! I did check for this, but only tried x86-64, not x86-32. Sorry!
> 
> gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
> > ../libswscale/x86/hscale_fast_bilinear_simd.c: In function
> > ‘ff_hyscale_fast_mmxext’:
> > ../libswscale/x86/hscale_fast_bilinear_simd.c:205:5: error: PIC register
> > clobbered by ‘%ebx’ in ‘asm’
> > ../libswscale/x86/hscale_fast_bilinear_simd.c: In function
> > ‘ff_hcscale_fast_mmxext’:
> > ../libswscale/x86/hscale_fast_bilinear_simd.c:276:5: error: PIC register
> > clobbered by ‘%ebx’ in ‘asm’
> >
> >
> > also what exactly are you trying to fix ?
> > or rather what exactly goes how exactly wrong with the code as it is
> > if rbx is used ?
> >
> 
> Ok, let's look at ff_hcscale_fast_mmext. Preprocessor directives evaluated
> in PIC x86-64, the inline constraints work out to:
> 
>         :: "m" (src1), "m" (dst1), "m" (filter), "m" (filterPos),
>            "m" (mmxextFilterCode), "m" (src2), "m"(dst2)
>           ,"m" (ebxsave)
>           ,"m"(retsave)
>         : "%"REG_a, "%"REG_c, "%"REG_d, "%"REG_S, "%"REG_D
> 
> so clang looks at that and decides that it can pick src1 = (%r10), dst1 =
> (%r8), filter = (%r11), filterPos = (%r12), mmxextFilterCode = (%r15), src2
> = (%rbx), dst2 = (%r14), ebxsave = (%r13), retsave = (%r9). The problem
> there is src2 being (%rbx).
> 
> Now let's look at how we use them:
> 
>         "mov                 %0, %%"REG_c"  \n\t"
>         "mov                 %1, %%"REG_D"  \n\t"
>         "mov                 %2, %%"REG_d"  \n\t"
>         "mov                 %3, %%"REG_b"  \n\t"  // Clobbers %rbx / src2
> / %5 here
>         "xor          %%"REG_a", %%"REG_a"  \n\t"
>         PREFETCH"   (%%"REG_c")             \n\t"
>         PREFETCH" 32(%%"REG_c")             \n\t"
>         PREFETCH" 64(%%"REG_c")             \n\t"
> 
>         CALL_MMXEXT_FILTER_CODE
>         CALL_MMXEXT_FILTER_CODE
>         CALL_MMXEXT_FILTER_CODE
>         CALL_MMXEXT_FILTER_CODE
>         "xor          %%"REG_a", %%"REG_a"  \n\t"
>         "mov                 %5, %%"REG_c"  \n\t"  // %5 is read too late,
> we get %3 / filterPos instead
>         "mov                 %6, %%"REG_D"  \n\t"
>         PREFETCH"   (%%"REG_c")             \n\t"
>         PREFETCH" 32(%%"REG_c")             \n\t"
>         PREFETCH" 64(%%"REG_c")             \n\t"
> 
>         CALL_MMXEXT_FILTER_CODE  // Crash...
> 
> The two marked instructions are intended to refer to different contents.
> CALL_MMXEXT_FILTER_CODE moves RBX to ESI and calls mmxextFilterCode:
> 
>         "movl            (%%"REG_b"), %%esi     \n\t"\
>         "call                    *%4            \n\t"\  // Crash...
> 
> That callee ultimately segfaults:
> 
> => 0x7fffefa45000:      movq   (%rdx,%rax,1),%mm3
> => 0x7fffefa45004:      movd   (%rcx,%rsi,1),%mm0
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007fffefa45004 in ?? ()
> => 0x7fffefa45004:      movd   (%rcx,%rsi,1),%mm0
> 
> I'm trying to fix this segfault.
> 
> also ideally changes to this code should be tested against gcc/clang/icc
> > of various versions with and without PIC, 32 and 64 bit
> > this code is more tricky than than the average so theres a good
> > change changes to it will break some compiler
> >
> 
> I understand that, but I may not be able to test as many environments as
> you like. I'm going to give testing it my best effort, but I can tell you
> up front that I'm only going to test gcc and clang, on an x86 Ubuntu
> machine. I don't have ICC, so I can't test that.

iam not sure it would work but, configure could easily test if
ebx can be put on the clobber list and that then could be used
in place of PIC in the checks

and thanks for the detailed informaion

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Rewriting code that is poorly written but fully understood is good.
Rewriting code that one doesnt understand is a sign that one is less smart
then the original author, trying to rewrite it will not make it better.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 181 bytes
Desc: Digital signature
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20150507/a7234daf/attachment.asc>


More information about the ffmpeg-devel mailing list