[FFmpeg-devel] [PATCH] split-radix FFT

Måns Rullgård mans
Tue Jul 29 21:03:51 CEST 2008


Michael Niedermayer <michaelni at gmx.at> writes:

> On Tue, Jul 29, 2008 at 07:39:25PM +0100, Måns Rullgård wrote:
>> Michael Niedermayer <michaelni at gmx.at> writes:
>> 
>> > On Tue, Jul 29, 2008 at 05:20:15PM +0100, Måns Rullgård wrote:
>> >> 
>> >> Michael Niedermayer wrote:
>> >> > On Tue, Jul 29, 2008 at 06:26:49PM +0300, Uoti Urpala wrote:
>> >> >> On Tue, 2008-07-29 at 17:10 +0200, Michael Niedermayer wrote:
>> >> >> > And just to clarify, yes what i considered a good argument
>> >> >> > was the sentence above where my reply is. That is to use
>> >> >> > MANGLE in speed critical code.  That way most textrels are
>> >> >> > avoided while minimizing the speed impact.
>> >> >> >
>> >> >> > I do not think you ever argued for that.
>> >> >>
>> >> >> IIRC I did mention the possibility of omitting -fPIC for a subset of
>> >> >> files.
>> >> >>
>> >> >> >  I remember you strongly arguing toward replacing all
>> >> >> > MANGLE by "m" knowing that it would break gcc 2.95 and not
>> >> >> > really caring that it would slow down code compiled with
>> >> >> > -fPIC.
>> >> >>
>> >> >> Of course the code would be slower on x86. If you want it to
>> >> >> be as fast as possible then compile it with -fPIC on x86. I
>> >> >> don't think it's worthwhile to pick only the globals used
>> >> >> inside asm for such special treatment.
>> >> >
>> >> > x86-64 shared libs require -fPIC, unless that has been fixed.
>> >> 
>> >> The x86-64 instruction set hasn't been "fixed", and I doubt it ever
>> >> will be.  You simply can't fit a 64-bit offset in a 32-bit immediate
>> >> operand.
>> >
>> > That's not what I meant
>> >
>> >> 
>> >> > so the user does not always have the option to omit -fPIC
>> >> 
>> >> But in these cases, forcing a textrel will break the build.
>> >
>> > MANGLE forces rip relative addressing on x86-64 and thus avoids the
>> > occasional GOT indirection gcc adds.
>> >
>> > Here's an example:
>> > long globivar;
>> >
>> > void func(){
>> >     asm(
>> >         "mov globivar(%rip), %rax\n\t"
>> >     );
>> >     asm(
>> >         "mov %0, %%rax\n\t"
>> >         :: "m"(globivar)
>> >     );
>> > }
>> >
>> > results in:
>> > 0000000000000554 <func>:
>> >  554:	55                   	push   %rbp
>> >  555:	48 89 e5             	mov    %rsp,%rbp
>> >  558:	48 8b 05 d1 02 20 00 	mov    0x2002d1(%rip),%rax        # 200830 <globivar>
>> >  55f:	48 8b 05 8a 02 20 00 	mov    0x20028a(%rip),%rax        # 2007f0 <_DYNAMIC+0x1b8>
>> >  566:	48 8b 00             	mov    (%rax),%rax
>> >  569:	c9                   	leaveq 
>> >  56a:	c3                   	retq   
>> >
>> > You can see that the second needs 2 instructions, the first just 1.
>> 
>> There is no guarantee that &globivar is reachable with a 32-bit offset
>> from %rip (or any other register).
>
> libavcodec is still smaller than 4GB, so it would work fine within, and that's
> the only case we really care about. I do not think any of our asm() accesses
> globals from outside, and if it does, that's a separate thing that can use "m".

There is still no guarantee that the data section will be mapped
within 4GB of the text section.
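
To make the constraint concrete: a RIP-relative operand encodes a signed
32-bit displacement measured from the end of the instruction, so the
target has to lie within roughly 2GB on either side of it. The following
is only a minimal sketch of that range check; the helper name
fits_rip_disp32 is hypothetical and is not part of FFmpeg or any
toolchain. With the addresses from the disassembly above, the instruction
at 0x558 is 7 bytes long, so the displacement to globivar at 0x200830 is
0x200830 - 0x55f = 0x2002d1, which fits.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * next_insn: address of the byte following the RIP-relative instruction
 * target:    address of the symbol being accessed
 *
 * Returns true when (target - next_insn) is representable as the signed
 * 32-bit displacement that the instruction encoding provides.
 */
static bool fits_rip_disp32(uint64_t next_insn, uint64_t target)
{
    /* Unsigned subtraction wraps; reinterpret the result as signed. */
    int64_t disp = (int64_t)(target - next_insn);

    return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

Whether the check holds for a given symbol depends on where the loader
maps the text and data, which is the point being made above: the code
itself cannot assume it.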

-- 
Måns Rullgård
mans at mansr.com



