[FFmpeg-devel] [PATCH] Altivec split-radix FFT
Thu Aug 27 00:35:57 CEST 2009
David Conrad <lessen42 at gmail.com> writes:
> On Aug 26, 2009, at 5:28 PM, Måns Rullgård wrote:
>> Guillaume POIRIER <poirierg at gmail.com> writes:
>>> The previous error I reported was on Linux/PPC64, where your code was
>>> compiling/assembling OK, but not running.
>>> I tried to fix that problem on my local OS X Leopard/PPC machine, and
>>> in fact the code doesn't assemble with that toolchain. That can't be good.
>>> This is the relevant part of your code (from ff_fft_calc_altivec()):
>>> + "mtctr %0 \n"
>>> + "stw 2,-4(1) \n"
>>> + "li 2,16 \n"
>>> + "bctrl \n"
>>> + "lwz 2,-4(1) \n" // ABI docs say r2 is general purpose and
>>> caller-saved, but gcc doesn't save it and crashes
>> Which ABI doc was that? Mine says r2 is "reserved for system use and
>> should not be changed by application code". Be that as it may, I
>> assume you meant stwu to push the value of r2 and lwz/addi to restore
>> it. Even with that change, it would fail on ppc64 since you'd be
>> preserving only half the register. Furthermore, the ABI mandates a
>> 16-byte-aligned stack at all times, but you could probably get away
>> without that if you never call any compiled code.
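
To make the two objections above concrete, here is a minimal C sketch. It is not part of the patch, and the helper names align16 and save_restore_32 are made up for illustration. The first helper rounds a frame size up to the 16-byte multiple the ABI asks for; the second models what a 32-bit stw followed by a zero-extending lwz does to a 64-bit register value.

```c
#include <stdint.h>

/* Hypothetical helper: round a stack frame size up to a multiple of 16 bytes. */
static inline uint32_t align16(uint32_t bytes)
{
    return (bytes + 15u) & ~15u;
}

/* Hypothetical model of saving a 64-bit register with stw and reloading it
 * with lwz: only the low 32 bits survive the round trip. */
static inline uint64_t save_restore_32(uint64_t reg)
{
    uint32_t slot = (uint32_t)reg;  /* stw stores only the low word */
    return (uint64_t)slot;          /* lwz zero-extends when loading */
}
```

With these definitions, a 4-byte save slot still costs a 16-byte frame, and a value such as 0x1122334455667788 comes back as 0x55667788, which is the "preserving only half the register" problem on ppc64.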
> On OS X at least it's general purpose. http://developer.apple.com/documentation/DeveloperTools/Conceptual/LowLevelABI/100-32-bit_PowerPC_Function_Calling_Conventions/32bitPowerPC.html
I was reading the PPC ELF spec. I guess they're different.
>>> Which translates into:
>>> L214: mtctr r11
>>> L215: stw 2,-4(1)
>>> L216: li 2,16
>>> L217: bctrl
>>> L218: lwz 2,-4(1)
>>> The error message is:
>>> libavcodec/ppc/fft_altivec.S:215:Parameter syntax error (parameter 1)
>>> libavcodec/ppc/fft_altivec.S:216:Parameter syntax error (parameter 1)
>>> libavcodec/ppc/fft_altivec.S:218:Parameter syntax error (parameter 1)
>>> I honestly don't understand what's wrong with what you wrote. The OS X
>>> toolchain doesn't like that code for either the PPC32 or the PPC64 target.
>> That's probably the assembler throwing a fit over the reserved r2
>> register. You could cheat it by writing the instructions with .word
>> directives instead.
> Apple's gas only accepts register names, not numbers for asm. So it
> would be stw r2, -4(r1) etc.
And gnu gas only supports numbers. For extra fun, objdump uses r
mans at mansr.com
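
One way to cope with the register-spelling mismatch described in this thread is to build the asm template strings through a small macro. The following is only a sketch under that assumption: the macro names GPR, SAVE_R2 and RESTORE_R2 and the function save_r2_template are hypothetical and not taken from the patch, and the choice of __APPLE__ as the switch simply follows the claims made above about Apple's and GNU's assemblers.

```c
/* Hypothetical helper: spell a general-purpose register the way the
 * target assembler is said to expect it in this thread. */
#ifdef __APPLE__
#define GPR(n) "r" #n   /* Apple's assembler: r1, r2, ... */
#else
#define GPR(n) #n       /* bare numbers: 1, 2, ... */
#endif

/* Template fragments as they could appear inside an inline asm string. */
#define SAVE_R2    "stw " GPR(2) ",-4(" GPR(1) ") \n"
#define RESTORE_R2 "lwz " GPR(2) ",-4(" GPR(1) ") \n"

/* Return the save instruction text so it can be inspected or printed. */
static inline const char *save_r2_template(void)
{
    return SAVE_R2;
}
```

Adjacent string literals are concatenated by the compiler, so SAVE_R2 expands to a single string containing either "stw r2,-4(r1)" or "stw 2,-4(1)". This only addresses the spelling issue; it does nothing about the r2, stack alignment, or ppc64 concerns raised earlier.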