[FFmpeg-devel] [PATCH] NEON FFT/IMDCT

Måns Rullgård mans
Thu Sep 10 10:55:16 CEST 2009


Naotoshi Nojiri <naonoj at gmail.com> writes:

> 2009/9/8 Måns Rullgård <mans at mansr.com>:
>> Måns Rullgård <mans at mansr.com> writes:
>>
>>> Naotoshi Nojiri <naonoj at gmail.com> writes:
>>>
>>>> Hi,
>>>>
>>>> I tested the patch on Cortex-A8 @500MHz (BeagleBoard).
>>>> FFT (fft-test -s):
>>>> 440.8 -> 34.2 us/transform (12.9x speed up)
>>>> IMDCT (fft-test -i -m -s):
>>>> 142.4 -> 11.8 us/transform (12.1x speed up)
>>>>
>>>> I have written a bit of NEON intrinsics code before, but this is my
>>>> first ARM/NEON code in assembly.
>>>> So, any comments and suggestions would be appreciated.
>>>
>>> Inline asm is unacceptable.
>>
>> I have a faster, pure-asm version of the mdct stuff almost ready.  No
>> need to resubmit.
>
> Thank you for all of your comments and advice. I revised the patch.
> The latest performance is as follows.
>
> FFT (fft-test -s):
> 32.0us
> IMDCT (fft-test -i -m -s):
> 11.3us
>
> Mans,
>
> I also wrote a pure-asm version of MDCT, but because it doesn't
> improve the performance, please ignore that part and use only the
> FFT part.

Thanks.  I've committed a slightly improved variant.

-- 
Måns Rullgård
mans at mansr.com


