[FFmpeg-devel] [PATCH] ARM: NEON optimised simple_idct

Luca Barbato lu_zero
Tue Aug 26 13:06:13 CEST 2008


Måns Rullgård wrote:
> Michael Niedermayer <michaelni at gmx.at> writes:
> 
>> On Mon, Aug 25, 2008 at 03:53:29PM +0100, Måns Rullgård wrote:
>>> Michael Niedermayer <michaelni at gmx.at> writes:
>>>
>>>> On Mon, Aug 25, 2008 at 04:06:33AM +0100, Mans Rullgard wrote:
>>>>> ---
>>>>>  libavcodec/Makefile                  |    2 +
>>>>>  libavcodec/armv4l/dsputil_arm.c      |   15 ++
>>>>>  libavcodec/armv4l/simple_idct_neon.S |  383 ++++++++++++++++++++++++++++++++++
>>>>>  libavcodec/avcodec.h                 |    1 +
>>>>>  libavcodec/utils.c                   |    1 +
>>>>>  5 files changed, 402 insertions(+), 0 deletions(-)
>>>>>  create mode 100644 libavcodec/armv4l/simple_idct_neon.S
>>>>>
>>>> is this idct binary identical in output to the C/MMX simple idct?
>>> Yes.
>>>
>>>>> +#ifdef HAVE_NEON
>>>>> +        } else if (idct_algo==FF_IDCT_SIMPLENEON){
>>>>> +            c->idct_put= ff_simple_idct_put_neon;
>>>>> +            c->idct_add= ff_simple_idct_add_neon;
>>>>> +            c->idct    = ff_simple_idct_neon;
>>>>> +            c->idct_permutation_type = FF_NO_IDCT_PERM;
>>>>> +#endif
>>>> I do not know NEON at all, but I've never seen a SIMD instruction set for
>>>> which the identity permutation would have been optimal.
>>>>
>>>> Also, I suspect that the MMX simple idct is a better basis from which to
>>>> write other SIMD variants of the simple idct than the C one.
>>> I can't read MMX code.

Try the AltiVec one; it should be easy to understand.
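
For context on the permutation question above, here is a minimal sketch of
what an IDCT coefficient permutation does, assuming an 8x8 block stored
row-major as 64 int16_t values. The function names are hypothetical and are
not FFmpeg API. The identity table corresponds to what FF_NO_IDCT_PERM in
the quoted hunk asks for; the transposing table is one example of the kind
of layout a SIMD IDCT may prefer, and whether it would help on NEON is
exactly the open question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not FFmpeg functions. */

/* Identity permutation: coefficient i stays at index i. */
void build_identity_perm(uint8_t perm[64])
{
    for (int i = 0; i < 64; i++)
        perm[i] = (uint8_t)i;
}

/* Transposing permutation: index 8*row + col maps to 8*col + row. */
void build_transpose_perm(uint8_t perm[64])
{
    for (int i = 0; i < 64; i++)
        perm[i] = (uint8_t)(((i & 7) << 3) | (i >> 3));
}

/* Scatter a block from natural order into the layout the IDCT expects. */
void permute_block(int16_t dst[64], const int16_t src[64],
                   const uint8_t perm[64])
{
    for (int i = 0; i < 64; i++) {
        assert(perm[i] < 64);   /* guard against a malformed table */
        dst[perm[i]] = src[i];
    }
}
```

With the identity table the decoder hands the IDCT the block untouched;
with the transposing table the coefficients arrive already transposed, so
the IDCT itself can skip that step.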

lu


-- 

Luca Barbato
Gentoo Council Member
Gentoo/linux Gentoo/PPC
http://dev.gentoo.org/~lu_zero