[FFmpeg-devel] [PATCH/RFC] Add some dsputil functions useful for AAC decoder

Måns Rullgård mans
Sun Sep 20 14:39:40 CEST 2009


Michael Niedermayer <michaelni at gmx.at> writes:

> On Fri, Sep 18, 2009 at 11:11:55PM +0100, Mans Rullgard wrote:
>> This patch adds a few dsputil functions that can be used in the AAC
>> decoder.
>> 
>> With trivial NEON versions of these functions, the AAC decoder gets
>> ~1.6x faster on Cortex-A8, and better NEON code will push that even
>> further.
>> 
>> I will readily admit that some of the names in this patch are rubbish,
>> so please suggest something better.  Other enhancements are obviously
>> welcome too.
> [...]
>
>> diff --git a/libavcodec/dsputil.h b/libavcodec/dsputil.h
>> index d9d7d16..61252f5 100644
>> --- a/libavcodec/dsputil.h
>> +++ b/libavcodec/dsputil.h
>> @@ -397,6 +397,14 @@ typedef struct DSPContext {
>>      /* assume len is a multiple of 8, and arrays are 16-byte aligned */
>>      void (*int32_to_float_fmul_scalar)(float *dst, const int *src, float mul, int len);
>>      void (*vector_clipf)(float *dst /* align 16 */, const float *src /* align 16 */, float min, float max, int len /* align 16 */);
>> +    void (*vector_fmul_scalar)(float *dst, const float *src, float mul,
>> +                               int len);
>> +    void (*vector_fmul_scalar_vp[2])(float *dst, const float *src,
>> +                                     const float **vp, float mul, int len);
>> +    void (*vp_fmul_scalar[2])(float *dst, const float **vp,
>> +                              float mul, int len);
>> +    float (*scalarproduct_float)(const float *v1, const float *v2, int len);
>> +    void (*butterflies_float)(float *v1, float *v2, int len);
>
> missing doxy

I don't want to waste time on documentation before the general idea is
approved.
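
For readers without the attached patch at hand, here is a minimal sketch of what plain C versions of three of the proposed pointers might look like. It is inferred only from the signatures quoted above, not copied from the patch: the `_sketch` names are hypothetical, and the sum/difference order in the butterfly is an assumption. The two `_vp` variants are left out because their semantics cannot be recovered from the declarations alone.

```c
/* Hypothetical reference versions, inferred from the quoted signatures.
 * Not the code from the patch; names ending in _sketch are made up here. */

/* dst[i] = src[i] * mul */
static void vector_fmul_scalar_sketch(float *dst, const float *src,
                                      float mul, int len)
{
    int i;
    for (i = 0; i < len; i++)
        dst[i] = src[i] * mul;
}

/* Returns the dot product of v1 and v2. */
static float scalarproduct_float_sketch(const float *v1, const float *v2,
                                        int len)
{
    float sum = 0.0f;
    int i;
    for (i = 0; i < len; i++)
        sum += v1[i] * v2[i];
    return sum;
}

/* In place: v1[i] becomes the sum, v2[i] the difference (assumed order). */
static void butterflies_float_sketch(float *v1, float *v2, int len)
{
    int i;
    for (i = 0; i < len; i++) {
        float diff = v1[i] - v2[i];
        v1[i] += v2[i];
        v2[i]  = diff;
    }
}
```

Each loop body is a single multiply, multiply-accumulate, or add/subtract pair, which is the kind of operation that maps directly onto SIMD lanes.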

> Also, without seeing how these are all used, I have the feeling that
> they may be too small as primitives, and that bigger chunks of AAC code
> should be optimized to increase flexibility and reduce call overhead ...

See attached patch.

> And I would suggest optimizing code only when it matters speedwise, not
> when the code makes up <1% of the CPU time. Alex's reply made
> me think that this may apply to some code in there ...

1.6x speedup matters to me.

-- 
Måns Rullgård
mans at mansr.com


