[FFmpeg-devel] [PATCH 04/12] Add vector_fmul_matrix to dsputil

Måns Rullgård mans
Mon Oct 19 00:29:22 CEST 2009


Michael Niedermayer <michaelni at gmx.at> writes:

> On Sun, Oct 18, 2009 at 10:13:20PM +0100, Måns Rullgård wrote:
>> Michael Niedermayer <michaelni at gmx.at> writes:
>> 
>> > On Sun, Oct 18, 2009 at 09:17:48PM +0100, Måns Rullgård wrote:
>> >> Michael Niedermayer <michaelni at gmx.at> writes:
>> > [...]
>> >> >> +        }
>> >> >> +    } else {
>> >> >> +        for (i = 0; i < len; i++) {
>> >> >> +            const float *m = mtx;
>> >> >> +            for (j = 0; j < w; j++) {
>> >> >> +                float s = 0;
>> >> >
>> >> >> +                for (k = 0; k < w; k++)
>> >> >> +                    s += v[k][i] * *m++;
>> >> >
>> >> > this is quite inefficient because for(k) v[k][i] needs 2 memory reads
>> >> > a flat 2d array would be better
>> >> 
>> >> And how will the data magically transform itself into such a layout?
>> >
>> > What is the reason that the data is not in that layout?
>> > If the answer is that some decoder is implemented that way then my next
>> > question is, would there be a disadvantage in changing it?
>> 
>> Many of the audio decoders allocate the channels separately.  I didn't
>> write them, so I can't say how difficult it would be to change that.
>
> for many channels it should even be faster to memcpy them instead of the
> double dereferences
> memcpy needs O(w*len)
> the dereferences are O(w*w*len)

Please stop being ridiculous.  I don't expect w to be greater than 8.
It will probably be 2 or 6 in most cases.
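
To illustrate the cost being argued over, here is a minimal sketch, not
taken from the patch. It keeps the separately allocated channels but
gathers the w samples for index i into a small local array first, so each
v[k] pointer is loaded once per sample (O(w*len) in total) rather than
once per output channel. The name fmul_matrix_sketch and the MAX_CH bound
are hypothetical, and writing the result back in place is an assumption,
since the quoted hunk does not show what is done with s.

```c
#include <assert.h>

#define MAX_CH 8  /* assumed upper bound on w (hypothetical constant) */

/* Hypothetical sketch: multiply each column of samples v[0..w-1][i]
 * by the row-major w x w matrix mtx, writing the result back in place. */
void fmul_matrix_sketch(float **v, const float *mtx, int len, int w)
{
    float in[MAX_CH];
    int i, j, k;

    assert(w > 0 && w <= MAX_CH);

    for (i = 0; i < len; i++) {
        const float *m = mtx;

        /* Gather: one double dereference per channel per sample. */
        for (k = 0; k < w; k++)
            in[k] = v[k][i];

        /* Same arithmetic as the quoted loop, reading the local copy. */
        for (j = 0; j < w; j++) {
            float s = 0;
            for (k = 0; k < w; k++)
                s += in[k] * *m++;
            v[j][i] = s;
        }
    }
}
```

With w at most 8, the local copy fits in registers or a single cache
line, which is one way to see why the choice of layout matters little at
these sizes.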

Which do you prefer, getting 99% of the speed now, with little effort,
or 99.5% of the speed at some indeterminate future time, when someone
has rewritten all the code to use some other data layout, and you have
reviewed and approved the rewrite?  Your absurd requirements often
make people give up and do nothing at all, which is usually the worst
of all alternatives.  Is that the way you like it?

> also, maybe mtx would be more convenient for SIMD if it's transposed
> before the function

That's quite possible, but we can change that later.
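
As a rough sketch of what the transposed layout would mean, assuming mtx
is stored row-major with row j holding the coefficients for output
channel j (as the quoted loop implies): after transposing, the
coefficients that multiply a given input channel k are contiguous, so a
SIMD version could broadcast one input sample and multiply it by a vector
of coefficients. The helpers transpose_mtx and mix_sample_transposed are
hypothetical names used only for illustration.

```c
/* Hypothetical helper: mt[k * w + j] = mtx[j * w + k]. */
void transpose_mtx(float *mt, const float *mtx, int w)
{
    int j, k;

    for (j = 0; j < w; j++)
        for (k = 0; k < w; k++)
            mt[k * w + j] = mtx[j * w + k];
}

/* Hypothetical scalar model of the SIMD-friendly loop order: for each
 * input channel k, scale the contiguous coefficient row mt[k*w .. k*w+w-1]
 * by in[k] and accumulate into all w outputs. */
void mix_sample_transposed(float *out, const float *in,
                           const float *mt, int w)
{
    int j, k;

    for (j = 0; j < w; j++)
        out[j] = 0;

    for (k = 0; k < w; k++)
        for (j = 0; j < w; j++)
            out[j] += in[k] * mt[k * w + j];
}
```

The transpose is done once per matrix, outside the per-sample loop, so
its cost is negligible next to the multiplication itself.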

-- 
Måns Rullgård
mans at mansr.com


