[FFmpeg-devel] [PATCH 2/3] Indeo 5 decoder: common DSP functions

Michael Niedermayer michaelni
Sun Jan 17 18:58:51 CET 2010


On Sun, Jan 17, 2010 at 04:58:23PM +0200, Kostya wrote:
> On Sun, Jan 17, 2010 at 02:35:16PM +0100, Michael Niedermayer wrote:
> > On Sun, Jan 17, 2010 at 03:06:40PM +0200, Kostya wrote:
> > > On Sun, Jan 10, 2010 at 01:22:17PM +0200, Kostya wrote:
> > > > On Sat, Jan 09, 2010 at 05:43:40PM +0200, Kostya wrote:
> > > > > On Sat, Jan 09, 2010 at 03:47:39PM +0100, Michael Niedermayer wrote:
> > > > > > On Sat, Jan 09, 2010 at 04:40:30PM +0200, Kostya wrote:
> > > > > > > On Fri, Jan 08, 2010 at 11:41:23PM +0100, Michael Niedermayer wrote:
> > > > > > > > On Sun, Jan 03, 2010 at 12:56:36PM +0200, Kostya wrote:
> > > > > > > > [...]
> > > > > > > > > void ff_ivi_recompose53(const IVIPlaneDesc *plane, uint8_t *dst,
> > > > > > > [function body skipped]
> > > > > > > > 
> > > > > > > > is this mess faster than some more readable variant?
> > > > > > > 
> > > > > > > Here's more readable variant by me, checked to be bitexact but it's
> > > > > > > significantly slower (> 10%), I'd rather leave old one.
> > > > > > 
> > > > > > I also prefer speed, what about an implementation using lifting?
> > > > > 
> > > > > I'll try to implement it.
> > > > 
> > > > Hmm, after some experiments I'd rather leave original version.
> > > > Even grouping variables together in array gives significant performance
> > > > drop. And pure lifting transform is not applicable here either because
> > > > band data is grouped and it will take at least two passes (hor/vert)
> > > > with conditions for missing bands and requires an additional temp
> > > > buffer.
> > > 
> > > I've tried reusing Snow wavelet composing there. It was several percents
> > > slower which is not surprising because it needs an intermediate buffer
> > > there and Indeo5 code does not need to check for odd dimensions.   
> > 
> > the intermediate buffer is avoidable, it can be done as part of the transform
> > between horizontal & vertical transform.
> > is it faster without that transform?
>  
> It's not avoidable - this scaling cannot modify input since it's used
> for further decoding and output is uint8_t, so it is simply not enough
> for holding intermediate values.

static void horizontal_compose53i(IDWTELEM *b, int width){
    IDWTELEM temp[width];
            ^^^^^^^^^^^^^
this is avoidable
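
A minimal sketch of why such a temporary is not inherent to the 5/3
transform (an illustration only, not the Snow or Indeo 5 code): if low-pass
and high-pass coefficients are stored interleaved (even/odd positions), each
lifting step reads only samples that the same pass does not write, so the
row can be transformed in place. The interleaved layout, the standard
reversible 5/3 rounding, and the names mirror(), fwd53_inplace() and
inv53_inplace() are all assumptions made for this example; either decoder
may use a different layout or scaling.

```c
#include <stddef.h>

/* Hypothetical helper: reflect an out-of-range neighbour index back into
 * [0, n) (whole-sample symmetric extension). Requires n >= 2. */
static size_t mirror(ptrdiff_t i, size_t n)
{
    if (i < 0)
        return (size_t)(-i);
    if ((size_t)i >= n)
        return 2 * (n - 1) - (size_t)i;
    return (size_t)i;
}

/* Forward 5/3 lifting on interleaved samples, in place.
 * Afterwards even positions hold low-pass, odd positions hold high-pass. */
static void fwd53_inplace(int *b, size_t n)
{
    size_t i;

    if (n < 2)
        return;
    /* Predict: odd samples read only even neighbours (untouched in this pass). */
    for (i = 1; i < n; i += 2)
        b[i] -= (b[i - 1] + b[mirror((ptrdiff_t)i + 1, n)]) >> 1;
    /* Update: even samples read only odd neighbours (untouched in this pass). */
    for (i = 0; i < n; i += 2)
        b[i] += (b[mirror((ptrdiff_t)i - 1, n)] +
                 b[mirror((ptrdiff_t)i + 1, n)] + 2) >> 2;
}

/* Inverse: the same two steps with opposite sign, in reverse order.
 * No temporary row is needed. */
static void inv53_inplace(int *b, size_t n)
{
    size_t i;

    if (n < 2)
        return;
    /* Undo update. */
    for (i = 0; i < n; i += 2)
        b[i] -= (b[mirror((ptrdiff_t)i - 1, n)] +
                 b[mirror((ptrdiff_t)i + 1, n)] + 2) >> 2;
    /* Undo predict. */
    for (i = 1; i < n; i += 2)
        b[i] += (b[i - 1] + b[mirror((ptrdiff_t)i + 1, n)]) >> 1;
}
```

Because the inverse recomputes exactly the expressions the forward pass
used, the round trip is exact whatever the rounding of >> on negative
values. This does not address the separate point that uint8_t output is
too narrow for intermediate values; it only shows that the per-row
temporary is a layout choice.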


> In theory it should be faster, but unfortunately is not straight
> applicable here.

If we cannot merge the two 5/3 wavelets, then optimizations would also
be duplicated (assuming there are volunteers for both); this feels like
a bad thing to me.
But if it's really not possible, then it's not a reason to hold up Indeo.

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Avoid a single point of failure, be that a person or equipment.