[Ffmpeg-devel] [PATCH] put_mpeg4_qpel16_h_lowpass altivec, take 2

Sun Nov 26 19:23:33 CET 2006

Hi

On Sun, Nov 26, 2006 at 05:23:35PM +0100, Luca Barbato wrote:
[...]
> > +
> > +static void put_pixels16_l2_altivec(uint8_t *dst, const uint8_t *src1,
> > +        const uint8_t *src2, int dst_stride, int src_stride1,
> > +        int src_stride2, int h)
> > +{
> > +    register vector unsigned char src1v, src2v, dstv;
> > +    register vector unsigned char tmp1, tmp2, mask, edges, align;
> > +    int i;
> > +
> > +    for(i=0; i<h; i++) {
> > +        /* Unaligned load */
> > +        src1v = vec_perm(
> > +            vec_ld(0, src1), vec_ld(15, src1), vec_lvsl(0, src1));
> > +        src2v = vec_perm(
> > +            vec_ld(0, src2), vec_ld(15, src2), vec_lvsl(0, src2));
> 
> if the stride is a multiple of 16 you could move vec_lvsl out of the loop

all strides should in general be multiples of 16 on archs which benefit from
it (yes, there are exceptions in obscure codecs ... but i think this one is ok,
though i have not looked at the code ...)
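
To spell out why the stride matters: the permute mask returned by
vec_lvsl(0, p) depends only on the low 4 bits of the address p, so it is
the same for every row exactly when those bits do not change from row to
row. Below is a rough scalar sketch of that check in plain C (no AltiVec
needed). The helper names lvsl_shift and shift_is_loop_invariant are made
up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar stand-in for the only thing vec_lvsl(0, p) depends on:
 * the byte offset of p inside its 16-byte block (low 4 address bits). */
static unsigned lvsl_shift(const uint8_t *p)
{
    return (unsigned)((uintptr_t)p & 15);
}

/* Returns 1 if every row start src + i*stride has the same offset as
 * row 0, i.e. the permute mask could be computed once before the loop.
 * Returns 0 as soon as one row has a different offset. */
static int shift_is_loop_invariant(const uint8_t *src, int stride, int h)
{
    unsigned first = lvsl_shift(src);
    int i;

    assert(h > 0);
    for (i = 1; i < h; i++)
        if (lvsl_shift(src + (ptrdiff_t)i * stride) != first)
            return 0;
    return 1;
}
```

With a stride of 32 the check holds for any starting alignment; with a
stride of 24 the offset changes by 8 on every other row and the mask has
to be recomputed per row, as the patch currently does.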

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is