[Ffmpeg-devel] [PATCH] idct8 in Altivec for H.264 decoding
Mon Oct 9 11:04:36 CEST 2006
On 10/9/06, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Mon, Oct 09, 2006 at 12:05:30AM +0200, Guillaume POIRIER wrote:
> > Hi,
> > Attached patch should provide a 2% decoding speed-up if I do the math right.
> > This patch isn't meant to be merged as it is now, as in addition to
> > adding idct8 routine, it moves TRANSPOSE8 macro to dsputil_altivec.h as
> > this macro is already duplicated in vc1dsp_altivec.c, and
> > mpegvideo_altivec.c.
> > This patch also carries some macros that are useful in Altivec
> > programming. They are taken from x264 project, and I have permission
> > from the author to re-licence them in LGPL.
> could you send a separate patch for the TRANSPOSE move and these?
Yes, please find them attached to this mail.
I shall make an updated patch with my idct8 implementation when I have
> > One more thing: if the dst array is 8 or 16 bytes aligned, it should be
> > possible to make the routine even faster. Unfortunately, I can't manage
> > to make an implementation that works.
> > I've left the optimized routines ALTIVEC_STORE_SUM_CLIP_ALIGN8_A (16
> > bytes aligned *dst) and ALTIVEC_STORE_SUM_CLIP_ALIGN8_B (8 bytes aligned
> > *dst, but _not_ 16 bytes aligned) so people can have a look at them and
> > hopefully find what is wrong.
> > As far as I can see, ALTIVEC_STORE_SUM_CLIP_ALIGN8_A works as expected,
> > but ALTIVEC_STORE_SUM_CLIP_ALIGN8_B doesn't (that's really surprising
> > considering how much alike they are).
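The two cases described above differ only in the low four bits of the dst address. Below is a minimal sketch in portable C of how a caller could tell them apart before picking a store macro. The names dst_align_t and classify_dst are hypothetical and are not part of the patch; the mapping to the _A and _B macros in the comments follows the description above, not the patch source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only; not taken from the patch. */
typedef enum {
    DST_ALIGN_16,     /* multiple of 16: the case described for ..._ALIGN8_A */
    DST_ALIGN_8_ONLY, /* multiple of 8 but not 16: the case for ..._ALIGN8_B */
    DST_UNALIGNED     /* anything else: neither aligned variant applies */
} dst_align_t;

/* Classify a destination pointer by the low four bits of its address.
 * The pointer is never dereferenced. */
static dst_align_t classify_dst(const void *dst)
{
    uintptr_t low = (uintptr_t)dst & 15;

    if (low == 0)
        return DST_ALIGN_16;
    if (low == 8)
        return DST_ALIGN_8_ONLY;
    return DST_UNALIGNED;
}
```

An assert on the result of classify_dst(dst) at the top of each aligned variant would be one cheap way to confirm that the pointer really has the alignment the macro assumes.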
> 1. check that the stuff is really 8-byte aligned (yes it should be but ...)
> 2. maybe some print_vec() function which prints the contents of a vec*
> together with a check at the end if the calculation matches what you
> expect could help
> my idea is something like:
> vec_u8_t dstv = vec_ld(0, dest);
> vec_st(sum8, 0, temp);\
> for(i=0; i<16; i++)
> if(temp[i] != dest[i])
I'll see what I can do. Thanks for the suggestion.
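A rough sketch of the suggested debugging helpers, written in portable C so it can be compiled without AltiVec. It assumes the vector under test has already been copied into a plain 16-byte array (on PowerPC that would be a vec_st into a 16-byte-aligned temporary, as in the snippet above). The name first_mismatch and the exact signature of print_vec are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Print the 16 bytes of a vector that has been stored to memory.
 * The signature is an assumption; the mail only names print_vec(). */
static void print_vec(const char *label, const uint8_t v[16])
{
    printf("%s:", label);
    for (int i = 0; i < 16; i++)
        printf(" %02x", v[i]);
    printf("\n");
}

/* Hypothetical helper: return the index of the first byte where the two
 * 16-byte blocks differ, or -1 if all 16 bytes match. */
static int first_mismatch(const uint8_t expected[16], const uint8_t actual[16])
{
    for (int i = 0; i < 16; i++)
        if (expected[i] != actual[i])
            return i;
    return -1;
}
```

Calling print_vec on both blocks whenever first_mismatch returns a non-negative index would show which byte lanes the failing store path gets wrong, which may help narrow down whether the problem is in the permute or in the final store.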
With DADVSI (http://en.wikipedia.org/wiki/DADVSI), France finally has
a lead on the USA in selling out individuals' rights to corporations!
Vive la France!