[Ffmpeg-devel] [PATCH] SSE counterpart of ff_imdct_calc_3dn2
Thu Aug 24 19:27:16 CEST 2006
On Thu, Aug 24, 2006 at 06:35:07PM +0200, Guillaume Poirier wrote:
> \o/ Rich is back to flaming mode!
> Rich Felker wrote:
> > On Thu, Aug 24, 2006 at 09:59:37AM +0200, Guillaume POIRIER wrote:
> >>>Intrinsics are also gcc4-specific
> >>False. They existed in 3.4 and I think in 3.3 also (I don't know about
> >>earlier releases, but 2.95 certainly does not support them).
> > Only gcc4 and later have the 3dnow intrinsics.
> I wasn't specifically talking about 3dnow intrinsics...
> You should probably back up your claims, Rich, or state that you made
> something up.
The docs are incorrect. If you had read the mailing list, you would know
that the 3dnow intrinsics are actually missing from gcc prior to 4.x.
> >>Rich, you should really consider that some people aren't willing to spend
> >>their youth on writing killer hand-tuned asm code.
> > It takes maybe 5-10 minutes more to write the obvious handwritten asm
> > than to write the code with intrinsics, and performance should be the
> > same or better. If you want to make it even faster you may spend somewhat
> > longer but your claims of "spending their youth" are exaggerated and
> > misleading.
> Well, you forgot to consider several things:
> appropriate register allocation (gcc may not be too good at that, but
> it's still easier to write code with named variables than with
> anonymous reg names).
And easier to make mistakes that will result in extra loads/stores
(register spills)!! If you're forced to write register names, you know
all the data fits in registers. Otherwise you have to duplicate the work
of the compiler in your head to make sure it fits, or read the
compiler's asm output.
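
To make the two styles concrete, here is a minimal sketch. It is not code
from the patch: the function names and the multiply-add operation are
made up for illustration. The first version uses SSE intrinsics with
named variables and leaves register allocation to the compiler. The
second names xmm0-xmm2 explicitly in GCC-style inline asm, so it is
obvious that the working set fits in three registers. Both fall back to
plain C when SSE or GCC-style asm is not available.

```c
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper (not from FFmpeg): dst[i] = a[i] * b[i] + c[i].
 * n is assumed to be a multiple of 4. Named variables; the compiler
 * decides which xmm registers hold va, vb and vc. */
static void mul_add_intrin(float *dst, const float *a, const float *b,
                           const float *c, size_t n)
{
#if defined(__SSE__)
    for (size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        __m128 vc = _mm_loadu_ps(c + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(_mm_mul_ps(va, vb), vc));
    }
#else
    /* Scalar fallback for targets without SSE. */
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] * b[i] + c[i];
#endif
}

/* Same operation with explicit register names (AT&T syntax). */
static void mul_add_asm(float *dst, const float *a, const float *b,
                        const float *c, size_t n)
{
#if defined(__GNUC__) && defined(__SSE__) && \
    (defined(__x86_64__) || defined(__i386__))
    for (size_t i = 0; i < n; i += 4) {
        __asm__ volatile(
            "movups (%1), %%xmm0    \n\t"  /* xmm0 = a[i..i+3]  */
            "movups (%2), %%xmm1    \n\t"  /* xmm1 = b[i..i+3]  */
            "movups (%3), %%xmm2    \n\t"  /* xmm2 = c[i..i+3]  */
            "mulps  %%xmm1, %%xmm0  \n\t"  /* xmm0 *= xmm1      */
            "addps  %%xmm2, %%xmm0  \n\t"  /* xmm0 += xmm2      */
            "movups %%xmm0, (%0)    \n\t"  /* dst[i..i+3] = xmm0 */
            :
            : "r"(dst + i), "r"(a + i), "r"(b + i), "r"(c + i)
            : "xmm0", "xmm1", "xmm2", "memory");
    }
#else
    /* No GCC-style x86 asm available: reuse the portable version. */
    mul_add_intrin(dst, a, b, c, n);
#endif
}
```

With the intrinsics version, the only way to confirm that nothing is
spilled to the stack is to inspect the generated code, e.g. the output
of `gcc -O2 -S`.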