[FFmpeg-devel] [PATCH] ARM: NEON optimised simple_idct
Mon Aug 25 21:55:11 CEST 2008
On Mon, Aug 25, 2008 at 07:47:16PM +0100, Måns Rullgård wrote:
> Michael Niedermayer <michaelni at gmx.at> writes:
> > 2. depending on the pattern of nonzero / all-zero rows, one of 8
> > optimized column transforms is used. This may be a bad idea though
> > for a CPU with a small code cache ...
> > also maybe it would make sense to look at i386/idct_sse2_xvid.c,
> > which uses SSE2 (128-bit registers); this one uses only 16-bit operations
> > for the column transform, so it may be faster when the tricks of the simple
> > idct aren't applicable
> Do you expect any sane person to be able to read that?
well, a little insanity may be needed
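
To make the row-pattern idea quoted above a bit more concrete, here is a
minimal sketch in plain C. It is not the MMX code: the grouping of rows into
pairs and the names row_is_zero() and column_variant() are assumptions made
purely for illustration. It only shows how a pattern of all-zero rows can be
folded into an index from 0 to 7 that selects one of 8 specialised column
transforms.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if all 8 coefficients of row r of an 8x8 block are zero. */
static int row_is_zero(const int16_t block[64], int r)
{
    int acc = 0;

    assert(r >= 0 && r < 8);
    for (int i = 0; i < 8; i++)
        acc |= block[8 * r + i];   /* any set bit means a nonzero coefficient */
    return acc == 0;
}

/*
 * Hypothetical classification (an assumption, not the real scheme):
 * bit k of the result is set when row pair (2k+2, 2k+3) contains a
 * nonzero coefficient. Rows 0 and 1 are treated as always present.
 * The result, 0..7, could index a table of 8 column transforms.
 */
static unsigned column_variant(const int16_t block[64])
{
    unsigned idx = 0;

    for (int k = 0; k < 3; k++) {
        int r = 2 * k + 2;
        if (!row_is_zero(block, r) || !row_is_zero(block, r + 1))
            idx |= 1u << k;
    }
    return idx;
}
```

A caller would use the returned index to pick a function pointer or jump
target. The trade-off is the one mentioned in the quote: each variant skips
work for the zero rows, but 8 copies of the column transform take more code
cache.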
> That's also
> not bitexact, right?
it is supposed to be bitexact, and I cannot remember a case where any
input led to different output. Also the MMX one is used in the
regression tests and they match between MMX and non-x86 CPUs ...
> > also
> > Intel 64 and IA-32 Architectures
> > Software Developer's Manual
> > Volume 2A (and 2B)
> > Instruction Set Reference
> > contains very readable and unambiguous explanations of what all the
> > MMX, SSE* instructions do, if you ever want to decipher mmx or sse code
> I have those documents, and reading Chinese is easier.
This is great, so you can help me communicate with zhentan, who is a SoC
student and, IIRC, Chinese.
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Freedom in capitalist society always remains about the same as it was in
ancient Greek republics: Freedom for slave owners. -- Vladimir Lenin