[FFmpeg-devel] [PATCH] ARM: NEON optimised simple_idct

Alexander Strange astrange
Mon Aug 25 22:10:38 CEST 2008


On Aug 25, 2008, at 4:04 PM, Måns Rullgård wrote:

> Michael Niedermayer <michaelni at gmx.at> writes:
>
>> On Mon, Aug 25, 2008 at 07:47:16PM +0100, Måns Rullgård wrote:
>>> Michael Niedermayer <michaelni at gmx.at> writes:
>> [...]
>>>> 2. depending on the pattern of non zero / all zero rows one of 8
>>>> optimized column transforms is used.  This may be a bad idea though
>>>> for a CPU with a small code cache ...
>>>>
>>>> also maybe it would make sense to look at i386/idct_sse2_xvid.c
>>>> which uses SSE2 (128bit registers), this one uses only 16bit
>>>> operations for the column transform so it may be faster when the
>>>> tricks of the simple idct aren't applicable
>>>
>>> Do you expect any sane person to be able to read that?
>>
>> well, a little insanity may be needed
>>
>>> That's also
>>> not bitexact, right?
>>
>> it is supposed to be bitexact, and i cannot remember a case where any
>> input led to different output. Also the MMX one is used in the
>> regression tests and they match between MMX and non x86 cpus ...
>
> All the different IDCT variants (int, simple, simplemmx, libmpeg2mmx,
> xvidmmx, faani) give different output on my machine with current
> FFmpeg.  Which one is correct?

All of them are correct; none of the IDCT-using codecs specify exact
rounding. simple* and xvid* should be the same as their C versions, though.
It's best to stick with simpleidct so we can at least have bit-exact
compatibility with ffmpeg-encoded files.
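
To illustrate why two implementations can both be "correct" yet not
bit-exact with each other, here is a minimal sketch. It is not FFmpeg
code: the function names (idct_dc_nearest, idct_dc_trunc,
compare_blocks) are made up for this example, and it only handles the
DC-only case, where every output sample is DC/8 in exact arithmetic.
The two variants differ only in how they round that quotient.

```c
#include <assert.h>
#include <stdlib.h>

#define BLOCK_SIZE 64 /* one 8x8 block */

/* Hypothetical DC-only IDCT, rounding DC/8 to the nearest integer
 * (halves rounded away from zero). */
static void idct_dc_nearest(short dc, short out[BLOCK_SIZE])
{
    short v = (short)((dc >= 0 ? dc + 4 : dc - 4) / 8);
    for (int i = 0; i < BLOCK_SIZE; i++)
        out[i] = v;
}

/* Hypothetical DC-only IDCT, truncating DC/8 toward zero. */
static void idct_dc_trunc(short dc, short out[BLOCK_SIZE])
{
    short v = (short)(dc / 8);
    for (int i = 0; i < BLOCK_SIZE; i++)
        out[i] = v;
}

/* Compare two output blocks sample by sample.
 * Returns the number of mismatching samples and stores the largest
 * absolute difference in *max_abs_diff.  A return value of 0 means
 * the two blocks are bit-exact. */
static int compare_blocks(const short a[BLOCK_SIZE],
                          const short b[BLOCK_SIZE],
                          int *max_abs_diff)
{
    int mismatches = 0;

    assert(a && b && max_abs_diff);
    *max_abs_diff = 0;
    for (int i = 0; i < BLOCK_SIZE; i++) {
        int d = abs(a[i] - b[i]);
        if (d != 0)
            mismatches++;
        if (d > *max_abs_diff)
            *max_abs_diff = d;
    }
    return mismatches;
}
```

With DC = 13 the exact result is 1.625, so the first variant writes 2
and the second writes 1 to every sample. Both are within one unit of
the exact value, but a decoder using one of them will drift from an
encoder that used the other, which is the reason for preferring a
single IDCT on both sides.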

>>>> also
>>>>
>>>>    Intel 64 and IA-32 Architectures
>>>>    Software Developers Manual
>>>>                              Volume 2A (and B)
>>>>           Instruction Set Reference
>>>>
>>>> contains very readable and unambiguous explanations of what all the
>>>> MMX, SSE* instructions do, if you ever want to decipher mmx or sse
>>>> code
>>>
>>> I have those documents, and reading Chinese is easier.
>>
>> This is great, so you can help me communicate with zhentan who is a
>> SOC student and IIRC Chinese.
>
> No, but maybe he can explain mmx to me.

Googling the intrinsic names sometimes turns up better or at least  
different documentation, but MS's website is really not accessible...
