[Ffmpeg-devel] dsputil arm patch

Siarhei Siamashka siarhei.siamashka
Tue Apr 24 18:07:01 CEST 2007


On 4/24/07, Måns Rullgård <mans at mansr.com> wrote:

> >> Well, did you specify the correct CPU?  If you don't, the ARMv6
> >> optimized IDCT isn't used.
> >>
> > My patch is for some put* functions, NOT the IDCT. Which IDCT version
> > is used isn't important.
>
> With the optimised IDCT some of those add/put functions are never called,
> so it is relevant.

In this case the opposite is probably what matters: testing with the
non-optimized IDCT, to get better coverage and to catch bugs in
otherwise unused functions. By the way, I remember some video artefacts
that showed up with the default armv4 IDCT, but I did not pay much
attention to them because the optimized armv5te IDCT became available
later :)

So support for armv4 may already be broken. By the way, these dsp
functions use the cache preload instruction PLD, which is not supported
on armv4 (it appeared in armv5te), so this code will not work there
anyway. Considering that there have been no bug reports for a long time
(though I may be mistaken), nobody seems to be interested in armv4
support anymore.
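
To illustrate the PLD point, here is a minimal sketch in C rather than
the assembly from the patch. It relies on GCC's __builtin_prefetch,
which becomes PLD on targets that have it and, per the GCC
documentation, emits no instruction on targets without prefetch
support. The PREFETCH macro, the name put_pixels8_ref and the
"prefetch one row ahead" policy are my assumptions for illustration,
not what the patch does.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: hint that the data at 'p' will be read soon.
 * With GCC this becomes PLD on ARM cores that support it and nothing
 * at all on cores (such as armv4) that do not. */
#if defined(__GNUC__)
#define PREFETCH(p) __builtin_prefetch((p), 0, 3)
#else
#define PREFETCH(p) ((void)0)
#endif

/* Copy an 8-pixel-wide block of 'h' rows from src to dst.
 * Both buffers use the same line stride (an assumption of this sketch). */
static void put_pixels8_ref(uint8_t *dst, const uint8_t *src,
                            int stride, int h)
{
    for (int i = 0; i < h; i++) {
        /* Preload the next source row, but never form a pointer
         * past the last row of the block. */
        if (i + 1 < h)
            PREFETCH(src + stride);
        memcpy(dst, src, 8);
        dst += stride;
        src += stride;
    }
}
```

Hand-written assembly that issues PLD unconditionally has no such
fallback, which is why it cannot run on armv4.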

> > Now I reconfigure ffmpeg with extra-cflags -mcpu=arm1136jf-s -mfpu=vfp
> > -mfloat-abi=softfp
>
> Hmm, the chip has a real floating point unit so you don't want softfloat
> emulation.

But the whole system (maemo 3.x) uses the softfloat ABI as far as I
know, so this is probably needed to correctly call glibc functions
that take floating point arguments. Combined with
'-mfloat-abi=softfp', the '-mfpu=vfp' option makes gcc use the
floating point unit where possible while remaining call-compatible
with softfloat code.
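
A small sketch of what the three -mfloat-abi settings mean for a
function with float arguments. The predefined macros checked here
(__ARM_PCS_VFP, __SOFTFP__) are the ones I know from GCC; compilers as
old as the ones in use on maemo may not define all of them, so treat
the detection as an assumption. The names float_abi_name and
scale_sample are made up for this example.

```c
/* Report which ARM floating point calling convention this file
 * was compiled for, based on GCC's predefined macros. */
static const char *float_abi_name(void)
{
#if defined(__ARM_PCS_VFP)
    return "hard";    /* float args in VFP registers, VFP math */
#elif defined(__SOFTFP__)
    return "soft";    /* float args in core registers, library math */
#elif defined(__arm__)
    return "softfp";  /* float args in core registers, VFP math */
#else
    return "non-ARM";
#endif
}

/* With softfp, 'x' and 'gain' arrive in integer registers exactly as
 * a softfloat caller (for example glibc) passes them, but the multiply
 * itself can still be done by the VFP unit. */
static float scale_sample(float x, float gain)
{
    return x * gain;
}
```

Code built with "soft" and code built with "softfp" can be linked
together because the calling convention is the same; "hard" cannot be
mixed with either.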

I have tested this patch and it really improves performance. Some
tests with 'mplayer -vo md5sum' did not reveal any problems either. I
just wonder why some of these dsp functions were commented out while
marked as ok? Earlier I thought that they could cause a performance
regression on XScale (the original author of that code developed it
for the Sharp Zaurus PDA, which is XScale powered). If there are any
XScale users subscribed to this mailing list, maybe they could run
some benchmarks?

I suspect that these motion compensation functions may need separate
implementations for different ARM devices (ARM9E, ARM11, XScale) for
the best performance as optimal memory access patterns and cache
behaviour may differ for all of them (at least it is very different
for ARM9E and ARM11).
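
One way this could be organised is a function pointer chosen once at
init time, similar in spirit to how dsputil fills in its function
table. This is only a sketch: the enum, the function names and the
selector below are hypothetical, and the per-core variants are
placeholders that simply call the C version where real tuned code
would go.

```c
#include <stdint.h>
#include <string.h>

typedef void (*put_pixels_fn)(uint8_t *dst, const uint8_t *src,
                              int stride, int h);

/* Hypothetical list of cores that might want separate tuning. */
enum arm_core { CORE_GENERIC, CORE_ARM9E, CORE_ARM11, CORE_XSCALE };

/* Portable reference: copy an 8-pixel-wide block of 'h' rows. */
static void put_pixels8_c(uint8_t *dst, const uint8_t *src,
                          int stride, int h)
{
    for (int i = 0; i < h; i++)
        memcpy(dst + i * stride, src + i * stride, 8);
}

/* Placeholders: each would hold code tuned for that core's memory
 * access pattern and cache behaviour. Here they only forward to the
 * reference so the sketch stays runnable anywhere. */
static void put_pixels8_arm9e(uint8_t *dst, const uint8_t *src,
                              int stride, int h)
{
    put_pixels8_c(dst, src, stride, h);
}

static void put_pixels8_arm11(uint8_t *dst, const uint8_t *src,
                              int stride, int h)
{
    put_pixels8_c(dst, src, stride, h);
}

static void put_pixels8_xscale(uint8_t *dst, const uint8_t *src,
                               int stride, int h)
{
    put_pixels8_c(dst, src, stride, h);
}

/* Pick the implementation once, then call through the pointer. */
static put_pixels_fn select_put_pixels8(enum arm_core core)
{
    switch (core) {
    case CORE_ARM9E:  return put_pixels8_arm9e;
    case CORE_ARM11:  return put_pixels8_arm11;
    case CORE_XSCALE: return put_pixels8_xscale;
    default:          return put_pixels8_c;
    }
}
```

Whatever variant is selected has to produce output identical to the C
reference, which is also what the md5sum comparison checks.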

PS. Added this patch to the latest build of MPlayer for maemo yesterday.


