[FFmpeg-devel] [PATCH] Some ARM VFP optimizations (vector_fmul, vector_fmul_reverse, float_to_int16)

Siarhei Siamashka siarhei.siamashka
Sun Apr 27 13:26:02 CEST 2008


On Monday 21 April 2008, Michael Niedermayer wrote:
> > > To answer 3.
> > > huge speed loss, and that's why this isn't a solution
> >
> > Where did you get this idea? Actually, using the current FFmpeg implementation
> > of the ARMv5TE IDCT is a huge speed loss :)
> >
> > The proposed upgrade is not perfect, but it still can be improved
> > further. And it will provide performance improvement, and provide it
> > right now. Before this hardware (ARMv5TE is already old) gets completely
> > outdated and abandoned by everyone...
>
> Well, if you insist on this messy stack realign in the innermost loop then
> I am fine with it, if you provide some benchmarks (with the realign enabled)
> which are faster than the current code.

Technically speaking, it is definitely NOT the innermost loop, since the
IDCT function contains loops of its own. But if you keep insisting that
it is "messy", I'm fine with that and even agree :) I would certainly
prefer it if there were no need for this workaround at all.

I'll submit the latest revision of the IDCT patch and repost the benchmark
results (ARM9E and ARM11) in a separate thread once we make some
progress with the VFP optimizations.


-- 
Best regards,
Siarhei Siamashka



