[FFmpeg-devel] [PATCH] Fix function parameters for rgb48 to YV12 functions.

Tue Feb 2 20:21:15 CET 2010

On Tue, Feb 02, 2010 at 08:01:26PM +0100, Michael Niedermayer wrote:
> On Tue, Feb 02, 2010 at 04:10:06PM -0200, Ramiro Polla wrote:
> > Hello Michael,
> > 
> > On Sun, Jan 24, 2010 at 8:31 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > > the gain happens when you change the variables used to calculate the index
> > > also to it. You could also try to make the index unsigned but make sure it
> > > can't be negative if you try this
> > 
> > Sorry but I still don't understand how that will be of use here in
> > libswscale. I've tried forcing int32_t and int64_t for x86_64 in some
> > of those functions (some xxxTo(Y|UV), hScale and the fast bilinear
> > ones), in all C, MMX and MMX2. All I can see is the expansion from
> > 32-bit to 64-bit being changed from caller and callee. There is no
> > difference in the inner loop, nor in how gcc addresses the src and
> > dst arrays.
> 
> maybe there's no gain for swscale, I can't say without looking at the asm
> gcc generates.
> I know that in h264 gcc filled some functions with 32->64 sign extension
> code in the inner loops.

Which compilation options have you been using?
IIRC sign extension is only an issue when gcc uses the indexed load instructions,
which it does not use at all anymore when optimizing for a more modern CPU
(which means it ends up being even shorter on registers and thus creating
larger and slower code with probably theoretically better instruction scheduling).
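
For reference, a minimal sketch of the kind of loop being discussed. The two
function names are hypothetical and this is not libswscale code; the only
difference between them is the type used to compute the array index. On
x86_64 the int version may need a 32->64 sign extension before the index can
be used in an address, while the ptrdiff_t version is already pointer-sized.
Whether gcc actually emits that extension inside the loop depends on the gcc
version and the options used.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: index computed in 32-bit signed int.
 * On x86_64 the product i * stride may have to be sign-extended
 * to 64 bits before it can be added to the src pointer. */
static void gather_int(uint8_t *dst, const uint8_t *src, int n, int stride)
{
    for (int i = 0; i < n; i++)
        dst[i] = src[i * stride];
}

/* Same loop with pointer-sized index variables, so no widening
 * of the index is needed inside the loop. */
static void gather_ptrdiff(uint8_t *dst, const uint8_t *src,
                           ptrdiff_t n, ptrdiff_t stride)
{
    for (ptrdiff_t i = 0; i < n; i++)
        dst[i] = src[i * stride];
}
```

Compiling both with something like "gcc -O2 -S" and comparing the inner loops
(looking for movslq or cltq) should show whether the index type makes any
difference for a given gcc version and -march/-mtune setting.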