[FFmpeg-devel] [PATCH 2/2] swscale/arm/yuv2rgb: add ff_yuv420p_to_{argb, rgba, abgr, bgra}_neon_{16, 32}

Michael Niedermayer michael at niedermayer.cc
Sat Dec 19 03:27:06 CET 2015


On Fri, Dec 18, 2015 at 11:44:27AM +0100, Matthieu Bouron wrote:
> On Thu, Dec 17, 2015 at 07:47:08PM +0100, Michael Niedermayer wrote:
> > On Thu, Dec 17, 2015 at 04:54:31PM +0100, Matthieu Bouron wrote:
> > > On Tue, Dec 15, 2015 at 06:22:43PM +0100, Michael Niedermayer wrote:
> > > > On Tue, Dec 15, 2015 at 05:46:09PM +0100, Matthieu Bouron wrote:
> > > > > From: Matthieu Bouron <matthieu.bouron at stupeflix.com>
> > > > > 
> > > > > ---
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > This commit is likely to break fate on arm since the current C code path
> > > > > seems to use less precision.
> > > > > 
> > > > > How should I proceed to fix it ?
> > > > 
> > > > hmm
> > > > can the precision of the C code path and any asm impl of it under
> > > > bitexact (if they exist), be changed to higher precision without
> > > > speed loss?
> > > > if so that would be an option
> > > 
> > > We are currently facing 4 cases (with this patch applied)
> > > 
> > >   * [1] ARM +ACCURATE_RND: uses neon, 13bit coefficients and 32bit
> > >   precision overall
> > >   * [2] ARM -ACCURATE_RND: uses neon, 6bit coefficients and 16bit
> > >   precision overall
> > 
> > >   * [3] X86 +ACCURATE_RND: uses a C code path with lookup tables
> > 
> > which LUT do you mean here ?
> 
> The table filled by ff_yuv2rgb_c_init_tables. Not sure if it's used
> though, I haven't looked at the C code that actually does the conversion
> (yet).
> 
> > 
> > 
> > >   * [4] X86 -ACCURATE_RND: uses MMX+MMXEXT with apparently 13bit
> > >   coefficients (libswscale/yuv2rgb.c around line 800).
> > > 
> > > Notes:
> > >   * The 4 outputs are different with [3] being ugly (ghosting/non-sharp
> > >   edges).
> > > 
> > >   * [1] and [4] (13bit coefficient accuracy) should be the same but have
> > >   slight differences.
> > > 
> > > Questions:
> > > 
> > 
> > >   * What is the meaning of the ACCURATE_RND flag ?
> > 
> > it should enable accurate rounding
> > 
> > 
> > >   * Does [3] use some kind of interpolation instead of duplicating
> > >   chroma lines ? Its output seems inferior to the other code paths.
> > 
> > are you sure that is true for real images?
> > it's easy to end up with wrong conclusions with synthetic inputs
> > unless you want to use the scaler only for such inputs.
> > 
> > either way, line duplication is likely not optimal for real images;
> > they are not made of constant color blocks that are aligned to some
> > camera's 2x2 samples
> > 
> > 
> > >   * Is [3] the output that should be taken as reference ?
> > 
> > I'd say the reference is reality: making the output as close as possible
> > to what an image of the new resolution would be if it had been taken that way
> > 
> > 
> > >   * Should we use BITEXACT instead of ACCURATE_RND to determine the
> > >   precision used ?
> > 
> > BITEXACT is to avoid platform differences and allow regression tests
> > 
> > if all else is equal it would be best if C and asm matches, and if
> > C is bad then it should be improved
> 
> Here are the C, MMX and NEON outputs from a photo:
> http://0x5c.me/yuv2rgb/photos
> 
> The C and NEON outputs look almost the same.
> The MMX one has slightly different "colors" overall.
> 
> Since figuring out what the C code is actually doing and making the neon asm
> match its output is likely to take some time, would you be OK if, on the
> ARM platform, +ACCURATE_RND uses the C code path (so fate is not broken),
> and -ACCURATE_RND uses the neon code path with a precision of 16bit (IMHO,
> speed is preferred over the slight quality gain of the 32bit version on
> this platform) ?
> 
> This behaviour will affect yuv420p but also nv12 and nv21 inputs.
> 
> This is kind of a temporary (and lame) solution until I have some time
> to work on it.

no objections

thanks
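
For readers following the precision discussion above, here is a rough C
sketch of why the number of coefficient bits matters. It is not taken from
libswscale or from the patch: the helper names (clip_u8, red_fixed) are
hypothetical, and the constants 1.164 and 1.596 are the usual limited-range
BT.601 factors for the red channel, assumed here only for illustration.

```c
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit output range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/*
 * Red channel of a limited-range BT.601 YUV -> RGB conversion:
 *     R = 1.164 * (Y - 16) + 1.596 * (V - 128)
 * with both constants quantized to `bits` fractional bits.
 * Illustrative only: the real C, MMX and NEON paths are organised differently.
 */
static uint8_t red_fixed(uint8_t y, uint8_t v, int bits)
{
    int cy  = (int)(1.164 * (1 << bits) + 0.5);   /* quantized luma gain   */
    int cv  = (int)(1.596 * (1 << bits) + 0.5);   /* quantized chroma gain */
    int acc = cy * (y - 16) + cv * (v - 128) + (1 << (bits - 1));

    if (acc < 0)            /* avoid right-shifting a negative value */
        return 0;
    return clip_u8(acc >> bits);
}
```

With these assumed constants, red_fixed(235, 128, 13) gives 255 while
red_fixed(235, 128, 6) gives 253, so nominal white already loses two code
values with 6-bit coefficients. In this sketch the 6-bit products and their
sum stay below 32768, which is presumably what makes a 16-bit pipeline
possible; 13-bit coefficients need a 32-bit accumulator.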

>
> Matthieu
> [...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

During times of universal deceit, telling the truth becomes a
revolutionary act. -- George Orwell