[FFmpeg-devel] [PATCH 2/3] simple_idct12: align C and x86

Michael Niedermayer michael at niedermayer.cc
Wed Oct 14 00:04:49 CEST 2015


On Tue, Oct 13, 2015 at 09:21:40PM +0200, Christophe Gisquet wrote:
> Results for omse on the 3 idct dct-test.
> 
> C:   0.16915859   0.11848359   0.12913125
> x86: 0.16883281   0.11849063   0.19041875
> 
> Using 14 and 17 as shifts substantially improves those, but actually
> causes overflows and incorrect decoding of 12bpp content.
> ---
>  libavcodec/simple_idct_template.c | 17 ++++-------------
>  libavcodec/x86/idctdsp_init.c     |  8 +++-----
>  libavcodec/x86/simple_idct10.asm  |  7 +++----
>  3 files changed, 10 insertions(+), 22 deletions(-)

dct-test -i changes
IDCT SIMPLE-C12: max_err=1 omse=0.11856094 ome=0.00286875 syserr=0.02590000 maxout=288 blockSumErr=64
to
IDCT SIMPLE-C12: max_err=1 omse=0.11825703 ome=0.01024297 syserr=0.05225000 maxout=288 blockSumErr=64

dct-test -i 2
     58    2571    1334    1334    1334    1334      58    2571
   2571      58    1334    1334    1334    1334    2571      58
     58    2571    3805   -1156    3805   -1156      58    2571
   2571      58   -1156    3805   -1156    3805    2571      58
     58    2571    3805   -1156    3805   -1156      58    2571
   2571      58   -1156    3805   -1156    3805    2571      58
     58    2571    1334    1334    1334    1334      58    2571
   2571      58    1334    1334    1334    1334    2571      58
IDCT SIMPLE-C12: max_err=1 omse=0.12911875 ome=0.06609375 syserr=0.19025000 maxout=256 blockSumErr=64
to
   2560    5073    2560    2560    2560    2560    2560    5073
   5073    2560    2560    2560    2560    2560    5073    2560
   2560    5073    5031      70    5031      70    2560    5073
   5073    2560      70    5031      70    5031    5073    2560
   2560    5073    5031      70    5031      70    2560    5073
   5073    2560      70    5031      70    5031    5073    2560
   2560    5073    2560    2560    2560    2560    2560    5073
   5073    2560    2560    2560    2560    2560    5073    2560
IDCT SIMPLE-C12: max_err=1 omse=0.19041875 ome=0.15929375 syserr=0.25365000 maxout=256 blockSumErr=64

The ome and syserr values get worse with this change.
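
For readers unfamiliar with these fields, here is a rough sketch of how such
statistics can be computed from a set of reference and test IDCT output
blocks. It is not the dct-test source. The function name, the sign convention
(test minus reference) and the exact normalisation are assumptions made for
illustration: omse is taken as the overall mean squared error, ome as the
overall mean (signed) error, and syserr as the largest per-coefficient
accumulated error divided by the number of blocks.

```python
def idct_error_stats(ref_blocks, test_blocks):
    """Hypothetical helper: error statistics over lists of 64-entry blocks."""
    n = len(ref_blocks)
    err_sum = 0           # signed error, summed over all coefficients
    err_sq_sum = 0        # squared error, summed over all coefficients
    max_err = 0           # largest absolute error seen anywhere
    per_pos = [0] * 64    # signed error accumulated per coefficient position

    for ref, out in zip(ref_blocks, test_blocks):
        for i in range(64):
            e = out[i] - ref[i]   # assumed sign convention
            err_sum += e
            err_sq_sum += e * e
            max_err = max(max_err, abs(e))
            per_pos[i] += e

    return {
        "max_err": max_err,
        "omse": err_sq_sum / (n * 64),
        "ome": err_sum / (n * 64),
        "syserr": max(abs(s) for s in per_pos) / n,
    }


zero = [0] * 64
ref = [zero, zero]

# Same error magnitude in both cases, but only the first is biased.
biased = idct_error_stats(ref, [[1] + [0] * 63, [1] + [0] * 63])
balanced = idct_error_stats(ref, [[1] + [0] * 63, [-1] + [0] * 63])
```

Under these assumptions both cases give the same omse, but only the biased
one has a non-zero ome and syserr. That is why omse can stay nearly flat, as
in the first result above, while ome and syserr still move.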

I am not objecting to this if that's what people want; I just want to
make sure it's not missed.

Also, IIUC this is just to make C and x86 match, so it could just be
skipped with no ill effects, except that then x86 and C would not be
bitexact matches?


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I have often repented speaking, but never of holding my tongue.
-- Xenocrates