[FFmpeg-devel] [PATCH 10/11] avcodec/x86: add an 8-bit simple IDCT function based on the x86-64 high depth functions

Ronald S. Bultje rsbultje at gmail.com
Tue Jun 20 16:03:57 EEST 2017


Hi,

On Mon, Jun 19, 2017 at 4:32 PM, Michael Niedermayer <michael at niedermayer.cc> wrote:

> On Mon, Jun 19, 2017 at 05:11:03PM +0200, James Darnley wrote:
> > Includes add/put functions
> >
> > Rounding contributed by Ronald S. Bultje
> > ---
> >  libavcodec/tests/x86/dct.c                |  2 +
> >  libavcodec/x86/idctdsp_init.c             | 23 ++++++++
> >  libavcodec/x86/simple_idct.h              |  9 +++
> >  libavcodec/x86/simple_idct10.asm          | 92 +++++++++++++++++++++++++++++++
> >  libavcodec/x86/simple_idct10_template.asm |  6 +-
> >  5 files changed, 130 insertions(+), 2 deletions(-)
>
> this changes the output of:
> ./ffmpeg -an -i ~/tickets/4400/cartest_supers.mov -flags +bitexact out-ref.avi
>
> ls -alF out-ref.avi out.avi
> -rw-r----- 1 michael michael 761042 Jun 19 22:29 out.avi
> -rw-r----- 1 michael michael 761044 Jun 19 22:29 out-ref.avi


This is because you're comparing the non-bitexact MMX IDCT (which is
enabled even if the bitexact flag is set) with the bitexact SSE2 IDCT.

Compare (without this patch) the C IDCT ("simple"):

./ffmpeg -an -i ~/Downloads/cartest_supers.mov -idct simple -flags +bitexact /tmp/out-ref-simple.avi
-rw-r--r--  1 ronaldbultje  wheel  831994 Jun 20 08:56 /tmp/out-ref-simple.avi

with the MMX IDCT ("simplemmx", which is selected by "auto" and enabled by
default):

./ffmpeg -an -i ~/Downloads/cartest_supers.mov -flags +bitexact /tmp/out-ref.avi
or
./ffmpeg -an -i ~/Downloads/cartest_supers.mov -idct simplemmx -flags +bitexact /tmp/out-ref.avi
or
./ffmpeg -an -i ~/Downloads/cartest_supers.mov -idct simpleauto -flags +bitexact /tmp/out-ref.avi
or
./ffmpeg -an -i ~/Downloads/cartest_supers.mov -idct auto -flags +bitexact /tmp/out-ref.avi
-rw-r--r--  1 ronaldbultje  wheel  831998 Jun 20 08:54 /tmp/out-ref.avi

After this patch, all of these (simplemmx, simpleauto, simple, auto) will
refer to SSE2 instead of MMX, thus making their results identical to the C
version again.
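
File sizes are only a rough indicator of a mismatch, so a byte-for-byte
comparison is a more direct check. The following is a minimal sketch,
assuming the two outputs were written to the /tmp paths used above;
outputs_match is a hypothetical helper name, not part of FFmpeg:

```shell
#!/bin/sh
# Hypothetical helper (not part of FFmpeg): succeeds (exit status 0)
# only if both files are byte-for-byte identical.
outputs_match() {
    cmp -s "$1" "$2"
}

# Paths assumed from the commands earlier in this mail.
ref=/tmp/out-ref-simple.avi   # produced with -idct simple
out=/tmp/out-ref.avi          # produced with the default IDCT

# Only compare when both files exist, so the script is safe to run as-is.
if [ -f "$ref" ] && [ -f "$out" ]; then
    if outputs_match "$ref" "$out"; then
        echo "outputs are identical"
    else
        echo "outputs differ"
    fi
fi
```

If the explanation above holds, this should report a difference before
the patch and identical outputs once the patch is applied.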

Ronald

