[FFmpeg-devel] [WIP] [PATCH 0/6] sse2/xmm version of 8-bit simple_idct

James Darnley jdarnley at obe.tv
Mon Jun 5 14:23:39 EEST 2017

To answer the couple of questions that were asked over the weekend:

Rostislav, about the performance: I can see how to force a particular
IDCT implementation for real-world decoding (the -idct option), but the
MPEG2 HD sample I've been working with mostly uses the "idct add"
function, which doesn't exist for the functions in simple_idct10.asm.
So, as the next best thing, these are the results from the dct testing
utility over several runs.

> SIMPLE-C:      9124.8 ± 7.52
> SIMPLE-MMX:   11281.9 ± 32.67
> SIMPLE-SSE2:  15453.3 ± 78.86 (the adaptation in the first 3 patches)
> SIMPLE8-SSE2: 15684.2 ± 7.52 (from simple_idct10.asm)
> SIMPLE8-AVX:  15398.4 ± 6.36 (simple_idct10.asm again)
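
For a rough comparison, the figures above can be normalised against
SIMPLE-C.  The sketch below is only illustrative arithmetic: the
dictionary and function names are made up for this example, and it
assumes the numbers are throughput figures where higher is better,
which the output above does not state.

```python
# Mean results copied from the dct testing utility output quoted above.
# Assumption: higher means faster; the quoted output gives no units.
results = {
    "SIMPLE-C": 9124.8,
    "SIMPLE-MMX": 11281.9,
    "SIMPLE-SSE2": 15453.3,
    "SIMPLE8-SSE2": 15684.2,
    "SIMPLE8-AVX": 15398.4,
}


def relative_to_baseline(means, baseline="SIMPLE-C"):
    """Return each mean divided by the baseline mean (hypothetical helper)."""
    base = means[baseline]
    return {name: value / base for name, value in means.items()}


ratios = relative_to_baseline(results)

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name:<13} {ratio:.3f}x")
```

The spread reported for SIMPLE-SSE2 (± 78.86) is much larger than for
the other rows, so small differences between the three SSE2/AVX rows
should be read with that in mind.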

I will try to get some real-world results eventually.

Ronald, yes.  I was thinking that the first 3 could be ignored if I can
get the later patches to work correctly (that is, pass fate).

I forgot to mention in my cover letter that although the dct test
passes, fate does not.  As I mentioned on IRC, changing them causes
errors elsewhere in fate.  I am currently looking into this problem and
I'm sure I will speak to you or others about it.
