[FFmpeg-devel] [PATCH 0/6] More H.264 assembly (the sequel)

James Darnley jdarnley at obe.tv
Thu Dec 1 18:57:43 EET 2016


Some more assembly for review.  This time we have 10-bit h chroma functions.

The intra ones have some strange benchmark results.  Overall the improvement
isn't that large, particularly for the 4:2:0 intra.  The avx version of that
function is also slower than the sse2 version, by quite a margin.  I will
definitely try benchmarking it on my Nehalem after sending these emails.

Suggestions greatly appreciated.

James Darnley (6):
  avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter
  avcodec/h264: clean up and expand x86 function definitions
  whitespace changes after last commit
  avcodec/h264: mmx2, sse2, avx 10-bit 4:2:2 h chroma deblock/loop
    filter
  avcodec/h264: mmx2, sse2, avx 10-bit h chroma intra deblock/loop
    filter
  avcodec/h264: mmx2, sse2, avx 10-bit 4:2:2 h chroma intra deblock/loop
    filter

 libavcodec/x86/h264_deblock_10bit.asm | 213 ++++++++++++++++++++++++++++++++++
 libavcodec/x86/h264dsp_init.c         |  74 ++++++++----
 2 files changed, 262 insertions(+), 25 deletions(-)

-- 
2.10.2


