[FFmpeg-devel] [PATCH 4/5] avcodec/h264: add avx 8-bit h264_idct_add
jdarnley at obe.tv
Fri Apr 14 14:26:25 EEST 2017
On 2017-04-06 18:06, James Almer wrote:
> Your numbers are really confusing. Could you post the actual numbers for
> each function instead of doing comparisons?
These figures are the actual numbers!
Using the figures from Haswell above:
> ff_h264_idct_add_8_mmx = 52 cycles
> ff_h264_idct_add_8_sse2 = 49 cycles
> ff_h264_idct_add_8_avx = 46 cycles
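
For anyone who prefers ratios, the cycle counts above can be turned into the same "~1.18x faster" form used in the commit messages. This is only a minimal sketch: it assumes "N times faster" means baseline cycles divided by new cycles, and the helper name `speedup` is made up for illustration.

```python
# Cycle counts quoted above (Haswell); lower is better.
cycles = {"mmx": 52, "sse2": 49, "avx": 46}


def speedup(baseline: int, candidate: int) -> float:
    """Return how many times faster `candidate` is than `baseline`.

    Assumption: speedup is defined as baseline cycles / candidate cycles.
    """
    return baseline / candidate


# Compare each newer version against the mmx baseline.
for name in ("sse2", "avx"):
    ratio = speedup(cycles["mmx"], cycles[name])
    print(f"ff_h264_idct_add_8_{name}: ~{ratio:.2f}x faster than mmx")

# Compare avx directly against sse2.
ratio = speedup(cycles["sse2"], cycles["avx"])
print(f"ff_h264_idct_add_8_avx: ~{ratio:.2f}x faster than sse2")
```

With the Haswell figures this gives roughly 1.06x (sse2 vs mmx), 1.13x (avx vs mmx) and 1.07x (avx vs sse2).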
Coming back to this draft I had saved, I removed a fair bit of ranting
and cut it down to the essential point.
Also, I forgot about the Pentium I tested the previous patches on, when
I added the SSE2 versions. From that commit message:
> Kaby Lake Pentium:
> - ff_h264_idct_add_8_sse2: ~1.18x faster than mmxext
> - ff_h264_idct_dc_add_8_sse2: ~1.07x faster than mmxext