[FFmpeg-devel] [PATCH] x86: hevc_mc: better register allocation

Michael Niedermayer michaelni at gmx.at
Sun May 18 16:37:32 CEST 2014


On Sun, May 18, 2014 at 12:34:04AM +0200, Christophe Gisquet wrote:
> Hi,
> 
> 2014-05-18 0:20 GMT+02:00 Christophe Gisquet <christophe.gisquet at gmail.com>:
> > Patch needs to be rewritten
> 
> Here's an attempt, only tested (compilation+fate) on Win64.
> 
> -- 
> Christophe

>  hevc_mc.asm |   33 +++++++++++++++++++++++----------
>  1 file changed, 23 insertions(+), 10 deletions(-)
> 511373719ed90f69129591756547918c1d555ac5  0001-x86-hevc-dsp-better-register-allocation.patch
> From bcb6875c8c795486227b636b75a43d93408f207f Mon Sep 17 00:00:00 2001
> From: Christophe Gisquet <christophe.gisquet at gmail.com>
> Date: Sat, 17 May 2014 12:22:39 +0200
> Subject: [PATCH] x86: hevc dsp: better register allocation
> 
> The xmm reg count was incorrect, and manually loading the gprs
> furthermore noticeably reduces the number needed.
> 
> The modified function is used in weighted prediction, so only a few
> samples like WP_A_Toshiba_3.bit exhibit a change. For this one and
> Win64 (24 and 48 widths removed because of too few occurrences):
> 
> before:
> 3872 decicycles in a32, 32761 runs, 7 skips
> 2194 decicycles in a16, 32766 runs, 2 skips
> 
> after:
> 3767 decicycles in a32, 32765 runs, 3 skips
> 2119 decicycles in a16, 32767 runs, 1 skips
> ---
>  libavcodec/x86/hevc_mc.asm | 33 +++++++++++++++++++++++----------
>  1 file changed, 23 insertions(+), 10 deletions(-)
> 
> diff --git a/libavcodec/x86/hevc_mc.asm b/libavcodec/x86/hevc_mc.asm
> index 1fae38c..c7e8d07 100644
> --- a/libavcodec/x86/hevc_mc.asm
> +++ b/libavcodec/x86/hevc_mc.asm
> @@ -1098,19 +1098,32 @@ cglobal hevc_put_hevc_bi_qpel_hv%1_%2, 9, 11, 16, dst, dststride, src, srcstride
>  %endmacro
>  
>  %macro WEIGHTING_FUNCS 2
> -cglobal hevc_put_hevc_uni_w%1_%2, 8, 10, 11, dst, dststride, src, srcstride, height, denom, wx, ox, shift
> -    lea          shiftd, [denomd+14-%2]          ; shift = 14 - bitd + denom
> -    shl             oxd, %2-8                    ; ox << (bitd - 8)
> -    movd             m2, wxd        ; WX
> -    movd             m3, oxd        ; OX
> -    movd             m4, shiftd     ; shift
> +%if WIN64
> +cglobal hevc_put_hevc_uni_w%1_%2, 4, 5, 7, dst, dststride, src, srcstride, height, denom, wx, ox
> +    mov             r4d, denomm
> +%define SHIFT  r4d
> +%else
> +cglobal hevc_put_hevc_uni_w%1_%2, 6, 6, 7, dst, dststride, src, srcstride, height, denom, wx, ox
> +%define SHIFT  denomd
> +%endif

this is getting a little bit ugly ...
anyway review left to james & ronald & anyone else who likes to ...
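
For anyone reviewing, the arithmetic that the removed lea/shl lines set up
(shift = 14 - bitd + denom, ox << (bitd - 8)) can be sketched in scalar C
roughly as below. This is only an illustration, not the code in the tree:
the names weight_sample and clip_pixel are made up here, the rounding term
is taken from the usual HEVC weighted prediction formula rather than from
the hunk quoted above, and src is assumed to be the 14-bit intermediate
sample.

```c
#include <stdint.h>

/* Hypothetical helper: clamp v to the pixel range of a bitd-bit sample. */
static int clip_pixel(int v, int bitd)
{
    int max = (1 << bitd) - 1;
    return v < 0 ? 0 : (v > max ? max : v);
}

/* Scalar sketch of uni-directional weighted prediction for one sample.
 * Assumption: src is a 14-bit intermediate value (pixel << (14 - bitd)). */
static int weight_sample(int16_t src, int denom, int wx, int ox, int bitd)
{
    int shift  = denom + 14 - bitd;                 /* shift = 14 - bitd + denom */
    int round  = shift > 0 ? 1 << (shift - 1) : 0;  /* assumed rounding term */
    int offset = ox * (1 << (bitd - 8));            /* ox << (bitd - 8), safe for ox < 0 */

    return clip_pixel(((src * wx + round) >> shift) + offset, bitd);
}
```

With bitd = 8, denom = 0, wx = 1 and ox = 0 this reduces to rounding the
intermediate sample back to 8 bits, which is a quick sanity check for any
register reshuffling in the asm.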


[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I am the wisest man alive, for I know one thing, and that is that I know
nothing. -- Socrates