[FFmpeg-devel] [PATCH 3/7] x86/hevc: use CLIPW macro when possible

Michael Niedermayer michaelni at gmx.at
Fri Feb 6 18:08:05 CET 2015


On Thu, Feb 05, 2015 at 09:28:41PM -0300, James Almer wrote:
> On 05/02/15 4:20 PM, Christophe Gisquet wrote:
> > From: Mickaël Raulet <mraulet at insa-rennes.fr>
> > 
> > Conflicts:
> > 	libavcodec/x86/hevc_mc.asm
> > ---
> >  libavcodec/x86/hevc_mc.asm | 12 ++++--------
> >  1 file changed, 4 insertions(+), 8 deletions(-)
> > 
> > diff --git a/libavcodec/x86/hevc_mc.asm b/libavcodec/x86/hevc_mc.asm
> > index efb4d1f..e8a5032 100644
> > --- a/libavcodec/x86/hevc_mc.asm
> > +++ b/libavcodec/x86/hevc_mc.asm
> > @@ -665,11 +665,9 @@ QPEL_TABLE 10, 8, w, avx2
> >  %if %2 == 8
> >      packuswb          %3, %4
> >  %else
> > -    pminsw            %3, [max_pixels_%2]
> > -    pmaxsw            %3, [zero]
> > +    CLIPW             %3, [zero], [max_pixels_%2]
> >  %if (%1 > 8 && notcpuflag(avx)) || %1 > 16
> > -    pminsw            %4, [max_pixels_%2]
> > -    pmaxsw            %4, [zero]
> > +    CLIPW             %4, [zero], [max_pixels_%2]
> 
> Many (but not all) of the functions calling these macros have free regs where max_pixels_%2
> and zero (or, for zero, a simple pxor m*, m*) could be stored.
> That would probably be faster than reloading these constants inside a loop.
> 
> But again, that's for a different patch.
> 
> >  %endif
> >  %endif
> >  %endmacro
> > @@ -1467,8 +1465,7 @@ cglobal hevc_put_hevc_uni_w%1_%2, 6, 6, 7, dst, dststride, src, srcstride, heigh
> >  %if %2 == 8
> >      packuswb          m0, m0
> >  %else
> > -    pminsw            m0, [max_pixels_%2]
> > -    pmaxsw            m0, [zero]
> > +    CLIPW             m0, [zero], [max_pixels_%2]
> >  %endif
> >      PEL_%2STORE%1   dstq, m0, m1
> >      add             dstq, dststrideq             ; dst += dststride
> > @@ -1539,8 +1536,7 @@ cglobal hevc_put_hevc_bi_w%1_%2, 5, 7, 10, dst, dststride, src, srcstride, src2,
> >  %if %2 == 8
> >      packuswb          m0, m0
> >  %else
> > -    pminsw            m0, [max_pixels_%2]
> > -    pmaxsw            m0, [zero]
> > +     CLIPW            m0, [zero], [max_pixels_%2]
> >  %endif
> >      PEL_%2STORE%1   dstq, m0, m1
> >      add             dstq, dststrideq             ; dst += dststride
> > 
> 
> lgtm otherwise.

applied

thanks
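
For readers following the diff, here is a rough scalar model in C of what the
replaced instruction pairs do per 16-bit lane. It is a sketch under stated
assumptions, not the code in the tree: it assumes CLIPW dst, min, max expands
to pmaxsw against min followed by pminsw against max, and that max_pixels_%2
holds (1 << bitdepth) - 1 in every lane. The names clip_min_then_max,
clip_max_then_min and clip_row are hypothetical. The removed lines clamp in
the opposite order (pminsw, then pmaxsw); the two orders agree whenever
min <= max, which is the case here.

```c
#include <stddef.h>
#include <stdint.h>

/* Per-lane model of pmaxsw / pminsw on signed 16-bit words. */
static int16_t max_s16(int16_t a, int16_t b) { return a > b ? a : b; }
static int16_t min_s16(int16_t a, int16_t b) { return a < b ? a : b; }

/* Order used by the removed lines: pminsw with the maximum,
 * then pmaxsw with zero. */
static int16_t clip_min_then_max(int16_t v, int16_t lo, int16_t hi)
{
    return max_s16(min_s16(v, hi), lo);
}

/* Order assumed for CLIPW dst, min, max: pmaxsw with min,
 * then pminsw with max. */
static int16_t clip_max_then_min(int16_t v, int16_t lo, int16_t hi)
{
    return min_s16(max_s16(v, lo), hi);
}

/* Hypothetical helper: clip a row of samples to [0, (1 << bit_depth) - 1].
 * Both bounds are computed once before the loop, which is the scalar
 * analogue of keeping zero and max_pixels in spare registers instead of
 * reloading them from memory on every iteration. */
static void clip_row(int16_t *row, size_t n, int bit_depth)
{
    const int16_t lo = 0;
    const int16_t hi = (int16_t)((1 << bit_depth) - 1);

    for (size_t i = 0; i < n; i++)
        row[i] = clip_max_then_min(row[i], lo, hi);
}
```

Calling clip_row on a 10-bit row clamps negative samples to 0 and anything
above 1023 to 1023, leaving in-range samples untouched.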


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Why not whip the teacher when the pupil misbehaves? -- Diogenes of Sinope