[FFmpeg-cvslog] x86util: import MOVHL macro

James Darnley git at videolan.org
Sat Feb 18 21:28:05 EET 2017


ffmpeg | branch: master | James Darnley <jdarnley at obe.tv> | Sat Feb 11 13:25:09 2017 +0100| [7627df15d411a69f236b4650e88b1ab911f38efc] | committer: James Darnley

x86util: import MOVHL macro

Originally committed to x264 in 1637239a by Henrik Gramner who has
agreed to re-license it as LGPL.  Original commit message follows.

    x86: Avoid some bypass delays and false dependencies

    A bypass delay of 1-3 clock cycles may occur on some CPUs when transitioning
    between int and float domains, so try to avoid that if possible.

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=7627df15d411a69f236b4650e88b1ab911f38efc
---

 libavutil/x86/x86util.asm | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/libavutil/x86/x86util.asm b/libavutil/x86/x86util.asm
index c063436..1408f0a 100644
--- a/libavutil/x86/x86util.asm
+++ b/libavutil/x86/x86util.asm
@@ -876,3 +876,15 @@
     psrlq   %1, 8*(%2)
 %endif
 %endmacro
+
+%macro MOVHL 2 ; dst, src
+%ifidn %1, %2
+    punpckhqdq %1, %2
+%elif cpuflag(avx)
+    punpckhqdq %1, %2, %2
+%elif cpuflag(sse4)
+    pshufd     %1, %2, q3232 ; pshufd is slow on some older CPUs, so only use it on more modern ones
+%else
+    movhlps    %1, %2        ; may cause an int/float domain transition and has a dependency on dst
+%endif
+%endmacro
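
The four branches of MOVHL can be compared with a small lane-level model. This is only an illustrative sketch, not FFmpeg or x264 code: it assumes a 128-bit register is a list of four 32-bit lanes with index 0 least significant, and the helper names (`movhl`, the `path` argument, `Q3232`) are hypothetical. It shows that every branch places the high 64 bits of src in the low 64 bits of dst, and that only the movhlps fallback keeps the old upper half of dst, which is the dependency on dst noted in the comment.

```python
# Illustrative model only: a 128-bit register is a list of four 32-bit lanes,
# lane 0 being the least significant dword. All names here are hypothetical.

def punpckhqdq(a, b):
    """Result low qword = high qword of a; result high qword = high qword of b."""
    return [a[2], a[3], b[2], b[3]]

def pshufd(src, imm8):
    """Result lane i takes the source lane selected by bits 2i..2i+1 of imm8."""
    return [src[(imm8 >> (2 * i)) & 3] for i in range(4)]

def movhlps(dst, src):
    """Low qword of dst = high qword of src; the high qword of dst is preserved."""
    return [src[2], src[3], dst[2], dst[3]]

# q3232 in x86inc notation: lanes 3,2,3,2 listed from most to least significant.
Q3232 = (3 << 6) | (2 << 4) | (3 << 2) | 2

def movhl(dst, src, path):
    """Model of the branch MOVHL would take for the given (assumed) path name."""
    if path in ("same", "avx"):
        # Both operands are src: no read of an unrelated dst value.
        return punpckhqdq(src, src)
    if path == "sse4":
        return pshufd(src, Q3232)
    # Fallback: movhlps reads dst, so the result depends on its old contents.
    return movhlps(dst, src)

src = [0x10, 0x11, 0x12, 0x13]
dst = [0xA0, 0xA1, 0xA2, 0xA3]
for path in ("avx", "sse4", "sse2"):
    print(path, [hex(lane) for lane in movhl(dst, src, path)])
```

In this model the low two lanes are identical for all paths, so code that only consumes the low 64 bits of the result is unaffected by which branch is selected.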


