[FFmpeg-cvslog] lavr: x86: improve non-SSE4 version of S16_TO_S32_SX macro

Justin Ruggles git at videolan.org
Sat Jul 28 00:10:49 CEST 2012


ffmpeg | branch: master | Justin Ruggles <justin.ruggles at gmail.com> | Tue Jun 26 16:50:10 2012 -0400| [e9da9a311199c26e2ba1407a06c4b241148557b7] | committer: Justin Ruggles

lavr: x86: improve non-SSE4 version of S16_TO_S32_SX macro

Removes a false dependency on the existing contents of the 2nd dst register,
giving better performance with out-of-order execution (OOE).

> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=e9da9a311199c26e2ba1407a06c4b241148557b7
---

 libavresample/x86/util.asm |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libavresample/x86/util.asm b/libavresample/x86/util.asm
index 501f662..ca7fde5 100644
--- a/libavresample/x86/util.asm
+++ b/libavresample/x86/util.asm
@@ -26,7 +26,8 @@
     pmovsxwd     m%1, m%1
     SWAP %1, %2
 %else
-    punpckhwd    m%2, m%1
+    mova         m%2, m%1
+    punpckhwd    m%2, m%2
     punpcklwd    m%1, m%1
     psrad        m%2, 16
     psrad        m%1, 16
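
For readers unfamiliar with the idiom, the non-SSE4 branch sign-extends by unpacking a register with itself and then shifting each 32-bit lane right arithmetically by 16. The old form, punpckhwd m%2, m%1, gave the same result because the words taken from m%2 land in the low half of each lane and are shifted out by psrad, but it still read m%2 as an input. The following is a minimal scalar sketch in C of what each lane computes, not the actual lavr code; the function names are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical model of one 32-bit lane after "punpck{l,h}wd reg, reg":
 * the 16-bit word w ends up in both the low and the high half. */
static uint32_t unpack_word_with_itself(uint16_t w)
{
    return ((uint32_t)w << 16) | w;
}

/* Hypothetical model of "psrad reg, 16" on one lane: arithmetic shift
 * right by 16, written without relying on implementation-defined
 * behaviour of >> on negative signed values. */
static int32_t arith_shift_right_16(uint32_t lane)
{
    int32_t high_half = (int32_t)(lane >> 16);   /* 0 .. 65535 */
    return (lane & 0x80000000u) ? high_half - 65536 : high_half;
}

/* Model of the whole non-SSE4 branch for one 8-word register:
 * lo[] corresponds to punpcklwd + psrad, hi[] to punpckhwd + psrad. */
static void s16_to_s32_lanes(const int16_t src[8], int32_t lo[4], int32_t hi[4])
{
    for (int i = 0; i < 4; i++) {
        lo[i] = arith_shift_right_16(unpack_word_with_itself((uint16_t)src[i]));
        hi[i] = arith_shift_right_16(unpack_word_with_itself((uint16_t)src[i + 4]));
    }
}
```

Because only the high half of each lane survives the shift, the low half can hold anything. The patch fills it with a copy of the source rather than the stale contents of m%2, so the second destination no longer has to wait for whichever instruction last wrote it.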


