[FFmpeg-devel] [PATCH 2/2] tta/x86: add ff_ttafilter_process_dec_{ssse3, sse4}

Christophe Gisquet christophe.gisquet at gmail.com
Tue Feb 11 02:26:53 CET 2014


2014-02-11 2:12 GMT+01:00 Christophe Gisquet <christophe.gisquet at gmail.com>:
> I haven't quite checked if the code is optimal, but I haven't seen any
> other issue. Maybe using more registers to break dependencies, but
> that's a short function, and there's no loop to amortize their use.

There are a few spots where some dependencies might exist (not
checked) and could be lifted, e.g.
+    paddd      m6, m7
+
+    movd       m7, [filterq + 0x4]

I think m2 and m3 are free at that point. You should use them instead,
because as written those 2 insns may not execute in parallel: the movd
overwrites m7, which the paddd just before it still reads.
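
Something along these lines, as an untested sketch: it assumes m2
really is dead at that point (I haven't checked), and every later read
of the loaded value would have to be renamed from m7 to m2 as well.

```asm
    paddd      m6, m7                  ; unchanged from the patch: reads m7

    movd       m2, [filterq + 0x4]     ; load into m2 instead of m7 (assumes m2 is free here)
```

That way the load no longer writes the register the add is reading.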

-- 
Christophe

