[FFmpeg-devel] [PATCH 07/12] mips/aacdec: remove uses of mips32r2 specific ext instructions
james410 at cowgill.org.uk
Tue Mar 3 18:18:13 CET 2015
On Tue, 2015-03-03 at 12:42 +0000, Nedeljko Babic wrote:
> >Removing these removes the dependency of this code on mips32r2 which would
> >allow it to be used on processors which have FPU instructions, but not r2
> >instructions (like the mips64el debian port for instance).
> I would be more comfortable if there were two instances of this code: one for
> mips32r2 and one for mips32 so advantages of using mips32r2 instructions
> (however small here) are left intact.
> On the other hand, since this doesn't change much number of instructions used
> (adding at maximum around 100 instructions overall if I am not mistaking) I
> am ok with this.
Well I can't see how 'ext' can ever be faster than 'and' (it does more
work) so most of these should be no slower anyway. For VMUL4S my version
has 2 extra instructions in it so it could be a bit slower. Does this
#if seem ok?
@@ -198,9 +198,18 @@ static inline float *VMUL4S_mips(float *dst, const float *v, unsigned idx,
"lwxc1 %[temp12], %[temp3](%[v]) \n\t"
"lwxc1 %[temp13], %[temp4](%[v]) \n\t"
"and %[temp1], %[sign], %[mask] \n\t"
+#if defined(__mips_isa_rev) && __mips_isa_rev >= 2
"ext %[temp2], %[idx], 12, 1 \n\t"
"ext %[temp3], %[idx], 13, 1 \n\t"
"ext %[temp4], %[idx], 14, 1 \n\t"
+ "srl %[temp2], %[idx], 12 \n\t"
+ "srl %[temp3], %[idx], 13 \n\t"
+ "srl %[temp4], %[idx], 14 \n\t"
+ "andi %[temp2], %[temp2], 1 \n\t"
+ "andi %[temp3], %[temp3], 1 \n\t"
+ "andi %[temp4], %[temp4], 1 \n\t"
"sllv %[sign], %[sign], %[temp2] \n\t"
"xor %[temp1], %[temp0], %[temp1] \n\t"
"and %[temp2], %[sign], %[mask] \n\t"
More information about the ffmpeg-devel