[FFmpeg-devel] [PATCH] lavc/lpc: R-V V apply_welch_window
Rémi Denis-Courmont
remi at remlab.net
Mon Dec 11 11:50:53 EET 2023
On 11 December 2023 11:11:28 GMT+02:00, Anton Khirnov <anton at khirnov.net> wrote:
>Quoting Rémi Denis-Courmont (2023-12-08 18:46:51)
>> +#if __riscv_xlen >= 64
>> +func ff_lpc_apply_welch_window_rvv, zve64d
>> + vsetvli t0, zero, e64, m8, ta, ma
>> + vid.v v0
>> + addi t2, a1, -1
>> + vfcvt.f.xu.v v0, v0
>> + li t3, 2
>> + fcvt.d.l ft2, t2
>> + srai t1, a1, 1
>> + fcvt.d.l ft3, t3
>> + li t4, 1
>> + fdiv.d ft0, ft3, ft2 # ft0 = c = 2. / (len - 1)
>> + fcvt.d.l fa1, t4 # fa1 = 1.
>> + fsub.d ft1, ft0, fa1
>> + vfrsub.vf v0, v0, ft1 # v0[i] = c - i - 1.
>> +1:
>> + vsetvli t0, t1, e64, m8, ta, ma
>> + vfmul.vv v16, v0, v0 # no fused multiply-add as v0 is reused
>> + sub t1, t1, t0
>> + vle32.v v8, (a0)
>> + fcvt.d.l ft2, t0
>> + vfrsub.vf v16, v16, fa1 # v16 = 1. - w * w
>> + sh2add a0, t0, a0
>> + vsetvli zero, zero, e32, m4, ta, ma
>> + vfwcvt.f.x.v v24, v8
>> + vsetvli zero, zero, e64, m8, ta, ma
>> + vfsub.vf v0, v0, ft2 # v0 -= vl
>> + vfmul.vv v8, v24, v16
>> + vse64.v v8, (a2)
>> + sh3add a2, t0, a2
>> + bnez t1, 1b
>> +
>> + andi t1, a1, 1
>> + beqz t1, 2f
>> +
>> + sd zero, (a2)
>> + addi a0, a0, 4
>> + addi a2, a2, 8
>> +2:
>> + vsetvli t0, zero, e64, m8, ta, ma
>> + vid.v v0
>> + srai t1, a1, 1
>> + vfcvt.f.xu.v v0, v0
>> + fcvt.d.l ft1, t1
>> + fsub.d ft1, ft0, ft1 # ft1 = c - (len / 2)
>> + vfadd.vf v0, v0, ft1 # v0[i] = c - (len / 2) + i
>> +3:
>> + vsetvli t0, t1, e64, m8, ta, ma
>> + vfmul.vv v16, v0, v0
>> + sub t1, t1, t0
>> + vle32.v v8, (a0)
>> + fcvt.d.l ft2, t0
>> + vfrsub.vf v16, v16, fa1 # v16 = 1. - w * w
>> + sh2add a0, t0, a0
>> + vsetvli zero, zero, e32, m4, ta, ma
>> + vfwcvt.f.x.v v24, v8
>> + vsetvli zero, zero, e64, m8, ta, ma
>> + vfadd.vf v0, v0, ft2 # v0 += vl
>> + vfmul.vv v8, v24, v16
>> + vse64.v v8, (a2)
>> + sh3add a2, t0, a2
>> + bnez t1, 3b
>
>I think it'd look a lot less like base64 < /dev/random if you vertically
>aligned the first operands.
They are aligned to the 17th column. The problem is that quite a few vector mnemonics are longer than 7 characters.
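
For anyone trying to follow the quoted loops despite the formatting, here is a rough scalar C model of what they appear to compute, derived only from the assembly and its comments above. The function name welch_window_sketch is hypothetical, the argument order (data, len, w_data) is assumed from the use of a0, a1 and a2, and this is not FFmpeg's actual C implementation. Rounding may also differ slightly from the vector code, since the operations are grouped differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the quoted RVV loops (assumed argument
 * order: a0 = data, a1 = len, a2 = w_data). Not bit-exact with the
 * vector code and not taken from FFmpeg's C sources. */
static void welch_window_sketch(const int32_t *data, ptrdiff_t len,
                                double *w_data)
{
    const ptrdiff_t n2 = len >> 1;             /* srai t1, a1, 1 */
    const double c = 2.0 / (double)(len - 1);  /* ft0 = c = 2. / (len - 1) */
    ptrdiff_t i;

    /* Loop at label 1: w starts at c - 1 and decreases by 1 per sample. */
    for (i = 0; i < n2; i++) {
        double w = c - (double)i - 1.0;
        w_data[i] = (double)data[i] * (1.0 - w * w);
    }
    data   += n2;
    w_data += n2;

    /* Odd length: the middle sample is stored as zero and skipped. */
    if (len & 1) {
        *w_data = 0.0;
        data++;
        w_data++;
    }

    /* Loop at label 3: w starts at c - len / 2 and increases by 1. */
    for (i = 0; i < n2; i++) {
        double w = c - (double)n2 + (double)i;
        w_data[i] = (double)data[i] * (1.0 - w * w);
    }
}
```

With len = 4 and every input sample equal to 1, this model gives roughly 8/9, -7/9, -7/9, 8/9, so the two halves mirror each other as the two loops suggest.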
>
>--
>Anton Khirnov