[FFmpeg-devel] [PATCH]v6 Opus Pyramid Vector Quantization Search in x86 SIMD asm

Henrik Gramner henrik at gramner.com
Sun Aug 6 14:39:24 EEST 2017


On Sat, Aug 5, 2017 at 12:58 AM, Ivan Kalvachev <ikalvachev at gmail.com> wrote:
> 8 packed, 8 scalar.
>
> Unless I miss something (and as I've said before,
> I'm not confident enough to mess with that code.)
>
> (AVX does extend to 32 variants, but they are not
> SSE compatible, so no need to emulate them.)

Oh, right. I quickly glanced at the docs and saw 32 pseudo-ops for
each instruction, for a total of 128 when adding pd, ps, sd, ss, but
the fact that only the first 8 are relevant here reduces it to 32, which
is a lot more manageable.
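
For reference, a sketch of where the 32 comes from: the 8 SSE compare
predicates (immediates 0 to 7) times the four suffixes ps, pd, ss, sd.
The listing below spells out the ps column, with the equivalent
immediate form in the comments. The label name is made up for
illustration, and it is meant to assemble with plain nasm (e.g.
-f elf64), not through x86inc.

```nasm
; Sketch only: the 8 SSE compare predicates as packed-single pseudo-ops.
; cmp_pseudo_ops_demo is a hypothetical label, not part of the patch.
section .text
global cmp_pseudo_ops_demo
cmp_pseudo_ops_demo:
    cmpeqps    xmm0, xmm1    ; cmpps xmm0, xmm1, 0  (equal)
    cmpltps    xmm0, xmm1    ; cmpps xmm0, xmm1, 1  (less than)
    cmpleps    xmm0, xmm1    ; cmpps xmm0, xmm1, 2  (less or equal)
    cmpunordps xmm0, xmm1    ; cmpps xmm0, xmm1, 3  (unordered)
    cmpneqps   xmm0, xmm1    ; cmpps xmm0, xmm1, 4  (not equal)
    cmpnltps   xmm0, xmm1    ; cmpps xmm0, xmm1, 5  (not less than)
    cmpnleps   xmm0, xmm1    ; cmpps xmm0, xmm1, 6  (not less or equal)
    cmpordps   xmm0, xmm1    ; cmpps xmm0, xmm1, 7  (ordered)
    ret
```

The pd, ss and sd forms follow the same pattern (cmpeqpd, cmpeqss,
cmpeqsd, ...), giving 8 x 4 = 32.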

>     movaps m1, [WRT_PIC_BASE + const_2 + r2 ]
>
> Looks better. (Also not tested. Will do, later.)

I intentionally used the WRT define at the end because that's most
similar to the built-in wrt syntax used when accessing symbols through
the PLT or GOT, e.g.

mov eax, [external_symbol wrt ..got]

> Yeah, $$ is the start of the current section, and that is going to be
> ".text", not "rodata".

Obviously, yes. You need a reference that results in a compile-time
constant PC-offset (which .rodata isn't) to create PC-relative
relocation records to external symbols.
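
As an illustration of the kind of reference meant here, this is
roughly the 32-bit ELF idiom from the NASM manual for locating the GOT.
`$$ - .get_GOT` is an assemble-time constant because both are in the
current section (.text), which is what lets the assembler emit a
PC-relative relocation against _GLOBAL_OFFSET_TABLE_. The names
load_external and external_symbol are placeholders, and this is a
sketch under those assumptions, not the code from the patch.

```nasm
; Sketch only, for nasm -f elf32. load_external and external_symbol
; are hypothetical names used for illustration.
section .text
extern _GLOBAL_OFFSET_TABLE_
extern external_symbol
global load_external:function
load_external:
    push    ebx
    call    .get_GOT                 ; pushes the address of .get_GOT
.get_GOT:
    pop     ebx                      ; ebx = run-time address of .get_GOT
    ; $$ and .get_GOT are both in .text, so ($$ - .get_GOT) is a constant
    ; and the whole expression becomes a single ..gotpc relocation.
    add     ebx, _GLOBAL_OFFSET_TABLE_ + $$ - .get_GOT wrt ..gotpc
    mov     eax, [ebx + external_symbol wrt ..got]  ; eax = &external_symbol
    mov     eax, [eax]               ; load the value it points to
    pop     ebx
    ret
```

Swapping `$$` for a label in another section would break this, since
the distance between two sections is not known until link time.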

