[FFmpeg-devel] [WIP][PATCH] Opus Piramid Vector Quantization Search in x86 SIMD asm

Ivan Kalvachev ikalvachev at gmail.com
Fri Jun 9 14:41:05 EEST 2017


On 6/9/17, Michael Niedermayer <michael at niedermayer.cc> wrote:
> On Fri, Jun 09, 2017 at 01:36:07AM +0300, Ivan Kalvachev wrote:
>>  opus_pvq.c              |    9
>>  opus_pvq.h              |    5
>>  x86/Makefile            |    1
>>  x86/opus_dsp_init.c     |   47 +++
>>  x86/opus_pvq_search.asm |  597
>> ++++++++++++++++++++++++++++++++++++++++++++++++
>>  5 files changed, 657 insertions(+), 2 deletions(-)
>> 3b9648bea3f01dad2cf159382f0ffc2d992c84b2
>> 0001-SIMD-opus-pvq_search-implementation.patch
>> From 06dc798c302e90aa5b45bec5d8fbcd64ba4af076 Mon Sep 17 00:00:00 2001
>> From: Ivan Kalvachev <ikalvachev at gmail.com>
>> Date: Thu, 8 Jun 2017 22:24:33 +0300
>> Subject: [PATCH 1/3] SIMD opus pvq_search implementation.
>
> seems this breaks build with mingw64, didnt investigate but it
> fails with these errors:
>
> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0x2d):
> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0x3fd):
> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0x7a1):
> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0xb48):
> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0x2d):
> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0x3fd):
> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0x7a1):
> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
> libavcodec/libavcodec.a(opus_pvq_search.o):src/libavcodec/x86/opus_pvq_search.asm:(.text+0xb48):
> relocation truncated to fit: R_X86_64_32 against `const_align_abs_edge'
> collect2: error: ld returned 1 exit status
> collect2: error: ld returned 1 exit status
> make: *** [ffmpeg_g.exe] Error 1
> make: *** Waiting for unfinished jobs....
> make: *** [ffprobe_g.exe] Error 1


const_*_edge is used in only one place in the code.
Would you check whether this patch fixes the issue?

--- a/libavcodec/x86/opus_pvq_search.asm
+++ b/libavcodec/x86/opus_pvq_search.asm
@@ -419,7 +419,7 @@ cglobal pvq_search,4,5,8, mmsize, inX, outY, K, N
         add         Nq,   r4q           ; Nq = align(Nq, mmsize)
         sub         rsp,  Nq            ; allocate tmpX[Nq]

-        movups      m3,   [const_align_abs_edge-mmsize+r4q] ; this is the bit mask for the padded read at the end of the input
+        movups      m3,   [const_align_abs_mask+32-mmsize+r4q] ; this is the bit mask for the padded read at the end of the input

         lea         r4q,  [Nq-mmsize]   ; Nq is rounded up (aligned up) to mmsize, so r4q can't become negative here, unless N=0.
         movups      m2,   [inXq + r4q]
===
I expected n/yasm to fold the whole address expression
into a single value, indexed
relative to the section start.
Instead, it seems that each symbol is emitted with
its own address plus a separate offset from it.
Since the offset here is negative, it occupies all 64 bits,
so truncating it to 32 bits changes the value.

The same issue could happen with the clang tools.

