[FFmpeg-devel] [PATCH] x86/hevc_sao: make sao_band_filter work on x86_32

James Almer jamrial at gmail.com
Sun Feb 8 18:48:08 CET 2015


On 08/02/15 8:21 AM, Christophe Gisquet wrote:
> Hi,
> 
> 2015-02-07 23:06 GMT+01:00 James Almer <jamrial at gmail.com>:
>> Signed-off-by: James Almer <jamrial at gmail.com>
>> ---
>>  libavcodec/x86/hevc_sao.asm   | 40 ++++++++++++++++++++++++++++++++++++----
>>  libavcodec/x86/hevcdsp_init.c | 24 ++++++++++++------------
>>  2 files changed, 48 insertions(+), 16 deletions(-)
> 
> Passes fate here for Win32/64, but that's probably not the platforms
> you interested in hearing feedback from.
> Looks OK to commit.
> 
> A few questions though:
> 
>> +    %assign MMSIZE mmsize
> 
> Why do that? Not a big deal: it's only for my education, if there's
> something I'm missing.

For width 48, the COMPUTE macro is last run after an INIT_XMM cpuname, so mmsize becomes 
16 and in the avx2 version the instructions would access the wrong data in stack.
Doing %assign MMSIZE mmsize at the beginning of the function and using it here makes sure 
it's always 32 in avx2.
sse2 is unaffected by this, of course.

And the reason I'm using INIT_XMM in the middle of the function for the avx2 width 48 case 
is because i couldn't find a nice and clean way to use the xm* reg aliases with the COMPUTE 
macros.

> 
>> +    pcmpeqw           m4, %2, [rsp+MMSIZE*0]
>> +    pcmpeqw           m5, %2, [rsp+MMSIZE*1]
>> +    pcmpeqw           m6, %2, [rsp+MMSIZE*2]
>> +    pcmpeqw           %2, [rsp+MMSIZE*3]
>> +    pand              m4, [rsp+MMSIZE*4]
>> +    pand              m5, [rsp+MMSIZE*5]
>> +    pand              m6, [rsp+MMSIZE*6]
> [...]
>> -cglobal hevc_sao_band_filter_%1_8, 6, 6, 15, dst, src, dststride, srcstride, offset, left
>> +cglobal hevc_sao_band_filter_%1_8, 6, 6, 15, 8*mmsize*ARCH_X86_32, dst, src, dststride, srcstride, offset, left
>>      HEVC_SAO_BAND_FILTER_INIT 8
> 
> Why do you need room for 8 regs, and not 7?

Remnant from before i realized i could keep m7 untouched. I'll change it.

> 
> Setting this to 7 and doing the required changes to remove MMSIZE pass
> here on Win32.

Again, MMSIZE is for avx2 and you're probably running the sse2 version, where it's always 
the same as mmsize.

> 
>> -                SAO_BAND_INIT(10, sse2);
>>                  SAO_EDGE_INIT(10, sse2);
>>              }
>> +            SAO_BAND_INIT(10, sse2);
> 
> On a side note, you haven't ported >8 bits edge filter to x86_32. I
> guess because it wouldn't play real time anyway?

I was going to give edge >8bit a try next.


More information about the ffmpeg-devel mailing list