[FFmpeg-devel] [Patch][OpenHEVC]added ASM DBF functions

James Almer jamrial at gmail.com
Fri May 16 19:58:50 CEST 2014


On 16/05/14 6:47 AM, Pierre Edouard Lepere wrote:
> Hi,
> 
> Here's a patch with the changes you suggested. However, I think that the luma is still ssse3 dependent.
> 
> Regards,
> Pierre-Edouard Lepere
> 

>  %macro LUMA_DEBLOCK_BODY 2
> -    movdqa           m9, m2
> +    mova             m9, m2
>      psllw            m9, 1; *2
> -    movdqa          m10, m1
> +    mova            m10, m1
>      psubw           m10, m9
>      paddw           m10, m3
> -    pabsw           m10, m10 ; 0dp0, 0dp3 , 1dp0, 1dp3
> +    ABS1            m10, m10 ; 0dp0, 0dp3 , 1dp0, 1dp3
>  
> -    movdqa           m9, m5
> +    mova             m9, m5

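Unlike with PABSW, where the second argument is the source, the second argument 
of ABS1 is a temp register that's used in the SSE2 case. The SSSE3 case ignores 
it and uses the first argument twice when it expands to pabsw.
So "ABS1 m10, m10" only works with SSSE3; with SSE2 the temp has to be a 
different register. Here you could use m9 or m11, for example, since they are 
going to be overwritten by later instructions.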
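
To illustrate why the temp must be a separate register, here is a minimal 
scalar C model of the usual SSE2 fallback for a packed 16-bit abs (zero the 
temp, subtract, take the signed max). The xmm_t type and the helper names are 
made up for this sketch and are not FFmpeg code; it assumes the fallback 
follows that pxor/psubw/pmaxsw sequence.

```c
#include <stdint.h>

#define LANES 8

/* Hypothetical stand-in for one XMM register holding 8 signed words. */
typedef struct { int16_t w[LANES]; } xmm_t;

/* pxor dst, src: bitwise xor per lane (dst == src zeroes the register). */
static void pxor(xmm_t *dst, const xmm_t *src)
{
    for (int i = 0; i < LANES; i++)
        dst->w[i] ^= src->w[i];
}

/* psubw dst, src: 16-bit subtract; the cast back to int16_t wraps on
 * common compilers, which matches the instruction's modular behaviour. */
static void psubw(xmm_t *dst, const xmm_t *src)
{
    for (int i = 0; i < LANES; i++)
        dst->w[i] = (int16_t)(uint16_t)(dst->w[i] - src->w[i]);
}

/* pmaxsw dst, src: signed maximum per lane. */
static void pmaxsw(xmm_t *dst, const xmm_t *src)
{
    for (int i = 0; i < LANES; i++)
        if (src->w[i] > dst->w[i])
            dst->w[i] = src->w[i];
}

/* Model of an SSE2-style "abs with temp": a = max(a, 0 - a). */
static void abs1_sse2(xmm_t *a, xmm_t *tmp)
{
    pxor(tmp, tmp);   /* tmp = 0; this destroys a if tmp aliases a */
    psubw(tmp, a);    /* tmp = -a */
    pmaxsw(a, tmp);   /* a = max(a, -a) = |a| */
}
```

Calling abs1_sse2(&a, &tmp) with two distinct registers gives the absolute 
values, while abs1_sse2(&a, &a) zeroes the input in the first step and leaves 
every lane at 0.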

And after replacing all the pabsw with ABS1/PABSW, luma can work with SSE2 alone.
It's a matter of duplicating the functions for each instruction set by using a 
macro, plus the necessary additions in hevcdsp_init.c
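
As a rough sketch of what the init side of that looks like, the C below picks 
the best available version of one function at runtime. Every name in it 
(DemoDSPContext, the DEMO_CPU_FLAG_* values, the stub functions) is 
hypothetical and only stands in for the real context, CPU flag checks and 
assembly entry points.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU feature bits, not the real flag values. */
#define DEMO_CPU_FLAG_SSE2  0x1
#define DEMO_CPU_FLAG_SSSE3 0x2

typedef void (*loop_filter_fn)(uint8_t *pix, ptrdiff_t stride);

/* Hypothetical DSP context with a single function pointer. */
typedef struct {
    loop_filter_fn luma_v;
    const char    *luma_v_name;   /* kept only so the choice is visible */
} DemoDSPContext;

/* Stubs standing in for the C version and the two macro-generated
 * assembly versions of the same luma filter. */
static void luma_v_c(uint8_t *pix, ptrdiff_t stride)     { (void)pix; (void)stride; }
static void luma_v_sse2(uint8_t *pix, ptrdiff_t stride)  { (void)pix; (void)stride; }
static void luma_v_ssse3(uint8_t *pix, ptrdiff_t stride) { (void)pix; (void)stride; }

/* Start from the C fallback, then let each supported instruction set
 * override it, ordered from oldest to newest. */
static void demo_dsp_init(DemoDSPContext *c, int cpu_flags)
{
    c->luma_v      = luma_v_c;
    c->luma_v_name = "c";
    if (cpu_flags & DEMO_CPU_FLAG_SSE2) {
        c->luma_v      = luma_v_sse2;
        c->luma_v_name = "sse2";
    }
    if (cpu_flags & DEMO_CPU_FLAG_SSSE3) {
        c->luma_v      = luma_v_ssse3;
        c->luma_v_name = "ssse3";
    }
}
```

With this ordering an SSE2-only CPU gets the SSE2 version and an SSSE3 CPU 
gets the SSSE3 one, without either path needing to know about the other.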

Anyway, to avoid postponing the commit of this code any further, I can send a 
patch to address the above after this makes it to the tree.
I already wrote it for testing, after all.

Regards.

