[FFmpeg-devel] [PATCH] MMX implementation of VC-1 inverse transforms

Christophe GISQUET christophe.gisquet
Sun Jan 20 13:27:04 CET 2008


Michael Niedermayer wrote:
>> +    asm volatile (
>> +        "movd      %2, %%mm0    \n\t"
>> +        "movd      %3, %%mm1    \n\t"
>> +        "punpcklwd %%mm0, %%mm0 \n\t"
>> +        "punpcklwd %%mm1, %%mm1 \n\t"
>> +        "punpckldq %%mm0, %%mm0 \n\t"
>> +        "punpckldq %%mm1, %%mm1 \n\t"
>> +        "movq      %%mm0, %0    \n\t"
>> +        "movq      %%mm1, %1    \n\t"
>> +        : "+m"(mm_rnd1), "+m"(mm_rnd2)
>> +        : "m"(rnd1), "m"(rnd2)
>> +    );
> 
> as rnd1 and rnd2, as well as shift, are constants, building these in the
> inner loops is completely unacceptable; you should pass int64_t arguments

Will do.
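
A minimal sketch of that suggestion, in plain C rather than MMX: the rounding constant is replicated into four 16-bit lanes once, outside the loop, and then passed around as a single 64-bit value. The helper names `splat16` and `check_splat16` are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): replicate a 16-bit value
 * into all four 16-bit lanes of a 64-bit word. This is the effect the
 * punpcklwd/punpckldq pair has on the low word of an MMX register. */
static uint64_t splat16(uint16_t v)
{
    return (uint64_t)v * 0x0001000100010001ULL;
}

/* Small self-check: a rounding value of 4 lands in every lane. */
static void check_splat16(void)
{
    assert(splat16(4) == 0x0004000400040004ULL);
}
```

The caller would compute such a value once per block and hand it to the transform, so the inner loop only needs a single movq from memory.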

> you should at least do
>> +        "movq   (%0,"OFF"), %%mm0 \n\t"         \
>> +        "psubw  %%mm0, %%mm1  \n\t"
>> +        "psubw  %%mm0, %%mm4  \n\t"
>> +        "psllw  $2, %%mm0 \n\t"
>> +        "psubw  %%mm0, %%mm2  \n\t"
>> +        "paddw  %%mm0, %%mm0  \n\t"
>> +        "psubw  %%mm0, %%mm4  \n\t"
>> +        "paddw  %%mm0, %%mm0  \n\t"
>> +        "psubw  %%mm0, %%mm3  \n\t"
>> +        "paddw  %%mm0, %%mm1  \n\t"
> 
> 2 fewer instructions, 3 fewer registers, no multiply, no constants read
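
For reference, the quoted sequence can be modelled one lane at a time in scalar C. This is only a sketch: it uses plain int and ignores 16-bit wraparound, `t[0]` to `t[3]` stand for mm1 to mm4, and the function names are hypothetical. Reading the AT&T operand order (source, destination), the net weights come out as +15, -4, -16 and -9.

```c
#include <assert.h>

/* Scalar model of the shift/add sequence; x plays the role of mm0. */
static void accumulate_shift_add(int t[4], int x)
{
    t[0] -= x;   /* psubw %%mm0, %%mm1                 */
    t[3] -= x;   /* psubw %%mm0, %%mm4                 */
    x *= 4;      /* psllw $2, %%mm0     : x is now  4x */
    t[1] -= x;   /* psubw %%mm0, %%mm2                 */
    x += x;      /* paddw %%mm0, %%mm0  : x is now  8x */
    t[3] -= x;   /* psubw %%mm0, %%mm4                 */
    x += x;      /* paddw %%mm0, %%mm0  : x is now 16x */
    t[2] -= x;   /* psubw %%mm0, %%mm3                 */
    t[0] += x;   /* paddw %%mm0, %%mm1                 */
}

/* The same update written with explicit multiplications. */
static void accumulate_multiply(int t[4], int x)
{
    t[0] += 15 * x;
    t[1] -=  4 * x;
    t[2] -= 16 * x;
    t[3] -=  9 * x;
}

/* Self-check: both forms agree for a sample input. */
static void check_accumulate(void)
{
    int a[4] = {1, 2, 3, 4}, b[4] = {1, 2, 3, 4};
    accumulate_shift_add(a, 7);
    accumulate_multiply(b, 7);
    for (int i = 0; i < 4; i++)
        assert(a[i] == b[i]);
}
```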

Merging this with the needed preshift, it is akin to writing (for instance):
t1 = 8 * src[1] + 8 * src[3] +  4 * src[5] +  2 * src[7]
   + ((src[5] - src[3]) >> 1);
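
The identity behind that expression can be checked in a few lines of C. The sketch assumes the odd-row coefficients 16, 15, 9, 4 of the VC-1 8-point transform, which the thread does not spell out, and the function names are hypothetical. Note that in C `+` binds tighter than `>>`, so the half-difference term needs its own parentheses.

```c
#include <assert.h>

/* Odd part of the first output with full-size coefficients
 * (assumed to be 16, 15, 9, 4). */
static int t1_reference(const int *src)
{
    return 16 * src[1] + 15 * src[3] + 9 * src[5] + 4 * src[7];
}

/* Same quantity pre-shifted right by one: power-of-two weights plus a
 * half-difference correction, since 7.5 = 8 - 0.5 and 4.5 = 4 + 0.5.
 * Right-shifting a negative int is implementation-defined in C;
 * common compilers use an arithmetic shift. */
static int t1_preshifted(const int *src)
{
    return 8 * src[1] + 8 * src[3] + 4 * src[5] + 2 * src[7]
         + ((src[5] - src[3]) >> 1);
}

/* Self-check on one sample column. */
static void check_preshift(void)
{
    const int src[8] = {0, 1, 0, 2, 0, 5, 0, 3};
    assert(t1_preshifted(src) == (t1_reference(src) >> 1));
}
```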

>> +        : "r"(off), "r"(3*off), "r"(5*off), "r"(7*off),
> 
> this needlessly wastes 4 registers to load a constant,
> and results in more complex and slower addressing

I'm not sure how to handle this. My goal was to make the 1-D dct8 a
function, and 'off' depends on which transform (8x8, 8x4, 4x8) uses that
function.

The value of 'off' is indeed known for each transform, but to take
advantage of that I would have to turn the 1-D dct8 into a macro,
potentially increasing object size if the function was not already inlined.

Is this what you want, or am I missing an intermediate solution?
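
One possible intermediate option, sketched here in plain C and not something proposed in the thread, is an inline function that takes 'off' as a parameter and is only called through thin wrappers passing a literal. All names below are hypothetical, and the body is a stand-in for the real transform.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 1-D pass: sums the four odd-indexed inputs, standing in
 * for the real transform body. 'off' is the distance between
 * consecutive coefficients. */
static inline int odd_sum_1d(const int16_t *src, int off)
{
    return src[1 * off] + src[3 * off] + src[5 * off] + src[7 * off];
}

/* Thin wrappers fix 'off' at compile time; if odd_sum_1d is inlined,
 * the compiler can fold 3*off, 5*off and 7*off into constant offsets. */
static int odd_sum_row(const int16_t *src)    { return odd_sum_1d(src, 1); }
static int odd_sum_column(const int16_t *src) { return odd_sum_1d(src, 8); }

/* Self-check on an 8x8 block filled with its own indices. */
static void check_odd_sum(void)
{
    int16_t blk[64];
    for (int i = 0; i < 64; i++)
        blk[i] = (int16_t)i;
    assert(odd_sum_row(blk) == 1 + 3 + 5 + 7);
    assert(odd_sum_column(blk) == 8 + 24 + 40 + 56);
}
```

Whether this carries over to a body written as inline asm depends on how the constant reaches the asm statement, so it would need checking against the generated code.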

Best regards,
-- 
Christophe GISQUET



