[FFmpeg-devel] port mplayer eq filter to libavfilter

Tue Nov 30 21:50:42 CET 2010

Hi,

On Tue, Nov 30, 2010 at 4:10 AM, William Yu <genwillyu at gmail.com> wrote:
> 2010/11/30 Ronald S. Bultje <rsbultje at gmail.com>:
>> Hi,
>>
>> On Mon, Nov 29, 2010 at 10:57 AM, William Yu <genwillyu at gmail.com> wrote:
>>> 2010/11/26 Ronald S. Bultje <rsbultje at gmail.com>:
>>>> On Fri, Nov 26, 2010 at 9:38 AM, William Yu <genwillyu at gmail.com> wrote:
>>>>> 2010/11/25 Ronald S. Bultje <rsbultje at gmail.com> wrote:
>>>>>> On Thu, Nov 25, 2010 at 4:27 AM, William Yu <genwillyu at gmail.com> wrote:
>>[..]
>>> +    brvec[0] = brvec[1] = brvec[2] = brvec[3] = brightness;
>>> +    contvec[0] = contvec[1] = contvec[2] = contvec[3] = contrast;
>>> +
>>> +    __asm__ volatile (
>>> +        "movq (%4), %%mm3 \n\t"
>>> +        "movq (%5), %%mm4 \n\t"
>>
>> movd %4, %%mm3
>> movd %5, %%mm4
>>
>> where 4=brightness and 5=contrast
>>
>> Then (mmx2; ignore this and I'll do it for you if you don't see how to):
>>
>> pshufw $0x0, %%mm3, %%mm3
>> pshufw $0x0, %%mm4, %%mm4
>>
>> or (mmx):
>>
>> punpcklwd %%mm3, %%mm3
>> punpcklwd %%mm4, %%mm4
>> punpckldq %%mm3, %%mm3
>> punpckldq %%mm4, %%mm4
>>
>> This is several cycles faster, and you no longer need brvec/contvec,
>> which saves you two asm arguments and makes it more likely to
>> compile on systems such as OSX.
>>
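To make the mmx variant above concrete, here is a plain-C model of what movd followed by the two unpacks does to the 64-bit register. This is only a sketch for illustration, not the filter code: the helper names (punpcklwd_self, punpckldq_self, broadcast_word) are made up, and it assumes the value of interest sits in the low 16 bits of the int, as it does for brightness and contrast.

```c
#include <stdint.h>

/* Model of "punpcklwd %%mmN, %%mmN": the two low 16-bit words are
 * interleaved with themselves, giving (low to high) w0 w0 w1 w1. */
static uint64_t punpcklwd_self(uint64_t v)
{
    uint64_t w0 = v & 0xFFFF;
    uint64_t w1 = (v >> 16) & 0xFFFF;
    return w0 | (w0 << 16) | (w1 << 32) | (w1 << 48);
}

/* Model of "punpckldq %%mmN, %%mmN": the low 32-bit dword is copied
 * into both halves of the register. */
static uint64_t punpckldq_self(uint64_t v)
{
    uint64_t d0 = v & 0xFFFFFFFFu;
    return d0 | (d0 << 32);
}

/* movd + punpcklwd + punpckldq: the low word ends up in all four lanes. */
static uint64_t broadcast_word(int value)
{
    uint64_t v = (uint32_t)value;   /* movd: 32 bits in, upper half zeroed */
    v = punpcklwd_self(v);
    v = punpckldq_self(v);
    return v;
}
```

With this model, broadcast_word(0x1234) gives 0x1234123412341234, and the upper 16 bits of the int never reach the result, which is why no brvec/contvec array is needed.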
>>> +        "movl %3, %%ecx \n\t"
>>> +        "andl $7, %%ecx \n\t"
>>> +        "cmpl $0, %%ecx \n\t"
>>
>> andl already sets ZF, so you don't need the cmpl.
>>
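A small sketch of the ZF point, assuming GCC-style inline asm on x86: the flag is read straight after the and, with no compare in between. The function name low3_clear is hypothetical, and the #else branch is only there so the snippet also builds on non-x86 targets.

```c
/* Returns 1 when (w & 7) == 0, i.e. when w is a multiple of 8. */
static int low3_clear(unsigned w)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned char zf;
    __asm__ (
        "andl $7, %1 \n\t"   /* and writes ZF from its own result */
        "setz %0     \n\t"   /* read ZF directly, no cmpl $0 needed */
        : "=q" (zf), "+r" (w)
        :
        : "cc"
    );
    return zf;
#else
    return (w & 7) == 0;     /* portable fallback for non-x86 builds */
#endif
}
```

The same reasoning applies to any and/test pair: the second instruction recomputes flags that the first one has already set.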
>>> +        "addl %7, %%eax \n\t"
>>> +        "movl %%eax, %%edx \n\t"
>>> +        "andl $768, %%eax \n\t"
>>> +        "testl %%eax, %%eax \n\t"
>>
>> Same.
>>
>>> +        : "=r" (line), "=m" (h)
>>> +        : "0" (line), "r" (w), "r" (brvec), "r" (contvec), "m" (step), "m" (brightness), "m" (contrast)
>>
>> %2 is unused, you should remove it.
>
> Thanks for your help, Updated. Please check it again.

Looking almost good. Now, it doesn't work on x86-64 yet because you're
mixing size-suffixed instructions with pointer-sized general registers:

>         "incl %0 \n\t"

Should be inc.

>         "addl %5, %0 \n\t"

Should be add.
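
A minimal sketch of the suffix-less form, assuming GCC-style inline asm: without an l/q suffix the assembler takes the operand size from the registers, so the same string builds on x86-32 and x86-64 as long as both operands have the same width. The helper name advance is hypothetical, and widening the step to ptrdiff_t is my assumption about how to get a pointer-sized value; the #else branch only keeps the snippet buildable on other targets.

```c
#include <stddef.h>
#include <stdint.h>

/* Advance p by step bytes using a size-agnostic add. */
static uint8_t *advance(uint8_t *p, int step)
{
    ptrdiff_t wide_step = step;  /* widen the int to the pointer's register size */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ (
        "add %1, %0 \n\t"        /* no suffix: size comes from the registers */
        : "+r" (p)
        : "r" (wide_step)
        : "cc"
    );
    return p;
#else
    return p + wide_step;        /* portable fallback for non-x86 builds */
#endif
}
```

Passing the plain int through an "r" constraint instead would pair a 32-bit register with a 64-bit one on x86-64, which the assembler rejects.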

Can anyone enlighten me as to what the "+m"(h) and "m"(step) do?
Changing "+m"(h) to "+r"(h) has no effect, but doing the same for step
causes a compilation failure, probably because the register size of an
int isn't the same as that of a pointer, so add destptr, step fails.
But other than that, what's the difference?

Ronald