[FFmpeg-devel] port mplayer eq filter to libavfilter

Ronald S. Bultje rsbultje
Fri Nov 26 15:51:02 CET 2010


On Fri, Nov 26, 2010 at 9:38 AM, William Yu <genwillyu at gmail.com> wrote:
> 2010/11/25 Ronald S. Bultje <rsbultje at gmail.com> wrote:
>> On Thu, Nov 25, 2010 at 4:27 AM, William Yu <genwillyu at gmail.com> wrote:
>>> +        for (i = w&7; i; i--) {
>>> +            pel = ((*line * contrast)>>12) + brightness;
>>> +            if (pel&768) pel = (-pel)>>31;
>>> +            *line++ = pel;
>>> +        }
>> Please don't mix and match C and ASM, this takes about 10-20 lines in
>> asm, if you want you can even compile it using gcc and directly copy
>> it (see above, use FASTDIV also). That will prevent it from using eax
>> and then the function gets faster.
> Can you tell me why not to mix and match C and ASM? I think the compiler's
> optimizer can do better than my manual code, except for the MMX instructions.

See doc/optimization.txt

Short version: there is no guarantee that register values are preserved
between asm blocks, or between looped executions of the same asm block,
unless the loop itself is within the asm. This is correct:

asm("setup .. loop .. inner .. end loop");

This is not:

asm("setup"); for (..) { asm("inner"); }

which is why you do this:

for (..) { asm("setup .. inner"); }

but now you redo the setup for each row, unnecessarily.
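
To make the first form concrete, here is a rough sketch of
"setup .. loop .. inner .. end loop" in a single GCC-style asm statement.
It is not the eq filter from the patch: the helper name add_sat_blocks is
made up, it only does a saturating brightness add, and it uses SSE2 on
x86-64 (with a plain C fallback elsewhere) purely for illustration. The
point is that xmm1 is loaded once and stays valid for every iteration,
because the compiler never gets control back in between.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: saturating-add b to
 * n16 * 16 bytes starting at buf. */
static void add_sat_blocks(uint8_t *buf, size_t n16, uint8_t b)
{
    if (!n16)
        return;
#if defined(__GNUC__) && defined(__x86_64__)
    uint32_t splat = b * 0x01010101u;   /* b in all four bytes */
    __asm__ volatile(
        /* setup: runs once, xmm1 = b in all 16 bytes */
        "movd    %[s], %%xmm1          \n\t"
        "pshufd  $0, %%xmm1, %%xmm1    \n\t"
        /* loop: xmm1 stays valid because we never leave the asm */
        "1:                            \n\t"
        "movdqu  (%[p]), %%xmm0        \n\t"
        "paddusb %%xmm1, %%xmm0        \n\t"
        "movdqu  %%xmm0, (%[p])        \n\t"
        "add     $16, %[p]             \n\t"
        "dec     %[n]                  \n\t"
        "jnz     1b                    \n\t"
        : [p] "+r"(buf), [n] "+r"(n16)
        : [s] "r"(splat)
        : "xmm0", "xmm1", "memory", "cc");
#else
    /* portable fallback with the same result */
    for (size_t i = 0; i < n16 * 16; i++) {
        unsigned v = buf[i] + b;
        buf[i] = v > 255 ? 255 : (uint8_t)v;
    }
#endif
}
```

Splitting this into asm("setup") followed by a C loop around asm("inner")
would leave the compiler free to use xmm1 for something else in between.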

The same is true for the stuff at the end. The advantage of doing it
all in asm is that you can make sure the compiler doesn't switch
eax/stack (which it shouldn't). Also, you can then do the setup before
the loop. Even though it's not the inner loop, as Michael said, it'd
still make it faster. If Michael thinks I'm being an a-hole here, then
commit without it and I'll fix it after commit.
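
For reference when porting that tail to asm, this is a plain C sketch of
what the leftover-pixel loop computes. The names eq_pixel and eq_tail are
made up for illustration. It assumes contrast is non-negative fixed point
where 4096 means 1.0 (which follows from the >>12 in the quoted code), and
it clamps with ordinary comparisons instead of the (-pel)>>31 trick, which
depends on right shifts of negative values being arithmetic.

```c
#include <stdint.h>

/* Hypothetical scalar reference for one pixel.
 * Assumes contrast >= 0, with 4096 == 1.0. */
static uint8_t eq_pixel(uint8_t src, int contrast, int brightness)
{
    int pel = ((src * contrast) >> 12) + brightness;

    /* clamp to 0..255 with plain comparisons */
    if (pel < 0)
        pel = 0;
    else if (pel > 255)
        pel = 255;
    return (uint8_t)pel;
}

/* Process the n leftover pixels of a row (n = w & 7 in the MMX case). */
static void eq_tail(uint8_t *line, int n, int contrast, int brightness)
{
    for (int i = 0; i < n; i++)
        line[i] = eq_pixel(line[i], contrast, brightness);
}
```

An asm tail can be checked against this pixel by pixel.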

>> This is so simple, why not write a SSE2 version also? Cheap shot (untested):
>> [..]
>>> +        for (i = w&15; i; i--) {
>> Maybe even a SSSE3 version using pmaddubsw (not sure if that's
>> possible)? If you don't understand ASM you don't have to do the SSSE3
>> version, but still.
> Sorry, I don't have enough experience with assembly. Once I have studied it
> enough, I will add those versions. Or someone else may be better suited
> to do this job than me.

Same, I'll add it after you commit, if I find time. It makes more
sense to me to do it now (SSE2) since I already gave you the code, but
I'm not the maintainer, so I'm easily overruled.

