[FFmpeg-devel] port mplayer eq filter to libavfilter

Ronald S. Bultje rsbultje
Thu Nov 25 16:18:21 CET 2010


Hi,

On Thu, Nov 25, 2010 at 4:27 AM, William Yu <genwillyu at gmail.com> wrote:
> I have remove unnecessary param from process functions.
> Please check this patch. Thanks.

Please don't top-post.

> +It accepts the following parameters:
> + at var{brightness}:@var{constrast}

Range of values?

> + * FFmpeg is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.

GStreamer has a LGPL variant of this. why not use that? I hate this
"everything-is-GPL-except-the-wrapper" concept. For trivial code such
as this, I'd almost say we should reject patches...

> +static void process_c( uint8_t *line, int stride,
> +    int w, int h, int brightness, int contrast)
> +{
[..]
> +    contrast = ((contrast+100)*256*256)/100;
> +    brightness = ((brightness+100)*511)/200-128 - (contrast>>9);

Can FASTDIV() be used here? (Leave the original code in comments for
easier understanding).

> +void ff_eq_filter_process_mmx(uint8_t *line, int stride,
> +    int w, int h, int brightness, int contrast)
[..]
> +    while (h--) {
> +        __asm__ volatile (
> +            "movq (%3), %%mm3 \n\t"
> +            "movq (%4), %%mm4 \n\t"
> +            "pxor %%mm0, %%mm0 \n\t"
> +            "movl %2, %%eax\n\t"

You will recalculate each of these per row iteration, that sounds like
a huge waste.

> +            ASMALIGN(4)
> +            "1: \n\t"
> +            "movq (%0), %%mm1 \n\t"
> +            "movq (%0), %%mm2 \n\t"

Why not movq mm1, mm2?

> +        for (i = w&7; i; i--) {
> +            pel = ((*line* contrast)>>12) + brightness;
> +            if (pel&768) pel = (-pel)>>31;
> +            *line++ = pel;
> +        }

Please don't mix and match C and ASM, this takes about 10-20 lines in
asm, if you want you can even compile it using gcc and directly copy
it (see above, use FASTDIV also). That will prevent it from using eax
and then the function gets faster.

This is so simple, why not write a SSE2 version also? Cheap shot (untested):

> +            "movdqa (%3), %%xmm3 \n\t"
> +            "movdqa (%4), %%xmm4 \n\t"
> +            "pxor %%xmm0, %%xmm0 \n\t"
> +            "movl %2, %%eax\n\t"
> +            ASMALIGN(4)
> +            "1: \n\t"
> +            "movdqa (%0), %%xmm1 \n\t"
> +            "movdqa %%xmm1, %%xmm2 \n\t"
> +            "punpcklbw %%xmm0, %%xmm1 \n\t"
> +            "punpckhbw %%xmm0, %%xmm2 \n\t"
> +            "psllw $4, %%xmm1 \n\t"
> +            "psllw $4, %%xmm2 \n\t"
> +            "pmulhw %%xmm4, %%xmm1 \n\t"
> +            "pmulhw %%xmm4, %%xmm2 \n\t"
> +            "paddw %%xmm3, %%xmm1 \n\t"
> +            "paddw %%xmm3, %%xmm2 \n\t"
> +            "packuswb %%xmm2, %%xmm1 \n\t"
> +            "movdqa %%xmm1, (%0) \n\t"
> +            "add $16, %0 \n\t"
> +            "decl %%eax \n\t"
> +            "jnz 1b \n\t"
> +            : "=r" (line)
> +            : "0" (line), "r" (w>>4), "r" (brvec), "r" (contvec)
> +            : "%eax"
[..]
> +        for (i = w&15; i; i--) {

Maybe even a SSSE3 version using pmaddubsw (not sure if that's
possible)? If you don't understand ASM you don't have to do the SSSE3
version, but still.

> +    __asm__ volatile ( "emms \n\t" ::: "memory" );

Uh... So is every filter going to do this? This is hideous. Let's
generalize this. Michael, what's the reason for not doing this
generically?

Ronald



More information about the ffmpeg-devel mailing list