[FFmpeg-devel] [GSoC] Qualification Task

Stefano Sabatini stefano.sabatini-lala at poste.it
Wed Mar 23 18:03:28 CET 2011


On date Wednesday 2011-03-23 06:25:15 +0200, Mina Nagy Zaki encoded:
> On Tuesday 22 March 2011 17:35:39 Stefano Sabatini wrote:
> [...]
> > > In both cases I can't seem to figure out how I can find the end of
> > > stream, which is important for some effects as they need to 'drain'
> > > (like the echo effect).
> > 
> > I suppose you read AVERROR_EOF from avfilter_request_frame() when you
> > reach the end of a stream. Note that I'm a bit rusty with libavfilter
> > audio, as it is some months I don't touch that code and I don't have a
> > good memory ;-), but I remember there was a problem with some filters
> > which needed that (in particular a video concatenation filter).
> [...]
> > Check: cmdutils.c:get_filtered_audio_buffer(), it implements a pull
> > model, correct me if I'm wrong.
> 
> I have already checked that, which is what led me to the question in the first 
> place. I can't really read AVERROR_EOF from request_frame() because audio 
> filters don't(/shouldn't?) use it. get_filtered_audio_buffer() does a 
> request_frame() on aout which propagates all the way to asrc (without passing 
> through intermediate filters, because they don't/shouldn't use it) which then 
> does a filter_samples that propagates all the way to aout! If I try to 
> request_frame from the previous buffer it will reach asrc then eventually lead 
> to calling my own effect's filter_samples, so now do I filter in filter_samples or 
> request_frame :D  And in any case, asrc never returns AVERROR_EOF anyway.
> I have examined most of the RFC thread about lavfi audio, but I couldn't really 
> find a discussion about push/pull models. The current mixture is a slight mess.

Check avfilter_request_frame(): it calls the request_frame callback if
one is defined on the link's source pad, otherwise it calls
avfilter_request_frame() on the first input link of the source filter.

In other words we have a mixed pull/push model: you pull frames from
the end of the chain, and as a consequence the source pushes frames
down the filterchain. There is nothing wrong with implementing a
request_frame callback on the output pads.

This is essentially what is done in the case of video; check for
example how it is done in the fifo filter (and of course you also need
to implement it in a source). I don't know if this is a mess, but
don't worry, we'll fix it ;-).
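
For reference, a minimal sketch of such a request_frame callback on an
audio output pad could look like the following (this is just an
illustration of the mechanism, not code taken from an existing filter):

#include "avfilter.h"

static int request_frame(AVFilterLink *outlink)
{
    AVFilterContext *ctx = outlink->src;

    /* pull from the previous link; as a side effect the source pushes
     * samples through our filter_samples() on the input pad */
    int ret = avfilter_request_frame(ctx->inputs[0]);

    /* a filter which needs to drain (echo, reverb, ...) would check for
     * AVERROR_EOF here and flush its buffered tail with
     * avfilter_filter_samples(outlink, ...) before propagating it */
    return ret;
}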

As for the asrc_buffer issue, you're right: request_frame() never
returns AVERROR_EOF, so this should be fixed.
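
Something along these lines should do it (a rough sketch, the context
fields are assumed and do not match the actual asrc_buffer code):

#include "avfilter.h"

typedef struct {
    AVFilterBufferRef *buf; /* queued samples, NULL when the queue is empty */
    int eof;                /* set once the feeder signals end of stream    */
} BufferSourceContext;

static int request_frame(AVFilterLink *outlink)
{
    BufferSourceContext *s = outlink->src->priv;

    if (!s->buf)
        return s->eof ? AVERROR_EOF : AVERROR(EAGAIN);

    /* push the queued buffer down the filterchain */
    avfilter_filter_samples(outlink, s->buf);
    s->buf = NULL;
    return 0;
}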

> > > Another major problem is what you asked at the very end: more than two
> > > channels in the widen effect. I was actually planning to send a separate
> > > email about this: Channel Negotiation. Not all effects can handle any
> > > number of channels, and there's no way to do anything about it (except
> > > fail loudly) in the current API.
> > > 
> > > Again looking at sox, it does this by having effects declare
> > > whether or not they can handle multiple channels. If they can then it
> > > simply sends in the entire stream. If they can't then it creates an
> > > effect instance for each channel, splits the stream, and 'flows' samples
> > > of one channel into each of the effect instances. IMHO we _could_
> > > improve on this.
> > 
> > I suppose the lavfi way would be to let the various filters declare
> > which are the supported channel layouts during the init stage, and
> > auto-insert an aconvert filter when it is needed.  For example you
> > could have:
> > 
> > aconvert=s16:mono, widen
> > -> an aconvert is automatically inserted before widen
> > aconvert=s16:mono, aconvert=s16:stereo, widen
> 
> Seems reasonable.
> 
> > BTW, thinking at it since this is a port I think it would be better to
> > use the same name of the original effect, so it's easier for people to
> > spot the correspondence between the two.
> 
> Sure. I've attached the latest patch, after all modifications.
> 
> -- 
> Mina

> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index 8d2f273..ec288ec 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -17,6 +17,7 @@ OBJS = allfilters.o                                                     \
>  OBJS-$(CONFIG_ACONVERT_FILTER)               += af_aconvert.o
>  OBJS-$(CONFIG_ANULL_FILTER)                  += af_anull.o
>  OBJS-$(CONFIG_ARESAMPLE_FILTER)              += af_aresample.o
> +OBJS-$(CONFIG_EARWAX_FILTER)                 += af_earwax.o
>  
>  OBJS-$(CONFIG_ANULLSRC_FILTER)               += asrc_anullsrc.o
>  
> diff --git a/libavfilter/af_earwax.c b/libavfilter/af_earwax.c
> new file mode 100644
> index 0000000..f83302f
> --- /dev/null
> +++ b/libavfilter/af_earwax.c
> @@ -0,0 +1,138 @@
> +/*
> + * Copyright (c) 2010 Mina Nagy Zaki

Also add the original contributors, from the sox source:

 * Copyright (c) 2000 Edward Beingessner And Sundry Contributors.
 * This source code is freely redistributable and may be used for any purpose.
 * This copyright notice must be maintained.  Edward Beingessner And Sundry
 * Contributors are not responsible for the consequences of using this
 * software.

> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +/**
> + * @file
> + * Stereo Widening Effect. Adds audio cues to move stereo image in
> + * front of the listener. Adapted from the libsox earwax effect.
> + */
> +
> +#include "avfilter.h"
> +
> +#define NUMTAPS 64
> +
> +static const int8_t filt[NUMTAPS] = {
> +/* 30°  330° */
> +    4,   -6,     /* 32 tap stereo FIR filter. */
> +    4,  -11,     /* One side filters as if the */
> +   -1,   -5,     /* signal was from 30 degrees */
> +    3,    3,     /* from the ear, the other as */
> +   -2,    5,     /* if 330 degrees. */
> +   -5,    0,
> +    9,    1,
> +    6,    3,     /*                         Input                         */
> +   -4,   -1,     /*                   Left         Right                  */
> +   -5,   -3,     /*                __________   __________                */
> +   -2,   -5,     /*               |          | |          |               */
> +   -7,    1,     /*           .---|  Hh,0(f) | |  Hh,0(f) |---.           */
> +    6,   -7,     /*          /    |__________| |__________|    \          */
> +   30,  -29,     /*         /                \ /                \         */
> +   12,   -3,     /*        /                  X                  \        */
> +  -11,    4,     /*       /                  / \                  \       */
> +   -3,    7,     /*  ____V_____   __________V   V__________   _____V____  */
> +  -20,   23,     /* |          | |          |   |          | |          | */
> +    2,    0,     /* | Hh,30(f) | | Hh,330(f)|   | Hh,330(f)| | Hh,30(f) | */
> +    1,   -6,     /* |__________| |__________|   |__________| |__________| */
> +  -14,   -5,     /*      \     ___      /           \      ___     /      */
> +   15,  -18,     /*       \   /   \    /    _____    \    /   \   /       */
> +    6,    7,     /*        `->| + |<--'    /     \    `-->| + |<-'        */
> +   15,  -10,     /*           \___/      _/       \_      \___/           */
> +  -14,   22,     /*               \     / \       / \     /               */
> +   -7,   -2,     /*                `--->| |       | |<---'                */
> +   -4,    9,     /*                     \_/       \_/                     */
> +    6,  -12,     /*                                                       */
> +    6,   -6,     /*                       Headphones                      */
> +    0,  -11,
> +    0,   -5,
> +    4,    0};
> +
> +typedef struct {
> +    int16_t taps[NUMTAPS * 2];
> +} WidenContext;

Nit+: s/WidenContext/EarwaxContext/

> +
> +static int query_formats(AVFilterContext *ctx)
> +{
> +    AVFilterFormats *formats = NULL;
> +    avfilter_add_format(&formats, AV_SAMPLE_FMT_S16);
> +    avfilter_set_common_formats(ctx, formats);
> +    return 0;
> +}
> +
> +//FIXME: replace with DSPContext.scalarproduct_int16
> +static inline int16_t* scalarproduct(int16_t *in, int16_t *endin, int16_t *out)
> +{
> +    int32_t sample;
> +    int16_t j;
> +
> +    while (in < endin){

Nit++: while (...) {  (add a space before the opening brace)

> +        sample = 0;
> +        for (j = 0; j < NUMTAPS; j++)
> +            sample += in[j] * filt[j];
> +        *out = sample >> 6;
> +        out++; in++;
> +    }
> +
> +    return out;
> +}
> +
> +static void filter_samples(AVFilterLink *inlink, AVFilterBufferRef *insamples)
> +{
> +    int16_t *taps, *endin, *in, *out;
> +    AVFilterBufferRef *outref =
> +        avfilter_get_audio_buffer(inlink, AV_PERM_WRITE,
> +                                  inlink->format,
> +                                  insamples->audio->nb_samples,
> +                                  inlink->channel_layout,
> +                                  insamples->audio->planar);
> +    taps  = ((WidenContext*)inlink->dst->priv)->taps;
> +    out   = (int16_t*)outref->data[0];
> +    in    = (int16_t*)insamples->data[0];
> +
> +    // copy part of new input and process with saved input
> +    memcpy(taps+NUMTAPS, in, NUMTAPS * sizeof(*taps));
> +    out   = scalarproduct(taps, taps + NUMTAPS, out);
> +
> +    // process current input
> +    endin = in + insamples->audio->nb_samples * 2 - NUMTAPS;
> +    out   = scalarproduct(in, endin, out);
> +
> +    // save part of input for next round
> +    memcpy(taps, endin , NUMTAPS * sizeof(*taps));

Nit+++: endin,  (drop the stray space before the comma)

> +
> +    avfilter_filter_samples(inlink->dst->outputs[0], outref);
> +    avfilter_unref_buffer(insamples);
> +}
> +
> +AVFilter avfilter_af_earwax = {
> +    .name           = "earwax",
> +    .description    = NULL_IF_CONFIG_SMALL("Widen the stereo image."),
> +    .query_formats  = query_formats,
> +    .priv_size      = sizeof(WidenContext),
> +    .inputs  = (AVFilterPad[])  {{  .name           = "default",
> +                                    .type           = AVMEDIA_TYPE_AUDIO,
> +                                    .filter_samples = filter_samples,
> +                                    .min_perms      = AV_PERM_READ, },
> +                                 {  .name = NULL}},
> +
> +    .outputs = (AVFilterPad[])  {{  .name           = "default",
> +                                    .type           = AVMEDIA_TYPE_AUDIO, },
> +                                 {  .name = NULL}},
> +};

Note: the original filter fails for sample rates != 44100; I wonder if
it makes sense to keep that limitation (that would require a check in
config_props, see the sketch below).
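
Such a check could be a simple config_props callback on the input pad,
something like this sketch (assuming we keep the 44100 Hz limitation of
the sox effect):

#include "avfilter.h"

static int config_input(AVFilterLink *inlink)
{
    if (inlink->sample_rate != 44100) {
        av_log(inlink->dst, AV_LOG_ERROR,
               "The earwax effect only works for 44100 Hz streams.\n");
        return AVERROR(EINVAL);
    }
    return 0;
}

and then set .config_props = config_input on the input pad.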

> +
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index 425cff8..fc8708b 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -37,6 +37,7 @@ void avfilter_register_all(void)
>      REGISTER_FILTER (ACONVERT,    aconvert,    af);
>      REGISTER_FILTER (ANULL,       anull,       af);
>      REGISTER_FILTER (ARESAMPLE,   aresample,   af);
> +    REGISTER_FILTER (EARWAX,      earwax,      af);
>  
>      REGISTER_FILTER (ANULLSRC,    anullsrc,    asrc);

Missing docs: you can reuse the nice sox docs and add an entry in
doc/filters.texi.
-- 
FFmpeg = Free and Funny Mysterious Patchable Extreme Gorilla


