[FFmpeg-devel] [RFC] libavfilter audio API and related issues

Stefano Sabatini stefano.sabatini-lala
Sun May 23 19:12:36 CEST 2010


On date Saturday 2010-05-22 22:37:18 -0700, S.N. Hemanth Meenakshisundaram encoded:
[...]
> Hi,
> 
> I started off trying to make the ffplay changes required for audio
> filtering just to get an idea of what all will be required of an
> audio filter API. Attached is a rudimentary draft of the changes. It
> is merely to better understand the required design and based on this
> I have the following questions and observations about the design:
> 
> 1. ffplay currently gets only a single sample back for every
> audio_decode_frame call (even if encoded packet decodes to multiple
> samples). Should we be putting each sample individually through the
> filter chain or would it be better to collect a number of samples
> and then filter them together?

The second option looks more efficient, so yes, it is worth giving it a try.

> 2. Can sample rate, audio format etc change between samples? If not,
> can we move those parameters to the AVFilterLink structure as Bobby
> suggested earlier? The AVFilterLink structure also needs to be
> generalized.

Sample rate and audio format can be considered constant, as they are
for video. This should be changed eventually, but for the moment I
believe it is OK to assume this.

> 3. The number of channels can also be stored in the filter link
> right? That way, we will know how many of the data[8] pointers are
> valid.

We need some way to describe the layout of every single channel. Maybe
we could store CH_LAYOUT_* (check libavcodec/avcodec.h) in the
AVFilterLink; this should provide more information than the mere
number of channels.
 
> 4. Do we require linesize[8] for audio. I guess linesize here would
> represent the length of data in each channel. Isn't this already
> captured by sample format? Can different channels ever have
> different datasizes for a sample?

If we know the sample format and the number of samples/duration, then
the linesize information is indeed redundant. If it bothers you, you
can set it to 0.

> 5. Is it necessary to have a separate num_samples value in the
> BufferRef or Buffer (in case we filter multiple samples at a time)?
> Can we instead capture it as part of a more useful 'datasize'
> variable that can be quickly be used for copying the data between
> filters?
> 
> Also if we are converting AVFilterPic structure to a more generic
> AVFilterBuffer that is referred to by an AVFilterPicRef and
> AVFilterBufferRef, should the video specific items like PixFormat be
> removed and kept confined to PicRef and BufferRef?

Yes that was the idea.
AVFilterBuffer => contains the data common to A/V/T (data+linesize)

AVFilterPicRef => reference an AVFilterBuffer, and contains data specific
to video

AVFilterSamplesRef => reference an AVFilterBuffer, and contains data specific
to audio

According to this scheme samples_nb would be moved to
AVFilterSamplesRef.

> Regards,
> 

> --- ffplay.c	2010-05-22 22:18:09.573923072 -0700
> +++ ../ffplay.af	2010-05-22 22:16:49.277933683 -0700
> @@ -105,6 +105,7 @@
>  
>  #if CONFIG_AVFILTER
>      AVFilterPicRef *picref;
> +    AVFilterBufferRef *bufref;
>  #endif
>  } VideoPicture;
>  
> @@ -209,6 +210,8 @@
>  
>  #if CONFIG_AVFILTER
>      AVFilterContext *out_video_filter;          ///<the last filter in the video chain
> +    AVFilterContext *out_audio_filter;          ///<the last filter in the audio chain
> +    AVFilterGraph *agraph;
>  #endif
>  
>      float skip_frames;
> @@ -265,6 +268,7 @@
>  static int rdftspeed=20;
>  #if CONFIG_AVFILTER
>  static char *vfilters = NULL;
> +static char *afilters = NULL;
>  #endif
>  
>  /* current context */
> @@ -1752,6 +1756,138 @@
>                                    { .name = NULL }},
>      .outputs   = (AVFilterPad[]) {{ .name = NULL }},
>  };
> +
> +typedef struct {
> +    VideoState *is;
> +} AudioFilterPriv;
> +
> +static int input_audio_init(AVFilterContext *ctx, const char *args, void *opaque)
> +{
> +    AudioFilterPriv *priv = ctx->priv;
> +    AVCodecContext *codec;
> +    if(!opaque) return -1;
> +
> +    priv->is = opaque;
> +    codec    = priv->is->audio_st->codec;
> +    codec->opaque = ctx;
> +
> +    return 0;
> +}
> +
> +static void input_audio_uninit(AVFilterContext *ctx)
> +{
> +}
> +
> +static int input_request_samples(AVFilterLink *link)
> +{
> +    AudioFilterPriv *priv = link->src->priv;
> +    AVFilterBufferRef *bufref;
> +    int64_t pts = 0;
> +    int buf_size;
> +
> +#if CONFIG_AVFILTER
> +    buf_size = audio_get_filtered_samples(priv->is->out_audio_filter, priv->is, &pts);
> +#else
> +    buf_size = audio_decode_frame(priv->is, &pts);
> +#endif
> +    if (buf_size <= 0)
> +        return -1;
> +
> +    bufref = avfilter_get_audio_samples(link, AV_PERM_WRITE, link->sample_rate, link->a_format);
> +    memcpy(bufref->data[0], priv->is->audio_buf, buf_size);
> +
> +    bufref->pts = pts;
> +    bufref->datasize = buf_size;
> +    avfilter_filter_samples(link, bufref);
> +
> +    return 0;
> +}
> +
> +static int input_query_audio_formats(AVFilterContext *ctx)
> +{
> +    AudioFilterPriv *priv = ctx->priv;
> +    enum SampleFormat sample_fmts[] = {
> +        priv->is->audio_st->codec->sample_fmt, SAMPLE_FMT_NONE
> +    };
> +
> +    avfilter_set_common_formats(ctx, avfilter_make_format_list(sample_fmts));
> +    return 0;
> +}
> +
> +static int input_config_audio_props(AVFilterLink *link)
> +{
> +    AudioFilterPriv *priv  = link->src->priv;
> +    AVCodecContext *c = priv->is->audio_st->codec;
> +
> +    link->sample_rate = c->sample_rate;
> +    link->channels = c->channels;
> +    link->sample_fmt = c->sample_fmt;
> +
> +    return 0;
> +}
> +
> +static AVFilter input_filter =
> +{
> +    .name      = "ffplay_audio_input",
> +
> +    .priv_size = sizeof(AudioFilterPriv),
> +
> +    .init      = input_audio_init,
> +    .uninit    = input_audio_uninit,
> +
> +    .query_formats = input_query_audio_formats,
> +
> +    .inputs    = (AVFilterPad[]) {{ .name = NULL }},
> +    .outputs   = (AVFilterPad[]) {{ .name = "default",
> +                                    .type = AVMEDIA_TYPE_AUDIO,
> +                                    .request_samples = input_request_samples,
> +                                    .config_props  = input_config_audio_props, },
> +                                  { .name = NULL }},
> +};
> +
> +static void output_filter_samples(AVFilterLink *link)
> +{
> +}
> +
> +static int output_query_audio_formats(AVFilterContext *ctx)
> +{
> +    enum SampleFormat sample_fmts[] = { SAMPLE_FMT_S16, SAMPLE_FMT_NONE };
> +
> +    avfilter_set_common_formats(ctx, avfilter_make_format_list(sample_fmts));
> +    return 0;
> +}
> +
> +static int get_filtered_audio_frame(AVFilterContext *ctx, VideoState *is, int64_t *pts)
> +{
> +    AVFilterBufferRef *bufref;
> +
> +    if(avfilter_request_samples(ctx->inputs[0]))
> +        return -1;
> +    if(!(bufref = ctx->inputs[0]->cur_buf))
> +        return -1;
> +    ctx->inputs[0]->cur_buf = NULL;
> +
> +    *pts          = bufref->pts;
> +
> +    memcpy(is->audio_buf1, bufref->data, bufref->datasize);
> +    is->audio_buf = is->audio_buf1;
> +
> +    return bufref->datasize;
> +}
> +
> +static AVFilter output_audio_filter =
> +{
> +    .name      = "ffplay_audio_output",
> +
> +    .query_formats = output_query_audio_formats,
> +
> +    .inputs    = (AVFilterPad[]) {{ .name           = "default",
> +                                    .type           = AVMEDIA_TYPE_AUDIO,
> +                                    .filter_samples = output_filter_samples,
> +                                    .min_perms      = AV_PERM_READ, },
> +                                  { .name = NULL }},
> +    .outputs   = (AVFilterPad[]) {{ .name = NULL }},
> +};
>  #endif  /* CONFIG_AVFILTER */
>  
>  static int video_thread(void *arg)
> @@ -2175,6 +2311,9 @@
>      AVCodecContext *avctx;
>      AVCodec *codec;
>      SDL_AudioSpec wanted_spec, spec;
> +#if CONFIG_AVFILTER
> +    AVFilterContext *afilt_src = NULL, *afilt_out = NULL;
> +#endif
>  
>      if (stream_index < 0 || stream_index >= ic->nb_streams)
>          return -1;
> @@ -2227,6 +2366,45 @@
>          is->audio_src_fmt= SAMPLE_FMT_S16;
>      }
>  
> +#if CONFIG_AVFILTER
> +    is->agraph = av_mallocz(sizeof(AVFilterGraph));
> +    if(!(afilt_src = avfilter_open(&input_audio_filter,  "asrc")))  goto the_end;
> +    if(!(afilt_out = avfilter_open(&output_audio_filter, "aout")))  goto the_end;
> +
> +    if(avfilter_init_filter(afilt_src, NULL, is))             goto the_end;
> +    if(avfilter_init_filter(afilt_out, NULL, NULL))           goto the_end;
> +
> +
> +    if(afilters) {
> +        AVFilterInOut *outputs = av_malloc(sizeof(AVFilterInOut));
> +        AVFilterInOut *inputs  = av_malloc(sizeof(AVFilterInOut));
> +
> +        outputs->name    = av_strdup("ain");
> +        outputs->filter  = afilt_src;
> +        outputs->pad_idx = 0;
> +        outputs->next    = NULL;
> +
> +        inputs->name    = av_strdup("aout");
> +        inputs->filter  = afilt_out;
> +        inputs->pad_idx = 0;
> +        inputs->next    = NULL;
> +
> +        if (avfilter_graph_parse(is->agraph, afilters, inputs, outputs, NULL) < 0)
> +            goto the_end;
> +        av_freep(&afilters);
> +    } else {
> +        if(avfilter_link(afilt_src, 0, afilt_out, 0) < 0)          goto the_end;
> +    }
> +    avfilter_graph_add_filter(is->agraph, afilt_src);
> +    avfilter_graph_add_filter(is->agraph, afilt_out);
> +
> +    if(avfilter_graph_check_validity(is->agraph, NULL))       goto the_end;
> +    if(avfilter_graph_config_formats(is->agraph, NULL))       goto the_end;
> +    if(avfilter_graph_config_links(is->agraph, NULL))         goto the_end;
> +
> +    is->out_audio_filter = afilt_out;
> +#endif
> +
>      ic->streams[stream_index]->discard = AVDISCARD_DEFAULT;
>      switch(avctx->codec_type) {
>      case AVMEDIA_TYPE_AUDIO:
> @@ -2287,6 +2465,10 @@
>          if (is->reformat_ctx)
>              av_audio_convert_free(is->reformat_ctx);
>          is->reformat_ctx = NULL;
> +#if CONFIG_AVFILTER
> +        avfilter_graph_destroy(is->agraph);
> +        av_freep(&(is->agraph));
> +#endif
>          break;
>      case AVMEDIA_TYPE_VIDEO:
>          packet_queue_abort(&is->videoq);
> @@ -3046,6 +3228,7 @@
>      { "window_title", OPT_STRING | HAS_ARG, {(void*)&window_title}, "set window title", "window title" },
>  #if CONFIG_AVFILTER
>      { "vf", OPT_STRING | HAS_ARG, {(void*)&vfilters}, "video filters", "filter list" },
> +    { "af", OPT_STRING | HAS_ARG, {(void*)&afilters}, "audio filters", "filter list" },
>  #endif
>      { "rdftspeed", OPT_INT | HAS_ARG| OPT_AUDIO | OPT_EXPERT, {(void*)&rdftspeed}, "rdft speed", "msecs" },
>      { "default", OPT_FUNC2 | HAS_ARG | OPT_AUDIO | OPT_VIDEO | OPT_EXPERT, {(void*)opt_default}, "generic catch all option", "" },

Looks fine at a first glance.

So let's try to sketch a plan:

* Implement AVFilterBuffer, and use it in place of AVFilterPic.
  Make AVFilterPicRef reference such a struct, and create an
  AVFilterSamples containing the audio data.

* Have a first sketch at the API.

* Integrate it into ffplay. This step is more or less already
  implemented ;-).

As for the use of the SVN soc repository: it shouldn't be too bad to
let you work in the current libavfilter soc tree. Audio is quite
independent from video, so I don't expect major breakages, and the
stability of the tree shouldn't be affected too much; that shouldn't be
a major issue even for those who are currently using the libavfilter
tree.

Regards.
-- 
FFmpeg = Friendly and Fancy Murdering Powerful Exploitable Game
