[FFmpeg-devel] [PATCH] libavfilter: add atempo filter (revised patch v5)

Pavel Koshevoy pkoshevoy at gmail.com
Wed Jun 13 19:05:39 CEST 2012


On 6/13/2012 9:48 AM, Nicolas George wrote:
> On Quartidi, 24 Prairial, year CCXX, Pavel Koshevoy wrote:

[...]

>> +    // 0: input sample position corresponding to the ring buffer tail
>> +    // 1: output sample position
>> +    int64_t position[2];
>> +
>> +    // sample format:
>> +    enum AVSampleFormat format;
> Seems redundant.

Why is it redundant?  I need to know the sample format in order to 
operate on samples of various types (unsigned char, short int, int, 
float, double).
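
To illustrate the point, here is a minimal sketch of reading one sample from
an untyped buffer.  The `SampleFormat` enum and both helper names are
hypothetical stand-ins (not `enum AVSampleFormat` or any libavutil call), and
the scaling to [-1, 1) is only one possible convention:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for enum AVSampleFormat (not the libavutil type). */
enum SampleFormat { FMT_U8, FMT_S16, FMT_S32, FMT_FLT, FMT_DBL };

/* Bytes per sample for each format of the stand-in enum. */
static int bytes_per_sample(enum SampleFormat fmt)
{
    switch (fmt) {
    case FMT_U8:  return 1;
    case FMT_S16: return 2;
    case FMT_S32: return 4;
    case FMT_FLT: return 4;
    case FMT_DBL: return 8;
    }
    return 0;
}

/* Read sample number idx from an untyped buffer, scaled to [-1, 1). */
static float sample_to_float(const uint8_t *data, enum SampleFormat fmt, int idx)
{
    const uint8_t *p = data + (size_t)idx * bytes_per_sample(fmt);

    switch (fmt) {
    case FMT_U8:  return (*p - 128) / 128.0f;
    case FMT_S16: { int16_t v; memcpy(&v, p, sizeof(v)); return v / 32768.0f; }
    case FMT_S32: { int32_t v; memcpy(&v, p, sizeof(v)); return (float)(v / 2147483648.0); }
    case FMT_FLT: { float   v; memcpy(&v, p, sizeof(v)); return v; }
    case FMT_DBL: { double  v; memcpy(&v, p, sizeof(v)); return (float)v; }
    }
    assert(!"unknown sample format");
    return 0.0f;
}
```

Without the stored format, neither the per-sample stride nor the conversion
branch can be chosen from the raw byte pointer alone.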

[...]

>> +
>> +#define REALLOC_OR_FAIL(field, field_size)                      \
>> +    do {                                                        \
>> +        field = av_realloc(field, (field_size));                \
>> +        if (!field)                                             \
>> +            return AVERROR(ENOMEM);                             \
> Classic memory leak in case of failure. We have av_realloc_f to avoid just
> that, and the possible integer overflow in the multiplication at the same
> time.

Thanks, I'll rewrite this.
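
For reference, a sketch of the free-on-failure behaviour the review asks for.
`realloc_or_free` is a hypothetical stand-in written with plain `realloc` so
that it compiles without libavutil; in the filter itself
`av_realloc_f(ptr, nelem, elsize)` would take its place:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for av_realloc_f(): resize ptr to hold nelem
 * elements of elsize bytes each.  On a zero count, on multiplication
 * overflow, or on allocation failure the old block is freed and NULL is
 * returned, so the result can be assigned straight back to the field
 * without leaking the previous allocation.
 */
static void *realloc_or_free(void *ptr, size_t nelem, size_t elsize)
{
    void *tmp;

    /* a zero element size is a caller bug, not a runtime condition */
    assert(elsize > 0);

    /* reject empty requests and nelem * elsize overflow up front */
    if (!nelem || nelem > SIZE_MAX / elsize) {
        free(ptr);
        return NULL;
    }

    tmp = realloc(ptr, nelem * elsize);
    if (!tmp)
        free(ptr); /* realloc leaves the old block intact on failure */
    return tmp;
}
```

With a helper of this shape, `field = realloc_or_free(field, n, size)`
followed by a NULL check no longer loses the old pointer, which is the leak
in the macro above.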


>> +    } while (0)
>> +
>> +/**
>> + * Prepare filter for processing audio data of given format,
>> + * sample rate and number of channels.
>> + */
>> +static int yae_reset(ATempoContext *atempo,
>> +                     enum AVSampleFormat format,
>> +                     int sample_rate,
>> +                     int channels)
>> +{
>> +    const int sample_size = av_get_bytes_per_sample(format);
>> +    uint32_t nlevels  = 0;
>> +    uint32_t pot;
>> +    int i;
>> +
>> +    atempo->format   = format;
>> +    atempo->channels = channels;
>> +    atempo->stride   = sample_size * channels;
>> +
>> +    // pick a segment window size:
>> +    atempo->window = sample_rate / 24;
>> +
>> +    // adjust window size to be a power-of-two integer:
>> +    nlevels = av_log2(atempo->window);
>> +    pot = 1 << nlevels;
>> +    av_assert0(pot <= atempo->window);
>> +
>> +    if (pot < atempo->window) {
>> +        atempo->window = pot * 2;
>> +        nlevels++;
>> +    }
>> +
>> +    // initialize audio fragment buffers:
>> +    REALLOC_OR_FAIL(atempo->frag[0].data,
>> +                    atempo->window * atempo->stride);
>> +
>> +    REALLOC_OR_FAIL(atempo->frag[1].data,
>> +                    atempo->window * atempo->stride);
>> +
>> +    REALLOC_OR_FAIL(atempo->frag[0].xdat,
>> +                    atempo->window * 2 * sizeof(FFTComplex));
>> +
>> +    REALLOC_OR_FAIL(atempo->frag[1].xdat,
>> +                    atempo->window * 2 * sizeof(FFTComplex));
>> +
>> +    // initialize FFT contexts:
>> +    av_fft_end(atempo->fft_forward);
>> +    av_fft_end(atempo->fft_inverse);
>> +
>> +    atempo->fft_forward = av_fft_init(nlevels + 1, 0);
>> +    if (!atempo->fft_forward) {
>> +        return AVERROR(ENOMEM);
>> +    }
>> +
>> +    atempo->fft_inverse = av_fft_init(nlevels + 1, 1);
>> +    if (!atempo->fft_inverse) {
>> +        return AVERROR(ENOMEM);
>> +    }
>> +
>> +    REALLOC_OR_FAIL(atempo->correlation,
>> +                    atempo->window * 2 * sizeof(FFTComplex));
>> +
>> +    atempo->ring = atempo->window * 3;
>> +    REALLOC_OR_FAIL(atempo->buffer, atempo->ring * atempo->stride);
>> +
>> +    // initialize the Hann window function:
>> +    REALLOC_OR_FAIL(atempo->hann, atempo->window * sizeof(float));
> It looks like the various allocations here are not deallocated if something
> fails.
>

This will be addressed.
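
One common shape for such a fix, sketched with a hypothetical two-buffer
`Ctx` struct and plain `calloc`/`free` rather than the real `ATempoContext`:
every failure path jumps to a single label that releases whatever has been
allocated so far.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context with two buffers; not the real ATempoContext. */
typedef struct Ctx {
    float *hann;
    float *buffer;
} Ctx;

/* Release everything and reset the pointers; safe to call repeatedly. */
static void ctx_clear(Ctx *c)
{
    free(c->hann);
    free(c->buffer);
    c->hann   = NULL;
    c->buffer = NULL;
}

/*
 * (Re)allocate both buffers for the given window size.
 * Returns 0 on success and -1 on failure (the filter would return
 * AVERROR(ENOMEM) instead).  On failure nothing stays allocated.
 */
static int ctx_reset(Ctx *c, size_t window)
{
    assert(window > 0);
    ctx_clear(c);

    c->hann = calloc(window, sizeof(*c->hann));
    if (!c->hann)
        goto fail;

    /* ring buffer of three windows; calloc checks the multiplication */
    c->buffer = calloc(window, 3 * sizeof(*c->buffer));
    if (!c->buffer)
        goto fail;

    return 0;

fail:
    ctx_clear(c);
    return -1;
}
```

Because `ctx_clear` tolerates NULL members, the same function can also serve
as the filter's uninit path.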

[...]

>> +static void push_samples(ATempoContext *atempo,
>> +                         AVFilterLink *outlink,
>> +                         int n_out)
>> +{
>> +    atempo->dst_buffer->audio->sample_rate = outlink->sample_rate;
>> +    atempo->dst_buffer->audio->nb_samples  = n_out;
>> +
>> +    // adjust the PTS:
>> +    atempo->dst_buffer->pts =
>> +        av_rescale_q(atempo->nsamples_out,
>> +                     (AVRational){ 1, outlink->sample_rate },
>> +                     outlink->time_base);
> So I gather you have decided to ignore the input PTS completely and
> synthesize new PTS from scratch?
>


You've stated that the PTS has to be consistent with playback duration,
which in my understanding is the sum of the samples output so far.  I admit
the calculation above assumes the input stream starts at PTS 0.  I'll see
what I can do to accommodate streams that start from a non-zero PTS.
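
A sketch of one possible approach, under the assumption that the first input
PTS is remembered as an origin and the rescaled output sample count is added
to it.  `rescale` is a simplified, hypothetical stand-in for `av_rescale_q`
(it truncates instead of rounding to nearest and has no overflow
protection); the `TimeBase` struct and `output_pts` are illustrative names
only:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational. */
typedef struct TimeBase { int num, den; } TimeBase;

/*
 * Convert a count expressed in time base src into time base dst,
 * i.e. compute a * src / dst.  Simplified: truncates toward zero and
 * does not guard against 64-bit overflow, unlike av_rescale_q().
 */
static int64_t rescale(int64_t a, TimeBase src, TimeBase dst)
{
    assert(src.den > 0 && dst.num > 0);
    return a * src.num * dst.den / ((int64_t)src.den * dst.num);
}

/* PTS of the next output frame: stream origin plus duration emitted so far. */
static int64_t output_pts(int64_t origin_pts, int64_t nsamples_out,
                          int sample_rate, TimeBase out_tb)
{
    TimeBase per_sample = { 1, sample_rate };
    return origin_pts + rescale(nsamples_out, per_sample, out_tb);
}
```

For example, with a 1/90000 time base, a 44100 Hz stream and an origin of
1000, one second of output (44100 samples) lands at PTS 91000.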

The difficulty is that this filter is meant to be used interactively: the
user may seek (which changes the input stream PTS origin) or change the tempo
at any time.  Suggestions for handling these scenarios gracefully are welcome.

Thank you,
     Pavel.


