[Libav-user] # of audio samples, calculated vs. codec context

Brad O'Hearne brado at bighillsoftware.com
Mon May 20 17:39:36 CEST 2013


On May 18, 2013, at 1:11 PM, Brad O'Hearne <brado at bighillsoftware.com> wrote:

> I have the following audio use-case: 
> 
> audio capture -> resample captured audio to destination format for encoding -> encode audio -> stream audio
> 
> I have developed an app which has worked decently well for fairly common sample rates (44100, 48000). However, I came across a sample rate of 16000 which is breaking the app. The problem stems from the calculation of the number of audio samples in the destination data exceeding the codec context's frame size. 
> 
> The resampling_audio.c example shows the following way to calculate the destination number of samples: 
> 
> dst_nb_samples = av_rescale_rnd(swr_get_delay(swr_ctx, src_rate) +
>                                        src_nb_samples, dst_rate, src_rate, AV_ROUND_UP);
> 
> Here are the values for these variables: 
> 
> src_rate = 16000
> src_nb_samples = 512
> dst_rate = 44100
> 
> and the calculated value: 
> 
> dst_nb_samples = 1412
> 
> However, the codec context's frame size is set (by the encoder, as per the documentation) to 1152, which is smaller than the calculated value. If I continue with the resampling, followed by encoding with these values, I see this in the console: 
> 
> [libmp3lame @ 0x10380a200] more samples than frame size (avcodec_encode_audio2)
> 
> followed by receiving a -22 return code from avcodec_encode_audio2. I tracked this into the FFmpeg source, and the console output is coming from libavcodec's utils.c line 1208 as the result of this check in the preceding line failing: 
> 
>            if (frame->nb_samples > avctx->frame_size) {
> 
> Fair enough, it doesn't want more samples than the codec specifies for its expected frame size. Just to see what would happen, I assigned the codec context's frame size to dst_nb_samples. The audio seems mostly fine, but the associated video timing is out of sync (which probably makes sense, since the timing is derived from a wrong number of samples). 
> 
> So my question is: how should I handle this scenario? What should the app do when the calculated number of samples exceeds the frame size specified by the codec context, so that the timing isn't thrown out of whack? 
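
For reference, the 1412 figure quoted above can be reproduced with plain integer arithmetic. The sketch below is not FFmpeg code: rescale_round_up() is a hypothetical stand-in for av_rescale_rnd(..., AV_ROUND_UP) that is only meant for small non-negative inputs, and it assumes swr_get_delay() returned 0 for this call (the post does not state the delay).

```c
#include <stdint.h>

/* Hypothetical stand-in for av_rescale_rnd(a, b, c, AV_ROUND_UP).
 * Computes ceil(a * b / c) for small non-negative inputs only;
 * the real function also guards against 64-bit overflow. */
static int64_t rescale_round_up(int64_t a, int64_t b, int64_t c)
{
    return (a * b + c - 1) / c;
}

/* Destination sample count for one source buffer.
 * `delay` is in source-rate samples, as with swr_get_delay(swr_ctx, src_rate);
 * a value of 0 is an assumption, since the post does not report it. */
static int64_t dst_samples_for(int64_t delay, int64_t src_nb_samples,
                               int64_t src_rate, int64_t dst_rate)
{
    return rescale_round_up(delay + src_nb_samples, dst_rate, src_rate);
}
```

With the values from the post, dst_samples_for(0, 512, 16000, 44100) gives 512 * 44100 / 16000 = 1411.2, rounded up to 1412, which is 260 samples more than the encoder's frame size of 1152.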

I take it from the sound of crickets (no response) to my question above that either I've done a bad job of communicating the issue, or it is indeed a real stumper. In case it is the former, I'm going to take another stab at this by distilling it all down to a very simple question: 

How does one take decompressed audio that arrives in buffers of 512 samples each at a sample rate of 16000, and encode it at a sample rate of 44100? 

Thanks for your help.

Brad
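
One common way to approach the question above, offered here as a sketch under stated assumptions and not as an answer confirmed on the list, is to decouple the resampler's output size from the encoder's frame size with a FIFO: write however many samples each resample call produces, and hand the encoder exactly frame_size samples whenever enough have accumulated. In FFmpeg this buffering could be done with libavutil's AVAudioFifo (av_audio_fifo_write, av_audio_fifo_size, av_audio_fifo_read), assuming it is available in the build in use. The code below does not call FFmpeg at all; FrameChunker and chunker_push are hypothetical names that model only the sample counts and the running pts.

```c
#include <stdint.h>

/* Hypothetical sample-count model of a FIFO sitting between the
 * resampler and the encoder. No audio data is stored here. */
typedef struct {
    int64_t buffered;    /* samples currently waiting in the FIFO */
    int64_t next_pts;    /* pts of the next frame, in dst-rate samples */
    int64_t frames_out;  /* frames handed to the encoder so far */
} FrameChunker;

/* Add nb_samples resampled samples, then drain as many full frames
 * of frame_size samples as possible. Returns the number of frames
 * emitted by this call (possibly 0, possibly more than 1). */
static int chunker_push(FrameChunker *c, int64_t nb_samples, int frame_size)
{
    int emitted = 0;

    c->buffered += nb_samples;
    while (c->buffered >= frame_size) {
        /* Real code would read frame_size samples from the FIFO here,
         * set frame->nb_samples = frame_size and frame->pts = c->next_pts,
         * and pass the frame to the encoder. */
        c->buffered -= frame_size;
        c->next_pts += frame_size;
        c->frames_out++;
        emitted++;
    }
    return emitted;
}
```

As an illustration, pushing 1412 samples five times with a frame size of 1152 emits six frames and leaves 148 samples buffered (5 * 1412 = 7060 = 6 * 1152 + 148). Because the pts advances by the number of samples actually encoded, the audio timeline stays consistent with the sample rate. Note that 1412 is only the upper bound computed for the output buffer; a real resample call reports how many samples it actually produced, and that count is what would be written to the FIFO.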

