[Libav-user] Handling of 24 bit audio in libav* and libswresample

Hendrik Schreiber hs at tagtraum.com
Fri Jun 7 12:40:17 CEST 2013

On Jun 4, 2013, at 1:34 PM, Paul B Mahol wrote:

> On 6/4/13, Hendrik Schreiber <hs at tagtraum.com> wrote:
>> 1. (SHIFTING) When decoding, 24bit audio is apparently shifted, i.e. 24bit
>> become 32bit, as there is no 24bit AVSampleFormat. Am I right to assume that
>> the data is shifted toward the most significant byte? I.e. the most
>> significant 3 bytes are the same as the original 24bit?
>> Or is the most significant byte simply "sign-extended" and the three least
>> significant bytes are the original 24bit?

The first statement is true.

I did some tests, and all libav does is shift the data toward the most significant byte, i.e. the least significant byte is 0. This means that one has to apply dithering *if* one wants to use this 4-byte representation for anything other than extracting the most significant three bytes.
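
To illustrate the layout described above, here is a minimal sketch in plain C. It does not call into libav and is not the library's actual decoder code; the function name s24le_to_s32 is made up for this example. It assumes a packed little-endian 24bit input sample and a two's-complement platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a libav function.
 * Widens one packed little-endian signed 24bit sample to 32 bits by
 * placing its three bytes in the upper three bytes of the result.
 * The least significant byte of the result is always 0. */
static int32_t s24le_to_s32(const uint8_t *in)
{
    assert(in != NULL);
    uint32_t v = ((uint32_t)in[0] << 8)  |
                 ((uint32_t)in[1] << 16) |
                 ((uint32_t)in[2] << 24);
    /* Assumes two's complement, so the top bit becomes the sign bit. */
    return (int32_t)v;
}
```

Because the original sign bit lands in bit 31, the sign is preserved without any explicit sign extension.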

If one just wants to dump 3 bytes for each 24bit sample, one simply has to cut off the extra byte added before (by encoding with AV_CODEC_ID_PCM_S24LE, see below). No dithering is necessary.
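
Cutting off the extra byte can be sketched by hand as follows. This is only an illustration of the byte layout, not the encoder's code; the name s32_to_s24le is hypothetical, and the output is assumed to be packed little endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a libav function.
 * Writes the three most significant bytes of a 32bit sample as one
 * packed little-endian 24bit sample. The least significant byte of
 * the input is dropped (plain truncation, no dithering). */
static void s32_to_s24le(int32_t sample, uint8_t *out)
{
    assert(out != NULL);
    uint32_t v = (uint32_t)sample;
    out[0] = (uint8_t)(v >> 8);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 24);
}
```

If the low byte is 0, as it is straight after decoding, this round trip returns exactly the original 24bit data.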

>> 2. (SWRESAMPLE) I'm using libswresample to, well, resample data, get rid of
>> planar formats etc. It's working great. libswresample also accepts
>> AVSampleFormat parameters for input and output format. This implies that it
>> does not support any conversion to true 24bit, represented by 3 bytes.
>> Correct?

Yes. I fiddled with it some more. swresample does not support any true 24bit (i.e. 3 bytes per sample) output. It works strictly on the intermediate data formats defined in AVSampleFormat.

>> 3. (CODEC) What is the recommend way to produce 24bit audio? After decoding
>> (and potentially resampling), should I use the corresponding codec (e.g.
>> AV_CODEC_ID_PCM_S24LE) to produce the data in the format I'm interested in?
>> Or is there another, better way?
> There should be dithering applied, see output_sample_bits option.

I guess there is no other way (except perhaps for filtering). It turns out that to produce 24bit audio in a true 24bit format, one has to use an appropriate encoder, e.g. AV_CODEC_ID_PCM_S24LE for signed 24bit little endian.

Dithering is only necessary when the data is converted somewhere in between (e.g. changing the sample rate while it's in 32bit format), as the code in pcm.c (macro ENCODE) simply shifts the 32bit representation right by 8 bits, essentially just dropping the last 8 bits.
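
One way to check whether plain truncation would lose anything is to look at the low byte of every 32bit sample before encoding. The sketch below is plain C and independent of libav; the function name truncation_is_lossless is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a libav function.
 * Returns 1 if every sample has a zero least significant byte, so
 * that dropping the low 8 bits loses no information. Returns 0 as
 * soon as one sample carries data in its low byte, which is the
 * case where dithering becomes relevant. */
static int truncation_is_lossless(const int32_t *samples, size_t n)
{
    assert(samples != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (((uint32_t)samples[i] & 0xFFu) != 0)
            return 0;
    }
    return 1;
}
```

Freshly decoded 24bit data should pass this check, whereas data that went through resampling in 32bit format generally will not.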

Since I got answers to all my questions, I figured I might as well post them. I hope this is useful to someone else.



