[FFmpeg-user] Preserving perceived loudness when downmixing audio from 5.1 AC3 to stereo AAC

Wed Aug 7 18:15:22 CEST 2013

> -----Original Message-----
> From: ffmpeg-user-bounces at ffmpeg.org
> [mailto:ffmpeg-user-bounces at ffmpeg.org] On Behalf Of Nicolas George
> Sent: 07 August 2013 16:50
> To: FFmpeg user questions
> Cc: 'Andy Furniss'
> Subject: Re: [FFmpeg-user] Preserving perceived loudness when
> downmixing audio from 5.1 AC3 to stereo AAC
> 
> On décadi 20 Thermidor, year CCXXI, Francois Visagie wrote:
> > I'm not sure even -request_channels produces the expected result. It
> > merely seems to influence the number of input channels guessed:
> 
> If it did that, the sound would be completely garbled.
> 
> > Input #0, ac3, from 'in.ac3':
> >   Duration: 00:00:09.02, start: 0.000000, bitrate: 448 kb/s
> >     Stream #0:0: Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
> 
> Normal.
> 
> > Guessed Channel Layout for  Input Stream #0.0 : stereo
> 
> You can safely ignore this particular message, it just means that something
> somewhere set channels to 2 but neglected to set channel_layout to stereo.
> 
> > Input #0, ac3, from 'in.ac3':
> >   Duration: 00:00:09.02, start: 0.000000, bitrate: 448 kb/s
> >     Stream #0:0: Audio: ac3, 48000 Hz, stereo, fltp, 448 kb/s
> 
> Normal.
> 
> > Would it therefore be correct to assume that -request_channels leads
> > to only that number of channels being extracted, hence no down-mix?
> 
> No. -request_channels uses codec-specific ways of extracting sound with the
> specified number of channels. It only works for very few codecs that have
> that feature.

Thanks for your feedback.

Is it therefore correct to say that:
	* the only way of downmixing to stereo that does not depend on the
	  input codec is ‘-ac 2’ / ‘-filter:a aformat=channel_layouts=stereo’ /
	  ‘-filter:a aresample=ocl=3’ (which now all behave the same?), and
	* if one wants to preserve perceived input volume, one needs to
	  adjust gain during encoding?
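
To illustrate the second point, here is a minimal sketch of a command line that downmixes and then applies a fixed gain. It is only an illustration: the helper name `downmix_command`, the output file name `out.m4a` and the 3 dB gain are hypothetical, and it assumes a build where `-c:a aac` selects an AAC encoder and the `volume` filter accepts a plain linear factor.

```python
import shlex


def downmix_command(src, dst, gain_db):
    """Build an ffmpeg argument list: downmix to stereo, then apply a gain.

    Hypothetical helper for illustration only; gain_db is a guess that
    would have to be tuned by ear or by measurement.
    """
    # Convert decibels to the linear amplitude factor the volume filter takes.
    factor = 10 ** (gain_db / 20)
    # aformat forces a stereo layout; volume then scales the downmixed signal.
    filters = f"aformat=channel_layouts=stereo,volume={factor:.4f}"
    return ["ffmpeg", "-i", src, "-filter:a", filters, "-c:a", "aac", dst]


# Print the command rather than running it, so it can be inspected first.
print(shlex.join(downmix_command("in.ac3", "out.m4a", 3.0)))
```

The command is printed rather than executed so that the gain can be changed before anything is encoded.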

Further to that, for a given energy level per input channel, does the
current down-mixing mechanism produce differing output energy levels
depending on the _number_ of input channels? I.e. is it expected that
different input layouts (with the same energy level per channel) would
require different gain factors for equally loud outputs, or will one be able
to find a suitable gain factor and use that regardless of number of input
channels?
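
To make the question concrete, here is a rough calculation under stated assumptions, not a description of what the resampler actually does. It assumes ITU-style downmix gains for the left output (front 1.0, centre and surround 1/sqrt(2), LFE dropped), input channels that are uncorrelated and of equal power, and an optional normalisation step that scales the gains so they sum to 1. The table `LEFT_GAINS` and the function name are hypothetical.

```python
import math

M3DB = 1 / math.sqrt(2)  # -3 dB as a linear amplitude factor

# Assumed gains feeding the LEFT stereo output for a few input layouts.
# Illustrative values only; they are not read from FFmpeg.
LEFT_GAINS = {
    "stereo": {"FL": 1.0},
    "3.0": {"FL": 1.0, "FC": M3DB},
    "5.1(side)": {"FL": 1.0, "FC": M3DB, "SL": M3DB, "LFE": 0.0},
}


def left_output_level_db(gains, normalize):
    """Left output power relative to one input channel, in dB.

    Assumes every input channel carries the same power and that the
    channels are uncorrelated, so powers add as the sum of squared gains.
    """
    coeffs = list(gains.values())
    if normalize:
        # Scale so the gains sum to 1, which guarantees no clipping.
        total = sum(coeffs)
        coeffs = [c / total for c in coeffs]
    power = sum(c * c for c in coeffs)
    return 10 * math.log10(power)


for layout, gains in LEFT_GAINS.items():
    raw = left_output_level_db(gains, normalize=False)
    norm = left_output_level_db(gains, normalize=True)
    print(f"{layout:10s} unnormalised {raw:+.2f} dB, normalised {norm:+.2f} dB")
```

Under these assumptions the output level differs from layout to layout, with or without normalisation, which would mean a single gain factor cannot suit every input layout. Real material is rarely uncorrelated or equally loud in every channel, so the figures are only indicative.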

Thanks,
Francois

> 
> Regards,
> 
> --
>   Nicolas George