[FFmpeg-devel] [PATCH] avcodec/mlpdec: Add decoding of object audio data

Hendrik Leppkes h.leppkes at gmail.com
Sun Mar 23 23:47:16 EET 2025


On Sun, Mar 23, 2025 at 9:35 PM James Almer <jamrial at gmail.com> wrote:
>
> On 3/23/2025 4:33 PM, Massimo Eynard wrote:
> > On 23/03/2025 20:01, James Almer wrote:
> >> On 3/22/2025 2:49 PM, Massimo Eynard wrote:
> >>> This patch adds support for decoding the fourth MLP substream
> >>> which contains the 16-channel presentation used for Atmos
> >>> audio objects.
> >>>
> >>> By default only the first three substreams are decoded
> >>> unless the new extract_objects flag is enabled as the resulting
> >>> presentation contains audio object feeds instead of classic
> >>> loudspeaker feeds.
> >>>
> >>> As this introduces interpolation of primitive matrices, precision
> >>> has been increased to 2.18 fixed point. Therefore this requires
> >>> DSP code upgrade which has been done for C and x86 implementations
> >>> but not the ARM implementation.
> >>>
> >>> Adds two FATE tests using existing atmos.thd sample to reflect
> >>> changes.
> >>>
> >>> Signed-off-by: Massimo Eynard <eynard.massimo at gmail.com>
> >>> ---
> >>>    libavcodec/arm/mlpdsp_armv5te.S  |   2 +-
> >>>    libavcodec/arm/mlpdsp_init_arm.c |   3 +-
> >>>    libavcodec/mlp.h                 |  10 +-
> >>>    libavcodec/mlp_parse.c           |  31 ++-
> >>>    libavcodec/mlp_parse.h           |   1 +
> >>>    libavcodec/mlp_parser.c          |  11 +-
> >>>    libavcodec/mlpdec.c              | 389 +++++++++++++++++++++++++++----
> >>>    libavcodec/mlpdsp.c              |  50 +++-
> >>>    libavcodec/mlpdsp.h              |  25 ++
> >>>    libavcodec/x86/mlpdsp.asm        |  19 +-
> >>>    tests/fate/truehd.mak            |  10 +
> >>>    11 files changed, 476 insertions(+), 75 deletions(-)
> >>
> >> With atmos.thd i get:
> >>
> >>> [aist#0:0/truehd @ 00000209caf3ee00] Guessed Channel Layout: 7.1.4
> >>> Input #0, truehd, from '../samples/truehd/atmos.thd':
> >>>    Duration: N/A, start: 0.000000, bitrate: N/A
> >>>    Stream #0:0: Audio: truehd (Dolby TrueHD + Dolby Atmos), 48000 Hz, 7.1.4, s32 (24 bit)
> >>
> >> Which is unlikely to be correct. The file has 11 (or 12) objects, which is exported as 12 channels in an unspecified layout, and automatically assumed to be a 7.1.4 fixed layout.
> >>
> >
> > This is caused by `guess_input_channel_layout` (in `ffmpeg_demux.c`) which tries to assume a layout.
> > Would using `AV_CHANNEL_ORDER_CUSTOM` with all channels set to `AV_CHAN_UNKNOWN` (for unknown position, except LFE if present) be a better solution?
>
> Possibly, but it may make the stream undecodable unless you remap the
> channels (probably with a filter in the filterchain).
>
> Is there no better representation for the output? What are these 12
> channels the sample exports? 16 channels (as you say the MLP substream
> contains) would match Ambisonics 3rd order, but i assume that doesn't
> apply here, unless you should also be outputting something else.
>

It's object-based audio. Every extra "channel" represents an audio
object at an arbitrary position in space, as defined by separate
metadata, which you are then supposed to mix together for your final
speaker configuration.
Typically, the "bed" channels (e.g. the base 7.1) will contain audio
that doesn't require much localization information, such as music and
background noise, while the objects will contain audio for which full
spatial localization is more relevant. A mixer is then tasked with
mixing the objects for an ideal spatial representation, based on the
spatial metadata and knowledge of the physical speaker configuration.
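
To make the mixing step concrete, here is a rough sketch of what such a
mixer does for a single sample of a single object. This is not the
Dolby renderer and not anything in FFmpeg: the names (Vec3,
object_gains, mix_object, MAX_SPEAKERS) and the panning law (gain
proportional to the dot product between object and speaker direction,
normalised so the gains sum to one) are assumptions made up purely for
illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SPEAKERS 16 /* hypothetical upper bound for this sketch */

/* Direction or position as a 3D vector (hypothetical type). */
typedef struct Vec3 { float x, y, z; } Vec3;

/*
 * Compute one gain per speaker for an object at position obj.
 * Assumed panning law: gain is the dot product of the object and
 * speaker vectors, clamped at zero, then scaled so all gains sum to 1.
 * If no speaker faces the object, every gain stays at 0.
 */
static void object_gains(Vec3 obj, const Vec3 *spk, size_t nb_spk,
                         float *gains)
{
    float sum = 0.0f;

    for (size_t i = 0; i < nb_spk; i++) {
        float d = obj.x * spk[i].x + obj.y * spk[i].y + obj.z * spk[i].z;
        gains[i] = d > 0.0f ? d : 0.0f;
        sum += gains[i];
    }
    if (sum > 0.0f) {
        float norm = 1.0f / sum;
        for (size_t i = 0; i < nb_spk; i++)
            gains[i] *= norm;
    }
}

/*
 * Add one object sample into the per-speaker output buffer out[],
 * weighted by the gains derived from the object's position.
 */
static void mix_object(float sample, Vec3 pos, const Vec3 *spk,
                       size_t nb_spk, float *out)
{
    float gains[MAX_SPEAKERS];

    assert(nb_spk <= MAX_SPEAKERS);
    object_gains(pos, spk, nb_spk, gains);
    for (size_t i = 0; i < nb_spk; i++)
        out[i] += sample * gains[i];
}
```

A real renderer would take the positions from the per-object metadata
(which changes over time), interpolate the gains between updates, and
add the result on top of the bed channels.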

We don't have a channel layout that would identify this sort of setup
as of yet, never mind a mixer that could actually deal with it, or even
a way to export the metadata from the TrueHD stream, but baby steps I
suppose.

FWIW, taking all this into account, I fully agree that it should by
default output the 7.1 representation that everyone can actually
process, because the bed+objects representation is rather unexpected
and cannot really be handled at this time.

- Hendrik

