[FFmpeg-user] linear loudnorm
Jonathan Baecker
jonbae77 at gmail.com
Mon Feb 22 19:21:40 EET 2021
Hello!
I'm trying to normalize an audio file with FFmpeg, using the loudnorm
filter. The source loudness is about -23 LUFS and I want to bring it to -17 LUFS.
As far as I know, loudnorm has two modes of normalizing audio: linear
(one fixed gain for the whole file) and dynamic (gain adjusted over short segments).
The problem is that when I have an audio file where someone is speaking,
the pauses in the speech get louder and louder, and a hissing noise
becomes clearly audible. That's why I need linear normalization. But for some
reason that I can't explain, FFmpeg always switches to dynamic mode.
I've considered all the requirements for linear scaling in the loudnorm
documentation, but I can't figure out what's wrong. I've specified all 4
measured values, the target LRA isn't lower than the input LRA, and when I
normalize the file in Adobe Audition to -17 LUFS, I can't see any peaking.
What would be the best way to get linear normalization with FFmpeg?
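For reference, the documented conditions for linear mode can be written down as a small check. This is only a sketch in Python under the assumption that the filter applies the documented rule literally (target LRA not lower than source LRA, and the change in integrated loudness must not push the true peak above the target TP); the function name is made up for illustration, and the numbers are the input_i, input_tp and input_lra values from the analysis output further down in this message.

```python
def linear_mode_possible(target_i, target_tp, target_lra,
                         measured_i, measured_tp, measured_lra):
    """Check the documented linear-mode conditions (hypothetical helper).

    Returns (ok, gain, projected_tp).
    """
    gain = target_i - measured_i        # fixed gain in dB needed to reach the target
    projected_tp = measured_tp + gain   # true peak after applying that gain
    ok = projected_tp <= target_tp and measured_lra <= target_lra
    return ok, gain, projected_tp


# Targets from the command line, measurements from the analysis pass.
ok, gain, projected_tp = linear_mode_possible(
    target_i=-17.0, target_tp=-1.0, target_lra=9.0,
    measured_i=-22.72, measured_tp=-2.67, measured_lra=6.10,
)
print(f"gain: {gain:+.2f} dB")
print(f"projected true peak: {projected_tp:+.2f} dBTP")
print(f"linear mode possible: {ok}")
```

With these values the required gain is about 5.72 dB, which would put the true peak at roughly +3.05 dBTP, above the TP=-1 target. If the filter really checks it this way, that alone would trigger the fallback to dynamic mode, even though the LRA condition is met.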
Here is what I'm doing:
1. Analyze the source audio file:
ffmpeg -i input.wav -filter:a loudnorm=I=-17:TP=-1:LRA=9:print_format=json -f null -
ffmpeg version N-100942-gc596e82155-gb7251aed46+3 Copyright (c)
2000-2021 the FFmpeg developers
built with gcc 10.2.0 (Rev6, Built by MSYS2 project)
configuration: --pkg-config=pkgconf --cc='ccache gcc' --cxx='ccache
g++' --ld='ccache g++' --disable-autodetect --enable-amf --enable-bzlib
--enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2
--enable-iconv --enable-lzma --enable-nvenc --enable-zlib --enable-sdl2
--enable-ffnvcodec --enable-nvdec --enable-cuda-llvm --enable-libmp3lame
--enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264
--enable-libx265 --enable-libdav1d --enable-libaom --disable-debug
--enable-fontconfig --enable-libass --enable-libbluray
--enable-libfreetype --enable-libmfx --enable-libmysofa
--enable-libopencore-amrnb --enable-libopencore-amrwb
--enable-libopenjpeg --enable-libsnappy --enable-libsoxr
--enable-libspeex --enable-libtheora --enable-libtwolame
--enable-libvidstab --enable-libvo-amrwbenc --enable-libwebp
--enable-libxml2 --enable-libzimg --enable-libshine --enable-gpl
--enable-avisynth --enable-libxvid --enable-libopenmpt --enable-version3
--enable-chromaprint --enable-decklink --enable-frei0r --enable-libbs2b
--enable-libcaca --enable-libcdio --enable-libfdk-aac --enable-libflite
--enable-libfribidi --enable-libgme --enable-libgsm --enable-libilbc
--enable-libsvthevc --enable-libsvtav1 --enable-libkvazaar
--enable-libmodplug --enable-librtmp --enable-librubberband
--enable-libtesseract --enable-libxavs --enable-libzmq --enable-libzvbi
--enable-openal --enable-libvmaf --enable-libcodec2 --enable-libsrt
--enable-ladspa --enable-librav1e --enable-libglslang --enable-vulkan
--enable-openssl --extra-cflags=-fopenmp --extra-libs=-lgomp
--extra-cflags=-DLIBTWOLAME_STATIC --extra-libs=-lstdc++
--extra-cflags=-DCACA_STATIC --extra-cflags=-DMODPLUG_STATIC
--extra-cflags=-DCHROMAPRINT_NODLL --extra-libs=-lstdc++
--extra-cflags=-DZMQ_STATIC --extra-libs=-lpsapi
--extra-cflags=-DLIBXML_STATIC --extra-libs=-liconv --disable-w32threads
--extra-cflags=-DKVZ_STATIC_LIB --enable-nonfree
--extra-cflags=-DAL_LIBTYPE_STATIC
--extra-cflags='-ID:/ab-suite/local64/include/AL'
libavutil 56. 64.100 / 56. 64.100
libavcodec 58.120.100 / 58.120.100
libavformat 58. 65.101 / 58. 65.101
libavdevice 58. 11.103 / 58. 11.103
libavfilter 7.101.100 / 7.101.100
libswscale 5. 8.100 / 5. 8.100
libswresample 3. 8.100 / 3. 8.100
libpostproc 55. 8.100 / 55. 8.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'input.wav':
Metadata:
encoded_by : Adobe Adobe Media Encoder 2020.0
encoder : Adobe Adobe Media Encoder 2020.0 (Windows)
date : 2021-02-15
creation_time : 15:31:34
time_reference : 0
Duration: 00:37:57.52, bitrate: 1539 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz,
stereo, s16, 1536 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
Metadata:
encoded_by : Adobe Adobe Media Encoder 2020.0
time_reference : 0
date : 2021-02-15
encoder : Lavf58.65.101
Stream #0:0: Audio: pcm_s16le, 192000 Hz, stereo, s16, 6144 kb/s
Metadata:
encoder : Lavc58.120.100 pcm_s16le
size=N/A time=00:37:54.62 bitrate=N/A speed=33.9x
video:0kB audio:1708140kB subtitle:0kB other streams:0kB global
headers:0kB muxing overhead: unknown
[Parsed_loudnorm_0 @ 00000121fcf41e00]
{
"input_i" : "-22.72",
"input_tp" : "-2.67",
"input_lra" : "6.10",
"input_thresh" : "-33.31",
"output_i" : "-16.95",
"output_tp" : "-1.00",
"output_lra" : "6.00",
"output_thresh" : "-27.53",
"normalization_type" : "dynamic",
"target_offset" : "-0.05"
}
2. Encode the audio with:
ffmpeg -i input.wav -filter:a loudnorm=I=-17:TP=-1:LRA=9:measured_I=-22.72:measured_TP=-2.67:measured_LRA=6.10:measured_thresh=-33.31:offset=-0.05:linear=true:print_format=summary output.wav
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, wav, from 'input.wav':
Metadata:
encoded_by : Adobe Adobe Media Encoder 2020.0
encoder : Adobe Adobe Media Encoder 2020.0 (Windows)
date : 2021-02-15
creation_time : 15:31:34
time_reference : 0
Duration: 00:37:57.52, bitrate: 1539 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz,
stereo, s16, 1536 kb/s
File 'output.wav' already exists. Overwrite? [y/N] y
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'output.wav':
Metadata:
ITCH : Adobe Adobe Media Encoder 2020.0
time_reference : 0
ICRD : 2021-02-15
ISFT : Lavf58.65.101
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 192000 Hz,
stereo, s16, 6144 kb/s
Metadata:
encoder : Lavc58.120.100 pcm_s16le
size= 1708140kB time=00:37:54.62 bitrate=6151.8kbits/s speed=33.6x
video:0kB audio:1708140kB subtitle:0kB other streams:0kB global
headers:0kB muxing overhead: 0.000009%
[Parsed_loudnorm_0 @ 000001a3673c0780]
Input Integrated: -22.7 LUFS
Input True Peak: -2.7 dBTP
Input LRA: 6.1 LU
Input Threshold: -33.3 LUFS
Output Integrated: -17.0 LUFS
Output True Peak: -1.0 dBTP
Output LRA: 6.0 LU
Output Threshold: -27.6 LUFS
Normalization Type: Dynamic
Target Offset: -0.0 LU
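The two steps above could also be scripted so the measured values do not have to be copied by hand. The following is a rough sketch in Python, not a tested workflow: the helper names are made up for illustration, and it assumes loudnorm prints its JSON block after a "[Parsed_loudnorm" line on stderr, as in the log from step 1.

```python
import json
import subprocess


def run_analysis(path, filt="loudnorm=I=-17:TP=-1:LRA=9:print_format=json"):
    """Run the analysis pass and return ffmpeg's stderr text (hypothetical helper)."""
    cmd = ["ffmpeg", "-hide_banner", "-i", path, "-filter:a", filt, "-f", "null", "-"]
    return subprocess.run(cmd, stderr=subprocess.PIPE, text=True).stderr


def parse_loudnorm_json(stderr_text):
    """Extract the JSON object printed after the last "[Parsed_loudnorm" marker."""
    marker = stderr_text.rfind("[Parsed_loudnorm")
    if marker == -1:
        raise ValueError("no loudnorm output found")
    start = stderr_text.index("{", marker)
    end = stderr_text.index("}", start) + 1
    return json.loads(stderr_text[start:end])


def second_pass_filter(stats, target_i=-17, target_tp=-1, target_lra=9):
    """Build the second-pass filter string from the measured values."""
    return (
        f"loudnorm=I={target_i}:TP={target_tp}:LRA={target_lra}"
        f":measured_I={stats['input_i']}:measured_TP={stats['input_tp']}"
        f":measured_LRA={stats['input_lra']}"
        f":measured_thresh={stats['input_thresh']}"
        f":offset={stats['target_offset']}:linear=true:print_format=summary"
    )


# Tail of the stderr output from step 1, pasted here so the sketch runs without ffmpeg.
sample_log = """
[Parsed_loudnorm_0 @ 00000121fcf41e00]
{
"input_i" : "-22.72",
"input_tp" : "-2.67",
"input_lra" : "6.10",
"input_thresh" : "-33.31",
"output_i" : "-16.95",
"output_tp" : "-1.00",
"output_lra" : "6.00",
"output_thresh" : "-27.53",
"normalization_type" : "dynamic",
"target_offset" : "-0.05"
}
"""

stats = parse_loudnorm_json(sample_log)
filter_arg = second_pass_filter(stats)
print(filter_arg)
```

In real use, `parse_loudnorm_json(run_analysis("input.wav"))` would replace the pasted log, and `filter_arg` would be passed to `-filter:a` in the second ffmpeg call. This only automates the copying; it does not change which normalization mode the filter ends up choosing.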