[FFmpeg-devel] On libavfilter: A Summary of Issues

Clément Bœsch ubitux at gmail.com
Mon Feb 18 11:09:18 CET 2013

On Mon, Feb 18, 2013 at 04:36:53AM +0100, Jan Ehrhardt wrote:
> Clément Bœsch in gmane.comp.video.ffmpeg.devel (Mon, 18 Feb 2013
> 03:37:04 +0100):
> >On Mon, Feb 18, 2013 at 03:24:58AM +0100, Jan Ehrhardt wrote:
> >> 
> >> I still like the idea of MEncoder's volnorm filter very much. Even in
> >> one pass encoding MEncoder adjusts the volume to an acceptable level (in
> >> my case it mostly gains volume). Would such a thing be possible with
> >> ebur128 and metadata injection?
> >
> >AFAICT, yes. We already have metadata injection in lavfi (see
> >silencedetect or the scene detection in the select filter), so if the
> >ebur128 filter injected metadata into frames, communication would be
> >possible between ebur128 and the volume filter (for example).
> >
> >The main problem is that you need to adjust ebur128 so that it resizes
> >frames into windows of something like 100ms (IIRC), which might require
> >some thinking in the filter. The reason is that a loudness result is
> >associated with this time frame, so you need to transmit metadata to
> >volume at this rate.
> The EBU R128 specs talk about 400ms: http://tech.ebu.ch/loudness
> It looks like FFmpeg's ebur128 filter already uses this time frame as a
> default: https://ffmpeg.org/ffmpeg-filters.html#ebur128
> And the EBU specs also define a short-term loudness of 3 seconds, which
> is implemented in the ebur128 filter as well. AFAIK, 3 seconds comes
> close to MEncoder's volnorm filter.

   /* For integrated loudness, gating blocks are 400ms long with 75%
    * overlap (see BS.1770-2 p5), so a re-computation is needed each 100ms
    * (4800 samples at 48kHz). */

Both the 400ms (momentary) and the 3s (short-term) loudness are
recomputed every 100ms.
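
To make the arithmetic in that comment concrete, here is a minimal sketch.
It is not code from the filter: the helper names hop_samples() and
hops_per_window() are made up for illustration, and it only assumes the
figures quoted above (400ms blocks, 75% overlap, 3s short-term window).

```c
#include <assert.h>

/* Number of samples between two consecutive loudness updates, for gating
 * blocks of block_ms milliseconds that overlap by overlap_pct percent.
 * Hypothetical helper, for illustration only. */
static int hop_samples(int sample_rate, int block_ms, int overlap_pct)
{
    assert(sample_rate > 0 && block_ms > 0);
    assert(overlap_pct >= 0 && overlap_pct < 100);

    /* 400ms blocks with 75% overlap advance by 25% of 400ms = 100ms */
    int hop_ms = block_ms * (100 - overlap_pct) / 100;

    return (int)((long long)sample_rate * hop_ms / 1000);
}

/* Number of hop-sized chunks that make up a window of window_ms.
 * Hypothetical helper, for illustration only. */
static int hops_per_window(int window_ms, int hop_ms)
{
    assert(window_ms > 0 && hop_ms > 0);
    return window_ms / hop_ms;
}
```

With these, hop_samples(48000, 400, 75) gives the 4800 samples mentioned in
the comment, and a 100ms hop means the 400ms window spans 4 chunks and the
3s window spans 30.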

> I do not know the inner workings of the ebur128 filter, but it looks
> logical to me that it is constantly (i.e. at every audio frame) updating
> the values for the Momentary loudness and the Short-term loudness.

Yes, you can observe this in the visual representation offered by the
filter, where both the short-term and the momentary loudness are shown.

>                                                                    If
> so, then there is at every audio frame a value that can be used as input
> for the volume filter. Then it must also be possible to inject that as
> metadata into the audio frame and use it as input for the volume filter.
> Maybe kierank knows more about the way r128 has to be implemented. Or he
> might be able to retrieve that information. I have sent him a message
> with a reference to this discussion.

See my other mail.
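
As a rough illustration of what the receiving side could do with such a
per-frame value, here is a sketch. This is not lavfi code: the function name
gain_db_for(), the max_boost_db cap and the idea of a caller-supplied target
(for example the -23 LUFS that R128 uses as its programme target) are all
assumptions made for the example. How the value travels through frame
metadata is deliberately left out.

```c
#include <assert.h>

/* Gain in dB to bring a measured loudness (in LUFS) to a target loudness.
 * Positive gain is capped at max_boost_db so quiet passages are not
 * amplified without limit; attenuation is left uncapped.
 * Hypothetical helper, for illustration only. */
static double gain_db_for(double measured_lufs, double target_lufs,
                          double max_boost_db)
{
    assert(max_boost_db >= 0.0);

    double gain = target_lufs - measured_lufs;

    if (gain > max_boost_db)
        gain = max_boost_db;
    return gain;
}
```

For a frame measured at -30 LUFS with a -23 LUFS target this yields +7dB,
which is the kind of value a volume stage would then apply. A real
implementation would also have to smooth the gain over time to avoid
pumping.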

> >Another - maybe smarter - solution would be to automatically insert an
> >asetnsamples filter before an ebur128 filter, depending on various
> >properties of the filtergraph. Or maybe the ebur128 filter could alter
> >the filtergraph itself (if that's possible without too much hackery) to
> >insert that filter before itself when a "metadata" mode is requested.
> This goes over my head. I have just grasped the concept of metadata
> injection, but do not yet know how it really works. I took a glimpse at
> your patches for metadata injection and the silencedetect filter...
> On #ffmpeg-devel kierank pointed out that r128 might be going to be
> mandatory in some countries for realtime encoding. It would be a big
> bonus for FFmpeg if it supported this.

AFAIK, that is already the case. At least in my country, it is.

I've proposed various solutions. If no one wants to write that code, I can
do it. I'm relatively familiar with lavfi and metadata, and I'm the author
of that filter.

Clément B.