[FFmpeg-devel] On libavfilter: A Summary of Issues

Mon Feb 18 11:09:18 CET 2013

On Mon, Feb 18, 2013 at 04:36:53AM +0100, Jan Ehrhardt wrote:
> Clément Bœsch in gmane.comp.video.ffmpeg.devel (Mon, 18 Feb 2013
> 03:37:04 +0100):
> >On Mon, Feb 18, 2013 at 03:24:58AM +0100, Jan Ehrhardt wrote:
> >> 
> >> I still like the idea of MEncoder's volnorm filter very much. Even in
> >> one-pass encoding MEncoder adjusts the volume to an acceptable level (in
> >> my case it mostly raises the volume). Would such a thing be possible with
> >> ebur128 and metadata injection?
> >
> >AFAICT, yes. We already have metadata injection in lavfi (see
> >silencedetect or the scene detection in the select filter), so if the
> >ebur128 filter injected metadata into frames, communication would be
> >possible between ebur128 and the volume filter (for example).
> >
> >The main problem is that you need to adjust ebur128 so that it resizes
> >frames to windows of something like 100ms (IIRC), which might require some
> >thinking in the filter. The reason is that a loudness result is associated
> >with this time frame, so you need to transmit metadata to volume at this
> >rate.
> 
> The EBU R128 specs talk about 400ms: http://tech.ebu.ch/loudness
> It looks like FFmpeg's ebur128 filter already uses this time frame as a
> default: https://ffmpeg.org/ffmpeg-filters.html#ebur128
> 
> And the EBU specs also define a short-term loudness of 3 seconds, which
> is implemented in the ebur128 filter as well. AFAIK, 3 seconds comes
> close to MEncoder's volnorm filter.
> 

   /* For integrated loudness, gating blocks are 400ms long with 75%
    * overlap (see BS.1770-2 p5), so a re-computation is needed each 100ms
    * (4800 samples at 48kHz). */

Both the 400ms (momentary) and the 3s (short-term) loudness are recomputed
every 100ms.
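
The arithmetic behind that comment can be sketched as follows. This is only
an illustration: hop_samples() is a hypothetical helper name, not a function
from the ebur128 filter. It derives the re-computation interval from the
block length and overlap quoted above.

```c
#include <assert.h>

/* Hypothetical helper (not part of FFmpeg): number of samples between two
 * loudness re-computations, for a gating block of block_ms milliseconds
 * whose successive blocks overlap by overlap_pct percent. */
static int hop_samples(int sample_rate, int block_ms, int overlap_pct)
{
    int hop_ms;

    assert(overlap_pct >= 0 && overlap_pct < 100);

    /* 400ms blocks with 75% overlap advance by 25% of 400ms = 100ms. */
    hop_ms = block_ms * (100 - overlap_pct) / 100;

    /* 100ms at 48kHz is 4800 samples. */
    return sample_rate * hop_ms / 1000;
}
```

Calling hop_samples(48000, 400, 75) gives the 4800 samples mentioned in the
comment; at 44.1kHz the same block layout would advance by 4410 samples.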

> I do not know the inner workings of the ebur128 filter, but it looks
> logical to me that it is constantly (i.e. at every audio frame) updating
> the values for the Momentary loudness and the Short-term loudness.

Yes, you can observe this in the visual representation offered by the
filter, where both the short-term and the momentary loudness are shown.

>                                                                    If
> so, then there is at every audio frame a value that can be used as input
> for the volume filter. Then it must also be possible to inject that as
> metadata into the audio frame and use it as input for the volume filter.
> 
> Maybe kierank knows more about the way r128 has to be implemented. Or he
> might be able to retrieve that information. I have sent him a message
> with a reference to this discussion.
> 

See my other mail.

> >Another - maybe smarter - solution would be to auto-insert an asetnsamples
> >filter before an ebur128 filter, given various properties of the
> >filtergraph. Or maybe the ebur128 filter could alter the filtergraph itself
> >(if that's possible without much of a hack) to insert the filter before
> >itself if a "metadata" mode is requested.
> 
> This goes over my head. I just grasped the concept of metadata
> injection, but do not yet know how it really works. I took a glimpse at
> your patches for metadata injection and the silencedetect filter...
> 
> On #ffmpeg-devel kierank pointed out that r128 might be going to be
> mandatory in some countries for realtime encoding. It would be a big
> bonus for FFmpeg if it supported this.
> 

AFAIK, that is already the case. At least in my country, it is.

I've proposed various solutions. If no one wants to write that code, I can
do it. I'm relatively familiar with lavfi and metadata, and I am the author
of that filter.

-- 
Clément B.