[FFmpeg-devel] [PATCH 2/3] lavfi/ebur128: add metadata injection. - volnorm.patch (0/1)

Clément Bœsch ubitux at gmail.com
Fri May 3 14:07:49 CEST 2013


On Wed, May 01, 2013 at 10:10:54PM +0200, Jan Ehrhardt wrote:
[...]
> The question arose: how to minimize the adjustments at the beginning of
> a video? I went back to f_ebur128.c and added another variable to the
> metadata: the pts. I could use the pts in af_volume.c to limit the
> change in loudness during the initial seconds. My arbitrary choice:
> allow -1/+1 after the first second, -2/+2 after the second second, and
> -20/+20 after 20 seconds or any longer duration. Of course, it is
> possible to lengthen the initial duration to, say, a minute and lower
> the maximum adjustment to -10/+10. But the idea is clear. Essential part
> of the patch:
> 
>     if (vol->metadata) {
>         double loudness, new_volume, pts, timestamp, mx;
>         AVDictionaryEntry *t, *e;
>         t = av_dict_get(buf->metadata, "lavfi.r128.pts", NULL, 0);
>         mx = 20; 
>         if (t) {
>             pts = av_strtod(t->value, NULL);
>             timestamp = pts / 48000; /* assume 48kHz */
>             mx = fmin(mx, timestamp);
>         }

The volume filter has access to the frame pts, so I don't think you need
to use the metadata for this.
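
As a rough sketch of that idea (not code from the patch): the time in
seconds can be derived from the frame pts and the link time base, which
also avoids the hard-coded 48 kHz assumption. The helper names below are
hypothetical. Inside a filter, the equivalent conversion would be something
like buf->pts * av_q2d(inlink->time_base), assuming the pts is set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of af_volume.c: convert a frame pts to
 * seconds, given the time base as the rational tb_num/tb_den. */
static double pts_to_seconds(int64_t pts, int tb_num, int tb_den)
{
    assert(tb_den != 0);
    return (double)pts * tb_num / tb_den;
}

/* Hypothetical helper: cap on the adjustment in dB. It grows by 1 dB per
 * elapsed second and saturates at abs_max (20 in the quoted patch). */
static double max_adjust_db(double seconds, double abs_max)
{
    if (seconds < 0)
        seconds = 0;
    return seconds < abs_max ? seconds : abs_max;
}
```

With a time base of 1/48000 this reproduces the quoted pts / 48000, but it
keeps working for other sample rates or time bases.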

>         e = av_dict_get(buf->metadata, vol->metadata, NULL, 0);
>         if (e) {
>             loudness = av_strtod(e->value, NULL);
>             new_volume = fmax(-mx,fmin(mx,(-23 - loudness)));
>             set_fixed_volume(vol, pow(10, new_volume / 20));
>         }
>     }
> 
> The mx variable defines the min/max adjustment. By setting an absolute
> maximum of 20 and by dividing the pts by 48k, I got the described setup
> of -1/+1 per second.
> 

Normalizing the beginning is indeed slightly tricky; the curve Nicolas is
proposing might be a bit more appropriate. Note that this could be made
user-configurable with another expression, but that is possibly overkill.

Have you tried to print/graph the new_volume values without the time
bounding code, using various samples? This could help define a generic
opposite time curve. Note that it of course depends on which loudness
score you select.

[...]

-- 
Clément B.