[FFmpeg-devel] A few filter questions

Gerion Entrup gerion.entrup at t-online.de
Fri Jul 18 12:38:43 CEST 2014


On Thursday, 17 July 2014, 17:24:35, Clément Bœsch wrote:
> On Thu, Jul 17, 2014 at 04:56:08PM +0200, Gerion Entrup wrote:
> [...]
> 
> > > Also, you still have the string metadata possibility (git grep SET_META
> > > libavfilter).
> > 
> > Hmm, thank you, I will take a look at it. If I see it right, it is used to
> > fill a dictionary per frame with some kind of data?
> 
> Strings only, so you'll have to find a serialization somehow. Maybe simply
> an ASCII hex string or something. But yeah, it just allows you to map some
> key → value string pairs to the frames passing by in the filter.
> 
> How huge is the information to store per frame?
82 bytes per frame for the finesignature
(this could be split again into three parts: a one-byte confidence, a 5-byte
words vector, and a 76-byte framesignature, something like:
struct finesignature {
    uint8_t confidence;
    uint8_t words[5];
    uint8_t framesignature[76];
};)
152 bytes per 90 frames for the coarsesignature
(note that there are 2 coarsesignatures, offset by 45 frames:
0-89
45-134
90-179
...)

If I see it right, there are two possibilities:
Write the raw bytes as chars into the output (looks crappy, but needs the same
amount of memory).
Write them as ASCII hex into the output (looks nice, but needs twice as much
memory).
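
For the hex variant, a minimal sketch of what I have in mind (the metadata key
"lavfi.finesignature" is made up by me, and I assume frame->metadata is the
right dictionary to write into):

#include <stdint.h>
#include <stdio.h>
#include "libavutil/dict.h"
#include "libavutil/frame.h"

/* Hex-encode the 82-byte finesignature and attach it to the frame as
 * string metadata: 82 bytes become 164 hex chars plus a NUL. */
static int attach_finesignature(AVFrame *frame, const uint8_t sig[82])
{
    char buf[2 * 82 + 1];
    int i;

    for (i = 0; i < 82; i++)
        snprintf(buf + 2 * i, 3, "%02x", sig[i]);
    return av_dict_set(&frame->metadata, "lavfi.finesignature", buf, 0);
}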

> 
> [...]
> 
> > > stdout/stderr really isn't a good thing. Using metadata is way better
> > > because you can output them from ffprobe, and parse them according to
> > > various outputs (XML, CSV, JSON, ...).
> > 
> > Sounds good…
> 
> tools/normalize.py makes use of such a feature if you want examples (that's
> the -of option of ffprobe)
Ok.
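(For my own notes, if I read normalize.py correctly, that would then be
something like
    ffprobe -of json -show_entries frame_tags=lavfi.r128.I -f lavfi \
        amovie=in.wav,ebur128=metadata=1
with a made-up input name.)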
> 
> [...]
> 
> > > Am I understanding your question correctly?
> > 
> > No ;), but anyway thanks for your answer. In your 2nd method, your filter
> > is a VV->V filter? Am I right that this filter can then also take only
> > one stream? Said another way: can a VV->V filter also behave as a V->V
> > filter?
> Yes, fieldmatch is a (complex) example of this. But typically it's simply
> a filter with dynamic inputs, based on the user input. The simplest
> example would be the split filter. Look at it for an example of dynamic
> allocation of the number of pads based on the user input (-vf split=4 is
> a V->VVVV filter)
Hmm, interesting code, thank you.
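If I read it correctly, the core of it is the init callback, which creates one
output pad per requested output, roughly like this (quoted from memory, and it
uses internal API such as ff_insert_outpad, so the details may differ):

static av_cold int split_init(AVFilterContext *ctx)
{
    SplitContext *s = ctx->priv;
    int i;

    /* create s->nb_outputs output pads, as given by the user option */
    for (i = 0; i < s->nb_outputs; i++) {
        char name[32];
        AVFilterPad pad = { 0 };

        snprintf(name, sizeof(name), "output%d", i);
        pad.type = ctx->filter->inputs[0].type;
        pad.name = av_strdup(name);
        if (!pad.name)
            return AVERROR(ENOMEM);

        ff_insert_outpad(ctx, i, &pad);
    }

    return 0;
}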
> 
> [...]
> 
> > > Check tools/normalize.py, it's using ebur128 and the metadata system.
> > 
> > That's what I mean. Someone has to write an external script which calls
> > ffmpeg/ffprobe two times, parses stdout of the first call, and passes it
> > to the filter options of the second call. As far as I can see, there is
> > no direct way. Something like:
> > ffmpeg -i foo -filter:a volume=mode=autodetect normalized.opus
> 
> We had a discussion several times about real time with that filter. If we
> do a 2-pass, that's simply because it's "more efficient". Typically, doing
> some live normalization can be done easily (we had patches for this):
> ebur128 already attaches some metadata to frames, so a following filter
> such as volume could reuse them, something like -filter_complex
> ebur128=metadata=1,volume=metadata.
> 
> [...]
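
Just to check my understanding: a consuming filter would then do something
like this per frame (only a sketch; lavfi.r128.M is the momentary loudness key
ebur128 sets, everything else is made up by me):

#include <math.h>
#include <stdlib.h>
#include "libavutil/dict.h"
#include "libavutil/frame.h"

/* Derive a linear gain from the loudness ebur128 attached to the
 * frame; fall back to unity gain while no measurement is there yet. */
static double gain_from_metadata(const AVFrame *frame, double target_lufs)
{
    AVDictionaryEntry *e =
        av_dict_get(frame->metadata, "lavfi.r128.M", NULL, 0);

    if (!e)
        return 1.0;
    return pow(10.0, (target_lufs - atof(e->value)) / 20.0);  /* dB -> linear */
}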


