[FFmpeg-devel] [RFC] AVFilter Parser

Vitor Sessak vitor1001
Tue Mar 25 21:41:14 CET 2008


Hi

vmrsss wrote:
> Hi everybody,
> 
> 	Vitor, there are a few things I don't understand in your proposal  
> below. A few questions and some comments below:
> 
> On 22 Mar 2008, at 00:22, Vitor Sessak wrote:
>> Just to make things clear, I'll also describe more verbosely the
>> grammar I initially thought of, since I think people understood an
>> expanded version of it (but now I'm not sure whether it is not better
>> expanded):
>>
>> filter_id:
>>     any alphanumeric string identifying a filter
>>
>> parameters:
>>     any string
>>
>> filter_with_parameters:
>>     filter_id
>>     filter_id=parameters
>>     Such a filter has a unique number of inputs and outputs.
>>
>> label_id:
>>     any alphanumeric string not beginning with "in" or "out"
>>
>> label:
>>     (label_id)
>>     input_label
>>     output_label
> 
>> input_label:
>>     (in#)
>>     Where # can be any integer and (in) is an alias for (in1)
> 
>> output_label:
>>     (out#)
>>     Where # can be any integer and (out) is an alias for (out1)
> 
> Probably counting should start at zero for consistency with the way  
> ffmpeg numbers streams (and input files). Why is it important to have  
> the prefixes "in" and "out"? Could it not be any string? 

I think it is very clear to read and write. If the filter graph has 3 
inputs and 2 outputs, the strings "(in1)", "(in2)", "(in3)", "(out1)" 
and "(out2)" should each appear, and only once. Also, the fact that they 
are numbered makes it easy to tell apart the different inputs and outputs 
of the filter (it may treat input1 and input2 differently, for example).
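
To make the "each one appears exactly once" rule concrete, here is a 
minimal Python sketch. It is only an illustration, not part of the 
proposal: the function names are made up, and it assumes that labels are 
written as "(name)" and that filter parameters never contain parentheses.

```python
import re
from collections import Counter

# Assumption: labels are written as "(name)" and filter parameters never
# contain parentheses, so a plain regex scan is enough to find them.
LABEL_RE = re.compile(r"\(\s*([A-Za-z0-9]+)\s*\)")


def normalize(label):
    # (in) is an alias for (in1) and (out) is an alias for (out1).
    return label + "1" if label in ("in", "out") else label


def check_boundary_labels(graph, n_inputs, n_outputs):
    """Return a list of problems; an empty list means the labels are consistent."""
    counts = Counter(normalize(name) for name in LABEL_RE.findall(graph))
    problems = []
    for prefix, n in (("in", n_inputs), ("out", n_outputs)):
        expected = {f"{prefix}{i}" for i in range(1, n + 1)}
        found = {name for name in counts if re.fullmatch(prefix + r"\d+", name)}
        problems += [f"missing ({name})" for name in sorted(expected - found)]
        problems += [f"unexpected ({name})" for name in sorted(found - expected)]
        problems += [
            f"({name}) appears {counts[name]} times"
            for name in sorted(found & expected)
            if counts[name] != 1
        ]
    return problems
```

For a graph with 3 inputs and 1 output, 
check_boundary_labels("(in1)(in2) picInPic, (in3) picInPic(out1)", 3, 1) 
returns an empty list, while a graph that repeats (in1) is reported.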

> What is the  
> third kind of label (not input nor output) for? Is it to name  
> intermediate streams?

Yes.

> 
>> A filter graph should have elements (in1) (in2) ... and (out1) (out2)
>> ... in the exact number of inputs and outputs it has. Each one of  
>> those
>> elements can appear just once.
> 
> Is it still your intention to use labels to describe feedback? 

Yes, unless you have a good reason for not doing so other than 
implementing user-defined filters by string substitution.

> How do  
> you express that out_k is to be fed back to in_h? Or can you only do  
> feedback on intermediate labels?

(out_k) is one of the outputs of the whole filter graph. It can't be 
fed to anything. Maybe I don't understand your question.

> 
>> A filter is preceded and followed by a list of labels. Each label  
>> should
>> appear exactly once preceding a filter and once following one.
> 
> Do you mean each input label should appear exactly once preceding ...,
> and each output label should appear exactly once following ...?

I mean that every intermediate label (I forgot that word) should appear 
twice: once as an input of a filter and once as an output of a filter.

> 
>> The
>> number of input labels (minus one if preceded by a comma) should be
>> equal to the number of inputs of the filter (similar for outputs).
> 
>> The filters in a chain can be linked by commas. A comma links _one_
>> output stream with _one_ input stream,
> 
> Which one? the first? or the last? the one missing from the list?  
> Perhaps I should know the answer, but I am not clear about your  
> intended use of labels. Let me take an example you use below:
> 
> 	 '(in1)(in2) picInPic, (in3) picInPic(out1)'
> 
> Is "(in1)(in2)" the same as "(in2)(in1)" ? I expect the answer is no,
> the second form would swap the two input streams. And (in3) refers to a
> third input from the entire chain. Is that fed to the second input of
> picInPic?

You got all that right.

> In order to feed it to the first input, would I have to
> write the filter this way:
> 
> 	 '(in1)(in2) picInPic(tmp), (in3)(tmp) picInPic(out1)'  ?

Unless this is a typo, you got this one wrong. The comma would link the 
second (nonexistent!) output of picInPic to the first input of the other 
picInPic. It would cause a syntax error. Maybe you meant

'(in1)(in2) picInPic(tmp); (in3)(tmp) picInPic(out1)'  ?

> 
>> making things like
>>
>> (in) split, picInPic (out)
>>
>> syntactically invalid.
>>
>> A ';' just describes two different parts of a graph without doing any
>> linking (and each part should have all its inputs and outputs already
>> specified). The following
>>
>> vflip ;
>>
>> is not valid in any context, an output to vflip is missing. Only  
>> things like
>>
>> ...something here... vflip (T); (A)(B) picInPic ...something...
>>
>> are valid.
>>
>> My idea for user defined filters is to use the same syntax.
>>
>> my_filter1 = '(in1)(in2) picInPic, (in3) picInPic(out1)'
> 
>> or
>>
>> my_filter2 = '(in1) split (tmpa), (tmpb) picInPic(out1); (tmpa)  
>> vflip (tmpb)'
> 
>> and used like
>>
>> movie=foo.avi (A);
>> movie=bar.avi (B);
>> movie=main.avi (C);
>> (A)(B)(C)  my_filter1 (out1);
>>
>> Notice that there are no named inputs for the user-defined filter
>> ("in1", "in2" and "in3" are reserved keywords and "tmpa", "tmpb" are
>> local labels).
> 
> Are you saying that user-defined filters must use the in_k form, or  
> that they must not?

I mean that a filter graph has as many "(in#)" labels as it has inputs, 
and a user-defined filter is a filter graph in its own right.

So

mygraph := (in1) vflip (out1)

vitor@vitor$ ffmpeg -i file1.avi -vfilters '(in1) mygraph (out1)' file2.avi

And yes, textual substitution will give

'(in1) (in1) vflip (out1) (out1)' which is nonsense.

> 
> with respect to the form above (A)(B)(C) my_filter1 (out1), if I make  
> a textual substitution I find myself with

As I've already said, textual substitution is not needed (and not 
wanted, at least by me). It's not easier to parse and it's not easier to 
understand. So, unless you give a good argument for textual 
substitution, please stop using it in examples or arguments for how the 
syntax should be.

> 	(A)(B)(C) (in1)(in2) picInPic, (in3) picInPic(out1) (out1)
> 
> how do we explain what happens here? 

Easy. Syntax error. Five inputs for a filter that expects two.

> Is this supposed to be the same  
> as writing
> 
> 	(A)(B) picInPic, (C) picInPic(out1) ?
> 
> Because of course, you'll have situations like this:
> 
> 	my_f1 = '(in1)something(out1)'
> 	my_f2 = '(in2)otherthing(out1)'
> 
> 	movie=f1.avi(A);
> 	movie=f2.avi(B);
> 	(A)(B) my_f1 ; my_f2 (out1)(out2)
> 

No. Syntax error. You defined

         my_f1 = '(in1)something(out1)'

That means you defined a filter that takes one (and only one) input, 
filters it with the filter "something" and gives one (and only one) 
output. In

        (A)(B) my_f1 ; my_f2 (out1)(out2)

you are passing two inputs to my_f1. Maybe you meant

  	my_f1 = '(in1)something(out1)'
  	my_f2 = '(in2)otherthing(out1)'

  	movie=f1.avi(A);
  	movie=f2.avi(B);
  	(A) my_f1 (out1); (B) my_f2 (out2)

This would be the same as
         movie=f1.avi,something (out1); movie=f2.avi, otherthing (out2)

 > which I expect it is the same as : (in1)something(out1) ;
 > (in2)otherthing(out2) .
 > Right?

Well, if I understand your graph, it has no input. So anything with 
"(in1)" in it will give a syntax error.

Also, looking at

        (A)(B) my_f1 ; my_f2 (out1)(out2)

It gives me the impression you misunderstood the meaning of the 
semi-colon in my syntax. For such a line to be valid, my_f1 can't have 
any output. I'll try to explain with a drawing:

(etc) filter (etc2) ; (tmp1) filter2 (tmp2)

means

(etc) ------> filter ---> (etc2)       (tmp1) --->  filter2 ----> (tmp2)

another example

(etc) dead_end_filter ; vsrc_movie=a.avi (tmp2)

(etc) ----> dead_end_filter     vsrc_movie ---> (tmp2)

(mind the difference with the comma)

The semi-colon describes two different parts of the filter chain without 
doing any linking. As I said,

        vflip ; hflip

is a syntax error no matter where it appears. It expects vflip to have 
no output, but it has one. (Yes, I know it is related to your '*', but 
';' has the lowest precedence)
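
The pad-counting rule for ',' and ';' can be sketched in a few lines of 
Python. This is only an illustration under stated assumptions: the ARITY 
table is hypothetical (a real parser would ask each filter for its pad 
counts), check_pads is a made-up name, and parameters are assumed not to 
contain commas, semi-colons or parentheses.

```python
import re

# Hypothetical pad counts (inputs, outputs) for the filters used in this thread.
ARITY = {"split": (1, 2), "picInPic": (2, 1), "vflip": (1, 1),
         "hflip": (1, 1), "movie": (0, 1)}

ITEM_RE = re.compile(
    r"\s*((?:\([^()]*\)\s*)*)"     # labels before the filter
    r"([^()\s=]+)(?:=[^()]*?)?"    # filter_id, optionally followed by =parameters
    r"\s*((?:\([^()]*\)\s*)*)"     # labels after the filter
)


def check_pads(graph):
    """Return a list of pad-count errors; an empty list means every pad is linked."""
    errors = []
    for part in graph.split(";"):          # ';' separates parts and links nothing
        if not part.strip():
            continue
        items = part.split(",")
        for i, item in enumerate(items):
            m = ITEM_RE.fullmatch(item)
            if m is None or m.group(2) not in ARITY:
                errors.append(f"cannot parse {item.strip()!r}")
                continue
            before, name, after = m.groups()
            # A comma supplies exactly one input to the filter on its right
            # and takes exactly one output from the filter on its left.
            n_in = before.count("(") + (1 if i > 0 else 0)
            n_out = after.count("(") + (1 if i < len(items) - 1 else 0)
            want_in, want_out = ARITY[name]
            if n_in != want_in:
                errors.append(f"{name}: {n_in} inputs given, {want_in} expected")
            if n_out != want_out:
                errors.append(f"{name}: {n_out} outputs given, {want_out} expected")
    return errors
```

With this sketch, '(in1)(in2) picInPic, (in3) picInPic(out1)' passes, 
while 'vflip ; hflip' and '(in) split, picInPic (out)' are both rejected 
because some pads are left unlinked.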

>> To avoid name clashing, I propose to simply call
>> recursively the parser, so that the user-defined filter is parsed in
>> another context (and then link the graph to the inputs and outputs it
>> returns).
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

My point about textual substitution.
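
As a rough illustration of what parsing a user-defined filter in its own 
context could look like, here is a small Python sketch. The function name 
and the "name.label" qualification scheme are assumptions made for the 
example, not something the proposal specifies. The caller only learns the 
number of inputs and outputs; local labels are qualified so they cannot 
clash with labels of the enclosing graph.

```python
import re

LABEL_RE = re.compile(r"\(\s*([A-Za-z0-9]+)\s*\)")
BOUNDARY_RE = re.compile(r"(in|out)(\d*)")


def parse_user_filter(name, definition):
    """Scan a definition in its own context and report what the caller may see."""
    inputs, outputs, local = set(), set(), set()
    for label in LABEL_RE.findall(definition):
        m = BOUNDARY_RE.fullmatch(label)
        if m:
            number = int(m.group(2) or 1)   # (in) is an alias for (in1)
            (inputs if m.group(1) == "in" else outputs).add(number)
        else:
            # Qualified with the filter name, so it is never visible outside.
            local.add(f"{name}.{label}")
    return {"inputs": len(inputs), "outputs": len(outputs), "local": sorted(local)}
```

For my_filter2 this reports one input, one output and the local labels 
my_filter2.tmpa and my_filter2.tmpb, so a (tmpa) in the calling graph 
would be a different label.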


> 
>> Extensions based on other suggested grammars (to which I'm mostly
>> favourable):
>>
>> 1- Letting the comma link more than one pad
>> Advantages: Simplify the syntax of some filters
>> Disadvantages: Add to syntax complexity
> 
> I think it is important that whatever the syntax is, users are allowed  
> to omit elements so as to write simple filters as simply as always, eg:
> 
> 	... remove_logo,deinterlace,crop,scale,pad...
> 
> In this sense, it's important to be able to leave streams implicit if  
> possible.

Yes, I'm favorable to that. But how is this related to this suggestion?

> 
>> 2- Let the ';' be interpreted like vmrsss's '*'
>> Advantages, Disadvantages: Same as (1)
> 
> I don't think the choice of a symbol is important; it is the syntactic
> mechanisms to combine streams which are important. I suppose we can
> take it that everybody agrees on the key elements for that (functional
> or sequential composition of filters --meaning outputs of the preceding
> filter feed into inputs of the following one-- parallel or
> non-sequential composition, rearrangement of streams via appropriate
> namings, and feedback or looping). I don't think we have yet reached
> the perfect way to express these, but we're close.

Maybe having both my ';' and your '*'. My ';' is the last symbol in 
precedence, yours is the first. Mine operates on chains with no unlinked 
pads. Yours operates on chains with one or more.

> 
>> 3- Allow labels to optionally specify the source/destination stream  
>> number.
>> Advantages: Nice simplifications, like instead of
>>
>> (in1) vflip (T1); (in2) (T1) picInPic (out)
>>
>> that would be equivalent to
>> (in1) vflip (T1:0); (in2:0) (T1:1) picInPic (out)
>>
>> could be written, much more simply
>> (in1) vflip, (in2:0) picInPic (out)
> 
> yes, I think this is appealing.
> 
>> and in this case the comma will link to the next available stream.
> 
> it would be the *only* one available, which is a good definition; but  
> apart from the convention you use for implicit linking, 

Which convention for implicit linking?

The comma links the last stream of the previous filter to the first of 
the next. This is explicit and well defined.

If you don't want "implicit" linking, you can do all your links 
explicitly (and named, btw), like


                   L1           L3
(in1) ---> split ---> rotate ------+
                |                   picInPic ----> out
                +-----> nop   ------+
                   L2           L4

(in1) split (L1) (L2);
(L1) rotate (L3);
(L2) nop (L4);
(L3) (L4) picInPic (out);

No comma, no "implicit" linking.
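
To show that the fully labelled form determines the graph on its own, 
here is a small Python sketch that turns such a description into a list 
of links. It is an illustration only: explicit_links is a made-up name, 
it accepts only the comma-free form, and it assumes parameters contain no 
parentheses or semi-colons.

```python
import re

LABEL_RE = re.compile(r"\(\s*([A-Za-z0-9]+)\s*\)")
ITEM_RE = re.compile(
    r"\s*((?:\([^()]*\)\s*)*)"      # labels before the filter
    r"([^()\s=,]+)(?:=[^()]*?)?"    # filter_id, optionally followed by =parameters
    r"\s*((?:\([^()]*\)\s*)*)"      # labels after the filter
)
BOUNDARY_RE = re.compile(r"(in|out)\d*")  # (in#) and (out#) belong to the whole graph


def explicit_links(graph):
    """Return (producer, label, consumer) triples for a comma-free graph."""
    producer, consumer = {}, {}
    for item in graph.split(";"):
        if not item.strip():
            continue
        m = ITEM_RE.fullmatch(item)
        if m is None:
            raise ValueError(f"cannot parse {item.strip()!r}")
        before, name, after = m.groups()
        for table, text in ((consumer, before), (producer, after)):
            for label in LABEL_RE.findall(text):
                if label in table:
                    raise ValueError(f"({label}) is used twice on the same side")
                table[label] = name
    # Every intermediate label must appear once as an output and once as an input.
    for label in set(producer) ^ set(consumer):
        if not BOUNDARY_RE.fullmatch(label):
            raise ValueError(f"({label}) is not linked on both sides")
    return [(producer[l], l, consumer[l]) for l in producer if l in consumer]
```

Applied to the four lines above, it yields the links split -L1-> rotate, 
split -L2-> nop, rotate -L3-> picInPic and nop -L4-> picInPic.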

-Vitor



