[FFmpeg-devel] Video filter ideas

Michael Niedermayer michaelni
Thu Jun 21 16:48:28 CEST 2007


On Wed, Jun 20, 2007 at 10:30:03AM -0400, Bobby Bingham wrote:
> Hi,
> I'm close to trying to turn my filter system ideas into an actual API
> now, and I'd like comments and suggestions on the overall architecture
> before I get too far.  Nothing here's set in stone, so any comments
> which might lead to a better design are very welcome. I've been looking
> through libmpcodecs for ideas lately, and to be honest, I'm not sure I
> quite grok the buffer management that's going on there, so
> suggestions or explanations relating to that are especially welcome.
> I have in mind a sort of combined push-pull architecture.  A filter
> will request a frame from its input(s) when necessary.  The inputs will
> then push slices out (requesting in turn more frames from their own
> inputs if necessary) by calling their output's draw_slice().  For example:
> decoder -> filter -> output
> Output will request a frame from the filter, which will in turn
> request a frame from the decoder.  The decoder will call the filter's
> draw_slice() for each slice.  As the filter processes these, it will
> call the output's draw_slice().
> Now, the output frames do not have to correspond one-to-one to the
> input frames.  For example, if the filter in the example is a decimate
> filter which drops similar frames, it may output only one frame for
> every N input frames.  Similarly, if it's a filter which increases
> the framerate by interpolating between frames, it may output multiple
> frames for a single input frame.  The only thing I think may be
> necessary is that when the next filter requests output, you should
> output at least one frame.
> Question:  what about outputting frames when not requested (for
> example, outputting two frames for a single request, or when
> satisfying a request for output B gives you enough input to also
> generate data for output A)?  I think most filters should handle it
> fine, but what about a video out which must buffer them until it's
> time to display?

a simple fifo filter could buffer frames which have become available
as a side effect of some other request but which the next filter can't
use yet. so it seems easy to support filters which can't deal with extra
input without needing to forbid it entirely ...

> Next up, slices.  I don't see why filters should implement the same
> thing in two different ways, so I'd like to make everything into
> slices.  All frame data is passed through the draw_slice() functions of
> the filters.  As I'll touch on later though, it will be possible for a
> filter to indicate that it can only handle slices which are the size of
> a full frame.


> Actually, let's get to that now.  Michael has said before that writing a
> filter should not require complete knowledge of all the internals.  So
> here are some ideas Rich proposed which help simplify filter writing:
> context.  A filter can specify how much spatial and temporal context it
> needs.  In the spatial sense, this corresponds to the minimum size of
> the slices (we could use 0 to indicate no special requirements, and
> -1 to indicate whole frame slices). If an input tries passing smaller
> slices, the filter system will automatically combine multiple slices
> together to satisfy this requirement.  

i think what rich meant was not the size of slices but rather that a filter
called with a 10x10 rectangle as input and a 5x3 pixel context requirement
should actually get the 10x10 rectangle, but the pixels in the larger 20x16
rectangle around it should be valid input too. that way one could write a
filter like

for (y = 0; y < h; y++) {
    for (x = 0; x < w; x++) {
        out[y*s + x] = (in[y*s + x-1] + in[y*s + x+1] +
                        in[(y-1)*s + x] + in[(y+1)*s + x]) / 4;
    }
}

and the developer wouldn't need to worry about the borders

this is of course not hard to implement by using a wrapper in almost any
filter system, so such simple-API filters could just be called by a
corresponding simple_spatial_filter wrapper.
this wrapper could then, for the borders, use an internal buffer with
reflected or repeated pixels "around" the borders, and for the middle it
would just call the filter on the data directly
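A sketch of the padding step of such a wrapper, assuming an 8-bit single plane and a 1-pixel context (repeated rather than reflected edge pixels; the function name is made up):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical wrapper helper: copy a w x h plane into a buffer with a
 * 1-pixel border of repeated edge pixels, so a simple-API filter can read
 * in[x-1], in[x+1], in[y-1], in[y+1] everywhere without border checks.
 * dst must have room for (h+2) rows of dst_stride >= w+2 bytes. */
static void pad_replicate_1(uint8_t *dst, int dst_stride,
                            const uint8_t *src, int src_stride, int w, int h)
{
    int y;
    for (y = 0; y < h; y++) {
        /* interior row, shifted by (1,1) into the padded buffer */
        memcpy(dst + (y+1)*dst_stride + 1, src + y*src_stride, w);
        /* repeat the edge pixels into the left/right border columns */
        dst[(y+1)*dst_stride]         = src[y*src_stride];
        dst[(y+1)*dst_stride + w + 1] = src[y*src_stride + w - 1];
    }
    /* repeat the first and last padded rows into the top/bottom borders */
    memcpy(dst,                    dst + dst_stride,   w + 2);
    memcpy(dst + (h+1)*dst_stride, dst + h*dst_stride, w + 2);
}
```

The wrapper would then run the simple filter with `in = padded + pad_stride + 1` and stride `pad_stride`, or, as described above, call the filter directly on the middle of the frame and only use the padded copy near the borders.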

but as indicated already this can be done on top of pretty much any API,
so IMHO we don't need to think too much about it currently

> Similarly for temporal context,
> the filter system would buffer the number of frames required by the
> filter for it.  A temporal context requirement of 0 would mean that the
> filter system won't automatically buffer frames for the filter (either
> the filter doesn't need it, or can handle it itself).  Also possible
> would be a temporal context requirement of -1, which would indicate
> that not only does the filter not require temporal context, but that
> the frames don't even have to be applied in display order.  This would
> allow for a filter graph which can process frames in decode order until
> the point where they hit a filter or video output which requires
> display order.  At that point, the filter system would automatically
> reorder frames.

sounds nice
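The automatic reordering could look something like this sketch of an "order" filter that accepts decode-order frames and releases them in pts order (the struct, the fixed window depth, and the function name are all assumptions, not an existing API):

```c
#include <stdint.h>
#include <stddef.h>

#define REORDER_DEPTH 4   /* max decode-to-display delay handled (assumption) */

typedef struct OrderedFrame {
    int64_t pts;          /* presentation timestamp */
    void   *data;
} OrderedFrame;

typedef struct Reorder {
    OrderedFrame buf[REORDER_DEPTH];
    int count;
} Reorder;

/* insert a decode-order frame; returns the next display-order frame,
 * or a frame with data == NULL while the window is still filling */
static OrderedFrame reorder_push(Reorder *r, OrderedFrame in)
{
    OrderedFrame out = { 0, NULL };
    int i = r->count++;
    /* keep buf[] sorted by pts (insertion sort; the window is tiny) */
    while (i > 0 && r->buf[i-1].pts > in.pts) {
        r->buf[i] = r->buf[i-1];
        i--;
    }
    r->buf[i] = in;
    if (r->count == REORDER_DEPTH) {      /* window full: emit the oldest */
        out = r->buf[0];
        for (i = 1; i < r->count; i++)
            r->buf[i-1] = r->buf[i];
        r->count--;
    }
    return out;
}
```

A real version would flush the remaining frames at EOF and size the window from the codec's declared delay instead of a constant.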

> I can see how these context buffers could hurt performance.  In some
> cases, the filter would just be buffering frames itself, so it wouldn't
> be much different.  In other cases, it allows for lazier coding of
> filters with lower performance.  It certainly doesn't require their use
> - so if the filter itself can do better, I think that should be
> encouraged.  But I also think that a filter system with some low
> performing filters is worth more than a higher performing system which
> nobody can figure out how to write filters for.

> Buffer management.  This is something I'm still looking for ideas on.
> I understand the idea of direct rendering - rendering into a buffer
> provided by the next filter (or possibly even further away) to reduce
> memcpys.  What I'm struggling with is a good solution to the
> requirements placed on each buffer.  For example, the decoder may
> require that buffers storing frames used as references are not
> modified.  Looking at mplayer, this seems to have been a problem there
> as well.  It appears that direct rendering is disabled when decoding
> h264 because, as far as I can tell, it breaks the assumptions mplayer
> makes on how many frames can be used as references.  I'll think some
> more about this, but some of you may have better ideas already on
> handling buffers.

mplayer was designed before h.264 ...

about buffer management, well this is tricky ...
i've thought about it a little and tried to fit it into some
file-permission-like model, so that a buffer would have some permissions
attached to it, but it somehow doesn't work out at all.
so here's another idea:

let's assume there are no permissions on a buffer as such, but that we
rather have some sort of agreement/contract between filters about a
buffer, so that

a filter could request a buffer from another filter with a given minimum
set of permissions; if that filter can't provide such a buffer, the filter
core could allocate one with the requested permissions instead
this filter could now use the buffer according to its permissions or
also give it to other filters. it could also drop permissions, though of
course not add new ones, as that would violate the "contract" with the
filter it got the buffer from

possible permissions would be:
readable    (the filter receiving the buffer can read it)
writable    (the filter receiving the buffer can write into it)
preserved   (no filter except the filter receiving the buffer may change it)
reusable    (the filter receiving the buffer can output this buffer multiple times)
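This could be a plain bitmask; the "drop but never add" rule is then just an intersection (flag names here are illustrative, not an existing API):

```c
/* Sketch of the permission set as a bitmask */
#define PERM_READ     0x01  /* receiving filter may read the buffer        */
#define PERM_WRITE    0x02  /* receiving filter may write into it          */
#define PERM_PRESERVE 0x04  /* no other filter may change it               */
#define PERM_REUSE    0x08  /* receiver may output this buffer repeatedly  */

/* a filter may drop permissions when passing a buffer on, never add any:
 * the result is the intersection of what it has and what it wants to keep */
static int pass_on_perms(int have, int keep)
{
    return have & keep;
}

/* check a request against what the upstream filter can offer */
static int perms_ok(int offered, int requested)
{
    return (offered & requested) == requested;
}
```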

so an MPEG I/P decoder would request readable+writable+preserved buffers,
draw into them, and output them without the writable permission

an old CR (conditional replenishment) codec would request
readable+writable+preserved+reusable buffers, update the buffer in each
run, and output it without the preserved+writable permissions

mpeg1/2 b frames would be requested as writable and output as is (the
decoder doesn't use them as references ...)

a subtitle filter which receives a writable input buffer would write into
it and output it as is; if its input weren't writable it would ask the
next filter for a writable buffer and copy things into it

a crop filter would simply change the width/height/stride of a buffer and
pass that on with the same permissions
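For a single 8-bit plane the crop really is just pointer arithmetic, no pixel is touched (the Pic struct and function are a made-up illustration):

```c
#include <stdint.h>

/* Hypothetical picture: one 8-bit plane plus its geometry */
typedef struct Pic {
    uint8_t *data;
    int stride, w, h;
} Pic;

/* Crop by moving the data pointer to the top-left of the cropped
 * rectangle and shrinking w/h; stride and permissions stay unchanged. */
static Pic crop(Pic in, int x, int y, int w, int h)
{
    Pic out = in;
    out.data = in.data + y * in.stride + x;
    out.w = w;
    out.h = h;
    return out;
}
```

Since the same underlying buffer is passed on, whoever frees it must still free the original allocation, which is one reason buffers need an identity separate from their data pointer.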

a temporal blur filter which receives buffers with the preserved permission
would simply keep references to them; if they come without the preserved
permission then it would copy them (or somehow tell the filter core it
needs preserved input)

another temporal blur filter might just keep a single
readable+writable+preserved+reusable buffer, repeatedly add the input into
it, and repeatedly output this same buffer
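The per-pixel update of that second variant could be a simple exponential decay; the 3/4 weight and the function name are arbitrary choices for the sketch:

```c
#include <stdint.h>

/* Single-buffer temporal blur: mix each new frame into the accumulator,
 * which is also the output buffer (readable+writable+preserved+reusable).
 * acc = (3*acc + in) / 4 per pixel, rounded to nearest. */
static void tblur_run(uint8_t *acc, const uint8_t *in, int n)
{
    int i;
    for (i = 0; i < n; i++)
        acc[i] = (3 * acc[i] + in[i] + 2) / 4;
}
```

This is exactly why the reusable permission matters: the filter outputs the same buffer every frame, so no downstream filter may assume a freshly delivered buffer is distinct from the previous one.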

also we might need to somehow provide a hint on the number of buffers a
filter will need, so a video out with a limited number won't run out of
buffers

> Brief list of things I don't think require much detail right now:
> - if a filter works on frame sized slices, its output can be
> automatically cut into normal sized slices again for the remainder of
> the filters
> - a lot of the places I've said "the filter system will automatically
> do X" may end up actually getting done by an automatically inserted
> filter.  For example a "slicify" filter to cut frame-sized slices back
> down, or an "order" filter to put frames in display order.  The end
> effect, that other filters won't have to worry about these details, is
> the same.
> Ideas that have crossed my mind, but I'm not sure how to work them in,
> or if they are worth the trouble:
> - handling frames that only change a little.  Suppose only a single
> slice or two change in a frame.  Some filters could theoretically skip
> a lot of processing if they knew that.

libavcodec also keeps track of which macroblocks (16x16 pixels) changed,
although it needs to know the buffer age for the buffer it receives for
the next frame (the age here is the number of frames since this buffer
was last output by libavcodec, which could of course be "infinite" if it's
a new buffer)
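A filter rendering into a reused buffer could combine the two pieces of information roughly like this (the interface is entirely made up, it only illustrates the age/changed interaction):

```c
#include <stdint.h>

/* Decide whether macroblock mb must be redrawn into a reused buffer.
 * changed[f][mb] records whether mb changed in frame f; age is how many
 * frames ago this buffer last held current content (<= 0: treat as a
 * brand-new buffer and redraw everything). */
static int mb_needs_redraw(uint8_t changed[][64], int frame, int mb, int age)
{
    int f;
    if (age <= 0 || frame - age < 0)
        return 1;                 /* new buffer or no history: draw it */
    for (f = frame - age + 1; f <= frame; f++)
        if (changed[f][mb])
            return 1;             /* changed since this buffer was current */
    return 0;                     /* pixels already in the buffer are valid */
}
```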

> - only allocating memory for a slice at a time, and reusing that for
> later slices in the frame.  Only have one frame-sized buffer at the end
> where the final frame is assembled.

libavcodec currently reuses a single slice for b frames in most cases
(not h.264 ...)
it's not hard to keep lavc from doing this, but it isn't optimal if the
following filter could easily work with independent slices

Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No human being will ever know the Truth, for even if they happen to say it
by chance, they would not even know they had done so. -- Xenophanes