[FFmpeg-devel] [PATCH v3 14/18] swscale/graph: add new high-level scaler dispatch mechanism

Thu Oct 24 13:02:41 EEST 2024

On Thu, 24 Oct 2024 11:30:12 +0200 Anton Khirnov <anton at khirnov.net> wrote:
> Quoting Niklas Haas (2024-10-20 22:05:23)
> > From: Niklas Haas <git at haasn.dev>
> >
> > This interface has been designed from the ground up to serve as a new
> > framework for dispatching various scaling operations at a high level. This
> > will eventually replace the old ad-hoc system of using cascaded contexts,
> > as well as allowing us to plug in more dynamic scaling passes requiring
> > intermediate steps, such as colorspace conversions, etc.
> >
> > The starter implementation merely piggybacks off the existing sws_init() and
> > sws_scale() functions, though it does bring the immediate improvement of
> > splitting up cascaded functions and pre/post conversion functions into
> > separate filter passes, which allows them to e.g. be executed in parallel
> > even when the main scaler is required to be single threaded. Additionally,
> > a dedicated (multi-threaded) noop memcpy pass substantially improves
> > throughput of that fast path.
> >
> > Follow-up commits will eventually expand this to move all of the scaling
> > decision logic into the graph init function, and also eliminate some of the
> > current special cases.
>
> Does this (or can it) support copy-free passthrough of individual
> planes, for cases like YUV420P<->NV12?

Not currently, no. We could switch to AVBufferRefs for the plane pointers to
add this functionality down the line, but it's not a high priority because
doing it properly first requires solving the much harder problem of rewriting
the underlying scaler dispatch logic.

Switching the plane pointers over would not be terribly difficult in itself,
but the problem is that swscale currently has no good concept of what's
happening to each plane; it's all a jumble of ad-hoc cases.
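
To make the plane passthrough idea concrete, here is a rough sketch of what
sharing a plane by reference could look like for YUV420P -> NV12. Everything
in it is hypothetical: PlaneBuf, plane_ref() and friends are stand-ins for
AVBufferRef / av_buffer_ref(), not existing swscale code, and it assumes
tightly packed planes (no stride padding). The luma plane is shared without a
copy; only the chroma planes are actually touched.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical refcounted plane buffer (stand-in for AVBufferRef). */
typedef struct PlaneBuf {
    uint8_t *data;
    size_t   size;
    int      refcount;
} PlaneBuf;

static PlaneBuf *plane_alloc(size_t size)
{
    PlaneBuf *b = malloc(sizeof(*b));
    if (!b)
        return NULL;
    b->data = calloc(1, size);
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->size     = size;
    b->refcount = 1;
    return b;
}

/* Take a new reference to an existing plane; no pixel data is copied. */
static PlaneBuf *plane_ref(PlaneBuf *b)
{
    b->refcount++;
    return b;
}

/* Drop one reference, freeing the plane when the last one goes away. */
static void plane_unref(PlaneBuf **pb)
{
    PlaneBuf *b = *pb;
    *pb = NULL;
    if (b && --b->refcount == 0) {
        free(b->data);
        free(b);
    }
}

/* YUV420P (Y, U, V) -> NV12 (Y, interleaved UV).
 * The Y plane is passed through by reference; U and V are interleaved
 * into a newly allocated plane. Returns 0 on success, -1 on failure. */
static int yuv420p_to_nv12(PlaneBuf *src[3], PlaneBuf *dst[2])
{
    size_t n = src[1]->size;
    PlaneBuf *uv;

    assert(src[1]->size == src[2]->size);
    uv = plane_alloc(2 * n);
    if (!uv)
        return -1;

    for (size_t i = 0; i < n; i++) {
        uv->data[2 * i]     = src[1]->data[i]; /* U sample */
        uv->data[2 * i + 1] = src[2]->data[i]; /* V sample */
    }

    dst[0] = plane_ref(src[0]); /* copy-free luma passthrough */
    dst[1] = uv;
    return 0;
}
```

The caller would plane_unref() both the source and destination planes when
done; the luma data is only freed once neither side holds a reference.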

One of my plans for SwsGraph is to first build a list of the operations to
perform on each plane, and then eliminate redundant passes so we can see which
special cases and/or noop passes can be optimized away. But this has to wait a
bit, as I'm first working on the more immediate goal of adding support for
more complex colorspaces (by chaining together multiple scaling passes).
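
As a very rough illustration of the per-plane operation list, something along
these lines is what I have in mind. This is only a sketch: PlaneOp,
op_is_noop() and prune_noops() are made-up names, not part of SwsGraph, and
the definition of "redundant" here (a plain copy, or a scale between identical
dimensions) is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-plane operation kinds; not swscale API. */
enum PlaneOpType {
    OP_COPY,    /* plain memcpy of the plane */
    OP_SCALE,   /* resample from src to dst dimensions */
    OP_CONVERT, /* change of sample format or colorspace */
};

typedef struct PlaneOp {
    enum PlaneOpType type;
    int src_w, src_h;
    int dst_w, dst_h;
} PlaneOp;

/* An operation is redundant if it leaves the plane contents unchanged:
 * a plain copy, or a scale whose input and output sizes are identical. */
static int op_is_noop(const PlaneOp *op)
{
    switch (op->type) {
    case OP_COPY:
        return 1;
    case OP_SCALE:
        return op->src_w == op->dst_w && op->src_h == op->dst_h;
    default:
        return 0;
    }
}

/* Compact the list in place, dropping redundant operations while keeping
 * the order of the rest. Returns the number of operations remaining; a
 * result of 0 means the plane could be passed through untouched. */
static size_t prune_noops(PlaneOp *ops, size_t nb_ops)
{
    size_t out = 0;

    assert(ops || nb_ops == 0);
    for (size_t i = 0; i < nb_ops; i++) {
        if (!op_is_noop(&ops[i]))
            ops[out++] = ops[i];
    }
    return out;
}
```

With a list like this per plane, a plane whose pruned list is empty is a
candidate for passthrough, while the other planes keep only the passes that
actually do work.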