[FFmpeg-devel] [PATCH] Efficiently support several output pixel formats in Cinepak decoder

Ronald S. Bultje rsbultje at gmail.com
Fri Feb 3 15:52:53 EET 2017


Hi Rune,

On Fri, Feb 3, 2017 at 4:08 AM, <u-9iep at aetey.se> wrote:

> On Thu, Feb 02, 2017 at 11:16:35AM -0500, Ronald S. Bultje wrote:
> > On Thu, Feb 2, 2017 at 10:59 AM, <u-9iep at aetey.se> wrote:
> > > It is the irregular differences between them which are the reason
> > > for splitting. I would not call this "duplication". If you feel
> > > it is straightforward and important to make this more compact,
> > > with the same performance, just go ahead.
>
> > So, typically, we wouldn't duplicate the code, we'd template it. There's
> > some examples in h264 how to do it. You'd have a single
> (av_always_inline)
> > decode_codebook function, which takes "format" as an argument, and then
> > have three av_noinline callers to it (using fmt=rgb565, fmt=rgb24 or
> > fmt=rgb32).
> >
> > That way performance works as you want it, without the source code
> > duplication.
>
> (Thanks for the pointer. I'll look at how it is done in h264, but: )
>
> I thought about generating the bodies of the functions from something
> like a template but it did not feel like this would make the code more
> understandable aka maintainable. So I wonder if there is any point in
> doing this, given the same binary result.


wm4 has tried to make this point several times: it's about maintainability.
Let me explain, so we don't keep going back and forth.

Let's assume you have a function that does a, b and c, where b depends on
the pix fmt. You're currently writing it like this:

function_565() {
a
b_565
c
}

function_24() {
a
b_24
c
}

function_32() {
a
b_32
c
}

It should be pretty obvious that a and c are triplicated in this example.
Now compare it with this:

av_always_inline function_generic(fmt) {
a
if (fmt == 565) {
b_565
} else if (fmt == 24) {
b_24
} else {
assert(fmt == 32);
b_32
}
c
}

function_565() {
function_generic(565);
}

function_24() {
function_generic(24);
}

function_32() {
function_generic(32);
}

It might look larger, but that's only because a and c are placeholders here
rather than written-out code. The key point is that a and c are no longer
triplicated, which is a significant maintenance improvement. And because of
the inline keywords used, the performance will be identical.

Conclusion: better to maintain, identical performance. Only advantages, no
disadvantages. Should be easy to accept, right?
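To be fully concrete, here is a minimal, self-contained sketch of the pattern
(hypothetical function names; the real av_always_inline/av_noinline macros live
in libavutil/attributes.h and are stubbed here with GCC/Clang attributes so the
snippet compiles on its own):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for FFmpeg's macros from libavutil/attributes.h,
 * stubbed with GCC/Clang attributes so this compiles standalone. */
#define av_always_inline inline __attribute__((always_inline))
#define av_noinline      __attribute__((noinline))

enum fmt { FMT_RGB565, FMT_RGB24, FMT_RGB32 };

/* One generic body: only the packing step branches on fmt. Every
 * caller passes a compile-time constant, so the compiler folds the
 * branch away and each wrapper becomes a specialized copy -- the
 * same machine code as three hand-written functions. */
static av_always_inline uint32_t pack_pixel(uint8_t r, uint8_t g, uint8_t b,
                                            enum fmt fmt)
{
    if (fmt == FMT_RGB565)
        return (uint32_t)(r >> 3) << 11 | (g >> 2) << 5 | (b >> 3);
    else if (fmt == FMT_RGB24)
        return (uint32_t)r << 16 | g << 8 | b;
    else
        return 0xff000000u | (uint32_t)r << 16 | g << 8 | b;
}

/* Three thin, non-inlined entry points, one per output format. */
static av_noinline uint32_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return pack_pixel(r, g, b, FMT_RGB565);
}

static av_noinline uint32_t pack_rgb24(uint8_t r, uint8_t g, uint8_t b)
{
    return pack_pixel(r, g, b, FMT_RGB24);
}

static av_noinline uint32_t pack_rgb32(uint8_t r, uint8_t g, uint8_t b)
{
    return pack_pixel(r, g, b, FMT_RGB32);
}
```

The h264 template headers do the same thing at a larger scale; the mechanism
is identical.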

> > > > What we _especially_ don't do in FFmpeg is hardcoding colorspace
> > > > conversion coefficients in decoders, and doing colorspace conversion in
> > > > decoders.
> > >
> > > Have you got a suggestion how to avoid this in this case,
> > > without sacrificing the speed?
>
> > Ah, yes, the question. So, the code change is quite big and it does
> > various things, and each of these might have a better alternative or be
> > good as-is. Fundamentally, I don't really understand how _adding_ a
> > colorspace conversion does any good to speed. It fundamentally always
> > makes things slower. So can you explain why you need to _add_ a colorspace
> > conversion?
>
> It moves the conversion from after the decoder, where the data to convert
> is whole frames, to inside the decoder, where the conversion applies only
> to the codebooks, which by the codec design are much smaller than the
> output.
>
> > Why not just always output the native format? (And then do conversion in
>
> The "native" Cinepak format is actually unknown to swscaler, and I
> seriously doubt it would make sense to add it there, which would
> just largely cut down the decoding efficiency.


I see, so the codebook contains indexed entities that are re-used to
reconstruct the actual frames. That makes some sense. What I don't
understand, then, is why below you're claiming that get_format() doesn't do
this. This is exactly what get_format() does. Why do you believe
get_format() isn't capable of helping you accomplish this?
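(If I follow the speed argument, the arithmetic is roughly the following --
illustrative numbers of my own, not from the patch: a 320x240 frame versus
Cinepak's two codebooks of up to 256 entries, each entry storing a 2x2 block
of pixels.)

```c
#include <assert.h>

/* Pixels that must be converted if the pixel-format conversion runs
 * per decoded frame, versus per codebook update. Numbers are
 * illustrative: Cinepak has V1 and V4 codebooks of up to 256 entries
 * per strip, one entry storing a 2x2 block = 4 pixels. */
static int frame_cost(int w, int h)
{
    return w * h;          /* converting after the decoder touches every pixel */
}

static int codebook_cost(void)
{
    return 2 * 256 * 4;    /* worst case: both codebooks fully retransmitted */
}
```

So even a full codebook update converts far fewer pixels than one whole
frame, and codebooks are typically only partially updated.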

> > > > > +    char *out_fmt_override = getenv("CINEPAK_OUTPUT_FORMAT_OVERRIDE");
> > > >
> > > > Absolutely not acceptable.
> > >
> > > 1. Why?
> >
> > Libavcodec is a library. Being sensitive to the environment in a library, or
> > worse yet, affecting the environment, is typically not what is expected.
> > There are almost always better ways to do the same thing.
>
> I did my best to look for a better way, but one does not seem to exist.


Look into private options, for one. But really, get_format() solves this. I
can elaborate more if you really want me to, as above, but I feel you
haven't really looked into it. I know this feeling, sometimes you've got
something that works for you and you just want to commit it and be done
with it.

But that's not going to happen. The env variable will never be committed.
Never. I guarantee it with 100% certainty. If you'd like this general
feature to be picked up in the main codebase, you'll need to change, at the
very least, how it is exposed as a selectable option. Env opts are not
acceptable; they never have been and never will be. A private codec option
(av_opt_*()) might be acceptable depending on how special/non-generic it
is, but I'm still convinced that get_format() can help as well.

I recommend that you play around with how get_format() is used in a few
places, or try implementing an application that supports the get_format()
callback: use it to recommend a format to the decoder, pick that up in the
decoder to select the actual pix_fmt, and use that for decoding. If you
need help with that, we're obviously happy to help. But we know this code,
and we know it can do this, so don't tell us it's not possible and an env
variable is the only way. That's crazy.
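To make the negotiation concrete, here is a toy model of that callback
(self-contained stand-ins, not the real libavcodec types; the actual hook is
AVCodecContext.get_format, which receives an AV_PIX_FMT_NONE-terminated list
of the formats the decoder can produce and returns the application's choice):

```c
#include <assert.h>

/* Simplified stand-in for enum AVPixelFormat; values are illustrative. */
enum pix_fmt { PIX_FMT_NONE = -1, PIX_FMT_RGB565, PIX_FMT_RGB24, PIX_FMT_RGB32 };

/* Application-side callback: the decoder offers every format it can
 * output, terminated by PIX_FMT_NONE, and the application picks one.
 * Here we prefer RGB565 (say, for a 16-bit framebuffer) and otherwise
 * fall back to the decoder's first choice. */
static enum pix_fmt app_get_format(const enum pix_fmt *fmts)
{
    for (const enum pix_fmt *p = fmts; *p != PIX_FMT_NONE; p++)
        if (*p == PIX_FMT_RGB565)
            return *p;
    return fmts[0];
}
```

The decoder then decodes (and converts its codebooks) directly in whichever
format was selected -- no env variable needed.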

Then make the recommended changes to prevent code duplication, and there's
an actual chance the patch will be picked up in the main codebase. That
sounds pretty straightforward, doesn't it?

(Aside, I agree with other reviewers that colorspace conversion doesn't
belong in a decoder; I'd just remove yuv420p from the codebook conversion.
The native format seems to be RGB, so yuv420p output makes no sense at all.
It appears you're primarily concerned with selecting between rgb565, 24
and 32 anyway. If you really need yuv, we need some way to import the
actual coefficients from elsewhere; they don't belong in the decoder.)

Ronald
