[FFmpeg-devel] [PATCH] lavfi: add opencl tonemap filter.

Niklas Haas ffmpeg at haasn.xyz
Tue May 22 15:54:26 EEST 2018

On Tue, 22 May 2018 08:56:37 +0000, "Song, Ruiling" <ruiling.song at intel.com> wrote:
> Yes, your idea sounds reasonable. But it may need much effort to re-structure the code to make it (that would launch two kernels, and we may need a wait between them) and evaluate the performance.

Actually, a brute-force solution to the missing peak problem would be to
filter the first frame twice and discard the first result. (After that,
you only need to filter each frame once, so the overall performance
characteristics are unchanged for videos.)

That requires minimal code change, and it still allows it to work for
single-frame video sources. It also prevents an initial flash of the
wrong brightness level for transcoded videos.
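
To make the idea concrete, here is a minimal sketch in Python rather than
the filter's actual OpenCL code. `ToyToneMapper`, its `fallback_peak` and
`filter_stream` are hypothetical names made up for illustration; the only
property they are meant to model is that each call scales by the peak
measured on the previous call.

```python
class ToyToneMapper:
    """Hypothetical stand-in for a filter with one-frame peak-detection lag."""

    def __init__(self, target_peak=1.0, fallback_peak=10.0):
        self.target_peak = target_peak
        self.fallback_peak = fallback_peak  # guess used before any frame is seen
        self.detected_peak = None

    def filter(self, frame):
        # Scale by the peak measured on the *previous* call (or the guess).
        if self.detected_peak is not None:
            peak = self.detected_peak
        else:
            peak = self.fallback_peak
        out = [v * self.target_peak / peak for v in frame]
        # Measure this frame's peak for use by the next call.
        self.detected_peak = max(frame)
        return out


def filter_stream(frames, mapper):
    """Filter the first frame twice, keeping only the second result."""
    results = []
    for index, frame in enumerate(frames):
        if index == 0:
            mapper.filter(frame)  # warm-up pass, result discarded
        results.append(mapper.filter(frame))
    return results
```

With the warm-up pass, even a single-frame input is scaled by its own
measured peak instead of the fallback guess, and every later frame is
still filtered exactly once.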

Also, performance-wise, I'm not sure how this works in OpenCL land, but
in OpenGL/Vulkan, you'd just need to emit a pipeline barrier. That
allows the kernels to synchronize without having to stall the pipeline
by doing a CPU wait. (And, in general, you'd need a pipeline barrier
even if you *are* running glFinish() afterwards - the pipeline barrier
isn't just for timing, it's also for flushing the appropriate caches. In
general, write visibility on storage buffers requires a pipeline
barrier. Are you sure this is not the case for OpenCL as well?)

> Although we are developing offline filter, I think that performance is still very important as well as quality.
> Given that the current implementation does well for video transcoding. I would leave it in my TODO list. Sounds ok?

ACK. It's not my decision, I'm just offering advice.

> Are you talking about display-referred HLG? I didn't update frame side channel data.
> I am not sure when do I need to update it. I thought all HLG should be scene-referred, seems not?
> Could you tell me more about display-referred HLG?

There's no such thing as "display-referred HLG". HLG by definition is
encoded as scene-referred, but the OOTF to convert from scene-referred
to display-referred is part of the EOTF (also by definition).

So the HLG EOTF takes scene-referred input and produces display-referred
output. When you apply the EOTF (including the OOTF) as part of your
processing chain, you're turning the signal into linear-light,
display-referred values. The tone mapping then happens on this signal
(in display light). To turn it back into HLG once you're done
tone-mapping, you apply the inverse OOTF + OETF, which takes it back to
scene-referred light.
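
The chain described above can be sketched in Python as follows. This is
an illustration under stated assumptions, not the filter's code: it uses
the BT.2100 normalization where scene light lies in [0, 1] (so 1.0 here
plays the role of the top of the scene-referred range), it handles only
gray values (R = G = B, so the OOTF reduces to a power function), it
assumes a display black level of zero, and `process` with its
`decode_peak`, `encode_peak` and `tone_map` parameters is a hypothetical
helper.

```python
import math

# HLG OETF constants as given in BT.2100.
A = 0.17883277
B = 1.0 - 4.0 * A
C = 0.5 - A * math.log(4.0 * A)


def hlg_inverse_oetf(signal):
    """HLG signal in [0, 1] -> scene-referred linear light in [0, 1]."""
    if signal <= 0.5:
        return signal * signal / 3.0
    return (math.exp((signal - C) / A) + B) / 12.0


def hlg_oetf(scene):
    """Scene-referred linear light in [0, 1] -> HLG signal in [0, 1]."""
    if scene <= 1.0 / 12.0:
        return math.sqrt(3.0 * scene)
    return A * math.log(12.0 * scene - B) + C


def hlg_gamma(peak):
    """System gamma for a display peak in cd/m^2 (1.2 at 1000 cd/m^2)."""
    return 1.2 + 0.42 * math.log10(peak / 1000.0)


def hlg_ootf(scene, peak):
    """Scene light -> display light in cd/m^2 (gray-only simplification)."""
    return peak * scene ** hlg_gamma(peak)


def hlg_inverse_ootf(nits, peak):
    """Display light in cd/m^2 -> scene light (gray-only simplification)."""
    return (nits / peak) ** (1.0 / hlg_gamma(peak))


def process(signal, decode_peak, encode_peak, tone_map=lambda nits: nits):
    """Decode HLG to display light, tone-map there, re-encode to HLG."""
    scene = hlg_inverse_oetf(signal)                    # undo the OETF
    display = hlg_ootf(scene, decode_peak)              # now display-referred
    mapped = tone_map(display)                          # works in cd/m^2
    scene_out = hlg_inverse_ootf(mapped, encode_peak)   # back to scene light
    return hlg_oetf(scene_out)                          # back to an HLG signal
```

With the default identity `tone_map` and equal peaks, `process` should
return its input signal unchanged (up to rounding), which is a quick
sanity check that the inverse OOTF + OETF really undoes the EOTF.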

The HLG OOTF (and therefore the EOTF) is parametrized by the display
peak. Even though the HLG signal is stored in the range 0.0 - 12.0
(scene-referred), the output range depends on how you tuned the EOTF. If
you tuned it for the 1000 cd/m^2 reference display, then an input of
12.0 will get turned into an output value of 1000 cd/m^2.

If we then tone-map this to a brightness of 500 cd/m^2, and pass it back
through the inverse of that same OOTF, it would get turned into 6.0
rather than 12.0. While this may ultimately reproduce the correct result
on-screen (assuming the end user of the video file also uses a peak of
1000 cd/m^2 to decode the file), it's a suboptimal use of the encoding
range and also not how HLG is designed to operate. (For example, it
would affect the "SDR backwards compatibility" property of HLG, which is
the whole reason for the peak-dependent encoding.)

That's why the correct thing to do would be to re-encode the file using
an inverse OOTF tuned for 500 cd/m², thus taking our tone-mapped value
in question back to the (scene-referred) value of 12.0, and update the
tagged peak to also read 500 cd/m². Now a spec-conforming implementation
of a video player (e.g. mpv or VLC) that plays this file would use the
same tuned EOTF to decode it back to the value of 500 cd/m², thus
ensuring it round trips correctly.
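
A small numeric sketch of the difference, again in Python and under the
same kind of simplifying assumptions: scene light is normalized to
[0, 1] as in BT.2100 (so 1.0 stands for the top of the scene-referred
range), only a gray value is considered, the display black level is
taken as zero, and the "tone mapping" is a plain halving of the peak,
chosen only to keep the arithmetic obvious. The variable names are
illustrative.

```python
import math


def hlg_gamma(peak):
    """System gamma for a display peak in cd/m^2 (1.2 at 1000 cd/m^2)."""
    return 1.2 + 0.42 * math.log10(peak / 1000.0)


def hlg_ootf(scene, peak):
    """Scene light in [0, 1] -> display light in cd/m^2 (gray only)."""
    return peak * scene ** hlg_gamma(peak)


def hlg_inverse_ootf(nits, peak):
    """Display light in cd/m^2 -> scene light in [0, 1] (gray only)."""
    return (nits / peak) ** (1.0 / hlg_gamma(peak))


# Decode the brightest scene value with the OOTF tuned for 1000 cd/m^2.
decoded = hlg_ootf(1.0, 1000.0)

# Stand-in tone mapping: the brightest value now sits at 500 cd/m^2.
mapped = decoded * 500.0 / 1000.0

# Option 1: re-encode with the inverse of the same 1000 cd/m^2 OOTF.
# The result falls well short of the top of the scene-referred range.
same_ootf = hlg_inverse_ootf(mapped, 1000.0)

# Option 2: re-encode with an inverse OOTF tuned for 500 cd/m^2.
# The brightest value lands back on the top of the range.
retuned = hlg_inverse_ootf(mapped, 500.0)

# A player that reads the updated 500 cd/m^2 tag decodes it consistently.
played_back = hlg_ootf(retuned, 500.0)
```

Here `retuned` comes out at the top of the range and `played_back` at
500 cd/m^2, while `same_ootf` only uses part of the range. The exact
value of `same_ootf` depends on the OOTF gamma, so it need not match the
rounded figure used in the prose above.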

> I don't find anything about it. What metadata in HEVC indicate display-referred?
> Any display-referred HLG video sample?

As mentioned, the HLG EOTF by definition requires transforming to
display-referred space. The mastering display metadata *is* what
describes how this (definitively display-referred) space behaves. So
when decoding HLG, you use the tagged mastering metadata's peak as the
parametrization for the EOTF. (This is what e.g. mpv and VLC do)

For a better explanation of this (admittedly confusing) topic, see Annex
1 of ITU-R Recommendation BT.2100.

Here is a relevant excerpt: http://0x0.st/se7O.png
