[FFmpeg-devel] [FFmpeg-devel, RFC] lavfi: add opencl tonemap filter.

Niklas Haas ffmpeg at haasn.xyz
Fri May 4 21:00:28 EEST 2018

Hello Ruiling,

On Fri,  4 May 2018 15:32:58 +0800, Ruiling Song <ruiling.song at intel.com> wrote:
> It basically does hdr to sdr conversion with tonemapping.
> Signed-off-by: Ruiling Song <ruiling.song at intel.com>
> ---
> This patch tries to add a filter to do hdr to sdr conversion with tonemapping.
> The filter does all the job of tonemapping in one pass, which is quite different from vf_tonemap.c.
> I chose this way because I think it would introduce fewer memory accesses.
> And I found that tonemapping shares lots of code with colorspace conversion,
> so I moved the color-space-related code into separate files (both the OpenCL kernel and the host code).
> I am not sure whether the design is OK.
> Would anybody like to give some comments on the overall design or implementation details?
> Thanks!
> Ruiling

As the original author of the tone mapping code that inspired vf_tonemap
and (by proxy) vf_tonemap_opencl, I can provide a handful of comments.

> +float3 map_one_pixel_rgb(float3 rgb, float peak) {
> +    // de-saturate
> +    float luma = get_luma(rgb.x, rgb.y, rgb.z);
> +    float overbright = max(luma - 2.0f, 1e-6f) / max(luma, 1e-6f);
> +    rgb = mix(rgb, (float3)luma, (float3)overbright);
> +
> +    float sig = max(max(rgb.x, max(rgb.y, rgb.z)), 1e-6f);
> +    float sig_old = sig;
> +    sig = TONE_FUNC(sig, peak);
> +    rgb *= (sig/sig_old);
> +    return rgb;
> +}

I consider this desaturation algorithm outdated. It works, but I think
it produces lower quality results than this one:

  float sig = max(max(rgb.x, max(rgb.y, rgb.z)), 1e-6f);
  float luma = get_luma(rgb.x, rgb.y, rgb.z);
  float coeff = max(sig - 0.18f, 1e-6f) / max(sig, 1e-6f);

  const float desaturation_coefficient = 0.5f; // or configurable!
  coeff = pow(coeff, 10.0f / desaturation_coefficient);
  rgb = mix(rgb, (float3)luma, coeff);
  sig = mix(sig, luma, coeff);

  // do the rest of tone-mapping on `sig`
  float sig_old = sig;

Basically, I've done the following:

- Calculate the overbright coefficient on `sig` rather than `luma` alone
- Instead of using an offset of 2.0f, use an offset of 0.18f
- Take the coefficient to a high exponent (lower exponent = more
  desaturation, which is why I invert the user-configurable parameter)

Since the new code needs to read `sig`, we also have to move up the
`sig` calculation and then correctly adjust it afterwards as well.


More importantly, this algorithm is missing something that I now
consider very important, and which would align well with OpenCL: source
brightness detection. Just doing the tone-mapping "blind" like this
works to some extent, but I think the best results require also
adjusting the exposure in order to compensate for Hollywood's tendency
to ship poorly mastered, over-illuminated HDR material.

The basic premise is to calculate both the peak brightness as well as
the average brightness on a frame-by-frame basis, and incorporate those
measured values in the algorithm, in order to re-normalize overly bright
scenes to correspond to a typical SDR average of around 0.25. In
addition to this, calculating the peak explicitly allows us to exactly
tailor the hable() function to this specific frame, even if the
mastering metadata is missing or useless (which it almost always is).

Doing this in OpenCL would essentially require implementing a big
map-reduce to keep track of, respectively, the sum and the max of each
pixel's brightness. In addition to this, I recommend averaging the
results over a few frames (I like to use around one second), with the
caveat that this is best paired with at least a naive scene change
detection heuristic, to make sure the averaging window gets reset on a
scene change.
> +static double determine_signal_peak(AVFrame *in)
> +{
> +    AVFrameSideData *sd = av_frame_get_side_data(in, AV_FRAME_DATA_CONTENT_LIGHT_LEVEL);
> +    double peak = 0;
> +
> +    if (sd) {
> +        AVContentLightMetadata *clm = (AVContentLightMetadata *)sd->data;
> +        peak = clm->MaxCLL;
> +    }
> +
> +    sd = av_frame_get_side_data(in, AV_FRAME_DATA_MASTERING_DISPLAY_METADATA);
> +    if (!peak && sd) {
> +        AVMasteringDisplayMetadata *metadata = (AVMasteringDisplayMetadata *)sd->data;
> +        if (metadata->has_luminance)
> +            peak = av_q2d(metadata->max_luminance);
> +    }
> +
> +    /* smpte2084 needs the side data above to work correctly
> +     * if missing, assume that the original transfer was arib-std-b67 */
> +    if (!peak)
> +        peak = 1200;
> +
> +    return peak;
> +}

This seems strange. For a source without peak tagging, you should
probably be deciding based on the video frame's tagged transfer function
(ST.2084 or STD-B67). If it's the former, default to 10,000 rather than
the 1200 assumed here.
> +    switch(input_frames_ctx->sw_format) {
> +    case AV_PIX_FMT_P010:
> +        err = launch_kernel(avctx, ctx->kernel, output, input, peak);
> +        if (err < 0) goto fail;
> +        break;
> +    default:
> +        av_log(ctx, AV_LOG_ERROR, "unsupported format in tonemap_opencl.\n");
> +        err = AVERROR(ENOSYS);
> +        goto fail;
> +    }

This seems a bit needlessly restrictive? At the very least, I would
expect high-bit RGB and ideally float formats to also be supported.

Finally, I'm not entirely sure how you ingest HLG, but in the case of
HLG it's important to run the HLG OOTF (to take the content from
scene-referred to display-referred light) *before* tone-mapping, rather
than after it. I would assume this is probably handled by other FFmpeg
components but it might be worth double checking just to be sure.


As a last note, you can find my GLSL(+Vulkan) implementations of the
algorithm changes described above, as well as all of the related
color-management code and various decision logic for what values to
infer/default here:

The documentation for the tunable parameters is here:

Of specific interest are the functions `pl_shader_tone_map` (the core tone
mapping logic), `hdr_update_peak` (peak/avg detection using compute
shaders + SSBOs) and `pl_shader_color_map` (the calling code that also
does color conversions, OOTF application, etc.)

The way I implemented the logic there is fully generalized and allows
going from any combination of source (space, peak, average) to any
destination (space, peak, average), doing only the operations necessary
while making sure to compensate for brightness differences. This also
works well when viewing HDR material on HDR devices with a lower dynamic
range than the original material, even when those devices are calibrated
to SDR curves (search for `hdr_simulation` in that file).

I hope that may provide some assistance in (ultimately) making the
ffmpeg tone mapping filter as good as it can be.

Thanks for reading,
Niklas Haas
