[FFmpeg-devel] [PATCH 0/5] Add nvidia hw decode support for HEVC 4:4:4 content

Sat Oct 20 23:46:56 EEST 2018

The new video decoder hardware on Turing GPUs supports HEVC 4:4:4 content.
This patch series adds the necessary new pixel formats and implements
support in nvdec/nvenc/cuviddec.

(Since the previous post of this series, I fixed the reversed terminology
on the pixel formats)

The big discussion was about the new pixel formats. I would like to get
to a clear conclusion on this; otherwise, this patch series goes nowhere
forever, which I obviously don't want to happen, and which I don't think
is a legitimate outcome for reasonable functionality.

So, discussion points:

1) What are the new data layouts that lack representation in the existing
   pixel formats?

With the introduction of HEVC 444 support in the decoding hardware, we
now have 3 plane YUV444 containing 10 or 12 bit data in an MSB packed
format. MSB packing means the layout can be interpreted as 16 bit data
transparently - equivalent to how P010 and P016 work. However, all the
pre-existing formats are LSB packed so these layouts cannot be mapped
to any of them except for YUV444P16, which is transparently compatible
in the same way that P016 is.
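
To make the packing concrete, here is a minimal standalone sketch (not
code from the patches; the helper names pack_msb and unpack_msb are made
up for illustration). It assumes each sample occupies one 16 bit word
with the unused low bits set to zero, as with P010:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: place a `depth`-bit sample in the high bits of a
 * 16 bit word (MSB packed). The low (16 - depth) bits are left as zero. */
static uint16_t pack_msb(uint16_t sample, int depth)
{
    assert(depth >= 1 && depth <= 16);
    return (uint16_t)((unsigned)sample << (16 - depth));
}

/* Hypothetical helper: recover the original sample from an MSB packed word. */
static uint16_t unpack_msb(uint16_t word, int depth)
{
    assert(depth >= 1 && depth <= 16);
    return (uint16_t)(word >> (16 - depth));
}
```

A 10 bit mid-scale sample (512) packs to 0x8000, and so does a 12 bit
mid-scale sample (2048). Both read as mid-scale when treated as plain
16 bit data, which is the sense in which the layout is transparently
compatible with YUV444P16. The same 10 bit sample stored LSB packed
would be 0x0200, which reads as nearly black at 16 bits.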

2) Why would we just not use YUV444P16 for everything?

The main reason is loss of metadata - after declaring that the data is
stored as YUV444P16, we no longer know what the actual bit depth is.
Using a format that explicitly declares the number of relevant bits
preserves that, which may be beneficial for filtering or other
transformations.
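
As a rough illustration of why the depth cannot be recovered from the
data alone, the sketch below (a hypothetical helper, not part of the
patches) computes the smallest bit depth whose MSB packed encoding could
have produced a given 16 bit word. It assumes unused low bits are zero:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: smallest bit depth consistent with `word` being an
 * MSB packed sample, i.e. 16 minus the number of trailing zero bits,
 * clamped to a minimum of 1. */
static int min_consistent_depth(uint16_t word)
{
    int depth = 16;
    while (depth > 1 && (word & 1) == 0) {
        word >>= 1;
        depth--;
    }
    return depth;
}
```

This only ever yields a lower bound. The word 0x8000 is a valid sample
at every depth from 1 to 16, so a frame of 10 bit content and a frame of
12 bit content can be bit-for-bit identical once labelled YUV444P16.
The real depth has to be carried by the format or by other metadata.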

3) Why would we just not add these new formats and move on?

There is a general reluctance to add new pixel formats without good
reason, which is fair enough, but some people have also expressed a
wish to decouple bit depth from pixel formats. This would mean
carrying that information elsewhere and having the pixel format
purely reflect the physical layout (i.e. you'd only have P016 and
YUV444P16).

4) Why not just implement this new mechanism for bit depth?

Because it's a massive amount of work. Adding new metadata like this
is a huge task that has impact throughout the codebase and also
requires all clients to be rewritten at some point. A very close
analog has been the attempt to remove YUVJ formats - these formats
indicate the choice of a particular colour range (full range), with the
same physical layout.

Despite a lot of time and a lot of work, the YUVJ removal is still not
done and it's been six months since the last patchset was proposed.
There is no reason to believe that decoupling bit depth would not be
a project of equal size and timescale - to say that this patchset is
blocked on bit depth decoupling is to say that it's blocked indefinitely;
no one even seriously articulated a desire to decouple bit depth until
this patchset raised the issue, and no one has committed to doing the
work.

It's also unclear whether anyone would seriously suggest removing P010,
which is an industry standard format - it would pointlessly harm
interoperability with other tools and clients for ffmpeg to unilaterally
declare that P010 was pointless and another mechanism should be used.

Put another way, if YUV444P10_MSB was described in an MSDN document
like P010, we'd probably not be having this conversation. (For the
curious, MSDN only describes packed 444 formats.)

5) What about splitting the difference?

One option that would reflect the relationship between P010 and P016
would be to only add YUV444P10_MSB and use YUV444P16 for 12 bit data.
This means we add one format rather than two, and the usage of them
is equivalent. Is that an interesting compromise? Doesn't bother me.
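
For illustration, the compromise amounts to a selection rule along these
lines. This is a sketch only: the enum and the function name are made up
here and are not the identifiers used in the patches or in libavutil.

```c
/* Hypothetical stand-ins for the pixel formats discussed above. */
enum sketch_fmt {
    SKETCH_FMT_NONE,
    SKETCH_FMT_YUV444P10_MSB, /* the one newly added format */
    SKETCH_FMT_YUV444P16      /* existing format, reused for 12 bit */
};

/* Hypothetical helper: pick the output format for HEVC 4:4:4 content of
 * the given bit depth under the "add one format" compromise. */
static enum sketch_fmt pick_444_format(int bit_depth)
{
    switch (bit_depth) {
    case 10: return SKETCH_FMT_YUV444P10_MSB; /* depth stays explicit */
    case 12: return SKETCH_FMT_YUV444P16;     /* depth is not recorded */
    default: return SKETCH_FMT_NONE;          /* not covered by this sketch */
    }
}
```

The cost is that 12 bit content loses its explicit depth, exactly as
12 bit 4:2:0 content does when it is carried as P016.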

6) What does bother me?

All I ultimately care about is getting this in. It's legitimate hardware
capability and I really want to make sure it's available. From that
perspective I don't care if we add 0/1/2 formats - I will do whatever
is agreed in the end. The option I am not happy with is saying that
we can only expose the hardware capabilities if we implement bit depth
decoupling - that's really saying you don't ever expect this
functionality to go in.

So, please, let's just make a decision.

Philip Langdale (5):
  avutil: Add YUV444P10_MSB and YUV444P12_MSB pixel formats
  avcodec/nvdec: Add support for decoding HEVC 4:4:4 content
  avcodec/nvdec: Explicitly mark codecs that support 444 output formats
  avcodec/cuviddec: Add support for decoding HEVC 4:4:4 content
  avcodec/nvenc: Accept YUV444P10_MSB and YUV444P12_MSB content

 Changelog                  |  1 +
 libavcodec/cuviddec.c      | 59 ++++++++++++++++++++++++++------------
 libavcodec/hevcdec.c       |  3 ++
 libavcodec/nvdec.c         | 46 +++++++++++++++++++++++------
 libavcodec/nvdec.h         |  5 +++-
 libavcodec/nvdec_h264.c    |  2 +-
 libavcodec/nvdec_hevc.c    | 10 +++++--
 libavcodec/nvdec_mjpeg.c   |  2 +-
 libavcodec/nvdec_mpeg12.c  |  2 +-
 libavcodec/nvdec_mpeg4.c   |  2 +-
 libavcodec/nvdec_vc1.c     |  2 +-
 libavcodec/nvdec_vp8.c     |  2 +-
 libavcodec/nvdec_vp9.c     |  2 +-
 libavcodec/nvenc.c         | 18 ++++++++----
 libavutil/hwcontext_cuda.c |  2 ++
 libavutil/pixdesc.c        | 48 +++++++++++++++++++++++++++++++
 libavutil/pixfmt.h         |  8 ++++++
 libavutil/version.h        |  4 +--
 18 files changed, 173 insertions(+), 45 deletions(-)

-- 
2.19.1