[FFmpeg-devel] [PATCH 1/2] add support for ROI-based encoding

Guo, Yejun yejun.guo at intel.com
Mon Dec 24 10:37:49 EET 2018



> -----Original Message-----
> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On Behalf
> Of Mark Thompson
> Sent: Friday, December 21, 2018 5:08 AM
> To: ffmpeg-devel at ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH 1/2] add support for ROI-based
> encoding
> 
> On 12/12/2018 16:26, Guo, Yejun wrote:
> > This patchset contains two patches.
> > - the first patch (this patch) finished the code and ask for upstream.
> > - the second patch is just a quick example on how to generate ROI info.
> >
> > The encoders such as libx264 support different QPs offset for
> > different MBs, it makes possible for ROI-based encoding. It makes
> > sense to add support within ffmpeg to generate/accept ROI infos and pass
> into encoders.
> >
> > Typical usage: After AVFrame is decoded, a ffmpeg filter or user's
> > code generates ROI info for that frame, and the encoder finally does
> > the ROI-based encoding. And so I choose to maintain the ROI info
> > (AVFrameROI) within AVFrame struct.
> >
> > Since the ROI info generator might more focus on the domain knowledge
> > of the interest regions, instead of the encoding detail, the
> > AVFrameROI is designed to be more friend for ffmpeg users.
> >
> > This patch just enabled the path from ffmpeg to libx264, the more
> > encoders can be added later.
> >
> > Signed-off-by: Guo, Yejun <yejun.guo at intel.com>
> > ---
> >  libavcodec/libx264.c | 56
> ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  libavutil/frame.c    |  8 ++++++++
> >  libavutil/frame.h    | 24 ++++++++++++++++++++++
> >  3 files changed, 88 insertions(+)
> >
> > ...
> > diff --git a/libavutil/frame.c b/libavutil/frame.c index
> > 9b3fb13..dbc4b0a 100644
> > --- a/libavutil/frame.c
> > +++ b/libavutil/frame.c
> > @@ -425,6 +425,13 @@ FF_DISABLE_DEPRECATION_WARNINGS
> > FF_ENABLE_DEPRECATION_WARNINGS  #endif
> >
> > +    av_buffer_unref(&dst->rois_buf);
> > +    if (src->rois_buf) {
> > +        dst->rois_buf = av_buffer_ref(src->rois_buf);
> > +        if (!dst->rois_buf)
> > +            return AVERROR(ENOMEM);
> > +    }
> > +
> >      av_buffer_unref(&dst->opaque_ref);
> >      av_buffer_unref(&dst->private_ref);
> >      if (src->opaque_ref) {
> > @@ -571,6 +578,7 @@ FF_DISABLE_DEPRECATION_WARNINGS
> > FF_ENABLE_DEPRECATION_WARNINGS  #endif
> >
> > +    av_buffer_unref(&frame->rois_buf);
> >      av_buffer_unref(&frame->hw_frames_ctx);
> >
> >      av_buffer_unref(&frame->opaque_ref);
> > diff --git a/libavutil/frame.h b/libavutil/frame.h index
> > 66f27f4..00d509d 100644
> > --- a/libavutil/frame.h
> > +++ b/libavutil/frame.h
> > @@ -193,6 +193,23 @@ typedef struct AVFrameSideData {
> >      AVBufferRef *buf;
> >  } AVFrameSideData;
> >
> > +enum AVRoiQuality {
> > +    AV_RQ_NONE = 0,
> > +    AV_RQ_BETTER = 1,
> > +    AV_RQ_BEST = 2,
> > +};
> 
> I don't think I like this enum - an integer value without directly specifying this
> meaning would seem best for future flexibility?  Negative values might be
> meaningful.  A greater range would also allow ranking if there are many
> regions, which may be needed if some are discarded (because of hardware
> constraints, for example - VAAPI makes this visible).

thanks, yes, the enum method also has disadvantages. 
I'll change the interface to export an integer value.

> 
> > +
> > +typedef struct AVFrameROI {
> > +    /* coordinates at frame pixel level.
> > +     * it will be extended internally if the codec requirs an alignment
> > +     */
> 
> What is intended to happen if regions overlap?  (In the above you have it
> using the last value in the list.)

Yes, the last value will be used if regions overlap.  
My idea is that the ROI generator is responsible to keep the order. I will add notes here.

> 
> > +    size_t top;
> > +    size_t bottom;
> > +    size_t left;
> > +    size_t right;
> > +    enum AVRoiQuality quality;
> > +} AVFrameROI;
> > +
> >  /**
> >   * This structure describes decoded (raw) audio or video data.
> >   *
> > @@ -556,6 +573,13 @@ typedef struct AVFrame {
> >      attribute_deprecated
> >      AVBufferRef *qp_table_buf;
> >  #endif
> > +
> > +    /**
> > +     * For ROI-based encoding, the number of ROI area is implied
> > +     * in the size of buf.
> > +     */
> > +    AVBufferRef *rois_buf;
> 
> This should be a new type of AVFrameSideData, not a new field in AVFrame.

thanks, will change to it.

> 
> > +
> >      /**
> >       * For hwaccel-format frames, this should be a reference to the
> >       * AVHWFramesContext describing the frame.
> >
> 
> - Mark
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


More information about the ffmpeg-devel mailing list