[FFmpeg-devel] [PATCH] add support for ROI-based encoding

Mark Thompson sw at jkqxz.net
Sun Dec 9 21:21:02 EET 2018


On 05/12/2018 09:58, Guo, Yejun wrote:
> this patch is not ask for merge, it is more to get a feature feedback.
> 
> The encoders such as libx264 support different QPs offset for different MBs,
> it makes possible for ROI-based encoding. It makes sense to add support
> within ffmpeg to generate/accept ROI infos and pass into encoders.
> 
> Typical usage: After AVFrame is decoded, a ffmpeg filter or user's code
> generates ROI info for that frame, and the encoder finally does the
> ROI-based encoding. And so I choose to maintain the ROI info within
> AVFrame struct.
> 
> TODO:
> - remove code in vf_scale.c, it is just an example to generate ROI info
> - use AVBufferRef instead of current implementation within AVFrame struct.
> - add other encoders support
> 
> Signed-off-by: Guo, Yejun <yejun.guo at intel.com>
> ---
>  libavcodec/libx264.c   | 35 +++++++++++++++++++++++++++++++++++
>  libavfilter/vf_scale.c |  8 ++++++++
>  libavutil/frame.c      |  9 +++++++++
>  libavutil/frame.h      | 14 ++++++++++++++
>  4 files changed, 66 insertions(+)

This approach seems reasonable to me; some thoughts below.

> ...
> diff --git a/libavutil/frame.c b/libavutil/frame.c
> index 9b3fb13..9c38bdd 100644
> --- a/libavutil/frame.c
> +++ b/libavutil/frame.c
> @@ -425,6 +425,15 @@ FF_DISABLE_DEPRECATION_WARNINGS
>  FF_ENABLE_DEPRECATION_WARNINGS
>  #endif
>  
> +    dst->nb_rois = src->nb_rois;
> +    for (int i = 0; i < dst->nb_rois; ++i) {
> +        dst->rois[i].top = src->rois[i].top;
> +        dst->rois[i].bottom = src->rois[i].bottom;
> +        dst->rois[i].left = src->rois[i].left;
> +        dst->rois[i].right = src->rois[i].right;
> +        dst->rois[i].qoffset = src->rois[i].qoffset;
> +    }
> +
>      av_buffer_unref(&dst->opaque_ref);
>      av_buffer_unref(&dst->private_ref);
>      if (src->opaque_ref) {
> diff --git a/libavutil/frame.h b/libavutil/frame.h
> index 66f27f4..b245a90 100644
> --- a/libavutil/frame.h
> +++ b/libavutil/frame.h
> @@ -193,6 +193,15 @@ typedef struct AVFrameSideData {
>      AVBufferRef *buf;
>  } AVFrameSideData;
>  
> +
> +typedef struct AVFrameROI {
> +    size_t top;
> +    size_t bottom;
> +    size_t left;
> +    size_t right;

Do you want some additional constraints on this?  For example, must be integers in every plane (so even values only for 4:2:0), or maybe even must match some codec-specific block size.

> +    float qoffset;

Everything else uses integers here, so this should be an integer.  The meaning of it will probably be entirely codec-dependent.

> +} AVFrameROI;
> +
>  /**
>   * This structure describes decoded (raw) audio or video data.
>   *
> @@ -556,6 +565,11 @@ typedef struct AVFrame {
>      attribute_deprecated
>      AVBufferRef *qp_table_buf;
>  #endif
> +
> +    //TODO: AVBufferRef*
> +    AVFrameROI rois[2];
> +    size_t nb_rois;

This should be side-data (which is automatically refcounted) rather than included directly in the AVFrame.

> +
>      /**
>       * For hwaccel-format frames, this should be a reference to the
>       * AVHWFramesContext describing the frame.
> 

Thanks,

- Mark


More information about the ffmpeg-devel mailing list