[FFmpeg-devel] [PATCH] libavfilter/unsharp: add opencl unsharp filter

Stefano Sabatini stefasab at gmail.com
Mon Apr 22 10:21:59 CEST 2013


On date Sunday 2013-04-21 17:10:38 +0800, Wei Gao encoded:
> Hi,
> Thanks for your reviewing, Stefano, some questions as follows:
> 
> 
> 2013/4/20 Stefano Sabatini <stefasab at gmail.com>
> 
> > On date Friday 2013-04-19 19:46:56 +0800, Wei Gao encoded:
> > >
> >
> > > From 995d82ac35f374b836629a0592debc57f0a97778 Mon Sep 17 00:00:00 2001
> > > From: highgod0401 <highgod0401 at gmail.com>
> > > Date: Fri, 19 Apr 2013 19:43:01 +0800
> > > Subject: [PATCH] add opencl unsharp filter
> > >
> > > ---
> > >  libavfilter/Makefile            |   2 +-
> > >  libavfilter/opencl_allkernels.c |   5 +-
> > >  libavfilter/unsharp.h           |  79 +++++++++++
> > >  libavfilter/unsharp_kernel.h    | 134 ++++++++++++++++++
> > >  libavfilter/unsharp_opencl.c    | 301
> > ++++++++++++++++++++++++++++++++++++++++
> > >  libavfilter/unsharp_opencl.h    |  34 +++++
> > >  libavfilter/vf_unsharp.c        |  87 +++++++-----
> >
> >
> > > +#endif /* AVFILTER_UNSHARP_H */
> > > diff --git a/libavfilter/unsharp_kernel.h b/libavfilter/unsharp_kernel.h
> > > new file mode 100644
> > > index 0000000..67fa6c0
> > > --- /dev/null
> > > +++ b/libavfilter/unsharp_kernel.h
> >
> > I think this should be renamed:
> > unsharp_opencl_kernel.h
> >
> > (same for deshake_kernel.h)
> >

> Could I rename the files(deshake_kernel.h and unsharp_kernel.h) in another
> patch later? Because if I rename now, I have to submit a new patch for
> deshake.

Yes.
 
> >
> > > +    int global_id = get_global_id(0);
> >
> > size_t global_id_size may be a better name
> >
> It is kernel code which will be compiled in GPU, and there seems no data
> type named size_t. 
> >
> > > +    int i, j, x, y, temp_src_stride, temp_dst_stride, temp_height,
> > temp_width, temp_steps_x, temp_steps_y,
> > > +        temp_amount, temp_scalebits, temp_halfscale, sum, idx_x, idx_y,
> > temp, res;
> > > +    if (global_id < width*height) {
> >
> >
> > > +#define UNSHARP_OPENCL_CHECK(method, ...)
> >                                    \
> > > +    status = method(__VA_ARGS__);
> >                                    \
> > > +    if (status != CL_SUCCESS) {
> >                                    \
> > > +        av_log(ctx, AV_LOG_ERROR, "error %s %d\n", # method, status);
> >                                    \
> >
> > Usual note on an av_opencl_ public error API.
> >
> The function is used as a intra function of opencl utils. can I add the api
> in another patch later? And then modify the deshake and unsharp file?

Sure.
 
> >
> > > +        return AVERROR_EXTERNAL;
> >                                     \
> > > +    }
> > > +
> > > +#define UNSHARP_OPENCL_SET_KERNEL_ARG(arg_ptr)
> >                                     \
> > > +    status =
> > clSetKernelArg((kernel),(arg_no++),(sizeof(arg_ptr)),(void*)(&(arg_ptr)));
> >                    \
> > > +    if (status != CL_SUCCESS) {
> >                                    \
> > > +        av_log(ctx, AV_LOG_ERROR, "cannot set kernel argument: %d\n",
> > status );                            \
> > > +        return AVERROR_EXTERNAL;
> >                                     \
> > > +    }
> >
> > These macros are mostly duplicated from deshake_opencl.c, so we think
> > to factorize them to a common file (libavfilter/opencl_internal.h
> > could be fine).
> >
> Yes, I agree with you, I defined them in the previous patch but there seems
> some problem because the parameter status, kernel, arg_no must be  declared
> in the function which use the macro. Can you give me some advice? Thanks.

Don't know, in case the macro is going to be used in other parts of
the code (as I suspect), then maybe it makes sense to define them in a
public header in libavutil. The problem with that approach is than we
can't change it later, so another approach would be to use an internal
header (libavutil/opencl_internal.h) which is not published and used
only internally. Michael can probably give a better advice.

> 
> >
> > > +
> > > +static void add_mask_counter(uint32_t *dst, uint32_t *counter1,
> > uint32_t *counter2, int len)
> > > +{
> > > +    int i;
> > > +    for (i = 0; i < len; i++) {
> > > +        dst[i] = counter1[i] + counter2[i];
> > > +    }
> > > +}
> > > +
> >
> > > +    ret = av_opencl_buffer_create(&unsharp->opencl_ctx.cl_chroma_mask,
> > > +                                  sizeof(uint32_t) * (2 *
> > unsharp->chroma.steps_x + 1) * (2 * unsharp->chroma.steps_y + 1),
> > > +                                  CL_MEM_READ_ONLY, NULL);
> > > +    if (ret < 0)
> > > +        return ret;
> > > +    ret = generate_mask(ctx);
> > > +    if (ret < 0)
> > > +        return ret;
> >
> > > +    unsharp->opencl_ctx.plane_num = PLANE_NUM;
> >
> > potentially unsafe if we add support to a different number of planes
> >

> where can I get the plane number? should I have to detect data0,data1,
> data2? I want to add share buffer in the later patch, maybe use data4 to
> pass GPU buffer, so I don't think it is a good idea.

In pixdesc.h you can find some utilities to return the number of
planes given the pixel format.

> 
> >
> > > +    if (!unsharp->opencl_ctx.kernel_env.kernel) {
> >
> >
> > > +        unsharp->opencl_ctx.in_plane_size[0]  = (in->linesize[0] *
> > in->height);
> > > +        unsharp->opencl_ctx.in_plane_size[1]  = (in->linesize[1] * ch);
> > > +        unsharp->opencl_ctx.in_plane_size[2]  = (in->linesize[2] * ch);
> > > +        unsharp->opencl_ctx.out_plane_size[0] = (out->linesize[0] *
> > out->height);
> > > +        unsharp->opencl_ctx.out_plane_size[1] = (out->linesize[1] * ch);
> > > +        unsharp->opencl_ctx.out_plane_size[2] = (out->linesize[2] * ch);
> > > +        unsharp->opencl_ctx.cl_inbuf_size  =
> > unsharp->opencl_ctx.in_plane_size[0] +
> > > +
> > unsharp->opencl_ctx.in_plane_size[1] +
> > > +
> > unsharp->opencl_ctx.in_plane_size[2];
> > > +        unsharp->opencl_ctx.cl_outbuf_size =
> > unsharp->opencl_ctx.out_plane_size[0] +
> > > +
> > unsharp->opencl_ctx.out_plane_size[1] +
> > > +
> > unsharp->opencl_ctx.out_plane_size[2];
> >
> > looks like another good candidate for factorization
> >
> Sorry, I don't get what you mean.

This could be factorized with a function or macro (as it is used
almost unchanged in deshake as well), but feel free to ignore the
comment for now.

> >
> > >
> > >  static void apply_unsharp(      uint8_t *dst, int dst_stride,
> > >                            const uint8_t *src, int src_stride,
> > > -                          int width, int height, FilterParam *fp)
> > > +                          int width, int height, UnsharpFilterParam *fp)
> > >  {
> > >      uint32_t **sc = fp->sc;
> > >      uint32_t sr[MAX_MATRIX_SIZE - 1], tmp1, tmp2;
> > > @@ -131,7 +107,27 @@ static void apply_unsharp(      uint8_t *dst, int
> > dst_stride,
> > >      }
> > >  }
> > >
> > > -static void set_filter_param(FilterParam *fp, int msize_x, int msize_y,
> > float amount)
> > > +static int apply_unsharp_c(AVFilterContext *ctx, AVFrame *in, AVFrame
> > *out)
> >
> > void since the return value is ignored.
> > Also some constness may help:
> >
> > static void apply_unsharp_c(AVFilterContext *ctx, const AVFrame *in,
> > AVFrame *out)
> >
> > I want use the function pointer for both C code and opencl code, the
> function of opencl returns int value because of there maybe some errors if
> use opencl APIs.

OK.

> 
> >
> > > +static void set_filter_param(UnsharpFilterParam *fp, int msize_x, int
> > msize_y, float amount)
> >
> >
> > please update docs for unsharp in doc/filters.texi
> >
> > Looks overall quite good. Do you have any benchmark to show?
> >
> Compn told me how to create the benchmark before, but sorry that I forgot
> it. Can you tell me again? I will record it, thanks.

You can show the total time used to transcode a file, see also the
time utility and the -benchmark ffmpeg option. More fine-grained
per-function benchmarks can be also obtained by using
libavutil/timer.h.

Also we need to know if the two implementations are bit-exact, there
are several tools for checking that, showinfo and the md5 protocol and
muxer for example, let me know if you need a concrete example.
-- 
FFmpeg = Fancy and Fostering Meaningful Powered Enchanting Gangster


More information about the ffmpeg-devel mailing list