[FFmpeg-devel] [PATCH] libavutil/libavfilter:add opencl warpper and opencl deshake filter

Wei Gao highgod0401 at gmail.com
Wed Mar 20 06:51:47 CET 2013


2013/3/20 Stefano Sabatini <stefasab at gmail.com>

> On date Monday 2013-03-18 11:37:58 +0800, Wei Gao encoded:
> > Hi,
> > Stefano Sabatini, thank you for your comments.And there are some
> questions
> > and some explanations as follows:
> >
> > Thanks
> > Best regards
> >
> > 2013/3/18 Stefano Sabatini <stefasab at gmail.com>
> >
> > > On date Sunday 2013-03-10 22:20:33 +0800, Wei Gao encoded:
> > > > Hi,
> > > >
> > > > Stefano, the attachment is the patch modified according to your
> comments.
> > > [...]
> > >
> > > > From dd254c133be45ebe07cfd3ae0ebf14d9c6fa151f Mon Sep 17 00:00:00
> 2001
> > > > From: highgod0401 <highgod0401 at gmail.com>
> > > > Date: Sun, 10 Mar 2013 22:16:12 +0800
> > > > Subject: [PATCH] add opencl warpper and opencl deshake filter
> > >
> > > wrapper
> > >
> > > >
> > > > ---
> > > >  configure                      |   5 +
> > > >  libavfilter/Makefile           |   2 +
> > > >  libavfilter/allfilters.c       |  21 +
> > > >  libavfilter/deshake_kernel.h   | 201 ++++++++++
> > > >  libavfilter/transform_opencl.c | 155 +++++++
> > > >  libavfilter/transform_opencl.h |  40 ++
> > > >  libavfilter/vf_deshake.c       | 222 ++++++++++
> > > >  libavutil/Makefile             |   4 +
> > > >  libavutil/opencl.c             | 892
> > > +++++++++++++++++++++++++++++++++++++++++
> > > >  libavutil/opencl.h             | 246 ++++++++++++
> > >
> > > Please split the patch in two distinct lavu and lavfi patches.
> > >
>
> > About this comment, you have answerd in previous mail, and you said they
> > can be submitted in one patch because the deshake opencl patch must use
> the
> > opencl wrapper code.
>
> On the other hand I think smaller patches are better, since this will
> simplify review (reviewing a small patches takes less time). Also for
> detecting regressions it is better to have two distinct patches,
> especially when the belong to two distinct libraries.
>
> [...]
> > > > +kernel void avfilter_transform(global  unsigned char *src,
> > > > +                               global  unsigned char *dst,
> > > > +                               global          float *matrix,
> > > > +                               global          float *matrix2,
> > > > +                                                 int interpolate,
> > > > +                                                 int fillmethod,
> > > > +                                                 int src_stride_lu,
> > > > +                                                 int dst_stride_lu,
> > > > +                                                 int src_stride_ch,
> > > > +                                                 int dst_stride_ch,
> > > > +                                                 int height,
> > > > +                                                 int width,
> > > > +                                                 int ch,
> > > > +                                                 int cw)
> > > > +{
> > > > +     int global_id = get_global_id(0);
> > > > +
> > > > +     global unsigned char *dst_y = dst;
> > > > +     global unsigned char *dst_u = dst_y + height * dst_stride_lu;
> > > > +     global unsigned char *dst_v = dst_u + ch * dst_stride_ch;
> > > > +
> > > > +     global unsigned char *src_y = src;
> > > > +     global unsigned char *src_u = src_y + height * src_stride_lu;
> > > > +     global unsigned char *src_v = src_u + ch * src_stride_ch;
> > > > +
> > > > +     global unsigned char *tempdst;
> > > > +     global unsigned char *tempsrc;
> > >
> > > sorry for the ignorance, what is "global" used for?
> > >
> > This is an OpenCL memory type defined in  "OpenCL Specification
> > Version: 1.2 ",it
> > defined that "A memory region accessible to all work-items executing in a
> > context. It is accessible to the host using commands such as read, write
> > and map"
>
> Thanks for explaining.
>
> [...]
> > > > +static int filter_frame_opencl(AVFilterLink *link, AVFilterBufferRef
> > > *in)
> > > > +{
> > > > +    DeshakeContext *deshake = link->dst->priv;
> > > > +    AVFilterLink *outlink = link->dst->outputs[0];
> > > > +    AVFilterBufferRef *out;
> > >
> > > Uhm this needs to be updated after the recent merge (you should
> > > directly make use of AVFrame).
> > >
> > should I need to get the latest git version and merge my code to it then
> > submit it?
>
> This is usually a good idea, otherwise the committer will have to do
> the work. Fortunately the changes required for the update should be
> minimal.
>
> [...]
> > > > +typedef struct OpenclErrorMsg {
> > > > +    int err_code;
> > > > +    const char *err_str;
> > > > +}OpenclErrorMsg;
> > > > +
> > > > +static OpenclErrorMsg opencl_err_msg[] = {
> > > > +        {CL_DEVICE_NOT_FOUND,                               "DEVICE
> NOT
> > > FOUND"},
> > > > +        {CL_DEVICE_NOT_AVAILABLE,                           "DEVICE
> NOT
> > > AVAILABLE"},
> > > > +        {CL_COMPILER_NOT_AVAILABLE,
> "COMPILER
> > > NOT AVAILABLE"},
> > > > +        {CL_MEM_OBJECT_ALLOCATION_FAILURE,                  "MEM
> OBJECT
> > > ALLOCATION FAILURE"},
> > > > +        {CL_OUT_OF_RESOURCES,                               "OUT OF
> > > RESOURCES"},
> > > > +        {CL_OUT_OF_HOST_MEMORY,                             "OUT OF
> > > HOST MEMORY"},
> > > > +        {CL_PROFILING_INFO_NOT_AVAILABLE,
> "PROFILING
> > > INFO NOT AVAILABLE"},
> > > > +        {CL_MEM_COPY_OVERLAP,                               "MEM
> COPY
> > > OVERLAP"},
> > > > +        {CL_IMAGE_FORMAT_MISMATCH,                          "IMAGE
> > > FORMAT MISMATCH"},
> > > > +        {CL_IMAGE_FORMAT_NOT_SUPPORTED,                     "IMAGE
> > > FORMAT NOT_SUPPORTED"},
> > > > +        {CL_BUILD_PROGRAM_FAILURE,                          "BUILD
> > > PROGRAM FAILURE"},
> > > > +        {CL_MAP_FAILURE,                                    "MAP
> > > FAILURE"},
> > > > +        {CL_MISALIGNED_SUB_BUFFER_OFFSET,
> "MISALIGNED
> > > SUB BUFFER OFFSET"},
> > > > +        {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST,      "EXEC
> > > STATUS ERROR FOR EVENTS IN WAIT LIST"},
> > > > +        {CL_COMPILE_PROGRAM_FAILURE,                        "COMPILE
> > > PROGRAM FAILURE"},
> > > > +        {CL_LINKER_NOT_AVAILABLE,                           "LINKER
> NOT
> > > AVAILABLE"},
> > > > +        {CL_LINK_PROGRAM_FAILURE,                           "LINK
> > > PROGRAM FAILURE"},
> > > > +        {CL_DEVICE_PARTITION_FAILED,                        "DEVICE
> > > PARTITION FAILED"},
> > > > +        {CL_KERNEL_ARG_INFO_NOT_AVAILABLE,                  "KERNEL
> ARG
> > > INFO NOT AVAILABLE"},
> > > > +        {CL_INVALID_VALUE,                                  "INVALID
> > > VALUE"},
> > > > +        {CL_INVALID_DEVICE_TYPE,                            "INVALID
> > > DEVICE TYPE"},
> > > > +        {CL_INVALID_PLATFORM,                               "INVALID
> > > PLATFORM"},
> > > > +        {CL_INVALID_DEVICE,                                 "INVALID
> > > DEVICE"},
> > > > +        {CL_INVALID_CONTEXT,                                "INVALID
> > > CONTEXT"},
> > > > +        {CL_INVALID_QUEUE_PROPERTIES,                       "INVALID
> > > QUEUE PROPERTIES"},
> > > > +        {CL_INVALID_COMMAND_QUEUE,                          "INVALID
> > > COMMAND QUEUE"},
> > > > +        {CL_INVALID_HOST_PTR,                               "INVALID
> > > HOST PTR"},
> > > > +        {CL_INVALID_MEM_OBJECT,                             "INVALID
> > > MEM OBJECT"},
> > > > +        {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR,                "INVALID
> > > IMAGE FORMAT DESCRIPTOR"},
> > > > +        {CL_INVALID_IMAGE_SIZE,                             "INVALID
> > > IMAGE SIZE"},
> > > > +        {CL_INVALID_SAMPLER,                                "INVALID
> > > SAMPLER"},
> > > > +        {CL_INVALID_BINARY,                                 "INVALID
> > > BINARY"},
> > > > +        {CL_INVALID_BUILD_OPTIONS,                          "INVALID
> > > BUILD OPTIONS"},
> > > > +        {CL_INVALID_PROGRAM,                                "INVALID
> > > PROGRAM"},
> > > > +        {CL_INVALID_PROGRAM_EXECUTABLE,                     "INVALID
> > > PROGRAM EXECUTABLE"},
> > > > +        {CL_INVALID_KERNEL_NAME,                            "INVALID
> > > KERNEL NAME"},
> > > > +        {CL_INVALID_KERNEL_DEFINITION,                      "INVALID
> > > KERNEL DEFINITION"},
> > > > +        {CL_INVALID_KERNEL,                                 "INVALID
> > > KERNEL"},
> > > > +        {CL_INVALID_ARG_INDEX,                              "INVALID
> > > ARG INDEX"},
> > > > +        {CL_INVALID_ARG_VALUE,                              "INVALID
> > > ARG VALUE"},
> > > > +        {CL_INVALID_ARG_SIZE,                               "INVALID
> > > ARG_SIZE"},
> > > > +        {CL_INVALID_KERNEL_ARGS,                            "INVALID
> > > KERNEL ARGS"},
> > > > +        {CL_INVALID_WORK_DIMENSION,                         "INVALID
> > > WORK DIMENSION"},
> > > > +        {CL_INVALID_WORK_GROUP_SIZE,                        "INVALID
> > > WORK GROUP SIZE"},
> > > > +        {CL_INVALID_WORK_ITEM_SIZE,                         "INVALID
> > > WORK ITEM SIZE"},
> > > > +        {CL_INVALID_GLOBAL_OFFSET,                          "INVALID
> > > GLOBAL OFFSET"},
> > > > +        {CL_INVALID_EVENT_WAIT_LIST,                        "INVALID
> > > EVENT WAIT LIST"},
> > > > +        {CL_INVALID_EVENT,                                  "INVALID
> > > EVENT"},
> > > > +        {CL_INVALID_OPERATION,                              "INVALID
> > > OPERATION"},
> > > > +        {CL_INVALID_GL_OBJECT,
>  "INVALID GL
> > > OBJECT"},
> > > > +        {CL_INVALID_BUFFER_SIZE,                            "INVALID
> > > BUFFER SIZE"},
> > > > +        {CL_INVALID_MIP_LEVEL,                              "INVALID
> > > MIP LEVEL"},
> > > > +        {CL_INVALID_GLOBAL_WORK_SIZE,                       "INVALID
> > > GLOBAL WORK SIZE"},
> > > > +        {CL_INVALID_PROPERTY,                               "INVALID
> > > PROPERTY"},
> > > > +        {CL_INVALID_IMAGE_DESCRIPTOR,                       "INVALID
> > > IMAGE DESCRIPTOR"},
> > > > +        {CL_INVALID_COMPILER_OPTIONS,                       "INVALID
> > > COMPILER OPTIONS"},
> > > > +        {CL_INVALID_LINKER_OPTIONS,                         "INVALID
> > > LINKER OPTIONS"},
> > > > +        {CL_INVALID_DEVICE_PARTITION_COUNT,                 "INVALID
> > > DEVICE PARTITION COUNT"},
> > >
> > > This is going to be a pain to maintain. Rather return a generic error
> > > message (OpenCL error with code %d occurred) in the code.
> > >
> > > Ideally the library should provide error string utilities (e.g. like
> > > the ones in libavutil/error.h) so that it is not required to keep them
> > > in sync in each and every single one OpenCL application.
> > >
> > These error code are defined in OpenCL Specification Version: 1.2, and
> they
> > will not changed frequency,so it may easy to maintain.And in your
> pervious
> > mail, you have suggested that:
> >  "Something like this should be good:
> > if (status != CL_SUCCESS) {
> >     av_log(&openclwrapperutils, AV_LOG_ERROR, "Error creating OpenCL
> > kernel: %s", opencl_errstr(status));"
> > I also agree that the string is a better way.
>
> I leave the choice to you. But in a global perspective it would be
> better if the OpenCL implementation supported this (think if all C
> application had to implement their own version of strerror_r()).
>
> Do you know where it is possible to file feature requests related to
> OpenCL?
>
> [...]
> > > > +++ b/libavutil/opencl.h
> > > > @@ -0,0 +1,246 @@
> > > > +/*
> > > > + * Copyright (C) 2012 Peng Gao <peng at multicorewareinc.com>
> > > > + * Copyright (C) 2012 Li   Cao <li at multicorewareinc.com>
> > > > + * Copyright (C) 2012 Wei  Gao <weigao at multicorewareinc.com>
> > > > + *
> > > > + * This file is part of FFmpeg.
> > > > + *
> > > > + * FFmpeg is free software; you can redistribute it and/or
> > > > + * modify it under the terms of the GNU Lesser General Public
> > > > + * License as published by the Free Software Foundation; either
> > > > + * version 2.1 of the License, or (at your option) any later
> version.
> > > > + *
> > > > + * FFmpeg is distributed in the hope that it will be useful,
> > > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > + * Lesser General Public License for more details.
> > > > + *
> > > > + * You should have received a copy of the GNU Lesser General Public
> > > > + * License along with FFmpeg; if not, write to the Free Software
> > > > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> > > 02110-1301 USA
> > > > + */
> > > > +
> > > > +#include "config.h"
> > > > +
> > > > +#ifndef LIBAVUTIL_OPENCLWRAPPER_H
> > > > +#define LIBAVUTIL_OPENCLWRAPPER_H
> > > > +
> > > > +#include <CL/cl.h>
> > > > +
> > >
> > > > +#define AV_OPENCL_CHECK(method, ...)\
> > > > +    status = method( __VA_ARGS__ ); if( status != CL_SUCCESS ) {\
> > > > +        av_log(NULL,AV_LOG_ERROR, " error %s %d\n", # method,
> status );
> > >  return status; }
> > > > +
> > > > +#define AV_OPENCL_SET_KERNEL_ARG(arg_ptr)\
> > > > +    status =
> > >
> clSetKernelArg((kernel),(arg_no++),(sizeof(arg_ptr)),(void*)(&(arg_ptr)));if(
> > > status != CL_SUCCESS ) {\
> > > > +        av_log(NULL,AV_LOG_ERROR, " error %s %d\n",
> "clSetKernelArg",
> > > status );  return status; }
> > >
> > > macros assuming a certain variable in the context are not acceptable
> > > in a public header. Please move them to the place where they are
> > > defined, they can be changed later if the need arises.
> > >
>
> > every opencl filer maybe need this to easy write code and check error
> > code,if move to the filter file, every opencl filter may define them.
>
> The principle is that we should avoid to overdesign things. Since they
> are used only in one file, I think it's better to define them
> there. Then we can move the macros to a common header when the need
> arises.
>
> My problem with these macros is that they rely on the presence of
> specific variables ("status", "kernel") so this is not very
> robust/handy to be included in a public header. For example the user
> may have already a "status" variable used for an entirely different
> purpose.
>
> > >
> > > > +
> > > > +
> > > > +#define FF_OPENCL_KERNEL( ... )# __VA_ARGS__
> > >
> > > FF_ if reserved for internal API, use AV_
> > >
> > > > +
> > > > +typedef struct AVOpenCLKernelEnv {
> > > > +    cl_context context;
> > > > +    cl_command_queue command_queue;
> > > > +    cl_program program;
> > > > +    cl_kernel kernel;
> > > > +    char kernel_name[150];
> > > > +}AVOpenCLKernelEnv;
> > > > +
> > > > +typedef struct AVOpenCLExternalInfo {
> > > > +    cl_platform_id platform;
> > > > +    cl_device_type device_type;
> > > > +    cl_context context;
> > >
> > > > +    cl_device_id *devices_id;
> > > > +    cl_device_id  dev;
> > >
> > > device_ids
> > > device_id
> > >
> > > may be better names
> > >
> > > > +    cl_command_queue command_queue;
> > > > +    char *platform_name;
> > > > +}AVOpenCLExternalInfo;
> > > > +
> > > > +/**
> > > > + * user defined, this is function wrapper which is used to set the
> > > input parameters.
> > > > + * launch kernel and copy data from GPU to CPU or CPU to GPU.
> > > > + */
> > >
> > > User defined function, used to set the input parameter in the kernel
> > > environment. This function launches kernel and copies data from GPU to
> > > CPU, or from CPU to GPU.
> > >
> > > ...
> > >
> > > I'm not yet sure what "to launch kernel" means.
> > >
>
> > The program will use it to run a kernel the user data is the data witch
> > will be trasform to GPU to compute and the OpenCL kernel info is
> > in AVOpenCLKernelEnv *kenv
>
> May be "Load kernel" a better description?
>
> [...]
> > > > +
> > > > +/**
> > > > + *  Save OpenCL buffer as share buffer.
> > >
> > > what is a share buffer?
> > >
>
> > It is an opencl buffer that avoid the copy data between GPU and CPU, the
> > opencl filter will detect whether the next filter is opencl fiter, if
> yes,
> > it will save the opencl buffer and don't copy the processed data to
> memory,
> > and the next filter will detect whether the previous data is opencl
> filter,
> > if yes, it will get the buffer porinter and use the data directly.So it
> can
> > avoid the copy operation GPU-CPU-GPU, and it can save much time.I want to
> > remove it in this patch because there is only one opencl filter(deshake),
> > If the wrapper patch is accepted, I will add it and submit the sacle
> > filter,is it OK?
>
> Sure, it's best to start with smaller patches and add pieces as we go.
>
> About OpenCL buffers, we have a framework for working with H/W
> buffers, but I need someone who has more experience than me to comment
> on it.
>
> Sorry, I didn't see this question. What is H/W?

> > >
> > > > + *
> > > > + *@param filtername                  filter name
> > > > + *@param cl_inbuf                     OpenCL buffer
> > > > + */
> > > > +
> > > > +int av_opencl_save_buffer(const char *filtername,void *cl_inbuf,int
> > > buf_size);
> > > > +
> > > > +/**
> > > > + *  Get the OpenCL share buffer.
> > > > + *
> > > > + *@param filtername                  filter name
> > > > + */
> > > > +
> > > > +void *av_opencl_get_buffer(const char *filtername,int buf_size);
> > > > +
>
> > > > +/**
> > > > + *  Regist kernels.
> > >
> > > Register kernels? kernel?
> > >
> > > > + *
> > > > + *@param kernel_name                  Regist kernel name
> > > > + *@param kernel_code                   Kernel code
> > > > + */
> > > > +
> > > > +void av_opencl_regist_kernel(const char *kernel_name,const char
> > > *kernel_code);
> > >
> > > av_opencl_register_kernel, characters are cheap this days.
> > >
> > > Sorry, I don't get what your mean, Do you mean that I should update the
> > latest version?
>
> I mean that there is a mismatch between the doxy (register kernels)
> and the function name (register_kernel), and I can't say what's more
> correct.
>
> >
> > > Also this will plainly crash unless you perform some operations
> > > before, which are not documented.
> > >
>
> > Sorry, I don't get what your mean.Do you mean that I should add some code
> > to make sure the function is run correct?
>
> Ideally yes, or you specify which are the conditions which allow to
> use the function. Ideally reading the docs should be enough to get how
> it works, and it should be almost impossible for the user to get it
> wrong.
> --
> FFmpeg = Faithful and Free Mastering Proud Ecumenical Guru
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>


More information about the ffmpeg-devel mailing list