[FFmpeg-devel] [PATCH 1/2] libavutil/libavfilter: add opencl wrapper to ffmpeg

Wei Gao highgod0401 at gmail.com
Sun Mar 24 05:08:11 CET 2013


Hi,

Stefano Sabatini, thanks for your comments, and some questions and
explanations are as follows.

Thanks
Best regards

2013/3/22 Stefano Sabatini <stefasab at gmail.com>

> On date Thursday 2013-03-21 16:17:12 +0800, highgod0401 encoded:
> >
> >
> >
> >
> >
> > highgod0401
>
> > From 6acf8825090a722c634dba9049b32553f9daef42 Mon Sep 17 00:00:00 2001
> > From: highgod0401 <highgod0401 at gmail.com>
> > Date: Thu, 21 Mar 2013 15:56:29 +0800
> > Subject: [PATCH 1/2] add opencl wrapper to ffmpeg
> >
> > ---
> >  configure          |   4 +
> >  libavutil/Makefile |   3 +
> >  libavutil/opencl.c | 988
> +++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  libavutil/opencl.h | 219 ++++++++++++
> >  4 files changed, 1214 insertions(+)
> >  create mode 100644 libavutil/opencl.c
> >  create mode 100644 libavutil/opencl.h
> [...]
> > diff --git a/libavutil/opencl.c b/libavutil/opencl.c
> > new file mode 100644
> > index 0000000..c577280
> > --- /dev/null
> > +++ b/libavutil/opencl.c
> > @@ -0,0 +1,988 @@
> > +/*
> > + * Copyright (C) 2012 Peng Gao <peng at multicorewareinc.com>
> > + * Copyright (C) 2012 Li   Cao <li at multicorewareinc.com>
> > + * Copyright (C) 2012 Wei  Gao <weigao at multicorewareinc.com>
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301 USA
> > + */
> > +
> > +#include "opencl.h"
> > +#include "avstring.h"
> > +#include "log.h"
> > +
> > +
> > +#define MAX_KERNEL_STRING_LEN   64
> > +#define MAX_CLFILE_NUM 50
>
> > +#define MAX_CLKERNEL_NUM 200
> > +#define MAX_CLFILE_PATH 255
> > +#define MAX_KERNEL_NUM  50
>
> What's the difference between MAX_CLKERNEL_NUM and MAX_KERNEL_NUM?
>
MAX_KERNEL_NUM is defined but not used, I will delete it.

>
> > +#define MAX_FILTER_NAME_LEN 64
> > +#define MAX_FILTER_NUM 200
> > +#define MAX_KERNEL_SRCFILE_LEN 256
> > +
> > +
> > +typedef struct OpenCLEnv {
> > +    cl_platform_id platform;
> > +    cl_context   context;
> > +    cl_device_id devices;
> > +    cl_command_queue command_queue;
> > +} OpenCLEnv;
> > +
> > +typedef struct GPUEnv {
>
> > +    //share vb in all modules in hb library
>
> it is not obvious to me what "vb" and "hb" stand for
>
These files are referenced our other projects, I forgot to delete this
note,  sorry.

>
> > +    cl_platform_id platform;
> > +    cl_device_type device_type;
> > +    cl_context context;
> > +    cl_device_id *device_ids;
> > +    cl_device_id  device_id;
> > +    cl_command_queue command_queue;
> > +    cl_kernel kernels[MAX_CLFILE_NUM];
> > +    cl_program programs[MAX_CLFILE_NUM]; //one program object maps one
> kernel source file
> > +    char  kernel_srcfile[MAX_CLFILE_NUM][MAX_KERNEL_SRCFILE_LEN];
> //the max len of kernel file name is 256
> > +    int file_count; // only one kernel file
> > +
> > +    char kernel_names[MAX_CLKERNEL_NUM][MAX_KERNEL_STRING_LEN+1];
> > +    av_opencl_kernel_function kernel_functions[MAX_CLKERNEL_NUM];
> > +    const char *kernel_code[MAX_CLKERNEL_NUM];
> > +    int kernel_count;
>
> > +    int reg_kernel_count;
>
> more meaningful name (e.g. kernel_function_count?)
>
this variable means that the register kernel which used in the runtime.For
example there are 10 kernels has been registered, but use only 1 in
runtime. so the reg_kernel_count is 1, and kernel_count is 10, should I
rename it to runtime_kernel_count?

>
> > +    int is_user_created; // 1: created , 0:no create and needed to
> create by opencl wrapper
> > +    uint8_t *temp_buffer;
> > +    int temp_buffer_size;
> > +} GPUEnv;
> > +
> > +typedef struct FilterBufferNode {
> > +    char filter_name[MAX_FILTER_NAME_LEN+1];
> > +    void *cl_inbuf;
> > +    int buf_size;
> > +} FilterBufferNode;
> > +
> > +typedef struct OpenclUtils {
> > +    const AVClass *class;
> > +    int   log_offset;
> > +    void *log_ctx;
> > +} OpenclUtils;
> > +
> > +typedef struct OpenclErrorMsg {
> > +    int err_code;
> > +    const char *err_str;
> > +} OpenclErrorMsg;
> > +
> > +static OpenclErrorMsg opencl_err_msg[] = {
> > +        {CL_DEVICE_NOT_FOUND,                               "DEVICE NOT
> FOUND"},
> > +        {CL_DEVICE_NOT_AVAILABLE,                           "DEVICE NOT
> AVAILABLE"},
> > +        {CL_COMPILER_NOT_AVAILABLE,                         "COMPILER
> NOT AVAILABLE"},
> > +        {CL_MEM_OBJECT_ALLOCATION_FAILURE,                  "MEM OBJECT
> ALLOCATION FAILURE"},
> > +        {CL_OUT_OF_RESOURCES,                               "OUT OF
> RESOURCES"},
> > +        {CL_OUT_OF_HOST_MEMORY,                             "OUT OF
> HOST MEMORY"},
> > +        {CL_PROFILING_INFO_NOT_AVAILABLE,                   "PROFILING
> INFO NOT AVAILABLE"},
> > +        {CL_MEM_COPY_OVERLAP,                               "MEM COPY
> OVERLAP"},
> > +        {CL_IMAGE_FORMAT_MISMATCH,                          "IMAGE
> FORMAT MISMATCH"},
> > +        {CL_IMAGE_FORMAT_NOT_SUPPORTED,                     "IMAGE
> FORMAT NOT_SUPPORTED"},
> > +        {CL_BUILD_PROGRAM_FAILURE,                          "BUILD
> PROGRAM FAILURE"},
> > +        {CL_MAP_FAILURE,                                    "MAP
> FAILURE"},
> > +        {CL_MISALIGNED_SUB_BUFFER_OFFSET,                   "MISALIGNED
> SUB BUFFER OFFSET"},
> > +        {CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST,      "EXEC
> STATUS ERROR FOR EVENTS IN WAIT LIST"},
> > +        {CL_COMPILE_PROGRAM_FAILURE,                        "COMPILE
> PROGRAM FAILURE"},
> > +        {CL_LINKER_NOT_AVAILABLE,                           "LINKER NOT
> AVAILABLE"},
> > +        {CL_LINK_PROGRAM_FAILURE,                           "LINK
> PROGRAM FAILURE"},
> > +        {CL_DEVICE_PARTITION_FAILED,                        "DEVICE
> PARTITION FAILED"},
> > +        {CL_KERNEL_ARG_INFO_NOT_AVAILABLE,                  "KERNEL ARG
> INFO NOT AVAILABLE"},
> > +        {CL_INVALID_VALUE,                                  "INVALID
> VALUE"},
> > +        {CL_INVALID_DEVICE_TYPE,                            "INVALID
> DEVICE TYPE"},
> > +        {CL_INVALID_PLATFORM,                               "INVALID
> PLATFORM"},
> > +        {CL_INVALID_DEVICE,                                 "INVALID
> DEVICE"},
> > +        {CL_INVALID_CONTEXT,                                "INVALID
> CONTEXT"},
> > +        {CL_INVALID_QUEUE_PROPERTIES,                       "INVALID
> QUEUE PROPERTIES"},
> > +        {CL_INVALID_COMMAND_QUEUE,                          "INVALID
> COMMAND QUEUE"},
> > +        {CL_INVALID_HOST_PTR,                               "INVALID
> HOST PTR"},
> > +        {CL_INVALID_MEM_OBJECT,                             "INVALID
> MEM OBJECT"},
> > +        {CL_INVALID_IMAGE_FORMAT_DESCRIPTOR,                "INVALID
> IMAGE FORMAT DESCRIPTOR"},
> > +        {CL_INVALID_IMAGE_SIZE,                             "INVALID
> IMAGE SIZE"},
> > +        {CL_INVALID_SAMPLER,                                "INVALID
> SAMPLER"},
> > +        {CL_INVALID_BINARY,                                 "INVALID
> BINARY"},
> > +        {CL_INVALID_BUILD_OPTIONS,                          "INVALID
> BUILD OPTIONS"},
> > +        {CL_INVALID_PROGRAM,                                "INVALID
> PROGRAM"},
> > +        {CL_INVALID_PROGRAM_EXECUTABLE,                     "INVALID
> PROGRAM EXECUTABLE"},
> > +        {CL_INVALID_KERNEL_NAME,                            "INVALID
> KERNEL NAME"},
> > +        {CL_INVALID_KERNEL_DEFINITION,                      "INVALID
> KERNEL DEFINITION"},
> > +        {CL_INVALID_KERNEL,                                 "INVALID
> KERNEL"},
> > +        {CL_INVALID_ARG_INDEX,                              "INVALID
> ARG INDEX"},
> > +        {CL_INVALID_ARG_VALUE,                              "INVALID
> ARG VALUE"},
> > +        {CL_INVALID_ARG_SIZE,                               "INVALID
> ARG_SIZE"},
> > +        {CL_INVALID_KERNEL_ARGS,                            "INVALID
> KERNEL ARGS"},
> > +        {CL_INVALID_WORK_DIMENSION,                         "INVALID
> WORK DIMENSION"},
> > +        {CL_INVALID_WORK_GROUP_SIZE,                        "INVALID
> WORK GROUP SIZE"},
> > +        {CL_INVALID_WORK_ITEM_SIZE,                         "INVALID
> WORK ITEM SIZE"},
> > +        {CL_INVALID_GLOBAL_OFFSET,                          "INVALID
> GLOBAL OFFSET"},
> > +        {CL_INVALID_EVENT_WAIT_LIST,                        "INVALID
> EVENT WAIT LIST"},
> > +        {CL_INVALID_EVENT,                                  "INVALID
> EVENT"},
> > +        {CL_INVALID_OPERATION,                              "INVALID
> OPERATION"},
> > +        {CL_INVALID_GL_OBJECT,                              "INVALID GL
> OBJECT"},
> > +        {CL_INVALID_BUFFER_SIZE,                            "INVALID
> BUFFER SIZE"},
> > +        {CL_INVALID_MIP_LEVEL,                              "INVALID
> MIP LEVEL"},
> > +        {CL_INVALID_GLOBAL_WORK_SIZE,                       "INVALID
> GLOBAL WORK SIZE"},
> > +        {CL_INVALID_PROPERTY,                               "INVALID
> PROPERTY"},
> > +        {CL_INVALID_IMAGE_DESCRIPTOR,                       "INVALID
> IMAGE DESCRIPTOR"},
> > +        {CL_INVALID_COMPILER_OPTIONS,                       "INVALID
> COMPILER OPTIONS"},
> > +        {CL_INVALID_LINKER_OPTIONS,                         "INVALID
> LINKER OPTIONS"},
> > +        {CL_INVALID_DEVICE_PARTITION_COUNT,                 "INVALID
> DEVICE PARTITION COUNT"},
> > +};
> > +
> > +static const AVClass openclutils_class = {"OPENCLUTILS",
> av_default_item_name,
> > +                                                   NULL,
> LIBAVUTIL_VERSION_INT,
> > +
> offsetof(OpenclUtils, log_offset),
> > +
> offsetof(OpenclUtils, log_ctx)};
>
> > +static OpenclUtils openclutils = {&openclutils_class,0,NULL};
>
> static structs are initialized to 0, thus I don't think you need to set
> the other fields to 0, same below.
>
> > +static GPUEnv gpu_env = {0};
>
> > +static FilterBufferNode filter_buffer[MAX_FILTER_NUM] = {{"", NULL,0}};
>
> > +static int isinited = 0;
> > +
> > +void av_opencl_register_kernel(const char *kernel_name,const char
> *kernel_code)
> > +{
> > +    if (gpu_env.kernel_count < MAX_CLKERNEL_NUM) {
> > +        gpu_env.kernel_code[gpu_env.kernel_count] = kernel_code;
> > +
>  av_strlcpy(gpu_env.kernel_names[gpu_env.kernel_count],kernel_name,MAX_KERNEL_STRING_LEN+1);
>
> nit: ..., kernel_name, MAX_KERNEL_STRING_LEN+1
>
> that is add missing spaces after ",", here and in the rest of the code
>
> Also, I suspect that it is better to just complain to the user and
> abort in case the registered name is too long, than silently clipping
> it.
>
> > +        gpu_env.kernel_count++;
> > +    } else {
> > +
>  av_log(&openclutils,AV_LOG_ERROR,"av_opencl_register_kernel,kernel number
> is more than %d\n",MAX_CLKERNEL_NUM);
>
> "Could not register kernel with name '%s', maximum number of registered
> kernels %d already reached\n"
>
> Also the user wants to know when the function failed, so you should
> return an error code.
>
> > +    }
> > +}
>
> > +static const char* opencl_errstr(int status)
>
> nit: static const chat *opencl_errstr(int status)
>
> for consistency
>
> > +{
> > +    for (int i = 0;i < sizeof(opencl_err_msg);i++) {
> > +        if(opencl_err_msg[i].err_code == status)
> > +            return opencl_err_msg[i].err_str;
> > +    }
> > +    return "unknown error";
> > +}
> > +
> > +static int access_binaries(cl_device_id *devices, int numdevices, const
> char *cl_file_name, FILE **fhandle,
> > +                                size_t *binarysizes, char **binaries,
> int *binary_existed,int write)
> > +{
> > +    FILE *fd = NULL;
> > +    int status;
> > +    char filename[1024] = {0};
> > +    char cl_name[1024] = {0};
> > +    for (int i = 0; i < numdevices; i++) {
> > +        if (devices[i] != 0) {
> > +            char devicename[1024];
> > +            status = clGetDeviceInfo(devices[i],
> > +                                     CL_DEVICE_NAME,
> > +                                     sizeof(devicename),
> > +                                     devicename,
> > +                                     NULL);
> > +            if (status == CL_SUCCESS) {
>
> > +                av_strlcpy(cl_name,cl_file_name,1024);
>
> "1024" can be changed to "sizeof(cl_name)", so that you avoid
> inconsistencies in case the size is changed (and improved readability)
>
> > +                snprintf(filename, sizeof(filename),"./%s-%s.bin",
> cl_name, devicename);
>
> > +                if (!write) {
> > +                    fd = fopen(filename,"rb");
> > +                    if (binary_existed) {
>
> > +                        *binary_existed = (fd != NULL) ? 1 : 0;
>
> *binary_existed = !!fd;
>
>
> > +                        if (*binary_existed) {
> > +                            if (fhandle)
> > +                                *fhandle = fd;
> > +                            break;
>
> I suspect you could remove the param binary_existed, and just keep
> fhandle
>
>
> > +                        }
> > +                    }
> > +                } else {
> > +                    FILE *output = NULL;
> > +                    if (binarysizes[i] != 0) {
> > +                        output = fopen(filename, "wb");
> > +                        if (!output) {
>
> > +
>  av_log(&openclutils,AV_LOG_ERROR,"access_binaries error,write file
> error\n");
>
> Could not write file '%s'\n
>
> > +                            return AVERROR_EXTERNAL;
>
> you can return errno here, something like
>
> if (!output) {
>    int err = errno;
>    av_log(...);
>    return AVERROR(err);
> }
>
> you need to save errno since the call to av_log() may overwrite it
>
>
> > +                        }
> > +                        fwrite(binaries[i], sizeof(char),
> binarysizes[i], output);
> > +                        fclose(output);
> > +                    }
> > +                }
> > +            } else {
> > +                av_log(&openclutils,AV_LOG_ERROR,"access_binaries
> error,clGetDeviceInfo:%s\n",opencl_errstr(status));
>
> Could not get OpenCL device information: %s\n
>
> > +                return AVERROR_EXTERNAL;
>
> > +                break;
>
> unneeded
>
> > +            }
> > +        }
> > +    }
> > +    return 0;
> > +}
> > +static int check_generated_binary(cl_context context, const char *
> cl_file_name, FILE **fhandle, int *binary_existed)
>
> nit: add empty line after the previous function body
>
>
> > +{
> > +    int ret = 0;
> > +    cl_int status;
> > +    size_t numdevices;
> > +    cl_device_id *devices;
> > +    status = clGetContextInfo(context,
> > +                              CL_CONTEXT_NUM_DEVICES,
> > +                              sizeof(numdevices),
> > +                              &numdevices,
> > +                              NULL);
> > +    if (status != CL_SUCCESS){
> > +        av_log(&openclutils,AV_LOG_ERROR,"check_generated_binary
> error,clGetContextInfo:%s\n",opencl_errstr(status));
>
> Could not get OpenCL context number of devices: %s\n
>
>
> > +        return AVERROR_EXTERNAL;
> > +    }
> > +
> > +    devices = av_malloc(sizeof(cl_device_id) * numdevices);
> > +    if (!devices)
> > +        return AVERROR(ENOMEM);
> > +
> > +    /* grab the handles to all of the devices in the context. */
> > +    status = clGetContextInfo(context,
> > +                              CL_CONTEXT_DEVICES,
> > +                              sizeof(cl_device_id) * numdevices,
> > +                              devices,
> > +                              NULL);
> > +    if (status != CL_SUCCESS){
> > +        av_log(&openclutils,AV_LOG_ERROR,"check_generated_binary
> error,clGetContextInfo:%s\n",opencl_errstr(status));
>
> Could not get OpenCL context devices: %s\n
>
> > +        ret = AVERROR_EXTERNAL;
> > +        goto end;
> > +    }
> > +    ret = access_binaries(devices, numdevices,cl_file_name, fhandle,
> NULL, NULL, binary_existed, 0);
> > +end:
> > +    av_free(devices);
> > +    return ret;
> > +}
> > +
> > +static int generate_bin_from_kernel_source(cl_program program, const
> char * cl_file_name)
> > +{
> > +    int i = 0;
> > +    cl_int status;
> > +    size_t *binarysizes = NULL;
> > +    size_t numdevices;
> > +    cl_device_id *devices = NULL;
> > +    char **binaries = NULL;
> > +    int ret = 0;
>
> > +    status = clGetProgramInfo(program,
> > +                              CL_PROGRAM_NUM_DEVICES,
> > +                              sizeof(numdevices),
> > +                              &numdevices,
> > +                              NULL);
> > +    if (status != CL_SUCCESS) {
> > +
>  av_log(&openclutils,AV_LOG_ERROR,"generate_bin_from_kernel_source
> error,clGetProgramInfo:%s\n",opencl_errstr(status));
> > +        return AVERROR_EXTERNAL;
> > +    }
> > +    devices = av_mallocz(sizeof(cl_device_id) * numdevices);
> > +    if (!devices)
> > +        return AVERROR(ENOMEM);
> > +    /* grab the handles to all of the devices in the program. */
> > +    status = clGetProgramInfo(program,
> > +                              CL_PROGRAM_DEVICES,
> > +                              sizeof(cl_device_id) * numdevices,
> > +                              devices,
> > +                              NULL);
>
> this code could be possibly be factorized
>
> > +
> > +    /* figure out the sizes of each of the binaries. */
> > +    binarysizes = av_mallocz(sizeof(size_t) * numdevices);
> > +    if (!binarysizes) {
> > +        ret = AVERROR(ENOMEM);
> > +        goto end;
> > +    }
> > +
> > +    status = clGetProgramInfo(program,
> > +                              CL_PROGRAM_BINARY_SIZES,
> > +                              sizeof(size_t) * numdevices,
> > +                              binarysizes, NULL);
> > +    if (status != CL_SUCCESS) {
> > +
>  av_log(&openclutils,AV_LOG_ERROR,"generate_bin_from_kernel_source
> error,clGetProgramInfo:%s\n",opencl_errstr(status));
> > +        ret = AVERROR_EXTERNAL;
> > +        goto end;
> > +    }
>
> > +    /* copy over all of the generated binaries. */
> > +    binaries = av_mallocz(sizeof(char *) * numdevices);
> > +    if (!binaries) {
> > +        ret = AVERROR(ENOMEM);
> > +        goto end;
> > +    }
> > +    for (i = 0; i < numdevices; i++) {
> > +        if (binarysizes[i] != 0) {
> > +            binaries[i] = av_mallocz(sizeof(char) * binarysizes[i]);
> > +            if (!binaries[i]) {
> > +                ret = AVERROR(ENOMEM);
> > +                goto end;
> > +            }
> > +        }
> > +    }
> > +
> > +    status = clGetProgramInfo(program,
> > +                              CL_PROGRAM_BINARIES,
> > +                              sizeof(char *) * numdevices,
> > +                              binaries,
> > +                              NULL);
> > +    if (status != CL_SUCCESS) {
> > +
>  av_log(&openclutils,AV_LOG_ERROR,"generate_bin_from_kernel_source
> error,clGetProgramInfo:%s\n",opencl_errstr(status));
> > +        ret = AVERROR_EXTERNAL;
> > +        goto end;
> > +    }
> > +    ret = access_binaries(devices, numdevices, cl_file_name,
> NULL,binarysizes, binaries,NULL, 1);
> > +end:
> > +    // Release all resouces and memory
> > +    if (binaries) {
> > +        for (i = 0;i < numdevices;i++ ) {
> > +            av_free(binaries[i]);
> > +        }
> > +    }
> > +    av_free(binaries);
> > +    av_free(binarysizes);
> > +    av_free(devices);
> > +    return ret;
> > +}
> > +
> > +int av_opencl_create_kernel(const char * kernelname, AVOpenCLKernelEnv
> * env)
>
> nit: ...(const char *kernel_name, AVOpenCLKernelEnv *env)
>
> > +{
> > +    int status;
> > +    env->kernel = clCreateKernel(gpu_env.programs[0], kernelname,
> &status);
> > +    env->context = gpu_env.context;
> > +    env->command_queue = gpu_env.command_queue;
>
> > +    if (status != CL_SUCCESS) {
> > +        av_log(&openclutils,AV_LOG_ERROR,"av_opencl_create_kernel
> error,clCreateKernel:%s\n",opencl_errstr(status));
>
> In general it's better to provide more user readable messages and
> avoid to provide too much detailed information (such as function name).
> I suggest:
>
> "Could not create OpenCL kernel: %s\n"
>
> > +        return AVERROR_EXTERNAL;
> > +    }
> > +    return 0;
> > +}
> > +
> > +void av_opencl_release_kernel(AVOpenCLKernelEnv * env)
> > +{
> > +    int status = clReleaseKernel(env->kernel);
>
> > +    if (status != CL_SUCCESS) {
> > +        av_log(&openclutils,AV_LOG_ERROR,"av_opencl_release_kernel
> error,clReleaseKernel:%s\n",opencl_errstr(status));
> > +    }
>
> > +    return;
>
> useless
>
> > +}
> > +
> > +static int init_opencl_env(GPUEnv *gpu_info
>
> GPUEnv *gpu_info
>
> for consistency it could be: *gpu_env (less confusing)
>
> > ... ,void *ext_opencl_info)
>
> Also why void *...
>
> > +{
> > +    size_t length;
>
> a more specific name could be useful
>
> > +    cl_int status;
> > +    cl_uint numplatforms, numdevices;
> > +    cl_platform_id *platforms = NULL;
> > +    cl_context_properties cps[3];
> > +    char platform_name[100];
> > +    unsigned int i;
> > +    int ret = 0;
>
> > +    AVOpenCLExternalInfo *opencl_info = ext_opencl_info;
>
> and then the cast?
>
> > +    if (opencl_info) {
> > +        if (gpu_info->is_user_created)
> > +            return 0;
> > +        gpu_info->platform = opencl_info->platform;
> > +        gpu_info->is_user_created = 1;
> > +        gpu_info->command_queue = opencl_info->command_queue;
> > +        gpu_info->context = opencl_info->context;
> > +        gpu_info->device_ids = opencl_info->device_ids;
> > +        gpu_info->device_id = opencl_info->device_id;
> > +        gpu_info->device_type = opencl_info->device_type;
> > +    } else {
> > +        if (!gpu_info->is_user_created) {
> > +            status = clGetPlatformIDs(0,NULL,&numplatforms);
> > +            if (status != CL_SUCCESS) {
> > +                av_log(&openclutils,AV_LOG_ERROR,"init_opencl_env
> error,clGetPlatformIDs:%s\n",opencl_errstr(status));
> > +                return AVERROR_EXTERNAL;
> > +            }
> > +            if (0 < numplatforms) {
> > +                platforms = av_mallocz(
>
> platform_ids may be a better name
>
> > +                    numplatforms * sizeof(cl_platform_id));
> > +                if (!platforms) {
> > +                    ret = AVERROR(ENOMEM);
> > +                    goto end;
> > +                }
> > +                status = clGetPlatformIDs(numplatforms, platforms,
> NULL);
> > +                if (status != CL_SUCCESS) {
> > +                    av_log(&openclutils,AV_LOG_ERROR,"init_opencl_env
> error,clGetPlatformIDs:%s\n",opencl_errstr(status));
> > +                    ret = AVERROR_EXTERNAL;
> > +                    goto end;
> > +                }
> > +                for (i = 0; i < numplatforms; i++) {
> > +                    status = clGetPlatformInfo(platforms[i],
> CL_PLATFORM_VENDOR,
> > +                                               sizeof(platform_name),
> platform_name,
> > +                                               NULL);
> > +
> > +                    if (status != CL_SUCCESS) {
> > +
>  av_log(&openclutils,AV_LOG_ERROR,"init_opencl_env
> error,clGetPlatformInfo:%s\n",opencl_errstr(status));
> > +                        ret = AVERROR_EXTERNAL;
> > +                        goto end;
> > +                    }
> > +                    gpu_info->platform = platforms[i];
> > +                    status = clGetDeviceIDs(gpu_info->platform /*
> platform */,
> > +                                            CL_DEVICE_TYPE_GPU /*
> device_type */,
> > +                                            0 /* num_entries */,
> > +                                            NULL /* devices */,
> > +                                            &numdevices);
> > +                    if (status != CL_SUCCESS) {
> > +
>  av_log(&openclutils,AV_LOG_ERROR,"init_opencl_env
> error,clGetDeviceIDs:%s\n",opencl_errstr(status));
> > +                        ret = AVERROR_EXTERNAL;
> > +                        goto end;
> > +                    }
> > +                    if (numdevices == 0) {
> > +                        //find CPU device
> > +                        status = clGetDeviceIDs( gpu_info->platform /*
> platform */,
> > +                                             CL_DEVICE_TYPE_CPU /*
> device_type */,
> > +                                             0 /* num_entries */,
> > +                                             NULL /* devices */,
> > +                                             &numdevices );
> > +                    }
> > +                    if (status != CL_SUCCESS) {
> > +
>  av_log(&openclutils,AV_LOG_ERROR,"init_opencl_env
> error,clGetDeviceIDs:%s\n",opencl_errstr(status));
> > +                        ret = AVERROR_EXTERNAL;
> > +                        goto end;
> > +                    }
> > +                    if(numdevices)
> > +                       break;
> > +
> > +                }
> > +            }
> > +            if (!gpu_info->platform) {
> > +                av_log(&openclutils,AV_LOG_ERROR,"init_opencl_env
> error,get platform error\n");
> > +                ret = AVERROR_EXTERNAL;
> > +                goto end;
> > +            }
> > +
> > +           /*
> > +                 * Use available platform.
> > +                 */
> > +            av_log(&openclutils,AV_LOG_VERBOSE,"Platform Name:
> %s\n",platform_name);
> > +            cps[0] = CL_CONTEXT_PLATFORM;
> > +            cps[1] = (cl_context_properties)gpu_info->platform;
> > +            cps[2] = 0;
>
> > +            /* Check for GPU. */
>
> > +            gpu_info->device_type = CL_DEVICE_TYPE_GPU;
> > +            gpu_info->context = clCreateContextFromType(
> > +                cps, gpu_info->device_type, NULL, NULL, &status );
> > +            if ((!gpu_info->context) || (status != CL_SUCCESS)) {
> > +                gpu_info->device_type = CL_DEVICE_TYPE_CPU;
> > +                gpu_info->context = clCreateContextFromType(
> > +                    cps, gpu_info->device_type, NULL, NULL, &status );
> > +            }
> > +            if ((!gpu_info->context) || (status != CL_SUCCESS)) {
> > +                gpu_info->device_type = CL_DEVICE_TYPE_DEFAULT;
> > +                gpu_info->context = clCreateContextFromType(
> > +                    cps, gpu_info->device_type, NULL, NULL, &status );
> > +            }
> > +            if ((!gpu_info->context) || (status != CL_SUCCESS)) {
> > +                av_log(&openclutils,AV_LOG_ERROR,"init_opencl_env
> error,clCreateContextFromType:%s\n",opencl_errstr(status));
> > +                ret = AVERROR_EXTERNAL;
> > +                goto end;
> > +            }
>
> probably a loop could be more readable
>
> > +            /* Detect OpenCL devices. */
> > +            /* First, get the size of device list data */
> > +            status = clGetContextInfo(gpu_info->context,
> CL_CONTEXT_DEVICES,
> > +                                      0, NULL, &length);
> > +            if ((status != CL_SUCCESS) || (length == 0)) {
> > +                av_log(&openclutils,AV_LOG_ERROR,"init_opencl_env
> error,clGetContextInfo:%s\n",opencl_errstr(status));
> > +                ret = AVERROR_EXTERNAL;
> > +                goto end;
> > +            }
> > +            /* Now allocate memory for device list based on the size we
> got earlier */
> > +            gpu_info->device_ids = av_mallocz(length);
> > +            if (!gpu_info->device_ids) {
> > +                ret = AVERROR(ENOMEM);
> > +                goto end;
> > +            }
> > +            /* Now, get the device list data */
> > +            status = clGetContextInfo(gpu_info->context,
> CL_CONTEXT_DEVICES, length,
> > +                                      gpu_info->device_ids, NULL);
> > +            if (status != CL_SUCCESS) {
> > +                av_log(&openclutils,AV_LOG_ERROR,"init_opencl_env
> error,clGetContextInfo:%s\n",opencl_errstr(status));
> > +                ret = AVERROR_EXTERNAL;
> > +                goto end;
> > +            }
> > +            /* Create OpenCL command queue. */
> > +            gpu_info->command_queue =
> clCreateCommandQueue(gpu_info->context,
> > +
> gpu_info->device_ids[0],
> > +                                                           0, &status);
> > +            if (status != CL_SUCCESS) {
> > +                av_log(&openclutils,AV_LOG_ERROR,"init_opencl_env
> error,clCreateCommandQueue:%s\n",opencl_errstr(status));
> > +                ret = AVERROR_EXTERNAL;
> > +                goto end;
> > +            }
> > +        }
> > +    }
> > +end:
> > +    av_free(platforms);
> > +    return ret;
> > +}
> > +
> > +static void release_opencl_env( GPUEnv *gpu_info )
>
> gpu info/env inconsistency again
>
> > +{
> > +    int i, status;
> > +    if (!isinited)
> > +        return;
> > +    if (gpu_info->is_user_created)
> > +        return;
> > +    gpu_info->reg_kernel_count--;
> > +    if (!gpu_info->reg_kernel_count) {
> > +        for (i = 0; i<gpu_env.file_count; i++) {
> > +            if (gpu_env.programs[i]) {
> > +                status = clReleaseProgram(gpu_env.programs[i]);
> > +                if (status != CL_SUCCESS) {
> > +
>  av_log(&openclutils,AV_LOG_ERROR,"release_opencl_env
> error,clReleaseProgram:%s\n",opencl_errstr(status));
> > +                }
> > +                gpu_env.programs[i] = NULL;
> > +            }
> > +        }
> > +        if (gpu_env.command_queue) {
> > +            status = clReleaseCommandQueue(gpu_env.command_queue);
> > +            if (status != CL_SUCCESS) {
> > +                av_log(&openclutils,AV_LOG_ERROR,"release_opencl_env
> error,clReleaseCommandQueue:%s\n",opencl_errstr(status));
> > +            }
> > +            gpu_env.command_queue = NULL;
> > +        }
> > +        if (gpu_env.context) {
> > +            status = clReleaseContext(gpu_env.context);
> > +            if (status != CL_SUCCESS) {
> > +                av_log(&openclutils,AV_LOG_ERROR,"release_opencl_env
> error,clReleaseContext:%s\n",opencl_errstr(status));
> > +            }
> > +            gpu_env.context = NULL;
> > +        }
> > +        isinited = 0;
> > +    }
>
> you also need probably to free the allocated device_ids (?)
>
> > +    return;
> > +}
> > +
> > +int av_opencl_register_kernel_function(const char *kernel_name,
> av_opencl_kernel_function function)
> > +{
>
> > +    for (int i = 0; i < gpu_env.kernel_count; i++) {
>
> This syntax is not supported by some compilers.
>
> Use:
>
> int i;
> for (i = 0; ...)
>
> > +        if (av_strcasecmp(kernel_name, gpu_env.kernel_names[i])==0) {
> > +            gpu_env.kernel_functions[i] = function;
> > +            gpu_env.reg_kernel_count++;
> > +            return 0;
> > +        }
> > +    }
> > +    return AVERROR(EIO);
> > +}
> > +
>
> > +static int detect_kerner_program(const GPUEnv *gpu_env, const char *
> cl_file_name)
>
> kerner -> typo?
>
> > +{
> > +    for (int i = 0; i < gpu_env->file_count; i++) {
>
> ditto
>
> [...]
> > diff --git a/libavutil/opencl.h b/libavutil/opencl.h
> > new file mode 100644
> > index 0000000..e703cbc
> > --- /dev/null
> > +++ b/libavutil/opencl.h
> > @@ -0,0 +1,219 @@
> > +/*
> > + * Copyright (C) 2012 Peng Gao <peng at multicorewareinc.com>
> > + * Copyright (C) 2012 Li   Cao <li at multicorewareinc.com>
> > + * Copyright (C) 2012 Wei  Gao <weigao at multicorewareinc.com>
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301 USA
> > + */
> > +
> > +#include "config.h"
> > +
> > +#ifndef LIBAVUTIL_OPENCLWRAPPER_H
> > +#define LIBAVUTIL_OPENCLWRAPPER_H
> > +
> > +#include <CL/cl.h>
> > +
>
> > +#define AV_OPENCL_KERNEL( ... )# __VA_ARGS__
>
> is this useful?
>
it is used to define kernel string in deshake_kernel.h:const char
*ff_kernel_deshake_opencl = AV_OPENCL_KERNEL(......);

>
> > +
> > +#define MAX_KERNEL_NAME_LEN 150
>
> Since this is public header you need a prefix to avoid conflicts.
>
> #define AV_OPENCL_MAX_KERNEL_NAME_SIZE 150
>
> "SIZE" should be better since "LEN" is usually associated to the
> length of a string, which is not the same as the corresponding buffer
> size.
>
> > +
> > +typedef struct AVOpenCLKernelEnv {
> > +    cl_context context;
> > +    cl_command_queue command_queue;
> > +    cl_program program;
> > +    cl_kernel kernel;
> > +    char kernel_name[MAX_KERNEL_NAME_LEN];
> > +} AVOpenCLKernelEnv;
> > +
> > +typedef struct AVOpenCLExternalInfo {
> > +    cl_platform_id platform;
> > +    cl_device_type device_type;
> > +    cl_context context;
> > +    cl_device_id *device_ids;
> > +    cl_device_id  device_id;
> > +    cl_command_queue command_queue;
> > +    char *platform_name;
> > +} AVOpenCLExternalInfo;
> > +
> > +/**
> > + * User defined function, used to set the input parameter in the kernel
> > + *environment. This function launches kernel and copies data from GPU to
> > + *CPU, or from CPU to GPU.
> > + */
> > +
> > +typedef int (*av_opencl_kernel_function)(void **userdata,
> AVOpenCLKernelEnv *kenv);
>
> nit here and below:
> space after the leading "*"
>
> /**
>  * User defined function, used to set the input parameter in the kernel
>  * environment. This function launches kernel and copies data from GPU to
>  * CPU, or from CPU to GPU.
>  */
> typedef int (*av_opencl_kernel_function)(void **userdata,
> AVOpenCLKernelEnv *kenv);
>
> > +
> > +/**
> > + * Register a function for running the kernel specified by the kernel
> name.The function is user defined to run the kernel.
> > + *
> > + */
> > +
> > +int av_opencl_register_kernel_function(const char *kernel_name,
> av_opencl_kernel_function function);
>
> /**
>  * Register a user defined function for running the kernel specified by
> the kernel name.
>  */
> int av_opencl_register_kernel_function(const char *kernel_name,
> av_opencl_kernel_function function);
>
> Avoid the empty line after the doxy, same below.
>
> > +
> > +/**
> > + *Load OpenCL kernel.
> > + *
> > + *@param kernel_name   this kernel name is used to find the kernel in
> OpenCL runtime environment.
> > + *@param userdata         this userdata is the all parameters for
> running the kernel specified by kernel name
> > + *@return 0 on success, others on failure
> > + */
> > +
> > +int av_opencl_run_kernel(const char *kernel_name, void **userdata);
>
> /**
>  * Load OpenCL kernel.
>  *
>  * @param kernel_name   kernel name used to find the kernel in the OpenCL
> runtime environment
>  * @param userdata      parameters for running the kernel
>  * @return 0 on success, a negative error code in case of failure
>  */
> int av_opencl_run_kernel(const char *kernel_name, void **userdata);
>
> also it is a good idea to say that the function returns >= 0 in case of
> success, so that it is possible to extend the function later in case
> you want to provide more information through the return code.
>
> > +
> > +/**
> > + * Init the run time  OpenCL environment.
> > + *
> > + *This function must be called befor calling any function related to
> OpenCL.
> > + *
> > + *
> > + *@param build_option         option of compile the kernel in OpenCL
> runtime environment,reference "OpenCL Specification Version: 1.2 chapter
> 5.6.4"
> > + *@param ext_opencl_info    this is the extern OpenCL environment which
> the application program has created
> > + *@return 0 on success, a negative value on error
> > + */
> > +
> > +int av_opencl_init_run_env(const char *build_option,void
> *ext_opencl_info);
> > +
> > +/**
> > + * Release OpenCL resources , this function must be called after
> calling any functions related to OpenCL.
> > + */
> > +
> > +void av_opencl_release_run_env(void);
> > +/**
> > + * Get the OpenCL status, this function is used the check whether or
> not the OpenCL run time has been created.
> > + *
> > + *@return 0 not init, 1, inited;
> > + *
> > + */
> > +int av_opencl_is_inited(void);
> > +
> > +/**
> > + * Create kernel object  by a kernel name on the specified OpenCL run
> time indicated by env parameter.
> > + *
> > + *@param kernelname          kernel name.
> > + *@param env                     The kernel environment which has been
> created by av_opencl_init_run_env
> > + *@return 0 on success, a negative value on error
> > + *
> > + */
> > +
> > +int av_opencl_create_kernel(const char *kernelname, AVOpenCLKernelEnv
> *env);
> > +
> > +/**
> > + *  Release kernel object.
> > + *
> > + *@param env  The kernel environment which has been created by
> av_opencl_init_run_env.
> > + */
> > +void av_opencl_release_kernel(AVOpenCLKernelEnv * env);
> > +
> > +/**
> > + *  Get the kernel environment.
> > + *
> > + *@param env  The kernel environment which has been created by
> av_opencl_init_run_env.
> > + *
> > + */
> > +
> > +void av_opencl_get_kernel_env(AVOpenCLKernelEnv *env);
> > +
>
> > +/**
> > + *  Create OpenCL buffer, the buffer is used to save the data which is
> used by OpenCL kernel.
> > + *
> > + *@param cl_buf         The pointer of OpenCL buffer.
> > + *@param flags           The flags which used to control buffer
> attribute
> > + *@param size            The size of OpenCL buffer
> > + *@param host_ptr      The host pointer of OpenCL buffer
> > + *@return 0 on success, a negative value on error
> > + */
> > +
> > +int av_opencl_create_buffer(void **cl_buf, int flags, size_t size,void
> *host_ptr);
> > +
> > +/**
> > + *  Read data form OpenCL buffer to memory. Src is OpenCL buffer, dst
> is CPU memory
> > + *
> > + *@param cl_buf         The pointer of OpenCL buffer.
> > + *@param outbuf         CPU memory
> > + *@param size            The size of OpenCL buffer
> > + *@return 0 on success, a negative value on error
> > + */
> > +int av_opencl_read_buffer(void *cl_inbuf, uint8_t *outbuf, size_t size);
> > +
> > +/**
> > + *  Write data from memory to OpenCL buffer. Src is frames
> buffer(data[0],data[1]...), dst is OpenCL buffer.
> > + *
> > + *@param cl_buf                The pointer of OpenCL buffer.
> > + *@param cl_buffer_size     OpenCL buffer size
> > + *@param data                  Picture or audio data for each plane
> > + *@plane_size                   Size of each plane
> > + *@param plane_num         The input plane number
> > + *@param offset                The offset of OpenCL buffer start
> position
> > + *@return 0 on success, other values on error
> > + */
> > +
> > +int av_opencl_write_buffer(void *cl_inbuf, size_t cl_buffer_size,
> uint8_t **data, int *plane_size, int plane_num, int offset);
>
> I suggest to skip this API for this round of review, especially if it
> is not yet used by the deshake filter.
>
Yes, it is used by deshake filter. The OpenCL work flow is that copy the
data from CPU buffers to GPU buffer, then GPU process the data using
kernel, then copy the processed data to CPU buffer, this function implement
the first step:copy data from CPU buffers to GPU buffer. should I rename it
to av_opencl_copy_data_to_gpu

>
> > +
> > +/**
> > + *  Get OpenCL device id.
> > + *
> > + */
> > +
> > +cl_device_id av_opencl_get_device_id(void);
> > +
> > +/**
> > + *  Get OpenCL context.
> > + *
> > + */
> > +
> > +cl_context av_opencl_get_context(void);
> > +
> > +/**
> > + *  Get OpenCL command queue.
> > + *
> > + */
> > +
> > +cl_command_queue av_opencl_get_command_queue(void);
> > +
> > +/**
> > + *  Release OpenCL buffer.
> > + *
> > + */
> > +
> > +void av_opencl_release_buffer(void *cl_buf);
>
> > +
> > +/**
> > + *  Read frame data from OpenCL buffer to frame buffer, src buffer is
> OpenCL buffer, dst buffer is frame buffer(data[0],data[1]....).
> > + *
> > + *@param cl_buf                The pointer of OpenCL buffer.
> > + *@param cl_buffer_size     OpenCL buffer size
> > + *@param data                  Picture or audio data for each plane
> > + *@plane_size                   Size of each plane
> > + *@param plane_num         The input plane number
> > + *@return 0 on success, other values on error
> > + */
> > +
> > +int av_opencl_read_to_frame_buffer(void *cl_inbuf, size_t
> cl_buffer_size, uint8_t **data,
> > +                                          int *plane_size, int
> plane_num);
>
> should be put together with the other buffer-related functions.
> Also "av_opencl_copy_buffer" is probably a better name.
>
This function implements copying data from CPU to CPU, should I rename it
to av_opencl_copy_data_to_cpu?

>
>
> > +
>
> > +/**
> > + *  Register kernel.This function is use to register kernel and kernel
> code.OpenCL wrapper will use the kernel name
> > + *  to find the kernel code and compile it in the runtime. The kernel
> name should no longer than 64 and the max kernel
> > + *  number is 200.
> > + *
> > + *@param kernel_name                  Regist kernel name
> > + *@param kernel_code                   Kernel code
> > + */
> > +
> > +void av_opencl_register_kernel(const char *kernel_name,const char
> *kernel_code);
>
> /**
>  * Register kernel code with name kernel name.
>  *
>  * The OpenCL wrapper will use the kernel name to find the kernel code
>  * and compile it in the runtime.
>  *
>  * @param kernel_name kernel name, should be no longer than
> AV_OPENCL_MAX_KERNEL_NAME_SIZE-1
>  * @param kernel_code kernel code to register
>  */
> void av_opencl_register_kernel(const char *kernel_name, const char
> *kernel_code);
>
> and this function should probably return an error code.
>
> > +
>
> > +#endif
>
> Mention the corresponding guard, like it is done in the other files:
> #endif /* AVUTIL_OPENCL_... */
> --
> FFmpeg = Fabulous Fabulous Multimedia Philosofic Extravagant Gem
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>


More information about the ffmpeg-devel mailing list