[FFmpeg-devel] [PATCH 1/2] libavutil/libavfilter: opencl wrapper based on comments on 20130326

Michael Niedermayer michaelni at gmx.at
Wed Mar 27 01:08:49 CET 2013


On Wed, Mar 27, 2013 at 12:48:18AM +0100, Stefano Sabatini wrote:
> On date Tuesday 2013-03-26 18:55:05 +0800, Wei Gao encoded:
> > From f91df6a8315a1b7bdc7b69517831fc745fcbd4fd Mon Sep 17 00:00:00 2001
[...]

> > +int av_opencl_run_kernel(const char *kernel_name, void **userdata)
> > +{
> > +    av_opencl_kernel_function function = NULL;
> > +    int i;
> 
> > +    for (i = 0; i < gpu_env.kernel_count; i++) {
> > +        if (av_strcasecmp(kernel_name, gpu_env.kernel_names[i]) == 0) {
> > +            function = gpu_env.kernel_functions[i];
> > +            break;
> > +        }
> > +    }
> 
> this could make use of a binary search (see libavutil/tree.c) for
> better access times (log_2 versus linear). Not blocking since it can
> be changed later with no interface modification.

I think it's better to do the lookup at "init time" instead of
optimizing how it is done in the inner loop. But then I am not sure
how speed-relevant this is. I don't think a binary search is the right
solution.
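
To illustrate what a lookup at init time could look like, here is a
rough sketch, not the patch's code. All names (KernelEntry,
FilterContext, filter_init, filter_frame, demo_kernel) are hypothetical,
kernel_function merely stands in for av_opencl_kernel_function, and
plain strcmp() is used where the patch uses av_strcasecmp(). The name is
resolved once and the function pointer is cached, so the per-frame path
does no string comparison at all:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for av_opencl_kernel_function. */
typedef int (*kernel_function)(void **userdata);

typedef struct KernelEntry {
    const char *name;
    kernel_function function;
} KernelEntry;

/* Toy kernel function: increments the int that userdata[0] points to. */
static int demo_kernel(void **userdata)
{
    int *counter = userdata[0];
    ++*counter;
    return 0;
}

/* Hypothetical registry; the patch keeps this in gpu_env instead. */
static const KernelEntry registry[] = {
    { "demo", demo_kernel },
};

/* Linear search by name; only ever called from init, so its cost is
 * paid once per component rather than once per frame. */
static kernel_function lookup_kernel(const char *name)
{
    for (size_t i = 0; i < sizeof(registry) / sizeof(registry[0]); i++)
        if (!strcmp(name, registry[i].name))
            return registry[i].function;
    return NULL;
}

/* Hypothetical per-component context caching the resolved pointer. */
typedef struct FilterContext {
    kernel_function run;
} FilterContext;

/* Resolve the kernel once; returns 0 on success, -1 if not found. */
static int filter_init(FilterContext *ctx, const char *kernel_name)
{
    ctx->run = lookup_kernel(kernel_name);
    return ctx->run ? 0 : -1;
}

/* Inner loop: a direct call through the cached pointer. */
static int filter_frame(FilterContext *ctx, void **userdata)
{
    assert(ctx->run);
    return ctx->run(userdata);
}
```

With something like this, whether the registry search is linear or
logarithmic stops mattering for the per-frame cost.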


[...]
> > +        for (i = 0;i < plane_num;i++) {
> > +            memcpy(dst_data[i], temp, plane_size[i]);
> > +            temp += plane_size[i];
> > +        }
> > +    }
> > +    return ret;
> > +}
> > +
> > +cl_device_id av_opencl_get_device_id(void)
> > +{
> > +    if (!gpu_env.is_user_created) {
> > +        return *(gpu_env.device_ids);
> > +    } else
> > +        return gpu_env.device_id;
> > +}
> > +
> > +cl_context av_opencl_get_context(void)
> > +{
> > +    return gpu_env.context;
> > +}
> > +
> > +cl_command_queue av_opencl_get_command_queue(void)
> > +{
> > +    return gpu_env.command_queue;
> > +}
> > +
> > diff --git a/libavutil/opencl.h b/libavutil/opencl.h
> > new file mode 100644
> > index 0000000..f5172dc
> > --- /dev/null
> > +++ b/libavutil/opencl.h
> > @@ -0,0 +1,219 @@
> > +/*
> > + * Copyright (C) 2012 Peng Gao <peng at multicorewareinc.com>
> > + * Copyright (C) 2012 Li   Cao <li at multicorewareinc.com>
> > + * Copyright (C) 2012 Wei  Gao <weigao at multicorewareinc.com>
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> > + */
> > +
> > +#include "config.h"
> > +
> > +#ifndef LIBAVUTIL_OPENCLWRAPPER_H
> > +#define LIBAVUTIL_OPENCLWRAPPER_H
> > +
> > +#include <CL/cl.h>
> 
> This is my understanding of how this OpenCL API works.
> 
> You register some code with a name (which is currently stored
> in the global environment), with av_opencl_register_kernel().
> 
> Then the previously registered fragments of code are compiled by
> OpenCL, when doing: av_opencl_init() -> compile_kernel_file()
> 
> With av_opencl_init() you also specify some parameters (build_options)
> which are used when compiling the code of the specified functions.
> 
> av_opencl_init() also creates the OpenCL program, which is stored in
> the global environment. The program is unique for all the kernels
> registered so far.
> 
> At this point you need to create an entry point for each kernel, to
> run a specific *function* defined within it. This is done by creating
> a kernel, with av_opencl_create_kernel()
> 
> av_opencl_create_kernel() is used to create a kernel (a sort of
> handler to communicate with the *compiled* kernel). The kernel is
> created specifying the name of the function to run *in the kernel
> code*. The kernel is set in the passed AVOpenCLEnv environment.
> 
> In order to run a function specified in a kernel, you also need to
> provide some parameters/data to it.
> 
> This is done through av_opencl_register_kernel_function(), which is
> used to register a function which is associated with one of the
> previously registered kernels in the global environment.
> 
> so we have: kernel(global env) -> function(global env)
> 
> To run the code of a kernel, av_opencl_run_kernel() must be called,
> with the name of the registered kernel on which the function is to be
> called.
> 
> This function looks up the functions registered in the global
> environment, and executes the registered function with the provided
> user data/parameters, which in particular must contain the OpenCL
> environment. The environment should contain the kernel handler created
> with av_opencl_create_kernel(), and is used to set the arguments for
> the function defined in the kernel code, and eventually run the code
> for it (see the deshake patch for an example of such usage).
> 
> ...
> 
> So basically this is the workflow:
> 
> kernel code registration (done in the global environment)            -> av_opencl_register_kernel()
> kernel code compilation/init (always done in the global environment) -> av_opencl_init()
> 
> kernel function registration (can optionally be done *before* init)  -> av_opencl_register_kernel_function()
> kernel object creation, which is required to run the code            -> av_opencl_create_kernel()
> kernel code execution with user data parameters                      -> av_opencl_run_kernel()
> 
> Cleanup:
> kernel object (stored in an environment)                             -> av_opencl_release_kernel()
> global environment                                                   -> av_opencl_uninit()
> 
> ...
> 
> Can you confirm that this is an accurate description of the
> design/workflow?
> 
> The main problem with this design is that different threads and
> components can mess with the global environment.
> 
> For example you may want to init a filter, this creates a global
> environment, then you create another filter/component which requires
> to build a different kernel etc., which can't be done since you're
> supposed to init the global environment just once.
> 
> Ideally we should have one OpenCL context per component, so we don't
> need to know everything (kernel code and functions) when we init the
> OpenCL system, and by using a global environment you are prevented
> from doing that.
>

> In a similar way, when you uninit the OpenCL system you don't know if
> other components are actually using it, so the only safe way is to
> uninit() it when you close the *application*, which is not ideal for a
> library.

Reference counting could be used with atomic operations; I am not
sure that's the right solution though.
It would be best to avoid non-constant globals instead of refcounting
or locking around them.
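
As a rough sketch of the refcounting idea, assuming C11 <stdatomic.h>
is available (FFmpeg's own atomics helpers would be the natural choice
inside the tree), something like the following could work. SharedEnv,
env_create, env_ref and env_unref are hypothetical names, and the
"initialized" field merely stands in for the real OpenCL context and
command queue. Note that the environment here is an explicitly passed
object rather than a mutable global, so each component holds its own
reference:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical shared environment, passed around explicitly. */
typedef struct SharedEnv {
    atomic_int refcount;
    int initialized; /* placeholder for context, command queue, ... */
} SharedEnv;

/* Allocate an environment holding one reference; NULL on failure. */
static SharedEnv *env_create(void)
{
    SharedEnv *env = malloc(sizeof(*env));
    if (!env)
        return NULL;
    atomic_init(&env->refcount, 1);
    env->initialized = 1;
    return env;
}

/* Take an additional reference for another component. */
static SharedEnv *env_ref(SharedEnv *env)
{
    atomic_fetch_add(&env->refcount, 1);
    return env;
}

/* Drop one reference and clear the caller's pointer. Returns 1 if this
 * was the last reference and the environment was freed, 0 otherwise. */
static int env_unref(SharedEnv **envp)
{
    SharedEnv *env = *envp;
    int freed = 0;

    *envp = NULL;
    /* atomic_fetch_sub returns the value held before the decrement. */
    if (env && atomic_fetch_sub(&env->refcount, 1) == 1) {
        free(env);
        freed = 1;
    }
    return freed;
}
```

The last component to call env_unref() tears the environment down, so
no component has to know whether others are still using it.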


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

What does censorship reveal? It reveals fear. -- Julian Assange