[FFmpeg-devel] [GSOC] [PATCH] DNN module introduction and SRCNN filter update

Sergey Lavrushkin dualfal at gmail.com
Tue May 29 01:52:12 EEST 2018

2018-05-28 9:32 GMT+03:00 Guo, Yejun <yejun.guo at intel.com>:

> It looks like no TensorFlow dependency is introduced; instead, a new model
> format is created together with a CPU implementation for inference. With
> this idea, the Android Neural Networks API (NNAPI) would be a very good
> reference, see
> https://developer.android.google.cn/ndk/guides/neuralnetworks/. It
> defines how the model is organized and also provides a CPU-optimized
> inference implementation (within the NNAPI runtime, which is open source).
> It is still under development but mature enough to run some popular DNN
> models with acceptable performance. We could adopt some of its basic design.
> Anyway, this is just a reference, FYI. (BTW, I'm not sure about any IP
> issues.)

The idea was to first introduce something that can be used when TensorFlow
is not available. Here is another patch that introduces a TensorFlow backend.
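
To illustrate the fallback idea, here is a minimal sketch, assuming each
backend is exposed as a small table of function pointers (similar in spirit
to the DNNModule struct quoted further down). All names in it (Backend,
BackendType, select_backend, native_run) are hypothetical and are not taken
from the patches, and native_run is only a placeholder that copies its input
instead of running a network.

```c
#include <stddef.h>

/* Hypothetical backend identifiers; not the names used in the patches. */
typedef enum BackendType {
    BACKEND_NATIVE = 0,
    BACKEND_TENSORFLOW = 1
} BackendType;

/* Hypothetical per-backend table of function pointers. */
typedef struct Backend {
    const char *name;
    int available; /* 1 if this backend can be used in this build */
    int (*run)(const float *input, float *output, size_t count);
} Backend;

/* Placeholder for native inference: it only copies input to output. */
static int native_run(const float *input, float *output, size_t count)
{
    for (size_t i = 0; i < count; i++)
        output[i] = input[i];
    return 0;
}

/* In this sketch the TensorFlow entry is marked as unavailable. */
static const Backend backends[] = {
    [BACKEND_NATIVE]     = { "native",     1, native_run },
    [BACKEND_TENSORFLOW] = { "tensorflow", 0, NULL },
};

/* Return the requested backend, or the native one if it is unavailable. */
static const Backend *select_backend(BackendType requested)
{
    if (backends[requested].available)
        return &backends[requested];
    return &backends[BACKEND_NATIVE];
}
```

With this layout, a caller that asks for BACKEND_TENSORFLOW still gets a
usable backend when TensorFlow support is missing.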

> For this patch, I have two comments.
> 1. Change "DNNModel* (*load_default_model)(DNNDefaultModel
> model_type);" to "DNNModel* (*load_builtin_model)(DNNBuiltinModel
> model_type);".
> The DNNModule can be invoked by many filters. "Default model" is a good
> name at the filter level, while "built-in model" is better within the DNN
> scope.
> typedef struct DNNModule{
>     // Loads model and parameters from given file.
>     // Returns NULL if it is not possible.
>     DNNModel* (*load_model)(const char* model_filename);
>     // Loads one of the default models
>     DNNModel* (*load_default_model)(DNNDefaultModel model_type);
>     // Executes model with specified input and output.
>     // Returns DNN_ERROR otherwise.
>     DNNReturnType (*execute_model)(const DNNModel* model);
>     // Frees memory allocated for model.
>     void (*free_model)(DNNModel** model);
> } DNNModule;
> 2. Add a new variable 'number' to DNNData/InputParams.
> As a typical DNN concept, the data shape is usually <number, height,
> width, channel> or <number, channel, height, width>, where the index of
> the last component changes fastest in memory. We can add this concept to
> the API and decide whether to support <NHWC>, <NCHW>, or both.

I did not add the number of elements in a batch because I thought that we
would not feed more than one element at a time to a network in an FFmpeg
filter. But it can easily be added if necessary.
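
To make the layout question concrete, here is a minimal sketch of how the
offset into a flat buffer would differ between the two layouts. The Shape
struct and both helper functions are hypothetical and are not taken from the
patch. With number fixed to 1 (and n = 0), both formulas reduce to the
single-image case.

```c
#include <stddef.h>

/* Hypothetical shape descriptor; not the DNNData struct from the patch. */
typedef struct Shape {
    size_t number, height, width, channels;
} Shape;

/* NHWC: the channel index changes fastest in memory. */
static size_t offset_nhwc(const Shape *s, size_t n, size_t h, size_t w, size_t c)
{
    return ((n * s->height + h) * s->width + w) * s->channels + c;
}

/* NCHW: the width index changes fastest in memory. */
static size_t offset_nchw(const Shape *s, size_t n, size_t h, size_t w, size_t c)
{
    return ((n * s->channels + c) * s->height + h) * s->width + w;
}
```

In NHWC, moving to the next channel of the same pixel advances the offset by
one element, while in NCHW it advances by height * width elements.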

So here is the patch that adds the TensorFlow backend, along with the
previous patch. I forgot to change the include guards from AVUTIL_* to
AVFILTER_* in it.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-adds-dnn-srcnn.patch
Type: text/x-patch
Size: 372905 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20180529/048cb137/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-adds-tensorflow-backend.patch
Type: text/x-patch
Size: 239047 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20180529/048cb137/attachment-0001.bin>
