[FFmpeg-devel] GSOC 2018 qualification task.

ANURAG SINGH IIT BHU anurag.singh.phy15 at iitbhu.ac.in
Fri Apr 13 07:09:35 EEST 2018


Thank you, sir. I'll implement the suggested changes as soon as possible.





On Fri, Apr 13, 2018 at 4:04 AM, Michael Niedermayer <michael at niedermayer.cc> wrote:

> On Fri, Apr 13, 2018 at 02:13:53AM +0530, ANURAG SINGH IIT BHU wrote:
> > Hello,
> > I have implemented the reviews mentioned on the previous patch; now there
> > is no need to provide any subtitle file to the filter. I am attaching the
> > complete patch of the hellosubs filter.
> >
> > Command to run the filter
> > ffmpeg -i <videoname> -vf hellosubs=<videoname> helloout.mp4
> >
> >
> > Thanks and regards,
> > Anurag Singh.
> >
> >
> >
> > On Tue, Apr 10, 2018 at 4:55 AM, Rostislav Pehlivanov <atomnuker at gmail.com> wrote:
> >
> > > On 9 April 2018 at 19:10, Paul B Mahol <onemda at gmail.com> wrote:
> > >
> > > > On 4/9/18, Rostislav Pehlivanov <atomnuker at gmail.com> wrote:
> > > > > On 9 April 2018 at 03:59, ANURAG SINGH IIT BHU <
> > > > > anurag.singh.phy15 at iitbhu.ac.in> wrote:
> > > > >
> > > > > >> This mail is regarding the qualification task assigned to me for the
> > > > > >> GSOC project in FFmpeg for automatic real-time subtitle generation
> > > > > >> using speech to text translation ML model.
> > > > >>
> > > > >
> > > > > > I really don't think lavfi is the correct place for such code, nor
> > > > > > that the project's repo should contain such code at all.
> > > > > > This would need to be in another repo and a separate library.
> > > >
> > > > Why? Are you against ocr filter too?
> > > >
> > >
> > > > The OCR filter uses libtesseract so I'm fine with it. Like I said, as
> > > > long as the actual code to do it is in an external library I don't mind.
> > > > Mozilla recently released Deep Speech
> > > > (https://github.com/mozilla/DeepSpeech),
> > > > which does pretty much exactly speech to text and is considered to have
> > > > the most accurate one out there. Someone just needs to convert the
> > > > tensorflow code to something more usable.
> > > _______________________________________________
> > > ffmpeg-devel mailing list
> > > ffmpeg-devel at ffmpeg.org
> > > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> > >
>
> >  Makefile       |    1
> >  allfilters.c   |    1
> >  vf_hellosubs.c |  513 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 515 insertions(+)
> > 2432f100fddb7ec84e771be8282d4b66e3d1f50a  0001-avfilter-add-hellosubs-filter.patch
> > From ac0e09d431ea68aebfaef6e2ed0b450e76d473d9 Mon Sep 17 00:00:00 2001
> > From: ddosvulnerability <anurag.singh.phy15 at iitbhu.ac.in>
> > Date: Thu, 12 Apr 2018 22:06:43 +0530
> > Subject: [PATCH] avfilter: add hellosubs filter.
> >
> > ---
> >  libavfilter/Makefile       |   1 +
> >  libavfilter/allfilters.c   |   1 +
> >  libavfilter/vf_hellosubs.c | 513 +++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 515 insertions(+)
> >  create mode 100644 libavfilter/vf_hellosubs.c
> >
> > diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> > index a90ca30..770b1b5 100644
> > --- a/libavfilter/Makefile
> > +++ b/libavfilter/Makefile
> > @@ -331,6 +331,7 @@ OBJS-$(CONFIG_SSIM_FILTER)                   += vf_ssim.o framesync.o
> >  OBJS-$(CONFIG_STEREO3D_FILTER)               += vf_stereo3d.o
> >  OBJS-$(CONFIG_STREAMSELECT_FILTER)           += f_streamselect.o framesync.o
> >  OBJS-$(CONFIG_SUBTITLES_FILTER)              += vf_subtitles.o
> > +OBJS-$(CONFIG_HELLOSUBS_FILTER)              += vf_hellosubs.o
> >  OBJS-$(CONFIG_SUPER2XSAI_FILTER)             += vf_super2xsai.o
> >  OBJS-$(CONFIG_SWAPRECT_FILTER)               += vf_swaprect.o
> >  OBJS-$(CONFIG_SWAPUV_FILTER)                 += vf_swapuv.o
> > diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> > index 6eac828..a008908 100644
> > --- a/libavfilter/allfilters.c
> > +++ b/libavfilter/allfilters.c
> > @@ -322,6 +322,7 @@ extern AVFilter ff_vf_ssim;
> >  extern AVFilter ff_vf_stereo3d;
> >  extern AVFilter ff_vf_streamselect;
> >  extern AVFilter ff_vf_subtitles;
> > +extern AVFilter ff_vf_hellosubs;
> >  extern AVFilter ff_vf_super2xsai;
> >  extern AVFilter ff_vf_swaprect;
> >  extern AVFilter ff_vf_swapuv;
> > diff --git a/libavfilter/vf_hellosubs.c b/libavfilter/vf_hellosubs.c
> > new file mode 100644
> > index 0000000..b994050
> > --- /dev/null
> > +++ b/libavfilter/vf_hellosubs.c
> > @@ -0,0 +1,513 @@
> > +/*
> > + * Copyright (c) 2011 Baptiste Coudurier
> > + * Copyright (c) 2011 Stefano Sabatini
> > + * Copyright (c) 2012 Clément Bœsch
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> > + */
> > +
> > +/**
> > + * @file
> > + * Libass hellosubs burning filter.
> > + *
> > +
> > + */
> > +
> > +#include <ass/ass.h>
> > +
> > +#include "config.h"
> > +#if CONFIG_SUBTITLES_FILTER
> > +# include "libavcodec/avcodec.h"
> > +# include "libavformat/avformat.h"
> > +#endif
> > +#include "libavutil/avstring.h"
> > +#include "libavutil/imgutils.h"
> > +#include "libavutil/opt.h"
> > +#include "libavutil/parseutils.h"
> > +#include "drawutils.h"
> > +#include "avfilter.h"
> > +#include "internal.h"
> > +#include "formats.h"
> > +#include "video.h"
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +
> > +typedef struct AssContext {
> > +    const AVClass *class;
> > +    ASS_Library  *library;
> > +    ASS_Renderer *renderer;
> > +    ASS_Track    *track;
> > +    char *filename;
> > +    char *fontsdir;
> > +    char *charenc;
> > +    char *force_style;
> > +    int stream_index;
> > +    int alpha;
> > +    uint8_t rgba_map[4];
> > +    int     pix_step[4];       ///< steps per pixel for each plane of the main output
> > +    int original_w, original_h;
> > +    int shaping;
> > +    FFDrawContext draw;
> > +} AssContext;
> > +
> > +#define OFFSET(x) offsetof(AssContext, x)
> > +#define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
> > +
> > +#define COMMON_OPTIONS \
> > +    {"filename",       "set the filename of file to read",                         OFFSET(filename),   AV_OPT_TYPE_STRING,     {.str = NULL},  CHAR_MIN, CHAR_MAX, FLAGS }, \
> > +    {"f",              "set the filename of file to read",                         OFFSET(filename),   AV_OPT_TYPE_STRING,     {.str = NULL},  CHAR_MIN, CHAR_MAX, FLAGS }, \
> > +    {"original_size",  "set the size of the original video (used to scale fonts)", OFFSET(original_w), AV_OPT_TYPE_IMAGE_SIZE, {.str = NULL},  CHAR_MIN, CHAR_MAX, FLAGS }, \
> > +    {"fontsdir",       "set the directory containing the fonts to read",           OFFSET(fontsdir),   AV_OPT_TYPE_STRING,     {.str = NULL},  CHAR_MIN, CHAR_MAX, FLAGS }, \
> > +    {"alpha",          "enable processing of alpha channel",                       OFFSET(alpha),      AV_OPT_TYPE_BOOL,       {.i64 = 0   },         0,        1, FLAGS }, \
> > +
> > +/* libass supports a log level ranging from 0 to 7 */
> > +static const int ass_libavfilter_log_level_map[] = {
> > +    [0] = AV_LOG_FATAL,     /* MSGL_FATAL */
> > +    [1] = AV_LOG_ERROR,     /* MSGL_ERR */
> > +    [2] = AV_LOG_WARNING,   /* MSGL_WARN */
> > +    [3] = AV_LOG_WARNING,   /* <undefined> */
> > +    [4] = AV_LOG_INFO,      /* MSGL_INFO */
> > +    [5] = AV_LOG_INFO,      /* <undefined> */
> > +    [6] = AV_LOG_VERBOSE,   /* MSGL_V */
> > +    [7] = AV_LOG_DEBUG,     /* MSGL_DBG2 */
> > +};
> > +
> > +static void ass_log(int ass_level, const char *fmt, va_list args, void *ctx)
> > +{
> > +    const int ass_level_clip = av_clip(ass_level, 0,
> > +        FF_ARRAY_ELEMS(ass_libavfilter_log_level_map) - 1);
> > +    const int level = ass_libavfilter_log_level_map[ass_level_clip];
> > +
> > +    av_vlog(ctx, level, fmt, args);
> > +    av_log(ctx, level, "\n");
> > +}
> > +
> > +static av_cold int init(AVFilterContext *ctx)
> > +{
> > +    AssContext *ass = ctx->priv;
> > +
> > +    if (!ass->filename) {
> > +        av_log(ctx, AV_LOG_ERROR, "No filename provided!\n");
> > +        return AVERROR(EINVAL);
> > +    }
> > +
> > +    ass->library = ass_library_init();
> > +    if (!ass->library) {
> > +        av_log(ctx, AV_LOG_ERROR, "Could not initialize libass.\n");
> > +        return AVERROR(EINVAL);
> > +    }
> > +    ass_set_message_cb(ass->library, ass_log, ctx);
> > +
> > +    ass_set_fonts_dir(ass->library, ass->fontsdir);
> > +
> > +    ass->renderer = ass_renderer_init(ass->library);
> > +    if (!ass->renderer) {
> > +        av_log(ctx, AV_LOG_ERROR, "Could not initialize libass renderer.\n");
> > +        return AVERROR(EINVAL);
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +static av_cold void uninit(AVFilterContext *ctx)
> > +{
> > +    AssContext *ass = ctx->priv;
> > +
> > +    if (ass->track)
> > +        ass_free_track(ass->track);
> > +    if (ass->renderer)
> > +        ass_renderer_done(ass->renderer);
> > +    if (ass->library)
> > +        ass_library_done(ass->library);
> > +}
> > +
> > +static int query_formats(AVFilterContext *ctx)
> > +{
> > +    return ff_set_common_formats(ctx, ff_draw_supported_pixel_formats(0));
> > +}
> > +
> > +static int config_input(AVFilterLink *inlink)
> > +{
> > +    AssContext *ass = inlink->dst->priv;
> > +
> > +    ff_draw_init(&ass->draw, inlink->format, ass->alpha ? FF_DRAW_PROCESS_ALPHA : 0);
> > +
> > +    ass_set_frame_size  (ass->renderer, inlink->w, inlink->h);
> > +    if (ass->original_w && ass->original_h)
> > +        ass_set_aspect_ratio(ass->renderer, (double)inlink->w / inlink->h,
> > +                             (double)ass->original_w / ass->original_h);
> > +    if (ass->shaping != -1)
> > +        ass_set_shaper(ass->renderer, ass->shaping);
> > +
> > +    return 0;
> > +}
> > +
> > +/* libass stores an RGBA color in the format RRGGBBTT, where TT is the transparency level */
> > +#define AR(c)  ( (c)>>24)
> > +#define AG(c)  (((c)>>16)&0xFF)
> > +#define AB(c)  (((c)>>8) &0xFF)
> > +#define AA(c)  ((0xFF-(c)) &0xFF)
> > +
> > +static void overlay_ass_image(AssContext *ass, AVFrame *picref,
> > +                              const ASS_Image *image)
> > +{
> > +    for (; image; image = image->next) {
> > +        uint8_t rgba_color[] = {AR(image->color), AG(image->color), AB(image->color), AA(image->color)};
> > +        FFDrawColor color;
> > +        ff_draw_color(&ass->draw, &color, rgba_color);
> > +        ff_blend_mask(&ass->draw, &color,
> > +                      picref->data, picref->linesize,
> > +                      picref->width, picref->height,
> > +                      image->bitmap, image->stride, image->w, image->h,
> > +                      3, 0, image->dst_x, image->dst_y);
> > +    }
> > +}
> > +
> > +static int filter_frame(AVFilterLink *inlink, AVFrame *picref)
> > +{
> > +    AVFilterContext *ctx = inlink->dst;
> > +    AVFilterLink *outlink = ctx->outputs[0];
> > +    AssContext *ass = ctx->priv;
> > +    int detect_change = 0;
> > +    double time_ms = picref->pts * av_q2d(inlink->time_base) * 1000;
> > +    ASS_Image *image = ass_render_frame(ass->renderer, ass->track,
> > +                                        time_ms, &detect_change);
> > +
> > +    if (detect_change)
> > +        av_log(ctx, AV_LOG_DEBUG, "Change happened at time ms:%f\n", time_ms);
> > +
> > +    overlay_ass_image(ass, picref, image);
> > +
> > +    return ff_filter_frame(outlink, picref);
> > +}
> > +
> > +static const AVFilterPad ass_inputs[] = {
> > +    {
> > +        .name             = "default",
> > +        .type             = AVMEDIA_TYPE_VIDEO,
> > +        .filter_frame     = filter_frame,
> > +        .config_props     = config_input,
> > +        .needs_writable   = 1,
> > +    },
> > +    { NULL }
> > +};
> > +
> > +static const AVFilterPad ass_outputs[] = {
> > +    {
> > +        .name = "default",
> > +        .type = AVMEDIA_TYPE_VIDEO,
> > +    },
> > +    { NULL }
> > +};
> > +
> > +
> > +
> > +
> > +
> > +static const AVOption hellosubs_options[] = {
> > +    COMMON_OPTIONS
> > +    {"charenc",      "set input character encoding", OFFSET(charenc),      AV_OPT_TYPE_STRING, {.str = NULL}, CHAR_MIN, CHAR_MAX, FLAGS},
> > +    {"stream_index", "set stream index",             OFFSET(stream_index), AV_OPT_TYPE_INT,    { .i64 = -1 }, -1,       INT_MAX,  FLAGS},
> > +    {"si",           "set stream index",             OFFSET(stream_index), AV_OPT_TYPE_INT,    { .i64 = -1 }, -1,       INT_MAX,  FLAGS},
> > +    {"force_style",  "force subtitle style",         OFFSET(force_style),  AV_OPT_TYPE_STRING, {.str = NULL}, CHAR_MIN, CHAR_MAX, FLAGS},
> > +    {NULL},
> > +};
> > +
> > +static const char * const font_mimetypes[] = {
> > +    "application/x-truetype-font",
> > +    "application/vnd.ms-opentype",
> > +    "application/x-font-ttf",
> > +    NULL
> > +};
> > +
> > +static int attachment_is_font(AVStream * st)
> > +{
> > +    const AVDictionaryEntry *tag = NULL;
> > +    int n;
> > +
> > +    tag = av_dict_get(st->metadata, "mimetype", NULL, AV_DICT_MATCH_CASE);
> > +
> > +    if (tag) {
> > +        for (n = 0; font_mimetypes[n]; n++) {
> > +            if (av_strcasecmp(font_mimetypes[n], tag->value) == 0)
> > +                return 1;
> > +        }
> > +    }
> > +    return 0;
> > +}
> > +
> > +AVFILTER_DEFINE_CLASS(hellosubs);
> > +
> > +static av_cold int init_hellosubs(AVFilterContext *ctx)
> > +{
> > +    int j, ret, sid;long int z=0;int t1=0;
> > +    int k = 0;
> > +    AVDictionary *codec_opts = NULL;
> > +    AVFormatContext *fmt = NULL;
> > +    AVCodecContext *dec_ctx = NULL;
> > +    AVCodec *dec = NULL;
> > +    const AVCodecDescriptor *dec_desc;
> > +    AVStream *st;
> > +    AVPacket pkt;
> > +    AssContext *ass = ctx->priv;
>
> > +    FILE *file;
> > +    if ((file = fopen("hello.srt", "r")))
>
> There is no need to access an external file for the task of
> drawing a line of text.
>
>
> > +    {
> > +        fclose(file);
> > +
> > +    }
> > +    else
> > +   {
> > +   FILE * fp;
> > +   fp = fopen ("hello.srt","w");
>
> That's even more true for writing such a file.
> It also would not work predictably with multiple filters.
>
>
> > +   fprintf (fp, "1\n");
> > +   fprintf (fp, "00:00:05,615 --> 00:00:08,083\n");
> > +   fprintf (fp, "%s",ass->filename);
> > +   fclose (fp);
> > +
> > +   char cmd[300];
> > +   strcpy(cmd,"ffmpeg -i ");
> > +   strcat(cmd,ass->filename);
> > +   char fn[200];
> > +   strcpy(fn,ass->filename);
> > +   strcat(cmd," -vf hellosubs=hello.srt helloout");
> > +   int m=0;
> > +   for(int w=(strlen(fn)-1);w>=0;w--)
> > +   {if (fn[w]=='.')
> > +   {m=w;
> > +   break;}}
> > +   char join[5];
> > +   for(int loc=m;loc<strlen(fn);loc++)
> > +   join[loc-m]=fn[loc];
> > +   char rem[100];
> > +   char join1[100];
> > +   strcpy(join1,join);
> > +   strcpy(rem,"helloout");
> > +   strcat(rem,join1);
> > +   remove(rem);
> > +
> > +   strcat(cmd,join);
> > +   system(cmd);
> > +   remove("hello.srt");
> > +
> > +exit(0);
>
> Also, a filter cannot call exit(); in fact, a library like libavfilter
> must not call exit().
>
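
To illustrate the point about exit(): a minimal sketch, assuming a hypothetical HelloContext and hello_init() (neither is in the patch), of an init callback that reports failure through its return value so that the calling application decides what happens next. Real libavfilter code would return AVERROR(EINVAL), as init() in the patch already does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a filter's private context. */
typedef struct HelloContext {
    const char *text;    /* text to draw; NULL means "not configured" */
    int initialized;     /* set to 1 once init has succeeded */
} HelloContext;

/*
 * Hypothetical init callback. On failure it returns a negative error
 * code instead of terminating the process; the caller handles it.
 */
static int hello_init(HelloContext *s)
{
    assert(s);
    if (!s->text)
        return -EINVAL;  /* FFmpeg code would write AVERROR(EINVAL) */
    s->initialized = 1;
    return 0;
}
```

A caller would check the result (for example `if ((ret = hello_init(&ctx)) < 0) return ret;`) and propagate it upward.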
>
> > +}
> > +
> > +    /* Init libass */
> > +    ret = init(ctx);
> > +    if (ret < 0)
> > +        return ret;
> > +    ass->track = ass_new_track(ass->library);
> > +    if (!ass->track) {
> > +        av_log(ctx, AV_LOG_ERROR, "Could not create a libass track\n");
> > +        return AVERROR(EINVAL);
> > +    }
> > +
> > +
>
> > +    ret = avformat_open_input(&fmt, ass->filename, NULL, NULL);
> > +    if (ret < 0) {
> > +        av_log(ctx, AV_LOG_ERROR, "Unable to open %s\n", ass->filename);
> > +
> > +    }
>
> Also, no function from libavformat is needed; this filter should draw a
> line of text, not demux a file.
> You may have misinterpreted my previous review. All unneeded code, such as
> every bit of libavformat use, must be removed.
>
> You seem to be trying to work around what I suggest rather than actually
> solving the issues raised, like writing a file to make up for the
> impossibility of accessing some input file directly. There really is no
> file, and none can be written.
>
> The goal of this filter was to create subtitle packets/frames and pass
> them on. As this turned out to be too hard in the time available, the
> simpler goal now is to draw that text on a video frame.
>
> The filter gets video frames on its input and passes them on to the
> output. In between, it should write the "Hello world" text with the
> advancing number onto the frame.
> For this there is no need to access any files or use any demuxers.
> You can use the libass code from the subtitles filter as you do, but that
> code uses an external subtitle file. You have to change this so it no
> longer uses an external file or demuxes it with libavformat. These steps
> are not needed and are incorrect for this task.
>
> I suggest you remove the #include lines for the libavformat headers; that
> way you will see exactly what must be removed. This should also make the
> code simpler: there is no need for this baggage between avcodec/libass
> and what you want to draw.
>
> The libavformat code is there to read a subtitle file.
> There is no subtitle file. The filter should just draw a line saying
> "hello world" with a number.
>
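
The task described above can be sketched without any file or demuxer by building the text for each frame in memory. The helper names format_ass_time() and format_hello_event() are hypothetical, and the choice of an ASS "Dialogue:" line with a style named Default is an assumption for illustration; the review does not prescribe how the text reaches libass.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format a time in milliseconds as ASS "H:MM:SS.cc" (centiseconds). */
static int format_ass_time(char *buf, size_t size, int64_t ms)
{
    assert(buf && size > 0 && ms >= 0);
    int64_t cs = ms / 10;                 /* total centiseconds */
    return snprintf(buf, size, "%d:%02d:%02d.%02d",
                    (int)(cs / 360000),   /* hours */
                    (int)(cs / 6000 % 60),/* minutes */
                    (int)(cs / 100 % 60), /* seconds */
                    (int)(cs % 100));     /* centiseconds */
}

/*
 * Build one ASS "Dialogue:" line carrying "Hello world <n>", where n is
 * the advancing frame number. Nothing is read from or written to disk.
 * Returns the length snprintf() reports for the line.
 */
static int format_hello_event(char *buf, size_t size, int64_t frame_no,
                              int64_t start_ms, int64_t dur_ms)
{
    char start[16], end[16];

    format_ass_time(start, sizeof(start), start_ms);
    format_ass_time(end, sizeof(end), start_ms + dur_ms);
    return snprintf(buf, size,
                    "Dialogue: 0,%s,%s,Default,,0,0,0,,Hello world %lld",
                    start, end, (long long)frame_no);
}
```

For example, format_hello_event(buf, sizeof(buf), 7, 5615, 40) fills buf with "Dialogue: 0,0:00:05.61,0:00:05.65,Default,,0,0,0,,Hello world 7". In a filter, frame_no and start_ms would come from a per-instance counter and the frame's pts, so several filter instances would not interfere with each other.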
>
> [...]
>
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Dictatorship: All citizens are under surveillance, all their steps and
> actions recorded, for the politicians to enforce control.
> Democracy: All politicians are under surveillance, all their steps and
> actions recorded, for the citizens to enforce control.
>
>
>

