[FFmpeg-devel] [PATCH] lavfi: add thumb video filter.

Clément Bœsch ubitux at gmail.com
Mon Dec 5 17:41:10 CET 2011


On Wed, Nov 30, 2011 at 07:02:49PM +0100, Nicolas George wrote:
> On Nonidi 9 Frimaire, year CCXX, Clément Bœsch wrote:
> > +@section thumb
> 
> Maybe thumbnail in full? Or even thumbnails, plural, depending on what it
> does.
> 

OK for "thumbnail", I'd like to avoid the plural version though.

> > +Select potential thumbnail frames.
> > +
> > +It accepts as argument the threshold of frames to analyze (default is 100). A
> > +bigger value will result in a slower analysis and higher memory usage, but is
> > +likely to be more efficient.
> 
> By reading that, I have no idea what the filter actually does, and I can not
> fathom what the threshold is used for.
> 

I added a sentence; it will hopefully clarify things. If not, any
suggestion is welcome :)
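
For illustration, here is a minimal standalone sketch of the selection step as
the attached patch describes it: average the histograms of a batch of frames,
then keep the frame whose histogram is closest to that average. The names
pick_thumbnail, hist_mse and nb_frames, and the flat histogram layout, are
hypothetical and not taken from the patch. The sketch ranks frames by mean
squared error instead of RMSE; since the square root is monotonic, the same
frame wins either way.

```c
#include <assert.h>
#include <stddef.h>

#define HIST_SZ (3*256)   /* 256 bins for each of R, G and B, as in the patch */

/* Mean squared error between one frame histogram and the batch average. */
static double hist_mse(const int *hist, const double *avg)
{
    double sum = 0;
    for (int i = 0; i < HIST_SZ; i++) {
        double err = avg[i] - hist[i];
        sum += err * err;
    }
    return sum / HIST_SZ;
}

/* Return the index of the frame whose histogram is closest to the average.
 * hists is a flat array (hypothetical layout): frame i starts at
 * hists + i*HIST_SZ. */
static int pick_thumbnail(const int *hists, int nb_frames)
{
    double avg[HIST_SZ] = {0};
    double min_mse = 0;
    int best = 0;

    assert(nb_frames > 0);

    /* average histogram of the batch */
    for (int j = 0; j < HIST_SZ; j++) {
        for (int i = 0; i < nb_frames; i++)
            avg[j] += hists[(size_t)i * HIST_SZ + j];
        avg[j] /= nb_frames;
    }

    /* keep the frame with the smallest error against the average */
    for (int i = 0; i < nb_frames; i++) {
        double mse = hist_mse(hists + (size_t)i * HIST_SZ, avg);
        if (i == 0 || mse < min_mse) {
            best = i;
            min_mse = mse;
        }
    }
    return best;
}
```

A black frame in a mostly bright batch ends up far from the average histogram,
so it is unlikely to be picked; that is the point of the filter.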

> Also
> 
> > + * Copyright (C) 2011 Smartjog S.A.S, Clément Bœsch <clement.boesch at smartjog.com>
> 
> Apparently, the other files use lowercase C for copyright.
> 

OK, fixed.

> > + * Algorithm by Vadim Zaliva <lord at crocodile.org>.
> 
> Was this address openly written on the web until now?
> 

Yes, at least here:
http://web.archive.org/web/20070519135542/http://www.crocodile.org/lord/thumbextraction/vthumb.py

And Google gives more hints.

> > +    AVFilterBufferRef **frames;     ///< n_frames frames cache
> > +    int *histogram;                 ///< array of RGB color distribution histograms
> 
> The calloc somewhat lower seems to indicate that:
> 
> struct thumb_frame {
>     AVFilterBufferRef *buf;
>     int hist[HIST_SZ];
> };
> 
> and then:
> 
>     struct thumb_frame *frames;
> 
> would be more elegant throughout the code.
> 

Agreed, changed.

> > +               "allocation failure, try to lower the frames threshold\n");
> 
> Apparently, they want capital on messages.
> 

Fixed here and below.

> > +static float frame_rmse(const int *hist, const float *median)
> 
> Why a float rather than a double?
> 

There is really no need for such precision IMO. It can be changed later if
really needed.

> > +static av_cold void uninit(AVFilterContext *ctx)
> > +{
> > +    ThumbContext *thumb = ctx->priv;
> > +    av_freep(&thumb->histogram);
> 
> You do not free thumb->frames?
> 

Mmh, fixed.

> > +    .inputs    = (const AVFilterPad[])  {{  .name             = "default",
> > +                                            .type             = AVMEDIA_TYPE_VIDEO,
> 
> My narrow terminal finds that much of indentation not very readable, but
> that is up to you.
> 

Yeah I never know how to indent them properly. Hopefully the new version
is better.

The attached patch also includes the changes requested by Michael. I'm
still not really satisfied with the memory usage though...
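
To give a rough idea of the memory cost, the sketch below estimates how much
data is held while a batch is being collected, assuming each cached frame is a
tightly packed RGB24 picture (3 bytes per pixel) plus its histogram. The helper
name thumb_cache_bytes is hypothetical, and the estimate ignores linesize
padding and allocator overhead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIST_SZ (3*256)

/* Approximate number of bytes held while n_frames frames are cached. */
static uint64_t thumb_cache_bytes(int w, int h, int n_frames)
{
    uint64_t pixels = (uint64_t)w * h * 3;       /* RGB24: 3 bytes per pixel */
    uint64_t hist   = HIST_SZ * sizeof(int);     /* one histogram per frame  */

    assert(w > 0 && h > 0 && n_frames > 0);
    return (pixels + hist) * n_frames;
}
```

Under these assumptions, thumb_cache_bytes(1920, 1080, 100) comes out at a bit
over 620 MB, almost all of it pixel data; the histograms are negligible.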

-- 
Clément B.
-------------- next part --------------
From afa3ee75d97cf805ffc428e71724634b8e07a6eb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Cl=C3=A9ment=20B=C5=93sch?= <clement.boesch at smartjog.com>
Date: Mon, 24 Oct 2011 17:11:10 +0200
Subject: [PATCH] lavfi: add thumb video filter.

---
 Changelog                  |    1 +
 doc/filters.texi           |   12 +++
 libavfilter/Makefile       |    1 +
 libavfilter/allfilters.c   |    1 +
 libavfilter/avfilter.h     |    2 +-
 libavfilter/vf_thumbnail.c |  195 ++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 211 insertions(+), 1 deletions(-)
 create mode 100644 libavfilter/vf_thumbnail.c

diff --git a/Changelog b/Changelog
index 6ab3e84..4a65486 100644
--- a/Changelog
+++ b/Changelog
@@ -130,6 +130,7 @@ easier to use. The changes are:
 - Microsoft Windows ICO demuxer
 - life source
 - PCM format support in OMA demuxer
+- Thumbnails support (see thumbnail video filter)
 
 
 version 0.8:
diff --git a/doc/filters.texi b/doc/filters.texi
index eb1a4df..e6b34eb 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -2393,6 +2393,18 @@ For example:
 will create two separate outputs from the same input, one cropped and
 one padded.
 
+@section thumbnail
+Select potential thumbnail frames.
+
+It accepts as argument the number of frames to analyze in each batch (default is
+100). The filter picks one frame from each batch. A bigger value results in slower
+analysis and higher memory usage, but is likely to pick a more representative frame.
+
+Example of thumbnail creation:
+@example
+ffmpeg -i in.avi -vf thumbnail,scale=300:200 -vframes 1 out.png
+@end example
+
 @section transpose
 
 Transpose rows with columns in the input video and optionally flip it.
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index 7ee4d17..1faead7 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -78,6 +78,7 @@ OBJS-$(CONFIG_SETTB_FILTER)                  += vf_settb.o
 OBJS-$(CONFIG_SHOWINFO_FILTER)               += vf_showinfo.o
 OBJS-$(CONFIG_SLICIFY_FILTER)                += vf_slicify.o
 OBJS-$(CONFIG_SPLIT_FILTER)                  += vf_split.o
+OBJS-$(CONFIG_THUMBNAIL_FILTER)              += vf_thumbnail.o
 OBJS-$(CONFIG_TRANSPOSE_FILTER)              += vf_transpose.o
 OBJS-$(CONFIG_UNSHARP_FILTER)                += vf_unsharp.o
 OBJS-$(CONFIG_VFLIP_FILTER)                  += vf_vflip.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 2fe92bd..def8571 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -89,6 +89,7 @@ void avfilter_register_all(void)
     REGISTER_FILTER (SHOWINFO,    showinfo,    vf);
     REGISTER_FILTER (SLICIFY,     slicify,     vf);
     REGISTER_FILTER (SPLIT,       split,       vf);
+    REGISTER_FILTER (THUMBNAIL,   thumbnail,   vf);
     REGISTER_FILTER (TRANSPOSE,   transpose,   vf);
     REGISTER_FILTER (UNSHARP,     unsharp,     vf);
     REGISTER_FILTER (VFLIP,       vflip,       vf);
diff --git a/libavfilter/avfilter.h b/libavfilter/avfilter.h
index 24998a6..1e295ab 100644
--- a/libavfilter/avfilter.h
+++ b/libavfilter/avfilter.h
@@ -29,7 +29,7 @@
 #include "libavutil/rational.h"
 
 #define LIBAVFILTER_VERSION_MAJOR  2
-#define LIBAVFILTER_VERSION_MINOR 51
+#define LIBAVFILTER_VERSION_MINOR 52
 #define LIBAVFILTER_VERSION_MICRO  0
 
 #define LIBAVFILTER_VERSION_INT AV_VERSION_INT(LIBAVFILTER_VERSION_MAJOR, \
diff --git a/libavfilter/vf_thumbnail.c b/libavfilter/vf_thumbnail.c
new file mode 100644
index 0000000..905cc94
--- /dev/null
+++ b/libavfilter/vf_thumbnail.c
@@ -0,0 +1,195 @@
+/*
+ * Copyright (c) 2011 Smartjog S.A.S, Clément Bœsch <clement.boesch at smartjog.com>
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * Potential thumbnail lookup filter to reduce the risk of an inappropriate
+ * selection (such as a black frame) we could get with an absolute seek.
+ *
+ * Algorithm by Vadim Zaliva <lord at crocodile.org>.
+ * @url http://notbrainsurgery.livejournal.com/29773.html
+ */
+
+#include <math.h>
+#include "libavcodec/avcodec.h"
+#include "libavutil/imgutils.h"
+#include "libavutil/internal.h"
+#include "libswscale/swscale.h"
+#include "avfilter.h"
+
+#define HIST_SZ (3*256)
+
+struct thumb_frame {
+    AVFilterBufferRef *buf;     ///< cached frame
+    int histogram[HIST_SZ];     ///< RGB color distribution histogram of the frame
+};
+
+typedef struct {
+    int n;                      ///< current frame
+    int n_frames;               ///< threshold of frames for analysis
+    struct thumb_frame *frames; ///< the n_frames frames
+} ThumbContext;
+
+static av_cold int init(AVFilterContext *ctx, const char *args, void *opaque)
+{
+    ThumbContext *thumb = ctx->priv;
+
+    if (args)
+        thumb->n_frames = strtol(args, NULL, 10);
+    if (thumb->n_frames <= 0)
+        thumb->n_frames = 100;
+    thumb->frames = av_calloc(thumb->n_frames, sizeof(*thumb->frames));
+    if (!thumb->frames) {
+        av_log(ctx, AV_LOG_ERROR,
+               "Allocation failure, try to lower the frames threshold\n");
+        return AVERROR(ENOMEM);
+    }
+    av_log(ctx, AV_LOG_INFO, "Select thumbnail with threshold of %d frames\n",
+           thumb->n_frames);
+    return 0;
+}
+
+static void draw_slice(AVFilterLink *inlink, int y, int h, int slice_dir)
+{
+    int i, j;
+    AVFilterContext *ctx = inlink->dst;
+    ThumbContext *thumb = ctx->priv;
+    int *hist = thumb->frames[thumb->n].histogram;
+    AVFilterBufferRef *picref = inlink->cur_buf;
+    const uint8_t *p = picref->data[0] + y * picref->linesize[0];
+
+    for (j = 0; j < h; j++) {
+        for (i = 0; i < inlink->w; i++) {
+            hist[0*256 + p[i*3    ]]++;
+            hist[1*256 + p[i*3 + 1]]++;
+            hist[2*256 + p[i*3 + 2]]++;
+        }
+        p += picref->linesize[0];
+    }
+}
+
+/**
+ * @brief        compute Root-mean-square deviation to estimate "closeness"
+ * @param hist   color distribution histogram
+ * @param median average color distribution histogram
+ * @return       root mean squared error
+ */
+static float frame_rmse(const int *hist, const float *median)
+{
+    int i;
+    float err, mean_sq_err = 0;
+    for (i = 0; i < HIST_SZ; i++) {
+        err = median[i] - (float)hist[i];
+        mean_sq_err += err*err / HIST_SZ;
+    }
+    return sqrtf(mean_sq_err);
+}
+
+static void end_frame(AVFilterLink *inlink)
+{
+    int i, j;
+    float avg[HIST_SZ] = {0}, rmse, min_rmse = -1;
+    int best_frame = 0;
+    AVFilterLink *outlink = inlink->dst->outputs[0];
+    ThumbContext *thumb = inlink->dst->priv;
+
+    // keep a reference of each frame
+    thumb->frames[thumb->n].buf = inlink->cur_buf;
+
+    // no selection until the buffer of N frames is filled up
+    if (thumb->n < thumb->n_frames - 1) {
+        thumb->n++;
+        return;
+    }
+
+    // average histogram of the N frames
+    for (j = 0; j < FF_ARRAY_ELEMS(avg); j++)
+        for (i = 0; i < thumb->n_frames; i++)
+            avg[j] += (float)thumb->frames[i].histogram[j] / thumb->n_frames;
+
+    // find the frame closest to the average using RMSE
+    for (i = 0; i < thumb->n_frames; i++) {
+        rmse = frame_rmse(thumb->frames[i].histogram, avg);
+        if (i == 0 || rmse < min_rmse)
+            best_frame = i, min_rmse = rmse;
+    }
+
+    // free and reset everything (except the best frame buffer)
+    for (i = 0; i < thumb->n_frames; i++) {
+        memset(thumb->frames[i].histogram, 0, sizeof(thumb->frames[i].histogram));
+        if (i == best_frame)
+            continue;
+        avfilter_unref_buffer(thumb->frames[i].buf);
+        thumb->frames[i].buf = NULL;
+    }
+    thumb->n = 0;
+
+    // raise the chosen one
+    avfilter_start_frame(outlink, thumb->frames[best_frame].buf);
+    avfilter_draw_slice(outlink, 0, inlink->h, 1);
+    avfilter_end_frame(outlink);
+}
+
+static av_cold void uninit(AVFilterContext *ctx)
+{
+    int i;
+    ThumbContext *thumb = ctx->priv;
+    for (i = 0; i < thumb->n_frames && thumb->frames[i].buf; i++) {
+        avfilter_unref_buffer(thumb->frames[i].buf);
+        thumb->frames[i].buf = NULL;
+    }
+    av_freep(&thumb->frames);
+}
+
+static int query_formats(AVFilterContext *ctx)
+{
+    static const enum PixelFormat pix_fmts[] = {
+        PIX_FMT_RGB24, PIX_FMT_BGR24,
+        PIX_FMT_NONE
+    };
+    avfilter_set_common_pixel_formats(ctx, avfilter_make_format_list(pix_fmts));
+    return 0;
+}
+
+static void null_start_frame(AVFilterLink *link, AVFilterBufferRef *picref) { }
+
+AVFilter avfilter_vf_thumbnail = {
+    .name          = "thumbnail",
+    .description   = NULL_IF_CONFIG_SMALL("Thumbnail selection filter"),
+    .priv_size     = sizeof(ThumbContext),
+    .init          = init,
+    .uninit        = uninit,
+    .query_formats = query_formats,
+    .inputs        = (const AVFilterPad[]) {
+        {   .name             = "default",
+            .type             = AVMEDIA_TYPE_VIDEO,
+            .get_video_buffer = avfilter_null_get_video_buffer,
+            .start_frame      = null_start_frame,
+            .draw_slice       = draw_slice,
+            .end_frame        = end_frame,
+        },{ .name = NULL }
+    },
+    .outputs       = (const AVFilterPad[]) {
+        {   .name             = "default",
+            .type             = AVMEDIA_TYPE_VIDEO,
+            .rej_perms        = AV_PERM_REUSE2,
+        },{ .name = NULL }
+    },
+};
-- 
1.7.7.3
