[FFmpeg-soc] libavfilter - filter for .ass subtitle rendering using libass

Alexey Lebedeff binarin at binarin.ru
Thu Feb 26 08:56:49 CET 2009


On Thu, 26 Feb 2009 00:45:04 +0100, Stefano Sabatini wrote:
> On date Tuesday 2009-02-24 20:28:53 +0300, Alexey Lebedeff encoded:
>> On Tue, 24 Feb 2009 13:54:08 +0300, Alexey Lebedeff wrote:

>>   int vsub,hsub;   //< chroma subsampling
>
> missing space after the comma

And the same in the vf_drawbox.c =)

>> } AssContext;
>> 
>> static int parse_args(AVFilterContext *ctx, AssContext *context, const char* args);
>> static av_cold int init(AVFilterContext *ctx, const char *args, void *opaque)
>> {
>>   AssContext *context= ctx->priv;
>> 
>>   /* defaults */
>>   context->margin = 10;
>>   context->encoding = "utf-8";
>> 
>>   if ( parse_args(ctx, context, args) )
>
> if (parse_args(...))
>
> style is preferred.
>
>>     return 1;
>
> The documentation only says to return 0 in case of success, but all
> the current filters return a negative value in case of failure, which
> is consistent with the FFmpeg API, so I'd say to return -1 here and in
> all the other cases.
>
> Also I find quite confusing to parse the args in a distinct function,
> but maybe it's only me.

It was too big, and initially I wanted to place libass initialization
in init() too. But together they was so large, that I decided to split
them in such way.

>>   return 0;
>> }

>> static int config_input(AVFilterLink *link)
>> {
>>   AssContext *context = link->dst->priv;
>> 
>>   context->frame_width = link->w;
>>   context->frame_height = link->h;
>> 
>>   context->ass_library = ass_library_init();
>> 
>>   if ( !context->ass_library ) {
>>     av_log(0, AV_LOG_ERROR, "ass_library_init() failed!\n");
>>     return 1;
>>   }
>> 
>>   ass_set_fonts_dir(context->ass_library, "");
>>   ass_set_extract_fonts(context->ass_library, 1);
>>   ass_set_style_overrides(context->ass_library, NULL);
>> 
>>   context->ass_renderer = ass_renderer_init(context->ass_library);
>>   if ( ! context->ass_renderer ) {
>>     av_log(0, AV_LOG_ERROR, "ass_renderer_init() failed!\n");
>>     return 1;
>>   }
>> 
>>   ass_set_frame_size(context->ass_renderer, link->w, link->h);
>>   ass_set_margins(context->ass_renderer, context->margin, context->margin, context->margin, context->margin);
>>   ass_set_use_margins(context->ass_renderer, 1);
>>   ass_set_font_scale(context->ass_renderer, 1.);
>>   ass_set_fonts(context->ass_renderer, NULL, "Sans");
>> 
>>   context->ass_track = ass_read_file(context->ass_library, context->filename, context->encoding);
>>   if ( !context->ass_track ) {
>>     av_log(0, AV_LOG_ERROR, "Failed to read subtitle file with ass_read_file()!\n");
>>     return 1;
>>   }
>
> I don't see why this is done here rather than in init().
Because of link->w and link->h.
It needs to be checked whether ass_set_frame_size() can be called
anytime, or libass init sequence is fixed.

> Also av_log should always have a context.
I was not sure  what to use here, and 0 as context worked fine.
Will be the 'link->dst' the right parameter for av_log() here?

>>   avcodec_get_chroma_sub_sample(link->format,
>> 				&context->hsub, &context->vsub);
>> 
>>   return 0;
>> }
>> 
>> static void start_frame(AVFilterLink *link, AVFilterPicRef *picref)
>> {
>>   avfilter_start_frame(link->dst->outputs[0], picref);
>> }
>> 
>> #define _r(c)  ((c)>>24)
>> #define _g(c)  (((c)>>16)&0xFF)
>> #define _b(c)  (((c)>>8)&0xFF)
>> #define _a(c)  ((c)&0xFF)
>> #define rgba2y(c)  ( (( 263*_r(c)  + 516*_g(c) + 100*_b(c)) >> 10) + 16  )                                                                     
>> #define rgba2u(c)  ( ((-152*_r(c) - 298*_g(c) + 450*_b(c)) >> 10) + 128 )                                                                      
>> #define rgba2v(c)  ( (( 450*_r(c) - 376*_g(c) -  73*_b(c)) >> 10) + 128 )                                                                      

> Please avoid _foo var names, they are unreadable.
> I wonder if we can do such transformation using some function, if
> that's not the case maybe we should implement it in the lib, otherwise
> filter writers will write this again and again (see for example VHOOK).

This can be rewritten using swscale, as Bobby Bingham suggested.

Then code will do the following:
1) Render all subtitle parts onto a big RGB bitmap
2) Convert this bitmap to destination format using swscale
3) Blend converted result with source pic

This is not optimal, as swscale will do many unneeded work.
But all subtleties of different formats handling will be hidden from us.

>> static void draw_ass_image(AVFilterPicRef *pic, ass_image_t *img, AssContext *context)
>> {
>>   unsigned char *row[4];
>>   unsigned char c_y = rgba2y(img->color);
>>   unsigned char c_u = rgba2u(img->color);
>>   unsigned char c_v = rgba2v(img->color);
>>   unsigned char opacity = 255 - _a(img->color);
>>   unsigned char *src;
>>   int i, j;
>> 
>>   unsigned char *bitmap = img->bitmap;
>>   int bitmap_w = img->w;
>>   int bitmap_h = img->h;
>>   int dst_x = img->dst_x;
>>   int dst_y = img->dst_y;
>> 
>>   int channel;
>>   int x,y;
>> 
>>   src = bitmap;
>> 
>>   for (i = 0; i < bitmap_h; ++i) {
>>     y = dst_y + i;
>>     if ( y >= pic->h )
>>       break;
>> 
>>     row[0] = pic->data[0] + y * pic->linesize[0];
>> 
>>     for (channel = 1; channel < 3; channel++)
>>       row[channel] = pic->data[channel] +
>> 	pic->linesize[channel] * (y>> context->vsub);
>> 
>>     for (j = 0; j < bitmap_w; ++j) {
>>       unsigned k = ((unsigned)src[j]) * opacity / 255;
>> 
>>       x = dst_x + j;
>>       if ( y >= pic->w )
>> 	break;
>> 
>>       row[0][x] = (k*c_y + (255-k)*row[0][x]) / 255;
>>       row[1][x >> context->hsub] = (k*c_u + (255-k)*row[1][x >> context->hsub]) / 255;
>>       row[2][x >> context->hsub] = (k*c_v + (255-k)*row[2][x >> context->hsub]) / 255;
>>     }
>> 
>>     src += img->stride;
>>   } 
>> }

> It's always better to use the slice API, would be possible to
> implement this overlay algorithm per-slice (I bet yes)?

What is slice API and where I can read about it?

> I don't think this is a good syntax, ':' is already used with a
> special meaning (to separate params) in most filters, if you want to
> support a list of key/val values (which looks like a good idea, at
> least better than a list of unnamed params) then I think it's better
> to use the syntax:
> param1=val1:param2=val2...:paramN=valN
Wanted to do this, but '=' was used by graph parsed, and I was
unaware that equal signs can be quoted.

> As already noted, the patch as it is currently designed maybe won't be
> accepted in SVN anyway, I think the "right" solution for the FFmpeg
> standards would be to use/implement/extend somehow the native ASS
> libavcodec decoder, then maybe we should also extend the lavfi API to
> also support subtitle frames.

I have some interest in subtitle support and some spare time, so I'll
try to implement the "right" solution.

> But maybe we could provide some place (multimedia wiki?) where to post
> "interesting/useful" filters which for some reason or another cannot
> be committed right into SVN.
>
> Then we may design a system (a-la VHOOK, but implemented better) to
> dynamically load user-provided filters, so no need to manually
> integrate them each time in the native build system.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-soc/attachments/20090226/671945fe/attachment.pgp>


More information about the FFmpeg-soc mailing list