[FFmpeg-devel] [RFC] AST subtitles

Nicolas George nicolas.george at normalesup.org
Wed Nov 28 14:40:30 CET 2012


On quintidi 5 Frimaire, year CCXXI, Clément Bœsch wrote:
> Yes that's exactly what I was trying to do, but I hit two problems
> already, which I don't know how to solve:
> 
>  - we need to have an ASS encoder context to do that (and you can't just
>    call the encode function directly because you need the global styles
>    information), and this gets kind of ugly quickly.
>
>  - the subtitles encoding API is limited: the encoder takes a buffer +
>    buffer size instead of an AVPacket, which makes it kind of awkward to
>    use within the decoder. The API needs a lift as you said at the end of
>    your mail.

It can be done the other way around: move the necessary code from the ASS
encoder to a private API ff_styled_text_to_ass_line(), and then call this
private API both from avcodec_decode_subtitles2() and from the ASS encoder.

> > >         union {
> > >             char *s;        ///< must be a av_malloc'ed string if string type
> > >             double d;
> > >             int i;
> > >             int64_t i64;
> > >             uint32_t u32;
> > >             void *p;        /**< pointer to allocated data of an arbitrary
> > >                                  size (chunk type dependent) */
> > >         };
> > >         int p_nb;           /**< number of entries in p, can be used for
> > >                                  variable sized data */
> > >     } AVSubtitleASTChunk;
> > 
> > The "p_nb" field name is inconsistent.
> > 
> 
> nb_p?

nb_something. I realize that the union field in AVSubtitleASTChunk does not
have a name: I did not think that would work, but it does with gcc. Is it
standard?
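
On the standards question: anonymous unions (and structs) were added to ISO C
in C11; in C89/C99 modes gcc accepts them as a GNU extension. Below is a
minimal sketch of the construct. The Chunk type and chunk_get_i64() are
hypothetical stand-ins for illustration only, not the proposed
AVSubtitleASTChunk definition.

```c
#include <stdint.h>

/* Hypothetical stand-in for AVSubtitleASTChunk, reduced to a few members. */
typedef struct Chunk {
    int type;
    union {             /* anonymous: no member name after the closing brace */
        double   d;
        int      i;
        int64_t  i64;
        uint32_t u32;
        void    *p;
    };
    int nb_p;           /* hypothetical name for the entry count */
} Chunk;

/* Members of the anonymous union are accessed as if they belonged to Chunk. */
static int64_t chunk_get_i64(const Chunk *c)
{
    return c->i64;
}
```

Compilers restricted to strict C99 may warn about or reject this, so naming the
union (for example "u") would be the portable alternative for a public header.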

> I'm already doing a doubling reallocation. Sorry, I should have pasted the
> code:

No problem. That looks fine. I always forget the trick of testing if the
size is a power of 2.
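
For readers who do not have the quoted code at hand, here is a minimal sketch
of that trick. This is not the code from the patch: IntArray and
int_array_add() are hypothetical names, and plain realloc() is used instead of
the av_realloc family.

```c
#include <stdlib.h>

/* Hypothetical growable int array; only the element count is stored. */
typedef struct IntArray {
    int *data;
    int  nb;
} IntArray;

/* Append v; returns 0 on success, -1 on allocation failure. */
static int int_array_add(IntArray *a, int v)
{
    /* nb & (nb - 1) is zero exactly when nb is 0 or a power of two.
     * Since the capacity only ever takes the values 0, 1, 2, 4, ...,
     * those are exactly the counts at which the buffer is full. */
    if (!(a->nb & (a->nb - 1))) {
        size_t new_cap = a->nb ? (size_t)a->nb * 2 : 1;
        int *tmp = realloc(a->data, new_cap * sizeof(*tmp));
        if (!tmp)
            return -1;  /* a->data is left untouched on failure */
        a->data = tmp;
    }
    a->data[a->nb++] = v;
    return 0;
}
```

The point of the test is that no separate "allocated size" field is needed:
the capacity is implied by the current count.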

> Yes, I don't know yet, we'll indeed likely need to allocate the
> AVSubtitleASTSettings into the AVCodecContext in the decoder init
> callback.

Yes, that is exactly my understanding of the problem.

> I'd say the decoder will have to make its own list of profiles depending
> on the set of styles it expects. I don't really want to make the users and
> encoders deal with complex trees of styles and inheritance processes.
> That information won't be restored properly in most of the output
> subtitles, so since we will likely have to "flatten" this stuff before
> encoding, I'd say it's up to the decoder to make the stupid markup it is
> parsing accessible & simple for any encoder.

Maybe. I am not completely sure that subtitles "rectangles" should not be
able to point to several global styles. That would probably solve most of
the non-trivial cases.

> The more I do this, the more I realize there are a lot of things to solve
> before that…

It is a good start.

> The decoding function is already taking an AVPacket and outputting an
> AVSubtitle. The encoding function on the other hand is pretty much a relic
> of the past, with one buffer + a buffer size as stated at the beginning of
> this mail (look at how ugly it's done in ffmpeg…). This is IMO what we
> should acknowledge first, but it's not simple.
> 
> There are actually all sorts of factors to deal with, so let me summarize:
> 
>  - as you just said, making AVSubtitle heap allocated is one step ahead,
>    but it's actually not blocking for what I'm doing right now: the
>    SUBTITLE_AST means to add a field in the rectangle structure (which is
>    allocated internally), not AVSubtitle, so it shouldn't really matter.
>    Though, it might be required later, so feel free to do it.

True.

>  - currently, the text subtitle decoders are filling the rects[x]->ass
>    fields not only with the ASS "payload" but as if they were /lines/ of an
>    ASS file. The transformation can be summarized as follows:
> 
>      o "Dialogue: " is added at the beginning
>      o start time and end time are added in the payload
>      o field order is dropped
>      o \r\n is added at the end
> 
>    This is problematic because this packet data cannot be sent to
>    libass in a sane manner: instead of using libass/ass_process_chunk()
>    like you would do with a simple packet, you need to call
>    libass/ass_process_data() (note that this was added to libass for that
>    specific reason), to re-parse the whole line. This is by the way why
>    MPlayer is doing a memcmp("Dialogue:"...) on the data packet…
> 
>    Now that AVPackets contain the pts and duration, I think it's wise to
>    make the ASS and Matroska demuxers output these packets in a proper
>    format. We will need to change a few things such as making the ASS
>    muxer add the timing, and remove the hack from the Matroska muxer.
> 
>    I don't mind doing this, but since it will change the layout of the
>    packets, it will break applications that expect ASS lines rather than
>    raw ASS packets. Any opinion?

Already answered in another mail.
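
For reference, the transformation listed in the quote can be sketched as
follows. This is only an illustration under stated assumptions: the raw payload
is assumed to use the Matroska-style field order
"ReadOrder,Layer,Style,Name,MarginL,MarginR,MarginV,Effect,Text", times are
assumed to be in centiseconds, and format_ass_time() and ass_payload_to_line()
are hypothetical helpers, not existing libavcodec functions.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Write one ASS timestamp (H:MM:SS.cc) from a time in centiseconds. */
static int format_ass_time(char *dst, size_t size, int64_t cs)
{
    return snprintf(dst, size, "%d:%02d:%02d.%02d",
                    (int)(cs / 360000), (int)(cs / 6000 % 60),
                    (int)(cs / 100 % 60), (int)(cs % 100));
}

/* Turn a raw payload into a "Dialogue:" line.
 * Returns 0 on success, -1 on malformed input or if dst is too small. */
static int ass_payload_to_line(char *dst, size_t size, const char *payload,
                               int64_t start_cs, int64_t end_cs)
{
    char start[32], end[32];
    const char *layer, *rest;
    int n;

    layer = strchr(payload, ',');       /* skip ReadOrder: it is dropped */
    if (!layer)
        return -1;
    layer++;
    rest = strchr(layer, ',');          /* end of the Layer field */
    if (!rest)
        return -1;

    format_ass_time(start, sizeof(start), start_cs);
    format_ass_time(end,   sizeof(end),   end_cs);

    /* "Dialogue: " prefix, Layer, both timestamps, remaining fields, CRLF. */
    n = snprintf(dst, size, "Dialogue: %.*s,%s,%s,%s\r\n",
                 (int)(rest - layer), layer, start, end, rest + 1);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With the payload "0,0,Default,,0,0,0,,Hello", a start of 100 and an end of 250,
this produces "Dialogue: 0,0:00:01.00,0:00:02.50,Default,,0,0,0,,Hello"
followed by CRLF. A consumer receiving such lines has to re-parse them to get
the timing and payload back, which is the problem the quote describes.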

> I think we will need to consider the encoding/charset at some
> point too, but it should be doable in a nice way with the
> AVStyledSubtitles, hopefully.

Yes, clearly; and the LF/CRLF thing too. I suppose you specified that the
text in AVStyledSubtitles is in UTF-8 and does not contain newlines?

Regards,

-- 
  Nicolas George