[FFmpeg-devel] Format of decoded subtitles (was: matroska: Identify S_TEXT/UTF-8 tracks as SRT and not TEXT.)
nicolas.george at normalesup.org
Thu May 24 14:19:02 CEST 2012
On quartidi 4 Prairial, year CCXX, Clément Bœsch wrote:
> Most players use ASS rendering for all subtitles (assuming a conversion
> of the original subtitles markup into ASS), which is BTW what we do in our
> text subtitles decoders (SubRip, MicroDVD, and JacoSUB). ASS rendering is
> expected by most people even for these formats.
> ASS also handles almost every "useful" markup (of course I have a bunch
> of exceptions in mind) at the moment. If a new subtitle format is meant to
> replace ASS, it will likely keep some kind of retro compatibility with it
> (otherwise it will be a pain for almost all current decoders/players),
> and so moving our internal formats to this new one should not be much a
> I'm not sure about what you mean by handling the markup syntax the same
> way we handle pixel/sample formats.
What I meant was this: in AVFrame, the decoded video is in arrays of
integers, but there is a pix_fmt field that says if these arrays are YUV420P
or RGBA. If we have one and want the other, there is libswscale to do the
conversion; sometimes it is lossless, sometimes it is not.
For decoded text subtitles, there would be a markup_syntax field with values
like SUB_MARKUP_ASS or SUB_MARKUP_HTML. And an API to convert, losslessly or
not, from one markup to another.
Of course, if we have a perfect round-trip MARKUP_X -> MARKUP_Y -> MARKUP_X
(this can happen even if Y has features that X does not have, as they will
not be used in a Y converted from X; OTOH, if Y is case-sensitive and X is
not, we may lose the case information, which may be considered acceptable),
then MARKUP_X is redundant as an internal format: we can always convert to
and from Y.
(This is not true for video: we cannot convert everything to 32 bits per
component because of performance issues.)
If, as you say, the ASS markup can express all the features of any other
known markup, then we can adopt ASS as a universal markup syntax, and
expect all subtitle codecs to encode/decode that markup.
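
To make the markup_syntax idea a bit more concrete, here is a minimal sketch
of a tagged text buffer and a converter between two markups. Everything in it
is hypothetical: SubText, sub_markup_convert() and the enum values are names
made up for illustration and do not exist in libavcodec, and only the
italic/bold tags are handled.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names: none of this is an existing FFmpeg API. */
enum SubMarkup {
    SUB_MARKUP_HTML,   /* only <i> and <b> are considered here */
    SUB_MARKUP_ASS,    /* ASS override tags such as {\i1} */
};

typedef struct SubText {
    enum SubMarkup markup_syntax;  /* plays the role pix_fmt plays for video */
    char text[256];
} SubText;

/* Tag pairs that mean the same thing in both markups. */
static const struct { const char *html, *ass; } tag_map[] = {
    { "<i>",  "{\\i1}" },
    { "</i>", "{\\i0}" },
    { "<b>",  "{\\b1}" },
    { "</b>", "{\\b0}" },
};
#define NB_TAGS (sizeof(tag_map) / sizeof(tag_map[0]))

/* Convert src to the target markup, writing into dst (which must not alias
 * src). Returns 0 on success, -1 if the output does not fit in dst->text. */
static int sub_markup_convert(SubText *dst, const SubText *src,
                              enum SubMarkup target)
{
    const int to_ass = target == SUB_MARKUP_ASS;
    const char *p = src->text;
    size_t out = 0;

    assert(dst != src);
    if (src->markup_syntax == target) {   /* nothing to convert */
        *dst = *src;
        return 0;
    }
    while (*p) {
        size_t i;
        /* Look for a known tag of the source markup at the current position. */
        for (i = 0; i < NB_TAGS; i++) {
            const char *from = to_ass ? tag_map[i].html : tag_map[i].ass;
            if (!strncmp(p, from, strlen(from)))
                break;
        }
        if (i < NB_TAGS) {                /* known tag: emit its equivalent */
            const char *from = to_ass ? tag_map[i].html : tag_map[i].ass;
            const char *to   = to_ass ? tag_map[i].ass  : tag_map[i].html;
            size_t len = strlen(to);
            if (out + len >= sizeof(dst->text))
                return -1;
            memcpy(dst->text + out, to, len);
            out += len;
            p   += strlen(from);
        } else {                          /* anything else is copied as-is */
            if (out + 1 >= sizeof(dst->text))
                return -1;
            dst->text[out++] = *p++;
        }
    }
    dst->text[out] = '\0';
    dst->markup_syntax = target;
    return 0;
}
```

With this toy subset, HTML -> ASS -> HTML gives back the original string,
which is the round-trip property discussed above; a real converter would
also have to decide what to do with tags the target markup cannot express.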
> BTW, I had in mind something about subtitles: I think the decode subtitles
> API should do the ASS rendering if possible; calling
> avcodec_decode_subtitles() with a "render_ass" flag to decode
> ASS-compliant subtitles (aka the decoder returns ASS packets) into a
> bitmap layout ready-to-blit by the player/transcoder (ffplay can already
> do that kind of subtitles bitmap rendering). It might avoid some pain with
> lavfi (except hardsubbing, does anyone see any more potentially useful
> subtitles filtering for lavfi?).
> Last time I looked for this solution I expected quite a few problems
> (which I can't remember now I admit), but maybe it's worth looking at this
There are a lot of issues with that:
First, an application may want to alter the subtitles before rendering them
(stupid example: use an automated translation system), so we at least need
an entry point for that. That is not much of a problem.
Second, there is the issue Reimar raised when I implemented multi-rectangle
rendering in mplayer a few weeks back: subtitles often occupy a small
proportion of the whole video, but the closest-fit rectangle may be huge.
Performance-wise, this is not very good.
Third, rendering vector content requires the target resolution, and that
depends on where exactly in the filter sequence the overlay is applied.
Fourth, it needs to handle overlapping subtitles. Even with seeking.
Fifth, we cannot have ffmpeg depend on an external library like libass for
one of its core features. Even worse: for correct regression testing, we
would need internal handling of fonts.
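
To put rough numbers on the second point, here is a small sketch comparing
the closest-fit rectangle with the area the subtitles actually cover. The
Rect type and both helpers are hypothetical names for illustration (this is
not AVSubtitleRect), and the coordinates are made up.

```c
#include <assert.h>

/* Hypothetical rectangle type, in pixels. */
typedef struct Rect { int x, y, w, h; } Rect;

/* Smallest rectangle enclosing all n rectangles (n must be > 0). */
static Rect rect_bounding_box(const Rect *r, int n)
{
    Rect bb;
    int x0, y0, x1, y1, i;

    assert(n > 0);
    x0 = r[0].x;
    y0 = r[0].y;
    x1 = r[0].x + r[0].w;
    y1 = r[0].y + r[0].h;
    for (i = 1; i < n; i++) {
        if (r[i].x < x0)          x0 = r[i].x;
        if (r[i].y < y0)          y0 = r[i].y;
        if (r[i].x + r[i].w > x1) x1 = r[i].x + r[i].w;
        if (r[i].y + r[i].h > y1) y1 = r[i].y + r[i].h;
    }
    bb.x = x0;
    bb.y = y0;
    bb.w = x1 - x0;
    bb.h = y1 - y0;
    return bb;
}

/* Sum of the individual areas (assumes the rectangles do not overlap). */
static long rect_total_area(const Rect *r, int n)
{
    long area = 0;
    int i;

    for (i = 0; i < n; i++)
        area += (long)r[i].w * r[i].h;
    return area;
}
```

For a sign at the top, {100, 40, 600, 60}, and a dialogue line at the bottom,
{400, 980, 1100, 70}, the two rectangles cover 113000 pixels, while their
bounding box is 1400x1010, i.e. 1414000 pixels: about 12 times more to blend.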
Hmm, it looks like I am bashing your suggestion; that is not my intent. Your
suggestion has a lot of merits.