[FFmpeg-devel] [PATCH 2/3] textdec: Rename all generic parts from srt to text.

Nicolas George nicolas.george at normalesup.org
Thu Aug 2 11:16:49 CEST 2012


On quintidi 15 Thermidor, year CCXX, Clément Bœsch wrote:
>  - I was told a long time ago that demuxers should not drop arbitrary data

I am sure you realize that, but just for the record: this is not arbitrary
data, this is structural data, and it is not dropped, it is converted into a
practical internal format. This is exactly what a demuxer is supposed to do.
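
To make the "converted, not dropped" point concrete, here is a minimal
sketch of turning a SubRip timing line into a start time and a duration.
SubEvent and parse_srt_timing are hypothetical names used only for
illustration, not the actual lavf code, and the millisecond time base is
an assumption.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the packet fields a demuxer would fill in;
 * this is not the real AVPacket. Times are in milliseconds (assumed). */
typedef struct SubEvent {
    int64_t pts;       /* start time */
    int64_t duration;  /* end time minus start time */
} SubEvent;

/* Parse "HH:MM:SS,mmm --> HH:MM:SS,mmm" into start time and duration.
 * Returns 0 on success, -1 if the line is not a valid timing line. */
static int parse_srt_timing(const char *line, SubEvent *ev)
{
    int h1, m1, s1, ms1, h2, m2, s2, ms2;
    int64_t start, end;

    if (sscanf(line, "%d:%d:%d,%d --> %d:%d:%d,%d",
               &h1, &m1, &s1, &ms1, &h2, &m2, &s2, &ms2) != 8)
        return -1;

    start = ((h1 * 60LL + m1) * 60 + s1) * 1000 + ms1;
    end   = ((h2 * 60LL + m2) * 60 + s2) * 1000 + ms2;
    if (end < start)
        return -1;

    /* The timing survives as structured fields instead of text. */
    ev->pts      = start;
    ev->duration = end - start;
    return 0;
}
```

The information from the timing line ends up in pts and duration, so the
payload handed to the decoder can contain the text and markup only.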

>  - that would allow subtitle muxers to be "raw" muxers

A small gain here for a large loss elsewhere, I do not think this is a good
idea.

We would probably be able to factor most of the code of the subtitles
muxers, though.

>  - if we now change that behaviour in lavf, some incompatibilities might
>    occur with a different lavc version

That is true, and that is the difficult point. I am still not sure how to
address that. Maybe we can use the AVFormatContext.subtitle_codec_id field
to let the application specify the kind of packets it wants to deal with,
and make the default value evolve progressively.

> The last point is a problem: we can't decide now that the lavc/srtdec
> will never receive a packet with timing in it.

We can decide that CODEC_ID_SRT has timings in it (and deprecate it) while
CODEC_ID_TEXT and CODEC_ID_TAGSOUP do not.
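
A rough sketch of how a consumer could act on that split. The enum below
is a hypothetical local mirror of the three IDs, not the libavcodec
definitions, and it assumes the legacy SRT payload starts with exactly one
timing line followed by the text.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical local enum mirroring the three IDs named above;
 * these are not the libavcodec values. */
enum SubCodec {
    SUB_CODEC_SRT,      /* legacy: timing line embedded in the payload */
    SUB_CODEC_TEXT,     /* plain text, timing carried outside payload  */
    SUB_CODEC_TAGSOUP   /* HTML-like markup, timing outside payload    */
};

/* Only the legacy SRT ID carries timing inside the payload. */
static bool payload_has_timing(enum SubCodec id)
{
    return id == SUB_CODEC_SRT;
}

/* Return a pointer to the text part of the payload.
 * Assumption: a legacy SRT payload is one timing line, a newline,
 * then the text. Other IDs are returned unchanged. */
static const char *payload_text(enum SubCodec id, const char *payload)
{
    if (payload_has_timing(id)) {
        const char *nl = strchr(payload, '\n');
        return nl ? nl + 1 : payload + strlen(payload);
    }
    return payload;
}
```

With this kind of dispatch a decoder can keep accepting old packets while
new demuxers emit timing-free payloads.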

> So why I am talking about this here? Well, I think ideally the subtitles
> decoder should *not* receive the timing information. This way, we would
> have:
> 
>  - Matroska demuxer outputting SubRip packets the same way the SubRip
>    demuxer would output SubRip packets (both without any timing
>    information). Both would use CODEC_ID_SRT, and the codec will just
>    honor the markup

Yes in principle, but we will need to deal with the fact that the markup
is wild. Matroska marks text subtitles as S_TEXT, it does not say anything
about markup, and it is IMHO an abuse on the part of the players to honor
it. But this abuse is so common that we have to deal with it.

>    [note: we might need to put the coords in the side
>    data or something].

I think this is the right place.
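
As an illustration only, the coordinates could travel as a small
fixed-size blob next to the packet. The layout below (four 32-bit
little-endian values x1, y1, x2, y2) and all names are assumptions made
for this sketch, not an existing lavc side data definition.

```c
#include <stdint.h>

/* Assumed layout: x1, y1, x2, y2, each as a 32-bit little-endian value. */
#define SUB_POS_SIZE 16

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Serialize the box so it can be attached alongside the packet data. */
static void pack_sub_position(uint8_t buf[SUB_POS_SIZE],
                              uint32_t x1, uint32_t y1,
                              uint32_t x2, uint32_t y2)
{
    put_le32(buf,      x1);
    put_le32(buf + 4,  y1);
    put_le32(buf + 8,  x2);
    put_le32(buf + 12, y2);
}

/* Read the box back; out[] receives x1, y1, x2, y2 in that order. */
static void unpack_sub_position(const uint8_t buf[SUB_POS_SIZE],
                                uint32_t out[4])
{
    int i;
    for (i = 0; i < 4; i++)
        out[i] = get_le32(buf + 4 * i);
}
```

A fixed byte order keeps the blob independent of the host endianness, so
the demuxer and the decoder do not need to agree on anything else.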

>  - Now, CODEC_ID_TEXT will be available for any formats using text
>    *without* SubRip markup (contrary to Matroska): this might be needed
>    for some video codec, but it means this codec will also be available
>    for various subtitles format where we don't need any markup. Right now
>    MicroDVD, JacoSUB, SAMI, SubViewer etc all have their own decoder for
>    special markups, but I'm sure various subtitle formats don't have any
>    markup system, and here CODEC_ID_TEXT would make sense: that cannot
>    work if CODEC_ID_TEXT is considered to be "html"/SubRip markup.

Except for the "contrary to Matroska" part, I agree with that.

> If we decide to make the demuxers drop the timestamping, I will gladly
> update jacosub/sami/subviewer/etc demuxers since I don't think anyone is
> using them yet.

Thanks.

> BTW, you might have noticed I don't like very much this tight link between
> SRT/SubRip and TEXT just because Matroska happened to play with the
> confusion.

Hear, hear!

Regards,

-- 
  Nicolas George