[FFmpeg-devel] [PATCH 2/3] textdec: Rename all generic parts from srt to text.

Thu Aug 2 14:42:18 CEST 2012

On Wed, Aug 01, 2012 at 08:25:24PM +0200, Clément Bœsch wrote:
> On Wed, Aug 01, 2012 at 10:53:28AM -0700, Philip Langdale wrote:
> > On Wed, 1 Aug 2012 18:51:02 +0200
> > Nicolas George <nicolas.george at normalesup.org> wrote:
> > 
> > > 
> > > This is not really text -> ass. It could be called "pseudohtml_to_ass"
> > 
> > Sure.
> >  
> > > >  
> > > > -static int srt_decode_frame(AVCodecContext *avctx,
> > > > -                            void *data, int *got_sub_ptr, AVPacket *avpkt)
> > > > +static int text_decode_frame(AVCodecContext *avctx,
> > > > +                             void *data, int *got_sub_ptr, AVPacket *avpkt)
> > > >  {
> > > >      AVSubtitle *sub = data;
> > > >      int ts_start, ts_end, x1 = -1, y1 = -1, x2 = -1, y2 = -1;
> > > > @@ -220,8 +220,8 @@ static int srt_decode_frame(AVCodecContext *avctx,
> > > >          ptr = read_ts(ptr, &ts_start, &ts_end, &x1, &y1, &x2, &y2);
> > > >          if (!ptr)
> > > >              break;
> > > > -        ptr = srt_to_ass(avctx, buffer, buffer+sizeof(buffer), ptr,
> > > > -                         x1, y1, x2, y2);
> > > > +        ptr = text_to_ass(avctx, buffer, buffer+sizeof(buffer), ptr,
> > > > +                          x1, y1, x2, y2);
> > > 
> > > After some thought, I am not comfortable with that. If the codec is
> > > text, it should have nothing to do with ASS, especially since ASS is
> > > still a mess of temporary hacks.
> > 
> > So, the problem here is matroska. When you put SRT into matroska, it
> > gets tagged as TEXT, but all the formatting remains, and should be
> > respected. You may recall the change I proposed a couple of months ago
> > to identify matroska TEXT tracks as SRT, as a way to make things line
> > up again. You and Clément felt that was abusive as the track is text
> > in the sense of not including SRT timing information. Fair enough, hence
> > I made this change.
> > 
> > But the pseudohtml styling is still present in the track and needs to
> > be respected, so any decoder that wants to decode CODEC_ID_TEXT and
> > work correctly with SRT-in-MKV must behave as written.
> > 
> > Either we do this, or we identify mkv text tracks as srt. I don't see
> > another solution (unless you want a third decoder just for
> > srt-in-mkv...)
> > 
> 
> Before I comment on this, I'd like to restate a few things:
> 
> At the moment, the current design for "pure" subtitles demuxers (aka not
> in video containers like mkv) is to split the text file into chunks and
> *NOT* discard the timing information (and the decoders just skip it).
> This was done for a few reasons:
> 
>  - I was told a long time ago that demuxers should not drop arbitrary data
>  - that would allow subtitle muxers to be "raw" muxers
>  - if we now change that behaviour in lavf, some incompatibilities might
>    occur with a different lavc version
> 
> For the first point, I don't know if that really makes sense for subtitles
> and if there is any strong reason behind it.
> 
> For the second point, it is actually an issue when dealing with -ss and
> -t, and eventually if we plan to do some timestamp scaling at some point.
> I think we agreed that muxers should handle the timing using the packet
> info and not by parsing the packet data again.
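
As a rough illustration of the last quoted point (a muxer deriving the
timing from packet info instead of re-parsing the payload), here is a
minimal sketch. It is not code from this patch series: format_srt_timing()
is a hypothetical helper, not a libavformat function, and it assumes the
packet start time and duration are already in milliseconds.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg API): build an SRT timing line
 * from a packet start time and duration, both in milliseconds,
 * without looking at the packet payload at all.
 * Returns the number of characters written, as snprintf() does. */
static int format_srt_timing(char *buf, size_t size,
                             int64_t start_ms, int64_t duration_ms)
{
    int64_t end_ms = start_ms + duration_ms;

    return snprintf(buf, size,
                    "%02d:%02d:%02d,%03d --> %02d:%02d:%02d,%03d",
                    (int)(start_ms / 3600000),
                    (int)(start_ms / 60000 % 60),
                    (int)(start_ms / 1000 % 60),
                    (int)(start_ms % 1000),
                    (int)(end_ms / 3600000),
                    (int)(end_ms / 60000 % 60),
                    (int)(end_ms / 1000 % 60),
                    (int)(end_ms % 1000));
}
```

With this approach, cutting (-ss/-t) or rescaling only has to adjust the
packet fields; the text written by the muxer follows automatically.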

Some additional comments:
- We do not remove timestamps from video streams like h264 or mpeg2,
  even though they contain various ones. h264's various timestamps are
  possibly too complex and too deeply integrated to remove even if we
  wanted to: there are picture order counts and various SEIs that
  contain timing stuff, the GOP header in mpeg2 does too, ...

- Considering such complex timestamps, does something similar exist in
  subtitles? I mean, for example, things deeper in the syntax referring
  to some timestamps, like <blink start="11:23:90">this</blink>?

- The problem with scaling timestamps or cutting applies to video too.

- Can the case occur / has it been considered that a container can
  store subtitles with their own in-stream timestamps and
  different/inconsistent timestamps at the demuxer layer?
  I am mentioning this because if you remove the timestamps from the
  subtitle bitstream, what do you do if they don't match the
  demuxer-provided timestamps to begin with?
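
To make the last question concrete, a demuxer could at least detect such
a mismatch before dropping the in-band timing. The following is only a
sketch under stated assumptions: both function names are hypothetical
(not FFmpeg APIs), the in-band timing is assumed to be an SRT-style
"HH:MM:SS,mmm --> HH:MM:SS,mmm" line, and the container timestamps are
assumed to be in milliseconds.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper: parse "HH:MM:SS,mmm --> HH:MM:SS,mmm".
 * Returns 0 on success, -1 if the line does not match. */
static int parse_srt_timing(const char *line,
                            int64_t *start_ms, int64_t *end_ms)
{
    int h1, m1, s1, ms1, h2, m2, s2, ms2;

    if (sscanf(line, "%d:%d:%d,%d --> %d:%d:%d,%d",
               &h1, &m1, &s1, &ms1, &h2, &m2, &s2, &ms2) != 8)
        return -1;
    *start_ms = ((h1 * 60LL + m1) * 60 + s1) * 1000 + ms1;
    *end_ms   = ((h2 * 60LL + m2) * 60 + s2) * 1000 + ms2;
    return 0;
}

static int64_t abs64(int64_t v)
{
    return v < 0 ? -v : v;
}

/* Hypothetical check: compare the in-band timing with the timing the
 * container provided for the same packet.
 * Returns 1 if both ends agree within tolerance_ms, 0 if they differ,
 * and -1 if the in-band timing line could not be parsed. */
static int timing_matches_packet(const char *line,
                                 int64_t pkt_start_ms,
                                 int64_t pkt_duration_ms,
                                 int64_t tolerance_ms)
{
    int64_t start_ms, end_ms;

    if (parse_srt_timing(line, &start_ms, &end_ms) < 0)
        return -1;
    return abs64(start_ms - pkt_start_ms) <= tolerance_ms &&
           abs64(end_ms - (pkt_start_ms + pkt_duration_ms)) <= tolerance_ms;
}
```

What to do when the check returns 0 (trust the container, trust the
bitstream, or warn) is exactly the open question; the sketch only shows
that the inconsistency is detectable.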

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Into a blind darkness they enter who follow after the Ignorance,
they as if into a greater darkness enter who devote themselves
to the Knowledge alone. -- Isha Upanishad