[FFmpeg-devel] [PATCH v20 02/20] avutil/frame: Prepare AVFrame for subtitle handling

Hendrik Leppkes h.leppkes at gmail.com
Mon Dec 6 00:53:27 EET 2021


On Sun, Dec 5, 2021 at 11:38 PM Hendrik Leppkes <h.leppkes at gmail.com> wrote:
>
> On Sun, Dec 5, 2021 at 6:58 PM Soft Works <softworkz at hotmail.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Lynne
> > > Sent: Sunday, December 5, 2021 5:40 PM
> > > To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> > > Subject: Re: [FFmpeg-devel] [PATCH v20 02/20] avutil/frame: Prepare AVFrame
> > > for subtitle handling
> > >
> > > 5 Dec 2021, 17:23 by softworkz at hotmail.com:
> > >
> > > > @@ -491,6 +499,39 @@ typedef struct AVFrame {
> > > >  */
> > > >  uint64_t channel_layout;
> > > >
> > > > +    /**
> > > > +     * Display start time, relative to packet pts, in ms.
> > > > +     */
> > > > +    uint32_t subtitle_start_time;
> > > > +
> > > > +    /**
> > > > +     * Display end time, relative to packet pts, in ms.
> > > > +     */
> > > > +    uint32_t subtitle_end_time;
> > > >
> > >
> > > Milliseconds? Our entire API's based around timestamps
> > > with time bases. Plus, we all know what happened when
> > > Matroska settled onto milliseconds and ruined a perfectly
> > > complex but good container.
> > > Make this relative to the PTS field, with the same timebase
> > > as the PTS field.
> > > There's even a new AVFrame->time_base field for you to
> > > set so you wouldn't forget it.
> >
> > The internal format for text subtitles is ASS, and this is
> > using a timebase of milliseconds.
> >
> > All existing decoders and encoders are using this and I'm
> > afraid, but I will not go and change them all.
> >
> > > > +    /**
> > > > +     * Number of items in the @ref subtitle_areas array.
> > > > +     */
> > > > +    unsigned num_subtitle_areas;
> > > > +
> > > > +    /**
> > > > +     * Array of subtitle areas, may be empty.
> > > > +     */
> > > > +    AVSubtitleArea **subtitle_areas;
> > > >
> > >
> > > There's no reason why this cannot be handled using the buffer
> > > and data fields. If you need more space there, you're free to bump
> > > the maximum number of pointers, plus this removes the horrid
> > > malloc(1) hack. We've discussed this, and I couldn't follow why
> > > this was bad in the email discussion you linked.
> >
> > There are reasons. Some of those had I mentioned in an earlier
> > discussion with Hendrik.
> > The effort alone to relate the buffers to subtitle areas (which area
> > 'owns' which buffer) is not reasonable. Too many cases to consider,
> > what's when there are 3 areas and the second area doesn't have any
> > buffer? The convention is that the buffer should be used contiguously.
> > Managing those relations is error-prone and would require a lot of code.
> >
> > > > +    /**
> > > > +     * Usually the same as packet pts, in AV_TIME_BASE.
> > > > +     *
> > > > +     * @deprecated This is kept for compatibility reasons and corresponds
> > > to
> > > > +     * AVSubtitle->pts. Might be removed in the future.
> > > > +     */
> > > > +    int64_t subtitle_pts;
> > > >
> > >
> > > I'm not going to accept a field which is instantly deprecated.
> > > As we've discussed multiple times, please merge this into
> > > the regular frame PTS field. We already have _2_ necessary
> > > stard/end fields.
> >
> > --
> >
> > > I agree with this entirely. Even ignoring the fact that adding a new
> > > field thats deprecated is instantly a disqualification, AVSubtitle had
> > > one pts field, AVFrame already has one pts field - both are even
> > > documented to have the same semantic. They should just contain the
> > > exact same data, thats how you achieve compatibility, not by claiming
> > > you need a new field for compatibility reasons.
> > >
> > > - Hendrik
> >
> > I think the mistake is to declare subtitle_pts as deprecated. I had
> > added the deprecation at a very early point in time where I had still
> > thought that it can be eliminated.
> >
> > Even though we are driving subtitle data through the graph attached
> > to AVFrame, the behavior involved is very different from audio and
> > video frames. Actually there's not one but many different ways how
> > subtitle data can appear in a source and travel through a filtergraph:
> >
> > - Sometimes, subtitle events are muxed into a stream many seconds
> >   ahead of display time. In this case, AVFrame.pts is the mux position
> >   and AVFrame.subtitle_pts is the actual presentation time.
> >   When filtering subtitles to modify something, it would be still desired
> >   to retain the offset between mux time and display start
> >
>
> This information was impossible to convey in AVSubtitle, as it had
> exactly one pts field, so where does it come from now?
> Also, isn't that kind of offset what subtitle_start_time is for, a
> delta on top of the pts?
>
> Sounds like its just duplicating logic once again.
>
> pts is also by definition and name the "presentation time stamp", if a
> timestamp of another meaning can convey additional information, we do
> have a dts field in AVFrame, for the decoding timestamp.

To illustrate this, lets just look at the description of timestamp
fields in AVFrame

pts:
Presentation timestamp in time_base units (time when frame should be
shown to user).

pkt_dts:
DTS copied from the AVPacket that triggered returning this frame.

These comments are pretty clear on the meaning of those fields, and
shall not be violated by subtitles. pts is the time when the frame is
going to be displayed, and nothing else. We probably shouldn't even
tolerate subtitle_start_time as its a stupid field when it should just
get baked into the timestamp (with duration being used for the
end_time field).
pkt_dts can be used to convey an additional timestamp that was copied
from the demuxer, if for some reason you want to preserve the
interleaving position of the frame - which in itself  is already
questionable, as that should be up to the muxer or the interleaving
code to decide, but the field exists, so its fine.

subtitle_pts has to go, and these existing fields have to be used
within their definition.

- Hendrik


More information about the ffmpeg-devel mailing list