[FFmpeg-devel] libavfilter API design in a realtime environment

Nicolas George george at nsup.org
Tue Apr 19 12:33:37 CEST 2016


On septidi, 27 Ventôse, year CCXXIV, Kieran Kunhya wrote:
> I want to try and use the libavfilter API to overlay bitmap subtitles on
> video from a realtime source. This seems difficult/impossible to do with
> the current API hence asking on the main devel list.

Have you looked at what the command-line tool ffmpeg does? It is not
optimized for ultra-low latency, but it should already achieve reasonable
results.
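
For reference, here is a minimal sketch of the kind of graph in question,
built with the public lavfi API. The sizes, pixel formats and time bases
below are placeholder assumptions, not recommendations:

    #include <libavfilter/avfilter.h>

    /* Build: [main] + [subs] -> overlay -> buffersink. Errors funnel
     * into a single cleanup path. */
    static int build_overlay_graph(AVFilterGraph **pgraph,
                                   AVFilterContext **main_src,
                                   AVFilterContext **sub_src,
                                   AVFilterContext **sink)
    {
        AVFilterGraph *graph = avfilter_graph_alloc();
        AVFilterContext *overlay;
        int ret;

        if (!graph)
            return AVERROR(ENOMEM);

        /* Main video: 1080p yuv420p, 90 kHz time base (assumed). */
        ret = avfilter_graph_create_filter(main_src,
                  avfilter_get_by_name("buffer"), "main",
                  "video_size=1920x1080:pix_fmt=yuv420p:"
                  "time_base=1/90000:pixel_aspect=1/1", NULL, graph);
        if (ret < 0)
            goto fail;

        /* Subtitle bitmaps: rgba, so they carry alpha. */
        ret = avfilter_graph_create_filter(sub_src,
                  avfilter_get_by_name("buffer"), "subs",
                  "video_size=1920x1080:pix_fmt=rgba:"
                  "time_base=1/90000:pixel_aspect=1/1", NULL, graph);
        if (ret < 0)
            goto fail;

        ret = avfilter_graph_create_filter(&overlay,
                  avfilter_get_by_name("overlay"), "overlay",
                  "x=0:y=0", NULL, graph);
        if (ret < 0)
            goto fail;

        ret = avfilter_graph_create_filter(sink,
                  avfilter_get_by_name("buffersink"), "out",
                  NULL, NULL, graph);
        if (ret < 0)
            goto fail;

        if ((ret = avfilter_link(*main_src, 0, overlay, 0)) < 0 ||
            (ret = avfilter_link(*sub_src,  0, overlay, 1)) < 0 ||
            (ret = avfilter_link(overlay,   0, *sink,   0)) < 0 ||
            (ret = avfilter_graph_config(graph, NULL))      < 0)
            goto fail;

        *pgraph = graph;
        return 0;

    fail:
        avfilter_graph_free(&graph);
        return ret;
    }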

> 1: How do I know the end to end latency of the pipeline? Is it fixed, does
> it vary? This matters because my wallclock PTS needs addition of this
> latency.

You cannot know that in the general case, since the latency of some filters
depends on the frame contents and on arbitrary user-provided formulas.

In a particular case, the rule of thumb is that filters produce output as
soon as they have enough input information to do so. Note, however, that
filters which sync several video streams will likely require one extra
frame on some or all streams. This happens because frames have no duration
(and I am convinced they should not have one), and therefore the next frame
is required to know the end timestamp of the current one.
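
In API terms, that rule looks like the usual push/pull pattern (a sketch,
not ffmpeg.c's actual loop): AVERROR(EAGAIN) from the sink is lavfi saying
it does not yet have enough input to produce output:

    #include <libavutil/frame.h>
    #include <libavfilter/buffersrc.h>
    #include <libavfilter/buffersink.h>

    /* Push one input frame, then pull everything the graph can already
     * produce. av_buffersrc_add_frame() takes ownership of the
     * references in 'in'. EAGAIN from the sink means "feed me more",
     * e.g. overlay waiting for the next frame to learn where the
     * current one ends. */
    static int push_and_drain(AVFilterContext *src, AVFilterContext *sink,
                              AVFrame *in, AVFrame *out)
    {
        int ret = av_buffersrc_add_frame(src, in);
        if (ret < 0)
            return ret;
        while ((ret = av_buffersink_get_frame(sink, out)) >= 0) {
            /* ... consume the filtered frame ... */
            av_frame_unref(out);
        }
        return ret == AVERROR(EAGAIN) ? 0 : ret;
    }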

> 2: Do I need to interleave video and subtitles (e.g VSVSVSVS) in
> monotonically increasing order? What happens if the subtitles stop for a
> bit (magic queues are bad in a realtime environment)? My timestamps are
> guaranteed to be the same though.

libavfilter can deal with streams that are slightly out of sync by
buffering, but of course that takes memory, and it will eventually lead to
OOM or to dropping frames if the desync is too large.

> 3: My world is CFR but libavfilter is VFR - how does the API know when to
> start releasing frames? Does this add one frame of video latency then until
> it waits for the next video frame to arrive?

Knowing you have CFR gives you an assumption about the frame duration, and
can therefore save you the wait for the next frame. The difficulty is
integrating that elegantly into the scheduling. See below.
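
As a small illustration (assumed values, and plain helper code rather than
any lavfi API), CFR turns the missing end timestamp into simple arithmetic:

    #include <libavutil/rational.h>
    #include <libavutil/mathematics.h>

    /* Under CFR, the end of a frame needs no waiting: it is pts plus
     * one nominal frame duration, rescaled into the stream time base.
     * E.g. 25 fps in a 1/90000 time base gives end = pts + 3600. */
    static int64_t cfr_end_pts(int64_t pts, AVRational frame_rate,
                               AVRational time_base)
    {
        return pts + av_rescale_q(1, av_inv_q(frame_rate), time_base);
    }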

> 4: What are the differences between the FFmpeg and libav implementations?
> FFmpeg uses a framesync and libav doesn't?

If your graph has a single output and you never have a choice about which
input to feed (for example because the frames arrive interleaved), then I
believe the differences do not matter for you.

> 5: I know exactly which frames have associated subtitle bitmaps or not, is
> there a way I can overlay without an extra frame delay?

As wm4 explained, the hard part with subtitles is that they are sparse: you
have a start event, then ~2 seconds, or ~50 frames worth of video, then an
end event, and then maybe several hours before the next start event. If any
filter requires an end timestamp and syncs with video, then you have a huge
latency and a huge buffer.

If the subtitles come from a separate on-demand file, there is no problem,
since the next event is available whenever necessary, and the scheduling (at
least in the FFmpeg version) will tell you when it is.

On the other hand, if your subtitle events are not available on demand,
either because they are interleaved with the video in a muxed format or
because they arrive in real time, it does not work.

You need assumptions about the timestamp properties of your streams. For
example, video players reading index-less streams will assume that
subtitles are not muxed too far after the video, or they will be ignored.

To take that into account for bitmap subtitle support in ffmpeg, I used
heartbeat frames: whenever a frame is decoded and injected on a
non-subtitle input, the current (possibly empty) subtitle frame is
duplicated and injected on all subtitle inputs connected to the same input
stream.
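
A hedged sketch of the idea (the names are mine, not the ones in ffmpeg.c):

    #include <libavutil/frame.h>
    #include <libavfilter/buffersrc.h>

    /* Whenever a video frame is injected, repeat the current subtitle
     * picture on the subtitle input with an advanced timestamp, so sync
     * filters never stall waiting for a sparse stream. */
    static int sub_heartbeat(AVFilterContext *sub_src, AVFrame *cur_sub,
                             int64_t video_pts)
    {
        AVFrame *dup;
        int ret;

        if (!cur_sub)
            return 0;              /* nothing on screen, nothing to repeat */
        dup = av_frame_clone(cur_sub);
        if (!dup)
            return AVERROR(ENOMEM);
        dup->pts = video_pts;      /* move the subtitle forward in time */
        ret = av_buffersrc_add_frame(sub_src, dup);
        av_frame_free(&dup);
        return ret;
    }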

When proper subtitle support is implemented in lavfi, I suppose a similar
solution should be adopted. The application will need to let lavfi know
which streams are interleaved together, probably with a filter with N+n
inputs and as many outputs. But in that case, the heartbeat frames can be
really dummy frames; they may need no duplication.

The same trick can be used to inform lavfi about CFR, although it feels
really hacky: when you have a subtitle frame at pts=5, duplicate it at
pts=5.999999; that should allow processing of the video frame at pts=5.
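
Expressed with the sub_heartbeat() sketch above (a fragment; frame_rate,
time_base, cur_sub and sub_src as before, with the one-tick epsilon purely
illustrative):

    /* Re-inject the pts=5 event just before where the next frame would
     * start, so overlay can release the video frame at pts=5 now. */
    int64_t frame_dur = av_rescale_q(1, av_inv_q(frame_rate), time_base);
    int     ret = sub_heartbeat(sub_src, cur_sub,
                                cur_sub->pts + frame_dur - 1);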

Another solution for CFR would be to add some kind of "min_frame_duration"
option to the framesync utility.

Hope this helps.

Regards,

-- 
  Nicolas George