[FFmpeg-devel] FLV "special" metadata

Thu Aug 25 00:25:04 CEST 2011

Hello! FLV uses metadata packets for normal header metadata, but also
for lots of in-band, timestamped contextual data (cuepoint signals
[different from keyframes], arbitrarily defined commands, onTextData,
in-band subtitles, etc.).
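
For reference, here is a minimal sketch of how these packets can be told
apart from audio and video at the container level. In the FLV format every
tag starts with an 11-byte header, and script data tags (which carry
onMetaData as well as the in-band events above) have tag type 18, versus 8
for audio and 9 for video. The struct and function names below are
hypothetical and made up for this example; they are not libavformat
identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tag type values from the FLV container format. */
enum flv_tag_type {
    FLV_TAG_AUDIO  = 8,
    FLV_TAG_VIDEO  = 9,
    FLV_TAG_SCRIPT = 18  /* script data: onMetaData, cue points, onTextData, ... */
};

#define FLV_TAG_HEADER_SIZE 11

/* Hypothetical struct for this sketch; not a libavformat type. */
struct flv_tag_header {
    uint8_t  type;          /* one of enum flv_tag_type */
    uint32_t data_size;     /* payload size in bytes */
    uint32_t timestamp_ms;  /* 24-bit timestamp plus 8-bit extension */
    uint32_t stream_id;     /* 24-bit stream ID field */
};

/* Read a 24-bit big-endian integer. */
static uint32_t read_be24(const uint8_t *p)
{
    return ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | p[2];
}

/* Parse one tag header. Returns 0 on success, -1 if the buffer is too short. */
int flv_parse_tag_header(const uint8_t *buf, size_t len,
                         struct flv_tag_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < FLV_TAG_HEADER_SIZE)
        return -1;
    out->type         = buf[0] & 0x1F;              /* low 5 bits hold the type */
    out->data_size    = read_be24(buf + 1);
    /* Byte 7 is the timestamp extension and supplies the upper 8 bits. */
    out->timestamp_ms = ((uint32_t)buf[7] << 24) | read_be24(buf + 4);
    out->stream_id    = read_be24(buf + 8);
    return 0;
}

/* True for script data tags, i.e. the packets this mail is about. */
int flv_tag_is_script(const struct flv_tag_header *h)
{
    return h->type == FLV_TAG_SCRIPT;
}
```

Note that the tag type alone does not separate header-style metadata from
in-band events; both arrive as script data tags, and only the timestamp and
the name inside the payload distinguish them.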

We do transcoding on live streams and need these metadata constructs
to survive in-band if possible (at least for flv->flv at the moment).
Right now libavformat/flvdec.c only uses these frames to update header
information when appropriate, and then drops them. If I modify it so
that it adds them to the video or audio stream, they predictably
screw the codecs up (libx264 & libfaac at the moment). It seems all
the current metadata infrastructure was built with a
'metadata-is-header-data' assumption, which isn't an option for this
kind of metadata or for live transcoding in general. So basically I
want a way to preserve them in flvdec.c so that they can be reinserted
in flvenc.c. I'm a total newb to the code base and to a lot of this
stuff in general, so please forgive any naivety, but it seems like I
have two options:

1 - modify some struct such as the pkt: when one of these special
metadata frames comes in, drop it but attach its data to the next pkt,
then pull it back out on the flvenc side to reconstruct the dropped
info. [seems hacky though, fragile, and overkill; certainly unlikely
to get merged into the main line]

2 - in flvdec, treat these special metadata frames as subtitle packets
(and therefore set up a stream in addition to the video/audio ones
that get set up), and in flvenc have it correctly process those
subtitle packets and turn them back into 'special' metadata frames,
using a format that fits one of the handful of existing subtitle
codecs or just creating a simple new subtitle codec (like 'normal'
metadata, the in-band stuff is essentially just a key/value
dictionary). An advantage here is that it could conceivably be used
more generally by other codecs / muxers / demuxers.
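
To make the round trip in #2 concrete, here is a rough sketch of what the
packet payload for such a stream could look like: the key/value dictionary
is packed into a flat byte buffer on the flvdec side and unpacked again on
the flvenc side. Everything here is an assumption for illustration. The
length-prefixed layout is invented (it is not AMF and not an existing
subtitle codec), and kv_pair, kv_pack and kv_unpack_next are hypothetical
names, not FFmpeg API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical dictionary entry; both strings are NUL-terminated. */
struct kv_pair {
    const char *key;
    const char *value;
};

/* Write a 16-bit big-endian length followed by the string bytes.
 * Returns the number of bytes written, or 0 if it does not fit. */
static size_t put_str(uint8_t *dst, size_t cap, const char *s)
{
    size_t n = strlen(s);
    if (n > 0xFFFF || cap < n + 2)
        return 0;
    dst[0] = (uint8_t)(n >> 8);
    dst[1] = (uint8_t)(n & 0xFF);
    memcpy(dst + 2, s, n);
    return n + 2;
}

/* Demuxer side: pack all pairs into dst.
 * Returns the payload size in bytes, or -1 if dst is too small. */
long kv_pack(const struct kv_pair *pairs, size_t count,
             uint8_t *dst, size_t cap)
{
    size_t pos = 0;
    assert(pairs != NULL && dst != NULL);
    for (size_t i = 0; i < count; i++) {
        size_t w = put_str(dst + pos, cap - pos, pairs[i].key);
        if (w == 0)
            return -1;
        pos += w;
        w = put_str(dst + pos, cap - pos, pairs[i].value);
        if (w == 0)
            return -1;
        pos += w;
    }
    return (long)pos;
}

/* Read one length-prefixed string at *pos into out and NUL-terminate it.
 * Returns 0 on success, -1 on truncated input or a too-small out buffer. */
static int get_str(const uint8_t *src, size_t len, size_t *pos,
                   char *out, size_t out_cap)
{
    size_t n;
    if (len - *pos < 2)
        return -1;
    n = ((size_t)src[*pos] << 8) | src[*pos + 1];
    if (len - *pos - 2 < n || out_cap < n + 1)
        return -1;
    memcpy(out, src + *pos + 2, n);
    out[n] = '\0';
    *pos += n + 2;
    return 0;
}

/* Muxer side: read the next pair, advancing *pos.
 * Returns 1 if a pair was read, 0 at end of payload, -1 on error. */
int kv_unpack_next(const uint8_t *src, size_t len, size_t *pos,
                   char *key, size_t key_cap,
                   char *value, size_t value_cap)
{
    assert(src != NULL && pos != NULL && key != NULL && value != NULL);
    if (*pos >= len)
        return 0;
    if (get_str(src, len, pos, key, key_cap) < 0)
        return -1;
    if (get_str(src, len, pos, value, value_cap) < 0)
        return -1;
    return 1;
}
```

The packet timestamp would carry the in-band timing, so the payload itself
only needs the dictionary. A real version would also have to handle
non-string values (numbers, nested objects), which this sketch ignores.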

Does #2 in particular seem like the "right" approach? Am I way off or
missing something? Any advice, corrections, or insights appreciated.
Meanwhile I'll be working on #2 above. Thanks!

-- 

Joseph Wecker | Senior Engineer | Justin.tv | 650.898.7170 |
jwecker at justin.tv