[FFmpeg-devel] [PATCH RFC] libavdevice/decklink: Add support for EIA-708 output over SDI
dheitmueller at ltnglobal.com
Thu Oct 19 01:39:17 EEST 2017
> I was going to actually test this with some old broadcast equipment I have just dying for a purpose, but I don't see how to generate AV_PKT_DATA_A53_CC side packet data except by using the Decklink capture. I have A53 documentation, but it just refers to CEA-708 (or SMPTE 334, or ... what an unraveling ball of yarn it is). Looks like I could spend a month's income on standards just trying to learn how this is encoded.
Yeah. You could certainly spend a good bit of cash if you had to buy the individual specs. Worth noting, though, that the ATSC specs are freely available on their website, and CEA-708 is largely described in the FCC rules (not a substitute for the real thing, but good enough for the casual reader). SMPTE has a “digital library” where you can get access to *all* of their specs for a subscription of around $600/year. That’s not ideal for a non-professional, but for people who *need* the specs it’s far cheaper than buying them piecemeal at $120/spec.
> On a side note, can AV_PKT_DATA_A53_CC be used for something besides CEA-708? Not sure I understand the line between A53 CC encoding (which is at least in part what this generates, right?) and CEA-708 (which is what this takes, right?) and why this side data is called A53_CC?
> I know these questions are outside the scope that you were asking…
No problem. I should really write a primer on this stuff since there are a whole bunch of specs which are inter-related. Briefly….
CEA-708 is what non-technical people typically think of as “digital closed captions”. It is the standard that replaces old-fashioned NTSC closed captions, which were described in EIA/CEA-608. The spec describes what could be characterized as a protocol stack of functionality, from the transport layer up through the presentation layer (i.e. how the captions are constructed, rules for how to render them on-screen, etc.).
CEA-708 also includes a construct for tunneling old CEA-608 packets. In fact, most CEA-708 streams are really just up-converted from CEA-608, since the FCC requires both to be supported and 608 is functionally a subset of 708. On the other hand, you typically can’t down-convert 708 to 608, since 708 has a bunch of formatting codes with no corresponding capability in 608. VLC and most other applications will claim to render 708 captions, but they’re really just rendering the 608 captions tunneled inside the 708 stream.
One component of the CEA-708 spec describes the “CDP”, which stands for “Caption Distribution Packet”. This is a low-level packet format which carries not just multiple caption streams but also timecodes and service data (e.g. caption languages, etc.). CDP packets can be sent over a number of different physical transports, including old-fashioned serial ports.
SMPTE 334M describes how to transport CEA-708 CDP packets over an SDI link in the VANC area of the frame.
A53 refers to the ATSC A/53 specification, which describes how digital TV is transmitted over-the-air. One part of that spec covers how to embed CEA-708 captions into an MPEG-2 transport stream: A/53 says how to embed the CEA-708 caption bytes into the MPEG-2 stream, and then refers you to CEA-708 for the details of what to do with those bytes.
Both the CEA-708 CDP format and A/53 ultimately come down to a series of three-byte packets which contain the actual captioning data. This corresponds to what is being serialized in AV_PKT_DATA_A53_CC. To encode an SDI feed into an MPEG-2 stream, you would need to deconstruct the CDP, extract the captioning bytes, and load them into the side data packet. Once that’s done, the AVPacket is handed off to an H.264/MPEG-2 video encoder, which knows how to take those captioning bytes and embed them into the compressed video (using the MPEG-2 user_data field for MPEG-2 video, or an SEI message for H.264).
That series of three-byte packets is essentially the “lowest common denominator” representation of the captioning data (assuming you only care about closed captions and not timecodes or service info). I have use cases where that extra information should really be preserved, and am weighing the merits of introducing a new side data format for the CDP which preserves all of it, letting encoders extract what they need. There are pluses and minuses to this approach and it’s still under consideration.
I hope that gives you a bit more background.