[FFmpeg-soc] vp8 de/packetizers

Martin Storsjö martin at martin.st
Wed Aug 11 11:01:20 CEST 2010


On Wed, 11 Aug 2010, Josh Allmann wrote:

> On 11 August 2010 00:46, Josh Allmann <joshua.allmann at gmail.com> wrote:
> > On 4 August 2010 03:32, Luca Barbato <lu_zero at gentoo.org> wrote:
> >> On 08/04/2010 09:59 AM, Martin Storsjö wrote:
> >>> David, how large is this first data partition approximately? The frames
> >>> are fragmented into RTP packets of around ~1450 bytes (usually) - so if
> >>> we've got the first packet of a frame but not the rest, would we be better
> >>> off returning this than just skipping it?
> >>
> >> Even better: is there a way to get this information?
> >>
> >> If is common for every packet/there is a known upper bound, we can pass
> >> it through the sdp, or if not we could think about storing the value in
> >> an extension header.
> >>
> >
> > Not sure; I'd have to dig through the bitstream spec for this. Ronald,
> > do you happen to know off the top of your head?
> >
> >> Still I guess we could extend the depacketizer interface to get all the
> >> standard rtp information as well.
> >>
> >
> > Attached patches should work against the updated VP8 rtp proposal*.
> > Also pulled from latest trunk, so they also work with the Xiph stuff.
> >
> > The depacketizer now supports multiple VP8AUs in one packet but I
> > haven't tested that codepath, because VP8AUs are not used by the
> > packetizer.
> >
> > * https://groups.google.com/a/webmproject.org/group/webm-discuss/msg/eef6d312ac5c09c3?dmode=source&output=gplain
> >
> 
> Attaching...

The packetizer looks ok to me. Perhaps still adding a warning message at 
startup that the spec isn't finalized yet and may still change, as Luca A 
pointed out?

For the depacketizer:

> +static int vp8_handle_packet(AVFormatContext *ctx,
> +                             PayloadContext *vp8,
> +                             AVStream *st,
> +                             AVPacket *pkt,
> +                             uint32_t *timestamp,
> +                             const uint8_t *buf,
> +                             int len, int flags)
> +{
> +    int start_packet, end_packet, has_au;
> +    if (!buf)
> +        return AVERROR_INVALIDDATA;
> +
> +    start_packet = *buf & 1;
> +    end_packet   = flags & RTP_FLAG_MARKER;
> +    has_au = *buf & 2;
> +    buf++;
> +    len--;
> +
> +    if (start_packet) {
> +        int res;
> +        if (vp8->data) { // drop previous frame if needed
> +            uint8_t *tmp;
> +            url_close_dyn_buf(vp8->data, &tmp);
> +            vp8->data = NULL;
> +            av_free(tmp);
> +        }

Instead of dropping the previous frame, perhaps we should return the data 
we have buffered so far? If this contains the whole first data partition, 
we will be able to decode the rest of the GOP even if that particular 
frame is broken.

If end_packet isn't set, this is relatively simple - close the previous 
dyn buffer and set the data into pkt, then buffer the new data into a new 
dyn buf. You also need to swap vp8->timestamp and *timestamp in that case 
- the timestamp for the data you've returned is in vp8->timestamp, which 
you need to set into *timestamp so that the returned packet gets the right 
timestamp, but for the new data you buffer up, you need to store the 
current *timestamp.

If end_packet is set, we'd need to return two packets - both the previous, 
incomplete one, and the new one. In that case, you'd need to do something 
like what I described above, but return 1 so that the code outside knows 
to request another packet, then make the code handle the case when called 
with buf = NULL.

> +        if ((res = url_open_dyn_buf(&vp8->data)) < 0)
> +            return res;
> +        vp8->is_keyframe = *buf & 1;
> +        vp8->timestamp   = *timestamp;
> +    }
> +
> +    if (!vp8->data || vp8->timestamp != *timestamp) {
> +        av_log(ctx, AV_LOG_WARNING,
> +               "Received no start marker; dropping frame\n");
> +        return AVERROR(EAGAIN);
> +    }

If the timestamps differ and we actually have some old data buffered, do 
we want to return it here? Since the timestamps differ in that case, we 
wouldn't be interested in returning the currently processed data, though, 
since we have missed the start of that frame anyway. (On the other hand, 
in that case, we've already screwed up the GOP, so is there any point in 
returning the previous packet at all?)

> +
> +    // cycle through VP8AU headers if needed
> +    while (len) {
> +        int au_len = len;
> +        if (has_au && len > 2) {
> +            au_len = AV_RB16(buf) & 0x7ff; // use lower 11 bits
> +            buf += 2;
> +            len -= 2;
> +            if (buf + au_len > buf + len) {
> +                av_log(ctx, AV_LOG_ERROR, "Invalid VP8AU length\n");
> +                return AVERROR_INVALIDDATA;
> +            }
> +        }

I'm not totally sure of the intent behind VP8 aggregate units - if each
unit is a separate frame, we should split them here, I think. But from the
discussion on the spec:

> > If I understand correctly, it means that you can only aggregate 
> > packets from the same video frame ? If it is the case, it should be 
> > clearer. That said, can't they be just merged ? 
> 
> Yes. When people asked for having the ability to aggregate VP8 data with 
> same timestamp, I think they were looking to the future. For different 
> applications it could be advantageous. 

So in that case, merging the AUs like this probably is ok.

Perhaps a warning (either code comment or runtime warning) in this 
codepath in some way, that the VP8AU code is untested?

// Martin


More information about the FFmpeg-soc mailing list