[FFmpeg-cvslog] r13643 - in trunk: libavcodec/mpegvideo_parser.c tests/seek.regression.ref

Wed Jun 4 17:21:11 CEST 2008

On Wed, Jun 04, 2008 at 03:01:08PM +0100, M?ns Rullg?rd wrote:
> 
> Michael Niedermayer wrote:
> > On Wed, Jun 04, 2008 at 08:52:58AM +0100, M?ns Rullg?rd wrote:
> >> Michael Niedermayer <michaelni at gmx.at> writes:
> >>
> >> > On Tue, Jun 03, 2008 at 05:33:59AM +0100, M?ns Rullg?rd wrote:
> >> >> michael <subversion at mplayerhq.hu> writes:
> >> >>
> >> >> > Author: michael
> >> >> > Date: Tue Jun  3 04:43:17 2008
> >> >> > New Revision: 13643
> >> >> >
> >> >> > Log:
> >> >> > In mpeg1/2 timestamps (against all logic) are associated with
> >> >> > picture start codes and not with access units.
> >> >>
> >> >> The change may be valid, but this description doesn't make sense.
> >> >> This is what the spec says:
> >> >>
> >> >>   For ITU-T Rec. H.262 | ISO/IEC 13818-2 video, if a PTS is present in
> >> >>   a PES packet header, it shall refer to the access unit containing
> >> >>   the first picture start code that commences in this PES packet. A
> >> >>   picture start code commences in a PES packet if the first byte of
> >> >>   the picture start code is present in the PES packet.
> >> >>
> >> >> In other words, the access unit is defined as containing the picture
> >> >> start code, and I can't think of any other sane way of defining it.
> >> >
> >> > H.264 in H.222 defines it as the first access unit that commences in the
> >> > PES packet. (which is IMHO more sane)
> >>
> >> MPEG2 defines a video access unit like this:
> >>
> >>   In the case of video, an access unit includes all the coded data for
> >>   a picture, and any stuffing that follows it, up to but not including
> >>   the start of the next access unit. If a picture is not preceded by a
> >>   group_start_code or a sequence_header_code, the access unit begins
> >>   with the picture start code. If a picture is preceded by a
> >>   group_start_code and/or a sequence_header_code, the access unit
> >>   begins with the first byte of the first of these start codes. If it
> >>   is the last picture preceding a sequence_end_code in the bitstream,
> >>   all bytes between the last byte of the coded picture and the
> >>   sequence_end_code (including the sequence_end_code) belong to the
> >>   access unit.
> >>
> >> H.264 defines an access unit like this:
> >>
> >>   A set of NAL units always containing exactly one primary coded
> >>   picture. In addition to the primary coded picture, an access unit
> >>   may also contain one or more redundant coded pictures or other NAL
> >>   units not containing slices or slice data partitions of a coded
> >>   picture. The decoding of an access unit always results in a decoded
> >>   picture.
> >>
> >> For H.264 video, PTS is defined as:
> >>
> >>   For ITU-T Rec. H.264 | ISO/IEC 14496-10 video, if a PTS is present
> >>   in the PES packet header, it shall refer to the first AVC access
> >>   unit that commences in this PES packet. An AVC access unit commences
> >>   in a PES packet if the first byte of the AVC access unit is present
> >>   in the PES packet.
> >>
> >> The difference is how non-picture data, e.g. GOP headers and SPS, PPS,
> >> SEI etc., are treated, and I really don't see how one or the other
> >> would be significantly better.
> >
> > Thats because you did not spend 2 days fixing the parser to handle the
> > mpeg2 case correctly. ;)
> 
> That's because you insist on using a parser in the first place, and
> accidentally wrote it wrong.  

Fabrice accidentally wrote it wrong.

> MPEG2 was intended to be decoded as a
> stream, without being split into access units first.

The problem is when one does something else with the stream than decoding.
stream copy into any other container being an example. Besides software
decoders at least have problems with decoding randomly truncated pieces of
the bitstream. So even with just decoding, some sort of parsing is needed
before. Or the decoder has to regularly check if it still has enough data
and call back to the demuxer if it does not.
Such a callback system would hardly be cleaner than what we have now IMHO.
One of course could also use threads and wait instead of calling back but
that has its own set of issues, speedloss due to thread sync overhead being
one ...

So if you know of a better design please tell me as i do not. The parsers
seem to me the cleanest solution.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

It is not what we do, but why we do it that matters.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-cvslog/attachments/20080604/d85466b6/attachment.pgp>