[FFmpeg-devel] [PATCH] correct MPEG video parser not to return incomplete frames
Sat Aug 22 10:07:48 CEST 2009
Michael Niedermayer wrote:
>>> i really dont see where the problem comes from
>>> you do serach for a keyframe somehow already, so you must at that point
>>> where in the PES packet the frame is, skip whats before (minus seq headers
>>> and such) from the parser output, that seems to be all thats needed
>>> the keyframe choosen is the correct one in the new seeking semantics
>>> but maybe i miss something ...
>> That's exactly what I do!
>> But the problem is:
>> * I find the PES packet where my frame starts and return position of
>> this frame.
>> * I reposition the file pointer to this PES packet.
>> * Parser is reset.
>> * First frame read from the parser is the remainder of the previous
>> frame (starting somewhere hundred PES packets earlier).
>> * Parser assigns PTS/DTS of the frame actually starting in this PES
>> packet to this broken frame with remainder of the previous frame.
>> * Second frame read from the parser is the key frame, but timing
>> info is wrong - PTS/DTS contains AV_NOPS_VALUE and cannot be
>> computed correctly.
>> Strictly speaking, the frame stream coming out of the parser is as
>> decodable as ever, but since timestamps are messed up, it's impossible to
>> correctly play it from this position.
>> Unfortunately, I don't have any chance to detect that whatever I find
>> before first picture start code in the PES packet is not a frame, except by
>> checking whether there is a picture start code and a slice in there.
>> PTS/DTS in the PES packet is associated by lavf/utils.c with first frame
>> returned, which starts in this PES packet. Since our partial frame "starts"
>> in this PES packet, PTS/DTS is associated with it instead of with the
>> actually starting frame in this PES packet.
>> So, when I simply skip the broken frame at the beginning, first frame read
>> by the parser will be the frame which really starts in the PES packet and
>> it gets the proper timing info associated.
>> When decoding the file from the beginning, we find the proper beginning of
>> the frame whose last part is in my PES packet and thus don't steal PTS/DTS
>> of the frame really starting in the PES packet.
>> I hope I explained it somewhat clearly now...
> i understand your explanation but i dont understand the problem
> the parser, after an reset misassociates the first timestamp to something
> that is not a frame and i guess it also is not detect as a frame.
> That is a bug, it has to be fixed (and not by discarding this partial
Uhm, as you say, the parser misassociates the timestamp to something
that is not a frame. It will be then refused by the decoder, since it
doesn't contain minimum necessary information (picture start code and
slice start code).
So is something missing picture start code and slice start code in your
opinion a partial frame or is it something not a frame?
If the latter, then the parser doesn't need to return it at all, right?
(at least after a seek, stream copy is maybe different issue)
> the second part is that after seeking you know where in a PES packet your
> frame starts still the output leaving lavf contains things prior to that
> frame, this is a bug in your code, please fix this.
I don't know where in PES packet the frame starts. This information is
not propagated out of the parser. The only thing we know is in which PES
packet the frame starts and we can also position the file pointer only
by PES packet.
OK, it would be maybe possible to get the additional information about
where in PES packet the frame starts and handle it in generic parser.
BUT: This won't work as well. Namely, this would only fix the problem
with first stream encountered after a seek. If there is a PES packet for
stream 1 followed by a PES packet for stream 2, which has exactly the
same problem (rest of previous frame without picture start code and
slice start code and start of a new frame and contains PTS/DTS), then
the problem would simply happen in stream 2.
> If there are 3 keyframes in a PES packet, it should be possible to seek
> to any of them and receive exactly that keyframe and not always the first
> of the PES packet. To me this seems nothing except a single variable that
> says how much to skip before output.
That would be optimal, yes. But again, this single variable would solve
this case only, but wouldn't solve the problem of partial frames, at
least not in other streams except the stream in very first packet (see
above). OTOH, you yourself said in another post that it's no problem to
seek a few frames earlier... Another issue is, you could have two PES
packets, in first PES packet stream 1 with 3 frames PTS 2, 3 and 4 and
in second PES packet 3 frames from stream 2 with PTS 1, 2 and 3. How do
you seek to PTS 3? You can't, you'd have to keep the information
per-stream. And even with that, not only single offset, but also number
of frames would have to be kept (in case several frames are
interleaved), which would make the parser code quite hard to understand
(not that it is today easy to understand...).
Therefore, I propose the following:
* If the parser is newly-created when parsing starts at the
beginning of the file, handle everything as-is.
* If the parser is reset after a seek, set a flag to tell it to
ignore the first partial non-frame (which doesn't contain picture
start code and slice start code), as this anyway can't be decoded
Would that be OK? Or do you have some different idea? I feel this is a
reasonably good compromise, so we can get the things working finally,
since a lot of people who want to edit video on Linux are waiting for it
(together with my other patches regarding MPEG-TS seeking, which have
been reviewed in other thread, and as it seems, nobody has any
objections to that code anymore, so I can commit that at least).
Thanks & regards,
More information about the ffmpeg-devel