[FFmpeg-devel] [PATCH] AVCHD/H.264 parser: determination of frame type, question about timestamps

Sat Jan 31 23:50:09 CET 2009

On Mon, Jan 26, 2009 at 08:42:17AM +0100, Ivan Schreter wrote:
[...]
> >> [...]
> >>>> which is IMHO broken. Of course, we could communicate with it by setting 
> >>>> pict_type to FF_I_TYPE for keyframes only (IDR frames and frames after 
> >>>> recovery point), for other frames containing I- and P-slices to 
> >>>> FF_P_TYPE and for B-frames to FF_B_TYPE. But I don't like it much. Any 
> >>>> idea, how to do it correctly without the need to touch other codecs?
> >>> pict_type from the parse context should likely be split in pict_type and
> >>> keyframe
> >> Actually, we already have a flag field on AVPacket coming from the 
> >> parser, but compute_pkt_fields() doesn't believe it and resets it based 
> >> on pict_type from parse context instead.
> >>     
> >
> > the parser works with char* buffers not AVPackets
> >   
> Yes, sorry, I was referring to av_read_frame(), which returns AVPacket.
> 
> However, why do we need pict_type at all? I/P/B-frames are 
> MPEG-specific. Actually, I believe we should change it and return two 
> flags - delayed and key frame. This would make it IMHO cleaner and more 
> general than testing for pict_type.

I don't mind such a change if it's tested a little and works.
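
For illustration, the two-flag idea could be sketched roughly as below. This is not existing libavcodec/libavformat code: every name is hypothetical, and "delayed" is assumed to mean "presentation is delayed relative to decoding", which is how compute_pkt_fields() currently treats any picture that is not a B-frame. The keyframe rule is the one quoted above (IDR frames and frames after a recovery point).

```c
#include <stdbool.h>

/* Hypothetical slice classification; not an FFmpeg type. */
enum sketch_slice_kind { SKETCH_SLICE_I, SKETCH_SLICE_P, SKETCH_SLICE_B };

/* Hypothetical parser output: two flags instead of one MPEG-style pict_type. */
struct sketch_parse_flags {
    bool key_frame; /* decoding may start at this picture */
    bool delayed;   /* assumption: presentation is delayed relative to decoding */
};

/* Derive both flags from properties the H.264 parser would have to detect. */
struct sketch_parse_flags sketch_classify(bool is_idr,
                                          bool after_recovery_point,
                                          enum sketch_slice_kind kind)
{
    struct sketch_parse_flags f;

    /* Keyframes: IDR pictures and pictures following a recovery point. */
    f.key_frame = is_idr || after_recovery_point;

    /* Assumption: everything except B pictures counts as delayed. */
    f.delayed = (kind != SKETCH_SLICE_B);

    return f;
}
```

With flags like these, a non-IDR picture made of I-slices would simply have key_frame == false, with no need to relabel it as a P-frame.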

[...]
> > [...]
> >> Of course, we still have the problem of frame doubling/tripling and 
> >> having 3 fields per picture, eventually with one of them repeated 
> >> (pic_struct codes 5-8).
> >>     
> >
> > no we do not have a problem with this, we do not and never will duplicate
> > anything we just export the information and the app can do with it what it
> > wants, thats also exactly what we do in mpeg2
> >   
> We don't export the information, do we? But you are right, with frame 

see repeat_pict

> doubling and tripling there is no problem - the application code will 
> handle it by itself anyway by displaying last frame longer. As for 
> having 3 fields per picture, I'm not so sure it will currently work. At 
> least timing will be wrong. Consider following:

As I've said, we support this stuff in mpeg2; I see nothing fundamentally
different in the h264 case, just more obfuscated documentation of it.

> 
> We have a stream with pictures containing (T1/B1/T2==T1), (B2/T3/B3==B2) 
> fields. That's two H.264 pictures, but 3 frames. Each av_read_frame() 
> should return a packet containing exactly one frame. But we have just 
> 2 packets, which need to be returned in 3 calls to av_read_frame(), 
> according to API. Further, the DTS must be set correctly as well for the 
> three AVPackets in order to get the timing correct. How do you want to 
> handle this?

I don't see where you get 3 calls of av_read_frame();
there are 2 or 4 access units, not 3, unless one is coded as 2 fields
and 1 is a frame.
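
To make the arithmetic concrete, here is a minimal sketch of how long one access unit is displayed, in field periods, for each pic_struct value, following my reading of the pic_struct table in the H.264 picture timing SEI semantics. The function name is hypothetical and this is not the parser's code; how the result would be translated into repeat_pict is deliberately left out.

```c
/* Display duration of one access unit in field periods, by pic_struct
 * (H.264 picture timing SEI). Returns -1 for reserved values.
 * Hypothetical helper, not part of FFmpeg. */
int sketch_field_periods(int pic_struct)
{
    switch (pic_struct) {
    case 0: return 2; /* progressive frame */
    case 1:           /* top field */
    case 2: return 1; /* bottom field */
    case 3:           /* top field, bottom field */
    case 4: return 2; /* bottom field, top field */
    case 5:           /* top, bottom, top repeated */
    case 6: return 3; /* bottom, top, bottom repeated */
    case 7: return 4; /* frame doubling */
    case 8: return 6; /* frame tripling */
    default: return -1;
    }
}
```

For the (T1/B1/T2), (B2/T3/B3) example this gives 3 + 3 = 6 field periods, i.e. three frame durations of display time, carried by two access units that each last 1.5 frame durations.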

> 
> And as already mentioned, the case with (T1), (B1), (T2), (B2), we are 
> returning 4 packets via av_read_frame() for 2 frames, which is against 
> API. How to handle this? My idea was delaying the return from h264_parse 
> until the second field is also parsed.

Well, just consider the example where timestamps are always associated with
the second field instead of the first. You couldn't associate them with the
AVPackets.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

He who knows, does not speak. He who speaks, does not know. -- Lao Tsu