[FFmpeg-devel] Remaining problems in H.264 handling

Ivan Schreter
Sat Mar 28 23:22:06 CET 2009


Michael Niedermayer wrote:
> On Fri, Mar 27, 2009 at 07:56:59PM +0100, Ivan Schreter wrote:
>   
> [...]
>> IMHO, frame rate for H.264 (and probably also MPEG video) should be always 
>> set to 1/2 tbc (and set to reliable, if timing info is provided), since 
>> this maps the best to all possible combinations of picture structures. This 
>> would also most probably make the first sample work without problem. But I 
>> didn't do enough convincing to convince Michael yet ;-).
>>     
>
> convince me that the frame rate is 1/2 tbc?
>
> if tbc is 1/90000 you want 45000 as framerate ?
> if tbc is 1/60 on a telecined video you want it to be 1/30?
>
> this is not about convincing, it's about starting out with ambiguous terms
> and ending with nonsense
>   
Yes, the terms were ambiguous, sorry for that. I just wanted to write 
something quickly, not propose any solution yet, just state the fact 
that I have a different opinion.

> 1. time base and frame rate are 2 separate things.
>   
Perfectly agreed.

> 2. there is no frame rate field, and i repeat like i did many times in the
>    past that people CANNOT hijack a timebase or the r_frame_rate field
>    and set them to the frame rate.
>    if you want a frame rate field that has to be added as a new field.
>
>   
I don't have sources on this computer and I don't remember exactly, so 
I'm not going to write proper member names here.

However, AFAIK we have the following: the time base of the stream (90kHz 
for MPEG-TS), which is rather uninteresting (it's just a scaling 
constant), the timestamp rate (50Hz) and the actual video full frame rate 
(25fps).

The problem: the video full frame rate is currently determined by the 
"unreliable" handling, which considers the timestamps of *packets* 
instead of *frames*, with packet timestamps expressed in 50Hz units. 
Thus, a video consisting of field pictures (2 field pictures per video 
frame) gets a frame rate of 50fps instead of the correct 25fps. A video 
consisting of frame pictures gets the correct frame rate of 25fps. A 
video consisting of "frame doubling" pictures gets a frame rate of 
12.5fps.

However, in all the aforementioned cases, the correct video full frame 
rate is actually 25fps (i.e., 1/2 of the timestamp rate). Therefore I 
believe the frame rate of H.264 video (and most probably also MPEG video) 
should be computed as 1/2 of the timestamp rate, if known, instead of 
using the "unreliable" handling.
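
To make the three cases concrete, here is a minimal sketch of the 
arithmetic. It assumes the 50Hz timestamp rate from above; the structure 
labels and function names are made up for illustration and are not 
FFmpeg identifiers.

```python
from fractions import Fraction

TIMESTAMP_RATE = 50  # Hz, the timestamp rate used in the example above

# Spacing between consecutive packets, in timestamp ticks.
# Labels are illustrative only, not FFmpeg names.
TICKS_PER_PACKET = {
    "field": 1,           # two field pictures per video frame
    "frame": 2,           # one frame picture per video frame
    "frame_doubling": 4,  # one picture shown for two frame periods
}

def rate_from_packets(structure):
    """Rate guessed from the spacing of packet timestamps."""
    return Fraction(TIMESTAMP_RATE, TICKS_PER_PACKET[structure])

def rate_from_timestamp_rate():
    """Proposed rule: full frame rate is 1/2 of the timestamp rate."""
    return Fraction(TIMESTAMP_RATE, 2)

for structure in TICKS_PER_PACKET:
    print(structure,
          float(rate_from_packets(structure)),
          float(rate_from_timestamp_rate()))
```

The packet-based guess comes out as 50, 25 and 12.5 for the three 
structures, while the 1/2 timestamp rate rule gives 25 in every case.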

You might object that this is incorrect, since with frame doubling we 
actually have 12.5fps. Yes, we do. But imagine a video starting with 200 
frame doubling pictures, with the rest being normal frames. Well, the 
rest is 25fps... So the whole stream should be treated as 25fps and not 
12.5fps, doubling frames as requested by the picture structure (we have 
repeat_pict for that). Such switching of picture structure is completely 
normal in TV broadcast.
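
A rough sketch of that mixed stream, assuming a fixed 25fps grid and a 
per-picture count of frame periods. The count only stands in for what 
repeat_pict would convey; the actual FFmpeg encoding of that field is not 
reproduced here, and the 1000 normal frames are an arbitrary choice since 
the example above only says "the rest".

```python
from fractions import Fraction

FRAME_RATE = Fraction(25)  # fixed rate, 1/2 of the 50Hz timestamp rate

# Frame periods occupied by each coded picture:
# 2 for a frame doubling picture, 1 for a normal frame picture.
pictures = [2] * 200 + [1] * 1000  # 1000 is an arbitrary length

def presentation_times(periods_per_picture, frame_rate):
    """Place pictures on the fixed grid; return start times and slot count."""
    times, slot = [], 0
    for periods in periods_per_picture:
        times.append(Fraction(slot) / frame_rate)
        slot += periods
    return times, slot

times, total_slots = presentation_times(pictures, FRAME_RATE)

# Rate a packet-spacing guess would see in each part of the stream.
guess_at_start = 1 / (times[1] - times[0])
guess_later = 1 / (times[201] - times[200])
print(float(guess_at_start), float(guess_later), total_slots)
```

The guess from the first pictures is 12.5 and the guess from the later 
ones is 25, yet every picture lands on a slot of the same 25fps grid, so 
one stream-wide rate plus a repeat count describes both parts.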

Similarly, the picture structures top-bottom-top and bottom-top-bottom 
actually code three 25fps frames in two pictures, repeating only half 
frames (and constructing the frame "in the middle" from the other two 
frames). We cannot communicate this to the application yet, so it would 
be handled like doubling every second frame.

All those adventurous picture structure types were invented to represent 
other frame rates within the fixed frame rate of a television display 
(telecine). So in my opinion, we should handle it that way as well: we 
have a fixed frame rate of 1/2 the timestamp rate, and the picture 
structure just describes how to distribute frames/fields onto full frames 
of this fixed frame rate.
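
As an illustration of "distributing onto full frames", the sketch below 
places pictures on a grid measured in field ticks (two ticks per full 
frame) and lists which full frames each picture touches. The tick counts 
per structure follow the descriptions above; the labels and the function 
are hypothetical, not FFmpeg code.

```python
# Duration of each picture structure in field ticks (2 ticks = 1 full frame).
# Labels are illustrative only.
TICKS = {
    "field": 1,
    "frame": 2,
    "top_bottom_top": 3,
    "bottom_top_bottom": 3,
    "frame_doubling": 4,
}

def place_on_grid(structures):
    """Return (structure, start_tick, touched_frames) per picture and total ticks."""
    placed, tick = [], 0
    for structure in structures:
        end = tick + TICKS[structure]
        # Full frames overlapped by the ticks in [tick, end).
        frames = list(range(tick // 2, (end - 1) // 2 + 1))
        placed.append((structure, tick, frames))
        tick = end
    return placed, tick

placed, total_ticks = place_on_grid(["top_bottom_top", "bottom_top_bottom"])
for entry in placed:
    print(entry)
print("full frames covered:", total_ticks // 2)
```

For this pair, the first picture touches frames 0 and 1 and the second 
touches frames 1 and 2, so two pictures cover three full frames and the 
middle frame takes one field from each picture.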

I hope it makes some more sense now...

Regards,

Ivan



