[FFmpeg-devel] [PATCH] H.264: decode picture timing SEI messageand get field order

Michael Niedermayer michaelni
Mon Nov 3 15:07:33 CET 2008


On Sun, Nov 02, 2008 at 10:51:28AM +0900, Haruhiko Yamagata wrote:
> On Sat, Nov 02, 2008 at 04:43:06, Michael Niedermayer wrote:
>> On Sat, Nov 01, 2008 at 06:43:06PM +0900, Haruhiko Yamagata wrote:
>>> Hi,
>>>
>>> I wrote a patch which implements decoding of picture timing SEI.
>>>
>>> [Current problems]
>>> libavcodec derives field order from POCs.
>>>
>>> cur->top_field_first = cur->field_poc[0] < cur->field_poc[1];
>>>
>>> It assumes BFF if two POSs are equal. This is not always correct.
>>>
>>> And it derives AVFrame::interlaced_frame from used decoding process.
>>>
>>> cur->interlaced_frame = FIELD_OR_MBAFF_PICTURE;
>>>
>>> Sometimes this is not correct. Interlaced/progressive mode of 
>>> post-decoding processes such as deinterlacing and color space conversion 
>>> is not always same as the mode used during decoding. H.264 for example 
>>> allows to encode an interlaced frame using a progressive algorithm. 
>>> Sample: premiere-paff.ts (http://x264.nl/h.264.samples/premiere-paff.ts).
>>>
>>> [Solutions]
>>> Picture timing SEI message contains pic_struct which defines 
>>> interlace/progressive
>>> mode of post decoding processes and field order. Most streams have the 
>>> SEI for every frame or field. My patch implements decoding of the SEI and 
>>> use of pic_struct. Now the field order is always correct. Interlaced 
>>> picture is judged interlaced. Field 1 repeat flags are also set.
>>>
>>> [Limitation]
>>> As usual, there are exceptions: badly encoded streams. Some progressive 
>>> streams from television have pic_struct 3, which indicates interlacing. 
>>> Fortunately such streams are rare. This is not my fault, other decoder 
>>> show the same behavior.
>>>
>>> Please review.
>>>
>>> ----------------------------- for svn log -----------------------
>>> H.264: Implement decoding of picture timing SEI message
>>> Now correct values are set to interlaced_frame, top_field_first
>>> and repeat_pict in AVFrame structure.
>>> patch by ffdshow tryouts
>>> --------------------------------------------------------------
>>>
>>> Best regards,
>>> Haruhiko Yamagata
>>
>>> Index: libavcodec/h264.c
>>> ===================================================================
>>> --- libavcodec/h264.c (revision 15762)
>>> +++ libavcodec/h264.c (working copy)
>>> @@ -6846,6 +6846,61 @@
>>>          }while(get_bits(&s->gb, 8) == 255);
>>>           switch(type){
>>> +        case 1: // Picture timing SEI
>>> +        {
>>> +            if(h->sps.nal_hrd_parameters_present_flag || 
>>> h->sps.vcl_hrd_parameters_present_flag){
>>
>> I think this and the code below would be cleaner in its own function like
>> decode_unregistered_user_data()
>
> done.
>
>>
>>
>>> +                get_bits(&s->gb, h->sps.cpb_removal_delay_length); /* 
>>> cpb_removal_delay */
>>> +                get_bits(&s->gb, h->sps.dpb_output_delay_length);  /* 
>>> dpb_output_delay */
>>
>> as the value isnt used, skip_bits() would be enough
>
> done.
>
>
>>> +            }
>>> +            if(h->sps.pic_struct_present_flag){
>>
>>> +                unsigned int i, NumClockTS;
>>
>> num_clock_ts would be more consistent relative to the other variables in
>> h264.c
>
> done.
>
>>> +                h->sei_pic_struct = get_bits(&s->gb, 4);
>>> +
>>> +                if (h->sei_pic_struct > SEI_PIC_STRUCT_FRAME_TRIPLING)
>>> +                    return -1;
>>> +
>>
>>> +                h->sei_pic_struct_valid_flag = 1;
>>
>> why is this needed?
>
> I removed sei_pic_struct_valid_flag.
> Checking the range of h->sei_pic_struct is necessary for broken streams or 
> future
> extension of the SEI.
>
>>> +
>>> +                // Just to skip bits, maybe better to skip bytes using 
>>> SEI payloadSize.
>>> +                NumClockTS = sei_NumClockTS_table[h->sei_pic_struct];
>>> +
>>> +                for (i = 0 ; i < NumClockTS ; i++){
>>
>>> +                    unsigned int clock_timestamp_flag = get_bits(&s->gb, 
>>> 1);
>>> +                    if(clock_timestamp_flag){
>>
>> if(get_bits1())
>
> done.
>
>>> +                            unsigned int seconds_flag = get_bits(&s->gb, 
>>> 1);
>>> +                            if(seconds_flag){
>>> +                                unsigned int minutes_flag;
>>> +                                get_bits(&s->gb, 6); /* seconds_value 
>>> range 0..59 */
>>> +                                minutes_flag = get_bits(&s->gb, 1);
>>> +                                if(minutes_flag){
>>> +                                    unsigned int hours_flag;
>>> +                                    get_bits(&s->gb, 6); /* 
>>> minutes_value 0..59 */
>>> +                                    hours_flag = get_bits(&s->gb, 1);
>>> +                                    if(hours_flag)
>>> +                                        get_bits(&s->gb, 5); /* 
>>> hours_value 0..23 */
>>> +                                }
>>> +                            }
>>
>> if(get_bits1(&s->gb)){
>>    get_bits(&s->gb, 6); /* seconds_value range 0..59 */
>>    if(get_bits1(&s->gb)){
>>        get_bits(&s->gb, 6); /* minutes_value 0..59 */
>>        if(get_bits1(&s->gb)){
>>            get_bits(&s->gb, 5); /* hours_value 0..23 */
>>        }
>>    }
>> }
>
> done.
>
>> [...]
>>> +                }else if(h->sei_pic_struct == 
>>> SEI_PIC_STRUCT_TOP_BOTTOM_TOP){
>>> +                    // Signal the possibility of telecined film 
>>> externally (pic_struct 5,6)
>>> +                    // From these hints, let the applications decide if 
>>> they apply deinterlacing.
>>> +                    cur->repeat_pict = 4;
>>> +                    cur->interlaced_frame = FIELD_OR_MBAFF_PICTURE;
>>> +                }
>>> +                else if(h->sei_pic_struct == 
>>> SEI_PIC_STRUCT_BOTTOM_TOP_BOTTOM){
>>> +                    cur->repeat_pict = 2;
>>> +                    cur->interlaced_frame = FIELD_OR_MBAFF_PICTURE;
>>> +                }else{
>>
>> both above should be repeat_pict=1 i think
>
> done.
>
>>> +                    // FIXME do something to signal frame 
>>> doubling/tripling externally.
>>> +                    cur->interlaced_frame = 0;
>>
>> repeat_pict= 2 or 4 should be here
>
> done.
>
>>>          //FIXME do something with unavailable reference frames
>>>               /* Sort B-frames into display order */
>>> Index: libavcodec/h264.h
>>> ===================================================================
>>> --- libavcodec/h264.h (revision 15762)
>>> +++ libavcodec/h264.h (working copy)
>>> @@ -110,6 +110,18 @@
>>>      NAL_AUXILIARY_SLICE=19
>>>  };
>>
>>>  +enum {
>>> +    SEI_PIC_STRUCT_FRAME             = 0, // *  0: frame
>>> +    SEI_PIC_STRUCT_TOP_FIELD         = 1, // *  1: top field
>>> +    SEI_PIC_STRUCT_BOTTOM_FIELD      = 2, // *  2: bottom field
>>> +    SEI_PIC_STRUCT_TOP_BOTTOM        = 3, // *  3: top field, bottom 
>>> field, in that order
>>> +    SEI_PIC_STRUCT_BOTTOM_TOP        = 4, // *  4: bottom field, top 
>>> field, in that order
>>> +    SEI_PIC_STRUCT_TOP_BOTTOM_TOP    = 5, // *  5: top field, bottom 
>>> field, top field repeated, in that order
>>> +    SEI_PIC_STRUCT_BOTTOM_TOP_BOTTOM = 6, // *  6: bottom field, top 
>>> field, bottom field repeated, in that order
>>> +    SEI_PIC_STRUCT_FRAME_DOUBLING    = 7, // *  7: frame doubling
>>> +    SEI_PIC_STRUCT_FRAME_TRIPLING    = 8  // *  8: frame tripling
>>> +};
>>
>> the enum should have a name and that should be used instead of int
>> where appropriate
>
> I named SEI_PicStructType.
> Because sei_pic_struct is unsigned int, this may not be the ideal situation 
> for using enum.
>
>>> +
>>>  /**
>>>   * Sequence parameter set
>>>   */
>>> @@ -150,6 +162,12 @@
>>>      int scaling_matrix_present;
>>>      uint8_t scaling_matrix4[6][16];
>>>      uint8_t scaling_matrix8[2][64];
>>> +    int nal_hrd_parameters_present_flag;
>>> +    int vcl_hrd_parameters_present_flag;
>>> +    int pic_struct_present_flag;
>>> +    int time_offset_length;
>>> +    int cpb_removal_delay_length;      ///< 
>>> cpb_removal_delay_length_minus1 + 1
>>> +    int dpb_output_delay_length;       ///< 
>>> dpb_output_delay_length_minus1 + 1
>>>  }SPS;
>>>   /**
>>> @@ -460,6 +478,12 @@
>>>      int mb_xy;
>>>       uint32_t svq3_watermark_key;
>>
>>> +
>>> +    /**
>>> +     * pic_struct in picture timing SEI message
>>> +     */
>>> +    unsigned int sei_pic_struct;   // The formal name is pic_struct.
>>> +    int sei_pic_struct_valid_flag; // For each frame
>>
>> these comments look a little messy
>
> fixed.
>
>>>  }H264Context;
>>>   #endif /* AVCODEC_H264_H */
>>> Index: libavcodec/h264data.h
>>> ===================================================================
>>> --- libavcodec/h264data.h (revision 15762)
>>> +++ libavcodec/h264data.h (working copy)
>>> @@ -1296,4 +1296,8 @@
>>>      }
>>>  };
>>>  +static const unsigned int sei_NumClockTS_table[9]={
>>> +    1,  1,  1,  2,  2,  3,  3,  2,  3
>>> +};
>>
>> this one fits in uint8_t
>
> done.
>
> Thank you very much for reviewing.

[...]
> +                if(h->sps.time_offset_length > 0)
> +                    skip_bits(&s->gb, 5); /* time_offset */

shouldnt this be skip_bits(&s->gb, h->sps.time_offset_length) ?

[...]

> Index: libavcodec/h264.h
> ===================================================================
> --- libavcodec/h264.h	(revision 15766)
> +++ libavcodec/h264.h	(working copy)
> @@ -110,6 +110,18 @@
>      NAL_AUXILIARY_SLICE=19
>  };
>  
> +typedef enum {
> +    SEI_PIC_STRUCT_FRAME             = 0, // *  0: frame
> +    SEI_PIC_STRUCT_TOP_FIELD         = 1, // *  1: top field
> +    SEI_PIC_STRUCT_BOTTOM_FIELD      = 2, // *  2: bottom field
> +    SEI_PIC_STRUCT_TOP_BOTTOM        = 3, // *  3: top field, bottom field, in that order
> +    SEI_PIC_STRUCT_BOTTOM_TOP        = 4, // *  4: bottom field, top field, in that order
> +    SEI_PIC_STRUCT_TOP_BOTTOM_TOP    = 5, // *  5: top field, bottom field, top field repeated, in that order
> +    SEI_PIC_STRUCT_BOTTOM_TOP_BOTTOM = 6, // *  6: bottom field, top field, bottom field repeated, in that order
> +    SEI_PIC_STRUCT_FRAME_DOUBLING    = 7, // *  7: frame doubling
> +    SEI_PIC_STRUCT_FRAME_TRIPLING    = 8  // *  8: frame tripling
> +} SEI_PicStructType;

The comments do not look doxygen compatible


Also besides these, it would be interresting to also parse PPS/SPS/SEI in
the AVParser (h264_parser.c). While that is unrelated to your patch, it
would be a step toward fixing some long standing timestamp issues with
h264 in mpeg-ps/ts and should not be too hard as the existing parsing code
could be shared. Iam just mentioning this because i thought maybe you
are interrested to work on this too, its of course entirely unrelated to
this patch which is, after the 2 issues above likely ready to be commited.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

It is not what we do, but why we do it that matters.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20081103/c82757d4/attachment.pgp>



More information about the ffmpeg-devel mailing list