[FFmpeg-devel] [PATCH] h264 parallelized, (was: Parallelized h264 proof-of-concept)

Michael Niedermayer michaelni
Sat Jun 16 18:14:07 CEST 2007


Hi

On Fri, Jun 15, 2007 at 10:10:54PM +0200, Andreas Öman wrote:
> Andreas Öman wrote:
> >Hi
> >
> >Michael Niedermayer wrote:
> >>Hi
> >>
> >>
> >>av_free() + av_malloc() or pass an argument to MPV_common_init()
> >
> >I'll extend MPV_common_init() with an additional argument then.
> >
> >>> static void filter_mb_fast( H264Context *h, int mb_x, int mb_y, uint8_t 
> >>> *img_y, uint8_t *img_cb, uint8_t *img_cr, unsigned int linesize, 
> >>> unsigned int uvlinesize);
> >>>+static void execute_decode_slices(H264Context *h, int reset);
> >>can't you order the new functions so as to avoid that?
> >
> >Hm, not really. decode_slice_header() needs to be able to
> >fire off any pending slices in case the deblocking-type changes
> >within a frame. (AFAIK this is valid according to the specs,
> >perhaps I'm wrong?)
> >
> 
> Okay, here are the finalized patches.
> 
> #1 - Extend MPV_common_init() with an additional arg for context size
>      when doing multithreading.

hmm, seeing the patch, I think I would prefer a simpler solution:
maybe adding the size to MpegEncContext? Or, even better, adding
thread_context[] to H264Context, which would also avoid the casts to
H264Context.
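
A minimal sketch of the thread_context[] idea, using simplified stand-in
structs rather than the real libavcodec definitions. MAX_THREADS and
get_thread_context() are made up for illustration; the point is only that
a typed array in H264Context needs no cast:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THREADS 8            /* hypothetical limit, for illustration */

typedef struct MpegEncContext {  /* stand-in, not the real struct */
    int mb_x, mb_y;
} MpegEncContext;

typedef struct H264Context {
    MpegEncContext s;
    /* Typed per-thread contexts live here, so no cast is needed. */
    struct H264Context *thread_context[MAX_THREADS];
    int max_contexts;
} H264Context;

/* Returns the context in slot i, or NULL if i is out of range. */
static H264Context *get_thread_context(H264Context *h, int i)
{
    assert(h != NULL);
    if (i < 0 || i >= h->max_contexts)
        return NULL;
    return h->thread_context[i];
}
```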


> 
> #2 - Factor out init_scan_tables()

looks ok (and can be applied)


> 
> #3 - Decouple bit context from h264 context in decode_ref_pic_marking()

looks ok (and can be applied)


> 
> #4 - Slice level parallelism for deblocking type 0 and 2
> 
> regression tests pass

Regression tests? There's no H.264 regression test ...
I assume you mean you tested this on several H.264 streams and the
output is binary identical ...


[...]
> @@ -3022,8 +3067,18 @@
>      MpegEncContext * const s = &h->s;
>      int temp8, i;
>      uint64_t temp64;
> -    int deblock_left = (s->mb_x > 0);
> -    int deblock_top  = (s->mb_y > 0);
> +    int deblock_left;
> +    int deblock_top;
> +    int mb_xy;
> +
> +    if(h->deblocking_filter == 2) {
> +        mb_xy = s->mb_x + s->mb_y*s->mb_stride;
> +        deblock_left = h->slice_table[mb_xy] == h->slice_table[mb_xy - 1];
> +        deblock_top  = h->slice_table[mb_xy] == h->slice_table[h->top_mb_xy];
> +    } else {
> +        deblock_left = (s->mb_x > 0);
> +        deblock_top = (s->mb_y > 0);
> +    }

Is this multithreading specific, or a deblocking_filter == 2 fix? In the latter
case it should be in a separate patch.
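
For reference, the slice-boundary test in the hunk above boils down to a
same-slice comparison on the neighbouring macroblocks. A standalone sketch,
assuming a plain int slice table and non-MBAFF layout; the helper names are
hypothetical, and the explicit picture-edge checks are an addition of this
sketch (the quoted hunk indexes slice_table directly):

```c
#include <assert.h>

/* Hypothetical helpers; slice_table holds one slice id per macroblock,
 * laid out row by row with mb_stride entries per row. */

/* 1 if the left neighbour exists and belongs to the same slice. */
static int same_slice_left(const int *slice_table, int mb_stride,
                           int mb_x, int mb_y)
{
    int mb_xy = mb_x + mb_y * mb_stride;

    assert(mb_x >= 0 && mb_x < mb_stride && mb_y >= 0);
    if (mb_x == 0)
        return 0;                 /* picture edge: nothing to the left */
    return slice_table[mb_xy] == slice_table[mb_xy - 1];
}

/* 1 if the top neighbour exists and belongs to the same slice
 * (assumes non-MBAFF, so the top neighbour is exactly one row up). */
static int same_slice_top(const int *slice_table, int mb_stride,
                          int mb_x, int mb_y)
{
    int mb_xy = mb_x + mb_y * mb_stride;

    assert(mb_x >= 0 && mb_x < mb_stride && mb_y >= 0);
    if (mb_y == 0)
        return 0;                 /* picture edge: nothing above */
    return slice_table[mb_xy] == slice_table[mb_xy - mb_stride];
}
```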


[...]
>      if(!FRAME_MBAFF){
>          int qp_thresh = 15 - h->slice_alpha_c0_offset - FFMAX(0, h->pps.chroma_qp_index_offset);
>          int qp = s->current_picture.qscale_table[mb_xy];
> -        if(qp <= qp_thresh
> -           && (mb_x == 0 || ((qp + s->current_picture.qscale_table[mb_xy-1] + 1)>>1) <= qp_thresh)
> -           && (mb_y == 0 || ((qp + s->current_picture.qscale_table[h->top_mb_xy] + 1)>>1) <= qp_thresh)){
> -            return;
> +
> +        if(qp <= qp_thresh) {
> +            if(h->deblocking_filter == 1) {
> +                if((mb_x == 0 || ((qp + s->current_picture.qscale_table[mb_xy-1]      + 1)>>1) <= qp_thresh) &&
> +                   (mb_y == 0 || ((qp + s->current_picture.qscale_table[h->top_mb_xy] + 1)>>1) <= qp_thresh))
> +                    return;
> +            } else {
> +                int mb_slice = h->slice_table[mb_xy];
> +                int left_qp, top_qp;
> +
> +                left_qp = h->slice_table[mb_xy - 1]    == mb_slice ? ((qp + s->current_picture.qscale_table[mb_xy-1]      + 1)>>1) : 0;
> +                top_qp  = h->slice_table[h->top_mb_xy] == mb_slice ? ((qp + s->current_picture.qscale_table[h->top_mb_xy] + 1)>>1) : 0;
> +                if(left_qp <= qp_thresh &&
> +                   top_qp  <= qp_thresh)
> +                    return;
> +            }
>          }
>      }

is this specific to threads?
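
To make the question concrete: both branches of the hunk above perform the
same early-out and differ only in which neighbours count. A sketch of that
shared logic, with hypothetical names (edge_qp, can_skip_filter) and the
"neighbour counts" decision passed in as flags:

```c
#include <assert.h>

/* Rounded average of the two QPs on either side of an edge. */
static int edge_qp(int qp, int neighbour_qp)
{
    return (qp + neighbour_qp + 1) >> 1;
}

/* Returns 1 if filtering of this macroblock can be skipped.
 * has_left / has_top say whether that neighbour takes part at all:
 * "not at the picture edge" in one mode, "same slice" in the other. */
static int can_skip_filter(int qp, int qp_thresh,
                           int has_left, int left_qp,
                           int has_top,  int top_qp)
{
    assert(qp >= 0);
    if (qp > qp_thresh)
        return 0;
    if (has_left && edge_qp(qp, left_qp) > qp_thresh)
        return 0;
    if (has_top && edge_qp(qp, top_qp) > qp_thresh)
        return 0;
    return 1;
}
```

Written this way, only the computation of the two flags depends on the
deblocking mode.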


[...]
> +
> +    h->max_contexts = avctx->thread_count > 0 ? avctx->thread_count : 1;

I think thread_count must be > 0


[...]
> +        hx = (H264Context *)s->thread_context[h->current_context] ? (H264Context *)s->thread_context[h->current_context] : h;

What about s->thread_context[0] == h? I think that would avoid this check.
Or does that cause other parts of the code to become more complex?
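
A rough sketch of what is meant, with a simplified stand-in struct. It
assumes thread_context[] lives in H264Context and that thread_count is
always > 0; MAX_THREADS and both function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THREADS 8   /* hypothetical limit, for illustration */

typedef struct H264Context {   /* stand-in, not the real struct */
    struct H264Context *thread_context[MAX_THREADS];
    int max_contexts;
    int current_context;
} H264Context;

/* Slot 0 always points at the master context itself. */
static void init_thread_contexts(H264Context *h, int thread_count)
{
    int i;

    assert(thread_count > 0 && thread_count <= MAX_THREADS);
    h->max_contexts      = thread_count;
    h->current_context   = 0;
    h->thread_context[0] = h;
    for (i = 1; i < MAX_THREADS; i++)
        h->thread_context[i] = NULL;   /* filled in when threads are set up */
}

/* No NULL fallback needed: in the single-threaded case this is h. */
static H264Context *get_current_context(H264Context *h)
{
    return h->thread_context[h->current_context];
}
```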


[...]
>          case NAL_DPA:
> -            init_get_bits(&s->gb, ptr, bit_length);
> -            h->intra_gb_ptr=
> -            h->inter_gb_ptr= NULL;
> -            s->data_partitioning = 1;
> +             init_get_bits(&hx->s.gb, ptr, bit_length);
> +             hx->intra_gb_ptr=
> +             hx->inter_gb_ptr= NULL;
> +             hx->s.data_partitioning = 1;
>  
> -            if(decode_slice_header(h) < 0){
> +            if(decode_slice_header(hx, h) < 0){
>                  av_log(h->s.avctx, AV_LOG_ERROR, "decode_slice_header error\n");
>              }

Indentation is wrong here.


[...]
> Index: libavcodec/h264.h
> ===================================================================
> --- libavcodec/h264.h	(revision 9281)
> +++ libavcodec/h264.h	(working copy)
> @@ -381,6 +381,16 @@
>      const uint8_t *field_scan8x8_cavlc_q0;
>  
>      int x264_build;
> +
> +    /* Slice-based multi threading members.
> +     * These are only used in the "master" context */

Doxygen supports comments on groups of variables; this should be used
here.
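
Something along these lines, using a Doxygen member group. The struct is a
cut-down stand-in; the field names come from the hunks above, but their
types and the per-member descriptions are guesses:

```c
typedef struct H264Context {
    int x264_build;

    /**
     * @name Members for slice-based multithreading
     * These are only used in the "master" context.
     */
    /**@{*/
    int max_contexts;     /**< number of usable thread contexts (assumed meaning) */
    int current_context;  /**< index of the context being filled (assumed meaning) */
    /**@}*/
} H264Context;
```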

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No human being will ever know the Truth, for even if they happen to say it
by chance, they would not even know they had done so. -- Xenophanes