[FFmpeg-devel] determining the reference frame for motion_val

Martin Luessi mluessi
Mon Aug 27 21:37:41 CEST 2007


On 8/23/07, Michael Niedermayer <michaelni at gmx.at> wrote:
> Hi
>
> On Thu, Aug 23, 2007 at 11:07:14AM -0500, Martin Luessi wrote:
> > On 8/22/07, Michael Niedermayer <michaelni at gmx.at> wrote:
> > > Hi
> > >
> > > On Wed, Aug 22, 2007 at 09:21:25AM -0500, Martin Luessi wrote:
> > > > Hi,
> > > >
> > > > For my work I need to extract motion vectors from h264 video. By
> > > > looking at the function ff_print_debug_info in mpegvideo.c and
> > > > previous posts on this list, I figured out how to do that. Right now
> > > > my motion vector extraction function accesses the motion_val table of
> > > > the returned AVFrame and creates a table with one motion vector for
> > > > each 8x8 block. However, I still have the following questions:
> > > >
> > > > 1)
> > > > How can I determine which reference frame is used for a given motion
> > > > vector? I use the USES_LIST macro to find out if there is a forward or
> > > > backward motion vector assigned to a given block. Let's say we have
> > > > a forward motion vector, how can I find out which previous frame is
> > > > used as reference? If the current frame is a P-frame and the previous
> > > > one was an I-frame, obviously the previous frame is used. But what
> > > > about if the current frame is a B-frame and the previous frame is B as
> > > > well? As you know, h264 has an option to support using B frames as
> > > > reference, so the reference frame could be the previous B-frame or a
> > > > I/P frame further back in the past. I hope you see what I'm getting
> > > > at.
> > >
> > > AVFrame.ref_index
> >
> > Ok, thanks. I looked at the h264 code and wrote a function to copy the
> > ref_index from the current frame. However, I cannot make much sense of
> > the values in ref_index. My function looks something like this:
> >
> > /* Copy the per-8x8-block reference indices of one list into a dense array. */
> > void extract_ref_index(MpegEncContext *mpeg_ctx, int list, AVFrame *pict,
> >                        uint8_t *arr)
> > {
> >   for (int mb8_y = 0; mb8_y < mpeg_ctx->mb_height * 2; mb8_y++) {
> >     for (int mb8_x = 0; mb8_x < mpeg_ctx->mb_width * 2; mb8_x++) {
> >       /* rows of ref_index are b8_stride entries apart */
> >       int b8_xy = mb8_x + mb8_y * mpeg_ctx->b8_stride;
> >       arr[mb8_x + mb8_y * 2 * mpeg_ctx->mb_width] =
> >         pict->ref_index[list][b8_xy];
> >     }
> >   }
> > }
> >
> > From the way the h264 code uses ref_index I figured that the ref_index
> > has an entry for every 8x8 block and the array size is
> > "mpeg_ctx->b8_stride * 2 * mpeg_ctx->mb_height * sizeof(uint8_t)",
> > which means for a QVGA frame 41x30 bytes, is that correct? However, as
> > I said before the values I get are kind of weird, they are mostly 0
> > and sometimes 255, can anyone explain to me what this means?
>
> there should be no 255 in a used reference, maybe it's an intra block
> or the ref is plain not used (backward ref in forward predicted block)
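
One possible reading of the 255 values, as a minimal sketch and not a
confirmed diagnosis: if the ref_index entries are signed bytes and the
extraction function copies them into a uint8_t array, a stored -1 would
show up as 255. The helper names below are hypothetical, and treating
negative entries as "no reference in this list" is an assumption that
matches the explanation above rather than something verified here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: turn a byte that was copied into a uint8_t buffer
 * back into the signed value it presumably started out as.
 * Done arithmetically to avoid an implementation-defined cast. */
static int ref_from_unsigned(uint8_t raw)
{
    return raw >= 128 ? (int)raw - 256 : (int)raw;
}

/* Hypothetical helper: non-negative entries are treated as real reference
 * indices, negative ones as "list not used for this block" (assumption). */
static int ref_is_used(int ref)
{
    return ref >= 0;
}
```

Under that assumption, an entry read as 255 would decode to -1 and be
skipped, while an entry of 0 would mean index 0 in the reference list
for that direction.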

> > Also, does anybody know if the contents of ref_index are still valid
> > when the frame is returned by avcodec_decode_video(..)? I'm working on
>
> they should be as valid as motion_val IIRC (if you want to be certain
> read the fine source ...)

Unfortunately I'm still stuck on this issue. I've spent about two
days reading the fine source of the H264 decoder but still can't make
much sense of the contents of ref_index (they are either 0 or 255).

What I have figured out so far is that ref_index and motion_val are
filled in by the function writeback_motion(..) in h264.c. This function
is called while the picture is being decoded, and from what I can tell
from the source, ref_index is copied from H264Context.ref_cache, which
means that ref_index depends on the state of the decoder at the time
the picture is decoded. When a user calls avcodec_decode_video, the
pictures are returned in display order, which is different from the
stream order of the frames. So, am I right in assuming that it is not
possible to make any reasonable sense of the ref_index of a picture
returned by avcodec_decode_video, since the state of the decoder at the
time the picture was decoded is lost? What I wanted to do is use
ref_index to find out which picture is used as the reference for the
contents of motion_val, but it is starting to look like this is
impossible.

The other thing is motion vectors for blocks smaller than 8x8. Michael
said earlier that they should be in motion_val as well, but what I see
in the source is that writeback_motion(..) does not handle partitions
smaller than 8x8.
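
For pairing motion vectors with reference indices, it may help that
H.264 signals at most one reference index per 8x8 partition, so any
sub-partitions inside an 8x8 block share it. A minimal sketch of the
index arithmetic, assuming the motion vectors sit on a 4x4 grid and the
reference indices on an 8x8 grid: the function names are hypothetical,
both strides are plain parameters supplied by the caller, and the
actual layout of motion_val is not confirmed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the 8x8 block that contains the 4x4 block
 * at (x4, y4). b8_stride is the number of entries per row of the 8x8 table. */
static size_t b8_index_for_b4(int x4, int y4, int b8_stride)
{
    return (size_t)(x4 >> 1) + (size_t)(y4 >> 1) * (size_t)b8_stride;
}

/* Hypothetical helper: index of the 4x4 block itself in a table with
 * b4_stride entries per row. */
static size_t b4_index(int x4, int y4, int b4_stride)
{
    return (size_t)x4 + (size_t)y4 * (size_t)b4_stride;
}
```

With a row length of 41 for the 8x8 table (the QVGA figure quoted
above), the 4x4 blocks at (2,2), (3,2), (2,3) and (3,3) all map to
entry 42 of the reference table.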

Any help on this issue would be highly appreciated, I have been
working on this for quite some time now but there is still no
progress.

Martin



