[Libav-user] "no picture" from decoding single h264 keyframe

Camera Man i.like.privacy.too at gmail.com
Fri Apr 4 19:42:20 CEST 2014

On 04/03/2014 04:28 AM, Andrey Utkin wrote:

 > This shouldn't be RTP loss, the camera is connected directly to my PC
 > with a patchcord.

It is unlikely to be RTP loss (see below), but I've seen UDP loss even 
on direct connections (due to small buffers and other conditions), so to 
rule it out completely you should force a TCP session (for example, by 
setting the RTSP demuxer's "rtsp_transport" option to "tcp").

 > Could please somebody elaborate what exactly happens? Is above
 > approach generally broken, or it can be fixed (without decoding a lot
 > more frames)

You basically do not need to decode any more frames - an I frame can be 
decoded without any other frame. However, the ffmpeg API is optimized 
for the general use case where you continuously decode lots of frames. 
There are several possible causes for the decoder not returning a picture:

1) The stream specifies that it shouldn't. Your input stream can say "we 
need 4 frames decoded before the first one is shown", in which case 
ffmpeg will decode 4 frames before returning the first one to you EVEN 
though it is independent. This is for the case of an IBBP sequence (in 
display order, which is encoded as IPBB) - if you display the I right 
away, you will then have a "bubble" while you decode the P, because you 
can't show the P until after you've decoded both Bs.

This being an Axis camera, it is probably not your case - Axis cameras 
usually give a stream that doesn't specify this. But some cameras do (I 
have a Sanyo that does, for example).

2) You are using a multithreaded decode (default build does that). In 
that case, for every thread (default = 2 * cores), you will get one frame 
of delay between input and output. This is because decoding is handed off 
to a thread, and control returns to you immediately. So you will get the 
first frame back only after you have fed in some more frames.

This is more likely the cause for what you are seeing. You can disable 
multithreaded decode by setting the codec context's "thread_type" field 
to 0 after setting up the stream but before calling avcodec_open2().
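
To make cause 1 concrete, here is a toy model of a reorder buffer. It is 
only a sketch and not ffmpeg code: the names ReorderBuf, reorder_push() 
and reorder_flush() are made up for illustration, frames are represented 
by their display index, and the delay of 2 used in the note below is an 
assumed value. Frames go in in decode order, and a picture comes out only 
once more than "delay" frames are being held.

```c
#include <stddef.h>

/* Toy model only: this is NOT libavcodec code. */
#define MAX_HELD 16

typedef struct {
    int    held[MAX_HELD]; /* display indices of decoded, not yet shown frames */
    size_t count;          /* number of valid entries in held[] */
    size_t delay;          /* frames held back before the first one is released */
} ReorderBuf;

/* Remove and return the held frame with the smallest display index. */
static int pop_earliest(ReorderBuf *rb)
{
    size_t best = 0;
    for (size_t i = 1; i < rb->count; i++)
        if (rb->held[i] < rb->held[best])
            best = i;
    int frame = rb->held[best];
    rb->held[best] = rb->held[--rb->count]; /* fill the hole with the last entry */
    return frame;
}

/* Feed one frame in decode order.
 * Returns 1 and sets *out when a picture is released, 0 when the frame
 * was only absorbed, and -1 if the buffer is full. */
static int reorder_push(ReorderBuf *rb, int display_index, int *out)
{
    if (rb->count >= MAX_HELD)
        return -1;
    rb->held[rb->count++] = display_index;
    if (rb->count <= rb->delay)
        return 0;               /* still filling: no picture yet */
    *out = pop_earliest(rb);
    return 1;
}

/* End of input: release the remaining frames one per call.
 * Returns 1 and sets *out while frames remain, 0 when empty. */
static int reorder_flush(ReorderBuf *rb, int *out)
{
    if (rb->count == 0)
        return 0;
    *out = pop_earliest(rb);
    return 1;
}
```

With delay set to 2 and frames pushed in decode order I0, P3, B1, B2, 
the first two pushes return nothing, the third releases the I frame and 
the fourth releases B1. The remaining two pictures only come out through 
reorder_flush(), which is the situation you are in after feeding a 
single keyframe.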

In both cases, even if you do not turn multithreaded decoding off, you 
most likely do not need to feed in more frames - you can keep feeding 
NULL packets (packets with data set to NULL and size set to 0) until you 
do get the decoded images - but note that as soon as you feed the first 
NULL packet, you imply the stream has ended - and you cannot keep 
decoding it. You will need to reset all buffers (for example with 
avcodec_flush_buffers()) and restart decoding if you need any other 
picture decoded.
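
The drain-then-reset sequence described above can be sketched with a 
small stand-in decoder. This is a toy model, not the libavcodec 
implementation: ToyDecoder, toy_decode() and toy_reset() are hypothetical 
names, packets and pictures are plain ints, and rejecting a real packet 
once draining has started is a modelling choice made to mirror "you 
cannot keep decoding it".

```c
#include <stddef.h>

/* Toy model only: this is NOT libavcodec code. */
#define PIPE_MAX 32

typedef struct {
    int    slots[PIPE_MAX]; /* FIFO of packet ids still in flight */
    size_t head, count;
    size_t delay;           /* packets absorbed before the first picture comes out */
    int    draining;        /* set once a NULL packet has been fed */
} ToyDecoder;

/* Forget everything in flight and accept input again. */
static void toy_reset(ToyDecoder *d)
{
    d->head = 0;
    d->count = 0;
    d->draining = 0;
}

/* Fused "packet in, maybe picture out" call.
 * pkt == NULL means "no more input, give me what you have".
 * Returns 0 on success, -1 if a real packet arrives after draining
 * started or the pipeline is full. *got_picture says whether *picture
 * was written. */
static int toy_decode(ToyDecoder *d, const int *pkt, int *picture, int *got_picture)
{
    *got_picture = 0;
    if (pkt) {
        if (d->draining || d->count >= PIPE_MAX)
            return -1;
        d->slots[(d->head + d->count) % PIPE_MAX] = *pkt;
        d->count++;
        if (d->count <= d->delay)
            return 0;           /* pipeline still filling */
    } else {
        d->draining = 1;
        if (d->count == 0)
            return 0;           /* fully drained */
    }
    *picture = d->slots[d->head];
    d->head = (d->head + 1) % PIPE_MAX;
    d->count--;
    *got_picture = 1;
    return 0;
}
```

For the single keyframe case you would call toy_decode() once with the 
keyframe, then call it with NULL in a loop until got_picture is set, and 
finally call toy_reset() before feeding anything else. The toy releases 
the picture on the first NULL call, but with the real decoder you should 
loop rather than assume that.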

The important thing to understand is that, because of reordering, some 
codecs (and h264 is a prime example) do not have a "packet in -> picture 
out" relationship in the ffmpeg API. It would have been clearer if 
instead of avcodec_decode_video2(), you had two routines: 
"feed_video_packet_to_decoder()", and 
"give_me_a_picture_if_its_ready_or_tell_me_if_not()" - but they are 
fused into "avcodec_decode_video2()" (for historical reasons, I would 
guess), so you experience the behaviour that you do.
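
As a rough illustration of that split, here is a toy decoder with the 
two routines kept separate, plus a fused call built on top of them. None 
of this is ffmpeg code: SplitDecoder and fused_decode() are hypothetical, 
the two routine names are just the ones invented above, and packets and 
pictures are plain ints.

```c
#include <stddef.h>

/* Toy model only: this is NOT libavcodec code. */
#define QUEUE_MAX 32

typedef struct {
    int    ready[QUEUE_MAX]; /* pictures in output order */
    size_t head, count;
    size_t delay;            /* pictures withheld while input is still flowing */
    int    eof;              /* set once NULL has been fed */
} SplitDecoder;

/* Input half: hand over a packet, or NULL to signal end of stream.
 * Returns 0 if accepted, -1 if rejected (input after EOF or queue full). */
static int feed_video_packet_to_decoder(SplitDecoder *d, const int *pkt)
{
    if (!pkt) {
        d->eof = 1;
        return 0;
    }
    if (d->eof || d->count >= QUEUE_MAX)
        return -1;
    d->ready[(d->head + d->count) % QUEUE_MAX] = *pkt;
    d->count++;
    return 0;
}

/* Output half: returns 1 and sets *picture if one is ready, else 0. */
static int give_me_a_picture_if_its_ready_or_tell_me_if_not(SplitDecoder *d, int *picture)
{
    size_t withheld = d->eof ? 0 : d->delay; /* after EOF nothing is held back */
    if (d->count <= withheld)
        return 0;
    *picture = d->ready[d->head];
    d->head = (d->head + 1) % QUEUE_MAX;
    d->count--;
    return 1;
}

/* The fused form: one call that does both halves and reports the
 * outcome of the second half through *got_picture. */
static int fused_decode(SplitDecoder *d, const int *pkt, int *picture, int *got_picture)
{
    int ret = feed_video_packet_to_decoder(d, pkt);
    *got_picture = ret == 0 && give_me_a_picture_if_its_ready_or_tell_me_if_not(d, picture);
    return ret;
}
```

Seen this way, a call that succeeds without setting got_picture is not a 
failure - the input half accepted your packet and the output half simply 
had nothing ready yet.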
