[Libav-user] Video and audio timing / syncing

Kalileo kalileo at universalx.net
Fri Mar 29 22:12:16 CET 2013


On Mar 30, 2013, at 03:28 , Brad O'Hearne wrote:

> On Mar 28, 2013, at 11:53 PM, Kalileo <kalileo at universalx.net> wrote:
>> Hi Brad,
>> 
>> when you start writing the packets (muxing them), you give each audio and video packet a DTS (and PTS) value. You can start at zero. 
>> 
>> At the start you give the first audio and the first video packet the same value. For every new packet you have to increase the DTS value accordingly, depending on the length of the previous audio or video packet. Audio and video packets have different lengths, so you increase them using different step values.
>> 
>> For example, you can increase the DTS value for every video packet by 4000, and for every audio packet by 2000 (you must correct these values depending on your codecs).
>> 
>> If you use the correct step values, then at the end of your video, both audio and video DTS values should be roughly the same again. If they are not, your step value is wrong.
>> 
>> That's all there is to it. Works perfectly for me.
> 
> Kalileo -- hey thanks for taking the time to respond, it is good to hear from you again. I think you are probably right on target, but I have a few wrinkles to add which have caused me to scratch my head a bit. Check these few tidbits out: 
> 
> - Another poster mentioned earlier in this thread (if I understood his point accurately) that audio and video streams (their timing, that is) are completely unrelated in their handling. While we view these streams as a single rendered product, internally they are completely separate entities.

Correct.

> There's kind of an issue of semantics here, but I'm not sure whether that agrees with or contradicts above what you are saying about the relationship between audio and video pts / dts.

No contradiction.

> To the best of what I've been able to determine from mailing list responses, doc, and my testing, it would appear that these settings for audio don't have any material effect on settings for video and vice versa,

Correct, except that they are used for syncing.

> but in viewing the output, they obviously would show sync problems if timings weren't right.

That's what I am trying to tell you: the length (time) is what you have to set using DTS/PTS, where the same DTS means "play at the same time".
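
As a rough illustration of how the step values come out (a sketch only: the 1/90000 time base, 25 fps, 48 kHz sample rate and 1024 samples per audio packet are assumed numbers, not taken from your setup), the step per packet is simply the packet's duration expressed in time base ticks:

```python
from fractions import Fraction

# Assumed example values, not taken from this thread.
TIME_BASE = Fraction(1, 90000)   # seconds per timestamp tick
FPS = Fraction(25)               # video frames per second
SAMPLE_RATE = 48000              # audio samples per second
SAMPLES_PER_PACKET = 1024        # audio samples carried by one packet


def video_step(fps, time_base):
    """Ticks by which DTS grows per video packet (one frame per packet)."""
    return (1 / fps) / time_base


def audio_step(samples, sample_rate, time_base):
    """Ticks by which DTS grows per audio packet."""
    return Fraction(samples, sample_rate) / time_base


v_step = video_step(FPS, TIME_BASE)                              # 3600 ticks
a_step = audio_step(SAMPLES_PER_PACKET, SAMPLE_RATE, TIME_BASE)  # 1920 ticks

# After the same playing time both streams should reach about the same DTS.
seconds = 60
video_packets = int(seconds * FPS)
audio_packets = seconds * SAMPLE_RATE // SAMPLES_PER_PACKET
video_dts = video_packets * v_step
audio_dts = audio_packets * a_step
print(v_step, a_step, video_dts, audio_dts)
```

With these assumed numbers the two final DTS values differ by less than one audio packet, which is the "roughly the same again" check mentioned above.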

> This seems supported by the next several points which follow. 
> 
> - Here's an interesting note: it doesn't appear that pts and dts are even relevant for audio. I don't know whether that is the case across the board, or only in some specific circumstances, but I don't even have to set either value, and the audio is perfect both in the case of writing video frames as well, or if I completely turn off writing of all video frames. I've outputted the audio pts value when not setting it and it is complete junk, yet the audio is perfect. 

Depends on your player. In the case you describe, the audio is the "master", and it just plays, one packet after the other. Audio packets have a specific length, so that works fine without additional timing info.
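
To make "audio packets have a specific length" concrete, here is a minimal sketch of a clock that a player could derive from the audio alone, without looking at any PTS. The class name `AudioClock`, the 48 kHz rate and the 1024-sample packets are assumptions for illustration only:

```python
class AudioClock:
    """Playback position derived only from the number of samples played."""

    def __init__(self, sample_rate):
        self.sample_rate = sample_rate
        self.samples_played = 0

    def packet_played(self, samples_in_packet):
        # Each packet advances the clock by its own fixed length.
        self.samples_played += samples_in_packet

    def seconds(self):
        return self.samples_played / self.sample_rate


clock = AudioClock(sample_rate=48000)  # assumed sample rate
for _ in range(375):                   # 375 packets of 1024 samples (assumed sizes)
    clock.packet_played(1024)
print(clock.seconds())
```

This is why audio on its own can sound perfect with junk timestamps: its position follows from the sample count.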

> 
> - If I completely turn off the writing of all audio frames, there is absolutely no change in video rendering -- it still renders video frames at twice the speed.

Which player are you using? Which player shows that behavior?

> This would seem to support the fact that a) pts might only be significant for video packets and not for audio, and

Not correct. You can take the video timing as the master, and speed up / slow down the audio to follow the video.

> b) there's no direct relationship between video and audio packet pts. 

Not correct. The relationship is the timing, the length. Same PTS means this video and this audio should be played at the same time. 
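
A small sketch of what "same PTS" means when the two streams use different time bases: compare them in seconds, or rescale one into the other's ticks. In C, libavutil provides av_rescale_q for this kind of conversion; the `rescale` helper below is only a hypothetical stand-in, and the time bases are assumed values:

```python
from fractions import Fraction


def pts_to_seconds(pts, time_base):
    """Presentation time in seconds for a PTS given in time_base ticks."""
    return pts * time_base


def rescale(pts, src_tb, dst_tb):
    """Hypothetical helper: express pts (src_tb ticks) in dst_tb ticks."""
    return round(pts * src_tb / dst_tb)


video_tb = Fraction(1, 25)      # assumed: one tick per video frame
audio_tb = Fraction(1, 48000)   # assumed: one tick per audio sample

video_pts = 50      # frame number 50
audio_pts = 96000   # sample number 96000

# Both stamps describe the same instant, so they belong together on playback.
same_instant = (pts_to_seconds(video_pts, video_tb)
                == pts_to_seconds(audio_pts, audio_tb))
print(same_instant, rescale(video_pts, video_tb, audio_tb))
```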

> 
> So my next questions become the following: 
> 
> 1. Is setting the audio pts and dts even relevant? I've seen no functional indication that it is. 

If you do not need your audio and video to stay in sync, then yes, it is not relevant. However, most players will assume that you want them to be in sync, so setting nonsense values will give you funny results.

> 
> 2. Is there any direct thing that the playback codecs do (other than just rendering at the proper time) to relate audio timing to video timing? There's no comparison or sequencing being done between values is there? 

The codecs don't do any syncing. It's the player that takes care of syncing. DTS/PTS is what helps the player do that.

> 
> 3. The whole setting of pts and dts is relative to the time_base configured on the codec context. According to the documentation, the time_base.num should be 1, and the time_base.den should be equal to the expected frames per second. I have both of these set accordingly. However, I got to thinking, what if you expect (I'm going to use round multiples for discussion here, I'm actually setting time_base.den to 24 fps) 30 fps, but at runtime receive only 15 fps. Will this internally have any material impact to rendering? I think this is where some of the FFmpeg code examples may be bypassing an issue common to many actual use-cases. They can virtually guarantee frame-rate and proper pts values by simply generating X frames and assigning them proper pts. But what happens when receiving these frames from an external source and frames aren't delivered at the frame rate expected? Is there some compensation that has to be done in code,

Yes, check the DTS/PTS of audio and video and slow down one or speed up the other when they drift apart. That's the job of a player.
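
A simplified sketch of that player-side check, with audio as the master clock. The function name, the 10 ms threshold and the "double the delay" rule are assumptions for illustration, not the behavior of any particular player:

```python
SYNC_THRESHOLD = 0.01  # seconds of drift tolerated before correcting (assumed)


def next_frame_delay(video_pts_s, audio_clock_s, nominal_delay_s):
    """Return how long to wait before showing the next video frame."""
    diff = video_pts_s - audio_clock_s
    if diff <= -SYNC_THRESHOLD:
        return 0.0                  # video is late: show the frame immediately
    if diff >= SYNC_THRESHOLD:
        return nominal_delay_s * 2  # video is early: hold the frame longer
    return nominal_delay_s          # close enough: keep the normal pace


print(next_frame_delay(1.00, 1.10, 0.04))   # video behind audio
print(next_frame_delay(1.10, 1.00, 0.04))   # video ahead of audio
print(next_frame_delay(1.00, 1.002, 0.04))  # within tolerance
```

The same idea works the other way round, with video as the master and the audio sped up or slowed down to follow it.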

> or is the codec smart enough to render frames at the timings you stamp on them, regardless of whether the frame rate matches your time_base.den setting? 

I don't know why you keep thinking that the codec cares about when to render. You give the codec something to decode, and it does it, as fast as it can. Rendering/displaying happens after that, not by the codec but by some player code, and it is up to the player to make sure everything stays in sync.
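
Since the timing lives in the timestamps and not in the codec, one possible way to handle frames that arrive at an uneven rate is to stamp each frame with the time at which it actually arrived. This is only a sketch under assumptions: the helper name, the millisecond arrival times and the 1/90000 time base (finer than the frame rate) are all made up for illustration:

```python
from fractions import Fraction

TIME_BASE = Fraction(1, 90000)  # assumed: much finer than one tick per frame


def pts_from_arrival(arrival_ms, first_arrival_ms, time_base):
    """Ticks elapsed since the first frame, rounded to the nearest tick."""
    elapsed_s = Fraction(arrival_ms - first_arrival_ms, 1000)
    return round(elapsed_s / time_base)


# Assumed arrival times in milliseconds; note the uneven spacing.
arrivals_ms = [0, 66, 140, 200]
stamps = [pts_from_arrival(t, arrivals_ms[0], TIME_BASE) for t in arrivals_ms]
print(stamps)
```

A player that honors the PTS then shows each frame at the moment it was captured, whatever the nominal frame rate was.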

You might want to study some examples based on Dranger's old tutorial, where this is explained in much better words than mine.

Regards,
Kalileo

