[Libav-user] Muxing sparse media streams

Anton Yabchinskiy arn at devline.ru
Thu Apr 21 17:47:10 CEST 2016


Hello,

I'm trying to mux two streams (1280x720 H.264 video and 22050 Hz, mono
PCM audio) into an AVI file. The problem is that stream data may have
arbitrary gaps, and on input I have raw video frames/blocks of audio
data with associated POSIX timestamps (the time at which video/audio
was captured, in microseconds).

So, it may look something like this (where V is video data and A is
corresponding audio data):

| VVVV  VVV   |
|  AAA AA AAAA|

On playback there should be silence in place of missing audio data, and
something (perhaps the last valid frame) in place of missing video
frames. Also, regions with neither video nor audio should be skipped
and not muxed to the output (but I'm leaving that aside for now).
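
To make the silence requirement concrete, here is a minimal sketch of how
the number of silent samples between two captured audio blocks could be
computed from their capture timestamps. It is only an illustration, not
part of libav*: the function name silence_samples and its parameters are
hypothetical, timestamps are assumed to be in microseconds, and blocks
are assumed to arrive in capture order.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a libav* function): returns how many silent
 * samples would fill the gap between the end of the previous audio block
 * and the start of the next one. Timestamps are capture times in
 * microseconds; prev_nb_samples is the sample count of the previous block.
 */
static int64_t silence_samples(int64_t prev_ts_us, int64_t prev_nb_samples,
                               int64_t next_ts_us, int sample_rate)
{
    assert(sample_rate > 0);

    /* Time at which the previous block stops playing, in microseconds. */
    int64_t prev_end_us =
        prev_ts_us + prev_nb_samples * 1000000 / sample_rate;

    int64_t gap_us = next_ts_us - prev_end_us;
    if (gap_us <= 0)
        return 0; /* contiguous or overlapping blocks: nothing to insert */

    /* Convert the gap to samples, rounding to the nearest sample. */
    return (gap_us * sample_rate + 500000) / 1000000;
}
```

With 22050 Hz audio, a one-second block captured at t = 0 followed by a
block captured at t = 1.5 s leaves a half-second gap, so the call
silence_samples(0, 22050, 1500000, 22050) yields 11025 samples.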

What could be a proper way to mux such data?

I don't know much about libav* yet; here is what I'm doing so far.

For the video (audio) stream I set time_base to
1/intended_video_frame_rate (1/audio_sample_rate):

	video_stream->time_base.num = 1;
	video_stream->time_base.den = 25;

	audio_stream->time_base.num = 1;
	audio_stream->time_base.den = audio_stream->codec->sample_rate;

I take the minimum audio/video capture timestamp as `origin`. Capture
timestamps are rescaled to the stream time_base, and packets are
written with av_interleaved_write_frame:

	/* capture timestamps are in microseconds */
	static const AVRational time_base = { 1, 1000000 };

	/* stream is either audio_stream or video_stream */
	pkt.pts = av_rescale_q(
		capture_ts - origin, time_base, stream->time_base);
	av_interleaved_write_frame(ctx, &pkt);
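
For illustration, the arithmetic behind this rescaling can be sketched in
plain C without libav*. The function us_to_ticks below is a hypothetical
stand-in for av_rescale_q with a source time base of 1/1000000 and a
destination time base of 1/den; it assumes a non-negative offset and
rounds to the nearest tick.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the av_rescale_q call above: converts an
 * offset in microseconds to ticks of a 1/den time base, rounding to the
 * nearest tick. Assumes us >= 0 and that us * den fits in int64_t, which
 * holds for offsets relative to `origin` but not for raw POSIX
 * timestamps multiplied by a large den.
 */
static int64_t us_to_ticks(int64_t us, int64_t den)
{
    assert(us >= 0 && den > 0);
    return (us * den + 500000) / 1000000;
}
```

With the time bases above, a video frame captured 40000 us after
`origin` gets pts 1 (one tick of 1/25 s), and an audio block captured
500000 us after `origin` gets pts 11025 (in 1/22050 s ticks).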

For now I'm considering two simpler special cases of input streams.

Input 1 where some leading video is missing:

|  VV|
|AAAA|

In this case the output is satisfactory: audio plays right away, the
first video frame is shown until video data actually starts, and video
and audio are synchronized.

Input 2 where some leading audio is missing:

|VVVV|
|  AA|

In this case video and audio in the output both start playing right
away, unsynchronized. It looks as if the input had been the following
before muxing (lowercase a denotes unsynchronized audio), although the
packet pts values were valid timestamps in both cases:

|VVVV|
|aa  |

What am I doing wrong here? Or is what I'm trying to achieve complete
nonsense?

Any advice is appreciated.

