[Libav-user] Syncing separate audio and video streams

Nicolas George nicolas.george at normalesup.org
Wed Mar 21 15:20:31 CET 2012

On primidi, the 1st of Germinal, year CCXX, Ludwig Ertl wrote:
> I want to use ffmpeg to decode the audio and video streams from a
> Trendnet TV-IP512P IP camera in order to feed them to ffserver for streaming.
> I found a document on the Internet which is describing the video and
> audio format container used by the IP camera:
> http://www.paillassou.com/DCS-2121/CGI_2121.pdf
> See Section 8.2 (Advanced ip-Camera Stream (ACS) Header).
> Unfortunately, audio and video streams are served separately under different
> URLs (See Section 4.1.5 and 4.1.6): /video/ACVS.cgi and /audio/ACAS.cgi
> Now I wrote 2 decoding plugins for libavformat (which I'd like to
> contribute to libavformat once they are working as expected), which I have
> attached in this message.
> They basically work fine, but as I'm completely new to libav, I have no
> idea how to sync those two streams together.
> Currently, what I'm doing for testing is:
> ffmpeg -i http://admin:xxx-Q0ErXNX1RuZeFKHHnMQK1g@public.gmane.org/video/ACVS.cgi -i
> http://admin:xxx-Q0ErXNX1RuZeFKHHnMQK1g@public.gmane.org/audio/ACAS.cgi test.mpg
> So I have 2 separate streams which are of course out of sync.
> There is a timestamp field in the frame header of each audio/video frame,
> which is just a unix-timestamp with msec precision, both from the same
> clock source.
> So this information could be used to sync the streams, but I have no clue
> how this could possibly work, as the audio and video demuxer plugins don't
> know anything about each other, and even if they did (via an external
> variable or some ugly hack like that), I don't have a clue how to sync
> them. I suspect that it may have something to do with PTS and DTS
> timestamps, but I don't know how they are used for audio and video sync in
> separate streams.

The PTS of a frame is the timestamp at which this frame should start to be
played or displayed. That is exactly what you need to sync audio and video.

The DTS is some obscure construction based on an abstract model for a
decoder, and is only relevant when there are B-frames. Otherwise, setting it
at the same value as PTS is fine.
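That relationship can be sketched in plain C. The struct and helper below are illustrative only, not FFmpeg API; a real demuxer would set the equivalent fields on an AVPacket:

```c
#include <stdint.h>

/* Hypothetical packet timestamps, mirroring AVPacket's pts/dts fields
 * (int64_t values expressed in the stream time base).  Sketch only. */
typedef struct {
    int64_t pts;
    int64_t dts;
} packet_ts;

/* Time base 1/1000000: one tick per microsecond.  Builds the PTS from
 * the camera's wall-clock seconds + microseconds fields. */
static packet_ts make_timestamps(uint32_t time_sec, uint32_t time_usec)
{
    packet_ts t;
    t.pts = (int64_t)time_sec * 1000000LL + time_usec;
    /* No B-frames here, so decode order equals display order and
     * DTS can simply be set to the same value as PTS. */
    t.dts = t.pts;
    return t;
}
```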

In libavdevice, a lot of demuxers (ALSA, V4L on certain kernels, JACK)
already return Unix timestamps with microsecond precision, so this is a good
choice.

Now, as to the question of how to sync both streams, the answer is not easy,
but it is the same kind of problem as syncing a capture from V4L with a
capture from ALSA.

With the ffmpeg command-line tool, sync issues of that kind can be solved
with the -async option, but I do not find it very elegant. In custom code,
or possibly in a future version of ffmpeg, other algorithms could be
implemented.
> What I have already tried was using the clock as PTS for both audio and
> video:
> av_set_pts_info(st, 64, 1, 1000000);  /* 64 bits pts in us */
> pkt->pts = ac->hdr.ulTimeSec * 1000000LL + ac->hdr.ulTimeUSec;

That looks right (except the camelCase), and it is the best option.

> But this just resulted in a totally garbled video stream.

This is rather strange. Can you show your command line and console output in
that case, and describe what kind of "garbled" you get?

>  * ACS (Advanced ip-Camera Stream) demuxer
>  * Copyright (c) 2012 DI(FH) Ludwig Ertl / CSP GmbH

>  * ACS (Advanced ip-Camera Stream) demuxer
>  * Copyright (c) 2012 DI(FH) Ludwig Ertl / CSP GmbH

This looks promising. But I believe you may have much less work if you try
to merge both files into a single demuxer that automatically detects
whether it is receiving audio or video.


  Nicolas George
