[FFmpeg-devel] [PATCH] matroska: Set CodecDelay and SeekPreroll in the AVStream

Vignesh Venkatasubramanian vigneshv at google.com
Fri Oct 11 17:31:55 CEST 2013


On Fri, Oct 11, 2013 at 8:09 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Fri, Oct 11, 2013 at 08:05:08AM -0700, Vignesh Venkatasubramanian wrote:
>> On Tue, Oct 8, 2013 at 12:54 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> > On Mon, Oct 07, 2013 at 09:05:09AM -0700, Vignesh Venkatasubramanian wrote:
>> >> On Sun, Oct 6, 2013 at 5:19 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> >> > On Thu, Oct 03, 2013 at 04:49:08PM -0700, Vignesh Venkatasubramanian wrote:
>> >> >> On Thu, Sep 26, 2013 at 5:11 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> >> >> > On Tue, Sep 24, 2013 at 02:50:16PM -0700, Vignesh Venkatasubramanian wrote:
>> >> >> >> On Tue, Sep 24, 2013 at 2:14 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> >> >> >> > On Tue, Sep 24, 2013 at 12:09:31PM -0700, Vignesh Venkatasubramanian wrote:
>> >> >> >> >> On Tue, Sep 24, 2013 at 11:36 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> >> >> >> >> > On Tue, Sep 24, 2013 at 10:35:47AM -0700, Vignesh Venkatasubramanian wrote:
>> >> >> >> >> >> On Tue, Sep 24, 2013 at 3:53 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
>> >> >> >> >> >> > On Mon, Sep 23, 2013 at 04:03:25PM -0700, Vignesh Venkatasubramanian wrote:
>> >> >> >> >> >> >> On Mon, Sep 23, 2013 at 4:02 PM, Vignesh Venkatasubramanian
>> >> >> >> >> >> >> <vigneshv at google.com> wrote:
>> >> >> >> >> >> >> > This patch exports the values of Codec Delay and Seek Preroll
>> >> >> >> >> >> >> > container elements as in the AVStream structure. The seek_preroll
>> >> >> >> >> >> >> > field has been added to the AVStream struct and the minor version
>> >> >> >> >> >> >> > of libavformat has been bumped.
>> >> >> >> >> >> >> >
>> >> >> >> >> >> >> > Signed-off-by: Vignesh Venkatasubramanian <vigneshv at google.com>
>> >> >> >> >> >> >> > ---
>> >> >> >> >> >> >> >  libavformat/avformat.h    | 16 ++++++++++++++++
>> >> >> >> >> >> >> >  libavformat/matroskadec.c | 14 ++++++++++++++
>> >> >> >> >> >> >> >  libavformat/version.h     |  2 +-
>> >> >> >> >> >> >> >  3 files changed, 31 insertions(+), 1 deletion(-)
>> >> >> >> >> >> >> >
>> >> >> >> >> >> >> > diff --git a/libavformat/avformat.h b/libavformat/avformat.h
>> >> >> >> >> >> >> > index b18eb3f..6671a83 100644
>> >> >> >> >> >> >> > --- a/libavformat/avformat.h
>> >> >> >> >> >> >> > +++ b/libavformat/avformat.h
>> >> >> >> >> >> >> > @@ -856,6 +856,9 @@ typedef struct AVStream {
>> >> >> >> >> >> >> >
>> >> >> >> >> >> >> >      /**
>> >> >> >> >> >> >> >       * Number of samples to skip at the start of the frame decoded from the next packet.
>> >> >> >> >> >> >> > +     *
>> >> >> >> >> >> >> > +     * Code outside avformat should access this field using:
>> >> >> >> >> >> >> > +     * av_stream_get_skip_samples/set_skip_samples(stream)
>> >> >> >> >> >> >> >       */
>> >> >> >> >> >> >> >      int skip_samples;
>> >> >> >> >> >> >
>> >> >> >> >> >> > when does code outside lavf need to access this ?
>> >> >> >> >> >> >
>> >> >> >> >> >>
>> >> >> >> >> >> you are right. it need not be accessed outside lavf.
>> >> >> >> >> >>
>> >> >> >> >> >> >
>> >> >> >> >> >> >> >
>> >> >> >> >> >> >> > @@ -888,10 +891,23 @@ typedef struct AVStream {
>> >> >> >> >> >> >> >       */
>> >> >> >> >> >> >> >      int pts_wrap_behavior;
>> >> >> >> >> >> >> >
>> >> >> >> >> >> >> > +    /**
>> >> >> >> >> >> >> > +     * Number of samples to skip after a discontinuity. For example, when a seek
>> >> >> >> >> >> >> > +     * happens.
>> >> >> >> >> >> >> > +     *
>> >> >> >> >> >> >> > +     * Code outside avformat should access this field using:
>> >> >> >> >> >> >> > +     * av_stream_get/set_seek_preroll(stream)
>> >> >> >> >> >> >> > +     */
>> >> >> >> >> >> >> > +    int seek_preroll;
>> >> >> >> >> >> >> > +
>> >> >> >> >> >> >> >  } AVStream;
>> >> >> >> >> >> >> >
>> >> >> >> >> >> >> >  AVRational av_stream_get_r_frame_rate(const AVStream *s);
>> >> >> >> >> >> >> >  void       av_stream_set_r_frame_rate(AVStream *s, AVRational r);
>> >> >> >> >> >> >> > +int  av_stream_get_skip_samples(const AVStream *s);
>> >> >> >> >> >> >> > +void av_stream_set_skip_samples(AVStream *s, int skip_samples);
>> >> >> >> >> >> >> > +int  av_stream_get_seek_preroll(const AVStream *s);
>> >> >> >> >> >> >> > +void av_stream_set_seek_preroll(AVStream *s, int seek_preroll);
>> >> >> >> >> >> >> >
>> >> >> >> >> >> >> >  #define AV_PROGRAM_RUNNING 1
>> >> >> >> >> >> >> >
>> >> >> >> >> >> >> > diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
>> >> >> >> >> >> >> > index a1b7f56..ecbc723 100644
>> >> >> >> >> >> >> > --- a/libavformat/matroskadec.c
>> >> >> >> >> >> >> > +++ b/libavformat/matroskadec.c
>> >> >> >> >> >> >> > @@ -163,6 +163,8 @@ typedef struct {
>> >> >> >> >> >> >> >      uint64_t default_duration;
>> >> >> >> >> >> >> >      uint64_t flag_default;
>> >> >> >> >> >> >> >      uint64_t flag_forced;
>> >> >> >> >> >> >> > +    uint64_t codec_delay;
>> >> >> >> >> >> >> > +    uint64_t seek_preroll;
>> >> >> >> >> >> >> >      MatroskaTrackVideo video;
>> >> >> >> >> >> >> >      MatroskaTrackAudio audio;
>> >> >> >> >> >> >> >      MatroskaTrackOperation operation;
>> >> >> >> >> >> >> > @@ -410,6 +412,8 @@ static EbmlSyntax matroska_track[] = {
>> >> >> >> >> >> >> >      { MATROSKA_ID_TRACKOPERATION,       EBML_NEST, 0, offsetof(MatroskaTrack,operation), {.n=matroska_track_operation} },
>> >> >> >> >> >> >> >      { MATROSKA_ID_TRACKCONTENTENCODINGS,EBML_NEST, 0, 0, {.n=matroska_track_encodings} },
>> >> >> >> >> >> >> >      { MATROSKA_ID_TRACKMAXBLKADDID,     EBML_UINT, 0, offsetof(MatroskaTrack,max_block_additional_id) },
>> >> >> >> >> >> >> > +    { MATROSKA_ID_CODECDELAY,           EBML_UINT, 0, offsetof(MatroskaTrack,codec_delay) },
>> >> >> >> >> >> >> > +    { MATROSKA_ID_SEEKPREROLL,          EBML_UINT, 0, offsetof(MatroskaTrack,seek_preroll) },
>> >> >> >> >> >> >> >      { MATROSKA_ID_TRACKFLAGENABLED,     EBML_NONE },
>> >> >> >> >> >> >> >      { MATROSKA_ID_TRACKFLAGLACING,      EBML_NONE },
>> >> >> >> >> >> >> >      { MATROSKA_ID_CODECNAME,            EBML_NONE },
>> >> >> >> >> >> >
>> >> >> >> >> >> >> > @@ -1874,6 +1878,16 @@ static int matroska_read_header(AVFormatContext *s)
>> >> >> >> >> >> >> >              st->codec->bits_per_coded_sample = track->audio.bitdepth;
>> >> >> >> >> >> >> >              if (st->codec->codec_id != AV_CODEC_ID_AAC)
>> >> >> >> >> >> >> >              st->need_parsing = AVSTREAM_PARSE_HEADERS;
>> >> >> >> >> >> >> > +            if (track->codec_delay > 0) {
>> >> >> >> >> >> >> > +                st->skip_samples = av_rescale_q(track->codec_delay,
>> >> >> >> >> >> >> > +                                                (AVRational){1, 1000000000},
>> >> >> >> >> >> >> > +                                                (AVRational){1, st->codec->sample_rate});
>> >> >> >> >> >> >> > +            }
>> >> >> >> >> >> >
>> >> >> >> >> >> > i suspect this alone wont work when the user seeks back to the first
>> >> >> >> >> >> > packet
>> >> >> >> >> >>
>> >> >> >> >> >> i don't understand why. can you please explain?
>> >> >> >> >> >
>> >> >> >> >> > because its set when parsing the header but when seeking back the
>> >> >> >> >> > header is not re parsed so its not set again so the first packet
>> >> >> >> >> > would then skip a different amount
>> >> >> >> >> > thats unless iam missing something
>> >> >> >> >> >
>> >> >> >> >>
>> >> >> >> >> yes, you are right. it is up to the decoder implementation to make sure
>> >> >> >> >> that it uses the correct skip_samples value when the input timestamp is
>> >> >> >> >> zero (or start of stream) when seeking.
>> >> >> >> >
>> >> >> >> > the decoder doesnt know when the timestamp is zero, a file might start
>> >> >> >> > from a timestamp differnt than zero
>> >> >> >> >
>> >> >> >>
>> >> >> >> i think i haven't communicated correctly about what i'm trying to
>> >> >> >> achieve with this patch.
>> >> >> >> basically, lavf will put this container value in the stream struct
>> >> >> >> while parsing the
>> >> >> >> header. once it populated, this field should not change (on seeking,
>> >> >> >> etc.). so , it's a place
>> >> >> >> where i can put the container field in the stream.
>> >> >> >
>> >> >> > AVStream->skip_samples will simply cause side data to be added to the
>> >> >> > next packet that indicate to the decoder that a specific number of
>> >> >> > samples should be skiped. skip_samples will be reset to 0 after it
>> >> >> > has been used in this way.
>> >> >> >
>> >> >> > See the code in utils.c that uses skip_samples, its just 5 lines
>> >> >> >
>> >> >> >
>> >> >> > A demuxer could just directly generate the sidedata but that is
>> >> >> > probably often more complex implementation wise
>> >> >> >
>> >> >>
>> >> >> Here is what I am trying to achieve: I need to use libavformat's
>> >> >> demuxer to demux the file but not use libavcodec for the decoding. So
>> >> >
>> >> > that shouldnt affect how values are exported from libavformat
>> >> >
>> >> >
>> >> >> in that case, i need libavformat to export this codec delay container
>> >> >> element's value somewhere. I presumed that to be the AVStream
>> >> >> structure. Can you please tell me how to do this if not through
>> >> >> AVStream?
>> >> >
>> >> > there are many ways on how to export values, AVStream, AVCodecContext
>> >> > AVPacket side data, metadata, ...
>> >> > Each of these have different limitations
>> >> > AVStream/AVCodecContext when used to communicate with the outside
>> >> > should not hold data that changes mid stream (any packet queue
>> >> > would make matching them to packets hard)
>> >> > When the demuxer and decoder layer need to keep track of different
>> >> > values then both AVStream and AVCodecContext might each have the same
>> >> > field like sample_aspect_ratio.
>> >> > Also the decoder/encoder does not have direct access to AVStream
>> >> > But the demuxer/muxer do have access to AVCodecContext, though this
>> >> > is not guranteed to be the very same struct that is used by the
>> >> > actual codec.
>> >> >
>> >> >
>> >> > also
>> >> > what i think is important is that its consistent between muxer and
>> >> > demuxer. Also existing fields should be used when theres no reason
>> >> > not to use them but fields should also not be misused.
>> >> > AVStream->skip_samples is intended to transport values from a demuxers
>> >> > header parser to the first AVPackets sidedata. Currently
>> >> > that requires the value to be set again when seeking to the begin.
>> >> > That could be changed of course if it simplifies something ...
>> >> >
>> >> > Now, above does not awnser your question what to use, but rather
>> >> > tries to explain the differences of the various options.
>> >> > I have no real preferrance what to use and i dont know the fine
>> >> > details of the matroska fields and the specification is kind of not
>> >> > explaining them very well either
>> >> >
>> >>
>> >> Based on your response, i now have a better understanding of the
>> >> various ways to export this information from libavformat. I feel that
>> >> exporting seek_preroll this way is fine. Regarding codec_delay, since
>> >> it is a per file metadata and not associated with a packet, I feel
>> >> that it should not go into skip_samples (which essentially modifies
>> >> the next packet, rather than exporting it as a per-file metadata) but
>> >> a new field should be created in AVStream instead. Please let me know
>> >> if you think this behavior is good so that I can go ahead and do the
>> >> patches.
>> >
>> > can you explain how this will work on the encoder -> muxer side ?
>> >
>>
>> On the encoder/muxer side, SeekPreRoll is a hardcoded value in the
>> muxer. Codec Delay is communicated from the encoder to the muxer
>> through AVCodecContext->delay.
>
> this doesnt seem to match the demuxer/decoder side.
> Maybe iam missing something but this mismatch doesnt seem like a good
> idea

Alright, i agree with that. I will use the same AVCodecContext->delay
field in the demuxer side too to export this value from libavformat.

>
> also if a field to store SeekPreRoll in becomes available then it
> could be set by the encoder and used by the muxer making per encoder
> hardcoded tables in the muxer unneeded
>

There is no way to retrieve this value from the libopus encoder as of
now. So it has to be hardcoded. Are you suggesting doing the
hardcoding in the encoder rather than the muxer? Because that seems
like a good idea, in that case can I add that field to the
AVCodecContext too? Or should i do it someplace else.

> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Breaking DRM is a little like attempting to break through a door even
> though the window is wide open and the only thing in the house is a bunch
> of things you dont want and which you would get tomorrow for free anyway
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>


More information about the ffmpeg-devel mailing list