[FFmpeg-devel] [PATCH 10/15] avformat/matroskaenc: Avoid seeking when writing level 1 elements

Andreas Rheinhardt andreas.rheinhardt at googlemail.com
Tue Apr 2 18:54:00 EEST 2019


Hello,

thanks for taking a look at this.

Hendrik Leppkes:
> On Tue, Apr 2, 2019 at 3:36 PM Andreas Rheinhardt via ffmpeg-devel
> <ffmpeg-devel at ffmpeg.org> wrote:
>>
>> Up until now, the writing process for level 1 elements (those elements
>> for which CRC-32 elements are written by default) was this in case the
>> output was seekable: Write the EBML ID, write an "unkown length" EBML
>> number of the desired length, then write the element into a dynamic
>> buffer, then write the dynamic buffer (after possible calculation and
>> writing of the CRC-element), then seek back to the size element and
>> overwrite the unknown-size element with the real size. The seeking and
>> overwriting part has been eliminated by not writing the size initially.
>>
> 
> I'm not particularly happy that it stops using start_ebml_master and
> basically duplicates its functionality. This adds possible maintenance
> in the future, or hidden bugs.
> 
1. start_ebml_master has the disadvantage that the length of the size
element is chosen in advance; if one doesn't want to use another
dynamic buffer for every master element (I'm thinking about the
thousands of cues for which this would be mostly useless as they are
written with one byte length fields anyway) and doesn't want to waste
bytes for writing lots of elements (the level 1 elements), then these
two functions need to be separated.
> Additionally, before this change, the position of the current writer
> points to the position where the dynamic block is actually going to be
> written - after this, it'll write the size inbetween.
> Especially considering this behavior is different between seek and
> non-seek, I feel a bit uneasy if something might want to reference the
> position - there is a specific warning about that in the CRC case,
> which would apply here equally.
2. What do you mean by "current writer"? It's s->pb, isn't it?

3. You should know that the behaviour differs already between the
seekable and the non-seekable cases (it is true that s->pb points to
the position where the cluster's dynamic buffer is going to be
written, but these are two different positions: once pointing to the
EBML ID, once pointing to the first child of the cluster) and the
relative offsets in the non-seekable case are wrong:

That's because the ID and the length field (i.e. the unknown-length
size field that will be overwritten later) are already included in the
dynamic buffer in the non-seekable case, but according to the Matroska
specifications, the relative offset count begins after the cluster's
length field (i.e. the first element in the cluster (usually a CRC-32
or the cluster timestamp) has a relative offset of zero), so that the
relative offset is currently off by the size of the EBML ID + size of
the length field, i.e. 12 bytes. This is currently no grave problem
given that the relative packet position is only used for the cues,
which aren't written in non-seekable mode.

But it is a problem for debug output. The "Writing block at offset"
message is wrong and currently outputs the relative offset, relative
to the beginning of the cluster EBML ID in the non-seekable mode,
relative to the beginning of the cluster's elements in the seekable
mode. See patch #14 (where I have just found a little bug that I'll
fix in a minute, but which addresses said log message).

4. Given that my patchset actually unifies the seekable and
non-seekable cases wrt writing clusters (and in particular, fixes the
relative offsets in the non-seekable case), I don't see a reason to
feel uneasy. I actually feel uneasy about the current state of
affairs, hence this patchset.
> 
> Maybe the last point can be improved if the size is being included in
> the dynamic buffer and overwritten therein, as such making relative
> positions valid again, even if different to the non-seek case?
> 
5. Currently, the offset in the dynamic buffer coincides with the
relative offset according to the Matroska specifications if we are in
seekable mode. The next patch in this series will extend this to the
non-seekable mode as well. I don't see a reason why I should alter
this by including the size in the dynamic buffer. This would not make
relative offsets valid again, but would make them invalid (except if
we subtracted the length of what has been reserved for the size field
again...).

6. Or do you mean by making "relative positions valid again" that you
want that the offset in s->pb + the offset of an element in the
dynamic buffer coincides with the position where this element will
actually by written in the output? This would effectively mean that
one can't use the minimum required number of bytes for the cluster's
size field and instead would have to reserve so many that it always
works. The only advantage of this would be that the "Writing block at
offset" log message can really spit out the real output offset of the
block (which is impossible if the length of the cluster's size field
hasn't been chosen before writing the block). That's a very slim
advantage to me.

- Andreas


More information about the ffmpeg-devel mailing list