[FFmpeg-devel] [PATCH 1/5] mxfdec: Parse IndexTableSegments and convert them into AVIndexEntry arrays

Michael Niedermayer michaelni at gmx.at
Mon Nov 14 16:03:29 CET 2011


On Mon, Nov 14, 2011 at 09:44:35AM +0100, Tomas Härdin wrote:
> On Sun, 2011-11-13 at 20:41 +0100, Michael Niedermayer wrote:
> > On Fri, Nov 11, 2011 at 03:42:22PM +0100, Tomas Härdin wrote:
> > > New thread since the old one is rather cluttered.
> > > 
> > > I incorporated Georg's single_eubc patch and split the system item hack
> > > into its own patch.
> > > This patch is rather clean now IMO.
> > [...]
> > > +static int mxf_get_sorted_table_segments(MXFContext *mxf, int *nb_sorted_segments, MXFIndexTableSegment ***sorted_segments)
> > > +{
> > > +    int i, j, nb_segments = 0;
> > > +    MXFIndexTableSegment **unsorted_segments;
> > > +    int last_body_sid = -1, last_index_sid = -1, last_index_start = -1;
> > > +
> > > +    /* count number of segments, allocate arrays and copy unsorted segments */
> > > +    for (i = 0; i < mxf->metadata_sets_count; i++)
> > > +        if (mxf->metadata_sets[i]->type == IndexTableSegment)
> > > +            nb_segments++;
> > > +
> > > +    if (!(unsorted_segments = av_calloc(nb_segments, sizeof(*unsorted_segments))) ||
> > > +        !(*sorted_segments  = av_calloc(nb_segments, sizeof(**sorted_segments)))) {
> > > +        av_free(unsorted_segments);
> > > +        return AVERROR(ENOMEM);
> > > +    }
> > > +
> > > +    for (i = j = 0; i < mxf->metadata_sets_count; i++)
> > > +        if (mxf->metadata_sets[i]->type == IndexTableSegment)
> > > +            unsorted_segments[j++] = (MXFIndexTableSegment*)mxf->metadata_sets[i];
> > > +
> > > +    *nb_sorted_segments = 0;
> > 
> > > +
> > > +    /* sort segments by {BodySID, IndexSID, IndexStartPosition}, remove duplicates while we're at it */
> > > +    for (i = 0; i < nb_segments; i++) {
> > > +        int best = -1, best_body_sid = -1, best_index_sid = -1, best_index_start = -1;
> > > +
> > > +        for (j = 0; j < nb_segments; j++) {
> > > +            MXFIndexTableSegment *s = unsorted_segments[j];
> > > +
> > > +            /* Require larger BosySID, IndexSID or IndexStartPosition then the previous entry. This removes duplicates.
> > > +             * We want the smallest values for the keys than what we currently have, unless this is the first such entry this time around.
> > > +             */
> > > +            if ((i == 0     || s->body_sid > last_body_sid || s->index_sid > last_index_sid || s->index_start_position > last_index_start) &&
> > > +                (best == -1 || s->body_sid < best_body_sid || s->index_sid < best_index_sid || s->index_start_position < best_index_start)) {
> > > +                best             = j;
> > > +                best_body_sid    = s->body_sid;
> > > +                best_index_sid   = s->index_sid;
> > > +                best_index_start = s->index_start_position;
> > > +            }
> > > +        }
> > > +
> > > +        /* no suitable entry found -> we're done */
> > > +        if (best == -1)
> > > +            break;
> > > +
> > > +        (*sorted_segments)[(*nb_sorted_segments)++] = unsorted_segments[best];
> > > +        last_body_sid    = best_body_sid;
> > > +        last_index_sid   = best_index_sid;
> > > +        last_index_start = best_index_start;
> > > +    }
> > 
> > how many entries can the to be sorted table contain? iam asking due
> > to the scalability of the selection sort. if its alot the C qsort()
> > should be considered otherwise it doesnt matter
> 
> The number of segments is unlimited, but for a typical file it'll be
> less than a hundred.
> qsort() won't be better since the segments are likely to be in reverse
> order due to the reverse parsing patch (5/5).

qsort() is O(n log(n)) for reverse order
qsort() breaks down to O(n^2) when elements are ordered
so that the pivot always falls onto the maximum or minimum, which is
not easy to achive accidentially IMHO

but with just a few hundread this isnt relevant to this patch review


[...]
> > > +        st->index_entries_allocated_size += duration * sizeof(*st->index_entries);
> > > +        if (st->index_entries_allocated_size / sizeof(*st->index_entries) < st->nb_index_entries + duration) {
> > > +            av_log(mxf->fc, AV_LOG_ERROR, "index_entries_allocated_size overflow; "
> > > +                                          "segment duration %i unreasonably large?\n", duration);
> > > +            return AVERROR_INVALIDDATA;
> > > +        }
> > > +        if (!(st->index_entries = av_realloc(st->index_entries, st->index_entries_allocated_size))) {
> > > +            av_free(sorted_segments);
> > > +            return AVERROR(ENOMEM);
> > > +        }
> > > +        for (k = 0; k < duration; k++) {
> > > +            AVIndexEntry *e = &st->index_entries[st->nb_index_entries];
> > > +            if (k < tableseg->nb_index_entries) {
> > > +                e->pos = tableseg->stream_offset_entries[k];
> > > +                if (n_delta < tableseg->nb_delta_entries) {
> > > +                    if (n_delta < tableseg->nb_delta_entries - 1) {
> > > +                        e->size =
> > > +                            tableseg->slice_offset_entries[k][tableseg->slice[n_delta+1]-1] +
> > > +                            tableseg->element_delta[n_delta+1] -
> > > +                            tableseg->element_delta[n_delta];
> > > +                        if (tableseg->slice[n_delta] > 0)
> > > +                            e->size -= tableseg->slice_offset_entries[k][tableseg->slice[n_delta]-1];
> > > +                    } else if (k < duration - 1) {
> > > +                        e->size = tableseg->stream_offset_entries[k+1] -
> > > +                            tableseg->stream_offset_entries[k] -
> > > +                            tableseg->slice_offset_entries[k][tableseg->slice[tableseg->nb_delta_entries-1]-1] -
> > > +                            tableseg->element_delta[tableseg->nb_delta_entries-1];
> > > +                    } else
> > > +                        e->size = 0;
> > > +                    if (tableseg->slice[n_delta] > 0)
> > > +                        e->pos += tableseg->slice_offset_entries[k][tableseg->slice[n_delta]-1];
> > > +                    e->pos += tableseg->element_delta[n_delta];
> > > +                }
> > > +                e->flags = !(tableseg->flag_entries[k] & 0x30) ? AVINDEX_KEYFRAME : 0;
> > > +            } else {
> > > +                e->pos = (int64_t)k * tableseg->edit_unit_byte_count + accumulated_offset;
> > > +                if (n_delta < tableseg->nb_delta_entries - 1)
> > > +                    e->size = tableseg->element_delta[n_delta+1] - tableseg->element_delta[n_delta];
> > > +                else {
> > > +                    /* use smaller size for last sample if we should */
> > > +                    if (last_sample_size && k == duration - 1)
> > > +                        e->size = last_sample_size;
> > > +                    else
> > > +                        e->size = tableseg->edit_unit_byte_count;
> > > +                    if (tableseg->nb_delta_entries)
> > > +                        e->size -= tableseg->element_delta[tableseg->nb_delta_entries-1];
> > > +                }
> > > +                if (n_delta < tableseg->nb_delta_entries)
> > > +                    e->pos += tableseg->element_delta[n_delta];
> > > +                e->flags = AVINDEX_KEYFRAME;
> > > +            }
> > > +            if (k > 0 && e->pos < mxf->first_essence_length && accumulated_offset == 0)
> > > +                e->pos += mxf->first_essence_kl_length;
> > > +            e->timestamp = sample_duration * st->nb_index_entries++;
> > > +            av_dlog(mxf->fc, "Stream %d IndexEntry %d n_Delta %d Offset %"PRIx64" Timestamp %"PRId64"\n",
> > > +                    st->index, st->nb_index_entries, n_delta, e->pos, e->timestamp);
> > > +            e->pos += mxf->essence_offset;
> > > +            e->min_distance = 0;
> > > +        }
> > > +        accumulated_offset += segment_size;
> > > +    }
> > 
> > why does this not use av_add_index_entry() ?
> 
> To avoid needless allocations. IIRC Baptiste suggested that I keep this
> approach. I see that av_add_index_entry() makes use of
> index_entries_allocated_size, meaning simply keeping the realloc part
> should be enough. It could perhaps be broken out into a function like
> ff_alloc_extra_index_entries() in a separate patch though.

av_realloc() is slow, av_fast_realloc() is faster (by reallocating in
larger steps and not reducing size)
av_add_index_entry() uses the fast variant.

the problem av_add_index_entry() has currently is when adding
millions of entries in non sequential order.

also duration * sizeof(*st->index_entries) needs to be checked for
integer overflow unless something implicitly prevents it from becoming
too large as otherwise the array may be quite a bit smaller than the
data written into it.
av_add_index_entry() already does that check

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Awnsering whenever a program halts or runs forever is
On a turing machine, in general impossible (turings halting problem).
On any real computer, always possible as a real computer has a finite number
of states N, and will either halt in less than N cycles or never halt.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20111114/1a914b8b/attachment.asc>


More information about the ffmpeg-devel mailing list