[FFmpeg-devel] [PATCH] First shot at AVCHD seeking via new seeking API

Baptiste Coudurier baptiste.coudurier
Sun Jul 26 00:43:17 CEST 2009


Hi Ivan,

On 07/25/2009 04:14 AM, Ivan Schreter wrote:
> Hi,
>
> Ivan Schreter wrote:
>> Hi Baptiste,
>> [...]
>> I see we have a header "internal.h" in lavf. So I moved the
>> declaration of my new routine there. This is then definitely no public
>> API.
>>
>> OK to commit now (patches attached)?
>>
> Since noone reviewed my patches for about three weeks, I reviewed them
> myself and wrote an extended seek test, which tests whether seeked-to
> position really fulfills the contract of avformat_seek_file. This did
> reveal a small problem, which I fixed. BTW, other formats also have
> issues, so I'll probably send a patch for seek test to the list, when
> time permits.
>
> I also simplified the code by removing handling of special cases, since
> it just made it less clear. Here is the newest set of patches.
>
> Special case handling of seeking before first keyframe or after the last
> keyframe is now handled only for old seeking API inside of mpegts.c
> read_seek(). New seeking API will return -1 if trying to seek outside of
> timestamp range.
>
> If noone objects, I'll commit them during the next week.
>

Since it is only used in mpegts for now, please put all the code in 
mpegts.c, these symbols do not need to be exported.

Also, I think state must not be trashed if seek failed, that is 
av_read_frame_flush must not be called before seeking.

 > [...]
>
> +/**
> + * Helper structure describing keyframe search state of one stream.
> + */
> +typedef struct {
> +    int active;         ///<  Set to 1 for active streams, 0 for inactive.

I don't think inactive stream need a structure, so this field is unneeded.

> +    int64_t epsilon;    ///<  Rounding error based on time base.

Can you explain why it is needed ?

 > [...]
 >
> +
> +/**
> + * Partial search for keyframe in multiple streams.
> + *
> + * This routine searches for the next lower and next higher timestamp to
> + * given target timestamp in each stream, starting at current file position
> + * and ending at position, where all streams have already been examined
> + * (or when all higher key frames found in first iteration).
> + *
> + * This routine is called iteratively with exponential backoff to find lower
> + * timestamp.
> + *
> + * @param s                 format context.
> + * @param timestamp         target timestamp (or position, if AVSEEK_FLAG_BYTE).
> + * @param flags             seeking flags.
> + * @param sync              array with information per stream.
> + * @param keyframes_to_find count of keyframes to find in total.
> + * @param found_lo          pointer to count of already found low timestamp keyframes.
> + * @param found_hi          pointer to count of already found high timestamp keyframes.
> + */
> +static void search_keyframe_multi_partial(AVFormatContext *s,
> +                                          int64_t timestamp,
> +                                          int flags,
> +                                          AVSyncPoint *sync,
> +                                          int keyframes_to_find,
> +                                          int *found_lo,
> +                                          int *found_hi)

I don't think the name is adequate. I'm thinking of a better one.

> +    AVPacket pkt;
> +    AVSyncPoint *sp;
> +    AVStream *st;
> +    int idx;
> +    int flg;
> +    int terminated_count = 0;
> +    int64_t pos, fpos;
> +    int64_t pts, dts;       // PTS/DTS from stream
> +    int64_t pts_tb, dts_tb; // PTS/DTS in common AV_TIME_BASE
> +
> +    for (;;) {
> +        fpos = url_ftell(s->pb);
> +        if (av_read_frame(s,&pkt)<  0) {
> +            // EOF or error, make sure high flags are set
> +            for (idx = 0; idx<  s->nb_streams; ++idx) {
> +                sp =&sync[idx];
> +                if (sp->active) {
> +                    if (!sp->found_hi) {
> +                        // No high frame exists for this stream
> +                        sp->found_hi = 1;
> +                        (*found_hi)++;
> +                        sp->pts_hi = INT64_MAX;
> +                        sp->dts_hi = INT64_MAX;
> +                        sp->pos_hi = INT64_MAX;
> +                    }
> +                }
> +            }
> +            break;
> +        }
> +        idx = pkt.stream_index;
> +        sp =&sync[idx];
> +
> +        if (!sp->active) {
> +            // This stream is not active, skip packet.
> +            continue;
> +        }
> +
> +        flg = pkt.flags;
> +        pos = pkt.pos;
> +        pts = pkt.pts;
> +        dts = pkt.dts;
> +        if (pts == AV_NOPTS_VALUE) {
> +            // Some formats don't provide PTS, only DTS.
> +            pts = dts;
> +        }
> +        av_free_packet(&pkt);
> +
> +        if (pos<  0) {

space before <, one too much after, this applies to code below too.

> +            // Use file position as fallback, if no position returned. This is
> +            // inexact in most formats, so format parser should always fill
> +            // proper position to make seeking routine return reliable results.
> +            pos = fpos;
> +        }

Actually I think fpos is more reliable in our situation :)

> +        if (sp->first_pos == AV_NOPTS_VALUE) {
> +            // Note down termination position for the next iteration - when
> +            // we encounter a packet with the same position, we will ignore
> +            // any further packets for this stream in next iteration (as they
> +            // are already evaluated).
> +            sp->first_pos = pos;
> +        }
> +
> +        if (sp->term_pos != AV_NOPTS_VALUE&&  pos>  sp->term_pos) {
> +            // We are past the end position from last iteration, ignore packet.
> +            if (!sp->terminated) {
> +                sp->terminated = 1;
> +                ++terminated_count;
> +                if (terminated_count == keyframes_to_find)
> +                    break;  // all terminated, iteration done
> +            }
> +            continue;
> +        }

Do term_pos and first_pos need to be per stream ?

> +        // Evaluate key frames with known TS (or any frames, if AVSEEK_FLAG_ANY set).
> +        if (pts != AV_NOPTS_VALUE&&  ((flg&  PKT_FLAG_KEY) || (flags&  AVSEEK_FLAG_ANY))) {
> +            // Rescale stream timestamp to AV_TIME_BASE.
> +            st = s->streams[idx];
> +            pts_tb = av_rescale(pts, AV_TIME_BASE * (int64_t)st->time_base.num, st->time_base.den);
> +            dts_tb = av_rescale(dts, AV_TIME_BASE * (int64_t)st->time_base.num, st->time_base.den);
> +
> +            if (flags&  AVSEEK_FLAG_BYTE) {
> +                // Seek by byte position.
> +                if (pos<= timestamp) {
> +                    // Keyframe found before target position.
> +                    if (!sp->found_lo) {
> +                        // Found first keyframe lower than target position.
> +                        sp->found_lo = 1;
> +                        (*found_lo)++;
> +                        sp->pts_lo = pts_tb;
> +                        sp->dts_lo = dts_tb;
> +                        sp->pos_lo = pos;
> +                    } else if (sp->pos_lo<  pos) {
> +                        // Found a better match (closer to target position).
> +                        sp->pts_lo = pts_tb;
> +                        sp->dts_lo = dts_tb;
> +                        sp->pos_lo = pos;
> +                    }
> +                }
> +                if (pos>= timestamp) {
> +                    // Keyframe found after target position.
> +                    if (!sp->found_hi) {
> +                        // Found first keyframe higher than target position.
> +                        sp->found_hi = 1;
> +                        (*found_hi)++;
> +                        sp->pts_hi = pts_tb;
> +                        sp->dts_hi = dts_tb;
> +                        sp->pos_hi = pos;
> +                        if (*found_hi>= keyframes_to_find&&  sp->term_pos == INT64_MAX) {
> +                            // We found high frame for all. They may get updated
> +                            // to TS closer to target TS in later iterations (which
> +                            // will stop at start position of previous iteration).
> +                            break;
> +                        }
> +                    } else if (sp->pos_hi>  pos) {
> +                        // Found a better match (actually, shouldn't happen).
> +                        sp->pts_hi = pts_tb;
> +                        sp->dts_hi = dts_tb;
> +                        sp->pos_hi = pos;
> +                    }
> +                }
> +            } else {
> +                // Seek by timestamp.
> +                if (pts_tb<= timestamp + sp->epsilon) {
> +                    // Keyframe found before target timestamp.
> +                    if (!sp->found_lo) {
> +                        // Found first keyframe lower than target timestamp.
> +                        sp->found_lo = 1;
> +                        (*found_lo)++;
> +                        sp->pts_lo = pts_tb;
> +                        sp->dts_lo = dts_tb;
> +                        sp->pos_lo = pos;
> +                    } else if (sp->pts_lo<  pts_tb) {
> +                        // Found a better match (closer to target timestamp).
> +                        sp->pts_lo = pts_tb;
> +                        sp->dts_lo = dts_tb;
> +                        sp->pos_lo = pos;
> +                    }
> +                }
> +                if (pts_tb + sp->epsilon>= timestamp) {
> +                    // Keyframe found after target timestamp.
> +                    if (!sp->found_hi) {
> +                        // Found first keyframe higher than target timestamp.
> +                        sp->found_hi = 1;
> +                        (*found_hi)++;
> +                        sp->pts_hi = pts_tb;
> +                        sp->dts_hi = dts_tb;
> +                        sp->pos_hi = pos;
> +                        if (*found_hi>= keyframes_to_find&&  sp->term_pos == INT64_MAX) {
> +                            // We found high frame for all. They may get updated
> +                            // to TS closer to target TS in later iterations (which
> +                            // will stop at start position of previous iteration).
> +                            break;
> +                        }

If I understood correctly, at first iteration you should find high 
keyframes for all streams.

If SEEK_BYTE is requested and the pos you are at is > max_ts (max 
position) or term_pos was earlier at the previous iteration, then you 
can break earlier after read_frame anyway.

Do I miss something ?

> +                    } else if (sp->pts_hi>  pts_tb) {
> +                        // Found a better match (actually, shouldn't happen).
> +                        sp->pts_hi = pts_tb;
> +                        sp->dts_hi = dts_tb;
> +                        sp->pos_hi = pos;
> +                    }

Duplicate code between SEEK_BYTE and timestamps. I believe you only need 
one hi and one lo per stream (will be set to either pos or timestamp), 
you don't need both pts and dts. Please explain why if that's not possible.

> +            }
> +        }
> +    }
> +
> +    // Clean up the parser.
> +    av_read_frame_flush(s);
> +}
> +
> +int64_t ff_gen_syncpoint_search(AVFormatContext *s,
> +                                int stream_index,
> +                                int64_t pos,
> +                                int64_t ts_min,
> +                                int64_t ts,
> +                                int64_t ts_max,
> +                                int flags)
> +{
> +    AVSyncPoint sync[MAX_STREAMS], *sp;
> +    AVStream *st;
> +    int i;
> +    int keyframes_to_find = 0;
> +    int64_t curpos;
> +    int64_t step;
> +    int found_lo = 0, found_hi = 0;
> +    int64_t min_distance;
> +    int64_t min_pos = 0;
> +
> +    if (stream_index>= 0&&  !(flags&  AVSEEK_FLAG_BYTE)) {
> +        // Rescale timestamps to AV_TIME_BASE, if timestamp of a reference stream given.
> +        st = s->streams[stream_index];
> +        if (ts != AV_NOPTS_VALUE)
> +            ts = av_rescale(ts, AV_TIME_BASE * (int64_t)st->time_base.num, st->time_base.den);
> +        if (ts_min != INT64_MIN)
> +            ts_min = av_rescale(ts_min, AV_TIME_BASE * (int64_t)st->time_base.num, st->time_base.den);
> +        if (ts_max != INT64_MAX)
> +            ts_max = av_rescale(ts_max, AV_TIME_BASE * (int64_t)st->time_base.num, st->time_base.den);
> +    }

I don't think rescaling is needed for now since in mpegts all streams 
have same timebase.

> +    // Initialize syncpoint structures for each stream.
> +    for (i = 0; i<  s->nb_streams; ++i) {
> +        sp =&sync[i];
> +        st = s->streams[i];
> +        if (s->streams[i]->discard<  AVDISCARD_ALL) {
> +            ++keyframes_to_find;
> +            sp->pos_lo     = INT64_MAX;
> +            sp->pts_lo     = INT64_MAX;
> +            sp->dts_lo     = INT64_MAX;
> +            sp->found_lo   = 0;
> +            sp->pos_hi     = INT64_MAX;
> +            sp->pts_hi     = INT64_MAX;
> +            sp->dts_hi     = INT64_MAX;
> +            sp->found_hi   = 0;
> +            sp->epsilon    = AV_TIME_BASE * (int64_t)st->time_base.num / st->time_base.den;
> +            sp->term_pos   = INT64_MAX;
> +            sp->first_pos  = AV_NOPTS_VALUE;
> +            sp->terminated = 0;
> +            sp->active     = 1;
> +        } else {
> +            sp->active     = 0;
> +        }
> +    }
> +
> +    if (keyframes_to_find == 0) {
> +        // No stream active, error.
> +        return -1;
> +    }
> +
> +    // Find keyframes in all active streams with timestamp/position just before
> +    // and just after requested timestamp/position.
> +    step = 1024;
> +    for (;;) {
> +        curpos = pos - step;
> +        if (curpos<  0)
> +            curpos = 0;
> +        url_fseek(s->pb, curpos, SEEK_SET);
> +        search_keyframe_multi_partial(
> +                s,
> +                ts, flags,
> +                sync, keyframes_to_find,
> +&found_lo,&found_hi);
> +        if (found_lo == keyframes_to_find&&  found_hi == keyframes_to_find)
> +            break;  // have all keyframes we wanted

You can shortcut by checking if user specified a max_pos > ts or min_pos 
< ts. In this case only found_lo or found_hi matters.

> +        if (curpos == 0)
> +            break;  // cannot go back anymore
> +        step *= 2;
> +        // switch termination positions
> +        for (i = 0; i<  s->nb_streams; ++i) {
> +            sp =&sync[i];
> +            if (sp->active) {
> +                sp->term_pos   = sp->first_pos;
> +                sp->first_pos  = AV_NOPTS_VALUE;
> +                sp->terminated = 0;
> +            }

Are both first_pos and term_pos needed ?

> +    }
> +
> +    // Find actual position to start decoding so that decoder synchronizes
> +    // closest to ts and update timestamps in streams.
> +    pos = INT64_MAX;
> +
> +    for (i = 0; i<  s->nb_streams; ++i) {
> +        sp =&sync[i];
> +        if (sp->active) {
> +            min_distance = INT64_MAX;
> +            if (flags&  AVSEEK_FLAG_BYTE) {
> +                // Find closest byte position to requested position, within min/max limits.
> +                if (sp->pos_lo != INT64_MAX&&  ts_min<= sp->pos_lo&&  sp->pos_lo<= ts_max) {
> +                    // low position is in range
> +                    min_distance = ts - sp->pos_lo;
> +                    min_pos = sp->pos_lo;
> +                }
> +                if (sp->pos_hi != INT64_MAX&&  ts_min<= sp->pos_hi&&  sp->pos_hi<= ts_max) {
> +                    // high position is in range, check distance
> +                    if (sp->pos_hi - ts<  min_distance) {
> +                        min_distance = sp->pos_hi - ts;
> +                        min_pos = sp->pos_hi;
> +                    }
> +                }
> +            } else {
> +                // Find timestamp closest to requested timestamp within min/max limits.
> +                if (sp->pts_lo != INT64_MAX&&  ts_min<= sp->pts_lo + sp->epsilon&&  sp->pts_lo - sp->epsilon<= ts_max) {
> +                    // low timestamp is in range
> +                    min_distance = ts - sp->pts_lo;
> +                    min_pos = sp->pos_lo;
> +                }
> +                if (sp->pts_hi != INT64_MAX&&  ts_min<= sp->pts_hi + sp->epsilon&&  sp->pts_hi - sp->epsilon<= ts_max) {
> +                    // high timestamp is in range, check distance
> +                    if (sp->pts_hi - ts<  min_distance) {
> +                        min_distance = sp->pts_hi - ts;
> +                        min_pos = sp->pos_hi;
> +                    }
> +                }
> +            }

Duplicate code between SEEK_BYTE and timestamps.

> +            if (min_distance == INT64_MAX) {
> +                // no timestamp is in range, cannot seek
> +                return -1;
> +            }
> +            if (min_pos<  pos)
> +                pos = min_pos;
> +        }
> +    }
> +
> +    // We now have a position, so update timestamps of all streams appropriately.
> +    for (i = 0; i<  s->nb_streams; ++i) {
> +        sp =&sync[i];
> +        if (sp->active) {
> +            if (sp->found_lo&&  sp->pos_lo>= pos) {
> +                av_update_cur_dts(s, s->streams[i], sp->dts_lo);
> +            } else if (sp->found_hi&&  sp->pos_hi>= pos) {
> +                av_update_cur_dts(s, s->streams[i], sp->dts_hi);
> +            } else {
> +                // error, no suitable position found (actually impossible)
> +                return -1;
> +            }
> +        }
> +    }

Do you really need update cur_dts ?

[...]

-- 
Baptiste COUDURIER                              GnuPG Key Id: 0x5C1ABAAA
Key fingerprint                 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
FFmpeg maintainer                                  http://www.ffmpeg.org



More information about the ffmpeg-devel mailing list