[FFmpeg-user] PTS resolution[s]

Jim DeLaHunt list+ffmpeg-user at jdlh.com
Tue Feb 23 10:58:16 EET 2021


On 2021-02-22 21:35, Mark Filipak (ffmpeg) wrote:

> On 2021-02-23 00:01, Jim DeLaHunt wrote:
>> The Presentation Time Stamp (PTS) value which FFmpeg associates with 
>> video frames and audio data is a 64-bit integer. There is an 
>> associated time base attribute for each video or audio stream, which 
>> gives the number of seconds between successive values of PTS. This 
>> time base might be thought of as the resolution of PTS. Thus if you 
>> have two PTS values pts1 and pts2, then the difference in seconds 
>> between them is (pts2-pts1)*time_base.
>
>
> MPEG PES (Presentation Elemental Stream) uses a 27MHz (exact) clock 
> divided by 300 (exact), so that timebase is 1/(90000Hz)…

I've read something similar. My understanding is that MPEG PES encodes 
Presentation Time Stamp values as integer tick counts in the data 
stream. Is the timebase of 1/(90,000Hz) encoded in the data stream, or 
is it only defined in the spec?

> …(which is 0.01[1..]ms between ticks, exactly). 

Actually, for this discussion I think it's fair to say that 0.01[1..]ms 
is not exactly 1/90 ms; written out as a decimal, it is just an 
approximation. Finite decimal numbers will never get you the exact 
value. The rational number is exact. For this discussion, it will be 
clearer to use exact rational numbers.
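
To make that concrete, here is a minimal Python sketch of the 
arithmetic, using the standard fractions module. The 0.0111 ms figure is 
just a truncation I picked for illustration, and the variable names are 
my own; nothing here comes from FFmpeg or the MPEG spec text.

```python
from fractions import Fraction

# 27 MHz clock divided by 300 gives one tick every 1/90000 second.
tick = Fraction(1, 27_000_000) * 300
assert tick == Fraction(1, 90_000)

# The same tick expressed in milliseconds: exactly 1/90 ms.
tick_ms_exact = tick * 1000

# A finite decimal stand-in for 0.01[1..] ms (illustrative truncation).
tick_ms_rounded = Fraction("0.0111")

# Accumulate one hour of ticks with each value and compare.
ticks_per_hour = 90_000 * 3600
exact_hour_ms = tick_ms_exact * ticks_per_hour
approx_hour_ms = tick_ms_rounded * ticks_per_hour
drift_ms = exact_hour_ms - approx_hour_ms

print(tick_ms_exact, exact_hour_ms, approx_hour_ms, drift_ms)
```

With that particular truncation the accumulated error after one hour is 
3600 ms, while the rational value lands on 3,600,000 ms exactly.
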
> …my best information so far is that, at least out of the encoder, 
> ffmpeg encodes frames with PTS resolution = 1ms.

My impression from reading the FPS filter source code is that it is 
incomplete to talk about ffmpeg PTS values without also giving the 
corresponding timebase value. It looks to me like the FPS filter does 
not attempt to preserve the incoming PTS values or timebase. It sets a 
new time base of 1/frame_rate, and generates successive integer values 
for PTS. However, and this is crucial, it does seem to take care that 
the product PTS*time_base stays exact.

So, that seems to say that your statement "at least out of the encoder, 
ffmpeg encodes frames with PTS resolution = 1ms" is not complete without 
stating the time base value ffmpeg sets out of the encoder.

> To put this into perspective, a 24fps video has delta-PTS = 41.[6..]ms 
> whereas a 24/1.001fps video has delta-PTS = 41.708[3..]milliseconds. 
> That means that the difference between the two is less than the 
> resolution of the ffmpeg timebase (at least, for the encoder -- I 
> don't know about the decoder and the pipeline). That essentially means 
> that ffmpeg can't differentiate between them based on the working PTSs 
> that it keeps.

But what are the time base values which ffmpeg uses for these two 
cases?  If the time base is 1/24 in the first case, and 1,001/24,000 in 
the second case, then the same integer PTS values result in 
PTS*time_base products being exactly the correct time offsets from the 
first frame of the video in each of the two cases.
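
Here is a small Python sketch of that point, assuming the two time 
bases named above. The function and constant names are hypothetical, 
chosen only for this illustration; I am not claiming FFmpeg picks these 
time bases in any given pipeline.

```python
from fractions import Fraction

# Assumed time bases, in seconds per PTS tick.
FILM_TB = Fraction(1, 24)            # 24 fps
NTSC_FILM_TB = Fraction(1001, 24000) # 24/1.001 fps

def presentation_time(pts, time_base):
    """Exact presentation time in seconds for an integer PTS."""
    return pts * time_base

# Per-frame difference between the two rates, as an exact rational.
delta = NTSC_FILM_TB - FILM_TB

# The same integer PTS (24) gives different, exact times in each case.
t_film = presentation_time(24, FILM_TB)
t_ntsc = presentation_time(24, NTSC_FILM_TB)

print(delta, t_film, t_ntsc)
```

The per-frame difference is 1/24000 s, and frame 24 lands at exactly 
1 s in the first case and exactly 1001/1000 s in the second, so the two 
rates stay distinguishable as long as the time base travels with the 
PTS.
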


> I seek someone who can either, 1, confirm what I think, or 2, tell me 
> what the resolution of the decoder and pipeline actually is.

Implicit in your use of the definite article "the" is an apparent 
assumption that FFmpeg has only one resolution for the decoder and the 
pipeline. It looks to me like FFmpeg could well take the liberty of 
changing resolution at each stage of decoder and pipeline, as long as it 
preserves the values for PTS*time_base at each frame (or modifies them 
intentionally, as the FPS filter does).
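
As a sketch of what changing resolution between stages could look like, 
the Python below converts an integer PTS from one time base to another 
and reports whether PTS*time_base survived exactly. The name 
rescale_pts is hypothetical. libavutil has av_rescale_q for this kind of 
conversion, but this is not that function, and its rounding (Python's 
round-half-to-even) may differ from what FFmpeg does.

```python
from fractions import Fraction

def rescale_pts(pts, src_tb, dst_tb):
    """Convert an integer PTS between time bases.

    Returns (new_pts, exact), where exact is True when
    new_pts * dst_tb equals pts * src_tb with no rounding.
    """
    ideal = pts * src_tb / dst_tb      # exact rational tick count
    return round(ideal), ideal.denominator == 1

ntsc_film_tb = Fraction(1001, 24000)   # assumed source time base
mpeg_tb = Fraction(1, 90_000)          # 90 kHz ticks
ms_tb = Fraction(1, 1000)              # 1 ms ticks

print(rescale_pts(4, ntsc_film_tb, mpeg_tb))
print(rescale_pts(3, ntsc_film_tb, mpeg_tb))
print(rescale_pts(1, ntsc_film_tb, ms_tb))
```

Under these assumptions, frame 4 of a 24/1.001 fps stream maps to 
exactly 15015 ticks at 90 kHz, frame 3 does not map to a whole tick, and 
a 1 ms time base has to round even the first frame interval.
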

Best regards,
      —Jim DeLaHunt


