[FFmpeg-devel] [PATCH] mxf umid generation

Michael Niedermayer michaelni
Mon Mar 9 19:54:58 CET 2009


On Sun, Mar 08, 2009 at 03:00:18PM -0700, Baptiste Coudurier wrote:
> On 3/8/2009 2:08 PM, Michael Niedermayer wrote:
> > On Sun, Mar 08, 2009 at 01:05:55PM -0700, Baptiste Coudurier wrote:
> >> On 3/8/2009 7:10 AM, Michael Niedermayer wrote:
> >>> On Sat, Mar 07, 2009 at 08:53:03PM -0800, Baptiste Coudurier wrote:
> >>>> On 3/7/2009 8:40 PM, Michael Niedermayer wrote:
> >>>>> On Sat, Mar 07, 2009 at 08:05:41PM -0800, Baptiste Coudurier wrote:
> >>>>>> On 3/7/2009 7:52 PM, Michael Niedermayer wrote:
> >>>>>>> On Sat, Mar 07, 2009 at 07:25:54PM -0800, Baptiste Coudurier wrote:
> >>>>>>>> On 3/7/2009 7:16 PM, Michael Niedermayer wrote:
> >>>>>>>>> On Sat, Mar 07, 2009 at 06:31:53PM -0800, Baptiste Coudurier wrote:
> >>>>>>>>>> On 3/7/2009 5:23 PM, Michael Niedermayer wrote:
> >>>>>>>>>>> On Sat, Mar 07, 2009 at 04:14:19PM -0800, Baptiste Coudurier wrote:
> >>>>>>>>>>>> On 3/7/2009 3:36 PM, Michael Niedermayer wrote:
> >>>>>>>>>>>>> On Sat, Mar 07, 2009 at 02:48:49PM -0800, Baptiste Coudurier wrote:
> >>>>>>>>>>>>>> On 3/6/2009 7:44 PM, Michael Niedermayer wrote:
> >>>>>>>>>>>>>>> On Fri, Mar 06, 2009 at 07:28:55PM -0800, Baptiste Coudurier wrote:
> >>>>>>>>>>> [...]
> >>>>>>>>>>>>>> Property changes on: libavutil\random_seed.c
> >>>>>>>>>>>>>> ___________________________________________________________________
> >>>>>>>>>>>>>> Added: svn:eol-style
> >>>>>>>>>>>>>>    + LF
> >>>>>>>>>>>>> intended?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> except these ok
> >>>>>>>>>>>>>
> >>>>>>>>>>>> I'm not sure, I'm on windows because products I test with only works on
> >>>>>>>>>>>> windows, I set ending lines style to unix, but it keeps adding this...
> >>>>>>>>>>>>
> >>>>>>>>>>>> It should be ok I think.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Btw, is lfg ok for this purpose or should I use something else ?
> >>>>>>>>>>> to generate these id numbers out of the seed?
> >>>>>>>>>>> i'd rather use
> >>>>>>>>>>> seed += 1LL<<32 :)
> >>>>>>>>>>>
> >>>>>>>>>>> lfg seems pointless complexity for this ...
> >>>>>>>>>>>
> >>>>>>>>>>> and patch ok
> >>>>>>>>>>>
> >>>>>>>>>> Well I'd like to use the defined methods for umid generation, see below:
> >>>>>>>>>>
> >>>>>>>>>> "A.3.2 Alternative masking methods
> >>>>>>>>>> The masked material number is an unpredictable number uniformly
> >>>>>>>>>> distributed over the range 0 thru 2^128-1. Its
> >>>>>>>>>> effectiveness as a unique identifier relies on this uniform random
> >>>>>>>>>> distribution, and the exact method of its generation is not
> >>>>>>>>>> important. Therefore, the use of the reference masking method is not
> >>>>>>>>>> normative, and any method providing an equivalent
> >>>>>>>>>> level of unpredictability and uniformity of distribution may be used
> >>>>>>>>>> with the 'masked method' value in the 'number
> >>>>>>>>>> generation method' field of the UMID universal label (reference table 1
> >>>>>>>>>> in 5.1.1)."
> >>>>>>>>>>
> >>>>>>>>>> And instance generation:
> >>>>>>>>>>
> >>>>>>>>>> "B.2 24-bit PRS generator ('2h')
> >>>>>>>>>> Any suitable pseudo-random sequence (PRS) generator polynomial may be
> >>>>>>>>>> used provided it has a maximal length of
> >>>>>>>>>> 16,777,215 clock cycles. At the point of creating a new instance of the
> >>>>>>>>>> material, the 24-bits from the PRS generator are
> >>>>>>>>>> sampled to gain a new instance value.
> >>>>>>>>>> PRS generators shall not allow a zero value.
> >>>>>>>>> am i right in assuming that this "definition" is a 24bit LFSR?
> >>>>>>>>> if so, this is neither uniform over 2^128 nor unpredictable.
> >>>>>>>>> actually, it's trivial to generate all future and past values
> >>>>>>>>> from just 2 24bit values even if the used polynomial is not known.
> >>>>>>>>>
> >>>>>>>>> also if my interpretation of this "definition" is correct you can
> >>>>>>>>> expect 1 collision in ~4000 ids
> >>>>>>>> Well, "instance number" is 3 bytes and umid is 16 bytes, these are
> >>>>>>>> different numbers, this is what the code is trying to achieve, see the
> >>>>>>>> patch.
> >>>>>>>>
> >>>>>>>>>> NOTES
> >>>>>>>>>> 1 Any suitable seed may be used to start the pseudo-random sequence
> >>>>>>>>>> (PRS) 24-bit generator.
> >>>>>>>>>> 2 The PRS generator should use a free-running clock having no time
> >>>>>>>>>> relationship with the clock used to generate the sampling strobe.
> >>>>>>>>>> 3 The PRS generator clock frequency should be greater than 10 kHz.
> >>>>>>>>>> 4 The number of feedback taps resulting from the PRS generator
> >>>>>>>>>> polynomial should be between 8 and 16 to ensure the random nature
> >>>>>>>>>> of the sequence."
> >>>>>>>>>>
> >>>>>>>>>> What do you think ?
> >>>>>>>>> sounds like the spec is written by some really incompetent people.
> >>>>>>>> Is it still true now you know that these numbers are different ?
> >>>>>>> a design that cannot be implemented in ANSI C or, for that matter,
> >>>>>>> any deterministic language is broken
> >>>>>>>
> >>>>>>>
> >>>>>>>> Is the method ok at least for the "instance number" ?
> >>>>>>>>
> >>>>>>>>> [...]
> >>>>>>>>>
> >>>>>>>>> also you should set any bits left after the seed+counter to a random constant
> >>>>>>>>>
> >>>>>>>>> and if you have a 32bit seed you have 32bit of randomness and a PRNG making
> >>>>>>>>> 128 out of that still has just a randomness of 32, you could set 96 bits to
> >>>>>>>>> your pets name it wont make a difference.
> >>>>>>>>>
> >>>>>>>> So there is no way we could be able to generate the 128 bits umid
> >>>>>>>> according to the method ? Can I use the md5 of the 4 bytes of the seed ?
> >>>>>>> you can but it will have a collision in ~64k umids thats the same as if you
> >>>>>>> just take the 32bit seed + some constant like your name.
> >>>>>>> also it violates the spec because it is neither unpredictable nor uniform.
> >>>>>>>
> >>>>>>> If you want to follow the spec you need 128 strong random bits per umid.
> >>>>>>> a md5 of 32 LSB from the timer does not qualify ...
> >>>>>>>
> >>>>>> All right, what do you think about this ?
> >>>>> if /dev/random is available, better; if just a timer is available,
> >>>>> you leak the cpu type & compiler version used.
> >>>> Well, I added ff_random_get_seed according to your indication :/
> >>> I didnt suggest to use more than 1 value from it
> >>>
> >> You mean using more than once "seed" ?
> > 
> > yes, using the lsb of the timer twice stores the number of clock ticks
> > between the 2 calls, that is strongly dependent on cpu & compiler thus
> > leaks this information while not doing much else
> > 
> > 
> >> Well I'm following your suggestion of using only seed.
> >>
> >> Can I use LFG like I did in the first place ? I'd like to have this
> >> problem fixed.
> > 
> > you are the maintainer of the code, you can use what you prefer
> > 
> > If you want to conform to the spec you need 16(+3)bytes
> > from /dev/random or if that is unavailable gather bits by user interaction
> > and its timing like some other tools do.
> > And you have to redo this for each mxf file muxed.
> > 
> > if you dont care about the spec and just want to minimize collisions
> > the 32bit seed alone will be as good as filling all 16 bytes by lfg
> 
> I'll use the "undefined method" value I think.
> 
> Ok, what if I use seed only once like this ?
> I prefer having your opinion on randomness and security matter.

i think it's fine, though the +(1LL<<32) is useless; i misunderstood
the original use.
That is, i thought that you needed several unique 128bit values, and
these could have been generated by
for()
    seed += 1LL<<32;
With just a single value the +(1LL<<32) makes no difference.
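
A minimal sketch of the counter idea above, assuming the random 32-bit seed
sits in the low half of a 64-bit value and the step only touches the upper
half. The helper name gen_ids() is hypothetical and not part of the patch; it
shows only the 64-bit part that the step affects, not a full 128-bit id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: derive `count` distinct
 * 64-bit values from one 32-bit seed by stepping a counter that lives
 * in bits 32..63. The low 32 bits (the seed) never change. */
static void gen_ids(uint32_t seed, uint64_t *ids, size_t count)
{
    uint64_t value = seed;          /* counter part starts at 0 */

    assert(ids != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        value += 1ULL << 32;        /* same step as "seed += 1LL<<32" */
        ids[i] = value;
    }
}
```

Every value produced this way differs from the others in the counter bits,
which is the only purpose of the step. When a single id is generated per file,
the step merely adds a fixed offset and contributes nothing to uniqueness.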

[...]
> +static void mxf_gen_umid(AVFormatContext *s)
> +{
> +    MXFContext *mxf = s->priv_data;
> +    uint32_t seed = ff_random_get_seed();
> +    uint64_t umid = seed + (1LL<<32) + 0x152947134;
> +
> +    AV_WB64(mxf->umid  , umid);
> +    AV_WB64(mxf->umid+8, umid>>8);
> +
> +    mxf->instance_number = seed;
> +}

also it should be  + 0x15294713400000000LL

The reason is best explained by thinking about playing a game like lotto.
In theory, assuming the numbers are randomly drawn, any set/list of numbers
chosen will give you the same chance to win the prize.
But in reality, while picking "1,2,3,4,5,6" gives you the same chance
to win, you would in that case have a very high chance of having to
share with many other people who chose the same common sequence.

In our case here, while leaving 96 of 128 bits 0 is as good as any other
fixed value or one that just depends on the first 32 bits (like LFG),
it is much more likely that some other program would make a similar choice
for the high 96 bits. Setting them to an ffmpeg-specific constant would
make collisions between ffmpeg-generated files and files by other muxers
VERY unlikely.
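
A rough sketch of that layout, under stated assumptions: the 12 tag bytes
below are placeholders picked for illustration, and build_umid() is a
hypothetical name rather than the function from the patch. A byte array is
used because the 17-hex-digit constant quoted above would not fit in a 64-bit
integer as written.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: fill a 16-byte id with a fixed, application-specific
 * tag in the high 96 bits and the 32-bit random seed in the low 32 bits,
 * stored big-endian. */
static void build_umid(uint8_t umid[16], uint32_t seed)
{
    /* Placeholder tag bytes; a real muxer would choose its own constant. */
    static const uint8_t app_tag[12] = {
        0x15, 0x29, 0x47, 0x13, 0x4a, 0x5b,
        0x6c, 0x7d, 0x8e, 0x9f, 0xa0, 0xb1
    };

    assert(umid != NULL);
    memcpy(umid, app_tag, sizeof(app_tag));   /* bytes 0..11: fixed tag  */
    umid[12] = (uint8_t)(seed >> 24);         /* bytes 12..15: seed, MSB */
    umid[13] = (uint8_t)(seed >> 16);         /* first                   */
    umid[14] = (uint8_t)(seed >> 8);
    umid[15] = (uint8_t)seed;
}
```

The collision odds among files written by the same program still depend only
on the 32 random bits; the tag only separates them from ids produced by other
programs that left those bits at zero or at some other common pattern.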

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

When the tyrant has disposed of foreign enemies by conquest or treaty, and
there is nothing more to fear from them, then he is always stirring up
some war or other, in order that the people may require a leader. -- Plato