[FFmpeg-devel] [PATCH] lavu/lfg: add 64 bit random number generator
michael at niedermayer.cc
Tue Mar 15 17:10:38 CET 2016
On Mon, Mar 14, 2016 at 08:42:32PM -0400, Ganesh Ajjanagadde wrote:
> On Sun, Mar 13, 2016 at 11:08 PM, Michael Niedermayer
> <michael at niedermayer.cc> wrote:
> > On Sun, Mar 13, 2016 at 07:12:50PM -0400, Ganesh Ajjanagadde wrote:
> >> This is based on the relatively well known xorshift128+ of Sebastiano
> >> Vigna (https://en.wikipedia.org/wiki/Xorshift) that performs very well on the
> >> BigCrush suite, is very efficient, and is thus used by a number of
> >> clients: http://xorshift.di.unimi.it/ (see Introduction).
> >> Moreover, the implementation is in the public domain:
> >> http://xorshift.di.unimi.it/xorshift128plus.c.
> >> Concretely, it is nearly as fast as av_lfg_get (which only returns 32 bits),
> >> and has a much smaller cache (128 bits). Thus, the timings should also
> >> be more stable.
> >> This is needed because av_lfg_get<<32 | av_lfg_get is far slower, and
> >> likely less random as measured by BigCrush - most LFG's perform
> >> quite poorly on the BigCrush suite:
> >> http://www6.tw.freebsd.org/distfiles/testu01.pdf.
> >> In particular, FFmpeg's seems to be Ran055 in the paper, see pg31.
> >> Sample benchmark (Haswell, GCC + -march=native):
> >> 23200 decicycles in 624 calls of av_lfg_get, 1 runs, 0 skips
> >> 23040 decicycles in 624 calls of av_lfg_get, 2 runs, 0 skips
> >> 22810 decicycles in 624 calls of av_lfg_get, 4 runs, 0 skips
> >> [...]
> >> 19884 decicycles in 624 calls of av_lfg_get, 65532 runs, 4 skips
> >> 19880 decicycles in 624 calls of av_lfg_get, 131067 runs, 5 skips
> >> 19884 decicycles in 624 calls of av_lfg_get, 262136 runs, 8 skips
> >> 19877 decicycles in 624 calls of av_lfg_get, 524276 runs, 12 skips
> >> 25380 decicycles in 624 calls of av_rand64_get, 1 runs, 0 skips
> >> 28560 decicycles in 624 calls of av_rand64_get, 2 runs, 0 skips
> >> 30112 decicycles in 624 calls of av_rand64_get, 4 runs, 0 skips
> >> [...]
> >> 22627 decicycles in 624 calls of av_rand64_get, 65536 runs, 0 skips
> >> 22625 decicycles in 624 calls of av_rand64_get, 131072 runs, 0 skips
> >> 22625 decicycles in 624 calls of av_rand64_get, 262143 runs, 1 skips
> >> 22624 decicycles in 624 calls of av_rand64_get, 524286 runs, 2 skips
> >> Signed-off-by: Ganesh Ajjanagadde <gajjanag at gmail.com>
> >> ---
> >> libavutil/lfg.c | 33 ++++++++++++++++++++++++++++++++-
> >> libavutil/lfg.h | 26 ++++++++++++++++++++++++++
> >> 2 files changed, 58 insertions(+), 1 deletion(-)
> > why do you add this code to lfg.* (Lagged Fibonacci Generator)?
> > its not a lfg
> I just wanted to lay out the implementation, it was easier for me to
> test and compare that way. Also, don't really know if it should be
> public or private. I really felt that all random number stuff should
> go into e.g lavu/rand.h, right now it is quite scattered across
> random_seed.c, lfg.h (which is not great from a discoverability
> Anyway, if you are flexible about where it could go, I will resubmit
> as lavu/rand.h, lavu/rand.c containing this 64 bit gen. Is this fine
> with you?
> Some day or the other this can be cleaned up into e.g a single header,
> but that maybe best handled later.
> > also the LFG could be trivially extended/changed to 64bit if one wants
> > only needs uint64_t being used
> I can believe the extendability, but note that I do not know about the
> quality of such generators. The linked TestU01 paper did not seem too
> happy with the additive lagged Fibonacci generators, and instead
> favored multiplicative ones like av_mlfg_get.
> > also theres av_mlfg_get() which passes all tests, though slower of
> > course
> Are you sure it passes all of BigCrush? What test suite are you referring to?
it lists several multiplicative LFGs they all pass all tests
See "Table I. Results of test batteries applied to well-known RNGs"
unless i misunderstand
of course all generators of this class are expected to have defects
in the least significant bits if one searches hard enough.
I assumed that these tests dont search hard enough
also every PRNG fails a test otherwise it wouldnt be a PRNG but a
that said, i would be somewhat surprised if the more significant side
of the mlfg values fail many tests that arent designed specifically
> > and does xorshift128+ really pass all tests ?
> This was a claim on wikipedia that is I believe incorrect,
wikipedia ... sigh
> himself makes no such absolute claims. All he claims is (from
> /* This is the fastest generator passing BigCrush without
> systematic failures, but due to the relatively short period it is
> acceptable only for applications with a mild amount of parallelism;
> otherwise, use a xorshift1024* generator. */
> As such, he does not rule out all possible failures.
> In fact, his page http://xorshift.di.unimi.it/#quality suggest certain failures.
> I therefore did not mention explicit absolute claims in the message either.
I would have been very surprised if this basic generator didnt have
defects in stronger tests.
the LSBs should fail basic linearity tests
IIUC/IIRC marsaglia suggested to use the xorshift with a multipliction
as non linear step. If thats done id have little doubt in the generator
but i didnt read the xorshift papers so i might be half wrong on some
with just an addition as non linear step i have a odd feeling.
This is all no doubt purely academic and xorshift128+ is likely
more then good enough for us. It just feels a bit "weak" to have a
purely linear generator that is scrabled by a simple addition.
also iam thinking a bit here in relation to a LFG which too uses
just an addition and is weak too.
i think what iam trying to say is that xorshift128plus isnt faster
than the LFG and while its maybe better its not much better
about the performance comparission.
LFGs should be optimizeable in SIMD, is that worth it i dont know
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Breaking DRM is a little like attempting to break through a door even
though the window is wide open and the only thing in the house is a bunch
of things you dont want and which you would get tomorrow for free anyway
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 181 bytes
Desc: Digital signature
More information about the ffmpeg-devel