So far we have seen that, in general, a random number comes out of a dedicated PRNG function. To be perfectly honest, a good PRNG is like a refined gentleman: its sophistication is recognised by its efficiency when a great number of draws is requested.
In Python 3.5 (and 2.7.10) the algorithm responsible for delivering pseudo-random numbers is known as the Mersenne Twister (MT). It was developed in 1997 by two Japanese researchers, Makoto Matsumoto and Takuji Nishimura, and has become "the standard" worldwide.
Its basic version uses a 32-bit word length and is labeled MT19937. It was designed with the flaws of various past and existing PRNGs in mind. The mathematics behind the algorithm is beyond the scope of this book. However, as one might suspect, the algorithm must eventually repeat itself and, in the case of our Japanese hero, its period is very long, defined by the 24th Mersenne prime, i.e.
$M_{19937} = 2^{19937} - 1 = 431542479\ldots968041471$
We derived it in the previous Section.
The full story of the research on MT can be found at http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html, the official website of MT. For the sake of curiosity, below we scan through one of the available Python implementations of the Mersenne Twister algorithm. Although the code can be found on the Web (e.g. http://code.activestate.com/recipes/578056-mersenne-twister/), its functionality may not be perfect.
In this exercise we aim to inspect the complexity of the MT algorithm itself and to show an example implementation that uses some additional Python elements, for instance bitwise operators (omitted in this Volume) or global variables.
Code 2.33
Mersenne Twister in Python.
from datetime import datetime

def initialize_generator(seed):
    global MT
    # ...

bitmask_3 = (2 ** 31) - 1
# ...

now = datetime.now()
initialize_generator(now.microsecond)

# printing 10 random numbers
for i in range(10):
    rnd = extract_number()  # an integer!
    print(rnd)
A possible output could be:
3489063849
884313957
3591573376
4172670803
3535921056
2614888581
229773391
3718065951
1387448252
4277039060
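For readers who want a complete version that can be checked, here is a minimal sketch of the 32-bit MT19937. It is written from the published reference algorithm, not from Code 2.33. The class name MT19937, its method names, and the module-level constant names are illustrative choices and do not come from the listing above.

```python
N, M = 624, 397                 # state size and "middle" offset of MT19937
MATRIX_A = 0x9908B0DF           # twist constant
UPPER_MASK = 0x80000000         # most significant bit of a 32-bit word
LOWER_MASK = 0x7FFFFFFF         # least significant 31 bits


class MT19937:
    """Illustrative 32-bit Mersenne Twister (all names are hypothetical)."""

    def __init__(self, seed):
        # fill the state with a linear recurrence started from the seed
        self.state = [0] * N
        self.state[0] = seed & 0xFFFFFFFF
        for i in range(1, N):
            prev = self.state[i - 1]
            self.state[i] = (1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF
        self.index = N          # forces a twist before the first output

    def _twist(self):
        """Regenerate all 624 words of the state in place."""
        for i in range(N):
            y = (self.state[i] & UPPER_MASK) | (self.state[(i + 1) % N] & LOWER_MASK)
            word = self.state[(i + M) % N] ^ (y >> 1)
            if y & 1:
                word ^= MATRIX_A
            self.state[i] = word
        self.index = 0

    def extract_number(self):
        """Return the next tempered 32-bit unsigned integer."""
        if self.index >= N:
            self._twist()
        y = self.state[self.index]
        self.index += 1
        # tempering: bitwise shifts and masks that improve equidistribution
        y ^= y >> 11
        y ^= (y << 7) & 0x9D2C5680
        y ^= (y << 15) & 0xEFC60000
        y ^= y >> 18
        return y


gen = MT19937(5489)             # 5489 is the default seed of the reference C code
print([gen.extract_number() for _ in range(3)])
```

With the seed 5489 the first value printed should be 3499211612, the well-known first output of the reference implementation, which gives a quick sanity check. Note that random.seed() in Python initialises the state differently, so the built-in generator will not reproduce this stream from the same integer.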
To date, the Mersenne Twister has established a firm position among numerous PRNGs as a fast and reliable source of random numbers. It has passed many (though not all) tests for randomness.
Jones (2000) points to newer and more reliable algorithms characterised by even longer periods. Explore them.
Every new random number generator needs to pass a rich set of tests, such as the NIST Suite (http://csrc.nist.gov/groups/ST/toolkit/rng), composed of 15 statistical tests for randomness, in order to determine whether it is suitable for cryptographic use.
A Python version of the NIST Suite has recently been provided by my friend and fellow quant Stuart Reid (2015). Earlier the same year, I described a novel approach making use of the Walsh–Hadamard Transform and found strong evidence of randomness for the Mersenne Twister algorithm as implemented in Python 2.7.10 and 3.4 (http://www.quantatrisk.com/2015/04/07/walsh-hadamard-transform-python-tests-for-randomness-of-financial-return-series/).
Now, one of the easiest ways to obtain a float (e.g. with 12+ digits of precision and lying in the (0, 1) interval) out of the provided output is to modify the last four lines of Code 2.33 in the following way:
# printing 10 random numbers (float)
for i in range(10):
    rnd = '0.' + str(extract_number()) + str(extract_number())
    rnd = float(rnd)
    print(rnd)
leading to, e.g.:
0.38811813362753395
0.6571432443750916
0.3169495821382487
0.8952493993946118
0.9744576171808691
0.9519982367771294
0.2890105433198552
0.4355670321291
0.4654332166969249
0.7070961503032999
Quite a handy flexibility: strings, integers, and floats used all together. Nothing new. Just a few seconds of knowledge mixed with our imagination.
As an exercise, try to apply the K-S Test to the output obtained with Code 2.33. Can you confirm a uniform distribution for those random numbers? Write your own program that uses Code 2.33 as the input and Code 2.28 as the testing framework. What are your findings? You should be surprised. Tell me why. ☺
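One possible way to set up such a check is sketched below, using only the standard library. It is a stand-in and not Code 2.28 itself: the K-S statistic against the uniform distribution is computed by hand. The function names are hypothetical, and random.getrandbits(32) is assumed as a substitute for extract_number() of Code 2.33. For comparison, the sketch also builds floats in a more conventional way, by dividing the 32-bit integer by 2**32.

```python
import random


def ks_statistic(sample):
    """Kolmogorov-Smirnov distance between a sample and Uniform(0, 1)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # the empirical CDF jumps from (i-1)/n to i/n at x; the uniform CDF is x
        d = max(d, i / n - x, x - (i - 1) / n)
    return d


def extract_number(rng):
    # assumption: a stand-in for extract_number() of Code 2.33 (32-bit integer)
    return rng.getrandbits(32)


def float_by_concatenation(rng):
    # the string-based construction described in the text
    return float('0.' + str(extract_number(rng)) + str(extract_number(rng)))


def float_by_division(rng):
    # a conventional alternative: scale the integer into [0, 1)
    return extract_number(rng) / 2**32


rng = random.Random(12345)      # fixed seed, only to make the run repeatable
n = 10000
d_concat = ks_statistic([float_by_concatenation(rng) for _ in range(n)])
d_divide = ks_statistic([float_by_division(rng) for _ in range(n)])
critical = 1.36 / n ** 0.5      # approximate 5% critical value for large n

print("D (concatenation)  = %.4f" % d_concat)
print("D (division)       = %.4f" % d_divide)
print("5%% critical value  = %.4f" % critical)
```

A value of D above the critical value speaks against uniformity at the 5% level. Comparing the two constructions, and explaining the difference, is the heart of the exercise.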
Seed and Functions for Random Selection
When you plant the seed of an oak tree, it will eventually grow to an impressive size. The problem is that you cannot recreate exactly the same tree from the same seed. This limitation does not apply to PRNGs like the built-in Mersenne Twister. If exactly the same stream of random numbers has to be generated, one can employ the random.seed function.
In Code 2.33 the seed value is taken from the current time, read out with the help of the datetime module mentioned earlier.
We assumed the seed to be an integer corresponding to the microsecond:
>>> from datetime import datetime
>>> now = datetime.now()
>>> now
datetime.datetime(2015, 3, 21, 20, 39, 34, 610531)
>>> now.microsecond
610531
Such a method can be effective, but it is limited to 1000000 possible seeds, since the microsecond field runs from 0 to 999999. Analyse the following code:
Code 2.34
from datetime import datetime
import random as r

x = datetime.now()
x = x.microsecond
r.seed(x)
print("seed = %g" % x)
print("rv = %.10f" % r.random())

# shuffle the list (in place)
lst = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
r.shuffle(lst)
print(lst)

# random sample from the list
print(r.sample(lst, 3))
where one of possible outputs is:
seed = 319303
rv = 0.5248070368
['c', 'e', 'b', 'f', 'd', 'g', 'a']
['a', 'g', 'e']
First of all, we specified the seed based on the current microsecond value. If the function is called as seed(), or not called at all, the seed is derived by default from the current system time (or from a randomness source of the operating system, when one is available). Therefore, calling seed with an explicit value makes sense if we want to repeat all "random" calculations with the same seed.
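A minimal sketch of that repeatability follows. The seed value 2015 is an arbitrary choice made for illustration only.

```python
import random

random.seed(2015)               # an arbitrary, fixed seed
first = [random.random() for _ in range(3)]

random.seed(2015)               # re-seeding restarts the very same stream
second = [random.random() for _ in range(3)]

print(first == second)          # True

random.seed()                   # no argument: seed taken from the OS or the clock
third = [random.random() for _ in range(3)]
print(first == third)           # almost surely False
```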
Secondly, for the list lst we employed the shuffle function, which rearranges the items in random order. The function works in place and returns None. The sample(lst, 3) function, on the other hand, randomly picks three items from the list (without replacement) and leaves the list itself untouched. The latter can easily be used to build a simple LOTTO simulator:
import random
x = range(1, 50)
lucky6 = list(random.sample(x, 6))
lucky6.sort()
print(lucky6)
where we pick 6 lucky numbers from the numbers between 1 and 49, e.g.:
[10, 14, 21, 29, 41, 45]
The total number of combinations is 13983816. We can obtain the same kind of draw with the choice function, which returns a random element from a non-random sequence:
import random
x = range(1, 50)
newlucky6 = []
for i in range(1, 7):
    num = random.choice(x)
    while num in newlucky6:
        num = random.choice(x)
    else:
        newlucky6.append(num)
newlucky6.sort()
print(newlucky6)
returning, e.g.:
[7, 11, 12, 25, 26, 46]
Here, we ensure that the numbers picked by the choice function are not the same.
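The figure of 13983816 combinations quoted above for a 6-out-of-49 draw can be verified directly. The sketch assumes Python 3.8 or newer, where math.comb is available; the variable name is illustrative.

```python
import math

# number of ways to choose 6 numbers out of 49, i.e. 49! / (6! * 43!)
n_combinations = math.comb(49, 6)
print(n_combinations)           # 13983816

# chance that one ticket matches all six numbers
print(1 / n_combinations)
```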
Random Variables from Non-Random Distributions
The random module comes with some basic ready-to-use functions allowing us to draw random numbers associated with a specific distribution. Say you need to generate a sample of rvs from an underlying lognormal distribution described by two parameters. The lognormvariate function makes it possible:
Code 2.35
import random
from matplotlib import pyplot as plt

mu = 1.21
sigma = 0.43
r = [random.lognormvariate(mu, sigma) for i in range(100000)]

plt.figure(figsize=(8, 5))
plt.hist(r, bins=50)
plt.show()
where we plot the empirical histogram of the lognormal rvs for visual verification (see above).
In summary, the random module offers the following functions:
random. Functions
randrange(stop)
randrange(start, stop[, step])
randint(a, b)
random()
uniform(a, b)
triangular(low, high, mode)
betavariate(alpha, beta)
expovariate(lambd)
gammavariate(alpha, beta)
gauss(mu, sigma)
lognormvariate(mu, sigma)
paretovariate(alpha)
weibullvariate(alpha, beta)
See References for further exploration of the topic. You will learn more about random numbers in Section 3.4.
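Besides the visual check, the lognormal sample can be verified numerically: if r follows a lognormal distribution with parameters mu and sigma, then log(r) should be approximately normal with mean mu and standard deviation sigma. The sketch below assumes the same mu and sigma as in Code 2.35. The fixed seed and the variable names are illustrative only.

```python
import math
import random

random.seed(1)                  # fixed seed, only so the sketch is repeatable
mu, sigma = 1.21, 0.43
r = [random.lognormvariate(mu, sigma) for _ in range(100000)]

# take logarithms and estimate the mean and the sample standard deviation
logs = [math.log(v) for v in r]
m = sum(logs) / len(logs)
s = math.sqrt(sum((v - m) ** 2 for v in logs) / (len(logs) - 1))

print("mean of log(r)  = %.3f (mu    = %.2f)" % (m, mu))
print("stdev of log(r) = %.3f (sigma = %.2f)" % (s, sigma))
```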
2.4.7. urandom
Lastly, a short comment on a viable alternative to the Mersenne Twister implemented within the random module. The computer itself may be used as a source of randomness. Think for a moment: anything from your keystrokes to the vibration of the cooling fan may be considered a source of entropy. The operating system continuously gathers such noise in kernel space and uses it to generate random numbers. The generator keeps an estimate of the number of bits of noise in the entropy pool. In Linux and Mac OS X, its output is available through the special file /dev/urandom. There is abundant documentation across the Web arguing for /dev/urandom as an attractive source of pseudo-random numbers of cryptographic quality. I strongly encourage you to explore this field for your own curiosity. Trust me. It’s fascinating!
In Python 3.5, we can use /dev/urandom in the following way:
Code 2.36
from matplotlib import pyplot as plt
import array
import os

# Generates n random floats in the range [0, 1) using
# os.urandom() as source of randomness
def urandom_random(n):
    data = os.urandom(n * 8)
    arr = array.array("Q", data)  # n unsigned 64-bit integers
    # keep the top 53 bits so that the result stays strictly below 1.0
    return [(ulong >> 11) / 2**53 for ulong in arr.tolist()]

# Generates n random ints in the range [a, b] using os.urandom()
# as source of randomness
def urandom_randint(n, a, b):
    floats = urandom_random(n)
    return [a + int(r * (b - a + 1)) for r in floats]

r = urandom_random(1000000)
rint = urandom_randint(1000000, 5, 19)

plt.figure(figsize=(8, 5))
plt.hist(r, bins=50)  # or rint
plt.show()
which returns a lovely uniform distribution for floats (see above).
You will find more on /dev/urandom on the https://en.wikipedia.org/wiki//dev/random webpage.
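The standard library also offers a ready-made wrapper around the same source. The random.SystemRandom class exposes the familiar random() and randint() methods but draws its bits from os.urandom(). A minimal sketch, with illustrative variable names:

```python
import random

# SystemRandom draws from os.urandom(); it keeps no state and cannot be seeded
sysrand = random.SystemRandom()

floats = [sysrand.random() for _ in range(5)]        # floats in [0, 1)
ints = [sysrand.randint(5, 19) for _ in range(5)]    # integers in [5, 19]

print(floats)
print(ints)
```

Because the operating system supplies the bits, a sequence obtained this way cannot be reproduced by re-seeding, in contrast to the Mersenne Twister.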
References
Jones, D., 2000, Good Practice in (Pseudo) Random Number Generation for Bioinformatics Applications, src: http://www0.cs.ucl.ac.uk/staff/d.jones/GoodPracticeRNG.pdf
Matsumoto, M., Nishimura, T., 1998, Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator, ACM Transactions on Modeling and Computer Simulation, Vol. 8, No. 1, January, pp. 3-30
Reid, S., 2015, Random walks down Wall Street, Stochastic Processes in Python, src: http://www.turingfinance.com/random-walks-down-wall-street-stochastic-processes-in-python/
Further Reading
Janke, W., 2002, Pseudo Random Numbers: Generation and Quality Checks, Quantum Simulations of Complex Many-Body Systems: From Theory to Algorithms, Lecture Notes, J. Grotendorst, D. Marx, A. Muramatsu (Eds.), John von Neumann Institute for Computing, Julich, NIC Series, Vol. 10, ISBN 3-00-009057-6, pp. 447-458, src: https://www.physik.uni-leipzig.de/~janke/Paper/nic10_447_2002.pdf
Hardy, S., 2004, Pseudorandom Number Generation, Entropy Harvesting, and Provable Security in Linux, src: http://www.blackhat.com/presentations/bh-europe-04/bh-eu-04-hardy/bh-eu-04-hardy.pdf
Malone, M., 2015, TIFU by using Math.random(), src: https://medium.com/@betable/tifu-by-using-math-random-f1c308c4fd9d#.mt0nz380p
Rock, A., 2005, Pseudorandom Number Generators for Cryptographic Applications, src: https://www.rocq.inria.fr/secret/Andrea.Roeck/pdfs/dipl.pdf