Institute for Theoretical Physics, ETH Zürich
Winter 2003–2004

Diploma Thesis

A Random Number Generator Test Suite for the C++ Standard

Mario Rütti*
March 10, 2004

Supervisor: Prof. M. Troyer†

* maruetti@comp-phys.org
† troyer@phys.ethz.ch

I am grateful to my diploma professor Prof. Matthias Troyer for giving me the opportunity to write this instructive and inspiring diploma thesis, to say nothing of the time he spent helping me to resolve my (and my computer's) problems and his effort to find new and unconventional solutions.

My special thanks also go to my office co-worker Manuel Gil for the motivating and amusing discussions about our work and his pleasant companionship.

I am grateful to Frank Moser, who acted as editor and assisted me in correcting and polishing my English sentences.

I want to apologize to Ariana for lackluster evenings with a friend lost in thought. Thank you for your support and understanding during this time.

Finally, I am grateful to my parents for the tremendous support they gave me during my years of studies, which enabled me to achieve my goals.

to my parents Urs and Heidi

Abstract

The heart of every Monte Carlo simulation is a source of high quality random numbers, and the generator has to be picked carefully. Since the “Ferrenberg affair” it is known to a broad community that statistical tests alone do not suffice to determine the quality of a generator, but that application-based tests are needed as well. With the inclusion of an extensible random number library and the definition of a generic interface in the revised C++ standard, it will be important to have access to an extensive C++ random number test suite. Most currently available test suites are limited to a subset of tests, are written in Fortran or C, and cannot easily be used with the C++ random number generator library. In this paper we present a generic random number test suite written in C++.
The framework is based on the Boost reference implementation of the forthcoming C++ standard random number generator library. The Boost implementation so far contains most modern random number generators. Employing generic programming techniques, the test suite is flexible, easily extensible and can be used with any random number generator library, including those written in C and Fortran. Test results are produced in an XML format, which through the use of XSLT transformations allows extraction of summaries or detailed reports, and conversion to HTML, PDF, PostScript or any other format. At this time, the test suite contains a wide range of different tests, including the standard tests described by Knuth, Vattulainen's physical tests, parts of Marsaglia's Diehard test suite, and a number of newer tests.

Contents

1. Introduction
2. What are random numbers?
    2.1. Types of random numbers
3. Analyzing Statistics
    3.1. χ² test (“Chi-square” test)
    3.2. Kolmogorov-Smirnov test (KS test)
    3.3. Gaussian Test
4. Using the “Random Number Generator Test Suite”
    4.1. How to run a test
    4.2. The rng_test_suite environment
        4.2.1. Template Parameter
        4.2.2. Confidence Level
        4.2.3. Seeds
        4.2.4. Random Number Generators
        4.2.5. Tests
        4.2.6. Running...
    4.3. Testing Parallel Random Number Generators
    4.4. Iterating a test
    4.5. Count failings
    4.6. Bit Tests
    4.7. Bit extract test
    4.8. The XML output
5. Tests for Studying Random Data
    5.1. Equidistribution test
    5.2. Run test
        5.2.1. Runs up and down
        5.2.2. Runs above and below mean
        5.2.3. Length of runs
    5.3. Gap test
    5.4. Poker test
    5.5. Coupon-collectors test
    5.6. Permutation test
    5.7. Maximum of t test
    5.8. Birthday Spacings test
    5.9. Collision test (Hash test)
    5.10. Serial correlation
    5.11. Serial test
    5.12. Blocking test
    5.13. Repeating Time Test
    5.14. gcd test (greatest common divisor)
    5.15. Gorilla test
    5.16. Ising-model test
    5.17. Random-walk test
    5.18. n-block test
    5.19. Random Walker on a line (Sn test)
    5.20. 2D Intersection test
    5.21. 2D Height Correlation test
    5.22. Sum of independent distributions test
    5.23. Fourier transform test
    5.24. Universal statistical test
    5.25. The Diehard Test Suite
        5.25.1. Birthday Spacings test
        5.25.2. The overlapping 5-permutation test
        5.25.3. Ranks of binary matrices
        5.25.4. The bitstream test
        5.25.5. The OPSO, OQSO and DNA tests
        5.25.6. The count-the-1's test
        5.25.7. The parking lot test
        5.25.8. The overlapping sums test
        5.25.9. Squeeze test
        5.25.10. The Minimum Distance test
        5.25.11. Random Sphere test
        5.25.12. The runs test
        5.25.13. Craps test
6. Extending the Random Number Generator Test Suite
    6.1. How to implement a test
        6.1.1. Implementing a χ², Kolmogorov-Smirnov or a Gaussian test
        6.1.2. χ² test
        6.1.3. Kolmogorov-Smirnov test
        6.1.4. Gaussian test
    6.2. The multiple_test wrapper
    6.3. Useful sequence diagrams
    6.4. Demands on Random Number Generators
    6.5. Foreign Random Number Generators
    6.6. The XML Schema
A. Collection of Test Parameters
B. Examples
C. Compiling the Test Suite

1. Introduction

How random is random? In this diploma thesis a generic random number test suite (RNGTS) is developed. The test suite framework is written in C++ with attention to modern generic programming paradigms. It is based on the Boost reference implementation of the forthcoming C++ standard random number generator library.

The aim of RNGTS is to assist in finding a suitable random number generator for a specific purpose and in deciding between good and bad random number generators. Through a generic interface the RNGTS makes a variety of different tests available and provides the possibility of extending the suite with user-defined tests. The test results are produced in XML format, which allows the transformation into summaries or detailed reports through the use of XSLT style sheets.
The main purpose is to support the user in choosing a random number generator and in judging how random the numbers it produces are.

In the second part of this paper there is a short discussion of the different types of random numbers and their applications. Then, in the third part, the statistical methods involved and their programming interfaces are presented. The fourth part covers the handling of RNGTS. This part is a “must” for the user who wants to perform any tests. It also describes the core of the whole test suite. In the fifth part there is a presentation of the most popular random number generator tests, their parameters and programming interfaces. These tests are collected from different sources and authors. The sixth part is for “advanced” users who want to extend RNGTS and add new tests or other extensions. Finally, the appendix contains a collection of lists with test parameters and other useful material.

Download

The RNGTS framework is located on the www.comp-phys.org web server and may be downloaded there. There are also some installation hints, some examples and the full documentation with additional interface descriptions and the XSL schema.

2. What are random numbers?

Random numbers are characterised by the fact that their value cannot be predicted. In other words, if one constructs a sequence of random numbers, the probability distribution of the following random numbers has to be completely independent of all the other generated numbers. A more sophisticated mathematical definition and discussion can be found in [6].

2.1. Types of random numbers

There are three types of random numbers: quasi-, pseudo- and true random numbers. These different types of random numbers have different applications. (It is a philosophical question what we can call random or not, but here we use the following descriptions; it is simpler...)

True Random Number

The most often used example for “truly” random numbers is the decay of a radioactive material. If a Geiger counter is put in front of such a radioactive source, the intervals between the decay events are truly random. True random numbers are gained from physical processes like radioactive decay or rolling a die. But rolling a die is problematic: perhaps someone could control the die well enough to determine the outcome.

Pseudo Random Number

These numbers are generated by a computer, that is to say by an algorithm, and are therefore not truly random. Every new number is generated from the previous ones by an algorithm. This means that the new value is fully determined by the previous ones. But, depending on the algorithm, they often have properties making them very suitable for simulations.

Quasi Random Number

A good description, quoted from [25], Chapter 7.7:

    Sequences of n-tuples that fill n-space more uniformly than uncorrelated random points are called quasi-random sequences. That term is somewhat of a misnomer, since there is nothing random about quasi-random sequences: They are cleverly crafted to be, in fact, sub-random. The sample points in a quasi-random sequence are, in a precise sense, maximally avoiding each other.

Quasi random numbers are not designed to appear random, but rather to be uniformly distributed. One aim of such numbers is to reduce and control errors in Monte Carlo simulations.

A picture is always a good way to illustrate the difference between these two types. In figures 2.1¹ and 2.2² we have plots with different numbers of pseudo- and quasi-random numbers. This is a good demonstration of the structure of quasi-random numbers, but it is also possible to see that quasi-random numbers fill the whole plane evenly, while pseudo-random numbers may build clusters and holes.

If we are talking about random numbers in the following parts, we mean pseudo random numbers.

¹ This plot was generated with the Matlab 6 rand generator, a combination of a lagged Fibonacci generator with a cache of 32 floating point numbers and a shift register random integer generator.
² This plot was generated with the sobol.m routine for Matlab from http://www.csit.fsu.edu/~burkardt/m_src/sobol/sobol.html. This web site also includes a variety of references for Sobol sequences and some implementations in different programming languages.

[Figure 2.1: Pseudo Random Numbers. Four panels showing 100, 250, 500 and 1000 points in the unit square.]

[Figure 2.2: Quasi Random Numbers. Four panels showing 100, 250, 500 and 1000 points in the unit square.]

3. Analyzing Statistics

In this section we describe the χ² test and the Kolmogorov-Smirnov test. Both are designed to check whether the measured distribution is similar to the expected distribution, so they allow us to compare different distributions. Later on we describe the Gaussian test, which is based on the Gaussian normal distribution. A detailed description of the outlined C++ classes can be found in the section about implementing additional tests (section 6.1).

3.1. χ² test (“Chi-square” test¹)

¹ Sometimes the “χ² test” stands for the Equidistribution test.

The χ² test is perhaps the best known statistical test. It is based on a comparison between the empirical distribution function and the theoretically expected distribution. The empirical distribution is based on the results of the random process.

The $n$ measured random values must be divided into $k$ classes $I_1, I_2, \ldots, I_k$. The classes contain $n_1 + n_2 + \cdots + n_k = n$ values.

For each class, the expected number of values $n p_i$ must be calculated from the expected distribution function, where $p_i$ is the probability that a value falls into class $I_i$.

Considering the squares of the differences between the measured values and the expected values gives the $\chi^2$ value

\[ \chi^2 = \sum_{i=1}^{k} \frac{(n_i - n p_i)^2}{n p_i} = \frac{1}{n} \sum_{i=1}^{k} \frac{n_i^2}{p_i} - n \tag{3.1} \]

With $k$ classes, there are $\nu = k - 1$ degrees of freedom in the $\chi^2$ distribution. By looking up $\chi^2$ and $\nu$ in “χ² distribution” tables, which can be found in [16], [3], the probability of being above or below the given $\chi^2$ can be found. Calculating the probability of a $\chi^2$ value is not such an easy task, but there is an algorithm published by Hill and Pike which can be used, see [11], [12], [14].

Example: Throwing a die

After throwing a die 120 times we get the following results:

    value        1   2   3   4   5   6
    # observed  15  19  22  21  17  26

There is no reason to change the $k = 6$ natural classes $I_1, I_2, \ldots, I_6$. The number of values is $n = 120$. For a true die we expect a probability of $p_i = \frac{1}{6}$ for each face. The expected number of values in each class is $n p_i = 20$. The $\chi^2$ value is calculated by the following sum:

\[ \chi^2 = \sum_{i=1}^{k} \frac{(n_i - n p_i)^2}{n p_i} = \frac{(15-20)^2}{20} + \frac{(19-20)^2}{20} + \frac{(22-20)^2}{20} + \frac{(21-20)^2}{20} + \frac{(17-20)^2}{20} + \frac{(26-20)^2}{20} = \frac{76}{20} = 3.80 \]

Here we have $k = 6$ classes. This means that the number of degrees of freedom is $\nu = 5$. Looking up $\chi^2 = 3.80$ for $\nu = 5$ in a table, the probability of a larger value lies between 50% and 75%. This means that we will have $\chi^2 \le 3.80$ between 25% and 50% of the time. The randomness observed in this experiment is satisfactory in this test.

Available code

To handle the χ² statistics there is the chisquare_test class, which provides different methods used for the calculation. Some important methods are listed in the declaration below. The class is defined in the chisquare_test.h file.
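Independently of that class, the arithmetic of the die example can be checked with a short stand-alone sketch of equation (3.1). The function names chi_square_value and die_example are hypothetical and not part of the test suite; the code only assumes that observed counts and class probabilities are given as two vectors of equal length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the test suite): computes the chi-square
// value of equation (3.1) from observed class counts and class probabilities.
double chi_square_value(const std::vector<double>& observed,
                        const std::vector<double>& prob)
{
    assert(observed.size() == prob.size() && !observed.empty());

    // total number of observations n
    double n = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i)
        n += observed[i];

    // sum over all classes of (n_i - n p_i)^2 / (n p_i)
    double chi2 = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        const double expected = n * prob[i];
        const double diff = observed[i] - expected;
        chi2 += diff * diff / expected;
    }
    return chi2;
}

// The die example: 120 throws, six equally likely classes.
double die_example()
{
    const double counts[] = {15, 19, 22, 21, 17, 26};
    const std::vector<double> observed(counts, counts + 6);
    const std::vector<double> prob(6, 1.0 / 6.0);
    return chi_square_value(observed, prob);   // 76 / 20 = 3.80 up to rounding
}
```

Converting the value into a probability is a separate step and is left to the routines of the test suite. The excerpt of the chisquare_test declaration from chisquare_test.h follows.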
    // excerpt: the most important methods of chisquare_test
    class chisquare_test
    {
    public:
        void prepare_statistics(std::size_t count_size, uint64_t runs,
                                std::size_t degOfFreedom = 0);

        template<class ForwardIterator>
        void calculate_chisquare_value(ForwardIterator first, ForwardIterator last,
                                       std::size_t degOfFreedom);

        template<class ForwardIterator>
        void calculate_chisquare_value(ForwardIterator first, ForwardIterator last);

        void set_chisquare_value(double chiSquareValue, std::size_t degOfFreedom);

        chiSqr_stat_type get_chisquare_value();
        double get_chisquare_prob();
    };

In the same file there is also a free function to calculate the χ² value without using the class.

    template<class ForwardIterator, class UnaryFunction>
    double calc_chisquare_value(ForwardIterator first, ForwardIterator last,
                                UnaryFunction probability, std::size_t degOfFreedom);

To calculate the probability from a χ² value, the file chisqr_prob.h provides the following function.

    double chi_probability(double chisqr, int dof);

3.2. Kolmogorov-Smirnov test (KS test)

As we have seen, the χ² test can be applied when observations fall into a finite number of categories. But normally one will consider random quantities which may assume an infinite number of values. In this test, the empirical distribution function $F_n(x)$ of the random number generator is compared to the expected distribution function $F(x)$. In [16], Knuth defines these functions as follows:

\[ F(x) = \text{probability that } (X \le x) \]

\[ F_n(x) = \frac{\text{number of } X_1, X_2, \ldots, X_n \text{ which are} \le x}{n} \]

The $n$ measured random values must be sorted in ascending order, $X_1 \le X_2 \le \cdots \le X_n$. To make the test, we form the following statistics:

\[ K_n^+ = \sqrt{n} \max_{-\infty < x < \infty} \bigl( F_n(x) - F(x) \bigr) = \sqrt{n} \max_{1 \le i \le n} \left( \frac{i}{n} - F(X_i) \right) \]

\[ K_n^- = \sqrt{n} \max_{-\infty < x < \infty} \bigl( F(x) - F_n(x) \bigr) = \sqrt{n} \max_{1 \le i \le n} \left( F(X_i) - \frac{i-1}{n} \right) \]

As in the χ² test, we may now look up the values $K_n^+$, $K_n^-$ in a table [16] to determine if they are significantly high or low.
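The two statistics above can be computed directly from a sorted sample. The following is a minimal sketch, not the implementation used in the test suite: the names ks_statistics, uniform_cdf and ten_number_example are hypothetical, and the expected distribution is assumed to be passed in as a callable cumulative distribution function. The ten sample values are those of the worked example in this section.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns the pair (K_n^+, K_n^-) for a sample x and an
// expected cumulative distribution function F.
template <class UnaryFunction>
std::pair<double, double> ks_statistics(std::vector<double> x, UnaryFunction F)
{
    assert(!x.empty());
    std::sort(x.begin(), x.end());                 // X_1 <= X_2 <= ... <= X_n
    const double n = static_cast<double>(x.size());

    // Both maxima are never negative (take i = n for K^+ and i = 1 for K^-),
    // so starting from zero is safe.
    double kPlus = 0.0;
    double kMinus = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double Fx = F(x[i]);
        kPlus  = std::max(kPlus,  (i + 1) / n - Fx);   // i/n - F(X_i), 1-based i
        kMinus = std::max(kMinus, Fx - i / n);         // F(X_i) - (i-1)/n
    }
    const double root = std::sqrt(n);
    return std::make_pair(root * kPlus, root * kMinus);
}

// Cumulative distribution function of the uniform distribution on [0, 1].
inline double uniform_cdf(double x)
{
    return x < 0.0 ? 0.0 : (x > 1.0 ? 1.0 : x);
}

// Ten sample values tested against the uniform distribution.
inline std::pair<double, double> ten_number_example()
{
    const double values[] = {0.809, 0.465, 0.151, 0.628, 0.318,
                             0.824, 0.394, 0.968, 0.179, 0.458};
    return ks_statistics(std::vector<double>(values, values + 10), uniform_cdf);
}
```

Passing the distribution function as a parameter keeps the sketch usable for any expected distribution, in the same spirit as the UnaryFunction parameter of the suite's own interface.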
Another way is to calculate the probabilities by the algorithm given in [1] and in chapter 3.3.1, “C. History, bibliography, and theory” of [16]. In [16] there is also a formula to calculate the probability exactly:

\[ \mathrm{prob}\left( K_n^+ \le \frac{t}{\sqrt{n}} \right) = \frac{t}{n^n} \sum_{0 \le k \le t} \binom{n}{k} (k - t)^k (t + n - k)^{n-k-1} \tag{3.2} \]

Example: 10 random numbers

We got 10 numbers from a random number generator. These are

    {0.809, 0.465, 0.151, 0.628, 0.318, 0.824, 0.394, 0.968, 0.179, 0.458}

First we sort the random numbers $X_i$ in ascending order. Then we calculate the quantities $K_i^+ = \frac{i}{n} - F(X_i)$ and $K_i^- = F(X_i) - \frac{i-1}{n}$ (with $F(x) = x$ for uniformly distributed numbers) and find the maximum of these quantities.

     i    X_i     K_i^+    K_i^-
     1   0.151   -0.051    0.151
     2   0.179    0.021    0.079
     3   0.318   -0.018    0.118
     4   0.394    0.006    0.094
     5   0.458    0.042    0.058
     6   0.465    0.135   -0.035
     7   0.628    0.072    0.028
     8   0.809   -0.009    0.109
     9   0.824    0.076    0.024
    10   0.968    0.032    0.068

With these values we calculate $K_{10}^+$ and $K_{10}^-$ as follows:

\[ K_{10}^+ = \sqrt{n} \max_{1 \le i \le n} K_i^+ = \sqrt{10} \cdot 0.135 = 0.427 \]

\[ K_{10}^- = \sqrt{n} \max_{1 \le i \le n} K_i^- = \sqrt{10} \cdot 0.151 = 0.478 \]

If we look up these values in an appropriate table for $n = 10$, we find that the chance to get a $K_{10}$ greater than 0.427 or 0.478 lies between 50% and 75%.

Available code

To calculate the Kolmogorov-Smirnov statistics there is a class which provides the required routines. The definition of this class, called ks_test, is found in ks_test.h. Some important methods are listed below.

    // excerpt: the most important methods of ks_test
    class ks_test
    {
    public:
        void prepare_statistics(uint64_t runs);

        template<class ForwardIterator>
        void calculate_ks_value(ForwardIterator first, ForwardIterator last);

        template<class ForwardIterator, class UnaryFunction>
        void ks_value(ForwardIterator first, ForwardIterator last,
                      UnaryFunction integratedProbDistr);

        ks_stat_type get_ks_value();
        ks_prob_type get_ks_prob();
    };

There is also a function to calculate the KS values.
    template<class ForwardIterator, class UnaryFunction>
    std::pair<double, double> calc_ks_value(ForwardIterator first, ForwardIterator last,
                                            UnaryFunction integratedProbDistr);

To calculate the probability for a KS value, the following function is defined in the file ks_prob.h.

    boost::tuple<double, double> ks_probability(int n, std::pair<double, double> ksPair);

[Figure 3.1: Gaussian distribution]

[Figure 3.2: Percentage function]

3.3. Gaussian Test

The Gaussian test is a little different from the χ² or the Kolmogorov-Smirnov test. In those two tests the expected distribution function is compared with the measured distribution function, and some indicators are calculated from the difference. In the Gaussian test a physical view is used. When a measurement is made, it is known that even with the best tools the result depends on a number of irregular and uncontrolled parameters. These measurement errors are random and a combination of different single errors.

The central limit theorem states that the measured value behaves like a normally distributed random variable (this is valid in the normal case). The normalized density function is written as

\[ f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2}, \qquad -\infty < x < \infty \tag{3.3} \]

where $\mu$ is the mean or expected value and $\sigma$ the standard deviation.

To classify measured values one can compare the deviation from the expected value with the standard deviation. It can be calculated that 68.3 % of all measured values are expected in the interval $[\mu - \sigma, \mu + \sigma]$. If the interval is expanded to $[\mu - 3\sigma, \mu + 3\sigma]$, we expect 99.7 % of all measured values in this range.

Based on this theory it is possible to give a probability for a measured value. The assumption is that the expected value and the deviation are known.
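The two percentages quoted above follow from the error function: for a normally distributed value, $P(|X - \mu| \le k\sigma) = \mathrm{erf}(k / \sqrt{2})$. A minimal sketch, assuming a C++11 compiler (std::erf from <cmath> is newer than this thesis); the function name prob_within_sigmas is hypothetical and not part of the test suite.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: probability that a normally distributed value lies
// within k standard deviations of its mean, P(|X - mu| <= k * sigma).
double prob_within_sigmas(double k)
{
    assert(k >= 0.0);
    return std::erf(k / std::sqrt(2.0));
}
```

With k = 1 the function returns about 0.683 and with k = 3 about 0.997, matching the 68.3 % and 99.7 % intervals.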
The deviation factor is calculated with the following formula:

\[ \mathrm{perc}(x) = \frac{1}{2} + \frac{1}{2} \, \mathrm{erf}\!\left( \frac{x}{\sqrt{2}} \right) \tag{3.4} \]

where $x$ is the deviation from the mean in units of $\sigma$ and erf denotes the “error function” $\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2} \, dt$. In this formula the mean value (expected value) corresponds to 50 %. If the deviation is positive, a percentage value bigger than 50 % results; if the deviation is negative, a percentage value smaller than 50 % results. The function is shown in figure 3.2.

Example: Ising model test statistic

We run the Ising model test described later on and check the result. From the simulation we get a specific energy of 1.45183, whereas a value of 1.45306 is expected. The standard deviation is calculated as 0.0037. This results in a deviation from the mean of -0.3324σ. Converted with equation (3.4), this gives 36.98 %. That means that only 36.98 % of the measured values are expected to be smaller than this value.

Available code

To support the calculation of the Gaussian statistics there is a class called gaussian_test in the file gaussian_test.h. The declarations of the most important methods are listed below.

    // excerpt: the most important methods of gaussian_test
    class gaussian_test
    {
    public:
        void prepare_statistics(double deviation, double stat_value, double mean);
        void calc_gaussian_value();
        double get_gaussian_prob();
    };

4. Using the “Random Number Generator Test Suite”

This section describes how to use the “Random Number Generator Test Suite” (RNGTS) with the available wrappers and helpers. The aim was to supply a simple but sufficiently powerful interface for building a flexible system that tests different types of random number generators with different tests, and to allow various kinds of result representation by using a universal XML output format.

4.1. How to run a test

Testing a random number generator is simple; the only requirement for the generator is that it fulfils the Boost Pseudo-random number engine requirements.
These can be found in http://www.boost.org/libs/random/wg21-proposal.html, written by Jens Maurer. The listing below shows an exemplary test program.

    // include Boost's random number generators
    #include <boost/random.hpp>

    // standard headers used below
    #include <exception>
    #include <fstream>
    #include <iostream>

    // definition to show progress during the test
    #define PRINT_STATUS

    // include the test suite environment
    #include "rng_test_suite.h"

    // include the headers of all tests used
    #include "poker_test.h"
    #include "ising_model_test.h"

    int main()
    {
        // import random number generators from Boost
        using boost::lagged_fibonacci44497;
        using boost::mt19937;

        // create a 'TestSuite' using uint32_t seeds
        rng_test_suite<> testSuite;

        // add desired confidence levels
        testSuite.add_confidence_level(0.05);
        testSuite.add_confidence_level(0.95);
        testSuite.add_confidence_level(0.1);
        testSuite.add_confidence_level(0.9);

        // add desired seeds
        testSuite.add_seed(314159265);
        testSuite.add_seed(236598);
        testSuite.add_seed(1237);

        // register the random number generators to test
        testSuite.register_rng<lagged_fibonacci44497>("Lagged Fibonacci 44497");
        testSuite.register_rng<mt19937>("mt19937, Mersenne Twister", 10000);

        // create the test objects
        poker_test pokerTest(100000, 5);
        ising_model_test isingTest(1000000, 16);

        // register the tests
        testSuite.register_test<ising_model_test>(isingTest);
        testSuite.register_test<poker_test>(pokerTest);

        // run tests...
        // specify destination for writing the XML output, write output into a file
        std::ofstream file_out("test_output.xml");

        // run all tests and catch possible exceptions
        try {
            // catch possible logic_error exceptions
            testSuite.run_test(file_out, true);
        }
        catch (std::exception& e) {
            std::cout << "exception occurred, program terminated : " << e.what() << std::endl;
        }
        file_out.close();

        return 0;
    }

4.2. The rng_test_suite environment

In the following sections we describe the use and the possibilities of the rng_test_suite environment.

4.2.1. Template Parameter

The template parameter used to construct the class specifies the type of the seed values. As noted in the “wg21-proposal” for Boost random number generators, the seed values have to be of an unsigned integral type. The default is uint32_t.

    template <typename seedType=uint32_t>
    class rng_test_suite
    {
        ...
    }

4.2.2. Confidence Level

To add a confidence level that specifies the limits of the statistical calculations, the following method has to be used. The confidence level has to lie in the interval [0, 1]. If nothing is added, the test suite uses 5 % and 95 % as standard.

    void add_confidence_level(double cl);

4.2.3. Seeds

Adding seeds is not such an easy task, because the Pseudo-random number engine requirements only specify iterator-based seeding, nothing else. But most generators also support a seed(seedType) method, so it is possible to add multiple seeds to use with the generators. If a generator does not support the seed(seedType) method, the test suite uses a pseudo-DES algorithm (see [25], sec. 7.5) to create a set of numbers and feeds these numbers into the generator with the mandatory iterator-based seed method.

    void add_seed(uint32_t seed)

The other way to seed the generator is to fill its buffer with values. To do this there is the seed(iterator, iterator) method. This method must be supported by all generators from Boost. The user has to check himself whether there are enough values between the two iterators to fill the buffer. If there are insufficient values, an exception might be thrown.

    template <typename seedIter>
    void add_seed_iterators(const seedIter begin, const seedIter end)

    <seedIter>  type of iterator
    begin       iterator to the begin of the buffer with seeds
    end         iterator to the end of the buffer with seeds

Starting the test

Before a random number generator is seeded, it is reset to its initial state, that is, to the state it was in when it was added to the test suite.
This guarantees repeatability for different seeds. If no seeds are added, the tests run with the initial state of the generator. If a generator has to be tested in a special state, e.g. with a specially seeded buffer, the method register_seeded_rng handles this case.

4.2.4. Random Number Generators

To register the random number generators which have to be tested, the test suite provides the following two methods. (The requirements on a random number generator are described in section 6.4.)

    template <class T>
    void register_rng(std::string rng_name, uint64_t warmup = 0)

    <class T>   type of the random number generator to test
    rng_name    name of the generator, should be unique
    warmup      number of random numbers to produce with the generator before starting the test

This method takes the type of the random number generator as a template parameter. The concrete generator object is created inside the test suite with the default constructor. The seed calls are done in this initial state.

    template <class T>
    void register_seeded_rng(T mrng, std::string rng_name, std::string description,
                             uint64_t warmup = 0)

    <class T>    type of the random number generator to test
    mrng         object of the random number generator of type T
    rng_name     name of the generator, should be unique
    description  a description of the seed state
    warmup       number of random numbers to produce with the generator before starting the test

This method takes an object of a random number generator as a parameter, so it is possible to use pre-seeded generators. For most generators, all further operations are done on this state of the generator. This is valid if and only if the generator class does not have external links, e.g. function pointers. If a foreign random number generator (see 6.5) is used, the generator will not be seeded before the test; it remains in its previous state.

4.2.5. Tests

Adding a test is really simple: just create an object of the test class and add it to the test suite. This is done with the following method.

    template <class T>
    void register_test(T test)

    <class T>   type of the test
    test        test to add to the collection of tests to perform

It is important to note that the test must be in a 'ready to run' state when it is added to the test suite, because the test suite calls only the run method and nothing before it.

4.2.6. Running...

Once all desired generators, seeds and tests are added to the test suite, the tests can be run by calling the run_test method. One has to specify where to write the XML output to. Writing to the terminal is as simple as using a file as the target. The second argument specifies whether logic errors thrown by a test are caught or not. As an example, an exception may be thrown if one tries to make a binary rank test for matrices bigger than the number of bits of the random number generator. If the exception is not caught, the test suite stops and does not finish the other tests. If the exception is caught, the test is omitted and the test suite continues its work.

    void run_test(std::ostream& out, bool catch_logic_errors = true)

    out                 ostream to write the XML output to
    catch_logic_errors  specifies whether logic_errors thrown by tests are caught or not

The call to run_test should be placed in a try-block, since some parts of the suite may throw exceptions. The seeds are tested in the following order:

    1. user-seeded generators
    2. generators seeded with seed(s)
    3. generators seeded with seed(it, it)

4.3. Testing Parallel Random Number Generators

To test a parallel application using different random number generators in different threads, there is a class called parallel_rng_imitator (from parallel_rng_imitator.h) which simulates such an application. The class contains a collection of definable generators and calls them one after another.
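The idea of calling a collection of generators one after another can be illustrated with a small stand-alone sketch. This is not the parallel_rng_imitator class: the name round_robin_rng is hypothetical, the sketch assumes a C++11 compiler with the standard <random> engines, and it restricts itself to engines of a single type so that result_type, min() and max() agree by construction.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch of the round-robin idea; this is not the
// parallel_rng_imitator class of the test suite.
template <class Engine>
class round_robin_rng
{
public:
    typedef typename Engine::result_type result_type;

    explicit round_robin_rng(const std::vector<Engine>& engines)
        : engines_(engines), next_(0)
    {
        assert(!engines_.empty());
    }

    // all engines share one type, hence one result_type and one value range
    static constexpr result_type min() { return Engine::min(); }
    static constexpr result_type max() { return Engine::max(); }

    // return a number from the current engine, then move on to the next one
    result_type operator()()
    {
        const result_type value = engines_[next_]();
        next_ = (next_ + 1) % engines_.size();
        return value;
    }

private:
    std::vector<Engine> engines_;   // pre-seeded copies of the engines
    std::size_t next_;              // index of the engine to call next
};
```

A round_robin_rng<std::mt19937> built from two differently seeded engines returns their outputs interleaved. The parallel_rng_imitator of the test suite accepts generators of different types instead, which is why the preconditions on result type and value range described next have to be checked.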
This generator fulfils the Boost specification and can be used in the normal way.

There are some preconditions to keep in mind when using such a random number generator.

- All random number generators used in this parallel random number generator must have the same result_type. Unfortunately the boost::uniform_01 type does not support a default constructor, so it is not possible to map the result type to another type. To do this, a converter which fulfils the specified interface for generators has to be written.
- All random number generators must have the same maximum and minimum value.
- Pre-seeded random number generators should be favoured, because they give better control over the seeding of the individual generators.

    // include Boost's RNGs
    #include <boost/random.hpp>
    // boost::tuple is used to list the generators
    #include <boost/tuple/tuple.hpp>
    // include parallel generator
    #include "parallel_rng_imitator.h"

    // import RNGs from Boost
    using boost::minstd_rand0;
    using boost::minstd_rand;
    using boost::lagged_fibonacci19937;
    using boost::lagged_fibonacci23209;
    using boost::lagged_fibonacci44497;
    using boost::mt19937;
    using boost::ecuyer1988;

    // make a RNG from two different lagged Fibonacci RNGs
    parallel_rng_imitator<
        boost::tuple< lagged_fibonacci23209, lagged_fibonacci44497 >
    > parallelRNG;

    // does not compile, because the listed generators do not
    // have the same result_type
    parallel_rng_imitator<
        boost::tuple< minstd_rand0,           // result_type int32_t
                      lagged_fibonacci23209,  // result_type double
                      mt19937 >               // result_type uint32_t
    > parallelRNG_error_compile;

    // compiles, but throws an exception because the RNGs
    // do not have the same min() or max() value
    parallel_rng_imitator<
        boost::tuple< ecuyer1988,     // max = 2147483561
                      minstd_rand >   // max = 2147483646
    > parallelRNG_error_runtime;

4.4. Iterating a test

The idiom says that “Once doesn’t count”. So we have to repeat a test multiple times and compute a statistic over all results. (Probably we would also like to repeat this repetition...)

This class iterates a given test n times and calculates a Kolmogorov-Smirnov statistic over all results. This is only possible if the test to iterate is derived from the chisquare_test, ks_test or gaussian_test base class. Iterating a χ² or a Gaussian test gives a normal K-S statistic. If we do this for a K-S test itself, we get four values: $K^+$ and $K^-$ for the original $K^+$, and the same for the original $K^-$. The iterate_test class fulfils the test interface and acts like a normal test.

    template< class Test >
    iterate_test(Test test, std::size_t iterations)

    <class Test>   type of the test
    test           test to iterate
    iterations     number of times to iterate the test

4.5. Count failings

Another way to decide about success or failure is to count the failings of each test and compare them with a maximal number of failures.

This class iterates a given test n times and counts the number of failings. If the test fails more often than failLimit allows, the iterated test fails; otherwise it passes. (Mathematically, the test passes if failings ≤ failLimit.) This is only possible if the test to iterate is derived from the chisquare_test, ks_test or gaussian_test base class. Iterating a χ² or a Gaussian test gives one value for the failings; the K-S variant gives two values, one for $K^+$ and one for $K^-$. The count_fails_test class fulfils the test interface and acts like a normal test.

    template< class Test >
    count_fails_test(Test test, std::size_t iterations, std::size_t failLimit)

    <class Test>   type of the test
    test           test to iterate
    iterations     number of times to iterate the test
    failLimit      limit deciding between failure and success

4.6. Bit Tests

In some kinds of tests, such as the “count-the-1’s” test from the Diehard test suite, overlapping ranges of bits are tested. From each random number several new numbers are built.
This is done by masking the bit representation of the number with a specific mask which is shifted from the least significant bit to the most significant bit. An example of splitting a number into overlapping sub-numbers is given in figure 4.1.

    original  = 180   1 0 1 1 0 1 0 0
    Bits 3..0 =   4           0 1 0 0
    Bits 4..1 =  10         1 0 1 0
    Bits 5..2 =  13       1 1 0 1
    Bits 6..3 =   6     0 1 1 0
    Bits 7..4 =  11   1 0 1 1

    Figure 4.1.: Bit Test, Example of Bit Concatenation

This class is called rng_bit_test and is located in the header file of the same name. This wrapper can only be used if the test is derived from one of the given base classes (chisquare_test, ks_test or gaussian_test). The interface is

    template<class TEST, int no_bits>
    rng_bit_test(TEST test)

    <class TEST>    type of the test
    <int no_bits>   number of bits for each random number
    test            test to use for bit test

Example

As an example we want to know whether sequences of 10 bits each are uniformly distributed in a χ² sense. We have to create a test object, pass it to the wrapper and register the test.

    chisqr_uniformity_test chi_uni_test(200000, 10);
    rng_bit_test<chisqr_uniformity_test, 10> bit_chi_uni_test(chi_uni_test);
    rngTest.register_test<rng_bit_test<chisqr_uniformity_test, 10> >(bit_chi_uni_test);

4.7. Bit extract test

Another way to test a generator is to extract only a specific range of bits from each generated random number and interpret these bits as a new number. In figure 4.2, bits 2...5 are used to make a new number. Alternatively, we take one specific bit from each of several random numbers and interpret these bits as a new number. In figure 4.3 this is done with bit 5: to build a new random number, bit five of six consecutive random numbers is used.

    Original                 Bit 5..Bit 2   Selected
    180   1 0 1 1 0 1 0 0    1 1 0 1        13
    194   1 1 0 0 0 0 1 0    0 0 0 0         0
     89   0 1 0 1 1 0 0 1    0 1 1 0         6
    234   1 1 1 0 1 0 1 0    1 0 1 0        10

    Figure 4.2.: Extracting sub-sequences as next random numbers

    Original                 Bit 5
    180   1 0 1 1 0 1 0 0    1
    194   1 1 0 0 0 0 1 0    0
     89   0 1 0 1 1 0 0 1    0
    234   1 1 1 0 1 0 1 0    1
    195   1 1 0 0 0 0 1 1    0
     21   0 0 0 1 0 1 0 1    0
    Selected: 1 0 0 1 0 0 = 36

    Figure 4.3.: Concatenating single bits to the next random number

These tests are supported by two wrappers in rng_bit_extract.h.

    template<typename RNG, int start_bit, int no_bits>
    bit_extract(std::size_t b=10240)

    <typename RNG>    type of random number generator
    <int start_bit>   first bit of new random number
    <int no_bits>     number of bits of new random number
    b                 buffer size of random number generator

    template<typename RNG, int bit_no, int seqLength>
    bit_sequence(std::size_t b=10240)

    <typename RNG>    type of random number generator
    <int bit_no>      bit to use for random number
    <int seqLength>   number of bits for each random number
    b                 buffer size of random number generator

4.8. The XML output

The result of every test is written to a specific stream. This stream is passed to the run_test(std::ostream& out, bool catch_logic_errors = true) method. The output may be written to the console via std::cout or, better for further processing, to a file. To write the output to a file, one has to create a file stream like this:

    #include <fstream>
    std::ofstream fileOut("results.xml");

For a more detailed description of the XML schema see 6.6.

5. Tests for Studying Random Data

In this section we present different tests to study the behavior of random number generators. We can distinguish two different sorts of tests, statistical tests and physical tests¹. The only difference is the motivation for the test. In the first case, we want to know the behavior of some statistical properties; in the second case, we simulate a physical system. (Strictly speaking there are some more kinds of tests, like “visual tests” or “theoretical tests”, but we do not look at them because they cannot be automated.) Each of these tests checks a special property of the generated numbers against the theoretically expected behavior.
These tests are not my invention; I only collected them and added examples of their usage. A reference to the source (not source code) is given with each test.

Table 5.1 lists many known random number generator tests and their occurrence in often-cited test-benches. It is impossible to list all tests, since there are infinitely many of them, so we mention the most popular ones. A more interesting table for testers is table 5.2. It shows all available² tests in the test suite and their class names. (The name of the header file is the concatenation of the class name and .h.)

5.1. Equidistribution test

In this test we check whether the generated numbers are equally distributed. See [16].

- The N measured random values in the interval $[\alpha; \beta]$ must be divided into k classes $I_1, I_2, \ldots, I_k$. The classes contain $N_1, N_2, \ldots, N_k$ values, with $N_1 + N_2 + \cdots + N_k = N$.
- For each class, the expected number is calculated under the assumption that all values appear with the same probability. For k equally wide classes of width $(\beta - \alpha)/k$ this gives $p = 1/k$, i.e. an expected count of $N/k$ per class.
- Check the probability with the χ² test for the classes and use the KS test to check the whole data.

¹ A nice description of physical tests is given in [29]: “Passing several tests does not prove the randomness of any sequence, however. This is due to the fact that proving randomness requires that the sequence fulfils an actual definition for randomness. An unfortunate fact is, however, that there is no unique definition for randomness. [...] Therefore, passing many tests is never a sufficient condition for the use of any pseudo random number generator in all applications. In other words, in addition to standard tests, efficient application specific tests of randomness are also needed. This need is emphasized by recent simulations, in which some physical models combined with special algorithms have been found which are very sensitive to the quality of random numbers.”

² I hope that by the time this paper is published the list will already be updated with further implementations.
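The χ² part of the equidistribution check can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation behind chisqr_uniformity_test: the helper name chisqr_uniform is hypothetical, the samples are assumed to lie in [0, 1), and the classes are assumed to be equally wide, so each class expects N/k values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the test suite): chi-square statistic
// for N samples in [0, 1) sorted into k equally wide classes.
double chisqr_uniform(const std::vector<double>& samples, std::size_t k)
{
    assert(k > 0 && !samples.empty());
    std::vector<std::size_t> count(k, 0);
    for (double x : samples) {
        assert(x >= 0.0 && x < 1.0);
        std::size_t cls = static_cast<std::size_t>(x * k);
        if (cls >= k) cls = k - 1;  // guard against rounding at the upper edge
        ++count[cls];
    }
    // every class is expected to contain N / k values
    const double expected = static_cast<double>(samples.size()) / k;
    double chi2 = 0.0;
    for (std::size_t c : count) {
        const double diff = c - expected;
        chi2 += diff * diff / expected;
    }
    return chi2;
}
```

The returned value would then be compared against the χ² distribution with k - 1 degrees of freedom, as described in section 3.1.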
Example: Throwing a die

An example for the χ² part of this test is given in section 3.1 and the example for the KS test can be found in section 3.2.

Constructor in chisqr_uniformity_test.h

    chisqr_uniformity_test(uint64_t n, std::size_t classes)

    n         number of random numbers to count
    classes   number of classes into which the random numbers are sorted

Constructor in ks_uniformity_test.h

    ks_uniformity_test(std::size_t n)

    n   number of random numbers to count

5.2. Run test

In this test, we are looking for monotone subsequences of the original sequence, which are called runs. There are three different sorts of tests. We can count “runs up” and “runs down”, “runs above” and “runs below” the mean, or the “length of runs”.

As an example of a run, consider the sequence of eleven numbers {3 1 4 1 5 9 2 6 5 3 5}. To show the “runs up” we put a vertical line at the left and right and between $X_i$ and $X_{i+1}$ whenever $X_i > X_{i+1}$. Here we get

    | 3 | 1 4 | 1 5 9 | 2 6 | 5 | 3 5 |

5.2.1. Runs up and down

- Split the sequence of N random numbers into increasing and decreasing subsequences and count the runs, $a = n_{inc} + n_{dec}$.
- If N has an adequate size, the mean and variance of a are given by

$$ \mu_a = \frac{2N - 1}{3}, \qquad \sigma_a^2 = \frac{16N - 29}{90} $$

- For $N > 20$, the distribution of a is reasonably approximated by a normal distribution $N(\mu_a, \sigma_a^2)$. It is converted to a standardized normal distribution by

$$ Z_0 = \frac{a - \mu_a}{\sigma_a} = \frac{a - (2N - 1)/3}{\sqrt{(16N - 29)/90}} $$

- Failure to reject the hypothesis of independence occurs when $-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$, where α is the level of significance.
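The runs up and down statistic can be sketched as follows. Both function names are hypothetical helpers introduced for illustration and are not part of the test suite; ties between neighbouring values are treated as "down", which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: total number of runs up and runs down.
// A new run starts whenever the direction of the sequence changes.
std::size_t count_runs_up_down(const std::vector<double>& x)
{
    assert(x.size() >= 2);
    std::size_t runs = 1;
    bool up = x[1] > x[0];
    for (std::size_t i = 2; i < x.size(); ++i) {
        const bool now_up = x[i] > x[i - 1];
        if (now_up != up) {
            ++runs;
            up = now_up;
        }
    }
    return runs;
}

// Hypothetical helper: standardized statistic Z0 for a runs observed in a
// sequence of length N, using mu_a = (2N - 1)/3 and sigma_a^2 = (16N - 29)/90.
double runs_up_down_z(std::size_t a, std::size_t N)
{
    assert(N >= 2);
    const double mean = (2.0 * N - 1.0) / 3.0;
    const double var = (16.0 * N - 29.0) / 90.0;
    return (a - mean) / std::sqrt(var);
}
```

For a = 9 runs in N = 10 numbers this gives Z0 ≈ 2.21. The normal approximation is only stated for N > 20, so such a small case merely illustrates the arithmetic.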
Figure 5.1.: Compilation of known tests

The compilation records, for each test, its availability in the test-benches Knuth [16], Helsinki [29], Diehard [18], [19] and SPRNG [21]. The tests covered are:

    Equidistribution Test (Frequency Test)
    Gap Test
    Ising Model Test
    n-block test
    Serial Test
    Poker Test (Partition Test)
    Coupon collector’s Test
    Permutation Test
    Run Test
    Maximum of t Test
    Collision Test (Hash Test)
    Serial correlation Test
    Birthday-Spacing’s Test
    Overlapping Permutations Test
    Ranks of 31×31 and 32×32 matrices Test
    Ranks of 6×8 Matrices Test
    Monkey Tests on 20-bit Words
    Monkey Tests OPSO, OQSO, DNA
    Count the 1’s in a Stream of Bytes
    Count the 1’s in Specific Bytes
    Parking Lot Test
    Minimum Distance Test
    Random Spheres Test
    The Squeeze Test
    Overlapping Sums Test
    The Craps Test
    Sum of distributions (for parallel streams)
    FFT
    Blocking Test
    2-d Random Walk
    Random Walkers on a line (S_n Test)
    2D Intersection Test
    2D Height Correlation Test
    Repeating Time Test
    Gorilla Test
    gcd Test
    Maurers Universal Test
    Test                                          Class Name                Description
    Equidistribution Test (Frequency Test)        ks_uniformity_test        5.1
                                                  chisqr_uniformity_test    5.1
    Gap Test                                      gap_test                  5.3
    Ising Model Test                              ising_model_test          5.16
    n-block test                                  n_block_test              5.18
    Serial Test                                   serial_test               5.11
    Poker Test (Partition Test)                   poker_test                5.4
    Coupon collector’s Test                       coupon_collector_test     5.5
    Permutation Test                              permutation_test          5.6
    Run Test                                      runs_test                 5.2.3
    Maximum of t Test                             max_of_t_test             5.7
    Collision Test (Hash Test)                    collision_test            5.9
    Serial correlation Test                       serial_correlation_test   5.10
    Birthday-Spacing’s Test                       birthday_spacing_test     5.8
    Overlapping Permutations Test                                           5.25.2
    Ranks of 31×31 and 32×32 matrices Test        bin_rank_chisqr_test      5.25.3
    Ranks of 6×8 Matrices Test                    bin_rank_ks_test          5.25.3
    Monkey Tests on 20-bit Words                                            5.25.4
    Monkey Tests OPSO, OQSO, DNA                                            5.25.5
    Count the 1’s in a Stream of Bytes                                      5.25.6
    Count the 1’s in Specific Bytes                                         5.25.6
    Parking Lot Test                                                        5.25.7
    Minimum Distance Test                         minimum_distance_test     5.25.10
    Random Spheres Test                           random_sphere_test        5.25.11
    The Squeeze Test                              squeeze_test              5.25.9
    Overlapping Sums Test                                                   5.25.8
    The Craps Test                                craps_test                5.25.13
    Sum of distributions (for parallel streams)                             5.22
    FFT                                                                     5.23
    Blocking Test                                                           5.12
    2-d Random Walk                               random_walk_test          5.17
    Random Walkers on a line (S_n Test)                                     5.19
    2D Intersection Test                                                    5.20
    2D Height Correlation Test                    height_corr2d_test        5.21
    Repeating Time Test                                                     5.13
    Gorilla Test                                                            5.15
    GCD Test                                                                5.14
    Maurers Universal Test                                                  5.24

    Figure 5.2.: Available tests in the RNGTS framework

Example:

If a sequence of numbers has too few runs, it is unlikely to be a truly random sequence. If we look at the following sequence, {0.12, 0.35, 0.38, 0.45, 0.51, 0.69, 0.77, 0.78, 0.90, 0.93}, we can only find one run up. It is not likely to be a random sequence.

If a sequence of numbers has too many runs, it is also unlikely to be a truly random sequence. Look at the sequence {0.08, 0.93, 0.15, 0.96, 0.26, 0.84, 0.28, 0.79, 0.36, 0.57}.
If we split this sequence into “runs up” and “runs down”, we find the following:

    runs up:    (0.08 0.93) (0.15 0.96) (0.26 0.84) (0.28 0.79) (0.36 0.57)    five runs up
    runs down:  (0.93 0.15) (0.96 0.26) (0.84 0.28) (0.79 0.36)                four runs down

It has nine runs, five up and four down.

5.2.2. Runs above and below mean

This test is an addition to the “Runs up and down” test (5.2.1). It is easy to build a sequence with the first 20 numbers above the mean and the following 20 numbers below the mean which does not fail the “Runs up and down” test. So we have to check the behaviour of the runs above and below the mean.

- Calculate the mean of the sequence of random numbers.
- Split this sequence into subsequences above and below the mean and count the number of runs below, $n_b$, and above, $n_a$. r is the total number of runs.
- The mean and variance of r can be expressed as

$$ \mu_r = \frac{2 n_a n_b}{N} + \frac{1}{2}, \qquad \sigma_r^2 = \frac{2 n_a n_b (2 n_a n_b - N)}{N^2 (N - 1)} $$

- For either $n_a$ or $n_b$ greater than 20, r is approximately normally distributed:

$$ Z_0 = \frac{r - \frac{2 n_a n_b}{N} - \frac{1}{2}}{\sqrt{\frac{2 n_a n_b (2 n_a n_b - N)}{N^2 (N - 1)}}} $$

- Failure to reject the hypothesis of independence occurs when $-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}$, where α is the level of significance.

Example:

We have the following sequence of random numbers: {0.78, 0.49, 0.41, 0.58, 0.82, 0.26, 0.30, 0.06, 0.36, 0.01}. Calculating the mean gives $\mu = 0.407$.

Splitting the sequence into subsequences above and below the mean gives the following situation:

    | 0.78 0.49 0.41 0.58 0.82 | 0.26 0.30 0.06 0.36 0.01 |

In this case one run is above and one run is below the mean. It is not likely to be a random sequence.

5.2.3. Length of runs

This test is an addition to the last two tests. It is still possible to create a sequence of numbers which passes the last two tests, although the probability that this sequence is truly random is very small. Such a sequence may be a run of two numbers below the mean, then a run of two numbers above the mean, and so on. So we need to test the randomness of the length of runs.
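To make the quantity tested here concrete, the following sketch tallies the lengths of “runs up” in an integer sequence. The function name runs_up_lengths is a hypothetical helper and not the interface of runs_test; runs longer than max_len are added to the last counter, which is assumed to mirror the role of maxRunLength in the constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: run[i] (1 <= i <= max_len) holds the number of
// "runs up" of length i; longer runs are counted in run[max_len].
std::vector<std::size_t> runs_up_lengths(const std::vector<int>& x,
                                         std::size_t max_len)
{
    assert(!x.empty() && max_len >= 1);
    std::vector<std::size_t> run(max_len + 1, 0);  // index 0 is unused
    std::size_t len = 1;
    for (std::size_t i = 1; i <= x.size(); ++i) {
        if (i < x.size() && x[i] > x[i - 1]) {
            ++len;  // the current run continues
        } else {
            ++run[len < max_len ? len : max_len];  // the run ends here
            len = 1;
        }
    }
    return run;
}
```

For the eleven-number sequence {3 1 4 1 5 9 2 6 5 3 5} from the beginning of section 5.2 this yields two runs of length 1, three of length 2 and one of length 3.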
- Split the sequence into subsequences in one of the ways given above, where N is the number of samples.
- Store the number of runs of length i in RUN[i].
- Here, we should not apply a χ² test to the data stored in RUN, because adjacent runs are not independent. A long run will tend to be followed by a short run, and vice versa. So the statistic should be computed as follows:

$$ \frac{1}{N - 6} \sum_{i,j=1}^{6} \left( \mathrm{RUN}[i] - N b_i \right) \left( \mathrm{RUN}[j] - N b_j \right) a_{ij} \qquad (5.1) $$

The coefficients $a_{ij}$ and $b_i$ can be found in [16], where a method to calculate the coefficients for an arbitrary maximal run length is also shown.

Example: Length of “runs up”

We have a random sequence: {3 1 4 1 5 9 2 6 5 3 5}. Marking the “runs up” in the sequence produces

    | 3 | 1 4 | 1 5 9 | 2 6 | 5 | 3 5 |

We get the following “statistic”:

- 1 run of length 3
- 3 runs of length 2
- 2 runs of length 1

Constructor in runs_test.h

    runs_test(uint64_t n, std::size_t maxRunLength)

    n              number of random numbers to check for runs
    maxRunLength   runs longer than this length are cumulated

Internally, this test has to invert a matrix. This functionality is supported by the LAPACK library, and the matrix handling is covered by routines from BLAS. The Boost interface for these two libraries is not yet in the official release, but it is available in the “Boost-Sandbox”. To use this test, the “Boost-Sandbox” must be installed; it is available at [2] under “Sandbox CVS”.

5.3. Gap test

This test is used to examine the length of “gaps” between occurrences of samples in a certain range. It determines the length of consecutive subsequences with samples not in a specific range. The algorithm to count the gap length is found in [16].

- Define an interval $[\alpha; \beta]$ with $0 \le \alpha < \beta \le 1$.
- Define a list to save the number of occurrences of gaps of length l, where $0 \le l \le t$. This is easily done with a structure like COUNT[l]. With every occurrence of a gap of length l, do COUNT[l] = COUNT[l]+1. If l is bigger than t, increase COUNT[t].
- Search a subsequence $X_i, X_{i+1}, \ldots, X_{i+l}$ of the random sequence $X_0, X_1, \ldots, X_N$ in which $X_{i+l}$ lies in $[\alpha; \beta)$ but the other X’s do not. This subsequence of $l + 1$ numbers represents a gap of length l. This increases the number in COUNT[l].
- After enough samples are tested, the χ² test is applied to the $k = t + 1$ values of COUNT[0], COUNT[1], ..., COUNT[t], using the following probabilities:

$$ p_0 = p, \quad p_1 = p(1 - p), \quad p_2 = p(1 - p)^2, \quad \ldots, \quad p_{t-1} = p(1 - p)^{t-1}, \quad p_t = (1 - p)^t $$

Here $p = \beta - \alpha$ is the probability that $\alpha \le X_i < \beta$.

The gap test can be applied with $\alpha = 0$ or $\beta = 1$ to simplify the test procedure. The special cases $(\alpha, \beta) = (0, \frac{1}{2})$ and $(\frac{1}{2}, 1)$ give rise to the “runs above mean” and “runs below mean” tests.

This is not the same implementation of the test as used in [29]. They use n random numbers and count the number of gaps, whereas this algorithm produces random numbers until n gaps have been counted. An approximate conversion from one test to the other is possible with an estimate for the number of gaps within n random numbers:

$$ \text{gaps} \approx n (\beta - \alpha) $$

Example:

We have the following sequence: {0.11, 0.83, 0.56, 0.95, 0.88, 0.73, 0.91, 0.01, 0.75, 0.67, 0.23, 0.38}

In this case we take the first two numbers to determine the interval. This means $\alpha = 0.11$, $\beta = 0.83$, or $[0.11; 0.83]$.

The sequence to check is {0.56, 0.95, 0.88, 0.73, 0.91, 0.01, 0.75, 0.67, 0.23, 0.38}

The first value lies in the interval, the next two values do not. This means that the gap-length is 2.
Marked in the sequence, with values inside the interval in square brackets and the running gap-length counter in parentheses, the sequence looks as follows:

    [0.56] (0) 0.95 (1) 0.88 (2) [0.73] (0) 0.91 (1) 0.01 (2) [0.75] (0) [0.67] (0) [0.23] (0) [0.38]

Calculating the probabilities with $p = 0.83 - 0.11 = 0.72$ and a total of five gaps gives

    t   p_t      expected # of gaps   counted # of gaps
    0   0.72     3.60                 3
    1   0.2016   1.01                 0
    2   0.0564   0.28                 2
    3   0.0158   0.08                 0

Constructor in gap_test.h

    gap_test(std::size_t n, double lowerGapLimit, double upperGapLimit,
             std::size_t maxGapCount)

    n               number of random numbers to count
    lowerGapLimit   start of gap (α)
    upperGapLimit   end of gap (β)
    maxGapCount     number of steps counted until they are cumulated

5.4. Poker test

The “original” poker test considers n groups of five successive integers, denoted by $(X_{5i}, X_{5i+1}, \ldots, X_{5i+4})$, $0 \le i < n$. We observe which of the following seven patterns each quintuple matches:

    All different:     abcde
    One pair:          aabcd
    Two pairs:         aabbc
    Three of a kind:   aaabc
    Full house:        aaabb
    Four of a kind:    aaaab
    Five of a kind:    aaaaa

A χ² test is based on the number of quintuples in each category.

To get a simpler version of this test, a good compromise [16] is to simply count the number of distinct values in the set of five. So we have five categories:

    5 different = all different
    4 different = one pair
    3 different = two pairs, or three of a kind
    2 different = full house, or four of a kind
    1 different = five of a kind

This breakdown is easier to determine systematically, and the test is nearly as good.

- Generate n groups of k successive numbers.
- Count the number of k-tuples with r different values.
- A χ² test can be made using the following probability, where d is the number of different values a number can take:

$$ p_r = \frac{d (d - 1) \cdots (d - r + 1)}{d^k} \left\{ {k \atop r} \right\} \qquad (5.2) $$

The Stirling number³ (of the second kind) $\left\{ {k \atop r} \right\}$ is the number of ways to partition a set of k elements into exactly r parts. For this test we use $k = 5$, so the Stirling numbers can be written in a little table.

    r                  1    2    3    4    5
    {5 \atop r}        1   15   25   10    1

³ The Stirling number can be written in closed form as

$$ \left\{ {n \atop m} \right\} = \frac{1}{m!} \sum_{k=0}^{m} (-1)^{m-k} \binom{m}{k} k^n $$

Example: Throwing a die

Let us throw a die until there are one hundred values between one and five. If a six occurs, ignore it. The sequence looks like this:

    { 5 2 4 4 5 1 2 4 2 3 3 4 2 3 5 5 4 4 1 4 1 2 1 5 1
      3 1 1 5 2 3 5 4 2 4 3 3 5 2 4 3 4 3 5 2 2 5 5 1 5
      3 1 1 4 5 2 1 1 3 1 2 5 5 5 2 3 2 4 3 4 3 1 3 5 5
      4 2 4 4 2 4 1 3 5 5 2 2 5 2 4 5 3 4 5 3 5 2 5 4 5 }

Arrange the sequence into $n = 20$ groups of $k = 5$ numbers:

    52445 | 12423 | 34235 | 54414 | 12151 | 31152 | 35424 |
    33524 | 34352 | 25515 | 31145 | 21131 | 25552 | 32434 |
    31355 | 42442 | 41355 | 22524 | 53453 | 52545

Count the number of different values:

    sequence  r    sequence  r    sequence  r    sequence  r
    52445     3    12423     4    34235     4    54414     3
    12151     3    31152     4    35424     4    33524     4
    34352     4    25515     3    31145     4    21131     3
    25552     2    32434     3    31355     3    42442     2
    41355     4    22524     3    53453     3    52545     3

We get the following “statistic”:

    r    1   2   3    4   5
    #r   0   2   10   8   0

To calculate the expected values we use equation (5.2) with $d = 5$ and $k = 5$.

$$ p_1 = \frac{5}{5^5} \left\{ {5 \atop 1} \right\} = \frac{1}{625} = 0.0016 $$
$$ p_2 = \frac{5 \cdot 4}{5^5} \left\{ {5 \atop 2} \right\} = \frac{12}{125} = 0.096 $$
$$ p_3 = \frac{5 \cdot 4 \cdot 3}{5^5} \left\{ {5 \atop 3} \right\} = \frac{12}{25} = 0.48 $$
$$ p_4 = \frac{5 \cdot 4 \cdot 3 \cdot 2}{5^5} \left\{ {5 \atop 4} \right\} = \frac{48}{125} = 0.384 $$
$$ p_5 = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{5^5} \left\{ {5 \atop 5} \right\} = \frac{24}{625} = 0.0384 $$

It is now possible to make a table with the expected and the measured number of quintuples of each kind.

    r   # expected              # measured
    1   0.0016 · 20 = 0.032     0
    2   0.096 · 20 = 1.92       2
    3   0.48 · 20 = 9.6         10
    4   0.384 · 20 = 7.68       8
    5   0.0384 · 20 = 0.768     0

Constructor in poker_test.h

    poker_test(uint64_t n, std::size_t different_cards)

    n                 number of poker games
    different_cards   number of different poker cards in the game

5.5. Coupon-collectors test

This test is similar to the poker test 5.4. We observe the sequence $X_1, X_2, \ldots$ and count the length r of the subsequence $X_{i+1}, X_{i+2}, \ldots, X_{i+r}$ required to get a “complete set” of integers from 0 to $d - 1$.
Obviously, the minimal length r is d; the maximum length is not bounded, so we define a t which gives an upper bound. So it follows that $d \le r \le t$. This test is described in [16].

- We run this test until we get n “complete sets” of integers from 0 to $d - 1$ and store the number of occurrences of each length r in a list like COUNT[r], where $d \le r \le t$. All sequences longer than t are accumulated in COUNT[t].
- To perform a χ² test with $k = t - d + 1$ categories, we have to know the expected probabilities for each length. These are calculated with the following formulas:

$$ p_r = \frac{d!}{d^r} \left\{ {r - 1 \atop d - 1} \right\} \quad \text{for } d \le r < t \qquad (5.3) $$

$$ p_t = 1 - \frac{d!}{d^{t-1}} \left\{ {t - 1 \atop d} \right\} \qquad (5.4) $$

Once more the term $\left\{ {r - 1 \atop d - 1} \right\}$ denotes the Stirling number of the second kind.

Example: Throwing a die

We use the same data as in 5.4. We have integers from 1 to 5 and we define $t = 13$. We first split the sequence into “complete sets” and count their lengths:

    complete set            length
    5244512423              10
    342355441               9
    4121513                 7
    1152354                 7
    2433524343522551        16
    5311452                 7
    113125552324            12
    343135542               9
    4424135                 7
    5225245345352545...     16

We can calculate the expected number of sequences of each length with equations (5.3) and (5.4) and compare it to the measured number.

    r      p_r                          # expected   # counted
    5      24/625 = 0.0384              0.38         0
    6      48/625 = 0.0768              0.77         0
    7      312/3125 = 0.0998            1.00         4
    8      336/3125 = 0.1075            1.08         0
    9      40824/390625 = 0.1045        1.05         2
    10     37296/390625 = 0.0955        0.95         1
    11     163704/1953125 = 0.0838      0.84         0
    12     27984/390625 = 0.0716        0.72         1
    ≥13    628901/1953125 = 0.3220      3.22         2

Constructor in coupon_collector_test.h

    coupon_collector_test(uint64_t n, std::size_t different_coupons,
                          std::size_t maxSeq)

    n                   number of coupon sets
    different_coupons   number of different coupons
    maxSeq              sequence lengths above this value are cumulated

5.6. Permutation test

The sequence of numbers is divided into n groups of t elements each, denoted as the vector $(X_{it}, X_{it+1}, \ldots, X_{it+t-1})$ for $0 \le i < n$. The elements in each group of t values can have t! possible orderings.
The number of times each ordering appears is counted, and a χ² test with $k = t!$ categories and with probability $1/t!$ for each ordering is applied. The theory and an algorithm may be found in [16].

- Divide the input sequence into n groups of t elements each.
- Count the occurrence of each possible ordering in the groups.
- Do a χ² test with $k = t!$ categories and with probability $1/t!$ for each ordering.

Example:

We get $n = 50$ groups of data, each with three values out of 1, 2, 3.

    1-10    312 123 312 132 213 231 213 213 312 132
    11-20   213 312 312 321 231 321 123 123 123 321
    21-30   132 312 312 132 231 132 213 231 213 312
    31-40   231 231 123 312 231 123 231 321 321 123
    41-50   321 231 123 123 132 213 123 321 321 132

For $t = 3$ there are $t! = 6$ different orderings. We count the occurrence of each ordering:

    sequence      123   132   213   231   312   321
    # sequences   10    7     7     9     9     8

The expected value for every ordering is $n / t! = 50/6 \approx 8.33$.

Constructor in permutation_test.h

    permutation_test(uint64_t n, std::size_t nrOfElements)

    n              number of permutations to generate
    nrOfElements   number of elements to permute

5.7. Maximum of t test

The sequence of numbers is divided into n groups of t elements each, denoted as the vector $(X_{it}, X_{it+1}, \ldots, X_{it+t-1})$ for $0 \le i < n$. Then the maximum of each group is determined. The distribution of the maxima should follow $x^t$. This test is described in [16].

- Divide the input sequence into n groups of t elements each, denoted by $V_i = (X_{it}, X_{it+1}, \ldots, X_{it+t-1})$ for $0 \le i < n$.
- Generate a new sequence $\max(V_0), \max(V_1), \ldots, \max(V_{n-1})$.
- Apply the Kolmogorov-Smirnov test to the sequence of maxima with the distribution function $F(x) = x^t$, $0 \le x \le 1$.
- We make k equidistant bins on $[0; 1]$. To get the expected number of values in each bin we have to subtract the probability for the lower bin from the probability for the actual bin. The fraction of values in bin i, $1 \le i \le k$, is $(i/k)^t - ((i-1)/k)^t$. To get the expected values we multiply the value per bin by the number of groups n.

Example:

We have 50 random floating point numbers in $[0; 1)$. These numbers are already grouped into $n = 10$ sequences of $t = 5$ elements; the maximum of each group is listed in the last column.

    seq.   random numbers                                       maximum
    1      0.911647  0.79844    0.783099  0.394383  0.840188    0.911647
    2      0.55397   0.277775   0.76823   0.335223  0.197551    0.76823
    3      0.95223   0.513401   0.364784  0.628871  0.477397    0.95223
    4      0.606969  0.141603   0.717297  0.635712  0.916195    0.916195
    5      0.156679  0.804177   0.137232  0.242887  0.0163006   0.804177
    6      0.218257  0.998925   0.108809  0.12979   0.400944    0.998925
    7      0.637552  0.296032   0.61264   0.839112  0.512932    0.839112
    8      0.771358  0.292517   0.972775  0.493583  0.524287    0.972775
    9      0.283315  0.891529   0.400229  0.769914  0.526745    0.891529
    10     0.949327  0.0697553  0.919026  0.807725  0.352458    0.949327

If we bin the interval $[0; 1)$ into 10 subintervals, we need to know how many values we expect in each bin.

    bin   range       fraction       # expected   # measured
    1     0.0 - 0.1   1.00 · 10^-5   0.00         0
    2     0.1 - 0.2   3.10 · 10^-4   0.00         0
    3     0.2 - 0.3   2.11 · 10^-3   0.02         0
    4     0.3 - 0.4   7.81 · 10^-3   0.08         0
    5     0.4 - 0.5   2.10 · 10^-2   0.21         0
    6     0.5 - 0.6   4.65 · 10^-2   0.47         0
    7     0.6 - 0.7   9.03 · 10^-2   0.90         0
    8     0.7 - 0.8   1.60 · 10^-1   1.60         1
    9     0.8 - 0.9   2.63 · 10^-1   2.63         3
    10    0.9 - 1.0   4.10 · 10^-1   4.10         6

Constructor in max_of_t_test.h

    max_of_t_test(uint64_t n, std::size_t t, std::size_t bins)

    n      number of groups to check for maximum
    t      number of elements per group
    bins   number of classes for statistic

5.8. Birthday Spacings test

In this test we check how random “birthdays” are distributed over a “year”. To do this, we look at the spacings between two successive birthdays. This test was first implemented in Marsaglia’s Diehard test suite [18]. The theoretical background was presented in [17]. A stronger version is described in [20] and its implementation can be found at http://www.jstatsoft.org/v07/i03/. The test is also included in the latest version of [16].
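The core of this test, counting repeated spacings between sorted birthdays, can be sketched as below. Both function names are hypothetical helpers rather than the interface of birthday_spacing_test, and the sketch assumes that only the m - 1 spacings between neighbouring sorted birthdays are used (no wrap-around spacing across the end of the year).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of collisions among the spacings of the
// sorted birthdays, i.e. how often a spacing value repeats.
std::size_t spacing_collisions(std::vector<unsigned long> birthdays)
{
    assert(birthdays.size() >= 2);
    std::sort(birthdays.begin(), birthdays.end());
    std::vector<unsigned long> spacings;
    for (std::size_t i = 1; i < birthdays.size(); ++i)
        spacings.push_back(birthdays[i] - birthdays[i - 1]);
    std::sort(spacings.begin(), spacings.end());
    std::size_t collisions = 0;
    for (std::size_t i = 1; i < spacings.size(); ++i)
        if (spacings[i] == spacings[i - 1])
            ++collisions;  // equal neighbours in the sorted list
    return collisions;
}

// Hypothetical helper: Poisson mean m^3 / (4n) for m birthdays in n days.
double poisson_mean(double m, double n)
{
    assert(n > 0.0);
    return m * m * m / (4.0 * n);
}
```

With the fifteen birthdays used in the example of this section the function returns 3 collisions, and poisson_mean(15, 365) is about 2.312.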
- Choose a number of m “birthdays” in a “year” of n days.
- Sort the birthdays in ascending order and calculate the spacings between successive birthdays.
- Count the number of collisions, i.e. spacings that occur more than once.
- The number of collisions should be approximately Poisson distributed with mean $\mu = m^3 / (4n)$. This distribution is tested with a χ² test.

Example:

Let us assume a “year” with $n = 365$ “days” and $m = 15$ “birthdays”. Sort the birthdays, then calculate and sort the spacings.

    birthday   birthdays sorted   spacings   spacings sorted
    305        71
    143        101                30         5
    285        122                21         6
    290        132                10         10
    331        143                11         11
    71         173                30         13
    122        186                13         15
    279        201                15         15 *
    101        228                27         15 *
    201        279                51         21
    173        285                6          26
    228        290                5          27
    132        305                15         30
    186        331                26         30 *
    346        346                15         51

The collisions of “birthday spacings” are marked with an asterisk. We got three collisions.

The mean of the Poisson distribution is $\mu = \frac{m^3}{4n} = \frac{15^3}{4 \cdot 365} = 2.312$.

Constructor in birthday_spacing_test.h

    birthday_spacing_test(std::size_t runs, std::size_t birthdays,
                          uint64_t days, std::size_t maxCollisions)

    runs            number of birthday experiments to run
    birthdays       number of birthdays in a year
    days            number of days in a year
    maxCollisions   collision counts above this number are cumulated

5.9. Collision test (Hash test)

The χ² test statistic is meaningful only when each interval has more than, let us say, 5 samples. This test, however, is designed such that the number of intervals is much larger than the number of samples.

Suppose we throw n balls randomly into m empty urns with $m \gg n$. If a ball falls into a nonempty urn we get a collision. This is the 1-dimensional collision test. To get the 2-dimensional version, we arrange the urns in a 2-dimensional array.

Sometimes this test is also called the “Hash test”. The test can be interpreted as building an enormous hash table and generating an appropriate index. Some theory can be found in [16] or in [4].

- Select the number of “balls” n and the number of “urns” m.
To do this make a list BALLS[n] and insert for ball i the urn in which it falls.
- Check the list BALLS[i], $i = 1, \ldots, n$, for collisions (check whether any number appears twice in BALLS[i]).
- Theoretically, the probability that an urn receives exactly k balls is

$$ p_k = \binom{n}{k} \left( \frac{1}{m} \right)^k \left( 1 - \frac{1}{m} \right)^{n-k} $$

so the expected number of collisions, summed over all m urns, is

$$ C = m \sum_{k \ge 1} (k - 1) p_k $$

If $m \gg n$, then $C \approx \frac{n^2}{2m}$.

Example: 1-dimensional test

We take $m = 2^8 = 256$ “urns” and $n = 2^5 = 32$ balls.

The expected number of collisions is $C \approx \frac{n^2}{2m} = \frac{32^2}{2 \cdot 256} = 2$.

The numbers of the “urns” into which the balls fall are listed below:

    214, 100, 199, 203, 232, 50, 85, 195, 70, 141, 121, 160, 93, 130, 242, 233,
    162, 182, 36, 154, 4, 61, 34, 205, 39, 102, 33, 27, 254, 55, 130, 213

We can see that one collision occurs: urn 130 contains two balls.

Constructor in collision_test.h

    collision_test(std::size_t runs, uint64_t balls,
                   std::size_t edge_length, std::size_t dim)

    runs          number of experiments to run
    balls         number of balls to throw into urns
    edge_length   edge length of the “urns field”
    dim           dimension of the “urns field”

5.10. Serial correlation

Most random numbers are generated by algorithms and not produced by physical processes. Because of this we must assume that there are dependences between two successive numbers. A way to quantify this is the “serial correlation coefficient”.

The “serial correlation coefficient” C of a sequence $X_0, X_1, \ldots, X_{N-1}$ of N random numbers is calculated by the following formula given in [16]:

$$ C = \frac{N (X_0 X_1 + X_1 X_2 + \cdots + X_{N-2} X_{N-1} + X_{N-1} X_0) - (X_0 + X_1 + \cdots + X_{N-1})^2}{N (X_0^2 + X_1^2 + \cdots + X_{N-1}^2) - (X_0 + X_1 + \cdots + X_{N-1})^2} \qquad (5.5) $$

A correlation coefficient always lies between $-1$ and $+1$. When it is zero or very small, it indicates that $X_i$ and $X_j$ are (nearly) uncorrelated.
A “good” value of C will lie between $\mu_N - 2\sigma_N$ and $\mu_N + 2\sigma_N$, which should happen about 95% of the time, where

\[ \mu_N = \frac{-1}{N-1}, \qquad \sigma_N = \frac{1}{N-1} \sqrt{\frac{N(N-3)}{N+1}}, \qquad N > 2 \qquad (5.6) \]

• Generate a sequence of N random numbers $X_0, X_1, \ldots, X_{N-1}$.
• Calculate the “serial correlation coefficient” C with formula (5.5).
• Calculate the mean and the standard deviation and check whether C lies within the two-σ limits ($\mu_N \pm 2\sigma_N$), which denote the 95% limit.

Example: Let's take the same sequence of ten random numbers as in section 3.2: {0.809, 0.465, 0.151, 0.628, 0.318, 0.824, 0.394, 0.968, 0.179, 0.458}. We have to calculate C as shown in equation (5.5):

\[ C = \frac{10\,(0.809 \cdot 0.465 + 0.465 \cdot 0.151 + \cdots) - (0.809 + 0.465 + 0.151 + \cdots)^2}{10\,(0.809^2 + 0.465^2 + 0.151^2 + \cdots) - (0.809 + 0.465 + 0.151 + \cdots)^2} = -0.515376 \]

To check whether the calculated coefficient C lies between the 2σ bounds, we use the formulas given in (5.6):

\[ \mu_{10} = \frac{-1}{9} = -0.1111, \qquad \sigma_{10} = \frac{1}{9} \sqrt{\frac{70}{11}} = 0.2803 \]

We see that the valid interval for this test is [−0.672; 0.449]. The calculated coefficient C lies between these bounds. (The interval is so large because we only tested ten numbers.)

Constructor in serial_correlation_test.h

    serial_correlation_test(uint64_t n)

    n   number of random numbers to calculate correlation for

5.11. Serial test

This test checks whether not only single numbers are uniformly distributed but also pairs, triples or, in general, d-dimensional points. To make this test, count the number of times each value of the tuple $(X_{di}, X_{di+1}, \ldots, X_{di+d-1})$ occurs, for $0 \le i < n$ and dimension $d > 0$. If d = 1 the test is the same as the “Equidistribution test”. The resulting counts are checked with a χ² test.

• Generate n d-tuples $(X_{di}, X_{di+1}, \ldots, X_{di+d-1})$, where $d > 0$ and $0 \le X_j < k$.
• Apply a χ² test to these $k^d$ categories with probability $1/k^d$ in each category.

To get a valid χ² test, n should be large compared to $k^d$, say $n \gg k^d$. A more detailed description is given in [16].
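The two steps above can be sketched as follows. This is an illustration under our own assumptions, not the implementation in serial_test.h: the helper names count_tuples and chi_square_uniform are hypothetical, the input is assumed to be already reduced to integers in [0, k), and only the raw χ² statistic is computed, not its probability.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count how often each d-tuple occurs. `values` holds n*d integers in [0, k);
// consecutive, non-overlapping groups of d values form one tuple.
// Hypothetical helper for illustration only.
std::vector<std::size_t> count_tuples(const std::vector<unsigned>& values,
                                      unsigned k, std::size_t d) {
    assert(k > 0 && d > 0 && values.size() % d == 0);
    std::size_t categories = 1;
    for (std::size_t j = 0; j < d; ++j) categories *= k;  // k^d categories
    std::vector<std::size_t> counts(categories, 0);
    for (std::size_t i = 0; i < values.size(); i += d) {
        std::size_t index = 0;
        for (std::size_t j = 0; j < d; ++j) {
            assert(values[i + j] < k);
            index = index * k + values[i + j];  // read the tuple as a base-k number
        }
        ++counts[index];
    }
    return counts;
}

// Pearson chi-square statistic against equal probabilities 1/k^d.
double chi_square_uniform(const std::vector<std::size_t>& counts) {
    std::size_t n = 0;
    for (std::size_t c : counts) n += c;
    assert(n > 0);
    const double expected = static_cast<double>(n) / counts.size();
    double chi2 = 0.0;
    for (std::size_t c : counts) {
        const double diff = static_cast<double>(c) - expected;
        chi2 += diff * diff / expected;
    }
    return chi2;
}
```

Reading each tuple as a base-k number gives a dense index into the $k^d$ categories, so a flat array of counters suffices for any dimension d.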
Example: Let's generate n = 10 pairs (d = 2) of random integers between zero and two (k = 3). The pairs are:

    (2,1) (2,2) (2,0) (1,2) (0,1) (1,1) (1,1) (2,2) (1,2) (0,1)

Then we count the occurrences of each pair:

    pair    (0,0) (0,1) (0,2) (1,0) (1,1) (1,2) (2,0) (2,1) (2,2)
    count     0     2     0     0     2     2     1     1     2

The expected number of pairs for each class is $n/k^d = 10/9 = 1.11$.

Constructor in serial_test.h

    serial_test(uint64_t n, std::size_t gridSize, std::size_t dimension)

    n           number of random numbers to place
    gridSize    edge length of the grid
    dimension   dimension of the grid

5.12. Blocking test

For this test I found only a very scanty description in [26]. So the only way to find out how this test has to be implemented is to look into existing source code.

The Blocking test tests a proposition of the central limit theorem. This says that the sum of k independent variables with zero mean and unit variance approaches the normal distribution with mean zero and variance equal to k. To test the proposition, n sums of such groups or blocks are built and checked for normality.

5.13. Repeating Time Test

This test checks whether a uniform [0, 1) random number generator starts to repeat its sequence when it is expected to. If the repetition occurs too soon, the test fails because the generator does not generate all possible numbers but only a subset of all values. If the first repetition occurs too late after the expected value, this means that the numbers are unusually uniformly spread. The implementation and the included description can be found in [10].

5.14. gcd test (greatest common divisor)

This test calculates the greatest common divisor of two random numbers using Euclid's algorithm. The number of steps needed to complete Euclid's algorithm and the resulting gcd are then checked against their expected probabilities.
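The two quantities the test looks at, the gcd and the number of division steps, both fall out of one run of Euclid's algorithm. The following is a minimal sketch; the function name gcd_with_steps is hypothetical, and the convention of dividing the larger number by the smaller one first is our assumption about how the steps are counted.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Euclid's algorithm returning {gcd(u, v), number of division steps}.
// Hypothetical helper for illustration only. The larger value is divided
// by the smaller one first, so the step count does not depend on the
// order of the arguments.
std::pair<std::uint64_t, unsigned> gcd_with_steps(std::uint64_t u, std::uint64_t v) {
    assert(u > 0 && v > 0);
    if (u < v) std::swap(u, v);
    unsigned steps = 0;
    while (v != 0) {
        const std::uint64_t r = u % v;  // one division step: u = q * v + r
        u = v;
        v = r;
        ++steps;
    }
    return {u, steps};
}
```

For the numbers 216 and 256 this yields the gcd 8 after 4 division steps.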
The idea of this test is described in [20], and some theory may be found in [16], section 4.5.2, in the exercises and the corresponding answers.

The problem is that the expected distributions of the number of steps and of the gcd are unknown, so the comparison must be done with simulated values.

Example: We calculate the gcd of u = 216 and v = 256. The algorithm gives:

    256 = 1 · 216 + 40
    216 = 5 · 40 + 16
     40 = 2 · 16 + 8
     16 = 2 · 8

We get gcd(216, 256) = 8 and the number of steps k = 4.

5.15. Gorilla test

This is a strong version of the monkey test from the Diehard test suite [17]. The test counts the number of missing 26-bit “words” and compares it with the expected value. Unfortunately Marsaglia's version is hard-wired, so a more flexible implementation has to develop the associated statistic itself. The theory of the test and its implementation can be found in [20].

5.16. Ising-model test

The Ising model [15] is one of the simplest and most fundamental models of statistical mechanics. It describes the properties resulting from interacting spins on a lattice.

The system considered is an array of N fixed points called lattice sites that form an n-dimensional periodic lattice. Associated with each lattice site is a spin variable $s_i$ (i = 1, …, N), which is a number that is either +1 or −1. If $s_i = +1$, the ith site is said to have spin up, and if $s_i = -1$, it is said to have spin down. A given set of numbers $\{s_i\}$ specifies a configuration of the whole system. The energy of the system in the configuration specified by $\{s_i\}$ is defined by the Hamiltonian

\[ H = -J \sum_{\langle ij \rangle} s_i s_j \]

where J is the coupling energy. The sum is over pairs of nearest-neighbour sites on the lattice.

To perform a Monte Carlo simulation of the Ising model, we use the Wolff cluster-flipping algorithm.
This algorithm generates large clusters on a lattice by connecting bonds from a starting point to nearest neighbours with the same spin with the following probability:

\[ p = 1 - e^{-2J/(k_B T)} \]

where J is, as above, the coupling energy and T the temperature. The model is simulated at the critical temperature $T_c$, which can be calculated via

\[ T_c = \frac{2}{\ln(1 + \sqrt{2})} = 2.26918531421302 \]

However, simulations performed with the Wolff algorithm [33] are very sensitive to the properties of the random number generator used. This effect was published in [8]: Ferrenberg, Landau and Wong reported serious discrepancies between the expected and the simulated energies for some random number generators.

A standard model size in the literature is a 16 × 16 square lattice. For this size, at the critical temperature, we know the exact solution of the Ising model for the energy average, $\langle E \rangle = -1.45306$. The result we are interested in is the deviation, in units of the standard deviation σ, of the simulation result from the exact result. To calculate the exact energies we used the exact partition functions computed by Häggkvist and Lundow [13]. A similar implementation for a 16 × 16 lattice is given in [28] and may be found at http://www.physics.helsinki.fi/~vattulai/codes/acorrtiw.f

Constructor in ising_model_test.h

    ising_model_test(uint64_t n, std::size_t lattice_size = 16)

    n              number of Wolff steps in simulation
    lattice_size   edge length of lattice

5.17. Random-walk test

In the random walk test [28] we consider random walks, something like Brownian motion, on a two-dimensional lattice. The lattice is divided into four equal blocks, each of which has an equal probability of containing the random walker after a walk of length n (or n steps). The test is performed N times, and the number of occurrences in each of the four blocks is compared with the expected value of N/4, using the χ² test with three degrees of freedom.

Vattulainen's implementation [28] in Fortran can be found at http://www.physics.helsinki.fi/~vattulai/codes/2drwtest.f

• Repeat the following procedure N times:
  – Set the x- and y-coordinates to zero.
  – Generate n random numbers $X_i \in [0; 1)$.
  – Check every random number $X_i$ and move the random walker according to the rules described below:

        X_i ∈ [0.75; 1.00)   x = x + 1
        X_i ∈ [0.50; 0.75)   x = x − 1
        X_i ∈ [0.25; 0.50)   y = y + 1
        X_i ∈ [0.00; 0.25)   y = y − 1

  – Note down the quadrant in which the random walker stops after n steps.
• Perform a χ² test with three degrees of freedom and an expected value of N/4 for each quadrant.

Example: We take the same random numbers as in 5.3 and make one random walk. We apply the given rules for each number:

    random number   rule         x    y
    init.                        0    0
    0.11            y = y − 1    0   −1
    0.83            x = x + 1    1   −1
    0.56            x = x − 1    0   −1
    0.95            x = x + 1    1   −1
    0.88            x = x + 1    2   −1
    0.73            x = x − 1    1   −1
    0.91            x = x + 1    2   −1
    0.01            y = y − 1    2   −2
    0.75            x = x + 1    3   −2
    0.67            x = x − 1    2   −2
    0.23            y = y − 1    2   −3
    0.38            y = y + 1    2   −2

We see that x is positive and y is negative, so our random walker ends up in the lower right quadrant.

Constructor in random_walk_test.h

    random_walk_test(uint64_t n, uint64_t steps)

    n       number of random walks (runs)
    steps   number of steps in each walk

5.18. n-block test

This test checks the average of subsequences, so-called blocks. This is done by calculating the average $\bar{x}$ of many sequences of uniformly distributed random numbers ($0 \le x_i < 1$) and increasing a counter if the average of the sequence satisfies $\bar{x} > 1/2$.
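The counting idea can be sketched in a few lines. This is an illustration only and not the code in n_block_test.h: the helper names count_high_blocks and n_block_chi_square are hypothetical, and the expected count of N/2 for each outcome is our assumption, based on the symmetry of the uniform distribution around 1/2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count the blocks whose average exceeds 1/2. `x` holds N * block_size
// values in [0, 1), stored block after block.
// Hypothetical helper for illustration only.
std::size_t count_high_blocks(const std::vector<double>& x, std::size_t block_size) {
    assert(block_size > 0 && x.size() % block_size == 0);
    std::size_t y1 = 0;
    for (std::size_t start = 0; start < x.size(); start += block_size) {
        double sum = 0.0;
        for (std::size_t j = 0; j < block_size; ++j) sum += x[start + j];
        if (sum / block_size > 0.5) ++y1;
    }
    return y1;
}

// Chi-square statistic with one degree of freedom for the two counts
// y1 (average > 1/2) and blocks - y1 (average <= 1/2).
// Assumption: both outcomes are expected blocks / 2 times.
double n_block_chi_square(std::size_t y1, std::size_t blocks) {
    assert(blocks > 0 && y1 <= blocks);
    const double expected = blocks / 2.0;
    const double d1 = y1 - expected;
    const double d2 = (blocks - y1) - expected;
    return (d1 * d1 + d2 * d2) / expected;
}
```

A generator that produces too many high (or too many low) blocks shows up as a large value of the statistic, while an even split gives zero.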
This test is described in [28] and an implementation in Fortran can be found at http://www.physics.helsinki.fi/~vattulai/codes/nblocktest.f

• Generate a sequence of n random numbers $x_1, x_2, \ldots, x_n$ where $0 \le x_i < 1$.
• Calculate the average $\bar{x}$ over the sequence; if $\bar{x} > 1/2$, increase $y_1$.
• Repeat the last two steps N times.
• Calculate the count for $\bar{x} \le 1/2$ as $y_2 = N - y_1$ and perform a χ² test with one degree of freedom on the $y_i$.

Vattulainen's criterion for failing: each test is repeated 3 times, and the generator fails for fixed n if at least two out of three χ² tests failed, which should occur with a probability of about 3/400.

Example: Let's generate N = 10 sequences of n = 8 numbers:

    x1       x2      x3      x4       x5       x6      x7      x8       average
    0.84     0.394   0.783   0.798    0.912    0.198   0.335   0.768    0.629
    0.278    0.554   0.477   0.629    0.365    0.513   0.952   0.916    0.586
    0.636    0.717   0.142   0.607    0.0163   0.243   0.137   0.804    0.413
    0.157    0.401   0.13    0.109    0.999    0.218   0.513   0.839    0.421
    0.613    0.296   0.638   0.524    0.494    0.973   0.293   0.771    0.575
    0.527    0.77    0.4     0.892    0.283    0.352   0.808   0.919    0.619
    0.0698   0.949   0.526   0.0861   0.192    0.663   0.89    0.349    0.466
    0.0642   0.02    0.458   0.0631   0.238    0.971   0.902   0.851    0.446
    0.267    0.54    0.375   0.76     0.513    0.668   0.532   0.0393   0.462
    0.438    0.932   0.931   0.721    0.284    0.739   0.64    0.354    0.63

Now we check the averages $\bar{x}$ and get $y_1 = 5$ and $y_2 = N - y_1 = 5$.

Constructor in n_block_test.h

    n_block_test(std::size_t n, std::size_t block_size)

    n            number of blocks
    block_size   size of each block

5.19. Random Walker on a line (Sn test)

This test uses several random walkers in one dimension. These N random walkers move simultaneously without any interaction. At each step in a walk, they can jump left or right with the same probability. After $t \gg 1$ steps for each walker, the number of visited sites $S_{N,t}$ has the asymptotic form $S_{N,t} \sim f(N)\, t^{\gamma}$, where the scaling function is $f(N) = (\ln N)^{1/2}$ and $\gamma = 1/2$ is the expected exponent as based on theory.
The value of the exponent γ observed from simulations serves as a measure of correlations.

A description is available in [27] and the corresponding implementation can be found at http://www.physics.helsinki.fi/~vattulai/codes/sn1d_test.f

5.20. 2D Intersection test

In this test we use two random walkers in two dimensions. Their paths are given by two different sequences of random numbers. After n steps of each random walker we calculate the probability that they never visit the same place in the plane (at the same time or not) except at their common starting point. For a random process it is known that this non-intersection probability I(n) behaves asymptotically like a power law $I(n) \sim n^{-\alpha}$ with an exponent $\alpha = 5/8$.

A description is available in [27] and the corresponding implementation can be found at http://www.physics.helsinki.fi/~vattulai/codes/intersections.f

5.21. 2D Height Correlation test

The Height Correlation test again observes the behavior of one-dimensional random walkers. Here the correlation between the heights of two walkers is measured, where each random walker represents a stream of random numbers. To do this we construct two sequences of random steps on a line, $(x_i^1, x_i^2)$; the height is then defined as $h_t = x_t^1 - x_t^2$. The corresponding correlation function $H_t = \langle h_t - h_0 \rangle$ is known to follow asymptotically a power law $H_t \sim t^{\phi}$ with an exponent $\phi = 1/2$.

A description is available in [27] and the corresponding implementation can be found at http://www.physics.helsinki.fi/~vattulai/codes/height.f

Constructor in height_corr2d_test

    height_corr2d_test(std::size_t n, std::size_t steps)

    n       number of samples with two walks each
    steps   number of steps per walk

5.22. Sum of independent distributions test

This test is used in the SPRNG test suite [26]. It is designed to check multiple streams for independence. This test builds n sums of groupsize random numbers from each stream and tests the distribution with a K-S statistic.

5.23. Fourier transform test

For this test I found only a very scanty description [26]. So the only way to find out how this test has to be implemented is to look into existing source code.

As a short description the following can be said. It is a test for multiple streams, but multiple streams can be built from multiple subsequences. A two-dimensional array has to be filled with random numbers, each row with n numbers from a different stream. Then the two-dimensional Fourier coefficients are calculated and compared with the expected ones. A related article can be found in [7].

5.24. Universal statistical test

This test was designed to detect any significant deviation of a device's output statistics from the statistics of a truly random bit source. This is done by measuring a parameter closely related to the device's per-bit entropy. The full description is in [23].

5.25. The Diehard Test Suite

Diehard is the name of a battery of tests for random number sequences which was developed by George Marsaglia in 1995 [18]. The original code was written in Fortran, but there are two newer implementations in C [19], [22]. A useful paper may also be [24].

The tests contained in the Diehard battery are listed below.

• Birthday Spacings test
• Overlapping Permutations
• Ranks of 31 × 31 and 32 × 32 matrices test
• Ranks of 6 × 8 matrices test
• Count the 1's in a Stream of Bytes
• Count the 1's in Specific Bytes
• Monkey tests on 20-bit Words
• Monkey tests OPSO, OQSO, DNA
• Parking Lot test
• Overlapping Sums test
• Squeeze test
• Minimum Distance test
• Random Spheres test
• Runs test
• Craps test

The following sections describe tests from the Diehard battery. In most cases the original test description (converted to LaTeX style) is quoted.

5.25.1. Birthday Spacings test

This test is described in 5.8. The parameters used in Diehard are

    runs            500
    birthdays       2^9 = 512
    days            2^24 = 16777216
    maxCollisions   “not used” (see footnote 4)

Footnote 4: In section 5.8 we wrote the expected value as $\mu = \text{birthdays}^3/(4 \cdot \text{days})$. The maxCollisions value should be much bigger than this value. In this example we get µ = 2, so we can choose 16.

5.25.2. The overlapping 5-permutation test

The following description is the original text from the Diehard test suite.

  This is the OPERM5 test. It looks at a sequence of one million 32-bit random integers. Each set of five consecutive integers can be in one of 120 states, for the 5! possible orderings of five numbers. Thus the 5th, 6th, 7th, … numbers each provide a state. As many thousands of state transitions are observed, cumulative counts are made of the number of occurrences of each state. Then the quadratic form in the weak inverse of the 120 × 120 covariance matrix yields a test equivalent to the likelihood ratio test that the 120 cell counts came from the specified (asymptotically) normal distribution with the specified 120 × 120 covariance matrix (with rank 99). This version uses 1 000 000 integers, twice.

5.25.3. Ranks of binary matrices

The Diehard test suite implements three binary matrix tests for different matrix dimensions. The aim of all these tests is the same, namely to check the rank of the constructed random matrix against the expected rank.

The binary rank test implemented here is more flexible: it is possible to specify the dimension of the matrix to construct from random numbers. In the test each random number is split into bits 0 … n, 1 … n + 1 and so on, until n reaches the bit length of the random number. Matrices are constructed from each of these sequences and over each sequence we perform a χ² test. At the end we make a K-S test over all χ² values. The probabilities for rank k in an m × n matrix are given in [32].

In the Diehard test suite the results of the 6 × 8 and the 31 × 31 or 32 × 32 matrices tests are analysed in different ways. For the 6 × 8 matrix the χ² probability of all sub-matrices is calculated and then a Kolmogorov-Smirnov test is performed over the values. For the bigger matrices, only the χ² value for the matrix is reported. To accommodate these different meanings, there are two different tests implemented, a bin_rank_ks_test and a bin_rank_chisqr_test. The usage of both classes is exactly the same, only the statistics are different.

Constructors in bin_rank_test.h

    bin_rank_ks_test(uint64_t n, std::size_t rows, std::size_t columns, std::size_t minRankCount)
    bin_rank_chisqr_test(uint64_t n, std::size_t rows, std::size_t columns, std::size_t minRankCount)

    n              number of matrices to build
    rows           number of rows in matrix
    columns        number of columns in matrix
    minRankCount   count ranks down to this rank; smaller ranks are cumulated

In the original implementation the following parameters were used:

    test                                class                  n        rows   columns   minRankCount
    Ranks of 31 × 31 matrices test      bin_rank_chisqr_test   40000    31     31        28
    Ranks of 32 × 32 matrices test      bin_rank_chisqr_test   40000    32     32        29
    Ranks of 6 × 8 matrices test        bin_rank_ks_test       100000   6      8         4

5.25.4. The bitstream test

The following description is the original text from the Diehard test suite.

  The file under test is viewed as a stream of bits. Call them $b_1, b_2, \ldots$. Consider an alphabet with two “letters”, 0 and 1, and think of the stream of bits as a succession of 20-letter “words”, overlapping. Thus the first word is $b_1 b_2 \ldots b_{20}$, the second is $b_2 b_3 \ldots b_{21}$, and so on. The bitstream test counts the number of missing 20-letter (20-bit) words in a string of $2^{21}$ overlapping 20-letter words. There are $2^{20}$ possible 20-letter words. For a truly random string of $2^{21} + 19$ bits, the number of missing words j should be (very close to) normally distributed with mean 141,909 and σ = 428. Thus $(j - 141909)/428$ should be a standard normal variate (z score) that leads to a uniform [0, 1) p value. The test is repeated twenty times.

5.25.5. The OPSO, OQSO and DNA tests

The text of the following sections is from the original description of the Diehard test suite.

OPSO means Overlapping-Pairs-Sparse-Occupancy

  The OPSO test considers 2-letter words from an alphabet of 1024 letters. Each letter is determined by a specified ten bits from a 32-bit integer in the sequence to be tested. OPSO generates $2^{21}$ (overlapping) 2-letter words (from $2^{21} + 1$ “keystrokes”) and counts the number of missing words, that is 2-letter words which do not appear in the entire sequence. That count should be very close to normally distributed with mean 141,909, σ = 290. Thus $(\text{missingwords} - 141909)/290$ should be a standard normal variable. The OPSO test takes 32 bits at a time from the test file and uses a designated set of ten consecutive bits. It then restarts the file for the next designated 10 bits, and so on.

OQSO means Overlapping-Quadruples-Sparse-Occupancy

  The test OQSO is similar, except that it considers 4-letter words from an alphabet of 32 letters, each letter determined by a designated string of 5 consecutive bits from the test file, elements of which are assumed 32-bit random integers. The mean number of missing words in a sequence of $2^{21}$ four-letter words ($2^{21} + 3$ “keystrokes”) is again 141909, with σ = 295. The mean is based on theory; σ comes from extensive simulation.

The DNA test

  The DNA test considers an alphabet of 4 letters C, G, A, T, determined by two designated bits in the sequence of random integers being tested. It considers 10-letter words, so that as in OPSO and OQSO, there are $2^{20}$ possible words, and the mean number of missing words from a string of $2^{21}$ (overlapping) 10-letter words ($2^{21} + 9$ “keystrokes”) is 141909. The standard deviation σ = 339 was determined as for OQSO by simulation. (Sigma for OPSO, 290, is the true value (to three places), not determined by simulation.)

5.25.6. The count-the-1's test

The text of the following sections is from the original description of the Diehard test suite.

A stream of bytes

  This is the “count-the-1's” test on a stream of bytes. Consider the file under test as a stream of bytes (four per 32-bit integer). Each byte can contain from 0 to 8 1's, with probabilities 1, 8, 28, 56, 70, 56, 28, 8, 1 over 256. Now let the stream of bytes provide a string of overlapping 5-letter words, each “letter” taking values A, B, C, D, E. The letters are determined by the number of 1's in a byte: 0, 1, or 2 yield A, 3 yields B, 4 yields C, 5 yields D and 6, 7 or 8 yield E. Thus we have a monkey at a typewriter hitting five keys with various probabilities (37, 56, 70, 56, 37 over 256). There are $5^5$ possible 5-letter words, and from a string of 256,000 (overlapping) 5-letter words, counts are made on the frequencies for each word. The quadratic form in the weak inverse of the covariance matrix of the cell counts provides a χ² test Q5−Q4, the difference of the naive Pearson sums of $(\mathrm{OBS} - \mathrm{EXP})^2/\mathrm{EXP}$ on counts for 5- and 4-letter cell counts.

Specific bytes

  This is the “count-the-1's” test for specific bytes. Consider the file under test as a stream of 32-bit integers. From each integer, a specific byte is chosen, say the left-most bits 1 to 8. Each byte can contain from 0 to 8 1's, with probabilities 1, 8, 28, 56, 70, 56, 28, 8, 1 over 256. Now let the specified bytes from successive integers provide a string of (overlapping) 5-letter words, each “letter” taking values A, B, C, D, E. The letters are determined by the number of 1's in that byte: 0, 1 or 2 → A, 3 → B, 4 → C, 5 → D, and 6, 7 or 8 → E. Thus we have a monkey at a typewriter hitting five keys with various probabilities 37, 56, 70, 56, 37 over 256. There are $5^5$ possible 5-letter words, and from a string of 256 000 (overlapping) 5-letter words, counts are made on the frequencies for each word.
  The quadratic form in the weak inverse of the covariance matrix of the cell counts provides a χ² test Q5−Q4, the difference of the naive Pearson sums of $(\mathrm{OBS} - \mathrm{EXP})^2/\mathrm{EXP}$ on counts for 5- and 4-letter cell counts.

5.25.7. The parking lot test

The following description is the original text from the Diehard test suite.

  In a square of side 100, randomly “park” a car, a circle of radius 1. Then try to park a 2nd, a 3rd, and so on, each time parking “by ear”. That is, if an attempt to park a car causes a crash with one already parked, try again at a new random location. (To avoid path problems, consider parking helicopters rather than cars.) Each attempt leads to either a crash or a success, the latter followed by an increment to the list of cars already parked. If we plot n: the number of attempts, versus k: the number successfully parked, we get a curve that should be similar to those provided by a perfect random number generator. Theory for the behavior of such a random curve seems beyond reach, and as graphics displays are not available for this battery of tests, a simple characterization of the random experiment is used: k, the number of cars successfully parked after n = 12 000 attempts. Simulation shows that k should average 3523 with σ = 21.9 and is very close to normally distributed. Thus $(k - 3523)/21.9$ should be a standard normal variable, which, converted to a uniform variable, provides input to a KS-test based on a sample of 10.

5.25.8. The overlapping sums test

The following description is the original text from the Diehard test suite.

  Integers are floated to get a sequence $U(1), U(2), \ldots$ of uniform [0, 1) variables. Then overlapping sums, $S(1) = U(1) + \cdots + U(100)$, $S(2) = U(2) + \cdots + U(101)$, … are formed. The S's are virtually normal with a certain covariance matrix.
  A linear transformation of the S's converts them to a sequence of independent standard normals, which are converted to uniform variables for a KS-test. The p-values from ten KS-tests are given still another KS-test.

5.25.9. Squeeze test

The following description is the original description from Diehard and can be found at [18], [19].

  Random integers are floated to get uniforms on [0, 1). Starting with $k = 2^{31} = 2147483647$, the test finds j, the number of iterations necessary to reduce k to 1, using the reduction $k = \mathrm{ceiling}(k \times U)$, with U provided by floating integers from the file being tested. Such j's are found 100 000 times, then counts for the number of times j was ≤ 6, 7, …, 47, ≥ 48 are used to provide a χ² test for cell frequencies.

Constructor in squeeze_test.h

    SqueezeTest(uint64_t n, uint64_t squeezeStart, std::size_t maxCount)

    n              number of numbers to squeeze
    squeezeStart   start value of squeezing
    maxCount       squeeze steps bigger than this number are cumulated

The implemented version of the squeeze test is a bit more universal. In the original implementation the parameters are

    n              100000
    squeezeStart   2^31 − 1 = 2147483647
    maxCount       48

The probability for i squeeze steps, used to perform the χ² test, is calculated by the following formula:

\[ p_i(k) = \frac{1}{k}\, \frac{(\ln k)^{i-1}}{\Gamma(i)} \qquad (5.7) \]

5.25.10. The Minimum Distance test

The following description is the original description from Diehard and can be found at [18], [19]. The implemented version is based on [9], where the exact expectation values are given.

  It does this 100 times: choose n = 8000 random points in a square of side 10000. Find d, the minimum distance between the $(n^2 - n)/2$ pairs of points. If the points are truly independent uniform, then $d^2$, the square of the minimum distance, should be (very close to) exponentially distributed with mean 0.995.
  Thus $1 - \exp(-d^2/0.995)$ should be uniform on [0, 1) and a KS-test on the resulting 100 values serves as a test of uniformity for random points in the square. Test numbers = 0 mod 5 are printed but the KS-test is based on the full set of 100 random choices of 8000 points in the 10000 × 10000 square.

Constructor in minimum_distance_test.h

    minimum_distance_test(std::size_t runs, std::size_t n)

    runs   number of experiments
    n      number of points to place in square

The implemented version of the minimum distance test is a bit more universal. In the original implementation, which runs 100 experiments with 8000 points each, the parameters are

    runs   100
    n      8000

5.25.11. Random Sphere test

In this implementation of the “Random Sphere” test the number of spheres to place in space is not fixed; it may be changed. The following description is quoted from the Diehard test suite. For calculating the probabilities, the report [9] is really helpful.

  Choose 4000 random points in a cube of edge 1000. At each point, center a sphere large enough to reach the next closest point. Then the volume of the smallest such sphere is (very close to) exponentially distributed with mean $120\pi/3$. Thus the radius cubed is exponential with mean 30. (The mean is obtained by extensive simulation.) The “3D-spheres” test generates 4000 such spheres 20 times. Each min radius cubed leads to a uniform variable by means of $1 - \exp(-r^3/30)$, then a KS-test is done on the 20 p-values.

Constructor in random_sphere_test.h

    random_sphere_test(std::size_t runs, std::size_t n)

    runs   number of experiments
    n      number of spheres to place in the cube

In the original implementation the parameters shown next are used:

    runs   20
    n      4000

5.25.12. The runs test

The runs test is described in section 5.2. The parameters for the runs test used in the Diehard test suite are

    n              10000
    maxRunLength   6

5.25.13. Craps test

This is one more test introduced with the Diehard test suite. Marsaglia gives the following description:

  This is the “craps test”.
  It plays 200 000 games of craps, finds the number of wins and the number of throws necessary to end each game. The number of wins should be (very close to) a normal with mean $200000\,p$ and variance $200000\,p(1 - p)$, with $p = 244/495$. Throws necessary to complete the game can vary from 1 to ∞, but counts for all > 21 are lumped with 21. A χ² test is made on the no.-of-throws cell counts. Each 32-bit integer from the test file provides the value for the throw of a die, by floating to [0, 1), multiplying by 6 and taking 1 plus the integer part of the result.

Constructor in craps_test.h

    craps_test(uint64_t n, std::size_t max_throws)

    n            number of Craps games to play
    max_throws   maximal number of dice rolls above which the counts are cumulated

6. Extending the Random Number Generator Test Suite

This chapter is, in addition to the source code, the key to extending the RNGTS framework. It shows how to implement further tests, either by using the given base classes or by fulfilling the requirements directly, and how to add other random number generators. At the end there is also an overview of the XML schema used.

6.1. How to implement a test

To write a new test for random number generators, a specific interface needs to be implemented. This allows the RNGTS framework to interact with the test, e.g. to execute the test automatically. Unfortunately, in C++ it is not possible to define interfaces which act only as specifications for the methods to implement (like in Java). One way to make an interface is to build abstract classes, but then we have virtual function calls. There are only a few methods to implement, which are described below. The following listing shows the base of each test, containing all required methods.
    #include "buffered_random.h"  // definition of "buffered_random_rumber_generator_base"
    #include "xml_helper.h"       // XML output functions

    class the_new_test {
    public:
        the_new_test(...);
        void run(buffered_random_rumber_generator_base& rng);
        std::string test_name() const;
        template < class InputIterator >
        void analyze(xml_helper& out, InputIterator cl_begin, InputIterator cl_end) const;
        void print_parameters(xml_helper& out) const;
    };

• The constructor must be able to take all parameters which are needed to run a complete test, e.g. the number of runs.

    the_new_test(uint64_t runs, ...)

• The RNGTS framework calls the run method to execute the test. When run has finished its work, the statistic must have been calculated.

    void run(buffered_random_rumber_generator_base& rng)

    rng   the actual random number generator to test. It may be converted to a boost::uniform_real generator or another boost type.

• This method must return the name of the test.

    std::string test_name() const

• The task of the analyze(...) method is to check the confidence level for the calculated quantities. It also has to write the results in an XML structure to the output. The available XML tags can be found in the XML Schema definition of the result file or in the listing. Below a sample implementation is given.
template < class InputIterator > void analyze(xml_helper& out, InputIterator cl_begin, InputIterator cl_end) const { // this implementation is given as a example // helper to convert numeric values to strings std::ostringstream val; // tag marks the begin of the result section in the XML output // if it is not a χ 2 or a KS analyze one makes a ’RESULTS’ tag else one can // make a ’CHI_SQUARE’ or ’KOLMOGOROV_SMIRNOV’ tag, or better, one uses the // appropriate base class and this method is already implemented out.startTag("RESULTS"); // write all relevant results as a tag to the XML stream // convert ’result’ to a stream val << result_; // write the tag out.make_result_tag("Error", val.str()); // clear the stream val.str(""); // test each confidence level if it fulfilled or not while (begin != cl_end) { // check if it is fulfilled if (error_ < *begin) { // write ’PASSED’ tag if the result is good out.start_tag("PASSED"); } else { // write ’FAILED’ tag if the result is bad out.start_tag("FAILED"); } // write which confidence level was checked and go to the next conf. level out.add_attribute("confidenceLevel", *begin++); // end the PASSED/FAILED tag out.end_tag(); } // end the ’RESULT’ tag out.end_tag(); } out std::basic_stream to write the output cl_begin iterator to the begin of the data structure whit the conﬁdence level 0 1¡ ¢ cl_end iterator to the end of the data structure whit the conﬁdence level 50 6.1. How to implement a test This method must write all required parameters to reproduce the test to the XML structure, below an example is given. void print_parameters(xml_helper& out) const { // this implementation is given as a example // helper to convert numeric values to strings std::ostringstream val; // converts the ’parameter_’ to a stream val << parameter_; // writes a parameter tag to the XML output out.make_parameter_tag("My Parameter description", val.str()); // clears the stream val.str(""); } } out std::basic_stream to write the XML output 6.1.1. 
Implementing a χ², Kolmogorov-Smirnov or a Gaussian test

This section shows how a χ², Kolmogorov-Smirnov or Gaussian test can be built. These frequently used types of tests are each supported by a base class which implements generic methods for calculation and analysis. The base classes and their most commonly used methods are the following:

chisquare_test

    template <class DerivedType>
    class chisquare_test {
      void prepare_statistics(std::size_t count_size, uint64_t runs,
                              std::size_t degOfFreedom = 0);
      inline std::size_t get_entry(buffered_random_rumber_generator_base& rng);
      double get_chisqr_probability(std::size_t i) const;
    };

ks_test

    template <class DerivedType>
    class ks_test {
      void prepare_statistics(uint64_t runs);
      inline double get_entry(buffered_random_rumber_generator_base& rng);
    };

gaussian_test

    class gaussian_test {
      void prepare_statistics(double deviation, double stat_value, double mean);
      void calc_gaussian_value();
    };

The implementations of a χ² test and a Kolmogorov-Smirnov test are very similar. Implementing a Gaussian test is different because the base class cannot provide as much functionality as in the other two cases. Only a short overview of the most important methods of the test base classes is given here. More detailed and specific information can be found in the class descriptions and in the source itself.

6.1.2. χ² test

The sequence diagram in figure 6.1 gives an overview of the methods involved and the order in which they are called.

The prepare_statistics method has to be called before the underlying test is executed by the run method. This must be done in the constructor of the test class.
    void prepare_statistics(std::size_t count_size, uint64_t runs,
                            std::size_t degOfFreedom = 0);

    count_size     the number of classes used to build the statistic
    runs           the number of invocations of the get_entry method
    degOfFreedom   the degrees of freedom used for the statistical calculations;
                   by default count_size − 1 is used

The base class invokes the get_entry method the chosen number of times (runs). This method must return the index of the class to which the calculated/measured value belongs. The corresponding class count is then increased. Keep in mind that this method must not leave the object in a state that is not equivalent to its state right after construction: the RNGTS framework only calls get_entry, so there is no opportunity to reset any variables before a new generator is tested.

    inline std::size_t get_entry(buffered_random_rumber_generator_base& rng);

    rng      random number generator to use in the test
    return   the index of the class corresponding to the calculated value

The base class needs the probability of each class to calculate the χ² statistic, so the test class has to provide such a method.

    double get_chisqr_probability(std::size_t i) const;

    i        class to get the probability for, 0 ≤ i < count_size
    return   the probability for class i

[Figure 6.1: Sequence diagram for the χ² test. Participants: concrete_test_runner, chisquare_test, any chisquare test. Calls in order: create; create (sets a possible statistic name); prepare_statistics; test_name; some time after creation: run; get_entry (called n times); calculate_chisquare_value; get_chisqr_probability; print_parameters; analyze.]

6.1.3. Kolmogorov-Smirnov test

For an overview of the methods involved and the order in which they are called, the figure used for the χ² test (figure 6.1) is also useful here; one only has to replace chisquare by ks.

The prepare_statistics method has to be called before the underlying test is executed
by the run method. This must be done in the constructor of the test class. The parameters are:

    void prepare_statistics(uint64_t runs);

    runs   the number of invocations of the get_entry method

The base class invokes the get_entry method the chosen number of times (runs). This method must return a probability value for the K-S statistic. (As the name “probability” already suggests, the value must lie in [0, 1].) Keep in mind not to change the internal state of the class, for the same reason as in the χ² test class.

    inline double get_entry(buffered_random_rumber_generator_base& rng);

    rng      random number generator to use in the test
    return   a probability value for the K-S statistic

6.1.4. Gaussian test

The main difference from the two base classes above is that the test itself has to calculate some statistical values. These values have to be passed to the base class for further calculation. They are passed via the prepare_statistics method, which therefore has to be called after the test has run. In addition, one needs to implement the run method instead of a get_entry routine. The order of method calls differs slightly from that of the previous tests; the exact sequence is shown in the sequence diagram in figure 6.2.

    void prepare_statistics(double deviation, double stat_value, double mean);

    deviation    the calculated/measured deviation, in units of σ
    stat_value   the calculated/measured value (the “result”)
    mean         the expected mean value

After the statistic has been prepared with the method above, the Gaussian value can be calculated. This method calculates the deviation from the mean value as a factor (which may also be interpreted as a percentage).

    void calc_gaussian_value();

This method is discussed in section 3.3.

6.2. The multiple_test wrapper

There are some cases in which a test has more than one statistic, e.g. the “runs” test.
In such cases it is not possible to derive the test class twice from the same base class, so another concept is needed. To permit the use of several statistical tests within one test, we provide the multiple_test class as a base class. This class takes a tuple of statistical test types and a tuple of as many std::string types as template parameters.

[Figure 6.2: Sequence diagram for the Gaussian test. Participants: concrete_test_runner, gaussian_test, any gaussian test. Calls in order: create; create (sets a possible statistic name); test_name; some time after creation: run; prepare_statistics (called after the run method); calc_gaussian_value; print_parameters; analyze.]

Using the class is quite simple. The first thing to do is to derive the test class from the multiple_test base class. In the “runs test” example (which contains two χ² tests) this looks as follows:

    class runs_test : public multiple_test<
      boost::tuple<chisquare_test<runs_test>, chisquare_test<runs_test> >,
      boost::tuple<std::string, std::string> >
    { ... }

or, as an interface description:

    template <class T, class S>
    class multiple_test

    T   boost::tuple containing the desired statistical test types
    S   boost::tuple containing as many std::string types as there are test
        types in T; these are used to store each test's individual name

The constructor of the derived class has to call the constructor of the multiple_test base class to set the name of each statistical test. In our runs example:

    runs_test(uint64_t n, std::size_t maxRunLength)
      : multiple_test<
          boost::tuple<chisquare_test<runs_test>, chisquare_test<runs_test> >,
          boost::tuple<std::string, std::string>
        >(boost::make_tuple("Runs-Up", "Runs-Down")),
        ...

The constructor is called with the two statistic names, “Runs-Up” and “Runs-Down”. The first name in the S tuple is assigned to the first test in the T tuple, and so on.
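The positional pairing of names and statistics can be illustrated with a small, self-contained sketch. It is not the RNGTS implementation: it uses std::tuple (C++14) in place of boost::tuple, and named_statistic, multiple_test_sketch, runs_test_sketch and the member statistics_ are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical stand-in for a statistic base class such as chisquare_test;
// it only remembers its name and the number of runs.
struct named_statistic {
    std::string name;
    unsigned long long runs = 0;
    void prepare_statistics(unsigned long long r) { runs = r; }
};

// Sketch of the wrapper idea: T is a tuple of statistics, S a tuple of names.
template <class T, class S>
class multiple_test_sketch {
    static_assert(std::tuple_size<T>::value == std::tuple_size<S>::value,
                  "one name per statistic is required");
public:
    explicit multiple_test_sketch(const S& names) {
        assign_names(names,
                     std::make_index_sequence<std::tuple_size<T>::value>{});
    }
protected:
    T statistics_;  // hypothetical member holding all statistics
private:
    template <std::size_t... I>
    void assign_names(const S& names, std::index_sequence<I...>) {
        // One assignment per position: name 0 -> statistic 0, and so on.
        int expand[] = {
            0, ((std::get<I>(statistics_).name = std::get<I>(names)), 0)...};
        (void)expand;
    }
};

using two_stats = std::tuple<named_statistic, named_statistic>;
using two_names = std::tuple<std::string, std::string>;

class runs_test_sketch : public multiple_test_sketch<two_stats, two_names> {
public:
    explicit runs_test_sketch(unsigned long long n)
        : multiple_test_sketch<two_stats, two_names>(
              two_names("Runs-Up", "Runs-Down")) {
        // Each statistic is prepared through positional tuple access.
        std::get<0>(statistics_).prepare_statistics(n);
        std::get<1>(statistics_).prepare_statistics(n);
    }
    const named_statistic& up() const { return std::get<0>(statistics_); }
    const named_statistic& down() const { return std::get<1>(statistics_); }
};

// Usage: names and statistics line up by position.
inline void check_runs_sketch() {
    runs_test_sketch t(1000);
    assert(t.up().name == "Runs-Up");
    assert(t.down().name == "Runs-Down");
}
```

The static_assert makes the "as many names as tests" requirement a compile-time check; whether the real multiple_test enforces this in the same way is not stated in the text.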
The interface of the constructor is the following:

    multiple_test(S statistic_names)

    statistic_names   a boost::tuple containing the name of each statistic

All statistical tests are stored in a member variable called multipleTest_, which is accessible from the derived class. Getting access to each statistic is simple. For example, a call of the method

    boost::tuples::get<0>(multipleTest_).prepare_statistics(...)

prepares the first statistic in the tuple, where boost::tuples::get<0>(multipleTest_) grants access to the first element of the test tuple. In general the following syntax can be used:

    boost::tuples::get<n>(multipleTest_).method();

    n        number of the statistic to access; the order of the statistics is
             the order used in the base class specification of the derived class
    method   the name of the method to call on the statistic at position n

Note that the multiple_test base class can be used only if the statistics of all associated statistical tests are to be written out, because this wrapper calls the analyze method of each associated statistical test.

6.3. Useful sequence diagrams

When implementing new tests or other extensions to the test suite, it is sometimes important to know the order of method calls. A graphical representation is given in the UML diagrams in figures 6.3 and 6.4. These diagrams show only some special cases because of the vast variety of possible cases.

6.4. Demands on Random Number Generators

To be used with this test suite, a random number generator has to satisfy several requirements. These are nearly the same as those a Boost “Pseudo-Random Number Generator” has to satisfy. Jens Maurer wrote a specification for the Boost library called “Random Number Generator Library Concepts”, which can be found in the Boost documentation [2]; a summary is given in table 6.1. One also has to implement an appropriate traits class to allow the use of a seed(value) method. This method is not required by the standard but is often implemented.
If the generator supports the “single call” method, the traits class can be implemented, in the rng_traits.h header, as follows:

    // from lagged_fibonacci.hpp
    template<class RealType, int w, unsigned int p, unsigned int q>
    struct has_single_call<brand::lagged_fibonacci_01<RealType, w, p, q> > {
      BOOST_STATIC_CONSTANT(bool, value = true);
    };

If there is no “single call” method, the value must be false:

    // example from additive_combine.hpp
    template<class MLCG1, class MLCG2, typename MLCG1::result_type val>
    struct has_single_call<brand::additive_combine<MLCG1, MLCG2, val> > {
      BOOST_STATIC_CONSTANT(bool, value = false);
    };

6.5. Foreign Random Number Generators

It is also possible to test “foreign” random number generators, such as those written in C or Fortran. For such generators a simple wrapper class is provided which encapsulates the call for the next random number. This class supports all methods required of a pseudo-random number generator. The declaration is the following:

    template<typename return_type, return_type RNG()>
    class rng_wrapper

    return_type   type of the generated random numbers
    RNG()         function pointer to the random number function

The constructor of the class has the signature:

    rng_wrapper(result_type min_value = 0, result_type max_value)

[Figure 6.3: Sequence diagram, initialization of the test suite. Participants: test_suite_main, rng_test_suite, buffered_random_ generator holder, concrete_test_runner. Calls in order: add_confidence_level; add to set; add_seed; add to vector; register_rng; create; create generator holder; add to vector; register_seeded_rng; register_test; create; run_tests.]
[Figure 6.4: Sequence diagram, “run a test” part. Participants: test_suite_main, rng_test_suite, xml_helper, buffered_random_number_generator, test_runner, concrete_test_runner. Calls in order: run_tests; print initial tag; add initial attribute; get rng from vector; print rng specific tag; add rng specific attributes; print seed tag; print seed; get test from vector; set_confidence_level; seed; warm_up; run; run; print test tag; print test attributes; run rng test; print parameter tag; print parameter attributes; start analyze tag; analyze. Note in the diagram: Why first run the test and then print the parameters? So that it is possible to print interesting rng specific parameters, like the number of bits per number.]

Table 6.1: Requirements for “Pseudo-Random Number Generators” (PseudoRandomNumberGenerator requirements)

    expression        return type     description
    X::result_type    T               type of random numbers
    operator()()      T               returns next random number
    min()             T               lower bound of random numbers
    max()             T               upper bound of random numbers
    X()               -               default constructor
    X(it1, it2)       void            creates a generator initialized with values between it1 and it2
    seed()            void            sets the same state as in X()
    seed(it1, it2)    void            seeds the generator with values between it1 and it2
    x == y            bool            checks whether the generators have the same state
    x != y            bool            checks whether the generators do not have the same state
    operator<<        std::ostream&   writes the generator in its textual representation
    operator>>        std::istream&   reads the generator from its textual representation

There is no possibility to specify a seed function as a function pointer! Why not? The problem is the internal use of a clone method which duplicates the state of the generator. In this case we only have a function pointer to the function delivering the next number. It is not possible to copy the state of the generator.
So, a seed function does not make sense, because we cannot seed from an initial state which is equal for all tests. Using seeded generators is nevertheless possible via the register_seeded_rng method.

To show the usage of the wrapper we give a short example. We assume that there is a file called mt19937ar.c implementing a variant of the “Mersenne Twister”. We will generate numbers of type double. To do this, the required C functions have to be declared in a C++ file. This is done with the extern statement:

    extern "C" {
      /* generates a random number on [0,1)-real-interval */
      double genrand_real2(void);
    }

Adding the generator to the test suite is straightforward: one only has to specify the desired template parameters.

    rng_wrapper<double, genrand_real2> mersenne_double;
    rngTest.register_seeded_rng<
      rng_wrapper<
        double,         // result type
        genrand_real2   // function name
      >
    >(mersenne_double, "C Mersenne (double)", "standard seed");

To compile the whole thing, the file containing the generator must be precompiled into an object file, which can then be linked with the other parts of the test suite.

6.6. The XML Schema

The XML format was chosen in order to have a universal format with a simple structure which allows transformation to other formats like HTML or LaTeX. Such transformations are done with XSLT [5] (XSL Transformations) style sheets, which contain rules to generate the appropriate output. Here we cover the translation to HTML and LaTeX.

To view the results in HTML one only needs a “modern” web browser that understands XML and stylesheets. “Mozilla” and “Internet Explorer” are able to process the instructions. The stylesheet is called xml2html.xsl.

There is also a stylesheet (xml2LaTeX.xsl) to translate the output to a LaTeX source file. To make this transformation, an XSLT processor is used.
(A standard one is the “xsltproc” tool, available at [31] as a part of the “GNOME” project.) The transformation delivers a LaTeX source file which can then be processed into a PostScript file or any other format.

The structure, attributes and restrictions are defined in an XML schema. A graphical representation is shown in figure 6.5. The following list gives a short description of the different tags and attributes; a detailed description of the whole schema is found in the source.

RNG_TEST_SUITE_RESULT
    date             the tests' starting date
RNG
    name             the name of the random number generator
    warmup           number of random numbers to throw away for warmup
SEED
    seed             seed value or, if the generator was seeded by the user, the string user-seeded
    description      if the generator was seeded by the user, a description of the seed used (optional)
TEST
    name             the name of the test
PARAMETERS
ANALYZE
PARAMETER
    name             name of the parameter
    value            value of the parameter
CHI_SQUARE
    name             the name of the statistic (optional)
    chi2             the χ² value
    probability      the probability for the χ² value
    dof              the degrees of freedom of the statistic
KOLMOGOROV_SMIRNOV
    name             the name of the statistic (optional)
    ksPlus           the Kolmogorov-Smirnov K⁺ value
    probPlus         the probability for the K⁺ value
    ksMinus          the Kolmogorov-Smirnov K⁻ value
    probMinus        the probability for the K⁻ value
    dof              the degrees of freedom of the statistic
RESULTS
    name             the name of the statistic (optional)
PASSED
    confidenceLevel  confidence level at which the test passes
FAILED
    confidenceLevel  confidence level at which the test fails
RESULT
    name             name of the result value
    value            value of the result
[Figure 6.5: The XML Schema. Elements and attributes shown: RNG_TEST_SUITE_RESULT (date: xs:date); RNG (name: xs:string, warmup: xs:integer); SEED (seed: xs:integer, description: xs:string); TEST (name: xs:string); ANALYZE; PARAMETERS; PARAMETER (name: xs:string, value: xs:string); RESULTS, CHI_SQUARE and KOLMOGOROV_SMIRNOV (each with name: xs:string); RESULT (name: xs:string, value: xs:string); PASSED and FAILED (each with confidenceLevel: probability_t). Types shown: result_statistic_t, statistic_t, probability_t (xs:double restricted to 0.0 ≤ value ≤ 1.0), integer (xs:integer restricted to value ≥ 0).]

A. Collection of Test Parameters

The following tables itemize tests and their parameters as used in test suites or described in other publications.

Table A.1: Test parameters used in [30]

    Test             Numbers   Iterations¹   Other Parameters
    χ²               100000    10000         classes = 256
    χ²               10000     10000         classes = 128
    Serial test      100000    1000          dimension = 2, gridSize = 100
    Serial test      100000    1000          dimension = 3, gridSize = 20
    Serial test      100000    1000          dimension = 4, gridSize = 10
    Gap test         25000     1000          lowerGapLimit = 0, upperGapLimit = 0.05, maxGapCount = 30
    Gap test         25000     1000          lowerGapLimit = 0.45, upperGapLimit = 0.55, maxGapCount = 30
    Gap test         25000     1000          lowerGapLimit = 0.95, upperGapLimit = 1, maxGapCount = 30
    Maximum of t     2000      1000          t = 5, bins = 5
    Maximum of t     2000      1000          t = 3, bins = 3
    Collision test   16384     1000          dim = 2, edge_length = 1024
    Collision test   16384     1000          dim = 4, edge_length = 32
    Collision test   16384     1000          dim = 10, edge_length = 4
    Run test         100000    1000          maxRunLength = 6

¹ In the “Random Number Generator Test Suite”, the number of iterations is not a parameter of the test. The test must be wrapped with the iterate_test class.
In this version, this is not possible for the “Run test” because there is no wrapper class for the multiple_test base class.

Table A.2: Test parameters used in [29]

    Test               Numbers                  Steps
    Random walk test   n = 10^6, 10^7, 10^8     steps = 0, ..., 1000
    n block test       n = 10^4                 steps = 10^6
    n block test       n = 5000                 steps = 10^8
    n block test       n = 25000                steps = 10^7
    n block test       n = 1500                 steps = 10^9

Table A.3: Test parameters used in the SPRNG test suite [21]

    Test               Numbers       Other Parameters
    χ²                 n = 1000000   classes = 100
    Serial test        n = 500000    gridSize = 64, dimension = 2
    Gap test           n = 100000    lowerGapLimit = 0.5, upperGapLimit = 0.6, maxGapCount = 20
    Permutation test   n = 200000    nrOfElements = 5
    Runs test          n = 600000    maxRunLength = 7
    Coupon test        n = 20000     different_coupons = 10, maxSeq = 30
    Maximum of t       n = 100000    t = 10, bins = 10
    Poker test         n = 100000    different_cards = 10

B. Examples

To demonstrate how easy the test suite is to handle and to show a number of its possibilities, some examples have been added to the source code. Most of the examples have a self-explanatory name and contain a short description inside the code. Here is the list of examples:

bit_extract_example
    This is an example for the “Bit extract test” in section 4.7. In the first part, the lower 10 bits are used to build a new random number; in the second part, bit number 20 of 10 random numbers is used to build a new random number.
bit_test_example
    This is an example for the “Bit test” in section 4.6. A mask with a length of 30 bits is used to produce new random numbers.
count_failings_example
    This is an example for the “Count failings test” in section 4.5. A test is run 1000 times and it passes if it fails fewer than 100 times.
doc_example
    This is the example from the documentation in section 4.1.
foreign_rng_example
    This is an example of using a foreign random number generator as described in section 6.5.
    The random number generator used is the original C version of the “Mersenne Twister”, which can be downloaded at http://www.math.keio.ac.jp/~nisimura/random/real1/mt19937-1.c. More detailed instructions are given in the source file.
helsinki
    This is the same collection of tests as used in the “Comparative study of some pseudorandom number generators” [30], except for the runs test.
iterating_example
    This is an example of iterating tests as described in section 4.4. Each test is iterated 1000 times and analysed.
iterator_seed_example
    This example shows how to seed a random number generator with iterators. Here, Boost's “Mersenne Twister” is seeded with a vector filled by a linear congruential generator.
parallel_example
    This example shows how a simple parallel generator may be constructed and used. A parallel generator consisting of two differently seeded “Lagged Fibonacci” generators is used.
all_tests_example
    In this example all currently available tests are included.

The parameters for the tests, except those in the “helsinki” example, are merely examples. For real tests they have to be changed or consciously accepted.

C. Compiling the Test Suite

If you do not want to run the Makefile, or if it does not work, the RNGTS may also be compiled by hand. This is quite simple; one only has to consider three points:

Is the Boost library installed?
    The Boost library must be installed in order to compile the test suite. If it is not, the source can be found at [2]. Once the library is installed, the include path has to be specified. This is done with
        -I/path_to_boost
Is the “Runs test” used/included?
    If the “Runs test” is performed in the test suite, or even if only its header file is included, the Boost Sandbox¹ has to be installed and specified in the include path. The “Runs test” uses LAPACK and BLAS, so these libraries must also be available.² Because these libraries are based on Fortran code, the g2c library has to be used.
    In the end, the following arguments have to be added to the command line:
        -llapack -lblas -lg2c -I/path_to_boost-sandbox
Are any external random number generators used?
    Last but not least, an external generator can also be used. In that case the generator must be available as a precompiled object file, which has to be added to the argument line, for example
        ext_gen.o

If neither of the last two points applies, the following command line may be used to compile the test suite. The file containing the main routine is called RNG_test_suite_test.C.

    g++ -I/path_to_boost -I. RNG_test_suite_test.C -o RNG_test_suite_test -lm

¹ Also available at [2], via the “Sandbox CVS” link.
² LAPACK and BLAS are installed on most systems. If not, they are available on the Internet at http://www.netlib.org/lapack/ and http://www.netlib.org/blas/.

Bibliography

[1] Z. W. Birnbaum and F. H. Tingey. One-sided confidence contours for probability distribution functions. Annals of Mathematical Statistics, 22(4):592–596, 1951.
[2] Booster. Boost libraries, 2002–2004. URL http://www.boost.org.
[3] I. N. Bronstein, K. A. Semendjajew, G. Musiol, and H. Mühlig. Taschenbuch der Mathematik. Harri Deutsch, Frankfurt am Main, 4th edition, 1999. ISBN 3-8171-2004-1.
[4] T. H. Chow. Tuning the collision test for stringency, 2000. URL http://citeseer.nj.nec.com/436535.html.
[5] J. Clark. XSL Transformations (XSLT) Version 1.0, 1999. URL http://www.w3.org/TR/xslt.
[6] A. Compagner. The Hierarchy of Correlations in Random Binary Sequences. Journal of Statistical Physics, 63(5/6):883–896, 1991.
[7] R. R. Coveyou and R. D. Macpherson. Fourier Analysis of Uniform Random Number Generators. J. ACM, 14(1):100–119, 1967. ISSN 0004-5411. URL http://doi.acm.org/10.1145/321371.321379.
[8] A. M. Ferrenberg, D. P. Landau, and Y. J. Wong. Monte Carlo Simulations: Hidden Errors from "Good" Random Number Generators. Physical Review Letters, 69(23):3382–3384, 1992.
[9] M. Fischler.
Distribution of minimum distance among n random points in d dimensions. Technical report, Fermilab (FNAL), 2001. URL http://www.slac.stanford.edu/spires/find/hep/www?r=fermilab-tm-2170. FERMILAB-TM-2170.
[10] G. Gonnet. Repeating Time Test for U(0,1) Random Number Generators. Technical report, Informatik, ETH, Zurich, May 2003. URL http://www.inf.ethz.ch/personal/gonnet/RepetitionTest.html.
[11] I. D. Hill and M. C. Pike. Algorithm 299: Chi-squared integral. Commun. ACM, 10(4):243–244, 1967. ISSN 0001-0782. URL http://doi.acm.org/10.1145/363242.363274.
[12] I. D. Hill and M. C. Pike. Remark on Algorithm 299. ACM Trans. Math. Softw., 11(2):185, 1985. ISSN 0098-3500. URL http://doi.acm.org/10.1145/214392.214405.
[13] R. Häggkvist and P. H. Lundow. The Ising Partition Function for 2D Grids with Periodic Boundary: Computation and Analysis. Journal of Statistical Physics, 108:429–457, 2002.
[14] D. Ibbetson. Algorithm 209: Gauss. Commun. ACM, 6(10):616, 1963. ISSN 0001-0782. URL http://doi.acm.org/10.1145/367651.367664.
[15] E. Ising. Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, pages 253–258, 1925.
[16] D. E. Knuth. The Art of Computer Programming, Volume 2 (3rd Ed.): Seminumerical Algorithms. Addison-Wesley Longman Publishing Co., Inc., 1997. ISBN 0-201-89684-2.
[17] G. Marsaglia. A current view of random number generators. Computer Science and Statistics, 9(26):1–10, 1993. URL http://www.csis.hku.hk/~diehard/cdrom/linux.tar.gz:monkey.ps.
[18] G. Marsaglia. The diehard test suite, 1995. URL http://stat.fsu.edu/~geo/diehard.html.
[19] G. Marsaglia. The diehard test suite, 2003. URL http://www.csis.hku.hk/~diehard/.
[20] G. Marsaglia and W. W. Tsang. Some difficult-to-pass tests of randomness. Journal of Statistical Software, 7(3):1–8, 2002. URL http://www.jstatsoft.org/v07/i03; http://www.jstatsoft.org/v07/i03/tuftests.c; http://www.jstatsoft.org/v07/i03/tuftests.pdf; http://www.jstatsoft.org/v07/i03/updates.
[21] M.
Mascagni. The Scalable Parallel Random Number Generators Library (SPRNG) for ASCI Monte Carlo computations, 1999. URL http://sprng.cs.fsu.edu/.
[22] M. Mascagni. A parallel version of the diehard test suite, 2003. URL http://www.cs.fsu.edu/~mascagni/research/.
[23] U. Maurer. A Universal Statistical Test for Random Bit Generators. Journal of Cryptology, 5(2):89–105, 1992.
[24] O. E. Percus and P. A. Whitlock. Theory and application of Marsaglia's monkey test for pseudorandom number generators. ACM Trans. Model. Comput. Simul., 5(2):87–100, 1995. ISSN 1049-3301. URL http://doi.acm.org/10.1145/210330.210331.
[25] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, 1992. ISBN 0521437148. URL http://lib-www.lanl.gov/numerical/bookcpdf.html.
[26] A. Srinivasan, M. Mascagni, and D. Ceperley. Testing parallel random number generators. Parallel Comput., 29(1):69–94, 2003. ISSN 0167-8191.
[27] I. Vattulainen. Framework for Testing Random Numbers in Parallel Calculations. Physical Review E, 59:7200, 1999.
[28] I. Vattulainen, T. Ala-Nissila, and K. Kankaala. Physical Tests for Random Numbers in Simulations. Physical Review Letters, 73:2513–2516, 1994.
[29] I. Vattulainen, T. Ala-Nissila, and K. Kankaala. Physical models as tests of randomness. Physical Review E, 52(3):3205–3214, 1995.
[30] I. Vattulainen, K. Kankaala, J. Saarinen, and T. Ala-Nissila. A comparative study of some pseudorandom number generators. Computer Physics Communications, 86:209–226, 1995.
[31] D. Veillard. The XSLT C library for Gnome, 2003. URL http://xmlsoft.org/XSLT/xsltproc2.html.
[32] E. Welzel. Rank of random matrices over gf[2], 1995. URL http://www.inf.ethz.ch/personal/emo/ps-files/SP-ExpRank.ps.
[33] U. Wolff. Collective Monte Carlo Updating for Spin Systems. Physical Review Letters, 62:361, 1989.
