This work is protected by copyright. It may be downloaded in Adobe format for personal use only. It may not be reproduced in any format either in part or in whole. Brief quotations are allowed without the author’s written permission.

Sampling and Statistics Explained
Towards commonsensical sampling practices and scientifically sound statistical methods
J W Merks

Chapter 2 Sampling theory

Sampling theory derives from probability theory, whose roots are traceable to games of chance such as tossing coins, rolling dice, and drawing cards from a deck or colored balls from a vase. When Stonehenge’s monoliths on Salisbury Plain were tracking summer and winter solstices, Hounds and Jackals, a precursor to playing dice, was already popular in Mesopotamia. Nowadays, betting on all sorts of sporting events, gambling on addictive slot machines, and playing a broad range of games of chance are popular pastimes all over the world. The human race may well possess its innate penchant for gambling and games of chance because it exists, despite staggering odds to the contrary.

Gerolamo Cardano (1501-1576) wrote Liber de Ludo Aleae (Book on Games of Chance) long before 1654 when Antoine Gombaud, Chevalier de Méré (1607-1684), a French gambler and a rogue, wrote to Blaise Pascal (1623-1662). Gombaud wanted to know why he lost more when betting to roll at least one double six in 24 rolls of a pair of 6-sided dice than when betting to roll at least one six in four rolls of a single die. Pascal wrote about Gombaud’s gambling problem to Pierre de Fermat (1601-1665), and the ensuing exchange of letters formed the foundation of probability theory. Pascal and Fermat are recognized as the founders of probability theory, even though they published little on games of chance and wrote mostly to each other.
Christian Huygens (1629-1695) wrote the first treatise on probability theory based on Pascal’s and Fermat’s correspondence, and designed the first pendulum clock based on the notes of Galileo Galilei (1564-1642). Abraham de Moivre (1667-1754) merged probability theory with algebra and trigonometry in his 1718 “The Doctrine of Chances,” and later showed that the binomial or Bernoulli distribution converges on what is now called the normal distribution.

Probability is a quantitative measure for the chance or likelihood that a certain event or outcome will occur. For example, the probability of encountering continued mineralization during mining between in situ ordered ore zones in boreholes on a line or profile could be quantified by verifying spatial dependence. In geostatistics, however, spatial dependence between in situ or temporally ordered sets of measured values in sample spaces may be assumed, unless proven otherwise. Paradoxically, Fisher’s F-test should be applied to the variance of a set of measured values and the first variance of the in situ ordered set to prove a significant degree of spatial dependence. Fisher’s F-test has not been approved for application in geostatistical ore reserve estimation.

Probability theory examines nondeterministic systems with randomly distributed discrete and continuous variables. Deterministic laws make tossed coins, rolled dice and roulette wheels come to rest whereas probability theory deals with outcomes and events after forces are balanced and energies exhausted. The elements of probability behind gambling and games of chance explain why sampling theory applies to homogeneous populations.
In contrast, sampling practice applies to heterogeneous sampling units such as contents of bulk bags, trucks and wagons, and shipments in unit trains and cargoes aboard bulk carriers, and to heterogeneous sample spaces such as stockpiles at mineral processing plants and bulk terminals, masses and volumes of in situ coals and ores, contaminated sites, and similar stationary situations. The notions of permutations and combinations are briefly explored. These methods of counting are of particular importance in statistical quality control (SQC) where discrete accept/reject outcomes are routinely encountered during inspection of manufactured products. The properties of the Gaussian or normal distribution are more relevant in sampling practice for the mining industry and the international commodity trade because central values and variances of sets of test results, determined in samples selected from multinomial, binomial or Poisson distributions, converge on the normal distribution as more samples are selected. Historically, this expectation is called the Law of Large Numbers. Depending on the rigorousness of mathematical proof, it may also be referred to as either the Weak or the Strong Law of Large Numbers. A corollary of the Law of Large Numbers is the Central Limit Theorem. Not only does it provide a scientifically sound basis to probability theory and mathematical statistics but it also builds a bridge between sampling theory and sampling practice. The formula for the Central Limit Theorem is deceptively simple for the arithmetic mean of a set of test results determined in samples with equal weights but it becomes more complex for the weighted average of a set of test results determined in samples with variable weights. Weighting factors play an important role in sampling practice because each area-, count-, density-, distance-, length-, mass- and volume-weighted average has its own variance. 
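The role of weighting factors in the variance of a central value can be illustrated with a short sketch. It assumes, purely for illustration, a set of independent measured values that share a common variance sigma2; under that assumption the variance of a weighted average is sigma2 multiplied by the sum of the squared weighting factors, which reduces to sigma2÷n when all weights are equal. The function names, the weights and the value sigma2=4 are hypothetical and do not come from the text.

```python
import random
import statistics


def weighted_average_variance(weights, sigma2):
    """Variance of a weighted average of independent values with common
    variance sigma2 (assumption): sigma2 * sum(w_i**2), with sum(w_i) = 1."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    return sigma2 * sum(w * w for w in normalized)


def simulated_variance(weights, sigma2, trials=20000, seed=1):
    """Check the formula by stochastic simulation with normal deviates."""
    rng = random.Random(seed)
    total = sum(weights)
    normalized = [w / total for w in weights]
    sigma = sigma2 ** 0.5
    averages = [
        sum(w * rng.gauss(10.0, sigma) for w in normalized)
        for _ in range(trials)
    ]
    return statistics.variance(averages)


equal_weights = [1.0, 1.0, 1.0, 1.0]      # arithmetic mean of n = 4 values
variable_weights = [4.0, 3.0, 2.0, 1.0]   # hypothetical variable weights

var_equal = weighted_average_variance(equal_weights, 4.0)        # sigma2 / n
var_variable = weighted_average_variance(variable_weights, 4.0)
var_simulated = simulated_variance(variable_weights, 4.0)

print(var_equal, var_variable, var_simulated)
```

With equal weights the sketch returns sigma2÷n, and with variable weights the variance is larger. The simulated variance should land close to the computed one, which is the kind of check by stochastic simulation that the chapter mentions.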
Inexplicably, the distance-weighted average lost its variance long before it was reborn as an honorific kriged estimate or kriged estimator. Geostatistical practitioners have yet to explain why the true variance of the single distance-weighted average was replaced with the pseudo variance of a set of distance-weighted averages. The notion that the distance-weighted average never had a variance is about as absurd as the belief that its rebirth as a kriged estimate or kriged estimator made this variance vanish without a trace.

The Central Limit Theorem derives from the variance of a general function as defined in probability theory. It is easy to prove that the variance of each central value converges on this quintessential theorem when all the weighting factors converge on the same constant weighting factor. The validity of this fundamental relationship can be proved heuristically or shown by stochastic simulation. Presently, random number generators that underpin simulation models can be validated on the basis of a priori probabilities of classic games of chance such as tossing coins, rolling dice, and drawing cards. Plus ça change, plus c’est la même chose!

2.1 Elements of probability theory

Tossing a fair coin, rolling an unloaded die, drawing a card from a shuffled deck or a ball from a vase with a homogeneous set of balls of different colors but with the same diameter, are examples of sampling experiments with equiprobable sample spaces and discrete outcomes. Heads and tails are equally likely for an unbiased coin, and so is each of the six faces of an unweighted die. Drawing a single card from a shuffled (randomized or randomly distributed) full deck without a joker is a sampling experiment with 52 equiprobable outcomes. Drawing a single ball from a set in a vase is different in the sense that this sample space is defined by N, the number of balls in the vase. Each of these cases defines a sample space.
For example, the sample space is two for tossing a single coin, six for rolling a 6-sided die, 52 for drawing a single card from the full deck without jokers, and N for drawing a particular ball from a vase with N balls. Each toss, roll, or draw gives an elementary outcome, and all possible outcomes of a single toss, roll or draw display a discrete and uniform distribution. Multiple elementary outcomes constitute an event, and multiple events display a discrete and nonuniform distribution or event space. For example, a pair of dice gives an event space of 2, 3…11, 12 dot sums, and displays a discrete and nonuniform distribution. A priori probabilities are reported as fractions or percentages. The probability to toss heads or tails with an unbiased coin is P(H)=P(T)=1÷2=0.50 or P(H)=P(T)=50%. The probability to roll any of the six faces of an unweighted cubic die is P(x)=1÷6=0.167 or P(x)=100÷6=16.7%. The probability to draw the Queen of Hearts from a shuffled deck of cards is P(QH)=1÷52=0.0192 or P(QH)=100÷52=1.92%. The same probability applies to each card in a deck as long as the game is played with an unstacked deck. After all, the probability for each card in a full deck is P(any card)=1÷N, in which N is the number of possible outcomes in this well-defined sample space of 52 cards. The probability of P(any ball)=1÷N applies to each ball that is blindly drawn from a vase with a homogeneous set of N colored balls with identical diameters. This probability remains the same when the selected ball is returned to the set, provided it is homogenized before a next blind draw. The probability for draws with replacement is constant whereas the probability for draws without replacement increases. For example, the second draw from the reduced deck of cards gives P(any card but the first)=100÷51=1.96%, the third draw gives P(any card but the first and second)=100÷50=2.00%, and so on. 
In the case of independent events, where the chance of one event to occur does not impact the chance of the other, the probability of both events to happen is the product of their probabilities. For multiple independent events, the multiplication rule is P(A and B and … and X)=p(A)·p(B)·…·p(X). If two events are mutually exclusive, the probability of either event to occur is the sum of their probabilities. For a pair of mutually exclusive events P(A and B)=0 and P(A or B)=p(A)+p(B) in accordance with the addition rule, which applies to any set of mutually exclusive events.

The condition that sampling experiments be unbiased is fundamental in sampling theory and practice. If a 6-sided die were loaded or weighted, then the probability of at least two out of six faces is either higher or lower than 16.7%. If a deck of cards were incomplete, then the probability to draw Card X is no longer P(X)=100÷52=1.92%. In the case that Card X is the only one left, the final draw is the certain event. If Card X were indeed the last card left, the probability of this particular sequence of draws would be P(X)=1÷52! ≈ 1.2·10⁻⁶⁸.

The condition that sampling experiments be independent is equally fundamental because it impacts a priori probabilities. This condition lies at the core of sampling theory but its implications in sampling practice are not always transparent. Irrespective of whether a toss with an unbiased coin gave heads or tails, the probability of the next toss does not depend on the previous toss but remains P=0.50 or P=50%. Repeated draws from a full deck remain independent when a selected card is returned to the deck, which should be shuffled prior to the next draw. After all, if the Queen of Hearts were the first card, then drawing it twice would be an impossible event unless it is a draw with replacement.
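The multiplication rule and the addition rule described above can be verified by enumerating an equiprobable sample space. The sketch below does so for a pair of unbiased dice with exact fractions. The helper name prob and the chosen events are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction
from itertools import product

# All 36 equiprobable elementary outcomes for a pair of unbiased dice.
space = list(product(range(1, 7), repeat=2))


def prob(event):
    """A priori probability of an event: favorable outcomes / all outcomes."""
    favorable = sum(1 for outcome in space if event(outcome))
    return Fraction(favorable, len(space))


# Multiplication rule: the two dice are independent of each other.
p_first_six = prob(lambda o: o[0] == 6)
p_second_six = prob(lambda o: o[1] == 6)
p_double_six = prob(lambda o: o == (6, 6))
print(p_double_six, p_first_six * p_second_six)

# Addition rule: dot sums of 2 and 12 are mutually exclusive events.
p_sum_2 = prob(lambda o: sum(o) == 2)
p_sum_12 = prob(lambda o: sum(o) == 12)
p_sum_2_or_12 = prob(lambda o: sum(o) in (2, 12))
print(p_sum_2_or_12, p_sum_2 + p_sum_12)
```

Exact fractions are used instead of floating-point numbers so that both rules hold as equalities rather than approximations.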
Bias tests are designed to verify whether a particular sampling protocol gives unbiased test results. For example, testing a 6-sided die for a bias or systematic error requires that a set of empirical probabilities be compared with the discrete uniform a priori probabilities of 16.7%. Figure 2.1 presents the discrete and uniform a priori probability distribution for a single die, and the discrete and nonuniform a priori probability distributions for a pair of dice, and for a set of three.

[Figure 2.1 Discrete uniform and nonuniform distributions. Panels: one die, two dice, three dice. Horizontal axis: dot sums from 1 to 18.]

These bar charts display all possible discrete outcomes for a single die, for a pair of dice and for a set of three. The single die gives a uniform distribution of six equiprobable outcomes, a pair of dice gives a nonuniform distribution of 11 events with dot sums from 2 to 12, and a set of three gives a nonuniform distribution of 16 events with dot sums from 3 to 18. The probability distribution for a pair of dice is nonuniform and its slopes are linear. In contrast, the probability distribution for three dice seems to converge on the typical bell-shaped curve of the Gaussian or normal probability distribution. Multiple dice do indeed define discrete and nonuniform a priori probability distributions that converge ever closer on the normal probability distribution as the number of dice increases. This is a corollary of the Central Limit Theorem.

Three dice of different colors define a priori events with dot sums of (1;1;1)=3, (1;1;2)=(1;2;1)=(2;1;1)=4, … (6;6;5)=(6;5;6)=(5;6;6)=17, and (6;6;6)=18 for N=6³=216 single outcomes. Table 2.1 lists all the events and expected counts, the relative and cumulative probabilities in percent, and the approximate 98% and 94% confidence intervals.
Table 2.1 Event space and dot sum counts for three dice
----------------------------------------------------
Event   Count    %rel     %cum    98% CI   94% CI
----------------------------------------------------
   3       1     0.46     0.46
   4       3     1.39     1.85     1.39
   5       6     2.78     4.63     4.17     2.78
   6      10     4.63     9.26     8.80     7.41
   7      15     6.94    16.20    15.74    14.35
   8      21     9.72    25.93    25.46    24.07
   9      25    11.57    37.50    37.04    35.65
  10      27    12.50    50.00    49.54    48.15
  11      27    12.50    62.50    62.04    60.65
  12      25    11.57    74.07    73.61    72.22
  13      21     9.72    83.80    83.33    81.94
  14      15     6.94    90.74    90.28    88.89
  15      10     4.63    95.37    94.91    93.52
  16       6     2.78    98.15    97.69    96.30
  17       3     1.39    99.54    99.07
  18       1     0.46   100.00
----------------------------------------------------

Table 2.1 shows that events with 3 and 18 dot sums have relative a priori probabilities of 0.46%. Events with 4, 5…16, 17 dot sums converge on a symmetric 98% confidence range with a lower limit of 98% CRL=4 events and an upper limit of 98% CRU=17 events. Events with 5, 6…15, 16 dot sums are fairly close to a symmetric 94% confidence range with a lower limit of 94% CRL=5 events and an upper limit of 94% CRU=16 events. Events with 3 and 18 dot sums occur only once and events with 4 and 17 dot sums occur three times. Each count, when converted into a fraction of the total count of 216, becomes a weighting factor required to obtain the count-weighted average (the central value of this set of events), the variance of the set, and the variance of its central value. Calculating central values and variances of sets of measured values with variable weights is of critical importance in sampling practice.

Tests for kurtosis and skewness verify whether a set of events departs from the normal probability distribution. For example, if the set in Table 2.1 were to depart significantly from normalcy, then its central value and confidence limits should reflect the weighting factors of wᵢ=nᵢ÷Σnᵢ, where nᵢ is the count for the ith event, and Σnᵢ=N=6³=216. A quick-and-dirty test is to draw a graph of all the events in Table 2.1 against their logarithms.
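The counts and percentages in Table 2.1 can be reproduced by enumerating all 216 outcomes for three dice. The sketch below also derives the count-weighted average and the variance of the set from the weighting factors wᵢ=nᵢ÷Σnᵢ. Variable names are illustrative assumptions, and the variance of the central value is deliberately left out because its formula is not given in this section.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how often each dot sum occurs among the 6**3 = 216 outcomes.
counts = Counter(sum(roll) for roll in product(range(1, 7), repeat=3))
total = sum(counts.values())

# Weighting factors w_i = n_i / sum(n_i), kept as exact fractions.
weights = {event: Fraction(n, total) for event, n in counts.items()}

# Count-weighted average and variance of the set of events.
mean = sum(event * w for event, w in weights.items())
variance = sum(w * (event - mean) ** 2 for event, w in weights.items())

cumulative = Fraction(0)
print("Event  Count   %rel    %cum")
for event in sorted(counts):
    cumulative += weights[event]
    rel_pct = float(100 * weights[event])
    cum_pct = float(100 * cumulative)
    print(f"{event:5d}  {counts[event]:5d}  {rel_pct:5.2f}  {cum_pct:6.2f}")

print("count-weighted average:", float(mean))
print("variance of the set:", float(variance))
```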
Figure 2.2 shows a chart of dot sums against their logarithms.

[Figure 2.2 Expected events for three dice. Horizontal axis: sums of dots, 3 to 18. Vertical axis: log(sums), 0.0 to 2.0.]

The graph in Figure 2.2 converges closely on a straight line. Yet, appearance of normalcy may well be as deceptive as a priori assumption of spatial dependence between in situ or temporally ordered sets of measured values of stochastic variables in sample spaces. The question is whether the expected dot sums for 216 rolls of three unbiased dice diverge significantly from the normal distribution. ISO Technical Committee 69 – Applications of statistical methods, has approved various tests for departure from normality. Several tests, including those for kurtosis and skewness, are discussed in Chapter 3 Sampling Practice where the first, second, third and fourth order differences between a set of measured values and its central value are introduced.

Generally, a priori probabilities range from zero to unity, or from 0% to 100%. By definition, p=0 or p=0% are impossible events whereas p=1 or p=100% are certain events. Examples of certain and impossible events range from absurd to whimsical, with death and taxation undeniably certain events. An impossible event is to gamble for profit without counting cards, loading dice, or otherwise stacking the odds. Gambling may be a popular pastime but the odds to win favor bingo halls, casinos, lotteries, slot machines, and the like. The term odds refers to the ratio between the probabilities of winning and losing. In the real world of gambling, odds do not add up to zero sum games.

Gambling problems are often easier to solve by calculating the probability to lose rather than the probability to win. When Gombaud was betting to roll at least one six in four casts of a single 6-sided die, he would win more often than not because the probability of P(single-six)=1–(5÷6)⁴ ≈ 0.518 was in his favor.
However, when he was betting to roll at least one double six in 24 casts, he would lose more often than not because P(double-six)=1–(35÷36)²⁴ ≈ 0.491 was to his disadvantage. But why did this Renaissance rogue even try to get a double six by rolling a pair of dice 24 times? Surely, it must have taken him at least six times longer to get a feel for the odds of this bet. Perhaps passing time was as much part of the way of life in Gombaud’s days as playing games.

Gombaud gambled without understanding the rules of the game but was smart enough to ask Pascal for assistance. Matheron gambled with geostatistics without understanding the rules of mathematical statistics but was not as smart as Gombaud to ask a statistician for assistance. Perhaps ironically, Matheron’s geostatistics was hailed as a new science in the 1960s. Yet, its most astounding attribute was that the pseudo kriging variance of a set of kriged estimates replaced the true variance of the single distance-weighted average.

Statisticians would have told Matheron and his staff that “variances” of sets of distance-weighted averages are invalid measures for variability, precision, and risk. Many would have pointed out that each distance-weighted average has its own variance because one-to-one correspondence between central values and variances is inviolable in mathematical statistics. Most statisticians would have taught Matheron and his students that a set of n functionally dependent (calculated!) values of a stochastic variable gives exactly zero degrees of freedom, that a set of n measured values gives df=n–1 degrees of freedom, and that the in situ ordered set gives dfo=2(n–1) degrees of freedom. Gombaud trusted Pascal and Fermat but Matheron did not trust statisticians.
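Returning to Gombaud’s two bets, both probabilities follow from calculating the probability to lose and subtracting it from unity. The sketch below reproduces the two values quoted in this section and also finds the smallest number of casts at which a bet becomes favorable. The function names are illustrative assumptions.

```python
def p_at_least_one(p_single, casts):
    """Probability of at least one success in a number of independent casts,
    computed as one minus the probability of losing every cast."""
    return 1.0 - (1.0 - p_single) ** casts


def casts_needed(p_single, target=0.5):
    """Smallest number of casts for which the bet wins more often than not."""
    casts = 1
    while p_at_least_one(p_single, casts) <= target:
        casts += 1
    return casts


p_single_six = p_at_least_one(1 / 6, 4)     # one six in four casts of one die
p_double_six = p_at_least_one(1 / 36, 24)   # one double six in 24 casts of a pair

print(round(p_single_six, 3), round(p_double_six, 3))
print(casts_needed(1 / 6), casts_needed(1 / 36))
```

The second line of output shows how many casts each bet needs before the odds turn in the gambler’s favor, which is why 24 casts of a pair of dice fall just short.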
In his ponderous Foreword to Journel and Huijbregts’s Mining Geostatistics, Matheron pontificated, “A statistician who is not familiar with mining may well be discouraged before he can even get a good idea of the problem at hand.” Statisticians would have been as popular at the Centre de Géostatistique as horse flies in a barn after his new science of creating order in randomness failed to inspire the few statisticians at the 1970 Geostatistics colloquium.

David, in the Introduction to his 1977 Geostatistical Ore Reserve Estimation, declares, “This is not a book for professional statisticians,” and correctly predicts, “…statisticians will find many unqualified statements….” In Section 12.2 Conditional Simulation, he refers to, “…an infinite set of simulated values…” and ponders how, “To make that infinite set of simulated values smaller and get the model closer to reality…” On the next page, he muses, “The criticism to this model is obvious. The simulation is not reality. There is only one answer. The proof of the pudding is …!” The problem with David’s pudding proof is that many geostatistical ore reserve estimates failed to make the grade.

In Chapter V.A. Theory of Kriging, Journel and Huijbregts’s 1978 Mining Geostatistics refers to σK²=0, the zero kriging variance. Given Krige’s infinite set of distance-weighted averages and David’s infinite set of simulated values, it is not surprising that zero kriging “variances” and infinite sets of kriged estimates surfaced in Mining Geostatistics. It is astounding that the variance of the distance-weighted averages and the concept of degrees of freedom vanished on Krige’s watch. On a positive note, Gombaud would have been pleased that casting an unbiased six-sided die did make it into Mining Geostatistics!

2.2 Counting Methods

The notions of permutations and combinations deal with all possible arrangements when selecting different or identical objects without or with replacement.
When all objects in a set are identifiably different, it is possible to record the order in which a subset of objects is selected. For example, an experiment with three dice must stipulate that each die has a different color to ensure that the correct dot sums for the three faces are recorded. Experiments with a deck of cards or a vase of colored balls with the same diameter are examples of selecting a subset of objects from a set of identifiably different objects.

The number of arrangements of n different objects selected from a population of no less than n objects with replacement is nⁿ. For example, the number of arrangements of 52 cards selected from a full deck with replacement is P=52⁵². A typical example of a draw without replacement is the number of permutations for the letters of the alphabet. The first letter may be selected 26 ways, the second 25 ways, and so on until only one letter is left. Hence, the number of permutations for 26 objects without replacement is P=26·25·24…3·2·1=26!. This exclamation mark does not imply excitement but it is one of many austere symbols in probability and sampling theory.

When a sampling experiment is carried out without replacement, the 1st selection can be any object in a set of n while the kth selection can be any of the n–k+1 objects still left as long as k≤n. Hence, the number of permutations of k objects drawn without replacement from a set of n objects is Pₖⁿ. For example, the number of permutations of k=5 letters selected from n=26 letters of the alphabet is Pₖⁿ=26·25·24·23·22=7,893,600, whereas the same selection with replacement gives 26⁵=11,881,376 arrangements.

Generally, the number of permutations of n objects selected from n objects one at a time without replacement is P=n!. The symbol for the number of permutations of n objects selected from n objects is Pₙⁿ. The symbol for the number of permutations of k objects selected from n objects is Pₖⁿ. Given that the 1st object can be selected in n different ways, the 2nd in n–1 ways, the 3rd in n–2 ways, and the kth in n–k+1 ways, it follows that n(n–1)(n–2)…(n–k+1)=n(n–1)(n–2)…(n–k+1)(n–k)!÷(n–k)!.
Hence, Pₖⁿ=n!÷(n–k)!.

A variant of this type of sampling experiment would be to select three-letter and three-digit codes for automobile license plates. With replacement, a subset of three letters may be drawn in n₁=26³ different ways whereas a subset of three digits may be drawn in n₂=10³ different ways. A few of the n₁=17,576 three-letter words may not be suited for license plates but, when multiplied by 10³, more than 17 million codes are probably acceptable.

The number of permutations of a set of n objects, consisting of subsets of n₁…nᵢ…nₖ objects of different types such that Σnᵢ=n, equals n!÷[n₁!·…·nᵢ!·…·nₖ!]. For example, the number of identifiably different permutations of the n=15 letters in “geostatistician”, with three t’s, three i’s, two s’s, two a’s and five single letters, is 15!÷[3!·3!·2!·2!]=9,081,072,000. This is a rather small number when compared with the infinite set of kriged estimates defined by two or more measured values, determined in samples selected at positions with different coordinates in a finite sample space.

Selecting the least biased and most precise subset of an infinite set of kriged estimates is a daunting task. Selecting a subset of k kriged estimates from an infinite set of kriged estimates gives Pₖⁿ permutations in which k is finite but n is infinite. Given that Pₖⁿ=n!÷(n–k)!, it follows that the number of permutations is immeasurable. By implication, the odds are heavily stacked against those who want to select the least biased and most precise subset of k kriged estimates from an infinite set of kriged estimates. Geostatistical ore reserve practitioners perform this improbable task on a routine basis.

For a set of n objects that consist of subsets of n₁…nᵢ…nₖ objects of different types such that Σnᵢ=n, it follows that the sum of the probabilities of n₁÷n+…+nᵢ÷n+…+nₖ÷n is equal to unity.
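The counting rules for permutations can be checked with Python’s standard library. The sketch below is illustrative only: the helper names are assumptions, math.perm requires Python 3.8 or later, and the multiset count depends on the exact spelling of the string passed in.

```python
import math
from collections import Counter


def permutations_without_replacement(n, k):
    """Number of ordered selections of k objects from n: n! / (n - k)!."""
    return math.perm(n, k)


def arrangements_with_replacement(n, k):
    """Number of ordered selections of k objects from n with replacement."""
    return n ** k


def multiset_permutations(word):
    """Identifiably different permutations: n! / (n_1! * ... * n_k!)."""
    result = math.factorial(len(word))
    for count in Counter(word).values():
        result //= math.factorial(count)
    return result


five_letters = permutations_without_replacement(26, 5)
five_letters_replaced = arrangements_with_replacement(26, 5)
plates = arrangements_with_replacement(26, 3) * arrangements_with_replacement(10, 3)
word_count = multiset_permutations("geostatistician")

print(five_letters, five_letters_replaced, plates, word_count)
```

Integer division is safe in multiset_permutations because every partial quotient of a multinomial coefficient is itself an integer.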
The relationship between the subset proportions nᵢ÷n and probabilities underlies the multinomial distribution, where the probabilities are in fact weighting factors required to obtain weighted averages and variances.

A combination of objects is a set selected such that the order of the objects in the set is irrelevant. For example, if ten (10) objects in a sample of one hundred (100) turn out to be defective, it does not matter in which order these flawed objects were selected. It does matter that the probability to accept is P(accept)=90%, and that the probability to reject is P(reject)=10%. These probabilities are complementary, which is a characteristic of the binomial or Bernoulli distribution.

The symbol and formula for the number of combinations of k objects selected from a set of n objects is Cₖⁿ=n!÷[(n–k)!·k!]. For example, the number of 5-card hands that can be selected from a 52-card deck is C₅⁵²=52!÷[(52–5)!·5!]=2,598,960. The number of combinations of n objects selected in a subset or a single sample of n objects is Cₙⁿ=n!÷[(n–n)!·n!]. Given that 0!=1 by definition, it follows that Cₙⁿ=n!÷n!=1.

2.3 Probability distributions

Probability distributions for elementary outcomes such as two sides of a coin or six faces of a cubic die are discrete and uniform. Probability distributions for events such as heads and tails for a set of coins, and dot sums for a set of dice are discrete and nonuniform. Discrete variables display discrete and uniform or nonuniform distributions. The most relevant nonuniform probability distributions are the discrete multinomial distribution, the discrete Bernoulli or binomial distribution, the discrete Poisson distribution and the continuous Gaussian or normal distribution. Each of these probability distributions is discussed in a separate section.

Various statistical tests are based on comparing observed statistics with values from various probability distributions.
For example, Bartlett’s chi-square test, Fisher’s F-test and Student’s t-test compare observed statistics with values of χ²-, F- and t-distributions at selected probability levels and with applicable degrees of freedom. Tabulated values are given in handbooks of statistical tables and in many textbooks on applications of statistical methods. Dedicated software makes it simple to apply statistical tests, and to obtain the required values of various probability distributions.

Bartlett’s χ²-test verifies whether three or more variances constitute a homogeneous set. For example, the χ²-test can be applied to the variances of test results for core samples from ordered ore zones on a line of boreholes to verify homogeneity. The χ²-test is also applicable to squared relative standard deviations and squared coefficients of variation. The test verifies whether expected and observed outcomes of sampling experiments are statistically identical or indicative of the presence of bias. Table I.IV in David’s 1977 Geostatistical Ore Reserve Estimation shows how to apply the χ²-test to expected and observed frequencies. In this table of the first geostatistics textbook the concept of degrees of freedom makes a cameo appearance.

Fisher’s F-test plays a pivotal role in sampling and statistics because it is the quintessence of analysis of variance (ANOVA). Fisher’s F-test verifies whether two variances are statistically identical or differ significantly. The F-test is also applied to test for spatial dependence in a sample space or a sampling unit, to construct sampling variograms that show where orderliness in sample spaces dissipates into randomness, and to analyze and optimize sampling protocols. Fisher’s F-test, too, is applicable not only to variances but also to squared relative standard deviations and to squared coefficients of variation.
Student’s t-test verifies the absence or presence of bias between either paired or pooled sets of measured values of stochastic variables in sample spaces or sampling units. It can be applied at the primary sample selection, sample preparation, and analytical stages of measurement hierarchies. It is of critical importance to international trading partners that mechanical sampling systems and manual sampling protocols be tested for bias. Many applications of Student’s t-test will be discussed.

Tukey’s Wholly Significant Difference test, or WSD-test for short, is also discussed. The WSD-test compares each difference between three or more central values with the wholly significant difference to identify statistical significance at 95% probability. Degrees of freedom play a pivotal role in Bartlett’s chi-square test, Fisher’s F-test, Student’s t-test and Tukey’s WSD-test.

Dr J W Tukey, a Princeton professor and a prominent statistician, was present at a geostatistical symposium in 1970. When asked about Matheron’s Theory of Regionalized Variables, Tukey said, “I am now beginning to understand that Kriging is apparently a word for more-or-less stationary, more-or-less least squared smoothing of the data.” David’s 1977 Geostatistical Ore Reserve Estimation cautioned statisticians about many unqualified statements. Armstrong and Champigny’s 1988 A Study on Kriging Small Blocks cautioned against oversmoothing. Journel’s 1992 guidelines cautioned against classical “Fischerian” statistics. Clearly, geostatisticians issue one caution after another while violating with impunity the fundamental requirement of functional independence and ignoring the concept of degrees of freedom.

Chapter 3 Sampling Practice explores the application of statistical tests in sufficient detail to ensure their proper application and interpretation. Mathematical statistics would collapse if degrees of freedom were ignored.
For example, it would be impossible to derive unbiased confidence intervals and ranges for central values such as arithmetic means and weighted averages, metal contents and grades of volumes of in situ ore, and of wet and dry masses of mined ore, mill feed, tailing and concentrate. Neither would it be possible to verify spatial dependence between in situ or temporally ordered sets of measured values, and to chart sampling variograms that show where spatial dependence in sample spaces or sampling units dissipates into randomness. Degrees of freedom are equally indispensable when testing for bias and optimizing sampling protocols.

Given that functional independence and degrees of freedom are irrelevant in geostatistics, it follows that the pseudo kriging variance of a set of functionally dependent kriged estimates is an invalid measure for variability, precision, and risk. In classical statistics, one-to-one correspondence between functionally dependent values and variances entails that central values do have variances. The arithmetic mean is the ubiquitous central value of a set of measured values with equal weights. In contrast, an area-, count-, density-, distance-, length-, mass- or volume-weighted average is the central value of a set of measured values with variable weights.

Long before Matheron converted a flawed variance of mathematical statistics into geostatistics, he didn’t know that each length-weighted average grade has its own variance. In fact, the length-weighted average was the first central value that lost its variance while Matheron was studying statistics and wrote his first statistical note in 1954. It took some ten years before the distance-weighted average, too, metamorphosed into an honorific but variance-deprived kriged estimate or estimator.
2.3.1 Multinomial distribution

The multinomial distribution is most effective when applied to small sample spaces with discrete events such as a set of cards drawn from a full deck, or a sample selected from a vase with balls of different colors but identical diameters. The deck of cards should be shuffled and the contents of the vase should be thoroughly mixed because the properties of the multinomial distribution only apply to homogeneous populations. The probability for a particular event is obtained from the terms of the multinomial expansion,

$$(p_1 + \cdots + p_i + \cdots + p_k)^n = 1$$

where:
pᵢ = probability for ith subset
k = number of subsets
n = elements in all subsets

The multinomial distribution is useful to show that the probabilities of discrete events converge on the Gaussian or normal distribution. In sampling practice, the multinomial distribution is of limited use because n, the complete set of events, consists of k subsets such that n=n₁+…+nᵢ+…+nₖ, and because the probability for the number of events in each subset is p₁…pᵢ…pₖ such that their sum is p₁+…+pᵢ+…+pₖ=1. The probability of a sample of n balls taken from a homogeneous population consisting of k subsets of balls of different colors but with identical diameters is computed with the formula,

$$p(X) = \left(p_1^{n_1} \cdot \ldots \cdot p_i^{n_i} \cdot \ldots \cdot p_k^{n_k}\right) \cdot \frac{n!}{n_1! \cdot \ldots \cdot n_i! \cdot \ldots \cdot n_k!}$$

The formula is impractical for large sets of events because n! (pronounce n factorial) is the product of n·(n–1)·(n–2)·…·3·2·1 ≈ 9.333·10¹⁵⁷ for n=100. In contrast, the product of all probabilities is extremely small so that the probability for a particular event is the product of two terms, the first of which tends toward zero whereas the second tends toward infinity. This is why confidence limits for the central value of a set of measured values, determined in samples selected from a multinomial distribution that consists of particles of uniform size and different compositions, are impractical to compute.
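One common way to sidestep a product of a vanishingly small term and an astronomically large term is to add logarithms instead of multiplying the terms themselves. The sketch below applies that idea to the multinomial formula by means of the log-gamma function, using lgamma(n+1)=ln(n!). This is a swapped-in numerical technique, not a method described in the text, and the function name and example counts are illustrative assumptions.

```python
import math


def multinomial_probability(counts, probs):
    """Probability of drawing exactly counts[i] objects of type i in a sample
    of sum(counts) objects, evaluated in log space to avoid overflow."""
    n = sum(counts)
    log_p = math.lgamma(n + 1)                  # ln(n!)
    for n_i, p_i in zip(counts, probs):
        log_p += n_i * math.log(p_i)            # ln(p_i ** n_i)
        log_p -= math.lgamma(n_i + 1)           # minus ln(n_i!)
    return math.exp(log_p)


thirds = [1 / 3, 1 / 3, 1 / 3]

# Small case that can be checked by hand: 3! / 3**3 = 6/27.
p_small = multinomial_probability([1, 1, 1], thirds)

# Larger illustrative case where the factorials exceed ordinary float range.
p_large = multinomial_probability([100, 100, 100], thirds)

print(p_small, p_large)
```

Working with sums of logarithms keeps every intermediate value within ordinary floating-point range, at the cost of a small rounding error in the final exponentiation.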
A multinomial population that consists of a homogeneous set of black, white and red balls of the same diameter in equal proportions is a useful stochastic system to bridge the gap between sampling theory and sampling practice. How cumbersome it is to compute the probability of selecting a sample of n balls with ni balls of each of three colors is obvious upon realizing that the probability to select a sample of 300 balls with exactly 100 of each color is p(X) = (1÷3)^300·300! ÷ [100!·100!·100!]. Surely, Stirling’s formula falters and spreadsheet software sputters when a much larger number of terms is required to compute meaningful confidence limits. The multinomial distribution is interesting from a theoretical perspective but it has found little application in sampling practices for materials in bulk.

2.3.2 Binomial distribution

For k=2, the multinomial distribution becomes the binomial distribution with its multitude of practical applications in statistical quality control. The binomial distribution also forms the basis for Gy’s ubiquitous but simplistic and contentious sampling constant. The binomial or Bernoulli distribution applies to sample spaces with a pair of mutually exclusive outcomes such as heads or tails, black or white, accept or reject, on or off, and so on. It is not surprising, then, that the binomial distribution has found wide application in science and engineering and provides a sound basis to statistical process control (SPC) and statistical quality control (SQC). The Bernoulli distribution is based on the terms of the binomial expansion,

\[ (p + q)^n = 1 \]

where:
p = probability for Event P
q = probability for Event Q
n = number of outcomes

The probabilities p and q are complementary.
For example, the probabilities for heads and tails are p=0.5 and q=1–p=0.5 for an unbiased coin. The probability for rolling a particular side with an unweighted die is p=1÷6≈0.167, which implies that the complementary probability for not rolling that side is q=1–0.167=0.833. Multiple coins and dice give the terms of the binomial expansion with coefficients that display Pascal’s triangle. Each term of the binomial expansion is obtained with the following formula,

\[ p(X) = \binom{n}{k} \cdot p^{k} \cdot q^{\,n-k} \]

where:
p(X) = probability for Event X
p = probability for Event P
q = probability for Event Q
k = number of same outcomes
n = total number of outcomes

The most useful property of the binomial distribution is that a sample of n balls selected from a homogeneous population of black and white balls with identical diameters has a sample variance of var(x)=n·p·q. For example, a sample of 1,000 balls selected from such a population with p=0.1 for black balls and q=0.9 for white balls has a sample variance of var(x)=1,000·0.1·0.9=90. This variance applies to 100 black balls and 900 white balls alike. Therefore, it gives a 95% confidence interval of 95% CI = 1.96·√90 ≈ ±19 black balls for a symmetric 95% confidence range with a lower limit of 95% CRL ≈ 100–19 ≈ 81 black balls and an upper limit of 95% CRU ≈ 100+19 ≈ 119 black balls. The symmetric 95% confidence range for 900 white balls has a lower limit of 95% CRL ≈ 900–19 ≈ 881, and an upper limit of 95% CRU ≈ 900+19 ≈ 919. Logically, the lower limit of 95% CRL ≈ 81 black balls and the upper limit of 95% CRU ≈ 919 white balls coincide and add up to the entire sample of 1,000 balls.

In sampling practice, a large set of particles may be divided into two subsets, one of which consists of particles with some desirable characteristic whereas the other consists of particles without that characteristic. This simplistic approach formed the basis for Gy’s sampling theory and his ubiquitous sampling constant.
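The arithmetic of the 1,000-ball example can be retraced in a few lines. The Python sketch below is an illustration only; the function name `binomial_confidence_range` is hypothetical, and z=1.96 is the value used in the text for 95% probability.

```python
import math


def binomial_confidence_range(n, p, z=1.96):
    """Expected count, variance n*p*q, confidence interval and range limits."""
    q = 1.0 - p
    expected = n * p
    variance = n * p * q
    interval = z * math.sqrt(variance)
    return expected, variance, interval, expected - interval, expected + interval


# 1,000 balls with p=0.1 for black balls, as in the example in the text
expected, variance, interval, lower, upper = binomial_confidence_range(1000, 0.1)
print(f"expected={expected:.0f} var={variance:.0f} "
      f"95% CI=±{interval:.1f} 95% CRL={lower:.1f} 95% CRU={upper:.1f}")
```

Calling the same function with p=0.9 gives the matching range for the white balls, since n·p·q is unchanged when p and q are swapped.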
The variance formula for the binomial distribution in particular has been widely used and abused to bridge the gap between sampling theory with its homogeneous populations and sampling practice with its heterogeneous sampling units and sample spaces. A score of authors such as Vezin, Brunton, Richards, Demond, Halferdahl and Gy have applied all sorts of factors to allow for departure from the binomial distribution in sampling practice. Dr Jan Visman, in his 1947 PhD thesis, described a scientifically sound method based on the additive property of variances in a measurement hierarchy. Visman’s work underpins practical applications such as the interleaved sampling protocol. Visman’s sampling theory and its applications will be juxtaposed against Gy’s sampling theory.

Visman showed that the sampling variance is the sum of the composition variance and the distribution variance. The composition variance is a measure for the variability between particles within primary increments, and the distribution variance is a measure for the variability between primary increments within sampling units. Visman’s sampling theory was implemented in ASTM D2234–Standard Practice for Collection of a Gross Sample of Coal. Originally published in 1963, it was the first internationally recognized standard to target a precision estimate of ±10% for dry ash content.

Visman’s sampling theory gave impetus to the development of the interleaved sampling protocol, a straightforward procedure that provides precision estimates for central values of stochastic variables in sampling units and sample spaces at the lowest possible cost. The interleaved sampling protocol is particularly useful at the bulk sampling stage in mineral exploration. Interleaved samples give unbiased estimates for intrinsic variances of stochastic variables in sample spaces and sampling units alike.
Interleaved bulk samples combine logically with testing for spatial dependence between rounds in adits, drifts, pits, or trenches, and with charting of sampling variograms.

2.3.3 Poisson distribution

The Poisson distribution approximates the binomial or Bernoulli distribution when p→0 and n→∞ such that their product np is constant. The Poisson distribution is named after Siméon-Denis Poisson (1781-1840). Its properties derive from the sum of an infinite set of terms, a derivation traceable to The Doctrine of Chances by Abraham de Moivre (1667-1754). De Moivre’s work in algebra, trigonometry, and probability theory deals with the relation between the probability of a particular event and the frequency of its occurrence as the sum of an infinite set of terms,

\[ \frac{e^{-m} m^0}{0!} + \frac{e^{-m} m^1}{1!} + \frac{e^{-m} m^2}{2!} + \frac{e^{-m} m^3}{3!} + \cdots \]

Since e, the base of the natural logarithm, is the sum of these terms:

\[ e = \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots \]

and since e^m is the sum of the following set,

\[ e^{m} = \frac{m^0}{0!} + \frac{m^1}{1!} + \frac{m^2}{2!} + \frac{m^3}{3!} + \cdots \]

the sum of the infinite set of terms is:

\[ \sum_{r=0}^{\infty} \frac{e^{-m} m^r}{r!} = 1.0 \]

Consecutive terms of the Poisson distribution correspond to those of the Bernoulli or binomial distribution as n converges on infinity, which implies that,

\[ \frac{e^{-m} m^r}{r!} \cong \binom{n}{r} p^{r} (1-p)^{n-r} \]

The equivalence of the Poisson distribution and the Bernoulli distribution is fundamental in sampling practice. Not only do the Poisson and Bernoulli distributions converge but both also converge on the Gaussian or normal distribution. This convergence lends credence to the validity of the Central Limit Theorem. Poisson’s expected value of m is equivalent to Bernoulli’s population mean of np, both of which, in turn, are equivalent to μ, the population mean as defined for the Gaussian or normal distribution. An interesting property of the Poisson distribution is that the expected value of m is also equivalent to σ², the population variance as defined for the Gaussian or normal distribution.
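The correspondence between Bernoulli and Poisson terms can be made explicit with a standard limit argument. The steps below are a sketch that is not part of the original text; they assume that p = m/n with m held constant while n grows without bound.

```latex
\begin{align*}
\binom{n}{r} p^{r} (1-p)^{n-r}
  &= \frac{n(n-1)\cdots(n-r+1)}{r!}
     \left(\frac{m}{n}\right)^{r}\left(1-\frac{m}{n}\right)^{n-r} \\
  &= \frac{m^{r}}{r!}\cdot\frac{n(n-1)\cdots(n-r+1)}{n^{r}}
     \cdot\left(1-\frac{m}{n}\right)^{n}\cdot\left(1-\frac{m}{n}\right)^{-r} \\
  &\xrightarrow[\;n\to\infty\;]{} \frac{m^{r}}{r!}\cdot 1\cdot e^{-m}\cdot 1
   = \frac{e^{-m} m^{r}}{r!}
\end{align*}
```

For fixed r, the second and fourth factors tend to unity, and the third factor tends to e^{-m}, which leaves the Poisson term.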
For example, a micro-diamond count of m=15 gives a 95% confidence interval of 95% CI ≈ 1.96·σ ≈ 1.96·√m ≈ 1.96·√15 ≈ ±8 counts, and a symmetric 95% confidence range with a lower limit of 95% CRL ≈ 15–8 ≈ 7 counts, and an upper limit of 95% CRU ≈ 15+8 ≈ 23 counts.

This property of the Poisson distribution explains the coarse particle effect that prevails when small test portions of poorly comminuted test samples are assayed for copper, lead, zinc, and so on. It also explains the nugget effect that occurs when test samples contain native metals, and test portions are fire-assayed for gold and silver. For example, a test portion of 15 g from a 500 g test sample with 25 particles of free gold is expected to contain m=25·15÷500=0.75 particle. Similarly, test portions of 30 g and 60 g taken from the same test sample are expected to contain m=1.5 and m=3 particles of gold respectively. Figure 2.3 shows a bar chart with the probabilities as relative percentages for a predicted range of gold particles from k=0 to k=10.

[Figure 2.3 Probabilities in relative percent: bar chart of probability in %rel (0 to 60) against predicted number of gold particles (0 to 10), with one series each for m=0.75, m=1.5 and m=3]

The probability to select a 15 g test portion without a single gold particle is close to 50 %rel, which explains why the bar chart for m1=0.75 shows a higher degree of positive skewness than those for m2=1.5 and m3=3. Table 2.2 gives the relative and cumulative percentages for predicted gold particles based on expected values of m1=0.75, m2=1.5, and m3=3.
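The relative and cumulative percentages for predicted gold particles follow directly from the Poisson terms. The Python sketch below is illustrative only; the function names `expected_particles` and `poisson_percentages` are hypothetical, and the masses and particle count are those of the example in the text.

```python
import math


def expected_particles(particles_in_sample, sample_mass_g, portion_mass_g):
    """Expected number of particles m in a test portion of the given mass."""
    return particles_in_sample * portion_mass_g / sample_mass_g


def poisson_percentages(m, k_max):
    """Rows of (k, %rel, %cum) for k = 0..k_max from the Poisson terms."""
    rows = []
    cumulative = 0.0
    for k in range(k_max + 1):
        relative = 100.0 * math.exp(-m) * m**k / math.factorial(k)
        cumulative += relative
        rows.append((k, relative, cumulative))
    return rows


# 500 g test sample with 25 particles of free gold; 15 g, 30 g and 60 g portions
for portion in (15, 30, 60):
    m = expected_particles(25, 500, portion)
    print(f"{portion} g test portion, m={m}")
    for k, rel, cum in poisson_percentages(m, 10):
        print(f"  k={k:2d}  %rel={rel:5.1f}  %cum={cum:5.1f}")
```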
Table 2.2 Relative and cumulative percentages for gold particles
--------------------------------------------------------------------------
Predicted     60 g test portion    30 g test portion    15 g test portion
particles     %rel      %cum       %rel      %cum       %rel      %cum
--------------------------------------------------------------------------
 0             5.0       5.0       22.3      22.3       47.2      47.2
 1            14.9      19.9       33.5      55.8       35.4      82.7
 2            22.4      42.3       25.1      80.9       13.3      95.9
 3            22.4      64.7       12.6      93.4        3.3      99.3
 4            16.8      81.5        4.7      98.1        0.6      99.9
 5            10.1      91.6        1.4      99.6        0.1     100.0
 6             5.0      96.6        0.4      99.9
 7             2.2      98.8        0.1     100.0
 8             0.8      99.6
 9             0.3      99.9
10             0.1     100.0
--------------------------------------------------------------------------

Table 2.2 explains why fire assayers walk a fine line between the mass of a test portion and the capacity of a crucible. So-called 30-gram crucibles are suitable for assaying test portions with a mass of about 30 g. In the jargon of fire assayers, test portions of 29.167 g are commonly referred to as “one (1) assay ton.”

Bre-X’s bogus gold grades were initially made up by enriching core samples with gold filings, and fire assaying test portions of 100–150 mesh core samples. Perhaps ironically, Bre-X’s quality control was based on fire assaying a second test portion of every tenth pulverized core sample. Due to the single particle or nugget effect, the analytical precision for fire assaying duplicate test portions was bound to exceed a coefficient of variation of 50%. Because gold filings in pulverized core samples are easy to detect and identify, the problem was not only how to enrich core samples but also how to beat the nugget effect and improve the analytical precision for gold. The first step was to replace gold filings with placer gold, diluted with pulverized ore to obtain low, medium and high concentrations. The second step was to replace fire assays of small test portions of pulverized and salted core samples with cyanide leaching 750 g test portions of crushed and salted core samples.
In spite of that, the analytical precision for cyanide leaching large test portions of placer-gold-enriched crushed core samples was still extremely low. Pulverized ore with variable placer gold concentrations was added to crushed core samples in a haphazard manner, so much so, in fact, that only a few sets of in situ ordered bogus gold grades displayed spatial dependence. Journel’s postulate that spatial dependence may be assumed played a crucial role in converting Bre-X’s bogus grades and Busang’s barren rock into a massive phantom gold resource.

Figure 2.4 shows how the binomial or Bernoulli distribution with a variance of n·p·q=100·0.05·0.95=4.75 and the Poisson distribution with a variance of m=4.75 converge because of the equivalence of

\[ \frac{e^{-m} m^r}{r!} \cong \binom{n}{r} p^{r} (1-p)^{n-r} \]

[Figure 2.4 Probabilities for Bernoulli and Poisson distributions: bar chart of probability in %rel (0 to 25) against predicted number (0 to 13), with one series for Bernoulli and one for Poisson]

The closeness of agreement between bar charts for the Bernoulli distribution and for the Poisson distribution provides heuristic proof that these probability distributions do indeed converge. Convergence accelerates for a variance of n·p·q=100·0.02·0.98=1.96 but even more so for a variance of n·p·q=1,000·0.02·0.98=19.6.

Figure 2.4 and Table 2.3 are implemented in the same spreadsheet template. The variance of n·p·q=100·0.05·0.95=4.75 may be replaced with n·p·q=169·0.05·0.95=8.03 but not with n·p·q=170·0.05·0.95=8.08 because the Bernoulli distribution defaults to #NUM! for values larger than 10^307.
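The overflow that limits the spreadsheet template does not arise when the terms are evaluated through logarithms. The Python sketch below is an illustration outside the original template; the function names are hypothetical, and m is set to n·p·q=4.75 only because that is the value the text uses for the Poisson distribution.

```python
import math


def binomial_pmf(r, n, p):
    """Bernoulli (binomial) term C(n, r) p^r (1-p)^(n-r), evaluated in log space."""
    log_coefficient = (math.lgamma(n + 1) - math.lgamma(r + 1)
                       - math.lgamma(n - r + 1))
    return math.exp(log_coefficient + r * math.log(p) + (n - r) * math.log1p(-p))


def poisson_pmf(r, m):
    """Poisson term e^(-m) m^r / r!, evaluated in log space."""
    return math.exp(-m + r * math.log(m) - math.lgamma(r + 1))


n, p = 100, 0.05
m = n * p * (1 - p)  # 4.75, the value used in the text
for r in range(14):
    print(f"{r:2d}  Bernoulli {100 * binomial_pmf(r, n, p):5.1f} %rel"
          f"   Poisson {100 * poisson_pmf(r, m):5.1f} %rel")
```

Because no factorial is ever formed, the same functions accept n=170, n=1,000 or larger without special handling.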
Table 2.3 Comparing Bernoulli and Poisson distributions
-----------------------------------------------------------
Predicted    Bernoulli    Poisson    Bernoulli    Poisson
number       %rel         %rel       %cum         %cum
-----------------------------------------------------------
 0            0.6          0.9         0.6          0.9
 1            3.1          4.1         3.7          5.0
 2            8.1          9.8        11.8         14.7
 3           14.0         15.5        25.8         30.2
 4           17.8         18.4        43.6         48.5
 5           18.0         17.4        61.6         66.0
 6           15.0         13.8        76.6         79.8
 7           10.6          9.4        87.2         89.1
 8            6.5          5.6        93.7         94.7
 9            3.5          2.9        97.2         97.6
10            1.7          1.4        98.9         99.0
11            0.7          0.6        99.6         99.6
12            0.3          0.2        99.9         99.9
13            0.1          0.1       100.0        100.0
-----------------------------------------------------------

Given that both probability distributions display the typical bell-shaped curve of the Gaussian or normal distribution, it stands to reason that population parameters converge. This is the reason why the variances of samples selected from homogeneous populations that conform to any of these distributions converge in accordance with the Central Limit Theorem. This convergence provides a sound statistical basis to sampling practice, a subject that will be discussed in exhaustive detail in the chapter on sampling practice.

2.3.4 Gaussian or normal distribution

The Gaussian or normal distribution is continuous, nonuniform, and quintessential in the application of statistical methods. The adjective normal has no clinical implications but implies that many populations are distributed in this manner. De Moivre’s work underpins the Gaussian distribution as much as it does the Poisson distribution. It was Carl Friedrich Gauss (1777-1855) who discovered that a wide range of data sets in the universe give curves of comparable shape when plotted in graphs. In fact, the values of the normal distribution display typical bell-shaped curves that define confidence intervals and ranges for central values of sets of measured values at different probability levels.

\[ y = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]

Its formula shows that y is a function of μ, the population mean, σ², the population variance, and x, a measured value of the stochastic variable.
The presence of e, the base of natural logarithms, and of π, the omnipresent constant, reflects De Moivre’s work. Given that the area under the normal curve is unity or 100% when integrated from x=–∞ to x=+∞, it follows that the area between x1 and x2 is equal to the following integral,

\[ P(x_1 \le x \le x_2) = \frac{1}{\sigma\sqrt{2\pi}} \int_{x_1}^{x_2} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx \]

Most textbooks on statistics contain tables that give F(z), the area under the normal curve, over a range from z=0.00 to z=3.49. Statistical software and spreadsheet software give the same z-values as are listed in handbooks of statistical tables. Figure 2.5 shows three different population variances about the same population mean. The highest variance gives the lowest curve (red), and the lowest variance gives the tallest curve (green).

Figure 2.5 Different population variances and same mean

The question of whether these population variances differ significantly is irrelevant. After all, the observed F-value between the highest and lowest population variances in Figure 2.5 invariably exceeds F0.05;∞;∞=F0.01;∞;∞=1.00 at 5% and 1% probability. In other words, numerically different population variances do indeed differ significantly because df=∞ for each σ² by definition. Unlike Figure 2.5, in which different population variances are plotted about the same population mean, Figure 2.6 shows different population means with the same population variance.

Figure 2.6 Different means and same population variance

The question of whether population means differ significantly is equally irrelevant. After all, variances of population means converge on zero because df=∞ for σ² by definition. Hence, numerically different population means do indeed differ significantly. Figure 2.7 depicts the fundamental relationship between the population variance (red) of a homogeneous sampling unit and the variance of a sample (green) that consists of a set of fifteen (15) primary increments taken from this sampling unit.
Figure 2.7 Population variance and sample variance

The Central Limit Theorem defines the relationship between the population variance of a homogeneous population and the variance of a sample selected from that population. This theorem is pivotal in the transition from sampling theory with infinite degrees of freedom to sampling practice where degrees of freedom are finite. In fact, the number of degrees of freedom is either a positive integer for sets of measured values with constant weights or a positive irrational for sets of measured values with variable weights.

Inflection points on a graph show its change from convex about the center to concave toward –∞ and +∞. By definition, the area under the bell-shaped curve between –∞ and +∞ is unity or 100%. Those inflection points were already defined in De Moivre’s work. Table 2.4 gives the most relevant symmetric confidence intervals and ranges as a function of μ, the population mean, and σ, its standard deviation, together with z-values that derive from the Gaussian distribution, and with symbols that will be used throughout the text.

Table 2.4 Symmetric confidence intervals and ranges
------------------------------------------------------------------------
Confidence interval     Range: lower limit        Range: upper limit
------------------------------------------------------------------------
95% CI = ±1.96·σ        95% CRL = μ–1.96·σ        95% CRU = μ+1.96·σ
99% CI = ±2.58·σ        99% CRL = μ–2.58·σ        99% CRU = μ+2.58·σ
99.9% CI = ±3.29·σ      99.9% CRL = μ–3.29·σ      99.9% CRU = μ+3.29·σ
------------------------------------------------------------------------

Tables that give areas under the normal curve as a function of z-values define the level of significance or risk associated with a particular inference. The value of z=1.96 gives 95% confidence intervals and symmetric 95% confidence ranges whereas values of z=2.58 and z=3.29 give confidence intervals and symmetric confidence ranges at 99% and 99.9% probability respectively.
The level of significance or risk is defined as α=2[1–F(z)], in which F(z) is the area under the normal curve. A homogeneous population with μ=50 and σ²=25 would give a 95% confidence interval of 95% CI=z0.975·σ=1.96·5=±9.8, or 95% CI=±9.8·100/50=±19.6%rel. Its symmetric 95% confidence range would have a lower limit of 95% CRL=μ–95% CI=50–9.8=40.2 and an upper limit of 95% CRU=μ+95% CI=50+9.8=59.8. Therefore, a measured value of 62.1, determined in a sample taken from this population, may be rejected as a statistical outlier because it exceeds 95% CRU=59.8. In fact, 95 out of 100 measured values (19 out of 20) are expected to fall within this symmetric 95% confidence range from 40.2 to 59.8 whereas five out of 100 measured values (one out of 20) are expected to be either lower than 95% CRL=40.2 or higher than 95% CRU=59.8.

A 99% confidence interval of 99% CI=z0.995·σ=2.58·5=±12.9 would give a symmetric confidence range with a lower limit of 99% CRL=50–12.9=37.1 and an upper limit of 99% CRU=50+12.9=62.9. In the same way, a 99.9% confidence interval of 99.9% CI=z0.9995·σ=3.29·5=±16.4 would give a symmetric confidence range with a lower limit of 99.9% CRL=50–16.4=33.6 and an upper limit of 99.9% CRU=50+16.4=66.4. In this case, the measured value of 62.1 should not be rejected as a statistical outlier because it falls between lower and upper limits of symmetric 99% and 99.9% confidence ranges.

Lower and upper limits of asymmetric confidence ranges are mutually exclusive because either the lower limits or the upper limits are valid. Lower limits are effective measures for the precision of contents and grades of ore reserves. By contrast, upper limits are practical measures for the precision of trace metals in contaminated sites. Table 2.5 lists applicable z-values and symbols that will be used throughout this text.
Table 2.5 Asymmetric confidence ranges
--------------------------------------------------------
ACRL => lower limit          ACRU => upper limit
--------------------------------------------------------
95% ACRL = μ–1.65·σ          95% ACRU = μ+1.65·σ
99% ACRL = μ–2.33·σ          99% ACRU = μ+2.33·σ
99.9% ACRL = μ–3.09·σ        99.9% ACRU = μ+3.09·σ
--------------------------------------------------------

For simplicity, the same population parameters of μ=50 and σ²=25 are used to show how to derive lower and upper limits of asymmetric confidence ranges. The lower limit of the asymmetric 95% confidence range is 95% ACRL=μ–z0.95·σ=50–1.65·5=50–8.2=41.8. Hence, five out of 100 (one out of 20) measured values, determined in samples selected from this population, are expected to be less than 95% ACRL=41.8. Similarly, one out of 100 measured values is expected to be less than 99% ACRL=μ–z0.99·σ=50–2.33·5=38.4 whereas only one out of 1,000 is expected to be less than 99.9% ACRL=μ–z0.999·σ=50–3.09·5=34.6.

The upper limit of the asymmetric 95% confidence range is 95% ACRU=μ+z0.95·σ=50+1.65·5=50+8.2=58.2. Thus, five out of 100 (one out of 20) measured values are expected to exceed 95% ACRU=58.2, one out of 100 measured values is expected to exceed 99% ACRU=μ+z0.99·σ=50+2.33·5=61.6, and only one out of 1,000 is expected to exceed 99.9% ACRU=μ+z0.999·σ=50+3.09·5=65.4.

The properties of the Gaussian or normal distribution are highly relevant in sampling practice because central values and variances of sets of measured values, determined in samples taken from multinomial, binomial, or Poisson distributions, converge on those of the normal distribution as more samples are selected. Table 2.6 gives the relationship between the population parameters of Bernoulli, Poisson, and Gaussian distributions.
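The limits derived for μ=50 and σ²=25 can be checked without tables. The Python sketch below is an illustration only; it uses `statistics.NormalDist` from the Python standard library, and the function names `symmetric_range` and `asymmetric_limits` are hypothetical. Exact z-values are used, so the last digit may differ slightly from limits computed with rounded z-values.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def symmetric_range(mu, sigma, confidence):
    """Lower and upper limits of a symmetric confidence range."""
    z = STANDARD_NORMAL.inv_cdf(0.5 + confidence / 2.0)
    return mu - z * sigma, mu + z * sigma


def asymmetric_limits(mu, sigma, confidence):
    """Lower limit and upper limit of asymmetric ranges (only one applies at a time)."""
    z = STANDARD_NORMAL.inv_cdf(confidence)
    return mu - z * sigma, mu + z * sigma


mu, sigma = 50.0, 5.0
for confidence in (0.95, 0.99, 0.999):
    crl, cru = symmetric_range(mu, sigma, confidence)
    acrl, acru = asymmetric_limits(mu, sigma, confidence)
    print(f"{100 * confidence:g}%  CRL={crl:.1f}  CRU={cru:.1f}"
          f"  ACRL={acrl:.1f}  ACRU={acru:.1f}")

# Fraction of measured values expected below the asymmetric 95% lower limit
acrl_95, _ = asymmetric_limits(mu, sigma, 0.95)
print(f"P(x < 95% ACRL) = {NormalDist(mu, sigma).cdf(acrl_95):.3f}")
```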
Table 2.6 Properties of probability distributions
--------------------------------------------------------------
Parameter              Bernoulli     Poisson     Gaussian
--------------------------------------------------------------
Population mean        n·p           m           μ
Population variance    n·p·q         m           σ²
--------------------------------------------------------------

This convergence of population parameters ensures a smooth transition from sampling theory and unknown true values to sampling practice and unbiased estimates.

2.3.5 Standard uniform distribution

Randomly distributed values of the standard uniform distribution are useful to simulate games of chance such as tossing coins or rolling dice. The spreadsheet functions RAND() in Excel and @RAND in Lotus both generate a single randomly distributed value of the standard uniform distribution. Excel’s RAND() function is applied in several numerical examples. Each value simulated with this function is called a Standard Uniform Random Number with the acronym SURN. All outcomes within the range from zero to unity are equiprobable, and RAND() returns values such that 0≤SURN<1.

Simulating a toss of an unbiased coin is easy. Given that P(head)=P(tail)=0.5, a value of 0≤SURN≤0.5 gives “head” whereas a value of 0.5<SURN<1 gives “tail”. In Excel format, =IF(RAND()>0.5,"tail","head"). Simulating a roll with a 6-sided die is also easy. Because the probability for each face of an unbiased die is p=1/6, multiplying a SURN by 6, adding 0.5, and rounding the sum to the closest integer gives one of the six faces of a single die. In Excel format, =ROUND(RAND()*6+0.5,0).

The RAND() function is validated by simulating a large number of tosses of a coin and comparing the observed probability with the expected probability of P(head)=P(tail)=0.5. The function can also be validated by simulating a large number of rolls with a single die and comparing observed and expected probabilities for each of the six faces.
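The same validation can be sketched outside a spreadsheet. The Python example below is an illustration under stated assumptions: `random.Random.random()` stands in for RAND(), the function names are hypothetical, and the die is simulated with floor(6·SURN)+1 rather than the spreadsheet’s rounding formula, because Python’s built-in `round()` rounds halves to even and would not reproduce ROUND() exactly.

```python
import math
import random


def toss_coin(surn):
    """Mirror of =IF(RAND()>0.5,"tail","head")."""
    return "tail" if surn > 0.5 else "head"


def roll_die(surn):
    """Map a SURN in [0, 1) to a face 1..6 (floor-based stand-in for ROUND)."""
    return math.floor(surn * 6) + 1


def simulate(n, seed=12345):
    """Observed P(head) and observed probabilities for the six faces."""
    rng = random.Random(seed)  # fixed seed, so that runs are repeatable
    heads = sum(toss_coin(rng.random()) == "head" for _ in range(n))
    faces = [0] * 6
    for _ in range(n):
        faces[roll_die(rng.random()) - 1] += 1
    return heads / n, [count / n for count in faces]


p_head, p_faces = simulate(60000)
print(f"observed P(head) = {p_head:.4f}, expected 0.5")
for face, p in enumerate(p_faces, start=1):
    print(f"face {face}: observed {p:.4f}, expected {1 / 6:.4f}")
```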
In this case, the population mean and its variance for the standard uniform distribution verify the performance of the RAND() function. The population mean and the population variance of the standard uniform distribution are μ=0.5 and σ²=1÷12=0.0833 for a standard deviation of σ=0.2887 and a coefficient of variation of CV=σ·100÷μ=57.7%rel. The test for bias compares the arithmetic mean of a sample of 121 SURNs with the population mean of μ=0.5.

Table 2.7.1 Test for bias
------------------------------------------------
Parameter/statistic     Symbol        Value
------------------------------------------------
Population mean         μ             0.5
Sample mean             x̄             0.4782
Difference              ∆x            0.0218
Observed t-value        t             0.785
Tabulated t-value       t0.05;120     1.980
Significance                          ns
------------------------------------------------
ns not significant

Table 2.7.1 gives the population mean, a sample mean, the difference between these central values, an observed t-value, and the tabulated value of the t-distribution. The observed t-value in this table is t=∆x÷√[var(x)÷n]=0.0218÷√(0.0934÷121)=0.785. Observed t-values are not expected to exceed t0.05;120=1.980 of the t-distribution more than 5 times out of 100 clicks of F9.

Fisher’s F-test is applied to verify whether a sample variance of var(x)=0.0934 for this set of 121 SURNs is statistically identical to the population variance of σ²=1÷12=0.0833 by comparing the observed value of F=0.0934÷0.0833=1.12 with F0.05;120;∞=1.23. Given that a simulated sample variance is either higher or lower than the population variance of σ²=0.0833, the observed F-value is compared with either F0.05;∞;120=1.23 or F0.05;120;∞=1.25. Table 2.7.2 gives the F-statistics for this test.
Table 2.7.2 Test for homogeneity of variances
----------------------------------------------------
Parameter/statistic     Symbol          Value
----------------------------------------------------
Population variance     σ²              0.0833
Sample variance         var(x)          0.0934
Observed F-value        F               1.12
Tabulated F-value       F0.05;∞;120     1.23
Tabulated F-value       F0.05;120;∞     1.25
Significance                            ns
----------------------------------------------------
ns not significant

This F-test is atypical because the sample variance is equally likely to be lower or higher than the population variance. The F-test is typically applied to verify spatial dependence by comparing the observed F-value between the variance of a set and the first variance term of the ordered set with values of F-distributions at 5% and 1% probability. Fisher’s F-test is also applied to optimize sampling protocols by partitioning the sum of all variances in a measurement chain or hierarchy into its components.

The question of whether this ordered set of 121 SURNs displays a significant degree of spatial dependence or is randomly distributed within the time it took to simulate the set is solved by comparing the observed F-value between var(x)=0.0934, the variance of the set, and var1(x)=0.0866, the first variance of the ordered set, with F0.05;120;240=1.31. Table 2.7.3 shows that the observed value of F=0.0934÷0.0866=1.08 does not exceed this value of the F-distribution. Hence, this temporally ordered set of SURNs is randomly distributed within the sample space of time required to simulate the set.

Table 2.7.3 Test for spatial dependence
----------------------------------------------------
Parameter/statistic     Symbol            Value
----------------------------------------------------
Variance of set         var(x)            0.0934
First variance term     var1(x)           0.0866
Observed F-value        F                 1.08
Tabulated F-value       F0.05;120;240     1.31
Tabulated F-value       F0.05;240;120     1.33
Significance                              ns
----------------------------------------------------
ns not significant

Repeatedly clicking F9 in the spreadsheet template for Table 2.7.3 shows that observed F-values are equally likely to either exceed unity or fall below it.
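The test for spatial dependence can be sketched in Python as follows. This section does not spell out the formula for the first variance term, so the sketch assumes, consistent with the 2(n–1) degrees of freedom counted for the ordered set, that var1(x) is the sum of squared differences between consecutive values divided by 2(n–1). The function names are hypothetical, and the tabulated value of 1.31 is taken from the text.

```python
import random
import statistics


def first_variance_term(ordered):
    """Assumed form: sum of squared consecutive differences over 2(n-1)."""
    n = len(ordered)
    squared = sum((a - b) ** 2 for a, b in zip(ordered, ordered[1:]))
    return squared / (2 * (n - 1))


def spatial_dependence_f(ordered):
    """Observed F = var(x) / var1(x), with both variances returned as well."""
    var_set = statistics.variance(ordered)  # n-1 degrees of freedom
    var_first = first_variance_term(ordered)
    return var_set / var_first, var_set, var_first


# A temporally ordered set of 121 simulated SURNs
rng = random.Random(121)
surns = [rng.random() for _ in range(121)]
f_observed, var_set, var_first = spatial_dependence_f(surns)
print(f"var(x)={var_set:.4f} var1(x)={var_first:.4f} F={f_observed:.2f}")
print("significant" if f_observed > 1.31 else "ns")  # F0.05;120;240 from the text
```

A steadily trending set such as 1, 2, 3, 4, 5 gives var(x)=2.5 and var1(x)=0.5 under this assumption, so that its observed F-value of 5.0 points to strong spatial dependence, whereas a randomly distributed set gives F-values that scatter about unity.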
F0.05;120;240=1.31 and F0.05;240;120=1.33 differ only marginally. This is why most sets of SURNs are found to be randomly distributed irrespective of which of these values does apply. If var(x), the variance of the set, is lower than var1(x), the first variance term of the ordered set, then FALSE is printed.

Covariances, unlike the variance terms for ordered sets, need not be computed to quantify degrees of spatial dependence in sample spaces. Testing for spatial dependence between in situ and temporally ordered sets of measured values precedes the design of sampling variograms. In fact, a sampling variogram is more than a visual presentation of Fisher’s F-test because it shows not only where the degree of spatial dependence is statistically significant but also where it dissipates into randomness. Covariances may play a role when quantifying associative dependence between logically paired data.

2.3.6 Standard normal distribution

Randomly distributed values of the standard normal distribution are useful to simulate more complex stochastic systems than games of chance. The RAND() function gives a single SURN whereas the sum of 12 RAND()s minus 6 gives, to a close approximation, a randomly distributed value of the standard normal distribution. This value is called a Standard Normal Random Number, and its acronym is SNRN. The standard normal distribution has a population mean of μ=0 and a population variance of σ²=1. Because σ²=1÷12 for the standard uniform distribution, and σ²=1 for the standard normal distribution, a set of 12 SURNs in each of 121 cells of a spreadsheet eliminates the need for a multiplication factor.

Table 2.8.1 lists the population mean of μ=0, a sample mean of x̄=0.0515, and the observed and tabulated values. This bias test verifies whether the population mean of μ=0 and a sample mean of x̄=0.0515 are statistically identical. The observed t-value of t=[x̄–μ]÷√[var(x)÷n]=0.0515÷√(1.1255÷121)=0.534 is below t0.05;120=1.980.
Hence, this sample mean of x̄=0.0515 is statistically identical to the population mean of μ=0.

Table 2.8.1 Test for bias
------------------------------------------------
Parameter/statistic     Symbol        Value
------------------------------------------------
Population mean         μ             0.0000
Sample mean             x̄             0.0515
Observed t-value        t             0.534
Tabulated t-value       t0.05;120     1.980
Significance                          ns
------------------------------------------------
ns not significant

Table 2.8.2 gives var(x)=1.1255, the variance of the set of 121 SNRNs on which the above t-test is based. Fisher’s F-test proves that this sample variance of var(x)=1.1255 is statistically identical to the population variance of σ²=1.0 because the observed value of F=1.1255÷1.0=1.13 is below F0.05;120;∞=1.23 at 95% probability.

Table 2.8.2 Test for homogeneity of variances
----------------------------------------------------
Parameter/statistic     Symbol          Value
----------------------------------------------------
Population variance     σ²              1.0
Sample variance         var(x)          1.1255
Observed F-value        F               1.13
Tabulated F-value       F0.05;∞;120     1.23
Tabulated F-value       F0.05;120;∞     1.25
Significance                            ns
----------------------------------------------------
ns not significant

Logically, Fisher’s F-test for spatial dependence is based on comparing the observed value of F=var(x)/var1(x) with F0.05;120;240=1.31. Predictably, about 50% of observed values of F=var1(x)/var(x) are compared with F0.05;240;120=1.33 because the sample variance is equally likely to be lower or higher than the population variance. In this case, the question of whether the ordered set of 121 SNRNs displays a significant degree of spatial dependence or is randomly distributed in this temporally ordered set is solved by comparing the observed F-value between var(x)=1.1255, the variance of the set, and var1(x)=1.3679, the first variance of the ordered set, with F0.05;240;120=1.33 at 95% probability. The statistics in Table 2.8.3 show that F=1.3679÷1.1255=1.22 is below F0.05;240;120=1.33. Hence, this temporally ordered set is randomly distributed within the time interval required to simulate 121 SNRNs.
Table 2.8.3 Test for spatial dependence
----------------------------------------------------
Parameter/statistic     Symbol            Value
----------------------------------------------------
Variance of set         var(x)            1.1255
First variance term     var1(x)           1.3679
Observed F-value        F                 1.22
Tabulated F-value       F0.05;120;240     1.31
Tabulated F-value       F0.05;240;120     1.33
Significance                              ns
----------------------------------------------------
ns not significant

Repeatedly clicking F9 in the spreadsheet template for Table 2.8.3 demonstrates that the observed F-value is equally likely to either exceed or fall below unity.

Student’s t-test for bias and Fisher’s F-test for compatibility of pairs of variances are based on comparing observed values with tabulated values, which are listed as a function of the level of probability and of the number of degrees of freedom. Evidently, degrees of freedom are indispensable when verifying whether simulated sets of SURNs derive from a homogeneous standard uniform distribution, and whether simulated sets of SNRNs derive from a homogeneous standard normal distribution, by comparing sample statistics with a priori population parameters. It is straightforward to simulate sets of n=121 SURNs or SNRNs, count df=n–1=120 degrees of freedom for the set and dfo=2(n–1)=240 for the ordered set, and look up tabulated values as a function of degrees of freedom.

Simulation models are effective tools to study how the variances of sets of stochastic variables interact in complex functions. A typical example is a mineral processing plant where wet masses and metal grades of mill feed, concentrate, and tailing interact with the variances of these stochastic variables, and give confidence intervals and ranges for the recovery by simulation. A more complex alternative would be to study how partial derivatives towards all variables interact and define deterministic confidence intervals and ranges for some functionally dependent value of the stochastic system.
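The simulation of SNRNs and the two tests against a priori population parameters can be sketched in Python. This is an illustration rather than the spreadsheet template itself: `random.Random.random()` stands in for RAND(), the function names are hypothetical, and the tabulated values of 1.980 and 1.25 are taken from the tables in the text.

```python
import math
import random
import statistics


def snrn(rng):
    """Sum of 12 SURNs minus 6: approximately standard normal."""
    return sum(rng.random() for _ in range(12)) - 6.0


def t_value(sample_mean, population_mean, sample_variance, n):
    """Observed t = |mean - mu| / sqrt(var(x) / n)."""
    return abs(sample_mean - population_mean) / math.sqrt(sample_variance / n)


def f_value(variance_a, variance_b):
    """Observed F with the larger variance in the numerator."""
    return max(variance_a, variance_b) / min(variance_a, variance_b)


rng = random.Random(2011)
sample = [snrn(rng) for _ in range(121)]  # n=121, so df=n-1=120
mean = statistics.fmean(sample)
variance = statistics.variance(sample)

t_observed = t_value(mean, 0.0, variance, len(sample))
f_observed = f_value(variance, 1.0)
print(f"mean={mean:.4f} var(x)={variance:.4f}")
print(f"t={t_observed:.3f}", "ns" if t_observed < 1.980 else "significant")
print(f"F={f_observed:.2f}", "ns" if f_observed < 1.25 else "significant")
```

With the statistics quoted in the text, `t_value(0.0515, 0.0, 1.1255, 121)` reproduces the observed t-value of 0.534 and `f_value(1.1255, 1.0)` the observed F-value of 1.13.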
Fisher’s F-test for spatial dependence can be applied to any ordered set of measured values, determined in primary increments taken from a sampling unit at intervals of constant time or mass. The test can also be applied to any in situ ordered set of measured values, determined in samples selected at positions with different coordinates in a sample space. In each case, the objective is to verify whether an ordered set of measured values displays a statistically significant degree of spatial dependence, or is randomly distributed within a sample space or a sampling unit.

Stanford’s Journel referred in 1992 to some kind of decision that spatial dependence may be assumed unless proven otherwise. Journel did not disclose who decided that spatial dependence might be assumed. Nor did he show how to prove otherwise. The question is then why he was troubled when Fisher’s F-test proved a statistically significant degree of spatial dependence between gold grades of a set of ordered rounds in a drift. It does not make any scientific sense to assume spatial dependence between measured values in ordered sets. On the contrary, it would make a great deal of sense to figure out where spatial dependence in sample spaces dissipates into randomness.

2.4 Variance of general function

The population variance of a general function is defined by the dependent variable, some set of n independent variables, and the variances of these variables. This formula finds its origin in calculus and probability theory. It shows that the population variance of a general function is the sum of n variance terms, each of which is the squared partial derivative toward an independent variable multiplied by its variance.

\[ \sigma_y^2 = \left(\frac{\partial y}{\partial x_1}\right)^2 \sigma_{x_1}^2 + \left(\frac{\partial y}{\partial x_2}\right)^2 \sigma_{x_2}^2 + \cdots + \left(\frac{\partial y}{\partial x_n}\right)^2 \sigma_{x_n}^2 \]

This formula can be used to derive the variance of the arithmetic mean.
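How the variance of a general function leads to the variance of the arithmetic mean can be checked numerically. The sketch below assumes independent variables (no covariance terms) and approximates each partial derivative by a central difference. The function names, the step size, and the four measured values are hypothetical and serve only as an illustration.

```python
def propagated_variance(func, means, variances, rel_step=1e-6):
    """Sum of squared partial derivatives multiplied by variances.

    Partial derivatives are approximated by central differences, and
    the variables are assumed to be independent (no covariance terms).
    """
    total = 0.0
    for i, var_i in enumerate(variances):
        step = rel_step * max(abs(means[i]), 1.0)
        upper = list(means)
        lower = list(means)
        upper[i] += step
        lower[i] -= step
        partial = (func(upper) - func(lower)) / (2 * step)
        total += partial ** 2 * var_i
    return total


def arithmetic_mean(xs):
    return sum(xs) / len(xs)


values = [2.1, 2.4, 1.9, 2.6]  # hypothetical measured values
var_x = 0.5                    # hypothetical variance of each value

var_mean = propagated_variance(arithmetic_mean, values, [var_x] * len(values))
print(var_mean, var_x / len(values))  # both close to var(x)/n
```

Each partial derivative of the arithmetic mean equals 1/n, so the n variance terms add up to n·(1/n)²·var(x)=var(x)/n.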
It is simple to prove that the variances of area-, count-, density-, distance-, length-, mass-, and volume-weighted averages converge on the variance of the arithmetic mean as all the variable weights converge on the same constant weight. The length-weighted average was the first central value to lose its variance, either at the Witwatersrand gold reef complex or at Centre de Géostatistique, before it became a kriged estimate or kriged estimator.

The above formula does not define covariance terms and only applies to sets of variables that do not display significant degrees of associative dependence. For example, wet masses, moisture contents, and metal grades of mined ores and mineral concentrates are not expected to display associative dependence. In contrast, in situ densities and grades of massive sulfides invariably display a significant degree of associative dependence. Yet, associative dependence between in situ densities and grades can be taken into account in an effective and intuitive manner without working with covariance terms.

The mass of metal contained in a quantity of mined ore, mill feed, or concentrate, is a function of its wet mass, moisture factor and grade factor such that Me = Mw · MF · GF. The following formula gives the variance of the mass of contained metal.

\[ \mathrm{var}(Me) = Me^2 \left[ \frac{\mathrm{var}(Mw)}{Mw^2} + \frac{\mathrm{var}(MF)}{MF^2} + \frac{\mathrm{var}(GF)}{GF^2} \right] \]

where:
var(Me) = variance of mass of contained metal
var(Mw) = variance of wet mass
var(MF) = variance of moisture factor
var(GF) = variance of grade factor

This formula forms part of ISO DIS 12744–Sampling Procedures for Determination of Metal and Moisture Content. ISO Technical Committee 183 approved the interleaved sampling protocol because it gives reliable confidence limits for contents and grades of coals, ores, mineral concentrates, smelter residues, recycled materials, and scores of others, at the lowest possible cost.
It does so because each pair of interleaved primary samples gives a single degree of freedom. Interleaved sampling protocols are also approved by ISO Technical Committee 69–Applications of Statistical Methods.

A homologue of the above formula can be applied to compute confidence limits for metal contents and grades of ore reserves. This formula has not yet been approved for reserve estimation. The methodology is based on the fact that Me, the mass of metal contained in a volume of in situ ore, is a function of a volume in m³, an in situ density in mt/m³, and a grade factor such that Me = V · ID · GF. The following formula gives the variance of the mass of contained metal or metal content of a volume of in situ ore,

\[ \mathrm{var}(Me) = Me^2 \left[ \frac{\mathrm{var}(V)}{V^2} + \frac{\mathrm{var}(ID)}{ID^2} + \frac{\mathrm{var}(GF)}{GF^2} \right] \]

where:
var(Me) = variance of mass of contained metal
var(V) = variance of volume
var(ID) = variance of in situ density
var(GF) = variance of grade factor

The above formula provides unbiased confidence limits for contents and grades of ore reserves. The first step is to verify spatial dependence between grades of ordered sets of core samples in boreholes. The second step is to compute variances of average grades for ore zones in boreholes. The third step is to convert statistics of boreholes into cylindrical volumes of in situ proven ore. The fourth step is to verify spatial dependence between grades of ordered boreholes along a cross section or profile. A significant degree of spatial dependence would make it possible to convert cylindrical volumes into contiguous blocks of proven ore. The final step is to sum the variances of metal contents of all blocks, and to convert this sum into confidence limits for the cumulative metal content and the mass-weighted average metal grade. In the case that borehole grades ordered along a profile do not exhibit a significant degree of spatial dependence, cylindrical volumes cannot be converted into contiguous blocks of proven ore.
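The variance of the mass of contained metal, and the summing of block variances in the final step, can be sketched as follows. The sketch treats the contained metal as a product of three factors without covariance terms, so that the same function serves Me = Mw · MF · GF and Me = V · ID · GF. All numerical values, units, and function names are hypothetical and only illustrate the arithmetic.

```python
import math


def contained_metal(factors):
    """Mass of contained metal and its variance.

    factors: list of (value, variance) pairs whose product is Me,
    for example wet mass, moisture factor, and grade factor.
    """
    metal = 1.0
    relative_variance = 0.0
    for value, var in factors:
        metal *= value
        relative_variance += var / value ** 2
    return metal, metal ** 2 * relative_variance


def cumulative(blocks):
    """Add metal contents and, by the additive property, their variances."""
    total_metal = sum(metal for metal, _ in blocks)
    total_variance = sum(var for _, var in blocks)
    return total_metal, total_variance


# Hypothetical lot of concentrate: wet mass, moisture factor, grade factor.
lot = contained_metal([(100.0, 1.0), (0.9, 0.000081), (0.5, 0.000025)])

# Hypothetical blocks of in situ ore: volume, in situ density, grade factor.
block_1 = contained_metal([(5000.0, 62500.0), (3.2, 0.0064), (0.02, 0.000001)])
block_2 = contained_metal([(4000.0, 40000.0), (3.0, 0.0009), (0.025, 0.000000625)])

total_metal, total_variance = cumulative([block_1, block_2])
print(lot)
print(total_metal, total_variance, math.sqrt(total_variance))
```

Converting the summed variance into confidence limits still requires a tabulated t-value for the applicable degrees of freedom, which this sketch leaves out.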
Cylindrical volumes around boreholes do not only delineate proven ore within a resource but also provide an effective method to decide where best to drill additional holes. Unbiased confidence intervals and ranges for metal grades and contents of mineral inventories in annual reports are of critical importance to mining investors. This is why several numerical examples will be presented in the chapter on sampling practice to show how to compute confidence intervals and ranges for contents and grades of reserves, and how to define proven ore in resources.

The arithmetic mean is by far the most common function in mathematical statistics. The area-, density-, distance-, length-, mass- and volume-weighted averages are widely used in geosciences. For simplicity, arithmetic means and weighted averages are referred to as “central values”. Given that a central value is a function of a set of measured values, it follows that each central value has its own variance in mathematical statistics. Yet, each central value does not necessarily have its own variance in geostatistics. The question is not only which central value lost its variance but also who could possibly have lost this variance, when and why. This question stands while geostatocrats and krigeologists assume, krige, smooth and rig the rules of classical statistics.

The variance of a central value of a set of n measured values with variable weights derives from the variance of the set and the sum of squared weighting factors. In formula,

\[ \mathrm{var}(\bar{x}) = \sum_{i=1}^{n} w_i^2 \cdot \mathrm{var}(x) \]

where:
var(x) = variance of set
wi² = squared ith weighting factor

For each wi = 1/n, the sum of n squared weighting factors is equal to n·(1/n)²=1/n. Hence, var(x̄) = var(x)/n. This formula is in fact the Central Limit Theorem, a simple formula that defines the relationship between the variance of a set of n measured values with equal weights and the variance of its arithmetic mean.
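The relationship between the variance of a set and the variance of its central value is easy to verify with a few lines of code. The sketch below assumes that raw weights such as lengths or masses are first converted into weighting factors that add up to unity. The function name and the numbers are hypothetical.

```python
def variance_of_central_value(weights, var_set):
    """Variance of a weighted central value.

    Raw weights (lengths, masses, and so on) are normalized so that the
    weighting factors add up to one; the variance of the set is then
    multiplied by the sum of squared weighting factors.
    """
    total = sum(weights)
    factors = [w / total for w in weights]
    return sum(f ** 2 for f in factors) * var_set


var_x = 2.0  # hypothetical variance of the set

# Equal weights: the result is var(x)/n.
print(variance_of_central_value([1.0, 1.0, 1.0, 1.0], var_x))

# Variable weights, for example core lengths of 1 m and 3 m.
print(variance_of_central_value([1.0, 3.0], var_x))
```

With equal weights the function returns var(x)/n. With variable weights the sum of squared weighting factors exceeds 1/n, so the variance of the weighted average is larger than that of the arithmetic mean of the same number of values.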
In contrast, the weighted average is the central value of a set of measured values with variable weights. This is why each area-, count-, density-, distance-, length-, mass-, and volume-weighted average does have its own variance. Evidently, the variances of all weighted averages converge on the Central Limit Theorem as all the variable weights converge on the same constant weight. The corollary of variable weights is that the number of degrees of freedom is no longer a positive integer but a positive irrational. Two or more measured values such as counts, densities, lengths, masses, or volumes define but one weighted average. The distance-weighted average is different in the sense that two or more measured values, determined in samples selected at positions with different coordinates in a finite sample space, define an infinite set of distance-weighted averages. The problem is that all weighted averages do have variances in mathematical statistics but that the distance-weighted average is not similarly blessed in Matheronian geostatistics. To have or not to have a variance are mutually exclusive outcomes. The question is then which outcome is true, and which outcome is false. Could Matheron’s new science of geostatistics be flawed? Could mathematical statistics be flawed? Did Matheron study mathematical statistics before creating his new science of geostatistics? In statistics, all variances converge on the Central Limit Theorem as variable weights converge on the same constant weight. Moreover, each weighted average does have its own variance because it is a functionally dependent value. In geostatistics, however, the variance of the length-weighted average and the variance of the distance-weighted average were somehow misplaced. Paradoxically, both distance-weighted average and length-weighted averages were reborn as kriged estimates or kriged estimators. 
Matheron, in his Foreword to Journel and Huijbregts’s 1978 Mining Geostatistics, put forward that geologists deal with structure and statisticians stick to randomness but did not explain how. Much of Matheron’s seminal work is posted with the On-Line Library of the Centre de Géostatistique. The properties of variances play a critical role in computing unbiased confidence limits for contents and grades of reserves. Matheron paid little attention to the properties of variances, and today’s Centre de Géostatistique even less. The problem with Matheron’s teachings is not so much that he knew too little about applied statistics but that much of what little he knew was wrong.

2.5 Matheron’s statistics

Matheron’s seminal work is accessible for long overdue review and scrutiny at the On-Line Library of the Centre de Géostatistique. Matheron’s Formule des Minerais Connexes, one of the very first papers posted on this website, was written in Algiers on November 25, 1954. A correction to this paper was appended on January 13, 1995. No primary data was provided either in the paper or in the appended correction. In retrospect, it is strange that this paper was marked Note Statistique No 1 because it does confirm Matheron himself believed he was practicing classical statistics in 1954. His first paper, too, proved that Matheron’s statistics didn’t comply with the requirement of functional independence, and ignored the concept of degrees of freedom.

Given that Matheron’s seminal work showed such a tenuous grasp of mathematical statistics, it is not surprising that French statisticians paid scant attention to his statistics. In those days, Matheron was far removed from rapid advances in mathematical statistics elsewhere in the world. Matheron’s shaky statistics stood in sharp contrast to his credentials at l’École Nationale Supérieure des Mines de Paris.
Formule des Minerais Connexes is classic Matheronesque in the sense that poorly defined terms and atypical symbols abound but real data are missing. Matheron failed to explain why correlation-regression analysis was applied to Napierian logarithms of paired lead and silver grades. He penned in his paper a correlation coefficient of r=0.85 but not the number of paired grades. As a result, it is still impossible to compute 95% confidence limits for his mean grades of 2.36% for lead and of 89 g/t for silver. What he should have pointed out, but did not, is that n paired grades give df=n–2 degrees of freedom. Matheron’s problem in 1954 was that central values and degrees of freedom were beyond his grasp.

Matheron knew how to test for associative dependence between paired lead and silver grades. What failed to register in his mind is that associative dependence between ordered metal grades is a measure for spatial dependence in a sample space. The hypothesis that causality may be assumed without cause stands out as the central red flag in Matheron’s work. In contrast, the Central Limit Theorem is nowhere to be found.

In his Rectificatif à la Note Statistique No 1, Matheron mentioned that the length of his core samples was variable. This is why his mean grades of 2.36% of lead and 89 g/t of silver are biased estimates. He did not explain how to derive the variance of the arithmetic mean grade or the variance of the length-weighted average grade. In fact, the Central Limit Theorem was never approved for application in Matheron’s new science of geostatistics. Matheron huffed and puffed about higher length-weighted average grades but didn’t test whether length-weighted average grades were significantly higher than arithmetic mean grades. Matheron did not know how to compute confidence limits, how to verify spatial dependence, and how to apply Student’s t-test.
Matheron did not grasp that his arithmetic mean and his length-weighted average do have variances simply because all functionally dependent values do. Matheron should have taught his students a little set theory by pointing out that one-to-one correspondence is as common between functionally dependent values and variances as it is between blocks and grades and between core samples and grades. It would be appropriate if the Centre de Géostatistique were to post on its On-Line Library the lengths of all core samples and the set of paired lead and silver grades that underpin Matheron’s very first statistical note. Among those who stand to benefit most are all the neophytes who are presently primed to become the geostatistical scholars of the future.

Matheron, in a June 1955 paper titled Utilité des méthodes statistiques dans la recherche minière, referred to D G Krige’s A statistical approach to some mine valuation and allied problems at Witwatersrand goldfield and A statistical analysis of some of the borehole values in the Orange Free State. He also referred to H S Sichel’s 1949 Mine valuation and maximum likelihood, and H J de Wijs’s Statistics of ore distribution, Part 1, Nov 1951, and Part 2, Jan 1953. Matheron did not discuss any of the statistical methods of these authors. On the contrary, he praised his own work with lognormal statistics by symbols at the Bureau de Recherches Minière.

It was out of character for Matheron to refer to statistical methods of authors in South Africa and The Netherlands. In fact, Matheron so rarely refers to the statistical literature in English-speaking countries that his flawed statistics may well be due to insufficient osmosis of applied statistics. This is why Matheron’s new science is so eerily reminiscent of the tale of the emperor’s new clothes.

Matheron’s penchant for μ– and σ²–symbols was unconventional.
After all, the use of these symbols is restricted to the unknown population mean and the unknown population variance in probability and sampling theory. It would have been merely a matter of semantics if it were not for the fact that σ², the unknown population variance, converges on zero as n, the number of measured values, converges on infinity. If it were possible to reduce the unknown true population variance to zero, then the unknown true population mean would be known with absolute certainty. In the real world, infinite sets of measured values are as rare as zero variances. In Matheron’s world of mock statistics, however, infinite sets of kriged estimates and zero kriging variances are as common as shrinking reserves.

In his June 1958 Problemes de zero et d’infini (Note Statistique No 17), Matheron mulled over infinite and zero through twenty pages of tortuous text and strange symbols. He did so long before his most gifted followers found infinite sets of distance-weighted averages and zero pseudo variances. Basic concepts and widely accepted statistical terms and symbols went missing in Matheron’s make of statistics. Matheron did not know that measured values and degrees of freedom belong together just as much as do variances and functionally dependent values of stochastic variables.

The symbol x̄ stands for the central value of a set of measured values, either with equal weights such as arithmetic means or with variable weights such as area-, count-, density-, distance-, length-, mass- or volume-weighted averages. The symbol var(x̄) stands for the variance of a central value, var(x) for the variance of a set, and varj(x) for the jth variance term of the ordered set. None of these symbols and terms scored a passing grade in Matheron’s surreal statistics.

Matheron failed to check and compare his statistics against that of contemporary English-speaking statisticians.
Sir Ronald A Fisher was knighted in 1953 for his role in the development of analysis of variance. Matheron still did not like Fisher’s work in 1970. Otherwise, he would have known how to apply Fisher’s F-test to the variance of a set and the first variance term of the ordered set. It also explains why Matheron rather assumes than verifies spatial dependence between grades of ordered core samples in boreholes, ordered ore sections in profiles, or ordered rounds in adits, drifts, pits, or trenches. M J Moroney’s Facts from Figures saw its Second Edition in 1953. This popular book was translated and published in 1970 under the title Comprendre la statistique: vérités et mensonges des chiffres. In 1970, Matheron, his followers, and a few token statisticians met at the very first geostatistics colloquium on campus at The University of Kansas, Lawrence from June 7th to 9th. D Huff’s whimsical How to Lie with Statistics was an instant success in 1954. It inspired W J Reichmann’s 1961 Use and Abuse of Statistics to caution that a world without sound statistics would slowly grind to a halt. H G Wells was so enthralled with statistics that he declared, “Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.” What would Wells have thought of Matheron’s statistical thinking? To be sure, Matheron himself was rather taken with his own statistical thinking! Matheron failed to show how to derive the length-weighted average of a set of measured values with variable lengths, and how to derive the variance of this sort of central value. Matheron did not know how to count degrees of freedom. Neither did he know that degrees of freedom are positive integers for sets of measured values with equal weights, and positive irrationals for sets of measured values with variable weights. In fact, Matheron knew much too little about degrees of freedom. 
This is why he could not verify spatial dependence by applying Fisher’s F-test to variances of sets of measured values and first variance terms of ordered sets, and comparing observed F-values with tabulated F-values at selected probability levels and with applicable degrees of freedom.

Analysis of variance and the properties of variance were a profound mystery to Matheron when he wrote his first statistical note in Algiers on November 25, 1954. These properties were still a profound mystery in 1979 when Matheron compiled his rambling Foreword to Journel and Huijbregts’s Mining Geostatistics. During all those years, the slope of Matheron’s learning curve for his neostatistics stayed statistically identical to zero. Matheron played at statistics in the 1950s but did not grasp its elementary rules. Yet, he transformed his neostatistics into geostatistics with single-minded resolve and reckless abandon.

What happened to the variances of his lead and silver grades for ordered core samples of variable length? Why didn’t he test for spatial dependence between ordered lead and silver grades? Why didn’t he count degrees of freedom? Why did he violate the most basic rules of classical statistics with impunity? Why did he get away with so much bogus statistics? Surely, it was not just Matheron’s bluster and blarney! It did require implicit approval of the world’s mining industry. This industry would not put up a dime for research into assuming causality without cause but research into stochastic simulation of ore reserves with pseudo variances somehow seems to make scientific sense.

2.6 Matheronian geostatistics

Matheron was dabbling at statistics in the early 1950s when he found out that working with statistics of measured values and counting degrees of freedom was not his true calling. So he took a few odds and ends of statistics and turned them into a new science, a sort of statistics by symbols without rigid rules and real data.
In time, his new science of assuming spatial dependence, interpolating and extrapolating by kriging, selecting the least biased subset of some infinite set of kriged estimates, and smoothing its pseudo kriging variance to perfection, became the heart and soul of Matheronian geostatistics.

Matheron’s Krigeage d’un panneau rectangulaire par sa périphérie was his first paper with a krige-inspired eponym in its title. It was completed in 1960 and is posted as Note Géostatistique No 28 with the Centre de Géostatistique’s On-Line Library. This paper is an archetypal Matheronian study in the sense that ambiguous terms abound, avant-garde symbols rule, and central values have no variances. It was also the year that Matheron set the stage for straying from sound statistics into his novel science of geostatistics.

[Fig. 1: rectangular block with corners A, A’, B, B’. Source: Note Géostatistique No 28]

The above figure is a facsimile of Figure 1 in Matheron’s Note Géostatistique No 28. Matheron labeled the lengths of AA’ and BB’ for his block (panneau) as a, and the lengths of AB and A’B’ for the same block as b. He defined the mean grade (teneur moyenne) of AA’ and BB’ as u, the mean grade of AB and A’B’ as v, and the mean of u and v as the mean grade of Block AA’B’B. Matheron also defined z*, his estimateur and a precursor to the honorific kriged estimate or kriged estimator, as follows.

\[ z^{*} = \frac{a}{a+b} \cdot u + \frac{b}{a+b} \cdot v \]

In classical statistics, Matheron’s estimateur z* is the length-weighted average of u and v. The formula for var(z*), the variance of Matheron’s estimateur, is a homologue of the above formula, in which the weighting factors are squared and var(u) and var(v) are the variances of u and v.

\[ \mathrm{var}(z^{*}) = \left[\frac{a}{a+b}\right]^2 \cdot \mathrm{var}(u) + \left[\frac{b}{a+b}\right]^2 \cdot \mathrm{var}(v) \]

Incredibly, Matheron did not derive var(u), var(v), or var(z*). Incredible indeed because u and v are mean grades of opposite sides of his block, and z* is the mean of mean grades of u and v. So why did these variances fail to make the grade in Matheron’s new science?
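Matheron’s estimateur and the variance he did not derive can be sketched as follows. The sketch assumes that u and v are functionally independent and that the weighting factors enter the variance as squares, in line with the variance of a general function in section 2.4. The function names and all lengths, grades, and variances are hypothetical, since no primary data were given in Note Géostatistique No 28.

```python
def estimateur(a, b, u, v):
    """Length-weighted average of mean grades u and v for sides a and b."""
    return (a * u + b * v) / (a + b)


def estimateur_variance(a, b, var_u, var_v):
    """Variance of the length-weighted average.

    Assumes u and v are independent, so that each variance is
    multiplied by its squared weighting factor.
    """
    weight_a = a / (a + b)
    weight_b = b / (a + b)
    return weight_a ** 2 * var_u + weight_b ** 2 * var_v


# Hypothetical square block: the estimateur is the arithmetic mean of u and v.
print(estimateur(10.0, 10.0, 2.0, 4.0))
print(estimateur_variance(10.0, 10.0, 0.4, 0.8))

# Hypothetical rectangular block: the longer side carries more weight.
print(estimateur(30.0, 10.0, 2.0, 4.0))
print(estimateur_variance(30.0, 10.0, 0.4, 0.8))
```

For a square block the weighting factors are both 1/2 and the variance reduces to [var(u)+var(v)]/4.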
Matheron had been turning out scores of weighty Notes Statistique since the early 1950s. In 1960, however, he still did not know that mean grades of blocks do have variances. So it comes as no surprise that he did not know how to derive var(k*), the variance of k*, or var(u) and var(v), the variances of u and v. Matheron thought statisticians did not grasp mining problems but he had not the slightest grasp of his own sampling problems. He did not know how to derive the variance of a set of measured values or the variance of its arithmetic mean. Neither did he know how to test for spatial dependence, or how to compute the variance of metal contained in a single block or in a set of blocks.

Matheron’s estimateur is the length-weighted average grade of his block. Did he dictate that block grades have no variances? Where did he refer to textbooks on statistics? Who reviewed his notes? If both u and v are a single measured value rather than the arithmetic means of sets, the variance of Matheron’s estimateur is obtained as follows.

\[ \mathrm{var}(z^{*}) = \left[\frac{a}{a+b}\right]^2 \cdot (z^{*} - u)^2 + \left[\frac{b}{a+b}\right]^2 \cdot (z^{*} - v)^2 \]

This formula derives from the variance of a general function as defined in probability theory. Matheron’s estimateur is a functionally dependent central value of some set of measured values with variable weights. If Matheron’s panneau is a perfect square, the above formula becomes the Central Limit Theorem for n=2. In that case, Matheron’s estimateur gives precisely one degree of freedom. In contrast, a rectangular panneau gives slightly less than one degree of freedom. When Matheron was studying statistics by symbols without real data, counting degrees of freedom for sets of measured values with equal or variable weights ranked rather low on his short list of things to do right.

[Figure 2.8 Matheronian block]

A Matheronian block is an anomaly in statistics because its length-weighted average grade is not blessed with a variance.
In contrast, the grade of a Matheronian block does have a variance in classical statistics because it is a functionally dependent central value of a set of measured values. Matheron’s problem was not so much that his estimateur is a length-weighted average grade but that it still does not have a variance.

Central values such as arithmetic means of sets of measured values with equal weights, or area-, count-, density-, distance-, length-, mass-, and volume-weighted averages of sets of measured values with variable weights, do have variances. This is why central values play a key role not only in mineral processing, smelting, and refining but also in mineral exploration and mining. The Central Limit Theorem links the variance of the set and the variance of its central value. Why then is this theorem redundant in Matheron’s new science?

It is deeply troubling indeed that the variance of the length-weighted average grade of a Matheronian block did not make the grade in his new science. The more so because none of his Notes Statistique mentions that functionally dependent values do have variances, and that sets of measured values do give degrees of freedom. Matheron ruminated how to obtain the best possible grade estimate for a set of Matheronian blocks, and how to minimize la variance d’erreur associated with krigeage ordinaire. What Matheron failed to grasp is that true variances cannot be “minimized”. Thus, it is not at all surprising that Matheron did not know how to derive confidence limits for variances.

On page 2 of his paper, Matheron described a set of blocks with a mean grade of xt along a total length of Lt and a mean grade of xm along a total width of Lm. The formula for Matheron’s estimateur m* is a homologue of the earlier one for his estimateur k*.

\[ m^{*} = \frac{L_t}{L_t + L_m} \cdot x_t + \frac{L_m}{L_t + L_m} \cdot x_m \]

In this case, too, Matheron did not derive var(m*), the variance of his estimateur for this set of blocks.
Neither did he derive var(Lt) or var(Lm), the variances of sets of measured values along opposite sides of his blocks. Because Matheron did not derive a mean grade for each of his blocks, he could not test for spatial dependence between block grades. Unlike a real statistician, Matheron did not know how to carry out any of the following steps.

1. Apply analysis of variance to estimate variances within and between blocks,
2. Verify spatial dependence by applying Fisher’s F-test to the variance of a set and the first variance term of the ordered set,
3. Construct a sampling variogram for ordered block grades by verifying where spatial dependence in a sample space dissipates into randomness,
4. Compute the variances of the metal contents for all blocks from volumes, densities, and grades, and the variances of these variables,
5. Use the additive property of variances to derive confidence limits for the cumulative metal content and for the weighted average grade of this set of n blocks.

[Fig. 2: Set of n ordered blocks. Source: Note Géostatistique No 28]

What a pity that Matheron’s grasp of statistics was rather less astute than he gave himself credit for. Who were Matheron’s peers in those heady days? Who studied his torrents of Notes Statistique and Notes Géostatistique? Was Matheron a born genius at probability or a self-made wizard of odd statistics? His Centre de Géostatistique ought to post with its On-Line Library early reviews of Matheron’s seminal work! His Notes Statistique do not refer to commonly applied statistical methods. Did any bona fide statistician ever review Matheron’s embryonic geostatistics before it was hailed as a new science? As it stands, too many novices are being taught that assuming causality does make a great deal of sense in Matheron’s new science of geostatistics.

Matheron did not know in 1954 how to derive the variances of lead and silver grades of ordered core samples.
In those days, he did not even know that the length-weighted average grade is the central value of a set of measured values determined in core samples of variable lengths. Neither did he know that each length-weighted average grade does indeed sport its own variance in classical statistics. He still did not know in 1960 that length-weighted average grades do have variances. Matheron’s problem was not so much that length-weighted averages do not have variances but that distance-weighted averages do not have variances either. Matheron’s estimateurs became confusing central values because length-weighted averages and distance-weighted averages were both reborn as kriged estimates or kriged estimators.

Matheron’s Catch-22 in 1960 was the difference between a Matheronian block and a Matheronian point. A Matheronian block is a three-dimensional sample space. It is defined by the length-weighted average of a set of measured values determined in samples selected at positions along the periphery of the block. What Matheron failed to do was test for spatial dependence by applying Fisher’s F-test to the variance of the set of measured values and the first variance term of the set ordered along the periphery of his block. That is why Matheron could not possibly confirm whether k*, his estimateur, is an unbiased estimate for the grade of this block. As a result, a Matheronian block may, or may not, have an unbiased grade estimate but its grade definitely did not have a variance.

In contrast, a Matheronian point is a zero-dimensional sample space. It is defined by the distance-weighted average of two or more measured values determined in samples selected at positions with different coordinates, either in a two- or in a three-dimensional sample space. The essence of Matheron’s new science is that two or more measured values give an infinite set of different coordinates, and, thus, an infinite set of distance-weighted averages not only within the sample space but also beyond it.
Matheron’s folly was that he failed to grasp why the requirement of functional independence and the concept of degrees of freedom foiled his new science.

It took Matheron ten more years to replace finite sets of length-weighted average block grades with infinite sets of distance-weighted average point grades. Did he do so because infinite sets are immeasurably larger than finite sets? Matheron could not possibly have known in 1960 that pseudo kriging variances and pseudo kriging covariances of least biased subsets of infinite sets of kriged estimates would become the cornerstones of his new science of geostatistics. Matheron’s docile disciples grasped as much of the rules of statistics as did Matheron himself. In fact, the entire geocabal did not know that length-weighted average block grades and distance-weighted average point grades do have variances. Perhaps they did know but were afraid to tell Matheron.

The fact of the matter is that Matheron knew barely enough to become a bungling statistician. Just the same, he was brazen to boot, belligerent to a fault, and nonplussed by persistent criticism of his new science of geostatistics. What Matheron did do was create orderliness where randomness rules by assuming causality without cause. Therefore, it should not come as a surprise when Matheron’s new science will be remembered most of all as a statistical fraud without parallel in the history of science.

Matheron never mastered analysis of variance, Fisher’s F-test, Bartlett’s chi-square test, Student’s t-test, Tukey’s WSD-test, spatial dependence, functional dependence, or degrés de fidelité for that matter. Neither did Matheron grasp that the Central Limit Theorem links the variance of a central value to the variance of a set of measured values, and why this theorem lies at the core of sampling theory and practice.
The very first textbook on Matheron’s new science of geostatistics does mention the “famous” Central Limit Theorem in the text but not in the index. The second textbook mentions zero kriging variances but does not mention that it takes infinite sets of kriged estimates to beget zero kriging variances. Just a few rules of classical statistics had gone astray long before Matheron taught his new science at his Centre de Géostatistique. Matheron himself had cooked up a string of strange symbols and a thesaurus of tortuous terms. It was under Matheron’s tutelage that so much neostatistics was done by symbols rather than with real data. In 1960, Matheron defined his estimateur, which turned out to be a variance-deprived length-weighted average block grade. The missing variance of Matheron’s estimateur is the very reason why the Centre de Géostatistique deserves credit for posting Matheron’s seminal work on its On-Line Library. For it will be a lasting source for study and scrutiny in years to come. The question is not so much whether or not Matheron knew that his estimateur had its own variance but why nobody told him. Of course, it is never too late to unravel Matheron’s folly. Conditional simulation and stochastic simulation may sound much more intuitive and comforting than selecting least biased subsets of infinite sets of kriged estimates. In spite of that, such simulation models are seeded with pseudo kriging variances of least biased subsets of infinite sets of kriged estimates. The odds against selecting that elusive least biased subset of an infinite set of kriged estimates are immeasurably low. All the same, geostatistical ore reserve practitioners beat those odds on a routine basis. Such advanced variants of Matheronian geostatistics merit as much mention in the history of science as the Bre-X fraud does in mining history. Assuming causality is not some minor mistake but a true scientific fraud. 
2.7 A colloquium on geostatistics

The first colloquium on geostatistics was held on campus at the University of Kansas, Lawrence, from June 7 to 9, 1970. The Kansas Geological Survey, the University of Kansas Extension, and the International Association for Mathematical Geology sponsored the surreal episode that brought Matheron and his bogus statistics all the way to the USA. Dr D F Merriam, Chief of Geologic Research, Kansas Geological Survey, edited the proceedings, and Plenum Press published them in 1970. The colloquium was dedicated “To all geostatisticians and statistical geologists.” Drs G S Koch, Jr, and R F Link were the token statistical geologists at this gathering of made geostatisticians. Koch and Link’s textbooks on Statistical Analysis of Geological Data, Parts 1 & 2, were published in 1970. This is why organizers of a geostatistical mind could ill afford not to invite those prominent authors. Dr J W Tukey, a professor at Princeton University who masterminded the well-known WSD-test (Wholly Significant Difference), was asked to scrutinize Matheron’s new science of geostatistics.

One of the objectives of Link and Koch’s colloquium paper on Experimental Designs and Trend-Surface Analysis was to link the latter to “ordinary analysis of variance.” Matheron’s take on trend surface analysis emerged in his own paper when he wished, “…the well-known problem of ‘trend surface analysis’ perhaps will encounter here its happy end…” Matheron was not just ill mannered but as ill informed as he was in 1967 when he wrote about the pros and cons of kriging and polynomial interpolation. In those days, his problem was that kriging created counterintuitive ore reserves on polynomial curves and trend surfaces alike. Matheron was keen that ore deposits be kriged in three-dimensional sample spaces.
So he assumed spatial dependence between ordered sets of measured values, interpolated by kriging, selected subsets of infinite sets of kriged estimates, smoothed pseudo kriging variances and rigged the rules of classical statistics. When Matheron was playing at statistics in the early 1950s, he did not know how to derive variances of length-weighted lead and silver grades of core samples with variable lengths. Neither did he know how to test for spatial dependence between metal grades of ordered core samples simply because he knew next to nothing about analysis of variance and Fisher’s F-test. Some twenty years later Matheron did not know much more about Fisher’s work but what he did know was that classical statistics spelled trouble for his new science of geostatistics. A great deal of trouble indeed because deriving variances of functionally dependent central values and counting degrees of freedom for sets of measured values never made Matheron’s list of things to master.

In his 1970 Random Functions and their Applications in Geology, Matheron pointed out that a variogram of Brownian motion along a straight line is “not bounded”, and that “no stationary covariance exists.” In 1954, however, he did not talk about variograms of lead and silver grades of ordered core samples along a straight line of a borehole. In those days, Matheron did not know how to derive variances of length-weighted average grades of core samples of variable lengths. Had he known that length-weighted average grades do have variances he would have brought up that crucial fact in his seminal work.

Matheron did not put in plain words what Brownian motion and ore deposits do have in common. Neither did he point out that Brownian motion along a perfectly straight line makes for a one-dimensional sample space similar to that of metal grades of core samples ordered along a slightly curved borehole.
Matheron’s problem was not so much that he didn’t know how to derive sampling variograms for lead and silver grades of ordered core samples in a borehole. His problem was really that he failed to grasp how to verify continued mineralization between metal grades of boreholes. That may well explain why the practice of assuming spatial dependence between measured values became so deeply entrenched in Matheron’s new science of geostatistics. Matheron mulled over “a given practical problem” but practical problems and real data were as scarce in his Brownian motion paper as they are in much of his quixotic work. He defined his “unbiased expression as an estimator for the variogram” as follows.

$$\gamma^{*}(h) = \frac{1}{2(L-h)} \int_{0}^{L-h} \left[ Z(x+h) - Z(x) \right]^{2} \, dx$$

Matheron’s random function is a Riemann integral. The validity of this function depends on its continuity for all values of x between zero and L. Matheron did not prove that his “unbiased expression as an estimator for the variogram” is indeed unbiased. In this 1970 paper, too, he may well have assumed continuity between some set of unknown points rather than derive a variogram to prove his point. It is troubling to the extreme that assumed spatial dependence became the quintessence of Matheron’s new science. Generally, the jth variance term of some ordered set of independently measured values in a sample space or in a sampling unit derives from the following Riemann sum.

$$\mathrm{var}_j = \frac{\sum_{i=1}^{n-j} \left( x_i - x_{i+j} \right)^{2}}{2(n-j)}$$

This formula is as traceable to Von Neumann’s work in the 1940s as analysis of variance is to Fisher’s work at that same time. Fisher’s F-test compares the observed value of F = var(x) ÷ var1(x) with tabulated values of F-distributions at 5% and 1% probability with applicable degrees of freedom. Plotting statistically significant variance terms for an ordered set against the variance of the set and the lower limits of its asymmetric 99% and 95% confidence ranges gives a sampling variogram.
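The variance terms and the observed F-value described above can be sketched in a few lines of Python. This is a minimal illustration rather than the author's own procedure: the function names and the five ordered grades are hypothetical, the variance of the set is taken with n − 1 degrees of freedom, and the first variance term with 2(n − 1) degrees of freedom as this section states for ordered sets. The tabulated F-values at 5% and 1% probability are not computed here and would still have to be looked up for those degrees of freedom.

```python
def set_variance(values):
    """Variance of a set of measured values, with n - 1 degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def variance_term(values, j=1):
    """jth variance term of an ordered set: sum of squared differences
    between values j positions apart, divided by 2(n - j)."""
    n = len(values)
    if not 1 <= j < n:
        raise ValueError("j must lie between 1 and n - 1")
    squared = sum((values[i] - values[i + j]) ** 2 for i in range(n - j))
    return squared / (2 * (n - j))


def observed_f(values):
    """Observed F = var(x) / var1(x) and the degrees of freedom of each variance.
    Assumes the first variance term is not zero."""
    n = len(values)
    f_value = set_variance(values) / variance_term(values, 1)
    return f_value, n - 1, 2 * (n - 1)


# Hypothetical grades of five core samples ordered along a borehole.
grades = [1.0, 2.0, 3.0, 4.0, 5.0]
f_value, df_set, df_ordered = observed_f(grades)
print(f_value, df_set, df_ordered)
```

Computing `variance_term(grades, j)` for j = 1, 2, … and setting each term against the variance of the set is the raw material for the sampling variogram.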
A sampling variogram is a simple graph that displays where order in a sample space or in a sampling unit dissipates into disorder. Given that F0.05;∞;∞ = F0.01;∞;∞ = 1.00, it follows that the notion of degrees of freedom is of critical importance in mathematical statistics. Ordered sets of n measured values give dfo = 2(n–1) degrees of freedom for the first term, and dfo = 2(n–j) for the jth term. The factor 2 reflects that every datum but the first and the last is used twice. Degrees of freedom are positive integers for sets of measured values with equal weights but positive irrationals for sets of measured values with variable weights. As the rules of thumb would have it: measured values do give degrees of freedom, and calculated values do have variances.

Maréchal and Serra’s Random Kriging is richly embellished with krige-derived eponyms and cryptic symbols but short of real data. In fact, this paper is surprisingly reminiscent of Matheron’s Random Functions and their Applications in Geology. For example, Figure 10 in the section on Punctual Kriging shows a symmetric matrix of nine (9) squares with sixteen (16) unknown “punctual estimates” in the center square, each of which, in turn, is a functionally dependent value of the same set of nine (9) samples with unknown grades. The caption below this figure reads, “Grades of n samples belonging to nine rectangles P of pattern surrounding x” but real grades are missing. Each of Maréchal and Serra’s “punctual estimates” is, in fact, the distance-weighted average grade for the selected position. The problem is not so much that all “punctual estimates” derive from the same set of nine (9) unknown grades but that all variances are missing. Of course, calling distance-weighted average grades “punctual estimates,” or “estimateurs” for that matter, does not change the fact that such central values do have variances.
Maréchal and Serra’s Figure 10 resurfaced with the same set of nine (9) unknown grades as Figure 203 on page 286 of David’s 1977 Geostatistical Ore Reserve Estimation. The latter figure, its caption, and the context on the same page are reviewed in Section 2.9 A textbook on geostatistics.

Agterberg alluded to some kind of “geologic prediction problem” in the caption below Figure 1 of his 1970 paper on Autocorrelation Functions in Geology. This figure was born again as Figure 64 in his 1974 textbook on Geomathematics. Thus, it took some four years before his “geologic prediction problem” became a “typical kriging problem.” The question is then what the difference between Agterberg’s geologic prediction and typical kriging problems is all about. Both figures refer to unknown “known values” for a set of five (5) irregularly spaced points, and an unknown value to be “predicted” for point P0. Agterberg, just like Maréchal and Serra, failed to point out that his “predicted value” is, in fact, the distance-weighted average of a set of five (5) measured values determined at positions with different coordinates. Neither did he point out that his “predicted value” does have a variance because it is a functionally dependent value of his set of five (5) unknown “known values”. Agterberg could have but did not mention that his “predicted value” was bound to converge on the arithmetic mean, and its variance on the Central Limit Theorem, as soon as irregularly spaced points become equidistant to P0. It is of minor concern that Agterberg did not mention that his “predicted value” is a distance-weighted average. It is of major concern, however, that he failed to grasp that each distance-weighted average does have its own variance in classical statistics. He could have mentioned that his set of five (5) irregularly spaced points defines an infinite set of “predicted values” within and beyond this sample space.
He could have explained how to verify spatial dependence in his sample space by applying Fisher’s F-test to the variance of the set of five (5) measured values and the first variance term of the ordered set. He could have mentioned random and systematic walks to derive variances of sets and variances of ordered sets. Agterberg, more than most geostatistical scholars, ought to know that his predicted values are zero-dimensional point grades. If Agterberg were to agree that zero-dimensional point grades and three-dimensional block grades do indeed have variances, then he ought to revise his 1974 textbook on Geomathematics.

Professor Dr J W Tukey’s response to the themes and problems of the Geostatistics Colloquium in terms of the current state of the art of data analysis, spectrum analysis, and classical statistics is summarized in Some Further Inputs. Tukey mentioned means and variances in his Abstract but didn’t mention one-to-one correspondence between means and variances. Nor did he inquire what happened to the variances of Maréchal and Serra’s “punctual estimates,” or to the variance of Agterberg’s “predicted value” for that matter. He may not even have noticed that Maréchal and Serra’s “punctual estimates” and Agterberg’s “predicted value” did not sport variances. Of course, Tukey knew that the Central Limit Theorem defines the relationship between the variance of a set of measured values with identical weights and the variance of its arithmetic mean. He may have overlooked that Agterberg’s “predicted value” converges on the arithmetic mean when irregularly spaced points become equidistant to P0. It is true that Tukey, Koch, and Link failed to notice that Agterberg’s “predicted value” did not have a variance. Yet, they were no doubt aware that its missing variance would converge on the Central Limit Theorem when all irregularly spaced points become equidistant to P0.
Tukey, Koch, and Link were but three of many scores of statisticians who were and are still bamboozled by Matheron’s new science of geostatistics, its multitude of baffling terms and symbols, and its scarcity of real data. Tukey stated, “I am now beginning to understand, (kriging) is a word for more or less stationary, more or less least squares, smoothing of data”. Matheron’s new science of assuming, kriging, smoothing and rigging the rules of classical statistics seemed to have mesmerized Tukey as much as it did so many geologists and mining engineers around the world. Matheron’s rejection of Link and Koch’s paper on Experimental Designs and Trend-Surface Analysis should have troubled both Tukey and the token statisticians. In fairness to the few innocents surrounded by a large crowd of geostatistically blessed, it should be kept in mind that symbols and terms of classical statistics were scrambled to the extreme, and that basic rules were violated amidst the Babylonian confusion. Indeed, it would have taken Tukey, Koch, and Link much more than three days to unscramble Matheron’s bizarre alternative to mathematical statistics. Matheron himself was not troubled when dreadful inconsistencies were brought to his attention. On the contrary, he was invariably proud to point out that the world’s mining industry did embrace his new science of doing so much with a few expensive boreholes. Agterberg, in his tribute to “Georges Matheron–Founder of Spatial Statistics,” claimed, “Matheron deserves Fisher–Tukey class standing.” It is absurd to suggest that Matheron ranked on a par with those giants of statistics! Sir Ronald A Fisher was knighted in 1952 because he created analysis of variance based on the properties of variances. In contrast, Georges Matheron created geostatistics because he failed to grasp the properties of variances as much in 1954 as he did until his passing in 2000. 
What’s more, he taught his students that spatial dependence between ordered sets of measured values in sample spaces could be assumed with impunity. Agterberg’s ranking of Professor Georges Matheron is an ill omen not only for the statistical integrity of Chapter 10 Stationary Random Variables and Kriging in his textbook but even more so for his own integrity as a scholar and a scientist.

2.8 Agterberg’s textbook on geomathematics

Dr F P Agterberg’s 1974 Geomathematics, Mathematical Background and Geo-Science Applications, is a comprehensive textbook on the application of the queen of sciences in earth sciences. The author covered much of the vast range of tools and techniques that mathematics provides in such rich abundance. This is why most of it will stand the test of time. In spite of that, some of Agterberg’s assumptions and definitions are bound to crumble under scrutiny and his geostatistical thinking is just as wrong as Matheron’s.

S C Robinson, in his Foreword to Agterberg’s textbook, stated, “Geomathematics is becoming indispensable to the earth sciences as the huge volume and wide variety of observations and measurements increases.” Yet, Agterberg did statistics by symbols in Figure 1 of his 1970 Autocorrelation Functions in Geology. Four years later, he still had not found five values determined in samples taken at positions with different coordinates in a two-dimensional sample space. Following is a facsimile of Figure 64 as shown on page 353 in Chapter 10 Stationary Random Variables and Kriging.

Fig. 64. Typical kriging problem; values are known at five points. Problem is to estimate value at point Po from the known values at P1 – P5. (From Agterberg, 1971)

Agterberg was as keen to work with suppositions and symbols as Matheron was ever since he struggled with statistics by symbols in 1954. Such a shared bent may well explain why real data were missing in his 1970 and 1974 figures.
He may have tried to grasp Matheron’s surreal statistics before putting his own spin on how to assume continuity of stationary random functions, how to interpolate and extrapolate by kriging, and how to fumble the variance of his estimated value at point P0. In 1970, Agterberg speculated, “Suppose that there exists a two-dimensional autocorrelation function ρij for the linear relationship between all possible pairs of points Pi and Pj.” In 1974, however, he hypothesized, “The method of linear prediction in time series can be adapted to the situation of Fig. 64 by defining a two-dimensional autocorrelation function ρij for the linear relationship between all possible pairs of points Pi and Pj.” So, it took Agterberg four years to progress from “supposing” to “defining” his two-dimensional autocorrelation function. Meanwhile, his geologic prediction problem had turned into some kind of kriging problem but his autocorrelation function solved both problems just the same.

Agterberg was no stranger to functional dependence. In Section 2.3 Variables and Functions of Chapter 2 Review of Calculus, he pointed out, “The equation y = f(x) denotes that y is a function of x.” Under Continuity of Functions on the same page, he explained how to assess whether a function is continuous or discontinuous. “A function f(x) is continuous for a value of x = a if both $\lim_{x \to a} f(x)$ and f(a) exist with:

$$\lim_{x \to a} f(x) = f(a)$$

This means that in the limit, f(x) assumes the value of f(a) as x approaches a. The values of x in the expression $\lim_{x \to a} f(x)$ can be larger as well as smaller than a.” Riemann’s definition for continuity of functions had slipped Agterberg’s mind when he was creating Chapter 10 Stationary Random Variables and Kriging.
In Section 10.1 Introduction, he rambled, “We assume that variables which change in value from point to point obey stationary random functions.” He did not disclose who were “we” who decided to “assume” that values of stochastic variables obey stationary random functions. Nor did he explain why assuming spatial dependence between points made sense not just in his chapter on Stationary Random Variables and Kriging but even more so in all of Matheron’s seminal work. In the same section, Agterberg claimed, “The results can be used for interpolation and extrapolation” but did not spell out why extrapolation made just as much sense as interpolation does.

Agterberg’s “kriging problem” and his earlier “geologic prediction problem” were quite the same. So much so that Agterberg knew that his “value to estimate” in 1974 and his “value to predict” in 1970 were the very same functionally dependent value. What he failed to grasp or chose to ignore is that functionally dependent values do have variances in mathematical statistics. In fact, his functionally dependent value is the distance-weighted average of his set of five (5) unknown measured values with unknown coordinates. What stands to reason is that this distance-weighted average is bound to converge on the arithmetic mean of the set, and its variance on the Central Limit Theorem, as soon as all of Agterberg’s points become equidistant to P0.

The central limit theorem does indeed lie at the core of a perverse problem that found its roots in early Matheronesque thinking. This theorem is fundamental in sampling theory and practice because it links the variance of a set and the number of measured values in the set to the variance of its central value (the arithmetic mean or a weighted average). Agterberg did refer to the central limit theorem in Chapter 6 Probability and Statistics and Chapter 7 Frequency Distributions and Functions of Independent Random Variables.
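How a distance-weighted average and its variance collapse to the arithmetic mean and to var(x)/n when all points are equidistant to P0 can be shown with a small Python sketch. Several things here are assumptions and not taken from the text: the weights are inverse distances normalized to sum to one (the text says only “distance-weighted”), the measured values are treated as independent with a common variance so that the variance of the weighted average is the variance of the set times the sum of squared weights, and the coordinates, values, and function names are hypothetical.

```python
import math


def inverse_distance_weights(points, target):
    """Normalized inverse-distance weights (an assumed weighting scheme)."""
    distances = [math.dist(p, target) for p in points]
    if any(d == 0.0 for d in distances):
        raise ValueError("target coincides with a sample position")
    raw = [1.0 / d for d in distances]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_average(values, weights):
    """Weighted average for weights that sum to one."""
    return sum(w * v for w, v in zip(weights, values))


def set_variance(values):
    """Variance of the set of measured values, with n - 1 degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def variance_of_weighted_average(values, weights):
    """var(x) times the sum of squared weights; equals var(x)/n for equal weights."""
    return set_variance(values) * sum(w * w for w in weights)


# Hypothetical case: four points, all at unit distance from the target P0.
points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
values = [1.0, 2.0, 3.0, 4.0]
weights = inverse_distance_weights(points, (0.0, 0.0))
print(weighted_average(values, weights))
print(variance_of_weighted_average(values, weights))
print(set_variance(values) / len(values))
```

With unequal distances the same functions still return a variance for the weighted average, which is the point the text makes about every functionally dependent value.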
Yet, the central limit theorem did not show up at all in Chapter 10 Stationary Random Variables and Kriging. Agterberg should have but did not explain why the central limit theorem did not make the grade in the very chapter where he endorsed Matheronian geostatistics.

What Agterberg did not point out was that his unknown points and unknown coordinates in Figure 64 do not define just one “predicted value” but an infinite set not only within his two-dimensional sample space but also beyond. It is true that he may not have meant to plot predicted values beyond his sample space. On the other hand, Agterberg suggested in Section 10.1 Introduction, “The results can be used for interpolation and extrapolation.” Yet, he did not clarify whether or not “results” and “predicted values” are somehow synonymous. The question is then what Agterberg was talking about when he blessed both interpolation and extrapolation.

What failed to arouse Agterberg’s interest in 1970 was spatial dependence between ordered sets of measured values in sample spaces. He still had not figured out in 1974 how to test for spatial dependence in sample spaces defined by unknown values with unknown coordinates. He could have used symbols similar to those in Equation [10.82] on page 353 to derive the variance of the set and the first variance term of the ordered set. Statistics by symbols would have shown for once that Agterberg, unlike Matheron and his first generation of geostatistical scholars, knew how to test for spatial dependence in sample spaces. Agterberg showed how to apply Fisher’s F-test and assess whether two variances are statistically identical or differ significantly. In fact, he referred to Fisher’s F-test in Chapter 6 Probability and Statistics and in Chapter 7 Frequency Distributions and Functions of Independent Random Variables but not in Chapter 10 Stationary Random Variables and Kriging.
The question is then why Agterberg did show how to apply Fisher’s F-test but did not know how to apply the very same test to the variance of a set and the first variance term of the ordered set. Agterberg displayed his savvy at analysis of variance and the properties of variances in Chapter 6. In Chapter 10, however, he fumbled the variance of his predicted value and muddled much of his acumen in analysis of variance. What’s more, Agterberg knew just as little as did Matheron and his following about the additive property of variances for multivariate functions such as multiple metal contents of sampling units and sample spaces. What Agterberg did do most of all was to lend too much credence to Matheron’s elusive stationary random function.

Agterberg and Matheron shared a prodigious penchant for working with symbols, which may explain why they paid so much more attention to sampling theory than to sampling practice. Dr J Visman played a key role in building a bridge between sampling theory with its homogeneous populations and sampling practice with its heterogeneous sampling units and sample spaces. Surprisingly, Agterberg paid no attention at all to Visman’s work. It is indeed surprising because so much of Visman’s work was published when both were employed with the former Department of Mines and Technical Surveys in Ottawa, Canada. In fact, Visman’s 1961 Towards a Common Basis for the Sampling of Materials was published as CANMET Research Report R93. Visman proved that the variance of selecting a set of primary increments from a heterogeneous sampling unit is the sum of the composition variance and the distribution variance. The composition variance is a measure for the variability between particles within primary increments. In contrast, the distribution variance is a measure for the variability between all primary increments in the set that constitutes the sampling unit. The composition variance is a function of the mass of primary increments.
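The split into a composition and a distribution variance can be sketched numerically. The Python fragment below is only an illustration under a stated assumption that is not spelled out in the text: the variance between single increments of mass w is modelled as s² = A/w + B, where A/w stands for the composition variance (inversely proportional to increment mass) and B for the distribution variance (independent of increment mass). Given variances observed for one set of small and one set of large increments, the two unknowns follow from two equations. The function name, symbols A and B, and the numbers are hypothetical.

```python
def variance_components(var_small, mass_small, var_large, mass_large):
    """Solve s^2 = A / w + B for A and B from two sets of increments.

    var_small, var_large: variances between increments of each set
    mass_small, mass_large: average increment mass of each set
    Returns (A, B); A / w is the assumed composition variance at mass w,
    B the assumed distribution variance.
    """
    if mass_large <= mass_small:
        raise ValueError("mass_large must exceed mass_small")
    a = (var_small - var_large) * mass_small * mass_large / (mass_large - mass_small)
    b = var_small - a / mass_small
    return a, b


def increment_variance(a, b, mass):
    """Assumed variance between single increments of a given mass."""
    return a / mass + b


# Hypothetical experiment: small increments of 1 kg, large increments of 3 kg.
a, b = variance_components(var_small=8.0, mass_small=1.0,
                           var_large=4.0, mass_large=3.0)
print(a, b)
print(increment_variance(a, b, 2.0))
```

A negative estimate for either component would signal that the two observed variances are not consistent with the assumed model, or that the experiment was too small to separate them.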
Typical examples of sampling units are shipments of coals, concentrates, ores, recycled materials, and other materials in bulk. Visman’s sampling model also applies to heterogeneous sample spaces such as ore deposits, geochemical prospects, contaminated sites, and similar stationary situations. For example, an ore deposit is partitionable into a set of large blocks, each of which, in turn, is partitionable into a set of small blocks. The variance between large blocks and the variance between small blocks are measures for the degree of heterogeneity as a function of the average mass of a block. Test results for ordered sets of core samples define cylindrical volumes of proven ore. Pairs of interleaved bulk samples, selected from ordered rounds in drifts, pits, and trenches, not only define metal grades and contents but also show where spatial dependence between rounds dissipates into disorder. Most of all, Visman’s sampling practice ensures unbiased confidence limits for central values of small and large sets of measured values alike.

Visman partook in the activities of ASTM Committees D-5 on Coal and Coke, and E-11 on Statistics. ASTM D2234 Standard Practice for Collection of a Gross Sample of Coal, in Annex A1 Test Method for Determining the Variance Components of a Coal, describes Visman’s sampling experiment with sets of small and large increments to estimate composition and distribution variances. ASTM D2234 was the very first internationally recognized standard to specify a precision of ±10% of dry ash content. Visman’s paper on A General Sampling Theory was published in the November 1969 issue of Materials Research & Standards. M David did refer to Visman in his 1977 Geostatistical Ore Reserve Estimation. In his 1974 Geomathematics, however, Agterberg did not mention Visman, P Gy, or any other recognized expert on sampling theory and practice for that matter. In 1967, Gy referred to Visman’s 1947 PhD thesis and his 1961 CANMET Research Report R93.
Gy no longer referred to Visman in his 1979 Sampling of Particulate Materials, Theory and Practice. In its Introduction, however, Gy praised Matheron and his “…science known as Geostatistics”. Matheron, in turn, praised Gy in his Preface to Gy’s 1967 L’Echantillonnage de Minerais en Vrac. Matheron and Gy knew all about sampling theory, probability distributions, homogeneous populations, and population means and variances. This is why both failed to grasp that the properties of variances and the concept of degrees of freedom are of paramount importance when sampling heterogeneous sampling units and sample spaces. That kind of mutual praise explains why the properties of variances were missing as much in Matheron’s new science as they were misused in Gy’s sampling practices.

It may explain why a paper on The Properties of Variances went missing several times on D F Merriam’s watch as the Editor-in-Chief, Journal for Mathematical Geology. When his anonymous reviewers finally perused this paper, the first one proclaimed, “Geostatistics need obey the concept of degrees of freedom no more so than linear, least squares regression analysis.” The other declared, “Degrees of freedom is an older terminology that is not relevant to the modern development of statistics.” These blatantly biased peer reviews did not trouble Merriam, the Editor-in-Chief, Journal for Mathematical Geology. Neither did Agterberg and M Armstrong, Merriam’s Associate Editors in 1995, wonder why degrees of freedom were no longer relevant in statistics. The more so because Agterberg himself in 1974 did refer to analysis of variance and degrees of freedom in Chapter 6 Probability and Statistics and Chapter 8 Statistical Dependence; Multiple Regression. In 1992, when Agterberg was Assistant Editor to W D Sinclair, Editor, CIM Bulletin, he approved for publication in CIM Forum a technical brief on Abuse of Statistics.
His approval was not at all surprising because Abuse of Statistics showed how to apply Fisher’s F-test just as well as did Agterberg himself in Chapter 6 Probability and Statistics. What he failed to explain in Geomathematics is how to verify spatial dependence in a sample space by applying Fisher’s F-test to the variance of a set of measured values and the first variance term of the ordered set. Agterberg, just like Armstrong in those days, may still not have grasped what the difference between functional dependence and spatial dependence is all about.

Agterberg did know in 1974 as much as he does today that analysis of variance and degrees of freedom are inextricably linked. In fact, comparing an observed F-value between two variances with tabulated F-values requires that the number of degrees of freedom for each variance be taken into account. This is the very reason why degrees of freedom will always be relevant in mathematical statistics. Scores of textbooks deal in detail with degrees of freedom. ISO standards on applications of statistical methods, too, obey simple notions. Functionally dependent values do have variances. Measured values do give degrees of freedom.

Geostatistical reviewers on Agterberg’s watch in 1995 when he was Associate Editor, Journal for Mathematical Geology, did not take degrees of freedom quite as seriously as statisticians do. That explains why the editorial board of CIM’s Geological Society did not rank Dependencies and Degrees of Freedom as high as Agterberg rated Abuse of Statistics in 1992. The CIM editorial board had noticed, “This new brief is longer, but deals with the same topic.” The board did not want to know why functional dependence and spatial dependence are as different as night and day. Neither did the editorial board of CIM’s Geological Society believe that its members ought to know that functionally dependent values do have variances, and that measured values do give degrees of freedom.
What’s more, the board was irked because of Geostatistics or Voodoo Statistics. It was a technical brief of sorts, published in the Engineering and Mining Journal of September 1992 and put together to teach elements of statistics to CIM’s statistically dysfunctional enforcers of geostatistics. Yet, the board’s faith in Matheron’s new science of geostatistics never wavered, and CIM’s Geological Society stayed the course. Sinclair, the Editor of CIM Bulletin, and Gerber, his Assistant Editor, were not troubled that functional dependence and spatial dependence are as different as night and day. Meanwhile, Matheronian geostatistics was about to convert Bre-X’s bogus grades and Busang’s barren rock into a massive phantom gold resource.

Robinson mentioned, “…huge volumes of numerical data…” but Agterberg failed to find a fitting set. Otherwise, he could have tested for spatial dependence by applying Fisher’s F-test to the variance of the set and the first variance term of the ordered set. A systematic walk from point to point, such that it covers the shortest possible distance between all points in the set, gives unbiased estimates for the first variance term of the ordered set and for the variance of its distance-weighted average. Agterberg did not point out that his “predicted value” is a functionally dependent value of a set of five values. Nor did he point out that functionally dependent values do have variances in classical statistics. His predicted values became kriged estimates in the first textbook on geostatistics and kriged estimators in the second textbook. The problem is not so much that Matheron and his students did not know kriged estimates and kriged estimators have variances but that Agterberg failed to grasp his “predicted value” does have its own variance. Matheron babbled about Brownian motion in 1970 when he assumed his “unbiased expression as an estimator for the variogram” is in fact unbiased.
Agterberg, in turn, assumed that Matheron knew what he babbled about. Matheron may well have assumed his stationary random function need not be continuous. Nevertheless, he relieved himself of the burden of proof and assumed his stationary random function captured the capriciousness of randomness. Did Matheron believe in bringing some semblance of orderliness to sample spaces where randomness prevailed? This may well be how his stationary random function turned into the wonderful kriging game of chance. This kriging game assumes spatial dependence between measured values, interpolates and extrapolates by kriging, selects the least biased subset of some infinite set of kriged estimates, smoothes the pseudo kriging variance to perfection, violates the requirement of functional independence, and ignores the concept of degrees of freedom. The question of whether or not Matheron’s stationary random function is continuous did not concern Agterberg in 1970. Yet, his paper on Autocorrelation Functions in Geology suggests that he did search in 1970 for some measure of order in his sample space. Therefore, it is surprising indeed that it was Agterberg himself who said, “The results can be used for interpolation and extrapolation.” Actually, Agterberg made this astounding statement in Chapter 10 Stationary Random Variables and Kriging of his 1974 textbook on Geomathematics. Agterberg is in a bind. He derived Po, his predicted value and the distance-weighted average in classical statistics, but did not derive var(Po), the variance of his predicted value. What he failed to grasp in 1970 and 1974 is that Po converges on the arithmetic mean, and var(Po) on the Central Limit Theorem, as irregularly spaced points become equidistant to Po. Agterberg is even more in a bind because of extrapolation beyond the sample space defined by his five irregularly spaced points.
His predicted value Po converges on the arithmetic mean and var(Po) on the Central Limit Theorem when distances increase between this sample space and his predicted value Po. This predictable outcome gives the lie to his 1974 assumption that, “The results can be used for interpolation and extrapolation.” A set of three measured values, determined in samples selected at positions with different coordinates, is enough to prove him wrong. It would also be enough to show how to test for spatial dependence between measured values in the ordered set. Of course, Agterberg knows that assuming spatial dependence is just as much a scientific fraud as is extrapolation without justification. Agterberg should study how much the creator of geostatistics struggled with statistics. When Matheron derived in 1960 the length-weighted average grade of a rectangular block, he thought up the symbol k* and called it his “estimateur”. And just like Agterberg in 1970 and 1974, Matheron did not derive var(k*), the variance of his precursor to the kriged estimate or kriged estimator. What’s more, Matheron did not know either that k* would be the arithmetic mean for a square block, in which case var(k*) obeys the Central Limit Theorem. Agterberg is in a real bind. He is a well-known author and a gifted scholar who ought to know that all weighted averages do have variances simply because the arithmetic mean does. He does know how to apply analysis of variance. He may know how to construct sampling variograms that show where order in sample spaces dissipates into disorder. He has played, and continues to play, prominent roles with organizations such as the International Association for Mathematical Geology, the Journal for Mathematical Statistics, the Canadian Institute of Mining, Metallurgy and Petroleum and CIM Bulletin. Dr F P Agterberg may well believe it is too late to abandon geostatistics because it has been around for so long. He would be wrong.
It is never too late to right this wrong and work with mathematical statistics. Rather than remain in denial, he should think about his legacy. He may rest on his laurels and let the wonderful kriging game run rampant. Or he may revise his textbook on Geomathematics and eliminate Chapter 10 Stationary Random Variables and Kriging. He may even want to add a chapter on Dr J Visman’s sampling practice. Visman showed how to estimate the composition and distribution components of the sampling variance as a measure for heterogeneity and spatial dependence in sampling units and sample spaces alike.

2.9 David’s textbook on geostatistics

M David’s 1977 Geostatistical Ore Reserve Estimation was the very first textbook on Matheron’s new science. D A Krige, the pioneering plotter of distance-weighted average block grades at the Witwatersrand gold reef complex in South Africa, prepared the Preface to this textbook. Krige pleaded guilty to, “Having been associated intimately with the birth and early development of…geostatistics…” What Krige recalled most of all were “…stormy receptions…” and “…significant skepticism…” Krige did not reveal what it was about Matheron’s new science that brought about significant skepticism and stormy receptions. On the contrary, Krige praised Matheron and followers such as David for the development and establishment of geostatistics. Krige’s praise is deserved indeed because David’s grasp of statistics matched Matheron’s perfectly. Matheron himself, in his 1960 Krigeage d’un panneau rectangulaire par sa périphérie, coined the first krige-inspired eponym. It was in this Note Géostatistique No 28 that Matheron fumbled var(z*), the variance of z*, his so-called “estimateur.” David and Krige did not know z* and var(z*) are inseparable. Krige might not even have endorsed this first textbook on geostatistics had he been aware that David’s point grade and its variance, too, are inseparable.
Figure 203 plays a pivotal role in proving that Matheron’s new science does give shaky statistics because point grades do not have variances.

Fig. 203. Pattern showing all the points within B, which are estimated from the same nine holes

Figure 203 is given in Chapter 10 The practice of kriging. It shows nine (9) irregularly spaced holes in nine squares and sixteen (16) symmetric points in the center square marked B. David sought to derive the covariance of “all the points within B.” What he did not know is that the covariance of all the points within B is as useless a measure for spatial dependence as the first variance term of ordered points. The reason is that each of his points is a functionally dependent distance-weighted average grade of the same set of nine (9) holes. David could have derived the covariance of any number of points within B, and each would still be a functionally dependent distance-weighted average point grade of the same set of nine (9) holes. The problem is how many distance-weighted average point grades within B do give a perfectly smoothed pseudo kriging variance. David tried to solve this conundrum just as much as the greatest geostatistical minds have done ever since Matheron fumbled the variance of his length-weighted average block grade. The text below Figure 203 on page 286 is David’s but his pattern of points is the same as that in Figure 10 of Maréchal and Serra’s 1970 Random Kriging. Maréchal and Serra studied “punctual kriging” by symbols as a sort of tribute to Matheron’s geostatistical tinkering. Maréchal and Serra did not report coordinates because of random kriging, “where one does not mind exact locations.” David decided against “…a search for ‘good’ neighbours for each point,” and made up his mind “to keep only those samples falling within the aureola of nine blocks.” David did not explain how to search for “good neighbors” in a sample space where holes have neither grades nor coordinates.
Maréchal and Serra’s Figure 10 turned into Figure 203 in the first textbook on geostatistics. David was kind or prudent when he did not praise on the dot “punctual kriging.” In 1970, Maréchal and Serra did not explain how to test for spatial dependence between measured values in ordered sets. Agterberg did not explain it either in his 1970 paper or in his 1974 textbook. Neither did David in his 1977 textbook. If David’s set of nine (9) ordered holes were to display a significant degree of spatial dependence, it would define a finite sample space. Without a significant degree of spatial dependence, however, the same set of nine (9) holes would define a zero-dimensional sample space. Thus, it is of critical importance in mineral exploration and mining to verify spatial dependence between ordered sets of measured values. David, unlike Journel in 1992, did not mention that spatial dependence between measured values might be assumed. Matheron’s 1960 “estimateur” is the length-weighted average grade of a set of measured values with variable weights, determined in samples selected along opposite sides of a rectangular block. For a square block, however, it turns into the arithmetic mean grade. In geostatistics, distance- and length-weighted average grades of sets of measured values with variable weights do not have variances. In classical statistics, all central values of sets of measured values with constant or variable weights do have variances. Even David did refer to “the famous central limit theorem” in Section 2.1.1 The standard error of the mean. It left geostatistical scholars cold when they were told in the early 1990s that Matheron’s kriged estimate does not have a variance.
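The claim that every weighted average has a variance, and that it reduces to the Central Limit Theorem for equal weights, can be illustrated with a minimal sketch. The grades and distances below are hypothetical, the inverse-distance weighting is only one possible scheme, and the formula var(weighted average) = Σw² · var(x) is the standard propagation-of-variance result for independent measured values with a common variance; none of these appear in David’s or Agterberg’s texts.

```python
def sample_variance(values):
    """Variance of a set of measured values with n - 1 degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def inverse_distance_weights(distances):
    """Weights proportional to 1/distance, scaled to sum to one (assumed scheme)."""
    inverse = [1.0 / d for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


def weighted_average(values, weights):
    """Distance-weighted average of measured values."""
    return sum(w * v for w, v in zip(weights, values))


def variance_of_weighted_average(values, weights):
    """Assumes independent values with a common variance: sum(w^2) * var(x)."""
    return sum(w * w for w in weights) * sample_variance(values)


# Hypothetical grades at five holes and their distances to the predicted point.
grades = [1.2, 0.8, 1.5, 0.9, 1.1]
irregular = inverse_distance_weights([10.0, 25.0, 40.0, 15.0, 30.0])
equidistant = inverse_distance_weights([20.0] * 5)

print(weighted_average(grades, irregular),
      variance_of_weighted_average(grades, irregular))
# With equal distances the weights are all 1/n, so the average is the
# arithmetic mean and its variance is var(x)/n.
print(weighted_average(grades, equidistant),
      variance_of_weighted_average(grades, equidistant))
```

With equal distances the second line of output matches the arithmetic mean and var(x)/n, which is the convergence the text describes for points equidistant to the predicted value.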
In Section 10.2.3.3 Combination of point and random kriging on page 286, David states, “Writing all the necessary covariances for that system of equations might be a good test to find out whether one really understands geostatistics.” The perfect test to rate one’s grasp of classical statistics is to count degrees of freedom not only for all the holes but also for all the points within B. David’s nine holes give df=9–1=8 degrees of freedom for the set, and dfo=2(n–1)=16 for the ordered set. By contrast, David’s set of sixteen points within B gives precisely zero degrees of freedom. After all, each point is a functionally dependent value for the same stochastic variable in the same sample space. David’s test for geostatistical acuity is the ultimate exercise in futility simply because the concept of degrees of freedom went missing in Matheron’s new science. A foolproof rule of thumb teaches that measured values do give degrees of freedom, and that functionally dependent values (read calculated values!) do have variances. Agterberg, the author of Geomathematics, brought up degrees of freedom in Chapter 6 Probability and Statistics and in Chapter 8 Statistical Dependence; Multiple Regression but not in Chapter 10 Stationary Random Variables and Kriging. David, the author of Geostatistical Ore Reserve Estimation, listed degrees of freedom accidentally in some footnote in Table 1.IV on page 25 in Section 1.3.5.2 Estimation of parameters and model fitting. This table derived from Ondrick and Griffith’s 1969 paper. The authors showed how to apply Bartlett’s chi-square test to verify closeness of agreement between expected and observed frequencies of copper grades at the Prince Lyell mine. David did not bother to explain why degrees of freedom play a role not only in Bartlett’s chi-square test but also in Fisher’s F-test, Student’s t-test, and Tukey’s WSD-test.
In Section 1.2.3.2 Parameters of dispersion, the concept of degrees of freedom surfaced in its abstract form of n–1 in the denominator of Formula (1.7) on page 6. David mentioned the standard deviation on the same page where the denominator of Formula (1.4) turned out to be n instead of n–1. The author elucidated, “The parameters will most of the time remain unknown. All that we will have are estimators like x and s.” David saw fit to characterize the variance as follows, “The square of the standard deviation is the parameter most commonly used by statisticians since it is easier to handle.” David’s statement showed that he did not grasp why statisticians work with variances and count degrees of freedom. As it stands, David’s textbook would not have earned a passing grade simply because of its treatment of the properties of variances and the concept of degrees of freedom. Moreover, David, just like Matheron, did not know how to verify spatial dependence in sample spaces and sampling units. It defies credulity that neither knew that all types of weighted averages do have variances just as arithmetic means do. It is even more astounding that Agterberg, the author of Geomathematics, did not know that his distance-weighted averages do have variances as much as arithmetic means do. David’s writing suggested he rushed into print his first textbook on Matheron’s new statistics. In a few years, he would be just as driven to write some kind of handbook on geostatistical ore reserve estimation. It is not surprising then that David himself showed misgivings about his first textbook. In his Introduction, David proclaimed, “The text has mainly been written for mining engineers and geologists, who are the people facing the problems of ore reserve and grade control, and who usually have had little exposure to probability and statistics.” As it turned out, David’s opening remarks about Matheronian geostatistics made little statistical sense.
It may explain why he rambled on, “Chapter 5 is a few pages of theory to firmly ground the model although statisticians will find many unqualified statements here. This is not a book for professional statisticians.” David’s grasp of applied statistics in mineral exploration and mining was such that he could not have written a textbook for professional statisticians. The author acknowledged Grant NRC7035, which implies that the National Research Council of Canada did not know either that Matheron’s new science gives pseudo kriging variances and covariances. The author also had access to the drafting facilities of the department of Mineral Engineering at Ecole Polytechnique, and to typing assistance of the Mineral Exploration Research Institute. So much support for such shaky statistics is truly breathtaking. What David did not mention in his Introduction is that he could not care less about unqualified statements. In his List of Notations, David presented “A word of caution” before praising thinkers like himself by claiming, “It has been known for a long time that geostatisticians seem to have that capability of changing notations twice or more in the same page and still understand each other.” What geostatisticians share most of all is an innate capability of failing to grasp the nuts and bolts of classical statistics such as functional and spatial dependence, independently measured values and degrees of freedom, properties of variances, and, last but not least, Fisher’s F-test to verify spatial dependence in sampling units and sample spaces. All the same, David did prove his point by proffering a hodgepodge of terms and symbols that would have taken an ISO Technical Committee on reserve and resource estimation at least ten years to sort out. David showed as much affinity for the σ²–symbol as Matheron did throughout his seminal work. Yet, neither knew that this symbol applies only to unknown population variances.
Otherwise, both Matheron and David would have known that confidence limits for variances are a function of degrees of freedom. Chapter 1 Elementary statistical theory and applications started with Synopsis of Chapter 1 and 2. The author announced, “The first chapter should be sufficient for a reader who has never been exposed to statistics, to understand the elementary bases of all further discussions. To our statistician readers, we apologize.” The author did not explain why he apologized. On the contrary, he continued to counsel, “People who are already familiar with statistics should at least read the second chapter, to make sure they correctly link statistical and mining problems.” In Section 1.1 The vocabulary of statistics in mineral resource estimation, the author of the first textbook on geostatistics may well have talked about his personal experience. In this short but significant section, David proclaimed, “Any statistical textbook will start with a few elementary definitions which one tends to immediately forget, while in fact it is very important to keep them in mind so as to avoid making meaningless statements.” On the same page, David referred to Koch and Link’s 1970 Statistical analysis of geological data. In Chapter 3 Sampling, Koch and Link discussed the concept of degrees of freedom in clear and concise terms. Koch, Link, and Tukey worked with degrees of freedom long before the 1970 colloquium on geostatistics. In contrast, David still failed to grasp what degrees of freedom were all about in his 1977 textbook on geostatistics!
In Section 1.1.1 Universe, David declared, “This first definition is not in usual statistical textbooks; despite its name it is not universally admitted but we need it in quantitative geological sciences, thus showing right away that many nonstandard statistical problems will occur.” The author wandered away from the “usual statistical textbooks.” He talked about “many nonstandard statistical problems.” He journeyed in the same section from his “universe” to “a mineral deposit.” In Section 1.1.2 Sampling unit and population, he left his mineral deposit by proclaiming, “A sampling unit is the part of the universe on which a measurement is made.” David’s sampling unit differs from those defined in ISO Standards for coals, concentrates, ores, and other bulk materials. Sample spaces such as mineral deposits and sampling units such as rounds from a drift, a shipment in bulk bags or a cargo aboard a bulk carrier are all part of the universe. In Section 1.4.1 Definition of independence, the author proclaimed, “In plain words two variables are independent, if knowing one does not tell us anything about the other. For instance, knowing the grade of blast hole D-36 on bench 920 does not help us at all to predict the grade of block 3525 on bench 2100. To take a less extreme example, in many gold mines, knowing that the value of one assay is 0.5 ounces/ton is of no help to predict the grade of another sample 20 feet from the first one.” What David tried to define was spatial independence and dependence. The highest degree of spatial dependence exists between test results for half-core samples of the same whole core sample. A significant degree of spatial dependence may exist between test results for ordered core samples in a borehole. Spatial dependence between central values of ordered ore zones in a profile of boreholes makes it possible to derive confidence limits for content and grade for all ore zones.
It may allow parts of an inferred resource to be converted into proven ore. Chapter 2 Contribution of distributions to mineral reserve problems was the one David deemed an essential read. In Section 2.1.1 The standard error of the mean, the author pointed out, “One of the most widely used formulas of statistics is the one which gives the so-called standard error of the mean, or the accuracy of an average estimate.” What David did not know is that accuracy means the same as unbiased or freedom from error whereas precision is a generic term that refers to all sorts of measures for variability such as variances, standard deviations, coefficients of variation, confidence intervals, and confidence ranges. This mistake is traceable to Gy’s 1967 and 1975 treatises in French. In his 1979 Sampling of particulate materials; Theory and practice, Gy used precision and accuracy as defined in various ISO Standards. In the same section and paragraph, David admitted, “This formula is also responsible for the largest number of mistakes on the account of statistics. It is based on the famous central limit theorem and states that given a population and a group of independent samples drawn from that population…” The author presented the correct formula for the central limit theorem, and decided to give “independent” a bit more attention. So much more that David continued in Section 2.1.2 Conditions of use, “The trouble is, however, that for this formula to be valid the samples have to be independent of each other. Most of the time, they are not, and even if they are, additional samples will probably no longer be independent of each other, and of the original samples.” The author lost his train of thought. What he tried to point out is found on page 35 in his example of a set of ten (10) test results for iron. If n–1 test results for iron are given, then the tenth is known simply because the sum of n differences between measured values and their central value is zero.
This is why n–1 out of n measured values are independent whereas only one out of n is dependent. David’s set of ten (10) test results would give df=10–1=9 degrees of freedom if all test results have equal weights. Generally, a set of n measured values with equal weights gives df=n–1 degrees of freedom. What David did not know was that the numerator of the central limit theorem is invariably divided by the number of measured values. In geostatistics, however, the numerator is divided by some number of predicted values, kriged estimates, or kriged estimators, to which may or may not be added the number of measured values. In his Bibliography, David listed Koch and Link’s 1970 Statistical analysis of geological data, and Tukey’s 1951 Propagation of errors, fluctuations, and tolerances. He did not list Moroney’s Facts from figures. This delightful book was first printed in 1951, reprinted countless times, and translated into French in 1970. In 2007, McGill’s library still has 1956 and 1965 editions of Facts from figures. Neither did David refer to Volk’s Applied statistics for engineers. In Chapter 7 Analysis of variance, Volk discussed the properties of variances, the variances of functions, the application of Fisher’s F-test, and the relationship between confidence limits for variances and degrees of freedom. The first edition of Volk’s Applied statistics for engineers was published in 1958, and many more have since been printed. David referred to Gy’s early works in French, and to Visman’s 1970 A general sampling theory. It was Visman who built a bridge between sampling theory with its homogeneous populations and sampling practice with its heterogeneous sampling units and sample spaces. A solid grasp of Visman’s thorough research would have improved David’s Geostatistical Ore Reserve Estimation and Agterberg’s Geomathematics.
In his Index, David listed aureola, bull’s eye shot, deconvolution, and chaotic component but did not list degrees of freedom, central limit theorem, or functional dependence. In Chapter 12 Ore modelling [sic], the author acknowledged, “There is an infinite set of simulated values,” and wondered how to, “make that infinite set smaller and get the model closer to reality.” What he did not know is that infinite sets of simulated values give zero pseudo kriging variances. David did not tackle the daunting task of selecting the least biased subset of some infinite set of simulated values. In a rare instant of perfect vision, David admitted, “The criticism to this model is obvious. The simulation is not reality. There is only one answer. The proof of the pudding is…!” Hecla’s Grouse Creek and Bre-X’s Busang are but a few inferred resources to fail David’s pudding test. Fisher’s F-test proved that David’s pudding test is a geostatistical sleight of hand. The key to doing more with fewer boreholes was to assume spatial dependence between measured values in ordered sets. Agterberg claimed in 1974 that it makes sense to assume that values varying from point to point obey stationary random functions. The practice of assumed causality explains why geostatistical reviewers did object in 1992 when Fisher’s F-test proved a significant degree of spatial dependence between gold grades of ordered rounds in a drift. Stanford’s Journel made a cryptic reference to a certain decision in his letter of October 15, 1992, to Professor Dr R Ehrlich, Editor, Journal for Mathematical Geology. Journel wrote, “The very reason for geostatistics or spatial statistics in general is the acceptance (a decision rather) that spatially distributed data should be considered a priori as dependent one to another, unless proven otherwise.” Inferred resources are created when grades within ore sections of boreholes are assumed similar to those between ore sections.
Applied statistics does give unbiased confidence limits for contents and grades of proven ore within inferred resources based on spatial dependence within ore sections. In addition, it gives unbiased confidence limits for contents and grades of proven reserves. In contrast, geostatistics gives infinite sets of distance-weighted averages with zero degrees of freedom and zero pseudo variances.

2.10 David’s handbook on geostatistics

Professor Dr M David may have assumed that the world’s mining industry wanted more of the same. It would explain why the author of Geostatistical Ore Reserve Estimation wrote his 1988 Handbook of Applied Advanced Geostatistical Ore Reserve Estimation. He still did not think assuming, kriging, and smoothing quite as silly as do statisticians. He did not test for spatial dependence between measured values in ordered sets nor did he derive confidence limits for block grades. He worked with pseudo kriging variances and Lagrange multipliers and ignored proven statistical methods. David’s 1988 handbook had far fewer pages than his 1977 textbook but it did have a weightier title. In those days, he struggled with infinite sets of simulated values, and pondered how to select least biased subsets. He was also proud that geostatisticians still understand each other even when they change notations twice on the same page. He may not have known that his simulated values were in fact distance-weighted average point grades in applied statistics. Some scholar should have told him that simulated values are called either kriged estimates or kriged estimators in Matheron’s new science. David’s handbook was one of a kind if only because of its avant-garde title and poignant lack of bona fide statistics. He acknowledged his research was funded by the Natural Sciences and Engineering Research Council of Canada. His 1977 textbook counted 364 pages, a ½-page Preface, a 4-page List of Notations, an 8-page Index, and a 10-page Bibliography.
In his 1988 handbook, he pressed forward to 216 pages of applied advanced research, four pages of References, no Preface, no Index, and a tacky title. Just the same, David felt much more secure this time than he did in 1977. In fact, he did not apologize to statistical readers nor did he predict any unqualified statements. Once more, he did owe his readers an apology because his synthesis of geostatistics and Lagrange multipliers failed to give unbiased confidence limits for block grades as a measure for risk. David may have perused Journel and Huijbregts’s 1978 Mining Geostatistics. If he did, he might have noted the zero kriging variance under Remark 2 (ii) in Section V.A Theory of Kriging of Chapter V The estimation of in situ resources. In his first textbook, he did not mention that an infinite set of simulated values does give a zero variance and a unity covariance. Nor did he mention the immeasurable odds of selecting the Best Linear Unbiased Estimate (BLUE) out of some infinite set of simulated values. Odds beyond measure may seem somewhat less odd to those who work with dense symbols rather than with transparent irrational numbers. Matheron taught his new science of geostatistics mostly by symbols. It may explain why infinite sets of kriged estimates, zero kriging variances, and unity kriging covariances, did not bother Matheron’s students. It does explain why unbiased confidence limits for metal contents and grades of ore reserves were missing in David’s 1977 Geostatistical Ore Reserve Estimation, Journel and Huijbregts’s 1978 Mining Geostatistics, and scores of similarly flawed textbooks. Unbiased confidence limits derive from real variances of sets of measured values that do have degrees of freedom. In contrast, pseudo variances of sets of functionally dependent values do not have the degrees of freedom it takes. David used more krige-derived eponyms in his 1988 handbook than he did in his 1977 textbook.
Kriged estimates, kriged estimators, kriging variances, kriging covariances, and scores of kriging methods, became all the rage in the 1980s. Scholars would play with all kinds of krige-inspired neologisms whenever a new term was needed to put in plain words the intricate workings of Matheronian geostatistics. For example, David himself coined the term “supersimplified random kriging” in Section 5.1.3 of Chapter 5 Kriging. In his 1988 handbook, David reduced even more the number of references to applied statistics. Koch and Link’s 1970 Statistical Analysis of Geological Data was not to be a part of his applied advanced research. Ingamells’s 1970s work was ignored. Jowet’s 1955 studies failed David’s litmus test. Tukey’s 1951 Propagation of Errors, Fluctuations and Tolerances was turfed. Visman’s 1947 Sampling of Coal and Washery Products, and 1970 A General Sampling Theory, both vanished. Gy’s 1979 Sampling of Particulate Materials, Theory and Practice turned out to be David’s sole source of knowledge on sampling and statistics required for his research in this 1988 handbook. Just the same, Visman’s work did inspire Gy’s early treatises in French on sampling theory and practice. As luck would have it, Gy did not deal with analysis of variance and Fisher’s F-test.
In his Introduction, David talked about theoretical and applied geostatistics, and pointed out how it evolved into, “…the practical tool it was originally meant to be.” He claimed, “Comparing predictions and reality of course showed discrepancies which could be reduced most of the time.” The author continued in the same vein, “These discrepancies were reduced either by taking geology into account in better ways, or by checking more carefully that the basic statistical hypotheses were met.” He pontificated, “These two approaches, better respect of geology or better respect of statistical hypotheses are in fact equivalent and aim at the same goal, a better orebody model.” David’s textbook and handbook would have benefited from research into how to respect statistical hypotheses. The question of whether a statistical hypothesis is true or false is of critical importance in applied statistics. Matheron’s strangest statistical slant by far was found in his Foreword to Journel and Huijbregts’s Mining Geostatistics when he claimed that geologists stress structure, and that statisticians stress randomness. He may have tried to suggest that geologists test for structure and statisticians for randomness in the same sample spaces. Matheron should have taught geologists how to apply Fisher’s F-test to the variance of a set of measured values and the first variance of the ordered set. It tests the statistical hypothesis whether and where some structure of order in a sample space dissipates into randomness. If these variances are statistically identical, then randomness rules in the sample space. If the first variance term of the ordered set is significantly lower than the variance of the set, then some structure does exist in the sample space. Gy’s 1979 Sampling of Particulate Materials, Theory and Practice did make David’s list because of Gy’s use of statistical methods in sampling practice. 
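The test for structure versus randomness described above can be sketched in a few lines. This is a sketch under stated assumptions: the first variance term is taken here as the sum of squared differences between consecutive values in the ordered set divided by 2(n–1), a formula this section does not spell out; the degrees of freedom df=n–1 and dfo=2(n–1) follow the counts given for David’s nine holes; and the tabulated F-value must be looked up by the reader for those degrees of freedom, since no table is reproduced here. All function names and data are hypothetical.

```python
def variance(values):
    """Variance of the set, with df = n - 1 degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def first_variance_term(ordered):
    """Assumed formula: squared successive differences over dfo = 2(n - 1)."""
    n = len(ordered)
    squared = sum((a - b) ** 2 for a, b in zip(ordered, ordered[1:]))
    return squared / (2 * (n - 1))


def observed_f(ordered):
    """Return (F, df, dfo) for var(x) against var1(x) of the ordered set."""
    n = len(ordered)
    f_value = variance(ordered) / first_variance_term(ordered)
    return f_value, n - 1, 2 * (n - 1)


def shows_spatial_dependence(ordered, f_tabulated):
    """True if var1(x) is significantly lower than var(x).

    f_tabulated is the caller-supplied table value for (df, dfo) at the
    chosen probability level; it is not computed here.
    """
    f_value, _, _ = observed_f(ordered)
    return f_value > f_tabulated


# Hypothetical ordered set with an obvious trend.
ordered_grades = [1.0, 2.0, 3.0, 4.0, 5.0]
print(observed_f(ordered_grades))  # F-value with df = 4 and dfo = 8
```

If the observed F-value does not exceed the tabulated value, the two variances are statistically identical and randomness rules in the sample space; if it does, some structure exists in the ordered set.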
Testing for bias between paired test results for different types of samples is based on Student’s t-test. The question of whether two variances are statistically identical or differ significantly is solved by applying Fisher’s F-test. Gy referred to SF=Student-Fisher in the Index of his work. In the text, however, Gy wrote about some unorthodox “Student-Fisher’s t-distribution” rather than about Fisher’s F-test. Gy’s 1979 work did not show how Fisher’s F-test is applied to optimize sampling protocols by partitioning the sum of two or more variances in the measurement chain or hierarchy into its components. In Section 9.3 Bulk Sample Preparation of Chapter 9 Check Samples and Duplicates, David did not show either how to optimize primary sample selection, sample preparation, and analytical stages of a bulk sampling protocol by applying Fisher’s F-test to the variances of different stages. On a positive note, David did explain Student’s t-test in Section 9.1.2 Comparing two laboratories. Not only did he know how to count degrees of freedom but he also knew how to improve the sensitivity of the t-test. He applied Student’s t-test to the difference between central values of sets of identifiably different pairs of measured values. In this case, the statistical hypothesis to be proved either true or false is whether the difference between central values implies absence or presence of bias. David did not apply Fisher’s F-test to solve the question of whether two variances are statistically identical or differ significantly. He did not apply Fisher’s F-test in his 1977 textbook. He still ignored Fisher’s F-test in his 1988 handbook. This is not surprising, because analysis of variance and the properties of variances were as much a mystery to Matheron and all of his students as they were to David since the 1970s. Matheron fumbled the variance of the length-weighted average grade of his three-dimensional block.
Agterberg fumbled the variance of the distance-weighted average grade of his zero-dimensional point. David, in turn, was walking the tightrope from applied statistics to Matheronian geostatistics while he struggled with those elusive properties of variances.

In Section 3.1.2 The Really Recoverable Reserves of Chapter 3 Block Variance, David dealt with mistakes because some ore is waste while some waste is ore. He talked about, “The distribution of values which we should consider the distribution of estimated values.” He spelled out even more of his advanced thinking when he added, “It can be assumed on the basis of experience that this is also a lognormal distribution however its variance is now smaller.” David claimed, “The variance can be obtained experimentally and theoretically from the ‘smoothing relationship’ which states that:”

$$\mathrm{var}(z^*) = \mathrm{var}(z) - \sigma_K^2$$

David’s “smoothing relationship” is not based on true variances. On the contrary, each of his “variances” is the pseudo variance of some set of functionally dependent, distance-weighted average point grades. Pseudo variances and true variances only share squared dimensions. This absurd smoothing relationship evolved when a single length-weighted average grade (Matheron’s three-dimensional block grade) mushroomed into an infinite set of distance-weighted average grades (Agterberg’s zero-dimensional point grades).

In his Foreword to Chapter 5 Kriging, David pronounced, “The recent proliferation of different types of kriging…still shows that ordinary kriging, the way it was formulated by Matheron in 1965 and applied by Serra in 1967, is still the tool to use in most circumstances.” To prove his point, the author took off on yet another tangent.
In Section 5.1 Improving Ore-Waste Definition with Kriging or Random Kriging, he proclaimed, “The highly erratic mineralization …makes the usual practice of flagging blast holes totally useless.” Perhaps predictably, David declared, “The solution is to perform kriging on blast holes and to define a new boundary, based on kriged values, rather than blast hole values.”

Figure 77 Erratic distribution of blasthole grades

For simplicity, the above copper grades are deemed equidistant. In practice, it would make sense to derive the distances between blastholes from coordinates. A cutoff grade of 0.20% Cu was applied to partition blasthole blocks into ore and waste. Kriging-defined ore and waste limits are outlined in Figure 78.

Figure 78 Ore-waste limits estimated by kriging

The first variance term of the ordered set of erratic grades in Figure 77 derives from a systematic walk that visits each blasthole with more than 0.20% Cu only once, and that covers the shortest possible distance between blastholes. Figure 78 is different because new limits are plotted based on kriged values rather than blasthole values. In this case, too, a systematic walk that visits each blasthole within kriging-defined limits only once, irrespective of its copper grade, and that covers the shortest possible distance, gives the first variance term of the ordered blasthole grades within these limits.

Fisher’s F-test is applied to var(x), the variance of the set, and var1(x), the first variance term of the ordered set. Observed and tabulated F-values in Table 2.10.1 show that the ordered set of erratic copper grades in Figure 77 displays a significant degree of spatial dependence at 95% probability. In contrast, the ordered set of blasthole grades between kriging-defined limits in Figure 78 does not display spatial dependence.
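The test described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a formula given in this section: the first variance term is assumed to be half the mean squared difference between successive values along the walk, with 2(n − 1) degrees of freedom for the ordered set, and the tabulated F-value is assumed to be looked up by the reader. The function names and the sample grades are hypothetical.

```python
from statistics import variance


def first_variance_term(ordered):
    """Assumed definition: half the mean squared step between successive values."""
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two ordered values")
    squared_steps = [(b - a) ** 2 for a, b in zip(ordered, ordered[1:])]
    return sum(squared_steps) / (2 * (n - 1))


def spatial_dependence_test(ordered, f_tabulated):
    """Compare var(x) with var1(x) for values listed in walk order.

    f_tabulated is the F-table value the caller looked up for
    df = n - 1 and (assumed) dfo = 2(n - 1) at the chosen probability.
    """
    n = len(ordered)
    var_set = variance(ordered)               # variance of the set
    var_first = first_variance_term(ordered)  # first variance term of the ordered set
    if var_first == 0:
        raise ValueError("first variance term is zero; F is undefined")
    f_observed = var_set / var_first
    return {
        "var": var_set,
        "var1": var_first,
        "df": n - 1,
        "dfo": 2 * (n - 1),
        "F": f_observed,
        "significant": f_observed > f_tabulated,
    }


# Hypothetical grades in walk order, not taken from Figure 77.
# 3.84 is the tabulated F-value at 5% probability for df = 4 and dfo = 8.
trend = spatial_dependence_test([1, 2, 3, 4, 5], f_tabulated=3.84)
mixed = spatial_dependence_test([3, 1, 5, 2, 4], f_tabulated=3.84)
print(trend["F"], trend["significant"])
print(round(mixed["F"], 2), mixed["significant"])
```

The same five values give a significant F-value when they follow a trend along the walk and a non-significant one when they are shuffled, which is the contrast between structure and randomness that the test is meant to expose.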
Table 2.10.1 Test for spatial dependence
------------------------------------------------------------------
Statistic                     Symbol          erratic    kriged
------------------------------------------------------------------
Variance of set in %²         var(x)          0.0318     0.0394
First variance term in %²     var1(x)         0.0191     0.0314
Observed F-value              F               1.67       1.26
Significance                                  *          ns
Degrees of freedom:
  Set                         df              32         45
  Ordered set                 dfo             60         90
Tabulated F-value at:
  5% Probability              F0.05;df;dfo    1.64       1.51
  1% Probability              F0.01;df;dfo    2.01       1.79
------------------------------------------------------------------
ns  not significant
*   significant at 5% probability

Fisher’s F-test proves that the ordered set of erratic blasthole grades displays spatial dependence at 95% probability. Diluting erratic grades with grades of less than 0.20% Cu within kriging-defined limits causes spatial dependence to dissipate into randomness. Diluting blast hole grades impacts not only the central value of the set but also its variance and confidence limits. Confidence limits for central values derive from standard deviations and tabulated t-values at selected probability levels with applicable degrees of freedom. Table 2.10.2 gives 95% confidence limits for the arithmetic mean of each set.

Table 2.10.2 Confidence limits for block grades
------------------------------------------------------------------
Statistic                          Symbol     erratic    kriged
------------------------------------------------------------------
Arithmetic mean in %               x̄          0.39       0.31
95% Confidence range               95% CR
  Lower limit in %                 95% CRL    0.34       0.26
  Upper limit in %                 95% CRU    0.44       0.36
95% Confidence interval in %       95% CI     ±0.048     ±0.052
95% Confidence interval in %rel    95% CI     ±12.8      ±16.9
------------------------------------------------------------------

Arithmetic means of 0.39% for the erratic block and 0.31% for the kriged block do differ significantly. In fact, the probability that this statistical hypothesis is true exceeds 95%. Conversely, the probability that it is false is less than 5%. A processing plant would have received ore with a significantly higher grade if blastholes were flagged.
Conversely, it would have received more ore but with a significantly lower grade if ore-waste limits were redefined by kriging blasthole grades. In this case, the kriged block was about 1.5 times larger than the erratic block. Such findings do not lend credence to David’s opinion that kriging beats flagging.

David took off on yet another mathematical tangent when he tried to put the Lagrange multiplier to work in 5.1.2. Example of Chapter 5 Kriging. His problem is that Lagrange multipliers do not give confidence limits. Just the same, he set the stage as follows, “…a block is estimated by a weighted average of the mean BH grade inside it and the four surrounding blocks...”

Figure 79 Configuration of neighbour blocks retained around the block to estimate

The author added, “There is the same number n of blast holes in each of the blocks and the variogram is isotropic. From the symmetry of the configuration, the weights assigned to the surrounding blocks are the same: A1 = A2 = A3 = A4 = A0.” It seems somewhat contrived but convenient that each block had the same number of blastholes. His statement that the variogram is isotropic rang hollow because he did not even know how to derive the first variance term of a sampling variogram. The author made matters worse by asserting, “If the relative variogram is adopted, a good approximation of its equation for distances less than 300 ft is γ(h) = 0.011 + h/1760.” He did not know how to test for spatial dependence between ordered sets of blasthole grades. What he did know was how to spin semi-variogram nonsense. His work with the Lagrange multiplier made it possible to show more geostatistics by symbols.

Real statisticians would have applied Fisher’s F-test to verify spatial dependence by comparing the observed F-value between the variance of the set of blocks in Figure 79 and the first variance term for ordered blocks.
Statisticians would have derived the weighted average grade for Block A0 from the grades for Block A1 to Block A4, and confidence limits for this weighted average grade. Statisticians would have applied Tukey’s Wholly Significant Difference test to check whether weighted average block grades are statistically identical or differ significantly. Statisticians would have applied Bartlett’s chi-square to determine whether block variances are statistically identical or differ significantly.

David would have disapproved. In fact, he reviewed for CIM Bulletin in September 1989 a paper titled Precision Estimates for Ore Reserves. The authors used Fisher’s F-test to verify spatial dependence between gold grades of bulk samples taken from a set of ordered rounds in a decline. In his review, David remarked, “The authors present their own method for calculating precision estimates for ore reserves without a single reference (his highlights!) to 20 years worth of work in geostatistical ore reserve estimation (see attached references).”

In his 1988 Handbook of Applied Advanced Geostatistical Ore Reserve Estimation, David failed to explain how to apply Fisher’s F-test to verify spatial dependence between measured values in an ordered set. He seemed to have forgotten that his own references to statistics in his 1977 textbook were no longer included in this 1988 handbook. Nor did he ever explain how the properties of variances impact confidence limits for contents and grades of ore reserves. He ignored with impunity the application of analysis of variance for which Sir Ronald A Fisher was knighted in 1952. It is true that Matheron’s teachings caused catatonique géostatistique among his students. David was but one of several scores of geostatistical scholars who agreed that spatial dependence may be assumed, and that degrees of freedom discriminate against functionally dependent values.
The reason why David felt compelled to write the first handbook on geostatistics may well have been a bit of work with measured values. Had he been interpolating between measured values, he would have noticed that the variances of such ordered sets rise to a maximum and then converge on zero. This rise and fall of variances is counterintuitive in statistics but pervasive in geostatistics. So much so that a caution against oversmoothing became an integral part of Matheron’s new science of geostatistics. In Chapter 4 Estimation variance, David explicated, “The question of the estimation variance is one of the essential concepts which made geostatistics known.” Despite all his research, he did admit, “…obtaining an exact solution to the problem of the precision on recoverable reserves is an unanswered question…” That is in fact the quintessence of the case against geostatistics!

2.11 A study on kriging small blocks

It has been quite a feat indeed to bring the blessings of Matheron’s new science to the world’s mining industry. Many a geologist thought it odd so much could be done with so few boreholes. But too few knew real statistics well enough to figure out what was wrong with geostatistics. What Matheron and his minions had failed to grasp was that functions do have variances, and that sets of measured values give degrees of freedom. Koch, Link and Tukey were the token statisticians at the very first colloquium on geostatistics in the USA. They had been invited to scrutinize Matheron’s new science. What they failed to notice was that Agterberg’s predicted value had no variance, and that his measured values did not give degrees of freedom. Scores of skeptics had called Matheronian geostatistics a sham. What Agterberg and Matheron did was fumble a few variances and ignore the concept of degrees of freedom. And that’s all it took to do so much with a few boreholes!
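The point that a function such as a weighted average has its own variance can be illustrated with a short calculation. The sketch below is illustrative only and rests on assumptions not stated in the text: the measured values are treated as statistically independent, each with a known variance, so that the variance of the weighted average is the sum of squared normalized weights times the individual variances. The function name and the numbers are hypothetical.

```python
def weighted_average_with_variance(values, variances, weights):
    """Return a weighted average and its variance.

    Assumes independent measured values, so that
    var(sum(w_i * x_i)) = sum(w_i**2 * var(x_i)) with weights summing to one.
    """
    if not (len(values) == len(variances) == len(weights)):
        raise ValueError("values, variances and weights must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    w = [wi / total for wi in weights]  # normalize so the weights sum to one
    average = sum(wi * xi for wi, xi in zip(w, values))
    var_average = sum(wi ** 2 * vi for wi, vi in zip(w, variances))
    return average, var_average


# Hypothetical grades in % and variances in %², with equal weights.
avg, var_avg = weighted_average_with_variance([0.30, 0.50], [0.04, 0.04], [1, 1])
print(round(avg, 4), round(var_avg, 4))
```

With equal weights and equal variances the result reduces to the variance of a single value divided by the number of values, which is the familiar variance of an arithmetic mean.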
David’s 1977 Geostatistical Ore Reserve Estimation was first in some kind of mad race to assume, to krige, and to smooth. Journel and Huijbregts’s 1978 Mining Geostatistics was a close second with much more of the same surreal statistics. David deserved credit for confessing that his set of “simulated values” is infinite. Journel and Huijbregts, in turn, came clean in Chapter V The estimation of in situ resources and admitted infinite sets of “kriged estimators” give zero kriging variances. Infinite sets of kriged estimates with zero kriging variances and zilch degrees of freedom troubled none of those authors. Neither did the immeasurable odds of the kriging game give pause. On the contrary, they seemed to have worked out how to select out of any infinite set of kriged estimates the singular subset that does give the Best Linear Unbiased Estimator (BLUE). Selecting such an elusive least biased subset is a formidable task indeed. So much so that taking the Best Linear Unbiased Estimator of an infinite set of kriged estimators would be ranked an impossible event in probability theory. In symbols, P(BLUE)=0. This zero probability explains why scores of ore deposits did not make predicted grades. On a positive note, least biased subsets give finite kriging variances rather than zero kriging variances. On the downside, a finite kriging variance is just as meaningless a measure for variability, precision, and risk as is the zero kriging variance. A lasting problem is that every subset of any infinite set of kriged estimators does give a pseudo kriging variance. David’s 1988 Handbook of Applied Advanced Geostatistical Ore Reserve Estimation set the stage for geostatistical grade control at open pit mines. David’s own research had led him to infer that kriged values were preferable to measured values because he thought the latter were “erratic.” All the same, this so-called erratic block was not just smaller but had a higher metal grade than his kriged block. 
Moreover, the ordered set of blasthole grades for that erratic block displayed a significant degree of spatial dependence whereas those for David’s kriged block were randomly distributed. As a result, the grade of this erratic block could be estimated with a higher degree of precision than the grade of his kriged block. David’s attempt at geostatistical grade control made a mockery of statistical grade control because it does give less pay dirt and more tailings. David did not derive confidence limits for the metal grade of his kriged block. Neither did he bother to test for bias between metal grades of kriged and erratic blocks.

Practitioners of geostatistics need scapegoats as soon as predicted grades of ore blocks fail to pan out. Surely, to assume, krige and smooth couldn’t possibly cause lower than predicted grades! So it was that the smoothing relationship in David’s 1988 handbook brought about further research into the rise and fall of kriging variances as a function of block volumes. Armstrong was a prominent scholar at the Centre de Géostatistique and Champigny a CIM Member in good standing when they pored over the rise and fall of kriging variances, and wrote A study on kriging small blocks. David himself reviewed this study and approved it for publication in CIM Bulletin, Vol 82, No 923, Mar 1989.

Figure 2.11.1 Location of the block to be kriged

The above facsimile has the same caption as Figure 1 in Armstrong and Champigny’s study. It is akin to Figure 79 Configuration of neighbour blocks retained around the block to estimate in David’s 1988 handbook. Both figures are strikingly alike in the sense that simple symbols and real data are missing.

David brought up the “famous central limit theorem” in his 1977 textbook but did not grasp that it underpins sampling practice. He may not have thought much of it because he did not work with it anywhere in his 1988 handbook.
Otherwise, he might well have told Armstrong and Champigny that this theorem defines the relationship between the variance of the arithmetic mean grade of the block to be kriged and the variance of the set of measured grades about it. If all measured values were in fact equidistant to David’s block to be kriged, its central value would be the arithmetic mean of the set, and the central limit theorem would define its variance.

The central limit theorem tripped up Agterberg’s train of thought once more in his 1974 Geomathematics. Agterberg did put up with it in Chapter 6 Probability and Statistics and in Chapter 7 Frequency Distributions and Function of Independent Random Variables. So, it is all the more surprising then that he did not see fit to put it to work in Chapter 10 Stationary Random Variables and Kriging. His caption below Figure 64 in Chapter 10 reads, “Typical kriging problem; values are known at five points. Problem is to estimate value at point P0 from the known values at P1–P5.” So what was Agterberg’s problem?

Agterberg derived the distance-weighted average of his set of five (5) measured values determined in samples taken at positions with different coordinates. He did not know how to derive the central limit theorem for the central value of his set. He did not know how to verify spatial dependence applying Fisher’s F-test to the variance of the set and the first variance term of the ordered set. And he did not know how to count degrees of freedom for his set of measured values and for the ordered set. Agterberg did indeed have some real problems.

David’s “block to be kriged” in Figure 2.11.1 seems quite close to being equidistant to its “neighbour blocks.” The arithmetic mean of this set of grades is an unbiased estimate for the grade of his “block to be kriged” if, and only if, the ordered set of grades displays a significant degree of spatial dependence.
A significant degree of spatial dependence between grades of “neighbour blocks” is a condition sine qua non for the trueness of the grade for David’s “block to be kriged”. The next figure is similar to Figure 79 in Chapter 4 Estimation Variance of David’s 1988 handbook, and has exactly the same caption.

Figure 2.11.2 Configuration of neighbour blocks retained around the block to estimate

David did not have his “famous central limit theorem” in mind when he was reviewing Armstrong and Champigny’s study in 1988. But then, the authors did not work with this central limit theorem in their small block study. The fact that kriging variances have nothing but squared dimensions in common with true variances troubled CIM’s reviewer as much as the fall of kriging variances troubled the authors. After all, incompetent mine planners were to blame simply because they over-smoothed small blocks.

Figure 2.11.3 Rising and falling kriging variances

The above chart shows that the kriging variance is about y=0.5 for a 10x10m block and close to y=1.0 for a 1x1m block. So, the question is which of these kriged blocks is over-smoothed. After all, 1x1m kriged blocks may make single truckloads but 10x10m kriged blocks give many truckloads. Armstrong and Champigny cooked up a highly improbable interpretation of over-smoothed blocks in the Abstract of their study. The authors set forth, “Meaningful estimates of individual block grades are obtained when the variogram range is large compared to the block size and the sample spacing. For a variogram range of less than half the sample spacing, the kriged estimates were found to be uncorrelated with the actual grades.” Armstrong and Champigny saw fit to link meaningful estimates and widely spaced blocks but did not explain what made such estimates meaningful. The authors stated but did not prove that kriged block estimates and actual grades were uncorrelated for a variogram range of less than half the sample spacing.
What they should have done is verify spatial dependence by applying Fisher’s F-test to the variance of the set and the variance of the ordered set. A systematic walk that calls on the coordinates of each actual grade only once, and covers the shortest possible distance between coordinates gives the variance of the ordered set. Armstrong and Champigny’s study should have signaled the end of assuming, kriging, smoothing, and rigging the rules of applied statistics. It was David’s review that made it survive and thrive.

It is simple to assume spatial dependence when working with symbols but wildly at odds with verifying spatial dependence by applying Fisher’s F-test. What is simpler than comparing the observed F-value between the variance of a set and the first variance term of the ordered set with values of F-distributions at different probability levels and with the applicable degrees of freedom for each variance? The problem is that geostatisticians know more about assuming spatial dependence than counting degrees of freedom.

David was pleased with Armstrong and Champigny’s geostatistical inferences. Surely, it was reassuring to find out that mine planners were to blame for the rise and fall of kriging variances. So, it would be a matter of teaching those mine planners to assume and krige by the book, and to smooth the best possible blocks. His Geostatistical Ore Reserve Estimation and Handbook of Applied Advanced Geostatistical Ore Reserve Estimation put David in an excellent position to teach the intricacies of geostatistics to geoscientists and mine planners alike. Armstrong and Champigny’s study set the stage for staying the course, and teaching what the right way of assuming, kriging and smoothing is all about.

Armstrong and Champigny thought their study had proved small blocks should not be over-smoothed. Yet, they did not have the faintest idea why kriging variances rise and fall.
The authors did not have a clue why the pseudo kriging variance is as spurious a measure for variability and precision as the pseudo kriging covariance is for associative dependence between measured values in ordered sets. What is surprising is that neither author knew each function has its own variance and measured values give degrees of freedom. They did not know how to count degrees of freedom, and why Fisher’s F-test demands degrees of freedom. That may well be the reason why practitioners of geostatistics are taught to assume rather than prove spatial dependence. Some of Armstrong’s studies are posted with the Online Library of the Centre de Géostatistique but the strange study of the rise and fall of kriging variances is not among them.

Matheron’s seminal work may seem to have brought to an end the human struggle against randomness. For it was Matheron’s vision to assume, krige, smooth, and rig the rules of applied statistics that would create order where chaos might otherwise have prevailed. Thanks to Matheron’s minions it turned into a dreadful pseudo science. Armstrong and Champigny’s caution against over-smoothing small blocks did not change the fact that kriged block grades are functions. Journel’s doctrine of assumed spatial dependence did not solve Agterberg’s problem because distance-weighted averages, too, are functions.

Armstrong was the Editor of De Geostatisticis when she pointed her finger at a few critics of geostatistics in a stirring article on “Freedom of Speech?” and pondered, “Does the peer review process deprive these people of their freedom of speech by denying them the chance to express opinions that run against popular view? Or is the peer review system just doing its job of rejecting papers that do not back up their opinions with scientific fact?” She was an Associate Editor with the Journal for Mathematical Geology in 1992 but still did not grasp the difference between functional and spatial dependence.
Bre-X’s bogus grades of a few boreholes at positions with different coordinates defined an infinite set of kriged boreholes with matching bogus grades. So, it was straightforward to assume similar grades between widely spaced boreholes, and add kriged boreholes to Bre-X’s inferred gold resource. Step-out drilling at Busang was extremely effective because assuming spatial dependence between widely spaced lines of salted boreholes added a massive volume of barren rock to Bre-X’s phantom gold resource. Table 2.11.2 lists the basic statistics for three lines of nine kriged boreholes between nine salted boreholes on Line SEZ-44 and eleven salted boreholes on Line SEZ-49.

Table 2.11.2 Statistics for kriged and salted boreholes
---------------------------------------------------------------------------------------------
Statistic                          Symbol     SEZ-44    kriged    kriged    kriged    SEZ-49
---------------------------------------------------------------------------------------------
Arithmetic mean in gpt             x̄          2.68      2.90      2.99      3.08      3.17
95% Confidence interval in gpt     95% CI     ±0.63     ±0.02     ±0.02     ±0.05     ±0.84
95% Confidence interval in %rel    95% CI     ±24       ±0.8      ±0.7      ±1.6      ±26
Variance of set                    var(x)     0.6718    0.0105    0.0126    0.0218    1.5576
First variance term                var1(x)    0.9370    0.0012    0.0009    0.0055    2.0795
Observed F-value                   F          1.39      8.63      13.76     3.95      1.34
Significance                                  ns        ?         ?         ?         ns
Degrees of freedom for:
  Set                              df         8         0         0         0         10
  Ordered set                      df(o)      16        0         0         0         20
---------------------------------------------------------------------------------------------
ns  not significant
?   ask qualified krigeologist

The above statistics show that two lines of salted boreholes do not display a significant degree of spatial dependence, and that three lines of kriged boreholes do indeed create some delusion of spatial dependence. The statistics highlight what happens when each and every distance-weighted average-cum-kriged estimate no longer has its own variance, and when kriging drums din those who work with degrees of freedom.
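The observed F-values for the two salted lines can be recomputed from the printed variances as a quick check. The sketch below is a hedged illustration: it assumes, by inference from the tabulated figures, that each observed F-value is the ratio of the larger variance to the smaller one. The dictionary and function names are hypothetical, and only the printed, rounded variances for the salted lines are used.

```python
# Printed variances (variance of set, first variance term) for the salted lines.
SALTED_LINES = {
    "SEZ-44": (0.6718, 0.9370),
    "SEZ-49": (1.5576, 2.0795),
}


def observed_f(var_set, var_first):
    """Ratio of the larger to the smaller variance (assumed table convention)."""
    larger, smaller = max(var_set, var_first), min(var_set, var_first)
    if smaller <= 0:
        raise ValueError("variances must be positive")
    return larger / smaller


for line, (var_set, var_first) in SALTED_LINES.items():
    print(line, round(observed_f(var_set, var_first), 2))
```

Applying the same ratio to the printed variances of the kriged lines gives values close to, but not identical with, the tabulated ones, presumably because the printed variances are rounded to four decimals.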