Sampling and Statistics Explained
Towards commonsensical sampling practices and scientifically sound statistical methods

J W Merks


Chapter 2 Sampling theory

Sampling theory derives from probability theory whose roots are traceable to games of
chance such as tossing coins, rolling dice, and drawing cards from a deck or colored balls
from a vase. When Stonehenge’s monoliths on Salisbury Plain were tracking summer and
winter solstices, Hounds and Jackals, a precursor to playing dice, was already popular in
Mesopotamia. Nowadays, betting on all sorts of sporting events, gambling on addictive
slot machines, and playing a broad range of games of chance are popular pastimes all
over the world. The human race may well possess its innate penchant for gambling and
games of chance because it exists, despite staggering odds to the contrary.

Gerolamo Cardano (1501-1576) wrote Liber de Ludo Aleae (Book on Games of Chance)
long before 1654 when Antoine Gombaud, Chevalier de Méré (1607-1684), a French
gambler and a rogue, wrote to Blaise Pascal (1623-1662). Gombaud wanted to know why
he lost more when betting to roll at least one double six in 24 rolls of a pair of 6-sided
dice than when betting to roll at least one six in four rolls of a single die. Pascal wrote
about Gombaud’s gambling problem to Pierre de Fermat (1601-1654), and the ensuing
exchange of letters formed the foundation of probability theory.

Pascal and Fermat are recognized as the founders of probability theory, even though they
published little on games of chance and wrote mostly to each other. Christiaan Huygens
(1629-1695) wrote the first treatise on probability theory based on Pascal’s and Fermat’s
correspondence, and designed the first pendulum clock based on the notes of Galileo
Galilei (1564-1642). Abraham de Moivre (1667-1754) merged probability theory with
algebra and trigonometry in his 1718 “The Doctrine of Chances,” and discovered the
equivalence between the Bernoulli and Poisson distributions.

Probability is a quantitative measure for the chance or likelihood that a certain event or
outcome will occur. For example, the probability of encountering continued mineralization
during mining between in situ ordered ore zones in boreholes along a line or profile could
be quantified by verifying spatial dependence. In geostatistics, however, spatial dependence
between in situ or temporally ordered sets of measured values in sample spaces may be
assumed, unless proven otherwise. Paradoxically, Fisher’s F-test should be applied to the
variance of a set of measured values and the first variance of the in situ ordered set to
prove a significant degree of spatial dependence. Fisher’s F-test has not been approved
for application in geostatistical ore reserve estimation.


Probability theory examines nondeterministic systems with randomly distributed discrete
and continuous variables. Deterministic laws make tossed coins, rolled dice and roulette
wheels come to rest whereas probability theory deals with outcomes and events after
forces are balanced and energies exhausted. The elements of probability behind gambling
and games of chance explain why sampling theory applies to homogeneous populations.
In contrast, sampling practice applies to heterogeneous sampling units such as contents of
bulk bags, trucks and wagons, and shipments in unit trains and cargoes aboard bulk
carriers, and to heterogeneous sample spaces such as stockpiles at mineral processing
plants and bulk terminals, masses and volumes of in situ coals and ores, contaminated
sites, and similar stationary situations.

The notions of permutations and combinations are briefly explored. These methods of
counting are of particular importance in statistical quality control (SQC) where discrete
accept/reject outcomes are routinely encountered during inspection of manufactured
products. The properties of the Gaussian or normal distribution are more relevant in
sampling practice for the mining industry and the international commodity trade because
central values and variances of sets of test results, determined in samples selected from
multinomial, binomial or Poisson distributions, converge on the normal distribution as
more samples are selected. Historically, this expectation is called the Law of Large
Numbers. Depending on the rigorousness of mathematical proof, it may also be referred
to as either the Weak or the Strong Law of Large Numbers.

A corollary of the Law of Large Numbers is the Central Limit Theorem. Not only does it
provide a scientifically sound basis to probability theory and mathematical statistics but it
also builds a bridge between sampling theory and sampling practice. The formula for the
Central Limit Theorem is deceptively simple for the arithmetic mean of a set of test
results determined in samples with equal weights but it becomes more complex for the
weighted average of a set of test results determined in samples with variable weights.

Weighting factors play an important role in sampling practice because each area-, count-,
density-, distance-, length-, mass- and volume-weighted average has its own variance.
Inexplicably, the distance-weighted average lost its variance long before it was reborn as
an honorific kriged estimate or kriged estimator. Geostatistical practitioners have yet to
explain why the true variance of the single distance-weighted average was replaced with
the pseudo variance of a set of distance-weighted averages. The notion that the distance-
weighted average never had a variance is about as absurd as the belief that its rebirth as
a kriged estimate or kriged estimator made this variance vanish without a trace.

The Central Limit Theorem derives from the variance of a general function as defined in
probability theory. It is easy to prove that the variance of each central value converges on
this quintessential theorem when all the weighting factors converge on the same constant
weighting factor. The validity of this fundamental relationship can be proved heuristically
or shown by stochastic simulation. Presently, random number generators that underpin
simulation models can be validated on the basis of a priori probabilities of classic games
of chance such as tossing coins, rolling dice, and drawing cards. Plus ça change, plus
c’est la même chose!
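Such a stochastic simulation takes only a few lines of code. The sketch below, a minimal illustration in Python under the stated assumptions, first validates the random number generator against the a priori probability of heads, and then shows the variance of the mean of n dice shrinking as var(die)÷n, in line with the Central Limit Theorem:

```python
import random
import statistics

random.seed(7)  # reproducible draws

# Validate the generator against the a priori probability of heads.
tosses = [random.randint(0, 1) for _ in range(100_000)]
print(f"P(heads) ≈ {sum(tosses) / len(tosses):.3f}  (a priori 0.500)")

# Variance of the mean of n dice converges on var(die)/n.
var_die = statistics.pvariance(range(1, 7))          # 35/12 ≈ 2.917
for n in (1, 2, 10):
    means = [statistics.mean(random.randint(1, 6) for _ in range(n))
             for _ in range(20_000)]
    print(f"n={n:>2}: var(mean) ≈ {statistics.pvariance(means):.3f}, "
          f"theory {var_die / n:.3f}")
```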



2.1 Elements of probability theory

Tossing a fair coin, rolling an unloaded die, drawing a card from a shuffled deck or a ball
from a vase with a homogeneous set of balls of different colors but with the same
diameter are examples of sampling experiments with equiprobable sample spaces and
discrete outcomes. Heads and tails are equally likely for an unbiased coin, and so is each
of the six faces of an unweighted die. Drawing a single card from a shuffled (randomized
or randomly distributed) full deck without a joker is a sampling experiment with 52
equiprobable outcomes. Drawing a single ball from a set in a vase is different in the sense
that this sample space is defined by N, the number of balls in the vase.

Each of these cases defines a sample space. For example, the sample space is two for
tossing a single coin, six for rolling a 6-sided die, 52 for drawing a single card from the
full deck without jokers, and N for drawing a particular ball from a vase with N balls.
Each toss, roll, or draw gives an elementary outcome, and all possible outcomes of a
single toss, roll or draw display a discrete and uniform distribution. Multiple elementary
outcomes constitute an event, and multiple events display a discrete and nonuniform
distribution or event space. For example, a pair of dice gives an event space of 2, 3…11,
12 dot sums, and displays a discrete and nonuniform distribution.

A priori probabilities are reported as fractions or percentages. The probability to toss
heads or tails with an unbiased coin is P(H)=P(T)=1÷2=0.50 or P(H)=P(T)=50%. The
probability to roll any of the six faces of an unweighted cubic die is P(x)=1÷6=0.167 or
P(x)=100÷6=16.7%. The probability to draw the Queen of Hearts from a shuffled deck
of cards is P(QH)=1÷52=0.0192 or P(QH)=100÷52=1.92%. The same probability
applies to each card in a deck as long as the game is played with an unstacked deck. After
all, the probability for each card in a full deck is P(any card)=1÷N, in which N is the
number of possible outcomes in this well-defined sample space of 52 cards.

The probability of P(any ball)=1÷N applies to each ball that is blindly drawn from a vase
with a homogeneous set of N colored balls with identical diameters. This probability
remains the same when the selected ball is returned to the set, provided it is homogenized
before a next blind draw. The probability for draws with replacement is constant whereas
the probability for draws without replacement increases. For example, the second draw
from the reduced deck of cards gives P(any card but the first)=100÷51=1.96%, the third
draw gives P(any card but the first and second)=100÷50=2.00%, and so on.

In the case of independent events where the chance of one event to occur does not impact
the chance of the other, the probability of both events to happen is the product of their
probabilities. For multiple independent events, the multiplication rule gives
P(A and B and…and X)=p(A)·p(B)·…·p(X).

If two events are independent and mutually exclusive, the probability of either event to
occur is the sum of their probabilities. For a pair of independent and mutually exclusive
events P(A and B)=0 and P(A or B)=p(A) + p(B) in accordance with the addition rule,
which applies to any set of independent and mutually exclusive events.
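Both rules are easy to verify numerically. The following Python sketch (the names are illustrative) applies the multiplication rule to the probability of rolling two sixes with a pair of dice, and the addition rule to the probability of drawing either of two specific cards in a single draw:

```python
from fractions import Fraction

# Multiplication rule: independent events multiply.
# Probability of rolling a six with each of two dice.
p_six = Fraction(1, 6)
p_double_six = p_six * p_six          # 1/36

# Addition rule: mutually exclusive events add.
# Probability of drawing the Queen of Hearts or the Queen of Spades
# in a single draw from a full 52-card deck.
p_qh = Fraction(1, 52)
p_qs = Fraction(1, 52)
p_either_queen = p_qh + p_qs          # 2/52 = 1/26

print(f"P(double six) = {p_double_six} = {float(p_double_six):.4f}")
print(f"P(QH or QS)   = {p_either_queen} = {float(p_either_queen):.4f}")
```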



The condition that sampling experiments be unbiased is fundamental in sampling theory
and practice. If a 6-sided die were loaded or weighted, then the probability of at least two
out of six faces is either higher or lower than 16.7%. If a deck of cards were incomplete,
then the probability to draw Card X would no longer be P(X)=100÷52=1.92%. In the case
that Card X is the only one left, the final draw is the certain event. The probability of
drawing all 52 cards in one particular order, however, is a minuscule P=1÷52! ≈ 10^-68.

The condition that sampling experiments be independent is equally fundamental because
it impacts a priori probabilities. This condition lies at the core of sampling theory but its
implications in sampling practice are not always transparent. Irrespective of whether a
toss with an unbiased coin gave heads or tails, the probability of the next toss does not
depend on the previous toss but remains P=0.50 or P=50%. Repeated draws from a full
deck remain independent when a selected card is returned to the deck, which should be
shuffled prior to the next draw. After all, if the Queen of Hearts were the first card, then
drawing it twice would be an impossible event unless it is a draw with replacement.

Bias tests are designed to verify whether a particular sampling protocol gives unbiased
test results. For example, testing a 6-sided die for a bias or systematic error requires that a
set of empirical probabilities be compared with the discrete uniform a priori probabilities
of 16.7%. Figure 2.1 presents the discrete and uniform a priori probability distribution
for a single die, and the discrete and nonuniform a priori probability distributions for a
pair of dice, and for a set of three.

                Figure 2.1 Discrete uniform and nonuniform distributions

[Bar charts of the a priori dot-sum probabilities for one die, two dice, and three dice; the x-axis runs from 1 to 18.]

These bar charts display all possible discrete outcomes for a single die, for a pair of dice
and for a set of three. The single die gives a uniform distribution of six equiprobable
outcomes, a pair of dice gives a nonuniform distribution of 11 events (dot sums 2 to 12),
and a set of three gives a nonuniform distribution of 16 events (dot sums 3 to 18).
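The bias test described above, which compares empirical with a priori probabilities, can be sketched in a few lines of Python. This is a minimal illustration rather than a formal test: rolling an unbiased virtual die many times should give empirical frequencies close to the a priori 16.7%.

```python
import random
from collections import Counter

random.seed(42)                      # reproducible rolls
n_rolls = 60_000
rolls = Counter(random.randint(1, 6) for _ in range(n_rolls))

print("face  empirical   a priori")
for face in range(1, 7):
    print(f"{face:>4}  {100 * rolls[face] / n_rolls:8.2f}%    16.67%")
```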

The probability distribution for a pair of dice is nonuniform and its slopes are linear. In
contrast, the probability distribution for three dice seems to converge on the typical
bell-shaped curve of the Gaussian or normal probability distribution. Multiple dice do indeed
define discrete and nonuniform a priori probability distributions that converge ever
closer on the normal probability distribution as the number of dice increases. This is a
corollary of the Central Limit Theorem.

Three dice of different colors define a priori events with dot sums of (1;1;1)=3,
(1;1;2)=(1;2;1)=(2;1;1)=4, … (6;6;5)=(6;5;6)=(5;6;6)=17, and (6;6;6)=18 for N=6^3=216
single outcomes. Table 2.1 lists all the events and expected counts, the relative and
cumulative probabilities in percent, and the approximate 98% and 94% confidence intervals.

             Table 2.1 Event space and dot sum counts for three dice
            ——————————————————————————
             Event Count       %rel      %cum     98% CI 94% CI
            ——————————————————————————
               3        1       0.46      0.46
               4        3       1.39      1.85      1.39
               5        6       2.78      4.63      4.17       2.78
               6       10       4.63      9.26      8.80       7.41
               7       15       6.94     16.20     15.74      14.35
               8       21       9.72     25.93     25.46      24.07
               9       25      11.57     37.50     37.04      35.65
              10       27      12.50     50.00     49.54      48.15
              11       27      12.50     62.50     62.04      60.65
              12       25      11.57     74.07     73.61      72.22
              13       21       9.72     83.80     83.33      81.94
              14       15       6.94     90.74     90.28      88.89
              15       10       4.63     95.37     94.91      93.52
              16        6       2.78     98.15     97.69      96.30
              17        3       1.39     99.54     99.07
              18        1       0.46    100.00
            ——————————————————————————
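The event space in Table 2.1 is small enough to enumerate exhaustively. A minimal Python sketch, assuming nothing beyond the standard library, reproduces the counts and the relative and cumulative probabilities:

```python
from itertools import product
from collections import Counter

# Enumerate all 6^3 = 216 ordered outcomes for three dice.
counts = Counter(sum(faces) for faces in product(range(1, 7), repeat=3))
total = sum(counts.values())          # 216

print("event  count   %rel    %cum")
cum = 0.0
for dot_sum in sorted(counts):
    rel = 100 * counts[dot_sum] / total
    cum += rel
    print(f"{dot_sum:>5}  {counts[dot_sum]:>5}  {rel:5.2f}  {cum:6.2f}")
```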

Table 2.1 shows that events with 3 and 18 dot sums have relative a priori probabilities of
0.46%. Events with 4, 5, 16, and 17 dot sums converge on a symmetric 98% confidence range
with a lower limit of 98% CRL=4 events and an upper limit of 98% CRU=17 events.
Events with 5, 6…15, 16 dot sums are fairly close to a symmetric 94% confidence range
with a lower limit of 94% CRL=5 events and an upper limit of 94% CRU=16 events.

Events with 3 and 18 dot sums occur only once and events with 4 and 17 dot sums occur
three times. Each count, when converted into a fraction of the total count of 216, becomes
a weighting factor required to obtain the count-weighted average (the central value of this
set of events), the variance of the set, and the variance of its central value. Calculating
central values and variances of sets of measured values with variable weights is of critical
importance in sampling practice.




Tests for kurtosis and skewness verify whether a set of events departs from the normal
probability distribution. For example, if the set in Table 2.1 were to depart significantly
from normalcy, then its central value and confidence limits should reflect the weighting
factors of wi=ni÷Σni, where ni is the count for the ith event, and Σni=N=6^3=216. A quick-
and-dirty test is to draw a graph of all the events in Table 2.1 against their logarithms.
Figure 2.2 shows a chart of dot sums against their logarithms.

                                Figure 2.2 Expected events for three dice

[Chart: log(sums) on the y-axis, from 0.0 to 2.0, against sums of dots from 3 to 18 on the x-axis.]

The graph in Figure 2.2 converges closely on a straight line. Yet, appearance of normalcy
may well be as deceptive as a priori assumption of spatial dependence between in situ or
temporally ordered sets of measured values of stochastic variables in sample spaces. The
question is whether the expected dot sums for 216 rolls of three unbiased dice diverge
significantly from the normal distribution. ISO Technical Committee 69, Applications of
statistical methods, has approved various tests for departure from normality. Several
tests, including those for kurtosis and skewness, are discussed in Chapter 3 Sampling
Practice where the first, second, third and fourth order differences between a set of
measured values and its central value are introduced.

Generally, a priori probabilities range from zero to unity, or from 0% to 100%. By
definition, p=0 or p=0% are impossible events whereas p=1 or p=100% are certain
events. Examples of certain and impossible events range from absurd to whimsical, with
death and taxation undeniably certain events. Gambling for profit without counting cards,
loading dice, or otherwise stacking the odds is, for all practical purposes, an impossible
event. Gambling may be a popular pastime but the odds to win favor bingo halls, casinos,
lotteries, slot machines, and the like. The term odds refers to the ratio between the
probabilities of winning and losing. In the real world, gambling is not a zero-sum game.

Gambling problems are often easier to solve by calculating the probability to lose rather
than the probability to win. When Gombaud was betting to roll at least one six in four
casts of a single 6-sided die, he would win more often than not because the probability of
P(single six)=1–(5÷6)^4 ≈ 0.518 was in his favor. However, when he was betting to roll at
least one double six in 24 casts, he would lose more often than not because
P(double six)=1–(35÷36)^24 ≈ 0.491 was to his disadvantage. But why did this Renaissance
rogue even try to get a double six by rolling a pair of dice 24 times? Surely, it must have taken
him at least six times longer to get a feel for the odds of this bet. Perhaps passing time
was as much part of the way of life in Gombaud’s days as playing games.
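The two probabilities behind Gombaud's wins and losses take one line each to verify; a quick Python check:

```python
# Chance of at least one six in four rolls of one die.
p_single = 1 - (5 / 6) ** 4           # ≈ 0.518, a winning bet
# Chance of at least one double six in 24 rolls of two dice.
p_double = 1 - (35 / 36) ** 24        # ≈ 0.491, a losing bet

print(f"P(at least one six in 4 rolls)         = {p_single:.3f}")
print(f"P(at least one double six in 24 rolls) = {p_double:.3f}")
```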

Gombaud gambled without understanding the rules of the game but was smart enough to
ask Pascal for assistance. Matheron gambled with geostatistics without understanding the
rules of mathematical statistics but was not as smart as Gombaud to ask a statistician for
assistance. Perhaps ironically, Matheron’s geostatistics was hailed as a new science in the
1960s. Yet, its most astounding attribute was that the pseudo kriging variance of a set of
kriged estimates replaced the true variance of the single distance-weighted average.
Statisticians would have told Matheron and his staff that “variances” of sets of distance-
weighted averages are invalid measures for variability, precision, and risk. Many would
have pointed out that each distance-weighted average has its own variance because one-
to-one correspondence between central values and variances is inviolable in mathematical
statistics. Most statisticians would have taught Matheron and his students that a set of n
functionally dependent (calculated!) values of a stochastic variable gives exactly zero
degrees of freedom, that a set of n measured values gives df=n–1 degrees of freedom,
and that the in situ ordered set gives dfo=2(n–1) degrees of freedom.

Gombaud trusted Pascal and Fermat but Matheron did not trust statisticians. In his
ponderous Foreword to Journel and Huijbregts’s Mining Geostatistics, he pontificated,
“A statistician who is not familiar with mining may well be discouraged before he can
even get a good idea of the problem at hand.” Statisticians would have been as popular at
the Centre de Géostatistique as horse flies in a barn after his new science of creating order
in randomness failed to inspire the few statisticians at the 1970 Geostatistics colloquium.

David, in the Introduction to his 1977 Geostatistical Ore Reserve Estimation, declares,
“This is not a book for professional statisticians,” and correctly predicts, “…statisticians
will find many unqualified statements….” In Section 12.2 Conditional Simulation, he
refers to “…an infinite set of simulated values…” and ponders how “To make that
infinite set of simulated values smaller and get the model closer to reality…” On the next
page, he muses, “The criticism to this model is obvious. The simulation is not reality.
There is only one answer. The proof of the pudding is …!” The problem with David’s
pudding proof is that many geostatistical ore reserve estimates failed to make the grade.

In Chapter V.A. Theory of Kriging, Journel and Huijbregts’s 1978 Mining Geostatistics
refers to σK² = 0, the zero kriging variance. Given Krige’s infinite set of distance-weighted
averages and David’s infinite set of simulated values, it is not surprising that zero kriging
“variances” and infinite sets of kriged estimates surfaced in Mining Geostatistics. It is
astounding that the variance of the distance-weighted averages and the concept of degrees
of freedom vanished on Krige’s watch. On a positive note, Gombaud would have been
pleased that casting an unbiased six-sided die did make it into Mining Geostatistics!



2.2 Counting Methods

The notions of permutations and combinations deal with all possible arrangements when
selecting different or identical objects without or with replacement. When all objects in a
set are identifiably different, it is possible to record the order in which a subset of objects
is selected. For example, an experiment with three dice must stipulate that each die has a
different color to ensure that the correct dot sums for the three faces are recorded.
Experiments with a deck of cards or a vase of colored balls with the same diameter are
examples of selecting a subset of objects from a set of identifiably different objects.

The number of arrangements of n different objects selected from a population of no less
than n objects with replacement is n^n. For example, the number of arrangements of 52
cards selected from a full deck with replacement is P=52^52. A typical example of a draw
without replacement is the number of permutations for the letters of the alphabet. The
first letter may be selected 26 ways, the second 25 ways, and so on until only one letter is
left. Hence, the number of permutations for 26 objects without replacement is
P=26·25·24…3·2·1=26!. This exclamation mark does not imply excitement but it is one
of many austere symbols in probability and sampling theory.

When a sampling experiment is carried out with replacement, the 1st selection can be any
object in a set of n while the kth selection can still be any object as long as k≤n. Hence, the
number of arrangements of k objects drawn with replacement from a set of n objects is
n^k. For example, k=5 letters selected with replacement from the n=26 letters of the
alphabet can be arranged in 26^5 = 11,881,376 different ways; without replacement, the
count drops to 26·25·24·23·22 = 7,893,600.

Generally, the number of permutations of n objects selected from n objects one at a time
without replacement is P=n!. The symbol for the number of n objects selected from n
objects is Pnn. The symbol for the number of k objects selected from n objects is Pkn.
Given that the 1st object can be selected in n different ways, the 2nd in n–1 ways, the 3rd in
n–2 ways, and the kth in n–k+1 ways, it follows that n(n–1)(n–2)…(n–k+1) =
n(n–1)(n–2)…(n–k+1)(n–k)!÷(n–k)! = n!÷(n–k)!. Hence, Pkn = n!÷(n–k)!.
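These counting formulas are available in Python's standard library; a short sketch verifies the numbers quoted above:

```python
import math

# Permutations of all 26 letters without replacement: 26!
print(math.factorial(26))            # 403291461126605635584000000

# Permutations of k=5 letters from n=26 without replacement:
# P(n, k) = n! / (n - k)!
print(math.perm(26, 5))              # 7893600

# Arrangements of k=5 letters with replacement: n^k
print(26 ** 5)                       # 11881376
```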

A variant of this type of sampling experiment would be to select three-letter and three-
digit codes for automobile license plates. With replacement, a subset of three letters may
be drawn in n1=26^3 different ways whereas a subset of three digits may be drawn in
n2=10^3 different ways. A few three-letter words out of n1=17,576 may not be suited for
license plates but, multiplied by 10^3, more than 17 million combinations are probably acceptable.

The number of permutations of a set of n objects, consisting of subsets of n1…ni …nk
objects of different types such that Σni =n, equals n! ÷ [n1! ·…·ni! · …· nk!]. For example,
the number of identifiably different permutations of the n=15 letters in “geostatistician”
is 15! ÷ [3!·3!·2!·2!·1!·1!·1!·1!·1!] = 9,081,072,000. This is a rather small number when compared
with the infinite set of kriged estimates defined by two or more measured values,
determined in samples selected at positions with different coordinates in a finite sample
space. How to select the least biased and most precise subset of an infinite set of kriged
estimates is a daunting task. Selecting a subset of k kriged estimates from an infinite set
of kriged estimates gives Pkn permutations in which k is finite but n is infinite. Given that
Pkn = n! ÷ [n – k]!, it follows that the number of permutations is immeasurable. By
implication, the odds are heavily stacked against those who want to select the least biased
and most precise subset of k kriged estimates from an infinite set of kriged estimates.
Geostatistical ore reserve practitioners perform this improbable task on a routine basis.

For a set of n objects that consist of subsets of n1…ni …nk objects of different types such
that Σni = n, it follows that the sum of the probabilities n1÷n +…+ ni÷n +…+ nk÷n is
equal to unity. In fact, this relationship between permutations and probabilities underlies
the multinomial distribution where probabilities are in fact weighting factors required to
obtain weighted averages and variances.

A combination of objects is a set selected such that the order of the objects in the set is
irrelevant. For example, if ten (10) objects in a sample of one hundred (100) turn out to
be defective, it does not matter in which order these flawed objects were selected. It does
matter that the probability to accept is P(accept)=90%, and that the probability to reject
is P(reject)=10%. These probabilities are complementary, which is a characteristic of the
binomial or Bernoulli distribution.

The symbol and formula for the number of combinations of a set of n objects in a sample
of k is Ckn = n! ÷ [(n–k)!·k!]. For example, the number of 5-card hands that can be
selected from a 52-card deck is 52! ÷ [(52–5)!·5!] = 2,598,960. The number of
combinations of n objects selected in a subset or a single sample of n objects is Cnn = n! ÷
[(n–n)!·n!]. Given that 0!=1 by definition, it follows that Cnn = n!÷n! = 1.
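Again the standard library confirms the counts; a brief sketch:

```python
import math
from collections import Counter

# 5-card hands from a 52-card deck: C(n, k) = n! / ((n - k)! * k!)
print(math.comb(52, 5))              # 2598960

# Selecting all n objects at once gives exactly one combination.
print(math.comb(52, 52))             # 1

# Distinct permutations of the 15 letters of "geostatistician":
# n! divided by the factorials of the repeated-letter counts.
word = "geostatistician"
denominator = math.prod(math.factorial(c) for c in Counter(word).values())
print(math.factorial(len(word)) // denominator)   # 9081072000
```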




2.3 Probability distributions

Probability distributions for elementary outcomes such as two sides of a coin or six faces
of a cubic die are discrete and uniform. Probability distributions for events such as heads
and tails for a set of coins, and dot sums for a set of dice, are discrete and nonuniform.
Discrete variables display discrete and uniform or nonuniform distributions. The most
relevant discrete and nonuniform probability distributions are the multinomial
distribution, the Bernoulli or binomial distribution, and the Poisson distribution; the
Gaussian or normal distribution, in contrast, is continuous and nonuniform. Each of these
probability distributions is discussed in a separate section.

Various statistical tests are based on comparing observed statistics with values from
various probability distributions. For example, Bartlett’s chi-square test, Fisher’s F-test
and Student’s t-test compare observed statistics with values of χ²-, F- and t-distributions
at selected probability levels and with applicable degrees of freedom. Tabulated values
are given in handbooks of statistical tables and in many textbooks on applications of
statistical methods. Dedicated software makes it simple to apply statistical tests, and to
obtain the required values of various probability distributions.
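For example, with SciPy (one such package; any statistical library will do) the tabulated critical values can be retrieved directly. The degrees of freedom below are illustrative:

```python
from scipy import stats

# Critical values at 5% probability for common tests.
print(stats.chi2.ppf(0.95, df=10))        # Bartlett's chi-square test
print(stats.f.ppf(0.95, dfn=10, dfd=10))  # Fisher's F-test
print(stats.t.ppf(0.975, df=10))          # Student's t-test, two-sided
```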

Bartlett’s χ²-test verifies whether three or more variances constitute a homogeneous set.
For example, the χ²-test can be applied to the variances of test results for core samples
from ordered ore zones on a line of boreholes to verify homogeneity. The χ²-test is also
applicable to squared relative standard deviations and squared coefficients of variation.
The test verifies whether expected and observed outcomes of sampling experiments are
statistically identical or indicative of the presence of bias. Table I.IV in David’s 1977
Geostatistical Ore Reserve Estimation shows how to apply the χ²-test to expected and
observed frequencies. In this table of the first geostatistics textbook the concept of
degrees of freedom makes a cameo appearance.

Fisher’s F-test plays a pivotal role in sampling and statistics because it is the quintessence
of analysis of variance (ANOVA). Fisher’s F-test verifies whether two variances are
statistically identical or differ significantly. The F-test is also applied to test for spatial
dependence in a sample space or a sampling unit, to construct sampling variograms that
show where orderliness in sample spaces dissipates into randomness, and to analyze and
optimize sampling protocols. Fisher’s F-test, too, is applicable not only to variances but
also to squared relative standard deviations and to squared coefficients of variation.

Student’s t-test verifies the absence or presence of bias between either paired or pooled
sets of measured values of stochastic variables in sample spaces or sampling units. It can
be applied at the primary sample selection, sample preparation, and analytical stages of
measurement hierarchies. It is of critical importance to international trading partners that
mechanical sampling systems and manual sampling protocols be tested for bias. Many
applications of Student’s t-test will be discussed.

Tukey’s Wholly Significant Difference test, or WSD-test for short, is also discussed. The
WSD-test compares each difference between three or more central values with the wholly



                                           10 - 68
significant difference to identify statistical significance at 95% probability. Degrees of
freedom play a pivotal role in Bartlett’s chi-square test, Fisher’s F-test, Student’s t-test
and Tukey’s WSD-test.

Dr J W Tukey, a Princeton professor and a prominent statistician, was present at a
geostatistical symposium in 1970. When asked about Matheron’s Theory of Regionalized
Variables, Tukey said, “I am now beginning to understand that Kriging is apparently a
word for more-or-less stationary, more-or-less least squared smoothing of the data.”
David’s 1977 Geostatistical Ore Reserve Estimation cautioned statisticians about many
unqualified statements. Armstrong and Champigny’s 1988 A Study on Kriging Small
Blocks cautioned against oversmoothing. Journel’s 1992 guidelines cautioned against
classical “Fischerian” statistics. Clearly, geostatisticians issue one caution after another
while violating with impunity the fundamental requirement of functional independence
and ignoring the concept of degrees of freedom.

Chapter 3 Sampling practice explores the application of statistical tests in sufficient
detail to ensure their proper application and interpretation. Mathematical statistics would
collapse if degrees of freedom were ignored. For example, it would be impossible to
derive unbiased confidence intervals and ranges for central values such as arithmetic
means and weighted averages, metal contents and grades of volumes of in situ ore, and of
wet and dry masses of mined ore, mill feed, tailing and concentrate. Neither would it be
possible to verify spatial dependence between in situ or temporally ordered sets of
measured values, and to chart sampling variograms that show where spatial dependence
in sample spaces or sampling units dissipates into randomness. Degrees of freedom are
equally indispensable when testing for bias and optimizing sampling protocols.

Given that functional independence and degrees of freedom are irrelevant in geostatistics,
it follows that the pseudo kriging variance of a set of functionally dependent kriged
estimates is an invalid measure for variability, precision, and risk. In classical statistics,
one-to-one correspondence between functionally dependent values and variances entails
that central values do have variances. The arithmetic mean is the ubiquitous central value
of a set of measured values with equal weights. In contrast, an area-, count-, density-,
distance-, length-, mass- or volume-weighted average is the central value of a set of
measured values with variable weights.

When Matheron converted a flawed variance of mathematical statistics into geostatistics,
he did not know that each length-weighted average grade has its own variance. In fact,
the length-weighted average was the first central value that lost its variance while
Matheron was studying statistics and wrote his first statistical note in 1954. It took some
ten years before the distance-weighted average, too, metamorphosed into an honorific but
variance-deprived kriged estimate or estimator.




2.3.1 Multinomial distribution

The multinomial distribution is most effective when applied to small sample spaces with
discrete events such as a set of cards drawn from a full deck, or a sample selected from a
vase with balls of different colors but identical diameters. The deck of cards should be
shuffled and the contents of the vase should be thoroughly mixed because the properties
of the multinomial distribution only apply to homogeneous populations. The probability
for a particular event is obtained from the terms of the multinomial expansion,

$$(p_1 + \cdots + p_i + \cdots + p_k)^n = 1$$

                            where: pi = probability for ith subset
                                   k = number of subsets
                                   n = number of elements in all subsets

The multinomial distribution is useful to show that the probabilities of discrete events
converge on the Gaussian or normal distribution. In sampling practice, the multinomial
distribution is of limited use because n, the complete set of events, consists of k subsets
such that n = n1 +…+ ni +…+ nk, and because the probabilities for the numbers of events
in the subsets are p1… pi … pk such that their sum is p1 +…+ pi +…+ pk = 1. The probability of a
sample of n balls taken from a homogeneous population consisting of k subsets of balls of
different colors but with identical diameters is computed with the formula,


$$p(X) = p_1^{n_1} \cdots p_i^{n_i} \cdots p_k^{n_k} \cdot \frac{n!}{n_1! \cdots n_i! \cdots n_k!}$$


The formula is impractical for large sets of events because n! (pronounced n factorial) is
the product of n·(n–1)·(n–2)…3·2·1, which reaches 9.333·10^157 for n=100. In contrast,
the product of all probabilities is extremely small, so that the probability for a particular
event is the product of two terms, the first of which tends toward the infinitesimal whereas
the second tends toward infinity. This is why confidence limits for the central value of a
set of measured values, determined in samples selected from a multinomial distribution
that consists of particles of uniform size and different compositions, are impractical to compute.

A multinomial population that consists of a homogeneous set of black, white and red
balls of the same diameter in equal proportions is a useful stochastic system to bridge the
gap between sampling theory and sampling practice. How cumbersome it is to compute
the probability of selecting a sample of n balls with ni balls of each of three colors is
obvious upon realizing that the probability to select a sample of 300 balls with exactly
100 of each color is p(X) = (1÷3)^300 ·300! ÷ [100!·100!·100!]. Surely, Stirling’s formula
falters and spreadsheet software sputters when a much larger number of terms is required
to compute meaningful confidence limits. The multinomial distribution is interesting
from a theoretical perspective but it has found little application in sampling practices for
materials in bulk.
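Spreadsheets overflow on 300!, but arbitrary-precision integers evaluate the probability exactly. A sketch of the computation described above in Python:

```python
from fractions import Fraction
from math import factorial

# P(exactly 100 balls of each of three colors in a sample of 300)
# = (1/3)^300 * 300! / (100! * 100! * 100!)
coefficient = factorial(300) // (factorial(100) ** 3)   # exact integer
p = Fraction(1, 3) ** 300 * coefficient

print(float(p))        # ≈ 0.0028, i.e. about 0.3%
```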




2.3.2 Binomial distribution

For k=2, the multinomial distribution becomes the binomial distribution with its
multitude of practical applications in statistical quality control. The binomial distribution
also forms the basis for Gy’s ubiquitous but simplistic and contentious sampling constant.
The binomial or Bernoulli distribution applies to sample spaces with a pair of mutually
exclusive outcomes such as heads or tails, black or white, accept or reject, on and off, and
so on. It is not surprising, then, that the binomial distribution has found wide application
in science and engineering and provides a sound basis to statistical process control (SPC)
and statistical quality control (SQC).

The Bernoulli distribution is based on the terms of the binomial expansion, where p and q
are complementary probabilities for Event P and Event Q, and n is the total number of
events. The following formula defines all possible terms for this expansion,

$$(p + q)^n = 1$$

                            where: p = probability for Event P
                                   q = probability for Event Q
                                   n = number of outcomes

The probabilities of p and q are complementary. For example, the probability for heads
and tails are p=0.5 and q=1–p=0.5 for an unbiased coin. The probability for rolling a
particular side with an unweighted die is p=0.167, which implies that the complementary
probability for not rolling that side is q=1–0.167=0.833. Multiple coins and dice give the
terms of the binomial expansion with coefficients that display Pascal’s triangle. Each
term of the binomial expansion is obtained with the following formula,

$$P(X) = \binom{n}{k} \cdot p^k \cdot q^{n-k}$$

                          where: P(X) = probability for Event X
                                   p = probability for Event P
                                   q = probability for Event Q
                                   k = number of same outcomes
                                   n = total number of outcomes

The most useful property of the binomial distribution is that a sample of n balls selected
from a homogeneous population of black and white balls with identical diameters has a
sample variance of var(x)=n·p·q. For example, a sample of 1,000 balls selected from a
population with 10% black and 90% white balls has a sample variance of
var(x)=1,000·0.1·0.9=90.

This variance applies to 100 black balls and 900 white balls alike. Therefore, it gives a
95% confidence interval of 95% CI = 1.96·√90 ≈ ±19 black balls for a symmetric 95%
confidence range with a lower limit of 95% CRL ≈ 100–19 ≈ 81 black balls and an upper
limit of 95% CRU ≈ 100 + 19 ≈ 119 black balls. The symmetric 95% confidence range
for 900 white balls has a lower limit of 95% CRL ≈ 900–19 ≈ 881, and an upper limit of
95% CRU ≈ 900 + 19 ≈ 919. Logically, the lower limit of 95% CRL ≈ 81 black balls and
the upper limit of 95% CRU ≈ 919 white balls coincide and add up to the entire sample of
1,000 balls.
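A short sketch of this confidence-range arithmetic, under the stated assumption of n=1,000 balls with p=0.1:

```python
import math

n, p = 1_000, 0.1
q = 1 - p

variance = n * p * q                 # 90
ci_95 = 1.96 * math.sqrt(variance)   # ≈ ±18.6, rounded to ±19 in the text

black = n * p                        # expected 100 black balls
print(f"var(x) = {variance:.0f}")
print(f"95% CR for black balls: {black - ci_95:.0f} to {black + ci_95:.0f}")
```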

In sampling practice, a large set of particles may be divided into two subsets, one of
which consists of particles with some desirable characteristic whereas the other consists
of particles without that characteristic. This simplistic approach formed the basis for Gy’s
sampling theory and his ubiquitous sampling constant.

The variance formula for the binomial distribution in particular has been widely used and
abused to bridge the gap between sampling theory with its homogeneous populations and
sampling practice with its heterogeneous sampling units and sample spaces. A score of
authors such as Vezin, Brunton, Richards, Demond, Halferdahl and Gy have applied all
sorts of factors to allow for departure from the binomial distribution in sampling practice.
Dr Jan Visman, in his 1947 PhD thesis, described a scientifically sound method based on
the additive property of variances in a measurement hierarchy. Visman’s work underpins
practical applications such as the interleaved sampling protocol.

Visman’s sampling theory and its applications will be juxtaposed against Gy’s sampling
theory. Visman showed that the sampling variance is the sum of the composition variance
and the distribution variance. The composition variance is a measure for the variability
between particles within primary increments, and the distribution variance is a measure
for the variability between primary increments within sampling units. Visman’s sampling
theory was implemented in ASTM D2234–Standard Practice for Collection of a Gross
Sample of Coal. Originally published in 1963, it was the first internationally recognized
standard to target a precision estimate of ± 10% for dry ash content.

Visman’s sampling theory gave impetus to the development of the interleaved sampling
protocol, a straightforward procedure that provides precision estimates for central values
of stochastic variables in sampling units and sample spaces at the lowest possible cost.
The interleaved sampling protocol is particularly useful at the bulk sampling stage in
mineral exploration. Interleaved samples give unbiased estimates for intrinsic variances
of stochastic variables in sample spaces and sampling units alike. Interleaved bulk
samples combine logically with testing for spatial dependence between rounds in adits,
drifts, pits, or trenches, and with charting of sampling variograms.




2.3.3 Poisson distribution

The Poisson distribution approximates the binomial or Bernoulli distribution when p→0
and n→∞ such that their product np is constant. The Poisson distribution is named after
Siméon-Denis Poisson (1781-1840). Its properties derive from the sum of an infinite set
of terms, a derivation traceable to The Doctrine of Chances by Abraham de Moivre
(1667-1754). De Moivre’s work in algebra, trigonometry, and probability theory deals
with the relation between the probability of a particular event and the frequency of its
occurrence as the sum of an infinite set of terms,

$$\frac{e^{-m} m^0}{0!} + \frac{e^{-m} m^1}{1!} + \frac{e^{-m} m^2}{2!} + \frac{e^{-m} m^3}{3!} + \cdots$$

Since e, the base of the natural logarithm, is the sum of these terms:

$$e = \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots$$

and since e^m is the sum of the following set:

$$e^m = \frac{m^0}{0!} + \frac{m^1}{1!} + \frac{m^2}{2!} + \frac{m^3}{3!} + \cdots$$

the sum of the infinite set of terms is:

$$\sum_{r=0}^{\infty} \frac{e^{-m} m^r}{r!} = 1$$


Consecutive terms of the Poisson distribution correspond to those of the Bernoulli or
binomial distribution as n converges on infinity, which implies that,

$$\frac{e^{-m} m^r}{r!} \approx \binom{n}{r} p^r (1-p)^{n-r}$$


The equivalence of the Poisson distribution and the Bernoulli distribution is fundamental
in sampling practice. Not only do the Poisson and Bernoulli distributions converge but
both also converge on the Gaussian or normal distribution. This convergence lends
credence to the validity of the Central Limit Theorem.

Poisson’s expected value of m is equivalent to Bernoulli’s population mean of np, both of
which, in turn, are equivalent to μ, the population mean as defined for the Gaussian or
normal distribution. An interesting property of the Poisson distribution is that the
expected value of m is also equivalent to σ2, the population variance as defined for the
Gaussian or normal distribution. For example, a micro-diamond count of m=15 gives a
95% confidence interval of 95% CI ≈ 1.96·σ ≈ 1.96·√m ≈ 1.96·√15 ≈ ±8 counts, and a
symmetric 95% confidence range with a lower limit of 95% CRL ≈ 15–8 ≈ 7 counts, and
an upper limit of 95% CRU ≈ 15+8 ≈ 23 counts.
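A sketch of this equal-mean-and-variance property for the micro-diamond example:

```python
import math

m = 15                               # observed count; Poisson mean and variance
ci_95 = 1.96 * math.sqrt(m)          # σ = √m for a Poisson variable

print(f"95% CI = ±{ci_95:.1f} counts")                  # ≈ ±7.6, rounded to ±8
print(f"95% CR = {m - ci_95:.0f} to {m + ci_95:.0f}")   # ≈ 7 to 23 counts
```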




This property of the Poisson distribution explains the coarse particle effect that prevails
when small test portions of poorly comminuted test samples are assayed for copper, lead,
zinc, and so on. It also explains the nugget effect that occurs when test samples contain
native metals, and test portions are fire-assayed for gold and silver. For example, a test
portion of 15 g from a 500 g test sample with 25 particles of free gold is expected to
contain m=0.75 particle. Similarly, test portions of 30 g and 60 g taken from the same
test sample are expected to contain m=1.5 and m=3 particles of gold respectively.

Figure 2.3 shows a bar chart with the probabilities as relative percentages for a predicted
range of gold particles from k=0 to k=10.

                         Figure 2.3 Probabilities in relative percent

[Bar chart: relative probabilities in percent for k=0 to 10 gold particles at expected values of m=0.75, m=1.5, and m=3.]


The probability to select a 15 g test portion without a single gold particle is close to 50
%rel, which explains why this bar chart shows a higher degree of positive skewness than
those for m2=1.5 and m3=3. Table 2.2 gives the relative and cumulative percentages for
predicted gold particles based on expected values of m1=0.75, m2=1.5, and m3=3.

         Table 2.2 Relative and cumulative percentages for gold particles
      ————–––––––––––—————————————————————
      Predicted 60 g test portion    30 g test portion     15 g test portion
      Particles  %rel      %cum      %rel       %cum       %rel       %cum
      ————–––––––––––—————————————————————
        0          5.0       5.0     22.3       22.3       47.2        47.2
        1        14.9       19.9     33.5       55.8       35.4        82.7
        2        22.4       42.3     25.1       80.9       13.3        95.9
        3        22.4       64.7     12.6       93.4        3.3        99.3
        4        16.8       81.5       4.7      98.1        0.6        99.9
        5        10.1       91.6       1.4      99.6        0.1       100.0
        6          5.0      96.6       0.4      99.9
        7          2.2      98.8       0.1     100.0
        8          0.8      99.6
        9          0.3      99.9
       10          0.1     100.0
      ————–––––––––––—————————————————————
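The percentages in Table 2.2 follow directly from the Poisson formula; a sketch that reproduces them:

```python
import math

def poisson_pmf(m: float, k: int) -> float:
    """P(k particles) for a Poisson distribution with mean m."""
    return math.exp(-m) * m ** k / math.factorial(k)

print("k    60 g (m=3)   30 g (m=1.5)   15 g (m=0.75)")
for k in range(11):
    row = [100 * poisson_pmf(m, k) for m in (3.0, 1.5, 0.75)]
    print(f"{k:<3}  {row[0]:9.1f}   {row[1]:11.1f}   {row[2]:12.1f}")
```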


Table 2.2 explains why fire assayers walk a fine line between the mass of a test portion
and the capacity of a crucible. So-called 30-gram crucibles are suitable for assaying test
portions with a mass of about 30 g. In the jargon of fire assayers, test portions of 29.167 g
are commonly referred to as “one (1) assay ton.”

Bre-X’s bogus gold grades were initially made up by enriching core samples with gold
filings, and fire assaying test portions of 100–150 mesh core samples. Perhaps ironically,
Bre-X’s quality control was based on fire assaying a second test portion of every tenth
pulverized core sample. Due to the single particle or nugget effect, the analytical
precision for fire assaying duplicate test portions was bound to exceed a coefficient of
variation of 50%. Because gold filings in pulverized core samples are easy to detect and
identify, the problem was not only how to enrich core samples but also how to beat the
nugget effect and improve the analytical precision for gold.

The first step was to replace gold filings with placer gold, diluted with pulverized ore to
obtain low, medium and high concentrations. The second step was to replace fire assays
of small test portions of pulverized and salted core samples with cyanide leaching 750 g
test portions of crushed and salted core samples. In spite of that, the analytical precision
for cyanide leaching large test portions of placer-gold-enriched crushed core samples was
still extremely poor. Pulverized ore with variable placer gold concentrations was added to
crushed core samples in a haphazard manner, so much so that only a few sets of in
situ ordered bogus gold grades displayed spatial dependence. Journel’s postulate that
spatial dependence may be assumed played a crucial role in converting Bre-X’s bogus
grades and Busang’s barren rock into a massive phantom gold resource.

Figure 2.4 shows how the binomial or Bernoulli distribution with a variance of n·p·q=
100·0.05·0.95=4.75 and the Poisson distribution with a variance of m=4.75 converge
because of the equivalence of e^(–m)·m^r÷r! ≈ Crn·p^r·(1–p)^(n–r).




           Figure 2.4 Probabilities for Bernoulli and Poisson distributions

[Bar chart: relative probabilities of 0 to 13 predicted numbers for the Bernoulli (n=100, p=0.05) and Poisson (m=4.75) distributions.]


The closeness of agreement between bar charts for the Bernoulli distribution and for the
Poisson distribution provides heuristic proof that these probability distributions do indeed
converge. Convergence accelerates for a variance of n·p·q=100·0.02·0.98=1.96 but even
more so for a variance of n·p·q=1,000·0.02·0.98=19.6.


Figure 2.4 and Table 2.3 are implemented in the same spreadsheet template. The variance
of n·p·q=100·0.05·0.95=4.75 may be replaced with n·p·q=169·0.05·0.95=8.03 but not
with n·p·q=170·0.05·0.95=8.08 because the spreadsheet’s factorial function defaults to
#NUM! for values larger than about 10^307.

              Table 2.3 Comparing Bernoulli and Poisson distributions
              ————–––––––––––———————————————
              Predicted   Bernoulli  Poisson Bernoulli        Poisson
              Number       %rel       %rel        %cum         %cum
              ————–––––––––––———————————————
                  0         0.6         0.9          0.6         0.9
                  1         3.1         4.1          3.7         5.0
                  2         8.1         9.8         11.8        14.7
                  3        14.0        15.5         25.8        30.2
                  4        17.8        18.4         43.6        48.5
                  5        18.0        17.4         61.6        66.0
                  6        15.0        13.8         76.6        79.8
                  7        10.6         9.4         87.2        89.1
                  8         6.5         5.6         93.7        94.7
                  9         3.5         2.9         97.2        97.6
                 10         1.7         1.4         98.9        99.0
                 11         0.7         0.6         99.6        99.6
                 12         0.3         0.2         99.9        99.9
                 13         0.1         0.1       100.0        100.0
              ————–––––––––––———————————————
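Exact integer arithmetic sidesteps the overflow that trips up spreadsheets; a sketch that reproduces Table 2.3 and works for much larger n:

```python
import math

n, p = 100, 0.05
m = n * p * (1 - p)                  # Poisson mean matched to the variance 4.75

print("k   Bernoulli %rel   Poisson %rel")
for k in range(14):
    # math.comb is exact, so large n does not overflow the binomial term.
    bernoulli = math.comb(n, k) * p ** k * (1 - p) ** (n - k)
    poisson = math.exp(-m) * m ** k / math.factorial(k)
    print(f"{k:<3}  {100 * bernoulli:12.1f}   {100 * poisson:11.1f}")
```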

Given that both probability distributions display the typical bell-shaped curve of the
Gaussian or normal distribution, it stands to reason that population parameters converge.
This is the reason why the variances of samples selected from homogeneous populations
that conform to any of these distributions converge on the Central Limit Theorem. This
convergence provides a sound statistical basis to sampling practice, a subject that will be
discussed in exhaustive detail in the chapter on sampling practice.




2.3.4 Gaussian or normal distribution

The Gaussian or normal distribution is continuous, nonuniform, and quintessential in the
application of statistical methods. The adjective normal has no clinical implications but
reflects that many populations are distributed in this manner. De Moivre’s work underpins the
Gaussian distribution as much as it does the Poisson distribution. It was Carl Friedrich
Gauss (1777-1855) who discovered that a wide range of data sets in the universe give
curves of comparable shape when plotted in graphs. In fact, the values of the normal
distribution display typical bell-shaped curves that define confidence intervals and ranges
for central values of sets of measured values at different probability levels.
$$y = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Its formula shows that y is a function of μ, the population mean, σ², the population
variance, and x, a measured value of the stochastic variable. The presence of e, the base
of natural logarithms, and of π, the omnipresent constant, reflects De Moivre’s work.

Given that the area under the normal curve is unity or 100% when integrated from x=–∞
to x=+∞, it follows that the area between x1 and x2 is equal to the following integral,
$$P(x_1 \le x \le x_2) = \frac{1}{\sigma\sqrt{2\pi}} \int_{x_1}^{x_2} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx$$

Most textbooks on statistics contain tables that give F(z), the area under the normal curve,
over a range from z=0.00 to z=3.49. Statistical software and spreadsheet software give
the same z-values as are listed in handbooks of statistical tables.
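F(z) need not come from printed tables; it follows from the error function. A sketch for the standard normal curve:

```python
import math

def F(z: float) -> float:
    """Cumulative area under the standard normal curve from -inf to z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

for z in (1.96, 2.58, 3.29):
    alpha = 2 * (1 - F(z))           # two-sided level of significance
    print(f"z = {z}: F(z) = {F(z):.5f}, confidence = {100 * (1 - alpha):.1f}%")
```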

Figure 2.5 shows three different population variances about the same population mean.
The highest variance gives the lowest curve (red), and the lowest variance gives the
tallest curve (green).

              Figure 2.5 Different population variances and same mean




The question of whether these population variances differ significantly is irrelevant. After
all, the observed F-value between the highest and lowest population variances in Figure
2.5 invariably exceeds F0.05;∞;∞=F0.01;∞;∞=1.00 at 5% and 1% probability. In other
words, numerically different population variances do indeed differ significantly because
df=∞ for each σ² by definition.

Unlike Figure 2.5, in which different population variances are plotted about the same
population mean, Figure 2.6 shows different population means with the same population
variance.

              Figure 2.6 Different means and same population variance




The question of whether population means differ significantly is equally irrelevant. After
all, variances of population means converge on zero because df=∞ for σ² by definition.
Hence, numerically different population means do indeed differ significantly.

Figure 2.7 depicts the fundamental relationship between the population variance (red) of
a homogeneous sampling unit and the variance of a sample (green) that consists of a set
of fifteen (15) primary increments taken from this sampling unit.

                 Figure 2.7 Population variance and sample variance




The Central Limit Theorem defines the relationship between the population variance of a
homogeneous population and the variance of a sample selected from that population. This


theorem is pivotal in the transition from sampling theory with infinite degrees of freedom
to sampling practice where degrees of freedom are finite. In fact, the number of degrees
of freedom is either a positive integer for sets of measured values with constant weights
or a positive irrational for sets of measured values with variable weights.
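The relationship is easy to verify numerically. Below is a minimal Python sketch, not part of the original text: it draws repeated samples of n=15 increments from a standard uniform population (variance 1/12) and checks that the variance of the sample means approaches var(x)/n; the seed and trial count are arbitrary choices.

```python
import random

random.seed(42)
n, trials = 15, 10_000        # 15 increments per sample, cf. Figure 2.7; trials arbitrary
pop_var = 1.0 / 12.0          # population variance of the standard uniform distribution

# Variance of the means of repeated samples of n increments
means = [sum(random.random() for _ in range(n)) / n for _ in range(trials)]
grand = sum(means) / trials
var_of_means = sum((m - grand) ** 2 for m in means) / (trials - 1)

print(f"observed var of sample means: {var_of_means:.5f}")
print(f"Central Limit Theorem var/n : {pop_var / n:.5f}")
```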

Inflection points mark where the bell-shaped curve changes from convex about the center to concave toward –∞ and +∞; for the normal curve they occur at x=μ–σ and x=μ+σ. By definition, the area under the bell-shaped curve between –∞ and +∞ is unity or 100%. These inflection points were already defined in De Moivre's work.

Table 2.4 gives the most relevant symmetric confidence intervals and ranges as a function
of μ, the population mean, and σ, its standard deviation, together with z-values that derive
from the Gaussian distribution, and with symbols that will be used throughout the text.

              Table 2.4 Symmetric confidence intervals and ranges
     –––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
      Confidence interval     range: lower limit       range: upper limit
     –––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
      95% CI = ±1.96·σ     95% CRL = μ–1.96·σ       95% CRU = μ+1.96·σ
      99% CI = ±2.58·σ     99% CRL = μ–2.58·σ       99% CRU = μ+2.58·σ
      99.9% CI = ±3.29·σ   99.9% CRL = μ–3.29·σ     99.9% CRU = μ+3.29·σ
     –––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––

Tables that give areas under the normal curve as a function of z-values define the level of
significance or risk associated with a particular inference. The value of z=1.96 gives 95%
confidence intervals and symmetric 95% confidence ranges whereas values of z=2.58 and
z=3.29 give confidence intervals and symmetric confidence ranges at 99% and 99.9%
probability respectively. The level of significance or risk is defined as α=2[1–F(z)], in
which F(z) is the area under the normal curve.

A homogeneous population with μ=50 and σ²=25 would give a 95% confidence interval
of 95% CI=z0.975·σ=1.96·5=±9.8, or 95% CI=±9.8·100/50=±19.6%rel. Its symmetric
95% confidence range would have a lower limit of 95% CRL=μ–95% CI=50–9.8=40.2
and an upper limit of 95% CRU= μ+95% CI=50+9.8=59.8. Therefore, a measured value
of 62.1, determined in a sample taken from this population, may be rejected as a
statistical outlier because it exceeds 95% CRU=59.8. In fact, 95 out of 100 measured
values (19 out of 20) are expected to fall within this symmetric 95% confidence range
from 40.2 to 59.8 whereas five out of 100 measured values (one out of 20) are expected
to be either lower than 95% CRL=40.2 or higher than 95% CRU=59.8.

A 99% confidence interval of 99% CI=z0.995·σ=2.58·5=±12.9 would give a symmetric
confidence range with a lower limit of 99% CRL=50–12.9=37.1 and an upper limit of
99% CRU=50+12.9=62.9. In the same way, a 99.9% confidence interval of 99.9% CI=
z0.9995·σ=3.29·5=±16.4 would give a symmetric confidence range with a lower limit of
99.9% CRL=50–16.4=33.6 and an upper limit of 99.9% CRU=50+16.4=66.4. In this
case, the measured value of 62.1 should not be rejected as a statistical outlier because it
falls between lower and upper limits of symmetric 99% and 99.9% confidence ranges.
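The worked example is easily recomputed. A minimal Python sketch, using only the numbers given above:

```python
# Symmetric confidence ranges for mu=50 and sigma^2=25, and the outlier
# check for the measured value of 62.1 discussed in the text.
mu, sigma, measured = 50.0, 25.0 ** 0.5, 62.1
for level, z in ((95, 1.96), (99, 2.58), (99.9, 3.29)):
    ci = z * sigma                    # confidence interval about mu
    crl, cru = mu - ci, mu + ci       # lower and upper limits of the range
    verdict = "outlier" if not (crl <= measured <= cru) else "not an outlier"
    print(f"{level}% CR: {crl:.2f} to {cru:.2f} -> {measured} is {verdict}")
```

At the 95% level the value of 62.1 falls outside the range, and at the 99% and 99.9% levels it falls inside, exactly as argued above.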



Lower and upper limits of asymmetric confidence ranges are mutually exclusive because
either the lower limits or the upper limits are valid. Lower limits are effective measures
for the precision of contents and grades of ore reserves. By contrast, upper limits are
practical measures for the precision of trace metals in contaminated sites. Table 2.5 lists
applicable z-values and symbols that will be used throughout this text.

                     Table 2.5 Asymmetric confidence ranges
              –––––––––––––––––––––––––––––––––––––––––––––––––
                 ACRL => lower limit        ACRU => upper limit
              –––––––––––––––––––––––––––––––––––––––––––––––––
              95% ACRL = μ–1.65·σ         95% ACRU = μ+1.65·σ
              99% ACRL = μ–2.33·σ         99% ACRU = μ+2.33·σ
              99.9% ACRL = μ–3.09·σ       99.9% ACRU = μ+3.09·σ
              –––––––––––––––––––––––––––––––––––––––––––––––––

For simplicity, the same population parameters of μ=50 and σ²=25 are used to show how
to derive lower and upper limits of asymmetric confidence ranges. The lower limit of the
asymmetric 95% confidence range is 95% ACRL=μ–z0.95·σ=50–1.65·5=50–8.2=41.8.
Hence, five out of 100 (one out of 20) measured values, determined in samples selected from this population, are expected to be less than 95% ACRL=41.8. Similarly, one out of
100 measured values is expected to be less than 99% ACRL=μ–z0.99·σ=50–2.33·5=38.4
whereas only one out of 1,000 is expected to be less than 99.9% ACRL=μ–z0.999·σ=50–
3.09·5=34.6.

The upper limit of the asymmetric 95% confidence range is 95% ACRU=μ+z0.95·σ=50+1.65·5=50+8.2=58.2. Thus, five out of 100 (one out of 20) measured values are expected to exceed 95% ACRU=58.2, one out of 100 measured values is expected to exceed 99% ACRU=μ+z0.99·σ=50+2.33·5=61.6, and only one out of 1,000 is expected to exceed 99.9% ACRU=μ+z0.999·σ=50+3.09·5=65.4.
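The one-sided limits of Table 2.5 follow the same pattern; a short sketch with the same illustrative population (the text rounds the printed values to one decimal):

```python
# Asymmetric confidence range limits for mu=50 and sigma=5 using the
# one-sided z-values of Table 2.5.
mu, sigma = 50.0, 5.0
for level, z in ((95, 1.65), (99, 2.33), (99.9, 3.09)):
    print(f"{level}% ACRL = {mu - z * sigma:.2f}   {level}% ACRU = {mu + z * sigma:.2f}")
```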

The properties of the Gaussian or normal distribution are highly relevant in sampling
practice because central values and variances of sets of measured values, determined in
samples taken from multinomial, binomial, or Poisson distributions, converge on those of
the normal distribution as more samples are selected. Table 2.6 gives the relationship
between the population parameters of Bernoulli, Poisson, and Gaussian distributions.

                  Table 2.6 Properties of probability distributions
              –––––––––––––––––––––––––––––––––––––––––––––––––
               Parameter            Bernoulli     Poisson     Gaussian
              –––––––––––––––––––––––––––––––––––––––––––––––––
               Population mean          n·p           m           μ
              Population variance     n·p·q           m           σ2
              –––––––––––––––––––––––––––––––––––––––––––––––––

This convergence of population parameters ensures a smooth transition from sampling
theory and unknown true values to sampling practice and unbiased estimates.



2.3.5 Standard uniform distribution

Randomly distributed values of the standard uniform distribution are useful to simulate
games of chance such as tossing coins or rolling dice. The spreadsheet functions RAND()
in Excel and @RAND in Lotus both generate a single randomly distributed value of the
standard uniform distribution. Excel’s RAND() function is applied in several numerical
examples. Each value simulated with this function is called a Standard Uniform Random
Number with the acronym SURN. Given that all outcomes within the range from zero to unity are equiprobable, it follows that 0≤SURN<1.

Simulating a toss of an unbiased coin is easy. If "head" is assigned to the lower half of the range and "tail" to the upper half, a value of 0≤SURN<0.5 gives "head" whereas a value of 0.5≤SURN<1 gives "tail". In Excel format, =IF(RAND()>0.5,"tail","head"). Simulating a roll with a 6-sided die is also
easy. Because the probability for each face of an unbiased die is p=1/6, multiplying a
SURN by 6, adding 0.5, and rounding the sum to the closest integer gives one of the six
faces of a single die. In Excel format, =ROUND(RAND()*6+0.5,0).
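For readers who prefer a scripting language over a spreadsheet, here is a Python analogue of these Excel formulas; it is a sketch, not part of the text. Python's random.random(), like RAND(), returns a standard uniform random number in [0, 1); the seed is arbitrary.

```python
import random

random.seed(1)
# Coin toss, cf. =IF(RAND()>0.5,"tail","head")
toss = "tail" if random.random() >= 0.5 else "head"
# Roll of a 6-sided die, cf. =ROUND(RAND()*6+0.5,0); int(...)+1 avoids
# Python's round-half-to-even behaviour while giving the same six faces
face = int(random.random() * 6) + 1
print(toss, face)
```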

The RAND() function is validated by simulating a large number of tosses of a coin and
comparing the observed probability with the expected probability of P(head)=P(tail)=
0.5. The function can also be validated by simulating a large number of rolls with a single
die and comparing observed and expected probabilities for each of the six faces. In this
case, the population mean and its variance for the standard uniform distribution verify the
performance of the RAND() function.

The population mean and the population variance of the standard uniform distribution are μ=0.5 and σ²=1÷12=0.0833 for a standard deviation of σ=0.2887 and a coefficient of variation of CV=σ·100÷μ=57.7%rel. The test for bias compares the arithmetic mean of
a sample of 121 SURNs with the population mean of μ=0.5.

                               Table 2.7.1 Test for bias
               ————————————————————————
                Parameter/statistic           Symbol     Value
               ————————————————————————
               Population mean                  μ        0.5
                Sample mean                     x        0.4782
               Difference                       ∆x       0.0218
               Observed t-value                  t       0.785
               Tabulated t-value            t0.05;120    1.980
               Significance                                ns
               ————————————————————————
                 ns not significant

Table 2.7.1 gives the population mean, a sample mean, the difference between these central values, an observed t-value, and the tabulated value of the t-distribution. The observed t-value in this table is t = Δx÷√[var(x)÷n] = 0.0218÷√(0.0934÷121) = 0.785. Observed t-values are not expected to exceed t0.05;120=1.980 of the t-distribution more than 5 times out of 100 clicks of F9.
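A minimal Python sketch of this bias test; the seed is arbitrary, so the simulated statistics differ from those in Table 2.7.1 while the mechanics stay the same.

```python
import random
import statistics

random.seed(7)
n = 121
surns = [random.random() for _ in range(n)]          # sample of 121 SURNs
mean = statistics.fmean(surns)
var = statistics.variance(surns)                     # df = n - 1 = 120
t = abs(mean - 0.5) / (var / n) ** 0.5               # observed t-value
print(f"mean {mean:.4f}  var(x) {var:.4f}  t {t:.3f}  vs t0.05;120 = 1.980")
```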


Fisher's F-test is applied to verify whether a sample variance of var(x)=0.0934 for this set of 121 SURNs is statistically identical to the population variance of σ²=1÷12=0.0833 by comparing the observed value of F=0.0934÷0.0833=1.12 with the tabulated F-value. Given that a simulated sample variance is equally likely to be higher or lower than the population variance of σ²=0.0833, the observed F-value is compared with either F0.05;∞;120=1.23 or F0.05;120;∞=1.25. Table 2.7.2 gives the F-statistics for this test.

                   Table 2.7.2 Test for homogeneity of variances
            —————————————————————––––———
            Parameter/statistic            Symbol                Value
            ——————————————————————————
            Population variance               σ²                0.0833
            Sample variance                 var(x)              0.0934
            Observed F-value                  F                 1.12
            Tabulated F-value            F0.05;∞;120            1.23
            Tabulated F-value            F0.05;120;∞            1.25
            Significance                                         ns
            ——————————————————————————
               ns not significant

This F-test is atypical because the sample variance is equally likely to be lower or higher
than the population variance. The F-test is typically applied to verify spatial dependence
by comparing the observed F-value between the variance of a set and the first variance
term of the ordered set with values of F-distributions at 5% and 1% probability. Fisher’s
F-test is also applied to optimize sampling protocols by partitioning the sum of all
variances in a measurement chain or hierarchy into its components.

The question of whether this ordered set of 121 SURNs displays a significant degree of
spatial dependence or is randomly distributed within the time it took to simulate the set is
solved by comparing the observed F-value between var(x)=0.0934, the variance of the
set, and var1(x)=0.0866, the first variance of the ordered set, with F0.05;120;240=1.31.
Table 2.7.3 shows that the observed value of F=0.0934÷0.0866=1.08 does not exceed this value of the F-distribution. Hence, this temporally ordered set of SURNs is randomly
distributed within the sample space of time required to simulate the set.

                        Table 2.7.3 Test for spatial dependence
               ——————————————————————————
               Parameter/Statistic              Symbol           Value
               ——————————————————————————
               Variance of set                  var(x)          0.0934
               First variance term              var1(x)         0.0866
               Observed F-value                    F             1.08
               Tabulated F-value            F0.05;120;240        1.31
               Tabulated F-value            F0.05;240;120        1.33
                Significance                                      ns
               ——————————————————————————
                  ns not significant
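The test is straightforward to reproduce. The sketch below assumes that the first variance term of the ordered set is computed as half the mean squared difference between successive values, var1(x)=Σ(xi–xi+1)²÷[2(n–1)]; this formula is an inference consistent with the df=2(n–1)=240 counted later in this section, not a quotation from the text.

```python
import random
import statistics

random.seed(7)
x = [random.random() for _ in range(121)]        # temporally ordered set of SURNs
var_set = statistics.variance(x)                 # variance of the set, df = 120

# First variance term of the ordered set, df = 2(n - 1) = 240
var1 = sum((a - b) ** 2 for a, b in zip(x, x[1:])) / (2 * (len(x) - 1))

F = max(var_set, var1) / min(var_set, var1)      # observed F-value
print(f"var(x) {var_set:.4f}  var1(x) {var1:.4f}  F {F:.2f}  vs ~1.31/1.33")
```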



Repeatedly clicking F9 in the spreadsheet template for Table 2.7.3 shows that observed F-values are equally likely to exceed unity or fall below it. F0.05;120;240=1.31 and F0.05;240;120=1.33 differ only marginally, which is why most sets of SURNs test as randomly distributed irrespective of which of these values applies. If var(x), the variance of the set, is lower than var1(x), the first variance term of the ordered set, then FALSE is printed.

Covariances, unlike the variance terms for ordered sets, need not be computed to quantify
degrees of spatial dependence in sample spaces. Testing for spatial dependence between
in situ and temporally ordered sets of measured values precedes the design of sampling
variograms. In fact, a sampling variogram is more than a visual presentation of Fisher’s
F-test because it shows not only where the degree of spatial dependence is statistically significant but
also where it dissipates into randomness. Covariances may play a role when quantifying
associative dependence between logically paired data.




2.3.6 Standard normal distribution

Randomly distributed values of the standard normal distribution are useful to simulate
more complex stochastic systems than games of chance. The RAND() function gives a single SURN whereas the sum of 12 RAND()s minus 6 gives a close approximation to a randomly distributed value of the standard normal distribution. Such a value is called a Standard Normal Random Number, and its acronym is SNRN. The standard normal distribution has a population mean of μ=0 and a population variance of σ²=1. Because σ²=1÷12 for the standard uniform distribution, and σ²=1 for the standard normal distribution, a set of 12 SURNs in
each of 121 cells of a spreadsheet eliminates the need for a multiplication factor.
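A minimal Python sketch of this recipe, with an arbitrary seed:

```python
import random
import statistics

random.seed(7)
# Each SNRN is the sum of 12 SURNs minus 6: mean 0, variance 12*(1/12) = 1
snrns = [sum(random.random() for _ in range(12)) - 6 for _ in range(121)]
print(f"mean {statistics.fmean(snrns):.4f}  var {statistics.variance(snrns):.4f}")
```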

Table 2.8.1 lists the population mean of μ=0, a sample mean of x̄=0.0515, and the observed and tabulated t-values. This bias test verifies whether the population mean of μ=0 and the sample mean of x̄=0.0515 are statistically identical. The observed t-value of t=[x̄–μ]÷√[var(x)÷n]=0.0515÷√(1.1255÷121)=0.534 is below t0.05;120=1.980. Hence, this sample mean of x̄=0.0515 is statistically identical to the population mean of μ=0.

                               Table 2.8.1 Test for bias
               ————————————————————————
                Parameter/Statistic          Symbol      Value
               ————————————————————————
               Population mean                  μ        0.0000
               Sample mean                      x        0.0515
                Observed t-value                 t       0.534
               Tabulated t-value            t0.05;120    1.980
               Significance                               ns
               ————————————————————————
                 ns not significant

Table 2.8.2 gives var(x)=1.1255, the variance of the set of 121 SNRNs on which the above t-test is based. Fisher's F-test proves that this sample variance of var(x)=1.1255 is statistically identical to the population variance of σ²=1.0 because the observed value of F=1.1255÷1.0=1.13 does not exceed the tabulated values of F0.05;∞;120=1.23 and F0.05;120;∞=1.25 in Table 2.8.2.

                    Table 2.8.2 Test for homogeneity of variances
            ——————————————————————————
            Parameter/Statistic              Symbol               Value
            ——————————————————————————
            Population variance                σ²                1.0
            Sample variance                  var(x)              1.1255
            Observed F-value                   F                 1.13
            Tabulated F-value             F0.05;∞;120            1.23
            Tabulated F-value             F0.05;120;∞            1.25
            Significance                                          ns
            ——————————————————————————
               ns not significant




Logically, Fisher's F-test for spatial dependence is based on comparing the observed value of F=var(x)/var1(x) with F0.05;120;240=1.31. Predictably, about 50% of observed values take the form F=var1(x)/var(x) and are compared with F0.05;240;120=1.33 because the first variance term of the ordered set is equally likely to be lower or higher than the variance of the set.

In this case, the question of whether the ordered set of 121 SNRNs displays a significant
degree of spatial dependence or is randomly distributed in this temporally ordered set is
solved by comparing the observed F-value between var(x)=1.1255, the variance of the
set, and var1(x)=1.3679, the first variance of the ordered set, with F0.05;240;120=1.33 at
95% probability. The statistics in Table 2.8.3 show that F=1.3679÷1.1255=1.22 is below
F0.05;240;120=1.33. Hence, this temporally ordered set is randomly distributed within
the time interval required to simulate 121 SNRNs.

                        Table 2.8.3 Test for spatial dependence
              —————————–————————————————
              Parameter/Statistic               Symbol          Value
               —————————————————————————
              Variance of set                   var(x)          1.1255
              First variance term               var1(x)         1.3679
              Observed F-value                     F             1.22
              Tabulated F-value             F0.05;120;240        1.31
              Tabulated F-value             F0.05;240;120        1.33
              Significance                                        ns
              —————————————————————————
                 ns not significant

Repeatedly clicking F9 in the spreadsheet template for Table 2.8.3 demonstrates that the observed F-value is equally likely to exceed or fall below unity.

Student’s t-test for bias and Fisher’s F-test for compatibility of pairs of variances are
based on comparing observed values with tabulated values, which are listed as a function
of the level of probability and of the number of degrees of freedom. Evidently, degrees of
freedom are indispensable when verifying whether simulated sets of SURNs derive from a homogeneous standard uniform distribution, and whether simulated sets of SNRNs derive from a homogeneous standard normal distribution, by comparing sample statistics
with a priori population parameters. It is straightforward to simulate sets of n=121
SURNs or SNRNs, count df=n–1=120 degrees of freedom for the set and dfo=2(n–1)=240
for the ordered set, and look up tabulated values as a function of degrees of freedom.

Simulation models are effective tools to study how the variances of sets of stochastic variables interact in complex functions. A typical example is a mineral processing plant where the wet masses and metal grades of mill feed, concentrate, and tailings, together with the variances of these stochastic variables, give confidence intervals and ranges for the recovery by simulation. A more complex alternative would be to study how partial derivatives toward all variables interact and define deterministic confidence intervals and ranges for some functionally dependent value of the stochastic system.



Fisher’s F-test for spatial dependence can be applied to any ordered set of measured
values, determined in primary increments taken from a sampling unit at intervals of
constant time or mass. The test can also be applied to any in situ ordered set of measured
values, determined in samples selected at positions with different coordinates in a sample
space. In each case, the objective is to verify whether an ordered set of measured values
displays a statistically significant degree of spatial dependence, or is randomly distributed
within a sample space or a sampling unit.

Stanford’s Journel referred in 1992 to some kind of decision that spatial dependence may
be assumed unless proven otherwise. Journel did not disclose who decided that spatial
dependence might be assumed. Nor did he show how to prove otherwise. The question is
then why he was troubled when Fisher’s F-test proved a statistically significant degree of
spatial dependence between gold grades of a set of ordered rounds in a drift. It does not
make any scientific sense to assume spatial dependence between measured values in
ordered sets. On the contrary, it would make a great deal of sense to figure out where
spatial dependence in sample spaces dissipates into randomness.




2.4 Variance of general function

The population variance of a general function is defined by the dependent variable, some
set of n independent variables, and the variances of these variables. This formula finds its
origin in calculus and probability theory. It shows that the population variance of a
general function is the sum of n variance terms, each of which is the squared partial
derivative toward an independent variable multiplied by its variance.

$$ \sigma_y^2 = \left(\frac{\partial y}{\partial x_1}\right)^{2}\sigma_{x_1}^2 + \left(\frac{\partial y}{\partial x_2}\right)^{2}\sigma_{x_2}^2 + \cdots + \left(\frac{\partial y}{\partial x_n}\right)^{2}\sigma_{x_n}^2 $$




This formula can be used to derive the variance of the arithmetic mean. It is simple to
prove that the variances of area-, count-, density-, distance-, length-, mass-, and volume-
weighted averages converge on the variance of the arithmetic mean as all the variable
weights converge on the same constant weight. The length-weighted average was the first
central value to lose its variance, either at the Witwatersrand gold reef complex or at
Centre de Géostatistique, before it became a kriged estimate or kriged estimator.

The above formula does not define covariance terms and only applies to sets of variables
that do not display significant degrees of associative dependence. For example, wet
masses, moisture contents, and metal grades of mined ores and mineral concentrates are
not expected to display associative dependence. In contrast, in situ densities and grades of
massive sulfides invariably display a significant degree of associative dependence. Yet,
associative dependence between in situ densities and grades can be taken into account in
an effective and intuitive manner without working with covariance terms.

The mass of metal contained in a quantity of mined ore, mill feed, or concentrate, is a
function of its wet mass, moisture factor and grade factor such that Me = Mw ⋅ MF ⋅ GF .
The following formula gives the variance of the mass of contained metal.

$$ \mathrm{var}(Me) = Me^2\left[\frac{\mathrm{var}(Mw)}{Mw^2} + \frac{\mathrm{var}(MF)}{MF^2} + \frac{\mathrm{var}(GF)}{GF^2}\right] $$

                 where: var(Me)       = variance of mass of contained metal
                        var(Mw)       = variance of wet mass
                        var(MF)       = variance of moisture factor
                        var(GF)       = variance of grade factor

This formula forms part of ISO DIS 12744–Sampling Procedures for Determination of
Metal and Moisture Content. ISO Technical Committee 183 approved the interleaved
sampling protocol because it gives reliable confidence limits for contents and grades of
coals, ores, mineral concentrates, smelter residues, recycled materials, and scores of
others, at the lowest possible cost. It does so because each pair of interleaved primary
samples gives a single degree of freedom. Interleaved sampling protocols are also
approved by ISO Technical Committee 69–Applications of Statistical Methods.
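A minimal Python sketch of the variance formula above; the wet mass, factors, and variances are illustrative assumptions, not values taken from ISO DIS 12744.

```python
def var_metal_mass(Mw, var_Mw, MF, var_MF, GF, var_GF):
    """Mass of contained metal Me = Mw*MF*GF and its variance by propagation."""
    Me = Mw * MF * GF
    rel = var_Mw / Mw**2 + var_MF / MF**2 + var_GF / GF**2
    return Me, Me**2 * rel

# Illustrative numbers only: 10,000 t wet mass, 95% moisture factor, 25% grade
Me, var_Me = var_metal_mass(10_000.0, 2_500.0, 0.95, 1e-5, 0.25, 1e-5)
print(f"Me = {Me:.0f} t  95% CI = +/-{1.96 * var_Me ** 0.5:.1f} t")
```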


A homologue of the above formula can be applied to compute confidence limits for metal
contents and grades of ore reserves. This formula has not yet been approved for reserve
estimation. The methodology is based on the fact that Me, the mass of metal contained in
a volume of in situ ore, is a function of a volume in m³, an in situ density in mt/m³, and a
grade factor such that Me = V ⋅ ID ⋅ GF . The following formula gives the variance of the
mass of contained metal or metal content of a volume of in situ ore,

$$ \mathrm{var}(Me) = Me^2\left[\frac{\mathrm{var}(V)}{V^2} + \frac{\mathrm{var}(ID)}{ID^2} + \frac{\mathrm{var}(GF)}{GF^2}\right] $$

                  where: var(Me)   = variance of mass of contained metal
                         var(V)    = variance of volume
                         var(ID)   = variance of in situ density
                         var(GF)   = variance of grade factor

The above formula provides unbiased confidence limits for contents and grades of ore
reserves. The first step is to verify spatial dependence between grades of ordered sets of
core samples in boreholes. The second step is to compute variances of average grades for
ore zones in boreholes. The third step is to convert statistics of boreholes into cylindrical
volumes of in situ proven ore. The fourth step is to verify spatial dependence between
grades of ordered boreholes along a cross section or profile. A significant degree of
spatial dependence would make it possible to convert cylindrical volumes into contiguous
blocks of proven ore. The final step is to summate the variances of metal contents of all
blocks, and to convert this sum into confidence limits for the cumulative metal content
and the mass-weighted average metal grade.
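The final summation step lends itself to a short sketch. Assuming three blocks with illustrative metal contents and variances, the additive property of variances gives confidence limits for the cumulative metal content:

```python
# (Me, var(Me)) per block of proven ore; numbers are illustrative assumptions
blocks = [(1200.0, 900.0), (1500.0, 1600.0), (1100.0, 1200.0)]

total_me = sum(me for me, _ in blocks)
total_var = sum(v for _, v in blocks)        # variances of blocks are additive
ci95 = 1.96 * total_var ** 0.5
print(f"total Me = {total_me:.0f}  95% CI = +/-{ci95:.0f} "
      f"({100 * ci95 / total_me:.1f}%rel)")
```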

In the case that borehole grades ordered along a profile do not exhibit a significant degree
of spatial dependence, cylindrical volumes cannot be converted into contiguous blocks of
proven ore. Such cylindrical volumes not only delineate proven ore within a resource
but also provide an effective method to decide where best to drill additional holes.
Unbiased confidence intervals and ranges for metal grades and contents of mineral
inventories in annual reports are of critical importance to mining investors. This is why
several numerical examples will be presented in the chapter on sampling practice to show
how to compute confidence intervals and ranges for contents and grades of reserves, and
how to define proven ore in resources.

The arithmetic mean is by far the most common function in mathematical statistics. The
area-, density-, distance-, length-, mass- and volume-weighted average are widely used in
geosciences. For simplicity, arithmetic means and weighted averages are referred to as
“central values”. Given that a central value is a function of a set of measured values, it
follows that each central value has its own variance in mathematical statistics. Yet, each
central value does not necessarily have its own variance in geostatistics. The question is
not only which central value lost its variance but also who could possibly have lost this
variance, when and why. This question stands while geostatocrats and krigeologists
assume, krige, smooth and rig the rules of classical statistics.



The variance of a central value of a set of n measured values with variable weights
derives from the variance of the set and the sum of squared weighting factors. In formula,

$$ \mathrm{var}(\bar{x}) = \sum_{i=1}^{n} w_i^2 \cdot \mathrm{var}(x) $$
where: var(x) = variance of set
       wi² = squared ith weighting factor

For each wi=1/n, the sum of n squared weighting factors equals n·(1/n)²=1/n. Hence, var(x̄)=var(x)/n. This formula is in fact the Central Limit Theorem, a simple
formula that defines the relationship between the variance of a set of n measured values
with equal weights and the variance of its arithmetic mean.

In contrast, the weighted average is the central value of a set of measured values with
variable weights. This is why each area-, count-, density-, distance-, length-, mass-, and
volume-weighted average does have its own variance. Evidently, the variances of all
weighted averages converge on the Central Limit Theorem as all the variable weights
converge on the same constant weight. The corollary of variable weights is that the
number of degrees of freedom is no longer a positive integer but a positive irrational.
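A few lines of Python illustrate this convergence; the measured values and weights below are illustrative only:

```python
import statistics

x = [2.1, 2.6, 2.4, 2.9, 2.2]                 # measured values (illustrative)
w = [0.30, 0.10, 0.20, 0.25, 0.15]            # variable weights summing to one
var_x = statistics.variance(x)

# var(xbar) = sum(w_i^2) * var(x); with w_i = 1/n it collapses to var(x)/n
print(f"variable weights: {sum(wi ** 2 for wi in w) * var_x:.4f}")
print(f"equal weights   : {var_x / len(x):.4f}")
```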

Two or more measured values such as counts, densities, lengths, masses, or volumes
define but one weighted average. The distance-weighted average is different in the sense
that two or more measured values, determined in samples selected at positions with
different coordinates in a finite sample space, define an infinite set of distance-weighted
averages. The problem is that all weighted averages do have variances in mathematical
statistics but that the distance-weighted average is not similarly blessed in Matheronian
geostatistics. To have or not to have a variance are mutually exclusive outcomes. The
question is then which outcome is true, and which outcome is false. Could Matheron’s
new science of geostatistics be flawed? Could mathematical statistics be flawed? Did
Matheron study mathematical statistics before creating his new science of geostatistics?

In statistics, all variances converge on the Central Limit Theorem as variable weights
converge on the same constant weight. Moreover, each weighted average does have its
own variance because it is a functionally dependent value. In geostatistics, however, the
variance of the length-weighted average and the variance of the distance-weighted
average were somehow misplaced. Paradoxically, both distance-weighted average and
length-weighted averages were reborn as kriged estimates or kriged estimators.

Matheron, in his Foreword to Journel and Huijbregts’s 1978 Mining Geostatistics, put
forward that geologists deal with structure and statisticians stick to randomness but did
not explain how. Much of Matheron’s seminal work is posted with the On-Line Library
of the Centre de Géostatistique. The properties of variances play a critical role in
computing unbiased confidence limits for contents and grades of reserves. Matheron paid
little attention to the properties of variances, and today’s Centre de Géostatistique even
less. The problem with Matheron’s teachings is not so much that he knew too little about
applied statistics but that much of what little he knew was wrong.


2.5 Matheron's statistics

Matheron’s seminal work is accessible for long overdue review and scrutiny at the On-
Line Library of the Centre de Géostatistique. Matheron’s Formule des Minerais
Connexes, one of the very first papers posted on this website, was written in Algiers on
November 25, 1954. A correction to this paper was appended on January 13, 1995. No
primary data was provided either in the paper or in the appended correction.

In retrospect, it is strange that this paper was marked Note Statistique No 1 because it confirms that Matheron himself believed he was practicing classical statistics in 1954. His first paper, too, proved that Matheron's statistics did not comply with the requirement of functional independence, and ignored the concept of degrees of freedom. Given that
Matheron’s seminal work showed such a tenuous grasp of mathematical statistics, it is
not surprising that French statisticians paid scant attention to his statistics. In those days,
Matheron was far removed from rapid advances in mathematical statistics elsewhere in
the world. Matheron’s shaky statistics stood in sharp contrast to his credentials at l’Ecole
Nationale Superieure des Mines de Paris.

Formule des Minerais Connexes is classic Matheronesque in the sense that poorly
defined terms and atypical symbols abound but real data are missing. Matheron failed to
explain why correlation-regression analysis was applied to Napierian logarithms of paired
lead and silver grades. He penned in his paper a correlation coefficient of r=0.85 but not
the number of paired grades. As a result, it is still impossible to compute 95% confidence
limits for his mean grades of 2.36% for lead and of 89 g/t of silver. What he should have but did not point out is that n paired grades give df=n–2 degrees of freedom. Matheron's
problem in 1954 was that central values and degrees of freedom were beyond his grasp.

Matheron knew how to test for associative dependence between paired lead and silver grades. What failed to register in his mind is that associative dependence between ordered metal grades is a measure of spatial dependence in a sample space. The hypothesis that
causality may be assumed without cause stands out as the central red flag in Matheron’s
work. In contrast, the Central Limit Theorem is nowhere to be found.

In his Rectificatif à la Note Statistique No 1, Matheron mentioned that the length of his
core samples was variable. This is why his mean grades of 2.36% of lead and 89 g/t of
silver are biased estimates. He did not know how to derive the variance of the
arithmetic mean grade or the variance of the length-weighted average grade. In fact, the
Central Limit Theorem was never approved for application in Matheron’s new science of
geostatistics. Matheron huffed and puffed about higher length-weighted average grades
but didn’t test whether length-weighted average grades were significantly higher than
arithmetic mean grades. Matheron did not know how to compute confidence limits, how
to verify spatial dependence, and how to apply Student’s t-test.

Matheron did not grasp that his arithmetic mean and his length-weighted average do have
variances simply because all functionally dependent values do. Matheron should have
taught his students a little set theory by pointing out that one-to-one correspondence is as



common between functionally dependent values and variances as it is between blocks and
grades and between core samples and grades. It would be appropriate if the Centre de
Géostatistique were to post on its On-Line Library the lengths of all core samples and the
set of paired lead and silver grades that underpin Matheron’s very first statistical note.
Among those who stand to benefit most are all the neophytes who are presently primed to
become the geostatistical scholars of the future.

Matheron, in a June 1955 paper titled Utilité des méthodes statistiques dans la recherche
minière, referred to D G Krige’s A statistical approach to some mine valuation and allied
problems at Witwatersrand goldfield and A statistical analysis of some of the borehole
values in the Orange Free State. He also referred to H S Sichel’s 1949 Mine valuation
and maximum likelihood, and H J de Wijs’s Statistics of ore distribution, Part 1, Nov
1951, and Part 2, Jan 1953. Matheron did not discuss any of the statistical methods of
these authors. On the contrary, he praised his own work with lognormal statistics by
symbols at the Bureau de Recherches Minière.

It was out of character for Matheron to refer to statistical methods of authors in South
Africa and The Netherlands. In fact, Matheron so rarely refers to the statistical literature
in English-speaking countries that his flawed statistics may well be due to insufficient
osmosis of applied statistics. This is why Matheron’s new science is so eerily reminiscent
of the tale of the emperor’s new clothes.

Matheron's penchant for μ– and σ²–symbols was unconventional. After all, the use of
these symbols is restricted to the unknown population mean and the unknown population
variance in probability and sampling theory. It would have been merely a matter of
semantics if it were not for the fact that σ², the unknown population variance, converges on zero as n, the number of measured values, converges on infinity. If it were possible to
reduce the unknown true population variance to zero, then the unknown true population
mean would be known with absolute certainty.

In the real world, infinite sets of measured values are as rare as zero variances. In
Matheron’s world of mock statistics, however, infinite sets of kriged estimates and zero
kriging variances are as common as shrinking reserves. In his June 1958 Problemes de
zero et d’infini (Note Statistique No 17), Matheron mulled over infinite and zero through
twenty pages of tortuous text and strange symbols. He did so long before his most gifted
followers found infinite sets of distance-weighted averages and zero pseudo variances.

Basic concepts and widely accepted statistical terms and symbols went missing in
Matheron’s make of statistics. Matheron did not know that measured values and degrees
of freedom belong together just as much as do variances and functionally dependent
values of stochastic variables. The symbol x̄ stands for the central value of a set of measured values, either with equal weights such as arithmetic means or with variable weights such as area-, count-, density-, distance-, length-, mass- or volume-weighted averages. The symbol var(x̄) stands for the variance of a central value, var(x) for the
variance of a set, and varj(x) for the jth variance term of the ordered set. None of these
symbols and terms scored a passing grade in Matheron’s surreal statistics.


Matheron failed to check and compare his statistics against that of contemporary English-
speaking statisticians. Sir Ronald A Fisher was knighted in 1953 for his role in the
development of analysis of variance. Matheron still did not like Fisher’s work in 1970.
Otherwise, he would have known how to apply Fisher’s F-test to the variance of a set and
the first variance term of the ordered set. It also explains why Matheron assumed rather than verified spatial dependence between grades of ordered core samples in boreholes,
ordered ore sections in profiles, or ordered rounds in adits, drifts, pits, or trenches.

M J Moroney’s Facts from Figures saw its Second Edition in 1953. This popular book
was translated and published in 1970 under the title Comprendre la statistique: vérités et
mensonges des chiffres. In 1970, Matheron, his followers, and a few token statisticians
met at the very first geostatistics colloquium on campus at The University of Kansas,
Lawrence from June 7th to 9th. D Huff’s whimsical How to Lie with Statistics was an
instant success in 1954. It inspired W J Reichmann’s 1961 Use and Abuse of Statistics to
caution that a world without sound statistics would slowly grind to a halt. H G Wells was
so enthralled with statistics that he declared, “Statistical thinking will one day be as
necessary for efficient citizenship as the ability to read and write.” What would Wells
have thought of Matheron’s statistical thinking? To be sure, Matheron himself was rather
taken with his own statistical thinking!

Matheron failed to show how to derive the length-weighted average of a set of measured
values with variable lengths, and how to derive the variance of this sort of central value.
Matheron did not know how to count degrees of freedom. Neither did he know that
degrees of freedom are positive integers for sets of measured values with equal weights,
and positive irrationals for sets of measured values with variable weights.

In fact, Matheron knew much too little about degrees of freedom. This is why he could
not verify spatial dependence by applying Fisher’s F-test to variances of sets of measured
values and first variance terms of ordered sets, and comparing observed F-values with
tabulated F-values at selected probability levels and with applicable degrees of freedom.
Analysis of variance and the properties of variance were a profound mystery to Matheron
when he wrote his first statistical note in Algiers on November 25, 1954. These properties
were still a profound mystery in 1979 when Matheron compiled his rambling Foreword
to Journel and Huijbregts’s Mining Geostatistics. During all those years, the slope of
Matheron’s learning curve for his neostatistics stayed statistically identical to zero.

Matheron played at statistics in the 1950s but did not grasp its elementary rules. Yet, he
transformed his neostatistics into geostatistics with single-minded resolve and reckless
abandon. What happened to the variances of his lead and silver grades for ordered core
samples of variable length? Why didn’t he test for spatial dependence between ordered
lead and silver grades? Why didn’t he count degrees of freedom? Why did he violate the
most basic rules of classical statistics with impunity? Why did he get away with so much
bogus statistics? Surely, it was not just Matheron’s bluster and blarney! It did require
implicit approval of the world’s mining industry. This industry would not put up a dime
for research into assuming causality without cause but research into stochastic simulation
of ore reserves with pseudo variances somehow seems to make scientific sense.



2.6 Matheronian geostatistics

Matheron was dabbling at statistics in the early 1950s when he found out that working
with statistics of measured values and counting degrees of freedom was not his true
calling. So he took a few odds and ends of statistics and turned it into a new science, a
sort of statistics by symbols without rigid rules and real data. In time, his new science of
assuming spatial dependence, interpolating and extrapolating by kriging, selecting the
least biased subset of some infinite set of kriged estimates, and smoothing its pseudo
kriging variance to perfection, became the heart and soul of Matheronian geostatistics.

Matheron’s Krigeage d’un panneau rectangulair par sa périphérie was his first paper
with a krige-inspired eponym in its title. It was completed in 1960 and is posted as Note
Geostatistique No 28 with the Centre de Géostatistique‘s On-Line Library. This paper is
an archetypal Matheronian study in the sense that ambiguous terms abound, avant-garde
symbols rule, and central values have no variances. It was also the year that Matheron set
the stage for straying from sound statistics into his novel science of geostatistics.


[Facsimile sketch: rectangular block with corners A and A' on top, B and B' below]
Fig. 1
Source: Note Géostatistique No 28

The above figure is a facsimile of Figure 1 in Matheron’s Note Géostatistique No 28.
Matheron labeled the lengths of AA’ and BB’ for his block (panneau) as a, and the
lengths of AB and A’B’ for the same block as b. He defined the mean grade (teneur
moyenne) of AA’ and BB’ as u, the mean grade of AB and A’B’ as v, and the mean of u
and v as the mean grade of Block AA’B’B. Matheron also defined z*, his estimateur and
a precursor to the honorific kriged estimate or kriged estimator, as follows.
$$ z^* = \frac{a}{a+b}\cdot u + \frac{b}{a+b}\cdot v $$

In classical statistics, Matheron’s estimateur z* is the length-weighted average of u and v.
The formula for var(z*), the variance of Matheron’s estimateur, is a homologue of the
above formula, in which var(u) and var(v) are the variances of u and v.
$$ \mathrm{var}(z^*) = \frac{a}{a+b}\cdot \mathrm{var}(u) + \frac{b}{a+b}\cdot \mathrm{var}(v) $$

Incredibly, Matheron did not derive var(u), var(v), or var(z*). Incredible indeed because
u and v are mean grades of opposite sides of his block, and z* is the mean of mean grades
of u and v. So why did these variances fail to make the grade in Matheron’s new science?



Matheron had been turning out scores of weighty Notes Statistique since the early 1950s.
In 1960, however, he still did not know that mean grades of blocks do have variances. So
it comes as no surprise that he did not know how to derive var(z*), the variance of z*, or
var(u) and var(v), the variances of u and v. Matheron thought statisticians did not grasp
mining problems but he had not the slightest grasp of his own sampling problems. He did
not know how to derive the variance of a set of measured values or the variance of its
arithmetic mean. Neither did he know how to test for spatial dependence, or how to
compute the variance of metal contained in a single block or in a set of blocks.

Matheron’s estimateur is the length-weighted average grade of his block. Did he dictate
that block grades have no variances? Where did he refer to textbooks on statistics? Who
reviewed his notes? If u and v are each a single measured value rather than the arithmetic mean of a set, the variance of Matheron's estimateur is obtained as follows.

$$ \mathrm{var}(z^*) = \left[\frac{a}{a+b}\right]^{2}(z^*-u)^2 + \left[\frac{b}{a+b}\right]^{2}(z^*-v)^2 $$

This formula derives from the variance of a general function as defined in probability
theory. Matheron’s estimateur is a functionally dependent central value of some set of
measured values with variable weights. If Matheron’s panneau is a perfect square, the
above formula becomes the Central Limit Theorem for n=2. In that case, Matharon’s
estimateur gives precisely one degree of freedom. In contrast, a rectangular panneau
gives slightly less than one degree of freedom. When Matheron was studying statistics by
symbols without real data, counting degrees of freedom for sets of measured values with
equal or variable weights ranked rather low on his short list of things to do right.
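For a concrete illustration of the formula above, with illustrative lengths a and b and single measured values u and v, a few lines of Python suffice:

```python
# Length-weighted average z* of opposite sides of a rectangular panneau and
# its variance per the formula above; all numbers are illustrative
a, b, u, v = 40.0, 20.0, 2.3, 2.9
z = a / (a + b) * u + b / (a + b) * v
var_z = (a / (a + b)) ** 2 * (z - u) ** 2 + (b / (a + b)) ** 2 * (z - v) ** 2
print(f"z* = {z:.3f}  var(z*) = {var_z:.4f}")
```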




                         Figure 2.8 Matheronian block

A Matheronian block is an anomaly in statistics because its length-weighted average
grade is not blessed with a variance. In contrast, the grade of a Matheronian block does
have a variance in classical statistics because it is a functionally dependent central value
of a set of measured values. Matheron’s problem was not so much that his estimateur is a
length-weighted average grade but that it still does not have a variance. Central values
such as arithmetic means of sets of measured values with equal weights, or area-, count-,
density-, distance-, length-, mass-, and volume-weighted averages of sets of measured
values with variable weights, do have variances. This is why central values play a key
role not only in mineral processing, smelting, and refining but also in mineral exploration
and mining. The Central Limit Theorem links the variance of the set and the variance of
its central value. Why then is this theorem redundant in Matheron’s new science?


It is deeply troubling indeed that the variance of the length-weighted average grade of a
Matheronian block did not make the grade in his new science. The more so because none
of his Notes Statistique mentions that functionally dependent values do have variances,
and that sets of measured values do give degrees of freedom. Matheron ruminated on how to
obtain the best possible grade estimate for a set of Matheronian blocks, and how to
minimize la variance d’erreur associated with krigeage ordinaire. What Matheron failed
to grasp is that true variances cannot be “minimized”. Thus, it is not at all surprising that
Matheron did not know how to derive confidence limits for variances.

On page 2 of his paper, Matheron described a set of blocks with a mean grade of xt along
a total length of Lt and a mean grade of xm along a total width of Lm. The formula for
Matheron's estimateur m* is a homologue of the earlier one for his estimateur z*.

$$ m^* = \frac{L_t}{L_t+L_m}\cdot x_t + \frac{L_m}{L_t+L_m}\cdot x_m $$

In this case, too, Matheron did not derive var(m*), the variance of his estimateur for this set of blocks. Neither did he derive var(xt) or var(xm), the variances of the sets of measured values along opposite sides of his blocks. Because Matheron did not derive a mean grade for each of his blocks, he could not test for spatial dependence between block grades. Unlike a real statistician, Matheron did not know how to derive any of the following statistics:

 1. Apply analysis of variance to estimate variances within and between blocks,
 2. Verify spatial dependence by applying Fisher’s F-test to the variance of a set and the
    first variance term of the ordered set,
 3. Construct a sampling variogram for ordered block grades by verifying where spatial
    dependence in a sample space dissipates into randomness,
 4. Compute the variances of the metal contents for all blocks from volumes, densities,
    and grades, and the variances of these variables,
 5. Use the additive property of variances to derive confidence limits for the cumulative
    metal content and for the weighted average grade of this set of n blocks.




                             Fig. 2 Set of n ordered blocks
                            Source: Note Géostatistique No 28

What a pity that Matheron’s grasp of statistics was a bit less astute than he gave himself
so much credit for. Who were Matheron’s peers in those heady days? Who studied his
torrents of Notes Statistique and Notes Géostatistique? Was Matheron a born genius at


probability or a self-made wizard of odd statistics? His Centre de Géostatistique ought to
post with its On-Line Library early reviews of Matheron’s seminal work! His Notes
Statistique do not refer to commonly applied statistical methods. Did any bona fide
statistician ever review Matheron’s embryonic geostatistics before it was hailed as a new
science? As it stands, too many novices are being taught that assuming causality does
make a great deal of sense in Matheron’s new science of geostatistics.

Matheron did not know in 1954 how to derive the variances of lead and silver grades of
ordered core samples. In those days, he did not even know that the length-weighted
average grade is the central value of a set of measured values determined in core samples
of variable lengths. Neither did he know that each length-weighted average grade does
indeed sport its own variance in classical statistics. He still did not know in 1960 that
length-weighted average grades do have variances. Matheron’s problem was not so much
that length-weighted averages do not have variances but that distance-weighted averages
do not have variances either. Matheron’s estimateurs became confusing central values
because length-weighted averages and distance-weighted averages were both reborn as
kriged estimates or kriged estimators.

Matheron’s Catch-22 in 1960 was the difference between a Matheronian block and a
Matheronian point. A Matheronian block is a three-dimensional sample space. It is
defined by the length-weighted average of a set of measured values determined in
samples selected at positions along the periphery of the block. What Matheron failed to
do was test for spatial dependence by applying Fisher’s F-test to the variance of the set of
measured values and the first variance term of the set ordered along the periphery of his
block. That is why Matheron could not possibly confirm whether z*, his estimateur, is an
unbiased estimate for the grade of this block. As a result, a Matheronian block may, or
may not, have an unbiased grade estimate but its grade definitely did not have a variance.

In contrast, a Matheronian point is a zero-dimensional sample space. It is defined by the
distance-weighted average of two or more measured values determined in samples
selected at positions with different coordinates, either in a two- or in a three-dimensional
sample space. The essence of Matheron’s new science is that two or more measured
values give an infinite set of different coordinates, and, thus, an infinite set of distance-
weighted averages not only within the sample space but also beyond it. Matheron’s folly
was that he failed to grasp why the requirement of functional independence and the
concept of degrees of freedom foiled his new science.

It took Matheron ten more years to replace finite sets of length-weighted average block
grades with infinite sets of distance-weighted average point grades. Did he do so because
infinite sets are immeasurably larger than finite sets? Matheron could not possibly have
known in 1960 that pseudo kriging variances and pseudo kriging covariances of least
biased subsets of infinite sets of kriged estimates would become the cornerstones of his
new science of geostatistics. Matheron’s docile disciples grasped as much of the rules of
statistics as did Matheron himself. In fact, the entire geocabal did not know that length-
weighted average block grades and distance-weighted average point grades do have
variances. Perhaps they did know but were afraid to tell Matheron.



The fact of the matter is that Matheron knew barely enough to become a bungling
statistician. Just the same, he was brazen to boot, belligerent to a fault, and nonplussed by
persistent criticism of his new science of geostatistics. What Matheron did do was create
orderliness where randomness rules by assuming causality without cause. Therefore, it
should not come as a surprise when Matheron’s new science will be remembered most of
all as a statistical fraud without parallel in the history of science.

Matheron never mastered analysis of variance, Fisher’s F-test, Bartlett’s chi-square test,
Student’s t-test, Tukey’s WSD-test, spatial dependence, functional dependence, or degrés
de fidelité for that matter. Neither did Matheron grasp that the Central Limit Theorem
links the variance of a central value to the variance of a set of measured values, and why
this theorem lies at the core of sampling theory and practice. The very first textbook on
Matheron’s new science of geostatistics does mention the “famous” Central Limit
Theorem in the text but not in the index. The second textbook mentions zero kriging
variances but does not mention that it takes infinite sets of kriged estimates to beget zero
kriging variances.

Just a few rules of classical statistics had gone astray long before Matheron taught his
new science at his Centre de Géostatistique. Matheron himself had cooked up a string of
strange symbols and a thesaurus of tortuous terms. It was under Matheron’s tutelage that
so much neostatistics was done by symbols rather than with real data. In 1960, Matheron
defined his estimateur, which turned out to be a variance-deprived length-weighted
average block grade. The missing variance of Matheron’s estimateur is the very reason
why the Centre de Géostatistique deserves credit for posting Matheron’s seminal work on
its On-Line Library. For it will be a lasting source for study and scrutiny in years to
come. The question is not so much whether or not Matheron knew that his estimateur had
its own variance but why nobody told him. Of course, it is never too late to unravel
Matheron’s folly.

Conditional simulation and stochastic simulation may sound much more intuitive and
comforting than selecting least biased subsets of infinite sets of kriged estimates. In spite
of that, such simulation models are seeded with pseudo kriging variances of least biased
subsets of infinite sets of kriged estimates. The odds against selecting that elusive least
biased subset of an infinite set of kriged estimates are immeasurably high. All the same,
geostatistical ore reserve practitioners beat those odds on a routine basis. Such advanced
variants of Matheronian geostatistics merit as much mention in the history of science as
the Bre-X fraud does in mining history. Assuming causality is not some minor mistake
but a true scientific fraud.




2.7 A colloquium on geostatistics

The first colloquium on geostatistics was held on campus at the University of Kansas,
Lawrence, from June 7 to 9, 1970. The Kansas Geological Survey, the University of
Kansas Extension, and the International Association for Mathematical Geology
sponsored the surreal episode that brought Matheron and his bogus statistics all the way
to the USA. Dr D F Merriam, Chief of Geologic Research, Kansas Geological Survey,
edited the proceedings, and Plenum Press published it in 1970. The colloquium was
dedicated “To all geostatisticians and statistical geologists.”

Drs G S Koch, Jr, and R F Link were the token statistical geologists at this gathering of
made geostatisticians. Koch and Link’s textbooks on Statistical Analysis of Geological
Data, Parts 1 & 2 were published in 1970. This is why organizers of a geostatistical mind
could ill afford not to invite those prominent authors. Dr J W Tukey, a professor at
Princeton University who masterminded the well-known WSD-test (Wholly Significant
Difference), was asked to scrutinize Matheron’s new science of geostatistics.

One of the objectives of Link and Koch’s colloquium paper on Experimental Designs and
Trend-Surface Analysis was to link the latter to “ordinary analysis of variance.”
Matheron’s take on trend surface analysis emerged in his own paper when he wished,
“…the well-known problem of ‘trend surface analysis’ perhaps will encounter here its
happy end…” Matheron was not just ill-mannered but as ill-informed as he was in 1967
when he wrote about the pros and cons of kriging and polynomial interpolation. In those
days, his problem was that kriging created counterintuitive ore reserves on polynomial
curves and trend surfaces alike. Matheron was keen that ore deposits be kriged in three-
dimensional sample spaces. So he assumed spatial dependence between ordered sets of
measured values, interpolated by kriging, selected subsets of infinite sets of kriged
estimates, smoothed pseudo kriging variances and rigged the rules of classical statistics.

When Matheron was playing at statistics in the early 1950s, he did not know how to
derive variances of length-weighted average lead and silver grades of core samples with variable
lengths. Neither did he know how to test for spatial dependence between metal grades of
ordered core samples simply because he knew next to nothing about analysis of variance
and Fisher’s F-test. Some twenty years later Matheron did not know much more about
Fisher’s work but what he did know was that classical statistics spelled trouble for his
new science of geostatistics. A great deal of trouble indeed because deriving variances of
functionally dependent central values and counting degrees of freedom for sets of
measured values never made Matheron’s list of things to master.

In his 1970 Random Functions and their Applications in Geology, Matheron pointed out
that a variogram of Brownian motion along a straight line is “not bounded”, and that “no
stationary covariance exists.” In 1954, however, he did not talk about variograms of lead
and silver grades of ordered core samples along a straight line of a borehole. In those
days, Matheron did not know how to derive variances of length-weighted average grades
of core samples of variable lengths. Had he known that length-weighted average grades
do have variances he would have brought up that crucial fact in his seminal work.



Matheron did not put in plain words what Brownian motion and ore deposits do have in
common. Neither did he point out that Brownian motion along a perfectly straight line
makes a one-dimensional sample space similar to the one defined by metal grades of core samples
ordered along a slightly curved borehole. Matheron’s problem was not so much that he
didn’t know how to derive sampling variograms for lead and silver grades of ordered core
samples in a borehole. His problem was really that he failed to grasp how to verify
continued mineralization between metal grades of boreholes. That may well explain why
the practice of assuming spatial dependence between measured values became so deeply
entrenched in Matheron’s new science of geostatistics.

Matheron mulled over “a given practical problem” but practical problems and real data
were as scarce in his Brownian motion paper as they are in much of his quixotic work. He
defined his “unbiased expression as an estimator for the variogram” as follows.

\[
\gamma^{*}(h) \;=\; \frac{1}{2(L-h)} \int_{0}^{L-h} \bigl[ Z(x+h) - Z(x) \bigr]^{2} \, dx
\]


Matheron’s random function is a Riemann integral. The validity of this function depends
on its continuity for all values of x between zero and L. Matheron did not prove that his
“unbiased expression as an estimator for the variogram” is indeed unbiased. In this 1970
paper, too, he may well have assumed continuity between some set of unknown points
rather than derive a variogram to prove his point. It is troubling to the extreme that
assumed spatial dependence became the quintessence of Matheron’s new science.

Generally, the jth variance term of some ordered set of independently measured values in
a sample space or in a sampling unit derives from the following Riemann sum.

\[
\operatorname{var}_{j} \;=\; \frac{1}{2(n-j)} \sum_{i=1}^{n-j} \left( x_{i} - x_{i+j} \right)^{2}
\]

This formula is as traceable to Von Neumann’s work in the 1940s as analysis of variance
is to Fisher’s work at that same time. Fisher’s F-test compares the observed value of
F = var(x)/var1(x) with tabulated values of F-distributions at 5% and 1% probability with
applicable degrees of freedom. Plotting statistically significant variance terms for an
ordered set against the variance of the set and the lower limits of its asymmetric 99% and
95% confidence ranges gives a sampling variogram. This simple graph displays where
order in a sample space or in a sampling unit dissipates into disorder.
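
To make the test concrete, here is a minimal sketch in Python. The ordered grades are
hypothetical, the helper var_j is illustrative only, and scipy supplies the tabulated
F-values that printed F-tables would otherwise give.

import numpy as np
from scipy.stats import f as f_dist

def var_j(x, j):
    # jth variance term of an ordered set: squared differences between values
    # j positions apart, divided by 2(n - j)
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.sum((x[:n - j] - x[j:]) ** 2) / (2 * (n - j))

x = [1.2, 1.5, 1.4, 1.9, 2.1, 2.0, 2.6, 2.4]    # hypothetical ordered grades
n = len(x)

var_set = np.var(x, ddof=1)                     # variance of the set, df = n - 1
var_first = var_j(x, 1)                         # first variance term, dfo = 2(n - 1)

F = var_set / var_first                         # observed F-value
F_crit = f_dist.ppf(0.95, n - 1, 2 * (n - 1))   # tabulated F at 5% probability

# an observed F-value above the tabulated value points to spatial dependence
print(F, F_crit, F > F_crit)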

Given that F(0.05; ∞; ∞) = F(0.01; ∞; ∞) = 1.00, it follows that the notion of degrees of freedom
is of critical importance in mathematical statistics. Ordered sets of n measured values
give dfo=2(n–1) degrees of freedom for the first term, and dfo=2(n–j) for the jth term. The
factor 2 reflects that every datum but the first and the last is used twice. Degrees of freedom are
positive integers for sets of measured values with equal weights but positive irrationals
for sets of measured values with variable weights. As the rules of thumb would have it:
measured values do give degrees of freedom, and calculated values do have variances.


Maréchal and Serra’s Random Kriging is richly embellished with krige-derived eponyms
and cryptic symbols but short of real data. In fact, this paper is surprisingly reminiscent
of Matheron’s Random Functions and their Applications in Geology. For example,
Figure 10 in the section on Punctual Kriging shows a symmetric matrix of nine (9)
squares with sixteen (16) unknown “punctual estimates” in the center square, each of
which, in turn, is a functionally dependent value of the same set of nine (9) samples with
unknown grades. The caption below this figure reads, “Grades of n samples belonging to
nine rectangles P of pattern surrounding x” but real grades are missing.

Each of Maréchal and Serra’s “punctual estimates” is, in fact, the distance-weighted
average grade for the selected position. The problem is not so much that all “punctual
estimates” derive from the same set of nine (9) unknown grades but that all variances are
missing. Of course, calling distance-weighted average grades “punctual estimates,” or
“estimateurs” for that matter, does not change the fact that such central values do have
variances. Maréchal and Serra’s Figure 10 resurfaced with the same set of nine (9)
unknown grades as Figure 203 on page 286 of David’s 1977 Geostatistical Ore Reserve
Estimation. The latter figure, its caption, and the context on the same page are reviewed
in Section 2.9 A textbook on geostatistics.

Agterberg alluded to some kind of “geologic prediction problem” in the caption below
Figure 1 of his 1970 paper on Autocorrelation Functions in Geology. This figure was
born again as Figure 64 in his 1974 textbook on Geomathematics. Thus, it took some 4
years before his “geologic prediction problem” became a “typical kriging problem.” The
question is then what the difference between Agterberg’s geologic prediction and typical
kriging problems is all about. Both figures refer to unknown “known values” for a set of
five (5) irregularly spaced points, and an unknown value to be “predicted” for point P0.
Agterberg, just like Maréchal and Serra, failed to point out that his “predicted value” is,
in fact, the distance-weighted average of a set of five (5) measured values determined at
positions with different coordinates. Neither did he point out that his “predicted value”
does have a variance because it is a functionally dependent value of his set of five (5)
unknown “known values”. Agterberg could have but did not mention that his “predicted
value” was bound to converge on the arithmetic mean, and its variance on the Central
Limit Theorem, as soon as irregularly spaced points become equidistant to P0.
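
The convergence is easy to verify numerically. The sketch below uses five hypothetical
measured values and assumes inverse-distance weighting for illustration; when all five
points are equidistant to P0, the weights become equal, the distance-weighted average
collapses to the arithmetic mean, and its variance to var(x)/n, exactly as the Central
Limit Theorem requires.

import numpy as np

x = np.array([0.8, 1.1, 0.9, 1.3, 1.0])    # five hypothetical measured values
d = np.full(5, 10.0)                       # all five points equidistant to P0

w = (1 / d) / np.sum(1 / d)                # inverse-distance weights, sum to 1
predicted = np.sum(w * x)                  # the distance-weighted average

# variance of a weighted average of independent values with variance var(x)
var_predicted = np.var(x, ddof=1) * np.sum(w ** 2)

print(predicted, np.mean(x))                 # identical for equal weights
print(var_predicted, np.var(x, ddof=1) / 5)  # identical: Central Limit Theorem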

It is of minor concern that Agterberg did not mention that his “predicted value” is a
distance-weighted average. It is of major concern, however, that he failed to grasp that
each distance-weighted average does have its own variance in classical statistics. He
could have mentioned that his set of five (5) irregularly spaced points defines an infinite
set of “predicted values” within and beyond this sample space. He could have explained
how to verify spatial dependence in his sample space by applying Fisher’s F-test to the
variance of the set of five (5) measured values and the first variance term of the ordered
set. He could have mentioned random and systematic walks to derive variances of sets
and variances of ordered sets. Agterberg, more than most geostatistical scholars, ought to
know that his predicted values are zero-dimensional point grades. If Agterberg were to
agree that zero-dimensional point grades and three-dimensional block grades do indeed
have variances, then he ought to revise his 1974 textbook on Geomathematics.



Professor Dr J W Tukey’s response to the themes and problems of the Geostatistics
Colloquium in terms of the current state of the art of data analysis, spectrum analysis, and
classical statistics is summarized in Some Further Inputs. Tukey mentioned means and
variances in his Abstract but did not mention the one-to-one correspondence between
means and variances. Nor did he inquire what happened to the variances of Maréchal and
Serra’s “punctual estimates,” or to the variance of Agterberg’s “predicted value” for that
matter. He may not even have noticed that Maréchal and Serra’s “punctual estimates”
and Agterberg’s “predicted value” did not sport variances.

Of course, Tukey knew that the Central Limit Theorem defines the relationship between
the variance of a set of measured values with identical weights and the variance of its
arithmetic mean. He may have overlooked that Agterberg’s “predicted value” converges
on the arithmetic mean when irregularly spaced points become equidistant to P0. It is true
that Tukey, Koch, and Link failed to notice that Agterberg’s “predicted value” did not
have a variance. Yet, they were no doubt aware that its missing variance would converge
on the Central Limit Theorem when all irregularly spaced points become equidistant to
P0. Tukey, Koch, and Link were but three of many scores of statisticians who were and
are still bamboozled by Matheron’s new science of geostatistics, its multitude of baffling
terms and symbols, and its scarcity of real data.

Tukey stated, “I am now beginning to understand, (kriging) is a word for more or less
stationary, more or less least squares, smoothing of data”. Matheron’s new science of
assuming, kriging, smoothing and rigging the rules of classical statistics seemed to have
mesmerized Tukey as much as it did so many geologists and mining engineers around the
world. Matheron’s rejection of Link and Koch’s paper on Experimental Designs and
Trend-Surface Analysis should have troubled both Tukey and the token statisticians. In
fairness to the few innocents surrounded by a large crowd of geostatistically blessed, it
should be kept in mind that symbols and terms of classical statistics were scrambled to
the extreme, and that basic rules were violated amidst the Babylonian confusion. Indeed,
it would have taken Tukey, Koch, and Link much more than three days to unscramble
Matheron’s bizarre alternative to mathematical statistics. Matheron himself was not
troubled when dreadful inconsistencies were brought to his attention. On the contrary, he
was invariably proud to point out that the world’s mining industry did embrace his new
science of doing so much with a few expensive boreholes.

Agterberg, in his tribute to “Georges Matheron–Founder of Spatial Statistics,” claimed,
“Matheron deserves Fisher–Tukey class standing.” It is absurd to suggest that Matheron
ranked on a par with those giants of statistics! Sir Ronald A Fisher was knighted in 1952
because he created analysis of variance based on the properties of variances. In contrast,
Georges Matheron created geostatistics because he failed to grasp the properties of
variances as much in 1954 as he did until his passing in 2000. What’s more, he taught his
students that spatial dependence between ordered sets of measured values in sample
spaces could be assumed with impunity. Agterberg’s ranking of Professor Georges
Matheron is an ill omen not only for the statistical integrity of Chapter 10 Stationary
Random Variables and Kriging in his textbook but even more so for his own integrity as
a scholar and a scientist.



2.8 Agterberg’s textbook on geomathematics

Dr F P Agterberg’s 1974 Geomathematics, Mathematical Background and Geo-Science
Applications, is a comprehensive textbook on the application of the queen of sciences in
earth sciences. The author covered much of the vast range of tools and techniques that
mathematics provides in such rich abundance. This is why most of it will stand the test of
time. In spite of that, some of Agterberg’s assumptions and definitions are bound to
crumble under scrutiny and his geostatistical thinking is just as wrong as Matheron’s.

S C Robinson, in his Foreword to Agterberg’s textbook, stated, “Geomathematics is
becoming indispensable to the earth sciences as the huge volume and wide variety of
observations and measurements increases.” Yet, Agterberg did statistics by symbols in
Figure 1 of his 1970 Autocorrelation Functions in Geology. Four years later, he still had
not found five values determined in samples taken at positions with different coordinates
in a two-dimensional sample space. Following is a facsimile of Figure 64 as shown on
page 353 in Chapter 10 Stationary Random Variables and Kriging.




           Fig. 64. Typical kriging problem; values are known at five points.
        Problem is to estimate value at point Po from the known values at P1 – P5.
                                 (From Agterberg, 1971)

Agterberg was as keen to work with suppositions and symbols as Matheron was ever
since he struggled with statistics by symbols in 1954. Such a shared bent may well
explain why real data were missing in his 1970 and 1974 figures. He may have tried to
grasp Matheron’s surreal statistics before putting his own spin on how to assume
continuity of stationary random functions, how to interpolate and extrapolate by kriging,
and how to fumble the variance of his estimated value at point P0.

In 1970, Agterberg speculated, “Suppose that there exists a two-dimensional
autocorrelation function ρij for the linear relationship between all possible pairs of points
Pi and Pj.” In 1974, however, he hypothesized, “The method of linear prediction in time
series can be adapted to the situation of Fig. 64 by defining a two-dimensional
autocorrelation function ρij for the linear relationship between all possible pairs of points
Pi and Pj.” So, it took Agterberg four years to progress from “supposing” to “defining”
his two-dimensional autocorrelation function. Meanwhile, his geologic prediction
problem had turned into some kind of kriging problem but his autocorrelation function
solved both problems just the same.



Agterberg was no stranger to functional dependence. In Section 2.3 Variables and
Functions of Chapter 2 Review of Calculus, he pointed out, “The equation y = f(x)
denotes that y is a function of x.” Under Continuity of Functions on the same page, he
explained how to assess whether a function is continuous or discontinuous.

“A function f(x) is continuous for a value of x = a if both lim_{x→a} f(x) and f(a) exist
with:

\[
\lim_{x \to a} f(x) = f(a)
\]

This means that in the limit, f(x) assumes the value of f(a) as x approaches a. The values
of x in the expression lim_{x→a} f(x) can be larger as well as smaller than a.”



Riemann’s definition for continuity of functions had slipped Agterberg’s mind when he
was creating Chapter 10 Stationary Random Variables and Kriging. In Section 10.1
Introduction, he rambled, “We assume that variables which change in value from point to
point obey stationary random functions.” He did not disclose who were “we” who
decided to “assume” that values of stochastic variables obey stationary random
functions. Nor did he explain why assuming spatial dependence between points made
sense not just in his chapter on Stationary Random Variables and Kriging but even more
so in all of Matheron’s seminal work. In the same section, Agterberg claimed, “The
results can be used for interpolation and extrapolation” but did not spell out why
extrapolation made just as much sense as interpolation does.

Agterberg’s “kriging problem” and his earlier “geologic prediction problem” were quite
the same. So much so that Agterberg knew that his “value to estimate” in 1974 and his
“value to predict” in 1970 were the very same functionally dependent value. What he
failed to grasp or chose to ignore is that functionally dependent values do have variances
in mathematical statistics. In fact, his functionally dependent value is the distance-
weighted average of his set of five (5) unknown measured values with unknown
coordinates. What stands to reason is that this distance-weighted average is bound to
converge on the arithmetic mean of the set, and its variance on the Central Limit
Theorem, as soon as all of Agterberg’s points become equidistant to P0.
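
The algebra behind that claim is elementary propagation of variance. Assuming the
measured values are independent, the steps read as follows.

\[
\bar{x}_w = \sum_{i=1}^{n} w_i x_i , \qquad \sum_{i=1}^{n} w_i = 1
\]
\[
\operatorname{var}(\bar{x}_w) = \sum_{i=1}^{n} w_i^{2} \operatorname{var}(x_i)
\;\longrightarrow\; \frac{\operatorname{var}(x)}{n}
\quad \text{when all } w_i = \tfrac{1}{n}
\]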

The central limit theorem does indeed lie at the core of a perverse problem that found its
roots in early Matheronesque thinking. This theorem is fundamental in sampling theory
and practice because it links the variance of a set and the number of measured values in
the set to the variance of its central value (the arithmetic mean or a weighted average).
Agterberg did refer to the central limit theorem in Chapter 6 Probability and Statistics
and Chapter 7 Frequency Distributions and Functions of Independent Random Variables.
Yet, this theorem did not show up at all in Chapter 10 Stationary Random Variables and
Kriging. Agterberg should have but did not explain why the central limit theorem did not
make the grade in the very chapter where he endorsed Matheronian geostatistics.

What Agterberg did not point out was that his unknown points and unknown coordinates
in Figure 64 do not define just one “predicted value” but an infinite set not only within
his two-dimensional sample space but also beyond. It is true that he may not have meant
to plot predicted values beyond his sample space. On the other hand, Agterberg suggested


in Section 10.1 Introduction, “The results can be used for interpolation and
extrapolation.” Yet, he did not clarify whether or not “results” and “predicted values”
are somehow synonymous. The question is then what Agterberg was talking about when
he blessed both interpolation and extrapolation.

What failed to arouse Agterberg’s interest in 1970 was spatial dependence between
ordered sets of measured values in sample spaces. He still had not figured out in 1974
how to test for spatial dependence in sample spaces defined by unknown values with
unknown coordinates. He could have used symbols similar to those in Equation [10.82]
on page 353 to derive the variance of the set and the first variance term of the ordered set.
Statistics by symbols would have shown for once that Agterberg, unlike Matheron and
his first generation of geostatistical scholars, knew how to test for spatial dependence in
sample spaces.

Agterberg showed how to apply Fisher’s F-test and assess whether two variances are
statistically identical or differ significantly. In fact, he referred to Fisher’s F-test in
Chapter 6 Probability and Statistics and in Chapter 7 Frequency Distributions and
Functions of Independent Random Variables but not in Chapter 10 Stationary Random
Variables and Kriging. The question is then why Agterberg did show how to apply
Fisher’s F-test but did not know how to apply the very same test to the variance of a set
and the first variance term of the ordered set.

Agterberg displayed his savvy at analysis of variance and the properties of variances in
Chapter 6. In Chapter 10, however, he fumbled the variance of his predicted value and
muddled much of his acumen in analysis of variance. What’s more, Agterberg knew
just as little as did Matheron and his following about the additive property of variances for
multivariate functions such as multiple metal contents of sampling units and sample
spaces. What Agterberg did do most of all was to lend too much credence to Matheron’s
elusive stationary random function. Agterberg and Matheron shared a prodigious
penchant for working with symbols, which may explain why they paid so much more
attention to sampling theory than to sampling practice.

Dr J Visman played a key role in building a bridge between sampling theory with its
homogeneous populations and sampling practice with its heterogeneous sampling units
and sample spaces. Surprisingly, Agterberg paid no attention at all to Visman’s work. It
is indeed surprising because so much of Visman’s work was published when both were
employed with the former Department of Mines and Technical Surveys in Ottawa,
Canada. In fact, Visman’s 1961 Towards a Common Basis for the Sampling of Materials
was published as CANMET Research Report R93.

Visman proved that the variance of selecting a set of primary increments from a
heterogeneous sampling unit is the sum of the composition variance and the distribution
variance. The composition variance is a measure for the variability between particles
within primary increments. In contrast, the distribution variance is a measure for the
variability between all primary increments in the set that constitutes the sampling unit.
The composition variance is a function of the mass of primary increments. Typical



examples of sampling units are shipments of coals, concentrates, ores, recycled materials,
and other materials in bulk.
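
Visman’s experiment boils down to a few lines of arithmetic. The sketch below assumes
the commonly cited two-term model, in which the variance of a single increment of mass
m is A/m + B; the masses and variances are hypothetical, and the paired sets of small
and large increments give two equations in the two unknowns A (composition) and B
(distribution).

# hypothetical results of a Visman experiment with small and large increments
m_small, var_small = 0.5, 4.0    # kg, variance between small increments
m_large, var_large = 5.0, 1.3    # kg, variance between large increments

# solve var = A/m + B for the composition and distribution components
A = (var_small - var_large) / (1 / m_small - 1 / m_large)
B = var_small - A / m_small

# variance of the mean of n primary increments of mass m each
n, m = 25, 2.0
var_sampling = (A / m + B) / n
print(A, B, var_sampling)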

Visman’s sampling model also applies to heterogeneous sample spaces such as ore
deposits, geochemical prospects, contaminated sites, and similar stationary situations. For
example, an ore deposit is partitionable into a set of large blocks, each of which, in turn,
is partitionable into a set of small blocks. The variance between large blocks and the
variance between small blocks are measures for the degree of heterogeneity as a function
of the average mass of a block. Test results for ordered sets of core samples define
cylindrical volumes of proven ore. Pairs of interleaved bulk samples, selected from
ordered rounds in drifts, pits, and trenches, not only define metal grades and contents but
also show where spatial dependence between rounds dissipates into disorder. Most of all,
Visman’s sampling practice ensures unbiased confidence limits for central values of
small and large sets of measured values alike.

Visman took part in the activities of ASTM Committees D-5 on Coal and Coke, and E-11
on Statistics. ASTM D2234 Standard Practice for Collection of a Gross Sample of Coal,
in Annex A1 Test Method for Determining the Variance Components of a Coal, describes
Visman’s sampling experiment with sets of small and large increments to estimate
composition and distribution variances. ASTM D2234 was the very first internationally
recognized standard to specify a precision of ±10% of dry ash content. Visman’s paper
on A General Sampling Theory was published in the November 1969 issue of Materials
Research & Standards.

M David did refer to Visman in his 1977 Geostatistical Ore Reserve Estimation. In his
1974 Geomathematics, however, Agterberg did not mention Visman, P Gy, or any other
recognized expert on sampling theory and practice for that matter. In 1967, Gy referred to
Visman’s 1947 PhD thesis and his 1961 CANMET Research Report R93. Gy no longer
referred to Visman in his 1979 Sampling of Particulate Materials, Theory and Practice.
In its Introduction, however, Gy praised Matheron and his “…science known as
Geostatistics”. Matheron, in turn, praised Gy in his Preface to Gy’s 1967
L’Echantillonnage de Minerais en Vrac. Matheron and Gy knew all about sampling
theory, probability distributions, homogeneous populations, and population means and
variances. This is why both failed to grasp that the properties of variances and the
concept of degrees of freedom are of paramount importance when sampling
heterogeneous sampling units and sample spaces.

That kind of mutual praise explains why the properties of variances were missing as
much in Matheron’s new science as they were misused in Gy’s sampling practices. It
may explain why a paper on The Properties of Variances went missing several times on
D F Merriam’s watch as the Editor-in-Chief, Journal for Mathematical Geology. When
his anonymous reviewers finally perused this paper, the first one proclaimed,
“Geostatistics need obey the concept of degrees of freedom no more so than linear, least
squares regression analysis.” The other declared, “Degrees of freedom is an older
terminology that is not relevant to the modern development of statistics.” These blatantly
biased peer reviews did not trouble Merriam, the Editor-in-Chief, Journal for



Mathematical Geology. Neither did Agterberg and M Armstrong, Merriam’s Associate
Editors in 1995, wonder why degrees of freedom were no longer relevant in statistics.
The more so because Agterberg himself in 1974 did refer to analysis of variance and
degrees of freedom in Chapter 6 Probability and Statistics and Chapter 8 Statistical
Dependence; Multiple Regression.

In 1992, when Agterberg was Assistant Editor to W D Sinclair, Editor, CIM Bulletin, he
approved for publication in CIM Forum a technical brief on Abuse of Statistics. His
approval was not at all surprising because Abuse of Statistics showed how to apply
Fisher’s F-test just as well as did Agterberg himself in Chapter 6 Probability and
Statistics. What he failed to explain in Geomathematics is how to verify spatial
dependence in a sample space by applying Fisher’s F-test to the variance of a set of
measured values and the first variance term of the ordered set. Agterberg, just like
Armstrong in those days, may still not have grasped what the difference between
functional dependence and spatial dependence is all about.

Agterberg knew in 1974, just as he does today, that analysis of variance and
degrees of freedom are inextricably linked. In fact, comparing an observed F-value
between two variances with tabulated F-values requires that the number of degrees of
freedom for each variance be taken into account. This is the very reason why degrees of
freedom will always be relevant in mathematical statistics. Scores of textbooks deal in
detail with degrees of freedom. ISO standards on applications of statistical methods, too,
obey simple notions. Functionally dependent values do have variances. Measured values
do give degrees of freedom.

Geostatistical reviewers on Agterberg’s watch in 1995 when he was Associate Editor,
Journal for Mathematical Geology, did not take degrees of freedom quite as seriously as
statisticians do. That explains why the editorial board of CIM’s Geological Society did
not rank Dependencies and Degrees of Freedom as high as Agterberg rated Abuse of
Statistics in 1992. The CIM editorial board had noticed, “This new brief is longer, but
deals with the same topic.” The board did not want to know why functional dependence
and spatial dependence are as different as night and day. Neither did the editorial board of
CIM’s Geological Society believe that its members ought to know that functionally
dependent values do have variances, and that measured values do give degrees of
freedom.

What’s more, the board was irked because of Geostatistics or Voodoo Statistics. It was a
technical brief of sorts, published in the Engineering and Mining Journal of September
1992 and put together to teach elements of statistics to CIM’s statistically dysfunctional
enforcers of geostatistics. Yet, the board’s faith in Matheron’s new science of
geostatistics never wavered, and CIM’s Geological Society stayed the course. Sinclair,
the Editor of CIM Bulletin, and Gerber, his Assistant Editor, were not troubled that
functional dependence and spatial dependence are as different as night and day.
Meanwhile, Matheronian geostatistics was about to convert Bre-X’s bogus grades and
Busang’s barren rock into a massive phantom gold resource.




Robinson mentioned, “…huge volumes of numerical data…” but Agterberg failed to find
a fitting set. Otherwise, he could have tested for spatial dependence by applying Fisher’s
F-test to the variance of the set and the first variance term of the ordered set. A systematic
walk from point to point, such that it covers the shortest possible distance between all
points in the set, gives unbiased estimates for the first variance term of the ordered set
and for the variance of its distance-weighted average.
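
A minimal sketch of such a walk, with hypothetical coordinates and grades; a greedy
nearest-neighbour walk stands in for the shortest possible path.

import numpy as np

pts = np.array([[0., 0.], [3., 1.], [1., 4.], [5., 2.], [2., 2.]])  # hypothetical
grades = np.array([1.1, 1.6, 0.9, 1.8, 1.3])

order, left = [0], set(range(1, len(pts)))
while left:        # always step to the nearest point not yet visited
    nxt = min(left, key=lambda i: np.linalg.norm(pts[i] - pts[order[-1]]))
    order.append(nxt)
    left.remove(nxt)

x = grades[order]  # grades in walking order
var_first = np.sum((x[:-1] - x[1:]) ** 2) / (2 * (len(x) - 1))
print(order, var_first)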

Agterberg did not point out that his “predicted value” is a functionally dependent value
of a set of five values. Nor did he point out that functionally dependent values do have
variances in classical statistics. His predicted values became kriged estimates in the first
textbook on geostatistics and kriged estimators in the second textbook. The problem is
not so much that Matheron and his students did not know kriged estimates and kriged
estimators have variances but that Agterberg failed to grasp his “predicted value” does
have its own variance.

Matheron babbled about Brownian motion in 1970 when he assumed his “unbiased
expression as an estimator for the variogram” is in fact unbiased. Agterberg, in turn,
assumed that Matheron knew what he was babbling about. Matheron may well have assumed
his stationary random function need not be continuous. Nevertheless, he relieved himself
of the burden of proof and assumed his stationary random function captured the
capriciousness of randomness. Did Matheron believe in bringing some semblance of
orderliness to sample spaces where randomness prevailed? This may well be how his
stationary random function turned into the wonderful kriging game of chance. This
kriging game assumes spatial dependence between measured values, interpolates and
extrapolates by kriging, selects the least biased subset of some infinite set of kriged
estimates, smoothes the pseudo kriging variance to perfection, violates the requirement of
functional independence, and ignores the concept of degrees of freedom.

The question of whether or not Matheron’s stationary random function is continuous did
not concern Agterberg in 1970. Yet, his paper on Autocorrelation Functions in Geology
suggests that he did search in 1970 for some measure of order in his sample space.
Therefore, it is surprising indeed that Agterberg himself said, “The results can
be used for interpolation and extrapolation.” Actually, Agterberg made this astounding
statement in Chapter 10 Stationary Random Variables and Kriging of his 1974 textbook
on Geomathematics.

Agterberg is in a bind. He derived Po, his predicted value and the distance-weighted
average in classical statistics, but did not derive var(Po), the variance of his predicted
value. What he failed to grasp in 1970 and 1974 is that Po converges on the arithmetic
mean, and var(Po) on the Central Limit Theorem, as irregularly spaced points become
equidistant to Po.

Agterberg is even more in a bind because of extrapolation beyond the sample space
defined by his five irregularly spaced points. His predicted value Po converges on the
arithmetic mean and var(Po) on the Central Limit Theorem when distances increase
between this sample space and his predicted value Po. This predictable outcome gives the



lie to his 1974 assumption that, “The results can be used for interpolation and
extrapolation.” A set of three measured values, determined in samples selected at
positions with different coordinates, is enough to prove him wrong. It would also be
enough to show how to test for spatial dependence between measured values in the
ordered set. Of course, Agterberg knows that assuming spatial dependence is just as
much a scientific fraud as is extrapolation without justification.

Agterberg should study how much the creator of geostatistics struggled with statistics.
When Matheron derived in 1960 the length-weighted average grade of a rectangular
block, he thought up the symbol k* and called it his “estimateur”. And just like
Agterberg in 1970 and 1974, Matheron did not derive var(k*), the variance of his
precursor to the kriged estimate or kriged estimator. What’s more, Matheron did not
know either that k* would be the arithmetic mean for a square block, in which case
var(k*) obeys the Central Limit Theorem.

Agterberg is in a real bind. He is a well-known author and a gifted scholar who ought to
know that all weighted averages do have variances simply because the arithmetic mean
does. He does know how to apply analysis of variance. He may know how to construct
sampling variograms that show where order in sample spaces dissipates into disorder. He
has played, and continues to play, prominent roles with organizations such as the
International Association for Mathematical Geology, the Journal for Mathematical
Geology, the Canadian Institute of Mining, Metallurgy and Petroleum, and CIM Bulletin.

Dr F P Agterberg may well believe it is too late to abandon geostatistics because it has
been around for so long. He would be wrong. It is never too late to right this wrong and
work with mathematical statistics. Rather than remain in denial, he should think about his
legacy. He may rest on his laurels and let the wonderful kriging game run rampant. Or he
may revise his textbook on Geomathematics and eliminate Chapter 10 Stationary
Random Variables and Kriging. He may even want to add a chapter on Dr J Visman’s
sampling practice. Visman showed how to estimate the composition and distribution
component of the sampling variance as a measure for heterogeneity and spatial
dependence in sampling units and sample spaces alike.




2.9 David’s textbook on geostatistics

M David’s 1977 Geostatistical Ore Reserve Estimation was the very first textbook on
Matheron’s new science. D G Krige, the pioneering plotter of distance-weighted average
block grades at the Witwatersrand gold reef complex in South Africa, prepared the
Preface to this textbook. Krige pleaded guilty to, “Having been associated intimately
with the birth and early development of…geostatistics…” What Krige recalled most of all
were “…stormy receptions…” and “…significant skepticism…” Krige did not reveal
what it was about Matheron’s new science that brought about significant skepticism and
stormy receptions. On the contrary, Krige praised Matheron and followers such as David
for the development and establishment of geostatistics. Krige’s praise is deserved indeed
because David’s grasp of statistics matched Matheron’s perfectly.

Matheron himself, in his 1960 Krigeage d’un panneau rectangulair par sa périphérie,
coined the first krige-inspired eponym. It was in this Note Geostatistique No 28 that
Matheron fumbled var(z*), the variance of z*, his so-called “estimateur.” David and
Krige did not know z* and var(z*) are inseparable. Krige might not even have endorsed
this first textbook on geostatistics had he been aware that David’s point grade and its
variance, too, are inseparable. Figure 203 plays a pivotal role in proving that Matheron’s
new science does give shaky statistics because point grades do not have variances.




                   Fig. 203. Pattern showing all the points within B,
                    which are estimated from the same nine holes

Figure 203 is given in Chapter 10 The practice of kriging. It shows nine (9) irregularly
spaced holes in nine squares and sixteen (16) symmetric points in the center square
marked B. David sought to derive the covariance of “all the points within B.” What he
did not know is that the covariance of all the points within B is as useless a measure for
spatial dependence as the first variance term of ordered points. The reason is that each of
his points is a functionally dependent distance-weighted average grade of the same set of
nine (9) holes. David could have derived the covariance of any number of points within
B, and each would still be a functionally dependent distance-weighted average point
grade of the same set of nine (9) holes. The problem is how many distance-weighted
average point grades within B do give a perfectly smoothed pseudo kriging variance.
David tried to solve this conundrum just as much as the greatest geostatistical minds did
ever since Matheron fumbled the variance of his length-weighted average block grade.


The text below Figure 203 on page 286 is David’s but his pattern of points is the same as
that in Figure 10 of Maréchal and Serra’s 1970 Random Kriging. Maréchal and Serra
studied “punctual kriging” by symbols as a sort of tribute to Matheron’s geostatistical
tinkering. Maréchal and Serra did not report coordinates because of random kriging,
“where one does not mind exact locations.” David decided against “…a search for
‘good’ neighbours for each point,” and made up his mind “to keep only those samples
falling within the aureola of nine blocks.” David did not explain how to search for “good
neighbors” in a sample space where holes have neither grades nor coordinates. Maréchal
and Serra’s Figure 10 turned into Figure 203 in the first textbook on geostatistics. David
was kind or prudent when he did not praise “punctual kriging” on the dot.

In 1970, Maréchal and Serra did not explain how to test for spatial dependence between
measured values in ordered sets. Agterberg did not explain it either in his 1970 paper or
in his 1974 textbook. Neither did David in his 1977 textbook. If David’s set of nine (9)
ordered holes were to display a significant degree of spatial dependence, it would define
a finite sample space. Without a significant degree of spatial dependence, however, the
same set of nine (9) holes would define a zero-dimensional sample space. Thus, it is of
critical importance in mineral exploration and mining to verify spatial dependence
between ordered sets of measured values. David, unlike Journel in 1992, did not mention
that spatial dependence between measured values might be assumed.

Matheron’s 1960 “estimateur” is the length-weighted average grade of a set of measured
values with variable weights, determined in samples selected along opposite sides of a
rectangular block. For a square block, however, it turns into the arithmetic mean grade. In
geostatistics, distance- and length-weighted average grades of sets of measured values
with variable weights do not have variances. In classical statistics, all central values of
sets of measured values with constant or variable weights do have variances. Even David
did refer to “the famous central limit theorem” in Section 2.1.1 The standard error of the
mean. Central values of sets of measured values with constant weights do have variances
in classical statistics. Central values of sets of measured values with variable weights do
not have variances in geostatistics. It left geostatistical scholars cold when they were told
in the early 1990s that Matheron’s kriged estimate does not have a variance.

In Section 10.2.3.3 Combination of point and random kriging on page 286, David states,
“Writing all the necessary covariances for that system of equations might be a good test
to find out whether one really understands geostatistics.” The perfect test to rate one’s
grasp of classical statistics is to count degrees of freedom not only for all the holes but
also for all the points within B. David’s nine holes give df=9–1=8 degrees of freedom for
the set, and dfo=2(n–1)=16 for the ordered set. By contrast, David’s set of sixteen points
within B gives precisely zero degrees of freedom. After all, each point is a functionally
dependent value for the same stochastic variable in the same sample space. David’s test
for geostatistical acuity is the ultimate exercise in futility simply because the concept of
degrees of freedom went missing in Matheron’s new science. A foolproof rule of thumb
teaches that measured values do give degrees of freedom, and that functionally dependent
values (read calculated values!) do have variances.
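
Counting them out for Figure 203 takes one line of arithmetic.

\[
\mathrm{df} = n - 1 = 9 - 1 = 8 , \qquad
\mathrm{df_o} = 2(n - 1) = 2(9 - 1) = 16 , \qquad
\mathrm{df}_{\text{points within } B} = 0
\]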




Agterberg, the author of Geomathematics, brought up degrees of freedom in Chapter 6
Probability and Statistics and in Chapter 8 Statistical Dependence; Multiple Regression
but not in Chapter 10 Stationary Random Variables and Kriging. David, the author of
Geostatistical Ore Reserve Estimation, listed degrees of freedom accidentally in some
footnote in Table 1.IV on page 25 in Section 1.3.5.2 Estimation of parameters and model
fitting. This table derived from Ondrick and Griffith’s 1969 paper. The authors showed
how to apply Bartlett’s chi-square test to verify closeness of agreement between expected
and observed frequencies of copper grades at the Prince Lyell mine. David did not bother
to explain why degrees of freedom play a role not only in Bartlett’s chi-square test but
also in Fisher’s F-test, Student’s t-test, and Tukey’s WSD-test.

In Section 1.2.3.2 Parameters of dispersion, the concept of degrees of freedom surfaced
in its abstract form of n–1 in the denominator of Formula (1.7) on page 6. David
mentioned the standard deviation on the same page where the denominator of Formula
(1.4) turned out to be n instead of n–1. The author elucidated, “The parameters will most
of the time remain unknown. All that we will have are estimators like x and s.” David
saw fit to characterize the variance as follows, “The square of the standard deviation is
the parameter most commonly used by statisticians since it is easier to handle.” David’s
statement showed that he did not grasp why statisticians work with variances and count
degrees of freedom.

Such as it is, David’s textbook would have failed to earn a passing grade simply because of the
properties of variances and the concept of degrees of freedom. Moreover, David, just like
Matheron, did not know how to verify spatial dependence in sample spaces and sampling
units. It defies credulity that neither knew that all types of weighted averages do have
variances just as arithmetic means do. It is even more astounding that Agterberg, the
author of Geomathematics, did not know that his distance-weighted averages do have
variances as much as arithmetic means do. David’s writing suggested he rushed into print
his first textbook on Matheron’s new statistics. In a few years, he would be just as driven
to write some kind of handbook on geostatistical ore reserve estimation. It is not
surprising then that David himself showed misgivings about his first textbook.

In his Introduction, David proclaimed, “The text has mainly been written for mining
engineers and geologists, who are the people facing the problems of ore reserve and
grade control, and who usually have had little exposure to probability and statistics.” As
it turned out, David’s opening remarks about Matheronian geostatistics made little
statistical sense. It may explain why he rambled on, “Chapter 5 is a few pages of theory
to firmly ground the model although statisticians will find many unqualified statements
here. This is not a book for professional statisticians.” David’s grasp of applied statistics
in mineral exploration and mining was such that he could not have written a textbook for
professional statisticians. The author acknowledged Grant NRC7035, which implies that
the National Research Council of Canada did not know either that Matheron’s new
science gives pseudo kriging variances and covariances. The author also had access to the
drafting facilities of the department of Mineral Engineering at Ecole Polytechnique, and
to typing assistance of the Mineral Exploration Research Institute. So much support for
such shaky statistics is truly breathtaking.



What David did not mention in his Introduction is that he could not care less about
unqualified statements. In his List of Notations, David presented “A word of caution”
before praising thinkers like himself by claiming, “It has been known for a long time that
geostatisticians seem to have that capability of changing notations twice or more in the
same page and still understand each other.” What geostatisticians share most of all is an
innate capability of failing to grasp the nuts and bolts of classical statistics such as
functional and spatial dependence, independently measured values and degrees of
freedom, properties of variances, and, last but not least, Fisher’s F-test to verify spatial
dependence in sampling units and sample spaces. All the same, David did prove his point
by proffering a hodgepodge of terms and symbols that would have taken an ISO
Technical Committee on reserve and resource estimation at least ten years to sort out.
David showed as much affinity for the σ²–symbol as Matheron did throughout his
seminal work. Yet, neither knew that this symbol applies only to unknown population
variances. Otherwise, both Matheron and David would have known that confidence limits
for variances are a function of degrees of freedom.

Chapter 1 Elementary statistical theory and applications started with Synopsis of Chapter
1 and 2. The author announced, “The first chapter should be sufficient for a reader who
has never been exposed to statistics, to understand the elementary bases of all further
discussions. To our statistician readers, we apologize.” The author did not explain why
he apologized. On the contrary, he continued to counsel, “People who are already
familiar with statistics should at least read the second chapter, to make sure they
correctly link statistical and mining problems.”

In Section 1.1 The vocabulary of statistics in mineral resource estimation, the author of
the first textbook on geostatistics may well have talked about his personal experience. In
this short but significant section, David proclaimed, “Any statistical textbook will start
with a few elementary definitions which one tends to immediately forget, while in fact it is
very important to keep them in mind so as to avoid making meaningless statements.” On
the same page, David referred to Koch and Link 1970 Statistical analysis of geological
data. In Chapter 3 Sampling, Koch and Link discussed the concept of degrees of freedom
in clear and concise terms. Koch, Link, and Tukey worked with degrees of freedom long
before the 1970 colloquium on geostatistics. In contrast, David still failed to grasp what
degrees of freedom were all about in his 1977 textbook on geostatistics!

In Section 1.1.1 Universe, David declared, “This first definition is not in usual statistical
textbooks; despite its name it is not universally admitted but we need it in quantitative
geological sciences, thus showing right away that many nonstandard statistical problems
will occur.” The author wandered away from the “usual statistical textbooks.” He talked
about “many nonstandard statistical problems.” He journeyed in the same section from
his “universe” to “a mineral deposit.” In Section 1.1.2 Sampling unit and population, he
left his mineral deposit by proclaiming, “A sampling unit is the part of the universe on
which a measurement is made.” David’s sampling unit differs from those defined in ISO
Standards for coals, concentrates, ores, and other bulk materials. Sample spaces such as
mineral deposits and sampling units such as rounds from a drift, a shipment in bulk bags
or a cargo aboard a bulk carrier are all part of the universe.



In Section 1.4.1 Definition of independence, the author proclaimed, “In plain words two
variables are independent, if knowing one does not tell us anything about the other. For
instance, knowing the grade of blast hole D-36 on bench 920 does not help us at all to
predict the grade of block 3525 on bench 2100. To take a less extreme example, in many
gold mines, knowing that the value of one assay is 0.5 ounces/ton is of no help to predict
the grade of another sample 20 feet from the first one.” What David tried to define was
spatial independence and dependence. The highest degree of spatial dependence exists
between test results for half core samples of the same whole core sample. A significant
degree of spatial dependence may exist between test results for ordered core samples in a
borehole. Spatial dependence between central values of ordered ore zones in a profile of
boreholes makes it possible to derive confidence limits for content and grade for all ore
zones. It may allow parts of an inferred resource to be converted into proven ore.

Chapter 2 Contribution of distributions to mineral reserve problems, was the one David
deemed an essential read. In Section 2.1.1 The standard error of the mean, the author
pointed out, “One of the most widely used formulas of statistics is the one which gives the
so-called standard error of the mean, or the accuracy of an average estimate.” What
David did not know is that accuracy means the same as unbiased or freedom from error
whereas precision is a generic term that refers to all sorts of measures for variability such
as variances, standard deviations, coefficients of variation, confidence intervals, and
confidence ranges. This mistake is traceable to Gy’s 1967 and 1975 treatises in French. In
his 1979 Sampling of particulate materials; Theory and practice, Gy used precision and
accuracy as defined in various ISO Standards.

In the same section and paragraph, David admitted, “This formula is also responsible for
the largest number of mistakes on the account of statistics. It is based on the famous
central limit theorem and states that given a population and a group of independent
samples drawn from that population…” The author presented the correct formula for the
central limit theorem, and decided to give “independent” a bit more attention. So much
more that David continued in Section 2.1.2 Conditions of use, “The trouble is, however,
that for this formula to be valid the samples have to be independent of each other. Most
of the time, they are not, and even if they are, additional samples will probably no longer
be independent of each other, and of the original samples.”

The author lost his train of thought. What he tried to point out is found on page 35 in his
example of a set of ten (10) test results for iron. If n–1 test results for iron are given, then
the tenth is known simply because the sum of n differences between measured values and
their central value is zero. This is why n–1 out of n measured values are independent
whereas only one out of n is dependent. David’s set of ten (10) test results would give
df=10–1=9 degrees of freedom if all test results have equal weights. Generally, a set of n
measured values with equal weights gives df=n–1 degrees of freedom. What David did not
know was that the numerator of the central limit theorem is invariably divided by the
number of measured values. In geostatistics, however, the numerator is divided by some
number of predicted values, kriged estimates, or kriged estimators, to which may or may
not be added the number of measured values.
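
David’s iron example is easy to replay; the ten test results below are hypothetical
stand-ins.

import numpy as np

x = np.array([62.1, 61.8, 62.5, 61.9, 62.3, 62.0, 62.4, 61.7, 62.2, 62.1])
mean = np.mean(x)

# given the mean and nine of the ten results, the tenth is fixed
tenth = len(x) * mean - np.sum(x[:-1])
print(tenth, x[-1])                  # identical: df = 10 - 1 = 9

# central limit theorem: the variance of the set divided by the number of
# measured values gives the variance of its arithmetic mean
print(np.var(x, ddof=1) / len(x))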




In his Bibliography, David listed Koch and Link’s 1970 Statistical analysis of geological
data, and Tukey’s 1951 Propagation of errors, fluctuations, and tolerances. He did not
list Moroney’s Facts from figures. This delightful book was first printed in 1951,
reprinted countless times, and translated into French in 1970. In 2007, McGill’s library
still has 1956 and 1965 editions of Facts from figures. Neither did David refer to Volk’s
Applied statistics for engineers. In Chapter 7 Analysis of variance, Volk discussed the
properties of variances, the variances of functions, the application of Fisher’s F-test, and
the relationship between confidence limits for variances and degrees of freedom. The first
edition of Volk’s Applied statistics for engineers was published in 1958, and many more
have since been printed.

David referred to Gy’s early works in French, and to Visman’s 1970 A general sampling
theory. It was Visman who built a bridge between sampling theory with its homogeneous
populations and sampling practice with its heterogeneous sampling units and sample
spaces. A solid grasp of Visman’s thorough research would have improved David’s
Geostatistical Ore Reserve Estimation and Agterberg’s Geomathematics.

In his Index, David listed aureola, bull’s eye shot, deconvolution, and chaotic component
but did not list degrees of freedom, central limit theorem, and functional dependence. In
Chapter 12 Ore modelling [sic], the author acknowledged, “There is an infinite set of
simulated values,” and wondered how to, “make that infinite set smaller and get the
model closer to reality.” What he did not know is that infinite sets of simulated values
give zero pseudo kriging variances. David did not tackle the daunting task of selecting
the least biased subset of some infinite set of simulated values. In a rare instant of perfect
vision, David admitted, “The criticism to this model is obvious. The simulation is not
reality. There is only one answer. The proof of the pudding is…!” Hecla’s Grouse Creek
and Bre-X’s Busang are but two of the inferred resources that failed David’s pudding test.
Fisher’s F-test proved that David’s pudding test is a geostatistical sleight of hand.

The key to doing more with fewer boreholes was to assume spatial dependence between
measured values in ordered sets. Agterberg claimed in 1974 that it makes sense to assume
that values varying from point to point obey stationary random functions. The practice of
assumed causality explains why geostatistical reviewers did object in 1992 when Fisher’s
F-test proved a significant degree of spatial dependence between gold grades of ordered
rounds in a drift. Stanford’s Journel made a cryptic reference to a certain decision in his
letter of October 15, 1992, to Professor Dr R Ehrlich, Editor, Journal for Mathematical
Geology. Journel wrote, “The very reason for geostatistics or spatial statistics in general
is the acceptance (a decision rather) that spatially distributed data should be considered
a priori as dependent one to another, unless proven otherwise.”

Inferred resources are created when grades within ore sections of boreholes are assumed
similar to those between ore sections. Applied statistics does give unbiased confidence
limits for contents and grades of proven ore within inferred resources based on spatial
dependence within ore sections. In addition, it gives unbiased confidence limits for
contents and grades of proven reserves. In contrast, geostatistics gives infinite sets of
distance-weighted averages with zero degrees of freedom and zero pseudo variances.



2.10 David’s handbook on geostatistics

Professor Dr M David may have assumed that the world’s mining industry wanted more
of the same. It would explain why the author of Geostatistical Ore Reserve Estimation
wrote this 1988 Handbook of Applied Advanced Geostatistical Ore Reserve Estimation.
He still did not think assuming, kriging, and smoothing quite as silly as do statisticians.
He did not test for spatial dependence between measured values in ordered sets nor did he
derive confidence limits for block grades. He worked with pseudo kriging variances and
Lagrange multipliers and ignored proven statistical methods.

David’s 1988 handbook had far fewer pages than his 1977 textbook but it did have a
weightier title. In those days, he struggled with infinite sets of simulated values, and
pondered how to select least biased subsets. He was also proud that geostatisticians still
understand each other even when they change notations twice on the same page. He may
not have known that his simulated values were in fact distance-weighted average point
grades in applied statistics. Some scholar should have told him that simulated values are
called either kriged estimates or kriged estimators in Matheron’s new science.

David’s handbook was one of a kind if only because of its avant-garde title and poignant
lack of bona fide statistics. He acknowledged his research was funded by the Natural
Sciences and Engineering Research Council of Canada. His 1977 textbook counted 364 pages,
a ½-page Preface, a 4-page List of Notations, an 8-page Index, and a 10-page Bibliography. In
his 1988 handbook, he pressed forward to 216 pages of applied advanced research, four
pages of References, no Preface, no Index, and a tacky title. Just the same, David felt
much more secure this time than he did in 1977. In fact, he did not apologize to statistical
readers nor did he predict any unqualified statements. Once more, he did owe his readers
an apology because his synthesis of geostatistics and Lagrange multipliers failed to give
unbiased confidence limits for block grades as a measure for risk.

David may have perused Journel and Huijbregts’s 1978 Mining Geostatistics. If he did,
he might have noted the zero kriging variance under Remark 2 (ii) in Section V.A Theory
of Kriging of Chapter V The estimation of in situ resources. In his first textbook, he did
not mention that an infinite set of simulated values does give a zero variance and a unity
covariance. Nor did he mention the immeasurable odds of selecting the Best Linear
Unbiased Estimate (BLUE) out of some infinite set of simulated values. Odds beyond
measure may seem somewhat less odd to those who work with dense symbols rather than
with transparent irrational numbers.

Matheron taught his new science of geostatistics mostly by symbols. It may explain why
infinite sets of kriged estimates, zero kriging variances, and unity kriging covariances,
did not bother Matheron’s students. It does explain why unbiased confidence limits for
metal contents and grades of ore reserves were missing in David’s 1977 Geostatistical
Ore Reserve Estimation and Journel and Huijbregts’s 1978 Mining Geostatistics, and scores
of similarly flawed textbooks. Unbiased confidence limits derive from real variances of
sets of measured values that do have degrees of freedom. In contrast, pseudo variances of
sets of functionally dependent values do not have the degrees of freedom it takes.



David used more krige-derived eponyms in his 1988 handbook than he did in his 1977
textbook. Kriged estimates, kriged estimators, kriging variances, kriging covariances, and
scores of kriging methods, became all the rage in the 1980s. Scholars would play with all
kinds of krige-inspired neologisms whenever a new term was needed to put in plain words
the intricate workings of Matheronian geostatistics. For example, David himself coined
the term “supersimplified random kriging” in Section 5.1.3 of Chapter 5 Kriging.

In his 1988 handbook, David reduced the number of references to applied statistics even
further. Koch and Link’s 1970 Statistical Analysis of Geological Data was not to be a
part of his applied advanced research. Ingamells’s 1970s work was ignored. Jowett’s 1955
studies failed David’s litmus test. Tukey’s 1951 Propagation of Errors, Fluctuations and
Tolerances was turfed. Visman’s 1947 Sampling of Coal and Washery Products, and
1970 A General Sampling Theory, both vanished. Gy’s 1979 Sampling of Particulate
Materials, Theory and Practice turned out to be David’s sole source of knowledge on
sampling and statistics required for his research in this 1988 handbook. Just the same,
Visman’s work did inspire Gy’s early treatises in French on sampling theory and practice.
As luck would have it, Gy did not deal with analysis of variance and Fisher’s F-test.

In his Introduction, David talked about theoretical and applied geostatistics, and pointed
out how it evolved into, “…the practical tool it was originally meant to be.” He claimed,
“Comparing predictions and reality of course showed discrepancies which could be
reduced most of the time.” The author continued in the same vein, “These discrepancies
were reduced either by taking geology into account in better ways, or by checking more
carefully that the basic statistical hypotheses were met.” He pontificated, “These two
approaches, better respect of geology or better respect of statistical hypotheses are in
fact equivalent and aim at the same goal, a better orebody model.”

David’s textbook and handbook would have benefited from research into how to respect
statistical hypotheses. The question of whether a statistical hypothesis is true or false is of
critical importance in applied statistics. Matheron’s strangest statistical slant by far was
found in his Foreword to Journel and Huijbregts’s Mining Geostatistics when he claimed
that geologists stress structure, and that statisticians stress randomness. He may have tried
to suggest that geologists test for structure and statisticians for randomness in the same
sample spaces. Matheron should have taught geologists how to apply Fisher’s F-test to
the variance of a set of measured values and the first variance of the ordered set. It tests
the statistical hypothesis whether and where some structure of order in a sample space
dissipates into randomness. If these variances are statistically identical, then randomness
rules in the sample space. If the first variance term of the ordered set is significantly
lower than the variance of the set, then some structure does exist in the sample space.
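
The test is simple enough to sketch in a few lines of Python. In the sketch below, the first variance term of the ordered set is taken as half the mean squared difference between successive values, with df=n–1 for the set and dfo=2(n–1) for the ordered set; the grades are hypothetical, and scipy supplies the tabulated F-value.

    import numpy as np
    from scipy.stats import f

    def spatial_dependence_f_test(ordered_values, alpha=0.05):
        """Fisher's F-test between the variance of the set and the first
        variance term of the ordered set."""
        x = np.asarray(ordered_values, dtype=float)
        n = len(x)
        var_set = x.var(ddof=1)                       # df = n - 1
        d = np.diff(x)                                # successive differences
        var_first = (d ** 2).sum() / (2 * (n - 1))    # first variance term, dfo = 2(n - 1)
        F = var_set / var_first
        F_tab = f.ppf(1 - alpha, n - 1, 2 * (n - 1))  # tabulated F-value
        return F, F_tab, F > F_tab                    # True: spatial dependence

    # Hypothetical grades of ordered rounds in a drift:
    grades = [0.31, 0.35, 0.33, 0.38, 0.41, 0.39, 0.44, 0.42, 0.47, 0.45]
    print(spatial_dependence_f_test(grades))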

Gy’s 1979 Sampling of Particulate Materials, Theory and Practice did make David’s list
because of Gy’s use of statistical methods in sampling practice. Testing for bias between
paired test results for different types of samples is based on Student’s t-test. The question
of whether two variances are statistically identical or differ significantly is solved by
applying Fisher’s F-test. Gy referred to SF=Student-Fisher in the Index of his work. In
the text, however, Gy wrote about some unorthodox “Student-Fisher’s t-distribution”
rather than about Fisher’s F-test. Gy’s 1979 work did not show how Fisher’s F-test is
applied to optimize sampling protocols by partitioning the sum of two or more variances
in the measurement chain or hierarchy into its components. In Section 9.3 Bulk Sample
Preparation of Chapter 9 Check Samples and Duplicates, David did not show either how
to optimize primary sample selection, sample preparation, and analytical stages of a bulk
sampling protocol by applying Fisher’s F-test to the variances of different stages.

On a positive note, David did explain Student’s t-test in Section 9.1.2 Comparing two
laboratories. Not only did he know how to count degrees of freedom, but he also knew how
to improve the sensitivity of the t-test. He applied Student’s t-test to the difference
between central values of sets of identifiably different pairs of measured values. In this
case, the statistical hypothesis to be proved either true or false is whether the difference
between central values implies absence or presence of bias.
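
A minimal sketch of such a paired t-test, with hypothetical assays from two laboratories; the 0.975 quantile gives the two-sided tabulated t-value at 95% probability.

    import numpy as np
    from scipy.stats import t

    # Hypothetical paired assays: the same samples tested by two laboratories.
    lab_a = np.array([1.21, 0.98, 1.35, 1.10, 1.27, 1.05, 1.18])
    lab_b = np.array([1.25, 1.02, 1.38, 1.15, 1.30, 1.08, 1.24])

    d = lab_a - lab_b                 # identifiable pairs: work with the differences
    n = len(d)
    df = n - 1                        # degrees of freedom for the paired test
    t_obs = d.mean() / (d.std(ddof=1) / np.sqrt(n))
    t_tab = t.ppf(0.975, df)          # two-sided tabulated value at 95% probability
    bias = abs(t_obs) > t_tab         # True: the difference implies a bias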

David did not apply Fisher’s F-test to solve the question of whether two variances are
statistically identical or differ significantly. He did not apply Fisher’s F-test in his 1977
textbook. He still ignored Fisher’s F-test in his 1988 handbook. Hardly surprising, because
analysis of variance and the properties of variances were as much a mystery to Matheron
and all of his students as they were to David since the 1970s. Matheron fumbled the
variance of the length-weighted average grade of his three-dimensional block. Agterberg
fumbled the variance of the distance-weighted average grade of his zero-dimensional
point. David, in turn, was walking the tightrope from applied statistics to Matheronian
geostatistics while he struggled with those elusive properties of variances.

In Section 3.1.2 The Really Recoverable Reserves of Chapter 3 Block Variance, David
dealt with mistakes because some ore is waste while some waste is ore. He talked about,
“The distribution of values which we should consider the distribution of estimated
values.” He spelled out even more of his advanced thinking when he added, “It can be
assumed on the basis of experience that this is also a lognormal distribution however its
variance is now smaller.” David claimed, “The variance can be obtained experimentally
and theoretically from the ‘smoothing relationship’ which states that:”

                                   var(z*) = var(z) − σK²

David’s “smoothing relationship” is not based on true variances. On the contrary, each of
his “variances” is the pseudo variance of some set of functionally dependent, distance-
weighted average point grades. Pseudo variances and true variances only share squared
dimensions. This absurd smoothing relationship evolved when a single length-weighted
average grade (Matheron’s three-dimensional block grade) mushroomed into an infinite
set of distance-weighted average grades (Agterberg’s zero-dimensional point grades).

In his Foreword to Chapter 5 Kriging, David pronounced, “The recent proliferation of
different types of kriging…still shows that ordinary kriging, the way it was formulated by
Matheron in 1965 and applied by Serra in 1967, is still the tool to use in most
circumstances.” To prove his point, the author took off on yet another tangent. In Section
5.1 Improving Ore-Waste Definition with Kriging or Random Kriging, he proclaimed,
“The highly erratic mineralization …makes the usual practice of flagging blast holes
totally useless.” Perhaps predictably, David declared, “The solution is to perform kriging
on blast holes and to define a new boundary, based on kriged values, rather than blast
hole values.”




                  Figure 77 Erratic distribution of blasthole grades

For simplicity, the above copper grades are deemed equidistant. In practice, it would
make sense to derive the distances between blastholes from coordinates. A cutoff grade
of 0.20% Cu was applied to partition blasthole blocks into ore and waste. Kriging-defined
ore and waste limits are outlined in Figure 78.




                   Figure 78 Ore-waste limits estimated by kriging

The first variance term of the ordered set of erratic grades in Figure 77 derives from a
systematic walk that visits each blasthole with more than 0.20% Cu only once, and that
covers the shortest possible distance between blastholes. Figure 78 is different because
new limits are plotted based on kriged values rather than blasthole values. In this case,
too, a systematic walk that visits each blasthole within kriging-defined limits only once,
irrespective of its copper grade, and that covers the shortest possible distance, gives the
first variance term of the ordered blasthole grades within these limits.
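
A walk that covers the shortest possible distance over all blastholes is a travelling-salesman problem; a greedy nearest-neighbour walk is a workable approximation. The sketch below, with hypothetical coordinates, orders the blastholes and derives the first variance term of the grades in walk order.

    import numpy as np

    def systematic_walk(coords):
        """Greedy nearest-neighbour ordering: visit each blasthole once,
        always stepping to the closest unvisited hole."""
        coords = np.asarray(coords, dtype=float)
        unvisited = list(range(len(coords)))
        order = [unvisited.pop(0)]
        while unvisited:
            last = coords[order[-1]]
            dist = [np.linalg.norm(coords[i] - last) for i in unvisited]
            order.append(unvisited.pop(int(np.argmin(dist))))
        return order

    def first_variance_term(grades, order):
        x = np.asarray(grades, dtype=float)[order]
        d = np.diff(x)                              # differences along the walk
        return (d ** 2).sum() / (2 * (len(x) - 1))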



Fisher’s F-test is applied to var(x), the variance of the set, and var1(x), the first variance
term of the ordered set. Observed and tabulated F-values in Table 2.10.1 show that the
ordered set of erratic copper grades in Figure 77 displays a significant degree of spatial
dependence at 95% probability. In contrast, the ordered set of blasthole grades between
kriging-defined limits in Figure 78 does not display spatial dependence.

                        Table 2.10.1 Test for spatial dependence
         —————————————————————————————
         Statistic                          Symbol       erratic kriged
         —————————————————————————————
         Variance of set in %²               var(x)       0.0318 0.0394
         First variance term in %²           var1(x)      0.0191 0.0314
         Observed F-value                       F          1.67   1.26
         Significance                                        *     ns
         Degrees of freedom:
                          Set                  df           32     45
                  Ordered set                  dfo          60     90
         Tabulated F-value at:
                5% Probability            F0.05;df;dfo     1.64   1.51
                1% Probability            F0.01;df;dfo     2.01   1.79
         —————————————————————————————
          ns not significant   * significant at 5% probability

Fisher’s F-test proves that the ordered set of erratic blasthole grades displays spatial
dependence at 95% probability. Diluting erratic grades with grades of less than 0.20% Cu
within kriging-defined limits causes spatial dependence to dissipate into randomness.

Diluting blasthole grades impacts not only the central value of the set but also its
variance and confidence limits. Confidence limits for central values derive from standard
deviations and tabulated t-values at selected probability levels with applicable degrees of
freedom. Table 2.10.2 gives 95% confidence limits for the arithmetic mean of each set.

                   Table 2.10.2 Confidence limits for block grades
         —————————————————————————————
         Statistic                        Symbol        erratic    kriged
         —————————————————————————————
         Arithmetic mean in %                x           0.39       0.31
         95% Confidence range             95% CR
                     Lower limit in %    95% CRL         0.34       0.26
                     Upper limit in %   95% CRU          0.44       0.36
         95% Confidence interval in %     95% CI        ±0.048     ±0.052
         95% Confidence interval in %rel 95% CI         ±12.8      ±16.9
         —————————————————————————————
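
The limits in Table 2.10.2 are straightforward to derive: the standard deviation, the square root of the number of measured values, and the tabulated t-value at df=n–1 give the confidence interval for the arithmetic mean. A minimal sketch:

    import numpy as np
    from scipy.stats import t

    def confidence_limits(grades, level=0.95):
        """Confidence limits for the arithmetic mean of a set of grades."""
        x = np.asarray(grades, dtype=float)
        n = len(x)
        mean = x.mean()
        ci = t.ppf(0.5 + level / 2.0, n - 1) * x.std(ddof=1) / np.sqrt(n)
        return mean, mean - ci, mean + ci   # mean, lower limit, upper limit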

Arithmetic means of 0.39% for the erratic block and 0.31% for the kriged block do differ
significantly. In fact, the probability that this statistical hypothesis is true exceeds 95%.
Conversely, the probability that it is false is less than 5%. A processing plant would have
received ore with a significantly higher grade if blastholes were flagged. Conversely, it
would have received more ore but with a significantly lower grade if ore-waste limits
were redefined by kriging blasthole grades. In this case, the kriged block was about 1.5
times larger than the erratic block. Such findings do not lend credence to David’s opinion
that kriging beats flagging.

David took off on yet another mathematical tangent when he tried to put the Lagrange
multiplier to work in Section 5.1.2 Example of Chapter 5 Kriging. His problem was that Lagrange
multipliers do not give confidence limits. Just the same, he set the stage as follows: “…a
block is estimated by a weighted average of the mean BH grade inside it and the four
surrounding blocks...”




                     Figure 79 Configuration of neighbour blocks
                         retained around the block to estimate

The author added, “There is the same number n of blast holes in each of the blocks and
the variogram is isotropic. From the symmetry of the configuration, the weights assigned
to the surrounding blocks are the same: A1 = A2 = A3 = A4 = A0.” It seems somewhat
contrived but convenient that each block had the same number of blastholes. His
statement that the variogram is isotropic rang hollow because he did not even know how
to derive the first variance term of a sampling variogram. The author made matters worse
by asserting, “If the relative variogram is adopted, a good approximation of its equation
for distances less than 300 ft is γ(h) = 0.011 + h/1760.” He did not know how to test for
spatial dependence between ordered sets of blasthole grades. What he did know was how
to spin semi-variogram nonsense. His work with the Lagrange multiplier made it possible
to show more geostatistics by symbols.

Real statisticians would have applied Fisher’s F-test to verify spatial dependence by
comparing the observed F-value between the variance of the set of blocks in Figure 79
and the first variance term for ordered blocks. Statisticians would have derived the
weighted average grade for Block A0 from the grades for Block A1 to Block A4, and
confidence limits for this weighted average grade. Statisticians would have applied
Tukey’s Wholly Significant Difference test to check whether weighted average block
grades are statistically identical or differ significantly. Statisticians would have applied
Bartlett’s chi-square test to determine whether block variances are statistically identical or
differ significantly.
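
Tukey’s Wholly Significant Difference test is not packaged in common statistical libraries, but Bartlett’s test is. The sketch below, with hypothetical blasthole grades for Block A1 to Block A4, shows the variance-homogeneity step.

    import numpy as np
    from scipy.stats import bartlett

    # Hypothetical blasthole grades for Block A1 to Block A4.
    a1 = np.array([0.31, 0.28, 0.35, 0.30, 0.33])
    a2 = np.array([0.29, 0.27, 0.32, 0.28, 0.30])
    a3 = np.array([0.36, 0.31, 0.39, 0.34, 0.37])
    a4 = np.array([0.30, 0.26, 0.33, 0.29, 0.31])

    stat, p = bartlett(a1, a2, a3, a4)   # Bartlett's chi-square statistic
    # p < 0.05: the block variances differ significantly.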




David would have disapproved. In fact, he reviewed for CIM Bulletin in September 1989
a paper titled Precision Estimates for Ore Reserves. The authors used Fisher’s F-test to
verify spatial dependence between gold grades of bulk samples taken from a set of
ordered rounds in a decline. In his review, David remarked, “The authors present their
own method for calculating precision estimates for ore reserves without a single
reference (his highlights!) to 20 years worth of work in geostatistical ore reserve
estimation (see attached references).”

In his 1988 Handbook of Applied Advanced Geostatistical Ore Reserve Estimation,
David failed to explain how to apply Fisher’s F-test to verify spatial dependence between
measured values in an ordered set. He seemed to have forgotten that his own references
to statistics in his 1977 textbook were no longer included in this 1988 handbook. Nor did
he ever explain how the properties of variances impact confidence limits for contents and
grades of ore reserves. He ignored with impunity the application of analysis of variance
for which Sir Ronald A Fisher was knighted in 1952. It is true that Matheron’s teachings
caused catatonique géostatistique (geostatistical catatonia) among his students. David was but one of several
scores of geostatistical scholars who agreed that spatial dependence may be assumed, and
that degrees of freedom discriminate against functionally dependent values.

The reason why David felt compelled to write the first handbook on geostatistics may
well have been too little work with measured values. Had he been interpolating between
measured values, he would have noticed that the variances of such ordered sets rise to a
maximum and then converge on zero. This rise and fall of variances is counterintuitive in
statistics but pervasive in geostatistics. So much so that a caution against oversmoothing
became an integral part of Matheron’s new science of geostatistics. In Chapter 4
Estimation variance, David explicated, “The question of the estimation variance is one of
the essential concepts which made geostatistics known.” Despite all his research, he did
admit, “…obtaining an exact solution to the problem of the precision on recoverable
reserves is an unanswered question…” That is in fact the quintessence of the case against
geostatistics!




2.11 A study on kriging small blocks

It has been quite a feat indeed to bring the blessings of Matheron’s new science to the
world’s mining industry. Many a geologist thought it odd so much could be done with so
few boreholes. But too few knew real statistics well enough to figure out what was wrong
with geostatistics. What Matheron and his minions had failed to grasp was that functions
do have variances, and that sets of measured values give degrees of freedom. Koch, Link
and Tukey were the token statisticians at the very first colloquium on geostatistics in the
USA. They had been invited to scrutinize Matheron’s new science. What they failed to
notice was that Agterberg’s predicted value had no variance, and that his measured values
did not give degrees of freedom. Scores of skeptics had called Matheronian geostatistics a
sham. What Agterberg and Matheron did was fumble a few variances and ignore the
concept of degrees of freedom. And that’s all it took to do so much with a few boreholes!

David’s 1977 Geostatistical Ore Reserve Estimation was first in some kind of mad race
to assume, to krige, and to smooth. Journel and Huijbregts’s 1978 Mining Geostatistics
was a close second with much more of the same surreal statistics. David deserved credit
for confessing that his set of “simulated values” is infinite. Journel and Huijbregts, in
turn, came clean in Chapter V The estimation of in situ resources and admitted infinite
sets of “kriged estimators” give zero kriging variances. Infinite sets of kriged estimates
with zero kriging variances and zilch degrees of freedom troubled none of those authors.
Neither did the immeasurable odds of the kriging game give pause. On the contrary, they
seemed to have worked out how to select out of any infinite set of kriged estimates the
singular subset that does give the Best Linear Unbiased Estimator (BLUE).

Selecting such an elusive least biased subset is a formidable task indeed. So much so that
taking the Best Linear Unbiased Estimator of an infinite set of kriged estimators would
be ranked an impossible event in probability theory. In symbols, P(BLUE)=0. This zero
probability explains why scores of ore deposits did not make predicted grades. On a
positive note, least biased subsets give finite kriging variances rather than zero kriging
variances. On the downside, a finite kriging variance is just as meaningless a measure for
variability, precision, and risk as is the zero kriging variance. A lasting problem is that
every subset of any infinite set of kriged estimators does give a pseudo kriging variance.

David’s 1988 Handbook of Applied Advanced Geostatistical Ore Reserve Estimation set
the stage for geostatistical grade control at open pit mines. David’s own research had led
him to infer that kriged values were preferable to measured values because he thought the
latter were “erratic.” All the same, this so-called erratic block was not just smaller but
had a higher metal grade than his kriged block. Moreover, the ordered set of blasthole
grades for that erratic block displayed a significant degree of spatial dependence whereas
those for David’s kriged block were randomly distributed. As a result, the grade of this
erratic block could be estimated with a higher degree of precision than the grade of his
kriged block. David’s attempt at geostatistical grade control made a mockery of statistical
grade control because it does give less pay dirt and more tailings. David did not derive
confidence limits for the metal grade of his kriged block. Neither did he bother to test for
bias between metal grades of kriged and erratic blocks.



Practitioners of geostatistics need scapegoats as soon as predicted grades of ore blocks
fail to pan out. Surely, to assume, krige and smooth couldn’t possibly cause lower than
predicted grades! So it was that the smoothing relationship in David’s 1988 handbook
brought about further research into the rise and fall of kriging variances as a function of
block volumes. Armstrong was a prominent scholar at the Centre de Géostatistique and
Champigny a CIM Member in good standing when they pored over the rise and fall of
kriging variances, and wrote A study on kriging small blocks. David himself reviewed
this study and approved it for publication in CIM Bulletin, Vol 82, No 923, Mar 1989.




                    Figure 2.11.1 Location of the block to be kriged

The above facsimile has the same caption as Figure 1 in Armstrong and Champigny’s
study. It is akin to Figure 79 Configuration of neighbour blocks retained around the
block to estimate in David’s 1988 handbook. Both figures are strikingly alike in the sense
that simple symbols and real data are missing. David brought up the “famous central
limit theorem” in his 1977 textbook but did not grasp that it underpins sampling practice.
He may not have thought much of it because he did not work with it anywhere in his
1988 handbook. Otherwise, he might well have told Armstrong and Champigny that this
theorem defines the relationship between the variance of the arithmetic mean grade of the
block to be kriged and the variance of the set of measured grades about it. If all measured
values were in fact equidistant to David’s block to be kriged, its central value would be
the arithmetic mean of the set, and the central limit theorem would define its variance.

The central limit theorem tripped up Agterberg’s train of thought once more in his 1974
Geomathematics. Agterberg did put up with it in Chapter 6 Probability and Statistics and
in Chapter 7 Frequency Distributions and Function of Independent Random Variables.
So, it is all the more surprising then that he did not see fit to put it to work in Chapter 10
Stationary Random Variables and Kriging. His caption below Figure 64 in Chapter 10
reads, “Typical kriging problem; values are known at five points. Problem is to estimate
value at point P0 from the known values at P1–P5. ” So what was Agterberg’s problem?

Agterberg derived the distance-weighted average of his set of five (5) measured values
determined in samples taken at positions with different coordinates. He did not know how
to derive the central limit theorem for the central value of his set. He did not know how to
verify spatial dependence applying Fisher’s F-test to the variance of the set and the first
variance term of the ordered set. And he did not know how to count degrees of freedom
for his set of measured values and for the ordered set. Agterberg did indeed have some
real problems.
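
The point is easy to make concrete. In the sketch below, with hypothetical coordinates and grades for P1 to P5, the distance-weighted average is a function of the five measured values, and as a function it has its own variance: for independent values with equal variances, var(Σwᵢxᵢ) = Σwᵢ²·var(x).

    import numpy as np

    # Hypothetical coordinates and grades for points P1 to P5.
    points = np.array([[10.0, 5.0], [3.0, 8.0], [7.0, 2.0], [12.0, 9.0], [5.0, 5.0]])
    grades = np.array([1.2, 0.8, 1.5, 0.9, 1.1])
    p0 = np.array([6.0, 6.0])                   # the point to be estimated

    d = np.linalg.norm(points - p0, axis=1)     # distances to P0
    w = (1.0 / d) / (1.0 / d).sum()             # inverse-distance weights, sum to 1
    estimate = (w * grades).sum()               # distance-weighted average point grade

    var_x = grades.var(ddof=1)                  # variance of the set, df = n - 1 = 4
    var_estimate = (w ** 2).sum() * var_x       # the variance of this function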


David’s “block to be kriged” in Figure 2.11.1 seems quite close to being equidistant to its
“neighbour blocks.” The arithmetic mean of this set of grades is an unbiased estimate for
the grade of his “block to be kriged” if, and only if, the ordered set of grades displays a
significant degree of spatial dependence. A significant degree of spatial dependence
between grades of “neighbour blocks” is a condition sine qua non for the trueness of the
grade for David’s “block to be kriged”. The next figure is akin to Figure 79 in Chapter 4
Estimation Variance of David’s 1988 handbook, and has exactly the same caption.




                   Figure 2.11.2 Configuration of neighbour blocks
                        retained around the block to estimate

David did not have his “famous central limit theorem” in mind when he was reviewing
Armstrong and Champigny’s study in 1988. But then, the authors did not work with this
central limit theorem in their small block study. The fact that kriging variances have
nothing but squared dimensions in common with true variances troubled CIM’s reviewer
as much as the fall of kriging variances troubled the authors. After all, incompetent
mine planners were to blame simply because they over-smoothed small blocks.




                  Figure 2.11.3 Rising and falling kriging variances

The above chart shows that the kriging variance is about y=0.5 for a 10x10m block and
close to y=1.0 for a 1x1m block. So, the question is which of these kriged blocks is over-
smoothed. After all, 1x1m kriged blocks may make single truckloads but 10x10m kriged
blocks give many truckloads. Armstrong and Champigny cooked up a highly improbable
interpretation of over-smoothed blocks in the Abstract of their study. The authors set
forth, “Meaningful estimates of individual block grades are obtained when the variogram
range is large compared to the block size and the sample spacing. For a variogram range
of less than half the sample spacing, the kriged estimates were found to be uncorrelated
with the actual grades.”

Armstrong and Champigny saw fit to link meaningful estimates and widely spaced blocks
but did not explain what made such estimates meaningful. The authors stated but did not
prove that kriged block estimates and actual grades were uncorrelated for a variogram
range of less than half the sample spacing. What they should have done is verify spatial
dependence by applying Fisher’s F-test to the variance of the set and the first variance
term of the ordered set. A systematic walk that calls on the coordinates of each actual
grade only once, and covers the shortest possible distance between coordinates, gives the
first variance term of the ordered set.

Armstrong and Champigny’s study should have signaled the end of assuming, kriging,
smoothing, and rigging the rules of applied statistics. It was David’s review that made it
survive and thrive. It is simple to assume spatial dependence when working with symbols
but wildly at odds with verifying spatial dependence by applying Fisher’s F-test. What is
simpler than comparing the observed F-value between the variance of a set and the first
variance term of the ordered set with values of F-distributions at different probability
levels and with the applicable degrees of freedom for each variance? The problem is that
geostatisticians know more about assuming spatial dependence than counting degrees of
freedom.

David was pleased with Armstrong and Champigny’s geostatistical inferences. Surely, it
was reassuring to find out that mine planners were to blame for the rise and fall of kriging
variances. So, it would be a matter of teaching those mine planners to assume and krige
by the book, and to smooth the best possible blocks. His Geostatistical Ore Reserve
Estimation and Handbook of Applied Advanced Geostatistical Ore Reserve Estimation
put David in an excellent position to teach the intricacies of geostatistics to geoscientists
and mine planners alike. Armstrong and Champigny’s study set the stage for staying the
course, and for teaching what the right way of assuming, kriging and smoothing is all about.

Armstrong and Champigny thought their study had proved small blocks should not be
over-smoothed. Yet, they did not have the faintest idea why kriging variances rise and
fall. The authors did not have a clue why the pseudo kriging variance is as spurious a
measure for variability and precision as the pseudo kriging covariance is for associative
dependence between measured values in ordered sets.

What is surprising is that neither author knew each function has its own variance and
measured values give degrees of freedom. They did not know how to count degrees of
freedom, and why Fisher’s F-test demands degrees of freedom. That may well be the
reason why practitioners of geostatistics are taught to assume rather than prove spatial
dependence. Some of Armstrong’s studies are posted with the Online Library of the
Centre de Géostatistique but the strange study of the rise and fall of kriging variances is
not among them.



Matheron’s seminal work may seem to have brought to an end the human struggle against
randomness. For it was Matheron’s vision to assume, krige, smooth, and rig the rules of
applied statistics that would create order where chaos might otherwise have prevailed.
Thanks to Matheron’s minions it turned into a dreadful pseudo science. Armstrong and
Champigny’s caution against over-smoothing small blocks did not change the fact that
kriged block grades are functions. Journel’s doctrine of assumed spatial dependence did
not solve Agterberg’s problem because distance-weighted averages, too, are functions.

Armstrong was the Editor of De Geostatisticis when she pointed her finger at a few
critics of geostatistics in a stirring article on “Freedom of Speech?” and pondered, “Does
the peer review process deprive these people of their freedom of speech by denying them
the chance to express opinions that run against popular view? Or is the peer review
system just doing its job of rejecting papers that do not back up their opinions with
scientific fact?” She was an Associate Editor with the Journal for Mathematical Geology
in 1992 but still did not grasp the difference between functional and spatial dependence.

Bre-X’s bogus grades of a few boreholes at positions with different coordinates defined
an infinite set of kriged boreholes with matching bogus grades. So, it was straightforward
to assume similar grades between widely spaced boreholes, and add kriged boreholes to
Bre-X’s inferred gold resource. Step-out drilling at Busang was extremely effective
because assuming spatial dependence between widely spaced lines of salted boreholes
added a massive volume of barren rock to Bre-X’s phantom gold resource.

Table 2.11.1 lists the basic statistics for three lines of nine kriged boreholes between nine
salted boreholes on Line SEZ-44 and eleven salted boreholes on Line SEZ-49.

        Table 2.11.1 Statistics for kriged and salted boreholes
   ——————————————————————————————————
   Statistic                               Symbol SEZ-44            kriged   kriged   kriged   SEZ-49
   ——————————————————————————————————
   Arithmetic mean in gpt                      x             2.68    2.90     2.99     3.08      3.17
   95% Confidence interval in gpt           95% CI          ±0.63   ±0.02    ±0.02    ±0.05     ±0.84
   95% Confidence interval in %rel          95% CI           ±24    ±0.8     ±0.7     ±1.6       ±26

   Variance of set                          var(x)         0.6718   0.0105   0.0126   0.0218   1.5576
   First variance term                      var1(x)        0.9370   0.0012   0.0009   0.0055   2.0795
   Observed F-value                           F             1.39     8.63     13.76    3.95     1.34
   Significance                                              ns        ?        ?        ?       ns
   Degrees of freedom for:
                       Set                     df            8        0        0        0        10
               Ordered set                    df(o)          16       0        0        0        20
   ——————————————————————————————————
     ns not significant   ? ask qualified krigeologist


The above statistics show that two lines of salted boreholes do not display a significant
degree of spatial dependence, and that three lines of kriged boreholes do indeed create
some delusion of spatial dependence. The statistics highlight what happens when each and
every distance-weighted average-cum-kriged estimate no longer has its own variance, and
when the din of kriging drums drowns out those who work with degrees of freedom.

