A SPREADSHEET APPROACH TO TEACHING RESAMPLING
(Journal of Education for Business July/August, 1998)
(Does not include figures)
Richard L. Morris
Rock Hill, SC 29733
Barbara A. Price
School of Business
Georgia Southern University
A SPREADSHEET APPROACH TO TEACHING RESAMPLING
This article presents the use of a spreadsheet-based package to teach resampling. A brief
description of resampling is given, followed by a discussion of the package. Then two problems
are used to illustrate resampling. These problems are initially solved using classical parametric
methods, and some of the advantages and disadvantages of this approach are offered. This is
followed by a discussion of how resampling could be applied to the problems. Finally, the
resampling method is expanded by incorporating Lotus into the process to simulate enough
replications to provide for reasonable conclusions.
Anyone who has ever taught statistics has observed that most undergraduate students can
learn the steps involved in statistical inference but few of them really understand what they do.
For that matter, how many of us really understood what we were studying when we first took
statistics? The difficulty is that in presenting what essentially are fairly simple concepts, a
complicated set of theory and mathematical calculations has to be learned also.
One solution to this problem is the use of resampling techniques to assist in explaining
many of the concepts of hypothesis testing. Resampling is simple to understand, it makes no
assumptions on the underlying distribution, and involves no complicated mathematical formulas.
The basic concept is straightforward. Rather than assuming an underlying distribution and
making inferences based on formulas, a sample space is constructed from the original sample (or
samples) and then samples are drawn repeatedly from this "population." Inferences are drawn
based on the resulting samples. It is computationally very expensive, of course, but most
practical problems are well within the capabilities of modern PCS.
The authors have begun to write a series of macros to perform resampling in Lotus. It is
hoped that the availability of such a package will encourage more teachers to incorporate
resampling in their courses. Most students are already familiar with spreadsheets and might find
such an approach easier than learning a specially written resampling package. In addition,
students can examine the macros and associated formulas to gain more of an understanding of
what occurs. If they want to modify output, such as resizing graphs or adding explanations, it is
easily done. Furthermore, the macros themselves may be modified by skilled Lotus users to treat
Julian L. Simon, Professor of Business Administration at the University of Maryland, can
be considered the father of the use of resampling as a method to teach statistics. His definition of
resampling is as follows:
Let's define "resampling." A statistical procedure models a physical process. A
resampling method simulates the model with easy-to-manipulate symbols. Either
the observed data (in the case of a problem in statistical inference), or a data-
generating mechanism such as a die (if it is a problem in probability), is used to
produce new hypothetical samples, the properties of which can then be examined.
The resampler postulates a universe and then examines how it behavesin the
case of statistics, comparing the outcomes to the criterion we choose. (Simon,
One of the virtues of resampling is that we do not need to assume that the populations are
WHY USE A SPREADSHEET FOR RESAMPLING?
Before describing the spreadsheet package, it might be worthwhile to discuss some of the
pros and cons of resampling with a spreadsheet. Speaking of the cons first, there is at least one
resampling package already developed (Rosa-Hatko, 1995). The authors have not used it, but it
is almost certainly much faster and more powerful than any spreadsheet-based package could be.
However, most students are familiar with at least one of the commonly used spreadsheets, so
that having access to a spreadsheet-based package might be the key to initiating them to
There are other advantages also. With the macros to be discussed here, there is no need
to type in a series of commands. Everything is windows-based and actuated by macro buttons. If
the user wishes to modify some of the output for presentation purposes, it is very easy to do so.
Since the macros are in the spreadsheet itself, the user can examine them to see what they do.
Lastly, and this might be the most important consideration, users competent with macro
commands may easily modify the macros to fit their individual applications. For example, if one
wishes to resample using the median rather than the mean as a measure of central tendency, all he
or she has to do is modify one of the formulas in the macros.
ORGANIZATION OF THE RESAMPLING SPREADSHEET
The resampling package relies on a series of macros in Lotus 4+. The name of the
spreadsheet containing the macros is RESAMPLE.WK4. When the user opens the spreadsheet,
an autoexec macro automatically goes to a control section, from which the user can go to any of
the available applications. Each application is in a separate worksheet within the spreadsheet. In
addition to the application worksheets, there is a separate worksheet for the results, another
containing the macros themselves, and a README worksheet with instructions on using the
macros. The macros are designed to be as user-friendly as possible and to that end utilize many
Lotus features in the macros. For example, macro buttons are provided to navigate to the various
application worksheets and to invoke macros. When the spreadsheet opens up and the user is in
the control section, shown in Figure 1, he or she uses one of the buttons to go to whichever
application is desired.
At this stage of development, the spreadsheet contains applications for one mean, two
means, paired differences, one variance and one proportion. Each of these has an associated
macro button in the control section. A README section, shown in Figure 2, lists the
instructions necessary to operate the spreadsheet.
In the instructional package developed, only the bootstrap procedure has been
incorporated. The sampling environment is assumed to be an infinite population, and sampling
is with replacement. Also, the available applications are limited to inferences for a single mean,
a single proportion, a single variance, two means from independent populations, and the mean for
paired differences. In the following sections two examples are used to explain spreadsheet-based
Consider the following data from the article "Pick a Sample," by Peterson (1991).
Battery Lifetime (weeks) Mean Std. Dev.
With additive: 33 32 31 28 31 29 29 24 30 28 29.5 2.550
Without additive: 28 28 32 31 24 23 31 27 27 31 28.2 3.084
The first row of observations lists the lives of 10 automobile batteries with an additive
designed to lengthen their useful life. The second row shows the lives of 10 batteries that did not
receive the additive. To verify the effectiveness of the additive, we may conduct a pooled t-test
H 0 : 1 2
H 1 : 1 > 2
1 = the mean lifetime of the batteries with the additive
2 = the mean lifetime of the batteries without the additive
In order to conduct the pooled t-test, we need to make the following assumptions:
The samples are independently drawn.
The variances of the two populations are equal, i.e. 21 = 2 2 , even
though the value is unknown.
The populations are normally distributed.
Furthermore, when H 0 is true the means of the two populations are equal.
( X1- X 2 )-0
t n1+ n2- 2 = 2 2
S p S p
DECISION RULE: If α = .05, reject H 0 if t n1+n2- 2 > 1.734
( 29.5 - 28.2 ) - 0
t 18 = = 1.027
CONCLUSION: Fail to reject H0; cannot conclude that the additive lengthened the useful
life of the batteries. (p-value .159)
Before actually performing the resampling experiment, it might be informative to
describe how resampling would be done without a computer, perhaps as a way of introducing it
in the classroom. First, treat the two sets of observations as if they comprise one population in
which there is no difference between the batteries with the additive and those without it. Then
take repeated random samples, two at a time, of 10 observations each. This could be
accomplished by putting the 20 observations into a hat and sampling. Since our samples can be
considered as coming from infinite-sized populations, we should sample with replacement, by
replacing each observation as drawn and re-shuffling.
Calculate the mean of each sample and compute the difference. Repeat this process a
sufficient number of times to arrive at a meaningful sampling distribution. This distribution
reflects the behavior of the sample means if there is no significant difference between the two
original sets of batteries. By examining the sampling distribution, it can then be determined
whether the difference between the actual sample means is great enough to be significant. This is
a good exercise to conduct as an introduction to resampling. It involves learning by doing, it is
an exact simulation of how resampling is actually done, and it soon becomes clear why a
computer is necessary to perform the calculations.
RESAMPLING USING THE SPREADSHEET PACKAGE
To conduct the test with resampling, the user should navigate to the data entry section for
two means and enter the data as shown in Figure 3. The number of resamples is specified at
1,000, the null hypothesis at 0, and the confidence level at .95. Since this is an upper-tailed
example, U is entered for the type of test.
The resampling is done using the Lotus @VLOOKUP command, which in effect samples
with replacement, or equivalently, from an infinite or very large population. This is appropriate
for this example because the population can be considered as the very large number of batteries
that might or not be treated with the additive.
After the input is complete, the user clicks the RESAMPLE macro button, invoking the
macros necessary for the two means test. When resampling is finished, the spreadsheet ends up
in the RESULTS worksheet, illustrated in Figure 4. Shown in the initial screen of this section
are the p-values and confidence intervals, both from resampling and from parametric methods.
As can be seen, there is very close agreement between the results. It has been the authors'
experience that the results are usually very close for large samples, but not as close for smaller
samples. This is consistent with the literature to date on resampling. See, for example, Seppala,
Moskowitz, Plante, and Tang (1995), in which the results of resampling and standard methods
were compared for the calculation of control charts. One of the conclusions of this reference is
that resampling and classical methods are in very good agreement when the underlying
distribution is normal. However, when sampling from a skewed population (they used the
exponential as an example) the resulting bias (the difference between the resampling or
parametric results and "true" results) is less with resampling.
There are two more sections of interest in the RESULTS worksheet. The first is the
graph of the resamples, shown in Figure 5, obtained by scrolling down one screen. As expected,
since the samples are combined before resampling, the mean of this histogram is approximately
zero. The sample mean difference of 1.3 lies in the upper tail of the distribution consistent with
its p-value. Scrolling to the right from the original RESULTS screen, the user may view the
resamples, some of which are shown in Figure 6.
From a pedagogical standpoint, the most important points to dwell on are the resamples
themselves and the graphs. The instructor should point out the formulas used in the resampled
observations to show the equivalence between them and resampling by hand. The graphs show
intuitively how likely it is to obtain the actual sample difference from the population constructed
from the original sample.
The preceding problem required a very modest amount of computer time. It took 6
seconds to run the complete resampling procedure for 1,000 resamples. The REPEAT procedure
took only 3 seconds to perform 1,000 resamples. REPEAT takes less time because it can take
advantage of a certain amount of setup work already done by RESAMPLE. For 200 resamples,
RESAMPLE took 4 seconds and REPEAT required 2 seconds. All of these times were recorded
on a Pentium 166.
The second example makes use of data collected as part of a project in one of the authors'
graduate statistics classes. The project involved a water processing plant located in Bucksport,
South Carolina, and operated by the Grand Strand Water & Sewer Authority (Burgoyne, Eliott,
King, and Carroll, 1995). At the time, the treatment plant was measuring pH levels of processed
water both manually, using pH indicator paper, and automatically with a computerized pH meter.
This meter operated by means of an electrode suspended in the water and automatically took
readings every five minutes. Operating personnel at the plant used both methods because they
were not as confident of the accuracy of the automated procedure as they were of the manual
method. However, they were considering eliminating the manual method if they could be
assured of the automated method's accuracy. Accordingly, the students doing the project took a
series of 177 simultaneous pH readings using both the manual and the automated methods.
Assuming that the manual method is reliable, its readings should provide a good benchmark for
the automated process so that if there is no difference, on average, between the two, it can be
concluded that the automated method is as reliable as the manual one. This is a classic paired
Ho : d = 0
H1: d 0
difference experiment in which the following set of hypotheses hold:
where d is the mean paired difference between the manual and automated readings. If the
assumption of no difference is rejected, then we can conclude that the readings are different, on
average, between the manual and automated methods. If the manual method is considered
accurate, a significant difference between the two could be blamed on the unreliability of the
automated method, and they should keep both.
To analyze this problem, the paired difference worksheet was used. Again, it is
appropriate to think of this as sampling from an infinite population, since the readings will
continue virtually forever. The input appears in Figure 7, although only some of the 177
observations are shown. The results of 1,000 resamples achieved a resampled p-value of .30364.
This agrees closely with the theoretical value of .31262. Since neither one would cause us to
reject the null for any reasonable significance level, we may conclude that the automated pH
method is consistent enough with the manual one to justify using the automated method
This second example, was of course, much larger than the first one and consequently took
longer to run. It took 48 seconds to run the RESAMPLE macro for 1,000 resamples. REPEAT
took 38 seconds for 1,000 resamples. When the number of resamples was decreased to 200,
RESAMPLE took 24 seconds while REPEAT required 15 seconds. Times on a 486 computer
were much slower, but manageable for smaller problems.
In the authors' experience so far, there is very close agreement between resample and
theoretical results for large samples. This is reasonable, since, according to the Central Limit
Theorem, the sampling distribution of the mean approaches normality with larger sample sizes.
For smaller problems however, there is less agreement. The authors suspect that this
disagreement is caused by skewness, as suggested by Seppala et al. (1995)
As a test, the authors simulated 1,000 resamples for 1,000 trials (a total of 1,000,000
resamples) using the data from Example 1. The mean p-value for the 1,000 trials was .1548,
compared to the theoretical p-value of .1589, so agreement is very close on average. The mean
absolute deviation for the 1,000 trials was .0093, or somewhat less than 1%. The mean lower
confidence limit for the 1,000 trials was 1.110, compared to the theoretical value of 1.358. The
mean upper confidence limit was 3.712 compared to a theoretical value of 3.958. The mean
absolute deviations of the LCL and UCL, respectively, were .0842 and .0839.
CONCLUSIONS AND PLANS FOR FUTURE DEVELOPMENT
At this time, the package includes macros that perform the same operations presented in
this paper for cases involving one mean, two means, paired differences, one variance and one
proportion. We plan to extend the macros in the future to handle perhaps two or three additional
common situations, such as regression and correlation analysis. Consideration is also being
given to offering the finite population, sampling without replacement or randomization sampling
environment alternative. The authors would appreciate very much any suggestions as to what
features to add to the package.
A copy of the package is available on request from the first author.
Burgoyne, S., Eliott, B., King, R., and Carroll, M. (1995). pH analysis for surface water
treatment plant, Conway & South Pump Stations. Unpublished class project.
Peterson, I. (1991). Pick a sample. Science News, 140, 56-58.
Ricketts, C. and Berry, J. (1994). Teaching statistics through resampling. Teaching Statistics,
16 (2), 41-44.
Rosa-Hatko, W. (1995). Resampling stats. ORMS Today, 22 (2), 72-74.
Seppala, T., Moskowitz, H., Plante, R., & Tang, J. (1995). Statistical process control via the
subgroup bootstrap. Journal of Quality Technology, 27, 139-153.
Simon, J. L. (1993). Resampling: The New Statistics. Belmont, CA: Duxbury.
Simon, J. L. (1994). The resampling method for statistical inference. Basics chapter of
Philosophy document of Internet manuscript.
Things to do:
Either get last reference or delete it.
Is first Simon reference to book or internet document? Look at
one of the others. Would look like this:
Simon, J. L. (1993). Resampling: The New Statistics (pp. 345-346). Duxbury
Check to see if publications cited need issue number (they don't
if pages run throughout the volume.)
Where is Duxbury located?
Is # of words OK?