Learning Center
Plans & pricing Sign in
Sign Out



     (Journal of Education for Business July/August, 1998)
                   (Does not include figures)

                                                    Richard L. Morris
                                                  Winthrop University
                                                   Rock Hill, SC 29733
                                                      Barbara A. Price
                                                    School of Business
                                           Georgia Southern University
                                                       Statesboro, GA


       This article presents the use of a spreadsheet-based package to teach resampling. A brief

description of resampling is given, followed by a discussion of the package. Then two problems

are used to illustrate resampling. These problems are initially solved using classical parametric

methods, and some of the advantages and disadvantages of this approach are offered. This is

followed by a discussion of how resampling could be applied to the problems. Finally, the

resampling method is expanded by incorporating Lotus into the process to simulate enough

replications to provide for reasonable conclusions.


       Anyone who has ever taught statistics has observed that most undergraduate students can

learn the steps involved in statistical inference but few of them really understand what they do.

For that matter, how many of us really understood what we were studying when we first took

statistics? The difficulty is that in presenting what essentially are fairly simple concepts, a

complicated set of theory and mathematical calculations has to be learned also.

       One solution to this problem is the use of resampling techniques to assist in explaining

many of the concepts of hypothesis testing. Resampling is simple to understand, it makes no

assumptions on the underlying distribution, and involves no complicated mathematical formulas.

The basic concept is straightforward. Rather than assuming an underlying distribution and

making inferences based on formulas, a sample space is constructed from the original sample (or

samples) and then samples are drawn repeatedly from this "population." Inferences are drawn

based on the resulting samples. It is computationally very expensive, of course, but most

practical problems are well within the capabilities of modern PCS.
       The authors have begun to write a series of macros to perform resampling in Lotus. It is

hoped that the availability of such a package will encourage more teachers to incorporate

resampling in their courses. Most students are already familiar with spreadsheets and might find

such an approach easier than learning a specially written resampling package. In addition,

students can examine the macros and associated formulas to gain more of an understanding of

what occurs. If they want to modify output, such as resizing graphs or adding explanations, it is

easily done. Furthermore, the macros themselves may be modified by skilled Lotus users to treat

new situations.


       Julian L. Simon, Professor of Business Administration at the University of Maryland, can

be considered the father of the use of resampling as a method to teach statistics. His definition of

resampling is as follows:

       Let's define "resampling." A statistical procedure models a physical process. A

       resampling method simulates the model with easy-to-manipulate symbols. Either

       the observed data (in the case of a problem in statistical inference), or a data-

       generating mechanism such as a die (if it is a problem in probability), is used to

       produce new hypothetical samples, the properties of which can then be examined.

        The resampler postulates a universe and then examines how it behavesin the

       case of statistics, comparing the outcomes to the criterion we choose. (Simon,


One of the virtues of resampling is that we do not need to assume that the populations are

normally distributed.


       Before describing the spreadsheet package, it might be worthwhile to discuss some of the

pros and cons of resampling with a spreadsheet. Speaking of the cons first, there is at least one

resampling package already developed (Rosa-Hatko, 1995). The authors have not used it, but it

is almost certainly much faster and more powerful than any spreadsheet-based package could be.

However, most students are familiar with at least one of the commonly used spreadsheets, so

that having access to a spreadsheet-based package might be the key to initiating them to


       There are other advantages also. With the macros to be discussed here, there is no need

to type in a series of commands. Everything is windows-based and actuated by macro buttons. If

the user wishes to modify some of the output for presentation purposes, it is very easy to do so.

Since the macros are in the spreadsheet itself, the user can examine them to see what they do.

Lastly, and this might be the most important consideration, users competent with macro

commands may easily modify the macros to fit their individual applications. For example, if one

wishes to resample using the median rather than the mean as a measure of central tendency, all he

or she has to do is modify one of the formulas in the macros.


       The resampling package relies on a series of macros in Lotus 4+. The name of the

spreadsheet containing the macros is RESAMPLE.WK4. When the user opens the spreadsheet,

an autoexec macro automatically goes to a control section, from which the user can go to any of

the available applications. Each application is in a separate worksheet within the spreadsheet. In

addition to the application worksheets, there is a separate worksheet for the results, another

containing the macros themselves, and a README worksheet with instructions on using the

macros. The macros are designed to be as user-friendly as possible and to that end utilize many

Lotus features in the macros. For example, macro buttons are provided to navigate to the various

application worksheets and to invoke macros. When the spreadsheet opens up and the user is in

the control section, shown in Figure 1, he or she uses one of the buttons to go to whichever

application is desired.

       At this stage of development, the spreadsheet contains applications for one mean, two

means, paired differences, one variance and one proportion. Each of these has an associated

macro button in the control section. A README section, shown in Figure 2, lists the

instructions necessary to operate the spreadsheet.

       In the instructional package developed, only the bootstrap procedure has been

incorporated. The sampling environment is assumed to be an infinite population, and sampling

is with replacement. Also, the available applications are limited to inferences for a single mean,

a single proportion, a single variance, two means from independent populations, and the mean for

paired differences. In the following sections two examples are used to explain spreadsheet-based


                                          EXAMPLE 1

         Consider the following data from the article "Pick a Sample," by Peterson (1991).

                                      Battery Lifetime (weeks)                Mean Std. Dev.

               With additive:         33 32 31 28 31 29 29 24 30 28           29.5     2.550

               Without additive:      28 28 32 31 24 23 31 27 27 31           28.2     3.084

         The first row of observations lists the lives of 10 automobile batteries with an additive

designed to lengthen their useful life. The second row shows the lives of 10 batteries that did not

receive the additive. To verify the effectiveness of the additive, we may conduct a pooled t-test

as follows:

                                              H 0 : 1   2

                                              H 1 : 1 >  2


           1 = the mean lifetime of the batteries with the additive

           2 = the mean lifetime of the batteries without the additive

In order to conduct the pooled t-test, we need to make the following assumptions:

                        The samples are independently drawn.

                        The variances of the two populations are equal, i.e.  21 =  2 2 , even

                        though the value is unknown.

                        The populations are normally distributed.

                        Furthermore, when H 0 is true the means of the two populations are equal.


                                                          ( X1- X 2 )-0
                                          t n1+ n2- 2 =        2    2
                                                            S p S p
                                                             n1   n2

DECISION RULE: If α = .05, reject H 0 if t n1+n2- 2 > 1.734


                                          ( 29.5 - 28.2 ) - 0
                                 t 18 =                       = 1.027
                                            8.003 8.003
                                              10       10

CONCLUSION:            Fail to reject H0; cannot conclude that the additive lengthened the useful

                       life of the batteries. (p-value .159)

       Before actually performing the resampling experiment, it might be informative to

describe how resampling would be done without a computer, perhaps as a way of introducing it

in the classroom. First, treat the two sets of observations as if they comprise one population in

which there is no difference between the batteries with the additive and those without it. Then

take repeated random samples, two at a time, of 10 observations each. This could be

accomplished by putting the 20 observations into a hat and sampling. Since our samples can be

considered as coming from infinite-sized populations, we should sample with replacement, by

replacing each observation as drawn and re-shuffling.

       Calculate the mean of each sample and compute the difference. Repeat this process a

sufficient number of times to arrive at a meaningful sampling distribution. This distribution

reflects the behavior of the sample means if there is no significant difference between the two

original sets of batteries. By examining the sampling distribution, it can then be determined

whether the difference between the actual sample means is great enough to be significant. This is

a good exercise to conduct as an introduction to resampling. It involves learning by doing, it is

an exact simulation of how resampling is actually done, and it soon becomes clear why a

computer is necessary to perform the calculations.


       To conduct the test with resampling, the user should navigate to the data entry section for

two means and enter the data as shown in Figure 3. The number of resamples is specified at

1,000, the null hypothesis at 0, and the confidence level at .95. Since this is an upper-tailed

example, U is entered for the type of test.

       The resampling is done using the Lotus @VLOOKUP command, which in effect samples

with replacement, or equivalently, from an infinite or very large population. This is appropriate

for this example because the population can be considered as the very large number of batteries

that might or not be treated with the additive.

       After the input is complete, the user clicks the RESAMPLE macro button, invoking the

macros necessary for the two means test. When resampling is finished, the spreadsheet ends up

in the RESULTS worksheet, illustrated in Figure 4. Shown in the initial screen of this section

are the p-values and confidence intervals, both from resampling and from parametric methods.

As can be seen, there is very close agreement between the results. It has been the authors'

experience that the results are usually very close for large samples, but not as close for smaller

samples. This is consistent with the literature to date on resampling. See, for example, Seppala,

Moskowitz, Plante, and Tang (1995), in which the results of resampling and standard methods

were compared for the calculation of control charts. One of the conclusions of this reference is

that resampling and classical methods are in very good agreement when the underlying

distribution is normal. However, when sampling from a skewed population (they used the

exponential as an example) the resulting bias (the difference between the resampling or

parametric results and "true" results) is less with resampling.

       There are two more sections of interest in the RESULTS worksheet. The first is the

graph of the resamples, shown in Figure 5, obtained by scrolling down one screen. As expected,

since the samples are combined before resampling, the mean of this histogram is approximately

zero. The sample mean difference of 1.3 lies in the upper tail of the distribution consistent with

its p-value. Scrolling to the right from the original RESULTS screen, the user may view the

resamples, some of which are shown in Figure 6.

       From a pedagogical standpoint, the most important points to dwell on are the resamples

themselves and the graphs. The instructor should point out the formulas used in the resampled

observations to show the equivalence between them and resampling by hand. The graphs show

intuitively how likely it is to obtain the actual sample difference from the population constructed

from the original sample.

       The preceding problem required a very modest amount of computer time. It took 6

seconds to run the complete resampling procedure for 1,000 resamples. The REPEAT procedure

took only 3 seconds to perform 1,000 resamples. REPEAT takes less time because it can take

advantage of a certain amount of setup work already done by RESAMPLE. For 200 resamples,

RESAMPLE took 4 seconds and REPEAT required 2 seconds. All of these times were recorded

on a Pentium 166.

                                           EXAMPLE 2

       The second example makes use of data collected as part of a project in one of the authors'

graduate statistics classes. The project involved a water processing plant located in Bucksport,

South Carolina, and operated by the Grand Strand Water & Sewer Authority (Burgoyne, Eliott,

King, and Carroll, 1995). At the time, the treatment plant was measuring pH levels of processed

water both manually, using pH indicator paper, and automatically with a computerized pH meter.

This meter operated by means of an electrode suspended in the water and automatically took

readings every five minutes. Operating personnel at the plant used both methods because they

were not as confident of the accuracy of the automated procedure as they were of the manual

method. However, they were considering eliminating the manual method if they could be

assured of the automated method's accuracy. Accordingly, the students doing the project took a

series of 177 simultaneous pH readings using both the manual and the automated methods.

Assuming that the manual method is reliable, its readings should provide a good benchmark for

the automated process so that if there is no difference, on average, between the two, it can be

concluded that the automated method is as reliable as the manual one. This is a classic paired

                                            Ho : d = 0

                                            H1: d  0

difference experiment in which the following set of hypotheses hold:

where  d is the mean paired difference between the manual and automated readings. If the

assumption of no difference is rejected, then we can conclude that the readings are different, on

average, between the manual and automated methods. If the manual method is considered

accurate, a significant difference between the two could be blamed on the unreliability of the

automated method, and they should keep both.

       To analyze this problem, the paired difference worksheet was used. Again, it is

appropriate to think of this as sampling from an infinite population, since the readings will

continue virtually forever. The input appears in Figure 7, although only some of the 177

observations are shown. The results of 1,000 resamples achieved a resampled p-value of .30364.

This agrees closely with the theoretical value of .31262. Since neither one would cause us to

reject the null for any reasonable significance level, we may conclude that the automated pH

method is consistent enough with the manual one to justify using the automated method


       This second example, was of course, much larger than the first one and consequently took

longer to run. It took 48 seconds to run the RESAMPLE macro for 1,000 resamples. REPEAT

took 38 seconds for 1,000 resamples. When the number of resamples was decreased to 200,

RESAMPLE took 24 seconds while REPEAT required 15 seconds. Times on a 486 computer

were much slower, but manageable for smaller problems.


       In the authors' experience so far, there is very close agreement between resample and

theoretical results for large samples. This is reasonable, since, according to the Central Limit

Theorem, the sampling distribution of the mean approaches normality with larger sample sizes.

For smaller problems however, there is less agreement. The authors suspect that this

disagreement is caused by skewness, as suggested by Seppala et al. (1995)

       As a test, the authors simulated 1,000 resamples for 1,000 trials (a total of 1,000,000

resamples) using the data from Example 1. The mean p-value for the 1,000 trials was .1548,

compared to the theoretical p-value of .1589, so agreement is very close on average. The mean

absolute deviation for the 1,000 trials was .0093, or somewhat less than 1%. The mean lower

confidence limit for the 1,000 trials was 1.110, compared to the theoretical value of 1.358. The

mean upper confidence limit was 3.712 compared to a theoretical value of 3.958. The mean

absolute deviations of the LCL and UCL, respectively, were .0842 and .0839.


       At this time, the package includes macros that perform the same operations presented in

this paper for cases involving one mean, two means, paired differences, one variance and one

proportion. We plan to extend the macros in the future to handle perhaps two or three additional

common situations, such as regression and correlation analysis. Consideration is also being

given to offering the finite population, sampling without replacement or randomization sampling

environment alternative. The authors would appreciate very much any suggestions as to what

features to add to the package.

       A copy of the package is available on request from the first author.


Burgoyne, S., Eliott, B., King, R., and Carroll, M. (1995). pH analysis for surface water

treatment plant, Conway & South Pump Stations. Unpublished class project.

Peterson, I. (1991). Pick a sample. Science News, 140, 56-58.

Ricketts, C. and Berry, J. (1994). Teaching statistics through resampling. Teaching Statistics,

16 (2), 41-44.

Rosa-Hatko, W. (1995). Resampling stats. ORMS Today, 22 (2), 72-74.

Seppala, T., Moskowitz, H., Plante, R., & Tang, J. (1995). Statistical process control via the

subgroup bootstrap. Journal of Quality Technology, 27, 139-153.

Simon, J. L. (1993). Resampling: The New Statistics. Belmont, CA: Duxbury.

Simon, J. L. (1994). The resampling method for statistical inference. Basics chapter of

Philosophy document of Internet manuscript.
Things to do:

Either get last reference or delete it.

Is first Simon reference to book or internet document? Look at

one of the others. Would look like this:

     Simon, J. L. (1993). Resampling: The New Statistics (pp. 345-346). Duxbury

Check to see if publications cited need issue number (they don't

if pages run throughout the volume.)

Where is Duxbury located?

Is # of words OK?

Remember figures


To top