# ch4 sampling distribution

CHAPTER 4

SAMPLES AND SAMPLING DISTRIBUTIONS
1.   Why do We Use Samples?
2.   Probability Sampling
2.1. Simple Random Samples
3.   Sampling Distributions
3.1. The Sampling Distribution of the Sample Mean
3.1.1. The Expected Value of x̄
3.1.1.1. The Relationship Between the Mean of the Parent Population and the Mean of All x̄ Values
3.1.2. The Variance of x̄ and Standard Error of x̄
3.1.3. The Relationship Between the Variance of the Parent Population and the Variance of x̄
3.1.4. The Shape of the Sampling Distribution of x̄. The Relationship Between the Parent Population
Distribution and the Sampling Distribution
3.1.5. Examples Using the Normal Sampling Distribution of x̄
3.1.6. The Margin of Sampling Error (MOE)
3.1.7. Error Probability α
3.1.8. Determining the Sample Size for a Given MOE
3.2. The Sampling Distribution of the Sample Proportion
3.2.1. The Expected Value of the Sample Proportion
3.2.1.1. The Relationship Between the Parent Population Proportion and the Mean of All
Sample Proportion Values
3.2.2. The Variance and Standard Error of the Sample Proportion
3.2.2.1. The Relationship Between the Variance of the Binary Parent Population and the
Variance of the Sample Proportion
3.2.3. The Sampling Distribution of the Sample Proportion as a Normal Distribution
3.2.4. Margin of Error for the Sample Proportion
3.2.5. Determining the Sample Size for a Given MOE

1. Why do We Use Samples?
Sampling is the basis of inferential statistics. A sample is a segment of a population. It is, therefore, expected to
reflect the population. By studying the characteristics of the sample one can make inferences about the
population. There are several reasons why we take a part of the population to study rather than taking a full
census of the population. These are:

   Samples cost less.
   Sampling takes less time.
   Samples can be more accurate. Sample observations are usually of higher quality because they are better
screened for errors in measurement and for duplication and misclassification.
   Sampling is the only option when testing destroys the item being tested (destructive sampling).

2. Probability Sampling
A sample in which each element of the population has a known and nonzero chance of being selected is called a
probability sample.

2.1. Simple Random Samples
A simple random sample is a probability sample in which all possible samples of size n are equally likely to be
chosen. To explain this requirement, let the population consist of letters A, B, C, D, and E. Since there are five

Chapter 4—Sampling Distributions                                                                  Page 1 of 33
items in the population, then N = 5. We want to select a sample of size 3, that is, n = 3. Since sampling is random
(the letters are written on little balls and are put in a bowl), there is more than one way that we can select 3 items
from 5 items. Using the combination formula, the total number of possible samples is C(N, n) = C(5, 3) = 10. The
following is the list of all 10 possible samples:

ABC ABD ABE ACD ACE
ADE BCD BCE BDE CDE

The definition of a simple random sample (SRS) implies that each sample has an equal chance, 1 ∕ 10 = 0.10, of being selected. This process of
listing all possible samples and selecting one applies only to a small finite population. The simple random selection process is different
when the population is infinite or very large. Even when the population is not infinite, the application of the definition
becomes very cumbersome. For example, what if the population size is 50 and we want to select a sample of size
10? How many different samples are possible? Using the combination formula, the total number of possible
samples is C(50, 10) = 10,272,278,170. It would be impractical to list all 10.3 billion possible samples and select one of
them at random.

The correct procedure to select a random sample is to assign a serial number to each of the population elements
and select the sample by drawing a pre-specified number of serial numbers at random (use the "random numbers
table").
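
The two selection procedures described above can be sketched in a few lines of Python. This is only an illustration: the seed value and the variable names are assumptions, and a pseudo-random generator stands in for the random numbers table.

```python
import math
import random
from itertools import combinations

population = ["A", "B", "C", "D", "E"]
n = 3

# List every possible sample of size n (order ignored, no replacement).
all_samples = ["".join(s) for s in combinations(population, n)]
print(len(all_samples), all_samples)

# For a larger population, count the samples instead of listing them.
print(math.comb(50, 10))  # C(50, 10)

# Serial-number procedure: label the 50 elements 1..50 and draw 10 labels at random.
rng = random.Random(2024)  # arbitrary seed, used only to make the run repeatable
chosen = sorted(rng.sample(range(1, 51), 10))
print(chosen)
```

Drawing serial numbers never requires the list of all possible samples, which is why it remains practical when C(N, n) is in the billions.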

3. Sampling Distributions
A sampling distribution is a probability distribution of a sample statistic. Recall from Chapter 1 that a sample
statistic is a summary characteristic computed from sample data. Since a sample statistic is a summary
characteristic obtained from a randomly selected sample, the sample statistic is a random variable: the value
assigned to it is randomly determined. Furthermore, because a sample statistic is a random
variable, it has a probability distribution. The probability distribution of a sample statistic is called a sampling
distribution.

3.1. The Sampling Distribution of the Sample Mean
Since x̄ is a summary characteristic computed from sample data, it is a sample statistic. The probability
distribution of x̄ is called the sampling distribution of x̄. The reason we are able to define a probability distribution
for x̄ is that x̄ is a random variable. The value of x̄ is determined by the samples chosen through a random process.

To illustrate the sampling distribution of x̄ in the simplest terms, consider the following example: The Jones family
has five children. The following table lists the ages of the children. Since we are considering the ages of all the
Jones children, the age data constitute a population.

Name               Age (x)
Ann                   3
Beth                  6
Charlotte             9
David                12
Eric                 15

Suppose, as an experiment, we want to estimate the average age of the children by taking a sample of size three.
Note that for estimation purposes only a single sample of size n is randomly selected. Thus, a single random
sample selected from the above “population” may result in the sample elements, say, Ann, Beth and David, with
corresponding values {3, 6, 12}. But we know this is one of the 10 possible samples (footnote 1). There are nine other possible
samples that we could have randomly selected. The next table lists all ten possible samples of size n = 3 that we
may select from a population of size N = 5. The table also shows the average age computed from the values of
each sample.

Sample             Sample Values         Sample Mean
Composition              x                x̄ = Σx ∕ n
A     B     C      3     6      9               6
A     B     D      3     6      12              7
A     B     E      3     6      15              8
A     C     D      3     9      12              8
A     C     E      3     9      15              9
A     D     E      3     12     15             10
B     C     D      6     9      12              9
B     C     E      6     9      15             10
B     D     E      6     12     15             11
C     D     E      9     12     15             12

In the above table, note that the x̄ values 8, 9 and 10 each appear twice. Since three of the ten values are repeated, there
are seven distinct values of x̄. The next table shows the sampling distribution of x̄, which is the listing of all 7 possible
values the random variable x̄ can take on along with the probability (relative frequency) associated with each
value. Since in the sampling process the values 8, 9 and 10 each occur twice, the probability associated with
each of these values is 2 ∕ 10 = 0.20.

The sampling distribution of the sample mean age is then,

Sampling Distribution of x̄
x̄                 f(x̄)
6                 0.1
7                 0.1
8                 0.2
9                 0.2
10                0.2
11                0.1
12                0.1
Total             1.0
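
The construction of this sampling distribution can be reproduced by enumeration. The sketch below is one possible way to do it; the variable names are illustrative assumptions, and exact fractions are used so that the probabilities carry no rounding error.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

ages = {"Ann": 3, "Beth": 6, "Charlotte": 9, "David": 12, "Eric": 15}
n = 3

# Mean of every possible sample of size n drawn without replacement.
sample_means = [Fraction(sum(s), n) for s in combinations(ages.values(), n)]

# Relative frequency of each distinct mean is its probability f(x-bar).
counts = Counter(sample_means)
sampling_dist = {
    mean: Fraction(count, len(sample_means))
    for mean, count in sorted(counts.items())
}

for mean, prob in sampling_dist.items():
    print(int(mean), float(prob))
```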

The following diagram shows the chart of the sampling distribution.

[Figure: Sampling Distribution of x̄. Bar chart of f(x̄) against x̄, with height 0.1 at x̄ = 6, 7, 11 and 12, and height 0.2 at x̄ = 8, 9 and 10.]

Footnote 1: Using the combination formula C(N, n), there are C(5, 3) = 10 different samples of size three selected from 5
objects without replacement.

3.1.1. The Expected Value of x̄
The sample statistic x̄ is a random variable with a probability distribution. Like all other random variables,
x̄ therefore has an expected value and a variance. The expected value of x̄ is the (weighted) average of all the
sample means. The weights are the probabilities associated with each value of the sample mean. Since the
expected value represents the average of all possible sample means, it is also denoted by the symbol μ_x̄.

$$E(\bar{x}) = \mu_{\bar{x}} = \sum \bar{x}\, f(\bar{x})$$

In the Jones family example the expected value of the sampling distribution of x̄ is determined as shown in the following
table.

Calculation of μ_x̄
x̄                f(x̄)          x̄·f(x̄)
6                0.1           0.6
7                0.1           0.7
8                0.2           1.6
9                0.2           1.8
10                0.2           2.0
11                0.1           1.1
12                0.1           1.2
μ_x̄ = Σ x̄·f(x̄) =               9.0

Note that we may compute μ_x̄ directly from the 10 unweighted x̄ values. In that case,

$$\mu_{\bar{x}} = \frac{6 + 7 + 8 + 8 + 9 + 10 + 9 + 10 + 11 + 12}{10} = \frac{90}{10} = 9$$

3.1.1.1. The Relationship Between the Mean of the Parent Population and the Mean
of All x̄ Values
To show an important relationship between the expected value of x̄ (the average of the sample means, μ_x̄) and the
mean of the parent population μ, determine the parent population mean directly from the Jones family children
population age data.

$$\mu = \frac{\sum x}{N} = \frac{3 + 6 + 9 + 12 + 15}{5} = 9$$

The parent population average age μ = 9 is exactly the same as the mean of x̄. That is, the average value of all
possible sample means is equal to the mean of the parent population: the mean of the means equals the mean.

$$E(\bar{x}) = \mu_{\bar{x}} = \mu$$

This equality is not coincidental for this example. The equality of the expected value of the sampling distribution
of x̄ and the population mean μ is true for all sampling distributions of x̄. The mean of the means equals the
mean! (footnote 2)
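
As a quick numerical check of this equality for the Jones family data, the following sketch averages all ten sample means and compares the result with the population mean. The variable names are illustrative assumptions; the check covers this one example and is not a proof.

```python
from fractions import Fraction
from itertools import combinations

ages = [3, 6, 9, 12, 15]  # the Jones family population
n = 3

# All possible sample means for samples of size n.
means = [Fraction(sum(s), n) for s in combinations(ages, n)]

expected_mean = sum(means) / len(means)           # E(x-bar), the mean of the means
population_mean = Fraction(sum(ages), len(ages))  # mu

print(expected_mean, population_mean, expected_mean == population_mean)
```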

3.1.2. The Variance and the Standard Error of x̄
The variance of x̄, denoted by var(x̄), like any other variance measure, is simply the mean squared deviation of the
random variable x̄. The calculation of the variance using all x̄ values is shown below:

Variance of x̄ Values
x̄                       (x̄ − μ_x̄)²
6                            9
7                            4
8                            1
8                            1
9                            0
10                           1
9                            0
10                           1
11                           4
12                           9

μ_x̄ = 9         Σ(x̄ − μ_x̄)² = 30
var(x̄) = 30 ∕ 10 = 3

Since within the random variable framework the mean and expected value convey the same meaning, we can
express the variance of x̄ as the expected value (weighted average) of the squared deviations of x̄:

$$\operatorname{var}(\bar{x}) = E[(\bar{x} - \mu_{\bar{x}})^2] = \sum (\bar{x} - \mu_{\bar{x}})^2 f(\bar{x})$$

The next table shows the calculation of var(x̄) as the expected value of squared deviations.

Footnote 2: See Appendix for the mathematical proof that E(x̄) = μ.

Calculation of var(x̄) = E[(x̄ − μ_x̄)²]
x̄                    f(x̄)           (x̄ − μ_x̄)²·f(x̄)
6                     0.1                0.9
7                     0.1                0.4
8                     0.2                0.2
9                     0.2                0.0
10                    0.2                0.2
11                    0.1                0.4
12                    0.1                0.9
var(x̄) = Σ(x̄ − μ_x̄)²·f(x̄) =             3.0

The standard deviation of x̄ is called the standard error of x̄ and is denoted by se(x̄). The standard error is a
measure of the dispersion of all possible x̄ values around the mean of x̄. It is the positive square root of var(x̄).
For the Jones family example:

$$\operatorname{se}(\bar{x}) = \sqrt{\operatorname{var}(\bar{x})} = \sqrt{3} = 1.732$$
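
The variance and standard error above can be recomputed from the ten sample means. This is a minimal sketch with illustrative variable names, using the same mean-squared-deviation definition as the tables.

```python
import math
from fractions import Fraction
from itertools import combinations

ages = [3, 6, 9, 12, 15]
n = 3

means = [Fraction(sum(s), n) for s in combinations(ages, n)]
mu_xbar = sum(means) / len(means)

# var(x-bar): mean squared deviation of the sample means around their mean.
var_xbar = sum((m - mu_xbar) ** 2 for m in means) / len(means)

# se(x-bar): positive square root of var(x-bar).
se_xbar = math.sqrt(var_xbar)

print(var_xbar, round(se_xbar, 3))
```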

3.1.2.1.   The Relationship Between the Variance of the Parent Population and the
Variance of x̄
Going back to the population age data, compute the population variance, using the variance formula we learned in
Chapter 1:

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{90}{5} = 18$$

Note that var(x̄) ≠ σ². This is the case whenever the sample size is greater than 1. However, there is a definite relationship between var(x̄) and σ². This
relationship is shown as

$$\operatorname{var}(\bar{x}) = \frac{\sigma^2}{n}\left(\frac{N - n}{N - 1}\right)$$

From the Jones family example

$$\operatorname{var}(\bar{x}) = \frac{18}{3}\left(\frac{5 - 3}{5 - 1}\right) = 3$$

In the var(x̄) formula, pay special attention to the term (N − n) ∕ (N − 1).

This term is called the finite population correction factor (FPCF). When the population is small, as in the
example above, the sample size relative to the population, n ∕ N, is large: 3 ∕ 5 = 60%. When the population is
very large relative to the sample, this ratio becomes negligible, the FPCF approaches 1 and, therefore, it plays no role in the var(x̄)
formula. The tendency of the FPCF to approach 1 as N gets larger is shown in the following table. A sample size of
n = 10 is used to show this tendency.

Finite Population Correction Factor
as N Increases (for n = 10)
N                (N − n) ∕ (N − 1)
25                0.6250
50                0.8163
100                0.9091
1,000                0.9910
10,000                0.9991
100,000                 0.9999
1,000,000                 1.0000
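
The correction factor and the table above can be regenerated with a short script. The function names `fpcf` and `var_of_mean` are illustrative assumptions, not names used elsewhere in this chapter.

```python
def fpcf(N, n):
    """Finite population correction factor (N - n) / (N - 1)."""
    return (N - n) / (N - 1)


def var_of_mean(sigma2, N, n):
    """var(x-bar) for a finite population: (sigma^2 / n) * FPCF."""
    return (sigma2 / n) * fpcf(N, n)


# Jones family example: sigma^2 = 18, N = 5, n = 3.
print(var_of_mean(18, 5, 3))

# FPCF for n = 10 as the population size N grows.
for N in (25, 50, 100, 1_000, 10_000, 100_000, 1_000_000):
    print(f"{N:>9,}  {fpcf(N, 10):.4f}")
```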

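Thus, for large populations, the variance of x̄ becomes (footnote 3)

$$\operatorname{var}(\bar{x}) = \frac{\sigma^2}{n}$$

The standard error of x̄, as the square root of var(x̄), is then

$$\operatorname{se}(\bar{x}) = \frac{\sigma}{\sqrt{n}}$$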

3.1.3. The Number of Possible Samples and x̄ Values
To explain the concepts of sampling distribution, expected value, and standard error of the sampling distribution,
we used a simple example where from a ridiculously small parent population (N = 5) we took ridiculously small
samples (n = 3). The number of possible samples (ν, the Greek letter nu) is determined using the combination
formula:

ν = C(N, n) = C(5, 3) = 10.

When the population size N increases, even with a small sample size n, the number of possible samples ν, and the
number of corresponding x̄ values computed from these samples, quickly rises to astronomical levels. The
following table shows this clearly.

N            n                  ν
5            3                           10
10            3                          120
50            5                    2,118,760
100           10           17,310,309,456,440
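
The counts in this table follow directly from the combination formula. As a sketch, Python's `math.comb` evaluates C(N, n) exactly, including the much larger case N = 608, n = 40 discussed next.

```python
import math

# Number of possible samples, nu = C(N, n), for each row of the table.
for N, n in [(5, 3), (10, 3), (50, 5), (100, 10)]:
    print(f"N = {N:>3}, n = {n:>2}, nu = {math.comb(N, n):,}")

# The retirement community example: N = 608, n = 40.
nu = math.comb(608, 40)
print(f"{nu:.3e}")  # shown in scientific notation because the exact integer is very long
```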

In Chapter 1 we used the example of residents of a Florida retirement community as the population, where
N = 608, from which we selected a single sample of size n = 40 to explain the difference between the population
parameter μ and the sample statistic x̄. For that explanation we used only a single sample, the values of which
were selected randomly. This sample yielded a sample mean of x̄ = 62.8. This was only one sample and one
among the following possible number of x̄ values:

ν = 749,670,807,490,441,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000.

Footnote 3: See Appendix for the mathematical proof.

Summary of the Different Variance Concepts and Formulas

With the introduction of the variance of x̄ we have added a new variance concept to the two we learned in Chapter
1. These variance concepts are summarized below:

Population Variance measures the mean squared deviation of population data around the population mean:

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$

Sample Variance measures the mean squared deviation of sample data around the sample mean, with n − 1 in the denominator:

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

Variance of the mean measures the mean squared deviation of all possible x̄ values around the mean of x̄. Since
in practical sampling problems there is an astronomically large number of x̄ values, it is not feasible to compute
var(x̄) from all possible values of x̄. Rather, if the population variance is given, var(x̄) for a large population is determined as follows:

$$\operatorname{var}(\bar{x}) = \frac{\sigma^2}{n}$$
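
To keep the three variance concepts apart, the sketch below computes each one for the Jones family data using Python's `statistics` module. The chosen sample (Ann, Beth, David) is the one used earlier; the variable names are illustrative assumptions.

```python
import statistics

population = [3, 6, 9, 12, 15]  # all five ages: the population
sample = [3, 6, 12]             # one possible sample: Ann, Beth, David
N, n = len(population), len(sample)

sigma2 = statistics.pvariance(population)  # population variance, divides by N
s2 = statistics.variance(sample)           # sample variance, divides by n - 1

var_xbar_large = sigma2 / n                        # large-population formula
var_xbar_finite = sigma2 / n * (N - n) / (N - 1)   # with the finite population correction

print(sigma2, s2, var_xbar_large, var_xbar_finite)
```

With N = 5 the population is far from large, so the corrected value, not σ² ∕ n, matches the var(x̄) = 3 obtained by enumeration.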

Practice Problem Set 4.1
1.   A population contains 40 items. How many possible samples of size 10 can be selected from this population?
2.   A population contains five items, A, B, C, D, and E, with the numeric values shown below:

Item               x
A               12
B                9
C               21
D               15
E                3

(a)  Compute the mean, variance, and the standard deviation of the population.
(b)  How many samples of size n = 3 can we select from this population?
(c)  List all the samples and compute the mean, variance, and the standard deviation of each sample.
(d)  Write the sampling distribution of x̄.
(e)  Compute the expected value of the sample mean, E(x̄), from the sampling distribution of x̄. Compare this
mean to the parent population mean μ.
(f) Compute the variance and standard error of the mean from the population data.
(g) Compute the variance and standard error of the mean from the sampling distribution of x̄.
3.   The average weight of the adult male population is 180 pounds with a standard deviation of 25 pounds. What
are the mean and standard error of the sampling distribution of x̄ when the sample size is n = 64?
4.   A population with an unknown distribution has a mean of 1,246 and a standard deviation of 345.
(a) For samples of size n = 25 selected from this population, what is the mean of the sampling distribution of
x̄?
(b) When n = 25, what is the standard error of x̄?
(c) What is the mean of x̄ if the sample size is n = 64?
(d) What is the standard error of x̄ if the sample size is n = 64?

3.1.4. The Shape of the Sampling Distribution of x̄. The Relationship Between the Parent
Population Distribution and the Sampling Distribution
The foundation of inferential statistics is the sampling distribution. We use the sampling distribution of x̄ to infer
about the population mean μ. The shape of the sampling distribution plays a vital role in inferential statistics. In
order to make the inference about the population parameter, the sampling distribution must have a specific shape.
The required shape is the normal distribution. If the sampling distribution is not normal, then it
cannot be used for the normal-based inference developed in this chapter.

At the outset, the most important issue to understand is that the shape of the sampling distribution of x̄ depends
on two things: (1) the shape or distribution of the population data set, and (2) the size of the sample (n).

3.1.4.1.      When the Parent Population Has a Normal (Bell-Shaped)
Distribution
The first practical conclusion from this discussion is that when the parent population has a normal (bell-shaped)
distribution with mean μ and standard deviation σ, the sampling distribution of x̄ also has a normal distribution
with mean μ_x̄ = μ and standard deviation (standard error) se(x̄) = σ ⁄ √n.

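[Figure: Two normal curves centered at μ. Top panel: the parent population distribution of x, with standard deviation σ. Bottom panel: the sampling distribution of x̄, with standard error σ ⁄ √n. Caption: When the parent population distribution is normal with mean μ and standard deviation σ, the sampling distribution of x̄ is also normal with mean μ and standard error se(x̄) = σ ⁄ √n.]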

3.1.4.2.      When the Parent Population Is Not Normally Distributed
When the parent population is not normally distributed, the shape of the sampling distribution will depend on the
sample size n. The sampling distribution of x̄ will approach normal as the size of the sample increases. The rule
of thumb is that, if the sample size is 30 or more, the sampling distribution of x̄ is treated as if normal. This conclusion
is based on the Central Limit Theorem.

[Figure: Top panel: a non-normal parent population distribution of x. Bottom panel: the sampling distribution of x̄, centered at μ with standard error σ ⁄ √n. Caption: When the parent population distribution is NOT normal, the sampling distribution of x̄ is approximately normal with mean μ and standard error se(x̄) = σ ⁄ √n, if the sample size n ≥ 30.]

This property of the sampling distribution makes statistical inference about μ possible even when the population is
not normally distributed.
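
A small simulation can illustrate this behavior. The sketch below is an assumption-laden illustration, not part of the chapter's examples: it uses an exponential parent population (strongly skewed, with mean 1 and standard deviation 1), an arbitrary seed, and 5,000 simulated samples of size n = 30.

```python
import math
import random
import statistics

rng = random.Random(42)          # arbitrary seed for repeatability
n, num_samples = 30, 5000
mu = sigma = 1.0                 # an exponential population with rate 1 has mean 1 and sd 1

# Draw many samples from the skewed population and record each sample mean.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

se = sigma / math.sqrt(n)        # theoretical standard error, sigma / sqrt(n)
mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

# Share of sample means within 1.96 standard errors of mu (about 0.95 if nearly normal).
within = sum(abs(m - mu) <= 1.96 * se for m in sample_means) / num_samples

print(round(mean_of_means, 3), round(sd_of_means, 3), round(se, 3), round(within, 3))
```

If the normal approximation is reasonable, the mean of the simulated sample means should be close to μ, their standard deviation close to σ ⁄ √n, and roughly 95% of them should fall within 1.96 standard errors of μ.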

3.1.5. Examples Using the Normally Distributed Sampling Distribution of x̄
The subsequent chapters are all devoted essentially to inferential statistics, where we will apply the basic concepts
we learned in this chapter to infer about characteristics of population data by analyzing the characteristics of
sample data. Inferences about a summary characteristic of the population data, for now the mean μ, from the
mean of a sample are never exact statements. These inferences, instead, are probabilistic statements. To make
these probabilistic statements, and be able to state the exact probabilities, it is essential that the sampling
distribution of x̄ be normal. The following examples are typical applications of the normal distribution to the
sampling distribution of x̄. What we learn from these examples will help us understand inferences
about the population mean in the subsequent chapters.

Example
In a bottling plant the amount of soda in each 32-ounce bottle is a normally distributed random variable with a
mean μ = 32 ounces and standard deviation of σ = 0.3 ounces.

a)       If a single bottle is randomly selected, what is the probability that it contains between 31.8 and 32.2
ounces of soda? Alternatively stated, given the mean and standard deviation of the fill of bottles, what
fraction (proportion, or percentage) of the bottles contain between 31.8 and 32.2 ounces of soda?

Note: This part of the problem does not deal with the sampling distribution. It is shown, however, to explain how to
differentiate between the probability of x (the random variable representing the parent population) and the
probability of x̄ (the random variable representing the sample means).

μ = 32               σ = 0.3

$$P(31.8 < x < 32.2)$$

$$z = \frac{x - \mu}{\sigma} = \frac{31.8 - 32}{0.3} = -0.67 \quad\text{and}\quad z = \frac{32.2 - 32}{0.3} = 0.67$$

$$P(-0.67 < z < 0.67) = 0.4971$$

b)       If a sample of size n = 9 bottles is taken, what is the probability that the mean of this sample, x̄, is between
31.8 and 32.2 ounces? Alternatively stated, what fraction (proportion, or percentage) of the means
obtained from samples of size n = 9 fall within 31.8 and 32.2 ounces?

Now you are dealing with the probability distribution of x̄. Since the parent population of bottles is normal,
the distribution of x̄ values (the sampling distribution of x̄) is also normal with the following mean and standard
deviation (standard error):

$$\mu_{\bar{x}} = \mu = 32 \quad\text{and}\quad \operatorname{se}(\bar{x}) = \frac{\sigma}{\sqrt{n}} = \frac{0.3}{\sqrt{9}} = 0.1$$
The objective is now to find

$$P(31.8 < \bar{x} < 32.2)$$

First we must convert the normal random variable x̄ to the standard normal z. The z conversion formula is

$$z = \frac{\bar{x} - \mu}{\operatorname{se}(\bar{x})}$$

Using this formula we can find

$$z = \frac{31.8 - 32}{0.1} = -2.00 \quad\text{and}\quad z = \frac{32.2 - 32}{0.1} = 2.00$$

$$P(-2.00 < z < 2.00) = 0.9545$$

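[Figure: Two normal curves over the interval from 31.8 to 32.2. Distribution of x: central area 0.4971, with 0.2514 in each tail. Sampling distribution of x̄: central area 0.9545, with 0.0228 in each tail.]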

Note that in these two examples, even though the distribution of x (the parent population) and the sampling
distribution of x̄ both have the same mean (μ = 32), the same interval (31.8-32.2) contains 95.5% of all the x̄ values
but only 49.7% of the x values. The reason for this difference is that the x̄ values are far less dispersed than the x
values. And this is because the standard deviation of the distribution of x̄, se(x̄) = σ ⁄ √n, is smaller than the
standard deviation of x. The x̄ values are much more closely clustered around the mean μ = 32 than the x values.
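
Both probabilities can be checked with `statistics.NormalDist` from the Python standard library. This is a sketch with illustrative variable names. Because it does not round z to two decimals, the single-bottle probability comes out near 0.495 rather than the table-based 0.4971 (the table lookup uses z = 0.67, while the unrounded value is about 0.667).

```python
import math
from statistics import NormalDist

mu, sigma, n = 32.0, 0.3, 9

x_dist = NormalDist(mu, sigma)                    # distribution of a single bottle, x
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))  # sampling distribution of x-bar

p_x = x_dist.cdf(32.2) - x_dist.cdf(31.8)
p_xbar = xbar_dist.cdf(32.2) - xbar_dist.cdf(31.8)

print(round(p_x, 4), round(p_xbar, 4))
```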

3.1.6. The Margin of Sampling Error
The next example is used to explain the extremely important concept of the margin of sampling error (MOE). This
concept plays a crucial role in inferential statistics. You must always keep MOE in mind when dealing with the
sampling distribution of a sample statistic.

Example
A given population has a mean of 50 and a standard deviation of 18. Consider the sampling distribution of the
means of samples of size 36 obtained from this population. Find the interval of x̄ values, symmetric about the
mean, that contains 90 percent of all possible x̄ values.

First, establish the parameters of the distribution of the population and the sampling distribution.

In the population, x has mean μ = 50 and standard deviation σ = 18.

In the sampling distribution, x̄ is approximately normally distributed (because n > 30) with mean μ_x̄ = μ = 50 and standard error
se(x̄) = σ ⁄ √n = 18 ⁄ √36 = 3.

Consider the following diagram showing the distribution of x̄, where L and U represent the lower and upper ends of
the interval which contains 90% of all possible sample means obtained from samples of size n = 36. The objective
is to find the values of L and U.

[Figure: Normal sampling distribution of x̄ centered at 50, with P(L ≤ x̄ ≤ U) = 0.90 between L and U and an area of 0.05 in each tail.]

You can find L and U using the formula that converts the normal random variable x̄ into the standard normal
random variable z:

$$z = \frac{\bar{x} - \mu}{\operatorname{se}(\bar{x})}$$

From which you can solve for x̄:

$$\bar{x} = \mu + z \cdot \operatorname{se}(\bar{x})$$

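The term z∙se(x̄) in this formula is called the margin of sampling error or simply the margin of error (MOE).

$$\text{MOE} = z \cdot \operatorname{se}(\bar{x})$$

To find MOE, first compute

$$\operatorname{se}(\bar{x}) = \frac{\sigma}{\sqrt{n}} = \frac{18}{\sqrt{36}} = 3$$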

The value for z is determined as follows: Note that the area within the interval is 90%. Thus, the two tail areas are
5% each. Therefore, the z score corresponding to U is the z score that bounds a right tail area of 5%, that is,
z_0.05 = 1.64. Thus,

$$\text{MOE} = z_{0.05}\,\operatorname{se}(\bar{x}) = 1.64 \times 3 = 4.92$$

The margin of error of 4.92 simply implies that 90% of all possible x̄ values fall within ±4.92 (data units) of the
population mean μ. The lower and upper ends of the interval are thus:

L = 50 − 4.92 = 45.08
U = 50 + 4.92 = 54.92

[Figure: Normal sampling distribution of x̄ centered at 50. The central 0.90 area lies within MOE = z∙se(x̄) = ±4.92 of the mean, between 45.08 and 54.92.]

Again, the lower and upper boundaries of this interval indicate that 90% of all x̄ values fall within the interval bounded by
45.08 and 54.92. Stated differently, 90% of the means computed from samples of size n = 36 deviate from the
parent population mean by no more than ±4.92.

Example
In the previous example, where μ = 50 and σ = 18, find the interval that contains 95% of all the means obtained
from samples of size n = 36.

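For this example we must find the 95% margin of error.

$$\text{MOE} = z_{0.025}\,\operatorname{se}(\bar{x}) = 1.96 \times 3 = 5.88$$

[Figure: Normal sampling distribution of x̄ centered at 50. The central 0.95 area lies within MOE = z_0.025∙se(x̄) = ±5.88 of the mean, between 44.12 and 55.88.]

Thus,

L = μ − MOE = 50 − 5.88 = 44.12
U = μ + MOE = 50 + 5.88 = 55.88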

Example
In the soda bottle example, where μ = 32 ounces and σ = 0.3 ounces, find the interval that contains 95% of the
means obtained from samples of size n = 25 bottles.

$$\operatorname{se}(\bar{x}) = \frac{0.3}{\sqrt{25}} = 0.06$$

$$\text{MOE} = z_{0.025}\,\operatorname{se}(\bar{x}) = 1.96 \times 0.06 = 0.118$$

L = 32 − 0.118 = 31.882
U = 32 + 0.118 = 32.118

We can, therefore, state that of every 100 samples of size 25 that we select from the population of soda bottles,
about 95 are expected to have a sample mean fill between 31.88 and 32.12 ounces.
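
The "95 out of 100" statement can be checked by simulation. The sketch below assumes a normal fill distribution with μ = 32 and σ = 0.3, as in the soda bottle example; the seed and the number of simulated samples are arbitrary choices made for illustration.

```python
import math
import random
import statistics

rng = random.Random(7)             # arbitrary seed for repeatability
mu, sigma, n = 32.0, 0.3, 25

moe = 1.96 * sigma / math.sqrt(n)  # 95% margin of error
lower, upper = mu - moe, mu + moe

# Draw many samples of 25 bottles and count how often the sample mean lands in the interval.
num_samples = 10_000
hits = 0
for _ in range(num_samples):
    xbar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
    if lower <= xbar <= upper:
        hits += 1

coverage = hits / num_samples
print(round(lower, 3), round(upper, 3), coverage)
```

The observed share varies from run to run around 0.95, which is why the statement is best read as an expectation over many samples rather than an exact count.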

3.1.7. Error Probability α
In computing the MOE in the first two examples in this section, each MOE involved a specified probability. The
first required an interval with a 90% margin of error, and the second a 95% MOE. In the first example, the interval
built around μ using a 90% MOE contained 90% of all possible sample means. Thus 10% of the sample means fell
outside the interval, that is, they deviated from μ by more than the established MOE. Thus, in that example, if a
random sample of size n = 36 were selected from the population, there was 10% probability that the sample mean
deviated from the μ = 50 by more than ±4.92. This 10% probability is called the error probability and is denoted by
the Greek letter α.

In the second example, 95% of sample means deviated from μ = 50 by no more than ±5.88. The error probability
in that example was, therefore, α = 0.05.

Using the α as a general symbol for error probability, the MOE formula can then be written as:

$$\text{MOE} = z_{\alpha/2}\,\operatorname{se}(\bar{x})$$

Note that the subscript of z is α/2, since we divide the error probability equally between the two tails of the
normal curve.
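
As a sketch of how the formula depends on α, the function below looks up z for a right-tail area of α ∕ 2 with `statistics.NormalDist().inv_cdf` and multiplies it by the standard error. The function name `margin_of_error` is an illustrative assumption. Because the z values are not rounded, the 90% result is about 4.93 rather than the 4.92 obtained above with z = 1.64.

```python
import math
from statistics import NormalDist


def margin_of_error(sigma, n, alpha):
    """MOE = z_(alpha/2) * sigma / sqrt(n), with z taken from the standard normal."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z score with right-tail area alpha / 2
    return z * sigma / math.sqrt(n)


moe_90 = margin_of_error(18, 36, 0.10)     # first example, alpha = 0.10
moe_95 = margin_of_error(18, 36, 0.05)     # second example, alpha = 0.05
moe_soda = margin_of_error(0.3, 25, 0.05)  # soda bottle example

print(round(moe_90, 3), round(moe_95, 3), round(moe_soda, 4))
```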

3.1.8. Determining the Sample Size for a Given Margin of Error
In the margin of error formula MOE = z_α/2 se(x̄), the standard error is se(x̄) = σ ⁄ √n. Therefore, we can write the MOE
formula as

$$\text{MOE} = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

This indicates that the MOE varies with the sample size n. The bigger the sample size, the narrower the MOE.
Many statistical questions require you to determine the sample size for a specified MOE. To determine n, we can
rearrange the MOE formula as follows:

$$\sqrt{n} = \frac{z_{\alpha/2}\,\sigma}{\text{MOE}}$$

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{\text{MOE}}\right)^2$$

Example
In the previous example, where μ = 32 ounces and σ = 0.3 ounces, what should the sample size be so that 95% of
all possible sample means would fall within a margin of error of 0.08 ounces from the population mean?

Given a 95% MOE, the error probability is then α = 0.05.

$$n = \left(\frac{z_{0.025}\,\sigma}{\text{MOE}}\right)^2 = \left(\frac{1.96 \times 0.3}{0.08}\right)^2 = 54.02$$

Rounded up, n = 55.

Note that in this example, we are interested in a narrower margin of error (0.08 versus 0.118). To make MOE
narrower and, hence, the interval more precise, we must increase the sample size. Of every 100 means obtained
from samples of size n = 55 bottles, 95 of them are expected to fall within 0.08 ounces from the mean of all bottles
filled by the machine.
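
The sample size calculation can be sketched as a small function. The name `required_sample_size` is an illustrative assumption; rounding up with `math.ceil` reflects the fact that a fractional bottle cannot be sampled and that rounding down would leave the MOE slightly above the target.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, moe, alpha):
    """Smallest whole n for which z_(alpha/2) * sigma / sqrt(n) does not exceed moe."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z * sigma / moe) ** 2)


n_needed = required_sample_size(0.3, 0.08, 0.05)

# MOE actually achieved with the rounded-up sample size.
achieved_moe = NormalDist().inv_cdf(0.975) * 0.3 / math.sqrt(n_needed)

print(n_needed, round(achieved_moe, 4))
```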

Practice Problem Set 4.2
1.   A population with an unknown distribution has a mean of 1,246 and a standard deviation of 345. Answer the
following questions for samples of size n = 25 selected from this population.
(a) What is the shape of the distribution of x̄? Is the distribution of x̄ normal?
(b) What is the distribution of x̄ if the sample size is n = 64?

2.   135,296 students took a statewide math test. The average score of the population of 135,296 students taking
the test was μ = 64, with a standard deviation of σ = 22.
(a) For samples of size n = 50 student tests selected from this population, what is the mean of the sampling
distribution of x̄?
(b) For n = 50, what is the standard error of x̄?
(c) For n = 50, what is the shape of the distribution of x̄? Is the distribution of x̄ normal?
(d) A sample of 50 scores is selected. What is the probability that the sample mean is between 60 and 68?
(e) What fraction (proportion, or percentage) of the means obtained from samples of size n = 50 fall within 5
score units from the population mean score?
(f) For samples of size 81, what interval of values would contain 90% of all sample means?
(g) For samples of size 81, what is the 95% margin of sampling error?
(h) For samples of size 81, what interval of values would contain 95% of all sample means?

3.   The lifetime of light bulbs produced by a particular manufacturer have mean 1,200 hours and standard
deviation 400 hours. The population distribution is normal.
(a) If a single light bulb is selected. What is the probability that the lifetime is between 1,100 and 1,300?
(b) If a sample of 25 light bulbs is selected, what is the probability that the sample mean lifetime is between
1,100 and 1,300?
(c) What fraction (proportion, or percentage) of the means obtained from samples of size n = 25 fall within
50 hours from the population mean hours of lifetime?
(d) What is the 95% margin of statistical error for the sample means, if the sample size is 36?
(e) What interval of x̄ values contains 95% of all x̄'s?
(f) What sample size generates a 95% margin of statistical error of 20 hours?

4.   According to a recent report the nationwide average price of regular gasoline is \$2.48 with a standard
deviation of \$0.20.
(a) A sample of 50 gasoline stations is selected. What is the probability that the sample mean is between
\$2.43 and \$2.53?
(b) For a sample of size 75, what is the 90% margin of statistical error?
(c) What interval contains 95% of x̄ values for all samples of size 45?
(d) What sample size yields a 95% margin of error of \$0.01?

5.   The average systolic blood pressure in the population of normal healthy adults is 120 mm Hg with a standard
deviation of 10 mm Hg.
(a) If repeated samples of 25 individuals are taken, what proportion of samples will have mean values
greater than 124 mm Hg?
(b) What interval of sample mean values of systolic blood pressure, symmetric about the mean, would
contain 95% of all sample means?

3.2.     The Sampling Distribution of the Sample Proportion
Consider a population of size N. Let x be the number of elements in the population that possess a given attribute.
Then x is the number of "successes" and the ratio of x over N is the population proportion π.

π = x ∕ N

For example, in the 2002-2003 academic year a total of 37,196 students (full-time equivalent) were enrolled at IU-
Bloomington, of whom 30,131 were undergraduate students. Thus the population proportion of undergraduates
enrolled at IU-Bloomington in that year was:

π = 30,131 ∕ 37,196 = 0.81

Now, suppose a sample of size n students is taken from the population. The proportion of successes (in this case
undergraduates) in the sample (the sample proportion) is

p̄ = x ∕ n

Suppose you took a sample of n = 200 students, of whom x = 156 were undergraduate students. Then the sample
proportion is

p̄ = 156 ∕ 200 = 0.78

Note that, like x̄, which is the sample statistic estimating the population parameter µ, p̄ is also a sample statistic,
now estimating the population parameter π. Like x̄, p̄ is then a random variable because its value is determined by
the outcome of a random experiment (the experiment being the selection of a random sample). The probability
distribution of p̄ is called the sampling distribution of p̄.

To explain how the sampling distribution is generated, consider the Jones family example used in explaining the
sampling distribution of x̄. In this case, instead of the age of the children, we are interested in a non-quantitative
attribute of the children, their gender (male/female). To show how the concepts of the sampling distributions of
x̄ and p̄ are closely related, assign the value “1” to “female” (the attribute of interest in this example) and “0” to
“male”. The following table shows the population elements by gender and the numeric assignment to each
gender.

Gender of the Jones Family Children
Name          Gender      Numeric Assignment
Ann                   F                  1
Beth                  F                  1
Charlotte             F                  1
David                 M                  0
Eric                  M                  0

Note that a proportion in fact measures the average of the data in a data set where the data values are limited to 0
and 1, that is xi ∈ {0, 1}. Let x = ∑xi (that is, x being the sum of individual xi values). Then,

π = (∑ xi) ∕ N = x ∕ N     (sum over i = 1, ..., N)

In this example

x = ∑ xi = 1 + 1 + 1 + 0 + 0 = 3     (sum over i = 1, ..., 5)

Thus, π = 3 ∕ 5 = 0.6

The proportion of females in the population is 3 ∕ 5 = 0.60. We conduct an experiment by taking a sample of size
n = 3 to “estimate” the population proportion. For samples of size n = 3, there are 10 samples possible with the
sample proportion of females shown in the following table.

Sample Proportion of Females Among the Jones Family Children
Sample Composition          Sample Values xi          Sample Proportion p̄ = x ∕ n
A         B        C        1       1        1               3/3
A         B        D        1       1        0               2/3
A         B        E        1       1        0               2/3
A         C        D        1       1        0               2/3
A         C        E        1       1        0               2/3
A         D        E        1       0        0               1/3
B         C        D        1       1        0               2/3
B         C        E        1       1        0               2/3
B         D        E        1       0        0               1/3
C         D        E        1       0        0               1/3

The sampling distribution of p̄, the proportion of females, is shown below as the relative frequency of the
proportions in the previous table.

Sampling Distribution of p̄
p̄                 f(p̄)
1/3               0.30
2/3               0.60
3/3               0.10
                  1.00
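
The enumeration behind this table can be reproduced with a short Python sketch. It is only an illustration, not part of the original notes: the function name `sampling_distribution` and the dictionary layout are assumptions made for the example. It lists all samples of size 3 from the five children and tallies the relative frequency of each sample proportion.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Population from the text: 1 = female, 0 = male.
population = {"Ann": 1, "Beth": 1, "Charlotte": 1, "David": 0, "Eric": 0}

def sampling_distribution(pop, n):
    """Return {p_bar: relative frequency} over all samples of size n (hypothetical helper)."""
    samples = list(combinations(pop.values(), n))
    counts = Counter(Fraction(sum(s), n) for s in samples)
    return {p: Fraction(c, len(samples)) for p, c in sorted(counts.items())}

dist = sampling_distribution(population, 3)
for p_bar, freq in dist.items():
    print(f"p_bar = {p_bar}, f(p_bar) = {float(freq):.2f}")
```

Exact fractions are used so that the relative frequencies 3/10, 6/10, and 1/10 are not blurred by rounding.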

3.2.1. The Expected Value (Mean) of p̄
The sample statistic p̄ is a random variable with a probability distribution. Again, like all other random variables,
p̄ has an expected value and a standard deviation. The expected value of p̄ is the (weighted) average of all the
sample proportions. The weights are the probabilities associated with each value of the sample proportion. Since
the expected value represents the average of all possible sample proportions, it is also denoted by the symbol μp̄.

E(p̄) = μp̄ = ∑ p̄ P(p̄)

Using the sampling distribution of the proportion of females shown in the previous table, the calculation of the mean
of p̄ is shown as follows.

Calculation of E(p̄)
p̄              f(p̄)            p̄ f(p̄)
1/3            0.30            0.10
2/3            0.60            0.40
3/3            0.10            0.10
μp̄ = E(p̄) = ∑ p̄ f(p̄) =       0.60

Alternatively, we can compute μp̄ directly from the unweighted p̄ values.

μp̄ = (1 + 2/3 + 2/3 + 2/3 + 2/3 + 1/3 + 2/3 + 2/3 + 1/3 + 1/3) ∕ 10 = 3 ∕ 5 = 0.60

3.2.1.1.       The Relationship Between the Parent Population Proportion and the
Mean of All p̄ Values
Now, considering the binary population data of the gender of the children, three out of five children are female.
Therefore, the population proportion is,

π = 3 ∕ 5 = 0.60

Note the important conclusion here that the average of all possible sample proportions is exactly the same as the
population proportion π.⁴

E(p̄) = μp̄ = π

Recall that at the start of this discussion it was stated that the proportion is a special case of the mean where the
values in the data set are the binary values 0 and 1. Thus, the mean of the sampling distribution of x̄ and the mean
of the sampling distribution of p̄ are both equal to the population mean. Only the symbols differ: π is the mean of
the population when the data is binary, and μ is the mean of non-binary data.

3.2.2. The Variance and Standard Error of p̄
The variance of p̄, denoted by var(p̄), is the mean squared deviation of p̄. In the Jones family example, the variance
of p̄ is calculated directly from all the p̄ values as follows:

⁴ See the Appendix for the proof that E(p̄) = π.

Calculation of var(p̄) from All p̄ Values
p̄                               (p̄ − π)²
1.00                            0.1600
0.67                            0.0044
0.67                            0.0044
0.67                            0.0044
0.67                            0.0044
0.33                            0.0711
0.67                            0.0044
0.67                            0.0044
0.33                            0.0711
0.33                            0.0711
∑(p̄ − π)² = 0.4000
var(p̄) = 0.4 ∕ 10 = 0.04

Since p̄ is a random variable, we can also compute var(p̄) as the expected value of the squared deviations of p̄:

Calculation of var(p̄) from the Sampling Distribution of p̄
p̄                  f(p̄)                 (p̄ − π)² f(p̄)
1/3                0.30                 0.021
2/3                0.60                 0.003
3/3                0.10                 0.016
∑(p̄ − π)² f(p̄) =                        0.040

The standard error of p̄ is simply the square root of the variance:

se(p̄) = √var(p̄) = 0.20
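
As a check on the arithmetic, the following Python sketch recomputes the mean, variance, and standard error of p̄ from its sampling distribution in the Jones family example. The helper name `mean_var` is an assumption made for this illustration.

```python
from fractions import Fraction
from math import sqrt

# Sampling distribution of p_bar for n = 3 (values and relative frequencies from the text).
dist = {Fraction(1, 3): Fraction(3, 10),
        Fraction(2, 3): Fraction(6, 10),
        Fraction(3, 3): Fraction(1, 10)}

def mean_var(distribution):
    """Expected value and variance of a discrete distribution given as {value: probability}."""
    mean = sum(p * f for p, f in distribution.items())
    var = sum((p - mean) ** 2 * f for p, f in distribution.items())
    return mean, var

mu_p, var_p = mean_var(dist)
se_p = sqrt(var_p)          # standard error = square root of the variance
print(mu_p, var_p, se_p)
```

With exact fractions the mean comes out as 3/5 and the variance as 1/25, matching 0.60 and 0.04 in the tables.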

3.2.2.1.       The Relationship Between Variance of the Parent Population and the
Variance of p̄
To explain the relationship, let’s first compute the variance of the parent population in our Jones’ children
example. Using the appropriate symbols (use π in place of μ since the population data consists of only 0’s and 1’s),
the familiar population variance formula, recalling the special case of the variance of binary data from Chapter 1,
is:

σ² = π(1 − π)
σ² = 0.6(1 – 0.6) = 0.24

The variance of p̄ is then,

var(p̄) = (σ² ∕ n) × (N − n) ∕ (N − 1) = (π(1 − π) ∕ n) × (N − n) ∕ (N − 1)

var(p̄) = (0.24 ∕ 3) × (5 − 3) ∕ (5 − 1) = 0.04
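
The formula can be written as a small Python function. This is a sketch under the assumption that passing `N=None` stands for a population large enough to drop the finite population correction factor; the function name `var_p_bar` is hypothetical.

```python
def var_p_bar(pi, n, N=None):
    """Variance of the sample proportion.

    pi: population proportion, n: sample size,
    N:  population size (None means the correction factor is omitted).
    """
    var = pi * (1 - pi) / n
    if N is not None:
        var *= (N - n) / (N - 1)   # finite population correction factor
    return var

print(round(var_p_bar(0.6, 3, N=5), 4))   # Jones family example
print(round(var_p_bar(0.6, 3), 4))        # same pi and n without the correction
```

For the Jones family the correction factor is (5 − 3) ∕ (5 − 1) = 0.5, which halves the uncorrected variance of 0.08.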

When the population is non-finite the FPCF approaches 1 and disappears from the picture, and the formula for
var(p̄) becomes simply:⁵

var(p̄) = π(1 − π) ∕ n

The standard error of p̄ is then,

se(p̄) = √(π(1 − π) ∕ n)

3.2.3. The Sampling Distribution of p̄ as a Normal Distribution
In the binomial distribution, as the number of independent trials n increases (and the closer the probability of success π
is to 0.5), the distribution of the binomial random variable x, the number of successes in the n trials, can be
approximated by the normal distribution. The rule of thumb for x to be approximately normally distributed is:

nπ ≥ 5 and nπ(1 − π) ≥ 5

Now, rather than x, we are interested in the distribution of the random variable p̄. Note that p̄ is a linear
transformation of x: p̄ = x ∕ n. We obtain p̄ by multiplying x by the constant 1 ∕ n. Thus, if x is approximately normal,
then its linear transformation p̄ is also approximately normal. (Only the location and scale of the normal curve along
the number line change, not its bell shape.) The following diagram shows the sampling distribution of p̄ as a normal
distribution with a mean of π and the standard deviation (standard error) of se(p̄) = √(π(1 − π) ∕ n).

[Figure: Sampling Distribution of p̄, a normal curve centered at E(p̄) = π with standard error se(p̄) = √(π(1 − π) ∕ n)]

The following examples use the normal distribution to solve probabilities involving the sampling distribution of p̄.

Example
Seventy percent (70%) of vehicles on Indiana interstate highways violate the speed limit of 70 mph. A sample of
500 vehicles is randomly clocked for speed. What is the probability that more than 72% of vehicles in the sample
violate the speed limit?

⁵ For the proof that var(p̄) = π(1 − π) ∕ n, see the Appendix.

[Figure: sampling distribution of p̄ with 0.70 and 0.72 marked on the horizontal axis]

Find P(p̄ > 0.72).

Since the requirements for normal approximation are satisfied (nπ = 350, and nπ(1 − π) = 105), p̄ is
approximately normally distributed with the following parameters:

μp̄ = π = 0.70       and     se(p̄) = √(π(1 − π) ∕ n) = √(0.7(1 − 0.7) ∕ 500) = 0.0205

Since p̄ is normal, the z formula is:

z = (p̄ − π) ∕ se(p̄)

The computed z score is

z = (0.72 − 0.70) ∕ 0.0205 = 0.98

P(z > 0.98) = 1 − 0.8365 = 0.1635
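
The same tail probability, P(p̄ > 0.72) with π = 0.70 and n = 500, can be sketched in Python with the standard library's `statistics.NormalDist`. The helper name `p_bar_distribution` is an assumption for this illustration, and the rule-of-thumb check simply mirrors the one stated in the text. Because the code does not round z to two decimals, its result differs slightly from a table lookup.

```python
from math import sqrt
from statistics import NormalDist

def p_bar_distribution(pi, n):
    """Normal approximation to the sampling distribution of p_bar (hypothetical helper)."""
    # Rule of thumb from the text: n*pi >= 5 and n*pi*(1 - pi) >= 5.
    if n * pi < 5 or n * pi * (1 - pi) < 5:
        raise ValueError("rule of thumb for the normal approximation is not met")
    return NormalDist(mu=pi, sigma=sqrt(pi * (1 - pi) / n))

dist = p_bar_distribution(0.70, 500)
prob = 1 - dist.cdf(0.72)            # upper-tail area beyond 0.72
print(round(dist.stdev, 4), round(prob, 4))
```

The printed standard error agrees with 0.0205; the tail area is close to the table-based value obtained with z = 0.98.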

Example
In the previous example, what is the probability that the sample proportion is within 3 percentage points from the
population proportion? Alternatively stated, what fraction (proportion, or percentage) of proportions computed
from repeated samples of size 500 are within 3 percentage points from the population proportion?

P(π − 0.03 < p̄ < π + 0.03) = P(0.7 − 0.03 < p̄ < 0.7 + 0.03) = P(0.67 < p̄ < 0.73)

z = (0.67 − 0.70) ∕ 0.0205 = −1.46 and z = (0.73 − 0.70) ∕ 0.0205 = 1.46

P(−1.46 < z < 1.46) = 0.8557

P(π − 0.03 < p̄ < π + 0.03) = 0.8557

[Figure: normal curve centered at 0.70 with an area of 0.8557 between 0.67 and 0.73]

Example
In the previous example, what fraction (percentage, or proportion) of p̄’s computed from samples of size n = 500
fall within 4 percentage points from the population proportion?

P(0.66 < p̄ < 0.74)
z = (0.66 − 0.70) ∕ 0.0205 = −1.95 and 1.95
P(−1.95 < z < 1.95) = 0.9488.

Example
In the previous example, what fraction (percentage, or proportion) of p̄’s computed from samples of size n = 1,000
fall within 4 percentage points from the population proportion?

P(0.66 < p̄ < 0.74)
se(p̄) = √(π(1 − π) ∕ n) = √(0.7(1 − 0.7) ∕ 1000) = 0.0145
z = (0.66 − 0.70) ∕ 0.0145 = −2.76 and 2.76
P(−2.76 < z < 2.76) = 0.9942.
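
The last three examples differ only in the margin and the sample size, so they can be compared side by side. The Python sketch below assumes a hypothetical helper `prob_within`; it uses unrounded z-scores, so the results agree with the table-based answers only to about two or three decimal places.

```python
from math import sqrt
from statistics import NormalDist

def prob_within(pi, n, margin):
    """P(pi - margin < p_bar < pi + margin) under the normal approximation."""
    se = sqrt(pi * (1 - pi) / n)     # standard error of p_bar
    z = margin / se
    std_normal = NormalDist()        # standard normal, mean 0 and sd 1
    return std_normal.cdf(z) - std_normal.cdf(-z)

for n, margin in [(500, 0.03), (500, 0.04), (1000, 0.04)]:
    print(n, margin, round(prob_within(0.70, n, margin), 4))
```

Doubling the sample size from 500 to 1,000 shrinks the standard error, so a larger share of sample proportions falls within the same 4 percentage points.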

3.2.4. Margin of Error for p̄
Similar to the discussion of MOE for x̄, the concept of margin of error for p̄ plays a crucial role in inferential
statistics. This is why we place a special emphasis on this topic. The following example involves the MOE for p̄.

Example
Given that the population proportion of vehicles violating the legal speed limit is 0.70, using the sample size of
n = 1,000, in the sampling distribution of p̄ find the interval of p̄ values, symmetric about π, which contains 90% of
all sample proportions computed from random samples of size n = 1,000.

To find the lower and upper ends of the interval, you must add to and subtract from π a certain quantity (in this
case, a fractional value, or percentage points). The lower end and upper end are denoted by, respectively, p̄L and
p̄U.

[Figure: normal curve centered at 0.70 with a central area of 0.90 between p̄L and p̄U and an area of 0.05 in each tail]

The quantity added to and subtracted from π is the MOE. To obtain the MOE for p̄, rearrange z = (p̄ − π) ∕ se(p̄) by solving
for p̄.

p̄ = π + z∙se(p̄)

Thus, to obtain p̄U we must add z∙se(p̄) to π and to obtain p̄L we must subtract z∙se(p̄).

We know π = 0.70, and se(p̄) = 0.0145. Since we want 90% of all sample proportions to be included in the interval,
then of the remaining α = 10% (recall that α is called the error probability), one half will be on the right tail and the
other half on the left tail outside the interval. The margin of statistical error is then,

MOE = zα/2 se(p̄)

Since α = 0.10, the relevant z-score is zα/2 = z0.05 = 1.64. The MOE for the 90% interval is then:

MOE = 1.64(0.0145) = 0.024

The lower and upper ends of the interval are therefore:

p̄L = π − z∙se(p̄) = 0.70 − 0.024 = 0.676                    p̄U = π + z∙se(p̄) = 0.70 + 0.024 = 0.724

This means that if you took repeated samples of 1,000 vehicles and computed the proportion in each sample
which violated the speed limit, then 90% of these proportions would have values ranging from 0.676 to 0.724.
Alternatively stated, 90% of sample proportions would deviate from π by no more than ±0.024, or ±2.4
percentage points.
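
A minimal Python sketch of the MOE calculation follows. The function name `moe_interval` is an assumption for this illustration, and the z-score is taken from `statistics.NormalDist().inv_cdf` rather than from a printed table, so it uses about 1.645 where the text rounds to 1.64.

```python
from math import sqrt
from statistics import NormalDist

def moe_interval(pi, n, level):
    """Return (MOE, lower, upper) for the central `level` share of p_bar values."""
    alpha = 1 - level                           # error probability
    z = NormalDist().inv_cdf(1 - alpha / 2)     # z_{alpha/2}
    moe = z * sqrt(pi * (1 - pi) / n)           # z times the standard error
    return moe, pi - moe, pi + moe

moe, lower, upper = moe_interval(0.70, 1000, 0.90)
print(round(moe, 3), round(lower, 3), round(upper, 3))   # 0.024 0.676 0.724
```

Passing the confidence level rather than the z-score keeps the link between α and zα/2 explicit.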

Example
Suppose in a certain election a candidate received 55% of the votes. What fraction (percentage or proportion) of
sample proportions obtained from repeated samples of size 600 voters each would fall within 3 percentage points
of the population proportion of 0.55?

The objective here is to find P(0.52 < p̄ < 0.58). First determine se(p̄):

se(p̄) = √(π(1 − π) ∕ n) = √(0.55(1 − 0.55) ∕ 600) = 0.0203

z = (p̄ − π) ∕ se(p̄) = (0.52 − 0.55) ∕ 0.0203 = −1.48 and 1.48

P(−1.48 < z < 1.48) = 0.8611

Therefore, about 86% of sample proportions would deviate from π = 0.55 by no more than ±0.03, or by no more
than 3 percentage points.

Example
In the previous example, where π = 0.55, what interval of p̄ values, symmetric about the mean of the sampling
distribution, would contain 95% of the p̄ values of all possible samples?

[Figure: normal curve centered at 0.55 with a central area of 0.95 between p̄L and p̄U and an area of 0.025 in each tail]

Now that we have learned about MOE,

p̄L, p̄U = π ± MOE

Since the interval is to contain 95% of all sample proportions, then the error probability is α = 0.05. The margin of
error is then,

MOE = z0.025 se(p̄) = 1.96 × 0.0203 = 0.0398 (≈ 0.04)

That is, 95% of sample proportions in samples of size 600 fall within 0.04 (or 4 percentage points) from the
population proportion of 0.55.

p̄U = 0.55 + 0.04 = 0.59,

and,

p̄L = 0.55 − 0.04 = 0.51

3.2.5. Determining the Sample Size for a Given MOE
Once again, in many inferential statistics questions you will be asked to determine the sample size that yields a
desired margin of error for p̄. Considering the formula for the margin of error for p̄, the MOE varies inversely with
the square root of the sample size.

MOE = zα/2 se(p̄)

MOE = zα/2 √(π(1 − π) ∕ n)

We can rearrange this formula to solve for n. Squaring both sides and then solving for n we obtain the formula to
determine the sample size for a given MOE.

n = π(1 − π) (zα/2 ∕ MOE)²

Example
In the previous question, where π = 0.55, what is the minimum sample size so that the probability that the sample
proportion is within ±0.02 (or 2 percentage points) from the population proportion is 95%?

Here we are looking for a 95% MOE. Therefore, the error probability is α = 0.05, and zα/2 = z0.025 = 1.96. We want the
margin of error to be MOE = 0.02.

n = 0.55(0.45)(1.96 ∕ 0.02)² = 2376.99            n = 2,377 (Always round up)

Practice Problem Set 4.3
1.   The following table represents the population of 10 graduate and undergraduate students, of whom 4 will be
randomly selected for a particular assignment. Let 1 denote undergraduate, and 0 graduate.

Person        U/G        Person           U/G
A            1           F               1
B            1           G               1
C            1           H               1
D            0            I              1
E            0            J              0

(a) How many possible selections or samples (without replacement) of size n = 4 can we take from this
population?
(b) What is the population proportion (π) of undergraduates?
(c) What is the population variance?
(d) Suppose a random sample selected contains individuals B, F, H, and J. What is the sample proportion of
undergraduates?
(e) What is the mean or expected value of proportions of all samples of size 4 selected from this population?
(f) What are the variance and standard error of p̄?

2.   Assume a population proportion of π = 0.6.
(a) A sample of size n = 100 is selected. What is the standard error of p̄?
(b) A sample of size n = 100 is selected. What is the probability that the sample proportion p̄ is between 0.57
and 0.63?
(c) What fraction (proportion, or percentage) of sample proportions obtained from samples of size n = 500 fall
within ±0.04 (or 4 percentage points) from the population proportion?
(d) For samples of size n = 1000, what is the 95% margin of sampling error for p̄?
(e) For samples of size 1000, what interval of p̄ values contains 95% of p̄'s?
(f) What sample size provides a 95% margin of error of 0.02?

3.   At a major university campus, 38% of the students live in the dormitories. A random sample of 300 students is
selected for a particular study.
(a) What is the standard error of p̄?
(b) What is the probability that the sample proportion of students who live in dormitories is between 0.34 and 0.42?
(c) Suppose the sample size is increased to n = 600. What fraction (proportion, or percentage) of proportions
computed from samples of size n = 600 fall within ±0.03 (or 3 percentage points) from the proportion of
the population of students who live in dormitories?
(d) What is the interval of p̄ values that contains 99% of p̄'s?
(e) What sample size is required to build an interval such that it contains 99% of all sample proportions within
±0.01 of the population proportion?

Appendix
The proof that μx̄ = E(x̄) = μ:

E(x̄) = E(∑x ∕ n) = (1 ∕ n) E(x1 + x2 + ∙∙∙ + xn) = [E(x1) + E(x2) + ∙∙∙ + E(xn)] ∕ n = nμ ∕ n = μ

The proof that var(x̄) = σ² ∕ n:

var(x̄) = σ²(∑x ∕ n) = (1 ∕ n²) σ²(x1 + ∙∙∙ + xn) = [σ²(x1) + ∙∙∙ + σ²(xn)] ∕ n²

Since all xi are selected from the same population, then σ²(x1) = ∙∙∙ = σ²(xn) = σ². Therefore,

var(x̄) = nσ² ∕ n² = σ² ∕ n

And, se(x̄) = σ ∕ √n

The proof that E(p̄) = π:

After taking a sample of size n, determining the number of successes x in the sample (number of females, in the
above example) is a Bernoulli process. Thus x has a binomial distribution. In Chapter 2 it was shown that the
expected value of a binomial random variable is:

E(x) = nπ

Since p̄ = x ∕ n, then x = np̄. Substituting for x in E(x), we have

E(np̄) = nπ
nE(p̄) = nπ

Thus, E(p̄) = π

The proof that var(p̄) = π(1 − π) ∕ n:

Once again, since the number of successes (x) in a sample has a binomial distribution, then the variance of x is:
var(x) = σ² = nπ(1 − π). Substituting for x = np̄, we have:
var(np̄) = nπ(1 − π)
n²var(p̄) = nπ(1 − π)
var(p̄) = π(1 − π) ∕ n
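
The two results for p̄ can also be checked numerically. The Python sketch below is an illustration only: the function name `p_bar_moments` and the choice π = 3/5, n = 10 are assumptions. It builds the exact binomial distribution of x, converts each outcome to p̄ = x ∕ n, and computes the mean and variance with exact fractions.

```python
from fractions import Fraction
from math import comb

def p_bar_moments(pi, n):
    """Exact mean and variance of p_bar = x / n when x is Binomial(n, pi)."""
    pmf = {Fraction(x, n): comb(n, x) * pi**x * (1 - pi)**(n - x)
           for x in range(n + 1)}
    mean = sum(p * f for p, f in pmf.items())
    var = sum((p - mean) ** 2 * f for p, f in pmf.items())
    return mean, var

pi, n = Fraction(3, 5), 10
mean, var = p_bar_moments(pi, n)
print(mean == pi, var == pi * (1 - pi) / n)   # True True
```

This is a numerical check for one choice of π and n, not a substitute for the algebraic proof.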

Solutions to Practice Problem Set 4.1
1.   A population contains 40 items. How many possible samples of size 10 can be selected from this population?
C(N, n) = C(40, 10) = 847,660,528
2.   A population contains five items, A, B, C, D, and E, with the numeric values shown below:

Item            x
A            12
B             9
C            21
D            15
E             3

(a) Compute the mean, variance, and the standard deviation of the population.

x        (x – μ)²
12               0
9               9
21              81
15               9
3              81
60             180

μ = ∑x ∕ N = 60 ∕ 5 = 12             σ² = ∑(x – μ)² ∕ N = 180 ∕ 5 = 36             σ = 6

(b) How many samples of size n = 3 can we select from this population?
C(N, n) = C(5, 3) = 10

(c) List all the samples and compute the mean, variance, and the standard deviation of each sample.
x̄ = ∑x ∕ n          s² = ∑(x – x̄)² ∕ (n – 1)

Samples                 Sample Values                       x̄        s²              s
A       B       C           12      9           21                   14       39         6.245
A       B       D           12      9           15                   12        9         3.000
A       B       E           12      9            3                    8       21         4.583
A       C       D           12    21            15                   16       21         4.583
A       C       E           12    21             3                   12       81         9.000
A       D       E           12    15             3                   10       39         6.245
B       C       D            9    21            15                   15       36         6.000
B       C       E            9    21             3                   11       84         9.165
B       D       E            9    15             3                    9       36         6.000
C       D       E           21    15             3                   13       84         9.165

(d) Write the sampling distribution of x̄.

x̄           f(x̄)
8           0.1
9           0.1
10           0.1
11           0.1
12           0.2
13           0.1
14           0.1
15           0.1
16           0.1

(e) Compute the expected value of the sample mean, E(x̄), from the sampling distribution of x̄. Compare this
mean to the parent population mean μ.

x̄       f(x̄)         x̄ f(x̄)
8        0.1              0.8
9        0.1              0.9
10        0.1              1.0
11        0.1              1.1
12        0.2              2.4
13        0.1              1.3
14        0.1              1.4
15        0.1              1.5
16        0.1              1.6
E(x̄) =      12.0

E(x̄) = μx̄ = μ = 12

(f) Compute the variance and standard error of the mean from the population data.

The population variance is σ² = ∑(x – μ)² ∕ N = 36

The variance of x̄ is:
var(x̄) = (σ² ∕ n) × (N − n) ∕ (N − 1) = (36 ∕ 3) × (5 − 3) ∕ (5 − 1) = 6                  se(x̄) = √var(x̄) = 2.449

(g) Compute the variance and standard error of the mean from the sampling distribution of x̄.

x̄         f(x̄)      (x̄ − μx̄)² f(x̄)
8         0.1                 1.6
9         0.1                 0.9
10          0.1                 0.4
11          0.1                 0.1
12          0.2                 0.0
13          0.1                 0.1
14          0.1                 0.4
15          0.1                 0.9
16          0.1                 1.6
6.0

var(x̄) = ∑(x̄ − μx̄)² f(x̄) = 6
se(x̄) = √6 = 2.449
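
Parts (e) through (g) can be cross-checked by brute force. The Python sketch below (variable names are assumptions made for this illustration) enumerates all 10 samples of size 3, then compares the variance of the sample means with the formula that includes the finite population correction factor.

```python
from itertools import combinations
from statistics import mean, pvariance

values = {"A": 12, "B": 9, "C": 21, "D": 15, "E": 3}
N, n = len(values), 3

# Mean of every possible sample of size n, drawn without replacement.
sample_means = [mean(s) for s in combinations(values.values(), n)]

e_xbar = mean(sample_means)                    # expected value of the sample mean
var_xbar = pvariance(sample_means)             # mean squared deviation over all samples

sigma2 = pvariance(list(values.values()))      # population variance
var_formula = sigma2 / n * (N - n) / (N - 1)   # formula with the correction factor

print(e_xbar, var_xbar, var_formula)
```

Both routes give a variance of 6, and the mean of the sample means equals the population mean of 12.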

3.   The average weight of the adult male population is 180 pounds with a standard deviation of 25 pounds. What
are the mean and standard error of the sampling distribution of x̄, when the sample size is n = 64?
μx̄ = μ = 180
Since the population is non-finite, the FPCF is not needed in the variance and standard error formulas.
se(x̄) = σ ∕ √n = 25 ∕ 8 = 3.125
4.   A population with an unknown distribution has a mean of 1,246 and a standard deviation 345.
(a) For samples of size n = 25 selected from this population, what is the mean of the sampling distribution of
x̄?
μx̄ = μ = 1,246
(b) When n = 25, what is the standard error of x̄?
se(x̄) = σ ∕ √n = 345 ∕ 5 = 69
(c) What is the mean of x̄ if the sample size is n = 64?
μx̄ = μ = 1,246. μx̄ is always equal to μ, regardless of the sample size.
(d) What is the standard error of x̄ if the sample size is n = 64?
se(x̄) = σ ∕ √n = 345 ∕ 8 = 43.125

Solutions to Practice Problem Set 4.2
1.   A population with an unknown distribution has a mean of 1,246 and a standard deviation 345. Answer the
following question for samples of size n = 25 selected from this population.
(a) What is the shape of the distribution of x̄? Is the distribution of x̄ normal?
Since the distribution of the parent population is unknown and the sample size is less than 30, we cannot
ascertain the shape of the sampling distribution of x̄.
(b) What is the distribution of x̄ if the sample size is n = 64?
Since the sample size is greater than 30, per Central Limit Theorem, the sampling distribution of x̄ is
approximately normal.
2.   135,296 students took a statewide math test. The average score of the population of 135,296 students taking
the test was μ = 64, with a standard deviation of σ = 22.
(a) For samples of size n = 50 student tests selected from this population, what is the mean of the sampling
distribution of x̄?
μx̄ = μ = 64. μx̄ is always equal to μ, regardless of the sample size.
(b) For n = 50, what is the standard error of x̄?
se(x̄) = σ ∕ √n = 22 ∕ √50 = 3.111
(c) For n = 50, what is the shape of the distribution of x̄? Is the distribution of x̄ normal?
Since n > 30, the distribution is approximately normal.
(d) A sample of 50 scores is selected. What is the probability that the sample mean is between 60 and 68?
P(60 < x̄ < 68)
μx̄ = μ = 64         se(x̄) = σ ∕ √n = 22 ∕ √50 = 3.111
z = (x̄ − μ) ∕ se(x̄) = (60 − 64) ∕ 3.111 = −1.29 and 1.29
P(−1.29 < z < 1.29) = 0.8029
(e) What fraction (proportion, or percentage) of the means obtained from samples of size n = 50 fall within 5
score units from the population mean score?
P(64 – 5 < x̄ < 64 + 5) = P(59 < x̄ < 69)
P(-1.61 < z < 1.61) = 0.8926
(f) For samples of size 81, what interval of x̄ values would contain 90% of all sample means?
P(x̄L < x̄ < x̄U) = 0.90
se(x̄) = σ ∕ √n = 22 ∕ √81 = 2.444
x̄L, x̄U = μ ± MOE           MOE = z0.05 se(x̄) = 1.64(2.444) = 4.01
x̄L, x̄U = 64 ± 4.01 = (59.99, 68.01)
(g) For samples of size 81, what is the 95% margin of sampling error?
MOE = z0.025 se(x̄) = 1.96(2.444) = 4.79
(h) For samples of size 81, what interval of x̄ values would contain 95% of all sample means?
x̄L, x̄U = 64 ± 4.79 = (59.21, 68.79)
3.   The lifetimes of light bulbs produced by a particular manufacturer have a mean of 1,200 hours and a standard
deviation of 400 hours. The population distribution is normal.
(a) If a single light bulb is selected, what is the probability that its lifetime is between 1,100 and 1,300 hours?
P(1100 < x < 1300)
z = (x − μ) ∕ σ = (1100 − 1200) ∕ 400 = −0.25 and 0.25         P(−0.25 < z < 0.25) = 0.1974
(b) If a sample of 25 light bulbs is selected, what is the probability that the sample mean lifetime is between
1,100 and 1,300?
P(1100 < x̄ < 1300)
se(x̄) = σ ∕ √n = 400 ∕ √25 = 80
z = (x̄ − μ) ∕ se(x̄) = (1100 − 1200) ∕ 80 = −1.25 and 1.25
P(−1.25 < z < 1.25) = 0.7887
(c) What fraction (proportion, or percentage) of the means obtained from samples of size n = 25 fall within
50 hours from the population mean hours of lifetime?
P(1150 < x̄ < 1250)
z = (1150 − 1200) ∕ 80 = −0.63 and 0.63
P(−0.63 < z < 0.63) = 0.4713
(d) What is the 95% margin of statistical error for the sample means, if the sample size is 36?
se(x̄) = σ ∕ √n = 400 ∕ √36 = 66.667           MOE = z0.025 se(x̄) = 1.96(66.667) = 130.67 hours.
(e) What interval of x̄ values contains 95% of all x̄'s?
x̄L, x̄U = 1200 ± 130.67 = (1069.33, 1330.67)
(f) What sample size generates a 95% margin of statistical error of 20 hours?
n = (z0.025 σ ∕ MOE)² = (1.96 × 400 ∕ 20)² = 1536.64, so n = 1,537 (rounded up)

4.   According to a recent report the nationwide average price of regular gasoline is \$2.48 with a standard
deviation of \$0.20.
(a) A sample of 50 gasoline stations is selected. What is the probability that the sample mean is between
\$2.43 and \$2.53?
P(2.43 < x̄ < 2.53)
se(x̄) = σ ∕ √n = 0.20 ∕ √50 = 0.0283
z = (2.43 − 2.48) ∕ 0.0283 = −1.77 and 1.77
P(−1.77 < z < 1.77) = 0.9233
(b) For a sample of size 75, what is the 90% margin of statistical error?
MOE = z0.05 σ ∕ √n = 1.64(0.20 ∕ √75) = 0.038
(c) What interval contains 95% of x̄ values for all samples of size 45?
x̄L, x̄U = μ ± MOE           MOE = 1.96(0.20 ∕ √45) = 0.058
x̄L, x̄U = 2.48 ± 0.058 = (2.42, 2.54)
(d) What sample size yields a 95% margin of error of \$0.01?
n = (z0.025 σ ∕ MOE)² = (1.96 × 0.20 ∕ 0.01)² = 1536.64, so n = 1,537 (rounded up)
5.   The average systolic blood pressure in the population of normal healthy adults is 120 mm Hg with a standard
deviation of 10 mm Hg.
(a) If repeated samples of 25 individuals are taken, what proportion of samples will have mean values
greater than 124 mm Hg?
P(x̄ > 124)
se(x̄) = σ ∕ √n = 10 ∕ √25 = 2
z = (x̄ − μ) ∕ se(x̄) = (124 − 120) ∕ 2 = 2    P(z > 2) = 0.0228
(b) What interval of sample mean values of systolic blood pressure, symmetric about the mean, would
contain 95% of all sample means?
x̄L, x̄U = μ ± MOE           MOE = z0.025 se(x̄) = 1.96(2) = 3.92
x̄L, x̄U = 120 ± 3.92 = (116.08, 123.92)
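
The blood pressure problem can be reproduced with the standard library as a quick check. This Python sketch assumes the sampling distribution of x̄ is normal with mean μ and standard error σ ∕ √n, as in the solution above; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 120, 10, 25
xbar = NormalDist(mu, sigma / sqrt(n))    # sampling distribution of the sample mean

p_above_124 = 1 - xbar.cdf(124)                       # share of sample means above 124
moe_95 = NormalDist().inv_cdf(0.975) * xbar.stdev     # z_{0.025} times the standard error

print(round(p_above_124, 4), round(moe_95, 2))        # 0.0228 3.92
```

The same two lines of calculation apply to the other problems in this set once μ, σ, and n are changed.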

Solutions to Practice Problem Set 4.3
1.   The following table represents the population of 10 graduate and undergraduate students, of whom 4 will be
randomly selected for a particular assignment. Let 1 denote undergraduate, and 0 graduate.

Person        U/G        Person   U/G
A            1           F       1
B            1           G       1
C            1           H       1
D            0            I      1
E            0            J      0

(a) How many possible selections or samples (without replacement) of size n = 4 can we take from this
population?
ν = C(10, 4) = 210
(b) What is the population proportion (π) of undergraduates?
Let x be the number of undergraduates. Then π = x ∕ N = 7 ∕ 10 = 0.70
(c) What is the population variance?
2
σ = π(1− π) = 0.21
(d) Suppose a random sample selected contains individuals B, F, H, and J. What is the sample proportion of
undergraduates?
Sample: {1, 1, 1, 0}     $p = x / n = 3/4 = 0.75$
(e) What is the mean or expected value of proportions of all samples of size 4 selected from this population?
$E(p) = \mu_p = \pi = 0.70$
(f) What are the variance and standard error of p?
$\text{var}(p) = \dfrac{\pi(1 - \pi)}{n}\left(\dfrac{N - n}{N - 1}\right) = \dfrac{0.7(1 - 0.7)}{4}\left(\dfrac{10 - 4}{10 - 1}\right) = 0.035$
$\text{se}(p) = \sqrt{0.035} = 0.187$
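Because the population has only 10 members, the answers to parts (a), (e), and (f) can be checked by brute force: list every possible sample of size 4, compute its proportion, and take the mean and variance over all of them. The sketch below does this with exact fractions; the dictionary simply restates the table, and the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# 1 = undergraduate, 0 = graduate (restating the table for persons A to J)
population = {"A": 1, "B": 1, "C": 1, "D": 0, "E": 0,
              "F": 1, "G": 1, "H": 1, "I": 1, "J": 0}
n = 4

# Every possible sample of size n drawn without replacement
samples = list(combinations(population.values(), n))

# Sample proportion of undergraduates in each sample, kept exact
proportions = [Fraction(sum(s), n) for s in samples]

# Mean and variance taken over all possible samples
mean_p = sum(proportions) / len(proportions)
var_p = sum((p - mean_p) ** 2 for p in proportions) / len(proportions)

print(len(samples))    # number of possible samples
print(float(mean_p))   # expected value of p
print(float(var_p))    # variance of p
```

The enumerated variance agrees with the formula only when the finite population correction (N − n)/(N − 1) is included, since sampling is without replacement.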

2.   Assume a population proportion of π = 0.6.
(a) A sample of size n = 100 is selected. What is the standard error of p?
$\text{var}(p) = \dfrac{\pi(1 - \pi)}{n} = \dfrac{0.6(1 - 0.6)}{100} = 0.0024 \qquad \text{se}(p) = \sqrt{0.0024} = 0.0490$
(b) A sample of size n = 100 is selected. What is the probability that the sample proportion p is between 0.57
and 0.63?
P(0.57 < p < 0.63)
$z = \dfrac{p - \pi}{\text{se}(p)} = \dfrac{0.57 - 0.60}{0.049} = -0.61$, and by symmetry $z = 0.61$ for $p = 0.63$
P(-0.61 < z < 0.61) = 0.4581
(c) What fraction (proportion, or percentage) of sample proportions obtained from samples of size n = 500 falls
within ±0.04 (or 4 percentage points) of the population proportion?
P(0.56 < p < 0.64)

$\text{se}(p) = \sqrt{\dfrac{\pi(1 - \pi)}{n}} = \sqrt{\dfrac{0.6(1 - 0.6)}{500}} = 0.0219$
$z = \dfrac{p - \pi}{\text{se}(p)} = \dfrac{\pm 0.04}{0.0219} = -1.83$ and $1.83$
$P(-1.83 < z < 1.83) = 0.9328$
(d) For samples of size n = 1000, what is the 95% margin of sampling error for p?
$\text{se}(p) = \sqrt{\dfrac{0.6(1 - 0.6)}{1000}} = 0.0155 \qquad \text{MOE} = z_{0.025}\,\text{se}(p) = 1.96(0.0155) = 0.030$
(e) For samples of size 1000, what interval of p values contains 95% of all p's?
$p_L, p_U = \pi \pm \text{MOE} = 0.60 \pm 0.03 = (0.57, 0.63)$
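The standard error, margin of error, and interval for a sample proportion follow the same pattern as for a sample mean, with π(1 − π) playing the role of σ². The sketch below is illustrative only: the function names are not from the chapter, and z is passed in explicitly so that the table value 1.96 used in these solutions can be reused. The last function inverts the MOE formula to get a sample size, as in part (f).

```python
import math


def se_proportion(pi, n):
    """Standard error of the sample proportion: sqrt(pi * (1 - pi) / n)."""
    return math.sqrt(pi * (1 - pi) / n)


def moe_proportion(pi, n, z):
    """Margin of error: z times the standard error."""
    return z * se_proportion(pi, n)


def sample_size_proportion(pi, moe, z):
    """Smallest n whose margin of error does not exceed `moe`."""
    return math.ceil(pi * (1 - pi) * (z / moe) ** 2)


pi, z95 = 0.6, 1.96

moe = moe_proportion(pi, 1000, z95)
interval = (round(pi - moe, 2), round(pi + moe, 2))

print(round(se_proportion(pi, 1000), 4))      # standard error for n = 1000
print(round(moe, 3))                          # 95% margin of error
print(interval)                               # interval holding 95% of p values
print(sample_size_proportion(pi, 0.02, z95))  # n for a 95% MOE of 0.02
```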
(f) What sample size provides a 95% margin of error of 0.02?
$n = \pi(1 - \pi)\left(\dfrac{z_{0.025}}{\text{MOE}}\right)^2 = 0.6(1 - 0.6)\left(\dfrac{1.96}{0.02}\right)^2 = 2304.96$, so $n = 2305$

3.   At a major university campus, 38% of the students live in the dormitories. A random sample of 300 students is
selected for a particular study.
(a) What is the standard error of p?
$\text{se}(p) = \sqrt{\dfrac{\pi(1 - \pi)}{n}} = \sqrt{\dfrac{0.38(1 - 0.38)}{300}} = 0.028$
(b) What is the probability that the sample proportion of students living in dormitories is between 0.34 and 0.42?
P(0.34 < p < 0.42)
$z = \dfrac{p - \pi}{\text{se}(p)} = \dfrac{\pm 0.04}{0.028} = -1.43$ and $1.43$
P(-1.43 < z < 1.43) = 0.8473
(c) Suppose the sample size is increased to n = 600. What fraction (proportion, or percentage) of proportions
computed from samples of size n = 600 falls within ±0.03 (or 3 percentage points) of the proportion of
the population of students who live in dormitories?
$\text{se}(p) = \sqrt{\dfrac{0.38(1 - 0.38)}{600}} = 0.0198$
P(0.35 < p < 0.41)
$z = \dfrac{p - \pi}{\text{se}(p)} = \dfrac{\pm 0.03}{0.0198} = -1.52$ and $1.52$
P(-1.52 < z < 1.52) = 0.8715
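The probabilities in parts (b) and (c) can also be computed directly from the normal approximation, without a z table. The sketch below is a hypothetical helper, not the chapter's method: it keeps z unrounded, so its results may differ in the third decimal place from the table-based values, which round z to two decimals first.

```python
import math
from statistics import NormalDist


def prob_within(pi, n, d):
    """P(pi - d < p < pi + d) under the normal approximation for p."""
    se = math.sqrt(pi * (1 - pi) / n)  # standard error of p
    z = d / se                         # half-width in standard-error units
    std_normal = NormalDist()
    return std_normal.cdf(z) - std_normal.cdf(-z)


# (sample size, half-width) pairs from parts (b) and (c)
for n, d in [(300, 0.04), (600, 0.03)]:
    print(n, d, round(prob_within(0.38, n, d), 4))
```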
(d) What is the interval of p values that contains 99% of all p's from samples of size n = 600?
$\text{MOE} = z_{0.005}\,\text{se}(p) = 2.58(0.0198) = 0.051$
$p_L, p_U = \pi \pm \text{MOE} = 0.38 \pm 0.051 = (0.329, 0.431)$
(e) What sample size is required to build an interval such that it contains 99% of all sample proportions within
±0.01 of the population proportion?
$P(\pi - 0.01 \le p \le \pi + 0.01) = 0.99 \qquad \text{MOE} = 0.01$
$n = \pi(1 - \pi)\left(\dfrac{z_{0.005}}{\text{MOE}}\right)^2 = 0.38(1 - 0.38)\left(\dfrac{2.58}{0.01}\right)^2 = 15682.48$, so $n = 15683$
