Chapter 9: Sampling Distributions

   9.1 Introduction

 • In practice, calculating population parameters is usually
   prohibitive because populations are very large.
 • Rather than investigating the whole population, we take a
   sample, calculate a statistic related to the parameter of
   interest, and make an inference.
 • The sampling distribution of the statistic is the tool that
   tells us how close the statistic is likely to be to the
   parameter.
  9.2 Sampling Distribution of
  the Mean

 • An example
     • A die is thrown infinitely many times. Let X represent the
       number of spots showing on any throw.
     • The probability distribution of X is

           x      1     2     3     4     5     6
           p(x)   1/6   1/6   1/6   1/6   1/6   1/6

       \[ E(X) = 1(1/6) + 2(1/6) + 3(1/6) + \cdots + 6(1/6) = 3.5 \]
       \[ V(X) = (1 - 3.5)^2(1/6) + (2 - 3.5)^2(1/6) + \cdots + (6 - 3.5)^2(1/6) = 2.92 \]
   Throwing a die twice – sample mean


 • Suppose we want to estimate \(\mu\) from the mean \(\bar{x}\)
   of a sample of size n = 2.
 • What is the distribution of \(\bar{x}\)?

All 36 possible samples of size n = 2 and their means:

Sample  Outcome  Mean     Sample  Outcome  Mean     Sample  Outcome  Mean
   1      1,1     1         13      3,1     2         25      5,1     3
   2      1,2    1.5        14      3,2    2.5        26      5,2    3.5
   3      1,3     2         15      3,3     3         27      5,3     4
   4      1,4    2.5        16      3,4    3.5        28      5,4    4.5
   5      1,5     3         17      3,5     4         29      5,5     5
   6      1,6    3.5        18      3,6    4.5        30      5,6    5.5
   7      2,1    1.5        19      4,1    2.5        31      6,1    3.5
   8      2,2     2         20      4,2     3         32      6,2     4
   9      2,3    2.5        21      4,3    3.5        33      6,3    4.5
  10      2,4     3         22      4,4     4         34      6,4     5
  11      2,5    3.5        23      4,5    4.5        35      6,5    5.5
  12      2,6     4         24      4,6     5         36      6,6     6

   The distribution of \(\bar{x}\) when n = 2

[Figure: histogram of the sampling distribution of \(\bar{x}\), drawn over the table of 36 samples. Horizontal axis: \(\bar{x}\) = 1, 1.5, 2.0, ..., 6.0. Vertical axis: probability, from 1/36 to 6/36.]

   \[ E(\bar{x}) = 1.0(1/36) + 1.5(2/36) + \cdots = 3.5 \]
   \[ V(\bar{x}) = (1.0 - 3.5)^2(1/36) + (1.5 - 3.5)^2(2/36) + \cdots = 1.46 \]

   Note: \(\mu_{\bar{x}} = \mu_x\) and \(\sigma^2_{\bar{x}} = \sigma^2_x / 2\)

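The enumeration above can be checked mechanically. The following is a minimal sketch, not part of the original slides, that lists all 36 ordered samples of two throws and computes the mean and variance of the sample mean with exact fractions. The variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Population (a single throw): each face has probability 1/6.
pop_mean = sum(Fraction(x, 6) for x in faces)
pop_var = sum((x - pop_mean) ** 2 * Fraction(1, 6) for x in faces)

# All 36 equally likely ordered samples of size n = 2, and their means.
means = [Fraction(a + b, 2) for a, b in product(faces, repeat=2)]
dist = {m: Fraction(c, 36) for m, c in sorted(Counter(means).items())}

mean_of_means = sum(m * p for m, p in dist.items())
var_of_means = sum((m - mean_of_means) ** 2 * p for m, p in dist.items())

print(float(pop_mean), round(float(pop_var), 2))            # 3.5 2.92
print(float(mean_of_means), round(float(var_of_means), 2))  # 3.5 1.46
```

The exact variances are 35/12 for one throw and 35/24 for the mean of two throws, which is the population variance divided by n = 2.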
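            Sampling Distribution of the Mean

   n = 5:    \(\mu_{\bar{x}} = 3.5\),   \(\sigma^2_{\bar{x}} = .5833 \; (= \sigma^2_x / 5)\)
   n = 10:   \(\mu_{\bar{x}} = 3.5\),   \(\sigma^2_{\bar{x}} = .2917 \; (= \sigma^2_x / 10)\)
   n = 25:   \(\mu_{\bar{x}} = 3.5\),   \(\sigma^2_{\bar{x}} = .1167 \; (= \sigma^2_x / 25)\)
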
             Notice that \(\sigma^2_{\bar{x}}\) is smaller than \(\sigma^2_x\).
             The larger the sample size, the smaller \(\sigma^2_{\bar{x}}\).
             Therefore, \(\bar{x}\) tends to fall closer to \(\mu\) as the
             sample size increases.

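As a quick check of the three variances quoted for n = 5, 10, and 25, here is a small sketch (not from the original slides) that divides the die's population variance, 35/12, by the sample size. The function name `var_of_sample_mean` is illustrative.

```python
from fractions import Fraction

# Variance of one throw of a fair die (about 2.92).
pop_var = Fraction(35, 12)

def var_of_sample_mean(n):
    """Variance of the mean of n independent throws: sigma^2 / n."""
    return pop_var / n

for n in (5, 10, 25):
    print(n, round(float(var_of_sample_mean(n)), 4))
# 5 0.5833
# 10 0.2917
# 25 0.1167
```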
      Sampling Distribution of the
                Mean
Demonstration: The variance of the sample mean is smaller than the
variance of the population.

[Figure: samples of two observations are taken from the population {1, 2, 3}; the resulting sample means are 1.5, 2, and 2.5.]

Compare the variability of the population to the variability of the
sample mean.

  Sampling Distribution of the
            Mean

               Also,
 Expected value of the population =
         (1 + 2 + 3)/3 = 2

Expected value of the sample mean =
        (1.5 + 2 + 2.5)/3 = 2


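The demonstration can be reproduced in a few lines. This sketch assumes, based on the sample means listed (1.5, 2, 2.5), that each sample consists of two distinct observations drawn without replacement. That assumption is an inference, since the slides do not state it explicitly.

```python
from itertools import combinations
from statistics import mean, pvariance

population = [1, 2, 3]

# Assumption: samples of two distinct observations (without replacement).
sample_means = [mean(s) for s in combinations(population, 2)]

print(sample_means)                               # the three sample means
print(mean(population), mean(sample_means))       # both expected values equal 2
print(pvariance(population), pvariance(sample_means))
```

The two expected values agree, while the variance of the sample means (1/6) is smaller than the population variance (2/3).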
    The Central Limit Theorem

 • If a random sample is drawn from any population, the sampling
   distribution of the sample mean is approximately normal for a
   sufficiently large sample size.
 • The larger the sample size, the more closely the sampling
   distribution of \(\bar{x}\) will resemble a normal distribution.

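A small simulation can illustrate the theorem. The sketch below is not from the original slides. It assumes an exponential population with mean 1 (chosen only because it is strongly skewed) and uses skewness as a rough measure of non-normality, since a normal distribution has skewness 0. The function names, the number of repetitions, and the seed are all illustrative choices.

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Third standardized moment; 0 for a symmetric distribution."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulate_sample_means(n, reps=20000, seed=0):
    """Means of `reps` samples of size n from an exponential population (mean 1)."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

# Skewness of the simulated sampling distribution for increasing sample sizes.
skew = {n: skewness(simulate_sample_means(n)) for n in (2, 5, 30)}
for n, value in skew.items():
    print(n, round(value, 2))
```

The skewness of the simulated sample means should shrink toward 0 as n grows, which is the behavior the theorem describes.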
Sampling Distribution of the Sample
               Mean


 1. \(\mu_{\bar{x}} = \mu_x\)
 2. \(\sigma^2_{\bar{x}} = \sigma^2_x / n\)
 3. If x is normal, \(\bar{x}\) is normal. If x is nonnormal,
    \(\bar{x}\) is approximately normally distributed for a
    sufficiently large sample size.

          Sampling Distribution of the
                Sample Mean
 • Example 9.1
     • The amount of soda pop in each bottle is normally
       distributed with a mean of 32.2 ounces and a
       standard deviation of .3 ounces.
     • Find the probability that a bottle bought by a
       customer will contain more than 32 ounces.
     • Solution
         • The random variable X is the amount of soda in a bottle.

           \[ P(x > 32) = P\left(\frac{x - \mu}{\sigma_x} > \frac{32 - 32.2}{.3}\right) = P(z > -.67) = 0.7486 \]

[Figure: normal curve centered at \(\mu\) = 32.2; the area to the right of x = 32 is 0.7486.]

           Sampling Distribution of the
                 Sample Mean
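 • Find the probability that a carton of four bottles will
   have a mean of more than 32 ounces of soda per bottle.
 • Solution
     • Define the random variable as the mean amount of soda per
       bottle.

       \[ P(\bar{x} > 32) = P\left(\frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} > \frac{32 - 32.2}{.3/\sqrt{4}}\right) = P(z > -1.33) = 0.9082 \]

[Figure: distribution of \(\bar{x}\) centered at \(\mu_{\bar{x}}\) = 32.2; the area to the right of \(\bar{x}\) = 32 is 0.9082, compared with 0.7486 to the right of x = 32 for a single bottle.]
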
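Both parts of Example 9.1 can be computed directly instead of read from a z table. This is a minimal sketch using Python's standard `statistics.NormalDist`; the variable names are illustrative. Because the slides round z to two decimals before the table lookup, the computed values differ slightly in the third decimal (roughly 0.7475 and 0.9088).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 32.2, 0.3

# One bottle: X is normal with mean mu and standard deviation sigma.
p_bottle = 1 - NormalDist(mu, sigma).cdf(32)

# Mean of a carton of n = 4 bottles: standard deviation sigma / sqrt(n).
n = 4
p_carton = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(32)

print(round(p_bottle, 4), round(p_carton, 4))
```

The carton probability is larger because the mean of four bottles varies less than a single bottle does.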
         Sampling Distribution of the
               Sample Mean
 • Example 9.2
     • Dean’s claim: The average weekly income of
       B.B.A. graduates one year after graduation is $600.
     • Suppose the distribution of weekly income has a
       standard deviation of $100. What is the
       probability that 25 randomly selected graduates
       have an average weekly income of less than $550?
     • Solution

       \[ P(\bar{x} < 550) = P\left(\frac{\bar{x} - \mu}{\sigma_{\bar{x}}} < \frac{550 - 600}{100/\sqrt{25}}\right) = P(z < -2.5) = 0.0062 \]

 Sampling Distribution of the Sample
                Mean

 • Example 9.2 – continued
     • If a random sample of 25 graduates actually had
       an average weekly income of $550, what would
       you conclude about the validity of the claim that
       the average weekly income is $600?
     • Solution
         • With \(\mu\) = 600, the probability of observing a sample
           mean as low as 550 is very small (0.0062). The claim that
           the mean weekly income is $600 is probably unjustified.
         • It is more reasonable to assume that \(\mu\) is smaller
           than $600, because then a sample mean of $550
           becomes more probable.

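The probability used in Example 9.2 can be reproduced as follows. This is a short sketch with illustrative variable names, using the standard library's `NormalDist` rather than a z table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 600, 100, 25

standard_error = sigma / sqrt(n)      # 100 / 5 = 20
z = (550 - mu) / standard_error       # -2.5
p = NormalDist().cdf(z)               # P(Z < -2.5)

print(z, round(p, 4))                 # -2.5 0.0062
```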
       Using Sampling Distributions for
       Inference
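 • To make inferences about population parameters we use
   sampling distributions (as in Example 9.2).
 • The symmetry of the normal distribution, along with the
   sampling distribution of the mean, leads to:

      \[ P(-1.96 < z < 1.96) = .95, \quad \text{or} \quad P\left(-1.96 < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = .95 \]

   where \(1.96 = z_{.025}\) and \(-1.96 = -z_{.025}\).

      This can be written as

      \[ P\left(-1.96\frac{\sigma}{\sqrt{n}} < \bar{x} - \mu < 1.96\frac{\sigma}{\sqrt{n}}\right) = .95 \]

      which becomes

      \[ P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{x} < \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = .95 \]
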
         Using Sampling Distributions for
         Inference


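[Figure: two panels. Left: standard normal distribution Z, with central area .95 between -1.96 and 1.96 and area .025 in each tail. Right: normal distribution of \(\bar{x}\), with central area .95 between \(\mu - 1.96\,\sigma/\sqrt{n}\) and \(\mu + 1.96\,\sigma/\sqrt{n}\) and area .025 in each tail, where \(\mu\) = 600, \(\sigma\) = 100, and n = 25.]
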
    Using Sampling Distributions for
    Inference

        \[ P\left(600 - 1.96\frac{100}{\sqrt{25}} < \bar{x} < 600 + 1.96\frac{100}{\sqrt{25}}\right) = .95 \]

        which reduces to \(P(560.8 < \bar{x} < 639.2) = .95\)

 • Conclusion
     • There is a 95% chance that the sample mean falls within the
       interval [560.8, 639.2] if the population mean is 600.
     • Since the sample mean was 550, the population mean is
       probably not 600.

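The interval above can be recomputed as a check. The sketch below uses illustrative variable names and obtains the 1.96 multiplier from the standard normal quantile function instead of a table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 600, 100, 25

z = NormalDist().inv_cdf(0.975)       # about 1.96, leaving .025 in each tail
half_width = z * sigma / sqrt(n)      # about 39.2
lower, upper = mu - half_width, mu + half_width

print(round(lower, 1), round(upper, 1))   # 560.8 639.2

observed_mean = 550
print(lower <= observed_mean <= upper)    # False: 550 lies outside the interval
```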
   9.3 Sampling Distribution of
       a Proportion

 • The parameter of interest for nominal data is the proportion
   of times a particular outcome (success) occurs.
 • To estimate the population proportion p we use the sample
   proportion.

          The estimate of p is \(\hat{p} = X/n\), where X is the
          number of successes and n is the sample size.

 • Since X is binomial, probabilities about \(\hat{p}\) can be
   calculated from the binomial distribution.
 • Yet, for inference about \(\hat{p}\) we prefer to use the
   normal approximation to the binomial.

        Normal approximation to the
                 Binomial
   • The normal approximation to the binomial works best when
       • the number of experiments (sample size) is large, and
       • the probability of success, p, is close to 0.5.

   • For the approximation to provide good results, two
     conditions should be met:

         \(np \ge 5\);  \(n(1 - p) \ge 5\)

      Normal approximation to the
               Binomial

Example
  Approximate the binomial probability P(X = 10)
  when n = 20 and p = .5.

  The parameters of the normal distribution
  used to approximate the binomial are:

             \(\mu = np\);  \(\sigma^2 = np(1 - p)\)

      Normal approximation to the
               Binomial

Let us build a normal distribution to approximate the
binomial P(X = 10).

   \(\mu = np = 20(.5) = 10\)
   \(\sigma^2 = np(1 - p) = 20(.5)(1 - .5) = 5\)
   \(\sigma = 5^{1/2} = 2.24\)

The approximation, with X binomial and Y normal:

   \[ P(X = 10) = .176 \approx P(9.5 < Y < 10.5) = P\left(\frac{9.5 - 10}{2.24} < Z < \frac{10.5 - 10}{2.24}\right) = .1742 \]

[Figure: normal curve centered at 10 with the area between 9.5 and 10.5 shaded.]

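The comparison between the exact binomial probability and its normal approximation can be sketched as below. This is not from the original slides, and the variable names are illustrative. Computed without rounding z to .22, the approximation comes out near .177 instead of the table-based .1742, which is even closer to the exact value.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 20, 0.5, 10

# Exact binomial probability P(X = 10).
exact = comb(n, k) * p**k * (1 - p) ** (n - k)

# Normal approximation with mean np and variance np(1 - p),
# using the continuity correction P(9.5 < Y < 10.5).
approx_dist = NormalDist(n * p, sqrt(n * p * (1 - p)))
approx = approx_dist.cdf(k + 0.5) - approx_dist.cdf(k - 0.5)

print(round(exact, 3), round(approx, 3))
```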
   Normal approximation to the
            Binomial

 • More examples of normal approximation to the binomial

     \(P(X \le 4) \approx P(Y < 4.5)\)

     \(P(X \ge 14) \approx P(Y > 13.5)\)

   Approximate Sampling Distribution
   of a Sample Proportion

 • From the laws of expected value and variance, it can be shown
   that \(E(\hat{p}) = p\) and \(V(\hat{p}) = p(1 - p)/n\).
 • If both np > 5 and n(1 - p) > 5, then

       \[ z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} \]

   is approximately standard normally distributed.
 • Example 9.3
     • A state representative received 52% of the
       votes in the last election.
     • One year later the representative wanted
       to study his popularity.
     • If his popularity has not changed, what is
       the probability that more than half of a
       sample of 300 voters would vote for him?

 • Example 9.3 – Solution
     • The number of respondents who prefer the representative
       is binomial with n = 300 and p = .52. Thus,
       np = 300(.52) = 156 and n(1 - p) = 300(1 - .52) = 144
       (both greater than 5).

       \[ P(\hat{p} > .50) = P\left(\frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} > \frac{.50 - .52}{\sqrt{(.52)(1 - .52)/300}}\right) = .7549 \]

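Example 9.3 can be checked numerically with the sketch below, which uses illustrative variable names. Without rounding z to two decimals, the result is about .756 instead of the table-based .7549.

```python
from math import sqrt
from statistics import NormalDist

n, p = 300, 0.52

# Conditions for the normal approximation (both products exceed 5).
conditions_met = n * p > 5 and n * (1 - p) > 5

# Standard deviation of the sample proportion: sqrt(p(1 - p)/n).
se = sqrt(p * (1 - p) / n)
prob = 1 - NormalDist(p, se).cdf(0.50)   # P(p-hat > .50)

print(conditions_met, round(se, 4), round(prob, 3))
```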
9.4 Sampling Distribution of the
    Difference Between Two Means

 • Independent samples are drawn from each of two normal
   populations.
 • We’re interested in the sampling distribution of the
   difference between the two sample means, \(\bar{x}_1 - \bar{x}_2\).

      Sampling Distribution of the
      Difference Between Two Means

 • The distribution of \(\bar{x}_1 - \bar{x}_2\) is normal if
     • the two samples are independent, and
     • the parent populations are normally distributed.
 • If the two populations are not both normally distributed,
   but the sample sizes are 30 or more, the distribution of
   \(\bar{x}_1 - \bar{x}_2\) is approximately normal.
   Sampling Distribution of the
   Difference Between Two Means

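 • Applying the laws of expected value and variance we have:

       \[ E(\bar{x}_1 - \bar{x}_2) = E(\bar{x}_1) - E(\bar{x}_2) = \mu_1 - \mu_2 \]

       \[ V(\bar{x}_1 - \bar{x}_2) = V(\bar{x}_1) + V(\bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \]

 • We can define:

       \[ Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]
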
      Sampling Distribution of the
      Difference Between Two Means

Example 9.4
     • The mean starting salaries of MBA students from two
       universities (WLU and UWO) are $62,000 (standard
       deviation = $14,500) and $60,000 (standard
       deviation = $18,300), respectively.
     • What is the probability that the sample mean of
       WLU students will exceed the sample mean of
       UWO students? (\(n_{WLU} = 50\); \(n_{UWO} = 60\))

     Sampling Distribution of the
     Difference Between Two Means
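 • Example 9.4 – Solution
     • We need to determine \(P(\bar{x}_1 - \bar{x}_2 > 0)\).

       \(\mu_1 - \mu_2 = 62{,}000 - 60{,}000 = \$2{,}000\)

       \[ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{14{,}500^2}{50} + \frac{18{,}300^2}{60}} = \$3{,}128 \]

       \[ P(\bar{x}_1 - \bar{x}_2 > 0) = P\left(\frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} > \frac{0 - 2000}{3128}\right) = P(z > -.64) = .5 + .2389 = .7389 \]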
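The arithmetic in Example 9.4 can be verified with a short sketch. The variable names are illustrative, and the UWO standard deviation is taken to be $18,300, which is the value the solution's calculation uses.

```python
from math import sqrt
from statistics import NormalDist

mu1, sigma1, n1 = 62_000, 14_500, 50   # WLU
mu2, sigma2, n2 = 60_000, 18_300, 60   # UWO (standard deviation assumed to be 18,300)

# Standard deviation of the difference between the two sample means.
se_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# P(x1-bar - x2-bar > 0) when the difference is normal with mean mu1 - mu2.
prob = 1 - NormalDist(mu1 - mu2, se_diff).cdf(0)

print(round(se_diff), round(prob, 3))
```

The standard deviation rounds to 3,128 as in the solution, and the probability is about .739, matching the table-based .7389 to three decimals.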

				