					Confidence Interval Estimation
              CQMS202 Sections 410 & 420

                     September 23, 2010




Outline
 Housekeeping items

 Review

 Confidence interval estimation for the mean (σ known)

 Confidence interval estimation for the mean (σ unknown)

 Confidence interval estimation for the proportion

 Determining sample size

 Confidence interval estimation and ethical issues


Housekeeping
 No enrollment

 Quiz 1 – October 7 – 45-60 mins

 Receive 75% refund if drop before October 7




Review
 Properties of normal distribution

 Transforming an observed value to standardized score




Problem 6.12 (p. 286, using calculator)
 Given a standardized normal distribution (with a mean of
   0 and a standard deviation of 1), what is the probability
   that:

 a. Z is less than 1.57?

 b. Z is greater than 1.84?

 c. Z is between 1.57 and 1.84?

 d. Z is less than 1.57 or greater than 1.84?
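
The four probabilities can be checked without a calculator or table. The sketch below is one possible way to do it, assuming Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

z = NormalDist()  # standardized normal: mean 0, standard deviation 1

p_a = z.cdf(1.57)                # a. P(Z < 1.57)
p_b = 1 - z.cdf(1.84)            # b. P(Z > 1.84)
p_c = z.cdf(1.84) - z.cdf(1.57)  # c. P(1.57 < Z < 1.84)
p_d = p_a + p_b                  # d. the two regions do not overlap, so add them

for label, p in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(f"{label}. {p:.4f}")
```

Parts c and d are complements of each other, so the two answers should sum to 1.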




Sampling and sampling distribution
 Sample is used to estimate population characteristics

 Efficient, economical, more practical (but involves error)




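 Population vs. sample
  Population:
   $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$
   $\sigma = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}}$
   $Z = \frac{X - \mu}{\sigma}$

  Sample:
   $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
   $s = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}}$
   $Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$, where $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$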
     Population vs. Sample
Population Parameters                         Sample Statistics
- µ (mu): mean                                - X-bar (x-bar): mean (M is used in other places)
- σ (sigma): standard deviation               - s: standard deviation
- σ² (sigma-square): variance                 - s² (s-square): variance
- σx-bar (sigma-x-bar): standard error        - sx-bar (s-x-bar): sample standard error
  of the mean
Problem 7.72
The fill amount of bottles of a soft drink is normally
  distributed, with a mean of 2.0 liters and a standard
  deviation of 0.05 liter. If you select a random sample of
  25 bottles, what is the probability that the sample mean
  will be
a. Between 1.99 and 2.0 liters?
b. Below 1.98 liters?
c. Greater than 2.01 liters?
d. The probability is 99% that the sample mean amount of
   soft drink will be at least how much?
e. The probability is 99% that the sample mean amount of
   soft drink will be between which two values
   (symmetrically distributed around the mean)?
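
A rough sketch of how parts a through e could be checked, assuming Python's standard-library `statistics.NormalDist`. The key step is that the sample mean follows a normal distribution with standard error σ/√n = 0.05/5 = 0.01; all variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.0, 0.05, 25

# Sampling distribution of the mean: same center, standard error sigma / sqrt(n).
xbar_dist = NormalDist(mu, sigma / sqrt(n))

p_a = xbar_dist.cdf(2.0) - xbar_dist.cdf(1.99)  # a. between 1.99 and 2.0
p_b = xbar_dist.cdf(1.98)                       # b. below 1.98
p_c = 1 - xbar_dist.cdf(2.01)                   # c. greater than 2.01
at_least = xbar_dist.inv_cdf(0.01)              # d. 99% of sample means lie above this
lower = xbar_dist.inv_cdf(0.005)                # e. symmetric 99% range, lower end
upper = xbar_dist.inv_cdf(0.995)                # e. symmetric 99% range, upper end

print(f"a. {p_a:.4f}  b. {p_b:.4f}  c. {p_c:.4f}")
print(f"d. at least {at_least:.4f} liters")
print(f"e. between {lower:.4f} and {upper:.4f} liters")
```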



Confidence interval estimation
for the mean (σ known)
 Deductive reasoning: population → sample

 Inductive reasoning: sample → population

 Estimation
   Point estimate: value of a single sample statistic (e.g., mean)
   Confidence interval estimate: numbers constructed around the
    point estimate




 Estimating μ
  μ is rarely known. But we can estimate μ from sample
   means. There are some exceptions: estimated μ for IQ is
   around 100 with σ around 15.

  Standardization is achieved through a large number of
   testing in order to provide a more “accurate”
   estimation of the population μ.


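   $z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$
   $\bar{X}_U = \mu + (z)(\sigma_{\bar{X}}) = \mu + (z)(\sigma / \sqrt{n})$
   $\bar{X}_L = \mu - (z)(\sigma_{\bar{X}}) = \mu - (z)(\sigma / \sqrt{n})$
   Interval estimate of μ: $\bar{X} \pm (z)(\sigma / \sqrt{n})$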
Confidence intervals in a graph



[Figure: confidence intervals from Sample 1 through Sample 100, drawn against the population mean μ = 2]
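
The picture of 100 intervals around μ = 2 can be imitated with a small simulation. This is only a sketch: σ = 0.05 and n = 25 are borrowed from Problem 7.72 as assumptions, and the seed is arbitrary. With 95% intervals, roughly 95 of the 100 should contain μ, but the exact count varies from run to run.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(202)  # arbitrary seed so the run is repeatable

MU, SIGMA, N, SAMPLES = 2.0, 0.05, 25, 100  # assumed values for illustration
z_crit = NormalDist().inv_cdf(0.975)         # two-tailed, alpha = .05
margin = z_crit * SIGMA / sqrt(N)            # same half-width for every sample

covered = 0
for _ in range(SAMPLES):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = mean(sample)
    # Count the interval if it contains the true population mean.
    if x_bar - margin <= MU <= x_bar + margin:
        covered += 1

print(f"{covered} of {SAMPLES} intervals contain mu = {MU}")
```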
Why z = ±1.96?
With alpha (α) = .05, two-tailed, zcrit = ± 1.96




[Figure: standard normal curve with .95 of the area between -z-crit and +z-crit and .025 in each tail]
Standard error, standard deviation
& sample size
                                          
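  Sampling error: n ↑, σ-xbar ↓
   $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
  Level of confidence: (1-α) x 100%

   $\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

   $z_{\alpha/2} = 1.96$, α = .05 (95% confidence)
   $z_{\alpha/2} = 2.58$, α = .01 (99% confidence)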
Caveats
 Assumption of normality

 Large sample size

 How to check? Data screening, especially data
  visualization skills such as stem-and-leaf plot and
  boxplot

 σ is not always known…




Problem 9.2 (p. 365)
 If X-bar = 125, σ = 24 and n = 36, construct a 99%
 confidence interval estimate of the population mean, μ.
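
One way to work this problem by hand, using the rounded table value $z_{\alpha/2} = 2.58$ for 99% confidence (a more precise value, about 2.576, shifts the limits slightly):

```latex
\bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
  = 125 \pm (2.58)\frac{24}{\sqrt{36}}
  = 125 \pm (2.58)(4)
  = 125 \pm 10.32
\quad\Rightarrow\quad
114.68 \le \mu \le 135.32
```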




Confidence interval estimation
for the mean (σ unknown)
 t distribution

 William Gosset, 1908

 Guinness Brewing Company

 Used the pseudonym of Student because the company
  would not allow him to publish under his name

 Developed the t distribution, which relies on s² instead of σ².

 Sometimes called Student’s t distribution



  Introduction to t statistics
 Working with what’s available

 z tests often require information about the population that is
  not available

 Population mean (μ) can be inferred from sample mean (M).
  But hard to decide the variance (hence standard deviation)
  of the population. Without σ, we cannot obtain std. err. (σM)
  for the z-formula.

 But t accommodates this limitation.




The t-distribution
[Figure: t distributions for df = a, df = a + c, and df = a + 2c]
Comparing t to z distributions

[Figure: t distributions for df = a, df = a + c, and df = a + 2c compared with the z distribution]




a=n–1

http://www.econtools.com/jevons/java/Graphics2D/tDist.html

Why does the t distribution fluctuate?
  Not enough info on the population → can’t determine σ

  But we have the sample standard deviation: s

  Calculating σ:
   $\sigma = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}}$

  Calculating s:
   $s = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}}$
 Degrees of Freedom
 Sample statistics are estimates of population parameters.

 Dividing by n gives a biased estimate of the population variance → the
  larger the n, the smaller the bias

 To obtain an unbiased estimator, some adjustment is
  necessary.

 Degrees of freedom (df) indicates the number of observations in a
  sample minus the number of estimated parameters.

 For calculating s, one parameter (μ) is estimated by the sample mean. Hence, df = n – 1
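
A small simulation can make the n – 1 adjustment concrete. The sketch below assumes a standard normal population (true variance 1), samples of size 5, and an arbitrary seed; none of these values come from the slides. Averaged over many samples, dividing the sum of squares by n lands near (n – 1)/n = 0.8, while dividing by n – 1 lands near the true value of 1.

```python
import random
from statistics import mean

random.seed(1908)          # arbitrary seed so the run is repeatable
N_OBS, TRIALS = 5, 20000   # assumed sample size and number of repeated samples

biased, unbiased = [], []
for _ in range(TRIALS):
    sample = [random.gauss(0, 1) for _ in range(N_OBS)]  # population variance = 1
    m = mean(sample)
    ss = sum((x - m) ** 2 for x in sample)  # sum of squared deviations from x-bar
    biased.append(ss / N_OBS)               # divide by n
    unbiased.append(ss / (N_OBS - 1))       # divide by df = n - 1

avg_biased = mean(biased)
avg_unbiased = mean(unbiased)
print(f"divide by n:     {avg_biased:.3f}")
print(f"divide by n - 1: {avg_unbiased:.3f}")
```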




Standard Error for one sample
 It is more often that we deal with a sample than an
  individual.

 To estimate a population from a sample, we need sM
  instead of s.

   $s_M = \frac{s}{\sqrt{n}} = \sqrt{\frac{s^2}{n}}$
 We’ll come back to the s.e. again…


     Confidence interval estimation
     for the mean (σ unknown)

                                                       
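      σ known:   $\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
      σ unknown: $\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}$
      df for t = n - 1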




Problem 9.17 (p.373)
The data below represent the total fat, in grams per serving, for
  a sample of 20 chicken sandwiches from fast-food chains.
7, 8, 4, 5, 16, 20, 20, 24, 19, 30, 23, 30, 25, 19, 29, 29, 30, 30,
  40, 56
a. Construct a 95% confidence interval for the population mean
   total fat, in grams per serving.
b. Interpret the interval constructed in a.
c. What assumption must you make about the population
   distribution in order to construct the confidence interval
   estimate in a?
d. Do you think that the assumption needed in order to
   construct the confidence interval estimate in a is valid?
   Explain.
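
Part a could be checked along these lines. This is a sketch using only the Python standard library, which has no t distribution, so the critical value 2.0930 (df = 19, .025 in each tail) is typed in from a t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Total fat (grams per serving) for the 20 chicken sandwiches in the problem.
fat = [7, 8, 4, 5, 16, 20, 20, 24, 19, 30,
       23, 30, 25, 19, 29, 29, 30, 30, 40, 56]

n = len(fat)
x_bar = mean(fat)
s = stdev(fat)     # sample standard deviation, divides by n - 1
T_CRIT = 2.0930    # from a t table: df = n - 1 = 19, 95% confidence

margin = T_CRIT * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"x-bar = {x_bar:.2f}, s = {s:.2f}")
print(f"95% CI: {lower:.2f} to {upper:.2f} grams")
```

For parts c and d, a stem-and-leaf plot or boxplot of `fat` is the usual way to judge whether the normality assumption looks reasonable.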
Confidence interval estimation
for the proportion
 Population proportion: π

 Point estimate for π is the sample proportion: p = X/n

   $p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \le \pi \le p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$
   $z_{\alpha/2} = 1.96$, α = .05 (95% confidence)
   $z_{\alpha/2} = 2.58$, α = .01 (99% confidence)
   $p = \frac{X}{n}$, assuming X and (n − X) are both > 5
   Problem 9.27 (p. 377)
The start of the twenty-first century saw many corporate scandals
  and many individuals lost faith in business. In a 2007 poll
  conducted by the NYC-based Edelman Public Relations firm, 57%
  of respondents say they trust business to “do what is right”. This
  percentage was the highest in the annual survey since 2001.
a. Construct a 95% confidence interval estimate of the population
   proportion of individuals who trust business to “do what is
   right” assuming that the poll surveyed:
  1. 100 individuals
  2. 200 individuals
  3. 300 individuals

b. Discuss the effect that sample size has on the width of
   confidence intervals.
c. Interpret the intervals in a.
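
A possible way to compute the three intervals in part a and compare their widths for part b. The helper `proportion_ci` is a made-up name for illustration, and the critical value 1.96 is the rounded 95% table value.

```python
from math import sqrt

Z_CRIT = 1.96  # rounded table value for 95% confidence
p = 0.57       # sample proportion reported by the poll

def proportion_ci(p, n, z=Z_CRIT):
    """Return (lower, upper) for p +/- z * sqrt(p(1 - p) / n)."""
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

intervals = {n: proportion_ci(p, n) for n in (100, 200, 300)}
widths = {n: hi - lo for n, (lo, hi) in intervals.items()}

for n, (lo, hi) in intervals.items():
    print(f"n = {n}: {lo:.4f} to {hi:.4f} (width {widths[n]:.4f})")
```

The width shrinks in proportion to 1/√n, so tripling the sample from 100 to 300 narrows the interval by a factor of about 1.73, not 3.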

Determining sample size

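 Point estimate x-bar: $\bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

 e = sampling error
  For the mean: $e = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, so $n = \frac{z_{\alpha/2}^2\,\sigma^2}{e^2}$
  For the proportion: $e = z_{\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}}$, so $n = \frac{z_{\alpha/2}^2\,\pi(1-\pi)}{e^2}$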
Problem 9.39 (p. 382)
An advertising agency that serves a major radio station
  wants to estimate the mean amount of time that the
  station’s audience spends listening to the radio daily.
  From past studies, the standard deviation is estimated as
  45 minutes.

a. What sample size is needed if the agency wants to be
   90% confident of being correct to within ±5 minutes?

b. If 99% confidence is desired, how many listeners need
   to be selected?
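
Both parts apply n = z²σ²/e² and round up to the next whole listener. The sketch below assumes Python's standard-library `statistics.NormalDist` for the critical values; `sample_size_for_mean` is an illustrative name, not something from the textbook.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, e, confidence):
    """Smallest n with z_(alpha/2) * sigma / sqrt(n) <= e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / e) ** 2)  # always round up

n_90 = sample_size_for_mean(sigma=45, e=5, confidence=0.90)
n_99 = sample_size_for_mean(sigma=45, e=5, confidence=0.99)
print(f"a. 90% confidence: n = {n_90}")
print(f"b. 99% confidence: n = {n_99}")
```

Using the rounded table value z = 2.58 instead of the more precise 2.576 in part b gives (2.58 × 45 / 5)² ≈ 539.2, which rounds up to 540, so a hand calculation may differ slightly from the code.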




   Problem 9.43 (p. 383)
A study of 658 CEOs conducted by the Conference Board
  reported that 250 stated that their company’s greatest
  concern was sustained and steady top-line growth.

a. Construct a 95% confidence interval for the proportion of
   CEOs whose greatest concern was sustained and steady top-
   line growth.

b. Interpret the interval constructed in a.

c. To conduct a follow-up study to estimate the population
   proportion of CEOs whose greatest concern was sustained
   and steady top-line growth to within ±.01 with 95%
   confidence, how many CEOs would you survey?
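
Parts a and c can be sketched as follows, using the rounded 95% value z = 1.96 and taking the sample proportion from this study as the planning value for π in part c (an assumption; a more conservative choice would be π = .5).

```python
from math import ceil, sqrt

Z_CRIT = 1.96     # rounded table value for 95% confidence
x, n = 250, 658   # CEOs naming top-line growth, and CEOs surveyed
p = x / n         # sample proportion, about .38

# a. 95% confidence interval for the population proportion
margin = Z_CRIT * sqrt(p * (1 - p) / n)
lower, upper = p - margin, p + margin

# c. sample size for a sampling error of +/- .01, using p as the estimate of pi
e = 0.01
n_needed = ceil(Z_CRIT ** 2 * p * (1 - p) / e ** 2)

print(f"a. {lower:.4f} to {upper:.4f}")
print(f"c. survey about {n_needed} CEOs")
```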


Review problems
 9.52, 9.54, 9.58



Next week
 Chapter 10



