# Confidence Interval Estimation - PowerPoint


```
Confidence Interval Estimation
CQMS202 Sections 410 & 420

September 23, 2010

1
Outline
- Housekeeping items
- Review
- Confidence interval estimation for the mean (σ known)
- Confidence interval estimation for the mean (σ unknown)
- Confidence interval estimation for the proportion
- Determining sample size
- Confidence interval estimation and ethical issues

2
Housekeeping
- No enrollment
- Quiz 1 – October 7 – 45-60 mins
- Receive a 75% refund if you drop before October 7

3
Review
- Properties of the normal distribution
- Transforming an observed value to a standardized score

4
Problem 6.12 (p. 286, using calculator)
Given a standardized normal distribution (with a mean of
0 and a standard deviation of 1), what is the probability
that:

a. Z is less than 1.57?

b. Z is greater than 1.84?

c. Z is between 1.57 and 1.84?

d. Z is less than 1.57 or greater than 1.84?

5
Sampling and sampling distribution
- A sample is used to estimate population characteristics
- Efficient, economical, more practical (but involves error)

6
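Population vs. sample
Population (size N):
$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$
$\sigma = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}}$
$Z = \frac{X - \mu}{\sigma}$

Sample (size n):
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
$s = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}}$
$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$, where $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$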
7
Population vs. Sample
Population Parameters                  Sample Statistics
- µ (mu): mean                         - X-bar (x-bar): mean
                                         (M is used in other places)
- σ (sigma): standard deviation        - s: standard deviation
- σ² (sigma-square): variance          - s² (s-square): variance
- σx-bar (sigma-x-bar):                - sx-bar (s-x-bar):
  standard error of the mean             sample standard error

8
Problem 7.72
The fill amount of bottles of a soft drink is normally
distributed, with a mean of 2.0 liters and a standard
deviation of 0.05 liter. If you select a random sample of
25 bottles, what is the probability that the sample mean
will be
a. Between 1.99 and 2.0 liters?
b. Below 1.98 liters?
c. Greater than 2.01 liters?
d. The probability is 99% that the sample mean amount of
soft drink will be at least how much?
e. The probability is 99% that the sample mean amount of
soft drink will be between which two values
(symmetrically distributed around the mean)?

9
Confidence interval estimation
for the mean (σ known)
- Deductive reasoning: population → sample
- Inductive reasoning: sample → population
- Estimation
  - Point estimate: value of a single sample statistic (e.g., mean)
  - Confidence interval estimate: a range of numbers constructed
    around the point estimate

10
Estimating μ
- μ is rarely known, but we can estimate it from sample
  means. There are some exceptions: for IQ, μ is estimated at
  around 100 with σ around 15.
- Standardization is achieved through a large amount of
  testing in order to provide a more “accurate”
  estimate of the population μ.

$z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$
$\bar{X}_U = \mu + (z)(\sigma_{\bar{X}}) = \mu + (z)(\sigma / \sqrt{n})$
$\bar{X}_L = \mu - (z)(\sigma_{\bar{X}}) = \mu - (z)(\sigma / \sqrt{n})$
$\mu = \bar{X} \pm (z)(\sigma / \sqrt{n})$
11
Confidence intervals in a graph
[Figure: confidence intervals from Sample 1, Sample 2, Sample 3, ..., Sample 100 plotted against the population mean μ = 2]
12
Why z = ±1.96?
With alpha (α) = .05, two-tailed, z-crit = ±1.96

[Figure: standard normal curve with area .95 between -z-crit and +z-crit and area .025 in each tail]

13
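Standard error, standard deviation
& sample size

- Sampling error: as n increases, σ-xbar decreases
  $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
- Level of confidence: (1-α) x 100%

$\bar{X} - (z_{\alpha/2})(\frac{\sigma}{\sqrt{n}}) \le \mu \le \bar{X} + (z_{\alpha/2})(\frac{\sigma}{\sqrt{n}})$

$z_{\alpha/2} = 1.96$, α = .05 (95% confidence)
$z_{\alpha/2} = 2.58$, α = .01 (99% confidence)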
14
Caveats
- Assumption of normality
- Large sample size
- How to check? Data screening, especially data
  visualization tools such as stem-and-leaf plots and
  boxplots
- σ is not always known…

15
Problem 9.2 (p. 365)
If X-bar = 125, σ = 24 and n = 36, construct a 99%
confidence interval estimate of the population mean, μ.

16
Confidence interval estimation
for the mean (σ unknown)
- t distribution
  - William Gosset, 1908
  - Guinness Brewing Company
  - Used the pseudonym Student because the company
    would not allow him to publish under his name
  - Developed the t distribution, which relies on s² instead of σ²
  - Sometimes called Student’s t distribution

17
Introduction to t statistics
- Working with what’s available
- z tests often require information about the population that is
  not available
- The population mean (μ) can be inferred from the sample mean (M),
  but the population variance (hence standard deviation) is hard
  to determine. Without σ, we cannot obtain the standard error (σM)
  for the z formula.
- But t accommodates this limitation.

18
The t-distribution
[Figure: t distribution curves for df = a, df = a + c, and df = a + 2c]

19
Comparing t to z distributions
[Figure: t distribution curves for df = a, df = a + c, and df = a + 2c overlaid on the z curve; a = n – 1]

http://www.econtools.com/jevons/java/Graphics2D/tDist.html

20
Why does the t distribution fluctuate?
- Not enough info on the population → can’t determine σ
- But we have the sample standard deviation: s
- Calculating σ: $\sigma = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}}$
- Calculating s: $s = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}}$
21
Degrees of Freedom
- Sample statistics are estimates of population parameters.
- n is a biased estimate of N → the larger the n, the closer it is
  to N
- To obtain an unbiased estimator, some adjustment is
  necessary.
- Degrees of freedom (df) indicates the number of observations in a
  sample minus the number of estimated parameters.
- For calculating s, the sample mean is used in place of μ, so one
  parameter is estimated. Hence, df = n – 1

22
Standard Error for one sample
- We deal with a sample more often than with an
  individual.
- To estimate a population from a sample, we need sM

$s_M = \sqrt{\frac{s^2}{n}} = \frac{s}{\sqrt{n}}$

- We’ll come back to the s.e. again…

23
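Confidence interval estimation
for the mean (σ unknown)

σ known:   $\bar{X} - (z_{\alpha/2})(\frac{\sigma}{\sqrt{n}}) \le \mu \le \bar{X} + (z_{\alpha/2})(\frac{\sigma}{\sqrt{n}})$
σ unknown: $\bar{X} - (t_{\alpha/2})(\frac{s}{\sqrt{n}}) \le \mu \le \bar{X} + (t_{\alpha/2})(\frac{s}{\sqrt{n}})$
df for t = n - 1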


24
Problem 9.17 (p.373)
The data below represent the total fat, in grams per serving, for
a sample of 20 chicken sandwiches from fast-food chains.
7, 8, 4, 5, 16, 20, 20, 24, 19, 30, 23, 30, 25, 19, 29, 29, 30, 30,
40, 56
a. Construct a 95% confidence interval for the population mean
total fat, in grams per serving.
b. Interpret the interval constructed in a.
c. What assumption must you make about the population
distribution in order to construct the confidence interval
estimate in a?
d. Do you think that the assumption needed in order to
construct the confidence interval estimate in a is valid?
Explain.
25
Confidence interval estimation
for the proportion
- Population proportion: π
- Point estimate for π is the sample proportion: p = X/n

$p - (z_{\alpha/2})(\sqrt{\frac{p(1 - p)}{n}}) \le \pi \le p + (z_{\alpha/2})(\sqrt{\frac{p(1 - p)}{n}})$

$z_{\alpha/2} = 1.96$, α = .05 (95% confidence)
$z_{\alpha/2} = 2.58$, α = .01 (99% confidence)
Condition: with p = X/n, both X and (n - X) should be greater than 5
26
Problem 9.27 (p. 377)
The start of the twenty-first century saw many corporate scandals
and many individuals lost faith in business. In a 2007 poll
conducted by the NYC-based Edelman Public Relations firm, 57%
of respondents say they trust business to “do what is right”. This
percentage was the highest in the annual survey since 2001.
a. Construct a 95% confidence interval estimate of the population
proportion of individuals who trust business to “do what is
right” assuming that the poll surveyed:
1. 100 individuals
2. 200 individuals
3. 300 individuals

b. Discuss the effect that sample size has on the width of
confidence intervals.
c. Interpret the intervals in a.

27
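Determining sample size

$\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$, where x-bar is the point estimate and e = sampling error

For the mean:       $e = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$, so $n = \frac{z_{\alpha/2}^2 \sigma^2}{e^2}$
For the proportion: $e = z_{\alpha/2} \sqrt{\frac{\pi(1 - \pi)}{n}}$, so $n = \frac{z_{\alpha/2}^2 \pi(1 - \pi)}{e^2}$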
28
Problem 9.39 (p. 382)
wants to estimate the mean amount of time that the
station’s audience spends listening to the radio daily.
From past studies, the standard deviation is estimated as
45 minutes.

a. What sample size is needed if the agency wants to be
90% confident of being correct to within ±5 minutes?

b. If 99% confidence is desired, how many listeners need
to be selected?

29
Problem 9.43 (p. 383)
A study of 658 CEOs conducted by the Conference Board
reported that 250 stated that their company’s greatest
concern was sustained and steady top-line growth.

a. Construct a 95% confidence interval for the proportion of
CEOs whose greatest concern was sustained and steady top-
line growth.

b. Interpret the interval constructed in a.

c. To conduct a follow-up study to estimate the population
proportion of CEOs whose greatest concern was sustained
and steady top-line growth to within ±.01 with 95%
confidence, how many CEOs would you survey?

30
Review problems
- 9.52, 9.54, 9.58

Next week
- Chapter 10

31

```
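The interval and sample-size formulas on the slides can be checked numerically. The following is a minimal sketch in Python using only the standard library; the function names are illustrative and not part of the course material. It computes z critical values exactly with `statistics.NormalDist`, so results may differ slightly from answers based on the rounded table values 1.96 and 2.58. The standard library has no t distribution, so the t critical value is passed in by hand; the value 2.0930 for df = 19 at 95% confidence is an assumption taken from a standard t table.

```python
import math
from statistics import NormalDist, mean, stdev


def z_critical(confidence):
    """Two-tailed critical value z_{alpha/2} for a level such as 0.95."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def z_interval_mean(x_bar, sigma, n, confidence):
    """Interval for the mean when sigma is known: x_bar +/- z * sigma / sqrt(n)."""
    margin = z_critical(confidence) * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


def t_interval_mean(data, t_crit):
    """Interval for the mean when sigma is unknown: x_bar +/- t * s / sqrt(n).

    t_crit must be looked up for df = n - 1 (assumed to come from a t table).
    """
    n = len(data)
    x_bar = mean(data)
    margin = t_crit * stdev(data) / math.sqrt(n)  # stdev divides by n - 1
    return x_bar - margin, x_bar + margin


def proportion_interval(x, n, confidence):
    """Interval for the proportion: p +/- z * sqrt(p(1 - p) / n), with p = x / n."""
    p = x / n
    margin = z_critical(confidence) * math.sqrt(p * (1.0 - p) / n)
    return p - margin, p + margin


def sample_size_mean(sigma, e, confidence):
    """Smallest n with sampling error at most e: n = z^2 * sigma^2 / e^2, rounded up."""
    return math.ceil((z_critical(confidence) * sigma / e) ** 2)


def sample_size_proportion(pi, e, confidence):
    """Smallest n with sampling error at most e: n = z^2 * pi(1 - pi) / e^2, rounded up."""
    return math.ceil(z_critical(confidence) ** 2 * pi * (1.0 - pi) / e ** 2)


if __name__ == "__main__":
    # Problem 9.2: X-bar = 125, sigma = 24, n = 36, 99% confidence
    print(z_interval_mean(125, 24, 36, 0.99))

    # Problem 9.17: total fat for 20 chicken sandwiches, 95% confidence, df = 19
    fat = [7, 8, 4, 5, 16, 20, 20, 24, 19, 30, 23, 30, 25, 19, 29, 29, 30, 30, 40, 56]
    print(t_interval_mean(fat, 2.0930))

    # Problem 9.43(a): 250 of 658 CEOs, 95% confidence
    print(proportion_interval(250, 658, 0.95))

    # Problem 9.39(a): sigma = 45 minutes, within +/- 5 minutes, 90% confidence
    print(sample_size_mean(45, 5, 0.90))
```

Under these assumptions, Problem 9.2 comes out at roughly 114.70 to 135.30 and Problem 9.17 at roughly 17.40 to 29.00 grams. Sample sizes are rounded up because a fractional observation cannot be collected and rounding down would exceed the allowed sampling error.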