Estimation

Shared by: HC11112418320. Posted: 11/24/2011. Language: English. Pages: 45.
Some Concepts

*Estimators: Random variables used to estimate population parameters.

Example: $\hat{p}$ is an estimator of $p$.

*Estimates: Specific values of an estimator.

*Point Estimates: Specify a single value as the estimate of a population parameter.

Example: $\bar{x} = 120$

*Confidence Interval/Interval Estimates: Two numbers calculated from the sample that form an interval within which the parameter is expected to lie.





Example: 115=30

Example: a sample with $n = 50$, $\bar{x} = 8.1$ and $s = 0.24$

1) Point estimate of the population mean

2) Interval estimate of the population mean with a 95% confidence level

3) Interval estimate of the population mean with a 90% confidence level
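A minimal sketch of how the three parts of this example might be worked, assuming the usual large-sample interval $\bar{x} \pm z_{\alpha/2}\, s/\sqrt{n}$ with $s$ standing in for $\sigma$. The variable and function names are illustrative and not from the slides.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s = 50, 8.1, 0.24      # sample size, sample mean, sample standard deviation

point_estimate = x_bar            # 1) the sample mean estimates the population mean
std_error = s / sqrt(n)           # s replaces sigma because n is large


def two_sided_ci(confidence):
    """Return (lower, upper) for x_bar +/- z_{alpha/2} * standard error."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z with area alpha/2 to its right
    margin = z * std_error
    return x_bar - margin, x_bar + margin


ci_95 = two_sided_ci(0.95)        # 2)
ci_90 = two_sided_ci(0.90)        # 3)

print(f"point estimate: {point_estimate}")
print(f"95% interval: ({ci_95[0]:.4f}, {ci_95[1]:.4f})")
print(f"90% interval: ({ci_90[0]:.4f}, {ci_90[1]:.4f})")
```

The 90% interval is narrower than the 95% interval because a smaller $z_{\alpha/2}$ multiplies the same standard error.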

The interval estimates so far are two-sided. Sometimes we need a one-sided interval estimate:

*Lower one-sided confidence interval (LCL)

LCL = (point estimate) - $z_\alpha$ × (standard error of estimator)

*Upper one-sided confidence interval (UCL)

UCL = (point estimate) + $z_\alpha$ × (standard error of estimator)

where, for the sample mean, the standard error of the estimator is $\sigma/\sqrt{n}$.

Example: a sample with $n = 50$, $\bar{x} = 8.1$ and $s = 0.24$

4) LCL estimate of the population mean

5) UCL estimate of the population mean
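The one-sided limits for this example could be computed along the following lines. The example does not state a confidence level, so the 95% used here is an assumption, as are the variable names.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s = 50, 8.1, 0.24
confidence = 0.95                 # assumed: the example does not give a level

alpha = 1 - confidence
z_alpha = NormalDist().inv_cdf(1 - alpha)   # all of alpha sits in one tail
std_error = s / sqrt(n)                      # s replaces sigma because n is large

lcl = x_bar - z_alpha * std_error   # 4) lower one-sided limit
ucl = x_bar + z_alpha * std_error   # 5) upper one-sided limit

print(f"LCL = {lcl:.4f}")
print(f"UCL = {ucl:.4f}")
```

Note that a one-sided limit uses $z_\alpha$ rather than $z_{\alpha/2}$, so at the same confidence level it sits closer to the point estimate than the corresponding end of a two-sided interval.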

When the sample is large (n is large), we can use the sample standard deviation $s$ as a reliable estimator of the population standard deviation $\sigma$.

And no matter what distribution the population has, the sampling distribution of the sample mean is approximately normal (Central Limit Theorem).

But what if the sample is small?

Consequences of small sample size:

1) If the population is not normally distributed, the sampling distribution of the mean of a small sample is not normal.

2) $s/\sqrt{n}$ is much more variable than $\sigma/\sqrt{n}$, so the standardized sample mean computed with $s$ has thicker tails than the normal distribution.

What should we do to deal with those problems?

Assumptions: 1) the population is normally distributed

2) the sample is small and $\sigma$ is unknown

If we use the sample standard deviation $s$ to replace $\sigma$, the statistic $t$ follows a mound-shaped distribution called the t-distribution:

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

where $s$ is the sample standard deviation

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

Properties of the t-distribution

1) The t-distribution is mound-shaped

2) The t-distribution is perfectly symmetric about t=0

3) The t-distribution is more variable than z (the standard normal distribution is also called the z-distribution)

4) The t-distribution is affected by the sample size n

Because the variability of $s$ depends on n, the t-distribution depends on the sample size. Therefore, different sample sizes have different t-distributions, even if the samples are picked from the same population.

We call the number n-1 the number of degrees of freedom (d.f.) associated with $s^2$ and thus with the t-statistic.

Example: when n=10, d.f.=10-1=9

So, before we check the t-distribution table, we first need to determine the degrees of freedom.

How to check the t-distribution table

• $t_\alpha$ is the value of t such that an area $\alpha$ lies to its right. (Appendix II, Table 4, p. 611)

• Determine the degrees of freedom

• Determine the value of $\alpha$ (usually $\alpha$ is given in the question)

• Find $t_\alpha$

Problems:

1) n=10, find t0.05 and t0.025

2) n30, find t0.05 and t0.025



3) Repeat 1 with n=9



Do you remember that when n>30 we may treat the sample as large? After doing these problems, do you see why we set 30 as the dividing line?
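For readers without the table at hand, the values of $t_\alpha$ can be approximated numerically. The sketch below is one possible approach using only the Python standard library: it integrates the t density with Simpson's rule and finds $t_\alpha$ by bisection. The function names, the search range [0, 50] and the sample sizes n = 30 and n = 121 are illustrative choices, not part of the slides.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + t * t / df))


def t_upper_tail(t, df, steps=1000):
    """Area to the right of t >= 0, via Simpson's rule on [0, t] (steps is even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3    # by symmetry, half the area lies right of 0


def t_alpha(alpha, df):
    """Value of t with area alpha to its right, found by bisection on [0, 50]."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid              # still too much area on the right: move right
        else:
            hi = mid
    return (lo + hi) / 2


for n in (9, 10, 30, 121):
    df = n - 1
    print(f"n={n:3d} d.f.={df:3d} "
          f"t_0.05={t_alpha(0.05, df):.3f} t_0.025={t_alpha(0.025, df):.3f}")

z = NormalDist()
print(f"normal:         z_0.05={z.inv_cdf(0.95):.3f} z_0.025={z.inv_cdf(0.975):.3f}")
```

As the degrees of freedom grow, the printed t values move toward the normal values, which is the intuition behind treating samples of about 30 or more as large.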

Remember that the t-distribution is based on the assumption that the sampled population possesses a normal probability distribution.

This is a very restrictive assumption.

Fortunately, it can be shown that the distribution of the t statistic possesses nearly the same shape as the theoretical t-distribution for populations that are nonnormal but possess a mound-shaped probability distribution. So even if the population is not normal but is mound-shaped, we can use the t-distribution.

In our class, except when specifically stated otherwise, we will suppose that using the t-distribution is proper.

The small-sample $(1-\alpha)100\%$ confidence interval estimator of the mean is

$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

where $s$ is the sample standard deviation and $t$ has (n-1) d.f.

$s/\sqrt{n}$ is the estimated standard deviation of $\bar{x}$.

Problem: here is a sample of diamond weights:

{0.46, 0.61, 0.52, 0.48, 0.57, 0.54}

Find the 95% confidence interval estimate for the population mean.
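One way this problem might be worked is sketched below. The critical value $t_{0.025} = 2.571$ for 5 d.f. is hard-coded from a standard t table rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

weights = [0.46, 0.61, 0.52, 0.48, 0.57, 0.54]

n = len(weights)
x_bar = mean(weights)
s = stdev(weights)            # sample standard deviation (divides by n - 1)

T_025_DF5 = 2.571             # t_{0.025} with n - 1 = 5 d.f., from a t table
margin = T_025_DF5 * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"x_bar = {x_bar:.3f}, s = {s:.4f}")
print(f"95% interval: ({lower:.4f}, {upper:.4f})")
```

With only six observations, the t multiplier 2.571 is noticeably larger than the normal value 1.96, so the interval is wider than a large-sample formula would suggest.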

Estimating the difference between two means

Properties of the sampling distribution of $(\bar{x}_1 - \bar{x}_2)$, the difference between two sample means:

When independent random samples of $n_1$ and $n_2$ observations have been selected from populations with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$ respectively, the sampling distribution of the difference will have the following properties:

1) The mean and standard deviation of $(\bar{x}_1 - \bar{x}_2)$ will be

$$\mu_{(\bar{x}_1 - \bar{x}_2)} = \mu_1 - \mu_2$$

and

$$\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

2) If the sampled populations are normally distributed, then the sampling distribution of $(\bar{x}_1 - \bar{x}_2)$ is exactly normally distributed, regardless of the sample size.

3) If the sampled populations are not normally distributed, then the sampling distribution of $(\bar{x}_1 - \bar{x}_2)$ is approximately normally distributed when $n_1$ and $n_2$ are large, due to the Central Limit Theorem.

Point Estimation of $(\mu_1 - \mu_2)$: Large Sample

$$\widehat{\mu_1 - \mu_2} = \bar{x}_1 - \bar{x}_2$$

A $(1-\alpha)100\%$ confidence interval for $(\mu_1 - \mu_2)$:

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

If the population variances are unknown, they can be approximated by the sample variances.

Example: $n_1 = 30$, $n_2 = 30$; the sample means are 1.32 and 1.04 respectively, and the sample variances are 0.9734 and 0.7291 respectively.

1) Point estimate of the difference of means

2) 90% confidence interval estimate of the difference of means
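A short sketch of this example, assuming the sample variances are substituted for the unknown population variances as described above. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n1, n2 = 30, 30
x_bar1, x_bar2 = 1.32, 1.04
var1, var2 = 0.9734, 0.7291       # sample variances used in place of sigma^2

diff = x_bar1 - x_bar2            # 1) point estimate of mu_1 - mu_2
std_error = sqrt(var1 / n1 + var2 / n2)

alpha = 1 - 0.90
z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2} for a 90% interval
margin = z * std_error
lower, upper = diff - margin, diff + margin   # 2)

print(f"point estimate: {diff:.2f}")
print(f"90% interval: ({lower:.4f}, {upper:.4f})")
```

The resulting interval contains 0, so at this confidence level the data do not rule out equal population means.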

In small samples

Assumption 1: both samples are picked from populations with normal distributions.

Assumption 2: both populations possess equal variances.

Then the t statistic follows a t-distribution:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

where $s$ is the sample standard deviation.

Degrees of freedom = $n_1 + n_2 - 2$

Both assumptions are very restrictive.

Assumption 1: As before, even if the populations are non-normal but have mound-shaped distributions, the distribution of the t statistic will be very close to the theoretical t-distribution.

Assumption 2: Usually we do not know the population variances. In our class, we will simply suppose they are equal when we have such questions.

As to the question of which $s$ to use, we pool the information from the two samples and get the so-called “pooled estimator of population variance”.

$s^2$ is called the pooled estimator of the population variance:

$$s^2 = \frac{\sum_{i=1}^{n_1} (x_{1i} - \bar{x}_1)^2 + \sum_{i=1}^{n_2} (x_{2i} - \bar{x}_2)^2}{n_1 + n_2 - 2}$$

or

$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$$

Estimating the Difference Between Two Means: Small Sample

If the two samples are small (<30)

Point estimate of the difference:

$$\widehat{\mu_1 - \mu_2} = \bar{x}_1 - \bar{x}_2$$

A $(1-\alpha)100\%$ small-sample confidence interval for $(\mu_1 - \mu_2)$:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

where $s$ is obtained from the pooled estimate given before.

Example: here are two samples

{32, 37, 35, 28, 41, 44, 35, 31, 34}

{35, 31, 29, 25, 34, 40, 27, 32, 31}

What is the 95% confidence interval estimate of the difference between the two population means?
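The pooled-variance interval for these two samples might be computed as sketched below, under the equal-variance assumption stated earlier. The critical value $t_{0.025} = 2.120$ for 16 d.f. is taken from a standard t table, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

sample_1 = [32, 37, 35, 28, 41, 44, 35, 31, 34]
sample_2 = [35, 31, 29, 25, 34, 40, 27, 32, 31]
n1, n2 = len(sample_1), len(sample_2)

# Pooled estimator of the common variance; variance() divides by n - 1.
pooled_var = ((n1 - 1) * variance(sample_1)
              + (n2 - 1) * variance(sample_2)) / (n1 + n2 - 2)
s = sqrt(pooled_var)

T_025_DF16 = 2.120                # t_{0.025} with n1 + n2 - 2 = 16 d.f., from a t table
diff = mean(sample_1) - mean(sample_2)
margin = T_025_DF16 * s * sqrt(1 / n1 + 1 / n2)
lower, upper = diff - margin, diff + margin

print(f"pooled s^2 = {pooled_var:.4f}")
print(f"difference of sample means = {diff:.4f}")
print(f"95% interval: ({lower:.3f}, {upper:.3f})")
```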

Properties of the sampling distribution of the sample proportion

1. If a random sample of n observations is selected from a binomial population with parameter p, the sampling distribution of the sample proportion $\hat{p} = x/n$ will have a mean

$$\mu_{\hat{p}} = p$$

and a standard deviation

$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$$

where q=1-p

2. When the sample size n is large, the sampling distribution of the sample proportion will be approximately normal. Remember the rule: np and nq are both greater than or equal to 5.

Estimating a Binomial proportion

Point estimator:

The sample proportion is an unbiased estimator of the population proportion, so

*Point estimator of p: $\hat{p} = x/n$

*The interval estimator depends on the sampling distribution of the sample proportion.

So, if

1) we use the sample proportion $\hat{p}$ as a reliable estimate of the population proportion p, and

2) $n\hat{p}$ and $n(1-\hat{p})$ are both greater than 5, then

a $(1-\alpha)100\%$ confidence interval for p is

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

Example: n=100, x=59

1) What is the point estimate of p?

2) What is the 95% confidence interval estimate for p?
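A brief sketch of this example, including the check that $n\hat{p}$ and $n(1-\hat{p})$ are large enough for the normal approximation. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x = 100, 59

p_hat = x / n                     # 1) point estimate of p
q_hat = 1 - p_hat

# Rule of thumb from the slides: both counts should exceed 5.
normal_ok = n * p_hat > 5 and n * q_hat > 5

std_error = sqrt(p_hat * q_hat / n)
z = NormalDist().inv_cdf(0.975)   # z_{alpha/2} for a 95% interval
margin = z * std_error
lower, upper = p_hat - margin, p_hat + margin   # 2)

print(f"p_hat = {p_hat:.2f}, normal approximation ok: {normal_ok}")
print(f"95% interval: ({lower:.4f}, {upper:.4f})")
```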

Estimating the Difference Between Two Binomial Proportions

• Point estimate

$$\widehat{(p_1 - p_2)} = \hat{p}_1 - \hat{p}_2$$

• Confidence interval for the difference

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

Example

• A recent survey of girls between 9 and 14 years old showed that 44% of the white girls surveyed and 28% of the African-American girls reported that they had tried to lose weight. Assume 100 girls of each ethnicity were surveyed.

– What is the estimated difference in population proportions?

– Construct a 99% confidence interval for the difference in proportions.
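The two questions in this example could be answered along these lines, using the confidence interval for the difference of two proportions. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n1, n2 = 100, 100                 # girls surveyed in each group
p1_hat, p2_hat = 0.44, 0.28       # sample proportions who tried to lose weight

diff = p1_hat - p2_hat            # estimated difference in population proportions
std_error = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

alpha = 1 - 0.99
z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2} for a 99% interval
margin = z * std_error
lower, upper = diff - margin, diff + margin

print(f"estimated difference: {diff:.2f}")
print(f"99% interval: ({lower:.4f}, {upper:.4f})")
```

At the 99% level the lower end falls slightly below 0, so the interval does not quite exclude equal proportions.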

Choosing sample size

Questions to ask before determining the optimal n:

1) What level of confidence do you want to have (i.e., the value of $100(1-\alpha)\%$)?

2) What is the maximum difference (D) you want to permit between the estimate of the population parameter and the true population parameter?

Procedure

• Once you have chosen D and $\alpha$, solve the following equation for the sample size n:

$$z_{\alpha/2} \times (\text{standard error of the estimator}) = D$$

This can be used for any of the estimators that we have talked about thus far; you just need the proper standard error.

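estimation of the mean

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

estimation of difference of means

$$\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

estimation of a proportion

$$\sigma_{\hat{p}} = \sqrt{\frac{\hat{p}\hat{q}}{n}}$$

estimation of difference of proportions

$$\sigma_{(\hat{p}_1 - \hat{p}_2)} = \sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$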

Additional info

• When you are using two different samples, calculate just one n and use it for both

• Choosing the proper standard deviation and proportion can be hard; use any information you may have from past experiments

• When using proportions, choosing a p of .5 will give you the largest possible n value; you may want to use it to be safe
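The last point can be illustrated numerically. Solving $z_{\alpha/2}\sqrt{pq/n} = D$ for n gives $n = z_{\alpha/2}^2\, pq / D^2$. The sketch below evaluates this for several trial values of p; the bound D = 0.05, the 95% level and the list of p values are hypothetical choices made only for illustration.

```python
from math import ceil
from statistics import NormalDist

D = 0.05                          # hypothetical maximum error, not from the slides
z = NormalDist().inv_cdf(0.975)   # z_{alpha/2} for 95% confidence (assumed level)


def sample_size_for_proportion(p):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= D."""
    return ceil(z ** 2 * p * (1 - p) / D ** 2)


sizes = {p: sample_size_for_proportion(p) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}
for p, n in sizes.items():
    print(f"p = {p:.1f} -> n = {n}")
```

Because $p(1-p)$ peaks at p = 0.5, planning with p = 0.5 yields a sample size that is sufficient whatever the true proportion turns out to be.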

Additional info continued

• If you are concerned with the n to choose when using means and you have some idea of what the range (R) is, you can estimate the standard deviation as

$$\sigma \approx \frac{R}{4}$$

Example: we know the population standard deviation is 0.24 and we want the error of estimation to be less than 0.06 with 95% confidence.

At least how many observations do we need in the sample?
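A minimal sketch of this calculation: solving $z_{\alpha/2}\,\sigma/\sqrt{n} = D$ for n gives $n = (z_{\alpha/2}\,\sigma / D)^2$, rounded up to the next whole observation. Variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

sigma = 0.24                      # known population standard deviation
D = 0.06                          # maximum error of estimation
z = NormalDist().inv_cdf(0.975)   # z_{alpha/2} for 95% confidence

n_exact = (z * sigma / D) ** 2    # solution of z * sigma / sqrt(n) = D
n_required = ceil(n_exact)        # round up so the error bound is met

achieved_error = z * sigma / sqrt(n_required)
print(f"n must be at least {n_required} (exact solution {n_exact:.2f})")
print(f"error bound with that n: {achieved_error:.4f}")
```

Rounding up rather than to the nearest integer matters here: rounding down would leave the error bound slightly above 0.06.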

