Chapter 7. Statistical Estimation and Sampling Distributions

7.1 Point Estimates
7.2 Properties of Point Estimates
7.3 Sampling Distributions
7.4 Constructing Parameter Estimates
7.5 Supplementary Problems

NIPRL

7.1 Point Estimates

7.1.1 Parameters
• Parameters
  – In statistical inference, the term parameter denotes a quantity, say $\theta$, that is a property of an unknown probability distribution.
  – Examples: the mean, the variance, or a particular quantile of the probability distribution.
  – Parameters are unknown, and one of the goals of statistical inference is to estimate them.

Figure 7.1 The relationship between a point estimate $\hat\theta$ and an unknown parameter $\theta$
Figure 7.2 Estimation of the population mean by the sample mean

7.1.2 Statistics
• Statistics
  – In statistical inference, the term statistic denotes a quantity that is a property of a sample.
  – Statistics are functions of a random sample: for example, the sample mean, the sample variance, or a particular sample quantile.
  – Statistics are random variables whose observed values can be calculated from a set of observed data.
• Examples:
  sample mean: $\bar X = \dfrac{X_1 + X_2 + \cdots + X_n}{n}$
  sample variance: $S^2 = \dfrac{\sum_{i=1}^{n}(X_i - \bar X)^2}{n-1}$

7.1.3 Estimation
• Estimation
  – A procedure of "guessing" properties of the population from which data are collected.
  – A point estimate $\hat\theta$ of an unknown parameter $\theta$ is a statistic that represents a "guess" at the value of $\theta$.
• Example 1 (Machine breakdowns)
  – How should we estimate P(machine breakdown due to operator misuse)?
• Example 2 (Rolling mill scrap)
  – How should we estimate the mean and variance of the probability distribution of % scrap?

7.2 Properties of Point Estimates

7.2.1 Unbiased Estimates (1/5)
• Definitions
  – A point estimate $\hat\theta$ for a parameter $\theta$ is said to be unbiased if $E(\hat\theta) = \theta$.
  – If a point estimate is not unbiased, its bias is defined to be $\mathrm{bias} = E(\hat\theta) - \theta$.
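The sample mean and sample variance defined above can be sketched directly. This is a minimal illustration, not from the slides; the simulated data set (normal with assumed $\mu = 10$, $\sigma = 2$) is hypothetical.

```python
import random

# Hypothetical data: a sample drawn from an assumed N(10, 4) population.
random.seed(0)
sample = [random.gauss(10.0, 2.0) for _ in range(50)]

def sample_mean(xs):
    """Point estimate of the population mean: x-bar = sum(x_i) / n."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Point estimate of the population variance: S^2 = sum((x_i - x-bar)^2) / (n - 1)."""
    n = len(xs)
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

print(sample_mean(sample))      # a "guess" at the unknown mean (here mu = 10)
print(sample_variance(sample))  # a "guess" at the unknown variance (here sigma^2 = 4)
```

Both statistics are functions of the random sample only, so their observed values can always be computed from the data even though $\mu$ and $\sigma^2$ stay unknown.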
7.2.1 Unbiased Estimates (2/5)
• Point estimate of a success probability
  If $X \sim B(n, p)$, then the sample proportion is $\hat p = X/n$.
  Since $X \sim B(n, p)$ implies $E(X) = np$,
  $E(\hat p) = E\!\left(\frac{X}{n}\right) = \frac{1}{n} E(X) = \frac{1}{n}\, np = p$,
  which means that $\hat p$ is an unbiased estimate of $p$.

7.2.1 Unbiased Estimates (3/5)
• Point estimate of a population mean
  If $X_1, X_2, \ldots, X_n$ are sample observations from a probability distribution with mean $\mu$, then the sample mean $\hat\mu = \bar X$ is an unbiased point estimate of the population mean $\mu$:
  since $E(X_i) = \mu$ for $1 \le i \le n$,
  $E(\hat\mu) = E(\bar X) = E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{n\mu}{n} = \mu$.

7.2.1 Unbiased Estimates (4/5)
• Point estimate of a population variance
  If $X_1, X_2, \ldots, X_n$ are sample observations from a probability distribution with variance $\sigma^2$, then the sample variance
  $S^2 = \frac{\sum_{i=1}^{n}(X_i - \bar X)^2}{n-1}$
  is an unbiased point estimate of the population variance $\sigma^2$.

7.2.1 Unbiased Estimates (5/5)
  $\sum_{i=1}^{n}(X_i - \bar X)^2 = \sum_{i=1}^{n}\big((X_i - \mu) - (\bar X - \mu)\big)^2$
  $= \sum_{i=1}^{n}(X_i - \mu)^2 - 2(\bar X - \mu)\sum_{i=1}^{n}(X_i - \mu) + n(\bar X - \mu)^2$
  $= \sum_{i=1}^{n}(X_i - \mu)^2 - n(\bar X - \mu)^2$
  Note that $E(X_i) = \mu$, $E[(X_i - \mu)^2] = \mathrm{Var}(X_i) = \sigma^2$, $E(\bar X) = \mu$, and $E[(\bar X - \mu)^2] = \mathrm{Var}(\bar X) = \sigma^2/n$. Therefore
  $E(S^2) = \frac{1}{n-1} E\!\left[\sum_{i=1}^{n}(X_i - \bar X)^2\right] = \frac{1}{n-1}\left(n\sigma^2 - n\cdot\frac{\sigma^2}{n}\right) = \frac{(n-1)\sigma^2}{n-1} = \sigma^2$.

7.2.2 Minimum Variance Estimates (1/4)
• Which is the better of two unbiased point estimates?
  [Figure: probability density functions of $\hat\theta_1$ and $\hat\theta_2$, both centered at $\theta$]

7.2.2 Minimum Variance Estimates (2/4)
  Since $\mathrm{Var}(\hat\theta_1) > \mathrm{Var}(\hat\theta_2)$, $\hat\theta_2$ is a better point estimate than $\hat\theta_1$. This can be written
  $P(|\hat\theta_1 - \theta| > \epsilon) \ge P(|\hat\theta_2 - \theta| > \epsilon)$ for any value of $\epsilon > 0$.

7.2.2 Minimum Variance Estimates (3/4)
• An unbiased point estimate whose variance is smaller than that of any other unbiased point estimate is called a minimum variance unbiased estimate (MVUE).
• Relative efficiency
  The relative efficiency of an unbiased point estimate $\hat\theta_1$ to an unbiased point estimate $\hat\theta_2$ is $\mathrm{Var}(\hat\theta_2)/\mathrm{Var}(\hat\theta_1)$.
• Mean squared error (MSE)
  – $\mathrm{MSE}(\hat\theta) = E[(\hat\theta - \theta)^2]$
  – How is it decomposed? Why is it useful?
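The $n-1$ divisor in $S^2$ can be checked empirically: averaging the estimator over many samples should recover $\sigma^2$, while dividing by $n$ instead leaves a bias of $-\sigma^2/n$. A minimal simulation sketch (not from the slides; sample size and $\sigma^2 = 4$ are assumed for illustration):

```python
import random

random.seed(1)

def var_est(xs, ddof):
    """Variance estimate with divisor n - ddof (ddof=1 gives the unbiased S^2)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - ddof)

# Draw many samples of size n = 5 from N(0, sigma^2 = 4) and average the estimates.
n, trials = 5, 20000
unbiased = biased = 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, 2.0) for _ in range(n)]
    unbiased += var_est(xs, 1)  # divide by n - 1
    biased += var_est(xs, 0)    # divide by n
print(unbiased / trials)  # close to sigma^2 = 4
print(biased / trials)    # close to (n-1)/n * sigma^2 = 3.2
```

The averaged $n-1$ estimator lands near $\sigma^2 = 4$, while the $n$ divisor systematically underestimates it, matching $E(S^2) = \sigma^2$ from the derivation above.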
7.2.2 Minimum Variance Estimates (4/4)
  [Figure: probability density functions of $\hat\theta_1$ and $\hat\theta_2$, with the bias of $\hat\theta_1$ and the bias of $\hat\theta_2$ marked]

• Example: two independent measurements
  $X_A \sim N(C, 2.97)$ and $X_B \sim N(C, 1.62)$
  Point estimates of the unknown $C$: $\hat C_A = X_A$ and $\hat C_B = X_B$.
  They are both unbiased estimates since $E[\hat C_A] = C$ and $E[\hat C_B] = C$.
  The relative efficiency of $\hat C_A$ to $\hat C_B$ is
  $\mathrm{Var}(\hat C_B)/\mathrm{Var}(\hat C_A) = 1.62/2.97 = 0.55$.
  Let us consider a new estimate $\hat C = p\hat C_A + (1-p)\hat C_B$. This estimate is unbiased since
  $E[\hat C] = pE[\hat C_A] + (1-p)E[\hat C_B] = C$.
  What is the optimal value of $p$ that gives $\hat C$ the smallest possible mean squared error (MSE)?

  The variance of $\hat C$ is
  $\mathrm{Var}(\hat C) = p^2\,\mathrm{Var}(\hat C_A) + (1-p)^2\,\mathrm{Var}(\hat C_B) = p^2\sigma_1^2 + (1-p)^2\sigma_2^2$.
  Differentiating with respect to $p$ yields
  $\frac{d}{dp}\mathrm{Var}(\hat C) = 2p\sigma_1^2 - 2(1-p)\sigma_2^2$.
  Setting this derivative to zero, the value of $p$ that minimizes $\mathrm{Var}(\hat C)$ is
  $p = \frac{1/\sigma_1^2}{1/\sigma_1^2 + 1/\sigma_2^2}$.
  Therefore, in this example,
  $p = \frac{1/2.97}{1/2.97 + 1/1.62} = 0.35$,
  and the variance of $\hat C$ is
  $\mathrm{Var}(\hat C) = \frac{1}{1/2.97 + 1/1.62} = 1.05$.

  The relative efficiency of $\hat C_B$ to $\hat C$ is
  $\frac{\mathrm{Var}(\hat C)}{\mathrm{Var}(\hat C_B)} = \frac{1.05}{1.62} = 0.65$.

  In general, given $n$ independent unbiased estimates $\hat\theta_i$, $i = 1, \ldots, n$, with variances $\sigma_i^2$, $i = 1, \ldots, n$, for a parameter $\theta$, we can form the unbiased estimator
  $\hat\theta = \frac{\sum_{i=1}^{n} \hat\theta_i/\sigma_i^2}{\sum_{i=1}^{n} 1/\sigma_i^2}$.
  The variance of this estimator is
  $\mathrm{Var}(\hat\theta) = \frac{1}{\sum_{i=1}^{n} 1/\sigma_i^2}$.

• Mean squared error (MSE)
  Consider a point estimate $\hat\theta$. The mean squared error is defined by
  $\mathrm{MSE}(\hat\theta) = E[(\hat\theta - \theta)^2]$.
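The general inverse-variance-weighted combination above is easy to sketch in code. This is a minimal illustration (not from the slides), reusing the example's variances 2.97 and 1.62:

```python
def combine(estimates, variances):
    """Inverse-variance weighted average of independent unbiased estimates.

    Returns the combined unbiased estimate and its variance
    1 / sum(1/sigma_i^2), which is minimal among weighted averages."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    est = sum(w * e for w, e in zip(weights, estimates)) / total
    return est, 1.0 / total

# The two-measurement example: Var(C_A) = 2.97, Var(C_B) = 1.62.
p = (1 / 2.97) / (1 / 2.97 + 1 / 1.62)
print(round(p, 2))        # 0.35, the optimal weight on C_A
_, var_c = combine([0.0, 0.0], [2.97, 1.62])
print(round(var_c, 2))    # 1.05, smaller than either 2.97 or 1.62
```

Note that the combined variance 1.05 is smaller than the variance of either measurement alone, which is why pooling independent unbiased estimates this way is worthwhile.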
• Moreover, notice that
  $\mathrm{MSE}(\hat\theta) = E[(\hat\theta - \theta)^2] = E[(\hat\theta - E[\hat\theta] + E[\hat\theta] - \theta)^2]$
  $= E\big[(\hat\theta - E[\hat\theta])^2 + 2(\hat\theta - E[\hat\theta])(E[\hat\theta] - \theta) + (E[\hat\theta] - \theta)^2\big]$
  $= E[(\hat\theta - E[\hat\theta])^2] + (E[\hat\theta] - \theta)^2$
  $= \mathrm{Var}(\hat\theta) + \mathrm{bias}^2$

7.3 Sampling Distributions

7.3.1 Sample Proportion (1/2)
• If $X \sim B(n, p)$, then the sample proportion $\hat p = X/n$ has the approximate distribution
  $\hat p \sim N\!\left(p, \frac{p(1-p)}{n}\right)$,
  since $E(\hat p) = p$ and
  $\mathrm{Var}(\hat p) = \mathrm{Var}\!\left(\frac{X}{n}\right) = \frac{\mathrm{Var}(X)}{n^2} = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}$.

7.3.1 Sample Proportion (2/2)
• Standard error of the sample proportion
  The standard error of the sample proportion is defined as
  $\mathrm{s.e.}(\hat p) = \sqrt{\frac{p(1-p)}{n}}$,
  but since $p$ is usually unknown, we replace $p$ by the observed value $\hat p = x/n$ to get
  $\mathrm{s.e.}(\hat p) = \sqrt{\frac{\hat p(1-\hat p)}{n}} = \frac{1}{n}\sqrt{\frac{x(n-x)}{n}}$.

7.3.2 Sample Mean (1/3)
• Distribution of the sample mean
  If $X_1, \ldots, X_n$ are observations from a population with mean $\mu$ and variance $\sigma^2$, then the Central Limit Theorem says that, approximately,
  $\bar X \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$.

7.3.2 Sample Mean (2/3)
• Standard error of the sample mean
  The standard error of the sample mean is defined as
  $\mathrm{s.e.}(\bar X) = \frac{\sigma}{\sqrt{n}}$,
  but since $\sigma$ is usually unknown, we may safely replace it, when $n$ is large, by the observed value $s$:
  $\mathrm{s.e.}(\bar X) = \frac{s}{\sqrt{n}}$.

7.3.2 Sample Mean (3/3)
• If $n = 20$, the probability that $\bar X$ lies within $\sigma/4$ of $\mu$ is
  $P\!\left(\mu - \frac{\sigma}{4} \le \bar X \le \mu + \frac{\sigma}{4}\right) = P\!\left(\mu - \frac{\sigma}{4} \le N\!\left(\mu, \frac{\sigma^2}{20}\right) \le \mu + \frac{\sigma}{4}\right)$
  $= P\!\left(-\frac{\sqrt{20}}{4} \le N(0,1) \le \frac{\sqrt{20}}{4}\right) = \Phi(1.12) - \Phi(-1.12) = 0.7372 \approx 74\%$.
  [Figure: probability density function of $\bar X$ when $n = 20$, shaded between $\mu - \sigma/4$ and $\mu + \sigma/4$]

7.3.3 Sample Variance (1/2)
• Distribution of the sample variance
  If $X_1, \ldots, X_n$ are normally distributed with mean $\mu$ and variance $\sigma^2$, then the sample variance $S^2$ has the distribution
  $S^2 \sim \frac{\sigma^2}{n-1}\,\chi^2_{n-1}$.

• Theorem: If $X_i$, $i = 1, \ldots, n$, is a sample from a normal population having mean $\mu$ and variance $\sigma^2$, then $\bar X$ and $S^2$ are independent random variables, with $\bar X$ being normal with mean $\mu$ and variance $\sigma^2/n$, and $(n-1)S^2/\sigma^2$ being chi-square with $n-1$ degrees of freedom.
  (Proof sketch)
  Let $Y_i = X_i - \mu$. Then $\sum_{i=1}^{n}(Y_i - \bar Y)^2 = \sum_{i=1}^{n} Y_i^2 - n\bar Y^2$,
  or equivalently,
  $\sum_{i=1}^{n}(X_i - \bar X)^2 = \sum_{i=1}^{n}(X_i - \mu)^2 - n(\bar X - \mu)^2$.
  Dividing this equation by $\sigma^2$, we get
  $\sum_{i=1}^{n}\left(\frac{X_i - \mu}{\sigma}\right)^2 = \sum_{i=1}^{n}\left(\frac{X_i - \bar X}{\sigma}\right)^2 + \left(\frac{\sqrt{n}(\bar X - \mu)}{\sigma}\right)^2$.
  Cf.
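The two standard-error formulas and the $n = 20$ probability calculation can be reproduced numerically. A minimal sketch (not from the slides), using the identity $\Phi(z) = \tfrac{1}{2}\big(1 + \mathrm{erf}(z/\sqrt{2})\big)$:

```python
import math

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def se_proportion(x, n):
    """Estimated standard error of the sample proportion, with p-hat = x/n."""
    p_hat = x / n
    return math.sqrt(p_hat * (1 - p_hat) / n)

def se_mean(s, n):
    """Estimated standard error of the sample mean, s / sqrt(n)."""
    return s / math.sqrt(n)

# Probability that X-bar lies within sigma/4 of mu when n = 20:
# P(|N(0,1)| <= sqrt(20)/4), with sqrt(20)/4 = 1.118...
z = math.sqrt(20) / 4
print(phi(z) - phi(-z))  # about 0.74, matching the slide's 74%
```

The printed value differs from the slide's 0.7372 only because the slide rounds the critical value to $\Phi(1.12)$ before looking it up in a table.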
Let $X$ and $Y$ be independent chi-square random variables with $m$ and $n$ degrees of freedom, respectively. Then $Z = X + Y$ is a chi-square random variable with $m + n$ degrees of freedom.

  In the previous equation,
  $\left(\frac{\sqrt{n}(\bar X - \mu)}{\sigma}\right)^2 \sim \chi^2_1$ and $\sum_{i=1}^{n}\left(\frac{X_i - \mu}{\sigma}\right)^2 \sim \chi^2_n$.
  Therefore,
  $\sum_{i=1}^{n}\left(\frac{X_i - \bar X}{\sigma}\right)^2 = \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$.

7.3.3 Sample Variance (2/2)
• t-statistic
  $\bar X \sim N\!\left(\mu, \frac{\sigma^2}{n}\right) \;\Rightarrow\; \frac{\sqrt{n}(\bar X - \mu)}{\sigma} \sim N(0,1)$,
  and also $\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$, so that
  $\frac{\sqrt{n}(\bar X - \mu)}{S} = \frac{\sqrt{n}(\bar X - \mu)/\sigma}{\sqrt{\chi^2_{n-1}/(n-1)}} \sim t_{n-1}$.

7.4 Constructing Parameter Estimates

7.4.1 The Method of Moments (1/3)
• Method of moments point estimate for one parameter
  Given a data set of observations $x_1, \ldots, x_n$ from a probability distribution that depends upon one unknown parameter $\theta$, the method of moments point estimate $\hat\theta$ of the parameter is found by solving the equation
  $\bar x = E(X)$.

7.4.1 The Method of Moments (2/3)
• Method of moments point estimates for two parameters
  For two unknown parameters $\theta_1$ and $\theta_2$, the method of moments point estimates are found by solving the equations
  $\bar x = E(X)$ and $s^2 = \mathrm{Var}(X)$.

7.4.1 The Method of Moments (3/3)
• Examples
  – For normally distributed data, since $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$ for $N(\mu, \sigma^2)$, the method of moments gives $\hat\mu = \bar x$ and $\hat\sigma^2 = s^2$.
  – Suppose that the data observations
    2.0 2.4 3.1 3.9 4.5 4.8 5.7 9.9
    are obtained from a $U(0, \theta)$ distribution. Then $\bar x = 4.5375$ and $E(X) = \theta/2$, so $\hat\theta = 2 \times 4.5375 = 9.075$.
  – What if the distribution is exponential with parameter $\lambda$?

7.4.2 Maximum Likelihood Estimates (1/4)
• Maximum likelihood estimate for one parameter
  If a data set consists of observations $x_1, \ldots, x_n$ from a probability distribution $f(x; \theta)$ depending upon one unknown parameter $\theta$, the maximum likelihood estimate $\hat\theta$ of the parameter is found by maximizing the likelihood function
  $L(\theta; x_1, \ldots, x_n) = f(x_1; \theta) \cdots f(x_n; \theta)$.

7.4.2 Maximum Likelihood Estimates (2/4)
• Example
  – If $x_1, \ldots, x_n$ are a set of Bernoulli observations, with $f(1; p) = p$ and $f(0; p) = 1 - p$, i.e.
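The method-of-moments recipe for the $U(0, \theta)$ example reduces to one line of arithmetic. A minimal sketch (not from the slides) using the slide's data set:

```python
# Method of moments for U(0, theta): E(X) = theta/2, so theta-hat = 2 * x-bar.
data = [2.0, 2.4, 3.1, 3.9, 4.5, 4.8, 5.7, 9.9]

xbar = sum(data) / len(data)
theta_mom = 2 * xbar
print(xbar)       # 4.5375
print(theta_mom)  # 9.075

# For an exponential distribution with rate lambda, E(X) = 1/lambda,
# so the same recipe gives lambda-hat = 1 / x-bar.
lam_mom = 1 / xbar
print(lam_mom)
```

Note that the moment estimate 9.075 is smaller than the largest observation 9.9, which a $U(0, \theta)$ model cannot actually produce; this is a known weakness of the moment estimator for the uniform distribution.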
  $f(x_i; p) = p^{x_i}(1-p)^{1-x_i}$.
  – The likelihood function is
  $L(p; x_1, \ldots, x_n) = \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i} = p^{x}(1-p)^{n-x}$,
  where $x = x_1 + \cdots + x_n$, and the m.l.e. $\hat p$ is the value that maximizes this.
  – The log-likelihood is
  $\ln L = x \ln p + (n-x)\ln(1-p)$,
  and setting
  $\frac{d \ln L}{dp} = \frac{x}{p} - \frac{n-x}{1-p} = 0$
  gives $\hat p = \dfrac{x}{n}$.

7.4.2 Maximum Likelihood Estimates (3/4)
• Maximum likelihood estimates for two parameters
  For two unknown parameters $\theta_1$ and $\theta_2$, the maximum likelihood estimates $\hat\theta_1$ and $\hat\theta_2$ are found by maximizing the likelihood function
  $L(\theta_1, \theta_2; x_1, \ldots, x_n) = f(x_1; \theta_1, \theta_2) \cdots f(x_n; \theta_1, \theta_2)$.

7.4.2 Maximum Likelihood Estimates (4/4)
• Example: the normal distribution
  $f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}$
  The likelihood is
  $L(\mu, \sigma^2; x_1, \ldots, x_n) = \prod_{i=1}^{n} f(x_i; \mu, \sigma^2) = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} \exp\!\left(-\sum_{i=1}^{n}(x_i-\mu)^2 \big/ 2\sigma^2\right)$,
  so that the log-likelihood is
  $\ln L = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(\sigma^2) - \frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^2}$.
  Then
  $\frac{\partial \ln L}{\partial \mu} = \frac{\sum_{i=1}^{n}(x_i-\mu)}{\sigma^2}, \qquad \frac{\partial \ln L}{\partial(\sigma^2)} = -\frac{n}{2\sigma^2} + \frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^4}$.
  Setting the two equations to zero,
  $\hat\mu = \bar x, \qquad \hat\sigma^2 = \frac{\sum_{i=1}^{n}(x_i - \bar x)^2}{n}$.

7.4.3 Examples (1/6)
• Glass sheet flaws
  At a glass manufacturing company, 30 randomly selected sheets of glass are inspected. If the distribution of the number of flaws per sheet is taken to be a Poisson distribution with parameter $\lambda$, how should the parameter be estimated?
  – The method of moments: solving $E(X) = \lambda = \bar x$ gives $\hat\lambda = \bar x$.

7.4.3 Examples (2/6)
• The maximum likelihood estimate:
  $f(x_i; \lambda) = \frac{e^{-\lambda}\lambda^{x_i}}{x_i!}$, so that the likelihood is
  $L(\lambda; x_1, \ldots, x_n) = \prod_{i=1}^{n} f(x_i; \lambda) = \frac{e^{-n\lambda}\lambda^{x_1 + \cdots + x_n}}{x_1! \cdots x_n!}$.
  The log-likelihood is therefore
  $\ln L = -n\lambda + (x_1 + \cdots + x_n)\ln\lambda - \ln(x_1! \cdots x_n!)$,
  so that setting
  $\frac{d \ln L}{d\lambda} = -n + \frac{x_1 + \cdots + x_n}{\lambda} = 0$
  gives $\hat\lambda = \bar x$.

7.4.3 Examples (3/6)
• Example 26: Fish Tagging and Recapture
  Suppose a fisherman wants to estimate the fish stock $N$ of a lake, and that 34 fish have been tagged and released back into the lake. If, over a period of time, the fisherman catches 50 fish and 9 of them are tagged, then an intuitive estimate of the total number of fish is
  $\hat N = \frac{34 \times 50}{9} \approx 189$.
  This assumes that the proportion of tagged fish in the lake is roughly equal to the proportion of the fisherman's catch that is tagged.
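The Bernoulli log-likelihood derivation above can be double-checked numerically: a brute-force search over candidate values of $p$ should land on the analytic maximizer $\hat p = x/n$. A minimal sketch (not from the slides; $x = 7$, $n = 10$ are made-up illustration values):

```python
import math

def bernoulli_loglik(p, x, n):
    """Log-likelihood from the slides: ln L = x ln p + (n - x) ln(1 - p)."""
    return x * math.log(p) + (n - x) * math.log(1 - p)

# Closed form: p-hat = x/n.
x, n = 7, 10
p_hat = x / n

# Grid search over (0, 1) confirms the analytic maximizer.
grid = [i / 1000 for i in range(1, 1000)]
p_grid = max(grid, key=lambda p: bernoulli_loglik(p, x, n))
print(p_hat, p_grid)  # 0.7 0.7
```

The same check works for the Poisson case, where the grid maximum of $-n\lambda + \left(\sum x_i\right)\ln\lambda$ lands on $\bar x$.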
7.4.3 Examples (4/6)
  Under the assumption that all the fish are equally likely to be caught, the distribution of the number of tagged fish $X$ in the fisherman's catch of 50 fish is a hypergeometric distribution with $r = 34$, $n = 50$, and $N$ unknown:
  $P(X = x \mid N, n, r) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}, \qquad E(X) = \frac{nr}{N} = \frac{50 \times 34}{N}$.
  Since there is only one observation $x$, we have $\bar x = x$, and equating $E(X) = x$ gives
  $\hat N = \frac{50 \times 34}{9} = 188.89$.

7.4.3 Examples (5/6)
  Under the binomial approximation, the success probability $p = r/N$ is estimated to be
  $\hat p = \frac{x}{n} = \frac{9}{50}$, hence $\hat N = \frac{r}{\hat p} = \frac{50 \times 34}{9}$.
• Example 36: Bee Colonies
  The data on the proportion of worker bees that leave a colony with a queen bee are:
  0.28 0.32 0.09 0.35 0.45 0.41 0.06 0.16 0.16 0.46 0.35 0.29 0.31
  If the entomologist wishes to model this proportion with a beta distribution, how should the parameters be estimated?

7.4.3 Examples (6/6)
  beta: $f(x \mid a, b) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, x^{a-1}(1-x)^{b-1}, \quad 0 \le x \le 1, \; a > 0, \; b > 0$
  $E(X) = \frac{a}{a+b}, \qquad \mathrm{Var}(X) = \frac{ab}{(a+b)^2(a+b+1)}$
  With $\bar x = 0.3007$ and $s^2 = 0.01966$, the point estimates $\hat a$ and $\hat b$ are the solutions to the equations
  $\frac{a}{a+b} = 0.3007$ and $\frac{ab}{(a+b)^2(a+b+1)} = 0.01966$,
  which are $\hat a = 2.92$ and $\hat b = 6.78$.

• MLE for $U(0, \theta)$
  – For some distributions, the MLE cannot be found by differentiation; you have to look at the curve of the likelihood function itself.
  – The MLE of $\theta$ is $\hat\theta = \max\{X_1, \ldots, X_n\}$.
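The beta moment equations can be solved in closed form rather than numerically: writing $m = a/(a+b)$ and $v$ for the variance, the second equation gives $a + b = m(1-m)/v - 1$, after which $a$ and $b$ follow from the first. A minimal sketch (not from the slides), checked against the bee-colony values:

```python
def beta_mom(mean, var):
    """Method-of-moments estimates (a, b) for a beta distribution, solving
    E(X) = a/(a+b) and Var(X) = ab / ((a+b)^2 (a+b+1)) in closed form."""
    t = mean * (1 - mean) / var - 1  # t = a + b
    return mean * t, (1 - mean) * t

a_hat, b_hat = beta_mom(0.3007, 0.01966)
print(round(a_hat, 2), round(b_hat, 2))  # 2.92 6.78, as on the slide
```

This closed form requires $v < m(1-m)$, which always holds for data genuinely on $(0, 1)$ with positive variance.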