Posted on: 3/28/2011. Public Domain.
Chapter 11 Hypothesis Testing

A statistical hypothesis is an assertion or conjecture concerning one or more populations. In hypothesis testing, we begin by making a tentative assumption about a population parameter. This tentative assumption is called the null hypothesis, denoted by $H_0$. The opposite statement is called the alternative hypothesis, denoted by $H_a$.

Acceptance and Rejection

Acceptance of a hypothesis merely implies that the data do not give sufficient evidence to refute it. Rejection, on the other hand, implies that the sample evidence refutes it; that is, there is only a small probability of obtaining the observed sample information when the hypothesis is in fact true.

One- and Two-Tailed Tests

One-tailed test:
– if the alternative is one-sided
$H_0: \mu \ge \mu_0$ (or $\mu = \mu_0$), $H_a: \mu < \mu_0$
$H_0: \mu \le \mu_0$ (or $\mu = \mu_0$), $H_a: \mu > \mu_0$
Two-tailed test:
– if the alternative is two-sided
$H_0: \mu = \mu_0$, $H_a: \mu \ne \mu_0$

Testing a Statistical Hypothesis

The value $\mu_0$ is the hypothesized value of the parameter. The set of test statistic values for which $H_0$ is not rejected is called the acceptance region. The set of values for which $H_0$ is rejected in favor of $H_a$ is called the critical region. The value that separates the two regions is called the critical value.

Example 11.1

The manufacturer of a certain brand of cigarettes claims that the average nicotine content does not exceed 2.5 mg. State the null and alternative hypotheses.
$H_0: \mu \le 2.5$
$H_a: \mu > 2.5$

Example 11.2

A real estate agent claims that 60% of all private residences being built today are 3-bedroom homes. To test this claim, a large sample of new residences is inspected; the proportion of these homes with 3 bedrooms is recorded and used as the test statistic. State the null and alternative hypotheses.
$H_0: p = 0.6$
$H_a: p \ne 0.6$

Testing Research Hypothesis

The rejection of $H_0$ supports the conclusion and action being sought. The research hypothesis should therefore be expressed as the alternative hypothesis.

Testing the Validity of a Claim

The null hypothesis is generally based on the assumption that the claim is true.
The alternative hypothesis is then formulated so that rejection of $H_0$ will provide statistical evidence that the stated assumption is incorrect. Action to correct the claim should be considered whenever $H_0$ is rejected.

Testing in Decision-Making

In the previous two cases, action is taken only if $H_0$ is rejected. In many instances, however, action must be taken both when $H_0$ cannot be rejected and when $H_0$ can be rejected.

Conclusion in Hypothesis Test

The null and alternative hypotheses are competing statements about the population. Either the null hypothesis $H_0$ is true or the alternative hypothesis $H_a$ is true, but not both. Ideally, the hypothesis testing procedure should lead to the acceptance of $H_0$ when $H_0$ is true and the rejection of $H_0$ when $H_a$ is true.

Type I and II Error

Type I Error
Rejection of the null hypothesis when it is true is called a type I error. The probability of committing a type I error is called the level of significance, denoted by $\alpha$. We control the probability of making a type I error by choosing the level of significance. Common choices for $\alpha$ are 0.05 and 0.01.

Type II Error
Acceptance of the null hypothesis when it is false is called a type II error. We do not control the probability of making a type II error, so if we decide to accept $H_0$, we cannot determine how confident we can be in that decision. For this reason we use the statement "do not reject $H_0$" instead of "accept $H_0$".

p-value

The p-value is the probability, computed under the assumption that $H_0$ is true, of obtaining a sample result that is at least as unlikely as what is observed. A small p-value indicates that the sample result is unusual given the assumption that $H_0$ is true. As with the critical-value approach, a small p-value leads to the rejection of $H_0$.

The p-value approach is designed to give the user an alternative (in terms of probability) to a mere "reject" or "do not reject" conclusion.
If p-value < $\alpha$, then we reject $H_0$. Writing $Z$ for the standard normal random variable and $z$ for the computed value of the test statistic:
– Two-tailed: $P = 2P(Z > |z| \text{ when } \mu = \mu_0)$
– One-tailed, $H_a: \mu > \mu_0$: $P = P(Z > z \text{ when } \mu = \mu_0)$
– One-tailed, $H_a: \mu < \mu_0$: $P = P(Z < z \text{ when } \mu = \mu_0)$

Steps of Hypothesis Testing

1 Determine the null and alternative hypotheses.
2 Select the test statistic that will be used.
3 Specify the level of significance.
4 Use the level of significance to develop the rejection rule.
5 Collect the sample data and compute the value of the test statistic.
6 Compare the value of the test statistic to the critical values specified in the rejection rule to determine whether or not $H_0$ should be rejected.
7 Compute the p-value based on the test statistic in Step 5.
8 Use the p-value to determine whether or not $H_0$ should be rejected.

One-Tailed Tests of $\mu$ ($n \ge 30$)

$H_0: \mu \ge \mu_0$ (or $\mu = \mu_0$)
$H_a: \mu < \mu_0$
It is convenient to standardize $\bar{X}$ and formally involve the standard normal random variable $Z$, where
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}.$$

$\sigma$ known ($n \ge 30$)

Under $H_0$, if $\mu = \mu_0$, then $Z$ has an $N(0,1)$ distribution, and hence the expression
$$P(Z > -z_\alpha) = 1 - \alpha$$
can be used to write an appropriate acceptance region.

The critical region is designed to control $\alpha$, the probability of a type I error. Given a computed value $\bar{x}$, the formal test involves:
– If the computed test statistic $z < -z_\alpha$, reject $H_0$.
– If $z \ge -z_\alpha$, do not reject $H_0$.

One-Tailed Tests of $\mu$ ($n \ge 30$)

$H_0: \mu \le \mu_0$ (or $\mu = \mu_0$)
$H_a: \mu > \mu_0$
Under $H_0$, if $\mu = \mu_0$, the expression
$$P(Z < z_\alpha) = 1 - \alpha$$
can be used to write an appropriate acceptance region.

$\sigma$ known ($n \ge 30$)

Given a computed value $\bar{x}$, the formal test involves:
– If the computed test statistic $z > z_\alpha$, reject $H_0$.
– If $z \le z_\alpha$, do not reject $H_0$.

Example 11.3

A random sample of 100 recorded deaths in the United States during the past year showed an average life span of 71.8 years. Assuming a population standard deviation of 8.9 years, does this seem to indicate that the average life span today is greater than 70 years? Use a 0.05 level of significance.
Solution

$H_0: \mu = 70$ years
$H_a: \mu > 70$ years
$\alpha = 0.05$
Critical region: $z > 1.645$, where
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{71.8 - 70}{8.9/\sqrt{100}} = 2.02.$$
Conclusion: reject $H_0$ and conclude that the average life span today is greater than 70 years.

Solution (p-value)

Using the table, we have $P = P(Z > 2.02) = 0.0217 < 0.05$. As a result, the evidence in favor of $H_a$ is even stronger than that suggested by a 0.05 level of significance.

$\sigma$ unknown ($n \ge 30$)

It is convenient to standardize $\bar{X}$ and formally involve the standard normal random variable $Z$, where the sample standard deviation $s$ replaces $\sigma$:
$$Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}.$$

One-Tailed Tests ($n < 30$)

If the population has a normal distribution, it is convenient to standardize $\bar{X}$.
– Use the Student t random variable $T$ when $\sigma$ is unknown.
– Or use the standard normal random variable $Z$ when $\sigma$ is known.

$\sigma$ unknown ($n < 30$), $H_a: \mu < \mu_0$

Under $H_0$, if $\mu = \mu_0$, then
$$T = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$
has a t-distribution with $n - 1$ degrees of freedom, and hence the expression
$$P(T > -t_\alpha) = 1 - \alpha$$
can be used to write an appropriate acceptance region.

Given a computed value $\bar{x}$, the formal test involves:
– If the computed test statistic $t < -t_\alpha$, reject $H_0$.
– If $t \ge -t_\alpha$, do not reject $H_0$.

$\sigma$ unknown ($n < 30$), $H_a: \mu > \mu_0$

Under $H_0$, if $\mu = \mu_0$, then $T$ has a t-distribution with $n - 1$ degrees of freedom, and hence the expression
$$P(T < t_\alpha) = 1 - \alpha$$
can be used to write an appropriate acceptance region.

Given a computed value $\bar{x}$, the formal test involves:
– If the computed test statistic $t > t_\alpha$, reject $H_0$.
– If $t \le t_\alpha$, do not reject $H_0$.

Example 11.4

The Edison Electric Institute has claimed that a vacuum cleaner expends an average of 46 kWh/year. If a random sample of 12 homes included in a planned study indicates that vacuum cleaners expend an average of 42 kWh/year with a standard deviation of 11.9 kWh, does this suggest at the 0.05 level of significance that vacuum cleaners expend, on average, less than 46 kWh/year? Assume a normal population.

Solution

$H_0: \mu = 46$ kWh
$H_a: \mu < 46$ kWh
$\alpha = 0.05$
Critical region: $t < -1.796$ (11 degrees of freedom), where
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{42 - 46}{11.9/\sqrt{12}} = -1.16.$$
Conclusion: do not reject $H_0$ and conclude that the average number of kilowatt-hours expended annually by vacuum cleaners is not significantly less than 46.
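The critical-value calculation in Example 11.4 can be checked with a few lines of Python. This is only a minimal sketch: the function name `one_sample_t_statistic` is made up for illustration, and the critical value -1.796 is copied from the example rather than computed, because the Python standard library has no t distribution.

```python
import math

def one_sample_t_statistic(sample_mean, mu0, sample_sd, n):
    """Return t = (x_bar - mu0) / (s / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error

# Example 11.4 values: n = 12, x_bar = 42, s = 11.9, mu0 = 46
t_stat = one_sample_t_statistic(42, 46, 11.9, 12)

# Critical value -t_0.05 with 11 degrees of freedom, taken from the example
# (assumption: read from a t table, not recomputed here).
T_CRITICAL = -1.796

# Lower-tailed rejection rule: reject H0 only if t falls below the critical value.
reject_h0 = t_stat < T_CRITICAL
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

The computed statistic rounds to -1.16, which is above -1.796, so the rule returns "do not reject".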
Solution (p-value)

Using the table, we have $P = P(T < -1.16) \approx 0.135 > 0.05$.
Conclusion: do not reject $H_0$.

Two-Tailed Tests of $\mu$ ($n \ge 30$)

$H_0: \mu = \mu_0$
$H_a: \mu \ne \mu_0$
It is convenient to standardize $\bar{X}$ and formally involve the standard normal random variable $Z$, where
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}.$$

$\sigma$ known ($n \ge 30$)

Under $H_0$, if $\mu = \mu_0$, then $Z$ has an $N(0,1)$ distribution, and hence the expression
$$P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$$
can be used to write an appropriate acceptance region.

The critical region is designed to control $\alpha$, the probability of a type I error. Given a computed value $\bar{x}$, the formal test involves:
– If the computed test statistic $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$, reject $H_0$.
– If $-z_{\alpha/2} \le z \le z_{\alpha/2}$, do not reject $H_0$.

Example 11.5

A manufacturer of sports equipment has developed a new synthetic fishing line that he claims has a mean breaking strength of 8 kg with a standard deviation of 0.5 kg. Test the hypothesis that $\mu = 8$ kg against the alternative that $\mu \ne 8$ kg if a random sample of 50 lines is tested and found to have a mean breaking strength of 7.8 kg. Use a 0.01 level of significance.

Solution

$H_0: \mu = 8$ kg
$H_a: \mu \ne 8$ kg
$\alpha = 0.01$
Critical region: $z < -2.575$ or $z > 2.575$, where
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{7.8 - 8}{0.5/\sqrt{50}} = -2.83.$$
Conclusion: reject $H_0$ and conclude that the average breaking strength is not equal to 8 kg; in fact, it is less than 8 kg.

Solution (p-value)

Using the table, we have $P = P(|Z| > 2.83) = 0.0046 < 0.01$. We can therefore reject the null hypothesis that $\mu = 8$ kg at a level of significance smaller than 0.01.

$\sigma$ unknown ($n < 30$)

If the population has a normal distribution, it is convenient to standardize $\bar{X}$.
– Use the Student t random variable $T$ when $\sigma$ is unknown.
– Or use the standard normal random variable $Z$ when $\sigma$ is known.

Under $H_0$, if $\mu = \mu_0$, then
$$T = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$
has a t-distribution with $n - 1$ degrees of freedom, and hence the expression
$$P(-t_{\alpha/2} < T < t_{\alpha/2}) = 1 - \alpha$$
can be used to write an appropriate acceptance region.

The critical region is designed to control $\alpha$, the probability of a type I error. Given a computed value $\bar{x}$, the formal test involves:
– If the computed test statistic $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$, reject $H_0$.
– If $-t_{\alpha/2} \le t \le t_{\alpha/2}$, do not reject $H_0$.
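As a rough check on the two-tailed z test in Example 11.5, the sketch below computes the test statistic, the critical value, and the p-value with Python's standard-library `statistics.NormalDist`. The function name `two_tailed_z_test` and its return layout are illustrative choices, not part of the original slides.

```python
import math
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha):
    """Two-tailed z test of H0: mu = mu0 with known sigma.

    Returns (z, z_{alpha/2}, p-value, reject H0?).
    """
    std_normal = NormalDist()  # N(0, 1)
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Two-tailed p-value: probability of a value at least as extreme as |z|.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, z_crit, p_value, abs(z) > z_crit

# Example 11.5 values: x_bar = 7.8, mu0 = 8, sigma = 0.5, n = 50, alpha = 0.01
z, z_crit, p_value, reject = two_tailed_z_test(7.8, 8.0, 0.5, 50, 0.01)
print(f"z = {z:.2f}, z_crit = {z_crit:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

Because the sketch does not round z to two decimals before looking up the probability, its p-value (about 0.0047) can differ slightly in the last digit from a table-based value; the decision to reject at the 0.01 level is the same.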
$\sigma$ known ($n < 30$)

Given a computed value $\bar{x}$, the formal test involves:
– If the computed test statistic $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$, reject $H_0$.
– If $-z_{\alpha/2} \le z \le z_{\alpha/2}$, do not reject $H_0$.

Relation to Confidence Interval

$(1-\alpha)100\%$ confidence interval:
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
when $\sigma$ is known, and
$$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$
when $\sigma$ is unknown.

A Confidence Interval Approach

$H_0: \mu = \mu_0$
$H_a: \mu \ne \mu_0$
1 Compute the confidence interval for $\mu$.
2 If the confidence interval contains the hypothesized value $\mu_0$, do not reject $H_0$.
3 Otherwise, reject $H_0$.

Test about Population Proportion

One-tailed test:
– if the alternative is one-sided
$H_0: p \ge p_0$ (or $p = p_0$), $H_a: p < p_0$
$H_0: p \le p_0$ (or $p = p_0$), $H_a: p > p_0$
Two-tailed test:
– if the alternative is two-sided
$H_0: p = p_0$, $H_a: p \ne p_0$

Test Statistic

With sample proportion $\hat{p} = x/n$,
$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}.$$

One-Tailed Test of p

$H_0: p \le p_0$ (or $p = p_0$)
$H_a: p > p_0$
Given a computed value $z$, the formal test involves:
– If the computed test statistic $z > z_\alpha$, reject $H_0$.
– If $z \le z_\alpha$, do not reject $H_0$.

One-Tailed Test of p

$H_0: p \ge p_0$ (or $p = p_0$)
$H_a: p < p_0$
Given a computed value $z$, the formal test involves:
– If the computed test statistic $z < -z_\alpha$, reject $H_0$.
– If $z \ge -z_\alpha$, do not reject $H_0$.

Two-Tailed Test of p

$H_0: p = p_0$
$H_a: p \ne p_0$
Given a computed value $z$, the formal test involves:
– If the computed test statistic $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$, reject $H_0$.
– If $-z_{\alpha/2} \le z \le z_{\alpha/2}$, do not reject $H_0$.

p-value

If p-value < $\alpha$, then we reject $H_0$.
– Two-tailed: $P = 2P(Z > |z| \text{ when } p = p_0)$, or equivalently $P = P(|Z| > |z| \text{ when } p = p_0)$
– One-tailed: $P = P(Z > z \text{ when } p = p_0)$ for an upper-tailed test, or $P = P(Z < z \text{ when } p = p_0)$ for a lower-tailed test

By hypothesis:
– $H_0: p \ge p_0$ (or $p = p_0$), $H_a: p < p_0$: $P = P(Z < z \text{ when } p = p_0)$
– $H_0: p \le p_0$ (or $p = p_0$), $H_a: p > p_0$: $P = P(Z > z \text{ when } p = p_0)$
– $H_0: p = p_0$, $H_a: p \ne p_0$: $P = P(|Z| > |z| \text{ when } p = p_0)$

Example 11.6

A builder claims that heat pumps are installed in 70% of all homes being constructed today in the city of Richmond. Would you agree with this claim if a random survey of new homes in this city shows that 8 out of 15 had heat pumps installed? Use a 0.10 level of significance.
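Solution

$H_0: p = 0.7$
$H_a: p \ne 0.7$
$\alpha = 0.10$
Critical region: $z < -z_{0.05} = -1.645$ or $z > z_{0.05} = 1.645$, where
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} = \frac{8/15 - 0.7}{\sqrt{(0.7)(0.3)/15}} \approx -1.41.$$
Since $-z_{0.05} < z < z_{0.05}$, do not reject $H_0$.

Solution (p-value)

Using the table, we have $P = 2P(Z < -1.41) = 0.1586 > 0.10$.
Do not reject $H_0$.
Conclusion: there is insufficient reason to doubt the builder's claim.

Binomial Distribution Approach

Binomial variable $X$ with $p = 0.7$ and $n = 15$; $x = 8$ and $np_0 = 15(0.7) = 10.5$.
Since $x < np_0$, the p-value is
$$P = 2P(X \le 8 \text{ when } p = 0.7) = 2\sum_{x=0}^{8} b(x; 15, 0.7) = 0.2622 > 0.10.$$
Do not reject $H_0$.
Conclusion: there is insufficient reason to doubt the builder's claim.

Test about Population Variance

One-tailed test:
– if the alternative is one-sided
$H_0: \sigma^2 \ge \sigma_0^2$ (or $\sigma^2 = \sigma_0^2$), $H_a: \sigma^2 < \sigma_0^2$
$H_0: \sigma^2 \le \sigma_0^2$ (or $\sigma^2 = \sigma_0^2$), $H_a: \sigma^2 > \sigma_0^2$
Two-tailed test:
– if the alternative is two-sided
$H_0: \sigma^2 = \sigma_0^2$, $H_a: \sigma^2 \ne \sigma_0^2$

Test Statistic

For a sample of size $n$ from a normal population with sample variance $s^2$,
$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$$
with $n - 1$ degrees of freedom.

One-Tailed Test of $\sigma^2$

$H_0: \sigma^2 \le \sigma_0^2$ (or $\sigma^2 = \sigma_0^2$)
$H_a: \sigma^2 > \sigma_0^2$
Given a computed value $\chi^2$, the formal test involves:
– If the computed test statistic $\chi^2 > \chi^2_\alpha$, reject $H_0$.
– If $\chi^2 \le \chi^2_\alpha$, do not reject $H_0$.

One-Tailed Test of $\sigma^2$

$H_0: \sigma^2 \ge \sigma_0^2$ (or $\sigma^2 = \sigma_0^2$)
$H_a: \sigma^2 < \sigma_0^2$
Given a computed value $\chi^2$, the formal test involves:
– If the computed test statistic $\chi^2 < \chi^2_{1-\alpha}$, reject $H_0$.
– If $\chi^2 \ge \chi^2_{1-\alpha}$, do not reject $H_0$.

Two-Tailed Test of $\sigma^2$

$H_0: \sigma^2 = \sigma_0^2$
$H_a: \sigma^2 \ne \sigma_0^2$
Given a computed value $\chi^2$, the formal test involves:
– If the computed test statistic $\chi^2 > \chi^2_{\alpha/2}$ or $\chi^2 < \chi^2_{1-\alpha/2}$, reject $H_0$.
– If $\chi^2_{1-\alpha/2} \le \chi^2 \le \chi^2_{\alpha/2}$, do not reject $H_0$.

p-value

If p-value < $\alpha$, then we reject $H_0$. Writing $X^2$ for the chi-square random variable and $\chi^2$ for its computed value:
– Two-tailed: $P = 2\min\{P(X^2 > \chi^2),\ P(X^2 < \chi^2)\}$ when $\sigma^2 = \sigma_0^2$ (the chi-square distribution is not symmetric, so the smaller tail is doubled)
– One-tailed: $P = P(X^2 > \chi^2 \text{ when } \sigma^2 = \sigma_0^2)$ for an upper-tailed test, or $P = P(X^2 < \chi^2 \text{ when } \sigma^2 = \sigma_0^2)$ for a lower-tailed test

Calculating Type II Error

$H_0: \mu \ge \mu_0$ (or $\mu = \mu_0$)
$H_a: \mu < \mu_0$
Denote the probability of making a type II error by $\beta$.

Example 11.7

A quality manager must decide whether to accept a shipment of batteries from a supplier or return it to the supplier because of poor quality. Assume that the required battery life is at least 120 hours. To evaluate the quality of an incoming shipment, a sample of 36 batteries is tested. The standard deviation of the population is 12.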
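What is the probability of a type II error if the actual mean life is 112 hours and the manager accepts this shipment?

Solution

$H_0: \mu \ge 120$
$H_a: \mu < 120$
$\alpha = 0.05$, $\sigma = 12$, $n = 36$
The test statistic is
$$z = \frac{\bar{x} - 120}{12/\sqrt{36}},$$
and $H_0$ is rejected if $z < -1.645$, that is, if $\bar{x} < 120 - 1.645(12/\sqrt{36}) = 116.71$.

If $\mu = 112$ is really true, what is the probability of accepting $H_0: \mu \ge 120$ and hence committing a type II error?
$$\beta = P(\bar{X} \ge 116.71 \text{ when } \mu = 112) = P\left(Z \ge \frac{116.71 - 112}{12/\sqrt{36}}\right) = P(Z \ge 2.36) = 0.0091.$$

Type II Error

The probability of correctly rejecting $H_0$ when it is false is called the power of the test, denoted by $1 - \beta$.

Determine the Sample Size

$H_0: \mu \ge \mu_0$ (or $\mu = \mu_0$)
$H_a: \mu < \mu_0$
Notation:
$z_\alpha$ = z-value for the type I error probability
$z_\beta$ = z-value for the type II error probability
$\mu_0$ = population mean in $H_0$
$\mu_a$ = population mean used for the type II error
$$n = \frac{(z_\alpha + z_\beta)^2 \sigma^2}{(\mu_0 - \mu_a)^2}$$

Example 11.8 (continuing Example 11.7)

Assume the manager makes the following statements about the shipment:
– Type I error: If the mean life of the batteries in the shipment is 120, I am willing to risk an $\alpha = 0.05$ probability of rejecting this shipment.
– Type II error: If the mean life of the batteries in the shipment is 115, I am willing to risk a $\beta = 0.10$ probability of accepting this shipment.
How large a sample should the manager take?

Solution

$$n = \frac{(1.645 + 1.28)^2 (12)^2}{(120 - 115)^2} \approx 49.3$$
Rounding up, the recommended sample size is 50 to satisfy the manager's allowances.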
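The type II error and sample-size calculations for the battery examples can be sketched in Python using the standard-library `statistics.NormalDist`. The function names are made up for illustration. The sample-size function assumes the usual one-tailed formula $n = (z_\alpha + z_\beta)^2 \sigma^2 / (\mu_0 - \mu_a)^2$, and both functions assume a lower-tailed test with known $\sigma$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1)

def type_ii_error_lower_tail(mu0, mu_actual, sigma, n, alpha):
    """beta for H0: mu >= mu0 vs Ha: mu < mu0 when the true mean is mu_actual."""
    std_error = sigma / math.sqrt(n)
    # H0 is rejected when the sample mean falls below this cutoff.
    cutoff = mu0 - STD_NORMAL.inv_cdf(1 - alpha) * std_error
    # beta = P(x_bar >= cutoff) when mu = mu_actual
    return 1 - STD_NORMAL.cdf((cutoff - mu_actual) / std_error)

def sample_size_one_tailed(mu0, mu_a, sigma, alpha, beta):
    """n = (z_alpha + z_beta)^2 * sigma^2 / (mu0 - mu_a)^2, rounded up."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    return math.ceil((z_alpha + z_beta) ** 2 * sigma ** 2 / (mu0 - mu_a) ** 2)

# Example 11.7: mu0 = 120, actual mean 112, sigma = 12, n = 36, alpha = 0.05
beta_112 = type_ii_error_lower_tail(120, 112, 12, 36, 0.05)

# Example 11.8: alpha = 0.05 at mu0 = 120, beta = 0.10 at mu_a = 115
n_required = sample_size_one_tailed(120, 115, 12, 0.05, 0.10)

print(f"beta at mu = 112: {beta_112:.4f}; required n: {n_required}")
```

With unrounded z-values the type II error probability comes out at roughly 0.009, and the required sample size rounds up to 50, matching the stated recommendation.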