Chapter 10: One- and Two-Sample Tests of Hypotheses

[Figure: a sample is taken from the population, and inference is made from the sample back to the population.]

Many of the same types of things we did in Chapter 9 will be done here as well, but through more formal methods (hypothesis tests). Below is a summary of the whole process:

1) Define a population and parameter(s) of interest.
2) State a hypothesis about the parameter's value.
3) Take a representative sample from the population.
4) Calculate the statistic(s) using the sample.
5) Make inferences from the sample to the population by using hypothesis tests.

The hypothesis tests will allow us to make a decision about hypotheses of interest with a certain level of confidence.

From this chapter, it is important to learn the following:
• Null and alternative hypotheses
• Type I and II errors
• Hypothesis test procedures for a variety of problems

2005 Christopher R. Bilder

10.1-10.7: The Basics of Hypothesis Testing and Testing Hypotheses Regarding $\mu$

Hypothesis: A statement that something is true.

Below is an example to help introduce hypothesis testing:

Example: Light Bulbs (light_bulbs.xls from Chapter 9)

Suppose that General Electric is interested in estimating the mean lifetime of its light bulbs. It hypothesizes that $\mu = 250$ hours (this could be what is stated on the package). How can this be checked?

General Electric takes a random sample of 16 light bulbs and finds they last on average for 299.2 hours with a standard deviation of 80 hours. The 95% C.I. for $\mu$ is $264.14 < \mu < 334.26$.

Is $\mu = 250$? Since $264.14 < \mu < 334.26$ with a 95% level of confidence, $\mu$ appears to be greater than 250. Therefore, reject the hypothesis of $\mu = 250$.

Suppose before the sample was taken, General Electric hypothesized that $\mu = 270$. Is this correct? Again, the sample was taken and the C.I. above was obtained. Since $\mu$ could be 264.15, 268, 270, 272, 300, ..., $\mu = 270$ may be correct. Therefore, do not reject the hypothesis of $\mu = 270$.
There is not sufficient evidence from the sample to prove the hypothesized value of $\mu = 270$ to be incorrect.

Finally, suppose before the sample was taken, General Electric hypothesized that $\mu = 350$. Is this correct? Again, the sample was taken and the confidence interval above was obtained. Since $264.14 < \mu < 334.26$ with a 95% level of confidence, $\mu$ appears to be less than 350. Therefore, reject the hypothesis of $\mu = 350$.

The above is an informal example of a hypothesis test. In many real-life situations, there is a hypothesis about the population mean or other population parameters. A sample from the population is taken to investigate the hypothesis.

For the first hypothesis of $\mu = 250$ in the light bulb example, two hypotheses were considered:

Null Hypothesis, Ho: $\mu = 250$
Alternative Hypothesis, Ha: $\mu \ne 250$

One of two possible decisions is made:

Reject Ho - This indicates $\mu$ is not 250.
Don't Reject Ho - This indicates there is not sufficient evidence from the sample to say $\mu$ is different from 250.

You cannot say "Accept Ho"; i.e., you cannot say Ho is true. The C.I. gave a range of possible values for $\mu$. One of those is the hypothesized value of the population mean (if we don't reject Ho). Thus, the null hypothesis may or may not be true. See the following (and previous) example.

Note: Some people will use the terminology "Fail to reject Ho" instead of "Don't reject Ho". Both are fine to use.

Example: Jury Trials

Juries are asked to consider two hypotheses:

Ho: Defendant is innocent
Ha: Defendant is guilty

The defendant is assumed innocent until proven guilty. In hypothesis testing, we assume Ho is true until there is enough evidence to prove otherwise.

The jury listens to the prosecution and the defense to make a judgment. This is like taking a SAMPLE.

If there is ENOUGH evidence (beyond a reasonable doubt) to convict - Reject Ho, the defendant is "guilty".
If there is NOT ENOUGH evidence (reasonable doubt) to convict - Don't Reject Ho, the defendant is "not guilty". Notice, this does not mean the defendant is innocent.

Types of errors in hypothesis test decisions

Type I - Reject Ho, but in reality Ho is true
Type II - Don't reject Ho, but in reality Ha is true
(reality = population)

These errors indicate that the sample led us to believe something about the population that is incorrect.

Example: Jury Trials

Type I: Reject Ho = jury says the defendant is guilty, but Ho is really true = defendant is innocent. Send an innocent person to jail.

Type II: Don't reject Ho = jury says the defendant is not guilty, but Ha is really true = defendant is guilty. Let a criminal go free.

Probability of making errors

A type I error is the more serious error in the jury trial example and in statistics. Thus, P(Type I error) is controlled in a hypothesis test at a specified level denoted by $\alpha$. Therefore,

P(Type I error) = P(Reject Ho | Ho is TRUE) = $\alpha$.

This is also called the "level of significance".

A type II error is generally not as serious, so it is usually not controlled at a fixed level. We can still define the probability of committing this error:

P(Type II error) = P(Don't reject Ho | Ha is TRUE) = $\beta$

Table describing the two errors:

                           Based on Sample
                           Reject Ho             Don't Reject Ho
Population   Ho is TRUE    Type I Error          Correct Conclusion
             Ha is TRUE    Correct Conclusion    Type II Error

Same table, but with the conditional probabilities:

                           Based on Sample
                           Reject Ho                                    Don't Reject Ho
Population   Ho is TRUE    P(Reject Ho | Ho is TRUE) = $\alpha$         P(Don't reject Ho | Ho is TRUE) = $1-\alpha$
             Ha is TRUE    P(Reject Ho | Ha is TRUE) = $1-\beta$        P(Don't reject Ho | Ha is TRUE) = $\beta$

Use the basic definitions of conditional probabilities from Chapter 2 to help interpret the table! Remember that $P(A \mid B) + P(A' \mid B) = 1$, where $A'$ is the complement of $A$.

Power

In hypothesis testing, we will make the assumption that Ho is true and then try to prove it to be incorrect using the evidence gathered in the sample. Thus, it is important to define P(Reject Ho | Ha is TRUE). This is called the power of the test. Notice where this result falls in the above table; it has a probability of $1-\beta$.

Question: Do you want this probability to be small or large?

Three Methods for performing a hypothesis test

1) Confidence interval - Section 10.6
2) Test statistic - Sections 10.5, 10.7
3) P-value - Sections 10.4, 10.5, 10.7

All three provide the same answer when testing the population mean! Note that there may be slightly different conclusions when testing a population proportion, p.

1) The confidence interval method - 4 Steps

1. State Ho: $\mu = \mu_0$ and Ha: $\mu \ne \mu_0$, where $\mu_0$ is some number
2. Find the C.I. for $\mu$
3. Reject or do not reject Ho - Check if the hypothesized value of $\mu$ is inside the interval.
4. Conclusion - Describe what 3. means in terms of the original problem

Example: GPA Example. Test the hypothesis that the mean GPA of UNL students is 3.0. Suppose P(Type I error) = $\alpha$ = 0.05, $\bar{x} = 2.9$, $n = 16$, and $s = 0.1$.

1. Ho: $\mu = 3.0$
   Ha: $\mu \ne 3.0$
2. $\bar{x} - t_{\alpha/2,n-1}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,n-1}\frac{s}{\sqrt{n}}$
   $2.9 - 2.131\frac{0.1}{4} < \mu < 2.9 + 2.131\frac{0.1}{4}$
   $2.847 < \mu < 2.953$
3. Reject Ho since $\mu = 3.0$ is not in the interval.
4. The average GPA of UNL students is not 3.0.

Remember: The probability of incorrectly rejecting $\mu = 3.0$ is 5% (the probability of making a type I error). Thus, if the whole process of taking a sample and doing the hypothesis test is repeated 1,000 times WITH $\mu = 3.0$, we would expect to incorrectly reject Ho: $\mu = 3.0$ about $0.05 \times 1{,}000 = 50$ times.

Example: Volleyball quality control (hyp_volleyball_data.xls)

Suppose Mikasa, a volleyball manufacturer, is concerned about whether their volleyballs are being produced with the correct radius of 11.6cm. A sample of 36 volleyballs is taken with $\bar{x} = 11.5$ and $s = 1$.
Part of the data set is below.

Volleyball Radius
11.38
12.78
10.61
10.10

Is there evidence to show the volleyballs are being made incorrectly? Conduct a hypothesis test with $\alpha = 0.05$.

1. Ho: $\mu = 11.6$
   Ha: $\mu \ne 11.6$
2. $\bar{x} - t_{\alpha/2,n-1}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,n-1}\frac{s}{\sqrt{n}}$
   $11.5 - 2.03\frac{1}{6} < \mu < 11.5 + 2.03\frac{1}{6}$
   $11.16 < \mu < 11.84$
3. Do not reject Ho since $\mu = 11.6$ is in the interval.
4. There is not sufficient evidence to prove the volleyballs are being made incorrectly.
   OR
   There is not sufficient evidence to conclude the population mean radius is different from 11.6.

Notes:
• What should Mikasa do? Continue with production of the volleyballs.
• I did not say, "The volleyballs are being produced correctly." THIS IS WRONG because the probability of committing a Type II Error is NOT controlled ($\beta$ was not stated). Compare this to the GPA example!

2) The test statistic method - 5 Steps

1. State Ho and Ha

2. Find the test statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$

Does this look familiar??? See Chapters 8 and 9.

The test statistic examines how far the sample mean is from the hypothesized mean. The numerator of t, $\bar{x} - \mu_0$, is divided by $s/\sqrt{n}$ to account for the variation of $\bar{x}$.

Provided Ho is true and $X_1, X_2, \ldots, X_n$ is a random sample from a population with a normal PDF with $E(X) = \mu_0$ and $Var(X) = \sigma^2$, the random variable version of the test statistic,

$T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$,

has a t-distribution with $\nu = n-1$ degrees of freedom. In part 1 of the Chapter 9 (p. 12) notes, we showed that

$P(-t_{\alpha/2,n-1} < T < t_{\alpha/2,n-1}) = 1-\alpha$
$\Rightarrow P\left(-t_{\alpha/2,n-1} < \frac{\bar{X} - \mu_0}{S/\sqrt{n}} < t_{\alpha/2,n-1}\right) = 1-\alpha$

Thus, we have a range of probable values for T for a given $\alpha$. If we observe t to be outside of this range, this gives us evidence that our initial assumption of "Ho is true" is incorrect!

3. Find the critical values: $\pm t_{\alpha/2,n-1}$

There are two critical values which define the range of probable values for T.
Again, remember that

$P(-t_{\alpha/2,n-1} < T < t_{\alpha/2,n-1}) = 1-\alpha$
$\Rightarrow P\left(-t_{\alpha/2,n-1} < \frac{\bar{X} - \mu_0}{S/\sqrt{n}} < t_{\alpha/2,n-1}\right) = 1-\alpha$

If we observe t to be outside of this range, this may give us evidence that our initial assumption of "Ho is true" is incorrect!

4. Reject or do not reject Ho
   i) Draw the t-distribution
   ii) Plot the critical values
   iii) Label the graph with reject and don't reject regions
   iv) Plot the test statistic
   v) Write reject or don't reject Ho and provide a reason

[Figure: t-distribution density h(t) with "Reject Ho" regions in both tails beyond the two critical values and a "Don't Reject Ho" region between them, centered at 0.]

5. Conclusion - Describe what 4. means in terms of the original problem.

Example: Volleyball quality control (hyp_volleyball_data.xls and hyp_1sample_pic.xls); remember $n = 36$ and $\alpha = 0.05$

1. Ho: $\mu = 11.6$
   Ha: $\mu \ne 11.6$
2. $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{11.5 - 11.6}{1/6} = -0.6$
3. $\pm t_{\alpha/2,n-1} = \pm 2.03$
4. [Figure: t-distribution density h(t) with critical values at -2.03 and 2.03 and the test statistic -0.6 plotted inside the "Don't Reject Ho" region.]
   Since $-2.03 < -0.6 < 2.03$, do not reject Ho.
5. There is not sufficient evidence to prove the volleyballs are being made incorrectly.
   OR
   There is not sufficient evidence to conclude the population mean radius is different from 11.6.

To help better understand the test statistic method, suppose we had a different $\bar{x}$ and everything else remained the same. Below is a table showing what would happen with the hypothesis test.

Case   $\bar{x}$   t      Decision
1      11.2        -2.4   Reject Ho
2      11.4        -1.2   Don't Reject Ho
3      11.6         0     Don't Reject Ho
4      11.8         1.2   Don't Reject Ho
5      12.0         2.4   Reject Ho

[Figure: t-distribution density h(t) with critical values at -2.03 and 2.03 and the five test statistics -2.4, -1.2, 0, 1.2, and 2.4 plotted; -2.4 and 2.4 fall in the "Reject Ho" regions.]

See the file, hyp_1sample_pic.xls, for how some of the calculations can be done in Excel. All values in red can be changed by the user to see the effect on the test statistic, critical values, and the hypothesis test decision.
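The table of cases above can also be reproduced outside of Excel. The following Python snippet is only a minimal sketch and is not part of hyp_1sample_pic.xls; the function names `t_statistic` and `decide` are hypothetical, and the critical value 2.03 is copied from the volleyball example rather than computed.

```python
import math


def t_statistic(xbar, mu0, s, n):
    """Test statistic t = (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))


def decide(t, critical):
    """Two-tail decision: reject Ho when t falls outside +/- critical."""
    return "Reject Ho" if abs(t) > critical else "Don't Reject Ho"


# Values from the volleyball example: Ho: mu = 11.6, s = 1, n = 36.
MU0, S, N = 11.6, 1.0, 36
T_CRIT = 2.03  # t_{0.025, 35}, taken from the notes (assumption: not recomputed here)

rows = []
for xbar in (11.2, 11.4, 11.6, 11.8, 12.0):
    t = t_statistic(xbar, MU0, S, N)
    rows.append((xbar, round(t, 2), decide(t, T_CRIT)))

for xbar, t, decision in rows:
    print(f"{xbar:5.1f} {t:6.2f}  {decision}")
```

Changing `S` or `N` in the sketch plays the same role as changing the red values in the spreadsheet.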
Make changes on your own so that you familiarize yourself with what happens if the sample size increases, the standard deviation changes, and so on. Note that this file can be used to help perform ANY hypothesis test for a population mean. P-values will be discussed later in this chapter.

Notes:
• Hypothesis testing is set up to try to find evidence (through a sample) against the null hypothesis (Ho). If enough evidence is found, we can conclude that the alternative hypothesis (Ha) is true (with a type I error probability of $\alpha$). Since $\beta$ (the probability of a type II error) is not controlled, we cannot set up hypothesis testing to go the other way.
• Notice that in the formula for t we put in the hypothesized value of $\mu$, $\mu_0$. We assume the null hypothesis to be true by doing this (remember the jury trial example). We put values from the sample ($\bar{x}$, s, n) into the test statistic to see if the sample mean is far enough from the hypothesized mean to conclude that the null hypothesis is incorrect.
• Why was Ho: $\mu = 11.6$ vs. Ha: $\mu \ne 11.6$?
  o In order for the theory behind all of this to work, we need the equal sign in Ho.
  o If some kind of new "action" is to be taken when a hypothesis is proved to be true, this hypothesis typically should be in Ha. This is because we can control the probability of making an error in our decision (i.e., $\alpha$ is specified).
    What would happen if Mikasa's volleyballs did not have an average radius of 11.6? Production of volleyballs would be stopped and the manufacturing process would be investigated to find the problem. This implies we should use Ha: $\mu \ne 11.6$.
    What would happen if Mikasa's volleyballs have an average radius of 11.6? The production of volleyballs would continue.
• The book uses the normal PDF instead of the t-distribution when $n \ge 30$ and $\sigma$ is known (Section 10.5).
  o The same problems with using the normal PDF version of the C.I. occur here; $\sigma$ is unknown in real-life applications.
  o Remember that for large samples ($n \ge 30$) the t-distribution is approximately a standard normal PDF.
  o IN THIS CLASS, WE WILL ONLY USE THE t-DISTRIBUTION!

Example: Volleyball quality control (hyp_volleyball_data.xls and hyp_1sample_pic.xls)

There are a few different ways Excel can be used to help do the hypothesis test. One way has already been shown in hyp_1sample_pic.xls.

Another way is to select TOOLS > DATA ANALYSIS from the main Excel menu bar and select the appropriate analysis tool. Before this can be done, a new column of data must be entered into the spreadsheet for the hypothesized value of $\mu$. Below is part of the spreadsheet with this information entered.

After this is entered, select TOOLS > DATA ANALYSIS from the main Excel menu bar. Select t-test: Paired Two Sample for Means. The hypothesis test performed for one population mean is actually a special case of another hypothesis test called "Paired Two Sample for Means". We will learn about this hypothesis test in Section 10.8.

After the correct option is chosen, select OK to bring up the t-test: Paired Two Sample for Means window. Below is the completed window.

Notice what is put in the Variable 2 Range: the column of hypothesized values. The Hypothesized Mean Difference value comes from the 0 in Ho: $\mu - 11.6 = 0$ vs. Ha: $\mu - 11.6 \ne 0$. Below is the output produced after selecting OK.

                               Volleyball Radius   Hypothesized value
Mean                           11.5000             11.6
Variance                       1.0000              0
Observations                   36                  36
Pearson Correlation            #DIV/0!
Hypothesized Mean Difference   0
df                             35
t Stat                         -0.6001
P(T<=t) one-tail               0.2762
t Critical one-tail            1.6896
P(T<=t) two-tail               0.5523
t Critical two-tail            2.0301

For example, $\bar{x}$, $s^2$, and n are given in their appropriate rows and columns. The test statistic is given in the t Stat row. The positive critical value is given in the t Critical two-tail row.
There are a few other things given in the table which we will discuss later.

Chris Malone's Excel Instructions - From the website (http://www.statsteacher.com/excel), select Analyses > Mean: One Sample to find help on how to use the Data Analysis tool in Excel to perform the hypothesis tests discussed in this chapter.

3) The p-value method - 5 steps

The test statistic method compared t and the critical values from the t-distribution. The p-value method compares probabilities. Thus, we have the "p"-value method.

1. State Ho and Ha

2. Find the p-value

Find the test statistic $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ and compute the p-value as $2P(T > |t|)$, where T is a random variable having a t-distribution with $\nu = n-1$ degrees of freedom. The Excel function is 2*TDIST(ABS(t), df, 1).

[Figure: t-distribution density h(t) with the area to the right of |t| shaded.]

The p-value gives the probability of finding a value of |t| at least this great assuming the null hypothesis is true.

The probability, $P(T > |t|)$, is multiplied by two since the disagreement between the data and Ho can be in two directions; i.e., in the two tails of the PDF.

Remember that the p-value is just a probability found through integration! Here, the p-value is

$2\int_{|t|}^{\infty} \frac{\Gamma\left((\nu+1)/2\right)}{\Gamma(\nu/2)\sqrt{\pi\nu}} \left(1 + \frac{u^2}{\nu}\right)^{-(\nu+1)/2} du$

3. State $\alpha$

4. Reject or do not reject Ho

Reject Ho if p-value < $\alpha$
Don't reject Ho if p-value $\ge \alpha$

Remember $\alpha$ is also called "the level of significance".

5. Conclusion - Describe what 4. means in terms of the original problem.

Example: Volleyball quality control (hyp_volleyball_data.xls and hyp_1sample_pic.xls)

1. Ho: $\mu = 11.6$
   Ha: $\mu \ne 11.6$
2. $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{11.5 - 11.6}{1/6} = -0.6$

   $2P(T > |-0.6|) = 2P(T > 0.6) = 2 \times 0.2762 = 0.5524$ where $\nu = n - 1 = 35$.

   The probability of observing a test statistic value at least this great in magnitude, $|-0.6|$, is 0.5524 if Ho: $\mu = 11.6$ is true. Therefore, this is a likely event to happen if $\mu = 11.6$.
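Because the p-value is just an integral of the t-distribution PDF, the value 0.5524 above can be checked numerically without Excel. The sketch below is one way to do this in Python using only the standard library; it is an illustration rather than the method used in the notes. The helper names `t_pdf` and `two_tail_p_value` are hypothetical, and Simpson's rule over $[0, |t|]$ is an assumed choice of numerical integration (the two-tail area is one minus twice that central area, by symmetry).

```python
import math


def t_pdf(t, nu):
    """PDF of the t-distribution with nu degrees of freedom."""
    coef = math.gamma((nu + 1) / 2) / (math.gamma(nu / 2) * math.sqrt(math.pi * nu))
    return coef * (1 + t * t / nu) ** (-(nu + 1) / 2)


def two_tail_p_value(t, nu, steps=2000):
    """2*P(T > |t|), found as 1 - 2 * (area under the PDF from 0 to |t|).

    The area is approximated with Simpson's rule; steps must be even.
    """
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, nu) + t_pdf(b, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    central_area = total * h / 3
    return 1 - 2 * central_area


# Volleyball example: t = -0.6 with nu = 35 degrees of freedom.
p_value = two_tail_p_value(-0.6, 35)
print(round(p_value, 4))  # the notes report 0.5524
```

The decision rule is then a single comparison of `p_value` with $\alpha$.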
   Another way to think about the p-value is the following: If $\mu$ really was 11.6, then a test statistic value, t, at least this large in absolute value (0.6) would occur about 55% of the time if the hypothesis test process (take a new sample and perform a new hypothesis test) is repeated a very large number of times. In other words, this is likely to occur if $\mu = 11.6$. Thus, $\mu$ could be 11.6 since this is a likely event.

   Finding the p-value using integration:

   $2\int_{|-0.6|}^{\infty} \frac{\Gamma\left((35+1)/2\right)}{\Gamma(35/2)\sqrt{35\pi}} \left(1 + \frac{t^2}{35}\right)^{-(35+1)/2} dt = 2 \times 0.2762 = 0.5524$

   In Maple,

   > f(t):=GAMMA((nu+1)/2)/(GAMMA(nu/2) * sqrt(Pi*nu)) * (1+t^2/nu)^(-(nu+1)/2);
   > 2*int(eval(f(t),nu=35),t=abs(-0.6)..infinity);
                                 .5523713960

3. $\alpha = 0.05$
4. Since 0.5524 > 0.05, do not reject Ho.
   [Figure: t-distribution density h(t) with the area 0.2762 to the right of |-0.6| shaded, compared with the area $\alpha/2 = 0.025$ in each tail.]
5. The sample does not provide enough evidence to suggest that the volleyballs are being made with the wrong radius.
   OR
   There is not sufficient evidence to conclude the population mean radius is different from 11.6.

Fill in the p-values for the table below:

Case   $\bar{x}$   t      Decision           p-value
1      11.2        -2.4   Reject Ho
2      11.4        -1.2   Don't Reject Ho
3      11.6         0     Don't Reject Ho
4      11.8         1.2   Don't Reject Ho
5      12.0         2.4   Reject Ho

See the file, hyp_1sample_pic.xls, for how the p-value calculations can be done in Excel. Also, make changes on your own to the values in red so that you familiarize yourself with what happens if the sample size increases, the standard deviation changes, and so on.

Question: How can the power be increased for a hypothesis test?

Make sure you can do the hypothesis test problems with EACH method. Remember all three hypothesis test methods give the same answers for tests involving $\mu$ and the t-distribution.

Understanding the type I error rate

Suppose Ho is true and the type I error rate is denoted by $\alpha$.
If the hypothesis testing procedure is repeated R times (take a new sample and perform a new hypothesis test), we would expect $\alpha R$ of the hypothesis tests to incorrectly reject Ho.

Example: CI_ex.xls from Chapter 9

Using the CLT_GPA_ex.xls file, confidence intervals for the population mean are calculated for each of the 1,000 samples. If $\alpha = 0.05$, we would expect approximately 5% of the confidence intervals to NOT contain the population mean. The proportion that contain the population mean is 94.7%. Thus, the observed "type I error rate" is 5.3%. If this procedure was repeated many more than 1,000 times, the observed type I error rate would be expected to move closer to 5%.

Notes:
• The hypothesis tests done so far are often called t-tests since the t-distribution is used in the test.
• Use hyp_1sample_pic.xls as a template for some of the calculations needed for the hypothesis tests!
• One-tail tests - The hypothesis tests performed so far have been of the form Ho: $\mu = \mu_0$ vs. Ha: $\mu \ne \mu_0$, where $\mu_0$ is just a number like 11.6cm. In order to reject Ho, the test statistic must be too big OR too small (i.e., there are two rejection regions). These kinds of hypothesis tests are called two-tail tests because the rejection region falls in two tails of the PDF. There are also ONE-TAIL tests of the form:

  Test                                             Name
  Ho: $\mu \ge \mu_0$, Ha: $\mu < \mu_0$           Left-tail
  Ho: $\mu \le \mu_0$, Ha: $\mu > \mu_0$           Right-tail

  The discussion of these types of tests will be postponed until p. 10.67.

10.8: Two Samples: Tests on Two Means

In Sections 9.8 and 9.9, we examined two different cases of estimating the difference between two means:
i) Independent samples from two populations
ii) Dependent (paired) samples from two populations

Both of these situations will be discussed with respect to hypothesis testing here. Similar to Chapter 9, we will only work with the realistic situation of unknown $\sigma_1^2$ and $\sigma_2^2$, with $\sigma_1^2$ possibly unequal to $\sigma_2^2$, for independent samples.
For the dependent samples, we will only work with the case where the population variance is unknown as well.

Example: Dividend Yield (div_yield.xls from Chapter 9)

Is there a difference in the average dividend yield of companies traded on the NYSE vs. NASDAQ? (One reason there could be a difference: the types of companies traded on the two stock exchanges.) Perform a hypothesis test to determine if there is a difference.

[Figure: plot of the dividend yields from Chapter 9.]

Do you think there is a difference between the mean dividend yields for all companies traded on the stock exchanges? Try to make an initial judgment based on the plots.

Suppose 1 = NYSE and 2 = NASDAQ.

C.I. method using $\alpha = 0.05$:
1) Ho: $\mu_1 - \mu_2 = 0$
   Ha: $\mu_1 - \mu_2 \ne 0$
2) The 95% C.I. is $-0.0070 < \mu_1 - \mu_2 < 0.0245$ (from Chapter 9)
3) Do not reject Ho since 0 is in the interval.
4) There is not sufficient evidence to indicate a difference between the mean dividend yields for companies traded on the two stock exchanges.

Of course, you could also do the hypothesis test using the test statistic and p-value methods.

Test statistic: From Chapter 9, we saw that

$P\left(-t_{\alpha/2,\nu} < \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}} < t_{\alpha/2,\nu}\right) \approx 1-\alpha$

where

$\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}$

Thus, the test statistic to be used here replaces the random variables with their observed values for what is in the middle of the inequality above:

$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{\bar{x}_1 - \bar{x}_2 - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

Notice that $\mu_1 - \mu_2$ is replaced with 0. This is done since most often we will hypothesize Ho: $\mu_1 - \mu_2 = 0$, Ho: $\mu_1 - \mu_2 \le 0$, or Ho: $\mu_1 - \mu_2 \ge 0$.

P-value: $2P(T > |t|)$ where T has $\nu$ degrees of freedom for a two-tail test.

Example: Dividend Yield (div_yield.xls from Chapter 9)

Below are the screen captures from Chapter 9:

Test statistic method (using $\alpha = 0.05$):
1) Ho: $\mu_1 - \mu_2 = 0$
   Ha: $\mu_1 - \mu_2 \ne 0$
2) $t = \frac{\bar{x}_1 - \bar{x}_2 - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{0.0206 - 0.0118}{\sqrt{\frac{0.0319^2}{30} + \frac{0.0289^2}{30}}} = 1.1181$
3) $t_{0.025,57} = 2.0025$
4) [Figure: t-distribution with critical values at -2.0025 and 2.0025 and the test statistic 1.1181 plotted inside the "Don't Reject Ho" region.]
   Do not reject Ho since $-2.0025 < 1.1181 < 2.0025$.
5) There is not sufficient evidence to indicate a difference between the mean dividend yields for companies traded on the two stock exchanges.

P-value method:
1) Ho: $\mu_1 - \mu_2 = 0$
   Ha: $\mu_1 - \mu_2 \ne 0$
2) $2P(T > |1.1181|) = 0.2682$
3) $\alpha = 0.05$
4) Do not reject Ho since 0.2682 > 0.05.
5) There is not sufficient evidence to indicate a difference between the mean dividend yields for companies traded on the two stock exchanges.

Using the Data Analysis tool

Select TOOLS > DATA ANALYSIS > T-TEST: TWO-SAMPLE ASSUMING UNEQUAL VARIANCES to bring up the window below.

Notice the window has already been filled in with the appropriate entries for the hypothesis test. The Labels box was not checked because no labels were in the Variable 1 or 2 ranges. The reason no labels were included was due to how the data was entered into Excel (see the file). If the data was entered differently, the Labels option could be used.

Click OK to produce the output below. Note that variable 1 is NYSE and variable 2 is NASDAQ. It is usually good to label these in the output if the Labels option was not selected.

                               Variable 1   Variable 2
Mean                           0.0206       0.0118
Variance                       0.0010       0.0008
Observations                   30           30
Hypothesized Mean Difference   0
df                             57
t Stat                         1.1181
P(T<=t) one-tail               0.1341
t Critical one-tail            1.6720
P(T<=t) two-tail               0.2682
t Critical two-tail            2.0025

Notes:
• The same problems discussed earlier in Chapter 10 regarding the critical value and p-value calculation occur here.
• When you are doing a one-tail test, you need to be very careful with what you are testing (right or left tail test) and what Excel thinks you are testing.
  For example, notice the Variable 1 Range contains the NYSE data. This means Excel thinks I am testing NYSE - NASDAQ. One can easily highlight the NASDAQ range for the Variable 1 Range, but still think they are testing NYSE - NASDAQ!

Chris Malone's Excel Instructions

Help on how to perform this test in Excel is available on the Excel Instructions website. Select Analyses > Mean: Two Sample - Independent.

Hypothesis tests involving $\mu_D$ for dependent (paired) samples

Example: CPT (cpt.xls from Chapter 9)

A pharmaceutical company is conducting clinical trials on a new drug used to treat schizophrenia patients. Ten healthy male volunteers were given 3mg of the drug. Before the drug was administered (time=0) and 4 hours after (time=4), a psychometric test called the Continuous Performance Test (CPT) was administered and the number of "hits" was recorded for each volunteer.

From cpt.xls:

Notice that (time 0 hits) - (time 4 hits) was found ($D_i = X_{i1} - X_{i2}$ where 1 = time 0 and 2 = time 4).

Perform a hypothesis test to determine if there was any type of an effect on the average hits at the different times.

C.I. method using $\alpha = 0.01$:
1) Ho: $\mu_D = 0$
   Ha: $\mu_D \ne 0$
2) The 99% C.I. is $1.4470 < \mu_D < 4.1530$
3) Reject Ho since 0 is not in the interval.
4) There is sufficient evidence to indicate a difference between the average hits at the two time periods.

Of course, you could also do the hypothesis test using the test statistic and p-value methods.

Test statistic: Note that

$P\left(-t_{\alpha/2,\nu} < \frac{\bar{D} - \mu_D}{S_D/\sqrt{n}} < t_{\alpha/2,\nu}\right) = 1-\alpha$

where $\bar{D}$ is the random variable for the sample mean of the differences and $\nu = n-1$. Thus, the test statistic to be used here replaces the random variables with their observed values for what is in the middle of the inequality above:

$t = \frac{\bar{d} - \mu_D}{s_D/\sqrt{n}} = \frac{\bar{d} - 0}{s_D/\sqrt{n}}$

Notice that $\mu_D$ is replaced with 0. This is done since most often we will hypothesize Ho: $\mu_D = 0$, Ho: $\mu_D \le 0$, or Ho: $\mu_D \ge 0$.
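To see that the C.I. method and the test statistic method lead to the same decision for paired data, the CPT calculations can be sketched in Python from summary values alone. This is only an illustration under stated assumptions: $\bar{d} = 2.8$, $s_D = 1.3166$, $n = 10$, and $t_{0.005,9} = 3.2498$ are the (rounded) values reported in these notes for cpt.xls, and the variable names are hypothetical.

```python
import math

# Summary values reported in the notes for the CPT example (rounded).
d_bar = 2.8      # sample mean of the differences (time 0 - time 4)
s_d = 1.3166     # sample standard deviation of the differences
n = 10           # number of pairs
t_crit = 3.2498  # t_{0.005, 9}, copied from the notes for alpha = 0.01

std_err = s_d / math.sqrt(n)

# C.I. method: reject Ho: mu_D = 0 when 0 is outside the interval.
lower = d_bar - t_crit * std_err
upper = d_bar + t_crit * std_err
reject_by_ci = not (lower < 0 < upper)

# Test statistic method: reject Ho when |t| exceeds the critical value.
t_stat = (d_bar - 0) / std_err
reject_by_t = abs(t_stat) > t_crit

print(f"{lower:.4f} < mu_D < {upper:.4f}")
print(f"t = {t_stat:.4f}")
print("C.I. method rejects Ho:", reject_by_ci)
print("Test statistic method rejects Ho:", reject_by_t)
```

Because $s_D$ is entered here to four decimal places, the printed t can differ in the last digit or two from the value Excel computes with the unrounded data.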
P-value: $2P(T > |t|)$ where T has $\nu = n - 1$ degrees of freedom for a two-tail test.

Example: CPT (cpt.xls from Chapter 9)

Test statistic method:
1) Ho: $\mu_D = 0$
   Ha: $\mu_D \ne 0$
2) $t = \frac{\bar{d}}{s_D/\sqrt{n}} = \frac{2.8}{1.3166/\sqrt{10}} = 6.7254$
3) $t_{0.005,9} = 3.2498$
4) Reject Ho since 6.7254 > 3.2498.
5) There is sufficient evidence to indicate a difference between the average hits for the two time periods. Thus, the drug is having an effect on the volunteers.

P-value method:
1) Ho: $\mu_D = 0$
   Ha: $\mu_D \ne 0$
2) $2P(T > |6.7254|) = 8.6 \times 10^{-5}$
3) $\alpha = 0.01$
4) Reject Ho since $8.6 \times 10^{-5} < 0.01$.
5) There is sufficient evidence to indicate a difference between the average hits for the two time periods. Thus, the drug is having an effect on the volunteers.

Using the Data Analysis tool

Select TOOLS > DATA ANALYSIS > T-TEST: PAIRED TWO SAMPLE FOR MEANS to bring up the window below.

Notice the window has already been filled in with the appropriate entries for the hypothesis test. The Labels box was checked because labels were in the Variable 1 and 2 ranges.

Click OK to produce the output below.

t-Test: Paired Two Sample for Means

                               Time 0      Time 4
Mean                           49.6        46.8
Variance                       84.04       91.29
Observations                   10          10
Pearson Correlation            0.9910
Hypothesized Mean Difference   0
df                             9
t Stat                         6.7254
P(T<=t) one-tail               4.301E-05
t Critical one-tail            2.8214
P(T<=t) two-tail               8.602E-05
t Critical two-tail            3.2498

Notes:
• The same problems discussed earlier in Chapter 10 regarding the critical value and p-value calculation occur here.
• When you are doing a one-tail test, you need to be very careful with what you are testing (right or left tail test) and what Excel thinks you are testing. For example, notice the Variable 1 Range contains the Time 0 data. This means Excel thinks difference = Time 0 - Time 4. One can easily highlight the Time 4 range for the Variable 1 Range, but still think they are testing difference = Time 0 - Time 4!

10.9: Choice of Sample Size for Testing Means

This section discusses how to choose a sample size in order to have a particular level of power ($1-\beta$). You are not responsible for the material in this section.

10.10: Graphical Methods for Comparing Means

We have already been doing this starting in Section 8.3. Box plots and dot plots are two useful tools which can be used here!

10.11: One Sample: Test on a Single Proportion

We examined C.I.s for a proportion, p, in Section 9.10. Now, we are going to formalize the discussion of making decisions about p using hypothesis tests.

The hypothesis test described in this section and the one in Section 10.12 are two places where performing a hypothesis test using the C.I.s discussed in Chapter 9 may not result in the same answer as in Chapter 10.

Example: HCV (HCV.xls from Chapter 9)

Excerpt from Tebbs, Bilder, and Moser (Communications in Statistics: Theory and Methods, 2003):

Hepatitis C (HCV) is a viral infection that causes cirrhosis and cancer of the liver. Since HCV is transmitted through contact with infectious blood, screening donors is important to prevent further transmission. With over 4.5 million people infected in the United States, the World Health Organization has projected that HCV will be a major burden on the US health care system before the year 2020.

In our paper, we used data from Liu et al. (Transfusion, 1997) to demonstrate a new statistical procedure. The authors reported results on 1,875 blood donors screened for HCV at the Blood Transfusion Service in Xuzhou (pronounced "Shu-zoh") City, China. There were 42 positive blood donors found.

Suppose Xuzhou City officials say that HCV prevalence is 0.01. What do you think about the correctness of their statement? Use $\alpha = 0.05$ to perform a hypothesis test in order to examine their statement.

Hypothesis test using the C.I. method:
1) Ho: $p = 0.01$
   Ha: $p \ne 0.01$
2) Using the Agresti-Coull C.I.,

   $\tilde{p} = \frac{x + z_{\alpha/2}^2/2}{n + z_{\alpha/2}^2} = \frac{42 + 1.96^2/2}{1875 + 1.96^2} = 0.0234$

   $\tilde{p} - z_{\alpha/2}\sqrt{\frac{\tilde{p}(1-\tilde{p})}{n + z_{\alpha/2}^2}} < p < \tilde{p} + z_{\alpha/2}\sqrt{\frac{\tilde{p}(1-\tilde{p})}{n + z_{\alpha/2}^2}}$

   $0.0234 - 1.96\sqrt{\frac{0.0234(1-0.0234)}{1875 + 1.96^2}} < p < 0.0234 + 1.96\sqrt{\frac{0.0234(1-0.0234)}{1875 + 1.96^2}}$

   $0.0165 < p < 0.0302$
3) Since 0.01 is not in the C.I., reject Ho.
4) There is sufficient evidence to conclude that the proportion of people in Xuzhou City with HCV is not 0.01.

Of course, you could also do the hypothesis test using the test statistic and p-value methods.

Test statistic: Suppose our null hypothesis is Ho: $p = p_0$, Ho: $p \le p_0$, or Ho: $p \ge p_0$. From Chapter 9, we saw that

$P\left(-z_{\alpha/2} < \frac{\hat{P} - p}{\sqrt{p(1-p)/n}} < z_{\alpha/2}\right) \approx 1-\alpha$

Thus, the test statistic to be used here replaces the random variables with their observed values for what is in the middle of the inequality above:

$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$

where $p_0$ is the hypothesized value of p.

Question: What are the critical values?

P-value: $2P(Z > |z|)$ for a two-tail test.

Example: HCV (HCV.xls from Chapter 9)

Suppose Xuzhou City officials say that HCV prevalence is 0.01. What do you think about the correctness of their statement? Use $\alpha = 0.05$ to perform a hypothesis test in order to examine their statement.

Hypothesis test using the test statistic method:
1) Ho: $p = 0.01$
   Ha: $p \ne 0.01$
2) Note that $\hat{p} = \frac{x}{n} = \frac{42}{1875} = 0.0224$

   $z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{0.0224 - 0.01}{\sqrt{0.01(1-0.01)/1875}} = 5.3964$
3) $z_{0.025} = 1.96$
4) Since 5.3964 > 1.96, reject Ho.
5) There is sufficient evidence to conclude that the proportion of people in Xuzhou City with HCV is not 0.01.

Hypothesis test using the p-value method:
1) Ho: $p = 0.01$
   Ha: $p \ne 0.01$
2) $2P(Z > |5.3964|) = 6.80 \times 10^{-8}$

   Remember that a p-value is found through integration. The p-value is

   $2P(Z > |5.3964|) = 2\int_{5.3964}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-z^2/2} dz = 6.80 \times 10^{-8}$

   In Maple, with f(z) denoting the standard normal PDF,

   > 2*int(f(z),z=5.3964..infinity);
                                 .6799127778 × 10^(-7)

3) $\alpha = 0.05$
4) Since $6.80 \times 10^{-8} < 0.05$, reject Ho.
5) There is sufficient evidence that the proportion of people in Xuzhou City with HCV is not 0.01.

Below are the results from HCV.xls.

[Excel output: HCV.xls]

Why can the C.I. method result in a different test outcome?

When deriving a C.I. for $p$, we used the following result

$$P\left(-z_{\alpha/2} < \frac{\hat{P} - p}{\sqrt{p(1-p)/n}} < z_{\alpha/2}\right) \approx 1 - \alpha$$

$$\Rightarrow P\left(\hat{P} - z_{\alpha/2}\sqrt{p(1-p)/n} < p < \hat{P} + z_{\alpha/2}\sqrt{p(1-p)/n}\right) \approx 1 - \alpha$$

This directly resulted in the C.I. of

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Notice that we needed to replace $p$ in $\sqrt{p(1-p)/n}$ with $\hat{p}$. This was because we could not use the parameter value to calculate the C.I. for the parameter itself! Using $\hat{p}$ in $\sqrt{p(1-p)/n}$ can cause problems, so we ended up using the Agresti-Coull C.I. of

$$\tilde{p} - z_{\alpha/2}\sqrt{\frac{\tilde{p}(1-\tilde{p})}{n + z_{\alpha/2}^2}} < p < \tilde{p} + z_{\alpha/2}\sqrt{\frac{\tilde{p}(1-\tilde{p})}{n + z_{\alpha/2}^2}}$$

where $\tilde{p} = \dfrac{x + z_{\alpha/2}^2/2}{n + z_{\alpha/2}^2}$.

For hypothesis testing, we will specify a hypothesized value of $p$. Thus, we can now use the interval of

$$\hat{p} - z_{\alpha/2}\sqrt{p_0(1-p_0)/n} < p < \hat{p} + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}$$

to do the hypothesis test, where $p_0$ is the hypothesized value of $p$. Using this interval will result in the same hypothesis test outcomes as the test statistic and p-value methods.

Example: HCV (HCV.xls)

In the previous output under "Hypothesis test", the interval $\hat{p} - z_{\alpha/2}\sqrt{p_0(1-p_0)/n} < p < \hat{p} + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}$ is calculated to be 0.0179 < p < 0.0269. Since 0.01 is not in the interval, reject Ho.

10.12: Two Samples: Tests on Two Proportions

We examined C.I.s for the difference of two proportions, $p_1 - p_2$, in Section 9.11. Now, we are going to formalize the discussion of making decisions about $p_1 - p_2$ using hypothesis tests.

The hypothesis test described in this section and the one in Section 10.11 are two places where performing a hypothesis test using the C.I.s discussed in Chapter 9 may not result in the same answer as in Chapter 10.
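Before moving to two proportions, the Section 10.11 calculations for the HCV data can be checked numerically. The Python sketch below uses only the standard library and recomputes the test statistic, the two-tail p-value, the Agresti-Coull C.I., and the interval built from $p_0$. The function names are made up for this illustration; they are not part of HCV.xls or of the Maple session.

```python
from math import erfc, sqrt
from statistics import NormalDist


def one_prop_z_test(x, n, p0):
    """Return (p_hat, z, two-tail p-value) for Ho: p = p0."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    # erfc(|z| / sqrt(2)) equals 2 * P(Z > |z|) for a standard normal Z
    p_value = erfc(abs(z) / sqrt(2))
    return p_hat, z, p_value


def agresti_coull_ci(x, n, alpha=0.05):
    """Agresti-Coull interval: uses p-tilde in the standard error."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_tilde = (x + z**2 / 2) / (n + z**2)
    half_width = z * sqrt(p_tilde * (1 - p_tilde) / (n + z**2))
    return p_tilde - half_width, p_tilde + half_width


def null_based_interval(x, n, p0, alpha=0.05):
    """Interval centered at p-hat that uses p0 in the standard error."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p0 * (1 - p0) / n)
    return x / n - half_width, x / n + half_width


# HCV data: 42 positives out of 1,875 donors, hypothesized p0 = 0.01
p_hat, z, p_value = one_prop_z_test(42, 1875, 0.01)
ac_low, ac_high = agresti_coull_ci(42, 1875)
nb_low, nb_high = null_based_interval(42, 1875, 0.01)

print(f"p_hat = {p_hat:.4f}, z = {z:.4f}, p-value = {p_value:.3e}")
print(f"Agresti-Coull C.I.:  {ac_low:.4f} < p < {ac_high:.4f}")
print(f"Interval using p0:   {nb_low:.4f} < p < {nb_high:.4f}")
```

The two intervals differ only in which value of $p$ goes into the standard error and where the interval is centered, which is why they need not lead to the same decision in every data set.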
Example: Larry Bird (bird_ch9.xls from Chapter 9)

                        Second
                  Made   Missed   Total
First   Made       251       34     285
        Missed      48        5      53
        Total      299       39     338

Consider the first free throw made as one population and first free throw missed as the second population. We are interested in estimating the second free throw probability of success (made) for each first free throw outcome population. Then

$$\hat{p}_1 = \frac{x_1}{n_1} = \frac{251}{285} = 0.8807 \quad \text{and} \quad \hat{p}_2 = \frac{x_2}{n_2} = \frac{48}{53} = 0.9057$$

Perform a hypothesis test to determine if the probability of success for the second free throw attempt is dependent on the outcome of the first free throw. Use $\alpha = 0.05$ to perform a hypothesis test in order to examine the statement.

Hypothesis test using the C.I. method:

1) Ho: $p_1 - p_2 = 0$   Ha: $p_1 - p_2 \neq 0$
2) Using the Agresti-Caffo C.I.:

$$\tilde{p}_1 = \frac{x_1 + 1}{n_1 + 2} = \frac{251 + 1}{285 + 2} = 0.8780 \quad \text{and} \quad \tilde{p}_2 = \frac{x_2 + 1}{n_2 + 2} = \frac{48 + 1}{53 + 2} = 0.8909$$

$$0.8780 - 0.8909 - 1.96\sqrt{\frac{0.8780(1-0.8780)}{285+2} + \frac{0.8909(1-0.8909)}{53+2}} < p_1 - p_2 < 0.8780 - 0.8909 + 1.96\sqrt{\frac{0.8780(1-0.8780)}{285+2} + \frac{0.8909(1-0.8909)}{53+2}}$$

$$-0.1035 < p_1 - p_2 < 0.0778$$

3) Since 0 is in the C.I., do not reject Ho.
4) There is not sufficient evidence to conclude that the outcome on the first throw has an effect on the outcome of the second free throw.

Of course, you could also do the hypothesis test using the test statistic and p-value methods.

Test statistic:

Suppose our null hypothesis is: Ho: $p_1 - p_2 = 0$ or Ho: $p_1 - p_2 \le 0$ or Ho: $p_1 - p_2 \ge 0$.

In Section 9.11, we used the following result for the C.I. called "Large sample C.I. for p1-p2" (first C.I. in that section):

$$P\left(-z_{\alpha/2} < \frac{(\hat{P}_1 - \hat{P}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}} < z_{\alpha/2}\right) \approx 1 - \alpha$$

Thus, the test statistic to be used here replaces the random variables with their observed values for what is in the middle of the inequality above:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}}$$

Notice that we still need to substitute the hypothesized parameter values for $p_1 - p_2$. Since the null hypothesis always includes $p_1 - p_2 = 0$, this means $p_1 = p_2$!
Call this common value of proportions $p_c$ (the book just calls it $p$, but this could be confused with $p$ in Section 10.11). The test statistic then becomes:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\dfrac{p_c(1-p_c)}{n_1} + \dfrac{p_c(1-p_c)}{n_2}}} = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{p_c(1-p_c)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where 0 was substituted for $p_1 - p_2$ in the numerator and $p_c$ was substituted for $p_1$ and $p_2$ in the denominator.

Now, we need to estimate $p_c$. Since $p_1 = p_2$ under Ho, we could say that this means both populations are the same (since this is the only parameter for both). Thus, we could then "pool" the results from both samples to form an estimator for $p_c$:

$$\hat{p}_c = \frac{x_1 + x_2}{n_1 + n_2}$$

The final test statistic becomes,

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Question: What are the critical values?

P-value: $2P(Z > |z|)$ for a two-tail test.

Example: Larry Bird (bird_ch9.xls)

Hypothesis test using the test statistic method:

1) Ho: $p_1 - p_2 = 0$   Ha: $p_1 - p_2 \neq 0$
2) Note that $\hat{p}_c = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{251 + 48}{285 + 53} = 0.8846$. Then

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.8807 - 0.9057 - 0}{\sqrt{0.8846(1-0.8846)\left(\dfrac{1}{285} + \dfrac{1}{53}\right)}} = -0.5222$$

3) $z_{0.025} = 1.96$
4) Since -1.96 < -0.5222 < 1.96, do not reject Ho.
5) There is not sufficient evidence to conclude that the outcome on the first throw has an effect on the outcome of the second free throw.

Hypothesis test using the p-value method:

1) Ho: $p_1 - p_2 = 0$   Ha: $p_1 - p_2 \neq 0$
2) $2P(Z > |-0.5222|) = 0.6015$
3) $\alpha = 0.05$
4) Since 0.6015 > 0.05, do not reject Ho.
5) There is not sufficient evidence to conclude that the outcome on the first throw has an effect on the outcome of the second free throw.

Below are the results from bird_ch9.xls.

[Excel output: bird_ch9.xls]

Questions: Are first and second free throw attempts independent?

Suppose the purpose of the problem was changed to: Perform a hypothesis test to determine if the probability of success for the second free throw attempt decreases if the first throw is missed rather than made.
Note that this is what most basketball fans probably think. How would you perform the test?

1) Ho: $p_1 - p_2 \le 0$   Ha: $p_1 - p_2 > 0$

Notice that in Ha, $p_1 > p_2$. Thus, the probability of success on the second free throw given the first one is made is greater than the probability of success on the second free throw given the first one is missed. If we reject Ho to conclude Ha is true, the probability of this being incorrect (type I error) is $\alpha$.

2) $z = -0.5222$
3) $+z_{0.05} = +1.645$
4) Since -0.5222 < 1.645, do not reject Ho.
5) There is not sufficient evidence to conclude that the probability of success on the second throw decreases when the first free throw is missed rather than made.

The p-value for the above test would be $P(Z > -0.5222) = 0.6992$, which is found with the Excel function 1-NORMDIST(-0.5222,0,1,TRUE).

10.13: One- and Two-Sample Tests Concerning Variances

These lecture notes are available on the schedule web page of the course website.

10.14: Goodness-of-Fit Test

These lecture notes are available on the schedule web page of the course website. Note that the confidence interval method is not available for this hypothesis test!

Back to Sections 10.1-10.7

One-tail tests

The hypothesis tests done so far have been of the form:

Ho: $\mu = \mu_0$ vs. Ha: $\mu \neq \mu_0$

where $\mu_0$ is just a number like 11.6cm. In order to reject Ho, the test statistic is too big OR too small (i.e., there are two rejection regions). These kinds of hypothesis tests are called two-tail tests because the rejection region falls in two tails of the PDF.

Now we are going to discuss ONE-TAIL tests:

Test                                        Name
Ho: $\mu \ge \mu_0$   Ha: $\mu < \mu_0$     Left-tail
Ho: $\mu \le \mu_0$   Ha: $\mu > \mu_0$     Right-tail

To reject Ho for left-tail tests, the test statistic must be less than the critical value, i.e., on the left side of the PDF. To reject Ho for right-tail tests, the test statistic must be greater than the critical value, i.e., on the right side of the PDF.

1) The Confidence Interval Method - 4 Steps

1. State Ho and Ha
2.
Find the "one-sided" C.I. for $\mu$

Test                                        Name         $(1-\alpha)100\%$ C.I.
Ho: $\mu \ge \mu_0$   Ha: $\mu < \mu_0$     Left-tail    $\mu < \bar{x} + t_{\alpha,n-1}\, s/\sqrt{n}$
Ho: $\mu \le \mu_0$   Ha: $\mu > \mu_0$     Right-tail   $\mu > \bar{x} - t_{\alpha,n-1}\, s/\sqrt{n}$

For example, the C.I. for the left-tail test gives an interval such as $(-\infty, 2)$. Therefore, we have an upper bound on the value of $\mu$. If Ho: $\mu \ge 3$ and Ha: $\mu < 3$, then the C.I. says that $\mu$ is less than 3.

Note: What happens if the direction of the interval were reversed? One could never reject Ho! That is why the interval is in this direction.

3. Reject or do not reject Ho: Check if the hypothesized value of $\mu$ is inside the interval.
4. Conclusion: Describe what step 3 means in terms of the original problem.

Example: Tire life

A consumer group is concerned about a manufacturer's claim that their tires last on average at least 22,000 miles. A sample of 100 tires is taken and the number of miles each lasted is recorded. The sample mean was 21,819 miles and the sample standard deviation was 1,295 miles. Perform a hypothesis test to see if there is evidence to disprove the manufacturer's claim using a type I error rate of 0.01.

1. Ho: $\mu \ge 22,000$   Ha: $\mu < 22,000$

In order to disprove the claim, $\mu < 22,000$ needs to be in Ha. Then $\alpha$ = P(reject Ho | Ho is true) = P(sample says $\mu < 22,000$ | $\mu \ge 22,000$) = 0.01. Thus, I am controlling the probability of making this type of error!

2. Find the "one-sided" C.I. for $\mu$:

$$\bar{x} + t_{\alpha,n-1}\frac{s}{\sqrt{n}} = 21,819 + 2.364 \cdot \frac{1,295}{\sqrt{100}}$$

where TINV(0.01*2, 99) is used to find the critical value. Notice that 0.01 is NOT used in the TINV() function. See Chapter 9's discussion on the use of this function.

Therefore, the 99% confidence interval is $-\infty < \mu < 22,125.14$.

3. Since 22,000 is in the C.I., do not reject Ho.
4. There is not sufficient evidence to disprove the manufacturer's tire life claim.

2) The Test Statistic Method - 5 Steps

1. State Ho and Ha
2. Find the test statistic
3. Find the critical value

Test                                        Name         Critical Value
Ho: $\mu \ge \mu_0$   Ha: $\mu < \mu_0$     Left-tail    $-t_{\alpha,n-1}$
Ho: $\mu \le \mu_0$   Ha: $\mu > \mu_0$     Right-tail   $+t_{\alpha,n-1}$

4. Reject or do not reject Ho
Left-tail: reject Ho if $t < -t_{\alpha,n-1}$ (rejection region in the left tail).
Right-tail: reject Ho if $t > +t_{\alpha,n-1}$ (rejection region in the right tail).
Write reject or don't reject Ho and provide a reason.
5. Conclusion

Example: Tire life

1. Ho: $\mu \ge 22,000$   Ha: $\mu < 22,000$
2. $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{21,819 - 22,000}{1,295/\sqrt{100}} = -1.4$
3. $-t_{\alpha,n-1} = -t_{0.01,99} = -2.364$
4. Since -1.4 > -2.364, don't reject Ho.
5. There is not sufficient evidence to disprove the manufacturer's tire life claim.

3) The P-value Method - 5 Steps

1. State Ho and Ha
2. Find the p-value
   a) Compute the test statistic
   b) Find the p-value

Test                                        Name         p-value
Ho: $\mu \ge \mu_0$   Ha: $\mu < \mu_0$     Left-tail    $P(T < t)$
Ho: $\mu \le \mu_0$   Ha: $\mu > \mu_0$     Right-tail   $P(T > t)$

For right-tail (left-tail) tests, this gives the probability of finding a value of t at least this great (small) assuming Ho is true.

Note: These are one-tail tests so only the probability for one tail is needed.

3. State $\alpha$
4. Reject or do not reject Ho
Reject Ho if p-value < $\alpha$ and do not reject if p-value $\ge \alpha$.
Example of don't reject for right-tail test: [figure]
5. Conclusion

Example: Tire life

1. Ho: $\mu \ge 22,000$   Ha: $\mu < 22,000$
2. $P(T < -1.4) = 0.0823$

To find this value in Excel, use the symmetry property of the t-distribution because Excel will not find t-distribution probabilities associated with negative values of the test statistic. The function used is TDIST(1.4, 99, 1).

Remember that the p-value is found through integration:

$$\int_{-\infty}^{-1.4} \frac{\Gamma\left((99+1)/2\right)}{\Gamma(99/2)\sqrt{99\pi}} \left(1 + \frac{t^2}{99}\right)^{-(99+1)/2} dt = 0.0823$$

> f(t):=GAMMA((nu+1)/2)/(GAMMA(nu/2) * sqrt(Pi*nu)) * (1+t^2/nu)^(-(nu+1)/2);

$$f(t) := \frac{\Gamma\left(\frac{\nu}{2} + \frac{1}{2}\right)\left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu}{2} - \frac{1}{2}}}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi\nu}}$$

> int(eval(f(t),nu=99),t=-infinity..-1.4);
.08231967967

3. $\alpha = 0.01$
4. Since 0.0823 > 0.01, don't reject Ho.
5. There is not sufficient evidence to disprove the manufacturer's tire life claim.
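The three tire life calculations above can also be reproduced without Excel or Maple. The following Python sketch integrates the t density numerically (Simpson's rule), in the same spirit as the Maple integral, and finds the critical value by bisection. It is only an illustration under those numerical choices: the helper names t_pdf, t_cdf, and t_upper_quantile are invented here, and the step counts and search range are assumptions rather than anything prescribed by the notes.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(t, nu):
    """Density of the t-distribution with nu degrees of freedom."""
    const = exp(lgamma((nu + 1) / 2) - lgamma(nu / 2)) / sqrt(pi * nu)
    return const * (1 + t * t / nu) ** (-(nu + 1) / 2)


def t_cdf(x, nu, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3  # approximates P(0 < T < |x|)
    return 0.5 + area if x > 0 else 0.5 - area


def t_upper_quantile(alpha, nu):
    """Find t with P(T > t) = alpha by bisection (for 0 < alpha < 0.5)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, nu) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Tire life: Ho: mu >= 22,000 vs. Ha: mu < 22,000 with alpha = 0.01
x_bar, s, n, mu0, alpha = 21819, 1295, 100, 22000, 0.01
se = s / sqrt(n)

t_stat = (x_bar - mu0) / se                 # test statistic
t_crit = t_upper_quantile(alpha, n - 1)     # t_{0.01, 99}
upper_bound = x_bar + t_crit * se           # one-sided C.I. upper bound
p_value = t_cdf(t_stat, n - 1)              # left-tail p-value P(T < t)

print(f"t = {t_stat:.4f}, -t_crit = {-t_crit:.4f}")
print(f"C.I.: -infinity < mu < {upper_bound:.2f}")
print(f"p-value = {p_value:.4f}")
print("reject Ho" if p_value < alpha else "do not reject Ho")
```

Because the sketch uses the unrounded test statistic and critical value, its output can differ slightly in the later decimal places from the hand calculations, which round t to -1.4 and the critical value to 2.364.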
Note: P-value interpretation: If $\mu$ is really 22,000 in the population, then a test statistic value, t, at least as small as the one observed would occur about 8.23% of the time if the hypothesis test process (take a new sample and perform a new hypothesis test) is repeated a large number of times. Thus, it may occur about 8 times out of 100. This is borderline with regard to it being a likely event, and it is why the p-value is close to the level of significance, $\alpha = 0.01$. Often, people will say this is "marginal evidence" against Ho.

Example: Volleyball quality control (hyp_volleyball_data.xls)

Be VERY careful with the Excel calculation here!

Suppose Ho: $\mu \ge 11.6$ vs. Ha: $\mu < 11.6$ is being tested with $\alpha = 0.05$. The same Excel output as before would be produced using the t-test Paired Two Sample for Means window.

                               Volleyball Radius   Hypothesized value
Mean                           11.5000             11.6
Variance                       1.0000              0
Observations                   36                  36
Pearson Correlation            #DIV/0!
Hypothesized Mean Difference   0
df                             35
t Stat                         -0.6001
P(T<=t) one-tail               0.2762
t Critical one-tail            1.6896
P(T<=t) two-tail               0.5523
t Critical two-tail            2.0301

The "one-tail test" p-value is given by Excel to be 0.2762 and the critical value given is 1.6896. How does Excel know if we want a left-tail or right-tail test? We never specified it!

This is what Excel does:

Excel will always give a positive critical value. You need to realize that it is a left-tail test and a negative critical value is needed.

Excel calculates the p-value as P(T<t) if t<0 and P(T>t) if t>0. In this case, Excel calculates the p-value correctly for this left-tail test. However, if the hypotheses were switched to Ho: $\mu \le 11.6$ vs. Ha: $\mu > 11.6$, Excel would still calculate the p-value as 0.2762. The correct p-value would be 1-0.2762 = 0.7238 since we would want P(T>-0.6001) = 1-P(T<-0.6001).

Be careful on relying too much on the Excel output!!!
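The adjustment described above is simple enough to write down as a rule. The Python sketch below assumes Excel behaves exactly as stated in these notes (it reports P(T<t) when t<0 and P(T>t) when t>0, and always a positive critical value). The function names are hypothetical helpers for this illustration, not Excel or Data Analysis tool functions.

```python
def one_tail_p_value(excel_one_tail_p, t_stat, tail):
    """Convert Excel's reported one-tail p-value to the p-value for `tail`.

    Assumes Excel reports P(T < t) when t < 0 and P(T > t) when t > 0,
    as described in the notes. `tail` is "left" (Ha: mu < mu0) or
    "right" (Ha: mu > mu0).
    """
    if tail not in ("left", "right"):
        raise ValueError("tail must be 'left' or 'right'")
    excel_tail = "left" if t_stat < 0 else "right"
    if tail == excel_tail:
        return excel_one_tail_p      # Excel's tail matches Ha
    return 1 - excel_one_tail_p      # Excel used the opposite tail


def signed_critical_value(excel_t_critical, tail):
    """Attach the sign Excel leaves off: negative for a left-tail test."""
    magnitude = abs(excel_t_critical)
    return -magnitude if tail == "left" else magnitude


# Volleyball output: t Stat = -0.6001, one-tail p = 0.2762, t Critical = 1.6896
left_p = one_tail_p_value(0.2762, -0.6001, "left")
right_p = one_tail_p_value(0.2762, -0.6001, "right")
left_crit = signed_critical_value(1.6896, "left")

print(f"left-tail p-value  = {left_p:.4f}, critical value = {left_crit:.4f}")
print(f"right-tail p-value = {right_p:.4f}")
```

Writing the rule this way makes the point of the example explicit: the direction of Ha, not the Excel output, determines which tail probability and which sign of the critical value to use.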
See the file, hyp_1sample_pic.xls, for how some of the one-tail calculations can be done in Excel without the Data Analysis tool.

[Excel output: hyp_1sample_pic.xls]

As mentioned before, all values in red can be changed by the user to see the effect on the test statistic, critical value, and the hypothesis test decision. Make changes on your own so that you familiarize yourself with what happens if the sample size increases, standard deviation changes,…

Example: Cavaliers (this is an old test problem)

Chevrolet has been advertising a 3-year, 36,000-mile warranty for its Cavaliers. The warranty covers the engine, transmission, and drive train for all new Cavaliers up to 3 years or 36,000 miles, whichever comes first. One Chevrolet dealer believes the drivers tend to reach 36,000 miles before 3 years of ownership. The dealer takes a random sample of 32 Cavalier owners producing the following statistics on number of miles driven after 3 years: $\bar{x} = 39,900$ and $s = 1,866$.

1. State the Type I and II errors for the hypotheses below.

Ho: $\mu \le 36,000$   Ha: $\mu > 36,000$

Type I: Reject Ho, but Ho is true.

Reject $\mu \le 36,000$, but $\mu$ really is $\le 36,000$.

The sample leads you to believe that the average miles driven is greater than 36,000 (i.e., $\mu \le 36,000$ is incorrect), but in actuality the average miles driven is less than or equal to 36,000.

Notes: The probability of this happening is set at a level of $\alpha$. The probability of correctly rejecting Ho is $1-\beta$.

Type II: Do not reject Ho, but Ha is really true.

Do not reject $\mu \le 36,000$, but $\mu$ really is > 36,000.

The sample does not give you enough evidence to conclude that the average miles driven is greater than 36,000, but the average miles driven really is greater than 36,000.

Notes: The probability of this happening is $\beta$. This probability is not controlled. Thus, given that $\mu$ really is > 36,000, the probability of committing this Type II error could be 0, 0.1, 0.2, 0.99, …, or 1.
The probability of correctly concluding do not reject Ho when Ho is really true is $1-\alpha$.

Notice how specifying $\alpha$ and not specifying $\beta$ controls what goes into Ho and Ha!

2. Perform a hypothesis test at the significance level of ($\alpha$ =) 0.01 using the test statistic or p-value method.

Test statistic method:

1. Ho: $\mu \le 36,000$   Ha: $\mu > 36,000$
2. $t = \dfrac{39,900 - 36,000}{1,866/\sqrt{32}} = 11.8230$
3. $t_{0.01,31} = 2.453$
4. Since 11.8230 > 2.453, reject Ho.
5. There is sufficient evidence to show that the average miles driven in 3 years is greater than 36,000 miles.

Make sure you can draw a picture of the t-distribution for this example!

P-value method:

1. Ho: $\mu \le 36,000$   Ha: $\mu > 36,000$
2. $t = \dfrac{39,900 - 36,000}{1,866/\sqrt{32}} = 11.8230$

p-value = $P(T > t) = P(T > 11.8230) = 2.544 \times 10^{-13}$

P-value interpretation: If $\mu$ is really 36,000 in the population, then a test statistic value, t, at least this large (11.82) would occur 0.00000000002544% of the time if the hypothesis test process (take a new sample and perform a new hypothesis test) is repeated a large number of times. This is VERY, VERY unlikely! Therefore, most likely $\mu$ really is NOT 36,000.

3. $\alpha = 0.01$
4. Since $2.544 \times 10^{-13} < 0.01$, reject Ho.
5. There is sufficient evidence to show that the average miles driven in 3 years is greater than 36,000 miles.

C.I. method: Do on your own!

3. From the customers' viewpoints, should they expect to have their Cavaliers under warranty for 3 years? Explain.

No, since the average number of miles driven is greater than 36,000. Thus, their warranty will expire before 3 years of ownership is reached on average.

Note: Students usually say deciding what goes into Ho and Ha is the toughest part of hypothesis testing! Note that the equality part always leads to what is in Ho. Also, let the control of a type I error guide you to what goes in Ho or Ha.

From Section 10.8

Example: CPT (cpt.xls in Chapter 9)

Suppose the problem asked you to determine if the average hits decrease. If this happened, it could mean:
o drug causes drowsiness
o drug causes blurred vision
o some other effect

In this case, we would have a one-tail test.

Test statistic method:

1) Ho: $\mu_D \le 0$   Ha: $\mu_D > 0$; average hits go down over time

Notice that the purpose of the problem was to determine if the hits decrease. The only way we can determine this is to put it in Ha. Think about what the possible conclusions could be:

Reject Ho: There is sufficient evidence to indicate the average hits decrease over the two time periods. The probability of making a type I error here (reject Ho, but Ho is true) is $\alpha = 0.01$.

Do not reject Ho: There is not sufficient evidence to indicate the average hits have decreased. Notice that this does NOT say $\mu_1 - \mu_2 \le 0$.

2) $t = \dfrac{\bar{d}}{s_D/\sqrt{n}} = \dfrac{2.8}{1.3166/\sqrt{10}} = 6.7254$
3) $+t_{0.01,9} = +2.8214$
4) Reject Ho since 6.7254 > 2.8214.
5) There is sufficient evidence to indicate the average hits decrease over the two time periods.

P-value method:

1) Ho: $\mu_D \le 0$   Ha: $\mu_D > 0$
2) $P(T > 6.7254) = 4.3 \times 10^{-5}$
3) $\alpha = 0.01$
4) Reject Ho since $4.3 \times 10^{-5} < 0.01$.
5) There is sufficient evidence to indicate the average hits decrease over the two time periods.

From Section 10.11-10.12

What adjustments need to be made to the hypothesis testing procedures in these sections for one-tail tests?

Summary of hypothesis testing steps

For the test statistic method:
1) State Ho and Ha
2) Calculate the test statistic
3) State the critical value
4) Decide whether or not to reject Ho
5) State a conclusion in terms of the problem

For the p-value method:
1) State Ho and Ha
2) Calculate the p-value
3) State $\alpha$
4) Decide whether or not to reject Ho
5) State a conclusion in terms of the problem

For the C.I. method:
1) State Ho and Ha
2) Calculate the C.I.
3) Decide whether or not to reject Ho.
4) State a conclusion in terms of the problem
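As one illustration of the test statistic and p-value steps summarized above, the Python sketch below redoes the Larry Bird pooled z test from Section 10.12 and then reuses the same statistic for the right-tail version of the test. It suggests that, for the proportion tests, moving to a one-tail test changes the critical value and the tail probability but not the test statistic itself. Only the standard library is used, and the function names are invented for this example.

```python
from math import erfc, sqrt
from statistics import NormalDist


def pooled_two_prop_z(x1, n1, x2, n2):
    """Pooled z statistic for a null hypothesis that includes p1 - p2 = 0."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pc_hat = (x1 + x2) / (n1 + n2)  # pooled estimate of the common p
    se = sqrt(pc_hat * (1 - pc_hat) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


def upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))


alpha = 0.05
z = pooled_two_prop_z(251, 285, 48, 53)

# Two-tail test, Ha: p1 - p2 != 0
two_tail_crit = NormalDist().inv_cdf(1 - alpha / 2)
two_tail_p = 2 * upper_tail(abs(z))

# Right-tail test, Ha: p1 - p2 > 0 (same z, different tail and critical value)
right_tail_crit = NormalDist().inv_cdf(1 - alpha)
right_tail_p = upper_tail(z)

print(f"z = {z:.4f}")
print(f"two-tail:   critical value = {two_tail_crit:.3f}, p-value = {two_tail_p:.4f}")
print(f"right-tail: critical value = {right_tail_crit:.3f}, p-value = {right_tail_p:.4f}")
```

A left-tail version would use the lower-tail probability, 1 - upper_tail(z), together with the negative of the right-tail critical value.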