Chapter 24: Comparing Means

Comparing Two Means
Population model: the parameter of interest is the difference between the means, $\mu_1 - \mu_2$.
The statistic of interest is the difference in the two observed means, $\bar{y}_1 - \bar{y}_2$.
For independent random variables, variances add.
If we know the population standard deviations: $SD(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
If we estimate them using the sample standard deviations: $SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

Comparing Two Means
The confidence interval is called a two-sample t-interval.
The hypothesis test is called a two-sample t-test.
The interval has the form $(\bar{y}_1 - \bar{y}_2) \pm ME$, where $ME = t^* \times SE(\bar{y}_1 - \bar{y}_2)$.

A Sampling Distribution for the Difference Between Two Means
When the conditions are met, the standardized sample difference between the means of two independent groups,
$t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{SE(\bar{y}_1 - \bar{y}_2)}$,
can be modeled by a Student's t-model with a number of degrees of freedom found with a special formula.
We estimate the standard error with $SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$.

Assumptions and Conditions
Independence Assumption
    Randomization Condition
        Surveys: representative random samples
        Experiments: randomized
    10% Condition
Normal Population Assumption
    Nearly Normal Condition: check both samples. Draw pictures!
Independent Groups Assumption
    Think about how the data were collected.

Two-sample t-interval
When the conditions are met, find the confidence interval for the difference between the means of two independent groups.
Since the standard error of the difference is $SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, the interval is
$(\bar{y}_1 - \bar{y}_2) \pm t^*_{df} \times SE(\bar{y}_1 - \bar{y}_2)$.
The critical value $t^*_{df}$ depends on the confidence level C that you specify and on the number of degrees of freedom, which we get from the sample sizes and a special formula.
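The standard error and the degrees of freedom can be sketched in a few lines of Python. This is a minimal sketch, not the slides' own method: it assumes the "special formula" is the Welch-Satterthwaite approximation, which is the usual choice for the unpooled two-sample t procedure. The function names and the summary statistics in the usage lines are hypothetical, and the critical value t_star is passed in by the caller (from a t table or software) rather than computed.

```python
import math

def welch_se(s1, n1, s2, n2):
    """Standard error of ybar1 - ybar2 for two independent groups."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def welch_df(s1, n1, s2, n2):
    """Degrees of freedom, ASSUMING the Welch-Satterthwaite approximation."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def two_sample_t_interval(ybar1, ybar2, s1, n1, s2, n2, t_star):
    """(ybar1 - ybar2) +/- t_star * SE; t_star is supplied by the caller."""
    diff = ybar1 - ybar2
    me = t_star * welch_se(s1, n1, s2, n2)
    return diff - me, diff + me

# Hypothetical summary statistics, for illustration only.
df = welch_df(2.0, 10, 2.0, 10)
# t_star = 2.101 is the approximate 95% critical value for 18 df from a t table.
low, high = two_sample_t_interval(5.0, 3.0, 2.0, 10, 2.0, 10, t_star=2.101)
print(round(df, 2), round(low, 3), round(high, 3))
```

With equal standard deviations and equal sample sizes, as in the illustrative call, the approximate degrees of freedom come out to n1 + n2 - 2. With unequal spreads or sizes they are smaller and usually not a whole number.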
Comparing Brand Name & Generic Batteries
Find the interval that is likely, with 95% confidence, to contain the true difference $\mu_G - \mu_B$ between the mean lifetime of the generic AA batteries and the mean lifetime of the brand-name batteries.
Data lists: L1: Brand Name, L2: Generic.
[Plot of the two samples]

Comparing Brand Name & Generic Batteries
Check the conditions:
Independent groups assumption: batteries manufactured by two different companies and taken from separate packages should be independent.
Randomization: the batteries were selected at random from those available for sale. This is not exactly an SRS, but it is a reasonably representative random sample.
Since the batteries come in packs, they may not be independent. Repeat the experiment for several packages of batteries.

Comparing Brand Name & Generic Batteries
Check the conditions:
10% condition: the number of sampled batteries is certainly less than 10% of all AA batteries manufactured by the companies.
Nearly Normal condition: the samples are small, but the histograms look unimodal and symmetric.
[Histograms: Brand Name (L1), Generic (L2)]

Comparing Brand Name & Generic Batteries
State the sampling distribution model for the statistic: under these conditions, the sampling distribution of the difference in the sample means can be modeled by a Student's t-model with about 9 degrees of freedom.
Choose your method: we will use a two-sample t-interval.
Calculator: STAT, TESTS, 2-SampTInt

Comparing Brand Name & Generic Batteries
Interpretation: tell what the confidence interval means.
We are 95% confident that the mean useful life of the generic batteries is between 2.1 minutes and 35.1 minutes longer than the mean useful life of the brand-name batteries for this task.
If generic batteries are cheaper, there seems little reason not to use them. If it is more trouble or costs more to buy them, then you should consider whether the additional performance is worth it.
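Because a two-sample t-interval has the form estimate plus or minus margin of error, the reported endpoints are enough to recover both pieces. The short sketch below does that arithmetic for the battery interval. The helper name is hypothetical, and nothing here uses the raw battery data, which the slides do not list.

```python
def center_and_margin(lower, upper):
    """Recover the point estimate and margin of error from interval endpoints."""
    center = (lower + upper) / 2   # observed difference in sample means
    margin = (upper - lower) / 2   # t* times the standard error
    return center, margin

# Endpoints reported for the generic-minus-brand-name interval, in minutes.
center, margin = center_and_margin(2.1, 35.1)
print(round(center, 1), round(margin, 1))
```

The observed difference in mean lifetimes is therefore about 18.6 minutes, with a margin of error of about 16.5 minutes.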
Testing the Difference Between Two Means
Two-sample t-test for the difference between the means of two independent groups:
The conditions for the two-sample t-test for the difference between the means of two independent groups are the same as for the two-sample t-interval.
We test the hypothesis $H_0: \mu_1 - \mu_2 = \Delta_0$, where the hypothesized difference $\Delta_0$ is almost always 0, using the statistic
$t = \frac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{SE(\bar{y}_1 - \bar{y}_2)}$.
The standard error is $SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$.

Camera Price Offers
State the null hypothesis: we want to know whether people offer a different amount for a used camera when buying from a friend than when buying from a stranger.
$H_0$: The difference between the mean price offered to friends and the mean price offered to strangers is zero: $\mu_F - \mu_S = 0$
$H_A$: The difference in mean price is not zero: $\mu_F - \mu_S \neq 0$
Check the plots. Data lists: L1: Friend, L2: Stranger.

Camera Price Offers
Check the conditions:
Independent groups assumption: randomizing the experiment gives us independent groups.
Randomization condition: the experiment was randomized. Subjects were assigned to treatment groups at random.
10% condition: this is a randomized experiment, so this condition does not apply.
Nearly Normal condition: histograms of the two sets of prices (L1, L2) are unimodal and symmetric.

Camera Price Offers
State the sampling distribution model of the statistic: because the conditions are satisfied, it is appropriate to model the sampling distribution of the difference in the means with a Student's t-model.
Choose your method: we will perform a two-sample t-test.

Camera Price Offers
Calculator: STAT, TESTS, 2-SampTTest, with output shown for both the Calculate and Draw options.

Camera Price Offers
Conclusion: the P-value tells us that if there were no difference in the mean prices, a difference at least as large as the one we observed would occur only about 0.6% of the time. That's too rare for most people to believe, so we reject the null hypothesis and conclude that people are likely to pay a friend a different amount for a used camera than they would pay a stranger.
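The test statistic and its two-sided P-value can be sketched with the standard library alone. This is an illustration under stated assumptions, not the calculator's algorithm: the degrees of freedom use the Welch-Satterthwaite approximation, and the P-value is obtained by numerically integrating the Student's t density with Simpson's rule, which is adequate for moderate values of t. All function names and the numbers in the usage lines are hypothetical, since the slides do not list the camera data.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom (df may be fractional)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """P(|T| >= |t|), using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3           # P(0 <= T <= |t|)
    return max(0.0, 1.0 - 2.0 * area)

def two_sample_t_test(ybar1, ybar2, s1, n1, s2, n2, delta0=0.0):
    """Return (t, df, two-sided P-value) for H0: mu1 - mu2 = delta0."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = math.sqrt(v1 + v2)
    # ASSUMPTION: Welch-Satterthwaite degrees of freedom.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    t = (ybar1 - ybar2 - delta0) / se
    return t, df, two_sided_p(t, df)

# Hypothetical summary statistics, for illustration only.
t, df, p = two_sample_t_test(5.0, 3.0, 2.0, 10, 2.0, 10)
print(round(t, 3), round(df, 1), round(p, 4))
```

A small P-value from such a calculation is read exactly as in the conclusion above: it is the chance, if the null hypothesis were true, of a difference at least as extreme as the one observed.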
A practical takeaway from the camera example: we may want to take special care not to pay too much when buying an item such as this from a friend.

Pooled t-test
If we are willing to assume that the variances of the two groups are equal, we can pool the data from the two groups to estimate the common variance and make the degrees of freedom formula much simpler.
We are still estimating the pooled standard deviation from the data, so we use Student's t-model, and the test is called a pooled t-test.

Pooled Variance t-test for the Difference Between Two Independent Means
The conditions for the pooled t-test for the difference between two independent means are the same as for the two-sample t-test, with the additional assumption that the variances of the two groups are the same.
We test the hypothesis $H_0: \mu_1 - \mu_2 = \Delta_0$, where the hypothesized difference $\Delta_0$ is almost always 0, using the statistic
$t = \frac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{SE_{pooled}(\bar{y}_1 - \bar{y}_2)}$.
The standard error is $SE_{pooled}(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_{pooled}^2}{n_1} + \frac{s_{pooled}^2}{n_2}}$.

Pooled Variance t-test for the Difference Between Two Independent Means
The pooled variance is $s_{pooled}^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$.
When the conditions are met and the null hypothesis is true, this statistic follows a Student's t-model with $(n_1 - 1) + (n_2 - 1)$ degrees of freedom.
The corresponding interval is $(\bar{y}_1 - \bar{y}_2) \pm t^*_{df} \times SE_{pooled}(\bar{y}_1 - \bar{y}_2)$, where the critical value $t^*$ depends on the confidence level and is found with $(n_1 - 1) + (n_2 - 1)$ degrees of freedom.

When to Pool?
The advantage of the pooled method is greatest when the samples are small. But this is when it's hardest to check conditions.
When the choice between the two-sample t and pooled t methods makes a difference (that is, when the sample sizes are small), the test for whether the variances are equal hardly works at all.
In a randomized comparative experiment, we know that each treatment group is a random sample from the same population, so each treatment group begins with the same population variance. In this case, assuming equal variances is the same as assuming that the treatment doesn't change the variance.
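The pooled formulas above can be sketched as follows, alongside the unpooled standard error for comparison. This is a minimal sketch with hypothetical function names and made-up summary statistics. It only computes the pooled variance, the two standard errors, and the pooled degrees of freedom.

```python
import math

def pooled_variance(s1, n1, s2, n2):
    """Weighted average of the two sample variances, weighted by n - 1."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / ((n1 - 1) + (n2 - 1))

def pooled_se(s1, n1, s2, n2):
    """Standard error of ybar1 - ybar2 under the equal-variance assumption."""
    sp2 = pooled_variance(s1, n1, s2, n2)
    return math.sqrt(sp2 / n1 + sp2 / n2)

def pooled_df(n1, n2):
    """Degrees of freedom for the pooled t-model."""
    return (n1 - 1) + (n2 - 1)

def unpooled_se(s1, n1, s2, n2):
    """Standard error used by the ordinary (unpooled) two-sample t methods."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

# Hypothetical summary statistics, for illustration only.
print(pooled_variance(3.0, 8, 5.0, 12), pooled_df(8, 12))
print(pooled_se(3.0, 8, 5.0, 12), unpooled_se(3.0, 8, 5.0, 12))
```

When the two sample sizes are equal, the pooled and unpooled standard errors coincide and only the degrees of freedom differ. The two methods disagree most when both the sample sizes and the sample spreads are unequal.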
Before pooling, check the conditions: Boxplots, Boxplots, Boxplots!!!

When to Pool?
Because the advantages of pooling are small, and you are allowed to pool only rarely (when the equal variances assumption is met): DON'T!
It is never wrong NOT to pool!!

CAUTION!!!
Watch out for paired data. If the samples are not independent, you cannot use the two-sample methods. Two-sample methods can only be used if the observations in the two groups are independent.
Look at the plots! Check for outliers and non-normal distributions. Make and examine boxplots.