VIEWS: 8 PAGES: 35 POSTED ON: 9/28/2012 Public Domain
Economics 105: Statistics

• Any questions?
• Please read Ch 5 of Freakonomics

Two-Sample Tests

• Population means, independent samples (example: Population 1 vs. independent Population 2)
• Population means, related samples (example: same population before vs. after treatment)
• Population proportions, independent samples (example: Proportion 1 vs. independent Proportion 2)
• Population variances (example: Variance 1 vs. Variance 2)

χ² Test for Differences Among More Than Two Proportions

• Extend the χ² test to the case with more than two independent populations:
  H0: π1 = π2 = … = πc
  H1: Not all of the πj are equal (j = 1, 2, …, c)

The Chi-Square Test Statistic

The chi-square test statistic is:

$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$

• where:
  fo = observed frequency in a particular cell of the 2 × c table
  fe = expected frequency in a particular cell if H0 is true
  χ² for the 2 × c case has (2 − 1)(c − 1) = c − 1 degrees of freedom
  (Assumed: each cell in the contingency table has an expected frequency of at least 1)

Computing the Overall Proportion

The overall proportion is:

$$\bar{p} = \frac{X_1 + X_2 + \cdots + X_c}{n_1 + n_2 + \cdots + n_c} = \frac{X}{n}$$

• Expected cell frequencies for the c categories are calculated as in the 2 × 2 case, and the decision rule is the same:
  Decision Rule: If χ² > χ²U, reject H0; otherwise, do not reject H0,
  where χ²U is from the chi-squared distribution with c − 1 degrees of freedom

The Marascuilo Procedure

• Used when the null hypothesis of equal proportions is rejected
• Enables you to make comparisons between all pairs
• Start with the observed differences, pj − pj′, for all pairs (for j ≠ j′), then compare each absolute difference to a calculated critical range

The Marascuilo Procedure (continued)

• Critical range for the Marascuilo procedure:

$$\text{Critical range} = \sqrt{\chi^2_U}\,\sqrt{\frac{p_j(1 - p_j)}{n_j} + \frac{p_{j'}(1 - p_{j'})}{n_{j'}}}$$

• (Note: the critical range is different for each pairwise comparison)
• A particular pair of proportions is significantly different if |pj − pj′| > critical range for j and j′

Marascuilo Procedure Example

A university is thinking of switching to a trimester academic calendar. A random sample of 100 administrators, 50 students, and 50 faculty members was surveyed.

Opinion    Administrators    Students    Faculty
Favor            63              20         37
Oppose           37              30         13
Totals          100              50         50

Using a 1% level of significance, which groups have a different attitude?

Marascuilo Procedure: Solution

Excel output (compare each absolute difference with its critical range):

Group    Sample Proportion    Sample Size
1              0.63               100
2              0.40                50
3              0.74                50

Comparison    Absolute Difference    Std. Error of Difference    Critical Range    Results
1 to 2              0.23                   0.08444525                0.256          Means are not different
1 to 3              0.11                   0.07860662                0.239          Means are not different
2 to 3              0.34                   0.09299462                0.282          Means are different

Other data: level of significance 0.01; d.f. 2; chi-square critical value 9.21; Q statistic 3.034856

(Group 1 = administrators, 2 = students, 3 = faculty. The Excel output labels the results "Means", but the quantities being compared are sample proportions.)

At the 1% level of significance, there is evidence of a difference in attitude between students and faculty.

χ² Test of Independence

• Similar to the χ² test for equality of more than two proportions, but extends the concept to contingency tables with r rows and c columns
  H0: The two categorical variables are independent (i.e., there is no relationship between them)
  H1: The two categorical variables are dependent (i.e., there is a relationship between them)

χ² Test of Independence (continued)

The chi-square test statistic is:

$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$

• where:
  fo = observed frequency in a particular cell of the r × c table
  fe = expected frequency in a particular cell if H0 is true
  χ² for the r × c case has (r − 1)(c − 1) degrees of freedom
  (Assumed: each cell in the contingency table has an expected frequency of at least 1)

Expected Cell Frequencies

• Expected cell frequencies:

$$f_e = \frac{\text{row total} \times \text{column total}}{n}$$

  where:
  row total = sum of all frequencies in the row
  column total = sum of all frequencies in the column
  n = overall sample size

Decision Rule

• The decision rule is: If χ² > χ²U, reject H0; otherwise, do not reject H0,
  where χ²U is from the chi-squared distribution with (r − 1)(c − 1) degrees of freedom

Example

• The meal plan selected by 200 students is shown below:

                     Number of meals per week
Class Standing    20/week    10/week    none    Total
Fresh.               24         32       14       70
Soph.                22         26       12       60
Junior               10         14        6       30
Senior               14         16       10       40
Total                70         88       42      200

Example (continued)

• The hypothesis to be tested is:
  H0: Meal plan and class standing are independent (i.e., there is no relationship between them)
  H1: Meal plan and class standing are dependent (i.e., there is a relationship between them)

Example: Expected Cell Frequencies (continued)

Expected cell frequencies if H0 is true (the observed frequencies are those in the meal plan table above):

                     Number of meals per week
Class Standing    20/wk    10/wk    none    Total
Fresh.             24.5     30.8    14.7      70
Soph.              21.0     26.4    12.6      60
Junior             10.5     13.2     6.3      30
Senior             14.0     17.6     8.4      40
Total              70       88      42       200

Example for one cell (Junior, 20/wk):

$$f_e = \frac{\text{row total} \times \text{column total}}{n} = \frac{30 \times 70}{200} = 10.5$$

Example: The Test Statistic (continued)

• The test statistic value is:

$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = \frac{(24 - 24.5)^2}{24.5} + \frac{(32 - 30.8)^2}{30.8} + \cdots + \frac{(10 - 8.4)^2}{8.4} = 0.709$$

χ²U = 12.592 for α = 0.05 from the chi-squared distribution with (4 − 1)(3 − 1) = 6 degrees of freedom

Example: Decision and Interpretation (continued)

The test statistic is χ² = 0.709; χ²U with 6 d.f. = 12.592

Decision Rule: If χ² > 12.592, reject H0; otherwise, do not reject H0

Here, χ² = 0.709 < χ²U = 12.592, so do not reject H0

[Figure: chi-squared distribution with the "do not reject H0" region from 0 to χ²U = 12.592 and the "reject H0" region to its right]

Conclusion: there is not sufficient evidence that meal plan and class standing are related at α = 0.05

Two-Sample Tests in Excel

For independent samples:
• Independent sample Z test with variances known:
  – Data | data analysis | z-test: two sample for means
• Pooled variance t test:
  – Data | data analysis | t-test: two sample assuming equal variances
• Separate-variance t test:
  – Data | data analysis | t-test: two sample assuming unequal variances
For paired samples (t test):
  – Data | data analysis | t-test: paired two sample for means
For variances:
• F test for two variances:
  – Data | data analysis | F-test: two sample for variances

Wilcoxon Rank-Sum Test for Differences in 2 Medians

• Test two independent population medians
• Populations need not be normally distributed
• Distribution-free procedure
• Used when only rank data are available
• Must use normal approximation if either of the sample sizes is larger than 10

Wilcoxon Rank-Sum Test: Small Samples

• Can use when both n1, n2 ≤ 10
• Assign ranks to the combined n1 + n2 sample observations
  – If unequal sample sizes, let n1 refer to the smaller-sized sample
  – Smallest value rank = 1, largest value rank = n1 + n2
  – Assign average rank for ties
• Sum the ranks for each sample: T1 and T2
• Obtain test statistic, T1 (from smaller sample)

Checking the Rankings

• The sum of the rankings must satisfy the formula below
• Can use this to verify the sums T1 and T2

$$T_1 + T_2 = \frac{n(n + 1)}{2}$$

where n = n1 + n2

Wilcoxon Rank-Sum Test: Hypothesis and Decision Rule

M1 = median of population 1; M2 = median of population 2
Test statistic = T1 (sum of ranks from smaller sample)

Two-Tail Test:    H0: M1 = M2    H1: M1 ≠ M2    Reject H0 if T1 < T1L or if T1 > T1U
Left-Tail Test:   H0: M1 ≥ M2    H1: M1 < M2    Reject H0 if T1 < T1L
Right-Tail Test:  H0: M1 ≤ M2    H1: M1 > M2    Reject H0 if T1 > T1U

Wilcoxon Rank-Sum Test: Small Sample Example

Sample data are collected on the capacity rates (% of capacity) for two factories. Are the median operating rates for the two factories the same?
• For factory A, the rates are 71, 82, 77, 94, 88
• For factory B, the rates are 85, 82, 92, 97
Test for equality of the population medians at the 0.05 significance level

Wilcoxon Rank-Sum Test: Small Sample Example (continued)

Ranked capacity values:

Capacity    Factory    Rank
71             A        1
77             A        2
82             A        3.5   (tie in 3rd and 4th places)
82             B        3.5
85             B        5
88             A        6
92             B        7
94             A        8
97             B        9

Rank sums: Factory A = 20.5, Factory B = 24.5

Wilcoxon Rank-Sum Test: Small Sample Example (continued)

Factory B has the smaller sample size, so the test statistic is the sum of the Factory B ranks: T1 = 24.5

The sample sizes are: n1 = 4 (factory B), n2 = 5 (factory A)

The level of significance is α = .05

Wilcoxon Rank-Sum Test: Small Sample Example (continued)

Lower and upper critical values for T1 from Appendix table E.8, page 829 in BLK 10e, for n2 = 5:

One-Tailed α    Two-Tailed α    n1 = 4    n1 = 5
.05             .10             12, 28    19, 36
.025            .05             11, 29    17, 38
.01             .02             10, 30    16, 39
.005            .01             --, --    15, 40

T1L = 11 and T1U = 29

Wilcoxon Rank-Sum Test: Small Sample Solution (continued)

• α = .05
• n1 = 4, n2 = 5
Two-Tail Test: H0: M1 = M2, H1: M1 ≠ M2
Reject H0 if T1 < T1L = 11 or if T1 > T1U = 29
Test statistic (sum of ranks from smaller sample): T1 = 24.5
Decision: Do not reject H0 at α = 0.05
Conclusion: There is not enough evidence to conclude that the population medians are different.
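The small-sample factory example can be checked with a short script. This is a minimal sketch in plain Python: the helper `average_ranks` and the variable names are illustrative assumptions, not part of the slides, and the critical values 11 and 29 are simply copied from the table lookup above rather than computed.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = ((pos + 1) + (end + 1)) / 2  # average of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


factory_a = [71, 82, 77, 94, 88]
factory_b = [85, 82, 92, 97]

# n1 refers to the smaller sample, so sample1 is factory B here.
sample1, sample2 = sorted([factory_a, factory_b], key=len)
n1, n2 = len(sample1), len(sample2)
n = n1 + n2

ranks = average_ranks(sample1 + sample2)
t1 = sum(ranks[:n1])  # rank sum of the smaller sample (test statistic)
t2 = sum(ranks[n1:])  # rank sum of the larger sample

# Check from the slides: T1 + T2 must equal n(n + 1)/2.
assert t1 + t2 == n * (n + 1) / 2

# Two-tail critical values for n1 = 4, n2 = 5, alpha = .05 (from the table lookup).
t1_lower, t1_upper = 11, 29
reject_h0 = t1 < t1_lower or t1 > t1_upper

print(f"T1 = {t1}, T2 = {t2}, reject H0: {reject_h0}")
```

Sorting the two samples by length is just a convenient way to make sure T1 always comes from the smaller sample, as the procedure requires.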
Wilcoxon Rank-Sum Test (Large Sample)

• For large samples, the test statistic T1 is approximately normal with mean μT1 and standard deviation σT1:

$$\mu_{T_1} = \frac{n_1(n + 1)}{2} \qquad \sigma_{T_1} = \sqrt{\frac{n_1 n_2 (n + 1)}{12}}$$

  where n = n1 + n2
  – Must use the normal approximation if either n1 or n2 > 10
  – Assign n1 to be the smaller of the two sample sizes
  – Could use the normal approximation for small samples

Wilcoxon Rank-Sum Test (Large Sample) (continued)

• The Z test statistic is

$$Z = \frac{T_1 - \mu_{T_1}}{\sigma_{T_1}} = \frac{T_1 - \dfrac{n_1(n + 1)}{2}}{\sqrt{\dfrac{n_1 n_2 (n + 1)}{12}}}$$

• where Z ~ N(0,1)

Interpreting Electoral Polls*

• WASHINGTON (Reuters - 09/09/00) - A new Newsweek poll on Saturday showed Vice President Al Gore maintaining a strong lead over Texas Gov. George Bush in the presidential race, but a CNN/USA Today survey found the candidates virtually tied.
• According to the Newsweek poll, Democratic nominee Gore leads Republican nominee Bush 47 percent to 39 percent among registered voters, with Green Party candidate Ralph Nader at 3 percent and Reform Party candidate Pat Buchanan at 1 percent.
• Among likely voters, Gore led Bush 49 percent to 41 percent, the same margin as among registered voters.
• The poll was conducted by Princeton Survey Research Associates Sept. 7-8 among 756 registered voters, including 595 who said they were likely to vote in the election.
• The margin of error was 4 percentage points for the survey of registered voters and 5 percentage points for likely voters.

*Source: http://www.kellogg.northwestern.edu/faculty/weber/decs-433/Presidential_Polls.htm

Interpreting Electoral Polls (continued)

• Where do the 4% points and 5% points come from?
• Recall …

$$p \pm 1.96\sqrt{\frac{p(1 - p)}{n}}$$

• A rough calculation for margin of error is $1/\sqrt{n}$

Sample Size    Margin of Error
10,000              .01
2,500               .02
1,112               .03
625                 .04
400                 .05

• For this poll: $1/\sqrt{756} = .0364$ (registered voters) and $1/\sqrt{595} = .0410$ (likely voters)

Interpreting Electoral Polls (continued)

• What are the precise margins of error on each candidate’s support?

$$\text{Precise margin of error} = 1.96\sqrt{\frac{p(1 - p)}{756}}$$

Candidate    Sample p (%)    Precise margin of error (% points)
Gore              47                  3.55
Bush              39                  3.47
Nader              3                  1.21
Buchanan           1                   .71

• Conclusions
  – $1/\sqrt{n}$ is an upper bound for the margin of error at the 95% confidence level, because p(1 − p) ≤ 1/4 gives $1.96\sqrt{p(1 - p)/n} \le 0.98/\sqrt{n}$
  – But it is much too big for proportions far from 50%

Interpreting Electoral Polls (continued)

• Is Gore statistically ahead?
• Rule of thumb to use when watching Fox/MSNBC/etc:
  – Double the margin of error reported in the news article & compare that to the difference in sample proportions
• Define, for each respondent i,

$$X_i = \begin{cases} 1, & \text{if Gore} \\ 0, & \text{if neither} \\ -1, & \text{if Bush} \end{cases} \qquad D = \frac{1}{n}\sum_{i=1}^{n} X_i$$

• D will then be the difference between the proportion of voters supporting Gore and the proportion supporting Bush
• Find E[D] and Var[D]

Interpreting Electoral Polls (continued)

• Is Gore statistically ahead? Yes …

$$E[D] = .47 - .39 = .08$$

$$\text{StdDev}[D] = \sqrt{\frac{.47(1 - .47) + .39(1 - .39) + 2(.47)(.39)}{756}} = .0336$$

• 95% CI for the “lead”: .08 ± 1.96 × (.0336) = (.0141, .1458)
• The interval excludes 0, so the lead is statistically significant at the 5% level
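The poll arithmetic can be reproduced in a few lines. This is a minimal sketch using only the Python standard library: the function `margin_of_error` and the variable names are illustrative assumptions, and the inputs (n = 756 and the four sample proportions) are taken from the Newsweek figures quoted above.

```python
import math

Z_95 = 1.96  # normal critical value for 95% confidence
n = 756      # registered voters in the Newsweek poll
shares = {"Gore": 0.47, "Bush": 0.39, "Nader": 0.03, "Buchanan": 0.01}


def margin_of_error(p, n, z=Z_95):
    """Half-width of the usual normal-approximation CI for one proportion."""
    return z * math.sqrt(p * (1 - p) / n)


# Rough rule: 1/sqrt(n) bounds the 95% margin of error for any p.
rough = 1 / math.sqrt(n)
precise = {name: margin_of_error(p, n) for name, p in shares.items()}

# The lead D = p_Gore - p_Bush comes from one multinomial sample, so the two
# proportions are negatively correlated and the covariance term adds 2*p1*p2.
p_gore, p_bush = shares["Gore"], shares["Bush"]
lead = p_gore - p_bush
sd_lead = math.sqrt(
    (p_gore * (1 - p_gore) + p_bush * (1 - p_bush) + 2 * p_gore * p_bush) / n
)
ci = (lead - Z_95 * sd_lead, lead + Z_95 * sd_lead)
gore_ahead = ci[0] > 0  # the interval lies entirely above zero

print(f"rough margin: {rough:.4f}")
for name, moe in precise.items():
    print(f"{name}: {100 * moe:.2f} % points")
print(f"lead = {lead:.2f}, sd = {sd_lead:.4f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

Because the script carries full precision through every step, the last printed digit of some values may differ slightly from the rounded figures on the slides.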