VIEWS: 15 PAGES: 98 POSTED ON: 4/21/2012 Public Domain
e2c54c4c-1c5c-408f-ae71-65a3bcbc249a.xlsx A B C D E F G H I J 1 Basic Descriptive Statistics 5▼ 2 Score Raw 3 Symbolic Formula Function Computed Excel Number Scores 4 Name Symbol See notes below Name Formula Function i X 5 Sum SX X1 + X2 + X3 . . . + Xn SUM ◄1 1 5 6 Sample or Population Size n or N n or N = COUNT ◄2 6 7 Degrees of Freedom d.f. n-1= ◄3 7 8 Sample or Population Mean or m sum/number = SX/n = AVERAGE ◄4 5 9 Sum of Squares SS S(X - ) = SUM, DEVSQ 2 ◄9 3 2 10 Sample Variance S SS/d.f. = VAR.S ◄10 8 11 Sample Standard Deviation S (SS / d.f.) = STDEV.S ◄11 7 12 Intermediate Calculation S/ ◄12 2 13 Sample Coefficient of Variation Cvar (S / )*100 ◄13 1 14 Legend 5 15 Labels Check Data & 9 16 Given Data Sample Worksheet 1 8 17 Order of Computation Quiz Data 7 18 Cells using formulas 6 19 Cells using function 5 20 Cells using functions in a formula 6 21 Optional Data 7 22 8 23 9 24 9 25 5 26 6 27 28 Last Z score = -0.72 29 CVAR = 21.25 30 31 Worksheet Quiz 1 Data 22 32 Enter this data in 25 33 Column J 13 34 and Complete 9 Prepared by G. Lee Griffith, Ph.D. 4/21/2012 e2c54c4c-1c5c-408f-ae71-65a3bcbc249a.xlsx A B C D E F G H I J 35 Worksheet Quiz 1 8 36 14 37 8 38 13 39 10 40 16 41 15 42 12 43 16 44 45 5 46 6 47 7 48 5 49 3 50 8 51 7 52 2 53 1 54 5 55 56 57 58 59 60 61 62 63 64 65 66 67 68 Prepared by G. Lee Griffith, Ph.D. 4/21/2012 e2c54c4c-1c5c-408f-ae71-65a3bcbc249a.xlsx A B C D E F G H I J 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 Prepared by G. Lee Griffith, Ph.D. 4/21/2012 e2c54c4c-1c5c-408f-ae71-65a3bcbc249a.xlsx K L M N O P 1 6▼ 7▼ 8 ▼ 14 ▼ 15 ▼ 2 DeviationSquared Std z-score 3 Mean Score Dev Sc Dev Z= 2 4 x X - x (X - x) S (X - x) / S 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 Prepared by G. Lee Griffith, Ph.D. 4/21/2012 e2c54c4c-1c5c-408f-ae71-65a3bcbc249a.xlsx K L M N O P 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 Prepared by G. Lee Griffith, Ph.D. 4/21/2012 e2c54c4c-1c5c-408f-ae71-65a3bcbc249a.xlsx K L M N O P 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 Prepared by G. Lee Griffith, Ph.D. 4/21/2012 Worksheet 2 Histogram Build Worksheet 2 Data: Use this data to construct a histogram Construct a histogram showing the number of each color of ribbon awarded at the state fair. Ribbons Awarded at State Fair Color of Ribbon Frequency of Ribbon White 8 Red 6 Blue 1 Purple 4 Grand 2 Source: simulated Check Data: use this data to check your understanding of creating a histogram and in Sample Worksheet 2 quiz Construct a histogram showing the number of male drinkers in each category CLASSIFICATIONS OF PARTICIPANTS BY DRINKING CATEGORY Q-F-V Category Number of Participants Light 33 Moderate 54 Heavy 81 Source: http://www.mass.gov/mdaa/mvcrimes/Psychophysical%20tests%20for%20DWI.pdf Worksheet Quiz 2 Data Construct a histogram for the data below and answer the questions in Worksheet 2 Quiz College Student Alcohol Use in 1999 Category Percent (n = 13,819) Abstainer (past y) 19.2 Nonbinge drinker† 36.6 Occasional binge drinker‡ 21.4 Frequent binge drinker§ 22.7 †Students who consumed alcohol in the past year but did not binge. ‡Students who binged one or two times in a 2-week period. §Students who binged three or more times in a 2-week period. the state fair. and in Sample Worksheet 2 quiz %20DWI.pdf Worksheet 2 Pareto and Pie Chart Build Worksheet 2 Pareto and Pie Chart Data Construct two graphs. First a Pareto Chart. Be sure to sort the columns in the table by descending frequency Second construct a Pie Chart Madison County Indiana 1995 Frequency of Crimes Reported Crime Frequency ARSONS 8 BURGLARIES 91 MOTOR VEHICLE THEFTS 46 RAPES 6 ROBBERIES 5 Source: http://fisher.lib.Virginia.EDU/cgi-local/crimebin/new2.cgi Check Data and Sample Worksheet Quiz 2 Data: use the following data to check that you know how to create Pareto and Pie charts and to answer the sample Works Problem % Wrong Size 24 Did not want 34 Item was defective 38 No reason given 4 This is simulated data on types of problems in goods returned to Target use it to construct a Pareto Diagram (sorted Worksheet Quiz 2 Data For the data below create a Pareto and Pie charts and answer the questions in Worksheet Quiz 2 Table 1: Differences in Average Income and Family Size Among Families with Children, by Marital Status and Sex o Type of Family Mean Income Median Income Married Couple Families $79,048 $62,931 Male Householder, No Wife 44,270 32,516 Present Female Householder, No 29,075 21,529 Husband Present http://www.aspe.hhs.gov/hsp/marriage-well-being03/LitReview.pdf escending frequency o answer the sample Worksheet 2 quiz ct a Pareto Diagram (sorted) and a pie chart. by Marital Status and Sex of Household Head: 2000 Income per Person $18,515 14,719 9,023 Worksheet 2 Bar Graph Build Worksheet 2 Data: Use this data to build two bar graphs and compare with key Construct 2 graphs. The first to show which country has the lowest higher education rate. The second graph should show which country has the most gender difference in higher education rate. Percentage of the population in large industrialized countries who had completed education. Country Total Male Female Canada 16.9 18 18.9 France 9.2 11.9 11.3 Germany 12.6 12.7 11 Italy 7.5 7.7 8.1 Japan 13.3 34.2 11.5 United Kingdom 11.7 15.7 11.7 United States 24.4 23.4 23.5 Percentage of the population in large industrialized countries who had completed higher education, by age, sex, and source: http://nces.ed.gov/pubs/ce/c9723a01.html Check Data and Sample Worksheet Quiz Data: use this data to check your work and answer questions in Sample W Using the following data construct an absolute frequency bar graph and take the Sample Worksheet 2 Quiz. This data represents the marital status of Residents of Madison County Indiana who are 15 years and over. Group Frequency Percent Never married 22,623 21.2 Now married, except separated 61,567 57.6 Separated 1,392 1.3 Widowed 7,874 7.4 Divorced 13,481 12.6 Source: http://factfinder.census.gov/servlet/QTTable?_bm=y&-qr_name=DEC_2000_SF3_U_DP2&-ds_name=DEC Worksheet Quiz 2 Data Construct a Bar Graph comparing household income in Madison and Marion Counties (Indiana) then take Workshe Family Income Madison Marion Less than $10,000 7.9 8.5 $10,000 to $14,999 7.2 6.2 $15,000 to $24,999 15.4 13.9 $25,000 to $34,999 14.3 14.2 $35,000 to $49,999 18 17.7 $50,000 to $74,999 19.7 20 $75,000 to $99,999 9.9 9.6 $100,000 to $149,999 5.5 6.6 $150,000 to $199,999 1 1.6 $200,000 or more 1.1 1.7 source 1: http://factfinder.census.gov/servlet/QTTable?_bm=y&-qr_name=DEC_2000_SF3_U_DP3&-ds_name=DE source 2: http://factfinder.census.gov/servlet/QTTable?_bm=y&-qr_name=DEC_2000_SF3_U_DP3&-ds_name=DE gher education rate. higher education, by age, sex, and country: 1994 nd answer questions in Sample Worksheet 2 ample Worksheet 2 Quiz. ho are 15 years and over. 00_SF3_U_DP2&-ds_name=DEC_2000_SF3_U&-_lang=en&-_sse=on&-geo_id=05000US18095 ties (Indiana) then take Worksheet Quiz 2 000_SF3_U_DP3&-ds_name=DEC_2000_SF3_U&-_lang=en&-_sse=on&-geo_id=05000US18095 000_SF3_U_DP3&-ds_name=DEC_2000_SF3_U&-_lang=en&-_sse=on&-geo_id=05000US18097 Worksheet 2 Data Curve Build Worksheet 2 Data: Use this data to construct a data curve. Construct a data curve showing the average level of depression reported by patients on differing doses of Prozac Relationship of Dose to Level of Depression Dose in mgs Beck Depression Scale 0 30 25 28 50 15 75 20 100 30 Source: Data is simulated Check Data: Use the following data to check that you know how to create a correct data curve and in Sample Work Contingent Payment Standard Treatment Days of Drug Free Urine Frequency (Percent) Frequency( Percent) 0 14 19 1-4 10 20 5-8 13 0 9-11 3 2 Source: http://www.drugabuse.gov/pdf/monographs/25.pdf page 56 Worksheet Quiz 2 Data: Create a data curve showing number of accidents as a function of blood alcohol level then take Worksheet 2 Quiz Blood alcohol level Accidents for which driver was culpable <50 20 50-79 15 80-149 58 >=150 80 Source: http://www.grotenhermen.com/driving/bates.pdf page 231 ffering doses of Prozac urve and in Sample Worksheet 2 Quiz n take Worksheet 2 Quiz Worksheet 2 Time Series Build Data: Use this data to build a time series graph Construct a times series showing income by year Per capita personal income Year Income 1993 21220 1994 22056 1995 23063 1996 24169 1997 25298 1998 26240 1999 27002 2000 28369 2001 29975 2002 27563 Source: Data is fabricated Check Data: Use this to construct and Time Series graph and for Sample worksheet 2 Quiz Pedestrians crashes in New Orleans 14+year olds Study Year Total period Baseline 1990 844 1991 848 1992 861 1993 799 1994 837 http://www.nhtsa.dot.gov/people/injury/alcohol/PedestrianAccident/Main_report.html Worksheet Quiz 2 Data Construct a Time Series Graph for the data below and take Worksheet 2 Quiz Reported AIDS Infections In Madison County Indiana Year Frequency 1992 378 1993 755 1994 718 1995 518 1996 645 1997 555 1998 482 1999 347 2000 360 2001 360 Source: http://www.cdc.gov/hiv/stats/hasrsupp83/table1.htm Worksheet 2 Scatterplot Build Worksheet 2 data Construct a scatterplot showing the relationship of the weight of shoes to time to complete race Here are weights of the contestant shoes in grams per shoe and times in seconds Contestant Weight Time 1 382 58 2 395 59 3 375 54 4 400 53 5 402 61 6 389 61 7 410 63 8 420 65 9 378 57 10 375 59 11 368 53 12 369 54 13 381 58 14 382 57 Source: Simulated Check data: Use the data below to check that you are able to create a scattergram Relationship of Blood Alcohol Content and Driving Impairment in Women BAC Score 0.025 13.73 0.05 18.75 0.075 23.78 0.1 28.81 0.15 38.86 0.2 48.91 Source http://www.mass.gov/mdaa/mvcrimes/Psychophysical%20tests%20for%20DWI.pdf Worksheet Quiz 2 Data Construct a scatterplot showing the relationship between the # of sexual partners and self-esteem # of partners Self esteem 2 23 4 22 6 21 8 17 20 5 5 23 1 30 0 25 2 22 Source: Simulated complete race s and self-esteem Worksheet 2 Frequency Polygon and Ogive Build Worksheet 2 Data: Use this data to build a Frequency Polygon and Cumulative % Ogive Students in an anthropology class received the following scores on the first test The scores are grouped into intervals for the sake of a clearer display. Cumulative Cumulative URL Midpoints Frequency Frequency % 22.5 15 0 0 0% 37.5 30 22 22 29% 52.5 45 35 57 74% 67.5 60 15 72 94% 82.5 75 5 77 100% 97.5 90 0 77 100% Source: Simulated Check Data use this data to build a Frequency Polygon & Ogive & for Sample Worksheet 2 Monthly Mortgage Costs in Indiana (approximate) URL Midpoint Cumulative Cumulative Payment Payment Frequency Frequency % 250 0 0 0 0% 500 250 98114 98114 10% 1000 750 516,739 614853 63% 1500 1250 253,798 868651 89% 2000 1750 72753 941404 96% 2500 2250 36875 978279 100% 3000 2750 0 978279 100% This category is not accurate Source: Adapted from http://censtats.census.gov/data/IN/04018.pdf#page=2 Worksheet Quiz 2 Data For the data below construct a Frequency Polygon and Ogive and answer the questions in Worksheet Quiz 2. Data represent ages of people in US. Cumulative Cumulative URL Midpoint Frequency Frequency % 0 0 0 0 0% 4.5 2 423,215 423,215 7% 14.5 10 1309904 1,733,119 27% 24.5 20 879213 2,612,332 40% 34.5 30 831,125 3,443,457 53% 44.5 40 960,703 4,404,160 68% 54.5 50 816,865 5,221,025 80% 64.5 60 529844 5,750,869 88% 74.5 70 395,393 6,146,262 95% 84.5 80 265,880 6,412,142 99% 94.5 90 91,558 6,503,700 100% 104.5 100 0 6,503,700 100% Source: Significantly adapted from http://censtats.census.gov/data/IN/04018.pdf#page=2 gory is not accurate Basic Statistics Raw Deviation Std z-score Symbolic Function Computed Excel Score Scores Mean Score Dev Z= 2 Name Symbol Formula Name Formula Function Number X x X - x (X - x) S (X - x) / S Sum SX X1 + X2 + X3 . . . + Xk SUM 1 11 Sample Size or Population Size n or N n or N = COUNT 10 Degrees of Freedom d.f. n-1 9 Sample or Population Mean x or m sum/number = SX/n AVERAGE 8 Mode Mode Mode = MODE.SNGL 7 Median Median Median = MEDIAN 6 Minimum Min Minimum = MIN 5 Maximum Max Maximum = MAX 4 Range Range Range = Max - Min 3 Sum of Squares SS S(X - )2 or S(X - m)2 SUM, DEVSQ 2 Sample Variance S2 SS/d.f. = VAR.S 1 Sample Standard Deviation S (SS / d.f.) STDEV.S 5 Population Variance s2 S(X - m)2/N VAR.P 4 Population Standard Deviation s (S(X-m)2/N) STDEV.P 6 Intermediate Calculation S/ 6 Sample Coefficient of Variation Cvar (S / )*100 Check Data 5 Use to check work 6 and with 7 Sample Worksheet 8 3 Quiz 9 8 7 6 5 4 3 2 1 2 3 4 3 Cvar = 49.05 Last Z score = -0.79 Worksheet Quiz 3 Data Reebok Nike Create two separate worksheets 523 925 Enter the data from column in each 632 656 Answer the questions in 529 424 Worksheet 3 Quiz 828 365 Each value represents the 323 656 total dollar sales for that brand in 1101 895 one store for one week. 675 878 525 525 987 969 564 858 636 941 656 793 785 354 778 636 454 787 565 565 11 10 9 8 7 6 5 4 3 2 1 5 4 6 6 Binomial Distribution Function or Name Symbol Value Formula Successes X= 2 Trials n= 3 Probability of one success from one trial p= 0.5 Cumulative (True) or Point (False) Cumulative TRUE Probability of X or fewer successes in n trials P(<=X) = BINOM.DIST Probability of greater than X successes in n trials P(> X) = = 1 - P(<=X) Cumulative (True) or Point (False) Point FALSE Probability of exactly X successes in n trials P(=X) = BINOM.DIST Probability of other than X successes in n trials: Compliment P(/=X) = = 1 - P(=X) Check Data: use to check work and Sample Worksheet 5 & 6 Quiz X= 5 n= 8 p= 0.2 Complement 0.9908 Worksheet 5 & 6 Quiz Data: Complete Worksheet 5 with the data from the problem below An employer is accused of discriminating against applicants from Purdue. Using an applicant pool in which 20% of those who apply are from Purdue, in the last 10 hires only one Purdue graduate has been selected. You do not need to revise the graph. Discriminate between proportions and percents. 2 3 0.5 0 1 2 3 0.125 Likelihood of Girls in Family of 3 0.375 0.375 0.125 Children 0.375 0.375 0.400 Probability 0.300 0.200 0.125 0.125 0.100 0.000 0 1 2 3 Number of Girls n! P( X ) 0.375 * p X * q n X (n X )! X ! Poisson Distribution What are you given? Use only one column. Occurrences and Units Mean (l) Total number of occurrences in all units = 900 The number of units = 7000 The specific number of occurrences that are of interest in one unit = X= 2 3 Mean = Total Occurrences/Units l= 4 Cumulative (True) or Point (False) = T or F ? TRUE TRUE Probability of a score at X or below: Area at and below = P(<=X;l) Probability of a score above X = Area Above = P(>X;l) Cumulative (True) or Point (False) = T or F? FALSE FALSE P(=X;l) = Area at the Point = P(=X;l) P(/=X:l) = Complement = Area everywhere except point P(/=X;l) This worksheet has data from 3 separate problems. Problem 1: If in 7000 MP3 players you have 900 pieces missing what is the likelihood that any one will have 2 pieces missing Problem 2: If the average number of customers per hour is 4 what is the likelihood that you will have exactly 3 in one hour? Problem 3: If the likelihood of having Herpes is .20. If you have a group of 40 people what is the likelihood of 5 or fewer having Check Data:. Use to check worksheet and for Sample Worksheet 5 & 6 Quiz Total number of occurrences in all units = 500 The number of units = 600 Specific Outcome in sample 3 6 mean (l) = 0.8333 5 Complement = 0.9581 0.8538 Here are some possible problems for the check data: Problem1: If there are 600 Public 4 yr Colleges and a total of 500 death due to alcohol, how likely would one school having 3 ex Problem 2: If the average number of dates that an AU student gets in 4 years is 5 what is the likelihood of getting 6 or less? Problem 3: If 1% of calls received at the switchboard are wrong numbers. If AU get 300 calls how likely will 2 be wrong number Worksheet Quiz 5&6 Data A local social action group is attempting to determine if the number of deaths due to leukemia is unexpectedly high. If the rate is so high that it would occur by chance less often than 0.05 they will investigate further. The probability that any one person in the population would get the disease is .00014 The size of the local county is 38,000. The number of cases was 12. (Remember that events that are very unlikely to occur are suspect) 900 0.2 7000 40 2 3 5 4 ven? Use only one column. Function, Formula Probability or Explanation e l lX 0.2 = probability (p) 0.0000 40 = unit size(n) 5 =X P( X ; l) = n*p = l X! TRUE POISSON.DIST = 1- P(<=X;l) FALSE POISSON.DIST = 1- P(=X;l) ne will have 2 pieces missing have exactly 3 in one hour? likelihood of 5 or fewer having herpes? 0.01 = probability (p) 300 = unit size(n) 2 =X 3 = mean = (l) 0.7760 = Complement ly would one school having 3 exactly? elihood of getting 6 or less? w likely will 2 be wrong numbers unexpectedly high. Gaussian Distribution Problems Name Symbol Explanation or Formula Score #1 Raw score X Given information 35.0000 Mean of Distribution or m Given information 40.0000 Standard Deviation S or s Given information 6.0000 Population Size N Total number of scores in the distribution (optional) 750 Deviation Score X- X-x = z-score z (X - x) / S = Function or Formula Area Probability below p NORM.S.DIST Probability above 1-p 1-p = Number Number of scores below # N*p= Number of scores above # N * (1 - p) = Check Data and Sample Worksheet 7 Quiz Data Score #1 X= 270 Mean = SX / n = 250 Std. Dev. = 10 N= 400 Worksheet 7 Quiz Data You are selling for a sunglasses company. You visit retailers each day and obtain orde Some days you are very successful, others not. On a given day you sell only 400 dollars worth of glasses. Average daily sales for all representatives on all days in your company are mean $575 The standard deviation for sales for all reps on all days is $95. The boss accuses you of goofing off. You think this is an unfair accusation. On how many of the 200 working days each year can you be expected to sell 400 dollars or less assuming you are an average sales representative? 35.0000 40.0000 6.0000 750 Score #2 Explanation or Formula 46.0000 The second score (optional) Cell reference to Column D Cell reference to Column D 0.094092609 ( X m ) 2 Cell reference to Column D 1 f ( x; m , s ) 2s 2 =X-x = (X - x) / S e Area Area between 2 scores s 2 Estimate for 0 <= z <=1 Number Number between 2 scores 0.5 1 1 3 q( x) 0.5 (2 ) 2 (z z ) 7 Score #2 Between 2 Estimate for Z >1 279 0.0209 Between 2 8 (1 z ) ( z ) /(1 z z 2 ) tailers each day and obtain orders. n your company are mean $575 an unfair accusation. you be expected to sell s representative? 46.0000 ( X m ) 2 2s 2 e 1 z z2) Worksheet 7B Gaussian Distribution from Area (p) (or Number of Scores (#)) to cutoff Information given? Name Symbol Proportion Mean Sample or Population Mean or m 258 Standard Deviation Sample or Population Standard Deviation S or s = 5 # of scores of interest Number of scores in the area of interest #= Total N of scores Total number of scores represented by the entire distribution N= Area of interest Proportion of the area of the curve below point of interest p = 0.1000 Critical Vallue NORM.S.INV gives the z-score associated with area z= Precision Multiplying z*S yields the interval width z*S= Absolute Precision Absolute Value of Precision |z * S| Lower Cutoff Mean - precision X= Upper Cutoff Mean + precision X= Check work with extra data set below Mean (x or m)= x or m 24 Standard Deviation (s or S) = s or S 3 Total number of scores represented by the entire distribution = # Proportion of the area of the curve of interest N Number of scores in an area p = # / N 0.12 X = Cutoff score = Deviation - Mean = X 20.48 X = Cutoff score = Deviation + Mean = X 27.52 Worksheet 7 Quiz Data John wants to play college basketball. His height is 75.3 inches He will not be considered unless of the males graduating in his year he is in the group the tallest 9000 In a given year the number of males who graduate from high school in the USA is 175,000 Heights are distributed in a Gaussian Distribution with a mean of 71.2 The standard deviation of the heights is 2.4 What is the shortest that someone can be, if they are going to make the group of the tallest 9000? 258 5 0.1000 Information given? Formula or # of scores Function 30 6 1750 20000 =#/N NORM.S.INV =z*S ABS = (X - x) - x = (X - x) + x 72 15 250 6500 45.47 98.53 30 6 1750 20000 Sampling Distribution Problems Name Symbol Explanation Score #1 Score #2 Sample mean Given information 27 29 Population mean m Given information 30 Population Standard Deviation S or s Given information 5 Size or each sample n Given information 22 Population Size N Number of Samples (optional information) 200 Deviation Score -m Distance of x from m Square Root of n √n Square root of the sample size Standard Error of the Mean S / √n Spread of the Distribution of Sample Means z-score Z Deviation Score / Standard Error Function or Formula Area Area Probability below p NORM.S.DIST Probability above 1-p (1 -p) = Number Number Number of means below x # N*p= Number of means above x # N * (1 - p) = Check work with extra data set below Mean #1 Mean # 2 Sample mean = x = 45 46 Population Mean = m = 47 Population Standard Deviation = s = 6 Sample size = n = 31 Population Size = N = 500 Worksheet 7 Quiz Data You have developed a variety of neighborhood self-help groups. The average age of all groups (both Senior and not Senior) is 56 The group size is 8 The standard deviation of age of all the groups is 9 The total number of groups in the city 312 Senior groups are defined as having an average age over 60 27 29 30 5 22 200 Explanation The second mean (optional) Cell reference to Column d Cell reference to Column d Cell reference to Column d Cell reference to Column d = -m = n^.5 = S / √n = ( - m) / s Area between 2 scores Number between 2 scores Between 2 0.1450 Between 2 72 Confidence Interval for the Population Mean Is my standard deviation based on a population or sample? Name Explanation or Formula Symbol Population Sample Sample Mean Given Information 1022.0000 Standard Deviation Given Information s or s 57.0000 Sample Size Given Information n 25 Confidence Level Given Information CI 95 Degrees of Freedom n-1 d.f. Square Root of Sample Size n^.5 n Standard Error of the Mean s / n s Area in tails as % 100 - CL % Area in 2 tails as proportion % / 100 a Area in one tail a/2 a/2 Function or Formula Dist. = Z t Critical Value NORM.S.INV, T.INV C.V. Precision CV * s Absolute value of Precision ABS Lower Confidence Limit - |precision| LCL Upper Confidence Limit + |precision| UCL Interpretation: The population mean will fall between the LCL and UCL in the specified % of such intervals. Check data & for Sample Worksheet 8 Quiz 56 55 60 69 61 56 61 41 55 58 58 39 58 59 54 54 59 51 63 71 52 42 69 55 59 59 42 57 54 58 41 65 CI = 95 UCL = 53.07 58.86 Worksheet 8 Quiz Data Cell phone bills in dollars for a sample of people. 45 30 52 73 35 30 49 39 32 35 48 81 31 85 51 54 66 90 50 56 72 40 47 59 47 50 54 56 51 62 Compute summary statistics using functions then compute the 95% confidence interval 1022.0000 57.0000 25 95 n or sample? Explanation Cell reference to D Cell reference to D Cell reference to D Cell reference to D Used only with t Needed to compute sx Needed to compute precision Used to compute a Used with Z & t Distribution Function yields cutoff Area on either side of mean Makes sure it is positive Cutoff below mean Cutoff above mean of such intervals. One sample test of mean Formula, Function Distribution Name or Explanation Symbol Z Is population standard deviation (s) known? Yes Sample Mean Observed 99 Population Mean Hypothesized (expected) m 93 Standard Deviation Population or Sample s or s 12 Sample Size Number of scores n 32 Acceptable risk Type 1 error a 0.01 Deviation Observed-expected diff. -m Square Root of sample size n^.5 n Standard Error of the Mean Standard deviation / n sx Degrees of Freedom n-1 d.f. Test Statistic Deviation / standard error z or t Absolute Value ABS |TS| Negative of AV -1 * |TS| Distribution Z 1 Tailed probability NORM.S.DIST, T.DIST p1 2 Tailed probability 2*p1 p2 Confidence Intervals Area in one tail a/2 a/2 Critical Value NORM.S.INV, T.INV C.V. Precision CV * sx Absolute value of Precision ABS |CV * sx| Lower Confidence Limit x - |precision| LCL Upper Confidence Limit x + |precision| UCL Tails 2 Tails 1 tail right 1 tail left Null Hypothesis H0: m = 130 H0: m <= 130 H0: m >= 130 Alternate Hypothesis H1: m /= 130 H1: m > 130 H1: m < 130 Check data & for Sample Worksheet 9 Quiz Sample Mean Observed 30 Population Mean Expected or Hypothesized m 26 Standard Deviation Sample or Population s or s 9 Sample Size number of scores n 32 Alpha + 0.05 2 Tailed probability 2*p1 p2 0.0119 Worksheet 9 Quiz Data DO NOT ASSUME these are in order for coping and pasting. The average sales per day for your company (which has 1500 stores) are The number of stores using your AU strategy are Sales per day for the stores using the AU strategy are The standard deviation of the stores using the strategy is Alpha = You would like to impress the boss that the AU strategy is not just producing a chance improvement but a statistically significant improvement. Do the necessary statistical analysis and draw your conclusions. 99 93 12 32 0.01 t Explanation No Cell reference from D X mH t X m H 0 0 sˆ sˆ X s Cell reference from D s t sˆ sˆ X Cell reference from D nn X Cell reference from D X Used with T.INV Should be the same for Z and t Should be the same for Z and t Should be the same for Z and t Used with T.DIST and T.INV Should be the same for Z and t Used with Z and t Used with Z and t t Tails 1 2 T.INV uses E20 & E12 Area on either side of mean Makes sure it is positive Cutoff below mean Cutoff above mean If p <= a then reject H0 0.0173 2 $4,500 25 $5,100 $1,200 0.05 our conclusions. Two Sample Mean Test Name Formula or Function Sample Means 1 - 2 Population Means (optional) m1 - m2 Deviation of observed difference from expected (x1 - x2) - (m1 - m2) Standard deviations S or s Variances S2 or s2 Sample sizes n1 + n2 Degrees of freedom df1+ df2 = (n1 - 1) + (n2 - 1) Unequal Sample Size Correction Factor 1/n1 + 1/n2 Sum of Squares df1*S12+df2*S22 Pooled Variance of Difference SS/d.f. Adjusted Pooled Variance of Difference SS/d.f.*CF Variance / n S12 /n1 + S22/n2 Smaller of d.f.1 or d.f.2 = IF Std. error col. D for Z & t unequal col. H for t equal (S12 /n1) + (S22/n2)] Test Statistic [t (unequal) or z] deviation / standard error Absolute Value of Test Statistic ABS Negative of absolute value -1 * |TS| Are the population standard deviations (s's) known? 1 Tailed probability (area of the curve in one tail) NORM.S.DIST T.DIST T.DIST 2 Tailed probability (area of the curve in both tails) 2 * p1 Confidence Intervals Type one error alpha Area in one tail a/2 Critical Value NORM.S.INV, T.INV, T.INV Precision CV * sx Absolute value of Precision ABS Lower Confidence Limit x - |precision| Upper Confidence Limit x + |precision| Tails 2 Tails Null Hypothesis H0: m1 = m2 Alternate Hypothesis H1: m1 /= m2 Check Data & for Sample Worksheet 10A Quiz Data 1 - 2 m1 - m2 (x1 - x2) - (m1 - m2) S or s S2 or s2 n1 + n2 Worksheet 10A Quiz Data You are attempting to determine weather your athletes will perform better if they consume Gatorade or water. You divide your sprinters into two groups and have one group pre-load on Gatorade and the other pre-load on water. You then have them run a 400-meter race and record the average time of the two groups. The average time for the Gatorade was in seconds = 47 The average time for the water was in seconds = 49 The standard deviation for the Gatorade group = 5 The standard deviation for the water group = 4 The size of both groups = 17 Alpha = 0.05 You may assume the population standard deviations, although unknown, are equal. (Be careful in running lower times are better) Welch's Approximate Method for CI when there is heterosedasticity s2 a 0.0 na 15 na-1 14 dfw #DIV/0! MEw #DIV/0! LCL #DIV/0! UCL #DIV/0! Symbol Group 1 Group 2 Difference 73.00 - 68.00 = - = 18.00 10.00 N 15 11 Sum ↓ d.f.total + = ← t equal CF + = ← t equal SS + = ← t equal ← t equal s2 1- 2 ← t equal + = ← z & t unequal d.f.smaller ←t unequal t equal ↓ s 1- 2 ← z & t unequal (SS/d.f.)*CF] TS ← z & t unequal ← t equal |TS| ← t equal ← z & t unequal ← t equal s's) known? Yes No, are s's equal? Default No Yes Z t (unequal) t (equal) Tails p1 1 p2 2 a 0.05 0.05 cell reference to D26 a / 2 for z 0.025 a / 2 for t C.V. #DIV/0! T.INV df from H9 Use D28 & D16 =H16*H28 |CV * sx| #DIV/0! ABS LCL #DIV/0! x - |precision| UCL #DIV/0! x - |precision| 1 tail right 1 tail left H0: m1 <= m2 H0: m1 >= m2 H1: m1 > m2 H1: m1 < m2 le Worksheet 10A Quiz Data 45.00 - 52.00 - 21.00 24.00 441.0 576.0 N 32 21 Z t (unequal) t (equal) p1 0.1378 0.1442 0.1337 p2 0.2755 0.2885 0.2673 ey consume Gatorade or water. orade and the other pre-load on water. s2 b 0.0 nb 11 nb-1 10 t #DIV/0! 73.00 - 68.00 - 18.00 10.00 15 11 Effect Size 1- 2) 2 2 S +S (S12 + S22)/2 SQRT((S12 + S22)/2) 1 - 2)/SQRT((S12 + S22)/2) =d Cohen's D 2 d 2 d +4 √(d2 + 4) d / √(d2 + 4) = rΥλ Effect size correlation Effect Size Explanation http://www.uccs.edu/~faculty/lbecker/es.htm F Test Standard Deviations Name Function or Formula Symbol Group 1 Sample sizes Given information n 25 Standard Deviations Given information S 6.70 Degrees of freedom n -1 d.f. Variance S^2 or given information in F & G S2 Larger 2 Variance IF S Degrees of freedom IF d.f. SL / SS2 2 Test Statistic: F Test F P(F) Probability in 1 tail F.DIST.RT p1 Probability in 2 tails 2 * p1 p2 Tails 2 Tails 1 tail right 1 tail left Null Hypothesis H0: s21 = s22 H0: s21 <= s22 H0: s21 >= s22 Alternate Hypothesis H1: s21 /= s22 H1: s21 > s22 H1: s21 < s22 Check Data & Sample Worksheet 10 AB Data Group 1 Sample size = n = 24 Standard Deviations = S = 5.30 Variances = S2 = Probability for 2 tails = 0.0359 Worksheet 10 AB Data It is the 4th quarter, Jones Green your basketball team is down 94 to 96. 2 1 There is no time left on the clock. 0 2 A technical foul has been 2 1 called against your opponent. 2 2 Your team will get two shots. 0 0 Below are the number of shots your 0 1 best two free throw shooters 2 2 have made out of 2 in the 2 1 last 18 two-shot foul game situations. 0 1 Which player will you choose and why? 2 1 Alpha = .05 2 1 0 1 2 1 2 2 0 1 2 1 0 2 2 1 25 6.70 What am I given? dard Deviations Variances Group 2 Group 1 Group 2 s L 2 20 25 23 F 9.45 s S 2 44.89 89.30 Smaller Larger Smaller Tails P(F) Tails 1 1 2 2 Group 2 Group 1 Group 2 29 24 20 8.20 32.12 43.98 2 0.4688 2 20 25 23 9.45 44.89 89.30 Dependent t test Worksheet 10C Dependent t Name Formula or Function Symbol Value Means AVERAGE Sample Standard Deviations STDEV.S S Mean of the differences AVERAGE Expected difference (if given) Expected difference (if given) m Deviation -m Standard Deviation of the Differences STDEV.S SD Sample size COUNT np Square Root of Sample Size np^.5 np Standard Error of the Differences S / np sD Test statistic: t-test deviation / standard error t Absolute value of Test statistics ABS |t| Left tail value of t |t|*-1 -t Degrees of freedom np -1 = d.f.D P(t) 1 Tailed probability T.DIST p1 2 Tailed probability (both tails) 2*p1 p2 Confidence Intervals Area in tails a a 0.05 Area in left tail a/2 a/2 Critical Value T.INV C.V. Precision CV * sD Absolute value of Precision ABS |CV * sD| Lower Confidence Limit deviation - |precision| LCL Upper Confidence Limit deviation + |precision| UCL Tails 2 Tails 1 tail right 1 tail left Null Hypothesis H0: m1 = m2 H0: m1 <= H0: m1 >= m2 m2 Alternate Hypothesis H1: m1 /= m2 H1: m1 > H1: m1 < m2 m2 Check Data & Sample Worksheet 10C Data Group 1 Group 2 26 24 25 22 22 21 21 23 29 24 31 29 32 17 29 19 Probability for 1 tail 0.0265 Probability for 2 tails 0.0531 Worksheet 10C Data You wish to see if cholesterol scores on an old diet are higher than cholesterol scores on an new diet. You measure ten individuals before and after 1 month on the diet. alpha = .05 Individual Diet Old New 1 300 301 2 285 281 3 310 300 4 267 250 5 309 300 6 293 345 7 310 301 8 267 232 9 325 302 10 301 300 16 11 12 5 17 11 12 7 5 4 6 10 X X Effect Size Group 1 2 D 1 -2) Difference S12 X1 - X2 ↓ S22 16 11 S12 + S22 12 5 (S12 + S22)/2 17 11 SQRT((S12 + S22)/2) 12 7 1 - 2)/SQRT((S12 + S22)/2) =d 5 4 Effect Size Explanation http://www.uccs.edu/~faculty/lbecker/es.h 6 10 D mD tD Tails SD 1 2 nD month on the diet. Cohen's D //www.uccs.edu/~faculty/lbecker/es.htm Pearson Correlation Formula or 3↓ 4↓ Name Function Symbol Value X X- ZX Mean of X, Y AVERAGE , 1→ Sample Standard Deviation of X, Y STDEV.S S 2→ Sum of ZxZy = SUM SZxZy ←8 43 Sample Size number of pairs n ←9 48 Degrees of freedom for r n-1 d.f.1 ←10 56 Pearson Correlation Formula SZXZY / (n -1) r ←11 61 Pearson Correlation Function CORREL r ←12 67 Degrees of freedom for t n - 2 for t d.f.2 ←13 70 Coefficient of Determination r2 r2 ←14 2 2 Coefficient of Non-determination 1-r k ←15 df2/k2 ←16 Square Root of C13 2 (df2/k )^.5 df2/k ) 2 ←17 2 Test Statistic: t for Pearson r*(df2/k ) t ←18 Absolute value of t ABS |t| ←19 left tail |t|*-1 -t P(t) Tails 1 Tailed probability T.DIST, d.f.2 = n-2 p1 1 2 Tailed probability 2 * p1 p2 2 H0: r = 0; H1: r /= 0 if p2 <= a then reject H0 Check data & 5 2 Sample Worksheet 11A Data 6 5 3 3 7 1 8 2 2 5 4 7 |t| 1.4289 ←19 P(t) Tails p1 0.1062 1 p2 0.2124 2 Worksheet 11A Data Bonus Products Assume that you are testing to see 0.0 1.25 if there is a relationship between the amount 0.1 1.35 of bonus/hr. and products sold /hr. Use a = .05. 0.2 1.40 1.0 1.65 1.5 1.70 2.0 1.80 2.5 1.85 43 128 48 120 56 135 61 143 67 141 70 152 5↓ 6↓ 7↓ Y Y- ZY ZxZy 128 120 135 143 141 152 Spearman Correlation Function RANK.AVG Name or Formula Symbol Value Score X Rank of X Acceptable Risk Given information a 0.05 1 1 Sum of the squared differences in ranks SUM Sd2 2 2 6*Sd2 = 3 3 Sample size COUNT np 4 4 2 n = 5 5 n2 -1 = 6 6 n(n2 -1) = 7 7 2 6Sd /n(n2 -1) = 8 8 Spearman Correlation Formula 1-6Sd2/n(n2 -1) = rs computed 9 Spearman Correlation Function CORREL on ranks rs computed 10 Absolute Value of Spearman ABS |rs| 11 Critical Value for rs (2 tail test) rs table #N/A 12 H0: rs = 0; rs /= 0 If |rs| computed >= rs table reject H0 13 14 15 Check Data and for XY 16 Sample Worksheet 11AB Score Score 17 52 225 18 45 220 19 61 210 20 51 230 21 36 265 22 38 241 23 42 254 24 54 228 25 53 229 26 rs computed -0.767 27 28 Worksheet 11AB Quiz Data 29 Physician Skill ranking Income Looks Rank 30 Jones 61 150,000 90 31 Smith 87 200,000 60 32 Doe 32 75,000 109 33 Green 120 60,000 155 34 Young 175 190,000 36 35 Short 155 50,000 180 36 You will need to make two copies of this worksheet on separate tabs. 37 A group of single adults were asked to rank pictures of each 38 physician for attractiveness. 39 A group of nurses ranked the physicians skill. 40 Information of each physician's net income 41 for last year was obtained from the IRS. 42 Determine if the physician's income is more closely related to 43 skill in the operating room or good looks. 44 45 1 125 46 2 90 47 3 110 48 4 75 49 5 92 50 6 85 51 7 65 52 8 55 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 RANK.AVG Spearman Table Y Rank of Y d 2 d N/a 0.10 0.05 0.01 125 5 0.900 1.000 90 6 0.829 0.886 1.000 110 7 0.715 0.786 0.929 75 8 0.620 0.715 0.881 92 9 0.600 0.700 0.834 85 10 0.564 0.649 0.794 65 11 0.537 0.619 0.764 55 12 0.504 0.588 0.735 13 0.484 0.561 0.704 14 0.464 0.539 0.680 15 0.447 0.522 0.658 16 0.430 0.503 0.636 17 0.415 0.488 0.618 18 0.402 0.474 0.600 19 0.392 0.460 0.585 20 0.381 0.447 0.570 21 0.371 0.437 0.556 22 0.361 0.426 0.544 23 0.353 0.417 0.532 24 0.345 0.407 0.521 25 0.337 0.399 0.511 26 0.331 0.391 0.501 27 0.325 0.383 0.493 28 0.319 0.376 0.484 29 0.312 0.369 0.475 30 0.307 0.363 0.467 6Sd 2 rs 1 n n2 1 ) Regression and Prediction Interval Variable Name Function or Formula Statistic X Sample Mean AVERAGE for X &Y x 1 Standard Deviation STDEV.S for X & Y s 2 Sample Size COUNT for X & Y n 3 Compute Pearson & Test for Significance X Pearson Correlation CORREL r 4 22 Degrees of freedom for t n-2 d.f.t 5 14 Coefficient of Determination r2 r2 6 31 2 2 Coefficient of Non-determination 1-r k 7 36 2 2 Ratio of d.f. to Coef of ND (n - 2)/(1 - r ) d.f ./ k 8 9 Square Root of E11 2 (d.f ./ k )^.5 (d.f ./ k ) 2 9 41 Test Statistic: t r * (d.f ./ k2) t 10 19 Absolute value ABS |t| 11 left tail |t|*-1 -t 12 1 tail probability = T.DIST p1 13 2 tail probability = 2 * p1 p2 14 Predict Y value from X Score to predict from Given Information X 15 702 Predicted Value of Y FORECAST Y' 16 Set up Prediction Interval around Y' Sum of Squares of predicted scores SUM of Column J S(Y-Y')2 20 Variance of predicted scores S(Y-Y')2/d.f. 21 Standard Error of Estimate (S(Y-Y')2/d.f.)^.5 se 22 Confidence Level Given Information CI 22 99 Area in tails as % 100 - CI % 24 Area in 2 tails as proportion % / 100 a 25 Area in one tail as a proportion a/2 a 26 Critical Value of t T.INV C.V. 27 Precision CV * se Precision 28 Absolute value of Precision ABS |Precision| 29 Lower Confidence Limit Y' - |precision| LCL 30 Upper Confidence Limit Y' + |precision| UCL 31 Check Data & X= 42 Data for Sample CI = 99 Worksheet 11C X Y 51 152 45 149 39 135 56 161 71 176 82 189 91 184 75 165 51 147 se 6.44 LCL 120.07 UCL 165.15 Worksheet 11C Data You have data for long distance telephone charges/month and sales volume/month for each of your sales representatives. Ms. Smith has reported a monthly expense of 594 expect what limits for her sales with confidence level 99 Phone charge Sales 475 28,000 209 14,000 684 35,000 359 15,000 576 29,000 704 34,000 22 20 14 14 31 54 36 63 9 17 41 71 19 23 Variable Y alpha = 0.05 Continue with Regression 17 18 19 Y Y' Y-Y' (Y-Y')2 20 n2 14 t r* 54 63 1 r2 Sy y') 17 2 71 23 sest n 2 CIy' y'±t * Sest 1-way X2 Formula or Name Function Symbol Chog Protestant RC Other Frequency Observed Given fO 90 45 35 15 Sample size SUM n Proportion expected Given p 0.45 0.25 0.15 0.10 Frequency Expected n * p or Given fE Difference fO - fE Squared Difference (fO - fE)2 Cell Chi-square (fO - fE)2 / fE Number of categories COUNT k Degrees of freedom k-1 d.f. Test Statistic: Chi-Square SUM X2 Probability CHISQ.DIST.RT p H0: fO = fE; H1: fO /= fE Retaining H0 is saying the frequencies observed do not differ from what would be expected by chance from given proportions Rejecting H0 is saying the frequencies observed differ by more than we would expect by chance from given proportions Effect Size Sample size * d.f. multiply N*(k-1) Intermediate Calculation divide X2 / (N*(k-1)) Cramer's Phi SQRT (X2 / (N*(k-1)))^.5 Check Data and Sample Worksheet 12 AB Bush Kerry Nadar fO 20 15 3 p 0.50 0.48 0.02 p= 0.0269 Worksheet 12 AB Data You wish to determine if the proportion of Purdue, Indiana, and Notre Dame graduates who work in your company is different from the distribution in its population of your state % in state # in company Purdue 82% 155 Alpha = .05 Indiana University 12% 35 Be sure to convert percents to proportions. Notre Dame 6% 10 90 45 35 15 7 0.45 0.25 0.15 0.10 0.05 None 7 0.05 ance from given proportions rom given proportions 4X4 2-way Chi Square Restaurant If you have no data for a cell leave it blank do NOT use a zero. Class Ruby T's Garfields Red Lob Applebys Freshman Sophomore 15 25 25 26 14 45 19 43 Fe Junior 32 21 36 42 Senior 34 24 26 24 Column Total fo Row t Col t Rows Columns Rows -1 Columns -1 d.f. Hypothesis Hypothesis Test of Decision Assoication Null H0 : f O = f E Independence Retain No Alternate H1: fO /= fE Dependence Reject Yes Check Data and Sample Worksheet 12AB Data Priority Major Relationship Money Business 23 15 Other 15 10 p= 0.9667 Worksheet 12AB Data Table shows the arrival time of packages 2 days or More than 2 days less UPS 32 11 FEDEX 35 3 Do the services differ in the # of packages delivered in 2 days or less? Alpha = .05 15 25 14 19 25 26 45 43 32 21 36 42 34 24 26 24 e it blank do NOT use a zero. Row Total RowTotal * ColumnTotal Fe GrandTotal Grand Total Cell X2 Grand t fe fo-fe (fo-fe)2 (fo-fe)2/fe Chi-square Sum X2 = Probabilty CHISQ.DIST.RT p = d. f . Rows 1) * Columns 1) Effect Size Phi 2X2 Chi-square/Sample Size divide X2 / N #DIV/0! SQRT (X2 / N)^.5 #DIV/0! Cramer's Phi >2X2 Determine k Minimum 0 Compute subtract k-1 -1 Sample size * (k-1) multiply N*(k-1) 0 2 Intermediate Calculation divide X / (N*(k-1)) #DIV/0! c SQRT (X2 / (N*(k-1)))^.5 #DIV/0! Simple ANOVA Group Names Shell Gulf Amoco Score # 1 35 25 29 Score # 2 29 35 34 Score # 3 27 41 36 Score # 4 38 44 38 Score # 5 42 39 22 Score # 6 51 41 21 Score # 7 23 35 Score # 8 33 Score # 9 Score # 10 Score # 11 Score # 12 Score # 13 Score # 14 Score # 15 Sample Size (of each group) COUNT n Total of all Sample Sizes n1 + n2 + . . . Nk SUM N Number of Groups COUNT k Sum of Squares (of each group) DEVSQ SS Means (of each group) AVERAGE x Grand Mean (of all scores) AVERAGE x Deviation xj - x Squared Deviation (xj - x)2 Weighted Squared Deviation nj(xj - x)2 Sum of Squares Between (for all groups) SUM Snj(xj - x)2 SSB Sum of Squares Within (for all groups) SUM SS(X - x)2 SSW Sum of Square Total (of all scores) DEVSQ SST Degrees of Freedom Between k -1 d.f.B Degrees of Freedom Within N-k d.f.W Degrees of Freedom Total N-1 d.f.T Mean Square Between SSB / dfB MSB Mean Square Within SSW / dfW MSW Test Statistics: F MSB / MSW F Probability F.DIST.RT p Source SS d.f. MS F Between the Means Snj(xj - x)2 Between Within the Groups SS(X - x)2 Within Sum of Squares Total SSB + SSW Total Hypothesis Check work with data below Null: H0: m1 = m2 = m3 = m4 . . = mK Papa J Domino Pz Hut CiCi Alternate: H1: At least 2 means differ 9 5 9 3 8 6 8 4 6 8 7 5 7 7 6 3 5 5 4 9 2 1 8 Source SS d.f. MS F SSB 67.06 3 22.35 8.572 SSW 49.55 19 2.61 SST 116.61 You are considering 3 cities for the annual convention of your professional association In order to determine which location is going to be the most economical for your members, you have sampled motel rooms and prices in each of the 3 cities. Below are the cities and the prices for the motel rooms Is there a significant difference in the cost of a room in the 3 different cities. Alpha = .05 Fort Wayne Chicago Indy 75 105 75 85 130 90 100 115 100 105 145 125 35 25 29 29 35 34 27 41 36 38 44 38 42 39 22 51 41 21 23 35 33 p p 0.0008 Simple ANOVA Group Names Psy Business Score # 1 10 10 Score # 2 9 12 Score # 3 6 9 Score # 4 8 4 Score # 5 7 6 Score # 6 5 5 Score # 7 8 6 Score # 8 4 3 Score # 9 6 9 Score # 10 9 11 Score # 11 12 Score # 12 Score # 13 Score # 14 Score # 15 Sample Size (of each group) COUNT n Total of all Sample Sizes SUM N Number of Groups COUNT k Sum of Squares (of each group) DEVSQ SS Means (of each group) AVERAGE x Grand Mean (of all scores) AVERAGE x Deviation xj - x Squared Deviation (xj - x)2 Weighted Squared Deviation nj(xj - x)2 Sum of Squares Between (for all groups) SUM Snj(xj - x)2 SSB Sum of Squares Within (for all groups) SUM SS(X - x)2 SSW Sum of Square Total (of all scores) DEVSQ SST Degrees of Freedom Between k -1 d.f.B Degrees of Freedom Within N-k d.f.W Degrees of Freedom Total N-1 d.f.T Mean Square Between SSB / dfB MSB Mean Square Within SSW / dfW MSW Test Statistics: F MSB / MSW F Probability F.DIST.RT p Source SS d.f. Between the Means Snj(xj - x) 2 Between Within the Groups SS(X - x)2 Within Sum of Squares Total SSB + SSW Total Group Eta-squared SSB / SST* 100 h2 Mean n Group 1 vs. vs. vs. Group 2 Difference in Means x1-x2 Tukey use if n's are equal Absolute Value of Difference in Means ABS |x1-x2| Tabled value for Tukey for .05 q for 0.05 q #N/A Tabled value for Tukey for .01 q for 0.01 q #N/A Means Square Within / n MSW/n MSW / n Square Root (Means Square Within / n) (MSW/n)^.5 (MSW / n).5 Tukey Value for a = .05 q.05 * (MSw/n)^.5 TV.05 Tukey Value for a = .01 q.01 * (MSw/n)^.5 TV.01 Critical Value.05: if positive reject H0 |x1-x2| - TV.05 CV.05 Critical Value.01: if positive reject H0 |x1-x2| - TV.01 CV.01 Scheffe use if n's are not equal Square of difference in mean (x1-x2)^2 (x1-x2)2 Reciprocal of first sample size 1 / n1 1 / n1 Reciprocal of second sample size 1 / n2 1 / n2 Sum of reciprocals 1 / n1 + 1 / n2 Product of Sum of reciprocals, dfb & MSW =$D29*$D33*D61 Scheffe Value =d58/d62 Fs Probability of type 1 error in Rejecting H0 F.DIST.RT p Hypotheses for ANOVA Hypothesis Null: H0: m1 m2 m3 m4 . . mK Check work with this data Alternate: H1: At least 2 means differ? Arby's Bob Evans 12 13 Hypotheses for Tukey or Scheffe 12 15 Null: H0: m1 m2 13 14 Alternate: H1: m1 / m3 14 13 15 12 12 15 12 16 11 14 13 9 9 12 12 Table is available at p= 0.5964507 0.0275641 Tukey Table On the Web Critical Values for Tukey's HSD Worksheet 13B Quiz Data Determine if four brands of athletic shoe differ in the number of months they last. A B 1) 24 36 2) 36 48 3) 12 50 4) 11 39 5) 5 38 6) 30 42 7) 21 46 8) 32 37 9) 15 47 10) 7 44 10 10 12 9 12 13 6 9 11 8 4 19 7 6 17 5 5 13 8 6 14 4 3 15 6 9 11 9 11 9 12 6 Pol Sci SW 12 5 13 5 11 9 19 7 17 5 13 9 14 6 15 3 11 7 9 5 6 Use Tukey MS F p vs. vs. vs. vs. Both Values for alpha = .05 Tukey Tukey Degrees of Freedom Within Tukey Infinity Tukey 5 Tukey 6 Tukey 7 Tukey 8 Tukey 9 Tukey 10 11 Scheffe 12 Scheffe 13 Scheffe 14 Scheffe 15 Scheffe 16 Scheffe 17 Scheffe 18 19 20 24 Chi-Chi's Dolinsky's 30 9 14 40 8 12 60 9 13 120 8 14 9 11 10 10 9 11 12 12 14 13 11 11 12 0.9958222 0.0010136 0.4595344 0.0477663 ds of athletic shoe months they last. C D 24 30 29 32 21 29 30 31 28 33 22 35 27 28 24 32 26 33 25 35 5 5 9 7 5 9 6 3 7 5 Values for alpha = .01 Degrees of Freedom Between 1 2 3 4 5 6 7 8 9 Degrees of Freedom Within 2.77 3.31 3.63 3.86 4.03 4.17 4.29 4.39 4.47 5 3.64 4.60 5.22 5.67 6.03 6.33 6.58 6.80 6.99 6 3.46 4.34 4.90 5.30 5.63 5.90 6.12 6.32 6.49 7 3.34 4.16 4.68 5.06 5.36 5.61 5.82 6.00 6.16 8 3.26 4.04 4.53 4.89 5.17 5.40 5.60 5.77 5.92 9 3.20 3.95 4.41 4.76 5.02 5.24 5.43 5.59 5.74 10 3.15 3.88 4.33 4.65 4.91 5.12 5.30 5.46 5.60 11 3.11 3.82 4.26 4.57 4.82 5.03 5.20 5.35 5.49 12 3.08 3.77 4.20 4.51 4.75 4.95 5.12 5.27 5.39 13 3.06 3.73 4.15 4.45 4.69 4.88 5.05 5.19 5.32 14 3.03 3.70 4.11 4.41 4.64 4.83 4.99 5.13 5.25 15 3.01 3.67 4.08 4.37 4.59 4.78 4.94 5.08 5.20 16 3.00 3.65 4.05 4.33 4.56 4.74 4.90 5.03 5.15 17 2.98 3.63 4.02 4.30 4.52 4.70 4.86 4.99 5.11 18 2.97 3.61 4.00 4.28 4.49 4.67 4.82 4.96 5.07 19 2.96 3.59 3.98 4.25 4.47 4.65 4.79 4.92 5.04 20 2.95 3.58 3.96 4.23 4.45 4.62 4.77 4.90 5.01 24 2.92 3.53 3.90 4.17 4.37 4.54 4.68 4.81 4.92 30 2.89 3.49 3.85 4.10 4.30 4.46 4.60 4.72 4.82 40 2.86 3.44 3.79 4.04 4.23 4.39 4.52 4.63 4.73 60 2.83 3.40 3.74 3.98 4.16 4.31 4.44 4.55 4.65 120 2.80 3.36 3.68 3.92 4.10 4.24 4.36 4.47 4.56 Infinity Degrees of Freedom Between 1 2 3 4 5 6 7 8 9 5.70 6.98 7.80 8.42 8.91 9.32 9.67 9.97 10.24 5.24 6.33 7.03 7.56 7.97 8.32 8.61 8.87 9.10 4.95 5.92 6.54 7.01 7.37 7.68 7.94 8.17 8.37 4.75 5.64 6.20 6.62 6.96 7.24 7.47 7.68 7.86 4.60 5.43 5.96 6.35 6.66 6.91 7.13 7.33 7.49 4.48 5.27 5.77 6.14 6.43 6.67 6.87 7.05 7.21 4.39 5.15 5.62 5.97 6.25 6.48 6.67 6.84 6.99 4.32 5.05 5.50 5.84 6.10 6.32 6.51 6.67 6.81 4.26 4.96 5.40 5.73 5.98 6.19 6.37 6.53 6.67 4.21 4.89 5.32 5.63 5.88 6.08 6.26 6.41 6.54 4.17 4.84 5.25 5.56 5.80 5.99 6.16 6.31 6.44 4.13 4.79 5.19 5.49 5.72 5.92 6.08 6.22 6.35 4.10 4.74 5.14 5.43 5.66 5.85 6.01 6.15 6.27 4.07 4.70 5.09 5.38 5.60 5.79 5.94 6.08 6.20 4.05 4.67 5.05 5.33 5.55 5.73 5.89 6.02 6.14 4.02 4.64 5.02 5.29 5.51 5.69 5.84 5.97 6.09 3.96 4.55 4.91 5.17 5.37 5.54 5.69 5.81 5.92 3.89 4.45 4.80 5.05 5.24 5.40 5.54 5.65 5.76 3.82 4.37 4.70 4.93 5.11 5.26 5.39 5.50 5.60 3.76 4.28 4.59 4.82 4.99 5.13 5.25 5.36 5.45 3.70 4.20 4.50 4.71 4.87 5.01 5.12 5.21 5.30 3.64 4.12 4.40 4.60 4.76 4.88 4.99 5.08 5.16