VIEWS: 18 PAGES: 53 POSTED ON: 7/27/2012 Public Domain
Midterm Review Session

Things to Review
• Concepts
• Basic formulae
• Statistical tests

Populations <-> Parameters; Samples <-> Estimates

Nomenclature
                      Population (parameter)   Sample (statistic)
Mean                  $\mu$                    $\bar{x}$
Variance              $\sigma^2$               $s^2$
Standard deviation    $\sigma$                 $s$

In a random sample, each member of a population has an equal and independent chance of being selected.

Review - types of variables
• Categorical variables
  – Nominal
  – Ordinal
• Numerical variables
  – Discrete
  – Continuous

                     Reality
Result               Ho true         Ho false
Reject Ho            Type I error    correct
Do not reject Ho     correct         Type II error

[Figures: Sampling distribution of the mean, n = 10; n = 100; n = 1000]

• Sample
• Null hypothesis
• Test statistic
• Null distribution
• Compare: How unusual is this test statistic?
  – P < 0.05: Reject Ho
  – P > 0.05: Fail to reject Ho

Statistical tests
• Binomial test
• Chi-squared goodness-of-fit
  – Proportional, binomial, Poisson
• Chi-squared contingency test
• t-tests
  – One-sample t-test
  – Paired t-test
  – Two-sample t-test

Quick reference summary: Binomial test
• What is it for? Compares the proportion of successes in a sample to a hypothesized value, po
• What does it assume? Individual trials are randomly sampled and independent
• Test statistic: X, the number of successes
• Distribution under Ho: binomial with parameters n and po.
• Formula: $P(x) = \binom{n}{x} p^{x} (1-p)^{n-x}$ and $P = 2 \times \Pr[x \ge X]$ (taking the tail in the direction of the observed X)
  – P(x) = probability of a total of x successes
  – p = probability of success in each trial
  – n = total number of trials

Binomial test
• Sample
• Null hypothesis: Pr[success] = po
• Test statistic: x = number of successes
• Null distribution: Binomial with parameters n, po
• Compare: How unusual is this test statistic?
  – P < 0.05: Reject Ho
  – P > 0.05: Fail to reject Ho

Binomial test
H0: The relative frequency of successes in the population is p0
HA: The relative frequency of successes in the population is not p0

Quick reference summary: $\chi^2$ Goodness-of-Fit test
• What is it for? Compares observed frequencies in categories of a single variable to the expected frequencies under a random model
• What does it assume? Random samples; no expected values < 1; no more than 20% of expected values < 5
• Test statistic: $\chi^2$
• Distribution under Ho: $\chi^2$ with df = # categories - # parameters - 1
• Formula: $\chi^2 = \sum_{\text{all classes}} \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$

$\chi^2$ goodness of fit test
• Sample
• Null hypothesis: Data fit a particular discrete distribution
• Calculate expected values
• Test statistic: $\chi^2 = \sum_{\text{all classes}} \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$
• Null distribution: $\chi^2$ with N - 1 - param. d.f.
• Compare: How unusual is this test statistic?
  – P < 0.05: Reject Ho
  – P > 0.05: Fail to reject Ho

$\chi^2$ Goodness-of-Fit test
H0: The data come from a certain distribution
HA: The data do not come from that distribution

Possible distributions
• Binomial: $\Pr[x] = \binom{n}{x} p^{x} (1-p)^{n-x}$
• Poisson: $\Pr[X] = \frac{e^{-\mu} \mu^{X}}{X!}$
• Proportional: Pr[x] = n * frequency of occurrence

Proportional
• Given a number of categories
• Probability proportional to number of opportunities
• Days of the week, months of the year

Binomial
• Number of successes in n trials
• Have to know n, p under the null hypothesis
• Punnett square, many p = 0.5 examples

Poisson
• Number of events in interval of space or time
• n not fixed, not given p
• Car wrecks, flowers in a field

Quick reference summary: $\chi^2$ Contingency Test
• What is it for? Tests the null hypothesis of no association between two categorical variables
• What does it assume? Random samples; no expected values < 1; no more than 20% of expected values < 5
• Test statistic: $\chi^2$
• Distribution under Ho: $\chi^2$ with df = (r-1)(c-1), where r = # rows, c = # columns
• Formulae: $\text{Expected} = \frac{\text{RowTotal} \times \text{ColTotal}}{\text{GrandTotal}}$ and $\chi^2 = \sum_{\text{all classes}} \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$

$\chi^2$ Contingency Test
• Sample
• Null hypothesis: No association between variables
• Calculate expected values
• Test statistic: $\chi^2 = \sum_{\text{all classes}} \frac{(\text{Observed}_i - \text{Expected}_i)^2}{\text{Expected}_i}$
• Null distribution: $\chi^2$ with (r-1)(c-1) d.f.
• Compare: How unusual is this test statistic?
  – P < 0.05: Reject Ho
  – P > 0.05: Fail to reject Ho

$\chi^2$ Contingency test
H0: There is no association between these two variables
HA: There is an association between these two variables

Quick reference summary: One sample t-test
• What is it for? Compares the mean of a numerical variable to a hypothesized value, μo
• What does it assume? Individuals are randomly sampled from a population that is normally distributed.
• Test statistic: t
• Distribution under Ho: t-distribution with n-1 degrees of freedom.
• Formula: $t = \frac{\bar{Y} - \mu_o}{SE_{\bar{Y}}}$

One-sample t-test
• Sample
• Null hypothesis: The population mean is equal to μo
• Test statistic: $t = \frac{\bar{Y} - \mu_o}{s / \sqrt{n}}$
• Null distribution: t with n-1 df
• Compare: How unusual is this test statistic?
  – P < 0.05: Reject Ho
  – P > 0.05: Fail to reject Ho

One-sample t-test
Ho: The population mean is equal to μo
Ha: The population mean is not equal to μo

Paired vs. 2 sample comparisons

Quick reference summary: Paired t-test
• What is it for? To test whether the mean difference in a population equals a null hypothesized value, μdo
• What does it assume? Pairs are randomly sampled from a population. The differences are normally distributed
• Test statistic: t
• Distribution under Ho: t-distribution with n-1 degrees of freedom, where n is the number of pairs
• Formula: $t = \frac{\bar{d} - \mu_{do}}{SE_{\bar{d}}}$

Paired t-test
• Sample
• Null hypothesis: The mean difference is equal to μdo
• Test statistic: $t = \frac{\bar{d} - \mu_{do}}{SE_{\bar{d}}}$
• Null distribution: t with n-1 df (n is the number of pairs)
• Compare: How unusual is this test statistic?
  – P < 0.05: Reject Ho
  – P > 0.05: Fail to reject Ho

Paired t-test
Ho: The mean difference is equal to 0
Ha: The mean difference is not equal to 0

Quick reference summary: Two-sample t-test
• What is it for? Tests whether two groups have the same mean
• What does it assume? Both samples are random samples. The numerical variable is normally distributed within both populations. The variance of the distribution is the same in the two populations
• Test statistic: t
• Distribution under Ho: t-distribution with n1+n2-2 degrees of freedom.
• Formulae: $t = \frac{\bar{Y}_1 - \bar{Y}_2}{SE_{\bar{Y}_1 - \bar{Y}_2}}$, $SE_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$, $s_p^2 = \frac{df_1 s_1^2 + df_2 s_2^2}{df_1 + df_2}$

Two-sample t-test
• Sample
• Null hypothesis: The two populations have the same mean ($\mu_1 = \mu_2$)
• Test statistic: $t = \frac{\bar{Y}_1 - \bar{Y}_2}{SE_{\bar{Y}_1 - \bar{Y}_2}}$
• Null distribution: t with n1+n2-2 df
• Compare: How unusual is this test statistic?
  – P < 0.05: Reject Ho
  – P > 0.05: Fail to reject Ho

Two-sample t-test
Ho: The means of the two populations are equal
Ha: The means of the two populations are not equal

Which test do I use?
How many variables am I comparing?
• 1: Methods for a single variable
• 2: Methods for comparing two variables

Methods for one variable
Is the variable categorical or numerical?
• Categorical: Comparing to a single proportion po or to a distribution?
  – po: Binomial test
  – distribution: $\chi^2$ Goodness-of-fit test
• Numerical: One-sample t-test

Methods for two variables (X = explanatory variable, Y = response variable)
• X categorical, Y categorical: Contingency table, grouped bar graph, mosaic plot
• X categorical, Y numerical: Multiple histograms, cumulative frequency distributions
• X numerical, Y numerical: Scatter plot

Methods for two variables (X = explanatory variable, Y = response variable)
• X categorical, Y categorical: Contingency table, grouped bar graph, mosaic plot; Contingency analysis
• X numerical, Y categorical: Logistic regression
• X categorical, Y numerical: Multiple histograms, cumulative frequency distributions; t-test
• X numerical, Y numerical: Scatter plot; Regression

Methods for two variables
Is the response variable categorical or numerical?
• Categorical: Contingency analysis
• Numerical: t-test

How many variables am I comparing?
• 1: Is the variable categorical or numerical?
  – Categorical, comparing to a single proportion po: Binomial test
  – Categorical, comparing to a distribution: $\chi^2$ Goodness-of-fit test
  – Numerical: One-sample t-test
• 2: Is the response variable categorical or numerical?
  – Categorical: Contingency analysis
  – Numerical: t-test

Sample Problems
An experiment compared the testes sizes of four experimental populations of monogamous flies to four populations of polygamous flies:
a. What is the difference in mean testes size for males from monogamous populations compared to males from polyandrous populations? What is the 95% confidence interval for this estimate?
b. Carry out a hypothesis test to compare the means of these two groups. What conclusions can you draw?

Sample Problems
In Vancouver, the probability of rain during a winter day is 0.58, for a spring day 0.38, for a summer day 0.25, and for a fall day 0.53. Each of these seasons lasts one quarter of the year.
What is the probability of rain on a randomly-chosen day in Vancouver?

Sample Problems
A study by Doll et al. (1994) examined the relationship between moderate intake of alcohol and the risk of heart disease. 410 men (209 "abstainers" and 201 "moderate drinkers") were observed over a period of 10 years, and the number experiencing cardiac arrest over this period was recorded and compared with drinking habits. All men were 40 years of age at the start of the experiment. By the end of the experiment, 12 abstainers had experienced cardiac arrest whereas 9 moderate drinkers had experienced cardiac arrest. Test whether or not relative frequency of cardiac arrest was different in the two groups of men.

Sample Problems
An RSPCA survey of 200 randomly-chosen Australian pet owners found that 10 said that they had met their partner through owning the pet.
A. Find the 95% confidence interval for the proportion of Australian pet owners who find love through their pets.
B. What test would you use to test if the true proportion is significantly different from 0.01? Write the formula that you would use to calculate a P-value.

Sample Problems
One thousand coins were each flipped 8 times, and the number of heads was recorded for each coin. Here are the results:
Does the distribution of coin flips match the distribution expected with fair coins? ("Fair coin" means that the probability of heads per flip is 0.5.) Carry out a hypothesis test.

Sample Problems
Vertebrates are thought to be unidirectional in growth, with size either increasing or holding steady throughout life. Marine iguanas from the Galápagos are unusual in a number of ways, and a team of researchers has suggested that these iguanas might actually shrink during the low food periods caused by El Niño events (Wikelski and Thom 2000). During these events, up to 90% of the iguana population can die from starvation. Here is a plot of the changes in body length of 64 surviving iguanas during the 1992-1993 El Niño event.
The average change in length was −5.81 mm, with standard deviation 19.50 mm. Test the hypothesis that length did not change on average during the El Niño event.
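
The iguana problem gives only summary statistics, so the one-sample t-test can be sketched directly from them. The snippet below is a minimal illustration, not a prescribed solution: it assumes the null value μo = 0 (no change in length), assumes the changes are approximately normally distributed, and approximates the two-sided P-value by numerically integrating the t density instead of reading a statistical table. The function names (`t_statistic`, `t_pdf`, `two_sided_p`) are made up for this example.

```python
import math


def t_statistic(mean, mu0, sd, n):
    """One-sample t statistic from summary statistics: (mean - mu0) / (sd / sqrt(n))."""
    se = sd / math.sqrt(n)  # standard error of the mean
    return (mean - mu0) / se


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided P-value using Simpson's rule on the t density over [0, |t|].

    steps must be even for Simpson's rule.
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # Pr[0 < T < |t|]
    return 2 * (0.5 - central)


# Summary statistics from the problem statement (mm).
n = 64
t = t_statistic(mean=-5.81, mu0=0.0, sd=19.50, n=n)  # mu0 = 0 is the assumed null value
df = n - 1
p = two_sided_p(t, df)

print(f"t = {t:.2f}, df = {df}, two-sided P = {p:.3f}")
print("Reject Ho" if p < 0.05 else "Fail to reject Ho")
```

With these numbers the standard error is 19.50 / 8 = 2.4375 mm, so t is about −2.38 with 63 degrees of freedom, and the two-sided P-value falls below 0.05.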