Testing of hypothesis

Dr. L. Jeyaseelan
Dept. of Biostatistics
Christian Medical College
Vellore, India

Statistics
• Descriptive statistics: summarize mean / proportion (incidence / prevalence)
• Inferential statistics: hypothesis testing
  (i) Comparison of means
  (ii) Comparison of proportions (incidences / prevalences)

Hypotheses
Research question: Is there a (statistically) significant difference between two groups with respect to the outcome?
Null hypothesis: There is no (statistically) significant difference between two groups with respect to the outcome.
Alternative hypothesis: There is a (statistically) significant difference between two groups with respect to the outcome.

Two groups: two independent populations
Outcome: scores obtained
Intervention: educational training

P-value
The probability of getting a result as extreme as, or more extreme than, the one observed when the null hypothesis is true.
When our study yields a P-value of 0.01, a difference at least as large as the one we found would arise by chance only about 1 time in 100 if the null hypothesis were true. It is therefore unlikely that our results occurred by chance, and the difference we found in the sample is probably due to the teaching programme.

Variation
(i) Chance variation
(ii) Effect variation
The difference that we might find between the two groups' exam achievement in our sample might have occurred by chance, or it might have occurred due to the teaching programme.

'P' as a significance level
P < 0.05: result is statistically significant
P > 0.05: result is not statistically significant
These cutoffs are arbitrary and have no special importance.

COMPARISON OF MEANS
t-tests

A bit of history...
W.S. Gosset (1908) first published a t-test. He worked at the Guinness Brewery in Dublin and published under the name Student. The test was called Student's t-test (later shortened to t-test).

Types of t-tests
• One-sample t-test
• t-test for two independent (uncorrelated) samples
  (i) Equal variance
  (ii) Unequal variance
• t-test for two paired (correlated) samples

Comparison of two independent means (Student's t-test / unpaired t-test)
A t-test is used when we wish to compare two means.

Type of data required
Independent variable: one nominal variable with two levels
  E.g., (i) boy/girl students; (ii) non-smoking/heavy-smoking mothers
Dependent variable: continuous variable
  E.g., (i) marks obtained by the students in the annual exam; (ii) birth weight of children

Assumptions
• The samples are random and independent of each other.
• The independent variable is categorical and contains only two levels.
• The distribution of the dependent variable is normal. If the distribution is seriously skewed, the t-test may be invalid.
• The variances are equal in both groups.

Example data
A study was conducted to compare the birth weights of children born to 15 non-smoking mothers with those of children born to 14 heavy-smoking mothers.

Non-smoking mothers   Heavy-smoking mothers
(n = 15)              (n = 14)
3.99                  3.18
3.79                  2.84
3.60                  2.90
3.73                  3.27
3.21                  3.85
3.60                  3.52
4.08                  3.23
3.61                  2.76
3.83                  3.60
3.31                  3.75
4.13                  3.59
3.26                  3.63
3.54                  2.38
3.51                  2.34
2.71

Test statistic:
t = \frac{|\bar{x}_1 - \bar{x}_2|}{S \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}
where
S^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}

Checking the normality
(figure not reproduced)

Unequal variances
Sometimes we wish to compare two groups of observations where the assumption of normality is reasonable, but the variability in the two groups is markedly different. Two questions arise:
(1) How different do the variances have to be before we should not use the two-sample t-test?
(2) What can we do if this happens?

(1) Levene's test for equality of variances
Null hypothesis: the variances are equal
Alternative hypothesis: the variances are not equal
If Levene's test is not significant (P > 0.05), report "equal variances assumed".
If Levene's test is significant (P < 0.05), report "equal variances not assumed".
(2) Use the modified t-test in the presence of unequal variances.

How to report the results?

                           Heavy-smoking     Non-smoking       Diff in means        P-value
                           mothers (n=14)    mothers (n=15)    (95% CI)
                           Mean    SD        Mean    SD
Birth weight of children   3.20    0.49      3.60    0.37      0.4 (0.06 to 0.72)   0.022

If there were truly no difference, a difference in birth weight as large as the one observed between children of non-smoking and heavy-smoking mothers would arise by chance only about 2 times in 100.

The distribution of data
Normal data: SD < ½ mean: use the t-test
Skewed / non-normal data: SD > ½ mean: use the non-parametric Mann-Whitney test or a t-test on log-transformed data
Note: Applicable only for variables where negative values are impossible (e.g., Rate of GFR change)
Ref: Altman DG, 1991

Clinical significance vs statistical significance
A possible antipyretic is tested in patients with the common cold.
• 500 receive the candidate drug
• 500 receive a placebo control
• Temperatures are measured 4 hours after dosing

          N     Mean     StDev   SE Mean
Drug      500   39.950   0.653   0.029
Control   500   40.058   0.699   0.031
P-value = 0.011

Statistical significance? Yes. There probably is a reduction in temperature.
Clinical significance? No. Temperature fell by only about 0.1 °C.
Because the sample size is so large, we are able to detect a very small change in temperature.

Misuses of t-test
• t-test for non-normal data

                           Hospital 1        Hospital 2
                           Mean (SD)   n     Mean (SD)   n
Length of stay (in days)   26 (17)     11    79 (57)     13

  Heterogeneous data: SD > ½ mean
  Correct method: non-parametric Mann-Whitney test, reported with median and range values

• t-test for paired observations

              Before intervention   After intervention
(n = 12)      Mean     SD           Mean     SD
BP levels     142.0    30.5         120.5    31.5

  Correct method: paired t-test

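Returning to the birth-weight example: as a minimal sketch, the pooled (equal-variance) t statistic and the 95% confidence interval for the difference in means can be recomputed from the raw birth weights with the Python standard library alone. The function and variable names are illustrative, and the critical value 2.052 (two-sided 5% point of the t distribution on 27 degrees of freedom) is an assumption taken from standard t tables rather than computed here.

```python
from math import sqrt
from statistics import mean, variance

# Birth weights from the example data (non-smoking n = 15, heavy smoking n = 14)
non_smoking = [3.99, 3.79, 3.60, 3.73, 3.21, 3.60, 4.08, 3.61,
               3.83, 3.31, 4.13, 3.26, 3.54, 3.51, 2.71]
heavy_smoking = [3.18, 2.84, 2.90, 3.27, 3.85, 3.52, 3.23,
                 2.76, 3.60, 3.75, 3.59, 3.63, 2.38, 2.34]


def pooled_t(x, y):
    """Return (difference in means, standard error, t, degrees of freedom)."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled variance: S^2 = ((n1-1)*s1^2 + (n2-1)*s2^2) / (n1 + n2 - 2)
    s2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    se = sqrt(s2) * sqrt(1 / n1 + 1 / n2)
    diff = mean(x) - mean(y)
    return diff, se, abs(diff) / se, df


diff, se, t_stat, df = pooled_t(non_smoking, heavy_smoking)

# Assumption: tabulated two-sided 5% critical value of t on 27 df.
T_CRIT_27 = 2.052
ci_low, ci_high = diff - T_CRIT_27 * se, diff + T_CRIT_27 * se

print(f"difference = {diff:.2f}, t = {t_stat:.2f} on {df} df")
print(f"95% CI: {ci_low:.2f} to {ci_high:.2f}")
```

A t statistic larger than the tabulated critical value corresponds to P < 0.05, and the interval can be compared directly with the one in the reporting table. This sketch assumes equal variances; it does not implement Levene's test or the modified (unequal-variance) t-test.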
• Multiple t-tests: comparison of length of stay between three hospitals

                           Hospital 1        Hospital 2        Hospital 3
                           Mean (SD)   n     Mean (SD)   n     Mean (SD)   n
Length of stay (in days)   25 (5)      12    75 (20)     13    30 (10)     14

  Hospital 1 vs Hospital 2: P-value = ?
  Hospital 1 vs Hospital 3: P-value = ?
  Hospital 2 vs Hospital 3: P-value = ?
  With 3 comparisons, each tested at 0.05, the overall chance of at least one false-positive result can be as high as about 3 x 0.05 = 0.15.
  Correct method: ANOVA with Bonferroni correction.

Two groups of paired observations
Paired t-test
• The same individuals are studied more than once in different circumstances, e.g., measurements made on the same people before and after an intervention.
• The outcome variable should be continuous.
• The difference between the pre and post measurements should be normally distributed.

A study was carried out to evaluate the effect of a new diet on weight loss. The study population consists of 12 people who used the diet for 2 months; their weights before and after the diet are given below.

Weight (kg)
Patient No.   Before diet   After diet
1             75            70
2             60            54
3             68            58
4             98            93
5             83            78
6             89            84
7             65            60
8             78            77
9             95            90
10            80            76
11            100           94
12            108           100

The research question asks whether the diet makes a difference.

Paired t-test output
(output not reproduced)

t-test: to examine the difference between two independent groups
Paired t-test: to examine the difference between pre and post measures of the same group

How do we compare more than two group means?
Example:
Treatments: A, B, C and D
Response: BP level

How does the t-test concept work here?
A versus B    B versus C
A versus C    B versus D
A versus D    C versus D

The chance of at least one false-positive result rises rapidly with the number of tests conducted. For six tests, each at the 0.05 level:
1 - (1 - 0.05)^6 ≈ 0.26

Instead of using a series of individual comparisons, we examine the differences among the groups through an analysis that considers the variation across all groups at once.

Analysis of Variance (ANOVA)

Why ANOVA and not ANOME (analysis of means)?
Although means are compared, the comparisons are made using estimates of variance. The ANOVA test statistic, the F statistic, is a ratio of variance estimates.

Hypotheses
The main analysis is to determine whether the population means are all equal. If there are k means, then the null hypothesis is
H_0: \mu_1 = \mu_2 = \ldots = \mu_k
and the alternative hypothesis is
H_A: not all of \mu_1, \mu_2, \ldots, \mu_k are equal (at least one mean differs from the others)

Type of data required
Independent variable: one nominal variable (> 2 levels)
  E.g., socio-economic status (low / medium / high)
Dependent variable: continuous variable (normally distributed)
  E.g., Hb level

Assumptions
• The samples are random and independent of each other.
• The independent variable is categorical and contains more than two levels.
• The distribution of the dependent variable is normal. If the distribution is seriously skewed, the ANOVA may be invalid.
• The groups should have equal variances.

Example data
A study was conducted to assess the Hb levels of women of low, medium and high socio-economic status.

Sl. No.   Low (n = 20)   Medium (n = 18)   High (n = 17)
1         8.10           8.40              12.70
2         8.00           11.10             11.80
3         6.90           10.80             13.10
4         11.40          11.00             12.30
5         10.70          12.20             10.90
6         10.20          8.70              12.60
7         8.90           12.30             13.20
8         9.90           11.50             14.20
9         6.80           11.60             11.80
10        9.10           12.90             12.40
11        9.20           12.00             12.70
12        7.40           10.90             13.40
13        10.70          11.70             14.30
14        11.40          11.00             13.80
15        7.70           12.20             15.00
16        6.10           11.20             14.20
17        11.00          10.70             9.20
18        11.10          9.90
19        7.90
20        10.60

Source of variation
ANOVA separates the variation in all the data into two parts: the variation between each group mean and the overall mean for all the groups (the between-group variability), and the variation between each study participant and that participant's group mean (the within-group variability). If the between-group variability is much greater than the within-group variability, there are likely to be differences between the group means.

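The split into between-group and within-group variability can be sketched for the Hb example as follows. This is an illustrative standard-library calculation with hypothetical names, not the output of any particular statistics package. It stops at the F statistic and its degrees of freedom, so the P-value would still have to be read from F tables or obtained from a statistics package.

```python
from statistics import mean

# Hb levels from the example data, by socio-economic status
hb = {
    "low": [8.10, 8.00, 6.90, 11.40, 10.70, 10.20, 8.90, 9.90, 6.80, 9.10,
            9.20, 7.40, 10.70, 11.40, 7.70, 6.10, 11.00, 11.10, 7.90, 10.60],
    "medium": [8.40, 11.10, 10.80, 11.00, 12.20, 8.70, 12.30, 11.50, 11.60,
               12.90, 12.00, 10.90, 11.70, 11.00, 12.20, 11.20, 10.70, 9.90],
    "high": [12.70, 11.80, 13.10, 12.30, 10.90, 12.60, 13.20, 14.20, 11.80,
             12.40, 12.70, 13.40, 14.30, 13.80, 15.00, 14.20, 9.20],
}


def one_way_anova(groups):
    """Return (SS_between, SS_within, df_between, df_within, F)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group: each group mean against the overall mean, weighted by n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group: each observation against its own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


ss_b, ss_w, df_b, df_w, f_stat = one_way_anova(list(hb.values()))
print(f"between: SS = {ss_b:.2f}, df = {df_b}")
print(f"within:  SS = {ss_w:.2f}, df = {df_w}")
print(f"F = {f_stat:.2f}")
```

A large F indicates that the between-group mean square is large relative to the within-group mean square, which is the situation described above in which the group means are likely to differ.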
ANOVA data
(figure not reproduced: Group 1, Group 2, Group 3)

ANOVA output
(output not reproduced)

Multiple comparisons procedure
ANOVA is a "group comparison" that determines whether a statistically significant difference exists somewhere among the groups studied. If a significant difference is indicated, ANOVA is usually followed by a "multiple comparison procedure" that compares combinations of groups to examine further any differences among them. The most common multiple comparison procedure is the "pairwise comparison", in which each group mean is compared (two at a time) with all other group means to determine which groups differ significantly.

Bonferroni test
Uses t-tests to perform pairwise comparisons between group means, but controls the overall error rate by setting the error rate for each test to the experiment-wise error rate divided by the total number of tests. The disadvantage of this procedure is that the true overall level may be so much less than the maximum value 'α' that none of the individual tests is likely to be rejected, i.e., the procedure is conservative.

Tukey's method
Uses the studentized range statistic to make all of the pairwise comparisons between groups. Sets the experiment-wise error rate at the error rate for the collection of all pairwise comparisons. This method is applicable when
1. The sizes of the samples from each group are equal.
2. Pairwise comparisons of means are of primary interest, that is, null hypotheses of the form H_0: \mu_i = \mu_j are to be considered.

Scheffé test
Performs simultaneous joint pairwise comparisons for all possible pairwise combinations of means. Uses the F sampling distribution. This method is recommended when
1. The sizes of the samples selected from the different populations are unequal.
2. Comparisons other than simple pairwise comparisons between two means are of interest.

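As a rough illustration of why a correction is needed, the sketch below computes the number of pairwise comparisons among k groups, the chance of at least one false-positive result when each comparison is tested at 0.05, and the Bonferroni per-test level. The function names are hypothetical, and the error-rate formula assumes the tests are independent, which pairwise comparisons on shared groups are only approximately.

```python
from math import comb


def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


# k = 3 (three hospitals) and k = 4 (treatments A, B, C, D)
for k in (3, 4):
    m = comb(k, 2)  # number of pairwise comparisons among k groups
    print(f"{k} groups: {m} comparisons, "
          f"uncorrected error rate = {familywise_error_rate(m):.3f}, "
          f"Bonferroni per-test level = {bonferroni_alpha(m):.4f}")
```

Under the Bonferroni procedure, each pairwise t-test is declared significant only if its P-value falls below the per-test level rather than below 0.05.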
Analysis of Covariance (ANCOVA)

ANCOVA is another ANOVA technique, one that combines ANOVA with regression to measure the differences among group means.
The advantages that ANCOVA has over other techniques are:
• The ability to reduce the error variance in the outcome measure.
• The ability to measure group differences after allowing for other differences between subjects.
In ANOVA, two sets of variables are involved in the analysis: the independent and the dependent variable. With ANCOVA, a third type of variable is included: the covariate, which is continuous.

Assumptions
1. The groups should be mutually exclusive.
2. The variances of the groups should be equivalent.
3. The dependent variable should be normally distributed.
4. The covariate should be a continuous variable.
5. The covariate and the dependent variable must show a linear relationship.
6. The direction and strength of the relationship between the covariate and the dependent variable must be similar in each group (homogeneity of regression across groups).

Steps for the analysis
• Check whether the dependent variable is normally distributed (use the rule of thumb).
    sum chol
• Test whether the variance of the dependent variable is similar across groups (Bartlett's test for equal variances).
    oneway chol group, tabulate
• Measure the correlation between cholesterol and age.
    corr chol age
    twoway (scatter chol age)
• Test homogeneity of regression across groups, which is equivalent to testing the interaction between the covariate and the independent variable.
    anova chol group age age*group, contin(age)
  If the interaction is significant, one could study the effect of age on cholesterol in each of the two groups separately. If the interaction is not significant, then the assumptions are met and it is appropriate to do ANCOVA.
    anova chol group age age*group, contin(age)

Summary
ANCOVA is an extension of ANOVA that allows us to remove additional sources of variation from the error term, thus enhancing the power of our analysis.
ANCOVA should be used only after careful consideration has been given to meeting the underlying assumptions. It is especially important to check for homogeneity of regression, because if that assumption is violated, ANCOVA can lead to improper interpretations of results.

Example
In a survey to examine relationships between nutrition and the health of women in the Middle West, the concentration of cholesterol in the blood serum was determined for 56 randomly selected subjects in Iowa and 130 in Nebraska.
After controlling for age, do the two groups (Iowa, Nebraska) differ significantly in their cholesterol levels?

The following slides showed software output that is not reproduced here:
• Dataset
• ANOVA without adjusting for age
• Testing homogeneity of variances across groups
• Measuring the correlation between cholesterol and age (correlations between the dependent variable and the covariate)
• Testing homogeneity of regression across groups: the model shows that the interaction term is not significant (the assumption is met)
• The interaction term is eliminated from the model (Full Factorial model)
• The ANCOVA results

Interpretation of the findings
After controlling for the covariate age, the two groups (Iowa and Nebraska) do not differ significantly in their cholesterol levels. Note that the error variance was very high when age was not adjusted for in the model.
