Testing of hypothesis

     Dept. of Biostatistics
   Christian Medical College
         Vellore, India

Inferential Statistics
  Hypothesis testing
    Comparison of means
    Comparison of proportions (incidences / prevalences)

Descriptive Statistics
  Summarize mean / proportion (incidence / prevalence)
Research Question
       Is there a (statistically) significant difference between two
groups with respect to the outcome?

Null Hypothesis
       There is no (statistically) significant difference between two
groups with respect to the outcome.

Alternative Hypothesis
       There is a (statistically) significant difference between two
groups with respect to the outcome.

         Two groups – two independent populations
                Outcome – scores obtained
            Intervention – Educational training
                      P - Value
Probability of getting a result as extreme as or more
extreme than the one observed when the null
hypothesis is true.

When our study yields a P value of 0.01, we say that a
difference at least as large as the one we found would
arise by chance alone about 1 time in 100 if the null
hypothesis were true.

It is therefore unlikely that our results occurred by chance,
and the difference we found in the sample is probably due
to the teaching programme.

(i) Chance Variation
(ii) Effect Variation

  The difference that we might find between the two
  groups’ exam achievement in our sample might
  have occurred by chance, or it might have occurred
  due to the teaching programme
              ‘P’ as a significance level

   P < 0.05         result is statistically significant

   P > 0.05         result is not statistically significant.

These cutoffs are arbitrary & have no specific importance.
                     COMPARISON OF MEANS

                                t - tests

                                 A bit of history...
W.S. Gosset (1908) first published the t-test. He worked at the Guinness Brewery in
Dublin and published under the name Student. The test was called Student's t-test
                            (later shortened to t-test).
                     Types of t-tests

 One sample t-test

 t-test for two independent (uncorrelated) samples
        (i) Equal variance (ii) Unequal variance

 t-test for two paired (correlated) samples
  Comparison of two independent Means
            (Student’s t-test / unpaired t-test)

A t-test is used when we wish to compare two means

Type of data required

Independent     One nominal variable with two levels
Variable        E.g., (i) boy/girl students; (ii) non-smoking/heavy
                smoking mothers

Dependent       Continuous variable
Variable        E.g., (i) marks obtained by the students in the
                annual exam; (ii) birth weight of children

   The samples are random & independent of each other

   The independent variable is categorical & contains
    only two levels

   The distribution of dependent variable is normal. If
    the distribution is seriously skewed, the t-test may
    be invalid.

   The variances are equal in both the groups
Example data
A study was conducted to compare the birth weights of
children born to 15 non-smoking with those of children
born to 14 heavy smoking mothers.
 Non-smoking Mothers   Heavy smoking Mothers
       (n = 15)               (n = 14)
        3.99                   3.18
        3.79                   2.84
        3.60                   2.90
        3.73                   3.27
        3.21                   3.85
        3.60                   3.52
        4.08                   3.23
        3.61                   2.76
        3.83                   3.60
        3.31                   3.75
        4.13                   3.59
        3.26                   3.63
        3.54                   2.38
        3.51                   2.34

Test statistic:

$$ t = \frac{|\bar{x}_1 - \bar{x}_2|}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} $$

Where,

$$ S^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} $$
Checking the Normality
Unequal Variances

 Sometimes we wish to compare two groups of observations
 where the assumption of normality is reasonable, but the
 variability in the two groups is markedly different.

 Two questions arise:
 (1) How different do the variances have to be before we
     should not use the two-sample t-test?
 (2) What can we do if this happens?
Unequal Variances – Contd..
(1) Levene’s test for equality of variances
  Null Hypothesis           : The variances are equal
  Alternative Hypothesis    : The variances are not equal

  If Levene’s test is not significant …. P>0.05
          Report “equal variances assumed”

  If Levene’s test is significant ……... P<0.05
          Report “equal variances not assumed”

(2) Use the modified t-test in the presence of unequal variances
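
The slide does not name the modified test. A minimal sketch, assuming it refers to Welch's unequal-variance t-test (the version packages usually report under "equal variances not assumed"), applied to the means, SDs and group sizes quoted in the deck's birth-weight results. The function name `welch_t` is illustrative only.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch (unequal-variance) t statistic and Welch-Satterthwaite df."""
    v1 = sd1 ** 2 / n1          # squared standard error of mean 1
    v2 = sd2 ** 2 / n2          # squared standard error of mean 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Approximate degrees of freedom; usually not a whole number
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Summary statistics from the deck: non-smoking (n=15) vs heavy smoking (n=14)
t_welch, df_welch = welch_t(3.60, 0.37, 15, 3.20, 0.49, 14)
print(f"t = {t_welch:.2f}, df = {df_welch:.1f}")
```

Unlike the equal-variance test, no pooled SD is formed: each group contributes its own variance, and the degrees of freedom shrink below n1 + n2 - 2 as the variances diverge.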
 How to report the results?

                            Heavy smoking mothers    Non-smoking mothers    Diff in means        P-Value
                                   (n=14)                  (n=15)              (95% CI)
                              Mean       SD            Mean       SD
Birth weight of children      3.20      0.49           3.60      0.37      0.4 (0.06 – 0.72)     0.022

If there were truly no difference, a difference in mean birth weight this large between
children born to non-smoking and heavy smoking mothers would arise by chance only
about 2 times in 100.
                        The distribution of data

Normal data:
         SD < ½ mean                         use t-test

Skewed / Non-normal data:
         SD > ½ mean                         use Non parametric
                                             Mann - Whitney test /
                                             log – transformed t-test

Note: Applicable only for variables where negative values are impossible
     (e.g., Rate of GFR change)

Ref: Altman DG, 1991
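
The rule of thumb above is simple enough to express as a one-line check. A sketch, assuming a variable that cannot take negative values; the function name `looks_skewed` is hypothetical and the inputs are summary statistics quoted elsewhere in this deck.

```python
def looks_skewed(mean, sd):
    """Rule of thumb: SD greater than half the mean suggests skewed data.

    Only meaningful for variables that cannot be negative.
    """
    if mean <= 0:
        raise ValueError("rule of thumb needs a positive mean")
    return sd > mean / 2

# Birth weight, heavy smoking mothers: mean 3.20, SD 0.49
print(looks_skewed(3.20, 0.49))   # SD well under half the mean
# Length of stay, Hospital 1: mean 26, SD 17
print(looks_skewed(26, 17))       # SD exceeds half the mean
```

This is a screening rule only; a histogram or normal plot of the raw data remains the more informative check.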
Clinical Significance Vs Statistical Significance
A possible antipyretic is tested in patients with the common cold.
500 receive the candidate drug
500 receive a placebo control
Temperatures measured 4 hours after dosing
                  N         Mean         StDev      SE Mean
 Drug            500        39.950       0.653       0.029
 Control         500        40.058       0.699       0.031
                                                         p value = 0.011

Statistical Significance? Yes. There probably is a reduction in temperature.
Clinical Significance?   No. Temperature fell by only about 0.1 °C.

Because the sample size is so large we are able to detect a very small
change in temperature
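
The effect of sample size can be seen by recomputing the test from the table. This sketch uses a normal approximation in place of the t distribution (an assumption that changes little with about 1000 observations), so the P value is close to, but not exactly, the reported 0.011.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample z statistic and two-sided P value (normal approximation)."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / se
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided tail area
    return z, p, se

# Control vs drug, 500 patients per arm
z_stat, p_value, se = two_sample_z(40.058, 0.699, 500, 39.950, 0.653, 500)
difference = 40.058 - 39.950
print(f"difference = {difference:.3f}, SE = {se:.3f}, P = {p_value:.3f}")

# Same means and SDs with only 50 patients per arm
_, p_small, _ = two_sample_z(40.058, 0.699, 50, 39.950, 0.653, 50)
print(f"P with n = 50 per arm: {p_small:.2f}")
```

The difference stays at about 0.1 °C in both runs; only the standard error changes, which is why the same small effect is "significant" with 500 per arm and not with 50.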
                                Misuses of t-test
 •     t-test for non-normal data.
                                          Hospital 1                        Hospital 2
                                   Mean (SD)            n         Mean (SD)              n

     Length of Stay (in days)       26 (17)            11             79 (57)            13

 Heterogeneous data – SD > ½ (mean)
Correct Method: Non-parametric Mann-Whitney test with Median
and Range values

 •     Unpaired t-test for paired observations
                                     Before intervention     After intervention
                                                       (n = 12)
                                      Mean        SD          Mean          SD
                    BP Levels         142.0      30.5         120.5        31.5

Correct method: Paired t-test
                Misuses of t-test (Contd. ..)
•    Multiple t-test
Comparison of length of stays between three hospitals

                         Hospital 1          Hospital 2          Hospital 3
                       Mean (SD)    n      Mean (SD)    n      Mean (SD)    n
    Length of Stay
      (in days)         25 (5)     12      75 (20)     13      30 (10)     14

    Hospital 1 vs Hospital 2            P- value = ?
    Hospital 1 vs Hospital 3            P- value = ?
    Hospital 2 vs Hospital 3            P- value = ?
    With 3 comparisons each tested at 0.05, the overall chance of a false-positive
    result is at most 3 x 0.05 = 0.15 (about 0.14 if the tests were independent)

Correct method: ANOVA with Bonferroni correction.
     Two groups of paired Observations
                       Paired t-test
•   Same individuals are studied more than once,
       e.g. measurements made on the same people before
                and after intervention
•   The outcome variable should be continuous
•   The difference between pre - post measurements should
    be normally distributed
A study was carried out to evaluate the effect of a new diet on
weight loss. The study population consisted of 12 people who
used the diet for 2 months; their weights before and after the
diet are given below.
                                      Weight (Kgs)
                  Patient No.
                                Before Diet   After Diet
                      1             75           70
                      2             60           54
                      3             68           58
                      4             98           93
                      5             83           78
                      6             89           84
                      7             65           60
                      8             78           77
                      9             95           90
                      10            80           76
                      11           100           94
                      12           108           100

The research question asks whether the diet makes a difference.
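
A minimal sketch of the paired t-test on the weights listed above, using only Python's standard library. The test works on the 12 within-person differences rather than on the two columns separately; variable names are illustrative.

```python
import math
import statistics

# Weights (kg) of the 12 patients, in the order given in the table
before = [75, 60, 68, 98, 83, 89, 65, 78, 95, 80, 100, 108]
after = [70, 54, 58, 93, 78, 84, 60, 77, 90, 76, 94, 100]

# Paired analysis: one difference per person
diffs = [b - a for b, a in zip(before, after)]
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)            # sample SD (n - 1 denominator)
se_diff = sd_diff / math.sqrt(len(diffs))

t_paired = mean_diff / se_diff
df_paired = len(diffs) - 1
print(f"mean loss = {mean_diff:.2f} kg, t = {t_paired:.2f}, df = {df_paired}")
```

The statistic is compared with the t distribution on n - 1 = 11 degrees of freedom. Because every person lost weight, the differences vary far less than the weights themselves, which is what gives the paired analysis its power.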
Paired t test output
 t-test         To examine the difference between two
                independent groups

 paired t-test  To examine the difference between pre
                & post measures of the same group

How do we compare the means of more than two groups?
Treatments: A, B, C & D
Response : BP level

         How does t-test concept work here?

            A versus B         B versus C
            A versus C         B versus D
            A versus D         C versus D

The chance of at least one false-positive result rises
rapidly with the number of tests conducted. For 6 tests
at the 0.05 level (assuming independent tests):
                  1 - (1 - 0.05)^6 = 0.26
Instead of using a series of individual comparisons we
examine the differences among the groups through an
analysis that considers the variation across all groups
at once.
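
The arithmetic behind the inflated error rate can be sketched as follows, assuming the tests are independent (pairwise comparisons that share groups are not strictly independent, so treat the figures as approximate). The function name is illustrative.

```python
from math import comb

def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests

# Four treatments (A, B, C, D) give C(4, 2) pairwise comparisons
n_pairs = comb(4, 2)
print(n_pairs, round(familywise_error(n_pairs), 3))

# Three hospitals give three pairwise comparisons
print(comb(3, 2), round(familywise_error(comb(3, 2)), 3))
```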

       Analysis of Variance (ANOVA)

Although means are compared, the comparisons are
made using estimates of variance. The ANOVA test
statistics, or F statistics, are actually ratios of estimates of
variance.

The main analysis is to determine whether the
population means are all equal. If there are k means
then the null hypothesis is

          $H_0: \mu_1 = \mu_2 = \dots = \mu_k$
Alternative hypothesis is given by

          $H_A$: not all of $\mu_1, \mu_2, \dots, \mu_k$ are equal
Type of data required

Independent   One nominal variable (>2 levels)
              E.g., Socio economic status (low / medium / high)

Dependent     Continuous variable (normally
Variable      distributed)
              E.g., hb level

   The samples are random & independent of each other

   The independent variable is categorical & contains
    more than two levels

   The distribution of dependent variable is normal. If
    the distribution is seriously skewed, the ANOVA may
    be invalid.

   The groups should have equal variances
 Example data
    A study was conducted to assess the hb levels of
    women in low, medium and high socio economic status

SL   Low    Medium     High       SL   Low    Medium     High
No (n = 20) (n = 18) (n = 17)     No (n = 20) (n = 18) (n = 17)
1     8.10    8.40    12.70       11    9.20    12.00   12.70
2     8.00    11.10    11.80      12    7.40    10.90   13.40
3     6.90    10.80   13.10       13   10.70    11.70   14.30
4     11.40   11.00   12.30       14   11.40    11.00   13.80
5     10.70   12.20   10.90       15    7.70    12.20   15.00
6     10.20   8.70    12.60       16    6.10    11.20   14.20
7     8.90    12.30   13.20       17   11.00    10.70    9.20
8     9.90    11.50   14.20       18   11.10    9.90
9     6.80    11.60    11.80      19    7.90
10    9.10    12.90   12.40       20   10.60
Source of Variation

ANOVA separates the variation in all the data into
two parts:

The variation between each group mean and the
overall mean for all the groups (the between-group
variability) and the variation between each study
participant and the participant's group mean (the
within-group variability).

If the between-group variability is much greater than
the within-group variability, there are likely to be
differences between the group means.
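
The split into between-group and within-group variation can be sketched in plain Python on the Hb example data. This is an illustration of the arithmetic rather than a replacement for a statistics package, and it stops at the F ratio (the P value comes from the F distribution with the stated degrees of freedom).

```python
def one_way_anova(groups):
    """Return sums of squares, degrees of freedom and F for a one-way ANOVA."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    # Between-group: each group mean vs the overall mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group: each participant vs their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

# Hb levels by socio-economic status, from the example data
low = [8.1, 8.0, 6.9, 11.4, 10.7, 10.2, 8.9, 9.9, 6.8, 9.1,
       9.2, 7.4, 10.7, 11.4, 7.7, 6.1, 11.0, 11.1, 7.9, 10.6]
medium = [8.4, 11.1, 10.8, 11.0, 12.2, 8.7, 12.3, 11.5, 11.6, 12.9,
          12.0, 10.9, 11.7, 11.0, 12.2, 11.2, 10.7, 9.9]
high = [12.7, 11.8, 13.1, 12.3, 10.9, 12.6, 13.2, 14.2, 11.8, 12.4,
        12.7, 13.4, 14.3, 13.8, 15.0, 14.2, 9.2]

ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova([low, medium, high])
print(f"SS between = {ss_b:.2f}, SS within = {ss_w:.2f}")
print(f"F({df_b}, {df_w}) = {f_ratio:.1f}")
```

A large F ratio means the group means differ by more than the scatter inside the groups would explain.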
ANOVA data (figure: Group 1, Group 2, Group 3)
ANOVA output
  Multiple Comparisons procedure

ANOVA is a "group comparison" that determines
whether a statistically significant difference exists
somewhere among the groups studied. If a significant
difference is indicated, ANOVA is usually followed by a
"multiple comparison procedure" that compares
combinations of groups to examine further any
differences among them. The most common multiple
comparison procedure is the "pairwise comparison", in
which each group mean is compared (two at a time) to
all other group means to determine which groups differ.
      Bonferroni Test

      Uses t tests to perform pairwise comparisons
between group means, but controls overall error rate
by setting the error rate for each test to the
experiment wise error rate divided by the total
number of tests.

      A disadvantage of this procedure is that the true
overall level may be so much less than the
maximum value ‘α’ that none of the individual tests is
likely to be rejected.
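
The Bonferroni adjustment described above amounts to a single division. A short sketch, with an illustrative function name and an experimentwise error rate of 0.05 assumed as the default:

```python
from math import comb

def bonferroni_alpha(n_groups, experimentwise_alpha=0.05):
    """Per-comparison significance level for all pairwise comparisons."""
    n_tests = comb(n_groups, 2)               # number of pairwise t-tests
    return experimentwise_alpha / n_tests, n_tests

for k in (3, 4, 5):
    per_test, n_tests = bonferroni_alpha(k)
    print(f"{k} groups: {n_tests} tests, each judged at P < {per_test:.4f}")
```

The per-test threshold falls quickly as groups are added, which is the conservatism noted above.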
Tukey’s Method

Uses the studentized range statistic to make all of the
pairwise comparisons between groups. Sets the experimentwise
error rate at the error rate for the collection of all
pairwise comparisons.

This method is applicable when

      1.     The sizes of the samples from each group are equal.
      2.     Pairwise comparisons of means are of primary
             interest, that is, null hypotheses of the form
             $H_0: \mu_i = \mu_j$ are to be considered.
Scheffé test

Performs simultaneous joint pairwise comparisons for
all possible pairwise combinations of means. Uses the
F sampling distribution.

This method is recommended when

1.    The size of the samples selected from the
      different populations are unequal.
2.    Comparisons other than simple pairwise
      comparison between two means are of interest.
Analysis of Covariance

•   ANCOVA is an extension of ANOVA which combines
    ANOVA with regression to measure the differences among groups.
•   The advantages that ANCOVA has over other techniques are:
       The ability to reduce the error variance in the outcome measure.
       The ability to measure group differences after allowing for other
       differences between subjects.
•   In ANOVA two sets of variables are involved in the analysis: the
    independent and the dependent variable. With ANCOVA a third type
    of variable is included: the covariate, which is continuous.

1.   The groups should be mutually exclusive.
2.   The variance of the groups should be equivalent.
3.   The dependent variable should be normally distributed.
4.   The covariate should be a continuous variable.
5.   The covariate and the dependent variable must show a linear
     relationship.
6.   The direction and strength of relationship between the covariate
     and dependent variable must be similar in each group
     (homogeneity of regression across groups).
                  Steps for the analysis

•   Check whether the dependent variable is normally distributed
    (use the rule of thumb).
    summarize chol

•   Test whether the variance of the dependent variable is similar across
    groups (Bartlett’s test for equal variances).
    oneway chol group, tabulate

•   Measure the correlation between cholesterol and age.
    correlate chol age
    twoway (scatter chol age)

•   Homogeneity of regression across groups is equivalent to testing for an
    interaction between the covariate and the independent variable.
    anova chol group age age*group, contin(age)

•   If the interaction is significant, one could study the effect of age on
    cholesterol in each of the two groups separately.
•   If the interaction is not significant, then the assumptions are met and
    it is appropriate to do ANCOVA (the model without the interaction term).
    anova chol group age, contin(age)
   ANCOVA is an extension of ANOVA that allows us to
    remove additional sources of variation from the error
    term, thus enhancing the power of our analysis.

   ANCOVA should be used only after careful
    consideration has been given to meeting the
    underlying assumptions.

   It is especially important to check for homogeneity of
    regression, because if that assumption is violated,
    ANCOVA can lead to improper interpretations of
    the results.

   In a survey to examine relationships between
    nutrition and the health of women in the Midwest, the
    concentration of cholesterol in the blood serum was
    determined for 56 randomly selected subjects in Iowa
    and 130 in Nebraska.

   After controlling for age, do the two groups (Iowa,
    Nebraska) differ significantly in their cholesterol levels?
ANOVA without adjusting for age
Testing homogeneity of variances across groups
Measuring the correlation between cholesterol and age
Correlations between the dependent variable and the covariate
Testing homogeneity of regression across groups
Model shows that the interaction term is not significant (assumption is met)
The interaction term is eliminated from the model (full factorial model)
The ANCOVA results
            Interpretation of the findings

   After controlling for the covariate age, the two groups
    (Iowa and Nebraska) do not differ significantly in their
    cholesterol levels.
   Note that the error variance was very high when age was
    not adjusted for in the model.
