					Paired Samples versus Independent Samples
Paired Design 1
With paired data, we are interested in
comparing the responses within each
pair. We will analyze the differences of
the responses that form each pair.

         Paired Data: Response = Annual Salary (in $1000s)
         Wife Response    Husband Response    Difference = Wife - Husband
               15                20               15 - 20 = -5
               45                31               45 - 31 = 14
               50                50               50 - 50 = 0
               16                30               16 - 30 = -14
               56                72               56 - 72 = -16
                                              Mean Difference = -4.2
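
The paired calculation in the table above can be sketched in a few lines of Python. This is only an illustration and not part of the original notes; the names `wife`, `husband`, and `differences` are chosen here for convenience.

```python
# Annual salaries in $1000s, one entry per couple (values from the table above).
wife = [15, 45, 50, 16, 56]
husband = [20, 31, 50, 30, 72]

# Paired design: form one difference per pair, keeping the pairing intact.
differences = [w - h for w, h in zip(wife, husband)]

# The quantity analyzed is the mean of the within-pair differences.
mean_difference = sum(differences) / len(differences)

print(differences)
print(round(mean_difference, 1))
```

Note that `zip` relies on the two lists being in the same couple order, which is exactly the pairing the design requires.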

Paired Design 2


We have paired or matched samples when we know, in advance, that an
observation in one data set is directly related to a specific observation in
the other data set. It may be that the related sets of units are each
measured once (Paired Design 1), or that the same unit is measured twice
(Paired Design 2). In a paired design, the two sets of data must have the
same number of observations.
Independent Samples Design

                    Independent Samples Data:
                Response = Annual Salary (in $1000s)
                    Women Response          Men Response
                          15                     20
                          45                     31
                          50                     50
                          16                     30
                          56                     72

                    Mean for Women = 36.4   Mean for Men = 40.6
                    Difference in the means = 36.4 - 40.6 = -4.2

In the two independent samples scenario, we will compare the
responses of one treatment group as a whole to the responses of the
other treatment group as a whole. We will calculate summary measures
for the observations from one treatment group and compare them to
similar summary measures calculated from the observations from the
other treatment group.
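
As a rough sketch of the group-level comparison just described, the Python below summarizes each sample separately and then compares the summaries. It is illustrative only; the variable names are assumptions made for this example.

```python
from statistics import mean

# Annual salaries in $1000s for two unrelated groups (values from the table above).
women = [15, 45, 50, 16, 56]
men = [20, 31, 50, 30, 72]

# Independent samples: summarize each group as a whole ...
mean_women = mean(women)
mean_men = mean(men)

# ... and then compare the two summary measures.
difference_in_means = mean_women - mean_men

print(mean_women, mean_men, round(difference_in_means, 1))
```

Unlike the paired calculation, no observation in one list is tied to a particular observation in the other, so the two lists could have different lengths.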

We have two independent samples when two unrelated sets of units are
measured, one sample from each population, as in Independent Samples
Design 11.3. In a design with two independent samples, although the
same sample size is often preferable, the sample sizes might be different.
Let’s Do It! Paired Samples versus Independent Samples

  (a) Three hundred registered voters were selected at random, 30 from
        each of 10 midwestern counties, to participate in a study on
        attitudes about how well the president is performing his job.
        They were each asked to answer a short multiple-choice
        questionnaire and then they watched a 20-minute video that
        presented information about the job description of the
        president. After watching the video, the same 300 selected
        voters were asked to answer a follow-up multiple-choice
        questionnaire. The investigator of this study will have two sets
        of data: the initial questionnaire scores and the follow-up
        questionnaire scores. Is this a paired or independent samples
        design?
        Circle one:        Paired          Independent

(b) Thirty dogs were selected at random from those residing at the
    humane society last month. The 30 dogs were split at random into
    two groups. The first group of 15 dogs was trained to perform a
    certain task using a reward method. The second group of 15 dogs
    was trained to perform the same task using a reward-punishment
    method. The investigator of this study will have two sets of data: the
    learning times for the dogs trained with the reward method and the
    learning times for the dogs trained with the reward-punishment
    method. Is this a paired or independent samples design?
   Circle one:         Paired          Independent
Let’s Do It! 2 Design a Study
For each of the following research questions, briefly describe how you
might design a study to address the question (discuss whether paired or
independent samples would be obtained):

(a)   Do freshmen students use the library to study more often than
      senior students?

(b)   Do books cost more on average at the local bookstore or through

(c)   Will taking summer school improve reading levels for
      Kindergarteners going into first grade?
Paired Samples
In a paired design, units in each pair are alike (in fact, they may be the
same unit), whereas units in different pairs may be quite dissimilar.

[Figure: Population of Paired Observations. Each pair consists of an
observation for treatment 1 and an observation for treatment 2.]
                            D = difference = treatment 1 - treatment 2

Since we are interested in the difference for each pair, the differences
are what we analyze in paired designs.

Example Weight Change
A study was conducted to estimate the mean weight change of a female
adult who quits smoking. The weights of eight female adults before they
stopped smoking and five weeks after they stopped smoking were
recorded. The differences, computed as “after - before,” are given below.
Subject          1         2        3        4        5         6         7     8
After            154       181      151      120      131       130       121   128
Before           148       176      153      116      129       128       120   132
Difference       6         5        -2       4        2         2         1     -4

Here we have another example of a paired design.

(a)   Compute the sample mean difference in weight.

(b)   Compute the sample standard deviation of the differences.
              (a)   The sample mean difference is $\bar{d} = 1.75$ pounds. Note that
                    the differences computed as “after - before” represent the
                    weight gain for a subject. A positive value indicates a weight
                    gain and a negative value indicates a weight loss.
              (b)   The sample standard deviation is $s_D = 3.412$ pounds.
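
The two answers above can be checked with a short Python sketch. This is an illustration only, using the standard-library `statistics` module; the variable names are assumptions made here.

```python
from statistics import mean, stdev

# Weights (in pounds) for the eight subjects, in subject order.
after = [154, 181, 151, 120, 131, 130, 121, 128]
before = [148, 176, 153, 116, 129, 128, 120, 132]

# One "after - before" difference per subject.
differences = [a - b for a, b in zip(after, before)]

d_bar = mean(differences)   # sample mean difference
s_d = stdev(differences)    # sample standard deviation (divisor n - 1)

print(differences)
print(d_bar, round(s_d, 3))
```

`stdev` uses the n - 1 divisor, which is the sample standard deviation the notes call for; `pstdev` would use n instead.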

              The Paired T-Test

         Paired t-Test
         Hypotheses:
             $H_0: \mu_D = 0$ versus $H_1: \mu_D \ne 0$ or
             $H_0: \mu_D = 0$ versus $H_1: \mu_D > 0$ or
             $H_0: \mu_D = 0$ versus $H_1: \mu_D < 0$.

         Data: The sample of n differences, generically written as
             $d_1, d_2, \ldots, d_n$, from which the sample mean difference
             $\bar{d}$ and the sample standard deviation of the differences
             $s_D$ can be computed.
         Observed Test Statistic: $t = \dfrac{\bar{d} - 0}{s_D / \sqrt{n}}$, and the
             null distribution for the T variable is a t(n - 1) distribution.
         p-value: We find the p-value for the test using the t(n - 1)
             distribution. The direction of extreme will depend on how the
             alternative hypothesis is expressed.
         Decision: A p-value less than or equal to $\alpha$ leads to rejection
             of $H_0$.
         Notes:
           - If we are interested in assessing if $\mu_D$ is equal to some
             hypothesized value that is not 0, we would replace 0 in the test
             statistic expression with this other null value.
           - The test statistic is the same no matter how the alternative
             hypothesis is expressed.
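
The test statistic in the box above can be sketched as a small Python function. This is a minimal illustration, not a reference implementation: the function name `paired_t_statistic` and its `null_value` parameter are hypothetical. The weight-change differences from the earlier example are reused purely as sample input; the notes themselves do not run a test on them.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(differences, null_value=0.0):
    """Return (t, degrees of freedom) for a paired t-test.

    null_value is the hypothesized mean difference (0 in the usual case).
    Assumes the differences are not all identical, so that s_D > 0.
    """
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two differences")
    d_bar = mean(differences)            # sample mean difference
    s_d = stdev(differences)             # sample SD of the differences
    standard_error = s_d / math.sqrt(n)  # estimated SD of the sample mean
    t = (d_bar - null_value) / standard_error
    return t, n - 1

# Sample input: the "after - before" weight differences from the earlier example.
weight_differences = [6, 5, -2, 4, 2, 2, 1, -4]
t_obs, df = paired_t_statistic(weight_differences)
print(round(t_obs, 3), df)
```

Passing a nonzero `null_value` corresponds to replacing 0 in the test statistic with another hypothesized value. The function is the same for all three alternatives, since only the p-value depends on the direction.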
  Example Comparing Test Scores
  A group of 10 randomly selected children of elementary school age
  among those in the Mankato County who were recently diagnosed with
  asthma was tested to see if a new children’s educational video is
  effective in increasing the children’s knowledge about asthma. A nurse
  gave the children an oral test containing questions about asthma before
  and after seeing the animated video. The test scores are given below:
Child:           1 2 3 4 5 6 7 8 9 10
Before:         61 60 52 74 64 75 42 63 53 56            Mean = 60
After:          67 62 54 83 60 89 44 67 62 57            Mean =64.5

  (a)     Explain why we have paired data here and not two independent
          samples.

  (b)     We are interested in examining the differences in the scores for
          each child. Compute the differences and find the sample mean
          difference and the sample standard deviation of the differences.

  (c)     The researchers wish to assess if the data provide sufficient
          evidence to conclude that the mean score after viewing the
          educational video is significantly higher than the mean score
          before the viewing. The test will be conducted at the 5% level
          of significance. State the appropriate hypotheses to be tested
          in terms of the population mean difference in test scores $\mu_D$.

  (d)     Compute the observed t-test statistic value.

  (e)     Find the corresponding p-value.

  (f)     State the decision and conclusion using a 5% significance level.

  (a) Since we have two observations from the same child, we have
       paired data.
(b)         The observed differences, computed here as “after - before,”
            are as follows:
Child:                   1   2   3   4   5    6    7   8   9   10
d = After - Before       6   2   2   9   -4   14   2   4   9   1     Mean diff = 4.5

The first observed difference is 6 and is represented by $d_1 = 6$, and the
last difference is also positive and is represented by $d_{10} = 1$. The
observed sample mean difference is $\bar{d} = 4.5$, which is our estimate of
the unknown population mean difference, $\mu_D$. The observed sample standard
deviation of the differences is $s_D = 5.126$, which is our estimate of the
unknown population standard deviation $\sigma_D$.

(c)         Since we defined our differences as diff = after - before, it is
            positive differences that would show some support that the video
            is effective in improving the mean test score. Thus the
            corresponding hypotheses to be tested are
            $H_0: \mu_D = 0$ versus $H_1: \mu_D > 0$.

(d)         The observed t-test statistic is given by
            $t = \dfrac{\bar{d} - 0}{s_D / \sqrt{n}} = \dfrac{4.5 - 0}{5.126 / \sqrt{10}} = 2.78$.

This means we observed a sample mean difference that was about 2.78
standard errors above the hypothesized mean difference of zero.
Is this large enough (that is, far enough above zero) to reject the null
hypothesis?
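
For readers who want to verify the value 2.78 from the raw scores, here is a short Python sketch. It is illustrative only and the variable names are chosen for this example.

```python
import math
from statistics import mean, stdev

# Test scores for the 10 children, in child order.
before = [61, 60, 52, 74, 64, 75, 42, 63, 53, 56]
after = [67, 62, 54, 83, 60, 89, 44, 67, 62, 57]

# Differences defined as after - before, one per child.
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
d_bar = mean(differences)                  # sample mean difference
s_d = stdev(differences)                   # sample SD of the differences
t_obs = (d_bar - 0) / (s_d / math.sqrt(n)) # null value is 0

print(d_bar, round(s_d, 3), round(t_obs, 2))
```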

(e)         The p-value is the probability of getting a test statistic as
            large as or larger than the observed test statistic of 2.78,
            computed using a t-distribution with nine degrees of freedom.
[Figure: t-distribution with 0 marked on the axis; Area = p-value.]

With the TI
Using tcdf: p-value = $P(T \ge 2.78)$ = tcdf(2.78, E99, 9)
Using the T-Test function under the STAT TESTS menu:

In the TESTS menu located under the STAT button, we select the 2:T-
Test option. With the sample mean of 4.5, the sample standard deviation
of 5.126, and the sample size of n = 10, we can use the Stats option of
this test. The corresponding input and output screens are shown.
Notice that the null or hypothesized value is zero.

                     p-value = $P(T \ge 2.78) = 0.01077$.
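
Without a TI calculator, the same upper-tail area can be approximated in plain Python. The sketch below is one possible approach, not the method used in the notes: it integrates the t density numerically with Simpson's rule, and the helper names `t_density` and `upper_tail_probability` are hypothetical. A statistics library would normally be used instead.

```python
import math

def t_density(x, df):
    """Density of the t distribution with df degrees of freedom."""
    constant = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return constant * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_probability(t, df, steps=2000):
    """Approximate P(T >= t) using Simpson's rule on [0, |t|].

    steps must be even. The density is symmetric about 0, so the
    area to the right of 0 is exactly 0.5.
    """
    width = abs(t) / steps
    total = t_density(0.0, df) + t_density(abs(t), df)
    for i in range(1, steps):
        weight = 4 if i % 2 == 1 else 2
        total += weight * t_density(i * width, df)
    central_area = total * width / 3  # area between 0 and |t|
    return 0.5 - central_area if t >= 0 else 0.5 + central_area

# Summary statistics from the example (same inputs as the TI Stats option).
d_bar, s_d, n = 4.5, 5.126, 10
t_obs = (d_bar - 0) / (s_d / math.sqrt(n))
p_value = upper_tail_probability(t_obs, n - 1)

alpha = 0.05
print(round(t_obs, 2), round(p_value, 4), p_value <= alpha)
```

The final comparison `p_value <= alpha` mirrors the decision rule: reject the null hypothesis when the p-value is less than or equal to the significance level.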

(f) Decision and Conclusion
Since our p-value is less than 0.05, at the $\alpha = 0.05$ significance level
we would reject $H_0$, and conclude there is sufficient evidence to say that
the mean score after viewing the educational video is significantly higher
than the mean score before the viewing.
Let’s Do It!
Two creams are available by prescription for treating moderate skin
burns. A study to compare the effectiveness of the two creams is
conducted using 15 patients with moderate burns on their arms. Two
spots of the same size and degree of burn are marked on each patient’s
arm. One of the two creams is selected at random and applied to the
first spot, while the remaining spot is treated with the other cream. The
number of days until the burn has healed is recorded for each spot.
These data are provided with the difference in healing time (in days).
Consider the data and interval estimate for comparing the two burn
cream treatments in
Patient Number   1    2    3    4   5   6    7   8    9    10   11   12   13   14   15
Cream1= C1       16   2    10   7   6   10   5   4    19   7    12   9    10   20   12
Cream2= C2       14   4    10   4   5   12   5   6    23   10   12   7    11   24   10
Diff =C1- C2     2    -2   0    3   1   -2   0   -2   -4   -3   0    2    -1   -4   2

      We wish to test the claim that there is no difference between the
      two creams at the 5% significance level.

Homework page 344: 67, 68, 69, 72, 73, 76, 77
