Multivariate Analysis by WLae56


              Multivariate Analysis
              One-way ANOVA

 Tests the difference in the means of 2 or more nominal groups
   E.g., High vs. Medium vs. Low exposure

 Can be used with more than one IV
   Two-way ANOVA, Three-way ANOVA etc.
                          ANOVA

 _______-way ANOVA
   Number refers to the number of IVs

 Tests whether there are differences in the means of IV groups
   E.g.:
      Experimental vs. control group
      Women vs. Men
      High vs. Medium vs. Low exposure
                 Logic of ANOVA

 Variance partitioned into:
   1. Systematic variance:
     the result of the influence of the IVs
   2. Error variance:
     the result of unknown factors

 Variation in scores is partitioned into two
  parts by calculating the “sum of squares”:
   1. Between groups variation (systematic)
   2. Within groups variation (error)

 SS total = SS between + SS within
Significant and Non-significant Differences




Significant:        Non-significant:
 Between > Within    Within > Between
        Partitioning the Variance
              Comparisons
 Total variation = score – grand mean

 Between variation = group mean – grand mean

 Within variation = score – group mean

 Deviation is taken, then squared, then summed across cases
   Hence the term “Sum of squares” (SS)
   One-way ANOVA example
Total SS (deviation from grand mean)
      Group A        Group B        Group C
        49             56             54
        52             57             52
        52             57             56
        53             60             50
        49             60             53

Mean = 51             58             53

Grand mean = 54
   One-way ANOVA example

Total SS (deviation from grand mean)
      Group A        Group B        Group C
      dev   sq       dev   sq       dev   sq
       -5   25         2    4         0    0
       -2    4         3    9        -2    4
       -2    4         3    9         2    4
       -1    1         6   36        -4   16
       -5   25         6   36        -1    1

Sum of squares = 59 + 94 + 25 = 178
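
The total sum of squares above can be reproduced with a few lines of Python. This is a minimal sketch using the scores from the example slide; the variable names are illustrative only.

```python
# Scores from the one-way ANOVA example.
groups = {
    "A": [49, 52, 52, 53, 49],
    "B": [56, 57, 57, 60, 60],
    "C": [54, 52, 56, 50, 53],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = sum(all_scores) / len(all_scores)

# Squared deviation of every score from the grand mean, summed per group.
ss_by_group = {
    name: sum((x - grand_mean) ** 2 for x in scores)
    for name, scores in groups.items()
}
ss_total = sum(ss_by_group.values())

print(grand_mean)   # 54.0
print(ss_by_group)  # {'A': 59.0, 'B': 94.0, 'C': 25.0}
print(ss_total)     # 178.0
```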
     One-way ANOVA example
Between SS (group mean – grand mean)
                                      A     B     C
Group means                          51    58    53
Group deviation from grand mean      -3     4    -1
Squared deviation                     9    16     1
n(squared deviation)                 45    80     5

      Between SS = 45 + 80 + 5 = 130

Grand mean = 54
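
A short sketch of the between-groups calculation, assuming the same example scores: each group's squared deviation from the grand mean is weighted by its group size n.

```python
groups = {
    "A": [49, 52, 52, 53, 49],
    "B": [56, 57, 57, 60, 60],
    "C": [54, 52, 56, 50, 53],
}

n_total = sum(len(scores) for scores in groups.values())
grand_mean = sum(sum(scores) for scores in groups.values()) / n_total

ss_between = 0.0
for name, scores in groups.items():
    group_mean = sum(scores) / len(scores)
    # n * (group mean - grand mean)^2
    contribution = len(scores) * (group_mean - grand_mean) ** 2
    print(name, group_mean, contribution)
    ss_between += contribution

print(ss_between)  # 130.0
```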
     One-way ANOVA example

Within SS (score - group mean)
                                  A     B     C
Group means                      51    58    53

Deviation from group means       -2    -2     1
                                  1    -1    -1
                                  1    -1     3
                                  2     2    -3
                                 -2     2     0

Squared deviations                4     4     1
                                  1     1     1
                                  1     1     9
                                  4     4     9
                                  4     4     0
Within SS = 14 + 14 + 20 = 48
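
As a check on the arithmetic, the sketch below recomputes the within-groups sum of squares from the example scores and confirms that SS total = SS between + SS within. Names are illustrative.

```python
groups = {
    "A": [49, 52, 52, 53, 49],
    "B": [56, 57, 57, 60, 60],
    "C": [54, 52, 56, 50, 53],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = sum(all_scores) / len(all_scores)

ss_within = 0.0
ss_between = 0.0
for scores in groups.values():
    group_mean = sum(scores) / len(scores)
    # Within: each score against its own group mean.
    ss_within += sum((x - group_mean) ** 2 for x in scores)
    # Between: each group mean against the grand mean, weighted by n.
    ss_between += len(scores) * (group_mean - grand_mean) ** 2

ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

print(ss_within)                           # 48.0
print(ss_between)                          # 130.0
print(ss_total == ss_between + ss_within)  # True
```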
    The F equation for ANOVA

F = [Between groups sum of squares / (k - 1)] / [Within groups sum of squares / (N - k)]


  N = total number of subjects
  k = number of groups
  Numerator = Mean square between groups
  Denominator = Mean square within groups
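
The F equation translates directly into code. The function name `f_ratio` is a hypothetical helper, and the inputs are the sums of squares from the worked example (N = 15 subjects, k = 3 groups).

```python
def f_ratio(ss_between, ss_within, n_total, k):
    """F = mean square between groups / mean square within groups."""
    ms_between = ss_between / (k - 1)       # numerator, df = k - 1
    ms_within = ss_within / (n_total - k)   # denominator, df = N - k
    return ms_between / ms_within


f_observed = f_ratio(ss_between=130, ss_within=48, n_total=15, k=3)
print(f_observed)  # 16.25
```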
F-table
5% POINTS FOR THE F DISTRIBUTION

Rows: denominator degrees of freedom. Columns: numerator degrees of freedom.

    df      1      2      3      4      5      6      7      8      9     10

     1    161    199    216    225    230    234    237    239    241    242
     2    18.5   19.0   19.2   19.2   19.3   19.3   19.4   19.4   19.4   19.4
     3    10.1   9.55   9.28   9.12   9.01   8.94   8.89   8.85   8.81   8.79
     4    7.71   6.94   6.59   6.39   6.26   6.16   6.09   6.04   6.00   5.96
     5    6.61   5.79   5.41   5.19   5.05   4.95   4.88   4.82   4.77   4.74

     6    5.99   5.14   4.76   4.53   4.39   4.28   4.21   4.15   4.10   4.06
     7    5.59   4.74   4.35   4.12   3.97   3.87   3.79   3.73   3.68   3.64
     8    5.32   4.46   4.07   3.84   3.69   3.58   3.50   3.44   3.39   3.35
     9    5.12   4.26   3.86   3.63   3.48   3.37   3.29   3.23   3.18   3.14
    10    4.96   4.10   3.71   3.48   3.33   3.22   3.14   3.07   3.02   2.98

    11    4.84   3.98   3.59   3.36   3.20   3.09   3.01   2.95   2.90   2.85
    12    4.75   3.89   3.49   3.26   3.11   3.00   2.91   2.85   2.80   2.75
    13    4.67   3.81   3.41   3.18   3.03   2.92   2.83   2.77   2.71   2.67
    14    4.60   3.74   3.34   3.11   2.96   2.85   2.76   2.70   2.65   2.60
    15    4.54   3.68   3.29   3.06   2.90   2.79   2.71   2.64   2.59   2.54

    16    4.49   3.63   3.24   3.01   2.85   2.74   2.66   2.59   2.54   2.49
    17    4.45   3.59   3.20   2.96   2.81   2.70   2.61   2.55   2.49   2.45
    18    4.41   3.55   3.16   2.93   2.77   2.66   2.58   2.51   2.46   2.41
    19    4.38   3.52   3.13   2.90   2.74   2.63   2.54   2.48   2.42   2.38
    20    4.35   3.49   3.10   2.87   2.71   2.60   2.51   2.45   2.39   2.35

    21    4.32   3.47   3.07   2.84   2.68   2.57   2.49   2.42   2.37   2.32
    22    4.30   3.44   3.05   2.82   2.66   2.55   2.46   2.40   2.34   2.30
    23    4.28   3.42   3.03   2.80   2.64   2.53   2.44   2.37   2.32   2.27
    24    4.26   3.40   3.01   2.78   2.62   2.51   2.42   2.36   2.30   2.25
    25    4.24   3.39   2.99   2.76   2.60   2.49   2.40   2.34   2.28   2.24

    26    4.23   3.37   2.98   2.74   2.59   2.47   2.39   2.32   2.27   2.22
    27    4.21   3.35   2.96   2.73   2.57   2.46   2.37   2.31   2.25   2.20
    28    4.20   3.34   2.95   2.71   2.56   2.45   2.36   2.29   2.24   2.19
    29    4.18   3.33   2.93   2.70   2.55   2.43   2.35   2.28   2.22   2.18
    30    4.17   3.32   2.92   2.69   2.53   2.42   2.33   2.27   2.21   2.16

    35    4.12   3.27   2.87   2.64   2.49   2.37   2.29   2.22   2.16   2.11
    40    4.08   3.23   2.84   2.61   2.45   2.34   2.25   2.18   2.12   2.08
    50    4.03   3.18   2.79   2.56   2.40   2.29   2.20   2.13   2.07   2.03
    60    4.00   3.15   2.76   2.53   2.37   2.25   2.17   2.10   2.04   1.99
    70    3.98   3.13   2.74   2.50   2.35   2.23   2.14   2.07   2.02   1.97

    80    3.96   3.11   2.72   2.49   2.33   2.21   2.13   2.06   2.00   1.95
   100    3.94   3.09   2.70   2.46   2.31   2.19   2.10   2.03   1.97   1.93
   150    3.90   3.06   2.66   2.43   2.27   2.16   2.07   2.00   1.94   1.89
   300    3.87   3.03   2.63   2.40   2.24   2.13   2.04   1.97   1.91   1.86
  1000    3.85   3.00   2.61   2.38   2.22   2.11   2.02   1.95   1.89   1.84
               Significance of F
F observed = (130/2) / (48/12) = 65/4 = 16.25

F critical is 3.89 (2, 12 df)

F observed 16.25 > F critical 3.89

Groups are significantly different

t-tests could then be run to determine which groups are
significantly different from which other groups
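
One way the follow-up t-tests might look is sketched below, assuming ordinary independent-samples t-tests with a pooled variance on each pair of groups from the example. The critical value 2.306 (two-tailed, .05 level, 8 df) is a standard t-table value and does not come from the slides; the helper name `pooled_t` is illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

groups = {
    "A": [49, 52, 52, 53, 49],
    "B": [56, 57, 57, 60, 60],
    "C": [54, 52, 56, 50, 53],
}

# Assumption: two-tailed .05 critical t for 5 + 5 - 2 = 8 df (standard table value).
T_CRITICAL = 2.306


def pooled_t(a, b):
    """Independent-samples t statistic using a pooled variance estimate."""
    df = len(a) + len(b) - 2
    pooled_var = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / df
    std_error = sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (mean(a) - mean(b)) / std_error


results = {}
for g1, g2 in combinations(groups, 2):
    t = pooled_t(groups[g1], groups[g2])
    results[(g1, g2)] = t
    print(g1, g2, round(t, 2), abs(t) > T_CRITICAL)
```

With these scores, A vs. B and B vs. C exceed the critical value, while A vs. C does not.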
  Computer Printout Example
                                             Descriptives

GAVE 'THE FINGER' TO SOMEONE WHILE DRIVI
                                                         95% Confidence Interval for
                                                                  Mean
          N       Mean      Std. Deviation Std. Error   Lower Bound Upper Bound        Minimum    Maximum
1.00       1462   1.7148          1.28915     .03372         1.6486         1.7809         1.00       7.00
2.00       1858   1.3660           .93491     .02169         1.3234         1.4085         1.00       7.00
Total      3320   1.5196          1.11830     .01941         1.4815         1.5576         1.00       7.00



                                            ANOVA

        GAVE 'THE FINGER' TO SOMEONE WHILE DRIVI
                            Sum of
                           Squares         df       Mean Square           F            Sig.
        Between Groups       99.536            1         99.536          81.522          .000
        Within Groups      4051.191         3318          1.221
        Total              4150.727         3319
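
The printout's Mean Square and F columns follow from its Sum of Squares and df columns. A small check, using only the numbers shown in the ANOVA table above:

```python
# Values copied from the ANOVA printout.
ss_between, df_between = 99.536, 1
ss_within, df_within = 4051.191, 3318

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_value = ms_between / ms_within

print(round(ms_within, 3))               # 1.221
print(round(f_value, 3))                 # 81.522
print(round(ss_between + ss_within, 3))  # 4150.727 (the Total row)
print(df_between + df_within)            # 3319 (the Total df)
```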
              Two-way ANOVA

 ANOVA compares:
   Between and within groups variance

 Adds a second IV to one-way ANOVA
   2 IVs and 1 DV

 Analyzes significance of:
   Main effects of each IV
   Interaction effect of the IVs
Graphs of potential outcomes

 No main effects or interactions

 Main effects of color only

 Main effects for motion only

 Main effects for color and motion

 Interactions
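
The outcomes listed above can be told apart by comparing cell means. The sketch below uses made-up arousal means for a 2 x 2 design (color vs. B&W, motion vs. still); it only describes the pattern in the means and does not replace the F tests for significance. The helper `describe` is hypothetical.

```python
def describe(cell_means):
    """Return (color effect, motion effect, interaction contrast) for 2 x 2 cell means."""
    cm, cs = cell_means["color", "motion"], cell_means["color", "still"]
    bm, bs = cell_means["bw", "motion"], cell_means["bw", "still"]
    color_effect = (cm + cs) / 2 - (bm + bs) / 2   # color minus B&W
    motion_effect = (cm + bm) / 2 - (cs + bs) / 2  # motion minus still
    interaction = (cm - cs) - (bm - bs)            # does the motion gap depend on color?
    return color_effect, motion_effect, interaction


# Hypothetical cell means, one dictionary per pattern.
patterns = {
    "main effect of color only": {
        ("color", "motion"): 6.0, ("color", "still"): 6.0,
        ("bw", "motion"): 3.0, ("bw", "still"): 3.0,
    },
    "crossover interaction": {
        ("color", "motion"): 3.0, ("color", "still"): 6.0,
        ("bw", "motion"): 6.0, ("bw", "still"): 3.0,
    },
    "color matters only for motion": {
        ("color", "motion"): 6.0, ("color", "still"): 3.0,
        ("bw", "motion"): 3.0, ("bw", "still"): 3.0,
    },
}

for name, means in patterns.items():
    print(name, describe(means))
```

A nonzero interaction contrast means the motion vs. still difference changes between color and B&W.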
            Graphs
[Graph template: AROUSAL on the vertical axis; Color vs. B&W on the horizontal axis; points marked x for Motion and * for Still]
    No main effects or interactions
[Graph: AROUSAL by Color vs. B&W. Motion (x) and Still (*) sit at about the same mid-level of arousal for both Color and B&W.]
    Main effects for color only
[Graph: AROUSAL by Color vs. B&W. Motion (x) and Still (*) are both high for Color and both low for B&W.]
    Main effects for motion only
[Graph: AROUSAL by Color vs. B&W. Motion (x) is high and Still (*) is low for both Color and B&W.]
    Main effects for color and motion
[Graph: AROUSAL by Color vs. B&W. Motion (x) lies above Still (*) in both conditions, and both drop from Color to B&W.]
    Transverse interaction
[Graph: AROUSAL by Color vs. B&W. Still (*) is high and Motion (x) is low for Color; the pattern reverses for B&W, so the lines cross.]
    Interaction: color only makes a difference for motion
[Graph: AROUSAL by Color vs. B&W. Motion (x) is high for Color but low for B&W; Still (*) is low for both.]
Partitioning the variance for Two-way ANOVA
Total variation =

  Main effect variable 1 +

  Main effect variable 2 +

  Interaction +

  Residual (within)
    Summary Table for Two-way ANOVA
Source          SS   df   MS   F
Main effect 1
Main effect 2
Interaction
Within


Total
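
To show how the summary table gets filled in, here is a minimal sketch for a balanced 2 x 2 design. The scores are made up for illustration, and the shortcut "interaction SS = cell SS minus the two main-effect SS" assumes equal cell sizes.

```python
from statistics import mean

# Hypothetical balanced data: (factor 1 level, factor 2 level) -> scores.
cells = {
    ("color", "motion"): [7, 9],
    ("color", "still"): [4, 6],
    ("bw", "motion"): [5, 7],
    ("bw", "still"): [4, 6],
}

scores = [x for xs in cells.values() for x in xs]
grand = mean(scores)


def level_scores(position):
    """Pool the scores for each level of the factor at the given key position."""
    levels = {}
    for key, xs in cells.items():
        levels.setdefault(key[position], []).extend(xs)
    return levels


def ss_main(position):
    """n * (level mean - grand mean)^2, summed over the factor's levels."""
    return sum(len(xs) * (mean(xs) - grand) ** 2
               for xs in level_scores(position).values())


ss_total = sum((x - grand) ** 2 for x in scores)
ss_cells = sum(len(xs) * (mean(xs) - grand) ** 2 for xs in cells.values())
a, b = len(level_scores(0)), len(level_scores(1))

ss = {
    "Main effect 1": ss_main(0),
    "Main effect 2": ss_main(1),
    "Interaction": ss_cells - ss_main(0) - ss_main(1),
    "Within": sum((x - mean(xs)) ** 2 for xs in cells.values() for x in xs),
}
df = {
    "Main effect 1": a - 1,
    "Main effect 2": b - 1,
    "Interaction": (a - 1) * (b - 1),
    "Within": len(scores) - a * b,
}

ms_within = ss["Within"] / df["Within"]
print(f"{'Source':<15}{'SS':>8}{'df':>4}{'MS':>8}{'F':>8}")
for source in ss:
    ms = ss[source] / df[source]
    f_text = "" if source == "Within" else f"{ms / ms_within:.2f}"
    print(f"{source:<15}{ss[source]:>8.2f}{df[source]:>4}{ms:>8.2f}{f_text:>8}")
print(f"{'Total':<15}{ss_total:>8.2f}{len(scores) - 1:>4}")
```

Each F in the table divides an effect's mean square by the within (residual) mean square.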
                  Printout Example
                        Tests of Between-Subjects Effects

Dependent Variable: MARIJUANA USE SHOULD BE LEGALIZED
                  Type III Sum
Source             of Squares       df      Mean Square         F        Sig.
Corrected Model         74.465 a        7        10.638         3.392      .001
Intercept             5889.077          1      5889.077      1877.565      .000
SEX                     13.191          1        13.191         4.205      .040
RACE2                   19.048          3         6.349         2.024      .108
SEX * RACE2                .560         3          .187           .060     .981
Error               10366.297        3305         3.137
Total               31942.000        3313
Corrected Total     10440.762        3312
  a. R Squared = .007 (Adjusted R Squared = .005)
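
The columns of this printout can be cross-checked the same way as in the one-way case. The sketch below uses only values printed in the table; results may differ from the printout in the last digit because the printed sums of squares are already rounded. The adjusted R squared line uses the usual formula, 1 minus the ratio of error mean square to corrected-total mean square.

```python
# Values copied from the Tests of Between-Subjects Effects printout.
error_ss, error_df = 10366.297, 3305
total_ss, total_df = 10440.762, 3312  # Corrected Total row

effects = {
    "Corrected Model": (74.465, 7),
    "SEX": (13.191, 1),
    "RACE2": (19.048, 3),
    "SEX * RACE2": (0.560, 3),
}

ms_error = error_ss / error_df
f_values = {}
for name, (ss, df) in effects.items():
    mean_square = ss / df
    f_values[name] = mean_square / ms_error  # each effect is tested against Error
    print(name, round(mean_square, 3), round(f_values[name], 3))

model_ss = effects["Corrected Model"][0]
r_squared = model_ss / total_ss
adj_r_squared = 1 - (error_ss / error_df) / (total_ss / total_df)
print(round(r_squared, 3), round(adj_r_squared, 3))  # 0.007 0.005
```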
              Printout plot
[Plot: Estimated Marginal Means of MARIJUANA USE S (vertical axis, about 2.2 to 3.2) by SEX OF RESPONDENT (1.00, 2.00), with separate lines for RACE OF RESPONDENT(W (1.00, 2.00, 3.00, 4.00)]
Scatter Plot of Price and Attendance
[Scatter plot: Attendance (vertical axis, 0.5 to 2.5) against Price (horizontal axis, 2 to 6)]

 Price is the average seat price for a single regular season game in today’s dollars.

 Attendance is total annual attendance, in millions of people per annum.
      Is there a relation there?

 Let’s use linear regression to find out, that is
   Let’s fit a straight line to the data.
   But aren’t there lots of straight lines that could fit?
      Yes!
             Desirable Properties
 We would like the “closest” line, that is, the one that
  minimizes the error
   The idea here is that there is actually a relation, but there is also
     noise. We would like the noise (i.e., the deviation
     from the postulated straight line) to be as small as possible.

 We would like the error (or noise) to be unrelated to the
  independent variable (in this case price).
      If it were related, it would not be noise, right?
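
Both desirable properties can be seen in a least-squares fit. The sketch below assumes "closest" means the smallest sum of squared errors, and it uses made-up price and attendance values, not the data behind the scatter plot.

```python
# Hypothetical data: price in dollars, attendance in millions.
price = [2.0, 3.0, 4.0, 5.0, 6.0]
attendance = [2.4, 2.1, 1.7, 1.4, 0.9]

n = len(price)
mean_x = sum(price) / n
mean_y = sum(attendance) / n

# Least-squares estimates of the slope and intercept.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(price, attendance))
sxx = sum((x - mean_x) ** 2 for x in price)
slope = sxy / sxx
intercept = mean_y - slope * mean_x


def sse_for(b0, b1):
    """Sum of squared errors for the candidate line y = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(price, attendance))


residuals = [y - (intercept + slope * x) for x, y in zip(price, attendance)]

print(round(slope, 3), round(intercept, 3))  # -0.37 3.18
# Property 1: any other straight line has a larger squared error.
print(sse_for(intercept, slope) < sse_for(2.9, -0.3))  # True
# Property 2: the residuals are unrelated to price (zero cross-product, up to rounding).
print(abs(sum(e * (x - mean_x) for e, x in zip(residuals, price))) < 1e-9)  # True
```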
            Simple Regression

 The simple linear regression MODEL is:

                       y = β0 + β1x + ε

 describes how y is related to x
   β0 and β1 are called parameters of the model.
   ε is a random variable called the error term.
         Simple Regression


 Graph of the regression equation is a straight line.
 β0 is the population y-intercept of the regression line.
 β1 is the population slope of the regression line.
 E(y) is the expected value of y for a given x value
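
Since the error term averages out, the regression line gives E(y) = β0 + β1x. A tiny sketch with made-up parameter values shows a positive slope and a zero slope; `expected_y` is a hypothetical helper.

```python
def expected_y(x, beta0, beta1):
    """Expected value of y at x on the regression line E(y) = beta0 + beta1 * x."""
    return beta0 + beta1 * x


xs = [0, 1, 2, 3]
rising = [expected_y(x, beta0=1.0, beta1=0.5) for x in xs]
flat = [expected_y(x, beta0=1.0, beta1=0.0) for x in xs]

print(rising)  # [1.0, 1.5, 2.0, 2.5]
print(flat)    # [1.0, 1.0, 1.0, 1.0]
```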
Simple Regression
[Graph: regression line of E(y) against x, with intercept β0; slope β1 is positive]
Simple Regression
[Graph: regression line of E(y) against x, with intercept β0; slope β1 is 0]
          Types of Regression Models

 Regression Models
   1 Explanatory Variable: Simple
     Linear
     Non-Linear
   2+ Explanatory Variables: Multiple
     Linear
     Non-Linear
          Regression Modeling Steps

 1.   Hypothesize Deterministic Components
 2.   Estimate Unknown Model Parameters
 3.   Specify Probability Distribution of Random Error Term
   Estimate Standard Deviation of Error
 4.   Evaluate Model
 5.   Use Model for Prediction & Estimation
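
The steps above can be sketched for a simple linear model as follows. The data are made up, and the choices for each step (least squares for step 2, s = sqrt(SSE / (n - 2)) for step 3, R squared for step 4) are common textbook options rather than something the slides specify.

```python
from math import sqrt

# Hypothetical data.
x = [2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.4, 2.1, 1.7, 1.4, 0.9]
n = len(x)

# Step 1: hypothesize the deterministic component E(y) = b0 + b1 * x.
# Step 2: estimate the unknown parameters by least squares.
mean_x, mean_y = sum(x) / n, sum(y) / n
b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = mean_y - b1 * mean_x

# Step 3: treat the error as random noise with mean 0 (often assumed normal)
# and estimate its standard deviation from the residuals.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = sqrt(sse / (n - 2))

# Step 4: evaluate the model, here with the share of variation explained.
sst = sum((yi - mean_y) ** 2 for yi in y)
r_squared = 1 - sse / sst

# Step 5: use the model for prediction.
prediction = b0 + b1 * 4.5

print(round(s, 4), round(r_squared, 4), round(prediction, 3))
```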
Linear Multiple Regression Model

 1.  Relationship between 1 dependent & 2 or more
  independent variables is a linear function

       Yi = β0 + β1X1i + β2X2i + ... + βkXki + εi

   Yi: dependent (response) variable
   β0: population Y-intercept
   β1, ..., βk: population slopes
   X1i, ..., Xki: independent (explanatory) variables
   εi: random error
         Multiple Regression Model

Multivariate model: Yi = β0 + β1X1i + β2X2i + εi

[Graph: response plane E(Y) = β0 + β1X1i + β2X2i over the X1 and X2 axes, with intercept β0; the observed Y at (X1i, X2i) lies a distance εi from the plane]
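
A two-predictor model of this form can be fitted by least squares. The sketch below solves the normal equations in plain Python; the data are made up (generated near Y = 1 + 2X1 + 3X2), and `solve` and `fit_plane` are hypothetical helper names.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_plane(x1, x2, y):
    """Least-squares estimates [b0, b1, b2] for Y = b0 + b1*X1 + b2*X2 + error."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]  # leading 1 gives the intercept
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve(xtx, xty)


# Hypothetical observations.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [9.2, 7.9, 18.7, 18.3, 29.1, 27.8]

b0, b1, b2 = fit_plane(x1, x2, y)
residuals = [yi - (b0 + b1 * u + b2 * v) for u, v, yi in zip(x1, x2, y)]

print(round(b0, 4), round(b1, 4), round(b2, 4))  # approximately 1.0875 1.9875 2.9875
```

The residuals are the vertical distances from each observed Y to the fitted response plane.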

								