# Multivariate Analysis by WLae56

```
Multivariate Analysis
One-way ANOVA

• Tests the difference in the means of 2 or more nominal groups
  • E.g., High vs. Medium vs. Low exposure

• ANOVA can be extended to more than one IV
  • Two-way ANOVA, Three-way ANOVA, etc.
ANOVA

• _______-way ANOVA
  • Number refers to the number of IVs

• Tests whether there are differences in the means of IV groups
  • E.g.:
    • Experimental vs. control group
    • Women vs. Men
    • High vs. Medium vs. Low exposure
Logic of ANOVA

• Variance partitioned into:
  • 1. Systematic variance:
    • the result of the influence of the IVs
  • 2. Error variance:
    • the result of unknown factors

• ANOVA partitions the variation in scores into two
  parts by calculating the “sum of squares”:
  • 1. Between groups variation (systematic)
  • 2. Within groups variation (error)

• SS total = SS between + SS within
Significant and Non-significant
Differences

Significant:        Non-significant:
Between > Within    Within > Between
Partitioning the Variance
Comparisons
• Total variation = score – grand mean

• Between variation = group mean – grand mean

• Within variation = score – group mean

• Deviation is taken, then squared, then summed across cases
  • Hence the term “Sum of squares” (SS)
One-way ANOVA example
Total SS (deviation from grand mean)
         Group A   Group B   Group C
            49        56        54
            52        57        52
            52        57        56
            53        60        50
            49        60        53

Mean =      51        58        53

Grand mean = 54
One-way ANOVA example

Total SS (deviation from grand mean)
     Group A          Group B          Group C
   dev    sq        dev    sq        dev    sq
    -5    25          2     4          0     0
    -2     4          3     9         -2     4
    -2     4          3     9          2     4
    -1     1          6    36         -4    16
    -5    25          6    36         -1     1

   Sum    59        Sum    94        Sum    25

Sum of squares = 59 + 94 + 25 = 178
One-way ANOVA example
Between SS (group mean – grand mean)
                                      A     B     C
Group means                          51    58    53
Group deviation from grand mean      -3     4    -1
Squared deviation                     9    16     1
n(squared deviation), n = 5          45    80     5

Between SS = 45 + 80 + 5 = 130

Grand mean = 54
One-way ANOVA example

Within SS (score - group mean)
                                  A     B     C
Group means                      51    58    53

Deviation from group means       -2    -2     1
                                  1    -1    -1
                                  1    -1     3
                                  2     2    -3
                                 -2     2     0

Squared deviations                4     4     1
                                  1     1     1
                                  1     1     9
                                  4     4     9
                                  4     4     0

Sum of squared deviations        14    14    20

Within SS = 14 + 14 + 20 = 48
The F equation for ANOVA

      Between groups sum of squares / (k - 1)
F = -------------------------------------------
      Within groups sum of squares / (N - k)

N = total number of subjects
k = number of groups
Numerator = Mean square between groups
Denominator = Mean square within groups
F-table
5% POINTS FOR THE F DISTRIBUTION

Columns: numerator degrees of freedom
Rows: denominator degrees of freedom

  df      1      2      3      4      5      6      7      8      9     10

   1    161    199    216    225    230    234    237    239    241    242
   2   18.5   19.0   19.2   19.2   19.3   19.3   19.4   19.4   19.4   19.4
   3   10.1   9.55   9.28   9.12   9.01   8.94   8.89   8.85   8.81   8.79
   4   7.71   6.94   6.59   6.39   6.26   6.16   6.09   6.04   6.00   5.96
   5   6.61   5.79   5.41   5.19   5.05   4.95   4.88   4.82   4.77   4.74

   6   5.99   5.14   4.76   4.53   4.39   4.28   4.21   4.15   4.10   4.06
   7   5.59   4.74   4.35   4.12   3.97   3.87   3.79   3.73   3.68   3.64
   8   5.32   4.46   4.07   3.84   3.69   3.58   3.50   3.44   3.39   3.35
   9   5.12   4.26   3.86   3.63   3.48   3.37   3.29   3.23   3.18   3.14
  10   4.96   4.10   3.71   3.48   3.33   3.22   3.14   3.07   3.02   2.98

  11   4.84   3.98   3.59   3.36   3.20   3.09   3.01   2.95   2.90   2.85
  12   4.75   3.89   3.49   3.26   3.11   3.00   2.91   2.85   2.80   2.75
  13   4.67   3.81   3.41   3.18   3.03   2.92   2.83   2.77   2.71   2.67
  14   4.60   3.74   3.34   3.11   2.96   2.85   2.76   2.70   2.65   2.60
  15   4.54   3.68   3.29   3.06   2.90   2.79   2.71   2.64   2.59   2.54

  16   4.49   3.63   3.24   3.01   2.85   2.74   2.66   2.59   2.54   2.49
  17   4.45   3.59   3.20   2.96   2.81   2.70   2.61   2.55   2.49   2.45
  18   4.41   3.55   3.16   2.93   2.77   2.66   2.58   2.51   2.46   2.41
  19   4.38   3.52   3.13   2.90   2.74   2.63   2.54   2.48   2.42   2.38
  20   4.35   3.49   3.10   2.87   2.71   2.60   2.51   2.45   2.39   2.35

  21   4.32   3.47   3.07   2.84   2.68   2.57   2.49   2.42   2.37   2.32
  22   4.30   3.44   3.05   2.82   2.66   2.55   2.46   2.40   2.34   2.30
  23   4.28   3.42   3.03   2.80   2.64   2.53   2.44   2.37   2.32   2.27
  24   4.26   3.40   3.01   2.78   2.62   2.51   2.42   2.36   2.30   2.25
  25   4.24   3.39   2.99   2.76   2.60   2.49   2.40   2.34   2.28   2.24

  26   4.23   3.37   2.98   2.74   2.59   2.47   2.39   2.32   2.27   2.22
  27   4.21   3.35   2.96   2.73   2.57   2.46   2.37   2.31   2.25   2.20
  28   4.20   3.34   2.95   2.71   2.56   2.45   2.36   2.29   2.24   2.19
  29   4.18   3.33   2.93   2.70   2.55   2.43   2.35   2.28   2.22   2.18
  30   4.17   3.32   2.92   2.69   2.53   2.42   2.33   2.27   2.21   2.16

  35   4.12   3.27   2.87   2.64   2.49   2.37   2.29   2.22   2.16   2.11
  40   4.08   3.23   2.84   2.61   2.45   2.34   2.25   2.18   2.12   2.08
  50   4.03   3.18   2.79   2.56   2.40   2.29   2.20   2.13   2.07   2.03
  60   4.00   3.15   2.76   2.53   2.37   2.25   2.17   2.10   2.04   1.99
  70   3.98   3.13   2.74   2.50   2.35   2.23   2.14   2.07   2.02   1.97

  80   3.96   3.11   2.72   2.49   2.33   2.21   2.13   2.06   2.00   1.95
 100   3.94   3.09   2.70   2.46   2.31   2.19   2.10   2.03   1.97   1.93
 150   3.90   3.06   2.66   2.43   2.27   2.16   2.07   2.00   1.94   1.89
 300   3.87   3.03   2.63   2.40   2.24   2.13   2.04   1.97   1.91   1.86
1000   3.85   3.00   2.61   2.38   2.22   2.11   2.02   1.95   1.89   1.84
Significance of F
F observed = (130/2) / (48/12) = 65 / 4 = 16.25

F critical is 3.89 (2, 12 df)

F observed 16.25 > F critical 3.89

Groups are significantly different

- t-tests could then be run to determine which groups are
significantly different from which other groups
Computer Printout Example
Descriptives

GAVE 'THE FINGER' TO SOMEONE WHILE DRIVI
                                                     95% Confidence Interval for Mean
            N      Mean   Std. Deviation  Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
1.00     1462    1.7148      1.28915        .03372        1.6486        1.7809       1.00      7.00
2.00     1858    1.3660       .93491        .02169        1.3234        1.4085       1.00      7.00
Total    3320    1.5196      1.11830        .01941        1.4815        1.5576       1.00      7.00

ANOVA

GAVE 'THE FINGER' TO SOMEONE WHILE DRIVI
                  Sum of Squares     df    Mean Square        F      Sig.
Between Groups          99.536        1        99.536      81.522    .000
Within Groups         4051.191     3318         1.221
Total                 4150.727     3319
Two-way ANOVA

• ANOVA compares:
  • Between and within groups variance

• Adds a second IV to one-way ANOVA
  • 2 IVs and 1 DV

• Analyzes significance of:
  • Main effects of each IV
  • Interaction effect of the IVs
Graphs of potential outcomes

• No main effects or interactions

• Main effects for color only

• Main effects for motion only

• Main effects for color and motion

• Interactions
Graphs
[Figure series: AROUSAL (y-axis) for Color vs. B&W (x-axis), with separate lines for Motion (x) and Still (*). Panels:
  1. No main effects or interactions
  2. Main effects for color only
  3. Main effects for motion only
  4. Main effects for color and motion
  5. Transverse interaction
  6. Interaction: color only makes a difference for motion]
Partitioning the variance for Two-way ANOVA
Total variation =

Main effect variable 1 +

Main effect variable 2 +

Interaction +

Residual (within)
Summary Table for Two-way
ANOVA
Source          SS   df   MS   F
Main effect 1
Main effect 2
Interaction
Within

Total
Printout Example
Tests of Between-Subjects Effects

Dependent Variable: MARIJUANA USE SHOULD BE LEGALIZED
                     Type III Sum
Source                of Squares       df    Mean Square          F      Sig.
Corrected Model          74.465 a       7        10.638        3.392     .001
Intercept              5889.077         1      5889.077     1877.565     .000
SEX                      13.191         1        13.191        4.205     .040
RACE2                    19.048         3         6.349        2.024     .108
SEX * RACE2                .560         3          .187         .060     .981
Error                 10366.297      3305         3.137
Total                 31942.000      3313
Corrected Total       10440.762      3312
a. R Squared = .007 (Adjusted R Squared = .005)
Printout plot
[Figure: Estimated Marginal Means of MARIJUANA USE S. x-axis: SEX OF RESPONDENT (1.00, 2.00); y-axis: estimated marginal means, ticks from 2.2 to 3.2; one line per RACE OF RESPONDENT category (1.00, 2.00, 3.00, 4.00)]
Scatter Plot of Price and Attendance
[Figure: scatter plot of Attendance (y-axis, ticks 0.5 to 2.5) against Price (x-axis, ticks 2 to 6)]

• Price is the average seat price for a single regular season game in today’s dollars

• Attendance is total annual attendance and is in millions of people per annum.
Is there a relation there?

• Let’s use linear regression to find out, that is
  • Let’s fit a straight line to the data.
  • But aren’t there lots of straight lines that could fit?
    • Yes!
Desirable Properties
• We would like the “closest” line, that is the one that
  minimizes the error
  • The idea here is that there is actually a relation, but there is also
    noise. We would like the noise (i.e., the deviation
    from the postulated straight line) to be as small as possible.

• We would like the error (or noise) to be unrelated to the
  independent variable (in this case price).
  • If it were related, it would not be noise.
Simple Regression

• The simple linear regression MODEL is:

    y = β0 + β1x + ε

• describes how y is related to x
• β0 and β1 are called parameters of the model.
• ε is a random variable called the error term.
Simple Regression

• Graph of the regression equation, E(y) = β0 + β1x, is a straight line.
• β0 is the population y-intercept of the regression line.
• β1 is the population slope of the regression line.
• E(y) is the expected value of y for a given x value
Simple Regression
[Figure, two panels: regression line of E(y) against x with intercept β0. Panel 1: slope β1 is positive. Panel 2: slope β1 is 0.]
Types of Regression Models

Regression Models
  • 1 Explanatory Variable: Simple
    • Linear
    • Non-Linear
  • 2+ Explanatory Variables: Multiple
    • Linear
    • Non-Linear
Regression Modeling Steps

• 1. Hypothesize Deterministic Components
• 2. Estimate Unknown Model Parameters
• 3. Specify Probability Distribution of Random Error Term
  • Estimate Standard Deviation of Error
• 4. Evaluate Model
• 5. Use Model for Prediction & Estimation
Linear Multiple Regression Model

• 1. Relationship between 1 dependent & 2 or more
  independent variables is a linear function

    Yi = β0 + β1X1i + β2X2i + ... + βkXki + εi

  Yi          = dependent (response) variable
  β0          = population Y-intercept
  β1 ... βk   = population slopes
  X1i ... Xki = independent (explanatory) variables
  εi          = random error
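Multiple Regression Model

Multivariate model: Yi = β0 + β1X1i + β2X2i + εi

[Figure: response plane E(Y) = β0 + β1X1i + β2X2i drawn over the X1 and X2 axes, with intercept β0 on the Y axis; the observed Y at the point (X1i, X2i) lies off the plane by the error εi]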

```
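The one-way ANOVA example in the slides can be checked with a few lines of code. The sketch below is not part of the original deck: it uses only the scores for Groups A, B and C given above, and the function name `one_way_anova` is a hypothetical helper introduced for illustration. It follows the slides' own steps (deviations from the grand mean and from group means, squared and summed) and then applies the F equation with k - 1 and N - k degrees of freedom.

```python
# Scores copied from the one-way ANOVA example slides.
groups = {
    "A": [49, 52, 52, 53, 49],
    "B": [56, 57, 57, 60, 60],
    "C": [54, 52, 56, 50, 53],
}


def one_way_anova(groups):
    """Hypothetical helper: return (ss_between, ss_within, ss_total, f_ratio)."""
    all_scores = [x for scores in groups.values() for x in scores]
    n_total = len(all_scores)          # N = total number of subjects
    k = len(groups)                    # k = number of groups
    grand_mean = sum(all_scores) / n_total

    # Total SS: each score's deviation from the grand mean, squared and summed.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

    ss_between = 0.0
    ss_within = 0.0
    for scores in groups.values():
        group_mean = sum(scores) / len(scores)
        # Between SS: n * (group mean - grand mean)^2 for each group.
        ss_between += len(scores) * (group_mean - grand_mean) ** 2
        # Within SS: each score's deviation from its own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in scores)

    ms_between = ss_between / (k - 1)        # mean square between groups
    ms_within = ss_within / (n_total - k)    # mean square within groups
    return ss_between, ss_within, ss_total, ms_between / ms_within


ss_between, ss_within, ss_total, f_ratio = one_way_anova(groups)
print(ss_between, ss_within, ss_total, f_ratio)  # 130.0 48.0 178.0 16.25
```

The printed values match the slides: Between SS = 130, Within SS = 48, Total SS = 178, and F = 16.25, which exceeds the tabled critical value of 3.89 for (2, 12) degrees of freedom.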