# matrices and data by FB80043

• pg 1
```
Matrices are essentially boxes filled with numbers organized in rows and columns.
Hence matrices are easily represented on an Excel worksheet.
Capital Letters such as A, B, C, D, etc. are generally used as names for matrices.
Most statistical notation uses x, y & z to denote sample data; similarly X, Y & Z will denote a matrix of sample data
with each column representing a variable and each row representing a sampled unit.

The size of a matrix is determined by the number of rows and number of columns.
In reporting the size of a matrix,
the number of rows is given 1st and the number of columns is given 2nd.
2 by 3 denotes a matrix with 2 rows and 3 columns.
5 by 4 denotes a matrix with 5 rows and 4 columns.
3 by 1 denotes a matrix with 3 rows and 1 column and is called a column matrix or column vector.
1 by 5 denotes a matrix with 1 row and 5 columns and is called a row matrix or row vector.

If the (number of rows) = (number of columns), then the matrix is a Square Matrix.

I will place a box around a set of cells to denote a matrix in Excel.
Standard mathematical notation is to put the numbers in brackets.

Matrix A   Col. 1   Col. 2   Col. 3   Col. 4
Row 1        -1       -2        3        4
Row 2        11       12       13       14
Row 3        24       23       21       20
3 by 4 matrix of sample data

A = \begin{bmatrix} -1 & -2 & 3 & 4 \\ 11 & 12 & 13 & 14 \\ 24 & 23 & 21 & 20 \end{bmatrix}

Special Matrices
Column Matrix - Column Vector - Matrix with one column.
Row Matrix - Row Vector - Matrix with one row.
Square Matrix - Matrix with (number of rows) = (number of columns).
Zero Matrix - Matrix with only zeroes for all number values or entries.
Matrix Transpose - A matrix created from another matrix by interchanging its rows and columns.
Transpose notation - the transpose of A can be denoted as A^T or A'.
Symmetric Matrix - A square matrix such that A = A', that is a_ij = a_ji for all values of i & j.
Diagonal Matrix - A symmetric matrix with all values off of the main diagonal = 0; a_ij = 0 for i≠j.
Identity Matrix - A diagonal matrix with all diagonal values = 1; a_ij = 1 for i=j & a_ij = 0 for i≠j.

A Transpose
Matrix A   Col. 1   Col. 2   Col. 3   Col. 4
Row 1        -1       -2        3        4
Row 2        11       12       13       14
Row 3        24       23       21       20
3 by 4

A^T or A'  Col. 1   Col. 2   Col. 3
Row 1        -1       11       24
Row 2        -2       12       23
Row 3         3       13       21
Row 4         4       14       20
4 by 3

To add or subtract two matrices, the matrices must be the same size: the
number of rows must be the same for both matrices and the number of columns
must also be the same. The result is a matrix of exactly the same size,
obtained by performing the operation (addition or subtraction) on each pair
of corresponding elements or cells.

A                           B                        A+B
-1   -2        3    4        9    8        7    5        8     6    10    9
11   12       13   14   +   -1    0       -2    0   =   10    12    11   14
24   23       21   20        0   -3        1   -5       24    20    22   15
3 by 4                     3 by 4                        3 by 4

A                           B                         A-B
-1   -2        3    4        9    8        7    5       -10   -10   -4   -1
11   12       13   14   -   -1    0       -2    0   =    12    12   15   14
24   23       21   20        0   -3        1   -5        24    26   20   25
3 by 4                     3 by 4                        3 by 4
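
A sketch of how the sum and difference above could be produced on a worksheet. The cell
ranges are hypothetical (they are not taken from the workbook): assume A is stored in
B2:E4 and B in G2:J4. Select a 3 by 4 output range, type the formula, and hold down
Ctrl & Shift while pressing Enter:
    =B2:E4+G2:J4     returns A+B
    =B2:E4-G2:J4     returns A-B
The first entry of A+B is -1 + 9 = 8 and the first entry of A-B is -1 - 9 = -10,
matching the tables above.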
Excel has several functions that perform operations on matrices, and some can produce a
result that is a matrix displayed in an array of cells.

To use one of the functions that places a matrix result in an array of cells, highlight the
output cells first, then one MUST hold down the Ctrl & Shift keys simultaneously while
clicking on OK or pressing the Enter key.

These functions appear in two primary locations in Excel functions.
TRANSPOSE in the Lookup & Reference category of functions.
MMULT in the Math & Trig category of functions.
MINVERSE in the Math & Trig category of functions.
The above functions allow for a matrix as an input and can produce matrix output.

MDETERM in the Math & Trig category of functions calculates the DETERMINANT of a
square matrix.
The above function allows for a matrix as an input and will have a single number
rather than a matrix as output.
If the value of the determinant for a covariance or correlation matrix is zero then at
least one variable in the data set is a linear combination of other variables.

A                  B
0    1             -1    0    1
1    2              2    3    4
2 by 2             2 by 3

The number of columns for the 1st matrix must be equal to the number of rows for the
2nd matrix. If the number of columns for the 1st matrix does not equal the number of
rows for the 2nd matrix then multiplication of the two matrices is impossible.

A*B          Col1          Col2        Col3
ROW1         0*(-1)+1*2    0*0+1*3     0*1+1*4
ROW2         1*(-1)+2*2    1*0+2*3     1*1+2*4
Result = 2 by 3

2    3    4        These are calculated from
3    6    9        the cells above

2    3    4        This result was obtained
3    6    9        by using the MMULT function
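
The rule behind the cell formulas above can be written in general form. This is a sketch
in LaTeX notation; the symbols p, a_{ik}, b_{kj} and c_{ij} are introduced here for
illustration. If A is n by p and B is p by m, then C = A*B is n by m with
    c_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj}
For the example above, the entry in ROW2, Col3 is
    c_{23} = a_{21} b_{13} + a_{22} b_{23} = 1*1 + 2*4 = 9
On a worksheet, assuming (hypothetically) that A is in B2:C3 and B is in E2:G3, select a
2 by 3 output range and enter
    =MMULT(B2:C3,E2:G3)
while holding down Ctrl & Shift and pressing Enter.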
Element   Operation               Identity Element   Inverse Operation     Inverse Element
a         Multiplication          1                  Division              a^-1 = 1/a
a         Division                1                  Multiplication        a^-1 = 1/a
A         Matrix Multiplication   Identity Matrix    Multiply by Inverse   Inverse Matrix = A^-1

2 by 2 Identity
1    0
0    1

3 by 3 Identity
1    0    0
0    1    0
0    0    1

A^-1 is used to denote the inverse of the matrix A
|A| is used to denote the determinant of the matrix A
To find an inverse of a matrix A
1. A must be a square matrix
2. The determinant of A must not be equal to 0, |A|≠0.
Steps for finding an inverse using Excel are on the next workbook.
To find an inverse, the matrix MUST be square. If the number of rows does not equal the
number of columns, an inverse cannot be found. Also the determinant must not be zero.

A
1      2       3
0      1       0
2     -1       4
3 by 3

|A| =    -2   = Determinant of A

Placing the cursor on cell B13, click on the function button, go to Math & Trig and then
select MDETERM. Select the cells B8 through D10 and click on OK. Since the determinant of
-2 is not zero, an inverse can be found.

A^-1
-2     5.5      1.5
0       1        0
1     -2.5     -0.5
3 by 3

Highlight the cells B16 through D18, click on the function button, go to Math & Trig and
then select MINVERSE. Select the cells B8 through D10 and simultaneously hold down both
the Ctrl & Shift keys while clicking on OK. The inverse will appear in the highlighted
cells B16 through D18.

A * A^-1
1      0       0
0      1       0
0      0       1
3 by 3

To confirm that this is the inverse, multiply the original matrix times the inverse.
Highlight the cells B22 through D24, click on the function button, go to Math & Trig and
then select MMULT. Select the cells B8 through D10 for Array1 and cells B16 through D18
for Array2, then hold down the Ctrl & Shift keys simultaneously while clicking on OK. The
resulting 3 by 3 identity matrix provides confirmation that it is the inverse.

Determinant of a 2 by 2 matrix

If B = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, then |B| = a \cdot d - b \cdot c
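
As a quick check of the MDETERM result for the 3 by 3 matrix A above, here is a sketch
using cofactor expansion, a technique the worksheet itself does not show. The second row
of A is [0 1 0], so expanding along that row leaves a single 2 by 2 determinant, formed by
deleting row 2 and column 2 of A:
    |A| = 1 \cdot \begin{vmatrix} 1 & 3 \\ 2 & 4 \end{vmatrix} = 1*4 - 3*2 = -2
This agrees with |A| = -2, so A has an inverse.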
The system of 3 equations below is converted to matrix form.
The inverse matrix is found and used to obtain the solution.
2x + 5y + 4z = 4
x + 4y + 3z = 1
x - 3y - 2z = 5
(Example 6, page 263 of 6th edition of Harshbarger & Reynolds)

C
2    5    4         x        4
1    4    3    *    y   =    1
1   -3   -2         z        5
3 by 3

Determinant(C) = -1, hence an inverse can be found.
To obtain C^-1, the Matrix Inverse for C, highlight a 3 by 3 area, select the function
MINVERSE, and indicate the location of the matrix C. Hold down the Ctrl & Shift keys
simultaneously while clicking on OK.

C^-1
-1     2     1
-5     8     2
 7   -11    -3

x                    4         3
y   =   C^-1    *    1   =    -2
z                    5         2

To find this matrix of Solution values, use the MMULT function to multiply C^-1 times the
column matrix of right hand side values. Hold down the Ctrl & Shift keys simultaneously
while clicking on OK.
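
A sketch of the same solution as a single worksheet formula, assuming (hypothetically) that
C is stored in B3:D5 and the right hand side values in F3:F5. Select a 3 by 1 output range
and enter the array formula while holding down Ctrl & Shift and pressing Enter:
    =MMULT(MINVERSE(B3:D5),F3:F5)
Substituting x = 3, y = -2, z = 2 back into the three equations confirms the result:
    2*3 + 5*(-2) + 4*2 = 6 - 10 + 8 = 4
    3 + 4*(-2) + 3*2   = 3 - 8 + 6  = 1
    3 - 3*(-2) - 2*2   = 3 + 6 - 4  = 5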
A (2 by 3)         I (2 by 2)        I*A
5    2    4        1    0            5    2    4
6    4    7        0    1            6    4    7

B                  A' or A transpose     I (3 by 3)
 1    3            5    6                1    0    0
-3    0            2    4                0    1    0
-3    5            4    7                0    0    1

Scalar Multiplication 2*A      B + A'         A*I
10    4    8                    6    9        5    2    4
12    8   14                   -1    4        6    4    7
                                1   12
Data matrix for values of observations of m variables (m columns)
for n sampling units (n rows).
x1= variable 1   x2= variable 2   x3= variable 3    …     xm= variable m
Sampling unit 1           x11             x12              x13         …           x1m
Sampling unit 2           x21             x22              x23         …           x2m
Sampling unit 3           x31             x32              x33         …           x3m
Sampling unit 4           x41             x42              x43         …           x4m
…                  …                …                …          …            …
Sampling unit n           xn1             xn2              xn3         …           xnm

Each variable will be measured using the same unit of measure (inches, kilograms, test grade, etc.)
for all sampling units but each variable may have a unit of measure that is different from all of the
other variables.
Mean Vector - a vector containing the sample mean of each variable
Variance Vector - a vector containing the sample variance of each variable
Standard Deviation Vector - a vector containing the sample standard deviation of each variable
Outlier
An observation with a unique combination of characteristics
identifiable as distinctly different from the other observations.
(Hair 6th text definition page 73)
An observation that is so extreme that it stands apart from the rest of
the observations; that is, it differs so greatly from the remaining
observations that it gives rise to the question whether it is from the
same population or involves measurement error.
(Pocket Dictionary of Statistics, 2002)

An Outlier may have an extremely large or small value for a single variable or it may
have an unusual combination of values as shown by the dark red point in the X-Y scatter
below.

Reasons for an Outlier:
Procedural or measurement error that is a result of data entry error,
measurement coding or measuring the wrong thing.
A rare or extraordinary event has occurred for the phenomenon being
observed and the observation(s) come from this event.
(Rain contributing to a 100 year flood.)
The data are for an observation that may be from a different
phenomenon than the one being studied.

Should an outlier be included or excluded for an analysis?
Is it truly indicative of the phenomenon being studied?
Does its inclusion mean that the sample data are no longer truly
representative of the phenomenon?

I like to use the terms
Extreme point for a data value set apart from the rest and
Nonindicative point for a data value that is deemed to not be
representative of the phenomenon being studied.

[X-Y scatter plot of X1 and X2; the outlier is the dark red point]

            X1     X2     sum     diff
1           18     18      36       0
2            2      2       4       0
3           11     13      24      -2
4            7     10      17      -3
5           18     18      36       0
6           19     21      40      -2
7            6      9      15      -3
8            0      0       0       0
9           17     20      37      -3
10          19     19      38       0
11           5      7      12      -2
12          11     14      25      -3
13           2      3       5      -1
14           7      8      15      -1
15          15     19      34      -4
16          17     18      35      -1
17           2      3       5      -1
18           1      1       2       0
19          11     14      25      -3
20          19     19      38       0
outlier     18      4      22      14    Extreme Point that seems to be Nonindicative.
Mean        10.7   11.4    22.14   -0.714
Variance    49.1   52.4   189.8    13.11
Correlation        0.87 (X1 & X2)         -0.065 (sum & diff)
Total Variance     101 (X1 & X2)          202.9 (sum & diff)
x1, x2 through xm denote m variables with observations for n sampling units.

Sample Mean for the jth variable
    \bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij}
AVERAGE function in Excel

Sample Variance for the jth variable
    s_j^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2
VAR is a 2007 function and VAR.S is an Excel 2010 function

Sample Standard Deviation is the square root of the sample variance.
The Sample Standard Deviation for the jth variable is denoted by sj.

Sample Covariance between the jth and kth variables
    COV(x_j, x_k) = s_{jk} = \frac{1}{n-1} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)
COVARIANCE.S in Excel 2010 calculates the sample covariance.

Excel 2007 or earlier does not have a function to find a sample covariance.
The COVAR function formula finds the population covariance or the maximum likelihood
estimator rather than the minimum variance unbiased estimator that is generally used to
calculate the sample covariance.
The Sample Covariance can be calculated by =COVAR(data)*n/(n-1)
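
A small worked example of these formulas. The data are hypothetical, chosen only to keep
the arithmetic short: n = 3 sampling units with x1 = 1, 2, 3 and x2 = 2, 4, 9.
    \bar{x}_1 = (1 + 2 + 3)/3 = 2          \bar{x}_2 = (2 + 4 + 9)/3 = 5
    s_1^2 = ((-1)^2 + 0^2 + 1^2)/(3-1) = 2/2 = 1
    s_2^2 = ((-3)^2 + (-1)^2 + 4^2)/(3-1) = 26/2 = 13
    s_{12} = ((-1)(-3) + (0)(-1) + (1)(4))/(3-1) = 7/2 = 3.5
The population covariance divides the same sum by n instead: 7/3 = 2.3333. Multiplying by
n/(n-1) = 3/2 recovers the sample covariance, 2.3333 * 1.5 = 3.5, which is the adjustment
described for COVAR.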
Measurement                             Statistic
Original Units                          The Mean is a measure of the location of the middle of a distribution.
(Original Units)^2                      The Variance is a measure of the dispersion or spread for a distribution.
Original Units                          The Standard Deviation is a measure of the dispersion or spread for a distribution.
Product of two sets of Original Units   The Covariance is a measure of the linear relationship between two variables.
Unit Free                               The Correlation is a unit free measure of the linear relationship between two variables.

Sample Correlation between the jth and kth variables

    r_{jk} = \frac{s_{jk}}{s_j \cdot s_k} = \frac{COV(x_j, x_k)}{Std.Dev.(x_j) \cdot Std.Dev.(x_k)}
           = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_{ij} - \bar{x}_j}{s_j} \right) \left( \frac{x_{ik} - \bar{x}_k}{s_k} \right)
Data matrices for values of observations of m variables (m columns) for n sampling units (n rows).
Original Data matrix X   x1= variable 1         x2= variable 2    x3= variable 3              …   xm= variable m
Sampling unit 1                    x11                 x12               x13                  …         x1m
Sampling unit 2                    x21                 x22               x23                  …         x2m
Sampling unit 3                    x31                 x32               x33                  …         x3m
…                     …                   …                 …                    …         …
Sampling unit n                    xn1                 xn2               xn3                  …         xnm
Xd is the matrix of deviation scores from the mean of each variable. It is often referred
to as the mean-corrected or mean-adjusted or centered data matrix. The mean is zero for
each variable in a mean-corrected or centered data set.

    X_d = \begin{bmatrix}
      x_{11} - \bar{x}_1 & x_{12} - \bar{x}_2 & \cdots & x_{1m} - \bar{x}_m \\
      x_{21} - \bar{x}_1 & x_{22} - \bar{x}_2 & \cdots & x_{2m} - \bar{x}_m \\
      \vdots & \vdots & \ddots & \vdots \\
      x_{n1} - \bar{x}_1 & x_{n2} - \bar{x}_2 & \cdots & x_{nm} - \bar{x}_m
    \end{bmatrix}

Xs is the matrix of standardized scores for each variable. The mean is zero and the
standard deviation is one for each variable in the standardized data set.

    X_s = \begin{bmatrix}
      \frac{x_{11} - \bar{x}_1}{s_1} & \frac{x_{12} - \bar{x}_2}{s_2} & \cdots & \frac{x_{1m} - \bar{x}_m}{s_m} \\
      \frac{x_{21} - \bar{x}_1}{s_1} & \frac{x_{22} - \bar{x}_2}{s_2} & \cdots & \frac{x_{2m} - \bar{x}_m}{s_m} \\
      \vdots & \vdots & \ddots & \vdots \\
      \frac{x_{n1} - \bar{x}_1}{s_1} & \frac{x_{n2} - \bar{x}_2}{s_2} & \cdots & \frac{x_{nm} - \bar{x}_m}{s_m}
    \end{bmatrix}

    \bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij} = Mean for the jth variable
    s_j^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2
    sj = standard deviation of the jth variable.
For a set of data with observation values for m variables and n sampling units one can
calculate a covariance matrix and a correlation matrix. Both are m by m matrices.

S = Covariance Matrix                    R = Correlation Matrix
      x1     x2     x3    …   xm               x1     x2     x3    …   xm
x1    s1^2   s12    s13   …   s1m        x1    1      r12    r13   …   r1m
x2    s21    s2^2   s23   …   s2m        x2    r21    1      r23   …   r2m
x3    s31    s32    s3^2  …   s3m        x3    r31    r32    1     …   r3m
…     …      …      …     …   …          …     …      …      …     …   …
xm    sm1    sm2    sm3   …   sm^2       xm    rm1    rm2    rm3   …   1

    s_{jk} = \frac{1}{n-1} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)
    r_{jk} = \frac{s_{jk}}{s_j \cdot s_k}

sj^2 = sample variance of the jth variable.
sjk = sample covariance between the jth and kth variables.
rjk = sample correlation between the jth and kth variables.

SSCP = Matrix of Sum of Squares and Cross Products
SSCP = Xd' * Xd
S = SSCP / (n-1)

    SSCP = \begin{bmatrix}
      \sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2 & \sum_{i=1}^{n} (x_{i2} - \bar{x}_2)(x_{i1} - \bar{x}_1) & \cdots & \sum_{i=1}^{n} (x_{im} - \bar{x}_m)(x_{i1} - \bar{x}_1) \\
      \sum_{i=1}^{n} (x_{i1} - \bar{x}_1)(x_{i2} - \bar{x}_2) & \sum_{i=1}^{n} (x_{i2} - \bar{x}_2)^2 & \cdots & \sum_{i=1}^{n} (x_{im} - \bar{x}_m)(x_{i2} - \bar{x}_2) \\
      \vdots & \vdots & \ddots & \vdots \\
      \sum_{i=1}^{n} (x_{i1} - \bar{x}_1)(x_{im} - \bar{x}_m) & \sum_{i=1}^{n} (x_{i2} - \bar{x}_2)(x_{im} - \bar{x}_m) & \cdots & \sum_{i=1}^{n} (x_{im} - \bar{x}_m)^2
    \end{bmatrix}

In Excel the covariance calculated by COVAR or by Data Analysis is the Population
Covariance, which divides by n rather than by (n-1). The Sample Covariance is obtained
from COVARIANCE.S or by adjusting the COVAR or Data Analysis value: one must correct the
calculations by multiplying by n and then dividing by (n-1). Data Analysis can be found on
the Data tab in 2010 or 2007 and under tools for previous versions. The covariance in Data
Analysis calculates both the covariances and the variances using the population formula
that divides by n, meaning that all values must be corrected to obtain the sample
Covariance Matrix. The CORREL function and the correlation in Data Analysis do not need to
be corrected.
1. Total Variance of a matrix = Trace of the matrix = Sum of the diagonal elements
2. Generalized Variance of a matrix = Determinant of the matrix
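
A small worked example of S = Xd' * Xd / (n-1). The data are hypothetical, chosen only to
keep the arithmetic short: n = 3 sampling units with x1 = 1, 2, 3 and x2 = 2, 4, 9, so the
means are 2 and 5.
    Xd =   -1   -3          SSCP = Xd' * Xd =   2    7
            0   -1                              7   26
            1    4
    S = SSCP / (3-1) =   1     3.5
                         3.5   13
    Total Variance = trace of S = 1 + 13 = 14
    Generalized Variance = determinant of S = 1*13 - 3.5*3.5 = 0.75
    r12 = 3.5 / (sqrt(1) * sqrt(13)) = 0.9707 (rounded)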
A covariance or correlation matrix, S, can be used to determine eigenvalues (or Characteristic Roots).
An eigenvalue is a value λ that is a solution to the equation |S - λ*I|=0.
For a matrix A, |A| denotes the determinant of the matrix A.
For a value λ there is a vector V (m by 1) that satisfies the matrix equation S*V = λ*V.
This vector V is an eigenvector. (For each eigenvalue there is an associated eigenvector.)

Important properties:
For an m by m matrix S, there are m eigenvalues.
The sum of these m eigenvalues is equal to the total variance of S, the trace of S.
The product of these m eigenvalues is equal to the generalized variance of S, the determinant of S.

S = \begin{bmatrix} 9 & 4 \\ 4 & 3 \end{bmatrix}, Solve \begin{vmatrix} 9-\lambda & 4 \\ 4 & 3-\lambda \end{vmatrix} = 0

(9 - \lambda)(3 - \lambda) - 4 \cdot 4 = 0
27 - 12\lambda + \lambda^2 - 16 = 11 - 12\lambda + \lambda^2 = (11 - \lambda)(1 - \lambda) = 0
11 - λ = 0, or 1 - λ = 0
λ = 11, or λ = 1
See the next tab for eigenvectors

http://en.wikipedia.org/wiki/Eigenvectors
The German word eigen was first used in this context by Hilbert in 1904 (there was an earlier
related usage by Helmholtz). "Eigen" can be translated as "own", "peculiar to",
"characteristic" or "individual" emphasizing how important eigenvalues are to defining the
unique nature of a specific transformation. In English mathematical jargon, the closest
translation would be "characteristic"; and some older references do use expressions like
"characteristic value" and "characteristic vector", or even "Eigenwert", German for
eigenvalue. In the past, the standard translation used to be "proper". Today the more
distinctive term "eigenvalue" is standard.
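
For the 2 by 2 example S = [9 4; 4 3] with eigenvalues 11 and 1, the two important
properties listed earlier can be checked directly (a short worked check, not part of the
original worksheet):
    Sum of eigenvalues     = 11 + 1 = 12 = 9 + 3       = trace of S (total variance)
    Product of eigenvalues = 11 * 1 = 11 = 9*3 - 4*4   = determinant of S (generalized variance)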
For each eigenvalue λ there is an associated eigenvector V (m by 1) that satisfies the
equation S*V = λ*V.

For S = \begin{bmatrix} 9 & 4 \\ 4 & 3 \end{bmatrix}, λ = 11 & λ = 1

For λ = 11
    \begin{bmatrix} 9 & 4 \\ 4 & 3 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = 11 \cdot \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
This yields 2 equations with 2 unknowns
    9*v1 + 4*v2 = 11*v1
    4*v1 + 3*v2 = 11*v2
or
    -2*v1 + 4*v2 = 0
     4*v1 - 8*v2 = 0
There is no unique solution to the above system
    v1 = 2*v2

For λ = 1
    \begin{bmatrix} 9 & 4 \\ 4 & 3 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = 1 \cdot \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
This yields 2 equations with 2 unknowns
    9*v1 + 4*v2 = 1*v1
    4*v1 + 3*v2 = 1*v2
or
    8*v1 + 4*v2 = 0
    4*v1 + 2*v2 = 0
There is no unique solution to the above system
    v2 = -2*v1

Any two values of v1 and v2 satisfying the equation define an eigenvector. An
eigenvector is a line or direction in the data space. A general procedure is to give a
vector of length one as the eigenvector.
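
As an illustration of giving a vector of length one, here is a worked sketch for
S = [9 4; 4 3]. The particular starting values v2 = 1 and v1 = 1 are an arbitrary choice
made here:
    λ = 11: v1 = 2*v2, take [2, 1]'.   Length = sqrt(2^2 + 1^2) = sqrt(5) = 2.236068
            Normalized eigenvector = [0.894427, 0.447214]'
            Check: S*[2, 1]' = [9*2 + 4*1, 4*2 + 3*1]' = [22, 11]' = 11*[2, 1]'
    λ = 1:  v2 = -2*v1, take [1, -2]'. Length = sqrt(1^2 + (-2)^2) = sqrt(5) = 2.236068
            Normalized eigenvector = [0.447214, -0.894427]'
            Check: S*[1, -2]' = [9*1 - 4*2, 4*1 - 3*2]' = [1, -2]' = 1*[1, -2]'
The two eigenvectors are perpendicular: 2*1 + 1*(-2) = 0.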
Find the eigenvalues and eigenvectors for the covariance matrix below

Matrix
10      -2
-2       4
Determinant = 36

λ = 10.60555

Matrix - λ*I
-0.60555      -2
-2            -6.60555
Determinant = 5.44E-07

Vector        Matrix * Vector     λ * Vector     difference
-1.19338      -12.6564            -12.6564       4.01E-07
0.361325        3.83205             3.832049     1E-06
1.246876 = Length

1. To solve for λ using Excel select a cell and put in an initial value for λ.
   Use this cell to create a new matrix (Matrix - λ*I) and use MDETERM to find
   its determinant.
2. Use Solver to find a value of λ that will make the determinant = 0.
3. Select initial values for the eigenvector for the calculated eigenvalue (Vector
   above), then use MMULT to multiply the matrix times the Vector.
4. Use scalar multiplication to multiply λ times the Vector. Take the difference
   between this scalar product and the matrix product.
5. Use Solver to find the vector values that will make both differences = 0.
There will be more than one eigenvalue and associated eigenvector so you can
find the other eigenvalues by trying different starting values and for each
eigenvalue the same process can be used to find the associated eigenvector.
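
For a 2 by 2 matrix the Solver result can be checked by hand with the quadratic formula.
This is a sketch of an alternative check, not the worksheet's method. For the matrix
[10 -2; -2 4]:
    |Matrix - λ*I| = (10 - λ)*(4 - λ) - (-2)*(-2) = λ^2 - 14*λ + 36 = 0
    λ = (14 ± sqrt(14^2 - 4*36))/2 = 7 ± sqrt(13)
    λ = 10.60555 or λ = 3.39445 (rounded)
The first value matches the Solver result. The two values add to 14, the trace of the
matrix, and multiply to 36, its determinant.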
Data can be transformed from the original measurement to another variable.
A new variable y1 can be defined as a linear transformation of the variable x1 with y1 = a1*x1 + b1.
A new variable y2 can be defined as a linear transformation of the variable x2 with y2 = a2*x2 + b2.
The coefficients a1 and a2 must not be zero.

    y_1 = a_1 x_1 + b_1 \ (a_1 \neq 0) \quad \& \quad y_2 = a_2 x_2 + b_2 \ (a_2 \neq 0)
    VAR(y_1) = s_{y_1}^2 = a_1^2 \cdot VAR(x_1) \quad or \quad s_{y_1} = |a_1| \cdot s_{x_1}
    VAR(y_2) = s_{y_2}^2 = a_2^2 \cdot VAR(x_2) \quad or \quad s_{y_2} = |a_2| \cdot s_{x_2}
    COV(y_1, y_2) = s_{y_1 y_2} = a_1 \cdot a_2 \cdot COV(x_1, x_2) = a_1 \cdot a_2 \cdot s_{x_1 x_2}
    r_{y_1, y_2} = r_{x_1, x_2}
The correlation is unchanged when a1 and a2 have the same sign; its sign reverses when
their signs differ.

    X = [x_1 \ x_2], \quad A = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \quad B = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}, \quad Y = [y_1 \ y_2], \quad Y = X \cdot A + B
    S_X = \begin{bmatrix} s_{x_1}^2 & s_{x_1 x_2} \\ s_{x_1 x_2} & s_{x_2}^2 \end{bmatrix}, \quad S_Y = \begin{bmatrix} s_{y_1}^2 & s_{y_1 y_2} \\ s_{y_1 y_2} & s_{y_2}^2 \end{bmatrix}, \quad S_Y = A' \cdot S_X \cdot A
Let a' = [a1 a2 ... am]. A vector describes a direction in multiple dimensional space.
The Length of the vector a = ||a||
    \|a\| = \sqrt{\sum_{j=1}^{m} a_j^2}
Suppose m=3 & a' = [10 2 11]
    \|a\| = \sqrt{10^2 + 2^2 + 11^2} = \sqrt{100 + 4 + 121} = \sqrt{225} = 15
A vector is normalized if its length is 1. To normalize a vector, multiply the vector by
the inverse of its length.
[10/15 2/15 11/15] is the normalized vector for the vector a.

Euclidean Distance between the two points a & b = ||a - b||
    \|a - b\| = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + ... + (a_m - b_m)^2}
Suppose m=3 & b' = [8 3 13]
    \|a - b\| = \sqrt{(10-8)^2 + (2-3)^2 + (11-13)^2} = \sqrt{4 + 1 + 4} = \sqrt{9} = 3

Two Vectors a and b are ORTHOGONAL if and only if a'•b=0

a'                  b       a'•b
10    2    11       8       229 = 10*8+2*3+11*13
                    3       Hence, a & b are not orthogonal.
                   13
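
A sketch of the same three calculations as worksheet formulas. The cell ranges are
hypothetical: assume a' is stored in B2:D2 and b' in B3:D3. SUMSQ, SUMPRODUCT and SUMXMY2
are standard Excel functions, and none of them needs Ctrl & Shift because each returns a
single number:
    =SQRT(SUMSQ(B2:D2))              Length of a = 15
    =SUMPRODUCT(B2:D2,B3:D3)         a'•b = 229
    =SQRT(SUMXMY2(B2:D2,B3:D3))      Euclidean Distance between a & b = 3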
#1 in Chapter 2 of the 2002 edition of the Stevens multivariate text

A              B          C              D          E               X
2   4   1      1   2      1   3   5      4   2       1  -1   2      1   2
3  -2   5      2   1      6   2   1      2   6      -1   3   1      3   1
               3   4                                 2   1  10      4   6
                                                                    5   7
u'         v
1   3      2
           7
Find
a) A + C
b) A + B
c) A*B
d) A*C
e) u' * D * u
f) u' * v
g) (A + C)'
h) 3 * C
i) Determinant of D
j) Inverse of D = D-1
k) Determinant of E
l) Inverse of E
m) u' * D-1 * u
n) B*A (compare result to c above)
o) X' * X
Find the eigenvalues & eigenvectors for each covariance matrix.

a.       4            -4                      1 = 12        eigenvector = [v1, v2]', where -2*v1 = v2
-4            10                      2 = 2         eigenvector = [v1, v2]', where v1 =2*v2
Product = 24               24 = Generalized Variance

b.      21            12                      1 = 29        eigenvector = [v1, v2]', where 2*v1 = 3*v2
12            11                      2 = 3         eigenvector = [v1, v2]', where -3*v1 =2*v2
Product = 87               87 = Generalized Variance

c.       1     -0.632456                      1 = 1.632456 eigenvector = [v1, v2]', where -v1 = v2
-0.632456     1                          2 = 0.367544 eigenvector = [v1, v2]', where v1 =v2
Product = 0.6            0.6 = Generalized Variance

Find the distance between these two points in 3-dimensional space
Variable 1 Variable 2 Variable 3
Point 1          1          2          3
2.236068 =SQRT((1-3)^2+(2-2)^2+(3-4)^2)
Point 2          3          2          4

= 12
Matrix                       Matrix - *I
10            -4                 -2         -4
-4             4                 -4         -8
Determinant = 0

difference
Vector           Matrix * Vector               * Vector
2                       24                    24                   0
-1                      -12                   -12                   0

2.236068 = Length
X1       X2        X3
10       1         7
5        2         7
7        4         1
0        4         9
2        5         0
6        2         6

Use the last six digits of your student number for X3.
The values in X3 above are for someone with 771906 as the last 6 digits of their student number.

a. Find the covariance matrix for the variables X1 & X2
b. Find the correlation matrix for the variables X1 & X2
c. Find the mean corrected or centered values for the variables X1 & X2
d. Find the covariance matrix for the mean corrected or centered values for the variables X1 & X2
e. Find the correlation matrix for the mean corrected or centered values for the variables X1 & X2
f. Find the standardized values for the variables X1 & X2
g. Find the covariance matrix for the standardized values for the variables X1 & X2
h. Find the correlation matrix for the standardized values for the variables X1 & X2
i. Find the eigenvalues & an eigenvector for each eigenvalue for the covariance matrix for the variables X1 & X2
j. Find the eigenvalues & an eigenvector for each eigenvalue for the correlation matrix for the variables X1 & X2
k. Find the covariance matrix for the variables X1, X2 & X3
l. Find the correlation matrix for the variables X1, X2 & X3
last 6 digits = 771906
Xd = mean corrected (centered) values, Z = standardized values

digit        X1        X2        X3        Xd1       Xd2       Xd3       Z1        Z2        Z3
1            10        1         7         5         -2        2         1.39754   -1.291    0.55048
2            5         2         7         0         -1        2         0         -0.6455   0.55048
3            7         4         1         2         1         -4        0.55902   0.6455    -1.101
4            0         4         9         -5        1         4         -1.3975   0.6455    1.10096
5            2         5         0         -3        2         -5        -0.8385   1.29099   -1.3762
6            6         2         6         1         -1        1         0.27951   -0.6455   0.27524
Mean         5         3         5         0         0         0         0         0         0
Variance     12.8      2.4       13.2      12.8      2.4       13.2      1         1         1
Std. Dev.    3.57771   1.54919   3.63318   3.57771   1.54919   3.63318   1         1         1

Cross products of the mean corrected values
digit        Xd1*Xd2   Xd1*Xd3   Xd2*Xd3
1            -10       10        -4
2            0         0         -2
3            2         -8        -4
4            -5        -20       4
5            -6        15        -10
6            -1        1         -1
Sum          -20       -2        -17
Cov          -4        -0.4      -3.4

Covariance Matrix (original data)
12.8      -4        -0.4
-4        2.4       -3.4
-0.4      -3.4      13.2

Covariance Matrix (mean corrected data)
12.8      -4        -0.4
-4        2.4       -3.4
-0.4      -3.4      13.2

Covariance Matrix (standardized data)
1         -0.7217   -0.0308
-0.7217   1         -0.6041
-0.0308   -0.6041   1

Covariance Matrix of the Standardized Data is the Correlation Matrix of the Original Data.

Correlation Matrix (original data)
1         -0.72169  -0.03077
-0.72169  1         -0.60407
-0.03077  -0.60407  1

Correlation Matrix (mean corrected data)
1         -0.7217   -0.0308
-0.7217   1         -0.6041
-0.0308   -0.6041   1

Correlation Matrix (standardized data)
1         -0.7217   -0.0308
-0.7217   1         -0.6041
-0.0308   -0.6041   1

Correlation matrix, lower triangle
          X1        X2        X3
X1        1
X2        -0.72169  1
X3        -0.03077  -0.60407  1

Unlabeled block from the worksheet (each entry is 1.2 times the corresponding correlation)
1.2
-0.86603  1.2
-0.03693  -0.72488  1.2
Covariance Matrix of X1 & X2
12.8      -4
-4        2.4
Trace = 15.2 = Total Variance
Determinant = 14.72 = Generalized Variance

λ = 2
Matrix - λ*I
10.8      -4
-4        0.4
Determinant = -11.68

Eigenvalue 1 = 14.16049                 15.2 = sum of eigenvalues
Eigenvalue 2 = 1.039512                14.72 = product of eigenvalues
Eigenvalues found by varying starting value & using solver

Correlation Matrix of X1 & X2
1         -0.72169
-0.72169  1
Trace = 2 = Total Variance
Determinant = 0.479167 = Generalized Variance

λ = 0.5
Matrix - λ*I
0.5       -0.72169
-0.72169  0.5
Determinant = -0.27083

Eigenvalue 1 = 1.721688             2.000001 = sum of eigenvalues
Eigenvalue 2 = 0.278313             0.479167 = product of eigenvalues
Eigenvalues found by varying starting value & using solver
Singular Value Decomposition
X = data matrix
W = orthogonal rotation, where W' = W^-1
D^-1 = diagonal stretching/shrinking matrix
Zs = Matrix of standardized data

Zs = X•W•D^-1      then   Zs•D = X•W•(D^-1•D) = X•W•I = X•W

Zs•D = X•W         then   Zs•D•W' = X•W•W' = X

Data Matrix = (Standardized Data)*(Scaling Matrix)*(Orthogonal Rotation Matrix)

```
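The X1, X2 & X3 exercise above can also be checked outside Excel. What follows is a minimal sketch in plain Python, not the worksheet's own method: the helper names (`mean`, `cov`, `corr`, `standardize`, `eig_sym_2x2`) are made up for this illustration, the divisor n - 1 is assumed because it reproduces the worksheet's variances, and the eigenvalues come from the closed-form quadratic solution for a symmetric 2 by 2 matrix rather than from Solver.

```python
import math

# Sample data from the worksheet (X3 holds the digits 7 7 1 9 0 6).
X1 = [10, 5, 7, 0, 2, 6]
X2 = [1, 2, 4, 4, 5, 2]
X3 = [7, 7, 1, 9, 0, 6]


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance with divisor n - 1 (assumed; it matches the worksheet)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def corr(xs, ys):
    """Correlation = covariance divided by the product of standard deviations."""
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))


def standardize(xs):
    """Subtract the mean, then divide by the sample standard deviation."""
    m, s = mean(xs), math.sqrt(cov(xs, xs))
    return [(x - m) / s for x in xs]


def eig_sym_2x2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], largest first."""
    half_trace = (a + d) / 2
    root = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return half_trace + root, half_trace - root


variables = [X1, X2, X3]
cov_matrix = [[cov(p, q) for q in variables] for p in variables]
corr_matrix = [[corr(p, q) for q in variables] for p in variables]

# Covariance matrix of the standardized data (should equal corr_matrix).
Z = [standardize(v) for v in variables]
cov_of_z = [[cov(p, q) for q in Z] for p in Z]

# Eigenvalues for the X1 & X2 covariance and correlation matrices.
cov_eigs = eig_sym_2x2(cov_matrix[0][0], cov_matrix[0][1], cov_matrix[1][1])
corr_eigs = eig_sym_2x2(1.0, corr_matrix[0][1], 1.0)

for name, matrix in [("Covariance", cov_matrix), ("Correlation", corr_matrix)]:
    print(name)
    for row in matrix:
        print("  ".join(f"{value:9.5f}" for value in row))

print("Covariance eigenvalues: ", cov_eigs)
print("Correlation eigenvalues:", corr_eigs)
```

The sum of each pair of eigenvalues should match the trace (total variance) and the product should match the determinant (generalized variance), which gives a quick check on whatever Solver returns.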