matrices and data by FB80043


									Matrices are essentially boxes filled with numbers organized in rows and columns.
Hence matrices are easily represented on an Excel worksheet.
Capital Letters such as A, B, C, D, etc. are generally used as names for matrices.
Most statistical notation uses x, y & z to denote sample data; similarly X, Y & Z will denote a matrix of sample data
with each column representing a variable and each row representing a sampled unit.

The size of a matrix is determined by the number of rows and number of columns.
In reporting the size of a matrix,
the number of rows is given 1st and the number of columns is given 2nd.
2 by 3 denotes a matrix with 2 rows and 3 columns.
5 by 4 denotes a matrix with 5 rows and 4 columns.
3 by 1 denotes a matrix with 3 rows and 1 column and is called a column matrix or column vector.
1 by 5 denotes a matrix with 1 row and 5 columns and is called a row matrix or row vector.

If the (number of rows) = (number of columns), then the matrix is considered to be a Square Matrix.


I will place a box around a set of cells to denote a matrix in Excel.
Standard mathematical notation is to put numbers in brackets.

Matrix A   Col. 1   Col. 2   Col. 3   Col. 4
  Row 1      -1       -2        3        4
  Row 2      11       12       13       14
  Row 3      24       23       21       20
                  3 by 4
          matrix of sample data

$$A = \begin{bmatrix} -1 & -2 & 3 & 4 \\ 11 & 12 & 13 & 14 \\ 24 & 23 & 21 & 20 \end{bmatrix}$$




 Special Matrices
 Column Matrix - Column Vector - Matrix with one column.
 Row Matrix - Row Vector - Matrix with one row.
 Square Matrix - Matrix with (number of rows) = (number of columns).
 Zero Matrix - Matrix with only zeroes for all number values or entries.
 Matrix Transpose - A matrix created from another matrix by reversing rows and columns.
 Transpose notation - the transpose of A can be denoted as A^T or A'.
 Symmetric Matrix - A square matrix such that A = A', that is aij = aji for all values of i & j.
 Diagonal Matrix - A symmetric matrix with all values off of the main diagonal = 0; aij=0 for i≠j.
 Identity Matrix - A diagonal matrix with all diagonal values = 1; aij=1 for i=j & aij=0 for i≠j.



Matrix A   Col. 1   Col. 2   Col. 3   Col. 4
  Row 1      -1       -2        3        4
  Row 2      11       12       13       14
  Row 3      24       23       21       20
                  3 by 4

A Transpose
A^T or A'  Col. 1   Col. 2   Col. 3
  Row 1      -1       11       24
  Row 2      -2       12       23
  Row 3       3       13       21
  Row 4       4       14       20
                  4 by 3
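
As a cross-check outside of Excel, the transpose above can be sketched in a few lines of Python. This is only an illustration; the helper name `transpose` and the list-of-rows layout are assumptions made for the example and are not part of the workbook.

```python
# Matrix A from the example above, stored as a list of rows (3 by 4).
A = [
    [-1, -2, 3, 4],
    [11, 12, 13, 14],
    [24, 23, 21, 20],
]


def transpose(matrix):
    """Return a new matrix whose rows are the columns of `matrix`."""
    return [list(column) for column in zip(*matrix)]


A_t = transpose(A)

print(len(A), "by", len(A[0]))      # size of A
print(len(A_t), "by", len(A_t[0]))  # size of A'
for row in A_t:
    print(row)
```

Transposing twice returns the original matrix, which is a quick way to check the helper.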

To perform the operations of addition or subtraction for matrices the size
of the matrices must be the same. Number of rows must be the same for
both matrices and the number of columns must also be the same. The result
is a matrix of the same exact size. The result is determined by performing the
appropriate operation (addition or subtraction) on all corresponding elements
or cells.


               A                           B                        A+B
     -1   -2        3    4        9    8        7    5        8     6    10    9
     11   12       13   14   +   -1    0       -2    0   =   10    12    11   14
     24   23       21   20        0   -3        1   -5       24    20    22   15
           3 by 4                     3 by 4                        3 by 4

               A                           B                         A-B
     -1   -2        3    4        9    8        7    5       -10   -10   -4   -1
     11   12       13   14   -   -1    0       -2    0   =    12    12   15   14
     24   23       21   20        0   -3        1   -5        24    26   20   25
           3 by 4                     3 by 4                        3 by 4
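
The element-by-element rule can also be sketched in Python using the same A and B as above. The function names `add` and `subtract` are hypothetical helpers chosen for this illustration; the size check mirrors the requirement that both matrices have the same number of rows and columns.

```python
A = [[-1, -2, 3, 4], [11, 12, 13, 14], [24, 23, 21, 20]]
B = [[9, 8, 7, 5], [-1, 0, -2, 0], [0, -3, 1, -5]]


def check_same_size(X, Y):
    """Raise an error unless X and Y have identical numbers of rows and columns."""
    if len(X) != len(Y) or any(len(rx) != len(ry) for rx, ry in zip(X, Y)):
        raise ValueError("matrices must be the same size")


def add(X, Y):
    """Add corresponding elements of two same-size matrices."""
    check_same_size(X, Y)
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def subtract(X, Y):
    """Subtract corresponding elements of two same-size matrices."""
    check_same_size(X, Y)
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


A_plus_B = add(A, B)
A_minus_B = subtract(A, B)
for row in A_plus_B:
    print(row)
for row in A_minus_B:
    print(row)
```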
Excel has several functions that perform operations on matrices, and some can produce a
result that is a matrix displayed in an array of cells.

To use one of the functions that places a matrix result in an array of cells one MUST
hold down the Ctrl & Shift keys simultaneously while clicking on OK or
pressing the Enter key.
These functions appear in two categories of Excel functions.
TRANSPOSE is in the Lookup & Reference category of functions.
MMULT is in the Math & Trig category of functions.
MINVERSE is in the Math & Trig category of functions.
The above functions allow for a matrix as an input and can produce matrix output.
MDETERM, in the Math & Trig category of functions, calculates the DETERMINANT of a
square matrix.
It allows for a matrix as an input and will have a single number
rather than a matrix as output.
If the value of the determinant for a covariance or correlation matrix is zero then at
least one variable in the data set is a linear combination of other variables.
         A                      B
     0      1          -1      0      1
     1      2           2      3      4
      2 by 2                 2 by 3

The # Columns for the 1st matrix and the # Rows for the 2nd matrix Must be Equal.
If the number of columns for the 1st matrix does not equal the number of rows for the
2nd matrix then multiplication of the two matrices is impossible.

                      A*B
             Col1          Col2        Col3
ROW1      0*(-1)+1*2     0*0+1*3     0*1+1*4
ROW2      1*(-1)+2*2     1*0+2*3     1*1+2*4
                Result = 2 by 3

These are calculated from the cells above:
     2      3      4
     3      6      9

This result was obtained by using the MMULT function:
     2      3      4
     3      6      9
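
The row-times-column rule shown above can be sketched in Python as follows. The function name `matmul` is a hypothetical helper for this illustration (it is not the Excel MMULT function); it refuses to multiply when the number of columns of the 1st matrix differs from the number of rows of the 2nd.

```python
def matmul(X, Y):
    """Multiply X (r by k) times Y (k by c); the result is r by c."""
    if len(X[0]) != len(Y):
        raise ValueError("columns of 1st matrix must equal rows of 2nd matrix")
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


A = [[0, 1], [1, 2]]            # 2 by 2
B = [[-1, 0, 1], [2, 3, 4]]     # 2 by 3

AB = matmul(A, B)               # 2 by 3 result
for row in AB:
    print(row)
```

Note that B*A is not defined here, because B has 3 columns while A has 2 rows.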
Element   Operation               Identity Element   Inverse Operation     Inverse Element
   a      Addition                       0           Subtraction           -a
   a      Subtraction                    0           Addition              -a
   a      Multiplication                 1           Division              a^-1 = 1/a
   a      Division                       1           Multiplication        a^-1 = 1/a

   A      Matrix Multiplication   Identity Matrix    Multiply by Inverse   Inverse Matrix = A^-1

     1    0
     0    1
 2 by 2 Identity

     1    0    0
     0    1    0
     0    0    1
 3 by 3 Identity

Steps for finding an inverse using Excel are on the next workbook.
A^-1 is used to denote the inverse of the matrix A.
|A| is used to denote the determinant of the matrix A.
To find an inverse of a matrix A:
1. A must be a square matrix.
2. The determinant of A must not be equal to 0, |A|≠0.
To find an inverse, the matrix MUST be square. If the number of rows does not equal the number
of columns, an inverse cannot be found. Also the determinant must not be zero.

                 A
           1      2       3
           0      1       0
           2     -1       4
               3 by 3

Placing the cursor on cell B13, click on the function button, go to Math & Trig and then select
MDETERM. Select the cells B8 through D10 and click on OK.

 |A| =    -2   = Determinant of A

Since the determinant of -2 is not zero, an inverse can be found. Highlight the cells B16
through D18, click on the function button, go to Math & Trig and then select MINVERSE. Select the
cells B8 through D10 and simultaneously hold down both the Ctrl & Shift keys while clicking on
OK. The inverse will appear in the highlighted cells B16 through D18.

                A^-1
          -2     5.5      1.5
           0      1        0
           1    -2.5     -0.5
               3 by 3

To confirm that this is the inverse, multiply the original matrix times the inverse. Highlight
the cells B22 through D24, click on the function button, go to Math & Trig and then select MMULT.
Select the cells B8 through D10 for Array1 and cells B16 through D18 for Array2, then hold down
the Ctrl & Shift keys simultaneously while clicking on OK. The resulting 3 by 3 identity matrix
provides confirmation that it is the inverse.

               A * A^-1
           1      0       0
           0      1       0
           0      0       1
               3 by 3


         Determinant of a 2 by 2 matrix

$$B = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad \text{then } |B| = a \cdot d - b \cdot c$$
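
For readers who want to check the determinant and inverse outside of Excel, here is a minimal Python sketch. It is not how MDETERM or MINVERSE work internally; it assumes cofactor expansion for the determinant and Gauss-Jordan elimination with exact fractions for the inverse, and the function names `determinant` and `inverse` are hypothetical.

```python
from fractions import Fraction


def determinant(M):
    """Determinant of a square matrix by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    if n == 2:
        return M[0][0] * M[1][1] - M[0][1] * M[1][0]   # a*d - b*c
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * determinant(minor)
    return total


def inverse(M):
    """Inverse by Gauss-Jordan elimination on the augmented matrix [M | I]."""
    n = len(M)
    if any(len(row) != n for row in M):
        raise ValueError("matrix must be square")
    if determinant(M) == 0:
        raise ValueError("determinant is zero, so no inverse exists")
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(M)
    ]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


A = [[1, 2, 3], [0, 1, 0], [2, -1, 4]]
det_A = determinant(A)
A_inv = [[float(v) for v in row] for row in inverse(A)]
print(det_A)
for row in A_inv:
    print(row)
```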
The system of 3 equations below is converted to matrix form.
The inverse matrix is found and used to obtain the solution.
2x + 5y + 4z = 4
 x + 4y + 3z = 1
 x - 3y - 2z = 5
(Example 6, page 263 of 6th edition of Harshbarger & Reynolds)

        C
    2    5    4         x        4
    1    4    3    *    y   =    1
    1   -3   -2         z        5
      3 by 3

Determinant(C) = -1, hence an inverse can be found.
To obtain C^-1, the Matrix Inverse for C, highlight a 3 by 3 area, select the function MINVERSE,
and indicate the location of the matrix C. Hold down the Ctrl & Shift keys simultaneously while
clicking on OK.

        C^-1
   -1    2    1
   -5    8    2
    7  -11   -3

   x                     4         3
   y   =   C^-1    *     1   =    -2
   z                     5         2

To find this matrix of solution values, use the MMULT function to multiply C^-1 times the column
matrix of right hand side values. Hold down the Ctrl & Shift keys simultaneously while clicking
on OK.
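
The last step can be sketched in Python using the C and C^-1 values shown above. The helper name `mat_vec` is hypothetical; the sketch multiplies C^-1 by the right hand side values and then substitutes the result back into the original equations as a check.

```python
C = [[2, 5, 4], [1, 4, 3], [1, -3, -2]]
C_inv = [[-1, 2, 1], [-5, 8, 2], [7, -11, -3]]   # values from the worksheet above
rhs = [4, 1, 5]                                  # right hand side values


def mat_vec(M, v):
    """Multiply a matrix M by a column vector v (given as a flat list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


solution = mat_vec(C_inv, rhs)    # values of x, y, z
check = mat_vec(C, solution)      # should reproduce the right hand side
print(solution)
print(check)
```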
        A (2 by 3)              I (2 by 2)              I*A
     5      2      4            1      0           5      2      4
     6      4      7            0      1           6      4      7

        B                  A' or A transpose        I (3 by 3)
     1      3                   5      6           1      0      0
    -3      0                   2      4           0      1      0
    -3      5                   4      7           0      0      1

 Scalar Multiplication          B + A'                  A*I
        2*A
    10      4      8            6      9           5      2      4
    12      8     14           -1      4           6      4      7
                                1     12
            Data matrix for values of observations of m variables (m columns)
                               for n sampling units (n rows).
                   x1= variable 1   x2= variable 2   x3= variable 3    …     xm= variable m
Sampling unit 1           x11             x12              x13         …           x1m
Sampling unit 2           x21             x22              x23         …           x2m
Sampling unit 3           x31             x32              x33         …           x3m
Sampling unit 4           x41             x42              x43         …           x4m
       …                  …                …                …          …            …
Sampling unit n           xn1             xn2              xn3         …           xnm

 Each variable will be measured using the same unit of measure (inches, kilograms, test grade, etc.)
 for all sampling units but each variable may have a unit of measure that is different from all of the
 other variables.
 Mean Vector - a vector containing the sample mean of each variable
 Variance Vector - a vector containing the sample variance of each variable
 Standard Deviation Vector - a vector containing the sample standard deviation of each variable
Outlier
An observation with a unique combination of characteristics
identifiable as distinctly different from the other observations.
(Hair 6th text definition page 73)
An observation that is so extreme that it stands apart from the rest of
the observations; that is, it differs so greatly from the remaining
observations that it gives rise to the question whether it is from the
same population or involves measurement error.
(Pocket Dictionary of Statistics, 2002)

An Outlier may have an extremely large or small value for a single variable or it may have an
unusual combination of values as shown by the dark red point in the X-Y scatter plot.
[Figure: X-Y scatter plot of the sample data; the outlier is the dark red point.]

Reasons for an Outlier:
Procedural or measurement error that is a result of data entry error,
measurement coding or measuring the wrong thing.
A rare or extraordinary event has occurred for the phenomenon being
observed and the observation(s) come from this event.
(Rain contributing to a 100 year flood.)
The data are for an observation that may be from a different
phenomenon than the one being studied.

Should an outlier be included or excluded for an analysis?
Is it truly indicative of the phenomenon being studied?
Does its inclusion mean that the sample data are no longer truly
representative of the phenomenon?
I like to use the terms
Extreme point for a data value set apart from the rest and
Nonindicative point for a data value that is deemed to not be
representative of the phenomenon being studied.

                     X1      X2      sum      diff
       1             18      18       36         0
       2              2       2        4         0
       3             11      13       24        -2
       4              7      10       17        -3
       5             18      18       36         0
       6             19      21       40        -2
       7              6       9       15        -3
       8              0       0        0         0
       9             17      20       37        -3
      10             19      19       38         0
      11              5       7       12        -2
      12             11      14       25        -3
      13              2       3        5        -1
      14              7       8       15        -1
      15             15      19       34        -4
      16             17      18       35        -1
      17              2       3        5        -1
      18              1       1        2         0
      19             11      14       25        -3
      20             19      19       38         0
   outlier           18       4       22        14    Extreme Point that seems to be Nonindicative.
   Mean            10.7    11.4    22.14    -0.714
   Variance        49.1    52.4    189.8     13.11
   Correlation         0.87             -0.065
   Total Variance      101               202.9
x1, x2 through xm denote m variables with observations for n sampling units.

Sample Mean for the jth variable (AVERAGE function in Excel):

$$\bar{x}_j = \frac{\sum_{i=1}^{n} x_{ij}}{n}$$

Sample Variance for the jth variable (VAR is a 2007 function and VAR.S is an Excel 2010 function):

$$s_j^2 = \frac{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2}{n-1}$$

Sample Standard Deviation is the square root of the sample variance.
The Sample Standard Deviation for the jth variable is denoted by sj.

Sample Covariance between the jth and kth variables:

$$COV(x_j, x_k) = s_{jk} = \frac{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)}{n-1}$$

COVARIANCE.S(data) calculates the sample covariance in Excel 2010.
Excel 2007 or earlier does not have a function to find a sample covariance.
The COVAR function formula finds the population covariance or the maximum likelihood
estimator rather than the minimum variance unbiased estimator that is generally used to
calculate the sample covariance.
The Sample Covariance can be calculated by
=COVAR(data)*n/(n-1)
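
The formulas above can be sketched in Python as a cross-check. The data values and function names below are made up for illustration and do not come from the workbook; the last lines show the n/(n-1) adjustment that converts a population covariance (what COVAR returns) into a sample covariance.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def sample_std_dev(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))


def population_covariance(x, y):
    """Sum of cross products of deviations, divided by n."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def sample_covariance(x, y):
    """Sum of cross products of deviations, divided by n - 1."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


# Hypothetical data for two variables measured on n = 4 sampling units.
x = [2, 4, 6, 8]
y = [1, 3, 2, 6]
n = len(x)

print(mean(x), sample_variance(x), sample_std_dev(x))
print(sample_covariance(x, y))
print(population_covariance(x, y) * n / (n - 1))   # adjusted population value
```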
      Measurement              Statistic
    Original Units             The Mean is a measure of the location of the middle of a distribution.
   (Original Units)^2          The Variance is a measure of the dispersion or spread for a distribution.
    Original Units             The Standard Deviation is a measure of the dispersion or spread for a distribution.
Product of two sets of         The Covariance is a measure of the linear relationship between two variables.
    Original Units
      Unit Free                The Correlation is a unit free measure of the linear relationship between two variables.

Sample Correlation between the jth and kth variables

$$r_{jk} = \frac{s_{jk}}{s_j \cdot s_k} = \frac{COV(x_j, x_k)}{Std.Dev.(x_j) \cdot Std.Dev.(x_k)} = \frac{\sum_{i=1}^{n} \left( \frac{x_{ij} - \bar{x}_j}{s_j} \right) \left( \frac{x_{ik} - \bar{x}_k}{s_k} \right)}{n-1}$$
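
A short Python sketch can show that the two forms of the correlation formula above agree. The data are hypothetical and the function name `sample_correlation` is an assumption for this example; it returns the value computed both ways so they can be compared.

```python
import math


def sample_correlation(x, y):
    """Return r computed as s_jk / (s_j * s_k) and from standardized scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / (n - 1))
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / (n - 1))
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    r_from_covariance = s_xy / (sx * sy)
    r_from_scores = sum(
        ((a - mx) / sx) * ((b - my) / sy) for a, b in zip(x, y)
    ) / (n - 1)
    return r_from_covariance, r_from_scores


# Hypothetical data for two variables on 4 sampling units.
x = [2, 4, 6, 8]
y = [1, 3, 2, 6]

r1, r2 = sample_correlation(x, y)
print(r1, r2)
```

Because the units cancel in the ratio, rescaling either variable (for example inches to centimeters) leaves the correlation unchanged.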
     Data matrices for values of observations of m variables (m columns) for n sampling units (n rows).
Original Data matrix X   x1= variable 1   x2= variable 2   x3= variable 3    …     xm= variable m
Sampling unit 1                x11              x12              x13         …           x1m
Sampling unit 2                x21              x22              x23         …           x2m
Sampling unit 3                x31              x32              x33         …           x3m
       …                       …                …                …           …            …
Sampling unit n                xn1              xn2              xn3         …           xnm

Xd is the matrix of deviation scores from the mean of each variable. It is often referred to as
the mean-corrected or mean-adjusted or centered data matrix. The mean is zero for each variable
in a mean-corrected or centered data set.

$$X_d = \begin{bmatrix} x_{11} - \bar{x}_1 & x_{12} - \bar{x}_2 & \cdots & x_{1m} - \bar{x}_m \\ x_{21} - \bar{x}_1 & x_{22} - \bar{x}_2 & \cdots & x_{2m} - \bar{x}_m \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} - \bar{x}_1 & x_{n2} - \bar{x}_2 & \cdots & x_{nm} - \bar{x}_m \end{bmatrix}$$

Xs is the matrix of standardized scores for each variable. The mean is zero and the standard
deviation is one for each variable in the standardized data set.

$$X_s = \begin{bmatrix} \frac{x_{11} - \bar{x}_1}{s_1} & \frac{x_{12} - \bar{x}_2}{s_2} & \cdots & \frac{x_{1m} - \bar{x}_m}{s_m} \\ \frac{x_{21} - \bar{x}_1}{s_1} & \frac{x_{22} - \bar{x}_2}{s_2} & \cdots & \frac{x_{2m} - \bar{x}_m}{s_m} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{x_{n1} - \bar{x}_1}{s_1} & \frac{x_{n2} - \bar{x}_2}{s_2} & \cdots & \frac{x_{nm} - \bar{x}_m}{s_m} \end{bmatrix}$$

Mean for the jth variable:

$$\bar{x}_j = \frac{\sum_{i=1}^{n} x_{ij}}{n}$$

sj = standard deviation of the jth variable, where

$$s_j^2 = \frac{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2}{n-1}$$
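
Building Xd and Xs from a data matrix X can be sketched in Python as below. The small data matrix and the function names are hypothetical; rows are sampling units and columns are variables, as in the layout above.

```python
import math


def column_means(X):
    n = len(X)
    return [sum(col) / n for col in zip(*X)]


def column_std_devs(X):
    """Sample standard deviation (n - 1 divisor) of each column."""
    n = len(X)
    means = column_means(X)
    return [
        math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
        for col, m in zip(zip(*X), means)
    ]


def center(X):
    """Xd: subtract each column mean from the values in that column."""
    means = column_means(X)
    return [[v - m for v, m in zip(row, means)] for row in X]


def standardize(X):
    """Xs: divide each deviation score by its column standard deviation."""
    means = column_means(X)
    sds = column_std_devs(X)
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in X]


# Hypothetical data: 4 sampling units (rows), 2 variables (columns).
X = [[2, 1], [4, 3], [6, 2], [8, 6]]
Xd = center(X)
Xs = standardize(X)
for row in Xd:
    print(row)
for row in Xs:
    print(row)
```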
For a set of data with observation values for m variables and n sampling units one can
calculate a covariance matrix and a correlation matrix. Both are m by m matrices.

     S = Covariance Matrix                      R = Correlation Matrix
        x1     x2     x3    …    xm                x1     x2     x3    …    xm
 x1    s1^2   s12    s13    …    s1m         x1     1     r12    r13    …    r1m
 x2    s21    s2^2   s23    …    s2m         x2    r21     1     r23    …    r2m
 x3    s31    s32    s3^2   …    s3m         x3    r31    r32     1     …    r3m
 …     …      …      …      …    …           …     …      …      …      …    …
 xm    sm1    sm2    sm3    …    sm^2        xm    rm1    rm2    rm3    …     1

sj^2 = sample variance of the jth variable.
sjk = sample covariance between the jth and kth variables.
rjk = sample correlation between the jth and kth variables.

$$s_{jk} = \frac{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)}{n-1}, \qquad r_{jk} = \frac{s_{jk}}{s_j \cdot s_k}$$

SSCP = Matrix of Sum of Squares and Cross Products
SSCP = Xd' * Xd

$$SSCP = X_d' X_d, \qquad S = \frac{SSCP}{n-1}$$

$$SSCP = \begin{bmatrix} \sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2 & \sum_{i=1}^{n} (x_{i2} - \bar{x}_2)(x_{i1} - \bar{x}_1) & \cdots & \sum_{i=1}^{n} (x_{im} - \bar{x}_m)(x_{i1} - \bar{x}_1) \\ \sum_{i=1}^{n} (x_{i1} - \bar{x}_1)(x_{i2} - \bar{x}_2) & \sum_{i=1}^{n} (x_{i2} - \bar{x}_2)^2 & \cdots & \sum_{i=1}^{n} (x_{im} - \bar{x}_m)(x_{i2} - \bar{x}_2) \\ \vdots & \vdots & \ddots & \vdots \\ \sum_{i=1}^{n} (x_{i1} - \bar{x}_1)(x_{im} - \bar{x}_m) & \sum_{i=1}^{n} (x_{i2} - \bar{x}_2)(x_{im} - \bar{x}_m) & \cdots & \sum_{i=1}^{n} (x_{im} - \bar{x}_m)^2 \end{bmatrix}$$

In Excel the sample covariance can be calculated by COVARIANCE.S or by adjusting the COVAR or
Data Analysis value, which is the Population Covariance that divides by n rather than by (n-1).
Hence to get the Sample Covariance from those values one must correct the calculations by
multiplying by n and then dividing by (n-1). Data Analysis can be found on the Data tab in 2010
or 2007 and under tools for previous versions. The covariance in Data Analysis calculates both
the covariances and the variances using the population formula that divides by n, meaning that
all values must be corrected to obtain the sample Covariance Matrix. The CORREL function and the
correlation in Data Analysis do not need to be corrected.
1. Total Variance of a matrix = Trace of the matrix = Sum of the diagonal elements
2. Generalized Variance of a matrix = Determinant of the matrix
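
The route from the data matrix to S and R can be sketched in Python as follows. The data and the function name `covariance_and_correlation` are hypothetical; the sketch forms Xd, computes SSCP = Xd' * Xd, divides by (n-1) to get S, and rescales S to get R. The generalized variance line uses the 2 by 2 determinant formula, so it only applies to this two-variable example.

```python
import math


def covariance_and_correlation(X):
    """Return (SSCP, S, R) for a data matrix X with rows = sampling units."""
    n, m = len(X), len(X[0])
    means = [sum(col) / n for col in zip(*X)]
    Xd = [[v - mu for v, mu in zip(row, means)] for row in X]
    # SSCP[j][k] = sum over sampling units of deviation_j * deviation_k.
    sscp = [[sum(row[j] * row[k] for row in Xd) for k in range(m)] for j in range(m)]
    S = [[v / (n - 1) for v in sscp_row] for sscp_row in sscp]
    sd = [math.sqrt(S[j][j]) for j in range(m)]
    R = [[S[j][k] / (sd[j] * sd[k]) for k in range(m)] for j in range(m)]
    return sscp, S, R


# Hypothetical data: 4 sampling units (rows), 2 variables (columns).
X = [[2, 1], [4, 3], [6, 2], [8, 6]]
sscp, S, R = covariance_and_correlation(X)

total_variance = sum(S[j][j] for j in range(len(S)))            # trace of S
generalized_variance = S[0][0] * S[1][1] - S[0][1] * S[1][0]    # determinant of S
print(sscp)
print(S)
print(R)
print(total_variance, generalized_variance)
```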
A covariance or correlation matrix, S, can be used to determine eigenvalues (or Characteristic Roots).
An eigenvalue is a value λ that is a solution to the equation |S - λ*I|=0.
For a matrix A, |A| denotes the determinant of the matrix A.
For a value λ there is a vector V (m by 1) that satisfies the matrix equation S*V = λ*V.
This vector V is an eigenvector. (For each eigenvalue there is an associated eigenvector.)

Important properties:
For an m by m matrix S, there are m eigenvalues.
The sum of these m eigenvalues is equal to the total variance of S, the trace of S.
The product of these m eigenvalues is equal to the generalized variance of S, the determinant of S.

$$S = \begin{bmatrix} 9 & 4 \\ 4 & 3 \end{bmatrix}, \quad \text{Solve } \begin{vmatrix} 9 - \lambda & 4 \\ 4 & 3 - \lambda \end{vmatrix} = 0$$

$$(9 - \lambda)(3 - \lambda) - 4 \cdot 4 = 0$$

$$27 - 12\lambda + \lambda^2 - 16 = 11 - 12\lambda + \lambda^2 = (11 - \lambda)(1 - \lambda) = 0$$

$$11 - \lambda = 0, \text{ or } 1 - \lambda = 0$$

$$\lambda = 11, \text{ or } \lambda = 1$$

For λ = 11

$$\begin{bmatrix} 9 & 4 \\ 4 & 3 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = 11 \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$$

This yields 2 equations with 2 unknowns

$$9 v_1 + 4 v_2 = 11 v_1$$
$$4 v_1 + 3 v_2 = 11 v_2$$

or

$$-2 v_1 + 4 v_2 = 0$$
$$4 v_1 - 8 v_2 = 0$$

See the next tab for eigenvectors

The following was copied on 1-30-06 from Wikipedia, the free encyclopedia
http://en.wikipedia.org/wiki/Eigenvectors
The German word eigen was first used in this context by Hilbert in 1904 (there was an earlier
related usage by Helmholtz). "Eigen" can be translated as "own", "peculiar to",
"characteristic" or "individual" emphasizing how important eigenvalues are to defining the
unique nature of a specific transformation. In English mathematical jargon, the closest
translation would be "characteristic"; and some older references do use expressions like
"characteristic value" and "characteristic vector", or even "Eigenwert", German for
eigenvalue. In the past, the standard translation used to be "proper". Today the more
distinctive term "eigenvalue" is standard.
For each eigenvalue λ there is an associated eigenvector V (m by 1) that satisfies the
equation S*V = λ*V.

$$\text{For } S = \begin{bmatrix} 9 & 4 \\ 4 & 3 \end{bmatrix}, \quad \lambda = 11 \ \& \ \lambda = 1$$

For λ = 11

$$\begin{bmatrix} 9 & 4 \\ 4 & 3 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = 11 \times \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$$

This yields 2 equations with 2 unknowns

$$9 v_1 + 4 v_2 = 11 v_1$$
$$4 v_1 + 3 v_2 = 11 v_2$$

or

$$-2 v_1 + 4 v_2 = 0$$
$$4 v_1 - 8 v_2 = 0$$

There is no unique solution to the above system: both equations reduce to

$$v_1 = 2 v_2$$

For λ = 1

$$\begin{bmatrix} 9 & 4 \\ 4 & 3 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = 1 \times \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$$

This yields 2 equations with 2 unknowns

$$9 v_1 + 4 v_2 = 1 v_1$$
$$4 v_1 + 3 v_2 = 1 v_2$$

or

$$8 v_1 + 4 v_2 = 0$$
$$4 v_1 + 2 v_2 = 0$$

There is no unique solution to the above system: both equations reduce to

$$v_2 = -2 v_1$$

Any two values of v1 and v2 satisfying the equation define an eigenvector. An
eigenvector is a line or direction in the data space. A general procedure is to give a vector
of length one as the eigenvector.
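
The choice of a length-one eigenvector can be sketched in a few lines. This is only an illustration under stated assumptions: S is a symmetric 2 by 2 matrix with a nonzero off-diagonal element, the eigenvalue is already known, and the helper name `unit_eigenvector` is hypothetical.

```python
import math


def unit_eigenvector(s11, s12, s22, lam):
    """Unit-length solution of S*V = lam*V for S = [[s11, s12], [s12, s22]], assuming s12 != 0."""
    if s12 == 0:
        raise ValueError("this shortcut needs a nonzero off-diagonal element")
    # The first row of (S - lam*I)*V = 0 is (s11 - lam)*v1 + s12*v2 = 0,
    # which the pair v1 = s12, v2 = lam - s11 satisfies.
    v1, v2 = s12, lam - s11
    length = math.hypot(v1, v2)
    return v1 / length, v2 / length


v_11 = unit_eigenvector(9.0, 4.0, 3.0, 11.0)  # lies on the line v1 = 2*v2
v_1 = unit_eigenvector(9.0, 4.0, 3.0, 1.0)    # lies on the line v2 = -2*v1
print(v_11, v_1)
```

Multiplying either result by -1 gives an equally valid unit eigenvector, since only the direction is determined.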
Find the eigenvalues and eigenvectors for the covariance matrix below

          Matrix
       10        -2
       -2         4
     Determinant = 36

     λ = 10.60555
          Matrix - λ*I
     -0.60555    -2
     -2          -6.60555
     Determinant = 5.44E-07

     Vector        Matrix * Vector    λ * Vector    difference
     -1.19338        -12.6564         -12.6564      4.01E-07
     0.361325         3.83205          3.832049     1E-06
     1.246876 = Length

1. To solve for λ using Excel, select a cell and put in an initial value for λ.
   Use this cell to create a new matrix (Matrix - λ*I) and use MDETERM to find
   its determinant.
2. Use Solver to find a value of λ that will make the determinant = 0.
3. Select initial values for the eigenvector for the calculated eigenvalue (the
   Vector cells), then use MMULT to multiply the matrix times the Vector.
4. Use scalar multiplication to multiply λ times the Vector. Take the difference
   between this scalar product and the matrix product.
5. Use Solver to find the vector values that will make both differences = 0.

There will be more than one eigenvalue and associated eigenvector, so you can
find the other eigenvalues by trying different starting values, and for each
eigenvalue the same process can be used to find the associated eigenvector.
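
Steps 1 and 2 can be imitated outside Excel. The sketch below is not what Solver does internally: it swaps in plain bisection as the root finder, and the function names `det_shifted` and `solve_eigenvalue` are made up for illustration. The search intervals play the role of the different starting values.

```python
def det_shifted(m, lam):
    """Determinant of (Matrix - lam*I) for a 2 by 2 matrix m, the quantity MDETERM reports."""
    (a, b), (c, d) = m
    return (a - lam) * (d - lam) - b * c


def solve_eigenvalue(m, lo, hi, tol=1e-10):
    """Bisection search for a lam in [lo, hi] where det_shifted(m, lam) = 0."""
    f_lo = det_shifted(m, lo)
    if f_lo * det_shifted(m, hi) > 0:
        raise ValueError("determinant does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = det_shifted(m, mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid    # root lies in the upper half
    return (lo + hi) / 2.0


matrix = [[10.0, -2.0], [-2.0, 4.0]]
lam_big = solve_eigenvalue(matrix, 7.0, 14.0)    # interval chosen around the larger root
lam_small = solve_eigenvalue(matrix, 0.0, 7.0)   # a different interval finds the other root
print(lam_big, lam_small)
```

The larger root agrees with the worksheet value 10.60555 to the digits shown there, and the two roots together reproduce the trace (14) and determinant (36) of the matrix.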
Data can be transformed from the original measurement to another variable.
A new variable y1 can be defined as a linear transformation of the variable x1 with y1 = a1*x1 + b1.
A new variable y2 can be defined as a linear transformation of the variable x2 with y2 = a2*x2 + b2.
          The coefficients a1 and a2 must not be zero.

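$$y_1 = a_1 \times x_1 + b_1 \ (a_1 \neq 0) \quad \& \quad y_2 = a_2 \times x_2 + b_2 \ (a_2 \neq 0)$$

$$VAR(y_1) = s_{y_1}^2 = a_1^2 \times VAR(x_1) \quad \text{or} \quad s_{y_1} = |a_1| \times s_{x_1}$$

$$VAR(y_2) = s_{y_2}^2 = a_2^2 \times VAR(x_2) \quad \text{or} \quad s_{y_2} = |a_2| \times s_{x_2}$$

$$COV(y_1, y_2) = s_{y_1 y_2} = a_1 \times a_2 \times COV(x_1, x_2) = a_1 \times a_2 \times s_{x_1 x_2}$$

$$r_{y_1, y_2} = r_{x_1, x_2}$$

The correlation is unchanged when a1 and a2 have the same sign. If they have opposite signs, the correlation keeps its size but changes sign.

In matrix form, with A taken as the 2 by 2 diagonal matrix of the coefficients and B as the row of constants added to every row of X:

$$X = \begin{bmatrix} x_1 & x_2 \end{bmatrix}, \quad A = \begin{bmatrix} a_1 & 0 \\ 0 & a_2 \end{bmatrix}, \quad B = \begin{bmatrix} b_1 & b_2 \end{bmatrix}, \quad Y = \begin{bmatrix} y_1 & y_2 \end{bmatrix}, \quad Y = X \times A + B$$

$$S_X = \begin{bmatrix} s_{x_1}^2 & s_{x_1 x_2} \\ s_{x_1 x_2} & s_{x_2}^2 \end{bmatrix}, \quad S_Y = \begin{bmatrix} s_{y_1}^2 & s_{y_1 y_2} \\ s_{y_1 y_2} & s_{y_2}^2 \end{bmatrix}, \quad S_Y = A' \times S_X \times A$$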
Let a' = [a1 a2 ... am]. A vector describes a direction in multiple dimensional space.

The Length of the vector a = ||a||

$$\|a\| = \sqrt{\sum_{j=1}^{m} a_j^2}$$

Suppose m=3 & a' = [10 2 11]

$$\|a\| = \sqrt{10^2 + 2^2 + 11^2} = \sqrt{100 + 4 + 121} = \sqrt{225} = 15$$

A vector is normalized if its length is 1. To normalize a vector, multiply the vector by the inverse of its length.
[10/15 2/15 11/15] is the normalized vector for the vector a.

Euclidean Distance between the two points a & b = ||a - b||

$$\|a - b\| = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \dots + (a_m - b_m)^2}$$

Suppose m=3 & b' = [8 3 13]

$$\|a - b\| = \sqrt{(10-8)^2 + (2-3)^2 + (11-13)^2} = \sqrt{4 + 1 + 4} = \sqrt{9} = 3$$

Two Vectors a and b are ORTHOGONAL if and only if a'•b=0

For a' = [10 2 11] and b' = [8 3 13], a'•b = 229 =10*8+2*3+11*13
Hence, a & b are not orthogonal.
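
The vector operations above translate directly into code. The following is a minimal sketch using only the Python standard library. The helper names are chosen here for illustration, and the vectors a and b are the ones from the worked examples.

```python
import math


def length(v):
    """Length ||v||: square root of the sum of squared elements."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Multiply v by the inverse of its length so the result has length 1."""
    n = length(v)
    return [x / n for x in v]


def distance(u, v):
    """Euclidean distance ||u - v|| between two points."""
    return length([p - q for p, q in zip(u, v)])


def dot(u, v):
    """Inner product u'•v; the vectors are orthogonal when this is 0."""
    return sum(p * q for p, q in zip(u, v))


a = [10, 2, 11]
b = [8, 3, 13]

print(length(a))        # length of a
print(normalize(a))     # [10/15, 2/15, 11/15]
print(distance(a, b))   # distance between a and b
print(dot(a, b))        # nonzero, so a and b are not orthogonal
```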
#1 in Chapter 2 of the 2002 edition of the Stevens multivariate text

A =   2   4   1        B =   1   2        C =   1   3   5        D =   4   2
      3  -2   5              2   1              6   2   1              2   6
                             3   4

E =   1  -1   2        X =   1   2        u' =  1   3            v =   2
     -1   3   1              3   1                                     7
      2   1  10              4   6
                             5   7

Find
a) A + C
b) A + B
c) A*B
d) A*C
e) u' * D * u
f) u' * v
g) (A + C)'
h) 3 * C
i) Determinant of D
j) Inverse of D = D^-1
k) Determinant of E
l) Inverse of E
m) u' * D^-1 * u
n) B*A (compare result to c above)
o) X' * X
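
Several of these items depend on whether the matrices are conformable. The sketch below is one possible way to check sizes before adding or multiplying. The helper names are made up for illustration, and only the sizes of the results are printed so the exercise answers are left to the reader.

```python
def shape(m):
    """(number of rows, number of columns) of a matrix stored as a list of rows."""
    return len(m), len(m[0])


def add(p, q):
    if shape(p) != shape(q):
        raise ValueError("matrices must have the same size to be added")
    return [[x + y for x, y in zip(row_p, row_q)] for row_p, row_q in zip(p, q)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    if len(p[0]) != len(q):
        raise ValueError("columns of the first matrix must equal rows of the second")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*q)] for row in p]


A = [[2, 4, 1], [3, -2, 5]]
B = [[1, 2], [2, 1], [3, 4]]

print(shape(matmul(A, B)))   # A*B is 2 by 2
print(shape(matmul(B, A)))   # B*A is 3 by 3, so A*B and B*A cannot be equal
try:
    add(A, B)
except ValueError as err:
    print("A + B:", err)
```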
Find the eigenvalues & eigenvectors for each covariance matrix.

a.       4            -4                     λ1 = 12        eigenvector = [v1, v2]', where -2*v1 = v2
        -4            10                     λ2 = 2         eigenvector = [v1, v2]', where v1 = 2*v2
                                         Product = 24       24 = Generalized Variance

b.      21            12                     λ1 = 29        eigenvector = [v1, v2]', where 2*v1 = 3*v2
        12            11                     λ2 = 3         eigenvector = [v1, v2]', where -3*v1 = 2*v2
                                         Product = 87       87 = Generalized Variance

c.       1     -0.632456                     λ1 = 1.632456  eigenvector = [v1, v2]', where -v1 = v2
     -0.632456     1                         λ2 = 0.367544  eigenvector = [v1, v2]', where v1 = v2
                                         Product = 0.6      0.6 = Generalized Variance

Find the distance between these two points in 3-dimensional space
               Variable 1 Variable 2 Variable 3
   Point 1          1          2          3
   Point 2          3          2          4

   Distance = 2.236068 =SQRT((1-3)^2+(2-2)^2+(3-4)^2)

     λ = 12
             Matrix                       Matrix - λ*I
        10            -4                 -2         -4
        -4             4                 -4         -8
                                  Determinant = 0

      Vector           Matrix * Vector            λ * Vector          difference
         2                    24                      24                   0
        -1                   -12                     -12                   0

      2.236068 = Length
   X1       X2        X3
   10       1         7
    5       2         7
    7       4         1
    0       4         9
    2       5         0
    6       2         6

Use the last six digits of your student number for X3. The values in X3 are for
someone with 771906 as the last 6 digits of their student number.


a. Find the covariance matrix for the variables X1 & X2
b. Find the correlation matrix for the variables X1 & X2
c. Find the mean corrected or centered values for the variables X1 & X2
d. Find the covariance matrix for the mean corrected or centered values for the variables X1 & X2
e. Find the correlation matrix for the mean corrected or centered values for the variables X1 & X2
f. Find the standardized values for the variables X1 & X2
g. Find the covariance matrix for the standardized values for the variables X1 & X2
h. Find the correlation matrix for the standardized values for the variables X1 & X2
i. Find the eigenvalues & an eigenvector for each eigenvalue for the covariance matrix for the variables X1 & X2
j. Find the eigenvalues & an eigenvector for each eigenvalue for the correlation matrix for the variables X1 & X2
k. Find the covariance matrix for the variables X1, X2 & X3
l. Find the correlation matrix for the variables X1, X2 & X3
last 6 digits = 771906

               X1        X2        X3     digit     Xd1       Xd2       Xd3        Z1        Z2        Z3
               10         1         7       1        5        -2         2      1.39754   -1.291    0.55048
                5         2         7       2        0        -1         2      0         -0.6455   0.55048
                7         4         1       3        2         1        -4      0.55902    0.6455  -1.101
                0         4         9       4       -5         1         4     -1.3975     0.6455   1.10096
                2         5         0       5       -3         2        -5     -0.8385     1.29099 -1.3762
                6         2         6       6        1        -1         1      0.27951   -0.6455   0.27524
Mean            5         3         5                0         0         0      0          0        0
Variance       12.8       2.4      13.2             12.8       2.4      13.2    1          1        1
Std. Dev.       3.57771   1.54919   3.63318          3.57771   1.54919   3.63318  1        1        1

             Xd1*Xd2   Xd1*Xd3   Xd2*Xd3
               -10        10        -4
                 0         0        -2
                 2        -8        -4
                -5       -20         4
                -6        15       -10
                -1         1        -1
Sum            -20        -2       -17
Cov             -4        -0.4      -3.4

Covariance Matrix (X1, X2, X3)
              12.8       -4        -0.4
              -4          2.4      -3.4
              -0.4       -3.4      13.2

Covariance Matrix (Xd1, Xd2, Xd3)
              12.8       -4        -0.4
              -4          2.4      -3.4
              -0.4       -3.4      13.2

Covariance Matrix (Z1, Z2, Z3)
               1         -0.7217   -0.0308
              -0.7217     1        -0.6041
              -0.0308    -0.6041    1

Correlation Matrix (X1, X2, X3)
               1         -0.72169  -0.03077
              -0.72169    1        -0.60407
              -0.03077   -0.60407   1

Correlation Matrix (Xd1, Xd2, Xd3)
               1         -0.7217   -0.0308
              -0.7217     1        -0.6041
              -0.0308    -0.6041    1

Correlation Matrix (Z1, Z2, Z3)
               1         -0.7217   -0.0308
              -0.7217     1        -0.6041
              -0.0308    -0.6041    1

Covariance Matrix of the Standardized Data is the Correlation Matrix of the Original Data.

               X1        X2        X3
X1              1
X2          -0.72169      1
X3          -0.03077  -0.60407      1

                1.2
            -0.86603      1.2
            -0.03693  -0.72488      1.2
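
The statement that the covariance matrix of the standardized data is the correlation matrix of the original data can be checked with a short script. This is a sketch under stated assumptions: sample formulas that divide by n - 1 (which is what the 12.8, 2.4 and 13.2 variances in the table correspond to), the X1, X2, X3 values shown above, and helper names invented here.

```python
import math


def column_means(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def covariance_matrix(rows):
    """Sample covariance matrix (divides by n - 1) of data with one row per sampled unit."""
    n, m = len(rows), len(rows[0])
    means = column_means(rows)
    return [[sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / (n - 1)
             for k in range(m)] for j in range(m)]


def correlation_matrix(rows):
    s = covariance_matrix(rows)
    m = len(s)
    return [[s[j][k] / math.sqrt(s[j][j] * s[k][k]) for k in range(m)] for j in range(m)]


def standardize(rows):
    """Subtract each column mean and divide by the column's sample standard deviation."""
    means = column_means(rows)
    s = covariance_matrix(rows)
    sds = [math.sqrt(s[j][j]) for j in range(len(s))]
    return [[(r[j] - means[j]) / sds[j] for j in range(len(r))] for r in rows]


data = [[10, 1, 7], [5, 2, 7], [7, 4, 1], [0, 4, 9], [2, 5, 0], [6, 2, 6]]

cov_x = covariance_matrix(data)
corr_x = correlation_matrix(data)
cov_z = covariance_matrix(standardize(data))

for row_corr, row_z in zip(corr_x, cov_z):
    print([round(v, 5) for v in row_corr], [round(v, 5) for v in row_z])
```

Each printed row shows the correlation of the original data next to the covariance of the standardized data, and the two agree up to floating-point rounding.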
Covariance Matrix
     12.8       -4
     -4          2.4
     Trace = 15.2 = Total Variance
     Determinant = 14.72 = Generalized Variance

     λ = 2
     Matrix - λ*I
     10.8       -4
     -4          0.4
     Determinant = -11.68

     Eigenvalue 1 = 14.16049
     Eigenvalue 2 = 1.039512
     15.2 = sum of eigenvalues
     14.72 = product of eigenvalues
     Eigenvalues found by varying starting value & using solver

Correlation Matrix
      1         -0.72169
     -0.72169    1
     Trace = 2 = Total Variance
     Determinant = 0.479167 = Generalized Variance

     λ = 0.5
     Matrix - λ*I
      0.5       -0.72169
     -0.72169    0.5
     Determinant = -0.27083

     Eigenvalue 1 = 1.721688
     Eigenvalue 2 = 0.278313
     2.000001 = sum of eigenvalues
     0.479167 = product of eigenvalues
     Eigenvalues found by varying starting value & using solver
Singular Value Decomposition
X = data matrix
W = orthogonal rotation, where W' = W^-1
D^-1 = diagonal stretching/shrinking matrix
Zs = Matrix of standardized data

Zs = X•W•D^-1       then   Zs•D = X•W•(D^-1•D) = X•W•I = X•W

Zs•D = X•W          then   Zs•D•W' = X•W•W' = X

Data Matrix = (Standardized Data)*(Scaling Matrix)*(Orthogonal Rotation Matrix)
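
The chain Zs = X•W•D^-1 and X = Zs•D•W' can be traced numerically for a two-column data matrix. The sketch below makes several assumptions that are not in the text above: W is built from the unit eigenvectors of X'X, D holds the square roots of the eigenvalues of X'X, and "standardized" is read as columns of Zs with length 1 that are orthogonal to each other. The function name `svd_two_columns` is hypothetical, and X is the pair of mean corrected columns Xd1 and Xd2 from the assignment data.

```python
import math


def svd_two_columns(x):
    """Return (zs, d, w) with x = zs * diag(d) * w' for an n by 2 matrix x.

    Assumes the two columns are not proportional and have a nonzero cross product.
    """
    # Elements of the cross-product matrix X'X = [[p, q], [q, r]]
    p = sum(row[0] * row[0] for row in x)
    q = sum(row[0] * row[1] for row in x)
    r = sum(row[1] * row[1] for row in x)
    if q == 0:
        raise ValueError("this shortcut needs a nonzero cross product")
    # Eigenvalues of X'X from its characteristic equation
    half_trace = (p + r) / 2.0
    root = math.sqrt(half_trace ** 2 - (p * r - q * q))
    eigenvalues = [half_trace + root, half_trace - root]
    # Columns of W are unit eigenvectors of X'X
    cols = []
    for lam in eigenvalues:
        v1, v2 = q, lam - p          # solves (p - lam)*v1 + q*v2 = 0
        norm = math.hypot(v1, v2)
        cols.append((v1 / norm, v2 / norm))
    w = [[cols[0][0], cols[1][0]], [cols[0][1], cols[1][1]]]
    d = [math.sqrt(lam) for lam in eigenvalues]
    # Zs = X * W * D^-1
    zs = [[(row[0] * w[0][k] + row[1] * w[1][k]) / d[k] for k in range(2)] for row in x]
    return zs, d, w


xd = [[5, -2], [0, -1], [2, 1], [-5, 1], [-3, 2], [1, -1]]
zs, d, w = svd_two_columns(xd)

# Rebuild X as Zs * D * W' and report the largest reconstruction error.
rebuilt = [[sum(z[k] * d[k] * w[j][k] for k in range(2)) for j in range(2)] for z in zs]
max_error = max(abs(rebuilt[i][j] - xd[i][j]) for i in range(len(xd)) for j in range(2))
print(d, max_error)
```

With this construction the squared diagonal elements of D, divided by n - 1 = 5, reproduce the eigenvalues of the X1, X2 covariance matrix.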

								