Matrix Tutorial
Transition Matrices Graphs Random Walks
Pádraig Cunningham University College Dublin
2
Objective
To show how some advanced mathematics has practical application in data mining / information retrieval. To show how some practical problems in data mining / information retrieval can be solved using matrix decomposition. To give you a flavour of some aspects of the course.
3
Stochastic Matrix: Markov process
In 1998 (in some state) Land use is: 30% I (Res), 20% II (Com), 50% III (Ind) Over 5 year period, the probabilities for change of use are:
From I To I To II To III 0.8 0.1 0.1 From II 0.1 0.7 0.2 From III 0 0.1 0.9
4
Stochastic Matrix: Markov process
Land Use after 5 years
26
22
52
=
0.8 0.1 0.1
0.1 0.7 0.2
0 0.1 0.9
30 20 50
v1 = Av0
similarly and so on…
v2 = A2v0
http://kinetigram.com/mck/LinearAlgebra/JPaisMatrixMult04/classes/JPaisMatrixMult04.html
5
Stochastic Matrix: Markov process
When this converges:
vn = Avn
it converges to vn an eigenvector of A corresponding to an eigenvalue 1. vn = [12.5 25 62.5]
i.e.
6
Brief Review of Eigenvectors
The eigenvectors v and eigenvalues λ of a matrix A are the ones satisfying
Avi = λivi
i.e. vi is a vector that:
Pre-multiplying
by matrix A
is the same as
Multiplying
by the corresponding eigenvalue λi
7
The important property…
Repeated application of the matrix to an arbitrary vector results in a vector proportional to the eigenvector with largest eigenvalue
http://mathworld.wolfram.com/Eigenvector.html
lim A y = λ b v
n →∞
n
n 1 1 1
What has this got to do with Random Walks?...
€
8
Transition Matrices & Random Walks
Consider a random walk over a set of linked web pages. The situation is defined by a transition (links) matrix. The eigenvector corresponding to the largest eigenvalue of the transition matrix tells us the probabilities of the walk ending on the various pages.
9
Web Pages Example
B C A E D
9
Web Pages Example
From B C A E D To
A B C D E A 1 1 0 0 1 B 0 1 1 0 1 C 0 0 1 1 1 D 0 0 0 1 1 E 1 0 1 0 1
9
Web Pages Example
From B C A E
A A 1 1 0 0 1 B C D E
B 0 1 1 0 1
C 0 0 1 1 1
D 0 0 0 1 1
E 1 0 1 0 1
To D
Eigenvector corresponding to largest Eigenvalue
0.38 0.20 0.49 0.26 0.71
9
Web Pages Example
From B C A E
A A 1 1 0 0 1 B C D E
B 0 1 1 0 1
C 0 0 1 1 1
D 0 0 0 1 1
E 1 0 1 0 1
To D
Eigenvector corresponding to largest Eigenvalue
0.38 0.20 0.49 0.26 0.71
EVD: http://kinetigram.com/mck/LinearAlgebra/JPaisEVD04/classes/
JPaisEVD04.html
10
Review of Matrix Algebra
Why matrix algebra now?
The
Google PageRank algorithm uses Eigenvectors in ranking relevant pages.
Resources
http://mathworld.wolfram.com/Eigenvector.html The Matrix Cookbook
http://www.imm.dtu.dk/pubdb/views/edoc_download.php/3274/pdf/imm3274.pdf
11
Brief Review of Eigenvectors
Eigenvectors are a special set of vectors associated with a linear system of equations (i.e., a matrix equation). Each eigenvector is paired with a corresponding so-called eigenvalue. The decomposition of a square matrix into eigenvalues and eigenvectors is known as eigen decomposition
http://mathworld.wolfram.com/Eigenvector.html
12
Matrices in JAVA - e.g. JAMA
Class EigenvalueDecomposition
Constructor Methods
EigenvalueDecomposition(Matrix Arg)
Matrix GetV() Matrix GetD()
Where A is the original matrix and:
AV=VD
13
Summary
Data describing connections between objects can be described as a graph This graph can be represented as a matrix Interesting structure can be discovered in this data using Matrix Eigen-decomposition