
Linear Algebra 2 with Maple

Denis Sevee
John Abbott College
Contents

0 Review

1 Change of Basis
   1.1 Coordinates Relative to a Basis
   1.2 Change of Basis
   1.3 Examples of Bases

2 Eigenvalues and Eigenvectors
   2.1 Eigenvectors and Eigenvalues
   2.2 Diagonalization
   2.3 Eigenvectors and Linear Transformations
   2.4 Complex Eigenvalues

3 Dynamical Systems
   3.1 The Fibonacci Sequence
   3.2 Difference Equations
   3.3 Examples of Dynamical Systems
   3.4 The Geometry of Dynamical Systems in the Plane
   3.5 Dynamical Systems with Complex Eigenvalues
   3.6 More Examples
       3.6.1 Population Growth
       3.6.2 Solving Systems of Equations

4 Inner Products and Projections
   4.1 The Standard Inner Product
   4.2 Orthogonal Sets
   4.3 Orthogonal Projections
   4.4 The Gram-Schmidt Procedure
   4.5 Least-Squares Problems
   4.6 Data Fitting
   4.7 An Experiment by Galileo

5 Inner Product Spaces
   5.1 Inner Products
   5.2 Approximations in Inner Product Spaces
   5.3 Fourier Series
   5.4 Discrete Fourier Transform

6 Symmetric Matrices
   6.1 Symmetric Matrices and Diagonalization
   6.2 Quadratic Forms
   6.3 Optimization

7 The Singular Value Decomposition
   7.1 Singular Values
   7.2 Geometry of the Singular Value Decomposition
   7.3 The Singular Value Decomposition and the Pseudoinverse
   7.4 The SVD and the Fundamental Subspaces of a Matrix
   7.5 The SVD and Statistics
   7.6 Total Least Squares

8 Calculus and Linear Algebra
   8.1 Calculus with Discrete Data
   8.2 Differential Equations and Dynamical Systems
   8.3 An Oscillating Spring
   8.4 Differential Equations and Linear Algebra

A Linear Algebra with Maple

B Complex Numbers

C Linear Transformations

D Partitioned Matrices
Preface

This book was written for students taking a second semester linear algebra course at John Abbott College. In the first semester of linear algebra students have been exposed to the theory of linear systems, which includes techniques for solving systems, basic matrix and vector operations, determinants, subspaces (of $\mathbb{R}^n$), and the concept of a basis. In this book we will look at the other two main pillars of linear algebra: the theory of eigenvectors and the theory of orthogonality.
   The first part of the book covers the theory of eigenvectors and eigenspaces. In condensed form the general thread of topics is
$$\text{Eigenvectors} \longrightarrow \text{Diagonalization} \longrightarrow \begin{cases} \text{Uncoupling in difference equations} \\ \text{Uncoupling in differential equations} \\ \text{Principal axes} \end{cases}$$

   The next portion of the book covers the theory of orthogonality. The thread of topics here may be summarized as
$$\text{Orthogonal bases} \longrightarrow \text{Projections} \longrightarrow \begin{cases} \text{Least-squares lines} \\ \text{Best approximations} \end{cases}$$

   The last part of the book deals with connections between these first two topics and leads from
symmetric matrices to the singular value decomposition.
   Along with these main topics there are several other motifs running through the text:

   • The contrast between discrete and continuous representations. A large number of
     important applications of linear algebra involve converting from continuous models to discrete
     models, and most commonly this conversion is done by sampling the continuous data at evenly
     spaced intervals. We look at many examples of this, the main one being the contrast between
     linear dynamical systems and systems of linear differential equations.
   • The interpretation of one-dimensional data as vectors and two-dimensional data
     as matrices. Vectors are usually introduced as arrows (i.e., directed line segments) in $\mathbb{R}^2$ and
     $\mathbb{R}^3$, and this geometric interpretation provides important intuitive insight into the geometry
     of vector operations. These insights can and should be carried over into more abstract vector
     spaces, but the techniques of linear algebra are frequently used in spaces of very high dimension
     where the entries in the vector (or matrix) are measurements of some quantity. In these cases
     other methods of visualizing the vectors are more appropriate.
   • Matrix-vector multiplication. What happens when you multiply a vector by a matrix?
     It’s incredible that this simple computation (which is usually presented in an introductory
     linear algebra course via the algorithm of pairing the entries in the rows of the matrix with
     the entries of the vector, multiplying the pairs, and adding the results) is so versatile. So
     many basic procedures can be interpreted as matrix-vector multiplication: finding the dot or
     cross products, solving a system, finding least-squares solutions of systems, finding projections,
     rotating vectors in space, the Gram-Schmidt procedure, and others.
       The singular value decomposition is particularly interesting in this regard as it reveals a simple
       geometric pattern behind any matrix-vector multiplication.
   • The idea of distance. It makes intuitive sense to talk about distance in $\mathbb{R}^2$ or $\mathbb{R}^3$, spaces
     which can be easily visualized. But the idea of distance can be extended to more abstract
     vector spaces: not only $\mathbb{R}^n$ for larger values of $n$ but also spaces of functions, matrices, or other
     mathematical objects. The extension of this idea to abstract vector spaces allows us to talk
     about the distance between two functions (for one example), and trying to minimize these
     distances leads us to the idea of best approximations.

   Earlier we mentioned the division of linear algebra into the theories of linear systems, of eigenspaces, and of orthogonality. There is another standard way of dividing linear algebra: theoretical linear algebra, applied linear algebra, and numerical linear algebra. This book tries to maintain a balance between these. In particular, the last two are closely linked with the use of computer technology. Most important applications of linear algebra involve the use of computers, and numerical linear algebra deals with the nuts and bolts of how computations can be performed most effectively. In this book we use the computer algebra system Maple to illustrate the topics.
Chapter 0

Review

This chapter is a summary of some results from the first semester of linear algebra that will play important roles in this course. They are not listed in any particular order; if any of the following topics are unclear to you, you should review them in a textbook or see your teacher.



 Systems of Linear Equations
   There are several basic ways of looking at a system of linear equations.
   The most obvious way is to view such a system as a collection of linear equations with a certain
number of unknowns. For example,

$$\begin{aligned} 3x_1 + 2x_2 - 9x_3 + 4x_4 &= 13 \\ 2x_1 + 3x_2 + 7x_3 - 3x_4 &= 22 \\ 5x_1 - 5x_2 + 5x_3 + 2x_4 &= 17 \end{aligned}$$

would be a system of 3 equations with 4 unknowns. This type of system can be solved by setting
up the augmented matrix and reducing it using elementary row operations. The two standard
approaches for solving a linear system by row reduction are called Gaussian elimination and
Gauss-Jordan elimination. A key result here is that any linear system has either (a) a unique
solution, (b) infinitely many solutions, or (c) no solution.

   Any system of linear equations can also be interpreted as a matrix equation in the form

                                               Ax = b
So, for example, the above system could be written as:
$$\begin{bmatrix} 3 & 2 & -9 & 4 \\ 2 & 3 & 7 & -3 \\ 5 & -5 & 5 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 13 \\ 22 \\ 17 \end{bmatrix}$$
   This type of equation represents a linear system in terms of a matrix-vector product. It says that if you take vector $\mathbf{x}$ and multiply it by matrix $A$, then you get vector $\mathbf{b}$ as a result. From this point of view, solving the system amounts to finding the vector $\mathbf{x}$ which satisfies the equation $A\mathbf{x} = \mathbf{b}$.
   The equation $A\mathbf{x} = \mathbf{b}$ means that vector $\mathbf{x}$ (in $\mathbb{R}^4$ in the above example) has somehow been transformed into vector $\mathbf{b}$ (in $\mathbb{R}^3$ in the above example) when it is multiplied by matrix $A$. This operation of multiplying a vector by a matrix is one example of the more general concept of a linear transformation.
   If A is an invertible matrix then the solution to the system Ax = b can be written as x = A−1 b.
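   As a quick illustration, here is a minimal Maple sketch of solving the example system above (LinearSolve and MatrixInverse are commands from the LinearAlgebra package used throughout the Maple labs):

         >with(LinearAlgebra):
         >A:=<<3,2,5>|<2,3,-5>|<-9,7,5>|<4,-3,2>>:  # coefficient matrix, entered column by column
         >b:=<13,22,17>:
         >LinearSolve(A,b);   # a description of all solutions; a free parameter appears
                              # because this system is underdetermined

When $A$ is square and invertible, MatrixInverse(A).b (or A^(-1).b) produces the unique solution $\mathbf{x} = A^{-1}\mathbf{b}$.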

   There is a third important way of looking at a linear system. The left side of the equation Ax = b
can be seen as a linear combination of vectors. So the above system can be seen as:
$$x_1\begin{bmatrix} 3 \\ 2 \\ 5 \end{bmatrix} + x_2\begin{bmatrix} 2 \\ 3 \\ -5 \end{bmatrix} + x_3\begin{bmatrix} -9 \\ 7 \\ 5 \end{bmatrix} + x_4\begin{bmatrix} 4 \\ -3 \\ 2 \end{bmatrix} = \begin{bmatrix} 13 \\ 22 \\ 17 \end{bmatrix}$$

    This means that when you are trying to solve the system Ax = b you are trying to find out how
vector b can be written as a linear combination of the columns of matrix A. The unknowns are
called the weights of this combination.
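   We can check this interpretation in Maple. The following sketch (with made-up weights) confirms that $A$ times a vector equals the corresponding weighted sum of the columns of $A$:

         >A:=<<3,2,5>|<2,3,-5>|<-9,7,5>|<4,-3,2>>:
         >x:=<1,1,1,1>:                   # made-up weights
         >A.x;                            # the matrix-vector product
         >add(x[i]*Column(A,i),i=1..4);   # the same vector, built as a linear
                                          # combination of the columns of A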



Classification of Systems of Equations
   There are three basic classes of linear systems, Ax = b, depending on the size of matrix A.
Suppose A is an m × n matrix, then:

    • The system is said to be overdetermined if m > n. In this case there are more equations
      than unknowns. When A is reduced there can’t be a pivot in each row although there might
      be a pivot in each column of A.

    • The system is said to be underdetermined if m < n. This means there are fewer equations
      than unknowns. When A is reduced there can’t be a pivot in each column although there
      might be a pivot in each row.

    • When m = n the system is called evendetermined. This is the one form where it is possible
      to get a pivot in each row and column when A is reduced.



Linear Independence
   Any collection of vectors can be classified as being either linearly dependent or linearly
independent. A collection of vectors is linearly dependent if one of the vectors in the collection
can be written as a linear combination of the other vectors in the set.
   An equivalent way of defining linear independence for vectors in $\mathbb{R}^n$ is the following: if $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ are vectors in $\mathbb{R}^n$ and $A = [\mathbf{v}_1\ \mathbf{v}_2\ \cdots\ \mathbf{v}_n]$, then the columns of $A$ are linearly independent if and only if $A\mathbf{x} = \mathbf{0}$ has only the trivial solution.
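   This test is easy to carry out in Maple. A minimal sketch, with made-up columns:

         >A:=<<1,2,3>|<4,5,6>|<7,8,10>>:   # the candidate vectors as columns
         >Rank(A);        # 3, a pivot in every column, so the columns are independent
         >NullSpace(A);   # the empty set { }: Ax = 0 has only the trivial solution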



Bases
    Any non-zero vector space has a basis. In fact, for any such vector space there are many possible
choices for a basis. If the dimension of the space is n (with n > 0), then any basis will consist of n
linearly independent vectors from that space. It is also true that any set of n linearly independent
vectors in the space will be a basis of that space.


   The significance of a basis for a vector space is that any vector in the vector space can be
expressed in a unique way as a linear combination of the basis vectors. This is one of the
most important ideas in linear algebra.
   The subspace consisting of only the zero vector is said to be 0-dimensional and has no basis.



The Dot Product
    If you have two vectors in $\mathbb{R}^n$ they can be multiplied together using the dot product formula. As a rule the dot product is not difficult to compute, but you should know that the dot product can also be expressed as a matrix product. That is:
$$\mathbf{u} \cdot \mathbf{v} = \mathbf{u}^T\mathbf{v} = \mathbf{v}^T\mathbf{u}$$
    A vector in $\mathbb{R}^n$ can be seen as an $n \times 1$ matrix, so the above formula says that the dot product of two vectors can be written as the transpose of one vector times the other vector. This will be used a lot in this course.
    There are some other important aspects of the dot product you should remember:

  1. $\mathbf{u} \cdot \mathbf{v} = 0$ means that $\mathbf{u}$ and $\mathbf{v}$ are orthogonal.

  2. $\mathbf{u} \cdot \mathbf{u} = \mathbf{u}^T\mathbf{u} = \|\mathbf{u}\|^2$. This equation relates the dot product to the length of a vector.

  3. $\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta$. In this equation $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$ and $0 \le \theta \le \pi$.
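   These facts are easy to check in Maple. A quick sketch with made-up vectors:

         >u:=<1,2,3>: v:=<4,5,6>:
         >DotProduct(u,v);   # 32
         >Transpose(u).v;    # the same 32, computed as the matrix product u^T v
         >Norm(u,2)^2;       # 14, the squared length of u, which equals u . u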



Matrix Multiplication
    There is an important way of looking at matrix multiplication which will be used frequently in
this course. If you have two matrices of the appropriate dimensions then

$$AB = A\,[\mathbf{b}_1\ \mathbf{b}_2\ \mathbf{b}_3\ \cdots\ \mathbf{b}_n] = [A\mathbf{b}_1\ A\mathbf{b}_2\ A\mathbf{b}_3\ \cdots\ A\mathbf{b}_n]$$

   In other words, when you multiply two matrices you multiply the first matrix with each of the
column vectors of the second matrix.
   You can also look at matrix multiplication as a collection of dot products. So

$$AB = \begin{bmatrix} \mathbf{a}_1^T\mathbf{b}_1 & \mathbf{a}_1^T\mathbf{b}_2 & \cdots & \mathbf{a}_1^T\mathbf{b}_n \\ \mathbf{a}_2^T\mathbf{b}_1 & \mathbf{a}_2^T\mathbf{b}_2 & \cdots & \mathbf{a}_2^T\mathbf{b}_n \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{a}_m^T\mathbf{b}_1 & \mathbf{a}_m^T\mathbf{b}_2 & \cdots & \mathbf{a}_m^T\mathbf{b}_n \end{bmatrix}$$
    In the above, $\mathbf{a}_i^T$ stands for row $i$ of $A$, and $\mathbf{b}_j$ stands for column $j$ of $B$.
    This might look complicated but it is just the standard computational procedure for multiplying
two matrices together. That is, you go along the rows of the first matrix and down the columns
of the second matrix, multiplying the corresponding entries and adding the results. (These are the
same steps used in finding the dot product of two vectors.)
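    A quick Maple check of the column-by-column view, using small made-up matrices:

         >A:=<<1,3>|<2,4>>: B:=<<5,7>|<6,8>>:
         >A.B;                            # the full product
         >A.Column(B,1); A.Column(B,2);   # the columns of A.B, computed one at a time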


Unit Vectors
   A unit vector is a vector whose length is 1.
   You should remember that $\dfrac{\mathbf{v}}{\|\mathbf{v}\|}$ gives a unit vector in the same direction as $\mathbf{v}$ for any non-zero vector $\mathbf{v}$. When this formula is used to convert a vector into a unit vector, the procedure is called normalizing the vector.
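   In Maple, normalizing can be done with the Normalize command (passing 2 to request the Euclidean norm) or directly from the formula. A quick sketch:

         >v:=<3,4>:
         >Normalize(v,2);   # the unit vector <3/5, 4/5>
         >v/Norm(v,2);      # the same result, using the formula v/||v||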



Projections
                                                                                                 uT v
   The projection of one vector, u, onto another vector, v, is given by Projv u =                vT v v.   The
orthogonal component of the projection is given by u − Projv u.
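   A minimal Maple sketch of these two formulas, with made-up vectors:

         >u:=<1,2,2>: v:=<1,0,1>:
         >proj:=(DotProduct(u,v)/DotProduct(v,v))*v;   # the projection of u onto v
         >w:=u-proj;                                   # the orthogonal component
         >DotProduct(v,w);                             # 0, so w really is orthogonal to v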



Diagonal Matrices
    A diagonal matrix is a matrix $A$ where $a_{ij} = 0$ if $i \neq j$. That is, for a diagonal matrix all the entries off the main diagonal are equal to zero. We will not require a diagonal matrix to be square (this is slightly different from the conventional use of this term). Sometimes non-square diagonal matrices will be referred to as rectangular diagonal matrices. In a diagonal matrix some (or all) of the entries on the main diagonal could also be zero.
    Why are diagonal matrices important? They are important mainly because of their simplicity.
If
$$A = \begin{bmatrix} a_{11} & 0 & 0 & \cdots & 0 \\ 0 & a_{22} & 0 & \cdots & 0 \\ 0 & 0 & a_{33} & \cdots & 0 \\ & & & \ddots & \\ 0 & 0 & 0 & \cdots & a_{nn} \end{bmatrix}$$
    is a square diagonal matrix then

$$A^k = \begin{bmatrix} a_{11}^k & 0 & 0 & \cdots & 0 \\ 0 & a_{22}^k & 0 & \cdots & 0 \\ 0 & 0 & a_{33}^k & \cdots & 0 \\ & & & \ddots & \\ 0 & 0 & 0 & \cdots & a_{nn}^k \end{bmatrix}$$
    for any positive integer k.
    And
$$A^{-1} = \begin{bmatrix} 1/a_{11} & 0 & 0 & \cdots & 0 \\ 0 & 1/a_{22} & 0 & \cdots & 0 \\ 0 & 0 & 1/a_{33} & \cdots & 0 \\ & & & \ddots & \\ 0 & 0 & 0 & \cdots & 1/a_{nn} \end{bmatrix}$$
assuming that all the diagonal entries are non-zero.
   Also, for a diagonal matrix the matrix-vector product Av is simple to compute. Multiplication
by a square diagonal matrix just amounts to scaling each entry of v by the corresponding diagonal
entry.
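   These properties are easy to see in Maple with the DiagonalMatrix command. A quick sketch:

         >A:=DiagonalMatrix([2,3,5]):
         >A^4;         # diagonal entries 16, 81, 625
         >A^(-1);      # diagonal entries 1/2, 1/3, 1/5
         >A.<1,1,1>;   # returns <2,3,5>: each entry is scaled by the
                       # matching diagonal entry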


Rotations
   The $2 \times 2$ matrix $R = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$ is called a rotation matrix or a rotator. If any vector in $\mathbb{R}^2$ is multiplied by such a matrix then the vector is rotated clockwise around the origin through the angle $\theta$. Note that the columns of this matrix are orthogonal unit vectors in $\mathbb{R}^2$ for any value of $\theta$.
    In $\mathbb{R}^3$ rotations are more complicated. Rotations around the $x$, $y$, or $z$ axis would correspond to multiplying by the following rotators:
$$R_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix} \qquad R_y = \begin{bmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{bmatrix} \qquad R_z = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
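   A quick Maple check of the $2 \times 2$ rotator (a minimal sketch; we rotate $\langle 1,0\rangle$ clockwise through $\pi/2$):

         >theta:=Pi/2:
         >R:=<<cos(theta),-sin(theta)>|<sin(theta),cos(theta)>>;  # the rotator above,
                                                                  # entered column by column
         >R.<1,0>;   # returns <0,-1>, which is <1,0> rotated clockwise by Pi/2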


The Null Space
    Any $m \times n$ matrix $A$ has an associated vector space called the null space of $A$. This subspace is denoted by Nul $A$. Nul $A$ will be a subspace of $\mathbb{R}^n$ and consists of the solutions to $A\mathbf{x} = \mathbf{0}$, but you should remember that any system of this type (a homogeneous system) has either only the trivial solution $\mathbf{x} = \mathbf{0}$, or an infinite number of solutions.
    The last statement implies that the null space of $A$ might consist of only the zero vector (that is, Nul $A$ might be zero dimensional) or will contain an infinite number of vectors. The null space is just the zero vector if the columns of $A$ are linearly independent. If the columns of $A$ are linearly dependent then $A\mathbf{x} = \mathbf{0}$ has non-trivial solutions and Nul $A$ will contain an infinite number of vectors.
    What's so special about Nul $A$? Well, if $A\mathbf{x} = \mathbf{b}$ is consistent then the solution set is just the null space of $A$ translated by the addition of a constant vector.



The Column Space and Row Space
    Any $m \times n$ matrix $A$ has an associated vector space called the column space of $A$. This subspace is denoted by Col $A$. Col $A$ is a subspace of $\mathbb{R}^m$ and consists of all possible linear combinations of the columns of $A$.
    What's so special about the column space? You know that any system of equations $A\mathbf{x} = \mathbf{b}$ is either consistent or inconsistent. It is consistent if $\mathbf{b}$ can be written as a linear combination of the columns of $A$; otherwise it is inconsistent. This means the system is consistent when $\mathbf{b}$ is in Col $A$, and inconsistent when $\mathbf{b}$ is not in Col $A$.
    The dimension of the column space of A is called the rank of A. The rank is also the number of
pivots in the reduced row echelon form of A.
    There is another space associated with an $m \times n$ matrix $A$ called the row space and denoted Row $A$. This is the set of all possible linear combinations of the rows of $A$. The row space and column space always have the same dimension.
    There is one particular aspect of the row space that will be relevant for this course. Suppose x
is a vector such that Ax = 0 (that is, x is in the null space of A). It was pointed out earlier that
the product Ax can be seen as a collection of dot products so the equation Ax = 0 means that any
vector in Nul A must be orthogonal to the rows of A.
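    Maple has commands for all three of these subspaces. A minimal sketch, using a made-up rank 1 matrix:

         >A:=<<1,2>|<2,4>|<3,6>>:   # a 2 x 3 matrix; each row is a multiple of <1,2,3>
         >NullSpace(A);             # a basis of Nul A: two vectors in R^3
         >RowSpace(A);              # a basis of Row A: the single row <1,2,3>
         >ColumnSpace(A);           # a basis of Col A: a single vector in R^2
         >Rank(A);                  # 1, the common dimension of Row A and Col A

Each vector returned by NullSpace is orthogonal to the row $\langle 1,2,3 \rangle$, exactly as claimed above.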


      Exercises
  1. Let
$$\mathbf{u}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \end{bmatrix},\quad \mathbf{u}_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 1 \end{bmatrix},\quad \mathbf{u}_3 = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 1 \end{bmatrix},\quad \mathbf{u}_4 = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix},\quad \mathbf{v} = \begin{bmatrix} 4 \\ -1 \\ 1 \\ -1 \end{bmatrix}$$

     (a) Show that $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \mathbf{u}_4\}$ is a basis for $\mathbb{R}^4$.
     (b) Write $\mathbf{v}$ as a linear combination of these basis vectors.
  2. Are the following statements true or false?
        (a) An underdetermined system cannot have a unique solution.
        (b) An overdetermined system must be inconsistent.
        (c) An evendetermined system must have a unique solution.
        (d) A consistent, underdetermined system must have an infinite number of solutions.
        (e) An underdetermined, homogeneous system must have an infinite number of solutions.
        (f) If the system Ax = 0 has only the trivial solution then this system cannot be underdetermined.
    3. Let A be an n × n matrix. Which of the following conditions are equivalent to A being invertible?

     (a) The rank of $A$ is $n$.
     (b) The column space of $A$ is $\{\mathbf{0}\}$.
     (c) The column space of $A$ is $\mathbb{R}^n$.
     (d) The null space of $A$ is $\{\mathbf{0}\}$.
     (e) The null space of $A$ is $\mathbb{R}^n$.
     (f) The determinant of $A$ is 0.

  4. The dot product of $\mathbf{u}$ and $\mathbf{v}$ is given by $\mathbf{u}^T\mathbf{v} = \mathbf{v}^T\mathbf{u}$. What happens if it is the second vector that is transposed? That is, what is $\mathbf{u}\mathbf{v}^T$? Is $\mathbf{u}\mathbf{v}^T = \mathbf{v}\mathbf{u}^T$?
    5. Suppose AB = I for two matrices A and B. What can you say about the orthogonality of the row
       vectors of A and the column vectors of B? (In this problem why is it not true that B = A−1 ?)
                                           
                                           1
                                 2         1 
                                           
    6. Normalize the vectors  1  and  1 
                                           
                                −2         1 
                                             1
  7. Let
$$\mathbf{u} = \begin{bmatrix} 3 \\ 3 \\ -1 \end{bmatrix},\quad \mathbf{v} = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$$
       Find the projection of u onto v. Find the projection of v onto u. Find the orthogonal component
       of the projection in both of these cases.
    8. Show that the projection and orthogonal component of the projection are the same whether you
       project u onto v or project u onto kv for any non-zero scalar k. What is the formula for the
       projection of u onto v if v is known to be a unit vector?
  9. Suppose $A$ is an $m \times n$ diagonal matrix with 1's down the diagonal and $m > n$. Using the notation of partitioned matrices this means that we can write $A = \begin{bmatrix} I \\ O \end{bmatrix}$.


    (a) Give a description of the effect of multiplying a vector in Rn by A.
    (b) Give a description of the effect of multiplying a vector in Rm by AT .
     (c) What happens to a vector if it is multiplied by AAT ?
    (d) What happens to a vector if it is multiplied by AT A?

   Multiplying a vector by A is called zero padding and multiplying by AT is called truncation.
   Zero padding increases the dimensionality of a vector, and truncation decreases the dimensionality.
   The above comments can be summarized by the rule:

                               Zero-padding is the transpose of truncation.

 10. Let
$$A = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$
     Describe what happens to a vector in $\mathbb{R}^3$ when it is multiplied by $A$. Describe what happens to a vector in $\mathbb{R}^4$ when it is multiplied by $A^T$.

 11. (a) Let
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
         Describe the effect when a vector in $\mathbb{R}^4$ is multiplied by $A$. What is $A^4$?
     (b) Let $A$ be the $n \times n$ matrix
$$A = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ & & & \ddots & & \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{bmatrix}$$
         What is $A^2$? What is $A^n$?
     (c) Let
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}$$
         Describe the effect when a vector in $\mathbb{R}^4$ is multiplied by $A$. What is $A^4$?
     (d) Let $A$ be the $n \times n$ matrix
$$A = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ & & & \ddots & & \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & 0 & \cdots & 0 & 0 \end{bmatrix}$$
         What is $A^2$? What is $A^n$?
 12. Why is any unit vector in $\mathbb{R}^2$ of the form $\begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}$?

 13. Let $\mathbf{u} = \begin{bmatrix} a \\ b \end{bmatrix}$ be any vector in $\mathbb{R}^2$. Find a vector orthogonal to $\mathbf{u}$.

 14. Use matrix multiplication to show that a clockwise rotation (in $\mathbb{R}^3$) of $\pi/2$ around the $x$ axis, followed by a clockwise rotation of $\pi/2$ around the $z$ axis, followed by a clockwise rotation of $\pi/2$ around the $y$ axis, is equivalent to a counter-clockwise rotation of $\pi/2$ around the $z$ axis.
 15. Give a geometric description of the null space of $\begin{bmatrix} 1 & 1 & 1 \\ 2 & 1 & 3 \end{bmatrix}$. Give an equation that defines this space.
     Give a geometric description of the row space of this matrix. Give an equation for this space.
     Why is the column space of this matrix all of $\mathbb{R}^2$?
 16. Let $A = \begin{bmatrix} 2 & 4 \\ 3 & 6 \end{bmatrix}$.

      (a) How can you tell by observation that this matrix has rank 1?
      (b) Write this matrix in the form $\mathbf{u}\mathbf{v}^T$ where $\mathbf{u}$ and $\mathbf{v}$ are vectors in $\mathbb{R}^2$.
      (c) Find a basis for Nul $A$, Col $A$, and Row $A$.
 17. (a) Show that $\mathbf{u}\mathbf{v}^T$ has rank 1 for any two non-zero vectors $\mathbf{u}$ and $\mathbf{v}$.
      (b) Show that if $\mathbf{w}$ is orthogonal to $\mathbf{v}$ then $\mathbf{w}$ is in the null space of $\mathbf{u}\mathbf{v}^T$.
                                      
 18. Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 1 & 2 \\ 0 & 1 & 1 \end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}$. Is $\mathbf{v}$ in Col $A$? Is $\mathbf{v}$ in Nul $A$?


                                       Using MAPLE


Example 1.
    In this Maple lab we will illustrate an important aspect of linear transformations.
    In $\mathbb{R}^2$ any vector of the form $\begin{bmatrix} \cos t \\ \sin t \end{bmatrix}$ lies on the unit circle. In fact, the entire unit circle is the set of all vectors of this form as $t$ ranges from 0 to $2\pi$. In Maple we can illustrate this as follows:

         >with(LinearAlgebra):
         >v:=<cos(t), sin(t)>:
         >plot([v[1],v[2],t=0..2*Pi]);


                                Figure 1: The unit circle.

   Here are some points you should understand about the above Maple commands:

   • The first command, with(LinearAlgebra), loads the linear algebra package, which adds some extra commands dealing with linear algebra to Maple. All the Maple exercises in this book require this package, so you should enter this command at the beginning of each Maple lab. From now on this command won't be given explicitly, but it will be understood that you have entered it at the beginning of each Maple session.
     We ended the line with a colon. If you end the line with a semi-colon Maple will print out the names of all the new commands that have been loaded.
   • The second command defines a vector $\mathbf{v} = \begin{bmatrix} \cos t \\ \sin t \end{bmatrix}$.

   • Angle brackets, square brackets, parentheses, and curly brackets all have different meanings in Maple. So make sure you use the same type of bracket as indicated in the example.

   • The plot command as used here takes a list as input. The list consists of three entries. The first
     entry is the x value of the point that is being plotted. This value is cos t and this is the first entry
     in vector v. This first entry can be selected using the command v[1]. The second entry in the
     list is the y value. This is the second entry in v and is selected by using v[2]. The third entry in
     the list is the range of the parameter t. In this case the range is from 0 to 2π.

    Now let $A = \begin{bmatrix} 1 & -1 \\ 2 & -1 \end{bmatrix}$. If we multiply $\begin{bmatrix} \cos t \\ \sin t \end{bmatrix}$ by matrix $A$ it is equivalent to multiplying the entire unit circle by $A$. In this case we say that we are applying a linear transformation to the circle. How does the circle get transformed by $A$? We will use Maple to find the answer:

         >A:=<<1,2>|<-1,-1>>;
         >u:=A.v;
         >plot([u[1],u[2],t=0..2*Pi],scaling=constrained);


                Figure 2: The unit circle transformed by $\begin{bmatrix} 1 & -1 \\ 2 & -1 \end{bmatrix}$.

     Here are some comments about the above:
     • The first line illustrates one method of defining a matrix in Maple . Here we define the matrix
       column by column. There is one pair of angled brackets for the entire matrix and each column
       vector is enclosed in angled brackets. The columns are separated by vertical bars, |.
     • The second command illustrates how to perform matrix multiplication in Maple . Scalar multipli-
       cation uses the * symbol. Matrix multiplication uses the . symbol.
     • The last line plots the transformed circle. The added parameter scaling=constrained forces the
       x and y axes to be scaled the same.
    Now we will repeat the same procedure for other matrices. (In a Maple worksheet you can just scroll up, change the values in matrix $A$, and re-enter the subsequent commands.)
    For example, if we let $A = \begin{bmatrix} -3 & 1 \\ 2 & -1 \end{bmatrix}$ we would get Figure 3.

                Figure 3: The unit circle transformed by $\begin{bmatrix} -3 & 1 \\ 2 & -1 \end{bmatrix}$.

    If we let $A = \begin{bmatrix} 1 & 1.5 \\ 0 & 2 \end{bmatrix}$ we would get Figure 4.

                Figure 4: The unit circle transformed by $\begin{bmatrix} 1 & 1.5 \\ 0 & 2 \end{bmatrix}$.

    If we let $A = \begin{bmatrix} 2 & 0 \\ 0 & .3 \end{bmatrix}$ we would get Figure 5.


                Figure 5: The unit circle transformed by $\begin{bmatrix} 2 & 0 \\ 0 & .3 \end{bmatrix}$.


    This last example is particularly important. The matrix A in this example is diagonal and therefore
multiplication by A corresponds to scaling the unit circle by 2 in the horizontal direction and by .3 in
the vertical direction. The effect is to stretch the circle horizontally and flatten the circle vertically. You
should find it easy to predict the result of multiplying the unit circle by any diagonal matrix.

   For our next two examples let

$$R = \begin{bmatrix} \cos(.3) & -\sin(.3) \\ \sin(.3) & \cos(.3) \end{bmatrix} \qquad\text{and}\qquad S = \begin{bmatrix} .4 & 0 \\ 0 & 1.5 \end{bmatrix}$$

Notice that $R$ is a rotation matrix. Matrix $R$ rotates vectors in $\mathbb{R}^2$ counter-clockwise by .3 radians. $S$ is a diagonal matrix. Matrix $S$ scales vectors in $\mathbb{R}^2$ by different amounts along the horizontal and vertical axes. What does the unit circle look like after multiplying by $R$? by $S$? by $R^2$? by $S^2$? by $R^3$? by $S^3$?
    Now consider the matrices $RS$ and $SR$: how do these two matrices transform the unit circle? Do they have the same effect?


         >R:=<<cos(.3),sin(.3)>|<-sin(.3),cos(.3)>>;
         >S:=<<.4,0>|<0,1.5>>:
         >u1:=R.S.v;
         >u2:=S.R.v;
         >plot([u1[1],u1[2],t=0..2*Pi],scaling=constrained);
         >plot([u2[1],u2[2],t=0..2*Pi],scaling=constrained);
     This gives Figure 6 and Figure 7.
          Figure 6: Transforming by $RS$.          Figure 7: Transforming by $SR$.

     You should understand why you got these pictures. (A lot of people expect them to be reversed.)

    You should try a few more examples using some 2 × 2 matrices of your own choice.
    In each of the above examples the transformed circle was, in fact, an ellipse. By the end of the course
you will have a clear understanding of why you get an ellipse in these cases. But, in fact, do you always
get an ellipse?
    Suppose we let $A = \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}$; then the transformed circle is shown in Figure 8.

        Figure 8: The unit circle transformed by $\begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}$, a non-invertible matrix.

    In this case we get a straight line. Is this an ellipse? Well, yes, in a sense. This can be seen as what
is called a degenerate ellipse, an ellipse where the width along the minor axis has shrunk to zero.
    Why did we get a straight line in this last example? You should be able to understand this result on purely theoretical grounds. First, notice that $A$ in this last example is not invertible. Therefore the columns of $A$ must be linearly dependent. If you look at the column space you should be able to see that it is just the line $y = 2x$. When you multiply any vector by matrix $A$ you are just forming a linear combination of the columns of $A$, which means that the result of that multiplication has to be in the column space, i.e., the result must lie on the line $y = 2x$. Therefore every point on the unit circle ends up on this line.
    Now consider the following questions.

   • The above examples all applied a linear transformation to the unit circle. What would the results
     have been if the same transformations had been applied to a circle, centered at the origin, of radius
     2?

   • How would the unit circle look if it was multiplied by a matrix of the form $\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$?

   • How would the unit circle look if it was multiplied by $\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$? (Hint: factor the scalar $\sqrt{2}$ out of this matrix.)

   • What would happen to the unit sphere in R3 if it was multiplied by a 3 × 3 diagonal matrix with
     non-zero entries on the diagonal? What if one of the diagonal entries is 0? What if two of the
     diagonal entries are 0?

   • Modify the Maple procedures from this section to see what happens when a transformation is
     applied to a circle that is not centered at the origin. Is the result still an ellipse? Give a mathematical
     explanation of how the results in this case compare with the results when the circle is centered at
     the origin.


Example 2.
    In this example we will look at a particular type of problem that is closely related to much of what we will be doing later on in this book. Suppose you have a set of $n$ points in $\mathbb{R}^2$ with distinct $x$ coordinates; then it is possible to find a polynomial of degree $n-1$ that passes through all the points. This polynomial is called an interpolating polynomial. For example, you can find a straight line that passes through any two such points, you can find a quadratic which passes through any three such points, four such points determine a cubic polynomial, and so on.
    To illustrate, suppose we have the following five points: (1,3), (2,1), (3,6), (4,4), (8,0). We can then
find a fourth degree polynomial (this is called a quartic)

$$p(x) = a_0 + a_1x + a_2x^2 + a_3x^3 + a_4x^4$$

that passes through all five points. Furthermore, we can find this polynomial by solving a system of five equations with five unknowns. The equations making up the system are found by substituting each of the points into the polynomial. Each time we substitute numerical values for $x$ and $p(x)$ we are left with an equation that is linear in the unknowns $a_0$, $a_1$, $a_2$, $a_3$, and $a_4$. This gives the following system

$$\begin{aligned} a_0 + a_1 + a_2 + a_3 + a_4 &= 3 \\ a_0 + 2a_1 + 4a_2 + 8a_3 + 16a_4 &= 1 \\ a_0 + 3a_1 + 9a_2 + 27a_3 + 81a_4 &= 6 \\ a_0 + 4a_1 + 16a_2 + 64a_3 + 256a_4 &= 4 \\ a_0 + 8a_1 + 64a_2 + 512a_3 + 4096a_4 &= 0 \end{aligned}$$


which can be written as the matrix equation
$$\begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 8 & 16 \\ 1 & 3 & 9 & 27 & 81 \\ 1 & 4 & 16 & 64 & 256 \\ 1 & 8 & 64 & 512 & 4096 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 6 \\ 4 \\ 0 \end{bmatrix}$$

    Notice that the coefficient of a0 is always 1 and these values make up the first column of the coefficient
matrix. The coefficients of a1 are the x coordinates of the given points and these x coordinates make up
the second column of the coefficient matrix. The coefficients of a2 are the squares of the x coordinates
and these make up the third column of the coefficient matrix. The same pattern continues for the
remaining columns. The coefficient matrix of this type of system is called a Vandermonde matrix
and there is a command in Maple for defining this type of matrix. We will solve this problem using the
following Maple code, but first here are some comments about the code:

     • We have ended most lines of Maple input with a colon to prevent the output from being printed.
       You might want to change these colons to semi-colons to see the result of each command.

   • The first line defines a vector which we called xvals consisting of the $x$ coordinates of the five points. The second line defines a vector of the $y$ coordinates. Each of these can be seen as a vector in $\mathbb{R}^5$.

     • The VandermondeMatrix command in Maple takes a vector of the x coordinates as input and
       returns the corresponding Vandermonde matrix. The third line below defines the 5×5 Vandermonde
       matrix for our system. The first column consists of the coefficients of a0 , i.e., all 1’s. The second
       column consists of the coefficients of a1 , i.e., the x values of our points. The third column consists
       of the coefficients of a2 , i.e., the squares of the x values. And so on.
       Each column of the Vandermonde matrix consists of the x coordinates raised to various powers. In
       the first column these coordinates are raised to the power 0. In the second column they are raised
       to the power 1. And so on.

     • The fourth line solves the system by using the inverse of the Vandermonde matrix. We called the
       solution sol, so sol is a vector containing the values of the 5 coefficients of our interpolating
       polynomial.

   • The fifth line gives another way of solving the system using row reduction. There are actually two commands on this line. The first command creates the augmented matrix of our system and then puts it in reduced row echelon form using the ReducedRowEchelonForm command. We call the reduced matrix RV. The second command selects the sixth column from this reduced matrix. This column will be the solution to the system.

     • The sixth line defines our interpolating polynomial which we call p. In this line we use the add
       command in Maple . As the parameter i varies from 1 to 5 the expression sol[i] takes on each
       of the 5 values in sol and multiplies it by the corresponding power of x. The add command then
       adds these terms together giving us the polynomial.

          >xvals:=<1,2,3,4,8>:
          >yvals:=<3,1,6,4,0>:
          >V:=VandermondeMatrix(xvals):
          >sol:=V^(-1).yvals:
          >RV:=ReducedRowEchelonForm(<V|yvals>): Column(RV,6);
          >p:=add(sol[i]*x^(i-1),i=1..5);


   The last line returns the interpolating polynomial:
               p(x) = 1264/35 − (1244/21)x + (129/4)x^2 − (275/42)x^3 + (59/140)x^4
   We will now use Maple to plot this result. Again we will give some comments concerning the
commands.

   • We will include the points in our plot. To do this in Maple we need to make up a list consisting
     of the 5 points. That is we need the list

                                           [[1, 3], [2, 1], [3, 6], [4, 4], [8, 0]]

     A list in Maple is contained within square brackets so this list contains 5 items. Each item in the
     list is itself a list of two values, the coordinates of the points.
     The first line defines this list of points. Note how the seq command is used to create this sequence
     of coordinates. The expressions xvals[i] and yvals[i] return the x and y coordinates previously
     defined as i ranges from 1 to 5.

   • The second line defines the plot of the points and calls it p1. The plot is not shown at this stage.

   • The third line defines the plot of the interpolating polynomial calling it p2.

   • The fourth line uses the display command from the plots package in Maple . This command
     allows us to combine several plots into one.

        >pts:=[seq( [xvals[i],yvals[i]], i=1..5)]:
        >p1:=plot(pts,style=point,color=black):
        >p2:=plot(p,x=1..8):
        >plots[display](p1,p2);

   This returns Figure 9 which shows the polynomial passing through each of the five given points.


                 Figure 9: An interpolating polynomial passing through 5 points.

    We will look at another example of the same type that involves more points. In the following Maple
code the first line illustrates another way of defining a vector in Maple . Here we generate a vector in
R9 . The expression i->i-5 is a function that generates the entries in the vector. The parameter i ranges
from 1 to 9 and the corresponding entry in the vector is i-5.


    The second line generates the y coordinates by evaluating the signum function at each x value. The
signum function returns -1 if the input is negative, 1 if the input is positive, and 0 if the input is 0.
The map command just tells Maple to apply this function to each entry in xvals, producing a new
vector. The rest of the example follows the same steps as the previous example.

        >xvals:=Vector(9, i->i-5);
        >yvals:=map(signum,xvals):
        >V:=VandermondeMatrix(xvals):
        >sol:=V^(-1).yvals:
        >p:=add(sol[i]*x^(i-1),i=1..9);
        >pts:=[seq([xvals[i],yvals[i]],i=1..9)]:
        >p1:=plot(pts,style=point):
        >p2:=plot(p,x=-4..4):
        >plots[display](p1,p2);

This returns Figure 10.


                  Figure 10: An interpolating polynomial passing through 9 points.

   Finally we will point out that a unique solution to this type of problem is guaranteed as long as the
Vandermonde matrix is invertible and this will happen whenever the x values are all distinct.
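    As a quick check of this claim in Maple, the following sketch (it assumes the LinearAlgebra
package has been loaded) computes the determinant of the Vandermonde matrix built from the x
values of our first example; the determinant is non-zero precisely because the values 1, 2, 3, 4, 8
are all distinct.

        >with(LinearAlgebra):
        >V:=VandermondeMatrix(<1,2,3,4,8>):
        >Determinant(V);                  # should return 10080, which is non-zero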
Chapter 1

Change of Basis

First, some comments about notation. In this book a basis will be represented by a calligraphic
symbol such as B or C. In particular, the standard basis of Rn will be represented by E.
    If a certain letter is used to represent a basis, then the vectors of that basis will usually be
represented by the same letter in lower case bold font with subscripts. For example, B = {b1 , b2 , b3 }
would represent a basis of a three dimensional vector space. In this book we will usually understand
a basis to be an ordered basis. This means that the vectors in the basis are specified in a particular
order. The same basis vectors in a different order would be regarded as a different basis.
    Finally, given the above usage, a capital letter in regular font will indicate the matrix created by
using the basis vectors (in the appropriate order) as columns.1 So for example, B = [b1  b2  b3 ]
is the matrix formed by using the vectors in basis B as columns.



1.1      Coordinates Relative to a Basis
   This section is based on three important points that you should know from a previous linear
algebra course.

    • Any vector space has a basis.

    • Given a vector space, V , and a basis for that space, B, then any vector in V can be written as
      a unique linear combination of those basis vectors.

    • Any non-zero vector space has, in fact, infinitely many possible choices for a basis and each
      basis will contain the same number of vectors. (This number is the dimension of the vector
      space.)

    Let B = {b1 , b2 , . . . , bn } be a basis of some vector space and let v be some vector in that space.
It then follows that


                                         v = c1 b1 + c2 b2 + · · · + cn bn

    1 The matrix formed from the standard basis is an exception to this. This matrix would be the identity matrix and
is represented by I.




for some weights c1 , c2 , . . . , cn . These weights are called the coordinates of v relative to basis B.
These weights can be seen as a vector in Rn and this vector is represented by the symbol [v]B . So
                                                          
                                      [v]B = (c1 , c2 , . . . , cn )
    The vector [v]B is also called the B-representation of v. You should look at the vector [v]B as a
recipe for how to create vector v from the vectors in basis B. It gives you the numerical information
required to construct v from the basis vectors.
    The following example will illustrate the points made so far in this chapter:



     Let
                      B = {(1, 1), (1, −1)}           C = {(1, 2), (5, −1)}
                      D = {(5, −1), (1, 2)}           E = {(1, 0), (0, 1)}
     where the vectors are written here as columns in tuple notation.




   Each of these sets is a basis of R2 (why?). In fact, the last of these bases is just the standard
basis of R2 . You should also notice that C and D contain the same vectors in a different order.
   Now let
                          v = 4b1 + 6b2 = 4(1, 1) + 6(1, −1) = (10, −2)
   Notice that v is defined as a linear combination of b1 and b2 . The weights of this linear
combination are 4 and 6. These are the coordinates of v relative to basis B, so we have

                                        [v]B = (4, 6)
The entries in this vector tell you how to obtain v from the vectors in basis B.
   How can v be obtained from the vectors in C? The answer here should be obvious if you look at
the relevant vectors. To obtain v you have to take zero times the first basis vector and two times
the second basis vector. In other words,
                                        [v]C = (0, 2)
     Similarly it should be very easy to see that
                                        [v]D = (2, 0)
and
                                        [v]E = (10, −2)
   To recap, the expression [v]B stands for a vector. The entries in this vector are the weights
required to create vector v from the vectors in the basis B.

Example 1.1.1


            Let
                                                                           
                        b1 = (1, 1, 1)      b2 = (0, 1, 1)      b3 = (1, 2, 1)

            Then B = {b1 , b2 , b3 } will be a basis of R3 . (You should verify this yourself.)
                     
            Let v = (1, 2, 3). What is [v]B ?
            The notation is new, but you should recognize this as a variation on one of the funda-
            mental problems of linear algebra. The problem is how to write one vector as a linear
            combination of some other vectors. This amounts to solving the system with augmented
            matrix
                                          [ b1  b2  b3 | v ]

            Setting up this augmented matrix and applying elementary row operations would result
            in
                                                                    
                            [ 1 0 1 | 1 ]     [ 1 0 0 |  2 ]
                            [ 1 1 2 | 2 ]  ~  [ 0 1 0 |  2 ]
                            [ 1 1 1 | 3 ]     [ 0 0 1 | −1 ]

            The solution to this system gives you the weights of the desired combination so
                                                            
                                        [v]B = (2, 2, −1)
                                                    
            Now suppose you are told that [u]B = (0, 3, 5); then what is u?
            Again, the computation is simple. It’s just a matter of understanding the notation. The
            information that you are given means that
                                                           
                  u = 0b1 + 3b2 + 5b3 = (0, 0, 0) + (0, 3, 3) + (5, 10, 5) = (5, 13, 8)
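
            Both computations in this example are easy to reproduce in Maple. Here is a minimal
            sketch (it assumes the LinearAlgebra package is loaded): LinearSolve solves Bx = v,
            which is exactly the problem of finding [v]B , and a matrix-vector product recovers u.

                >with(LinearAlgebra):
                >B:=<<1,1,1>|<0,1,1>|<1,2,1>>:     # basis vectors as columns
                >v:=<1,2,3>:  u_B:=<0,3,5>:
                >LinearSolve(B,v);                 # returns [v]B = (2, 2, -1)
                >B.u_B;                            # returns u = (5, 13, 8)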




1.2     Change of Basis
The Standard Basis
                                                               
Suppose B = {b1 , b2 , . . . , bn } is a basis of Rn and [v]B = (c1 , c2 , . . . , cn ). How can you find [v]E ?


Look at the following equations. (Give the justification for each step.)

                                [v]E   =   v
                                       =   c1 b1 + c2 b2 + · · · + cn bn
                                                                    
                                       =   [b1 b2 . . . bn ] (c1 , c2 , . . . , cn )
                                       =   B [v]B

    The above derivation shows that to convert the B-representation of a vector to the representation
in terms of the standard basis you just have to multiply by B. This matrix is therefore called the
change of basis matrix from B to E and is represented by the symbol PE←B .
    Suppose, on the other hand, you knew the representation of a vector, v, in the standard basis
and wanted the representation relative to B. This is the same problem as trying to write v as a linear
combination of the columns of B. It is therefore the same as trying to solve the system Bx = v.
The solution to this equation is given by B −1 v = B −1 [v]E . The conclusion we can draw from this
is that PB←E = B −1 . In other words, PB←E = (PE←B )−1 .

Example 1.2.1
            Let
                                  B = {b1 , b2 } = {(2, 1), (1, 2)}
           Then we have
                                                              2   1
                                                 PE←B =
                                                              1   2
           and
                              PB←E = (PE←B )−1 = [  2/3  −1/3 ]
                                                 [ −1/3   2/3 ]

           Suppose you are given [u]B = (3, −1). What is [u]E ?
           The point of this section is that this can be answered by a matrix multiplication. We
           have
                              [u]E = PE←B [u]B = [ 2 1 ] [  3 ]   [ 5 ]
                                                 [ 1 2 ] [ −1 ] = [ 1 ]
           So the coordinates of vector u relative to the standard basis are (5, 1), and the coordinates
           of u relative to B are (3, -1). These are illustrated in Figures 1.1 and 1.2.
           Similarly, if we knew that [v]E = (−2, 4) then we would have

                        [v]B = PB←E [v]E = [  2/3 −1/3 ] [ −2 ]   [ −8/3 ]
                                           [ −1/3  2/3 ] [  4 ] = [ 10/3 ]
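
           In Maple both conversions are single matrix-vector products; a minimal sketch:

               >B:=<<2,1>|<1,2>>:                 # P_{E<-B}: basis vectors as columns
               >u_B:=<3,-1>:  v_E:=<-2,4>:
               >B.u_B;                            # returns [u]E = (5, 1)
               >B^(-1).v_E;                       # returns [v]B = (-8/3, 10/3)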




Example 1.2.2

                 Figure 1.1: The vector (5,1) relative to the standard basis.




                   Figure 1.2: The vector (5,1) relative to basis B is (3,-1).


          Let                                                     
                                                    2        1       0 
                              B = {b1 , b2 , b3 } =  1  ,  0   , 1 
                                                       0      1       1
                                                                         
                                                                 
                                                   3              0
                                     [u]B =  −1  , [v]E =      0 
                                                  −1              3
         What are [u]E and [v]B ?
         Again the main point is that both of these questions can be answered by matrix multipli-
         cation. To find [u]E we just have to multiply PE←B [u]B , and from the above discussion
         this is just B [u]B :
                                                            
                               [ 2 1 0 ] [  3 ]   [  5 ]
                               [ 1 0 1 ] [ −1 ] = [  2 ] = [u]E
                               [ 0 1 1 ] [ −1 ]   [ −2 ]
         More intuitively, when you compute the above matrix multiplication you are just com-
         bining the columns of B using the weights in vector [u]B . The result will be vector u
         which is the same as [u]E .
         For the second question we again just have to multiply by the appropriate change of
         basis matrix. This means that we must compute PB←E [v]E , and this is just B −1 [v]E . In
         principle this is fine, but in practice it means that we would have to first find the inverse
         of B and then carry out the matrix multiplication. It would be computationally simpler
         to set up the appropriate augmented matrix and reduce it:


                                                                      
                            [ 2 1 0 | 0 ]     [ 1 0 0 | −1 ]
                            [ 1 0 1 | 0 ]  ~  [ 0 1 0 |  2 ]
                            [ 0 1 1 | 3 ]     [ 0 0 1 |  1 ]
                          
           So [v]B = (−1, 2, 1).
           To confirm this result we can compute:
                                                  
              −b1 + 2b2 + b3 = (−2, −1, 0) + (2, 0, 2) + (0, 1, 1) = (0, 0, 3) = v = [v]E




Arbitrary Bases
Suppose now that you have two bases of Rn , B and C, and you want to convert a vector from its
B-representation to its C-representation. In other words, what is PC←B ?
   The following diagram is meant to illustrate the answer.



                                            E
                                         ↗     ↘
                                      B           C −1
                                    B ——————————→ C
                                         C −1 B


   The idea is that if you start off with a vector in terms of basis B, then you convert into the
standard basis by multiplying by B. When you have the representation in the standard basis you
convert into the basis C by multiplying by C −1 . These two steps can be combined into multiplying
the original vector by C −1 B. So the change of basis matrix from B to C is PC←B = C −1 B.
   Notice that one implication of the above discussion is that PB←C = (C −1 B)−1 = B −1 C.


Example 1.2.3
           Suppose B = {b1 , b2 } = {(2, 3), (1, 1)}, C = {c1 , c2 } = {(1, 3), (2, 1)}, and [v]B = (3, −1).
           What is [v]C ?
           To answer this question we just have to evaluate C −1 B [v]B . We will first find C −1 B
           using row reduction:

                  [C |B] = [ 1 2 | 2 1 ] ~ [ 1  2 |  2  1 ] ~ [ 1 0 | 4/5 1/5 ]
                           [ 3 1 | 3 1 ]   [ 0 −5 | −3 −2 ]   [ 0 1 | 3/5 2/5 ]


             The right hand side2 of the final array is C −1 B. So now we multiply this matrix by the
             given vector:

                                 [ 4/5 1/5 ] [  3 ]   [ 11/5 ]
                                 [ 3/5 2/5 ] [ −1 ] = [  7/5 ]

              Therefore [v]C = (11/5, 7/5). To confirm this we can compute v from its representation in
             the two bases.
             From basis B we get
                                  v = 3(2, 3) − 1(1, 1) = (5, 8)
             From basis C we get

              v = (11/5)(1, 3) + (7/5)(2, 1) = (11/5, 33/5) + (14/5, 7/5) = (25/5, 40/5) = (5, 8)




   2 Putting a matrix in reduced row echelon form by elementary row operations is equivalent to multiplying by a
matrix. In this example the left hand side of the array was transformed from C to I, which is equivalent to multiplying
by C −1 . Since the same row operations were done to the right hand side, the right hand side becomes C −1 B.
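    The whole of Example 1.2.3 can be checked with a few lines of Maple; this minimal sketch
computes PC←B = C −1 B directly rather than by row reduction:

        >B:=<<2,3>|<1,1>>:  C:=<<1,3>|<2,1>>:
        >P:=C^(-1).B;                     # the change of basis matrix P_{C<-B}
        >v_B:=<3,-1>:
        >P.v_B;                           # returns [v]C = (11/5, 7/5)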


       Exercises

   1. Let b1 = (1, 2), b2 = (2, 1), and v = (5, −2). Let B = {b1 , b2 }.
        Find [v]B .
                                                
   2. Let b1 = (1, 1, 1), b2 = (1, 1, −1), b3 = (1, −1, −1), and v = (1, 5, 5). Let B = {b1 , b2 , b3 }.
        Find [v]B .
                                            
   3. Let b1 = (1, 2, 3), b2 = (4, 5, 6), b3 = (7, 8, 8), and v = (7, 2, 4). Let B = {b1 , b2 , b3 }.
        (a) Find [v]E .
        (b) Find [v]B .
        (c) Find PB←E
                    3         −1                                                         2
     4. Let b1 =       , b2 =    . Let B = {b1 , b2 } and C = {b2 , b1 }. Suppose [v]B =   .
                    −1         1                                                         3
        (a) Find [v]E .
        (b) Find [v]C .
        (c) Find PB←E .
        (d) Find PB←C .
   5. Let u = (3, 0). Find a basis, B, of R2 such that [u]B = (1, 1). How many choices for B are
      there in this case?
                                       
   6. Let b1 = (2, −1, 0), b2 = (−1, 2, −1), b3 = (0, −1, 2) and let B = {b1 , b2 , b3 }.
       (a) Suppose [v]B = (1, −1, 1). Find v.
       (b) Suppose u = (0, 0, 4). Find [u]B .

   7. Let b1 = (1, 1), b2 = (1, 2), c1 = (1, 2), c2 = (3, 4).
        (a) Find PB←C and PC←B .
      (b) If [v]B = (5, 6) find [v]C .
        (c) What is [c2 ]B ?
        (d) What is [c2 ]C ?


   8. Let
                B = {(1, 1, 1), (1, 1, 0), (0, 1, 1)},     C = {(1, 0, 0), (1, 2, 0), (1, 2, 3)}

      Find PC←B and PB←C .
  9. Let B = {b1 , b2 , b3 } be a basis for R3 . Let C = {b3 , b1 , b2 }.
      (a) What is [b1 ]B ?
      (b) What is [b1 ]C ?
       (c) What is PB←C ?
      (d) What is PC←B ?
 10. Let B = {b1 , b2 , . . . , bn } be a basis for Rn , and let C = {Ab1 , Ab2 , . . . , Abn } for some invertible
     n × n matrix A.
      (a) What is PB←C ?
      (b) What is PC←B ?
 11. Suppose B is an invertible n × n matrix, then the columns of B form a basis for Rn . Call this
     basis B. It follows that B 2 is also invertible, and so the columns of B 2 also form a basis of Rn .
     Call this basis C. What is PC←B ?
 12. Let B = {b1 , b2 , . . . , bn } and C = {c1 , c2 , . . . , cn } be bases of Rn . Show that

                                           PC←B = [b1 ]C [b2 ]C · · · [bn ]C

 13. Let B = {b1 , b2 , . . . , bn } be a basis for some n dimensional subspace. What is [bi ]B ?
 14. Let B, C, and D be different bases of Rn . Show that PD←B = PD←C PC←B .
 15. Are the following true or false:
      (a) [u + v]B = [u]B + [v]B
      (b) [kv]B = k [v]B , where k is a scalar.
       (c) [Av]B = A [v]B , where A is a matrix of the appropriate dimension.


1.3     Examples of Bases
The Haar Basis
There are some bases that are important enough to be singled out and given special names. One
example is the standard basis. Another basis that has some important applications is called the
Haar basis. We will occasionally use the Haar basis for examples later in this book. There is a Haar
basis for Rn whenever n is a power of 2. If we use the Haar basis for Rn as the columns of a matrix,
Hn , we have:

                                        [ 1  1 ]
                                 H2  =  [ 1 −1 ]

                                        [ 1  1  1  0 ]
                                 H4  =  [ 1  1 −1  0 ]
                                        [ 1 −1  0  1 ]
                                        [ 1 −1  0 −1 ]

                                        [ 1  1  1  0  1  0  0  0 ]
                                        [ 1  1  1  0 −1  0  0  0 ]
                                        [ 1  1 −1  0  0  1  0  0 ]
                                 H8  =  [ 1  1 −1  0  0 −1  0  0 ]
                                        [ 1 −1  0  1  0  0  1  0 ]
                                        [ 1 −1  0  1  0  0 −1  0 ]
                                        [ 1 −1  0 −1  0  0  0  1 ]
                                        [ 1 −1  0 −1  0  0  0 −1 ]
The pattern behind the Haar basis should be clear from these examples. One particularly important
property of the Haar basis that will become relevant later on is that the vectors in the basis are
mutually orthogonal.

Example 1.3.1
                                         
             Find the coordinates of v = (1, 2, 3, 4) relative to the Haar basis of R4 .
           The simplest way to answer this question is to set up the appropriate augmented matrix
           and reduce it:
                                                                             
                        [ 1  1  1  0 | 1 ]     [ 1 0 0 0 |  5/2 ]
                        [ 1  1 −1  0 | 2 ]  ~  [ 0 1 0 0 |   −1 ]
                        [ 1 −1  0  1 | 3 ]     [ 0 0 1 0 | −1/2 ]
                        [ 1 −1  0 −1 | 4 ]     [ 0 0 0 1 | −1/2 ]

           The last column gives the coordinates we are looking for.
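
            Alternatively, the same coordinates can be found in Maple by solving H4 c = v; a
            minimal sketch (it assumes the LinearAlgebra package is loaded):

                >H4:=<<1,1,1,1>|<1,1,-1,-1>|<1,-1,0,0>|<0,0,1,-1>>:
                >LinearSolve(H4,<1,2,3,4>);        # returns (5/2, -1, -1/2, -1/2)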



Example 1.3.2
            Compute H4^T H4 .
           By simple matrix multiplication we have


                                                                                            
                    [ 1  1  1  1 ] [ 1  1  1  0 ]   [ 4 0 0 0 ]
                    [ 1  1 −1 −1 ] [ 1  1 −1  0 ]   [ 0 4 0 0 ]
                    [ 1 −1  0  0 ] [ 1 −1  0  1 ] = [ 0 0 2 0 ]
                    [ 0  0  1 −1 ] [ 1 −1  0 −1 ]   [ 0 0 0 2 ]
               The following questions are meant to reveal the significance of the above computation.
                    • What do the entries down the diagonal of this matrix product tell you about
                      the vectors in H4 ?
                    • What do the zeroes off the diagonal of this product tell you about the vectors
                      in H4 ?
                    • What is H4^(−1) ? (This inverse should be fairly easy to find given the above
                      matrix product. Hint: is H4^(−1) = H4^T ?)
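
               If you want to check your answers, here is a hedged sketch in Maple: the product
               H4^T H4 is diagonal because the columns of H4 are mutually orthogonal, and the
               inverse can be recovered by rescaling the rows of the transpose.

                   >with(LinearAlgebra):
                   >H4:=<<1,1,1,1>|<1,1,-1,-1>|<1,-1,0,0>|<0,0,1,-1>>:
                   >Dg:=Transpose(H4).H4;                   # diag(4, 4, 2, 2)
                   >Equal(H4^(-1), Dg^(-1).Transpose(H4));  # returns true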



Sampling
There is another method of generating vectors, and in some cases basis vectors, that will be men-
tioned here. This method might seem strange at first, but it lies behind some very important
applications of linear algebra.
    Suppose you have a set of functions, say f1 (x) = 1, f2 (x) = x, f3 (x) = x2 , and f4 (x) = x3 .
These functions can be sampled. This means they can be evaluated at a set of equally spaced3
x values. The values generated by each function are then placed in a vector. For example we can
evaluate the above functions at x = 1, 2, 3, 4. Evaluating the four functions at these points gives the
following corresponding vectors:

         v1 = (1, 1, 1, 1)     v2 = (1, 2, 3, 4)     v3 = (1, 4, 9, 16)     v4 = (1, 8, 27, 64)

Figure 1.3 illustrates the connection between the function fi (x) and the vector vi .
    You can easily confirm that these vectors form a basis for R4 .
    Suppose we sample the same 4 functions at x = −2, −1, 1, 2 and use the resulting vectors as
columns of matrix. The resulting matrix would be
                                                       
                                        [ 1 −2 4 −8 ]
                                        [ 1 −1 1 −1 ]
                                        [ 1  1 1  1 ]
                                        [ 1  2 4  8 ]

You will again find that this matrix is invertible so in this case the sampled vectors would be a basis
of R4 .
    If we had the functions g1 (x) = 1, g2 (x) = x, g3 (x) = 2x, and g4 (x) = 3x and sampled them at
x = 1, 2, 3, 4 and use the sampled vectors as columns of a matrix we would get
                                                          
                                         [ 1 1 2  3 ]
                                         [ 1 2 4  6 ]
                                         [ 1 3 6  9 ]
                                         [ 1 4 8 12 ]
   3 It is not absolutely necessary that the input values be equally spaced, but it is more common.


               Figure 1.3: The polynomials 1, x, x^2 , and x^3 and their sampled values.


This matrix would not be invertible so in this case the sampled vectors do not form a basis of R4 .
    As another example we will sample the functions gn (x) = cos(nx/2) for n = 0, 1, . . . , 5 at the
values x = 0, π/3, 2π/3, π, 4π/3, 5π/3. If we use the resulting vectors as the columns of a matrix we
get
                                                                           
                             1       1        1      1     1         1
                                     √                                √ 
                           1 1/2 3          1/2     0 −1/2 −1/2 3 
                                                                           
                                                                           
                           1       1/2     −1/2 −1 −1/2            1/2     
                                                                           
                                                                           
                           1        0       −1      0     1         0      
                                                                           
                                                                           
                           1      −1/2     −1/2 1 −1/2            −1/2 
                                       √                              √
                                                                           
                             1 −1/2 3 1/2            0 −1/2 1/2 3
   This matrix would have a non-zero determinant so the vectors are linearly independent and are
therefore a basis for R6 .
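    This determinant is easy to verify in Maple; the sketch below rebuilds the matrix directly from
the formula gn (x) = cos(nx/2) (the exact value of the determinant is unimportant, only that it is
non-zero):

        >with(LinearAlgebra):
        >xv:=Vector(6, i->(i-1)*Pi/3):             # x = 0, Pi/3, ..., 5*Pi/3
        >M:=Matrix(6,6, (i,j)->cos((j-1)*xv[i]/2)):
        >simplify(Determinant(M));                 # a non-zero value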
   Figure 1.4 is a plot of the cosine functions used in this example and the sampled values.


            Figure 1.4: A basis for R6 generated by sampling a set of cosine functions.


    This figure shows 6 cosine functions. On each of the curves there are 6 points marked. The y
values at these points are the entries of the sampled vector.
    Sampling a set of functions in this way will not always give independent vectors, but there are
many important bases that are generated in this way.
    You can look at sampling as a technique of approximating a continuous function by a finite,
discrete set of points. In practice, sampling is not restricted to this procedure of finding a basis.
More commonly it is used as a means of simply collecting data. If any physical parameter is measured
at regular intervals the parameter is said to be sampled. The data collected can be seen as a vector.
    For example, a record was kept of the number of lynx trapped yearly in the Mackenzie River
District of Western Canada from 1821 to 1934. This data corresponds to a vector with 114 entries.
How can this vector in R114 be plotted? The most natural way is to plot the entries as the y
coordinates of a series of points where the x coordinates run from 1 to 114 (or from 1821 to 1934).
This would result in Figure 1.5.


                    Figure 1.5: Plot of the number of lynx captured against time.


    This data can be interpreted as representative of the overall lynx population. Plotting the vector
in this way reveals that the lynx population seems to oscillate over time. Why would anyone possibly
want to convert this vector to a different basis? At this point there doesn’t seem to be any possible
reason for changing to a new basis; but, as we will see later, there is a very important reason for
converting data sets to a new basis.

Two-dimensional Bases
One of the key ideas in this book is that you can interpret a list of data values as a vector, but
data can also be arranged in a two-dimensional format as rows and columns – that is, as a matrix.
One common example of this is a digitized image. A monochrome digital image such as a picture
on a computer screen can be seen as a two dimensional array of pixel values where each value in
the matrix indicates the gray level intensity of the corresponding pixel. A color image can be seen
as a combination of three such matrices, one each for the intensities of the red, blue, and green
components of each pixel.
   Let {h1 , h2 , . . . , h8 } be the Haar basis for R8 . Then each product hi hj^T will give an 8 × 8 matrix.
There will be 64 such matrices and, although we will not prove this, these matrices will in fact be a
basis for the set of all possible 8 × 8 matrices. In other words, any 8 × 8 image is just some linear
combination of these 64 basis images. Figure 1.6 is a picture of what these basis images look like.




                                Figure 1.6: A 2 dimensional Haar basis.



     So if you look, for example, at the image in row 3 column 2 it is generated by the outer product
h3 h2^T , where h3 = (1, 1, −1, −1, 0, 0, 0, 0) and h2 = (1, 1, 1, 1, −1, −1, −1, −1):

                          [  1  1  1  1 −1 −1 −1 −1 ]
                          [  1  1  1  1 −1 −1 −1 −1 ]
                          [ −1 −1 −1 −1  1  1  1  1 ]
                h3 h2^T = [ −1 −1 −1 −1  1  1  1  1 ]
                          [  0  0  0  0  0  0  0  0 ]
                          [  0  0  0  0  0  0  0  0 ]
                          [  0  0  0  0  0  0  0  0 ]
                          [  0  0  0  0  0  0  0  0 ]


and the black, gray, and white regions of the image correspond to entries of -1, 0, and 1 respectively.


   The third image in the bottom row would be h8 h3^T , where h8 = (0, 0, 0, 0, 0, 0, 1, −1). This
would be

                          [  0  0  0  0  0  0  0  0 ]
                          [  0  0  0  0  0  0  0  0 ]
                          [  0  0  0  0  0  0  0  0 ]
                h8 h3^T = [  0  0  0  0  0  0  0  0 ]
                          [  0  0  0  0  0  0  0  0 ]
                          [  0  0  0  0  0  0  0  0 ]
                          [  1  1 −1 −1  0  0  0  0 ]
                          [ −1 −1  1  1  0  0  0  0 ]
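    Outer products like these are one-liners in Maple: a column Vector times a transposed Vector
gives the corresponding matrix. A minimal sketch for the first of the two images above:

        >with(LinearAlgebra):
        >h2:=<1,1,1,1,-1,-1,-1,-1>:
        >h3:=<1,1,-1,-1,0,0,0,0>:
        >h3.Transpose(h2);                # the 8 x 8 basis image in row 3, column 2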

   Figure 1.6 consists of 64 8 × 8 pictures. Each picture by itself is not very interesting, but as
mentioned before any 8 × 8 image can be obtained by some linear combination of these.
   How can you find the coordinates of a two-dimensional array relative to a basis? Suppose A is a
square matrix and we want to convert it to the two-dimensional Haar basis as described above. Let
P be the change of basis matrix from the standard basis to the Haar basis. To convert matrix A we
first convert the columns of A to the Haar basis. This is done by computing the product P A. We
then convert the rows of the resulting matrix to the Haar basis. We can do this by converting the
rows to columns by taking the transpose and then multiplying by P . This gives P (P A)T = P AT P T .
We then take the transpose again to put rows and columns back into their original position. This
gives (P AT P T )T = P AP T .


Example 1.3.3
           To illustrate the above let

                                          [ 1 0 0 1 ]
                                          [ 0 1 1 0 ]
                                      A = [ 0 1 1 0 ]
                                          [ 1 0 0 1 ]

           and

                                                       [ 1  1  1  0 ]
                                                       [ 1  1 −1  0 ]
                              H = [h1  h2  h3  h4 ] =  [ 1 −1  0  1 ]
                                                       [ 1 −1  0 −1 ]

           Let P = H −1 , then to convert to the two-dimensional Haar basis we compute
                                                                        
                                            [ 1/2  0    0     0   ]
                                            [  0   0    0     0   ]
                                  P AP T  = [  0   0   1/2  −1/2  ]
                                            [  0   0  −1/2   1/2  ]

           The entries in this matrix are the weights given to hi hj^T to construct matrix A. In this
           case most of those weights are 0. If we just look at the non-zero weights we have

                     A = 1/2 (h1 h1^T + h3 h3^T + h4 h4^T ) − 1/2 (h3 h4^T + h4 h3^T )


           The first three terms correspond to the diagonal entries of P AP T , the last two correspond
           to the off-diagonal entries. If we compute this linear combination by first evaluating the


            diagonal and off-diagonal components separately we get

                         1/2 (h1 h1^T + h3 h3^T + h4 h4^T ) − 1/2 (h3 h4^T + h4 h3^T )

                             [  1   0  1/2 1/2 ]   [   0    0  −1/2  1/2 ]
                             [  0   1  1/2 1/2 ]   [   0    0   1/2 −1/2 ]
                         =   [ 1/2 1/2  1   0  ] + [ −1/2  1/2   0    0  ]
                             [ 1/2 1/2  0   1  ]   [  1/2 −1/2   0    0  ]

                             [ 1 0 0 1 ]
                             [ 0 1 1 0 ]
                         =   [ 0 1 1 0 ]
                             [ 1 0 0 1 ]

    There is a format for compressing digital images called JPEG compression. This is a technique
for reducing the amount of computer memory required to store a digital image. This technique
involves converting 8 × 8 images into a set of basis images that are generated from the Discrete
Cosine Basis. This basis is generated by sampling the functions

                                  fk (t) = cos( (π/8) k (t − 1/2) )

for t = 1, 2, 3, . . . , 8. The resulting basis vectors are then multiplied together in the same way as
with the Haar basis and we obtain 64 images. These basis images are shown in Figure 1.7. Again,
each image corresponds to an 8 × 8 matrix of the form uv^T .

                               Figure 1.7: A 2 dimensional cosine basis.
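    Although we will not reproduce the 64 images here, the sampled vectors themselves are easy to
generate in Maple. The following sketch takes k = 0, 1, . . . , 7 (an assumption consistent with
obtaining 64 basis images) and checks that the eight sampled vectors form a basis of R8 :

        >with(LinearAlgebra):
        >DCT:=Matrix(8,8, (t,k)->cos(Pi/8*(k-1)*(t-1/2))):
        >Rank(evalf(DCT));                # returns 8, so the columns form a basis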


    Exercises
                                               
  1. Let H be the Haar basis for R4 . Let v = (1, 0, 0, 1).
     Find [v]H .
  2. Let B = {e1 , e1 + e2 , e1 + e2 + e3 , e1 + e2 + e3 + e4 }. Let H be the Haar basis for R4 .
      (a) What is PH←B ?
      (b) What is PB←H ?
  3. Let f1 (x) = 1^x , f2 (x) = 2^x , f3 (x) = 3^x , f4 (x) = 4^x . If these functions are sampled at x =
     0, 1, 2, 3 do the resulting vectors form a basis of R4 ?
  4. Let f1 (x) = cos x, f2 (x) = cos 2x, f3 (x) = cos 3x, f4 (x) = cos 4x. If these functions are sampled
     at x = 0, π/4, π/2, 3π/4 do the resulting vectors form a basis of R4 ?
  5. Show that if you sample a line in R3 at three different points the resulting vectors will not be a
     basis for R3 .
  6. Let E = {e1 , e2 } be the standard basis of R2 . Write down the basis for the vector space of all
     2 × 2 matrices generated by products of the form ei ej^T . Write the matrix
                                           [ a b ]
                                           [ c d ]
     as a linear combination of these basis vectors.
  7. Let
                                 B = [b1  b2 ] = [ 1  1 ]
                                                 [ 1 −1 ]
     Write down the basis for the vector space of all 2 × 2 matrices generated by products of the
     form bi bj^T . Let
                                       A = [ 4 0 ]
                                           [ 2 1 ]
     and let P = B −1 . Show that P AP T gives the weights required to write A as a linear
     combination of the basis you found.
  8. Repeat the previous problem for
                              B = [ 1 0 ]          A = [ 2 4 ]
                                  [ 2 1 ]   and        [ 4 2 ]


                                        Using MAPLE

Example 1.
    In this example we will begin by using Maple to generate a basis for R5 by sampling the functions 1,
x, x2 , x3 , x4 at the values x = .2, .4, .6, .8, 1.0. This is precisely what happens with the Vandermonde
matrix that we saw in the previous chapter. So we can proceed as follows:


          >with(LinearAlgebra):
          >xv:=<.2,.4,.6,.8,1.0>;
          >B:=VandermondeMatrix(xv);
          >Determinant(B);
                           .00002949
    If you look at matrix B the connection between each of the 5 columns of B and each of the 5
functions being sampled should be clear.
    Now we will define two new vectors, u and w, by sampling f (x) = x^5 + 1 and g(x) = √x + √(1 − x)
at the same values. We now have two vectors in R5 and a basis, B, for R5 . We will find the coordinates
of these vectors relative to basis B. This just involves multiplying the vectors by the change of basis
matrix PB←E = B −1 .
    The following commands illustrate one method for sampling a function over a given set of values.
First we will define the functions to be sampled using the “arrow” notation. Next we need a list of values
where the functions will be sampled. We already have these values in xv. Finally we can use the map
command in Maple to evaluate the functions at the given values.


          >f:=x -> x^5+1;
          >g:=x -> sqrt(x)+sqrt(1-x);
          >u:=map(f, xv);
          >w:=map(g,xv);
          >ub:=B^(-1).u;
                        [1.0384, -.4384, 1.800, -3.400, 3.000]
          >wb:=B^(-1).w;
                        [1.0000, 3.0137, -8.5037, 10.9801, -5.4901]

                                                           
    This tells us that [u]B = (1.0384, −.4384, 1.8000, −3.4000, 3.0000) and
[w]B = (1.0000, 3.0137, −8.5037, 10.9801, −5.4901).

   Now our basis vectors, v1 , . . . , v5 , in this example can be seen as discrete approximations of the
functions 1, x, x2 , x3 , x4 and u was a discrete approximation of x5 + 1. Notice:

     • The discrete vector u is a linear combination of the discrete basis v1 , . . . , v5 .

     • The continuous function x5 +1 is not a linear combination of the continuous functions 1, x, x2 , x3 , x4 .


    We will combine 1, x, x2 , x3 , x4 using the weights of [u]B and [w]B , and then plot the results.
    There is another way of looking at what we are doing that might make things clearer. We started
with two functions f (x) and g(x). On each of these functions we chose five points. We then found the
interpolating polynomials of degree four passing through these points. In the following Maple commands
we will call these interpolating polynomials f1 and g1.
    There is one aspect of the following commands that sometimes causes confusion: we defined f and g
as functions in Maple , but f1 and g1 are being defined as expressions. The plot command in Maple
can be used with either functions or expressions with a slight difference in the syntax. We have opted
for using expressions with the plot command. When we enter f(x) and g(x) as inputs to the plot
commands below we obtain the expressions x^5 + 1 and √x + √(1 − x).


        >f1:=add(ub[i]*x^(i-1),i=1..5);
        >g1:=add(wb[i]*x^(i-1),i=1..5);
        >plot([f(x), f1],x=-.2..1.2,color=black,thickness=[1,2]);
        >plot([g(x), g1],x=0..1, color=black, thickness=[1,2]);


   This gives the plots shown in Figure 1.8 and Figure 1.9.
       Figure 1.8: Plot of f (x) and f1.                    Figure 1.9: Plot of g(x) and g1.

     We see that if we apply the weights obtained from the discrete case to the continuous case we get
a linear combination of 1, x, x^2, x^3, x^4 that gives a fairly good approximation to x^5 + 1 on the interval
[0, 1]. The plot seems to imply that the functions are very close except near x = 0, where we can see the
plots begin to diverge. The following plot gives a better idea of how close these two functions are. We
will plot the difference between the two functions:


        >plot(f(x)-f1,x=0..1,color=black);
        >plot(g(x)-g1,x=0..1,color=black);


   The resulting plots are shown in Figure 1.10 and Figure 1.11.
        Figure 1.10: Plot of f(x) − f1.                 Figure 1.11: Plot of g(x) − g1.

    This shows the vertical distance between the functions. We see that the linear combination we
computed oscillates around x^5 + 1, with the largest distance (as was clear from the previous plot)
occurring at x = 0 but with a comparatively small distance over most of the interval. The points where
the two functions agree (where the distance is 0) are precisely the points where we sampled the functions:
x = 0.2, 0.4, 0.6, 0.8, 1.0.
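    As a quick numerical check (a sketch using the f, f1, and xv already defined above), we can evaluate
the difference at the sample points; each value should be 0 up to floating point round-off:

        >seq( evalf( f(xv[i]) - eval(f1, x=xv[i]) ), i=1..5);  ## all (essentially) zero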

     Now let's create another basis for R^5. This time we will sample the same functions, but at the points
.4, .6, .8, 1, 1.2. Here we will again use the Vector constructor, with an index function, to define our vector of x values.


          >xv2:=Vector(5, i->.2+.2*i);
          >C:=VandermondeMatrix(xv2);

   Let’s call our new basis C. What is the change of basis matrix from B to C? From results in this
chapter it should be clear that this would be the matrix C^{-1}B. In Maple


          >C^(-1).B;


     This gives (after rounding) the change of basis matrix
$$\begin{bmatrix} 1 & -.2000 & .0400 & -.0080 & .0016 \\ 0 & 1 & -.4000 & .1200 & -.0320 \\ 0 & 0 & 1 & -.6000 & .2400 \\ 0 & 0 & 0 & 1 & -.8000 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$
Hard Question: What is the connection between the entries in this change of basis matrix and the
binomial theorem?

Example 2.
    In this example we will generate a basis for R^6 by sampling the functions cos kt for k = 0, 1, . . . , 5 at
the points 0, π/5, 2π/5, 3π/5, 4π/5, π. We will illustrate a slightly different approach from the previous
example.
    The command unapply that is used in the first line converts Maple expressions to Maple functions.
So, for example, the command unapply(cos(2*t), t) would return the function t -> cos(2*t).


        >for i from 1 to 6 do f[i]:=unapply(cos((i-1)*t),t) od;
        >tvals:=Vector( 6, i-> evalf((i-1)*Pi/5));
        >A:=Matrix(6,6, (i,j)->f[j](tvals[i]));
        >Determinant(A);
                      -62.5

   The last line confirms that the sampled vectors form a basis for R^6.
   Now we will sample the function
$$g(t) = \frac{t^2}{6} + \frac{1}{t+1}$$
at the same points and generate another vector in R^6. Then we will express this vector as a linear
combination of the basis vectors.


        >g:=t->t^2/6+1/(t+1);
        >u:=map(g,tvals);
        >x:=A^(-1).u; # calculate the weights

Finally, we will combine the continuous functions that we used to generate our basis using the weights
that we found and plot the result along with g(t).


        >p:=add(x[i]*f[i](t),i=1..6);
        >data:=[seq( [tvals[i],u[i]],i=1..6)]; # the sampled points of g(t)
        >p1:=plot([p,g(t)],t=0..Pi):
        >p2:=plot(data,style=point,symbol=box):
        >plots[display]([p1,p2]);

   We get the following plot:

                    [Plot of p and g(t) for t from 0 to π, with the sampled points of g(t) marked by boxes.]

    We again see that by using the weights we found we get a linear combination that agrees exactly at
the sampled points (where we computed our discrete basis).

Example 3
   In this example we will illustrate the idea of a two-dimensional basis. We will first generate a 20 × 20
matrix and display it as an image.


         >with(plots):
         >A:=Matrix(20,20,(i,j)-> if abs(i-10)+abs(j-10)<8 then 1 else 0 fi);
         >matrixplot(A,orientation=[0,0],heights=histogram,style=patchnogrid,
                     shading=zgrayscale,scaling=constrained);


    This defines a 20 × 20 matrix all of whose entries are 0 or 1. If we interpret the entries in the matrix
as a greyscale color where 0 is black and 1 is white, then the matrix gives the image shown in Figure
1.12:




                                  Figure 1.12: Matrix A as an image.

   We will now convert the image to a different basis. Any invertible matrix corresponds to a change of
basis matrix. We will define a band matrix B.


         >B:=BandMatrix([.05,.1,.2,.5,.2,.1,.05],3,20):
         >P:=B^(-1):
         >A1:=P.A.P^%T:
         >A2:=B.A.B^%T:
         >matrixplot(A1,orientation=[0,0],heights=histogram,style=patchnogrid,
                     shading=zgrayscale,scaling=constrained);
         >matrixplot(A2,orientation=[0,0],heights=histogram,style=patchnogrid,
                     shading=zgrayscale,scaling=constrained);


     This gives two other 20 × 20 matrices which correspond to the following images:




     Figure 1.13: Matrix A in a new basis.          Figure 1.14: Matrix A in a new basis.
Chapter 2

Eigenvalues and Eigenvectors

What happens when you multiply a vector by a square matrix? One simple answer to this question
is that you get another vector of the same size. Typically both the length and the direction of
the matrix-vector product will be different from those of the original vector, but in some cases it
is possible that the direction will stay the same.1 If the vectors do have the same direction it
means that multiplying the original vector by the matrix is equivalent to multiplying that vector by
a scalar. At first the fact that this can occur might seem to be nothing more than a curiosity or
a coincidence, but in fact it will lead to some of the most important applications of linear algebra.
This chapter and the next will look at the significance of this idea.



2.1       Eigenvectors and Eigenvalues
    We start with the main definition of the chapter.


Definition 1 Let A be an n × n matrix. If v is a non-zero vector such that Av = λv for some
scalar λ then v is called an eigenvector of A and λ is called an eigenvalue of A.


    For example, let
$$A = \begin{bmatrix} 1 & -1 \\ 2 & 4 \end{bmatrix}, \qquad u = \begin{bmatrix} 2 \\ 1 \end{bmatrix}, \qquad v = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$$
Then we have
$$Au = \begin{bmatrix} 1 \\ 8 \end{bmatrix} \qquad\text{and}\qquad Av = \begin{bmatrix} 3 \\ -6 \end{bmatrix}$$
    Notice in Figure 2.1 that Au and u are not parallel. Au is not a scalar multiple of u. On the
other hand Av is parallel to v. Av is a scalar multiple of v. In fact, Av = 3v so v is an eigenvector
of A with corresponding eigenvalue 3. So multiplying v by A has the effect of scaling v by a factor
of 3.

   1 In this book when we say two non-zero vectors have the same direction we will mean that one vector is a scalar
multiple of the other, including the possibility that this scalar is negative. In other words, both vectors lie
on the same line through the origin. If one vector is a negative multiple of the other they are said to have the same
direction but a different sense.



                                      Figure 2.1: Plot of u, v, Au, and Av.


Theorem 2.1 Let A be an n × n matrix. Then A is invertible if and only if the number 0 is not
an eigenvalue of A.


Proof. Suppose 0 is an eigenvalue of A, then Av = 0v = 0 for some non-zero vector v. This means
that the vector v would be a non-trivial solution of the system Ax = 0. Since this system has a
non-trivial solution it follows that A is not invertible.
    On the other hand if A is not invertible then Ax = 0 has non-trivial solutions, and so there is a
non-zero vector v such that Av = 0 = 0v. Therefore 0 is an eigenvalue of A with v a corresponding
eigenvector.
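
   As a quick Maple illustration of this theorem (a sketch; the matrix here is our own choice, with a
second column that is twice the first):

        >with(LinearAlgebra):
        >A:=<<1,2>|<2,4>>;   ## singular: the columns are linearly dependent
        >Determinant(A);     ## returns 0, so A is not invertible
        >Eigenvalues(A);     ## returns a vector containing 0 and 5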



Theorem 2.2 If v1 , v2 , . . . , vp are eigenvectors of matrix A that correspond to distinct eigenvalues
λ1 , λ2 , . . . , λp , then the set {v1 , v2 , . . . , vp } is linearly independent.


Proof. The set {v1 , v2 , . . . , vp } is either linearly dependent or linearly independent. We will begin
by assuming this set is linearly dependent and show that this assumption leads to a contradiction.
We can then conclude that the set must be linearly independent.
     If {v_1, v_2, . . . , v_p} is linearly dependent then there is some number k < p such that {v_1, v_2, . . . , v_k}
is linearly independent2 and
$$v_{k+1} = c_1 v_1 + c_2 v_2 + \cdots + c_k v_k \tag{Equation 1}$$
for some scalars c_1, c_2, . . . , c_k. Multiplying both sides of Equation 1 by A we have
$$A v_{k+1} = c_1 A v_1 + c_2 A v_2 + \cdots + c_k A v_k \tag{Equation 2}$$
   2 This is basically saying that if the set of vectors is linearly dependent then one of the vectors is a linear combination

of the others. In fact, it must be some linear combination of a linearly independent subset of the other vectors. So
just suppose that the set of vectors is ordered so that these independent vectors come first.


Since the v_i are eigenvectors of A we have
$$\lambda_{k+1} v_{k+1} = c_1 \lambda_1 v_1 + c_2 \lambda_2 v_2 + \cdots + c_k \lambda_k v_k \tag{Equation 3}$$
Multiplying Equation 1 by λ_{k+1} we get
$$\lambda_{k+1} v_{k+1} = c_1 \lambda_{k+1} v_1 + c_2 \lambda_{k+1} v_2 + \cdots + c_k \lambda_{k+1} v_k \tag{Equation 4}$$
Subtracting Equation 4 from Equation 3 we get
$$0 = c_1(\lambda_1 - \lambda_{k+1}) v_1 + c_2(\lambda_2 - \lambda_{k+1}) v_2 + \cdots + c_k(\lambda_k - \lambda_{k+1}) v_k$$
Because {v_1, v_2, . . . , v_k} is linearly independent, the weights in this last equation must all be 0. But
λ_i − λ_{k+1} ≠ 0 for i = 1, . . . , k since the eigenvalues are distinct. Therefore c_i = 0 for i = 1, . . . , k.
But this implies that v_{k+1} = 0, which is impossible since this vector is an eigenvector of A. We can
then conclude that {v_1, v_2, . . . , v_p} is linearly independent.


Computing Eigenvalues and Eigenvectors
Given a square matrix A an eigenvector is by definition a non-zero vector v that satisfies the equation
Av = λv for some scalar λ. This equation can be rewritten as

                                                  Av − λv = 0
and then by factoring we get

                                                 (A − λI) v = 0
It then follows that the eigenvector v is a non-trivial solution of the system (A − λI)x = 0 since
an eigenvector is a non-zero vector. This argument works in reverse also. That is, any non-trivial
solution of (A − λI)x = 0 is an eigenvector of A corresponding to eigenvalue λ. So we have the
important result that the eigenvectors of A are exactly the non-trivial solutions of (A − λI)x = 0.
    But this system has non-trivial solutions if and only if A − λI is not invertible, and this happens
if and only if det(A − λI) = 0. This is the idea that we will use to find the eigenvalues of a matrix:


                      λ is an eigenvalue of matrix A if and only if det(A − λI) = 0.



Definition 2 Given an n × n matrix A the polynomial det(A − λI) is called the characteristic
polynomial of A. The equation det(A − λI) = 0 is called the characteristic equation of A.

    One of the implications of the above discussion is that if A is an n × n matrix then the charac-
teristic polynomial of A has degree n. A must therefore have n eigenvalues though these eigenvalues
don’t have to be distinct.3
    If λ is an eigenvalue of matrix A, then the corresponding eigenvectors are the non-trivial solutions
to (A − λI)x = 0. But the solutions to this equation constitute the null space of A − λI, so the
   3 The roots of a polynomial can be found by factoring. If one factor is repeated k times the corresponding root is

said to have multiplicity k. For example, the sixth degree polynomial p(t) = (t − 4)(t − 5)2 (t − 7)3 has six roots.
The six roots are not distinct. The roots are 4, 5 (multiplicity 2), and 7 (multiplicity 3).


eigenvectors corresponding to a specific eigenvalue (along with the zero vector) must form a subspace
of Rn . This leads to the following definition.

Definition 3 Let A be an n × n matrix and let λ be an eigenvalue of A. The set of eigenvec-
tors corresponding to λ along with the zero vector is a subspace of Rn . This subspace is called an
eigenspace. We will use the notation Eλ to refer to the eigenspace corresponding to eigenvalue λ.

   Note that the zero vector itself is not an eigenvector for any eigenvalue, but the zero vector is in
the eigenspace of every eigenvalue.

Example 2.1.1

            Suppose we want to find the eigenvalues and eigenvectors of $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$. The first step
            is to compute the characteristic polynomial det(A − λI).
            $$|A - \lambda I| = \begin{vmatrix} 2-\lambda & 1 \\ 1 & 2-\lambda \end{vmatrix} = \lambda^2 - 4\lambda + 3 = (\lambda - 3)(\lambda - 1)$$
            This gives eigenvalues of 3 and 1.

            For λ = 3 we find a basis for the eigenspace by solving (A − 3I)x = 0 as follows
            $$\left[\begin{array}{rr|r} -1 & 1 & 0 \\ 1 & -1 & 0 \end{array}\right] \sim \left[\begin{array}{rr|r} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right]$$
            The solution here is $x = t\begin{bmatrix} 1 \\ 1 \end{bmatrix}$. So $E_3$ is a one dimensional subspace of R^2. Any non-zero
            vector from this subspace is an eigenvector with an eigenvalue of 3. If, for example, we
            take $u = \begin{bmatrix} 4 \\ 4 \end{bmatrix}$ we get
            $$Au = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} 4 \\ 4 \end{bmatrix} = \begin{bmatrix} 8+4 \\ 4+8 \end{bmatrix} = \begin{bmatrix} 12 \\ 12 \end{bmatrix} = 3\begin{bmatrix} 4 \\ 4 \end{bmatrix}$$

            Similarly for λ = 1 we solve (A − I)x = 0:
            $$\left[\begin{array}{rr|r} 1 & 1 & 0 \\ 1 & 1 & 0 \end{array}\right] \sim \left[\begin{array}{rr|r} 1 & 1 & 0 \\ 0 & 0 & 0 \end{array}\right]$$
            Here we have the solution $x = t\begin{bmatrix} 1 \\ -1 \end{bmatrix}$. So $E_1$ is a line through the origin in R^2. Every
            point on this line except the origin is an eigenvector with λ = 1. So, for example, if we
            take $v = \begin{bmatrix} 5 \\ -5 \end{bmatrix}$ we get
            $$Av = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} 5 \\ -5 \end{bmatrix} = \begin{bmatrix} 10-5 \\ 5-10 \end{bmatrix} = \begin{bmatrix} 5 \\ -5 \end{bmatrix}$$

            Now let $w = 2\begin{bmatrix} 1 \\ 1 \end{bmatrix} + 1\begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$. The vector w is a linear combination of eigenvectors
            but is not an eigenvector itself since
            $$Aw = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 7 \\ 5 \end{bmatrix}$$


           and this result is not a scalar multiple of w. The following diagram gives an indication
           of what happens when w is multiplied by A.

                [Diagram: w and Aw; the component of w in E_3 is scaled by 3 and the component in E_1 is scaled by 1.]

Figure 2.2: Vector w is multiplied by A. w has components in each eigenspace which are stretched
by different amounts when multiplied by A.
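
   If you want to check this example in Maple, the following sketch (assuming the LinearAlgebra
package has been loaded) recovers the eigenvalues and one eigenvector for each eigenspace; Maple may
scale the eigenvectors differently than we did.

        >A:=<<2,1>|<1,2>>;
        >Eigenvectors(A);    ## eigenvalues 3 and 1; the columns of the second
                             ## output are corresponding eigenvectors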



        REMARK: There is a simple method for finding eigenvectors of a 2 × 2 matrix. Suppose
        you have a 2 × 2 matrix and you want to find the eigenvectors – that is, you want to find
        a basis for the eigenspaces. You first subtract the eigenvalue down the main diagonal. At
        this point you would have a matrix of the form $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$. When you reduce this matrix you
        have to get a row of zeroes. In other words you can make one of the rows equal to 0 right
        away. It usually won't matter which row. So at this point you would have a matrix of the
        form $\begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix}$. Now the eigenspace is just the null space of this matrix, but the null space
        consists of vectors orthogonal to the rows of the matrix. To find a vector orthogonal to
        $\begin{bmatrix} a \\ b \end{bmatrix}$ you just have to switch the entries and change the sign of one of them. Doing this
        you get $\begin{bmatrix} -b \\ a \end{bmatrix}$ as a basis of the eigenspace.

        For example, the matrix $\begin{bmatrix} 5 & 3 \\ 4 & 9 \end{bmatrix}$ has an eigenvalue of λ = 3. What is a basis of the
        corresponding eigenspace? Subtracting 3 down the diagonal (actually just from the 5 in
        the first row) and making one of the rows a row of zeroes we get $\begin{bmatrix} 2 & 3 \\ 0 & 0 \end{bmatrix}$. A basis for the
        eigenspace will be any vector orthogonal to the first row. So a basis for this eigenspace
        would be $\begin{bmatrix} 3 \\ -2 \end{bmatrix}$.
        Remember, this method is only valid for 2 × 2 matrices.
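
   We can confirm the example above in Maple by asking for the null space directly (a sketch; NullSpace
is part of the LinearAlgebra package and may scale the basis vector differently):

        >A:=<<5,4>|<3,9>>;
        >NullSpace(A - 3*IdentityMatrix(2));   ## a basis for E_3, some multiple of [3,-2]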


Example 2.1.2

            This next example will involve a 3 × 3 matrix. Let $A = \begin{bmatrix} 0 & 1 & 0 \\ 8 & 0 & 8 \\ 0 & 1 & 0 \end{bmatrix}$.

            The characteristic polynomial of A would be
            $$\begin{vmatrix} -\lambda & 1 & 0 \\ 8 & -\lambda & 8 \\ 0 & 1 & -\lambda \end{vmatrix} = 16\lambda - \lambda^3 = \lambda(\lambda^2 - 16)$$
            We then see that A has 3 eigenvalues: λ_1 = 0, λ_2 = 4, and λ_3 = −4. If you look at the
            columns of A it is clear that they are not linearly independent and so A is not invertible.
            For this reason we should have expected 0 to be an eigenvalue.
            For λ_1 = 0 we set up and reduce the following augmented matrix:
            $$\left[\begin{array}{ccc|c} 0 & 1 & 0 & 0 \\ 8 & 0 & 8 & 0 \\ 0 & 1 & 0 & 0 \end{array}\right] \sim \left[\begin{array}{ccc|c} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]$$
            It is then easy to obtain $\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$ as a basis for $E_0$.
            For λ_2 = 4 the same procedure gives
            $$\left[\begin{array}{ccc|c} -4 & 1 & 0 & 0 \\ 8 & -4 & 8 & 0 \\ 0 & 1 & -4 & 0 \end{array}\right] \sim \left[\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & -4 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]$$
            This gives a basis of $\begin{bmatrix} 1 \\ 4 \\ 1 \end{bmatrix}$ for $E_4$.
            We'll skip the details but a similar procedure gives a basis of $\begin{bmatrix} 1 \\ -4 \\ 1 \end{bmatrix}$ for $E_{-4}$.

   The preceding examples illustrate a 3-step method for finding the eigenvalues and eigenvectors of a
square matrix; a Maple version of the steps follows the list. The steps are:
     • Write down A − λI and calculate its determinant to get the characteristic polynomial.
     • Find the roots of the characteristic polynomial. These are the eigenvalues of A.
     • For each eigenvalue set up the augmented matrix of the system (A − λI)x = 0 and solve using
       elementary row operations. If you haven’t made any calculation errors this system must have
       non-trivial solutions. These non-trivial solutions are exactly the eigenvectors for the eigenvalue
       used.
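
   Here is one way the three steps might look in Maple, using the matrix of Example 2.1.2 and the
eigenvalue λ = 4 (a sketch; the NullSpace call carries out Step 3 without writing down the augmented
matrix, and may scale the basis vector differently):

        >with(LinearAlgebra):
        >A:=<<0,8,0>|<1,0,1>|<0,8,0>>;
        >cp:=CharacteristicPolynomial(A,lambda);     ## Step 1
        >solve(cp=0,lambda);                         ## Step 2: returns 0, 4, -4
        >NullSpace(A - 4*IdentityMatrix(3));         ## Step 3: a basis for E_4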
   For large matrices there are more sophisticated numerical procedures that are used to find eigen-
values and eigenvectors, but for small matrices the above steps will often work easily. In some cases
there are also shortcuts that can be used. For example, look at the following theorem.


Theorem 2.3 The eigenvalues of a triangular matrix are the entries on the diagonal.


   This theorem is a simple consequence of the fact that the determinant of a triangular matrix is
the product of the entries on the diagonal.

Example 2.1.3


            Find the eigenvalues and a basis for each eigenspace of
            $$A = \begin{bmatrix} 3 & 1 & 1 \\ 0 & 3 & 1 \\ 0 & 0 & 1 \end{bmatrix}$$
            In this case we get
            $$\det(A - \lambda I) = \begin{vmatrix} 3-\lambda & 1 & 1 \\ 0 & 3-\lambda & 1 \\ 0 & 0 & 1-\lambda \end{vmatrix} = (3-\lambda)^2(1-\lambda)$$
            So this matrix has two eigenvalues: λ = 3 is an eigenvalue of multiplicity 2, and λ = 1
            is an eigenvalue of multiplicity 1.
            To find a basis for $E_3$ we have the following
            $$\left[\begin{array}{ccc|c} 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & -2 & 0 \end{array}\right] \sim \left[\begin{array}{ccc|c} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]$$
            This gives a basis of $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, so $E_3$ is in fact the $x_1$ axis.
            For λ = 1 we get
            $$\left[\begin{array}{ccc|c} 2 & 1 & 1 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right] \sim \left[\begin{array}{ccc|c} 1 & 0 & 1/4 & 0 \\ 0 & 1 & 1/2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]$$
            This gives a basis $\begin{bmatrix} -1/4 \\ -1/2 \\ 1 \end{bmatrix}$ for $E_1$, or if you prefer this vector could be scaled by −4,
            resulting in the basis vector $\begin{bmatrix} 1 \\ 2 \\ -4 \end{bmatrix}$.

   We end this section with one more definition.

Definition 4 An eigenbasis of R^n is a basis composed entirely of eigenvectors of some n × n
matrix A.


      Exercises
     1. Is λ = 2 an eigenvalue of the following matrices?
$$\text{(a)}\ \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix} \quad \text{(b)}\ \begin{bmatrix} 3 & 3 \\ 2 & 8 \end{bmatrix} \quad \text{(c)}\ \begin{bmatrix} 0 & 2 \\ 1 & 0 \end{bmatrix}$$

     2. Is λ = 4 an eigenvalue of the following matrices?
$$\text{(a)}\ \begin{bmatrix} 2 & 0 & 1 \\ 4 & 4 & -2 \\ -2 & 0 & 5 \end{bmatrix} \quad \text{(b)}\ \begin{bmatrix} 1 & 1 & 1 \\ 3 & 3 & -1 \\ 0 & 1 & 2 \end{bmatrix} \quad \text{(c)}\ \begin{bmatrix} 4 & 4 & 4 \\ 4 & 4 & 4 \\ 4 & 4 & 4 \end{bmatrix}$$
              
     3. Is $\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$ an eigenvector of the following matrices?
$$\text{(a)}\ \begin{bmatrix} 2 & 3 & 1 \\ 1 & 3 & 2 \\ 3 & 4 & 1 \end{bmatrix} \quad \text{(b)}\ \begin{bmatrix} 0 & -1 & 2 \\ 2 & 2 & -3 \\ 4 & 1 & 0 \end{bmatrix} \quad \text{(c)}\ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

     4. Show that $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ is an eigenvector of $\begin{bmatrix} a & b & c \\ c & a & b \\ b & c & a \end{bmatrix}$. What is the corresponding eigenvalue?
                            
     5. The matrix $\begin{bmatrix} -3 & 1 & -3 \\ 20 & 3 & 10 \\ 2 & -2 & 4 \end{bmatrix}$ has eigenvalues 3 and −2. Find a basis for $E_3$ and $E_{-2}$.

     6. Let $A = \begin{bmatrix} 4 & 4 \\ k & 1 \end{bmatrix}$. For what value(s) of k (if any) is

          (a) λ = 2 an eigenvalue of A?
          (b) is $\begin{bmatrix} 2 \\ 3 \end{bmatrix}$ an eigenvector of A?
         (c) is A not diagonalizable?
         (d) is A not invertible?

     7. Find the eigenvalues of the following matrices. Find a basis of each eigenspace.
$$\text{(a)}\ \begin{bmatrix} 0 & 1 \\ 2 & 1 \end{bmatrix} \quad \text{(b)}\ \begin{bmatrix} 1 & 3 \\ -1 & -3 \end{bmatrix} \quad \text{(c)}\ \begin{bmatrix} 2 & 3 \\ 1 & 4 \end{bmatrix} \quad \text{(d)}\ \begin{bmatrix} 7 & -2 \\ 2 & 3 \end{bmatrix}$$

     8. Find the eigenvalues of the following matrices. Find a basis for each eigenspace.
$$\text{(a)}\ \begin{bmatrix} 0 & 0 & 1 \\ 0 & 5 & 0 \\ -2 & 0 & 3 \end{bmatrix} \quad \text{(b)}\ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 3 & 0 & -2 \end{bmatrix} \quad \text{(c)}\ \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} \quad \text{(d)}\ \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}$$

   9. Find the eigenvalues of the following matrices. Find a basis for each eigenspace.
$$\text{(a)}\ \begin{bmatrix} 2 & 1 & 1 \\ 0 & 2 & 1 \\ 0 & 0 & 1 \end{bmatrix} \quad \text{(b)}\ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \quad \text{(c)}\ \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix} \quad \text{(d)}\ \begin{bmatrix} 0 & 0 & 2 \\ 0 & 2 & 0 \\ 2 & 0 & 0 \end{bmatrix}$$

  10. Find a basis for each eigenspace of
$$\begin{bmatrix} 2 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 2 & 1 & 2 & 0 \\ 2 & 1 & 1 & 1 \end{bmatrix}$$

  11. Find a basis for each eigenspace of
$$\begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 \end{bmatrix}$$

 12. Construct specific numerical 2 × 2 matrices, A and B, to illustrate the following statements:
      (a) If λ1 is an eigenvalue of A and λ2 is an eigenvalue of B, then you cannot conclude that
          λ1 + λ2 is an eigenvalue of A + B.
      (b) If λ1 is an eigenvalue of A and λ2 is an eigenvalue of B, then you cannot conclude that λ1 λ2
          is an eigenvalue of AB.
                        
  13. Let $A = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}$. Notice that when you multiply a vector in R^3 by A the order of the entries
      gets reversed. What do eigenvectors of A look like? What happens to these eigenvectors when
      you reverse the order of the entries?
  14. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Show that the characteristic polynomial is $\lambda^2 - \operatorname{tr}(A)\lambda + \det A$. Here tr(A)
      stands for the trace of A, that is the sum of the entries on the main diagonal of A. This gives a
      simple procedure for finding the characteristic polynomial of a 2 × 2 matrix.
  15. Show that the trace of a matrix is the sum of the eigenvalues. This means that similar matrices
      have the same trace.


 16. Let A be a square matrix. Show that A and AT have the same eigenvalues. (Hint: show that they
     have the same characteristic polynomial.)
 17. Give an example to show that A and AT do not necessarily have the same eigenspaces.
 18. Let A and B be n×n matrices. If B is invertible, show that AB and BA have the same eigenvalues
     by showing that they have the same characteristic equation.
 19. Suppose that A is an m × n matrix and B is an n × m matrix.
      (a) Show that AB and BA have the same non-zero eigenvalues.
      (b) Show that if m > n then λ = 0 must be an eigenvalue of AB.
       (c) What are the eigenvalues of the matrices $v^Tv$ and $vv^T$ where v is in R^n?
       (d) Let $A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}$. What are the eigenvalues of AB and BA?
 20. Let A be an invertible matrix. Show that if v is an eigenvector of A with corresponding eigenvalue
     λ, then v is also an eigenvector of A−1 with corresponding eigenvalue 1/λ.
 21. Let A be a square matrix. Show that if v is an eigenvector of A with corresponding eigenvalue λ,
     then v is also an eigenvector of A2 with corresponding eigenvalue λ2 .
 22. Show that if v is an eigenvector of both A and B, then v is an eigenvector of AB.
 23. A matrix, P , is called idempotent if P 2 = P . Show that if a matrix P is idempotent then the
     only possible eigenvalues of P are 0 and 1.
  24. The Cayley-Hamilton Theorem says that any square matrix satisfies its own characteristic
      equation (interpreted as a matrix equation). Verify the Cayley-Hamilton Theorem for the following
      specific matrices:
$$\text{(a)}\ \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \quad \text{(b)}\ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad \text{(c)}\ \begin{bmatrix} 0 & 2 & 0 \\ 0 & 0 & .5 \\ 1 & 0 & 0 \end{bmatrix} \quad \text{(d)}\ \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$


                                    Using MAPLE


Example 1
    The LinearAlgebra package in Maple contains several procedures related to eigenvalues and eigen-
vectors which we will illustrate in this section. We begin by defining a matrix $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$. The
command CharacteristicPolynomial computes the characteristic polynomial of a matrix. This com-
mand requires two inputs: the matrix and the variable that will be used in the characteristic polynomial.
As illustrated below, if you want to use the variable λ for the characteristic polynomial you must spell it
out. You can use any variable you wish, so in the third example below we use t for the variable in the
characteristic polynomial. If we set the characteristic polynomial equal to 0 and solve, we will get the
eigenvalues.

        >A:=<<1,3>|<2,4>>;
        >cp:=CharacteristicPolynomial(A,lambda);

                                           cp := λ^2 − 5λ − 2

        >ev:=solve(cp=0,lambda);           ## This returns the eigenvalues of A

                                  5/2 + (1/2)√33, 5/2 − (1/2)√33

        >cp2:=CharacteristicPolynomial(A^(-1),lambda); ## The charpoly of the inverse of A

                                         cp2 := λ^2 + (5/2)λ − 1/2

        >solve(cp2=0,lambda); ## the eigenvalues of the inverse of A

                                 −5/4 + (1/4)√33, −5/4 − (1/4)√33

        >rationalize(1/ev[1]); ## the reciprocals of the eigenvalues of A
                               ## are the eigenvalues of the inverse of A

                                        −5/4 + (1/4)√33

        >cp3:=CharacteristicPolynomial(A^%T,t);
        >solve(cp3=0,t); ## A and the transpose of A have the same eigenvalues

                                  5/2 + (1/2)√33, 5/2 − (1/2)√33
   There are more direct ways of finding the eigenvalues:

        >Eigenvalues(A);
        >Eigenvectors(A);


    The first command is pretty straightforward. It returns a vector whose entries are the eigenvalues of
A. The second command is more complicated; it returns
$$\begin{bmatrix} \frac{5}{2} + \frac{\sqrt{33}}{2} \\[4pt] \frac{5}{2} - \frac{\sqrt{33}}{2} \end{bmatrix}, \qquad \begin{bmatrix} 4\,(3+\sqrt{33})^{-1} & 4\,(3-\sqrt{33})^{-1} \\ 1 & 1 \end{bmatrix}$$
    This is a vector containing the eigenvalues and a matrix whose columns are the corresponding eigenvec-
tors.
    In the above example Maple found the exact values for the eigenvalues (as opposed to floating point
decimal approximations). For large matrices the exact values might be extremely complicated expressions,
or they might even be impossible for Maple to compute. In these cases it is better to have Maple
compute floating point approximations for the eigenvalues. You can force Maple to do this if you enter
the values in the matrix as decimals (actually, if only one entry in the matrix is entered as a decimal that is
enough to force Maple to compute the eigenvalues in floating point format). In the next example we
will use the 6 × 6 matrix
$$A = \begin{bmatrix} 2 & 1 & 0 & 0 & 0 & 0 \\ 1 & 2 & 1 & 0 & 0 & 0 \\ 0 & 1 & 2 & 1 & 0 & 0 \\ 0 & 0 & 1 & 2 & 1 & 0 \\ 0 & 0 & 0 & 1 & 2 & 1 \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix}$$
          >A:=BandMatrix([1, 2.0, 1], 1, 6);
          >Eigenvalues(A);
                     .1980623, .7530204, 1.554958, 2.445042, 3.246980, 3.801938
     The first command defines a 6 × 6 band matrix. Notice that one of the entries is given as a decimal.
In this case Maple returns floating point approximations for the eigenvalues. Try changing the 2.0 in
the definition of A to the exact value of 2 and see what Maple returns. This should convince you that
floating point notation can sometimes be clearer than the exact values.
     Finally, in the exercises for the previous section we mentioned the Cayley-Hamilton Theorem. We will
illustrate this theorem using Maple.
          >A:=<<1,2,3>|<2,1,2>|<3,2,1>>;
          >cp:=CharacteristicPolynomial(A,t);
          >subs(t=A,cp);
          >simplify(%);
                                            0
     The first line defines a matrix in Maple. The second line computes the characteristic polynomial of
that matrix. The third line substitutes the matrix itself into the characteristic polynomial. This returns
the expression A^3 − 3A^2 − 14A − 8 but doesn't evaluate it. (Note: Maple is smart enough to understand
that the last term is now interpreted as −8I.) Finally, the fourth line evaluates the substitution and
returns the zero matrix. If you define A to be any square matrix this last result will always be the zero
matrix. That is the Cayley-Hamilton Theorem: any matrix satisfies its own characteristic polynomial.
     One consequence of this is that any positive integer power of an n × n matrix can always be expressed
as a linear combination of powers of that matrix lower than n (including $A^0 = I$). For example, in the
above case we can write:
$$\begin{aligned}
A^4 &= A(A^3) \\
    &= A(3A^2 + 14A + 8I) \\
    &= 3A^3 + 14A^2 + 8A \\
    &= 3(3A^2 + 14A + 8I) + 14A^2 + 8A \\
    &= 23A^2 + 50A + 24I
\end{aligned}$$


   If we wanted to reduce A^10 to such a linear combination we could use Maple:
        >A:='A':   ### we want to use A purely symbolically
        >algsubs(A^3-3*A^2-14*A-8=0, A^10);
This returns our solution:
$$736255A^2 + 1988250A + 1032504I$$


Example 2
   Let $A = \begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix}$. We know that when a vector, v, in R^2 is multiplied by A then the direction of Av
will typically be different from the direction of v. What is the angle between v and Av?
    In this example we will plot the angle between Av and v for unit vectors $v = \begin{bmatrix} \cos(t) \\ \sin(t) \end{bmatrix}$. We will use
the VectorAngle command which computes the angle between two vectors.

        >A:=< <1,3>|<3,1>>:
        >v:=<cos(t),sin(t)>:
        >plot( VectorAngle(v, A.v), t=0..2*Pi);


                              Figure 2.3: The angle between v and Av.

    This plot is shown in Figure 2.3. What is the significance of the maximum and minimum values of
this plot? Why are there 4 of them?
    We also know that the length of a vector will typically change after it is multiplied by a matrix. We
will plot the ratio
$$\frac{\|Av\|}{\|v\|}$$
This ratio represents the stretching factor of vectors when multiplied by A. Notice that since v is a unit
vector the denominator will be 1. There is a command in the LinearAlgebra
package which will find the length of a vector. This command is Norm and must be used with the syntax
Norm(v, 2). We will plot this ratio along with the previous plot.

         >A:=<<1,3>|<3,1>>:
         >v:=<cos(t),sin(t)>:
         >plot( [VectorAngle(v, A.v), Norm(A.v,2)], t=0..2*Pi);

                          Figure 2.4: The effects of A on length and direction.

    The result is shown in Figure 2.4. Explain the relation between this plot and the output of the
following Maple command:
         >Eigenvectors(A);
                   [-2, 4], [[-1, 1], [1, 1]]
     We will redo the above with the matrix
$$A = \begin{bmatrix} 2 & 1 \\ 3 & 4 \end{bmatrix}$$

         >A:=<<2,3>|<1,4>>:
         >v:=<cos(t),sin(t)>:
         >Eigenvectors(A);
             [1.00, 5.00], [ [-.707, .707], [-.35, -1.06]]
         >plot( [VectorAngle(v, A.v), Norm(A.v,2)], t=0..2*Pi);

   The output is shown in Figure 2.5. Why does this plot have four minima while the previous
example had two? We have also drawn vertical lines at these minima. What is the significance of
the intersection of these vertical lines and the upper curve (the norm of Av)?
   Redo this type of plot for the matrices
$$\begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix} \qquad\text{and}\qquad \begin{bmatrix} .8 & .6 \\ -.6 & .8 \end{bmatrix}$$



                        Figure 2.5: The effects of A on length and direction.


2.2     Diagonalization
Definition 5 Two square matrices, A and B, of the same size are said to be similar if there is an
invertible matrix P such that B = P −1 AP .

   Note that the condition in the above definition is the same as saying $A = PBP^{-1}$; or if we let
$Q = P^{-1}$ it is equivalent to saying $A = Q^{-1}BQ$.
   The significance of similarity between matrices will become clearer as the course goes on, but for
now there is one particular consequence of similarity that will be pointed out.


Theorem 2.4 If n × n matrices A and B are similar then they have the same characteristic poly-
nomial and hence the same eigenvalues.


Proof. Suppose $B = P^{-1}AP$. Then
$$\begin{aligned}
B - \lambda I &= P^{-1}AP - \lambda P^{-1}P \\
              &= P^{-1}(AP - \lambda P) \\
              &= P^{-1}(A - \lambda I)P
\end{aligned}$$
   It then follows that:
$$\begin{aligned}
\det(B - \lambda I) &= \det\!\left(P^{-1}(A - \lambda I)P\right) \\
                    &= \det(P^{-1}) \cdot \det(A - \lambda I) \cdot \det(P) \\
                    &= \frac{1}{\det(P)} \cdot \det(A - \lambda I) \cdot \det(P) \\
                    &= \det(A - \lambda I)
\end{aligned}$$


   We then see that A and B have the same characteristic polynomial and from this it follows that
they have the same eigenvalues (with the same multiplicities).



Example 2.2.1

              Let $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$. The trace of A is 4 and the determinant is 3, so A has the characteristic
              polynomial $\lambda^2 - 4\lambda + 3$. In Example 2.1.1 we saw that A has eigenvalues of 3 and 1 with
              corresponding eigenvectors $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$.
              We will now define a new matrix, B, that will be similar to A. First, we define an
              arbitrary invertible 2 × 2 matrix, P. We will let $P = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix}$ and then define
              $$B = P^{-1}AP = \begin{bmatrix} -1 & 2 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 3 \\ 0 & 1 \end{bmatrix}$$
              Then A and B are similar matrices. Since B is triangular it is easy to see that the
              eigenvalues of B are 3 and 1. These eigenvalues have corresponding eigenvectors of
              $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 3 \\ -2 \end{bmatrix}$, so even though similar matrices have the same eigenvalues they have, in
              general, different eigenspaces.
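
   This example is easy to check in Maple (a sketch, assuming the LinearAlgebra package is loaded):
similar matrices report the same characteristic polynomial, but the eigenvector matrices returned for A
and B differ.

        >A:=<<2,1>|<1,2>>: P:=<<1,1>|<2,1>>:
        >B:=P^(-1).A.P;
        >CharacteristicPolynomial(A,lambda);   ## lambda^2 - 4*lambda + 3
        >CharacteristicPolynomial(B,lambda);   ## the same polynomial
        >Eigenvectors(A); Eigenvectors(B);     ## same eigenvalues, different eigenvectors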



Definition 6 A square matrix A is said to be diagonalizable if it is similar to a diagonal matrix.
That is, if A = P DP −1 for some invertible matrix P and some diagonal matrix D. The matrix P
is said to diagonalize A.




Theorem 2.5 An n × n matrix A is diagonalizable if and only if it has n linearly independent
eigenvectors.


   Proof. Suppose A has n linearly independent eigenvectors. Let $v_1, v_2, \ldots, v_n$ be these linearly
independent eigenvectors and let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the corresponding eigenvalues.
   Let $P = [v_1\ v_2\ \ldots\ v_n]$ and $D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$ then
$$\begin{aligned}
AP &= A[v_1\ v_2\ \ldots\ v_n] \\
   &= [Av_1\ Av_2\ \ldots\ Av_n] \\
   &= [\lambda_1 v_1\ \lambda_2 v_2\ \ldots\ \lambda_n v_n]
\end{aligned}$$
     The last line follows because the vectors $v_i$ are eigenvectors of A.
     We also have:
$$PD = [v_1\ v_2\ \ldots\ v_n]\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = [\lambda_1 v_1\ \lambda_2 v_2\ \cdots\ \lambda_n v_n]$$

    We then have AP = PD and multiplying both sides on the right by $P^{-1}$ gives $A = PDP^{-1}$.
(How do we know that P is invertible?) So A is similar to D and is therefore diagonalizable.
    The proof is half finished. We still have to show that if A is diagonalizable then A has n linearly
independent eigenvectors. Suppose A is diagonalizable; then $A = PDP^{-1}$ for some diagonal matrix
D and invertible matrix P. It then follows that AP = PD and using the same notation as above
this equation can be written
$$[Av_1\ Av_2\ \ldots\ Av_n] = [\lambda_1 v_1\ \lambda_2 v_2\ \ldots\ \lambda_n v_n]$$
From this equation it follows that the columns of P will be eigenvectors of A, with the corresponding
eigenvalues being the entries on the diagonal of D. The independence of the eigenvectors follows from
the invertibility of P.



Example 2.2.2
                               
                      2     2 1
            Let A =  1     3 1 . Can this matrix be diagonalized?
                      1     2 2

           Step 1. The characteristic polynomial is det(A − λI) = (5 − λ)(1 − λ)2 . The eigenvalues
           are therefore 5 and 1 (multiplicity 2).

           Step 2. Find a basis for each eigenspace.
                                     
                                    1
           For λ = 5 we get v1 =  1 .
                                    1
                                                                                            
                                                                                   1
           For λ = 1 we get a two dimensional eigenspace with basis vectors v2 =  0  and
                                                                                −1
                    2
           v3 =  −1 .
                    0
           We now have three linearly independent eigenvectors.

           Step 3. The matrix P that will diagonalize A will be constructed from the eigenvectors.
           The eigenvectors can be used as the columns of P in any order. The eigenvalues will
           then turn up as the diagonal entries of D in the corresponding order. So, for example,
           we can let                                          
                                                   1    1    2
                                           P = 1       0 −1 
                                                   1 −1      0
56                                                             2. Eigenvalues and Eigenvectors


            Step 4. We now know that A can be diagonalized and if we use the above P we get
                                                          
                                                  5 0 0
                                          D= 0 1 0 
                                                  0 0 1

            There was no computation involved in finding D but if you want to check your answer
            you can use the fact that D = P −1 AP so:

                            [ 1/4   1/2   1/4 ] [ 2  2  1 ] [ 1   1   2 ]
                 P −1 AP =  [ 1/4   1/2  −3/4 ] [ 1  3  1 ] [ 1   0  −1 ]
                            [ 1/4  −1/2   1/4 ] [ 1  2  2 ] [ 1  −1   0 ]

                            [ 5  0  0 ]
                         =  [ 0  1  0 ]
                            [ 0  0  1 ]


    Not every square matrix can be diagonalized. A simple example is

                                     A = [ 1  1 ]
                                         [ 0  1 ]

This matrix has only one eigenvalue, λ = 1, so in order for it to be diagonalizable the corresponding
eigenspace would have to be 2 dimensional. But subtracting 1 down the diagonal gives a matrix of
rank 1, so the corresponding eigenspace is only 1 dimensional. The conclusion is that this matrix is
not diagonalizable. The general idea is that a matrix is not diagonalizable if you can’t find enough
linearly independent eigenvectors to make up the matrix P . Square matrices that cannot be diagonalized
are said to be deficient.
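
    Maple makes the deficiency visible as well. In the sketch below (assuming the LinearAlgebra
package is loaded) the matrix of eigenvectors returned by Eigenvectors has a zero column, which
signals that there is no second independent eigenvector:

        >with(LinearAlgebra):
        >A:=<<1,0>|<1,1>>:
        >ev,P:=Eigenvectors(A);   ## P has a zero column: no basis of eigenvectors exists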
    The following theorem, which will be stated without proof, gives some important insight into
diagonalizable matrices.

Theorem 2.6 Let λ be an eigenvalue of matrix A. If λ has multiplicity k then the dimension of
the corresponding eigenspace is at most k.

    For example, suppose A is a 4 × 4 matrix with characteristic polynomial (λ − 3)2 (λ − 5)2 . Then
this matrix has two eigenvalues, both of multiplicity 2. That means there are two eigenspaces and the
dimension of each eigenspace is at most 2. Taking this a bit further, it means that each eigenspace
has a dimension of either 1 or 2. (The dimension can’t be 0 because an eigenspace has to contain
non-zero eigenvectors.) In order for A to be diagonalizable both eigenspaces would have to have
dimension 2. If at least one of these eigenspaces has dimension 1 then matrix A is deficient and
cannot be diagonalized.
    In general, for a square matrix A to be diagonalizable each eigenspace must be of the maximum
possible dimension – this maximum possible dimension is the multiplicity of the corresponding
eigenvalue. So a deficient matrix has an eigenspace whose dimension is less than the multiplicity of
the corresponding eigenvalue.

Example 2.2.3
            Consider the following matrices

                      [ 4  0  0  0  0 ]               [ 4  1  0  0  0 ]
                      [ 0  4  0  0  0 ]               [ 0  4  1  0  0 ]
                 A =  [ 0  0  4  0  0 ]          B =  [ 0  0  4  0  0 ]
                      [ 0  0  0  5  0 ]               [ 0  0  0  5  0 ]
                      [ 0  0  0  0  5 ]               [ 0  0  0  0  5 ]


           It should be clear that 4 is an eigenvalue of multiplicity 3 for both A and B, and that 5
           is an eigenvalue of multiplicity 2 for both matrices.
           When you subtract 5 from each diagonal entry in either matrix you get three pivot
           columns and two columns of zeroes; that is, the resulting matrices have rank 3. The
           corresponding eigenspace is therefore 2 dimensional for each matrix. The dimension of
           the eigenspace is equal to the multiplicity of the eigenvalue.
           But notice what happens when you subtract 4 from the diagonal entries of these matrices.
           With matrix A you get three columns of zeroes resulting in a matrix of rank 2 with a 3
           dimensional eigenspace. Matrix B, on the other hand, will still contain 4 pivots (the rank
           will be 4) so the corresponding eigenspace is only 1 dimensional. Matrix B is deficient
           and cannot be diagonalized.
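
            The rank computations above are easy enough by hand, but here is a Maple sketch of
            the same check (assuming the LinearAlgebra package is loaded):

            >with(LinearAlgebra):
            >A:=DiagonalMatrix([4,4,4,5,5]):
            >B:=<<4,0,0,0,0>|<1,4,0,0,0>|<0,1,4,0,0>|<0,0,0,5,0>|<0,0,0,0,5>>:
            >Rank(A-4*IdentityMatrix(5)), Rank(B-4*IdentityMatrix(5));  ## 2 and 4
            >Rank(A-5*IdentityMatrix(5)), Rank(B-5*IdentityMatrix(5));  ## 3 and 3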


Theorem 2.7 If A = P DP −1 where D is diagonal, then An = P Dn P −1 for any positive integer
n.

Proof. Suppose A = P DP −1 . Then

                An  =  P DP −1 · P DP −1 · P DP −1 · · · P DP −1 · P DP −1
                    =  P D(P −1 P )D(P −1 P ) · · · D(P −1 P )DP −1
                    =  P DID · · · IDP −1
                    =  P Dn P −1



    The significance of this last theorem is connected with the fact that if D is a diagonal matrix
then Dn is easy to evaluate. To evaluate Dn you just have to raise each entry on the main diagonal
to the nth power. So if A is diagonalizable, the most efficient way of computing An is by evaluating
P Dn P −1 . The importance of this result will be investigated in the next chapter.
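
    As a quick illustration, the following Maple sketch (assuming the LinearAlgebra package is
loaded) computes A10 for the matrix of Example 2.2.2 both directly and through the diagonalization,
and confirms that the two answers agree:

        >with(LinearAlgebra):
        >A:=<<2,1,1>|<2,3,2>|<1,1,2>>:
        >P:=<<1,1,1>|<1,0,-1>|<2,-1,0>>:
        >D10:=DiagonalMatrix([5^10,1,1]):  ## D^10: raise each diagonal entry to the 10th power
        >P.D10.P^(-1) - A^10;              ## the difference is the zero matrix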


     Exercises
     1. Find a basis for the eigenspaces of the following matrices. Are any of these matrices diagonalizable?

               [ 2  0  0 ]        [ 2  1  0 ]        [ 2  1  0 ]
           (a) [ 0  2  0 ]    (b) [ 0  2  0 ]    (c) [ 0  2  1 ]
               [ 0  0  2 ]        [ 0  0  2 ]        [ 0  0  2 ]

     2. Diagonalize the following matrices if possible. That is, give matrices P and D such that P −1 AP =
        D where D is diagonal.

            (a) A = [ 2  0 ]    (b) A = [ 1  0 ]    (c) A = [ 0  1 ]
                    [ 1  1 ]            [ 2  1 ]            [ 1  2 ]

     3. Diagonalize the following matrices if possible.

            (a) A = [  3  1 ]    (b) A = [ 1  3 ]    (c) A = [ 5  5 ]
                    [ −1  5 ]            [ 4  2 ]            [ 5  5 ]

     4. (a) Can the matrix [ 8  −12 ] be diagonalized?
                           [ 6  −10 ]

        (b) Find any values of α for which the matrix [ 8  −12 ] cannot be diagonalized.
                                                      [ 6    α ]

                          [ 4  2  2 ]
     5. Diagonalize A =   [ 2  4  2 ]  if possible. The eigenvalues of this matrix are λ = 2, 8.
                          [ 2  2  4 ]

                          [ 4  0  −2 ]
     6. Diagonalize A =   [ 2  5   4 ]  if possible. The eigenvalues of this matrix are λ = 5, 4.
                          [ 0  0   5 ]

                          [  2   2  −1 ]
     7. Diagonalize A =   [  1   3  −1 ]  if possible. The eigenvalues of this matrix are λ = 5, 1.
                          [ −1  −2   2 ]

     8. Find a matrix which has eigenvalues of 2 and 4 with corresponding eigenvectors (2, 1) and (0, 2).

     9. Find a matrix which has eigenvalues of 1 and 0 with corresponding eigenvectors (a, b) and (b, a).

 10. Find a matrix which has eigenvalues of −1 and 0, where −1 is an eigenvalue of multiplicity 2
     with corresponding eigenvectors (1, 1, 0) and (1, −1, 1), and the eigenvalue 0 has corresponding
     eigenvector (0, 0, 1).

 11. Suppose A = [ 5  −3 ] . Matrix A can be diagonalized as
                 [ 6  −4 ]

                        A = [ 1  1 ] [ 2   0 ] [  2  −1 ]
                            [ 1  2 ] [ 0  −1 ] [ −1   1 ]

     Find A7 .

              [ 1  −1   4 ]
 12. Let A =  [ 3   2  −1 ] . The eigenvalues of A are 3, −2, and 1.
              [ 2   1  −1 ]
      (a) Find a basis for each eigenspace of A.
      (b) Find a matrix P that diagonalizes A.
      (c) Calculate A9 .
 13. It was pointed out in this section that the matrix [ 1  1 ] is not diagonalizable. Show that
                                                        [ 0  1 ]
     the matrix [ 1+ε  1 ] is diagonalizable for ε > 0. What are the eigenspaces of this matrix?
                [  0   1 ]
     What happens to these eigenspaces as ε → 0?

                                       [ 1  a  1 ]
 14. For what value of a, if any, is   [ 0  1  a ]  not diagonalizable?
                                       [ 0  0  2 ]

 15. Let A be a 2 × 2 matrix. Suppose A has eigenvalues 3 and 0 with corresponding eigenvectors
     (2, 1) and (1, −1). Let x = (0, 6).

      (a) Find Ax.
      (b) Find A.
 16. Let A be a 3 × 3 matrix. Suppose A has eigenvalues 1, 2, −4 with corresponding eigenvectors
     (1, 1, 1), (1, 0, 2), and (4, 2, 3). Let x = (1, 2, 3).

      (a) Find Ax.
      (b) Find A.
 17. Show that the only matrix similar to the zero matrix is the zero matrix.
 18. Show that the only matrix similar to the identity matrix is the identity matrix.
 19. Suppose A is an n × n matrix such that A is similar to −A.
      (a) Show that if n is odd then det A = 0.
      (b) Find an invertible 2 × 2 matrix A such that A is similar to −A.
 20. Show that if A is similar to B, and B is similar to C, then A is similar to C.
 21. Suppose A and B are similar matrices. Show that


      (a) if A is invertible then B is also invertible.
      (b) if A is idempotent then B is also idempotent. (Remember A is idempotent if A2 = A.)
      (c) if A is nilpotent then B is also nilpotent. (A is said to be nilpotent if An = 0 for some
          positive integer n.)
 22. Show that [ 1  0 ] is not similar to [ 0  1 ] .
               [ 0  2 ]                   [ 2  0 ]
 23. Suppose that A is a 4 × 4 matrix with two eigenvalues. Each eigenspace is one dimensional. Is A
     diagonalizable?
 24. Suppose that A is a 5 × 5 matrix with 5 eigenvalues. Is A diagonalizable?
 25. Suppose that A is a 5 × 5 matrix with 4 eigenvalues. Is A diagonalizable?
 26. Suppose B = P −1 AP . Let v be an eigenvector of A with eigenvalue λ. Show that P −1 v is an
     eigenvector of B with eigenvalue λ.
 27. Suppose u and v are non-zero vectors in Rn . Show that the n × n matrix uvT is diagonalizable
     if and only if uT v ≠ 0. (Hint: use problem 18 from section 2.1.)


                                     Using MAPLE

Example 1
   In this example we will use Maple to diagonalize a matrix. We will begin with the following 2 × 2
matrix:
                                        A = [ 8/3   4/3 ]
                                            [ 2/3  10/3 ]
First we need the eigenvectors of A. In Maple we could do the following:
        >A:=<< 8/3,2/3>|<4/3,10/3>>;
        >ev:=Eigenvectors(A);
                                           [2, 4] [[−2, 1], [1, 1]]
    The last command tells us that the eigenvalues are 2 and 4 and that each eigenvalue has multiplicity
1. This command also returns a set of basis eigenvectors for each eigenspace. In this case each eigenspace
is 1 dimensional and so each set contains one vector. We next define these eigenvectors in Maple and
diagonalize A.
        >v1:=<-2,1>:
        >v2:=<1,1>:
        >P:=<v1|v2>;
        >evalm(P^(-1).A.P);
    As expected, when A is diagonalized we get [ 2  0 ] with the eigenvalues turning up on the diagonal.
                                               [ 0  4 ]
    How is the unit circle transformed by matrix A? We have seen before that the unit circle can be seen
as the vectors of the form (cos(t), sin(t)) as t ranges from 0 to 2π, and we will now multiply these vectors
by A.
We will plot the results along with the eigenspaces of A.
        >v:=<cos(t),sin(t)>:
        >tv:=A.v;
        >p1:=plot([v[1],v[2],t=0..2*Pi]): ## the unit circle
        >p2:=plot([tv[1],tv[2],t=0..2*Pi]): ## the circle multiplied by A
        >p3:=plot([-2*t,t,t=-1.5..1.5]): ## one eigenspace
        >p4:=plot([t,t,t=-3..3]): ## the other eigenspace
        >plots[display]([p1,p2,p3,p4],scaling=constrained,color=black);
    This gives us the plot in Figure 2.6
    If you look at the points on the unit circle that intersect the eigenspaces, you should understand that
these points are stretched out by factors of 2 and 4 along the eigenspaces to the points on the ellipse
that intersect the eigenspaces. Notice also that the eigenspaces are not the axes of the ellipse. We will
learn later how to find these axes.
    Try repeating the above example using the following matrices:

                                  [  1    1  ]               [ 3  1 ]
                                  [ 1/2  3/3 ]               [ 1  3 ]


Example 2


               Figure 2.6: The unit circle, its image under A, and the eigenspaces (labelled lambda=2 and lambda=4)


    We will look at another example involving a larger matrix. Suppose we want to diagonalize

                                          [ 1  4  3  5 ]
                                      B = [ 2  1  2  3 ]
                                          [ 3  2  1  2 ]
                                          [ 4  3  4  1 ]

     We will first enter this into Maple .
          >B:=<<1,2,3,4>|<4,1,2,3>|<3,2,1,4>|<5,3,2,1.0>>:
          >evb:=Eigenvectors(B);
    Notice when B was entered into Maple one of the entries was entered as 1.0. By entering at
least one entry in floating point format you force Maple into using floating point approximations for its
computations rather than exact arithmetic. To see the advantage of this try changing the 1.0 to 1 and
see what happens.
    We now want to pick out the four eigenvectors. We could just retype them as we did in the first
example but in this case the following approach is simpler:

          >P:=evb[2];
          >P^(-1).B.P;
     This gives

          [ −2.0                      −0.000000015      0.0000000005      0.0000000004  ]
          [  0.00000000823064292       10.28582791      0.000000014      −0.000000012  ]
          [  0.00000000712841625       0.000000013      −2.856210504     −0.000000014  ]
          [ −0.0000000003959861145     0.0000000009     −0.00000000100   −1.429617398  ]

    Notice here that P −1 BP didn’t quite give a diagonal matrix. The entries off the diagonal weren’t
zeroes. But they were all small (around the order of 10−8 ). This is the result of using floating point
approximations. When using floating point calculations numbers that should be zero turn out to be very
small nonzero values as a result of rounding errors.
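    One way to suppress this noise (a suggestion, not the only approach) is Maple's fnormal
command, which rounds floating point quantities that are within rounding error of zero; an explicit
tolerance can also be supplied, as in fnormal(x, 10, 1e-7), if the noise is larger than the default.
Mapping it over the product cleans up the display:

        >map(fnormal, P^(-1).B.P);   ## the tiny off-diagonal entries are replaced by 0.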

Example 3
   In this example we will illustrate another way of diagonalizing a matrix in Maple . There is a command
in Maple which will diagonalize a matrix in one step and place the eigenvectors in a matrix.

        >A:=<<1,-2>|<-2,6>>:
        >ev,P:=Eigenvectors(A);

        >P^(-1).A.P;

                                            [ 2  0 ]
                                            [ 0  5 ]
    In this example the columns of matrix P are the eigenvectors of A.
    The command JordanForm converts any matrix to what is called Jordan canonical form. If
the matrix is diagonalizable then the Jordan form is the similar diagonal matrix. If the matrix is not
diagonalizable then the Jordan form is, in a sense, the matrix that is as close to diagonal as possible. It
is an almost diagonal matrix with the eigenvalues down the diagonal but with some non-zero entries (in
fact 1s) above the diagonal. For example, the matrix

                                      [ −4   3  −8 ]
                                      [ 18  −3  18 ]
                                      [ 13  −5  17 ]

is not diagonalizable. The JordanForm command returns the following
        >A:=<<-4,18,13>|<3,-3,-5>|<-8,18,17>>;
        >ev,P:=Eigenvectors(A); ### note that P has only 2 non-zero columns, A is defective
        >P:=JordanForm(A,output=’Q’);    ### compare this with the previous output
        >P^(-1).A.P;    ### this is almost diagonal

                                      [ 4  0  0 ]
                                      [ 0  3  1 ]
                                      [ 0  0  3 ]


2.3      Eigenvectors and Linear Transformations
   You should recall that if A is an m × n matrix and x is a vector in Rn , then Ax is a vector in
Rm . In this context multiplication of a vector by A is said to be a linear transformation from Rn
to Rm . The idea is simple: you have a vector in Rn , you multiply the vector by A, and as a result
you get a vector in Rm . This transformation can also be written using the notation x → Ax, and
the vector x is said to be mapped to the vector Ax.4
   A linear transformation is sometimes represented using functional notation. That is, the linear
transformation might be called T and then for any input vector, v, the result of applying the
transformation is called T (v). In other words, T (v) = Av.
   Now suppose we have:

                          A = [ 1  2 ] ,        x = [  2 ]
                              [ 2  1 ]              [ −1 ]

with the following bases of R2 :

                  B = { (1, 1), (1, −2) } ,        C = { (1, 1), (1, −1) }

   In this case multiplication by A would correspond to a linear transformation from R2 to R2 .
What does x get mapped into by this transformation? A simple computation gives

                          Ax = [ 1  2 ] [  2 ]  =  [ 0 ]
                               [ 2  1 ] [ −1 ]     [ 3 ]

    But now suppose that we were representing our vectors relative to a basis other than
the standard basis. Suppose, for example, we were representing vectors relative to basis B. In that
case our starting vector, before the transformation, would be represented by [x]B = (1, 1), and this
vector would be mapped into [Ax]B = (1, −1). The point we want to illustrate in this section is
that even if we represent our vectors relative to some other basis the linear transformation will still
correspond to multiplying by a certain matrix. This matrix will not be A.
    If we are representing vectors relative to basis B, then what matrix would we have to multiply
by in order to apply the same linear transformation? If we start with [x]B how can we find the result
of applying the linear transformation? We will break the problem up into smaller steps. First we
multiply by PE←B . This will give us our starting vector in the standard basis, and we know that
in the standard basis we multiply by matrix A to apply the linear transformation. Finally, we want
our transformed vector represented in basis B, not the standard basis, so we multiply by PB←E and
we end up with [Ax]B .

                                                       A
                                           x          −→          Ax
                                     B      ↑                      ↓         B −1
                                          [x]B      −→           [Ax]B
                                                  B −1 AB
   If we put all of that together, what did [x]B get multiplied by? If you arrange things properly
you should see that our starting vector was multiplied by PB←E APE←B = B −1 AB. This matrix is
  4 Rn is also called the domain of the transformation.     Rm   is called the co-domain. The range of the transfor-
mation is Col A.


called the B-matrix of the transformation. If we evaluate this matrix we get

                    [ 1   1 ]−1 [ 1  2 ] [ 1   1 ]     [ 3  −2 ]
                    [ 1  −2 ]   [ 2  1 ] [ 1  −2 ]  =  [ 0  −1 ]

   If we use this to check our earlier computation we get

                        [ 3  −2 ] [ 1 ]    =    [  1 ]
                        [ 0  −1 ] [ 1 ]         [ −1 ]

                        B −1 AB   [x]B     =    [Ax]B
   Now repeat the above procedure and find the C-matrix of this linear transformation.
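
    If you would rather let Maple do the arithmetic, the B-matrix can be computed directly. Here
is a sketch (assuming the LinearAlgebra package is loaded); you can adapt it to basis C by changing
the columns:

        >with(LinearAlgebra):
        >A:=<<1,2>|<2,1>>:
        >B:=<<1,1>|<1,-2>>:   ## the columns are the vectors of basis B
        >B^(-1).A.B;          ## the B-matrix of the transformation x -> Ax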

    Recall that two n × n matrices, A and B, are similar if A = P −1 BP . Since P is invertible, the
columns of P form a basis for Rn . This means that the transformation x → Bx represents the same
transformation as x → Ax if you interpret the first as relative to the columns of P and the second
as relative to the standard basis.

   The following points summarize this section

   • A linear transformation from Rn to Rm corresponds to multiplying a vector in Rn by an m × n
     matrix.

   • A vector has an infinite number of different representations corresponding to different bases.
     As a consequence the matrix representation of a linear transformation varies with the choice
     of a basis.

   • If two matrices are similar then they correspond to the same linear transformation expressed
     relative to different bases.

   • If a matrix A is diagonalizable then the diagonal matrix is, in a sense, the simplest way
     of representing the linear transformation x → Ax. This happens when we represent the
     transformation relative to a basis of eigenvectors of A.


Example 2.3.1
            Let
                                        [ 5  0  −4 ]
                                    A = [ 2  1  −2 ]
                                        [ 3  0  −2 ]

            and let T be the linear transformation x → Ax.
            Let
                            B = { (1, 1, 1), (1, 0, 1), (1, 1, 0) }

            and
                            C = { (1, 0, 1), (0, 1, 0), (4, 2, 3) }

            If we define matrices B and C as usual we get

                                          [ 1  0  1 ]
                               B −1 AB =  [ 0  1  2 ]
                                          [ 0  0  2 ]

                                          [ 1  0  0 ]
                               C −1 AC =  [ 0  1  0 ]
                                          [ 0  0  2 ]
     Notice that although B and C are both bases of R3 , C is composed of eigenvectors of A
     and so the C-matrix of T (x) = Ax is diagonal.
     The important thing to understand here is: In what sense do A, B −1 AB, and C −1 AC
     represent the same linear transformation?
            Choose any vector in R3 . Say, v = (4, 3, 3). If we apply the linear transformation T
            to v we get

                                       [ 5  0  −4 ] [ 4 ]     [ 8 ]
                        T (v) = Av  =  [ 2  1  −2 ] [ 3 ]  =  [ 5 ]
                                       [ 3  0  −2 ] [ 3 ]     [ 6 ]

            So transformation T maps v into (8, 5, 6).
            But suppose we look at things from the point of view of basis B. If we express v in terms
            of basis B we have [v]B = (2, 1, 1). If we multiply this vector by the matrix B −1 AB we get

                                          [ 1  0  1 ] [ 2 ]     [ 3 ]
                          B −1 AB [v]B =  [ 0  1  2 ] [ 1 ]  =  [ 3 ]
                                          [ 0  0  2 ] [ 1 ]     [ 2 ]
     We want to show that this last vector is, in fact, the same vector as Av. In order to see
     this we should look at this result as the coordinates of a vector relative to basis B. In
     the standard basis that vector would then be
                  3b1 + 3b2 + 2b3 = 3 (1, 1, 1) + 3 (1, 0, 1) + 2 (1, 1, 0) = (8, 5, 6) = Av
     In other words, the transformation represented by matrix A relative to the standard
     basis would be represented by B −1 AB relative to basis B.
            Similarly, if we look at things from the point of view of the eigenbasis, C, our initial
            vector would be [v]C = (0, 1, 1). Applying the transformation to this coordinate vector
            we get

                                          [ 1  0  0 ] [ 0 ]     [ 0 ]
                          C −1 AC [v]C =  [ 0  1  0 ] [ 1 ]  =  [ 1 ]
                                          [ 0  0  2 ] [ 1 ]     [ 2 ]
     This should be the C-representation of Av. To see this we put this last result in terms
     of the standard basis


                  0c1 + 1c2 + 2c3 = 0 (1, 0, 1) + 1 (0, 1, 0) + 2 (4, 2, 3) = (8, 5, 6) = Av
            Every different basis would give a different representation of vector v and a different
            representation of transformation T . The matrices corresponding to these different
            representations of T are exactly the matrices that are similar to A.
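
            The whole chain of computations in this example can be checked in a few lines of Maple
            (a sketch, assuming the LinearAlgebra package is loaded; the same steps work for basis C):

            >with(LinearAlgebra):
            >A:=<<5,2,3>|<0,1,0>|<-4,-2,-2>>:
            >B:=<<1,1,1>|<1,0,1>|<1,1,0>>:
            >v:=<4,3,3>:
            >vB:=B^(-1).v;         ## [v]_B = (2, 1, 1)
            >tvB:=B^(-1).A.B.vB;   ## [Av]_B = (3, 3, 2)
            >B.tvB - A.v;          ## the zero vector: both compute the same transformation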


Example 2.3.2
            Let A = [ 0  −1 ] and v = [ 1 ] . Matrix A corresponds to a counter-clockwise rotation
                    [ 1   0 ]         [ 0 ]
            of 90◦ . So we get Av = (0, 1). That is, a unit vector on the x axis is transformed (rotated)
            to a unit vector along the y axis.
            Now we’ll look at the same transformation from the point of view of a different basis.
            Let B = [ 1/2  1 ] and let the columns of B be our new basis, B. The matrix of the
                    [  0   1 ]
            transformation x → Ax relative to basis B would be

                   B −1 AB  =  [ 2  −2 ] [ 0  −1 ] [ 1/2  1 ]  =  [ −1   −4 ]
                               [ 0   1 ] [ 1   0 ] [  0   1 ]     [ 1/2   1 ]
            What does the transformation v → Av look like in terms of basis B? We have
            [v]B = (2, 0), and transforming this we get

                               [ −1   −4 ] [ 2 ]   =   [ −2 ]
                               [ 1/2   1 ] [ 0 ]       [  1 ]
            So we have
                              (1, 0)  →  (0, 1)        in the standard basis

                              (2, 0)  →  (−2, 1)       in basis B
         In Figure 2.7 you see vector v and the result of transforming v by multiplying by A (i.e.,
         by rotating 90◦ ).
            In Figure 2.8 you see exactly the same situation but now it is overlaid with the grid
            corresponding to basis B. You should be able to see how the starting vector corresponds
            to (2, 0) in this new coordinate system, and how the transformed vector corresponds to
            (−2, 1).




                             Figure 2.7: Counter-clockwise rotation by 90◦





                       Figure 2.8: Counter-clockwise rotation by 90◦ and basis B


       Exercises
      1. Let
                     A = [ 1  1 ] ,    v = [  2 ] ,    b1 = [ 1 ] ,    b2 = [ 2 ]
                         [ 1  1 ]          [ −1 ]           [ 1 ]           [ 1 ]

       and let B = {b1 , b2 }.

        (a) Find Av, [v]B , and [Av]B .
        (b) Find the B-matrix of the linear transformation x → Ax.

      2. Let
                     A = [  0  2 ] ,    v = [ 3 ] ,    b1 = [ 1 ] ,    b2 = [  1 ]
                         [ −1  1 ]          [ 1 ]           [ 1 ]           [ −1 ]

       and let B = {b1 , b2 }.

        (a) Find Av, [v]B , and [Av]B .


      (b) Find the B-matrix of the linear transformation x → Ax.
               [ 1  1  2 ]
   3. Let A =  [ 2  1  3 ]  and B = { (1, 1, 0), (0, 1, 1), (1, 0, 1) }.
               [ 1  0  1 ]

      (a) Find the B-matrix of the transformation T (x) = Ax.
       (b) Let [v]B = (1, 1, 1). Find [T (v)]B and T (v).
   4. Let
                                    [  1   0   0  −1 ]
                                A = [ −1   1   0   0 ]
                                    [  0  −1   1   0 ]
                                    [  0   0  −1   1 ]
     Let H be the Haar basis of R4 . What is the H-matrix of the linear transformation x → Ax?
   5. Let
                                A = [ 2  2 ] ,    v = [  1 ]
                                    [ 0  1 ]          [ −1 ]

     and let B be a basis of eigenvectors of A.
       (a) Find [v]B , [Av]B , and [A10 v]B .
      (b) Find A10 v.
  6. Let A be a 2 × 2 rotation matrix. Let B = {e2 , e1 }. Show that the B-matrix of the linear
     transformation x → Ax is A−1 .
  7. Show that if A is similar to B, then A2 is similar to B 2 .
  8. Suppose that A is similar to A2 . What can you say about the eigenvalues of A?
  9. Suppose that A and B are both invertible and that A is similar to B, show that A−1 is similar to
     B −1 .
 10. Show that if A and B are similar then they have the same rank.
 11. Suppose v is an eigenvector of A with corresponding eigenvalue λ. Let B = P −1 AP for some
     invertible matrix P . Show that P −1 v is an eigenvector of B.
 12. Suppose B and C are similar to A. Is B + C similar to A?
 13. Let T be the linear transformation from Rn to Rn defined by T (x) = Ix. This is called the identity
     transformation. Let B be a basis of Rn . What is the B-matrix of this transformation?
 14. Let T be the linear transformation from Rn to Rn defined by T (x) = Ox. This is called the zero
     transformation. Let B be a basis of Rn . What is the B-matrix of this transformation?


                                       Using MAPLE

Example 1
    Similar matrices correspond to the same linear transformation relative to different bases. In this
example we will begin by defining two matrices A and P . We will then compute B = P −1 AP . It
follows that A and B are similar.

          >A:=BandMatrix([1,2,1],1,4):
          >P:=Matrix(4,4, (i,j)->min(i,j)):
          >B:=P^(-1).A.P:
    We now want to illustrate the fact that A and B correspond to the same transformation from the
point of view of different bases. We will define an arbitrary vector, v, and multiply it by A. We will call
the transformed vector tv.

          >v:=<1,4,5,2>:
          >tv:=A.v;
                               tv := [6, 14, 16, 9]
   We now convert the vectors v and tv into the basis formed by the columns of P and call the results
vp and tvp.

          >vp:=P^(-1).v;
                                       vp := [-2, 2, 4, -3]
          >tvp:=P^(-1).tv;
                                        tvp := [-2, 6, 9, -7]
     Finally, to show the connection between A and B we just have to do one more computation.

          >B.vp;
                                        [-2, 6, 9, -7]
      This is the same result as tvp.
      Now try redoing the above steps but define v to be <a,b,c,d>.

Example 2
                                                
             [ −2   1   0   0 ]           [ 1 ]
    Let A =  [  1  −2   1   0 ]  and v =  [ 2 ] . Let T : R4 → R4 be defined by T (x) = Ax.
             [  0   1  −2   1 ]           [ 3 ]
             [  0   0   1  −2 ]           [ 4 ]
We will find the representation of T relative to the Haar basis, H, for R4 and apply both versions of
transformation T to vector v.
    First we enter the relevant information into Maple . There are many ways of entering a matrix into
Maple . Here we build each matrix column by column with the angle-bracket constructor: <v1|v2|...>
takes a sequence of vectors (having the same number of rows) and combines them into one matrix.


        >A:=<<-2,1,0,0>|<1,-2,1,0>|<0,1,-2,1>|<0,0,1,-2>>:
        >H:=<<1,1,1,1>|<1,1,-1,-1>|<1,-1,0,0>|<0,0,1,-1>>:
        >v:=<1,2,3,4>:
        >tv:=A.v;
                         tv:=[0, 0, 0, -5]
   The last command computed T (v) and we called the result tv in Maple . Now we want to do the
same computation relative to the Haar basis. The H-representation of T will be given by H −1 AH.

        >B:=H^(-1).A.H;

   This gives the H-matrix of the transformation as:
                                  [ −1/2     0     −1/4   1/4 ]
                                  [   0    −3/2     1/4   1/4 ]
                                  [ −1/2    1/2     −3   −1/2 ]
                                  [  1/2    1/2    −1/2   −3  ]
Continuing with Maple :
        >vh:=H^(-1).v;
                                   vh:=[5/2, -1, -1/2, -1/2]
        >tvh:=B.vh;
                                   tvh:=[-5/4, 5/4, 0, 5/2]
        >H.tvh;
                                           [0, 0, 0, -5]

   You should find the above pretty straightforward. It shows that relative to the standard basis we had
                                     [ 1 ]      [  0 ]
                                     [ 2 ]  →   [  0 ]
                                     [ 3 ]      [  0 ]
                                     [ 4 ]      [ −5 ]

and relative to the Haar basis we had the same vector transformation represented as
                                   [  5/2 ]      [ −5/4 ]
                                   [  −1  ]  →   [  5/4 ]
                                   [ −1/2 ]      [   0  ]
                                   [ −1/2 ]      [  5/2 ]


2.4          Complex Eigenvalues
    The eigenvalues of an n × n matrix are the roots of a polynomial of degree n but not every such
polynomial will have n real roots. It is possible for the characteristic polynomial to have complex
roots. We have seen that real eigenvalues have a geometric interpretation as scaling factors in certain
directions. In this section we will look at matrices with complex eigenvalues and show that these also
have a straightforward geometric interpretation.

    Let A = [  0  1 ] . This matrix has the characteristic equation λ2 + 1 = 0, so we see that A has
            [ −1  0 ]

no real eigenvalues. The characteristic equation can however be solved in terms of complex
numbers (see Appendix B for a review of complex numbers), and so matrix A does have complex
eigenvalues. In particular, if we solve the characteristic equation we get the complex eigenvalues
λ1 = i and λ2 = −i.
    Matrix A is a rotator: it rotates vectors in R2 clockwise by π/2 radians. It should therefore be
clear that no non-zero vector is transformed into a scalar multiple of itself when multiplied by A,
and so there are no eigenvectors in R2 . The fact that A has complex eigenvalues tells us that there
will also be complex eigenvectors in C2 .5 These complex eigenvectors can be easily computed in
the usual way. For λ1 = i we will get an eigenvector v1 = (1, i). For λ2 = −i we get an eigenvector
v2 = (1, −i). If we multiply these eigenvectors by A we have the following:

                Av1 = [  0  1 ] [ 1 ]  =  [  i ]  =  i [ 1 ]  =  λ1 v1
                      [ −1  0 ] [ i ]     [ −1 ]       [ i ]

      and

                Av2 = [  0  1 ] [  1 ]  =  [ −i ]  =  −i [  1 ]  =  λ2 v2
                      [ −1  0 ] [ −i ]     [ −1 ]        [ −i ]
    Can matrix A be diagonalized? The answer is no if you restrict yourself to real numbers, but the
answer is yes if you use complex numbers. If we proceed as before by using our two basis eigenvectors
as columns of P we have

                  P −1 AP  =  [ 1   1 ]−1 [  0  1 ] [ 1   1 ]
                              [ i  −i ]   [ −1  0 ] [ i  −i ]

                           =  [ 1/2  −i/2 ] [  0  1 ] [ 1   1 ]
                              [ 1/2   i/2 ] [ −1  0 ] [ i  −i ]

                           =  [ i   0 ]
                              [ 0  −i ]

   So, just as with real eigenvalues, when the matrix is diagonalized the complex eigenvalues turn
up as the diagonal entries.
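
    Maple's Eigenvectors command works over the complex numbers automatically, so this
computation can be reproduced in a few lines (a sketch, assuming the LinearAlgebra package is
loaded; the order of the eigenvalues may differ from the order used above):

        >with(LinearAlgebra):
        >A:=<<0,-1>|<1,0>>:
        >ev,P:=Eigenvectors(A);   ## complex eigenvalues i and -i with complex eigenvectors
        >P^(-1).A.P;              ## a diagonal matrix with i and -i on the diagonal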

      As a more general example consider the matrix

                                   [  cos θ   sin θ ]
                                   [ − sin θ  cos θ ]

This matrix has characteristic equation λ2 − 2 cos θ λ + 1 = 0. We can now find the eigenvalues by
the quadratic formula

          λ = ( 2 cos θ ± √(4 cos2 θ − 4) ) / 2 = cos θ ± √(− sin2 θ) = cos θ ± i sin θ = e±iθ
     5A   vector with n complex entries is said to be a vector in Cn . The symbol C stands for the set of complex numbers
as   R stands for the set of real numbers.


    So again we get two complex eigenvalues6 that are conjugates, with corresponding complex
eigenvectors (1, ±i). We have

            [  cos θ   sin θ ] [ 1 ]   =  [  cos θ + i sin θ ]
            [ − sin θ  cos θ ] [ i ]      [ − sin θ + i cos θ ]

                                       =  (cos θ + i sin θ) [ 1 ]
                                                            [ i ]
and
            [  cos θ   sin θ ] [  1 ]  =  [  cos θ − i sin θ ]
            [ − sin θ  cos θ ] [ −i ]     [ − sin θ − i cos θ ]

                                       =  (cos θ − i sin θ) [  1 ]
                                                            [ −i ]


    The last two examples involved rotation matrices, but not only rotation matrices have complex
eigenvalues. For example, the matrix

                                     [  1  1 ]
                                     [ −2  3 ]

has the characteristic equation λ2 − 4λ + 5 = 0, giving eigenvalues

                        λ = ( 4 ± √(16 − 20) ) / 2 = ( 4 ± 2i ) / 2 = 2 ± i

    For λ = 2 + i we get an eigenvector of (1, 1 + i), and for λ = 2 − i we get an eigenvector of
(1, 1 − i).
Once again, as explained in Appendix B, the eigenvalues are conjugates.

Scaling and Rotation in R2
Let A = [ a  −b ] . What happens if we multiply a vector in R2 by a matrix of this type? First
        [ b   a ]
notice that this matrix can be rewritten

        [ a  −b ]  =  √(a2 + b2 ) [ a/√(a2 + b2 )   −b/√(a2 + b2 ) ]
        [ b   a ]                 [ b/√(a2 + b2 )    a/√(a2 + b2 ) ]

                   =  √(a2 + b2 ) [ cos θ   − sin θ ]
                                  [ sin θ    cos θ  ]

where θ = arccos( a/√(a2 + b2 ) ). Written this way it is easy to see that multiplying by A corresponds
to rotating the vector through angle θ and scaling by a factor of √(a2 + b2 ).
    Now suppose we have any (real) 2 × 2 matrix A with complex eigenvalues. Then A will be similar
to a diagonal matrix with complex entries on the diagonal. To diagonalize A it will be necessary to
use complex eigenvectors. We will now show that it is possible to find a basis of real vectors
such that matrix A, expressed in terms of this basis, will have the form

                                     [ a  −b ]
                                     [ b   a ]

This means that any 2 × 2 matrix with complex eigenvalues is similar to a matrix that involves just
scaling and rotation.
   6 This is not quite true. There are some values of θ which will give real eigenvalues. What are these values? What

are the eigenvalues in these cases?
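
    To make the decomposition concrete, here is a small Maple sketch for one particular choice of a
and b (the values 3 and 4 are just a sample; any non-zero pair works the same way):

        >a:=3: b:=4:                ## a sample pair; here sqrt(a^2+b^2) = 5
        >r:=sqrt(a^2+b^2);
        >theta:=arccos(a/r);        ## the rotation angle
        >r*<<cos(theta),sin(theta)>|<-sin(theta),cos(theta)>>;   ## rebuilds [ a -b ; b a ]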


Theorem 2.8 Let A be a (real) 2 × 2 matrix with a complex eigenvalue λ = a + bi. Let v be a
complex eigenvector corresponding to λ and let P = [ Re(v)  Im(v) ] ; then

                      A = P CP −1      where     C = [  a  b ]
                                                     [ −b  a ]

Proof. Suppose A is a real 2 × 2 matrix with complex eigenvalues and eigenvectors. Let Av = λv,
so we also have Av̄ = λ̄v̄, where λ = a + bi. Let vR = Re(v) and vI = Im(v). We then have

                 AvR = A (v + v̄) /2 = ((a + bi)v + (a − bi)v̄) /2 = avR − bvI

and
                 AvI = A (v − v̄) /2i = ((a + bi)v − (a − bi)v̄) /2i = bvR + avI

    Let P = [ vR  vI ] and C = [  a  b ] . Using the above results we can write
                               [ −b  a ]

                          PC  =  [ vR  vI ] [  a  b ]
                                            [ −b  a ]
                              =  [ avR − bvI   bvR + avI ]
                              =  [ AvR   AvI ]
                              =  A [ vR  vI ]
                              =  AP

   So AP = P C and A = P CP −1 .
   There is a gap in this proof: Why must P be invertible? We leave it as an exercise to fill in this
part of the proof.


Example 2.4.3
            Let A = [  1     4   ] . This matrix has characteristic polynomial
                    [ −.8  −2.2 ]

                                      λ2 + 1.2λ + 1

            Using the quadratic formula we get eigenvalues

                            ( −1.2 ± √(1.44 − 4) ) / 2 = −.6 ± .8i
                                                                                 1
           For λ = −.6 + .8i we get a corresponding complex eigenvector v =            . We now
                                                                             −.4 + .2i
                                                                                            1
           want to express A in terms of a new basis. The new basis will be b1 = Re(v) =
                                                                                           −.4
                               0                   1    0
           and b2 = Im(v) =       . So we let P =          and we get
                              .2                  −.4 .2

                                                  1 0      1   4        1    0
                                 P −1 AP   =
                                                  2 5     −.8 −2.2     −.4   .2
                                                  −.6 .8
                                           =
                                                  −.8 −.6

          and the resulting matrix has the structure [ a  −b ]
                                                     [ b   a ]
          of a scaling combined with a rotation. The scaling factor would be √((−.6)2 + .82 ) = 1
          and the rotation would be clockwise by arccos(−.6).
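
          The same computation can be done in Maple (a sketch, assuming the LinearAlgebra
          package is loaded; the entries are entered as exact fractions, so −.8 becomes −4/5, and
          Maple may list the two conjugate eigenvalues in either order):

          >with(LinearAlgebra):
          >A:=<<1,-4/5>|<4,-11/5>>:
          >ev,V:=Eigenvectors(A);     ## eigenvalues -3/5 + 4/5 i and -3/5 - 4/5 i
          >v:=Column(V,1):            ## pick the eigenvector for lambda = -3/5 + 4/5 i
          >P:=<map(Re,v)|map(Im,v)>;  ## real and imaginary parts as the columns of P
          >P^(-1).A.P;                ## the scaling-rotation form found above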


Example 2.4.4
          In this example we will take the reverse approach. Let

                               A = .95 [ cos 10◦   − sin 10◦ ]
                                       [ sin 10◦    cos 10◦  ]

          This matrix corresponds to a counter-clockwise rotation of 10 degrees combined with a
          scaling by a factor of .95. If we start with the vector (1, 0) and multiply repeatedly by A
          eighteen times we get the set of points shown in Figure 2.9.

                          Figure 2.9: Rotation and Scaling using A

                                                                                        2 −1
         We will now convert matrix A to a new basis. We will arbitrarily choose P =
                                                                                        1 1
         and let our new basis be the columns of P . Following the construction of Theorem 2.8
         we let

                                                            .991 −.110
                                        B = P AP −1 =
                                                            .275 .881
                                      1                          1/3
         In this new basis the vector     would correspond to          . If we start with this point
                                      0                         −1/3
         and transform it 18 times we would get Figure 2.10.
          The new coordinate axes are also plotted here (they are determined by the columns of P ).
          It still takes 9 iterations to go through one quadrant, but now the quadrants are not all
          the same size.
          Figures 2.11 and 2.12 illustrate the result of applying 72 iterations of matrices A and
          B. These plots correspond to two complete cycles plus scaling. Since the scaling factor
          of .95 is applied at each step it follows that after 72 steps the total scaling would be
          (.95)72 ≈ .025.


         You should interpret these plots as two representations of the same thing. With matrix
         A there is a straightforward rotation by 10 degrees at each step combined with a scaling


                                Figure 2.10: Rotation and Scaling using B
Figure 2.11: 72 iterations using A.                      Figure 2.12: 72 iterations using B.


             by .95. Using matrix B does the same thing in a sense, except now the axes are skewed
             and stretched. But in both cases the points spiral in toward the origin and take 36
             iterations to complete one cycle. If the scaling factor were larger than 1, the points
             would spiral away from the origin. Without a scaling factor (or, rather, if the scaling
             factor were 1) the points would form a closed cycle; with matrix A the points would lie
             on a circle, with matrix B they would lie on an ellipse.
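             To see the closed-cycle claim concretely, here is a minimal sketch continuing with c,
             s and P from the sketch above, but with the scaling factor set to 1; the 36 plotted
             points close up and lie on an ellipse:

         >A0:=<<c,s>|<-s,c>>: ## rotation by 10 degrees with no scaling
         >B0:=P^(-1).A0.P: ## the same rotation relative to the new basis
         >y[0]:=<1/3,-1/3>: ## start at the new coordinates of (1,0)
         >for i to 36 do y[i]:=B0.y[i-1] od: ## one complete cycle, so y[36]=y[0]
         >plot([seq([y[i][1],y[i][2]],i=0..36)],style=point,symbol=box);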


Example 2.4.5
             In this example we will illustrate how these ideas can be extended to larger matrices.
             We will let
             $$A = \begin{pmatrix} 1 & 1/2 & 1/2 \\ 2 & 2 & -2 \\ 0 & 1 & 1 \end{pmatrix}$$
             The characteristic polynomial of A is
             $$-\lambda^3 + 4\lambda^2 - 6\lambda + 4 = (2-\lambda)(\lambda^2 - 2\lambda + 2)$$
             The first factor gives an eigenvalue of $\lambda_1 = 2$, and a corresponding eigenvector is
             $(1, 1, 1)^T$. The quadratic factor gives eigenvalues
             $$\frac{2 \pm \sqrt{4-8}}{2} = 1 \pm i = \sqrt{2}\,e^{\pm i\pi/4}$$

                                                                             
         If we take $\lambda_2 = 1+i$ we get a corresponding eigenvector $(1-i,\, 2i,\, 2)^T$. So we also have
         $\lambda_3 = 1-i$ with a corresponding eigenvector $(1+i,\, -2i,\, 2)^T$.
         Now by changing basis we can find matrices similar to A which might have a simpler
         structure. There is no real basis that will result in a similar diagonal matrix, but if we
         take our real eigenvector $(1,1,1)^T$ and the real and imaginary parts of one of the conjugate
         complex eigenvectors, say $(1,0,2)^T$ and $(-1,2,0)^T$, and define
         $$P = \begin{pmatrix} 1 & 1 & -1 \\ 1 & 0 & 2 \\ 1 & 2 & 0 \end{pmatrix}$$
         then we will have
         $$P^{-1}AP = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & -1 & 1 \end{pmatrix}$$
         Notice that the real eigenvalue has turned up as a diagonal entry and the complex
         eigenvalues have turned up as a block of the form $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$. In particular, they have
         turned up as the block $\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$, which corresponds to a scaling by $\sqrt{2}$ and a rotation of
         $\pi/4$.
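         The computations in this example are easy to check in Maple; here is a minimal
         sketch. One caution: Maple’s CharacteristicPolynomial command returns det(λI − A),
         the negative of the polynomial written above.

         >with(LinearAlgebra):
         >A:=Matrix([[1,1/2,1/2],[2,2,-2],[0,1,1]]):
         >factor(CharacteristicPolynomial(A,lambda)); ## (lambda-2)*(lambda^2-2*lambda+2)
         >Eigenvectors(A); ## eigenvalues 2, 1+i, 1-i with corresponding eigenvectors
         >P:=Matrix([[1,1,-1],[1,0,2],[1,2,0]]):
         >P^(-1).A.P; ## the block form computed above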


       Exercises
     1. Find the eigenvalues of the following matrices, and a basis for each eigenspace.

                                 (a) $\begin{pmatrix} 1 & 1 \\ -5 & 5 \end{pmatrix}$   (b) $\begin{pmatrix} 1 & 4 \\ -.8 & -2.2 \end{pmatrix}$

                                 (c) $\begin{pmatrix} 5 & -2 \\ 1 & 3 \end{pmatrix}$   (d) $\begin{pmatrix} .5 & -.6 \\ .75 & 1.1 \end{pmatrix}$

                                 (e) $\begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}$   (f) $\begin{pmatrix} 0 & b \\ -b & 0 \end{pmatrix}$

     2. Let $A = \begin{pmatrix} a & b \\ -b & 0 \end{pmatrix}$.

         (a) What are the eigenvalues of A?
         (b) Under what conditions does A have complex eigenvalues?
         (c) What can you say about A when a = ±2b?

     3. Find an invertible matrix P and a matrix C of the form $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$ such that $A = PCP^{-1}$ for the
        following matrices:

                                 (a) $\begin{pmatrix} 1 & 1 \\ -2 & 3 \end{pmatrix}$   (b) $\begin{pmatrix} 5 & -1 \\ 3 & 2 \end{pmatrix}$

                                 (c) $\begin{pmatrix} 0 & 1 \\ -4 & 0 \end{pmatrix}$   (d) $\begin{pmatrix} 1.08 & .32 \\ -2.72 & .12 \end{pmatrix}$

     4. Let $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ and $x_0 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$. Consider the dynamical system $x_{t+1} = Ax_t$.

         (a) Plot the trajectory of this system with the given $x_0$.
         (b) Recompute and plot the trajectory if the system is expressed relative to the basis $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 2 \\ 0 \end{pmatrix}$.
         (c) Recompute and plot the trajectory if the system is expressed relative to the basis $\begin{pmatrix} 2 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$.
         (d) Recompute and plot the trajectory if the system is expressed relative to the basis $\begin{pmatrix} -2 \\ 2 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 3 \end{pmatrix}$.
                                  
     5. Let $C = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$.

         (a) What is the characteristic polynomial of C?
         (b) Evaluate $C^2$, $C^3$, $C^4$, $C^5$.
         (c) Find the eigenvalues of C and a basis for each eigenspace.

                                  
   6. The matrix $R_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}$ is a rotation of 90 degrees around the x axis in $\mathbb{R}^3$. The matrix
      $R_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{pmatrix}$ is a rotation of 90 degrees around the y axis in $\mathbb{R}^3$. Find the eigenvalues and
      corresponding eigenvectors of $R_1R_2$.
   7. Show that the matrix $P = \begin{pmatrix} v_R & v_I \end{pmatrix}$ in Theorem 2.8 must be invertible. (Hint: show that if the
      columns of P are not linearly independent then the eigenvalues of A would not be complex.)


                                    Using MAPLE
Example 1.
   There is a command in Maple for plotting points in the complex plane. The command is complexplot
and is part of the plots package. In this example we will define a 32×32 matrix with complex eigenvalues
and plot those eigenvalues. The first Maple command given below defines a 32 × 32 matrix A where
each entry in the matrix is given by
                                           $$a_{ij} = \cos(.6\,i(j - 1))$$
Notice the symmetry of the plot around the real axis, which is a consequence of the fact that the complex
eigenvalues come in conjugate pairs.
        >with(LinearAlgebra): ### for the Eigenvalues command
        >A:=Matrix(32,32,(i,j)->cos(i*(j-1)*.6)):
        >ev:=Eigenvalues(A):
        >plots[complexplot](convert(ev,list),style=point,symbol=circle);



                           Figure 2.13: The 32 complex eigenvalues of A

    Here’s a slightly more complicated example. In this case we will look at the eigenvalues of matrices
of the form
$$\begin{pmatrix} 0 & 1 & 0 & 0 & \cdots & 0 & 0 \\ k & 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & k & 0 & 1 & \cdots & 0 & 0 \\ \vdots & & & & \ddots & & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & 0 & 0 & \cdots & k & 0 \end{pmatrix}$$
where the parameter k ranges from −1 to 1.
        >with(plots): ### for complexplot and display
        >f:=k->BandMatrix([k,0,1],1,24): ### 24 x 24, subdiagonal k and superdiagonal 1
        >for i to 21 do
          B:=f(.1*(i-1)-1): ### k = -1.0, -0.9, ..., 1.0
          B[24,1]:=1: ### the corner entry
          ev:=Eigenvalues(B):
          p[i]:=complexplot(convert(ev,list), style=point,color=black,symbol=circle):
          od:
        >display([seq(p[i],i=1..21)]);

                            Figure 2.14: The complex eigenvalues of B


    The resulting plot shows the eigenvalues of 21 different matrices of the above form, where k =
−1.0, −0.9, . . . , 0.9, 1.0.
    We’ll look at one more example. Here we’ll let S be the 64 × 64 matrix
$$S = \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & & \ddots & \vdots \\ 1 & 0 & 0 & 0 & \cdots & 0 \end{pmatrix}$$
We will plot the eigenvalues of $M = S^2 + S - I$ and of $M^{-1}$.

        >S:=BandMatrix([0,0,1.0],1,64): S[64,1]:=1.0: ### the 64 x 64 shift matrix S
        >M:=S^2+S-IdentityMatrix(64): ### subtract the identity explicitly
        >ev1:=Eigenvalues(M):
        >ev2:=Eigenvalues(M^(-1)):
        >plots[complexplot](convert(ev1,list),style=point);
        >plots[complexplot](convert(ev2,list),style=point);


                           Figure 2.15: The eigenvalues of M and $M^{-1}$.

   The plots are shown in Figure 2.15. Try to figure out the relation between these two plots.

Example 2.


     We will illustrate how to use Maple to generate some of the images used in this section. Suppose
we want a rotation matrix that will take 32 steps to rotate through a complete circle; then each step will
rotate by $\frac{2\pi}{32}$. Suppose we also want a starting point to spiral outward so that after one complete rotation
of $2\pi$ radians its distance from the origin will have doubled; then at each step the scaling would be
$\sqrt[32]{2}$. The following commands in Maple will define the appropriate matrix and compute and plot two
complete cycles.
     The first three lines below define the values that determine our matrix, and the fourth line builds
the matrix itself. The fifth line defines our starting point, $x_0$. The sixth line computes 64 points of the
trajectory in a loop. The last two lines plot the points which we have computed.

         >c:=evalf(cos(2*Pi/32));
         >s:=evalf(sin(2*Pi/32));
         >r:=evalf(2^(1/32)); ### the scaling factor
         >A:=r*<<c,-s>|<s,c>>; ## the rotation matrix with scaling
         >x[0]:=<1,0>; ## the starting point
         >for i to 64 do x[i]:=A.x[i-1] od: ## this finds x1 to x64
         >pdata:=[seq( [ x[i][1], x[i][2] ], i=0..64)]:
         >plot(pdata,style=point,symbol=box); ### phase plot of our system

     We get Figure 2.16.


                                     Figure 2.16: Multiplying by A

    Notice that it takes 8 steps to pass through each quadrant and that after each complete cycle the
intercept on the horizontal axis is doubled.
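    Both observations are easy to confirm by continuing from the loop above:

         >r^32; ## approximately 2: the total scaling over one complete cycle
         >x[32]; ## approximately (2,0), back on the horizontal axis with doubled intercept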
    Now suppose we look at the same transformation relative to a different basis. Let’s choose the basis
$$B = \left\{ \begin{pmatrix} 3 \\ 1 \end{pmatrix},\ \begin{pmatrix} 1 \\ -1/3 \end{pmatrix} \right\}$$

     Here are some comments concerning the following Maple commands:

     • The plot p2 gives the axes of the new coordinate system. The axes can be represented as the lines
       $tv_1$ and $tv_2$. This command plots these parametric representations of the axes. The definition of
       p2 might look confusing at first, but remember that v1[1] and v1[2] just give the first and second
       components of v1.


   • It is strongly recommended that you try the same example with several other choices for a basis.
     If you change the values of v1 and v2 you can just reenter all the other commands to see the
     corresponding plot. You might have to change the size of the viewing window as given by the view
     option in the display command.

                                                             a b
   • Notice that matrix B doesn’t fall into the pattern           but P −1 BP = A and so a change of
                                                             −b a
      basis will put it in that form.

        >v1:=<3,1>:
        >v2:=<1,-1/3>:
        >P:=<v1|v2>;
        >B:=P.A.P^(-1);   ## B and A are similar
        >x[0]:=<1,0>; ## the starting point
        >for i to 64 do x[i]:=B.x[i-1] od:
        >pdata:=[seq( [ x[i][1], x[i][2] ], i=0..64)]:
        >p1:=plot(pdata,style=point,symbol=box):
        >p2:=plot({[v1[1]*t,v1[2]*t,t=-10..10],[v2[1]*t,v2[2]*t,t=-10..10]}):
        >plots[display]([p1,p2],view=[-6..6,-6..6]);

   This gives Figure 2.17.

                                     Figure 2.17: Multiplying by B.

     Notice in this case that if we look at the new axes as dividing the plane into 4 unequal “quadrants”,
it still takes 8 steps to pass through each “quadrant”. The distance from the origin of any point on the
path also still doubles after each complete cycle.

Example 3.
                                                                     
   For the next example look at the matrix
$$A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 \end{pmatrix}$$
This matrix has characteristic polynomial $\lambda^4 + 1$, which has no real roots, and so A has 4 complex
eigenvalues. In Maple

         >A:=<<0,0,0,-1>|<1,0,0,0>|<0,1,0,0>|<0,0,1,0>>;
         >ev,V:=Eigenvectors(A);
    We now see that the four complex eigenvalues are $\frac{\sqrt{2} \pm i\sqrt{2}}{2}$ and $\frac{-\sqrt{2} \pm i\sqrt{2}}{2}$. We will take the
real and imaginary parts of two of the eigenvectors that ARE NOT conjugates of each other (otherwise
the resulting vectors would not be linearly independent) and place them as columns in a matrix. (Note
that Maple does not always return the eigenvectors in the same order, so you will have to look and see
which two columns to use.)

          >for i to 4 do v[i]:=simplify(ev[i]) od;
          >v1:=map(Re,Column(V,1)):v2:=map(Im,Column(V,1)); ### one conjugate pair
          >v3:=map(Re,Column(V,2)):v4:=map(Im,Column(V,2)); ### another conjugate pair
          >P:=<v1|v2|v3|v4>;
     This gives
$$P = \begin{pmatrix} -\sqrt{2}/2 & -\sqrt{2}/2 & \sqrt{2}/2 & \sqrt{2}/2 \\ 0 & -1 & 0 & -1 \\ \sqrt{2}/2 & -\sqrt{2}/2 & -\sqrt{2}/2 & \sqrt{2}/2 \\ 1 & 0 & 1 & 0 \end{pmatrix}$$
Now we will evaluate $P^{-1}AP$:

          >P^(-1).A.P;
This gives us the matrix
$$P^{-1}AP = \begin{pmatrix} \sqrt{2}/2 & \sqrt{2}/2 & 0 & 0 \\ -\sqrt{2}/2 & \sqrt{2}/2 & 0 & 0 \\ 0 & 0 & -\sqrt{2}/2 & -\sqrt{2}/2 \\ 0 & 0 & \sqrt{2}/2 & -\sqrt{2}/2 \end{pmatrix}$$

    This reveals the rotations hidden inside matrix A. It shows that multiplying by A corresponds to a
rotation of $\pi/4$ in the $x_1x_2$ plane and a rotation of $3\pi/4$ in the $x_3x_4$ plane.
    Note that in this case the Cayley–Hamilton theorem tells us that $A^4 = -I$, so $A^8 = I$ and any
trajectory would be periodic of length 8. An arbitrary trajectory would start with
$$\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} \to \begin{pmatrix} b \\ c \\ d \\ -a \end{pmatrix} \to \begin{pmatrix} c \\ d \\ -a \\ -b \end{pmatrix} \to \cdots$$

What are the next 6 steps of this trajectory? What is the trajectory in the x1 x2 plane? In the x2 x4
plane?
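    A quick way to check the period and generate the steps of this trajectory in Maple, with A as defined
above (a sketch, iterating on symbolic entries a, b, c, d):

         >with(LinearAlgebra): ## if not already loaded
         >x:=<a,b,c,d>:
         >Equal(A^8,IdentityMatrix(4)); ## true, so every trajectory has period 8
         >for i to 8 do x:=A.x: print(x) od: ## the 8 steps of the trajectory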

								