APMA 0160 — Introduction to Computing Sciences Project 1: Ill-conditioning of the Vandermonde matrix Due date: April 1 (Wednesday) Background There is a concept we have not discussed which is a major motivating factor in choosing the basis and/or nodal set on which to perform interpolation. This concept is called the “condition number” of a matrix. The exact definition of this concept is not necessary for our purposes. Given a linear system of equations Ax = b, one should think of the condition number of A as being (very roughly) the rate at which the solution, x, will change with respect to a change in b. Thus, if the condition number is large, even a small error in b may cause a large error in x. On the other hand, if the condition number is small then the error in x will not be much bigger than the error in b. In this project, we’ll explore the ill-conditioning of some matrices, and we’ll also determine how this conditioning affects our computation of the interpolant. In order to gain a good understanding of why this “condition number” is important, we will use Matlab’s built-in function cond to compute condition numbers. For a matrix, A, cond(A) returns the condition number. The executive summary of condition numbers is as follows: large condition numbers (say, of order 108 ) are bad, and indicate that numerical precision will be lost during a matrix inversion; small condition numbers (anything less than 103 or so) are good and indicate that a matrix is “easily” invertible. (Please note that the condition number of a matrix and its determinant have absolutely nothing to do with each other!) If the condition number reaches around 1016 , there will probably be significant errors in the inversion process. All of this is a result of the fact that Matlab can only store a finite number of digits. Exercises: 1. Let us first explore the meaning of the condition number.
(i) Consider the linear system Ax = b, where A= 1 1 −1 1 , b= 1 1 .
The solution to the linear system Ax = b is x = A−1 b. If you make a small mistake in b and instead use ˜= b 1+ 1−
as the new right hand side, then the solution to the linear system A˜ = ˜ is x b −1˜ x = A b. On paper, compute (by hand) the error ˜ x − x := A−1 b − A−1˜ ˜ b. Once you have explicit expression for A−1 b − A−1˜ set = 10−10 , go to b, Matlab, and use Matlab’s built-in functions cond and inv to compute the condition number of A and the error A−1 b − A−1˜ Comment on the magnib. tude of the error of x compared to the error of b. (ii) Now we consider a different matrix A, which is A= −1 + −1 1 1 .
The vectors b and ˜ are the same as in part (i). On paper, compute the b error x − x = A−1 b − A−1˜ by hand. Once you have explicit expression ˜ b −1 −1˜ for A b − A b, set = 10−10 , go to Matlab, and use Matlab’s built-in functions cond and inv to compute the condition number of A and the error A−1 b − A−1˜ Comment on the magnitude of the error of x compared to the b. error of b. 2. Recall that the interpolation problem is defined by a collection of data points {(xi , yi )}m and a collection of basis functions {φj (x)}m . One of i=1 j=1 the goals in the interpolation problem is to find the associated coefficients c defined by c = V −1 y, where the Vandermonde Matrix V associated with the data and the basis is defined by V (i, j) = φj (xi ), i = 1 : m, j = 1 : m
and y is the vector of data values. However, if we make a small error in y, will the resultant error in c be greatly amplified, or will it stay controlled? Considering our discussion above, this question can be answered by appealing to the condition number of V . Recall that Matlab can only store a finite number of digits. This means that numbers like π cannot actually be represented in Matlab. Thus, when we type numbers into Matlab, often times there is a small, but nonzero, error which is present. This error is usually on the magnitude of 1016 , and most of the time we’ll never see it cause any trouble. However, as the previous exercise has shown, if we make a small error as a result of Matlab’s finite precision and we operate on the erroneous vector with the inverse of an ill-conditioned matrix, the resultant error in our answer can be quite large. Here we shall explore this error in the context of the global interpolation problem. Let us first consider the effect that the nodal locations xi have on the condition number of the Vandermonde matrix. In all of the following, we shall be working on the reference interval x ∈ [−1, 1]. Take the following nodal sets and tabulate the condition numbers of the Vandermonde matrices using the monomial basis φj (x) = xj−1 versus m for m = 10, 15, 20, 25, 30, 35, 40: • xi equispaced nodes: xi = −1 +
2 (i m−1
− 1) for i = 1, 2, · · · , m.
i−1 • xi the chebyshev nodes: xi = cos(π − π m−1 ) for i = 1, 2, · · · , m.
Based on your findings, and given that we wish to minimize the condition number, which is the better nodal set to use? 3. We have considered which set of nodes to use, let us consider which basis functions to use. The monomials are the easiest polynomial basis to write down, but there’s no guarantee that this is the best in terms of minimizing the Vandermonde condition number. Use the nodal set you chose as the “better” from above, and tabulate condition numbers of the Vandermonde matrices for m = 10, 15, 20, 25, 30, 35, 40 for each of the following basis sets: • The monomials: φj (x) = xj−1 for j = 1, 2, · · · , m. • The Newton basis: for j = 1, 2, · · · , m, φj (x) = 1 Πj−1 (x − xk ) k=1 if j = 1 if j > 1
• The “Chebyshev polynomials”: φj (x) = cos((j − 1) arccos x) for j = 1, 2, · · · , m.
For example, the first few Chebyshev polynomials are φ1 (x) = 1, φ2 (x) = x, φ3 (x) = 2x2 − 1, φ4 (x) = 4x3 − 3x, . . . (These polynomials form a very important basis in advanced scientific computing.) Please note that for all of the basis sets above, the linear span of them is the same. In particular, this implies that for a fixed nodal set, all three interpolants calculated will be exactly the same. In other words, the Newton polynomials basis is simply a “rearrangement” of the monomials, and the Chebyshev polynomials are also just a “rearrangement” of the monomials. Thus, any difference you see in the condition number of the Vandermonde matrix has nothing to do with the type of interpolant: they are all the same function. 4. Finally, we shall put all of the information we have gathered together to see the effect that ill-conditioned systems have on interpolation. Let m = 45. Let’s test interpolation on the function f (x) = 1 1 + 25x2
For the rest of the this problem, we shall define the interpolation values as yi = f (xi ) for given xi . We are doing a global interpolation problem, which means that we are effectively constructing a function p(x) = m ci φi (x) which satisfies i=1 p(xi ) = yi = f (xi ) for every i. According to the theory, the above relation is exact, and we expect (or rather hope) that computationally it will be exact (or very close
to exact).
2 (i) Define equidistant nodal points as xi = −1+ m−1 (i−1) for i = 1, 2, · · · , m. Let the basis functions be the monomials, φi (x) = xi−1 for i = 1, 2, · · · , m. Using standard inversion of the Vandermonde matrix, evaluate the interpolant at the nodes xi . I.e., compute p(xi ). On a graph, plot f (xi ) − p(xi ). What is maxi |f (xi ) − p(xi )|, the maximum error? How do you explain this?
(ii) Again use the equidistant nodes, xi , but use the Newton polynomial basis. Using standard inversion of the Vandermonde matrix to find the interpolant, compute p(xi ) and plot f (xi )−p(xi ). What is the maximum error? Then use divided differences to find the interpolant, compute p(xi ) and plot f (xi ) − p(xi ). What is the maximum error now? How does it compare with the monomial basis case? Explain.
i−1 (iii) Now use the Chebyshev nodes xi = cos(π − π m−1 ) for i = 1, 2, · · · , m, and use the monomial basis. Now what is the maximum error between p(xi ) and f (xi )? How do you explain this based on what you found in the previous exercises?
(iv) use the Chebyshev nodes and Chebyshev polynomial basis to form the Vandermonde matrix and form the interpolant. What is the maximum error between p(xi ) and f (xi )? How do you explain this? (v) If you had to interpolate a function and were able to choose both the nodal locations and the basis functions, which combination would you choose from those given above? We didn’t consider the Lagrange polynomial basis. Why not? What is the disadvantage of Lagrange interpolation? (Consider a different aspect of interpolation: evaluating the interpolant at non-nodal locations which we called z in class.)