POSTED ON: 4/8/2010 Public Domain
CHAPTER 6. LINEAR EQUATIONS

Part 1. Single Linear Equations

An equation of the form
$$Ax + By = C \tag{6.1}$$
where $A$, $B$, $C$ are constants (with $A$ and $B$ not vanishing simultaneously), represents a line in the $xy$-plane. An equation of the form
$$Ax + By + Cz = D \tag{6.2}$$
where $A$, $B$, $C$, $D$ are constants (with $A$, $B$ and $C$ not vanishing simultaneously), represents a plane in the 3-dimensional space $\mathbb{R}^3$. More generally, an equation of the form
$$A_1x_1 + A_2x_2 + \cdots + A_nx_n = B \tag{6.3}$$
where $A_1, A_2, \dots, A_n, B$ are constants (with $A_1, A_2, \dots, A_n$ not vanishing simultaneously), represents an object called a hyperplane in the $n$-dimensional space $\mathbb{R}^n$.

We say that a point $P = (p_1, p_2, \dots, p_n)$ in $\mathbb{R}^n$ is on the hyperplane $H$ represented by (6.3) if the coordinates of $P$ satisfy equation (6.3), that is,
$$A_1p_1 + A_2p_2 + \cdots + A_np_n = B.$$
When $A_1p_1 + A_2p_2 + \cdots + A_np_n \neq B$, we say that $P$ is not on $H$ (or, $H$ does not contain $P$).

To give a quick example, consider the line given by the equation $2x - 3y = 1$. The points $P_1 = (2, 1)$ and $P_2 = (-1, -1)$ are on the line, because $2 \times 2 - 3 \times 1 = 1$ and $2 \times (-1) - 3 \times (-1) = 1$. But the point $Q = (1, 2)$ is not on this line, because $2 \times 1 - 3 \times 2 = -4 \neq 1$.

Notice that a line (or, more generally, a hyperplane) is completely determined by the set of all points on it. Because of this, different equations may represent the same line. For example, the equations $2x - 3y = 1$, $4x - 6y = 2$ and $-2x + 3y = -1$ all describe the same line. This is because if the coordinates of a point satisfy one of these equations, then the other two equations are also satisfied.

When a hyperplane $H$ is represented by equation (6.3), the first thing we have to note is the following important fact: the vector $\mathbf{N} = (A_1, A_2, \dots, A_n)$ is perpendicular to $H$; more precisely, for any vector $\mathbf{v}$ parallel to $H$, we have $\mathbf{N} \perp \mathbf{v}$. Indeed, a vector parallel to $H$ can be obtained in the following way. Take arbitrary points $P = (p_1, p_2, \dots, p_n)$ and $Q = (q_1, q_2, \dots, q_n)$ on $H$. Then
$$\mathbf{v} = \overrightarrow{PQ} = (q_1 - p_1, q_2 - p_2, \dots, q_n - p_n)$$
is a vector parallel to $H$. That $P$ and $Q$ are points on $H$ means that their coordinates satisfy equation (6.3), that is,
$$A_1q_1 + A_2q_2 + \cdots + A_nq_n = B,$$
$$A_1p_1 + A_2p_2 + \cdots + A_np_n = B.$$
Subtracting the second identity from the first, we obtain
$$A_1(q_1 - p_1) + A_2(q_2 - p_2) + \cdots + A_n(q_n - p_n) = 0.$$
The left-hand side is nothing but $\mathbf{N} \cdot \mathbf{v}$. So we have $\mathbf{N} \cdot \mathbf{v} = 0$, which means $\mathbf{N} \perp \mathbf{v}$. See the following figure.

[Figure]

Example 6.1. Find the line $\ell$ in the $xy$-plane perpendicular to the vector $(3, 2)$ and through the point $(1, 2)$.

Solution. Since the vector $(3, 2)$ is perpendicular to $\ell$, we can write an equation of this line as $3x + 2y = C$. Since the point $(1, 2)$ is on $\ell$, we have $3 \times 1 + 2 \times 2 = C$ and hence $C = 7$. So the answer is $3x + 2y = 7$.

Example 6.2. Find the plane $\Pi$ through the points $A(1, 2, 6)$, $B(3, -1, 1)$ and $C(2, 1, 5)$.

Solution. Let $\mathbf{u} = \overrightarrow{AB} = (2, -3, -5)$ and $\mathbf{v} = \overrightarrow{AC} = (1, -1, -1)$, which are vectors parallel to $\Pi$. Then their cross product $\mathbf{u} \times \mathbf{v} = (-2, -3, 1)$ is perpendicular to $\Pi$. We can use $(2, 3, -1)$ as the normal vector to write $\Pi$ in the form $2x + 3y - z = k$, where $k$ is a constant. Since the point $(1, 2, 6)$ is on $\Pi$, we have $2 \times 1 + 3 \times 2 - 6 = k$ and hence $k = 2$. So the answer is $2x + 3y - z = 2$.

Example 6.3. Find the distance from the origin $O$ to the plane $\Pi: 2x - 2y + z = 12$.

Solution. Take the "side view" of the plane, as indicated in the following figure.

[Figure]

We look for the point $P$ on $\Pi$ such that $\overrightarrow{OP}$ is perpendicular to $\Pi$. The distance from $O$ to the plane $\Pi$ is just the magnitude of the vector $\overrightarrow{OP}$. Since the vector $\mathbf{N} = (2, -2, 1)$ is also perpendicular to $\Pi$, $\overrightarrow{OP}$ must be a scalar multiple of $\mathbf{N}$, say $\overrightarrow{OP} = a\mathbf{N} = a(2, -2, 1) = (2a, -2a, a)$. Since $P$ is a point on $\Pi$, its coordinates $x = 2a$, $y = -2a$, $z = a$ satisfy the equation for $\Pi$: $2(2a) - 2(-2a) + a = 12$, which gives $9a = 12$, that is, $a = 4/3$. So $\overrightarrow{OP} = (8/3, -8/3, 4/3)$.
Thus the distance from $O$ to $\Pi$ is $|\overrightarrow{OP}| = \sqrt{(8/3)^2 + (-8/3)^2 + (4/3)^2} = \sqrt{16} = 4$.

Exercise 6.1. Find the line $\ell$ in the $xy$-plane perpendicular to the vector $(5, -2)$ and through the point $(2, 3)$.

Exercise 6.2. Find the line $\ell$ in the $xy$-plane perpendicular to the line $2x - 3y = 1$ and through the origin $(0, 0)$. (Hint: $(3, 2)$ is a vector perpendicular to the vector $(2, -3)$.)

Exercise 6.3. Find the angle between the lines $\ell_1: \sqrt{6}x + \sqrt{2}y = 1$ and $\ell_2: \sqrt{6}x - \sqrt{2}y = 1$. (Hint: the angle between two lines is just the angle between their normal vectors.)

Exercise 6.4. Find the plane $\Pi$ through the points $P(1, 0, 0)$, $Q(2, 3, 1)$ and $R(1, 1, 2)$.

Exercise 6.5. Find the plane $\Pi$ through the points $P(a, b, c)$, $Q(b, c, a)$ and $R(c, a, b)$, where $a$, $b$, $c$ are distinct constants.

Exercise 6.6. Find the distance from the origin $O$ to the line $\ell: 3x + 4y = 10$.

Exercise 6.6. Find the distance from the point $P(3, 2, -4)$ to the plane $\Pi: 2x + 2y - z = 0$. (Hint: If $Q$ is the point on $\Pi$ nearest to $P(3, 2, -4)$, then $\overrightarrow{PQ}$ is perpendicular to $\Pi$.)

Part 2. Lines, Planes, etc. in Parametric Equations

A point and a direction determine a line. Given a point $A$ and a nonzero vector $\mathbf{v}$, what sort of equation represents the line through $A$ in the direction of $\mathbf{v}$? To be definite, we assume that $A$ and $\mathbf{v}$ are in the $n$-dimensional space $\mathbb{R}^n$, given by $A = (a_1, a_2, \dots, a_n)$ and $\mathbf{v} = (v_1, v_2, \dots, v_n)$ respectively. Let $P = (x_1, x_2, \dots, x_n)$ be a "general point" on the line $\ell$ through $A$ in the direction of $\mathbf{v}$. Then the vector $\overrightarrow{AP}$ along the line $\ell$ is parallel to $\mathbf{v}$, because $\mathbf{v}$ gives the direction of $\ell$; see the figure below.

[Figure]

Consequently $\overrightarrow{AP}$ is a scalar multiple of $\mathbf{v}$, say
$$\overrightarrow{AP} = t\mathbf{v} \tag{6.4}$$
For convenience, let us put $\mathbf{a} = \overrightarrow{OA} = (a_1, a_2, \dots, a_n)$, which may be called the position vector of $A$. Similarly, we put $\mathbf{x} = \overrightarrow{OP} = (x_1, x_2, \dots, x_n)$, the position vector of $P$. Then we have $\overrightarrow{AP} = \overrightarrow{OP} - \overrightarrow{OA} = \mathbf{x} - \mathbf{a}$, so that (6.4) becomes $\mathbf{x} - \mathbf{a} = t\mathbf{v}$.
After rearranging, we arrive at
$$\mathbf{x} = \mathbf{a} + t\mathbf{v} \tag{6.5}$$
Using coordinates, we rewrite this as
$$(x_1, x_2, \dots, x_n) = (a_1, a_2, \dots, a_n) + t(v_1, v_2, \dots, v_n) = (a_1 + v_1t, a_2 + v_2t, \dots, a_n + v_nt).$$
Matching components on the two sides of the above identity, we obtain
$$x_1 = a_1 + v_1t, \quad x_2 = a_2 + v_2t, \quad \dots, \quad x_n = a_n + v_nt. \tag{6.6}$$
Put more economically, we write $x_k = a_k + v_kt$; $k = 1, 2, \dots, n$. Equations (6.6) are called parametric equations for $\ell$, with parameter $t$. One good way to think of (6.5) is to regard $t$ as the time parameter and $P$ as a point moving along $\ell$ in uniform motion, with (6.5) as its "equation of motion" described in terms of its position vector $\mathbf{x}$. The vector $\mathbf{v}$ turns out to be the velocity of this uniform motion.

To give a quick example, the line through the point $(1, 3)$ parallel to the vector $(-2, 5)$ can be expressed by the parametric equations $x = 1 - 2t$, $y = 3 + 5t$. Next example: we are asked to find parametric equations for the line through the points $A = (5, 3, 4)$ and $B = (1, 6, 2)$. The vector $\overrightarrow{AB} = (-4, 3, -2)$ is parallel to this line. So we can use the parametric equations $x = 5 - 4t$, $y = 3 + 3t$, $z = 4 - 2t$ to describe this line.

More generally, given two points $A$ and $B$ in $\mathbb{R}^n$, we can find parametric equations for the line through them as follows. Let $\mathbf{a} = \overrightarrow{OA}$ and $\mathbf{b} = \overrightarrow{OB}$. Then $\overrightarrow{AB} = \mathbf{b} - \mathbf{a}$ is a vector parallel to the line. Hence, to describe this line, we can use the parametric equation (in vector form) $\mathbf{x} = \mathbf{a} + t(\mathbf{b} - \mathbf{a})$, which can be rewritten as
$$\mathbf{x} = (1 - t)\mathbf{a} + t\mathbf{b}. \tag{6.7}$$
Notice that, when $t = 0$, we have $\mathbf{x} = \mathbf{a}$, the position vector of $A$; when $t = 1$, $\mathbf{x} = \mathbf{b}$, the position vector of $B$; when $t = 1/2$, $\mathbf{x} = (\mathbf{a} + \mathbf{b})/2$, the position vector of the midpoint between $A$ and $B$. It is not hard to see that a point is on the line segment between $A$ and $B$ if and only if its position vector can be expressed as $(1 - t)\mathbf{a} + t\mathbf{b}$ for some $t$ with $0 \le t \le 1$.

Example 6.4.
Find an equation for the plane $\Pi$ through the intersecting lines given by the parametric equations
$$\ell_1: \; x = 2 + t, \quad y = 1 + 3t, \quad z = 3 + 2t;$$
$$\ell_2: \; x = 2 - t, \quad y = 1 + 2t, \quad z = 3 + 2t.$$
Notice that the point $P(2, 1, 3)$ is the intersection of $\ell_1$ and $\ell_2$.

Solution. The vectors $\mathbf{v}_1 = (1, 3, 2)$ and $\mathbf{v}_2 = (-1, 2, 2)$ are parallel to $\ell_1$ and $\ell_2$ respectively. So they are parallel to $\Pi$. Consequently their cross product $\mathbf{v}_1 \times \mathbf{v}_2 = (2, -4, 5)$ is normal (i.e. perpendicular) to $\Pi$. Hence $\Pi$ can be expressed by an equation of the form $2x - 4y + 5z = C$. That the point $P(2, 1, 3)$ is on $\Pi$ tells us that $2 \times 2 - 4 \times 1 + 5 \times 3 = C$, or $C = 15$. Thus our final answer is $2x - 4y + 5z = 15$.

Example 6.5. Find parametric equations for the line $\ell$ through the point $P(2, 6, 3)$ which is parallel to the planes $\Pi_1: 2x + 3y - z = 3$ and $\Pi_2: x + y + 2z = 5$.

Solution. The vectors $\mathbf{N}_1 = (2, 3, -1)$ and $\mathbf{N}_2 = (1, 1, 2)$ are normal to $\Pi_1$ and $\Pi_2$ respectively. Their cross product $\mathbf{v} = \mathbf{N}_1 \times \mathbf{N}_2 = (7, -5, -1)$ is perpendicular to both normals, so it is parallel to both $\Pi_1$ and $\Pi_2$, and hence $\mathbf{v}$ is parallel to $\ell$. Thus we can describe $\ell$ by the parametric equations $x = 2 + 7t$, $y = 6 - 5t$, $z = 3 - t$.

Exercise 6.7. Find an equation for the plane $\Pi$ through the point $P(1, 2, -1)$ and the line $\ell$ given by the parametric equations $x = 2 + t$, $y = 1 + 2t$, $z = 1 - 2t$. (Hint: Notice that the point $Q(2, 1, 1)$ is on $\ell$. Both vectors $\overrightarrow{PQ}$ and $\mathbf{v} = (1, 2, -2)$ are parallel to $\Pi$.)

Exercise 6.8. Let $\ell_1: x = 1 + 2t$, $y = 1 + 3t$, $z = 2 + 2t$ and $\ell_2: x = 1 + 2t$, $y = 1 - t$, $z = 2 + t$ be intersecting lines given in parametric equations. Find parametric equations of the line $\ell$ through the point of intersection $(1, 1, 2)$, perpendicular to both $\ell_1$ and $\ell_2$.

Exercise 6.9. Let $\ell_1: x = 2 + 2t$, $y = 1 + 3t$, $z = 1 + 2t$ and $\ell_2: x = 1 + 2t$, $y = 1 - t$, $z = 2 + t$ be lines given in parametric equations. Check that $\ell_1$ and $\ell_2$ are neither intersecting nor parallel. Find an equation of the plane through $\ell_1$ and parallel to $\ell_2$, and find an equation of the plane through $\ell_2$ and parallel to $\ell_1$.

A point and two directions determine a plane.
Given a point $A$ and two nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ which are not parallel to each other, we look for parametric equations representing the plane $\Pi$ through $A$ parallel to both $\mathbf{u}$ and $\mathbf{v}$. To be definite, we assume that $A$, $\mathbf{u}$ and $\mathbf{v}$ are in the $n$-dimensional space $\mathbb{R}^n$, given by $A = (a_1, a_2, \dots, a_n)$, $\mathbf{u} = (u_1, u_2, \dots, u_n)$ and $\mathbf{v} = (v_1, v_2, \dots, v_n)$ respectively. Let $P = (x_1, x_2, \dots, x_n)$ be a "general point" on $\Pi$. Then the vector $\overrightarrow{AP}$ can be written as the sum of two vectors, one parallel to $\mathbf{u}$ and the other parallel to $\mathbf{v}$. We can express the one parallel to $\mathbf{u}$ as a scalar multiple of $\mathbf{u}$, say $s\mathbf{u}$. Similarly we can write the other as $t\mathbf{v}$ for some scalar $t$. See the figure below.

[Figure]

Thus we have $\overrightarrow{AP} = s\mathbf{u} + t\mathbf{v}$. Put $\mathbf{a} = \overrightarrow{OA} = (a_1, a_2, \dots, a_n)$ (the position vector of $A$) and $\mathbf{x} = \overrightarrow{OP} = (x_1, x_2, \dots, x_n)$. Then we have $\overrightarrow{AP} = \overrightarrow{OP} - \overrightarrow{OA} = \mathbf{x} - \mathbf{a}$, so that the last identity becomes $\mathbf{x} - \mathbf{a} = s\mathbf{u} + t\mathbf{v}$. After rearranging, we arrive at
$$\mathbf{x} = \mathbf{a} + s\mathbf{u} + t\mathbf{v} \tag{6.8}$$
In coordinates,
$$(x_1, x_2, \dots, x_n) = (a_1 + u_1s + v_1t, \; a_2 + u_2s + v_2t, \; \dots, \; a_n + u_ns + v_nt).$$
Matching components on the two sides of the above identity, we obtain
$$x_k = a_k + u_ks + v_kt; \quad k = 1, 2, \dots, n. \tag{6.9}$$
Equations (6.9) are parametric equations for the plane $\Pi$, with parameters $s$ and $t$.

In the 3-dimensional space, we can use either parametric equations or an equation of the form $Ax + By + Cz = D$ to represent a plane $\Pi$. To avoid confusion, we call $Ax + By + Cz = D$ a normal equation for $\Pi$. We choose such a name because the coefficients $A$, $B$, $C$ here are the components of the vector $(A, B, C)$ normal to this plane.

Example 6.6. Find a normal equation for the plane $\Pi$ which can be described by the parametric equations $x = 2 + 3s + 2t$, $y = 1 + 2s - 3t$, $z = 1 + s + t$.

Solution. The answer can be obtained by an algebraic procedure for eliminating $s$ and $t$. But here we prefer to use the geometric approach.
Since the vectors $\mathbf{u} = (3, 2, 1)$ and $\mathbf{v} = (2, -3, 1)$ are parallel to $\Pi$, their cross product $\mathbf{N} = \mathbf{u} \times \mathbf{v} = (5, -1, -13)$ is normal to $\Pi$. (As a check, $\mathbf{N} \cdot \mathbf{u} = 15 - 2 - 13 = 0$ and $\mathbf{N} \cdot \mathbf{v} = 10 + 3 - 13 = 0$.) So a normal equation for $\Pi$ has the form $5x - y - 13z = D$. That the point $(2, 1, 1)$ is on $\Pi$ tells us that $5 \times 2 - 1 \times 1 - 13 \times 1 = D$ and hence $D = -4$. Our final answer is $5x - y - 13z + 4 = 0$.

Example 6.7. Find parametric equations for the plane $\Pi$ through the line $\ell_1: x = 2 + 2t$, $y = 1 + 3t$, $z = 1 + 2t$ and parallel to the line $\ell_2: x = 1 + 2t$, $y = 1 - t$, $z = 2 + t$.

Solution. The given information tells us that the vectors $\mathbf{u} = (2, 3, 2)$ and $\mathbf{v} = (2, -1, 1)$ are parallel to $\Pi$ and the point $(2, 1, 1)$ is on $\Pi$. Hence $x = 2 + 2s + 2t$, $y = 1 + 3s - t$, $z = 1 + 2s + t$ are parametric equations for $\Pi$.

Exercise 6.10. Find an equation for the plane passing through the origin perpendicular to both planes $2x + y + z = 3$ and $x - y + 2z + 1 = 0$.

Exercise 6.11. Find an equation for the plane which is perpendicular to the plane $2x + 3y + z + 1 = 0$ and contains the line $\ell$ given by the parametric equations $x = 2 + t$, $y = -3 + t$, $z = -1 + 2t$.

Exercise 6.12. Find the distance between the parametric lines $\ell_1: x = 2 + 2t$, $y = 1 + 3t$, $z = 1 + 2t$ and $\ell_2: x = 1 + 2t$, $y = 1 - t$, $z = 2 + t$. (Hint: This distance is the same as the distance between the two planes found in Exercise 6.9.)

We have seen that lines and planes can be represented by parametric equations. A line uses one parameter, and a plane uses two. The number of parameters depends on the dimension of the object we study. Now, let your imagination loose and think of an $r$-dimensional object $F$ in the $n$-dimensional space $\mathbb{R}^n$, which will be called an $r$-flat (or, using a standard term, an $r$-dimensional affine subspace). When $r = 1$, $F$ is a line, and when $r = 2$, $F$ is a plane. In general, an $r$-flat $F$ is determined by a point $A$ on it, with position vector $\mathbf{a}$, and $r$ vectors $\mathbf{v}_j$ ($1 \le j \le r$) parallel to $F$, which are in "general position", or more precisely, linearly independent. A point $P(x_1, x_2, \dots, x_n)$ is on $F$ if and only if its position vector $\mathbf{x} = (x_1, x_2, \dots, x_n)$ can be expressed as
$$\mathbf{x} = \mathbf{a} + t_1\mathbf{v}_1 + t_2\mathbf{v}_2 + \cdots + t_r\mathbf{v}_r \tag{6.10}$$
for some suitable choice of $t_1, t_2, \dots, t_r$. We call (6.10) a parametric representation of the flat $F$, in vector form. Employing the summation symbol, this can be recast as
$$\mathbf{x} = \mathbf{a} + \sum_{j=1}^{r} t_j\mathbf{v}_j.$$

Parametric equations give an explicit way to describe a flat. On the other hand, a flat is often defined implicitly as the solution set of a system of linear equations, and the algebraic description of this solution set is often put in parametric equations of the form (6.10), as we will see in the next part.

Here we need to explain the requirement of linear independence of the vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_r$ in (6.10) above, which guarantees that the number $r$ of parameters $t_1, t_2, \dots, t_r$ is minimal in describing the flat $F$. Roughly speaking, it says that the only possible linear relation among these vectors is the trivial one. By a linear relation among the vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_r$ we mean an identity of the form
$$a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_r\mathbf{v}_r = \mathbf{0} \tag{6.11}$$
where $a_1, a_2, \dots, a_r$ are some scalars. Certainly (6.11) holds when $a_1 = 0, a_2 = 0, \dots, a_r = 0$, that is,
$$0\,\mathbf{v}_1 + 0\,\mathbf{v}_2 + \cdots + 0\,\mathbf{v}_r = \mathbf{0}. \tag{6.12}$$
We call (6.12) a trivial linear relation. We say that the vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_r$ are linearly independent if this is the only possible linear relation among these vectors, in other words,
$$a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_r\mathbf{v}_r = \mathbf{0} \quad \text{implies} \quad a_1 = 0, \; a_2 = 0, \; \dots, \; a_r = 0.$$

For a quick example, we note that the vectors $\mathbf{u} = (1, 0)$ and $\mathbf{v} = (1, 1)$ are linearly independent. To see this, we take a linear relation $a\mathbf{u} + b\mathbf{v} = \mathbf{0}$. Then $a(1, 0) + b(1, 1) = (0, 0)$, or $(a + b, b) = (0, 0)$, that is, $a + b = 0$ and $b = 0$, which give $a = 0$, $b = 0$.

The opposite of linear independence is linear dependence. We say that a (finite) set of vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_r$ is linearly dependent if there is a nontrivial linear relation among them, in other words, (6.11) holds for some scalars $a_1, a_2, \dots, a_r$, not all equal to zero (that is, at least one of these scalars is nonzero). To give a quick example, we note that the vectors $\mathbf{v}_1 = (1, 2, 3)$, $\mathbf{v}_2 = (4, 5, 6)$ and $\mathbf{v}_3 = (7, 8, 9)$ are linearly dependent, in view of $\mathbf{v}_1 - 2\mathbf{v}_2 + \mathbf{v}_3 = \mathbf{0}$. Notice that we can use fewer parameters to describe the flat given by $\mathbf{x} = t_1\mathbf{v}_1 + t_2\mathbf{v}_2 + t_3\mathbf{v}_3$. Indeed, from $\mathbf{v}_1 - 2\mathbf{v}_2 + \mathbf{v}_3 = \mathbf{0}$ we have $\mathbf{v}_3 = 2\mathbf{v}_2 - \mathbf{v}_1$ and hence $\mathbf{x} = t_1\mathbf{v}_1 + t_2\mathbf{v}_2 + t_3\mathbf{v}_3$ becomes
$$\mathbf{x} = t_1\mathbf{v}_1 + t_2\mathbf{v}_2 + t_3(2\mathbf{v}_2 - \mathbf{v}_1) = (t_1 - t_3)\mathbf{v}_1 + (t_2 + 2t_3)\mathbf{v}_2,$$
or $\mathbf{x} = s_1\mathbf{v}_1 + s_2\mathbf{v}_2$, where $s_1 = t_1 - t_3$ and $s_2 = t_2 + 2t_3$. Hence two parameters $s_1, s_2$ instead of three are enough for describing this flat. This example shows why we require the vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_r$ in (6.10) to be linearly independent.

Exercise 6.13. Prove that the vectors $(1, 1, 1, 1)$, $(0, 1, 1, 1)$, $(0, 0, 1, 1)$, $(0, 0, 0, 1)$ in $\mathbb{R}^4$ are linearly independent.

Exercise 6.14. Prove that the vectors $(1, 0)$, $(0, 1)$, $(a, b)$ are linearly dependent, where $a$, $b$ are arbitrary numbers.

Exercise 6.15. Prove that if nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ are linearly dependent, then each of them is a scalar multiple of the other.

Exercise 6.16. Prove that if vectors $\mathbf{u}$ and $\mathbf{v}$ are linearly independent, then $2\mathbf{u} + \mathbf{v}$ and $\mathbf{u} - 2\mathbf{v}$ are also linearly independent.

Problem 6.17. Prove that if nonzero vectors $\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_m$ are mutually perpendicular in the sense that $\mathbf{u}_j \cdot \mathbf{u}_k = 0$ whenever $j \neq k$, then they are linearly independent.

Part 3. Systems of Linear Equations and Flats

A plane in 3-space can be described by a system of one equation of the form
$$ax + by + cz = d.$$
A line in 3-space can be described by a system of two equations of the form
$$a_1x + b_1y + c_1z = d_1$$
$$a_2x + b_2y + c_2z = d_2$$
There are two equations, each of which represents a plane, and together they represent the line which is the intersection of these two planes.
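To connect this with the parametric equations of Part 2: a vector parallel to the line of intersection must be perpendicular to both normal vectors, so the cross product of the normals gives a direction for the line (the reasoning used in Example 6.5). The following is a minimal Python sketch of that computation, assuming the two planes are not parallel. The helper names `cross` and `dot` are illustrative and not part of the text; the sample normals are those of Example 6.5.

```python
def cross(u, v):
    """Cross product of two vectors in 3-space."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


# Normal vectors of the planes 2x + 3y - z = 3 and x + y + 2z = 5.
n1 = (2, 3, -1)
n2 = (1, 1, 2)

# A direction vector for the line of intersection of the two planes.
direction = cross(n1, n2)

# The direction is perpendicular to both normals, hence parallel to both planes.
perpendicular_to_both = dot(direction, n1) == 0 and dot(direction, n2) == 0
```

A zero cross product would signal parallel normals, in which case the two planes do not meet in a line.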
A line in 2-space can be described by a system of one equation of the form
$$ax + by = c.$$
In general, a flat in $\mathbb{R}^n$ can be presented as the solution set of a system of linear equations in $n$ variables, say $x_1, x_2, \dots, x_n$:
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned} \tag{6.13}$$
In order to put this flat in parametric representation, we solve this system of equations. Here we describe a general procedure for solving this system, called Gaussian elimination. Arrange the coefficients and constant terms of the above system into a table (or a matrix), called the augmented matrix of the system (6.13):
$$A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & b_m
\end{pmatrix}$$
The numbers $a_{11}, a_{12}, \dots, a_{1n}, b_1$ form the first row of $A$, the numbers $a_{21}, a_{22}, \dots, a_{2n}, b_2$ form the second row, etc. We are allowed to perform the following three types of so-called elementary row operations:

(a) interchange two rows;
(b) multiply (or divide) a row by a nonzero number; and
(c) add a nonzero multiple of one row to another.

It is not hard to see that every elementary row operation on the augmented matrix $A$ corresponds to a modification of the system of equations in (6.13), and such a modification does not alter the solution set of the system.

After a sequence of elementary row operations, we can "row reduce" $A$ into a matrix of a particular form, called the echelon form. A matrix $B$ is in echelon form if:

1. there is a positive integer $r$, called the rank of $B$, such that the first $r$ rows are nonzero rows and the remaining rows are zero rows (by a nonzero row we mean a row with at least one nonzero number);
2. the first nonzero number in a nonzero row is 1, called the leading one of the row;
3. if the leading one of the $i$th row occurs in the $k_i$th entry ($1 \le i \le r$), then $k_1 < k_2 < \cdots < k_r$.
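The three conditions above can be checked mechanically. Below is a minimal Python sketch, assuming a matrix is stored as a list of rows of numbers; the function names `leading_index` and `is_echelon` are illustrative and not part of the text.

```python
def leading_index(row):
    """Position of the first nonzero entry of a row, or None for a zero row."""
    for k, entry in enumerate(row):
        if entry != 0:
            return k
    return None


def is_echelon(matrix):
    """Check conditions 1-3 of the echelon form definition."""
    positions = [leading_index(row) for row in matrix]
    # r = number of nonzero rows; the definition asks for r to be positive.
    r = sum(p is not None for p in positions)
    if r == 0:
        return False
    # Condition 1: the nonzero rows must be exactly the first r rows.
    if any(p is None for p in positions[:r]):
        return False
    # Condition 2: the first nonzero number of each nonzero row is 1.
    if any(matrix[i][positions[i]] != 1 for i in range(r)):
        return False
    # Condition 3: the leading ones move strictly to the right, k1 < k2 < ... < kr.
    return all(positions[i] < positions[i + 1] for i in range(r - 1))
```

For instance, `is_echelon([[1, 2, 0, 3], [0, 0, 1, 2], [0, 0, 0, 0]])` holds, while a matrix with a zero row above a nonzero row fails condition 1.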
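For example,
$$\begin{pmatrix}
0 & 0 & 1 & * & * & * & * & * \\
0 & 0 & 0 & 1 & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & 1 & * & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix},$$
where the entries filled with $*$ can take arbitrary values, is an echelon matrix. Once we row reduce an augmented matrix into an echelon form, the corresponding system is easy to solve by backward substitution.

Example 6.8. Let $F$ be the flat which is the solution set of the following system of linear equations:
$$\begin{aligned}
x_1 + 2x_2 + x_3 + 5x_4 + 6x_5 &= 0 \\
2x_1 + 4x_2 + 6x_4 + 4x_5 &= -2 \\
x_1 + 2x_2 - x_3 + x_4 - 2x_5 &= -2.
\end{aligned}$$
Find a parametric representation of $F$.

Solution. Row reduce the augmented matrix as follows:
$$A = \begin{pmatrix} 1 & 2 & 1 & 5 & 6 & 0 \\ 2 & 4 & 0 & 6 & 4 & -2 \\ 1 & 2 & -1 & 1 & -2 & -2 \end{pmatrix}
\begin{matrix} \\ R_2 - 2R_1 \\ R_3 - R_1 \end{matrix}$$
$$\sim \begin{pmatrix} 1 & 2 & 1 & 5 & 6 & 0 \\ 0 & 0 & -2 & -4 & -8 & -2 \\ 0 & 0 & -2 & -4 & -8 & -2 \end{pmatrix}
\begin{matrix} \\ \\ R_3 - R_2 \end{matrix}$$
$$\sim \begin{pmatrix} 1 & 2 & 1 & 5 & 6 & 0 \\ 0 & 0 & -2 & -4 & -8 & -2 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}
\begin{matrix} \\ (-1/2) \times R_2 \\ \\ \end{matrix}$$
$$\sim \begin{pmatrix} 1 & 2 & 1 & 5 & 6 & 0 \\ 0 & 0 & 1 & 2 & 4 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}
\begin{matrix} R_1 - R_2 \\ \\ \\ \end{matrix}$$
$$\sim \begin{pmatrix} 1 & 2 & 0 & 3 & 2 & -1 \\ 0 & 0 & 1 & 2 & 4 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} = B$$
Here $R_2 - 2R_1$ means subtracting twice the first row from the second row, $(-1/2) \times R_2$ means multiplying the second row by $-1/2$, and $\sim$ is read as "row equivalent to". The system of equations corresponding to the last echelon matrix $B$ is
$$x_1 + 2x_2 + 3x_4 + 2x_5 = -1, \qquad x_3 + 2x_4 + 4x_5 = 1. \tag{$*$}$$
There are two leading ones, which are the coefficients of the variables $x_1$ and $x_3$. The "leading variables" $x_1$, $x_3$ are uniquely determined by the other variables, namely $x_2$, $x_4$ and $x_5$, whose values can be freely assigned. Letting $t_1 = x_2$, $t_2 = x_4$ and $t_3 = x_5$, $(*)$ gives
$$x_1 = -1 - 2t_1 - 3t_2 - 2t_3, \quad x_2 = t_1, \quad x_3 = 1 - 2t_2 - 4t_3, \quad x_4 = t_2, \quad x_5 = t_3,$$
which are parametric equations for the flat $F$. We can also put this parametric representation of $F$ in a vector expression of the form (6.10) discussed in the previous part:
$$\mathbf{x} = \mathbf{a} + t_1\mathbf{v}_1 + t_2\mathbf{v}_2 + t_3\mathbf{v}_3,$$
where $\mathbf{a} = (-1, 0, 1, 0, 0)$, $\mathbf{v}_1 = (-2, 1, 0, 0, 0)$, $\mathbf{v}_2 = (-3, 0, -2, 1, 0)$ and $\mathbf{v}_3 = (-2, 0, -4, 0, 1)$. As a check, substituting $\mathbf{x} = \mathbf{a}$ into the three original equations gives $-1 + 1 = 0$, $2(-1) = -2$ and $-1 - 1 = -2$.

Exercise 6.18. Solve the system of equations with its augmented matrix given by
$$A = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 2 & 1 & 3 & 4 & 2 \\ 4 & 3 & 5 & 3 & 2 \end{pmatrix}$$

Exercise 6.19.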
Let $F$ be a flat implicitly defined by the system of linear equations
$$x_1 + x_2 + x_3 + x_4 + x_5 = 2, \qquad 2x_1 + 2x_2 - x_3 - x_4 + 2x_5 = 1.$$
Find a parametric representation of this flat. What is the dimension of $F$?

Problem 6.20. Notice that the system (6.13) can be rewritten as $\mathbf{a}_i \cdot \mathbf{x} = b_i$ ($1 \le i \le m$), where $\mathbf{a}_i = (a_{i1}, a_{i2}, \dots, a_{in})$ are called the row vectors of the coefficient matrix
$$A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}$$
Let the solution set of the system be given in the parametric form $\mathbf{x} = \mathbf{v}_0 + t_1\mathbf{v}_1 + t_2\mathbf{v}_2 + \cdots + t_r\mathbf{v}_r$. Prove that $\mathbf{a}_i \perp \mathbf{v}_j$ for all $i, j$ with $1 \le i \le m$ and $1 \le j \le r$.
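The Gaussian elimination procedure of Part 3 can also be carried out by machine. The following is a minimal Python sketch, not a definitive implementation: it uses exact fractions, reduces an augmented matrix with the three elementary row operations, reads off a parametric representation, and numerically checks the perpendicularity claimed in Problem 6.20 on one sample. The function names `row_reduce` and `parametric_form` are illustrative, the sketch assumes the system is consistent, and the sample data is the augmented matrix $A$ printed in Example 6.8.

```python
from fractions import Fraction


def row_reduce(rows):
    """Reduce an augmented matrix so each leading one is alone in its column."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row, pivot_cols = 0, []
    for col in range(n_cols - 1):  # the last column holds the constant terms
        src = next((i for i in range(pivot_row, n_rows) if m[i][col] != 0), None)
        if src is None:
            continue  # no leading one in this column: a free variable
        m[pivot_row], m[src] = m[src], m[pivot_row]        # (a) interchange rows
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]    # (b) scale to a leading one
        for i in range(n_rows):                            # (c) clear the column
            if i != pivot_row and m[i][col] != 0:
                factor = m[i][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[pivot_row])]
        pivot_cols.append(col)
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m, pivot_cols


def parametric_form(reduced, pivot_cols):
    """Read off x = a + t1 v1 + ... + tr vr (assumes a consistent system)."""
    n = len(reduced[0]) - 1
    free_cols = [c for c in range(n) if c not in pivot_cols]
    a = [Fraction(0)] * n
    for r, pc in enumerate(pivot_cols):
        a[pc] = reduced[r][-1]
    vs = []
    for fc in free_cols:  # one direction vector per free variable
        v = [Fraction(0)] * n
        v[fc] = Fraction(1)
        for r, pc in enumerate(pivot_cols):
            v[pc] = -reduced[r][fc]
        vs.append(v)
    return a, vs


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


# Augmented matrix A as printed in Example 6.8.
A = [[1, 2, 1, 5, 6, 0],
     [2, 4, 0, 6, 4, -2],
     [1, 2, -1, 1, -2, -2]]

B, pivots = row_reduce(A)
a, vs = parametric_form(B, pivots)

# Problem 6.20 on this sample: each row vector of the coefficient matrix
# is perpendicular to each direction vector v_j.
rows_perp = all(dot(row[:-1], v) == 0 for row in A for v in vs)
# The base point a itself satisfies every equation of the system.
a_solves = all(dot(row[:-1], a) == row[-1] for row in A)
```

The number of direction vectors returned, `len(vs)`, is the number of free variables, which is the dimension of the flat. Exact `Fraction` arithmetic is used so that the tests for zero are reliable; floating-point arithmetic would need a tolerance instead.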