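Mohan Sahu
mohansahutgv@gmail.com

Optimisation

The Second Derivative Test

I shall work only with functions

\[ f : \mathbb{R}^2 \to \mathbb{R}, \qquad \begin{bmatrix} x \\ y \end{bmatrix} \mapsto f\begin{bmatrix} x \\ y \end{bmatrix} \]

[eg. $f\begin{bmatrix} x \\ y \end{bmatrix} = z = xy - x^5/5 - y^3/3 + 4$]. This has as its graph a surface always, see figure 2.1.

Figure 2.1: Graph of a function from $\mathbb{R}^2$ to $\mathbb{R}$

The first derivative is

\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right] \]

a $1 \times 2$ matrix which is, at a point $\begin{bmatrix} a \\ b \end{bmatrix}$, just a pair of numbers.

[eg. for $f\begin{bmatrix} x \\ y \end{bmatrix} = z = xy - x^5/5 - y^3/3 + 4$,

\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right] = \left[ y - x^4, \; x - y^2 \right] \]

\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]_{\begin{bmatrix} 2 \\ 3 \end{bmatrix}} = [3 - 16, \; 2 - 9] = [-13, -7] \]

]

This matrix should be thought of as a linear map from $\mathbb{R}^2$ to $\mathbb{R}$:

\[ [-13, -7] \begin{bmatrix} x \\ y \end{bmatrix} = -13x - 7y \]

It is the linear part of an affine map from $\mathbb{R}^2$ to $\mathbb{R}$

\[ z = [-13, -7] \begin{bmatrix} x - 2 \\ y - 3 \end{bmatrix} + \underbrace{(2)(3) - \frac{2^5}{5} - \frac{3^3}{3} + 4}_{f\begin{bmatrix} 2 \\ 3 \end{bmatrix} \, = \, -5.4} \]

This is just the two dimensional version of $y = mx + c$ and has graph a plane which is tangent to $f\begin{bmatrix} x \\ y \end{bmatrix} = xy - x^5/5 - y^3/3 + 4$ at the point $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$.

So this generalises the familiar case of $y = mx + c$ being tangent to $y = f(x)$ at a point and $m$ being the derivative at that point, as in figure 2.2.

Figure 2.2: Graph of a function from $\mathbb{R}$ to $\mathbb{R}$

To find a critical point of this function, that is a maximum, minimum or saddle point, we want the tangent plane to be horizontal hence:

Definition 2.1. If $f : \mathbb{R}^2 \to \mathbb{R}$ is differentiable and $\begin{bmatrix} a \\ b \end{bmatrix}$ is a critical point of $f$ then

\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]_{\begin{bmatrix} a \\ b \end{bmatrix}} = [0, 0]. \]

Remark 2.1.1. I deal with maps $f : \mathbb{R}^n \to \mathbb{R}$ when $n = 2$, but generalising to larger $n$ is quite trivial. We would have that $f$ is differentiable at $a \in \mathbb{R}^n$ if and only if there is a unique affine (linear plus a shift) map from $\mathbb{R}^n$ to $\mathbb{R}$ tangent to $f$ at $a$. The linear part of this then has a (row) matrix representing it

\[ \left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \cdots, \frac{\partial f}{\partial x_n} \right]_{x = a} \]

Remark 2.1.2. We would like to carry the old second derivative test through from one dimension to two (at least) to distinguish between maxima, minima and saddle points. This remark will make more sense if you play with the DEMO's program on the Graphing Calculator and plot the five cases mentioned in the introduction.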
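The first derivative and tangent plane worked out earlier for $f = xy - x^5/5 - y^3/3 + 4$ at the point (2, 3) can be checked numerically. What follows is only a sketch in plain Python: the function names `f`, `grad_f` and `tangent_plane` are made up for illustration, and the partial derivatives are typed in by hand from the formulas in the text rather than computed symbolically.

```python
def f(x, y):
    """The running example f(x, y) = xy - x^5/5 - y^3/3 + 4."""
    return x * y - x**5 / 5 - y**3 / 3 + 4


def grad_f(x, y):
    """The first derivative [df/dx, df/dy] = [y - x^4, x - y^2]."""
    return (y - x**4, x - y**2)


def tangent_plane(a, b):
    """Return the affine map z(x, y) that is tangent to f at (a, b)."""
    fx, fy = grad_f(a, b)
    z0 = f(a, b)
    # Linear part applied to the shifted point, plus the height f(a, b).
    return lambda x, y: fx * (x - a) + fy * (y - b) + z0


if __name__ == "__main__":
    print(grad_f(2, 3))  # (-13, -7)
    print(f(2, 3))       # close to -5.4, up to floating-point rounding

    plane = tangent_plane(2, 3)
    # Near (2, 3) the gap between surface and plane shrinks like h squared.
    for h in (1e-1, 1e-2, 1e-3):
        print(h, f(2 + h, 3 + h) - plane(2 + h, 3 + h))
```

The shrinking gap in the last loop is what "tangent" means here: the plane agrees with the surface to first order at (2, 3).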
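I sure hope you can draw the surfaces $x^2 + y^2$ and $x^2 - y^2$, because if not you are DEAD MEAT.

Definition 2.2. A quadratic form on $\mathbb{R}^2$ is a function $f : \mathbb{R}^2 \to \mathbb{R}$ which is a sum of real multiples of terms $x^p y^q$ where $p, q \in \mathbb{N}$ (the natural numbers: 0, 1, 2, 3, ...) and $p + q \le 2$ and at least one term has $p + q = 2$.

Definition 2.3. [alternative 1:] A quadratic form on $\mathbb{R}^2$ is a function $f : \mathbb{R}^2 \to \mathbb{R}$ which can be written

\[ f\begin{bmatrix} x \\ y \end{bmatrix} = ax^2 + bxy + cy^2 + dx + ey + g \]

for some numbers $a, b, c, d, e, g$ and not all of $a, b, c$ are zero.

Definition 2.4. [alternative 2:] A quadratic form on $\mathbb{R}^2$ is a function $f : \mathbb{R}^2 \to \mathbb{R}$ which can be written

\[ f\begin{bmatrix} x \\ y \end{bmatrix} = [x - \alpha, \; y - \beta] \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x - \alpha \\ y - \beta \end{bmatrix} + c \]

for real numbers $\alpha, \beta, c, a_{ij}$, $1 \le i, j \le 2$, and with $a_{12} = a_{21}$.

Remark 2.1.3. You might want to check that all these 3 definitions are equivalent. Notice that this is just a polynomial function of degree two in two variables.

Definition 2.5. If $f : \mathbb{R}^2 \to \mathbb{R}$ is twice differentiable at $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix}$ the second derivative is the matrix in the quadratic form

\[ [x - a, \; y - b] \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial y \partial x} \\[2ex] \dfrac{\partial^2 f}{\partial x \partial y} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix} \begin{bmatrix} x - a \\ y - b \end{bmatrix} \]

Remark 2.1.4. When the first derivative is zero, it is the "best fitting quadratic" to $f$ at $\begin{bmatrix} a \\ b \end{bmatrix}$, although we need to add in a constant to lift it up so that it is "more than tangent" to the surface which is the graph of $f$ at $\begin{bmatrix} a \\ b \end{bmatrix}$. You met this in first semester in Taylor's theorem for functions of two variables.

Theorem 2.1. If the determinant of the second derivative is positive at $\begin{bmatrix} a \\ b \end{bmatrix}$ for a twice continuously differentiable function $f$ having first derivative zero at $\begin{bmatrix} a \\ b \end{bmatrix}$, then in a neighbourhood of $\begin{bmatrix} a \\ b \end{bmatrix}$, $f$ has either a maximum or a minimum at $\begin{bmatrix} a \\ b \end{bmatrix}$, whereas if the determinant is negative then $f$ has a saddle point at $\begin{bmatrix} a \\ b \end{bmatrix}$. If the determinant is zero, the test is uninformative.

“Proof” by arm-waving: We have that if the first derivative is zero, the second derivative at $\begin{bmatrix} a \\ b \end{bmatrix}$ of $f$ is the approximating quadratic form from Taylor's theorem, so we can work with this (second order) approximation to $f$ in order to decide what shape (approximately) the graph of $f$ has.

The quadratic approximation is just a symmetric matrix, and all the information about the shape of the surface is contained in it. Because it is symmetric, it can be diagonalised by an orthogonal matrix (ie we can rotate the surface until the quadratic form matrix is just

\[ \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} \]

). We can now rescale the new $x$ and $y$ axes by multiplying the $x$ by $\sqrt{|a|}$ and the $y$ by $\sqrt{|b|}$ (neither is zero when the determinant is nonzero). This won't change the shape of the surface in any essential way.

This means all quadratic forms with nonzero determinant are, up to shifting, rotating and stretching,

\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \text{ or } \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} \text{ or } \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \text{ or } \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \]

ie. $x^2 + y^2$ or $-x^2 - y^2$ or $x^2 - y^2$ or $-x^2 + y^2$, since

\[ [x, y] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = x^2 + y^2 \]

et cetera. We do not have to actually do the diagonalisation. We simply note that the determinant in the first two cases is positive and negative in the last two, and the determinant is not changed by rotations, nor is the sign of the determinant changed by scalings.

Proposition 2.1.1. If $f : \mathbb{R}^2 \to \mathbb{R}$ is a function which has

\[ Df\begin{bmatrix} a \\ b \end{bmatrix} = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]_{\begin{bmatrix} a \\ b \end{bmatrix}} \]

zero and if

\[ D^2 f\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial y \partial x} \\[2ex] \dfrac{\partial^2 f}{\partial x \partial y} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}_{\begin{bmatrix} a \\ b \end{bmatrix}} \]

is continuous on a neighbourhood of $\begin{bmatrix} a \\ b \end{bmatrix}$ and if

\[ \det D^2 f\begin{bmatrix} a \\ b \end{bmatrix} > 0 \]

then if

\[ \frac{\partial^2 f}{\partial x^2}\begin{bmatrix} a \\ b \end{bmatrix} > 0 \]

$f$ has a local minimum at $\begin{bmatrix} a \\ b \end{bmatrix}$, and if

\[ \frac{\partial^2 f}{\partial x^2}\begin{bmatrix} a \\ b \end{bmatrix} < 0 \]

$f$ has a local maximum at $\begin{bmatrix} a \\ b \end{bmatrix}$.

“Proof” The trace of a matrix is the sum of the diagonal terms and this is also unchanged by rotations, and the sign of it is unchanged by scalings. So again we reduce to the four possible basic quadratic forms

\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \]

that is $x^2 + y^2$, $-x^2 - y^2$, $x^2 - y^2$, $-x^2 + y^2$, and the trace distinguishes between the first two, being positive at a minimum and negative at a maximum.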
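Since the two diagonal terms have the same sign we need only look at the sign of the first.

Example 2.1.1. Find and classify all critical points of

\[ f\begin{bmatrix} x \\ y \end{bmatrix} = xy - x^4 - y^2 + 2. \]

Solution

\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right] = \left[ y - 4x^3, \; x - 2y \right] \]

at a critical point, this is the zero matrix $[0, 0]$ so $y = 4x^3$ and $y = \tfrac{1}{2}x$ so

\[ \tfrac{1}{2}x = 4x^3 \;\Rightarrow\; x = 0 \text{ or } x^2 = \tfrac{1}{8} \]

so $x = 0$ or $x = \dfrac{1}{\sqrt{8}}$ or $x = \dfrac{-1}{\sqrt{8}}$, when $y = 0$, $y = \dfrac{1}{2\sqrt{8}}$, $y = \dfrac{-1}{2\sqrt{8}}$, and there are three critical points

\[ \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} \dfrac{1}{\sqrt{8}} \\[1.5ex] \dfrac{1}{2\sqrt{8}} \end{bmatrix}, \quad \begin{bmatrix} \dfrac{-1}{\sqrt{8}} \\[1.5ex] \dfrac{-1}{2\sqrt{8}} \end{bmatrix}. \]

\[ D^2 f = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial y \partial x} \\[2ex] \dfrac{\partial^2 f}{\partial x \partial y} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix} = \begin{bmatrix} -12x^2 & 1 \\ 1 & -2 \end{bmatrix} \]

\[ D^2 f\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & -2 \end{bmatrix} \]

and $\det = -1$ so $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$ is a saddle point.

\[ D^2 f\begin{bmatrix} \dfrac{1}{\sqrt{8}} \\[1.5ex] \dfrac{1}{2\sqrt{8}} \end{bmatrix} = \begin{bmatrix} \dfrac{-12}{8} & 1 \\[1.5ex] 1 & -2 \end{bmatrix} \]

and $\det = 3 - 1 = 2$ so the point is either a maximum or a minimum.

$D^2 f\begin{bmatrix} \dfrac{-1}{\sqrt{8}} \\[1.5ex] \dfrac{-1}{2\sqrt{8}} \end{bmatrix}$ is the same. Since the trace is $-3\tfrac{1}{2}$ both are maxima.

Remark 2.1.5. Only a wild optimist would believe I have got this all correct without making a slip somewhere. So I recommend strongly that you try checking it on a computer with Mathematica, or by using a graphics calculator (or the software on the Mac).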
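In the spirit of Remark 2.1.5, Example 2.1.1 can also be checked with a few lines of plain Python instead of Mathematica. This is only a sketch: the helper names `grad`, `hessian` and `classify` are made up for illustration, the derivatives are typed in by hand from the solution rather than computed symbolically, and the zero-determinant case uses a small tolerance `tol` (an assumption, needed because the critical points are floating-point numbers).

```python
import math


def grad(x, y):
    """First derivative of f(x, y) = xy - x^4 - y^2 + 2."""
    return (y - 4 * x**3, x - 2 * y)


def hessian(x, y):
    """Second derivative: the symmetric matrix [[-12x^2, 1], [1, -2]]."""
    return ((-12 * x**2, 1.0), (1.0, -2.0))


def classify(h, tol=1e-12):
    """Apply Theorem 2.1 and Proposition 2.1.1 to a symmetric 2x2 matrix."""
    (fxx, fxy), (_, fyy) = h
    det = fxx * fyy - fxy * fxy
    if abs(det) <= tol:
        return "uninformative"
    if det < 0:
        return "saddle"
    # det > 0 forces the two diagonal terms to share a sign,
    # so the sign of the first one decides between min and max.
    return "minimum" if fxx > 0 else "maximum"


r = 1 / math.sqrt(8)
critical_points = [(0.0, 0.0), (r, r / 2), (-r, -r / 2)]

if __name__ == "__main__":
    for x, y in critical_points:
        h = hessian(x, y)
        det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
        trace = h[0][0] + h[1][1]
        print((x, y), "grad =", grad(x, y),
              "det =", det, "trace =", trace, "->", classify(h))
```

Running it should report a gradient of (numerically) zero at all three points, a saddle at the origin, and a maximum at each of the other two, in agreement with the solution.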