Multivariate Calculus by mohansahu


									Mohan Sahu
mohansahutgv@gmail.com

Optimisation
The Second Derivative Test
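I shall work only with functions
\[ f : \mathbb{R}^2 \to \mathbb{R}, \qquad \begin{bmatrix} x \\ y \end{bmatrix} \mapsto f\begin{bmatrix} x \\ y \end{bmatrix} \]

[e.g. $f\left[\begin{smallmatrix} x \\ y \end{smallmatrix}\right] = z = xy - x^5/5 - y^3/3 + 4$]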

Its graph is always a surface; see figure 2.1.

Figure 2.1: Graph of a function from R2 to R

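The first derivative is
\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]_{\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]} \]
a $1 \times 2$ matrix which is, at a point $\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]$, just a pair of numbers.

[e.g. for $f\left[\begin{smallmatrix} x \\ y \end{smallmatrix}\right] = z = xy - x^5/5 - y^3/3 + 4$,
\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right] = \left[ y - x^4, \; x - y^2 \right] \]
\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]_{\left[\begin{smallmatrix} 2 \\ 3 \end{smallmatrix}\right]} = [3 - 16, \; 2 - 9] = [-13, -7] \]
]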

This matrix should be thought of as a linear map from R2 to R:
\[ [-13, -7] \begin{bmatrix} x \\ y \end{bmatrix} = -13x - 7y \]

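It is the linear part of an affine map from R2 to R
\[ z = [-13, -7] \begin{bmatrix} x - 2 \\ y - 3 \end{bmatrix} + \underbrace{(2)(3) - \frac{2^5}{5} - \frac{3^3}{3} + 4}_{f\left[\begin{smallmatrix} 2 \\ 3 \end{smallmatrix}\right] \, = \, -5.4} \]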

This is just the two dimensional version of $y = mx + c$ and has as its graph a plane which is tangent to the graph of $f\left[\begin{smallmatrix} x \\ y \end{smallmatrix}\right] = xy - x^5/5 - y^3/3 + 4$ at the point $\left[\begin{smallmatrix} x \\ y \end{smallmatrix}\right] = \left[\begin{smallmatrix} 2 \\ 3 \end{smallmatrix}\right]$. So this generalises the familiar case of $y = mx + c$ being tangent to $y = f(x)$ at a point and $m$ being the derivative at that point, as in figure 2.2.

To find a critical point of this function, that is a maximum, minimum or saddle point, we want the tangent plane to be horizontal, hence:

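Figure 2.2: Graph of a function from R to R

Definition 2.1. If $f : \mathbb{R}^2 \to \mathbb{R}$ is differentiable and $\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]$ is a critical point of $f$ then
\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]_{\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]} = [0, 0]. \]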
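The worked numbers above (the value $-5.4$, the first derivative $[-13, -7]$ and the tangent plane at the point with $x = 2$, $y = 3$) can be checked numerically. What follows is only a sketch in Python; the function names and the finite-difference helper are illustrative assumptions, not part of the notes.

```python
def f(x, y):
    """The running example f(x, y) = xy - x^5/5 - y^3/3 + 4."""
    return x * y - x**5 / 5 - y**3 / 3 + 4


def grad_f(x, y):
    """First derivative [df/dx, df/dy] = [y - x^4, x - y^2], computed by hand."""
    return (y - x**4, x - y**2)


def numeric_grad(func, x, y, h=1e-6):
    """Central-difference estimate of the first derivative (a checking aid only)."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return (dfdx, dfdy)


def tangent_plane(x, y, a=2.0, b=3.0):
    """Affine map z = Df(a, b) applied to [x - a, y - b], plus f(a, b)."""
    gx, gy = grad_f(a, b)
    return gx * (x - a) + gy * (y - b) + f(a, b)


if __name__ == "__main__":
    print("f(2, 3)            =", f(2, 3))
    print("gradient by hand   =", grad_f(2, 3))
    print("gradient numerical =", numeric_grad(f, 2.0, 3.0))
    # Near (2, 3) the plane and the surface should be close but not equal.
    print("plane vs surface at (2.01, 3.01):",
          tangent_plane(2.01, 3.01), f(2.01, 3.01))
```

At a critical point both entries returned by `grad_f` would be zero, which is exactly the condition in Definition 2.1.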

Remark 2.1.1. I deal with maps $f : \mathbb{R}^n \to \mathbb{R}$ when $n = 2$, but generalising to larger $n$ is quite trivial. We would have that $f$ is differentiable at $a \in \mathbb{R}^n$ if and only if there is a unique affine (linear plus a shift) map from $\mathbb{R}^n$ to $\mathbb{R}$ tangent to $f$ at $a$. The linear part of this then has a (row) matrix representing it
\[ \left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \cdots, \frac{\partial f}{\partial x_n} \right]_{x = a} \]

Remark 2.1.2. We would like to carry the old second derivative test through from one dimension to two (at least) to distinguish between maxima, minima and saddle points. This remark will make more sense if you play with the DEMO’s program on the Graphing Calculator and plot the five cases mentioned in the introduction. I sure hope you can draw the surfaces $x^2 + y^2$ and $x^2 - y^2$, because if not you are DEAD MEAT.

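Definition 2.2. A quadratic form on R2 is a function $f : \mathbb{R}^2 \to \mathbb{R}$ which is a sum of constant multiples of terms $x^p y^q$ where $p, q \in \mathbb{N}$ (the natural numbers: 0, 1, 2, 3, ...) and $p + q \le 2$ and at least one term has $p + q = 2$.

Definition 2.3. [alternative 1:] A quadratic form on R2 is a function $f : \mathbb{R}^2 \to \mathbb{R}$ which can be written
\[ f\begin{bmatrix} x \\ y \end{bmatrix} = ax^2 + bxy + cy^2 + dx + ey + g \]
for some numbers $a, b, c, d, e, g$ where not all of $a, b, c$ are zero.

Definition 2.4. [alternative 2:] A quadratic form on R2 is a function $f : \mathbb{R}^2 \to \mathbb{R}$ which can be written
\[ f\begin{bmatrix} x \\ y \end{bmatrix} = [x - \alpha, \; y - \beta] \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x - \alpha \\ y - \beta \end{bmatrix} + c \]
for real numbers $\alpha, \beta, c, a_{ij}$, $1 \le i, j \le 2$, and with $a_{12} = a_{21}$.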
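One direction of the comparison between the definitions is a matter of multiplying out: a form written as in alternative 2 expands into the shape of alternative 1. The sketch below does the expansion by hand and checks it at a sample point. The function names and the sample numbers are made up for illustration, and only this one direction is shown.

```python
def form_alt2(x, y, alpha, beta, a11, a12, a22, c):
    """Alternative 2: [x-alpha, y-beta] A [x-alpha, y-beta]^T + c, with a21 = a12."""
    u, v = x - alpha, y - beta
    return a11 * u * u + 2 * a12 * u * v + a22 * v * v + c


def alt1_coefficients(alpha, beta, a11, a12, a22, c):
    """Coefficients (a, b, c, d, e, g) of alternative 1 obtained by expanding."""
    a = a11
    b = 2 * a12
    c_quadratic = a22
    d = -2 * (a11 * alpha + a12 * beta)
    e = -2 * (a12 * alpha + a22 * beta)
    g = a11 * alpha**2 + 2 * a12 * alpha * beta + a22 * beta**2 + c
    return (a, b, c_quadratic, d, e, g)


def form_alt1(x, y, a, b, c, d, e, g):
    """Alternative 1: ax^2 + bxy + cy^2 + dx + ey + g."""
    return a * x * x + b * x * y + c * y * y + d * x + e * y + g


if __name__ == "__main__":
    params = dict(alpha=1.0, beta=-2.0, a11=3.0, a12=0.5, a22=-1.0, c=4.0)
    coeffs = alt1_coefficients(**params)
    x, y = 0.7, 1.3  # arbitrary sample point
    print(form_alt2(x, y, **params), form_alt1(x, y, *coeffs))
```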

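Remark 2.1.3. You might want to check that all three of these definitions are equivalent. Notice that this is just a polynomial function of degree two in two variables.

Definition 2.5. If $f : \mathbb{R}^2 \to \mathbb{R}$ is twice differentiable at $\left[\begin{smallmatrix} x \\ y \end{smallmatrix}\right] = \left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]$, the second derivative is the matrix in the quadratic form
\[ [x - a, \; y - b] \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[2ex] \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix} \begin{bmatrix} x - a \\ y - b \end{bmatrix} \]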

Remark 2.1.4. When the first derivative is zero, it is the “best fitting quadratic” to $f$ at $\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]$, although we need to add in a constant to lift it up so that it is “more than tangent” to the surface which is the graph of $f$ at $\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]$. You met this in first semester in Taylor’s theorem for functions of two variables.
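To see how good the fit in Remark 2.1.4 is, one can compare $f$ with its quadratic approximation near a critical point. The sketch below assumes the standard second-order Taylor expansion, which carries a factor of 1/2 in front of the quadratic form that the remark leaves implicit. It uses the running example $f = xy - x^5/5 - y^3/3 + 4$ at the point $x = 1$, $y = 1$, where the first derivative $[y - x^4, x - y^2]$ is zero. The function names are illustrative.

```python
def f(x, y):
    """The running example f(x, y) = xy - x^5/5 - y^3/3 + 4."""
    return x * y - x**5 / 5 - y**3 / 3 + 4


def second_derivative(x, y):
    """Second derivative of f by hand: [[f_xx, f_xy], [f_yx, f_yy]]."""
    return ((-4 * x**3, 1.0), (1.0, -2 * y))


def quadratic_approx(x, y, a=1.0, b=1.0):
    """f(a, b) + 1/2 [x-a, y-b] D^2f [x-a, y-b]^T.

    Only valid as written where the first derivative at (a, b) is zero,
    since the linear term has been dropped.
    """
    (fxx, fxy), (fyx, fyy) = second_derivative(a, b)
    h, k = x - a, y - b
    return f(a, b) + 0.5 * (fxx * h * h + (fxy + fyx) * h * k + fyy * k * k)


if __name__ == "__main__":
    for step in (0.1, 0.01, 0.001):
        x, y = 1.0 + step, 1.0 + step
        error = f(x, y) - quadratic_approx(x, y)
        print(f"step {step}: error {error:.3e}")
```

The error should shrink roughly like the cube of the step, which is what "more than tangent" amounts to in practice.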

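Theorem 2.1. If the determinant of the second derivative is positive at $\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]$ for a twice continuously differentiable function $f$ having first derivative zero at $\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]$, then in a neighbourhood of $\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]$, $f$ has either a maximum or a minimum, whereas if the determinant is negative then in a neighbourhood of $\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]$, $f$ has a saddle point. If the determinant is zero, the test is uninformative.

“Proof” by arm-waving: We have that if the first derivative is zero, the second derivative of $f$ at $\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]$ is the approximating quadratic form from Taylor’s theorem, so we can work with this (second order) approximation to $f$ in order to decide what shape (approximately) the graph of $f$ has. The quadratic approximation is just a symmetric matrix, and all the information about the shape of the surface is contained in it. Because it is symmetric, it can be diagonalised by an orthogonal matrix, i.e. we can rotate the surface until the quadratic form matrix is just
\[ \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} \]
(here $a$ and $b$ denote the diagonal entries, not the coordinates of the point). We can now rescale the new $x$ and $y$ axes by dividing the $x$ by $\sqrt{|a|}$ and the $y$ by $\sqrt{|b|}$; both are nonzero when the determinant is nonzero. This won’t change the shape of the surface in any essential way. This means all such quadratic forms are, up to shifting, rotating and stretching,
\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad\text{or}\quad \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} \quad\text{or}\quad \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \quad\text{or}\quad \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}, \]
i.e. $x^2 + y^2$ or $-x^2 - y^2$ or $x^2 - y^2$ or $-x^2 + y^2$, since
\[ [x, y] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = x^2 + y^2 \]
et cetera. We do not have to actually do the diagonalisation. We simply note that the determinant in the first two cases is positive and the determinant is not changed by rotations, nor is the sign of the determinant changed by scalings.

Proposition 2.1.1. If $f : \mathbb{R}^2 \to \mathbb{R}$ is a function which has
\[ Df\begin{bmatrix} a \\ b \end{bmatrix} = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]_{\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]} \]
zero and if
\[ D^2 f\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[2ex] \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix}_{\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]} \]
is continuous on a neighbourhood of $\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]$ and if $\det D^2 f\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right] > 0$, then if
\[ \left. \frac{\partial^2 f}{\partial x^2} \right|_{\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]} > 0 \]
$f$ has a local minimum at $\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]$, and if
\[ \left. \frac{\partial^2 f}{\partial x^2} \right|_{\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]} < 0 \]
$f$ has a local maximum at $\left[\begin{smallmatrix} a \\ b \end{smallmatrix}\right]$.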
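The decision rule in Theorem 2.1 and Proposition 2.1.1 is short enough to write down as a function of the three distinct entries of the second derivative at a critical point. The following is a sketch only: the names `classify` and `eigenvalues` and the returned strings are illustrative choices. The eigenvalue helper is there to cross-check the rule against the diagonalised picture from the arm-waving proof, where the determinant is the product of the two diagonal entries.

```python
import math


def classify(fxx, fxy, fyy):
    """Second derivative test at a point where the first derivative is zero."""
    det = fxx * fyy - fxy * fxy
    if det < 0:
        return "saddle"
    if det == 0:
        return "inconclusive"
    # det > 0 forces fxx and fyy to be nonzero with the same sign.
    return "local minimum" if fxx > 0 else "local maximum"


def eigenvalues(fxx, fxy, fyy):
    """Diagonal entries after rotating the symmetric matrix [[fxx, fxy], [fxy, fyy]]."""
    trace = fxx + fyy
    # sqrt(trace^2 - 4 det) rewritten as a hypotenuse, so it is never negative.
    spread = math.hypot(fxx - fyy, 2 * fxy)
    return ((trace - spread) / 2, (trace + spread) / 2)


if __name__ == "__main__":
    # The four basic forms x^2+y^2, -x^2-y^2, x^2-y^2, -x^2+y^2
    # (their second derivatives are twice the matrices in the text).
    for entries in [(2, 0, 2), (-2, 0, -2), (2, 0, -2), (-2, 0, 2)]:
        print(entries, classify(*entries), eigenvalues(*entries))
```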

“Proof” The trace of a matrix is the sum of the diagonal terms and this is also unchanged by rotations, and the sign of it is unchanged by scalings. So again we reduce to the four possible basic quadratic forms
\[ \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}}_{x^2 + y^2}, \quad \underbrace{\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}}_{-x^2 - y^2}, \quad \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}}_{x^2 - y^2}, \quad \underbrace{\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}}_{-x^2 + y^2} \]
and the trace distinguishes between the first two, being positive at a minimum and negative at a maximum. Since the two diagonal terms have the same sign we need only look at the sign of the first.

Example 2.1.1. Find and classify all critical points of
\[ f\begin{bmatrix} x \\ y \end{bmatrix} = xy - x^4 - y^2 + 2. \]

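Solution

\[ \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right] = \left[ y - 4x^3, \; x - 2y \right] \]
At a critical point, this is the zero matrix $[0, 0]$, so $y = 4x^3$ and $y = \tfrac{1}{2}x$, so
\[ \tfrac{1}{2}x = 4x^3 \;\Rightarrow\; x = 0 \text{ or } x^2 = \tfrac{1}{8} \]
so $x = 0$ or $x = \frac{1}{\sqrt{8}}$ or $x = \frac{-1}{\sqrt{8}}$, when $y = 0$, $y = \frac{1}{2\sqrt{8}}$, $y = \frac{-1}{2\sqrt{8}}$ respectively, and there are three critical points
\[ \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} \frac{1}{\sqrt{8}} \\[1ex] \frac{1}{2\sqrt{8}} \end{bmatrix}, \quad \begin{bmatrix} \frac{-1}{\sqrt{8}} \\[1ex] \frac{-1}{2\sqrt{8}} \end{bmatrix}. \]

\[ D^2 f = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\[2ex] \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix} = \begin{bmatrix} -12x^2 & 1 \\ 1 & -2 \end{bmatrix} \]

\[ D^2 f\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & -2 \end{bmatrix} \]
and $\det = -1$, so $\left[\begin{smallmatrix} 0 \\ 0 \end{smallmatrix}\right]$ is a saddle point.

\[ D^2 f\begin{bmatrix} \frac{1}{\sqrt{8}} \\[1ex] \frac{1}{2\sqrt{8}} \end{bmatrix} = \begin{bmatrix} \frac{-12}{8} & 1 \\ 1 & -2 \end{bmatrix} \]
and $\det = 3 - 1 = 2$, so the point is either a maximum or a minimum.

$D^2 f\left[\begin{smallmatrix} -1/\sqrt{8} \\ -1/(2\sqrt{8}) \end{smallmatrix}\right]$ is the same. Since the trace is $-3\tfrac{1}{2}$, both are maxima.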

Remark 2.1.5. Only a wild optimist would believe I have got this all correct without making a slip somewhere. So I recommend strongly that you try checking it on a computer with Mathematica, or by using a graphics calculator (or the software on the Mac).
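In the spirit of that recommendation, here is one way the check for Example 2.1.1 might look. It is a sketch in plain Python rather than Mathematica (a substitution made purely for illustration), with the derivatives typed in by hand from the solution, so it checks the arithmetic and not the differentiation.

```python
import math


def first_derivative(x, y):
    """[df/dx, df/dy] for f = xy - x^4 - y^2 + 2, as in the solution."""
    return (y - 4 * x**3, x - 2 * y)


def second_derivative(x, y):
    """[[f_xx, f_xy], [f_yx, f_yy]] for the same f."""
    return ((-12 * x**2, 1.0), (1.0, -2.0))


def classify_point(x, y):
    """Return (determinant, trace, verdict) of the second derivative at (x, y)."""
    (fxx, fxy), (fyx, fyy) = second_derivative(x, y)
    det = fxx * fyy - fxy * fyx
    trace = fxx + fyy
    if det < 0:
        verdict = "saddle"
    elif det == 0:
        verdict = "inconclusive"
    else:
        verdict = "local minimum" if trace > 0 else "local maximum"
    return det, trace, verdict


r = 1 / math.sqrt(8)
critical_points = [(0.0, 0.0), (r, r / 2), (-r, -r / 2)]

if __name__ == "__main__":
    for (x, y) in critical_points:
        print((x, y), first_derivative(x, y), classify_point(x, y))
```

Up to rounding, the first derivative should vanish at all three points, and the determinants and traces should agree with the values found by hand.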


								