# LECTURE 11


Lecture # 11     Date: 10/13/04

1.      Refer to pages 8-9 on graphical methods to solve LINEAR optimization problems.

TODAY'S CLASS:

1.      Basic principles of how a numerical routine works iteratively for non-linear optimization, using derivative and double-derivative information.

2.      Limitations of classical techniques (based on derivatives)

3.      Recent non-conventional (counter-intuitive) methods: Genetic Algorithm, Simulated Annealing.

4.      Reading assignment: the paper handed out in Lecture # 10.

For a univariate case, $f(\hat{X})$ where $\hat{X} = \{x_1\}$, deriving gradients and double derivatives is easy. In reality, problems are hardly univariate in nature.

But if $\hat{X} = \{x_1\ x_2\ x_3\ \dots\ x_n\}^T$, then comes the issue of vectors, matrices, Jacobians and Hessians.

Univariate: $\hat{X} = \{x_1\}$, a scalar; double derivative.

Multivariate: $\hat{X} = \{x_1\ x_2\ \dots\ x_n\}^T$, a vector; Hessian matrix.

Example: $\hat{X} = \{x_1\ x_2\}^T$, objective $F(\hat{X})$.

The Jacobian would be:

$$\nabla F = \begin{bmatrix} \dfrac{\partial F}{\partial x_1} \\[6pt] \dfrac{\partial F}{\partial x_2} \end{bmatrix}$$

and the Hessian would be:

$$\nabla^2 F = \begin{bmatrix} \dfrac{\partial^2 F}{\partial x_1^2} & \dfrac{\partial^2 F}{\partial x_1 \partial x_2} \\[6pt] \dfrac{\partial^2 F}{\partial x_2 \partial x_1} & \dfrac{\partial^2 F}{\partial x_2^2} \end{bmatrix}$$

Now notice the Hessian matrix:

(1) It is a square matrix.

(2) It is a symmetric matrix.

(3) The diagonal terms are the pure double derivatives $\dfrac{\partial^2 F}{\partial x_1^2}$ and $\dfrac{\partial^2 F}{\partial x_2^2}$, not mixed ones like $\dfrac{\partial^2 F}{\partial x_1 \partial x_2}$.
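As a concrete illustration (my addition, not in the original notes), here is a minimal Python/NumPy sketch that approximates the Jacobian (gradient) vector and Hessian matrix of a bivariate function by central finite differences; the example function f and the step sizes h are arbitrary choices for illustration.

```python
import numpy as np

def f(x):
    # Example bivariate function F(x1, x2); any smooth function could be used here.
    return x[0]**2 + 3.0 * x[0] * x[1] + 2.0 * x[1]**2

def gradient(f, x, h=1e-5):
    """Central-difference approximation of the Jacobian (gradient) vector."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

def hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian matrix (square and symmetric)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h**2)
    return H

x0 = np.array([1.0, -1.0])
print("Jacobian:", gradient(f, x0))   # analytic answer: [2(1) + 3(-1), 3(1) + 4(-1)] = [-1, -1]
print("Hessian:\n", hessian(f, x0))   # analytic answer: [[2, 3], [3, 4]] -- square and symmetric
```

Note that the diagonal entries of the printed Hessian are the pure second derivatives and the off-diagonal entries are the mixed ones, matching points (1)-(3) above.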

Earlier, when we talked about minimality requirements, we mentioned $\dfrac{d^2F}{dx^2} > 0$ (for a univariate case). However, for Hessians the terminology now changes, as it is a MATRIX.

For the minimality requirement, the Hessian H MUST be positive definite.

Positive Definiteness

A matrix is positive definite when its eigenvalues are all positive, where the eigenvalues $\lambda$ are the roots of

$$|A - \lambda I| = 0$$

with A = the matrix, I = the identity matrix, $\lambda$ = eigenvalue.

See hand-out on eigenvalues.

Another definition of positive definiteness of a matrix is the following. Suppose

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \dots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\ \vdots & & & & \vdots \\ a_{n1} & a_{n2} & \dots & \dots & a_{nn} \end{bmatrix}_{n \times n}$$

Then say

$$A_1 = a_{11}, \qquad A_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \qquad A_3 = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \qquad \dots, \qquad A_n = |A| \ (\text{the whole } n \times n \text{ matrix}).$$

Then A is positive definite if $A_1, A_2, A_3, \dots, A_n$ are ALL positive.

So REMEMBER: when it comes to the multivariate case of $\hat{X} = \{x_1\ x_2\ \dots\ x_n\}^T$, the 'Hessian' matrix requires positive definiteness.

SEE HAND OUT ON EIGEN VALUES
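The two positive-definiteness tests above can be checked numerically. The following is a minimal Python/NumPy sketch (my addition; the example matrix A is arbitrary) showing both the eigenvalue test and the leading-minor test.

```python
import numpy as np

def is_positive_definite_eigs(A, tol=1e-12):
    """Eigenvalue test: all eigenvalues (roots of |A - lambda*I| = 0) must be positive."""
    eigvals = np.linalg.eigvalsh(A)       # eigenvalues of a symmetric matrix
    return bool(np.all(eigvals > tol)), eigvals

def is_positive_definite_minors(A, tol=1e-12):
    """Leading-minor test: det(A_1), det(A_2), ..., det(A_n) must all be positive."""
    n = A.shape[0]
    minors = [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]
    return all(m > tol for m in minors), minors

# An arbitrary symmetric example matrix (it could be a Hessian evaluated at some point).
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

print(is_positive_definite_eigs(A))    # (True, [2 - sqrt(2), 2, 2 + sqrt(2)])
print(is_positive_definite_minors(A))  # (True, [2.0, 3.0, 4.0])
```

Both tests agree, as they must for a symmetric matrix.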

Home Work # 3 PROBLEM SOLUTION

Maximum volume of a box in a sphere of unit radius:

Using the constraint

$$x_3 = (1 - x_1^2 - x_2^2)^{1/2},$$

rewrite the objective function as

$$f(x_1, x_2) = 8 x_1 x_2 (1 - x_1^2 - x_2^2)^{1/2}.$$

Setting the partial derivatives to zero:

$$\frac{\partial f}{\partial x_1} = 8 x_2 \left[ (1 - x_1^2 - x_2^2)^{1/2} - \frac{x_1^2}{(1 - x_1^2 - x_2^2)^{1/2}} \right] = 0$$

$$\frac{\partial f}{\partial x_2} = 8 x_1 \left[ (1 - x_1^2 - x_2^2)^{1/2} - \frac{x_2^2}{(1 - x_1^2 - x_2^2)^{1/2}} \right] = 0$$

or,

$$1 - 2x_1^2 - x_2^2 = 0$$

$$1 - x_1^2 - 2x_2^2 = 0$$

which gives $x_1^* = x_2^* = \dfrac{1}{\sqrt{3}}$, and from the constraint $x_3^* = \dfrac{1}{\sqrt{3}}$ as well. So it's a CUBE.
Now check the 2nd-order partial derivatives and write the Hessian, if you wish!!
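As a quick numerical cross-check of this homework answer (my addition, assuming SciPy is available), one can maximize f directly by minimizing -f and confirm the optimizer lands at $x_1 = x_2 = 1/\sqrt{3}$:

```python
import numpy as np
from scipy.optimize import minimize

def neg_volume(x):
    # Negative of f(x1, x2) = 8*x1*x2*(1 - x1^2 - x2^2)^(1/2); minimizing -f maximizes the volume.
    return -8.0 * x[0] * x[1] * np.sqrt(1.0 - x[0]**2 - x[1]**2)

res = minimize(neg_volume, x0=[0.5, 0.5], method="Nelder-Mead")
print(res.x)               # ~ [0.5774, 0.5774]
print(1.0 / np.sqrt(3.0))  # 0.57735..., i.e. the analytic answer 1/sqrt(3)
print(-res.fun)            # maximum volume ~ 1.54 = 8/(3*sqrt(3)), the volume of the cube
```

(Since this is a maximization, the Hessian of f at the solution should be negative definite; equivalently, the Hessian of -f should be positive definite.)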
NON-LINEAR OPTIMIZATION (CLASSICAL)
ROUTINES

(1) They are all iterative, since F(x) is highly non-linear.

They search for the minimum of $F(\hat{X})$ by trial and error, using knowledge of the gradient and double derivatives.

Some methods also use double gradients (second derivatives) for more accuracy.

(5) They work on the concept of 'step length' and direction.

(6) Direction: downhill, i.e. the negative (-ve) gradient.

Step length - a compromise is needed - too small will take you FOREVER to climb a MOUNTAIN; too big and you may miss peaks.

(7) Obviously the search direction will be $-\dfrac{dF}{d\hat{X}} = -\nabla F(\hat{X})$ if we want to reach the minimum.

Let's write our problem in mathematical form. Suppose 'p' is the search direction and $\alpha$ is the step length; then ideally our next best approximation to the minimum value of f(x) at iteration k (if k = 0, $x_0$ is the initial point) is $f(x_k + p)$, which can be expanded as a Taylor series:

$$f(x_k + p) = f(x_k) + p^T \nabla f_k + \tfrac{1}{2}\, p^T \nabla^2 f(x_k)\, p \qquad \text{(terms of order higher than 2 neglected).}$$
2
Now: (1) If the algorithm works on both $\nabla f_k$ (the Jacobian) and $\nabla^2 f_k$ (the Hessian), then it will be more accurate than an algorithm using just $\nabla f_k$.

Why? Because the Taylor series approximation is MORE ACCURATE

Figure to be drawn in class

But the problems with the Hessian are:

(1) It is more computationally intensive to compute the Hessian than the Jacobian (more computer operations).

(2) Gradient-only methods will be efficient but may not be as accurate.
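To make the gradient-only vs. gradient-plus-Hessian contrast concrete, here is a small Python sketch (my addition; the quadratic test function and the step length are arbitrary choices) comparing a fixed-step steepest-descent update with a Newton update $p = -(\nabla^2 f)^{-1}\nabla f$:

```python
import numpy as np

# Illustrative quadratic f(x) = 0.5 * x^T A x, whose minimum is at the origin.
A = np.array([[10.0, 0.0],
              [ 0.0, 1.0]])

def grad(x):
    return A @ x   # gradient (Jacobian) of the quadratic

def hess(x):
    return A       # Hessian of the quadratic (constant here)

x_gd = np.array([1.0, 1.0])   # steepest-descent iterate (gradient only)
x_nt = np.array([1.0, 1.0])   # Newton iterate (gradient + Hessian)
alpha = 0.05                  # fixed step length for the gradient-only method

for k in range(10):
    x_gd = x_gd - alpha * grad(x_gd)                        # direction = downhill (-ve gradient)
    x_nt = x_nt - np.linalg.solve(hess(x_nt), grad(x_nt))   # Newton step uses the Hessian

print("gradient only, 10 iterations:", x_gd)  # the x2 component has barely moved toward 0
print("Newton,        10 iterations:", x_nt)  # at the minimum [0, 0] after the first step
```

The Newton step is more accurate per iteration, but each iteration costs more because the Hessian must be formed and solved, which is exactly the trade-off noted above.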
SIGNS OF A GOOD ALGORITHM

(1) Should not be sensitive to initial guess.

(2) Should not be too sensitive to step length

(3) Converge fast

(4) Should not get trapped easily in LOCAL MINIMA.

SO MUCH FOR
NON-LINEAR OPTIMIZATION
BUT WHAT IS THE PROBLEM IN
HYDROSCIENCE

A. Unfortunately, the model response (output), say the goodness of fit with observations, viewed as a surface over $\hat{X}$ (the model parameters), is a very noisy, complicated, discontinuous surface, full of local minima, ridges, flat areas, etc. (show example in class). Refer to saddle points and points of inflection.

B. Gradient based techniques tend to get trapped or stuck - even computing gradient is
difficult at times.

C. Scaling issues - e.g. $f(x) = 10^9 x_1^2 + x_2^2$.

f(x) is very sensitive to a slight change in $x_1$ but not nearly as sensitive to $x_2$.

In natural systems, f(x)-type objective functions are very common. Why?

Consider river flow together with groundwater flow:

River flow ~ km/hr

Groundwater flow ~ cm/hr

There is a scale difference of about $10^5$ (five orders of magnitude).
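A tiny sketch (my addition) of why this scaling matters for gradient-based methods: for $f(x) = 10^9 x_1^2 + x_2^2$ the two gradient components differ by many orders of magnitude, so no single step length suits both variables.

```python
import numpy as np

def grad(x):
    # Gradient of f(x) = 1e9 * x1^2 + x2^2
    return np.array([2.0e9 * x[0], 2.0 * x[1]])

g = grad(np.array([1.0, 1.0]))
print(g)                      # [2e9, 2]: the components differ by 9 orders of magnitude
print(abs(g[0]) / abs(g[1]))  # 1e9 -- a step length that suits x1 barely moves x2, and vice versa
```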

Given these known problems, hydrologists have come up with non-conventional techniques. I'll mention only two:

A. Genetic Algorithm

B. Simulated Annealing

Read the paper by Duan et al. (1992)

Next class (Oct 20/22): write a ONE PAGE summary of what you've understood from reading the papers.

HW#4
Solve:

Compute the gradient $\nabla f(x)$ and Hessian $\nabla^2 f(x)$ of the Rosenbrock function:

$$f(x) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2$$

Find the 'global' minimum.

(you have to show that Hessian is +ve definite)
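As a hedged starting point for HW # 4 (my addition, not the assigned solution), the sketch below evaluates the analytic gradient and Hessian of the Rosenbrock function at the candidate minimum (1, 1), where the gradient vanishes and the Hessian eigenvalues can be checked for positive definiteness:

```python
import numpy as np

def rosenbrock(x):
    return 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

def rosenbrock_grad(x):
    # Analytic gradient of f(x) = 100*(x2 - x1^2)^2 + (1 - x1)^2
    return np.array([-400.0 * x[0] * (x[1] - x[0]**2) - 2.0 * (1.0 - x[0]),
                      200.0 * (x[1] - x[0]**2)])

def rosenbrock_hess(x):
    # Analytic Hessian of the Rosenbrock function
    return np.array([[1200.0 * x[0]**2 - 400.0 * x[1] + 2.0, -400.0 * x[0]],
                     [-400.0 * x[0],                           200.0]])

x_star = np.array([1.0, 1.0])                        # candidate 'global' minimum
print(rosenbrock(x_star))                            # 0.0
print(rosenbrock_grad(x_star))                       # [0, 0]
print(np.linalg.eigvalsh(rosenbrock_hess(x_star)))   # both positive => Hessian is +ve definite
```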
