# Example Linear system ARRI


```
Automation & Robotics Research Institute (ARRI)

Nonlinear Network Structures for Optimal Control

Frank L. Lewis and Murad Abu-Khalaf
Advanced Controls, Sensors, and MEMS (ACSM) group

1
System

    \dot{x} = f(x) + g(x) u(x)

Cost

    V = \int_0^\infty [ Q(x) + W(u) ] \, dt

The Usual Suspects

    Q(x) = x^T Q x,        W(u) = u^T R u

2

Let \Psi(\Omega) denote the set of admissible controls. A control
u : \mathbb{R}^n \to \mathbb{R}^m is defined to be admissible with respect
to the state penalty function Q(x) on \Omega, denoted u \in \Psi(\Omega), if:

  - u is continuous on \Omega,
  - u(0) = 0,
  - u stabilizes (1) on \Omega,
  - \int_0^\infty [ Q(x) + W(u) ] \, dt < \infty,  \forall x \in \Omega.

A stabilizing control may not be admissible!

3

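Generalized HJB Equation

    GHJB(V, u) \triangleq \left(\frac{\partial V}{\partial x}\right)^T (f + g u) + Q + u^T R u = 0,
    V(0) = 0.

Optimal Control (SVFB)

    u^*(x) = -\frac{1}{2} R^{-1} g^T(x) \frac{\partial V^*(x)}{\partial x}

Hamilton-Jacobi-Bellman (HJB) Equation

    HJB(V^*) \triangleq \left(\frac{\partial V^*}{\partial x}\right)^T f + Q
        - \frac{1}{4} \left(\frac{\partial V^*}{\partial x}\right)^T g R^{-1} g^T \frac{\partial V^*}{\partial x} = 0,
    V^*(0) = 0.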

4
PROBLEM: HJB usually has no analytic solution
SOLUTION: Successive Approximation

u^{(0)}: a stabilizing control

    \left(\frac{\partial V^{(i)}}{\partial x}\right)^T (f + g u^{(i)}) + Q + W(u^{(i)}) = 0,    V^{(i)}(0) = 0

    u^{(i+1)} = -\frac{1}{2} R^{-1} g^T \frac{\partial V^{(i)}}{\partial x}

  - A contraction map (Saridis)
  - Saridis and Beard used a Galerkin approximation to allow for GHJB solution
  - Converges to the optimal solution
  - Gives u(x) in SVFB form

5
For Constrained Controls

    V = \int_0^\infty [ Q(x) + W(u) ] \, dt

with

    W(u) = 2 \int_0^u (\phi^{-1}(\mu))^T R \, d\mu

W(u) is PD if u \phi^{-1}(u) > 0 when u \neq 0.

[Figure: plot of \phi^{-1}(u) versus u]

6
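New GHJB is

    \left(\frac{\partial V}{\partial x}\right)^T (f + g u) + Q(x) + 2 \int_0^u (\phi^{-1}(\mu))^T R \, d\mu = 0,    V(0) = 0

    u = -\phi\left( \frac{1}{2} R^{-1} g^T \frac{\partial V}{\partial x} \right)

Natural, exact, no approximation.

u(t) is constrained if \phi(.) is a saturation function!

[Figure: plot of tanh(p) versus p, saturating at +1 and -1]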

7
Problem: cannot solve HJB

Solution: use Successive Approximation on GHJB

Iterate:

u^{(0)}: a stabilizing control

    \left(\frac{\partial V^{(i)}}{\partial x}\right)^T (f + g u^{(i)}) + Q(x) + 2 \int_0^{u^{(i)}} (\phi^{-1}(\mu))^T R \, d\mu = 0,    V^{(i)}(0) = 0

    u^{(i+1)} = -\phi\left( \frac{1}{2} R^{-1} g^T \frac{\partial V^{(i)}}{\partial x} \right)

8
Lemma 3.1: Improved Saturated Control Law

If u^{(i)} \in \Psi(\Omega), and V^{(i)} satisfies the equation
GHJB(V^{(i)}, u^{(i)}) = 0 with the boundary condition V^{(i)}(0) = 0,
then the new control derived as

    u^{(i+1)}(x) = -\phi\left( \frac{1}{2} R^{-1} g^T(x) \frac{\partial V^{(i)}(x)}{\partial x} \right)

is an admissible control for the system on \Omega.

Moreover, if the control bound \phi(.) is monotonically non-decreasing and
V^{(i+1)} is the unique positive definite function satisfying the equation
GHJB(V^{(i+1)}, u^{(i+1)}) = 0, with the boundary condition V^{(i+1)}(0) = 0,
then

    V^{(i+1)}(x) \leq V^{(i)}(x),  \forall x \in \Omega.

Theorem 3.2: Convergence of Successive Approximations

If u^{(0)} \in \Psi(\Omega), then

  1. V^{(i)} \to V^* uniformly on \Omega,
  2. u^{(i)} \in \Psi(\Omega),  \forall i \geq 0,
  3. u^{(i)} \to u^*.

9
Lemma 3.4: Optimal Saturated Control has the Largest Stability Region

The saturated control u^* has a stability region that is the largest of
any other saturated control u^{(i)} that is admissible with respect to
Q(x) and the system (f, g).

Note that there may be stabilizing saturated controls that have larger
stability regions than u^*, but are not admissible with respect to Q(x)
and the system (f, g).

10
Problem: cannot solve GHJB!
Solution: neural network to approximate V^{(i)}(x)

    V_L^{(i)}(x) = \sum_{j=1}^{L} w_j^{(i)} \sigma_j(x) = W_L^{(i)T} \sigma_L(x)

Select basis set \sigma_L(x).

[Figure: Two-Layer Neural Network with adjustable output weights. Inputs
x_1, x_2, ..., x_n pass through first-layer weights V^T to a hidden layer of
L activation functions \sigma_1(.), ..., \sigma_L(.); output weights W^T
produce the outputs y_1, y_2, ..., y_m.]

11

    \frac{\partial V_L^{(i)}}{\partial x} = \left(\frac{\partial \sigma_L(x)}{\partial x}\right)^T W_L^{(i)} = \nabla\sigma_L^T(x) W_L^{(i)}

Let

    \phi(p) = A \tanh(p / A)

Then GHJB is

    W_L^{(i)T} \nabla\sigma_L (f + g u^{(i)}) + Q + 2 u^{(i)} R A \tanh^{-1}(u^{(i)} / A)
        + A^2 R \ln\left( 1 - (u^{(i)} / A)^2 \right) = \epsilon(x)

Nonzero residual!

    u^{(i)}(x) = -A \tanh\left( \frac{1}{2A} R^{-1} g^T(x) \nabla\sigma_L^T W_L^{(i-1)} \right)

12
Neural-network-based nearly optimal saturated control law.

13
To minimize the residual error in a LS sense, evaluate the GHJB at a
number of points x_1, x_2, ..., x_N on \Omega.

Note, if

    u^{(i)}(x) = -A \tanh\left( \frac{1}{2A} R^{-1} g^T(x) \nabla\sigma_L^T W_L^{(i-1)} \right)

    A(x, W_L^{(i-1)}) = \nabla\sigma_L (f + g u^{(i)})

    -b(x, W_L^{(i-1)}) = Q + 2 u^{(i)} R A \tanh^{-1}(u^{(i)} / A) + A^2 R \ln\left( 1 - (u^{(i)} / A)^2 \right)

then GHJB is

    W_L^{(i)T} A(x, W_L^{(i-1)}) - b(x, W_L^{(i-1)}) = \epsilon(x)

14
Evaluating this at N points gives

    W_L^{(i)T} [ A(x_1, W_L^{(i-1)})  A(x_2, W_L^{(i-1)})  \cdots  A(x_N, W_L^{(i-1)}) ]
        = [ b(x_1, W_L^{(i-1)})  b(x_2, W_L^{(i-1)})  \cdots  b(x_N, W_L^{(i-1)}) ]

The bracketed matrix on the left is the L x N coefficient matrix.

Solve by LS.

The N sample points are the NN Training Set!

15
Select the N sample points x_k

Uniform Mesh Grid in R^n            Random selection (Monte Carlo)
Approximation error is              Approximation error is
    1 / N^{2/n}                         1 / N

(Barron)

Monte Carlo overcomes NP-complexity problems!

16
ASIDE:
Useful for reducing complexity of fuzzy logic systems?

[Figure: Uniform grid of separable Gaussian activation functions for RBF NN]

17
Lemma 3.1: Equation (28) will have a unique solution when

    \sum_{k=1}^{N} A(x_k, W_L^{(i-1)}) A^T(x_k, W_L^{(i-1)}) \geq \alpha I

where \alpha is a positive constant, and I is the identity matrix. This is
a persistency of excitation (PE) condition on A(x_k, W_L^{(i-1)}).

This guides the choice of the N sample vectors x_k.

NN Training Set must be PE

18
Algorithm and proofs work for any Q(x) in

    V = \int_0^\infty [ Q(x) + W(u) ] \, dt

Constrained input given by

    W(u) = 2 \int_0^u (\phi^{-1}(\mu))^T R \, d\mu

CONSTRAINED STATE CONTROL

    Q(x, k) = x^T Q x + \sum_{l=1}^{n_c} \left( \frac{x_l}{B_l - \alpha_l} \right)^k,    k large and even

MINIMUM-TIME CONTROL

    V = \int_0^\infty \left[ \tanh(x^T Q x) + 2 \int_0^u (\phi^{-1}(\mu))^T R \, d\mu \right] dt

For small R and x^T Q x \gg 0 this is approximately

    V = \int_0^{t_s} 1 \, dt
0

19
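Example: Linear system

    \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix}
      = \begin{bmatrix} 0 & 0.5 \\ -1 & 1.5 \end{bmatrix}
        \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
      + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u,        |u| \leq 1

    V_{15}(x_1, x_2) = w_1 x_1^2 + w_2 x_2^2 + w_3 x_1 x_2 + w_4 x_1^4
        + w_5 x_2^4 + w_6 x_1^3 x_2 + w_7 x_1^2 x_2^2 + w_8 x_1 x_2^3 + w_9 x_1^6 + w_{10} x_2^6
        + w_{11} x_1^5 x_2 + w_{12} x_1^4 x_2^2 + w_{13} x_1^3 x_2^3 + w_{14} x_1^2 x_2^4 + w_{15} x_1 x_2^5

    -0.4 \leq x_1 \leq 0.4,        -0.4 \leq x_2 \leq 0.4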

20
Region of asymptotic stability for the initial controller,

    u_0 = -\tanh(\mathrm{LQR}) = -\tanh(0.4142 x_1 + 3.4142 x_2)

21
Region of asymptotic stability for the nearly optimal controller,

    u = -\tanh\left( \frac{1}{2} \frac{\partial V_{15}}{\partial x_2} \right),

    W_{15} = [  8.85  -0.76   3.51  -2.52  -1.64
               -2.86  -2.24   1.69  11.09   7.51
               20.61  21.57  24.35  20.84  10.53 ]^T

22
Example: Nonlinear oscillator

    \dot{x}_1 = x_1 + x_2 - x_1 (x_1^2 + x_2^2),
    \dot{x}_2 = -x_1 + x_2 - x_2 (x_1^2 + x_2^2) + u,        |u| \leq 1

    V_{24}(x_1, x_2) = w_1 x_1^2 + w_2 x_2^2 + w_3 x_1 x_2 + w_4 x_1^4 + w_5 x_2^4
        + w_6 x_1^3 x_2 + w_7 x_1^2 x_2^2 + w_8 x_1 x_2^3 + w_9 x_1^6 + w_{10} x_2^6
        + w_{11} x_1^5 x_2 + w_{12} x_1^4 x_2^2 + w_{13} x_1^3 x_2^3 + w_{14} x_1^2 x_2^4 + w_{15} x_1 x_2^5
        + w_{16} x_1^8 + w_{17} x_2^8 + w_{18} x_1^7 x_2 + w_{19} x_1^6 x_2^2 + w_{20} x_1^5 x_2^3
        + w_{21} x_1^4 x_2^4 + w_{22} x_1^3 x_2^5 + w_{23} x_1^2 x_2^6 + w_{24} x_1 x_2^7

    u_0 = -\tanh(5 x_1 + 3 x_2)

23

    u = -\tanh(p(x)),

where p(x) is a degree-7 polynomial in (x_1, x_2) built from the following
terms, listed with their coefficient magnitudes:

    2.62 x_1,           4.23 x_2,           0.39 x_2^3,         4.0 x_1^3,          8.65 x_1^2 x_2,
    8.94 x_1 x_2^2,     5.53 x_2^5,         2.26 x_1^5,         5.78 x_1^4 x_2,
    11.00 x_1^3 x_2^2,  2.57 x_1^2 x_2^3,   2.00 x_1 x_2^4,     2.08 x_2^7,
    0.49 x_1^7,         1.65 x_1^6 x_2,     2.71 x_1^5 x_2^2,   2.19 x_1^4 x_2^3,
    0.76 x_1^3 x_2^4,   1.77 x_1^2 x_2^5,   0.87 x_1 x_2^6

[Figure: State Trajectory for the Nearly Optimal Control Law. System states x1 and x2 versus Time (s), 0 to 40 s.]

[Figure: Nearly Optimal Control Signal with Input Constraints. Control input u(x) versus Time (s), 0 to 40 s, with u between -1 and 1.]

24

```
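The closed-form penalty that appears in the GHJB once the saturation is taken as A tanh(p/A) can be checked numerically. The sketch below is not from the slides: it assumes a scalar control, and the function names `saturation_cost` and `saturation_cost_numeric` are hypothetical. It compares the closed form 2 u R A atanh(u/A) + A^2 R ln(1 - (u/A)^2) against a midpoint-rule quadrature of 2 * integral of A atanh(v/A) R dv from 0 to u.

```python
import math


def saturation_cost(u, A=1.0, R=1.0):
    """Closed-form W(u) for phi(p) = A*tanh(p/A), scalar u with |u| < A."""
    if abs(u) >= A:
        raise ValueError("|u| must be strictly below the saturation bound A")
    return 2.0 * A * R * u * math.atanh(u / A) + A**2 * R * math.log(1.0 - (u / A) ** 2)


def saturation_cost_numeric(u, A=1.0, R=1.0, n=20000):
    """Midpoint-rule quadrature of 2 * int_0^u A*atanh(v/A) * R dv."""
    h = u / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * h          # midpoint of the k-th subinterval
        total += A * math.atanh(v / A) * R * h
    return 2.0 * total


if __name__ == "__main__":
    for u in (0.1, 0.5, 0.9):
        print(u, saturation_cost(u), saturation_cost_numeric(u))
```

For small |u| the penalty behaves like the quadratic R u^2, so the unconstrained cost is recovered near the origin, while the penalty grows faster as |u| approaches the bound A.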
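The least-squares successive approximation described in the slides can be sketched on a toy problem. Everything specific here is an assumption for illustration, not taken from the slides: a scalar system with drift f(x) = 0.5 x and g(x) = 1, Q(x) = x^2, R = 1, bound A = 1, the basis [x^2, x^4, x^6], a uniform grid on [-1, 1], and the initial control -tanh(2x). Each pass evaluates the GHJB at the sample points, solves for the output weights by least squares, then updates the saturated control.

```python
import numpy as np

A_SAT, R = 1.0, 1.0                      # assumed saturation bound and control weight


def f(x):
    return 0.5 * x                       # hypothetical unstable scalar drift


def g(x):
    return np.ones_like(x)


def Q(x):
    return x**2


def grad_sigma(x):
    """Gradient of the assumed basis [x^2, x^4, x^6]; shape (..., 3)."""
    return np.stack([2 * x, 4 * x**3, 6 * x**5], axis=-1)


def W_cost(u):
    """Closed-form penalty for phi(p) = A*tanh(p/A); ratio clipped for safety."""
    r = np.clip(u / A_SAT, -1 + 1e-12, 1 - 1e-12)
    return 2 * A_SAT * R * u * np.arctanh(r) + A_SAT**2 * R * np.log(1 - r**2)


def policy(w, x):
    """u(x) = -A tanh( (1/2A) R^-1 g(x) grad_sigma(x)^T w )."""
    return -A_SAT * np.tanh(grad_sigma(x) @ w / (2 * A_SAT * R) * g(x))


def successive_approximation(xs, u0, iters=8):
    """Iterate: LS solve of the sampled GHJB for the weights, then policy update."""
    u_fn, w = u0, None
    for _ in range(iters):
        u = u_fn(xs)
        A_mat = grad_sigma(xs) * (f(xs) + g(xs) * u)[:, None]   # one row per sample point
        b = -(Q(xs) + W_cost(u))                                 # right-hand side
        w, *_ = np.linalg.lstsq(A_mat, b, rcond=None)
        u_fn = lambda x, w=w: policy(w, x)
    # Smallest eigenvalue of sum_k A_k A_k^T: positive when the samples are PE.
    pe = np.linalg.eigvalsh(A_mat.T @ A_mat).min()
    return w, pe


if __name__ == "__main__":
    xs = np.linspace(-1.0, 1.0, 41)
    w, pe = successive_approximation(xs, lambda x: -np.tanh(2.0 * x))
    print("weights:", w, "min eigenvalue:", pe)
```

The returned minimum eigenvalue is a quick way to see whether a chosen set of sample points satisfies the persistency of excitation condition; a value near zero would suggest adding or moving sample points.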