Example Linear system ARRI

					          Automation & Robotics Research Institute (ARRI)

   Nonlinear Network Structures for Optimal Control

   Frank L. Lewis and Murad Abu-Khalaf
Advanced Controls, Sensors, and MEMS (ACSM) group




System

\[ \dot{x} = f(x) + g(x)\,u(x) \]

Cost

\[ V = \int_0^\infty \big[ Q(x) + W(u) \big]\, dt \]

The Usual Suspects

\[ Q(x) = x^T Q x, \qquad W(u) = u^T R u \]




Definition 2.1: Admissible Controls

Let $\Psi(\Omega)$ denote the set of admissible controls. A control $u : \mathbb{R}^n \to \mathbb{R}^m$ is defined to be admissible with respect to the state penalty function $Q(x)$ on $\Omega$, denoted $u \in \Psi(\Omega)$, if:
   $u$ is continuous on $\Omega$,
   $u(0) = 0$,
   $u$ stabilizes (1) on $\Omega$,
   $\int_0^\infty \big[ Q(x) + W(u) \big]\, dt < \infty, \ \forall x \in \Omega$.

A stabilizing control may not be admissible!




NONLINEAR QUADRATIC REGULATOR

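Generalized HJB Equation

\[ \mathrm{GHJB}(V, u) \triangleq \left(\frac{\partial V}{\partial x}\right)^T (f + g u) + Q + u^T R u = 0, \qquad V(0) = 0. \]

Optimal Control (SVFB)

\[ u^*(x) = -\frac{1}{2} R^{-1} g^T(x) \frac{\partial V^*(x)}{\partial x} \]

Hamilton-Jacobi-Bellman (HJB) Equation

\[ \mathrm{HJB}(V^*) \triangleq \left(\frac{\partial V^*}{\partial x}\right)^T f + Q - \frac{1}{4} \left(\frac{\partial V^*}{\partial x}\right)^T g R^{-1} g^T \frac{\partial V^*}{\partial x} = 0, \qquad V^*(0) = 0. \]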
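
A worked scalar check of the HJB equation may help here. The system $f(x) = ax$, $g(x) = b$ with $Q(x) = qx^2$, $R = r$ and the trial solution $V^*(x) = px^2$ are illustrative assumptions, not taken from the slides.

```latex
\begin{align*}
\frac{\partial V^*}{\partial x} &= 2px \\
\mathrm{HJB}(V^*) &= (2px)(ax) + qx^2 - \frac{1}{4}(2px)\, b\, r^{-1} b\, (2px)
  = \Big( 2ap + q - \frac{b^2 p^2}{r} \Big) x^2 = 0 \\
p &= \frac{r}{b^2}\Big( a + \sqrt{a^2 + b^2 q / r} \Big) \quad \text{(positive root)} \\
u^*(x) &= -\frac{1}{2} r^{-1} b\, (2px) = -\frac{bp}{r}\, x
\end{align*}
% With a = b = q = r = 1 this gives p = 1 + \sqrt{2} \approx 2.414.
```

In this linear-quadratic special case the HJB equation collapses to the scalar Riccati equation, which is why an analytic solution exists here but not in general.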




 PROBLEM- HJB usually has no analytic solution
 SOLUTION- Successive Approximation

 $u^{(0)}$ a stabilizing control

\[ \left(\frac{\partial V^{(i)}}{\partial x}\right)^T \big( f + g u^{(i)} \big) + Q + W(u^{(i)}) = 0, \qquad V^{(i)}(0) = 0 \]
 A contraction map (Saridis)

\[ u^{(i+1)} = -\frac{1}{2} R^{-1} g^T \frac{\partial V^{(i)}}{\partial x} \]
 Saridis and Beard used Galerkin Approx to allow for GHJB solution

 Converges to optimal solution

 Gives u(x) in SVFB form
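
The iteration can be sketched on a scalar linear system, where the GHJB is linear in a single unknown. This is a minimal sketch under assumptions: the system x' = a*x + b*u, the cost q*x^2 + r*u^2, the parametrisation V = p*x^2, u = -k*x and all function names are illustrative and do not come from the slides.

```python
import math

def successive_approximation(a, b, q, r, k0, iterations=20):
    """Scalar successive approximation for x' = a*x + b*u, cost q*x^2 + r*u^2.

    With u = -k*x and V = p*x^2 the GHJB reduces to
    2*p*(a - b*k) + q + r*k^2 = 0, which is linear in p.
    """
    if a - b * k0 >= 0:
        raise ValueError("k0 must stabilize the system (a - b*k0 < 0)")
    k, history = k0, []
    for _ in range(iterations):
        # Policy evaluation: solve the GHJB for p with the current gain k.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        history.append(p)
        # Policy improvement: u = -(1/2) R^-1 g^T dV/dx = -(b*p/r) * x.
        k = b * p / r
    return history

def riccati_solution(a, b, q, r):
    """Closed-form positive root of the scalar HJB (Riccati) equation."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)

history = successive_approximation(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0)
p_star = riccati_solution(1.0, 1.0, 1.0, 1.0)
print(history[:3], p_star)
```

Each pass only solves a linear equation, and the sequence of p values decreases toward the Riccati solution, which mirrors the "converges to optimal solution" claim in this simple setting.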




For Constrained Controls
NONLINEAR NONQUADRATIC REGULATOR

\[ V = \int_0^\infty \big[ Q(x) + W(u) \big]\, dt \]

with

\[ W(u) = 2 \int_0^u \big( \phi^{-1}(\mu) \big)^T R \, d\mu \]
Nonquadratic form- Lyshevsky

PD if $u\,\phi^{-1}(u) > 0$ when $u \neq 0$

[Figure: plot of $\phi^{-1}(u)$ versus $u$]




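New GHJB is

\[ \left(\frac{\partial V}{\partial x}\right)^T (f + g u) + Q(x) + 2 \int_0^u \big( \phi^{-1}(\mu) \big)^T R \, d\mu = 0, \qquad V(0) = 0 \]

\[ u = -\phi\!\left( \frac{1}{2} R^{-1} g^T \frac{\partial V}{\partial x} \right) \]
Natural, exact, no approximation

u(t) constrained if $\phi(\cdot)$ is a saturation function!

[Figure: plot of tanh(p) versus p, saturating at -1 and 1]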
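
A small sketch of why the control stays bounded, assuming a scalar input and the saturation function phi(p) = A*tanh(p/A). The function names and the sample values are illustrative assumptions.

```python
import math

def saturated_control(grad_v_dot_g, r=1.0, bound=1.0):
    """u = -A * tanh((1/(2A)) * R^-1 * g^T dV/dx) for a scalar input."""
    return -bound * math.tanh(grad_v_dot_g / (2.0 * bound * r))

def unsaturated_control(grad_v_dot_g, r=1.0):
    """Quadratic-cost control u = -(1/2) * R^-1 * g^T dV/dx, for comparison."""
    return -0.5 * grad_v_dot_g / r

# Hypothetical values of g^T dV/dx, from very small to very large.
samples = [-100.0, -1.0, -0.01, 0.0, 0.01, 1.0, 100.0]
controls = [saturated_control(s, r=1.0, bound=2.0) for s in samples]
print(controls)
```

For small arguments the two laws agree, so the saturated law behaves like the usual quadratic-cost feedback near the origin and only departs from it as the bound is approached.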

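           Problem- cannot solve HJB

           Solution- Use Successive Approximation on GHJB

Iterate:

$u^{(0)}$ a stabilizing control

\[ \left(\frac{\partial V^{(i)}}{\partial x}\right)^T \big( f + g u^{(i)} \big) + Q(x) + 2 \int_0^{u^{(i)}} \big( \phi^{-1}(\mu) \big)^T R \, d\mu = 0, \qquad V^{(i)}(0) = 0 \]

\[ u^{(i+1)} = -\phi\!\left( \frac{1}{2} R^{-1} g^T \frac{\partial V^{(i)}}{\partial x} \right) \]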




Lemma 3.1: Improved Saturated Control Law

If $u^{(i)} \in \Psi(\Omega)$, and $V^{(i)}$ satisfies the equation $\mathrm{GHJB}(V^{(i)}, u^{(i)}) = 0$ with the boundary condition $V^{(i)}(0) = 0$, then the new control derived as

\[ u^{(i+1)}(x) = -\phi\!\left( \frac{1}{2} R^{-1} g^T(x) \frac{\partial V^{(i)}(x)}{\partial x} \right) \]

is an admissible control for the system on $\Omega$.

Moreover, if the control bound $\phi(\cdot)$ is monotonically non-decreasing and $V^{(i+1)}$ is the unique positive definite function satisfying the equation $\mathrm{GHJB}(V^{(i+1)}, u^{(i+1)}) = 0$, with the boundary condition $V^{(i+1)}(0) = 0$, then $V^{(i+1)}(x) \le V^{(i)}(x) \ \forall x \in \Omega$.

Theorem 3.2: Convergence of Successive Approximations

If $u^{(0)} \in \Psi(\Omega)$, then
1. $V^{(i)} \to V^*$ uniformly on $\Omega$
2. $u^{(i)} \in \Psi(\Omega), \ \forall i \ge 0$
3. $u^{(i)} \to u^*$




Lemma 3.4: Optimal Saturated Control has the Largest Stability Region

The saturated control $u^*$ has a stability region that is the largest of any other saturated control $u^{(i)}$ that is admissible with respect to $Q(x)$ and the system $(f, g)$.

     Note that there may be stabilizing saturated controls that have larger stability regions than $u^*$, but are not admissible with respect to $Q(x)$ and the system $(f, g)$.




Problem- Cannot solve GHJB!
Solution- Neural Network to approximate $V^{(i)}(x)$

\[ V_L^{(i)}(x) = \sum_{j=1}^{L} w_j^{(i)} \sigma_j(x) = W_L^{(i)T} \sigma_L(x) \]
Select basis set $\sigma_L(x)$

[Figure: Two-Layer Neural Network with adjustable output weights. Inputs $x_1, \dots, x_n$; hidden layer of $L$ activation functions $\sigma(\cdot)$ with input weights $V^T$; outputs $y_1, \dots, y_m$ with output weights $W^T$.]

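    Cost gradient approximation

\[ \frac{\partial V_L^{(i)}}{\partial x} = \left( \frac{\partial \sigma_L(x)}{\partial x} \right)^T W_L^{(i)} = \nabla \sigma_L^T(x)\, W_L^{(i)} \]

     Let

\[ \phi(p) = A \tanh\!\left( \frac{p}{A} \right) \]

     Then GHJB is

\[ W_L^{(i)T} \nabla \sigma_L \big( f + g u^{(i)} \big) + Q + 2 u^{(i)} R A \tanh^{-1}\!\left( \frac{u^{(i)}}{A} \right) + A^2 R \ln\!\left( 1 - \Big( \frac{u^{(i)}}{A} \Big)^2 \right) = \varepsilon(x) \]
Nonzero residual!

\[ u^{(i)}(x) = -A \tanh\!\left( \frac{1}{2A} R^{-1} g^T(x) \nabla \sigma_L^T W_L^{(i-1)} \right) \]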
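
The closed-form input penalty in the GHJB above follows from integrating the inverse of phi(p) = A*tanh(p/A). The sketch below compares that closed form with a direct numerical integration for a scalar input; the function names and the Simpson step count are illustrative assumptions.

```python
import math

def w_closed_form(u, r=1.0, bound=1.0):
    """Closed form of W(u) = 2 * int_0^u A*atanh(v/A) * R dv, valid for |u| < A."""
    s = u / bound
    return 2.0 * u * r * bound * math.atanh(s) + bound**2 * r * math.log(1.0 - s * s)

def w_numeric(u, r=1.0, bound=1.0, steps=2000):
    """The same integral by composite Simpson's rule (steps must be even)."""
    h = u / steps

    def integrand(v):
        return bound * math.atanh(v / bound) * r

    total = integrand(0.0) + integrand(u)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return 2.0 * total * h / 3.0

for u in (0.3, -0.7, 0.9):
    print(u, w_closed_form(u), w_numeric(u))
```

For small u the penalty behaves like R*u^2, so the nonquadratic cost reduces to the usual quadratic one when the input is far from its bound.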

Neural-network-based nearly optimal saturated control law.




To minimize the residual error in a LS sense

 Evaluate the GHJB at a number of points $x_1, x_2, \dots, x_N$ on $\Omega$

    Note, if

\[ u^{(i)}(x) = -A \tanh\!\left( \frac{1}{2A} R^{-1} g^T(x) \nabla \sigma_L^T W_L^{(i-1)} \right) \]

\[ A(x, W_L^{(i-1)}) = \nabla \sigma_L \big( f + g u^{(i)} \big) \]

\[ b(x, W_L^{(i-1)}) = Q + 2 u^{(i)} R A \tanh^{-1}\!\left( \frac{u^{(i)}}{A} \right) + A^2 R \ln\!\left( 1 - \Big( \frac{u^{(i)}}{A} \Big)^2 \right) \]

     Then, GHJB is

\[ W_L^{(i)T} A(x, W_L^{(i-1)}) + b(x, W_L^{(i-1)}) = \varepsilon(x) \]

Evaluating this at N points gives

\[ W_L^{(i)T} \begin{bmatrix} A(x_1, W_L^{(i-1)}) & A(x_2, W_L^{(i-1)}) & \cdots & A(x_N, W_L^{(i-1)}) \end{bmatrix} = -\begin{bmatrix} b(x_1, W_L^{(i-1)}) & b(x_2, W_L^{(i-1)}) & \cdots & b(x_N, W_L^{(i-1)}) \end{bmatrix} \]
L x N coefficient matrix

                Solve by LS

   NN Training Set!
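
The least-squares iteration can be sketched end to end on a scalar problem. Everything specific here is an assumption made for illustration: the drift f(x) = -x, the input gain g = 1, Q(x) = x^2, R = 1, bound A = 1, the two-term basis (x^2, x^4), the sample grid, and the initial control u = 0 (admissible here because the drift is already stable).

```python
import math

BOUND, R = 1.0, 1.0                      # saturation level A and input weight R

def f(x):
    return -x                            # hypothetical stable drift dynamics

def g(x):
    return 1.0                           # hypothetical input gain

def q(x):
    return x * x                         # state penalty Q(x)

def grad_sigma(x):
    return (2.0 * x, 4.0 * x**3)         # gradient of the basis (x^2, x^4)

def control(x, w):
    """u(x) = -A tanh((1/(2A)) R^-1 g(x) grad_sigma(x)^T w); w=None means u = 0."""
    if w is None:
        return 0.0
    s = sum(gi * wi for gi, wi in zip(grad_sigma(x), w))
    return -BOUND * math.tanh(g(x) * s / (2.0 * BOUND * R))

def input_penalty(u):
    """Closed-form W(u) for phi(p) = A tanh(p/A)."""
    s = u / BOUND
    return 2.0 * u * R * BOUND * math.atanh(s) + BOUND**2 * R * math.log(1.0 - s * s)

def ls_step(points, w_prev):
    """Solve W^T A(x_k) = -b(x_k) over all samples via the 2x2 normal equations."""
    m11 = m12 = m22 = r1 = r2 = 0.0
    for x in points:
        u = control(x, w_prev)
        xdot = f(x) + g(x) * u
        a1, a2 = (gs * xdot for gs in grad_sigma(x))
        b = q(x) + input_penalty(u)
        m11 += a1 * a1
        m12 += a1 * a2
        m22 += a2 * a2
        r1 -= a1 * b
        r2 -= a2 * b
    det = m11 * m22 - m12 * m12          # nonzero when the samples are rich enough
    return ((r1 * m22 - r2 * m12) / det, (m11 * r2 - m12 * r1) / det)

points = [k / 10.0 for k in range(-10, 11)]
weights = [ls_step(points, None)]        # iteration 0 evaluates u = 0
for _ in range(10):
    weights.append(ls_step(points, weights[-1]))
print(weights[0], weights[-1])
```

With u = 0 the GHJB has the exact solution V = x^2 / 2, which the basis can represent, so the first weight vector is (0.5, 0). Later iterations lower the leading weight toward the value implied by the unconstrained scalar Riccati equation near the origin.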


Select the N sample points $x_k$

 Uniform Mesh Grid in $\mathbb{R}^n$: approximation error is $1 / N^{2/n}$ (Barron)

 Random selection- Monte Carlo: approximation error is $1 / N$

             Monte Carlo overcomes NP-complexity problems!
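
The two sampling schemes can be sketched as follows. The function names, the box bounds and the point counts are illustrative assumptions; the sketch only shows how the grid size is tied to the dimension while the random sample size is chosen freely.

```python
import itertools
import random

def uniform_grid(lower, upper, points_per_axis, dim):
    """All points of a uniform mesh on the box [lower, upper]^dim."""
    step = (upper - lower) / (points_per_axis - 1)
    axis = [lower + step * k for k in range(points_per_axis)]
    return list(itertools.product(axis, repeat=dim))

def monte_carlo(lower, upper, count, dim, seed=0):
    """`count` points drawn uniformly at random from the same box."""
    rng = random.Random(seed)
    return [tuple(rng.uniform(lower, upper) for _ in range(dim)) for _ in range(count)]

grid = uniform_grid(-1.0, 1.0, points_per_axis=5, dim=3)   # 5**3 = 125 points
cloud = monte_carlo(-1.0, 1.0, count=50, dim=3)            # count is independent of dim
print(len(grid), len(cloud))
```

Keeping five points per axis in six dimensions already needs 5**6 = 15625 grid points, whereas the random sample can stay at any chosen size.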




ASIDE-
Useful for reducing complexity of
fuzzy logic systems?




[Figure: Uniform grid of separable Gaussian activation functions for RBF NN]
  Lemma 3.1: Equation (28) will have a unique solution when

\[ \sum_{k=1}^{N} A(x_k, W_L^{(i-1)})\, A^T(x_k, W_L^{(i-1)}) \ge \alpha I \]

  where $\alpha$ is a positive constant, and $I$ is the identity matrix. This is a persistency of excitation (PE) condition on $A(x_k, W_L^{(i-1)})$.

This guides the choice of the N sample vectors $x_k$

NN Training Set must be PE
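
A PE check can be sketched by looking at the smallest eigenvalue of the summed outer products. The regressor below is an assumption for illustration (basis (x^2, x^4), drift f = -x, control u = 0), as are the function names and sample sets.

```python
import math

def regressor(x):
    """A(x) = grad_sigma(x) * (f + g*u) for basis (x^2, x^4), f = -x, u = 0."""
    xdot = -x
    return (2.0 * x * xdot, 4.0 * x**3 * xdot)

def min_eigenvalue_of_gram(points):
    """Smallest eigenvalue of sum_k A(x_k) A(x_k)^T for a 2-vector regressor."""
    m11 = m12 = m22 = 0.0
    for x in points:
        a1, a2 = regressor(x)
        m11 += a1 * a1
        m12 += a1 * a2
        m22 += a2 * a2
    mean, half_diff = 0.5 * (m11 + m22), 0.5 * (m11 - m22)
    # Eigenvalues of a symmetric 2x2 matrix are mean +/- hypot(half_diff, m12).
    return mean - math.hypot(half_diff, m12)

poor = min_eigenvalue_of_gram([0.5, -0.5])                       # one |x| only: rank 1
rich = min_eigenvalue_of_gram([k / 10.0 for k in range(-10, 11)])
print(poor, rich)
```

Samples that share a single magnitude give identical regressor vectors, so the summed matrix is singular and the weights are not uniquely determined. Spreading the samples over the region restores a positive lower bound.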

Algorithm and Proofs work for any Q(x) in

\[ V = \int_0^\infty \big[ Q(x) + W(u) \big]\, dt \]

Constrained input given by

\[ W(u) = 2 \int_0^u \big( \phi^{-1}(\mu) \big)^T R \, d\mu \]



  CONSTRAINED STATE CONTROL

\[ Q(x, k) = x^T Q x + \sum_{l=1}^{n_c} \left( \frac{x_l}{B_l - \alpha_l} \right)^{k} \]
k large and even




   MINIMUM-TIME CONTROL

\[ V = \int_0^\infty \left[ \tanh(x^T Q x) + 2 \int_0^u \big( \phi^{-1}(\mu) \big)^T R \, d\mu \right] dt \]

  For small R and $x^T Q x \gg 0$ this is approx. $V = \int_0^{t_s} 1 \, dt$



Example: Linear system

 x1  0 0.5  x1   0 
 x   1 1.5   x    1 u,             u 1
 2           2  

\[ V_{15}(x_1, x_2) = w_1 x_1^2 + w_2 x_2^2 + w_3 x_1 x_2 + w_4 x_1^4 + w_5 x_2^4 + w_6 x_1^3 x_2 + w_7 x_1^2 x_2^2 + w_8 x_1 x_2^3 + w_9 x_1^6 + w_{10} x_2^6 + w_{11} x_1^5 x_2 + w_{12} x_1^4 x_2^2 + w_{13} x_1^3 x_2^3 + w_{14} x_1^2 x_2^4 + w_{15} x_1 x_2^5 \]

\[ -0.4 \le x_1 \le 0.4, \qquad -0.4 \le x_2 \le 0.4 \]




Region of asymptotic stability for the initial controller,
      u0  tanh  LQR  0.4142 x1  3.4142 x2 
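
As a plausibility check on the quoted gains, the sketch below verifies a Riccati solution in closed form. It rests on assumptions that are not taken from the slide: the sign pattern A = [[0, -0.5], [1, 1.5]], the input vector B = [0, 1], and the weights Q = I, R = 1. Under those assumptions the LQR gain has magnitudes 0.4142 and 3.4142.

```python
import math

A = [[0.0, -0.5], [1.0, 1.5]]     # ASSUMED sign pattern (not taken from the slide)
B = [0.0, 1.0]                    # ASSUMED input vector
Q = [[1.0, 0.0], [0.0, 1.0]]      # ASSUMED state weight
R = 1.0                           # ASSUMED input weight

s2 = math.sqrt(2.0)
P = [[7.0 + s2, 1.0 - s2], [1.0 - s2, 2.0 + s2]]   # candidate Riccati solution

def riccati_residual(A, B, Q, R, P):
    """A^T P + P A - P B R^-1 B^T P + Q for a 2x2 single-input system."""
    pb = [P[i][0] * B[0] + P[i][1] * B[1] for i in range(2)]      # P B
    res = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            atp = sum(A[k][i] * P[k][j] for k in range(2))        # (A^T P)_ij
            pa = sum(P[i][k] * A[k][j] for k in range(2))         # (P A)_ij
            res[i][j] = atp + pa - pb[i] * pb[j] / R + Q[i][j]
    return res

residual = riccati_residual(A, B, Q, R, P)
gain = [(B[0] * P[0][j] + B[1] * P[1][j]) / R for j in range(2)]  # K = R^-1 B^T P
print(residual, gain)
```

The unsaturated LQR law would be u = -K x, and the initial controller passes that signal through tanh to respect the input bound. Flipping the sign of x1 gives an equivalent system with the same gain magnitudes, so this check does not by itself fix the sign convention.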


Region of asymptotic stability for the nearly optimal controller,

\[ u = -\tanh\!\left( \frac{1}{2} \frac{\partial V_{15}}{\partial x_2} \right), \]

\[ W_{15} = \begin{bmatrix} 8.85 & -0.76 & 3.51 & -2.52 & -1.64 \\ -2.86 & -2.24 & 1.69 & 11.09 & 7.51 \\ 20.61 & 21.57 & 24.35 & 20.84 & 10.53 \end{bmatrix}^T. \]
                                       




Example: Nonlinear oscillator

\[ \dot{x}_1 = x_1 + x_2 - x_1 (x_1^2 + x_2^2), \]
\[ \dot{x}_2 = -x_1 + x_2 - x_2 (x_1^2 + x_2^2) + u, \qquad |u| \le 1 \]

\[ V_{24}(x_1, x_2) = w_1 x_1^2 + w_2 x_2^2 + w_3 x_1 x_2 + w_4 x_1^4 + w_5 x_2^4 + w_6 x_1^3 x_2 + w_7 x_1^2 x_2^2 + w_8 x_1 x_2^3 + w_9 x_1^6 + w_{10} x_2^6 + w_{11} x_1^5 x_2 + w_{12} x_1^4 x_2^2 + w_{13} x_1^3 x_2^3 + w_{14} x_1^2 x_2^4 + w_{15} x_1 x_2^5 + w_{16} x_1^8 + w_{17} x_2^8 + w_{18} x_1^7 x_2 + w_{19} x_1^6 x_2^2 + w_{20} x_1^5 x_2^3 + w_{21} x_1^4 x_2^4 + w_{22} x_1^3 x_2^5 + w_{23} x_1^2 x_2^6 + w_{24} x_1 x_2^7 \]

\[ u_0 = \tanh(-5 x_1 - 3 x_2) \]
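
A short simulation sketch of the oscillator under the initial saturated controller. It assumes the sign convention x1' = x1 + x2 - x1(x1^2 + x2^2), x2' = -x1 + x2 - x2(x1^2 + x2^2) + u with u0 = tanh(-5 x1 - 3 x2); the integrator, step size, horizon and initial state are illustrative choices.

```python
import math

def initial_control(x1, x2):
    """Saturated initial controller; |u| <= 1 by construction."""
    return math.tanh(-5.0 * x1 - 3.0 * x2)

def dynamics(state):
    """Oscillator right-hand side under the initial controller."""
    x1, x2 = state
    r2 = x1 * x1 + x2 * x2
    u = initial_control(x1, x2)
    return (x1 + x2 - x1 * r2, -x1 + x2 - x2 * r2 + u)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))
    k1 = dynamics(state)
    k2 = dynamics(shifted(k1, dt / 2))
    k3 = dynamics(shifted(k2, dt / 2))
    k4 = dynamics(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(x0, dt=0.01, steps=3000):
    """Integrate from x0 for steps*dt seconds and return every state."""
    trajectory = [x0]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], dt))
    return trajectory

trajectory = simulate((0.05, 0.05))
final_norm = math.hypot(*trajectory[-1])
print(final_norm)
```

Starting close to the origin keeps the controller in its nearly linear range, where the closed loop is a damped oscillation. Larger initial states probe the limits of the stability region and may not converge.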



                2.62 x1  4.23x2  0.39 x2  4.0 x13  8.65 x12 x2 
                                              3

                                                                   
                8.94 x1 x2  5.53x2  2.26 x1  5.78 x1 x2  
                             2         5         5          4

                                                                   
    u   tanh  11.00 x13 x2  2.57 x12 x2  2.00 x1 x2  2.08 x2 
                            2             3            4          7


                0.49 x17  1.65 x16 x2  2.71x15 x2  2.19 x14 x2 
                                                    2             3

                                                                   
                0.76 x13 x2  1.77 x12 x2  0.87 x1 x2
                             4            5            6            
                                                                   


[Figure, two panels. Left: State Trajectory for the Nearly Optimal Control Law, Systems States (x1, x2) versus Time(s), 0 to 40 s. Right: Nearly Optimal Control Signal with Input Constraints, Control Input u(x) versus Time(s), 0 to 40 s.]



