Numerical Methods for the Solution of Optimal Feedback Control Problems

Steven James Richardson

This thesis is presented for the degree of
Doctor of Philosophy
of The University of Western Australia
Department of Mathematics & Statistics.
October, 2007


                                     Abstract

   In this thesis we consider numerical methods for solving optimal feed-back
control problems. Such problems are significantly more difficult than the related
task of solving optimal open-loop control problems. Specifically, a feed-back
control depends on both time and the state variables, and so its determination by
numerical schemes is subject to the well-known “curse of dimensionality”.
Consequently, efficient numerical methods are critical to the accurate
determination of optimal feed-back controls.
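
The growth behind the “curse of dimensionality” can be made concrete with a short sketch. It assumes a uniform tensor-product grid over the state space; the function name and the choice of 100 points per axis are illustrative assumptions, not values taken from this thesis.

```python
def grid_nodes(points_per_axis: int, n_states: int) -> int:
    """Number of nodes in a uniform tensor-product grid over the state space."""
    return points_per_axis ** n_states


# Illustrative only: 100 points per axis is an assumed resolution.
for n_states in (1, 2, 3, 6):
    print(f"{n_states} state variable(s): {grid_nodes(100, n_states):,} nodes")
```

The node count is exponential in the number of state variables, so a feed-back control stored on such a grid quickly becomes impractical as the dimension grows.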
   Optimal feed-back control problems can be formulated in two equivalent ways,
providing the opportunity to seek solutions using either of these alternatives.
   The first formulation expresses the problem as a nonlinear hyperbolic partial
differential equation, known as the Hamilton-Jacobi-Bellman (HJB) equation. We
present a method based on solving the so-called viscosity approximation to the HJB
equation, in which the HJB equation is perturbed by a small viscosity term to give
a quasi-linear parabolic partial differential equation. The method uses an
exponentially fitted finite volume/element method in the spatial domain, combined
with implicit time stepping, to produce an unconditionally stable solution
scheme.
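
For orientation, a generic finite-horizon HJB equation and its viscosity approximation can be sketched as follows. The notation is an assumption made here for illustration and may differ from that used in the body of the thesis: V(t,x) is the value function, f the system dynamics, L the running cost, Φ the terminal cost, U the control set and ε > 0 the small viscosity parameter. Boundary conditions on the spatial domain are omitted.

```latex
% Assumed notation (not taken from the thesis): value function V(t,x),
% dynamics \dot{x} = f(t,x,u), running cost L, terminal cost \Phi, control set U.
\begin{align*}
  \frac{\partial V}{\partial t}
    + \min_{u \in U}\bigl[\nabla_x V \cdot f(t,x,u) + L(t,x,u)\bigr] &= 0,
    & V(T,x) &= \Phi(x), \\
  \frac{\partial V^{\varepsilon}}{\partial t}
    + \varepsilon\,\Delta_x V^{\varepsilon}
    + \min_{u \in U}\bigl[\nabla_x V^{\varepsilon} \cdot f(t,x,u) + L(t,x,u)\bigr] &= 0,
    & V^{\varepsilon}(T,x) &= \Phi(x).
\end{align*}
```

The first equation is first order and nonlinear. The second adds the term ε Δ_x V^ε, which makes the problem second order and parabolic; its sign is chosen so that the problem is well posed when solved backwards in time from the terminal condition.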
   We also consider some of the more theoretical aspects related to the solution of
the HJB equation using the exponentially fitted finite volume/element method. In
particular, we present relevant existence, uniqueness and convergence results.
   The second formulation, often referred to as the direct approach, considers the
optimisation of an objective functional with respect to the control function. The
optimisation is subject to the system dynamics, and various constraints on the state
and control variables. We consider an approach based on this direct formulation
using a modified version of the Multivariate Adaptive Regression B-spline algorithm
(BMARS), applied previously in high dimensional regression modeling. This
method was developed specifically to enable the solution of optimal feed-back
control problems with high dimensional state spaces. We demonstrate the efficiency
of the approach by performing numerical experiments on problems with up to six
state variables.
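
The idea behind the direct approach can be illustrated with a toy sketch. This is not the BMARS method: it assumes a hypothetical scalar system x' = u with cost equal to the integral of x^2 + u^2, restricts the feed-back law to the one-parameter family u = -gain * x, and picks the best gain from a small candidate list by simulating the closed-loop system. All names and numerical values are illustrative assumptions.

```python
def closed_loop_cost(gain, x0=1.0, horizon=5.0, dt=1e-3):
    """Simulate x' = u under the feed-back law u = -gain * x (explicit Euler).

    Returns the discretised cost: the sum of (x^2 + u^2) * dt over all steps.
    """
    steps = int(round(horizon / dt))
    x, cost = x0, 0.0
    for _ in range(steps):
        u = -gain * x                 # feed-back: control depends on the current state
        cost += (x * x + u * u) * dt  # accumulate the running cost
        x += u * dt                   # explicit Euler step of x' = u
    return cost


# Crude search over an assumed candidate list: 0.25, 0.5, ..., 2.0.
candidate_gains = [0.25 * i for i in range(1, 9)]
best_gain = min(candidate_gains, key=closed_loop_cost)
print(best_gain, closed_loop_cost(best_gain))
```

The objective is evaluated only through simulation of the dynamics, so no partial differential equation is solved. A practical method would replace the single gain with a richer family of basis functions and the list search with a proper optimiser.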


                                     Contents


Abstract

List of Tables

List of Figures

Acknowledgements

1 Introduction
   1.1   Problem Formulation
   1.2   Hamilton-Jacobi-Bellman Equation
   1.3   Direct Optimal feed-back Control Problem
   1.4   Notation
   1.5   Problem Assumptions
   1.6   Thesis Overview

2 Background
   2.1   Summary
   2.2   The Nature of the Value Function
   2.3   Viscosity Solutions: Definition and Basic Properties
   2.4   The Value Function and the HJB Equation
   2.5   Conclusion

3 Numerical Solution of Hamilton-Jacobi-Bellman Equations by an Exponentially Fitted Finite Volume Method
   3.1   Summary
   3.2   Introduction
   3.3   The Singularly Perturbed Problem
   3.4   The Discretisation Method
   3.5   Numerical Experiments
   3.6   Conclusion

4 Convergence of the Viscosity Approximation and the Calculation of an Extended Domain
   4.1   Summary
   4.2   Introduction
   4.3   Viscosity Solutions of Parabolic Differential Equations
   4.4   The Viscosity Solution of Problem 4.2.2
   4.5   Convergence of the Viscosity Approximation
   4.6   Domain of Dependence
   4.7   Application
   4.8   Conclusion

5 Convergence of the Exponentially Fitted Finite Element Method for the Solution of the Hamilton-Jacobi-Bellman Equation
   5.1   Summary
   5.2   Introduction
   5.3   Variational Form of the Perturbed HJB Equation
   5.4   Spatial Discretisation
   5.5   Full Discretisation
   5.6   Error Estimate: Assuming the optimal control is known a-priori
   5.7   Error Estimate: The complete problem
   5.8   Conclusion

6 A Multivariate Adaptive Regression B-spline Algorithm (BMARS) for Solving a Class of Nonlinear Optimal feed-back Control Problems
   6.1   Summary
   6.2   Introduction
   6.3   Optimal feed-back Control Problems with High Dimensional State Spaces
   6.4   Solution Space Selection
   6.5   Approximation of the Optimal feed-back Control: Formulation 1
   6.6   Approximation of the Optimal feed-back Control: Formulation 2
   6.7   Approximation of the Optimal feed-back Control: Formulation 3
   6.8   Comparative performance of BMARS
   6.9   Six Dimensional Example
   6.10  Suggestions for Future Work
   6.11  Conclusion

7 Conclusion

A Supplementary Definitions and Theorems

B Chapter 2: Supplementary Results and Proofs
   B.1   The Dynamic Programming Principle
   B.2   The Hamilton-Jacobi-Bellman Equation in the Classical Sense

C Chapter 4: Supplementary Results and Proofs
   C.1   Proof of Corollary 4.4.2
   C.2   Proof of Corollary 4.4.9

D Chapter 5: Supplementary Results and Proofs
   D.1   Proof of Lemma 5.4.5
   D.2   Proof of Theorem 5.7.1
   D.3   Proof of Proposition 5.7.2

E Chapter 6: Supplementary Results
   E.1   The Space of Piecewise Polynomial Functions and B-Splines
   E.2   Bases for P_{k,Ξ,Λ}(Ω)
   E.3   Objective Cost Data for Maximum Rocket Height Problem

Bibliography


                              List of Tables

3.5.1 Computed costs for Problem 1.
3.5.2 Computed costs for Problem 2.
3.5.3 Computed costs for Problem 3.

6.4.1 The BMARS Algorithm.
6.4.2 The Adapted BMARS Algorithm.
6.8.1 Objective costs for feed-back controls calculated using the method of
      [35].
6.8.2 The relative errors using the method of [35] and the adapted BMARS
      algorithm.
6.9.1 The Adapted BMARS Algorithm for Multiple Controls.

E.3.1 Objective costs for feedback controls calculated using formulation 1.
E.3.2 Objective costs for feedback controls calculated using formulation 2.
E.3.3 Objective costs for feedback controls calculated using formulation 3.


                             List of Figures

2.2.1 A function satisfying the requirements specified for L in Example 2.2.1.

3.4.1 Elements and edges associated with the node x_i in two dimensions.
3.5.1 Surface plots of the control from Problem 1 at time 0.25 (top), and
      0.75 (bottom).
3.5.2 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test
      1.1. Bottom: The control for Test 1.1.
3.5.3 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test
      1.2. Bottom: The control for Test 1.2.
3.5.4 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test
      1.3. Bottom: The control for Test 1.3.
3.5.5 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test
      1.4. Bottom: The control for Test 1.4.
3.5.6 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test
      2.2. Middle: The computed trajectory of x_3 for Test 2.2. Bottom:
      Control 1 (left) and control 2 (right) for Test 2.2.
3.5.7 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test
      2.3. Middle: The computed trajectory of x_3 for Test 2.3. Bottom:
      Control 1 (left) and control 2 (right) for Test 2.3.
3.5.8 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test
      3.3. Middle: The computed trajectory of x_3 (left) and control 1 for
      Test 3.3. Bottom: Control 2 (left) and control 3 (right) for Test 3.3.
3.5.9 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test
      3.5. Middle: The computed trajectory of x_3 (left) and control 1 for
      Test 3.5. Bottom: Control 2 (left) and control 3 (right) for Test 3.5.

4.7.1 A plot comparing the calculated upper bound for Ω̃ (bounded by
      outer circle) and the Ω̃ used in the numerical experiment (bounded
      by dashed rectangle) for Problem 1. Also shown are Ω (bounded by
      solid rectangle), and Ω_circle (bounded by inner circle).
4.7.2 A plot comparing the upper bounds of Ω̃ for Problem 1, calculated
      using different sized subdivisions of Ω (bounded by solid rectangle).
      Working from the outside we have divisions of 1 × 1, 2 × 2, 5 × 5,
      10 × 10, 20 × 20, and 40 × 40. Also shown is the Ω̃ used in the
      numerical experiment (bounded by dashed rectangle).
4.7.3 A plot showing the smallest rectangular domain containing the Ω̃
      calculated using the 40 × 40 division of Ω (bounded by solid rectangle)
      for Problem 1. Also shown are Ω̃ calculated without any division of Ω
      (bounded by outer circle) and the Ω̃ used in the numerical experiment
      (bounded by dashed rectangle).
4.7.4 A cross-section in the x_1 − x_2 plane, showing the upper bounds for Ω̃
      calculated using different sized sub-divisions of Ω (bounded by solid
      rectangle) for Problem 2. Working from the outside we have divisions
      of 1 × 1 × 1, 2 × 2 × 2, 4 × 4 × 4 and 12 × 12 × 12. Also shown is the
      Ω̃ used in the numerical experiment (- - -).
4.7.5 A cross-section in the x_1 − x_3 plane, showing the upper bounds for Ω̃
      calculated using different sized sub-divisions of Ω (bounded by solid
      rectangle) for Problem 2. Working from the outside we have divisions
      of 1 × 1 × 1, 2 × 2 × 2, 4 × 4 × 4 and 12 × 12 × 12. Also shown is the
      Ω̃ used in the numerical experiment (- - -).
4.7.6 A cross-section in the x_2 − x_3 plane, showing the upper bounds for Ω̃
      calculated using different sized sub-divisions of Ω (bounded by solid
      rectangle) for Problem 2. Working from the outside we have divisions
      of 1 × 1 × 1, 2 × 2 × 2, 4 × 4 × 4 and 12 × 12 × 12. Also shown is the
      Ω̃ used in the numerical experiment (- - -).
4.7.7 A cross-section in the x_1 − x_2 plane, showing the upper bounds for Ω̃
      calculated using different sized sub-divisions of Ω (bounded by solid
      rectangle) for Problem 3. Working from the outside we have divisions
      of 1 × 1 × 1, 2 × 2 × 2, 4 × 4 × 4 and 12 × 12 × 12. Also shown is the
      Ω̃ used in the numerical experiment (- - -).
4.7.8 A cross-section in the x_1 − x_3 plane, showing the upper bounds for Ω̃
      calculated using different sized sub-divisions of Ω (bounded by solid
      rectangle) for Problem 3. Working from the outside we have divisions
      of 1 × 1 × 1, 2 × 2 × 2, 4 × 4 × 4 and 12 × 12 × 12. Also shown is the
      Ω̃ used in the numerical experiment (- - -).
4.7.9 A cross-section in the x_2 − x_3 plane, showing the upper bounds for Ω̃
      calculated using different sized sub-divisions of Ω (bounded by solid
      rectangle) for Problem 3. Working from the outside we have divisions
      of 1 × 1 × 1, 2 × 2 × 2, 4 × 4 × 4 and 12 × 12 × 12. Also shown is the
      Ω̃ used in the numerical experiment (- - -).


5.4.1 Elements of the Dirichlet tessellation D_h and Box mesh B_h associated
      with node x_i of the Delaunay mesh T_h. The support of the
      constant basis function ξ_i is shaded on the left, and the support of
      the exponential basis function φ_i is shaded on the right.

6.5.1 Plots of the states and control for Test 1. The order is x_1 (top left),
      x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-),
      open-loop (...).
6.5.2 Plots of the states and control for Test 2. The order is x_1 (top left),
      x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-),
      open-loop (...).
6.5.3 Number of basis functions vs Relative error. Test 1 (-), Test 2 (...),
      Test 3 (- -), Test 4 (-.-.)
6.6.1 Number of basis functions vs Relative error. Test 1 (-), Test 2 (...),
      Test 3 (- -), Test 4 (-.-.)
6.6.2 Plots of the states and control for Test 1. The order is x_1 (top left),
      x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-),
      open-loop (...).
6.6.3 Plots of the states and control for Test 2. The order is x_1 (top left),
      x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-),
      open-loop (...).
6.6.4 Plots of the states and control for Test 4. The order is x_1 (top left),
      x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-),
      open-loop (...).
6.7.1 Number of basis functions vs Relative error. Test 1 (-), Test 2 (...),
      Test 3 (- -), Test 4 (-.-.)
6.7.2 Plots of the states and control for Test 1. The order is x_1 (top left),
      x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-),
      open-loop (...).
6.7.3 Plots of the states and control for Test 2. The order is x_1 (top left),
      x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-),
      open-loop (...).
6.7.4 Plots of the states and control for Test 4. The order is x_1 (top left),
      x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-),
      open-loop (...).
6.9.1 Number of basis functions vs Relative error. Test 1 (-), Test 2 (- -),
      Test E (...), Test F (-.-.)
6.9.2 Number of basis functions vs Relative error. Test A (-), Test B (...),
      Test C (- -), Test D (-.-.)

E.2.1 From left to right: 6(x − 1)_+^0, 6(x − 1)_+^1, 3(x − 1)_+^2, (x − 1)_+^3.
E.2.2 From left to right: B_{1,1,t}(x), B_{1,2,t}(x), B_{1,3,t}(x), B_{1,4,t}(x),
      where t = (0, 1, 2, 3, 4).


                               Acknowledgements

   I would firstly like to thank my primary supervisor, Associate Professor Song
Wang. I cannot begin to express my gratitude for your contribution to my PhD
studies. After spending the first year of my candidature on another project which
was ultimately unproductive, I felt hopeless and desperate at my quickly diminishing
PhD prospects. By taking me on as your student at this time, you provided me with
a lifeline, and a reignited sense of hope and enthusiasm toward my studies. Thank
you not only for your ongoing guidance, patience and concern for the progression
of my research, but also for your concern toward my financial situation and the
opportunities you provided to that end. Looking back I could not predict with any
confidence that I would have made it to this point without your assistance in these
areas.
   To my other supervisor, Associate Professor Les Jennings. I understand that as
Head of Department you have large demands on your time, and I appreciate not
only your agreement to co-supervise my research, but also your willingness to very
often drop whatever you were doing to help me when I would turn up unannounced.
Thank you for your help and patience in identifying many of my coding errors, and
for all of your advice, particularly that relating to MISER.
   To the computing staff, Roman Bogoyev, Michael Juschke, and Con Savas.
Thank you all for helping to address the many computing difficulties I have faced
over my time as a postgraduate student. In particular thank you to Roman for all
of your advice. You have contributed significantly not only to my understanding of
programming, but also to my general computer literacy.
   To the administration staff, and in particular Annette Harrison. Thank you
for dealing with my many tedious pay requests over the years, and also for your
assistance in printing and binding copies of this thesis.
   Thanks also to all staff members who at one time or another contributed their
time or resources to the discussion of mathematical ideas relating to my research, or
my path as a PhD student. These staff include Dr Grant Keady, Adjunct Professor
CJ Goh, and Dr Des Hill. In addition I would also like to thank the supervisors
of my initial ill-fated project, Dr Neville Fowkes, and Associate Professor Terry
Edwards (School of Oil and Gas Engineering) for the time they spent advising me
during that year.
   To my fellow PhD students with whom I have shared room 2.19, John Bamberg,
Jonathan Campbell, and Geoff Pearce. I don’t know that on the whole we have been
productive influences on each other; however, I have enjoyed our many discussions,
which have helped distract me from the often tremendous frustration which comes
with doing a PhD.
   From a financial point of view I would like to thank the University of Western
Australia for awarding me a Maude Gledden scholarship, and the School of
Mathematics and Statistics for providing me with the opportunity to supplement this
income with part-time teaching work. I would also like to thank the Australian
Research Council who provided the grant money for the research of which mine formed
some part.
   Finally, to my wife Katy. Thank you for your support throughout this time, and
for your willingness to accept the sacrifices which have accompanied my studies.
T.N.E.T is finished!

				