Numerical Methods for the Solution of Optimal Feedback Control Problems

```
Numerical Methods for the Solution of Optimal Feedback Control Problems

Steven James Richardson

This thesis is presented for the degree of
Doctor of Philosophy
of The University of Western Australia
Department of Mathematics & Statistics.
October, 2007

Abstract

In this thesis we consider some numerical methods for solving optimal feed-back
control problems. Finding solutions to problems of this nature is significantly
more difficult than the related task of solving optimal open-loop control
problems. Specifically, a feed-back control depends on both time and state
variables, and so its determination by numerical schemes is subject to the
well-known "curse of dimensionality". Consequently, efficient numerical methods
are critical to the accurate determination of optimal feed-back controls.

Optimal feed-back control problems can be formulated in two equivalent ways,
providing the opportunity to seek solutions using either of these alternatives.

The first formulation expresses the problem as a nonlinear hyperbolic partial
differential equation, known as the Hamilton-Jacobi-Bellman (HJB) equation. We
present a method based on solving the so-called viscosity approximation to the
HJB equation, in which the HJB equation is perturbed by a small viscosity term
to give a quasi-linear parabolic partial differential equation. The method uses
an exponentially fitted finite volume/element method in the spatial domain,
combined with implicit time stepping, to produce an unconditionally stable
solution scheme.

We also consider some of the more theoretical aspects related to the solution of
the HJB equation using the exponentially fitted finite volume/element method. In
particular, we present relevant existence, uniqueness and convergence results.

The second formulation, often referred to as the direct approach, considers the
optimisation of an objective functional with respect to the control function.
The optimisation is subject to the system dynamics, and to various constraints
on the state and control variables. We consider an approach based on this direct
formulation using a modified version of the Multivariate Adaptive Regression
B-spline algorithm (BMARS), applied previously in high dimensional regression
modeling. This method was developed specifically to enable the solution of
optimal feed-back control problems with high dimensional state spaces. We
demonstrate the efficiency of the approach by performing numerical experiments
on problems with up to six state variables.

Contents

Abstract
List of Tables
List of Figures
Acknowledgements

1 Introduction
1.1 Problem Formulation
1.2 Hamilton-Jacobi-Bellman Equation
1.3 Direct Optimal feed-back Control Problem
1.4 Notation
1.5 Problem Assumptions
1.6 Thesis Overview

2 Background
2.1 Summary
2.2 The Nature of the Value Function
2.3 Viscosity Solutions: Definition and Basic Properties
2.4 The Value Function and the HJB Equation
2.5 Conclusion

3 Numerical Solution of Hamilton-Jacobi-Bellman Equations by an Exponentially Fitted Finite Volume Method
3.1 Summary
3.2 Introduction
3.3 The Singularly Perturbed Problem
3.4 The Discretisation Method
3.5 Numerical Experiments
3.6 Conclusion

4 Convergence of the Viscosity Approximation and the Calculation of an Extended Domain
4.1 Summary
4.2 Introduction
4.3 Viscosity Solutions of Parabolic Differential Equations
4.4 The Viscosity Solution of Problem 4.2.2
4.5 Convergence of the Viscosity Approximation
4.6 Domain of Dependence
4.7 Application
4.8 Conclusion

5 Convergence of the Exponentially Fitted Finite Element Method for the Solution of the Hamilton-Jacobi-Bellman Equation
5.1 Summary
5.2 Introduction
5.3 Variational Form of the Perturbed HJB Equation
5.4 Spatial Discretisation
5.5 Full Discretisation
5.6 Error Estimate: Assuming the optimal control is known a-priori
5.7 Error Estimate: The complete problem
5.8 Conclusion

6 A Multivariate Adaptive Regression B-spline Algorithm (BMARS) for Solving a Class of Nonlinear Optimal feed-back Control Problems
6.1 Summary
6.2 Introduction
6.3 Optimal feed-back Control Problems with High Dimensional State Spaces
6.4 Solution Space Selection
6.5 Approximation of the Optimal feed-back Control: Formulation 1
6.6 Approximation of the Optimal feed-back Control: Formulation 2
6.7 Approximation of the Optimal feed-back Control: Formulation 3
6.8 Comparative performance of BMARS
6.9 Six Dimensional Example
6.10 Suggestions for Future Work
6.11 Conclusion

7 Conclusion

A Supplementary Definitions and Theorems

B Chapter 2: Supplementary Results and Proofs
B.1 The Dynamic Programming Principle
B.2 The Hamilton-Jacobi-Bellman Equation in the Classical Sense

C Chapter 4: Supplementary Results and Proofs
C.1 Proof of Corollary 4.4.2
C.2 Proof of Corollary 4.4.9

D Chapter 5: Supplementary Results and Proofs
D.1 Proof of Lemma 5.4.5
D.2 Proof of Theorem 5.7.1
D.3 Proof of Proposition 5.7.2

E Chapter 6: Supplementary Results
E.1 The Space of Piecewise Polynomial Functions and B-Splines
E.2 Bases for P_{k,Ξ,Λ}(Ω)
E.3 Objective Cost Data for Maximum Rocket Height Problem

Bibliography

List of Tables

3.5.1 Computed costs for Problem 1.
3.5.2 Computed costs for Problem 2.
3.5.3 Computed costs for Problem 3.

6.4.1 The BMARS Algorithm.
6.4.2 The Adapted BMARS Algorithm.
6.8.1 Objective costs for feed-back controls calculated using the method of [35].
6.8.2 The relative errors using the method of [35] and the adapted BMARS algorithm.
6.9.1 The Adapted BMARS Algorithm for Multiple Controls.

E.3.1 Objective costs for feedback controls calculated using formulation 1.
E.3.2 Objective costs for feedback controls calculated using formulation 2.
E.3.3 Objective costs for feedback controls calculated using formulation 3.

List of Figures

2.2.1 A function satisfying the requirements specified for L in Example 2.2.1.

3.4.1 Elements and edges associated with the node x_i in two dimensions.
3.5.1 Surface plots of the control from Problem 1 at time 0.25 (top), and 0.75 (bottom).
3.5.2 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test 1.1. Bottom: The control for Test 1.1.
3.5.3 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test 1.2. Bottom: The control for Test 1.2.
3.5.4 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test 1.3. Bottom: The control for Test 1.3.
3.5.5 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test 1.4. Bottom: The control for Test 1.4.
3.5.6 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test 2.2. Middle: The computed trajectory of x_3 for Test 2.2. Bottom: Control 1 (left) and control 2 (right) for Test 2.2.
3.5.7 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test 2.3. Middle: The computed trajectory of x_3 for Test 2.3. Bottom: Control 1 (left) and control 2 (right) for Test 2.3.
3.5.8 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test 3.3. Middle: The computed trajectory of x_3 (left) and control 1 for Test 3.3. Bottom: Control 2 (left) and control 3 (right) for Test 3.3.
3.5.9 Top: The computed trajectories of x_1 (left), and x_2 (right) for Test 3.5. Middle: The computed trajectory of x_3 (left) and control 1 for Test 3.5. Bottom: Control 2 (left) and control 3 (right) for Test 3.5.

4.7.1 A plot comparing the calculated upper bound for Ω̃ (bounded by outer circle) and the Ω̃ used in the numerical experiment (bounded by dashed rectangle) for Problem 1. Also shown are Ω (bounded by solid rectangle), and Ω_circle (bounded by inner circle).
4.7.2 A plot comparing the upper bounds of Ω̃ for Problem 1, calculated using different sized subdivisions of Ω (bounded by solid rectangle). Working from the outside we have divisions of 1 × 1, 2 × 2, 5 × 5, 10 × 10, 20 × 20, and 40 × 40. Also shown is the Ω̃ used in the numerical experiment (bounded by dashed rectangle).
4.7.3 A plot showing the smallest rectangular domain containing the Ω̃ calculated using the 40 × 40 division of Ω (bounded by solid rectangle) for Problem 1. Also shown are Ω̃ calculated without any division of Ω (bounded by outer circle) and the Ω̃ used in the numerical experiment (bounded by dashed rectangle).
4.7.4 A cross-section in the x_1 − x_2 plane, showing the upper bounds for Ω̃ calculated using different sized sub-divisions of Ω (bounded by solid rectangle) for Problem 2. Working from the outside we have divisions of 1 × 1 × 1, 2 × 2 × 2, 4 × 4 × 4 and 12 × 12 × 12. Also shown is the Ω̃ used in the numerical experiment (- - -).
4.7.5 A cross-section in the x_1 − x_3 plane, showing the upper bounds for Ω̃ calculated using different sized sub-divisions of Ω (bounded by solid rectangle) for Problem 2. Working from the outside we have divisions of 1 × 1 × 1, 2 × 2 × 2, 4 × 4 × 4 and 12 × 12 × 12. Also shown is the Ω̃ used in the numerical experiment (- - -).
4.7.6 A cross-section in the x_2 − x_3 plane, showing the upper bounds for Ω̃ calculated using different sized sub-divisions of Ω (bounded by solid rectangle) for Problem 2. Working from the outside we have divisions of 1 × 1 × 1, 2 × 2 × 2, 4 × 4 × 4 and 12 × 12 × 12. Also shown is the Ω̃ used in the numerical experiment (- - -).
4.7.7 A cross-section in the x_1 − x_2 plane, showing the upper bounds for Ω̃ calculated using different sized sub-divisions of Ω (bounded by solid rectangle) for Problem 3. Working from the outside we have divisions of 1 × 1 × 1, 2 × 2 × 2, 4 × 4 × 4 and 12 × 12 × 12. Also shown is the Ω̃ used in the numerical experiment (- - -).
4.7.8 A cross-section in the x_1 − x_3 plane, showing the upper bounds for Ω̃ calculated using different sized sub-divisions of Ω (bounded by solid rectangle) for Problem 3. Working from the outside we have divisions of 1 × 1 × 1, 2 × 2 × 2, 4 × 4 × 4 and 12 × 12 × 12. Also shown is the Ω̃ used in the numerical experiment (- - -).
4.7.9 A cross-section in the x_2 − x_3 plane, showing the upper bounds for Ω̃ calculated using different sized sub-divisions of Ω (bounded by solid rectangle) for Problem 3. Working from the outside we have divisions of 1 × 1 × 1, 2 × 2 × 2, 4 × 4 × 4 and 12 × 12 × 12. Also shown is the Ω̃ used in the numerical experiment (- - -).

5.4.1 Elements of the Dirichlet tessellation D_h and Box mesh B_h associated with node x_i of the Delaunay mesh T_h. The support of the constant basis function ξ_i is shaded on the left, and the support of the exponential basis function φ_i is shaded on the right.

6.5.1 Plots of the states and control for Test 1. The order is x_1 (top left), x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-), open-loop (...).
6.5.2 Plots of the states and control for Test 2. The order is x_1 (top left), x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-), open-loop (...).
6.5.3 Number of basis functions vs relative error. Test 1 (-), Test 2 (...), Test 3 (- -), Test 4 (-.-.).
6.6.1 Number of basis functions vs relative error. Test 1 (-), Test 2 (...), Test 3 (- -), Test 4 (-.-.).
6.6.2 Plots of the states and control for Test 1. The order is x_1 (top left), x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-), open-loop (...).
6.6.3 Plots of the states and control for Test 2. The order is x_1 (top left), x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-), open-loop (...).
6.6.4 Plots of the states and control for Test 4. The order is x_1 (top left), x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-), open-loop (...).
6.7.1 Number of basis functions vs relative error. Test 1 (-), Test 2 (...), Test 3 (- -), Test 4 (-.-.).
6.7.2 Plots of the states and control for Test 1. The order is x_1 (top left), x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-), open-loop (...).
6.7.3 Plots of the states and control for Test 2. The order is x_1 (top left), x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-), open-loop (...).
6.7.4 Plots of the states and control for Test 4. The order is x_1 (top left), x_2 (top right), x_3 (bottom left), u (bottom right). Closed-loop (-), open-loop (...).
6.9.1 Number of basis functions vs relative error. Test 1 (-), Test 2 (- -), Test E (...), Test F (-.-.).
6.9.2 Number of basis functions vs relative error. Test A (-), Test B (...), Test C (- -), Test D (-.-.).

E.2.1 From left to right: 6(x − 1)_+^0, 6(x − 1)_+^1, 3(x − 1)_+^2, (x − 1)_+^3.
E.2.2 From left to right: B_{1,1,t}(x), B_{1,2,t}(x), B_{1,3,t}(x), B_{1,4,t}(x), where t = (0, 1, 2, 3, 4).

Acknowledgements

I would firstly like to thank my primary supervisor, Associate Professor Song
Wang. I cannot begin to express my gratitude for your contribution to my PhD
studies. After spending the first year of my candidature on another project which
was ultimately unproductive, I felt hopeless and desperate at my quickly diminishing
PhD prospects. By taking me on as your student at this time, you provided me with
a lifeline, and a reignited sense of hope and enthusiasm toward my studies. Thank
you not only for your ongoing guidance, patience and concern for the progression
of my research, but also for your concern toward my financial situation and the
opportunities you provided to that end. Looking back, I could not predict with any
confidence that I would have made it to this point without your assistance in these
areas.

To my other supervisor, Associate Professor Les Jennings. I understand that as
Head of Department you have large demands on your time, and I appreciate not
only your agreement to co-supervise my research, but also your willingness to very
often drop whatever you were doing to help me when I would turn up unannounced.
Thank you for your help and patience in identifying many of my coding errors, and

To the computing staff, Roman Bogoyev, Michael Juschke, and Con Savas.
Thank you all for helping to address the many computing difficulties I have faced
over my time as a postgraduate student. In particular, thank you to Roman for all
of your advice. You have contributed significantly not only to my understanding of
programming, but also to my general computer literacy.

To the administration staff, and in particular Annette Harrison. Thank you
for dealing with my many tedious pay requests over the years, and also for your
assistance in printing and binding copies of this thesis.

Thanks also to all staff members who at one time or another contributed their
time or resources to the discussion of mathematical ideas relating to my research, or
my path as a PhD student. These staff include Dr Grant Keady, Adjunct Professor
CJ Goh, and Dr Des Hill. In addition, I would also like to thank the supervisors
of my initial ill-fated project, Dr Neville Fowkes, and Associate Professor Terry
Edwards (School of Oil and Gas Engineering) for the time they spent advising me
during that year.

To my fellow PhD students with whom I have shared room 2.19, John Bamberg,
Jonathan Campbell, and Geoff Pearce. I don’t know that on the whole we have been
productive influences on each other; however, I have enjoyed our many discussions,
which have helped distract me from the often tremendous frustration which comes
with doing a PhD.

From a financial point of view, I would like to thank the University of Western
Australia for awarding me a Maude Gledden scholarship, and the School of
Mathematics and Statistics for providing me with the opportunity to supplement this
income with part-time teaching work. I would also like to thank the Australian
Research Council, who provided the grant money for the research of which mine formed
some part.

Finally, to my wife Katy. Thank you for your support throughout this time, and
for your willingness to accept the sacrifices which have accompanied my studies.
T.N.E.T is finished!

```
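
The abstract describes the HJB equation and its viscosity approximation only in words. As a rough sketch in standard textbook notation, the two problems can be written as below. The symbols $V$, $V^{\varepsilon}$, $f$, $L$, $\Phi$, $U$, $T$ and $\varepsilon$ are assumptions chosen for illustration: this excerpt does not show the notation or the exact problem statement used in the thesis. The sketch assumes dynamics $\dot{x} = f(t,x,u)$, a running cost $L$, a terminal cost $\Phi$ and a control set $U$.

```latex
% Standard textbook form, not copied from the thesis.
% V(t,x) is the value function on [0,T] x \Omega; U is the control set.
\begin{align}
  % HJB equation: first-order, nonlinear, posed backward from t = T
  \frac{\partial V}{\partial t}
    + \min_{u \in U} \bigl[ \nabla V \cdot f(t,x,u) + L(t,x,u) \bigr] &= 0,
  & V(T,x) &= \Phi(x), \\
  % Viscosity approximation: the same equation perturbed by a small
  % second-order term, with 0 < \varepsilon \ll 1
  \frac{\partial V^{\varepsilon}}{\partial t}
    + \varepsilon \, \Delta V^{\varepsilon}
    + \min_{u \in U} \bigl[ \nabla V^{\varepsilon} \cdot f(t,x,u) + L(t,x,u) \bigr] &= 0,
  & V^{\varepsilon}(T,x) &= \Phi(x).
\end{align}
```

Because both problems are solved backward in time from the terminal condition, the perturbation enters with a positive sign; substituting $\tau = T - t$ turns the second equation into a forward parabolic problem. Boundary conditions for the perturbed problem are omitted from this sketch.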