Chapter 3: Quadratic Functions in Linear System Analysis

The modern algorithms for analysis and design of LTI systems are best explained in terms of quadratic forms, serving as storage functions and supply rates for the underlying state space models. Though alternative interpretations of the basic theory are possible, this approach leads to the simplest derivations of the main results, and is most helpful for a researcher seeking to generalize the classical techniques to a variety of time-varying, nonlinear, or large scale applications. This chapter presents the fundamental tools of the "quadratic" theory of LTI systems: the classical Lyapunov theorem, the so-called Kalman-Yakubovich-Popov (KYP) lemma (sometimes referred to as the (generalized) positive real lemma), and the generalized Parrott's theorem, as well as the framework of semidefinite programming, or linear matrix inequality (LMI) optimization.

3.1 Storage Functions and Supply Rates

The notions of storage functions and supply rates apply to very general classes of systems, generalizing the definitions of energy from physics models and of Lyapunov functions from autonomous systems.

3.1.1 General Dissipativity

Let S be a discrete time system with input w = w(t) ∈ R^m and output x = x(t) ∈ R^n. Let V: R^n → R and σ: R^n × R^m → R be two functions. The system S is said to be dissipative with storage function V and supply rate σ when the inequality

    σ(x(t), w(t)) ≥ V(x(t+1)) − V(x(t))    (3.1.1)

is satisfied for every input/output pair (x(·), w(·)) and for every time instance t. The definition in the continuous time case is similar: instead of (3.1.1), it requires the integral inequality

    ∫_{t0}^{t1} σ(x(t), w(t)) dt ≥ V(x(t1)) − V(x(t0))    (3.1.2)

to be satisfied for all input/output pairs (x(·), w(·)) and all time instances t1 > t0.
Example 3.1.1 The system describing a passive two-port electric circuit (possibly a nonlinear one) is dissipative with supply rate σ = vi and storage function V = E, where E is the total circuit energy, v is the voltage between the ports, and i is the current flowing through the ports.

Naturally, dissipativity with non-negative supply rates is not interesting: every σ ≥ 0 satisfies (3.1.1) (or (3.1.2) in the continuous time case) with V(x) ≡ 0.

In most applications, dissipativity is applied with x being the state of the system. For example, if S is defined by the differential equation

    ẋ(t) = F(x(t), w(t)),    (3.1.3)

where F: R^n × R^m → R^n is a given continuous function (for example, it could be that F(x, w) = Ax + Bw when (3.1.3) is an LTI model), V: R^n → R is differentiable, and σ: R^n × R^m → R is continuous, the resulting system is dissipative with storage function V and supply rate σ if and only if the inequality

    σ(x, w) ≥ ∇V(x) F(x, w)    (3.1.4)

holds for all x, w, where ∇V(x) denotes the gradient of V written as a row vector. Similarly, a discrete time system

    x(t+1) = F(x(t), w(t))    (3.1.5)

is dissipative with storage function V and supply rate σ if and only if

    σ(x, w) ≥ V(F(x, w)) − V(x)    (3.1.6)

for all x, w.

Example 3.1.2 The nonlinear system ẋ = −x + w³ is dissipative with storage function V(x) = 0.75|x|^{4/3} and supply rate σ(x, w) = w⁴ − xw, because the inequality

    w(−x + w³) ≥ x^{1/3}(−x + w³)

holds for all x, w ∈ R.

A very important special case of dissipativity is represented by the notion of L2 gain: for a system with input w = w(t), output e = e(t), and state x, its L2 gain is defined as the greatest lower bound of those γ > 0 for which the system is dissipative with supply rate σ = γ²|w|² − |e|² and some non-negative storage function V = V(x). The definition of the L2 gain is a special case of an important class of dissipativity conditions defined in terms of quadratic supply rates.
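The inequality in Example 3.1.2 can be spot-checked numerically. The following Python sketch (illustrative only, not part of the course toolchain) evaluates the dissipation gap σ(x, w) − ∇V(x)F(x, w) = (w − x^{1/3})(w³ − x) on a grid and confirms it is non-negative:

```python
# Spot-check of Example 3.1.2: w*(-x + w**3) >= x^{1/3} * (-x + w**3)
# must hold for all real x, w.  Tested on a grid of sample points.

def cbrt(x):
    """Real (signed) cube root."""
    return x ** (1.0 / 3.0) if x >= 0 else -((-x) ** (1.0 / 3.0))

def dissipation_gap(x, w):
    """sigma(x, w) - grad V(x) * F(x, w), which must be non-negative."""
    f = -x + w ** 3                  # right-hand side of x' = -x + w^3
    return w * f - cbrt(x) * f       # = (w - x^{1/3}) (w^3 - x) >= 0

worst = min(dissipation_gap(0.1 * i, 0.1 * j)
            for i in range(-30, 31) for j in range(-30, 31))
assert worst >= -1e-12               # non-negative up to rounding
```

The factorization (w − x^{1/3})(w³ − x) makes the sign obvious: both factors change sign together, since t → t³ is monotone.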
Such conditions (sometimes referred to as Integral Quadratic Constraints, or IQC) facilitate the analysis of robustness of uncertain and nonlinear systems: complex signal relations are reduced to a set of IQC, allowing the user to model everything with linear equations and quadratic inequalities.

3.1.2 Linear-Quadratic Dissipativity

Extensive use of quadratic supply rates and quadratic storage functions for systems described by linear equations is typical of modern robust control. An application of the technique usually involves a linear state space model

    ẋ(t) = Ax(t) + Bw(t)    (3.1.7)

(or, equivalently,

    x(t+1) = Ax(t) + Bw(t)    (3.1.8)

in the discrete time case), defined by real matrices A, B (of dimensions n-by-n and n-by-m respectively), a quadratic form

    V = V(x) = x′Px  (P = P′)    (3.1.9)

defined by a symmetric n-by-n real matrix P, and a quadratic supply rate

    σ = σ(x, w) = [x; w]′ Σ [x; w] = x′Σ11x + 2x′Σ12w + w′Σ22w,    (3.1.10)

defined by a symmetric (n+m)-by-(n+m) matrix Σ. It is easy to check that system (3.1.7) is dissipative with supply rate (3.1.10) and storage function (3.1.9) if and only if

    σ(x, w) − 2x′P(Ax + Bw) ≥ 0,    (3.1.11)

where the "≥ 0" means positive semidefiniteness of the quadratic form on the left side of the inequality. Similarly, system (3.1.8) is dissipative with supply rate (3.1.10) and storage function (3.1.9) if and only if

    σ(x, w) − (Ax + Bw)′P(Ax + Bw) + x′Px ≥ 0.    (3.1.12)

The quadratic form inequalities can also be written as matrix inequalities: (3.1.11) is equivalent to

    Σ − [I; 0] P [A B] − [A′; B′] P [I 0] ≥ 0,    (3.1.13)

while (3.1.12) is equivalent to

    Σ − [A′; B′] P [A B] + [I; 0] P [I 0] ≥ 0.    (3.1.14)

Note that the inequalities (3.1.13) and (3.1.14) are linear with respect to Σ and P: this will be used later in the application of semidefinite programming to robustness analysis and optimization.
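Assembling the matrix of inequality (3.1.13) is a one-line computation. The following Python sketch does so for made-up scalar data (A = −1, B = 1, supply rate σ = |w|² − |x|², storage V = x², so P = 1); the data is illustrative only:

```python
import numpy as np

# Matrix of (3.1.13):  Sigma - [I; 0] P [A B] - [A'; B'] P [I 0]
def lmi_matrix_ct(A, B, P, Sigma):
    n, m = B.shape
    E10 = np.hstack([np.eye(n), np.zeros((n, m))])   # [I 0]
    EAB = np.hstack([A, B])                          # [A B]
    return Sigma - E10.T @ P @ EAB - EAB.T @ P @ E10

A = np.array([[-1.0]]); B = np.array([[1.0]])
P = np.array([[1.0]])                                # V(x) = x^2
Sigma = np.diag([-1.0, 1.0])                         # sigma = -x^2 + w^2
M = lmi_matrix_ct(A, B, P, Sigma)
# Here sigma - 2xP(Ax+Bw) = (x - w)^2, so M is positive semidefinite:
assert np.all(np.linalg.eigvalsh(M) >= -1e-12)
```

For this data the left side of (3.1.11) equals w² − x² − 2x(−x + w) = (x − w)², so the resulting matrix is positive semidefinite, as the eigenvalue check confirms.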
Since positive definiteness of quadratic forms is relatively easy to verify numerically, quadratic dissipativity is frequently used to certify properties of linear systems.

3.1.3 Example: Lyapunov Equations and Inequalities

Let A be a Hurwitz matrix (i.e. all eigenvalues of A have negative real part), which means that every solution x = x(t) of the differential equation

    ẋ(t) = Ax(t)    (3.1.15)

converges to zero as t → +∞. Assume that our desire is to calculate, or to estimate, the integral

    J = ∫_0^∞ x(t)′Rx(t) dt,

given a symmetric matrix R = R′ and the initial value x0 = x(0). In particular, when x0 = B and R = C′C, the value of J equals the square of the H2 norm of the transfer matrix G(s) = C(sI − A)^{−1}B.

Trying to use "brute force" to solve the problem, in the fashion of finding an explicit expression for x(t) as a function of t and integrating the resulting expression, is unlikely to work well in all but the simplest examples. After all, calculating x(t) analytically requires one to find all eigenvalues of A, and for dimensions higher than 4, there is no explicit formula for those. Translating the problem into the frequency domain by using Fourier (or Laplace) transforms and the Parseval formulae also leads to rather complicated-looking integrals.

In contrast, the idea of quadratic storage functions produces a real breakthrough for the task. Consider (3.1.15) as a special case of (3.1.7) (with B = 0), and define the quadratic supply rate σ(x, w) = −x′Rx. A quadratic function V(x) = x′Px is a valid storage function for system (3.1.15) with supply rate σ(x, w) = −x′Rx if and only if

    PA + A′P + R ≤ 0.    (3.1.16)

Every solution P = P′ of (3.1.16) (as a rule, (3.1.16) is referred to as the Lyapunov inequality) provides an upper bound for J: indeed, since

    dV(x(t))/dt ≤ −x(t)′Rx(t),

we have

    J ≤ −∫_0^∞ (dV(x(t))/dt) dt = V(x(0)) − V(x(+∞)) = x0′Px0.
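The bound becomes exact when (3.1.16) is satisfied with equality, and the resulting linear equation is easy to solve numerically. A Python sketch (illustrative only; the matrices A, R and the initial state x0 are made up for the example), solving PA + A′P + R = 0 by vectorization:

```python
import numpy as np

# Solve P A + A'P + R = 0 (inequality (3.1.16) with equality) and read off
# the value J = x0' P x0.
def lyap_ct(A, R):
    n = A.shape[0]
    # vec(P A + A'P) = (A' kron I + I kron A') vec(P)   (column-major vec)
    K = np.kron(A.T, np.eye(n)) + np.kron(np.eye(n), A.T)
    return np.linalg.solve(K, -R.flatten('F')).reshape((n, n), order='F')

A = np.array([[-1.0, 1.0], [0.0, -2.0]])   # a Hurwitz matrix
R = np.eye(2)
P = lyap_ct(A, R)
assert np.allclose(P @ A + A.T @ P + R, np.zeros((2, 2)))
x0 = np.array([1.0, -1.0])
J = x0 @ P @ x0                            # value of the integral
```

Since A is Hurwitz, the linear map P → PA + A′P is invertible, so the solve always succeeds and returns a symmetric P.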
Moreover, replacing (3.1.16) with the Lyapunov equation

    PA + A′P + R = 0    (3.1.17)

will result in the exact expression J = x0′Px0. Since (3.1.17) is a linear equation, it is usually rather straightforward to solve, either analytically or numerically. In particular, we can see now that, as long as (3.1.17) is feasible and A, R, and x0 have rational coefficients, the value of J will be rational as well.

The framework presented here demonstrates a typical application of quadratic dissipativity in linear systems theory. However, it leaves one important theoretical question unanswered: how do we know that the equation (3.1.17) is feasible? After all, if (3.1.17) has no solution P = P′, the presented approach will fail!

It turns out that feasibility of equation (3.1.17) is guaranteed by a classical theorem of Lyapunov. As the theorem is a special case of the much more powerful KYP Lemma, to be presented in the next section, which serves a similar role in the case of general quadratic dissipativity, the issue is not discussed further in this subsection.

3.2 The KYP Lemma

The so-called Kalman-Yakubovich-Popov (KYP) Lemma, also known as the Positive Real Lemma, and appearing in many equivalent forms, is an eminently important theoretical statement of modern control theory, providing insight into the connection between frequency domain, time domain, and quadratic dissipativity properties of LTI systems. The KYP Lemma is used in justifying a large number of analysis and optimization algorithms employed in robust control, feedback optimization, and model reduction.

3.2.1 The statement of the KYP Lemma

This subsection presents a special version of the KYP Lemma, covering what can be referred to as the non-singular continuous time case.

Theorem 3.2.1 Let A, B be a pair of real matrices of dimensions n-by-n and n-by-m respectively, associated, naturally, with the LTI state space model

    ẋ(t) = Ax(t) + Bw(t).
    (3.2.18)

Assume that the pair (A, B) is (continuous time) stabilizable, in the sense that A + BK0 is a Hurwitz matrix for some real matrix K0. Let

    σ(x, w) = [x; w]′ Σ [x; w] = x′Σ11x + 2x′Σ12w + w′Σ22w    (3.2.19)

be a Hermitian form σ: C^n × C^m → R, defined by a real symmetric matrix Σ = Σ′. Then the following statements (a)-(e) are equivalent:

(a) there exists ε > 0 such that system (3.2.18) is dissipative with supply rate σ(x, w) − ε|x|² − ε|w|²;

(b) there exists a symmetric matrix P = P′ such that

    σ(x, w) − 2x′P(Ax + Bw) ≫ 0;    (3.2.20)

(c) σ(0, w) ≫ 0, and σ is strictly positive definite on the subspace

    U_{A,B}(ω) = {(x, w) ∈ C^n × C^m : jωx = Ax + Bw}    (3.2.21)

for every ω ∈ R;

(d) σ(0, w) ≫ 0, and the Hamiltonian system

    dx/dt = −(∂H/∂ψ)′,  dψ/dt = (∂H/∂x)′,  ∂H/∂w = 0,    (3.2.22)

where

    H(ψ, x, w) = σ(x, w) − ψ′(Ax + Bw),    (3.2.23)

has no eigenvalues on the imaginary axis;

(e) σ(0, w) ≫ 0, and there exist real matrices P0 = P0′ and K such that A + BK is a Hurwitz matrix, and

    σ(x, w) − 2x′P0(Ax + Bw) = σ(0, w − Kx).    (3.2.24)

Moreover, when statements (a)-(e) are true, the matrix P0 in (e) is such that P0 ≤ P for every P = P′ satisfying

    σ(x, w) − 2x′P(Ax + Bw) ≥ 0,    (3.2.25)

and can be computed in the form P0 = Z2 Z1^{−1}, where the columns of Z = [Z1; Z2] form a basis in the stable invariant subspace of the matrix of the Hamiltonian system from (3.2.22).

In Theorem 3.2.1, statement (a) means strict dissipativity of system (3.2.18) with the quadratic supply rate σ, but with an arbitrary storage function V = V(x). Statement (b) is the same dissipativity as in (a), but with a quadratic storage function. Statement (c) links the dissipativity issue to a frequency domain calculation (the equality between the H-Infinity norm and the L2 gain of stable LTI systems is a special case of this relation). Statement (d) is related to an efficient way of verifying quadratic dissipativity of LTI systems.
The Hamiltonian equations are written in the form (3.2.22) as a reference to classical mechanics: eliminating w, they result in the autonomous linear system of differential equations

    d/dt [x(t); ψ(t)] = H [x(t); ψ(t)],
    H = [ A − BΣ22^{−1}Σ21        BΣ22^{−1}B′
          Σ11 − Σ12Σ22^{−1}Σ21    Σ12Σ22^{−1}B′ − A′ ].    (3.2.26)

The "eigenvalues of (3.2.22)" should be understood as the eigenvalues of H. Similarly, the "stable invariant subspace" of (3.2.22) refers to the set of 2n-dimensional vectors z0 such that the solution of the linear ODE dz/dt = Hz converges to zero as t → ∞ when z(0) = z0.

Statement (e), used in the design of optimal controllers, relates to the "accuracy" of the dissipation-based analysis: P0 represents the "tightest possible" solution of the dissipation inequality. There are obvious parallels with the relation between Lyapunov equalities and inequalities here. When w has the meaning of the control vector, the resulting matrix K hints at using the full state feedback control w(t) = Kx(t).

3.2.2 KYP Lemma and L2 Gain

An instructive example of application of the KYP Lemma is the following statement about L2 gains of stable LTI models.

Theorem 3.2.2 Assume that the coefficient A of the state space model

    ẋ(t) = Ax(t) + Bw(t),  e(t) = Cx(t) + Dw(t)    (3.2.27)

is a Hurwitz matrix. Then the following conditions are equivalent:

(a) the L2 gain of system (3.2.27) is less than 1, i.e. there exists γ < 1 such that

    inf_{T>0} ∫_0^T {γ|w(t)|² − |e(t)|²} dt > −∞    (3.2.28)

for every solution of (3.2.27);

(b) there exists a real matrix P = P′ ≥ 0 such that

    |w|² − |Cx + Dw|² − 2x′P(Ax + Bw) ≫ 0;    (3.2.29)

(c) the H-Infinity norm ||G||_∞ of the transfer matrix

    G(s) = D + C(sI − A)^{−1}B    (3.2.30)

is less than 1.

The equivalence between (a) and (c) is the familiar association between L2 gains and H-Infinity norms for LTI systems. The equivalence of (a) and (c) to (b) provides a useful link to semidefinite programming, and also serves as a key observation in the derivation of H-Infinity optimization algorithms.
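Before turning to the proofs, here is a Python sketch of the computation described at the end of Theorem 3.2.1: build the Hamiltonian matrix obtained by eliminating w from (3.2.22), take a basis Z = [Z1; Z2] of its stable invariant subspace, and read off P0 = Z2 Z1^{−1}. The data (n = m = 1, A = 0, B = 1, σ(x, w) = |x|² + |w|²) is a toy example chosen for illustration:

```python
import numpy as np

# P0 from the stable invariant subspace of the Hamiltonian matrix.
def kyp_p0(A, B, S11, S12, S22):
    n = A.shape[0]
    S21 = S12.T
    Si = np.linalg.inv(S22)
    H = np.block([[A - B @ Si @ S21,      B @ Si @ B.T],
                  [S11 - S12 @ Si @ S21,  S12 @ Si @ B.T - A.T]])
    lam, V = np.linalg.eig(H)
    Z = V[:, lam.real < 0]           # basis of the stable invariant subspace
    Z1, Z2 = Z[:n, :], Z[n:, :]
    return np.real(Z2 @ np.linalg.inv(Z1))

A = np.array([[0.0]]); B = np.array([[1.0]])
S11 = np.array([[1.0]]); S12 = np.array([[0.0]]); S22 = np.array([[1.0]])
P0 = kyp_p0(A, B, S11, S12, S22)
# For this data, sigma - 2xP(Ax+Bw) >= 0 holds exactly for P in [-1, 1],
# and P0 is the minimal such solution:
assert np.allclose(P0, [[-1.0]])
```

Here the dissipation inequality x² + w² − 2Pxw ≥ 0 holds exactly for |P| ≤ 1, and the Hamiltonian computation indeed returns the minimal solution P0 = −1, in agreement with the "moreover" part of the theorem.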
Proof. (a)⇒(c) Assume, to the contrary, that (c) is not valid. Then for every γ < 1 there exist ω ∈ R and w0 ∈ C^m such that |w0| = 1 and |G(jω)w0|² > γ. Equivalently, this means that |Cx0 + Dw0|² > γ, where x0 = (jωI − A)^{−1}Bw0 satisfies the equation

    jωx0 = Ax0 + Bw0.    (3.2.31)

Then xc(t) = e^{jωt}x0, wc(t) = e^{jωt}w0 are complex solutions of the differential equation in (3.2.27). Hence x(t) = Re{xc(t)}, w(t) = Re{wc(t)} are real solutions of the differential equation in (3.2.27). On the other hand, if T > 0 is such that ωT = 2πk for some integer k ≥ 0 then

    ∫_0^T {γ|w(t)|² − |Cx(t) + Dw(t)|²} dt = (T/2){γ|w0|² − |Cx0 + Dw0|²},

which converges to −∞ as T → +∞ (the expression in braces is negative), certifying that (a) is not valid. Hence (a) implies (c).

(c)⇒(b) Since A is a Hurwitz matrix, the pair (A, B) is stabilizable, hence the KYP Lemma applies to A, B, σ, where

    σ(x, w) = |w|² − |Cx + Dw|².    (3.2.32)

By assumption, there exists δ1 > 0 such that

    σ(x0, w0) ≥ δ1|w0|²    (3.2.33)

whenever x0, w0 satisfy (3.2.31) for some ω ∈ R ∪ {∞}. Also, since A has no eigenvalues on the imaginary axis, there exists δ2 > 0 such that

    |w0|² ≥ δ2|x0|²    (3.2.34)

whenever x0, w0 satisfy (3.2.31) for some ω ∈ R ∪ {∞}. Combined together, (3.2.33) and (3.2.34) certify that condition (c) of the KYP Lemma is satisfied. Therefore, condition (b) of the KYP Lemma is true as well, which means existence of a matrix P = P′ satisfying the inequality in (3.2.29).

It remains to show that the matrix P must be positive semidefinite. Indeed, substituting w = 0 into (3.2.29) yields the inequality x′PAx ≤ 0, which means that the quadratic form V(x) = x′Px is monotonically non-increasing along the solutions x = x(t) of the autonomous system

    ẋ(t) = Ax(t).    (3.2.35)

Since A is a Hurwitz matrix, every solution of (3.2.35) converges to zero as t → ∞. Hence

    V(x(0)) ≥ lim_{t→+∞} V(x(t)) = 0

for every x(0) ∈ R^n, i.e. P ≥ 0.
(b)⇒(a) Note that (3.2.29) implies the existence of γ < 1 such that

    γ|w|² − |Cx + Dw|² − 2x′P(Ax + Bw) ≥ 0.

Integrating the inequality along the solutions of (3.2.27) yields

    ∫_0^T {γ|w(t)|² − |e(t)|²} dt ≥ x(T)′Px(T) − x(0)′Px(0) ≥ −x(0)′Px(0),

which implies (3.2.28).

Examples

The CT transfer function G(s) = (s + 1)^{−1} has H-Infinity norm 1. For the state space model

    ẋ(t) = −x(t) + w(t),  e(t) = x(t),

a non-negative quadratic storage function V(x) = px², p ≥ 0, certifying the L2 gain bound via the inequality

    γ|w|² − |x|² ≥ 2xp(−x + w)

exists if and only if γ ≥ 1 (one can take p = 1). Note that the KYP Lemma, as stated in this section, does not guarantee the existence of V for γ = 1, only for γ > 1.

The KYP Lemma, as stated in this section, means, in particular, that a quadratic form P = P′ (not necessarily a positive semidefinite one) satisfying (3.2.29) exists if and only if the L-Infinity norm of G is less than 1. For example, the CT transfer function G(s) = (s − 1)^{−1} has L-Infinity norm 1, though its H-Infinity norm equals +∞. For the state space model

    ẋ(t) = x(t) + w(t),  e(t) = x(t),

a quadratic storage function V(x) = px² satisfying the dissipation inequality

    γ|w|² − |x|² ≥ 2xp(x + w)

exists if and only if γ ≥ 1 (one can take p = −1).

Finally, consider the state space model

    ẋ(t) = −x(t),  e(t) = x(t) + w(t),

with transfer function G(s) ≡ 1. While the H-Infinity norm equals 1, a quadratic storage function V(x) = px² certifying the L2 gain bound via the inequality

    γ|w|² − |x + w|² ≥ 2xp(−x)

exists for γ > 1, but does not exist for the "boundary value" γ = 1 (the solutions p = p_γ which are available for γ > 1 converge to +∞ as γ → 1).
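The first example can be checked mechanically: for ẋ = −x + w, e = x and V(x) = px², the dissipation inequality γ|w|² − |x|² − 2xp(−x + w) ≥ 0 is positive semidefiniteness of a 2-by-2 matrix. A Python sketch (illustrative only):

```python
import numpy as np

# Quadratic form gamma*w^2 - x^2 - 2*p*x*(-x + w) = (2p-1)x^2 - 2p*x*w + gamma*w^2
def certificate(gamma, p):
    M = np.array([[2 * p - 1.0, -p],
                  [-p,          gamma]])
    return bool(np.all(np.linalg.eigvalsh(M) >= -1e-12))

assert certificate(1.0, 1.0)        # p = 1 works even at gamma = 1
assert certificate(1.5, 1.0)
# below the H-Infinity norm no p works (here: a scan over a grid of p):
assert not any(certificate(0.9, 0.1 * k) for k in range(100))
```

Indeed, the matrix is positive semidefinite with p = 1 exactly when γ ≥ 1, matching the claim that the certificate exists if and only if γ is at least the H-Infinity norm of (s + 1)^{−1}.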
3.2.3 KYP Lemma and Quadratic Program Control

A rather abstract optimal control problem, with significant applications in linear feedback optimization, can be formulated as the task of finding, given matrices A, B and the initial value x0, a control signal u = u(t) which drives the state x = x(t) of the model

    ẋ(t) = Ax(t) + Bu(t),  x(0) = x0,    (3.2.36)

to zero as t → +∞, while minimizing the quadratic cost

    J = ∫_0^∞ {|Cx(t)|² + |u(t)|²} dt → min over u(·).    (3.2.37)

Consider, to begin with, a simple example of problem (3.2.36),(3.2.37), where x = x(t) and u = u(t) are scalar signals, A = 0, and B = C = 1:

    ∫_0^∞ {x(t)² + u(t)²} dt → min subject to ẋ(t) = u(t), x(0) = 1.    (3.2.38)

The optimal control in (3.2.38) can be found using quadratic storage functions and a completion of squares argument. Leaving some preliminary thought aside (for now), let V(x) = −x². Then, subject to the differential equation in (3.2.38),

    x² + u² − dV(x)/dt = x² + u² + 2xu = (x + u)²,

which, after integration from 0 to +∞, yields

    J = 1 + ∫_0^∞ |x(t) + u(t)|² dt,    (3.2.39)

as long as x(t) → 0 as t → ∞. While (3.2.39) clearly certifies that J ≥ 1 for every control action, it also suggests that the "feedback rule" u(t) = −x(t) can be used to determine a control action for which J = 1, which would be optimal: the only thing to check is that the relation u(t) = −x(t) is compatible with x(t) → 0 as t → ∞. This yields u(t) = −e^{−t}, x(t) = e^{−t} as the optimal solution of (3.2.38).

Note that taking V(x) = x² instead of V(x) = −x² would yield

    J = −1 + ∫_0^∞ |x(t) − u(t)|² dt,

which, while providing a (rather weak!) lower bound for J, cannot be used for finding the optimal u(·), as the relation u(t) = x(t), together with the differential equation in (3.2.38), results in a solution x(t) = e^t which does not converge to zero.
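The claim J = 1 for the feedback u(t) = −x(t), x(t) = e^{−t} can be confirmed by direct numerical integration of x² + u² = 2e^{−2t}. A small Python sketch:

```python
import math

# Trapezoidal integration of x(t)^2 + u(t)^2 with x = e^{-t}, u = -x.
dt, T = 1e-3, 20.0
N = int(T / dt)
J = 0.0
for k in range(N):
    f0 = 2 * math.exp(-2 * k * dt)          # integrand at left endpoint
    f1 = 2 * math.exp(-2 * (k + 1) * dt)    # integrand at right endpoint
    J += 0.5 * (f0 + f1) * dt

assert abs(J - 1.0) < 1e-3                  # matches the bound from (3.2.39)
```

The truncation at T = 20 and the trapezoidal error are both far below the 10^{−3} tolerance, so the numerical value agrees with the exact answer ∫₀^∞ 2e^{−2t} dt = 1.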
For the general case of problem (3.2.36),(3.2.37), the approach described above calls for finding a symmetric matrix P = P′ such that

    |Cx|² + |u|² − 2x′P(Ax + Bu) = |u − Kx|²,    (3.2.40)

and A + BK is a Hurwitz matrix. Then the identity

    J = −x0′Px0 + ∫_0^∞ |u(t) − Kx(t)|² dt,

which is valid whenever x(t) → 0 as t → ∞, certifies that −x0′Px0 is a lower bound of J, achieved by the control satisfying the relation u(t) = Kx(t). While the approach works whenever a matrix P = P′ with the required properties exists, the key theoretical question is whether such P can always be found. It turns out that the KYP Lemma answers the question completely, in the form of the following statement.

Theorem 3.2.3 Let A, B, C be real matrices of dimensions n-by-n, n-by-m, and k-by-n respectively. The following conditions are equivalent:

(a) there exist matrices P = P′ ≤ 0 and K such that identity (3.2.40) holds, and A + BK is a Hurwitz matrix;

(b) the pair (A, B) is stabilizable, and Cv ≠ 0 whenever Av = jωv for some ω ∈ R and v ≠ 0.

Moreover, if condition (a) is satisfied then the matrices P = P′ and K are uniquely defined by condition (a).

Proof. Assume that condition (a) is true. Then, since A + BK is a Hurwitz matrix, the pair (A, B) is stabilizable. In addition, since the quadratic form identity (3.2.40) has all real coefficients, it can be extended to a Hermitian form identity

    |Cx|² + |u|² − 2Re{x′P(Ax + Bu)} = |u − Kx|²,    (3.2.41)

valid for all x ∈ C^n, u ∈ C^m. If Av = jωv and Cv = 0 for some v ≠ 0, substituting u = 0, x = v into (3.2.41) yields |Kv|² = 0, and hence (A + BK)v = jωv, which contradicts the assumption that A + BK is a Hurwitz matrix. Hence (a) implies (b).

Now assume that (b) is true. A careful examination of (a) shows that it is equivalent to item (e) of the KYP Lemma, which will be true if the corresponding condition (c) is true. Since, according to (b), the only solution of the equations jωx = Ax + Bw, Cx = 0, w = 0 is x = 0, w = 0, condition (c) of the KYP Lemma is satisfied.
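In practice, P and K in (3.2.40) can be computed with a Riccati solver. Matching coefficients in (3.2.40) gives K = B′P and PA + A′P + PBB′P = C′C, so X = −P solves the standard algebraic Riccati equation A′X + XA − XBB′X + C′C = 0. A Python sketch using SciPy (assumed available; the data is the scalar example (3.2.38), A = 0, B = C = 1):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0]]); B = np.array([[1.0]]); C = np.array([[1.0]])
# Stabilizing solution of  A'X + X A - X B B'X + C'C = 0:
X = solve_continuous_are(A, B, C.T @ C, np.eye(1))
P, K = -X, -B.T @ X                 # P = -X, K = B'P

# Scalar example of (3.2.38): P = -1, K = -1, optimal cost -x0' P x0 = x0^2.
assert np.allclose(P, [[-1.0]]) and np.allclose(K, [[-1.0]])

# Check identity (3.2.40) at a sample point (x, u) = (2, 3):
x, u = 2.0, 3.0
lhs = (C[0, 0] * x)**2 + u**2 - 2 * x * P[0, 0] * (A[0, 0] * x + B[0, 0] * u)
rhs = (u - K[0, 0] * x)**2
assert np.allclose(lhs, rhs)
```

The recovered P = −1, K = −1 reproduce the storage function V(x) = −x² and the feedback u = −x found by completion of squares above.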
3.3 Semidefinite Programming

This section introduces a new tool, semidefinite programming, also called linear matrix inequality (LMI) optimization, which, among many possible applications, can be used to optimize weights (multipliers, or D-scales) associated with the standard H-Infinity feedback design setup.

3.3.1 A Generic LMI Setup

A set of real symmetric k-by-k matrices H0, H1, ..., Hm, and a 1-by-m row vector C ∈ R^{1×m}, define an instance of a semidefinite program as the task

    Cx → min subject to H(x) ≥ 0    (3.3.42)

of finding a real vector x ∈ R^m which minimizes Cx subject to the constraint H(x) ≥ 0, where

    H(x) := H0 + Σ_{k=1}^m x_k H_k,  x = [x1; ...; xm],    (3.3.43)

and the inequality α ≥ 0 for a k-by-k Hermitian matrix α = α′ means that v′αv ≥ 0 for every vector v ∈ C^k. Since the matrix-valued function x → H(x) in (3.3.43) is linear with respect to x (a more appropriate term would be affine), the inequality H(x) ≥ 0 is sometimes called a linear matrix inequality (LMI), and the optimization task can also be referred to as LMI optimization.

Semidefinite programming has a special place in optimization: it is more general (and hence more flexible) than linear programming (which is a special case of LMI optimization), but can still be solved relatively efficiently (under mild assumptions, by a polynomial time algorithm). Due to the persistent presence of matrix positivity constraints in system analysis (Lyapunov inequalities, quadratic storage functions, and other applications), semidefinite programming is well-suited for robustness analysis and feedback optimization.

Several MATLAB toolboxes offer LMI optimization tools. The native Robust Control Toolbox offers an LMI optimization engine. The SeDuMi package, also well tested, can be downloaded for free on the Web.
The languages used to define a system of LMIs in these packages are not very user-friendly, so I advise using an interface to them which, as part of the package IQCbeta, is freely available from the class Web site, and is already installed on Athena. I believe (though I could be biased, since I supervised its creation) that IQCbeta helps to cut the coding effort significantly when solving semidefinite programs. To use IQCbeta on Athena, put the content of

    http://web.mit.edu/6.245/www/images/startup6245.m

into your startup.m file (it should be in your ~/matlab/ directory). To install IQCbeta on a personal computer, unpack iqcb030.tar.gz and set the MATLAB search path to include iqc_beta and its subdirectories.

Example 3.3.1 According to a classical theorem, an n-by-n real matrix A is a Hurwitz matrix if PA + A′P < 0 for some real symmetric matrix P = P′ ≥ 0. If P is normalized by the extra condition P ≤ I, the degree of stability of A can be characterized as the maximum of r such that

    PA + A′P + rI ≤ 0.

The task of maximizing r subject to these constraints is a special case of semidefinite programming, in which the components of the decision vector x are the entries of P and the scalar r (i.e. n(n+1)/2 scalar variables to define P and one more to define r); the matrix H(x) is defined in such a way that

    H(x) = [ −PA − A′P − rI  0      0
             0               I − P  0
             0               0      P ],

which ensures that the single inequality H(x) ≥ 0 is equivalent to the three inequalities

    PA + A′P + rI ≤ 0,  P ≤ I,  P ≥ 0;

and C is defined in such a way that Cx = −r.

It is quite clear that setting up the LMI optimization directly in terms of the matrices H0, H1, ..., Hm, where m = n(n+1)/2 + 1, would be time consuming. Also, the non-zero entries of the Hi would be the same coefficients of A repeated over and over again.
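For n = 1 the generic setup of Example 3.3.1 can be written out completely. The following Python sketch (illustrative only; the scalar A = [−2] is made up) builds H0, H1, H2 of (3.3.43) with decision vector x = (p, r) and checks feasibility of candidate points by an eigenvalue test:

```python
import numpy as np

a = -2.0
# H(p, r) = diag(-2ap - r, 1 - p, p) = H0 + p*H1 + r*H2:
H0 = np.diag([0.0, 1.0, 0.0])          # constant part (the "I" in I - P)
H1 = np.diag([-2 * a, -1.0, 1.0])      # coefficient of p
H2 = np.diag([-1.0, 0.0, 0.0])         # coefficient of r

def feasible(p, r, tol=1e-9):
    H = H0 + p * H1 + r * H2
    return bool(np.all(np.linalg.eigvalsh(H) >= -tol))

assert feasible(0.5, 2.0)
assert feasible(1.0, 4.0)              # the optimum: p = 1, r = -2a = 4
assert not feasible(1.0, 4.5)          # r beyond the degree of stability
```

A real SDP solver performs exactly this kind of feasibility reasoning, but searches over (p, r) systematically instead of testing candidate points.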
To simplify the setup, the possibility of using matrix decision parameters (for example, to work with P as a single variable), and of defining multiple matrix inequalities (to treat PA + A′P + rI ≤ 0, P ≤ I, and P ≥ 0 separately), is usually provided within semidefinite programming packages. In IQCbeta, the semidefinite program can be defined and solved by the following code.

    function [p,r]=lmiexlyap(A)
    n=size(A,1);                  % problem dimension
    abst_init_lmi;                % initialize LMI environment of IQCbeta
    p=symmetric(n);               % p is n-by-n symmetric matrix decision variable
    r=symmetric;                  % r is a scalar decision variable
    p>0;                          % define the matrix inequalities
    p<eye(n);
    p*A+A'*p+r*eye(n)<0;
    lmitbx_options([0 0 0 0 1]);  % turn off diagnostics/trace
    lmi_mincx_tbx(-r);            % call the SDP optimization engine
    p=value(p);                   % get numerical value of optimal p
    r=value(r);                   % get numerical value of optimal r

While every computational procedure utilizing fixed precision arithmetic must be used with caution, semidefinite programming appears to be especially vulnerable to numerical errors. Here is a short list of reasons to be on guard while using LMI optimization.

(a) Currently used algorithms of LMI optimization begin by assuming that the optimal solution x0 of (3.3.42), if it exists, satisfies a known length bound |x0| ≤ R. The number of operations required by the algorithms grows polynomially with log(R).

(b) The minimal length |x| of a vector satisfying H(x) ≥ 0 can grow double-exponentially with respect to k and m. For example[1], the system of LMIs

    x1 ≥ 4,  [x_{i+1}, x_i; x_i, 1] ≥ 0  (i = 1, 2, ..., m−1),

with respect to scalar decision variables x1, ..., xm, is equivalent to

    x1 ≥ 4,  x_{i+1} ≥ x_i²  (i = 1, 2, ..., m−1),

and hence its minimal solution is x_i = 2^{2^i}. Therefore one cannot expect that a solution of a semidefinite program will be small when the coefficients of the matrices Hi are small.
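The growth in item (b) is easy to reproduce. A Python sketch: the 2-by-2 block [x_{i+1}, x_i; x_i, 1] is positive semidefinite exactly when x_{i+1} ≥ x_i² (and x_{i+1} ≥ 0), so the smallest feasible chain starting from x1 = 4 doubles its exponent at every step:

```python
# Minimal solution of the chain of LMIs x1 >= 4, x_{i+1} >= x_i^2.
def minimal_solution(m):
    xs = [4]                      # x_1 = 4 = 2^(2^1)
    for _ in range(m - 1):
        xs.append(xs[-1] ** 2)    # smallest x_{i+1} allowed by the LMI
    return xs

xs = minimal_solution(5)
assert xs == [2 ** (2 ** i) for i in range(1, 6)]   # 4, 16, 256, 65536, ...
```

With m = 5 the last coordinate is already 2^32, even though every coefficient in the LMI data is at most 4.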
(c) Currently used algorithms of LMI optimization rely on the assumption that the set of all x satisfying the condition H(x) ≥ 0 contains a ball of radius ε > 0. In fact, the optimization actually looks at those decision vectors x which satisfy the strict inequality H(x) > 0. The number of operations required by the algorithms grows polynomially with log(1/ε).

(d) It is possible for an optimal solution of (3.3.42) not to exist even when the minimized quantity Cx has a finite lower bound subject to the feasibility constraint H(x) ≥ 0. For example, when

    H(x) = [x1, 1; 1, x2],  Cx = x1,

Cx has a maximal lower bound 0 which is not achievable.

[1] I have learned this example from Prof. Pablo Parrilo.

3.3.2 D-Scales and Robust L2 Gain Bounds

Recall that a version of the KYP Lemma claims that the system SCT[A, B, C, D] is stable and has H-Infinity norm strictly less than γ > 0 if and only if there exists a positive definite quadratic form V(x) = x′Px such that the quadratic form

    γ²|w|² − |Cx + Dw|² − 2x′P(Ax + Bw)    (3.3.44)

is positive definite as well, i.e. if and only if the system of strict matrix inequalities

    P > 0,  γ² E0I′E0I − ECD′ECD − EI0′P EAB − EAB′P EI0 > 0    (3.3.45)

has a solution P = P′, where

    EI0 = [I 0],  E0I = [0 I],  ECD = [C D],  EAB = [A B],

so that the terms in (3.3.45) are the matrices of the quadratic forms

    [x; w]′ {E0I′E0I} [x; w] = |w|²,
    [x; w]′ {ECD′ECD} [x; w] = |Cx + Dw|²,
    [x; w]′ {EI0′P EAB + EAB′P EI0} [x; w] = 2x′P(Ax + Bw).

Consider now the situation when a system of linear state space equations

    ẋ(t) = Ax(t) + Bw(t)    (3.3.46)

with input w has 2N + 2 outputs wi, ei (i ∈ {0, 1, ..., N}) defined by

    wi = Li [x; w],  ei = Fi [x; w],

and, for all i ≠ 0, the signals wi and ei satisfy the unit L2 gain bound condition

    inf_{T>0} ∫_0^T {|ei(t)|² − |wi(t)|²} dt > −∞.
    (3.3.47)

One possible interpretation of this setup, in which the wi are different components of w, and each wi with i ≠ 0 is the output of a system ∆i (possibly nonlinear or time-varying) with input ei and L2 gain smaller than 1, is shown in Figure 3.3.1.

[Figure 3.3.1: Robust L2 gain bound: uncertainty blocks ∆ mapping e1, ..., eN to w1, ..., wN, wrapped around the nominal system G with external input w0 and output e0.]

Consider the task of establishing, under the assumptions made, an upper bound for the closed loop L2 gain from w0 to e0. By the definition of L2 gain, if r ≥ 0 is such that

    inf_{T>0} ∫_0^T {r|w0(t)|² − |e0(t)|²} dt > −∞    (3.3.48)

for all signals e, w satisfying the equations of system G and conditions (3.3.47) for i ≠ 0, the square root γ = √r is an upper bound for the closed loop L2 gain from w0 to e0. It is easy to see that (3.3.47) (subject to (3.3.46)) implies (3.3.48) (subject to (3.3.46)) whenever there exist non-negative constants di ≥ 0 such that

    inf_{T>0} ∫_0^T { r|w0(t)|² − |e0(t)|² + Σ_{i=1}^N di [|wi(t)|² − |ei(t)|²] } dt > −∞,    (3.3.49)

also subject to (3.3.46). Since P = P′ > 0 and (3.3.46) imply

    ∫_0^T 2x(t)′P(Ax(t) + Bw(t)) dt = x(T)′Px(T) − x(0)′Px(0) ≥ −x(0)′Px(0),

the existence of P = P′ > 0 such that the quadratic form

    r|L0 z|² − |F0 z|² + Σ_{i=1}^N di {|Li z|² − |Fi z|²} − 2x′P(Ax + Bw)

(with respect to z = [x; w]) is positive definite implies (3.3.49). Equivalently, if r, di, and P = P′ satisfy the inequalities di ≥ 0, P ≥ 0, and

    r L0′L0 − F0′F0 + Σ_{i=1}^N di {Li′Li − Fi′Fi} − EI0′P EAB − EAB′P EI0 > 0,    (3.3.50)

then γ = √r is an upper bound for the closed loop L2 gain from w0 to e0. Thus, solving the semidefinite program with decision parameters r, di, and P = P′, which seeks to minimize r subject to the LMIs P > 0, di > 0, and (3.3.50), yields an upper bound for the closed loop L2 gain from w0 to e0. The coefficients di are frequently called multipliers or D-scales, and the whole approach to bounding the L2 gain is sometimes referred to as D-scaling analysis of robustness.
Example 3.3.2 Consider the task of finding an upper bound for the closed loop L2 gain from f to q in the unity feedback system shown in Figure 3.3.2, where G = G(s) is a given SISO LTI system defined by a state space model SCT[A, B, C, D], and M is an uncertain constant gain ranging in the given interval [a, b].

[Figure 3.3.2: Uncertain Gain Feedback for Example 3.3.2: the input f enters a summing junction with negative feedback from the output of G(s); the resulting signal q drives the uncertain gain M, whose output drives G(s).]

To apply semidefinite programming, define

    w0 = f,  w1 = [2M/(b − a) − (b + a)/(b − a)] (f − q),  e0 = q,  e1 = f − q.

Then the nominal part of the system equations can be represented by the SIMULINK diagram in Figure 3.3.3. The normalized uncertainty relates the signals w1 and e1 according to w1 = δe1, where δ ∈ [−1, 1] is an uncertain constant, and hence the L2 gain "from e1 to w1" is not larger than 1.

[Figure 3.3.3: Nominal Model for Example 3.3.2: the uncertainty channel e1 → w1 enters through the scale (b − a)/2, while the nominal gain (b + a)/2 and the plant G close the loop from w0 to e0.]

MATLAB code using the SIMULINK diagram is shown below.

    function g=exlmid1(G,a,b)
    assignin('base','G',G);            % export coefficients
    assignin('base','a',a);
    assignin('base','b',b);
    [A,B,C,D]=linmod('exlmid1mod');    % extract plant model
    n=size(A,1);
    % define LMI coefficient matrices
    EI0=[eye(n) zeros(n,2)]; EAB=[A B];
    F0=[C(1,:) D(1,:)]; F1=[C(2,:) D(2,:)];
    L0=[zeros(1,n) 1 0]; L1=[zeros(1,n) 0 1];
    abst_init_lmi;                     % initialize LMI environment of IQCbeta
    p=symmetric(n);                    % define matrix decision variables
    r=symmetric;
    d=symmetric;
    p>0; d>0;                          % define the matrix inequalities
    L0'*r*L0-F0'*F0+L1'*d*L1-F1'*d*F1-EI0'*p*EAB-EAB'*p*EI0>0;
    lmitbx_options([0 0 0 0 1]);       % turn off diagnostics/trace
    R=lmi_mincx_tbx(r);                % call the SDP optimization engine
    g=sqrt(R);

For simple (low order) systems, the code usually produces accurate upper bounds. For more complicated examples, the bound can be overly conservative. Analysis via Integral Quadratic Constraints (IQC), to be studied later in the course, allows one to improve the quality of robustness analysis dramatically within the framework of LMI optimization.
3.4 Model Reduction

This section introduces basic elements of the robust control approach to model reduction for LTI systems: numerical measures of controllability and observability, Hankel singular numbers, balanced truncation, and Hankel optimal model reduction.

3.4.1 Hankel Singular Numbers

An attractive approach to reducing the number of states in a state space model is to begin by figuring out the order of relative importance of the system's states, and then remove all states of lesser importance. The approach leads to the introduction of numerical measures of controllability and observability, which lead to Hankel singular values as measures of a state's importance.

Motivation: removal of (almost) uncontrollable/unobservable modes

Removing an unobservable or an uncontrollable mode is an easy way of reducing the dimension of the state vector in a state space model. For example, the system

    ẋ1 = −x1 + x2 + f,  ẋ2 = −2x2,  y = x1 + x2,

can be replaced by

    ẋ = −x + f,  y = x,

without changing its transfer function. In this state space model, with

    A = [−1, 1; 0, −2],  B = [1; 0],  C = [1, 1],  D = 0,

the controllability matrix

    Mc = [B, AB] = [1, −1; 0, 0]

satisfies pMc = 0 for p = [0, 1], and hence the variable px = x2 represents an uncontrollable mode. The removal of such a mode can be viewed as a canonical projection model reduction

    A → Â = UAV,  B → B̂ = UB,  C → Ĉ = CV,

where the columns of V form a basis in the column range of Mc (which is the same as the null space of p), and U can be selected quite arbitrarily subject to the usual constraint UV = I.

Strictly speaking, the example above cannot even be considered "model reduction", as the orders of the original and the projected systems are both equal to 1. A more interesting situation is represented by the perturbed model

    ẋ1 = −x1 + x2 + f,  ẋ2 = −2x2 + εf,  y = x1 + x2,

(same A, C, D but a modified B), where ε > 0 is a parameter.
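"Almost uncontrollable" can be quantified by the singular values of the controllability matrix. A Python sketch (illustrative only; the value of ε is made up) for the perturbed model above:

```python
import numpy as np

eps = 1e-3
A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[1.0], [eps]])
Mc = np.hstack([B, A @ B])              # controllability matrix [B, AB]
sv = np.linalg.svd(Mc, compute_uv=False)
# One singular value is O(1), the other O(eps): the mode x2 = p x with
# p = [0, 1] (a left eigenvector of A, pA = -2p) is almost uncontrollable.
assert sv[0] > 0.5 and sv[1] < 10 * eps
```

At ε = 0 the small singular value vanishes and the mode becomes exactly uncontrollable, recovering the projection example above.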
Intuitively, one can expect that, when ε > 0 is small enough,

   ẋ = −x + f,   y = x

is still a good reduced model. This expectation can be related to the fact that x2 is difficult to control by f when ε > 0 is small. One can say that the mode x2 = px, which corresponds to the left (row) eigenvector of the A matrix (pA = −2p in this case), is almost uncontrollable. This can be seen directly from the transfer function

   G(s) = (s + 2 + ε)/((s + 2)(s + 1)) = (1 + ε)/(s + 1) − ε/(s + 2),

which has a small coefficient at (s + 2)^(−1) in its partial fraction expansion.

One can attempt to introduce a measure of importance of an LTI system's mode (understood as a pole a of the system's transfer function G(s)) as something related to the absolute value of the coefficient with which (s − a)^(−1) enters the partial fraction expansion of G. However, it is rarely a good idea to base a model reduction algorithm solely on the removal of "unimportant" system modes. For example, both modes a = −(1 − ε) and a = −(1 + ε) of the LTI system with transfer function

   H(s) = 1/(s + 1 − ε) + 1/(s + 1 + ε),

where ε > 0 is small, are equally important, and neither can be removed without introducing a huge model reduction error, despite the fact that a very good reduced model Ĥ is given by

   Ĥ(s) = 2/(s + 1).

The objective of this subsection is to introduce a special joint measure of controllability and observability for every vector in the state space of an LTI system. Then a reduced model is obtained by removing those components of the state vector which have the lowest importance factor in terms of this measure.

Observability measures

Consider a state space model

   ẋ(t) = Ax(t) + Bf(t),   y(t) = Cx(t) + Df(t),   (3.4.51)

where A is an n-by-n Hurwitz matrix (all eigenvalues have negative real part). When f(t) ≡ 0 for t ≥ 0, the value of the output y(t) at a given moment t ≥ 0 is uniquely defined by x(0), and converges to zero exponentially as t → +∞.
Hence the integral

   Eo = ∫_0^∞ |y(t)|² dt,

measuring the "observable output energy" accumulated in the initial state, is a function of x(0), i.e. Eo = Eo(x(0)). Moreover, since y(t) is a linear function of x(0), Eo is a quadratic form with respect to x(0), i.e.

   Eo(x(0)) = x(0)'Wo x(0)

for some symmetric matrix Wo. Note that Eo = Eo(x) is a storage function for the autonomous system dx/dt = Ax with quadratic supply rate σo(x) = −|Cx|². Since A is a Hurwitz matrix, Eo is the smallest storage function for this supply rate.

The quadratic form Eo = Eo(x(0)) can be used as a measure of observability defined on the state space of system (3.4.51). In particular, Eo(x(0)) = 0 whenever

   Mo x(0) = [C; CA; ...; CA^(n−1)] x(0) = 0.

The positive semidefinite matrix Wo of the quadratic form Eo is called the observability Gramian (whether this should be spelled "Grammian" or "Gramian" is unclear). Since, by definition, Eo(x(0)) ≥ 0 for all x(0), the matrix Wo is always positive semidefinite. Moreover, Wo > 0 whenever the pair (C, A) is observable.

It is important to notice that the term "system state" actually carries two different meanings. One meaning is that of a primal state, which, for a general model (3.4.51), is a column n-vector. For example, for the state space model

   ẋ1 = −x1 + f,   ẋ2 = −x2 + f,   y = x1 + x2,   (3.4.52)

a primal state is a particular column vector value of x(t) ∈ R², such as x(−1.2) = [1; −7]. Such column vector values are more precisely referred to as the primal states of (3.4.52). On the other hand, one frequently refers to (3.4.52) as a "system with two states" (despite the fact that the set R² has an infinite number of elements), and, in this context, x1 = x1(t) and x2 = x2(t) can be referred to as the two states of the system. Let us call such states the dual states. For a general model (3.4.51), a dual state is a particular linear combination xp = px(t) of the scalar components of x = x(t), defined by a row n-vector p.
For example, in (3.4.52), the dual states x1 = x1(t) and x2 = x2(t) are defined by the row vectors p = [1 0] and p = [0 1] respectively. It is therefore natural to ask for a definition of an observability measure of a given dual state of (3.4.51). It is defined as

   E^o(p) = inf { Eo(x0) : px0 = 1 }   for p ≠ 0,

i.e. as the minimal output energy which can be observed for t ≥ 0 when px(0) = 1. Note that the infimum over an empty set equals plus infinity, hence E^o(0) = ∞. When the pair (C, A) is observable, and hence Wo > 0 is invertible, the dual observability measure is given by

   E^o(p) = 1/(p Wo^(−1) p')   for p ≠ 0.

In this case the inverse

   E^o(p)^(−1) = p Wo^(−1) p'

is a quadratic storage function for the system dp/dt = pA + qC with quadratic supply rate σ^o(p, q) = −|q|². According to the KYP Lemma, E^o(p)^(−1) is the smallest storage function for this supply rate.

The following theorem is frequently utilized for computing Wo numerically.

Theorem 3.4.1 Wo is the unique solution of the Lyapunov equation

   Wo A + A'Wo + C'C = 0.   (3.4.53)

Proof. Since system (3.4.51) is time invariant, the identity

   x(0)'Wo x(0) = ∫_0^∞ |Cx(τ)|² dτ

implies

   x(t)'Wo x(t) = ∫_t^∞ |Cx(τ)|² dτ.

Differentiating the second identity with respect to t at t = 0 yields

   2x(0)'Wo A x(0) = −|Cx(0)|²   for all x(0) ∈ R^n.

Comparing the coefficients on both sides of this quadratic identity yields (3.4.53).

Finding a numerical solution of (3.4.53) is not easy when n is about 10^4 or larger. In such situations, Theorem 3.4.1 can be used as a basis for finding an approximation of Wo.

It is important to understand that the observability measure alone should not be the only numerical test for choosing which states to eliminate in a model reduction procedure. Instead, a combination of observability and controllability measures, the latter to be introduced in the next subsection, should be used.

Controllability measures

Since A is a Hurwitz matrix, every input signal f = f(t) of finite energy, i.e.
such that

   ‖f‖² = ∫_(−∞)^∞ |f(t)|² dt < ∞,

corresponds to a unique initial condition x(0) in (3.4.51) for which the corresponding solution x = x(t) satisfies x(t) → 0 as t → −∞. This solution is given by

   x(t) = ∫_0^∞ e^(Aτ) B f(t − τ) dτ,

where e^M denotes the matrix exponential of a square matrix M. One can say that the input f = f(t) drives the system state from x(−∞) = 0 to x(0) = X(f(·)).

Let p be a 1-by-n row vector, so that the product px(t) is a dual state of (3.4.51), i.e. a linear combination of the components of the state vector. The (dual) controllability measure E^c = E^c(p) is defined as the maximal value of |px(0)|² which can be achieved by using an input f = f(t) of unit energy:

   E^c(p) = max { |pX(f(·))|² : ‖f‖ ≤ 1 }.

Accordingly, the primal controllability measure Ec = Ec(x0) is defined for a column vector x0 ∈ R^n as

   Ec(x0) = inf { E^c(p) : px0 = 1 }.

The following statement describes some basic properties of these controllability measures.

Theorem 3.4.2 Assume that A is an n-by-n Hurwitz matrix. Then:

(a) E^c(p) = pWc p' is a quadratic form with the coefficient matrix

   Wc = ∫_0^∞ e^(At) BB' e^(A't) dt.

(b) Wc is the unique solution of the Lyapunov equation

   AWc + Wc A' = −BB'.   (3.4.54)

(c) A given state x0 ∈ R^n is reachable from zero if and only if Ec(x0) > 0 or, equivalently, the equation Wc p' = x0 has a solution p'. In this case Ec(x0)^(−1) = px0 is the minimum of ‖f(·)‖² subject to X(f(·)) = x0.

Proof. To prove (a), note that

   max_(‖f‖ ≤ 1) ∫_(−∞)^∞ g(t)'f(t) dt = ‖g‖,

hence

   E^c(p) = ∫_0^∞ |p e^(At) B|² dt = pWc p'.

Statement (b) is actually a re-wording of Theorem 3.4.1, with C replaced by B', A replaced by A', Wo replaced by Wc, and x(0) replaced by p'. To prove (c), consider first the case when the equation Wc p' = x0 has no solution. Then there exists a row vector p0 such that p0 Wc = 0 but p0 x0 ≠ 0. Here the equality p0 Wc = 0 means that |p0 X(f(·))|² equals zero for every finite energy signal f = f(t).
Since |p0 x0|² > 0 while p0 X(f(·)) = 0 for every finite energy input, the state x0 is not reachable from zero. Now assume that x0 = Wc p' for some p. Then

   ‖f‖² ≥ pWc p' = px0   whenever x0 = X(f(·)).

On the other hand, for

   f(t) = B'e^(−A't)p' for t ≤ 0,   f(t) = 0 for t > 0,

we have ‖f‖² = px0 and x0 = X(f(·)).

When the pair (A, B) is controllable, and hence Wc > 0, the primal controllability measure Ec = Ec(x0) can be expressed as

   Ec(x0) = 1/(x0' Wc^(−1) x0)   for x0 ≠ 0.

In this case E^c(p) is the minimal storage function with supply rate σ^c(p) = −|pB|² for the system dp/dt = pA, and Ec(x)^(−1) is the minimal storage function with supply rate σc(x, f) = −|f|² for the system dx/dt = Ax + Bf.

Joint controllability and observability measures

The joint controllability and observability measures are defined as products of the corresponding controllability and observability measures:

   Eoc(x0) = Eo(x0)Ec(x0),   E^oc(p) = E^o(p)E^c(p).

For controllable and observable systems Wc and Wo are positive definite, and the formulae can be simplified to

   Eoc(x0) = (x0'Wo x0)/(x0'Wc^(−1)x0)   (x0 ≠ 0),   E^oc(p) = (pWc p')/(pWo^(−1)p')   (p ≠ 0).

Setting the gradients of Eoc(x) and E^oc(p) to zero leads to the extremum equations

   Wo x = h Wc^(−1) x,   where h = (x'Wo x)/(x'Wc^(−1)x),
   p Wc = h p Wo^(−1),   where h = (pWc p')/(pWo^(−1)p'),

which shows that the extremal vectors and the extremal values can be defined in terms of the eigenvalues and eigenvectors of the matrix Wc Wo.

Since Wc and Wo are positive semidefinite, all eigenvalues of Wc Wo are real and non-negative. The square roots of these eigenvalues are called the Hankel singular numbers of system (3.4.51), and play a central role in understanding its model reduction.

3.4.2 Balanced Truncation

Balanced truncation is a model reduction algorithm based on straightforward elimination of those states of system (3.4.51) which have the lesser joint controllability and observability measure.
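For a concrete stable model, the two Gramians and the Hankel singular numbers can be computed in a few lines. The sketch below (Python/SciPy used for illustration; the example system G(s) = 1/(s+1) + 1/(s+3) is an assumption, not from the text) solves the Lyapunov equations of Theorems 3.4.1 and 3.4.2 and takes square roots of the eigenvalues of Wc Wo:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Assumed example: G(s) = 1/(s + 1) + 1/(s + 3).
A = np.array([[-1.0, 0.0], [0.0, -3.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])

Wc = solve_continuous_lyapunov(A, -B @ B.T)    # A Wc + Wc A' = -B B'  (3.4.54)
Wo = solve_continuous_lyapunov(A.T, -C.T @ C)  # Wo A + A' Wo = -C' C  (3.4.53)

# Hankel singular numbers: square roots of the eigenvalues of Wc Wo.
sigmas = np.sort(np.sqrt(np.linalg.eigvals(Wc @ Wo).real))[::-1]
```

The solver's sign convention (it solves a X + X a' = q) is why the right-hand sides are negated and A is transposed for the observability Gramian.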
Dominant Controllability and Observability Subspaces

For model reduction purposes, we are interested in finding a subspace of primal state vectors for which the minimum of the joint controllability and observability measure over all non-zero elements is maximal. A basis in this subspace will yield the columns of a projection matrix V. Similarly, we are interested in finding a subspace of dual state vectors for which the minimum of the joint controllability and observability measure over all non-zero elements is maximal. A basis in this subspace will yield the rows of a projection matrix U. The following theorem can be used in finding such V and U.

Theorem 3.4.3 Let Wc = Lc Lc' and Wo = Lo'Lo be Cholesky decompositions of the controllability and observability Gramians. Let

   ρ1 ≥ ... ≥ ρr > ρ_(r+1) ≥ ... ≥ ρn ≥ 0

be the ordered eigenvalues of Lc'Wo Lc (possibly repeated). Let ψ1, ..., ψr be the corresponding first r normalized eigenvectors of Lc'Wo Lc, i.e.

   Lc'Wo Lc ψi = ρi ψi,   |ψi|² = 1,   ψi'ψk = 0 for i ≠ k.

Let σi = ρi^(1/2). Then:

(a) ρ1 ≥ ... ≥ ρr are also eigenvalues of Lo Wc Lo', and the corresponding normalized row eigenvectors can be defined by

   φi = σi^(−1) ψi' Lc' Lo'   (i = 1, ..., r).

(b) The set of all linear combinations of the vectors Lc ψi is the only r-dimensional linear subspace V in R^n such that Eoc(v) ≥ ρr for every non-zero v ∈ V.

(c) The set of all linear combinations of the row vectors φi Lo is the only r-dimensional linear subspace U of row n-vectors such that E^oc(u) ≥ ρr for every non-zero u ∈ U.

(d) UV = Ir, where

   V = Lc [ψ1 σ1^(−1/2) ... ψr σr^(−1/2)],   U = [σ1^(−1/2) φ1; ...; σr^(−1/2) φr] Lo.

The proof of the theorem is obtained by inspection.

Balanced Truncation Algorithm

The projection model reduction algorithm which uses the projection matrices U, V described above is called balanced truncation. The "balancing" terminology comes from the following observation.
Theorem 3.4.4 Assume that system (3.4.51) is both controllable and observable. Then Wc = Lc Lc' > 0 and Wo = Lo'Lo > 0 are positive definite. Moreover, if Ψ is the orthogonal matrix whose columns are the ordered eigenvectors of Lc'Wo Lc, and Σ is the diagonal matrix with the numbers σi on the diagonal, then the linear state transformation x = Sz, with S = Lc Ψ Σ^(−1/2), yields an equivalent state space model with the coefficient matrices

   Ā = S^(−1)AS,   B̄ = S^(−1)B,   C̄ = CS,

for which both the controllability and observability Gramians equal Σ.

State space models for which Wo = Wc are equal diagonal matrices with ordered positive diagonal entries σk are called balanced. The numbers σk are called the (Hankel) singular numbers of the corresponding system. The (canonical) method of balanced truncation is based on removing the states of a balanced realization which correspond to singular numbers below a certain threshold. Note that a practical implementation does not have to involve the calculation of a complete balanced model: only the projection matrices U, V are necessary.

Example

Let

   G(s) = 1/(s + 1 − ε) + 1/(s + 1 + ε),

where ε > 0 is a small parameter. A state space model is

   A = [−1 + ε, 0; 0, −1 − ε],   B = [1; 1],   C = [1 1],   D = 0.

The Gramians are given by

   Wo = Wc = (1/2) [1/(1 − ε), 1; 1, 1/(1 + ε)].

The Hankel singular numbers are the eigenvalues of Wo = Wc, and equal

   σ_(1,2) = (1 ± √(1 − ε² + ε⁴)) / (2(1 − ε²)).

The corresponding eigenvectors are

   v_(1,2) = [1; 2σ_(1,2) − 1/(1 − ε)].

The dominant eigenvector defines

   V ≈ [1/√2; 1/√2],   U ≈ [1/√2, 1/√2].

The resulting reduced system is approximately given by

   Â ≈ −1,   B̂ ≈ √2,   Ĉ ≈ √2.

Approximate Gramians

The two most expensive phases of the numerical calculations associated with the canonical method of balanced truncation are finding the Gramians Wo, Wc and calculating the dominant eigenvectors ψi of Lc'Wo Lc. For systems of large dimension (more than 10^4 states), finding the Gramians exactly becomes difficult.
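The whole procedure of Theorems 3.4.3 and 3.4.4 fits in a few lines of code. The sketch below (Python/SciPy for illustration; the value ε = 0.01 is an assumption) computes the projections U, V and approximately reproduces the example's reduced model:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky

def balanced_truncation(A, B, C, r):
    """Order-r balanced truncation via the projections of Theorem 3.4.3.
    A sketch assuming (A, B, C) is stable, controllable and observable."""
    Wc = solve_continuous_lyapunov(A, -B @ B.T)
    Wo = solve_continuous_lyapunov(A.T, -C.T @ C)
    Lc = cholesky(Wc, lower=True)                  # Wc = Lc Lc'
    rho, Psi = np.linalg.eigh(Lc.T @ Wo @ Lc)
    idx = np.argsort(rho)[::-1][:r]                # r dominant eigenvalues
    sigma = np.sqrt(rho[idx])
    V = Lc @ Psi[:, idx] / np.sqrt(sigma)          # columns Lc psi_i sigma_i^(-1/2)
    U = (Psi[:, idx].T @ Lc.T @ Wo) / sigma[:, None] ** 1.5   # rows of Theorem 3.4.3(d)
    return U @ A @ V, U @ B, C @ V, sigma

# Example from the text: G(s) = 1/(s + 1 - eps) + 1/(s + 1 + eps).
eps = 0.01
A = np.diag([-1.0 + eps, -1.0 - eps])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])
Ar, Br, Cr, sigma = balanced_truncation(A, B, C, 1)
```

Since the eigenvector sign returned by the solver is arbitrary, B̂ and Ĉ may both come out negated; the products Ĉ B̂ and Â are sign-independent and should be close to 2 and −1 respectively.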
As a viable alternative, lower and upper bounds of the Gramians can be used to provide provably reliable results. Here by lower bounds of the Gramians Wo, Wc we mean positive semidefinite matrices Wo^−, Wc^− for which the inequalities

   Wo ≥ Wo^−,   Wc ≥ Wc^−

are guaranteed. The definition of upper bounds is more strict: by upper bounds of the Gramians Wo, Wc defined by the Lyapunov equalities

   Wo A + A'Wo = −C'C,   AWc + Wc A' = −BB',

where A is a Hurwitz matrix, we mean solutions Wo^+, Wc^+ of the corresponding Lyapunov inequalities

   Wo^+ A + A'Wo^+ ≤ −C'C,   AWc^+ + Wc^+ A' ≤ −BB'.

These inequalities imply that Wo^+ ≥ Wo and Wc^+ ≥ Wc, but the inverse implication is not always true.

The following simple observation can be used to produce lower bounds of the Gramians.

Theorem 3.4.5 Let A be an n-by-n Hurwitz matrix, and let B be an n-by-m matrix. For s ∈ C with Re(s) > 0 define

   a = a(s) = (sIn − A)^(−1)(s̄In + A),   b = b(s) = (2Re(s))^(1/2) (sIn − A)^(−1)B.

Then:

(a) a is a Schur matrix (all eigenvalues strictly inside the unit disc);

(b) an n-by-n matrix P is a solution of the "continuous time" Lyapunov equation AP + PA' = −BB' if and only if it is a solution of the "discrete time" Lyapunov equation P = aPa' + bb';

(c) the matrix P from (b) is the limit

   P = lim_(k→∞) Pk,

where the symmetric matrices P0 ≤ P1 ≤ P2 ≤ ... ≤ P are defined by

   P0 = 0,   P_(k+1) = a(sk) Pk a(sk)' + b(sk) b(sk)',

and {sk} is a sequence of complex numbers contained in a compact subset of the open right half plane.

The theorem reduces finding a lower bound of a solution of a Lyapunov equation to solving systems of linear equations (used to produce b(sk) and the products a(sk)Pk a(sk)').

Calculation of an upper bound of a Gramian can be trickier. One approach to finding such upper bounds relies on having a valid energy function for A available. Indeed, assume that Q = Q' satisfies

   AQ + QA' < 0.
If Wc^− is a good lower bound of the controllability Gramian Wc, defined by AWc + Wc A' = −BB', then

   AWc^− + Wc^− A' ≈ −BB',

and hence

   A(Wc^− + εQ) + (Wc^− + εQ)A' ≤ −BB'

for some ε > 0 (which will, hopefully, be small enough). Then Wc^− and Wc^− + εQ are a lower and an upper bound for the controllability Gramian Wc.

Lower bounds for model reduction error

A major benefit of balanced truncation theory is a lower bound on the error of arbitrary reduced models (not only those produced via balanced truncation).

Theorem 3.4.6 Let Wo^− = Fo'Fo and Wc^− = Fc Fc' be lower bounds of the observability and controllability Gramians Wo, Wc of a stable LTI model G = G(s). Let

   σ1^− ≥ σ2^− ≥ ... ≥ 0

be the ordered singular numbers of Fo Fc. Then σk^− is a lower bound for the k-th Hankel singular number σk = σk(G) of the system, and

   ‖G − Ĝ‖∞ ≥ σk^−

for every system Ĝ of order less than k.

Proof. Let Zk denote the subspace spanned by the k dominant eigenvectors of Fc'Wo^− Fc = (Fo Fc)'(Fo Fc), so that

   |Fo Fc z| ≥ σk^− |z|   for all z ∈ Zk.

Since Wc ≥ Fc Fc', every state x(0) = Fc z can be reached from x(−∞) = 0 using a minimal energy input f = fz(t) (depending linearly on z) of energy not exceeding |z|² (indeed, q'Fc z ≤ |z|² whenever Fc z = Wc q). On the other hand, every state x(0) = Fc z with z ∈ Zk will produce at least |Fo Fc z|² ≥ (σk^−)²|z|² of output energy. Since Ĝ is a linear system of order less than k, there exists at least one non-zero z ∈ Zk for which the input f = fz(t) produces a zero state x̂(0) = 0 of Ĝ at zero time. Then, with the input set to zero for t > 0, the error output energy is at least (σk^−)²|z|². Since the testing input energy is not larger than |z|² > 0, this yields an energy gain of at least (σk^−)², which means that ‖G − Ĝ‖∞ ≥ σk^−.
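The Gramian lower bounds used in Theorem 3.4.6 can be produced by the recursion of Theorem 3.4.5(c), which is immediately programmable. A sketch with a constant real shift sk = 1 (the system data here are assumed for illustration):

```python
import numpy as np

A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[1.0], [0.5]])
n = A.shape[0]

def lyap_step(P, s):
    # a(s) = (sI - A)^{-1}(conj(s) I + A),  b(s) = sqrt(2 Re s)(sI - A)^{-1} B
    M = np.linalg.inv(s * np.eye(n) - A)
    a = M @ (np.conjugate(s) * np.eye(n) + A)
    b = np.sqrt(2.0 * np.real(s)) * (M @ B)
    return a @ P @ a.conj().T + b @ b.conj().T

# P0 = 0; each Pk is a lower bound of the controllability Gramian.
P = np.zeros((n, n))
for _ in range(50):
    P = lyap_step(P, 1.0 + 0.0j).real

# After convergence, P solves A P + P A' = -B B'.
residual = A @ P + P @ A.T + B @ B.T
```

Each step costs one linear solve in (sk I − A), which is what makes the recursion attractive when A is large and sparse; in practice far fewer iterations with well-chosen shifts sk are used.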
Upper bounds for balanced truncation errors

The result of the previous subsection states that, for a stable LTI system G, no method can produce a reduced model Ĝ of order less than k such that the H-Infinity error ‖G − Ĝ‖∞ is less than the k-th singular number σk = σk(G). The statement is easy to apply, since lower bounds σk^− of σk can be computed using lower bounds of the system Gramians. The following theorem gives an upper bound of the model reduction error for the exact implementation of the balanced truncation method.

Theorem 3.4.7 Let σ1 > σ2 > ... > σh be the ordered set of different Hankel singular numbers of a stable LTI system G. Let Ĝ be the reduced model obtained by removing the states corresponding to singular numbers not larger than σk from a balanced realization of G. Then Ĝ is stable, and satisfies

   ‖G − Ĝ‖∞ ≤ 2 Σ_(i=k)^h σi.

The utility of Theorem 3.4.7 in practical calculations of H-Infinity norms of model reduction errors is questionable: an exact calculation of the H-Infinity norm is possible at about the same cost, and the upper bound itself can be quite conservative. Nevertheless, the theorem provides an important reassuring insight into the potential of balanced truncation: since the singular numbers of exponentially stable LTI systems decay exponentially, the upper bound of Theorem 3.4.7 is not expected to be much larger than the lower bound. For example, for a system with singular numbers σi = 2^(−i), a k-th order reduced model cannot have quality better than 2^(−k−1), and exact balanced truncation is guaranteed to provide quality of at least 2^(−k+1).

The proof of Theorem 3.4.7 is based on estimating the quality of balanced truncation in the case when only the states of a balanced realization corresponding to the smallest Hankel singular number are removed, which is done in the following technical lemma.
Lemma 3.4.1 Let W = W' > 0 be a positive definite symmetric n-by-n matrix satisfying the Lyapunov equalities

   WA + A'W = −C'C,   AW + WA' = −BB'.   (3.4.55)

Assume that W, A, B, C can be partitioned as

   W = [P 0; 0 σIr],   A = [A11 A12; A21 A22],   B = [B1; B2],   C' = [C1'; C2'],

where A22 is an r-by-r matrix, and the matrices B2, C2' have r rows. Then

(a) the transfer matrices

   G(s) = C(sIn − A)^(−1)B,   G1(s) = C1(sI_(n−r) − A11)^(−1)B1

are stable;

(b) the Lyapunov equalities

   PA11 + A11'P = −C1'C1,   A11 P + PA11' = −B1 B1'

are satisfied;

(c) ‖G − G1‖∞ ≤ 2σ.

Proof. It is sufficient to consider the case when the dimension m of f = f(t) equals the dimension k of y = y(t). (If m < k, add zero columns to B; if m > k, add zero rows to C.)

First note that re-writing (3.4.55) in terms of the blocks Aik, Bi, Ci yields

   PA11 + A11'P = −C1'C1,   (3.4.56)
   PA12 + σA21' = −C1'C2,   (3.4.57)
   σ(A22 + A22') = −C2'C2,   (3.4.58)
   A11 P + PA11' = −B1 B1',   (3.4.59)
   σA12 + PA21' = −B1 B2',   (3.4.60)
   σ(A22 + A22') = −B2 B2'.   (3.4.61)

Note that (3.4.58) together with (3.4.61) implies that C2' = B2 θ for some unitary matrix θ. Also, (3.4.56) and (3.4.59) prove (b).

To prove (a), note that for every complex eigenvector v ≠ 0 of A, with Av = sv for some s ∈ C, multiplication of the first equation in (3.4.55) by v' on the left and by v on the right yields

   2Re(s) v'Wv = −|Cv|².

Hence either Re(s) < 0, or Re(s) = 0 and Cv = 0. Hence all unstable modes of A are unobservable, and G = G(s) has no unstable poles. The same argument applies to G1, since A11 satisfies similar Lyapunov equations.

To prove (c), consider the following state space model of the error system G − G1:

   ẋ1 = A11 x1 + A12 x2 + B1 f,
   ẋ2 = A21 x1 + A22 x2 + B2 f,
   ẋ3 = A11 x3 + B1 f,
   e = C1 x1 + C2 x2 − C1 x3.

It is sufficient to find a positive definite quadratic form V(x) = x'Hx such that

   ψ(t) = 4σ²|f(t)|² − |e(t)|² − dV(x(t))/dt ≥ 0

for all solutions of the system equations.
Indeed, such a Lyapunov function V can be readily presented, though there is no easy way to describe the intuitive meaning of its format:

   V(x) = σ²(x1 + x3)'P^(−1)(x1 + x3) + (x1 − x3)'P(x1 − x3) + 2σ|x2|².

To streamline the derivation, introduce the shortcut notation

   z = x1 + x3,   ∆ = x1 − x3,   δ = C1∆,   u = σ^(−1)B2'x2,   q = B1'P^(−1)z.

The equations now take the form

   ∆̇ = A11∆ + A12 x2,
   ż = A11 z + A12 x2 + 2B1 f,
   ẋ2 = A22 x2 + 0.5 A21 (z + ∆) + B2 f,
   e = C1∆ + σθ'u.

We have

   ψ = 4σ²|f|² − |C1∆ + σθ'u|² − 2σ²z'P^(−1)(A11 z + A12 x2 + 2B1 f)
       − 2∆'P(A11∆ + A12 x2) − 4σ x2'[A22 x2 + 0.5 A21 (z + ∆) + B2 f]
     = 4σ²|f|² − |C1∆ + σθ'u|² + σ²|q|² − 4σ²q'f + |δ|² + 2σ²|u|² − 4σ²u'f
       − 2z'[σ²P^(−1)A12 + σA21']x2 − 2∆'[PA12 + σA21']x2
     = 4σ²|f|² − |C1∆ + σθ'u|² + σ²|q|² − 4σ²q'f + |δ|² + 2σ²|u|² − 4σ²u'f
       + 2σ²q'u + 2σδ'θ'u
     ≥ 4σ²|f|² − |C1∆ + σθ'u|² + |δ|² + 2σ²|u|² − 4σ²u'f + 2σδ'θ'u − σ²|u − 2f|²
     = 0.

(The first identity transformation used (3.4.56), (3.4.58), (3.4.59); the second identity used (3.4.57), (3.4.60); the inequality applied minimization with respect to q; the last identity depends on θ being a unitary matrix.)

It is important to realize that the lemma remains valid when the original Lyapunov equalities are replaced by the corresponding Lyapunov inequalities (of course, the equalities in (b) get replaced by inequalities as well). Indeed, the Lyapunov inequalities

   WA + A'W ≤ −C'C,   AW + WA' ≤ −BB'

are equivalent to the Lyapunov equalities

   WA + A'W = −C̃'C̃,   AW + WA' = −B̃B̃',

where

   B̃ = [B Bδ],   C̃ = [C; Cδ],

and Bδ, Cδ are chosen appropriately. Note that G(s) is the left upper corner block of the transfer matrix

   G̃ = C̃(sIn − A)^(−1)B̃.

Since the H-Infinity norm of a transfer matrix is not smaller than the H-Infinity norm of its blocks, applying Lemma 3.4.1 to the system defined by A, B̃, C̃ yields the stated generalization.
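Before moving on, the two error bounds (Theorems 3.4.6 and 3.4.7) can be checked numerically on a small example: build a balanced realization, truncate it, and estimate the H-Infinity error on a frequency grid. A sketch (the 3-state system and the grid parameters are assumptions made for illustration):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky

# Assumed 3-state example; reduce to order k = 1 and verify
#   sigma_2 <= ||G - Ghat||_inf <= 2 (sigma_2 + sigma_3).
A = np.array([[-1.0, 0.4, 0.0],
              [0.0, -2.0, 0.3],
              [0.0, 0.0, -4.0]])
B = np.array([[1.0], [1.0], [1.0]])
C = np.array([[1.0, 0.5, 0.2]])
k = 1

Wc = solve_continuous_lyapunov(A, -B @ B.T)
Wo = solve_continuous_lyapunov(A.T, -C.T @ C)
Lc = cholesky(Wc, lower=True)
rho, Psi = np.linalg.eigh(Lc.T @ Wo @ Lc)
order = np.argsort(rho)[::-1]
sigma = np.sqrt(rho[order])
S = Lc @ Psi[:, order] / np.sqrt(sigma)      # balancing transform of Theorem 3.4.4
Si = np.linalg.inv(S)
Ab, Bb, Cb = Si @ A @ S, Si @ B, C @ S
Ar, Br, Cr = Ab[:k, :k], Bb[:k], Cb[:, :k]   # truncated balanced realization

ws = np.linspace(0.0, 50.0, 20001)           # grid estimate of the H-inf norm
def resp(Am, Bm, Cm):
    I = np.eye(Am.shape[0])
    return np.array([(Cm @ np.linalg.inv(1j * w * I - Am) @ Bm).item() for w in ws])
err = np.abs(resp(A, B, C) - resp(Ar, Br, Cr)).max()
```

The grid value slightly underestimates the true H-Infinity norm, so the lower-bound check is done with a small slack factor.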
3.5 Hankel Model Reduction

This lecture covers the basic principles of using Hankel operators in model reduction of LTI systems.

Hankel operators

Let L²_r denote the set of all square integrable functions e : R → R^r, i.e. those such that

   ∫_(−∞)^∞ |e(t)|² dt < ∞.

Let L²_r(−∞, 0) denote the subset of L²_r which consists of the functions e such that e(t) = 0 for t ≥ 0. The elements of L²_r(−∞, 0) will be called anti-causal in this lecture. Similarly, let L²_r(0, ∞) be the subset of functions e ∈ L²_r such that e(t) = 0 for t < 0. The elements of L²_r(0, ∞) will be called causal.

Let G = G(s) be a k-by-m matrix-valued function (not necessarily a rational one), bounded on the jω-axis. The corresponding Hankel operator H = HG is the linear transformation which maps anti-causal square integrable functions f ∈ L²_m(−∞, 0) to causal square integrable functions h = HG f ∈ L²_k(0, ∞) according to the following rule:

   h(t) = y(t)u(t),

where y(t) is the inverse Fourier transform of Y(jω) = G(jω)F(jω), F(jω) is the Fourier transform of f(t), and u(t) is the unit step function

   u(t) = 1 for t ≥ 0,   u(t) = 0 for t < 0.

In terms of the (stable, but not necessarily causal) LTI system defined by G, the Hankel operator maps anti-causal inputs f = f(t) to the causal parts h(t) = y(t)u(t) of the complete system response y(t). In particular, when G is anti-stable, i.e. is a proper rational transfer matrix without poles s with Re(s) ≤ 0, the associated LTI system is anti-causal, and hence the resulting Hankel operator HG is zero. More generally, adding an anti-stable component to G does not affect the resulting HG.

Hankel matrices

Let a > 0 be a fixed positive number. The functions

   Θk(jω) = (√(2a)/(jω + a)) ((a − jω)/(a + jω))^k,   k = 0, 1, 2, ...,

form an orthonormal basis in the space of stable strictly proper transfer functions, in the sense that for every such function H = H(s) there exists a square summable sequence of real numbers h0, h1, h2, ... satisfying

   H(jω) = Σ_(k=0)^∞ hk Θk(jω),

i.e.
   (1/2π) ∫_(−∞)^∞ |H(jω) − Σ_(k=0)^N hk Θk(jω)|² dω = Σ_(k=N+1)^∞ |hk|² → 0 as N → ∞.

In a similar sense, the inverse Fourier transforms θk = θk(t) of Θk = Θk(jω) form an orthonormal basis in L²_1(0, ∞), and the functions θk(−t), whose Fourier transforms are Θk(−jω), form an orthonormal basis in L²_1(−∞, 0).

The following lemma establishes a matrix representation of a Hankel operator with respect to the input basis {θk(−t)}_(k=0)^∞ and the output basis {θk(t)}_(k=0)^∞.

Lemma 3.5.1 Let

   gk = (1/π) ∫_(−∞)^∞ G(jω) ((a + jω)/(a − jω))^k (a dω/(a² + ω²)).

Then the result h = h(t) of applying HG to f(t) = θr(−t) is given by

   h(t) = Σ_(k=0)^∞ g_(r+k+1) θk(t).

An important implication of the lemma is that the matrix of HG with respect to the input/output bases {θk(−t)}_(k=0)^∞, {θk(t)}_(k=0)^∞ is the Hankel matrix

   ΓG = [g1 g2 g3 g4 ...; g2 g3 g4 ...; g3 g4 ...; g4 ...; ...],

i.e. the matrix whose (i, k) entry equals g_(i+k+1), i, k = 0, 1, 2, .... In general, the gk are matrices with real coefficients, in which case ΓG is called a block Hankel matrix.

Proof. Consider the decomposition

   h(t) = Σ_(k=0)^∞ hk θk(t).

By orthonormality of the θk(·),

   hk = ∫_0^∞ h(t)θk(t) dt = ∫_(−∞)^∞ y(t)θk(t) dt,

where y is the response of the stable LTI system associated with G to f(t) = θr(−t). By the Parseval formula,

   hk = (1/2π) ∫_(−∞)^∞ Θk(−jω) G(jω) Θr(−jω) dω
      = (1/2π) ∫_(−∞)^∞ G(jω) ((a + jω)/(a − jω))^(k+r+1) (2a dω/((a + jω)(a − jω))) = g_(k+r+1).

Singular values of a Hankel operator

Let M be an a-by-b matrix representing a linear transformation from R^b to R^a. Remember that the operator norm of M is defined as the minimal upper bound for the ratios |Mv|/|v|, where v ranges over the set of all non-zero vectors in R^b. In addition, the r-th singular number of M can be defined as the minimal operator norm of the difference ∆ = M − M̂, where M̂ ranges over the set of all matrices of rank less than r. These definitions extend naturally to linear transformations of other normed vector spaces (possibly infinite dimensional).
In particular, for a linear transformation M from L²_m(−∞, 0) to L²_k(0, ∞), its operator norm is defined as the square root of the minimal upper bound for the ratio

   ∫_0^∞ |(Mf)(t)|² dt / ∫_(−∞)^0 |f(t)|² dt,

where ∫_(−∞)^0 |f(t)|² dt > 0. Such a transformation M is said to have rank less than r if for every family of r functions f1, ..., fr ∈ L²_m(−∞, 0) there exist constants c1, ..., cr, not all equal to zero, such that

   c1(Mf1) + ... + cr(Mfr) ≡ 0.

Finally, the r-th singular number of M can be defined as the minimal operator norm of the difference ∆ = M − M̂, where M̂ ranges over the set of all linear transformations of rank less than r.

This allows us to talk about the k-th singular number of the Hankel operator HG associated with a given matrix-valued function G = G(jω), bounded on the imaginary axis. The largest singular number is called the Hankel norm ‖G‖_H of G, while the k-th singular number is called the k-th Hankel singular number of G.

For rational transfer matrices G, the calculation of the singular numbers of the corresponding Hankel operator can be done using the observability and controllability Gramians. The following theorem was, essentially, proven in the lectures on balanced truncation.

Theorem 3.5.1 Let A be an n-by-n Hurwitz matrix, and let B, C be matrices of dimensions n-by-m and k-by-n respectively, such that the pair (A, B) is controllable and the pair (C, A) is observable. Let Wc, Wo be the corresponding controllability and observability Gramians. Then, for G(s) = C(sI − A)^(−1)B, the Hankel operator HG has exactly n positive singular numbers, which are the square roots of the eigenvalues of Wc Wo.

It is also true that a Hankel operator with a finite number of positive singular numbers is defined by a rational transfer matrix.

Theorem 3.5.2 Let G = G(jω) be a bounded matrix-valued function defined on the imaginary axis. Let a > 0 be a positive number.
If the Hankel operator HG has fewer than r positive singular numbers, then the coefficients

   gk = (1/π) ∫_(−∞)^∞ G(jω) ((a + jω)/(a − jω))^k (a dω/(a² + ω²))

coincide, for k > 0, with the corresponding coefficients of a stable strictly proper transfer matrix G1 of order less than r.

For some non-rational transfer matrices, analytical calculation of the σi may be possible. For example, the i-th largest singular number of HG, where G(s) = exp(−s), equals 1 for every positive integer i. In general, the singular numbers of HG converge to zero if G = G(jω) is continuous on the extended imaginary axis (note that G(s) = exp(−s) is not continuous at ω = ∞). The converse statement is not true.

The Hankel optimal model reduction setup

Let G = G(s) be a matrix-valued function bounded on the jω-axis. The task of Hankel optimal model reduction of G calls for finding a stable LTI system Ĝ of order less than a given positive integer m such that the Hankel norm ‖∆‖_H of the difference ∆ = G − Ĝ is minimal. Since the Hankel operator HG represents a "part" of the total LTI system with transfer matrix G, the Hankel norm is never larger than the H-Infinity norm. Hence the Hankel optimal model reduction setup can be viewed as a relaxation of the "original" (H-Infinity optimal) model reduction formulation. While no acceptable solution is available for the H-Infinity case, Hankel optimal model reduction has an elegant and algorithmically efficient solution.

3.5.1 The AAK theorem

The solution of the Hankel optimal model reduction problem is based on the famous Adamyan-Arov-Krein (AAK) theorem, presented in this section.

The AAK Theorem

The Adamyan-Arov-Krein theorem provides both a theoretical insight and (taking a constructive proof into account) an explicit algorithm for finding Hankel optimal reduced models.

Theorem 3.5.3 Let G = G(s) be a matrix-valued function bounded on the jω-axis. Let σ1 ≥ σ2 ≥ ... ≥ σm ≥ 0 be the m largest singular values of HG.
Then σm is the minimum of ‖G − Ĝ‖_H over the set of all stable systems Ĝ of order less than m.

In other words, approximating a Hankel operator by general linear transformations of rank less than m cannot be done better (in terms of the minimal L2 gain of the error) than approximating it by Hankel operators of rank less than m.

The proof of the theorem, to be given in this section for the case of a rational transfer matrix G = G(s), is constructive, and provides a simple state space algorithm for calculating the Hankel optimal reduced model.

H-Infinity quality of Hankel optimal reduced models

It is well established by numerical experiments that Hankel optimal reduced models usually offer very high H-Infinity quality of model reduction. A somewhat conservative description of this effect is given by the following extension of the AAK theorem.

Theorem 3.5.4 Let G = G(s) be a stable rational transfer function. Assume that the Hankel singular numbers σk = σk(G) of G satisfy

   σ_(m−1) > σm = σ_(m+1) = ... = σ_(m+r−1) > σ_(m+r).

Let σ̃m > σ̃_(m+1) > σ̃_(m+2) > ... be the ordered sequence of the different Hankel singular values of G, starting with σ̃m = σm (so that σ̃_(m+1) = σ_(m+r)). Then

(a) there exists a Hankel optimal reduced model Ĝm^H of order less than m such that

   ‖G − Ĝm^H‖∞ ≤ σ̃m + 2 Σ_(k>m) σ̃k;

(b) there exists a model Ĝm* of order less than m such that

   ‖G − Ĝm*‖∞ ≤ Σ_(k≥m) σ̃k.

Just as in the case of the basic AAK theorem, the proof of Theorem 3.5.4 is constructive, and hence provides an explicit algorithm for the calculation of reduced models with the described properties. In practice, the actual H-Infinity norm of the model reduction error is usually much smaller. It is important to remember that the Hankel optimal reduced model is never unique (at least, the "D" term does not have any effect on the Hankel norm, and hence can be modified arbitrarily).
The proven H-Infinity model reduction error bound is guaranteed only for a specially selected Hankel optimal reduced model. Also, the reduced model from (b) is not necessarily a Hankel optimal reduced model.
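As a closing numerical illustration, Theorem 3.5.1 can be cross-checked against the Hankel matrix picture of Lemma 3.5.1. For rational G(s) = C(sI − A)^(−1)B the basis coefficients evaluate to gk = Cd Ad^(k−1) Bd, where (Ad, Bd, Cd) is a bilinear transform of (A, B, C) with parameter a — a derived closed form, not stated in the text — so the singular values of a truncated ΓG should match the square roots of the eigenvalues of Wc Wo:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Assumed example: G(s) = 1/(s + 1) + 1/(s + 3), basis parameter a = 1.5.
A = np.array([[-1.0, 0.0], [0.0, -3.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])
a = 1.5
n = A.shape[0]

# Bilinear transform giving g_k = Cd Ad^(k-1) Bd (derived, not from the text).
M = np.linalg.inv(a * np.eye(n) - A)
Ad = (a * np.eye(n) + A) @ M
Bd = np.sqrt(2.0 * a) * (M @ B)
Cd = np.sqrt(2.0 * a) * (C @ M)

# Truncated Hankel matrix with entries g_{i+j+1}, i, j = 0, ..., N-1.
N = 40
g = [(Cd @ np.linalg.matrix_power(Ad, k) @ Bd).item() for k in range(2 * N - 1)]
Gamma = np.array([[g[i + j] for j in range(N)] for i in range(N)])
hankel_svs = np.linalg.svd(Gamma, compute_uv=False)

# Gramian formula of Theorem 3.5.1: sqrt(eig(Wc Wo)).
Wc = solve_continuous_lyapunov(A, -B @ B.T)
Wo = solve_continuous_lyapunov(A.T, -C.T @ C)
gram_svs = np.sort(np.sqrt(np.linalg.eigvals(Wc @ Wo).real))[::-1]
```

Since Ad is a Schur matrix, the coefficients gk decay geometrically and the truncation at N = 40 changes the singular values only at round-off level; the truncated ΓG also has numerical rank 2, matching the order of G.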
