14th IFAC (International Federation of Automatic Control) Symposium on System Identification, SYSID 2006, March 29-31
IMPACT OF SYSTEM IDENTIFICATION METHODS IN METABOLIC MODELLING AND CONTROL
Dr. J. Geoffrey Chase
Department of Mechanical Engineering Centre for Bio-Engineering University of Canterbury Christchurch, New Zealand
Metabolic modelling can significantly improve the clinical control of hyperglycaemia with model-based protocols (e.g. Hovorka et al., 2004; Chase et al., 2005)
For clinical utility, model parameters must be accurately identified for real-time prediction of response to intervention
Current identification methods are mostly non-linear and non-convex, and very computationally intense
With increasing model complexity, parameter trade-off can result in problematic identification. A typical solution is probabilistic population fitting methods (e.g. Vicini and Cobelli, 2001; Hovorka et al., 2004)
A typical clinical situation might use models and identification methods from different sources with local cohort data.
The Problem & The Goal
Non-linear and non-convex identification methods and models can deliver sub-optimal results, affecting control prediction
– Clinically, prediction is the only true measure of utility
What is the clinical impact of mixing models and identification methods (if any)?
– Currently, model, system ID method and control are all designed together
– What happens if someone “mixes and matches” without the original designers' insights or experience?
This research compares a recently introduced linear, convex integral-based method and the commonly used non-linear recursive least squares (NRLS) identification method
– Using an accepted metabolic system model from one source and clinical data from another source for “independence”
– “Independence” represents the typical clinical situation and avoids the models or methods being tuned for the cohort
The goal is to examine the computational cost and outcomes of these different methods in a clinical control application context
The model chosen for comparison is loosely based on the 2-compartment minimal model (2CMM) first proposed by Caumo & Cobelli (1993)
– Well documented model that is widely used as a foundation
Main change is the 3 insulin compartments for the remote effects of insulin on glucose distribution/transport, disposal and EGP introduced by Hovorka et al. (2002)
– Similar model has been used clinically for control
Comprises 6 compartments in total
– 2 glucose compartments g1(t) and g2(t)
– 3 insulin action compartments QD(t), QT(t) and QEGP(t)
– 1 plasma insulin compartment I(t)
Integral-Based Parameter Fitting
A “minimal” approach to identification is used with most model constants identified a priori from literature results
– Selection of population-valued constants is a major issue in biomedical modelling, as it assumes the results are not highly sensitive to the parameter
– This assumption may not be true in all clinical scenarios or cohorts
– Required in many cases to ensure the model is identifiable from the available data
The remaining insulin sensitivities SI,D, SI,T and SI,EGP are identified as time-varying model parameters driving the model dynamics (details in the paper)
This approach minimises total computational cost while enabling individual model constants to be varied for more optimised prediction and fit (e.g. Hann et al., 2005)
What is the effect of mixing this approach and this model?
– Would be an “easy” combination for an independent researcher
– Will all assumptions on constant parameters hold?
– Can we identify despite inaccessible, unmeasurable compartments?
Integral-Based Parameter Fitting
SI,D, SI,T and SI,EGP are defined as piecewise constant over time periods of 60 min using Heaviside step functions, H(t).
$$S_{I,j} = \sum_{i=1}^{N} S_{I,j,i}\,\bigl(H(t - t_{(i-1)}) - H(t - t_i)\bigr), \quad \text{where } j = D,\ T \text{ and } EGP$$
The definition of the distribution of these parameters is arbitrary, e.g. cubic, quadratic, etc.
– Approach allows constants to define variation and be pulled out of integrals
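The piecewise-constant definition above can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the 60 min period argument, the convention H(0) = 1 and the numerical values are assumptions made for the example, not values from the study.

```python
import numpy as np

def heaviside(x):
    # H(x) = 1 for x >= 0, else 0 (the value at x = 0 is an assumed convention)
    return np.where(x >= 0.0, 1.0, 0.0)

def piecewise_constant(t, values, period=60.0):
    """Evaluate sum_i values[i-1] * (H(t - t_{i-1}) - H(t - t_i)) with t_i = i * period."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    for i, v in enumerate(values, start=1):
        out += v * (heaviside(t - (i - 1) * period) - heaviside(t - i * period))
    return out

# Hypothetical sensitivity values for three consecutive 60 min windows
s_values = [0.5, 0.8, 0.3]
t = np.array([0.0, 30.0, 60.0, 119.0, 150.0])  # minutes
s_of_t = piecewise_constant(t, s_values)
```

Because each window contributes a single constant, that constant factors out of any integral taken over the window, which is what keeps the identification problem linear in the unknowns.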
2nd order polynomial interpolation is assumed between glucose measurements in the accessible glucose compartment g1(t)
– Error using this approximation has been shown to be minimal (Hann et al., 2005)
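As a rough illustration of the interpolation step, the sketch below fits a parabola through three consecutive glucose samples and integrates it exactly. The sample values are hypothetical, the helper names are invented for this example, and using NumPy's polynomial routines is an assumed implementation choice.

```python
import numpy as np

def quadratic_through(ts, gs):
    """Coefficients (highest power first) of the parabola through three points."""
    return np.polyfit(ts, gs, 2)

def integrate_quadratic(coeffs, a, b):
    """Exact integral of the fitted polynomial from a to b."""
    antiderivative = np.polyint(coeffs)
    return np.polyval(antiderivative, b) - np.polyval(antiderivative, a)

# Three samples at 0.5 h spacing (minutes, mmol/L); the glucose values are hypothetical
ts = np.array([0.0, 30.0, 60.0])
gs = np.array([6.0, 6.6, 6.4])

coeffs = quadratic_through(ts, gs)
area = integrate_quadratic(coeffs, 0.0, 60.0)  # approximates the integral of g1(t) dt
```

For equally spaced samples, integrating the parabola over the full window reproduces Simpson's rule, which gives a quick way to check the result.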
Integral-Based Parameter Fitting
Inaccessible glucose compartment g2(t) is modelled using a 2nd order Lagrange polynomial approximation to the analytical solution for this unmeasurable compartment (fortunately, it's a simple enough dynamic)
Within a time period of [t0, tf], an arbitrary number of equations can be generated by integration of model equations over different time periods
The non-linear model thus decomposes into a linear equation system in unknown constants defining parameters to be identified
– Resulting least squares solution is starting point independent and convex!
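$$A \begin{bmatrix} S_{I,T} \\ g_2(t_0) \\ g_2(t_1) \\ \vdots \\ g_2(t_f) \\ S_{I,EGP} \\ EGP_b \end{bmatrix} = b, \qquad C \begin{bmatrix} S_{I,D} \end{bmatrix} = d$$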
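The idea of turning a differential equation into a linear least squares problem by integration can be illustrated on a toy system. The sketch below is not the 6-compartment model from these slides: the single-compartment equation dg/dt = -p g + c, its parameter values and the 5 min sampling are all assumptions chosen to keep the example short.

```python
import numpy as np

# Toy model (assumed for illustration only): dg/dt = -p * g(t) + c
p_true, c_true, g0 = 0.02, 0.1, 8.0
t = np.arange(0.0, 601.0, 5.0)  # minutes
g = c_true / p_true + (g0 - c_true / p_true) * np.exp(-p_true * t)  # analytic solution

# Cumulative trapezoidal integral of g from t[0] to each t[k]
cum_int = np.concatenate(([0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * np.diff(t))))

# Integrating the ODE from t[0] to t[k] gives one linear equation per k:
#   g(t_k) - g(t_0) = -p * integral(g) + c * (t_k - t_0)
A = np.column_stack((-cum_int[1:], t[1:] - t[0]))
b = g[1:] - g[0]

# Linear least squares: no starting guess is needed and the problem is convex
(p_hat, c_hat), *_ = np.linalg.lstsq(A, b, rcond=None)
```

With noise-free data the recovered values differ from the true ones only by the small error of the trapezoidal integration, and no initial parameter guess enters the solution at any point.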
Patient data (n=7) was chosen from an intensive care unit hyperglycaemia control trial (Chase et al., 2005)
Each set of patient data spans 10 hrs with glucose measurements at 0.5 hr intervals.
– Average glucose levels are ~6 mmol/L (range ~4-10 mmol/L)
Prediction window is 1hr following hourly clinical interventions
Median APACHE II = 23, inter-quartile range = 19-25
Results: Model Fit
Residual plot of model fit to patient data
Model fit errors
– Patient 2 (highest RMSE 0.80 mmol/L, error SD 0.59 mmol/L)
– Patient 5 (smallest RMSE 0.15 mmol/L, error SD 0.08 mmol/L)
[Figure legend: Patient 2; Patient 5; Patients 1, 3, 4, 6, 7]
Model fit mean absolute percent error (MAPE) for cohort ranges from 2.4-7.4% which is within reported sensor error
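For reference, the error measures quoted in these results can be computed as in the sketch below. The glucose values are hypothetical, the function name is invented for this example, and taking the error SD as the sample standard deviation of the signed errors is an assumption, since the slides do not define it.

```python
import numpy as np

def fit_metrics(measured, modelled):
    """Return MAE, RMSE, MAPE (%) and error SD for two equal-length series."""
    measured = np.asarray(measured, dtype=float)
    modelled = np.asarray(modelled, dtype=float)
    err = modelled - measured
    return {
        "MAE": np.mean(np.abs(err)),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "MAPE": 100.0 * np.mean(np.abs(err) / np.abs(measured)),
        "SD": np.std(err, ddof=1),  # assumed definition: sample SD of signed errors
    }

# Hypothetical measured and modelled glucose values in mmol/L
measured = [5.0, 6.0, 8.0, 10.0]
modelled = [5.5, 5.7, 8.4, 9.0]
metrics = fit_metrics(measured, modelled)
```

RMSE weights large misses more heavily than MAE, so a patient with a few large residuals shows a wider gap between the two.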
Residual plot of model prediction to patient data
Model prediction errors
– MAE for cohort is 1.03 mmol/L, error SD is 0.78 mmol/L
– RMSE is 1.31 mmol/L, MAPE 20.21%
[Figure legend: Patient 2; Patient 5; Patients 1, 3, 4, 6, 7]
– Very variable depending on the patient and/or time
Prediction MAPE exceeds the reported sensor error
Errors are mostly either at or within sensor error, or very wide
Average model fit RMSE for NRLS and integral-based methods
[Figure legend: Nonlinear Parameter ID; Integral-Based Parameter ID]
NRLS implemented using a non-linear ODE least squares solver in MATLAB on a Pentium M 1.7 GHz PC, 1 GB RAM
Integral method has lower error even with approximated compartment
Average values of SI,D, SI,T and SI,EGP from literature used as starting points
Integral-based method with linear approximation of g2(t) is 140X-660X faster than NRLS
NRLS finds local minima, as seen in higher average model fit RMSE at most times
Average time to complete model fit for one 10 hr trial using linear integral-based method was 0.46±0.16 s vs 122.60±42.81 s using NRLS
Is it the model or method?
Average model prediction RMSE with 1-compartment glucose model
[Figure legend: 2-compartment glucose model; Chase et al. model (1-compartment glucose model) (Chase et al., 2005)]
Care must be taken not to overfit available data with model dynamics.
For this cohort, the 1-compartment glucose model has significantly smaller prediction errors for a given set of parameters
This result is due to differences in model dynamics and ability to fit the observed behaviour, independent of fitting method
However, model constants were average a priori values and not further optimised
Hence the level of prediction accuracy reported may be expected
A convex identification method exposes the model prediction errors, identifying potential inadequacies in model dynamics and/or constants
Cohort model fit RMSE and MAPE were lower using the linear integral-based method compared to NRLS (for the same model)
Model complexity can be extended (i.e. multiple compartments) without significantly affecting identification computation time
Integrals can be used for simple inaccessible compartments using approximations
Fitted parameters were all within reported physiological ranges
Issues:
– Different model dynamics and parameters may work better for different cohorts or situations; the comparison is not “complete” and this work is presented to show the potential impacts
– A priori global identifiability should always be considered. Not all models are globally identifiable for all parameters.
Linear, integral-based method shown to have lower computational cost, leading to increased PI speed
A convex method can identify potential areas of model difficulty or which other parameters may need to be identified in place of a population value.
Jessica Lin & AIC3
Jason Wong & AIC4
AIC2 & Dr. Geoff Shaw
Prof Steen Andreassen
Dr Kirsten McAuley
Prof Jim Mann
Maths and Stats Gurus
Dr Dom Lee
Dr Bob Broughton
Prof Graeme Wake
Dr Chris Hann