Model-data fusion for the coupled carbon-water system

Cathy Trudinger, Michael Raupach, Peter Briggs
CSIRO Marine and Atmospheric Research, Australia
Peter Rayner
LSCE, France

- Model-data fusion (= data assimilation + parameter estimation)
- Parameter estimation with the Kalman filter
- Australian Water Availability Project
- OptIC project – Optimisation Intercomparison

Model-data fusion

Model:
- Process representation
- Subjective, incomplete
- Capable of interpolation &

Observations:
- ‘Real world’ representation
- Incomplete, patchy
- No forecast capability

Optimal combination
(involves model-obs mismatch & strategy to

- “Best of both worlds”
- Identify model weaknesses
- Forecast capability
- Confidence limits

Choices in model-data fusion
- Target variables – what model quantities to vary to
  match observations – e.g. initial conditions, model
  parameters, time-varying model quantities, forcing

- Cost function – measure of misfit between observations
  and corresponding model quantities
  e.g. J(targets) = (H(targets) - obs)^2 + (targets - priors)^2

- Fusion method – search strategy
  - Batch (non-sequential) e.g. down-gradient, global search
  - Sequential e.g. Kalman filter

Approach and issues will differ to some extent between disciplines –
e.g. numerical weather prediction vs terrestrial carbon cycle
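
The example cost function J(targets) = (H(targets) - obs)^2 + (targets - priors)^2 can be sketched in a few lines. This is only an illustration: the straight-line observation operator H, the data values, and the optional scaling arguments obs_sd and prior_sd are assumptions made for the example and do not come from the slides. With both scalings left at 1.0 the function reduces to the formula above.

```python
import numpy as np

def cost(targets, obs, priors, H, obs_sd=1.0, prior_sd=1.0):
    """Squared model-obs misfit plus squared departure of targets from priors."""
    targets = np.asarray(targets, dtype=float)
    misfit = (H(targets) - np.asarray(obs)) / obs_sd
    departure = (targets - np.asarray(priors)) / prior_sd
    return float(np.sum(misfit**2) + np.sum(departure**2))

# Hypothetical observation operator: a line a*t + b sampled at three times.
t = np.array([0.0, 1.0, 2.0])
H = lambda p: p[0] * t + p[1]

obs = np.array([1.0, 3.0, 5.0])     # made-up observations
priors = np.array([2.0, 1.0])       # made-up prior values for (a, b)

J_at_priors = cost(priors, obs, priors, H)      # targets equal to priors
J_elsewhere = cost([1.0, 1.0], obs, priors, H)  # a different candidate
```

A batch method would search over targets for the minimum of this function; a sequential method reaches a comparable estimate by processing the observations one time step at a time.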

The Ensemble Kalman filter
- Ensemble Kalman filter (EnKF) – sequential method that
  uses Monte Carlo techniques; error statistics are
  represented using an ensemble of model states.

- Two steps:
  - Model used to predict from one time to next
  - Update using observation

[Figure: ensemble evolving over time (t0, t1, t2); labels: initial ensemble, update using measurement]
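
The two steps can be sketched as below. This is a minimal perturbed-observation EnKF under stated assumptions, not the implementation used in the talk: the scalar decay model, the linear observation operator H, the ensemble size, and all noise levels are hypothetical choices for illustration.

```python
import numpy as np

def forecast_step(ensemble, model, model_sd, rng):
    """Step 1: run the model on every member and add stochastic model error."""
    predicted = np.array([model(member) for member in ensemble])
    return predicted + rng.normal(0.0, model_sd, size=predicted.shape)

def update_step(ensemble, obs, obs_sd, H, rng):
    """Step 2: move every member towards a perturbed copy of the observation."""
    n = ensemble.shape[0]
    predicted_obs = ensemble @ H.T                   # (members, n_obs)
    A = ensemble - ensemble.mean(axis=0)             # state anomalies
    Y = predicted_obs - predicted_obs.mean(axis=0)   # predicted-observation anomalies
    P_xy = A.T @ Y / (n - 1)                         # state-observation covariance
    P_yy = Y.T @ Y / (n - 1) + obs_sd**2 * np.eye(H.shape[0])
    K = P_xy @ np.linalg.inv(P_yy)                   # Kalman gain from ensemble statistics
    perturbed = obs + rng.normal(0.0, obs_sd, size=predicted_obs.shape)
    return ensemble + (perturbed - predicted_obs) @ K.T

rng = np.random.default_rng(0)
initial = rng.normal(0.0, 1.0, size=(200, 1))        # initial ensemble at t0
H = np.array([[1.0]])                                # observe the state directly
forecast = forecast_step(initial, lambda x: 0.9 * x, 0.1, rng)   # predict t0 -> t1
updated = update_step(forecast, np.array([2.0]), 0.1, H, rng)    # assimilate obs at t1
```

The spread of the updated ensemble serves as the uncertainty estimate, which is how the error statistics are carried from one time to the next.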

Parameter estimation with the Ensemble Kalman filter
- Augmented state vector to be estimated contains
  - Time-dependent model variables
  - Time-independent model parameters

- State vector estimate at any time is due to
  observations up to that time
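
A toy illustration of the augmented state vector: the parameter is appended to the state, persists unchanged through the forecast, and is corrected in the update through its ensemble covariance with the observed variable. Everything specific here is an assumption for the sketch (the model x(t+1) = a*x(t) + 1, the true value of a, the priors and the noise levels); none of it is taken from the AWAP or OptIC models.

```python
import numpy as np

rng = np.random.default_rng(1)
n_members, n_steps = 100, 100
a_true, obs_sd, model_sd = 0.8, 0.1, 0.05

# Augmented ensemble: column 0 is the state x, column 1 is the parameter a.
ens = np.column_stack([
    rng.normal(1.0, 0.5, n_members),     # prior for x
    rng.normal(0.5, 0.15, n_members),    # deliberately poor prior for a
])
prior_a = ens[:, 1].mean()

x_true = 1.0
for _ in range(n_steps):
    # Synthetic truth and a noisy observation of x only.
    x_true = a_true * x_true + 1.0
    z = x_true + rng.normal(0.0, obs_sd)

    # Forecast: each member uses its own parameter value; the parameter persists.
    ens[:, 0] = ens[:, 1] * ens[:, 0] + 1.0 + rng.normal(0.0, model_sd, n_members)

    # Update: covariance between a and predicted x lets the observation correct a.
    y = ens[:, 0]
    anomalies = ens - ens.mean(axis=0)
    cov_with_y = anomalies.T @ (y - y.mean()) / (n_members - 1)   # shape (2,)
    gain = cov_with_y / (y.var(ddof=1) + obs_sd**2)
    perturbed = z + rng.normal(0.0, obs_sd, n_members)
    ens = ens + np.outer(perturbed - y, gain)

a_estimate = ens[:, 1].mean()
```

At each pass through the loop the parameter estimate reflects only the observations assimilated so far, which is the sequential property stated in the last bullet.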

Our component of Australian Water Availability Project:
develop a Hydrological and Terrestrial Biosphere
Data Assimilation System for Australia

MODEL
- Soil moisture
- Leaf carbon
- Water fluxes
- Carbon fluxes

OBSERVATIONS
- NDVI
- Monthly river flows
- Weather: rainfall, solar radiation, temperature

PRIOR INFORMATION
- Initial parameter estimates
- Soil, vegetation types

MODEL-DATA FUSION
- Ensemble Kalman Filter
- Down-gradient method (LM)

- Analysis of past, present and future water and carbon budgets
- Maps of soil moisture, vegetation growth
- Process understanding
- Drought assessments, national water balance

AWAP - Dynamic Model and Observation Model
Timestep = 1 day
Spatial resolution = 5x5 km

- State variables (x) and dynamic model
  - Dynamic model is of general form dx/dt = F(x, u, p)
  - All fluxes (F) are functions F(x, u, p) = F(state vector, met forcing, params)
  - Governing equations for state vector x = (W, CL):

    Soil water W:

    Leaf carbon CL:

- Observations (z) and observation model
  - NDVI = func(CL)
  - Catchment discharge = average of FWR + FWD [- extraction - river loss]

- State vector in EnKF: x = [W, CL, NDVI, Dis, params]
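
To show how a model of the general form dx/dt = F(x, u, p) is stepped with a daily timestep, here is a forward-Euler sketch. The single-store "bucket" flux law, its parameter k, and the constant rainfall forcing are hypothetical stand-ins chosen for the example; they are not the AWAP governing equations, and forward Euler is an assumed integration scheme.

```python
import numpy as np

def integrate(F, x0, forcing, params, dt=1.0):
    """Forward-Euler integration of dx/dt = F(x, u, p), one step per forcing value."""
    x = np.asarray(x0, dtype=float)
    path = [x]
    for u in forcing:
        x = x + dt * F(x, u, params)
        path.append(x)
    return np.array(path)

# Hypothetical flux law: rain fills a store that drains at a rate proportional to W.
def bucket(x, u, p):
    return u - p["k"] * x

rain = np.full(200, 2.0)                        # constant forcing, one value per day
W = integrate(bucket, 0.0, rain, {"k": 0.1})    # daily timestep, dt = 1
```

With constant forcing the store approaches the steady state rain / k = 20, a simple check that the flux signs are right.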

Southern Murray Darling Basin, Australia:
"unimpaired" gauged catchments

[Figure: Murrumbidgee relative soil moisture (0 to 1), by month (J to D) for years 83, 84, 85, 02, 03, 04; forward run with priors, no assimilation]

Predicted and observed discharge
11 unimpaired catchments in Murrumbidgee
25-year time series: January 1981 to December 2005
(Forward run with priors, no assimilation)

Model-data synthesis approach:
- State and parameter estimation with the EnKF
- Assimilate NDVI and monthly catchment discharge
Why Kalman filter?
- Can account for model error (stochastic component)
- Consistent statistics (uncertainty analysis)
- Forecast capability (with uncertainty)
- Time-averaged observations in EnKF
  (e.g. monthly catchment discharge)
- Specifying statistical model (model and observation errors)
- KF (sequential) vs batch parameter estimation methods?
  (using Levenberg-Marquardt method; also OptIC project)

Preliminary results: Adelong Creek

[Figure: estimated parameters and monthly mean discharge/runoff. Blue = Ensemble Kalman filter (sequential); Red = Levenberg-Marquardt (PEST) (batch)]

OptIC project
Optimisation method intercomparison
- International intercomparison of parameter estimation
  methods in biogeochemistry
- Simple test model, noisy pseudo-data
- 9 participants submitted results
- Methods used:
  - Down-gradient (Levenberg-Marquardt, adjoint)
  - Sequential (extended Kalman filter, ensemble Kalman filter)
  - Global search (Metropolis, Metropolis MCMC, Metropolis-Hastings MCMC)

Estimate parameters p1, p2, k1, k2

F(t) – forcing (log-Markovian i.e. log of forcing is Markovian)
x1 – fast store
x2 – slow store
p1, p2 – scales for effect of x1 and x2 limitation of production
k1, k2 – decay rates for pools
s0 – seed production (constant value to prevent collapse)

Noisy pseudo-observations
- T1: Gaussian (G)
- T2: Log-normal (L)
- T3: Gaussian + temporally correlated (Markov) (GT)
- T4: Gaussian but noise in x2 correlated with noise in x1 (GC)
- T5: Gaussian + drifts (GD)
- T6: Gaussian with 99% of x2 data missing (GM)
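
Several of these noise types can be mimicked with short generators, sketched below. The exact definitions used to build the OptIC pseudo-data are not given on the slides, so the forms here are assumptions: multiplicative noise for the log-normal case, an AR(1) process for the Markov case, a linear trend for the drift case, and random deletion for the missing-data case. All function names and parameter values are hypothetical.

```python
import numpy as np

def gaussian(x, sd, rng):
    """Additive white Gaussian noise."""
    return x + rng.normal(0.0, sd, x.shape)

def lognormal(x, sd, rng):
    """Multiplicative log-normal noise; positive data stay positive."""
    return x * np.exp(rng.normal(0.0, sd, x.shape))

def markov(x, sd, rho, rng):
    """Additive AR(1) noise with lag-1 correlation rho and stationary sd."""
    w = rng.normal(0.0, sd, x.size)
    e = np.zeros(x.size)
    e[0] = w[0]
    for i in range(1, x.size):
        e[i] = rho * e[i - 1] + np.sqrt(1.0 - rho**2) * w[i]
    return x + e

def drift(x, sd, slope, rng):
    """Gaussian noise plus a linearly growing bias."""
    return gaussian(x, sd, rng) + slope * np.arange(x.size)

def with_missing(z, fraction, rng):
    """Replace roughly the given fraction of values with NaN."""
    z = z.copy()
    z[rng.random(z.size) < fraction] = np.nan
    return z

rng = np.random.default_rng(2)
truth = np.full(10000, 5.0)                      # stand-in for a model output series
z_log = lognormal(truth, 0.2, rng)
z_markov = markov(truth, 0.5, 0.9, rng)
z_drift = drift(truth, 0.5, 1e-4, rng)
z_missing = with_missing(gaussian(truth, 0.5, rng), 0.99, rng)
```

Generating pseudo-observations from a known truth is what makes an intercomparison possible: each method's estimates can be divided by the true parameter values.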

[Figure: estimates divided by true parameters]

Cost function

Some participants used cost functions with weights,
wi(t), that depended on each noisy observation zi(t)
Code        Method                                                     Weights
LM1         Monte Carlo then Levenberg-Marquardt                       f(zi(t))
LM1Rob      As LM1, but ignore 2% highest summands in cost fn          f(zi(t))
LM2         Levenberg-Marquardt                                        0.01
LM3         Levenberg-Marquardt                                        f(zi(t))
Adj1        Down-gradient search using model adjoint                   1.0
Adj2        Down-gradient search using model adjoint                   sd(x)
EKF         Extended Kalman filter (with parameters in state vector)   sd(resids)
EnKF        Ensemble Kalman filter (with parameters in state vector)   sd(resids)
Met         Metropolis                                                 sd(resids)
MetRob      As Met but absolute deviations not least squares           sd(resids)
MetMCMC     Metropolis Markov Chain Monte Carlo                        1.0
MetMCMCq    As MetMCMC but quadratic weights                           f(zi^2(t))
MH_MCMC     Metropolis-Hastings Markov Chain Monte Carlo               1.0

Weights that depended on the noisy observations, wi(t) = f(zi(t)), were less successful than constant weights
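
One plausible mechanism, offered here as an assumption rather than as a finding stated on the slides, is that weights computed from the noisy data favour observations that happen to be low, which biases the estimate. The sketch below estimates a constant from noisy data using constant weights and a hypothetical observation-dependent weight f(z) = 1/z^2; the weight functions actually used by OptIC participants are not specified here.

```python
import numpy as np

rng = np.random.default_rng(3)
mu_true = 10.0
z = rng.normal(mu_true, 2.0, 10000)     # noisy pseudo-observations of a constant

def weighted_estimate(z, w):
    """Value of mu that minimises sum(w * (mu - z)**2)."""
    return np.sum(w * z) / np.sum(w)

mu_constant = weighted_estimate(z, np.ones_like(z))   # constant weights
mu_obs_based = weighted_estimate(z, 1.0 / z**2)       # hypothetical weights f(z) = 1/z^2
```

In this toy case the constant-weight estimate stays close to the true value, while the observation-dependent weights pull the estimate low.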

Choice of cost function
- Evans (2003) – review of parameter estimation in
  biogeochemical models – “it was hard to find two
  groups of workers who made the same choice for the
  form of the misfit function”, with most of the
  differences being in the form of the weights.

- Evans (2003) and the OptIC project emphasise that the
  choice of cost function matters, and should be made
  deliberately, not by accident or default.

(Evans 2003, J. Marine Systems)

OptIC project results

- Choice of cost function had a large impact on results
- Most troublesome noise type: temporally correlated noise
- The Kalman filter did as well as the batch methods
- For more information on OptIC:
Thank you!
