Ensemble Data Assimilation


Lars Nerger
Alfred Wegener Institute for Polar and Marine Research
Bremerhaven, Germany
Bremen Supercomputing Competence Center BremHLR

            • Sequential data assimilation

            • Ensemble-based Kalman filters

            • Application example:
              Assimilation of satellite ocean color data

Lars Nerger - Ensemble Data Assimilation
                  Sequential data assimilation
               ensemble-based Kalman filters

          Data Assimilation

         Optimal estimation of system state:
             • initial conditions          (for forecasts, …)
             • trajectory                  (temperature, concentrations, …)
             • parameters                  (growth of phytoplankton, …)
             • fluxes                      (heat, primary production, …)
             • boundary conditions and ‘forcing’            (wind stress, …)
             together with corresponding estimation errors

         2nd-order use of data assimilation:
             • reduce model error
               (improve parameterizations and model formulation)
             • provide re-analysis data sets

         Issues of Data Assimilation

         High dimension of numerical model - O(10^7-10^9)
              Very costly prediction of errors
         Sparse and irregular observations
         Nonlinearity of system
         Uncertainty in errors of model and data
          (variances and correlations)

         Approximate statistical estimation methods required
         A common class: ensemble-based Kalman filter algorithms

     Sequential Data Assimilation (Kalman filter view)

         Consider some physical system (ocean, atmosphere,…)

         Sequential assimilation: correct model state estimate when
         observations are available (analysis); propagate estimate (forecast)

         Size of correction determined by error estimates

         [Figure: model state versus time, corrected when observations are available]
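
The forecast/analysis cycle described above can be sketched for a single scalar state. This is a minimal illustration, not code from the talk: the linear model factor `m`, the model-error variance `q`, the observation-error variance `r` and the observation values are all assumed for the example.

```python
def forecast(x, p, m=0.95, q=0.05):
    """Propagate estimate x and error variance p with the linear model x -> m * x.

    m and the model-error variance q are illustrative values.
    """
    return m * x, m * p * m + q


def analysis(x, p, y, r):
    """Correct the forecast (x, p) with an observation y of error variance r."""
    gain = p / (p + r)  # weight of the observation, set by the error estimates
    return x + gain * (y - x), (1.0 - gain) * p


def assimilate(x0, p0, observations, r=0.1, n_steps=6):
    """Forecast every step; run an analysis only when an observation exists.

    `observations` maps a time-step index to an observed value.
    """
    x, p = x0, p0
    history = []
    for step in range(1, n_steps + 1):
        x, p = forecast(x, p)
        if step in observations:
            x, p = analysis(x, p, observations[step], r)
        history.append((step, x, p))
    return history


history = assimilate(x0=1.0, p0=1.0, observations={2: 0.7, 4: 0.9, 6: 0.6})
for step, x, p in history:
    print(f"step {step}: x = {x:.3f}, variance = {p:.3f}")
```

The error variance grows during forecast steps and drops at each analysis, which is the behaviour the slide describes: the size of each correction follows from the relative error estimates of model and data.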



         Probabilistic view: Optimal estimation

         Consider probability distribution of model and observations

         Explicit integration of probability distribution is not …

         Ensemble methods:
             • Kalman-based filters assume Gaussian distributions
             • Particle filters are more general, but costly
             (both types are based on Bayes' law)

         [Figure: probability distributions of forecast, observation and analysis at time 0, time 1 and time 2]


      Assumed by the equations of a Kalman filter-based algorithm:
            • Gaussian forecast probability distribution
            • Gaussian-distributed observation errors
      Analysis is the combination of two Gaussian distributions
      Estimation problem can be formulated in terms of means
      and covariance matrices of probability distributions

     But: Nonlinearity will not conserve Gaussianity!

     (Extended KF conserves Gaussianity by first-order linearization)
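
For a single directly observed variable, the combination of two Gaussian distributions can be written out explicitly. The symbols below ($x^f$, $\sigma_f^2$ for the forecast, $y$, $\sigma_o^2$ for the observation, $x^a$, $\sigma_a^2$ for the analysis) are notation chosen for this sketch, not taken from the slides.

```latex
% Bayes' law with a Gaussian forecast and a Gaussian observation error
p(x \mid y) \;\propto\;
  \exp\!\left(-\frac{(x - x^f)^2}{2\sigma_f^2}\right)
  \exp\!\left(-\frac{(y - x)^2}{2\sigma_o^2}\right)

% Completing the square in x gives again a Gaussian with
\frac{1}{\sigma_a^2} = \frac{1}{\sigma_f^2} + \frac{1}{\sigma_o^2},
\qquad
x^a = x^f + K\,(y - x^f),
\qquad
K = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_o^2},
\qquad
\sigma_a^2 = (1 - K)\,\sigma_f^2 .
```

The analysis is therefore fully described by a mean and a variance, which is why the estimation problem can be posed in terms of means and covariance matrices.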

Ensemble-based Kalman Filters

 Foundation: Kalman filter (Kalman, 1960)
   • optimal estimation problem
   • express problem in terms of state estimate x and
     error covariance matrix P (normal distributions)
   • propagate matrix P by linear (linearized) model
   • variance-minimizing analysis

 Ensemble-based Kalman filter:
   • sample state x and covariance matrix P by ensemble of
    model states
   • propagate x and P by integration of ensemble states
   • apply linear analysis of Kalman filter

   First filter in oceanography: “Ensemble Kalman Filter”
   (Evensen, 1994), second: SEIK (Pham, 1998)
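
The three steps listed above (sample x and P by an ensemble, propagate the members, apply the linear Kalman analysis) can be sketched for the analysis step as follows. This is a generic stochastic ensemble Kalman filter update with perturbed observations, written for illustration with NumPy; the function name, the toy dimensions and all numerical values are assumptions, and it is not the SEIK formulation discussed later.

```python
import numpy as np


def enkf_analysis(ensemble, y, H, R, rng):
    """Stochastic EnKF analysis step with perturbed observations.

    ensemble : (n, N) array, one model state per column
    y        : (m,) observation vector
    H        : (m, n) linear observation operator
    R        : (m, m) observation error covariance matrix
    """
    n_members = ensemble.shape[1]
    mean = ensemble.mean(axis=1, keepdims=True)
    anomalies = ensemble - mean
    P = anomalies @ anomalies.T / (n_members - 1)   # sample covariance
    S = H @ P @ H.T + R                             # innovation covariance
    K = np.linalg.solve(S, H @ P).T                 # Kalman gain P H^T S^-1
    # Each member is updated with its own perturbed copy of the observations.
    noise = rng.multivariate_normal(np.zeros(len(y)), R, size=n_members).T
    return ensemble + K @ (y[:, None] + noise - H @ ensemble)


rng = np.random.default_rng(0)
forecast_ens = rng.normal(size=(3, 50))   # toy forecast ensemble: n = 3, N = 50
H = np.array([[1.0, 0.0, 0.0]])           # observe the first state variable only
R = np.array([[0.1]])
y = np.array([2.0])

analysis_ens = enkf_analysis(forecast_ens, y, H, R, rng)
print("forecast mean:", forecast_ens.mean(axis=1))
print("analysis mean:", analysis_ens.mean(axis=1))
```

The covariance matrix is formed explicitly here only because the toy state is tiny; for the state dimensions quoted earlier this is not possible, and practical filters work with the ensemble anomalies directly.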
         Ensemble-based Kalman Filter

         Approximate probability distributions by ensembles
         (low-rank approximation)

         Questions:
             • How to generate initial ensemble?
             • How to resample after analysis?

         [Figure: sampling, ensemble forecast, analysis and resampling over time 0, time 1 and time 2]

         Sampling Example

         P_t = \begin{pmatrix} 3.0 & 1.0 & 0.0 \\ 1.0 & 3.0 & 0.0 \\ 0.0 & 0.0 & 0.01 \end{pmatrix}, \quad
         x_t = \begin{pmatrix} 0.0 \\ 0.0 \\ 0.0 \end{pmatrix}
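
One way to explore this example is to sample an ensemble from the leading eigenpairs of the covariance matrix. The sketch below is an assumption-laden illustration, not the sampling scheme used in the talk: it uses plain random sampling, a zero mean state, a rank of 2 and a deliberately large ensemble so that the sample covariance can be compared with the target. The eigenvalues of this matrix are 4, 2 and 0.01, so truncating to rank 2 discards only the variance 0.01 of the third variable.

```python
import numpy as np

P_t = np.array([[3.0, 1.0, 0.0],
                [1.0, 3.0, 0.0],
                [0.0, 0.0, 0.01]])
x_t = np.zeros(3)  # assumed mean state for this sketch


def sample_ensemble(mean, cov, rank, n_members, rng):
    """Draw members from the rank-truncated covariance matrix.

    Returns the (n, n_members) ensemble and the low-rank covariance it targets.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    lead = np.argsort(eigvals)[::-1][:rank]           # indices of the leading modes
    sqrt_factor = eigvecs[:, lead] * np.sqrt(eigvals[lead])   # n x rank
    z = rng.standard_normal((rank, n_members))        # random mode coefficients
    return mean[:, None] + sqrt_factor @ z, sqrt_factor @ sqrt_factor.T


rng = np.random.default_rng(1)
eigenvalues = np.linalg.eigvalsh(P_t)
ensemble, P_lowrank = sample_ensemble(x_t, P_t, rank=2, n_members=20000, rng=rng)

print("eigenvalues of P_t:", eigenvalues)
print("largest truncation error:", np.abs(P_t - P_lowrank).max())
print("sample covariance:\n", np.cov(ensemble))
```

With small ensembles the sample covariance from random draws deviates noticeably from the target, which is one motivation for the question of how best to generate the initial ensemble.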


         Issues of ensemble-based KFs

      No filter works without tuning
            forgetting factor/covariance inflation
            localization
      Other issues
            Optimal initialization unknown (is it important?)
            ensemble integration still costly
            Simulating model error
            Nonlinearity
            Non-Gaussian fields or observations
            Bias (i.e. systematic errors in model and observations)
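
As an illustration of the first tuning option, covariance inflation with a forgetting factor can be sketched as a rescaling of the ensemble anomalies. The function name and the convention used here (a factor ρ with 0 < ρ ≤ 1 that scales the covariance by 1/ρ) are assumptions for this example; individual filter implementations may apply the factor at a different point of the algorithm.

```python
import numpy as np


def inflate(ensemble, forgetting_factor):
    """Inflate the ensemble covariance by 1 / forgetting_factor.

    ensemble : (n, N) array, one state per column. The ensemble mean is unchanged;
    only the deviations from the mean are scaled.
    """
    if not 0.0 < forgetting_factor <= 1.0:
        raise ValueError("forgetting factor must lie in (0, 1]")
    mean = ensemble.mean(axis=1, keepdims=True)
    return mean + (ensemble - mean) / np.sqrt(forgetting_factor)


rng = np.random.default_rng(3)
ens = rng.normal(size=(4, 10))
inflated = inflate(ens, forgetting_factor=0.8)
print("variance ratio:", np.cov(inflated)[0, 0] / np.cov(ens)[0, 0])
```

A factor of 1 leaves the ensemble untouched; smaller values increase the spread and thereby give the observations more weight in the next analysis.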
         The SEIK* filter (Pham, 1998)

         Use factorization of covariance matrix P = V U V^T
          (singular value decomposition)
         Approximate P by truncation to leading singular values
         (low rank r « state dimension n)
         Forecast: Use ensemble of minimum size N = r+1
         Analysis:
             • Regular KF update of state estimate x
             • Update P by updating U
         Ensemble-transformation:
             • Transform ensemble states to represent new x and P
               (can be combined with Analysis)

           *Singular “Evolutive” Interpolated Kalman
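
The idea of updating P by updating only the small matrix U can be sketched with a generic reduced-rank Kalman analysis. This is an illustration under stated assumptions, not the exact SEIK equations: it takes a factorization P = V U V^T with an n x r matrix V, a linear observation operator H and observation error covariance R, and omits the forgetting factor and the ensemble transformation. The names and toy values are hypothetical.

```python
import numpy as np


def low_rank_analysis(x_f, V, U, y, H, R):
    """Kalman analysis for P_f = V U V^T that only inverts r x r matrices and R.

    Returns the analysis state and the updated U, so that P_a = V U_a V^T.
    """
    HV = H @ V                                    # modes seen by the observations
    R_inv_HV = np.linalg.solve(R, HV)             # R^-1 H V
    U_a = np.linalg.inv(np.linalg.inv(U) + HV.T @ R_inv_HV)
    x_a = x_f + V @ U_a @ R_inv_HV.T @ (y - H @ x_f)
    return x_a, U_a


rng = np.random.default_rng(2)
n, r = 5, 2
V, _ = np.linalg.qr(rng.normal(size=(n, r)))      # n x r with orthonormal columns
U = np.diag([4.0, 2.0])
H = np.eye(n)[:3]                                 # observe the first three variables
R = 0.5 * np.eye(3)
x_f = np.zeros(n)
y = np.array([1.0, -0.5, 0.3])

x_a, U_a = low_rank_analysis(x_f, V, U, y, H, R)

# Reference: full Kalman filter update with the same covariance matrix.
P_f = V @ U @ V.T
K = P_f @ H.T @ np.linalg.inv(H @ P_f @ H.T + R)
print("state matches full KF:", np.allclose(x_a, x_f + K @ (y - H @ x_f)))
print("covariance matches full KF:",
      np.allclose(V @ U_a @ V.T, (np.eye(n) - K @ H) @ P_f))
```

Both formulations give the same result for a covariance matrix of rank r, but the low-rank form never builds an n x n matrix, which matters when n is of order 10^7 or more.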
         Local SEIK filter

    • Analysis:
         • Update small regions
           (e.g. single water columns)
         • Consider only observations
           within cut-off distance
           (neglects long-range correlations)

    • Ensemble transformation:
         • Transform local ensemble
         • Use same transformation matrix
           in each local domain

Nerger, L., S. Danilov, W. Hiller, and J. Schröter. Ocean Dynamics 56 (2006) 634
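
The observation selection for a local analysis can be sketched as a simple distance test. This is an illustrative assumption: positions are treated as planar coordinates with Euclidean distance, whereas a global ocean model would use distances on the sphere, and the function name and coordinates are hypothetical.

```python
import numpy as np


def local_observation_indices(domain_pos, obs_pos, cutoff):
    """Return indices of observations within `cutoff` of a local analysis domain.

    domain_pos : (2,) position of the local domain (e.g. a water column)
    obs_pos    : (m, 2) observation positions
    """
    distances = np.linalg.norm(obs_pos - domain_pos, axis=1)
    return np.flatnonzero(distances <= cutoff)


obs_pos = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [0.5, 1.5]])
domains = np.array([[0.0, 0.0], [5.0, 4.0]])

# Each local domain is updated using only its nearby observations.
for domain_pos in domains:
    idx = local_observation_indices(domain_pos, obs_pos, cutoff=2.0)
    print("domain", domain_pos, "uses observations", idx)
```

Because each local domain sees only a few observations, the local analyses are small and independent of each other, so they can be distributed over many processors.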
                           Application example

                Global assimilation of satellite
                               ocean color data

       Satellite Ocean Color (Chlorophyll) Observations

          [Figure panels: Natural Color 3/16/2004; Chlorophyll Concentrations]

  Source: NASA “Visible Earth”, Image courtesy the SeaWiFS Project,
  NASA/GSFC, and Orbimage

         NASA Ocean Biogeochemical Model (NOBM)
  [Diagram: coupled model components and the fields exchanged between them]

  Forcing:       Dust (Iron), Sea Ice, Winds, SST, ozone, relative humidity,
                 pressure, precip. water, clouds, aerosols
  Components:    (OASIM), Biogeochemical Processes Model, Circulation,
                 Advection-diffusion
  Exchanged:     Spectral Irradiances, Optical, Temperature, Layer Depths
  Outputs:       Chlorophyll, Phytoplankton Groups, Primary Production,
                 Nutrients, DOC, DIC, pCO2, Spectral Irradiance/Radiance

  Global model grid:
                 domain: 84°S to 72°N
                 1 1/4° lon., 2/3° lat.
                 14 layers

   NOBM - Biogeochemical Processes Model
             Ecosystem Component

  [Diagram labels]
  Nutrients:       Si, NO3, Iron
  Phytoplankton:   Diatoms, Cyanobacteria, Coccolithophores
  Other:           Herbivores, Silica, N/C

        Configuration of assimilation system

        Very simple, for now:
         • Univariate assimilation of surface chlorophyll
         • Use LSEIK (local SEIK filter)
         • Simplification: Static state error covariance matrix
         • Observation errors chosen for good filtering performance
         • Perform analysis update on logarithms of the chlorophyll concentrations
         • Initial state estimate from free model
         • State error covariance matrix estimated from variability of
           8-year model trajectory around 3-month running mean
         • Analysis: Daily at model midnight for 9/1997 - 12/2004
         • Ensemble size 31

Nerger, L. and W. W. Gregg. J. Mar. Syst. 68 (2007) 237
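
The effect of performing the analysis update on logarithms can be sketched for a single chlorophyll value. The gain is passed in as a plain number here, which is an assumption made for illustration; in a filter it would follow from the error covariances in log space. The function name and the concentrations are hypothetical.

```python
import math


def log_analysis(chl_forecast, chl_observed, gain):
    """Scalar analysis update carried out on log(chlorophyll).

    Returns the analysis concentration in the original units (e.g. mg/m3).
    """
    log_f = math.log(chl_forecast)
    log_o = math.log(chl_observed)
    return math.exp(log_f + gain * (log_o - log_f))


chl_a = log_analysis(chl_forecast=0.4, chl_observed=0.1, gain=0.5)
print(f"analysis chlorophyll: {chl_a:.3f} mg/m3")
```

An update in log space weights forecast and observation geometrically rather than arithmetically, and the back-transformed analysis is always positive, which suits a concentration that spans several orders of magnitude.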
                    Re-analysis of Chlorophyll Data

                    [Figure: chlorophyll maps in mg/m3; merge model with data]

         Result: Re-analysis fields
             • Daily over 7 years
             • Spatially complete
             • Error comparable (regionally below) to that of satellite data
Nerger, L. and W. W. Gregg. J. Mar. Syst. 73 (2008) 87-102
      Primary Production

      Model: computed as depth-integrated product of growth rate
       times carbon-to-chlorophyll ratio
      VGPM: Vertically Generalized Production Model - satellite
       data only
      Primary production from assimilation consistent with VGPM

      Mean relative difference to VGPM:
          Free:              11.2%
          Assimilation:      -0.5%

      (VGPM: Behrenfeld, M.J., P.G. Falkowski. Photosynthetic rates derived
      from satellite-based chlorophyll concentration, Limnol. Oceanogr. 42 (1997) 1-20)

      Open Questions - Future Directions

  Developments are typically application-driven: “It works!”
   Foundation of ensemble-based Kalman filters:
        Ensemble-based Kalman filters
        are practical - but rely on many assumptions
            ⇒ Do we understand the influence of violating the assumptions?
        Working assimilation system requires pragmatism
            ⇒ What is the minimum set of empirical assumptions and ‘fixes’?

   Further algorithmic developments
        smoothing (correction back in time)
        adaptive error-subspace estimation
        mixed normal - log-normal assimilation
        algorithms without Gaussian assumptions
      Future applications

  Assimilation can be practical if ensemble forecasts are performed
   overhead of filter algorithm typically small
   Do we have observations that can constrain the model?
   Which assimilation method?
    (Adjoint, ensemble-based Kalman, Particles, other…)
   First (technical) implementation of assimilation system
         rather simple for Kalman filters
         changes to model code itself are small
   Making the assimilation system really work takes time


          • Ensemble-based filters enable assimilation with limited
            ensemble sizes, also for (some) nonlinear problems
          • Applying filter algorithms is not straightforward
            - but an incomplete set of typical ‘fixes’ exists
          • Properties of ensemble-based Kalman filters applied to
            non-linear systems are not well understood
          • Building a fully working assimilation system takes time
            (mostly for tuning)

          Note: Filter algorithms are available on request in form of
                 PDAF (Parallel Data Assimilation Framework)

                                     Thank you!

               Wolfgang Hiller, Jens Schröter (AWI)
             Watson Gregg, Nancy Casey (NASA/GSFC)