Statistical data assimilation for Hurricane forecasting Montserrat

Document Sample
Statistical data assimilation for Hurricane forecasting Montserrat Powered By Docstoc
					              SAMSI workshop. October 5, 2005.



              Statistical data
              assimilation for
           Hurricane forecasting


Montserrat Fuentes, Kristen Foley and
              Lian Xie
Statistics Department, and Marine Earth Atmospheric Sciences
                     Department, NCSU


                             1
                         Motivation

• Hurricane Katrina struck the coasts of Louisiana and Mississippi
  on August 29, sending a massive storm surge that pounded
  against the levees intended to protect New Orleans from flooding.
  The levees gave way after the hurricane’s eye had passed,
  flooding most of the city and causing unprecedented damage.
  Katrina then slammed into Mississippi with only slightly less
  force and inflicted catastrophic damage to the cities of Gulfport
  and Biloxi.
• The number killed in Louisiana, Mississippi, Alabama and
  Florida has topped 1,000, far more than have been killed by a
  hurricane in the U.S. in decades.

                                2
• The Princeton Ocean Model (POM) is one of the models used to
  simulate the coastal ocean response to hurricane force winds and
  pressure.
• “We need to be smarter about how we ingest and use
  observations in real time. We have a lot of data that doesn’t
  make it into the current generation of forecast models
  effectively.” quoted from the“Federal Computer Week”




                                3
OUR OBJECTIVE:
Assimilating data in ocean numerical models to improve the
prediction of storm surge–the rise of water caused by tropical cyclone
wind and pressure.




                                  4
Data Assimilation for Coastal Ocean Prediction
                                   Input Fields
                               to initialize model
                            Forcing
                           Parameters


                                 Numerical Ocean
   Observations
 (water level stations,              Model
  buoys, satellite, etc)         (Princeton Ocean Model)




      FILTERING METHODS




                                        5
                           Case study


Numerical model
The coastal ocean circulation model is based on the Princeton Ocean
Model (POM), one of the best validated, three-dimensional,
time-dependent coastal ocean models (Mellor, 2003).
POM has been adopted by NOAA as the operational coastal ocean
forecasting system for the U.S. East Coast. It has been further
extended by Xie et al. (2004) for storm surge prediction, including
simulating the wetting and drying process over the inter-tidal and
salt marsh and flooding under hurricane forcing.



                                  6
Input Data
Surface wind fields are the most important input variable used to initialize
the numerical ocean model to simulate the coastal ocean response to
hurricanes. A wind field is decomposed into East-West) (u) and
North-South (v) orthogonal components.
There are three sets of available wind data to be used in hurricane
forecasting and hindcasting.

  • Data provided by the National Hurricane Center (NHC) specifies the
    location of the storm center and the central pressure.

  • Hourly wind speed data at 24 buoy locations along the Eastern coast
    is also posted by NOAA’s National Data Buoy Center (NDBC).

  • Finally, gridded wind field data is available from NOAA’s Hurricane
    Research Division (HRD).



                                     7
                                                                                                 CHLV2
                                                                                                         44014
                                                                                                 DUCN7



                 35                                                                          CLKN7               41001


                                                                                     FPSN7

                                                                 FBIS1
                                                                         41004
                                                                                                     41002

                                                       41008
      latitude

                 30




                                     KTNF1           SAUF1
                                                                         Sept. 15 1330Z
                                        CDRF1
                                                                             41010
                             42036                           41009



                                             VENF1
                                                                 LKWF1 SPGF1


                                                               FWYF1
                 25




                                                        MLRF1
                                                     LONF1
                                         DRYF1      SMKF1
                                               SANF1




                       −85                              −80                                          −75                 −70

                                                       longitude




Figure 1: Observed data for Hurricane Floyd for Sept. 15 hour 1330Z. The
box is the domain of the gridded HRD data available for this time period.

                                                             8
Currently for storm surge forecasting a deterministic wind model is
used, referred to as the Holland wind model (Holland 1980) which
determines the wind speed under cyclostrophic balance.
The Holland wind model for surface wind velocity at a given location
is:
                           B                                   1/2
             B    Rmax
   W (ri ) =                   (Pn − Pc ) exp{−(Rmax/ri )B }         (1)
             ρ     ri

where ri is the radius from the storm center to site si at time t and
φi is the angle from the storm center to site si . Pn is the ambient
pressure, Pc is the hurricane central pressure, ρ is the air density
(fixed at 1.2 kg m−3 ), and ri is the distance from the hurricane
center to the location of the wind measurement, si .


                                    9
                                                 u

                                         W               v
                                                     φ



                                             r

                                     φ
         N−S




                                E−W




Figure 2: U-V wind decomposition for symmetric Holland wind model.

                                10
State variable: Water Level
The state variable of interest is the change in surface water elevation
which is used for storm surge forecasting.
We use data from 5 coastal water level stations maintained by
NOAA’s National Ocean Service (NOS) Center for Operational
Oceanographic Products and Services (CO-OPS).




                                  11
Figure 3: Observed water level data. Time series plots of observed water
level (adjusted for tides and location) in meters at 5 water level stations
along the coast.                     12
                      Data Assimilation


We can represent the two sources of information that are merged in
data assimilation in the form of an observation equation and a state
or system equation that are functions of the state vector X t of the
underlying (unobservable) system.
Notation:
             t
X t (ti ) = Xi(n×1) – true (unknown) state vector at time ti
  o
Yi(p×1) – new vector of observations
 a
Xi(n×1) – analysis state vector, best estimate of the unknown state at
      time ti based on data assimilation methods
 f
Xi(n×1) – output from the numerical model


                                  13
                        Data Assimilation


                       Observation Equation:
                                       t
                            Yio = Hi [Xi ] +   o
                                               i                   (2)
                          State Equation:
                           t        t
                          Xi = Mi [Xi−1 ] + vi                     (3)
            o
    where   i   ∼ (0, Ri ), vi ∼ (0, Qi ) and Cov( o , vi ) = 0.
                                                   i



• Hi [.] is an observation operator which interpolates the model
  variables to the observation locations and transforms the model
  variables to the observed variables.
• Mi [.] represents the forward integration of the current state by
  the numerical model.

                                     14
                         DA Example


• The State Vector: The change in surface water elevation at n
  grid locations (s1 , ..., sn ) around the center of Hurricane Floyd on
                                 t
  Sept. 16th, hour 0130. Xi
• The Data: Hourly water elevation values (adjusted for tides)
  from 5-10 coastal water level stations (w1 , ...wp ). Yio
• The Model: Given the surface wind fields as forcings of the
  ocean model, POM is evolved forward in time to create a forecast
                                                           f
  for the water elevation values at the n grid locations. Xi




                                  15
             Standard Kalman Filter (KF)


                       Observation Equation:
                                      t
                            Yio = Hi Xi +    o
                                             i                          (4)
                          State Equation:
                            t       t
                           Xi = Mi Xi−1 + vi                            (5)
             o
     where   i   ∼ N (0, Ri ), vi ∼ N (0, Qi ) and Cov( o , vi ) = 0.
                                                        i

• Standard Kalman filtering assumes that observation and
  forecasting error is Normal.
• KF also assumes that the observation and state equations are
  linear systems of the state variables. (i.e. Hi(p×n) , Mi(n×n) )


                                    16
Kalman Filter: Best Linear Unbiased Estimator

                                         f
                               a
• We want a linear estimator: Xi = W1,i Xi + W2,i Yio
                                    a        t
• We want an unbiased estimator: E(Xi ) = E(Xi )

                =⇒ W1,i = I − W2,i Hi                       (6)
                         f                   f
                    a
                =⇒ Xi = Xi + W2,i (Yio − Hi Xi )            (7)

• We want the ”best” estimator: Find Wi that minimizes the mean
  square estimation (analyis) error, E[ea (ea )T ] where
                                        i   i
         a     t
  ea = Xi − Xi .
   i

• The optimal weight matrix = the Kalman gain matrix:

             Wi(n×p) = Ki = Pif Hi (Ri + Hi Pif Hi )−1
                                 T               T
                                                            (8)

                              17
  where Pif = Mi Pi−1 MiT + Qi .
                  a

                t
• The BLUE for Xi is
                          f                 f
                     a
                    Xi = Xi + Ki (Yio − Hi Xi )                   (9)

          a
• Since Xi an unbiased estimator, the analysis error covariance
  can be found by:

                 Pia = E[ea (ea )T ] = (I − Ki Hi )Pif
                          i   i                               (10)

• The Kalman filter is very similar to other data assimilation
  methods such as Optimal Interpolation (OI) and 3D-Var
  methods except that in these methods the forecast or background
  error covariance is estimated once and then remains constant,
  Pif = B, rather than being advanced by the numerical model.


                                18
             Adapting the Kalman Filter


• In atmospheric science applications of DA the state vector of
  interest is of very high dimension and the dynamic systems being
  modeled are unstable and possibly chaotic. This implies:
  – The numerical model Mi [.] is nonlinear.


  – The KF update step for the error covariance is
    extremely
    computationally expensive.


  – Normality assumption for the error terms may not be
    realistic.


                               19
            Ensemble Kalman Filter (EnsKF)


The Ensemble Kalman Filter uses a Monte Carlo approach to
generate samples of the state vector and carry out an ensemble of
data assimilation cycles. This ensemble is then used to estimate the
forecast error covariance. Forecast Step:
 • Sample Xj (ti−1 ) ∼ P (X a (ti−1 ), P a (ti−1 )), j = 1, ..., K
            f
 • Compute Xj (ti ) = Mj [Xj (ti−1 )], j = 1, ..., K
 • Calculate sample mean and covariance:
                    K                            K
    ˆ f (ti ) = 1
    X                    f   ˆ f (ti ) = 1
                        Xj , P                          f   ˆ      f
                                                      [Xj − X f ][Xj − X f ]T
                                                                       ˆ
                K   j=1
                                        K −1    j=1



                                     20
         Ensemble Kalman Filter (EnsKF)


• Analysis Step:
  – Perturb the observations at time ti by adding random noise
    (typically Gaussian):
    Yj∗ (ti ) = Y o (ti ) + ηj ,
    ηj ∼ N (0, R(p×p) ), j = 1, ..., K
  – Update each member of the forecast sample with the KF
                               f             f
    analysis step: Xj (ti ) = Xj + K(Yj∗ − HXj ), j = 1, ..., K
    K = P f H T (R + H P f H T )−1
         ˆ                ˆ
  – This now serves as a sample for time ti and the process
    repeats.


                               21
           Ensemble Kalman Filter (EnsKF)


• EnsKF can be used with non-Gaussian error distributions.
   – However for non-Gaussian distributions, estimates for
     only the first two moments of the update distribution
     may not be sufficient.
• It can be shown that for large ensemble size, K, the update
  Xj (ti ), j = 1, ..., K, is a sample from the update distribution
  from the standard KF:
                             f               f
  Xj (ti ) ∼ N (X a (ti ) = Xi +Ki (Yio −Hi Xi ), P a (ti ) = (I−Ki Hi )Pif )
   – However, KF was derived for the Gaussian case, so
     for non-normal errors we don’t know anything about
     the properties of these estimators.

                                   22
         Bayesian Framework: Normal Example


Under the assumptions of linearity and normality of the standard KF:
  • Prior:                                 f
                    P (Xi |Yi−1 ) ∼ N (Xi , Pif )
                           t    o

                          t    o           t
    Likelihood: P (Yio |Xi , Yi−1 ) ∼ N (Xi , Ri )
                    t         o           a
    Posterior: P (Xi |Yio , Yi−1 ) ∼ N (Xi , Pia )


     – The posterior mean is equal to the KF update.
                             a    f                 f
                            Xi = Xi + Ki (Yio − Hi Xi )

     – The posterior covariance is equal to the KF analysis error
       covariance.
                              Pia = (I − Ki Hi )Pif
  • A Bayesian framework allows us to relax the assumption of linearity
    and normality and take into account different sources of uncertainty.

                                      23
                        EnsKF for Floyd


A symmetric Holland wind model is used to initialize POM based on
track information available from NOAA on September 15th and 16th,
1999.
The state variable of interest is the change in surface water elevation
caused by the hurricane wind and pressure, referred to as storm
surge.




                                  24
Figure 4: Land relief and bathymetric data (in meters) for our domain.

                                  25
Figure 5: Wind fields from Holland Model based on best track informa-
tion for the center location and central pressure of Floyd on September
                                    26
15th, hour 03 through September 16th, hour 03Z.
A statistical ensemble of initial water elevation fields was generated
using a Gaussian distribution centered at zero meters with an
exponential spatial covariance model.
The ensemble of initial fields was used to initialize the model. The
Ensemble Kalman Filter was used to update the state vector using
observed data from 5 water level stations in NC, SC and GA within
the model domain.




                                  27
Figure 6: Updated ensemble of water level fields (referred to as analysis
fields) using the Ensemble Kalman Filter to combine model output and
observed values



                                   28
Figure 7: Forecast Error. The updated ensemble of water level fields are
used to initialize the model and forecast forward 1, 3 and 6 hours ahead of
the updating time. This graph shows the effect of the EnsKF step on the
model output compared to the observed values for three of the water level
stations. The observed value used in update step is represented by the solid
green line, the model value by the dashed-black line, the ensemble mean by
the solid blue line, and the ensemble members by dotted-blue lines.
                                     29
                   Improving input fields


Surface wind fields are the most important input variable used to
initialize the numerical ocean model to simulate the coastal ocean
response to hurricanes.
Currently input wind fields are obtained from the symmetric Holland
model. We improve input fields by combining Holland model with
observed wind fields.




                                 30
  Measurement error              Bias parameters            Measurement error
      s ub2, s vb2                    au, av                    s ua2, s va2

      Buoy Data                           HRD Analysis Fields
        Vb(s)                                  Va(s)



                       True Values
                          V(s)


Physical Parameterization            Stochastic Component
        V H (s)                               %
                                             V ( s)
    (Holland Model)                 (Linear Model of Coreg.)




                            31
Data Model


                  Vb (s) = V(s) +
                  ˆ                   b
                                          (s)                    (11)
                  Va (s) = a(s) + b(s)V(s) +
                  ˆ                             a
                                                    (s)          (12)

Vb represents buoy data, Va represents HRD data.
ˆ                        ˆ

Underlying true wind fields

                       V(s) = VH (s) + V(s)
                                       ˜                         (13)

The first component VH = (uH , v H )T captures the large scale
structure defined by the axissymmetric Holland wind model. The
Holland model is based on the physical forces driving the hurricane
winds and plays an important role in prediction. However, this
deterministic model does not account for any irregularities or

                                 32
asymmetries in a particular storm.
The stochastic component accounts for these features in the wind
field and is modeled with a non-intrinsic Linear Model of
Coregionalization (Wackernagel 1998). Let V = (˜, v )T and
                                            ˜    u ˜

                           V(s) = Aw(s)
                           ˜                                    (14)

where A is a 2 × 2 lower triangular weight matrix that will determine
the covariance between the u and v wind components and
w(s) = (w1 (s), w2 (s))T are independent underlying spatial processes
with different spatial correlation functions.
We are evaluating the improvement in the performance of POM by
using the proposed framework for the input wind fields.



                                 33