Document Sample
					    Proceedings of the Third International Conference on Mathematics and Natural Sciences
                                         (ICMNS 2010)


                                      Heri Kuswanto
        Department of Statistics, Faculty of Mathematics and Natural Science ,
          Institute Technology of Sepuluh Nopember, Surabaya, Indonesia

Abstract. Ensemble forecasts are usually underdispersed and hence, it needs to be
calibrated. This paper proposes a theoretical framework of new calibrated method
for ensemble forecasts of non-normally distributed climate variables using meta
Gaussian distribution. The method combines the strength of Bayesian Model
Averaging (BMA) with climatological distribution in order to generate a reliable
calibrated forecast. Meta Gaussian offers flexibility in the likelihood parameter
estimation and no transformation is necessarily to do.
Keywords: Ensemble forec asts, Non-normal, Meta-Gaussian, Hierarc hic al model

1         Introduction
Ensemble forecast is currently being a popular method used in many ensemble
prediction system (ESP) for instance NAEFS (America), the European Centre for
Medium-Range Weather Forecasts (ECMWF), etc. The raw ensemble forecast is
usually underdispersed and uncalibrated. Calibration is then necessarily to do in
order to generate a reliable probabilistic forecasts. Bayesian model averaging
(BMA), proposed by Raftery et al. (2005), is the most popular calibration method
used in the EPS centers. The calibration is basically done by correcting the bias
and adjusting the variance, and it was firstly introduced for calibrating
temperature ensemble forecasts and mean sea level pressure, which are cases of
normally distributed climate variable. The BMA method has bee n extended to
calibrate non-normally climate variables e.g streamflow [Duan et al. (2007),Vrug et
al. (2008], , wind speed [Sloughter et al. (2010)], precipitation [Sloughter et al.
(2007)], etc.

More recent work of Bishop and Shanley (2008) argued that the BMA better
describes the likelihood of observing the forecast given the truth. Using BMA solely
for calibration does not cover uncertainty induced by the climatology, which makes
it fail to calibrate forecast in extreme events. This argument leads to an idea of
developing a new calibration method which quantifies the climatological
uncertainty. It is done by simply combining BMA concept with climatological prior,
by means of Bayes’ theorem. This method is in line with the proposal of
Krzysztofowicz and Evans (2008). Bayesian processor of forecast (BPF). The
proposed method is not satisfying in the sense that it makes a simplification by

summarizing the individual forecast information into one point value (i.e. mean of
the ensemble) in the likelihood function. This approximation may lose some
important information in the ensemble, and the application is limited to the normal

Kuswanto et al. (2010) proposes a new calibration method using the same concept
as Bishop and Shanley (2008), where the BMA concept is firstly revised by framing
it onto hierearchical model in order to maintain the individual member
information. The method successfully calibrates the ensemble temperature
forecasts, and outperforms the standard BMA as well as the climatological forecast.
This paper extends the concept to non-normally climate variables. Unlike the
normal case which can generate a close form of predictive distribution, non -normal
case has non-conjugate prior which offers some difficulties. Simple distribution
adjustment cannot be applied, and standard posterior distribution cannot be
obtained. To solve this problem, we propose to employ normal quantile transform
(NQT) to form a meta-Gaussian likelihood function, in order to extract the
dependence structure between the forecasts and observation, without doing any
deseasonalization or transformation to the original series in order to generate a
stationary series.

2       Calibration method
The proposed calibration method includes of defining the hierarchical model for the
BMA, specificing the prior and likelihood distributions, and developing the
predictive distribution. The basic concept of the hierarchical model in this paper is
the same as Kuswanto et al. (2010). The remaining steps are a new concept
introduced for non-normally distributed cases.

2.1 Hierarchical model of Bayesian model averaging

Suppose that     be forecasts of quantity     produced by the EPS, where     is
drawn from a finite mixture probability models                , and    be model
from which     is drawn with index      so that         . By defining that
                                              as the prior of model    , then by law
of total probability


The BMA hypothesis of conditional independence makes it possible to have
                                      , and hence


The ensemble models are assumed as exchangeable, therefore prior                       .
Note that it is incorrect to approach                             by the conditional
distribution             The latter case consists of cases where and cases where
       ,    and consequently the estimation        of                       would be
contaminated by many non-relevant cases where              . It is expected that the
relationship between       and    is more reliable when           . This problem can
be solved by assuming that there is an unobserved latent variable          such that
            if and only if     . We then obtain the hierarchical model as in Figure
1 (see [6] for details).

The latent variable     represents the quantity that the weather forecasting model
is capable of simulating reliable forecast with some error caused by uncertainty in
initial conditions. We shall assume that there exists a linear relationship between
   and y as well as between     and y such as



           Figure 1. Directed Acyclic Graph for hierarchical model

                          ξ                               (3)


The regression model (3) represents the relationship between the observation y and
a member of s of the ensemble forecasts when the model is correct, that is when
         , which happen when            , while (4) describes the relationship between
y and     , which is less reliable than (3). Hence, the error term of the regression for
the latter case should be greater, indicated by an addition of extra noise . We
thus assume that this term is constant in order to simplify the estimation
procedure. The following subsections briefly describe each component of the
predictive distribution

where            is prior distribution and                            is likelihood

2.2 Prior likelihood

The climatological prior distribution      characterizes the natural variability of
the observation y. In an ideal case, the EPS has a longer record of observation y
than the ensemble forecasts X. Two natural questions arise in this case i.e. how do
we approximate the distribution of y? and should we use all available information
to estimate the parameter of the climatological prior used in the predictive

The choice of the distribution depends on the prior information about the climate
variable of interest, for instance streamflow and wind-speed frequently have
Gamma or Weibull distribution, precipitation can be approximated by discrete
distribution, etc. The distribution of the variable can be fit easily. The second
question should be answered by considering the fact that the climatological
observation y is highly affected by seasonal variability or temporal trend induced
by climate change. The climatological prior distribution should be estimated from
the climatological observation data which minimizes the seasonality effect. To cope
with this condition, we propose to use the following specification for estimating the
climatological distribution, instead of applying a deseasonalization or
transformation to the time original series. Suppose that we have past 30 years of
climatological record in the database (excluding the current year) and let d be the
date of interest. The climatological distribution for forecast on date d is estimated
from a collection of observations within 5 days before and after d extracted every
year within last 30 years. Of course there is no formal justi fication to set 5 days as
the interval. In fact, Kuswanto et al. (2010) showed that increases or decreased the
interval length from 3 to 7 days does not change the distribution.

2.3 Likelihood function

The likelihood describes the density of  conditional on y. It basically explains the
stochastic dependence between both variables. Hence, the variables needed to form
the likelihood function is the joint sample between      and y. In our case, the
likelihood represents the density of       when          given the value of y, that
is         Such as the prior, we need to specify the distribution of the likelihood
and the sample used to estimate the parameters. Unlike the normal distribution
which is a conjugate prior of normal likelihood, the marginal distribution of
(Gamma or Weibull distribution) is not a conjugate of normal likelihood. Hence, a
close form predictive distribution cannot be obtained. A natural way to solve this
problem is by doing a transformation to the original series e.g. BoxCox
transformation in order to obtain a stationary and normally distributed series.
However, such tranfromation may lose many important information about the
untrasformed series for instance higher moments of the distribution. Moreover,
inverse transformation is difficult to do.

We propose to use meta-Gaussian distribution of Kelly and Krzysztofowitz (2008),
by firstly apply normal quantile transform (NQT) to the joint sample. There are at
least two potential advantages of using meta-Gaussian. First, the transformation is
based on the rank series and hence the stochastic dependence between
observation and forecasts reflects the original dependence. Second, inverse
transform from the meta-Gaussian distribution is very easy to do i.e. we only need
to multiply the cumulative distribution (CDF) with its Jacobian.

Suppose that we have joint sample of          , then the NQT of both variables are


where is the standard normal distribution function and           is its inverse. From
these transformation, according to (3) and (4) we have



where    is the latent variable in the transformed space. The parameters and
can be estimated by maximum likelihood or ordinary least square. Moreover, we
also have standard error of the latent variable    .

We now discuss the choice of the sample by which the parameters of the likelihood
distribution should be estimated. The following arguments describe conditions
which support the choice of the sample. As the likelihood sample is a composition
of climatological observation and ensemble forecasts, then the stochastic
dependence between both variables might be influenced by some conditions e.g.
weather condition among others, and therefore it changes adaptively. Moreover, it
is very likely that the sample is suffered from the seasonality effect. Seasonality
and non-stationarity lead to a non-constant likelihood parameters overtime i.e. the
forecasts in winter may be more informative than in summer and consequently the
stochastic dependence in winter is stronger than in summer season (Kelly and
Krzysztofowitz (1997)). The proposed method does not involve any
deseasonalization of the time series in order to remove any seasonality effect or
nonstationarity. The procedure as employed for the prior speci fication is no longer
applicable as we usually have only short record of the joint sample. As an
alternative to this, the likelihood joint sample in this paper is speci fied by the
training window, which is n sample preceding the forecast date. In practice, the
length of the training window should not be a very short period in order to have
informative parameters.

2.4 Posterior distribution

The posterior distribution is obtained by s imply combining the climatological prior
with the likelihood function. It is used to generate the probabilistic predictive
distribution of the calibrated ensemble forecasts . Given the prior distribution
with its parameters (A, B and             , marginal distribution             with its

parameters, and the dependence parameters obtained from the likelihood, then the
posterior CDF can be expressed as follows:

It is straightforward to show that after multiplying this with its Jacobian, the
probability density function (PDF) or the predictive distribution in the original
space has form of


The mean of the predictive distribution is a weighted kernel between the
climatological forecast and ensemble forecasts on the date. The mean of the
predictive distribution depends on s, therefore, the value can be same or different
for each member according to the ensemble forecast at the calibrated date.
Meanwhile, the variance is equal for all members. The predictive distribution of
the forecasts on the date is the one that has optimum      . The optimum standard
deviation will be determined by grid selection procedure, the one which optimizes
the objective function for instance Continuous Ranked P robability Score (CRPS) of
Unger (1985) and Herbash (2000) among others.

3       Conclusion
We proposed a new calibration method of ensemble forecasts for non-normally
distributed climate variables. The proposed method involves of revising the BMA
framework into hierarchical model, and uses Bayes’s theorem in order to generate

the predictive distribution. Two uncertainties are incorporated in the predictive
distribution i.e. climatological uncertainty and forecasts uncertainty. It solves the
problem of underdispersion of the ensemble forecasts. The dependence parameters
in the likelihood function are estimated by firstly applying normal quantile
transform (NQT) to the series in order to generate a meta-Gaussian distribution.
Proper definition of the sample choice of each predictive component allows us to
have series which is free of seasonality effect and temporal trend. Hence, no
transformation or standardization is needed to obtain stationary series, which is a
necessary condition of applying the normal quant ile transform.

The author would like to thank Department of Mathematics and Statistics, Laval
University Canada for giving opportunity to conduct this research during author’s
Postdoctoral research.

[1] Bishop, C. and K. Shanley (2008), Bayesian Model Averaging's problematic
    treatment of extreme weather and a paradigm shift that fixes it, Monthly
    Weather Review, 136, 4641-4652.
[2] Duan, Q. and N. K. Ajami, X. Gao and S. Sorooshian (2007), Multi -model
    ensemble hydrologic prediction using Bayesian model averaging, Advance in
    Water Resources, 30, 1371-1386.
[3] Hersbach, H. (2000), Decomposition of the continuous ranked probability
    score for ensemble prediction systems, Weather Forec asting, 15, 559-570.
[4] Kelly, K. S. and R. Krzysztofowitz (1997). A bivariate meta-Gaussian density for
    use in hydrology, Stochastic Hydrol. Hydraul., 11, 17-31.
[5] Krzysztofowicz, R. and B. Evans (2008), Probabilistic Forecasts from the
    National Digital Forecast Database, Weather and Forecasting, 23, 270-289.
[6] Kuswanto, H. , V. Fortin., A-C. Favre and B. N. Trinh (2010), Combining
    Bayesian processor of ensemble and Bayesian model averaging for reliable
    weather forecasts. Prepared for submission.
[7] Raftery, A. E., T. Gneiting, F. Balabdaqui, and M. Polakowski (2005), Using
    Bayesian model to calibrate forecast ensembles, Monthly Weather Review, 133,
[8] Sloughter, J., A. E. Raftery, T. Gneiting, and C. Fraley (2007), Probabilistic
    Quantitative Precipitation Forecasting Using Bayesian Model Averaging,
    Monthly Weather Review, 135, 3209-3220.

[9] Sloughter, J., T. Gneiting and A. E. Raftery (201 0),     Probabilistic Wind
    Speed Forecasting using Ensembles and Bayesian Model
    Averaging, Journal of the Americ an Statistic al Association, 105, 25-35.

[10] Unger, D. A. (1985), A method to estimate the continuous ranked probability
     score. Preprints, Ninth Conf. on Probability and Statistics in Athmospheric
     Sciences, Virginia Beach, VA, Amer. Meteor. Soc., 206-213.
[11] Vrugt, J. A., Cees G. H. Diks, and M. P., Clark (2008), Ensemble Baye sian
     model averaging using Markov Chain Monte Carlo sampling, Environ. Fluid
     Mech., 8, 579-595.



Shared By: