Document Sample

Proceedings of the Third International Conference on Mathematics and Natural Sciences (ICMNS 2010) NEW CALIBRATION METHOD FOR ENSEMBLE FORECAST OF NON-NORMALLY DISTRIBUTED CLIMATE VARIABLES USING META-GAUSIAN DISTRIBUTION Heri Kuswanto Department of Statistics, Faculty of Mathematics and Natural Science , Institute Technology of Sepuluh Nopember, Surabaya, Indonesia Email: heri_k@statistika.its.ac.id Abstract. Ensemble forecasts are usually underdispersed and hence, it needs to be calibrated. This paper proposes a theoretical framework of new calibrated method for ensemble forecasts of non-normally distributed climate variables using meta Gaussian distribution. The method combines the strength of Bayesian Model Averaging (BMA) with climatological distribution in order to generate a reliable calibrated forecast. Meta Gaussian offers flexibility in the likelihood parameter estimation and no transformation is necessarily to do. Keywords: Ensemble forec asts, Non-normal, Meta-Gaussian, Hierarc hic al model 1 Introduction Ensemble forecast is currently being a popular method used in many ensemble prediction system (ESP) for instance NAEFS (America), the European Centre for Medium-Range Weather Forecasts (ECMWF), etc. The raw ensemble forecast is usually underdispersed and uncalibrated. Calibration is then necessarily to do in order to generate a reliable probabilistic forecasts. Bayesian model averaging (BMA), proposed by Raftery et al. (2005), is the most popular calibration method used in the EPS centers. The calibration is basically done by correcting the bias and adjusting the variance, and it was firstly introduced for calibrating temperature ensemble forecasts and mean sea level pressure, which are cases of normally distributed climate variable. The BMA method has bee n extended to calibrate non-normally climate variables e.g streamflow [Duan et al. (2007),Vrug et al. (2008], , wind speed [Sloughter et al. (2010)], precipitation [Sloughter et al. (2007)], etc. More recent work of Bishop and Shanley (2008) argued that the BMA better describes the likelihood of observing the forecast given the truth. Using BMA solely for calibration does not cover uncertainty induced by the climatology, which makes it fail to calibrate forecast in extreme events. This argument leads to an idea of developing a new calibration method which quantifies the climatological uncertainty. It is done by simply combining BMA concept with climatological prior, by means of Bayes’ theorem. This method is in line with the proposal of Krzysztofowicz and Evans (2008). Bayesian processor of forecast (BPF). The proposed method is not satisfying in the sense that it makes a simplification by 932 summarizing the individual forecast information into one point value (i.e. mean of the ensemble) in the likelihood function. This approximation may lose some important information in the ensemble, and the application is limited to the normal variable. Kuswanto et al. (2010) proposes a new calibration method using the same concept as Bishop and Shanley (2008), where the BMA concept is firstly revised by framing it onto hierearchical model in order to maintain the individual member information. The method successfully calibrates the ensemble temperature forecasts, and outperforms the standard BMA as well as the climatological forecast. This paper extends the concept to non-normally climate variables. Unlike the normal case which can generate a close form of predictive distribution, non -normal case has non-conjugate prior which offers some difficulties. Simple distribution adjustment cannot be applied, and standard posterior distribution cannot be obtained. To solve this problem, we propose to employ normal quantile transform (NQT) to form a meta-Gaussian likelihood function, in order to extract the dependence structure between the forecasts and observation, without doing any deseasonalization or transformation to the original series in order to generate a stationary series. 2 Calibration method The proposed calibration method includes of defining the hierarchical model for the BMA, specificing the prior and likelihood distributions, and developing the predictive distribution. The basic concept of the hierarchical model in this paper is the same as Kuswanto et al. (2010). The remaining steps are a new concept introduced for non-normally distributed cases. 2.1 Hierarchical model of Bayesian model averaging Suppose that be forecasts of quantity produced by the EPS, where is drawn from a finite mixture probability models , and be model from which is drawn with index so that . By defining that as the prior of model , then by law of total probability (1). The BMA hypothesis of conditional independence makes it possible to have , and hence (2) 933 The ensemble models are assumed as exchangeable, therefore prior . Note that it is incorrect to approach by the conditional distribution The latter case consists of cases where and cases where , and consequently the estimation of would be contaminated by many non-relevant cases where . It is expected that the relationship between and is more reliable when . This problem can be solved by assuming that there is an unobserved latent variable such that if and only if . We then obtain the hierarchical model as in Figure 1 (see [6] for details). The latent variable represents the quantity that the weather forecasting model is capable of simulating reliable forecast with some error caused by uncertainty in initial conditions. We shall assume that there exists a linear relationship between and y as well as between and y such as µ y … Figure 1. Directed Acyclic Graph for hierarchical model ξ (3) (4) The regression model (3) represents the relationship between the observation y and a member of s of the ensemble forecasts when the model is correct, that is when , which happen when , while (4) describes the relationship between y and , which is less reliable than (3). Hence, the error term of the regression for the latter case should be greater, indicated by an addition of extra noise . We thus assume that this term is constant in order to simplify the estimation procedure. The following subsections briefly describe each component of the predictive distribution 934 (5) where is prior distribution and is likelihood function. 2.2 Prior likelihood The climatological prior distribution characterizes the natural variability of the observation y. In an ideal case, the EPS has a longer record of observation y than the ensemble forecasts X. Two natural questions arise in this case i.e. how do we approximate the distribution of y? and should we use all available information to estimate the parameter of the climatological prior used in the predictive prediction? The choice of the distribution depends on the prior information about the climate variable of interest, for instance streamflow and wind-speed frequently have Gamma or Weibull distribution, precipitation can be approximated by discrete distribution, etc. The distribution of the variable can be fit easily. The second question should be answered by considering the fact that the climatological observation y is highly affected by seasonal variability or temporal trend induced by climate change. The climatological prior distribution should be estimated from the climatological observation data which minimizes the seasonality effect. To cope with this condition, we propose to use the following specification for estimating the climatological distribution, instead of applying a deseasonalization or transformation to the time original series. Suppose that we have past 30 years of climatological record in the database (excluding the current year) and let d be the date of interest. The climatological distribution for forecast on date d is estimated from a collection of observations within 5 days before and after d extracted every year within last 30 years. Of course there is no formal justi fication to set 5 days as the interval. In fact, Kuswanto et al. (2010) showed that increases or decreased the interval length from 3 to 7 days does not change the distribution. 2.3 Likelihood function The likelihood describes the density of conditional on y. It basically explains the stochastic dependence between both variables. Hence, the variables needed to form the likelihood function is the joint sample between and y. In our case, the likelihood represents the density of when given the value of y, that is Such as the prior, we need to specify the distribution of the likelihood and the sample used to estimate the parameters. Unlike the normal distribution which is a conjugate prior of normal likelihood, the marginal distribution of (Gamma or Weibull distribution) is not a conjugate of normal likelihood. Hence, a close form predictive distribution cannot be obtained. A natural way to solve this problem is by doing a transformation to the original series e.g. BoxCox transformation in order to obtain a stationary and normally distributed series. However, such tranfromation may lose many important information about the untrasformed series for instance higher moments of the distribution. Moreover, inverse transformation is difficult to do. 935 We propose to use meta-Gaussian distribution of Kelly and Krzysztofowitz (2008), by firstly apply normal quantile transform (NQT) to the joint sample. There are at least two potential advantages of using meta-Gaussian. First, the transformation is based on the rank series and hence the stochastic dependence between observation and forecasts reflects the original dependence. Second, inverse transform from the meta-Gaussian distribution is very easy to do i.e. we only need to multiply the cumulative distribution (CDF) with its Jacobian. Suppose that we have joint sample of , then the NQT of both variables are and where is the standard normal distribution function and is its inverse. From these transformation, according to (3) and (4) we have (6) (7) where is the latent variable in the transformed space. The parameters and can be estimated by maximum likelihood or ordinary least square. Moreover, we also have standard error of the latent variable . We now discuss the choice of the sample by which the parameters of the likelihood distribution should be estimated. The following arguments describe conditions which support the choice of the sample. As the likelihood sample is a composition of climatological observation and ensemble forecasts, then the stochastic dependence between both variables might be influenced by some conditions e.g. weather condition among others, and therefore it changes adaptively. Moreover, it is very likely that the sample is suffered from the seasonality effect. Seasonality and non-stationarity lead to a non-constant likelihood parameters overtime i.e. the forecasts in winter may be more informative than in summer and consequently the stochastic dependence in winter is stronger than in summer season (Kelly and Krzysztofowitz (1997)). The proposed method does not involve any deseasonalization of the time series in order to remove any seasonality effect or nonstationarity. The procedure as employed for the prior speci fication is no longer applicable as we usually have only short record of the joint sample. As an alternative to this, the likelihood joint sample in this paper is speci fied by the training window, which is n sample preceding the forecast date. In practice, the length of the training window should not be a very short period in order to have informative parameters. 2.4 Posterior distribution The posterior distribution is obtained by s imply combining the climatological prior with the likelihood function. It is used to generate the probabilistic predictive distribution of the calibrated ensemble forecasts . Given the prior distribution with its parameters (A, B and , marginal distribution with its 936 parameters, and the dependence parameters obtained from the likelihood, then the posterior CDF can be expressed as follows: It is straightforward to show that after multiplying this with its Jacobian, the probability density function (PDF) or the predictive distribution in the original space has form of where The mean of the predictive distribution is a weighted kernel between the climatological forecast and ensemble forecasts on the date. The mean of the predictive distribution depends on s, therefore, the value can be same or different for each member according to the ensemble forecast at the calibrated date. Meanwhile, the variance is equal for all members. The predictive distribution of the forecasts on the date is the one that has optimum . The optimum standard deviation will be determined by grid selection procedure, the one which optimizes the objective function for instance Continuous Ranked P robability Score (CRPS) of Unger (1985) and Herbash (2000) among others. 3 Conclusion We proposed a new calibration method of ensemble forecasts for non-normally distributed climate variables. The proposed method involves of revising the BMA framework into hierarchical model, and uses Bayes’s theorem in order to generate 937 the predictive distribution. Two uncertainties are incorporated in the predictive distribution i.e. climatological uncertainty and forecasts uncertainty. It solves the problem of underdispersion of the ensemble forecasts. The dependence parameters in the likelihood function are estimated by firstly applying normal quantile transform (NQT) to the series in order to generate a meta-Gaussian distribution. Proper definition of the sample choice of each predictive component allows us to have series which is free of seasonality effect and temporal trend. Hence, no transformation or standardization is needed to obtain stationary series, which is a necessary condition of applying the normal quant ile transform. Acknowledgements The author would like to thank Department of Mathematics and Statistics, Laval University Canada for giving opportunity to conduct this research during author’s Postdoctoral research. References [1] Bishop, C. and K. Shanley (2008), Bayesian Model Averaging's problematic treatment of extreme weather and a paradigm shift that fixes it, Monthly Weather Review, 136, 4641-4652. [2] Duan, Q. and N. K. Ajami, X. Gao and S. Sorooshian (2007), Multi -model ensemble hydrologic prediction using Bayesian model averaging, Advance in Water Resources, 30, 1371-1386. [3] Hersbach, H. (2000), Decomposition of the continuous ranked probability score for ensemble prediction systems, Weather Forec asting, 15, 559-570. [4] Kelly, K. S. and R. Krzysztofowitz (1997). A bivariate meta-Gaussian density for use in hydrology, Stochastic Hydrol. Hydraul., 11, 17-31. [5] Krzysztofowicz, R. and B. Evans (2008), Probabilistic Forecasts from the National Digital Forecast Database, Weather and Forecasting, 23, 270-289. [6] Kuswanto, H. , V. Fortin., A-C. Favre and B. N. Trinh (2010), Combining Bayesian processor of ensemble and Bayesian model averaging for reliable weather forecasts. Prepared for submission. [7] Raftery, A. E., T. Gneiting, F. Balabdaqui, and M. Polakowski (2005), Using Bayesian model to calibrate forecast ensembles, Monthly Weather Review, 133, 1155-1173. [8] Sloughter, J., A. E. Raftery, T. Gneiting, and C. Fraley (2007), Probabilistic Quantitative Precipitation Forecasting Using Bayesian Model Averaging, Monthly Weather Review, 135, 3209-3220. [9] Sloughter, J., T. Gneiting and A. E. Raftery (201 0), Probabilistic Wind Speed Forecasting using Ensembles and Bayesian Model Averaging, Journal of the Americ an Statistic al Association, 105, 25-35. 938 [10] Unger, D. A. (1985), A method to estimate the continuous ranked probability score. Preprints, Ninth Conf. on Probability and Statistics in Athmospheric Sciences, Virginia Beach, VA, Amer. Meteor. Soc., 206-213. [11] Vrugt, J. A., Cees G. H. Diks, and M. P., Clark (2008), Ensemble Baye sian model averaging using Markov Chain Monte Carlo sampling, Environ. Fluid Mech., 8, 579-595. . 939

DOCUMENT INFO

Shared By:

Categories:

Stats:

views: | 14 |

posted: | 6/18/2012 |

language: | English |

pages: | 8 |

OTHER DOCS BY alpd03l

How are you planning on using Docstoc?
BUSINESS
PERSONAL

By registering with docstoc.com you agree to our
privacy policy and
terms of service, and to receive content and offer notifications.

Docstoc is the premier online destination to start and grow small businesses. It hosts the best quality and widest selection of professional documents (over 20 million) and resources including expert videos, articles and productivity tools to make every small business better.

Search or Browse for any specific document or resource you need for your business. Or explore our curated resources for Starting a Business, Growing a Business or for Professional Development.

Feel free to Contact Us with any questions you might have.