

									     IMPRINTS Milestone Report
     Contract No: FP7-ENV-2008-1-226555

   SP3 Milestone
   October 2010
   Milestone M3.3

   Date                                      02.12.2010

   Report Number                             M_SP3-2010-11
   Revision Number                           draft

   Due date for Milestone:                   November 2010
   Actual submission date:                   December 2nd 2010

   Subproject Leader                         ULANC

                         IMPRINTS is co-funded by the European Community
           Seventh Framework Programme for European Research and Technological Development

   IMPRINTS is a Collaborative Project focused on Theme - ENVIRONMENT: Preparedness
   and risk management for flash floods including generation of sediment and associated debris flow.

                               Start date 15th January 2009, duration 42 Months

   Document Dissemination Level
   PU       Public
   RE       Restricted to a group specified by the consortium (including the Commission Services)
   CO       Confidential, only for members of the consortium (including the Commission Services)

Coordinator:           Centre de Recerca Aplicada en Hidrometeorologia (CRAHI-UPC)
                       Universitat Politècnica de Catalunya
Project Contract No:   FP7-ENV-2008-1-226555

                                           Table of Contents
1.    Description of Milestone
2.    Summary of the work carried out
3.    Difficulties arisen (if any) and actions
4.    Further Recommendations
5.    Associated Publications
6.    APPENDIXES (if any)


1. Description of Milestone
Milestone 3.3 (“1st version of the probabilistic conditioning methodology”) has been realized by
propagating ensemble rainfall inputs into hydrological models. The methodology follows the work
initiated by Germann et al. (2009) with the use of the radar ensemble generator REAL for
operational hydrological modelling with PREVAH (Viviroli et al., 2009) for the Verzasca river

2. Summary of the work carried out
The work has been split into three perspectives:

    a. Operational simulations. The simulations with PREVAH conditioned by the radar
       ensemble generator REAL are fully operational. Simulations are projected into the future
       for up to three days of lead time by nudging the REAL inputs with precipitation forecasts
       obtained from a high-resolution numerical weather prediction model. A very recent event is
       presented in Figure 1.

    Figure 1: Operational hydrological ensemble nowcasting with REAL and PREVAH (Germann et al.,
    2009), computed in real time on 16 November 2010 with deterministic initial conditions from
    10 November 2010 for the Verzasca basin in southern Switzerland (186 km2). The 25 nowcast
    members conditioned by REAL (light grey) are shown with the corresponding interquartile
    range (REAL IQR, red area) and the median (red line). Additionally, two deterministic runs are
    shown: deterministic radar QPE (yellow line) and forcing with interpolated pluviometer data (green
    line). The observed runoff is shown in blue. Spatially interpolated observed precipitation and the
    ensemble precipitation from the REAL members (orange whisker plots) are also shown. All
    deterministic and probabilistic members are chained to a three-day forecast by the COSMOCH7
    NWP model of MeteoSwiss.

    b. Probabilistic verification of ensemble runoff simulations conditioned by hourly
       ensemble weather radar fields. Results have been presented on a poster at the
       IMPRINTS Workshop in Barcelona [Appendix] and have been accepted for poster
       presentation at the “Weather Radar and Hydrology” (WRaH) conference to be held at the
       University of Exeter, United Kingdom, from 18 to 21 April 2011. A paper for the IAHS
       Red Book series is in preparation. The corresponding abstract with the title
       “Flood nowcasting in the Southern Swiss Alps using radar ensemble” can be found in the


    c. Off-line experiments of propagation and superposition of three sources of
       uncertainty. A full paper is close to acceptance in "Atmospheric Research." The paper
       describes an experimental framework for investigating the relative contribution of
       meteorological forcing uncertainties, initial conditions uncertainties and hydrological
       model parameter uncertainties in the realization of hydrological ensemble forecasts.
       Simulations were done for the Verzasca river basin (186 km2). For seven events in the time
       frame June 2007 to November 2008 it was possible to quantify the uncertainty for a five-
       day forecast range yielded by inputs of an ensemble numerical weather prediction (NWP)
       model (COSMO-LEPS, 16 members), the uncertainty in real-time assimilation of weather
       radar precipitation fields expressed using an ensemble approach (REAL, 25 members), and
       the equifinal parameter realizations of the hydrological model adopted (PREVAH, 26
       members). Combining the three kinds of uncertainty results in a hydrological ensemble of
       10400 members. An analysis of sub-samples from the ensemble provides insight into the
       contribution of each kind of uncertainty to the total uncertainty. The submitted paper
       (Zappa et al., submitted to Atmospheric Research) and the corresponding poster presented
       at the IMPRINTS Workshop in Barcelona are attached to this milestone report [Appendix].
       The conditioning (superposition) methodology is presented in detail in sections 3.2 and 3.4
       of the attached paper.
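
The ensemble size quoted in item c follows from pairing every NWP member with every radar member and every parameter set. The short sketch below only reproduces that count; the function name and the index-triple representation are illustrative assumptions and are not taken from the attached paper.

```python
from itertools import product

# Member counts as quoted in the text above.
N_LEPS, N_REAL, N_MOD = 16, 25, 26


def member_combinations(n_leps=N_LEPS, n_real=N_REAL, n_mod=N_MOD):
    """Return every (NWP member, radar member, parameter set) index triple.

    Hypothetical helper: each triple stands for one hydrological model run.
    """
    return list(product(range(n_leps), range(n_real), range(n_mod)))


combos = member_combinations()
print(len(combos))  # 16 * 25 * 26 = 10400
```

Sub-samples of the kind mentioned in the text can then be taken by holding one or two of the three indices fixed.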

3. Difficulties arisen (if any) and actions
No significant difficulties arose.

4. Further Recommendations
Conditioning with ensemble rainfall information should be extended to the first two to four hours
in the future. Work for conditioning PREVAH with new tools from IMPRINTS SP1 will start in
2011. MeteoSwiss gave a talk on the progress of SP1 on December 1st 2010 at WSL. First
conditioning of PREVAH with NORA and PREVAH could be presented at the next Meeting in

5. Associated Publications
Viviroli D, Zappa M, Gurtz J and Weingartner R. 2009. An introduction to the hydrological modelling
   system PREVAH and its pre- and post-processing-tools. Environmental Modelling & Software.
   24(10): 1209–1222. doi:10.1016/j.envsoft.2009.04.001
Germann U, Berenguer M, Sempere-Torres D, Zappa M. 2009. REAL - Ensemble radar
   precipitation for hydrology in a mountainous region. Quarterly Journal of the Royal
   Meteorological Society. 135: 445–456. doi:10.1002/qj.375
Zappa M, Jaun S, Germann U, Walser A, Fundel F. Superimposition of three sources of
   uncertainties in operational flood forecasting chains. Submitted to Atmospheric Research in
   revised version. Thematic Issue on COST731.


6. APPENDIXES (if any)
The cited paper and the two cited posters presented at the IMPRINTS workshop held in Barcelona
in June 2010 are attached to this milestone summary.

Liechti K., Fundel F., Germann U. and Zappa M. “Flood nowcasting in the Southern Swiss Alps
using radar ensemble”. Abstract for WRaH2011 Exeter.

Flood nowcasting in the Southern Swiss Alps using radar ensemble

K. Liechti, F. Fundel, U. Germann, M. Zappa

Since April 2007 the MeteoSwiss radar ensemble product REAL has been in operation and
used for operational flash flood nowcasting by the WSL. REAL consists of 25 members and
is generated hourly from the current radar quantitative precipitation estimates (QPE). REAL
covers an area in the Southern Swiss Alps where orographic and convective precipitation is
frequent. This ensemble QPE is processed by the semi-distributed hydrological model
PREVAH, which provides operational ensemble nowcasts for several basins with areas from 44
to 1500 km2. The smaller basins are prone to flash floods, whereas the larger ones are
mainly affected by large floods after long-lasting rainfall.
In this contribution the performance of nowcasts driven by REAL is compared to that of nowcasts
driven by deterministic radar and rain gauge data. First results for the Verzasca river basin
(186 km2) demonstrate that REAL outperforms deterministic radar over the whole range of
discharges, while the results with rain gauge data are threshold dependent.
The spread of the hydrological nowcast grows with the time the system is allowed
to develop under forcing by the radar ensemble. Therefore the performance of ensemble nowcasts is
expected to improve when the radar ensemble is used for a certain time before the initialisation
of a nowcast (initialisation period).
It will be analysed how long the initialisation period should be to obtain the optimal spread for a
subsequent hydrological nowcast. First results for the Verzasca river basin indicate that the
maximum skill in discharge nowcasts is reached with an initialisation period of 7 days.
In future work, it is planned to combine REAL with numerical weather predictions from
atmospheric models for flash flood forecasting. In addition, new nowcasting radar QPE
products including blending with rain gauge data will be tested.
Radar Ensemble for Operational Hydrology
Probabilistic Verification of a 33 Months Long Time Series of Verzasca Runoff

F. Fundel, K. Liechti and M. Zappa (WSL)
U. Germann (MeteoSwiss)

REAL: Radar Ensemble for Hydrology in the Alps

Introduction (Germann et al., QJRMS, 2009; Zappa et al., ASL, 2010)
In the past decade a series of sophisticated algorithms to obtain the best
radar estimates of surface precipitation rates over all of Switzerland have
been developed. In spite of significant improvements, for hydrological
applications the residual uncertainty is still relatively large. A novel
solution to express this residual uncertainty is to generate an ensemble
of radar precipitation fields by combining stochastic simulation and
detailed knowledge of the radar error structure. A prototype ensemble
generator (REAL) has been implemented and has been running in real time since
April 2007. The ensemble of precipitation field time series from REAL
consists of 25 members and is updated operationally every 60 minutes
and propagated through the semi-distributed hydrological model PREVAH.

In numerical weather prediction, spread increases with lead time. By analogy,
the spread of radar ensemble products increases with the number of hours
in which one allows them to diverge from the initial conditions.
We allow each of the 25 REAL members to build up a 10-day chain of
spatially and temporally correlated precipitation values. Thus spread can
develop from day to day. During long dry spells the spread can converge.
During long wet spells the spread can grow. Our setup starts from “Day
minus 10” with identical initial conditions for the hydrological simulations.
In case of rainfall the 25 chains of weather radar precipitation propagate
separately through the hydrological model.

First Verification Results
We repeated the 10-day simulations, starting them on each consecutive day from
April 2007 until May 2010. This particular setup allows us to create chains of
discharge values with identical “spread-time” and to evaluate such data with
standard probabilistic verification metrics as generally used for evaluating
ensemble discharge forecasts (e.g. Jaun and Ahrens, HESS, 2009). Here, first
analyses for the period until December 2009 are shown.
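
The regrouping of restarted simulations into chains with identical “spread-time” can be sketched as follows. This is a minimal illustration under assumptions: the nested-list layout `sims[s][k]` (run started on day s, k days after the start) and the function name are hypothetical and do not describe the actual WSL processing code.

```python
def spread_time_chains(sims):
    """Regroup restarted simulations by spread-time.

    sims[s][k] is the value (for example the daily maximum discharge)
    simulated for calendar day s + k by the run started on day s, so k is
    the number of days the ensemble has been allowed to diverge.
    Returns a dict mapping each spread-time k to a list of
    (calendar_day, value) pairs, one pair per start day.
    """
    n_days = len(sims[0])
    chains = {}
    for k in range(n_days):
        chains[k] = [(s + k, run[k]) for s, run in enumerate(sims)]
    return chains


# Toy input: 4 start days, 3 simulated days each; the value 10*s + k
# encodes which run (s) and which spread-time (k) it came from.
toy = [[10 * s + k for k in range(3)] for s in range(4)]
chains = spread_time_chains(toy)
print(chains[2])
```

Each chain covers consecutive calendar days with a fixed divergence time, which is what makes the chains comparable with standard ensemble verification scores.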
Figure: Talagrand diagram of Verzasca daily maxima runoff driven with REAL precipitation for different “spread-times”
Figure: Building maximum discharge daily chains with growing “spread-time”
Figure: Brier Skill Score for the 80th, 90th, 95th and 99th quantile of Verzasca daily maximum runoff driven with REAL precipitation. Dashed lines are the 5% and 95% confidence
Figure: Reliability component of the Brier Score as measure for calibration.
Figure: Resolution component of the Brier Score

REAL-driven runoff ensemble analyses are underdispersive and biased, yet
skillful. They could provide a good initialization for an ongoing forecast.
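
The verification measures named on this poster (Talagrand diagram and Brier Skill Score) can be sketched in a few lines. The version below is a generic textbook formulation with sample climatology as the reference forecast; the function names, the tie handling and the toy numbers are assumptions and do not reproduce the poster's actual computation.

```python
def rank_histogram(ensembles, observations):
    """Talagrand diagram counts: rank of each observation in its ensemble."""
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)  # ties are not counted as below
        counts[rank] += 1
    return counts


def brier_score(ensembles, observations, threshold):
    """Mean squared difference between the forecast exceedance probability
    (fraction of members above the threshold) and the outcome (0 or 1)."""
    total = 0.0
    for members, obs in zip(ensembles, observations):
        p = sum(1 for m in members if m > threshold) / len(members)
        o = 1.0 if obs > threshold else 0.0
        total += (p - o) ** 2
    return total / len(observations)


def brier_skill_score(ensembles, observations, threshold):
    """BSS = 1 - BS / BS_ref, with the sample base rate as reference forecast."""
    bs = brier_score(ensembles, observations, threshold)
    base_rate = sum(1 for o in observations if o > threshold) / len(observations)
    bs_ref = base_rate * (1.0 - base_rate)
    if bs_ref == 0.0:
        raise ValueError("event must occur in some but not all cases")
    return 1.0 - bs / bs_ref


# Toy data (hypothetical): four cases with three members each.
toy_ens = [[1, 2, 3], [4, 5, 6], [2, 3, 4], [7, 8, 9]]
toy_obs = [2.5, 7.0, 1.0, 8.5]
print(rank_histogram(toy_ens, toy_obs))
print(brier_skill_score(toy_ens, toy_obs, threshold=4.5))
```

An underdispersive ensemble typically shows up as a U-shaped rank histogram, because observations fall outside the ensemble range more often than a well-calibrated ensemble would allow.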
Superposition of three sources of uncertainties
in operational flood forecasting chains
in mountainous areas

M. Zappa, S. Jaun (WSL)
U. Germann, A. Walser (MeteoSwiss)

Introduction (Germann et al., QJRMS, 2009; Zappa et al., ASL, 2010)
We set up an experimental framework for investigating the relative contribution of
meteorological forcing uncertainties, initial conditions uncertainties and hydrological
model parameter uncertainties in the realization of hydrological ensemble forecasts.
Simulations were done for a representative mesoscale basin of the Swiss Alps,
the Verzasca river basin (186 km2).

For different events in the time frame June 2007 to November 2008 it was possible to quantify the
uncertainty for a five-day forecast range yielded by inputs of an ensemble numerical weather
prediction model (C-LEPS, 16 members), the uncertainty in real-time assimilation of weather radar
precipitation fields expressed using an ensemble approach (REAL, 25 members), and the parameter
uncertainty of the adopted hydrological model PREVAH (MOD, 26 members). The simultaneous
propagation of all three ensembles generates a hydrological ensemble of 10400 members.
Targeted analyses of members’ sub-samples provide insights into uncertainty superposition.

Figure panel: MOD: Uncertainty of the tunable parameters of the hydrological model

Results confirm the expectations and show that for the operational
simulation of peak-runoff events the hydrological model
uncertainty is less pronounced than the uncertainty obtained by
propagating radar precipitation fields (by a factor larger than 4)
and NWP forecasts through the hydrological model (by a factor
larger than 10). The use of weather radar ensembles for generating
hydrologically consistent ensembles of initial conditions previous
to the propagation of COSMO-LEPS through the hydrological
model shows that the uncertainty in initial conditions decays
within the first 48 hours of the forecast. Another finding from the
experiments is that the spread obtained when superposing two
or more sources of uncertainty is larger than the cumulated
spread of experiments when only one uncertainty source is
propagated through the hydrological model. The spread
obtained from uncertainty superposition is growing non-linearly.
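
To make the comparison between superposed and cumulated spread concrete, the toy sketch below varies one source at a time and then all three together. Everything in it is an assumption made for illustration: the multiplicative `toy_peak` function stands in for the hydrological model, the factor lists stand in for ensemble members, and spread is taken as maximum minus minimum. The inequality it prints is a property of this toy setup, not a reproduction of the poster's results.

```python
from itertools import product


def spread(values):
    """Spread taken here as max minus min (an assumed, simple definition)."""
    values = list(values)
    return max(values) - min(values)


def toy_peak(nwp, radar, param):
    """Hypothetical peak discharge in which the three factors interact."""
    return 100.0 * nwp * radar * param


nwp_factors = [0.5, 1.0, 2.0]     # stand-in for NWP members
radar_factors = [0.8, 1.0, 1.25]  # stand-in for radar ensemble members
param_factors = [0.9, 1.0, 1.1]   # stand-in for parameter sets

# One source at a time, the other two held at their reference value 1.0.
only_nwp = spread(toy_peak(n, 1.0, 1.0) for n in nwp_factors)
only_radar = spread(toy_peak(1.0, r, 1.0) for r in radar_factors)
only_param = spread(toy_peak(1.0, 1.0, p) for p in param_factors)

# All three sources superposed: every combination of members.
full = spread(toy_peak(n, r, p)
              for n, r, p in product(nwp_factors, radar_factors, param_factors))

cumulated = only_nwp + only_radar + only_param
print(full > cumulated)
```

In a purely additive toy model the two quantities would coincide, so the excess seen here comes from the interaction between the sources.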

Figure panel: REAL/MOD: Superposing weather radar uncertainty with MOD

The experimental setup provides interesting answers to questions
linked to uncertainty propagation and superposition in a
hydrometeorological forecasting system. By use of radar ensembles,
input uncertainties are considered for nowcasting.
The simultaneous application of REAL and parameter uncertainties
generates ensembles that nicely envelop the observed hydrograph.
The magnitude of the uncertainty attributed to the difference in
initial conditions is smaller than the uncertainty attributed to the
hydrological model parameters and almost negligible with respect
to the spread owed to COSMO-LEPS.

Further efforts are planned in order to implement interpolation-
based ensembles within our experimental chain.
An objective quantitative verification of the ensemble simulations
against observed data will be presented in follow-up studies.

Submitted to Atmospheric Research, Special Issue on COST 731

Figure panel: LEPS/MOD: Superposing COSMO-LEPS NWP uncertainty with MOD

"Superposition of three sources of uncertainties in operational
flood forecasting chains"

Massimiliano Zappa (1), Simon Jaun (1,2), Urs Germann (3), André Walser (3) and Felix Fundel (1)

(1) Swiss Federal Research Institute WSL, Birmensdorf, Switzerland
(2) Institute for Atmospheric and Climate Science, ETH Zurich, Switzerland
(3) Swiss Federal Office of Meteorology and Climatology MeteoSwiss, Switzerland

Submitted to Atmospheric Research, Special Issue on “COST 731 - UNCERTAINTY
SYSTEMS” Guest Editors, Dr. Andrea Rossa and Dr. Massimiliano Zappa

1st Submission: 31. May 2009
2nd Submission: 22. October 2010
3rd Submission: 30. November 2010

Corresponding Author:
Dr. Massimiliano Zappa
Swiss Federal Institute for Forest, Snow and Landscape Research WSL
Mountain Hydrology and Torrents
Zürcherstrasse 111, CH-8903 Birmensdorf
massimiliano.zappa@wsl.ch

Abstract

One of the less known aspects of operational flood forecasting systems in complex
topographic areas is the way in which the uncertainties of their components propagate and
superpose when they are fed into a hydrological model. This paper describes an experimental
framework for investigating the relative contribution of meteorological forcing uncertainties,
initial conditions uncertainties and hydrological model parameter uncertainties in the
realization of hydrological ensemble forecasts. Simulations were done for a representative
small-scale basin of the Swiss Alps, the Verzasca river basin (186 km2).

For seven events in the time frame June 2007 to November 2008 it was possible to
quantify the uncertainty for a five-day forecast range yielded by inputs of an ensemble
numerical weather prediction (NWP) model (COSMO-LEPS, 16 members), the uncertainty in
real-time assimilation of weather radar precipitation fields expressed using an ensemble
approach (REAL, 25 members), and the equifinal parameter realizations of the hydrological
model adopted (PREVAH, 26 members). Combining the three kinds of uncertainty results in
a hydrological ensemble of 10400 members. An analysis of sub-samples from the ensemble
provides insight into the contribution of each kind of uncertainty to the total uncertainty.

The results confirm our expectations and show that for the operational simulation of
peak-runoff events the hydrological model uncertainty is less pronounced than the uncertainty
obtained by propagating radar precipitation fields (by a factor larger than 4 in our specific
setup) and NWP forecasts through the hydrological model (by a factor larger than 10). The
use of precipitation radar ensembles for generating ensembles of initial conditions shows that
the uncertainty in initial conditions decays within the first 48 hours of the forecast. We also
show that the total spread obtained when superposing two or more sources of uncertainty is
larger than the cumulated spread of experiments when only one uncertainty source is
propagated through the hydrological model. The full spread obtained from uncertainty
superposition is growing non-linearly.


Keywords: flood forecasting, uncertainty superposition, weather radar ensemble, atmospheric
EPS, model uncertainty, PREVAH, MAP D-PHASE, COST 731

1. Introduction

Operational flood forecasting is an important task in order to detect potentially
hazardous extreme rainfall-runoff events in time. This is particularly challenging in
mountainous areas, where the orography strongly complicates the setup and operational
workflow of most components of an end-to-end flood forecasting system. Such systems
consist of atmospheric models (e.g. Rotach et al., 2009), hydrological prediction systems (e.g.
Zappa et al., 2008), nowcasting tools used for estimating initial conditions (e.g. Germann et
al., 2009) and warnings for end-users (Bruen et al., 2010; Frick and Hegg, this issue).

Each component of the system is affected by uncertainties linked to the physical
representation of orography, to the parameterization schemes of the models involved and the
limitations of the observing platforms providing real-time data (Zappa et al., 2010). For an
integral consideration of uncertainty three key sources of errors have to be considered: a) the
uncertainty arising from incomplete process representation including the error in the
estimation of model parameters (Vrugt et al., 2005), b) the uncertainty in the initial conditions
and c) the uncertainty of the observed/forecasted hydrometeorological input. This
“uncertainty triplet” (Figure 1) superposes when data are fed into a hydrological model. The
integral uncertainty is the result of the interactions of all sources of uncertainty that are
propagating.

In the field of numerical weather prediction, ensemble systems are established as
standard tools to estimate and describe prediction uncertainties. Deterministic numerical
weather predictions (NWPs) are intrinsically limited by the chaotic nature of the atmospheric
dynamics. Already in the 1960s, Lorenz (1963) demonstrated in a seminal study that small
errors in the initial conditions of a weather forecast can grow rapidly, leading to highly
diverging solutions. In order to estimate predictability, much research has been undertaken to
develop probabilistic forecasting methodologies (see the reviews by Ehrendorfer, 1997 and
Palmer, 2000). In recent years, several studies have been devoted to the regional scales using
limited-area ensembles, in particular for forecasting heavy precipitation events (e.g. Stensrud
et al., 2000; Walser et al., 2004). Motivated by the reported results, initiatives for operational
limited-area ensemble prediction systems (EPSs) have emerged, e.g. the SRNWP-PEPS
(Quiby and Denhard 2003) and COSMO-LEPS (Marsigli et al., 2005). It is nowadays
common to apply atmospheric EPS as a forcing in operational flood-forecasting systems
(Siccardi et al., 2005; Verbunt et al., 2007; Bartholmes et al., 2009 and see Cloke and
Pappenberger, 2008 for a review).

One of the advantages all meteorological ensemble approaches have in common is the
simple interface with hydrological impact models. Each member of the ensemble can be fed
into the hydrological model and generate a forecast. The spread arising from the outcomes of
all members represents the sensitivity of the hydrological system to the meteorological ensemble.
Recently, ensemble techniques have been proposed to quantify uncertainties in observing
systems (Collier, 2007), such as radar precipitation estimation and nowcasting (e.g. Berenguer
et al., 2005; Bowler et al., 2006; Szturc et al., 2008; Lee et al., 2009; Germann et al., 2009),
pluviometer-based ensembles (Ahrens and Jaun, 2007; Villarini and Krajewski, 2008; Moulin
et al., 2009; Pappenberger et al., 2009), or satellite rainfall retrieval (e.g. Bellerby and Sun,
2007; Clark and Slater, 2006). In addition, the use of observation-based ensembles makes it
possible to obtain a hydrologically consistent ensemble of initial conditions for simulations
coupled with atmospheric EPS.
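
The "simple interface" described above, where each meteorological member is fed through the hydrological model and the spread is read from the outcomes, can be sketched as follows. The linear reservoir is a deliberately simple stand-in chosen for illustration; it is not PREVAH, and the parameter `k`, the function names and the max-minus-min spread are assumptions.

```python
def linear_reservoir(rain, k=0.2, storage=0.0):
    """Toy rainfall-runoff model: each step releases a fraction k of storage."""
    runoff = []
    for p in rain:
        storage += p          # add this step's rainfall to the store
        q = k * storage       # outflow proportional to current storage
        storage -= q
        runoff.append(q)
    return runoff


def run_ensemble(rain_members, model=linear_reservoir):
    """Feed every rainfall member through the same model, one run per member."""
    return [model(member) for member in rain_members]


def spread_per_step(runoff_members):
    """Max minus min across members at each time step (assumed spread measure)."""
    return [max(step) - min(step) for step in zip(*runoff_members)]


# Two hypothetical rainfall members that differ only in the first time step.
rain_members = [[10.0, 0.0, 0.0], [20.0, 0.0, 0.0]]
runoff_members = run_ensemble(rain_members)
print(spread_per_step(runoff_members))
```

Because the same model and initial storage are used for every member, the spread in this sketch reflects only the differences in the rainfall input.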

The hydrological model uncertainty is a further factor that needs to be accounted for
and communicated in hydrological forecasting. The problem of parameter estimation and
equifinality is not exclusive to hydrology (Beven, 1993; Beven and Freer, 2001; Vrugt et
al., 2003; Pappenberger and Beven 2006), but is a common issue in environmental modelling
(see Matott et al., 2009 for a review).

This paper describes an experimental flood-forecasting chain emerging from the joint
activities of the MAP D-PHASE project (Rotach et al., 2009) and the COST action 731
(Rossa et al., this issue). A novel aspect of our study is the superposition (or “cascading”,
Pappenberger et al., 2005) of the “uncertainty triplet” described above. To summarize, we will:

- Propagate COSMO-LEPS (section 2.3) and the radar ensemble fields from REAL
  (Germann et al., 2009; section 2.2) through the hydrological model PREVAH (Viviroli et al.,
  2009a; section 2.1)

- Estimate the uncertainty of PREVAH tunable parameters by Monte Carlo sampling
  and select different parameter sub-samples (section 3.2)

- Define different experimental settings for superposing the uncertainties from
  PREVAH, REAL and COSMO-LEPS (section 3.4)

- Quantify uncertainty and express it as average spread for a forecast period of 120
  hours, as defined by the lead-time of COSMO-LEPS forecasts (section 3.5).
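
The last step of the list, expressing uncertainty as an average spread over the 120-hour forecast period, can be sketched as below. The precise spread measure is defined in section 3.5 of the paper and is not reproduced here; this sketch assumes hourly series and uses maximum minus minimum across members as a placeholder measure, and the function name is hypothetical.

```python
def average_spread(members, horizon_hours=120):
    """Mean over lead times of (max - min) across ensemble members.

    members: list of hourly discharge series, one per ensemble member.
    Max minus min is an assumed placeholder for the paper's spread measure.
    """
    if any(len(m) < horizon_hours for m in members):
        raise ValueError("every member must cover the full forecast horizon")
    hourly = [max(vals) - min(vals)
              for vals in zip(*(m[:horizon_hours] for m in members))]
    return sum(hourly) / horizon_hours


# Two hypothetical members: they differ by 1 for 60 hours, then by 3.
member_a = [0.0] * 120
member_b = [1.0] * 60 + [3.0] * 60
print(average_spread([member_a, member_b]))  # (60 * 1 + 60 * 3) / 120 = 2.0
```

Averaging over the full lead time gives one number per experiment, which makes experiments with different uncertainty sources directly comparable.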

The Swiss Verzasca river basin (186 km2, section 3.1) has been selected as the
experimental area. This was the authors' main test bed during MAP D-PHASE. Data are
available since the beginning of the MAP D-PHASE demonstration period in June 2007.

Our main goal is to estimate the different magnitudes of spread generated by our
particular definitions of input uncertainties (REAL and COSMO-LEPS), initial conditions
uncertainties (REAL for estimating initial conditions before feeding COSMO-LEPS into
PREVAH) and hydrological model uncertainties (use of different sets of calibrated parameters).
As a further goal we want to identify how spread grows when different sources of uncertainty
are superposed.


2. Methods

2.1 The operational hydrological model PREVAH

We adopt the semi-distributed hydrological catchment modelling system PREVAH
(Precipitation-Runoff-Evapotranspiration HRU Model; Viviroli et al., 2009a), which has been
developed to improve the understanding of the spatial and temporal variability of hydrological
processes in catchments with complex topography. A review of previous work with
PREVAH is presented in Viviroli et al. (2009a), which also thoroughly introduces the model
physics, parameterizations and pre- and post-processing tools.

Besides its application to investigating water resources in mountainous basins (Zappa et
al., 2003; Zappa and Kan, 2007; Koboltschnig et al., 2009), PREVAH has recently been
used increasingly in quasi-operational hydrological applications and re-forecasts of
flooding events in Switzerland. Verbunt et al. (2006) presented an indirect verification of
deterministic quantitative precipitation forecasts (QPF) for the river Rhine. Verbunt et al.
(2007) and Jaun et al. (2008) presented case studies on coupling PREVAH with the ensemble
numerical weather prediction system COSMO-LEPS. Jaun and Ahrens (2009) verify a
two-year reforecast experiment of the PREVAH/COSMO-LEPS forecasting chain for the Swiss
Rhine basin. Romang et al. (2011) introduce the application of PREVAH for early flood
warning in Swiss mesoscale basins. PREVAH is adopted as a “hydrological engine” for
superposing three sources of uncertainty (Figure 1).


151   2.2 Dealing with uncertainties within operational weather radar systems

            In the past decade MeteoSwiss, the Swiss Federal Office of Meteorology and
      Climatology, developed and implemented a series of sophisticated algorithms to obtain
      best estimates of surface precipitation rates over Switzerland using a radar network
      (Germann et al., 2006). In spite of significant improvements, the residual uncertainty
      is still relatively large. A promising way to express this residual uncertainty is to
      generate an ensemble of radar precipitation fields by combining stochastic simulations
      with detailed knowledge of the radar error structure. The method is called REAL, which
      stands for Radar Ensemble generator designed for usage in the Alps using LU
      decomposition (Germann et al., 2009).

            In REAL, the original (deterministic) radar precipitation field (1x1 km2
      resolution) is perturbed with a stochastic component that has the same mean and the same
      covariance structure in space and time as the radar errors. In a first step, the mean
      and covariance structure of the radar errors are determined by comparing radar estimates
      with rain gauge measurements. The radar error is defined as the logarithm of the ratio
      of the true (unknown) precipitation value to the radar estimate. This is a reasonable
      definition given that most radar errors are multiplicative (Germann et al., 2006). In a
      second step, REAL generates a number of perturbation fields using singular value
      decomposition of the radar error covariance matrix, stochastic simulation using the LU
      decomposition algorithm, and autoregressive filtering. Each ensemble member is a
      possible realization of the unknown true precipitation field time series given the radar
      reflectivity measurements and the radar error covariance matrix. For the complete
      mathematical derivation of REAL we refer to Germann et al. (2009).
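The perturbation step described above can be illustrated with a minimal sketch. It is not the REAL implementation: it assumes a natural-logarithm error definition, uses a Cholesky factorisation as a stand-in for the LU decomposition step, and omits the singular value decomposition and the autoregressive filtering in time. The three-pixel field, the error mean and the covariance matrix are hypothetical values chosen for illustration only.

```python
import math
import random


def cholesky_lower(cov):
    """Return lower-triangular L with L * L^T == cov (cov symmetric positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def radar_ensemble(radar, mean_err, cov, n_members=25, seed=0):
    """Perturb a radar field with correlated log-errors (assumed natural log).

    Each member is radar * exp(error), where error has the prescribed mean
    and covariance. Temporal correlation is deliberately left out here.
    """
    rng = random.Random(seed)
    low = cholesky_lower(cov)
    n = len(radar)
    members = []
    for _ in range(n_members):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]  # independent N(0, 1) draws
        err = [mean_err[i] + sum(low[i][k] * z[k] for k in range(i + 1))
               for i in range(n)]
        members.append([radar[i] * math.exp(err[i]) for i in range(n)])
    return members


# Hypothetical example: three pixels (mm/h), zero mean error, correlated errors.
radar_field = [2.0, 5.0, 0.0]
mean_error = [0.0, 0.0, 0.0]
error_cov = [[0.25, 0.10, 0.05],
             [0.10, 0.25, 0.10],
             [0.05, 0.10, 0.25]]
ensemble = radar_ensemble(radar_field, mean_error, error_cov)
```

Because the perturbation is multiplicative, pixels with zero radar rainfall stay at zero in every member, and positive rainfall stays positive.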

            A prototype ensemble generator was implemented as part of MAP D-PHASE and
      COST-731 and has been running in real time in automatic mode since spring 2007. The
      ensemble of precipitation field time series from REAL consists of 25 members, is
      updated operationally every 60 minutes and is propagated through PREVAH.


178   2.3 Quantification of uncertainty from ensemble NWP-systems

            Early identification of severe long-lasting rainfall events within the next five
      days is obtained from the Limited-area Ensemble Prediction System of the COnsortium for
      Small-scale MOdelling, COSMO-LEPS (Marsigli et al., 2005). In the current configuration,
      COSMO-LEPS provides once a day a 16-member ensemble forecast with 132 hours lead-time
      for large parts of Europe. COSMO-LEPS is initialized at 12:00 UTC; the first 12 forecast
      hours are not used because of misrepresentations during model spin-up, which leaves 120
      usable forecast hours. Initial and boundary conditions are taken from the European
      Centre for Medium-Range Weather Forecasts EPS (Molteni et al., 1996). The horizontal
      grid-spacing of COSMO-LEPS is 10x10 km2, which is rather coarse for the small Verzasca
      basin, but due to the high computational costs ensemble forecasts with higher
      resolution are not yet available for the medium range. Six meteorological surface
      variables (air temperature, precipitation, humidity, wind, sunshine duration derived
      from cloud cover, global radiation) are obtained from the ensemble NWP and downscaled
      for hydrological modelling. The downscaling setup is the same as presented in Jaun et
      al. (2008) and relies on bilinear interpolation. Air temperature is adjusted according
      to elevation by adopting a constant lapse rate of 0.65 °C per 100 m.
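As a rough illustration of the downscaling described above, the following sketch combines bilinear interpolation within one NWP grid cell with the constant lapse-rate adjustment of 0.65 °C per 100 m. The function names, the unit-cell convention and the example values are assumptions made for this illustration; the actual setup of Jaun et al. (2008) may differ in detail.

```python
LAPSE_RATE_C_PER_M = 0.65 / 100.0  # 0.65 degC per 100 m, as stated in the text


def bilinear(v00, v10, v01, v11, fx, fy):
    """Bilinear interpolation inside one grid cell.

    v00, v10, v01, v11 are the values at corners (0,0), (1,0), (0,1), (1,1);
    fx and fy are the fractional positions (0..1) of the target point.
    """
    bottom = v00 * (1.0 - fx) + v10 * fx
    top = v01 * (1.0 - fx) + v11 * fx
    return bottom * (1.0 - fy) + top * fy


def adjust_temperature(t_interp, z_interp, z_target):
    """Shift an interpolated temperature from the interpolated model
    elevation z_interp to the target elevation z_target (both in m)."""
    return t_interp - LAPSE_RATE_C_PER_M * (z_target - z_interp)


# Hypothetical cell: corner temperatures (degC) and model elevations (m).
t_mid = bilinear(10.0, 12.0, 14.0, 16.0, 0.5, 0.5)
z_mid = bilinear(800.0, 1000.0, 1000.0, 1200.0, 0.5, 0.5)
t_at_2000m = adjust_temperature(t_mid, z_mid, 2000.0)
```

A target cell lying 1000 m above the interpolated model elevation is therefore 6.5 °C colder than the interpolated value.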


196   3. Experimental design

197   3.1 Study area

            The Verzasca basin has an area of 186 km² up to the main gauge in Lavertezzo
      (Figure 2). The basin is located in the southern part of Switzerland and is little
      affected by human activities. Its elevation range is 490-2870 m a.s.l. Forests (30%),
      shrub (25%), rocks (20%) and alpine pastures (20%) are the predominant land cover
      classes. Soils are rather shallow (generally less than 30 cm deep) and the plant
      available field capacity is below 5% volume. The discharge regime is governed by
      snowmelt in spring and early summer and by heavy rainfall events in fall (Ranzi et al.,
      2007). The river is prone to flash floods (Wöhling et al., 2006) and flows into the
      “Lago di Vogorno”, an artificial reservoir maintained by a private hydropower company.

            The hydrological properties of the catchment are derived from gridded maps of
      elevation, land use, land cover and soil properties (Gurtz et al., 1999), which are
      available at 100x100 m2 resolution. For the present application a resolution of 500x500
      m2 is generated prior to the delineation of hydrological response units (Viviroli et
      al., 2009a). The runoff gauging station at the catchment outlet is maintained by the
      Swiss Federal Office for Environment, which provides data operationally at 10-minute
      resolution. Flood peaks at Lavertezzo may exceed 600 m3s-1 (~3.2 m3s-1km-2). Base flow
      in winter can be less than 1 m3s-1.

215            The operational meteorological forcing is obtained from several sources. MeteoSwiss
216   maintains a network of automatic stations providing a detailed set of meteorological variables
217   with a sampling interval of up to 10 minutes (Figure 2). The administration of the Canton
218   Ticino (UCA Ct. Ticino on Figure 2) maintains an additional network of pluviometers, which
219   samples precipitation data in real-time with a temporal resolution of 30 minutes. One of the
220   latter is the only automatic pluviometer within the basin. Furthermore weather radar
221   precipitation fields are available (Section 2.2.).


223   3.2 Consideration of hydrological uncertainty
            The initial setup and calibration of the hydrological model was based on previous
      applications in the Verzasca river basin (Wöhling et al., 2006; Ranzi et al., 2007). The
      default calibration identifies a single parameter set with the highest performance in
      simulating average flows and the smallest volume error between observed and simulated
      time series (Zappa and Kan, 2007; Viviroli et al., 2009a). Since the target of this
      study is the quantification of uncertainty propagation in hydrometeorological flood
      forecasting chains, only the seven parameters relevant for surface runoff generation
      were allowed to vary randomly during the Monte Carlo (MC) experiment (Table 1). The
      identification of these seven sensitive parameters relies on experience (Zappa, 2002),
      on consideration of the model structure (Gurtz et al., 2003) and on targeted sensitivity
      studies on flood peak calibration (Viviroli et al., 2009b). Table 1 lists the basic
      value of the seven parameters after the default calibration and the ranges allowed for
      parameter sampling during the MC experiment. Further uncertainties linked to the
      parameters controlling snow accumulation, snow melt and base flow have been disregarded.
      A total of 2527 MC runs were computed for the period 1996-2001, with 1996 used only as a
      spin-up year. Note that we do not address the full predictive uncertainty of the
      forecasting chain as defined in Draper (1995) and Todini (2009). Instead we focus on the
      parameter uncertainty obtained by selecting equifinal realizations from the MC
      experiment, and on the observation and algorithm uncertainty represented by the ensemble
      methods for the NWP and radar systems (see above). For the model chain used, however,
      the obtained uncertainty is the best available estimate of the full predictive
      uncertainty, and our hydrological experiments, which assess different sets of model
      parameters fitted to past observations, provide a practicable way to quantify how
      parameter uncertainty might contribute to the full predictive uncertainty of the system
      (Figure 1).

            Whether a model run is behavioural or not is decided on the basis of a subjective
      choice of likelihood function(s) (Beven, 1993; Madsen, 2000 and 2003; Viviroli et al.,
      2009b; Bosshard and Zappa, 2008). As the goal of the modelling experiments is the
      estimation of flood peaks, two goodness-of-fit measures focused on peak discharge have
      been computed for each MC realization. The first measure is the well-known Nash-Sutcliffe
      efficiency (NSE; Nash and Sutcliffe, 1970):

$$\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{n} (Q_t - q_t)^2}{\sum_{t=1}^{n} (Q_t - \bar{Q})^2}, \qquad \mathrm{NSE} \in \, ]-\infty, 1] \qquad (1)$$

            where Qt is the observed hourly runoff at time step t, Q the average of the
      observed runoff, qt the simulated runoff at time step t and n the number of time steps.
      NSE quantifies the relative improvement of the model compared to the mean of the
      observations. NSE is well suited to the present application because it is particularly
      sensitive to high flows. Its use is less advisable for studies focused on obtaining the
      best calibrated values for both high and low flows (Legates and McCabe, 1999; Schaefli
      and Gupta, 2007).

261           In addition to NSE a second function is used. Lamb (1999) and Viviroli et al. (2009b)
262   introduce and discuss several scores for obtaining tailored parameters sets for flood-peak
263   estimations. One of them is the sum of weighted absolute errors (SWAE), which is defined as:

$$\mathrm{SWAE} = \sum_{t=1}^{n} Q_t^{\,a} \, \lvert Q_t - q_t \rvert, \qquad \mathrm{SWAE} \in [0, \infty[ \qquad (2)$$

265           A value of a = 1.5 was used as proposed by Lamb (1999) for evaluation of peak flow
266   conditions. Behavioural simulations show a lower SWAE.

267           The 2527 MC runs (Figure 3) were ranked according to their performance in the
268   defined calibration period. As a compound measure of performance a weighted product of
269   NSE (weight=3) and SWAE (weight=1) was adopted to build a single score Li:
$$L_i = \left( \frac{\mathrm{NSE}_i}{\mathrm{NSE}_{AVG}} \right)^{3} \cdot \frac{\mathrm{SWAE}_{AVG}}{\mathrm{SWAE}_i} \qquad (3)$$

            An MC realization i with NSEi above the average NSEAVG of all realizations and
      SWAEi below the average SWAEAVG of all realizations is ranked higher than MC runs
      showing the opposite behaviour with respect to the average NSE and SWAE. The analysis of
      the MC runs showed that SWAE varied between 3500 and 6000, while NSE ranged from 0.71 to
      0.84 (Figure 3). Overall, Li ranged between 0.5 and 1.34 for all runs.
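The three scores can be written down compactly. The sketch below follows Equations 1 to 3 as described in the text, with the stated weights (3 for NSE, 1 for SWAE) entering as exponents of a weighted product; that reading, the function names and the toy numbers are assumptions made for illustration.

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency of simulated against observed runoff."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def swae(obs, sim, a=1.5):
    """Sum of absolute errors weighted by observed runoff to the power a."""
    return sum(o ** a * abs(o - s) for o, s in zip(obs, sim))


def compound_scores(nse_values, swae_values, nse_weight=3, swae_weight=1):
    """Score L_i for every run, relative to the averages over all runs.

    Runs with above-average NSE and below-average SWAE score above 1.
    """
    nse_avg = sum(nse_values) / len(nse_values)
    swae_avg = sum(swae_values) / len(swae_values)
    return [(n / nse_avg) ** nse_weight * (swae_avg / s) ** swae_weight
            for n, s in zip(nse_values, swae_values)]


# Toy hydrograph (m3/s): the simulation overestimates the peak by 1 m3/s.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.0, 2.0, 3.0, 5.0]
toy_nse = nse(observed, simulated)
toy_swae = swae(observed, simulated)

# Two hypothetical MC runs described only by their NSE and SWAE values.
scores = compound_scores([0.80, 0.70], [4000.0, 5000.0])
```

In the toy hydrograph the single error sits on the largest observed value, so it receives the largest weight in SWAE, which is the behaviour wanted for peak-flow calibration.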

            Figure 3 shows a dot-plot of all MC realizations with NSE on the y-axis and SWAE
      on the x-axis. The obtained pattern allows a visual discrimination between realizations
      with higher and lower performance, with the best realizations in the upper-left region
      of the dot-plot. For the analysis in the remaining sections of the paper, three
      sub-samples of 26 parameter sets each were isolated by ranking all realizations
      according to Li. The first sub-sample consists of the best 26 realizations (99.5%; Li:
      1.289/1.339). The second sub-sample collects the 26 sets around the 95% ranking (Li:
      1.238/1.246). The third sub-sample is a selection of 26 runs around the 80% ranking (Li:
      1.152/1.157). Table 2 displays some statistical measures for the three sub-samples of 26
      parameter sets. Except for K1, the storage coefficient controlling the generation of
      interflow, the 26 runs with the highest performance show the lowest within-sample
      standard deviation for all seven tuneable parameters. The highest variability is found
      within the 80% sub-sample.
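One way to isolate such sub-samples is sketched below. The exact selection rule is not given in the text, so the windowing used here (26 consecutive ranks centred on the requested percentile, clipped at the top of the ranking) is an assumption; it is chosen so that the 99.5% sample coincides with the best 26 of 2527 runs.

```python
def subsample_around(scores, percentile, size=26):
    """Return indices of `size` runs whose rank is centred on `percentile`.

    Runs are ranked by ascending score, so percentile 100 is the best run.
    The window is clipped so that it never leaves the ranking.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    centre = round(percentile / 100.0 * (len(order) - 1))
    start = centre - size // 2 + 1
    start = max(0, min(start, len(order) - size))
    return order[start:start + size]


# Hypothetical scores: run i simply has score i, for 2527 MC runs.
li_scores = list(range(2527))
mod_995 = subsample_around(li_scores, 99.5)
mod_95 = subsample_around(li_scores, 95.0)
mod_80 = subsample_around(li_scores, 80.0)
```

With 2527 runs, a window of 26 ranks spans roughly one percent of the ranking, which matches the 94.5% to 95.5% band quoted for the 95% sub-sample in section 3.4.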


289   3.3. The selected peak-flow events

290         All experiments rely on a long-term simulation with PREVAH using the basic
291   parameter calibration (Table 1). Initial conditions for September 1st 2005 (Figure 4) are
292   generated by a reference run using interpolated observed pluviometer data. This reference run
293   starting on January 1st 1996 was obtained from an offline meteorological database, which also
294   includes stations that are not available in real-time. Starting from September 1st 2005 a second
295   long-term simulation relying on operationally available data only has been run to produce
296   initial conditions for March 1st 2007. This run used precipitation data from the operational
297   pluviometers operated by MeteoSwiss and the river network administration of the Canton of
298   Ticino (Figure 2) as an input. From March 1st 2007 operational time series of radar QPE and
299   REAL are also available.

            The time frame for the implementation of PREVAH in operational mode was chosen in
      order to have good initial conditions for the MAP D-PHASE demonstration period. In the
      period March 1st 2007 to November 23rd 2008 seven events with peak runoff ranging
      between 77 and 541 m3s-1 were identified (Table 3). The return period of the highest
      flood peak in the considered period, on September 7th 2008, is approximately 5 years on
      the basis of extreme value statistics of a time series starting in 1990 and having an
      average yearly flood of 385 m3s-1. The cumulative precipitation in the period preceding
      the seven events is also indicated in Table 3, both as spatially interpolated
      pluviometer data (areal precipitation estimate with inverse distance weighting
      interpolation) and as assimilated QPE from the weather radar. The event on October 29th
      2008 occurred after a relatively dry antecedent period, while in the days and weeks
      preceding the August 22nd 2007 event over 150 mm of rainfall were estimated for the
      Verzasca basin. The antecedent cumulative precipitation for the November 5th 2008 event
      includes the precipitation event that triggered the October 29th 2008 peak flow.


315   3.4 The seven experiments towards estimation of uncertainty superposition

316         The availability of several different data sets of deterministic and probabilistic
317   precipitation measurements and forecasts and the identification of sets of hydrological model
318   parameters allows the computation of uncertainty superposition. Seven different experiments
319   (Figure 4 and Table 3) have been completed:

      1) MOD/PLUV: in this experiment the simulations from March 1st 2007 were continued
         until November 25th 2008 with the same configuration used since September 1st 2005.
         No sources of uncertainty were considered. During the simulation a series of model
         starting points were stored 10 to 20 days ahead of a major discharge event (see
         section 3.3 and Table 3). The timing for saving the initial conditions was chosen to
         guarantee that almost only base flow contributes to the discharge at initialization
         and that a minor rainfall event is included in the time span between the model's
         restart point and the peak-flow event. In a second stage a temporally nested
         simulation was run starting from the defined initialization date (Day -10/-20, Table
         3) until 10 to 15 days after the event. For the nested sub-period 3 times 26 model
         runs were completed (Figures 3 and 4 and Table 2).

      2) MOD/RAD: this experiment is identical to the MOD/PLUV experiment, with the only
         change that the precipitation forcing is obtained from the weather radar (see
         section 2.2). Also in this case model runs for temporally nested sub-periods
         corresponding to peak-flow events were run by accounting for the uncertainty in the
         determination of calibrated model parameters. Note that in simulations forced with
         radar data (either deterministic or estimated with REAL) no bias correction of
         rainfall and snowfall is applied, because the radar QPE is already corrected for
         biases during pre-processing (Germann et al., 2006). The two parameters of PREVAH
         controlling such corrections are therefore set to 0% (Table 2).

339   3) REAL: in this experiment only the uncertainty arising from the weather radar QPE is
340      accounted for. 25 ensemble members from the radar ensemble generator (section 2.2) are
341      used to force PREVAH. The initial conditions at initialization of the nested runs are
342      obtained from the MOD/RAD experiment, being forced with the deterministic radar QPE
343      since March 1st 2007 (Figure 4).

344   4) REAL/MOD: this experiment is the first one where uncertainty superposition is
345      considered. To reduce the computational effort, only one of the 3 parameter sub-sets is
346      accounted for (see section 4.1), namely the 95% sub set (Table 2), which includes the 26
347      model runs being ranked in the top 94.5% to 95.5% among the 2527 MC realizations (see
348      section 3.2). In detail: for each nested period 25 (REAL) x 26 (MOD_95%) runs were
349      completed in order to estimate the interaction between the radar-QPE and the uncertainties
350      of the hydrological model. Also in this case restart points for the hydrological model were
351      saved for later initialization of probabilistic forecasts with COSMO-LEPS (see below and
352      Table 4).

353   5) LEPS: in this experiment only the uncertainty arising from feeding PREVAH with the 16
354      COSMO-LEPS ensemble members is accounted for. The initial conditions at initialization
355      of COSMO-LEPS forecasts (Table 4 and Figure 4) are obtained from the model being
356      forced with the deterministic radar QPE since March 1st 2007 (Figure 4). A total of 19
357      COSMO-LEPS 5-day forecasts for the Verzasca river basin were selected (Table 3).
358      COSMO-LEPS QPF are not bias corrected (Table 2).

      6) LEPS/MOD: in this experiment both the uncertainty of the model parameters and of the
         NWP forecasts are considered. In detail: for each COSMO-LEPS initialization point 16
         (COSMO-LEPS) x 26 (MOD_95%) runs were completed. This gives an ensemble of 416
         5-day forecasts.

      7) FULL: the final experiment is the combination of REAL/MOD and LEPS/MOD. For each
         of the 19 COSMO-LEPS ensemble forecasts 650 different initial conditions are
         available from the superposition of REAL with the model parameter uncertainty (see
         above). Thus, 650 (REAL/MOD) x 16 (COSMO-LEPS) runs were computed for all 19
         forecasts (Figure 4 and Table 3). An overall ensemble of 10400 members results for
         evaluating and quantifying uncertainty superposition by simultaneous consideration
         of uncertainties in the QPE (REAL), in the NWP forecasts (COSMO-LEPS) and in the
         determination of the parameters of the hydrological model (MOD_95%).
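The ensemble sizes quoted for the superposition experiments follow directly from combining the individual sources. The short sketch below enumerates the combinations; the member indices are placeholders, and pairing each FULL run with the parameter set of its initial condition is an assumption, since the text does not spell this out.

```python
from itertools import product

N_REAL = 25   # radar ensemble members (section 2.2)
N_MOD = 26    # parameter sets in one sub-sample (section 3.2)
N_LEPS = 16   # COSMO-LEPS members (section 2.3)

real_members = range(N_REAL)
parameter_sets = range(N_MOD)
leps_members = range(N_LEPS)

# REAL/MOD: every radar member combined with every parameter set.
real_mod = list(product(real_members, parameter_sets))

# LEPS/MOD: every NWP member combined with every parameter set.
leps_mod = list(product(leps_members, parameter_sets))

# FULL: every REAL/MOD initial condition combined with every NWP member.
# Assumption: the forecast run keeps the parameter set of its initial condition.
full = [(real, params, leps)
        for (real, params), leps in product(real_mod, leps_members)]

ensemble_sizes = {
    "REAL/MOD": len(real_mod),
    "LEPS/MOD": len(leps_mod),
    "FULL": len(full),
}
```

The multiplicative growth is the reason why only the 95% parameter sub-sample is carried into the superposition experiments.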


372   3.5. Quantification of uncertainty

            We aim to quantify the propagation and superposition of uncertainty when forcing
      PREVAH with different meteorological time series and different configurations of its
      tunable parameters. In all experiments a time frame of 120 hours is evaluated (Figure
      4). The time frame is defined by the initialization time of the COSMO-LEPS forecast
      used. We assume that the average spread of the simulated ensemble hydrographs is related
      to the uncertainty of the experimental settings used. To allow intercomparison between
      experiments, all statistics have been computed for the same 120-hour period. We take the
      average of the ensemble quantiles during the 120 hours as an objective measure for
      quantifying the uncertainty. Prior to the averaging, quantiles ($q_{\%}$) are determined
      for each of the 120 hours being evaluated. Equation 4 defines the computation of the
      average of quantiles $\bar{q}_{\%}$ for the defined time frame:

$$\bar{q}_{\%} = \frac{1}{n} \sum_{i=1}^{n} q_{\%,i}, \qquad n = 120 \text{ time steps} \qquad (4)$$

            $\bar{q}_{\%}$ denotes the “average quantile” of discharge during n = 120 time
      steps. It has been computed for the levels 0%, 25%, 50% (the median), 75% and 100%. The
      average interquartile range IQR is obtained by subtracting $\bar{q}_{25}$ from
      $\bar{q}_{75}$, while the average range of spread is computed by subtracting
      $\bar{q}_{0}$ from $\bar{q}_{100}$.
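The spread measures defined above can be computed as in the following sketch. The quantile rule (linear interpolation between order statistics) is an assumption, because the text does not state which estimator was used, and the five-member ensemble is a synthetic example.

```python
def quantile(values, p):
    """Quantile at level p (0..1) using linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1.0 - frac) + ordered[upper] * frac


def average_quantiles(ensemble, levels=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Average over all time steps of the per-time-step ensemble quantiles.

    ensemble[m][t] is the discharge of member m at time step t.
    """
    n_steps = len(ensemble[0])
    result = {}
    for p in levels:
        per_step = [quantile([member[t] for member in ensemble], p)
                    for t in range(n_steps)]
        result[p] = sum(per_step) / n_steps
    return result


# Synthetic ensemble: 5 members with constant discharge 10..14 m3/s for 120 hours.
hydrographs = [[10.0 + m] * 120 for m in range(5)]
avg_q = average_quantiles(hydrographs)
average_iqr = avg_q[0.75] - avg_q[0.25]
average_range = avg_q[1.0] - avg_q[0.0]
```

Note that the quantiles are taken across members at each hour and only then averaged in time, so the resulting envelope does not correspond to any single ensemble member.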


390   4. Results

            In this section the findings from the different experiments are discussed. The
      observed runoff hydrograph and the average discharge during the events are also plotted
      and give a subjective indication of the plausibility of the obtained results. The
      evaluation of long series of operational forecasts with COSMO-LEPS and of nowcast runs
      with REAL, MOD/PLUV and MOD/RAD is not detailed here. Nevertheless Appendix A1 and
      Figure A1 give a concise summary of the quality of the probabilistic (COSMO-LEPS, REAL)
      and deterministic (MOD, RAD) simulations during the period June 2007 to November 2008,
      which includes all selected events and for which a detailed verification report is
      available (Diezig et al., 2010). The verification indicates that all deterministic and
      probabilistic meteorological inputs used result in discharge estimates that perform
      better than climatology. Even though REAL and COSMO-LEPS present similar skill against
      observations, the following sections show that the spread of these two sources of
      ensemble precipitation input may differ considerably for events leading to high
      discharge.


405   4.1 Parameter uncertainty

406         The MOD/PLUV and MOD/RAD experiments have been evaluated by quantifying the
407   average ensemble spread ( q100 - q 0 ) during the seven events (Table 4). MOD/PLUV was run
408   using each of the three different sub-sets of parameter realizations (Table 2). For MOD/RAD
409   only the results from the 26 realizations from the set MOD_95% are shown. Depending on the
410   intensity of the event (peak-flow) and the differences in antecedent precipitation (Table 3)
411   different values of spread are obtained for the different events. The largest average ensemble
412   spread (about 30 m3s-1 for both MOD_95%/PLUV and MOD_95%/RAD) is found during the
413   event leading to the September 7th 2008 peak-flow of 541 m3s-1.

            The application of parameter sub-samples with better calibration performance
      (higher NSE and lower SWAE, i.e. higher Li) results in reduced spread. The average
      spread resulting from propagating the MOD_99.5% sub-sample is 30% lower than the one
      computed when propagating MOD_95%. The spread obtained by propagating the MOD_80%
      sub-sample is about 30% higher than the one obtained from MOD_95% (Table 4).

            The average spread for the seven events obtained from 26 realizations of PREVAH
      forced with weather radar QPE (MOD_95%/RAD) is about 14% lower than the corresponding
      spread of the runs forced with interpolated pluviometer data (MOD_95%/PLUV). Only for
      the event leading to the July 13th 2008 peak flow is the spread of the radar-driven
      simulations larger than that of the simulations run with rain gauge data. This is due to
      a local convective rainfall event that was not recorded by the pluviometers, but that
      resulted in locally very high radar QPE. The main reason for the lower spread with radar
      QPE than with pluviometer forcing is the effect of bias correction, which is applied to
      the pluviometer data only. The variation in the bias correction (Table 3) covers both
      the input and model uncertainties. This is the way errors in estimating precipitation
      are currently accounted for. However, there is an important constraint as compared to
      state-of-the-art observation-based precipitation ensembles (e.g. Ahrens and Jaun, 2007;
      Moulin et al., 2009; Pappenberger et al., 2009). The hydrological model uses the
      precipitation bias corrections (Table 2) as a global tunable parameter to account for
      different sources of error in the treatment of rain gauge data: a) direct measurement
      errors, b) systematic errors due to the choice, location and availability of
      meteorological stations, and c) uncertainties in the generation of spatially
      interpolated fields. Additionally, the bias correction parameters contribute to
      compensating systematic errors in the estimation of evapotranspiration and other water
      fluxes by PREVAH (Zappa, 2002; Viviroli et al., 2009a). Methods for generating
      observation-based ensembles (based on either weather radar or simulations) focus only on
      the estimation uncertainties in the gridding of precipitation information and are
      therefore better suited for the propagation of input uncertainties.

            Figure 5 shows in detail the simulations for the November 5th 2008 event. The
      related evaluation of the average spread for the 120-hour window starting on November
      3rd 2008 00:00 is summarized in Table 4. The spread arising from the three different
      parameter sub-sets clearly increases when sets with lower Li during the calibration
      period are used. While the shape of the simulated ensembles above and below the median
      remains very similar among the three cases, the distance between the upper and lower
      ensemble envelopes grows with decreasing likelihood within the calibration period. As a
      consequence, the number of observations falling within the uncertainty band drawn by the
      ensembles increases when using MOD_80% as compared to both MOD_99.5% and MOD_95%. The
      spread computed when using weather radar information is slightly higher at the start of
      the event. During the event the spread obtained from radar forcing becomes clearly
      smaller than the one obtained from forcing with interpolated pluviometer data. This is
      confirmed by the average values of spread during the event (Table 4).

            Spreads resulting from this analysis range between 7 and 30 m3s-1 for the seven
      investigated events (MOD_95%). From now on we use MOD_95% as the benchmark against which
      spreads resulting from the other sources of uncertainty are compared. This choice is
      intended to avoid the overfitting that might result from using MOD_99.5% as a benchmark.


459   4.2 Weather radar uncertainty and superposition with parameter uncertainty

            Following the proof-of-concept presented in Germann et al. (2009), PREVAH was run
      with the radar QPE ensembles obtained from REAL. The runs forced by REAL members use the
      initial conditions of a deterministic run forced by the operational radar QPE of
      MeteoSwiss until some days ahead of the event (Figure 4 and Table 4). From that
      initialization point the procedure described in section 3.4 (experiment “REAL”) is
      applied. As for the results presented in the previous section, the average ensemble
      spread of the seven selected events has been computed for a 120-hour time frame (Table
      5). Analogously, the experiment REAL/MOD was completed by varying both the REAL member
      and the calibrated parameter realization from the MOD_95% set, one after the other.

            The model runs resulted in an average spread ranging between 25 and 167 m3s-1 for
      REAL and between 34 and 216 m3s-1 for REAL/MOD. Compared with the outcomes of
      MOD_95%/RAD, the REAL and REAL/MOD runs (Table 4) present a spread that is higher by a
      factor of 4.3 (REAL) and 5.6 (REAL/MOD).

            Figure 6 shows two examples of 5-day ensemble hydrographs obtained for the
      experiments REAL and REAL/MOD. In contrast to the ensembles shown in Figure 5, almost
      all observed values fall within the ensemble envelopes. Only the falling limbs close to
      the end of the simulation are underestimated by both the REAL and the REAL/MOD
      ensembles. For the August 22nd 2007 and November 5th 2008 events there is clear evidence
      that the spread arising from joint consideration of two sources of uncertainty is higher
      than the one obtained by propagating only the REAL members through the hydrological
      model. The spread from the REAL/MOD ensemble is 25% to 40% higher than that of the REAL
      realizations. The average additional spread for the seven events is 17 m3s-1 (Table 5).
      Combining the analyses of Tables 4 and 5, the following findings can be stated for the
      simulations REAL and REAL/MOD:

483      -    The average spread for the seven events stemming from the parameter ensemble is
484           about 12 m3s-1 (PREVAH forced by deterministic radar QPE and 26 parameter
485           realizations from MOD_95%).

486      -    The coupling of PREVAH with REAL results in hydrograph ensembles with an
487           average spread of over 55 m3s-1 for the same seven events.

488      -    If both REAL and MOD_95% are applied an ensemble of 650 members is generated.
489           The obtained average spread in this case is about 72 m3s-1.

            This means that REAL/MOD generates a 6% to 7% larger spread than the sum of the
      spreads obtained from the experiments MOD_95%/RAD and REAL (67 m3s-1). This indicates
      that superposing two sources of uncertainty amplifies the spread. Our particular
      modelling system is characterized by non-linear responses, mostly explained by
      conceptual threshold processes in the runoff generation module of PREVAH (Gurtz et al.,
      2003). At the level of the interquartile range, amplification of spread has been
      observed in only one of the 19 cases considered (Table 3). Thus only a subset of all
      considered REAL and MOD combinations triggers a non-linear reaction within the runoff
      generation module of PREVAH. In all other cases the IQR-spread of REAL/MOD is on average
      9% smaller than the cumulative spread of REAL and MOD.


4.3 COSMO-LEPS uncertainty and superposition with model uncertainty

As expected, the average spread obtained by propagating NWP forecasts through the
hydrological model is much larger than the one obtained from the experiments discussed
above (Figure 7 and Table 4). The LEPS experiment generates average spreads that are
about 10 times higher than those of MOD_95%/RAD and 2.3 times higher than those of
REAL (Tables 4 and 5). Unlike the previous experiments, which are always tied to a
precipitation event that actually occurred, the LEPS ensemble (initialized as declared in
Table 5) also includes members that forecast very low or no precipitation at all for the
respective event (e.g. Figure 7 for the August 22nd 2007 event, upper panels). The forecast
initialized on August 20th 2007 includes a relevant number of members that show no runoff
increase at all within the 120 forecast hours. Even the 25% quantile shows a maximum
discharge that is only slightly higher than the discharge at initialization time. For this event
the whole observed time series falls within the envelope drawn by the LEPS experiment.
Unfortunately, the spread is very large, which makes any kind of decision making related to
that case almost impossible. Nevertheless, in this specific case a potential end-user taking
action on the basis of the 75% quantile would have been very efficient in their decision
making. Further considerations on skill for decision making are only possible after sound
verification of long-term time series of consecutive forecasts (e.g., Fundel and Zappa, 2010).

The results from the November 5th 2008 event (lower panels in Figure 7) show different
characteristics. All LEPS ensemble members agree that the first runoff peak is to be
expected in the second half of the first day of the forecast, and that a second (higher) peak will
arrive about 60 hours after initialization of the forecast. Potential users focusing on the 75%
quantile would probably have over-reacted at the start of the event, but would have been able
to cope with the peak on November 5th 2008.

The LEPS/MOD experiment represents a second series of simulations, in which
parameter uncertainty is accounted for and superposed on the uncertainty originating from
LEPS (right panels in Figure 7). The average spread from the LEPS/MOD ensemble is 13% to
25% higher than that of the model realizations based on LEPS only. The average
additional spread for the seven events is 23 m³ s⁻¹ (Table 5).

In analogy to the joint consideration of Tables 4 and 5 in Section 4.2, the experiments with
LEPS and LEPS/MOD allow the following statements:

   -    Average spread from MOD_95%/RAD is about 12 m³ s⁻¹ (see above).

   -    The coupling of PREVAH with LEPS generates hydrograph ensembles with an
        average spread of over 130 m³ s⁻¹ for the seven events considered.

   -    Applying both LEPS and MOD_95% results in an ensemble of 416 members. The
        obtained average spread is larger than 150 m³ s⁻¹.

This means that LEPS/MOD generates a 9% to 10% larger spread than the sum of the
spreads obtained from the experiments MOD_95%/RAD and LEPS (142 m³ s⁻¹). Also in this
case the superposition of the two sources of uncertainty amplifies the full spread. Here an
amplification of spread measured by the interquartile range has been observed in seven
cases (Table 3). On average the IQR spread of LEPS/MOD is 2% smaller than the cumulative
spread of LEPS and MOD.

When propagating numerical forecasts from an ensemble prediction system such as
COSMO-LEPS through a hydrological model for mesoscale areas such as the Verzasca basin
(186 km²), the large mismatch between the basin area and the resolution of the ensemble
prediction system (10 x 10 km² mesh size) has to be kept in mind. Nevertheless, studies with
this kind of hydrological ensemble prediction have been very popular in the last few years
(Cloke and Pappenberger, 2009; Jaun et al., 2008) and have already found application in
operational chains. This scale restriction is less problematic for applications in macro-scale
basins (Pappenberger et al., 2005; Bartholmes et al., 2009).

4.4 Superposition of three sources of uncertainty

The last experiment combines the initial conditions obtained from the REAL/MOD
experiment (650 members) with the 16 ensemble members of COSMO-LEPS (see Section 3.4
and Figure 4) and thus considers the entire “uncertainty triplet” (Figure 1). The LEPS/MOD
experiment discussed above is extended by additionally perturbing the initial conditions:
PREVAH is forced with REAL up to the start of the COSMO-LEPS propagation through
PREVAH. By accounting for these additional perturbations the average spread for the seven
events increases by about 4.5%, from 153 (LEPS/MOD) to 160 m³ s⁻¹ (FULL, Table 5). Only
the run initialized on August 12th 2008 shows a distinctly higher additional uncertainty (~15%
more) in the FULL experiment than in the LEPS/MOD experiment (Figure 8). The
FULL ensemble already shows a large spread at initialization, as determined by the
antecedent conditions obtained from the REAL/MOD runs. This difference in the overall spread
gradually diminishes, but it is still well defined at the time of the first runoff peak shortly after
2:00 on August 13th 2008, when the maximum peak flow of FULL is about 50 m³ s⁻¹ higher
than the corresponding LEPS/MOD peak. The difference is also clearly visible in the IQR. The
second peak, late in the evening of August 15th 2008, shows nearly identical shape and ranges
for both FULL and LEPS/MOD. The uncertainty owed to the REAL influence on
REAL/MOD decays during the first part of the event.
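
The ensemble sizes of the superposition experiments follow from multiplying the member
counts of the individual sources. The short Python sketch below assumes a full factorial
combination of members. The number of REAL members is not stated in this section and is
inferred here from 650 / 26, which is an assumption of the sketch.

```python
# Member counts stated in the text.
N_MOD = 26        # parameter realizations from MOD_95%
N_LEPS = 16       # COSMO-LEPS ensemble members
N_REAL_MOD = 650  # size of the REAL/MOD experiment

# Assumption: REAL/MOD is a full factorial design, so REAL has 650 / 26 members.
N_REAL = N_REAL_MOD // N_MOD

sizes = {
    "REAL/MOD": N_REAL * N_MOD,
    "LEPS/MOD": N_LEPS * N_MOD,
    "FULL": N_REAL * N_MOD * N_LEPS,
}
# Number of FULL runs that share one parameter set.
runs_per_parameter_set = sizes["FULL"] // N_MOD

# Relative increase of the average spread from LEPS/MOD (153) to FULL (160 m3 s-1).
relative_increase = (160.0 - 153.0) / 153.0

for name, size in sizes.items():
    print(f"{name}: {size} members")
print(f"FULL runs per parameter set: {runs_per_parameter_set}")
print(f"spread increase LEPS/MOD -> FULL: {relative_increase:.1%}")
```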

Figure 9 shows an overview of all 19 “FULL” experiments, each of them summarizing
the spread arising from 10400 five-day forecasts. In 12 cases the observed average discharge is
found within the IQR. Only the experiment with the longest lead time, initialized on August
10th 2008, produced a q100 lower than the observed average discharge during the 120 forecast
hours considered. The model run initialized 24 hours later (August 11th 2008) generates an
ensemble spread that strongly overestimates the observed value. The corresponding runs for
the three following days show a gradual reduction in ensemble spread. The reason for the large
spread is that some COSMO-LEPS members forecast severe convective precipitation,
while others predicted no precipitation at all. In the end, a moderate thunderstorm occurred in
the evening of August 11th 2008. REAL also generated a large spread among its members, with
cumulated rainfall for August 11th 2008 ranging between 3 and 40 mm. This explains the
large discrepancy in initial conditions observed at initialization of the LEPS forecasts on
August 12th 2008 (see Figure 8).


4.5 Attributing contributions to the total uncertainty

The outcome from the three experiments dealing with uncertainty superposition (FULL,
REAL/MOD, LEPS/MOD) can be sorted in order to allocate the contribution of each
source of spread to the whole experimental uncertainty. For this analysis we focus on
one event only, namely the November 5th 2008 event with COSMO-LEPS forecasts initialized
on November 3rd 2008 (see also Figures 5 to 7). The following procedure was applied:

   -    Calculate the quantiles of all runs of the experiment (Eq. 4);

   -    Group in turn all runs sharing the same MOD, REAL or LEPS member (Table 6);

   -    Average the quantiles of the sub-sample and calculate the corresponding spread
        metrics (Figure 10).
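
The three steps above can be sketched in Python as follows. This is an illustrative sketch
under stated assumptions, not the code used for the study: the toy ensemble is synthetic, the
function names (spread, mean_spread, spread_with_fixed) are hypothetical, and the quantiles
are taken per time step with the standard library rather than with Eq. 4.

```python
from collections import defaultdict
from itertools import product
from statistics import mean, quantiles


def spread(values):
    """Return (q100 - q0, q75 - q25) for one time step."""
    q25, _, q75 = quantiles(values, n=4, method="inclusive")
    return max(values) - min(values), q75 - q25


def mean_spread(series_list):
    """Time-averaged full range and IQR of a set of discharge series."""
    per_step = [spread(step) for step in zip(*series_list)]
    return mean(s[0] for s in per_step), mean(s[1] for s in per_step)


def spread_with_fixed(runs, source):
    """Average spread over sub-samples sharing one member of `source`.

    `runs` maps (MOD, REAL, LEPS) member indices to a discharge series;
    `source` is the position in that key that is held fixed (filtered).
    """
    groups = defaultdict(list)
    for key, series in runs.items():
        groups[key[source]].append(series)
    results = [mean_spread(group) for group in groups.values()]
    return mean(r[0] for r in results), mean(r[1] for r in results)


# Synthetic toy ensemble (assumption): LEPS dominates, REAL contributes least.
runs = {
    (m, r, l): [10.0 + 1.0 * m + 0.5 * r + 8.0 * l * t for t in range(1, 4)]
    for m, r, l in product(range(3), range(2), range(4))
}

full_range, full_iqr = mean_spread(list(runs.values()))
for index, name in enumerate(("MOD", "REAL", "LEPS")):
    rng, _ = spread_with_fixed(runs, index)
    print(f"{name} fixed: {rng / full_range:.0%} of the full q100 - q0 remains")
```

Holding one source fixed and averaging over its members leaves the spread caused by the
remaining sources, which is the quantity reported as a percentage of the total spread.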

The three main findings from Table 6 are:

   a) FULL: The 10400 FULL runs give an average ensemble spread “q100 – q0” of 139.0
        m³ s⁻¹. There are 400 model runs sharing the same parameter set. This means that we
        can compute 26 different “q100 – q0” values and average them to obtain an integral
        measure of the spread attributed to the two sources of uncertainty that have
        been varied in this specific case (REAL and LEPS). In this example the “q100 – q0” that
        cannot be attributed to MOD is 125.4 m³ s⁻¹ (90% of the total spread). When REAL is
        used as a filter and both MOD and LEPS are varied, almost 98% of the “q100 – q0”
        is obtained. REAL thus contributes in a very limited way to the whole ensemble
        spread. Finally, if the influence of LEPS is averaged, only 11.5% of the FULL
        spread can be attributed (Table 6). Similar outcomes are observed when looking at the
        IQR “q75 – q25”.

   b) REAL/MOD: The REAL/MOD ensemble generates a “q100 – q0” of 56.3 m³ s⁻¹. Here
        the “q100 – q0” that cannot be allocated to MOD is 45.9 m³ s⁻¹ (80% of the total
        spread). When REAL is used as a filter and only MOD is varied, 17.9% of the spread
        can be allocated (Table 6).

   c) LEPS/MOD: The 416 LEPS/MOD realizations result in an average “q100 – q0”
        of 138 m³ s⁻¹. When focusing on the role of changing MOD and averaging the
        influence of LEPS, “q100 – q0” is only 15.6 m³ s⁻¹ (11% of the total spread). If we
        make a sub-sample that filters the spread of MOD, about 88% of the whole
        spread of LEPS/MOD can still be allocated (Table 6).

Figure 10 is a graphic rendering of Table 6 in the form of box plots. All experiments in
which LEPS contributes to the spread variation show an average spread close to that of the
whole experiment. If only LEPS is propagated, the average spread is 130
m³ s⁻¹ (Table 5). If model uncertainty is also propagated, the average spread increases by
about 23 m³ s⁻¹ to 153 m³ s⁻¹. If different initial conditions from REAL are also considered, the
additional increase is only 7 m³ s⁻¹ (total: 160 m³ s⁻¹, Table 5). If REAL is used to generate
initial conditions only, its influence on the total spread is smaller than the influence of the
hydrological model uncertainty. Using REAL as a forcing during the event increases the
spread by about 4.5 times (in the specific case of November 5th 2008) compared to the spread
that can be attributed to the model parameters. This confirms the outcomes summarized in
Tables 4 and 5.


5. Discussion and conclusions

The experimental setup presented in this paper, which accounts for three sources of
uncertainty, provides answers to questions linked to uncertainty propagation and
superposition in a hydrometeorological forecasting system.

The setup showed that the hydrological model (PREVAH) uncertainty is less
pronounced than the uncertainty obtained by propagating radar precipitation fields (REAL)
and NWP forecasts (COSMO-LEPS) through the hydrological model. For a five-day forecast
range in the seven events considered, the average spread differs by a factor larger
than four between MOD/RAD and REAL and by a factor above ten between MOD/RAD and
LEPS.

Since the Verzasca basin (186 km²) is not much larger than a single COSMO-LEPS grid
cell (10 x 10 km²), there is almost no averaging effect. This contributes to the large
spread of the obtained hydrographs when COSMO-LEPS is used. Gallus (2002) warns against
using NWP grid-point information for verification against point data. For the
Verzasca this is almost the case, since we use information from only a few COSMO-LEPS
grid points to force our impact model and compare the result to observations.

The estimation of PREVAH parameter uncertainty depends strongly on the way the
parameters have been sampled and ranked. Numerous approaches are possible for this kind of
problem (Matott et al., 2009). We are confident that the chosen approach is appropriate for
estimating the parameter uncertainty of PREVAH within the presented superposition
experiment. Of course, the parameter uncertainty is estimated on the basis of the whole
calibration period. Current literature (He et al., 2009; Cullmann and Wriedt, 2008;
Pappenberger and Beven, 2004) offers some examples of approaches that try to combine
parameter configurations that are successful over the complete data set with other parameter
configurations estimated for single events or series of events.

Amplification of spread is obtained if the combination of LEPS (or REAL) and MOD
triggers a nonlinear reaction of the runoff generation module of PREVAH (Gurtz et al., 2003;
Viviroli et al., 2009a), which includes a threshold parameter for activating the generation of
surface runoff (Table 1). Such a nonlinear response needs to be accounted for by hydrological
models, since a sudden increase of discharge coefficients has been observed in many basins
during long-lasting heavy precipitation events (e.g. Naef et al., 2008). Such threshold
processes can also be identified in the form of step structures in flood frequency statistics
(e.g. Merz and Blöschl, 2008). In all considered cases we observed an amplification of the full
spread, while the corresponding interquartile range is mostly smaller when two error sources
are superposed.

By use of REAL, input uncertainties are considered for nowcasting. We showed that the
simultaneous application of REAL and parameter uncertainties generates ensembles that
envelop the observed hydrograph well. Besides weather-radar-based approaches,
observation-based ensembles with pluviometer data have recently been proposed. Recent
studies propose the use of the Kriging variance (Ahrens and Jaun, 2007; Moulin et al., 2009;
Pappenberger et al., 2009) for estimating the interpolation uncertainty of ground-based
precipitation data for hydrological purposes. Jaun (2008) showed that hydrological simulations
forced by observation-based ensembles are sensitive to the density and number of stations
available. The interpolation uncertainty increases with decreasing number of representative
stations available. These restrictions do not apply to REAL, which is able to operationally
generate high-resolution observation-based ensembles for hydrology. Nevertheless,
pluviometer-based ensembles are certainly a feasible way to consider input
uncertainty in regions where the weather radar coverage is not adequate. Further efforts are
planned in order to implement interpolation-based ensembles within our experimental chain.

The use of weather radar ensembles for generating hydrologically consistent ensembles
of initial conditions prior to the propagation of COSMO-LEPS through the hydrological
model shows that the uncertainty in initial conditions decays within the first 48 hours of the
forecast. The magnitude of the uncertainty attributed to the difference in initial conditions is
smaller than the uncertainty attributed to the hydrological model parameters and almost
negligible with respect to the spread owed to COSMO-LEPS.

The operational implementation of this experiment for the small Verzasca river basin
would in principle be possible. Realizing a run with all 10400 ensemble members, including
the 650 runs for the determination of initial conditions, requires about 6 hours of CPU time.
Application to larger river basins requires a reduction in the number of simulations. The
adaptive forecasting concept proposed by Romanowicz et al. (2006, 2008) could be a possible
approach to estimate which members need to be computed.

Acknowledgments:
We want to acknowledge the Swiss Federal Office for the Environment for providing us with
runoff data from their operational networks. Thanks to the Ufficio dei corsi d’acqua (Canton
Ticino) and Istituto Scienze della Terra (SUPSI) for additional rain-gauge data. This study is
part of COST-731 and MAP D-PHASE, and was funded by MeteoSwiss, WSL and the State
Secretariat for Education and Research SER (COST 731). The comments of the two reviewers
F. Pappenberger and L. Moulin and of the Guest Editor A. Rossa helped to clarify the paper.

Appendix A1:
In this appendix we give a concise summary of the verification of the probabilistic
(COSMO-LEPS, REAL) and deterministic (MOD, RAD) simulations during the period June 2007
to November 2008, expressed with probabilistic measures of skill. This kind of verification is
established in atmospheric sciences (Brier, 1950; Wilks, 2006; Weigel et al., 2007; Ahrens
and Walser, 2008) and is enjoying increasing popularity in hydrological sciences, both for the
analysis of single events and for the verification of long time series (Jaun et al., 2008; Jaun and
Ahrens, 2009; Bartholmes et al., 2009; Roulin and Vannitsem, 2005; Roulin, 2007; Laio and
Tamea, 2007; Brown et al., 2010).

Figure A1 shows the relative operating characteristic curves (ROC, Wilks, 2006) of LEPS and
REAL for the period June 2007 to November 2008. The ROC values for the deterministic
simulations MOD/RAD and MOD/PLUV are also indicated, each as a point. The analysis has
been completed for three different thresholds, each representing a percentile (50%, 75%,
95%) of the observed discharge during these 18 months. Additionally, the Brier Skill Score
(BSS, Wilks, 2006) of the ensemble products is given. For LEPS the analysis has been
completed for different lead times (Jaun and Ahrens, 2009). Lead times of one and five
days are displayed in Figure A1.
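
For readers unfamiliar with these scores, the following Python sketch shows how a Brier Skill
Score and a single ROC point could be computed for threshold exceedance. It is a minimal
illustration under stated assumptions and not the verification code used here: the function
names and the toy data are hypothetical, the forecast probability is taken as the fraction of
ensemble members above the threshold, and the reference forecast is the in-sample base rate.

```python
def exceedance_probabilities(ensembles, threshold):
    """Fraction of members above the threshold for each forecast."""
    return [sum(m > threshold for m in ens) / len(ens) for ens in ensembles]


def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and outcome (0/1)."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def brier_skill_score(ensembles, observations, threshold):
    """BSS relative to the observed base rate (assumed reference forecast)."""
    outcomes = [1.0 if obs > threshold else 0.0 for obs in observations]
    probabilities = exceedance_probabilities(ensembles, threshold)
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    # The reference score is zero if the event always or never occurs.
    if reference == 0.0:
        raise ValueError("BSS undefined: event occurs in all or none of the cases")
    return 1.0 - brier_score(probabilities, outcomes) / reference


def roc_point(probabilities, outcomes, cutoff):
    """Hit rate and false alarm rate when warning at probability >= cutoff."""
    warned = [p >= cutoff for p in probabilities]
    hits = sum(w and o == 1.0 for w, o in zip(warned, outcomes))
    false_alarms = sum(w and o == 0.0 for w, o in zip(warned, outcomes))
    events = sum(outcomes)
    non_events = len(outcomes) - events
    return hits / events, false_alarms / non_events


# Toy data (hypothetical): four forecasts with three members each, in m3 s-1.
ensembles = [[5.0, 6.0, 20.0], [18.0, 22.0, 25.0], [3.0, 4.0, 5.0], [16.0, 19.0, 30.0]]
observations = [7.0, 21.0, 4.0, 18.0]
threshold = 17.2  # the 75% discharge percentile quoted in this appendix

bss = brier_skill_score(ensembles, observations, threshold)
outcomes = [1.0 if obs > threshold else 0.0 for obs in observations]
hit_rate, false_alarm_rate = roc_point(
    exceedance_probabilities(ensembles, threshold), outcomes, cutoff=0.5
)
print(f"BSS = {bss:.3f}, hit rate = {hit_rate:.2f}, false alarm rate = {false_alarm_rate:.2f}")
```

Repeating roc_point for a range of cutoffs traces a ROC curve, whereas a deterministic
simulation yields a single point.
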
The obtained ROC and BSS show that both REAL and LEPS are skillful for all selected
thresholds. When low discharge percentiles are tested (50% and 75%), the BSS of LEPS
decreases only slightly between day-one and day-five forecasts. The skill of forecasts for
discharges above 75.8 m³ s⁻¹ (95% percentile) is better for LEPS forecasts with a lead time of
one day than for LEPS forecasts with a lead time of five days.

The BSS of REAL is high for the lowest and the highest percentiles considered. For the 75%
discharge percentile (17.2 m³ s⁻¹), REAL tends to have an increased rate of false alarms.
MOD/RAD and MOD/PLUV show similar behavior to the ensemble products. MOD/RAD
has a higher hit rate than MOD/PLUV when the 75% discharge percentile is tested. On the
other hand, MOD/PLUV has fewer false alarms than MOD/RAD when discharge above 5.97
m³ s⁻¹ (50% percentile) is verified. An extended objective quantitative verification of the
ensemble simulations against observed data will be presented in follow-up studies.

References
Ahrens, B. and Jaun, S., 2007. On evaluation of ensemble precipitation forecasts with
        observation-based ensembles. Advances in Geosciences, 10: 139-144.
Ahrens, B. and Walser, A., 2008. Information-based skill scores for probabilistic forecasts.
        Monthly Weather Review, 136(1): 352-363.
Bartholmes, J.C., Thielen, J., Ramos, M.H. and Gentilini, S., 2009. The European Flood Alert
        System EFAS - Part 2: Statistical skill assessment of probabilistic and deterministic
        operational forecasts. Hydrology and Earth System Sciences, 13(2): 141-153.

Bellerby, T.J. and Sun, J.Z., 2005. Probabilistic and ensemble representations of the
        uncertainty in an IR/microwave satellite precipitation product. Journal of
        Hydrometeorology, 6(6): 1032-1044.
Berenguer, M., Corral, C., Sanchez-Diezma, R. and Sempere-Torres, D., 2005. Hydrological
        validation of a radar-based nowcasting technique. Journal of Hydrometeorology, 6(4):
        532-549.
Beven, K., 1993. Prophecy, Reality and Uncertainty in Distributed Hydrological Modeling.
        Advances in Water Resources, 16(1): 41-51.
Beven, K., 2006. On undermining the science? Hydrological Processes, 20(14): 3141-3146.
Beven, K.J., 2001. Rainfall-runoff modelling: the primer. Wiley, Chichester, 360 pp.
Bosshard, T. and Zappa, M., 2008. Regional parameter allocation and predictive uncertainty
        estimation of a rainfall-runoff model in the poorly gauged Three Gorges Area (PR
        China). Physics and Chemistry of the Earth, 33(17-18): 1095-1104.
Bowler, N.E., Pierce, C.E. and Seed, A.W., 2006. STEPS: A probabilistic precipitation
        forecasting scheme which merges an extrapolation nowcast with downscaled NWP.
        Quarterly Journal of the Royal Meteorological Society, 132(620): 2127-2155.
Brier, G.W., 1950. Verification of forecasts expressed in terms of probability. Monthly
        Weather Review, 78(1): 1-3.
Brown, J.D., Demargne, J., Seo, D.J. and Liu, Y.Q., 2010. The Ensemble Verification System
        (EVS): A software tool for verifying ensemble forecasts of hydrometeorological and
        hydrologic variables at discrete locations. Environmental Modelling & Software,
        25(7): 854-872.
Bruen, M. et al., 2010. Visualizing flood forecasting uncertainty: some current European EPS
        platforms-COST731 working group 3. Atmospheric Science Letters, 11(2): 92-99.
Clark, M.P. and Slater, A.G., 2006. Probabilistic quantitative precipitation estimation in
        complex terrain. Journal of Hydrometeorology, 7(1): 3-22.
Cloke, H.L. and Pappenberger, F., 2009. Ensemble flood forecasting: A review. Journal of
        Hydrology, 375(3-4): 613-626.
Collier, C.G., 2007. Flash flood forecasting: What are the limits of predictability? Quarterly
        Journal of the Royal Meteorological Society, 133(622): 3-23.
Cullmann, J. and Wriedt, G., 2008. Joint application of event-based calibration and dynamic
        identifiability analysis in rainfall-runoff modelling: implications for model
        parametrisation. Journal of Hydroinformatics, 10(4): 301-316.
Diezig, R., Fundel, F., Jaun, S. and Vogt, S., 2010. Verification of runoff forecasts by the
        FOEN and the WSL. In: CHR (Editor), Advances in Flood Forecasting and the
        Implications for Risk Management. International Commission for the Hydrology of
        the Rhine Basin (CHR), Alkmaar, pp. 111-113.
Draper, D., 1995. Assessment and propagation of model uncertainty. Journal of the Royal
        Statistical Society Series B-Methodological, 57(1): 45-97.
Ehrendorfer, M., 1997. Predicting the uncertainty of numerical weather forecasts: a review.
        Meteorologische Zeitschrift, 6(4): 147-183.
Frick, J. and Hegg, C., 2011. Effects of uncertainty information in meteorological and
        hydrological forecasting on users' decision making processes. Atmospheric
        Research (Thematic Issue on COST731): this issue.
Fundel, F. and Zappa, M., 2011. Hydrological Ensemble Forecasting in Mesoscale
        Catchments: Sensitivity to Initial Conditions and Value of Reforecasts. Water
        Resources Research, in review.

Gallus, W.A., 2002. Impact of verification grid-box size on warm-season QPF skill measures.
        Weather and Forecasting, 17(6): 1296-1302.
Germann, U., Berenguer, M., Sempere-Torres, D. and Zappa, M., 2009. REAL - Ensemble
        radar precipitation estimation for hydrology in a mountainous region. Quarterly
        Journal of the Royal Meteorological Society, 135(639): 445-456.
Germann, U., Galli, G., Boscacci, M. and Bolliger, M., 2006. Radar precipitation
        measurement in a mountainous region. Quarterly Journal of the Royal Meteorological
        Society, 132(618): 1669-1692.
Gurtz, J., Baltensweiler, A. and Lang, H., 1999. Spatially distributed hydrotope-based
        modelling of evapotranspiration and runoff in mountainous basins. Hydrological
        Processes, 13(17): 2751-2768.
Gurtz, J. et al., 2003. A comparative study in modelling runoff and its components in two
        mountainous catchments. Hydrological Processes, 17(2): 297-311.
He, Y. et al., 2010. Ensemble forecasting using TIGGE for the July-September 2008 floods in
        the Upper Huai catchment: a case study. Atmospheric Science Letters, 11(2): 132-138.
He, Y. et al., 2009. Tracking the uncertainty in flood alerts driven by grand ensemble weather
        predictions. Meteorological Applications, 16(1): 91-101.
Jaun, S., 2008. Towards operational probabilistic runoff forecasts, Dissertation No. 17817,
        ETH Zurich, [available online at http://e-collection.ethbib.ethz.ch/view/eth:41686].
Jaun, S. and Ahrens, B., 2009. Evaluation of a probabilistic hydrometeorological forecast
        system. Hydrology and Earth System Sciences Discussions, 6: 1843-1877.
Jaun, S., Ahrens, B., Walser, A., Ewen, T. and Schär, C., 2008. A probabilistic view on the
        August 2005 floods in the upper Rhine catchment. Natural Hazards and Earth System
        Sciences, 8(2): 281-291.
Jaun, S. and Zappa, M., 2009. dphase_prevah: hydrological model PREVAH run by
        IAC_ETH and WSL for the MAP D-PHASE project. World Data Center for Climate.
        [doi: 10.1594/WDCC/dphase_prevah].
Koboltschnig, G.R., Schöner, W., Holzmann, H. and Zappa, M., 2009. Glaciermelt of a small
        basin contributing to runoff under the extreme climate conditions in the summer of
        2003. Hydrological Processes, 23(7): 1010-1018.
Laio, F. and Tamea, S., 2007. Verification tools for probabilistic forecasts of continuous
        hydrological variables. Hydrology and Earth System Sciences, 11(4): 1267-1277.
Lamb, R., 1999. Calibration of a conceptual rainfall-runoff model for flood frequency
        estimation by continuous simulation. Water Resources Research, 35(10): 3103-3114.
Lee, C.K., Lee, G., Zawadzki, I. and Kim, K.E., 2009. A Preliminary Analysis of Spatial
        Variability of Raindrop Size Distributions during Stratiform Rain Events. Journal of
        Applied Meteorology and Climatology, 48(2): 270-283.
Legates, D.R. and McCabe, G.J., 1999. Evaluating the use of "Goodness-of-Fit" measures in
        hydrologic and hydroclimatic model validation. Water Resources Research, 35:
        233-241.
Lorenz, E.N., 1963. Deterministic Nonperiodic Flow. Journal of the Atmospheric Sciences,
        20(2): 130-141.
Madsen, H., 2000. Automatic calibration of a conceptual rainfall-runoff model using multiple
        objectives. Journal of Hydrology, 235(3-4): 276-288.
Madsen, H., 2003. Parameter estimation in distributed hydrological catchment modelling
        using automatic calibration with multiple objectives. Advances in Water Resources,
        26(2): 205-216.

Marsigli, C., Boccanera, F., Montani, A. and Paccagnella, T., 2005. The COSMO-LEPS
        mesoscale ensemble system: validation of the methodology and verification.
        Nonlinear Processes in Geophysics, 12(4): 527-536.
Matott, L.S., Babendreier, J.E. and Purucker, S.T., 2009. Evaluating uncertainty in integrated
        environmental models: A review of concepts and tools. Water Resources Research,
        45.
Merz, R. and Blöschl, G., 2008. Flood frequency hydrology: 1. Temporal, spatial, and causal
        expansion of information. Water Resources Research, 44(8).
Molteni, F., Buizza, R., Palmer, T.N. and Petroliagis, T., 1996. The ECMWF ensemble
        prediction system: Methodology and validation. Quarterly Journal of the Royal
        Meteorological Society, 122(529): 73-119.
Moulin, L., Gaume, E. and Obled, C., 2009. Uncertainties on mean areal precipitation:
        assessment and impact on streamflow simulations. Hydrology and Earth System
        Sciences, 13(2): 99-114.
Naef, F., Schmocker-Fackel, P., Margreth, M., Kienzler, P. and Scherrer, S., 2008. Die
        Häufung der Hochwasser der letzten Jahre. In: G.R. Bezzola and C. Hegg (Editors),
        Ereignisanalyse der Hochwasser 2005 - Teil 2, Analyse von Prozessen, Massnahmen
        und Gefahrengrundlagen in Umwelt. Umwelt-Wissen nr. 08025, pp. 429.
Nash, J.E. and Sutcliffe, J.V., 1970. River flow forecasting through conceptual models (1), a
        discussion of principles. Journal of Hydrology, 10: 282-290.
Palmer, T.N., 2000. Predicting uncertainty in forecasts of weather and climate. Reports on
        Progress in Physics, 63(2): 71-116.
Pappenberger, F. and Beven, K.J., 2004. Functional classification and evaluation of
        hydrographs based on Multicomponent Mapping (Mx). International Journal of River
        Basin Management, 2(2): 89-100.
Pappenberger, F. and Beven, K.J., 2006. Ignorance is bliss: Or seven reasons not to use
        uncertainty analysis. Water Resources Research, 42(5).
Pappenberger, F. et al., 2005. Cascading model uncertainty from medium range weather
        forecasts (10 days) through a rainfall-runoff model to flood inundation predictions
        within the European Flood Forecasting System (EFFS). Hydrology and Earth System
        Sciences, 9(4): 381-393.
Pappenberger, F., Ghelli, A., Buizza, R. and Bodis, K., 2009. The Skill of Probabilistic
        Precipitation Forecasts under Observational Uncertainties within the Generalized
        Likelihood Uncertainty Estimation Framework for Hydrological Applications. Journal
        of Hydrometeorology, 10(3): 807-819.
Quiby, J. and Denhard, M., 2003. SRNWP-DWD Poor-Man Ensemble Prediction System: the
        PEPS Project. Eumetnet Newsletter, pp. 9-12.
Ranzi, R., Zappa, M. and Bacchi, B., 2007. Hydrological aspects of the Mesoscale Alpine
        Programme: Findings from field experiments and simulations. Quarterly Journal of the
        Royal Meteorological Society, 133(625): 867-880.
Romang, H. et al., 2011. IFKIS-Hydro – Early Warning and Information System for Floods
        and Debris Flows. Natural Hazards: doi: 10.1007/s11069-010-9507-8.
Romanowicz, R.J., Young, P.C. and Beven, K.J., 2006. Data assimilation and adaptive
        forecasting of water levels in the river Severn catchment, United Kingdom. Water
        Resources Research, 42(6).
Romanowicz, R.J., Young, P.C., Beven, K.J. and Pappenberger, F., 2008. A data based
        mechanistic approach to nonlinear flood routing and adaptive flood level forecasting.
        Advances in Water Resources, 31(8): 1048-1056.

Rossa, A. et al., 2011. Uncertainty propagation in advanced hydro-meteorological forecast
       systems: The COST 731 Action. Atmospheric Research (Thematic Issue on
       COST731): this issue.
Rotach, M.W. et al., 2009. MAP D-PHASE Real-Time Demonstration of Weather Forecast
       Quality in the Alpine Region. Bulletin of the American Meteorological Society, 90(9):
       1321-+.
Roulin, E., 2007. Skill and relative economic value of medium-range hydrological ensemble
       predictions. Hydrology and Earth System Sciences, 11(2): 725-737.
Roulin, E. and Vannitsem, S., 2005. Skill of medium-range hydrological ensemble
       predictions. Journal of Hydrometeorology, 6(5): 729-744.
Schaefli, B. and Gupta, H.V., 2007. Do Nash values have value? Hydrological Processes,
       21(15): 2075-2080.
Siccardi, F., Boni, G., Ferraris, L. and Rudari, R., 2005. A hydrometeorological approach for
       probabilistic flood forecast. Journal of Geophysical Research-Atmospheres, 110(D5).
Stensrud, D.J., Bao, J.W. and Warner, T.T., 2000. Using initial condition and model physics
       perturbations in short-range ensemble simulations of mesoscale convective systems.
       Monthly Weather Review, 128(7): 2077-2107.
Szturc, J., Osrodka, K., Jurczyk, A. and Jelonek, L., 2008. Concept of dealing with
       uncertainty in radar-based data for hydrological purpose. Natural Hazards and Earth
       System Sciences, 8(2): 267-279.
Todini, E., 2009. Predictive uncertainty assessment in real time flood forecasting. In: P.C.
       Baveye, M. Laba and J. Mysiak (Editors), Uncertainties in Environmental Modelling
       and Consequences for Policy Making. NATO Science for Peace and Security Series C
       - Environmental Security, pp. 205-228.
Verbunt, M., Walser, A., Gurtz, J., Montani, A. and Schar, C., 2007. Probabilistic flood
       forecasting with a limited-area ensemble prediction system: Selected case studies.
       Journal of Hydrometeorology, 8(4): 897-909.
Verbunt, M., Zappa, M., Gurtz, J. and Kaufmann, P., 2006. Verification of a coupled
       hydrometeorological modelling approach for alpine tributaries in the Rhine basin.
       Journal of Hydrology, 324(1-4): 224-238.
Villarini, G. and Krajewski, W.F., 2008. Empirically-based modeling of spatial sampling
       uncertainties associated with rainfall measurements by rain gauges. Advances in
       Water Resources, 31(7): 1015-1023.
Viviroli, D., Zappa, M., Gurtz, J. and Weingartner, R., 2009a. An introduction to the
       hydrological modelling system PREVAH and its pre- and post-processing-tools.
       Environmental Modelling & Software, 24(10): 1209-1222.
Viviroli, D., Zappa, M., Schwanbeck, J., Gurtz, J. and Weingartner, R., 2009b. Continuous
       simulation for flood estimation in ungauged mesoscale catchments of Switzerland -
       Part I: Modelling framework and calibration results. Journal of Hydrology, 377(1-2):
       191-207.
Vrugt, J.A., Diks, C.G.H., Gupta, H.V., Bouten, W. and Verstraten, J.M., 2005. Improved
       treatment of uncertainty in hydrologic modeling: Combining the strengths of global
       optimization and data assimilation. Water Resources Research, 41(1).
Vrugt, J.A., Gupta, H.V., Bouten, W. and Sorooshian, S., 2003. A Shuffled Complex
       Evolution Metropolis algorithm for optimization and uncertainty assessment of
       hydrologic model parameters. Water Resources Research, 39(8).
Walser, A., Luthi, D. and Schar, C., 2004. Predictability of precipitation in a cloud-resolving
       model. Monthly Weather Review, 132(2): 560-577.

Weigel, A.P., Liniger, M.A. and Appenzeller, C., 2007. Generalization of the discrete Brier
       and ranked probability skill scores for weighted multimodel ensemble forecasts.
       Monthly Weather Review, 135(7): 2778-2785.
Wilks, D., 2006. Statistical methods in the atmospheric sciences, vol. 91 of International
       geophysics series. Elsevier, Amsterdam, The Netherlands.
Wohling, T., Lennartz, F. and Zappa, M., 2006. Technical Note: Updating procedure for flood
       forecasting with conceptual HBV-type models. Hydrology and Earth System Sciences,
       10(6): 783-788.
Zappa, M., 2002. Multiple-response verification of a distributed hydrological model at
       different spatial scales, Dissertation No. 14895, ETH Zurich [available online at:
       http://e-collection.ethbib.ethz.ch/show?type=diss&nr=14895].
Zappa, M. et al., 2010. Propagation of uncertainty from observing systems and NWP into
       hydrological models: COST-731 Working Group 2. Atmospheric Science Letters,
       11(2): 83-91.
Zappa, M. and Kan, C., 2007. Extreme heat and runoff extremes in the Swiss Alps. Natural
       Hazards and Earth System Sciences, 7(3): 375-389.
Zappa, M., Pos, F., Strasser, U., Warmerdam, P. and Gurtz, J., 2003. Seasonal water balance
       of an Alpine catchment as evaluated by different methods for spatially distributed
       snowmelt modelling. Nordic Hydrology, 34(3): 179-202.
Zappa, M. et al., 2008. MAP D-PHASE: real-time demonstration of hydrological ensemble
       prediction systems. Atmospheric Science Letters, 9(2): 80-87.

Table 1: Definition of the model parameters allowed to vary in the Monte Carlo runs. The “default”
parameters are the result of a standard calibration procedure (Viviroli et al., 2009a). The random
sampling of the parameters was limited to values within the interval defined by MCMin and MCMax.

      Symbol   Parameter                                Unit       Default   MCMin   MCMax
      Pcorr    Rainfall adjustment*                     [%]        12.8      0.0     30.0
      Scorr    Snow adjustment*                         [%]        37.4      20.0    50.0
      BETA     Soil moisture recharge exponent          -          3.8       3.0     6.0
      SGR      Threshold for surface runoff             mm         41        30      50
      K0       Storage coefficient for surface runoff   h          21        10      30
      K1       Storage coefficient for interflow        h          127       100     150
      PERC     Deep percolation                         mm h⁻¹     0.153     0.10    0.20

      * The two parameters controlling the bias adjustment of the precipitation input (rain or snow)
      are only used if the hydrological model is fed by interpolated pluviometer data. Although the
      NWP models and the precipitation estimates from the weather radar contain systematic errors,
      it was decided to avoid bias corrections (Verbunt et al., 2006).
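
The sampling constraint stated in the caption of Table 1 can be illustrated with a minimal sketch. It assumes independent uniform draws between MCMin and MCMax, which the table itself does not state; the function name and the seed are hypothetical, and the number of draws simply matches the 2527 realizations shown in Figure 3.

```python
import random

# (MCMin, MCMax) for each parameter, copied from Table 1.
BOUNDS = {
    "Pcorr": (0.0, 30.0),
    "Scorr": (20.0, 50.0),
    "BETA": (3.0, 6.0),
    "SGR": (30.0, 50.0),
    "K0": (10.0, 30.0),
    "K1": (100.0, 150.0),
    "PERC": (0.10, 0.20),
}


def sample_parameter_set(rng):
    """Draw one parameter set. Uniform, independent sampling is an assumption."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in BOUNDS.items()}


rng = random.Random(42)  # hypothetical fixed seed, only for reproducibility
realizations = [sample_parameter_set(rng) for _ in range(2527)]
```

Each element of `realizations` is one candidate parameter set that would then be passed to the hydrological model and scored.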


Table 2: Summary of the three parameter sets of 26 members each, derived from the Monte
Carlo simulations (see text for details). The numbers give the median (Med.) and standard
deviation (St.Dev) of the seven parameters that were randomly varied. "Range" indicates the
ratio between St.Dev and the width of the interval allowed for each parameter (Table 1).

      Symbol   Unit       MOD_99.5%               MOD_95%                 MOD_80%
                          Med. / St.Dev / Range   Med. / St.Dev / Range   Med. / St.Dev / Range
      Pcorr    [%]        11.54 / 2.8 / 0.09      14.3 / 5.1 / 0.17       19.3 / 6.6 / 0.22
      Scorr    [%]        32.1 / 7.2 / 0.24       29.6 / 9.2 / 0.31       33.6 / 9.5 / 0.32
      BETA     -          4.6 / 0.87 / 0.29       4.5 / 0.82 / 0.27       4.1 / 1.03 / 0.34
      SGR      mm         33.2 / 3.6 / 0.18       39.1 / 5.1 / 0.26       38.1 / 5.8 / 0.29
      K0       h          12.7 / 0.97 / 0.05      11.8 / 2.0 / 0.1        15.6 / 2.4 / 0.12
      K1       h          127 / 14.5 / 0.29       122 / 15.6 / 0.31       129 / 12.8 / 0.26
      PERC     mm h⁻¹     0.11 / 0.013 / 0.13     0.13 / 0.022 / 0.22     0.14 / 0.031 / 0.31
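
As a check on how the "Range" column relates to Table 1, the following sketch recomputes it for the MOD_99.5% set as St.Dev divided by (MCMax - MCMin). The dictionary and function names are illustrative only; the numbers are copied from Tables 1 and 2.

```python
# Interval widths (MCMax - MCMin) taken from Table 1.
width = {"Pcorr": 30.0, "Scorr": 30.0, "BETA": 3.0, "SGR": 20.0,
         "K0": 20.0, "K1": 50.0, "PERC": 0.10}

# Standard deviations of the MOD_99.5% set taken from Table 2.
st_dev = {"Pcorr": 2.8, "Scorr": 7.2, "BETA": 0.87, "SGR": 3.6,
          "K0": 0.97, "K1": 14.5, "PERC": 0.013}


def range_ratio(name):
    """Ratio of the standard deviation to the allowed interval width."""
    return round(st_dev[name] / width[name], 2)


ranges = {name: range_ratio(name) for name in width}
```

Rounded to two decimals, the values in `ranges` agree with the MOD_99.5% "Range" entries of Table 2.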

Table 3: Accumulated precipitation during the five days preceding the seven peak-flow
events investigated. The column “Day -10/-20” gives the date at which initial conditions
from a deterministic run are stored in order to trigger the experiments on uncertainty propagation
and superposition (Figure 4). The COSMO-LEPS forecasts used are listed by lead time in days
before the event. A dash marks a lead time for which no forecast is listed.

      Event              Peak       Day       Accumulated precipitation   120-hour COSMO-LEPS forecast initialization
      (year/month/day)   runoff     -10/-20   until Day-5 [mm]            (month/day)
                         [m³ s⁻¹]             pluviometers   radar        Day-5   Day-4   Day-3   Day-2   Day-1   Day-0
      2007/08/22         100.7      08/01     151            153          -       -       08/19   08/20   08/21   -
      2008/07/07         80.3       06/28     10             38           -       -       07/04   07/05   07/06   -
      2008/07/13         163.0      06/28     87             113          -       -       07/10   07/11   -       -
      2008/08/15         76.9       07/20     45             98           08/10   08/11   08/12   08/13   08/14   -
      2008/09/07         541.0      08/20     28             29           -       -       09/04   -       -       -
      2008/10/29         210.6      10/09     9              4            -       -       -       10/27   10/28   10/29
      2008/11/05         157.5      10/09     219            186          -       -       -       11/03   11/04   -

Table 4: Average ensemble spread (q100 - q0) in m³ s⁻¹ for seven peak-runoff events when
adopting different sets of model parameter realizations and either pluviometers
(MOD_%/PLUV) or weather radar QPF (MOD_%/RAD) as precipitation forcing.

      Event              Initialization   MOD_%/PLUV                MOD_%/RAD
      (year/month/day)   (month/day)      99.5%    95%     80%      95%
      Members            -                26       26      26       26
      2007/08/22         08/20            10.3     14.2    18.5     9.3
      2008/07/07         07/05            4.9      8.3     10.4     6.5
      2008/07/13         07/11            5.3      7.2     10.0     9.5
      2008/08/15         08/12            5.9      10.3    11.7     8.3
      2008/09/07         09/04            22.8     30.0    38.9     30.4
      2008/10/29         10/27            9.9      14.6    19.6     10.8
      2008/11/05         11/03            10.8     13.9    18.8     9.9
      Average            [m³ s⁻¹]         10.0     14.1    18.3     12.1

Table 5: Average ensemble spread (q100 - q0) in m³ s⁻¹ for seven peak-runoff events when
adopting different experimental settings for propagating and superposing uncertainty in
operational hydrological simulations.

      Event              Initialization   REAL     REAL/MOD   LEPS     LEPS/MOD   FULL
      (year/month/day)   (month/day)
      Members            -                25       650        16       416        10400
      2007/08/22         08/20            37.2     48.9       117.0    137.0      141.0
      2008/07/07         07/05            24.8     33.8       84.5     100.0      105.0
      2008/07/13         07/11            42.0     53.9       123.7    146.0      146.0
      2008/08/15         08/12            37.9     48.6       100.0    123.0      142.0
      2008/09/07         09/04            167.0    216.0      288.0    328.0      338.0
      2008/10/29         10/27            34.9     48.9       82.0     102.0      110.0
      2008/11/05         11/03            43.3     56.3       116.0    138.0      139.0
      Average            [m³ s⁻¹]         55.3     72.3       130.0    153.0      160.0
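
The member counts in Table 5 are consistent with a full factorial combination of 25 radar ensemble members (REAL), 16 COSMO-LEPS members (LEPS) and 26 parameter sets (MOD). The sketch below reproduces those counts and shows one plausible way to compute the average spread, assuming it is the mean over the evaluation window of the difference between the largest and smallest member at each time step. The function name and the data layout are hypothetical.

```python
from itertools import product

N_REAL, N_LEPS, N_MOD = 25, 16, 26  # ensemble sizes quoted in the tables and captions

# Index tuples identifying each member of the combined experiments.
real_mod = list(product(range(N_REAL), range(N_MOD)))
leps_mod = list(product(range(N_LEPS), range(N_MOD)))
full = list(product(range(N_REAL), range(N_LEPS), range(N_MOD)))


def average_spread(ensemble):
    """Mean over time of (q100 - q0).

    ensemble[m][t] is the discharge of member m at time step t.
    Averaging the per-step spread over the window is an assumption.
    """
    n_steps = len(ensemble[0])
    spreads = [
        max(member[t] for member in ensemble) - min(member[t] for member in ensemble)
        for t in range(n_steps)
    ]
    return sum(spreads) / n_steps
```

With these sizes, `len(real_mod)`, `len(leps_mod)` and `len(full)` give 650, 416 and 10400, matching the "Members" row.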

Table 6: Attribution of the contribution of different sources of uncertainty to the average
ensemble spread (q100 - q0) and IQR (q75 - q25) in m³ s⁻¹ for the November 5th 2008 peak
runoff event. The observed value and the corresponding ensemble median (q50) are also
summarized. Sub-samples of three experiments are evaluated to estimate the different
contributions of MOD, LEPS and REAL to the total experimental uncertainty. Details on the
experiments and acronyms are found in Section 3.4.

      Experiment   Filter   Varying       Average of n   Members   Observation   q50        q100 - q0   q75 - q25
                                          realizations             [m³ s⁻¹]      [m³ s⁻¹]   [m³ s⁻¹]    [m³ s⁻¹]
      FULL         None     All three     1              10400     62.4          67.0       139.0       68.4
                   MOD      REAL & LEPS   26             400       62.4          65.3       125.4       62.3
                   REAL     MOD & LEPS    25             416       62.4          66.8       136.1       67.0
                   LEPS     REAL & MOD    16             650       62.4          71.9       15.9        10.8
      REAL/MOD     None     Both          1              650       62.4          56.7       56.3        31.0
                   REAL     MOD           25             26        62.4          57.7       10.1        6.9
                   MOD      REAL          26             25        62.4          55.7       45.9        25.5
      LEPS/MOD     None     Both          1              416       62.4          66.9       138.0       67.9
                   MOD      LEPS          26             16        62.4          64.4       121.8       59.6
                   LEPS     MOD           16             26        62.4          71.5       15.7        10.1
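
The filtering in Table 6 can be read as follows: one source is held fixed, the spread is computed over the members that differ only in the remaining sources, and the result is averaged over all realizations of the fixed source. A minimal sketch of that reading is given below. It is a simplification that uses a single discharge value per member instead of a time series, and the toy values are synthetic, chosen only so that the second source dominates.

```python
from itertools import product
from statistics import mean


def filtered_spread(values, axis):
    """Average spread after filtering on one uncertainty source.

    values maps (real, leps, mod) index tuples to one discharge value.
    axis selects the source held fixed (0 = REAL, 1 = LEPS, 2 = MOD).
    """
    groups = {}
    for key, discharge in values.items():
        groups.setdefault(key[axis], []).append(discharge)
    # Spread among members sharing the same realization of the fixed source,
    # averaged over all realizations of that source.
    return mean(max(group) - min(group) for group in groups.values())


# Synthetic toy ensemble: 3 REAL x 2 LEPS x 2 MOD members.
toy = {(r, l, m): 10.0 * l + r + 0.1 * m
       for r, l, m in product(range(3), range(2), range(2))}

spread_filter_real = filtered_spread(toy, 0)
spread_filter_leps = filtered_spread(toy, 1)
spread_filter_mod = filtered_spread(toy, 2)
```

In this toy case the spread collapses only when the dominant source is filtered out, which is the same pattern as the LEPS rows of Table 6.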

Figure captions:


Figure 1: Main sources of uncertainty propagating and superposing through a hydrological
model in hydrometeorological forecasting chains.


Figure 2: Situation map of the Verzasca river basin in southern Switzerland including the
hydrometric (FOEN) and meteorological networks (MeteoSwiss and UCA). Additionally, the
location of the Monte Lema weather radar, a few kilometers south of the basin, is displayed.
Graphic elements reproduced by kind authorization of “swisstopo” (JA022265) and BFS
GEOSTAT.


Figure 3: Dot-plot of the 2527 Monte Carlo realizations for the application of PREVAH in the
Verzasca river basin during the calibration period 1996-2001. NSE and SWAE are used to
select three sub-samples (99.5%, 95% and 80%) of acceptable parameter sets consisting of 26
realizations each.

Figure 4: Design of the seven experiments run for quantification of uncertainty superposition.
The time window for the statistics is defined by the lead time of the C-LEPS forecasts.


Figure 5: November 5th 2008 event: visualization of the spread obtained by adopting different
sets of model parameter realizations for simulations with PREVAH. The observed hydrograph
is drawn as a black line. The shaded dark and light grey areas are delimited by the quantiles of
the ensemble realizations (q0, q25, q75, q100). The dashed black line shows the ensemble median
(q50). Top left: realizations obtained from pluviometric data and the MOD_99.5% set. Top
right: same, but the MOD_95% set is applied. Bottom left: same for MOD_80%. Bottom right:
weather radar data are used in combination with the MOD_95% parameter realizations set.


Figure 6: Ensemble hydrographs for the August 22nd 2007 (upper panels) and November 5th
2008 (bottom panels) events as obtained by forcing PREVAH with 25 radar ensemble
members (REAL, left panels) and by jointly accounting for both radar and model parameter
uncertainty (REAL/MOD, right panels). The observed hydrograph is drawn as a black line. The
shaded dark and light grey areas are delimited by the quantiles of the ensemble realizations
(q0, q25, q75, q100). The dashed black line shows the ensemble median (q50).


Figure 7: As Figure 6 but with PREVAH forced by 16 COSMO-LEPS ensemble members
(LEPS, left panels) and by jointly accounting for both COSMO-LEPS and model parameter
uncertainty (LEPS/MOD, right panels).


Figure 8: Ensemble hydrographs for the August 15th 2008 event with PREVAH initialized on
August 12th 2008 and forced by jointly accounting for both COSMO-LEPS and model
parameter uncertainty (LEPS/MOD, left panel) and by accounting for all three sources of
uncertainty in the experimental chain (FULL, right panel). Legend as in Figures 6 and 7.


Figure 9: Box plots summarizing the average ensemble discharge quantiles related to 19
experiments (lower captions on the x-axis) on superposing the uncertainty from three sources.
The experiments related to different events (upper captions of the x-axis) are separated by a
vertical line crossing the x-axis. The observed average discharge during the 120 hours of each
experiment is displayed as a thick horizontal black line. The thick horizontal white line
depicts q50 within the box plot drawn by q0, q25, q75 and q100.


Figure 10: Box plots summarizing the average ensemble discharge quantiles for the
November 5th 2008 event initialized on November 3rd 2008. Three experiments (upper
captions of the x-axis) are evaluated as a complete set (“no filter”) and by separating the
influence of different sources of uncertainty (“filter MOD/REAL/LEPS”). The observed
average discharge during the 120 hours of each experiment is displayed as a thick horizontal
black line. The thick horizontal white line depicts q50 within the box plot drawn by q0, q25,
q75 and q100.


Figure A1: Relative operating characteristic curves (ROC, Wilks, 2006) of LEPS and REAL
for the period June 2007 to November 2008. The ROC for the deterministic simulations
MOD/RAD and MOD/PLUV is indicated as a point. ROC curves are plotted for three different
discharge thresholds corresponding to the 0.5 (left), 0.75 (middle) and 0.9 (right) quantiles.
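
For readers unfamiliar with ROC curves for ensemble forecasts, the sketch below shows a common construction: a warning is issued when at least k members exceed the discharge threshold, and each k yields one pair of false alarm rate and hit rate. This is a generic illustration and not necessarily the exact procedure used for Figure A1; the function name and data layout are hypothetical.

```python
def roc_points(ensembles, observations, threshold):
    """Return (false alarm rate, hit rate) pairs for an ensemble forecast.

    ensembles[t] holds the member discharges at time t,
    observations[t] the observed discharge at the same time.
    """
    n_members = len(ensembles[0])
    # Number of members exceeding the threshold at each time step.
    counts = [sum(q > threshold for q in members) for members in ensembles]
    events = [obs > threshold for obs in observations]
    n_events = sum(events)
    n_non_events = len(events) - n_events
    if n_events == 0 or n_non_events == 0:
        raise ValueError("need at least one event and one non-event")

    points = []
    # k = 0 always warns; k = n_members + 1 never warns.
    for k in range(n_members + 2):
        hits = sum(1 for c, e in zip(counts, events) if e and c >= k)
        false_alarms = sum(1 for c, e in zip(counts, events) if not e and c >= k)
        points.append((false_alarms / n_non_events, hits / n_events))
    return points
```

A deterministic simulation corresponds to a one-member ensemble, so besides the trivial end points it contributes a single point, as indicated in the caption.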

