Data for Prediction of Rainfall by Artificial Neural Network (DOC download)

                    PREDICTION SYSTEM

   BERNARD B. HSIEH                           CPT CHARLES L. BARTOS
   USAE R&D CENTER                            USAE R&D CENTER
   VICKSBURG, MS 39180 USA                    VICKSBURG, MS 39180 USA


    A streamflow prediction system based on Artificial Neural Networks (ANNs) is
    developed to address flood forecasting for two watersheds of different
    scale: the Sava River, Croatia, and a segment of the lower Mississippi
    River. The study investigated single-point river stage forecasting,
    upstream-downstream riverflow forecasting, and the rainfall-runoff
    hydrological process. For the Sava River, about 3 months of river stage
    record was the minimum needed to achieve about 90 percent reliability for
    forecasts of up to 3 days. Reasonable downstream riverflow predictions from
    upstream gauges were obtained for the Sava River even though only half a
    year of daily values was available for model training. On the Mississippi
    River, with 16 years of daily data, the ANN can provide a highly precise
    riverflow forecasting system for Memphis, TN, from two upstream inputs near
    the confluence with the Ohio River, without a significant rainfall
    contribution in this river segment.

   Forecasting riverflow provides a warning of impending stages during floods and
assists in regulating reservoir outflows during low river flows for water resources
management. For military applications, accurate forecasting of river stage and flow is
critical during operations since it directly influences force mobility.
   Most hydrologic processes exhibit a high degree of temporal and spatial variability and
are further complicated by nonlinearity of the physical processes, conflicting spatial and
temporal scales, and uncertainty in parameter estimates. ANNs can extract the
relationship between the inputs and outputs of a process without the physics being
explicitly provided. They can also provide a map from one multivariable space to
another, given a set of data representing that mapping. These properties may make ANNs
well suited to the problems of estimation and prediction in hydrology.
  Two major approaches to modeling rainfall-runoff or predicting riverflow have
been explored in the literature: conceptual (physically based) modeling and system theoretic
modeling. While conceptual models are important in understanding hydrologic processes,
there are many practical situations, such as streamflow forecasting, where the main concern
is making accurate predictions at specific watershed locations. In such a situation, a
hydrologist may prefer not to expend the time and effort required to develop and
implement a conceptual or numerical model, but instead to implement a simpler
system theoretic model, such as an ANN.
  Applications of ANNs in rainfall-runoff modeling and streamflow forecasting have been
described in many sources. The algorithms used range from
backpropagation (Hjelmfelt and Wang 1996), time-delayed (Karunanithi et al. 1994),
recurrent (Carriere, Mohaghegh, and Gaskari 1996), radial-basis function (Fernando and
Jayawardena 1998), and modular (Zhang and Govindaraju 1998) to self-organizing (Hsu,
Gupta, and Sorooshian 1998) networks. Only one reference is cited for each algorithm.
   The objective of this study is to demonstrate the applicability of the system theoretic
ANN approach in developing effective nonlinear models of the river stage and riverflow
forecasting process for military use, without the need to explicitly represent the internal
hydrologic structure of the watershed. In addition, a large-scale watershed, the
Mississippi River, is used to demonstrate the capability of flood forecasting by ANN.

  For hydrological applications, the backpropagation network, a classic example of
supervised learning, is a popular computational tool. However,
because of the travel time between input and output in the signal propagation of natural
phenomena, the time-delayed ANN and the recurrent ANN are also used for
comparison. The fundamental theory regarding these algorithms will not be discussed.
  The software used for this study is NeuroSolutions (version 3.0). The particular version
used to run the simulations is an Excel environment module. The data set is divided into
training, cross-validation, and testing portions. To keep the model
structure simple, the neural architecture uses a single hidden layer.
Performance is measured by several quantities, including mean square
error (MSE), normalized mean square error (NMSE), mean/maximum/minimum absolute errors,
and correlation coefficient (CC).
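
  As an illustration of the kind of network described above, the following is a minimal
Python sketch of a one-hidden-layer perceptron trained by backpropagation. It is not the
NeuroSolutions implementation used in the study: the class name OneHiddenLayerMLP, the
tanh hidden units, the learning rate, and the synthetic samples are all assumptions made
for the example.

```python
import math
import random


class OneHiddenLayerMLP:
    """Feed-forward network: inputs -> tanh hidden layer -> one linear output."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)  # fixed seed so the sketch is repeatable
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return hidden, output

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr=0.05):
        """One stochastic gradient step on the loss 0.5 * (output - target)^2."""
        hidden, output = self.forward(x)
        err = output - target
        for j, h in enumerate(hidden):
            # Backpropagate the output error through the tanh of hidden unit j.
            grad_hidden = err * self.w2[j] * (1.0 - h * h)
            self.w2[j] -= lr * err * h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_hidden * xi
            self.b1[j] -= lr * grad_hidden
        self.b2 -= lr * err


def mean_squared_error(model, samples):
    return sum((model.predict(x) - y) ** 2 for x, y in samples) / len(samples)


# Synthetic, made-up samples (not data from the study): one input, linear target.
samples = [([x / 10.0], 0.5 * x / 10.0 + 0.1) for x in range(-10, 11)]
model = OneHiddenLayerMLP(n_inputs=1, n_hidden=3)
error_before = mean_squared_error(model, samples)
for _ in range(200):  # 200 passes over the training samples
    for x, y in samples:
        model.train_step(x, y)
error_after = mean_squared_error(model, samples)
```

  In practice the training loop would be stopped when the error on the cross-validation
portion stops improving, and the testing portion would be held back for the final
evaluation.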

  The Lower Mississippi River is considered to begin at Cairo, IL, at the confluence of the
Ohio and Upper Mississippi Rivers. It travels southward approximately 954
miles to Head of Passes, LA. During 1973, a serious flood occurred in the Lower
Mississippi River. The peak flows for the crest stages were over 1.5 million cfs. Major
flooding such as that of 1973 is a good example of the need for a forecasting system as
an essential tool to reduce flood damage.
  In this study, the ANN was used to predict the riverflow at Memphis, TN, from the
upstream gauge at Thebes, IL, before the Mississippi River merges with the Ohio River, and
the nearby gauge at Metropolis, IL, on the Ohio River near the confluence. The lateral
contributions within this river segment are the Obion, Hatchie, Loosahatchie,
and Wolf Rivers in West Tennessee and rainfall in the river basin. The purpose of this
study is to identify the prediction capability of the ANN with minimum hydrologic
information and to determine the contribution of the Ohio River to flooding.
  A database was developed using 16 years (1975 to 1990) of daily riverflow from
these three stations and ten daily rainfall stations, which are approximately uniformly
distributed over the river basin. A rainfall-runoff model was constructed using the two upstream
flows (Ohio and Mississippi Rivers) and total daily rainfall as the inputs and the downstream
riverflow as the output. The first 6 years of data were used for training, the next 2 years
for cross-validation, and the last 8 years for testing. A
multilayer perceptron feed-forward backpropagation architecture was used.
Fairly accurate results were obtained for the three subsets, as measured by
NMSE and CC; the CC values are 0.95, 0.93, and 0.94. However, graphical
comparisons show that the spikes match very well but a phase shift exists between the observed
values and the simulated outputs. This difference implies that time lags must be
considered. Hence, a second test was conducted with lags of up to 2 days for each input series,
forming a system of four inputs and one output. This modification produced a significant
improvement. Figure 1 shows the model testing results. Although the cross-validation
overestimated the flow values, the testing set presents excellent results. A sensitivity
analysis was performed to rank these four inputs. It indicated that
the highest correlation with the downstream gauge was for the 2-day lag riverflow at both
upstream gauges.

[Figure 1 plot: observed and ANN output flow rate (cfs); 1983 portion of the testing data (1983-1990, daily)]

Figure 1. Testing results for desired riverflow and actual network riverflow at Memphis
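
  The lagged-input arrangement can be sketched in Python as follows. This is an
illustration only: the function name make_lagged_samples and the flow values are
hypothetical, and it assumes that the four inputs are the 1-day and 2-day lagged flows of
the two upstream gauges, which the text does not state explicitly.

```python
def make_lagged_samples(upstream_series, downstream, lags=(1, 2)):
    """Pair lagged upstream flows with the downstream flow on the same day.

    upstream_series: list of daily series, one per upstream gauge.
    downstream: daily series at the gauge to be predicted.
    lags: lags in days applied to every upstream series (assumed here to be 1 and 2).
    """
    max_lag = max(lags)
    samples = []
    for t in range(max_lag, len(downstream)):
        # One input row: every upstream series at every lag.
        row = [series[t - lag] for series in upstream_series for lag in lags]
        samples.append((row, downstream[t]))
    return samples


# Made-up daily flows for illustration only (not values from the study).
thebes = [10, 11, 12, 13, 14]
metropolis = [20, 21, 22, 23, 24]
memphis = [30, 31, 32, 33, 34]

lagged = make_lagged_samples([thebes, metropolis], memphis)
# Each row holds four inputs: Thebes (t-1), Thebes (t-2), Metropolis (t-1), Metropolis (t-2).
```

  The first max(lags) days are dropped because no complete set of lagged inputs exists
for them.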
  The second scenario of this model added rainfall (with up to a 2-day lag) as an
input variable. There was very little difference between the two runs, which seems to
indicate that the downstream prediction is not sensitive to rainfall. This also suggests
that the watershed inflow contribution comes from tributaries. However, it might not be
necessary to test the sensitivity to tributary flow, since the ANN forecasting system
performs well enough with the upstream gauges alone.

  The Sava River is the largest river in the former Yugoslavia. Since Yugoslavia was divided
into several new republics, the Sava River now starts in Slovenia. Going downstream, the
Sava flows through Croatia, Bosnia, and Serbia. The total drainage area at the confluence
of the Sava and Danube Rivers comprises 96 thousand square kilometers and the watershed
length is 2,255 km. The length of the Sava River is 950 km.
  During the Bosnia war, the prediction of river stages for military crossings became
particularly important. The accuracy of prediction was critical in scheduling
military operations and especially in choosing the locations at which to construct bridges.
Therefore, a riverflow and stage forecasting system was required to address changes in weather
conditions. An upstream-downstream modeling study can provide the necessary answers.
  Riverflow and stage records are available for nearly two dozen gauges on both the
main stem and some tributaries. The best data files that can currently be used to construct
the model are from eight stations for riverflow and two stations for river stage. The riverflow
stations are along the main river, and the data set is one year of daily mean flow (the most
downstream station is the site of the bridge for the military operation). The daily river
stage data (2-1/2 years) exist only for the two most downstream stations, which
both have military bridges. This modeling effort seeks an alternative to
numerical watershed model simulation for predicting downstream flow and stage from
minimum upstream information.

Sava River Stage Forecasting Model
  Using the data available, a river stage forecasting model was constructed using the
Slavonski Brod gauge (upstream) to predict the Zupanja gauge (downstream). Following the
regular ANN modeling procedure, the data file (2.5 years) was divided into a training set
(1 year), a cross-validation set (6 months), and a testing set. This model was again trained
as a multilayer perceptron with the backpropagation algorithm and one hidden layer. Since
these two gauge stations are not far from each other, the results show very good agreement with
observations. To test the model forecasting capability, the forecast range was set
to up to 3 days ahead and several scenarios were run. The lowest correlation
coefficient was 0.911 and the mean absolute error was 0.52 m for the 3-day ahead
prediction with the current and previous 2-day stage at the upstream location.
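
  The forecasting setup described above can be sketched in Python as follows. The
function names make_forecast_samples and chronological_split and the stage values are
hypothetical; the sketch assumes that "current and previous 2-day stage" means the
upstream stage at days t, t-1, and t-2, and that the three subsets are taken in time
order.

```python
def make_forecast_samples(upstream, downstream, horizon=3, history=2):
    """Pair the upstream stage at days t, t-1, ..., t-history with the
    downstream stage `horizon` days ahead."""
    samples = []
    for t in range(history, len(upstream) - horizon):
        row = [upstream[t - k] for k in range(history + 1)]
        samples.append((row, downstream[t + horizon]))
    return samples


def chronological_split(samples, n_train, n_cv):
    """Split in time order: training first, then cross-validation, then testing."""
    train = samples[:n_train]
    cross_validation = samples[n_train:n_train + n_cv]
    testing = samples[n_train + n_cv:]
    return train, cross_validation, testing


# Made-up stage values for illustration only (not data from the study).
upstream_stage = [float(v) for v in range(10)]
downstream_stage = [100.0 + v for v in range(10)]

forecast_samples = make_forecast_samples(upstream_stage, downstream_stage)
train, cross_validation, testing = chronological_split(forecast_samples, 3, 1)
```

  With daily data, a 1-year training set and a 6-month cross-validation set would
correspond to roughly n_train=365 and n_cv=182, with the remainder used for testing.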

Sava River Flow Prediction Model
  As described above, riverflow data exist at only eight stations from upstream
to downstream. Station 8 (Slavonski Brod) is the most downstream gauge and the only one
that has a bridge. A preliminary analysis of the riverflow distribution showed that the first four
stations have similar flow patterns. The pattern starts to change at station 5 (Ornac) due to
merging tributary flow. The flow patterns change rapidly from station 5 to station 7
(Davor) due to the more complicated hydrographic conditions and geomorphology.
  It was therefore decided to build a model that uses station 1 and station 5
with time lags to predict the flow at station 8. While the training (6 months of data)
shows fair agreement, the cross-validation (1 month of data) and testing (5 months of data)
results are overestimated to some degree. The explanation for these differences
is that the flow patterns of the first 6 months are quite different from those of the last 6 months,
and the training record is not long enough to capture the difference between stations 1 and 5
and station 8. Some improvement was found when the input series also included station 7. Even
better agreement was obtained by using a time-lag recurrent algorithm. The final result for
model testing is shown in Figure 2.

  It is useful to know how reliable the prediction would be if only limited data were
available. This is examined by selecting a single station and repeating the model
run with different lengths of record. The river stage at Zupanja was selected for
this test.
  Nine test runs were conducted with different lengths of training, cross-validation, and
testing data and with forecasting ranges from 1 day to 3 days. The results are summarized in
Table 1. Six parameters were used to determine the prediction reliability. The table
gives the prediction reliability for a given length of record and forecast range. For
example, with only 3 months of record, a 3-day prediction has a mean absolute error of over
1 m and a correlation coefficient of about 0.92.
[Figure 2 plot: observed and ANN output flow rate (cms) versus testing date (Jul 1 - Dec 31, 87)]

 Figure 2. Testing results for desired riverflow and actual network output, Slavonski Brod

  Training record   Forecast     MSE      NMSE     MAE      Min Abs E   Max Abs E   r

  1 yr training     1 day p.     0.0821   0.0233   0.2015   0.0002      1.3868      0.9884
                    2 day p.     0.2838   0.0808   0.4020   0.0027      2.3479      0.9595
                    3 day p.     0.5554   0.1583   0.5802   0.0036      3.2460      0.9203

  6 mo training     1 day p.     0.1125   0.0241   0.2336   0.0004      2.0161      0.9885
                    2 day p.     0.3413   0.0731   0.4075   0.0001      3.3268      0.9628
                    3 day p.     0.6491   0.1390   0.5990   0.0024      3.8329      0.9283

  3 mo training     1 day p.     0.6955   0.2053   0.6783   0.0041      1.5347      0.9731
                    2 day p.     0.9933   0.3041   0.8544   0.0034      1.7320      0.9489
                    3 day p.     1.3350   0.4227   1.0194   0.0201      1.9215      0.9174

    Table 1. Prediction reliability due to the length of training record for the Sava River
             stage prediction model
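
  The six performance parameters reported in Table 1 can be computed as in the following
Python sketch. The function name performance_metrics and the sample values are
hypothetical. NMSE is taken here as the MSE divided by the variance of the desired
(observed) series, which is an assumption about the definition used by the software.

```python
import math


def performance_metrics(desired, output):
    """Error measures for two equal-length series of desired and network values."""
    n = len(desired)
    errors = [o - d for d, o in zip(desired, output)]
    abs_errors = [abs(e) for e in errors]
    mse = sum(e * e for e in errors) / n
    mean_d = sum(desired) / n
    mean_o = sum(output) / n
    var_d = sum((d - mean_d) ** 2 for d in desired) / n
    var_o = sum((o - mean_o) ** 2 for o in output) / n
    cov = sum((d - mean_d) * (o - mean_o) for d, o in zip(desired, output)) / n
    return {
        "MSE": mse,
        "NMSE": mse / var_d,  # assumption: normalized by variance of desired series
        "MAE": sum(abs_errors) / n,
        "MinAbsE": min(abs_errors),
        "MaxAbsE": max(abs_errors),
        "r": cov / math.sqrt(var_d * var_o),  # linear correlation coefficient
    }


# Made-up values for illustration only: the output is the desired series shifted by 0.5.
metrics = performance_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

  A constant offset, as in this example, leaves the correlation coefficient at 1 while the
MSE and MAE are nonzero, which is one reason to report several measures together.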

  With improvements in computing speed, the training time of different
algorithms may no longer be a critical factor, provided the training record is not too long and
the network architecture is not too complicated. Testing accuracy, however, can suffer if the
algorithm selected does not suit the problem. Four different algorithms
are compared for riverflow prediction on the Mississippi River, and
Table 2 summarizes the results. While the traditional backpropagation
algorithm without time delay was the least accurate, the recurrent algorithm
gave the best results. The results indicate that an input-output
mapping with a certain memory length and strong nonlinearity best describes this
hydrological phenomenon.
                        Backpropagation      Backpropagation,         Time-Delay        Recurrent
                                             time shift input

           NMSE               0.0966                0.0303              0.0194             0.0168
           R                  0.9505                0.9847              0.9903             0.9918

           NMSE               0.1448                0.0416              0.0665             0.0652
           R                  0.9286                0.9807              0.9679             0.9680

           NMSE               0.1042                0.0344              0.0210             0.0171
           R                  0.9475                0.9834              0.9909             0.9922

           Table 2. Riverflow prediction reliability due to approach algorithms for the
                    Mississippi River segment model
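
  The difference in how the time-delay and recurrent approaches carry memory can be
sketched in Python as follows. This is a conceptual illustration only, not the
NeuroSolutions algorithms compared in Table 2; the function names, the single tanh state
unit, and the weights are assumptions made for the example.

```python
import math


def time_delay_inputs(series, t, memory=2):
    """Time-delay view: the network sees an explicit window
    series[t], series[t-1], ..., series[t-memory]."""
    return [series[t - k] for k in range(memory + 1)]


def recurrent_states(series, w_in, w_rec, bias=0.0):
    """Recurrent view: one hidden state is fed back at every step,
    so it carries a fading memory of all earlier inputs."""
    state = 0.0
    states = []
    for value in series:
        state = math.tanh(w_in * value + w_rec * state + bias)
        states.append(state)
    return states


window = time_delay_inputs([1, 2, 3, 4], t=3)

# A single pulse followed by zeros (made-up input).
pulse = [1.0, 0.0, 0.0]
without_feedback = recurrent_states(pulse, w_in=1.0, w_rec=0.0)
with_feedback = recurrent_states(pulse, w_in=1.0, w_rec=0.5)
```

  With no feedback weight the state returns to zero as soon as the pulse has passed; with
feedback the pulse continues to influence later states, which is the memory effect
referred to in the text.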

  ANN algorithms were successfully applied to two watershed systems of different scale for
riverflow and stage prediction, addressing the two primary hydrological
forecasting issues: time delay and nonlinearity. In a segment of the Lower Mississippi
River, the riverflow at Memphis, TN, can be predicted from two upstream gauges with a high
degree of accuracy, even with no rainfall data provided. This model can also be used to
simulate the influence of the Ohio River on the Mississippi River downstream. Relatively
less accurate results were obtained for the Sava River, due mainly to the limited length of record.
  The reliability of river stage/flow prediction can be estimated from the relationship
between training record length and the performance parameters. Proper selection of the solution
algorithm can help to increase model accuracy. The best performance of an ANN for
flow prediction depends heavily not only on the length of the data set but also on whether the
most significant patterns are included.

  The U.S. Army Engineer Research and Development Center, Waterways Experiment
Station, funded this work. Permission to publish this information was granted by the Chief
of Engineers.

Carriere, P., S. Mohaghegh, and R. Gaskari, 1996, “Performance of a Virtual Runoff Hydrographic
   System,” Journal of Water Resources Planning and Management, 122(6), 120-125.
Fernando, D. A. K. and A. W. Jayawardena, 1998, “Runoff Forecasting Using RBF Networks with
   OLS Algorithm,” Journal of Hydrologic Engineering, 3(3), 203-209.
Hjelmfelt, A. T. and M. Wang, 1996, “Predicting Runoff Using Artificial Neural Networks,” Surface
   Water Hydrology, 233-244.
Hsu, K., H. V. Gupta, and S. Sorooshian, 1998, “Streamflow Forecasting Using Artificial Neural
   Networks,” ASCE Water Resources Engineering Conference ’98, 967-972.
Karunanithi, N., W. J. Grenney, D. Whiteley, and K. Bovee, 1994, “Neural Networks for River Flow
   Prediction,” ASCE Journal of Computing in Civil Engineering, 8(2), 201-220.
Zhang, B. and R. S. Govindaraju, 1998, “Using Modular Neural Networks to Predict Watershed
   Runoff,” ASCE Water Resources Engineering Conference ’98, 897-902.
