CUAHSI Hydrologic Information Systems Proposal Narrative by wanghonghx

									            GeoInformatics: CUAHSI Hydrologic Information Systems
                                        Project Summary

This proposal advances integrative hydrologic science through the development of a hydrologic
information system that can be implemented at universities throughout the United States. It
involves collaboration between hydrologic scientists from the Consortium of Universities for the
Advancement of Hydrologic Science, Inc (CUAHSI) and computer scientists from the San Diego
Supercomputer Center, and supports a larger strategy at NSF to develop cyberinfrastructure for
the environmental and earth sciences. The CUAHSI Hydrologic Information System (HIS) is a
geographically distributed network of hydrologic data sources and functions that are integrated
using web services so that they function as a connected whole. CUAHSI web services make
national water data archives directly accessible to hydrologic scientists, almost as if the data
were located on a local disk drive.

HIS is built on a hydrologic information model that has six components: time series (streamflow,
water quality, groundwater levels), multidimensional fields (remote sensing, Nexrad
precipitation, weather and climate model outputs), geospatial themes (terrain, hydrography,
watersheds, soils, land cover, vegetation, geology, aquifers), simulation models (existing models
and scientific workflow models), information tools (data access, transformation, publication,
analysis, visualization), and information collections (digital watersheds, digital aquifers, digital
estuaries) accessible through online portals and desktop applications.

The intellectual merit of this project is that it harnesses information technology to support
hydrologic science by building an information model that has a coherent intellectual structure
and synthesizes data from many disciplines. It enables the tracing of water movement and
transport of constituents vertically between atmosphere, surface water and groundwater, and
horizontally through the landscape from watersheds and aquifers to streams, rivers, estuaries and
bays. It integrates data across scales of space and time. This will enable the testing of hypotheses
about the interfaces between hydrologic processes in a manner and scale that is rarely attempted
now. The results of these studies will be published in journal articles and as a series of CUAHSI
monographs. The hydrologic information system developed in this project is of significant
value in itself for hydrologic science research and also in the wider sense of being an example of
how cyberinfrastructure is developed for earth and environmental sciences.

The broader impacts of this project include its networking of hydrologic scientists at many
universities who will jointly contribute and receive hydrologic information. The
CUAHSI hydrologic information system and its accompanying datasets will be developed in the
public domain and available to the professional hydrology community, and to educators at all
levels. The integration of water information is important for water science, but also for water
planning, engineering and management. This project will have broad benefits in improving
water information access and scientific analysis across the nation.
Results from Prior NSF Support
(only the prior project most relevant to this project is reported)

David R. Maidment
Grant # EAR-0413265 ($964,218) 1/04 – 3/07 CUAHSI Hydrologic Information Systems
Principal Investigators: D. R. Maidment, C. Baru, P. Kumar, M. Piasecki, R. Hooper
Summary of Results: This project defined how cyberinfrastructure should be developed for the
hydrologic sciences, and developed a prototype community hydrologic information system using web
services through the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI).
Publications:
Goodall, J., 2005, A geotemporal framework for hydrologic analysis, PhD dissertation, University of Texas
at Austin.
Strassberg, G., 2005, A geographic data model for groundwater, PhD dissertation, University of Texas at
Austin.
Maidment, D.R., (Ed.), 2005, CUAHSI Hydrologic Information System Status Report, Consortium of
Universities for the Advancement of Hydrologic Science, Inc.,
224 p., http://www.cuahsi.org/docs/HISStatusSept15.pdf

David G. Tarboton
Collaborator on the Grant # EAR-0413265 ($964,218) 1/04 – 3/07 CUAHSI Hydrologic Information
Systems for which David Maidment is the Principal Investigator.
Summary of Results:
As part of this project, Dr. Tarboton has been responsible for the User Needs Assessment Survey and
the Hydrologic Observations Data Model Design.
Publications:
Bandaragoda, C. J., D. G. Tarboton and D. R. Maidment, (2005), "User Needs Assessment, Chapter 4,"
in Hydrologic Information System Status Report, Version 1, Edited by D. R. Maidment, p.48-87,
http://www.cuahsi.org/docs/HISStatusSept15.pdf.
Bandaragoda, C., D. G. Tarboton and D. R. Maidment, (2006), "Hydrology's Effort Towards the
Cyberfrontier," EOS, 87(1): 2,6.
Horsburgh, J. S., D. G. Tarboton and D. R. Maidment, (2005), "A Community Data Model for Hydrologic
Observations, Chapter 6," in Hydrologic Information System Status Report, Version 1, Edited by D. R.
Maidment, p.102-135, http://www.cuahsi.org/docs/HISStatusSept15.pdf.

Michael Piasecki
Grant # EAR-0412904 ($181,007) 1/04 – 3/07 CUAHSI Hydrologic Information Systems
Principal Investigators: D. R. Maidment, C. Baru, P. Kumar, M. Piasecki, R. Hooper
Summary of Results: Piasecki has developed a basic community metadata profile and also conceptual
representations for vocabulary taxonomy and processes using ontologies.
Publications: Bermudez, L., 2004, “ONTOMET: Ontology Metadata Framework”, PhD dissertation,
Drexel University, and a Status Report (see above).

Ilya Zaslavsky
Grant # EAR-0413182 ($1,059,352) 1/04 – 3/07 CUAHSI Hydrologic Information Systems
Principal Investigators: D. R. Maidment, C. Baru, P. Kumar, M. Piasecki, R. Hooper
Summary of Results: Dr. Zaslavsky has been responsible for the development and deployment
of cyberinfrastructure components of the prototype CUAHSI hydrologic information system, the
Hydrologic Data Access system in particular.
Publications:
Baru, C., I. Zaslavsky, and R. Wahadj (2005), "System Architecture, Chapter 3," in Hydrologic
Information System Status Report, Version 1, Edited by D. R. Maidment, p.24-47,
http://www.cuahsi.org/docs/HISStatusSept15.pdf.
Ruddell, B., Zaslavsky, I., Kumar, P., Jennings, C., Mehnert, E., Thomas, D., Holmes, R., Maidment, D.,
Piasecki, M., 2005, Development of the Illinois River Basin Virtual Observatory Prototype, Eos Trans.
AGU, 86(52), Fall Meet. Suppl., Abstract IN24A-04.

Project Description
Introduction

The Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) is a
legally independent organization with 101 US universities as members. It is supported by
the National Science Foundation to develop infrastructure and services for the advancement of
hydrologic science in the United States. The Hydrologic Information System (HIS) project is
one component of CUAHSI’s mission, which is motivated by the community’s desire to better
access and analyze hydrologic information. The broad aim of HIS is to provide a strong and
flexible foundation for data-intensive hydrologic research that can evolve as the needs of the
community change.

This aim coincides with a larger initiative within the National Science Foundation to develop
cyberinfrastructure for revolutionizing science and engineering. The emerging
vision “is to use cyberinfrastructure to build more ubiquitous, comprehensive digital
environments that become interactive and functionally complete for research communities in
terms of people, data, information, tools, and instruments and that operate at unprecedented
levels of computational, storage, and data transfer capacity.” (Atkins et al., 2003, p.17). NSF
has recently set up an Office of Cyberinfrastructure, and is presently developing a strategic
planning document called “NSF’s Cyberinfrastructure Vision for 21st Century Discovery” (NSF
Cyberinfrastructure Council, 2005). The CUAHSI Hydrologic Information System project is
the key component of NSF’s cyberinfrastructure development to support hydrologic science.

The existing CUAHSI Hydrologic Information System project was initiated in April 2004 for a
two-year period, which terminates in March 2006. This proposal is for a 5-year renewal of that
project. In this proposal, the existing and proposed HIS projects are referred to as HIS Phase 1,
and HIS Phase 2, respectively. There is a closely parallel NSF program in environmental
engineering called CLEANER (Collaborative Large-Scale Engineering Analysis Network for
Environmental Research). A status report summarizing the findings of HIS Phase 1 has been
published (Maidment, 2005), and has been reviewed in detail by the CLEANER
cyberinfrastructure committee, who consider it a model to which other science communities can
look to guide their cyberinfrastructure development efforts. Many copies of this report have
been requested by NSF for distribution to Program Officers concerned with cyberinfrastructure
development.

Project Goals

The CUAHSI HIS project has four goals:
    • Data Access – provide rapid access to a large volume of high-quality hydrologic data;
    • Hydrologic Observatories – develop a digital watershed framework for synthesizing
      data and models for a hydrologic region;
    • Hydrologic Science – strengthen place-based hydrologic science by supporting the
      representation of hydrologic processes with equations through an enhanced capacity to
      describe hydrologic environments with data;
    • Hydrologic Education – quantify and visualize the movement of water and chemicals in
      a hydrologic environment continuously in space and time.
A user survey conducted during HIS Phase 1 (Tarboton et al., 2005) showed that CUAHSI
members rank Data Access as the highest priority of these four goals.

Hydrologic Information Model

When HIS Phase 1 was initiated, what a Hydrologic Information System would consist of, or
how it would function, was unknown. It required more than a year of study to identify the four
goals just described. HIS investigators were confronted with a plethora of information types and
cyberinfrastructure techniques, without a visible pattern for structuring them. In any such
situation of great complexity, it is useful to take the whole problem and break it down into a
series of components, which can be addressed individually, and then reassembled to form a
solution for the whole problem.

The CUAHSI Hydrologic Information Model consists of seven components: time series,
multidimensional fields, geospatial themes, simulation models, information tools, information
collections and web portals. The first three of these components are categories of data. The
second three are means for analyzing data and modeling processes and storing sets of related
data and models. Web portals are the windows through which data and models are accessed and
shared. These components are now described in more detail:

Time Series – these include observational data records from streamflow, precipitation and
groundwater level gages, water quality and biological sampling, and climate and weather
stations. The locations of these measurements are typically represented in space as points, but
sampling patterns differ by data type. Groundwater levels are measured at a very large number
of points, with only a few samples at each point scattered irregularly in time. Water quality
sampling covers fewer points than groundwater levels but records many variables at each
sampling, so more data are measured at each point. Streamflow data are collected systematically
at regular intervals through time at a limited number of locations. Time series may
also be produced by hydrologic simulation models and by averaging continuous phenomena over
spatial regions, such as the average precipitation over a watershed.

Multidimensional Fields – these include the products of satellite and aircraft remote sensing,
Nexrad radar rainfall grids, and the output of weather and climate models. Fields represent
continuously distributed phenomena in space, with one or more variables described at each
location on a regular mesh or array. Observed fields are often spatially extensive but thin in
time, and what is needed for hydrologic science is spatially localized over a watershed or aquifer
but much deeper in time, so there is an important space-time recomposition problem involved
when using such data sources.
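
The space-time recomposition described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the HIS implementation: the grid values, the watershed mask, and the function name `watershed_series` are all hypothetical. It takes a stack of spatially extensive fields (one grid per time step) and reduces it to a single time series averaged over the cells inside one watershed.

```python
from typing import List

Grid = List[List[float]]  # one 2-D field (rows x columns) at a single time step


def watershed_series(fields: List[Grid], mask: List[List[bool]]) -> List[float]:
    """Average each field over the masked cells, giving one value per time step."""
    series = []
    for grid in fields:
        values = [
            grid[r][c]
            for r in range(len(mask))
            for c in range(len(mask[r]))
            if mask[r][c]
        ]
        series.append(sum(values) / len(values))
    return series


# Hypothetical 2 x 3 precipitation grids (mm) for three time steps.
fields = [
    [[0.0, 2.0, 4.0], [1.0, 3.0, 5.0]],
    [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    [[6.0, 2.0, 1.0], [4.0, 8.0, 1.0]],
]
# Hypothetical watershed footprint: True where a cell lies inside the watershed.
mask = [[False, True, True], [False, True, False]]

series = watershed_series(fields, mask)
print(series)
```

The sketch weights every cell equally; a real application would probably weight cells by the fraction of their area that falls inside the watershed boundary.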

Geospatial Themes – these are static representations of particular layers of information
describing the earth’s surface and subsurface, including land surface terrain, watershed and
stream networks, stream channel morphology, land cover, vegetation, soils, geology and
aquifers. Other themes that may enter a study are census data on population, agricultural
statistics, and infrastructure such as roads and dams. A particular theme may have a different
form depending on the spatial scale – the land surface terrain of the whole nation may be represented
using a 1 km Digital Elevation Model grid, while the terrain surface of a small watershed may be
represented using LIDAR data with 1 m post spacing. Geospatial themes may be composed of
discrete space objects or features, which are spatially distinct points, lines, areas or volumes, or
they may be continuous space themes, such as terrain surfaces or digital orthophoto images.

Simulation Models – these are computerized sets of equations representing the functioning of
hydrologic processes. Hydrologic modeling has occurred over decades, and most current models
were developed as stand-alone systems with specially designed input and output files that have
little in common from one model to another. Transferring information between these models
and an external data infrastructure is possible but difficult, and directly connecting one arbitrarily
selected hydrologic model with another is nearly impossible. New methods of modeling that
emphasize loosely connected modularized functions are needed.

Information Tools – these are devices for accessing, transforming, analyzing and visualizing
hydrologic data, and for connecting data with models. Tools generally perform a single
function, and when many tools are assembled into a package, they form a toolkit or application
system. The use of scientific workflows such as Kepler (Altintas et al., 2004), D2K (NCSA,
2006), or ModelBuilder (ESRI, 2006), to sequence the operations of tools is a new way of
constructing complex analysis and modeling systems.
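
The idea of sequencing single-function tools into a workflow can be sketched in a few lines. This is an illustrative sketch only; it does not use Kepler, D2K, or ModelBuilder, and the tool names (`drop_missing`, `to_metric`) and the convention that negative values flag missing data are assumptions made for the example.

```python
from functools import reduce
from typing import Callable, List, Sequence

Series = List[float]
Tool = Callable[[Series], Series]

CFS_TO_CMS = 0.0283168  # cubic feet per second -> cubic meters per second


def drop_missing(values: Series) -> Series:
    """Remove negative values, treated here as missing-data flags (an assumption)."""
    return [v for v in values if v >= 0]


def to_metric(values: Series) -> Series:
    """Convert discharge values from cubic feet per second to cubic meters per second."""
    return [v * CFS_TO_CMS for v in values]


def run_workflow(tools: Sequence[Tool], data: Series) -> Series:
    """Apply each tool in order, passing the output of one as input to the next."""
    return reduce(lambda current, tool: tool(current), tools, data)


result = run_workflow([drop_missing, to_metric], [100.0, -999.0, 200.0])
print(result)
```

Because every tool shares one input and output type, tools can be reordered or swapped without changing the code that runs the workflow.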

Information Collections – these are assemblies of series, fields, themes, models and tools into a
connected structure that comprehensively describes a hydrologic environment. A Digital
Watershed is an information collection describing a drainage basin. One may similarly define a
Digital Aquifer or Digital Estuary to describe other hydrologic environments. Data file formats,
such as HDF (Hierarchical Data Format) (Folk, 2005), or the ArcGIS geodatabase may be used
to store collections. A more flexible method is provided by the GEON registration system in
which individual datatypes registered in a catalog can be combined into a “Data Integration Cart”
collection using a variety of relationships, including spatial ones (GEON, 2005).

Web Portals – these are content management web sites that serve as gateways to a broad array
of resources and services accessed through portlets. Cross-platform portlet standards are
emerging (OASIS, 2003) that allow portal developers to easily embed remotely running web
services into their portals. A cybercollaboratory is a particular kind of portal
designed to facilitate sharing and discussion of information among a community of scientific
investigators.

Thus, the CUAHSI Hydrologic Information System is a geographically distributed set of series,
fields, themes, models, and tools, which are connected using the internet, assembled into
collections, and accessed through portals. The innovation in computer science which makes all
this possible is the Service Oriented Architecture (SOA) in which individual functions are
constructed as web services and made available at network nodes for use by other participants in
the network via standard access protocols. Thus, a hydrologic scientist using CUAHSI HIS
has access to a network of data sources, models and tools some of which are resident on his or
her own computer but many of which function automatically on remote computers, in much the
same way that a scientist communicates with colleagues via email without worrying about what
operating system or software the recipient's email system is using.

Accomplishments of HIS Phase 1

(1) CUAHSI Web Services
A CUAHSI web services library has been built that provides direct access to data from the
USGS National Water Information System, the Ameriflux tower network, a portion of the
National Climatic Data Center’s (NCDC) archive, and some products from MODIS remote
sensing. Functions in the NWIS library can be viewed at
http://river.sdsc.edu/NWISTS/nwis.asmx. For streamflow, this library contains several functions,
such as GetSiteInfo, GetDischargeInfo, and GetDischargeValues. GetSiteInfo takes a USGS
station number as input and returns an XML document that contains station metadata (name,
location, and other attributes); GetDischargeInfo takes a station number as input and returns the
number of discharge observations and the date times of the first and last observations;
GetDischargeValues takes a station number, start date and end date as inputs, and produces a
time series of discharge values and times. Similar functions exist for extracting water quality
and groundwater data from NWIS.
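
To make the calling pattern concrete, the sketch below builds the kind of SOAP request a client might send to invoke GetDischargeValues. It is a hedged illustration, not the actual service contract: the service namespace, the parameter names (`stationNumber`, `startDate`, `endDate`), and the station number are placeholders, and the real names would come from the service's published description. No network call is made.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/nwis"  # placeholder, not the real service namespace


def build_request(method: str, params: dict) -> str:
    """Return a SOAP 1.1 envelope that calls `method` with the given parameters."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Parameter names and the station number are hypothetical placeholders.
request_xml = build_request(
    "GetDischargeValues",
    {"stationNumber": "00000000", "startDate": "2005-01-01", "endDate": "2005-01-31"},
)
print(request_xml)
```

In practice a client library would generate this envelope from the service description, so the user would only see a function call with three arguments.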

Each web service method is an elementary piece of code that performs a single function and is
described using the Web Services Description Language (WSDL), a W3C standard that enables
instructions made on one computer to be executed on another. CUAHSI web services for NWIS
are web page scrapers -- they programmatically mimic the action a human user would take when
using the NWIS web site http://waterdata.usgs.gov/nwis to create a URL request string that when
submitted to NWIS produces the same output file a human user sees. The web service then
parses that file to transform the information into a standard XML format as required by WSDL.
A similar approach has been used to access Ameriflux and MODIS data. The value in this
approach is that it just uses a data agency’s web site as it is. The NCDC web services use a
different and more profound approach – NCDC has placed a portion of its archive outside its
firewall for the Automated Surface Observing System (ASOS) at airport weather stations. The
NCDC has created querying functions using the Simple Object Access Protocol (SOAP), another
W3C standard, which allows CUAHSI to make direct data access requests into the NCDC archive
without mimicking any web page operations. This is faster and more secure for CUAHSI, but
more risky for the data provider because it allows remote machine access to the archive.
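
The page-scraping approach described above has two steps: assemble the URL request string that the web form would submit, then parse the returned text into XML. The sketch below illustrates both under loud assumptions: the path segment `discharge`, the query field names, the tab-delimited response layout, and the output element names are all hypothetical, and the response is a hard-coded sample rather than real NWIS output.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "http://waterdata.usgs.gov/nwis"  # web site named in the text


def build_request_url(station: str, start: str, end: str) -> str:
    """Assemble the query string a browser form would submit (field names assumed)."""
    query = urlencode({"site_no": station, "begin_date": start, "end_date": end})
    return f"{BASE_URL}/discharge?{query}"  # "discharge" path is a placeholder


def parse_to_xml(text: str) -> ET.Element:
    """Convert tab-delimited rows (date, value) into an XML time series."""
    root = ET.Element("timeSeries")
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        date, value = line.split("\t")
        ET.SubElement(root, "value", dateTime=date).text = value
    return root


# Hypothetical response text standing in for what the web site would return.
SAMPLE_RESPONSE = "# sample header\n2005-01-01\t120\n2005-01-02\t115\n"

url = build_request_url("00000000", "2005-01-01", "2005-01-02")
series_xml = parse_to_xml(SAMPLE_RESPONSE)
print(url)
print(ET.tostring(series_xml, encoding="unicode"))
```

A scraper of this kind depends on the layout of the agency's pages, so it must be updated whenever that layout changes.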

The advent of CUAHSI web services means that a hydrologic scientist using any programming
language (Fortran, C/C++, Visual Basic, Java), or any application (Excel, ArcGIS, Matlab),
running on any operating system (Windows, Unix, Linux, Macintosh), can directly access
hydrologic data in several national archives (NWIS, Ameriflux, NCDC, MODIS). This is a
remarkable accomplishment. It is almost as if the national data archives are loaded on a local
disk in the user’s computer. A HydroObjects library has been prepared that acts as a
middleware component on the user’s computer and provides applications like Excel, ArcGIS and
Matlab with access to the web services without having to program each service into each
application.

The value of CUAHSI web services was quickly recognized outside academic circles. The
availability of these services was announced during a CUAHSI cyberseminar on HIS presented
on Friday, October 28, 2005. On Wednesday, November 2, Jason Love, from a private firm, RESPEC,
in Sioux Falls, South Dakota, posted on the EPA Basins list server: “Occasionally one comes
across something that is worth sharing; the CUAHSI Hydrologic Information Systems - Web
Services Library for NWIS is a valuable tool for those of us interested in rapidly acquiring and
processing data from the USGS, e.g., calibrating models and performing watershed
assessments.” He provided a tutorial on how to use the services with Matlab, which CUAHSI
had not developed. Thus, the technology transfer from academia to the private sector to the
public sector occurred in less than one week! Better access to the nation’s water information has
wide benefits beyond its contributions to the advancement of hydrologic science.

(2) Hydrologic Data Access System

Web services work well when the user knows where the data have been measured and what has
been measured there. A data archive like NWIS is actually a collection of observing networks,
one for streamflow, another for water quality, a third for groundwater levels. Each network has
a set of observation stations, each with its own name, identifying number, latitude and longitude.
Each station has a set of one or more observation parameters (stage height, streamflow,
dissolved oxygen, water level) that may be regularly or irregularly recorded through time to form
observation series. This pattern of networks of stations having parameters described by series is
repeated in all the national hydrologic observation systems (NWIS, EPA Storet, NCDC,
NAWQA, Ameriflux) and the pattern is repeated in state, local and academic investigator
hydrologic observation systems. The web sites that provide access to these data have tabular
interfaces because the data series are stored behind them in relational database tables.

As part of providing access to a particular observation network, a station map is constructed by
building a program or web service for GetSiteInfo and applying that systematically over the
spatial domain of the data source to harvest all the station locations. Then, by applying a
GetParameterInfo service, the number and type of measurements available at that station can also
be harvested. The end result is a hydrologic observations metadata catalog for the network that
consists of a dot map showing station locations, and attached attribute tables that show what is
measured at each station. The station maps for each observing network can be integrated to
form an observation station map for the nation, as shown in Figure 1.
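
The harvesting procedure can be sketched as a loop that calls the two services for each station and records the results. This is a minimal sketch under stated assumptions: the real services return XML rather than Python dictionaries, the harvest here iterates over a given list of station numbers rather than a spatial domain, and the `Station` class, the stand-in service functions, and all station data are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Station:
    number: str
    name: str
    latitude: float
    longitude: float
    parameters: List[str] = field(default_factory=list)


def harvest_catalog(
    station_numbers: Iterable[str],
    get_site_info: Callable[[str], dict],
    get_parameter_info: Callable[[str], List[str]],
) -> Dict[str, Station]:
    """Build a metadata catalog: station location plus what is measured there."""
    catalog = {}
    for number in station_numbers:
        info = get_site_info(number)
        catalog[number] = Station(
            number=number,
            name=info["name"],
            latitude=info["latitude"],
            longitude=info["longitude"],
            parameters=get_parameter_info(number),
        )
    return catalog


# Stand-ins for the remote services, backed by made-up station records.
FAKE_SITES = {
    "A001": {"name": "Example Creek", "latitude": 35.6, "longitude": -78.4},
    "A002": {"name": "Sample River", "latitude": 35.2, "longitude": -77.6},
}
FAKE_PARAMETERS = {"A001": ["streamflow"], "A002": ["streamflow", "dissolved oxygen"]}

catalog = harvest_catalog(FAKE_SITES, FAKE_SITES.__getitem__, FAKE_PARAMETERS.__getitem__)
for station in catalog.values():
    print(station.number, station.latitude, station.longitude, station.parameters)
```

The latitude and longitude of each record supply the dot on the station map, and the parameter list supplies the attached attribute table.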




Figure 1. Hydrologic observation station map for the continental US.

The observation station map is presented in the CUAHSI Hydrologic Data Access System
(HDAS) (http://river.sdsc.edu/HDAS) against a backdrop of watershed and stream network data
to provide spatial context. A hydrologic scientist can zoom in to any region of the nation and see
where data have been measured, query what has been measured there, obtain graphs and tables
of selected observation series, and download them as .csv or Excel files. These functions are
supported by calls to the web services library. The HDAS provides a common data window on
water observation information in the nation in much the same way that Travelocity or the Home
Shopping Network do for travel or shopping.

The HIS effort began in 2002 with a committee that wrote an HIS white paper, later published as
a CUAHSI Technical Report (HIS, 2002). In that report, it was envisaged that the HIS would be
based on a Hydrologic Data Access Center which would be a centralized facility supporting data
access. What has emerged through web services, however, is a Hydrologic Data Access System
which hydrologic scientists can extend by adding web services and station maps for any
observation network in the nation. Thus, the role of CUAHSI HIS is to create the framework for
this system and its services for accessing national data archives, and then support scientists in
CUAHSI institutions to extend this system by adding state, local and individual investigator
networks.

(3) Hydrologic Observations Database

Hydrologic scientists collect data in field campaigns and experimental sites and it is useful to
have a standardized hydrologic observations database for storing that information so that it can
be automatically incorporated into the national HIS when the investigator is ready to publish the
information. Based on an initial design concept, and review from 22 CUAHSI scientists, a more
fully configured relational database schema has been designed and tested on limited hydrologic
datasets (Horsburgh et al., 2005).
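
As a rough illustration of what a relational schema for observations might look like, the sketch below stores sites, variables, and data values in three linked tables. The table and column names are assumptions chosen for the example and are not taken from the schema in Horsburgh et al. (2005); the site and values are made up.

```python
import sqlite3

# Hypothetical three-table layout: where, what, and the measured values.
SCHEMA = """
CREATE TABLE sites (
    site_id   INTEGER PRIMARY KEY,
    site_code TEXT NOT NULL UNIQUE,
    latitude  REAL,
    longitude REAL
);
CREATE TABLE variables (
    variable_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    units       TEXT NOT NULL
);
CREATE TABLE data_values (
    site_id     INTEGER NOT NULL REFERENCES sites(site_id),
    variable_id INTEGER NOT NULL REFERENCES variables(variable_id),
    date_time   TEXT NOT NULL,
    value       REAL NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO sites VALUES (1, 'SITE-A', 41.7, -111.8)")
conn.execute("INSERT INTO variables VALUES (1, 'streamflow', 'm3/s')")
conn.executemany(
    "INSERT INTO data_values VALUES (1, 1, ?, ?)",
    [("2005-06-01T00:00", 3.2), ("2005-06-02T00:00", 2.9)],
)

# Join the tables to recover a labeled observation series.
rows = conn.execute(
    """
    SELECT s.site_code, v.name, d.date_time, d.value
    FROM data_values AS d
    JOIN sites AS s ON s.site_id = d.site_id
    JOIN variables AS v ON v.variable_id = d.variable_id
    ORDER BY d.date_time
    """
).fetchall()
print(rows)
```

Keeping site and variable descriptions in their own tables means each observation row carries only keys, a timestamp, and a value, which is what allows data from many investigators to be merged into one store.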

(4) Hydrologic Metadata

Whether data are obtained from a government agency archive or an investigator dataset, they
require metadata to describe the character of the information. Each national data archive has its
own metadata schema, and the scope of the issue can be assessed from the fact that the NWIS
system alone stores data for more than 10,000 parameters, mostly water quality species. The
EPA Storet system for water quality has another metadata schema, different from that of the
USGS even when describing exactly the same water quality variable. The National Climatic
Data Center has yet another approach. Atmospheric water flux data from the Ameriflux
network have one symbology and the same variables described in the North American Regional
Reanalysis of climate have another. There is thus a very complex problem of semantic
mediation or interpreting the parameter information from each individual source correctly, and
then finding how parameter information from separate sources can be combined appropriately.
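
One simple form of semantic mediation is a crosswalk table that maps each source's own parameter identifier to a shared concept. The sketch below is illustrative only: the source names, the parameter identifiers, and the concept names are placeholders, not real agency codes, and the ontology-based approach pursued in the project is richer than a flat lookup.

```python
from typing import Dict, Tuple

# (source, source-specific parameter identifier) -> shared concept name.
# All identifiers below are illustrative placeholders, not real agency codes.
CROSSWALK: Dict[Tuple[str, str], str] = {
    ("source_a", "P001"): "dissolved_oxygen",
    ("source_b", "DO_MGL"): "dissolved_oxygen",
    ("source_a", "P002"): "streamflow",
}


def mediate(source: str, parameter: str) -> str:
    """Translate a source-specific parameter identifier into the shared concept."""
    try:
        return CROSSWALK[(source, parameter)]
    except KeyError:
        raise KeyError(f"no mapping for {parameter!r} from {source!r}") from None


def comparable(a: Tuple[str, str], b: Tuple[str, str]) -> bool:
    """Two series can be combined only if they describe the same concept."""
    return mediate(*a) == mediate(*b)


print(comparable(("source_a", "P001"), ("source_b", "DO_MGL")))
print(comparable(("source_a", "P001"), ("source_a", "P002")))
```

A flat table like this ignores differences in units, sample medium, and method, which is one reason a fuller metadata profile is needed.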

(5) Hydrologic Modeling using Web Services

Hydrologic models can be operated using web services by automatically ingesting their input
data from national archives at run time. Goodall (2005) built a hydrologic flux coupler that
ingests precipitation, evaporation and groundwater recharge flux fields from the North American
Regional Reanalysis of climate and streamflow discharge data from NWIS gaging stations, and
combines all of these with geospatial information on watersheds to compute a daily or monthly
water balance for watersheds of the Neuse basin, North Carolina. This accomplishes in minutes
what otherwise requires hours of tedious effort manipulating web pages to get data, making format
conversions to get everything in the right units, and then running the water balance simulation. The
resulting water balance model can readily be applied anywhere in the nation because it is built
directly over the national data archives.
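
The arithmetic behind such a water balance can be sketched as follows. The formulation is an assumption made for illustration, since the text does not give Goodall's equations: storage change is taken as precipitation minus evaporation, recharge, and streamflow, with all terms expressed as depth over the watershed. The function names and input numbers are hypothetical. The unit conversion is the kind of step the text refers to: discharge arrives as a volume rate and must be turned into a depth.

```python
SECONDS_PER_DAY = 86_400


def discharge_to_depth(q_cms: float, area_km2: float) -> float:
    """Convert mean daily discharge (m^3/s) to runoff depth (mm/day) over a watershed."""
    area_m2 = area_km2 * 1_000_000
    depth_m = q_cms * SECONDS_PER_DAY / area_m2
    return depth_m * 1000


def storage_change(
    precip_mm: float, evap_mm: float, recharge_mm: float, q_cms: float, area_km2: float
) -> float:
    """Daily storage change (mm): inputs minus outputs, under the assumed sign convention."""
    return precip_mm - evap_mm - recharge_mm - discharge_to_depth(q_cms, area_km2)


# Made-up daily values for a hypothetical 864 km^2 watershed.
runoff = discharge_to_depth(q_cms=10.0, area_km2=864.0)
delta_s = storage_change(
    precip_mm=5.0, evap_mm=2.0, recharge_mm=0.5, q_cms=10.0, area_km2=864.0
)
print(runoff, delta_s)
```

With these inputs, 10 m^3/s over 864 km^2 corresponds to about 1 mm/day of runoff, leaving a storage change of about 1.5 mm for the day.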

This was accomplished using a scientific workflow language, ArcGIS ModelBuilder, in which
modular tools are visually connected in an iconic tableau to show the computational logic. It
has also been shown that entire hydrologic simulation models such as HEC-HMS or HEC-RAS
can be called directly as tools in a workflow (Whiteaker et al., 2006). The workflow model
shown in Figure 2 can itself be published on a server and called as a web service by a user at a
remote location. Thus, models can become web services just like data, and a geographically
distributed system of data and models can be created.




Figure 2. A scientific workflow using web services to ingest hydrologic flux and flow data and
perform a water balance for watersheds in the Neuse basin, North Carolina.

Other NSF scientific communities are developing scientific workflows – the LTER ecological
community is working with the SDSC Kepler system, and the NCSA environmental
cyberinfrastructure project is using their D2K (Data to Knowledge) system. It has been
demonstrated that CUAHSI’s web services for hydrologic data can be called as tools in Kepler
and in D2K, so the same web services library can support any number of scientific workflow
systems. The NCSA environmental cyberinfrastructure project is investigating metaworkflows
that will make workflow models developed in different systems interoperable.

(6) Digital Watershed Toolkit

A Digital Watershed Toolkit (http://geo.sdsc.edu/cuahsi/Toolkit/tabid/79/Default.aspx) has been
prepared which contains six components: a groundwater data model and toolkit, the hydrologic
observations database schema, GeoLearn – a standalone system for processing remote sensing
data, a river channel morphology model and toolkit, a watershed data model and toolkit, and a
Time Series Analyst. The Time Series Analyst is particularly interesting because it was
developed independently of the CUAHSI HIS project by Jeffrey Horsburgh at Utah Water
Research Laboratory (UWRL). At first it operated only on its own specially constructed
database of downloaded hydrologic data for a watershed in Utah. When CUAHSI web services
became available Dr Horsburgh created a version of this analyst that operates directly over these
web services, and thus his analyst is now applicable anywhere in the nation. This application is
supported on the web at UWRL, http://water.usu.edu/nwisanalyst/ so now a hydrologic scientist
anywhere in the nation can access Dr Horsburgh’s tool and execute it on NWIS data collected
anywhere in the nation! This multiplies by a factor of thousands the value of Dr Horsburgh’s
work in programming this analyst tool. The HIS development team hopes that other CUAHSI
members will similarly nationally enable their tools and models as Dr Horsburgh has done and
include them in the Digital Watershed Toolkit.

(7) Hydrologic Observatory Collection

A Digital Library has been constructed for the Illinois River Basin Observatory
(http://irbho.cee.uiuc.edu/irbho/digitallibrary.php) that indexes more than 600 information
sources containing information and data about the observatory region. Initially this collection
was assembled using Arbitrary Digital Objects, which are unstructured collections of information
in zip files with metadata descriptors. It later emerged that arbitrarily structured objects are
difficult to interpret automatically, so the collection has been reconfigured using the GEON
registration system, in which each indexed object has a particular datatype (e.g. image, relational
database, netCDF file) and thus can be linked automatically to tools that will open and view
such datatypes.

(8) Web Portals

Products from the HIS-1 effort are displayed through an HIS portal mounted at the San Diego
Supercomputer Center (http://geo.sdsc.edu/cuahsi/), through the CLEANER-CUAHSI
Cybercollaboratory mounted at the NCSA in Illinois (http://cleaner.ncsa.uiuc.edu/home/), and
through the CUAHSI program office portal in Washington (http://www.cuahsi.org/his.html).
Thus, multiple science communities and web outlets can access the same hydrologic information
model components. These components may also be similarly incorporated in web portals that
have been created by regional CUAHSI Hydrologic Observatory teams.

Assessment of HIS Phase 1 Accomplishments

Three letters attached to this proposal have been solicited to assess the accomplishments from the
current HIS project. Dr William Michener, Co-Director of the NEON project office and
Associate Director of the LTER Network Office, represents the ecological science community.
He writes “I remain extremely impressed with the team of experts that you have engaged to
develop the HIS and the logical and product-oriented process that you have followed.” He
describes the HIS products developed to date for hydrologic data, and goes on “Importantly, and
from a NEON perspective, it will be possible for us to build upon the CUAHSI HIS efforts and
to focus more of our energies on activities related to creating the databases and analytical tools
related to terrestrial ecosystems and biodiversity”.

Dr Robert Hirsch writes as Associate Director for Water at the USGS, and Chairman of the
Subcommittee on Water Availability and Quality (SWAQ) of the President’s National Science
and Technology Council, which is “the primary coordinating and planning group for water-
related science and technology in Federal government”. Dr Hirsch states “I am familiar with the
HIS project and am highly supportive of its goals and highly impressed by its accomplishments
to date…. I believe that the HIS effort holds great potential for helping all of the data delivery
services of the Federal water agencies live up to their full potential.” Dr Hirsch has suggested,
and CUAHSI has accepted, that the SWAQ should set up a committee of federal water agency
representatives to advise CUAHSI on how best to bring together the nation’s water information.
Thus, CUAHSI HIS has been endorsed at the highest level of federal water agency coordination.

Clint Brown writes as Director of Software Products for the Environmental Systems Research
Institute (ESRI), makers of the ArcGIS geographic information system. He states “This
important project will greatly benefit information access, integration, and use by the hydrologic
science community, yet it goes far beyond this single community….We believe that your work
will be a critical footprint in illustrating how to build distributed information systems that
integrate scientific principles for helping to better manage our nation.”

These three perspectives, from leaders in the neighboring sciences, in the federal water agencies,
and in the computer industry, show collectively the high regard in which the CUAHSI HIS
project is held. It truly is viewed as a critical component for the advancement not just of
hydrologic science, but of ecological science, of uniting the nation’s water information, and of
advancing the scientific implementation of information systems on a broad scale. The challenge
for the HIS team is to maintain the pattern of accomplishments that has earned this regard.

Plans for HIS Phase 2

CUAHSI Community Challenge

Now that a reasonable understanding has been obtained through HIS Phase 1 of how a
community-based hydrologic information system can function, an obvious focus for Phase 2 is to
consolidate the effort and concentrate on delivering usable information products. The task statements
presented subsequently have that aim. However, it is useful in the larger sense to have a single
strategic goal to unite various components of the effort. Dr Richard Hooper, President and CEO
of CUAHSI, has presented the following CUAHSI community challenge to animate the activities
of the observatory teams: “Predict the fluxes of water and chemicals, continuously in space and
time, throughout the rivers, lakes, aquifers and estuaries of the nation.” At first sight, this seems
like a lofty goal, so far out into the future as to be unrealizable. But Global Climate Modeling
was similarly thought to be far-fetched when it was initiated, and the first models were not very
effective, yet that science has now matured to the point where GCM results are leading to critical
national policy decisions. Having a lofty science challenge in mind helps to keep the HIS effort
focused on supporting science issues, because it is very easy to become preoccupied with the
very necessary details of how cyberinfrastructure actually works.

There are several advantages of the CUAHSI community challenge: it serves to link the four
HIS goals (data access, observatories, science, education); it helps to frame science questions
that clarify particular aspects of the challenge; it shows that HIS must think in terms of
continuous space and time distributions of phenomena, but also consider the movement of water
and chemicals into and out of discrete-space objects (rivers, lakes, aquifers, estuaries); it presents
a large national-scale hydrologic science challenge that the NSF petascale computing
infrastructure could support; and it means that HIS must deliver information and integrate
modeling across all spatial scales, from global and continental scale weather patterns to water
movement and chemical transport and transformation at a point location within a soil column or
a stream channel. HIS must also consider all time scales, from instantaneous events like flood
peak flows to the very slow evolution of the landscape through geological time. Figure 3 shows
typical cartographic mapping scales for geospatial themes that CUAHSI HIS uses to depict
information in digital watersheds.




Figure 3. CUAHSI HIS integrates information and modeling across all spatial scales.

CLEANER and WATERS

CUAHSI in geosciences and CLEANER in environmental engineering have been encouraged by
NSF to jointly pursue an MREFC (Major Research Equipment and Facilities Construction) project
called WATERS (Water and Terrestrial Environmental Research System). This is a very long
term goal whose implementation cannot begin until 2011 or later, but in the meantime, it is
expected that these two programs will have interoperable systems and facilities. A conceptual
diagram of the research process, prepared by Dr Barbara Minsker, PI of the CLEANER planning
office, is shown in Figure 4. It contains six components of an integrated cyberinfrastructure
process: knowledge services, data services, workflows and model services, meta-workflows,
collaboration services, and digital libraries. Knowledge services help scientists and engineers
find the information they need quickly and effectively, and the remaining terms have been
explained earlier in this proposal. On January 19, 2006, Drs Minsker and Maidment presented a
joint seminar to the NSF Engineering Directorate, in which they pointed out that the NCSA
environmental cyberinfrastructure project, in which Dr Minsker participates, is focused on the
first, fourth and fifth of these boxes, namely knowledge services, meta-workflows and
collaboration services, while the CUAHSI HIS project is focused on the second, third and sixth
boxes, data services, workflows and model services and digital libraries. The two projects are
working together to construct this integrated cyberinfrastructure for the research process.




Figure 4. Integrated cyberinfrastructure for WATERS (see footnote 1).

Surface Process Cyberinfrastructure

On January 18, 2006, a meeting was held at NSF in Washington to coordinate earth
surface process cyberinfrastructure. The PI of this proposal (Maidment) developed a meeting
book with contributions from all the groups represented at the meeting (see Table 1) to document
their aspirations and technical approaches. It is clear that there is a great deal of synergy among
these various cyberinfrastructure efforts; indeed, one of the most exciting aspects of
cyberinfrastructure development is the opportunity to benefit from data and model sharing with
neighboring sciences. The CUAHSI HIS team will work diligently to accomplish this goal.

Organization  Name                                                      Web site
CLEANER       Collaborative Large-Scale Engineering Analysis Network    http://cleaner.ncsa.uiuc.edu/
              for Environmental Research
CSDMS         Community Surface Dynamics Modeling System                http://www.nced.umn.edu/CSDMS.html
CUAHSI        Consortium of Universities for the Advancement of         http://www.cuahsi.org
              Hydrologic Science, Inc
CZEN          Critical Zone Exploration Network                         http://www.wssc.psu.edu/
EarthChem     Advancing Data Management in Solid Earth Geochemistry     http://www.earthchem.org/
GEON          The Geosciences Network                                   http://www.geongrid.org/
IRIS          Incorporated Research Institutions for Seismology         http://www.iris.edu/
LTER          The US Long Term Ecological Research network              http://www.lternet.edu/
NCAR          National Center for Atmospheric Research                  http://www.ncar.ucar.edu/
NCED          National Center for Earth-surface Dynamics                http://www.nced.umn.edu/
NCEAS         National Center for Ecological Analysis and Synthesis     http://www.nceas.ucsb.edu/
SAHRA         Sustainability of semi-Arid Hydrology and Riparian Areas  http://www.sahra.arizona.edu/
Unidata       Unidata Program Center                                    http://www.unidata.ucar.edu/
Table 1. Surface process cyberinfrastructure communities and centers.

1 This diagram (Figure 4) is a slight amendment of one drafted by Dr Barbara Minsker, University of Illinois.

SAHRA and NCED

The NSF Earth Sciences Division supports two Science and Technology Centers, SAHRA and
NCED (see Table 1). Both of these centers maintain substantial hydrology and stream channel
morphology data collection efforts, SAHRA in the Upper Rio Grande and Upper San Pedro
basins, and NCED in the Angelo Coast Range Reserve in the Eel River watershed, California.
NCED has one component of its program called “Desktop Watersheds” which is aimed at
developing predictive models for watershed behavior that can guide field investigation. The
CUAHSI Digital Watershed provides a structured information base upon which the predictive
models contained within the Desktop Watershed may be constructed. CUAHSI will work with
NCED and SAHRA to incorporate their data into the national HIS. A commitment letter to this
effect from Dr James Shuttleworth, Director of SAHRA, is attached to this proposal.

CUAHSI Program Office

The CUAHSI Program Office in Washington, DC is the hub of CUAHSI’s operation, and in all
policy matters the CUAHSI HIS program is subordinate to CUAHSI’s President and Executive
Committee. In discussions between CUAHSI, NSF and HIS, it has been decided that all
outreach efforts for the CUAHSI HIS program, such as the annual HIS Symposium, will be
funded from the program office. Moreover, to the extent possible, the CUAHSI web site
http://www.cuahsi.org will become the central portal to which hydrologic scientists go to
learn about and use HIS products.

HIS Committees

The HIS program will have two associated committees, an HIS Standing Committee and a
Technical Advisory Committee. The HIS Standing Committee represents the CUAHSI
community to provide oversight on the functioning of HIS. This Committee has been
functioning already throughout HIS Phase 1. Its members are Dennis Lettenmaier, University of
Washington (Chairman), Larry Band, University of North Carolina, William Michener,
University of New Mexico, Paul Morin, University of Minnesota, and Kelly Redmond, Desert
Research Institute. Members of the Standing Committee participate in the HIS biweekly
conference calls and the HIS PI and the Standing Committee Chairman consult with each other
from time to time. When necessary, the Standing Committee develops formal reports for the
CUAHSI Executive Committee about the HIS program. The Technical Advisory Committee is
still being formed but it will consist of representatives from the SWAQ federal water agencies
and from the computer industry. The purpose of this Committee, which will probably meet
twice per year, is to provide advice and guidance to the HIS effort.


GIS and HIS

Geographic Information Systems (GIS) are an important underlying technology in HIS because
of the use of geospatial themes in HIS and the use of geospatial location as a mechanism for
integrating hydrologic observation networks. The Hydrologic Data Access System is built at the
San Diego Supercomputer Center on top of ArcGIS Server version 9.1. This web server
technology is a back-end infrastructure that does not in any way impede the manner in which
hydrologic scientists receive and use hydrologic data. As indicated in the attached letter from
Clint Brown, the Environmental Systems Research Institute has “joined in an
informal partnership with CUAHSI and San Diego Super Computing Center to help support the
HIS system implementation using GIS technology and methods.” This collaboration is helpful
to the HIS in several ways. First, it provides a worthwhile peer review of our HIS product
development strategy from experienced industrial software engineers. The task of building
information products to support users in more than 100 universities on a budget of less than $1
million per year is a challenge. The knowledge acquired by listening to experienced ESRI
software engineers, whose products are already deployed at all these institutions and widely used
in hydrologic science, is helpful to our development team.

Moreover, collaboration with ESRI gives CUAHSI somewhere to refer groups who want to build an
HIS for themselves but who don’t fit into the domain of NSF’s science focus. At the SAHRA center,
for example, there is an aspiration to build an Arizona Hydrologic Information System, to unite
water information for Arizona in much the same way that HIS is proposing here to unite that
information for the nation. CUAHSI HIS cannot dilute its efforts by supporting State-level HIS
programs in various States, but it can refer such groups to ESRI.

NSF has a policy stated in the EAR/IF proposal guidelines that all products developed under this
competition shall be open source, so there is a policy issue that has to be worked out between
CUAHSI, NSF and the participating universities to ensure that all products and information
developed in this project conform to appropriate standards and guidelines. The fact that 96% of
the CUAHSI community uses the Windows operating system needs to be considered in this
policy. A draft CUAHSI policy statement addressing this issue is attached to this proposal. This
policy statement will subsequently be discussed, opened for public comment, amended and
adopted by the CUAHSI Executive Committee after consultation with NSF. The CUAHSI HIS
project will then conform to the policy guidelines thus established.

Tasks

An outline of the proposed tasks to be undertaken during HIS Phase 2 is now presented, using
the structure of the products and services developed during Phase 1. This is followed by a
section on project management in which the various roles taken by the project team members are
explained. It should be understood that the responsibility for executing all these tasks is
shared by the team as a whole, and the work is apportioned among the PIs according to
resources and knowledge, in ways that depend on the individual task.

(1) CUAHSI Web Services



Continue to build and deploy web services for hydrologic data from national archives. Work
with the agencies that maintain those archives to ensure the services are secure and updated as
necessary. Develop the HydroObjects library for local access to web services from applications
and programming languages. Design a database for the hydrologic observations metadata
catalog and deploy the observation station maps in several formats, such as KML files for Google
Earth. Develop application examples and tutorials for utilization of web services by hydrologic
scientists. Develop more web services for observation fields, such as remote sensing, Nexrad
and weather and climate grid models. Investigate the OPeNDAP web data access system and
compare it to OGC standards such as the Web Coverage Service and Web Map Service. Select an
appropriate method for deployment of CUAHSI web services for observation fields.
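
As an illustration of deploying an observation station map as a KML file, the following minimal sketch writes one placemark per station using only the Python standard library. The station names and coordinates are invented, the function name is hypothetical, and the OGC KML 2.2 namespace is assumed; the sketch is not part of any existing CUAHSI product.

```python
import xml.etree.ElementTree as ET

# Assumption: the OGC KML 2.2 namespace is used for the output file.
KML_NS = "http://www.opengis.net/kml/2.2"


def stations_to_kml(stations):
    """Build a KML document with one Placemark per observation station."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for station in stations:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = station["name"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinates are written as longitude,latitude.
        coords = f"{station['lon']},{station['lat']}"
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = coords
    return ET.tostring(kml, encoding="unicode")


# Invented stations, for illustration only.
example_stations = [
    {"name": "Example gauge A", "lon": -97.69, "lat": 30.24},
    {"name": "Example gauge B", "lon": -78.58, "lat": 35.65},
]
kml_text = stations_to_kml(example_stations)
print(kml_text)
```

Saving the returned text to a file with a .kml extension should give a document that Google Earth can open.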

(2) Hydrologic Data Access System

Develop and maintain the Hydrologic Data Access System, adapting it to new upgrades of the
underlying server technology as they occur. Develop a querying system to select sets of stations
with particular kinds of measurements available. Incorporate geospatial themes, observation
series and fields for CUAHSI regional observatories, and the SAHRA and NCED study regions.
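
The querying system described above might, for example, select the stations that report every one of a set of requested variables. The sketch below shows one way to express that with an in-memory SQLite database. The table layout, the station names and the function name are assumptions for illustration and do not describe the actual observations metadata catalog.

```python
import sqlite3


def stations_measuring(conn, variables):
    """Return names of stations that report every variable in `variables`."""
    wanted = sorted(set(variables))
    placeholders = ",".join("?" for _ in wanted)
    query = f"""
        SELECT s.name
        FROM station AS s
        JOIN series AS v ON v.station_id = s.id
        WHERE v.variable IN ({placeholders})
        GROUP BY s.id, s.name
        HAVING COUNT(DISTINCT v.variable) = ?
        ORDER BY s.name
    """
    rows = conn.execute(query, [*wanted, len(wanted)]).fetchall()
    return [name for (name,) in rows]


# Hypothetical catalog: three stations and the variables each one reports.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE station (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE series (station_id INTEGER, variable TEXT);
    INSERT INTO station VALUES (1, 'Gauge A'), (2, 'Gauge B'), (3, 'Well C');
    INSERT INTO series VALUES
        (1, 'streamflow'), (1, 'nitrate'),
        (2, 'streamflow'),
        (3, 'groundwater level');
""")
both = stations_measuring(conn, ["streamflow", "nitrate"])
flow_only = stations_measuring(conn, ["streamflow"])
print(both)
print(flow_only)
```

Grouping by station and counting distinct matching variables is what restricts the result to stations that have all of the requested measurements rather than any one of them.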

(3) Hydrologic Observations Database

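Design a hydrologic observations database, define how information from sensor networks and field
sampling is loaded into the database, and develop web services that publish information from the
database.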

(4) Hydrologic Metadata

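Define HIS metadata standards for all components of the CUAHSI Hydrologic Information Model,
and mediate among the metadata systems used by federal water agencies and CUAHSI metadata.
Define a hydrologic markup language (HML), develop a data search engine for the HDAS, and
develop a framework of hydrologic ontologies, including controlled vocabularies for domain
keywords.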

(5) Hydrologic Modeling using Web Services

Demonstrate that existing hydrologic simulation models can be individually operated as web
services, thus providing hydrologic simulation services. Show that scientific workflows can be
used to integrate hydrologic simulation services with existing hydrologic data services to supply
the input data for a simulation. Demonstrate that the output from one hydrologic simulation
service can be ingested as the input to another hydrologic simulation service operated elsewhere.
Show how hydrologic models defined for different spatial scales and hydrologic environments
can form an integrated hydrologic simulation and data system. Use the integrated system to
address important hydrologic science research questions in the Neuse River basin, North
Carolina.
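
The chaining described in this task, in which a data service supplies the input to one simulation service whose output is ingested by another, can be sketched with ordinary functions standing in for remote services. Everything in the sketch is an assumption made for illustration: the function names, the rainfall values, the fixed runoff coefficient and the one-step lag are placeholders, not hydrologic models proposed here.

```python
def precipitation_service(basin):
    """Stand-in data service: rainfall depths in mm per time step (made-up values)."""
    return {"basin": basin, "rain_mm": [0.0, 10.0, 20.0, 0.0]}


def runoff_service(rain, runoff_coefficient=0.5):
    """Stand-in simulation service: runoff as a fixed fraction of rainfall."""
    runoff = [runoff_coefficient * depth for depth in rain["rain_mm"]]
    return {"basin": rain["basin"], "runoff_mm": runoff}


def routing_service(runoff, lag_steps=1):
    """Stand-in simulation service: delay runoff by a whole number of time steps."""
    series = runoff["runoff_mm"]
    lagged = ([0.0] * lag_steps + series)[:len(series)]
    return {"basin": runoff["basin"], "outflow_mm": lagged}


def run_workflow(first_input, steps):
    """Pass the output of each step to the next, as a workflow engine would."""
    result = first_input
    for step in steps:
        result = step(result)
    return result


outflow = run_workflow(
    "Neuse River basin",
    [precipitation_service, runoff_service, routing_service],
)
print(outflow["outflow_mm"])
```

In a deployed system each function would be replaced by a call to a web service operated at a different institution, while the workflow logic would stay the same.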

(6) Digital Watershed Toolkit

Develop and maintain the Digital Watershed Toolkit and encourage contributions of tools from
CUAHSI community members to form part of this toolkit, including web-based tools like the
Utah State Time Series Analyst. Define a protocol that sets the level of utility,
performance and documentation needed for a tool to be included in the CUAHSI Toolkit.




(7) Hydrologic Information Collections

Build a Hydrologic Information Repository at the San Diego Supercomputer Center so that
hydrologic scientists who wish to contribute their data and models can do so without having to
maintain the server architecture themselves. Continue to develop the GEON information
registration system so that collections can be registered and stored in the repository.


(8) Web Portals

Maintain the HIS web portal and support the deployment of HIS tools in the CLEANER and
CUAHSI portals. Investigate the connection with the GeoSpatial OneStop portal for federal
information (http://geodata.gov) and determine if this is a suitable venue for a similar outreach
for water information, a “Water OneStop”.

(9) Outreach

Present an Annual HIS Symposium and an Annual Report that summarize the findings and
products of the HIS project. Present cyberseminars and seminars at various locations around the
nation to keep CUAHSI members updated on HIS progress. Prepare tutorials about HIS use and
short courses of instruction in HIS tools and methods.

HIS Project Roles

University of Texas – PI – David Maidment

(1) Project Management – Responsible to NSF as a single point of contact for all aspects of the
HIS project management. Interacts with the CUAHSI Program Office and President (R.
Hooper), and with partnership efforts involving neighboring science communities, federal water
agencies, and the computer industry. Coordinates with the four co-PIs the work being done in
their institutions. Is responsible for project reporting and documentation of results from the
project as a whole.

(2) HIS Component Development – prototype web services and tools for time series,
multidimensional fields and geospatial themes. Development of the HydroObjects library.
Design and development of Digital Watersheds.


San Diego Supercomputer Center – co-PI – Ilya Zaslavsky

Hydrologic Service Oriented Architecture – responsible for development of a web services
oriented architecture for hydrology. This comprises the Hydrologic Data Access System with
its observations metadata catalog, the web services library, and the hydrologic
information repository. Is the key cyberinfrastructure designer for CUAHSI HIS. Responsible
for deploying and maintaining 24/7 hydrologic data services infrastructure at SDSC.




Drexel University – co-PI – Michael Piasecki

Hydrologic Metadata – responsible for definition of HIS metadata standards for all components
of the CUAHSI Hydrologic Information Model, mediation among the metadata systems used by
federal water agencies and CUAHSI metadata, definition of a hydrologic markup language
(HML), development of a data search engine for the HDAS, and development of a framework of
hydrologic ontologies, including controlled vocabularies for domain keywords.
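
Mediation among agency metadata systems can be pictured as translating each agency's variable code into a term from a controlled vocabulary. The sketch below is illustrative only: the vocabulary terms are assumptions rather than an adopted CUAHSI list, "ExampleAgency" and its code are hypothetical, and 00060 and 00065 are USGS NWIS parameter codes for discharge and gage height.

```python
# Illustrative controlled vocabulary; the terms are assumptions, not an adopted list.
CONTROLLED_VOCABULARY = {"streamflow", "gage height"}

# Mapping from (agency, agency-specific variable code) to a controlled term.
# 00060 and 00065 are USGS NWIS parameter codes; "ExampleAgency" and its
# code are hypothetical.
CODE_TO_TERM = {
    ("USGS", "00060"): "streamflow",
    ("USGS", "00065"): "gage height",
    ("ExampleAgency", "FLOW_CFS"): "streamflow",
}


def mediate(agency, code):
    """Translate an agency variable code into a controlled vocabulary term."""
    term = CODE_TO_TERM.get((agency, code))
    if term is None:
        raise KeyError(f"No mapping for {agency} code {code!r}")
    if term not in CONTROLLED_VOCABULARY:
        raise ValueError(f"{term!r} is not in the controlled vocabulary")
    return term


records = [("USGS", "00060"), ("ExampleAgency", "FLOW_CFS"), ("USGS", "00065")]
terms = [mediate(agency, code) for agency, code in records]
print(terms)
```

Two differently coded series resolve to the same keyword, which is what allows a search engine to find both with a single query term.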

Duke University – co-PI – Jon Goodall

Web Services for Hydrologic Modeling – responsible for defining how hydrologic simulation
models can be wrapped as web services, for designing scientific workflows that combine data
and simulation modeling, and for prototyping the next generation of hydrologic simulation
models that operate on top of CUAHSI cyberinfrastructure.

Utah State University – co-PI – David Tarboton

Web Services for Hydrologic Observations – responsible for designing a hydrologic
observations database, for defining how information from sensor networks and field sampling is
loaded into the database, and for developing web services that publish information from the
database.

Project Management

HIS Phase 1 has laid out a reasonable strategy for technical development of the CUAHSI
Hydrologic Information System, and to some degree what should follow in Phase 2 is a focusing
of that effort to produce usable information products for the CUAHSI community. To this end,
the size of the project team in Phase 2 has been reduced considerably: in Phase 1 there were
five PIs and 12 collaborating scientists, whereas in this proposal there are just five PIs. This
team has become accustomed to working together over the last two years, and maintains frequent
communications by email and by means of a two-hour conference call held every two weeks.
The same relationship, with four academic PIs doing research and prototyping and the San Diego
Supercomputer Center consolidating and synthesizing the products, will continue in Phase 2.

HIS Phase 1 is a Collaborative Grant where each institution submitted its budget separately to
NSF. HIS Phase 2 is proposed as a Cooperative Project where the University of Texas at Austin
serves as the single point of contact for NSF, and the other four institutions are subcontracted to
Texas. The Cooperative Project mechanism allows NSF a greater degree of control over the
effort, including adjustment of the amount and proportioning of the budget from year to year if
experience shows that to be necessary. The University of Texas at Austin has waived its right to
charge overhead on the funds passed through the university to the subcontractors.

Dr Maidment holds an endowed Chair at the University of Texas at Austin, which comes with a
reduced teaching load. He is also Director of the University’s Center for Research in Water
Resources, which means that he has a clerical and technical support staff in place. The path of
development of CUAHSI’s various programs has demonstrated that the success of a
community-based science effort depends critically on the skills and commitment of the program
leader. If the present PI (Maidment) were unexpectedly unable to continue in that role,
leadership of the project would pass to Dr David Tarboton of Utah State University. Dr Tarboton was host to the
widely acclaimed CUAHSI Hydrologic Observatory Conference in Logan, UT, in August 2004,
and is a trusted leader within the CUAHSI community. Each Fall semester for the last six years,
he and Dr Maidment have been teaching together via the internet a graduate course on GIS in
Water Resources, so each is closely acquainted with the technical thinking of the other.

