Interoperable Web Services for Distributed Data Access and Analysis of Emissions Inventories
                              Stefan Falke
          Center for Air Pollution Impact and Trend Analysis (CAPITA)
        Department of Energy, Environmental and Chemical Engineering
                              Washington University
                                St. Louis, Missouri

                             Terry Keating
                     US Environmental Protection Agency
                          Office of Air & Radiation
                              Washington, DC

                      GEIA 2006 Open Conference
                             Paris, France
                         November 30, 2006
Objectives: advance the implementation of the Networked Environmental
Information Systems for Global Emissions Inventories (NEISGEI), a US
EPA initiative to develop a web-based global air emissions inventory network
that provides
    • access to distributed emission inventory data at multiple spatial and
      temporal scales
    • tools for data processing and analysis
    • means for sharing data and tools
    • an environment for collaboration among researchers, regulators, policy
      analysts, and the interested public

Approach: Develop, test, and implement components of an air quality
cyberinfrastructure using the latest advances in information technology to make
multi-scale air emissions data and tools easier to find, use and integrate.

           An air emissions “cyberinfrastructure”
  Cyberinfrastructure - information sciences and technologies used to build new
  types of scientific and engineering knowledge environments with the goal of pursuing
  research and management more effectively and efficiently.

“Contemporary projects require effective federation of both distributed resources (data and
facilities) and distributed, multidisciplinary expertise and cyberinfrastructure is a key to
making this possible.” - NSF Blue Ribbon Report on Cyberinfrastructure, 2003

                                                                     (Atkins, 2004)
                 Conceptual Diagram of an Emissions Cyberinfrastructure

[Figure: conceptual diagram. Recoverable labels: Data; XML; Wrappers/Adapters/Standards;
Emissions Inventories; Portals; Activity Data; Emissions Factors; Surrogates; Data Catalogs;
GEIA/ACCENT Data Portal; Geospatial One-Stop; Web Tools/Services; Estimation Methods;
Spatial Allocation; Transport Models; Mediators/Portals; Users & Projects; Generation;
Data Analysis; Comparison of Emissions; Model.]
                  Networked Inventories Principles

Distributed/Federated. Data are shared but remain distributed and maintained by their
original inventory organizations. The data are dynamically accessed from multiple sources
through the Internet rather than collecting all emission data in a single repository.

Non-intrusive. The technologies needed to bring inventory nodes together in a
distributed network should not require substantial modifications by the emission inventory
organizations in order to participate. However, there will need to be some harmonization of
existing inventory data.

Transparent. From the emission inventory user’s perspective, the distributed data
should appear to originate from a single database. One interface to multiple data sets
should be possible without requiring special software or downloads onto the user’s
computer.

Flexible/Extendable (Interoperable). An emission data network should be designed
with the ability to easily incorporate new data and tools from new providers joining the
network so that they can be integrated with existing data and tools.
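
The four principles above can be illustrated with a minimal Python sketch. It is not NEISGEI or DataFed code: the `Mediator` class, the record fields, and the two in-memory "nodes" with placeholder values are all hypothetical, and a real node would be reached over the Internet rather than through a local function.

```python
from typing import Callable, Dict, Iterable, List

# A record is a plain dict; the field names are illustrative assumptions.
Record = Dict[str, object]
# A provider stands in for one inventory node: given a pollutant name,
# it returns that node's records at query time.
Provider = Callable[[str], Iterable[Record]]


class Mediator:
    """Hypothetical mediator: one query interface over many inventory nodes."""

    def __init__(self) -> None:
        self._providers: Dict[str, Provider] = {}

    def register(self, name: str, provider: Provider) -> None:
        # Flexible/extendable: a new node joins without changes to the others.
        self._providers[name] = provider

    def query(self, pollutant: str) -> List[Record]:
        # Distributed/federated: every node is asked when the query runs,
        # so no central copy of the data is kept.
        merged: List[Record] = []
        for name, provider in self._providers.items():
            for record in provider(pollutant):
                # Transparent: the user receives a single list, with the
                # originating node kept only as a "source" tag.
                merged.append({**record, "source": name})
        return merged


# Two stand-in nodes returning placeholder values (not real emissions data).
def node_a(pollutant: str) -> List[Record]:
    return [{"pollutant": pollutant, "region": "A1", "value": 1.0}]


def node_b(pollutant: str) -> List[Record]:
    return [{"pollutant": pollutant, "region": "B1", "value": 2.0}]


mediator = Mediator()
mediator.register("inventory_a", node_a)
mediator.register("inventory_b", node_b)
for row in mediator.query("NOx"):
    print(row)
```

The non-intrusive principle shows up in what the sketch leaves out: each node only has to expose a function with the agreed signature, and everything else about its internal storage stays unchanged.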
                          NEISGEI Web Portal

A community resource providing access to, descriptions of, and dialogues about
an array of content and services for exploring and [...] emissions data, and ideas.

Built using Liferay, an open-source portal
Accessible through
               Federated data system - DataFed
   The Data Federation is a web-based infrastructure for distributed data access
   and collaborative processing/analysis of air quality data. (Husar et al., 2004)


NEISGEI is built on [...] and services.

[Figure: DataFed. Recoverable labels: 50+ datasets; export or connect to other web services.]
                           Geospatial Web Standards
Standards for finding, accessing, portraying, and processing geospatial data are defined
by the Open Geospatial Consortium (OGC).
    • Web Map Service (WMS) exchanges map images
    • Web Feature Service (WFS) retrieves discrete feature data (roads, political
      boundaries)
    • Web Coverage Service (WCS) allows access to multidimensional data that
      represent coverages, such as grids or point monitoring data
    • Sensor Observation Service (SOS) provides multidimensional access to
      measurement data
While these standards are based on the geospatial domain, many are designed to be extended to
support non-geographic data “dimensions,” such as time and the many other dimension tables
found in emissions inventories.
  Web Coverage Service (WCS)

[Figure: a WCS client sends a GetCoverage request to a WCS server and receives netCDF.
Other label: Geospatial One-Stop.]
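
As a rough illustration of a GetCoverage request, the Python sketch below builds a key-value-pair URL in the style of WCS 1.0.0, including a TIME value as an example of a non-geographic dimension. The endpoint, the coverage name `biomass_burning`, and the `NetCDF` format string are assumptions for illustration; the coverage names and format identifiers a given server accepts are listed in its own capabilities document.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def get_coverage_url(endpoint, coverage, bbox, time, width, height,
                     fmt="NetCDF", crs="EPSG:4326"):
    """Build a WCS 1.0.0-style GetCoverage URL (key-value-pair encoding)."""
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,                    # dataset to retrieve
        "CRS": crs,                              # coordinate reference system
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "TIME": time,                            # non-geographic dimension
        "WIDTH": width,                          # output grid size in cells
        "HEIGHT": height,
        "FORMAT": fmt,                           # server-specific identifier
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and coverage name; no request is actually sent.
url = get_coverage_url(
    "https://example.org/wcs",
    coverage="biomass_burning",
    bbox=(-180, -90, 180, 90),
    time="2000-08-01",
    width=360,
    height=180,
)
print(url)

# Decoding the query string recovers the parameters the server would see.
print(parse_qs(urlsplit(url).query)["TIME"])
```

Because every WCS server understands the same parameter names, switching to another provider would, in principle, mean changing only the endpoint and coverage name.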
                Using Standard Interfaces for Web Access
Current Process:
Data access without standard interfaces:
1) Find data in Portal
2) Download data
3) Reformat / “Wrapping”
4) Repeat 1-3 for other datasets
5) Browse, visualize, analyze

[Figure: user, emissions portal, and multi-dimensional data cube, with arrows numbered by step.]

Possible Future Process:
Data access with standard interfaces:
1) Find data in Portal
2) Access through standard interfaces
3) Browse, visualize, analyze

[Figure: user, WCS, and emissions portal(s), with arrows numbered by step.]
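
The "Reformat / Wrapping" step in the current process can be sketched in Python as follows. Both inventory exports, their column names, and their units are made up for illustration; the point is only that every new dataset needs its own wrapper when there is no shared interface, which is the work that standard interfaces move out of the user's hands.

```python
import csv
import io
from typing import Dict, List

# Made-up native exports from two hypothetical inventories. They differ in
# delimiter, column names, column order, and units.
INVENTORY_A = "lat,lon,nox_tonnes\n10.0,20.0,1.5\n"
INVENTORY_B = "LONGITUDE;LATITUDE;NOX_KG\n20.0;10.0;2000\n"

Row = Dict[str, float]


def wrap_a(text: str) -> List[Row]:
    """Wrapper for inventory A: comma-delimited, already in tonnes."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {"lat": float(r["lat"]), "lon": float(r["lon"]),
         "nox_t": float(r["nox_tonnes"])}
        for r in rows
    ]


def wrap_b(text: str) -> List[Row]:
    """Wrapper for inventory B: semicolon-delimited, kilograms to tonnes."""
    rows = csv.DictReader(io.StringIO(text), delimiter=";")
    return [
        {"lat": float(r["LATITUDE"]), "lon": float(r["LONGITUDE"]),
         "nox_t": float(r["NOX_KG"]) / 1000.0}
        for r in rows
    ]


# Only after wrapping can the two datasets be browsed or analyzed together.
combined = wrap_a(INVENTORY_A) + wrap_b(INVENTORY_B)
for row in combined:
    print(row)
```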

                   Multi-dimensional Browsing

[Figure: RETRO Biomass Burning Emissions, August 2000, browsed from the 1960-2000 monthly
series. Source: RETRO]
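
Browsing along the time dimension amounts to picking one slice out of the monthly series. The Python sketch below shows the index arithmetic, assuming the series runs from January 1960 through December 2000 with zero-based indexing; the slide states only "1960-2000 monthly", so both the exact endpoints and the indexing convention are assumptions.

```python
START_YEAR = 1960  # assumed: first time step is January 1960
END_YEAR = 2000    # assumed: last time step is December 2000


def month_index(year: int, month: int) -> int:
    """Zero-based position of (year, month) along the time dimension."""
    if not (START_YEAR <= year <= END_YEAR and 1 <= month <= 12):
        raise ValueError(f"{year}-{month:02d} is outside the series")
    return (year - START_YEAR) * 12 + (month - 1)


def n_steps() -> int:
    """Total number of monthly time steps in the series."""
    return (END_YEAR - START_YEAR + 1) * 12


print(month_index(2000, 8))  # August 2000 -> 487
print(n_steps())             # 41 years * 12 months -> 492
```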
                         Custom Web Applications

Google Maps mashup using DataFed data access interfaces for browsing and visualizing a smoke
event in Idaho on September 12, 2006. The “standard” Google Maps application is augmented
with an HTML/JavaScript table that accesses monitoring data through standard interfaces.

Information technologies (particularly service-oriented architectures
and web services) provide opportunities to realize the benefits of
distributed databases using standardized interfaces.

Distributed databases allow data to remain maintained by their owners
         - dynamically updated (avoids versioning issues)
         - make connection once – always get latest and greatest

Standard interfaces foster networked activity and sharing of data and
tools through interoperability
         - simplify integration and analysis by moving the information
           technology details to the background

Federated inventories, datasets, models, analysis tools, portals
       - no “one-stop” can meet all user needs
       - faster progress through distributed, shared efforts