Solid Earth Research Virtual Observatory
Grid/Web Services and Portals
Supporting Earthquake Science
Current SERVOGrid is a USA project led by JPL (Jet
Propulsion Laboratory), but next is iSERVO with
international collaboration between Australia, China,
Japan and USA
August 26 2004
Beijing, China
Geoffrey Fox, Marlon Pierce
Community Grids Lab,
Pervasive Technologies Laboratories
Solid Earth Science Questions
1. What is the nature of deformation at plate boundaries and
what are the implications for earthquake hazards?
2. How do tectonics and climate interact to shape the Earth’s
surface and create natural hazards?
3. What are the interactions among ice masses, oceans, and the
solid Earth and their implications for sea level change?
4. How do magmatic systems evolve and under what conditions
do volcanoes erupt?
5. What are the dynamics of the mantle and crust and how does
the Earth’s surface respond?
6. What are the dynamics of the Earth’s magnetic field and its
interactions with the Earth system?
From NASA’s Solid Earth Science Working Group
Report, Living on a Restless Planet, Nov. 2002
The Solid Earth is:
Complex, Nonlinear, and Self-Organizing
Relevant questions that Computational
technologies can help answer:
1. How can the study of strongly correlated solid earth
systems be enabled by space-based data sets?
2. What can numerical simulations reveal about the
physical processes that characterize these systems?
3. How do interactions in these systems lead to space-
time correlations and patterns?
4. What are the important feedback loops that mode-lock
the system behavior?
5. How do processes on a multiplicity of different scales
interact to produce the emergent structures that are
6. Do the strong correlations allow the capability to
forecast the system behavior in any sense?
Characteristics of Computing for
Solid Earth Science
Widely distributed datasets in various formats
• GPS, Fault data, Seismic data sets, InSAR satellite data
• Many available in state of art tar files that can be FTP’d
• Provenance problems: faults have controversial parameters
like slip rates which have to be estimated.
Distributed models and expertise
• Lots of codes with different regions of validity, ranging from
cellular automata to finite element to data mining applications
• Simplest challenges are just making these codes useable for
• And hooking these codes to data sources
• Some codes also have export or IP restrictions
• Other codes are highly specialized to their deployment
Decomposable problems requiring interoperability for
linking full models
• The fidelity of your fault modeling can vary considerably
• Link codes (through data) to support multiple scales
Seamless Access to data repositories and computing
Integration of multiple data sources including
databases, file systems, sensors, …, with simulation
Core web services for common tasks like command
execution and file management.
Meta-data generation, archiving, and access with
extending openGIS (Geography as a Web service)
Portals with component model (portlets) for user
interfaces and web control of all capabilities
Basic Grid tools: complex job management and
Collaboration to support world-wide work
• “Collaboration” can range from data sharing to Audio-video
Codes range from simple “rough estimate” codes to
parallel, high performance applications.
• Disloc: handles multiple arbitrarily dipping dislocations
(faults) in an elastic half-space.
• Simplex: inverts surface geodetic displacements for fault
parameters using simulated annealing downhill residual
• GeoFEST: Three-dimensional viscoelastic finite element
model for calculating nodal displacements and tractions.
Allows for realistic fault geometry and characteristics,
material properties, and body forces.
• Virtual California: Program to simulate interactions
between vertical strike-slip faults using an elastic layer over
a viscoelastic half-space
• RDAHMM: Time series analysis program based on Hidden
Markov Modeling. Produces feature vectors and probabilities
for transitioning from one class to another.
Preprocessors, mesh generators: AKIRA suite
Visualization tools: RIVA, GMT, IDL
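RDAHMM is described above as producing probabilities for transitioning from one class to another. As a rough illustration of what such a transition table contains, the sketch below counts transitions in a class sequence that is already labeled. It is not RDAHMM's algorithm (which fits a hidden Markov model to the time series itself); the function name and the toy sequence are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(states):
    """Estimate P(next class | current class) by counting adjacent pairs."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    probs = {}
    for current, row in counts.items():
        total = sum(row.values())
        probs[current] = {nxt: n / total for nxt, n in row.items()}
    return probs


# Made-up class labels for consecutive samples of a time series.
labels = ["quiet", "quiet", "quiet", "active", "active", "quiet"]
probs = transition_probabilities(labels)
for current, row in probs.items():
    print(current, row)
```

Each row of the resulting table sums to one, which is the property a transition matrix must have regardless of how it was estimated.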
SERVOGrid Codes, Relationships
[Figure: relationships among SERVOGrid codes. Box labels: Elastic Dislocation, Inversion, Viscoelastic FEM, Viscoelastic Layered BEM, Fault Model BEM]
This linkage is called Workflow in Grid/Web Service parlance
Role of Workflow
Programming the Grid: Workflow describes linkage
Because the components are distributed, linkage must be by messages
Linkage is two-way and has both control and data
Apply to multi-scale (complexity) linkage, multi-
program linkage, link visualization to simulation, GIS
to simulations and viz filters to each other
Microsoft-IBM specification BPEL is the currently preferred
Web Service XML specification of workflow
SERVOGrid uses ANT (well known XML build tool) to
perform workflow and this works well in our relatively
(i)SERVO Web (Grid) Services
Programs: All applications wrapped as Services using proxy strategy
Job Submission: support remote batch and shell invocations
• Used to execute simulation codes (VC suite, GeoFEST, etc.), mesh
generation (Akira/Apollo) and visualization packages (RIVA,
• Uploading, downloading, backend crossloading (i.e. move files
between remote machines)
• Remote copies, renames, etc.
Workflow: Apache Ant-based remote service orchestration
• For coupling related sequences of remote actions, such as RIVA
Data services: support remote databases and query construction
• XML data model being adopted for common formats with
translation services to “legacy” formats.
• Migrating to Geography Markup Language (GML) descriptions.
Metadata Services: for archiving user session information.
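The workflow service listed above couples related sequences of remote actions. SERVOGrid does this with Apache Ant; the sketch below only mimics the idea of dependency-ordered targets in Python, using the standard-library `graphlib` module (Python 3.9+). The step names and the `run_step` stub are hypothetical stand-ins for real service invocations.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on,
# similar in spirit to an Ant target's "depends" attribute.
workflow = {
    "generate_mesh": set(),
    "upload_inputs": {"generate_mesh"},
    "run_simulation": {"upload_inputs"},
    "visualize": {"run_simulation"},
}


def run_step(name, log):
    """Placeholder for a remote service call; here it only records the step."""
    log.append(name)


def run_workflow(steps):
    """Run every step after all of its dependencies have run."""
    log = []
    for name in TopologicalSorter(steps).static_order():
        run_step(name, log)
    return log


executed = run_workflow(workflow)
print(" -> ".join(executed))
```

A cycle in the dependency table raises `graphlib.CycleError`, which is a convenient early check before any remote action is started.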
Security: Authentication and Authorization
• Authentication describes who the user is
• Authorization describes what a given user can do
– What data and computers can be accessed
– Basically a database
• Current portal uses password accounts and provides services for free for
– iSERVO should decide on “charging for” services
• We have (through Community portal effort OGCE) support for GSI and
Kerberos authentication services.
– These just plug in and replace the default login service.
• Authorization is currently simple: you can only reach your files.
– iSERVO should develop an authorization policy
• Simultaneous Cross Administrative Domain access is a very hard Grid
problem with no consensus on a good solution
• Systematic use of Services helps security/privacy/IP issues as “danger
of misuse” is lower for services (which have limited privileges) than for
direct computer access
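The split between authentication (who the user is) and authorization (what the user can do), together with the current rule that users can only reach their own files, can be sketched as follows. This is an illustration only: the in-memory user table, the `/home/<user>` directory layout, and the function names are assumptions, not the portal's implementation.

```python
import hashlib
import hmac
from pathlib import PurePosixPath


def _digest(password, salt):
    """Derive a password hash; the parameters are illustrative choices."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical account table: user -> (salt, password hash).
USERS = {"alice": (b"salt-a", _digest("s3cret", b"salt-a"))}


def authenticate(user, password):
    """Authentication: is this really the named user?"""
    record = USERS.get(user)
    if record is None:
        return False
    salt, stored = record
    return hmac.compare_digest(stored, _digest(password, salt))


def authorize_file(user, path):
    """Authorization: a user may only reach paths under an assumed home directory."""
    home = PurePosixPath("/home") / user
    target = PurePosixPath(path)
    if ".." in target.parts:  # refuse paths that try to climb out
        return False
    return target == home or home in target.parents


print(authenticate("alice", "s3cret"), authorize_file("alice", "/home/bob/run.dat"))
```

Keeping the two checks in separate functions mirrors the point made above: the login service can be swapped out (for example for GSI or Kerberos) without touching the authorization policy.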
SERVO Data Sources
• Developed as part of the project
• QuakeTables: http://infogroup.usc.edu:8080
Seismic data formats
• Available from www.scec.org
• SCSN, SCEDC, Dinger-Shearer, Hauksson
GPS data formats
• Available from www.scign.org
• JPL, SOPAC, USGS
Applications and Observational Data
Several SERVO codes work directly with
• GeoFEST, VirtualCalifornia, Simplex, and Disloc all
depend upon fault models.
• RDAHMM and Pattern Informatics codes use seismic
• RDAHMM primarily used with GPS data
Problem: We need to provide a way to integrate
these codes with the online data repositories.
• QuakeTables Fault Database was developed
• What about GPS and Earthquake Catalogs?
• Many formats, data available in tars or files, not
searchable, not easy to integrate with applications
Solution: use databases to store catalog data; use
XML (GML) as exchange data format; use Web
Services for data exchanges, invoking queries, and
Geographical Information Service
(GIS) Data Formats and Services
OpenGIS Consortium (OGC) is an international group
for defining GIS data formats and services.
Main data format language is the XML-based GML.
• Subdivided into schemas for drawing maps, representing
features, observations, …
First Step: design GML schemas and build specialized
Web Services for GPS and Earthquake data.
OGC also defines services.
• Services include Web Features Services, Web Map Services,
• These are currently pre-Web Service, based on HTTP Post,
but they are being revised to comply with WS standards.
Next Step: Implement OGC compatible Web Services
for this problem i.e. build a GIS Grid
• Also build services to interact with QuakeTables Fault DB.
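Since the OGC services of this period are driven by HTTP POST, a feature query amounts to posting a small XML document. The sketch below builds a minimal WFS 1.0.0 `GetFeature` request body with the standard library. The feature type name `fault` is a hypothetical example, and no filter is included; a real request would usually add an `ogc:Filter` element to restrict the result.

```python
import xml.etree.ElementTree as ET

WFS_NS = "http://www.opengis.net/wfs"
ET.register_namespace("wfs", WFS_NS)


def wfs_getfeature_body(type_name, max_features=None):
    """Build a minimal WFS 1.0.0 GetFeature request as an XML string."""
    root = ET.Element(
        f"{{{WFS_NS}}}GetFeature",
        {"service": "WFS", "version": "1.0.0"},
    )
    if max_features is not None:
        root.set("maxFeatures", str(max_features))
    # One Query element per requested feature type.
    ET.SubElement(root, f"{{{WFS_NS}}}Query", {"typeName": type_name})
    return ET.tostring(root, encoding="unicode")


body = wfs_getfeature_body("fault", max_features=50)  # "fault" is a made-up type name
print(body)
```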
GML and Existing Data Formats
GPS or seismic data used in this project
are retrieved from different URLs and
have different text formats.
Seismic data formats
• SCSN, SCEDC, Dinger-Shearer, Hauksson
GPS data formats
• JPL, SOPAC, USGS
We defined 2 GML Schemas to unify
A summary of all supported formats and
data sources can also be found there.
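To make the idea of unifying text formats through XML concrete, here is a minimal sketch that turns one whitespace-delimited catalog line into an XML element carrying a GML point. The input column order, the `quake` namespace, and its element names are invented for illustration and are not the project's schemas; only `gml:Point` and `gml:coordinates` come from GML itself.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
QUAKE_NS = "http://example.org/quake"  # hypothetical application namespace
ET.register_namespace("gml", GML_NS)
ET.register_namespace("quake", QUAKE_NS)


def catalog_line_to_xml(line):
    """Assumed columns: date time latitude longitude depth_km magnitude."""
    date, time, lat, lon, depth, mag = line.split()
    event = ET.Element(f"{{{QUAKE_NS}}}Event")
    ET.SubElement(event, f"{{{QUAKE_NS}}}originTime").text = f"{date}T{time}"
    ET.SubElement(event, f"{{{QUAKE_NS}}}magnitude").text = mag
    ET.SubElement(event, f"{{{QUAKE_NS}}}depthKm").text = depth
    location = ET.SubElement(event, f"{{{QUAKE_NS}}}location")
    point = ET.SubElement(location, f"{{{GML_NS}}}Point")
    # GML coordinate tuples are written x,y, i.e. longitude first here.
    ET.SubElement(point, f"{{{GML_NS}}}coordinates").text = f"{lon},{lat}"
    return event


# A made-up catalog line, not real data.
event = catalog_line_to_xml("2004-08-26 12:00:00 34.05 -118.25 10.0 3.2")
xml_text = ET.tostring(event, encoding="unicode")
print(xml_text)
```

One small parser of this kind per legacy format, all emitting the same XML, is what lets the downstream services stay format-agnostic.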
Prototype GML Service
First version of the
• Tried XML databases
but performance was
• Currently database
Download results are
in GML, but we can
convert to appropriate
Search XML DB For GPS Catalogs
openGIS Grid Semantics
• Note GIS (Geographical Information System) Grid at heart of all these
• Geography Markup Language (GML) is an XML encoding for the
specification of the geometry and properties of geographic features.
GML utilizes the OpenGIS Abstract Specification geometry model
which has been harmonized with the ISO geospatial geometry model.
– We are building CI specific ontologies in terms of GML to define
faults, satellites etc.
• Styled Layer Descriptor (SLD) specifies the format of a map-styling
language for portraying the output of Web Map Servers, Web Feature
Servers and Web Coverage Servers etc. SLD will enable different
communities in the Emergency Response area to develop a set of
customized portrayal rules that best fit their mission requirements.
– This becomes the specification of portals to different composite Grids
• Sensor Markup Language (SensorML) defines the information model
for discovering, querying and controlling Web-resident sensors.
• Observations & Measurements (O&M) defines the information model
for observations that are returned from the CrisisGrid sensors.
GIS Grid Services I
• Web Feature Service (WFS) supports the query and discovery of
geographic features delivering GML representations of simple
geospatial features in response to queries from HTTP clients. WFS
can access geographic features including critical infrastructure
features, incident locations, and flood-related geographic features
including inundation areas, watershed boundaries, and demographic
• Web Coverage Service (WCS) supports the query and discovery of
digital geospatial information such as digital elevation models,
imagery, orthophotography, weather coverages (such as predicted
rainfall, air pressure, wind speed and direction), and any other space-
varying flood-related phenomena.
• Web Map Service (WMS) uses an SLD portrayal to generate "pictures"
of georeferenced feature or coverage data. WMS will provide a means
to portray geographic information independent of the underlying data
model (WFS or WCS).
• Coverage Portrayal Service (CPS) defines a standard interface for
producing visual pictures from coverage data typically accessed via
WCS with an SLD portrayal.
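A map request to a WMS is an ordinary HTTP GET with a fixed set of query parameters. The sketch below assembles a WMS 1.1.1 `GetMap` URL; the endpoint and layer names are hypothetical, default styles are requested, and SLD-based styling is left out to keep the example short.

```python
from urllib.parse import urlencode


def wms_getmap_url(endpoint, layers, bbox, width, height,
                   srs="EPSG:4326", image_format="image/png"):
    """Assemble a WMS 1.1.1 GetMap request URL."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value asks for each layer's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    return endpoint + "?" + urlencode(params)


url = wms_getmap_url(
    "http://example.org/wms",          # hypothetical endpoint
    ["faults", "seismicity"],          # hypothetical layer names
    (-120.0, 32.0, -114.0, 36.0),
    800,
    600,
)
print(url)
```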
GIS Grid Services II
• Web Terrain Service (WTS) augments WMS with advanced
visualization including 3D terrains.
• Catalog Service - Web Profile (CS-W) is a catalog service
that will be built on a general Grid metadata service
• Sensor Collection Service (SCS) fetches observations from
a sensor or group of sensors and will be integrated with
research on Grid sensor services
• Sensor Planning Service (SPS) assists with 'collection
feasibility plans' and processes collection requests for a
sensor or group of sensors.
• Web Notification Service (WNS) will be replaced by
standard Grid notification service
QuakeTables+OGC Web Map Service
Streams and Workflow
NaradaBrokering can manage streams from
• Audio/Video conferences
• Inter-service communication in workflow
http://www.hpsearch.org/demo/ describes scripting
Grids involve streams
as well as compute and
Workflow and dataflow
like BPEL imply
SERVOGrid has many types of metadata
We are designing RDFS descriptions for the
• Simulation codes, mesh generators, etc.
• Visualization tools
• Data types
• Computing resources
These are easily expressed as RDFS (actually
DAML) “nuggets” of information.
• Create instances of these
• Use properties to link instances.
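The "create instances, then link them with properties" approach is the subject-predicate-object triple model. A minimal sketch with plain Python tuples follows; it does not use an RDF library, and the instance identifiers (`computer-a`, `fault-model-1`) and class names are hypothetical, while `installedOn` and `usesInput` are property names taken from the sample relationships on the next slide.

```python
# Each fact is a (subject, predicate, object) triple.
triples = [
    ("GeoFEST", "rdf:type", "SimulationCode"),
    ("GMT", "rdf:type", "VisualizationTool"),
    ("GMT", "installedOn", "computer-a"),
    ("GeoFEST", "installedOn", "computer-a"),
    ("GeoFEST", "usesInput", "fault-model-1"),
]


def objects(subject, predicate):
    """Everything the subject is linked to through the given property."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects(predicate, obj):
    """Everything linked to the object through the given property."""
    return [s for s, p, o in triples if p == predicate and o == obj]


print(objects("GeoFEST", "usesInput"))
print(subjects("installedOn", "computer-a"))
```

Queries such as "which codes are installed on this computer" become simple pattern matches over the triples, which is what makes these small "nuggets" easy to combine.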
Some Sample Relationships
[Figure: sample relationship graph. Surviving labels: installedOn, Computer, GMT, usesInput, Stress Map, USC Fault DB, Fault, Data Storage, DataType]
Full Portal Demo:
• Request an account
• Downloads available in November
GPS and Seismic Database Demo:
Setting up your own GPS or Seismic database
Some Grid Controversies
• 1) There are several proposals for the Web Service
extensions needed for Grids – why do we ignore them?
– OGSI (GT3)
– WSRF (GT4)
– WS-GAF (Newcastle)
– WS-I+ (Pure Web Services)
• We use WS-I+ approach – can later add extensions when
– This approach adopted by next phase of UK e-Science Program
• 2) Are Web Services too slow because they use HTTP with clumsy
ASCII XML data (SOAP)?
– Currently not a problem, but we can use a control channel separate
from the data channel if high performance is needed
• Agree on what (type of) resources and capabilities need to put on the
– Computers, instruments, databases, visualization, maps, job
• Agree on interfaces to resources from OGSA-DAI (databases) to
particular data structures (GML/OpenGIS) – specify in XML
• Implement Resources and Capabilities as Services
– User Interface should be a portlet that can be integrated by the
portal into web interface
• Make certain overarching Grid capabilities such as workflow,
federation and metadata are sufficient
• SERVO Grid is a prototype of this strategy using several US sites
rather than several countries
– Can be naturally extended to iSERVO, education, emergency
response by extending resources
• Web Service Architecture ensures continued interoperability and
Web service performance is not an issue when
used to invoke services that take hours to
• Later real-time sensors will probe performance
Reliability is a larger problem.
• Need monitoring/heartbeat services.
Information systems still have a long way to go.
• UDDI is part of WS-I but has/had some well known
• WS-Discovery has some interesting concepts but is too
specialized to ad-hoc networks.
• Peer-to-peer systems provide many useful concepts like
discovery and caching.
• Semantic Web provides powerful resource descriptions
that could be exploited.
• XML databases are slow
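The monitoring/heartbeat need mentioned under reliability can be sketched in a few lines: each service reports in periodically, and the monitor flags any service that has been silent for longer than a timeout. The class, the service names, and the 30-second timeout are hypothetical; a real deployment would receive beats over the network rather than through direct method calls.

```python
import time


class HeartbeatMonitor:
    """Track the last time each service reported in."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = {}

    def beat(self, service):
        """Record a heartbeat from the named service."""
        self._last_seen[service] = self._clock()

    def unresponsive(self):
        """Names of services whose last heartbeat is older than the timeout."""
        now = self._clock()
        return sorted(
            name for name, seen in self._last_seen.items()
            if now - seen > self._timeout_s
        )


# Simulated clock so the example runs instantly.
now = [0.0]
monitor = HeartbeatMonitor(timeout_s=30.0, clock=lambda: now[0])
monitor.beat("job-submission")
now[0] = 20.0
monitor.beat("file-transfer")
now[0] = 40.0
print(monitor.unresponsive())
```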
Further iSERVO Challenges
• Make everything a Service
• Think about Data Curation
– Set up policies for observational data and criteria for inclusion
in iSERVO data repositories
• Think about Data Provenance
– Generate and maintain metadata describing ownership, origins
– Applies to both “experimental data” and results from
• Curation and Provenance are a change in research
methodologies and require funding!
• Education and Emergency Response/Planning are interesting
offshoots of iSERVO
QuakeSim Portal for SERVOGrid
The services need user interfaces
• WSDL descriptions are all you need to create
client stubs (if not client applications).
The QuakeSim portal effort aggregates
these service interfaces into a portal.
• Customizable displays, access controls to
QuakeSim is just one of many, many such
Challenge is to develop reusable portal
[Figure: portal screenshot with callouts. Surviving callout fragments: "has its own portlet for the", "Use tabs or", "2 Other Portlets"]
SERVOGrid Portal Screen
Computational Web Portal
Web service dream is that core
user interface
decoupled.
How do I manage all those user interfaces?
[Figure: layered stack, as listed: Aggregate Portals; Portlet User Interface Components; Application Web Services; Core Web Services]
[Figure: portal architecture. Column headings: Clients, Portal, Portlets, Libraries, Services, Resources. Surviving labels: Clients (Pure HTML, Java Applet ..); Aggregation and Rendering; Portlet Class: WebForm; Portlet Class; Gateway (IU); GridPort; Web/Grid service; or Proxy; Web/Grid Computing; Data Stores; Internal; Local arrangement]
Why Are Portlets a Good Idea?
You don’t have to reinvent
• Makes it easy (but not effortless) to
share portal components between
• So you can pull in portlets from all the
other earthquake grid projects.
You can easily combine a wide range
• Add document managers, collaboration
tools, RSS news lists, etc for your portal
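The component model behind portlets can be illustrated with a toy aggregator: each portlet renders its own fragment, and the portal only assembles fragments into a page. Real portlets are Java components running in a portlet container; the Python classes below are a conceptual sketch with hypothetical names, not the portlet API.

```python
from html import escape


class Portlet:
    """A self-contained user-interface component."""

    title = "Untitled"

    def render(self):
        raise NotImplementedError


class JobStatusPortlet(Portlet):
    title = "Job Status"

    def __init__(self, jobs):
        self.jobs = jobs  # mapping of job name -> state

    def render(self):
        rows = "".join(
            f"<li>{escape(name)}: {escape(state)}</li>"
            for name, state in self.jobs.items()
        )
        return f"<ul>{rows}</ul>"


class NewsPortlet(Portlet):
    title = "News"

    def __init__(self, headlines):
        self.headlines = headlines

    def render(self):
        return "".join(f"<p>{escape(h)}</p>" for h in self.headlines)


class Portal:
    """Aggregates portlet fragments into a single page."""

    def __init__(self):
        self.portlets = []

    def add(self, portlet):
        self.portlets.append(portlet)

    def render_page(self):
        sections = "".join(
            f'<div class="portlet"><h2>{escape(p.title)}</h2>{p.render()}</div>'
            for p in self.portlets
        )
        return f"<html><body>{sections}</body></html>"


portal = Portal()
portal.add(JobStatusPortlet({"mesh-run": "finished"}))
portal.add(NewsPortlet(["Portal downloads & demos"]))
page = portal.render_page()
print(page)
```

Because the portal depends only on `title` and `render()`, a portlet written for one project can be dropped into another portal unchanged, which is the sharing argument made above.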
Lessons Learned: Portals
Developing good user interfaces is a lot of work.
• Effort doesn’t scale: how do you simplify this for
computational scientists to do it themselves without
lots of background in XML, Java, portlets, etc?
Portal interfaces have advantages and disadvantages.
• Everyone has a browser.
• But it has a limited widget set, a limited event
model, limited interactivity.
• You can of course overcome a lot of this with
Following the service model, you can in principle use
any number of GUIs
• Browsers are not the only possible clients.
• Web service interoperability means that Java Swing
apps, Python, Perl GUIs are all possible, but this has
not been fully exploited.
Use OGCE Portal Architecture and portal services
Can expect GGF activities like OGSA to define/refine interfaces and
projects around the world to produce more powerful services
• Obsolescing of implementations is a consequence of interoperability
Use Grids of Grids of Services Architecture
• Interoperable Component Grids Built from interoperable services
• Collaboration, Compute, Database, GIS, Sensor, Visualization Grids
Build a GIS (Geographical Information Systems) Grid spanning
simulation/crisis management and different fields with openGIS
• openGIS has defined Web Service Interfaces
• Visualization should build on these
Geoscience Education Grid by transformations on research grid
Emergency Response and Planning Grids by adding real-time
control/collaboration and GIS tools
• These additions common to all crises
Collaboration between Beihang University and Indiana University to
produce Web Service based audio/video conferencing