A Cyberinfrastructure Framework for Discovery, Integration, and Analysis of Earth Science Data

A Prototype System
A. K. Sinha*, Z. Malik*, A. Rezgui*, A. Dalton*, K. Lin**
* Virginia Tech ** San Diego Supercomputer Center

Hypothesis Evaluation: Are A-Type Rocks in Virginia related to a Hot Spot Trace?

[Figure: Spatio-Temporal Distribution of Igneous Rocks. Labels: Hot Spot Trace?, Laurentian Crust and Lithosphere, Plume Head]

GEON’s DIA Engine

Evaluating a Hypothesis requires:

- Discovery: Access to Data
- Integration of Data: Provide data products
- Analysis of Data: Verify Hypothesis

Data Discovery

Registration of Data: Pre-requisite for Data Discovery

- Level 1 Registration: Keywords
- Level 2 Registration: Ontologic Classes
- Level 3 Registration: Item Detail Level

Registration of Data: Key to Discovery, Integration and Analysis

Level 1
- Discovery of data resources (e.g., gravity, geologic maps) requires registration through the use of high-level index terms. GEON has deployed an extension of the AGI Index terms, which will be cross-indexed to others such as GCMD and AGU.

Level 2
- Discovering item-level databases requires registration to data-level ontologies (e.g., bulk rock geochemistry, gravity database).

Level 3
- Item-detail-level registration (e.g., the column in a geochemical database that represents the SiO2 measurement). This level of registration is a requirement for semantic integration.
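
The three registration levels can be pictured as progressively finer metadata attached to one data resource. The following is a minimal Python sketch of that idea only; the `Registration` class, its field names, and the example values (such as the `SiO2_wt` column) are hypothetical and are not taken from GEON's actual registration schema.

```python
from dataclasses import dataclass, field


@dataclass
class Registration:
    """Hypothetical registration record for one data resource."""
    resource: str
    index_terms: list[str] = field(default_factory=list)       # Level 1: keywords
    ontology_classes: list[str] = field(default_factory=list)  # Level 2: ontologic classes
    column_map: dict[str, str] = field(default_factory=dict)   # Level 3: column -> concept

    def level(self) -> int:
        """Return the finest registration level this record reaches (0 if none)."""
        if self.column_map:
            return 3
        if self.ontology_classes:
            return 2
        if self.index_terms:
            return 1
        return 0


# Illustrative values only.
reg = Registration(
    resource="igneous_rock_db",
    index_terms=["geochemistry", "igneous rocks"],
    ontology_classes=["RockGeoChemistry"],
    column_map={"SiO2_wt": "AnalyticalOxideConcentration"},
)
print(reg.level())  # 3
```

A resource registered only with index terms can be discovered but not automatically integrated; only the column-to-concept map gives tools enough information to process the values.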

Level 1 Registration
GEON Index Ontology

AGI Index Terms

http://www.geoscienceworld.org/

Level 2: Registration at the Item Level

Level 2 Registration
Ontological Look at Virginia Tech Igneous Rock Database
[Figure: ontology class diagram of the database. Groups: Methods & References, Structure, Location, Isotope, Rock, Mineral. Other labels shown: MapReference, References, AnalyticalMethods, FeTreatmentMinerals, BulkRockGeochemMethods, BodyShapes, Fractures, Fabric, Geologic, GeologicLocation, Images, Element, RockGeoChemistry, ModelComposition, MineralChemistry, Rb_Sr_Isotope_Whole_Rock, Rb_Sr_Isotope_Mineral, Sm_Nd_Isotope_Whole_Rock, Sm_Nd_Isotope_Mineral, U_Th_Pb_Isotope_Whole_Rock, U_Th_Pb_Isotope_Mineral]

Level 3 Registration

AnalyticalOxideConcentration
- analyticalOxide : AnalyticalOxide
- concentration : ValueWithUnit
- errorOfConcentration : ValueWithUnit
[Multiplicities shown in the diagram: 1, 0..n]

GEON approach of registering data to concepts removes structural (format) and semantic heterogeneity

[Figure: A Section from Planetary Material Ontology]
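
As a rough illustration of how item-level registration can remove format heterogeneity, the sketch below maps differently named columns from two sources onto the `AnalyticalOxideConcentration` concept. The column names, the `COLUMN_REGISTRY` table, the `register_row` helper, and the ppm-to-wt% normalization are assumptions made for this example; they are not part of the ontology or of GEON's software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValueWithUnit:
    value: float
    unit: str


@dataclass(frozen=True)
class AnalyticalOxideConcentration:
    analytical_oxide: str                                  # e.g. "SiO2"
    concentration: ValueWithUnit
    error_of_concentration: Optional[ValueWithUnit] = None


# Hypothetical Level 3 registrations: source column -> (oxide, unit of that column)
COLUMN_REGISTRY = {
    "SiO2_wt": ("SiO2", "wt%"),     # column name used by one group
    "silica_ppm": ("SiO2", "ppm"),  # column name used by another group
}


def to_weight_percent(v: ValueWithUnit) -> ValueWithUnit:
    """Normalize a concentration to wt% (1 wt% = 10,000 ppm)."""
    if v.unit == "wt%":
        return v
    if v.unit == "ppm":
        return ValueWithUnit(v.value / 10_000, "wt%")
    raise ValueError(f"unsupported unit: {v.unit}")


def register_row(row: dict) -> list[AnalyticalOxideConcentration]:
    """Translate the registered columns of a raw row into ontology instances."""
    out = []
    for column, value in row.items():
        if column in COLUMN_REGISTRY:  # unregistered columns are ignored
            oxide, unit = COLUMN_REGISTRY[column]
            out.append(AnalyticalOxideConcentration(
                oxide, to_weight_percent(ValueWithUnit(value, unit))))
    return out


a = register_row({"SiO2_wt": 72.5})
b = register_row({"silica_ppm": 725_000.0, "sample_id": 17})
print(a == b)  # True: both rows describe the same SiO2 concentration
```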

DIA Engine (1)
Discovery, Integration and Analysis Engine

How does GEON discover data?

- Keywords, Resource Type, Temporal, Spatial
- Invoke GEON protocol for discovering databases
- Retrieve the discovered data from registered databases
- Emphasize Geospatial and Aspatial Discoveries (not everything needs to be done through a map-based browser)
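
A discovery request that combines keywords, resource type, and a spatial extent could be sketched as a filter over a catalog of registered resources. Everything below (the `Resource` record, the `discover` function, and the sample catalog with its approximate bounding boxes) is hypothetical and only illustrates the idea; it is not the GEON discovery protocol. A temporal criterion is omitted for brevity and would follow the same pattern as the spatial one.

```python
from dataclasses import dataclass
from typing import Optional

BBox = tuple[float, float, float, float]  # (west, south, east, north) in degrees


@dataclass(frozen=True)
class Resource:
    name: str
    keywords: frozenset[str]
    resource_type: str
    bbox: Optional[BBox] = None  # None marks an aspatial resource


def overlaps(a: BBox, b: BBox) -> bool:
    """True if two bounding boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def discover(catalog, keywords=(), resource_type=None, bbox=None):
    """Return names of resources matching every criterion that was supplied."""
    hits = []
    for r in catalog:
        if not set(keywords) <= r.keywords:
            continue
        if resource_type is not None and r.resource_type != resource_type:
            continue
        # Design choice: aspatial resources are never excluded by a spatial constraint.
        if bbox is not None and r.bbox is not None and not overlaps(bbox, r.bbox):
            continue
        hits.append(r.name)
    return hits


# Illustrative catalog; names and extents are made up for the example.
catalog = [
    Resource("va_igneous_geochem", frozenset({"geochemistry", "igneous"}),
             "database", bbox=(-83.7, 36.5, -75.2, 39.5)),
    Resource("wy_geochem", frozenset({"geochemistry"}),
             "database", bbox=(-111.1, 41.0, -104.0, 45.0)),
    Resource("discrimination_tool", frozenset({"geochemistry"}), "tool"),
]

region = (-80.0, 37.0, -77.0, 39.0)
print(discover(catalog, keywords=["geochemistry"], bbox=region))
# ['va_igneous_geochem', 'discrimination_tool']
```

Keeping aspatial resources in the result reflects the point that not every discovery has to go through a map.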




DIA Engine (2)

Geospatial Engine
Geoscience Templates:
- Geologic Map (USA)
- Geologic Map (States)
- Geologic Provinces
- Terrane Map
- Geophysical Map

Aspatial Engine
- Experimental Databases
- Tools

High-Level View of the DIA Engine

- User specifies class of data for analysis
- The DIA Engine derives and retrieves the different data sets needed for the requested analysis
- The DIA Engine applies processing and filtering techniques to generate the requested data product
- Data products and Query Steps can be saved

[Diagram labels: Raw Data, Query Tool, Data Product, Modeling, Computation]

Data products (1)

Data products can be in the form of Interactive Maps, Interactive Filtering Diagrams or Excel Data Files.

Examples:
- A map showing the A-Type bodies in the Mid-Atlantic region
- An Excel file giving the ages of those A-Type bodies
- A gravity database table spatially related to A-Type bodies
  - Saved as a contoured gravity map

Data products (2)

Data products can be:

- Pre-Packaged
  - Quickly queried, but not flexible; they provide little support for complex scientific discovery
- Created Dynamically
  - May require extensive on-the-fly query processing, but enables far richer possibilities for scientific discovery
  - Requires Semantic Integration




Data Integration (1)

Semantic integration of data products requires:

- Ontologies: a common language to interpret data from different sources
- Data sharing: requires data registration
  - Fine-grained (i.e., item-level) registration is necessary to enable the automatic processing (by tools) of shared data.

Data Integration (2)

[Diagram: Data owners register their raw data against the Geo-physics and Geo-chemistry ontologies, producing ontologically registered data. Query tools (QT 1, QT 2) operate on the registered data to generate data products (DP 1, DP 2). Two cases are shown: integration within an ontological class, and integration across ontological classes. Integration class: Location.]
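
Integration across ontological classes through a shared Location class can be sketched as a spatial join: each geochemical sample is paired with the gravity readings that lie within a chosen distance of it. The record layouts, the `integrate_by_location` function, the search radius, and all numeric values below are made up for illustration and do not describe GEON's implementation.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def integrate_by_location(samples, stations, radius_km):
    """Attach to each geochemical sample the gravity stations within radius_km."""
    product = []
    for s in samples:
        nearby = [
            g for g in stations
            if haversine_km(s["lat"], s["lon"], g["lat"], g["lon"]) <= radius_km
        ]
        product.append({**s, "gravity": nearby})
    return product


# Fictional records registered to two different ontologies (geochemistry, geophysics).
samples = [{"id": "S1", "lat": 37.0, "lon": -79.0, "SiO2_wt": 72.5}]
stations = [
    {"id": "G1", "lat": 37.1, "lon": -79.0, "bouguer_mgal": -45.0},  # about 11 km away
    {"id": "G2", "lat": 39.0, "lon": -79.0, "bouguer_mgal": -60.0},  # about 222 km away
]

data_product = integrate_by_location(samples, stations, radius_km=50.0)
print([g["id"] for g in data_product[0]["gravity"]])  # ['G1']
```

Integration within a single ontological class is simpler: once columns are registered to the same concepts, rows from different servers can be concatenated directly.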

Limitations of Current Data Sharing Approaches

- Each research group adopts its own acronyms, notations, conventions, units, etc.
- Data sharing is of limited scope
  - Data discovery is ad hoc
  - Only a small community of scientists may be aware of and share a given data set
- Integration is difficult
  - Extensive conversion efforts may be needed
- Absence of streamlined integration leads to poor ability to answer complex scientific questions

Solution: Ontology-based Data Registration

Query Building

- Menu-based (Used in the Demo)
  - The GUI lets the user select only specific items, so each query addresses only a subset of the data
  - Results are guaranteed, because every query built from the menus can be answered
  - A robust system informs the user of any incorrect input and guides them in the right direction
- Text-based
  - The entire database can be queried
  - Result sets may be empty
  - Even a small mistake in the query can return incorrect results, without the user being able to identify the error
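
One way to obtain the guarantee that a menu-built query always returns results is to populate each menu from the data itself, restricted by the choices already made. The sketch below shows this with an in-memory SQLite table; the table layout, the `MENU_COLUMNS` whitelist, and the helper names are hypothetical, and the demo described in these slides used Microsoft SQL Server rather than SQLite.

```python
import sqlite3

MENU_COLUMNS = ("state", "rock_type")  # hypothetical columns exposed as menus


def _where(selections):
    """Build a parameterized WHERE clause from the menu choices made so far."""
    for col in selections:
        if col not in MENU_COLUMNS:  # column names cannot be bound as parameters
            raise ValueError(f"unknown menu: {col}")
    clause = " AND ".join(f"{col} = ?" for col in selections)
    return (f" WHERE {clause}" if clause else ""), list(selections.values())


def menu_options(conn, column, selections):
    """Values to offer in the next menu, given the earlier selections."""
    if column not in MENU_COLUMNS:
        raise ValueError(f"unknown menu: {column}")
    where, params = _where(selections)
    sql = f"SELECT DISTINCT {column} FROM samples{where} ORDER BY {column}"
    return [row[0] for row in conn.execute(sql, params)]


def run_query(conn, selections):
    """Run the query described by the completed menu selections."""
    where, params = _where(selections)
    sql = f"SELECT sample_id FROM samples{where} ORDER BY sample_id"
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (sample_id INTEGER, state TEXT, rock_type TEXT)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [(1, "VA", "granite"), (2, "VA", "rhyolite"), (3, "WY", "granite")],
)

print(menu_options(conn, "state", {}))                            # ['VA', 'WY']
print(menu_options(conn, "rock_type", {"state": "WY"}))           # ['granite']
print(run_query(conn, {"state": "WY", "rock_type": "granite"}))   # [(3,)]
```

Because the second menu only offers rock types that exist for the chosen state, the final query cannot come back empty, which a free-text query cannot promise.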

Menu-based Query Building

- In a selected “region of interest” the user is provided with a number of options (the menu)
- User clicks through the different menus to build an exact query
- Click history is maintained to enable future referencing

[Diagram: Menu #1 through Menu #5]

Query Tool Selection

- Tools provided by GEON can be used to answer a query
  OR
- Other geologic tools can be incorporated (invocation interfaces need to be defined)
  - Example: GCD-Kit can be used for classification, geotectonic and normative calculations for Igneous Rocks

Analysis

- Data Product(s) generated can be analyzed using various techniques
  - Modeling
  - Computation

Workflow Associated with the Demo

[Diagram: Through a Java/VB Script-enabled Web browser, the user asks "Q: A-Type polygons in a region R using discrimination diagram D?". Components shown:
- GEON Server at Virginia Tech: Java/VB Script, ASP.net, VB.net, Visual Basic, ESRI ArcGIS Server
- Discrimination Functions: 10000*Ga/Al vs. Zr; Y vs. Nb; FeO*/MgO vs. Zr+Nb+Ce+Y
- US National Gazetteer
- Geo-Chemical Data Server 1, Virginia Tech (Mid-Atlantic): MS SQL Server, GeoChemical Data
- Geo-Chemical Data Server 2 (Wyoming): MS SQL Server, GeoChemical Data
- Geo-Chemical Data Server 3 (Texas): MS SQL Server, GeoChemical Data
- Geo-Spatial Data Server: ESRI ArcSDE, GeoSpatial Data
- Web Server at SDSC: Rock Classification Ontology]
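
To make the idea of a discrimination filter concrete, here is a minimal Python sketch of one of the listed diagrams, 10000*Ga/Al vs. Zr. The slides do not give the field boundaries or the filter code (the demo coded its filters in Visual Basic), so the function names and the default thresholds below are assumptions: 2.6 for 10000*Ga/Al and 250 ppm for Zr are values commonly quoted from Whalen et al. (1987) and should be checked against the diagram actually used. The sample compositions are invented.

```python
# Mass fraction of Al in Al2O3, from standard atomic masses (Al 26.982, O 15.999).
AL_PER_AL2O3 = 2 * 26.982 / (2 * 26.982 + 3 * 15.999)


def ga_al_ratio(ga_ppm: float, al2o3_wt_pct: float) -> float:
    """Return 10000 * Ga/Al with both elements expressed in ppm."""
    al_ppm = al2o3_wt_pct * AL_PER_AL2O3 * 10_000  # wt% oxide -> ppm element
    return 10_000 * ga_ppm / al_ppm


def is_a_type(ga_ppm: float, al2o3_wt_pct: float, zr_ppm: float,
              ga_al_min: float = 2.6, zr_min: float = 250.0) -> bool:
    """Classify a sample as A-type if it plots beyond both assumed boundaries."""
    return ga_al_ratio(ga_ppm, al2o3_wt_pct) > ga_al_min and zr_ppm > zr_min


# Invented sample compositions, for illustration only.
print(is_a_type(ga_ppm=25, al2o3_wt_pct=13.0, zr_ppm=400))   # True
print(is_a_type(ga_ppm=17, al2o3_wt_pct=14.5, zr_ppm=150))   # False
```

Keeping the boundaries as parameters makes it straightforward to expose the same function for the other diagrams or to adjust the thresholds without touching the query workflow.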

Used Technologies

- User Interface:
  - Java / VB Script
  - ASP.net
  - VB.net
- Back-End:
  - ESRI ArcGIS Server 9.1
  - ESRI ArcSDE 9.1 (Spatial Database)
  - Microsoft SQL Server (Geo-Chemical Database)
- Functionality Coding:
  - Visual Basic (to code the discrimination filters)


Demo Starts Here


Current Tool Sharing Approaches

- Each research group develops its own tools
- Tools developed by a research group are rarely used by other groups
- Redundancy of development efforts
- Little interoperability amongst tools
  - Interaction amongst different tools is often not possible or requires extensive (re)coding

Solution: Wrap Tools as Web Services Accessible to the Scientific Community Worldwide

The Future: Integration through Ontologies and Web Services

Benefits of Web Services

- Facilitate Integration
  - Tools developed independently may easily be integrated into new applications
  - Example: Discrimination tools may be made available as Web services
- Provide High Reusability
  - More tools available to the research community
- Reduce development time, effort, and cost

Web Services Explained (1)

WS Standards
- WSDL: Web Services Description Language
- UDDI: Universal Description, Discovery, and Integration
- SOAP: Simple Object Access Protocol

[Diagram: (1) Service providers publish their Web services, described by WSDL service descriptions, to a UDDI registry. (2) User applications discover a Web service through the UDDI registry. (3) User applications invoke the Web service by exchanging SOAP messages over the Web. The diagram shows Service Providers 1 to 3 offering Functions 1 to 3 and two user applications.]

Web Services Explained (2)

- WSDL (Service provider describes service using WSDL)
  - An XML-based language to describe the capabilities of Web services
  - The capabilities of a WS are described as a set of end points that can exchange messages
  - WSDL descriptions can be referenced from UDDI registry entries
- UDDI (Service provider publishes service using UDDI)
  - A Web-based directory where service providers may list their services and where service consumers may retrieve the services published by the providers (like yellow pages)
- SOAP (Clients and services communicate using SOAP)
  - An XML-based protocol used to encode the messages (requests and responses) exchanged between a Web service and its clients.
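
To show what "an XML-based protocol used to encode the messages" looks like in practice, the sketch below builds and reads a SOAP 1.1 request with Python's standard library. The envelope namespace is the standard SOAP 1.1 one; the service namespace, the `ClassifyAType` operation, and its parameters are hypothetical stand-ins for a discrimination tool wrapped as a Web service.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/discrimination"       # hypothetical service namespace


def build_request(operation: str, params: dict) -> bytes:
    """Encode an operation call and its parameters as a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def read_request(message: bytes):
    """Decode a SOAP envelope back into (operation, parameters)."""
    root = ET.fromstring(message)
    call = root.find(f"{{{SOAP_NS}}}Body")[0]  # first child of Body is the call
    operation = call.tag.split("}", 1)[1]      # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


message = build_request("ClassifyAType", {"region": "Mid-Atlantic", "diagram": "Ga/Al vs Zr"})
print(read_request(message))
# ('ClassifyAType', {'region': 'Mid-Atlantic', 'diagram': 'Ga/Al vs Zr'})
```

In a deployed service the operation name and parameter types would come from the provider's WSDL description rather than being hard-coded by the client.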


[Diagram: summary of the workflow.
- Discovery: Geospatial Query, Aspatial Query
- Integration within the same ontologic class: ontologically registered geochemical data from VA, WY and TX, yielding a data product (Geochemical A-Type Identification)
- Integration between different ontologic classes: ontologically registered Geochemical, Geophysics and Geologic Time data, yielding a data product
- Analysis: Hypothesis Evaluation: Are A-Type Rocks in Virginia related to a Hot Spot Trace?]


								