Using Semantics to Enable Environmental Information Integration for a Better World


   EPA's Environmental Information Symposium 

                May 12th, 2010


                Luis Bermudez 

              bermudez@sura.org

      Coastal Research Technical Manager

                    SURA
                   Agenda


•   SURA Coastal Program
•   Motivation
•   Controlled vocabularies: A quick tutorial

•   State of the art tools
•   Community examples
•   Summary and Conclusions
Advancing Infrastructure for collaborative transformational science

Thomas Jefferson National Accelerator Facility (JLab)

                  Information Technology





1985: six nodes
1987: SURAnet - First Internet Service Provider in the US
Today: Billions of nodes
Regional cyberinfrastructure for sharing computing resources
Coastal Program

The United Nations estimates that by 2020, 75% of the world’s population will be living within 60 km of the coastal zone

Complex Problems

Where is it going to go ?

Which routes are going to be flooded and when?
Need to improve our ability to understand and respond to natural hazards

  Need to Improve Science





Get sensors that measure wave height to compare to my model

Need to Improve Response


     Observing Systems Integration





By 2010, there will be 10,000 telemetric devices for every human on the planet

Ernst & Young study
Institute for the Future Report "Beyond the Internet"

Well ... let’s store everything in one database ?




Archives almost 99% of NOAA data

  225 gigabytes added each day

     320 million paper records 

Groundwater surveys at USGS > 700,000

Real Data Surface Water stations > 8,000

EPA STORET stations ~ 275,000

NOAA / NCDC
EPA / STORET
EPA / WQX
USGS / NWIS
We will end up with specialized systems:

• Quality Data
• Biological Data
• Physical Data
• Models
• etc.

Still think one system is 

 the correct answer ?

Internet is great !

Billions of connected nodes 


Some of these nodes talk to each other

SURA Coastal Ocean and Prediction Program

1. UF gets reports from NHC and produces winds
2. RENCI, BIO, VIMS, UF run coastal models
3. Data is catalogued at UAH
4. LSU and TAMU archive the data
5. OpenIOOS displays track, observations and models

Interoperability
Machine to Machine communication needs Metadata

Creator: USGS
Keyword: Gage Height
...
Describes a resource

• Answers: what, when, where, how, who and why of the described data.
• Helps to: discover, access, evaluate and use data.
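
A minimal sketch, in Python, of how a program might hold such a record and use it for discovery. The `creator` and `keyword` fields mirror the Creator/Keyword example above; the `MetadataRecord` class, the `discover` helper, the second catalog entry and the example.org URLs are all hypothetical and only serve as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """Describes a resource (hypothetical structure, not a real standard)."""
    resource_url: str   # where the described data lives (made-up URLs below)
    creator: str        # who produced the data
    keyword: str        # what the data is about


def discover(records, keyword):
    """Return the records whose keyword matches, ignoring case."""
    wanted = keyword.strip().lower()
    return [r for r in records if r.keyword.lower() == wanted]


# The first entry follows the slide; the second is an invented example.
catalog = [
    MetadataRecord("http://example.org/data/1", "USGS", "Gage Height"),
    MetadataRecord("http://example.org/data/2", "EPA", "Water Temperature"),
]

matches = discover(catalog, "gage height")
print([m.creator for m in matches])
```

The search only works because both the publisher and the searcher use the same keyword, which is the agreement problem the next slides turn to.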
Communities Must Agree Upon

• What descriptors can be used ?
	     e.g. Keyword or topic         Semantics
• Which possible values?
	     e.g. Gage height or water elevation
• Rules of usage:
   Cardinality - E.g. How many authors?
   Obligation - E.g. Should topic always be
                annotated ?
• What syntax should be used ?

ASCII, Excel, XML …
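
The agreement listed above could be checked by a program roughly as follows. This is a sketch under stated assumptions: the `AGREEMENT` table, its rule names (`required`, `max_count`, `allowed`) and the limits in it are invented for illustration; only the descriptor and value examples (keyword, gage height, water elevation) come from the slide. The syntax question is sidestepped by using plain Python dictionaries.

```python
# Hypothetical community agreement: descriptor -> rules (all names are assumptions).
AGREEMENT = {
    "creator": {"required": True, "max_count": 1, "allowed": None},
    "keyword": {"required": True, "max_count": 3,
                "allowed": {"Gage height", "Water elevation"}},
}


def validate(record):
    """Check a record (descriptor -> list of values) against the agreement.

    Returns a list of human-readable problems; an empty list means valid.
    """
    problems = []
    # Which descriptors can be used?
    for name in record:
        if name not in AGREEMENT:
            problems.append(f"unknown descriptor: {name}")
    for name, rule in AGREEMENT.items():
        values = record.get(name, [])
        # Obligation: must the descriptor always be present?
        if rule["required"] and not values:
            problems.append(f"missing required descriptor: {name}")
        # Cardinality: how many values are allowed?
        if len(values) > rule["max_count"]:
            problems.append(f"too many values for {name}")
        # Which values are possible?
        if rule["allowed"] is not None:
            for value in values:
                if value not in rule["allowed"]:
                    problems.append(f"value not in vocabulary: {value}")
    return problems


good = {"creator": ["USGS"], "keyword": ["Gage height"]}
bad = {"creator": ["USGS", "EPA"], "keyword": ["Gauge hight"], "color": ["blue"]}

print(validate(good))
print(validate(bad))
```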
Semantic Problem

Groundwater
Spring
Well

> 10,000 searchable terms
> 9,000 different parameters

http://ga.water.usgs.gov/edu/pictures/wcgwstoragebeach.jpg


Semantic Problem

• How do I know how others are naming groundwater site types ?
• How do I solve the inconsistency of terms (e.g. synonyms) ?

From: http://www.mattwardman.com/blog/wp-content/uploads/q-man-thinking-7.gif

http://1.bp.blogspot.com/_2RmMLoFZiws/R6mm9dm84QI/AAAAAAAAATQ/G0q-5-rNdaU/s800/language-translation.jpg
Solution

1) We translate

USGS Controlled Vocabulary: Spring, ..
EPA Controlled Vocabulary: Well, ..

2) We translate ( mappings )

USGS Controlled Vocabulary: Spring, ..
EPA Controlled Vocabulary: Well, ..
Spring (USGS)   Same As   Well (EPA)

Central Controlled Vocabulary

Groundwater (Central Controlled Vocabulary)
   narrower: Spring (USGS Controlled Vocabulary)
   narrower: Well (EPA Controlled Vocabulary)
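
The mapping and central-vocabulary approaches can be sketched in a few lines of Python. The terms and relations ("Same As", "narrower") are the ones on the slides; the `SAME_AS` and `BROADER` tables and the `translate` and `search_central` helpers are hypothetical names chosen for this illustration.

```python
# Direct mappings between two vocabularies (the pair shown on the slide).
SAME_AS = {("USGS", "Spring"): ("EPA", "Well")}

# Central vocabulary: each local term points to the broader central term.
BROADER = {
    ("USGS", "Spring"): "Groundwater",
    ("EPA", "Well"): "Groundwater",
}


def translate(source, term, target):
    """Translate a term via direct 'Same As' mappings, in either direction."""
    for left, right in SAME_AS.items():
        if left == (source, term) and right[0] == target:
            return right[1]
        if right == (source, term) and left[0] == target:
            return left[1]
    return None  # no mapping known


def search_central(central_term):
    """Find every local term that is narrower than a central term."""
    return sorted(key for key, broader in BROADER.items() if broader == central_term)


print(translate("USGS", "Spring", "EPA"))
print(translate("EPA", "Well", "USGS"))
print(search_central("Groundwater"))
```

With direct mappings, every pair of vocabularies needs its own table; with a central vocabulary, each provider only maps its terms once, to the shared term.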

3 Min Tutorial: Controlled Vocabulary

Based on Marine Metadata Interoperability Tutorial at AGU Ocean Sciences Meeting 2008


Controlled Vocabulary

• a set of restricted words
• agreed by a community
• used for:
  • describing resources
  • discovering resources
• avoids:
  • inconsistencies
  • misspellings
Controlled Vocabulary

Official codes for names of countries

http://www.google.it
http://www.google.ch
http://www.google.cn

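
A small Python sketch of country codes acting as a controlled vocabulary. It assumes the slide refers to the ISO 3166-1 two-letter codes, which country-code top-level domains generally follow; the `COUNTRY_CODES` excerpt and the `country_of` helper are illustrative names, and only the three codes from the URLs above are included.

```python
from urllib.parse import urlparse

# A tiny excerpt of the ISO 3166-1 alpha-2 country codes, used as a
# controlled vocabulary. Only the three codes from the slide are listed.
COUNTRY_CODES = {"it": "Italy", "ch": "Switzerland", "cn": "China"}


def country_of(url):
    """Return the country named by the URL's top-level domain, or None."""
    host = urlparse(url).hostname or ""
    top_level = host.rsplit(".", 1)[-1].lower()
    return COUNTRY_CODES.get(top_level)


for url in ("http://www.google.it", "http://www.google.ch", "http://www.google.cn"):
    print(url, "->", country_of(url))
```

Because everyone uses the same short codes, a program can resolve them without guessing at spellings or languages.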

Characteristics of Controlled Vocabulary

Published

Managed

British Oceanographic Data Centre (BODC)

Governed

Community Built

Friendly to computer programs

453 members and 55 groups
4,600 documents
10,00 pages viewed per month
Create ontologies from scratch

Manages revisions

Browse

Map 

Projects

Hydroseek faceted browser

Narrower Than
OGC Ocean Science Interoperability Experiment and OOSTethys

OGC World initiative to advance standards for interoperability of ocean observing systems.
http://openioos.org

Data Provider’s Controlled Vocabulary
Narrower Than
Controlled vocabularies

Controlled Vocabularies
But wait ... there is more than taxonomy relations (narrower, broader)

The world is much more complex

Is it raining in Philadelphia ?

http://2.bp.blogspot.com/_DKLKIcLTrq4/SdD4SvEpdkI/AAAAAAAABrw/xUg0sJf-Kys/s400/rain.jpg

[Diagram: a graph linking Philadelphia, DIX, radar, NOAA, rain and a "Most recent average" (value 0, units dbZ) through the relations "is a", "measures", "operates", "publishes", "covers", "is in", "Value is", "Has units" and "Related noun".]
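
Relations richer than narrower/broader are commonly written as subject-relation-object statements. The Python sketch below uses node and relation labels from the diagram, but the way they are connected is an assumption, since the original layout is not recoverable; the "value greater than 0" rule is a placeholder, not a meteorological criterion, and all function names are hypothetical.

```python
# Subject-relation-object statements. The node and relation labels come from
# the diagram; how they are wired together here is an ASSUMPTION.
TRIPLES = [
    ("DIX", "is a", "radar"),
    ("radar", "measures", "rain"),
    ("NOAA", "operates", "DIX"),
    ("DIX", "covers", "Philadelphia"),
    ("DIX most recent average", "value is", 0),
    ("DIX most recent average", "has units", "dbZ"),
]


def objects(subject, relation):
    """All objects o such that (subject, relation, o) is stated."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]


def subjects(relation, obj):
    """All subjects s such that (s, relation, obj) is stated."""
    return [s for s, r, o in TRIPLES if r == relation and o == obj]


def is_raining(place):
    """Placeholder rule: some station covering the place reports a value > 0."""
    for station in subjects("covers", place):
        for value in objects(f"{station} most recent average", "value is"):
            if value > 0:
                return True
    return False


print(subjects("covers", "Philadelphia"))
print(is_raining("Philadelphia"))
```

Answering the question means following several relations in turn (which station covers the place, what it measures, what its latest value is), which a simple narrower/broader hierarchy cannot express.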
Summary and Conclusion

• Internet is Great ! Nodes can interoperate, given agreed metadata
• Nodes can interoperate if we resolve semantic problems
• Semantic problems can be addressed by translators
• Translators use mappings
• Mappings are created by tools
• Mappings and Web Services enable environmental information integration
• Environmental information integration enables faster response and a better world


Thank you !

Luis Bermudez
bermudez@sura.org

								