					The Semantic Web


         Stefan Decker
 Information Sciences Institute
University of Southern California
                   Outline
• Semantic Web Overview
  – Vision, Challenges, Rationale
• Semantic Web in SCEC




               Semantic Web
• coined by Tim Berners-Lee (1997)

“The Semantic Web is an extension of the current
  web in which information is given well-defined
  meaning, better enabling computers and people to
  work in cooperation.”


   – T. Berners-Lee, J. Hendler, O. Lassila,
     “The Semantic Web”, Scientific American, May 2001


              Doctor’s appointment
              “The Semantic Web”, Scientific American, May 2001

[Figure: software agents cooperating to arrange Mom’s doctor’s appointment.
 Labels: Lucy’s Agent, Pete’s Agent, Physician’s Agent (required treatment),
 Provider sites, Insurance Co. (in-plan?), Rating (close-by? specialist?),
 Schedule appointment, Driving schedule]
   Means to Achieve the Vision
• Explicit Ontologies
  – Needed to understand each other's data
    (e.g., joint notion about what a schedule is)


• Web Services
  – Required to actively interconnect systems
    (automatically make an appointment)



         Technical challenges
• Interoperability
  – Inaccurate, incomplete, heterogeneous data
  – Unreliable, ill-defined, evolving services
• Natural language processing, data mining
  – make information explicit
• Human-computer interaction
  – querying interfaces, visualization
• Scalability
  – Subsecond performance

            Social challenges
• Standardization is hard
  – DublinCore
• Bogus or inaccurate metadata
  – Physician rating, profile
• Competition and commoditization
• Economic incentive
  – Chicken and egg
• Complexity: developers and users

                 Jump Starters
• Machine Readable Data:
  – [name lost in extraction] .org (human-edited directory)
  – [name lost in extraction] .org (music encyclopedia)
  – RSS (RDF Site Summary)
  – [name lost in extraction] (embedded metadata)
  – CC/PP (Composite Capability/Preference
    Profiles)
  – P3P (Platform for Privacy Preferences)

                Jump Starters
• B2B Vocabulary Projects
  – PapiNet.org: Vocabulary for Paper Industry
  – BPMI.org: Vocabulary for exchanging Business Process
    Models
  – XML-HR: Vocabularies for human resources (HR)
  – DMTF (Distributed Management Task Force):
    Vocabularies for managing enterprises
  – …
• Research Vocabulary Projects
  – Gene Ontology Working Group
  – Earth Sciences
  – MathNet
  – …

      How do we get there?
• Research communities: DL, AI, DB, …
• Standards bodies: W3C, OMG, …
• Non-profit: US, EC, Japan
• Industry: IBM, Nokia, HP, Microsoft(?), ...
  – Business.semanticweb.org
                   Non-profit
• DARPA (www.daml.org)
  – “DARPA Agent Markup Language”
  – since Aug 2000
• NSF (www.semanticweb.org/SWWS)
  – Co-sponsored events (e.g., SWWS)
  – Further support in the loop
• European Commission (www.ontoweb.org)
  – “Semantic Web Technologies”, Framework 6
• Japan (www.net.intap.or.jp/INTAP/)
  – Interoperability Technology Association for
    Information Processing, Japan (INTAP)
      AI: “Add logic to the Web”
• Assertions, rules
• Agents
• Interoperability
  –   First-order logics
  –   Ontologies, description logics
  –   Logic programming, datalog
  –   Problem-solving methods
  –   …

          Distributed knowledge base
      DB: “Everything is syntax”
• Semistructured data
• Web services
• Interoperability
  –   Data integration
  –   Mediation, query rewriting
  –   Model management
  –   Conceptual modeling

 Conglomerate of distributed heterogeneous
       (semistructured) databases
Many Previously Unknown
 Communication Partners




         Heterogeneous Data
• Too many data formats/languages




                 Step 1
• Define uniform, underlying syntax
  – Lowest common denominator: labeled graphs
    (semi-structured data) -> RDF

Relational database, table Person:

   ID   F-name   L-name
   1    Stefan   Decker
   2    Birgit   Decker

Structured text (e.g., vCard):

   begin:  vcard
   fn:     Stefan
   n:      Decker;Stefan
   end:    vcard

Both as labeled graphs:

   Person --row--> (ID: 1, F-name: Stefan, L-name: Decker)
   Person --row--> (ID: 2, F-name: Birgit, L-name: Decker)
   vcard1 --fn-->  Stefan
   vcard1 --n-->   Decker;Stefan
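
The mapping on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the blank-node naming scheme (`_:Person_row1`) and the tuple representation of a triple are hypothetical and not part of the original deck.

```python
from typing import Dict, List, Tuple

# A labeled-graph edge: (source node, arc label, target node or value).
Triple = Tuple[str, str, str]


def table_to_triples(table: str, rows: List[Dict[str, str]]) -> List[Triple]:
    """Hang one anonymous node per row off the table node via a 'row' arc."""
    triples: List[Triple] = []
    for i, row in enumerate(rows, start=1):
        row_node = f"_:{table}_row{i}"  # hypothetical blank-node naming
        triples.append((table, "row", row_node))
        for column, value in row.items():
            triples.append((row_node, column, value))
    return triples


def vcard_to_triples(node: str, text: str) -> List[Triple]:
    """Turn 'key: value' lines into arcs leaving a single node."""
    triples: List[Triple] = []
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key in ("begin", "end"):  # structural markers, not data
            continue
        triples.append((node, key, value))
    return triples


person_rows = [
    {"ID": "1", "F-name": "Stefan", "L-name": "Decker"},
    {"ID": "2", "F-name": "Birgit", "L-name": "Decker"},
]
vcard_text = """
begin: vcard
fn: Stefan
n: Decker;Stefan
end: vcard
"""

graph = table_to_triples("Person", person_rows) + vcard_to_triples("vcard1", vcard_text)
for triple in graph:
    print(triple)
```

Once both sources are in the same edge-list form, they can be stored and queried by the same machinery, which is the point of choosing labeled graphs as the lowest common denominator.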

                     XML
•   Containment, hierarchy
•   Adjacency (A followed by B)
•   Attributes (atomic values)
•   Opaque reference (IDREF)

    Good for serialization, poor for modeling
    relational semantics


      Encoding of Information
“The Creator of the Resource "http://www.w3.org/Home/Lassila" is Ora Lassila”

   http://www.w3.org/Home/Lassila --Creator--> Ora Lassila

             Endless encoding possibilities in XML:
               <Creator>
                  <uri>http://www.w3.org/Home/Lassila</uri>
                  <name>Ora Lassila</name>
               </Creator>

               <Document uri="http://www.w3.org/Home/Lassila">
                  <Creator>Ora Lassila</Creator>
               </Document>

               <Document uri="http://www.w3.org/Home/Lassila" Creator="Ora Lassila"/>
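
To make the problem concrete, here is a minimal Python sketch that reads the three XML encodings above back into one (resource, property, value) statement. The function name `extract_triple` is hypothetical; the sketch uses only the standard-library `xml.etree.ElementTree` parser.

```python
import xml.etree.ElementTree as ET

ENCODINGS = [
    "<Creator><uri>http://www.w3.org/Home/Lassila</uri>"
    "<name>Ora Lassila</name></Creator>",
    '<Document uri="http://www.w3.org/Home/Lassila">'
    "<Creator>Ora Lassila</Creator></Document>",
    '<Document uri="http://www.w3.org/Home/Lassila" Creator="Ora Lassila"/>',
]


def extract_triple(xml_text):
    """Recover the statement; each encoding needs its own special case."""
    root = ET.fromstring(xml_text)
    if root.tag == "Creator":
        # Encoding 1: resource and value are both child elements.
        return (root.findtext("uri"), "Creator", root.findtext("name"))
    if root.tag == "Document" and "Creator" in root.attrib:
        # Encoding 3: resource and value are both attributes.
        return (root.get("uri"), "Creator", root.get("Creator"))
    if root.tag == "Document":
        # Encoding 2: resource is an attribute, value is a child element.
        return (root.get("uri"), "Creator", root.findtext("Creator"))
    raise ValueError(f"unknown encoding: {root.tag}")


triples = {extract_triple(text) for text in ENCODINGS}
print(triples)
```

All three documents collapse to the same statement, but only because the reader was told by hand how each layout maps to it. A shared graph model removes that per-format step.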

           Introduction to RDF
• RDF (Resource Description Framework)
   – Beyond machine-readable: machine-understandable
• RDF unites a wide variety of stakeholders:
   – Digital librarians, content-raters, privacy advocates,
     B2B industries, AI...
   – Significant (but less than XML) industrial momentum,
     led by the W3C
• RDF consists of two parts
   – RDF Model (a set of triples)
   – RDF Syntax (different XML serialization syntaxes)
• RDF Schema for definition of Vocabularies
  (simple Ontologies) for RDF (and in RDF)
                     A Simple Example
• Describing Resources
      –   URIs: global OIDs, literals
      –   Binary relationships between objects
      –   Arcs (relationships) are first-class objects
      –   Blank (anonymous) nodes
• “Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila”
•    Structure
      – Resource        (subject)    http://www.w3.org/Home/Lassila
      – Property        (predicate)  http://www.schema.org/#Creator
      – Value           (object)     “Ora Lassila”

   http://www.w3.org/Home/Lassila --s:Creator--> Ora Lassila
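
As a small illustration, the statement can be written out as one line of N-Triples, a line-based RDF serialization standardized by the W3C. N-Triples is not mentioned on the slides (which refer to XML serializations), so it is swapped in here for brevity; the helper name `to_ntriples` is hypothetical and only handles a URI subject, a URI predicate and a plain string literal.

```python
def to_ntriples(subject_uri: str, predicate_uri: str, literal: str) -> str:
    """Serialize one statement with a plain-literal object as an N-Triples line."""
    # Escape backslashes first, then double quotes, inside the literal.
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject_uri}> <{predicate_uri}> "{escaped}" .'


line = to_ntriples(
    "http://www.w3.org/Home/Lassila",   # resource (subject)
    "http://www.schema.org/#Creator",   # property (predicate)
    "Ora Lassila",                      # value (object)
)
print(line)
```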




                          RDF
• Graph-based universal syntax
[Figure: layered architecture, top to bottom]
  (Agent-) Applications
  RDF layer (single data format, query and storage system)
  Scheduling Service | Insurance | Ratings | Calendar


Semantics in a global, open environment?
               Step 2: Ontologies
• What is an Ontology?
  “An ontology is a specification of a conceptualization.”
                                          Tom Gruber, 1993

• Ontologies are social contracts
  – Agreed, explicit semantics
  – Understandable to outsiders
  – (Often) derived in a community process
• Ontologies require Knowledge
  Representation
  – Is_a hierarchy, part of, attributes, axioms

           RDF and Ontologies
• Idea: Define an ontology language by fixing a set of
  predefined nodes and arcs
• The ontology language itself is just an ontology
• Ontologies are used to tag data from sources




         Step 2: Layers on Top of RDF

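[Figure: the Person table graph from Step 1 with an ontology layered on top]

   From an ontology:   Person --subClassOf--> LivingThing
   Data:               Person --row--> (ID: 1, F-name: Stefan, L-name: Decker)
                       Person --row--> (ID: 2, F-name: Birgit, L-name: Decker)

Tim Berners-Lee:
“Axioms, Architecture and Aspirations”
W3C all-working group plenary Meeting
28 February 2001
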
    W3C Semantic Web Activity

• Working Groups
  – RDF Core
  – Web Ontology
• Advanced development
  – Annotation (Annotea)
  – Access control
  – Calendaring
  – Collaboration
  – Logic
  – Rules
  – Workflows




    RDF Core Working Group
• Resource Description Framework (RDF)
• Goals
  – Improve RDF abstract model and XML syntax
    according to implementers' feedback
  – Define precise semantics for RDF and RDF
    Schema
  – Clarify ties with XML family



  Web Ontology Working Group
• Standard definition language for ontologies
  (conceptual models)
• Derived from Description Logics
   – But partial mapping to Database and Datalog possible
     (see Horrocks, Volz, Decker, Grosof: WWW2003)
• Extension of RDF Schema and DAML+OIL
   –   Class Expressions (Intersection, Union, Complement)
   –   XML Schema Datatypes
   –   Enumerations
   –   Property Restrictions
        • Cardinality Constraints
        • Value Restrictions


                        The Layer Cake


[Figure: Semantic Web layer cake, with layers marked as being in the
 Research Phase, Standardization Phase or Recommendation Phase]




 Tim Berners-Lee:
 “Axioms, Architecture and Aspirations”
 W3C all-working group plenary Meeting
 28 February 2001

   SCEC/IT Architecture for a
Community Modeling Environment




    Tasks within SCEC - CME
• Towards an Earth Sciences Ontology:
  – Cataloging and Unification of Existing
    Databases
     • E.g., Fissures and Fault Activity Database
     • Building a Mediation Environment
     • Organizing a Community Process
• Enriching Web Services and Grid
  Infrastructure with Semantics
  – Service Discovery and Match Making

        Fault Activity Database
• Hand-Maintained within SCEC (Sue Perry)
• Re-engineering of the Database Schemata
   <rdfs:Class rdf:about="&FAD_v1;AVG_RECURRENCE_INTERVAL"
              rdfs:label="AVG_RECURRENCE_INTERVAL">
             <a:_slot_constraints rdf:resource="&FAD_v1;SCFADsep_02_00106"/>
             <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
   </rdfs:Class>
   <rdfs:Class rdf:about="&FAD_v1;AVG_SLIP_PER_EVENT"
              rdfs:label="AVG_SLIP_PER_EVENT">
             <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
   </rdfs:Class>
   <rdfs:Class rdf:about="&FAD_v1;AVG_SLIP_PER_EVENT_METHOD"
              rdfs:label="AVG_SLIP_PER_EVENT_METHOD">
             <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
   </rdfs:Class>
   <rdf:Property rdf:about="&FAD_v1;CFM-A_coord_file_URL"
              a:maxCardinality="1"
              rdfs:label="CFM-A_coord_file_URL">
             <rdfs:domain rdf:resource="&FAD_v1;FAULT"/>
             <rdfs:range rdf:resource="&rdfs;Literal"/>
   </rdf:Property>
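
One thing a re-engineered schema makes possible is checking instance data against it. The sketch below assumes the `a:maxCardinality="1"` declaration on `CFM-A_coord_file_URL` has already been read into a Python dictionary; the fault identifiers, the example.org URLs and the function name are invented for illustration and do not come from the Fault Activity Database.

```python
from collections import Counter

# Constraint taken from the schema excerpt above: at most one value per fault.
MAX_CARDINALITY = {"FAD_v1:CFM-A_coord_file_URL": 1}


def cardinality_violations(triples):
    """Return (subject, property, count) for every exceeded maxCardinality."""
    counts = Counter(
        (subject, prop) for (subject, prop, _value) in triples
        if prop in MAX_CARDINALITY
    )
    return sorted(
        (subject, prop, n)
        for (subject, prop), n in counts.items()
        if n > MAX_CARDINALITY[prop]
    )


# Hypothetical instance data, not real database content.
facts = [
    ("FAD_v1:fault_A", "FAD_v1:CFM-A_coord_file_URL", "http://example.org/a.txt"),
    ("FAD_v1:fault_B", "FAD_v1:CFM-A_coord_file_URL", "http://example.org/b1.txt"),
    ("FAD_v1:fault_B", "FAD_v1:CFM-A_coord_file_URL", "http://example.org/b2.txt"),
]

violations = cardinality_violations(facts)
print(violations)
```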

Planned: Mediation Environment with RDF-
          based Rule Language



[Figure: layered architecture, top to bottom]
  Applications
  Mediation with RDF-based Rule Language
  Fault Activity Database | Fissures | Grid Services


             Motivation:
    Why Rule Languages for the Web

• Plethora of data available
   – Data needs to be adapted and combined
   – “Time to Market”: Faster to write rules than code
   – Data Transformation and Integration
• Logic specification, not programming
   – Tabled evaluation/bottom-up evaluation
   – Semi-structured data
   – Multiple semantics (Relational Data, UML, ER,
     TopicMaps, DAML+OIL, XML-Schema, special
     purpose data models)
   – Distributed, heterogeneous sources

      What’s Wrong With Existing
             Approaches?

• Built-in semantics (e.g. SiLRI, RQL, DQL)
  – but: many RDF-based languages with different
    semantics (DAML+OIL, RDF Schema,
    UML/RDF, TopicMaps/RDF, DMTF, …)
  – A specialized query language for each
    language?




  TRIPLE:Language Overview

• Native support for
   – Resources & namespaces
   – Abbreviations
   – Models (sets of RDF statements)
   – Reification
• Rules with expressive bodies (full FOL syntax)
• Inspired by F-Logic:
   – subject[predicate -> object] (“molecule”)



        Language Description I
• Namespace and resource abbreviations:
   – rdf := "http://www.w3.org/1999/02/22-rdf-syntax-ns#".
   – isa := rdf:subClassOf.
• Statements, triples, molecules:
   – subject[predicate -> object]
   – subject[p1 -> o1; p2 -> o2; ...]
   – s1[p1 -> s2[p2 -> o]]
• Models, model expressions, parameterized
  models:
   – s[p -> o]@m               “triple <s,p,o> in model m”
   – s[p -> o]@(m1 ∩ m2)       model intersection (likewise union, difference)
   – s[p -> o]@sf(m1, X, Y)    Skolem function
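
A model can be pictured as a plain set of triples, in which case the model expressions above correspond to ordinary set operations. The following Python sketch is an analogy, not TRIPLE's implementation: the dictionary `models`, the function `holds` and the sample triples are all hypothetical. A parameterized model is represented by using a term, here a tuple, as the model's name.

```python
from typing import Dict, Hashable, Set, Tuple

Triple = Tuple[str, str, str]

# Model name -> set of statements. The last key is a term built from a
# Skolem function symbol 'sf' and its arguments, i.e. a parameterized model.
models: Dict[Hashable, Set[Triple]] = {
    "m1": {("s", "p", "o"), ("a", "b", "c")},
    "m2": {("s", "p", "o"), ("x", "y", "z")},
    ("sf", "m1", "x1", "y1"): {("s", "q", "r")},
}


def holds(s: str, p: str, o: str, model: Set[Triple]) -> bool:
    """s[p -> o]@model: is the triple <s,p,o> contained in the model?"""
    return (s, p, o) in model


m1, m2 = models["m1"], models["m2"]
in_both = holds("s", "p", "o", m1 & m2)       # intersection
in_either = holds("x", "y", "z", m1 | m2)     # union
only_in_m1 = holds("a", "b", "c", m1 - m2)    # difference
in_param = holds("s", "q", "r", models[("sf", "m1", "x1", "y1")])
print(in_both, in_either, only_in_m1, in_param)
```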
           Language Description II
• Reification:
   – stefan[believes -> <Ora[isAuthorOf -> homepage]>]
• Logical formulae:
   – usual logical connectives and quantifiers
     (AND, OR, NOT, ->, <-, FORALL, EXISTS)
   – all variables introduced via FORALL (or EXISTS)
• Clauses:
   – facts: s[p1 -> o1; p2 -> o2; ...].
   – rules: FORALL X s1[p1 -> X] <- s2[p2 -> X] AND ... .
• Model blocks:
   – @model { clauses }
   – FORALL Mdl @model(Mdl) { clauses }


             Example: Dublin Core
dc := "http://purl.org/dc/elements/1.0/".          namespace abbreviations
db := "http://www-db.stanford.edu/".
····
@db:documents {                                    model block
   db:d_01_01 [                                    fact
     dc:title -> TRIPLE;
     dc:creator -> "Stefan Decker";
     dc:subject -> RDF;
     dc:subject -> triples; ... ].
   FORALL N p(N)[ rdf:type -> xyz:Person;          rule
            xyz:name -> N ]
      <- EXISTS D D[dc:creator -> N].
}
FORALL N <- EXISTS P P[rdf:type -> xyz:Person;     query: “find all names”
     xyz:name -> N]@db:documents.

   N = “Stefan Decker”

[Figure: db:d_01_01 with arcs dc:title -> TRIPLE, dc:creator -> Stefan Decker,
 dc:subject -> RDF, dc:subject -> triples; a derived node with arcs
 name -> Stefan Decker and rdf:type -> Person]
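
To show what the rule and the query compute, here is a minimal Python sketch that evaluates them over the facts of the model block. It hand-codes this one rule rather than interpreting TRIPLE; the function names are hypothetical, and the Skolem term p(N) is represented as the tuple `("p", N)`.

```python
# Facts from the model block, as (subject, predicate, object) tuples.
documents = {
    ("db:d_01_01", "dc:title", "TRIPLE"),
    ("db:d_01_01", "dc:creator", "Stefan Decker"),
    ("db:d_01_01", "dc:subject", "RDF"),
    ("db:d_01_01", "dc:subject", "triples"),
}


def apply_creator_rule(model):
    """For every N that is some document's dc:creator, derive a person p(N)."""
    derived = set(model)
    for (_document, predicate, name) in model:
        if predicate == "dc:creator":
            person = ("p", name)  # Skolem term p(N), one node per name
            derived.add((person, "rdf:type", "xyz:Person"))
            derived.add((person, "xyz:name", name))
    return derived


def find_all_names(model):
    """Query: all N such that some P has rdf:type xyz:Person and xyz:name N."""
    persons = {s for (s, p, o) in model if p == "rdf:type" and o == "xyz:Person"}
    return sorted({o for (s, p, o) in model if p == "xyz:name" and s in persons})


names = find_all_names(apply_creator_rule(documents))
print(names)  # ['Stefan Decker']
```

Without the rule, the query returns nothing, because the raw facts contain no xyz:Person at all.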
        Example: Specification of RDF Schema
                     Semantics
rdf := 'http://www.w3.org/...rdf-syntax-ns#'.            namespace abbreviations
rdfs := 'http://www.w3.org/.../PR-rdf-schema-...#'.

type := rdf:type.                                        resource abbreviations
subPropertyOf := rdfs:subPropertyOf.
subClassOf := rdfs:subClassOf.

FORALL Mdl @rdfschema(Mdl) {                             model block

    FORALL O,P,V O[P->V] <-                              “copy” triples from Mdl
      O[P->V]@Mdl.

    FORALL O,V O[subClassOf->V] <-                       transitivity of subClassOf
     EXISTS W (O[subClassOf->W]
               AND W[subClassOf->V]).

    …
}

                       Example:
       Cars Ontology with RDF Schema Semantics
@cars {
  xyz:MotorVehicle[rdfs:subClassOf -> rdfs:Resource].
  xyz:PassengerVehicle[rdfs:subClassOf -> xyz:MotorVehicle].
  xyz:Truck[rdfs:subClassOf -> xyz:MotorVehicle].
  xyz:Van[rdfs:subClassOf -> xyz:MotorVehicle].
  xyz:MiniVan[
   rdfs:subClassOf -> xyz:Van;
   rdfs:subClassOf -> xyz:PassengerVehicle].
}

[Figure: class hierarchy. xyz:Van, xyz:Truck and xyz:PassengerVehicle below
 xyz:MotorVehicle; xyz:MiniVan below xyz:Van and xyz:PassengerVehicle]

FORALL X <-
   X[rdfs:subClassOf -> xyz:MotorVehicle]@cars.
      X = xyz:Van
      X = xyz:Truck
      X = xyz:PassengerVehicle

FORALL X <-
   X[rdfs:subClassOf -> xyz:MotorVehicle]@rdfschema(cars).
      X = xyz:Van
      X = xyz:Truck
      X = xyz:PassengerVehicle
      X = xyz:MiniVan
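
The difference between the two queries can be reproduced with a small bottom-up fixpoint computation in Python. This sketch covers only the two rules shown for `rdfschema(Mdl)` (copying the triples and transitivity of subClassOf); the rules elided on the earlier slide are not implemented, and the function names are hypothetical.

```python
SUB = "rdfs:subClassOf"

CARS = {
    ("xyz:MotorVehicle", SUB, "rdfs:Resource"),
    ("xyz:PassengerVehicle", SUB, "xyz:MotorVehicle"),
    ("xyz:Truck", SUB, "xyz:MotorVehicle"),
    ("xyz:Van", SUB, "xyz:MotorVehicle"),
    ("xyz:MiniVan", SUB, "xyz:Van"),
    ("xyz:MiniVan", SUB, "xyz:PassengerVehicle"),
}


def rdfschema(model):
    """Copy the model, then add subClassOf triples until nothing new appears."""
    closure = set(model)  # the "copy" rule
    while True:
        # Transitivity: O subClassOf W and W subClassOf V give O subClassOf V.
        new = {
            (o, SUB, v)
            for (o, p1, w) in closure if p1 == SUB
            for (w2, p2, v) in closure if p2 == SUB and w2 == w
        }
        if new <= closure:  # fixpoint reached
            return closure
        closure |= new


def subclasses_of(cls, model):
    """All X with X[rdfs:subClassOf -> cls] in the given model."""
    return sorted(s for (s, p, o) in model if p == SUB and o == cls)


plain = subclasses_of("xyz:MotorVehicle", CARS)
with_schema = subclasses_of("xyz:MotorVehicle", rdfschema(CARS))
print(plain)        # ['xyz:PassengerVehicle', 'xyz:Truck', 'xyz:Van']
print(with_schema)  # ['xyz:MiniVan', 'xyz:PassengerVehicle', 'xyz:Truck', 'xyz:Van']
```

The semantics lives in `rdfschema`, not in the query, so the same query can be run against a different semantics by swapping the model expression.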

Grid Computing and Web Services (ongoing)

  • Matchmaking between Jobs and Resources
  • Hard-Coded in Globus Toolkit
      – Re-engineering using an ontology- and
        rule-based solution
      – RDF and DMTF Vocabulary (www.dmtf.org)
 <rdfs:Class rdf:ID="CIM_ComputerSystem">
   <rdfs:subClassOf rdf:resource="#CIM_System"/>
   <version><![CDATA["2.6.0"]]></version>
   <rdfs:comment parseType="Literal"><![CDATA["A class derived from System that is a special
   collection of ManagedSystemElements. This collection provides compute
   capabilities and serves as aggregation point to associate one or more of the
   following elements: FileSystem, OperatingSystem, Processor and Memory
   (Volatile and/or NonVolatile Storage)."]]></rdfs:comment>
   <rdfs:subClassOf>
     <daml:Restriction>
       <daml:toClass rdf:resource="#string"/>
 Semantic Web and Earth Sciences
• The Semantic Web field provides technologies
  for making vocabularies explicit and mediating data
• Standards-based, many resources available
  – Editors, Rule Engines, APIs
• Effort feeds back into other domains



