					The Semantic Web

A progress report and some observations
      Pat Hayes, IHMC
 The vision of the Semantic Web
• The WWW is a planet-wide system linking computers
  which enables people to communicate, establish links
  and publish content to one another.
• The SW plans to use it to do this with machine-
  usable content, so that software can read it, draw
  conclusions from it and act on it.
• Possible applications include B2B, services,
  improved WWW access, integrated data handling,
  Global Mind…
 The vision of the Semantic Web
• The WWW is:
• Moore’s law + Optic fiber + HTTP + HTML (+ extra
  software goodies such as Javascript)
 The vision of the Semantic Web
• The WWW is:
• Moore’s law + Optic fiber + HTTP + HTML
• The SW will use the first two, and rely on the third for
  now.
• But it needs a new ‘semantic HTML’, i.e. a standard
  reference language for expressing content. This is
  where most of the effort has gone so far.
    Semantic markup languages
• There are several candidate languages now being
  used or proposed:

[Diagram: two families of languages.
 Extensional, ‘layered’: OIL, DAML+OIL, OWL-DL.
 Intensional, non-well-founded: RDF, RDFS, OWL-Full.]
      W3C semantic markup languages

[Diagram: RDF, RDFS, OWL-Full]
Uniform and very simple syntactic model, processable by simple XML engines.
Intensional, non-well-founded semantics.
All RDF/RDFS/OWL assertions are encoded as sets of triples of the form
aaa RRR bbb .
which means RRR(aaa, bbb); all variables are existential; all names are urirefs or literals.
The rest of the family consists of semantic extensions to this basic RDF model.
(There is also a very ugly XML serial syntax.)
<ex:Mary> <ownershipOntologies:had> _:ll .
_:ll <rdf:type> <ex:Lamb> .
_:ll <dimensionOntologies:size> <ex:Little> .
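One way to read the claim that a set of triples is an existential conjunction of binary atoms is to render the graph above as a formula. The sketch below is only an illustration: the function names are made up for this example, and the formula syntax is an ad hoc one, not Lbase or any standard notation.

```python
# The Mary/lamb graph from the slide, as (subject, property, object) tuples.
graph = [
    ("ex:Mary", "ownershipOntologies:had", "_:ll"),
    ("_:ll", "rdf:type", "ex:Lamb"),
    ("_:ll", "dimensionOntologies:size", "ex:Little"),
]


def is_blank(term):
    """Blank nodes (written _:name) play the role of existential variables."""
    return term.startswith("_:")


def show(term):
    """Write a blank node as a variable, any other name unchanged."""
    return "?" + term[2:] if is_blank(term) else term


def to_formula(triples):
    """Render a list of triples as one existentially quantified conjunction."""
    variables = []
    for s, _, o in triples:
        for term in (s, o):
            if is_blank(term) and term not in variables:
                variables.append(term)
    # Each triple 'aaa RRR bbb .' becomes the atom RRR(aaa, bbb).
    atoms = [f"{p}({show(s)}, {show(o)})" for s, p, o in triples]
    body = " and ".join(atoms)
    if variables:
        names = " ".join(show(v) for v in variables)
        return f"exists {names} . ({body})"
    return body


formula = to_formula(graph)
print(formula)
```

The graph has no negation, disjunction or universal quantifier, which is what makes the basic RDF model so simple to process.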
      W3C semantic markup languages

Users are expected to define classes and to use classes and properties defined by
   other users. The urirefs used as names constitute the ‘links’ between
   ontologies, e.g.
_:xx dc:title "My Diary" .
_:xx dc:author _:yy .
_:yy rdf:type biocat:HumanBeing .
_:yy w3:mailbox "phayes@ai.uwf.edu" .
_:yy usgov:ssNumber "567881962"^^xsd:string .
Many of these RDF ontologies already exist (c. 10^6 lines of RDF).
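The prefixed names in the example are abbreviations for full urirefs. A minimal sketch of the expansion follows, assuming the usual namespace URIs for rdf, dc and xsd; the biocat entry is a hypothetical placeholder, and the helper names are invented for this example.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "biocat": "http://example.org/biocat#",  # hypothetical namespace
}


def expand(qname, prefixes=PREFIXES):
    """Turn a prefixed name such as 'dc:title' into a full uriref."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {qname!r}")
    return prefixes[prefix] + local


def namespace_of(uriref):
    """Return the part of a uriref that identifies the defining vocabulary."""
    if "#" in uriref:
        return uriref.split("#", 1)[0] + "#"
    return uriref.rsplit("/", 1)[0] + "/"


title = expand("dc:title")
print(title)
print(namespace_of(title))
```

Two ontologies are linked whenever they use names that expand to the same uriref, with no further machinery needed.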
      Uniform resource identifiers

• Links on the WWW are mostly URLs (a global
  file address scheme), but also URNs and
  others.
• Key SW idea is that a URI locates the ‘owner’
  of any name, i.e. the authoritative source of
  information about the intended meaning.
• NB, the URI is usually not the intended
  denotation.
• The names are the links.
     W3C semantic markup languages

RDFS has vocabulary for talking about properties (binary relations),
  membership in classes, subclass and subproperty relationships, e.g.
rdf:Property rdf:type rdfs:Class .
rdfs:Class rdf:type rdfs:Class .
ph:FatherOf rdfs:subPropertyOf ph:ancestorOf .
Two different classes can have the same members…classes can contain
  themselves…
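The intensional reading can be made concrete by keeping a class name separate from its extension. In the sketch below the ex: names are hypothetical and the helper is invented for illustration; only the rdfs:Class triple comes from the RDFS vocabulary itself.

```python
# Class membership recorded as rdf:type triples.
triples = {
    ("rdfs:Class", "rdf:type", "rdfs:Class"),      # a class that contains itself
    ("ex:MorningStar", "rdf:type", "rdfs:Class"),  # hypothetical classes
    ("ex:EveningStar", "rdf:type", "rdfs:Class"),
    ("ex:Venus", "rdf:type", "ex:MorningStar"),
    ("ex:Venus", "rdf:type", "ex:EveningStar"),
}


def extension(cls, triples):
    """The set of things asserted to be members of the class."""
    return {s for s, p, o in triples if p == "rdf:type" and o == cls}


same_members = extension("ex:MorningStar", triples) == extension("ex:EveningStar", triples)
same_class = "ex:MorningStar" == "ex:EveningStar"
contains_itself = "rdfs:Class" in extension("rdfs:Class", triples)
print(same_members, same_class, contains_itself)
```

Because a class is identified by its name rather than by its set of members, equal extensions do not force the two classes to be identical, and self-membership causes no paradox.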
<ex:Mary> <prop:had> _:xxx .
_:xxx <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> _:ll .
_:ll <http://www.w3.org/2000/01/rdf-schema#subClassOf> <ex:Lamb> .
_:ll <http://www.w3.org/2000/01/rdf-schema#subClassOf> <ex:Little> .
<ex:Lamb> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Class> .
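<owl:Class rdf:about="#OwnersOfOneLittleLamb">
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Restriction>
      <owl:onProperty rdf:resource="prop:had" />
      <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">1</owl:cardinality>
    </owl:Restriction>
    <owl:Restriction>
      <owl:onProperty rdf:resource="prop:had" />
      <owl:someValuesFrom rdf:resource="#LittleLambs" />
    </owl:Restriction>
  </owl:intersectionOf>
</owl:Class>
<Person rdf:about="ex:Mary">
  <prop:had rdf:resource="#MarysLamb" />
</Person>
<owl:Class rdf:about="#LittleLambs">
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Restriction>
      <owl:onProperty rdf:resource="ex:size" />
      <owl:allValuesFrom>
        <owl:Class>
          <owl:oneOf rdf:parseType="Collection">
            <owl:Thing rdf:about="ex:Small" />
          </owl:oneOf>
        </owl:Class>
      </owl:allValuesFrom>
    </owl:Restriction>
    <owl:Class rdf:about="ex:Lambs" />
  </owl:intersectionOf>
</owl:Class>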
     W3C semantic markup languages

RDF: basic assertions (existential conjunctive binary positive logic);
  containers (bags, sequences, lists), XML literals, reification, …
RDFS: classes, subclass, subproperty; property ranges and domains;
  Literals corresponding to all XML Schema datatypes (strings,
  numbers, dates, etc…)
OWL: Notions of transitive, symmetric, functional properties; union,
  intersection and complement of classes; explicit class constructors;
  equality and inequality; classes defined by restrictions on properties.
     W3C semantic markup languages

OWL reasoning is much more complex than ‘bare’ RDF, yet OWL is all
  expressed as RDF triples. The extra complexity comes from extra
  OWL semantic conditions, mostly on the properties, e.g.
ppp rdf:type owl:SymmetricProperty .
aaa ppp bbb .
      owl-entails
bbb ppp aaa .
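The symmetric-property condition above can be sketched as a one-pass closure over a set of triples. This is only an illustration of that single semantic condition, not an OWL reasoner, and the function name is invented for the example.

```python
def owl_symmetric_closure(triples):
    """Add 'bbb ppp aaa' for every 'aaa ppp bbb' whose property is symmetric."""
    triples = set(triples)
    symmetric = {
        s for s, p, o in triples
        if p == "rdf:type" and o == "owl:SymmetricProperty"
    }
    # One pass is enough: reversing a reversed triple gives back the original.
    inferred = {(o, p, s) for s, p, o in triples if p in symmetric}
    return triples | inferred


graph = {
    ("ppp", "rdf:type", "owl:SymmetricProperty"),
    ("aaa", "ppp", "bbb"),
    ("aaa", "qqq", "bbb"),  # qqq is not declared symmetric
}
closed = owl_symmetric_closure(graph)
print(sorted(closed))
```

The input and output are both plain triples; the extra work lies entirely in the rule that is applied to them.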
The extra OWL semantic conditions can all be expressed by giving a
   translation from OWL/RDF into first-order logic.
      Lbase as a foundation formalism

[Diagram: RDF triples written using the RDF/RDFS/OWL vocabularies are
translated into Lbase, which also holds the RDF axioms, the RDFS axioms
and the OWL-Full axioms.]
(Lbase: a subset of CL adapted for SW use)

OWL-Full: same syntactic freedom as RDF.

OWL-Lite: restricted vocabulary; allows frame-like notation.

OWL-DL: restricted syntactic constructions:
individual/literal/class/property vocabularies separated;
no classes of classes, properties of properties, etc.
Extensional; need to distinguish OWL-DL from RDFS categories.
State of play
Final RDF/RDFS specs now being produced
(published about now)
OWL being finalized now, published in next few
weeks.
See W3C website for details


DAML and OIL deployed, esp. by DARPA
intelligence community and DAML-S.
SCL initiative is a ‘fast-track’ effort to define a better
Lbase = subset of CL which is adapted to SW uses
and integrated with RDF/RDFS/OWL .
A small ad-hoc international working group has been
formed and we plan to have a draft standard proposal
written by July 2003.
            Watch This Space…..
      How is the SW going to work?

OK, so you put some machine-readable stuff on your
website. Now what?
Hopefully, someone is going to do something useful with
it, such as put you in touch with customers more
effectively, find your website more efficiently, or draw
some useful conclusions.
All of these assume some kind of collusion between the
publisher and the user, but they also assume a detachment
of purpose. In general, the writer of the content does not
know what the information is going to be used for.
               Transmitting content


The writer of the content does not know what the
information is going to be used for.
 What can the writer assume about the way the
information is used? No more than is in the spec, in
general. But logical semantics only supplies truth-
conditions; and those provide only a very minimal
constraint upon use, even with the strongest possible
assumptions.
                     Transmitting content
What content is in fact transmitted? The idea of “social
meaning” is central, but new for AI/KR.
E.g.
A:  gobshite rdf:type rdfs:Class .
    gobshite rdfs:comment "A gobshite is a contemptible person who habitually tells lies." .

B:  Irish rdfs:subClassOf A#gobshite .

C:  http://www.coginst.uwf.edu/~phayes rdf:type B#Irish .

rdfs-entails:
http://www.coginst.uwf.edu/~phayes rdf:type A#gobshite .
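The entailment follows mechanically once the three documents are merged, whatever their three authors intended. A minimal sketch follows, assuming only the RDFS rule that membership propagates up rdfs:subClassOf; the function name is invented for the example and this is not a full RDFS closure.

```python
def rdfs_type_closure(triples):
    """Propagate rdf:type along rdfs:subClassOf until nothing new appears."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        subclass = [(s, o) for s, p, o in closed if p == "rdfs:subClassOf"]
        typed = [(s, o) for s, p, o in closed if p == "rdf:type"]
        for instance, cls in typed:
            for sub, sup in subclass:
                new = (instance, "rdf:type", sup)
                if sub == cls and new not in closed:
                    closed.add(new)
                    changed = True
    return closed


PHAYES = "http://www.coginst.uwf.edu/~phayes"
doc_a = {("A#gobshite", "rdf:type", "rdfs:Class")}
doc_b = {("B#Irish", "rdfs:subClassOf", "A#gobshite")}
doc_c = {(PHAYES, "rdf:type", "B#Irish")}

# None of the three documents says it alone; the merge does.
merged = rdfs_type_closure(doc_a | doc_b | doc_c)
print((PHAYES, "rdf:type", "A#gobshite") in merged)
```

No single publisher asserted the conclusion, which is why the question of what content has been transmitted, and by whom, is a social one.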
              Transmitting content


Logical semantics only supplies truth-conditions; and
those provide only a very minimal constraint upon use,
even with the strongest possible assumptions.
And we cannot even make the strongest assumptions:
we cannot assume a shared meaning when software agents
are involved, since they have access only to the surface
forms.
    Is this the right thing to be working on?


•Moore’s law + Optic fiber + HTTP + HTML


So far we have been focusing on the ‘semantic HTML’
based on XML. But what we need also is a ‘semantic
HTTP’ to support negotiation of meaning and content.
     We cannot even assume a shared meaning

<ex:Mary> <prop:age> "10" .


What does this literal mean? Seems obvious….
a. It means the number ten.
b. It means the character string ‘10’.
c. It means both the number and the string.
d. It means either the number or the string.
e. It doesn’t mean anything unless associated with a
   datatype, and then what it means depends on the
   datatype.
   a. It means the number ten.
Then it would be impossible to represent property values which were strings or binary numbers.
   b. It means the character string ‘10’.
   c. It means both the number and the string.
Then the range of the property would be a set of pairs, and there is no way to say that in RDF.
   d. It means either the number or the string.
Then the range of the property wouldn’t be well-defined.
   e. It doesn’t mean anything unless associated with a datatype, and then what it means depends on the datatype.
Then two identical literals might mean different things, so one could not identify them.
b. It means the character string ‘10’.

<ex:Mary> <prop:age> "10" .

<ex:Mary> <prop:age> "10"^^<xsd:integer> .
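Under option (b), a plain literal denotes its own character string, and a number is obtained only by attaching a datatype explicitly. The sketch below illustrates that reading; the DATATYPES table and function name are invented for the example, xsd:integer is assumed as the datatype, and Python's int() is only a rough stand-in for the real XML Schema lexical-to-value mapping.

```python
from decimal import Decimal

# Rough stand-ins for a few XML Schema lexical-to-value mappings.
DATATYPES = {
    "xsd:string": str,
    "xsd:integer": int,
    "xsd:decimal": Decimal,
}


def literal_value(lexical, datatype=None):
    """Return what a literal denotes: the string itself unless a datatype is given."""
    if datatype is None:
        return lexical  # option (b): a plain literal means the character string
    if datatype not in DATATYPES:
        raise ValueError(f"unsupported datatype {datatype!r}")
    return DATATYPES[datatype](lexical)


plain = literal_value("10")
typed = literal_value("10", "xsd:integer")
print(repr(plain), repr(typed))
```

Two plain literals with the same characters always denote the same thing, so they can be identified without consulting any datatype.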
The scary part of this story is that it took a group
of <10 reasonably intelligent, dedicated people
more than seven months intensive effort to get to
this point, and nobody is really happy with the
result.




(Is the chandelier in the room or part of the room?)
Tougher case: different universes of discourse.
What is the complement of a class? E.g. what is in the class
of US non-citizens?
What is the range of a quantifier? When integrating
information from various sources, we have to assume that
the quantifiers range over (at least) the union of the
universes assumed by the different sources.
Many data archives and sources are built assuming a
restricted universe. We need universe-protection
mechanisms.
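The complement problem can be seen with ordinary sets. In the sketch below every name and value is made up for illustration: one archive assumes a universe of people, a second source talks about other kinds of thing, and the complement of the same class changes once the universes are merged.

```python
# Hypothetical data: an archive that only ever talks about three people.
archive_universe = {"alice", "bob", "carol"}
us_citizens = {"alice", "bob"}

# A second source with a different universe of discourse.
other_universe = {"carol", "dave", "ex:Paris", 42}


def complement(cls, universe):
    """Everything in the universe that is not in the class."""
    return universe - cls


local_non_citizens = complement(us_citizens, archive_universe)

# Integration forces quantifiers to range over at least the union.
merged_universe = archive_universe | other_universe
merged_non_citizens = complement(us_citizens, merged_universe)

print(local_non_citizens)
print(merged_non_citizens)
```

In the merged universe a city and a number count as US non-citizens, which is the kind of unintended conclusion a universe-protection mechanism would have to block.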
owl:Class vs. rdfs:Class in OWL-DL
A tougher case; time and change.

   Different propositions are true at different times.
   Do we associate times
    … with assertions (tense),
    … with relations (situation reasoning),
    … with physical things (4-d spatiotemporal reasoning)?
   Ans: yes.
   Philosophical/ontological debates have been extremely
   heated, and the moral for the SW is that it is impossible
   to legislate a correct standard answer.
       The ‘standards’ do not agree
•   “Individual: unique existence with a particular space-time extension.” [ISO 15926-2]
    Individuals are 4-d and have locations and times; things and processes are classified
    under same common categorization. Standard in process industry ontologies, eg EPISTLE
    (http://www.epistle.ws/)


•   “Under the concept of Physical, we have the disjoint concepts of Object and
    Process. …. the SUMO assumes a so called 3D orientation, rather than a 4D
    orientation.” [Proposed IEEE Standard Upper Merged Ontology, 2001
    (http://suo.ieee.org/)] Arbitrary choice made by software engineers.

•   “Whereas 1stOrderEntities exist in time and space 2ndOrderEntities occur or take place,
    rather than exist.” [EuroWordNet (expertContrib/ewntop.zip)] Based on ‘endurantist’
    ideas derived originally from Aristotle. Common in linguistic analyses.
A tougher case; time and change.

   Different propositions are true at different times.
   Do we associate times
    … with assertions (tense):  P(a, b) true at t
    … with relations (situation reasoning):  P(a, b, t)
    … with physical things (4-d spatiotemporal reasoning):  P( s(a,t), s(b,t) )
Even when translated into Lbase, these will not
interface easily.
   These vary by how far down the logical syntax you place the time
   parameter.
   Moral: let it ‘float’ and allow the unification algorithm to match
   across levels.
   (Basic rule: a parameter cannot govern any expression containing
   it.)
   Same trick works for simple spatial reasoning, situational
   reasoning, etc.
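A rough way to see what matching across levels would have to achieve is to pull the time parameter out of each of the three forms and compare what is left. The sketch below is not the unification algorithm referred to above; it is a plain normalisation, and it assumes a nested-tuple encoding, an "@" marker for the tensed form, time as the last argument in the relational form, and "s" as the time-slice function.

```python
def normalise(expr):
    """Return (relation, things, time) for any of the three placements of t."""
    head = expr[0]
    if head == "@":                       # P(a, b) @ t
        _, (rel, *args), time = expr
        return rel, tuple(args), time
    rel, *args = expr
    if args and all(isinstance(a, tuple) and a[0] == "s" for a in args):
        times = {a[2] for a in args}      # P( s(a,t), s(b,t) )
        if len(times) != 1:
            raise ValueError("time slices disagree")
        return rel, tuple(a[1] for a in args), times.pop()
    *things, time = args                  # P(a, b, t)
    return rel, tuple(things), time


tensed = ("@", ("P", "a", "b"), "t")
relational = ("P", "a", "b", "t")
four_d = ("P", ("s", "a", "t"), ("s", "b", "t"))

forms = [normalise(f) for f in (tensed, relational, four_d)]
print(forms)
```

All three encodings reduce to the same triple of relation, things and time, which is the sense in which they differ only in how far down the syntax the time parameter sits.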
   This requires altering the logical machinery. This
   violates the academic work-boundary rules, so is very
   hard to achieve in a committee setting.
Maybe this does have something to do with language
after all….
We cannot legislate a world-wide ontology standard; we have to
allow different ways of representing the same content to
co-exist and communicate with each other.
When publishing content we cannot know exactly how
the reader will make use of it.
We have to expect to find misunderstandings and the
need to negotiate intended meanings.
We need to do for machine agents what nature did for
human agents.
Why are almost all XML-based languages unreadable?


     XML was designed as a TEXT MARKUP
     language. The tags describe the tagged text,
     providing ‘metadata’.
     However, it is now widely used as a structure
     description/specification language. In this use, the
     tags describe the same structure that is exemplified
     by the syntactic structure of the XML itself. It is
     being used to describe itself, in effect, like a dancer
     giving a running commentary on her own
     movements.
<sentence type="simpleActive">
<subject>
<nounPhrase type="definite">
<article>The</article>
<noun type="singular" class="animateEntity">cat</noun>
</nounPhrase></subject><verb type="active"
tense="simplePast">sat</verb>
<object><phrase type="locative">
<preposition>on</preposition>
<nounPhrase type="indefinite">
<article>a</article>
<noun type="singular" class="SurfaceObject">mat </noun>
</nounPhrase>
</phrase>
</object>
</sentence>
423 characters
The cat sat on a mat.   21 characters
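The comparison can be checked by parsing the markup and throwing the tags away. The sketch below uses Python's standard xml.etree.ElementTree on a copy of the marked-up sentence; the exact character count of the markup depends on whitespace, so the code computes the ratio rather than assuming a figure.

```python
import xml.etree.ElementTree as ET

MARKUP = """<sentence type="simpleActive">
<subject>
<nounPhrase type="definite">
<article>The</article>
<noun type="singular" class="animateEntity">cat</noun>
</nounPhrase></subject><verb type="active" tense="simplePast">sat</verb>
<object><phrase type="locative">
<preposition>on</preposition>
<nounPhrase type="indefinite">
<article>a</article>
<noun type="singular" class="SurfaceObject">mat</noun>
</nounPhrase>
</phrase>
</object>
</sentence>"""

root = ET.fromstring(MARKUP)
# Keep only the text between tags, dropping whitespace-only fragments.
words = [text.strip() for text in root.itertext() if text.strip()]
sentence = " ".join(words) + "."
overhead = len(MARKUP) / len(sentence)

print(sentence)
print(f"markup is about {overhead:.0f} times longer than the sentence")
```

Everything the tags say about the sentence structure is already carried by the nesting of the XML itself, so the element names add length without adding information.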
<rdfs:Class rdf:ID="elephant">
  <rdfs:subClassOf>
    <rdfs:Class rdf:about="#animal"/>
  </rdfs:subClassOf>
  <rdfs:subClassOf>
    <daml:Restriction>
      <daml:onProperty rdf:resource="#eats"/>
      <daml:toClass>
        <rdfs:Class rdf:about="#plant"/>
      </daml:toClass>
    </daml:Restriction>
  </rdfs:subClassOf>
  <rdfs:subClassOf>
    <daml:Restriction>
      <daml:onProperty rdf:resource="#colour"/>
      <daml:hasValue>
        <daml:ConcreteTypeExpression>
          EQUAL ``grey''
        </daml:ConcreteTypeExpression>
      </daml:hasValue>
    </daml:Restriction>
  </rdfs:subClassOf>
</rdfs:Class>
591 characters
elephant_s are animal_s
[ which eats plant
which color is ‘grey’ ]


62 characters

				