Standards for the Representation of Knowledge on the Semantic Web

Antoine ISAAC
STITCH Project

Offene Archivierbare Formate
Oct. 25th, 2007


Agenda
• Interoperability problems in Cultural Heritage
• An introduction to the Semantic Web
   • The problem
   • RDF
   • RDFS/OWL
   • Why is it interesting?
• Porting existing metadata to the Semantic Web
   • SKOS
• Conclusion: SW and semantic alignment


  The Interoperability Problem in Cultural Heritage

• STITCH
   • SemanTic Interoperability To access Cultural Heritage
   • Here, CH at large (incl. Digital Libraries)


• Trend: simultaneous access to different collections
   • The European Library, Memory of the Netherlands


• Problem: how to access different collections seamlessly?


• Traditional solution: using object metadata
   • But…
[Screenshots: KB Illustrated Manuscripts; Mandragore]


The Interoperability Problems
From syntactic to semantic


• Different formats
   • “We have a solution”
   • XML as a standard for data exchange


• Different metadata schemes
   • “Something is coming”
   • Dublin Core for metadata exchange


The Interoperability Problems
From syntactic to semantic (continued)


• Different conceptual vocabularies for description
   • “Do you really want to discuss it now?”
   • No standard vocabulary
       • DDC, UDC, SWD, LCSH, AAT, Iconclass and myriads of others…
   • Not even a common model for these Knowledge Organization
     Schemes (KOSs)
       • thesauri, classification schemes, subject heading lists…
   • Even worse: there are reasons for this!

The result
               MDS 1
               - Field 1
                 - Field 1.1
               - Field 2
                 - Field 2.1
                 - Field 2.2
                 -…




               MDS 2
               - Field 1
                 - Field 1.1
                 - Field 1.2
                   - Field 1.2.1
                 - Field 1.3
               - Field 2
                 -…

An Ideal Situation

                   MDS 1
                   - Field 1
                     - Field 1.1
                   - Field 2
                     - Field 2.1
                     - Field 2.2
                     -…




                 MDS 2
                 - Field 1
                   - Field 1.1
                   - Field 1.2
                     - Field 1.2.1
                   - Field 1.3
                 - Field 2
                   -…


Why think about the Semantic Web?
• Cf. the Semantic Web Activity page at W3C
   • http://www.w3.org/2001/sw/


• “The Semantic Web provides a common framework
  that allows data to be shared and reused”
• “The Semantic Web is a web of data”
• “It is about common formats for integration and
  combination of data drawn from diverse sources”

SW Problem: The Web for Humans
[Screenshot of a Web page, annotated with what a human reader sees]
• A city
• A flag
• The city’s location
-> Meaning


SW Problem: The Web for Computers?
[Screenshot of a Web page, annotated with what a computer sees]
• Characters, images -> black boxes
• Markup -> layout/display


The Interoperability Problems in CH (reminder)
[Diagram: the metadata schemes MDS 1 and MDS 2 shown several times, each with its own hierarchy of fields]


The Semantic Web Approach: A Web of (Meta)data

[Graph]
• The_Netherlands hasCapital Amsterdam
• Amsterdam type City
• file1 type Article
• Article subClassOf Document
• par3 partOf file1
• par3 subject Amsterdam


A footnote
• Why “(meta)data”?
• Because what is metadata for certain applications can
  indeed be the data for the Semantic Web
• The boundary is blurred


The Semantic Web (1/4)


• Pointing at resources
   • What? Knowledge objects, everything that we may want to
     refer to (including documents)
   • How? Uniform Resource Identifiers (incl. URLs)

A Web of Resources
• myVoc1:Article
• myVoc2:Amsterdam
• http://ex.org/files/file1
• http://ex.org/files/file1#par3
• http://www.ned.nl/rep321


 The Semantic Web (2/4)
 • Pointing at resources: URIs

 • Creating structured assertions involving resources
     • What? Structured assertions with typed links
     • How? RDF (Resource Description Framework)


     Factual knowledge encoded as “triples”
          subject – predicate (property) – object

          http://ex.org/files/file1#par3 – myVoc1:subject – myVoc2:Amsterdam


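Data in an RDF “graph”
• http://ex.org/files/file1 – rdf:type – myVoc1:Article
• http://ex.org/files/file1#par3 – myVoc1:partOf – http://ex.org/files/file1
• http://ex.org/files/file1#par3 – myVoc1:subject – myVoc2:Amsterdam
• http://www.ned.nl/rep321 (node shown without a labelled link)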
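
An RDF graph like this one can be held in memory as a plain set of triples. The following is a minimal Python sketch, not a real RDF library: the URIs and prefixed names come from the slides, while the helper name `match` and the use of plain strings for resources are assumptions made for illustration.

```python
# A graph is just a set of (subject, predicate, object) triples.
FILE1 = "http://ex.org/files/file1"
PAR3 = "http://ex.org/files/file1#par3"

graph = {
    (FILE1, "rdf:type", "myVoc1:Article"),
    (PAR3, "myVoc1:partOf", FILE1),
    (PAR3, "myVoc1:subject", "myVoc2:Amsterdam"),
}


def match(graph, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Which resources have Amsterdam as their subject?
about_amsterdam = [s for s, _, _ in match(graph, p="myVoc1:subject", o="myVoc2:Amsterdam")]
print(about_amsterdam)
```

Because every statement has the same three-part shape, one generic pattern-matching function is enough to query any vocabulary.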


The Semantic Web (3/4)


• Pointing at resources: URIs
• Enabling structured assertions: RDF

• Giving machine-understandable semantics to “building
  blocks”
   • What? Ontologies
      • “Formal definitions of shared conceptual vocabularies”
      • Giving semantics for properties and classes
   • How? RDFS/OWL (Web Ontology Language)


RDF Schema (RDFS)
• Meta-language to create vocabularies
   • “Article” is an (RDFS) Class
       • Denotes a type, a collection of resources (individuals)
   • “subject” is an (RDFS) Property
• Giving semantics to vocabulary elements
   • “Article” has the literal “article” as its label for display
       • myVoc1:Article rdfs:label “article”
   • “Article” is a subclass of the class “Document”
       • myVoc1:Article rdfs:subClassOf myVoc1:Document
   • “subject” is applied to resources of type “Document”
       • myVoc1:subject rdfs:domain myVoc1:Document


RDF Schema (RDFS)
• Different kinds of constructs
   • Assigning domains and ranges of properties
   • Creating hierarchies of classes and properties
   • Labels and informal specifications


• (Some) come equipped with formal semantics
   • R rdf:type C1, C1 rdfs:subClassOf C2 -> R rdf:type C2
   • P rdfs:domain C, R1 P R2 -> R1 rdf:type C
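
The subclass and domain rules can be sketched as a small forward-chaining loop. This is only an illustration of these two rules, not a complete RDFS reasoner; the function name `rdfs_closure` is an assumption, and the facts reuse the myVoc1 examples from the slides.

```python
def rdfs_closure(triples):
    """Apply the subclass and domain rules until no new triple appears."""
    inferred = set(triples)
    while True:
        sub = {(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"}
        dom = {(s, o) for s, p, o in inferred if p == "rdfs:domain"}
        new = set()
        for s, p, o in inferred:
            # R rdf:type C1, C1 rdfs:subClassOf C2 -> R rdf:type C2
            if p == "rdf:type":
                new |= {(s, "rdf:type", c2) for c1, c2 in sub if c1 == o}
            # P rdfs:domain C, R1 P R2 -> R1 rdf:type C
            new |= {(s, "rdf:type", c) for prop, c in dom if prop == p}
        if new <= inferred:
            return inferred
        inferred |= new


facts = {
    ("myVoc1:Article", "rdfs:subClassOf", "myVoc1:Document"),
    ("myVoc1:subject", "rdfs:domain", "myVoc1:Document"),
    ("http://ex.org/files/file1", "rdf:type", "myVoc1:Article"),
    ("http://ex.org/files/file1#par3", "myVoc1:subject", "myVoc2:Amsterdam"),
}
closure = rdfs_closure(facts)
```

With these facts, the loop adds two statements: file1 is a Document (via the subclass rule) and par3 is a Document (via the domain rule).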


Web Ontology Language (OWL)
• Same function as RDFS, but more possibilities, e.g.
   • Characteristics of properties
       • Inverse(hasAuthor, authorOf)
   • Restriction on property usage
       • SubClassOf(Books, restriction(hasISBN minCardinality(1)))
   • Combination and exclusion of classes and properties
       • DisjointClasses(Persons, Books)



• Inherits from AI research and Description Logics
• Comes in different levels of complexity:
   • Lite, DL, Full


Tools to build RDFS/OWL ontologies

Ontological information
• myVoc1:Article – rdfs:subClassOf – myVoc1:Document
• http://ex.org/files/file1 – rdf:type – myVoc1:Article
• http://ex.org/files/file1#par3 – myVoc1:partOf – http://ex.org/files/file1
• http://ex.org/files/file1#par3 – myVoc1:subject – myVoc2:Amsterdam
• http://www.ned.nl/rep321 (node shown without a labelled link)


The Semantic Web (4/4)


• Pointing at resources: documents, knowledge objects
• Enabling structured assertions

• Using “building blocks” with precise semantics


• Controlling existing facts, inferring new ones
   Some of the tasks are delegated from the user to inference
     engines that use the formal semantics of ontologies
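
Both uses of formal semantics, controlling existing facts and inferring new ones, can be sketched with two OWL-style axioms of the kind shown earlier (an inverse property and a pair of disjoint classes). This is a hedged illustration rather than an OWL reasoner: the function names and the `ex:` resources are hypothetical, and only these two axiom types are handled.

```python
def apply_inverse(triples, prop, inverse):
    """Inverse(prop, inverse): add (o, inverse, s) for each (s, prop, o), and vice versa."""
    out = set(triples)
    for s, p, o in triples:
        if p == prop:
            out.add((o, inverse, s))
        elif p == inverse:
            out.add((o, prop, s))
    return out


def disjointness_violations(triples, class_a, class_b):
    """DisjointClasses(a, b): return the resources typed with both classes."""
    typed_a = {s for s, p, o in triples if p == "rdf:type" and o == class_a}
    typed_b = {s for s, p, o in triples if p == "rdf:type" and o == class_b}
    return sorted(typed_a & typed_b)


# Hypothetical data: ex:oddity is (wrongly) declared both a book and a person.
data = {
    ("ex:book1", "hasAuthor", "ex:alice"),
    ("ex:book1", "rdf:type", "Books"),
    ("ex:alice", "rdf:type", "Persons"),
    ("ex:oddity", "rdf:type", "Books"),
    ("ex:oddity", "rdf:type", "Persons"),
}
with_inverse = apply_inverse(data, "hasAuthor", "authorOf")
violations = disjointness_violations(with_inverse, "Persons", "Books")
```

The inverse axiom yields a new fact (ex:alice authorOf ex:book1), while the disjointness axiom flags ex:oddity as inconsistent data.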

Ontological information
• myVoc1:Article – rdfs:subClassOf – myVoc1:Document
• http://ex.org/files/file1 – rdf:type – myVoc1:Article
• http://ex.org/files/file1 – rdf:type – myVoc1:Document (inferred)
• http://ex.org/files/file1#par3 – myVoc1:partOf – http://ex.org/files/file1
• http://ex.org/files/file1#par3 – myVoc1:subject – myVoc2:Amsterdam
• http://www.ned.nl/rep321 (node shown without a labelled link)


RDFS/OWL and Semantic Interoperability


Why is it interesting?
• RDF model is simple
   • Just triples
• There is meaning exploitable by computers
• Resources are universal, hence shareable
   • One resource for one object, used in different places
• Vocabularies for (meta)data are made of resources
   • They can be re-used in different applications
       • RDF does not enforce the use of a specific ontology
   • Their meaning (incl. formal semantics) is shareable

Building on top of the Web
• Web-based resources allow distribution/sharing of
   • documents
   • vocabularies
   • (meta)data
• [Diagram: the triple (par3, subject, Amsterdam), with http://www.geo.org/voc/, http://www.kb.nl/eDepot and http://www.ned.nl/rep321 shown as different owners & locations]


Why is it interesting?
• Using open standards
   • W3C’s URI, XML, RDF, RDFS, OWL


     Footnote: Building on top of XML
     • RDF can be encoded as XML data
        • RDF/XML is the reference syntax, but others are possible


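<!-- The namespace URIs for myVoc1 and myVoc2 are placeholders; the slides do not give them -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:myVoc1="http://ex.org/myVoc1#"
         xmlns:myVoc2="http://ex.org/myVoc2#">
 <rdf:Description rdf:about="http://www.ned.nl/doc321">
  <myVoc1:subject rdf:resource="http://www.geo.org/Amsterdam"/>
 </rdf:Description>
 <rdf:Description rdf:about="http://www.geo.org/The_Netherlands">
  <myVoc2:hasCapital rdf:resource="http://www.geo.org/Amsterdam"/>
 </rdf:Description>
</rdf:RDF>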
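
Since RDF/XML is ordinary XML, a standard XML parser can already recover the triples from a fragment like this. The sketch below uses Python's built-in `xml.etree.ElementTree` and handles only the simple pattern shown here (`rdf:about` on the description, `rdf:resource` on each property); a real RDF parser covers far more of the syntax. The `rdf:RDF` wrapper and the namespace URIs for myVoc1 and myVoc2 are assumptions added to make the fragment well-formed.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The myVoc1/myVoc2 namespace URIs are placeholders, not given in the slides.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:myVoc1="http://ex.org/myVoc1#"
         xmlns:myVoc2="http://ex.org/myVoc2#">
 <rdf:Description rdf:about="http://www.ned.nl/doc321">
  <myVoc1:subject rdf:resource="http://www.geo.org/Amsterdam"/>
 </rdf:Description>
 <rdf:Description rdf:about="http://www.geo.org/The_Netherlands">
  <myVoc2:hasCapital rdf:resource="http://www.geo.org/Amsterdam"/>
 </rdf:Description>
</rdf:RDF>"""


def triples_from_rdfxml(text):
    """Extract (subject, predicate, object) triples from rdf:Description elements."""
    root = ET.fromstring(text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes qualified names as "{namespace}local"
            namespace, local = prop.tag[1:].split("}")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            triples.append((subject, namespace + local, obj))
    return triples


triples = triples_from_rdfxml(RDF_XML)
for t in triples:
    print(t)
```

Note that both descriptions point to the same URI for Amsterdam, which is what lets the two statements join into one graph.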


Problem: Data Population
• How will Semantic Web data be created?
   • Creation of “born-semantic” data?
      • Automatic or manual (tagging)
   • Converting existing databases to SW format
      • Fits the vision of the SW as a place to exchange data



• In the CH situation: porting legacy metadata is
  fundamental


Porting CH Metadata to the Semantic Web
• Requirement: an ontology to create SW-enabled
  representations for metadata
   • “Ontologized” metadata schema


• A first candidate: Dublin Core for metadata schema
   • Well-established set of metadata elements
   • Already coming in RDFS!


Porting KOSs to the Semantic Web


• How about metadata values from Knowledge
  Organization Schemes?
   • E.g. dc:subject values (terms, keywords, classes…)
   • DC does not address the problem of KOS representation


• Why is it important?
   • Their heterogeneity is a primary source of interoperability
     problems
   • They are provided with (informal) semantics
       • Taxonomies, associative networks can be exploited in many
         applications


Porting KOSs to the Semantic Web
• A first solution: converting KOSs to formal ontologies
   • Ontologization of terms/concepts into classes
• Problem: KOSs are generally not full-fledged ontologies
   • Iconclass: “Group of Birds” rdfs:subClassOf “Birds”?
       • There is much work needed to have semantics fit!
   • The concept of a car (reference=a subject in a KOS)
       vs. the class of cars (reference=a set of objects in the world)
       • Things in ontologies and KOSs don’t have the same
         epistemological status


• We need a model for elements of the realm of subjects


Representing KOSs – Requirements

Many different models and formats to represent
 vocabularies
• Need for standard formats to develop standardized
  tools and methods
   • Semantic correspondences
   • Browsing/information retrieval tools using vocabularies
• Need to represent features commonly used by these
  tools
   • Especially lexical information and semantic links


SKOS (Simple Knowledge Organization System)

• Model to represent KOSs (thesauri, classification
  schemes) on the Semantic Web in a simple way
   • Comparable to Dublin Core, for conceptual vocabularies


• SKOS offers building blocks to create XML/RDF data
   • Concepts and ConceptSchemes
   • Lexical properties (prefLabel, altLabel)
   • Semantic relations (broader, related)
   • Notes (scopeNote, definition)
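
The building blocks listed above can be illustrated with a tiny thesaurus expressed as triples. This is a sketch under stated assumptions: the `ex:` concepts, their labels and the scope note are invented for illustration (they are not taken from Iconclass or any real KOS), and following `skos:broader` links upward is an application-level choice made here for retrieval.

```python
# Hypothetical mini-thesaurus using the SKOS building blocks.
skos = {
    ("ex:scheme", "rdf:type", "skos:ConceptScheme"),
    ("ex:animals", "rdf:type", "skos:Concept"),
    ("ex:animals", "skos:inScheme", "ex:scheme"),
    ("ex:animals", "skos:prefLabel", "animals"),
    ("ex:birds", "rdf:type", "skos:Concept"),
    ("ex:birds", "skos:inScheme", "ex:scheme"),
    ("ex:birds", "skos:prefLabel", "birds"),
    ("ex:birds", "skos:altLabel", "fowl"),
    ("ex:birds", "skos:broader", "ex:animals"),
    ("ex:birds", "skos:scopeNote", "Use for birds in general."),
}


def find_by_label(graph, label):
    """Concepts whose preferred or alternative label equals `label` (case-insensitive)."""
    wanted = label.lower()
    return sorted({
        s for s, p, o in graph
        if p in ("skos:prefLabel", "skos:altLabel") and o.lower() == wanted
    })


def ancestors(graph, concept):
    """All concepts reachable from `concept` by following skos:broader links."""
    found, todo = set(), [concept]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if s == current and p == "skos:broader" and o not in found:
                found.add(o)
                todo.append(o)
    return sorted(found)


print(find_by_label(skos, "Fowl"))
print(ancestors(skos, "ex:birds"))
```

Lexical properties support lookup by any label, and semantic relations support hierarchical browsing, which are the two tool features named in the requirements.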


SKOS: Iconclass Example


SKOS: Limitations
• SKOS is a standard
   • Simple
   • Meant for information exchange and re-use
• Not everything can be represented!
   E.g. for Iconclass, it is difficult to represent all types of auxiliaries
   • Keys, structural digits…


• It is still work in progress
   • W3C Semantic Web Deployment Working Group




Back to the Problem: Semantic Alignment
• Different ontologies/individuals should be aligned at
  the semantic level
   • Using the same resources to join SW graphs together
   • Using the same vocabularies and semantics


• But: it is difficult to recognize equivalent resources at
  data creation time
   • There is (and will be) no such thing as a single ontology!


• A posteriori semantic alignment is needed


Back to the Problem: semantic alignment
• Fortunately, SW languages give appropriate means
   • Equivalence/specialization links for properties and classes
       • myVoc:auteur rdfs:subPropertyOf dc:creator
       • myVoc:Article owl:equivalentClass yourVoc:Artikel
   • Identity link for individuals
       • vu:aisaac owl:sameAs kb:AntoineIsaac
   • (not yet stable) SKOS mapping links for subjects
       • iconclass:birds exactMatch swd:vogel



• But they don’t do the job for us!
   • The links have to be created somehow
   • This is another story…
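
Once such links exist, applying them is mechanical. The sketch below shows, under simplifying assumptions, how an `rdfs:subPropertyOf` link and an `owl:sameAs` link let two independently created statements be queried as one: it handles only direct subproperty links, picks the alphabetically smallest name as the representative of each identity group, and uses hypothetical documents `ex:slides` and `ex:report`.

```python
def canonical(resource, same_as):
    """Follow owl:sameAs pairs and return one representative per identity group."""
    group, todo = {resource}, [resource]
    while todo:
        current = todo.pop()
        for a, b in same_as:
            for x, y in ((a, b), (b, a)):
                if x == current and y not in group:
                    group.add(y)
                    todo.append(y)
    return min(group)


def align(triples):
    """Rewrite data triples using the sameAs and subPropertyOf links they contain."""
    same_as = {(s, o) for s, p, o in triples if p == "owl:sameAs"}
    sub_prop = {(s, o) for s, p, o in triples if p == "rdfs:subPropertyOf"}
    out = set()
    for s, p, o in triples:
        if p in ("owl:sameAs", "rdfs:subPropertyOf"):
            continue
        s, o = canonical(s, same_as), canonical(o, same_as)
        out.add((s, p, o))
        # Every statement made with a subproperty also holds for its superproperty.
        out |= {(s, sup, o) for sub, sup in sub_prop if sub == p}
    return out


merged = align({
    ("myVoc:auteur", "rdfs:subPropertyOf", "dc:creator"),
    ("vu:aisaac", "owl:sameAs", "kb:AntoineIsaac"),
    ("ex:slides", "myVoc:auteur", "vu:aisaac"),
    ("ex:report", "dc:creator", "kb:AntoineIsaac"),
})
works = sorted(s for s, p, o in merged if p == "dc:creator" and o == "kb:AntoineIsaac")
print(works)
```

A single query on dc:creator now finds both documents, although they were described with different properties and different identifiers for the same person.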


Thank you!


Vocabulary alignment
• Find correspondences between vocabulary elements
   • “klassieke ruïnes” ≈ “landschap met ruïnes”
   • “maagd Maria”        = “Heilige Moeder”
• STITCH aim: doing it (semi-)automatically
   • Vocabularies are big
   • They evolve over time
• Using techniques from the Semantic Web research domain
   • Problem comparable to ontology alignment
   • Techniques already investigated there
      • Linguistics, statistics


Automatic alignment techniques

•   Lexical
•   Structural
•   Statistical
•   Background knowledge


Lexical alignment


• Labels of entities, textual definitions




     “Long brain tumor” is more specific than “Long tumor”
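
One very simple lexical technique, sketched here as an assumption rather than as the method used in STITCH, compares the word sets of two labels: if one label contains all the words of the other plus extra qualifiers, it is taken to denote a more specific concept. The function names are illustrative.

```python
def tokens(label):
    """Lower-cased set of the words in a label."""
    return set(label.lower().split())


def lexical_relation(label_a, label_b):
    """Compare two labels by their word sets.

    Returns "equivalent", "more specific", "more general", or None.
    Extra words are read as extra qualification, hence a narrower concept.
    """
    a, b = tokens(label_a), tokens(label_b)
    if a == b:
        return "equivalent"
    if a > b:  # a is a strict superset of b
        return "more specific"
    if a < b:  # a is a strict subset of b
        return "more general"
    return None


print(lexical_relation("Long brain tumor", "Long tumor"))
```

Such a heuristic is cheap but fragile, since it ignores word order, morphology and synonyms.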


Statistical alignment
• Object information (e.g. book indexing)
• [Diagram: a collection of books links the concepts “Dutch Literature” and “Dutch” from Thesaurus 1 and Thesaurus 2]
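
When the same books are indexed with both thesauri, the overlap between the sets of books attached to two concepts gives a similarity score. The sketch below uses the Jaccard coefficient, which is one common choice and not necessarily the measure used in the project; the book identifiers, the extra concepts and the 0.5 threshold are all hypothetical.

```python
def jaccard(books_a, books_b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = books_a | books_b
    return len(books_a & books_b) / len(union) if union else 0.0


# Hypothetical indexing data: concept -> identifiers of the books annotated with it.
thesaurus1 = {"Dutch Literature": {"b1", "b2", "b3"}, "Painting": {"b4"}}
thesaurus2 = {"Dutch": {"b1", "b2", "b3", "b5"}, "Art": {"b4", "b6"}}


def candidate_mappings(index_a, index_b, threshold=0.5):
    """Concept pairs whose book sets overlap enough, best score first."""
    pairs = []
    for concept_a, books_a in index_a.items():
        for concept_b, books_b in index_b.items():
            score = jaccard(books_a, books_b)
            if score >= threshold:
                pairs.append((concept_a, concept_b, score))
    return sorted(pairs, key=lambda pair: -pair[2])


mappings = candidate_mappings(thesaurus1, thesaurus2)
for concept_a, concept_b, score in mappings:
    print(concept_a, "~", concept_b, score)
```

This technique needs no lexical similarity at all, but it only works where collections are described by several vocabularies.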

     Alignment using shared background knowledge
     • Using a shared conceptual reference to find links
     • [Diagram: “Publication” and “Calendar”, concepts from Thesaurus 1 and Thesaurus 2, linked through background knowledge]
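
The idea can be sketched as follows: each thesaurus concept is first anchored in a shared background hierarchy, and the relation found there is carried over to the two concepts. The background hierarchy below is hypothetical, and anchoring by lower-cased label equality is a simplifying assumption.

```python
# Hypothetical background hierarchy: concept -> its broader concept.
background = {
    "calendar": "publication",
    "novel": "publication",
    "publication": "document",
}


def is_narrower(concept, other, broader_of):
    """True if `other` is reachable from `concept` by following broader links."""
    seen = set()
    current = broader_of.get(concept)
    while current is not None and current not in seen:
        if current == other:
            return True
        seen.add(current)
        current = broader_of.get(current)
    return False


def relate(label_a, label_b, broader_of):
    """Derive a link between two thesaurus labels from the background hierarchy."""
    a, b = label_a.lower(), label_b.lower()
    if a == b:
        return "equivalent"
    if is_narrower(a, b, broader_of):
        return "narrower than"
    if is_narrower(b, a, broader_of):
        return "broader than"
    return None


print(relate("Calendar", "Publication", background))
```

Neither label comparison nor shared instances would link “Calendar” to “Publication”; only the background knowledge supplies that relation.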


Alignment: no universal solution
• No single technique gives an ideal solution
• Different techniques have to be selected/combined,
  depending on the application case
   • Poor vs. rich semantic structure
   • Extensive vs. limited lexical coverage
   • Existence of collections described by several vocabularies


• Alignment is a difficult research problem


Conclusions: Alignment
• Simple techniques give quick results
   • 12300 Mandragore concepts “accessible” from Iconclass


• They are not reliable enough to be used as the sole
  source
   • Combine them with manual work (checking, completion)


• Semantic alignment is still a difficult research
  problem
   • No technique is perfect
   • Techniques have to be selected/combined, depending on the application case


Demo
• http://prauw.cs.vu.nl/rp33333/MANDRA-SV-ICE-mandraNewNONE, amphibiens (amphibians)


• Blé (wheat)


Conclusions: Representation
• It is possible to produce standardized SW
  representations (SKOS) of conceptual vocabularies
   • And of the metadata that use them
   • Existing techniques for accessing metadata and
     vocabularies (OAI-PMH, XML) make the work easier


• It is useful
   • Re-use/interoperability of application components
     that use the vocabularies
   • Links to elements outside the represented vocabulary
     are easy to represent


Links
• STITCH                         http://stitch.cs.vu.nl
• Demo collections
    • BnF Mandragore             http://mandragore.bnf.fr
    • KB illuminated manuscripts http://www.kb.nl/manuscripts/
• Library-originated integration projects:
    • MSAC search interface      http://sigma.nkp.cz
    • MACS project               http://macs.cenl.org
• Semantic web links
    • Semantic Web at W3C        http://www.w3.org/2001/sw/
    • SKOS                       http://www.w3.org/2004/02/skos/
• Semantic Web projects dealing with Cultural Heritage
   • MuseumFinland            http://www.museosuomi.fi/
   • eCulture                 http://e-culture.multimedian.nl/

Demo (1)
[Screenshot: subject vocabulary of collection 1, with its subjects]

Demo (2)
[Screenshot: hierarchical path from root to selected subject; possible specializations for the selected subject]

Demo (3)
[Screenshot: semantic alignment of subjects activated; document from Collection 2]

Demo (4)
[Screenshot: subject from voc2 aligned to “voc1:amphibians”]