					ICS-FORTH




 Describing Resources on the Web:
The Resource Description Framework
                Vassilis Christophides
                 Dimitris Plexousakis
       Computer Science Department, University of Crete
          Institute for Computer Science - FORTH
                        Heraklion, Crete
         http://www.ics.forth.gr/proj/isst/RDF




            Introduction to Metadata



                         meta
                         data


                         What is the Problem?


   3.6 million Web sites
   Five hundred million or more
    addressable pages on the Web
   High consumer expectations
    conflicting with primitive tools
    and mechanisms
   Uncertain quality, integrity, trust






       The Information Landscape in the Web-era
 The Web changes relationships among
    authors

    publishers

    information intermediaries and distributors

    users

 Lower barriers to “publication”
    rapid dissemination of information and ideas

    less advantage to size or centralization

    greatly expanded access

 Manageability is reduced
    resource discovery is chaotic

    organization is haphazard

    preservation is almost non-existent


The Web Information System vs. Traditional Libraries

   Search systems are motivated by advertising
   Index coverage is unpredictable and limited (1/3)
   Too much recall, too little precision
   Index spam abounds
   Resources (and their names) are volatile
   What about versions, editions, back issues?
   Archiving is presently unsolved
   Authority and quality of service are spotty
   Managing Access Rights is hard





Metadata: Higher Quality Web Information Services
 Traditionally:
    metadata has been understood as “Data about Data”
    help to impose order on chaos
 Example(s):
    a library catalogue contains information (metadata) about
     publications (data)
    a file system maintains permissions (metadata) about files (data)
 Metadata describes other data
    One application's metadata is another application's data
    Metadata can itself be described by metadata (but that doesn't
     make it meta-metadata)
 Example:
    Price lists (metadata) have expiration dates: metadata about
     metadata (It is still just metadata!!)


            Metadata takes Many Forms


     resource discovery
     document administration
     rights management
     content rating
     security and authentication
     archival status
     products and services
     database schemas
     process control or description




                Metadata exists for Almost Anything
   People

   Places

   Objects

   Concepts

   Documents

   Archives

   Databases


     Application: Item and Collection Cataloguing
 Describing individual resources
    documents, pages, images, audio files, etc.

 Describing the content of collections
    Web sites, databases, directories, etc.

 Relationships among Resources
    Tables of Content, chapters, images….

    Site Maps






                Application: Resource Discovery
 Search engines can better “understand”
  the contents of a particular page
     More accurate searches

 Additional information aids precision
     Makes it possible to automate
      searches because less manual
      “weeding” is needed to process the
      search results






                Application: Electronic Commerce
   Metadata can be used to encode
    information needed in all stages of
    electronic commerce
       locating seller/buyer & product
          searching “yellow pages”
       agreeing on terms of sale
          prices, terms of payment,
           contractual information
       transactions
          delivery mechanisms, dates,
           terms
   [Figure labels: Broker, Market place, Providers/Clients]




                Application: Intelligent Agents
 Representation and sharing of
  knowledge
    knowledge exchange
    modeling
 Communication
    user-to-agent, agent-to-agent,
     agent-to-service
 Resource discovery
    gives web-roaming agents the ability
     to “understand” their environment
 [Figure labels: place, service]


                Application: Content Rating
 Empowering users to select which
  kinds of web content they wish to see
 Child Protection
 W3C PICS (Platform for Internet
  Content Selection) working group
     US Communications Decency Act
      of 1996
     simple metadata architecture

     precursor to RDF






                Application: Digital Signatures
 These are key to building the “Web of Trust”
 Required by
    agents

    electronic commerce

    collaboration

 RDF will become the preferred way to
  encode digital signatures on documents and
  on statements about documents






                       Other Applications
 Privacy Preferences and Policies
     describing a user's willingness/
      reluctance to disclose information
      about himself/herself
     describing a site administrator's
      desire to gather information about
      visiting users
 Intellectual Property Rights
     contractual terms related to usage
      and distribution rights to a document






            (Meta)Data Transmission Methods



                                   Trusted Third Party
                                   (explicit HTTP GET)

         Associated With
        (in HTTP header)




                  Embedded (eg META)


                    Metadata Assertions
 The Web is “machine-readable” but
  not “machine-understandable”
 Metadata is useful
    A lot could be gained from
     structured description of pages,
     servers, search services, and
     other resources
 Accommodate multiple varieties of
  metadata
    Metadata requirements will evolve






                A Plethora of Metadata Standards
   Many metadata standards have evolved at different levels, and to
    meet different requirements...






                       Interoperability Issues


   Semantic Interoperability     Standardisation of content      “Let's talk English”               “cat milk sat drank mat”

   Structural Interoperability   Standardisation of form         “Here's how to make a sentence”    “Cat sat on mat. Drank milk.”

   Syntactic Interoperability    Standardisation of expression   “These are the rules of grammar”   “The cat sat on the mat. It drank some milk.”






                      Metadata Challenges
 Many flavours of metadata
     which one do I use?

 Managing change
     new varieties, and evolution
      of existing forms
 Tension between functionality
  and simplicity, extensibility and
  interoperability




    Functions, features, and cool stuff   Simplicity and interoperability


          Towards Metadata for Community Webs

   Group of people sharing a domain of
    discourse and a set of resources (e.g.,
    data, documents, services) and having
    some common interests
       Commerce, Education, Health

   Provide community-specific metadata
    functionality in order to create,
    administer, and access resources
       common semantic, structural, and
        syntactic conventions for exchange
        of resource description information
   [Figure labels: Community Webs, Workplace, Education, Commerce, Health]





    Metadata Interoperability in Community Webs
   Communities of expertise
    (not software vendors)
    are responsible for:
       Semantics
       Registration
       Administration
       Access management
       Authority of data
       Sharing and Distribution
   [Figure labels: Community Webs, Home Pages, Library, Geo, Commerce, Scientific Data, Museums, Whatever...]






           Metadata Implementation Approaches
 Harvesting metadata into a repository (database)
 Distributed Database Search






 Harvesting Metadata into a Repository (database)



                                                    HTML


Query           Repository     Harvester            XML



                                                    Other types   Dynamic document
                                                                   creation from database




                             retrieve resource
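
The harvesting approach on this slide can be sketched in Python. This is a minimal illustration, not the architecture of any particular harvester: the page contents, the example.org URLs and the names MetaHarvester, harvest and query are assumptions made for the example, and the pages are in-memory strings rather than documents fetched over HTTP.

```python
from html.parser import HTMLParser


class MetaHarvester(HTMLParser):
    """Collects <meta name="..." content="..."> pairs embedded in an HTML page."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            content = attributes.get("content")
            if name and content is not None:
                self.metadata[name] = content


def harvest(pages):
    """Build the repository: one metadata record per harvested URL."""
    repository = {}
    for url, html in pages.items():
        parser = MetaHarvester()
        parser.feed(html)
        repository[url] = parser.metadata
    return repository


def query(repository, name, value):
    """Return the URLs whose record has the given metadata value."""
    return sorted(url for url, record in repository.items()
                  if record.get(name) == value)


# Hypothetical pages standing in for harvested HTML documents.
pages = {
    "http://example.org/rdf-tutorial.html":
        '<html><head><meta name="DC.creator" content="Vassilis Christophides">'
        '<meta name="DC.title" content="RDF Presentation"></head></html>',
    "http://example.org/other.html":
        '<html><head><meta name="DC.creator" content="Someone Else"></head></html>',
}

repository = harvest(pages)
hits = query(repository, "DC.creator", "Vassilis Christophides")
print(hits)
```

The query runs against the repository only; the resource itself is retrieved afterwards from the URL returned in the result.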



             Distributed Database Search


                                       Z39.50 Server




Query         Z39.50 Gateway           Z39.50 Server




                                       Z39.50 Server




                   retrieve resource




            Understanding RDF






                            RDF origins



   W3C Metadata Activity 1997-2000
   PICS (Internet content selection)
   Warwick Framework / Dublin Core
   XML (XML Data, Channels etc)
   MCF (Apple, Netscape)
   URI specification for Web identifiers






                         RDF Objectives

   Enables resource description communities to
    define their own semantics
      We can disagree about semantics, but
       share infrastructure (syntax, query,
       editors)
   Imposes structural constraints on the
    expression of various application metadata
      for consistent encoding, exchange and
       processing of metadata on the Web
   Metadata vocabularies can be developed
    without central coordination
      Fine-grained mixing of diverse metadata
   Signed RDF is the basis for trust
   XML used for 'serialisation syntax'


Describing Community Resources using RDF

                          Advanced Knowledge Schemas
                              (ontologies, thesauri)




                                   Heterogeneous
                                resource descriptions



                                Complexity and diversity
                               of information resources
            <tag1>
              <tag2>
              <tag3>
            </tag1>


                 The Basic RDF Data Model

   RDF: Resource Descriptions
      Data Model: Directed Labeled
       Graphs
        Nodes: Resources (URIs) or
         Literals
        Edges: Properties – Attributes
         or Relationships
        Statement: assertion of the
         form resource, property, value
        Description: set of statements
         concerning a resource
      XML syntax
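
The data model above can be sketched in Python. This is only an illustration under stated assumptions: the Statement type and the describe function are names invented here, and the sample statements reuse identifiers that appear elsewhere in these slides (URI:Tutorial, URI:Vassilis, dc:Title, dc:Creator, bib:Name).

```python
from collections import defaultdict
from typing import NamedTuple


class Statement(NamedTuple):
    resource: str  # the node being described, identified by a URI
    prop: str      # the edge label: an attribute or a relationship
    value: str     # another resource (URI) or a literal


statements = [
    Statement("URI:Tutorial", "dc:Title", "RDF Presentation"),
    Statement("URI:Tutorial", "dc:Creator", "URI:Vassilis"),
    Statement("URI:Vassilis", "bib:Name", "Vassilis Christophides"),
]


def describe(stmts):
    """Group statements by resource: a description is the set of
    statements concerning one resource."""
    descriptions = defaultdict(list)
    for s in stmts:
        descriptions[s.resource].append((s.prop, s.value))
    return dict(descriptions)


descriptions = describe(statements)
for resource, properties in descriptions.items():
    print(resource, properties)
```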




       The Basic RDF Data Model: Primitives




                       Property
            Resource                Value
                                   Resource



                       Statement





                       Simple Example




                           Author
            URI:Tutorial            “Vassilis”
                                    URI:Vassilis






                     The notion of Resource
A  resource is identified by a URI:
   [absoluteURI | relativeURI] [“#” fragment-id]
 The resource identified by a URI may be abstract
   i.e. not network retrievable
 Resource is distinct from entity resolved at any particular time
   http://www.ics.forth.gr/RDF/
 From RFC 2396:
   Resource A resource can be anything that has identity. Familiar examples include an
    electronic document, an image, a service (e.g., "today's weather report for Los
    Angeles"), and a collection of other resources. Not all resources are network
    "retrievable"; e.g., human beings, corporations, and bound books in a library can
    also be considered resources. The resource is the conceptual mapping to an entity
    or set of entities, not necessarily the entity which corresponds to that mapping at
    any particular instance in time. Thus, a resource can remain constant even when
    its content---the entities to which it currently corresponds---changes over time,
    provided that the conceptual mapping is not changed in the process.
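
The URI form given above (absolute or relative URI, plus an optional fragment identifier) can be explored with the Python standard library. The relative reference "schema#Painter" is a hypothetical example, not a resource named in these slides.

```python
from urllib.parse import urldefrag, urljoin

base = "http://www.ics.forth.gr/RDF/"

# Resolve a (hypothetical) relative URI against the base URI.
absolute = urljoin(base, "schema#Painter")

# Split the result into the URI proper and its fragment identifier.
uri, fragment = urldefrag(absolute)

print(absolute)
print(uri, fragment)
```

Nothing in this code retrieves anything from the network: the URI serves only as an identifier, which is consistent with resources that are not network retrievable.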


                           RDF Syntax

 RDF Model defines formal
  relationships among resources,
  properties and values
 Syntax is required to...
     Store instances of the model
      into files
     Communicate files from one
      application to another
 W3C XML eXtensible Markup
  Language






       RDF Model Example: Complex Values


Simple values:
   URI:Tutorial --dc:Title--> “RDF Presentation”
   URI:Tutorial --dc:Creator--> “Vassilis Christophides”

<RDF xmlns = "http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc = "http://purl.org/dc/elements/1.0/">
   <Description about = "URI:Tutorial">
     <dc:Title> RDF Presentation </dc:Title>
     <dc:Creator> Vassilis Christophides </dc:Creator>
   </Description>
</RDF>

Complex value (the creator becomes a node with its own properties):
   URI:Tutorial --dc:Creator--> node
   node --bib:Name--> “Vassilis Christophides”
   node --bib:Email--> “christop@ics.forth.gr”
   node --bib:Aff--> “ICS-FORTH” / URI:FORTH


        RDF Syntax Example: Complex Values


<RDF xmlns = "http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc = "http://purl.org/dc/elements/1.0/"
     xmlns:bib = "http://www.bib.org/persons#">
   <Description about = "URI:Tutorial">
     <dc:Title> RDF Presentation </dc:Title>
     <dc:Creator>
       <Description>
         <bib:Name> Vassilis Christophides </bib:Name>
         <bib:Email> christop@ics.forth.gr </bib:Email>
         <bib:Aff resource = "http://www.ics.forth.gr" />
       </Description>
     </dc:Creator>
   </Description>
</RDF>

Abbreviated form of the inner description, with literal property values written as attributes:
       <Description
         bib:Name = "Vassilis Christophides"
         bib:Email = "christop@ics.forth.gr" >
         <bib:Aff resource = "http://www.ics.forth.gr" />
       </Description>



                    RDF Model Example


                dc: Title
                              “RDF Presentation”
URI:Tutorial
               dc: Creator                          admin:By
                                                          “STEP”
                                                    admin:On
                                                         “01-01-01”
               bib:Aff                 bib:Email    admin:For
                            bib:Name
                                                                “...”
      “ICS-FORTH”        “`Vassilis     “christop@
      URI:FORTH          Christophides” ics.forth.gr”




                    Where do you stop?
 The Basic RDF model & syntax provides enabling technology
 Degree of metadata simplicity/complexity is a matter of:
    Resource description communities' needs, best practice and
     experience
    Organization/Institution's Policy

    Economics

    Goals and requirements of implementation






            The Basic RDF Data Model: In Brief

R1 --P1--> R2            Nodes are resources connected by
                         named properties

R1 --P1--> “foo”         The degenerate case is an arc
                         terminating in a fixed value

[Graph of resources R1 to R8      An RDF description consists
 linked by properties P1 to P7]   of a directed graph of
                                  arbitrary complexity


       One Additional Concept: Container Values
   Containers are collections
       they allow grouping of resources (or literal values)

   It is possible to make statements about the container (as a whole) or
    about its members individually
   Different types of containers exist
       Bags -- groups of things

       Sequences -- ordered group of things

       Alternatives -- alternative things/values

          First value is the default
          Must be at least one
   Duplicate values are permitted
       there is no mechanism to enforce unique value constraints

   Syntactic shorthand provided (much like HTML lists)


               Containers (continued)


                 URI:Tutorial

             dc:Creator
                            rdf:type
                                          rdf:Seq

             rdf:_1             rdf:_2


            “Vassilis      “Dimitris
            Christophides” Plexousakis”




               Containers (continued)


                 URI:Tutorial



            dc:Creator   dc:Creator




            “Vassilis        “Dimitris
            Christophides”   Plexousakis”




       The Basic RDF Data Model: Formal Aspects
   Statement := (predicate,subject,object)
   Predicate is a resource
   Subject is a resource
   Object is either a resource or a literal
      Object = Predicate(Subject)

   A model is a set of statements
      Formal model based on triples (Universal relation)



 Example
{author, “http://www.ics.forth.gr/proj/isst/RDF”, node}
{name, node, “Vassilis Christophides” }
{email, node, “christop@ics.forth.gr” }
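
As a small sketch of this formal model, the Python below stores the example as a set of (predicate, subject, object) triples and reads "Object = Predicate(Subject)" as a lookup. The function name objects_of is an assumption made for the illustration; since a property may have several values, the lookup returns a set rather than a single object.

```python
# A model is a set of statements, each a (predicate, subject, object) triple.
model = {
    ("author", "http://www.ics.forth.gr/proj/isst/RDF", "node"),
    ("name", "node", "Vassilis Christophides"),
    ("email", "node", "christop@ics.forth.gr"),
}


def objects_of(predicate, subject, statements):
    """All objects o such that (predicate, subject, o) is in the model."""
    return {o for (p, s, o) in statements if p == predicate and s == subject}


# Because the model is a set, asserting the same statement twice changes nothing.
size_before = len(model)
model.add(("name", "node", "Vassilis Christophides"))
size_after = len(model)

authors = objects_of("author", "http://www.ics.forth.gr/proj/isst/RDF", model)
names = {n for a in authors for n in objects_of("name", a, model)}
print(names)
```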


            Triples for Container Values: Example
 Triples from the first example:
   {dc:Creator, “http://www.ics.forth.gr/proj/isst/RDF”, x}
   {rdf:_1, x, “Vassilis Christophides”}
   {rdf:_2, x, “Dimitris Plexousakis”}
   {rdf:type, x, rdf:Seq}
 Triples from the second example:
   {dc:Creator, “http://www.ics.forth.gr/proj/isst/RDF”,
     “Vassilis Christophides”}
   {dc:Creator, “http://www.ics.forth.gr/proj/isst/RDF”,
     “Dimitris Plexousakis”}
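
The difference between the two encodings can be sketched in Python. The function names and the placeholder node name "x" are assumptions for the illustration, and triples are written in the (predicate, subject, object) order used on the formal model slide.

```python
def seq_triples(subject, prop, members, node="x"):
    """One property whose value is an rdf:Seq container node holding
    the members under the ordinal properties rdf:_1, rdf:_2, ..."""
    triples = [(prop, subject, node), ("rdf:type", node, "rdf:Seq")]
    triples += [(f"rdf:_{i}", node, member)
                for i, member in enumerate(members, start=1)]
    return triples


def repeated_property_triples(subject, prop, values):
    """The same property repeated once per value, with no container."""
    return [(prop, subject, value) for value in values]


tutorial = "http://www.ics.forth.gr/proj/isst/RDF"
creators = ["Vassilis Christophides", "Dimitris Plexousakis"]

as_sequence = seq_triples(tutorial, "dc:Creator", creators)
as_repeated = repeated_property_triples(tutorial, "dc:Creator", creators)

for triple in as_sequence + as_repeated:
    print(triple)
```

The container form records the order of the creators and allows statements about the group as a whole; the repeated form states two independent facts about the tutorial.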






         Edge Labeled Directed Graphs (RDF)

             creator   Vassilis
                                        affiliation


     RDFTutorial                    ICS-FORTH
                       activities               projects


                          ISL                         C-Web

(creator, RDFTutorial, Vassilis)
(affiliation, Vassilis, ICS-FORTH)
(activities, ICS-FORTH, ISL)
(projects, ICS-FORTH, C-Web)
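
A short Python sketch shows how these four triples form an edge-labeled directed graph that can be traversed by following labels. The function names build_graph and follow are assumptions made for the illustration.

```python
from collections import defaultdict

# (label, source, target), as listed on this slide.
triples = [
    ("creator", "RDFTutorial", "Vassilis"),
    ("affiliation", "Vassilis", "ICS-FORTH"),
    ("activities", "ICS-FORTH", "ISL"),
    ("projects", "ICS-FORTH", "C-Web"),
]


def build_graph(edges):
    """Adjacency lists: source node -> list of (edge label, target node)."""
    graph = defaultdict(list)
    for label, source, target in edges:
        graph[source].append((label, target))
    return graph


def follow(graph, start, path):
    """Follow a sequence of edge labels from start; return the nodes reached."""
    frontier = {start}
    for label in path:
        frontier = {target
                    for node in frontier
                    for (edge, target) in graph.get(node, [])
                    if edge == label}
    return frontier


graph = build_graph(triples)
reached = follow(graph, "RDFTutorial", ["creator", "affiliation", "projects"])
print(reached)
```

Adding a new fact to the graph is just appending one more triple; no existing edge has to change.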


              Node labeled Directed Graph (XML)

root
   element foo
      attribute href = “…”
      attribute x = 1
   element bar
      attribute x = 2
      attribute y = 3
      element baz
         attribute z = aaa

<root>
     <foo href="…" x="1" />
     <bar x="2" y="3">
          <baz z="aaa"/>
     </bar>
</root>
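
For comparison with the edge-labeled RDF graph, the node-labeled tree of this XML fragment can be printed with Python's standard ElementTree parser. The walk function is a name invented for this sketch, and the closing </root> tag is added so that the fragment parses; the elided href value is kept as it appears on the slide.

```python
import xml.etree.ElementTree as ET

document = """
<root>
  <foo href="…" x="1"/>
  <bar x="2" y="3">
    <baz z="aaa"/>
  </bar>
</root>
"""


def walk(element, depth=0):
    """List the element and attribute nodes of the tree, indented by depth."""
    lines = ["  " * depth + f"element {element.tag}"]
    for name, value in element.attrib.items():
        lines.append("  " * (depth + 1) + f"attribute {name} = {value}")
    for child in element:
        lines.extend(walk(child, depth + 1))
    return lines


tree_lines = walk(ET.fromstring(document))
print("\n".join(tree_lines))
```

Here the labels (root, foo, bar, baz, and the attribute names) sit on the nodes, and the edges only express containment.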


                What can we Express in RDF?
 RDF relies on an (edge labeled) directed
  graph model that can easily
     be extended by just adding more edges

     combine multiple vocabularies,
      distinguished by their URIs
 RDF provides a standard syntax to
  represent these graphs in XML
     RDF Model can be thought of as a
      simplified XML Infoset
 But RDF goes beyond XML syntactic
  issues
     It allows semantic networks to be
      defined on the Web


                            Semantic Networks
                                          name
                         Person                           String
                                          lives in
                      isa
                                             creates
                         Artist                                    Artifact
                isa                 isa                     isa               isa

               Painter            Sculptor     isa      Painting         Sculpture


                         paints
                                              sculpts

“a Person has a name and lives_in somewhere . Artists are persons, painters
and sculptors are artists. An artist creates artifacts, (paintings or sculptures)
a painter paints paintings and a sculptor sculpts sculptures”


                RDF Schema Definition: RDFS

 Declaration of label vocabularies for description graph nodes & edges
    Enables communities to share machine readable tokens and define
     human readable labels
 Node labels (types) are defined as classes
    Literal data types as defined by XML Schemas WG
    Resource may have a specific 'type' property
 Edge labels (predicates) are defined as properties of these classes
    A resource of given type may have a given property (domain
     constraint)
    A resource of given type may be the value of a given predicate
     (range constraint)
 RDFS vocabularies expressible in the basic RDF model and syntax
    RDFS vocabularies are also Web resources (and have URIs) and
     therefore can be described using RDF
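
The domain and range constraints can be sketched as a small Python check. This is an illustration under assumptions, not the behaviour of any particular validator: the class and property names are borrowed from the semantic network slide (Artist, Painter, paints and so on), the dictionaries and function names are invented here, and each class is given at most one superclass to keep the sketch short.

```python
# Hypothetical schema tables for the illustration.
SUBCLASS_OF = {
    "Painter": "Artist",
    "Sculptor": "Artist",
    "Painting": "Artifact",
    "Sculpture": "Artifact",
}
PROPERTIES = {  # property -> (domain class, range class)
    "creates": ("Artist", "Artifact"),
    "paints": ("Painter", "Painting"),
    "sculpts": ("Sculptor", "Sculpture"),
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a direct or indirect subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def satisfies_constraints(prop, subject_type, object_type):
    """Check a statement's subject and object types against domain and range."""
    domain, range_ = PROPERTIES[prop]
    return is_a(subject_type, domain) and is_a(object_type, range_)


print(satisfies_constraints("paints", "Painter", "Painting"))
print(satisfies_constraints("creates", "Sculptor", "Sculpture"))
print(satisfies_constraints("paints", "Sculptor", "Painting"))
```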


            Constructing and Using RDF schemas

   RDFS Schema Vocabularies
    allow for
       Specialization of both classes
        & properties (simple &
        multiple)
       Multiple classification of
        resources under several
        classes
       Unordered, optional, and multi-
        valued properties
       Domain and range
        polymorphism of properties
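
One consequence of property specialization can be sketched in Python: a statement made with a specialized property also holds for the properties it specializes. The table SUB_PROPERTY_OF, the function name and the resource identifiers picasso132, rodin424, guernica and thinker are hypothetical names chosen for the illustration; triples are in (predicate, subject, object) order.

```python
# Hypothetical property hierarchy: paints and sculpts specialize creates.
SUB_PROPERTY_OF = {"paints": "creates", "sculpts": "creates"}


def with_superproperties(statements, sub_property_of):
    """Add, for every statement, the same statement restated with each
    property that its property directly or indirectly specializes."""
    closed = set(statements)
    for predicate, subject, obj in statements:
        parent = sub_property_of.get(predicate)
        while parent is not None:
            closed.add((parent, subject, obj))
            parent = sub_property_of.get(parent)
    return closed


statements = [
    ("paints", "picasso132", "guernica"),
    ("sculpts", "rodin424", "thinker"),
]

closed = with_superproperties(statements, SUB_PROPERTY_OF)
creators = sorted(s for (p, s, o) in closed if p == "creates")
print(creators)
```

A query for everything that was created then finds both the painting and the sculpture, although neither was described with the creates property directly.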



A Cultural Community Resource Description Example

Portal Schema (property: domain -> range):
   fname: Artist -> String
   lname: Artist -> String
   creates: Artist -> Artifact
   exhibited: Artifact -> Museum
   sculpts: Sculptor -> Sculpture
   paints: Painter -> Painting
   technique: Painting -> String
   last_modified: ExtResource -> Date
   title: ExtResource -> String

Resource descriptions:
   &r5: lname “Rodin”, creates &r1
   &r6: fname “Pablo”, lname “Picasso”, paints &r2, paints &r3
   &r2: exhibited &r4, technique “oil on canvas”
   &r4: title “Reina Sofia Museum”, last_modified 2000/06/09
   &r3: last_modified 2000/01/02

Resources:
   r1: www.rodin.fr/thinker.gif
   r2: museoreinasofia.mcu.es/guernica.jpg
   r3: www.artchive.com/woman.jpg
   r4: museoreinasofia.mcu.es




               RDF/XML Serialization: Data
  <rdf:RDF xml:lang="en"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs="http://www.w3.org/TR/2000/PR-rdf-schema-20000327#"
      xmlns="">
  <Painter rdf:ID="picasso132">
     <fname>Pablo</fname>
     <lname>Picasso</lname>
     <paints>
       <Painting rdf:about="http://museoreinasofia.mcu.es/guernica.jpg">
        <exhibited>
          <Museum rdf:about="http://museoreinasofia.mcu.es"/>
        </exhibited>
        <technique>oil on canvas</technique>
       </Painting>
     </paints>
     <paints>
       <Painting rdf:about="http://www.artchive.com/woman.jpg"/>
     </paints>
  </Painter>
  <ExtResource rdf:about="http://museoreinasofia.mcu.es">
     <title>Reina Sofia Museum</title>
     <last_modified>2000/06/09</last_modified>
  </ExtResource>
  <Sculptor rdf:ID="rodin424" lname="Rodin">
     <creates>
       <Sculpture rdf:about="http://www.rodin.fr/thinker.gif"/>
     </creates>
  </Sculptor>
  </rdf:RDF>


                  RDF/XML Serialization: Schema
<rdf:RDF xml:lang="en"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:rdfs="http://www.w3.org/TR/2000/PR-rdf-schema-20000327#">
<rdfs:Class rdf:ID="Artist"/>
<rdfs:Class rdf:ID="Artifact"/>
<rdfs:Class rdf:ID="Style"/>
<rdfs:Class rdf:ID="Museum"/>
<rdfs:Class rdf:ID="Sculptor">
    <rdfs:subClassOf rdf:resource="#Artist"/>
</rdfs:Class>
<rdfs:Class rdf:ID="Painter">
    <rdfs:subClassOf rdf:resource="#Artist"/>
</rdfs:Class>
<rdfs:Class rdf:ID="Sculpture">
    <rdfs:subClassOf rdf:resource="#Artifact"/>
</rdfs:Class>
<rdfs:Class rdf:ID="Painting">
    <rdfs:subClassOf rdf:resource="#Artifact"/>
</rdfs:Class>
<rdf:Property rdf:ID="creates">
    <rdfs:domain rdf:resource="#Artist"/>
    <rdfs:range rdf:resource="#Artifact"/>
</rdf:Property>
<rdf:Property rdf:ID="paints">
    <rdfs:domain rdf:resource="#Painter"/>
    <rdfs:range rdf:resource="#Painting"/>
    <rdfs:subPropertyOf rdf:resource="#creates"/>
</rdf:Property>
<rdf:Property rdf:ID="sculpts">
    <rdfs:domain rdf:resource="#Sculptor"/>
    <rdfs:range rdf:resource="#Sculpture"/>
    <rdfs:subPropertyOf rdf:resource="#creates"/>
</rdf:Property>
<rdf:Property rdf:ID="exhibited">
    <rdfs:domain rdf:resource="#Painting"/>
    <rdfs:range rdf:resource="#Museum"/>
</rdf:Property>
<rdf:Property rdf:ID="technique">
    <rdfs:domain rdf:resource="#Painting"/>
    <rdfs:range rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Literal"/>
</rdf:Property>
<rdf:Property rdf:ID="title">
    <rdfs:domain rdf:resource="#ExtResource"/>
    <rdfs:range rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Literal"/>
</rdf:Property>
….
</rdf:RDF>


                 RDF/S vs. Well-Known Formalisms
   Relational or Object Database Models (ODMG, SQL)
      Classes don't define table or object types

      Instances may have quite different associated properties

      Collections with heterogeneous members



   Semistructured or XML Data Models (OEM, UnQL, YAT, XML Schema)
      Schema labels on both nodes and edges

      Class and property subsumption is not captured

      Heterogeneous structures reminiscent of SGML exceptions



   Knowledge Representation Languages (Telos, DL, F-Logic)
      Absence of complex values and n-ary relationships (bags, sequences)



                       Some RDF Applications
   • Web Browsers:
      - Netscape 6 from Netscape/AOL uses RDF to integrate various data-oriented
        applications such as bookmarks, mail/news, channels, etc. as well as for
        smart browsing and related links (RDF annotation services)
      - Amaya Editor/Browser from W3C uses RDF to support user annotations on
        Web pages as metadata
   • Brokers/Portals:
      - RSS (RDF Site Summary) XML/RDF Specification 1.0 2000
      - Web Service Description Language (WSDL) XML/RDF Specification 2000
      - PICS Rating Vocabularies in XML/RDF W3C NOTE 27 March 2000
      - Platform for Privacy Preferences and RDF/RDF W3C Draft 10 May 2000

   • Content Management:
      - OCLC Dublin Core Elements in RDF
      - ICOM-CIDOC Conceptual Reference Model in RDF
      - The Wordnet Lexical Ontology in RDF
      - European Treasury Browser in RDF


Example: Annotation & Recommendation Services






                       Practical notes on RDF
   • Authoring/Visualization
      - by hand (experts only, perhaps copy & paste)
      - support by other tools (editors like Stanford Protégé)
      - conversion from existing data stores (using XSLT)
      - visualize RDF graphs (using Rudolf RDFViz)
   • Parsing/Validating
      - ICS-FORTH Validating RDF Parser (VRP)
      - Rapier RDF Parser
      - W3C Simple RDF Parser & Compiler (SiRPAC)
   • Storing/Querying
      - ICS-FORTH RSSDB/RQL
      - Aidministrator Sesame
      - Redland Squish
      - R.V.Guha RDFdb
   • Harvesting/Crawling
      - AIFB RDF Crawling




            ICS-FORTH RDF R&D Activities
       (http://www.ics.forth.gr/proj/isst/RDF)






                The ICS-FORTH RDFSuite

 • The Validating RDF Parser (VRP): Karsten Tolle Diploma Thesis
    - The first RDF Parser supporting semantic validation of both
      resource descriptions and schemas
 • The RDF Schema Specific DataBase (RSSDB): Sophia Alexaki
   M.Sc. Thesis
    - The first RDF Store using schema knowledge to automatically
      generate an Object-Relational (SQL3) representation of RDF
      metadata and load resource descriptions
 • The RDF Query Language (RQL): Greg Karvounarakis
   M.Sc. Thesis
    - The first Declarative Language for uniformly querying RDF
      schemas and resource descriptions



            The ICS-FORTH RDFSuite Architecture

[Figure: RDFSuite architecture diagram. Components shown:
   VRP: Parser, Validator, VRP Internal RDF Model
   RDF Loader: Loading RDF Java APIs, connected to the database via JDBC / SQL3
   ORDBMS: tables Class (c_name), Property (p_name, domain, range),
      SubClass (subcl, supcl), SubProperty (subpr, suppr), plus per-class
      tables (URI) and per-property tables (source, target); sample entries
      include the classes Hotel and Hotel Dir and the property title
      (domain Resource, range Literal)
   DBMS RDF query APIs: C++ LIB, SQL3 + SPI functions
   RQL Interpreter: Parser, Graph Constructor, Typing, Evaluation]



                           The Validating RDF Parser (VRP)

[Figure: VRP pipeline. RDF/XML descriptions (an <rdf:RDF> element with
 xmlns:rdf, xmlns:rdfs and default xmlns declarations around nested tags)
 enter the Parser, made of a Lexical Analyzer, a Syntax Analyzer and a
 Namespace Manager. The Parser fills the VRP Internal RDF Model, which the
 Validator checks. Outputs: RDF graph model, RDF triple model
 (subject, predicate, object).]

 • The VRP parser checks only if an RDF file is well-formed according to
   the RDF M & S Spec
 • The VRP validator checks if the model (i.e. triples) generated by the
   parser satisfies the constraints imposed by the RDF Schema Spec


                               The RDF to DBMS Loader

[Figure: loader object model and resulting database rows.
   Input RDF Model: classes C1, C2 related by property P1; resources r1, r2
      related by P1
   Loader classes (RDF Loading APIs):
      Resource        URI
      RDF_Resource    rdf:type, ...; store()
      RDF_Class       rdfs:subClassOf; store()
      RDF_Property    rdfs:domain, rdfs:range, rdfs:subPropertyOf, link_list; store()
      RDF_Statement   rdf:predicate, rdf:subject, rdf:object
   Other components: Extended VRP Validator, Persistent Namespace (DBMS),
      Additional Constraints, RDF Querying APIs
   Example objects:
      RDF_Resource@7844   URI r1, rdf:type ns#C1
      RDF_Class@2344      URI ns#C1, rdf:type rdfs#Class
      RDF_Property@5678   URI ns#P1, rdf:type rdf#Property, rdfs:range ns#C2,
                          rdfs:domain ns#C1, link_list (r1,r2)
   Resulting DBMS tables:
      Class (c_name):                    ns#C1
      Property (p_name, domain, range):  ns#P1, ns#C1, ns#C2
      ns#C1 (URI):                       r1
      ns#P1 (source, target):            r1, r2]


    RSSDB Representation of RDF metadata

   Namespace
      id   uri
      1    http://www.w3.org/2000/01/rdf-schema#
      2    http://www.w3.org/1999/02/22-rdf-syntax-ns#
      3    http://www.odp.org/schema.rdf#
      4    http://www.arts.org/schema.rdf#
      5    http://www.dc.org/schema.rdf#

   Type
      id   nsid   lpart
      1    1      Literal
      2    1      Bag
      3    1      Seq

   Class
      id   nsid   lpart
      10          DataResource
      11   2      ExternalPage
      12   3      Arts
      13   3      Art_History

   Property
      id   nsid   lpart   domainid   rangeid
      14   4      title   10         1
      15   2      title   11         1

   SubClass
      subid   superid
      11      10
      12      10
      13      12

   SubProperty
      subid   superid
      15      14

   Instance tables: t10, t11, t12, t13 (URI) for the classes and
   t14, t15 (source, target) for the properties, linked by subtable
   relationships
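
As a rough illustration of how the SubClass table encodes the class hierarchy, the sketch below loads the Class and SubClass rows from the slide into an in-memory SQLite database and walks the hierarchy with a recursive query. SQLite merely stands in for the object-relational (SQL3) store, and the `superclasses` helper is a hypothetical name; RSSDB's actual queries are not shown in the slides.

```python
import sqlite3

# Illustrative only: plain SQLite tables standing in for the SQL3 schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Class (id INTEGER PRIMARY KEY, nsid INTEGER, lpart TEXT);
CREATE TABLE SubClass (subid INTEGER, superid INTEGER);
""")

# Rows copied from the slide; the nsid of class 10 is not legible there,
# so it is left NULL here.
conn.executemany("INSERT INTO Class VALUES (?, ?, ?)", [
    (10, None, "DataResource"),
    (11, 2, "ExternalPage"),
    (12, 3, "Arts"),
    (13, 3, "Art_History"),
])
conn.executemany("INSERT INTO SubClass VALUES (?, ?)",
                 [(11, 10), (12, 10), (13, 12)])


def superclasses(class_id):
    """Local names of all direct and indirect superclasses of class_id."""
    rows = conn.execute("""
        WITH RECURSIVE up(id) AS (
            SELECT superid FROM SubClass WHERE subid = ?
            UNION
            SELECT s.superid FROM SubClass s JOIN up ON s.subid = up.id
        )
        SELECT c.lpart FROM up JOIN Class c ON c.id = up.id ORDER BY c.id
    """, (class_id,))
    return [lpart for (lpart,) in rows]


print(superclasses(13))
```

The same pattern applies to SubProperty, where row (15, 14) records that property 15 specializes property 14.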



                 The RDF Query Language (RQL)
   • Declarative query language for RDF description bases
      - relies on a typed data model (literal & container types + union types)
      - follows a functional approach (basic queries and filters)
      - adapts the functionality of semistructured or XML query languages to
        RDF, but also:
         - treats properties as self-existent individuals
         - exploits taxonomies of node and edge labels
         - allows querying of schemas as semistructured data

   • Relational interpretation of schemas & resource descriptions
      - Classes (unary relations)
      - Properties (binary relations)
      - Containers (n-ary relations)


                Browsing Portal Catalogs with RQL
   • Simple set queries on class and property extents:
      - Find the resources in the extent of the property creates
           creates
        (the extent includes the subproperties paints & sculpts)
         {{ [www.portal.gr/rodin424, www.rodin.fr/thinker.gif],
             [www.portal.gr/picasso132, museoreinasofia.mcu.es/guernica.gif],
             [www.portal.gr/picasso132, www.artchive.com/woman.jpg] }}

      - Find the resources classified under both ExtResource and Sculpture
           ExtResource intersect Sculpture
        (multiply classified resources)
         {{ www.rodin.fr/thinker.gif }}

   • Schema constructs used as query terms & support for automatic
     query expansion (similar to thesauri-based IRS)
      - Useful to query resources with minimal schema knowledge
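
The two queries above can be mimicked with ordinary sets. The Python sketch below is an illustration under assumptions, not RQL's evaluator: the slide gives only the combined extent of creates, so the assignment of each pair to paints or sculpts is assumed, as are the helper names and the second member of the ExtResource extent.

```python
# Subproperty edges as described on the slide.
SUBPROPERTY_OF = {"paints": "creates", "sculpts": "creates"}

# Direct extents; which subproperty each pair belongs to is an assumption.
DIRECT_EXTENT = {
    "creates": set(),
    "paints": {
        ("www.portal.gr/picasso132", "museoreinasofia.mcu.es/guernica.gif"),
        ("www.portal.gr/picasso132", "www.artchive.com/woman.jpg"),
    },
    "sculpts": {("www.portal.gr/rodin424", "www.rodin.fr/thinker.gif")},
}

# Class extents; the contents of ExtResource are partly assumed.
CLASS_EXTENT = {
    "ExtResource": {"www.rodin.fr/thinker.gif", "museoreinasofia.mcu.es"},
    "Sculpture": {"www.rodin.fr/thinker.gif"},
}


def specializes(prop, ancestor):
    """True if prop equals ancestor or is a (transitive) subproperty of it."""
    while prop is not None:
        if prop == ancestor:
            return True
        prop = SUBPROPERTY_OF.get(prop)
    return False


def extent(prop):
    """Pairs related by prop or by any of its subproperties."""
    pairs = set()
    for name, direct in DIRECT_EXTENT.items():
        if specializes(name, prop):
            pairs |= direct
    return pairs


print(sorted(extent("creates")))
print(CLASS_EXTENT["ExtResource"] & CLASS_EXTENT["Sculpture"])
```

Asking for `creates` returns pairs that were stored under `paints` and `sculpts`, which is the automatic query expansion the slide refers to.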


          Personalizing Portal Catalogs with RQL
   • Navigational queries on semistructured resource descriptions
      - Find the Museum resources that have been modified in year 2000.
      select x
      from Museum{x}.last_modified{y}
      where y >= 2000/01/01
        (data paths not foreseen in the schema)

      {{museoreinasofia.mcu.es}}

   • Similar functionality to semistructured or XML query languages (Lorel,
     UnQL, XQL, XML-QL, XML-GL)
      - Useful in the absence of schema information or when multiple
        schemas are used to describe resources
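
A small Python sketch of the navigational query above, written over plain sets of pairs. Only the museoreinasofia.mcu.es entry and its date come from the slides; the two example.org resources and the function name are hypothetical, added to show a resource that fails the date filter and one that has no last_modified value at all.

```python
from datetime import date

# Extent of the class Museum (the example.org entries are hypothetical).
MUSEUM = {"museoreinasofia.mcu.es", "museum.example.org", "gallery.example.org"}

# Extent of the property last_modified as (source, target) pairs.
LAST_MODIFIED = {
    ("museoreinasofia.mcu.es", date(2000, 6, 9)),
    ("museum.example.org", date(1999, 11, 30)),
}


def museums_modified_since(cutoff):
    """Museum resources x with some last_modified value y >= cutoff."""
    return {x for (x, y) in LAST_MODIFIED if x in MUSEUM and y >= cutoff}


print(museums_modified_since(date(2000, 1, 1)))
```

Resources without the property simply do not contribute to the result, so the query works even when descriptions are irregular.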


    Querying Portal Catalogs with Large Schemas
   • Filtering both resource descriptions and schemas
      - Find the paintings having as technique “oil on canvas” that have
        been created by a neo-impressionist painter

        (schema filtering on class hierarchies; data filtering with
         schema information)

       select y
       from {:$X}creates{y:Painting}.technique{z}
       where $X <= neo-impressionist and z = “oil on canvas”
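
The query combines a schema condition (the class of the creator must fall under neo-impressionist) with data conditions (the target is a Painting whose technique is “oil on canvas”). The Python sketch below imitates that combination; every identifier prefixed `ex:`, the class names pointillist and cubist, and the function names are hypothetical, since the slides do not list the underlying data.

```python
# Hypothetical class hierarchy and data; "ex:" identifiers are made up.
SUBCLASS_OF = {
    "pointillist": "neo-impressionist",
    "neo-impressionist": "Painter",
    "cubist": "Painter",
    "Painter": "Artist",
    "Painting": "Artifact",
}
TYPE_OF = {
    "ex:artist1": "pointillist",
    "ex:artist2": "cubist",
    "ex:p1": "Painting",
    "ex:p2": "Painting",
    "ex:p3": "Painting",
}
CREATES = {("ex:artist1", "ex:p1"), ("ex:artist1", "ex:p2"), ("ex:artist2", "ex:p3")}
TECHNIQUE = {("ex:p1", "oil on canvas"), ("ex:p2", "chalk"), ("ex:p3", "oil on canvas")}


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def oil_paintings_by(artist_class):
    """Paintings in oil on canvas created by instances of artist_class."""
    return {
        y
        for (x, y) in CREATES
        # Schema filtering: walk the class hierarchy for creator and target.
        if is_a(TYPE_OF[x], artist_class) and is_a(TYPE_OF[y], "Painting")
        # Data filtering: follow the technique property from the painting.
        for (w, z) in TECHNIQUE
        if w == y and z == "oil on canvas"
    }


print(oil_paintings_by("neo-impressionist"))
```

The creator `ex:artist1` is typed only as pointillist, yet it qualifies because the class hierarchy places pointillist under neo-impressionist.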






                Querying Portal Schemas with RQL
   • Pure schema queries
      - Find the properties which specialize the property creates and may
        have as domain the class Painter along with their corresponding
        range classes
      select @P, $Y
      from {:Painter}@P{:$Y}
      where @P <= creates
        (all properties defined or inherited in class Painter;
         schema filtering on property hierarchies)

        {{ [creates, Artifact],
            [creates, Painting],
            [creates, Sculpture],
            [paints, Painting] }}
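
The result above can be reproduced with a short Python sketch over the schema alone. The domains and ranges of paints and sculpts are read off the schema graph in these slides and should be treated as assumptions, as should the helper names; the sketch only illustrates the reasoning (property subsumption, inherited domains, range subclasses), not how RQL evaluates the query.

```python
# Assumed schema: subclass edges, subproperty edges, (domain, range) pairs.
SUBCLASS_OF = {"Painter": "Artist", "Sculptor": "Artist",
               "Painting": "Artifact", "Sculpture": "Artifact"}
SUBPROPERTY_OF = {"paints": "creates", "sculpts": "creates"}
CLASSES = ["Artist", "Painter", "Sculptor", "Artifact", "Painting", "Sculpture"]
PROPERTIES = {
    "creates": ("Artist", "Artifact"),
    "paints": ("Painter", "Painting"),
    "sculpts": ("Sculptor", "Sculpture"),
}


def reaches(node, ancestor, parent):
    """True if node equals ancestor or reaches it by following parent edges."""
    while node is not None:
        if node == ancestor:
            return True
        node = parent.get(node)
    return False


def specializations_for(prop, domain_class):
    """(property, range class) pairs for properties <= prop usable on domain_class."""
    result = []
    for name, (domain, range_) in PROPERTIES.items():
        if not reaches(name, prop, SUBPROPERTY_OF):
            continue  # property does not specialize prop
        if not reaches(domain_class, domain, SUBCLASS_OF):
            continue  # property is neither defined on nor inherited by domain_class
        # The range class and all of its subclasses are valid targets.
        result.extend((name, c) for c in CLASSES if reaches(c, range_, SUBCLASS_OF))
    return result


print(specializations_for("creates", "Painter"))
```

sculpts is excluded because its domain, Sculptor, is not a superclass of Painter.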




                                      RQL: Examples

[Figure: example schema graph. Node and edge labels shown:
   classes ns1#Artist, ns1#Painter, ns1#Sculptor, ns1#Impressionist,
      ns1#PostImpressionist, ns1#Artifact, ns1#Painting, ns1#Sculpture,
      ns1#Style, odp#ExtPage
   literal types String, Date
   properties ns1#fname, ns1#lname, ns1#creates, ns1#paints, ns1#sculpts,
      ns1#has_style, ns1#has_material, dc#last_modified]

   • Similar functionality to DBMS schema QLs (SchemaSQL, XSQL)
      - Useful for large schemas (integrating ontologies and thesauri)


                        Putting it all Together
   • Nested schema and data queries
      - Find the resources modified after 2000/01/01 which can be reached
        by a property applied to the class Painting and its subclasses
      select R, y
      from (select @P
               from {:$X}@P
               where $X <= Painting){R}.{y}last_modified{z}
      where z >= 2000/01/01
        (R ranges over the labels of type property)

        {{ [exhibited, museoreinasofia.mcu.es] }}

   • Subcommunities may use different schemas while sharing the same
     description base
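
The nested query first computes a set of property names from the schema and then uses each of them to navigate the data. The Python sketch below follows the same two steps. The property domains are taken from the portal schema in these slides (the domain of last_modified is assumed), the triples are the ones appearing in the query results, and the function names are hypothetical.

```python
from datetime import date

SUBCLASS_OF = {"Painting": "Artifact"}

# Domain class of each property (last_modified's domain is an assumption).
DOMAIN = {"exhibited": "Painting", "technique": "Painting",
          "last_modified": "ExtResource"}

TRIPLES = [
    ("museoreinasofia.mcu.es/guernica.gif", "exhibited", "museoreinasofia.mcu.es"),
    ("museoreinasofia.mcu.es/guernica.gif", "technique", "oil on canvas"),
    ("museoreinasofia.mcu.es", "last_modified", date(2000, 6, 9)),
]


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def painting_properties():
    """Inner query: properties whose domain is Painting or one of its subclasses."""
    return {p for p, d in DOMAIN.items() if is_a(d, "Painting")}


def reachable_and_modified_since(cutoff):
    """Outer query: (property, target) pairs whose target was modified since cutoff."""
    props = painting_properties()
    modified = {s: o for s, p, o in TRIPLES if p == "last_modified"}
    return {(p, o) for s, p, o in TRIPLES
            if p in props and o in modified and modified[o] >= cutoff}


print(reachable_and_modified_since(date(2000, 1, 1)))
```

The technique triple is dropped because its target, a literal, has no last_modified value.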


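                                    RQL: Examples

[Figure: portal schema and sample descriptions.
   Portal Schema: class Painting with the properties exhibited (to Museum)
      and technique (to String)
   Data: resources &r2, &r3, &r4 with the edges exhibited, technique
      (“oil on canvas”) and last_modified (2000/06/09, 2000/01/02)]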






                           Putting it all Together
 • Schema and data queries
    - Find all metadata about the resources of the site
      museoreinasofia.mcu.es (URL pattern matching)
   select x, $$Y, @P, z, $$W
   from {x:$$Y}@P{z:$$W}
   where x like “*museoreinasofia.mcu.es*” or
          z like “*museoreinasofia.mcu.es*”
      {{[www.portal.gr/picasso132, Painter, paints, museoreinasofia.mcu.es/guernica.gif,
       Painting],
       [museoreinasofia.mcu.es/guernica.gif, Painting, exhibited, museoreinasofia.mcu.es,
       Museum],
       [museoreinasofia.mcu.es/guernica.gif, Painting, technique, “oil on canvas”, string],
       [museoreinasofia.mcu.es, ExtResource, title, “Reina Sophia Museum”, string],
       [museoreinasofia.mcu.es, ExtResource, last_modified, 2000/06/09, date], ….}}
 • Subcommunities may use both different schemas and description
   bases


                  RQL Query Processing
select y
from {x}creates{y:Painting}.has_material{z}
where z = “oil on canvas”



select y
from creates A, has_material B, D $C
define x = A.source, y = A.target, w = B.source, z = B.target,
       R = range(creates), D = subclassOf(R), E = ^($C)
where z = “oil on canvas” and y = w and $C = Painting and y in E





                    RQL Query Optimization

[Figure: algebraic plan for the query, shown before and after rewriting.
   Operators, top to bottom: Project y; Select z = “oil on canvas”;
      Join y = w; SemiJoin y in ^($C), rewritten to y in ^Painting and
      then to y = p; Select $C = Painting
   Inputs: creates[x,y], has_material[w,z], Painting[p],
      subclassOf(range(creates))[$C]]

select X.target
from creates* X, has_material* Y, Painting P
where X.target = Y.source and X.target = P.uri and Y.target = 'oil on canvas'
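
To see what the final SQL query computes, the sketch below runs an equivalent statement on an in-memory SQLite database. Several things are assumptions: the star suffix (`creates*`) is read as "the table together with its subtables" and is emulated with UNION views because SQLite has no table inheritance; the table layouts follow the (source, target) and (uri) conventions used elsewhere in these slides; and the rows, apart from the Guernica entry, are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE creates (source TEXT, target TEXT);
CREATE TABLE paints (source TEXT, target TEXT);        -- subtable of creates
CREATE TABLE has_material (source TEXT, target TEXT);
CREATE TABLE Painting (uri TEXT);

-- Views emulating creates* and has_material* (table plus subtables).
CREATE VIEW creates_all AS
    SELECT source, target FROM creates
    UNION SELECT source, target FROM paints;
CREATE VIEW has_material_all AS
    SELECT source, target FROM has_material;
""")

# Sample rows; everything except the Guernica entry is hypothetical.
conn.executemany("INSERT INTO creates VALUES (?, ?)",
                 [("www.portal.gr/rodin424", "www.rodin.fr/thinker.gif")])
conn.executemany("INSERT INTO paints VALUES (?, ?)", [
    ("www.portal.gr/picasso132", "museoreinasofia.mcu.es/guernica.gif"),
    ("www.portal.gr/picasso132", "www.artchive.com/woman.jpg"),
])
conn.executemany("INSERT INTO has_material VALUES (?, ?)", [
    ("museoreinasofia.mcu.es/guernica.gif", "oil on canvas"),
    ("www.artchive.com/woman.jpg", "pastel"),
    ("www.rodin.fr/thinker.gif", "bronze"),
])
conn.executemany("INSERT INTO Painting VALUES (?)", [
    ("museoreinasofia.mcu.es/guernica.gif",),
    ("www.artchive.com/woman.jpg",),
])


def oil_paintings():
    """Targets of creates* that are Paintings made of oil on canvas."""
    rows = conn.execute("""
        SELECT X.target
        FROM creates_all X, has_material_all Y, Painting P
        WHERE X.target = Y.source AND X.target = P.uri
          AND Y.target = 'oil on canvas'
    """)
    return [target for (target,) in rows]


print(oil_paintings())
```

The join with Painting plays the role of the semijoin `y = p` in the plan: it keeps only the creates targets that are instances of Painting.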


                          The RQL Query Interpreter

[Figure: interpreter modules and data flow, steps (1) to (6).
   Main: receives the query string and returns the query result
   Syntax analysis: syntactical analysis (lex/yacc), CNF transformation;
      produces the syntax tree under CNF
   Type inference: checks type compatibility, sets appropriate evaluation
      functions
   Graph construction: evaluation of dependencies, factorization;
      produces the query graph
   Evaluator: defines evaluation functions, query processing; produces
      the result
   Typing and Evaluation access the DBMS through the DBMS-RDF Query APIs]


                     RDFSuite Summary
 • RDFSuite addresses the needs of effective RDF metadata
   management by providing tools for validation, storage and querying
    - validation follows a formal data model and constraints enforcing
      consistency of RDF schemas
    - incremental loading of voluminous description bases in a
      persistent store
    - declarative query language for schema and data querying

 • Ongoing efforts:
    - RQL query optimization
    - transactional aspects
    - alternative encoding and representation schemes for access
      optimization



                       Acknowledgements

   • Funding was generously provided by the projects:
      - C-WEB (IST-1999-13479): “A Generic Platform Supporting
        Community Webs”
      - MESMUSES (IST-2000-26074): “Metaphor for Science Museums”







