
									 Edutella RDF Query Exchange Language (RDF-QEL)

                        Syntax and Feature Specification


Abstract

This document describes the purpose and features of the Edutella Query Exchange Language
(RDF-QEL) and introduces an RDF/XML-based query syntax. Each feature is described in
detail, with examples given for clarification. The appendix provides the Edutella
RDF-QEL Schema definition both as an RDF graph and in RDF/XML.


Introduction
Motivation

The Edutella Query Exchange Language is intended to be a standardized query exchange
mechanism for RDF metadata stored in distributed RDF repositories. It will be
implemented by the Edutella Query Service, one of the core services of the Edutella
infrastructure, and is meant to serve both as the query interface for individual RDF
repositories located at single Edutella peers and as the query interface for distributed
queries spanning multiple RDF repositories. Its main purpose is to abstract from the various
possible RDF storage implementations (e.g. relational databases, XML files) and from
different user-level query languages (e.g. RQL, TRIPLE). The Edutella Query Exchange
Language provides the syntax for a single standard query interface across
heterogeneous peer repositories for any kind of RDF metadata.


Key Concepts

RDF-QEL uses graph patterns (or query patterns) that are matched against real RDF graphs.
In essence, a graph pattern is a regular RDF graph plus additional properties denoting
variables and constraints on these variables. There is therefore no need to introduce
native language features, because everything is expressible as an RDF graph.

One of the key ideas guiding our query language design is to keep things as simple as
possible and to reuse existing, widespread query concepts that people are already familiar
with. RDF-QEL is therefore inspired by QBE (Query By Example) for relational
databases and by Datalog. As its syntax, RDF-QEL uses standard RDF/XML. All
Edutella-specific properties are defined in an Edutella RDF Schema (referred to below
as “edu”).

We assume a two-step parsing process: a standard RDF parser, which is blind to the
semantic information carried by properties from the Edutella namespace, followed by an
Edutella query parser that evaluates this semantic information. Following
the QBE philosophy, queries as well as query results look similar to the graph being
queried: after removing all statements that use properties from the Edutella schema, an
Edutella query looks just like the original RDF graph.
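
The second parsing step can be pictured as a filter over the statements produced by the RDF parser. The following Python fragment is a minimal sketch, not part of the specification: it assumes statements are available as (subject, predicate, object) tuples, that "dc" stands for the Dublin Core element namespace, and that rdf:type statements pointing at Edutella classes are also treated as Edutella-specific. The function name strip_edutella is hypothetical.

```python
EDU_NS = "http://www.edutella.org/edutella#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"  # assumed meaning of the "dc" prefix

# Statements of a query asking for the title of book.html, as a plain
# RDF parser might report them.
QUERY = [
    ("#AI_Title_Query_1", RDF_TYPE, EDU_NS + "Query"),
    ("#AI_Title_Query_1", EDU_NS + "hasVariable", "#X"),
    ("#X", RDF_TYPE, EDU_NS + "Variable"),
    ("http://www.xyz.com/book.html", DC_TITLE, "#X"),
]


def strip_edutella(triples):
    """Drop statements whose predicate or object lies in the Edutella namespace."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if not p.startswith(EDU_NS) and not o.startswith(EDU_NS)
    ]


pattern = strip_edutella(QUERY)
print(pattern)
```

What remains is a single statement whose object is the placeholder X, which has the same shape as the statement in the queried graph.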



The Edutella Query Exchange Language in Detail
Queries, Query Results and Variables

Variables are used as placeholders for arbitrary subject or object resources as well as
object literals. Variables are not allowed for predicate resources.

Each Edutella query must contain an instance of the “edu:Query” class. This
instance references all variables through the “edu:hasVariable” property. Each referenced
resource in turn declares itself a variable through an “rdf:type” property targeting the
“edu:Variable” class.
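
As an illustration of this declaration scheme, the following Python sketch collects the variables of a query using only the standard library. It is a simplified scan over top-level rdf:Description elements and not a full RDF parser; the query identifier "Q", the sample document and the helper name declared_variables are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EDU = "http://www.edutella.org/edutella#"

QUERY_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:edu="http://www.edutella.org/edutella#">
  <rdf:Description rdf:ID="Q">
    <rdf:type rdf:resource="http://www.edutella.org/edutella#Query"/>
    <edu:hasVariable rdf:resource="#X"/>
    <edu:hasVariable rdf:resource="#Y"/>
  </rdf:Description>
  <rdf:Description rdf:ID="X">
    <rdf:type rdf:resource="http://www.edutella.org/edutella#Variable"/>
  </rdf:Description>
  <rdf:Description rdf:ID="Y">
    <rdf:type rdf:resource="http://www.edutella.org/edutella#Variable"/>
  </rdf:Description>
</rdf:RDF>
"""


def declared_variables(xml_text):
    """Return IDs referenced via edu:hasVariable that are also typed edu:Variable."""
    root = ET.fromstring(xml_text)
    typed = set()    # IDs carrying rdf:type edu:Variable
    referenced = []  # IDs referenced from an edu:Query instance, in document order
    for desc in root.findall(f"{{{RDF}}}Description"):
        node_id = desc.get(f"{{{RDF}}}ID")
        types = {t.get(f"{{{RDF}}}resource") for t in desc.findall(f"{{{RDF}}}type")}
        if EDU + "Variable" in types:
            typed.add(node_id)
        if EDU + "Query" in types:
            for ref in desc.findall(f"{{{EDU}}}hasVariable"):
                referenced.append(ref.get(f"{{{RDF}}}resource").lstrip("#"))
    return [name for name in referenced if name in typed]


print(declared_variables(QUERY_XML))
```

A reference that is not backed by an rdf:type statement targeting edu:Variable is ignored by this sketch; how a real query parser should treat that case is not stated in this document.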

Example 1: Query with a single match

This query returns the title of the resource identified by
“http://www.xyz.com/book.html”.

<rdf:Description rdf:ID="AI_Title_Query_1">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Query"/>
  <edu:hasVariable rdf:resource="#X"/>
</rdf:Description>

<rdf:Description rdf:ID="X">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Variable"/>
</rdf:Description>

<rdf:Description rdf:about="http://www.xyz.com/book.html">
  <dc:title rdf:resource="#X"/>
</rdf:Description>

Results:

<rdf:Description rdf:about="http://www.xyz.com/book.html">
  <dc:title>Artificial Intelligence</dc:title>
</rdf:Description>


Note that the query as well as its result look similar to the RDF graph we are running the
query against once all “edu” statements are removed. Also note that an RDF parser is
“blind” to the semantics of X being a variable: to an RDF parser, X is a resource just like
any other resource.

In the above example the returned query result looks exactly like the query itself with all
Edutella-specific statements removed and all variables instantiated. There is also an
alternative representation of results as a set of tuples of variables with their bindings:

<rdf:Description rdf:ID="AI_Title_Results_1">
  <rdf:type rdf:resource="edu:ResultSet"/>
  <edu:hasResults rdf:parseType="Resource">
    <rdf:type rdf:resource="rdf:Bag"/>
    <rdf:li rdf:parseType="Resource">
       <edu:bindsVariable rdf:resource="#X"/>
       <rdf:value>Artificial Intelligence</rdf:value>
    </rdf:li>
  </edu:hasResults>
</rdf:Description>

Admittedly, this is a rather simple example with only one binding for a single variable.
With more variables in the query, the result bag contains more members, one for each
variable. If there is more than one binding for the variables, the result set becomes a
two-dimensional structure: a bag of bags, with each sub-bag describing one
binding.

Example 2: Query with multiple matches

This query returns all resources having a Dublin Core “title” statement along with their
titles.

<rdf:Description rdf:ID="AI_Title_Query_2">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Query"/>
  <edu:hasVariable rdf:resource="#X"/>
  <edu:hasVariable rdf:resource="#Y"/>
</rdf:Description>

<rdf:Description rdf:ID="X">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Variable"/>
  <dc:title rdf:resource="#Y"/>
</rdf:Description>

<rdf:Description rdf:ID="Y">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Variable"/>
</rdf:Description>


Results:

<rdf:Description rdf:about="http://www.xyz.com/book.html">
  <dc:title>Artificial Intelligence</dc:title>
</rdf:Description>

<rdf:Description rdf:about="http://www.xyz.com/comment1.html">
  <dc:title>Comment 1</dc:title>
</rdf:Description>

<rdf:Description rdf:about="http://www.xyz.com/comment2.html">
  <dc:title>Comment 2</dc:title>
</rdf:Description>

Since there are multiple bindings for the variables, we get a set of distinct RDF graphs as
the result. In the alternative tuple representation we get a bag containing sub-bags:

<rdf:Description rdf:ID="AI_Title_Results_2">
  <rdf:type rdf:resource="edu:ResultSet"/>
  <edu:hasResults rdf:parseType="Resource">
    <rdf:type rdf:resource="rdf:Bag"/>
    <rdf:li rdf:parseType="Resource">
      <rdf:type rdf:resource="rdf:Bag"/>
      <rdf:li rdf:parseType="Resource">
         <edu:bindsVariable rdf:resource="#X"/>
         <rdf:value rdf:resource="http://www.xyz.com/book.html"/>
      </rdf:li>
      <rdf:li rdf:parseType="Resource">
         <edu:bindsVariable rdf:resource="#Y"/>
         <rdf:value>Artificial Intelligence</rdf:value>
      </rdf:li>
    </rdf:li>
    <rdf:li rdf:parseType="Resource">
      <rdf:type rdf:resource="rdf:Bag"/>
      <rdf:li rdf:parseType="Resource">
         <edu:bindsVariable rdf:resource="#X"/>
         <rdf:value rdf:resource="http://www.xyz.com/comment1.html"/>
      </rdf:li>
      <rdf:li rdf:parseType="Resource">
         <edu:bindsVariable rdf:resource="#Y"/>
         <rdf:value>Comment 1</rdf:value>
      </rdf:li>
    </rdf:li>
    <rdf:li rdf:parseType="Resource">
      <rdf:type rdf:resource="rdf:Bag"/>
      <rdf:li rdf:parseType="Resource">
         <edu:bindsVariable rdf:resource="#X"/>
         <rdf:value rdf:resource="http://www.xyz.com/comment2.html"/>
      </rdf:li>
      <rdf:li rdf:parseType="Resource">
         <edu:bindsVariable rdf:resource="#Y"/>
         <rdf:value>Comment 2</rdf:value>
      </rdf:li>
    </rdf:li>
  </edu:hasResults>
</rdf:Description>
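
The tuple representation corresponds to what a simple pattern matcher would compute: one set of variable bindings per match. The Python sketch below is only an illustration under stated assumptions; it models the graph as (subject, predicate, object) tuples, assumes "dc" is the Dublin Core element namespace, adds a hypothetical dc:creator statement to show that non-matching statements are skipped, and uses a hypothetical function name match. It does not enforce the rule that predicates may not be variables.

```python
DC = "http://purl.org/dc/elements/1.1/"  # assumed meaning of the "dc" prefix

GRAPH = [
    ("http://www.xyz.com/book.html", DC + "title", "Artificial Intelligence"),
    ("http://www.xyz.com/book.html", DC + "creator", "N.N."),  # hypothetical statement
    ("http://www.xyz.com/comment1.html", DC + "title", "Comment 1"),
    ("http://www.xyz.com/comment2.html", DC + "title", "Comment 2"),
]

VARIABLES = {"#X", "#Y"}
PATTERN = [("#X", DC + "title", "#Y")]


def match(patterns, graph, variables, binding=None):
    """Yield one dict of variable bindings for every way the patterns fit the graph."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(binding)
        fits = True
        for p_term, g_term in zip(first, triple):
            if p_term in variables:
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(p_term, g_term) != g_term:
                    fits = False
                    break
            elif p_term != g_term:
                fits = False
                break
        if fits:
            yield from match(rest, graph, variables, candidate)


bindings = list(match(PATTERN, GRAPH, VARIABLES))
for row in bindings:
    print(row)
```

Each printed dictionary plays the role of one sub-bag, and each key/value pair that of one edu:bindsVariable / rdf:value member.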



Variables and Joins

Variables are also used to perform join operations. Whenever the same variable occurs in
more than one statement, a join is carried out between the statements involved.

Example 3: Variables and joins

This query returns all resources having the title “Artificial Intelligence” and being of type
“Book”.

<rdf:Description rdf:ID="AI_Query">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Query"/>
  <edu:hasVariable rdf:resource="#X"/>
</rdf:Description>

<rdf:Description rdf:ID="X">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Variable"/>
  <dc:title>Artificial Intelligence</dc:title>
  <rdf:type rdf:resource="http://www.xyz.com/book.rdf#Book"/>
</rdf:Description>


This example shows how a join on the predicates title and type is performed with a single
variable: X is the root of the query tree, so no separate join variable is needed for each
predicate. In case it is not immediately obvious that the variable X is used twice here,
consider the expanded version of the query:

<rdf:Description rdf:ID="AI_Query">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Query"/>
  <edu:hasVariable rdf:resource="#X"/>
</rdf:Description>

<rdf:Description rdf:ID="X">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Variable"/>
  <dc:title>Artificial Intelligence</dc:title>
</rdf:Description>

<rdf:Description rdf:about="#X">
  <rdf:type rdf:resource="http://www.xyz.com/book.rdf#Book"/>
</rdf:Description>


Results:

<rdf:Description rdf:about="http://www.xyz.com/book.html">
  <dc:title>Artificial Intelligence</dc:title>
  <rdf:type rdf:resource="http://www.xyz.com/book.rdf#Book"/>
</rdf:Description>


In this example the variable occurs as the subject of both statements. In similar scenarios
the same variable might occur as the subject of one statement and the object of another.
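
For the subject-subject case, the join amounts to intersecting the sets of subjects that satisfy each statement. The following Python sketch illustrates this under stated assumptions: the graph is modelled as (subject, predicate, object) tuples, "dc" is taken to be the Dublin Core element namespace, the resource article.html and the class Article are hypothetical additions, and the helper name subjects_with is invented for this example.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"  # assumed meaning of the "dc" prefix
BOOK = "http://www.xyz.com/book.rdf#Book"

GRAPH = [
    ("http://www.xyz.com/book.html", DC_TITLE, "Artificial Intelligence"),
    ("http://www.xyz.com/book.html", RDF_TYPE, BOOK),
    # Hypothetical resource with the same title but a different type.
    ("http://www.xyz.com/article.html", DC_TITLE, "Artificial Intelligence"),
    ("http://www.xyz.com/article.html", RDF_TYPE, "http://www.xyz.com/book.rdf#Article"),
    ("http://www.xyz.com/comment1.html", DC_TITLE, "Comment 1"),
]


def subjects_with(graph, predicate, obj):
    """Return all subjects that have the given predicate/object pair."""
    return {s for (s, p, o) in graph if p == predicate and o == obj}


# X must satisfy both statements, so the candidate sets are intersected.
titled = subjects_with(GRAPH, DC_TITLE, "Artificial Intelligence")
typed = subjects_with(GRAPH, RDF_TYPE, BOOK)
result = titled & typed
print(result)
```

Only book.html satisfies both statements; article.html matches the title statement alone and is dropped by the join.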


Conjunction

Each graph query pattern can be subdivided into its constituent statements, with each
such statement adding a further refinement to the query. Any graph query
pattern therefore represents an implicit conjunction of all its statements. This is equivalent
to connecting all these statements explicitly with a Boolean AND operator.


Disjunction of Queries

To describe a disjunction of several queries, the Edutella schema provides the
“edu:DisjunctiveQuery” class. An instance of this class describes a disjunction by
referencing different queries through the “edu:hasQuery” property. The individual
subqueries are not allowed to use the same variables.

Example 4: Disjunction

<rdf:Description rdf:ID="AI_Query">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#DisjunctiveQuery"/>
  <edu:hasQuery rdf:resource="#AI_Query_1"/>
  <edu:hasQuery rdf:resource="#AI_Query_2"/>
</rdf:Description>

<rdf:Description rdf:ID="AI_Query_1">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Query"/>
  <edu:hasProjectedVariable rdf:resource="#X"/>
</rdf:Description>

<rdf:Description rdf:ID="X">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Variable"/>
  <rdf:type rdf:resource="ai:AIBook"/>
</rdf:Description>

<rdf:Description rdf:ID="AI_Query_2">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Query"/>
  <edu:hasProjectedVariable rdf:resource="#Y"/>
</rdf:Description>

<rdf:Description rdf:ID="Y">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Variable"/>
  <rdf:type rdf:resource="http://www.xyz.com/schema#Book"/>
  <dc:title>Artificial Intelligence</dc:title>
</rdf:Description>


Results:

<rdf:Description rdf:about="http://www.xyz.com/book.html">
  <rdf:type rdf:resource="http://www.xyz.com/schema#Book"/>
  <dc:title>Artificial Intelligence</dc:title>
</rdf:Description>
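
One way to picture the evaluation of a disjunctive query is as the union of the results of its subqueries. The Python sketch below is an assumption-laden illustration, not the Edutella implementation: subqueries are modelled as plain functions over a list of (subject, predicate, object) tuples, "ai:AIBook" is kept as an opaque string because its namespace is not given here, and the resource reader.html is a hypothetical addition so that the first branch also contributes a match.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"  # assumed meaning of the "dc" prefix
BOOK = "http://www.xyz.com/schema#Book"

GRAPH = [
    ("http://www.xyz.com/book.html", RDF_TYPE, BOOK),
    ("http://www.xyz.com/book.html", DC_TITLE, "Artificial Intelligence"),
    # Hypothetical resource, added so that the first subquery has a match.
    ("http://www.xyz.com/reader.html", RDF_TYPE, "ai:AIBook"),
    ("http://www.xyz.com/comment1.html", DC_TITLE, "Comment 1"),
]


def subjects(graph, predicate, obj):
    """Return subjects having the given predicate/object pair, in graph order."""
    return [s for (s, p, o) in graph if p == predicate and o == obj]


def query_1(graph):
    # Everything typed as an AI book.
    return subjects(graph, RDF_TYPE, "ai:AIBook")


def query_2(graph):
    # Every book titled "Artificial Intelligence".
    books = set(subjects(graph, RDF_TYPE, BOOK))
    return [s for s in subjects(graph, DC_TITLE, "Artificial Intelligence") if s in books]


def disjunction(subqueries, graph):
    """Union of all subquery results, without duplicates, in first-seen order."""
    results = []
    for subquery in subqueries:
        for resource in subquery(graph):
            if resource not in results:
                results.append(resource)
    return results


matches = disjunction([query_1, query_2], GRAPH)
print(matches)
```

Because the subqueries are evaluated independently, they share no bindings, which is consistent with the rule that they must use distinct variables.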



Operators and Constraints

The Edutella Query Exchange Language supports a number of Boolean and comparison
operators, which may be applied to variables only. The available operators are:

      Implicit alphanumeric CONTAINS
      Explicit alphanumeric / implicit numeric EQUAL_TO
      Alphanumeric / numeric GREATER_THAN and GREATER_THAN_OR_EQUAL
      Alphanumeric / numeric LESS_THAN and LESS_THAN_OR_EQUAL
      Boolean NOT
      Boolean AND
      Boolean OR

A condition involving one or more variables is expressed as a single literal referenced
by an instance of the “edu:Query” class through the “edu:hasConstraint” property. Within
the constraint string, parentheses may be used to construct more complex conditions.

Example 5: Boolean and numeric operators and nested conditions

This query returns all resources having a score greater than three and less than five.

<rdf:Description rdf:ID="Score_Query">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Query"/>
  <edu:hasVariable rdf:resource="#X"/>
  <edu:hasVariable rdf:resource="#Y"/>
  <edu:hasConstraint>(Y GREATER_THAN 3) AND (Y LESS_THAN 5)</edu:hasConstraint>
</rdf:Description>

<rdf:Description rdf:ID="X">
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Variable"/>
  <xyz:score rdf:resource="#Y"/>
</rdf:Description>

<rdf:Description rdf:ID="Y">
  <edu:hasDatatype rdf:resource="http://www.edutella.org/edutella#Number"/>
  <rdf:type rdf:resource="http://www.edutella.org/edutella#Variable"/>
</rdf:Description>


If one prefers prefix over infix notation, the constraint literal of this example could also be
written as

<edu:hasConstraint>AND(GREATER_THAN(Y,3),LESS_THAN(Y,5))</edu:hasConstraint>
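
The prefix form lends itself to a small recursive evaluator. The Python sketch below is one possible reading of the constraint syntax, not a definitive grammar: it handles only the Boolean and comparison operators over numeric values (CONTAINS is left out), assumes that the score value is bound to a variable named Y, and uses hypothetical function names (tokenize, evaluate, satisfies). Error handling for malformed input is minimal.

```python
import operator
import re

# Operator names from the list above, mapped to Python functions.
OPS = {
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "NOT": lambda a: not a,
    "EQUAL_TO": operator.eq,
    "GREATER_THAN": operator.gt,
    "GREATER_THAN_OR_EQUAL": operator.ge,
    "LESS_THAN": operator.lt,
    "LESS_THAN_OR_EQUAL": operator.le,
}

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+(?:\.\d+)?|[(),])")


def tokenize(text):
    """Split a prefix-notation constraint into names, numbers and punctuation."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        found = TOKEN.match(text, pos)
        if not found:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(found.group(1))
        pos = found.end()
    return tokens


def evaluate(tokens, binding):
    """Consume one expression from the front of tokens and return its value."""
    token = tokens.pop(0)
    if tokens and tokens[0] == "(":  # operator application
        tokens.pop(0)                # drop "("
        args = [evaluate(tokens, binding)]
        while tokens[0] == ",":
            tokens.pop(0)
            args.append(evaluate(tokens, binding))
        tokens.pop(0)                # drop ")"
        return OPS[token](*args)
    if token[0].isdigit():           # numeric literal
        return float(token)
    return binding[token]            # variable lookup


def satisfies(constraint, binding):
    """Check one variable binding against a prefix-notation constraint."""
    return evaluate(tokenize(constraint), binding)


CONSTRAINT = "AND(GREATER_THAN(Y,3),LESS_THAN(Y,5))"
print(satisfies(CONSTRAINT, {"Y": 4}))
print(satisfies(CONSTRAINT, {"Y": 6}))
```

A query processor could apply such a check to each candidate binding and keep only those for which it holds.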




APPENDIX: Edutella RDF-QEL Schema
Edutella RDF-QEL Schema in Graph Representation:

For better understanding, the RDF graph describing the Edutella RDF-QEL Schema is
broken up into separate subgraphs.

Complete XML Serialization of the Edutella RDF-QEL Schema:

<rdf:RDF xml:lang="en"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:edu="http://www.edutella.org/edutella#">

 <rdf:Description rdf:about='http://www.edutella.org/edutella#Variable'>
   <rdf:type rdf:resource='http://www.w3.org/2000/01/rdf-schema#Class'/>
 </rdf:Description>
 <rdf:Description rdf:about='http://www.edutella.org/edutella#Query'>
   <rdf:type rdf:resource='http://www.w3.org/2000/01/rdf-schema#Class'/>
 </rdf:Description>
 <rdf:Description rdf:about='http://www.edutella.org/edutella#DisjunctiveQuery'>
   <rdf:type rdf:resource='http://www.w3.org/2000/01/rdf-schema#Class'/>
 </rdf:Description>
 <rdf:Description rdf:about='http://www.edutella.org/edutella#ResultSet'>
   <rdf:type rdf:resource='http://www.w3.org/2000/01/rdf-schema#Class'/>
 </rdf:Description>

 <rdf:Description rdf:about='http://www.edutella.org/edutella#hasVariable'>
   <rdf:type rdf:resource='http://www.w3.org/1999/02/22-rdf-syntax-ns#Property'/>
   <rdfs:domain rdf:resource='http://www.edutella.org/edutella#Query'/>
   <rdfs:range rdf:resource='http://www.edutella.org/edutella#Variable'/>
 </rdf:Description>
 <rdf:Description rdf:about='http://www.edutella.org/edutella#hasConstraint'>
   <rdf:type rdf:resource='http://www.w3.org/1999/02/22-rdf-syntax-ns#Property'/>
   <rdfs:domain rdf:resource='http://www.edutella.org/edutella#Query'/>
   <rdfs:range rdf:resource='http://www.w3.org/2000/01/rdf-schema#Literal'/>
 </rdf:Description>
 <rdf:Description rdf:about='http://www.edutella.org/edutella#hasQuery'>
   <rdf:type rdf:resource='http://www.w3.org/1999/02/22-rdf-syntax-ns#Property'/>
   <rdfs:range rdf:resource='http://www.edutella.org/edutella#Query'/>
   <rdfs:domain rdf:resource='http://www.edutella.org/edutella#DisjunctiveQuery'/>
 </rdf:Description>
 <rdf:Description rdf:about='http://www.edutella.org/edutella#bindsVariable'>
   <rdf:type rdf:resource='http://www.w3.org/1999/02/22-rdf-syntax-ns#Property'/>
   <rdfs:range rdf:resource='http://www.edutella.org/edutella#Variable'/>
 </rdf:Description>
 <rdf:Description rdf:about='http://www.edutella.org/edutella#hasResults'>
   <rdf:type rdf:resource='http://www.w3.org/1999/02/22-rdf-syntax-ns#Property'/>
   <rdfs:domain rdf:resource='http://www.edutella.org/edutella#ResultSet'/>
 </rdf:Description>

</rdf:RDF>

								