Efficient Processing of Semantic Information on the Web - WeRC


Georg Lausen
Technische Fakultät
Universität Freiburg

Processing of Semantic Information on the Web

• The amount of information available on the Web is still increasing rapidly.

• (Semi-)Automatic Data Extraction.

• Resource Description Framework (RDF).

• SPARQL is the standard query language for RDF.


• Efficiency and Scalability of query processing.
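
To make the RDF and SPARQL bullets concrete, the following is a minimal sketch in plain Python (no RDF library): an RDF graph held as a set of (subject, predicate, object) triples, and a naive evaluator for a basic graph pattern, which is the core of a SPARQL SELECT query. The sample data, the `ex:` and `foaf:` names, and the helpers `match_pattern` and `evaluate_bgp` are assumptions made for illustration only; a real store would use indexes rather than nested loops.

```python
# Illustrative only: a tiny in-memory "RDF graph" and a naive matcher for
# basic graph patterns. All names and data are made up for this sketch.

GRAPH = {
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", "Bob"),
}


def is_var(term):
    """Variables are written SPARQL-style with a leading '?'."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result


def evaluate_bgp(patterns, graph):
    """Evaluate a conjunction of triple patterns with a nested-loop join."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in graph:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# Roughly: SELECT ?name WHERE { ex:alice foaf:knows ?x . ?x foaf:name ?name }
QUERY = [("ex:alice", "foaf:knows", "?x"), ("?x", "foaf:name", "?name")]
names = sorted(b["?name"] for b in evaluate_bgp(QUERY, GRAPH))
print(names)
```

The nested loops scan the whole graph once per pattern and partial binding, which is exactly the cost that the efficiency and scalability work listed above tries to avoid.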

Efficiency and Scalability: A Variety of Approaches

•   Single machine RDF stores

•   Parallel Database Approach: Vertica and others


•   Approaches based on Hadoop (MapReduce Paradigm)
     –   Hadoop
     –   Hadoop++
     –   Integration of databases: HadoopDB
     –   Language translation
           • Mapping SPARQL to Hadoop/HBase directly
           • Mapping SPARQL to Pig Latin


•   Non-Hadoop clusters
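
As a rough illustration of the MapReduce paradigm behind the Hadoop-based approaches, the sketch below simulates, in a single Python process, how a join between two triple patterns could be expressed as a map phase, a shuffle, and a reduce phase. It is not taken from Hadoop, HadoopDB, or any SPARQL translator; the data and the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_join`) are hypothetical.

```python
from collections import defaultdict

# Illustrative data only.
TRIPLES = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:carol"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:carol", "foaf:name", "Carol"),
    ("ex:alice", "foaf:name", "Alice"),
]


def map_phase(triple):
    """Emit (join_key, tagged_value) pairs for the two patterns
    ?a foaf:knows ?x  and  ?x foaf:name ?n, which join on ?x."""
    s, p, o = triple
    if p == "foaf:knows":
        yield o, ("knows", s)  # ?x is the object of the first pattern
    elif p == "foaf:name":
        yield s, ("name", o)   # ?x is the subject of the second pattern


def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine both sides of the join for one value of ?x."""
    knowers = [v for tag, v in values if tag == "knows"]
    names = [v for tag, v in values if tag == "name"]
    for a in knowers:
        for n in names:
            yield a, key, n


def run_join(triples):
    pairs = [pair for t in triples for pair in map_phase(t)]
    groups = shuffle(pairs)
    return sorted(
        row for key, values in groups.items() for row in reduce_phase(key, values)
    )


results = run_join(TRIPLES)
for row in results:
    print(row)
```

On a real cluster the map and reduce calls would run on different machines and the shuffle would move data over the network, so each join in a query typically costs a full MapReduce job.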

Cluster-based Parallelism vs Parallel Database/Single Machine RDF-Store

Each technology has its own advantages and problems.

Rough characterization:

                                              Querying    Loading
Parallel Database / Single Machine RDF-Store     +           -
Cluster-based Parallelism                        -           +

Loading in the context of Web research follows the Extract-Transform-Load (ETL) scheme.

SPARQL provides a declarative way to specify both the transformation and the queries.
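
The Extract-Transform-Load scheme can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the scraped records, the `raw:` and `dc:` predicate names, and the functions `extract`, `transform`, and `load` are all hypothetical, and the transform step merely imitates what a SPARQL CONSTRUCT query could express declaratively (the slides do not say which SPARQL feature is used).

```python
from collections import defaultdict

# Extract input: hypothetical records obtained from Web documents.
RECORDS = [
    {"url": "http://example.org/p/1", "author": "A. Author", "title": "First Paper"},
    {"url": "http://example.org/p/2", "author": "A. Author", "title": "Second Paper"},
]


def extract(records):
    """E: turn each record into triples of an initial RDF graph."""
    for r in records:
        yield (r["url"], "raw:author", r["author"])
        yield (r["url"], "raw:title", r["title"])


# T: predicate rewriting, similar in spirit to
#   CONSTRUCT { ?d dc:creator ?a } WHERE { ?d raw:author ?a }
RULES = {"raw:author": "dc:creator", "raw:title": "dc:title"}


def transform(triples, rules):
    """Keep only triples covered by a rule and rename their predicate."""
    for s, p, o in triples:
        if p in rules:
            yield (s, rules[p], o)


def load(triples):
    """L: build a simple predicate index standing in for an RDF store."""
    index = defaultdict(set)
    for s, p, o in triples:
        index[p].add((s, o))
    return index


store = load(transform(extract(RECORDS), RULES))
titles = sorted(o for _, o in store["dc:title"])
print(titles)
```

Because the three steps are generators chained together, triples stream through the pipeline one at a time; only the final index is materialized.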
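
ETL and Querying in the context of Web research

[Figure: Web documents --E--> Initial RDF graph --L--> RDF store, with T applied to the initial RDF graph. "Efficient loading" labels the ETL side and "Efficient querying" the RDF store side; SPARQL is shown beneath the pipeline.]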


PigSPARQL: Mapping SPARQL to Pig Latin; to appear in Semantic Web Information Management (SWIM 2011).
