Transitioning Relational Databases to Ontologies Farid Cerbah

Farid Cerbah, Dassault Aviation, farid.cerbah@dassault-aviation.fr
ESWC'08 - Tutorial 3 - Transitioning Applications to Ontologies - June, 1st - Tenerife

Outline
- Problem statement
- Previous work
- The RDBToOnto tool and the RTAXON method
- Improving the process through database optimisation
- A case study in aircraft maintenance
- Extending RDBToOnto
- Conclusion

Problem statement
- Relational databases are valuable heterogeneous sources for ontology learning
  - Better accuracy can be expected than from text corpora
- Ontology learning from relational databases is not a new research issue
- Limitations of existing support
  - The problem is often restricted to finding automated ways to import "tables" into ontologies
  - The derived ontologies have a flat structure and look like the source databases

Our contribution
- RDBToOnto platform
  - Comprehensive software support for learning fine-tuned ontologies
  - A framework that eases the development of, and experimentation with, transitioning methods
- RTAXON method
  - Finds taxonomies hidden in the data

A motivating example
[Figure: example mappings, distinguishing typical mappings covered by several methods from those specific to RTAXON]

Previous work (1)
- RDB -> ontology transformation
- Database reverse engineering
  - Many transformation rules from this domain are reused for ontology learning
  - [Behm et al. 1997], [Ramanathan & Hodges 1997], …
- Approaches mostly based on an analysis of the RDB schema
  - Data correlations are considered, but with the restriction "Data ≡ Key Values"
  - Key inclusion may express inheritance
- Exploiting null value semantics [Lammari et al. 2007]
  - Partitioning a table on the basis of null values may reveal concept hierarchies
  - Involves data from non-key attributes

Previous work (2)
- Mapping languages and tools
- D2RQ
  - RDB to OWL/RDF mapping
  - Ontology-based access to relational databases
  - Rewriting SPARQL queries into SQL
- Relational.OWL
  - A minimal ontology of "tables" and "columns", and a processor to populate this ontology with data from relational databases
  - Can be used to exchange data between databases
- Triplify
  - Plugin for web applications
  - Converts the result of SQL queries into RDF
- KAON Reverse
  - Software support to interactively map an RDB schema to a predefined ontology
- DataMaster
  - Protégé plugin to import table data into ontologies

RDBToOnto
- A user-oriented tool with a full-fledged user interface
- Supports an extensive process, from data access to ontology generation
- Includes the RTAXON converter
- Though the process is automated to a large extent, local constraints can be added interactively to progressively refine the ontologies
- Types of local constraints
  - Table and column exclusion
  - Naming patterns for classes and instances
  - Categorisation patterns

The RTAXON method
- Major improvement over existing methods
  - Further refines the classes derived from the schema with subclasses found in the content of the relations
  - Focuses on reliable categorisation patterns

[Figure: the table Access Zones (X 516) with Type as categorising attribute; the class Access Zone is refined into the subclasses Floor, Door, Panel and Fairing]

A/C  Codes  Type   Description
F7X  2103   DOOR   nose cone
F7X  281FL  PANEL  windshield retainers
F7X  300ZZ  PANEL  umbrella access panel
F7X  243DF  FLOOR  No.1 servicing compartment floor
F7X  342EZ  FAIRG  No.1 rear under pylon fairing

- Two sources are involved in the identification of categorising attributes
  - Attribute names: revealed by lexical clues
  - Redundancy in attribute extensions: entropy-based approach to find good profiles
- Formal definition of RTAXON
- Demo

Optimising the source databases
- Another key improvement is the inclusion of a database optimisation step
- Many input databases suffer from data duplication problems
- Optimisation -> eliminate data duplication through the processing of inclusion dependencies

[Figure: data duplication between WorkPackages and Companies, before and after processing the inclusion dependency]

WorkPackages (X 82), with duplicated company data:
WP Number  WP Title                               Company Code  Company Name
33         Hydraulic Power                        F0086         Parker
34         Landing Gears                          F564          Messier-Dowty
34A        Landing Gear Emergency Control System  F0214         Dassault-Aviation
35         Wheels, Brakes and Braking             B453          ABS

Companies (the figure labels its two versions X 106 and X 105):
Cage_Code (PKEY)  Name
F0086             Parker
F564              Messier-Dowty
F0214             Dassault-Aviation

Inclusion dependencies shown in the figure:
WorkPackages[Company Code, Company Name] ⊆ Companies[Cage Code, Name]
WorkPackages[Company Code] ⊆ Companies[Name]

After processing, WorkPackages no longer carries the Company Name column, and Company Code is linked to Companies through a foreign key relationship.

Effect of inclusion dependency processing
- Inclusion dependencies -> more inter-class relations (i.e. object properties)
[Figure: the generated ontology without ID identification and with ID identification]

Identification of inclusion dependencies
- RDBToOnto includes an editor to interactively define inclusion dependencies
- Automated identification of inclusion dependencies
  - A data mining approach based on LATINO
  - See the presentation in this tutorial on ontology learning by Miha Grčar (JSI)
- Dependencies discovered by LATINO are exported to RDBToOnto and can be validated in the ID editor

Mining inclusion dependencies with LATINO
[Figure]

A case study in aircraft maintenance
[Figure: components used in the case study: KCIT (GATE-based annotator), RDBToOnto + LATINO, Radiant, OWLIM]

The ontology acquisition process
- The legacy data
  - LSA database: a heterogeneous relational database that gathers all information related to maintenance activity
    - Required logistic resources
    - Aircraft parts (product tree)
    - Scheduling data
  - Standards: documents including widely shared conceptual models
- The ontology acquisition process
  - A multi-step transitioning process that favours modular design

Model bootstrapping + ontology normalisation
[Figure: diagram with the labels Legacy Data, Ontology Learning Tools, Model Bootstrapping, Ontology Normalisation, Reusable Ontologies (MSG-3, SNS/ATA, FOAF; linked by "imports"), ATA, and OWLIM/HKS Repository]

The defined RDBToOnto conversion project
- 75 constraints
  - Mostly naming patterns and inclusion dependencies
- Resulting ontology
  - Ontology model: 115 classes, 334 datatypes, 54 object properties
  - Population: 49617 class instances, 51449 object property instances
- No constraints for categorisation
  - The ten hierarchies discovered by RTAXON are relevant
  - Good behaviour when faced with categorisation conflicts

The generated class hierarchy
[Figure]

Identified object properties
[Figure]

RDBToOnto extension capabilities
- RDBToOnto is a user-oriented tool, but it is also a framework
  - Written in Java
  - OWL as target language (exploiting the Jena 2.5 API)
- Two types of components can be added
  - Database readers, to cover more database formats
  - Converters, to implement new learning methods
- New converters can have their own global options, local constraints and GUI

Structure of RDBToOnto
[Figure: class diagram]
- DBReader: Database getDatabase(), Table ReadData(String name), …
  - Readers shown: MSAccessReader, DB2Reader
- RDBToOntoConverter: OntModel Convert(Database db), OntClass CreateClass(TableDef), …
  - Converters shown: RTAXON, BasicConverter
- Can be extended by the users

The neutral database model
[Figure: class diagram with DBSchema, Database, Table, Column, TableDef, Attribute (friendlyNames), Key, PrimaryKey, ForeignKey and String Values, linked by one-to-many (*) associations]
- Input to any converter

Conclusion
- We presented comprehensive support for transitioning relational databases to ontologies
- RDBToOnto and the RTAXON method have been evaluated on significant databases
- RTAXON is just a first step, as many extensions can be studied
  - Learning two-level hierarchies
  - Automatically generating local constraints (e.g. naming patterns)
- More resources are available on the TAO project web site, including
  - User guide and demos
  - Development guide
  - A fully implemented sample showing how to extend the tool
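The RTAXON slide mentions an entropy-based approach for spotting categorising attributes through redundancy in attribute extensions. The slides do not give the formula, so the Python sketch below is only one plausible reading and not RTAXON's formal definition: it computes the Shannon entropy of each attribute's values, normalises it by the maximum possible for the number of rows, and keeps attributes whose values repeat a lot without being constant. The function names, the 0.6 threshold and the sample rows are all assumptions made for the illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def normalised_entropy(values: Sequence[str]) -> float:
    """Shannon entropy of `values`, divided by log2(len(values)).

    0 means every row has the same value; 1 means every value is unique.
    """
    n = len(values)
    if n <= 1:
        return 0.0
    counts = Counter(values)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / math.log2(n)


def candidate_categorising_attributes(
    rows: List[Dict[str, str]], threshold: float = 0.6
) -> List[str]:
    """Attributes whose values are redundant (low entropy) but not constant.

    The threshold is an arbitrary value chosen for this sketch.
    """
    candidates = []
    for attribute in rows[0]:
        score = normalised_entropy([row[attribute] for row in rows])
        if 0.0 < score <= threshold:
            candidates.append(attribute)
    return candidates


# Made-up rows, loosely modelled on the Access Zones figure.
types = ["PANEL", "PANEL", "PANEL", "PANEL", "DOOR", "DOOR", "FLOOR", "FLOOR"]
rows = [
    {"A/C": "F7X", "Code": f"Z{i}", "Type": t}
    for i, t in enumerate(types)
]

candidates = candidate_categorising_attributes(rows)

# Each distinct value of a candidate attribute suggests one subclass.
subclasses = sorted({row["Type"] for row in rows})
```

On these rows only `Type` is kept: `A/C` is constant and `Code` is unique per row, so neither carries a usable categorisation. A fuller implementation would presumably combine such a score with the lexical clues in attribute names that the slide also mentions.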
