Microsoft PowerPoint - Data Translation

Data Translation
The Problem
  7 Palliative Care Centre Datasets
    Calgary Health Region Palliative Care Program
    Edmonton Regional Palliative Care Program
    Capital Health Palliative Care Program (Halifax)
    Queen’s Palliative Medicine Program (Kingston)
    Temmy Latner Centre for Hospice Palliative Care HPCNet Project
    Victoria Hospice Society
    Regional Health Authority Palliative Care Sub Program (Winnipeg)

Data Heterogeneity
  Same element, different name
  Same element, different value, e.g. degrees of temperature
  Content differences
  Semantic (same concept, different meaning)
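
As a rough illustration of the first two kinds of heterogeneity listed above (same element under a different name, same element with a different value convention), the following Python sketch brings two local records into one shape. The site labels, the field names (`pt_name`, `temp_f`, `temperature_c`) and the choice of Celsius as the common unit are assumptions made for the example; none of them come from the seven datasets.

```python
# Hypothetical name mappings: local field name -> global field name.
FIELD_MAP = {
    "site_a": {"pt_name": "name", "temp_f": "temperature"},
    "site_b": {"name": "name", "temperature_c": "temperature"},
}

# Hypothetical value converters, keyed by (site, local field).
# site_a is assumed to record temperature in Fahrenheit.
CONVERTERS = {
    ("site_a", "temp_f"): lambda f: round((f - 32) * 5 / 9, 1),
}


def to_global(site, record):
    """Rename local fields and convert values to the global convention."""
    out = {}
    for local_field, value in record.items():
        global_field = FIELD_MAP[site][local_field]
        convert = CONVERTERS.get((site, local_field), lambda v: v)
        out[global_field] = convert(value)
    return out


a = to_global("site_a", {"pt_name": "John Doe", "temp_f": 98.6})
b = to_global("site_b", {"name": "John Doe", "temperature_c": 37.0})
```

After translation the two records compare equal, which is the property an integrated result needs. Content and semantic differences are not covered by a lookup table like this one.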

A Common Data Integration Architecture
  [Diagram labels, top to bottom: Query, Result; Mediator; Wrapper, Wrapper, Wrapper; data source 1, data source 2, …, data source n]

  [Diagram labels: Global schema, Mapping, Local Schema; User Interface; Query; Integrated result without inconsistency, etc.; Integration System with global unified schema/ontology; local data source 1, data source 2, …, data source n]
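
The mediator and wrapper layers in the architecture above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `(name, diag)` tuple shape and the sample rows are hypothetical, and a real mediator would also handle query translation and conflict resolution.

```python
class Wrapper:
    """Exposes one local source as a list of (name, diag) tuples."""

    def __init__(self, rows, translate):
        self.rows = rows
        self.translate = translate  # maps one local row to (name, diag)

    def fetch(self):
        return [self.translate(row) for row in self.rows]


class Mediator:
    """Sends a query to every wrapper and merges the answers."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def query(self, predicate):
        result = []
        for wrapper in self.wrappers:
            for row in wrapper.fetch():
                # Keep matching rows, dropping exact duplicates.
                if predicate(row) and row not in result:
                    result.append(row)
        return result


# Two hypothetical sources with different local layouts.
source1 = Wrapper([{"pt": "John Doe", "dx": "ALS"}],
                  lambda r: (r["pt"], r["dx"]))
source2 = Wrapper([("ALS", "Ann Lee"), ("COPD", "Sam Roy")],
                  lambda r: (r[1], r[0]))

mediator = Mediator([source1, source2])
als_patients = mediator.query(lambda row: row[1] == "ALS")
```

The caller only sees the mediator's tuple shape; each wrapper hides how its own source stores the data.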

Query Reformulation

Views
  Global-As-View (GAV)
  Local-As-View (LAV)
  Combined View

Global-As-View
  For each relation R in the Global Schema, we write a query over the local data source relations
  Local data sources DB1, DB2 contain name, date of birth, diagnosis
    DB1(id,name,dob,diag) → pcPatient(name,diag)
    DB2(id,name,dob,diag) → pcPatient(name,diag)
  DB3 shares patient id with DB1 and provides treatment
    DB1(id,name,dob,diag)^DB3(id,treat) → pcTreat(name,treat)
  q(name,treat) :- pcPatient(name,’John Doe’), pcTreat(name,treat)

Local-As-View (1)
  LAV describes the local data source in the opposite direction
  Contents of a data source are described as a query over the global schema
  2 sources: V1 contains patient, doctor, diagnosis, surgery and opDate; V2 contains surgery done in 2005
    S1: V1(name,doc,diag,surgery,opDate) → pcPatient(name … act
    S2: V2(surgery,date) → pcTreat(diag,surgery)^Operation(surgery)^opDate=2005
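
The Global-As-View mappings above can be sketched in Python by writing each global relation as a function over the local sources, so that a query over the global schema is answered by unfolding those definitions. The sample rows and the diagnosis constant are hypothetical; only the relation names and attribute lists are taken from the slide.

```python
# Hypothetical sample rows for the local sources.
DB1 = [(1, "John Doe", "1940-01-01", "lung cancer"),
       (2, "Ann Lee", "1952-05-05", "ALS")]
DB2 = [(7, "Sam Roy", "1938-03-03", "lung cancer")]
DB3 = [(1, "morphine"), (2, "physiotherapy")]


def pc_patient():
    # pcPatient(name, diag): union of the projections of DB1 and DB2.
    return {(name, diag) for (_id, name, _dob, diag) in DB1 + DB2}


def pc_treat():
    # pcTreat(name, treat): join of DB1 and DB3 on the shared patient id.
    return {(name, treat)
            for (id1, name, _dob, _diag) in DB1
            for (id3, treat) in DB3 if id1 == id3}


def q(diag_value):
    # q(name, treat) :- pcPatient(name, diag_value), pcTreat(name, treat)
    return sorted((name, treat)
                  for (name, diag) in pc_patient() if diag == diag_value
                  for (other, treat) in pc_treat() if other == name)


answers = q("lung cancer")
```

Sam Roy matches the diagnosis but has no row in DB3, so he drops out of the join. In this style, adding a source means editing the definitions of the global relations themselves.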

Local-As-View (2)
  Query
    q(surgery,opDate) :- pcPatient(name,dob,doc,diag), year>2000, pcTreat(diag,surgery)
  Reformulated Query
    q1(surgery,opDate) :- V1(name,doc,diag,surg,opDate), V2(surg,date)
  The reformulated query is not the same as the original query; it only returns operations done in 2005

Combined View
  BYU-Global-Local-as-View (BGLaV)¹
    Each relation in the target (global) schema is predefined and independent of the source schema
    Sources are wrapped without reference to the global schema
    A set of mapping elements is generated by schema matching techniques between the two schemas
    When new sources are added, new source-to-target mappings must be created
    The user poses a query against the global schema; the query is reformulated by applying the generated source-to-target mappings

¹ Combining the Best of Global-as-View and Local-as-View for Data Integration, Xu and Embley, Nov
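
To make the Local-As-View (2) point concrete, the sketch below evaluates the reformulated query q1 over two small sources and compares it with what the original query asks for. The rows are hypothetical, and `surgery` is assumed to identify a single operation so that the join with V2 restricts the answer to operations listed there.

```python
# Hypothetical source contents: (name, doc, diag, surgery, opDate as a year).
V1 = [("John Doe", "Dr. A", "hernia", "S-101", 2003),
      ("Ann Lee", "Dr. B", "fracture", "S-102", 2005),
      ("Sam Roy", "Dr. A", "cataract", "S-103", 1999)]
# V2 only lists surgery done in 2005: (surgery, date).
V2 = [("S-102", 2005)]


def wanted():
    # What the original query asks for (operations after 2000),
    # treating V1 as the whole world for the sake of the comparison.
    return sorted((s, d) for (_n, _doc, _diag, s, d) in V1 if d > 2000)


def q1():
    # q1(surgery, opDate) :- V1(name, doc, diag, surg, opDate), V2(surg, date)
    return sorted((s1, op_date)
                  for (_n, _doc, _diag, s1, op_date) in V1
                  for (s2, _date) in V2 if s1 == s2)


reformulated = q1()
```

Here `reformulated` is a strict subset of `wanted()`: the 2003 operation is lost because V2 has no row for it. The rewriting is sound but not equivalent to the original query.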

Peer-to-Peer Data Integration (1)
  Both-as-View (BAV) Approach¹
    In BAV, schemas are mapped to each other using a sequence of bidirectional schema transformations called

¹ Defining Peer-to-Peer Data Integration using Both as View Rules, McBrien and Poulovassilis. 99

Peer-to-Peer Data Integration (2)

Peer-to-Peer Data Integration (3)

Peer-to-Peer Data Integration (4)

Need to Consider
  Maximum or Minimum Dataset in the Global
  Changes (e.g. Add, Delete) of Local Data
  Direct or Automatic (semi-automatic) Schema
  Addition of new Palliative Care Data Sources in
  XML and Web Service