Storing XML Data in Relational Databases

Shaghayegh Sahebi
Nesa Asoudeh

• A data model for XML documents
• Different approaches to storing XML data in relational databases
• Query processing
• Conclusion and future work

• The volume of XML data exchange is increasing explosively.
• Efficient mechanisms for managing XML data are therefore vital.
• One appropriate solution is to store XML data in relational databases.
• Different models have been proposed.
A Data Model for XML Documents

• XPath data model:
   – Root node
   – Element nodes
   – Attribute nodes
   – Text nodes
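As a rough illustration of these four node kinds, the sketch below walks a tiny document with Python's standard xml.dom.minidom and labels each node it meets. The sample document and the function name list_nodes are assumptions made for this example only.

```python
from xml.dom import minidom, Node

def list_nodes(xml_text):
    """Return (kind, name-or-value) pairs in document order."""
    doc = minidom.parseString(xml_text)
    found = [("root", "/")]  # the document itself plays the role of the root node

    def visit(node):
        if node.nodeType == Node.ELEMENT_NODE:
            found.append(("element", node.tagName))
            # Attributes hang off their element and are not child nodes.
            for i in range(node.attributes.length):
                found.append(("attribute", node.attributes.item(i).name))
        elif node.nodeType == Node.TEXT_NODE and node.data.strip():
            found.append(("text", node.data.strip()))
        for child in node.childNodes:
            visit(child)

    visit(doc.documentElement)
    return found

nodes = list_nodes('<book id="b1"><title>XML</title></book>')
```

Here nodes lists one root node, the book and title element nodes, the id attribute node, and the text node "XML".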
Different Approaches to Storing XML Data
in Relational Databases

• Two approaches to designing database schemas for XML documents:
   – Structure-mapping approach: database schemas represent the logical
     structure (or the DTDs, if they are available) of the target XML documents.
   – Model-mapping approach: database schemas represent constructs of the
     XML document model. A fixed database schema is used to store the
     structure of all XML documents.
Different Approaches to Storing XML
Data in Relational Databases (cont.)

• Classifying methods according to their dependency on
  document type definitions (DTDs):
   – DTD-dependent approaches: Basic Inlining, Shared Inlining,
     Hybrid Inlining
   – DTD-independent approaches: Edge, Edge-Value, XRel,
     XParent, ORD-Path, DLN, …
Edge and Edge-Value

• Among the first approaches.
• Model an XML document as an ordered, labeled,
  directed graph.
• Edge stores the whole graph in a single table:
   – Edge (Source, Target, Ordinal, Label, Flag, Value)
• Edge-Value uses two tables, moving the values into a table of their own:
   – Edge (Source, Target, Ordinal, Label, Flag)
   – Value (Node, Value)
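A minimal sketch of the single-table Edge variant, using Python's sqlite3 and ElementTree. The column names follow the slide. Everything else is an assumption for illustration: node ids come from a simple counter, the root hangs off a virtual source 0, Flag is 'val' for leaves and attributes and 'ref' otherwise, attributes get the label '@name' and ordinal 0, and mixed content is ignored.

```python
import sqlite3
import xml.etree.ElementTree as ET

def shred_to_edge(conn, xml_text):
    """Store one XML document as rows of a single Edge table."""
    conn.execute(
        "CREATE TABLE Edge (Source INTEGER, Target INTEGER, Ordinal INTEGER,"
        " Label TEXT, Flag TEXT, Value TEXT)"
    )
    counter = 0

    def add_edge(source, ordinal, label, flag, value):
        nonlocal counter
        counter += 1  # hypothetical id scheme: one fresh id per target node
        conn.execute("INSERT INTO Edge VALUES (?, ?, ?, ?, ?, ?)",
                     (source, counter, ordinal, label, flag, value))
        return counter

    def visit(elem, source, ordinal):
        is_leaf = len(elem) == 0
        text = (elem.text or "").strip()
        node = add_edge(source, ordinal, elem.tag,
                        "val" if is_leaf else "ref",
                        text if is_leaf else None)
        for name, value in elem.attrib.items():
            add_edge(node, 0, "@" + name, "val", value)
        for position, child in enumerate(elem, start=1):
            visit(child, node, position)

    visit(ET.fromstring(xml_text), 0, 1)

conn = sqlite3.connect(":memory:")
shred_to_edge(conn, '<book id="b1"><title>XML</title><author>Ann</author></book>')

# The path book/title becomes a self-join: one Edge row per step.
titles = conn.execute(
    "SELECT c.Value FROM Edge p JOIN Edge c ON c.Source = p.Target "
    "WHERE p.Label = 'book' AND c.Label = 'title'"
).fetchall()
```

Each additional step in a path adds one more self-join on Edge, which is the usual cost of this layout for long paths.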

• A variation of the Edge approach.
• Partitions the schema based on path labels.
• Uses n+1 tables for a path label l1.l2.….ln:
   – n tables, one for each prefix l1, l1.l2, …, l1.l2.….ln
   – One for text or CDATA
• Results in a large number of small tables.
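To make the n+1 count concrete, the sketch below lists the tables that a single path label would need under this partitioning. The function name path_tables and the ".cdata" suffix for the text table are hypothetical naming choices, not part of the approach as described.

```python
def path_tables(path_label):
    """List the tables needed for one path label l1.l2.….ln."""
    steps = path_label.split(".")
    # One table per prefix: l1, l1.l2, ..., l1.l2.….ln
    prefixes = [".".join(steps[:i]) for i in range(1, len(steps) + 1)]
    # Plus one table for the text or CDATA found at the end of the path.
    return prefixes + [path_label + ".cdata"]

tables = path_tables("library.book.title")
```

For a path of three labels this gives four tables, and every distinct path in the document adds its own, which is where the large number of small tables comes from.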

XRel

• Stores the XML data graph in four tables:
  –   Path (PathId, PathExp)
  –   Element (DocId, PathId, Start, End, Ordinal)
  –   Text (DocId, PathId, Start, End, Value)
  –   Attribute (DocId, PathId, Start, End, Value)
• Uses “regions” (the Start and End positions of each node) to capture
  parent-child relationships through containment, so it doesn’t need
  node identifiers.
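The region idea can be sketched with sqlite3: a node lies inside another when its (Start, End) interval is contained in the other's. The table layout follows the slide, with the Attribute table left out for brevity. The sample rows, the character offsets, and the plain '/book/title' path syntax are assumptions for this example.

```python
import sqlite3

# Offsets refer to: <book><title>XML</title><author>Ann</author></book>
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Path (PathId INTEGER PRIMARY KEY, PathExp TEXT);
CREATE TABLE Element (DocId INTEGER, PathId INTEGER,
                      "Start" INTEGER, "End" INTEGER, Ordinal INTEGER);
CREATE TABLE "Text" (DocId INTEGER, PathId INTEGER,
                     "Start" INTEGER, "End" INTEGER, Value TEXT);
""")
conn.executemany("INSERT INTO Path VALUES (?, ?)",
                 [(1, "/book"), (2, "/book/title"), (3, "/book/author")])
conn.executemany("INSERT INTO Element VALUES (?, ?, ?, ?, ?)",
                 [(1, 1, 0, 50, 1), (1, 2, 6, 23, 1), (1, 3, 24, 43, 1)])
conn.executemany('INSERT INTO "Text" VALUES (?, ?, ?, ?, ?)',
                 [(1, 2, 13, 15, "XML"), (1, 3, 32, 34, "Ann")])

# Elements whose region is contained in the region of /book.
descendants = conn.execute("""
    SELECT p.PathExp
    FROM Element a
    JOIN Element d ON d.DocId = a.DocId
                  AND a."Start" < d."Start" AND d."End" < a."End"
    JOIN Path p ON p.PathId = d.PathId
    WHERE a.PathId = 1
    ORDER BY d."Start"
""").fetchall()

# Text values found inside the same region.
texts_in_book = conn.execute("""
    SELECT t.Value
    FROM Element a
    JOIN "Text" t ON t.DocId = a.DocId
                 AND a."Start" < t."Start" AND t."End" < a."End"
    WHERE a.PathId = 1
    ORDER BY t."Start"
""").fetchall()
```

No node identifiers appear anywhere: the two interval comparisons in each join are all that ties a node to its ancestor.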

XParent

• Uses four tables to store an XML document:
   –   LabelPath (Id, Len, Path)
   –   DataPath (Pid, Cid)
   –   Element (PathId, Did, Ordinal)
   –   Data (PathId, Did, Ordinal, Value)

• Each node has an identifier.
• Parent-child relationships are stored in the DataPath table.
• UTX: uses ancestor-descendant relationships instead of
  parent-child ones.
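A small sqlite3 sketch of how these four tables might be queried. The schema follows the slide; the sample rows, the '/book/title' path syntax, and the choice to store each text value under the Did of its enclosing element are assumptions made for this example.

```python
import sqlite3

# Sample document: <book><title>XML</title><author>Ann</author></book>
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LabelPath (Id INTEGER PRIMARY KEY, Len INTEGER, Path TEXT);
CREATE TABLE DataPath (Pid INTEGER, Cid INTEGER);
CREATE TABLE Element (PathId INTEGER, Did INTEGER, Ordinal INTEGER);
CREATE TABLE Data (PathId INTEGER, Did INTEGER, Ordinal INTEGER, Value TEXT);
""")
conn.executemany("INSERT INTO LabelPath VALUES (?, ?, ?)",
                 [(1, 1, "/book"), (2, 2, "/book/title"), (3, 2, "/book/author")])
conn.executemany("INSERT INTO Element VALUES (?, ?, ?)",
                 [(1, 1, 1), (2, 2, 1), (3, 3, 1)])
conn.executemany("INSERT INTO DataPath VALUES (?, ?)", [(1, 2), (1, 3)])
conn.executemany("INSERT INTO Data VALUES (?, ?, ?, ?)",
                 [(2, 2, 1, "XML"), (3, 3, 1, "Ann")])

# Authors of the book whose title is 'XML': go from the title up to its
# parent through DataPath, then back down to the sibling author.
authors = conn.execute("""
    SELECT a.Value
    FROM Data t
    JOIN LabelPath lt ON lt.Id = t.PathId
    JOIN DataPath pt ON pt.Cid = t.Did
    JOIN DataPath pa ON pa.Pid = pt.Pid
    JOIN Data a ON a.Did = pa.Cid
    JOIN LabelPath la ON la.Id = a.PathId
    WHERE lt.Path = '/book/title' AND t.Value = 'XML'
      AND la.Path = '/book/author'
""").fetchall()
```

The path itself is matched with one lookup in LabelPath; joins on DataPath are needed only where the query has to relate two concrete nodes.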

ORD-Path

• Hierarchical labeling scheme.
• Determines ancestry relationships simply by comparing
  labels.
• Only positive, odd integers are assigned during an initial load.
• Even-numbered and negative integer component values
  are reserved for later insertions.
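The points above can be sketched in a few lines of Python, treating a label as a dotted list of integers. The helper names are hypothetical, and the label 1.4.1 for a node inserted between 1.3 and 1.5 is an illustrative value that uses a reserved even component; the sketch ignores the compact binary encoding a real implementation would use.

```python
def parse(label):
    """Turn '1.3.5' into (1, 3, 5)."""
    return tuple(int(part) for part in label.split("."))

def initial_child_labels(parent, count):
    """Initial load: children receive only positive, odd components."""
    return [f"{parent}.{2 * i + 1}" for i in range(count)]

def is_ancestor(a, b):
    """Label a is an ancestor of label b when a is a proper prefix of b."""
    pa, pb = parse(a), parse(b)
    return len(pa) < len(pb) and pb[:len(pa)] == pa

def document_order(labels):
    """Document order is component-wise comparison of the labels."""
    return sorted(labels, key=parse)

children = initial_child_labels("1", 3)
# A later insertion between 1.3 and 1.5 uses the reserved even value 4.
ordered = document_order(children + ["1.4.1"])
```

The inserted node still sorts between its neighbours and still has 1 as an ancestor, and no existing label had to change.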
DLN

• A new numbering scheme called DLN (Dynamic Level Numbering).
• Uses a fixed number of digits for level values (fixed length).
• Sub-values make room for later insertions:
   – For instance, to insert a node between the nodes with ids 1.1/1 and
     1.1/2, we can add a further sub-value level and assign 1.1/1/1 to
     the new node.
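The insertion example above can be checked with a short sketch. It assumes that "." separates levels and "/" separates sub-values within a level, and it uses plain integers rather than fixed-length digit strings; parse_dln is a hypothetical helper name.

```python
def parse_dln(label):
    """Turn '1.1/1/1' into ((1,), (1, 1, 1)): one tuple of sub-values per level."""
    return tuple(
        tuple(int(sub) for sub in level.split("/"))
        for level in label.split(".")
    )

# 1.1/1/1 was assigned to the node inserted between 1.1/1 and 1.1/2;
# 1.1/1.1 is a child of 1.1/1 and must come before the inserted sibling.
labels = ["1.1/2", "1.1/1/1", "1.1/1.1", "1.1/1"]
ordered = sorted(labels, key=parse_dln)
```

Comparing level by level, and sub-value by sub-value within a level, places the new node after 1.1/1 and its subtree but before 1.1/2, without relabeling either neighbour.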

• A variation of DLN.
• The number of insertions into and deletions from an XML
  document is not restricted.
• Uses k-dimensional identifiers for each node: (id1, io1, id2,
  io2, …, idk, iok)
• Each dimension has an id (a number calculated as in
  DLN) and an io (a number that shows the order of …
• Increments io by one when inserting a node with an existing id.
• Adds a dimension when creating a child for nodes with io > …
Querying XML Documents

• Query languages (Lorel, XML-QL, XML-GL, …)
   – XPath (Boolean expressions and functions; traversing
     the tree up, down, left, and right)
   – XQuery (XPath plus some other facilities, FLWR
     expressions, …)
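As a small illustration of XPath-style navigation, the sketch below uses Python's xml.etree.ElementTree, which supports only a subset of XPath (child and descendant steps, attribute predicates, and the parent step, but not the sibling axes). The sample document is an assumption for this example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<library>"
    '<book id="b1"><title>XML</title><author>Ann</author></book>'
    '<book id="b2"><title>SQL</title><author>Bob</author></book>'
    "</library>"
)

# Down: child steps from the root element.
titles = [t.text for t in root.findall("./book/title")]

# Predicate: select a book by the value of its id attribute.
second_title = root.find("./book[@id='b2']/title").text

# Up: from every author back to its parent element.
parent_ids = [b.get("id") for b in root.findall(".//author/..")]
```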

Conclusion and Future Work

• There are many different approaches to DTD-independent
  storage of XML data in relational databases, and each of
  them results in a different schema in a DBMS.
• The approaches are not efficient enough. They
  should consider bulk loading, reconstruction of the
  XML document, path traversal queries, parent-child
  relationships, ordered access, internal
  references, updating an XML document, …

