									Purely Relational XQuery
Database-Supported XML Processors




Torsten Grust
Technische Universität München

http://www-db.in.tum.de/~grust/   LUG Erding, Oct 2007
Prof. Dr. Torsten Grust   2   Technische Universität München
         A Database Guy’s View of
         XML, XPath, and XQuery
    •    Build XML processors that do not stumble once
         XML input volumes become sizable.
    •    Wait! Sizable data volume...? Use Relational Databases!


              <?xml version="1.0"?>
              <root>
                <elem att="val">
                  <!-- comment -->
                  text
                </elem>
              </root>

              let $doc := doc("in.xml")
              for $t in $doc//text()
              return
                count($t/../comment())



                                 OFF-THE-SHELF

                     pre  size  level
                      1    1    10
                      2    1    20
                      3    1    30
                      ⋮    ⋮    ⋮

                     SELECT   pre,size,
                              ROW_NUMBER (…)
                     FROM     t000
                     WHERE    value
                     GROUP BY iter,…
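
To make the idea of a tabular XML encoding concrete, here is a minimal sketch in Python. It assumes that pre is the preorder rank of a node, size the number of nodes below it, and level its depth. The numbering convention (starting at 0, including the document node), the kind column, and all function and table names are assumptions for illustration. Attributes are skipped, and SQLite stands in for an off-the-shelf database.

```python
import sqlite3
from xml.dom import Node, minidom

# Hypothetical node-kind labels; attributes are not encoded in this sketch.
KINDS = {Node.DOCUMENT_NODE: "doc", Node.ELEMENT_NODE: "elem",
         Node.TEXT_NODE: "text", Node.COMMENT_NODE: "comment"}

def encode(xml_text):
    """Return (pre, size, level, kind) rows in document order."""
    rows = []

    def walk(node, level):
        pre = len(rows)                    # preorder rank = arrival order
        rows.append(None)                  # placeholder, size not known yet
        for child in node.childNodes:
            walk(child, level + 1)
        size = len(rows) - pre - 1         # number of descendants
        rows[pre] = (pre, size, level, KINDS.get(node.nodeType, "other"))

    walk(minidom.parseString(xml_text), 0)
    return rows

# for $t in $doc//text() return count($t/../comment()), phrased over the table:
# the parent p encloses t one level up, comment children c sit one level below p.
QUERY = """
SELECT t.pre, COUNT(c.pre)
FROM   nodes AS t
JOIN   nodes AS p
       ON  p.pre < t.pre AND t.pre <= p.pre + p.size
       AND p.level = t.level - 1
LEFT JOIN nodes AS c
       ON  p.pre < c.pre AND c.pre <= p.pre + p.size
       AND c.level = p.level + 1 AND c.kind = 'comment'
WHERE  t.kind = 'text'
GROUP BY t.pre
ORDER BY t.pre
"""

def comments_per_text(rows):
    """Run the query above against an in-memory SQLite table of encoded nodes."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE nodes (pre INTEGER, size INTEGER, level INTEGER, kind TEXT)")
    con.executemany("INSERT INTO nodes VALUES (?, ?, ?, ?)", rows)
    return con.execute(QUERY).fetchall()

DOC = '<?xml version="1.0"?><root><elem att="val"><!-- comment -->text</elem></root>'

if __name__ == "__main__":
    for row in encode(DOC):
        print(row)
    print(comments_per_text(encode(DOC)))
```

The point of the sketch is that both XPath steps (parent and child) turn into range conditions on pre, size, and level, which a relational engine can evaluate with ordinary joins.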




        RDBMSs as Processors for
        Non-Relational Languages
    •    Relational databases contain the best understood and
         most scalable query processors available today.
    •    Can we use relational query engines as-is to efficiently
         process non-relational languages?
         [Diagram: XQuery (top) and SQL / Relational Algebra (bottom),
          connected by a mapping R and its inverse R⁻¹]
          Purely Relational XQuery
    •    Find a relational representation of XQuery (XML, XPath)
         that leverages the table-based processing model:
        •    Exploit functionality already built into the database
             system (do not invade the database kernel):

             SQL idioms, OLAP extensions, partitioned B-Trees.
        •    Run on top of off-the-shelf relational database systems:

             IBM DB2 V9, SQL Server 2005, PostgreSQL, kdb+ (kxsystems), MonetDB
                           Pathfinder
                          pathfinder-xquery.org

                                     XQuery

                           Pathfinder XQuery Compiler
                            (XPath/XQuery Semantics,
                                 XML Encoding)

                                    SQL / RA

                           Relational Database System
                                     (Kernel)
              Pathfinder & DB2

            Against 110+ MB of XML
            (DB2 Version 9 for Linux, UNIX, and Windows)
                    Slow Motion Replay

    Pipeline: XQuery → Pathfinder → SQL → DB2 Version 9 for Linux, UNIX, and Windows

    1. XQuery (XMark Q8)

         let $a := doc("XMark-110mb.xml")
         return
           <item person="{ $p/name/text() }">
            { count($ca) }
           </item>

    2. SQL:1999 (no SQL/XML)

         WITH
           t0000 (iter,pre) AS
             SELECT … FROM … WHERE …
           t0001 (iter) AS
             SELECT … FROM … WHERE …
         SELECT     pre,size,kind,…
         FROM       …
         WHERE      …
         ORDER BY   …

    3. Relational Encoding of Result

           pre     size  kind
         5072509    2     1
         5072510    0     2
         5072511    0     3

    4. Serialized XML Text

         <?xml version="1.0" encoding="UTF-8"?>
         <XQuery>
           <item person="Jixiang …">0</item>
           <item person="Harpreet …">0</item>
         </XQuery>

                    Table Access Modes
   •     Relational query engines derive much of their efficiency
         from the simplicity of the table of tuples model.
        1. Sequential scans:
           read-ahead on disks,
           prefetching CPU caches.

        2. Index-based access:
           B-Trees, range scans.

   •     But: We will benefit only if we actually operate the
         relational database in this bulk-oriented mode.
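
The two access modes can be observed on a small scale. The following is a minimal sketch that uses SQLite through Python's sqlite3 module as a stand-in for a full relational system. The table t, its columns, and the index name t_a are made up for illustration, and the exact wording of the plan output varies between SQLite versions.

```python
import sqlite3

def access_plans():
    """Return the query plans for a non-indexed filter and an indexed range query."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (a INTEGER, b TEXT)")          # hypothetical table
    con.executemany("INSERT INTO t VALUES (?, ?)",
                    [(i, "v%d" % i) for i in range(1000)])
    con.execute("CREATE INDEX t_a ON t (a)")                   # B-tree index on a

    def plan(sql):
        # EXPLAIN QUERY PLAN returns rows whose last column describes one plan step.
        return " | ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

    scan = plan("SELECT * FROM t WHERE b = 'v7'")               # no index on b
    index = plan("SELECT * FROM t WHERE a BETWEEN 10 AND 20")   # range over indexed a
    return scan, index

if __name__ == "__main__":
    for description in access_plans():
        print(description)
```

The first plan should report a scan of the whole table, the second a search that uses the index for the range predicate.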
                     The Core of XQuery
  (e1,..,en)                           ordered item sequences
  for $x in e1 return e2               iteration
  let $x := e1 return e2               variable binding
  if (e1) then e2 else e3              conditionals
  <t> { e1 } </t>                      XML node construction
  e1/child::t, e1/following::t         XPath location steps
  unordered { e1 }                     local order indifference
  fn:doc(e1), fn:data(e1)              built-in functions

   + Document order, node identity, dynamic typing, schema
     validation, user-defined functions, ...
   •     The XQuery–SQL gap appears substantial.
                          FP-Style Iteration
   •     The XQuery for..in..return construct performs
         side-effect free iteration:

                            for $x in (e1,..,en)
                            return b

                                     ≡

                            ( b[e1/$x],..,b[en/$x] )


   •     Individual iterations cannot interfere and may be
         evaluated in any order (or in parallel).
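
A small sketch of why the evaluation order does not matter, written in Python. The helper name for_each is hypothetical, and sequences are modelled as plain Python lists. The body is a pure function, so the iterations may run in any order as long as the results are concatenated in binding order.

```python
import random

def for_each(sequence, body):
    """Model `for $x in sequence return body($x)` with a shuffled evaluation order."""
    results = [None] * len(sequence)
    order = list(range(len(sequence)))
    random.shuffle(order)                  # iterations are independent of each other
    for i in order:
        results[i] = body(sequence[i])     # body has no side effects
    # Per-iteration results are concatenated in binding order, not evaluation order.
    return [item for result in results for item in result]

if __name__ == "__main__":
    print(for_each([10, 20, 30], lambda x: [x + 42]))
```

Because each iteration only depends on its own binding of $x, the same reasoning allows all iterations to be evaluated at once, which is what a set-oriented query engine does.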
                          Loop Lifting
 •    “Relational loop unrolling”:
      For any XQuery subexpression, generate a relational
      plan that evaluates the expression in all iterations.
           for $x in (10,20,30)
           return $x + 42

           Subexpression 42 (one row per iteration):

             iter  pos  item
               1    1    42
               2    1    42
               3    1    42

           Subexpression $x (one row per iteration):

             iter  pos  item
               1    1    10
               2    1    20
               3    1    30

           let $x := (10,20,30)
           return $x

           Subexpression $x (a single iteration):

             iter  pos  item
               1    1    10
               1    2    20
               1    3    30
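
The iter|pos|item tables can be generated mechanically. The following Python sketch uses hypothetical helper names and represents each table as a list of (iter, pos, item) tuples. It only covers the three cases shown here: a for-bound variable, a constant inside a loop, and a let-bound variable.

```python
def lift_for_variable(bindings):
    """$x inside `for $x in bindings`: one iteration per item, each at position 1."""
    return [(it, 1, item) for it, item in enumerate(bindings, start=1)]

def lift_constant(value, iterations):
    """A constant inside a loop: the same single item in every iteration."""
    return [(it, 1, value) for it in range(1, iterations + 1)]

def lift_let_variable(bindings):
    """$x after `let $x := bindings`: one iteration holding the whole sequence."""
    return [(1, pos, item) for pos, item in enumerate(bindings, start=1)]

if __name__ == "__main__":
    print(lift_constant(42, 3))
    print(lift_for_variable([10, 20, 30]))
    print(lift_let_variable([10, 20, 30]))
```

The difference between for and let shows up only in the iter and pos columns: for spreads the items over iterations, let keeps them as one sequence.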
                             Loop Lifting
                             for $x in (10,20,30)
                             return $x + 42

  Inputs ($x on the left, 42 on the right):

    iter  pos  item                iter1  pos1  item1
      1    1    10                   1     1     42
      2    1    20                   2     1     42
      3    1    30                   3     1     42

  1. ⋈ iter = iter1

    iter  pos  item  iter1  pos1  item1
      1    1    10     1     1     42
      2    1    20     2     1     42
      3    1    30     3     1     42

  2. + item2:(item,item1)

    iter  pos  item  iter1  pos1  item1  item2
      1    1    10     1     1     42      52
      2    1    20     2     1     42      62
      3    1    30     3     1     42      72

  3. π iter,pos,item2

    iter  pos  item2
      1    1     52
      2    1     62
      3    1     72
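
The join, addition, and projection can be written as a single SQL query. This is a minimal sketch using SQLite from Python. The table names x and c are made up for illustration, and the SQL that a compiler such as Pathfinder actually emits will look different.

```python
import sqlite3

def loop_lifted_add():
    """Evaluate `for $x in (10,20,30) return $x + 42` over iter|pos|item tables."""
    con = sqlite3.connect(":memory:")
    # Hypothetical table names: x encodes $x, c encodes the constant 42.
    con.execute("CREATE TABLE x (iter INTEGER, pos INTEGER, item INTEGER)")
    con.execute("CREATE TABLE c (iter1 INTEGER, pos1 INTEGER, item1 INTEGER)")
    con.executemany("INSERT INTO x VALUES (?, ?, ?)",
                    [(1, 1, 10), (2, 1, 20), (3, 1, 30)])
    con.executemany("INSERT INTO c VALUES (?, ?, ?)",
                    [(1, 1, 42), (2, 1, 42), (3, 1, 42)])
    sql = """
        SELECT x.iter, x.pos, x.item + c.item1 AS item2   -- + and projection
        FROM   x JOIN c ON x.iter = c.iter1               -- join on iteration
        ORDER BY x.iter, x.pos
    """
    return con.execute(sql).fetchall()

if __name__ == "__main__":
    print(loop_lifted_add())
```

The equi-join on iter pairs up the operands that belong to the same iteration, so the addition for all iterations happens in one set-oriented pass.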
                          Loop Lifting
for $x in (1,2,3,4)
return if ($x mod 2 = 0) then "even" else "odd"

  Encoding of the condition $x mod 2 = 0:

  iter    pos     item
    1      1      false
    2      1       true
    3      1      false
    4      1       true

  Plan for the conditional (input: the condition table above):

    then branch:   σ item    →   π iter   →   ×  (pos, item) = (1, "even")
    else branch:   σ ¬item   →   π iter   →   ×  (pos, item) = (1, "odd")

    Both branches are merged by a disjoint union (⋅⋃).

  Result of σ item:

    iter    pos  item
      2      1   true
      4      1   true

                          Loop Lifting
for $x in (1,2,3,4)
           $x         0
return if ($x mod 2 = 0) then “even” else “odd”
                                   iter
                                     2
                                     4

                           σ
                          item
                                           π
                                          iter
                                                      ×
  iter    pos     item        pos item
    1      1      false        1 “even”
    2
    3
           1
           1
                   true
                  false
                                                            ⋅
                                                            ⋃
                                 pos item
    4      1       true           1 “odd”


                            σ              π          ×
                          ¬ item          iter
Prof. Dr. Torsten Grust                          13       Technische Universität München
                          Loop Lifting
for $x in (1,2,3,4)
           $x         0
return if ($x mod 2 = 0) then “even” else “odd”
                                               iter   pos item
                                                 2     1 “even”
                                                 4     1 “even”

                           σ
                          item
                                    π
                                   iter
                                               ×
  iter    pos     item        pos item
    1      1      false        1 “even”
    2
    3
           1
           1
                   true
                  false
                                                            ⋅
                                                            ⋃
                                 pos item
    4      1       true           1 “odd”


                            σ       π          ×
                          ¬ item   iter
Prof. Dr. Torsten Grust                   13              Technische Universität München
                          Loop Lifting
for $x in (1,2,3,4)
           $x         0
return if ($x mod 2 = 0) then “even” else “odd”

                                                         iter   pos item
                                                           1     1 “odd”

                           σ        π          ×           2     1 “even”
                          item     iter                    3      1   “odd”
  iter    pos     item        pos item                     4      1   “even”
    1      1      false        1 “even”
    2
    3
           1
           1
                   true
                  false
                                                     ⋅
                                                     ⋃
                                 pos item
    4      1       true           1 “odd”


                            σ       π          ×
                          ¬ item   iter
Prof. Dr. Torsten Grust                   13       Technische Universität München
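
The conditional plan can be traced the same way. This is a sketch under the assumption that relations are lists of dicts; select, project, cross and disjoint_union are hypothetical helper names chosen for this illustration, not part of any real compiler.

```python
def select(rel, col, negate=False):
    """σ col (or σ ¬col when negate is set): keep rows by a Boolean column."""
    return [row for row in rel if bool(row[col]) != negate]

def project(rel, cols):
    """π: keep only the listed columns."""
    return [{c: row[c] for c in cols} for row in rel]

def cross(rel, const_rel):
    """×: pair every row with every row of a (small) constant relation."""
    return [{**l, **r} for l in rel for r in const_rel]

def disjoint_union(a, b):
    """Union of relations that share no iter value; sorted only for display."""
    return sorted(a + b, key=lambda row: row["iter"])

# Loop-lifted condition ($x mod 2 = 0), one row per iteration.
cond = [{"iter": i, "pos": 1, "item": x % 2 == 0}
        for i, x in enumerate((1, 2, 3, 4), start=1)]

# then branch: iterations in which the condition holds.
then_iters = project(select(cond, "item"), ["iter"])
then_rows = cross(then_iters, [{"pos": 1, "item": "even"}])

# else branch: iterations in which the condition does not hold.
else_iters = project(select(cond, "item", negate=True), ["iter"])
else_rows = cross(else_iters, [{"pos": 1, "item": "odd"}])

result = disjoint_union(then_rows, else_rows)
for row in result:
    print(row["iter"], row["pos"], row["item"])
```

Each branch is evaluated only for the iterations that selected it, so the two branch results cannot overlap and a plain disjoint union suffices.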
                     Intermediate Code:
                      Relational Algebra

                          XQuery
          Pathfinder XQuery Compiler
                 Relational Algebra
                  σ π     ⋈  ⊍ ⋯
       Code Gen     Code Gen        Code Gen
         SQL        Q (kxsystems)     MIL

    •   Compiles into RA that mimics capabilities of
        SQL:1999 query engines.
    •   RA dialect is ...
        1. primitive
        2. explicit
        3. designed to be easily analyzable
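
To make the "Code Gen" step concrete, here is a hand-written sketch of what SQL for the earlier plan of `for $x in (10,20,30) return $x + 42` could look like. The SQL text is an assumption made for illustration, not output of the Pathfinder compiler, and SQLite (via Python's sqlite3 module) merely stands in for a SQL:1999 engine. Table names x and const are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Loop-lifted encodings of $x and of the constant 42 (hypothetical table names).
conn.executescript("""
    CREATE TABLE x     (iter  INTEGER, pos  INTEGER, item  INTEGER);
    CREATE TABLE const (iter1 INTEGER, pos1 INTEGER, item1 INTEGER);
    INSERT INTO x     VALUES (1, 1, 10), (2, 1, 20), (3, 1, 30);
    INSERT INTO const VALUES (1, 1, 42), (2, 1, 42), (3, 1, 42);
""")

# Join on iter = iter1, add the items, project iter, pos, item2.
plan_sql = """
    SELECT x.iter, x.pos, x.item + const.item1 AS item2
    FROM   x JOIN const ON x.iter = const.iter1
    ORDER  BY x.iter
"""

rows = conn.execute(plan_sql).fetchall()
for iter_, pos, item2 in rows:
    print(iter_, pos, item2)
```

Because the algebra only uses operators a SQL engine already has (join, arithmetic, projection), the translation is a direct one-to-one mapping in this small case.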
                          Typical Plans ...

[Figure: example plan; image not preserved]

                       Relational
                   XQuery Optimization
   •     We use concepts in the relational domain to analyze
         plans and to steer simplification and optimization.
   •     Detect join-like XQuery expressions. XQuery’s
         syntactic diversity must not affect the detection:
                          let $d := fn:doc(⋯)
                          return for $a in $d//a
                                   for $b in $d//b
                                   where $b/@c = $a/@d
                                   return $b

                          let $d := fn:doc(⋯)
                          return $d//b[@c = $d//a/@c]

   •     For both queries, the value-based XQuery join surfaces
         as a multi-valued dependency in the algebraic plans.
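
A small sketch of why both spellings are join-like. The toy nodes below are invented for illustration (the slide elides the document), nodes are modelled as dicts of attribute values, and the function names are hypothetical. The hash lookup is just one possible join algorithm, not a claim about what any particular back-end picks.

```python
from collections import defaultdict

# Hypothetical <a> and <b> elements, reduced to their attributes.
a_nodes = [{"id": "a1", "c": "x", "d": "x"},
           {"id": "a2", "c": "y", "d": "z"}]
b_nodes = [{"id": "b1", "c": "x"},
           {"id": "b2", "c": "y"},
           {"id": "b3", "c": "z"}]

def nested_for_where(a_nodes, b_nodes):
    """First query, read literally: two nested loops and a where clause."""
    return [b["id"] for a in a_nodes for b in b_nodes if b["c"] == a["d"]]

def as_value_join(a_nodes, b_nodes):
    """Same result as a value-based join on b/@c = a/@d (hash lookup)."""
    by_c = defaultdict(list)
    for b in b_nodes:
        by_c[b["c"]].append(b["id"])
    return [b_id for a in a_nodes for b_id in by_c.get(a["d"], [])]

def predicate_form(a_nodes, b_nodes):
    """Second query: keep each b whose @c equals the @c of some a."""
    a_values = {a["c"] for a in a_nodes}
    return [b["id"] for b in b_nodes if b["c"] in a_values]

print(nested_for_where(a_nodes, b_nodes))
print(as_value_join(a_nodes, b_nodes))
print(predicate_form(a_nodes, b_nodes))
```

The first two functions return the same list, which is the point: once the expression is recognised as a join, the nested iteration can be replaced by any join algorithm. The predicate form compares different attributes and returns each b at most once, so it is a related existence test rather than the identical query.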
Rewriting XMark Q8

[Figure: algebraic plan for XMark Q8, drawn as an operator graph. Operator labels that survive extraction: ¶ (column lists such as iter, pos, item:res), |X| (iter = inner), |X| (iter = outer), |X| (iter = iter1), SEL (item), NOT, = (res:<item, item1>), U, DIFF, DISTINCT, ROW# (pos:<item>/iter), ROW# (pos1:<sort, pos>/outer), NUMBER (inner), CAST (type: str, type: uA), @ (pos), val: #1, @ (item), val: true / false / 1, access attribute value, access textnode content, and the path steps /| child::element site, closed_auctions, closed_auction, buyer, person, name, /| child::text, /| attribute::attribute person, /| attribute::attribute id.]




Prof. Dr. Torsten Grust   17   Technische Universität München
                                                                                                                                                                                                       DISTINCT




                                                                                                                                                                                                      ¶ (iter, item)




                                                                                                                                                                                                 ROW# (pos:<item>/iter)




                                                                                                                                                                                    /| child::element people { item* } (iter, item)




                                                                                                                                                                                                               ¶ (iter, item)
                                                                                                     ROW# (pos1:<sort>)




Rewriting XMark Q8
     •    Rewriting phases:
         1. Constant propagation and projection pushdown,

[Figure: relational algebra plan (operator DAG) for XMark Q8 over the document "auctionG.xml"; the diagram layout is not recoverable from the text extraction. Operators appearing in the plan: ¶ (projection), @ (attach constant column), ROW#, NUMBER, DISTINCT, |X| (join), U (union), DIFF, SEL, NOT, =, CAST, COUNT, fn:string_join, the step operators child::element (site, people, person, name, closed_auctions, closed_auction, buyer), child::text, and attribute::attribute (id, person), the constructors ELEM, ATTR, and TEXT, and the fragment operators FRAG_UNION, FRAGs, ROOTS, EMPTY_FRAG, and DOC.]

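Phase 1 (constant propagation and projection pushdown) can be illustrated on a toy plan. The following is a minimal sketch in Python, not the rewriter used in the talk: the classes `Table`, `Attach`, and `Project` and the functions `constants` and `prune` are hypothetical names introduced here, modelling only the `@` (attach constant column) and `¶` (projection) operators seen in the plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:            # leaf: a base relation with a fixed schema (assumed)
    name: str
    cols: tuple


@dataclass(frozen=True)
class Attach:           # '@ (col), val: v': add a column holding the constant v
    child: object
    col: str
    val: object


@dataclass(frozen=True)
class Project:          # '¶ (cols)': keep only the listed columns (no renaming)
    child: object
    cols: tuple


def constants(plan):
    """Constant propagation: columns known to hold one constant value."""
    if isinstance(plan, Table):
        return {}
    if isinstance(plan, Attach):
        return {**constants(plan.child), plan.col: plan.val}
    if isinstance(plan, Project):
        known = constants(plan.child)
        return {c: known[c] for c in plan.cols if c in known}
    raise TypeError(plan)


def prune(plan, needed):
    """Projection pushdown: keep only what feeds the columns in `needed`."""
    if isinstance(plan, Table):
        kept = tuple(c for c in plan.cols if c in needed)
        return plan if kept == plan.cols else Project(plan, kept)
    if isinstance(plan, Attach):
        if plan.col not in needed:
            # The attached column is never consumed: drop the '@' operator.
            return prune(plan.child, needed)
        return Attach(prune(plan.child, needed - {plan.col}), plan.col, plan.val)
    if isinstance(plan, Project):
        return prune(plan.child, needed & set(plan.cols))
    raise TypeError(plan)


# Toy plan: ¶(iter, item, pos) over @ord:2 over @pos:1 over a base table.
doc = Table("doc", ("iter", "item", "kind"))
plan = Project(Attach(Attach(doc, "pos", 1), "ord", 2), ("iter", "item", "pos"))

known = constants(plan)                  # which output columns are constant
slim = prune(plan, {"iter", "item"})     # plan if only iter and item are consumed
```

In this sketch, pruning for `{"iter", "item"}` removes both `@` operators and leaves a single projection directly above the base table, which mirrors the kind of plan shrinkage the first rewriting phase aims for.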
Rewriting XMark Q8
     •    Rewriting phases:
         1. Constant propagation and projection pushdown,
         2. functional dependency and data flow analysis,

[Figure: relational algebra plan (operator DAG) for XMark Q8 at this rewriting stage, rooted at SERIALIZE; the diagram layout is not recoverable from the text extraction. Operators appearing in the plan: ¶ (projection), @ (attach constant column), ROW#, DISTINCT, |X| (join), U (union), DIFF, SEL, =, CAST, COUNT, fn:string_join, access attribute value, the constructors ELEM, ATTR, and TEXT, and the fragment operators FRAG_UNION, FRAGs, and ROOTS.]




                                                                                                                                                                                                                              ¶ (item3, iter2, item2)




                                                                                                                                                                                                                                    |X| (iter1 = iter3)




                                                                                                                                                                                                                                 ¶ (iter1:iter3, item3)




                                                                                                                                                                                                                 /| attribute::attribute id { atomic* } (iter3, item3)              ¶ (iter2, item2:item1, iter3:iter1)




                                                                                                                                                                                                                                             ¶ (item3:item, iter3:iter1)




                                                                                                                                                                                                                                                                              NUMBER (iter1)




                                                                                                                                                                                                                                                                            ¶ (item, iter2, item1)




                                                                                                                                                                                                                                                                       CAST (item1:<item3>), type: uA




                                                                                                                                                                                                                                                                    access attribute value (item3:<item2>)




                                                                                                                                                                                                                                                                                                               ¶ (item2, item, iter2)




                                                                                                                                                                                                                                                                                                                           |X| (iter = iter3)




                                                                                                                                                                                                                                                                                                                          ¶ (iter:iter3, item2)




                                                                                                                                                                                                                                                                                                      /| attribute::attribute person { atomic* } (iter3, item2)                 ¶ (item:item1, iter2:iter, iter3:iter)




                                                                        access textnode content (item1:<item>)                                                                                                                                                                                                             /| child::element buyer { item* } (iter3, item2)




                                                                              ROW# (pos1:<item>/iter1)                                                                                                                                                                                                                                                            ¶ (item2:item, iter3:iter)




                                                                                /| child::text (iter1, item)                                                                                                                                                                                                                                                                                         NUMBER (iter)




                                                                                         /| child::element name { item* } (iter1, item)




                                                                                                                ¶ (item:item1, iter1:iter)                                                                                                                                                                                                                                                                ¶ (item1, iter1:iter)




                                                                                                                                                 NUMBER (iter)                                                                                                                                                                                                                         ¶ (item)




                                                                                                                                             ROW# (pos1:<item1>)




                                                                                                                                                   ¶ (item1)




                                                                                                                                /| child::element person { item* } (iter, item1)                                                                                                                                                                                  /| child::element closed_auction { item* } (iter, item)




                                                                                                                                                                                                                   /| child::element people { item* } (iter, item1)                                                                                                                   /| child::element closed_auctions { item* } (iter, item)




                                                                                                                                                                                                                                                                                 /| child::element site { item* } (iter, item1)                                                                            /| child::element site { item* } (iter, item)




                                                                                                                                                                                                                         FRAG_UNION                                                            @ (iter), val: #1




                                            EMPTY_FRAG                                                                                                                                                                                                                                          ¶ (item1:item)




                                                                                                                                                                                                                                                                   FRAGs                             ROOTS




                                                                                                                                                                                                                                                                                    DOC




                                                                                                                                                                                                                                                                              TBL: (iter | item)
                                                                                                                                                                                                                                                                            [#1,"auctionG.xml"]




Rewriting XMark Q8
     •    Rewriting phases:
         1. Constant propagation and
            projection pushdown,
         2. functional dependency and
            data flow analysis,
         3. algebraic XQuery join detection,

[Figure: algebra plan (DAG) for XMark Q8 after join detection. The selection has been replaced by a value join |X| (item1 = item) between the buyer/@person branch and the person/@id branch, followed by DISTINCT and COUNT (item1:/iter1). The plan now shows a single child::element site step; the input table is TBL: (iter | item1) = [#1, "auctionG.xml"].]
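
To make phase 3 concrete, here is a minimal Python sketch. It is not the rewriter's actual code, and every name and the toy data are assumptions for illustration. It evaluates the Q8 predicate buyer/@person = person/@id twice: once as a nested iteration that tests every pair, and once as a hash-based equi-join of the kind that the |X| (item1 = item) operator in the plan allows a database back-end to choose. Both produce the same (person, auction) pairs.

```python
from collections import defaultdict

# Toy stand-ins for person/@id and closed_auction/buyer/@person (assumed data).
person_ids = ["p0", "p1", "p2"]
buyer_refs = ["p1", "p0", "p1"]          # one entry per closed_auction


def nested_iteration(persons, buyers):
    """Per-person loop with a selection: tests every (person, auction) pair."""
    return sorted(
        (p, a)
        for p, pid in enumerate(persons)
        for a, ref in enumerate(buyers)
        if ref == pid
    )


def equi_join(persons, buyers):
    """Value-based join: index one input once, then probe with the other."""
    auctions_by_ref = defaultdict(list)
    for a, ref in enumerate(buyers):
        auctions_by_ref[ref].append(a)
    return sorted(
        (p, a)
        for p, pid in enumerate(persons)
        for a in auctions_by_ref.get(pid, [])
    )


pairs_loop = nested_iteration(person_ids, buyer_refs)
pairs_join = equi_join(person_ids, buyer_refs)
print(pairs_join)
```

The nested version does work proportional to the product of both inputs, whereas the join version touches each input once plus the matches. That difference is the reason for exposing the join to the relational engine.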




Rewriting XMark Q8
     •    Rewriting phases:
         1. Constant propagation and
            projection pushdown,
         2. functional dependency and
            data flow analysis,
         3. algebraic XQuery join detection,
         4. XPath join graph isolation,

[Figure: algebra plan for XMark Q8 after join graph isolation. From the top: SERIALIZE, ROOTS, twig (iter, item), ELEM with ATTR and TEXT constructors, then UNION / Attach (item1), val: 0 / DIFF / COUNT (item1:/iter) above ThetaJoin (item eq item1). The path steps (site, people, person, name, text; closed_auctions, closed_auction, buyer, attribute person; attribute id) carry level=1 to level=5 annotations and start from DOC (iter, item) over TBL: (iter | item) = [#1, "auctionG.xml"].]
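
The COUNT / DIFF / Attach 0 / UNION group in the plan appears to handle iterations without a join partner: a grouped count yields rows only for groups that exist, so iterations with no match receive a count of 0 separately. A minimal Python sketch of that pattern follows; the data and all names are hypothetical and only mirror the operator labels in the figure.

```python
from collections import Counter

all_iters = [1, 2, 3]                        # one iteration per person (assumed)
joined = [(1, "a7"), (2, "a3"), (2, "a9")]   # (iter, matching auction) rows


def count_per_iter(iters, rows):
    """Grouped COUNT, then fill in 0 for iterations without any row."""
    counted = Counter(it for it, _ in rows)        # COUNT (item1:/iter)
    missing = set(iters) - set(counted)            # DIFF
    zeros = {it: 0 for it in missing}              # Attach (item1), val: 0
    return sorted({**counted, **zeros}.items())    # UNION


result = count_per_iter(all_iters, joined)
print(result)
```

Iteration 3 has no matching auction, so it only appears in the result because of the difference-and-attach branch.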




Rewriting XMark Q8
     •    Rewriting phases:
         1. Constant propagation and
            projection pushdown,
         2. functional dependency and
            data flow analysis,
         3. algebraic XQuery join detection,
         4. XPath join graph isolation,
         5. data guide exploitation (“Node GPS”).

     [Figure: algebra plan for XMark Q8 after rewriting. Top: SERIALIZE, FRAG UNION, and node construction (twig, ELEM, ATTR, TEXT, CAST to str). Middle: COUNT of item1 per iter, combined via ∪, \ and @item1:0, above an equi-join item = item1. Bottom: XPath steps annotated with their data guide positions: descendant::element(person) (GPS = 798, level = 3), attribute(id) (GPS = 799, level = 4), descendant::text() (GPS = 802, level = 5), descendant::element(buyer) (GPS = 959, level = 4), attribute(person) (GPS = 960, level = 5), all reading from DOC (iter, item) over TBL: (iter | item).]

             Pathfinder & IBM DB2 V9, 115 MB XML input data
             < 1 sec
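
Phase 3 can be illustrated outside the algebra. XMark Q8 pairs each person with the closed auctions whose buyer/@person equals the person's @id and counts the matches, which the plan expresses as an equi-join item = item1 followed by a COUNT per iter. The sketch below contrasts nested iteration with a join-then-count strategy. It is a minimal illustration only: the sample data, the function names, and the dictionary-based grouping are assumptions, not Pathfinder code.

```python
from collections import Counter

# Hypothetical miniature of the XMark data: person/@id and buyer/@person values.
person_ids = ["person0", "person1", "person2"]
buyer_refs = ["person1", "person0", "person1", "person1"]


def q8_nested(persons, buyers):
    """Nested iteration: rescan all buyers once per person."""
    return [(p, sum(1 for b in buyers if b == p)) for p in persons]


def q8_join(persons, buyers):
    """Join detection: group the buyer side once, then probe it per person."""
    per_buyer = Counter(buyers)  # COUNT grouped by the join key
    # Persons without any purchase still appear, with count 0.
    return [(p, per_buyer.get(p, 0)) for p in persons]


result = q8_join(person_ids, buyer_refs)
```

Both functions return the same pairs. The nested variant touches |persons| × |buyers| combinations, whereas the join variant scans each input once, which is the kind of saving join detection aims for.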
                                  Can DB2 Cope?

     [Figure: execution plan produced by DB2 Version 9 for Linux, UNIX, and Windows; the right edge of the plan is cut off in the source.
      Operators as labeled, top down:
        SORT(33) 373.32
        HSJOIN(35) 373.32
        NLJOIN(37) 75.02, HSJOIN(43) 298.3
        IXSCAN(41) 50.01, NLJOIN(45) 149.15, NLJOIN(
        IXSCAN(53) 50.01, NLJOIN(47) 75.02, NLJOIN(
        IXSCAN(51) 50.01, IXSCAN(49) 50.01, IXSCAN(63) 50.0
      Indexes scanned: IDX_GUIDE_PRE, IDX_GUIDE_PRE_VALUE, IDX_GUIDE_PRE_PRE_PLUS_SIZE, IDX_VALUE_KIND_PRE_SIZE, IDX_GUIDE_PRE_VAL (name cut off), several of them marked *); all sit on top of table XMARK_1.DOC.]

     *) Indexes advised by DB2 itself
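
The index names in the plan hint at a node table with columns such as pre, size, kind, value, and guide. The following sketch shows how an XPath descendant step can turn into a range join over such a table, with a data guide number selecting the context nodes. The schema, the column meanings, the sample document, and the index definition are assumptions made for illustration; SQLite stands in for DB2 here.

```python
import sqlite3

# Assumed schema: one row per XML node.
#   pre   = preorder rank, size = number of descendants, level = depth,
#   guide = number of the node's root-to-node path in a data guide.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE doc (pre INTEGER PRIMARY KEY, size INTEGER, level INTEGER,"
    " kind TEXT, name TEXT, value TEXT, guide INTEGER)"
)
# <site><people><person><name>Ann</name></person>
#               <person><name>Bob</name></person></people></site>
conn.executemany(
    "INSERT INTO doc VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (0, 7, 0, "elem", "site", None, 1),
        (1, 6, 1, "elem", "people", None, 2),
        (2, 2, 2, "elem", "person", None, 3),
        (3, 1, 3, "elem", "name", None, 4),
        (4, 0, 4, "text", None, "Ann", 5),
        (5, 2, 2, "elem", "person", None, 3),
        (6, 1, 3, "elem", "name", None, 4),
        (7, 0, 4, "text", None, "Bob", 5),
    ],
)
# Hypothetical index; the real column lists behind the plan's indexes are not shown.
conn.execute("CREATE INDEX idx_guide_pre ON doc (guide, pre)")

# person/descendant::text(): context nodes are picked by guide number,
# descendants are the rows with pre in (c.pre, c.pre + c.size].
DESCENDANT_TEXT = """
    SELECT c.pre, d.value
    FROM doc AS c
    JOIN doc AS d ON d.pre > c.pre AND d.pre <= c.pre + c.size
    WHERE c.guide = 3 AND d.kind = 'text'
    ORDER BY d.pre
"""
result = conn.execute(DESCENDANT_TEXT).fetchall()
```

Because the step is an ordinary join with range predicates, a relational optimizer is free to choose index scans and join methods for it, as in the plan above.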
                          Order Indifference
    •    Order in XML and XQuery processing is essential:
     let $warning := <p>Do <em>not</em> press button,
                        computer will <em>explode!</em></p>
     return fn:distinct-doc-order( for $x in $warning/descendant-or-self::*
                                   return $x/text()     )

      Do not press button, computer will explode!

    •    Without fn:distinct-doc-order, the text nodes arrive in iteration order:
     let $warning := <p>Do <em>not</em> press button,
                        computer will <em>explode!</em></p>
     return for $x in $warning/descendant-or-self::*
            return $x/text()

      Do press button, computer will not explode!

    •    Order-awareness thus is often
         deeply wired into XQuery processors.
    •    Supporting constructs like
         unordered { e1 } is challenging.
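
The difference between the two results can be replayed with plain lists. The sketch below assumes, as the printed results imply, that the loop visits the p element and then its two em descendants, and that every text node carries a document-order rank. The ranks, the names, and the whitespace-normalized strings are illustrative assumptions.

```python
# Text nodes of the warning paragraph, listed per element.
# Each text node is (document-order rank, normalized string).
text_children = {
    "p": [(1, "Do"), (3, "press button, computer will")],
    "em_1": [(2, "not")],
    "em_2": [(4, "explode!")],
}
elements_in_doc_order = ["p", "em_1", "em_2"]


def iteration_order():
    """for $x in ... return $x/text(): results are concatenated per iteration."""
    return [t for x in elements_in_doc_order for t in text_children[x]]


def distinct_doc_order(nodes):
    """Drop duplicate nodes and sort the rest by document-order rank."""
    return sorted(set(nodes))


unsorted_text = " ".join(s for _, s in iteration_order())
sorted_text = " ".join(s for _, s in distinct_doc_order(iteration_order()))
```

Here `unsorted_text` reproduces the misleading warning and `sorted_text` the intended one. A processor that may ignore order, as under unordered { e1 }, can skip the sort, but only if nothing downstream depends on it.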
     [Figure: algebra plan shown next to the bullets above. Operators as labeled, top down: SERIALIZE; π (item, pos); ROW# (pos:<pos1>); π (pos1, item); join (iter = iter1); ELEM (iter1, item:<iter1, item><iter1, pos, item>) with ELEM_TAG; π (iter1, item, pos); ROW# (pos:<pos1>/iter1); a union of two branches tagged @ (pos1), val: #1 and @ (pos1), val: #2; TEXT (item:<item2>); plus FRAGs, ROOTS, and FRAG_UNION nodes. The last build of the slide adds π iter,item.]
More Purely Relational XQuery
    •    Declarative XQuery debugging with “time travel”.
        •    Users observe expressions
             and may traverse iterations
             forward/backward.

             iter   pos   item
             1
             2
             3
             4

    •    Dependable cardinality estimates at XQuery
         subexpression level (# items per iteration and
         overall contribution).
                      for $x in $doc//x            4.3
                      return if (e1) then e2 else e3
                                            8               2
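
Both ideas build on the iter | pos | item table. As a rough sketch, assume the value of one subexpression across all iterations is stored as rows (iter, pos, item). Observing an iteration is then a selection on iter, and the per-iteration cardinality is a grouped count. The sample rows and function names are assumptions, and the code counts exactly instead of estimating; it only shows which quantities are meant.

```python
from collections import Counter

# Hypothetical value of one subexpression: one row per (iter, pos, item).
rows = [
    (1, 1, "a"), (1, 2, "b"),
    (2, 1, "c"),
    (4, 1, "d"), (4, 2, "e"), (4, 3, "f"),
]


def observe(table, iteration):
    """'Time travel': the item sequence the expression had in one iteration."""
    return [item for it, pos, item in sorted(table) if it == iteration]


def items_per_iteration(table, iterations):
    """Number of items per iteration, including iterations that yield nothing."""
    counts = Counter(it for it, _, _ in table)
    return {it: counts.get(it, 0) for it in iterations}


per_iter = items_per_iteration(rows, [1, 2, 3, 4])
average = sum(per_iter.values()) / len(per_iter)  # items per iteration
total = len(rows)                                 # overall contribution
```

Stepping forward or backward amounts to calling `observe` with the next or previous iter value; no evaluation state has to be replayed.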
                   Atomization Indexes

   <a>                                 a
    <b>                               / \
     <c>Same<d>shirt</d></c>         b   day
     different                      / \
    </b>                           c   different
    day                           / \
   </a>                        Same   d
                                      |
                                    shirt

  node   atom1   atom2   atom3       atom4         dict   atom
   a     Same    shirt   different   day
   b     Same    shirt   different
   c     Same    shirt
   d     shirt
                   Atomization Indexes
                                                                 a
   <a>
    <b>                                            b
     <c>Same<d>shirt</d></c>                                             day
     different                                c
    </b>                                                different
    day
   </a>                                           d
                                          Same

  node atom1      atom2     atom3     atom4       shirt
   a Same 1
          δ     shirt     different day
   b Same 1
          δ     shirt     different                dict     atom
   c Same 1
          δ     shirt                               δ1 Same
          δ
      Same 1
   d shirt
      shirt
      different
      day
Prof. Dr. Torsten Grust              21                Technische Universität München
                   Atomization Indexes
                                                            a
   <a>
    <b>                                       b
     <c>Same<d>shirt</d></c>                                        day
     different                           c
    </b>                                           different
    day
   </a>                                      d
                                      Same

  node atom1 atom2 atom3 atom4               shirt
   a     δ1    δ2    δ3    δ4
   b     δ1    δ2    δ3                       dict           atom
   c     δ1    δ2                              δ1     Same
         δ1                                    δ2     shirt
   d     δ2                                    δ3     different
         δ2                                    δ4     day
         δ3
         δ4
Prof. Dr. Torsten Grust          21               Technische Universität München
                   Atomization Indexes
                                                            a
   <a>
    <b>                                       b
     <c>Same<d>shirt</d></c>                                        day
     different                           c
    </b>                                           different
    day
   </a>                                      d
                                      Same

  node atom1 atom2 atom3 atom4               shirt
   a     δ1
          5    δ2    δ3    δ4
   b     δ1
          5    δ2    δ3                       dict           atom
   c     δ1
          5    δ2                              δ1     Same
         δ1                                    δ2     shirt
   d     δ2                                    δ3     different
         δ2                                    δ4     day
         δ3                                    δ5     Same    shirt
         δ4
Prof. Dr. Torsten Grust          21               Technische Universität München
                   Atomization Indexes
                                                            a
   <a>
    <b>                                       b
     <c>Same<d>shirt</d></c>                                        day
     different                           c
    </b>                                           different
    day
   </a>                                      d
                                      Same

  node atom1 atom2 atom3 atom4               shirt
   a     δ1
          6
          5    δ2    δ3    δ4
   b     δ1
          6
          5    δ2    δ3                       dict           atom
   c     δ1
          5    δ2                              δ1     Same
         δ1                                    δ2     shirt
   d     δ2                                    δ3     different
         δ2                                    δ4     day
         δ3                                    δ5        δ1     δ2
         δ4                                    δ6        δ5     δ3
Prof. Dr. Torsten Grust          21               Technische Universität München
                   Atomization Indexes
   <a>
    <b>
     <c>Same<d>shirt</d></c>
     different
    </b>
    day
   </a>

   Tree:
   a
   ├─ b
   │  ├─ c
   │  │  ├─ “Same”
   │  │  └─ d
   │  │     └─ “shirt”
   │  └─ “different”
   └─ “day”

   node            atom1   atom2
   a               δ6      δ4
   b               δ6
   c               δ5
   “Same”          δ1
   d               δ2
   “shirt”         δ2
   “different”     δ3
   “day”           δ4

   dict    atom
   δ1      Same
   δ2      shirt
   δ3      different
   δ4      day
   δ5      δ1 δ2
   δ6      δ5 δ3
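The dictionary on this slide nests: δ5 and δ6 refer to other entries rather than to text. A minimal Python sketch of how such references could be resolved back into flat atom sequences follows. The names `DICT`, `INDEX`, `expand`, and `atoms_of` are illustrative assumptions, not part of the slides or of any actual implementation.

```python
# Dictionary from the slide: an entry is either a text atom (str)
# or a tuple of references to other dictionary entries.
DICT = {
    "δ1": "Same",
    "δ2": "shirt",
    "δ3": "different",
    "δ4": "day",
    "δ5": ("δ1", "δ2"),
    "δ6": ("δ5", "δ3"),
}

# Index rows for the element nodes: node -> sequence of dictionary keys.
INDEX = {
    "a": ["δ6", "δ4"],
    "b": ["δ6"],
    "c": ["δ5"],
    "d": ["δ2"],
}


def expand(key):
    """Resolve one dictionary key to its flat list of text atoms."""
    entry = DICT[key]
    if isinstance(entry, str):
        return [entry]
    atoms = []
    for ref in entry:          # nested entry: resolve each reference in order
        atoms.extend(expand(ref))
    return atoms


def atoms_of(node):
    """Flat atom list of a node (hypothetical helper for illustration)."""
    return [atom for key in INDEX[node] for atom in expand(key)]


if __name__ == "__main__":
    for node in INDEX:
        print(node, atoms_of(node))
```

Under these assumptions, expanding the row for `a` recovers the four atoms of the uncompressed index, while shared prefixes such as "Same shirt" are stored only once in the dictionary.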
 Relational Encodings for XML
   •     XML documents represent ordered, unranked trees
         (nodes of 7 kinds).
   •     Relational XQuery processing calls for an adequate
         relational encoding of such trees:
        1. schema-oblivious
           (XQuery constructs arbitrary XML fragments),
        2. accessible for the query processor and index support
           (10^7 nodes in typical GB-range XML instances).

   ✘     Not adequate: one row per document, with the serialized
         XML text in a single column X.

         doc   X
          1    “<a>b<c/>d<e/><f>g</f></a>”
          2    …
Pre/Postorder Encoding
          <a>
            <b><c><d/>e</c></b>
            <f><!--g-->
               <h><i/><j/></h>
            </f>
          </a>

   Tree (each node labelled  pre node post):
   0 a 9
   ├─ 1 b 3
   │  └─ 2 c 2
   │     ├─ 3 d 0
   │     └─ 4 e 1
   └─ 5 f 8
      ├─ 6 g 4
      └─ 7 h 7
         ├─ 8 i 5
         └─ 9 j 6

   Encoding table (rows in document order):

   pre  post  node  kind  size  level
    0    9     a    elem   9     0
    1    3     b    elem   3     1
    2    2     c    elem   2     2
    3    0     d    elem   0     3
    4    1     e    text   0     3
    5    8     f    elem   4     1
    6    4     g    comm   0     2
    7    7     h    elem   2     2
    8    5     i    elem   0     3
    9    6     j    elem   0     3
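The columns of this table can be derived in a single depth-first traversal. The following Python sketch is one way to do that, assuming the tree is given as nested `(name, kind, children)` tuples. That input representation and the function name `encode` are illustrative choices, not something the slides prescribe.

```python
# The example document as nested (name, kind, children) tuples.
TREE = ("a", "elem", [
    ("b", "elem", [
        ("c", "elem", [("d", "elem", []), ("e", "text", [])]),
    ]),
    ("f", "elem", [
        ("g", "comm", []),
        ("h", "elem", [("i", "elem", []), ("j", "elem", [])]),
    ]),
])


def encode(tree):
    """Return one row (dict) per node, in document order."""
    rows = []
    counters = {"pre": 0, "post": 0}

    def visit(node, level):
        name, kind, children = node
        pre = counters["pre"]            # rank assigned on entering the node
        counters["pre"] += 1
        row = {"pre": pre, "node": name, "kind": kind, "level": level}
        rows.append(row)                 # preorder append = document order
        for child in children:
            visit(child, level + 1)
        row["post"] = counters["post"]   # rank assigned on leaving the node
        counters["post"] += 1
        row["size"] = counters["pre"] - pre - 1   # number of descendants

    visit(tree, 0)
    return rows


if __name__ == "__main__":
    for r in encode(TREE):
        print(r["pre"], r["post"], r["node"], r["kind"], r["size"], r["level"])
```

The sketch uses recursion for brevity. For documents in the GB range an iterative or streaming traversal would be the more realistic choice.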
Pre/Postorder Encoding
   [Figure: the pre/post plane for the example document. Horizontal axis:
    pre, vertical axis: post, ticks at 5. Each node is plotted at (pre, post):
    a (0,9), b (1,3), c (2,2), d (3,0), e (4,1),
    f (5,8), g (6,4), h (7,7), i (8,5), j (9,6).
    The table rows remain in document order, i.e. ascending pre.]
                  B-Trees Accelerate
                 XPath Location Steps
    •    The XPath axes ancestor, descendant, preceding,
         following partition any XML document:

         [Figure: pre/post plane (axes pre and post, ticks at 5) with the
          nodes a to j. Relative to a context node, the four axes
          correspond to the four quadrants of the plane.]

         B-Tree over the encoding table:

         pre  post  node
          0    9     a
          1    3     b
          2    2     c
          3    0     d
          4    1     e
          5    8     f
          6    4     g
          7    7     h
          8    5     i
          9    6     j
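Each of the four axes becomes a pair of range conditions on pre and post. The Python sketch below illustrates this on the example table, with a sorted list and `bisect` standing in for a B-Tree range scan on pre. This stand-in and the names `NODES`, `PRES`, `BY_NAME`, and `step` are assumptions made for illustration. A relational system would evaluate the same predicates against an index on the table.

```python
from bisect import bisect_left, bisect_right

# (pre, post, node) rows of the example document, sorted by pre.
NODES = [
    (0, 9, "a"), (1, 3, "b"), (2, 2, "c"), (3, 0, "d"), (4, 1, "e"),
    (5, 8, "f"), (6, 4, "g"), (7, 7, "h"), (8, 5, "i"), (9, 6, "j"),
]
PRES = [pre for pre, _, _ in NODES]
BY_NAME = {node: (pre, post) for pre, post, node in NODES}


def step(axis, ctx):
    """Evaluate one axis for context node ctx; result is in document order."""
    if axis not in ("ancestor", "descendant", "preceding", "following"):
        raise ValueError("unsupported axis: " + axis)
    cpre, cpost = BY_NAME[ctx]
    if axis in ("descendant", "following"):
        # pre > pre(ctx): range scan to the right of the context node
        candidates = NODES[bisect_right(PRES, cpre):]
    else:
        # pre < pre(ctx): range scan to the left of the context node
        candidates = NODES[:bisect_left(PRES, cpre)]
    if axis in ("ancestor", "following"):
        return [n for _, post, n in candidates if post > cpost]
    return [n for _, post, n in candidates if post < cpost]


if __name__ == "__main__":
    for axis in ("ancestor", "descendant", "preceding", "following"):
        print(axis, "of f:", step(axis, "f"))
```

For any context node, the four results together with the context node itself cover every node exactly once, which is the partitioning property stated on the slide.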
                      Partitioned B-Trees
    •    Use low-selectivity key prefixes to partition the
         XML nodes in a B-Tree according to various criteria:
        1. level (supports the XPath child axis)
        2. element tag name
        3. path to node (“Node GPS”)

         [Figure: B-Tree with partitions level = 1, level = 2, level = 3]

    •    Given a Pathfinder-generated workload,
         the index advisor of IBM DB2 V9 proposes
         such partitionings automatically.
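A minimal sketch of the level partitioning, assuming a sorted list of composite keys as a stand-in for the B-Tree. The ten-node sample tree, the level numbering (root = 0), and the names NODES, INDEX, and children are illustrative assumptions, not Pathfinder or DB2 code.

```python
from bisect import bisect_left, bisect_right

# (pre, post, level, name) for an assumed ten-node sample tree; root level is 0.
NODES = [
    (0, 9, 0, "a"), (1, 3, 1, "b"), (2, 2, 2, "c"), (3, 0, 3, "d"),
    (4, 1, 3, "e"), (5, 8, 1, "f"), (6, 4, 2, "g"), (7, 7, 2, "h"),
    (8, 5, 3, "i"), (9, 6, 3, "j"),
]

# The sorted list stands in for the B-Tree. The leading, low-selectivity key
# column (level) splits the index into partitions; inside a partition the
# entries are ordered by pre rank, i.e. in document order.
INDEX = sorted((level, pre, post, name) for pre, post, level, name in NODES)


def children(pre, post, level):
    """Names of the children of the context node, in document order."""
    # Seek into partition level + 1, just behind the context node's pre rank.
    lo = bisect_right(INDEX, (level + 1, pre, float("inf"), ""))
    # The partition ends where the next level value begins.
    hi = bisect_left(INDEX, (level + 2,))
    result = []
    for _, _, child_post, name in INDEX[lo:hi]:
        if child_post > post:  # this entry lies outside the context subtree
            break
        result.append(name)
    return result
```

With this layout a child step touches a single partition and stops at the first entry that lies outside the context node's subtree.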
Injecting More Tree Awareness
    •    In XQuery, an XPath location step originates in a
         sequence of context nodes.
             Step evaluation per context node: duplicate nodes,
             out-of-order results (XPath semantics demand
             document order, no duplicates!), wasted work.

         [Figure: context nodes c1, c2, c3, c4. Labels: Pathfinder XQuery Compiler, XPath/XQuery Semantics, XML Encoding, staircase join]
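The problem can be reproduced in a few lines. This is a sketch under assumptions: a ten-node sample tree in pre/post encoding, the region condition pre(v) > pre(c) and post(v) < post(c) for descendants, and the hypothetical helper names descendants, naive_step, and names.

```python
# pre rank -> (post rank, name); assumed ten-node sample tree.
DOC = {
    0: (9, "a"), 1: (3, "b"), 2: (2, "c"), 3: (0, "d"), 4: (1, "e"),
    5: (8, "f"), 6: (4, "g"), 7: (7, "h"), 8: (5, "i"), 9: (6, "j"),
}


def descendants(ctx_pre):
    """Pre ranks of all descendants of one context node (region query)."""
    ctx_post = DOC[ctx_pre][0]
    return [pre for pre, (post, _) in DOC.items()
            if pre > ctx_pre and post < ctx_post]


def naive_step(context):
    """Evaluate the descendant step once per context node and concatenate."""
    out = []
    for ctx_pre in context:
        out.extend(descendants(ctx_pre))
    return out


def names(pres):
    return [DOC[pre][1] for pre in pres]


# Context sequence (c, b): b is an ancestor of c, so c's subtree is scanned twice.
# naive_step([2, 1]) -> [3, 4, 2, 3, 4], i.e. d, e, c, d, e
```

A separate sort and duplicate elimination, sorted(set(...)), repairs the result after the fact, but the repeated scan of c's subtree is still wasted work.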
Staircase Join: Context Pruning
    •    XPath following axis:

         [Figure: context nodes c1, c2, c3, c4, pruned step by step (first c3 and c4, then c1)]

    •    Index scan yields a duplicate-free result in document order.
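A hedged sketch of context pruning for the following axis, assuming the pre/post encoding with the region condition pre(v) > pre(c) and post(v) > post(c). Under that assumption the following regions of all context nodes are contained in the region of the context node with the smallest post rank, so one node suffices. The sample tree and the function names are illustrative, not taken from Pathfinder.

```python
# (pre, post, name), list index == pre rank; assumed ten-node sample tree.
DOC = [
    (0, 9, "a"), (1, 3, "b"), (2, 2, "c"), (3, 0, "d"), (4, 1, "e"),
    (5, 8, "f"), (6, 4, "g"), (7, 7, "h"), (8, 5, "i"), (9, 6, "j"),
]


def following(ctx_pre):
    """Pre ranks of the nodes on the following axis of one context node."""
    ctx_post = DOC[ctx_pre][1]
    return [pre for pre, post, _ in DOC if pre > ctx_pre and post > ctx_post]


def prune_following(context):
    """Keep only the context node with the minimum post rank."""
    return min(context, key=lambda pre: DOC[pre][1])


def following_step(context):
    """One region query over the pruned context; the scan runs in pre order."""
    return following(prune_following(context))


# Context (i, c, g) is pruned to c alone.
# following_step([8, 2, 6]) -> [5, 6, 7, 8, 9], i.e. f, g, h, i, j
```

Because a single scan in pre order remains, the result needs no sorting and no duplicate elimination.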
             Staircase Join: Skipping
    •    XPath descendant axis:

         [Figure: context nodes c1 and c3 and a node v; the index scan alternates scan, skip, scan]

    •    Index scan yields a duplicate-free result in document order.
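Pruning and skipping for the descendant axis can be sketched as follows. This is an illustration under assumptions, not the Pathfinder or MonetDB implementation: the document is a list in pre order with dense pre ranks, and staircase_descendant is a hypothetical name.

```python
# (pre, post, name), list index == pre rank; assumed ten-node sample tree.
DOC = [
    (0, 9, "a"), (1, 3, "b"), (2, 2, "c"), (3, 0, "d"), (4, 1, "e"),
    (5, 8, "f"), (6, 4, "g"), (7, 7, "h"), (8, 5, "i"), (9, 6, "j"),
]


def staircase_descendant(doc, context):
    """Descendant step for a whole context sequence in one forward pass."""
    # Pruning: in pre order, a context node whose post rank is below the
    # maximum seen so far is a descendant of an earlier context node.
    pruned, max_post = [], -1
    for ctx_pre in sorted(set(context)):
        if doc[ctx_pre][1] > max_post:
            pruned.append(ctx_pre)
            max_post = doc[ctx_pre][1]

    result = []
    for k, ctx_pre in enumerate(pruned):
        # Each context node owns the scan range up to the next context node.
        end = pruned[k + 1] if k + 1 < len(pruned) else len(doc)
        ctx_post = doc[ctx_pre][1]
        for pre in range(ctx_pre + 1, end):
            if doc[pre][1] < ctx_post:
                result.append(pre)  # inside the subtree: scan
            else:
                break               # subtree left: skip to next context node
    return result


# Context (h, c, b): c is pruned, the scan behind b stops at f.
# staircase_descendant(DOC, [7, 2, 1]) -> [2, 3, 4, 8, 9], i.e. c, d, e, i, j
```

Every document node is visited at most once and results are emitted in pre order, so no sort or duplicate elimination is needed afterwards.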
                          MonetDB/XQuery
•    MonetDB: Extensible relational database kernel,
     optimized for query processing close to the CPU.
    •     Full vertical fragmentation (column store):

                pre post      pre node
                 0   9         0   a
                 1   3         1   b
                 2   2         2   c
                 3   0         3   d
                 4   1         4   e
                 5   8         5   f
                 6   4         6   g
                 7   7         7   h
                 8   5         8   i
                 9   6         9   j

•    Pathfinder + MonetDB = MonetDB/XQuery.
    •     Implementation of XQuery 1.0.
    •     Evaluates queries against GB-range
          XML instances in interactive time.
    + XQuery Update, XQuery RPC ( execute at uri { e } ),
          support for multi-dimensional XML, ...
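The vertical fragmentation can be mimicked with plain Python lists. This is only a sketch of the idea: it does not use MonetDB's storage format or API, and the names PRE_POST, PRE_NODE, descendant_pres, and node_names are assumptions made for illustration.

```python
# The three-column table pre | post | node from the slide.
ROWS = [
    (0, 9, "a"), (1, 3, "b"), (2, 2, "c"), (3, 0, "d"), (4, 1, "e"),
    (5, 8, "f"), (6, 4, "g"), (7, 7, "h"), (8, 5, "i"), (9, 6, "j"),
]

# Full vertical fragmentation: one binary table per attribute, keyed by pre.
PRE_POST = [(pre, post) for pre, post, _ in ROWS]
PRE_NODE = [(pre, node) for pre, _, node in ROWS]


def descendant_pres(ctx_pre):
    """Descendant step that scans only the narrow pre | post table."""
    ctx_post = PRE_POST[ctx_pre][1]
    return [pre for pre, post in PRE_POST if pre > ctx_pre and post < ctx_post]


def node_names(pres):
    """Positional lookup in pre | node; valid because pre is dense and 0-based."""
    return [PRE_NODE[pre][1] for pre in pres]


# node_names(descendant_pres(5)) -> ["g", "h", "i", "j"]
```

The axis step never reads the node column; names are fetched only for the nodes that survive the step.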
“DB2’s XML support
 doesn’t beat its relational self.”
    •    Pathfinder builds on 30+ years of development of
         relational database technology.

        •    Outperforms the built-in DB2 V9 pureXML® engine,
             especially for large XML input instances.

    •    Covers other data-intensive languages:

        1. SQL/XML

        2. LINQ + other “Nested Loop” languages.
                                           grust@in.tum.de
                           http://www-db.in.tum.de/~grust/