Unleash SQL Power to your XML Data by bns26590

VIEWS: 13 PAGES: 41

									                   Session: E14
                   Unleash SQL Power
                   to your XML Data
                    Matthias Nicola
                    IBM Silicon Valley Lab




                                                     16 October 2008 • 09:00 – 10:00am
                              Platform: DB2 for z/OS and DB2 for Linux, Unix, Windows



SQL is no longer the purely relational language that it used to be. The latest
SQL standard defines an XML data type and new functions for querying
hierarchical XML data. This is supported by DB2's pureXML functionality on
zOS and Linux, Unix, and Windows. Apart from efficient storage and
indexing, pureXML allows you to search and extract XML values through the
new SQL functions. This also enables you to join XML and relational data in
a single query, and to convert one to the other as needed. Additionally, you
can apply the full power of all the good old relational SQL features to your
XML data. This session introduces you to querying XML data in the DB2 9
family and presents tips & tricks for getting the best of both worlds when
querying XML and relational data.


Matthias Nicola is a Senior Software Engineer for DB2 pureXML at IBM's
Silicon Valley Lab. His work focuses on all aspects of XML in DB2, including
XQuery, SQL/XML, indexing and performance. Matthias works closely with
the DB2 development teams as well as with customers and business
partners who are using XML, assisting them in the design, implementation,
and optimization of XML solutions. Prior to joining IBM, Matthias worked on
data warehousing performance for Informix Software. He also worked in
research and industry projects on distributed and replicated databases. He
received his doctorate in computer science from the Technical University of
Aachen, Germany.
                  Key Points
                 • XPath, and how to embed XPath in SQL

                 • New SQL Functions: XMLQUERY, XMLTABLE,
                   XMLCAST, and XMLEXISTS

                 • Retrieve XML data in relational format, and vice versa

                 • Apply SQL functions and SQL aggregation to XML
                   Data

                 • Best Practices for writing hybrid XML/relational queries



                                                                                   2




Learn the basics of XPATH and how to use XPATH embedded in SQL to extract XML document
fragments or to express predicates on XML data.

Learn how to query XML data in DB2, using the new SQL functions XMLEXISTS, XMLQUERY,
and XMLTABLE.

Learn how to retrieve XML data in relational format, and vice versa.

Learn how to define relational views over XML to make your XML data available to your existing
SQL applications.

Learn fundamental guidelines for writing efficient queries, and for integrating XML and relational
data
Agenda

   Brief Recap: DB2 9 pureXML
   XPath Expressions
   Combining SQL and XPath
     SQL/XML functions XMLQUERY and XMLEXISTS
     From XML to SQL Types: XMLCAST
     Applying SQL functions to XML Data
     Returning XML in relational format: XMLTABLE
   Summary




                                                    3
Agenda

   Brief Recap: DB2 9 pureXML
   XPath Expressions
   Combining SQL and XPath
     SQL/XML functions XMLQUERY and XMLEXISTS
     From XML to SQL Types: XMLCAST
     Applying SQL functions to XML Data
     Returning XML in relational format: XMLTABLE
   Summary




                                                    4
        XML Storage: Old and New (DB2 V8 vs. DB2 9+)
          Unstructured XML                           Shredding:                          DB2 9 pureXML:
         storage: XML as text                      XML → Relational                      XML as XML

                                                                                                           XML
                                                                    XML                                    DOC
                                                                    DOC
              Extract             XML           Fixed
                                  DOC
             selected                          Mapping
          elements/attr.



                                  XML DOC

                                  XML DOC

                                  XML DOC


                                                                                        XML
                           CLOB Column
                           i.e. XML as text                                            Index

       Side Tables
       or Indexes
                                                     (regular relational tables)

                                              Mapping prevents XML schema
        Any sub-document level access         changes, and is often too complex.   Maximum flexibility   XML Column
        requires XML parsing – slow.          XML reconstruction is slow.          and performance


                                                                                                                      5




The XML Extender in DB2 V8 allows XML storage in CLOB (or Varchar) columns,
but performance is often not adequate due to XML parsing at query time. Another
option in DB2 V8 was to shred XML data to relational tables based on a fixed
mapping. This can work well if the XML structure is relatively simple and doesn’t
change over time. Hence, this is still possible in DB2 9 and beyond. However, in
many real-world scenarios the required mapping is very complex and it may take
dozens (sometimes hundreds) of tables to represent the XML data in relational
format. In such cases the shredding is very expensive, and re-constructing the XML
documents requires multi-way joins which often perform poorly. Also, if the XML
format changes, the mapping and the underlying relational need to be adjusted
accordingly, which is often a very complex and expensive task.
DB2 9 solves these problems. DB2 9 stores XML in a parsed format which avoids
XML parsing at query time. DB2 9 also does not require a fixed mapping to store
XML data. Documents of different form and shape can be stored in a single column
of type XML.
                The XML Data Type & pureXML Storage

                    Tables can contain relational and/or XML columns
                    Relational columns stored in tabular format
                    XML columns stored in a parsed hierarchical format
                    No XML parsing for query evaluation        High Performance

               create table dept (deptID char(8),...,deptdoc xml);
                                deptID        …     deptdoc
                                “PR27”        …     <dept> …              DB2 Storage:
                                                       <emp>…</emp>
                                                    </dept>
                                                                             XML
                                …             …     …


                                                                            Relational



                                                                                         6




The new data type “XML” can be used to define 1 or multiple XML columns in a DB2 table. The
table can also have traditional relational columns, but that’s optional. XML and relational data are
stored differently, but closely linked. Differently, because relational data is flat and best stored in
rows and columns, while XML is nested and best stored in a tree format.


For more details, see:


Matthias Nicola and Bert Van der Linden. ”Native XML Support in DB2 Universal Database,”
Proceedings of the 31st Annual VLDB, 2005.
(http://www.vldb2005.org/program/paper/thu/p1164-nicola.pdf)
               Efficient Document Tree Storage
       <dept>
         <employee id="901">                                                                  14
             <name>John Doe</name>
             <phone>408 555 1212</phone>
             <office>344</office>
                                                                         4                                            4
          </employee>
          <employee id="902">
             <name>Peter Pan</name>                     7=901      1           6       3           7=902       1           6         3
             <phone>408 555 9918</phone>
             <office>216</office>
                                                            John Doe    408-555-1212       344            Peter Pan   408-555-9918       216
          </employee>
       </dept>


                                                                                                    Tags encoded as
         XML text represented                               dept                                    Integers.
         as document tree                                                                               Reduces storage
                                    employee                                   employee
                                                                                                        Fast comparisons &
                                                                                                        navigation

                    id=901   name      phone       office   id=902     name        phone         office


                         John Doe   408-555-1212    344            Peter Pan   408-555-9918        216



                                                                                                                                           7




This shows a textual XML document (top left) and the hierarchical representation for
efficient storage. One of multiple optimizations is that DB2 encodes XML tag names
as unique integer values. This is invisible to the application but makes XML
processing inside DB2 much faster.


For more details, see: http://www.vldb2005.org/program/paper/thu/p1164-nicola.pdf
                Relevant XML Standards

                                                    SQL/XML
                                                  http://www.sqlx.org




                                                     XQUERY 1.0
                                           Expressions
                                       http://www.w3.org/TR/xquery           XPath 2.0
                                    Functions & Operators                   http://www.w3.org
                                   http://www.w3.org/TR/xquery-operators/     /TR/xpath20/


                                    XQuery 1.0 and XPath 2.0 Data Model
                                       http://www.w3.org/TR/query-datamodel/



                                                                                                8




The basic language to query XML data is XPath. It’s very powerful.


XQuery is an extension of XPath, and XPath is sub-set of XQuery. XQuery adds additional
expressions and functions to XPath to allow more complicated queries. Both XPath and XQuery
are based on the same data model which is called “XQuery 1.0 and XPath 2.0 Data Model”.
Basically, this data model defines how each XML document is actually a tree of element and
attribute nodes. Queries expressed in XQuery or XPath traverse these trees, evaluate
predicates, and retrieve XML values.


XQuery and XPath have been standardized by the W3C.


The SQL standard has been enhanced to allow embedding of XQuery or XPath in SQL
statements. You will see later in this presentation how that works.
         Options to query XML data in DB2 9



             SQL        Plain SQL, allows
                        full doc retrieval

           SQL/XML      XPath embedded
               XPath    in SQL

           SQL/XML      XQuery embedded
               XQuery   in SQL

            XQuery      XQuery as a stand-
                        alone language

            XQuery       SQL embedded
                SQL      in XQuery

                                                                        9




These are the 5 languages (or: combination of languages) that the DB2 Family
offers.
         Options to query XML data in DB2 9

                                             DB2 LUW     DB2/zOS
             SQL        Plain SQL, allows
                        full doc retrieval

           SQL/XML      XPath embedded
               XPath    in SQL

           SQL/XML      XQuery embedded
               XQuery   in SQL

           XQuery       XQuery as a stand-
                        alone language

           XQuery        SQL embedded
                SQL      in XQuery

                                                                      10




All of these options are supported in DB2 9 for Linux, Unix, Windows. DB2 9 for
zOS does not support XQuery. However, you will see that XPath embedded in SQL
is already a very powerful combination that‘s fully sufficient for many XML
applications.
         Options to query XML data in DB2 9

                                             DB2 LUW        DB2 zOS
             SQL        Plain SQL, allows
                        full doc retrieval

           SQL/XML      XPath embedded
                XPath   in SQL

           SQL/XML      XQuery embedded
               XQuery   in SQL

            XQuery      XQuery as a stand-
                        alone language

            XQuery       SQL embedded
                SQL      in XQuery

                                                                         11




In this presentation we‘ll focus on SQL/XML with XPath, which applies to both, DB2
9 for Linux, Unix, Windows and DB2 9 for zOS.
Agenda

   Brief Recap: DB2 9 pureXML
   XPath Expressions
   Combining SQL and XPath
     SQL/XML functions XMLQUERY and XMLEXISTS
     From XML to SQL Types: XMLCAST
     Applying SQL functions to XML Data
     Returning XML in relational format: XMLTABLE
   Summary




                                                    12
              XPath Concepts
                                                                              /
        <dept bldg=“101”>
                                                                              /dept
          <employee id=“901”>
                                                                              /dept/employee
              <name>John Doe</name>
                                                                              /dept/employee/@id
              <phone>408 555 1212</phone>               Each node
                                                                              /dept/employee/name
              <office>344</office>                      has a path            /dept/employee/phone
           </employee>
                                                                              /dept/employee/phone/text()
           <employee id=“902”>
                                                                              (...)
              <name>Peter Pan</name>
              <phone>408 555 9918</phone>
              <office>216</office>                                 dept
           </employee>
        </dept>

                              bldg=101   employee                                       employee



                     id=901      name       phone         office   id=902       name       phone       office



                              John Doe   408-555-1212      344              Peter Pan   408-555-9918    216


                                                                                                                13




XPath is based on the fact that every XML document is a tree of element and
attribute nodes. Each node has a path. A path is the sequence of nodes from the
root of the tree to one or multiple specific nodes. A useful analogy is to think of a
file system. A file system is a tree structure of nested directories. Each directory has
a specific absolute path through which it can be accessed.


Note: XML allows repeating elements, such as “employee” in this sample document.
Hence, a single path, such as /dept/employee/name, can find multiple matches in
the document.
                                                                     <dept bldg=“101”>
                                                                       <employee id=“901”>

         XPath: Path Expressions                                            <name>John Doe</name>
                                                                            <phone>408 555 1212</phone>
                                                                            <office>344</office>
          Use fully qualified paths to specify elements/attributes      </employee>
         “@” is used to specify an attribute                            <employee id=“902”>
          XPath is case-sensitive !                                         <name>Peter Pan</name>
                                                                            <phone>408 555 9918</phone>
         use “text()” to specify the text node under an element             <office>216</office>
                                                                        </employee>
          XPath                                      Result          </dept>


          /dept/@bldg                                101
          /dept/employee/@id                         901
                                                     902
          /Dept/Employee/@id
          /dept/employee/name                        <name>John Doe</name>
                                                     <name>Peter Pan</name>
          /dept/employee/name/text()                 John Doe
                                                     Peter Pan

                                                                                                     14




Path expressions point to specific nodes and return them as results.


Note: XML allows repeating elements, such as “employee” in this sample document.
Hence, a single path, such as /dept/employee/@id can find multiple matches.
Therefore the result of an XPath is in general a sequence of zero, one or multiple
items. An “item” is a node (such as an element or attribute) or just an atomic value.


How can XPath return a sequence of zero items, i.e. an empty sequence? For
example, the path /Dept/@bldg would return an empty result from our sample
document, because /dept and /Dept are different.
                                                                   <dept bldg=“101”>
                                                                     <employee id=“901”>
                                                                          <name>John Doe</name>

               XPath: Wildcards                                           <phone>408 555 1212</phone>
                                                                          <office>344</office>
                                                                      </employee>
                                                                      <employee id=“902”>
           * matches any tag name                                         <name>Peter Pan</name>
           // is the “descendent-or-self” wildcard                        <phone>408 555 9918</phone>
                                                                          <office>216</office>
                                                                      </employee>
                                                                   </dept>

             XPath                                           Result
                                                             John Doe
             /dept/employee/*/text()                         408 555 1212
                                                             344
                                                             Peter Pan
                                                             408 555 9918
                                                             216

             /dept/*/@id                                     901
                                                             902

             //name/text()                                   John Doe
                                                             Peter Pan

             /dept//phone                                    <phone>408 555 1212</phone>
                                                             <phone>408 555 9918</phone>

                                                                                                   15




The * is a wildcard that can match any tag name. The * in the first of the 4 examples
on this slide matches all the elements under “employee”, i.e. “name”, “phone” and
“office”.


While the * is a wildcard for elements, the // is a wildcard for paths. For example,
//name means: find “name” elements anywhere in the document no matter what the
path is that leads to these “name” elements. Likewise, a path such as /a/b//c/d means:
navigate from the root element “a” to the element “b”, and from there use any possible path to find “c”
elements and then get the “d” elements under those “c” elements.
                                                           <dept bldg=“101”>
                                                             <employee id=“901”>
          XPath: Predicates                                       <name>John Doe</name>
                                                                  <phone>408 555 1212</phone>
                                                                  <office>344</office>
                                                              </employee>
         Predicates are enclosed in square brackets […]       <employee id=“902”>
                                                                  <name>Peter Pan</name>
         Can have multiple predicates in one XPath                <phone>408 555 9918</phone>
                                                                  <office>216</office>
                                                              </employee>
                                                           </dept>




         XPath                                            Result
         /dept/employee[@id= 902]/name                    <name>Peter Pan</name>
         /dept[@bldg= 101]/employee[office > 300]/name    <name>John Doe</name>
         //employee[name=“John Doe” or office=216]/@id    901
                                                          902



                                                                                          16




Remember: a single path, such as /dept/employee/@id can find multiple matches in
the document. You can use value predicates to return only some of those matches.
Value predicates are also very useful when you run queries across many
documents (which is a very common scenario).
                                                               <dept bldg=“101”>

        XPath: The Parent Axis                                   <employee id=“901”>
                                                                      <name>John Doe</name>
                                                                      <phone>408 555 1212</phone>
                                                                      <office>344</office>
                                                                  </employee>
          Current context: “.”                                    <employee id=“902”>
                                                                      <name>Peter Pan</name>
          Parent context: “..”                                        <phone>408 555 9918</phone>
                                                                      <office>216</office>
                                                                  </employee>
                                                               </dept>

         XPath                                          Result
         /dept/employee[name=“John Doe”]/../@bldg       101
         /dept/employee/name[ . =“John Doe”]/../../@bldg 101
         /dept/employee/name[../@id=902]                <name>Peter Pan</name>
         /dept/employee/office[ . > 300]                <office>344</office>
         /dept/employee[office > 300]/office            <office>344</office>


                                                                                              17




You sure know the . and the .. in a file system. They have a very similar meaning in
XPath. The . refers to your current location in the tree, and the .. refers to the
parent. This allows us to also navigate XML documents in upwards direction.
               Use XPath to create XML Indexes in DB2

               create unique index idx1 on dept(deptdoc)
               generate key using
               xmlpattern '/department/@bldg'                 create table dept( deptdoc XML);
               as sql double;

               create index idx2 on dept(deptdoc)
               generate key using                          <department bldg="101">
               xmlpattern '/department/employee/name'        <employee id="901">
               as sql varchar(40);                             <name>John Doe</name>
                                                               <phone>408 555 1212</phone>
                                                               <office>344</office>
                                                             </employee>
                                                             <employee id="902">
               create index idx3 on dept(deptdoc)               <name>Peter Pan</name>
               generate key using                               <phone>408 555 9918</phone>
               xmlpattern '//@id'                               <office>216</office>
               as sql varchar(40);                           </employee>
                                                           </department>


                                                                                                 18




Here is how XPath expressions are used in DB2 to define XML indexes on specific XML
elements or attributes. The wildcards * and // are allowed in index definitions. However, parent
steps (“..”) and XPath predicates are not allowed in XML index definitions.
Agenda

   Brief Recap: DB2 9 pureXML
   XPath Expressions
   Combining SQL and XPath
     SQL/XML functions XMLQUERY and XMLEXISTS
     From XML to SQL Types: XMLCAST
     Applying SQL functions to XML Data
     Returning XML in relational format: XMLTABLE
   Summary




                                                    19
                SQL/XML defines:
                  XML Data Type (its semantics, not its storage)
                  XML publishing functions (relational data → XML)
                      XMLELEMENT, XMLATTRIBUTE, XMLAGG, …

                  Conversion: XML type ↔ char/varchar/clob
                      XMLSERIALIZE, XMLPARSE, XMLCAST

                  Integration of SQL and XQuery/XPath languages
                      XMLQUERY, XMLEXISTS, XMLTABLE

                  Other functions, for schema validation, etc.
                      XMLVALIDATE



                                                                                 20




Search for any of the capitalized function names in the DB2 online information center:
http://publib.boulder.ibm.com/infocenter/db2luw/v9r5/index.jsp
Sample Scenario: XML and XPath inside DB2
   create table dept(deptID char(8), deptdoc xml)

   deptID char(8) deptdoc XML
   …              …
   PR37             <dept bldg=“101”>
                      <employee id=“901”>
                           <name>John Doe</name>
                               <phone>408 555 1212</phone>
                               <office>344</office>
                         </employee>
                         <employee id=“902”>
                               <name>Peter Pan</name>
                               <phone>408 555 9918</phone>
                               <office>216</office>
                         </employee>
                    </dept>

   …                …
                                                             21
         Querying XML using SQL

         create table dept(deptID char(8), deptdoc xml)




         select deptID, deptdoc
         from dept where deptID = “PR37”




            Full document retrieval based on relational predicates
                                                                           22




The following white paper contains more detailes on SQL/XML queries:
http://www.ibm.com/developerworks/db2/library/techarticle/dm-0606nicola/
         Querying XML using SQL/XML

         create table dept(deptID char(8), deptdoc xml)



         select deptID, deptdoc
         from dept
         where xmlexists(‘$d/dept[@bldg = 101]’ passing deptdoc as “d“)


         select deptID, deptdoc
         from dept
         where xmlexists(‘$DEPTDOC/dept[@bldg = 101]’)


            Full document retrieval based on XML predicates
                                                                           23




The following white paper contains more detailes on SQL/XML queries:
http://www.ibm.com/developerworks/db2/library/techarticle/dm-0606nicola/
         SQL/XML with Parameter Markers, HVs

         create table dept(deptID char(8), deptdoc xml)

         select deptID, deptdoc
         from dept
         where xmlexists(‘$d/dept[@bldg = $b]’
                      passing deptdoc as “d“, cast(? as char(5)) as “b”)

         select deptID, deptdoc
         from dept
         where xmlexists(‘$d/dept[@bldg = $b]’
                      passing deptdoc as “d“, cast(:hvar as char(5)) as “b”)


             Full document retrieval based on XML predicates
             with parameter markers or host variables
                                                                               24




The following white paper contains more detailes on SQL/XML queries:
http://www.ibm.com/developerworks/db2/library/techarticle/dm-0606nicola/


The casting helps DB2 to know the type of comparison that should be performed
(textual vs. numeric).
         Querying XML using SQL/XML

         create table dept(deptID char(8), deptdoc xml)



          select xmlquery(‘$d/dept//name’ passing deptdoc as “d”)
          from dept
          where …. ;




          Extraction of XML elements, attribute values, or fragments.
                                                                           25




The following white paper contains more detailes on SQL/XML queries:
http://www.ibm.com/developerworks/db2/library/techarticle/dm-0606nicola/
         Querying XML and Relational Data Combined

         create table dept(deptID char(8), deptdoc xml)




         select deptID,
                xmlquery(‘$d/dept//name’ passing deptdoc as “d”)
         from dept
         where deptID LIKE “PR%”
         and xmlexists(‘$d/dept[@bldg = 101]’ passing deptdoc as “d“)



                   Retrieving XML and relational data,
                   based on XML and relational predicates.
                                                                           26




The following white paper contains more detailes on SQL/XML queries:
http://www.ibm.com/developerworks/db2/library/techarticle/dm-0606nicola/
Agenda

   Brief Recap: DB2 9 pureXML
   XPath Expressions
   Combining SQL and XPath
     SQL/XML functions XMLQUERY and XMLEXISTS
     From XML to SQL Types: XMLCAST
     Applying SQL functions to XML Data
     Returning XML in relational format: XMLTABLE
   Summary




                                                    27
                Casting XML type to SQL types


                     XMLQUERY always returns type XML
                     To return other types: XMLCAST



                select xmlcast( xmlquery(‘$i/dept/@bldg’ passing deptdoc as “i”)
                       as integer )
                from dept
                where ….



                            XMLCAST fails if XMLQUERY returns more than 1 value

                                                                                   28




The XMLCAST function is used to convert XML type value to SQL types.


Note: based on our sample document with 2 employees, the following query would fail:


select xmlcast( xmlquery(‘$i/dept/employee/name’ passing deptdoc as “i”)
      as varchar(50) )
from dept;


Why? Because the path /dept/employee/name will produce a sequence of two values, since we
have two employee names in the sample document. Casting 2 XML values to a single SQL value
is not well defined. The SQL standard requires XMLCAST to fail when its argument produces
more than a single value. But, you see later that the XMLTABLE function provides a nice
solution for this.
               Casting XML type to SQL types…


                     …allows you to apply SQL functions
                     to values from the XML column !


               select max(c1) from
                ( select xmlcast( xmlquery(‘$i/dept/@bldg’ passing deptdoc as “i”)
                      as integer ) as c1
                from dept)




                                                                                29




Once you cast XML values to an SQL type, the result of now SQL values can be input to any
SQL query and any SQL function. Here the SQL function max is used as an example, but you
can imagine any other function.
         Casting XML type to SQL types…


               …allows you to sort rows based on XML
               element or attribute values !

           select deptID, deptdoc
           from dept
           where ….
           order by xmlcast(
               xmlquery('$i/dept/@bldg' passing deptdoc as "i")
                 as integer)



              XMLCAST required because SQL can not sort on XML types

                                                                       30




In the above query, deptid and deptdoc are returned sorted by the value of the
attribute bldg in the department documents in the XML column. Because SQL
cannot sort on XML values, the XML values must be cast to a SQL type that SQL
can order on, in this case, integer.
Agenda

   Brief Recap: DB2 9 pureXML
   XPath Expressions
   Combining SQL and XPath
     SQL/XML functions XMLQUERY and XMLEXISTS
     From XML to SQL Types: XMLCAST
     Applying SQL functions to XML Data
     Returning XML in relational format: XMLTABLE
   Summary




                                                    31
              XMLTABLE: Return XML in tabular format
                                                   SELECT X.* FROM dept,
                                                     XMLTABLE (‘$d/dept/employee’ passing deptdoc as “d”
                                                       COLUMNS
                                                       empID     INTEGER        PATH ‘@id’,
                                                       firstname VARCHAR(30) PATH ‘name/first’,
                                                       lastname VARCHAR(30) PATH ‘name/last’,
                                                       office    INTEGER        PATH ‘office’) AS “X”
               <dept bldg="101">
                 <employee id="901">
                      <name>
                            <first>John</first>
                            <last>Doe</last>
                      </name>
                      <office>344</office>
                   </employee>
                                                                  empID firstname lastname office
                   <employee id="902">
                      <name>                                      901      John        Doe        344
                            <first>Peter</first>
                            <last>Pan</last>                      902      Peter       Pan        216
                      </name>
                       <office>216</office>
                   </employee>
               </dept>

                                                                                                           32




The XMLTABLE function is used in the FROM clause of the SELECT statement together with
the table dept that it operates on. The XMLTABLE function is implicitly joined with the table dept
and applied to each of its rows.
The XMLTABLE function contains a row-generating XQuery expression and, in the COLUMNS
clause, one or multiple column-generating expressions. The row-generating expression is the
XPath $d/dept/employee. The passing clause defines that the variable $d refers to the XML
column deptdoc of the table dept.
The row-generating expression is applied to each XML document in the XML column and
produces one or multiple employee elements (sub-trees) per document. The output of the
XMLTABLE function contains one row for each employee element. Hence, the output produced
by the row-generating XQuery expression determines the cardinality of the result set of the
SELECT statement.
The COLUMNS clause is used to transform XML data into relational data. Each of the entries in
this clause defines a column with a column name and a SQL data type. Here, the returned rows
have 4 columns named empID, firstname and lastname and office of data types integer and
varchar(30). The values for each column are extracted from the employee elements and cast to
the SQL data types. For example, the path name/first is applied to each employee element to
obtain the value for the column firstname. The row-generating expression provides the context
for the column-generating expressions. In other words, you can typically append the column-
generating expressions to the row-generating expression to get an intuitive idea of what a given
XMLTABLE function returns in its columns.
        XMLTABLE: Single vs Multi Occurences
                                      SELECT X.* FROM dept,
                                        XMLTABLE (‘$d/dept/employee’ passing deptdoc as “d”
                                          COLUMNS
                                          bldg      INTEGER      PATH ‘../@bldg’,
                                          empID     INTEGER      PATH ‘@id’,
                                          firstname VARCHAR(30) PATH ‘name/first’,
                                          lastname VARCHAR(30) PATH ‘name/last’,
         <dept bldg="101">                office    INTEGER      PATH ‘office’) AS “X”
           <employee id="901">
                <name>
                      <first>John</first>
                      <last>Doe</last>
                </name>
                <office>344</office>
             </employee>                         bldg    empID firstname lastname office
             <employee id="902">
                <name>                           101     901      John         Doe          344
                      <first>Peter</first>
                      <last>Pan</last>           101     902      Peter        Pan          216
                </name>
                 <office>216</office>
             </employee>                           The "bldg" gets repeated for every employee.
         </dept>

                                                                                                  33




This example has one more column: bldg. Note that in our sample document the
bldg number is the same for both employees because there is only one bldg
attribute ! Hence, for each /dept/employee we need to obtain that bldg attribute by
going one level up (from employee to dept), and then from dept to @bldg. In XPath,
this is expressed as ../@bldg
        XMLTABLE: Missing elements
                                    SELECT X.* FROM dept,
                                      XMLTABLE (‘$d/dept/employee’ passing deptdoc as “d”
                                        COLUMNS
                                        bldg      INTEGER      PATH ‘../@bldg’,
                                        empID     INTEGER      PATH ‘@id’,
                                        firstname VARCHAR(30) PATH ‘name/first’,
                                        lastname VARCHAR(30) PATH ‘name/last’,
         <dept bldg="101">              office    INTEGER      PATH ‘office’) AS “X”
           <employee id="901">
               <name>
                   <first>John</first>
                   <last>Doe</last>
               </name>

             </employee>                       bldg    empID firstname lastname office
             <employee id="902">
                <name>                         101     901       John         Doe          NULL
                      <last>Pan</last>         101     902       NULL         Pan          216
                </name>
                 <office>216</office>
             </employee>                         Missing elements lead to NULLs in the result.
         </dept>

                                                                                                 34




The nice thing about XML is that missing values are simply omitted. Optional
elements are simply not there if they don‘t apply. This leads to NULL values in the
relational result set of the XMLTABLE function.
         XMLTABLE: Apply SQL aggregation and
                  GROUPing to XML data:
                                             SELECT X.office, COUNT(X.empID) as cnt
                                             FROM dept,
                                               XMLTABLE (‘$d/dept/employee’ passing deptdoc as “d”
                                                 COLUMNS
                                                 empID       INTEGER       PATH ‘@id’,
         dept
                                                 office      INTEGER       PATH ‘office’) AS “X”
         <dept bldg="101">
           <employee id="901">
                                             GROUP BY office;
                <name>
                      <first>John</first>
                      <last>Doe</last>
                </name>
                <office>344</office>
             </employee>                                               office      cnt
             <employee id="902">
                <name>                                                  344         1
                      <first>Peter</first>
                      <last>Pan</last>                                  216         1
                </name>
                 <office>216</office>
             </employee>
         </dept>


                                                                                                     35




And you can apply SQL aggregation functions and SQL group-ing to the rows
produced by XMLTABLE. It‘s obvious now: once XMLTABLE lifts some of the XML
element values from the XML level to the relational level, the full arsenal of the SQL
language is at your disposal to further manipulate that data.


Many more useful ways of using the XMLTABLE function are described in this 2-
part tutorial:
http://www.ibm.com/developerworks/db2/library/techarticle/dm-0708nicola/
http://www.ibm.com/developerworks/db2/library/techarticle/dm-0709nicola/
        XMLTABLE: Simple Shredding !
                                             INSERT INTO dept2
                                                SELECT X.* FROM dept,
                                                XMLTABLE (‘$d/dept/employee’ passing deptdoc as “d”
                                                 COLUMNS
                                                 empID      INTEGER        PATH ‘@id’,
                                                 firstname VARCHAR(30) PATH ‘name/first’,
         dept                                    lastname VARCHAR(30) PATH ‘name/last’,
         <dept bldg="101">
                                                 office     INTEGER        PATH ‘office’) AS “X”
           <employee id="901">
                <name>
                      <first>John</first>
                      <last>Doe</last>
                </name>                               dept2
                <office>344</office>
             </employee>                              empID firstname          lastname         office
             <employee id="902">
                <name>                                901       John           Doe              344
                      <first>Peter</first>
                      <last>Pan</last>
                                                      902       Peter          Pan              216
                </name>
                 <office>216</office>                create table dept2 (empid int, firstname varchar(30),
             </employee>
         </dept>
                                                                         lastname varchar(30), office int);

                                                                                                          36




Yes, most certainly you can insert the relational result set of the query into a
relational table – if the columns and data types of the table match the columns &
types of the result set.
        XMLTABLE: Flexible Transformation
                                 SELECT X.* FROM dept,
                                   XMLTABLE (‘$d/dept/employee’ passing deptdoc as “d”
                                     COLUMNS
                                     empID     INTEGER        PATH ‘@id’,
                                     R344      VARCHAR(30) PATH ‘.[office=344]/name/first'’,
                                     R216      VARCHAR(30) PATH ‘.[office=216]/name/first'’,
                                               ) AS “X”
        <dept bldg="101">
          <employee id="901">
               <name>
                     <first>John</first>
                     <last>Doe</last>
               </name>
               <office>344</office>
            </employee>                               empID R344             R216
            <employee id="902">
               <name>                                 901      John           -
                     <first>Peter</first>
                     <last>Pan</last>
                                                      902       -            Peter
               </name>
                <office>216</office>
            </employee>
                                                     Transpose your data. Conditionally map
        </dept>                                      elements to one column or the other.

                                                                                              37




Note how we use XPath predicates (in square brackets) in the column-generating
expressions of the XMLTABLE functions.
Summary

1. Use XPath to “get to” values in the XML data
2. Use XMLCAST to convert single values from
   XML to SQL data types
3. Use XMLTABLE to convert multiple XML
   values to relational rows
4. Then use SQL scalar functions, aggregation
   functions, GROUP BY, ORDER BY, Olap
   functions, etc.



                                                  38
Session E14
Unleash SQL Power to your XML Data




                  Matthias Nicola
                    IBM Silicon Valley Lab
                     mnicola@us.ibm.com




                                             39
        XMLTABLE: Multiple values per cell?
                                      SELECT X.* FROM dept,
                                        XMLTABLE (‘$d/dept’ passing deptdoc as “d”
                                          COLUMNS
                                          bldg      INTEGER       PATH ‘@bldg’,
                                          lastname VARCHAR(30) PATH ‘employee/name/last’,
                                          office    INTEGER       PATH ‘employee/office’) AS “X”
         <dept bldg="101">
           <employee id="901">
                <name>
                      <first>John</first>
                      <last>Doe</last>
                </name>
                <office>344</office>                            bldg       lastname office
             </employee>
             <employee id="902">
                <name>
                      <first>Peter</first>   Error.
                      <last>Pan</last>
                </name>                      SQL16003N An expression of data type "( item(), item()+ )" cannot be used
                 <office>216</office>        when the data type "INTEGER" is expected in the context.
             </employee>
         </dept>


                                                                                                                     40




Here, the row-generating expression is $d/dept. That means we try to produce one row
for each department. But, in the COLUMNS clause we try to get all employee last
names, and all employee office numbers. Multiple names and office numbers cannot be
mapped to a single row!


Be aware that the path expressions in the COLUMNS clause must not return more than
one value per row. If a path expression returns a sequence of two or more items, the
XMLTABLE execution will fail, as it is not possible to convert a sequence of XML values
into a single SQL value.


What does the error msg SQL16003N mean by An expression of data type "( item(),
item()+ )" ? This message shows us a regular expression, (item, item+), which means
there is an item followed by one or more additional items. That makes at least two items
– but only 1 is allowed here.
         XMLTABLE: Multiple values per cell?
                                      SELECT X.* FROM dept,
                                        XMLTABLE (‘$d/dept’ passing deptdoc as “d”
                                          COLUMNS
                                          bldg      INTEGER       PATH ‘@bldg’,
                                          lastname XML            PATH ‘employee/name/last’,
                                          office    XML           PATH ‘employee/office’) AS “X”
         <dept bldg="101">
           <employee id="901">
                <name>
                      <first>John</first>
                      <last>Doe</last>
                </name>
                <office>344</office>
             </employee>
                                                 bldg   lastname              office
             <employee id="902">
                <name>
                                                 101    <last>Doe</last>      <office>344</offi
                      <first>Peter</first>              <last>Pan</last>      ce><office>216<
                      <last>Pan</last>
                </name>
                                                                              /office>
                 <office>216</office>
             </employee>                            XML type columns can naturally hold multiple
         </dept>                                    values per cell !
                                                                                                   41




A possible solution to the issue on the previous slide..

								
To top