
									Local Government e-Claim (LGeC)



    XML Schema Overview




          Stephen Champeau
            September 13, 2007
                Version 1.0

        State Controller’s Office
     John Chiang, State Controller
     Information Systems Division
Table of Contents

INTRODUCTION
NAMESPACES
XML SCHEMA
DEFINING STRUCTURE
DEFINING ELEMENT CONSTRAINTS
COMBINING SCHEMAS




Introduction
This document presents a high-level overview of XML schema and how it is used in the LGeC XML
upload system. XML schema is a large topic, and this document should allow the reader to understand the
LGeC XML schemas without undertaking a lengthy, comprehensive study of XML schema.

In general, an XML schema defines the following constraints on an XML file to which the schema is applied:

    •   The structure of the document (what elements can appear as children of other elements; for a
        repeating or list element, how many of each child element can appear, etc.)
    •   The type of data that an individual element can contain
            o Data type (integer, boolean, string, etc.)
            o Minimum and maximum size
            o Number of decimal places



Namespaces
In the LGeC XML schemas, every claim has its own schema and associated namespace, and there is an
additional schema and namespace that defines how claims are combined into a single upload file. A basic
understanding of XML namespaces is therefore important in understanding the LGeC XML schemas.

Namespaces are a way to distinguish names used in XML documents, no matter where they come from.

For example, if one were to develop a customer XML document and a product XML document, it is
possible that both of these documents could define an element such as “name”, since both customers and
products have names.

        <customer>
          <name>
            <salutation>Field Marshall</salutation>
            <first-name>Steve</first-name>
            <last-name>Champeau</last-name>
            <suffix>Jr</suffix>
          </name>
        </customer>

        <product>
          <id>654763-4357438-34583</id>
          <name>Ultra-wide Back Scratcher</name>
        </product>

If these two XML documents were combined into an order XML document, in which one or more products
would be associated with a customer, validating this order document would be problematic because there
would be no way of differentiating the two uses of the element “name”:

        <order>
          <customer>
            <name>
              <salutation>Field Marshall</salutation>
              <first-name>Steve</first-name>
              <last-name>Champeau</last-name>
              <suffix>Jr</suffix>
            </name>
          </customer>
          <sales-person>64376</sales-person>
          <order-items>
            <order-item>
              <quantity>12</quantity>
              <product>
                <id>654763-4357438-34583</id>
                <name>Ultra-wide Back Scratcher</name>
              </product>
            </order-item>
          </order-items>
        </order>

By defining a namespace for order, customer and product, elements can be distinguished from each other
because the fully qualified names of the elements are now “customer:name” and “product:name” rather
than simply “name”:

        <order:order>
          <customer:customer>
            <customer:name>
              <customer:salutation>Field Marshall</customer:salutation>
              <customer:first-name>Steve</customer:first-name>
              <customer:last-name>Champeau</customer:last-name>
              <customer:suffix>Jr</customer:suffix>
            </customer:name>
          </customer:customer>
          <order:sales-person>64376</order:sales-person>
          <order:order-items>
            <order:order-item>
              <order:quantity>12</order:quantity>
              <product:product>
                <product:id>654763-4357438-34583</product:id>
                <product:name>Ultra-wide Back Scratcher</product:name>
              </product:product>
            </order:order-item>
          </order:order-items>
        </order:order>

Each namespace used in an XML document is identified by a URI (Uniform Resource Identifier), which is
typically declared in the root element of the XML document. The word between “xmlns:” and the URI,
such as “order” or “customer”, is the prefix that identifies the namespace to which an element belongs.

        <order:order
           xmlns:order="http://acme.com/order"
           xmlns:customer="http://acme.com/customer"
           xmlns:product="http://acme.com/product">
        </order:order>
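
A namespace can also be declared without a prefix, which makes it the default namespace for the element
on which it is declared and for that element's unprefixed descendants. The following abbreviated sketch
reuses the example acme.com URIs from above: it declares the order namespace as the default on the root
element and re-declares the default on the customer element. The elements shown belong to the same
namespaces as in the prefixed version of the order. The LGeC upload file example later in this document
uses this unprefixed form.

        <order xmlns="http://acme.com/order">
          <!-- customer and its descendants are in the customer namespace -->
          <customer xmlns="http://acme.com/customer">
            <name>
              <last-name>Champeau</last-name>
            </name>
          </customer>
          <!-- back in the order namespace declared on the root element -->
          <sales-person>64376</sales-person>
        </order>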




XML Schema
An XML schema document is itself an XML document that contains elements from the standard XML
schema namespace.

For example, the following is the root element of the schema for program 58.

         <xs:schema
            xmlns:tns="http://sco.ca.gov/lgec/claim-58"
            elementFormDefault="qualified"
            targetNamespace="http://sco.ca.gov/lgec/claim-58"
            xmlns:xs="http://www.w3.org/2001/XMLSchema">

The “targetNamespace” attribute specifies that this schema document defines the namespace
“http://sco.ca.gov/lgec/claim-58”, that is, all of the elements defined in this schema belong to that
namespace.

Like all XML schema documents, this document uses elements from the “XMLSchema” namespace, and
within the schema these elements are identified and differentiated from other elements by being prefixed
with the identifier “xs”. As described above, this prefix is associated with the XMLSchema namespace by the
following attribute in the root element:

         xmlns:xs="http://www.w3.org/2001/XMLSchema"

Note that the root element of the document, “xs:schema”, is itself a member of the XMLSchema
namespace.

Within the XML schema document, elements from the XMLSchema namespace are used to define the
format of the XML documents that will conform to this schema. For example, the following node from the
program 58 schema document uses two names from the XMLSchema namespace, an element called
“element” and a built-in data type called “string”, which appears as an attribute value. Note that they
are both prefixed with “xs:”.

         <xs:element name="department-name" type="xs:string" />

This element defines an element called “department-name” of data type string that belongs to the
namespace “http://sco.ca.gov/lgec/claim-58”, the namespace specified in the root element’s
“targetNamespace” attribute.

In an XML document that conforms to this schema, a valid instance of this node is:

         <department-name>Accounting</department-name>
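
Because the schema sets “elementFormDefault” to “qualified”, the element is valid only when it belongs to
the claim-58 namespace, so the fragment above presumes that this namespace is already in scope. The
following is a minimal sketch of how that might look, assuming the element sits somewhere inside the
program's root “claim” element. Its actual position and its sibling elements are not shown in this
overview and are elided here:

         <claim xmlns="http://sco.ca.gov/lgec/claim-58">
            ...
            <department-name>Accounting</department-name>
            ...
         </claim>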


In the remaining sections of this document, the various ways in which XML schema is used to define the
structure of an XML document will be described.




Defining Structure
XML documents can contain parent-child node relationships of arbitrary complexity. XML schema
includes elements that define such relationships.

The simplest parent-child relationship is defined in the fragment below, showing that the
“summary-reimbursement” element contains three child elements:

         <xs:element name="summary-reimbursement">
            <xs:complexType>
               <xs:sequence>
                  <xs:element name="fiscal-year" type="xs:string" />
                  <xs:element name="combined"    type="xs:boolean" />
                  <xs:element name="amended"     type="xs:boolean" />
               </xs:sequence>
            </xs:complexType>
         </xs:element>

All of the child elements are required and must appear in the order specified. Here is an XML fragment that
satisfies the above XML schema fragment:

         <summary-reimbursement>
            <fiscal-year>2004/2005</fiscal-year>
            <combined>true</combined>
            <amended>false</amended>
         </summary-reimbursement>


The minimum and maximum number of times an element can appear are specified in the XML schema
using the “minOccurs” and “maxOccurs” attributes. For example, many claim statistics are
optional, which is specified as follows:

         <xs:element minOccurs="0" maxOccurs="1" name="claim-statistic-2">

If these attributes are not included in an element definition, each defaults to one. That is why
the child elements of “summary-reimbursement” above are required: they have implicit
“minOccurs” and “maxOccurs” values of one. If an element can occur an unlimited number of times, the
value assigned to “maxOccurs” is the string “unbounded”.
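
As a sketch of the “unbounded” case, the following declaration describes an element that may be omitted
entirely or repeated any number of times. The name “claim-note” is hypothetical, chosen only for
illustration, and is not taken from the LGeC schemas:

         <!-- "claim-note" is a hypothetical element name -->
         <xs:element minOccurs="0" maxOccurs="unbounded"
                     name="claim-note" type="xs:string" />

An XML fragment with two occurrences of this element would then be valid:

         <claim-note>First note</claim-note>
         <claim-note>Second note</claim-note>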




XML schema also allows an element to contain one of several alternative child elements, using the schema
“choice” element.

For example, a claim can have a reimbursement and/or an estimated summary. An estimated summary can
be either within limit or beyond limit. The claim XML schemas use the “choice” element to define both of
these constraints, specifying a “minOccurs” of 1 and “maxOccurs” of 2 for the “one or both” relationship
within the “summary” element.

As with the “element” element, the “choice” element has an implicit “minOccurs” and “maxOccurs” of 1,
so specifying neither defines the “one or the other” relationship needed for the estimated summary.

        <xs:element name="summary">
          <xs:complexType>
            <xs:sequence>
              <xs:choice minOccurs="1" maxOccurs="2">
                <xs:element name="summary-reimbursement">
                  ...
                </xs:element>
                <xs:element name="summary-estimated">
                  <xs:complexType>
                    <xs:sequence>
                      <xs:choice>
                        <xs:element name="summary-estimated-within-limit">
                          ...
                        </xs:element>
                        <xs:element name="summary-estimated-beyond-limit">
                          ...
                        </xs:element>
                      </xs:choice>
                    </xs:sequence>
                  </xs:complexType>
                </xs:element>
              </xs:choice>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
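
As an illustration, an XML fragment along the following lines would satisfy the schema fragment above. The
outer “choice” is used twice, once for each alternative, and the inner “choice” is used once. The contents
of the innermost elements are elided here, as they are in the schema fragment:

        <summary>
          <summary-reimbursement>
            ...
          </summary-reimbursement>
          <summary-estimated>
            <summary-estimated-within-limit>
              ...
            </summary-estimated-within-limit>
          </summary-estimated>
        </summary>

Taken on its own, a “choice” with a “maxOccurs” of 2 also accepts two occurrences of the same alternative,
so the schema fragment as shown does not by itself rule out, for example, two “summary-reimbursement”
elements.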




Defining Element Constraints
A number of XML elements shown in the examples above have their data types defined as “string”,
“boolean”, etc. XML schema allows much more detailed data definitions to be specified.

The following structure is used throughout the LGeC claim XML schema documents to specify minimum
and maximum values for currency elements:

         <xs:element name="late-penalty">
           <xs:simpleType>
             <xs:restriction base="xs:decimal">
               <xs:minInclusive value="0" />
               <xs:maxInclusive value="1000" />
               <xs:fractionDigits value="2" />
             </xs:restriction>
           </xs:simpleType>
         </xs:element>

The basic idea is to start with a base type such as decimal and place one or more restrictions on it,
such as a minimum value, a maximum value, or a maximum number of decimal places.
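
To make the effect of these restrictions concrete, the following sketch lists some sample values for the
“late-penalty” element declared above. The values themselves are made up for illustration:

         <!-- Valid: between 0 and 1000, with at most two decimal places -->
         <late-penalty>0</late-penalty>
         <late-penalty>125.50</late-penalty>
         <late-penalty>1000</late-penalty>

         <!-- Invalid: below the minimum of 0 -->
         <late-penalty>-5.00</late-penalty>

         <!-- Invalid: above the maximum of 1000 -->
         <late-penalty>1000.01</late-penalty>

         <!-- Invalid: more than two decimal places -->
         <late-penalty>12.345</late-penalty>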

Another type restriction used in the LGeC claim XML schema documents is to specify a list of possible
values for a string element:

         <xs:element name="object-account">
           <xs:simpleType>
             <xs:restriction base="xs:string">
               <xs:enumeration value="object-account-option-1" />
               <xs:enumeration value="object-account-option-2" />
               <xs:enumeration value="object-account-option-3" />
             </xs:restriction>
           </xs:simpleType>
         </xs:element>

This restricts the value of the “object-account” string element to one of the three enumeration values
specified. The “object-account” element can therefore appear in an XML document with one of the three
specified values:

         <object-account>object-account-option-1</object-account>

Or

         <object-account>object-account-option-2</object-account>

Or

         <object-account>object-account-option-3</object-account>
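
Size limits on string elements can be expressed with the same “restriction” structure. The following is a
sketch only: the element name “contact-name” and the limits shown are hypothetical and are not taken from
the LGeC schemas. It uses the standard “minLength” and “maxLength” restrictions to require a value of
between 1 and 50 characters:

         <!-- "contact-name" and its limits are hypothetical -->
         <xs:element name="contact-name">
           <xs:simpleType>
             <xs:restriction base="xs:string">
               <xs:minLength value="1" />
               <xs:maxLength value="50" />
             </xs:restriction>
           </xs:simpleType>
         </xs:element>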




Combining Schemas
As mentioned above, there is an XML schema document for each claim program which specifies the format
to which claims for that program must adhere. Because an LGeC XML upload file can contain more than one
claim for more than one program, an XML schema document has been created that defines how multiple
claim documents can be combined into a single upload file.

The following is an abbreviated version of the file Lgec-Claims.xsd for three programs:

        <xs:schema
           xmlns:ns-claim-150="http://sco.ca.gov/lgec/claim-150"
           xmlns:ns-claim-157="http://sco.ca.gov/lgec/claim-157"
           xmlns:ns-claim-166="http://sco.ca.gov/lgec/claim-166"
           elementFormDefault="qualified"
           targetNamespace="http://sco.ca.gov/lgec/claims"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">

           <xs:import schemaLocation="LGeC-Claim-150.xsd"
                      namespace="http://sco.ca.gov/lgec/claim-150" />
           <xs:import schemaLocation="LGeC-Claim-157.xsd"
                      namespace="http://sco.ca.gov/lgec/claim-157" />
           <xs:import schemaLocation="LGeC-Claim-166.xsd"
                      namespace="http://sco.ca.gov/lgec/claim-166" />

          <xs:element name="claims">
            <xs:complexType>
              <xs:sequence>
                <xs:element
                  minOccurs="0" maxOccurs="unbounded"
                  ref="ns-claim-150:claim" />
                <xs:element
                  minOccurs="0" maxOccurs="unbounded"
                  ref="ns-claim-157:claim" />
                <xs:element
                  minOccurs="0" maxOccurs="unbounded"
                  ref="ns-claim-166:claim" />
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:schema>

Things to note:
    • Like the claim program XML schema files discussed above, it has a reference to the XML schema
         namespace and uses elements from that namespace.
    • It has a “targetNamespace” of “http://sco.ca.gov/lgec/claims”, the containing namespace for
         all elements defined in this schema file.

The claim program XML schema files and their namespaces are imported into this schema using the
“import” XML schema element. Each of these imported namespaces is then associated with a prefix (such
as “ns-claim-150”) using an “xmlns” attribute in the root node. Finally, the root “claim” element of each
claim program XML schema file is specified as a possible child of the “claims” element using the
claim-specific prefix defined in the schema root element. The minimum number of each claim that may
appear is zero (minOccurs="0"), so every claim is optional. The maximum number of each claim that may
appear is unlimited (maxOccurs="unbounded").
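
For a “ref” such as “ns-claim-150:claim” to resolve, the imported schema must declare “claim” as a global
element, that is, as a direct child of its “xs:schema” root, in the matching target namespace. The
following is a sketch of what the relevant part of LGeC-Claim-150.xsd might look like. The root element
attributes follow the program 58 example shown earlier. The data type given to “program-number” is an
assumption, and the remaining contents of the “claim” element are elided:

         <xs:schema
            xmlns:tns="http://sco.ca.gov/lgec/claim-150"
            elementFormDefault="qualified"
            targetNamespace="http://sco.ca.gov/lgec/claim-150"
            xmlns:xs="http://www.w3.org/2001/XMLSchema">

            <!-- Global element: this is what ref="ns-claim-150:claim" points to -->
            <xs:element name="claim">
               <xs:complexType>
                  <xs:sequence>
                     <!-- The type shown here is assumed, not taken from the real schema -->
                     <xs:element name="program-number" type="xs:string" />
                     ...
                  </xs:sequence>
               </xs:complexType>
            </xs:element>
         </xs:schema>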

Because the claims appear in this schema in numeric order by program number, the claims in an upload file
must appear in numeric order by program number, and must be grouped together by program number.



Here is an example of an upload file:

        <claims xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns="http://sco.ca.gov/lgec/claims">
          <claim xmlns="http://sco.ca.gov/lgec/claim-150">
             <program-number>150</program-number>
             ...
          </claim>
          <claim xmlns="http://sco.ca.gov/lgec/claim-150">
             <program-number>150</program-number>
             ...
          </claim>
          <claim xmlns="http://sco.ca.gov/lgec/claim-157">
             <program-number>157</program-number>
             ...
          </claim>
          <claim xmlns="http://sco.ca.gov/lgec/claim-157">
             <program-number>157</program-number>
             ...
          </claim>
          <claim xmlns="http://sco.ca.gov/lgec/claim-157">
             <program-number>157</program-number>
             ...
          </claim>
          <claim xmlns="http://sco.ca.gov/lgec/claim-166">
             <program-number>166</program-number>
             ...
          </claim>
        </claims>

Note that the claims appear in numeric order by program number and the two claims for program 150 and
the three for program 157 are grouped together.
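
By contrast, the following sketch shows an upload file that would fail validation against the combining
schema, because a claim for program 150 appears after a claim for program 157:

        <claims xmlns="http://sco.ca.gov/lgec/claims">
          <claim xmlns="http://sco.ca.gov/lgec/claim-157">
             <program-number>157</program-number>
             ...
          </claim>
          <!-- Invalid: program 150 claims must come before program 157 claims -->
          <claim xmlns="http://sco.ca.gov/lgec/claim-150">
             <program-number>150</program-number>
             ...
          </claim>
        </claims>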



