					          XML and Web Technologies

                    Lecture 3: XML Schema

                    Prof Mark Baker

                    ACET, University of Reading
                    Tel: +44 118 378 8615
                    E-mail: Mark.Baker@computer.org
                    Web: http://acet.rdg.ac.uk/~mab

      Spring 2010             mark.baker@computer.org
•    Problems with DTDs,
•    What is XML Schema,
•    XML Namespaces,
•    What Do Schemas Look Like?
•    episodes.xsd

     Problems with DTDs
• Infuriating:
    – Cannot give precise datatype definitions,
    – ID/IDREFs are almost useless for expressing foreign key constraints.
• Annoying:
    – DTDs are not XML documents:
         • Must learn a new language,
         • Cannot use XML editor/validator,
    – DTDs are not “namespace-aware”.
• The solution: XML-schema W3C recommendation,
• See http://www.w3c.org/XML/Schema for a good
  web-based primer:
    – Also Roger Costello’s xml-schema tutorial (www.xfront.com).
    – Also details about XML Schema 1.1.

                 What is XML Schema?
• XML Schema is a definition language that allows you
  to constrain conforming XML documents to a specific
  vocabulary and hierarchical structure.
• The things you want to define in your language are
  element types, attribute types, and the composition of
  both into composite types (called complex types).
• XML Schema is analogous to a database schema, which
  defines the column names and data types in database
  tables.
• XML Schema became a W3C Recommendation
  (synonymous with standard) on May 2, 2001.
• XML Schema 1.1 introduces a new namespace, the
  version control namespace:
   – http://www.w3.org/2007/XMLSchema-versioning.
                 What is XML Schema?
• We have two types of documents: a schema
  document (or definition document) and multiple
  instance documents that conform to the schema:
   – A good analogy to remember the difference between
     these two types of documents is that a schema
     definition is a blueprint (or template) of a type and each
     instance is an incarnation of that template.

• This also demonstrates the two
  roles that a schema can play:
   • Template for a form generator
      to generate instances of a
      document type,
   • Validator to ensure the
      accuracy of documents.

               What Is XML Schema?
• Both the schema document and the instance document
  use XML syntax (tags, elements, and attributes).
• This was one of the primary motivating factors to
  replace DTDs, which did not use XML syntax.
• Having a single syntax for both definition and instance
  documents allows a single parser to be used for both.
• Referring back to our database analogy, the database
  schema defines the columns, and the table rows are
  instances of each definition.

       Advantages of XML Schemas
• XML Schemas are more advanced than DTDs:
   – Enhanced datatypes,
   – Can roll your own datatypes,
   – Written in the same syntax as instance documents,
   – Object-oriented-ish,
   – Can express sets,
   – Can specify element content as being unique (keys
     on content) and uniqueness within a region,
   – Can define multiple elements with the same name
     but different content,
   – Can define elements with no content,
   – Can define substitutable elements.

               What Is XML Schema?
• Each instance document must “declare” which
  definition document (or schema) it adheres to:
   – Done with a special attribute attached to the root element
     called xsi:noNamespaceSchemaLocation or xsi:schemaLocation:
        • The attribute used depends on whether your vocabulary is
          defined in the context of a namespace – more later.
• XML Schemas allow validation of instances to ensure
  the accuracy of field values and document structure
  at the time of creation.
• The accuracy of fields is checked against the type of
  the field:
   – e.g. a quantity typed as an integer or money typed as a
     decimal.
• The structure of a document is checked for things
  like legal elements and attribute names, correct
  number of children, and required attributes.
                 XML Namespaces
• Namespaces are a simple mechanism for creating
  globally unique names for the elements and attributes
  of your markup language.
• This is important for two reasons:
   – Remove conflict between the meaning of identical names in
     different markup languages,
   – Allow different markup languages to be mixed together
     without ambiguity.
• Unfortunately, namespaces were not fully compatible with
  DTDs, and therefore their adoption has been slow.
• XML Schema fully supports namespaces.

                     XML Namespaces
• Namespaces are implemented by requiring every XML
  name to consist of two parts: a prefix and a local part.
• A fully qualified name combines both parts, e.g. xsd:integer.
• The local part is the identifier for the metadata
  (“integer”), and the prefix is an abbreviation for the
  actual namespace in the namespace declaration.
• The actual namespace is a unique URI, e.g.:
  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   – Declares a namespace for all the XML Schema elements to be
     used in a schema document:
        • It defines the prefix “xsd” to stand for the namespace
               – Note that the prefix is not the namespace.
        • The prefix can change from one instance document to another:
               – The prefix is merely an abbreviation (alias) for the namespace,
                 which is the URI.
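
The point that the prefix is only an alias can be checked with a short sketch. It assumes Python's standard xml.etree.ElementTree module, which reports each element name as {namespace-URI}local-part; the two one-line documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Two illustrative documents that bind different prefixes to the same URI.
doc_a = f'<xsd:schema xmlns:xsd="{XSD_NS}"/>'
doc_b = f'<xs:schema xmlns:xs="{XSD_NS}"/>'

# ElementTree drops the prefix and keeps the namespace URI plus local part.
tag_a = ET.fromstring(doc_a).tag
tag_b = ET.fromstring(doc_b).tag

print(tag_a)           # the expanded name of the root element
print(tag_a == tag_b)  # the prefixes differ, the names do not
```

Both documents yield the same expanded name, so a processor treats xsd:schema and xs:schema as the same element.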

                XML Namespaces
• To specify the namespace of the new elements you
  want to define, use the targetNamespace attribute:
  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
• There are two ways to apply a namespace to a document:
   – Attach the prefix to each element and attribute in the
     document,
   – Or declare a default namespace for the document.
• A default namespace is declared by eliminating the
  prefix from the declaration.

                 XML Namespaces
   <html xmlns="http://www.w3.org/1999/xhtml">
    <head> <title> Default namespace Test </title> </head>
    <body> Go Semantic Web!! </body>
   </html>
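• Here is a textual representation of what the
  preceding document is internally translated to
  by a conforming XML processor:

   <{http://www.w3.org/1999/xhtml}html>
    <{http://www.w3.org/1999/xhtml}head> <{http://www.w3.org/1999/xhtml}title> Default namespace Test </{http://www.w3.org/1999/xhtml}title> </{http://www.w3.org/1999/xhtml}head>
    <{http://www.w3.org/1999/xhtml}body> Go Semantic Web!! </{http://www.w3.org/1999/xhtml}body>
   </{http://www.w3.org/1999/xhtml}html>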
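
The expansion of a default namespace can be observed directly. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree module as the XML processor; it prints names in the same {URI}local-part form used above.

```python
import xml.etree.ElementTree as ET

# The default-namespace document from the slide, written on one logical line.
doc = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Default namespace Test</title></head>"
    "<body>Go Semantic Web!!</body>"
    "</html>"
)

# Walk the tree in document order and collect the expanded element names.
expanded = [element.tag for element in ET.fromstring(doc).iter()]

for name in expanded:
    print(name)
```

Every element, not only the root, picks up the XHTML namespace, because the declaration is in scope for all descendants.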
               XML Namespaces
• Used to distinguish between duplicate element
  types and attribute names.
• An XML namespace is a collection of element
  type and attribute names:
   – The namespace is identified by a URI,
   – Two-part naming convention:
        • The local name,
        • The URI of the XML namespace.

         xmlns:foo = "http://www.foo.org/"
   – This two part naming convention is the only
     function of XML Namespaces

                     Declaring a Namespace
  • XML namespaces are declared with an xmlns attribute
    - can associate a prefix with the namespace.
  • The declaration is in scope for the element containing
    the attribute and all its descendants.

<!-- Declares two XML namespaces. Their scope is the A and B elements. -->
<A xmlns:foo="http://www.foo.org/" xmlns="http://www.bar.org/">
   <B>abcd</B>
</A>

Referencing a schema in an XML instance
  1.     The default namespace declaration (xmlns) tells the schema-validator that all of the
         elements used in this instance document come from the http://www.books.org
         namespace.
  2.     The xmlns:xsi declaration tells the schema-validator that the schemaLocation attribute
         we are using is the one in the XMLSchema-instance namespace.
  3.     The schemaLocation attribute tells the schema-validator that the http://www.books.org
         namespace is defined by BookStore.xsd (i.e., schemaLocation contains a pair of
         values: a namespace name and the location of its schema).

       <?xml version="1.0"?>
       <BookStore xmlns="http://www.books.org"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                  xsi:schemaLocation="http://www.books.org BookStore.xsd">
            <Book>
                 <Title>My Life and Times</Title>
                 <Author>Paul McCartney</Author>
                 <Date>July, 1998</Date>
                 <Publisher>McMillin Publishing</Publisher>
            </Book>
       </BookStore>

          Namespace Considerations
• The root element of any schema is a schema element from the
  http://www.w3.org/2001/XMLSchema namespace.
• The targetNamespace attribute on this element specifies which
  namespace the elements declared here “go into”.
• The other namespace attributes are:
   – The xmlns:xsd attribute associates the prefix xsd with the XML
     Schema namespace.
   – The xmlns attribute makes the default namespace
     http://www.mydomain.org/ns/report1 for this document (see next
     slide).
   – Often one uses xsd as the prefix for schema elements, and makes
     the target namespace the default namespace of the schema
     document, but neither is essential.

        An XML Instance Document
<?xml version="1.0"?>
<report xmlns="http://www.mydomain.org/ns/report1"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.mydomain.org/ns/report1 report1.xsd">
   <paragraph>Recently uncovered documents prove...</paragraph>
   <paragraph>The author is grateful to W3C for making this
   research possible.</paragraph>
</report>

           Namespace Considerations
• Assuming the document vocabulary belongs to a
  namespace, we must declare this namespace:
   – In this example http://www.mydomain.org/ns/report1 is
     declared as the default namespace.
• If the instance document is to be validated against a
  schema, we must normally define where the schema
  for the namespace is located.
• This is done here by putting an attribute
  schemaLocation on the root element of the document.
• This attribute is itself defined in a standard
  namespace, called
  http://www.w3.org/2001/XMLSchema-instance:
   – So we must introduce a prefix for this (xsi is traditional).

• The value of the schemaLocation attribute should be a
  pair of URIs: a namespace name and the corresponding
  Schema URI:
   – If the document uses more than one namespace, the value can
     be several consecutive pairs
   – All tokens are separated by white space.
• In this example the schema should be in the file
  report1.xsd in the same directory as the instance
  document.
       What Do Schemas Look Like?
• An XML Schema uses XML syntax to declare a set of
  simple and complex types:
    – A type is a named template that can hold one or more values:
         • Simple types hold one value,
         • Complex types are composed of multiple simple types.
• So, a type has two key characteristics: a name and a
  legal set of values.
• A simple type is an element declaration that includes
  its name and value constraints.
• Here is an example of an element called author that
  can contain any number of text characters:
    – <xsd:element name="author" type="xsd:string" />
• The preceding element declaration enables an instance
  document to have an element like this:
    – <author> Fred Bloggs</author>

      What Do Schemas Look Like?
• Notice that the type attribute in the element
  declaration declares the type to be xsd:string:
    – A string is a sequence of characters.
• There are many built-in data types defined in the
  XML Schema specification.
• If a built-in data type does not constrain the values
  the way the document designer wants, XML Schema
  allows the definition of custom data types.

      XML Schemas are Extensible
• XML Schemas are extensible, just like XML,
  because they are written in XML.
• With an extensible schema definition you can:
    – Reuse your schema in other schemas,
    – Create your own data types derived from standard
      types,
    – Reference multiple schemas from the same
      document.

                 episodes.xsd
• Schema keywords must use the XML Schema namespace.
• <xs:element> is used to declare elements.
• The type of an element is given by its content or by the type attribute.
• Elements and attributes with textual data become type xs:string.

    <?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:element name="action" type="xs:string"/>
        <xs:element name="dialog">
            <xs:complexType>
                <xs:simpleContent>
                    <xs:extension base="xs:string">
                        <xs:attribute name="speaker" type="xs:string" use="required"/>
                    </xs:extension>
                </xs:simpleContent>
            </xs:complexType>
        </xs:element>
        <xs:element name="episode">
            <xs:complexType>
                <xs:choice maxOccurs="unbounded">
                    <xs:element ref="dialog"/>
                    <xs:element ref="action"/>
                </xs:choice>
            </xs:complexType>
        </xs:element>
    </xs:schema>

               Schema Reference
• The XML file references the Schema:
    – It must then adhere to those restrictions in order
      to be valid:

<?xml version="1.0" encoding="UTF-8"?>
<!-- BridgeOfDeathEpisode.xml -->
<episode xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="episodes.xsd">
   <dialog speaker="Bridgekeeper">Stop! Who would cross the
   Bridge of Death must answer me these questions three, ere the
   other side he see.</dialog>
</episode>

         Predefined Simple Types
• Both Elements and Attributes have Types.
• You can specify a predefined simple type or make
  your own.
• XML Schema has a lot of built-in data types.
• Here are the most common types:
    –   xsd:string
    –   xsd:decimal
    –   xsd:integer
    –   xsd:boolean
    –   xsd:date
    –   xsd:time

           Using Specific Simple Types

• Using type “integer” for the zip ensures that no non-numeric
  characters are put in it (what about non-US addresses?):
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" >
 <xsd:element name="Address">
  <xsd:complexType>
   <xsd:sequence>
     <xsd:element name="Street" type="xsd:string"/>
     <xsd:element name="Apartment" type="xsd:string"/>
     <xsd:element name="Zip" type="xsd:integer"/>
   </xsd:sequence>
  </xsd:complexType>
 </xsd:element>
</xsd:schema>

               Default and Fixed Values
• Simple elements can have a default value OR a fixed
  value declared.
• A default value is automatically assigned to the
  element when no other value is specified.
• In the following example the default value is 42:
   <xs:element name="LifeAndUniverse" type="xs:integer" default="42"/>

• A fixed value is also automatically assigned to the
  element; however, the XML file cannot specify
  another value.
• In the following example the fixed value is 42:

   <xs:element name="LifeAndUniverse" type="xs:integer" fixed="42"/>

               Simple Elements

• Simple elements cannot have attributes.
• If an element has attributes, it is considered to be a
  complex type.
• The attribute itself is always declared as a simple
  type.
• An element with attributes always has a complex type.
• Simple elements cannot have other elements in their
  content.

       Data type Restrictions
• A DTD can only say that zip can be any non-markup text:
  <!ELEMENT zip (#PCDATA)>

• In a schema this translates to:
  <xsd:element name="zip" type="xsd:string"/>

• But in an XML Schema you can do better:
  <xsd:element name="zip" type="xsd:decimal"/>

• Or even, make your own restrictions:
    <xsd:simpleType name="ZipPlus4">
      <xsd:restriction base="xsd:string">
         <xsd:length value="10"/>
         <xsd:pattern value="\d{5}-\d{4}"/>
      </xsd:restriction>
    </xsd:simpleType>
    <xsd:element name="zip" type="ZipPlus4"/>

              Restriction Ranges
• The restrictions must be "derived" from a base type,
  so it is object based:
    <xs:element name="LifeUniverseAndEverything">
      <xs:simpleType>
         <xs:restriction base="xs:integer">
          <xs:minInclusive value="42"/>
          <xs:maxInclusive value="42"/>
         </xs:restriction>
      </xs:simpleType>
    </xs:element>
• The preceding type is "derived" from "integer" and has 2 restrictions (called
  facets):
    – Greater than 41 and less than 43.
• The value in the XML file is "42":
    <LifeUniverseAndEverything>42</LifeUniverseAndEverything>

                Restriction Facets
Facet             Description                                                         Value must be
enumeration       Defines a list of acceptable values.
fractionDigits    The maximum number of decimal places allowed.                       >=0
length            The exact number of characters or list items allowed.               >=0
maxExclusive      The upper bound for numeric values (the value must be less
                  than the value specified).
maxInclusive      The upper bound for numeric values (the value must be less
                  than or equal to the value specified).
maxLength         The maximum number of characters or list items allowed.             >=0
minExclusive      The lower bound for numeric values (the value must be greater
                  than the value specified).
minInclusive      The lower bound for numeric values (the value must be greater
                  than or equal to the value specified).
minLength         The minimum number of characters or list items allowed.             >=0
pattern           The sequence of acceptable characters based on a regular
                  expression.
totalDigits       The maximum number of digits allowed.                               >0
whiteSpace        Specifies how white space (line feeds, tabs, spaces, and carriage
                  returns) is handled.

                Enumeration Facet

<xs:element name="FavouriteColour">
  <xs:simpleType>
       <xs:restriction base="xs:string">
             <xs:enumeration value="red"/>
             <xs:enumeration value="no blue"/>
             <xs:enumeration value="aarrrrggghh!!"/>
       </xs:restriction>
  </xs:simpleType>
</xs:element>

   Patterns (Regular Expressions)

• One interesting facet is the pattern, which allows
  restrictions based on a regular expression.
• This regular expression specifies a normal word of
  one or more letters:

  <xs:element name="Word">
      <xs:simpleType>
             <xs:restriction base="xs:string">
                   <xs:pattern value="[a-zA-Z]+"/>
             </xs:restriction>
      </xs:simpleType>
  </xs:element>

 Patterns (Regular Expressions)

• Individual characters may be repeated a specific
  number of times in the regular expression.
• The following regular expression restricts the string
  to exactly 8 alpha-numeric characters:

   <xs:element name="password">
        <xs:simpleType>
                <xs:restriction base="xs:string">
                        <xs:pattern value="[a-zA-Z0-9]{8}"/>
                </xs:restriction>
        </xs:simpleType>
   </xs:element>
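
A pattern facet must match the whole value, not just part of it. The sketch below approximates that with Python's re.fullmatch. It is an assumption that Python's regular expression syntax agrees with XML Schema's for these two simple patterns (the two dialects differ in other constructs), and matches_pattern is a hypothetical helper name.

```python
import re


def matches_pattern(pattern: str, value: str) -> bool:
    """Approximate an XSD pattern facet: the entire value must match."""
    return re.fullmatch(pattern, value) is not None


WORD = "[a-zA-Z]+"           # pattern of the Word element
PASSWORD = "[a-zA-Z0-9]{8}"  # pattern of the password element

print(matches_pattern(WORD, "Grail"))         # letters only
print(matches_pattern(WORD, "Holy Grail"))    # contains a space
print(matches_pattern(PASSWORD, "Lancelot"))  # exactly 8 characters
print(matches_pattern(PASSWORD, "Arthur"))    # only 6 characters
```

Using re.search instead would wrongly accept "Holy Grail", because it would find a matching substring.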

            White space facet
• The "white space" facet controls how white space in
  the element will be processed.
• There are three possible values of the white space
  facet:
   – preserve causes the processor to keep all white space as-is.
   – replace causes the processor to replace all white space
     characters (tabs, carriage returns, line feeds, spaces) with
     space characters.
   – collapse causes the processor to replace all strings of white
     space characters (tabs, carriage returns, line feeds, spaces)
     with a single space character, and to remove leading and
     trailing white space:

         <xs:restriction base="xs:string">
                <xs:whiteSpace value="replace"/>
         </xs:restriction>
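
The three values of the white space facet can be imitated with a small function. This is a sketch of the normalization rules as described here (collapse also trims leading and trailing spaces), not a validator; apply_white_space is a hypothetical name.

```python
import re


def apply_white_space(value: str, mode: str) -> str:
    """Normalize a value the way the whiteSpace facet describes."""
    if mode == "preserve":
        return value
    # replace: every tab, line feed and carriage return becomes one space.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if mode == "replace":
        return replaced
    if mode == "collapse":
        # collapse: after replacing, squeeze runs of spaces and trim the ends.
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError(f"unknown whiteSpace value: {mode}")


sample = "  Bridge\tof\n\nDeath "
for mode in ("preserve", "replace", "collapse"):
    print(repr(apply_white_space(sample, mode)))
```

The replace mode keeps the length of the string unchanged, while collapse usually shortens it.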

               Complex Elements

• A complex element is an XML element that contains
  other elements, attributes, or both (like C struct, or
  OO class).
• There are four kinds of complex elements:
    –   Empty,
    –   ones that contain only other elements,
    –   ones that contain only text,
    –   ones that contain both other elements and text.
• Each of the four kinds may contain attributes as well.

     • Both elements and attributes have types, which are defined in
       the Schema. One can reuse types by giving them names:
         <xsd:element name="Address">
           <xsd:complexType>
             <xsd:sequence>
               <xsd:element name="Street" type="xsd:string"/>
               <xsd:element name="Apartment" type="xsd:string"/>
               <xsd:element name="Zip" type="xsd:string"/>
             </xsd:sequence>
           </xsd:complexType>
         </xsd:element>
         <xsd:complexType name="AddrType">
           <xsd:sequence>
              <xsd:element name="Street" type="xsd:string"/>
              <xsd:element name="Apartment" type="xsd:string"/>
              <xsd:element name="Zip" type="xsd:string"/>
           </xsd:sequence>
         </xsd:complexType>
         <xsd:element name="ShipAddress" type="AddrType"/>
         <xsd:element name="BillAddress" type="AddrType"/>

• The use in the XML file is identical:
<?xml version="1.0" encoding="UTF-8"?>

     <ShipAddress>
         <Street>1108 E. 58th St.</Street>
         <Apartment>Ryerson 155</Apartment>
     </ShipAddress>

     <BillAddress>
         <Street>1108 E. 58th St.</Street>
         <Apartment>Ryerson 155</Apartment>
     </BillAddress>

              Type Extensions
• A third way of creating a complex type is to extend another
  complex type (like OO inheritance):

    <xs:element name="Employee" type="PersonInfoType"/>
    <xs:complexType name="PersonNameType">
       <xs:sequence>
         <xs:element name="FirstName" type="xs:string"/>
         <xs:element name="LastName" type="xs:string"/>
       </xs:sequence>
    </xs:complexType>
    <xs:complexType name="PersonInfoType">
       <xs:complexContent>
         <xs:extension base="PersonNameType">
            <xs:sequence>
               <xs:element name="Address" type="xs:string"/>
               <xs:element name="City" type="xs:string"/>
               <xs:element name="Country" type="xs:string"/>
            </xs:sequence>
         </xs:extension>
       </xs:complexContent>
    </xs:complexType>

        Type Extensions (use)
• A type that extends another is used as though
  everything were defined in a single type:

    <Employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:noNamespaceSchemaLocation="TypeExtension.xsd">
         <Address>Round Table</Address>
    </Employee>

  Simple Content in Complex Type
• If a type contains only simple content (text and
  attributes), a <simpleContent> element can be put
  inside the <complexType>.
• <simpleContent> must have either an <extension> or a
  <restriction>:

   <xs:element name="dialog">
     <xs:complexType>
       <xs:simpleContent>
         <xs:extension base="xs:string">
           <xs:attribute name="speaker" type="xs:string" use="required"/>
         </xs:extension>
       </xs:simpleContent>
     </xs:complexType>
   </xs:element>

                    Complex Types
• A complex type is an element that either contains other
  elements or has attached attributes.
• Let us examine an element with attached attributes and
  then a more complex element that contains child elements.
• Here is a definition for a book element that has two
  attributes called “title” and “pages”:
   <xsd:element name="book">
     <xsd:complexType>
       <xsd:attribute name="title" type="xsd:string" />
       <xsd:attribute name="pages" type="xsd:int" />
     </xsd:complexType>
   </xsd:element>
• An XML instance of the book element would look like this:
   <book title="More Java Pitfalls" pages="453" />

                      Complex Types
 • Now let us look at how we define a “product” element with both
   attributes and child elements.
 • The product element will have three attributes: id, title, and
   price.
 • It will also have two child elements: description and category.
 • The category child element is mandatory and repeatable,
   while the description child element will be optional:

<xsd:element name="product">
 <xsd:complexType>
  <xsd:sequence>
   <xsd:element name="description" type="xsd:string" minOccurs="0" maxOccurs="1" />
   <xsd:element name="category" type="xsd:string" minOccurs="1" maxOccurs="unbounded" />
  </xsd:sequence>
  <xsd:attribute name="id" type="xsd:ID" />
  <xsd:attribute name="title" type="xsd:string" />
  <xsd:attribute name="price" type="xsd:decimal" />
 </xsd:complexType>
</xsd:element>

               Complex Types
• Here is an XML instance of the product element
  defined previously:
    <product id="P01" title="Wonder Teddy" price="49.99">
      <description> The best selling teddy bear of the year. </description>
      <category> toys </category>
      <category> stuffed animals </category>
    </product>
• An alternate version of the product element could
  look like this:
    <product id="P02" title="RC Racer" price="89.99">
      <category> toys </category>
      <category> electronic </category>
      <category> radio-controlled </category>
    </product>
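
The occurrence limits on description and category can be checked by counting child elements. The sketch below assumes Python's standard xml.etree.ElementTree module and only counts occurrences; it ignores element order and data types, so it is far from a full validator. The LIMITS table, the occurrences_ok helper and the P03 product are hypothetical.

```python
import xml.etree.ElementTree as ET

# (minOccurs, maxOccurs) per child element; None stands for "unbounded".
LIMITS = {"description": (0, 1), "category": (1, None)}


def occurrences_ok(product: ET.Element) -> bool:
    """Check only how often each declared child element occurs."""
    for name, (min_occurs, max_occurs) in LIMITS.items():
        count = len(product.findall(name))
        if count < min_occurs:
            return False
        if max_occurs is not None and count > max_occurs:
            return False
    return True


rc_racer = ET.fromstring(
    '<product id="P02" title="RC Racer" price="89.99">'
    "<category> toys </category>"
    "<category> electronic </category>"
    "<category> radio-controlled </category>"
    "</product>"
)
# Invented product without any category, which the limits should reject.
no_category = ET.fromstring('<product id="P03" title="Hypothetical" price="1.00"/>')

print(occurrences_ok(rc_racer))
print(occurrences_ok(no_category))
```

The RC Racer passes without a description because minOccurs is 0, while a product without a category fails because minOccurs is 1.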

                Model Groups

• Model Groups are used to define an
  element that has:
  – “Mixed” content (elements and text mixed),
  – “Element” content.
• Model Groups can be:
  – All:
       • The elements specified must all be there, but in any order.
  – Choice:
       • Exactly one of the elements specified must be there.
  – Sequence:
       • All of the elements specified must appear in the specified order.
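
The difference between the three model groups can be sketched as three checks on a list of child element names. The sketch assumes the default occurrence settings, so each declared element appears exactly once and the group itself occurs once; the function names are hypothetical and the element names are borrowed from the BookCover example.

```python
def valid_sequence(children: list, declared: list) -> bool:
    """sequence: every declared element, in the declared order."""
    return children == declared


def valid_all(children: list, declared: list) -> bool:
    """all: every declared element exactly once, in any order."""
    return sorted(children) == sorted(declared)


def valid_choice(children: list, declared: list) -> bool:
    """choice: exactly one of the declared elements."""
    return len(children) == 1 and children[0] in declared


declared = ["BookTitle", "Author", "Publisher"]
shuffled = ["Author", "Publisher", "BookTitle"]

print(valid_sequence(shuffled, declared))  # order differs from the declaration
print(valid_all(shuffled, declared))       # order does not matter
print(valid_choice(["Author"], declared))  # a single alternative
```

Changing minOccurs or maxOccurs on the group or its members relaxes these checks, which the sketch deliberately leaves out.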

              "All" Model Group
• The following schema specifies three elements and
  mixed content:
    <xs:element name="BookCover">
       <xs:complexType mixed="true">
         <xs:all minOccurs="0" maxOccurs="1">
                   <xs:element name="BookTitle" type="xs:string"/>
                   <xs:element name="Author" type="xs:string"/>
                   <xs:element name="Publisher" type="xs:string"/>
         </xs:all>
       </xs:complexType>
    </xs:element>

• The following XML file is valid in the above schema:
    <BookCover xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       <BookTitle>The Holy Grail</BookTitle>
       <Author>Monty Python</Author>
• The attribute declaration is part of the type of the element:

<xs:element name="dialog">
 <xs:complexType>
   <xs:simpleContent>
     <xs:extension base="xs:string">
      <xs:attribute name="speaker"
                   type="xs:string" use="required"/>
     </xs:extension>
   </xs:simpleContent>
 </xs:complexType>
</xs:element>

• The attribute list is part of the type of the element.
• The default is given by the “use” attribute.
• If an attribute type is more complicated than a basic type, then we
  spell out the type in a type declaration.

<xsd:element name="cartoon">
 <xsd:complexType>
   <xsd:sequence>
    <xsd:element ref="character" minOccurs="0"/>
   </xsd:sequence>
   <xsd:attribute name="name" type="xsd:string" use="required"/>
   <xsd:attribute name="genre" type="xsd:string" use="required"/>
   <xsd:attribute name="syndicated" use="required">
    <xsd:simpleType>
      <xsd:restriction base="xsd:NMTOKEN">
       <xsd:enumeration value="yes"/>
       <xsd:enumeration value="no"/>
      </xsd:restriction>
    </xsd:simpleType>
   </xsd:attribute>
 </xsd:complexType>
</xsd:element>

  Optional and Required Attributes

• All attributes are optional by default.
• To explicitly specify that the attribute is optional, use
  the "use" attribute:

  <xs:attribute name="speaker" type="xs:string"
                use="optional"/>

• To make an attribute required:

  <xs:attribute name="speaker" type="xs:string"
                use="required"/>

Common XML Schema Primitive Data Types

Is Validation Worth the Trouble?
• Anyone who has worked with validation tools knows
  that developers are at the mercy of the maturity of
  the tools and specifications they implement.
• Validation, and the tool support for it, is evolving.
• Until the schema languages are fully mature, validation
  may be a frustrating process that requires testing
  with multiple tools.
• You should not rely on the results of just one tool
  because it may not have implemented the specification
  correctly or could be buggy.
• Fortunately, the tool support for schema validation
  has been steadily improving and is now capable of
  validating even complex schemas.

Is Validation Worth the Trouble?
• Even though it may involve significant testing and the
  use of multiple tools, validation is a critical component
  of your data management process.
• Validation is critical because XML, by its nature, is
  intended to be shared and processed by a large
  number and variety of applications.
• Second, a source document, if not used in its entirety,
  may be broken up into XML fragments and parts
• Therefore, the cost of errors in XML must be
  multiplied across all the programs and partners that
  rely on that data.
• As mining tools proliferate, the multiplication factor
  increases accordingly.

  Is Validation Worth the Trouble?
• The chief difficulties with validation stem from the
  additional complexity of new features introduced with
  XML Schema:
   – Data types, namespace support, and type inheritance,
   – A robust data-typing facility, similar to that found in
     programming languages, is not part of XML syntax and is
     therefore layered on top of it,
   – Strong data typing is key to ensuring consistent
     interpretation of XML data values across diverse
     programming languages and hardware.
• Namespace support provides the ability to create
  XML instances that combine elements and attributes
  from different markup languages.
• This allows you to reuse elements from other markup
  languages instead of reinventing the wheel for
  identical concepts.
Is Validation Worth the Trouble?
• Thus, namespace support eases software
  interoperability by reducing the number of unique
  vocabularies applications must be aware of.
• As stated previously, namespace support is a key
  benefit of XML Schema.

                      Schema composition
        • Can apply different validation rules to different
          elements in the document.
        • A little complicated… here is an example.
[Figure: an instance document annotated with callouts for the ord namespace, the location of the schema, and references to elements from multiple schemas.]

                   Schema composition
[Figure: a schema annotated with callouts showing where ord2.xsd and cus.xsd are included and where the root element is declared.]

               Validating a Schema
• By using Xeena or XMLspy or XML Notepad:
   – When publishing hand-written XML docs, this is the
     way to go.
• By using a Java program that performs
  validation:
   – When validating on-the-fly, you must do it this way.
• By using Sun's Multi-Schema XML Validator:
   – Java source code,
   – Correctly validates against multiple schemas (as we
     have seen above).

                What is a URI?
• A Uniform Resource Identifier (URI) is a standard
  syntax for strings that identify a resource:
   – Informally, a URI is a generic term for addresses and names
     of objects (or resources) on the Web,
   – A resource is any physical or abstract thing that has an
     identity.
• There are two types of URIs: Uniform Resource
  Locators (URLs) and Uniform Resource Names
  (URNs):
   – A URL identifies a resource by how it is accessed; e.g.,
     “http://www.example.com/stuff/index.html” identifies an
     HTML page on a server with a DNS name of
     www.example.com and accessed via HTTP,
   – A URN creates a unique and persistent name for a resource
     either in the “urn” namespace or another registered
     namespace, which dictates the syntax for the URN.
              XML Schema Status
• Became a W3C recommendation in Spring 2001:
     – World domination expected imminently.
     – Supported in Xalan.
     – Supported in XMLspy and other tools.
• On the other hand:
     – More complex than DTDs.
     – Ultra verbose.
