					XML eXtensible Markup Language

        by Darrell Payne
Experience
   Logicon / Sterling Federal
       C, C++, JavaScript/Jscript, Shell Script, Perl
   XML Training
       XML Training Course
       2001 DevXCon Training Conference
       Currently developing XML course for Logicon
   Darrell.Payne@Sterling-fsg.com
XML eXtensible Markup Language
   Standard Generalized Markup Language (SGML)
       Meta-tag language
            Used for creating other markup languages
       Adopted as an ISO standard (ISO 8879) in 1986
   HyperText Markup Language (HTML)
       Application of SGML
       Formatting Language
   eXtensible Markup Language(XML)
       Meta-tag language
       XML = DATA
       World Wide Web Consortium W3C
            ((!Standard) && (Specification)) // "C" code humor: a specification, not a standard
            XML Version 1.0 February 1998
            XML is not designed to replace HTML
SGML – HTML – XML diagram
   SGML (meta-language)
       HTML is an application of SGML
   XML (meta-language)
       WML is an application of XML
XML Family of Tools and Their Relationship
   Linking and Pointing
       XLink, XPointer, XPath
   Style and Transformation
       XSLT, XSL
   Underlying and Object Model
       XML Info set, Namespace
   Complex Data Modeling
       XML Schema
   Programmatic Interface
       DOM, SAX
   Source: XML Developer's Guide - McGraw Hill - Page 12
HTML vs. XML
   HTML
       Predefined tags
       Syntax is loose
       File extensions usually “.html” or “.htm”
       Not required to be Well-Formed
             Some closing tags optional
             Attribute value quotes may be omitted
   XML
       User defined tags
       Syntax is exact
       File extensions usually “.xml”
       Closing tags mandatory
       Required to be Well-Formed
Well-Formed
   All XML documents must be syntactically correct!
       Single root element
       All element start tags have end tags
       XML is case sensitive
       Properly nested tags
            <first><second></first></second> //error
            <first><second></second></first> //correct
       Attributes values in quotes
            "value" or 'value'
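
The well-formedness rules above can be tried out with any non-validating parser. The following is a minimal sketch using Python's standard library; Python and the helper name is_well_formed are assumptions made for illustration and are not part of the original slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # The parser rejects anything that breaks a well-formedness rule.
        return False
    return True


# Improperly nested tags from the slide: rejected.
bad_nesting = is_well_formed("<first><second></first></second>")
# Properly nested tags from the slide: accepted.
good_nesting = is_well_formed("<first><second></second></first>")
# XML is case sensitive, so <a> is not closed by </A>.
wrong_case = is_well_formed("<a></A>")
# Two top-level elements violate the single-root rule.
two_roots = is_well_formed("<a/><b/>")
# Attribute values must be quoted.
unquoted_attribute = is_well_formed("<a x=1/>")
```

Only the properly nested document is accepted; each of the other inputs breaks one rule from the list.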
Basic XML Parts
   Markup
       Tags
       Attributes, names and values
   Character Data
       Text
            PCDATA
            CDATA
       Binary
   XML document has two main sections, plus optional trailing Misc
       Prolog
       Root
       Misc
            Optional and considered superfluous (comments, processing instructions, whitespace after the root)
Simple XML File

   <?xml version="1.0"?>
   <!-- My first XML file -->
   <document>
        <message>Hello World!</message>
   </document>
   <!--
     More Comments
   -->
Declaration
   <?xml version="1.0"?>
        Declaration is optional
        If used, it must be the first thing in the document and "version" is required
        Specifies the version to which the document conforms
        XML documents without an XML declaration are treated as version 1.0
   Other declaration examples
   <?xml version="1.0" encoding="UTF-8"?> encoding is optional
        UTF-8 is the default – ASCII text takes 8 bits per character
        “UTF-16” uses 16-bit units – often used for non-Latin text
             UTF-8 and UTF-16 can both represent every Unicode character
        To stay portable use UTF-8 or UTF-16; all XML parsers must support both
   <?xml version="1.0" standalone="yes"?> standalone is optional
        "yes" means the document does not depend on external markup declarations
        If omitted and external declarations exist, "no" is assumed
   <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
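
To make the difference between the two encodings concrete, here is a small sketch in Python (an assumption; the slides themselves contain no Python) that counts the bytes needed to store the same text in each encoding. The function name encoded_sizes is hypothetical.

```python
def encoded_sizes(text):
    """Return the number of bytes text occupies in UTF-8 and in UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" is used so the count excludes the byte order mark.
        "utf-16": len(text.encode("utf-16-le")),
    }


ascii_sizes = encoded_sizes("Hello")      # five ASCII characters
japanese_sizes = encoded_sizes("日本語")  # three non-Latin characters
```

ASCII text is smaller in UTF-8, while some non-Latin scripts are smaller in UTF-16, which is one reason an author might pick one encoding over the other.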
Comments

   <!-- My first XML file -->
   <!--
     More Comments
   -->
       XML uses same comment syntax as HTML
Root Element

   <document>


   </document>
       Lines preceding the root element are contained in the Prolog
       All XML documents must contain exactly one root element
       All other elements are child elements (descendants) of the root
Child Element

   <document>
      <message>Hello World!</message>
   </document>
Sibling Element

   <document>
      <message>Hello World!</message>
      <message>Goodbye World!</message>
      <message2>Nothing more to add!</message2>
   </document>
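
The root, child and sibling relationships can also be inspected from a program. This is a minimal sketch using Python's xml.etree.ElementTree; the choice of Python and the variable names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<document>
   <message>Hello World!</message>
   <message>Goodbye World!</message>
   <message2>Nothing more to add!</message2>
</document>"""

root = ET.fromstring(DOCUMENT)

# The root element is the single outermost element.
root_tag = root.tag

# Iterating over an element yields its direct children in document order;
# these children are siblings of one another.
child_tags = [child.tag for child in root]
child_texts = [child.text for child in root]
```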
Updating Microsoft's Internet Explorer
   instmsia.exe
       Updates Microsoft's Installer


   msxml3sp1.exe
       Updates Microsoft's Internet Explorer with the MSXML 3.0 SP1 parser


   IE now has built-in XML parser “msxml”
Create XML Document
   Include declaration
       <?xml version="1.0"?>
   Create root element <cis_class>
   Create child element <cis_345>
   Enter child element text “student name”
   Save file with “.xml” extension
   Open using Internet Explorer
   After success, add sibling elements and
    retest using Internet Explorer
Document viewed in Microsoft's Internet Explorer
More about Elements
   Element types
       Container Element
            Contains other elements
       Data Element
            Contains DATA
       Mixed Content
            Contains other elements and DATA
       Empty Element
            Contains no elements or DATA
Container Element
   Contains other elements
   <outer_element>
       <inner_element>
            <yet_another_element>
                 <can_we_go_any_deeper>
                     Some text way down here in the center of it all

                 </can_we_go_any_deeper>
            </yet_another_element>
       </inner_element>
   </outer_element>
Data Element
   Contains DATA
       Parsable Character Data
            PCDATA
       Character Data
            CDATA
PCDATA
   Contains text
   Is parsed by the parser
   Reserved characters are written as entity references
       <  and  &  must always be escaped
       >  "  '  are usually escaped as well
Entity References
   XML provides five predefined entity references
       &lt;     <
       &gt;     >
       &quot;   "
       &apos;   '
       &amp;    &
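
Replacing reserved characters by hand is error prone, so most languages ship a helper for it. The sketch below uses escape and unescape from Python's xml.sax.saxutils; Python is an assumption here, and the wrapper names to_pcdata and from_pcdata are hypothetical.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; quotes are passed in explicitly.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}
QUOTE_CHARACTERS = {"&quot;": '"', "&apos;": "'"}


def to_pcdata(text):
    """Replace the five reserved characters with their entity references."""
    return escape(text, QUOTE_ENTITIES)


def from_pcdata(text):
    """Turn the five predefined entity references back into characters."""
    return unescape(text, QUOTE_CHARACTERS)


escaped = to_pcdata("cerr << this->displayError();")
restored = from_pcdata(escaped)
```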
CDATA
   Contains text
   Is a CDATA section, written with declaration-style <![ ... ]]> syntax
   Can contain reserved characters
       <, >, ", ', &
   Starts with / ends with
       <![CDATA[
            Data would be here
       ]]>
   CDATA cannot contain
       ]]>
Declarations
   <!--          -->
   <!DOCTYPE     >
   <![CDATA[     ]]>
   <!ELEMENT     >
   <![IGNORE[    ]]>
   <![INCLUDE[   ]]>
   <!NOTATION    >
   <!ENTITY      >
   <!ATTLIST     >
Why CDATA section
   “C++” code example

   CDATA example
       if (this->getX() < 5 && array1[0] != 3)
           cerr << this->displayError();


   PCDATA example
       if (this-&gt;getX() &lt; 5 &amp;&amp; array1[0] != 3)
           cerr &lt;&lt; this-&gt;displayError();
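
Both forms carry the same data; only the way it is written differs. The following sketch, in Python with the standard xml.etree.ElementTree module (an assumption for illustration, as is the <code> element name), parses one line of the C++ example in each form and compares the text the parser hands back.

```python
import xml.etree.ElementTree as ET

# The same C++ line wrapped in a CDATA section ...
CDATA_DOC = "<code><![CDATA[if (this->getX() < 5 && array1[0] != 3)]]></code>"
# ... and written as PCDATA with entity references.
PCDATA_DOC = "<code>if (this-&gt;getX() &lt; 5 &amp;&amp; array1[0] != 3)</code>"

cdata_text = ET.fromstring(CDATA_DOC).text
pcdata_text = ET.fromstring(PCDATA_DOC).text

# The application sees identical character data either way.
same_text = cdata_text == pcdata_text
```

The CDATA form is simply easier to read and write when the text is full of reserved characters.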
Mixed Content
   Elements and PCDATA combined
   <outer_element>
       outer element stuff
            <inner_element>
                 inner element stuff
            </inner_element>
       more outer element stuff
   </outer_element>
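
How a parser exposes mixed content depends on the API. As one illustration, Python's xml.etree.ElementTree (an assumption; the slides do not use Python) stores the text before the first child in the parent's text attribute and the text after a child in that child's tail attribute. The function name mixed_parts is hypothetical.

```python
import xml.etree.ElementTree as ET

MIXED = """<outer_element>
    outer element stuff
    <inner_element>
        inner element stuff
    </inner_element>
    more outer element stuff
</outer_element>"""


def mixed_parts(xml_text):
    """Return the three text runs of the mixed-content example, trimmed."""
    root = ET.fromstring(xml_text)
    inner = root.find("inner_element")
    return (
        root.text.strip(),   # text before the child element
        inner.text.strip(),  # text inside the child element
        inner.tail.strip(),  # text after the child's end tag
    )


parts = mixed_parts(MIXED)
```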
Empty Element
   Contains no text or data
   May have attributes
   <empty_element></empty_element>
   <empty_element/>
       Shortcut notation for empty element
   Does this look familiar?
       HTML example of this type of tag
       <img src="image.gif"> //Not Well-Formed
       <img src="image.gif" /> //Well-Formed
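
A parser treats the long and short notations for an empty element as the same thing. The sketch below checks this with Python's xml.etree.ElementTree; Python, the id attribute and the helper names describe and parses are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET


def describe(xml_text):
    """Return (tag, text, number of children, attributes) for a document."""
    element = ET.fromstring(xml_text)
    return (element.tag, element.text, len(element), dict(element.attrib))


def parses(xml_text):
    """Return True if xml_text is accepted by the parser."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


long_form = describe('<empty_element id="1"></empty_element>')
short_form = describe('<empty_element id="1"/>')

html_style_img = parses('<img src="image.gif">')    # never closed
xml_style_img = parses('<img src="image.gif" />')   # self-closing
```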
Elements
   An element consists of
       Start-tag
       Content
       End-tag
Create XML Document 2
   Include declaration
   Create root element <cis_class2>
   Create child element <cis_345>
       Enter child element text “student name”
   Create child element <cis_346>
       Child to root, sibling to <cis_345>
       Make this an empty element
   Create child element <cis_347>
       Child to root, sibling to <cis_345>
       Enter C++ code example in PCDATA section
   Create child element <cis_348>
       Enter same C++ code in a CDATA section
   Save file
   Open using Internet Explorer
XML Parser – DOM & SAX
   Required to process an XML document
   Available for C, Java, Python, Perl
   Two types of parser interface
       Document Object Model (DOM)
            Tree structure
            Like a drive directory structure
            Whole document is held in memory, so slower and memory hungry for large files
       Simple API for XML (SAX)
            Event driven
            Events = start tags, end tags, text, etc.
            Smaller, faster, but the programmer must keep track of the data
   Parsers are either validating or non-validating
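
The two styles can be compared side by side on the Hello World document. This is a minimal sketch using the DOM and SAX implementations in Python's standard library (xml.dom.minidom and xml.sax); the language choice and the names messages_via_dom, messages_via_sax and MessageHandler are assumptions for illustration. Both parsers used here are non-validating.

```python
import xml.sax
from xml.dom import minidom

DOC = (
    "<document>"
    "<message>Hello World!</message>"
    "<message>Goodbye World!</message>"
    "</document>"
)


def messages_via_dom(xml_text):
    """DOM: build the whole tree in memory, then walk it."""
    dom = minidom.parseString(xml_text)
    return [node.firstChild.data for node in dom.getElementsByTagName("message")]


class MessageHandler(xml.sax.ContentHandler):
    """SAX: react to events and keep only the data we care about."""

    def __init__(self):
        super().__init__()
        self.messages = []
        self._buffer = None  # collects text while inside <message>

    def startElement(self, name, attrs):
        if name == "message":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "message":
            self.messages.append("".join(self._buffer))
            self._buffer = None


def messages_via_sax(xml_text):
    handler = MessageHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.messages


dom_messages = messages_via_dom(DOC)
sax_messages = messages_via_sax(DOC)
```

The DOM version is shorter because the tree does the bookkeeping; the SAX version never holds more than one message in memory at a time.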
XML Structure
   Logical structure
       Document divided into units
       Allows sub units
       XML is a logical tree structure document
   Physical structure
       Data stored inside document
       Data stored outside document
            Entities are one example
Valid
   Conforms to some schema
       schema with a lowercase “s” means any document grammar
           Document Type Definition (DTD)
           Schema
   By definition, all valid XML documents
    are Well-Formed documents
DTD Document Type Definition
   Document Type Definition (DTD)
   File extension of “.dtd”
   DTD is not an XML document
   DTD is a schema (lowercase “s”)
   Introduced into an XML document via the Document Type
    Declaration
       <!DOCTYPE            >
   Three types of DOCTYPE declarations
       Internal Subset
            Contained in the Prolog
       External Subset
            Exists in a different file
            Prolog contains reference to file containing DTD
            Referenced using keyword
                  SYSTEM or PUBLIC
       Internal Subset and External Subset combination
Internal Subset
   <?xml version="1.0"?>
   <!-- My second XML file -->
   <!DOCTYPE document [
   <!ELEMENT document (message)>
   <!ELEMENT message (#PCDATA)>
   ]>
   <document>
        <message>Hello World!</message>
   </document>
   <!--
     More Comments
   -->
External Subset
   <?xml version="1.0"?>
   <!-- My second XML file -->
   <!DOCTYPE document SYSTEM "HelloWorld.dtd">
   <document>
        <message>Hello World!</message>
   </document>
   <!--
     More Comments
   -->
DTD for HelloWorld.xml
   <!ELEMENT document (message)>
   <!ELEMENT message (#PCDATA)>
Internal Subset and External Subset combination I
   <?xml version="1.0"?>
   <!-- My second XML file -->
   <!DOCTYPE document SYSTEM "HelloWorld3.dtd"[
   <!ELEMENT document (message)>
   ]>
   <document>
         <message >
             <message2>
             </message2>
         </message>
   </document>
   <!--
     More Comments
   -->
Internal Subset and External Subset combination II
   <!-- External declarations -->
   <!ELEMENT message (message2)>
   <!ELEMENT message2 (#PCDATA)>
Putting it all together
   HelloWorld3.dtd
        <!-- External declarations -->
        <!ELEMENT message (message2)>
        <!ELEMENT message2 (#PCDATA)>
   HelloWorld3.xml
        <?xml version="1.0"?>
        <!-- My second XML file -->
        <!DOCTYPE document SYSTEM "HelloWorld3.dtd"[
        <!ELEMENT document (message)>
        ]>
        <document>
          <message >
              <message2>
              </message2>
          </message>
        </document>
        <!--
         More Comments
        -->
XML Validator
 Type in HelloWorld3.xml
Create XML Document 3
   Create root element <session1>
   Create child element of session1 <session2>
        Enter child element text “xml class”
   Create child element of session2 <session3>
        Enter child element text “class information”
   Create child element of session3 <session4>
        Enter child element text “more”
   Create DTD for this file, dtd_info.dtd
        Reference file in XML document
   Save files
   Open validate_vbs.html
   Enter .xml file name
   Validate
Schema
   Schemas are XML documents
       Schemas can be manipulated via a parser
   More complicated than DTDs
   Microsoft XDR schemas have “ElementType”s; W3C XML Schema uses “xsd:element”
Schema vs. DTD
   <!-- External declarations -->
   <!ELEMENT document (message, message2, message3)>
   <!ELEMENT message (message4, message5 )>
   <!ELEMENT message2 (#PCDATA)>
   <!ELEMENT message3 (#PCDATA)>
   <!ELEMENT message4 (#PCDATA)>
   <!ELEMENT message5 (#PCDATA)>
Schema vs. DTD II
   <?xml version="1.0" encoding="UTF-8"?>
   <!--W3C Schema generated by XML Spy v3.5
    (http://www.xmlspy.com)-->
   <xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
    elementFormDefault="qualified">
         <xsd:element name="document">
                  <xsd:complexType>
                          <xsd:sequence>
                                  <xsd:element ref="message"/>
                                  <xsd:element ref="message2"/>
                                  <xsd:element ref="message3"/>
                          </xsd:sequence>
                  </xsd:complexType>
         </xsd:element>
Schema vs. DTD III
        <xsd:element name="message">
                 <xsd:complexType>
                         <xsd:sequence>
                                 <xsd:element ref="message4"/>
                                 <xsd:element ref="message5"/>
                         </xsd:sequence>
                 </xsd:complexType>
        </xsd:element>
        <xsd:element name="message2" type="xsd:string"/>
        <xsd:element name="message3" type="xsd:string"/>
        <xsd:element name="message4" type="xsd:string"/>
        <xsd:element name="message5" type="xsd:string"/>
   </xsd:schema>
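
Because a schema is itself an XML document, an ordinary parser can read it. The sketch below loads a shortened copy of the schema with Python's xml.etree.ElementTree and lists the top-level element declarations; Python, the shortened schema text and the function name declared_elements are assumptions made for illustration. The namespace URI is the one shown in the slides.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2000/10/XMLSchema"

# Shortened version of the schema from the slides (message2 and message3 only).
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema">
  <xsd:element name="document">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="message2"/>
        <xsd:element ref="message3"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="message2" type="xsd:string"/>
  <xsd:element name="message3" type="xsd:string"/>
</xsd:schema>"""


def declared_elements(schema_text):
    """Return the names of the top-level xsd:element declarations."""
    root = ET.fromstring(schema_text)
    # ElementTree writes qualified names as {namespace}localname.
    return [el.get("name") for el in root.findall(f"{{{XSD_NS}}}element")]


names = declared_elements(SCHEMA)
```

The same approach does not work for a DTD, since a DTD is not an XML document.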
Topics not covered
   Namespace      Whitespace
   XPath          XPointer
   XLink
   XSL            XSLT
   SOAP           DDI
   Web Services
   SMIL
   XHTML

				