Extensible Markup Language (XML)
AGENDA

 OVERVIEW OF XML

 DOCUMENT TYPE DEFINITION LANGUAGE

 XML SCHEMA

 XML PARSERS
  1) DOM PARSER
  2) SAX PARSER
  3) JAXB PARSER

 EXTENSIBLE STYLESHEET TRANSFORMATIONS
OVERVIEW OF XML

What is a markup language?

Markup languages are designed for the processing, definition and
presentation of text. The language specifies codes for formatting, both
the layout and style, within a text file.
Well-known markup languages include HTML and XML.

XML is a framework for defining markup languages

 Each language is targeted at its own application domain with its markup
tags.

 There is a common set of generic tools for processing XML documents
How is XML different from HTML?
 Markup languages generally combine two distinct functions of representing text
(a document): the ‘look’ and the ‘structure’.

 HTML and XML have different sets of goals.

 While HTML was designed to display data and hence focused on the ‘look’ of the data,
XML was designed to describe and carry data and hence focuses on ‘what data is’.

 HTML is about displaying data and XML is about describing data.

 HTML and XML are complementary to each other.
XML FEATURES
 XML can be used to create new languages. Ex: WML, X3D (the XML-based successor to VRML)

 XML uses the concept of DTD (Document Type Definition) to describe data

 XML with DTD is self-descriptive

 XML separates data from display formats

 XML can be used as a format to exchange data

 Data can be stored in either files or databases

                              JAVA=Portable Programs

                              XML=Portable Data
XML Syntax
XML Syntax consists of

XML Declaration

 XML Elements

XML Attributes

XML Declaration
The first line of an XML document should always consist of an XML declaration defining the
version of XML

XML Element
XML is a markup language that is used to store data in a self-explanatory manner. Making
the data "self-explanatory" comes about by containing information in elements. If a piece of
text is a title then it will be contained within a "title" element.

XML Attributes
Attributes are used to specify additional information about the element. An attribute for an
element appears within the opening tag. The syntax for including an attribute in an element
is:
<element attributeName="value">
SAMPLE APPLICATION


<?xml version="1.0" encoding="ISO-8859-1"?>
<book>
  <title>XML for dummies</title>
  <chapter>Introduction to XML
    <para>Markup languages</para>
    <para>Features of XML</para>
  </chapter>
  <chapter>XML syntax
    <para>Elements must be enclosed in tags</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>
DOCUMENT TYPE DEFINITION LANGUAGE

A Document Type Definition (DTD) defines the legal building blocks of an XML
document. It defines the document structure with a list of legal elements and attributes.


A DTD is associated with an XML document via a Document Type Declaration,
which is a tag that appears near the start of the XML document. The declaration
establishes that the document is an instance of the type defined by the referenced DTD.


The declarations in a DTD are divided into an internal subset and an external subset.
Internal DTD Declaration
If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE
definition with the following syntax:

<!DOCTYPE root-element [element-declarations]>

Example XML document with an internal DTD:
<?xml version="1.0"?>
<!DOCTYPE book [
<!ELEMENT book (bookID,title)>
<!ELEMENT bookID (#PCDATA)>
<!ELEMENT title (#PCDATA)>
]>
<book>
  <bookID>1243</bookID>
  <title>john</title>
</book>
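
The effect of an internal DTD can be checked with a validating parser. The following is a minimal sketch, assuming the standard JAXP API (javax.xml.parsers) that ships with the JDK; the class name DtdValidationSketch and the helper isValid are hypothetical names chosen for illustration. It parses two documents against the DTD above and reports whether each one conforms.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdValidationSketch {

    // Internal DTD subset copied from the example above.
    static final String PROLOG =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE book [\n"
        + "<!ELEMENT book (bookID,title)>\n"
        + "<!ELEMENT bookID (#PCDATA)>\n"
        + "<!ELEMENT title (#PCDATA)>\n"
        + "]>\n";

    // Returns true if the given root element is valid against the internal DTD.
    // Any parse or validation problem is treated as "not valid" in this sketch.
    static boolean isValid(String body) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // switch on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Validation errors are recoverable by default; rethrow them so they stop the parse.
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXParseException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(PROLOG + body)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Matches (bookID,title): prints true
        System.out.println(isValid("<book><bookID>1243</bookID><title>john</title></book>"));
        // bookID is missing: prints false
        System.out.println(isValid("<book><title>john</title></book>"));
    }
}
```

A non-validating parser would accept both documents, since both are well formed; only the validating parser enforces the content model declared in the DTD.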
External DTD Declaration
If the DTD is declared in an external file, the DOCTYPE definition
references that file with the following syntax:

<!DOCTYPE root-element SYSTEM "filename">

Example XML document with an external DTD:

<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "book.dtd">
<book>
<bookID>Tove</bookID>
<title>Jani</title>
</book>

And the file "book.dtd" which contains the DTD:
<!ELEMENT book (bookID,title)>
<!ELEMENT bookID (#PCDATA)>
<!ELEMENT title (#PCDATA)>
XML SCHEMA

 An XML schema is a description of a type of XML document

An XML schema describes the structure of an XML document.

The XML Schema language is also referred to as XML Schema Definition (XSD).

An XML Schema:

 defines elements that can appear in a document

 defines attributes that can appear in a document

 defines the order of child elements

 defines the number of child elements

 defines data types for elements and attributes
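
The points above can be sketched in code. This is a minimal example, assuming the javax.xml.validation API from the JDK; the schema for the book element is a hypothetical one written for this illustration (it is not taken from the slides), and the names SchemaValidationSketch and isValid are likewise invented. The schema fixes the order and number of the child elements and gives bookID an integer data type, which a DTD cannot express.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SchemaValidationSketch {

    // Hypothetical schema: a book holds exactly one bookID (integer) followed by one title (string).
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"book\">"
        + "<xs:complexType><xs:sequence>"
        + "<xs:element name=\"bookID\" type=\"xs:integer\"/>"
        + "<xs:element name=\"title\" type=\"xs:string\"/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    // Returns true if the document conforms to the schema above.
    static boolean isValid(String xml) {
        Validator validator;
        try {
            validator = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)))
                .newValidator();
        } catch (SAXException e) {
            // A broken schema is a programming error, not an invalid document.
            throw new IllegalStateException("schema could not be compiled", e);
        }
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Correct order and types: prints true
        System.out.println(isValid("<book><bookID>1243</bookID><title>john</title></book>"));
        // bookID is not an integer: prints false
        System.out.println(isValid("<book><bookID>abc</bookID><title>john</title></book>"));
    }
}
```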
SCHEMA LOCATION
In an instance document, the attribute xsi:schemaLocation pairs a namespace with the location of the schema document that describes it:

<purchaseReport
xmlns="http://www.example.com/Report"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.example.com/Report
http://www.example.com/Report.xsd"
period="P3M" periodEnding="1999-12-31">
<!-- etc -->
</purchaseReport>
XML PARSERS

Parsing is breaking something (such as a sentence) down into its component parts, with an
explanation of the form, function, and syntactical relationship of each part.

Compilers parse text to identify the program elements and check that it conforms to the
correct syntax.

An XML parser is the piece of software that reads an XML document, identifies all the XML
tags, and makes the information available to applications and programming languages.

All modern browsers have a built-in XML parser that can be used to read and
manipulate XML.

The parser reads XML into memory and converts it into an XML DOM object that can
be accessed with JavaScript.
XML PARSERS

 DOM PARSERS
  1) DOM Characteristics
  2) DOM in Action
  3) DOM Tree and Nodes
  4) DOM Programming Procedures

 SAX PARSERS
  1) SAX Features
  2) SAX Operational Model
  3) SAX Programming Procedures
  4) Benefits Of SAX

 JAXB PARSERS
  1) JAXB Design Goals
  2) JAXB Binding Lifecycle
  3) JAXB Runtime Operations
  4) JAXB Programming
DOM PARSER
DOM is cross-platform and cross-language

Uses OMG’s IDL to define interfaces
IDL-to-language bindings

DOM CHARACTERISTICS

 Access XML document as a tree structure

Composed of mostly element nodes and text nodes

Can “walk” the tree back and forth

Larger memory requirements

 Fairly heavyweight to load and store

Use it when you need to walk and modify the tree.
DOM IN ACTION
DOM TREE AND NODES

XML document is represented as a tree

A tree is made of nodes

There are 12 different node types

Node Types

Document node
 Document Fragment node
 Element node
 Attribute node
 Text node
 Comment node
 Processing instruction node
 Document type node
 Entity node
 Entity reference node
 CDATA section node
 Notation node
Example XML Document

<?xml version="1.0"?>
<people>
<name>
<first_name>Alan</first_name>
<last_name>Turing</last_name>
</name>
</people>

DOM Tree Example

XML Document node
element node “people”
element node “name”
element node “first_name”
text node “Alan”
element node “last_name”
text node “Turing”
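
The tree above can be reproduced with a short program. This is a minimal sketch, assuming the JAXP DocumentBuilder from the JDK; the class name DomTreeSketch and its methods are hypothetical. The input is written on one line on purpose: with the line breaks of the example document, the parser would also report whitespace-only text nodes between the elements.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomTreeSketch {

    // The example document, without whitespace between the tags.
    static final String XML =
        "<?xml version=\"1.0\"?>"
        + "<people><name>"
        + "<first_name>Alan</first_name>"
        + "<last_name>Turing</last_name>"
        + "</name></people>";

    // Appends one line per node, indented by depth, then recurses into the children.
    static void describe(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                out.append("document node");
                break;
            case Node.ELEMENT_NODE:
                out.append("element node \"").append(node.getNodeName()).append('"');
                break;
            case Node.TEXT_NODE:
                out.append("text node \"").append(node.getNodeValue()).append('"');
                break;
            default:
                out.append("other node, type ").append(node.getNodeType());
        }
        out.append('\n');
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            describe(children.item(i), depth + 1, out);
        }
    }

    // Parses the example document and returns the indented tree listing.
    static String dump() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            StringBuilder out = new StringBuilder();
            describe(doc, 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```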
Interfaces for DOM

 NodeList
 NamedNodeMap
 DOMImplementation

Node Interface

Primary data type in DOM
Represents a single node in a DOM tree
Every node implements the Node interface

Methods in Node Interface
Useful Node interface methods
 public short getNodeType()
 public String getNodeName()
 public String getNodeValue()
 public NamedNodeMap getAttributes()
 public NodeList getChildNodes()
NodeList Interface

 Represents a collection of nodes

 Return type of getChildNodes() method of Node interface

public interface NodeList {
public Node item(int index);
public int getLength();
}

NamedNodeMap Interface

 Represents a collection of nodes, each of which can be identified by name

 Return type of getAttributes() method of Node interface

Document Interface

 Contains factory methods for creating other nodes(elements, text nodes)

 Method to get root element node
DocumentType Interface

public interface DocumentType extends Node
{

public String getName();
public NamedNodeMap getEntities();
public NamedNodeMap getNotations();
public String getPublicId();
public String getSystemId();
public String getInternalSubset ();

}

Code Example

case Node.PROCESSING_INSTRUCTION_NODE:
    // Print a processing instruction in the form <?target data?>
    System.out.println("<?" + node.getNodeName() +
        " " + node.getNodeValue() + "?>");
    break;
DOM Programming Procedures

 Create a parser object

 Set features and read properties

 Parse the XML document and get the Document object

 Perform operations
  1) Traversing the DOM
  2) Manipulating the DOM
  3) Creating a new DOM
  4) Writing out the DOM
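
The first steps of this procedure can be sketched as follows. The example is an illustration only and assumes the vendor-neutral JAXP API (DocumentBuilderFactory) rather than a specific parser implementation; the class DomProcedureSketch and its helper methods are hypothetical names. It creates and configures a parser, parses a document, traverses it, and then modifies the tree.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomProcedureSketch {

    static final String XML =
        "<people><name>"
        + "<first_name>Alan</first_name>"
        + "<last_name>Turing</last_name>"
        + "</name></people>";

    // Create a parser object, set a feature, parse, and return the Document object.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // one example of setting a feature
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Traversing: text content of the first element with the given tag name.
    static String firstText(Document doc, String tag) {
        return doc.getElementsByTagName(tag).item(0).getTextContent();
    }

    // Manipulating: insert a <middle_name> element before <last_name> in the first <name>.
    static void addMiddleName(Document doc, String value) {
        Element name = (Element) doc.getElementsByTagName("name").item(0);
        Node lastName = name.getElementsByTagName("last_name").item(0);
        Element middle = doc.createElement("middle_name");
        middle.appendChild(doc.createTextNode(value));
        name.insertBefore(middle, lastName);
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println(firstText(doc, "first_name")); // prints Alan
        addMiddleName(doc, "Mathison");
        System.out.println(firstText(doc, "name"));       // prints AlanMathisonTuring
    }
}
```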
CREATING A DOM OBJECT
import org.apache.xerces.parsers.DOMParser; // Xerces parser class used below
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import java.io.IOException;

String xmlFile = "file:///xerces-1_3_0/data/personal.xml";
DOMParser parser = new DOMParser();
try {
    parser.parse(xmlFile); // read the file and build the DOM tree
} catch (SAXException se) {
    se.printStackTrace();
} catch (IOException ioe) {
    ioe.printStackTrace();
}
Document document = parser.getDocument(); // the parsed tree
Generating A New DOM
try
{
    // Generate a new DOM tree (DocumentImpl is the Xerces class org.apache.xerces.dom.DocumentImpl)
    Document doc = new DocumentImpl();

    Element root = doc.createElement("person"); // Create root element
    doc.appendChild(root); // attach root element to the document

    Element item = doc.createElement("name"); // Create element
    item.appendChild(doc.createTextNode("Jeff"));
    root.appendChild(item); // attach element to root element

    item = doc.createElement("height");
    item.appendChild(doc.createTextNode("1.80"));
    root.appendChild(item); // attach the second element as well
}
catch (Exception ex)
{
    ex.printStackTrace();
}
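
The same tree can also be built without Xerces-specific classes and then written out. The sketch below is one possible approach, assuming only the JAXP and javax.xml.transform APIs of the JDK; NewDomSketch, build and serialize are hypothetical names. An identity Transformer (a transformer created without a stylesheet) is used here as the serializer.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NewDomSketch {

    // Creating a new DOM: person -> name, height.
    static Document build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
            Element root = doc.createElement("person");
            doc.appendChild(root);

            Element name = doc.createElement("name");
            name.appendChild(doc.createTextNode("Jeff"));
            root.appendChild(name);

            Element height = doc.createElement("height");
            height.appendChild(doc.createTextNode("1.80"));
            root.appendChild(height);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Writing out the DOM: an identity transform copies the tree to a character stream.
    static String serialize(Document doc) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(build()));
    }
}
```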
SAX PARSER
 Simple API for XML
 Started as community-driven project


SAX Features
 Event-driven: You provide event handlers

Fast and lightweight: Document does not have to be entirely in memory

Sequential read access only

One-time access

Does not support modification of document
SAX Operational Model




XML DOCUMENT --(input)--> PARSER --(events)--> PROVIDED HANDLER
SAX Programming Procedures
SAX Event Handlers
SAX Parser Example

import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

XMLReader parser = null;

try {
    // Create XML (non-validating) parser
    parser = XMLReaderFactory.createXMLReader();

    // Create event handler
    myContentHandler handler = new myContentHandler();
    parser.setContentHandler(handler);

    // Call parsing method
    parser.parse(args[0]);
}
catch (SAXException ex) {
    System.err.println(ex.getMessage());
}
catch (Exception ex) {
    System.err.println(ex.getMessage());
}
SAX Event Handler
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// DefaultHandler implements ContentHandler with empty methods,
// so only the callbacks of interest need to be overridden.
class myContentHandler extends DefaultHandler
{
    // ContentHandler methods
    public void startDocument() {
        System.out.println("XML Document START");
    }
    public void endDocument() {
        System.out.println("XML Document END");
    }
    public void startElement(String namespace, String name, String qName,
                             Attributes atts) {
        System.out.println("<" + qName + ">");
    }
    public void endElement(String namespace, String name, String qName) {
        System.out.println("</" + qName + ">");
    }
    public void characters(char[] chars, int start, int length) {
        System.out.println(new String(chars, start, length));
    }
}
Benefits of SAX

 It is very simple

 It is very fast

 Useful when custom data structures are needed to model the XML document

 Can parse files of any size, because the whole document never has to be held in memory

Drawbacks of SAX

 SAX provides read-only access

 No random access to documents

 Searching of documents is not easy
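
The point about custom data structures can be illustrated with a small handler. This is a sketch under stated assumptions: it uses the JAXP SAXParserFactory from the JDK, and the class SaxParagraphsSketch and its method paragraphs are hypothetical names. Instead of building a tree, the handler keeps only the text of the para elements in a plain list and discards everything else as the events go by.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxParagraphsSketch {

    // Collects the text of every <para> element into a list, in document order.
    static List<String> paragraphs(String xml) {
        final List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder current; // non-null only while inside a <para>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("para".equals(qName)) {
                    current = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver one text run in several calls, so append.
                if (current != null) {
                    current.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("para".equals(qName)) {
                    result.add(current.toString());
                    current = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<chapter>Introduction to XML"
            + "<para>Markup languages</para>"
            + "<para>Features of XML</para>"
            + "</chapter>";
        System.out.println(paragraphs(xml)); // prints [Markup languages, Features of XML]
    }
}
```

The drawbacks listed above show up here as well: the handler sees each element once, in order, and has to track its own state (the current field) because it cannot look back or ahead in the document.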
JAXB PARSER

 Provides an efficient and standard way of mapping between XML and Java code

 Programmers don't have to create application Java objects anymore themselves

 Programmers do not have to deal with XML structure; instead they deal with meaningful
business data


JAXB Design Goals
  Easy to use : Don't have to deal with complexities of SAX and DOM

  Customizable : Allows keeping pace with schema evolution

  Portable: JAXB components can be replaced without having to make significant
 changes to the rest of the source code
How to Use JAXB

Develop or obtain XML schema

Generate the Java source files

Develop JAXB client application

Compile the Java source code

With the classes and the binding framework, write Java applications that:
 1) Build object trees representing XML data
JAXB Binding Lifecycle
 JAXB Runtime Operations
Provide the following functionality for schema derived classes

Unmarshal

Process (access or modify)

Marshal

Validation

A factory generates Unmarshaller, Marshaller and Validator instances for JAXB
technology based applications

Pass content tree as parameter to Marshaller and Validator instances
JAXB PROGRAMMING
EXTENSIBLE STYLESHEET TRANSFORMATION (XSLT)

Extensible Stylesheet Language (XSL) is a language for expressing stylesheets

XSL is made of two parts:

XSL Transformation (XSLT)

XSL Formatting Objects (XSL-FO)

Viewpoints of XML

Presentation Oriented Publishing (POP): Useful for Browsers and Editors

Message Oriented Middleware (MOM): Useful for Machine-to-Machine data
exchange. E.g.: Business-to-Business communication
XSLT is useful in:

Transforming data into a viewable format in a browser (POP)

Transforming business data between content models (MOM)
XSLT Stylesheet

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="people">
Folks in Brandeis XML class
</xsl:template>
</xsl:stylesheet>


RESULT

<?xml version="1.0" encoding="UTF-8"?>

   Folks in Brandeis XML class
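
A stylesheet like this one can be applied from Java. The following is a minimal sketch, assuming the javax.xml.transform API of the JDK and assuming that the input is the people document from the DOM section; the class XsltSketch and the method transform are hypothetical names. Because the template matches people and does not apply templates to its children, none of the names from the input reach the output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {

    static final String STYLESHEET =
        "<?xml version=\"1.0\"?>"
        + "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"people\">"
        + "Folks in Brandeis XML class"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Assumed input document (taken from the DOM example, not stated on the XSLT slide).
    static final String INPUT =
        "<people><name>"
        + "<first_name>Alan</first_name>"
        + "<last_name>Turing</last_name>"
        + "</name></people>";

    // Applies the stylesheet to the XML text and returns the result as a string.
    static String transform(String stylesheet, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(STYLESHEET, INPUT));
    }
}
```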
XSLT stylesheet language
 template

 value-of

 apply-templates

 for-each

 if

 choose, when, otherwise

 sort

 filtering
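
Several of these elements can be seen working together in a short example. This is an illustrative sketch, assuming the javax.xml.transform API of the JDK; the class XsltElementsSketch is a hypothetical name, and the two extra people in the input are sample data added for the illustration. The stylesheet uses template, for-each, sort, value-of and if to print the last names in alphabetical order, separated by commas.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltElementsSketch {

    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/people\">"
        + "<xsl:for-each select=\"name\">"            // loop over every name element
        + "<xsl:sort select=\"last_name\"/>"          // visit them sorted by last name
        + "<xsl:value-of select=\"last_name\"/>"      // emit the text of last_name
        + "<xsl:if test=\"position() != last()\">, </xsl:if>" // separator except after the last
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Sample input; only the Turing entry comes from the slides.
    static final String INPUT =
        "<people>"
        + "<name><first_name>Alan</first_name><last_name>Turing</last_name></name>"
        + "<name><first_name>Grace</first_name><last_name>Hopper</last_name></name>"
        + "<name><first_name>Ada</first_name><last_name>Lovelace</last_name></name>"
        + "</people>";

    // Runs the stylesheet over the sample input and returns the text output.
    static String sortedLastNames() {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(INPUT)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sortedLastNames()); // prints Hopper, Lovelace, Turing
    }
}
```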
THANK YOU

				