XHTML by wuzhengqin



Revised 3/20/2005
        Drawbacks of HTML
   Tags can be left out
    – If a tag is left out, the browser is left to figure out
      how the page should be rendered
    – As a result, the same HTML may be rendered
      differently by different browsers
   Ambiguity
    – What should be done in certain situations is left
      for the browser to figure out
   Inflexibility
    – Any standard takes time to react to new needs
      and technologies.
       Drawbacks of HTML
   Does not support new devices
    – One standard which would work on emerging
      devices would simplify life
          Handheld computers
          Wireless phones
          Televisions
          Kiosks
    – But XHTML is flexible: it can integrate other content,
      such as MathML, for browsers that support it
            What is XHTML?
   It is what HTML has morphed into
    – Last HTML specification was HTML 4.01 (1999)
    – XHTML 1.0 is syntactically identical (almost) to
      HTML 4.01 but has a means of being validated as
      an XML document
          “Well formed” XML documents follow the XML syntax
          “Valid” XML documents can be proved to be
           correct according to the specification
    – XHTML 1.1 is the current recommendation by the
      World-Wide Web Consortium
          That doesn’t mean browser manufacturers will adhere to it
    – XHTML 2.0 is under development
          Notes on XHTML
   IE 6.0+ and Mozilla Firefox claim to support
    XHTML 1.0
   However, browsers will still do their best to
    translate “dirty” HTML where tags are missing
    or inconsistent, to be backwards compatible
   So old HTML and invalid XHTML will still get displayed
    – But if the XHTML is validated, there is no
      ambiguity so results should be consistent across
      browsers
XML – A parent of XHTML
   HTML is the other parent
   XML = Extensible Markup Language
   XML is used to transfer information with its context
    over the internet or other electronic media
   Document Type Definitions (DTDs) describe the tags
    and attributes that constitute a valid means of
    describing some data
   XHTML 1.0 is essentially HTML 4.01 which has been
    minimally modified to follow an XML syntax
      Sample XML File
<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
    – <?xml version="1.0"?> is the required declaration
      indicating the XML version syntax followed
    – 2 or more parties agree on a set of tags, attributes
      and constraints to be used
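
Because the note follows an exact syntax, any XML parser can read it back. A minimal sketch using Python's standard xml.etree.ElementTree module; the helper name read_note is made up for this illustration:

```python
import xml.etree.ElementTree as ET

NOTE = """<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""


def read_note(xml_text):
    """Return the children of the root element as a {tag: text} dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


note = read_note(NOTE)
print(note["heading"], "from", note["from"], "to", note["to"])
```

The tag names carry the context of each value, so the receiving program does not have to guess which field is which.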

Why XML? (from the W3C)
   It allows structured information like an address book
    to be stored in a plain text file
   Files are easy to generate and read by computer and by people
   Unambiguous
    – Files follow an exact syntax
   Extensible
    – New tags can be added without breaking the old ones
   International
   Platform independent
   Open standard so it’s free!
[Diagram: how SGML, HTML, XML and XHTML are related]
   SGML: Standard Generalized Markup Language, a set of rules
    used to create other markup languages
    – HTML 1.0 through HTML 4.01 (many generations): the HTML
      specification did not require absence or presence of tags
      and did not specify order
    – XML: Extensible Markup Language, a simplified subset of SGML
      which is useful but not too complex and is ideal for
      transmitting data with context
   HTML provides existing tags and attributes; XML provides an
    enforcement syntax
    – XHTML 1.0, then XHTML 1.1: HTML-like tags and an enforcement
      mechanism in one standard
    – XHTML is defined by an XML DTD. HTML can no longer be
      ambiguous or poorly defined.
    Line 1: XML Declaration
   The first line of a valid XHTML document
    should say “This is an XML document!”
   Use this statement:

    <?xml version="1.0" encoding="UTF-8" standalone="no" ?>

    – version is the version of XML used. 1.0 is typical. 1.1 may
      also be used
    – encoding indicates how the file is encoded.
           UTF-8 is typical for Latin-based languages like English. It is a
            superset of the ASCII character set and can also encode any
            other Unicode character
           UTF-16 is often used for large character sets like Chinese
    – standalone="no" means the file cannot be used by itself. It
      must look for an external specification to validate correctly
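
To make the encoding choice concrete, here is a small sketch that measures how many bytes the same text takes in each encoding. The helper name encoded_sizes and the sample strings are chosen for this illustration only:

```python
def encoded_sizes(text):
    """Return the byte length of `text` in UTF-8 and in UTF-16 (no byte-order mark)."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Plain ASCII text is byte-for-byte identical when stored as UTF-8.
assert "XHTML".encode("utf-8") == "XHTML".encode("ascii")

for sample in ("XHTML", "Heia Norge!", "中文"):
    utf8, utf16 = encoded_sizes(sample)
    print(f"{sample!r}: {utf8} bytes in UTF-8, {utf16} bytes in UTF-16")
```

English text is half the size in UTF-8, while Chinese text is somewhat smaller in UTF-16. Both encodings can represent either sample.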
Line 2: DOCTYPE Statement
   After the XML declaration, preface your HTML
    with a DOCTYPE definition
    – The DOCTYPE points to a place on the web
      containing the legal tags and attributes for this
      XML document. In this case, XHTML is the
      DOCTYPE
    <!DOCTYPE html PUBLIC
    "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    – The identifier and URL will vary depending on
      the usage wanted
    Understanding DOCTYPE
   <!DOCTYPE html PUBLIC
    "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    – !DOCTYPE is like a verb that means “Here’s some
      information about where to go if you want to validate this
      XML document”
    – html is the name of the document's root element
    – PUBLIC means the specification can be found externally.
      (Otherwise the definition would be contained in the XML file.)
    – -//W3C//DTD XHTML 1.0 Transitional//EN is the ID given by
      the DTD creator to this DTD
    – http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
      is a URL to the specification
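
The parts of the DOCTYPE can also be read back programmatically. A minimal sketch using Python's standard xml.dom.minidom module, which records the DOCTYPE without downloading the DTD; the tiny page in DOC and the helper name doctype_parts are made up for this illustration:

```python
from xml.dom import minidom

DOC = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Example</title></head>"
    "<body><p>Hello</p></body></html>"
)


def doctype_parts(xml_text):
    """Return (root element name, public ID, DTD URL) from the DOCTYPE."""
    doctype = minidom.parseString(xml_text).doctype
    return doctype.name, doctype.publicId, doctype.systemId


name, public_id, dtd_url = doctype_parts(DOC)
print(name)
print(public_id)
print(dtd_url)
```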
       XHTML Specifications
   You can use any of these XHTML specifications:
    – XHTML 1.0 Transitional – For use when old browsers must be supported
    – XHTML 1.0 Frameset – Frames are not supported by the XHTML
      1.0 Strict Specification, so use this if you have to support frames
    – XHTML 1.0 Strict – No frames, certain tags and attributes cannot
      be used
    – XHTML 1.1 – A more modularized version of XHTML. Current
      recommendation, but not fully supported by browsers
    – XHTML Basic – A “stripped” down version of XHTML 1.1 geared
      toward mobile devices and small screens
    – XHTML 2.0 – Under development
   Recommendation: Use XHTML 1.0 Strict unless you need to
    support frames or very old browsers like Netscape 4
Line 3: Improved <html> tag
    <html xmlns="http://www.w3.org/1999/xhtml">
      – xmlns is the XML Name Space. It prevents
        collisions should this XML document reference
        more than one XML specification.
            This is typically a URL, but it is only used as a unique
             name and nothing is downloaded from it. XHTML documents
             use http://www.w3.org/1999/xhtml
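
A rough sketch of how a namespace-aware parser keeps two vocabularies apart. It uses Python's standard xml.etree.ElementTree, which reports every tag as {namespace}name. The mixed XHTML and MathML page and the helper name tags_by_namespace are assumptions made for this illustration:

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

DOC = f"""<html xmlns="{XHTML_NS}">
  <body>
    <p>A variable:</p>
    <math xmlns="{MATHML_NS}"><mi>a</mi></math>
  </body>
</html>"""


def tags_by_namespace(xml_text):
    """Group local tag names by the namespace each element belongs to."""
    groups = {}
    for element in ET.fromstring(xml_text).iter():
        # ElementTree stores namespaced tags as "{namespace}localname".
        namespace, _, local = element.tag[1:].partition("}")
        groups.setdefault(namespace, []).append(local)
    return groups


for namespace, tags in tags_by_namespace(DOC).items():
    print(namespace, tags)
```

Even if both vocabularies defined a tag with the same name, the namespace prefix on each tag would keep them distinct.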
Translating HTML to XHTML
    Documents must be well formed
     – All tags must be closed
          <head> … </head>
          <br></br>
     – Empty tags can either use a pair or end in />
          <br></br> is okay
          <br /> is equivalent
              Add a space before the / so it
              will work with HTML-compatible browsers
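
Well-formedness can be checked mechanically by handing the markup to any XML parser. A minimal sketch with Python's standard xml.etree.ElementTree; the helper name is_well_formed is made up here, and it checks XML syntax only, not validity against the XHTML DTD:

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if `markup` parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


for snippet in (
    "<p>one<br></br>two</p>",  # empty tag written as a pair
    "<p>one<br />two</p>",     # empty tag ending in />
    "<p>one<br>two</p>",       # HTML style: <br> is never closed
):
    print(snippet, "->", is_well_formed(snippet))
```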
Translating HTML to XHTML
   Tag attributes that are off by default
    require an attribute value if “on”
    – Specify the attribute name as the attribute value
    – Example:
          HTML 4.0: <input type="checkbox" checked>
          XHTML 1.0: <input type="checkbox" checked="checked" />
Translating HTML to XHTML
   Tags and attributes must be in lower case
    – You should be doing this already
    – HTML 4.0:
          <BODY BGCOLOR="WHITE">
    – XHTML 1.0:
          <body bgcolor="WHITE">
          Depending on the DTD it may be okay for
           attribute values to be in upper case
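
Upper-case names are still well-formed XML, so a plain parser will not complain about them; only validation against the XHTML DTD will. As a rough sketch, the names can be checked by walking the parsed tree. The helper name non_lowercase_names is made up for this illustration, and it assumes a fragment without namespace declarations:

```python
import xml.etree.ElementTree as ET


def non_lowercase_names(markup):
    """List tag and attribute names that are not entirely lower case."""
    offenders = []
    for element in ET.fromstring(markup).iter():
        # Check the tag name first, then each attribute name in order.
        for name in (element.tag, *element.attrib):
            if name != name.lower():
                offenders.append(name)
    return offenders


print(non_lowercase_names('<BODY BGCOLOR="WHITE"></BODY>'))
print(non_lowercase_names('<body bgcolor="WHITE"></body>'))
```

Attribute values are deliberately not checked, matching the note that upper-case values may be acceptable.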
Translating HTML to XHTML
   Attribute values must be quoted, normally with
    double quotes
    – You should be doing this already
    – HTML 4.0:
          <body bgcolor=white>
    – XHTML 1.0:
          <body bgcolor="white">
Translating HTML to XHTML
   Tags must be properly nested
    – Inner tags must be closed before outer
    – You should be doing this already
    – HTML 4.0:
         <b><i>Text</b></i>
    – XHTML 1.0:
         <b><i>Text</i></b>
Translating HTML to XHTML
   Tags must be nested under the proper parent
    – No tags in the specification may be placed
      before the <html> tag or after the
      </html> tag
    – <title> cannot be nested inside the
      <body> tag because it belongs inside the
      <head> tag
    – <li> may only appear inside an <ol> or
      <ul> tag
Translating HTML to XHTML
   Comments must begin with <!-- and end
    with -->
    – Note: the comment text itself may not
      contain --
    – HTML 4.0:
         <! This is a comment>
    – XHTML 1.0:
         <!-- This is a comment -->
Translating HTML to XHTML
   Use the id attribute in place of the <img> name attribute
    – HTML 4.0:
          <img src="picture.gif" name="picture1">
    – XHTML 1.0:
          <img src="picture.gif" id="picture1"
           name="picture1" />
          To accommodate very old browsers, put both the
           name and the id attributes in and use the
           Transitional XHTML DTD
Translating HTML to XHTML
   A tag with a lang attribute must also
    specify the xml:lang attribute
    – HTML 4.0:
          <div lang="no">Heia Norge!</div>
    – XHTML 1.0:
          <div lang="no" xml:lang="no">Heia Norge!</div>
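
A small sketch of how this rule could be checked automatically. Python's xml.etree.ElementTree exposes xml:lang under the reserved XML namespace; the helper name missing_xml_lang is made up for this illustration:

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this reserved namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def missing_xml_lang(markup):
    """Return the tags that set lang without an identical xml:lang."""
    return [
        element.tag
        for element in ET.fromstring(markup).iter()
        if "lang" in element.attrib
        and element.get(XML_LANG) != element.get("lang")
    ]


print(missing_xml_lang('<div lang="no">Heia Norge!</div>'))
print(missing_xml_lang('<div lang="no" xml:lang="no">Heia Norge!</div>'))
```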
Translating HTML to XHTML
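   Mandatory Elements
    – DOCTYPE, <html>, <head>, <title> and <body>
    – Frameset DTD is slightly different
    <!DOCTYPE Doctype goes here>
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
          <title>Title goes here</title>
      </head>
      <body>
          Body text goes here
      </body>
    </html>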
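
As a sketch, the presence of the mandatory elements can be tested with namespace-aware path lookups in Python's xml.etree.ElementTree. The two sample pages and the helper name missing_mandatory are assumptions made for this illustration, and the check does not replace validation against the DTD:

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://www.w3.org/1999/xhtml"}

# Paths are relative to the <html> root element.
REQUIRED = {"head": "x:head", "title": "x:head/x:title", "body": "x:body"}

GOOD = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Title goes here</title></head>"
    "<body><p>Body text goes here</p></body></html>"
)
NO_TITLE = GOOD.replace("<title>Title goes here</title>", "")


def missing_mandatory(markup):
    """Return the names of mandatory elements that the page lacks."""
    root = ET.fromstring(markup)
    return [name for name, path in REQUIRED.items() if root.find(path, NS) is None]


print(missing_mandatory(GOOD))
print(missing_mandatory(NO_TITLE))
```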
Translating HTML to XHTML
   <html> tag must include the appropriate
    namespace (xmlns) attribute
   Namespace: A URL used as a unique name that
    identifies the tags as belonging to XHTML. It is
    the same for every XHTML DTD.
   Example:
<html xmlns="http://www.w3.org/1999/xhtml">
Translating HTML to XHTML
   The contents of <script> and <style> tags must be
    marked as CDATA sections
   CDATA sections make the XML parser ignore blocks of text
    – This text would otherwise be considered XML and
      would cause the XML file to fail validation
   <script> tags used by themselves without
    information between the tags are exempt
Translating HTML to XHTML
   Example:
    <script type="text/javascript">
    <![CDATA[ document.write("<b>Hello World!</b>"); ]]>
    </script>
    – <![CDATA[ delimits the start of the CDATA section
    – ]]> delimits the end of the CDATA section
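
The effect of the CDATA section can be seen by parsing the script with and without it. A minimal sketch using Python's standard xml.etree.ElementTree; the helper name inspect_script is made up for this illustration:

```python
import xml.etree.ElementTree as ET

WITH_CDATA = """<script type="text/javascript">
<![CDATA[ document.write("<b>Hello World!</b>"); ]]>
</script>"""

WITHOUT_CDATA = """<script type="text/javascript">
document.write("<b>Hello World!</b>");
</script>"""


def inspect_script(markup):
    """Return (leading text of the script, tags of any child elements)."""
    script = ET.fromstring(markup)
    return script.text.strip(), [child.tag for child in script]


print(inspect_script(WITH_CDATA))
print(inspect_script(WITHOUT_CDATA))
```

With the CDATA section the whole statement stays plain text. Without it the parser treats <b> as a child element of <script>, so the script text is cut short and the DTD, which allows only text inside <script>, is violated.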
Translating HTML to XHTML
 Ampersands in attribute values must be
  coded using &amp;
 Example:
    <a href=“http://www.mysite.com/cgi-
     test2”>Link text</a>
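
When links are generated by a program, the escaping can be left to a library. A minimal sketch using quoteattr and escape from Python's standard xml.sax.saxutils module; the helper name link and the www.example.com URL are hypothetical:

```python
from xml.sax.saxutils import escape, quoteattr


def link(url, text):
    """Build an <a> element with the URL safely quoted as an attribute value."""
    # quoteattr adds the surrounding quotes and turns & into &amp;
    return f"<a href={quoteattr(url)}>{escape(text)}</a>"


print(link("http://www.example.com/search?test1=a&test2=b", "Link text"))
```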
Translating HTML to XHTML
   There may be other subtle things you
    may need to do to make your code
    XHTML 1.0 compliant.
    – For example, the alt attribute has to be
      specified for <img> and <area> tags
   The XHTML validator you use will
    indicate where the errors are
           Validating XHTML
   The World Wide Web Consortium provides
    tools to use:
    – http://validator.w3.org/
          Checks 1 page
          Must be on the web
    – http://validator.w3.org/file-upload.html
          Lets you upload a file and have it validated
    – HTML Tidy
          A tool you can download
          See http://www.w3.org/People/Raggett/tidy/
        Validating XHTML
 Web-based validators provide a web
  page of errors back
 Make the changes to the XHTML and try again
 When you have no more errors, your
  XHTML is valid!
 Most often used is the W3C validator
    – http://validator.w3.org
[Screenshot: fixing the errors in the paragraph elements]
[Screenshot: report showing a successful validation under XHTML 1.0 Transitional]
       XHTML References
   http://www.w3schools.com/xhtml
    – Good examples
    – Easy to read
   http://www.w3.org/MarkUp/
    – The definitive source published by the
      World Wide Web Consortium
