					    Standards In A Digital World:
          Z39.50, HTML, Java:
         Do They Really Work?
                Brian Kelly
              UK Web Focus
             University of Bath

    • Introduction
    • HTML
       • Initial Roadmap / The Diversion / Back on Course
    • W3C Standardisation Process
    • Rivals to HTML
       • PDF
       • Viewers
    • Scripting
       • Client-side Scripting Languages
       • Server side Scripting
    • Distributed Searching
       • Z39.50
       • Other Protocols
    • Conclusions
    UK Web Focus
    UK Web Focus:
     • National web coordination post for UK HE community
     • Based at UKOLN, University of Bath
     • Responsibilities include:
        – Technology watch
        – Information dissemination in variety of ways:
            – Workshops (national, regional)
            – Presentations at conferences and seminars
            – Online
        – Coordination activities
        – Representing JISC on W3C
     • Brian Kelly appointed on 1st November 1996
        – Involved with web since January 1993
        – Previously worked at University of Newcastle, Leeds,
          Liverpool, and Loughborough
        The Question
        Where do you stand?

    The success of the Web is based on competition in the marketplace.
    Just look at the benefits provided by competition between Netscape and

    The success of the Web is based on building on open, non-proprietary
    Use of proprietary systems has increased costs for the user, and
    resulted in flawed systems.

    HTML Roadmap
    HTML 1.0       Gets things started
    HTML 2.0       CERN / NCSA partnership
                   introduces NCSA Mosaic with
                   support for forms and inline images
    HTML +         Proposal for enhancements
                   including improved layout control
                   (e.g. tables), maths, etc.
    Style Sheets   Mechanism for defining appearance
                   Structure separate from appearance
                   Various proposals (DSSSL, CSS,

    HTML History
    HTML 1.0   Unpublished specification. DTD developed
               by Tim Berners-Lee (CERN).
    HTML 2.0   Spec. based on innovations from NCSA
               (forms and inline images!)
    HTML 3.0   Proposed spec. (renamed from HTML+).
               Very comprehensive
               Failed to complete IETF standardisation
               Little implementation experience
    HTML 3.2   Spec. based on description of mainstream
               innovations in marketplace
    HTML 4.0   Current proposal.

    HTML Wars
    October 1994   Netscape released (Mosaic
                   Communications Corporation)
                   Quality browser, but supported
                   proprietary tags (<BLINK>, <FONT>,
    1995          New versions of Netscape released,
                   supporting additional proprietary tags
                   (<SPACER>, <LAYER>, etc.)
    1996          Microsoft respond to competition with
                   their own proprietary tags
                   (<MARQUEE>, etc)

    HTML Wars - The Problems
    Device Dependency
      • Resources are dependent on a particular browser
      • Platform dependency
      • Costs in supporting authoring tool
      • Potential costs in re-engineering
      • Proprietary innovations have been flawed:
            – Merging content and appearance
            – Maintenance of resources
      • Accessibility problems:
            – Poor support for access by disabled (e.g. speaking
              browsers for visually impaired)
    End of the Wars?
                                      Thursday, August 21 1996
    Microsoft Pledge on HTML Standards
    "HTML is the most basic and fundamental data format
    of the Web.
    Support for HTML standards ensures that content can
    be viewed by any browser as the creator intended.
    …. agreement on the most basic data format is critical
    to interoperability and the continued growth of the


     Microsoft Pledge (Cont.)
     "Previous proprietary HTML extensions from Microsoft and other
     vendors have confused the market, hampered interoperability and
     been ill-conceived with respect to [HTML] design principles ...
     Microsoft will agree to:
       Not ship extensions to HTML without first submitting them to
       Implement all W3C approved HTML standards.
       Clearly identify any not-yet-approved HTML tags we support as
       Publish a Document Type Definition (DTD) for its browser as
        mandated by SGML.
       Follow the architecture principles of HTML and its parent,
        SGML, when proposing new extensions.
     Microsoft agrees to hold itself to these standards. Will all the other
     Web browser vendors, including Netscape, also agree to this
     conduct of behavior?"
        HTML 4.0 and CSS
        HTML 4.0 and CSS will provide an architecturally pure,
        yet functionally rich environment
         HTML 4.0
          • Improved forms
          • Hooks for stylesheets
          • Hooks for scripting languages
          • Table enhancements
          • Better printing
         CSS
          • Support for all HTML formatting
          • Positioning of HTML elements
          • Support for multiple media

     Some problems with CSS are being experienced following:
       • Use of CSS features which changed during CSS
       • Browser supported features which changed
     W3C Process
      • A consortium of subscribing member organisations
      • Areas of work agreed by members
      • Working group set up:
        – Charter
        – WG membership (restricted)
      • Initial recommendations produced by WG
      • Recommendation made public
      • Feedback on open mailing lists and to editor
      • Recommendation updated
      • Members vote
     User Interface:
      • HTML
      • Style Sheets
      • Document Object Model
      • Maths
      • Graphics
      • Fonts
     W3C Process
     Pros
      • Work can be well-focussed
      • Avoids "flaming"
      • Battle can take place in private
      • Implementation and development of spec closely linked
     Cons
      • Discussions are closed
      • Process undemocratic
      • Only rich companies can afford to take part
      • Difficult for non-members to contribute their expertise
      • Non-members may be developing systems in

     HTML - The Competition
     What are the alternatives to HTML ?
     HTML      An SGML DTD
               Describes document structure
               Used in conjunction with emerging style
               sheet proposal
               Agreements on standards emerging
     PDF       Adobe's Portable Document Format
               Provides control over appearance
     Native file format
               Store document in native format, and provide
               user with reader on client machine
     SGML / XML
               Richer DTDs
     PDF Pros
       • Control over appearance not (yet) easily
         available in HTML
       • Functionality of PDF Reader can be controlled (e.g.
         prevent copying, printing with watermarks)
     PDF Cons
       • Does not store document structure
       • Proprietary
          – How would we feel about it if it were owned
            by Microsoft?
          – Remember GIF patent problems!
       • Printing problems
     Use of Native File Format
     Files can be stored in their native file format (Word,
     Powerpoint, LaTeX, DVI, etc.)
     Files may then be viewed using the application or a
     viewer which understands the format
        • No conversion needed
        •   Viewing software needed
        •   Format version issues
        •   Indexing issues
        •   Viruses
        •   Proprietary
       • Extensible Markup Language
       • A lightweight SGML designed for network use
       • Arbitrary elements can be defined (<STUDENT-
         NUMBER>, <PART-NO>, etc)
       • Eliminates problems encountered in extending
          – Extension by fiat e.g. <FONT>
          – Public experiments e.g. the <BLINK> tag
          – The standards process e.g. Maths
       • Agreement achieved quickly
       • Support from industry (SGML vendors,
         Microsoft, etc.)
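
     The "arbitrary elements" point can be illustrated with a minimal
     sketch that reads a small XML record using the STUDENT-NUMBER
     element named above. The STUDENT and NAME elements and the sample
     values are assumptions invented for illustration; they are not
     taken from any published DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: only STUDENT-NUMBER is named on the slide;
# STUDENT, NAME and the values are invented for illustration.
SAMPLE = """<STUDENT>
  <NAME>A. Student</NAME>
  <STUDENT-NUMBER>9712345</STUDENT-NUMBER>
</STUDENT>"""


def read_student(xml_text):
    """Return (name, student number) from a STUDENT record."""
    root = ET.fromstring(xml_text)
    return root.findtext("NAME"), root.findtext("STUDENT-NUMBER")


name, number = read_student(SAMPLE)
```

     A generic XML parser can read the record without knowing the
     element names in advance, which is the contrast being drawn with
     extending HTML tag by tag.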
     XML Support
     Microsoft have expressed support for XML:
       "Internet Explorer version 4.0 will support a few
       XML applications (such as CDF). Microsoft will
       be supporting XML in future versions of Internet
     Note how they will be supporting an ISO

     Metadata - the missing architectural component
     from the initial implementation of the web

     Addressing     Transport     Data format
       URL            HTTP          HTML

     Metadata Requirements
     Imagine a university prospectus on the web

     Requirement                 Protocol
     Available in Middle East,   PICS (rating system)
     where porn filters in use
     Resource discovery (find    Dublin Core
     “Bath prospectus”)
     Legally binding assertion   Digital Signature
     Delivered in appropriate    Transparent Content
     format (HTML, PDF)          Negotiation
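
     For the resource discovery requirement, one possible approach is
     to embed Dublin Core elements in the page as HTML META tags. The
     sketch below assumes the common "DC." name prefix convention; the
     function name and the sample values are hypothetical.

```python
from html import escape


def dublin_core_meta(fields):
    """Render a dict of Dublin Core element names and values as META tags."""
    lines = []
    for element, value in fields.items():
        # Escape both parts so the values are safe inside HTML attributes.
        lines.append('<META NAME="DC.%s" CONTENT="%s">'
                     % (escape(element), escape(value)))
    return "\n".join(lines)


# Sample values invented for illustration.
tags = dublin_core_meta({
    "title": "Bath prospectus",
    "creator": "University of Bath",
})
```

     A search service that understands the convention could then index
     the title and creator fields rather than guessing from body text.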

     Metadata Standards
     PICS            Agreement within industry (US
                     Communications Decency Act
                     perceived as threat)
                     Format moving to XML in PICS/NG
     Dublin Core     Pressure from library community
                     results in changes to HTML 4
                     Format likely to move to XML
     Digital Signatures
                     Based on PICS/NG

     W3C to set up a Metadata Coordination Group

     Other XML Developments
     XML seems to be gaining momentum:
     PICS      Moving from rating system to key part of
               metadata architecture
     CDF       Channel Definition Format
               Microsoft proposal for push technology
     OPS       Open Profiling Specification
               Microsoft proposal
     XML Web Collections
               Microsoft proposal for defining relationships
               between resources.
     MCF using XML
               Netscape proposal for describing metadata for
               collections of resources using XML
     CML       Chemical Markup Language
     MML       Math Markup Language
       • Netscape's JavaScript (renamed from
         LiveScript) was first widely-deployed scripting
       • Problems with inter-working between different
       • Problems with inter-working across browsers
         (Microsoft and JScript)
       • Problems with use of multiple scripting
         languages in a document

      • JavaScript handed to standards body (ECMA)
      • W3C developing standards for integrating scripting
        languages with HTML
      • W3C working on Document Object Model (DOM):
        ".. a platform- and language-neutral interface that
        will allow programs and scripts to dynamically
        access and update the content, structure and style
        of documents."
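
      The idea of a language-neutral interface for accessing and
      updating a document can be sketched with Python's standard
      xml.dom.minidom module. Treat this as an illustration only: the
      module implements a later, published version of the DOM, and the
      sample document is invented.

```python
from xml.dom.minidom import parseString

# Invented sample document, well-formed so an XML parser can read it.
doc = parseString("<html><body><p id='intro'>Old text</p></body></html>")

# Access structure: find the first P element through the DOM interface.
para = doc.getElementsByTagName("p")[0]

# Update content: change the data held by the element's text node.
para.firstChild.data = "New text"

# Update style: set an attribute on the element.
para.setAttribute("style", "color: red")

# Serialise the modified element back to markup.
result = para.toxml()
```

      The same calls (getElementsByTagName, setAttribute) are defined
      by the DOM itself, so a script in another language would use the
      same names.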

       • Development begun by Sun in early 1990s
         (known as Oak)
       • Moved to Web and released in 1995
       • Programming language and virtual machine
         environment (provides portability and
       • See

         Java Applications
     Java is gaining momentum:
      • Interactive applications
      • Enhanced user interfaces
      • Replacing conventional
        desktop applications
      • Extending browsers

     Java Standardisation
     Java developments:
       • Sun submitting Java to standards body
         (ISO/IEC JTC1)
       • Concerns over process ("Microsoft believes
         that .. that Sun wishes to retain full ownership
         and control over its Java specifications ..")
       • See

          Distributed Searching -
          The Problem

     End users face difficulties due to
     the wide variety of search
     interfaces available
     Possible Solutions
     Agree to use the same software
       • Unlikely to happen
       • Undesirable
     Agree to implement similar interfaces
       • Probably not feasible
     Have a centralised database
       • Scaling problems
     Use software which implements protocol
     designed to provide common search
     interface across diverse services
       • e.g. Z39.50
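
     The last option, a common search interface across diverse
     services, can be sketched as follows. This is a toy model under
     stated assumptions, not Z39.50 itself: the SearchService
     interface, the InMemoryService class and the sample records are
     all hypothetical.

```python
from typing import Dict, List, Protocol


class SearchService(Protocol):
    """The common interface every service is assumed to implement."""
    name: str

    def search(self, query: str) -> List[str]: ...


class InMemoryService:
    """Hypothetical stand-in for a remote service holding its own records."""

    def __init__(self, name: str, records: List[str]) -> None:
        self.name = name
        self.records = records

    def search(self, query: str) -> List[str]:
        # Case-insensitive substring match over this service's records.
        q = query.lower()
        return [r for r in self.records if q in r.lower()]


def search_all(services: List[SearchService], query: str) -> Dict[str, List[str]]:
    """Send one query to every service and collect the hits per service."""
    return {s.name: s.search(query) for s in services}


services = [
    InMemoryService("library", ["Chemistry handbook", "History of Bath"]),
    InMemoryService("archive", ["Chemistry software", "Mail tools"]),
]
hits = search_all(services, "chemistry")
```

     The client code in search_all never needs to know how each
     service stores or indexes its data, only that it answers the same
     search call.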
     An Applications Solution
     Metacrawler can be used to search several large search engines.
      • Breaks if APIs
      • Centralised

     Z39.50 - What Is It?
       • A protocol which specifies data structures
         and interchange rules that allow a client
         machine to search databases on a server
         machine and retrieve records that are
         identified as a result of the search
       • Maintained by Library of Congress
       • Developed by ZIG
     Why is it important?
       • Powerful searching
       • Local, familiar interface
       • Retrieves structured data
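
     The search-then-retrieve pattern described above can be sketched
     with a toy in-memory model. It is not the Z39.50 wire protocol:
     the ToyServer class is hypothetical, its method names only
     loosely mirror the standard's Search and Present services, and
     the records are invented.

```python
class ToyServer:
    """In-memory stand-in for a database server (not a real Z39.50 server)."""

    def __init__(self, records):
        self.records = records
        self.result_sets = {}

    def search(self, set_name, term):
        """Run a search, keep the matching ids server-side, return the hit count."""
        matches = [i for i, rec in enumerate(self.records)
                   if term.lower() in rec["title"].lower()]
        self.result_sets[set_name] = matches
        return len(matches)

    def present(self, set_name, start, count):
        """Return up to `count` records from a stored result set (1-based start)."""
        ids = self.result_sets[set_name][start - 1:start - 1 + count]
        return [self.records[i] for i in ids]


# Invented sample records, structured as fields rather than flat text.
server = ToyServer([
    {"title": "Organic Chemistry", "author": "Smith"},
    {"title": "Web Standards", "author": "Jones"},
    {"title": "Physical Chemistry", "author": "Brown"},
])
hit_count = server.search("default", "chemistry")
first = server.present("default", 1, 1)
```

     The client first learns how many records matched, then asks for
     only the records it wants, and receives them as structured data.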
     Z39.50 History
     Z39.50 (1988)
        • NISO work with roots in OSI work
        • "an unimplementable abomination which should never
          have been adopted"
        • "Inspired" WAIS (which was not interoperable)
     Z39.50 (1992)
        • Implementation experience
        • OSI now regarded as failure
     Z39.50 (version 3)
        • Accepted as ISO standard in 1996 (ISO 23950)
        • Implemented using TCP/IP
        • Toolkits, profiles, etc now available
     Taken from Clifford Lynch's article at
     Z39.50 Pilot
     UKOLN is piloting Z39.50 across a number of services
     (UKOLN web site, BUBL, eLib project database, ...)
     Imagine searching across JISC services (and institutions):
      Find the chemical XML browser, and relevant reviews & papers.
      Search HENSA software archive, Mailbase lists, a Chemistry
      gateway and Imperial College web site
     Related Protocols
     LDAP     Lightweight Directory Access Protocol
              Derived from X.500 directory service
              See "Lightweight Directory Access Protocol"
              See also
     whois++ Derived from whois protocol for finding
             people (IETF)
             See "Architecture of the Whois++ Index
             Service" at the URL
     What The Software Companies Say

     Netscape (see
         • [We will] aggressively support open standards wherever they
         • Work within the open standards process to innovate valuable
           new functionality in ways that promote openness and
         • All current Netscape products implement and support the
           existing open standards appropriate to their functionality.
     Microsoft (see
         • Microsoft is fully committed to the HTML standards articulated by
           the World Wide Web Consortium (W3C) and the international
           Internet community.

         Caveat Emptor!
         Beware of free software - it can be expensive!

 Remember Your Music Collection?
 7" single   Your favourite single
 12" LP      The album containing the hit
 12" LP      Greatest hits
 CD          When you bought your CD
 Record companies are happy to sell you the same information in
 several formats!

 Is The Same True Of Your Information Systems?
 Home-grown
 Gopher      The hit of 1992
 WWW         The HTML 2 version
 WWW (2)     Revamped, based on Netscapeisms
 WWW (3)     Revamped, based on HTML 4 and CSS
 WWW (4)     ??
 Microsoft and Netscape will be happy to sell you tools to
 manipulate the same information!

     •   Without standards, costs are liable to escalate
     •   Software companies are happy to take our money
     •   OSI networking standard gave standardisation
         process a bad name
     •   Current IETF / W3C process of developing
         standards and gaining implementation experience
         is valuable
     •   Standards are not frozen
     •   The difficult choice may be "What standard?"

      Further Information
     List of Standards Bodies
     World Wide Web Consortium
     Microsoft and Standards
     Netscape and Standards
        On Julius Caesar, Queen Eanfleda,
        and the lessons from time past
         1 Dual standards rather than a single standard cause trouble.
         2 If you must have dual standards, specify mandatory conversions
           or interfaces between them.
         3 Never leave anything implementation-dependent.
         4 If irregularities are unavoidable in a standard (e.g. because of
           external constraints), put them where they will do the least
         5 Never alter standards to please the rich and powerful, unless
           the changes can be justified on firm technical grounds.
         6 Even the most rich and powerful can be persuaded that they will
           benefit from changing from their local standard to a general one.
         7 The most effective standards are those you take so for granted
           you don't have to think about them.
         8 If provisions of standards are based on external assumptions or
           constraints unrelated to the purpose of the standard, they are
           likely to appear irrational.