Worshipping at the Shrine:
Myths and Legends from comp.text.xml
Kerry “the heretic” Raymond, CiTR
XML is the new religion
• Why is XML so popular?
• Not a new idea, but a simplified SGML
• Right place at the right time
– HTML introduced mark-up to a wide audience
– W3C “brandname”
– “worse is better” or “simple better than perfect”
• Lots of XML-enabled products announced
So simple, anyone can do it!
“I am a 45yr old medical Doctor (General Practice,
England) who needs the challenge of a new career.
Electronics/communications/computing have been
a lifetime hobby interest.
I propose to learn XML etc. and combine this with
my people skills to work personally with
customers to provide them with information
management solutions, specialising in medical
So inadequate, please extend it?
• XML 1.0 not enough
– XML Namespaces
– XML Schema
– XML Data
– XML Link
– XML Pointer
– And so on …
What XML is
• XML is a data format with a basic syntax
and generic well-formedness rules
• DTD and Schema provide model-specific
• Semantics of the model
– Out of scope!
• Often confused with related APIs and tools,
Myths from when comp.text.xml was new
• XML is compact
• XML is fast
• XML will replace word processors
• XML will replace relational databases
• XML will replace CORBA
• XML-enabled applications will interwork
Myth: XML is Compact
• XML is bigger than most proprietary formats
(for the same information content)
– Content presented as text not binary
– Markup is verbose
• <longtagname> … </longtagname>
– Lots of nesting but little content
• Often more bytes in tags than in content
– But does it matter?
• Only if short of space (e.g. floppy disk)
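The tags-versus-content point can be checked on any document. The following is a minimal sketch, assuming a small made-up address book (the element names are invented for illustration); it counts the bytes of character data and treats everything else as markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are invented for illustration.
SAMPLE = (
    "<addressbook>"
    "<person><firstname>Ada</firstname><lastname>Lovelace</lastname></person>"
    "<person><firstname>Alan</firstname><lastname>Turing</lastname></person>"
    "</addressbook>"
)

def markup_overhead(xml_text: str) -> tuple[int, int]:
    """Return (content_bytes, markup_bytes) for an XML string."""
    root = ET.fromstring(xml_text)
    # itertext() yields every piece of character data in document order.
    content = sum(len(t.encode("utf-8")) for t in root.itertext())
    total = len(xml_text.encode("utf-8"))
    return content, total - content

content_bytes, markup_bytes = markup_overhead(SAMPLE)
print(f"content: {content_bytes} bytes, markup: {markup_bytes} bytes")
```

The count is only approximate for documents that use entity references or CDATA sections, since the parser expands those before the text is measured.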
Myth: XML is fast
• How do you measure the speed of a data format?
• XML is both generic and heavily text-based
(requiring lexical analysis and parsing), so
specialised “binary” formats will be “faster”
• Development of an XML-enabled tool is faster
(XML parser is a generic COTS tool)
• Does it matter? Depends on your application.
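To make the comparison concrete, here is a rough sketch that decodes the same three integers from an XML string and from a fixed-layout binary record. The record, the element names and the binary layout are all assumptions made for this example. It shows the extra work the text form needs, not a benchmark.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record: three integer readings (names and layout invented here).
readings = (17, 42, 99)

# Text form: must be tokenised, parsed, and each value converted from characters.
xml_form = "<readings>" + "".join(f"<r>{v}</r>" for v in readings) + "</readings>"

# Specialised binary form: three little-endian 32-bit integers in a fixed layout.
binary_form = struct.pack("<3i", *readings)

def decode_xml(text: str) -> tuple[int, ...]:
    """Parse the document, then convert each element's text to an integer."""
    return tuple(int(e.text) for e in ET.fromstring(text).iter("r"))

def decode_binary(data: bytes) -> tuple[int, ...]:
    """Read the three integers straight out of the fixed layout."""
    return struct.unpack("<3i", data)

# Both decoders recover the same values; only the amount of work differs.
assert decode_xml(xml_form) == decode_binary(binary_form) == readings
print(len(xml_form), "bytes as XML;", len(binary_form), "bytes as binary")
```

Timing the two decoders with the standard `timeit` module gives an actual figure for a particular machine and parser. The binary decoder, however, only works for this one layout, whereas the XML parser is generic.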
Myth: XML will replace word processors
• A word processor is a program, not a data format
• XML is about semantic markup, word processors
are mostly about presentation
• Word processors may use XML in addition to
– But interchange will require an agreed DTD/schema
– Interchange will address presentation not semantics
• Word processors may enable embedding of XML
tags and become XML editors
Myth: XML will replace relational databases
• While XML can express the content of a
(relational) database, it is not an efficient format
for either query or update
– Could build a database engine based on XML but
performance will be an issue
• Use of XML may replace the use of small static
• XML has a role for interchange
– But only if a DTD/schema is agreed
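One way to read the division of labour above is that XML carries the data between parties and a database engine does the querying and updating. The sketch below assumes a made-up parts list whose element and attribute names both sides have agreed on, and uses SQLite only as a convenient stand-in for a relational database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical interchange document; sender and receiver are assumed to
# have agreed on these element and attribute names beforehand.
PARTS_XML = """
<parts>
  <part id="1"><name>bolt</name><price>0.25</price></part>
  <part id="2"><name>nut</name><price>0.10</price></part>
  <part id="3"><name>washer</name><price>0.05</price></part>
</parts>
"""

def load_parts(xml_text: str) -> sqlite3.Connection:
    """Parse the interchange file once and load it into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE part (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
    rows = [
        (int(p.get("id")), p.findtext("name"), float(p.findtext("price")))
        for p in ET.fromstring(xml_text).iter("part")
    ]
    conn.executemany("INSERT INTO part VALUES (?, ?, ?)", rows)
    return conn

conn = load_parts(PARTS_XML)
# Query and update are left to the database engine, not to the XML file.
conn.execute("UPDATE part SET price = 0.30 WHERE id = 1")
cheap = [name for (name,) in conn.execute(
    "SELECT name FROM part WHERE price < 0.20 ORDER BY name")]
print(cheap)
```

The loader works only because it knows what `part`, `name` and `price` mean, which is the agreed DTD/schema in miniature.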
Myth: XML will replace CORBA
• XML could be used as the underlying message
format (invisible to CORBA users)
– But larger and slower than current format
• Could enable interaction with non-CORBA
– Provided they were CORBA-DTD/Schema compliant
and responded according to CORBA protocol
• So not CORBA but very CORBA-aware!
• XML can be carried as CORBA payload
– but so can Shakespearean sonnets
Myth: XML-enabled applications will interwork
• Yes, to the extent of parsing an XML file
• Yes, to doing generic actions on that file
– E.g. create a database with corresponding
• After that, you need semantic knowledge of
the DTD/Schema (if any)
– E.g. update an existing database
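The gap between generic processing and semantic knowledge can be shown in a few lines. This is a sketch under assumed inputs: two invented order documents with identical structure but different date conventions. The generic action (listing element paths) treats them as the same, and nothing in the parse says which date format is meant.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def element_paths(xml_text: str) -> Counter:
    """Generic action: count every element path, with no knowledge of the DTD."""
    counts: Counter = Counter()

    def walk(elem: ET.Element, prefix: str) -> None:
        path = f"{prefix}/{elem.tag}"
        counts[path] += 1
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return counts

# Two hypothetical orders that parse equally well but use <date> differently.
ORDER_A = "<order><date>2000-03-04</date><total>10</total></order>"
ORDER_B = "<order><date>03/04/2000</date><total>10</total></order>"

paths_a = element_paths(ORDER_A)
paths_b = element_paths(ORDER_B)
# Structurally identical, so a generic tool cannot tell them apart.
print(paths_a == paths_b)
```

Whether `03/04/2000` is a day in March or in April, and what currency `total` is in, has to come from an agreement outside the file.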
What XML isn’t!
• A program of any sort
• A communications protocol
• A solution to interoperability problems
– But it can help in lots of ways
New Myths about XML!
• We need to define specialised network
protocols for it
– What’s wrong with SMTP, FTP, and HTTP?
– XML is not small enough nor fast enough!
• Driven by desire to use XML in
applications in small mobile devices
– e.g. PDAs, mobile phones
– low bandwidth and limited computation
WBXML : XML as binary
“The binary format was designed to allow for
compact transmission with no loss of functionality
or semantic information …
allowing more effective use of XML data on
narrowband communication channels …
The binary format encodes the parsed physical form
of an XML document, i.e., the structure and
content of the document entities.”
Disillusionment sets in …
“Specifically I am complaining that W3C taking years and
years to release XML schema including simple and
obvious things like data types, which are desperately
needed by small business.
… instead of meeting human needs, W3C includes an endless
progression of enhancements that make XML too large and
complex for production developers to economically
… the reason … is corruption: too many individuals in the
process are motivated to keep making XML complex,
delaying competitors uptake, and intentionally preventing
the general population from using XML for data
interoperability in business.”
Conclusions on XML
Just don’t worship it!