XML / EDI
RECOMMENDATIONS FOR STANDARDIZATION IN THE
FIELD OF XML FOR ELECTRONIC DATA INTERCHANGE
Draft CWA XML/EDI 1 v0.3
Contents
Foreword
Introduction
1 Scope
2 Normative References
3 Abbreviations
4 Recommendations
4.1 Recommendations on promoting Best Practices for XML/EDI Applications
4.2 Recommendations on Technology Issues
4.2.1 General Issues
4.2.2 Specific Recommendations
4.3 Recommendations on XML/EDI Integration Issues
Annex A (Informative):
A.1 What is XML?
A.2 What does an XML message look like?
A.3 What is EDI?
A.4 What forms do EDI messages take?
Foreword
This document is a CEN Workshop Agreement (CWA) in the XML/EDI domain. It consists of a synthesis of
the findings of two ISIS projects, produced by a voluntary XML/EDI Workshop ad hoc project team and
approved [1] by the XML/EDI Workshop membership.
These findings have resulted in the production of two CWAs:
- Recommendations and guidance on the use of XML for electronic data interchange
- Recommendations for standardization in the field of XML for electronic data interchange (this document)
Broadly speaking, the objectives of these projects were to promote the application of XML/EDI for electronic
commerce in the business environment by:
- Validating the use of W3C's XML specification for the electronic interchange of business data in the
  transport and healthcare sectors
- Demonstrating the applicability of the XML/EDI methodologies, tools and systems in user-driven pilot
  trials in the selected industry and public administration sectors
- Investigating the overall requirements for XML/EDI tools from European users of EDI
- Recommending best practices for mapping existing EDI applications to XML which can be used by other
  industrial sectors to facilitate the rapid deployment of XML/EDI
- Informing standardization bodies and the user community about the findings of the projects
- Contributing to standardization work and trade facilitation in the field of XML/EDI
The two projects were part-funded by the European Commission and part-funded or conducted by the following
organisations:
XML/EDI Pilot Project (www.tieke.fi/isis-xmledi):
Project Coordinator Finnish Information Technology Development Centre (TIEKE)
Nederlandse Organisatie voor Toegepast Natuurwetenschappelijk Onderzoek (TNO)
University Hospital of Giessen Institute of Medical Informatics
The Clinical Information Consultancy (CIC)
Communications Planning (CPL)
The SGML Centre
UK National Health Service
UK Royal College of General Practitioners
European Commission Enterprise Directorate-General (ISIS programme)
EXPERTS Project (www.ilc.at/experts.htm)
Project Coordinator Info Consult
Partners/Funders Interned Electronische Communicatie
University of Sunderland, United Kingdom
Electronic Commerce Platform Nederland (ECP.NL)
European Commission Enterprise Directorate-General (ISIS programme)
XML/EDI Workshop voluntary review team:
Stuart Campbell (CMASS)
Alain Dechamps (CEN/ISSS)
Martin Bryan (The SGML Centre - XML/EDI Pilot Project)
Gait Boxman (TIE - Experts project)
[1] Assuming approval of this draft.
Introduction
This document provides recommendations and guidance for the use of XML for electronic data interchange,
in other words for XML/EDI.
XML/EDI is a combination of two technologies, XML (eXtensible Markup Language) and EDI (Electronic Data
Interchange), that promises to be the catalyst for electronic commerce. XML is closely related to the web
language HTML; in fact it is a subset of SGML, the mother of HTML. The main difference between XML and
HTML is that XML is extensible. It provides syntax for storing data. This data can then be represented in a
way that best suits the needs of the receiver. EDI is the inter-organisational, computer-to-computer exchange
of business documentation in a standard, machine-processable format (more detail in informative Annex A).
Most EDI messages are based on either ANSI X12 (the US standard) or UN/EDIFACT (the global standard).
So much effort has gone into creating the UN/EDIFACT UNSMs (United Nations Standard Messages) that
they have become an invaluable data model of the real world, applicable in virtually every business
transaction, in every sector, in every country. Despite all this effort, the adoption of the standards hasn’t been
as great as one would expect. One of the reasons for this is the environment in which the messages are
XML/EDI aims to take the know-how of business processes captured in EDI messages and place it in
a Web environment, where the same file can be viewed by a user or processed by an application.
Rather than having HTML for a user and UN/EDIFACT for an application, with XML/EDI you can have an EDI
message, for example a UN/EDIFACT orders message, written in an XML format; it is therefore presentable
to the application just like the EDI file. The same file uses embedded templates and rules that explain
how it should be displayed to a user, so it can also be viewed through a browser.
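As an illustration of one file serving both audiences, the following minimal sketch reads a small order message once and derives both an application value and a browser view from it. The element and attribute names are assumptions made for this example only; they are not taken from any UN/EDIFACT-to-XML mapping, and Python functions stand in for the templates and rules that a real XML/EDI file would carry.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML rendering of a purchase order; the names are illustrative.
ORDER_XML = """\
<Order number="PO-1001">
  <Buyer>Example Buyer Ltd</Buyer>
  <Line item="A-17" quantity="3" unitPrice="9.50"/>
  <Line item="B-02" quantity="10" unitPrice="1.25"/>
</Order>
"""


def order_total(order):
    """Application view: compute the order value from the structured data."""
    return sum(
        int(line.get("quantity")) * float(line.get("unitPrice"))
        for line in order.iter("Line")
    )


def order_as_html(order):
    """Human view: render the same data as an HTML table for a browser."""
    rows = "".join(
        f"<tr><td>{escape(line.get('item'))}</td>"
        f"<td>{escape(line.get('quantity'))}</td></tr>"
        for line in order.iter("Line")
    )
    return f"<h1>Order {escape(order.get('number'))}</h1><table>{rows}</table>"


order = ET.fromstring(ORDER_XML)
print(order_total(order))
print(order_as_html(order))
```

In an XML/EDI setting the display rules would normally be expressed as a stylesheet travelling with, or referenced by, the message rather than as program code.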
The idea behind the whole initiative is that, in a workflow application, one can send an EDI message
in an XML format to a supplier, who can then pick it up and present it through a browser to the person
in charge of approving incoming orders. That person can add data to the incoming order that
signals approval; this could, for instance, be a digital signature. The approved order can then be sent into
the supplier’s application as a plain EDI file. In this way a message created by a human or an application
can travel through an organisation, be sent to another organisation, and switch between humans and
applications along the way. At each step the XML file grows larger as new or updated information is added,
and as such it is almost a foundation for a workflow application.
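The approval step in this workflow can be sketched as follows. This is a minimal illustration under stated assumptions: the Approval element and its approver attribute are hypothetical names, and the SHA-256 digest is only a stand-in for a digital signature, which would require keys and an agreed signature format.

```python
import hashlib
import xml.etree.ElementTree as ET


def approve(order, approver):
    """Append a hypothetical Approval element recording who approved the order.

    The digest of the order as received is a placeholder for a real
    digital signature; it provides no authentication by itself.
    """
    digest = hashlib.sha256(ET.tostring(order)).hexdigest()
    approval = ET.SubElement(order, "Approval", approver=approver)
    approval.text = digest
    return order


order = ET.fromstring(
    '<Order number="PO-1001"><Line item="A-17" quantity="3"/></Order>'
)
size_before = len(ET.tostring(order))
approve(order, "j.smith")
size_after = len(ET.tostring(order))
print(size_before, size_after)
```

The message grows at each step because the added data travels with it, which is the property that makes the file usable as a record of the workflow.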
To really benefit from the advantages of XML/EDI, it should be placed in a framework of technologies. This
framework consists of XML, EDI, templates, a repository and agents. Another important issue is to define a
standard way of defining existing EDI messages in XML.
1 Scope
The present document identifies Recommendations for standardization in the field of XML for electronic data
interchange.
The present document is particularly applicable to those parties at which the individual recommendations are
targeted, and to any other organisation that has influence or activity in these fields. This includes:
This document also provides an overview of the background to XML in an electronic data interchange
context. This document is not intended to describe the precise standards, their use, or implementation
information. However, further information is available via the reference documents, the project websites as
well as numerous other web resources.
2 Normative References
CWA XML/EDI 2: Recommendations and Guidance on the use of XML for electronic data interchange
XML/EDI Pilot Project:
Public deliverables of the ISIS XML/EDI pilot project and references therein:
D1: XML Document Type Definitions for selected messages
D2: Best Practices for Creating XML/EDI DTDs
D3: XML/EDI User Interface
D4: XML/EDI Datatype Validation Module
D5: Using XSL Style Sheets to display XML/EDI messages
D6: XML/EDI Action Control Module
D7: Web site for promoting XML/EDI within Europe
D8: Recommendations for the Standardization of XML/EDI
D9: Using XML for Electronic Data Interchange
D11: Dissemination event (organised by ISIS XML/EDI and EXPERTS projects)
D1: Final Report for Standardization and Trade Facilitation Bodies
3 Abbreviations
CEN: European Committee for Standardization
CWA: CEN Workshop Agreement
EDI: Electronic Data Interchange
ISIS: Information Society Initiative in Standardization
ISSS: Information Society Standardization System
XML: eXtensible Markup Language
4 Recommendations
During 1999 significant progress was made in harmonising the work of the UN/EDIFACT and ANSI X12
communities with that being undertaken by the XML community in related areas. In addition to a greatly
increased awareness of the benefits of XML within the EDI community, there has been a greatly increased
awareness within the XML community of the benefits of creating or adopting a unified set of semantics for the
description of business processes. Whereas at the beginning of the year there were a number of initiatives
that seemed to be working in relative isolation, by the end of the year there was a noticeable increase in
recognition of the need for semantic harmonization to complement the syntactical harmonization introduced
by XML. Semantic harmonization is key to future development of XML/EDI. This will be accelerated by the
agreement, during 1999, of the vendor-led Organization for the Advancement of Structured Information
Standards (OASIS) and UN/CEFACT, together with other related initiatives, to set up a joint initiative known
as ebXML. The first ebXML meeting was held in November 1999. This initiative has brought together all
significant developers of XML-based electronic commerce/business solutions with leading members of
traditional EDI communities such as ANSI X12 and UN/EDIFACT. It resulted in the setting up of seven
working groups that will, during 2000, explore how best to harmonise XML and EDI activities in the wider
context of e-business.
The following diagram illustrates, in a simplified form, the XML/EDI standardization environment as it exists at
the end of 1999:
The recommendations in this CWA have been divided into three sectors:
Recommendations on promoting Best Practices for XML/EDI Applications
Recommendations on Technology Issues
Recommendations on XML/EDI Integration Issues
These three sets of recommendations are targeted principally at the organizations identified in the above
illustration under the relevant headings, but they should not be taken as only being of relevance to these
bodies. Our recommendations are intended to improve the harmonization of these efforts, and should be
viewed as an integrated set of recommendations that all parties involved in the promotion and development
of XML/EDI should be addressing to the best of their abilities.
Our recommendations have a European slant to them. Whilst our findings will apply to the global multilingual
and multicultural business community in general, they are of specific interest to European institutions and
bodies such as the European Commission at a policy level and CEN/ISSS at a technical harmonization level.
4.1 Recommendations on promoting Best Practices for XML/EDI Applications
The following recommendations on the further validation of the guidelines produced by the pilot projects are
made:
1. To ensure that these guidelines have wide applicability they should be tested in different
sectors, by people other than the original developers of the guidelines.
The European Commission should consider co-funding projects for the development of XML/EDI
applications in new sectors based on the current guidelines so that they can be further refined within
the CEN/ISSS XML/EDI Workshop. Newly established projects in the XML/EDI area should be
encouraged to analyse the current guidelines as a starting point of the project work.
2. Where appropriate, national bodies responsible for the standardization of EDI applications
should encourage testing of the guidelines by those organizations interested in adopting
The national standards bodies have a vital role in facilitating the implementation of XML/EDI at a
“local” level and in co-ordinating feedback from local projects for the enhancement of the guidelines
and establishing a Europe-wide consensus.
The current guidelines are based solely on the use of XML Document Type Definitions (DTDs) as these are
the only form of applicable data management mechanism that was available to the project team. During the
course of the project the first working drafts of the XML Schema specification were published. It is clear that
XML Schemas offer functionality required by XML/EDI that is not available with standard DTDs. In particular
the role of "types" to define reusable "classes" of element types is expected to be key to the development of
future XML/EDI applications. The consortium therefore recommends that:
3. As new members of the XML family of standards reach a stable stage, their implications for
XML/EDI should be reviewed and further guidelines issued as to the role they should be
expected to play within business applications.
In particular, once the XML Schema specification reaches a stage where it is relatively stable, and
there are tools available to support the use of schemas, a project should be set up to evaluate the
limitations of schemas for managing XML/EDI data sets.
While the pilot projects were able to test and prove the applicability of XSLT and XML Paths within
XML/EDI applications, they were not able to evaluate many other emerging specifications that might
be of use to businesses at a later date. Further work is needed to evaluate the role that XML
Pointers, XML Links and XML Queries might have in XML/EDI applications, or the impact of the XML
Signatures proposal that was published during November 1999. In general, it is important that there
is a continuous and coordinated activity to identify further standards areas in which enhancements
are needed to meet the needs of European businesses.
4.2 Recommendations on Technology Issues
4.2.1 General Issues
Technology issues relating to XML/EDI fall in the purview of the World Wide Web Consortium's XML
Coordination Group. This committee has a specific brief to harmonize the efforts of the various working
groups on XML and related standards. As such it does an excellent job of resolving differences between the
Recommendations relating to missing functionality within the XML family of standards should be addressed to
the XML Coordination Group in the form of a statement of requirements. The Coordination Group ensures
that the requirements are directed at the relevant working groups.
To ensure harmonization of efforts, there should be a process whereby XML/EDI requirements identified as
being required by a particular application can be assessed for their applicability to other processes. At a
European level it would seem natural that the CEN/ISSS XML/EDI Workshop should be the place to discuss
requirements statements, and reach consensus, before they are submitted to the Coordination Group. This
would ensure an effective and optimal European input to international standardization. By the same token, it
is vital that the XML/EDI Workshop attracts the active contributions and support of all relevant constituencies
For some years now the European Commission has been actively supporting the development of European
centres of excellence within the W3C framework. This support should now be extended to XML/EDI, on a
case-by-case basis in support of projects which meet genuine industry requirements. In addition, the
European Commission should ensure that there is a supporting mechanism to facilitate active European
representation at international standardization meetings, including those organized by W3C.
4.2.2 Specific Recommendations
The ISIS European XML/EDI Pilot Project has identified the following areas where further work is needed on
the XML family of standards:
4. The Extensible Stylesheet Language (XSL) should contain functionality that can be used to
create an XML/EDI message.
There are two parts to the XSL specification:
An XSL Transformation Language (XSLT) that allows XML data streams to be combined and
converted into presentable or otherwise processable formats
A set of XSL Formatting Objects that can be used to describe the way in which data is to be
presented to users on different types of media (including paper and audio presentation).
Whilst XSLT is complete, work is still ongoing to define the XSL Formatting Objects in a way that can
be aligned with the next level of the W3C Cascading Style Sheet specification (CSS3).
At present the best that can be achieved in terms of using XML to capture data is through the
transformation of an XML document into an HTML form using the completed XSLT specification.
HTML forms are limited in the functionality they provide. In particular they rely on submission of
captured data to a server in an unstructured manner, using sets of name/value pairs that cannot be
transferred to the server in a secure manner, using encryption, public keys, etc.
In September 1999 the W3C’s HTML Working Group issued a statement of requirements for
increased functionality within HTML forms in a document entitled XHTML Extended Forms
Requirements. (XHTML is an XML representation of HTML.) This specification includes the
following requirement:
"2.4.4 Preserving the current state of a form
There needs to be a generalised way of preserving the changes the user has made to a form.
This will make it possible for a user to save the form, and at a later time, to resume filling it out.
The ability to treat forms as persistent objects encapsulating state and behaviour is needed for
workflow applications where forms are passed from one user to another."
Note: This might lead to a new submit method "save", where the entire form together with the
modifications made by the user is submitted back to the server. Apart from other benefits, this
could be a simple mechanism of editing and updating XHTML documents over the web within
This requirements document, however, does not mention the need for the contents of the form to be
submittable as a structured XML message, or for a local copy of the submitted contents to be
recorded for audit and related purposes. These two aspects are vital to the adoption of XHTML forms
within XML/EDI environments.
Ideally XSL formatting objects should allow a user to create XML messages at the client side as well
as at the server side, rather than being based solely on XHTML forms.
An alternative would be to allow XML Path statements to be used to identify where the data entered
within a specific form should be located within a predefined message template. XSLT could then be
used to transform the current DOM into the required messaging format when the document was
cleared for submission.
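The alternative described above, in which path statements identify where each form value belongs in a predefined message template, can be sketched as follows. The template, the field names and the bindings are all hypothetical, and the paths use only the XPath subset supported by Python's ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical message template with empty slots for captured data.
TEMPLATE = (
    "<Order><Buyer><Name/><City/></Buyer>"
    "<Delivery><Date/></Delivery></Order>"
)

# Hypothetical binding of form field names to locations in the template.
FIELD_PATHS = {
    "buyer_name": "./Buyer/Name",
    "buyer_city": "./Buyer/City",
    "delivery_date": "./Delivery/Date",
}


def fill_template(form_values):
    """Place each name/value pair at the location its path identifies."""
    message = ET.fromstring(TEMPLATE)
    for field, value in form_values.items():
        message.find(FIELD_PATHS[field]).text = value
    return message


message = fill_template(
    {
        "buyer_name": "Example Buyer Ltd",
        "buyer_city": "Helsinki",
        "delivery_date": "2000-03-01",
    }
)
print(ET.tostring(message, encoding="unicode"))
```

The result is a structured XML message rather than a flat set of name/value pairs, and a copy of it can be kept locally for audit purposes before submission.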
5. It should be possible to pass parts of XML/EDI messages to different display windows.
It should be possible to manage the process of displaying each output in a separate window on the
screen. This functionality should include a declarative statement of window size and position
expressed in either absolute or relative terms, together with identification of which subtree of the
source data is to be displayed, and at which point in the window’s content it is to be inserted. Where
several windows are related to different parts of the same source tree, they should be directly linked
to a single source DOM, rather than to a separate DOM for each window, so that any changes made
to the source from user input to one window are reflected in all other windows addressing the same
There is a general requirement that each node in a displayed tree be traceable back to its position
within the source tree so that the context of separately processed subtrees can be properly
6. The XSL Transformation language should allow the creation and permanent storage of
multiple XML outputs from a single XML input.
Whilst XSLT allows multiple input files to be combined to create a single presentable object, it at
present only allows a single output object to be created. Various XML/EDI applications require that
parts or all of a received message be passed to different processes. At present this can only be
achieved using multiple transformations of the same data source. This is inefficient and should not
be necessary. It should be possible to create and store multiple result trees from a single
Whilst there are dangers in providing the ability to store information that has been subsetted from a
message, the ability to store the results of an XSLT transformation is likely to be important for both
auditing and data capture purposes. Business applications need to be able to record what has been
captured, and to be able to send copies to different sources, some of which are restricted in their
accessibility (for example, healthcare records need to have multiple, nested, levels of access
control). The ability to create specific subsets of a data capture event for sending to different
associated applications from a single XSLT specification would preclude the possibility of failure of a
process chain to create all the messages needed as part of a data capture event.
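The requirement can be illustrated outside XSLT. The following minimal sketch, written in Python rather than in the transformation language itself, builds several result trees from one pass over a single source. The record structure and the access attribute are assumptions made for this example.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical record; the "access" attribute is invented for this sketch.
RECORD = """\
<Encounter id="E1">
  <Patient>P-42</Patient>
  <Diagnosis access="clinical">J45</Diagnosis>
  <Charge access="billing">120.00</Charge>
  <Note access="clinical">Review in 6 weeks</Note>
</Encounter>
"""


def split_by_access(source):
    """Build one result tree per access level from a single source tree."""
    # Elements without an access level are shared by every output.
    shared = [child for child in source if child.get("access") is None]
    outputs = {}
    for child in source:
        level = child.get("access")
        if level is None:
            continue
        if level not in outputs:
            tree = ET.Element(source.tag, dict(source.attrib))
            tree.extend(copy.deepcopy(shared))
            outputs[level] = tree
        outputs[level].append(copy.deepcopy(child))
    return outputs


outputs = split_by_access(ET.fromstring(RECORD))
for level, tree in sorted(outputs.items()):
    print(level, [child.tag for child in tree])
```

Because all outputs come from the same call, either every subset is produced or none is, which is the guarantee that a chain of separate transformations cannot give.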
7. There should be some mechanism for specifying a sequence of actions that needs to be
performed for a particular application.
In many applications there is a need to perform a series of transformations during which the output of
one transformation may need to form part or all of the input of a subsequent process. Not all of these
processes will involve XSLT directly. For example, it may be necessary to make a call to a database
so that the result can be included in the resulting document.
Sometimes transformations may need to be "triggered" by specific events in the user interface, such
as the selection of a Submit button or the change of a particular field or button.
It should be possible to specify, using XML, the sequence in which a set of related actions should be
performed. The mechanism should allow actions to be nested so that the result of one action can
form a component of another action.
An example of the type of functionality that an action language could provide is given in the ISIS
XML/EDI pilot project document entitled XML/EDI Action Control Module
(http://www.tieke.fi/isis-xmledi/deliver/d6.doc). This action control is an application of XSLT.
Any generalized solution to action control should be based on the use of XSLT to integrate XML data
streams, one of which defines the sequence of actions required, together with some form of state
control message that can be referenced by events.
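A sequence of nested actions specified in XML might look like the following minimal sketch. The action vocabulary is invented for this example and is not the syntax of the pilot project's Action Control Module; a small Python interpreter stands in for the XSLT-based processing recommended above, and a dictionary stands in for a database call.

```python
import xml.etree.ElementTree as ET

# Hypothetical action vocabulary: each action consumes the results of the
# actions and values nested inside it.
ACTIONS = """\
<sequence>
  <action name="upper">
    <action name="concat">
      <value>order-</value>
      <action name="lookup"><value>customer</value></action>
    </action>
  </action>
</sequence>
"""

DATABASE = {"customer": "acme"}  # stand-in for a database call


def run(node):
    """Evaluate a node after evaluating its nested actions, in document order."""
    if node.tag == "value":
        return node.text
    results = [run(child) for child in node]
    if node.tag == "sequence":
        return results
    name = node.get("name")
    if name == "concat":
        return "".join(results)
    if name == "upper":
        return results[0].upper()
    if name == "lookup":
        return DATABASE[results[0]]
    raise ValueError(f"unknown action: {name}")


results = run(ET.fromstring(ACTIONS))
print(results)
```

Nesting expresses the dependency between actions directly: the lookup result becomes part of the input to the concatenation, whose result is in turn the input to the outer action.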
8. There should be some mechanism of packaging images and other forms of non-XML data
within an XML message so that a user does not need to rely on having Internet access to
obtain all images associated with a message.
Whilst non-XML data can be embedded within XML files using CDATA sections within elements
associated with a named notation, this is not efficient for handling images that need to be referenced
from more than one point in the data stream. Reusable images should be stored and referenced as
external unparsed entities. This feature is particularly important within the healthcare environment,
where medical and other images must accompany medical records, together with appropriate
attestation of both image and other associated record information. It must be possible to ensure that
the images associated with a particular message stream can be exchanged with the message
without the possibility of the two data sources becoming separated.
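One possible way of keeping images and message together is sketched below. It is an assumption of this example, not the mechanism recommended above: each image is embedded once as base64 text inside the message and referenced by identifier from several points, instead of being declared as an external unparsed entity. The Attachments and Image element names are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET


def package(message, images):
    """Embed each image once, as base64 text, in an Attachments element."""
    attachments = ET.SubElement(message, "Attachments")
    for image_id, data in images.items():
        item = ET.SubElement(attachments, "Image", id=image_id)
        item.text = base64.b64encode(data).decode("ascii")
    return message


def image_bytes(message, image_id):
    """Recover the original bytes of the image with the given identifier."""
    for item in message.iter("Image"):
        if item.get("id") == image_id:
            return base64.b64decode(item.text)
    raise KeyError(image_id)


# Two elements reference the same image; it is stored only once.
message = ET.fromstring(
    '<Record><Finding image="xray1"/><Summary image="xray1"/></Record>'
)
package(message, {"xray1": b"\x89PNG placeholder bytes"})

wire = ET.tostring(message)      # what would be exchanged
received = ET.fromstring(wire)   # what the receiver parses
print(image_bytes(received, "xray1"))
```

The trade-off is size: base64 text is roughly one third larger than the binary data, but the receiver needs no network access and the image cannot become separated from the record.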
9. It should be possible to identify the scope within which an identifier is unique.
Not all identifiers need to be unique within the scope of a document. The following additional levels of
uniqueness have been identified as being useful within XML/EDI applications:
Unique among siblings (all subelements of a given element)
Unique among like siblings (all siblings of the same element type)
Unique on the sending system
It should be possible to specify the scope of a particular identifier, ideally by use of an XML Path
statement that identifies the context within which the identifier is unique.
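The levels of uniqueness listed above can be checked with a few lines of code. The following is a minimal sketch in which the message structure and the id attribute are assumptions, and the scope is expressed in the XPath subset supported by Python's ElementTree module.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical message with identifiers reused at different levels.
DOC = """\
<Message>
  <Party id="1"><Contact id="1"/><Address id="2"/></Party>
  <Party id="2"><Contact id="1"/><Contact id="1"/></Party>
</Message>
"""


def duplicates(elements):
    """Return the identifiers that occur more than once among the elements."""
    counts = Counter(e.get("id") for e in elements if e.get("id") is not None)
    return sorted(i for i, n in counts.items() if n > 1)


def unique_among_siblings(parent):
    return not duplicates(list(parent))


def unique_among_like_siblings(parent, tag):
    return not duplicates(parent.findall(tag))


def unique_in_scope(root, path):
    """Check uniqueness over the node set selected by a path expression."""
    return not duplicates(root.findall(path))


root = ET.fromstring(DOC)
first, second = root.findall("Party")
print(unique_among_siblings(first))    # identifiers 1 and 2
print(unique_among_siblings(second))   # identifier 1 used twice
print(unique_in_scope(root, "./Party"))
print(unique_in_scope(root, ".//Contact"))
```

Note that the identifiers in this message are not unique within the document as a whole, yet they satisfy the narrower scopes that some applications actually need.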
NOTE: The 17 December 1999 Working Draft of XML Schema does address the issue of uniqueness
constraints on identifiers in some depth, introducing the concepts of “unique, key and key reference
constraints”. Its proposals go beyond what is proposed here by allowing element content, and any
named attribute, or indeed a combination of multiple attributes or element contents, to serve as
unique identifiers. On the other hand, an “Issue” is raised in this Working Draft, to the effect that “The
XPaths in selector and field should be restricted to certain specified simple forms.” It is suggested
that “unique in the document”, “unique among siblings” and “unique among siblings of the same
element type” constraints would form a candidate minimal set of specified simple forms. However, it
would not be appropriate to see the use of XPaths to specify a range over which uniqueness
constraints apply restricted too much. Indeed it can be a requirement in EDI applications to allow
identifiers to be constrained to be unique over a defined subset of the elements in multiple
documents. This requirement goes beyond even the capabilities of full XPath. One possible
solution would be to require all XML Schema processors to be able to validate data against a
specified “minimal” set of uniqueness constraints, but in addition to specify a mechanism whereby
applications that support full XPath, and perhaps also XLink and XSLT in addition to XML Schema,
can specify a constraint on the uniqueness of identifiers over the node sets identified by any XPath
statement or XLink structure, or generated through an XSLT transformation. It is recommended that
the XML Schema Working Group consider addressing this type of requirement in its next Working
Draft.
During 2000 some revised XML-based specifications have been using XML namespaces to qualify
attribute values. This technique may be relevant for the identification of objects with respect to a
clearly identified originating system. This mechanism might also be useful for uniqueness checking
across distributed systems.
10. It should be possible to sign parts of XML documents in a nested way such that access to the
inner layer is dependent on permission to use both outer and inner keys.
Portions of medical records, like many other types of commercial records, need to be approved by
specialists at specific times in the process, in a manner that allows users of the data to be assured
that the data has not changed since it was entered. At the same time parts of the records must only
be accessible on a "need-to-know" basis, where the relevant keys are only available to staff with a
relevant level of clearance. Typically such signatures need to encompass sets of signed records, but
not the complete record.
While it is anticipated that the recently published XML-Signatures Requirements document will lead
to a specification which will, on completion, provide much of the required functionality, tests will be
needed to ensure that adequate control mechanisms are in place to handle the complex
requirements of applications of the type encountered in healthcare applications.
4.3 Recommendations on XML/EDI Integration Issues
The plans of OASIS to set up a neutral repository of XML DTDs and schemas will play a key part in the
integration of XML/EDI applications. The following recommendations are made to those working on the
harmonization of semantics for XML/EDI, especially those involved in the ebXML initiative and other bodies
involved in repositories for the storage of such semantics. For the detailed requirements/implementation
information connected with these recommendations please see CWA XML/EDI 2 - "Recommendations and
Guidance on the use of XML for electronic data interchange".
11. Efforts should be made to harmonise semantic sets, with the long-term aim of producing a
single set of semantics that can be applied across a wide range of applications. An XML DTD
to formally describe interchangeable semantic descriptions needs to be developed.
UN/EDIFACT directories record a wide range of data elements that have been used for commercial
transactions for a number of years. Over time, however, these data elements have come to include
multiple mechanisms for expressing the same semantic in different contexts. In XML/EDI
applications there should only be need for one semantic that can be used in many different contexts.
The ISO BSR has attempted to reduce the amount of overlap within the EDI semantic sets, and to
harmonize these semantics with those used in other communities such as the product data and
engineering communities. During this process it has identified a number of discrepancies in the way
that different communities identify different components, and has introduced the concept of aliasing
synonyms to the semantic repository. It should be an essential feature of XML/EDI systems that
these systems should be able to apply locally meaningful synonyms to agreed semantics. With the
availability of XSL Transformations, it is now possible to map automatically from local synonyms to
generic semantic descriptors.
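The mapping from local synonyms to generic descriptors can be sketched as a simple renaming step. In the sketch below Python stands in for an XSL Transformation, and the alias table and element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical alias table: local element names -> agreed generic names.
SYNONYMS = {"Postcode": "PostalCode", "ZIP": "PostalCode", "Town": "City"}


def to_generic(element):
    """Rename local synonyms in place, leaving unknown names unchanged."""
    for node in element.iter():
        node.tag = SYNONYMS.get(node.tag, node.tag)
    return element


local = ET.fromstring(
    "<Address><Town>Leeds</Town><Postcode>LS1 1AA</Postcode></Address>"
)
generic = to_generic(local)
print(ET.tostring(generic, encoding="unicode"))
```

Keeping the alias table as data rather than code means that each community can maintain its own synonyms while exchanging messages in the agreed vocabulary.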
A wide range of initiatives have developed their own sets of semantics for small subsets of business
applications over the last year. It is essential that these initiatives be encouraged to harmonize their
efforts with any general solution, if only by creating an XSL Transformation that will convert their local
message format to a standardized one.
One mechanism that could help harmonization of semantic sets would be to develop a standard
method for describing them. The ISO 11179 Data element specification standard, which is currently
undergoing revision and extension, provides a basis for such harmonization as many existing
repositories are based around this standard. Unfortunately different groups use different subsets of
the specification, and there is no standard mechanism for exchanging data between systems. The
development of an XML DTD to formally describe interchangeable semantic descriptions could be a
useful step in the harmonization process.
12. A unified method for querying different semantics repositories needs to be developed.
Whilst different authorities maintain their semantic repositories in different forms, it will be impossible
for users to identify the best DTD or schema for their purposes, or to develop new DTDs or schemas
based on components that have already been formally described.
The future XML Query specification could form the basis of querying semantic repositories, but at
present the functionality to be provided by this standard is unclear, as this W3C activity is currently
still at the requirements-definition stage.
If repositories made their semantics available in an agreed XML format, it would be possible to use
the XML Pointer Language (XPointer) to reference any semantic from within any XML document,
XSL stylesheet or XML schema.
Even if both of these approaches were adopted, there would still need to be an API that describes the
handshaking needed to determine which form queried data returned to the enquiring system should
13. It should be possible for developers of new applications to obtain information about related
sets of elements at a level below that of a complete message.
Present repositories are either designed to return information about all of the data elements that form
a message, or about only one component of the message. When developing a new form of
message, developers need to be able to identify all related components of a message. For example,
when defining an address, developers need to identify all elements that can legitimately form part of an
address so that users can select which subset is needed for their application.
It should be possible to request information about an element and all of its permitted descendants in
a single request.
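A single request of this kind might behave like the following sketch. The repository contents, the element names and the function name are all hypothetical; a real repository would define its own content models and interface.

```python
# Hypothetical repository content: each element name maps to the child
# elements it is permitted to contain.
REPOSITORY = {
    "Address": ["Name", "AddressLine", "PostCode", "Country"],
    "Country": ["CountryCode", "CountryName"],
}

def permitted_descendants(name, repository=REPOSITORY):
    """Return the element followed by all its permitted descendants, depth first."""
    result = []
    seen = set()

    def visit(current):
        if current in seen:  # guard against recursive content models
            return
        seen.add(current)
        result.append(current)
        for child in repository.get(current, []):
            visit(child)

    visit(name)
    return result

address_elements = permitted_descendants("Address")
```

A developer could then present `address_elements` to users so that they can select the subset needed for their application.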
It is anticipated that this function will be particularly relevant once XML Schemas are adopted in place
of XML DTDs, and "types" are used to define classes of elements. In this case it will be important to
identify which element sets conform to which types.
14. There should be a simple mechanism through which developers of new business
applications can indicate that they have used an existing semantic in their applications that
can be seen by other users of the semantic.
Whilst a particular DTD can contain an XML Link/Pointer to the semantic that has been used within
an application, this pointer cannot be seen by existing or future users of the same semantic. Each
use of a semantic should be recorded within the relevant repository. XML extended links provide a
mechanism for doing this, but an API needs to be developed that will allow application developers to
inform semantic repositories of which pointers need to be added to which link sets within the
15. Users should be able to submit proposals for new entries to be stored in the repository; the
addition of these entries to the repository should be subject to technical evaluation and the
relevant registry maintenance rules.
To ensure repository quality, modification of the repository content should be subject to clearly
defined rules. Nevertheless, the repository content must be based upon clearly stated user
16. Users should be able to download portions of repositories for conversion and integration into
their own environment.
In a world with ample bandwidth there would be no need to create local copies of an on-line resource,
but at present accessibility to repositories cannot be guaranteed at all times. It should,
therefore, be possible to download the repository into local databases using relevant data formats.
Users should be able to select the repository columns needed and receive them through various
17. Users should be able to subscribe for update notification of (portions of) repositories.
A facility should be created through which users who identify themselves can receive automatic
updates or update notifications from the repository.
18. Names of semantic components should be context-independent and should be designed to
be concatenated with the names of parent elements to provide a logical context within a
Within XML applications, references to a particular component of a message will be made using
XML Paths. Such paths will also form the basis of the XML Query Language, which will provide a
standardized way of interrogating databases of XML-encoded data. In their most verbose format
such XML Paths can identify the complete parentage of a message component, e.g.
In describing tags (elements or attributes) the simplest possible name will be the most reusable. Tags
that include in their name or definition information that is specific to the context in which they must be
used will be less reusable than more generalised tags.
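The way context-independent names combine with parent names can be sketched in Python. The message below is hypothetical (only Order and Name appear in this report's examples), and the slash-separated strings are a simplified illustration of a path, not a full XML Path expression.

```python
import xml.etree.ElementTree as ET

def element_paths(root):
    """List the full parentage of every element as a slash-separated path."""
    paths = []

    def walk(element, prefix):
        path = f"{prefix}/{element.tag}"
        paths.append(path)
        for child in element:
            walk(child, path)

    walk(root, "")
    return paths

# Hypothetical message: the same generic Name element is reused in two contexts.
message = ET.fromstring(
    "<Order><Customer><Name>The Village Store</Name></Customer>"
    "<Supplier><Name>Hypothetical Supplier</Name></Supplier></Order>"
)
paths = element_paths(message)
```

Because the context is supplied by the path, a single Name definition serves both the customer and the supplier; a context-specific name such as CustomerName could not be reused in this way.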
Annex A (Informative):
This document aims to answer the following questions:
What is XML?
What does an XML message look like?
What is EDI?
What forms do EDI messages take?
A.1 What is XML?
The Extensible Markup Language (XML) is a World Wide Web Consortium (W3C) Recommendation for
marking up data in ways that reflect its meaning rather than its presentation. In this way, it differs from the
HyperText Markup Language (HTML), whose markup is specifically related to the presentation of information
in a browser. Whilst designed initially for the display of documentation distributed via the World Wide Web
(WWW), XML has since been widely adopted as a means of interchanging information between computer
XML is a simple dialect of the Standard Generalized Markup Language (SGML) defined in ISO Standard
8879. SGML was designed in the 1980s as a tool to enable technical documentation and other forms of
publishable data to be interchanged between authors, publishers and those responsible for the production of
printed copies of data sets. By providing a formal definition of the component parts of published information
sets, SGML made it possible to verify the correct transmission and receipt of interchanged data sets.
However, SGML only defines the syntax of the message. It does not define any semantics for the information
objects in the message, only the relationships between them.
An XML document instance must be created and stored as a set of properly nested data storage entities,
each of which is made up of a number of logical elements which contain data or define processes to be
performed. The outermost storage entity is referred to as the document entity: it contains both the start and
the end of the root or document element of the document instance. Elements can be nested to create
hierarchies (information trees). Elements can be assigned attributes (properties) which indicate how the
contents of the element should be interpreted.
Each XML element starts with a named start-tag and ends with an end-tag with a matching name. Outward
pointing angle brackets are used to delimit these markup tags (e.g. <title>). An end-tag is distinguished from
a start-tag by having a slash immediately preceding the name (e.g. </title>). Elements that have no contents
are distinguished by having a slash immediately after the name in the start-tag to indicate that the end-tag
has been omitted (e.g. <image/>). Because each element of an XML document has clearly marked limits, it is
easy to determine when its contents have been received over a network.
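The tag rules above can be checked with any XML parser. The following minimal sketch uses Python's standard library; the document content is invented for illustration.

```python
import xml.etree.ElementTree as ET

# A root element containing one element with content and one empty element.
document = "<article><title>Party Supplies</title><image/></article>"
root = ET.fromstring(document)

title = root.find("title")
image = root.find("image")
child_names = [child.tag for child in root]

# An end-tag whose name does not match its start-tag is rejected by the parser.
try:
    ET.fromstring("<title>Unclosed</titel>")
    mismatch_accepted = True
except ET.ParseError:
    mismatch_accepted = False
```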
Attributes of an XML element are defined as part of its start-tag (e.g. <Order type="production">). Each XML
attribute must be fully defined, with the attribute name followed by a value indicator (=) and a quote delimited
string containing the attribute value. Attributes can be assigned a default value if an attribute list declaration is
associated with the formal declaration for the element in the document type declaration (see below).
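As a minimal sketch of how an application reads attributes, the Python fragment below parses the start-tag used in the example above. The Name child is added only to make a complete document, and the helper function name is hypothetical.

```python
import xml.etree.ElementTree as ET

order = ET.fromstring('<Order type="production"><Name>The Village Store</Name></Order>')
order_type = order.get("type")       # value of the specified attribute
all_attributes = dict(order.attrib)  # every attribute given on the start-tag

def is_well_formed(text):
    """Report whether the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# An attribute value without quote delimiters is not fully defined.
unquoted_ok = is_well_formed("<Order type=production/>")
```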
Parts of an XML document instance can be stored in separate files that will be referenced as external text
entities. Alternatively internal text entities can be used to define the replacement text for an entity reference.
For example, addition of an entity declaration of the form <!ENTITY company "The Markup Centre"> to a
document type declaration will allow an entity reference of the form &company; to be entered in the
associated document instance. This reference will be replaced by the quoted replacement text defined in the
entity declaration when the file is processed (parsed).
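The replacement of an internal text entity can be observed with a short Python sketch. The entity declaration is the one quoted above; the letter element and its text are invented for illustration, and the declaration is placed in the internal subset so that the example is self-contained.

```python
import xml.etree.ElementTree as ET

document = """<!DOCTYPE letter [
  <!ENTITY company "The Markup Centre">
]>
<letter>Issued by &company;.</letter>"""

# The parser substitutes the replacement text while the document is parsed.
letter = ET.fromstring(document)
letter_text = letter.text
```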
XML is based on the ISO 10646 Universal Multi-Octet Coded Character Set (UCS), which includes the codes
that make up the Unicode character set, so that it can be used in all major trading nations. A special form of
entity reference, known as a character reference, can be used to identify special characters, including codes
that cannot be accessed from the local keyboard, that need to be added to the file either by reference to the
decimal number assigned to the character in ISO 10646 (&#198;), or by reference to the equivalent
hexadecimal value (&#xC6;).
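Both forms of character reference resolve to the same ISO 10646 character. A minimal Python sketch, using code point 198 (hexadecimal C6, the letter Æ) and a hypothetical element name:

```python
import xml.etree.ElementTree as ET

# The same character written as a decimal and as a hexadecimal character reference.
decimal_form = ET.fromstring("<c>&#198;</c>").text
hex_form = ET.fromstring("<c>&#xC6;</c>").text

code_point = ord(decimal_form)
```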
The set of elements, attributes, etc, that can be used within an XML document instance can (optionally) be
formally defined in a document type definition (DTD) that is associated with the document instance through
the addition of a document type declaration that forms part of the prolog of a document instance. The
declarations that make up the document type definition can form part of a file referenced as the external
subset of the document type declaration, or can be embedded, or referenced, within the internal subset of the
declaration. Comment declarations can be used to record any explanatory material required as part of the
document type definition.
The W3C Document Object Model (DOM) can be used to identify the structure of an XML element tree.
Applications requiring a simpler application programming interface (API) can use the event-based Simple API
for XML (SAX) developed by the XML Developers Group. Both DOM and SAX have IDL definitions that allow
XML elements to be stored in CORBA compliant databases.
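The difference between the two interfaces can be sketched with the SAX and DOM implementations in Python's standard library. The handler class name is hypothetical, and the document is a cut-down version of the kind of order message discussed in this annex.

```python
import xml.sax
from xml.dom import minidom

DOCUMENT = (
    b"<Order><Name>The Village Store</Name>"
    b"<AddressLine>2 The Reddings</AddressLine></Order>"
)

class NameCollector(xml.sax.ContentHandler):
    """SAX: the parser reports each start-tag as an event while it reads."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)

collector = NameCollector()
xml.sax.parseString(DOCUMENT, collector)

# DOM: the whole element tree is built first and can then be navigated.
dom = minidom.parseString(DOCUMENT)
dom_child_names = [node.tagName for node in dom.documentElement.childNodes]
```

SAX suits large messages that are processed once in document order; DOM suits applications that need to revisit or modify parts of the tree.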
XML documents can be transformed into displayable formats such as HTML or PDF using another W3C
specification, the XSL Transformation (XSLT) Specification. This specification allows XML messages to be
subsetted, reordered and converted into alternative formats using a set of reusable templates that are
designed to process specific elements within the document tree, with any descendants that may be required.
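The idea of reusable templates, each handling one kind of element, can be imitated in a few lines of Python. This is a sketch of the template-matching principle only, not of XSLT itself; the template table, the choice of HTML output and the Order root element are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

def render(element):
    """Apply the template registered for this element, else just process its children."""
    template = TEMPLATES.get(element.tag, render_children)
    return template(element)

def render_children(element):
    """Default rule: render child elements in document order (text is not copied)."""
    return "".join(render(child) for child in element)

# Hypothetical templates, one per element name.
TEMPLATES = {
    "Order": lambda e: f"<html><body>{render_children(e)}</body></html>",
    "Name": lambda e: f"<h1>{escape(e.text or '')}</h1>",
    "AddressLine": lambda e: f"<p>{escape(e.text or '')}</p>",
}

message = ET.fromstring(
    "<Order><Name>The Village Store</Name>"
    "<AddressLine>2 The Reddings</AddressLine>"
    "<AddressLine>Glos. GL51 2UP</AddressLine></Order>"
)
html_page = render(message)
```

Subsetting corresponds to omitting a template and its default rule for an element, and reordering corresponds to a template that selects children in an order of its own choosing.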
A.2 What does an XML message look like?
A typical XML message can have the following form:
<?xml version ="1.0" ?>
<!DOCTYPE Order SYSTEM "C:\\xml\dtds\order.dtd">
<Name>The Village Store</Name>
<AddressLine>2 The Reddings</AddressLine>
<AddressLine>Glos. GL51 2UP</AddressLine>
<ItemDescription>Super Party Poppers &trademark;</ItemDescription>
The first line of the message indicates it is an XML message, and which version of XML it conforms to. The
second line of the message identifies the document type definition that has been used to validate the
structure of the message. The outward pointing angle brackets are the delimiters that separate XML markup
from contents. The first word within each set of angle brackets indicates the name of the XML element. The
word before each = sign represents an attribute name, and text between quotes following the = sign
represents the attribute value. Text not between angle brackets represents element content. The name
between & and ; in the fifth from last line identifies a reference to an entity whose replacement text will be the
An XSLT transformation can be applied to this message to convert the XML into a format that can be
displayed on a WWW browser, which could take the following form:
In this example the bold text indicates data that was transmitted as part of the message, italic text indicates
data that has been generated as a result of local processing of data within the message, and the normal text
represents text that would be on a pre-printed form if the data was being printed out rather than displayed on
a resizable screen. As can be seen by comparing the two formats, there is a close connection between the
markup of the message and the generated text that is associated with the displayed message data.
A.3 What is EDI?
In this report the term EDI (Electronic Data Interchange) is used in a somewhat restricted manner to refer to
the exchange of commercial data between business partners. The sort of exchanges between consumers
and merchants that are commonly referred to as "electronic commerce", or the exchange of manufacturing
related information such as detailed drawings and database exchanges known generically as "product data
exchange" are not included. However, the general need of businesses to exchange data for the purposes of
"electronic business transactions" is included, within a broader framework than that understood by
bodies such as those involved with EDIFACT (EDI for Administration, Commerce and
Transport).
EDI is traditionally concerned with computer-to-computer exchange of information, without human
intervention. The World Wide Web (WWW), on the other hand, is principally concerned with the exchange of
data between humans and computers. XML fits naturally into both of these worlds. The project team have
explored the relationships between these two worlds, to ascertain how XML, and related standards such as the
XML Stylesheet Language (XSL) and the Wireless Application Protocol (WAP), can help to capture, validate
and disseminate information that companies need to exchange to do business.
A broad view as to what constitutes a business has been taken. A significant proportion of our work has been
concerned with the interchange of healthcare records between the General Practitioners responsible for the
day-to-day health of patients and the hospitals that are required to provide specialist healthcare services.
While not traditionally thought of in business terms, the information exchange needs of the healthcare
industry are typical of many large businesses, and have the added incentive of a potentially life-threatening
criticality in the event of failure to exchange information in a timely manner.
Examination has also taken place of how existing users of EDI systems based on existing well-established
semantics, such as those agreed at the United Nations as part of the EDIFACT initiative, could benefit from
the use of XML. In particular the relationship between existing EDIFACT transport booking and management
messages has been reviewed, as have the sorts of mobile computing applications that the Wireless
Application Protocol makes available to users of the new generation of Web-aware mobile phones and PDAs.
A.4 What forms do EDI messages take?
A typical EDIFACT message has the form:
NAD+CN+++THE VILLAGE STORE+2 THE REDDINGS:CHELTENHAM+GLOS++GL51 2UP'
LIN++1+37534656:EN'IMD+F+8+:::SUPER PARTY POPPERS'QTY+21:100'
Each segment of the message starts with a three-letter segment identifier and ends with a single quote.
Within a segment there are a number of composite message components, starting with a +. Within each
composite, multiple data elements are separated by colons. Empty data elements are indicated by
consecutive colons. (In other EDI messages data elements can occur as direct components of segments as
well as forming part of a composite.) Most data elements are transmitted in coded form. For example,
19991105 indicates an ISO 8601 date (5th November 1999). The preceding number, 137, indicates that this
date represents the date the message was created. The following number, 102, indicates that the date is
expressed as an ISO 8601 date. Without access to the code tables the interpretation of such messages is
Because EDI messages are typically fairly complicated in their structure it is usual for trading partners to
subset them for particular applications. The rules that apply to a particular subset are recorded in a Message
Implementation Guideline (MIG), which can include constraints on the use of fields over and above those
specified in the original message specification in the relevant UN EDIFACT directory.
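The segment, composite and data element delimiters described in this section can be illustrated with a minimal Python sketch. It assumes the default delimiters (', + and :) and ignores the EDIFACT release character, so it is not a complete parser. The date composite at the end is assembled from the three values discussed above and is an assumption about how they are combined.

```python
from datetime import datetime

def parse_edifact(message):
    """Split a message into (segment tag, composites) pairs.

    Each composite is a list of data elements; empty strings mark empty elements.
    """
    segments = []
    for raw in message.split("'"):      # a single quote ends each segment
        if not raw:
            continue
        tag, *composites = raw.split("+")   # + starts each composite
        segments.append((tag, [c.split(":") for c in composites]))  # : separates elements
    return segments

MESSAGE = (
    "NAD+CN+++THE VILLAGE STORE+2 THE REDDINGS:CHELTENHAM+GLOS++GL51 2UP'"
    "LIN++1+37534656:EN'IMD+F+8+:::SUPER PARTY POPPERS'QTY+21:100'"
)
segments = parse_edifact(MESSAGE)
tags = [tag for tag, _ in segments]

# Assumed date composite: qualifier 137, value 19991105, format code 102.
qualifier, value, format_code = "137:19991105:102".split(":")
created = datetime.strptime(value, "%Y%m%d").date() if format_code == "102" else None
```

The coded values (CN, EN, 21, 137, 102) remain opaque after parsing, which is the point made above about needing the code tables.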