XML Registries and Repositories: A Reality Check
Brand Niemann
“XML Web Services Solution Architect”
US EPA Office of Environmental Information
Kevin Williams
President & CEO
Blue Oxide Technologies
August 14, 2002
1
Overview
• 1. XML.Gov
• 2. Booz Allen Hamilton Business Case
• 3. EPA-State Network
• 4. Mitre
• 5. Gartner
• 6. Web Services Initiative
• 7. Example Specification
• 8. Some Questions and Answers
2
1. XML.Gov
http://xml.gov/registries.htm
• The XML Working Group is considering whether to establish a
registry of "inherently governmental" data elements, DTDs, and
schemas. What do you think? Is such a registry needed?
• In the simplest sense, the benefits of XML will be achieved only if
a significant number of organizations use the same XML
definitions. Therefore, these XML definitions must be available for
partners to discover and retrieve. A registry/repository is a
mechanism used to discover and retrieve documents, templates, and
software (i.e., objects and resources) over the Internet. A registry is
the mechanism used to discover the object. The registry provides
information about the object, including the location of the object. A
repository is where the object resides. A user retrieves an object
from a repository.
• Initial guidance on A Federal XML Registry (FXR) was provided in
the Draft Federal XML Developer’s Guide (April 2002, page 7-1).
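The registry/repository distinction above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not any FXR or NIST design: the class names, the `FacilitySchema` entry, and the `schemas/facility.xsd` location are all hypothetical. The registry holds only metadata (including the object's location); the repository holds the object itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    """Metadata about a registered object, including where it resides."""
    name: str
    description: str
    location: str  # hypothetical key of the object inside a repository


class Repository:
    """Where the objects themselves reside, keyed by location."""

    def __init__(self) -> None:
        self._objects = {}

    def store(self, location: str, content: str) -> None:
        self._objects[location] = content

    def retrieve(self, location: str) -> str:
        return self._objects[location]


class Registry:
    """Holds only metadata; used to discover objects, not to store them."""

    def __init__(self) -> None:
        self._entries = {}

    def register(self, entry: RegistryEntry) -> None:
        self._entries[entry.name] = entry

    def discover(self, keyword: str) -> list:
        # Case-insensitive match on the name or description.
        keyword = keyword.lower()
        return [
            entry for entry in self._entries.values()
            if keyword in entry.name.lower() or keyword in entry.description.lower()
        ]


repository = Repository()
registry = Registry()
repository.store("schemas/facility.xsd", "<xs:schema/>")
registry.register(
    RegistryEntry("FacilitySchema", "Hypothetical facility schema", "schemas/facility.xsd")
)

# Step 1: discover the object through the registry.
found = registry.discover("facility")
# Step 2: retrieve the object from the repository at the registered location.
schema_text = repository.retrieve(found[0].location)
```

Keeping the two roles separate means a registry entry can point at an object held in someone else's repository.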
3
1. XML.Gov
http://www.BizTalk.Org/Library/library.asp
• BizTalk.Org's Schema Library provides a card
catalog and librarian features that provide
BizTalk.Org members with the ability to locate
schemas that others have registered and cataloged.
Members are given the opportunity to register their
organizations and establish publishing rights. They
can also freely share their work and technical
information describing how their organization defines
their use of the XML standard. [Extracted from the
BizTalk.Org website]
4
1. XML.Gov
XML.ORG
• The XML.ORG Registry is a resource for accessing
[...] XML specifications, schemas and vocabularies
being developed for vertical industries and horizontal
applications. Operated by the Organization for the
Advancement of Structured Information Standards
(OASIS) -- the non-profit XML interoperability
consortium -- the XML.ORG Registry is a self-
supporting resource created by and for the
community at large. [Extracted from the XML.org
website]
5
1. XML.Gov
http://www.xml.org/xmlorg_registry/index.shtml
6
1. XML.Gov
Proof of Concept Pilot Registry
• Recently XML WG Co-Chair, Marion Royal, sent an email
message indicating that the XML WG was forming an
XML.GOV Registry/Repository Project Team. The purpose of
this team will be to develop initial
policies/procedures/metadata requirements for a registry that
will be accessible through XML.gov. Although we appreciate
the input from industry participants on the XML WG, the
Registry/Repository Project Team will consist of government
representatives and/or contractors in direct support of related
government initiatives. Any suggestions and/or proposals from
other organizations should be directed to the team leader with
courtesy copies to the XML WG Chairs.
• If you are a government representative, and/or a contractor in
direct support of related government initiatives and are willing
to participate on the XML.gov Registry/Repository Project
Team, please contact the team leader at
lisa.carnahan@nist.gov.
7
1. XML.Gov
http://xmlregistry.nist.gov/xml-gov
8
1. XML.Gov
http://xmlregistry.nist.gov/
9
1. XML.Gov
http://xml.gov/documents/completed/homelandsecurity/sld005.htm
• XML Registry and Repository Business Case:
– Capital Asset Plan & Budget Justification (OMB
Circular A-11, Exhibit 300):
• “Inherently Governmental” Data Elements & Schemas.
• Standards-based, Distributed, Worldwide Network.
• Partners, e.g., OASIS & Global Justice Network.
• Federal IT Architecture – Data Reference Model.
• Foster Communities of Interest/Practice.
• Support Both Top-Down & Bottom-Up Approaches.
10
1. A Reality Check
• BizTalk.Org has gone away (fulfilled its original purpose).
• UDDI has gone to OASIS:
– UDDI.org Delivers Version 3 Specification.
– Santa Clara, Calif., July 30, 2002 - The Universal Description,
Discovery, and Integration (UDDI) project, whose specification
provides one of the building blocks for Web services applications and
services, and OASIS, an industry standards body, have announced that
OASIS will serve as the steward for the UDDI project and activities
and will continue development of the UDDI technical work.
– OASIS is also building a Standards Registry based on XML:
• http://www.oasis-open.org/stdsreg/
• The NIST Pilot Proof of Concept Registry has generated a
number of issues for discussion and led to the BAH Business
Case Analysis.
11
1. A Reality Check
• XML Working Group: Web Services Initiative, Madhu
Siddalingaiah, June 19, 2002:
– Slide 10 on UDDI:
• A registry of WSDL documents
– Like an electronic Yellow Pages
• Web Service developer publishes WSDL to UDDI server
• Web Service clients can query the UDDI server for suitable service
definitions
– Accessible by humans and computers
• UDDI is not yet mature
– Standards and implementations are experimental
– How is trust established?
– Slide 18 on References:
• XMethods (http://www.xmethods.com) provides a great list of interesting
Web services and provides services that facilitate the development,
deployment, and usage of Web services and Web services networks.
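The publish-and-query flow described for UDDI on Slide 10 can be illustrated with a toy directory. This sketch is not the UDDI API: `ServiceDirectory`, its methods, the category string, and the `example.org` URL are hypothetical names chosen only to show a developer publishing a WSDL location and a client querying for it.

```python
from collections import defaultdict


class ServiceDirectory:
    """Toy 'Yellow Pages': maps a category to WSDL document URLs."""

    def __init__(self) -> None:
        self._by_category = defaultdict(list)

    def publish(self, category: str, wsdl_url: str) -> None:
        """Called by a Web Service developer to list a WSDL document."""
        self._by_category[category.lower()].append(wsdl_url)

    def find(self, category: str) -> list:
        """Called by a client to look up candidate service definitions."""
        # .get avoids creating an empty entry for unknown categories.
        return list(self._by_category.get(category.lower(), []))


directory = ServiceDirectory()
directory.publish("Weather", "http://example.org/forecast.wsdl")  # hypothetical URL
matches = directory.find("weather")
```

The sketch deliberately leaves out the open question raised on the slide: nothing here establishes whether a published entry can be trusted.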
12
2. Booz Allen Hamilton Business Case
http://xml.gov/documents/completed/bah/20020801statusreport.htm
• Alternative 1. Status Quo/Base Case: Undertaking no coordination
activities to standardize data and ensure the interoperability of all
government-sponsored registry/repositories. Allowing any and all agencies
to build, operate and maintain as many reg/reps with as many different
underlying technologies and specifications as they choose.
• Alternative 2. Single Unified Registry/Repository: Building a single federal
reg/rep from scratch that will require that every federal agency wishing to
publish schemas or artifacts go through/ provide submissions to the central
reg/rep for review and approval. This alternative requires the termination of
all current XML activities in agencies (EPA, DoD, etc) and would require
existing activities to be subsumed by the new single reg/rep.
• Alternative 3. Federated/Distributed Model: Each agency or entity may
stand up its own reg/rep. However, they must do so according to certain
specifications that ensure interoperability with the central government-wide
(XML.gov) portal/reg-rep. For those agencies electing not to build their
own reg/reps, they may publish information on the central reg/rep.
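Alternative 3 can be sketched as a central portal that fans a search out to member registries sharing one interface. This is an illustration under stated assumptions, not the BAH design: `AgencyRegistry`, `InMemoryRegistry`, `CentralPortal`, and the schema names are hypothetical, and a real federation would call remote services rather than in-memory objects.

```python
from typing import Protocol


class AgencyRegistry(Protocol):
    """The common interface every federated reg/rep is assumed to expose."""

    def search(self, keyword: str) -> list: ...


class InMemoryRegistry:
    """Stand-in for an agency reg/rep; holds schema names only."""

    def __init__(self, schema_names=None) -> None:
        self.schema_names = list(schema_names or [])

    def publish(self, schema_name: str) -> None:
        self.schema_names.append(schema_name)

    def search(self, keyword: str) -> list:
        return [n for n in self.schema_names if keyword.lower() in n.lower()]


class CentralPortal:
    """Government-wide portal: hosts its own reg/rep and federates the others."""

    def __init__(self) -> None:
        # Agencies that do not build their own reg/rep publish here.
        self.central = InMemoryRegistry()
        self._members = {}

    def join(self, agency: str, registry: AgencyRegistry) -> None:
        self._members[agency] = registry

    def search(self, keyword: str) -> dict:
        """Query the central reg/rep and every member; group hits by source."""
        results = {}
        for source, registry in {"central": self.central, **self._members}.items():
            hits = registry.search(keyword)
            if hits:
                results[source] = hits
        return results


portal = CentralPortal()
portal.join("EPA", InMemoryRegistry(["FacilitySchema"]))
portal.join("DoD", InMemoryRegistry(["LogisticsSchema"]))
portal.central.publish("GrantFacilityForm")
hits = portal.search("facility")
```

The only thing the portal requires of a member is the shared `search` interface, which is the role the interoperability specifications play in the federated model.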
13
2. A Reality Check
• Alternative 1 has been going on because progress on
Alternative 2 has been so slow and because each
XML activity will favor use of its own reg/rep until it
discovers the need to integrate across its larger
enterprise and/or across the Federal enterprise (e.g. in
the 24 e-Gov Initiatives).
• Alternative 3 will need a Government-wide XML
Web Services Initiative to demonstrate and promote
the interoperability needed to be successful.
14
3. EPA-State Network
• LMI produced “Requirements for an XML Registry” in May
2001 that recommended it be based on the ebXML model (not
the OASIS model) and be coupled with an ISO 11179 Registry
for the XML tags.
• The EPA-State Network started a pilot XML Registry effort at
NIST in the Spring of 2001 and its status was addressed in my
previous presentation on July 17th.
• The Network Steering Board (NSB) created a Technical
Resources Group (TRG) to cover:
– Data Exchange Templates (XML Schemas and sub-schemas).
– Core Reference Model (Manage development of DETs and Data
standards).
– Registry (Official Web site for the DETs, Trading Partner Agreements,
and guidance).
– Training and Orientation (Provide a common frame of reference).
15
3. EPA-State Network
16
3. EPA-State Network
• Registry Workgroup: A Proposed Work Plan, July
11, 2002 Draft:
– Objectives:
• Phase I.:
– Establish the scope (what would be registered, what functions, etc.)
– Updated information on applicable standards.
– Survey of available technology options including use of or linkage to
the EPA ISO 11179 EDR.
– Recommend building, buying, or making use of existing directories.
• Phase II:
– Provide the Registry for the Network.
– Schedule: 7 Deliverables by early December 2002.
• TRG Registry Work Group (July 22, 2002):
– Need two registries – the second being a UDDI for a
directory of Network Web Services!
17
3. EPA-State Network
• TRG Registry Work Group (August 5, 2002):
– Annotated Outline for the Software and Data Requirements
Document for the XML Registry for the EPA-State
National Environmental Information Exchange Network:
• Applicable Standards (ebXML/OASIS and ISO 11179).
• Software Requirements (6).
• Data Requirements (7).
• Interoperability Requirements (linkage to EDR and UDDI Web
Services Registry).
• Web Services Registry (not a complete requirements specification
for a Web Services Registry)
18
4. Mitre
http://pixit.mitre.org (not publicly accessible)
• Some background:
– DoD has paid Mitre a considerable amount for its Registry
efforts, but its Registry still has no collaboration
mechanism.
– Mitre has held two Quarterly XML Web Services
Technical Exchange Meetings that included extensive
discussions of its DoD Registry work:
• April 9, 2002, Colorado Springs, CO
• July 16, 2002, Reston, VA
• October 15, 2002, Bedford, MA (next meeting)
– Mitre’s Project Showcase features the work of Mary
Pulvermacher, who led an effort to improve Web-based
data exchange in space for the military:
• http://www.mitre.org/pubs/spotlight/2002/mary_pulverm/
– Terry Alford has spent the last year on Registry
Improvement Efforts and has lots to contribute.
19
4. Mitre
http://pixit.mitre.org (not publicly accessible)
• Excerpts from official minutes of two meetings:
– DRIVE is the DISA Registry Initiative and uses XML Global software.
– You can put anything into the Registry – there are 90 DTDs/XML
Schemas and 14,832 XML elements (as of April 9, 2002).
– Both the government and commercial sector are doing a poor job with
processes such as producing consensus definitions.
– “It would be a big mistake to turn XML registration into data
standardization.”
– Agreement across DoD is virtually impossible and prohibitive in cost.
– Make the Registry into a convenience for programs and a user-friendly
non-threatening experience.
– Collaboration and use of the Registry needs to be moved up before the
final design, development, and testing phase.
– XML Web Service for Registry – use XML tools to achieve greater
power.
20
4. Mitre
http://pixit.mitre.org (not publicly accessible)
• Excerpts from official minutes of two meetings
(continued):
– People rarely register and reuse stuff; you have to go to the
users and beg them to register their stuff; you have to
register whatever people give you if it is well-formed and
valid; the MS Registry (BizTalk.Org) effort didn’t work so
they took it down.
– Not possible to move lock-step across the agencies; must
start where you’re at and evolve; need coordination within
each agency first and five years later across agencies.
– Experiments, initiatives, and working groups are very
valuable and in line with new DoD rapid development,
initiative, and fielding approaches.
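The "register it if it is well-formed and valid" rule from the minutes implies a gate at submission time. A minimal sketch of the well-formedness half, using Python's standard library, follows. The function name is hypothetical, and validity against a DTD or XML Schema is deliberately not covered, because `xml.etree.ElementTree` does not validate; that step would need a validating parser.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML; this does NOT check validity."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# A submission with balanced tags passes the gate.
accepted = is_well_formed("<facility><name>Plant A</name></facility>")
# A submission with a missing end tag is rejected.
rejected = is_well_formed("<facility><name>Plant A</facility>")
```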
21
5. Gartner: Their Reality Check
http://builder.cnet.com/webbuilding/0-3885-717-4721616.html?tag=st.bl.7267.edt.3885-717-4721616
• XML: 11 best practices Provided by Gartner:
– 7. Support public repositories: XML-defined vocabularies require resources to
create and manage them. Provide funding or skills to support them. Through
year-end 2002, the greatest growth in the development of XML-based
applications will occur in terms of new shared models needed for cross-
industry and information-chain integration, and the discovery of common
models and model-sharing techniques. By year-end 2002, industry-led groups
will develop standard procedures to define application-specific XML-defined
vocabularies and transaction and application schemas. There will be many
Web-based hosting sites, even within the same industries, for developing,
sharing, and reconciling XML-defined vocabularies and transaction and
application schemas.
– 8. Share vocabularies, not transactions: Transactions are easier to manage than
vocabularies, but they impose a rigidity on communications that limits their
usefulness. Most XML application standards have focused on creating entire
transaction definitions (purchase order and consumer bank transactions, for
example). A virtually infinite set of transactions will eventually use XML, but
those transactions will be most valuable if they use common vocabulary
definitions whenever possible (for words such as company, product, and
address, for example). Focus on defining the structure of components (for
example, a company consists of a name, an address, and a phone number), not
on the full set of transactions.
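Best practice 8 can be made concrete with a small sketch: define the shared components once and let each transaction reuse them. The types below are hypothetical illustrations, not any published vocabulary; only the "company consists of a name, an address, and a phone number" structure comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    street: str
    city: str
    postal_code: str


@dataclass(frozen=True)
class Company:
    """Shared vocabulary component: a name, an address, and a phone number."""
    name: str
    address: Address
    phone: str


@dataclass(frozen=True)
class PurchaseOrder:
    """One transaction built from the shared components."""
    buyer: Company
    seller: Company
    product: str
    quantity: int


@dataclass(frozen=True)
class Invoice:
    """A different transaction reusing the same Company definition."""
    issuer: Company
    recipient: Company
    amount_due: float


acme = Company("Acme", Address("1 Main St", "Springfield", "00000"), "555-0100")
widgetco = Company("WidgetCo", Address("2 Oak Ave", "Shelbyville", "00001"), "555-0101")

order = PurchaseOrder(buyer=widgetco, seller=acme, product="bolts", quantity=100)
invoice = Invoice(issuer=acme, recipient=widgetco, amount_due=42.0)
```

Because both transactions share `Company`, software that understands a company in a purchase order understands it in an invoice without a separate agreement.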
22
5. Gartner: Their Reality Check
http://builder.cnet.com/webbuilding/0-3885-717-4721616.html?tag=st.bl.7267.edt.3885-717-4721616
• XML: 11 best practices Provided by Gartner:
– 9. Don't argue about names: Computers don't care what things are
called. Most standards development efforts are impeded by discussions
about what things should be called. For computers to recognize two
strings as being the same, they must be identical or have a translation
that maps them to each other. Otherwise, applications will not
recognize them or process them as being the same. There are already 55
categories, with almost 200 standard proposals cataloged on
http://www.xml.org/--the site maintained by the Organization for the
Advancement of Structured Information Standards (OASIS). XML
sites such as http://www.biztalk.org/ and http://www.xml.com/ are
public sources of application-specific content models under
development. These sites are references to the industry-specific work
and show the variety of related models in various disciplines. There is
overlap among the many areas covered by the different standards.
When all the linguistic conflicts become obvious, the explosion of
XML-defined standards will slow as standardization activities
concentrate on reconciliation and reuse. As a result--and not because
XML standards become less important or less used--80 percent of
XML-based standards defined by year-end 2000 will be merged,
shelved, or discarded by the end of that year.
23
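The "translation that maps them to each other" in best practice 9 can be as simple as a lookup table from each community's element name to one canonical name. The table and element names below are hypothetical; the sketch only shows that two records using different names become identical once mapped.

```python
# Hypothetical mapping from community-specific element names to canonical ones.
CANONICAL_NAMES = {
    "company": "company",
    "organization": "company",
    "phone": "phone",
    "telephone": "phone",
}


def to_canonical(record: dict) -> dict:
    """Rename each element to its canonical name; fail loudly on unmapped names."""
    canonical = {}
    for name, value in record.items():
        key = CANONICAL_NAMES.get(name.lower())
        if key is None:
            raise KeyError(f"no mapping for element name {name!r}")
        canonical[key] = value
    return canonical


record_a = to_canonical({"Organization": "Acme", "Telephone": "555-0100"})
record_b = to_canonical({"company": "Acme", "phone": "555-0100"})
```

Raising on an unmapped name is one design choice; it reflects the point in the text that strings without a translation will not be recognized as the same.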
6. Web Services Initiative
• At the March meeting of the CIO Council’s Architecture and
Infrastructure Committee (AIC), it was proposed that the AIC
undertake an initiative in Web Services. It was agreed that the AIC
needed education on the subject and that the level of interest should
be assessed. Interest was high in forming a working group focused
on federal opportunities associated with Web Services, and a
brainstorming meeting was held on July 25th, with about 25
government and industry participants, to identify activities and
draft a Charter.
• The GSA OIS recently distributed a Federal/State Issue Alert (May
2002) entitled Web Services: Using the Internet as a Shared Service
Platform as part of their series to provide short summaries on
emerging issues for quick reference by busy managers
(http://www.gsa.gov/attachments/GSA_PUBLICATIONS/extpub/Web%20Services_6.htm).
This Alert explains how Web Services can
support the interoperability and integration objectives of e-
government with both legacy systems and new one-stop cross-
cutting portal systems.
24
6. Web Services Initiative
• The brainstorming session prioritized 18 suggestions for key
activities of the new working group and those that relate to an
XML Web Services Registry were as follows:
– Maintain registry of WS-related projects or efforts, to avoid duplication
and promote information sharing.
– Implement a registry of available Web Services (a “loose” registry of
human-researchable information at first, but later supporting automated
services location).
– Promote dissemination to Federal agencies of Web Services best
practices (from private sector or within Government).
– Develop an interoperability matrix for Web Services, helping agencies
spot interoperability issues between various W-S implementations.
– Develop on-line Web Services “want ads”, where businesses, agencies
or state and local governments could post requests for specific Web
Services.
– Provide on-line collaboration facility for exchange of sample business
cases, templates, and other info related to Web Services.
25
7. Example Specification
• An information design platform, not specifically an XML
Schema platform, so the users don’t need to be XML Schema
experts to make efficient use of the platform. The information
architectures created are not locked into the XML Schema
format so users can leverage emergent technologies over time.
• An outgrowth of a custom piece of software done by Kevin
Williams to support the MISMO effort
(http://www.mismo.org). Provided to Lisa Carnahan recently
for comments.
• The following are the specifications for a piece of software a
team has been working on for the past several months. It
includes a Web Service interface and enables servers to run in
a federated model, where servers can be linked together into a
searchable web of registries and repositories - and web
services turn out to be the best way to do that: (see next slide)
26
7. Example Specification
– Information design independent of serialization
– Robust import/export functionality
– Atomic versioning
– Version tagging
– Information hierarchy
– Support for namespaces
– Scratchpad support
– Threaded collaboration at every level
– Full-featured security model
– Fine control of data points
– Description fields available at every level
– Real-time collaboration
– Issue and resolution tracking
– No client install required
– Web Services API (planned)
– Information mapping (planned)
– Support for emergent technologies
27
7. Example Specification
• Contact information for more information and
suggestions:
– Kevin Williams
– President and CEO, Blue Oxide Technologies
– http://www.blueoxide.com
– RR3 Box 227 N
– Charles Town, WV 25414
– 304-724-6766
– kevin@blueoxide.com
28
8. Some Questions
• Sample question for each section:
– 1. Should we make more use of commercial expertise in developing a
network of Web Services?
– 2. Should the business case analysis cover more than the location
options (many/diverse, single/unified, or federated/distributed) for
Registries, such as their scope: just XML Tags and Namespaces, XML
Schemas and TPAs only, or a broader-purpose, user-friendly
collaboration platform?
– 3. What do you think about the evolution of an XML Registry to three
XML Registries at EPA?
– 4. What do you think we can learn from Mitre’s experience in developing
the DoD XML Registry?
– 5. What is your response to the three Gartner “Best Practices” that
relate to Registries and Repositories?
– 6. If there is a Web Services initiative, how should their Registry
requirements and activities be coordinated and integrated with those of
XML.Gov?
– 7. Do you have any suggestions for Kevin Williams on the Example
Specification?
29
8. Some Answers
http://130.11.44.140
• Independent network XML nodes:
– Unit 22 – Close coupling of Oracle 9i R2 (native XML
database) with XML Spy 4.4.
• XML-based distributed collaboration platform:
– Unit 23 – EPA-State Content Network with NextPage’s
NXT 3 P2P Platform that uses XML-indexing (XIL) and
Web Services (SOAP, RDF, etc.).
• XML Community Vocabularies:
– Unit 28 – Bringing XML to EPA Data Standards. See next
two slides.
30
8. Some Answers
Unit 28 at http://130.11.44.140
• Understanding XML Standards, Chapter 19 in XML
and Web Services Unleashed, Sams, February 2002,
pp. 814-845:
– The Standards Stack (like a stack of pancakes):
• The higher in the stack one goes, the more technology and
specifications each layer is dependent on or references.
• Some aspects of XML specifications exhibit layering behavior,
whereas others can be applied to multiple layers in the stack.
• The uses for XML fall into two different camps: message-oriented
protocols (right side-span all) and document-oriented specifications
(left side).
31
8. Some Answers
Unit 28 at http://130.11.44.140
The XML Standards Stack
[Figure: layers, top to bottom: Community Specifications, Business Process Layer, Services Layer, Messaging Layer, Transport Layer, XML Base Architecture. Aspects applied across layers: Presentation, Semantics, Security, Query.]
32
8. Some Answers
Unit 28 at http://130.11.44.140
• The XML Standards Stack Layers:
– XML Base Architecture – all specifications use XML (e.g. XML
Schema).
– XML Transport Layer – Uses HTTP, SMTP, and FTP for transport
from place to place, but also BEEP (Blocks Extensible Exchange
Protocol), etc.
– XML Messaging Layer – packaging XML documents for transmission
(analogy to a postal envelope) (SOAP-Simple Object Access Protocol
to become the W3C’s XML Protocol).
– Services Layer – functionalities that can be accessed by machines in a
distributed manner (WSDL-Web Services Description Language)
– Process Layer – turning functionality into coordinated action and
individual components into larger applications (various workflow
specifications that even allow human interaction to occur at various
points in the machine-to-machine dialogue).
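The postal-envelope analogy for the Messaging Layer can be shown by wrapping a payload in a SOAP envelope. The sketch below uses the SOAP 1.1 envelope namespace; the `GetFacility` payload and its `FacilityId` element are hypothetical, and no transport (HTTP, SMTP, and so on) is involved, since that belongs to the layer below.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def wrap_in_envelope(payload: ET.Element) -> ET.Element:
    """Package an XML payload inside a SOAP Envelope/Body pair."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    body.append(payload)
    return envelope


# Hypothetical application payload: the "letter" inside the envelope.
payload = ET.Element("GetFacility")
ET.SubElement(payload, "FacilityId").text = "12345"

# Use the conventional "soap" prefix when serializing.
ET.register_namespace("soap", SOAP_ENV)
message = ET.tostring(wrap_in_envelope(payload), encoding="unicode")
```

The envelope says nothing about what the payload means; describing the operations a service offers is the job of the Services Layer (WSDL).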
33
8. Some Answers
Unit 28 at http://130.11.44.140
• The XML Standards Stack Aspects:
– Presentation Aspect – how XML should be presented or
modified in presentation for usability (XHTML, XForms,
and SVG-Scalable Vector Graphics).
– Security Aspect – provides a level of protection of XML
information (encryption, authentication, authorization and
permission, and privacy).
– Query Aspect – assist in locating XML resources (tagging
with metadata and retrieving).
– Semantic Aspect – help apply meaning and context to
XML documents (synchronizing XML vocabularies with
other incompatible representations).
34
8. Some Answers
Unit 28 at http://130.11.44.140
XML Standards Stack “Pyramid”
[Figure: Document-Oriented Specifications (left) and Message-Oriented Protocols (right), with the XML Base Architecture at the base.]
35
8. Some Answers
Unit 28 at http://130.11.44.140
• XML Standards Stack “Pyramid”:
– Community Vocabularies Layer:
• All the industry specific implementations and problem-oriented
specifications (where the “rubber meets the road”).
• How a specific user community plans to make use of XML, the
specifics of data exchange, and often some of the first specifications
to be developed.
• The number of community vocabularies is proliferating.
– Upside-down pyramid (relative numbers in each layer):
• From few (XML Base Architecture) to many (Community
Vocabularies) specifications.
36