        Metadata Registries & Repositories: Lessons Learned
               Informal Presentation to the
        SAIC/DHS Metadata Center of Excellence
                     Brand L. Niemann
   US EPA and Federal CIO Council’s Architecture and
      Infrastructure and Best Practices Committees
                   September 16, 2004



                                                       1
                 Overview
•   1. Service-Oriented Architecture
•   2. Pilots
•   3. Some Lessons Learned
•   4. A Possible DHS Strategy
•   5. Contact Information




                                       2
  1. Service-Oriented Architecture
• IBM has created a model to depict Web services
  interactions which is referred to as a “service-
  oriented architecture” comprising relationships
  among three entities (see next slide):
  – A Web service provider;
  – A Web service requestor; and
  – A Web service broker.
• Note: IBM’s service-oriented architecture is a
  generic model describing service collaboration,
  not specific to Web services.
  – See http://www-106.ibm.com/developerworks/webservices/

                                                             3
 1. Service-Oriented Architecture

                               Service
                               provider

               Publish                          Bind




          Service                                      Service
          broker                Find                   requestor


Service-oriented architecture representation (Courtesy of IBM Corporation)



                                                                             4
  1. Service-Oriented Architecture
• A Service-Oriented Architecture (SOA) means
  that the architecture is described and organized
  to support the dynamic, automated description,
  publication, discovery, and use of Web Services.
  – The SOA organizes Web Services into three basic
    roles:
     • The service provider (publish)
     • The service requestor (find)
     • The service registry (bind)
  – The SOA is also responsible for describing how Web
    Services can be combined into larger services.
                                                         5
  1. Service-Oriented Architecture
• The SOA has four key functional components:
   – Service Implementation:
      • Build from scratch, provide a wrapper, or create a new service
        interface for an existing Web Service.
   – Publication:
      • Author the WSDL document, publish the WSDL on a Web Server,
        and publish the existence of your WSDL in a Web Services registry
        using a standard specification (UDDI).
   – Discovery:
      • Search the registry, get the URL, and download the WSDL file.
   – Invocation:
      • Author a client (SOAP) using the WSDL and make the request
        (SOAP message) and get the response (SOAP message).
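
The invocation step can be illustrated with a minimal sketch of how a request is wrapped in a SOAP envelope and unwrapped again. It uses only the Python standard library. The envelope namespace is the SOAP 1.1 one; the service namespace, the operation name (GetDataElement), and the parameter are hypothetical and are not taken from any pilot described here.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:data-registry"               # hypothetical service namespace


def build_request(operation: str, params: dict) -> bytes:
    """Wrap an operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_request(message: bytes) -> tuple:
    """Recover the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(message)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    call = body[0]                                   # first child of Body is the call
    operation = call.tag.split("}", 1)[1]            # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


# Hypothetical operation and parameter, for illustration only.
request = build_request("GetDataElement", {"name": "FacilityName"})
print(parse_request(request))
```

In practice a client generated from the WSDL would build these messages; the sketch only shows what the envelope carries.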



                                                                            6
1. Service-Oriented Architecture
 Diagram components: Client, UDDI Registry, WSDL Document, Web Service.

 • 1. Client queries registry to locate service.
 • 2. Registry refers client to WSDL document.
 • 3. Client accesses WSDL document.
 • 4. WSDL provides data to interact with Web service.
 • 5. Client sends SOAP-message request.
 • 6. Web service returns SOAP-message response.
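
The six steps can be walked through with a small in-memory sketch. Everything here is a stand-in and an assumption: the Registry class plays the UDDI registry, ServiceDescription plays the WSDL document, a dictionary of handler functions plays the network, and the service name, endpoint URL, and operation are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ServiceDescription:
    """Stand-in for a WSDL document: where the service is and what it offers."""
    name: str
    endpoint: str   # hypothetical endpoint URL
    operation: str


class Registry:
    """Stand-in for a UDDI registry: maps service names to descriptions."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceDescription] = {}

    def publish(self, description: ServiceDescription) -> None:
        self._entries[description.name] = description

    def find(self, name: str) -> ServiceDescription:
        return self._entries[name]


# Stand-in for the network: endpoint URL -> the provider's handler function.
ENDPOINTS: Dict[str, Callable[[str, dict], dict]] = {}


def lookup_handler(operation: str, params: dict) -> dict:
    """Provider-side implementation of the hypothetical Lookup operation."""
    return {"operation": operation, "definition": f"Definition of {params['term']}"}


# The provider publishes its description and exposes its endpoint.
registry = Registry()
description = ServiceDescription("glossary", "http://example.invalid/glossary", "Lookup")
registry.publish(description)
ENDPOINTS[description.endpoint] = lookup_handler

# Steps 1-4: the client queries the registry and reads the description.
found = registry.find("glossary")
# Steps 5-6: the client sends a request to the endpoint and gets a response.
response = ENDPOINTS[found.endpoint](found.operation, {"term": "metadata"})
print(response)
```

The point of the indirection is that the client knows only the service name in advance; the endpoint and operation come from the registry lookup.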

                                                             7
  1. Service-Oriented Architecture
• Acronyms:                     • Practical Examples:
  – UDDI                          – Phone Book
  – WSDL                          – Contract
  – SOAP                          – Envelope
  – HTTP, SMTP, FTP               – Mailperson
  – Programming (DOM, SAX)        – Speech
  – Schema (DTD, XSD)             – Vocabulary
  – XML                           – Alphabet


                                                8
  1. Service-Oriented Architecture
• Stages of Web services Development and
  Deployment:
  – Creation – Design, development,
    documentation, testing, and distribution.
  – Publication – Web service hosting and
    maintenance.
  – Promotion – Directory services, value-added
    services and accreditation.


                                                  9
   1. Service-Oriented Architecture
                                                      Service requestors
Service providers




                            Web Services Network:
                                  Security
                                 Reliability
                                    QoS
                                   Billing




                    Web services networks act as intermediaries
                    in Web services interactions.
                                                                           10
                             2. Pilots
•   Asked to review EPA’s Environmental Data Registry (ISO 11179) and National
    Environmental Exchange Network, to make recommendations for
    improvements, and to provide XML Web Services training (2001-2002).
•   Received Special Award for Innovation with XML Web Services from the
    Federal Quad Council (Mark Forman, March 2002) and asked to lead the
    CIO Council’s XML Web Services Working Group and to do more pilots in
    support of E-Government (August 2002-September 2003) (see list on slide
    12).
•   Received Emerging Technology/Standards Leadership Award at the
    SecureE-Biz.Net Summit from Mark Forman and David McClure (April
    2003).
•   One CIO Council Pilot project becomes the First Annual Conference on
    Semantic Technology for E-Government at the White House Conference
    Center (September 8, 2003) which fosters the formation of the Semantic
    Interoperability Community of Practice (SICoP) under the CIO Council’s
    Best Practices Committee (March 2004) (Co-Chairs, Rick Morris and Brand
    Niemann) which in turn becomes a public-private partnership that produces
    the Second Annual Conference on Semantic Technology for E-Government
    (September 8-9, 2004).
•   The “Best Paper” Award at the Second Annual Conference on Semantic
    Technology for E-Government went to a four-person team led by
    SAIC/ACS (see slides 13-14) in which the repository was a “semantic
    store”.
                                                                            11
2. Pilots




            12
                        2. Pilots
• Operationalizing the Semantic Web: A
  Prototype Effort using XML and Semantic
  Web Technologies for Counter-Terrorism:
  – M. Personick*, B. Bebee*, B. Thompson
    SAIC/Advanced Systems & Concepts;
  – B. Parsia, The University of Maryland, College
    Park, Maryland Information and Network
    Dynamics Lab, Semantic Web Agents Project;
    and
  – C. Soechtig, Object Sciences Corporation.
   *Conference presenters
                                                13
                                2. Pilots
•   2.1 Repurposing EPA’s Environmental Data Registry (ISO 11179) (added structure
    and data element harmonization)
•   2.2 Distributed Content Network and Semantic Web Services (NextPage)
•   2.3 XML.Gov Working Group-NIST Pilot Registry (Yellow Dragon-Adobe)
•   2.4 Repurposing the DOD Registry with XML Collaborator (use in IC MWG?)
•   2.5 MetaMatrix-XML Collaborator (DHS integration scenario using MOF)
•   2.6 CollabNet (now used for CORE.Gov)
•   2.7 E-Forms for E-Gov (Census’s Registry and 12 or so vendors)
•   2.8 Integrated Web Services/ebXML (to OASIS TC)
•   2.9 State and Local Homeland Security Best Practices Pilot (FileMaker Pro)
•   2.10 Native XML Database (Tamino with UDDI)
•   2.11 Data and Information Reference Model (DRM) (embedded semantic
    harmonization and real data tables)
•   2.12 Networked Communities of Practice (CoP) & Their Dynamic Knowledge
    Repositories (DKR) (ONTOLOG Forum & Collaborative Expedition Workshops with
    CIM3)
•   2.13 Semantic Information Management (Unicorn)
•   2.14 Federated Repository-Software Asset Reuse (LogicLibrary)
•   2.15 “Best Practices” for Networked Communities of Practice (CIM3 and NextPage)
•   2.16 Community of Practice Hosting Portals (Tomoye Simplify and Groove)
•   2.17 Ontology Production and Linking (several new open source and proprietary)

       Note: Some specifics for each to be provided in the presentation.              14
 2.1 Repurposing EPA’s Environmental Data Registry (ISO 11179)
        (added structure and data element harmonization)
(Note: NextPage’s LivePublish puts all data in the same format (XML), while its NXT4
          indexes many different formats into that same format (XML))

EPA’s EDR contains about 250 standardized elements and about 10,000
non-standardized data elements, including many that are redundant.




                http://www.sdi.gov
                                                                                      15
                http://xml.gov/presentations/nist3/iso11179.htm
            2.2 Distributed Content Network and Semantic Web Services
                                     (NextPage)


                 Enterprise Ontology and Web Services Registry

                        Interoperable Syntax    Interoperable Semantics
   Dynamic Resources    Web Services            Semantic Web Services
   Static Resources     WWW                     Semantic Web

Source: Derived in part from two separate presentations at the Web Services One
Conference 2002 by Dieter Fensel and Dragan Sretenovic.
                                                                        16
     2.2 Distributed Content Network and Semantic Web Services
                              (NextPage)

Levels/      Digital               Communities of          XML Web
Mappings     Collections           Practice                Services

1            Topics*               Domains                 Networks
2            Sub-Topics*           Conceptual Areas        Nodes
                                   (Topics)
3            Table of Contents**   Knowledge Objects       Services

*“Content gives us the semantics (taxonomy/ontology) & the interoperability”,
Adam Pease, SICoP Meeting at MITRE, May 19, 2004.
**“Structure comes from the content itself”, The Large Document Problem, Lucian
Russell, Categorization of Government Information WG Meeting, 5/10/04.
                                                                                  17
  2.2 Distributed Content Network and Semantic Web Services
                           (NextPage)
(Note: Recently acquired by FAST, the Search Engine company used by FirstGov)




                           http://www.sdi.gov
                                                                                18
    2.3 XML.Gov Working Group-NIST Pilot Registry
               (Yellow Dragon-Adobe)
(BAH Business Case called for Federated, but only Centralized so far)




           http://xmlregistry.nist.gov:8080/index.jsp                   19
2.4 Repurposing the DOD Registry with XML Collaborator
     (use in IC MWG? - “DOD not a sterling registry”)
    (supported ISO 11179, ebXML, WSDL/UDDI, and now CAM)




http://www.blueoxide.com/Pages/xmlcollaborator.html
http://xml.gov/presentations/blueoxide2/collaborator.htm
http://xml.gov/presentations/fgm/dodregistry.htm
http://www.xml.saic.com/icml/ic_registry/introduction.asp   20
                    2.5 MetaMatrix-XML Collaborator
                  (DHS integration scenario using MOF)
   Virtual Views:
      Border & Transportation Security
      Emergency Preparedness & Response
      Science & Technology
      Information Analysis

   Physical Sources:
      INS, Customs, Coast Guard, National Guard, TSA, FEMA,
      Secret Service, FBI, CIA





      This approach is equally valid for intra-agency data integration                                21
                           2.5 MetaMatrix-XML Collaborator
                               (DHS integration scenario)

XML Schema
Mapping




 See Joint Government Data and Information Reference Model (IAC White Paper)
 which includes MetaMatrix-XML Collaborator Pilot Project (see pages 26-27) at:
 http://web-services.gov/030528_IAC_EA_SIG_Information_and_Data_Reference_Model_Body.pdf
                                                                                           22
                      2.6 CollabNet
                 (now used for CORE.Gov)
(more the “pull model” than the “push model” used by LogicLibrary)




                   https://www.core.gov/                             23
                2.7 E-Forms for E-Gov
        (Census’s Registry and 12 or so vendors)




See http://www.fenestra.com/eforms/deliverables/final_report.htm
                                                                   24
                    2.8 Integrated Web Services/ebXML
                                (to OASIS TC)

One interface (HTTP, SwA, ebMS)
   – Electronic Forms
   – Web Services / WSRP
   – Collaboration Agreements
   – Business Process Requirements, Objects, Data
   – Domain specific Semantics and Relationships between Assets & Artifacts
   – SQL queries and APIs

Diagram labels (ebXML registry flow):
   – Business Process and Information Models (Compliant to the ebXML Meta Model)
   – Model to XML Conversion
   – Registration
   – Registries, with Registry Service Interface
   – Retrieval of ebXML Models and Profiles
   – Implementers
   – Build: Business Service Interface and Internal Business Application
     (one pair for each trading partner)
   – Register Collaboration Protocol Profile (CPP) (each trading partner)
   – Retrieval of Profiles & new/updated ebXML Models (each trading partner)
   – CPP Derives Collaboration Protocol Agreement (CPA)
   – CPA Governs Payload




   See Carl Mattocks: http://xml.gov/presentations/oasis4/eGovRegistry.htm                                                                    25
2.9 State and Local Homeland Security Best Practices Pilot
                     (FileMaker Pro)
             (James Mackison, GSA, for DHS)




Standard metadata template in Excel imported to FileMaker Pro.
                                                                 26
                           2.10 Native XML Database
                               (Tamino with UDDI)

                          How UDDI Works

UDDI Business Registry: Business Registrations and Services Type Registrations

1) Software companies, standards bodies and programmers populate the
   registry with descriptions of different tModels.
2) Businesses populate the registry with descriptions of the services they
   support.
3) UDDI assigns a programmatically unique identifier (UUID) to each tModel
   and business registration and stores them in an Internet registry.
4) Marketplaces, search engines, and business apps query the registry to
   discover services at other companies.
5) Businesses use this data to facilitate easier integration with each other
   over the Web.
                                                                        27
For update see http://xml.gov/presentations/systinet/uddi.htm
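
Steps 1 through 4 can be sketched with an in-memory registry. This is an illustrative assumption, not the UDDI API: the class and method names are hypothetical, the business and tModel names are invented, and keys are generated with Python's uuid4 simply to show registry-assigned unique identifiers.

```python
import uuid
from typing import Dict, List


class BusinessRegistry:
    """In-memory stand-in for a UDDI business registry (not the real API)."""

    def __init__(self) -> None:
        self.tmodels: Dict[str, str] = {}      # key -> tModel description
        self.businesses: Dict[str, dict] = {}  # key -> business registration

    def register_tmodel(self, description: str) -> str:
        key = str(uuid.uuid4())                # step 3: the registry assigns the key
        self.tmodels[key] = description
        return key

    def register_business(self, name: str, tmodel_keys: List[str]) -> str:
        key = str(uuid.uuid4())
        self.businesses[key] = {"name": name, "tmodels": list(tmodel_keys)}
        return key

    def find_businesses(self, tmodel_key: str) -> List[str]:
        """Step 4: discover businesses whose services support a given tModel."""
        return sorted(
            entry["name"]
            for entry in self.businesses.values()
            if tmodel_key in entry["tmodels"]
        )


# Hypothetical tModels and businesses, for illustration only.
uddi = BusinessRegistry()
po_key = uddi.register_tmodel("Purchase order interface")
inv_key = uddi.register_tmodel("Invoice interface")
uddi.register_business("Acme", [po_key])
uddi.register_business("Globex", [po_key, inv_key])
print(uddi.find_businesses(po_key))
```

Because businesses reference tModels by key rather than by name, two businesses that implement the same interface can be discovered together even if they describe their services differently.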
      2.11 Data and Information Reference Model (DRM)
    (embedded semantic harmonization and real data tables)




Harmonization/Standardization of Data Element and XML Tag Names
and table structure preserved for use in spreadsheets, etc.
                                                                  28
       2.12 Networked Communities of Practice (CoP) &
         Their Dynamic Knowledge Repositories (DKR)
(ONTOLOG Forum & Collaborative Expedition Workshops with CIM3)




                     http://ontolog.cim3.net/                29
                             2.13 Semantic Information Management
                                            (Unicorn)
DESIGN (Information Resource Managers)
   • Manage repository
   • Manage ontology model collaboratively
   • Semantic mapping
   Tool: Unicorn Workbench (Mapping, Modeling, Repository Management)

MANAGEMENT (IT Professionals)
   • Discover data sources
   • Impact analysis
   • Create schemas
   Tool: Unicorn Web Interface to the Unicorn Server (Information Management,
   Information Integration, Information Quality)

RUN-TIME (The Public; Health & Safety professionals; Analysts in Federal &
State Government)
   • What does this mean? Where did it come from?
   • Formulate ad hoc queries
   • Run queries across sources
   • Analyze and visualize (using 3rd party tools)
   Tools: Information Management Portal; Enterprise View Interface (Federated
   Query, Semantic EII, Powered by IBM Information Integrator)

Shared layers:
   Semantic Engine™
   Unicorn Repository
   COMMON BUSINESS LANGUAGE: Ontology Model of environment/health
   Semantic Mappings
   CATALOG: Metadata on estimated 300 sources of health and environmental data
                                                                                 30
                2.13 Semantic Information Management
                               (Unicorn)
  Semantic Mapping:
• Map each data asset once
  only as a spoke to Ontology
  Model hub
• Formal semantic mappings
  capture meaning of data in
  formal machine-readable form
• Flexibility of Mapping
   – Map all assets: relational, XML,
     legacy, etc. to same model
   – Productive mapping in two
     stages: groups (e.g., tables) and
     fields (e.g., columns)
   – Attach conditions to mapping
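
The hub-and-spoke idea can be sketched as follows. This is a minimal illustration under stated assumptions, not Unicorn's implementation: the source names, field names, ontology concepts, and the "active" condition are all hypothetical.

```python
from typing import Callable, Dict, Optional

# Each spoke maps one source's field names to concepts in the shared model.
# Source names, field names, and concepts are hypothetical.
SPOKES: Dict[str, Dict[str, str]] = {
    "permits_db": {"fac_nm": "FacilityName", "st": "StateCode"},
    "health_xml": {"SiteName": "FacilityName", "State": "StateCode"},
}

# Optional condition attached to a mapping: a record is mapped only if it passes.
CONDITIONS: Dict[str, Callable[[dict], bool]] = {
    "permits_db": lambda record: record.get("status") == "active",
}


def to_hub(source: str, record: dict) -> Optional[dict]:
    """Translate a source record into hub (ontology) concepts."""
    condition = CONDITIONS.get(source)
    if condition is not None and not condition(record):
        return None
    mapping = SPOKES[source]
    return {mapping[f]: v for f, v in record.items() if f in mapping}


def from_hub(target: str, hub_record: dict) -> dict:
    """Translate a hub record into a target source's field names."""
    reverse = {concept: fld for fld, concept in SPOKES[target].items()}
    return {reverse[c]: v for c, v in hub_record.items() if c in reverse}


row = {"fac_nm": "Plant 7", "st": "VA", "status": "active"}
hub = to_hub("permits_db", row)
print(from_hub("health_xml", hub))
```

With N sources, this needs N spoke mappings rather than a separate mapping for every pair of sources, which is the practical argument for mapping each asset once to the hub.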
                                                       31
         2.14 Federated Repository-Software Asset Reuse
                          (LogicLibrary)
         (more the “push model” than the “pull model” used by Core.Gov)
See Federal Times article at http://federaltimes.com/index.php?S=290239




Enterprise Architecture: http://www.noblestar.com/we_do/arch/arch.jsp     32
            2.14 Federated Repository-Software Asset Reuse
                             (LogicLibrary)




Logidex Demonstration Site: http://www.logidexassetcenter.com/assetcenter.jsp
                                                                         33
2.15 “Best Practices” for Networked Communities of Practice
                   (CIM3 and NextPage)




                  http://web-services.gov
                                                              34
2.15 “Best Practices” for Networked Communities of Practice
                          (NextPage)
            (CIM3 Wiki shown previously in section 2.12)




                    http://web-services.gov
                                                              35
               2.16 Community of Practice Hosting Portals
                     (Tomoye Simplify and Groove)




http://12.158.152.7/ev_en.php
Contact Guy Rogers, Chief Editor, for password at grogers@triplei.com.

Private Groove Workspace for Conference Planning, Paper Reviews, Etc.
                                                                     36
                         2.17 Ontology Production and Linking
                       (several new open source and proprietary)
                       OntoLink - Linking Ontologies & Services

• Mapping between
  OWL ontologies
  and XML Schemas
• Allow procedural
  transformations
• Generate XSLT
  transformations
• Create mapping
  services for
  ontologies

 Source: Sirin and Hendler, Semantic Web and Web Services, University of Maryland, Semantic Web and
 Agent Technology at the Maryland Information and Network Dynamics Laboratory, September 14, 2004.
                                                                                                  37
   3. Some Lessons Learned
• Metadata has evolved from:
  – Descriptions of databases that are not readily
    accessible (e.g., clearinghouse).
  – An interface to multiple databases that are
    accessible (e.g. warehouse).
  – A layer of information that makes multiple
    database integration possible (e.g., MOF).
  – A markup language for relating and linking
    multiple databases in a networked
    environment (e.g., RDF and OWL).
                                                 38
    3. Some Lessons Learned
• Registries - Repositories have evolved from:
  – Separate from the actual databases (e.g., ISO 11179)
    to integrated with DBMS (e.g., Tamino/UDDI).
  – Specific for certain artifacts and functions (e.g., XML
    Schemas) to comprehensive systems (e.g.,
    LogicLibrary).
  – Centralized (e.g., XML.Gov WG/NIST) to federated
    (e.g. Semantic Web Services).
  – Special category of software (e.g., data element
    management) to mainstream software (e.g.,
    document management systems/distributed content
    networks).
  – Simple tools (e.g., spreadsheets for data elements) to
    community of practice hosting portals (e.g., Tomoye
    Simplify, Groove, etc.).
                                                         39
    3. Some Lessons Learned
• Goals have evolved from:
  – Prevent (manage) data element naming conflicts
    (e.g., ISO 11179).
  – Support XML artifact development and versioning &
    Web Services (e.g., XML Collaborator).
  – Support semantic harmonization across different
    domains and Semantic Web Services (e.g.,
    TopBraid, OntoLink, etc.).
  – Support community of practice (CoP) development
    and networking with other CoPs (e.g., CIM3 Wiki for
    Ontolog).

                                                          40
   4. A Possible DHS Strategy
• Multiple Levels of Metadata:
  – Level 1-Coarse:
     • Some basic descriptors for all 700 or so databases.
        – E.g., Name, type, accessibility, etc.
  – Level 2-Medium:
     • Some basic metadata for say the 100 “best” databases.
        – E.g., Data dictionary, XML Schema, etc.
  – Level 3-Fine-grained:
     • Detailed metadata and/or markup for say the 10 or so
       databases to be integrated.
        – E.g. MOF, RDF, etc.
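
One way to picture the three levels is as a checklist of descriptors per level. The sketch below is an assumption for illustration only: it treats the levels as cumulative (each level includes the one before it), and the field names and the sample record are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Descriptors expected at each level, following the examples on this slide.
# Treating the levels as cumulative is an assumption of this sketch.
LEVEL_1 = ["name", "type", "accessibility"]
LEVEL_2 = LEVEL_1 + ["data_dictionary", "xml_schema"]
LEVEL_3 = LEVEL_2 + ["semantic_markup"]  # e.g., MOF or RDF
REQUIRED_BY_LEVEL: Dict[int, List[str]] = {1: LEVEL_1, 2: LEVEL_2, 3: LEVEL_3}


@dataclass
class DatabaseRecord:
    """Metadata held for one database at its assigned level."""
    level: int
    metadata: Dict[str, str] = field(default_factory=dict)

    def missing(self) -> List[str]:
        """List the descriptors this record still needs for its level."""
        return [k for k in REQUIRED_BY_LEVEL[self.level] if k not in self.metadata]


# Hypothetical Level 2 database with only its Level 1 descriptors filled in.
record = DatabaseRecord(
    level=2,
    metadata={"name": "Ports", "type": "relational", "accessibility": "internal"},
)
print(record.missing())
```

A checklist like this makes the cost of each level visible: every database gets the short Level 1 list, and only the few integration candidates carry the full set.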


                                                               41
          5. Contact Information
• U.S. Environmental Protection Agency, Office of Environmental
  Information (Office of the Chief Information Officer-CIO):
   – Enterprise Architecture Team.
   – Computer Scientist and Semantic XML Web Services Specialist.
       • 202-566-1657, niemann.brand@epa.gov.
• Interagency Working Group on Sustainable Development Indicators:
   – http://www.sdi.gov.
• CIO Council’s Architecture & Infrastructure Committee and
  Emerging Technology Subcommittee:
   – http://web-services.gov.
   – http://componenttechnology.org.
• CIO Council’s Best Practices Committee (Knowledge Management
  Working Group) and Semantic (Web Services) Interoperability
  Community of Practice:
   – http://km.gov and http://web-services.gov


                                                                    42
