
OASIS Integrating Standards for Web Services_ Business Processes


Data Services: Addressing the challenges of transformation to a knowledge-driven enterprise

  Sri Gopalan
  Booz Allen Hamilton

   Challenges of transitioning to a knowledge-
    driven enterprise
   Facets of an effective Data Services solution
   An approach to realizing Data Services
   The Way Ahead
   Questions and Comments
Challenges of transitioning to a
knowledge-driven enterprise
The current production rate of digital
information exceeds the ability to process it
    Technology research firm IDC
     determined that the world generated
     161 billion gigabytes of digital
     information last year

    [Chart: Volume of Data Creation over Time]
    Data is contained in a multitude of
     unstructured (images, video, free
     text) and structured (RDBMS, XML,
     etc.) formats
    Greater policy requirements, both
     from regulatory concerns (e.g.
     Sarbanes-Oxley, HIPAA) and
     enterprise interests (e.g. security
     constraints)
    Organizations are struggling to get a
     handle on what information they
     have, how to search for it, and how
     to protect it
Within many enterprises, there is no
consistent way to discover, access, or
share data
   [Diagram: Dept. A, Dept. B, Dept. C, and Dept. D, with disconnected systems labeled Portals, Web Application, Stand-alone Apps, ERP, Email, HTTP, XML, and Proprietary]

   Without a priori knowledge of where systems are, how to access them, and
    how to query them, users find it difficult to get all the information that they
    need
Providing Business Context to Search

   The key element to search is to provide search results
    relevant to the given business context
   While a consumer might make a request in his/her
    business context, the data providers may interpret that
    request in their own divergent business context
   [Diagram: one request, two interpretations]
       Consumer: “I need a tank…”
       Org. A (Data, Apps, Data Format): “I have scuba tanks”
       Org. B (Data, Apps, Data Format): “I have gas tanks”
Facets of an effective Data
Services solution
“Web 2.0” technologies provide
enhanced collaboration and spark
community-building activities
    HousingMaps = Google Maps +
    JobMaps = Google Maps + Indeed Job Search

   Mashups are a great example of re-purposing data, but
    they are still point-to-point and require a lot of redundant
    developer effort to create each one
Lessons Learned from Social Software
   Leverage industry strengths
       Use technologies and standards that are well supported by
        commercial and open-source tools in order to facilitate greater
   Greatest common factor approach
       Develop solutions that meet the requirements of the widest
        base of users, including those that may be technologically
        limited or resource constrained
   Evolve with the community
       Develop solutions that are flexible and adaptable enough to
        change over time and incorporate community feedback and
   Keep it Simple
       While Data Services solutions may perform very complicated
        processes in the back end, try to keep the front-end interfaces to
        them as simple and easy to work with as possible
The importance of Metadata
                The main purpose of metadata, or data about data, is to
                 speed up and enrich searching for resources
                         “What data services have information on recent financial filings?”
                         “Which data services are associated with HR data within an
                          enterprise taxonomy?”
Types of Metadata

   Syntactic
       Description: Describes the physical, syntactic markup of individual data elements (formatting, field markers)
       Examples: Datatype, Field Length, Field Name, Tag Names, Flat File Markers
   Structural
       Description: Describes the logical grouping of individual data elements (i.e. entity-attribute groupings)
       Examples: Logical schema definitions (PersonRecord: PersonName, PersonSSN, …)
   Semantic
       Description: Describes the codified meaning of data elements and their relationships, including any rules or constraints on those relationships
       Examples: Person was-born on PersonDOB, and was-born once and only once
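The three metadata types can be pictured as layers over one record. The sketch below is illustrative only: the datatypes, the field lengths, and the placement of PersonDOB inside PersonRecord are assumptions, not values from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSyntax:
    """Syntactic metadata: physical markup of a single data element."""
    field_name: str
    datatype: str      # assumed value, for illustration
    field_length: int  # assumed value, for illustration


# Structural metadata: the logical grouping of elements into an entity.
# Listing PersonDOB here is an assumption made for the example.
PERSON_RECORD = {
    "PersonRecord": [
        FieldSyntax("PersonName", "string", 64),
        FieldSyntax("PersonSSN", "string", 11),
        FieldSyntax("PersonDOB", "date", 10),
    ]
}


def field_names(entity: str) -> list[str]:
    """Return the element names grouped under an entity."""
    return [f.field_name for f in PERSON_RECORD[entity]]


def born_once_and_only_once(dobs: list[str]) -> bool:
    """Semantic metadata: a rule on the Person/PersonDOB relationship.

    A person must have exactly one distinct date of birth.
    """
    return len(set(dobs)) == 1
```

A search over this metadata can answer "which entities carry an SSN?" from the structural layer alone, without reading any person data.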
The Need for Data Discovery
   Data Discovery provides service consumer agents with a common
    facility to distribute a search for relevant information across data
    assets within the enterprise including those that are known a priori
    and those that are unexpected
   Data Discovery exposes the essential metadata of a data resource
    (e.g. id, title, summary), not the data resource itself
   Potential usage scenarios:
        A consumer can “subscribe” to a Data Discovery service to
         automatically receive streams of information about topics he/she is
         interested in from a variety of data providers he/she may or may not
         know about
        Data providers, both small and large, can more directly advertise their
         information to interested service consumer agents that they may or may not
         know about
        An analyst may request more metadata about a data resource before
         accessing it
Example Data Discovery Scenario

   [Diagram: Search Aggregator in front of Search Service #1 (DB), Search Service #2 (XML), Search Service #3, and Search Service #4 (Images)]

1.   Consumer makes discovery request
2.   Search Aggregator queries Service Discovery for relevant Search Services
3.   Search Aggregator distributes request to relevant Search Services
4.   Search Aggregator aggregates search results
5.   Search Aggregator returns all search results
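The five steps above can be sketched in a few lines. This is a minimal in-memory illustration, not an implementation of any standard: the class names, the topic-based lookup, and the de-duplication by id are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ResourceMetadata:
    """Essential metadata of a data resource, not the resource itself."""
    id: str
    title: str
    summary: str


# A Search Service is modeled as a function from keyword to metadata hits.
SearchService = Callable[[str], list[ResourceMetadata]]


def in_memory_service(records: list[ResourceMetadata]) -> SearchService:
    """Build a toy Search Service that matches the keyword against titles."""
    def search(keyword: str) -> list[ResourceMetadata]:
        return [r for r in records if keyword.lower() in r.title.lower()]
    return search


class ServiceDiscovery:
    """Hypothetical registry mapping topics to Search Services."""

    def __init__(self) -> None:
        self._services: dict[str, tuple[set[str], SearchService]] = {}

    def register(self, name: str, topics: set[str], service: SearchService) -> None:
        self._services[name] = (topics, service)

    def find(self, topic: str) -> list[SearchService]:
        return [svc for topics, svc in self._services.values() if topic in topics]


class SearchAggregator:
    def __init__(self, discovery: ServiceDiscovery) -> None:
        self._discovery = discovery

    def discover(self, topic: str, keyword: str) -> list[ResourceMetadata]:
        # Step 1 is the call itself: the consumer makes a discovery request.
        services = self._discovery.find(topic)        # step 2
        hits: list[ResourceMetadata] = []
        for service in services:                      # step 3
            hits.extend(service(keyword))
        unique = {hit.id: hit for hit in hits}        # step 4: merge duplicates
        return sorted(unique.values(), key=lambda h: h.id)  # step 5
```

The consumer only ever talks to the aggregator, so Search Services can be added or removed in the registry without changing consumer code.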
The need for Data Access and Delivery

   Once a data resource of interest has been identified via Data
    Discovery, a service consumer might want to “access” or “deliver”
    that data resource for further processing
   Data Access and Delivery capabilities provide service consumer
    agents with a common facility to synchronously fetch a data resource
    or asynchronously route it to a pre-determined endpoint
   Potential usage scenarios:
        A user at his/her workstation can directly “access” a data resource for
         detailed inspection
        A field technician on the job site can use his/her mobile device to
         “deliver” a data resource to his/her computer at work to analyze later
        Data providers can lower the cost of integration by supporting a common
         data retrieval interface that is well-understood throughout the local
         enterprise and industry
Example Data Access & Delivery Scenario

   [Diagram: Service #1 and a Callback Interface]

1.    Consumer makes data access request
2a.   Retrieve Service returns requested information
2b.   Retrieve Service forwards requested information to the Messaging Infrastructure
3a.   Messaging Infrastructure routes requested information to service consumer
3b.   Messaging Infrastructure routes requested information to service consumer receiver agent implementing a Callback Interface
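The difference between the synchronous path (1, 2a) and the asynchronous path (1, 2b, 3b) can be sketched as follows. The class and method names are hypothetical, and a plain in-process queue stands in for the Messaging Infrastructure.

```python
from collections import deque
from typing import Callable

Callback = Callable[[bytes], None]


class MessagingInfrastructure:
    """Toy stand-in for a message bus: queues payloads, routes them later."""

    def __init__(self) -> None:
        self._pending: deque[tuple[bytes, Callback]] = deque()

    def submit(self, payload: bytes, callback: Callback) -> None:
        self._pending.append((payload, callback))

    def route_pending(self) -> int:
        """Deliver every queued payload to its callback; return the count."""
        routed = 0
        while self._pending:
            payload, callback = self._pending.popleft()
            callback(payload)  # step 3b: hand off to the Callback Interface
            routed += 1
        return routed


class RetrieveService:
    def __init__(self, store: dict[str, bytes], messaging: MessagingInfrastructure) -> None:
        self._store = store
        self._messaging = messaging

    def access(self, resource_id: str) -> bytes:
        """Synchronous fetch: steps 1 and 2a."""
        return self._store[resource_id]

    def deliver(self, resource_id: str, callback: Callback) -> None:
        """Asynchronous routing: steps 1 and 2b; the caller does not wait."""
        self._messaging.submit(self._store[resource_id], callback)
```

In this sketch `deliver` returns before the payload arrives, which is what lets a mobile device request a resource and have it routed to another endpoint.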
    Major issues facing distributed
    information sharing
   Must support a number of interaction models
        Request-response, subscribe-push, probe and match,
         authenticated and/or single use of data, etc…
   Must support a variety of metadata and content formats
        Atom, Dublin Core, Images, Video, PDF, Open Document, etc…
   Different types of data lend themselves to be queried by
    different mechanisms
        XML can be natively searched with XQuery
        Images cannot be natively searched with XQuery
   Must be designed for controlled evolution
        Do not want the addition of new features to alienate current users
         through constant upgrades or revisions
        Discourage specification “lip service” by avoiding unbounded
An approach to realizing Data Services
Data Service Objectives
   Address the need to enable enterprise-wide
    data discovery and aggregation across any
    number of service implementations while
    offering the end users with relevant
   Enable horizontal discovery, access, and
    consumption of data of relevance, regardless
    of physical location, data type, and/or technical
   Support a variety of messaging patterns,
    security and policy requirements, and data
Profile-Based Approach to
achieving Data Services
   Data Services specifications should focus on capturing
    the high-level process and use-case requirements (e.g.
    the need to search against metadata and content), rather
    than the low-level realizations of those features (e.g.
    XQuery vs. keyword search)
   Abstract Data Services interface focused on defining a
    high-level construct to capture intended behaviors that
    will be implemented by pluggable profiles
       Inspired by token profiles within WS-Security
       Loosely coupled specification that enables service providers to
        add new capabilities without having to change the WSDL
       Enables service providers to only implement those profiles that
        satisfy their specific requirements
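A rough sketch of the pluggable-profile idea: the service interface stays fixed while query profiles are registered behind it. `DataService`, `plug_in`, and the "prefix" profile are hypothetical names invented for this illustration; only keyword search is named in the slides.

```python
from typing import Protocol


class QueryProfile(Protocol):
    """What every pluggable query profile must provide."""
    name: str

    def matches(self, query: str, resource: str) -> bool: ...


class KeywordProfile:
    name = "keyword"

    def matches(self, query: str, resource: str) -> bool:
        return query.lower() in resource.lower()


class PrefixProfile:
    """A second, made-up profile added later without touching DataService."""
    name = "prefix"

    def matches(self, query: str, resource: str) -> bool:
        return resource.lower().startswith(query.lower())


class DataService:
    """Abstract interface: one search operation, behavior chosen by profile."""

    def __init__(self) -> None:
        self._profiles: dict[str, QueryProfile] = {}

    def plug_in(self, profile: QueryProfile) -> None:
        self._profiles[profile.name] = profile

    def supported_profiles(self) -> list[str]:
        return sorted(self._profiles)

    def search(self, profile_name: str, query: str, resources: list[str]) -> list[str]:
        profile = self._profiles.get(profile_name)
        if profile is None:
            raise LookupError(f"unsupported query profile: {profile_name}")
        return [r for r in resources if profile.matches(query, r)]
```

A provider that only needs keyword search plugs in one profile and rejects the rest, which mirrors implementing only the profiles that satisfy its requirements.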
What are the profiles we need to
   Context – What is the business context of the data
    service operation (search, retrieve)?
       Ex. A set of taxonomy key-value pairs to search against a UDDI
   Metadata – What are the metadata formats that I would
    like to interact against?
       Ex. Dublin Core Metadata Element Set, Atom 1.0, RSS
   Content – What are the content types that I would like to
    interact with?
       Ex. PDF, Open Document, Open XML, JPEG, MPEG2
   Query – Given the type of metadata and/or content, how
    would I like to query for information?
       Ex. Keyword search, XQuery request, SPARQL requests
The combination of different “profiles”
can have measurable impact
    Data Services Request
      Metadata Profile: CriminalMetadata
        Query Profile: CriminalQL
          Find Where
            sex = “male” and
            race = “white” and
            height >= “5-09” and
            height <= “5-10”
      Content Profile: MugShotContent
        Query Profile: ImageMatch

    While “CriminalMetadata”, “MugShotContent”, “CriminalQL” and “ImageMatch”
    do not exist today, if they are introduced in the future it should
    not significantly alter the way we process requests for information
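One way to read the request above is as plain data in which profile names are opaque labels. The sketch below assumes a hypothetical `RequestPart` structure; the profile names are the slide's own placeholders and do not correspond to existing specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestPart:
    target_kind: str      # "metadata" or "content"
    target_profile: str   # e.g. a metadata or content profile name
    query_profile: str    # how the target should be queried
    query_body: str = ""  # opaque to the framework; only the profile parses it


@dataclass(frozen=True)
class DataServicesRequest:
    parts: tuple[RequestPart, ...]


def unsupported_profiles(request: DataServicesRequest, supported: set[str]) -> list[str]:
    """List the profiles a provider would need before it can serve the request."""
    missing: list[str] = []
    for part in request.parts:
        for name in (part.target_profile, part.query_profile):
            if name not in supported and name not in missing:
                missing.append(name)
    return missing


REQUEST = DataServicesRequest(parts=(
    RequestPart(
        "metadata", "CriminalMetadata", "CriminalQL",
        'Find Where sex = "male" and race = "white" '
        'and height >= "5-09" and height <= "5-10"',
    ),
    RequestPart("content", "MugShotContent", "ImageMatch"),
))
```

Because the framework never inspects `query_body`, introducing a new profile changes what a provider can answer but not how requests are structured or routed.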
Encouraging collaboration with
REST and/or SOAP
   SOAP is a protocol specification that defines a uniform
    way of passing XML-encoded data that abstracts the
    physical transport layer.
   Representational State Transfer (REST) is a set of
    architectural principles; the term loosely describes any simple
    interface that uses XML over HTTP without an
    additional messaging layer such as SOAP
   SOAP and REST are two different approaches that serve
    different needs
       In many areas the provided functionality overlaps and causes a
        bit of contention
       The two approaches, if used properly, can be complementary and
        will help to meet the overall data services needs
RESTful feeds may be appropriate for
disparate content subscriptions

      Source: RSS--Promising Technology for Building Customer Relationships
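As a small illustration of consuming a feed-based subscription, the sketch below reads id, title, and summary from an Atom 1.0 document using only the Python standard library. The sample feed and its URN are made up for the example; a real consumer would fetch the document over HTTP.

```python
import xml.etree.ElementTree as ET

# Atom 1.0 namespace, as defined by the Atom Syndication Format.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical feed content, used only for illustration.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example data feed</title>
  <entry>
    <id>urn:example:filing-1</id>
    <title>Quarterly filing</title>
    <summary>Summary of the filing</summary>
  </entry>
  <entry>
    <id>urn:example:filing-2</id>
    <title>Annual filing</title>
  </entry>
</feed>"""


def read_entries(feed_xml: str) -> list[dict[str, str]]:
    """Extract the essential metadata of each entry in an Atom feed."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM):
        entries.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM),
            "title": entry.findtext("atom:title", default="", namespaces=ATOM),
            # summary is optional in Atom, so fall back to an empty string
            "summary": entry.findtext("atom:summary", default="", namespaces=ATOM),
        })
    return entries
```

The consumer needs nothing beyond an HTTP client and an XML parser, which is why feeds suit loosely coupled content subscriptions.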
SOAP-based messages are better suited
for complex requests and messaging

   [Diagram: consumers Subscribe; Scheduled Pulls run against Retrieve Service #1 (DB), Retrieve Service #2 (XML), Retrieve Service #3, and Retrieve Service #4; results reach a Callback Interface]
Supporting standards that may help to
advance Data Services initiatives
                  Need                               Standard(s)

    Service Registry             UDDI v3, ebXML Registry

    Security/Policy Concerns     WS-Security, SAML 2.0, XACML, WS-Policy

    Notifications and Eventing   WS-Notification, WS-Eventing, WS-EventNotification

    Asynchronous Behavior        WS-Addressing

    Reliable Messaging           WS-ReliableMessaging

    Query Languages              XQuery 1.0, XPath, SPARQL

    Metadata Formats             Dublin Core, Atom 1.0

    Search Functionality         Z39.50

   There is no existing set of standards that fully supports
    the functionality of a complete Data Services solution
The Way Ahead
 OASIS Data Services Framework
 Technical Committee (OASIS DSF TC)
 Goals and objectives for the TC include:

     Collect, analyze and document the requirements for
      data management and sharing in a networked
      environment where data services lie under different
      domains of ownership and stewardship
     Aid architects in understanding the conceptual
      patterns of interaction pertaining to data oriented
     Create an abstract specification normatively
      describing a framework of operations to manage and
      retrieve data in a services environment, across
      ownership and stewardship boundaries.
     Describe service patterns and interactions between a
      provider, consumer, and other resources and entities
OASIS Data Services Framework
Technical Committee (OASIS DSF TC)
   Out of Scope Items:
       Define a mapping of the functions and elements
        described in the specifications to any programming
        language, to any particular messaging middleware, or
        to specific network transports.
       Define new key query algorithms, metadata
        specifications, or content specifications.
       Define concepts or renderings for functions that are of
        wider applicability including but not limited to:
            Addressing
            Query frameworks
            Routing
            Reliable message exchange

   The need for a distributed discovery, aggregation, and
    access mechanism is becoming more and more important
   Any Data Services solution must account for a growing
    number of metadata specifications, content formats, and
    query mechanisms
   WS-Security demonstrates that a profile-based solution
    can meet the diverse needs of a community
   OASIS Data Service Framework TC will identify and fill
    the gaps to achieve a complete Data Services solution
Questions and Comments
