
Federated Service Oriented Information Management


      Ahmet Sayar
      asayar@cs.indiana.edu
    Introduction
   Aim: Develop a general, Grid-architecture-based approach to managing
    distributed, heterogeneous data, information and knowledge, provided
    by different repositories and producers, in an efficient and robust
    manner.
   Challenges in
        Representing,
        Transforming,
        Integrating and
        Displaying
    of
      Data
     Information/knowledge
    for decision makers in scientific application domains.
   Methodology:
        Create “Federated Service Oriented Information Management
         architecture” for the GIS domain based on OGC (Open Geospatial
         Consortium) specifications.
        Determine the requirements for the generalization of the architecture for
         other domains (Chemistry).
    Motivation
   SOA based on Grid or Web Services
   We use DIKW to describe the hierarchy of Data-Information-
    Knowledge-Wisdom that we are attempting to support
   “Filter Services” are Information Sources:
        A service inputs DIKW from other Grids or Services and outputs DIKW
         – perhaps converting data to information etc.
        Web Services are easy to extend and federate.
        Easy to publish, locate and bind.
        Predictable input/output interfaces defined by metadata
   A repository or sensor has or gets DIKW from "outside the Grid" and
    outputs DIKW; these are “just” filters whose output is Grid-
    compatible DIKW as messages or message streams
   Information management through ASIS (Application Specific
    Information System) framework in Science Domains.
   Data and metadata concepts and formats
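
   The filter idea above can be sketched in a few lines. This is only an
    illustration, assuming that DIKW messages can be modeled as small typed
    records; the names Message, sensor, threshold_filter and chain are
    hypothetical and do not come from the architecture itself.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Message:
    """One DIKW message (hypothetical shape): a level tag plus a payload."""
    level: str       # e.g. "data" or "information"
    payload: object


def sensor(readings: Iterable[float]) -> Iterator[Message]:
    """Source filter: wraps outside-Grid readings as Grid-compatible messages."""
    for reading in readings:
        yield Message("data", reading)


def threshold_filter(limit: float):
    """Build a filter that converts data messages into information messages."""
    def run(messages: Iterable[Message]) -> Iterator[Message]:
        for message in messages:
            yield Message(
                "information",
                {"value": message.payload, "alert": message.payload > limit},
            )
    return run


def chain(source: Iterable[Message], *filters) -> Iterable[Message]:
    """Federate filters by feeding each one the output stream of the previous."""
    stream = source
    for apply_filter in filters:
        stream = apply_filter(stream)
    return stream


result = list(chain(sensor([1.2, 5.7, 3.3]), threshold_filter(3.0)))
```

   Because every stage consumes and produces the same message type, a new
    filter can be added to the chain without changing the others.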
GIS – OGC (Motivation Domain) (1)
   Geographic Information System (GIS) is a
    system for creating and managing spatial data
    and associated attributes.
   OGC (Open Geospatial Consortium): its goal is
    to make geographic information and services
    neutral and available across any network,
    application, or platform.
   Challenges (valid for any science domain)
     Distributed nature of geospatial data.
     Proprietary data formats, and service methodologies.
     Lack of interoperable services.
     Assembling data from distributed sources
     Format conversions
     Amount of resources for geoprocessing
GIS – OGC (Motivation Domain) (2)

   GML: Geography Markup Language
   WFS: Web Feature Server
     Provides vector data such as rivers and state and city
      boundaries in GML.
   WCS: Web Coverage Server
     Provides coverage (raster) data: gridded data, pixel information.
   WMS: Web Map Server
     Provides map images in formats such as jpeg, svg and png, as
      defined in its capabilities file.
   WMS’: Cascading Web Map Server
     Provides data in the form of layers in images. It is
      cascading because it provides other WMS layers as if
      they were its own.
    Information Management Architecture
    in GIS Domain (Sample Scenario)
   Query : No Standard – Filter specification –
    query on vector data by WFS using SQL
   Data Encodings : GML, images
   Metadata : Structured Capability doc in XML.
   No event notification – WS-Context for
    asynchronous run.
   Registry : WRS – we call it MD.

   [Figure: sample GIS scenario. Labels: WFS (CGL) with vector data
    (Data:a); WCS (Minnesota) with raster data (Data:b); WMS (NASA) with
    Data:b and Data:c; cascading WMS’ (CGL) offering Data:a, Data:b and
    Data:c; MD registry holding capabilities metadata. Inset "Filter
    Container": Filtering Module (Core Service) surrounded by Service
    interfaces, Data Handling, Discovering and Publishing. Consumers:
    interactive tools, decision support.]
    From Raw Data to Information / Knowledge
   Raw Data → GML
    (WFS in Filter - ASFS)
   GML → Map image
    (WMS in Filter - ASVS)
   Each filter provides data in
    a consistent format.
   Formats should be
    consistent with the system's
    data model, GML
   Any Data → Common Data
    Model
   Data Model is XML based
    hierarchical data
       Portable across
            Languages
            Operating systems

   [Figure: filter chain. Raw data or any data (database) enters ASFS
    (Core), which outputs structured data; ASVS (Core) takes any data and
    outputs structured data for Interactive Decision Support. Each filter
    is drawn with Capabilities Meta-data, Domain Knowledge, Service
    interfaces, Data Handling, Discovering, Data Modeling and Publishing.]
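
   As a rough illustration of the "Any Data → Common Data Model" step, the
    sketch below turns flat records into a hierarchical XML document. It
    assumes records arrive as Python dictionaries; the element names
    (FeatureCollection, Fault) and the sample values are made up for the
    example and are not real GML.

```python
import xml.etree.ElementTree as ET


def to_common_model(feature_type, records):
    """Serialize flat records into one XML tree, one child element per record."""
    root = ET.Element("FeatureCollection")  # hypothetical root name, not GML
    for record in records:
        feature = ET.SubElement(root, feature_type)
        for key, value in record.items():
            # Every attribute of the record becomes a child element.
            ET.SubElement(feature, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Illustrative values only.
rows = [{"name": "SampleFault", "lat": 35.1, "lon": -119.6}]
xml_text = to_common_model("Fault", rows)
```

   Any consumer with an XML parser can read the result, which is the sense
    in which the data model is portable across languages and operating
    systems.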
Interactive Decision Support Tools
                http://virtualsky.org (R. Williams et al.)
- Interactive query,
- Interactive display, movie and animation
- Integration to Application Science Simulations




Application Use Domains
   ServoGrid Projects (GIS)
     Pattern Informatics (PI)
     GeoFest
       Virtual California (VC)
   Los Alamos National Labs (LANL)
       IEISS (the Interdependent Energy Infrastructure
        Simulation System)
            Models infrastructure networks (e.g. electric power systems
             and natural gas pipelines) and simulates their physical
             behavior and the interdependencies between systems.

   Chemistry and Astronomy (Future)
       CML (Chemistry Markup Language) representation of
        molecules. VOTable (Virtual Observatory Table format)
     Problem Recognition
   [Figure: many heterogeneous, database-backed sources (vector data,
    coverage data, bitmap data, binary data, XML data, jpeg images, netCDF,
    HDF5, plot images, bar graphs, statistics data) shown against the
    hierarchy Raw Data, Data, Information, Knowledge, Wisdom/Decisions,
    with Interactive Tools at the decision end.]
Problem Recognition (cont.)
   Services like discovery and notification do not need to be made
    application specific.
   BUT if the domain changes, then:
      choices,
        database requirements,
        data format,
        core service requirements,
        attributes, and
        metadata context
    all CHANGE!
   What are the common concepts and characteristics for
        data,
        metadata,
        query language,
        services, and
        communication language,
    in order to derive information/knowledge from the heterogeneous
     data/information sources in any application domain?
    Generalization of Service Oriented
    Information Management Architecture

   GIS has specifications based on standards
    such as OGC and ISO/TC 211, but many other domains do not.

   GIS              →  ASIS              (Science Domain)
   GML              →  ASL               (Representing)
   WFS              →  ASFS              (Storing-Resource)
   WMS              →  ASVS              (Displaying)
   Capa.xml         →  Metadata          (Integrating)
   SOAP over HTTP   →  SOAP over HTTP    (Communication Protocol)
 Generalization - Overall Structure Solution
    ASL : Application Specific Language. XML based
     hierarchical data representation format.
        Cross language, platform and operating system
    ASVS : Application Specific Visualization System
        Last filter before the decision maker.
        Provides information/knowledge in human readable formats
    ASFS : Application Specific Feature Service.
        Stores and provides common data model (ASL)
    Treat binary and common data (in ASL) differently.
    [Figure: an AS Repository and an AS “Sensor” feed ASFS; messages using
     ASL then pass through AS Tool (generic), AS Service (user defined),
     AS Tool (generic) and ASVS to the Display.]
ASFS and ASVS in SOA
Interfaces, querying, metadata and data model
                    ASFS                                   ASVS
    Routine          Return type           Routine              Return type

    GetCapability    Capability file XML   GetCapability        Capability file XML
    DescribeData     XML schema            GetVis               Images (svg, png, ...)
    GetData          ASL                   GetDataInformation   HTML, text, XML


     Each routine is published in the WSDL. It is invoked with a request
      that follows a predefined request schema and is placed in the SOAP body.

     Request:
         <request>
            <GetCapability/>
         </request>

     The same request in a SOAP envelope:
         <SOAP:Envelope>
            <SOAP:Body>
               <request>
                  <GetCapability/>
               </request>
            </SOAP:Body>
         </SOAP:Envelope>
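
     A minimal sketch of how a client might build such an envelope, assuming
      SOAP 1.1 and the simplified request shape shown on this slide. The
      helper name wrap_request is hypothetical; a real client would normally
      generate the request from the service's WSDL instead.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace; the slide's "SOAP:" prefix is bound to it here.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def wrap_request(routine):
    """Place <request><routine/></request> inside a SOAP Envelope/Body."""
    ET.register_namespace("SOAP", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, "request")
    ET.SubElement(request, routine)  # e.g. GetCapability, GetData, GetVis
    return ET.tostring(envelope, encoding="unicode")


envelope_text = wrap_request("GetCapability")
```

     The resulting string would then be sent as the body of an HTTP POST to
      the service endpoint, since the architecture uses SOAP over HTTP.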
Sample Capabilities File (highly simplified) – GIS Domain
    <?xml version='1.0' encoding="UTF-8" standalone="no" ?>
    <!DOCTYPE WMT_MS_Capabilities SYSTEM "http://toro.ucs.indiana.edu:8086/xml/capabilities.dtd">
    <Capabilities version="1.1.1" updateSequence="0">
       <Service>
          <Name>CGL_Mapping</Name>
          <Title>CGL_Mapping WMS</Title>
          <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple"
                          xlink:href="http://toro.ucs.indiana.edu:8086/WMSServices.wsdl" />
          <ContactInformation>
             …..
          </ContactInformation>
       </Service>
       <Capability>
          <Request>
             <GetCapabilities>
                <Format>WMS_XML</Format>
                <DCPType><HTTP><Get>
                   <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple"
                                   xlink:href="http://toro.ucs.indiana.edu:8086/WMSServices.wsdl" />
                </Get></HTTP></DCPType>
             </GetCapabilities>
             <GetMap>
                <Format>image/GIF</Format>
                <Format>image/PNG</Format>
                <DCPType><HTTP><Get>
                   <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple"
                                   xlink:href="http://toro.ucs.indiana.edu:8086/WMSServices.wsdl" />
                </Get></HTTP></DCPType>
             </GetMap>
          </Request>
          <Layer>
             <Name>California:Faults</Name>
             <Title>California:Faults</Title>
             <SRS>EPSG:4326</SRS>
             <LatLonBoundingBox minx="-180" miny="-82" maxx="180" maxy="82" />
          </Layer>
       </Capability>
    </Capabilities>
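
    A client typically reads a capabilities document to learn which layers
     and output formats a service offers. The sketch below shows one way to
     do that with Python's standard library. It works on a trimmed copy of
     the sample (no DOCTYPE, Service section omitted), and the function name
     summarize is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample capabilities document.
CAPABILITIES = """<Capabilities version="1.1.1">
  <Capability>
    <Request>
      <GetMap>
        <Format>image/GIF</Format>
        <Format>image/PNG</Format>
      </GetMap>
    </Request>
    <Layer>
      <Name>California:Faults</Name>
      <SRS>EPSG:4326</SRS>
      <LatLonBoundingBox minx="-180" miny="-82" maxx="180" maxy="82" />
    </Layer>
  </Capability>
</Capabilities>"""


def summarize(capabilities_xml):
    """Return (GetMap formats, {layer name: (minx, miny, maxx, maxy)})."""
    root = ET.fromstring(capabilities_xml)
    formats = [f.text for f in root.findall("./Capability/Request/GetMap/Format")]
    layers = {}
    for layer in root.iter("Layer"):
        box = layer.find("LatLonBoundingBox")
        layers[layer.findtext("Name")] = tuple(
            float(box.get(key)) for key in ("minx", "miny", "maxx", "maxy")
        )
    return formats, layers


formats, layers = summarize(CAPABILITIES)
```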
Sample Scenario for ASIS
   [Figure: sample ASIS scenario. Interactive Tools see layers A, B, C, D
    and E and issue GetVis(A,E) to an ASVS. ASVS and ASFS instances
    advertise layer sets (A,B,C; B,C; D; D,E,F) through their Capabilities
    Meta-data, and data is fetched from ASFS instances with GetData(A).
    Each ASVS and ASFS is drawn as a filter: a Core with Service interfaces,
    Data Handling, Discovering, Data Modeling and Publishing components,
    plus Capabilities Meta-data and Domain Knowledge, turning any data into
    structured data.]
                                                                                                                                                                                                                                                                                                                                                                                                            Data
                                                                Domain                                              GetVis(E)                                                                                                                                                Meta-data
                                                                                                                                                                                                                                                                                                                                                                                                            B,C
                         F                                     Knowledge
                                                                                                                                                                                                                                                                                     Domain
                                                                                                      Discovering
                               Service inte




                                                                                                                                                                                                                                                                                    Knowledge




                                                             E
                                Data Handlin




                                                            E,F




                                                                                                                                                                                                                                                                                                                   Discovering
                                                                                                                                                                                                                                           Service inte
                                                     Any      ASVS                      Structured




                                                                                                                                                                                                                                            Data Handlin
                                                                                                                                                                                                                                                                                    F
                                                     Data                                  Data
                                                                (Core)
                                                                                                                                                                                                                                                                   Any
                                                                                                                                                                                                                                                                   Data
                                                                                                                                                                                                                                                                                ASFS                Structured
                                            rfaces




                                                                                                                                                                                                                                                                                                       Data
                                                                                                                                                                                                                                                                                    (Core)
                                               g




                                                                                                                                                                                                                                                        rfaces
                                                               Data Modeling




                                                                                                                                                                                                                                                           g
                                                                Publishing
                                                                                                                                                                                                                                                                                Data Modeling

                                                                                                                                                                                                                                                                                                                                                               Data
                                                                                                                                                 Data                                                                                                                               Publishing
                                                                                                                                                                                                                                                                                                                                                                F
                                                                                                                                                  E




   Static linking of filters: capability aggregation is done at startup. Later requests will be created based on the aggregated capabilities.
   Each filter publishes its data through its capability file.
   GetCapability requests cycle through the "GetCapabilities" interfaces of the filters.
   Successive requests from client tools are not involved. These request chains are created before.
   A client that needs to visualize Data A and E makes a GetVis request to ASVS with specific attributes, based on the capabilities that the filters returned for querying. GetVis is defined in a published schema file.
Overall Structure Solution -cont

   Common data (ASL) is kept in ASFS with query capability.
   In a given domain every filter speaks in ASL.
   Filters (ASVS, ASFS) keep their metadata locally.
   ASVS both visualize information and provide a way of navigating
    the ASFS and their underlying databases.
   An ASVS can itself be federated and present an output interface.
   Dynamic metadata update via MD services or P2P metadata
    exchange.
   Utilizing data/information at the application level via filters
       ASFS provide ASL.
       ASVS provide human-readable information such as text, graphs
        (scalable vector graphics (SVG) or portable network graphics (PNG)) and images.
       Filters have common ports and interfaces
            Enable chaining for more complex data and information creation.
       Filters are easily published, located and invoked over the internet.
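
The filter ideas above (common ports, a capability document per filter, chaining) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `FeatureService` and `VisualizationService`, the `get_capabilities` and `handle` methods, and the dictionary-based stand-in for ASL records are hypothetical, not the OGC interfaces or the author's implementation.

```python
from dataclasses import dataclass
from typing import Protocol


class Filter(Protocol):
    """Hypothetical common interface that every filter exposes."""

    def get_capabilities(self) -> dict: ...

    def handle(self, request: dict) -> object: ...


@dataclass
class FeatureService:
    """Stand-in for an ASFS: answers queries with ASL (here, plain dicts)."""

    name: str
    features: list

    def get_capabilities(self) -> dict:
        layers = sorted({f["layer"] for f in self.features})
        return {"name": self.name, "kind": "ASFS", "layers": layers}

    def handle(self, request: dict) -> list:
        return [f for f in self.features if f["layer"] == request["layer"]]


@dataclass
class VisualizationService:
    """Stand-in for an ASVS: pulls ASL from upstream filters and renders text."""

    name: str
    upstream: list

    def get_capabilities(self) -> dict:
        # Aggregate the capabilities published by the upstream filters.
        layers = sorted(
            {layer for up in self.upstream
             for layer in up.get_capabilities()["layers"]}
        )
        return {"name": self.name, "kind": "ASVS", "layers": layers}

    def handle(self, request: dict) -> str:
        # Chain the request to every upstream filter and merge the results.
        rows = [f for up in self.upstream for f in up.handle(request)]
        return "\n".join(f"{f['layer']}: {f['id']}" for f in rows)


faults = FeatureService("faults", [{"layer": "fault", "id": "f1"},
                                   {"layer": "fault", "id": "f2"}])
quakes = FeatureService("quakes", [{"layer": "seismic", "id": "q1"}])
viewer = VisualizationService("viewer", [faults, quakes])

print(viewer.get_capabilities()["layers"])
print(viewer.handle({"layer": "fault"}))
```

The viewer never stores feature data itself; it only knows which upstream filters to ask, which is what lets more complex chains be built from the same two method signatures.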
Applicability to
Different Science Domains
   How well do the service definitions in the proposed
    architecture match general science domains?

                            |------- Filters -------|
Domain      ASL             ASFS      ASVS                        Metadata
GIS         GML             WFS       WMS                         capability.xml schema
Astronomy   VOTable, FITS   SkyNode   VOPlot, TopCat              VOResource
Chemistry   CML             NO        NO standard (JChemPaint)    NO
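
The mapping in the table can also be held as a small lookup structure, which makes the gaps for a new domain easy to list. A minimal sketch, assuming the role names `asl`, `asfs`, `asvs` and `metadata` as dictionary keys (these key names and the `missing_roles` helper are hypothetical); the entries are copied from the table, with JChemPaint recorded as a tool rather than a standard.

```python
ROLES = ("asl", "asfs", "asvs", "metadata")

# Entries copied from the table; an empty list means "NO" (nothing available).
DOMAINS = {
    "GIS": {
        "asl": ["GML"],
        "asfs": ["WFS"],
        "asvs": ["WMS"],
        "metadata": ["capability.xml schema"],
    },
    "Astronomy": {
        "asl": ["VOTable", "FITS"],
        "asfs": ["SkyNode"],
        "asvs": ["VOPlot", "TopCat"],
        "metadata": ["VOResource"],
    },
    "Chemistry": {
        "asl": ["CML"],
        "asfs": [],
        "asvs": [],  # no standard; JChemPaint is listed as a tool only
        "metadata": [],
        "tools": ["JChemPaint"],
    },
}


def missing_roles(domain: str) -> list:
    """Return the architecture roles that have no entry for a domain."""
    entry = DOMAINS[domain]
    return [role for role in ROLES if not entry.get(role)]


for name in DOMAINS:
    print(name, missing_roles(name))
```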
Research Issues (1)
   Requirements for the domain metadata in the
    capability document
     What do capabilities do, and what do they need to contain, to
      federate filters?
   Requirements for the ASL (such as CML, GML)
     What does the ASL need to contain to federate the filters?
   Concept of data (such as feature, coverage)
     Is a common representation possible? To what extent?
   A common information management framework
    that can be applied to any domain.
     Some instructions, for any field, on what needs to be done.
Research Issues (2)
   Application level data/information federation.
   Integrating the system with application science
    simulations.
   Creating interactive decision support tools
    utilizing integrated filter services.
     Tools for map animation, map movies, images
     Interactive query support to get further information on
      the image and/or animation.
   Enabling binding of services into pipelines with
    or without human intervention through metadata.
   Caching and load balancing to handle large
    scientific data in an efficient and robust manner
    (application based).
Related Work
SRB (Storage Resource Broker)

   SRB
     Uniform access to distributed heterogeneous data
      resources by attributes.
     Catalog service is MCAT (Metadata Catalog).
     Resource and data location transparency.
     Remote authentication and authorization – user groups.
     Not just for access, but also for transferring and replicating.
     Sample projects using SRB: BIRN and IVOA.
   Summary
     Other important activities include digital library projects and
      NGAS (Next Generation Archive System) from ESO.
     We will research these important activities further, identify
      key architecture ideas and incorporate the lessons.
     SRB can be leveraged in ASIS.
Related Work -Cont
OGSA-DAI

   OGSA-DAI
     Open Grid Services Architecture – Data Access and
      Integration.
     Access to heterogeneous data via common interfaces
      on the grid.
     Catalog service is MCS (Metadata Catalog Service).
     OGSI-compliant Grid.
     Components are Grid services. Resources should be
      registered.
     Sample projects using OGSA-DAI: LEAD, myGrid.
   Summary
     OGSA-DAI emphasizes the database layer, whereas we
      are tackling application-specific DIKW.
     OGSA-DAI can be leveraged in ASIS.
Contributions

   Instructions on how to build the ASL and the capability
    metadata for the application sciences.
   Instructions on how to build an application specific
    information system (ASIS) that federates multiple filters
    speaking ASL.
   Information grid (ASIS) formalization through
    capabilities metadata, defining all the
    data/information sources as interacting Web Service
    filters with standard metadata service ports.
   Optimization and enhancement of distributed
    heterogeneous information management.


THANKS

asayar@cs.indiana.edu
Ahmet Sayar

APPENDIX




Literature Survey

OGSA-DAI
SRB

Discussions on SRB & Ogsa-DAI

   SRB
       Monolithic – does too much
       MCAT dependent
       MCAT has limited support for application-level metadata
            Needs different metadata for different domains, and extensions for applications
       Not standards based – not open source
       Does not handle data based on the DIKW hierarchy
   OGSA-DAI
       Works at the data and database level
       MCS dependent
       MCS has limited support for application-level metadata
            Needs different metadata for different domains, and extensions for applications
       For Grid applications – GGF standards
       Data only in relational and XML databases or ordinary files
       Does not handle data based on the DIKW hierarchy
Our Work Compared to SRB & Ogsa-DAI (1)
   Each filter has its own metadata
       Distributed metadata handling
            Peer to peer
            Through MD services
   They provide heterogeneous data access and federation
    through central metadata services
       SRB MCAT and OGSA-DAI MCS
   Our main motivation is sharing and interpreting the data and
    information, and extracting knowledge from it.
   Their motivation is storing, accessing and updating
    heterogeneous data.
   We leverage their power and usability in our federated
    service oriented information management architecture.
   They are not competitors but complements.
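
The contrast between per-filter metadata and a central catalog can be made concrete with a small sketch. It assumes a hypothetical `MetadataService` that stores only references to filters and rebuilds its index from each filter's own capability on request; `FilterNode`, `register` and `aggregate` are illustrative names, not part of SRB, OGSA-DAI or the OGC specifications.

```python
from dataclasses import dataclass, field


@dataclass
class FilterNode:
    """A filter that keeps its own metadata locally."""

    name: str
    layers: list = field(default_factory=list)

    def get_capabilities(self) -> dict:
        return {"name": self.name, "layers": list(self.layers)}


class MetadataService:
    """Hypothetical MD service: holds filter references, not metadata copies."""

    def __init__(self) -> None:
        self._filters = []

    def register(self, node: FilterNode) -> None:
        self._filters.append(node)

    def aggregate(self) -> dict:
        # Ask every filter for its current capability and build
        # a layer -> [filter names] index on the fly.
        index = {}
        for node in self._filters:
            for layer in node.get_capabilities()["layers"]:
                index.setdefault(layer, []).append(node.name)
        return index


asfs_a = FilterNode("asfs-a", ["fault"])
asfs_b = FilterNode("asfs-b", ["fault", "seismic"])

md = MetadataService()
md.register(asfs_a)
md.register(asfs_b)

before = md.aggregate()
asfs_a.layers.append("gps")   # the filter updates its own metadata
after = md.aggregate()        # the next aggregation reflects the change

print(before)
print(after)
```

Because the index is rebuilt from the filters themselves, a metadata change made at one filter shows up without any write to a central catalog.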
       Our Work Compared to SRB & Ogsa-DAI (2)
   SRB and OGSA-DAI
       Wisdom decisions, knowledge and information extraction by the user
       Central data access abstraction: uniform access to heterogeneous data sources
       Metadata: SRB/MCAT, OGSA-DAI/MCS
       Both provide an extensible metadata architecture for different domains
       SRB has a "zone" concept that addresses similar issues but in a different way
   Our work
       Wisdom decisions, ready to use information and knowledge
       Reusable components: Filter Services with specific ports and interfaces
       Distributed DIKW abstraction
       Metadata in capability document
       Metadata aggregators
       New metadata for different domains
       Smart data querying
       Web Services based SOA (advantages)
   [Figure: layered comparison over resources (R). Layers, bottom to top: data access and query, information/knowledge, wisdom decisions. Labeled components: MCAT, MasterSRB, SRB Agents, GDSReg, Ogsa-GDSF, Ogsa/GDS, ASFS, ASVS, Interactive Tools.]
Why are we different?
Federated Service Oriented Information Management
   SOA (Service Oriented Architecture)
       Easy to extend
       Reusable components
       Cross-platform and cross-language.
       XML based hierarchical data representation
            Easy data integration
            Easy querying
            Human readable information
   Easy to access data – no command line
       Interactive tools
       On the fly query creation.
   Not only accessing data but also transforming it along its
    path to end users.
   Ports to integrate application simulations into the application
    specific information system (ASIS)
       Integrating application simulation data/information with ASIS
        outputs
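
The "easy querying" point for XML based hierarchical data can be illustrated with Python's standard `xml.etree.ElementTree` module. The document below is a made-up, GML-like fragment (the element and attribute names are assumptions for illustration, not real GML), queried with the limited XPath syntax that ElementTree supports.

```python
import xml.etree.ElementTree as ET

# Hypothetical hierarchical data; not a real GML document.
DOC = """
<collection>
  <feature kind="fault"><name>Alpha</name><length_km>12.5</length_km></feature>
  <feature kind="fault"><name>Beta</name><length_km>48.0</length_km></feature>
  <feature kind="station"><name>S1</name></feature>
</collection>
"""

root = ET.fromstring(DOC)

# Select features by attribute with an XPath predicate.
faults = root.findall("feature[@kind='fault']")
fault_names = [f.findtext("name") for f in faults]

# Refine the selection on a child element's value.
long_faults = [
    f.findtext("name") for f in faults
    if float(f.findtext("length_km")) > 20.0
]

print(fault_names)
print(long_faults)
```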
    An Example of Other Domains:
    Astronomy Domain (IVOA Standards)
   FS-1 : VOPlot
        Integrating, interacting visualization tools
   FS-2 : SkyNode
        ADQL based SOAP interface returning VOTable based results
   FS-3 : SIA
        2D sky projection, logically a grid of pixels encoded as a FITS image
   FS-4 : SSA
        URL-based, returning a dataset "document" (VOTable)
   Query : ADQL (extension of SQL)
   Data Encoding : VOTable, FITS
   Metadata : UCD, VOResource
   Event notification : VOEvent
   Registry : VORegistry
   Queryable data in : SSAP, SIAP and VOStore
   [Figure: filter services FS-1 to FS-4 with their databases (DB) and a metadata service (MD), feeding a PORTAL for interactive decision support; flows are labeled "Data" and "capability".]

								