GridSurveyJune06.ppt - Community Grids Lab

					    Grid System Issues

            MSI-CI2 Meeting
            June 29 2006

             Geoffrey Fox

 Computer Science, Informatics, Physics
   Pervasive Technology Laboratories
Indiana University Bloomington IN 47401

                Topics Covered
   General Issues: Relation to P2P
   Types of Grids
   Why use Service Oriented Architectures
   Multi-core Chips
   All the world’s services
   Workflow
   Metadata and State
   Sensors and Filters
   SOAP MPI and Communication Performance
   Grids of Grids
   Community Tools

    Web services
   Web Services build distributed applications (wrapping existing codes and
    databases) based on the SOA (service oriented architecture) principles.
   Web Services interact by exchanging messages in SOAP format
   The contracts for the message exchanges that implement those
    interactions are described via WSDL interfaces.
   [Figure: Devices, Databases and Computational resources; service logic in
    BPEL, Java, .NET; message processing with SOAP and WSDL; SOAP messages of
    the form <env:Envelope> ... <env:Body> </env:Body>]
             A typical Web Service
   In principle, services can be in any language (Fortran .. Java ..
    Perl .. Python) and the interfaces can be method calls, Java RMI
    Messages, CGI Web invocations, totally compiled away (inlining)
   The simplest implementations involve XML messages (SOAP) and
    programs written in net friendly languages like Java and Python

   [Figure: Web Services for Payment (Credit Card), Security, Catalog and
    Warehouse (Shipping), each a Service joined to the others through WSDL interfaces]
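A minimal sketch of the "XML messages (SOAP)" case, assuming a SOAP 1.2 envelope and Python's standard xml.etree.ElementTree module. The catalog namespace and the LookupItem and ItemId element names are hypothetical, invented here to echo the Catalog service in the figure; they are not taken from any WSDL in the slides.

```python
import xml.etree.ElementTree as ET

# SOAP 1.2 envelope namespace (the "env" prefix in the message sketch).
SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"
# Hypothetical application namespace, invented for this illustration.
CATALOG_NS = "http://example.org/catalog"


def build_request(item_id: str) -> bytes:
    """Client side: wrap a hypothetical catalog lookup in a SOAP envelope."""
    ET.register_namespace("env", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    lookup = ET.SubElement(body, f"{{{CATALOG_NS}}}LookupItem")
    ET.SubElement(lookup, f"{{{CATALOG_NS}}}ItemId").text = item_id
    return ET.tostring(envelope, encoding="utf-8")


def read_item_id(message: bytes) -> str:
    """Service side: pull the item id back out of the message body."""
    root = ET.fromstring(message)
    path = (
        f"{{{SOAP_ENV}}}Body/"
        f"{{{CATALOG_NS}}}LookupItem/"
        f"{{{CATALOG_NS}}}ItemId"
    )
    return root.find(path).text


request = build_request("A-17")
print(read_item_id(request))
```

The point of the sketch is that the two sides share only the message format, so either could be rewritten in another language without the other noticing.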
Philosophy of Web Service Grids
   Much of Distributed Computing was built by natural
    extensions of computing models developed for sequential
   This leads to the distributed object (DO) model represented
    by Java and CORBA
     • RPC (Remote Procedure Call) or RMI (Remote Method
       Invocation) for Java
    Key people think this is not a good idea as it scales badly
    and ties distributed entities together too tightly
     • Distributed Objects Replaced by Services
   Note CORBA was considered too complicated in both
    organization and proposed infrastructure
     • and Java was considered as “tightly coupled to Sun”
     • So there were other reasons to discard
   Thus replace distributed objects by services connected by
    “one-way” messages and not by request-response messages
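The contrast between request-response calls and one-way messages can be sketched as follows. This is an illustration only: the Service and Warehouse classes and their method names are hypothetical, and an in-process queue stands in for a real network transport.

```python
import queue


class Service:
    """A service with an inbox; sending never waits for a reply."""

    def __init__(self, name):
        self.name = name
        self.inbox = queue.Queue()
        self.log = []

    def send(self, target, payload):
        # One-way: enqueue the message and return at once.
        target.inbox.put({"from": self.name, "payload": payload})

    def process_pending(self):
        # Drain the inbox, recording and handling each message in order.
        while not self.inbox.empty():
            message = self.inbox.get()
            self.log.append(message)
            self.handle(message)

    def handle(self, message):
        pass  # default: just record the message


class Warehouse(Service):
    def __init__(self, name, shipping):
        super().__init__(name)
        self.shipping = shipping

    def handle(self, message):
        # Any follow-up is another one-way message, here to a third service
        # rather than back to the original sender.
        self.send(self.shipping, f"ship {message['payload']}")


shipping = Service("shipping")
warehouse = Warehouse("warehouse", shipping)
catalog = Service("catalog")

catalog.send(warehouse, "order-1")
warehouse.process_pending()
shipping.process_pending()
print(shipping.log)
```

Because the sender holds no reference to a return value, the catalog service is not tied to how or when the warehouse acts, which is the loose coupling the slide argues for.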
        Some ideas to Remember
   Grids are managed Web Services exchanging Messages
   P2P Networks are differently managed and architected
    services exchanging messages
   Any computer operation involves messages; not all
    these messages can be isolated
    • With services all messages are explicit and can be examined
   Grid Services extend WS-* Web Service Specifications
   Web Service container replaces computer
   Service replaces process
   A stream is an ordered set of messages
   Service Internet replaces Internet: messages replace
   (Sub)Grids replace Libraries
    Internet Scale Distributed Services
   Grids use Internet technology and are distinguished by
    managing or organizing sets of network connected resources
     • Classic Web allows independent one-to-one access to
       individual resources
     • Grids integrate together and manage multiple Internet-
       connected resources: People, Sensors, computers, data
   Organization can be explicit as in
     • TeraGrid which federates many supercomputers;
     • Information Retrieval Grid which federates multiple data
     • CrisisGrid which federates first responders, commanders,
       sensors, GIS, (Tsunami) simulations, science/public data
   Organization can be implicit as in Internet resources such as
    curated databases and simulation resources that “harmonize a
    community”
Typical Grid Architecture
  Each Blob is a
  [Figure: Portal Services and User Services; an Application Service surrounded
   by System Services; Raw (HPC) Resources and Database Resources]
Classic Grid Architecture

  [Figure: Database resources; a Middle Tier of Brokers and Service Providers
   with Composition, Content Access, Security and Netsolve; Clients, Users and Devices]
  Middle Tier becomes Web Services
                Peer to Peer Grid
  A democratic organization
  [Figure: Database and Peers, with Service Facing Web Service Interfaces,
   three boxes labelled "Event/" and User Facing Web Service Interfaces]
       Different Visions of the Grid
   e-Science or Cyberinfrastructure are virtual organization Grids
    supporting global distributed engineering and science research
    (note sensors, instruments and people are all distributed)
   Utility Computing or X-on-demand (X=data, computer ..) is a
    major computer Industry interest in Grids and this is a key part of
    enterprise or campus Grids
   Skype (Kazaa) VOIP system is a Peer-to-peer Grid (and
    VRVS/GlobalMMCS like Internet A/V conferencing are
    Collaboration Grids)
   DoD’s vision of Network Centric Computing can be considered a
    Grid (linking sensors, warfighters, commanders, backend
    resources) and they are building the GIG (Global Information Grid)
   Commercial 3G Cell-phones and DoD ad-hoc network initiative
    are forming mobile Grids
   Grids support universal Globalization in life, fun, research,
    e-moreorlessanything and the Grid
   e-Business captures an emerging view of corporations as
    dynamic virtual organizations linking employees, customers
    and stakeholders across the world.
     • The growing use of outsourcing is one example
   e-Science is the similar vision for scientific research with
    international participation in large accelerators, satellites or
    distributed gene analyses.
   The Grid integrates the best of the Web, traditional
    enterprise software, high performance computing and
    Peer-to-peer systems to provide the information technology
    e-infrastructure for e-moreorlessanything.
   A deluge of data of unprecedented and inevitable size must
    be managed and understood.
   People, computers, data and instruments must be linked.
   On demand assignment of experts, computers, networks and
    storage resources must be supported
           e-Defense and e-Crisis
   Grids support Command and Control and provide
    Global Situational Awareness
    • Link commanders and frontline troops to themselves and to
      archival and real-time data; link to what-if simulations
    • Dynamic heterogeneous wired and wireless networks
    • Security and fault tolerance essential
   System of Systems; Grid of Grids
    • The command and information infrastructure of each ship is
      a Grid; each fleet is linked together by a Grid; the President
      is informed by and informs the national defense Grid
    • Grids must be heterogeneous and federated
   Crisis Management and Response enabled by a Grid
    linking sensors, disaster managers, and first responders
    with decision support
        1962 Licklider’s Vision
“Lick had this concept – all of the stuff
 linked together throughout the world, that
 you can use a remote computer, get data
 from a remote computer, or use lots of
 computers in your job.”

 Larry Roberts – Principal Architect of the ARPANET

            What is e-Science?
 ‘e-Science is about global collaboration in key
 areas of science, and the next generation of
 infrastructure that will enable it.’
                  John Taylor
      Director General of Research Councils
        UK, Office of Science and Technology
 e-Science is about developing tools and
 technologies that allow scientists to do ‘faster,
 better or different’ research
    Some Important Styles of Grids
   Computational Grids were the origin of the concepts and link
    computers across the globe – high latency stops this from being
    used as a parallel machine
    • Typically Compute/File Grids where information (messages) is exchanged
      by writing and reading files
   Knowledge and Information Grids link sensors and information
    repositories as in Virtual Observatories or BioInformatics
   Education Grids link teachers, learners, parents as a VO with
    learning tools, distant lectures etc.
    e-Science Grids link multidisciplinary researchers across
    laboratories and universities
   Community Grids focus on Grids involving large numbers of
    peers rather than focusing on linking major resources – links
    Grid and Peer-to-peer network concepts
   Semantic Grid links Grid, and AI community with Semantic
    web (ontology/meta-data enriched resources) and Agent
   Collaboration Grids support the linkage of multiple people and
    electronic resources (often peer-to-peer architecture)
           Types of Computing Grids
   Running “Pleasingly Parallel Jobs” as in United Devices, Entropia
    (Desktop Grid) “cycle stealing systems”
   Can be managed (“inside” the enterprise as in Condor) or more
    informal (as in SETI@Home)
   Computing-on-demand in Industry where jobs spawned are
    perhaps very large (SAP, Oracle …)
   Support distributed file systems as in Legion (Avaki), Globus with
    (web-enhanced) UNIX programming paradigm
     • Particle Physics will run some 30,000 simultaneous jobs
   Distributed Simulation HLA style Grids (some work)
   Linking Supercomputers as in TeraGrid
   Pipelined applications linking data/instruments, compute,
   Seamless Access where Grid portals allow one to choose one of
    multiple resources with a common interface
   Parallel Computing typically NOT suited for a Grid (latency)
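The latency argument can be made concrete with a rough model in which every compute step must wait for one message before the next step starts. The three timing figures below are illustrative assumptions chosen for this sketch, not measurements quoted in the slides.

```python
def parallel_efficiency(compute_seconds: float, latency_seconds: float) -> float:
    """Fraction of each step spent computing when every step waits one latency."""
    return compute_seconds / (compute_seconds + latency_seconds)


# Illustrative assumptions, not values from the slides.
COMPUTE_PER_STEP = 1e-3   # 1 ms of work between synchronizations
CLUSTER_LATENCY = 1e-5    # roughly 10 microseconds inside one machine room
WIDE_AREA_LATENCY = 1e-1  # roughly 100 ms between continents

cluster = parallel_efficiency(COMPUTE_PER_STEP, CLUSTER_LATENCY)
grid = parallel_efficiency(COMPUTE_PER_STEP, WIDE_AREA_LATENCY)

print(f"cluster efficiency:   {cluster:.3f}")
print(f"wide-area efficiency: {grid:.3f}")
```

Under these assumed numbers a tightly coupled job keeps its processors busy almost all the time on a cluster and almost none of the time across a wide-area Grid, which is why loosely coupled jobs are the natural Grid workload.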
Old Style Metacomputing Grid
   [Figure: Analysis and Visualization, Large Disks, Large-Scale Databases and
    Large Scale Parallel Computers linked together]
   Spread a single large Problem over multiple supercomputers
     Utility and Service Computing
   An important business application of Grids is believed to be
    utility computing
   Namely support a pool of computers to be assigned as needed to
    take-up extra demand
     • Pool shared between multiple applications
   Natural architecture is not a cluster of computers connected to
    each other but rather a “Farm of Grid Services” connected to
    Internet and supporting services such as
     • Web Servers
     • Financial Modeling
     • Run SAP
     • Data-mining
     • Simulation response to crisis like forest fire or earthquake
     • Media Servers for Video-over-IP
   Note classic Supercomputer use is to allow full access to do
    “anything” via ssh etc.
     • In the service model, one pre-configures services for all programs
        and users access a portal to run jobs, with fewer security issues
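The idea of a pool shared between multiple applications and assigned as needed can be sketched with a simple counter-based allocator. The ResourcePool class and its request and release methods are hypothetical names for this illustration; a real utility computing system would add scheduling policy, accounting and failure handling.

```python
class ResourcePool:
    """A fixed pool of identical machines shared between applications."""

    def __init__(self, size: int):
        self.free = size
        self.assigned = {}  # application name -> machines currently held

    def request(self, app: str, count: int) -> int:
        """Grant up to `count` machines; return how many were granted."""
        granted = min(count, self.free)
        self.free -= granted
        self.assigned[app] = self.assigned.get(app, 0) + granted
        return granted

    def release(self, app: str, count: int) -> int:
        """Return up to `count` machines from `app` to the pool."""
        returned = min(count, self.assigned.get(app, 0))
        self.assigned[app] = self.assigned.get(app, 0) - returned
        self.free += returned
        return returned


pool = ResourcePool(10)
web_granted = pool.request("web-servers", 6)
sim_granted = pool.request("crisis-simulation", 6)  # only 4 machines remain
pool.release("web-servers", 3)                      # web demand drops
sim_extra = pool.request("crisis-simulation", 2)    # simulation takes up slack
print(pool.assigned, pool.free)
```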
UK National Grid Service
   [Figure: GOSC Timeline, Q2 2004 to Q3 2006. Milestones shown: NGS Production
    Service; NGS Expansion (Bristol, Cardiff…); OGSA-DAI; WS plan; NGS WS Service;
    NGS Expansion; WS2 plan; NGS WS Service 2; OMII release; EGEE gLite alpha
    release; gLite release 1; EGEE gLite release; OMII Release.
    Labels: Grid Operation Support Centre, Web Services, National Grid]
Towards an International Grid Infrastructure
   [Figure: network map with US TeraGrid (SDSC), Starlight (Chicago), Netherlight
    and UK NGS (Leeds, Manchester, RAL), plus local laptops in Seattle and UK.
    All sites connected by production network (not all shown).
    Legend: Computation, Network PoP, Steering clients, Service Registry]
               Cyberinfrastructure At Home
• BOINC (Berkeley Open Infrastructure for
   Network Computing)
    • study climate
    •   Einstein@home: search for gravitational
        signals emitted by pulsars
    •   LHC@home: improve the design of the
        CERN LHC particle accelerator
    •   Predictor@home: investigate protein-related
        diseases
    •   Rosetta@home: help researchers
        develop cures for human diseases
    •   SETI@home: Look for radio evidence of
        extraterrestrial life
    •   Etc.
• SETI@Home averages 138 TFLOPS on 100,000’s of
   computers in 100’s of countries
   [Figure: Arecibo telescope]


                                                  UNIVERSITY OF CALIFORNIA, SAN DIEGO
                                  Fran Berman

Since September 2003:

95,000 registered participants in 150 countries
Donated 8,000 years of computer time
Completed 100,000 simulations of over 4M model years
     Information/Knowledge Grids
   Distributed (10’s to 1000’s) of data sources (instruments,
    file systems, curated databases …)
   Data Deluge: 1 (now) to 100’s petabytes/year (2012)
    • Moore’s law for Sensors
   Possible filters assigned dynamically (on-demand)
     • Run image processing algorithm on telescope image
     • Run Gene sequencing algorithm on compiled data
   Needs decision support front end with “what-if”
   Metadata (provenance)
    critical to annotate data
   Integrate across experiments
     as in multi-wavelength
Data Deluge comes from pixels/year available
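Filters assigned dynamically to a data source can be sketched as a pipeline assembled at run time from a registry of named filters. The filter names (threshold, smooth), the registry and the sample readings are all hypothetical; a Grid deployment would run such filters as separate services rather than as local generators.

```python
def threshold(minimum):
    """Filter factory: keep only readings at or above `minimum`."""
    def apply(stream):
        return (x for x in stream if x >= minimum)
    return apply


def smooth(window):
    """Filter factory: running mean over the last `window` readings."""
    def apply(stream):
        recent = []
        for x in stream:
            recent.append(x)
            if len(recent) > window:
                recent.pop(0)
            yield sum(recent) / len(recent)
    return apply


# Hypothetical registry of filters that can be requested on demand.
FILTERS = {"threshold": threshold, "smooth": smooth}


def build_pipeline(spec):
    """Chain filters named in `spec`, a list of (filter name, parameter) pairs."""
    stages = [FILTERS[name](param) for name, param in spec]

    def run(stream):
        for stage in stages:
            stream = stage(stream)
        return stream

    return run


readings = [1.0, 5.0, 3.0, 7.0]
pipeline = build_pipeline([("threshold", 3.0), ("smooth", 2)])
result = list(pipeline(readings))
print(result)
```

Because the pipeline is built from a specification rather than hard-coded, a different set of filters can be attached to the same stream without changing the source.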
             Data Deluged Science
   In the past, we worried about data in the form of parallel I/O or
    MPI-IO, but we didn’t consider it as an enabler of new
    algorithms and new ways of computing
   Data assimilation was not central to HPCC
   DoE ASCI was set up because they didn’t want test data!
   Now particle physics will get 100 petabytes from CERN
    • Nuclear physics (Jefferson Lab) in same situation
    • Use around 30,000 CPU’s simultaneously 24X7
   Weather, climate, solid earth (EarthScope)
   Bioinformatics curated databases (Biocomplexity only 1000’s of
    data points at present)
   Virtual Observatory and SkyServer in Astronomy
   Environmental Sensor nets
Data Deluged Paradigm
   [Figure: boxes labelled Data, Ideas and Reasoning]
Grid of Grids: Research Grid and Education Grid
   [Figure: SERVOGrid built from a Database Grid (Repositories, Federated
    Databases), a Sensor Grid (Sensors, Streaming Data, Field Trip Data), a
    Compute Grid, GIS and a Discovery Grid, serving Research (Research
    Simulations, Analysis and Visualization) and Education; also labelled Farm]
        SERVOGrid Requirements
   Seamless Access to Data repositories and large scale
   Integration of multiple data sources including sensors,
    databases, file systems with analysis system
    • Including filtered OGSA-DAI (Grid database access)
   Rich meta-data generation and access with
    SERVOGrid specific Schema extending openGIS
    (Geography as a Web service) standards and using
    Semantic Grid
   Portals with component model for user interfaces and
    web control of all capabilities
   Collaboration to support world-wide work
   Basic Grid tools: workflow and notification
   NOT metacomputing
                      Community Tools
   e-mail and list-serves are oldest and best used
   Kazaa, Instant Messengers, Skype, Napster, BitTorrent for P2P
    Collaboration – text, audio-video conferencing, files
, Connotea, Citeulike manage shared bookmarks
 or similar sites allow you to create community
    resources and share them
   Writely, Wikis and Blogs are powerful specialized shared
    document systems
   ConferenceXP and WebEx share general applications
   Google Scholar tells you who has cited your papers while
    publisher sites tell you about co-authors
   Note sharing resources creates (implicit) communities
    • Social network tools study graphs to both define communities and extract
      their properties
                     Why use SOA’s
   Globalization of applications: Life, Fun, Research, Business,
    Defense as an International collaborative activity
   Globalization of Software Production: Software components
    including open-source made everywhere
   Interoperability: in interfaces and protocol (messages) requires
    Web Services as only broadly supported SOA
   Anti-Performance: if Moore’s law gives you a factor X, then use
    √X for performance, √X for improved lifecycle (re-use)
   Software Engineering: Software paradigms are ways of
    “packaging” modules/components/objects/methods/subroutines.
    Services have minimal coupling and best re-use (lowest
    performance). 1962 Fortran easier re-use than 2006 Java
   Multicore chips: requires pervasive concurrency without side
    effects. Even Microsoft must be able to use 32-128 way
    parallelism on a chip over next 5 years
 Intel Fall 2005 Multicore Roadmap

March 2006 Sun T1000 8 core
     Server at <$6,000
Performance Per Transistor (Peter Kogge 1997)
   [Figure: two plots, Normalized SPECINTS and Normalized SPECFLTS, each against
    Millions of Transistors (CPU) on an axis marked 0.1, 1, 10]
   Performance data from uP vendors
   Transistor count excludes on-chip caches
   Performance normalized by clock rate
   Conclusion: Simplest is best! (250K Transistor
The Grid and Web Service Institutional Hierarchy
 4: Application or Community of Interest (CoI) Specific Services such as
    “Map Services”, “Run BLAST” or “Simulate a Missile”
      Examples: XTCE, VOTABLE, CML, CellML
 3: Generally Useful Services and Features (OGSA and other GGF, W3C) such as
    “Collaborate”, “Access a Database” or “Submit a Job”
      Examples: OGSA GS-* and some WS-*, GGF/W3C/…., XGSP (Collab)
 2: System Services and Features (WS-* from OASIS/W3C/Industry);
    Handlers like WS-RM, Security, UDDI Registry
      Examples: WS-* from Industry
 1: Container and Run Time (Hosting) Environment (Apache Axis, .NET etc.)
      Examples: Apache Axis, .NET etc.
 Must set standards to get interoperability
      Sources of Grid Technology
   Grids support distributed collaboratories or virtual
    organizations integrating concepts from
   The Web
   Agents
   Distributed Objects (CORBA Java/Jini COM)
   Globus, Legion, Condor, NetSolve, Ninf and other High
    Performance Computing activities
   Peer-to-peer Networks
   With perhaps the Web and P2P networks being the most
    important for “Information Grids” and Globus for
    “Compute/File Grids”

The Essence of Grid Technology?
   We will start from the Web view and assert that basic
    paradigm is
   Meta-data rich Web Services communicating via
   These have some basic support from some runtime
    such as .NET, Jini (pure Java), Apache Tomcat+Axis
    (Web Service toolkit), Enterprise JavaBeans,
    WebSphere (IBM) or GT3/4 (Globus Toolkit 3/4)
    • These are the distributed equivalent of operating system
      functions as in UNIX Shell
    • Called Hosting Environment or platform
   W3C standard WSDL defines IDL (Interface
    standard) for Web Services
               What is Happening?
   Grid ideas are being developed in (at least) four communities
     • Web Service – W3C, OASIS, (DMTF)
     • Global Grid Forum (High Performance Computing, e-Science)
     • Enterprise Grid Alliance (Commercial “Grid Forum” with a
       near term focus) merged with GGF to make Open Grid Forum
   Service Standards are being debated
   Grid Operational Infrastructure is being deployed
   Grid Architecture and core software being developed
     • Apache has several important projects as do academia; large
       and small companies
   Particular System Services are being developed “centrally” –
    OGSA framework for this in GGF; WS-* for
   Lots of fields are setting domain specific standards and building
    domain specific services
   USA started but now Europe is probably in the lead and Asia will
    soon catch USA if momentum (roughly zero for USA) continues
             What do Grids Add?
   Grids use all of the Web Services
   They address management and deployment of
    large distributed systems of services
     • Internet Scale Distributed Services
     • I will use Grid more simply as a composable
       coordinated collection of services
   They address security and management issues of
    virtual organizations crossing multiple
    administrative domains
   GGF is developing specific services of relevance
    including job management, many aspects of data
    and scheduling
     • Not much on sensors, real-time, P2P
   GGF has a good process for developing new
    higher level specifications
          Technical Activities of Note
   Look at different styles of Grids such as Autonomic (Robust
    Reliable Resilient)
   New Grid architectures hard due to investment required
   Program the Grid – Workflow
   Access the Grid – Portals, Grid Computing Environments
   Critical Services, ranging from Low Level (WS-*) to High Level (e.g. OGSA), such as
     • Security – build message based not connection based
     • Notification – event services
     • Metadata – Use Semantic Web, provenance
     • Fabric and Service Management
     • Databases and repositories – instruments, sensors
     • Computing – Submit job, scheduling, distributed file systems
     • Visualization, Computational Steering
     • Network performance
          What do Web Services Prescribe?
• They specify interfaces for system services (and generally useful
  services like database)
• They specify an interface language (WSDL) for all services
• They develop containers and frameworks to use to host services
• They specify a message format (SOAP) for ALL messages that
  defines both application and system actions precisely
• They imply a process be started to define domain specific
• There are multiple competing activities from Microsoft and IBM
  to Apache, and IU (for example) developing system and
  application services
• Unlike for RTI and CORBA, services from different vendors
  should interoperate
  [Figure: Container System Processing. A message with headers H1 H2 H3 H4 and a
   Body meets Container Handlers F1 F2 F3 F4 in front of the Service]
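The container picture of headers, handlers and a service can be sketched as a dispatch loop: the container runs one handler per message header and only then passes the body to the service. The header names, handler functions and dictionary message layout are hypothetical simplifications for this sketch, not the API of Apache Axis or any other container.

```python
def security_handler(header, context):
    # System action: record who sent the message.
    context["user"] = header["user"]


def reliability_handler(header, context):
    # System action: record the sequence number for reliable delivery.
    context["sequence"] = header["sequence"]


# Hypothetical mapping from header name to the container handler for it.
HANDLERS = {
    "Security": security_handler,
    "ReliableMessaging": reliability_handler,
}


def container_dispatch(message, service):
    """Run a handler for every header, then hand the body to the service."""
    context = {}
    for name, header in message["headers"].items():
        handler = HANDLERS.get(name)
        if handler is None:
            raise ValueError(f"no handler for header {name!r}")
        handler(header, context)
    return service(message["body"], context)


def echo_service(body, context):
    # Application action: the service sees only the body plus handler results.
    return f"{context['user']} #{context['sequence']}: {body}"


message = {
    "headers": {
        "Security": {"user": "alice"},
        "ReliableMessaging": {"sequence": 7},
    },
    "body": "run job",
}
reply = container_dispatch(message, echo_service)
print(reply)
```

Keeping system actions in handlers means the same service code can be hosted with a different set of headers without modification.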
              Plethora of Standards
   Java is very powerful partly due to its many “frameworks” that
    generalize libraries e.g.
     • Java Media Framework
     • Java Database Connectivity JDBC
   Web Services have a corresponding collection of specifications that
    represent critical features of the distributed operating system for
    “Grids of Simple Services”
     • About 60 WS-* specifications introduced in last 2-3 years
     • These are low level with higher level standards such as access
       database (OGSA-DAI) or “Submit a job” built on top of these
   Many battles both between standard bodies and between companies as
    each tries to set standards they consider best; thus there are multiple
    standards for many of key Web Service functionalities
   Microsoft a key player and stands to benefit as Web Services open up
    enterprise software space to all participants
     • e.g. MQSeries (IBM) and Tibco have to change their messaging
       systems to support new open standards
The Ten areas covered by the 60 core WS-* Specifications
WS-* Specification Area           Examples
1: Core Service Model             XML, WSDL, SOAP
2: Service Internet               WS-Addressing, WS-MessageDelivery; Reliable
                                  Messaging WSRM; Efficient Messaging MTOM
3: Notification                   WS-Notification, WS-Eventing (Publish-Subscribe)
4: Workflow and Transactions      BPEL, WS-Choreography, WS-Coordination
5: Security                       WS-Security, WS-Trust, WS-Federation, SAML,
6: Service Discovery              UDDI, WS-Discovery
7: System Metadata and State      WSRF, WS-MetadataExchange, WS-Context
8: Management                     WSDM, WS-Management, WS-Transfer
9: Policy and Agreements          WS-Policy, WS-Agreement
10: Portals and User Interfaces   WSRP (Remote Portlets)
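The publish-subscribe pattern named under the Notification area can be sketched with an in-memory broker. This shows only the pattern; the Broker class, its methods and the topic name are hypothetical and do not follow the WS-Notification or WS-Eventing message formats.

```python
from collections import defaultdict


class Broker:
    """Minimal topic-based publish-subscribe broker held in memory."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register `callback` to receive every event published on `topic`."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        """Deliver `event` to all subscribers; return how many received it."""
        callbacks = list(self._subscribers[topic])
        for callback in callbacks:
            callback(event)
        return len(callbacks)


broker = Broker()
received = []
broker.subscribe("sensor/readings", received.append)
broker.subscribe("sensor/readings", lambda event: received.append(event.upper()))

delivered = broker.publish("sensor/readings", "quake detected")
ignored = broker.publish("sensor/status", "ok")  # no subscribers on this topic
print(received, delivered, ignored)
```

Publishers and subscribers know only the topic name, never each other, so either side can be added or removed while the system runs.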

Activities in Global Grid Forum Working Groups
GGF Area                         GS-* and OGSA Standards Activities
1: Architecture     High Level Resource/Service Naming (level 2 of slide 6),
                    Integrated Grid Architecture
2: Applications     Software Interfaces to Grid, Grid Remote Procedure Call,
                    Checkpointing and Recovery, Interoperability to Job Submittal services,
                    Information Retrieval,
3: Compute          Job Submission, Basic Execution Services, Service Level Agreements
                    for Resource use and reservation, Distributed Scheduling

4: Data             Database and File Grid access, Grid FTP, Storage Management, Data
                    replication, Binary data specification    and interface, High-level
                    publish/subscribe, Transaction management
5: Infrastructure   Network measurements, Role of IPv6 and high performance
                    networking, Data transport
6: Management       Resource/Service configuration, deployment and lifetime, Usage
                    records and access, Grid economy model
7: Security         Authorization, P2P and Firewall Issues, Trusted Computing
         Net-Centric Core Enterprise Services
Core Enterprise Services                     Service Functionality
NCES1: Enterprise Services Management (ESM)  including life-cycle management
NCES2: Information Assurance (IA)/Security   Supports confidentiality, integrity and availability.
                                             Implies reliability and autonomic features
NCES3: Messaging                             Synchronous or asynchronous cases
NCES4: Discovery                             Searching data and services
NCES5: Mediation                             Includes translation, aggregation, integration,
                                             correlation, fusion, brokering publication, and other
                                             transformations for services and data. Possibly agents
NCES6: Collaboration                         Provision and control of sharing with emphasis on
                                             synchronous real-time services
NCES7: User Assistance                       Includes automated and manual methods of optimizing
                                             the user GiG experience (user agent)
NCES8: Storage                               Retention, organization and disposition of all forms of
NCES9: Application                           Provisioning, operations and maintenance of
                                             applications.
          The Core Features/Service Areas I
Service or Feature       WS-*   GS-*   NCES     Comments
A: Broad Principles
FS1: Use SOA: Service    WS1                    Core Service Architecture, Build Grids on Web
   Oriented Arch.                                  Services. Industry best practice

FS2: Grid of Grids                              Distinctive Strategy for legacy subsystems and
                                                    modular architecture
B: Core Services
FS3: Service Internet,   WS2           NCES3 Streams/Sensors. Team
FS4: Notification        WS3           NCES3 JMS, MQSeries.
FS5: Workflow            WS4           NCES5 Grid Programming
FS6: Security            WS5    GS7    NCES2 Grid-Shib, Permis Liberty Alliance ...
FS7: Discovery           WS6           NCES4 UDDI

FS8: System Metadata     WS7                    Globus MDS
   & State                                      Semantic Grid, WS-Context
FS9: Management          WS8    GS6    NCES1 CIM
FS10: Policy             WS9           ECS
          The Core Feature/Service Areas II
Service or Feature         WS-*   GS-*   NCES    Comments
B: Core Services (Continued)
FS11: Portals and User WS10              NCES7   Portlets JSR168, NCES Capability Interfaces
FS12: Computing                   GS3

FS13: Data and Storage            GS4    NCES8   NCOW Data Strategy
                                                 Federation at data/information layer major
                                                 research area; CGL leading role
FS14: Information                 GS4            JBI for DoD, WFS for OGC

FS15: Applications and User       GS2    NCES9   Standalone Services
Services                                         Proxies for jobs
FS16: Resources and               GS5            Ad-hoc networks
FS17: Collaboration and           GS7    NCES6   XGSP, Shared Web Service ports
Virtual Organizations
FS18: Scheduling and              GS3            Current work only addresses scheduling “batch
matching of Services and                         jobs”. Need networks and services
       A List of Web Services 1
• 1) Core Service Architecture
• XSD XML Schema (W3C Recommendation) V1.0
  February 1998, V1.1 February 2004
• WSDL 1.1 Web Services Description Language
  Version 1.1, (W3C note) March 2001
• WSDL 2.0 Web Services Description Language
  Version 2.0, (W3C under development) March
• SOAP 1.1 (W3C Note) V1.1 Note May 2000
• SOAP 1.2 (W3C Recommendation) June 24 2003
               A List of Web Services 2
• 2) Service Internet including messaging
• WS-Addressing Web Services Addressing (BEA, IBM, Microsoft, SAP, Sun)
  in W3C consideration August 2004
• WS-MessageDelivery Web Services Message Delivery (W3C Submission by
  Oracle, Sun ..) April 2004
• WS-Reliability Web Services Reliable Messaging (OASIS Web Services
  Reliable Messaging TC) March 2004
• WS-RM Web Services Reliable Messaging (BEA, IBM, Microsoft, Tibco)
  v0.992 February 2005 linked to WS-Reliability in OASIS as Web Services
  Reliable Exchange (WS-RX)
• WS-RM Policy Web Services Reliable Messaging Policy Assertion (BEA,
  IBM, Microsoft, Tibco) March 2006
• WS-RX Web Services Reliable Exchange (Many members) integrating
  previous reliability specifications
• SOAP MTOM SOAP Message Transmission Optimization Mechanism (W3C)
  June 2004
• SOAP-over-UDP Binding of SOAP to UDP (Microsoft, BEA …) September
• Many obsolete specifications like WS-Routing and Referral SOAP Routing
  Protocol (Microsoft) October 2001
      Application Specific Grids          Higher
 Generally Useful Services and Grids      Level
        Workflow WSFL/BPEL                Services
 Service Management (“Context etc.”)      Service
Service Discovery (UDDI) / Information    Context
Service Internet Transport  Protocol     Service
       Service Interfaces WSDL            Internet
      Base Hosting Environment
      Protocol HTTP FTP DNS …
         Presentation XDR …              Bit level
           Session SSH …                 Internet
        Transport TCP UDP …                (OSI
            Network IP …                  Stack)
         Data Link / Physical
Layered Architecture for Web Services and Grids
       WS-* implies the Service Internet
   We have the classic (CISCO, Juniper ….) Internet routing the
    flood of ordinary packets in OSI stack architecture
   Web Services build the “Service Internet” or IOI (Internet on
    Internet) with
     • Routing via WS-Addressing not IP header
     • Fault Tolerance (WS-RM not TCP)
     • Security (WS-Security/SecureConversation not IPSec/SSL)
     • Data Transmission by WS-Transfer not HTTP
     • Information Services (UDDI/WS-Context not
       DNS/Configuration files)
     • At message/web service level and not packet/IP address level
   Software-based Service Internet possible as computers “fast”
   Familiar from Peer-to-peer networks and built as a software
    overlay network defining Grid (analogy is VPN)
   SOAP Header contains all information needed for the “Service
    Internet” (Grid Operating System) with SOAP Body containing
    information for Grid application service
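
The header/body split described above can be sketched in a few lines of Python. This is only an illustration: the endpoint URL, the action URI and the SensorReading payload are hypothetical, and the namespace URIs are those of the SOAP 1.2 and WS-Addressing 1.0 W3C Recommendations (the 2004 WS-Addressing submission used a different namespace).

```python
import uuid
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"   # SOAP 1.2
WSA_NS = "http://www.w3.org/2005/08/addressing"       # WS-Addressing 1.0

def build_envelope(to, action, payload):
    """Return a SOAP envelope: routing data in the header, payload in the body."""
    env = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(env, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{WSA_NS}}}To").text = to
    ET.SubElement(header, f"{{{WSA_NS}}}Action").text = action
    ET.SubElement(header, f"{{{WSA_NS}}}MessageID").text = f"urn:uuid:{uuid.uuid4()}"
    body = ET.SubElement(env, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return env

# Hypothetical application payload and endpoint, for illustration only.
reading = ET.Element("SensorReading")
reading.text = "42.0"
envelope = build_envelope("http://example.org/services/filter",
                          "http://example.org/actions/submit", reading)
print(ET.tostring(envelope, encoding="unicode"))
```

Everything a router or handler needs sits in the header, so the body can stay opaque to the "Service Internet" layer.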
               A List of Web Services 3
• 3) Notification and high-level publish/subscribe information
• WS-Eventing Web Services Eventing (BEA, Microsoft, TIBCO)
  August 2004
• WS-EventNotification (HP, IBM, Intel, Microsoft) March 2006 uses
  resources to manage subscriptions
• WS-Notification Framework for Web Services Notification with WS-
  Topics, WS-BaseNotification, and WS-BrokeredNotification
  (OASIS) OASIS Web Services Notification TC Set up March 2004
• JMS Java Message Service V1.1 March 2002

• Different from using publish-subscribe to robustly support messaging
  between Web services
   – Bind SOAP to JMS or MQSeries
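
A minimal sketch of topic-based publish/subscribe, assuming a toy in-memory broker in place of a real JMS or WS-Notification implementation. The Broker class, the topic name and the message fields are all hypothetical.

```python
from collections import defaultdict

class Broker:
    """Toy in-memory topic broker (stands in for a JMS-style message service)."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver to every subscriber; the publisher never names a receiver.
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])

broker = Broker()
received = []
broker.subscribe("sensors/flood", received.append)
delivered = broker.publish("sensors/flood", {"level_m": 3.2})
print(delivered, received)
```

The publisher and subscriber only share a topic name, which is what decouples the services in time and location.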
            A List of Web Services 4
• 4) Coordination and Workflow, Transactions and
• BPEL Business Process Execution Language for Web Services
  (OASIS) V1.1 May 2003, with V2.0 under development
• WS-CDL Web Services Choreography Language (W3C) V1.0
  Working Draft 17 December 2004
• WSCI (W3C) Web Service Choreography Interface V1.0 (W3C
  Note from BEA, Intalio, SAP, Sun, Yahoo)
• WSCL Web Services Conversation Language (W3C Note) HP
  March 2002
• Workflow is general linkage between services; transactions are a
  critical special case
• Concept of workflow generalizes traditional workflow processes
  in business
• Many competing workflow implementations and standards;
  many implementations “reject” current standards
                  Role of Workflow
      Service-1                        Service-3


   Programming SOAP and Web Services (the Grid):
    Workflow describes linkage between services
   As distributed, linkage must be by messages
   Linkage is two-way and has both control and data
   Apply to multi-disciplinary, multi-scale linkage,
    multi-program linkage, link visualization to
    simulation, GIS to simulations and visualization
    filters to each other
   The Microsoft-IBM specification BPEL is the current
    preferred Web Service XML specification of workflow
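
As a rough sketch of workflow as linkage between services, the Python below treats each service as a function that receives messages from upstream services and returns a message, and treats the workflow as the dependency graph between them. The three services and their names are hypothetical, only one-way data linkage is shown (not the two-way control linkage mentioned above), and this is not BPEL.

```python
from graphlib import TopologicalSorter

# Hypothetical services: each consumes a dict of upstream messages and returns a message.
def sensor(inputs):
    return {"values": [3, 1, 2]}

def filter_service(inputs):
    return {"values": sorted(inputs["sensor"]["values"])}

def visualize(inputs):
    return {"text": " < ".join(str(v) for v in inputs["filter"]["values"])}

SERVICES = {"sensor": sensor, "filter": filter_service, "visualize": visualize}
# Workflow = linkage: each service maps to the services whose output it needs.
LINKS = {"sensor": [], "filter": ["sensor"], "visualize": ["filter"]}

def run_workflow(services, links):
    """Invoke services in dependency order, passing outputs as messages."""
    outputs = {}
    for name in TopologicalSorter(links).static_order():
        inputs = {dep: outputs[dep] for dep in links[name]}
        outputs[name] = services[name](inputs)
    return outputs

result = run_workflow(SERVICES, LINKS)
print(result["visualize"]["text"])
```

The services know nothing about each other; only the LINKS table, which plays the role of the workflow description, encodes how they are connected.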
Example workflow
        Here a sensor feeds a data-
        mining application
        (We are extending data-
        mining in DoD
        applications with
        Grossman from UIC)
        The data-mining
        application drives a

  Example Flood Simulation workflow
  Figure (layout lost in extraction): Data Archives, Runoff Models and
  Flow Models on either side, linked through GIS Grid Services,
  distributed data, SOAP messages and events
      SERVOGrid Codes, Relationships
                      Elastic Dislocation Inversion               Viscoelastic FEM

                                      Viscoelastic Layered BEM

Elastic Dislocation
                                                                 Pattern Recognizers

                           Fault Model BEM
    This linkage called Workflow in Grid/Web Service parlance
         Two-level Programming I
• The Web Service (Grid) paradigm implicitly assumes a
  two-level Programming Model
• We make a Service (same as a “distributed object” or
  “computer program” running on a remote computer) using
  conventional technologies
   – C++ Java or Fortran Monte Carlo module
   – Data streaming from a sensor or Satellite
   – Specialized (JDBC) database access
• Such services accept and produce data from users files and
• The Grid is built by coordinating such services assuming
  we have solved the problem of programming the service
         Two-level Programming II
   The Grid is discussing the composition of distributed
    services with the runtime interfaces to Grid as
    opposed to UNIX pipes/data streams
    (Diagram: Service1, Service2, Service3, Service4)

   Familiar from use of UNIX Shell, PERL or Python
    scripts to produce real applications from core programs
   Such interpretative environments are the single
    processor analog of Grid Programming
   Some projects like GrADS from Rice University are
    looking at integration between service and composition
    levels but dominant effort looks at each level separately
        3 Layer Programming Model
Web Service 1      WS 2                        WS N-1   Web Service N
              Level 1 Programming inside services
     Application expressed in Java, Fortran, C++, MPI etc.

                        WS-* Infrastructure

   Level 2 Programming choosing services by virtualization
  Application Semantics (Metadata, Ontology) Semantic Grid

     Level 3 Grid Programming composing multiple services
           Service Workflow, Transactions, Mediation

            Substantial work in UK e-Science program,
              international semantic web community
     A List of Web Services 4-Continued
• 4) Transactions, Business Processes and Contextualization
• WS-CAF Web Services Composite Application Framework including WS-
  CTX, WS-CF and WS-TXM below (OASIS Web Services Composite
  Application Framework TC)
• WS-CTX Web Services Context (OASIS Web Services Composite
  Application Framework TC) V0.9.2 July 2005
• WS-CF Web Services Coordination Framework (OASIS Web Services
  Composite Application Framework TC) V0.1 April 2005
• WS-TXM Web Services Transaction Management (OASIS Web Services
  Composite Application Framework TC) including WS-ACID (V0.1 May
  2005), WS-BP (Business Process V0.1 May 2005), WS-LRA (Long
  running action V0.1 May 2005)
• WS-Coordination Web Services Coordination (BEA, IBM, Microsoft)
  November 2004
• WS-AtomicTransaction Web Services Atomic Transaction (BEA, IBM,
  Microsoft) November 2004
• WS-BusinessActivity Web Services Business Activity Framework (BEA,
  IBM, Microsoft) November 2004
• BTP Business Transaction Protocol (OASIS) May 2002 with V1.1
  November 2004
• ebXML BPSS Business Process (OASIS) with V2.0.1 pre-Committee Draft
  review 17 July 2005
             A List of Web Services 5
• 5) Security Frameworks and Core Specifications
• WS-Security 2004 Web Services Security: SOAP Message Security (OASIS)
  Standard March 2004.
• WS-I Basic Security Profile V1.0 Web Services Interoperability Organization
  Working Group Draft May 15 2005
• WS-Security Username Token Profile Web Services Security Username Token
  Profile V1.0 OASIS Standard, March 2004
• WS-Security X.509 Certificate Token Profile Web Services Security X.509
  Certificate Token Profile OASIS Standard, March 2004
• WS-Security REL Profile Web Services Security Rights Expression Language
  (REL) Token Profile OASIS Standard: 19 December 2004
• WS-I REL Token Profile V1.0 Web Services Interoperability Organization
  Working Group Draft 13 May 2005
• WS-Security Kerberos Web Services Security Kerberos Binding (Microsoft)
  December 2003
• Web-SSO Web Single Sign-On Metadata Exchange Protocol (Microsoft, Sun)
  April 2005
• Web-SSO-Mex Web Single Sign-On Interoperability Profile (Microsoft, Sun)
  April 2005
• WS-SecurityPolicy Web Services Security Policy Language (IBM, Microsoft,
  RSA, Verisign) V1.1 July 2005
    A List of Web Services 5 - Contd
• 5) Security Capabilities
• WS-Trust Web Services Trust Language (BEA, IBM, Microsoft, RSA,
  Verisign …) February 2005
• WS-SecureConversation Web Services Secure Conversation Language
  (BEA, IBM, Microsoft, RSA, Verisign …) February 2005
• WS-Federation Web Services Federation Language (BEA, IBM,
  Microsoft, RSA, Verisign) July 2003
• WS-Federation Active Requestor Profile Web Services Federation
  Language Active Requestor Profile V 1.0 (BEA, IBM, Microsoft, RSA,
  Verisign) July 8, 2003
• WS-Federation Passive Requestor Profile Web Services Federation
  Language Passive Requestor Profile V 1.0 (BEA, IBM, Microsoft, RSA,
  Verisign) July 8, 2003
• WS-Authorization is being developed by IBM and Microsoft and will
  build on WS-Trust to describe how access to particular web services is
  specified and managed.
• WS-Privacy is being developed by IBM and Microsoft and will build on
  WS-Policy to describe the binding of privacy policies to Web services and
  their exchanged data.
   A List of Web Services 5 - Contd
• 5) Security Languages
• SAML Assertions and Protocols for the OASIS
  Security Assertion Markup Language (SAML) V2.0
  OASIS Standard, 15 March 2005
• WS-Security SAML Token Profile Web Services
  Security SAML Token Profile OASIS Standard, 1
  December 2004
• WS-I SAML Token Profile V1.0 Web Services
  Interoperability Organization Working Group Draft 13
  May 2005
• XACML eXtensible Access Control Markup
  Language (OASIS) V2.0 1 February 2005
        A List of Web Services 6
• 6) Service Discovery
• UDDI (Broadly Supported OASIS Standard) V3
  August 2003
• WS-Discovery Web services Dynamic Discovery
  (Microsoft, BEA, Intel …) February 2004
• WS-IL Web Services Inspection Language, (IBM,
  Microsoft) November 2001
• Note WS-Context as a metadata catalog and WS-
  Management Catalog are examples of related
• There are many UDDI extensions
          A List of Web Services 7
• 7) Metadata and State
• RDF Resource Description Framework (W3C) Set of
  recommendations expanded from original February 1999 standard
• DAML+OIL combining DAML (Darpa Agent Markup Language)
  and OIL (Ontology Inference Layer) (W3C) Note December 2001
• OWL Web Ontology Language (W3C) Recommendation February
• WS-MetadataExchange 1.1 Web Services Metadata Exchange
  (HP, IBM, Intel, Microsoft) March 2006
• ASAP Asynchronous Service Access Protocol (OASIS) with V1.0
  working draft 2B December 11 2004
• WS-GAF Web Service Grid Application Framework (Arjuna,
  Newcastle University) August 2003
• WBEM Web-Based Enterprise Management including CIM
  (Common Information Model) from DMTF (Distributed
  Management Task Force) 2004-2005
        A List of Web Services 7
• 7) Metadata and State: Resource Framework
• WS-RF Web Services Resource Framework (OASIS)
• WS-Resource Framework Web Services Resource 1.2
  (OASIS) Public Review Draft 01, 10 June 2005
• WS-ResourceProperties Web Services Resource
  Properties V1.2 Public Review Draft 01, 10 June 2005
• WS-ResourceLifetime Web Services Resource Lifetime
  V1.2 Public Review Draft 01, 13 June 2005
• WS-ServiceGroup Web Services Service Group V1.2
  Public Review Draft 01, 10 June 2005
• WS-BaseFaults Web Services Base Faults V1.2 Public
  Review Draft 01, June 13, 2005
            Metadata and Service Context
   Consider a collection of services working together
     • Workflow tells you how to specify service interaction but more
        basically there is shared information or context
        specifying/controlling collection
   WS-RF and WS-GAF have different approaches to contextualization
    – supplying a common “context” which at its simplest is a token to
    represent state
   More generally core shared information includes dynamic service
    metadata and the equivalent of configuration information.
   One can support such a common context either as a pool of
    messages or as message-based access to a “database” (Context
   Two services linked by a stream are perhaps simplest example of a
    collection of services needing context
   Note that there is a tension between storing metadata in messages
    and services.
     • This is the shared versus distributed memory debate in parallel computing
               Stateful Interactions
   There are (at least) four approaches to specifying state
     • OGSI uses factories to generate separate services for
       each session in standard distributed object fashion
     • Globus GT-4 and WSRF use metadata of a resource
       to identify state associated with a particular session
     • WS-GAF uses WS-Context to provide an abstract
       context defining state. This has the strength, and the
       weakness, that it reveals less about the nature of the session
     • WS-I+ “Pure Web Service” leaves state specification to
       the application – e.g. put a context in the SOAP body
   I think we should smile and write a great metadata
    service hiding all these different models for state and
    metadata
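
Two of these approaches can be contrasted in a small Python sketch: state held in a shared context store and named by a token carried in each message, versus state carried in the message body itself. The ContextService class and both counter services are hypothetical and do not implement WS-Context, WSRF or any other specification.

```python
import uuid

class ContextService:
    """Toy shared-context store: messages carry only an opaque token."""
    def __init__(self):
        self._contexts = {}

    def create(self, initial):
        token = str(uuid.uuid4())
        self._contexts[token] = dict(initial)
        return token

    def get(self, token):
        return self._contexts[token]

def counter_service_with_context(message, contexts):
    # State lives in the context service; the message names it by token.
    state = contexts.get(message["context"])
    state["count"] += 1
    return {"context": message["context"], "count": state["count"]}

def counter_service_stateless(message):
    # State travels in the message body; the service keeps nothing.
    return {"count": message["count"] + 1}

contexts = ContextService()
token = contexts.create({"count": 0})
first = counter_service_with_context({"context": token}, contexts)
second = counter_service_with_context({"context": token}, contexts)
carried = counter_service_stateless(counter_service_stateless({"count": 0}))
print(second["count"], carried["count"])
```

Both paths reach the same count; the difference is where the state lives and therefore which component must be reachable and trusted.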
       A List of Web Services 8
• 8) Management – original OASIS
• WS-DistributedManagement Web Services
  Distributed Management Framework with MUWS
  and MOWS below (OASIS)
• WSDM-MUWS Web Services Distributed
  Management: Management Using Web Services
  (OASIS) OASIS Standard March 9 2005
• WSDM-MOWS Web Services Distributed
  Management: Management of Web Services
  (OASIS) OASIS Standard March 9 2005
   A List of Web Services 8- Contd
• 8) Management: Microsoft Converged Stack
• WS-Management Web Services for Management
  (Microsoft, Intel, Sun …) August 2005
• WS-Management Catalog The WS-Management
  Catalog (Microsoft, Intel, Sun …) August 2005
• WS-ResourceTransfer Web Service Resource Transfer
  (HP, IBM, Intel, Microsoft) March 2006
• WS-Transfer Web Service Transfer (Microsoft, BEA,
  Sonic Software etc.) September 2004
• WS-TransferAddendum Extensions to Web Service
  Transfer (HP, IBM, Intel, Microsoft) March 2006
• WS-Enumeration Web Service Enumeration
  (Microsoft, BEA, Sonic Software etc.) September 2004
         A List of Web Services 9
• 9) General Service Characteristics
• WS-PolicyFramework Web Services Policy
  Framework (BEA, IBM, Microsoft, SAP …)
  September 2004
• WS-PolicyAttachment Web Services Policy
  Attachment (BEA, IBM, Microsoft, SAP …)
  September 2004
• WS-PolicyAssertions Web Services Policy Assertions
  Language (BEA, IBM, Microsoft, SAP) 18 December
  2002 (Superseded by WS-PolicyFramework)
• WS-Agreement Web Services Agreement
  Specification (GGF under development) 9 August 2004
       A List of Web Services 10
• 10) User Interfaces
• WSRP Web Services for Remote Portlets
  (OASIS) OASIS Standard August 2003
• JSR168: JSR-000168 Portlet Specification for
  Java binding (Java Community Process) October
• WSRP specifies the client-service protocol while
  JSR168 specifies how portlets are implemented
  for the user-facing ports of each supported Web
  service inside aggregating portals like
  JetSpeed, GridSphere or uPortal
              WS-I Interoperability
   Critical underpinning of Grids and Web Services is
    the gradually growing set of specifications in the
    Web Service Interoperability Profiles
   Web Services Interoperability (WS-I)
    Interoperability Profile 1.0a
    gives us XSD, WSDL 1.1, SOAP 1.1 and UDDI in the basic
    profile and parts of WS-Security in the first
    security profile.
   We imagine the “60 Specifications” being checked
    out and evolved in the cauldron of the real world
    and occasionally best practice identifies a new
    specification to be added to WS-I which gradually
    increases in scope
     • Note only 4.5 out of 60 specifications have
       “made it” in this definition
   Figure: a Grid of services turning Raw Data into Data, Information,
   Knowledge, Wisdom and finally Decisions. Legend: SS = Sensor Service,
   FS = Filter Service, OS = Other Service, MD = MetaData. The flow also
   connects to Another Grid, Another Service and a Database.
       Semantic Grid and Services
   Implications of SOA (Service Oriented Architectures) for SG
    (Semantic Grid)
     • Build services to implement SG
   Implications of SG for SOA
     • Build metadata rich systems of services using SG
   Services receive data in SOAP messages, manipulate it and
    produce transformed data as further messages
   Meta-data is carried in SOAP messages
   Meta-data controls processing and transport of SOAP Messages
   Knowledge is created from data by services
   The Grid enhances Web services with semantically rich system
    and application specific management
   One must exploit and work around the different approaches to
    meta-data and their manipulation in Web Services
      Structure of SOAP Messages
                                             Container Workflow

     H1   H2   H3   H4   Body          F1   F2    F3    F4    Service

                                       Container Handlers

   SOAP Messages have System information in the header
    including WS-Policy based meta-data defining processing
     • Processed by Handlers
   Application data and meta-data are in the body (controversies here!)
     • Processed by the Service itself
   Some meta-data like WS-RF is logically “only in messages”
   Other meta-data, like that in WS-Context or the SRB, is stored in the logical
    equivalent of XML databases
   We only need to preserve semantic structure (XML/SOAP
    Infoset) so transport in fast XML and store in efficient relational
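
The division of labour between container handlers and the service can be sketched as below, assuming a message modelled as a plain Python dict with a header and a body. The two handlers, the echo service and the field names (token, sequence, seen) are hypothetical and do not follow any particular container API.

```python
def security_handler(header):
    # Hypothetical check: reject messages without a security token.
    if "token" not in header:
        raise PermissionError("missing security token")

def sequence_handler(header):
    # Hypothetical reliable-messaging step: record the sequence number.
    header.setdefault("seen", []).append(header.get("sequence"))

def echo_service(body):
    # The service itself only ever sees the body.
    return {"echo": body["text"].upper()}

def container_dispatch(message, handlers, service):
    """Run each handler over the header, then hand the body to the service."""
    for handler in handlers:
        handler(message["header"])
    return service(message["body"])

message = {"header": {"token": "abc", "sequence": 1}, "body": {"text": "hello"}}
reply = container_dispatch(message, [security_handler, sequence_handler], echo_service)
print(reply)
```

Adding a new system capability then means adding a handler, without touching the service code.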
             Support for Messages
    Optimize XML representation and transport protocol
    XML’’    Filter2

    StdXML    Filter1     XML’                  XML’   Filter1-1       StdXML

             Choose                Choose
                                                        Filter2-1       XML’’
             Invertible            Protocol

                                 (WS-Context)          Filters Preserve Infoset
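
A minimal sketch of an invertible filter pair, assuming zlib compression as a stand-in for a binary XML encoding such as Fast Infoset. The function names are hypothetical, and comparing canonical XML forms is used here only as an approximation of "same Infoset".

```python
import zlib
import xml.etree.ElementTree as ET

def filter1(std_xml: str) -> bytes:
    """Encode standard XML into a compact wire form (here: zlib compression)."""
    return zlib.compress(std_xml.encode("utf-8"))

def filter1_inverse(wire: bytes) -> str:
    """Recover standard XML from the wire form."""
    return zlib.decompress(wire).decode("utf-8")

def same_infoset(xml_a: str, xml_b: str) -> bool:
    # Compare canonical (C14N 2.0) forms so formatting differences are ignored.
    return (ET.canonicalize(xml_a, strip_text=True)
            == ET.canonicalize(xml_b, strip_text=True))

std_xml = "<features>" + "<f id='1'>road</f>" * 50 + "</features>"
wire = filter1(std_xml)
restored = filter1_inverse(wire)
print(len(std_xml), len(wire), same_infoset(std_xml, restored))
```

Any encoding can be swapped in for the wire form as long as the inverse filter restores a document with the same semantic structure.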

              FI (Fast Infoset=Binary XML) v
                Traditional XML Messages
   Figure: Transfer Time Comparison, plotting transfer time (ms) against
   # of features per message for Transfer - FI and Transfer - XML
PDA to Web Service Optimized
   Figure: total session time (sec) against number of messages per session
   (0 to 35) for HHFR and SOAP, each with 16 strings per message
    Requirements for MPI Messaging
                        tcalc       tcomm          tcalc

   MPI and SOAP Messaging both send data from a source to a destination
     • MPI supports multicast (broadcast) communication;
     • MPI specifies destination and a context (in comm parameter)
     • MPI specifies data to send
    • MPI has a tag to allow flexibility in processing in source processor
    • MPI has calls to understand context (number of processors etc.)
   MPI requires very low latency and high bandwidth so that
    tcomm/tcalc is at most 10
     • BlueGene/L has bandwidth between 0.25 and 3
        Gigabytes/sec/node and latency of about 5 microseconds
     • Latency determined so Message Size/Bandwidth > Latency
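
The last condition can be turned into a quick calculation. The sketch below uses the BlueGene/L figures quoted above and assumes 1 Gigabyte = 10^9 bytes; it finds the message size at which transfer time equals latency, which comes out at roughly 1.25 KB to 15 KB across the quoted bandwidth range.

```python
def min_message_bytes(latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Message size for which transfer time (size/bandwidth) equals latency."""
    return latency_s * bandwidth_bytes_per_s

LATENCY_S = 5e-6                 # about 5 microseconds (from the slide)
for gigabytes_per_s in (0.25, 3.0):
    size = min_message_bytes(LATENCY_S, gigabytes_per_s * 1e9)
    print(f"{gigabytes_per_s} GB/s -> messages above about {size:.0f} bytes")
```

Messages smaller than this are dominated by latency rather than bandwidth.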
    Requirements for SOAP Messaging
   Web Services have many of the same requirements as MPI, with
    two differences where MPI is more stringent than SOAP
     • Latencies are inevitably 1 (local) to 100 milliseconds which is
       200 to 20,000 times that of BlueGene/L
           1) 0.000001 ms      – CPU does a calculation
           2) 0.001 to 0.01 ms – MPI latency
           3) 1 to 10 ms       – wake-up a thread or process
           4) 10 to 1000 ms – Internet delay
     • Bandwidths for many business applications are low as one
       just needs to send enough information for ATM and Bank to
       define transactions
   SOAP has MUCH greater flexibility in areas like security, fault-
    tolerance, “virtualizing addressing” because one can run a lot of
    software in 100 milliseconds
     • Typically takes 1-3 milliseconds to gobble up a modest
       message in Java and “add value”
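
The ratios quoted above can be checked with a few lines of arithmetic. This sketch takes the 5 microsecond BlueGene/L latency and the 0.000001 ms per calculation figure from the slides; the function names are illustrative.

```python
BLUEGENE_LATENCY_MS = 0.005      # about 5 microseconds, from the MPI slide
CPU_OP_MS = 0.000001             # one calculation, from the list above

def ratio_to_mpi(soap_latency_ms: float) -> float:
    """How many times larger a SOAP latency is than the MPI latency."""
    return soap_latency_ms / BLUEGENE_LATENCY_MS

def cpu_ops_within(latency_ms: float) -> float:
    """How many calculations fit inside one message latency."""
    return latency_ms / CPU_OP_MS

for soap_ms in (1.0, 100.0):
    print(f"{soap_ms} ms: {ratio_to_mpi(soap_ms):.0f}x MPI latency, "
          f"~{cpu_ops_within(soap_ms):.0e} calculations")
```

The second figure is why SOAP can afford rich header processing: on these numbers, about a hundred million calculations fit inside a 100 millisecond latency.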
             Structure of SOAP
   SOAP defines a very obvious message structure with a
    header and a body just like email
   The header contains information used by the “Internet
    operating system”
    • Destination, Source, Routing, Context, Sequence Number …
   The message body is partly further information used by
    the operating system and partly information for
    application when it is not looked at by “operating
    system” except to encrypt, compress it etc.
    • Note WS-Security supports separate encryption for different
      parts of a document
   Much discussion in field revolves around what is
    referenced in header
   This structure makes it possible to define VERY
    Sophisticated messaging
        MPI and SOAP Integration
   Note SOAP specifies format and, through WSDL, interfaces
   MPI only specifies interface and so interoperability
    between different MPIs requires additional work
    • IMPI
   Pervasive networks can support high bandwidth
    (Terabits/sec soon) but latency issue is not resolvable in
    general way
   Can combine MPI interfaces with SOAP messaging but
    I don’t think this has been done
   Just as walking, cars, planes, phones coexist with
    different properties; so SOAP and MPI are both good
    and should be used where appropriate
When is a High Performance Computer?
   We might wish to consider three classes of multi-node computers
   1) Classic MPP with microsecond latency and scalable internode
    bandwidth (tcomm/tcalc ~ 10 or so)
   2) Classic Cluster which can vary from configurations like 1) to 3)
    but typically have millisecond latency and modest bandwidth
   3) Classic Grid or distributed systems of computers around the
     • Latencies of inter-node communication – 100’s of milliseconds
       but can have good bandwidth
   All have same peak CPU performance but synchronization costs
    increase as one goes from 1) to 3)
   Cost of system (dollars per gigaflop) decreases by factors of 2 at
    each step from 1) to 2) to 3)
   One should NOT use classic MPP if class 2) or 3) suffices unless
    some security or data issue dominates over cost-performance
   One should not use a Grid as a true parallel computer – it can link
    parallel computers together for convenient access etc.
                        Linking Modules
    Closely coupled Java/Python …                Coarse Grain Service Model

        Module      Module                    Service                    Service
          B           A                         B                          A
           Method Calls                        0.1 to 1000 millisecond latency
        .001 to 1 millisecond

   From method based to RPC to message based to event-based
    publish-subscribe Message Oriented Middleware

      Subscribe                                                       Publisher
      to Events                                                       Post Events

       Service B                Message Queue in the Sky                Service A
           What is a Simple Service?
   Take any system – it has multiple functionalities
     • We can implement each functionality as an independent
       distributed service
     • Or we can bundle multiple functionalities in a single service
   Whether functionality is an independent service or one of many
    method calls into a “glob of software”, we can always expose it
    as a Web service by converting its interface to WSDL
   Simple services are obtained by taking functionalities and making them as
    small as possible subject to the “rule of millisecond”
     • Distributed services incur messaging overhead of one (local) to
       100’s (far apart) of milliseconds to use message rather than
       method call
     • Use scripting or compiled integration of functionalities ONLY
       when you require <1 millisecond interaction latency
   Apache web site has many (pre Web Service) projects that are
    multiple functionalities presented as (Java) globs and NOT (Java)
    Simple Services
     • This makes it hard to integrate them while sharing common security, user
       profile, file access .. services
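
The "rule of millisecond" can be written down as a simple decision function. This is a sketch under the thresholds stated above; the example interactions and their latency budgets are hypothetical.

```python
def choose_linkage(required_latency_ms: float) -> str:
    """Apply the 'rule of millisecond' to one interaction between functionalities."""
    if required_latency_ms < 1.0:
        # Messaging overhead (1 ms or more) would not fit: keep it a method call.
        return "method call inside one service"
    return "message between separate services"

# Hypothetical interactions and the latency each can tolerate, in milliseconds.
interactions = {"inner-loop update": 0.01, "job submission": 50.0, "GIS query": 500.0}
for name, budget_ms in interactions.items():
    print(f"{name}: {choose_linkage(budget_ms)}")
```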
             Grids of Grids of Simple Services
•    Link via methods → messages → streams
•    Services and Grids are linked by messages
•    Internally to service, functionalities are linked by methods
•    A simple service is the smallest Grid
•    We are familiar with method-linked hierarchy
     Lines of Code → Methods → Objects → Programs → Packages

    Methods → Services → Component Grids
    CPUs → Clusters, MPPs → Compute Resource Grids
    Databases → Data Resource Grids
    Sensor → Sensor Nets
    Overlay and Compose: Grids of Grids
               Component Grids?
   So we build collections of Web Services which we package as
    component Grids
     • Visualization Grid
     • Sensor Grid
     • Utility Computing Grid
     • Collaboration Grid
     • Earthquake Simulation Grid
     • Control Room Grid
     • Crisis Management Grid
     • Drug Discovery Grid
     • Bioinformatics Sequence Analysis Grid
     • Intelligence Data-mining Grid
   We build bigger Grids by composing component Grids using the
    Service Internet
Typical use of Grid Messaging in NASA
          Sensor Grid implemented using NB

          NB                       GIS Grid

     Datamining Grid
    Using the Grid of Grids and Core Services to build multiple
           application grids re-using common components.
     [Figure, top to bottom:
      Chemical Informatics Grid: 15: Application Services (Screening Tools,
        Quantum Calculations, …)
      BioInformatics Grid: 15: Application Services (Sequencing Tools,
        Biocomplexity Simulations, …)
      … Domain Specific …
      14: Information, 11: Portals, Instrument/Sensor
      17: Collaboration, 12: Computing, 13: Data Access/Storage
      9: Management, 18: Scheduling, 10: Policy
      4: Notification, 7: Discovery, 8: Metadata
      Core Low Level Grid Services
      6: Security, 3: Messaging, 5: Workflow, 9: Management
      Physical Network (monitored by FS16)]
     [Figure, top to bottom:
      Flood CIGrid … Electricity CIGrid … Gas CIGrid
      Flood Services and Filters … Gas Services and Filters
      Collaboration Grid, Portals, Visualization Grid
      Sensor Grid, GIS Grid, Compute Grid
      Data Access/Storage
      Registry, Metadata
      Core Grid Services: Security, Notification, Workflow, Messaging
      Physical Network]
 Critical Infrastructure (CI) Grids built as Grids of Grids
Mediation and Transformation in a
Grid of Grids and Simple Services
     [Figure: several boxes labelled "Subgrid or service", each with Ports on
      External facing Interfaces and Ports on Internal Interfaces, together
      with Messaging and Mediation and Transformation Services]
     Why can we build better software?
   In 1962 I was punching holes in cards and paper tape
    to persuade tiny slow computers to manipulate words
    in memory to string together instructions like a = b + c
   Now computers are much faster and languages are
    better but not a lot better
    • I suspect I would only be a factor of 2 or so faster
      programming the same program today
   However a, b and c can now be resources (Bank records,
    Drugs, Games, Supernova) and + can be a service
    • Objects were insufficient as they distributed ordinary
      programs; services express distributed independent entities
      (communication time very different inter and intra
    • Services are essential for reliable modular programming
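
A minimal sketch of the contrast on this slide, assuming bank records as the resources. In the old regime + is a method inside one program. In the new regime + is a service that sees only a message. The function names and the JSON message layout are assumptions made for illustration.

```python
import json


def add_method(b: float, c: float) -> float:
    """Old regime: '+' is a method call inside a single program."""
    return b + c


def add_service(request: str) -> str:
    """New regime: '+' is an independent entity that only receives a message."""
    message = json.loads(request)
    total = sum(record["balance"] for record in message["records"])
    return json.dumps({"result": {"balance": total}})


# b and c are now resources (hypothetical bank records), not words in memory.
b = {"owner": "alice", "balance": 120.0}
c = {"owner": "bob", "balance": 80.0}

request = json.dumps({"records": [b, c]})
response = json.loads(add_service(request))
```

The service never shares memory with its caller, so it could run on another machine or be replaced without touching the calling program.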
      What’s wrong with old programs
   They were made of instructions, methods, subroutines
    and libraries thereof
   Languages (Java, C++) encouraged spaghetti
    programming that linked parts of programs together
    • This leads to efficient but unmaintainable software
   However now computers and networks are several
    orders of magnitude faster
    • Optimize for modularity and maintainability and rarely if
      ever optimize for performance
   Old programs have the wrong optimization and by
    construction are hard to maintain/change
        Old and New Software Regime
   Web Services, Grids and P2P systems are built with
    • The new software model: independent entities connected by
      explicit messages
        All computer entities are actually connected by some form
         of message (traveling on bus or from memory to register)
         but often implicit
    • And they support the distributed services and resources
      needed for global science, fun and business
    • Google, Amazon, Yahoo and perhaps Microsoft and
      Electronic Arts can exploit this model
   Old programs have the old architecture and cannot be
    • At best can wrap partial functionalities as services and use as
      a black box
    • IBM, Oracle and the old Enterprise software companies have
      this noose around their necks
           Delicious Applications
   del.icio.us, purchased by Yahoo for ~$30M
   Connotea (Nature)
    • Associate metadata with Bookmarks specified by URL’s,
      DOI’s (Digital Object Identifiers)
    • Users add comments and keywords (called tags)
    • Users are linked together into groups (communities)
    • Information such as title and authors extracted automatically
      from some sites (PubMed, ACM, IEEE, Wiley etc.)
    • Bibtex like additional information
   This is de facto Semantic Web – remarkable for its
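
The tagging model listed above (bookmarks identified by URL or DOI, user comments and tags, users linked into groups) could be held in a structure like the following. This is a sketch under stated assumptions: `Bookmark`, `TagStore` and the sample identifiers are hypothetical and do not reflect how any particular tagging site stores its data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Bookmark:
    identifier: str                      # a URL or a DOI
    user: str
    tags: Set[str] = field(default_factory=set)
    comment: str = ""


class TagStore:
    def __init__(self) -> None:
        self.bookmarks: List[Bookmark] = []
        self.groups: Dict[str, Set[str]] = {}   # group name -> user names

    def add(self, bookmark: Bookmark) -> None:
        self.bookmarks.append(bookmark)

    def join(self, group: str, user: str) -> None:
        self.groups.setdefault(group, set()).add(user)

    def by_tag(self, tag: str, group: Optional[str] = None) -> List[str]:
        """Identifiers carrying `tag`, optionally limited to one group's users."""
        members = self.groups.get(group, set()) if group else None
        hits = {
            b.identifier
            for b in self.bookmarks
            if tag in b.tags and (members is None or b.user in members)
        }
        return sorted(hits)


store = TagStore()
store.add(Bookmark("https://example.org/paper-1", "ann", {"grid", "workflow"}))
store.add(Bookmark("doi:10.0000/hypothetical", "raj", {"grid"}, "good survey"))
store.join("earthquake", "raj")
```

Calling `store.by_tag("grid")` returns both identifiers, while `store.by_tag("grid", group="earthquake")` returns only the one bookmarked by a member of that group.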

Connotea queried by SERVOGrid

    Provenance and Delicious ????
   ???? is any field such as chemistry
   All ???? Data should be associated with provenance that
    describes its lineage
     • How and when it was created
     • Compiler options used in simulation
     • ????XMLfrontendedDatabase query used on what
   Provenance produced by computer automatically and/or by user
   All ????Data can and should be labeled by a URI such as
   We can use a Delicious-style interface to annotate ????Data with
    missing provenance and user comments of any type (describing
    quality of data or a keyword relating different data etc.)
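
One way to picture the provenance described above is a record keyed by the data's URI, filled in partly by the computer at creation time and partly by users afterwards. The field names, the helper functions and the `urn:example:` URI below are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Iterable, List


@dataclass
class Provenance:
    uri: str                                   # label for the data item
    created_at: str                            # when it was created
    created_by: str                            # how: the tool that produced it
    parameters: Dict[str, str] = field(default_factory=dict)
    annotations: List[dict] = field(default_factory=list)


def record_run(uri: str, tool: str, **parameters: object) -> Provenance:
    """Provenance produced automatically by the computer."""
    return Provenance(
        uri=uri,
        created_at=datetime.now(timezone.utc).isoformat(),
        created_by=tool,
        parameters={key: str(value) for key, value in parameters.items()},
    )


def annotate(prov: Provenance, user: str, text: str,
             tags: Iterable[str] = ()) -> None:
    """Provenance and comments added later by a user."""
    prov.annotations.append({"user": user, "text": text, "tags": sorted(tags)})


prov = record_run("urn:example:run-42", "hypothetical-simulator",
                  compiler_options="-O2")
annotate(prov, "ann", "mesh too coarse near the fault", tags={"quality"})
```

Keeping user annotations in the same record as the automatic fields means a later query on the URI returns the full lineage in one place.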
           Semantic Scholar Grid
   Citeseer and Google Scholar scour the Internet and
    analyze documents for incidental metadata
   Title, author and institution of documents
   Citations with their own metadata allowing one to
    match to other documents
   These capabilities are sure to become more powerful
    and to be extended
    • Give “Citation Index” in real time
    • Tell you all authors of all papers that cite a paper that cites
      you etc. (Note it’s a small world so don’t go too far in link
    • Tell you all citations of all papers in a workshop
   Such high value tools will appear on “publisher” sites
    of future (or else publishers will disappear)
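
The query "all authors of all papers that cite a paper that cites you" is a two-hop walk over the citation graph. The sketch below assumes the graph is already available as a dictionary from each paper to the papers it cites. The function names and the toy data are hypothetical, and the `hops` parameter is there because the small-world effect makes the result grow quickly.

```python
from typing import Dict, List, Set


def citing(paper: str, cites: Dict[str, List[str]]) -> Set[str]:
    """Papers whose reference list contains `paper`."""
    return {p for p, refs in cites.items() if paper in refs}


def authors_at_hops(paper: str, cites: Dict[str, List[str]],
                    authors: Dict[str, List[str]], hops: int = 2) -> Set[str]:
    """Authors of the papers reached by following `hops` citing links."""
    frontier = {paper}
    for _ in range(hops):
        frontier = {p for target in frontier for p in citing(target, cites)}
    return {a for p in frontier for a in authors.get(p, [])}


# Toy data: paper -> papers it cites, and paper -> its authors.
cites = {"B": ["A"], "C": ["B"], "D": ["B", "A"], "E": ["C"]}
authors = {"A": ["you"], "B": ["Bo"], "C": ["Cy", "Di"], "D": ["Di"], "E": ["Ed"]}

one_hop = authors_at_hops("A", cites, authors, hops=1)
two_hops = authors_at_hops("A", cites, authors, hops=2)
```

Here `one_hop` holds the authors of papers citing A directly, and `two_hops` holds the authors of papers that cite those papers.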
OSCAR2 Chemistry
Document analysis
   It detects “magic” chemical strings in text and then
    • Stores them as metadata associated with document
   Queries repositories to tell you lots of information about identified
   Tells you which other documents have this
         ???? Version of OSCAR
   Some of the ???? Nodes will store metadata associated
    with ????Data – including documents
    • Note documents could be anywhere on the Internet – the
      ????Node may choose to store (a copy of) document or just its
    • Note all ????Nodes are federated i.e. there is no “one central”
      store of any type of data
   Metadata will be user annotations including tags,
    Citeseer style citation information for all scientific
   Then each scientific field has its own version of OSCAR
    tuned to extract natural metadata for science – for
    Earthquake science this is GML and Chemistry is
    CML …
Document-enhanced Research Grid
     [Figure: Traditional Cyberinfrastructure; Export: RSS, Bibtex, Endnote
      etc.; Bibliographic Database; MyResearch Database; Generic Document
      Tools; Community Tools; Integration/User Interface. Existing tools with
      their Existing User Interface: Windows Live Academic Search, Google
      Scholar, PubChem, PubMed, CiteULike, Bibsonomy, Biolicious, CMT,
      Manuscript Management, etc.]
     [Figure: New Document-enhanced Research Tools, Web service Wrappers,
      Existing Document-based Research Tools. Native UI-1, UI-4, UI-3 … UI-N;
      Tool-1, Tool-2, Tool-3 … Tool-N (e.g. Connotea, CiteULike, CiteSeer);
      Gateway WS-1, WS-2, WS-3 … WS-N; SSG Domain-1 … SSG Domain-N Web
      service; User Interface UI]
          Integration Framework of Tools
