Web Services Globus OGSA - PowerPoint Presentation

					Grid Technology A
  Web Services
  Globus OGSA
     & Grid
   Architecture
Prosperity Games
   August 2003
      Geoffrey Fox
  Community Grids Lab
   Indiana University
    gcf@indiana.edu
                 With Thanks to
• Tony Hey my co-speaker at CERN and

•   I adapted presentations from
•   Marlon Pierce
•   Dennis Gannon
•   Globus
•   Malcolm Atkinson
•   David de Roure
What is a High Performance Computer?
• We might wish to consider three classes of multi-node computers
• 1) The Grid or distributed systems of computers around the
  network
   – Latencies of inter-node communication – 100s of milliseconds but can have
     good bandwidth
• 2) The classic cluster, which can vary from configurations like 1) to
  3) but typically has millisecond latency and modest bandwidth
• 3) Classic MPP with microsecond latency and scalable internode
  bandwidth (tcomm/tcalc ~ 10 or so)
• All have same peak CPU performance but synchronization costs
  decrease as one goes from 1) to 3)
• Cost of system (dollars per gigaflop) increases by factors of 2 at
  each step from 1) to 2) to 3)
• One should NOT use classic MPP if class 1) or 2) suffices unless
  some security or data issue dominates over cost-performance
                  What is a Grid I?
• Collaborative Environment (Ch2.2,18)
• Combining powerful resources, federated computing and a security
  structure (Ch38.2)
• Coordinated resource sharing and problem solving in dynamic multi-
  institutional virtual organizations (Ch6)
• Data Grids as Managed Distributed Systems for Global Virtual
  Organizations (Ch39)
• Distributed Computing or distributed systems (Ch2.2,10)
• Enabling Scalable Virtual Organizations (Ch6)
• Enabling use of enterprise-wide systems, and someday nationwide
  systems, that consist of workstations, vector supercomputers, and
  parallel supercomputers connected by local and wide area networks.
  Users will be presented the illusion of a single, very powerful
  computer, rather than a collection of disparate machines. The system
  will schedule application components on processors, manage data
  transfer, and provide communication and synchronization in such a
  manner as to dramatically improve application performance. Further,
  boundaries between computers will be invisible, as will the location
  of data and the failure of processors. (Ch10)
                What is a Grid II?
• Supporting e-Science representing increasing global collaborations of
  people and of shared resources that will be needed to solve the new
  problems of Science and Engineering (Ch36)
• As infrastructure that will provide us with the ability to dynamically
  link together resources as an ensemble to support the execution of
  large-scale, resource-intensive, and distributed applications. (Ch1)
• Makes high-performance computers superfluous (Ch6)
• Metasystems or metacomputing systems (Ch10,37)
• Middleware as the services needed to support a common set of
  applications in a distributed network environment (Ch6)
• Next Generation Internet (Ch6)
• Peer-to-peer Network (Ch10, 18)
• Realizing thirty year dream of science fiction writers that have spun
  yarns featuring worldwide networks of interconnected computers that
  behave as a single entity. (Ch10)
           SERVOGrid Caricature
[Figure labels: Repositories; Federated Databases; Database; Database; Sensor Nets; Streaming Data; Loosely Coupled Filters; Closely Coupled Compute Nodes; Analysis and Visualization]
       What is Grid Technology?
• Grids support distributed collaboratories or virtual
  organizations integrating concepts from
• The Web
• Distributed Objects (CORBA Java/Jini COM)
• Globus Legion Condor NetSolve Ninf and other High
  Performance Computing activities
• Peer-to-peer Networks
• With perhaps the Web being the most important for
  “Information Grids” and Globus for “Compute Grids”
• We say Information Grids rather than the usual Data Grids, because
  “distributed file systems” (holding lots of data!) are
  handled in Compute Grids
PPPH: Paradigms Protocols Platforms and Hosting I
• We will start from the Web view and assert that
  basic paradigm is
• Meta-data rich Web Services communicating via
  messages
• These have some basic support from some runtime
  such as .NET, Jini (pure Java), Apache
  Tomcat+Axis (Web Service toolkit), Enterprise
  JavaBeans, WebSphere (IBM) or GT3 (Globus
  Toolkit 3)
  – These are the distributed equivalent of operating
    system functions as in UNIX Shell
• Called Hosting Environment or platform
       SERVOGrid Requirements
• Seamless Access to Data repositories and large scale
  computers
• Integration of multiple data sources including sensors,
  databases, file systems with analysis system
   – Including filtered OGSA-DAI
• Rich meta-data generation and access with SERVOGrid
  specific Schema extending openGIS standards and using
  Semantic Grid
• Portals with component model for user interfaces and web
  control of all capabilities
• Collaboration to support world-wide work
• Basic Grid tools: workflow and notification
                         Approach
• Build on e-Science methodology and Grid
  technology
• Science applications with multi-scale
  models, scalable parallelism, data
  assimilation as key issues
  – Data-driven models for earthquakes, climate,
    environment …
• Use existing code/database technology
  (SQL/Fortran/C++) linked to “Application
  Web/OGSA services”
  – XML specification of models, computational
    steering, scale supported at “Web Service” level
    since “high performance” is not needed here
  – Allows use of Semantic Grid technology
[Figure labels: Application WS; WS linking to user and Other WS (data sources); Typical codes]
   Integration of Data and Filters
• One has the OGSA-DAI Data repository interface
  combined with WSDL of the (Perl, Fortran, Python …)
  filter
• User only sees WSDL not data syntax
• Some non-trivial issues as to where the filtering compute
  power is
   – Microsoft says filter next to data

[Figure labels: WSDL of Filter; Filter; OGSA-DAI Interface; DB]
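As a rough illustration of the bullets above, the sketch below wraps a data repository behind a filter so that the caller sees only the filter's interface and never the data syntax. The class names Repository and FilterService, the in-memory rows and the magnitude threshold are all hypothetical stand-ins; none of this is the OGSA-DAI API.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, float]


class Repository:
    """Hypothetical stand-in for a data repository interface (not OGSA-DAI)."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows

    def query(self) -> Iterable[Row]:
        # A real repository would run a database query here.
        return iter(self._rows)


class FilterService:
    """Exposes only filtered results; the caller never touches the repository."""

    def __init__(self, repo: Repository, keep: Callable[[Row], bool]) -> None:
        self._repo = repo
        self._keep = keep

    def run(self) -> List[Row]:
        return [row for row in self._repo.query() if self._keep(row)]


repo = Repository([{"magnitude": 3.1}, {"magnitude": 6.4}, {"magnitude": 5.0}])
service = FilterService(repo, keep=lambda row: row["magnitude"] >= 5.0)
filtered = service.run()
```

In this sketch the filter runs wherever FilterService lives, which is exactly the placement question the slide raises: the same interface could sit next to the data or next to the user.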
SERVOGrid Complexity Computing Environment
[Figure labels: Database; Database Service; Compute Service; Parallel Simulation Service; Sensor Service; Middle Tier with XML Interfaces; Application Service-1; Application Service-2; Application Service-3; XML Meta-data Service; CCE Control Portal Aggregation; Complexity Simulation Service; Visualization Service; Users; Portal such as “Jetspeed”]
[Figure: layered Grid architecture inside a Hosting Environment. Layers, top to bottom: AWS (Application Web Services); Application/User Framework supporting development and deployment of OGSI compliant AWS; Generic Application Services; OGSA Interoperability Layer; “Sophisticated” System Services; OGSA Interoperability Layer; Resource Grid Services; Database Resources (e.g. DAI compliant database). Side labels: Grid Computing or Programming Environments; Web Services; “Core” Grid]
Taxonomy of Grid Operational Style
Name of Grid Style     Description of Grid Operational or Architectural Style
Semantic Grid          Integration of Grid and Semantic Web meta-data and ontology technologies
Peer-to-peer Grid      Grid built with peer-to-peer mechanisms
Lightweight Grid       Grid designed for rapid deployment and minimum life-cycle support costs
Collaboration Grid     Grid supporting collaborative tools like the Access Grid, whiteboard and shared applications
R3 or Autonomic Grid   Fault tolerant and self-healing Grid: Robust, Reliable, Resilient (R3)
      Taxonomy of Grid Functionalities
Name of Grid Type                        Description of Grid Functionality
Compute/File Grid or Data File Grid      Run multiple jobs with distributed compute and data resources (Global “UNIX Shell”)
Desktop Grid (e.g. SETI@Home)            “Internet Computing” and “Cycle Scavenging” with secure sandbox on large numbers of untrusted computers
Information Grid or Data Service Grid    Grid service access to distributed information, data and knowledge repositories
Complexity or Hybrid Grid                Hybrid combination of Information and Compute/File Grid emphasizing integration of experimental data, filters and simulations: Data assimilation
Campus Grid                              Grid supporting University community computing
Enterprise Grid                          Grid supporting a company’s enterprise infrastructure
            What is a Web Service I
• A web service is a computer program running on either the local
  or remote machine with a set of well defined interfaces (ports)
  specified in XML (WSDL)
• In principle, the computer program can be in any language (Fortran ..
  Java .. Perl .. Python) and the interfaces can be implemented in
  any way whatsoever
   – Interfaces can be method calls, Java RMI Messages, CGI Web
      invocations, or totally compiled away (inlining)
• But the simplest implementations involve XML messages (SOAP)
  and programs written in net friendly languages like Java and
  Python
• Web Services separate the meaning of a port (message) interface
  from its implementation
• Enhances/Enables Re-usable component model of ANY
  electronic resource
[Figure labels: Raw Resources; Raw Data; Raw Data; (Virtual) XML Data Interface; Web Service (WS), repeated in several tiers; XML WS to WS Interfaces; (Virtual) XML Knowledge (User) Interface; Render to XML Display Format; (Virtual) XML Rendering Interface; Clients]
         What is a Web Service II
• Web Services have the important implication that ALL
  interfaces are XML message based. In contrast
• Most Windows programs have interfaces defined as
  interrupts due to user inputs
• Most software has interfaces defined as methods which
  might be implemented as a message but this is often NOT
  explicit
[Figure labels: Payment Credit Card; Security; Catalog; Warehouse shipping; WSDL interfaces]
          What is a Web Service III
• “Everything electronic” is a resource
   – Computers; Programs; People
   – Data (from sensors to this presentation to email to databases)
• “Everything electronic” is a distributed object
• All resources have interfaces which are defined in XML for both
  properties (data-structure) and methods (service, function,
  subroutine) (Resources are Services)
   – We can assume that a data-structure property has
     getproperty() and setproperty(value) methods to act as
     interface
• All resources are linked by messages with structure, which must
  be specifiable in XML
• All resources have a URI such as unique://a/b/c …….
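A minimal sketch of the resource view described on this slide, assuming a hypothetical Resource class: the resource carries a URI, its properties are reached only through getproperty() and setproperty(value) style methods, and it can describe itself in XML. The XML element names are invented for illustration and do not follow any standard schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict


class Resource:
    """Hypothetical resource: a URI plus properties reached only via methods."""

    def __init__(self, uri: str) -> None:
        self.uri = uri
        self._properties: Dict[str, str] = {}

    def setproperty(self, name: str, value: str) -> None:
        self._properties[name] = value

    def getproperty(self, name: str) -> str:
        return self._properties[name]

    def describe(self) -> str:
        # Illustrative XML only; element names are invented for this sketch.
        root = ET.Element("resource", {"uri": self.uri})
        for name in sorted(self._properties):
            ET.SubElement(root, "property", {"name": name})
        return ET.tostring(root, encoding="unicode")


sensor = Resource("unique://a/b/c")
sensor.setproperty("units", "mm")
description = sensor.describe()
```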
              WSDL Abstractions
• WSDL abstracts a program as an entity that does
  something given one or more inputs with its results
  defined by streams on one or more outputs.
• Functions are defined by method name and parameters
  methodname(parm1,parm2, … parmN)
  – Where parameters are “Input”, “Output” or both
• In WSDL, we will have a Web Service which, like a
  Java or CORBA program, can be thought of as a
  (distributed) object with many methods
  – Instead of a function call, the “calling routine” sends an XML
    message to the Web Service specifying methodname and
    values of the parameters
  – Note name of function is just another parameter
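The idea that a call becomes a message, with the method name as just another field, can be sketched as follows. The XML layout (call, methodname, param) and the helper names encode_call and dispatch are assumptions made for this example; they are not WSDL or SOAP syntax.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict


def encode_call(method: str, **params: str) -> str:
    """Build an XML message; the method name is just another field."""
    call = ET.Element("call")
    ET.SubElement(call, "methodname").text = method
    for name, value in params.items():
        ET.SubElement(call, "param", {"name": name}).text = value
    return ET.tostring(call, encoding="unicode")


def dispatch(message: str, methods: Dict[str, Callable[..., str]]) -> str:
    """Service side: read the method name and parameters, then invoke."""
    call = ET.fromstring(message)
    method = call.findtext("methodname")
    params = {p.get("name"): p.text for p in call.findall("param")}
    return methods[method](**params)


def add(x: str, y: str) -> str:
    return str(int(x) + int(y))


message = encode_call("add", x="2", y="3")
result = dispatch(message, {"add": add})
```

The caller and the service share only the message format, so either side could be reimplemented in another language without the other noticing.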
       Details of WSDL Protocol Stack
• UDDI finds where programs are
   – remote (distributed) programs are just Web Services
   – (not a great success)
• WSFL links programs together (under revision as BPEL4WS)
• WSDL defines interface (methods, parameters, data formats)
• SOAP defines structure of message including serialization of information
• HTTP is negotiation/transport protocol
• TCP/IP is layers 3-4 of OSI
• Physical Network is layer 1 of OSI
[Figure: protocol stack, top to bottom: UDDI or WSIL; WSFL; WSDL; SOAP or RMI; HTTP or SMTP or IIOP or RMTP; TCP/IP; Physical Network]
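To make the SOAP and HTTP layers of the stack concrete, here is a small sketch that wraps a call in a SOAP 1.1 envelope and prepares the HTTP headers that would accompany it. Nothing is sent over the network. The method name getForecast, the urn:example:servo namespace and the SOAPAction value are invented for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace


def soap_envelope(method: str, params: dict, service_ns: str) -> bytes:
    """Wrap a method call in a SOAP 1.1 Envelope/Body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


# SOAP layer: the serialized message (hypothetical service and method).
payload = soap_envelope("getForecast", {"region": "socal"}, "urn:example:servo")

# HTTP layer: headers that would accompany an HTTP POST of the payload.
headers = {
    "Content-Type": "text/xml; charset=utf-8",
    "SOAPAction": '"urn:example:servo#getForecast"',
    "Content-Length": str(len(payload)),
}
```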
        Education as a Web Service
• Can link to Science as a Web Service and substitute educational
  modules
• “Learning Object” XML standards already exist from IMS/ADL
  http://www.adlnet.org – need to update architecture
• Web Services for virtual university include:
• Registration
• Performance (grading)
• Authoring of Curriculum
• Online laboratories for real and virtual instruments
• Homework submission
• Quizzes of various types (multiple choice, random parameters)
• Assessment data access and analysis
• Synchronous Delivery of Curricula
• Scheduling of courses and mentoring sessions
• Asynchronous access, data-mining and knowledge discovery
• Learning Plan agents to guide students and teachers
 What are System and Application Services?
• There are generic Grid system services: security, collaboration,
  persistent storage, universal access
   – OGSA (Open Grid Service Architecture) is implementing these as
     extended Web Services
• An Application Web Service is a capability used either by another
  service or by a user
   – It has input and output ports – data is from sensors or other
     services
• Consider Satellite-based Sensor Operations as a Web Service
   – Satellite management (with a web front end)
   – Each tracking station is a service
   – Image Processing is a pipeline of filters – which can be grouped
     into different services
   – Data storage is an important system service
   – Big services built hierarchically from “basic” services
• Portals are the user (web browser) interfaces to Web services
                Application Web Services
• Note model integrates sensors, sensor analysis, simulations and people
• An Application Web Service is a capability used either by another service or by a
  user
     – It has input and output ports – data is from users, sensors or other services
     – Big services built hierarchically from “basic” services
[Figure labels: Filter1 WS, Filter2 WS, Filter3 WS (Build as multiple Filter Web Services); Prog1 WS, Prog2 WS (Build as multiple interdisciplinary Programs); Sensor Data as a Web service (WS); Sensor Management WS; Data Analysis WS; Simulation WS; Visualization WS]
     The Application Service Model
• As bandwidth of communication (between) services increases
  one can support smaller services
• A service “is a component” and is a replacement for a library in
  cases where performance allows
• Services (components) are a sustainable model of software
  development – each service has documented capability with
  standards compliant interfaces
   – XML defines interfaces at several levels
   – WSDL at Service interface level and XSIL or equivalent for
     scientific data format
• A service can be written as Perl, Python, Java Servlet, Enterprise
  JavaBean, CORBA (C++ or Fortran) Object …
• Communication protocol can be RMI (Java), IIOP (CORBA) or
  SOAP (HTTP, XML) ……
             7 Primitives in WSDL
• types: which provides data type definitions used to describe the
  messages exchanged.
• message: which represents an abstract definition of the data
  being transmitted. A message consists of logical parts, each of
  which is associated with a definition within some type system.
• operation: an abstract description of an action supported by the
  service.
• portType: which is a set of abstract operations. Each operation
  refers to an input message and output messages.
• binding: which specifies concrete protocol and data format
  specifications for the operations and messages defined by a
  particular portType.
• port: which specifies an address for a binding, thus defining a
  single communication endpoint.
• service: which is used to aggregate a set of related ports
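The seven primitives can be seen together in a skeleton WSDL 1.1 document. The sketch below embeds one for a hypothetical "Echo" service (all names and the address are invented, and the skeleton has not been validated against the WSDL schema), then counts how often each primitive appears.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Hypothetical "Echo" service; names and the address are invented.
ECHO_WSDL = """
<definitions name="Echo" targetNamespace="urn:example:echo"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="urn:example:echo"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/">
  <types/>
  <message name="EchoRequest"><part name="text" type="xsd:string"/></message>
  <message name="EchoResponse"><part name="text" type="xsd:string"/></message>
  <portType name="EchoPortType">
    <operation name="echo">
      <input message="tns:EchoRequest"/>
      <output message="tns:EchoResponse"/>
    </operation>
  </portType>
  <binding name="EchoBinding" type="tns:EchoPortType">
    <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="echo"/>
  </binding>
  <service name="EchoService">
    <port name="EchoPort" binding="tns:EchoBinding">
      <soap:address location="http://localhost:8080/echo"/>
    </port>
  </service>
</definitions>
"""

root = ET.fromstring(ECHO_WSDL)
primitives = ["types", "message", "operation", "portType", "binding", "port", "service"]
# Count elements in the WSDL namespace only (soap:binding is not counted).
counts = {name: len(root.findall(f".//{{{WSDL_NS}}}{name}")) for name in primitives}
```

Note how the abstract part (types, message, operation, portType) says nothing about transport, while binding, port and service attach a concrete protocol and address.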
 OGSA OGSI & Hosting Environments
• Start with Web Services in a hosting environment
• Add OGSI to get a Grid service and a component model
• Add OGSA to get Interoperable Grid “correcting” differences in base
  platform and adding key functionalities




[Figure: layered stack, bottom to top: Network; Hosting Environment for WS; OGSI on Web Services; Broadly applicable services: registry, authorization, monitoring, data access, etc. (OGSA); More specialized services: data replication, workflow, etc. (Possibly OGSA); Domain-specific services (Not OGSA). Side labels: Environment; Given to us from on high; Models for resources & other entities; Other models]
    OGSI Open Grid Service Interface
•   http://www.gridforum.org/ogsi-wg
•   It is a “component model” for web services.
•   It defines a set of behavior patterns that each OGSI service must exhibit.
•   Every “Grid Service” portType extends a common base type.
     – Defines an introspection model for the service
     – You can query it (in a standard way) to discover
           • What methods/messages a port understands
           • What other port types does the service provide?
           • If the service is “stateful” what is the current state?
•   Factory Model
•   A set of standard portTypes for
     – Message subscription and notification
     – Service collections
•   Each service is identified by a URI called the “Grid Service Handle” (GSH)
•   GSHs are bound dynamically to Grid Service References (GSRs, typically WSDL
    docs)
     – A GSR may be transient. GSHs are fixed.
     – Handle map services translate GSHs into GSRs.
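The handle and reference split can be sketched as a lookup table: the handle stays fixed while the reference it resolves to may change. The HandleMap class and both URLs below are hypothetical and do not reproduce the OGSI portTypes.

```python
from typing import Dict


class HandleMap:
    """Hypothetical handle-map service: fixed GSH -> current (transient) GSR."""

    def __init__(self) -> None:
        self._references: Dict[str, str] = {}

    def bind(self, gsh: str, gsr: str) -> None:
        # Rebinding replaces the old reference; the handle itself never changes.
        self._references[gsh] = gsr

    def resolve(self, gsh: str) -> str:
        return self._references[gsh]


handle_map = HandleMap()
gsh = "http://example.org/handles/simulation-42"  # invented handle
handle_map.bind(gsh, "http://node-a.example.org/sim?wsdl")  # invented reference
first = handle_map.resolve(gsh)
handle_map.bind(gsh, "http://node-b.example.org/sim?wsdl")  # service moved
second = handle_map.resolve(gsh)
```

A client that stores only the handle keeps working after the service moves, at the cost of one extra resolution step.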
        Categories of Worldwide Grid Services
           to be exploited by SERVOGrid
•   1) Types of Grid
    – R3
    – Lightweight
    – P2P
    – Federation and Interoperability
•   2) Core Infrastructure and Hosting Environment
    – Service Management
    – Component Model
    – Service wrapper/Invocation
    – Messaging
•   3) Security Services
    – Certificate Authority
    – Authentication
    – Authorization
    – Policy
•   4) Workflow Services and Programming Model
    – Enactment Engines (Runtime)
    – Languages and Programming
    – Compiler
    – Composition/Development
•   5) Notification Services
•   6) Metadata and Information Services
    – Basic including Registry
    – Semantically rich Services and meta-data
    – Information Aggregation (events)
    – Provenance
•   7) Information Grid Services
    – OGSA-DAI/DAIT
    – Integration with compute resources
    – P2P and database models
•   8) Compute/File Grid Services
    – Job Submission
    – Job Planning Scheduling Management
    – Access to Remote Files, Storage and Computers
    – Replica (cache) Management
    – Virtual Data
    – Parallel Computing
•   9) Other services including
    – Grid Shell
    – Accounting
    – Fabric Management
    – Visualization Data-mining and Computational Steering
    – Collaboration
•   10) Portals and Problem Solving Environments
•   11) Network Services
    – Performance
    – Reservation
    – Operations
    Functional Level above OGSA
•   Systems Management and Automation
•   Workload / Performance Management
•   Security
•   Availability / Service Management
•   Logical Resource Management
•   Clustering Services
•   Connectivity Management
•   Physical Resource Management
•   Perhaps Data Access belongs here
         Two-level Programming I
• The paradigm implicitly assumes a two-level Programming
  Model
• We make a Service (same as a “distributed object” or
  “computer program” running on a remote computer) using
  conventional technologies
   – C++ Java or Fortran Monte Carlo module
   – Data streaming from a sensor or Satellite
   – Specialized (JDBC) database access
• Such nuggets accept and produce data from users, files and
  databases
• The Grid is built by coordinating such nuggets assuming
  we have solved problem of programming the nugget
[Figure labels: Nugget; Data]
         Two-level Programming II
• The Grid is discussing the linkage and distribution of the
  nuggets, with the only addition being runtime interfaces
  to Grid as opposed to UNIX data streams
[Figure labels: Nugget1; Nugget2; Nugget3; Nugget4]

• Familiar from use of UNIX Shell, PERL or Python scripts
  to produce real applications from core programs
• Such interpretative environments are the single processor
  analog of Grid Programming
• Some projects like GrADS from Rice University are
  looking at integration between nugget levels but dominant
  effort looks at each level separately
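The two levels can be sketched in a single process: each "nugget" is an ordinary function written with conventional techniques, and a separate composition step chains them the way a shell pipeline chains programs. The nuggets detrend and clip and the helper compose are invented for this example; on a Grid the nuggets would be remote services.

```python
from functools import reduce
from typing import Callable, List, Sequence

Nugget = Callable[[List[float]], List[float]]


def detrend(samples: List[float]) -> List[float]:
    """Nugget 1: subtract the mean (stands in for a Fortran or C++ module)."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


def clip(samples: List[float]) -> List[float]:
    """Nugget 2: limit values to the range [-1, 1]."""
    return [max(-1.0, min(1.0, s)) for s in samples]


def compose(nuggets: Sequence[Nugget]) -> Nugget:
    """Upper level: chain nuggets the way a shell pipeline chains programs."""
    return lambda data: reduce(lambda acc, nugget: nugget(acc), nuggets, data)


workflow = compose([detrend, clip])
output = workflow([1.0, 2.0, 6.0])
```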
Why we can dream of using HTTP and
          that slow stuff
•   We have at least three tiers in computing environment
•   Client (user portal discussed Thursday)
•   “Middle Tier” (Web Servers/brokers)
•   Back end (databases, files, computers etc.)
•   In Grid programming, we use HTTP (and used to use
    CORBA and Java RMI) in middle tier ONLY to
    manipulate a proxy for real job
    – Proxy holds metadata
    – Control communication in middle tier only uses metadata
    – “Real” (data transfer) high performance communication in
      back end
[Figure labels: User Services; Portal Services; Grid Computing Environments; System Services (several); Application Service; Application Metadata; Middleware; “Core” Grid; Actual Application; Raw (HPC) Resources; Database]
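A minimal sketch of the proxy idea from this slide, assuming hypothetical JobProxy and MiddleTier classes: the middle tier passes around small metadata records only, while the bulk data stays in the back end and is referred to by location. The URLs and host name are invented.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class JobProxy:
    """Hypothetical middle-tier proxy: holds metadata, never the data itself."""

    job_id: str
    metadata: Dict[str, str] = field(default_factory=dict)
    state: str = "pending"


class MiddleTier:
    """Control path: slow messages are acceptable because they are small."""

    def __init__(self) -> None:
        self.proxies: Dict[str, JobProxy] = {}

    def submit(self, job_id: str, input_url: str, host: str) -> JobProxy:
        proxy = JobProxy(job_id, {"input": input_url, "host": host})
        self.proxies[job_id] = proxy
        return proxy

    def mark_done(self, job_id: str, output_url: str) -> None:
        proxy = self.proxies[job_id]
        proxy.metadata["output"] = output_url  # a pointer to the result only
        proxy.state = "done"


tier = MiddleTier()
tier.submit("job-1", "gsiftp://store.example.org/in.dat", "hpc.example.org")
tier.mark_done("job-1", "gsiftp://store.example.org/out.dat")
proxy = tier.proxies["job-1"]
```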
PPPH: Paradigms Protocols Platforms and Hosting II
• Self-describing programs/interfaces are key to scaling
   – Minimize amount of work system has to do
   – Hide as much as possible in services and applications
• Protocols describe (in “principle” at least) those rules that
  system obeys and uses to deliver information between
  services (processes)
• Interfaces tell the service what to do to interpret the results
  of communication
• HTTP is the dominant transport protocol of the Web
• HTML is the “interface” telling browser how to render
• But you can extend interface to allow PDF, multimedia,
  PowerPoint using “helper applications” which are (with
  more or less convenience) “automatically”
  downloaded if not already available
   – “MIME types” essentially “self-describe” each interface
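The self-describing role of MIME types can be sketched with Python's standard mimetypes module: the type inferred for a piece of content selects the helper that renders it. The HELPERS table and the helper names are invented for illustration.

```python
import mimetypes

# Hypothetical mapping from MIME type to the "helper" that renders it.
HELPERS = {
    "text/html": "browser",
    "application/pdf": "pdf-viewer",
}


def choose_helper(filename: str) -> str:
    """Pick a renderer from the self-describing MIME type of the content."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return HELPERS.get(mime_type, "download-helper")


html_helper = choose_helper("index.html")
pdf_helper = choose_helper("slides.pdf")
other_helper = choose_helper("archive.unknownext")
```

The system itself needs no knowledge of PDF or HTML; it only routes on the type label, which is the scaling argument made in the first bullet of the slide.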
                Analogy with Web II
• HTTP and HTML are the analogies on the client side
• A “Web Service” generalizes a CGI Script on server side
   – CGI is essentially a Distributed Object technology
     allowing server to access an arbitrary program labeled by
     a URL plus an ugly syntax to specify name and
     parameters of program to run
• Roughly WSDL (Web Service Description Language) is a
  better way to specify program name and its parameters
• Web uses other protocols – HTTPS for secure links and
  RTP etc. for multimedia (UDP) streams
   – These again are required to integrate system – codecs like
     MPEG are interfaces interpreted by client
   – There are further protocols like H323 and SIP which will
     be replaced (IMHO) by HTTP plus RTP etc. We should
     minimize number of protocols to get maintainable
     systems
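For the CGI bullet on this slide, the sketch below shows what "a URL plus an ugly syntax" looks like in practice: the program is labeled by a URL and its name and parameters travel in the query string, with no machine-readable description of which parameters exist. The script URL and parameter names are invented.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def cgi_url(script_url: str, **params: str) -> str:
    """Client side: encode the parameters into the URL's query string."""
    return f"{script_url}?{urlencode(params)}"


def read_params(url: str) -> dict:
    """Server side: recover the parameters a CGI script would receive."""
    query = urlsplit(url).query
    return {name: values[0] for name, values in parse_qs(query).items()}


url = cgi_url("http://example.org/cgi-bin/run", program="forecast", region="so cal")
params = read_params(url)
```

Nothing in the URL tells a caller that "program" and "region" are the expected names or what types they take, which is the gap a WSDL description is meant to fill.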
PPPH: Paradigms Protocols Platforms and Hosting III
• There is a set of system capabilities which cannot be captured as
  standalone services and permeate Grid
• Meta-data rich Message-linked Web Services is the permeating paradigm
• Component Model such as “Enterprise JavaBean (EJB)” or OGSI
  describes the formal structure of services – EJB if used lives inside
  OGSI in our Grids
• Invocation Framework describes how you interact with system
• Security in fine grain fashion to provide selective authorization
  (Globus and EDG WP6)
• Policy context describes rules for this particular Grid
• Transport mechanisms abstract concepts like ports and Quality of
  Service
• Messaging abstracts destination and customization of content
• Network (monitoring, performance) EDG WP7
• Fabric (resources) EDG WP4
                 Virtualization
• The Grid could and sometimes does virtualize various
  concepts
• Location: URI (Uniform Resource Identifier) virtualizes
  URL
• Replica management (caching) virtualizes file location
  generalized by GriPhyN virtual data concept
• Protocol: message transport and WSDL bindings
  virtualize transport protocol as a QoS request
• P2P or Publish-subscribe messaging virtualizes matching
  of source and destination services
• Semantic Grid virtualizes Knowledge as a meta-data
  query
• Brokering virtualizes resource allocation
• Virtualization implies references can be indirect
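One of the virtualizations listed above, publish-subscribe messaging, can be sketched with a small in-process broker: publishers and subscribers refer only to a topic name, so the match between source and destination is indirect. The Broker class and the topic strings are hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class Broker:
    """Hypothetical topic broker: sources and destinations never name each other."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver to every current subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


received: List[str] = []
broker = Broker()
broker.subscribe("sensor/gps", received.append)
delivered = broker.publish("sensor/gps", "station 7: 3 mm displacement")
ignored = broker.publish("sensor/seismic", "no subscribers yet")
```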
 IFS: Interfaces and Functionality and Semantics I
• The Grid platform tries to minimize detail in protocols and
  maximize detail in interfaces to enhance scaling
• However rich meta-data and semantics are critical for
  correct and interesting operation
   – Put as much semantic interpretation as you can into specific
     services
   – Lack of Semantic interoperation is in fact main weakness of
     today’s Grids and Web services
• Everything becomes a service (See example of education)
  whether system or application level
• There are some very important “Global Services”
   – Discovery (look up) and Registration of service metadata
   – Workflow
   – MetaSchedulers
IFS: Interfaces and Functionality and Semantics II
• There are many other generally important services
• OGSA-DAI The Database Service
• Portal Service linked to by WSRP (Web services
  for Remote Portals)
• Notification of events
• Job submission
• Provenance – interpret meta-data about history of
  data
• File Interfaces
• Sensor service – satellites …
• Visualization
• Basic brokering/scheduling
    Globus in a Nutshell from IPG
• GT2 (or Globus Toolkit 2) is original (non web
  service based) version which is basis of EDG
  (European Data Grid) work
• C programs and libraries
• See Chapter 5 of book with background in chapters
  2-4 and 37
• http://www.ipg.nasa.gov/ipgusers/globus/
• http://www.globusworld.org/globusworld_web/jw2_program_tut.htm
          Globus GT2 from IPG
• The goal of the Globus GT2 is to provide dependable,
  consistent, pervasive access to high-end resources.
   – This is the original Grid “start”, generalized recently to virtual
     organizations and data grids
• The Globus Project offers the most widely used
  computing grid middleware. The Globus Project is a joint
  effort of Argonne National Laboratory, the Information
  Sciences Institute of the University of Southern
  California, in collaboration with numerous other
  organizations including NCSA, NPACI, UCSD, and
  NASA. See http://www.globus.org/ for history, goals,
  release and usage notes, software distributions, and
  research papers.
                          Globus GT2 II
• Grid Fabric: Layer One
   The fabric of the Grid comprises the underlying systems, computers, operating
   systems, networks, storage systems, and routers—the building blocks.
• Grid Services: Layer Two
   Grid services integrate the components of the Grid fabric. Examples of the services
   that are provided by Globus Toolkit 2:
• GRAM
   The Globus Resource Allocation Manager, GRAM, is a basic library service that
   provides capabilities to do remote-submission job start up. GRAM unites Grid
   machines, providing a common user interface so that you can submit a job to
   multiple machines on the Grid fabric. GRAM is a general, ubiquitous service, with
   specific application toolkit commands built on top of it
• MDS
  The Monitoring and Discovery Service, also known as GIS, the Grid Information
  Service, provides information service. You query MDS to discover the properties of
  the machines, computers and networks that you want to use: how many processors
  are available at this moment? What bandwidth is provided? Is the storage on tape or
  disk? Is the visualization device an immersive desk or CAVE? Using an LDAP
  (Lightweight Directory Access Protocol) server, MDS provides middleware
  information in a common interface to put a unifying picture on top of disparate
  equipment.
• Contd …
                    Globus GT2 III
• GSI gss-api library for adding authentication to a program. GSI
  provides programs, such as grid-proxy-init, to facilitate login to a
  variety of sites, while each site has its own flavor of security
  measures. That is, on the fabric layer, the various machines you want
  to use might be governed by disparate security policies; GSI provides
  a means of simplifying multiple remote logins. The standard
  installation is based on a PKI security system; the Kerberos
  installation of Globus is less standard. (Some installations with DoE
  and DoD insist on Kerberos)
• GridFTP A new (in Globus 2.0) protocol for file transfer over a
  grid. This is a Global Grid Forum standard
• GASS Globus Access to Secondary Storage, provides command-line
  tools and C APIs for remotely accessing data. GASS integrates
  GridFTP, HTTP, and local file I/O to enable secure transfers using
  any combination of these protocols.
                      Globus GT2 IV
• Application Toolkits: Layer Three
    Application toolkits use Grid Services to provide higher-level
    capabilities, often targeted to specific classes of application.
•   For example, the Globus development team has created a set of Grid
    service tools and a toolkit of programs for running remotely
    distributed jobs. These include remote job submission commands
    (globusrun, globus-job-submit, globus-job-run), built on top of the
    GRAM service, and MPICH-G2, a Grid-enabled implementation of
    the Message Passing Interface (MPI).
•   A more modern interface is through CoG (Commodity Grid) Kits for
    different languages (Perl, Python, Java); see chapter 26 of Book
•   The Java CoG kit provides a natural way to link GT2 to a Web
    service framework
•   Globus Toolkit 3 (GT3) effectively integrated CoG Kit interface with
    core Globus by wrapping all Globus Services as Web services
            Job Submission in Globus
• Very similar to UNIX Shell – build Portal Web Interfaces to specific
  or general Shell commands. Some example commands
• globusrun Runs a single executable on a remote site with an RSL
  specification.
• globus-job-cancel Cancels a job previously started using
  globus-job-submit.
• globus-job-run Allows you to run a job at one or several remote
  resources. It translates the program arguments to an RSL request and
  uses globusrun to submit the job.
• globus-job-clean Kills the job if it is still running and cleans the
  information concerning the job.
• globus-job-status Displays the status of the job. See also
  globus-job-get-output to check the standard output or standard error of your job.
• These are all controlled by metadata specified by the Globus
  Resource Specification Language (RSL) which provides a common
  language to describe jobs and the resources required to run them.
• http://www.globus.org/gram/gram_rsl_parameters.html
• The simplest RSL expression looks something like the following.
  (executable=/bin/ls)
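The commands above all end up handing GRAM an RSL string. As a minimal sketch (not part of the Globus Toolkit), the Python helper below assembles a conjunction of attribute/value relations, where the leading `&` marks a conjunction. The helper name `build_rsl` and the sample attribute values are illustrative assumptions, and the quoting rule is simplified.

```python
def build_rsl(**relations):
    """Build a simple RSL conjunction such as &(executable=/bin/ls)(count=1).

    Hypothetical helper for illustration only. Values containing spaces are
    wrapped in double quotes, which simplifies the real RSL quoting rules.
    """
    parts = []
    for attribute, value in relations.items():
        text = str(value)
        if " " in text:
            text = '"' + text + '"'
        parts.append(f"({attribute}={text})")
    return "&" + "".join(parts)


rsl = build_rsl(executable="/bin/ls", arguments="-l", count=1)
print(rsl)  # &(executable=/bin/ls)(arguments=-l)(count=1)
```

A string of this form could then be passed to a submission tool such as globusrun; the exact command-line options depend on the installation and are not shown here.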
Virtual Data Toolkit VDT from GriPhyN
• http://www.lsc-group.phys.uwm.edu/vdt/
• Trillium (PPDG from DoE, GriPhyN and iVDGL from NSF) is a
  major US effort building Grid application software with a
  strong particle physics emphasis
• VDT is their major software release and its heart is Condor
  and GT2.
   – There is some “virtual data” software as well but it is not clear
     if this is of interest in production use (interesting research
     area)
• Condor (Chapter 11 of Book) is a powerful job scheduler for
  clusters and “cycle scavenging”
   – It has a well developed interface (ClassAds) for defining
     requirements of jobs and matching to compute capabilities
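The matchmaking idea behind ClassAds can be sketched in a few lines of Python. This is not Condor's ClassAd language or API: the dictionaries, attribute names (`memory_mb`, `os`) and the `match` function are assumptions chosen only to show a job's requirements being tested against advertised machine capabilities.

```python
def match(job, machines):
    """Return names of machines whose advertised attributes satisfy the job."""
    return [m["name"] for m in machines if job["requirements"](m)]


# Hypothetical machine advertisements.
machines = [
    {"name": "node1", "memory_mb": 256, "os": "LINUX"},
    {"name": "node2", "memory_mb": 1024, "os": "LINUX"},
    {"name": "node3", "memory_mb": 2048, "os": "SOLARIS"},
]

# Hypothetical job advertisement: requirements are a predicate over a machine.
job = {
    "owner": "alice",
    "requirements": lambda m: m["os"] == "LINUX" and m["memory_mb"] >= 512,
}

print(match(job, machines))  # ['node2']
```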
     OGSA/OGSI Top Level View
Chapters 7 to 9 of Book
http://www.gridforum.org/Meetings/ggf7/docs/default.htm
http://www.globusworld.org/globusworld_web/jw2_program_tut.htm
• OGSA is the set of “core” Grid services
   – Stuff you can’t live without
   – If you built a Grid you would need to invent these things
[Figure: OGSA layers, top to bottom: Domain-specific services; More
specialized services (data replication, workflow, etc.); Broadly applicable
services (registry, authorization, monitoring, data access, etc.); OGSI;
Hosting Environment & Protocol Bindings (Hosting Environment, Transport
Protocol). Alongside the stack: Models for resources & other entities;
Other models.]
    OGSI Open Grid Services Infrastructure
• http://www.gridforum.org/ogsi-wg
• It is a “component model” for web services.
• It defines a set of behavior patterns that each OGSI service must exhibit.
• Every “Grid Service” portType extends a common base type.
   – Defines an introspection model for the service
   – You can query it (in a standard way) to discover
         • What methods/messages a port understands
         • What other port types the service provides
         • If the service is “stateful”, what the current state is
• A set of standard portTypes for
   – Message subscription and notification
   – Service collections
• Each service is identified by a URI called the “Grid Service Handle” (GSH)
• GSHs are bound dynamically to Grid Service References (GSRs, typically WSDL
  docs)
   – A GSR may be transient. GSHs are fixed.
   – Handle map services translate GSHs into GSRs.
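The relation between a fixed GSH and a transient GSR can be illustrated with a toy handle map. This is a sketch only: the `HandleMap` class and the example.org URLs are hypothetical and do not reflect the OGSI HandleMap portType signatures.

```python
class HandleMap:
    """Toy handle map: translates a fixed GSH into its current GSR."""

    def __init__(self):
        self._bindings = {}

    def bind(self, gsh, gsr):
        # Rebinding replaces the reference; the handle itself never changes.
        self._bindings[gsh] = gsr

    def resolve(self, gsh):
        return self._bindings[gsh]


handle_map = HandleMap()
gsh = "http://grid.example.org/handles/job-42"        # hypothetical handle
handle_map.bind(gsh, "http://hostA.example.org/job-42?wsdl")
first = handle_map.resolve(gsh)

# The service moves to another host: the GSR changes, the GSH stays fixed.
handle_map.bind(gsh, "http://hostB.example.org/job-42?wsdl")
second = handle_map.resolve(gsh)
print(first, second)
```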
            OGSI and Stateful Services
• Sometimes you can send a message to a service, get a result and
  that’s the end
   – This is a stateless service
• However most non-trivial services need state to allow persistent
  asynchronous interactions
• OGSI is designed to support Stateful services through two
  mechanisms
   – Information Port: where you can query for SDE (Service
     Data Elements)
   – “Factories” that allow one to view a Service as a “class” (in an
     object-oriented language sense) and create separate instances for
     each Service invocation
• There are several interesting issues here
   – Difference between Stateful interactions and Stateful services
   – System or Service managed instances
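The two mechanisms can be pictured with a small Python sketch: a factory creates separate service instances, and each instance exposes its state for query outside any single interaction. The class names, the `counter/N` handle format and the `query_state` method are assumptions for illustration, not OGSI interfaces.

```python
import itertools


class CounterService:
    """A stateful service instance; its state can be queried at any time."""

    def __init__(self, handle):
        self.handle = handle
        self.count = 0

    def increment(self):
        self.count += 1

    def query_state(self):
        # Stands in for querying service data through an information port.
        return {"handle": self.handle, "count": self.count}


class CounterFactory:
    """Explicit factory: creates and tracks one instance per request."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.instances = {}

    def create(self):
        handle = f"counter/{next(self._ids)}"
        instance = CounterService(handle)
        self.instances[handle] = instance
        return instance


factory = CounterFactory()
a = factory.create()
b = factory.create()
a.increment()
a.increment()
b.increment()
print(a.query_state(), b.query_state())
```

Here the factory registers every instance it creates, which is the bookkeeping an explicit factory takes on.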
                             Factories and OGSI
• Stateful interactions are typified by amazon.com where messages carry correlation
  information allowing multiple messages to be linked together
   – Amazon preserves state in this fashion which is in fact preserved in its
      database permanently
• Stateful services have state that can be queried outside a particular interaction
• Also note difference between implicit and explicit factories
   – Some claim that implicit factories scale as each service manages its own
      instances and so do not need to worry about registering instances and lifetime
      management
• See WS-Addressing, largely from IBM and Microsoft
   http://msdn.microsoft.com/webservices/default.aspx?pull=/library/en-us/dnglobspec/html/ws-addressing.asp


[Figure: an Implicit Factory and an Explicit Factory, each shown with service instances 1 to 4]
Open Grid Service Architecture

• OGSA-WG chaired by
  – Ian Foster, ANL and Univ. of Chicago
  – Jeff Nick, IBM
  – Dennis Gannon, IU
• Active Members from
  – IBM, Fujitsu, NEC, SUN, Hitachi, Avaki
  – Univ. of Mich, Chicago, Indiana (not much
    academic involvement)
        OGSA Core Services I

• Registries, and namespace bindings
   – Registry is a collection of services indexed by service
     metadata.
      • “find me a service with property X.”
   – Directory is a map from a namespace to GSHs.
   – A namespace is a human understandable version of a
     Grid Handle
• Queues
   – For building schedulers and resource brokers
   – Jobs and other requests are in queues
   – This is high-level messaging
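The difference between a registry and a directory can be sketched as follows. The `Registry` class, the metadata keys and the handle strings are hypothetical; the point is only that a registry is searched by metadata while a directory maps human-readable names to GSHs.

```python
class Registry:
    """Toy registry: services indexed by metadata, searched by property."""

    def __init__(self):
        self._entries = []  # list of (gsh, metadata) pairs

    def register(self, gsh, **metadata):
        self._entries.append((gsh, metadata))

    def find(self, **wanted):
        # "Find me a service with property X."
        return [
            gsh
            for gsh, metadata in self._entries
            if all(metadata.get(key) == value for key, value in wanted.items())
        ]


registry = Registry()
registry.register("gsh:viz-1", kind="visualization", site="IU")
registry.register("gsh:db-1", kind="database", site="IU")
registry.register("gsh:viz-2", kind="visualization", site="ANL")

# Toy directory: a namespace of readable names mapped to handles.
directory = {"/iu/visualization/main": "gsh:viz-1"}

print(registry.find(kind="visualization"))   # ['gsh:viz-1', 'gsh:viz-2']
print(directory["/iu/visualization/main"])   # gsh:viz-1
```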
                          Security
• Base this on Web Services Security
• Authentication
   – 2-way. Who are you and who am I?
• Authorization
   – What am I authorized to use/see/modify
• Accounting/Billing
   – (not really security – see monitoring)
• Privacy
• Group Access
   – Easily create a group to share access to a virtual Grid.
• Very complex issues related to services and message
  delivery.
    Common Resource Model

• Every resource on the grid that is
  manageable is represented by a service
  instance
  – CRM is the Schema hierarchy that defines each
    resource (with its meta-data)
  – Service for a resource presents its management
    interface to authorized parties.
                           Policy Management
• Policy management services
   – Mechanism to publish policy and the services it applies to.
   – Policy life-cycle mgmt.
• Policy languages exist for routing, security, resource use
         Producer of Policies
                                                                        Policy Service Core
                                        Canonical
            Admin GUI /                  Policies 1       Policy
                                    *                     Service                                     Policy
            Autonomic                                                   *                         Transformation
             Manager                                     Manager                                      Service

                                *                              1..n            1                     Policy
                                                                                     XML            Validation
                                                                                   Repository        Service
                                                                *              1
       Consumer of Policies     Canonical
                                                  *       Policy                                     Policy
                                 Policies                 Service                                   Resolution
                                                                                                     Service
             Policy
                                *               1
                                                           Agent         *
          Enforcement
             Point


                      Non-Canonical
                                                Policy Component Requirements:
                                                     A management control point for policy lifecycle (PSM)
      Common Resource Model                          A canonical way to express policies (AC 4-tuple)
          Device / Resource                          A distribution point for policy dissemination (PSA)
                                                     A way to express that a service is “policy aware” (PEP)
                                                     A way to effect change on a resource (CRM)
      Grid Service Orchestration
• Creating new services by composing other
  services
• Two types of Orchestration
  – Composition in space
     • One service directly invokes another
  – Composition in time
     • Managing the workflow
        – First do this.
        – Then do this and that
        – When that is done do this
            » If something goes wrong do this
        – And so on…
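Composition in time can be sketched as a small sequential workflow runner with an error branch. The function `run_workflow`, the step names and the simulated failure are all illustrative assumptions; real orchestration would invoke remote services rather than local functions.

```python
def run_workflow(steps, on_error):
    """Run steps in order; if one raises, call on_error and stop."""
    log = []
    for name, action in steps:
        try:
            action()
            log.append(f"done: {name}")
        except Exception as exc:
            log.append(f"failed: {name} ({exc})")
            on_error(name)  # "If something goes wrong do this"
            log.append("recovery ran")
            break
    return log


def stage_data():
    pass


def run_simulation():
    raise RuntimeError("node unavailable")  # simulated failure


def publish_results():
    pass


recovered = []
log = run_workflow(
    [
        ("stage data", stage_data),
        ("run simulation", run_simulation),
        ("publish results", publish_results),
    ],
    recovered.append,
)
print(log)
```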
              Data Services

•   Distributed Data Access
•   Data Caching
•   Data Replication Services
•   Metadata Catalog Services
•   Storage Services
Metering Resource Consumption
            • At what granularity do
              services report resource
              consumption?
            • How do they report it?
            • How are services metered?
              Transactions

• Two threads/workflows must synchronize
  and agree they have done so before moving
  on.
  – Usually involves modification to two or more
    persistent states
  – WS-Transaction has been “proposed”.
     Messaging, Events, Logging
• Messaging
   – Delivery Model
   – Queuing and Pub/Sub message delivery (not clear to me why
     these are different, as publish/subscribe is implemented as
     topic-labeled queues)
• Events
   – Time stamped messages
   – Standard XML schemas
• Standard Logging
• MQSeries (IBM), JMS (Java Message Service) and
  NaradaBrokering (Indiana) provide this but most
  naturally at level of “platform/hosting environment”
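The remark that publish/subscribe can be seen as topic-labeled queues is sketched below. The `TopicBroker` class is a hypothetical in-memory illustration and is unrelated to the MQSeries, JMS or NaradaBrokering APIs.

```python
from collections import defaultdict, deque


class TopicBroker:
    """Publish/subscribe built from one queue per (topic, subscriber)."""

    def __init__(self):
        self._queues = defaultdict(dict)  # topic -> {subscriber: deque}

    def subscribe(self, topic, subscriber):
        self._queues[topic].setdefault(subscriber, deque())

    def publish(self, topic, message):
        # Every subscriber to the topic gets its own queued copy.
        for queue in self._queues[topic].values():
            queue.append(message)

    def receive(self, topic, subscriber):
        queue = self._queues[topic][subscriber]
        return queue.popleft() if queue else None


broker = TopicBroker()
broker.subscribe("job/status", "portal")
broker.subscribe("job/status", "logger")
broker.publish("job/status", "job 42 started")

print(broker.receive("job/status", "portal"))  # job 42 started
print(broker.receive("job/status", "logger"))  # job 42 started
print(broker.receive("job/status", "portal"))  # None
```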
       Where should Messaging be?
• One can define messaging at the OGSA level “above the
  hosting environment” but that makes it difficult to virtualize
  messaging and support network performance
   – Publish-subscribe or better queued messaging naturally
     supports optimized routing based on network
     performance
• One can naturally support collaborative Web services in
  the same fashion, in a way that is MUCH easier than what
  Groove Networks and other collaborative environments
  (WebEx, Placeware (Microsoft)) do, as long as every
  application is a Web service
• OGSA location of messages is fine for low volume logging
  or notification events
   – Not good for events in a “video” application where each frame is an
     update event
[Figure: two panels, Participating Client and Master Client. Each shows an
Application as a Web service (Events, Rendering) above W3C DOM Events and a
User Interface, with a link labeled From Collaboration As a WS. The
Participating Client panel has an arrow From Master; the Master Client panel
has an arrow To Collaborative Clients.]
        Collaboration: Shared Display
• Sharing can be done at any point on the “object” or Web Service
  pipeline
• Shared Display shares framebuffer with events corresponding to changed
  pixels in master client.
• As long as pipeline uses messages, easy to make collaborative
• Windows framebuffers and in fact most applications do NOT expose a
  message based update interface
[Figure: Master pipeline from Object to Object’ to Object’’ to Object Viewer
to Object Display, with sharing points labeled Shared Event, Shared Web
Service, Shared Export and Shared Display; an Event (Message) Service and
further Object Displays are also shown.]
Shared Input Port (Replicated WS) Collaboration
[Figure: Collaboration as a WS sets up the session with XGSP. The Master and
the Other Participants are each shown with a Web Service (ports labeled RFIO
and UFIO), a WS Viewer and a WS Display, together with an Event (Message)
Service.]
Shared Output Port Collaboration
[Figure: Collaboration as a WS sets up the session with XGSP. A single
Application or Content source Web Service (WSDL, ports labeled RFIO and UFIO)
is shown with a Web Service Message Interceptor and an Event (Message)
Service; the Master and the Other Participants each have a WS Viewer and a WS
Display. Also labeled: Text Chat, Whiteboard, Multiple masters.]
                NaradaBrokering
   Based on a network of cooperating broker nodes
    • Cluster based architecture allows system to scale to arbitrary
      size
   Originally designed to provide uniform software
    multicast to support real-time collaboration linked to
    publish-subscribe for asynchronous systems.
   Now has four major core functions
    • Message transport (based on performance measurement) in
      heterogeneous multi-link fashion
    • General publish-subscribe including JMS & JXTA and
      support for RTP-based audio/video conferencing
    • Filtering for heterogeneous clients
    • Federation of multiple instances of Grid services
    Role of Event/Message Brokers
   We will use events and messages interchangeably
     • An event is a time stamped message
   Our systems are built from clients, servers and “event brokers”
     • These are logical functions – a given computer can have one
       or more of these functions
     • In P2P networks, computers typically multifunction; in Grids
       one tends to have separate function computers
     • Event Brokers “just” provide message/event services; servers
       provide traditional distributed object services as Web services
   There are functionalities that only depend on event itself and
    perhaps the data format; they do not depend on details of
    application and can be shared among several applications
     • NaradaBrokering is designed to provide these functionalities
     • MPI provided such functionalities for all parallel computing
      Engineering Issues Addressed
      by Event / Messaging Service
   Application level Quality of Service
         – e.g. give audio highest priority
   Tunnel through firewalls & proxies
   Filter messages to slow (collaborative/real-time) clients
   Choose Hardware or Software multicast
   Scaling of software multicast
    • Efficient calculation of destinations and routes.
   Integrate synchronous and asynchronous collaboration with
    same messaging, control, archiving for all functions
   Transparently replace single server JMS systems with a
    distributed solution.
   Provides reliable inter-peer group messaging for JXTA
   Open Source (high quality) messaging
NaradaBrokering implements an Event Service
[Figure: Web Service 1 and Web Service 2 connected through WSDL Ports to a
Broker holding a (Virtual) Queue; the Broker functions are labeled Routing,
Destination-Source Matching, Filter and workflow.]
   Filter is mapping to PDA or slow communication channel
    (universal access) – see our PDA adaptor
   Workflow implements message process
   Routing illustrated by JXTA and includes firewall
   Destination-Source matching illustrated by JMS using Publish-
    Subscribe mechanism
   These use Security model (being implemented) based on WS-Sec
     Narada Broker Network
   For message/events service
   Hypercube topology for brokers?
   Tree for distance education with teacher at root
[Figure: a network of Brokers linking several (P2P) Communities, a Resource
and a Database, with Software multicast.]
           NaradaBrokering Communication
   Applications interface to NaradaBrokering through UserChannels
    which NB constructs as a set of links between NB Broker waystations
    which may need to be dynamically instantiated
   UserChannels have publish/subscribe semantics with XML topics
   Links implement a single conventional “data” protocol.
     • Interface to add new transport protocols within the Framework
     • Administrative channel negotiates the best available communication
       protocol for each link
   Different links can have different underlying transport implementations
     • Implementations in the current release include support for
       TCP,UDP, Multicast, SSL and RTP.
        HTTP, HTTPS support will be available in Feb 2003 release.
     • Supports communication through proxies such as iPlanet, Netscape
       and Apache.
     • Supports communication through firewalls such as Microsoft ISA,
       Checkpoint.
Performance/Routing in Message-based Architecture
[Figure: paths from A to B1, B2 and B3, with labels Satellite UDP, Firewall
HTTP, Fast Link, Software Multicast, Dial-up Filter and Hand-Held Protocol.]

   In traveling from cities A to B (say 3 separate passengers), one
    chooses between and changes transport mechanism at
    waystations to optimize cost, time, comfort, scenic beauty …
   Waystations are now NB brokers where one chooses transport
    protocol (individual or collective)
    • Able to choose between car, type of car, plane, train etc
    • Able to dynamically create waystations to cope with problems and act as
      hubs for multicast messages
    • Knows about traffic jams and can assign the “HOV lane”
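The waystation analogy amounts to choosing, per link, the best transport that the link's constraints allow. The sketch below is an assumption-laden illustration, not NaradaBrokering code: the function `choose_transport`, the protocol names as dictionary keys and the delay figures are invented for the example.

```python
def choose_transport(measured_delay_ms, allowed):
    """Pick the allowed protocol with the lowest measured transit delay."""
    candidates = {
        protocol: delay
        for protocol, delay in measured_delay_ms.items()
        if protocol in allowed
    }
    if not candidates:
        raise ValueError("no allowed transport has been measured")
    return min(candidates, key=candidates.get)


# Invented measurements for one link, in milliseconds.
measured = {"UDP": 12.0, "TCP": 18.5, "HTTP": 40.0}

open_link = choose_transport(measured, allowed={"UDP", "TCP", "HTTP"})
firewalled_link = choose_transport(measured, allowed={"HTTP"})
print(open_link, firewalled_link)  # UDP HTTP
```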
             Note on Optimization
   Note in parallel computing, couldn’t do much dynamic
    optimization as aiming at microsecond latency
    • Natural to use hardware routing
   In Grid, time scales are different
    • 100 millisecond quite normal network latency
    • 30 millisecond typical packet time sensitivity (this is one audio
      or video frame) but even here can buffer 10-100 frames on
      client (conferencing to streaming)
    • 1 millisecond is time for a Java server to “think”
   Jitter in latency (transit time) due to routing, processing
    (in NB) or packet loss recovery is important property
   Grid needs and can tolerate significant dynamic
    optimization
        Transit delay for message samples in NaradaBrokering
           Different communication hops - Internal Machines
[Figure: transit delay versus Message Payload Size (Bytes), 1000 to 5000, for
hop-2, hop-3, hop-5 and hop-7; vertical axis 1 to 9. Sender/receiver/broker:
Pentium-3, 1 GHz, 256 MB RAM; 100 Mbps LAN; JDK-1.3; Red Hat Linux 7.3.]
        Standard Deviation for message samples in NaradaBrokering
             Different communication hops - Internal Machines
[Figure: standard deviation versus Message Payload Size (Bytes), 1000 to
5000, for hop-2, hop-3, hop-5 and hop-7; vertical axis 0 to 0.8.]
 Average delays/packet for 12 (of the 400 total) video-clients.
    NaradaBrokering Avg=80.76 ms, JMF Avg=229.23 ms
[Figure: delay per packet versus Packet Number, 0 to 2000, for
NaradaBrokering-RTP and JMF-RTP; vertical axis 0 to 450.]
 Average jitter/packet for 12 (of the 400 total) video clients.
   NaradaBrokering Avg=13.38 ms, JMF Avg=15.55 ms
[Figure: jitter per packet versus Packet Number, 0 to 2000, for
NaradaBrokering-RTP and JMF-RTP; vertical axis 0 to 25.]
Narada Performance Web Service
   Performance measurements are used by Links in
     • Reconfiguring connectivity between nodes
     • Deciding underlying transport protocol
     • Determining possible filtering
   Each node determines performance of links of which it is an endpoint
   Individual node web services are aggregated as another Web Service
   Factors measured include transit delays, bandwidth, jitter, receiving rates.
   Performance measurements are
     • Spaced out at increasing intervals for healthy channels.
     • Factors selectively measured for unhealthy channels.
     • No repeated measurements of bandwidth, for example.
     • Injected into Narada network as XML events
   Probably should replace by a more sophisticated measurement package
[Figure: Administrative Interface]
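Spacing measurements at increasing intervals for healthy channels can be sketched as a simple backoff schedule. The doubling factor, the 60-second cap and the reset to the base interval when a channel looks unhealthy are assumptions made for this illustration; the slides do not state the actual policy.

```python
def next_interval(current, healthy, base=1.0, factor=2.0, maximum=60.0):
    """Return the wait (seconds) before the next measurement on a channel."""
    if healthy:
        # Healthy channel: measure less and less often, up to a cap.
        return min(current * factor, maximum)
    # Unhealthy channel: go back to frequent measurement (assumed policy).
    return base


def schedule(health_reports, base=1.0):
    """Compute the interval chosen after each health report."""
    interval = base
    intervals = []
    for healthy in health_reports:
        interval = next_interval(interval, healthy, base=base)
        intervals.append(interval)
    return intervals


print(schedule([True, True, True, False, True]))  # [2.0, 4.0, 8.0, 1.0, 2.0]
```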
         The Overall Architecture
• The Grid is defined by a collection of distributed Services
   – For many users the primary interaction with the Grid will be
     through a portal

[Figure: a Portal Server and MyProxy Server connected to Event and logging
Services, Application Factory Services, Messaging and group collaboration,
Directory & index Services, and Metadata Directory Service(s).]
Application Portal in a Minute (box)
   Systems like Unicore, GPDK, Gridport (HotPage),
    Gateway, Legion provide “Grid or GCE Shell”
    interfaces to users (user portals)
    • Run a job; find its status; manipulate files
    • Basic UNIX Shell-like capabilities
   Application Portals (Problem Solving Environments)
    are often built on top of “Shell Portals” but this can be
    quite time consuming
    • Application Portal = Shell Portal Web Service + Application
      (factory) Web service
             Application Web service
   Application Web Service is ONLY metadata
     • Application is NOT touched
   Application Web service defined by two sets of schema:
     • First set defines the abstract state of the application
         What are my options for invoking myapp?

         Dub these to be “abstract descriptors”

     • Second set defines a specific instance of the application
         I want to use myapp with input1.dat on
          solar.uits.indiana.edu.
         Dub these to be “instance descriptors”.

   Each descriptor group consists of
     • Application descriptor schema
     • Host (resource) descriptor schema
     • Execution environment (queue or shell) descriptor schema
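The split between abstract and instance descriptors can be sketched with two small data classes. The field names, the `instantiate` helper and the option list are hypothetical; only `myapp`, `input1.dat` and `solar.uits.indiana.edu` come from the slide.

```python
from dataclasses import dataclass


@dataclass
class AbstractDescriptor:
    """What are my options for invoking the application?"""
    application: str
    hosts: list
    options: list


@dataclass
class InstanceDescriptor:
    """One specific invocation of the application."""
    application: str
    host: str
    input_file: str


def instantiate(abstract, host, input_file):
    # An instance must pick one of the hosts the abstract descriptor allows.
    if host not in abstract.hosts:
        raise ValueError(f"{abstract.application} is not available on {host}")
    return InstanceDescriptor(abstract.application, host, input_file)


abstract = AbstractDescriptor(
    application="myapp",
    hosts=["solar.uits.indiana.edu"],
    options=["input_file"],  # assumed option name
)
instance = instantiate(abstract, "solar.uits.indiana.edu", "input1.dat")
print(instance)
```

Note that neither descriptor touches the application itself, in line with the point that the Application Web Service is only metadata.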
              Web Services as a Portlet
• Each Web Service naturally has a user interface specified as "just
  another port"
   – Customizable for universal access
• This gives each Web Service a Portlet view specified (in XML as
  always) by WSRP (Web services for Remote Portals)
• So component model for resources "automatically" gives a component
  model for user interfaces
   – When you build your application, you define portlet at same time
[Figure: Application as a WS. An Application or Content source Web Service
 has General Application Ports (WSDL) that interface with other Web Services,
 and WSRP Ports that form the User Face of the Web Service and define the WS
 as a Portlet. Web Services have other ports (Grid Service) to be OGSI
 compliant.]
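
The idea of a user interface as "just another port" can be sketched with a toy service that has two entry points. This is a minimal Python illustration, not WSRP or WSDL: the class `EchoService` and its methods `invoke` and `render_fragment` are hypothetical names chosen for the example.

```python
from html import escape


class EchoService:
    """Toy service with an application port and a user-facing port."""

    def invoke(self, message: str) -> dict:
        # Application port: structured data meant for other services.
        return {"echo": message}

    def render_fragment(self, message: str) -> str:
        # User-facing port: a markup fragment that a portal could aggregate.
        # It reuses the application port, so both views stay consistent.
        result = self.invoke(message)
        return ('<div class="portlet"><h3>Echo</h3><p>'
                + escape(result["echo"]) + "</p></div>")


svc = EchoService()
print(svc.invoke("hello"))
print(svc.render_fragment("hello"))
```

Because the fragment is produced by the service itself, the portlet is defined at the same time as the application, as the slide suggests.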
Online Knowledge Center built from Portlets
• Web Services provide a component model for the middleware (see large
  "common component architecture" effort in Dept. of Energy)
• Should match each WSDL component with a corresponding user interface
  component
• Thus one "must use" a component model for the portal with again an XML
  specification (portalML) of portal
[Figure: A set of UI Components]
      Jetspeed Architecture
[Figure: Jetspeed architecture. Labels, top to bottom: HTML; Turbine Servlet
 (JSP template); ECS Root to HTML; PSML; Screen Manager; PortletControllers;
 PortletControl; Portlets, each producing ECS elements.]
 Portlets and their data:
   – XML: RSS, OCS, or other; local or remote
   – HTML: local files
   – JSP or VM: local templates
   – WebPage: remote HTML
   – Portlets: user implemented using Portal API
       Portlets and Portal Stacks
• User interfaces to Portal services (Code Submission, Job Monitoring,
  File Management for Host X) are all managed as portlets.
• Users and administrators can customize their portal interfaces to
  precisely the services they want.
[Figure: portal stack, top to bottom: Aggregation Portals (Jetspeed); User
 facing Web Service Ports; Application Grid Web Services; Core Grid Services.
 Side label: Message Security, Information Services.]
Jetspeed Computing Portal: Choose Portlets
[Screenshot: 4 available portlets linking to Web Services; I choose two]
Choose Portlet Layout
[Screenshots: Original 2-column Layout; Choose 1-column Layout]
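
The two customization steps shown in these screenshots, choosing a subset of portlets and then a column layout, can be sketched as below. This is a plain Python illustration under stated assumptions: it does not use the Jetspeed API or PSML, the functions `choose_portlets` and `layout` are hypothetical, and the portlet names are taken from services mentioned elsewhere in these slides.

```python
from typing import List


def choose_portlets(available: List[str], chosen: List[str]) -> List[str]:
    """Keep only the chosen portlets, in the order the portal offers them."""
    missing = [name for name in chosen if name not in available]
    if missing:
        raise ValueError(f"not available: {missing}")
    return [name for name in available if name in chosen]


def layout(portlets: List[str], columns: int) -> List[List[str]]:
    """Distribute portlets round-robin across the requested columns."""
    if columns < 1:
        raise ValueError("need at least one column")
    grid: List[List[str]] = [[] for _ in range(columns)]
    for index, name in enumerate(portlets):
        grid[index % columns].append(name)
    return grid


available = ["Code Submission", "Job Monitoring",
             "File Management", "Proxy Credential Manager"]
selected = choose_portlets(available, ["File Management", "Code Submission"])
two_column = layout(selected, 2)
one_column = layout(selected, 1)
print(two_column)
print(one_column)
```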
File management
[Screenshot: Tabs indicate available portlet interfaces. Lists user files on
 selected host, noahsark. File operations include upload, download, copy,
 rename, crossload.]
Sample page with several portlets: proxy credential manager, submission,
monitoring
          Administer Grid Portal
[Screenshot: Select application to edit. Provide information about
 application and host parameters.]

				