         Introduction to Grid Computing

 The term Grid comes from an analogy to the
  Electric Grid.
  – Pervasive access to power.
  – Similarly, Grid will provide pervasive, consistent, and
    inexpensive access to advanced computational
    resources.
 Grid computing is all about achieving greater
  performance and throughput by pooling resources
  on a local, national, or international level.
                        Scalable Computing
[Figure: Performance + QoS (vertical axis) against system scale (horizontal axis): Personal Device, SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Grid, Inter Planet Grid]
  Administrative Barriers crossed as scale grows: Individual, Group, Department, Campus, State, National, Globe
                       GRID Computing

 Grids are about large-scale resource sharing.
   – Spanning administrative boundaries.
        Central processors, storage, network bandwidth, databases,
         applications, sensors and so on
 Problem solving in dynamic, multi-institutional environments.
 Organizing geographically distributed computing resources
   – So that they can be flexibly and dynamically allocated and
     accessed

 Such capabilities require that sharing be highly controlled,
  with clear definitions of exactly what is shared, who is
  allowed to share, and the conditions under which sharing
  occurs.
 Elements of Grid Computing

 Resource sharing
   – Computers, data, storage, sensors, networks, …
   – Sharing always conditional: issues of trust, policy, negotiation,
     payment, …
 Coordinated problem solving
   – Beyond client-server: distributed data analysis, computation,
     collaboration, …
 Dynamic, multi-institutional virtual organizations
   – Community overlays on classic org structures
   – Large or small, static or dynamic
           Virtual Organizations
 A set of individuals and/or institutions defined by a set of
  sharing rules
 The sharing is highly controlled, with resource providers
  and consumers defining clearly and carefully just what is
  shared
An example: the set of application service providers, storage
  service providers, cycle providers and consultants engaged
  by a car manufacturer to plan for a new factory
Another example: industrial consortium building a new aircraft
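A minimal sketch of the idea that a VO is defined by its sharing rules. All names here (SharingRule, VirtualOrganization, the resource strings, and the hour limit used as a sharing condition) are assumptions made for illustration; they are not part of any grid toolkit.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SharingRule:
    """One rule: what is shared, who may use it, and under what condition."""
    resource: str
    allowed_members: frozenset
    max_hours: int  # hypothetical condition: cap on hours per request


@dataclass
class VirtualOrganization:
    name: str
    rules: list = field(default_factory=list)

    def may_use(self, member: str, resource: str, hours: int) -> bool:
        """Return True only if some rule explicitly permits this request."""
        return any(
            rule.resource == resource
            and member in rule.allowed_members
            and hours <= rule.max_hours
            for rule in self.rules
        )


# Hypothetical VO modeled on the car-manufacturer example.
factory_vo = VirtualOrganization(
    name="new-factory-planning",
    rules=[
        SharingRule("storage-provider/disk",
                    frozenset({"manufacturer", "consultant"}), 48),
        SharingRule("cycle-provider/cluster",
                    frozenset({"manufacturer"}), 12),
    ],
)

print(factory_vo.may_use("manufacturer", "cycle-provider/cluster", 8))
print(factory_vo.may_use("consultant", "cycle-provider/cluster", 8))
```

Anything not covered by a rule is refused, which mirrors the point that sharing is always conditional.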
          More Formal Definition of Grids

 A grid is a system that:
   – Coordinates resource sharing in a de-centralized manner (i.e.,
     different VOs).
   – Uses standard, open, general purpose protocols and interfaces.
   – Delivers non-trivial qualities of service.
       Guaranteed bandwidth for application.
       Guaranteed CPU cycles.
       Guaranteed latency.
Computational Grid Applications

   Biomedical research
   Industrial research
   Engineering research
   Studies in Physics and Chemistry
   Science Today is a Team Sport!!




I. Foster
                eScience
eScience [n]: Large-scale science carried out
  through distributed collaborations—often
  leveraging access to large-scale data &
  computing




                   I. Foster
TeraGrid is an important project funded by
   the National Science Foundation (NSF).




  Slide obtained from B. Wilkinson, http://sol.cs.wcu.edu/~abw/CS493F04/
                                                        TeraGrid




Slide obtained from B. Wilkinson, http://sol.cs.wcu.edu/~abw/CS493F04/
UK e-Science Grid




  Slide obtained from B. Wilkinson, http://sol.cs.wcu.edu/~abw/CS493F04/
                       Applications

 National Virtual Observatory
   – Astronomical surveys produce terabytes of data.
   – Data sets will cover sky in different wave bands (x-rays,
     optical, infrared, radio).
   – Challenge is to make this accessible to general
     research community.
       Heterogeneous data producers and consumers.
   – Resources in this Grid are data sets rather than
     compute engines.
                  High-Energy Physics

 Large-scale collaborations for CERN’s Large Hadron
  Collider.
 Involves 4000 physicists, 150 institutions, in more than 30
  countries.
 Data sets now at petabyte level. Predicted to generate data
  at the exabyte level in this decade.
 Challenges:
   – Providing rapid access to subsets of data.
   – Secure access to distributed computing and data handling
     resources.
 Essentially, provide a distributed collaborative
  infrastructure that will allow physicists from around the globe
  to effectively analyze results from their home institutions.
               Online Access to
             Scientific Instruments
[Figure: Advanced Photon Source pipeline: real-time collection, tomographic reconstruction, archival storage, wide-area dissemination, desktop & VR clients with shared controls]
       DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago
NSF Network for Earthquake Engineering Simulation
                     (NEES)
  Transform our ability to carry out research vital to reducing
          vulnerability to catastrophic earthquakes




                                                         I. Foster
                        NEES

 A network of 15 large-scale experimental sites.
 Advanced tools such as shake tables, centrifuges
  that simulate earthquake effects, unique
  laboratories, a tsunami wave basin, and
  field-testing equipment.
 Sites are linked to a centralized data pool and earthquake
  simulation software, bridged together by the
  high-speed Internet2.
 Off-site researchers can interact in real time with any
  of the networked sites.
 Securely store, organize, and share data within a
  standardized framework in a central location.
 Remotely observe and participate in experiments
  through the use of synchronized real-time data
  and video.
 Collaborate with colleagues to facilitate the
  planning, performance, analysis, and publication
  of research experiments.
 Conduct hybrid simulations that combine the
  results of multiple distributed experiments and link
  physical experiments with computer simulations.
        DOE Earth System Grid
Goal: address technical obstacles to the sharing & analysis of
high-volume data from advanced earth system models


www.earthsystemgrid.org   I. Foster
Earth System Grid   I. Foster
 High-resolution, long-duration simulations performed with
  advanced DOE climate models produce tens of petabytes
  of output.
 This output is made available to global change impacts
  researchers nationwide, both at national laboratories and
  at universities, other research laboratories, and other
  institutions.
 Provides a virtual collaborative environment that links
  distributed centers, users, models, and data.
 Provides scientists with virtual proximity to the distributed
  data and resources that they require to perform their
  research.
           Let's Play Virtual Organization!

 The members of this class represent a VO within the
  university.
 The resources of the VO include:
   – The laptops, workstations, and printers belonging to the individuals
     of the VO (that’s you guys1!).
   – Does this bring up any issues worth concerning yourself about?




                    1.   I do not join virtual organizations
 Want to tightly control who may use these resources and
  how they may be used. Thus need security.
 Security:
   – Want to tightly control who may use these resources and how they
     may be used.
 How about Larry and Ramm wanting to use your printer at
  the same time (which happens to be 3:30 AM). Is this a
  problem?
   – Might want to have a scheduler, which in this case need not be
     more sophisticated than turning off the printer.
 What if David forgot Dan’s IP address and cannot gain
  access to his laptop? How could this be resolved
  (assuming you want it resolved)?
   – You could provide an information service that could tell David how
     to find the laptop.
 You would also have to deal with allocating multiple
  resources to a user, e.g., a laptop to write a paper and a
  printer to print it out. Thus need a resource manager.
 Also need a way to monitor your application executing in
  your VO Grid.
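The services named in this exercise can be sketched in a few lines each. This is a toy under stated assumptions: the class names, the resource names, and the address 192.0.2.17 (taken from a range reserved for documentation) are all hypothetical.

```python
from collections import deque


class InformationService:
    """Tells a user where a named resource can be reached."""

    def __init__(self):
        self._registry = {}

    def register(self, name, address):
        self._registry[name] = address

    def lookup(self, name):
        return self._registry.get(name)  # None if the resource is unknown


class Scheduler:
    """First-come, first-served queue for one shared resource."""

    def __init__(self):
        self._queue = deque()

    def submit(self, user):
        self._queue.append(user)

    def next_user(self):
        return self._queue.popleft() if self._queue else None


class ResourceManager:
    """Grants a set of resources to a user only if all of them are free."""

    def __init__(self, resources):
        self._owner = {name: None for name in resources}

    def allocate(self, user, names):
        # Refuse if any requested resource is unknown or already held.
        if any(self._owner.get(n, "unknown") is not None for n in names):
            return False
        for n in names:
            self._owner[n] = user
        return True

    def release(self, user):
        for n, owner in self._owner.items():
            if owner == user:
                self._owner[n] = None


info = InformationService()
info.register("dan-laptop", "192.0.2.17")
print(info.lookup("dan-laptop"))      # David can now find the laptop

manager = ResourceManager(["laptop", "printer"])
print(manager.allocate("David", ["laptop", "printer"]))  # both granted together
print(manager.allocate("Larry", ["printer"]))            # refused while held
```

The scheduler here is a plain queue, which is only slightly more sophisticated than turning off the printer.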
Grid Computing Software
      Infrastructure
          Open Grid Services Architecture

 Developed by the Global Grid Forum to define a
  common, standard, and open architecture for
  Grid-based applications.
   – Provides a standard approach to all services on the Grid.
       VO Management Service.
       Resource discovery and management service.
       Job management service.
       Security services.
       Data management services.
 Built on top of and extends the Web Services
  architecture, protocols, and interfaces.
A stateless Web Service invocation
Figure 1.11. A stateful Web Service invocation
 Relationship between OGSA, WSRF, and
  Web Services
                          WSRF

 Web Services Resource Framework
  –   a specification developed by OASIS.
  –   WSRF specifies how to make Web Services stateful.
  –   joint effort by the Grid and Web Services communities.
  –   WSRF provides the stateful services that OGSA needs.
  –   OGSA is the architecture; WSRF is the infrastructure on
      which that architecture is built.
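The difference between a stateless and a stateful invocation can be sketched as follows. This is an in-memory analogy only, not the WSRF API: the class names and the string keys are assumptions, and a real WSRF service identifies its resources through Web Services addressing rather than a Python dictionary.

```python
import itertools


class StatelessAdder:
    """Stateless: every call carries all its inputs; nothing is remembered."""

    def add(self, a, b):
        return a + b


class CounterService:
    """Stateful pattern: the service code keeps no per-client fields.
    State lives in separate resources, and each request names one by key."""

    def __init__(self):
        self._resources = {}
        self._ids = itertools.count(1)

    def create_resource(self):
        key = f"counter-{next(self._ids)}"
        self._resources[key] = 0
        return key  # the client keeps this identifier for later calls

    def add(self, key, amount):
        self._resources[key] += amount
        return self._resources[key]


service = CounterService()
mine = service.create_resource()
yours = service.create_resource()
service.add(mine, 5)
print(service.add(mine, 2))   # accumulates across invocations
print(service.add(yours, 1))  # a different resource, unaffected by the first
```

Keeping the state outside the service is what lets one service instance serve many clients, each with its own resource.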
                           Standards Bodies
The primary standards-setting body is1:
 Global Grid Forum (GGF)
        – Started in 1998
        – Meets three times a year, GGF1, GGF2, GGF3 …
        – More than 40 organizations involved and growing …

Others:
 W3C (World Wide Web Consortium)
        – Working on standardization of web-related technologies such as XML
        – See http://www.w3.org
 OASIS (Organization for the Advancement of Structured
  Information Standards)
 IETF, DMTF

1   “The Grid Core Technologies” by M. Li and M. Baker, 2005, page 4.
     Standards in the Web Services
                 World
   XML introduced (ratified) in 1998
   SOAP ratified in 2000
   Web services developed
   Subsequently, standards have been and are
    continuing to be developed:
    – WSDL
    – WS-* where * refers to names of one of many standards
Standards in the grid computing
             world

 Open Grid Services Architecture (OGSA)

 First announced at GGF4 in Feb 2002

 OGSA does not give details of
  implementation.
            Globus Project

 Open source software toolkit developed for grid
  computing.
 Roots in the I-WAY experiment.
 Work started in 1996.
 Four versions developed to present time.
 Reference implementations of grid computing
  standards.
 De facto standard for grid computing.
             Globus Version 4
 A “toolkit” of services and packages for creating
  the basic grid computing infrastructure
 Higher level tools added to this infrastructure
 Version 4 is web-services based
 Some non-web services code exists from earlier
  versions (legacy) or where not appropriate (for
  efficiency, etc.).
Layered diagram of OGSA, GT4, WSRF, and Web Services
 Each part comprises a set of web services
  and/or non-web service components.

 Some built upon earlier versions of Globus.
    Globus Open Source Grid Software
[Figure: GT4 components grouped by category; components are labeled GT2, GT3, or GT4]
  Web Services Components
    – Security: Delegation Service, Community Authorization Service, WS Authentication Authorization
    – Data Management: OGSA-DAI [Tech Preview], Reliable File Transfer
    – Execution Management: Community Scheduler Framework [contribution], Grid Resource Allocation Mgmt (WS GRAM)
    – Information Services: Monitoring & Discovery System (MDS4)
    – Common Runtime: Python WS Core [contribution], C WS Core, Java WS Core
  Non-WS Components
    – Security: Pre-WS Authentication Authorization, Credential Management
    – Data Management: GridFTP, Replica Location Service
    – Execution Management: Grid Resource Allocation Mgmt (Pre-WS GRAM)
    – Information Services: Monitoring & Discovery System (MDS2)
    – Common Runtime: C Common Libraries, XIO
                                                                                           I Foster
         Another view of GT4 Components
[Figure: client and server view of GT4]
  CLIENT: Your Java Client, Your C Client, Your Python Client
  Between client and server: interoperable WS-I-compliant SOAP messaging; X.509 credentials = common authentication
  SERVER:
    – Java services in Apache Axis, plus GT libraries and handlers: Your Java Service, GRAM, RFT, Delegation, Index, Trigger, Archiver, OGSA-DAI, GTCP, CAS
    – Python hosting, GT libraries: Your Python Service, pyGlobus WS Core
    – C services using GT libraries and handlers: Your C Service, C WS Core
    – Other components shown: GridFTP, SimpleCA, MyProxy, RLS, Pre-WS GRAM, Pre-WS MDS
                                                                                         I Foster
                  GT Core
 Provides the ability to create services
  running inside the GT 4 container.
                              Java WS Core
[Figure: GT4 component diagram (Security, Data Management, Execution Management, Information Services, Common Runtime; Web Services and Non-WS components)]
                GT4 Web Services Core
[Figure: GT4 Web services core, top to bottom]
  – User Applications
  – GT4 Container hosting custom Web services, custom WSRF Web services, and GT4 WSRF Web services, with Registry and Administration
  – WS-Addressing, WSRF, WS-Notification
  – WSDL, SOAP, WS-Security
                                                              I Foster
    Execution Management

             Key component

 GRAM (Grid Resource Allocation Manager)

 For submitting executable jobs
 May interface to a local job scheduler
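The division of labor described here, a manager that accepts executable jobs and hands them to a local scheduler, can be sketched as below. This is not the GRAM interface: JobDescription, ForkAdapter, JobManager, and the state strings are hypothetical names, and the only "local scheduler" shown is one that runs the job directly as a child process.

```python
import subprocess
import sys
from dataclasses import dataclass, field


@dataclass
class JobDescription:
    executable: str
    arguments: list = field(default_factory=list)


class ForkAdapter:
    """Simplest possible local scheduler: run the job as a child process."""

    def run(self, job):
        done = subprocess.run(
            [job.executable, *job.arguments],
            capture_output=True,
            text=True,
        )
        return done.returncode, done.stdout


class JobManager:
    """Accepts jobs and delegates execution to whichever adapter it was given."""

    def __init__(self, adapter):
        self._adapter = adapter
        self.states = {}
        self.outputs = {}

    def submit(self, job_id, job):
        self.states[job_id] = "Active"
        code, out = self._adapter.run(job)
        self.states[job_id] = "Done" if code == 0 else "Failed"
        self.outputs[job_id] = out
        return self.states[job_id]


manager = JobManager(ForkAdapter())
job = JobDescription(sys.executable, ["-c", "print('hello from the grid')"])
print(manager.submit("job-1", job))
print(manager.outputs["job-1"].strip())
```

An adapter for a batch system would expose the same run method, so the manager would not need to change.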
     GRAM (Grid Resource Allocation Manager)
[Figure: GT4 component diagram (Security, Data Management, Execution Management, Information Services, Common Runtime; Web Services and Non-WS components)]
                           GT4 GRAM Structure:
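[Figure: GT4 GRAM structure]
  – Client: submits to the GRAM services and delegates credentials to the Delegation service
  – Service host(s) and compute element(s): a GT4 Java Container holding GRAM services, Delegation, and RFT File Transfer
  – Local job control: GRAM services reach the GRAM adapter through sudo; the adapter drives the local scheduler (Sun Grid Engine in the figure), which runs the user job on the compute element
  – Data management components: a transfer request goes to RFT File Transfer; FTP control directs the GridFTP servers, and FTP data flows between GridFTP on the compute element and GridFTP on the remote storage element(s)
                                                                                  I Foster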
         Security Components
Addresses the security requirements of grid
computing. Three important factors are:

 Authorization
   – Process of deciding whether a particular identity can
     access a particular resource
 Authentication
   – Process of deciding whether a particular identity is
     who it claims to be (applies to humans and systems)
 Delegation (somewhat specific to grid computing)
   – Process of giving authority to another identity
     (usually a computer/process) to act on your behalf.
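A toy sketch of how the three factors differ. It deliberately swaps in an HMAC with a single shared key for the signing step, purely to stay self-contained; grid middleware uses X.509 credentials instead. The key, the policy table, and every function name are assumptions.

```python
import hashlib
import hmac

AUTHORITY_KEY = b"demo-only-secret"  # hypothetical; stands in for a signing key


def sign(message: str) -> str:
    return hmac.new(AUTHORITY_KEY, message.encode(), hashlib.sha256).hexdigest()


def issue_credential(identity: str) -> dict:
    return {"identity": identity, "signature": sign(identity)}


def authenticate(credential: dict) -> bool:
    """Authentication: is this identity who it claims to be?"""
    return hmac.compare_digest(credential["signature"],
                               sign(credential["identity"]))


ACCESS_POLICY = {"cluster-a": {"alice"}}  # resource -> identities allowed


def authorize(identity: str, resource: str) -> bool:
    """Authorization: may this identity access this resource?"""
    return identity in ACCESS_POLICY.get(resource, set())


def delegate(credential: dict, holder: str) -> dict:
    """Delegation: let another party act on the credential owner's behalf."""
    owner = credential["identity"]
    return {"identity": owner, "held_by": holder,
            "signature": sign(f"{holder} acts for {owner}")}


def authenticate_delegated(proxy: dict) -> bool:
    expected = sign(f"{proxy['held_by']} acts for {proxy['identity']}")
    return hmac.compare_digest(proxy["signature"], expected)


alice = issue_credential("alice")
proxy = delegate(alice, "job-manager")
print(authenticate(alice), authorize("alice", "cluster-a"))
print(authenticate_delegated(proxy), authorize(proxy["identity"], "cluster-a"))
```

The delegated credential carries the owner's identity, so the authorization check is made against the person the process acts for, not against the process itself.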
          Security continued
 Security aspects complicated by the fact
  that virtual organization members and
  resources can be in different administrative
  domains.
                                      Security
[Figure: GT4 component diagram (Security, Data Management, Execution Management, Information Services, Common Runtime; Web Services and Non-WS components)]
         GT4 Data Management
   Move large data to/from nodes
   Replicate data for performance & reliability
   Locate data of interest
   Provide access to different data sources
    – File systems, parallel file systems, hierarchical
      storage (GridFTP)
    – Databases (OGSA-DAI)
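Replicating data and locating it go together: once copies exist, something has to map a file's logical name to where the copies live. A minimal sketch of such a catalog, assuming hypothetical host names and a class that is not the interface of any Globus component:

```python
from collections import defaultdict
from urllib.parse import urlparse


class ReplicaCatalog:
    """Maps a logical file name to the physical locations of its copies."""

    def __init__(self):
        self._replicas = defaultdict(list)

    def register(self, logical_name, physical_url):
        if physical_url not in self._replicas[logical_name]:
            self._replicas[logical_name].append(physical_url)

    def locate(self, logical_name):
        return list(self._replicas.get(logical_name, []))

    def pick(self, logical_name, preferred_host=None):
        """Prefer a copy on the given host; otherwise take the first one."""
        replicas = self.locate(logical_name)
        if not replicas:
            return None
        for url in replicas:
            if urlparse(url).hostname == preferred_host:
                return url
        return replicas[0]


catalog = ReplicaCatalog()
catalog.register("run42.dat", "gsiftp://storage1.example.org/data/run42.dat")
catalog.register("run42.dat", "gsiftp://storage2.example.org/mirror/run42.dat")

print(catalog.locate("run42.dat"))
print(catalog.pick("run42.dat", preferred_host="storage2.example.org"))
```

Choosing a nearby copy is the performance benefit of replication; having a second copy to fall back on is the reliability benefit.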
    GridFTP and Reliable File Transfer
[Figure: GT4 component diagram (Security, Data Management, Execution Management, Information Services, Common Runtime; Web Services and Non-WS components)]
                     GridFTP
 Built on FTP using separation of data and
  control channels
 Provides features for
  –   Large data transfers
  –   Secure transfers
  –   Fast transfers
  –   Reliable transfers
  –   Third party transfers
 Not a web service
  – RFT (Reliable File Transfer) service provides a WS-level
    interface
   Parallel transfers and striping
 Using multiple (virtual) connections for transfer
   – Same external network
   – Speed improvement possible, but limited by network
     card
 Striping
   – a version of parallel transfers that can use separate
     hardware interfaces
   – Implemented in GT 4.
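The core of a parallel transfer is splitting one file into byte ranges and moving the ranges over several connections at once. The sketch below shows only that splitting and reassembly, using threads and an in-memory buffer in place of network connections; it does not speak the GridFTP protocol, and the function names are assumptions. Striping would additionally spread the ranges across separate hardware interfaces, which is not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total_size, streams):
    """Divide [0, total_size) into at most `streams` contiguous byte ranges."""
    if total_size == 0:
        return []
    chunk = -(-total_size // streams)  # ceiling division
    return [(start, min(start + chunk, total_size))
            for start in range(0, total_size, chunk)]


def parallel_copy(source: bytes, streams: int) -> bytes:
    """Copy `source` using one worker per byte range."""
    destination = bytearray(len(source))

    def copy_range(bounds):
        start, end = bounds
        # Each worker writes only its own slice, so ranges never overlap.
        destination[start:end] = source[start:end]

    with ThreadPoolExecutor(max_workers=streams) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(copy_range, split_ranges(len(source), streams)))
    return bytes(destination)


data = bytes(range(256)) * 40
print(split_ranges(10, 4))
print(parallel_copy(data, 4) == data)
```

Because every range has a known offset, the pieces can arrive in any order and still land in the right place.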
                  Monitoring and Discovery
[Figure: GT4 component diagram (Security, Data Management, Execution Management, Information Services, Common Runtime; Web Services and Non-WS components)]
   Monitoring and Discovery
 WSRF provides common mechanisms for
  monitoring and discovering a service:
 GT4 “aggregator” services within MDS:
  – MDS-Index: collects state information from
    registered resources and makes it available
    as XML document
  – MDS-Trigger: passes this information to an
    executable
  – MDS-Archive: archives state information
    (awaiting implementation)
 Every GT4 service is discoverable
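The index and trigger roles can be sketched with the standard library. This is an illustration of the aggregator idea only, not MDS4 code: the Index class, the trigger function, the element names, and the free_cpus property are assumptions, and a Python callable stands in for the executable that a trigger would invoke.

```python
import xml.etree.ElementTree as ET


class Index:
    """Collects state from registered resources into one XML document."""

    def __init__(self):
        self._providers = {}

    def register(self, name, provider):
        # `provider` is a zero-argument callable returning a dict of state.
        self._providers[name] = provider

    def collect(self):
        root = ET.Element("index")
        for name in sorted(self._providers):
            entry = ET.SubElement(root, "resource", name=name)
            for key, value in sorted(self._providers[name]().items()):
                ET.SubElement(entry, key).text = str(value)
        return root


def trigger(document, condition, action):
    """Call `action` for every resource entry that satisfies `condition`."""
    fired = []
    for entry in document.findall("resource"):
        if condition(entry):
            action(entry)
            fired.append(entry.get("name"))
    return fired


index = Index()
index.register("node1", lambda: {"free_cpus": 0})
index.register("node2", lambda: {"free_cpus": 4})

document = index.collect()
print(ET.tostring(document, encoding="unicode"))

alerts = []
fired = trigger(
    document,
    condition=lambda e: e.findtext("free_cpus") == "0",
    action=lambda e: alerts.append(f"{e.get('name')} has no free CPUs"),
)
print(fired, alerts)
```

An archive service would differ only in keeping each collected document instead of discarding it after use.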

								