
                         Grid Computing:
             Concepts, Applications, and Technologies

                            Ian Foster
              Mathematics and Computer Science Division
                     Argonne National Laboratory
                                  and
                  Department of Computer Science
                       The University of Chicago


                 http://www.mcs.anl.gov/~foster

Grid Computing in Canada Workshop, University of Alberta, May 1, 2002
                                                           2


                          Outline
        The technology landscape
        Grid computing
        The Globus Toolkit
        Applications and technologies
         – Data-intensive; distributed computing;
           collaborative; remote access to facilities
        Grid infrastructure
        Open Grid Services Architecture
        Global Grid Forum
        Summary and conclusions
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                           3


                          Outline
        The technology landscape
        Grid computing
        The Globus Toolkit
        Applications and technologies
         – Data-intensive; distributed computing;
           collaborative; remote access to facilities
        Grid infrastructure
        Open Grid Services Architecture
        Global Grid Forum
        Summary and conclusions
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                   4
         Living in an Exponential World
            (1) Computing & Sensors
Moore’s Law: transistor count doubles each 18 months




     [Figure: magnetohydrodynamics, star formation]

foster@mcs.anl.gov                  ARGONNE  CHICAGO
                                                        5
         Living in an Exponential World:
                    (2) Storage
        Storage density doubles every 12 months
        Dramatic growth in online data (1 petabyte
         = 1,000 terabytes = 1,000,000 gigabytes)
         – 2000      ~0.5 petabyte
         – 2005      ~10 petabytes
         – 2010      ~100 petabytes
         – 2015      ~1000 petabytes?
        Transforming entire disciplines in physical
         and, increasingly, biological sciences;
         humanities next?
foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                                        6
         Data Intensive Physical Sciences

    High energy & nuclear physics
     – Including new experiments at CERN
    Gravity wave searches
     – LIGO, GEO, VIRGO
    Time-dependent 3-D systems (simulation, data)
     – Earth Observation, climate modeling
     – Geophysics, earthquake modeling
     – Fluids, aerodynamic design
     – Pollutant dispersal scenarios
    Astronomy: Digital sky surveys

foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                                            7

     Ongoing Astronomical Mega-Surveys
    Large number of new surveys
     – Multi-TB in size, 100M objects or larger
     – In databases
     – Individual archives planned and under way
    Multi-wavelength view of the sky
     – > 13 wavelength coverage within 5 years
    Impressive early discoveries
     – Finding exotic objects by unusual colors
        > L, T dwarfs, high redshift quasars
     – Finding objects by time variability
        > Gravitational micro-lensing
    Example surveys: MACHO, 2MASS, SDSS, DPOSS, GSC-II, COBE,
    MAP, NVSS, FIRST, GALEX, ROSAT, OGLE, ...

foster@mcs.anl.gov                           ARGONNE  CHICAGO
                                               8

      Crab Nebula in 4 Spectral Regions

      [Figure: X-ray, Optical, Infrared, and Radio images of the Crab Nebula]

foster@mcs.anl.gov            ARGONNE  CHICAGO
                                                         9



      Coming Floods of Astronomy Data
        The planned Large Synoptic Survey
         Telescope will produce over 10 petabytes
         per year by 2008!
         – All-sky survey every few days, so will have
           fine-grain time series for the first time




foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                           10
            Data Intensive Biology and Medicine
   Medical data
    – X-Ray, mammography data, etc. (many petabytes)
    – Digitizing patient records (ditto)
   X-ray crystallography
   Molecular genomics and related disciplines
    – Human Genome, other genome databases
    – Proteomics (protein structure, activities, …)
    – Protein interactions, drug delivery
   Virtual Population Laboratory (proposed)
    – Simulate likely spread of disease outbreaks
   Brain scans (3-D, time dependent)
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                                         11
                     A Brain is a Lot of Data!
                       (Mark Ellisman, UCSD)

      And comparisons must be made among many brains.

      We need to get to one micron to know the location of every cell.
      We're just now starting to get to 10 microns – Grids will help get
      us there and further.

foster@mcs.anl.gov                                   ARGONNE  CHICAGO
                                                                                               12
      An Exponential World: (3) Networks
          (Or, Coefficients Matter …)
          Network vs. computer performance
            – Computer speed doubles every 18 months
            – Network speed doubles every 9 months
            – Difference = order of magnitude per 5 years
          1986 to 2000
            – Computers: x 500
            – Networks: x 340,000
          2001 to 2010
            – Computers: x 60
            – Networks: x 4000
[Graph: Moore's Law vs. storage improvements vs. optical improvements.
 From Scientific American (Jan. 2001), by Cleo Vilett; source: Vinod Khosla,
 Kleiner, Caufield and Perkins.]
foster@mcs.anl.gov                                           ARGONNE  CHICAGO
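The "order of magnitude per 5 years" figure follows directly from the two
doubling times quoted above. A quick check (a minimal sketch added here, not
from the original slides), in Python:

    # Compare growth over 5 years (60 months) given the doubling times above.
    months = 60
    computer_growth = 2 ** (months / 18)   # doubles every 18 months
    network_growth = 2 ** (months / 9)     # doubles every 9 months
    print(f"computers: x{computer_growth:.0f}")                    # ~x10
    print(f"networks:  x{network_growth:.0f}")                     # ~x102
    print(f"ratio:     x{network_growth / computer_growth:.0f}")   # ~x10, one order of magnitude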
                                                          13


                          Outline
        The technology landscape
        Grid computing
        The Globus Toolkit
        Applications and technologies
         – Data-intensive; distributed computing;
           collaborative; remote access to facilities
        Grid infrastructure
        Open Grid Services Architecture
        Global Grid Forum
        Summary and conclusions
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                          14



      Evolution of the Scientific Process
        Pre-electronic
         – Theorize &/or experiment, alone or in small
           teams; publish paper
        Post-electronic
         – Construct and mine very large databases of
           observational or simulation data
         – Develop computer simulations & analyses
         – Exchange information quasi-instantaneously
           within large, distributed, multidisciplinary
           teams

foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                         15



                 Evolution of Business
        Pre-Internet
         – Central corporate data processing facility
         – Business processes not compute-oriented
        Post-Internet
         – Enterprise computing is highly distributed,
           heterogeneous, inter-enterprise (B2B)
         – Outsourcing becomes feasible => service
           providers of various sorts
         – Business processes increasingly computing-
           and data-rich

foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                       16



                       The Grid
       "Resource sharing & coordinated problem
        solving in dynamic, multi-institutional
        virtual organizations"




foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                                     17
      An Example Virtual Organization:
       CERN’s Large Hadron Collider
     1800 Physicists, 150 Institutes, 32 Countries




        100 PB of data by 2010; 50,000 CPUs?
foster@mcs.anl.gov                     ARGONNE  CHICAGO
     Grid Communities & Applications:                                        18
     Data Grids for High Energy Physics

     [Figure: tiered data grid for an LHC experiment]
      – Online system: ~PBytes/sec off the detector; there is a "bunch
        crossing" every 25 nsecs, there are 100 "triggers" per second, and
        each triggered event is ~1 MByte in size
      – ~100 MBytes/sec to the offline processor farm (~20 TIPS;
        1 TIPS is approximately 25,000 SpecInt95 equivalents)
      – Tier 0: CERN Computer Centre, fed at ~100 MBytes/sec
        (or air freight, deprecated)
      – Tier 1: regional centres (France, Germany, Italy, FermiLab ~4 TIPS),
        linked at ~622 Mbits/sec
      – Tier 2: centres of ~1 TIPS each (e.g., Caltech), linked at
        ~622 Mbits/sec
      – Institutes (~0.25 TIPS): physicists work on analysis "channels";
        each institute will have ~10 physicists working on one or more
        channels, and data for these channels should be cached by the
        institute server (~1 MBytes/sec to the physics data cache)
      – Tier 4: physicist workstations

 www.griphyn.org            www.ppdg.net            www.eu-datagrid.org
foster@mcs.anl.gov                                           ARGONNE  CHICAGO
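The ~100 MBytes/sec figure into the offline farm is consistent with the
trigger rate and event size quoted above. A quick check (added sketch, not
from the original slides):

    # Sanity-check the data rates quoted on the slide.
    bunch_crossing_ns = 25         # one bunch crossing every 25 ns
    triggers_per_sec = 100         # events accepted per second
    event_size_mb = 1.0            # ~1 MByte per triggered event

    crossings_per_sec = 1e9 / bunch_crossing_ns           # 40 million crossings/s
    offline_rate_mb_s = triggers_per_sec * event_size_mb  # ~100 MBytes/sec

    print(f"bunch crossings/s: {crossings_per_sec:.0e}")
    print(f"data to offline farm: ~{offline_rate_mb_s:.0f} MBytes/sec")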
          Data Integration and Mining (credit Sara Graves)                  19
          From Global Information to Local Knowledge

      [Figure: application domains include Emergency Response, Precision
       Agriculture, Urban Environments, and Weather Prediction]

foster@mcs.anl.gov                             ARGONNE  CHICAGO
               Intelligent Infrastructure:                    20
           Distributed Servers and Services

foster@mcs.anl.gov             ARGONNE  CHICAGO
                                         22
                     Grid Computing




foster@mcs.anl.gov         ARGONNE  CHICAGO
                                                          23
                       The Grid:
                     A Brief History
       Early 90s
        – Gigabit testbeds, metacomputing
       Mid to late 90s
        – Early experiments (e.g., I-WAY), academic
          software projects (e.g., Globus, Legion),
          application experiments
       2002
        – Dozens of application communities & projects
        – Major infrastructure deployments
         – Significant technology base (esp. Globus Toolkit™)
        – Growing industrial interest
        – Global Grid Forum: ~500 people, 20+ countries
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                         28


         The Grid World: Current Status
        Dozens of major Grid projects in scientific &
         technical computing/research & education
         – www.mcs.anl.gov/~foster/grid-projects
        Considerable consensus on key concepts
         and technologies
         – Open source Globus Toolkit™ a de facto
           standard for major protocols & services
        Industrial interest emerging rapidly
         – IBM, Platform, Microsoft, Sun, Compaq, …
        Opportunity: convergence of eScience and
         eBusiness requirements & technologies
foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                          35


                          Outline
        The technology landscape
        Grid computing
        The Globus Toolkit
        Applications and technologies
         – Data-intensive; distributed computing;
           collaborative; remote access to facilities
        Grid infrastructure
        Open Grid Services Architecture
        Global Grid Forum
        Summary and conclusions
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                       36
           Grid Technologies:
  Resource Sharing Mechanisms That …

        Address security and policy concerns of
         resource owners and users
        Are flexible enough to deal with many
         resource types and sharing modalities
        Scale to large number of resources, many
         participants, many program components
        Operate efficiently when dealing with large
         amounts of data & computation



foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                                             37


               Aspects of the Problem
     1) Need for interoperability when different
         groups want to share resources
         – Diverse components, policies, mechanisms
         – E.g., standard notions of identity, means of
           communication, resource descriptions
     2) Need for shared infrastructure services to
         avoid repeated development, installation
         – E.g., one port/service/protocol for remote
           access to computing, not one per tool/appln
         – E.g., Certificate Authorities: expensive to run
        A common need for protocols & services
foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                              39
                 The Hourglass Model

    Focus on architecture issues
     – Propose set of core services as basic infrastructure
     – Use to construct high-level, domain-specific solutions
    Design principles
     – Keep participation cost low
     – Enable local control
     – Support for adaptation
     – "IP hourglass" model

    [Figure: hourglass with applications and diverse global services on top,
     a narrow waist of core services, and the local OS at the bottom]

foster@mcs.anl.gov                                ARGONNE  CHICAGO
                                                                       40
         Layered Grid Architecture
    (By Analogy to Internet Architecture)

    Application
    Collective: "Coordinating multiple resources", i.e., ubiquitous
      infrastructure services and app-specific distributed services
    Resource: "Sharing single resources", i.e., negotiating access,
      controlling use
    Connectivity: "Talking to things", i.e., communication (Internet
      protocols) & security
    Fabric: "Controlling things locally", i.e., access to, & control of,
      resources

    Internet protocol analogy: Application/Collective/Resource correspond to
    the Internet Application layer, Connectivity to Transport & Internet,
    and Fabric to Link.

foster@mcs.anl.gov                                 ARGONNE  CHICAGO
                                                         41



                     Globus Toolkit™
        A software toolkit addressing key technical
         problems in the development of Grid-enabled
         tools, services, and applications
         – Offer a modular set of orthogonal services
         – Enable incremental development of grid-
           enabled tools and applications
         – Implement standard Grid protocols and APIs
         – Available under liberal open source license
         – Large community of developers & users
         – Commercial support

foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                           42



                     General Approach
        Define Grid protocols & APIs
         – Protocol-mediated access to remote resources
         – Integrate and extend existing standards
         – "On the Grid" = speak "Intergrid" protocols
        Develop a reference implementation
         – Open source Globus Toolkit
         – Client and server SDKs, services, tools, etc.
        Grid-enable wide variety of tools
         – Globus Toolkit, FTP, SSH, Condor, SRB, MPI, …
        Learn through deployment and applications
foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                                      43



                        Key Protocols
        The Globus Toolkit™ centers around four
         key protocols
         – Connectivity layer:
            > Security: Grid Security Infrastructure (GSI)
         – Resource layer:
            > Resource Management: Grid Resource Allocation
              Management (GRAM)
            > Information Services: Grid Resource Information
              Protocol (GRIP) and Index Information Protocol (GIIP)
            > Data Transfer: Grid File Transfer Protocol (GridFTP)

        Also key collective layer protocols
         – Info Services, Replica Management, etc.
foster@mcs.anl.gov                                   ARGONNE  CHICAGO
                                                                        44
                 Globus Toolkit Structure

    [Figure: GRAM, MDS, and GridFTP services, plus future services, each
     built on GSI, fronting compute resources, data resources, and other
     services or applications; common mechanisms include service naming,
     soft-state management, reliable invocation, and notification; GRAM
     uses job managers to start processes on compute resources]

foster@mcs.anl.gov                                     ARGONNE  CHICAGO
                                                              45
                      Connectivity Layer
                     Protocols & Services
        Communication
         – Internet protocols: IP, DNS, routing, etc.
        Security: Grid Security Infrastructure (GSI)
         – Uniform authentication, authorization, and
           message protection mechanisms in multi-
           institutional setting
         – Single sign-on, delegation, identity mapping
         – Public key technology, SSL, X.509, GSS-API
         – Supporting infrastructure: Certificate
           Authorities, certificate & key management, …
                             GSI: www.gridforum.org/security/gsi
foster@mcs.anl.gov                           ARGONNE  CHICAGO
                                                          46

             Why Grid Security is Hard
    Resources being used may be extremely valuable
     & the problems being solved extremely sensitive
    Resources are often located in distinct
     administrative domains
     – Each resource may have own policies & procedures
    The set of resources used by a single computation
     may be large, dynamic, and/or unpredictable
     – Not just client/server
    It must be broadly available & applicable
     – Standard, well-tested, well-understood protocols
     – Integration with wide variety of tools
foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                                      47
           Grid Security Requirements

   User view
   1) Easy to use
   2) Single sign-on
   3) Run applications: ftp, ssh, MPI, Condor, Web, …
   4) User-based trust model
   5) Proxies/agents (delegation)

   Resource owner view
   1) Specify local access control
   2) Auditing, accounting, etc.
   3) Integration w/ local system: Kerberos, AFS, license mgr.
   4) Protection from compromised resources

   Developer view
   API/SDK with authentication, flexible message protection,
   flexible communication, delegation, ...
      Direct calls to various security functions (e.g. GSS-API)
      Or security integrated into higher-level SDKs:
          E.g. GlobusIO, Condor-G, MPICH-G2, HDF5, etc.

foster@mcs.anl.gov                                ARGONNE  CHICAGO
                                                                     48


      Grid Security Infrastructure (GSI)
     Extensions to existing standard protocols & APIs
      – Standards: SSL/TLS, X.509 & CA, GSS-API
      – Extensions for single sign-on and delegation
     Globus Toolkit reference implementation of GSI
      – SSLeay/OpenSSL + GSS-API + delegation
      – Tools and services to interface to local security
         > Simple ACLs; SSLK5 & PKINIT for access to K5, AFS, etc.
      – Tools for credential management
         > Login, logout, etc.
         > Smartcards
         > MyProxy: Web portal login and delegation
         > K5cert: Automatic X.509 certificate creation
foster@mcs.anl.gov                                    ARGONNE  CHICAGO
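As a rough illustration of the sign-on workflow listed above, the sketch below
drives the standard Globus credential tools from Python. It assumes a Globus
Toolkit installation with grid-proxy-init and grid-proxy-info on the PATH;
exact flag spellings may differ between toolkit versions.

    import subprocess

    # Single sign-on: create a short-lived proxy credential from the user's
    # long-term X.509 certificate (prompts once for the key passphrase).
    # "-hours 12" limits the proxy lifetime.
    subprocess.run(["grid-proxy-init", "-hours", "12"], check=True)

    # Inspect the proxy: subject, issuer, and remaining lifetime.
    subprocess.run(["grid-proxy-info"], check=True)

    # From here on, GSI-enabled tools (globusrun, globus-url-copy, ...) pick
    # up the proxy automatically and can delegate it to remote GRAM and
    # GridFTP servers, so no further passwords are needed.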
         GSI in Action: "Create Processes at A and B                          49
           that Communicate & Access Files at C"

    [Figure: single sign-on via "grid-id" and generation of a proxy
     credential, or retrieval of a proxy credential from an online
     repository; the user proxy issues remote process creation requests,
     with mutual authentication, to GSI-enabled GRAM servers at Site A
     (Kerberos) and Site B (Unix); each server authorizes the request,
     maps it to a local id, creates the process, and generates credentials
     (e.g., a Kerberos ticket or restricted proxy); the two processes
     communicate with mutual authentication and issue remote file access
     requests to a GSI-enabled FTP server at Site C (Kerberos), which
     authorizes, maps to a local id, and accesses the file on the storage
     system]

foster@mcs.anl.gov                                             ARGONNE  CHICAGO
                                                           50



         GSI Working Group Documents
        Grid Security Infrastructure (GSI) Roadmap
         – Informational draft overview of working group
           activities and documents
        Grid Security Protocols & Syntax
         – X.509 Proxy Certificates
         – X.509 Proxy Delegation Protocol
         – The GSI GSS-API Mechanism
        Grid Security APIs
         – GSS-API Extensions for the Grid
         – GSI Shell API
foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                                        51



                      GSI Futures
        Scalability in numbers of users & resources
         – Credential management
         – Online credential repositories ("MyProxy")
         – Account management
        Authorization
         – Policy languages
         – Community authorization
        Protection against compromised resources
         – Restricted delegation, smartcards

foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                                                52
             Community Authorization

    1. CAS request, with resource names and operations. The CAS holds
       user/group membership, resource/collective membership, and collective
       policy information, and checks: does the collective policy authorize
       this request for this user?
    2. CAS reply, with capability and resource CA info.
    3. Resource request, authenticated with the capability. The resource
       holds local policy information and checks: is this request authorized
       by the capability? Is this request authorized for the CAS?
    4. Resource reply.

 Laura Pearlman, Steve Tuecke, Von Welch, others
foster@mcs.anl.gov                                 ARGONNE  CHICAGO
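The four-step exchange above can be made concrete with a toy sketch (purely
illustrative, not the CAS implementation or protocol; all names and policy
entries are invented): the CAS checks community policy and issues a
capability, and the resource checks both the capability and its own local
policy about the CAS.

    # Toy model of the CAS exchange above; illustrative only.
    community_policy = {("alice", "read", "ftp://host/data"): True}          # what the community allows
    local_policy = {("cas-community", "read", "ftp://host/data"): True}      # what the site allows the CAS

    def cas_request(user, op, resource):
        """Steps 1-2: CAS checks community policy, returns a capability or None."""
        if community_policy.get((user, op, resource)):
            return {"issuer": "cas-community", "user": user, "op": op, "resource": resource}
        return None

    def resource_request(capability, op, resource):
        """Steps 3-4: resource checks the capability, then whether the CAS is authorized locally."""
        if capability is None or capability["op"] != op or capability["resource"] != resource:
            return "denied: not covered by capability"
        if not local_policy.get((capability["issuer"], op, resource)):
            return "denied: CAS not authorized for this resource"
        return "granted"

    cap = cas_request("alice", "read", "ftp://host/data")
    print(resource_request(cap, "read", "ftp://host/data"))   # granted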
                                                             53
                       Resource Layer
                     Protocols & Services
        Grid Resource Allocation Management (GRAM)
         – Remote allocation, reservation, monitoring,
           control of compute resources
        GridFTP protocol (FTP extensions)
         – High-performance data access & transport
        Grid Resource Information Service (GRIS)
         – Access to structure & state information
        Others emerging: Catalog access, code
         repository access, accounting, etc.
        All built on connectivity layer: GSI & IP
                           GRAM, GridFTP, GRIS: www.globus.org
foster@mcs.anl.gov                           ARGONNE  CHICAGO
                                                       54



               Resource Management
        The Grid Resource Allocation Management
         (GRAM) protocol and client API allows
         programs to be started and managed on
         remote resources, despite local
         heterogeneity
        Resource Specification Language (RSL) is
         used to communicate requirements
        A layered architecture allows application-
         specific resource brokers and co-allocators
         to be defined in terms of GRAM services
         – Integrated with Condor, PBS, MPICH-G2, …
foster@mcs.anl.gov                       ARGONNE  CHICAGO
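A minimal sketch of submitting a job through GRAM with an RSL string, driven
from Python. It assumes a GT2-era installation with globusrun available and a
valid proxy credential; the gatekeeper contact is a placeholder, and the RSL
attributes accepted depend on the local job manager.

    import subprocess

    # An RSL (Resource Specification Language) string describing the job:
    # which executable to run and how many processes to start.
    rsl = "&(executable=/bin/hostname)(count=4)"

    # Submit to the GRAM gatekeeper at a (placeholder) remote resource.
    # "-r" names the resource contact; "-o" streams output back to the client.
    subprocess.run(
        ["globusrun", "-o", "-r", "gatekeeper.example.org/jobmanager-pbs", rsl],
        check=True)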
                                                                                55
                   Resource
             Management Architecture

    [Figure: an application passes RSL to a broker, which queries an
     information service and specializes the RSL; the resulting ground RSL
     goes to a co-allocator, which passes simple ground RSL to GRAM servers
     fronting local resource managers such as LSF, Condor, and NQE]

foster@mcs.anl.gov                                       ARGONNE  CHICAGO
                                                               56



               Data Access & Transfer
       GridFTP: extended version of popular FTP
        protocol for Grid data access and transfer
       Secure, efficient, reliable, flexible, extensible,
        parallel, concurrent, e.g.:
        – Third-party data transfers, partial file transfers
        – Parallelism, striping (e.g., on PVFS)
        – Reliable, recoverable data transfers
       Reference implementations
        – Existing clients and servers: wuftpd, ncftp
        – Flexible, extensible libraries in Globus Toolkit
foster@mcs.anl.gov                           ARGONNE  CHICAGO
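As a rough illustration of the features listed above, the following drives
globus-url-copy (the GridFTP client shipped with the toolkit) from Python.
The host names are placeholders, and flag spellings may differ across
toolkit releases.

    import subprocess

    # Third-party-capable, parallel GridFTP transfer between two (placeholder)
    # servers. "-p 8" opens 8 parallel data streams; "-tcp-bs" sets the TCP
    # buffer size, which matters on high bandwidth-delay-product WAN links.
    subprocess.run([
        "globus-url-copy",
        "-p", "8",
        "-tcp-bs", "1048576",
        "gsiftp://source.example.org/data/run42.dat",
        "gsiftp://dest.example.org/cache/run42.dat",
    ], check=True)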
                                                       57
         The Grid Information Problem




    Large numbers of distributed "sensors" with
     different properties
    Need for different "views" of this information,
     depending on community membership, security
     constraints, intended purpose, sensor type
foster@mcs.anl.gov                     ARGONNE  CHICAGO
                                                        58
    The Globus Toolkit Solution: MDS-2




    Registration & enquiry protocols, information
    models, query languages
     – Provides standard interfaces to sensors
      – Supports different "directory" structures
        supporting various discovery/access strategies
foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                                        59
              Globus Applications and
                   Deployments
        Application projects include
         – GriPhyN, PPDG, NEES, EU DataGrid, ESG,
           Fusion Collaboratory, etc., etc.
        Infrastructure deployments include
         – DISCOM, NASA IPG, NSF TeraGrid, DOE
           Science Grid, EU DataGrid, etc., etc.
         – UK Grid Center, U.S. GRIDS Center
        Technology projects include
         – Data Grids, Access Grid, Portals, CORBA,
           MPICH-G2, Condor-G, GrADS, etc., etc.

foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                          60



                     Globus Futures
        Numerous large projects are pushing hard
         on production deployment & application
         – Much will be learned in next 2 years!
        Active R&D program, focused for example on
         – Security & policy for resource sharing
         – Flexible, high-perf., scalable data sharing
         – Integration with Web Services etc.
         – Programming models and tools
        Community code development producing a
         true Open Grid Architecture
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                          61


                          Outline
        The technology landscape
        Grid computing
        The Globus Toolkit
        Applications and technologies
         – Data-intensive; distributed computing;
           collaborative; remote access to facilities
        Grid infrastructure
        Open Grid Services Architecture
        Global Grid Forum
        Summary and conclusions
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                      62



           Important Grid Applications
        Data-intensive
        Distributed computing (metacomputing)
        Collaborative
        Remote access to, and computer
         enhancement of, experimental facilities




foster@mcs.anl.gov                      ARGONNE  CHICAGO
                                                         63


                          Outline
        The technology landscape
        Grid computing
        The Globus Toolkit
        Applications and technologies
         – Data-intensive; distributed computing;
           collaborative; remote access to facilities
        Grid infrastructure
        Open Grid Services Architecture
        Global Grid Forum
        Summary and conclusions
foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                                    64



    Data Intensive Science: 2000-2015
      Scientific discovery increasingly driven by IT
       – Computationally intensive analyses
       – Massive data collections
       – Data distributed across networks of varying capability
       – Geographically distributed collaboration
      Dominant factor: data growth (1 Petabyte = 1000 TB)
       – 2000        ~0.5 Petabyte
       – 2005        ~10 Petabytes
       – 2010        ~100 Petabytes
       – 2015        ~1000 Petabytes?
      How to collect, manage, access and interpret this quantity of data?

   Drives demand for "Data Grids" to handle the
   additional dimension of data access & movement
foster@mcs.anl.gov                                  ARGONNE  CHICAGO
                                                                           65
                         Data Grid Projects
   Particle Physics Data Grid (US, DOE)
     – Data Grid applications for HENP expts.
   GriPhyN (US, NSF)
     – Petascale Virtual-Data Grids
   iVDGL (US, NSF)
     – Global Grid lab
   TeraGrid (US, NSF)
     – Dist. supercomp. resources (13 TFlops)
   European Data Grid (EU, EC)
     – Data Grid technologies, EU deployment
   CrossGrid (EU, EC)
     – Data Grid technologies, EU emphasis
   DataTAG (EU, EC)
     – Transatlantic network, Grid applications
   Japanese Grid Projects (APGrid) (Japan)
     – Grid deployment throughout Japan
   [Side notes: collaborations of application scientists & computer
    scientists; infrastructure development & deployment; Globus based]
foster@mcs.anl.gov                                  ARGONNE  CHICAGO
     Grid Communities & Applications:                                        66
     Data Grids for High Energy Physics

     [Figure: tiered LHC data grid, as shown earlier (slide 18): online
      system (~PBytes/sec, 100 "triggers"/sec, ~1 MByte/event) feeding an
      offline processor farm (~20 TIPS) at ~100 MBytes/sec; Tier 0 at CERN;
      Tier 1 regional centres (France, Germany, Italy, FermiLab) at
      ~622 Mbits/sec; Tier 2 centres (~1 TIPS each); institute servers
      (~0.25 TIPS) caching data for analysis "channels"; Tier 4 physicist
      workstations]

 www.griphyn.org            www.ppdg.net            www.eu-datagrid.org
foster@mcs.anl.gov                                           ARGONNE  CHICAGO
              Biomedical Informatics                   67

             Research Network (BIRN)
    Evolving reference set of
     brains provides essential
     data for developing
     therapies for neurological
     disorders (multiple sclerosis,
     Alzheimer’s, etc.).
    Today
     – One lab, small patient base
     – 4 TB collection
    Tomorrow
     – 10s of collaborating labs
     – Larger population sample
     – 400 TB data collection: more
       brains, higher resolution
     – Multiple scale data integration
       and analysis
foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                                              68
                     Digital Radiology
                (Hollebeek, U. Pennsylvania)
        (mammograms, X-rays, MRI, CAT scans, endoscopies, ...)

     Hospital digital data
      – Very large data sources: great clinical value to
        digital storage and manipulation and significant
        cost savings
      – 7 Terabytes per hospital per year
      – Dominated by digital images
     Why mammography
      – Clinical need for film recall & computer analysis
      – Large volume ( 4,000 GB/year ) (57% of total)
      – Storage and records standards exist
      – Great clinical value
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                           69
                   Earth System Grid
        (ANL, LBNL, LLNL, NCAR, ISI, ORNL)

       Enable a distributed community [of
        thousands] to perform computationally
        intensive analyses on large climate datasets
       Via
        – Creation of Data Grid supporting secure, high-
          performance remote access
         – "Smart data servers" supporting reduction and
           analyses
        – Integration with environmental data analysis
          systems, protocols, and thin clients

 www.earthsystemgrid.org (soon)
foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                                                                  70
         Earth System Grid Architecture

    [Figure: the application sends an attribute specification to a metadata
     catalog, which returns a logical collection and logical file name; the
     replica catalog maps that name to multiple physical locations; replica
     selection uses performance information & predictions from MDS and NWS
     to pick a selected replica; GridFTP commands then move data among
     replica locations (disk array, tape library plus disk cache, disk
     cache)]

foster@mcs.anl.gov                                         ARGONNE  CHICAGO
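The "replica selection" step in the figure above chooses among the physical
locations returned by the replica catalog using performance information such
as NWS bandwidth forecasts. A toy version of that decision (added sketch; the
URLs and numbers are invented):

    # Toy replica selection: pick the location with the best predicted
    # transfer time. Bandwidth forecasts (MB/s) stand in for NWS output.
    replicas = {
        "gsiftp://replica1.example.org/climate/jan1998.nc": 12.0,
        "gsiftp://replica2.example.org/climate/jan1998.nc": 45.0,
        "gsiftp://replica3.example.org/climate/jan1998.nc": 30.0,
    }
    file_size_mb = 1800

    def predicted_seconds(url):
        return file_size_mb / replicas[url]

    best = min(replicas, key=predicted_seconds)
    print(best, f"~{predicted_seconds(best):.0f}s")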
                                                                                      71
           Data Grid Toolkit Architecture

        Collective Data Management Service: collective file movement
        (collection mgmt., priority, fault recovery, replication, resource selection)

        Data Transfer Service: end-to-end file transfer
        (link optimization, performance guarantees, admission control)

        Data Movement Service: optimized endpoint management
        (bulk parallel transfer, rate-limited transfer, disk/network scheduling)

foster@mcs.anl.gov                                               ARGONNE  CHICAGO
                                                        72
                    A Universal
             Access/Transport Protocol
   Suite of communication libraries and related tools
    that support
    – GSI security
    – Third-party transfers
    – Parameter set/negotiate
    – Partial file access
    – Reliability/restart
    – Logging/audit trail
    – Integrated instrumentation
    – Parallel transfers
    – Striping (cf DPSS)
    – Policy-based access control
    – Server-side computation [later]
   All based on a standard, widely deployed protocol

foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                          73
         And the Universal Protocol is …
                    GridFTP
        Why FTP?
         – Ubiquity enables interoperation with many
           commodity tools
         – Already supports many desired features,
           easily extended to support others
        We use the term GridFTP to refer to
         – Transfer protocol which meets requirements
         – Family of tools which implement the protocol
        Note GridFTP > FTP
        Note that despite the name, GridFTP is not
         restricted to file transfer!
foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                            74



              GridFTP: Basic Approach
        FTP is defined by several IETF RFCs
        Start with most commonly used subset
         – Standard FTP: get/put etc., 3rd-party transfer
        Implement RFCed but often unused features
         – GSS binding, extended directory listing,
           simple restart
        Extend in various ways, while preserving
         interoperability with existing servers
         – Stripe/parallel data channels, partial file,
           automatic & manual TCP buffer setting,
           progress and extended restart
foster@mcs.anl.gov                           ARGONNE  CHICAGO
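The parallel data channels mentioned above are the main source of GridFTP's
WAN speedups. The toy sketch below illustrates the same idea with plain HTTP
range requests rather than GridFTP (just the parallelism pattern): fetch
disjoint byte ranges of one file concurrently, then reassemble them. The URL
in the commented-out example is a placeholder.

    import concurrent.futures
    import urllib.request

    def fetch_range(url, start, end):
        """Fetch bytes [start, end] of url using an HTTP Range request."""
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(req) as resp:
            return start, resp.read()

    def parallel_get(url, size, streams=4):
        """Download `size` bytes of `url` over `streams` concurrent connections."""
        chunk = size // streams
        ranges = [(i * chunk, size - 1 if i == streams - 1 else (i + 1) * chunk - 1)
                  for i in range(streams)]
        buf = bytearray(size)
        with concurrent.futures.ThreadPoolExecutor(max_workers=streams) as pool:
            for start, data in pool.map(lambda r: fetch_range(url, *r), ranges):
                buf[start:start + len(data)] = data
        return bytes(buf)

    # Example (placeholder URL and size):
    # data = parallel_get("http://mirror.example.org/dataset.bin", size=10_000_000, streams=4)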
                                                          75



           The GridFTP Family of Tools
        Patches to existing FTP code
         – GSI-enabled versions of existing FTP client
           and server, for high-quality production code
        Custom-developed libraries
         – Implement full GridFTP protocol, targeting
           custom use, high-performance
        Custom-developed tools
         – E.g., high-performance striped FTP server



foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                                                                     76
         High-Performance Data Transfer

    [Figure: a GridFTP control interface, alongside GRAM and GRIP, drives
     resource management (disk, NIC), enquiry, and scheduling modules; the
     data channels at each endpoint expose a rate-limiting interface over
     bulk transfer and TCP transfer protocols (TCP, BTP, ...)]

foster@mcs.anl.gov                                                             ARGONNE  CHICAGO
                                                                                      77
            GridFTP for Efficient WAN Transfer
          Transfer Tb+ datasets
           – Highly-secure authentication
           – Parallel transfer for speed
          LLNL->Chicago transfer (slow site network interfaces):
           GridFTP (globus-url-copy)

          [Chart: bandwidth (Mb/s, 0-80) vs. # of parallel streams (0-35)
           for the LLNL->Chicago transfer]

          Parallel transfer: fully utilizes bandwidth of the network
           interface on single nodes
          Striped transfer (between parallel filesystems): fully utilizes
           bandwidth of Gb+ WANs using multiple nodes
          FUTURE: Integrate striped GridFTP with parallel storage
           systems, e.g., HPSS

foster@mcs.anl.gov                                                    ARGONNE  CHICAGO
                                                        78
               GridFTP for User-Friendly
                  Visualization Setup
   High-res visualization is too large
    for display on a single system
    – Needs to be tiled, 24bit->16bit
      depth
    – Needs to be staged to display
      units
   GridFTP/ActiveMural integration
    application performs tiling, data
    reduction, and staging in a single
    operation
    – PVFS/MPI-IO on server
    – MPI process group transforms
      data as needed before transfer
    – Performance is currently bounded
      by 100Mb/s NICs on display
      nodes
foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                                                                          79
  Distributed Computing+Visualization

     Job submission: simulation code submitted to a remote center for
      execution on 1000s of nodes; the remote center generates Tb+
      datasets from the simulation code
     WAN transfer: FLASH data transferred to ANL (Chiba City) for
      visualization; GridFTP parallelism utilizes high bandwidth
      (capable of utilizing >Gb/s WAN links)
     Visualization code constructs and stores high-resolution
      visualization frames for display on many devices
     LAN/WAN transfer: a user-friendly striped GridFTP application
      tiles the frames and stages the tiles onto display nodes
     ActiveMural display shows very high resolution large-screen
      dataset animations

 FUTURE (1-5 yrs)
 • 10s Gb/s LANs, WANs
 • End-to-end QoS
 • Automated replica management
 • Server-side data reduction & analysis
 • Interactive portals
foster@mcs.anl.gov                                                    ARGONNE  CHICAGO
                                                          80
             SC’2001 Experiment:
          Simulation of HEP Tier 1 Site
   Tiered (Hierarchical) Site Structure
    – All data generated at lower tiers must be
      forwarded to the higher tiers
    – Tier 1 sites may have many sites transmitting to
      them simultaneously and will need to sink a
      substantial amount of bandwidth
    – We demonstrated the ability of GridFTP to support
      this at SC 2001 in the Bandwidth Challenge
    – 16 sites, with 27 hosts, pushed a peak of 2.8 Gb/s
      to the show floor in Denver, with a sustained
      bandwidth of nearly 2 Gb/s
foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                             81
       Visualization of Network Traffic
       During the Bandwidth Challenge




foster@mcs.anl.gov             ARGONNE  CHICAGO
                                                         82
                    The Replica
                Management Problem
        Maintain a mapping between logical names
         for files and collections and one or more
         physical locations
        Important for many applications
        Example: CERN high-level trigger data
         – Multiple petabytes of data per year
         – Copy of everything at CERN (Tier 0)
         – Subsets at national centers (Tier 1)
         – Smaller regional centers (Tier 2)
         – Individual researchers will have copies
foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                           83
              Our Approach to Replica
                   Management
        Identify replica cataloging and reliable
         replication as two fundamental services
         – Layer on other Grid services: GSI, transport,
           information service
         – Use as a building block for other tools
        Advantage
         – These services can be used in a wide variety
           of situations




foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                                                                       84
                   Replica Catalog Structure:
                  A Climate Modeling Example

    [Figure: a replica catalog contains logical collections, e.g.
     "C02 measurements 1998" and "C02 measurements 1999". Each collection
     lists logical files (Filename: Jan 1998, Feb 1998, ...), and logical
     file entries carry attributes such as Size: 1468762. Location entries
     record which files each physical site holds and how to reach them:
       Location jupiter.isi.edu: Filenames Mar 1998, Jun 1998, Oct 1998;
         Protocol: gsiftp; UrlConstructor: gsiftp://jupiter.isi.edu/nfs/v6/climate
       Location sprite.llnl.gov: Filenames Jan 1998 ... Dec 1998;
         Protocol: ftp; UrlConstructor: ftp://sprite.llnl.gov/pub/pcmdi]

foster@mcs.anl.gov                                                     ARGONNE  CHICAGO
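A toy in-memory version of the catalog structure shown above (illustrative
only, not the Globus implementation): logical collections map to logical file
names, and each location records which of those files it physically holds and
how to construct their URLs. The file-name-to-path mapping is invented.

    # Toy replica catalog mirroring the climate example above.
    catalog = {
        "C02 measurements 1998": {
            "logical_files": ["Jan 1998", "Feb 1998", "Mar 1998",
                              "Jun 1998", "Oct 1998", "Dec 1998"],
            "locations": [
                {"host": "jupiter.isi.edu", "protocol": "gsiftp",
                 "url_prefix": "gsiftp://jupiter.isi.edu/nfs/v6/climate/",
                 "files": ["Mar 1998", "Jun 1998", "Oct 1998"]},
                {"host": "sprite.llnl.gov", "protocol": "ftp",
                 "url_prefix": "ftp://sprite.llnl.gov/pub/pcmdi/",
                 "files": ["Jan 1998", "Dec 1998"]},
            ],
        },
    }

    def physical_urls(collection, logical_file):
        """Map a logical file name to all physical URLs holding a replica."""
        entry = catalog[collection]
        return [loc["url_prefix"] + logical_file.replace(" ", "_").lower()
                for loc in entry["locations"] if logical_file in loc["files"]]

    print(physical_urls("C02 measurements 1998", "Jan 1998"))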
                                                                                                                                                  85
          Giggle: A Scalable Replication
                 Location Service
        Local replica catalogs maintain definitive
         information about replicas
        Publish (perhaps approximate) information
         using soft state techniques
        Variety of indexing strategies possible
        [Charts: (left) time (ms, 0-10) vs. number of LFNs ('000, 1-1000)
         for Create (LRC), Add (LRC), Delete (LRC), Query (LRC), and
         Query (RLI) operations; (right) time for soft-state update
         (sec, 1-10000, log scale) vs. number of LFNs ('000, 1-1000)
         for 1 LRC and 2 LRCs]

foster@mcs.anl.gov                                                                                                 ARGONNE  CHICAGO
                                                                    86
            GriPhyN = App. Science + CS + Grids

    GriPhyN = Grid Physics Network
     – US-CMS                 High Energy Physics
     – US-ATLAS               High Energy Physics
     – LIGO/LSC               Gravity wave research
     – SDSS                   Sloan Digital Sky Survey
     – Strong partnership with computer scientists
    Design and implement production-scale grids
     – Develop common infrastructure, tools and services
     – Integration into the 4 experiments
      – Application to other sciences via "Virtual Data Toolkit"
    Multi-year project
     – R&D for grid architecture (funded at $11.9M +$1.6M)
     – Integrate Grid infrastructure into experiments through VDT
foster@mcs.anl.gov                                ARGONNE  CHICAGO
                                                              87

                     GriPhyN Institutions
         – U Florida               – U Penn
         – U Chicago               – U Texas, Brownsville
         – Boston U                – U Wisconsin, Milwaukee
         – Caltech                 – UC Berkeley
         – U Wisconsin, Madison    – UC San Diego
         – USC/ISI                 – San Diego
         – Harvard                   Supercomputer Center
         – Indiana                 – Lawrence Berkeley Lab
         – Johns Hopkins           – Argonne
         – Northwestern            – Fermilab
         – Stanford                – Brookhaven
         – U Illinois at Chicago

foster@mcs.anl.gov                            ARGONNE  CHICAGO
                                                                             88

      GriPhyN: PetaScale Virtual Data Grids
[Figure: GriPhyN petascale virtual data grid architecture. Production teams, individual investigators, and workgroups (~1 Petaop/s, ~100 Petabytes aggregate) use interactive user tools: virtual data tools, request planning & scheduling tools, and request execution & management tools. These rely on resource management services, security and policy services, and other Grid services, which apply transforms to raw data sources and draw on distributed resources (code, storage, CPUs, networks).]
foster@mcs.anl.gov                                       ARGONNE  CHICAGO
                                                                        89
                   GriPhyN/PPDG
                Data Grid Architecture
[Architecture figure: an application produces a DAG; a planner refines it and an executor (DAGMan, Kangaroo) runs it. Supporting services: catalog services (MCAT, GriPhyN catalogs), monitoring (MDS), info services (MDS), replica management (GDMP), policy/security (GSI, CAS), a reliable transfer service (Globus), compute resources (GRAM), and storage resources (GridFTP, GRAM, SRM). Highlighted components indicate where an initial solution is operational.]
 Ewa Deelman, Mike Wilde
foster@mcs.anl.gov                                  ARGONNE  CHICAGO
                                                          90
              GriPhyN Research Agenda
   Virtual Data technologies
    – Derived data, calculable via algorithm
    – Instantiated 0, 1, or many times (e.g., caches)
     – "Fetch value" vs. "execute algorithm"
    – Potentially complex (versions, cost calculation, etc)
    E.g., LIGO: "Get gravitational strain for 2 minutes
     around 200 gamma-ray bursts over last year"
   For each requested data value, need to
    – Locate item materialization, location, and algorithm
    – Determine costs of fetching vs. calculating
    – Plan data movements, computations to obtain results
    – Execute the plan
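As a rough illustration of the last three steps, the sketch below compares the estimated cost of fetching an existing materialization against re-running the derivation and returns a plan; the cost model, function names, and numbers are invented for the example.

# Hypothetical fetch-vs-recompute planner; the cost model is invented for illustration.
def plan(request, replicas, derivation, bandwidth_mb_s=10.0):
    """replicas: list of (site, size_mb); derivation: (algorithm, est_cpu_sec)."""
    fetch_options = [(size_mb / bandwidth_mb_s, ("fetch", site))
                     for site, size_mb in replicas]
    algorithm, cpu_sec = derivation
    cost, action = min(fetch_options + [(cpu_sec, ("compute", algorithm))])
    return {"request": request, "action": action, "est_seconds": cost}

print(plan("strain around a gamma-ray burst, +/- 60 s",
           replicas=[("remote-archive", 1200.0)],     # a 1.2 GB copy exists remotely
           derivation=("extract_strain", 90.0)))      # or ~90 s of CPU to re-derive
# Recomputing wins here: {'action': ('compute', 'extract_strain'), 'est_seconds': 90.0, ...}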
foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                                  91
             Virtual Data
              in Action
    Data request may
     – Compute locally
     – Compute remotely
     – Access local data
     – Access remote data
    Scheduling based on
     – Local policies
     – Global policies
     – Cost
[Figure: a "fetch item" request cascades across major facilities and archives, regional facilities and caches, and local facilities and caches.]
foster@mcs.anl.gov                           ARGONNE  CHICAGO
                                                          92

      GriPhyN Research Agenda (cont.)
    Execution management
     – Co-allocation (CPU, storage, network transfers)
     – Fault tolerance, error reporting
     – Interaction, feedback to planning
    Performance analysis (with PPDG)
     – Instrumentation, measurement of all components
     – Understand and optimize grid performance
    Virtual Data Toolkit (VDT)
     – VDT = virtual data services + virtual data tools
     – One of the primary deliverables of R&D effort
     – Technology transfer to other scientific domains
foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                          93
    Programs as Community Resources:
      Data Derivation and Provenance
        Most scientific data are not simple
          "measurements"; essentially all are:
         – Computationally corrected/reconstructed
         – And/or produced by numerical simulation
        And thus, as data and computers become
         ever larger and more expensive:
         – Programs are significant community resources
         – So are the executions of those programs



foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                                                    94
"I've come across some interesting data, but I need to understand the nature of the corrections applied when it was constructed before I can trust it for my purposes."

"I've detected a calibration error in an instrument and want to know which derived data to recompute."

"I want to search an astronomical database for galaxies with certain characteristics. If a program that performs this analysis exists, I won't have to write one from scratch."

"I want to apply an astronomical analysis program to millions of objects. If the results already exist, I'll save weeks of computation."

[Figure: Data is created-by a Transformation and consumed-by/generated-by a Derivation, which is an execution-of that Transformation.]
foster@mcs.anl.gov                                        ARGONNE  CHICAGO
                                                                                95
      The Chimera Virtual Data System
             (GriPhyN Project)
    Virtual data catalog
     – Transformations, derivations, data
    Virtual data language
     – Data definition + query
    Applications include browsers and data analysis applications
[Figure: virtual data applications issue definitions and queries in the Virtual Data Language; the Chimera VDL interpreter manipulates derivations and transformations and uses SQL to access the virtual data catalog (which implements the Chimera virtual data schema); task graphs (compute and data movement tasks, with dependencies) are handed to Data Grid resources for distributed execution and data management. An illustrative sketch follows.]
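The sketch below gives a flavor of the transformation/derivation distinction that the catalog records and of a simple provenance query over it; the Python classes and names are purely illustrative and are not Chimera's VDL syntax or virtual data schema.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Transformation:                 # a program, described abstractly
    name: str
    args: List[str]

@dataclass
class Derivation:                     # one recorded invocation of a transformation
    transformation: str
    inputs: List[str]                 # logical files consumed
    outputs: List[str]                # logical files produced
    params: Dict[str, str] = field(default_factory=dict)

transformations = {"findClusters": Transformation("findClusters", ["brg", "core"])}
derivations = [
    Derivation("findClusters", inputs=["brg.42"], outputs=["core.42"]),
    Derivation("findClusters", inputs=["brg.43"], outputs=["core.43"]),
]

def affected_by(lfn):
    """Provenance-style query: which outputs must be recomputed if 'lfn' changes?"""
    return [out for d in derivations if lfn in d.inputs for out in d.outputs]

print(affected_by("brg.42"))          # ['core.42']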




foster@mcs.anl.gov                                     ARGONNE  CHICAGO
                                              96
             SDSS Galaxy Cluster Finding




foster@mcs.anl.gov              ARGONNE  CHICAGO
                                                             97



           Cluster-finding Data Pipeline
[Pipeline figure: raw tsObj files feed stage 1 (field), stage 2 (brg), stage 3 (core), stage 4 (cluster), and stage 5 (catalog); several parallel branches of field/brg/core processing merge into the final cluster catalog.]
foster@mcs.anl.gov                             ARGONNE  CHICAGO
                                                      98



                     Virtual Data in CMS




            Virtual Data Long Term Vision of CMS:
            CMS Note 2001/047, GRIPHYN 2001-16
foster@mcs.anl.gov                      ARGONNE  CHICAGO
          Early GriPhyN Challenge Problem: 99
              CMS Data Reconstruction
A master Condor job runs on a Caltech workstation.
 2) It launches a secondary Condor job on the Wisconsin pool; input files are shipped via Globus GASS.
 3) 100 Monte Carlo jobs run on the Wisconsin Condor pool.
 4) 100 data files (~1 GB each) are transferred via GridFTP.
 5) The secondary job reports completion to the master.
 6) The master starts reconstruction jobs on the NCSA Linux cluster via the Globus jobmanager.
 7) GridFTP fetches the data from NCSA UniTree (a GridFTP-enabled FTP server).
 8) The processed Objectivity database is stored to UniTree.
 9) The reconstruction job reports completion to the master.
 Scott Koranda, Miron Livny, others
foster@mcs.anl.gov                                                        ARGONNE  CHICAGO
                                                                                                              100



                  Trace of a Condor-G Physics Run
[Figure: number of jobs (0-120) versus time over several days in April 2001. Pre/simulation/post jobs ran on the UW Condor pool, ooHits jobs at NCSA, and ooDigis jobs at NCSA; a dip in throughput is marked "delay due to script error".]
foster@mcs.anl.gov                                                                 ARGONNE  CHICAGO
                                                         101


                          Outline
        The technology landscape
        Grid computing
        The Globus Toolkit
        Applications and technologies
         – Data-intensive; distributed computing;
           collaborative; remote access to facilities
        Grid infrastructure
        Open Grid Services Architecture
        Global Grid Forum
        Summary and conclusions
foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                         102



                Distributed Computing
        Aggregate computing resources & codes
         – Multidisciplinary simulation
         – Metacomputing/distributed simulation
         – High-throughput/parameter studies
        Challenges
         – Heterogeneous compute & network
           capabilities, latencies, dynamic behaviors
        Example tools
         – MPICH-G2: Grid-aware MPI
         – Condor-G, Nimrod-G: parameter studies
foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                                                                                         103
                     Multidisciplinary Simulations:
                            Aviation Safety  Wing Models


[Figure: whole-aircraft simulation coupling sub-system models. Airframe models (lift capabilities, drag capabilities, responsiveness); stabilizer models (deflection capabilities, responsiveness); crew/human models (accuracy, perception, stamina, reaction times, SOPs); engine models (thrust performance, reverse thrust performance, responsiveness, fuel consumption); landing gear models (braking performance, steering capabilities, traction, dampening capabilities).]
                     Whole system simulations are produced by coupling all of the sub-system simulations

foster@mcs.anl.gov                                                                     ARGONNE  CHICAGO
                                                      104



        MPICH-G2: A Grid-Enabled MPI
       A complete implementation of the Message
        Passing Interface (MPI) for heterogeneous,
        wide area environments
        – Based on the Argonne MPICH implementation
          of MPI (Gropp and Lusk)
       Requires services for authentication, resource
        allocation, executable staging, output, etc.
       Programs run in wide area without change
        – Modulo accommodating heterogeneous
          communication performance
       See also: MetaMPI, PACX, STAMPI, MAGPIE
www.globus.org/mpi
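For illustration only, a minimal MPI program of the kind that runs unchanged whether launched on one cluster or across sites; it is written here with Python's mpi4py for brevity (Grid MPI codes of this era were typically C or Fortran), and the reduction is just a stand-in computation.

# Minimal MPI sketch (mpi4py used for brevity; not specific to MPICH-G2).
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

local = float(rank)                              # stand-in for a local partial result
total = comm.reduce(local, op=MPI.SUM, root=0)   # wide-area or local: same source code

if rank == 0:
    print(f"sum over {size} ranks = {total}")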
foster@mcs.anl.gov                      ARGONNE  CHICAGO
                                                          105



   Grid-based Computation: Challenges
        Locate "suitable" computers
        Authenticate with appropriate sites
        Allocate resources on those computers
        Initiate computation on those computers
        Configure those computations
        Select "appropriate" communication methods
        Compute with "suitable" algorithms
        Access data files, return output
        Respond "appropriately" to resource changes
foster@mcs.anl.gov                          ARGONNE  CHICAGO
       MPICH-G2 Use of Grid Services                                 106

                        % grid-proxy-init
                        % mpirun -np 256 myprog
[Figure: mpirun queries MDS to locate hosts and generates a resource specification; GASS stages executables; globusrun submits multiple jobs; DUROC coordinates startup; GRAM authenticates, initiates each job (via fork, LSF, or LoadLeveler), detects termination, and monitors/controls it; processes communicate via vendor MPI and TCP/IP (globus-io).]
foster@mcs.anl.gov                               ARGONNE  CHICAGO
                                                             107
                        Cactus
        (Allen, Dramlitsch, Seidel, Shalf, Radke)
      Modular, portable framework for
       parallel, multidimensional simulations
      Construct codes by linking
       – Small core ("flesh"): management services
       – Selected modules ("thorns"): numerical
         methods, grids & domain decompositions,
         visualization and steering, etc.
       [Figure: thorns plug into the Cactus "flesh".]
      Custom linking/configuration tools
      Developed for astrophysics, but not
       astrophysics-specific
www.cactuscode.org
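The flesh-plus-thorns design is essentially a plugin architecture: a small core schedules whichever modules were selected at configure time. The toy registry below illustrates that idea only; it is not the Cactus API, which is C/Fortran with its own configuration files.

# Toy "flesh + thorns" sketch; illustrative only, not the Cactus API.
class Flesh:
    def __init__(self):
        self.thorns = []                  # modules selected when the code was built

    def activate(self, thorn):
        self.thorns.append(thorn)

    def evolve(self, steps):
        for step in range(steps):
            for thorn in self.thorns:     # the core only schedules the modules
                thorn.step(step)

class WaveSolverThorn:
    def step(self, step):
        print(f"[wave] advancing solution, step {step}")

class CheckpointThorn:
    def step(self, step):
        if step % 2 == 0:
            print(f"[io] writing checkpoint at step {step}")

flesh = Flesh()
flesh.activate(WaveSolverThorn())
flesh.activate(CheckpointThorn())
flesh.evolve(3)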
foster@mcs.anl.gov                      ARGONNE  CHICAGO
                                                           108
       Cactus: An Application
Framework for Dynamic Grid Computing
        Cactus thorns for active management of
         application behavior and resource use
        Heterogeneous resources, e.g.:
         – Irregular decompositions
         – Variable halo for managing message size
         – Msg compression (comp/comm tradeoff)
         – Comms scheduling for comp/comm overlap
        Dynamic resource behaviors/demands, e.g.:
         – Perf monitoring, contract violation detection
         – Dynamic resource discovery & migration
         – User notification and steering
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                 Cactus Example: Gig-E 100MB/sec                                109
[Figure: SDSC IBM SP (1,024 processors; 5x12x17 = 1,020 used) coupled to the NCSA Origin Array (256+128+128 processors; 5x12x(4+2+2) = 480 used) over an OC-12 line that delivers only ~2.5 MB/sec between the machines.]
    Solved the Einstein equations for gravitational waves (real code)
     – Tightly coupled; communication required through derivatives
     – Must communicate 30 MB/step between machines
     – A time step takes 1.6 sec
    Used 10 ghost zones along the direction between machines:
     communicate every 10 steps
    Compression/decompression on all data passed in this direction
    Achieved 70-80% scaling, ~200 GF (only 14% scaling without these tricks)
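A back-of-the-envelope model, using only the numbers quoted above, shows why exchanging every 10 steps matters; the helper function is hypothetical and ignores compression, which pushed the measured scaling higher still.

# Rough efficiency model from the figures on this slide (2.5 MB/s link,
# 30 MB per inter-site exchange, 1.6 s of computation per step).
def efficiency(compute_sec_per_step, msg_mb, link_mb_per_sec, steps_between_exchanges):
    comm_sec = msg_mb / link_mb_per_sec
    work_sec = steps_between_exchanges * compute_sec_per_step
    return work_sec / (work_sec + comm_sec)

print(efficiency(1.6, 30.0, 2.5, 1))    # ~0.12: exchange every step, communication dominates
print(efficiency(1.6, 30.0, 2.5, 10))   # ~0.57: 10 ghost zones, exchange every 10 steps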
foster@mcs.anl.gov                             ARGONNE  CHICAGO
                                                                                                                110
                                      Cactus Example (2):
                                       Migration in Action
[Figure: iterations/second versus clock time for a run that starts at UC and finishes at UIUC. A load applied at UC causes 3 successive contract violations; resource discovery and migration follow (migration time not to scale), after which the run continues at UIUC at a higher rate.]

foster@mcs.anl.gov                                                                    ARGONNE  CHICAGO
                                       IPG Milestone 3:                                 111
                        Large Computing Node, Completed 12/2000
[Figure: OVERFLOW on the IPG, using Globus and MPICH-G2 for intra-problem, wide area communication on a high-lift subsonic wind tunnel model. Sites: NASA Ames (Moffett Field, CA; Lomax, a 512-node SGI Origin 2000), Glenn (Cleveland, OH; Sharp), and Langley (Hampton, VA; Whitcomb). Application POC: Mohammad J. Djomehri.]
 Slide courtesy Bill Johnston, LBNL & NASA
foster@mcs.anl.gov                                                    ARGONNE  CHICAGO
                                                         112
         High-Throughput Computing:
                   Condor
       High-throughput computing platform for
        mapping many tasks to idle computers
       Three major components
        – Scheduler manages pool(s) of [distributively
          owned or dedicated] computers
        – DAGman manages user task pools
        – Matchmaker schedules tasks to computers
       Parameter studies, data analysis
       Condor-G extensions support wide area
        execution in Grid environment
 www.cs.wisc.edu/condor
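The matchmaker pairs task requirements with machine properties, in the spirit of Condor's ClassAds; the sketch below is a simplified illustration with invented attribute names, not the ClassAd language itself.

# Simplified matchmaking sketch (not actual ClassAds).
machines = [
    {"name": "pool-001", "arch": "INTEL", "memory_mb": 512, "idle": True},
    {"name": "pool-002", "arch": "INTEL", "memory_mb": 256, "idle": True},
]
tasks = [
    {"id": "sim-17", "requires": lambda m: m["arch"] == "INTEL" and m["memory_mb"] >= 384},
    {"id": "sim-18", "requires": lambda m: m["arch"] == "INTEL"},
]

def matchmake(tasks, machines):
    matches = []
    for task in tasks:
        for machine in machines:
            if machine["idle"] and task["requires"](machine):
                machine["idle"] = False           # claim the matched machine
                matches.append((task["id"], machine["name"]))
                break
    return matches

print(matchmake(tasks, machines))   # [('sim-17', 'pool-001'), ('sim-18', 'pool-002')]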
foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                                           113


                     Defining a DAG
        A DAG is defined by a .dag file, listing each
         of its nodes and their dependencies:
         # diamond.dag
         Job A a.sub
         Job B b.sub
         Job C c.sub
         Job D d.sub
         Parent A Child B C
         Parent B C Child D
         [Diamond shape: A at the top, B and C in the middle, D at the bottom.]
        Each node runs the Condor job specified by
         its accompanying Condor submit file
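To show what the parent/child lines imply for execution, the short sketch below derives a legal ordering of the diamond's nodes; the helper is just standard Python (3.9+), not DAGMan, which additionally runs B and C concurrently once A succeeds.

# Execution order implied by diamond.dag (illustration; DAGMan itself does the real work).
from graphlib import TopologicalSorter   # Python 3.9+

dag = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}   # node -> parents
print(list(TopologicalSorter(dag).static_order()))            # e.g. ['A', 'B', 'C', 'D']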
foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                           114
          High-Throughput Computing:
          Mathematicians Solve NUG30
    Looking for the solution to the
     NUG30 quadratic assignment
     problem
    An informal collaboration of
     mathematicians and computer
     scientists
    Condor-G delivered 3.46E8 CPU seconds
     in 7 days (peak 1009 processors) at
     8 sites in the U.S. and Italy

                                        14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,
                                        13,26,17,30,6,20,19,8,18,7,27,12,11,23
MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin
foster@mcs.anl.gov                           ARGONNE  CHICAGO
                                          115
 Grid Application Development Software
             (GrADS) Project




 hipersoft.rice.edu/grads
foster@mcs.anl.gov          ARGONNE  CHICAGO
                                                        116


                          Outline
        The technology landscape
        Grid computing
        The Globus Toolkit
        Applications and technologies
         – Data-intensive; distributed computing;
           collaborative; remote access to facilities
        Grid infrastructure
        Open Grid Services Architecture
        Global Grid Forum
        Summary and conclusions
foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                                  117
                      Access Grid

    High-end group work and
     collaboration technology
    Grid services being used for
     discovery, configuration,
     authentication
    O(50) systems deployed worldwide
    Basis for the SC'2001 SC Global
     event in November 2001
     – www.scglobal.org
[Figure: an Access Grid node with a presenter mic, presenter camera, ambient (tabletop) mic, and audience camera.]
 www.accessgrid.org
foster@mcs.anl.gov                                ARGONNE  CHICAGO
                                                        118


                          Outline
        The technology landscape
        Grid computing
        The Globus Toolkit
        Applications and technologies
         – Data-intensive; distributed computing;
           collaborative; remote access to facilities
        Grid infrastructure
        Open Grid Services Architecture
        Global Grid Forum
        Summary and conclusions
foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                                           119
         Grid-Enabled Research Facilities:
              Leverage Investments
        Research instruments, satellites, particle
         accelerators, MRI machines, etc., cost a
         great deal
        Data from those devices can be accessed
         and analyzed by many more scientists
          – Not just the team that gathered the data
        More productive use of instruments
          – Calibration, data sampling during a run, via
            on-demand real-time processing


foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                            120

     Telemicroscopy & Grid-Based Computing
[Figure: data acquisition on imaging instruments feeds processing and analysis on computational resources, then advanced visualization, with large-scale databases attached; all stages are connected by the network.]
foster@mcs.anl.gov                            ARGONNE  CHICAGO
        APAN Trans-Pacific Telemicroscopy                                    121

        Collaboration, Osaka-U, UCSD, ISI
                     (slide courtesy Mark Ellisman@UCSD)




[Figure: two demonstrations linked the UHVEM (Osaka, Japan) with NCMIR (San Diego). The first ran over Tokyo XP, TransPAC, STAR TAP (Chicago), vBNS, and SDSC (San Diego); the second ran over CRL/MPT and UCSD, with Globus in the path.]
foster@mcs.anl.gov                                             ARGONNE  CHICAGO
                 Network for                        122


      Earthquake Engineering Simulation

     NEESgrid: US national
      infrastructure to couple
      earthquake engineers
      with experimental
      facilities, databases,
      computers, & each other
     On-demand access to
      experiments, data
      streams, computing,
      archives, collaboration

 Argonne, Michigan,
foster@mcs.anl.gov NCSA, UIUC, USC     www.neesgrid.org
                                     ARGONNE  CHICAGO
                                                        123
           "Experimental Facilities" Can
                Include Field Sites
        Remotely controlled sensor grids for field
         studies, e.g., in seismology and biology
         – Wireless/satellite communications
         – Sensor net technology for low-cost
           communications




foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                          124


                          Outline
        The technology landscape
        Grid computing
        The Globus Toolkit
        Applications and technologies
         – Data-intensive; distributed computing;
           collaborative; remote access to facilities
        Grid infrastructure
        Open Grid Services Architecture
        Global Grid Forum
        Summary and conclusions
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                          125
               Nature and Role of Grid
                   Infrastructure
        Persistent Grid infrastructure is critical to
         the success of many eScience projects
         – High-speed networks, certainly
         – Remotely accessible compute & storage
         – Persistent, standard services: PKI,
           directories, reservation, …
         – Operational & support procedures
        Many projects creating such infrastructures
         – Production operation is the goal, but much
           to learn about how to create & operate

foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                                                                                     126



         A National Grid Infrastructure

[Figure: a national Grid infrastructure: many application/campus sites (A) attach to regional centers, which interconnect to form the national fabric.]
foster@mcs.anl.gov                                                                            ARGONNE  CHICAGO
                                                            127
           Example Grid Infrastructure
                   Projects
        I-WAY (1995): 17 U.S. sites for one week
        GUSTO (1998): 80 sites worldwide, exp
        NASA Information Power Grid (since 1999)
         – Production Grid linking NASA laboratories
        INFN Grid, EU DataGrid, iVDGL, … (2001+)
         – Grids for data-intensive science
        TeraGrid, DOE Science Grid (2002+)
         – Production Grids link supercomputer centers
        U.S. GRIDS Center
         – Software packaging, deployment, support
foster@mcs.anl.gov                            ARGONNE  CHICAGO
                              The 13.6 TF TeraGrid:                                               128


                              Computing at 40 Gb/s
[Figure: four sites, each with local site resources and external network connections, linked by the 40 Gb/s TeraGrid backplane. Caltech and Argonne each contribute site resources plus HPSS; SDSC contributes 4.1 TF and 225 TB plus HPSS; NCSA/PACI contributes 8 TF and 240 TB plus UniTree.]
NCSA, SDSC, Caltech, Argonne
foster@mcs.anl.gov                                                      www.teragrid.org
                                                                      ARGONNE  CHICAGO
                                                                                                                                                                      129
                                                      TeraGrid (Details)
[Figure: TeraGrid hardware and network detail. Caltech: 32 nodes, 0.5 TF, 0.4 TB memory, 86 TB disk, plus 256p HP X-Class, 128p HP V2500, 92p IA-32, and HPSS. Argonne: 64 nodes, 1 TF, 0.25 TB memory, 25 TB disk, plus the 574p IA-32 Chiba City cluster, a 128p Origin, HR display & VR facilities, and HPSS. SDSC: 256 nodes, 4.1 TF, 2 TB memory, 225 TB disk, plus the 1,176p IBM SP Blue Horizon, a Sun E10K, and HPSS. NCSA: 500 nodes, 8 TF, 4 TB memory, 240 TB disk, plus 1,024p IA-32, 320p IA-64, and 1,500p Origin systems, and UniTree. Per-site building blocks are quad-processor McKinley servers (32-server groups at 128p @ 4GF with 8-12 GB memory per server, 16-server groups at 64p @ 4GF with 8 GB per server), with Myrinet Clos spines and FibreChannel-attached storage. The sites meet at the Chicago & LA DTF core switch/routers (Cisco 65xx Catalyst switch, 256 Gb/s crossbar), with OC-3/OC-12/OC-48 and GbE connections to ESnet, HSCC, MREN/Abilene, Starlight, vBNS, Calren, and NTON via Juniper M40/M160 and Extreme Black Diamond routers.]
foster@mcs.anl.gov                                                                                                           ARGONNE  CHICAGO
                                                                                                  130
                       Targeted StarLight
                  Optical Network Connections
[Figure: targeted StarLight optical network connections. The StarLight hub at NW Univ (Chicago) cross-connects CA*net4 (via Vancouver and Seattle), NTON (Portland and San Francisco), Asia-Pacific links, CERN and SURFnet, U Wisconsin, IU, PSC, NYC, Atlanta, AMPATH, and the DTF 40Gb link from Los Angeles/San Diego (SDSC). I-WIRE connects ANL, UIC, the Chicago cross connect, Ill Inst of Tech, Univ of Chicago, and NCSA/UIUC; a St Louis GigaPoP and Indianapolis (Abilene NOC) also appear.]
www.startap.net
foster@mcs.anl.gov                                                  ARGONNE  CHICAGO
                                                                                                                    131
                                CA*net 4 Architecture
[Map: CA*net 4 nodes at Victoria, Vancouver, Calgary, Edmonton, Saskatoon, Regina, Winnipeg, Thunder Bay, Ottawa, Montreal, Toronto, Windsor, Quebec, Fredericton, Charlottetown, Halifax, and St. John's, with links toward Seattle, Chicago, New York, and Boston. Legend: CANARIE GigaPOP, ORAN DWDM, carrier DWDM; existing and possible future CA*net 4 nodes are distinguished.]
foster@mcs.anl.gov                                                                   ARGONNE  CHICAGO
                                                                                                         132


                  APAN Network Topology                                                       2001.9.3




[Map: APAN links among Japan, Korea, China, Hong Kong, Thailand, Vietnam, the Philippines, Malaysia, Singapore, Indonesia, Sri Lanka, and Australia, with connections to Europe and, via 2 x 622 Mbps from Japan, to STAR TAP (USA). The legend distinguishes current links from those planned for 2001.]
foster@mcs.anl.gov                                                            ARGONNE  CHICAGO
                                                                    133


        iVDGL: A Global Grid Laboratory
       “We propose to create, operate and evaluate, over a
       sustained period of time, an international research
       laboratory for data-intensive science.”
                                   From NSF proposal, 2001

      International Virtual-Data Grid Laboratory
        – A global Grid laboratory (US, Europe, Asia, South
          America, …)
        – A place to conduct Data Grid tests "at scale"
        – A mechanism to create common Grid infrastructure
        – A laboratory for other disciplines to perform Data Grid
          tests
        – A focus of outreach efforts to small institutions
      U.S. part funded by NSF (2001-2006)
        – $13.7M (NSF) + $2M (matching)
foster@mcs.anl.gov                                ARGONNE  CHICAGO
                                                                          134

               Initial US-iVDGL Data Grid

[Map: initial US-iVDGL Data Grid sites, including SKC, Wisconsin, Fermilab, Indiana, PSU, BU, BNL, JHU, Hampton, Caltech, UCSD, Florida, and Brownsville. The legend distinguishes the Tier1 site (FNAL), proto-Tier2 sites, and Tier3 university sites; other sites are to be added in 2002.]
foster@mcs.anl.gov                               ARGONNE  CHICAGO
                  iVDGL:                135

International Virtual Data Grid Laboratory




[World map legend: Tier0/1 facility; Tier2 facility; Tier3 facility; 10 Gbps, 2.5 Gbps, 622 Mbps, and other links.]

U.S. PIs: Avery, Foster, Gardner, Newman, Szalay   www.ivdgl.org
foster@mcs.anl.gov                             ARGONNE  CHICAGO
                                                         136

                     iVDGL Architecture
                         (from proposal)




foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                                           137



             US iVDGL Interoperability
        US-iVDGL-1 Milestone (August 02)
[Figure: the iGOC coordinates the US-iVDGL-1 testbed (Aug 2002), which spans ATLAS, CMS, LIGO, and SDSS/NVO sites, each brought up in two numbered phases.]
foster@mcs.anl.gov                                          ARGONNE  CHICAGO
                                                                                           138



               Transatlantic Interoperability
          iVDGL-2 Milestone (November 02)
[Figure: the iGOC and outreach effort coordinate the iVDGL-2 testbed (Nov 2002), spanning ATLAS, CMS, LIGO, SDSS/NVO, and CS research groups, with transatlantic DataTAG partners. Participating sites include ANL, BNL, BU, HU, IU, LBL, UM, OU, UTA, CIT, UCSD, UF, FNAL, UCB, UC, ISI, NU, UW, PSU, UTB, UWM, and JHU, plus CERN, INFN, UK PPARC, and U of A.]
foster@mcs.anl.gov                                                    ARGONNE  CHICAGO
                                                         139
                     Another Example:
                     INFN Grid in Italy
        20 sites, ~200 persons, ~90 FTEs, ~20 IT
        Preliminary budget for 3 years: 9 M Euros
        Activities organized around
          – S/w development with DataGrid, DataTAG, …
         – Testbeds (financed by INFN) for DataGrid,
           DataTAG, US-EU Intergrid
         – Experiments applications
         – Tier1..Tiern prototype infrastructure
        Large scale testbeds provided by LHC
         experiments, Virgo, …
foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                        140
          U.S. GRIDS Center
 NSF Middleware Infrastructure Program

        GRIDS = Grid Research, Integration,
         Deployment, & Support
        NSF-funded center to provide
         – State-of-the-art middleware infrastructure
           to support national-scale collaborative
           science and engineering
         – Integration platform for experimental
           middleware technologies
        ISI, NCSA, SDSC, UC, UW
        NMI software release one: May 2002
 www.grids-center.org
foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                          141


                          Outline
        The technology landscape
        Grid computing
        The Globus Toolkit
        Applications and technologies
         – Data-intensive; distributed computing;
           collaborative; remote access to facilities
        Grid infrastructure
        Open Grid Services Architecture
        Global Grid Forum
        Summary and conclusions
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                          142



         Globus Toolkit: Evaluation (+)
      Good technical solutions for key problems, e.g.
       – Authentication and authorization
       – Resource discovery and monitoring
       – Reliable remote service invocation
       – High-performance remote data access
      This, plus good engineering, is enabling progress
       – Good quality reference implementation, multi-
         language support, interfaces to many systems,
         large user base, industrial support
       – Growing community code base built on tools
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                           143



          Globus Toolkit: Evaluation (-)
        Protocol deficiencies, e.g.
         – Heterogeneous basis: HTTP, LDAP, FTP
         – No standard means of invocation, notification,
           error propagation, authorization, termination, …
        Significant missing functionality, e.g.
         – Databases, sensors, instruments, workflow, …
         – Virtualization of end systems (hosting envs.)
        Little work on total system properties, e.g.
         – Dependability, end-to-end QoS, …
         – Reasoning about system properties
foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                                        144



                 Globus Toolkit Structure
 [Diagram: GRAM, MDS, and GridFTP services, each layered on GSI, provide
 service naming, reliable invocation, notification, and soft-state
 management; job managers front compute resources, GridFTP fronts data
 resources, and further services or applications can be added alongside]

 Lots of good mechanisms, but (with the exception of GSI) not that easily
 incorporated into other systems


foster@mcs.anl.gov                                     ARGONNE  CHICAGO
                                                         145

         Open Grid Services Architecture
        Service orientation to virtualize resources
        Define fundamental Grid service behaviors
         – Core set required, others optional
          A unifying framework for interoperability &
          establishment of total system properties
        Integration with Web services and hosting
         environment technologies
          Leverage tremendous commercial base
          Standard IDL accelerates community code
        Delivery via open source Globus Toolkit 3.0
          Leverage GT experience, code, mindshare
foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                         146

                      "Web Services"
     Increasingly popular standards-based
      framework for accessing network applications
      – W3C standardization; Microsoft, IBM, Sun, others
     WSDL: Web Services Description Language
      – Interface Definition Language for Web services
     SOAP: Simple Object Access Protocol
      – XML-based RPC protocol; common WSDL target
     WS-Inspection
      – Conventions for locating service descriptions
     UDDI: Universal Desc., Discovery, & Integration
      – Directory for Web services
foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                       147
               Web Services Example:
                 Database Service
        WSDL definition for "DBaccess" portType
         defines operations and bindings, e.g.:
         – Query(QueryLanguage, Query, Result)
         – SOAP protocol
                           DBaccess




        Client C, Java, Python, etc., APIs can then
         be generated
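
 To make that last point concrete, here is a minimal sketch, with
 hypothetical names throughout (DBaccessPort and its stand-in body are
 invented, not the output of any actual WSDL tool), of what a generated
 Java client API for DBaccess might look like:

    // Illustrative only: one Java method per WSDL operation, so
    // Query(QueryLanguage, Query, Result) becomes a call that returns the
    // Result part as an XML string.
    public class DBaccessExample {
        interface DBaccessPort {
            String query(String queryLanguage, String query);
        }

        public static void main(String[] args) {
            // Stand-in implementation; a real generated stub would marshal
            // the call into a SOAP message and send it to the endpoint.
            DBaccessPort db = (language, q) -> "<result rows='0'/>";
            System.out.println(db.query("SQL", "SELECT name FROM proteins"));
        }
    }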
foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                                              148


           Transient Service Instances
      "Web services" address discovery & invocation
       of persistent services
       – Interface to persistent state of entire enterprise
      In Grids, must also support transient service
       instances, created/destroyed dynamically
       – Interfaces to the states of distributed activities
       – E.g. workflow, video conf., dist. data analysis
      Significant implications for how services are
       managed, named, discovered, and used
       – In fact, much of our work is concerned with the
         management of service instances
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                                                149
                 The Grid Service =
              Interfaces + Service Data

 [Diagram: a Grid service exposes the GridService interface (service data
 access, explicit destruction, soft-state lifetime) plus other optional
 interfaces (notification, authorization, service creation, service
 registry, manageability, concurrency); reliable invocation and
 authentication are properties of the binding; service data elements sit
 between the interfaces and the implementation, which runs in a hosting
 environment/runtime such as "C", J2EE, or .NET]
foster@mcs.anl.gov                                             ARGONNE  CHICAGO
                                                       150
       Open Grid Services Architecture:
           Fundamental Structure
     1) WSDL conventions and extensions for
       describing and structuring services
          – Useful independent of "Grid" computing
     2) Standard WSDL interfaces & behaviors for
       core service activities
         – portTypes and operations => protocols




foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                                          151


         WSDL Conventions & Extensions
        portType (standard WSDL)
         – Define an interface: a set of related operations
        serviceType (extensibility element)
         – List of port types: enables aggregation
        serviceImplementation (extensibility element)
         – Represents actual code
        service (standard WSDL)
          – instanceOf extension: maps description to instance
        compatibilityAssertion (extensibility element)
         – portType, serviceType, serviceImplementation

foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                                           152



             Structure of a Grid Service

 [Diagram: running service instances (standard WSDL services) are tied by
 instanceOf to serviceImplementations; each serviceImplementation realizes
 a serviceType, which aggregates portTypes (standard WSDL); cA =
 compatibilityAssertion, linking portTypes, serviceTypes, and
 serviceImplementations]
foster@mcs.anl.gov                                       ARGONNE  CHICAGO
         Standard Interfaces & Behaviors:                   153

            Four Interrelated Concepts
        Naming and bindings
           – Every service instance has a unique name,
             from which clients can discover supported bindings
        Information model
          – Service data associated with Grid service
            instances, operations for accessing this info
        Lifecycle
          – Service instances created by factories
          – Destroyed explicitly or via soft state
        Notification
          – Interfaces for registering interest and
            delivering notifications
foster@mcs.anl.gov                           ARGONNE  CHICAGO
                                                                            154
        OGSA Interfaces and Operations
               Defined to Date
     GridService (required)
      – FindServiceData
      – Destroy
      – SetTerminationTime
     NotificationSource
      – SubscribeToNotificationTopic
      – UnsubscribeToNotificationTopic
     NotificationSink
      – DeliverNotification
     Factory
      – CreateService
     PrimaryKey
      – FindByPrimaryKey
      – DestroyByPrimaryKey
     Registry
      – RegisterService
      – UnregisterService
     HandleMap
      – FindByHandle
 Authentication, reliability are binding properties
 Manageability, concurrency, etc., to be defined
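
 A hedged sketch of how a few of these portTypes might surface as Java
 interfaces; the operation names come from the list above, while parameter
 and return types are illustrative assumptions, not the actual Globus
 Toolkit 3.0 signatures (the later sketches below reuse these interfaces):

    // Illustrative rendering only; types and units are assumptions.
    interface GridService {
        String findServiceData(String queryExpression);      // returns XML fragment(s)
        void destroy();                                       // explicit destruction
        void setTerminationTime(long requestedEpochSeconds);  // soft-state keepalive
    }

    interface Factory {
        // Creates a new service instance; returns its Grid Service Handle (GSH).
        String createService(long initialLifetimeSeconds, String creationParamsXml);
    }

    interface Registry {
        void registerService(String gridServiceHandle);
        void unregisterService(String gridServiceHandle);
    }

    interface HandleMap {
        String findByHandle(String gridServiceHandle);        // resolves a GSH
    }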
foster@mcs.anl.gov                                         ARGONNE  CHICAGO
                                                           155

                      Service Data
        A Grid service instance maintains a set of
         service data elements
         – XML fragments encapsulated in standard
           <name, type, TTL-info> containers
         – Includes basic introspection information,
           interface-specific data, and application data
        FindServiceData operation (GridService
         interface) queries this information
         – Extensible query language support
        See also notification interfaces
         – Allows notification of service existence and
           changes in service data
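
 As a hedged illustration of the container idea (the names below are
 invented, and the real FindServiceData supports an extensible query
 language rather than a simple name lookup):

    import java.util.List;
    import java.util.Map;

    // Illustrative only: a service data element as a named, typed XML
    // fragment with TTL info, plus a toy by-name lookup over a set of them.
    record ServiceDataElement(String name, String type, long ttlSeconds, String xml) {}

    class ServiceDataExample {
        static List<ServiceDataElement> findServiceData(
                Map<String, ServiceDataElement> elements, String name) {
            ServiceDataElement e = elements.get(name);
            return e == null ? List.of() : List.of(e);
        }

        public static void main(String[] args) {
            Map<String, ServiceDataElement> sde = Map.of("dbType",
                new ServiceDataElement("dbType", "xsd:string", 3600,
                    "<dbType>relational</dbType>"));
            System.out.println(findServiceData(sde, "dbType"));
        }
    }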
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                                     156
                Grid Service Example:
                  Database Service
        A DBaccess Grid service will support at
         least two portTypes
         – GridService
         – DBaccess
        Each has service data
         – GridService: basic introspection
           information, lifetime, …
         – DBaccess: database type, query languages
           supported, current load, …, …
 [Diagram: one service box exposing GridService + DBaccess, with service
 data "Name, lifetime, etc." and "DB info"]


foster@mcs.anl.gov                            ARGONNE  CHICAGO
                                                            159


                Lifetime Management
        GS instances created by factory or manually;
         destroyed explicitly or via soft state
         – Negotiation of initial lifetime with a factory
           (=service supporting Factory interface)
        GridService interface supports
         – Destroy operation for explicit destruction
         – SetTerminationTime operation for keepalive
        Soft state lifetime management avoids
         – Explicit client teardown of complex state
          – Resource "leaks" in hosting environments
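
 A hedged sketch of the keepalive pattern, reusing the illustrative
 GridService interface sketched earlier (timing and units are assumptions):

    // Illustrative client-side soft-state lifetime management.
    class KeepaliveExample {
        static void keepAlive(GridService instance, long leaseSeconds, int renewals)
                throws InterruptedException {
            for (int i = 0; i < renewals; i++) {
                long now = System.currentTimeMillis() / 1000;
                // Extend the termination time; if the client disappears, the
                // lease lapses and the hosting environment reclaims the instance.
                instance.setTerminationTime(now + leaseSeconds);
                Thread.sleep(leaseSeconds * 1000 / 2);   // renew well before expiry
            }
            // Stopping the loop is enough: soft state cleans up without an
            // explicit Destroy, which is exactly the "no leaks" property above.
        }
    }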

foster@mcs.anl.gov                           ARGONNE  CHICAGO
                                                          160



                          Factory
       Factory interface’s CreateService operation
        creates a new Grid service instance
        – Reliable creation (once-and-only-once)
       CreateService operation can be extended to
        accept service-specific creation parameters
       Returns a Grid Service Handle (GSH)
        – A globally unique URL
        – Uniquely identifies the instance for all time
        – Based on name of a home handleMap service
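
 A hedged sketch of driving a Factory (and a Registry) from a client,
 again using the illustrative interfaces sketched earlier; the creation
 parameters and lifetime encoding are assumptions made for the example:

    // Illustrative only, not the GT3 API.
    class FactoryExample {
        static String createAndAdvertise(Factory factory, Registry registry) {
            // Negotiate an initial lifetime and pass service-specific parameters.
            String gsh = factory.createService(600,
                    "<params><algorithm>clustering</algorithm></params>");
            registry.registerService(gsh);   // advertise the new instance
            return gsh;                      // GSH: globally unique, valid for all time
        }
    }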


foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                                        161



               Transient Database Services
 [Diagram: a client asks a DBaccess Factory (a Grid service with "Instance
 name, etc." and "Factory info" service data) "What services can you
 create?" and "Create a database service"; the factory creates a DBaccess
 instance with its own "Name, lifetime, etc." and "DB info" service data.
 A Registry (with "Registry info" service data) answers "What database
 services exist?" by pointing at existing DBaccess instances]
foster@mcs.anl.gov                               ARGONNE  CHICAGO
                                                                           162
                      Example:
             Data Mining for Bioinformatics
 [Diagram, built up over several slides: a Community Registry, a Mining
 Factory at a compute service provider, a Database Factory at a storage
 service provider, and existing database services BioDB 1 … BioDB n]
    User: "I want to create a personal database containing data on
     e.coli metabolism"
     – The application asks the Community Registry: "Find me a data mining
       service, and somewhere to store data"
     – The registry returns GSHs for the Mining and Database factories
     – The application requests "Create a data mining service with initial
       lifetime 10" and "Create a database with initial lifetime 1000"
     – Miner and Database instances are created; the application sends
       keepalives to both
     – The Miner queries BioDB 1 … BioDB n and delivers its results into
       the new Database
     – When mining is done, the application stops renewing the Miner, whose
       lifetime expires so that it is reclaimed
     – The personal Database persists for as long as the application keeps
       sending keepalives
foster@mcs.anl.gov                                       ARGONNE  CHICAGO
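
 Putting the steps together, a hedged end-to-end sketch of that workflow,
 built on the same illustrative interfaces as the earlier sketches; nothing
 here is real application or toolkit code:

    class BioMiningExample {
        static void run(Registry communityRegistry, Factory miningFactory,
                        Factory databaseFactory) {
            // Create the transient services with different initial lifetimes
            // (units are left unspecified here, as on the slides).
            String minerGsh = miningFactory.createService(10,
                    "<mine>e.coli metabolism</mine>");
            String dbGsh = databaseFactory.createService(1000,
                    "<db>personal</db>");

            // Advertise the personal database so it can be found later.
            communityRegistry.registerService(dbGsh);

            // While mining runs, the client renews both leases (see the
            // keepalive sketch under Lifetime Management). When mining
            // finishes it simply stops renewing the miner, which expires and
            // is reclaimed; the database persists as long as renewals continue.
            System.out.println("miner=" + minerGsh + "  database=" + dbGsh);
        }
    }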
                                                                          172



                Notification Interfaces
        NotificationSource for client subscription
         – One or more notification generators
            > Generates notification message of a specific type
            > Typed interest statements: E.g., Filters, topics, …
            > Supports messaging services, 3rd party filter services, …
         – Soft state subscription to a generator
        NotificationSink for asynchronous delivery
         of notification messages
        A wide variety of uses are possible
         – E.g. Dynamic discovery/registry services,
           monitoring, application error notification, …
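
 A hedged sketch of the source/sink pattern in the same illustrative Java
 style (the interface shapes and topic name are assumptions, and real OGSA
 notification is asynchronous and crosses the network):

    interface NotificationSink {
        void deliverNotification(String topic, String messageXml);
    }

    interface NotificationSource {
        // Soft-state subscription: the client must renew it to keep it alive.
        String subscribeToNotificationTopic(String topic, NotificationSink sink);
        void unsubscribeToNotificationTopic(String subscriptionId);
    }

    class NotificationSketch {
        public static void main(String[] args) {
            NotificationSink sink = (topic, xml) ->
                    System.out.println("notified on " + topic + ": " + xml);
            // With a real source the client would register a handle to a
            // remote sink service; here delivery is invoked directly to show
            // the flow of a typed notification message.
            sink.deliverNotification("membrane-proteins", "<newData count='3'/>");
        }
    }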
foster@mcs.anl.gov                                    ARGONNE  CHICAGO
                                                                                     173



                       Notification Example
         Notifications can be associated with any
          (authorized) service data elements
 [Diagram, built up over several slides: a client Grid service hosting a
 Notification Sink subscribes to the Notification Source of a DBaccess
 Grid service ("Notify me of new data about membrane proteins"); the
 subscription is kept alive with soft-state keepalives, and when matching
 data arrives the source delivers a "new data" message to the sink]
foster@mcs.anl.gov                                              ARGONNE  CHICAGO
       Open Grid Services Architecture:                   177

                  Summary
     Service orientation to virtualize resources
      – Everything is a service
     From Web services
      – Standard interface definition mechanisms:
        multiple protocol bindings, local/remote
        transparency
     From Grids
      – Service semantics, reliability and security models
      – Lifecycle management, discovery, other services
      Multiple "hosting environments"
      – C, J2EE, .NET, …
foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                                                178



                Recap: The Grid Service

 [Diagram, repeated from "The Grid Service = Interfaces + Service Data":
 the GridService interface plus other optional interfaces, service data
 elements, an implementation, and a hosting environment/runtime ("C",
 J2EE, .NET, …)]
foster@mcs.anl.gov                                             ARGONNE  CHICAGO
                                                                     179


          OGSA and the Globus Toolkit
        Technically, OGSA enables
         – Refactoring of protocols (GRAM, MDS-2, etc.)—
           while preserving all GT concepts/features!
         – Integration with hosting environments:
           simplifying components, distribution, etc.
         – Greatly expanded standard service set
        Pragmatically, we are proceeding as follows
         – Develop open source OGSA implementation
            > Globus Toolkit 3.0; supports Globus Toolkit 2.0 APIs
         – Partnerships for service development
         – Also expect commercial value-adds
foster@mcs.anl.gov                                  ARGONNE  CHICAGO
                                                                180
          GT3: An Open Source OGSA-
           Compliant Globus Toolkit
        GT3 Core
         – Implements Grid service
           interfaces & behaviors
         – Reference implementation of
           evolving standard
         – Java first, C soon, C#?
        GT3 Base Services
         – Evolution of current
           Globus Toolkit capabilities
         – Backward compatible
        Many other Grid services
 [Diagram: a layered stack with GT3 Core at the bottom, GT3 Base Services
 above it, and GT3 Data Services plus other Grid services on top]
foster@mcs.anl.gov                           ARGONNE  CHICAGO
                                                           181
          Hmm, Isn’t This Just Another
                Object Model?
        Well, yes, in a sense
         – Strong encapsulation
         – We (can) profit greatly from experiences of
           previous object-based systems
        But
         – Focus on encapsulation not inheritance
         – Does not require OO implementations
         – Value lies in specific behaviors: lifetime,
           notification, authorization, …, …
         – Document-centric not type-centric
foster@mcs.anl.gov                           ARGONNE  CHICAGO
                                                          182
                  Grids and OGSA:
                 Research Challenges
        Grids pose profound problems, e.g.
         – Management of virtual organizations
         – Delivery of multiple qualities of service
         – Autonomic management of infrastructure
         – Software and system evolution
        OGSA provides foundation for tackling
         these problems in a rigorous fashion?
         – Structured establishment/maintenance of
           global properties
         – Reasoning about total system properties
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                           183



                        Summary
        OGSA represents refactoring of current
         Globus Toolkit protocols and integration
         with Web services technologies
        Several desirable features
         – Significant evolution of functionality
         – Uniform IDL facilitates code sharing
         – Allows for alignment of potentially divergent
           directions (e.g., info service, service
           registry, monitoring)


foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                           184



                        Evolution
        This is not happening all at once
         – We have an early prototype of Core (alpha
           release May?)
         – Next we will work on Base, others
         – Full release by end of 2002??
         – Establishing partnerships for other services
        Backward compatibility
         – API level seems straightforward
         – Protocol level: gateways?
         – We need input on best strategies
foster@mcs.anl.gov                           ARGONNE  CHICAGO
                                                           185



                 For More Information
        OGSA architecture and overview
          – "The Physiology of the Grid: An Open Grid
            Services Architecture for Distributed Systems
            Integration", at www.globus.org/ogsa
        Grid service specification
         – At www.globus.org/ogsa
        Open Grid Services Infrastructure WG, GGF
         – www.gridforum.org/ogsi (?), soon
        Globus Toolkit OGSA prototype
         – www.globus.org/ogsa
foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                          186


                          Outline
        The technology landscape
        Grid computing
        The Globus Toolkit
        Applications and technologies
         – Data-intensive; distributed computing;
           collaborative; remote access to facilities
        Grid infrastructure
        Open Grid Services Architecture
        Global Grid Forum
        Summary and conclusions
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                        187



                     GGF Objectives
       An open process for development of standards
         – Grid "Recommendations" process modeled after
          Internet Standards Process (IETF)
       A forum for information exchange
        – Experiences, patterns, structures
       A regular gathering to encourage shared effort
        – In code development: libraries, tools…
        – Via resource sharing: shared Grids
        – In infrastructure: consensus standards


foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                                                 188



                                  GGF Groups
     Working Groups
       – Tightly focused on development of a spec or set of related specs
          > Protocol, API, etc.
       – Finite set of objectives and schedule of milestones
     Research Groups
       – More exploratory than Working Groups
       – Focused on understanding requirements, taxonomies, models, methods
         for solving a particular set of related problems
       – May be open-ended but with a definite set of objectives and
         milestones to drive progress
  Groups are approved and evaluated by a GGF Steering Group (GFSG) based on
  written charters. Among the criteria for group formation:
  • Is this work better done (or already being done) elsewhere, e.g. IETF, W3C?
  • Are the leaders involved and/or in touch with relevant efforts elsewhere?
foster@mcs.anl.gov                                              ARGONNE  CHICAGO
                     Current GGF Groups                                                    189

                   (Out-of-date List, Sorry…)
 AREA: Grid Information Services
   Working Groups: Grid Object Specification; Grid Notification Framework;
    Metacomputing Directory Services
   Research Groups: Relational Database Information Services
 AREA: Scheduling and Resource Management
   Working Groups: Advanced Reservation; Scheduling Dictionary;
    Scheduler Attributes
 AREA: Security
   Working Groups: Grid Security Infrastructure; Grid Certificate Policy
 AREA: Performance
   Working Groups: Grid Performance Monitoring Architecture
 AREA: Architectures
   Working Groups: JINI; NPI Architecture
   Research Groups: Grid Protocol Architecture; Accounting Models
 AREA: Data
   Working Groups: GridFTP
   Research Groups: Data Replication
 AREA: Applications, Programming Models, and User Environments
   Research Groups: Applications; Grid User Services; Grid Computing Env.;
    Adv Programming Models; Adv Collaboration Env.
foster@mcs.anl.gov                                                    ARGONNE  CHICAGO
                      Proposed GGF Groups                                               190


                      (Again, Out of Date …)
       AREA: Scheduling and Resource Management
         Working Groups: Scheduling Command Line API; Distributed Resource
          Mgmt Applic API; Grid Resource Management Protocol
         Research Groups: Scheduling Optimization
       AREA: Performance
         Working Groups: Network Monitoring/Measurement; Sensor Management;
          Grid Event Service
       AREA: Architectures
         Working Groups: Open Grid Services Architecture
         Research Groups: Grid Economies
       AREA: Data
         Working Groups: Archiving Command Line API; Persistent Archives
         Research Groups: DataGrid Schema; Application Metadata;
          Network Storage
       AREA: TBD
         Working Groups: Open Source Software Licensing;
          Cluster Standardization
         Research Groups: High-Performance Networks for Grids


foster@mcs.anl.gov                                                   ARGONNE  CHICAGO
                                                             191



                     Getting Involved
        Participate in a GGF Meeting
         – 3x/year, last one had 500 people
         – July 21-24, 2002 in Edinburgh (with HPDC)
         – October 15-17, 2002 in Chicago
        Join a working group or research group
         – Electronic participation via mailing lists (see
           www.gridforum.org)




foster@mcs.anl.gov                           ARGONNE  CHICAGO
                                                        192



                      Grid Events
       Global Grid Forum: working meeting
        – Meets 3 times/year, alternates U.S.-Europe,
          with July meeting as major event
       HPDC: major academic conference
        – HPDC’11 in Scotland with GGF’5, July 2002
       Other meetings with Grid content include
        – SC’XY, CCGrid, Globus Retreat




 www.gridforum.org, www.hpdc.org
foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                          193


                          Outline
        The technology landscape
        Grid computing
        The Globus Toolkit
        Applications and technologies
         – Data-intensive; distributed computing;
           collaborative; remote access to facilities
        Grid infrastructure
        Open Grid Services Architecture
        Global Grid Forum
        Summary and conclusions
foster@mcs.anl.gov                          ARGONNE  CHICAGO
                                                          194



                        Summary
        The Grid problem: Resource sharing &
         coordinated problem solving in dynamic,
         multi-institutional virtual organizations
         – Real application communities emerging
         – Significant infrastructure deployments
         – Substantial open source/architecture
           technology base: Globus Toolkit
         – Pathway defined to industrial adoption, via
           open source + OGSA
         – Rich set of intellectual challenges

foster@mcs.anl.gov                          ARGONNE  CHICAGO
    Major Application                                    195

    Communities are
       Emerging
   Intellectual buy-in, commitment
    – Earthquake engineering: NEESgrid
    – Exp. physics, etc.: GriPhyN, PPDG,
      EU Data Grid
    – Simulation: Earth System Grid,
      Astrophysical Sim. Collaboratory
    – Collaboration: Access Grid
   Emerging, e.g.
    – Bioinformatics Grids
    – National Virtual Observatory
foster@mcs.anl.gov                         ARGONNE  CHICAGO
    Major Infrastructure                             196

     Deployments are
        Underway
   For example:
     – NSF "National Technology Grid"
     – NASA "Information Power Grid"
     – DOE ASCI DISCOM Grid
     – DOE Science Grid
     – EU DataGrid
     – iVDGL
     – NSF Distributed Terascale
       Facility ("TeraGrid")
    – DOD MOD Grid
foster@mcs.anl.gov                     ARGONNE  CHICAGO
                                                         197
              A Rich Technology Base
               has been Constructed
        6+ years of R&D have produced a substantial
         code base based on open architecture
         principles: esp. the Globus Toolkit, including
         – Grid Security Infrastructure
         – Resource directory and discovery services
         – Secure remote resource access
         – Data Grid protocols, services, and tools
        Essentially all projects have adopted this as a
         common suite of protocols & services
        Enabling wide range of higher-level services
foster@mcs.anl.gov                         ARGONNE  CHICAGO
                                                         198
                     Pathway Defined to
                     Industrial Adoption
        Industry need
         – eScience applications, service provider models,
           need to integrate internal infrastructures,
           collaborative computing in general
        Technical capability
         – Maturing open source technology base
         – Open Grid Services Architecture enables
           integration with industry standards
        Result likely to be exponential industrial
         uptake

foster@mcs.anl.gov                        ARGONNE  CHICAGO
                                                       199



      Rich Set of Intellectual Challenges
        Transforming the Internet into a robust,
         usable computational platform
        Delivering (multi-dimensional) qualities of
         service within large systems
        Community dynamics and collaboration
         modalities
        Program development methodologies and
         tools for Internet-scale applications
        Etc., etc., etc.

foster@mcs.anl.gov                       ARGONNE  CHICAGO
                                         206



       New Programs
    U.K. eScience
     program
    EU 6th Framework
    U.S. Committee on
     Cyberinfrastructure
    Japanese Grid
     initiative




foster@mcs.anl.gov         ARGONNE  CHICAGO
                 U.S. Cyberinfrastructure:                                           207

                 Draft Recommendations
     New INITIATIVE to revolutionize science and engineering research at NSF
      and worldwide, to capitalize on new computing and communications
      opportunities. 21st Century Cyberinfrastructure includes supercomputing,
      but also massive storage, networking, software, collaboration, visualization,
      and human resources
      – Current centers (NCSA, SDSC, PSC) are a key resource for the INITIATIVE
      – Budget estimate: incremental $650 M/year (continuing)
    An INITIATIVE OFFICE with a highly placed, credible leader empowered to
      – Initiate competitive, discipline-driven path-breaking applications within NSF
        of cyberinfrastructure which contribute to the shared goals of the
        INITIATIVE
      – Coordinate policy and allocations across fields and projects. Participants
        across NSF directorates, Federal agencies, and international e-science
      – Develop high quality middleware and other software that is essential and
        special to scientific research
      – Manage individual computational, storage, and networking resources at least
        100x larger than individual projects or universities can provide.

foster@mcs.anl.gov                                            ARGONNE  CHICAGO
                                                 208

                 For More Information
   The Globus Project™
    – www.globus.org
   Grid concepts, projects
    – www.mcs.anl.gov/~foster
   Open Grid Services
    Architecture
    – www.globus.org/ogsa
   Global Grid Forum
    – www.gridforum.org
   GriPhyN project
    – www.griphyn.org
foster@mcs.anl.gov                 ARGONNE  CHICAGO