Monitoring req for the grid LRZ Grid Portal

					Grid Computing and the
     Globus Toolkit

    Jennifer M. Schopf
     Argonne National Lab
    National eScience Centre
   A Bit of Background
    – Grid Architecture Overview
    – Working With Applications
    – Role of Globus
   Pieces of the Globus Toolkit
    – And an Example Application
   Globus Toolkit 4.0 and Futures
   Some Other Resources


                  What is a Grid
   Resource sharing
    – Computers, storage, sensors, networks, …
    – Sharing always conditional: issues of trust, policy,
      negotiation, payment, …
   Coordinated problem solving
    – Beyond client-server: distributed data analysis,
      computation, collaboration, …
   Dynamic, multi-institutional virtual orgs
    – Community overlays on classic org structures
    – Large or small, static or dynamic



              Not A New Idea
   Late 70’s – Networked operating systems
   Late 80’s – Distributed operating systems
   Early 90’s – Heterogeneous computing
   Mid 90’s – Metacomputing

   Then the “Grid” – Foster and Kesselman,
    1999

   Also called parallel distributed computing

       Why is this hard/different?
   Lack of central control
    – Where things run
    – When they run
   Shared resources
    – Contention, variability
   Communication
    – Different sites imply different sys admins,
      users, institutional goals, and often “strong
      personalities”


                So why do it?
   Computations that need to be done within a
    time limit
   Data that can’t fit on one site
   Data owned by multiple sites


   Applications that need to be run bigger,
    faster, more




        What Kinds of Applications?
   Computation intensive
     – Interactive simulation (climate modeling)
     – Very large-scale simulation and analysis (galaxy formation, gravity waves,
       battlefield simulation)
     – Engineering (parameter studies, linked component models)
   Data intensive
     – Experimental data analysis (high-energy physics)
     – Image and sensor analysis (astronomy, climate study, ecology)
   Distributed collaboration
     – Online instrumentation (microscopes, x-ray devices, etc.)
     – Remote visualization (climate studies, biology)
     – Engineering (large-scale structural testing, chemical engineering)
   In all cases, the problems were big enough that they required people in
    several organizations to collaborate and share computing resources, data,
    and instruments.




         What Types of Problems?
   Too hard to keep track of authentication data
    (ID/password) across institutions
   Too hard to monitor system and application status
    across institutions
   Too many ways to submit jobs
   Too many ways to store & access files and data
   Too many ways to keep track of data
   Too easy to leave “dangling” resources lying
    around (robustness)




                                       Evolution of the Grid
[Figure: stages plotted against Time (horizontal axis) and increased functionality and standardization (vertical axis)]
    – Custom solutions
    – X.509, LDAP, FTP, …
    – Globus Toolkit: de facto standards; GGF: GridFTP, GSI (leveraging IETF)
    – Web services
    – Open Grid Services Arch: GGF: OGSI, WSRF, … (leveraging OASIS, W3C, IETF); multiple implementations, including Globus Toolkit
    – App-specific Services
          With Grid Computing –
          Forget Homogeneity!
   Trying to force
    homogeneity on
    users is futile.
    Everyone has their
    own preferences,
    sometimes even
    dogma.
   The Internet
    provides the model…



      Service-Oriented Architecture
   Idea is simple (and old)
    – Define remote activities in terms of interface and
      behavior, not implementation
   Devil is in the details
    – How to describe, discover, and access various types of
      service (semantically & practically)
   Latest instantiation = Web services
       Broad adoption, flexible XML-based model
       WSDL, SOAP, WS-Security
       Interfaces still being defined to date
       Performance challenges




      Open Grid Services Architecture

   Define a service-oriented architecture…
    – the key to effective virtualization
   …to address vital Grid requirements
    – AKA utility, on-demand, system
      management, collaborative computing, etc.
   …building on Web service standards.
    – extending those standards when needed




  Grid and Web Services Convergence




The definition of WSRF means that the Grid and
Web services communities can move forward on
a common base.
Theory -> Practice




                    Methodology
   Building a Grid system or application is currently
    an exercise in software integration.
    –   Define user requirements
    –   Derive system requirements or features
    –   Survey existing components
    –   Identify useful components
    –   Develop components to fit into the gaps
    –   Integrate the system
    –   Deploy and test the system
    –   Maintain the system during its operation
   This should be done iteratively, with many loops
    and eddies in the flow.

            Who Is the Grid For?
   Any Grid (distributed/collaborative) application or
    system will involve several “classes” of people.
    – “End users” (e.g., Scientists, Engineers, Customers)
    – Application/Product Developers
    – System Administrators
    – System Architects and Integrators
   Each user class has unique skills and unique
    requirements.
   The user class whose needs are met varies from
    tool to tool (even within the Globus Toolkit).



          How it Really Happens
   Implementations are provided by a mix of
    – Application-specific code
    – “Off the shelf” tools and services
    – Tools and services from the Globus Toolkit
    – Tools and services from the Grid community
      (compatible with GT)
   Glued together by…
    – Application development
    – System integration

                    How it Really Happens
[Figure: example Grid application, drawn in four columns]
    – Components shown: Web Browser, Web Portal, Simulation Tool, Data Viewer Tool, Chat Tool, Credential Repository, Certificate authority, Registration Service, Telepresence Monitor, Data Catalog, 2 Compute Servers, 2 Cameras, 3 Database services
    – Column captions, left to right: Users work with client applications; Application services organize VOs & enable access to other services; Collective services aggregate &/or virtualize resources; Resources implement standard access & management interfaces
                           How it Really Happens
                            (without the Grid)
[Figure: the same example application, with every component custom-built or off the shelf]
    – Components shown: Web Browser, Web Portal, Simulation Tool, Data Viewer Tool, Chat Tool, Credential Repository, Certificate authority, Registration Service, Telepresence Monitor, Data Catalog, 2 Compute Servers (labeled A, B), 2 Cameras, 3 Database services (labeled C, D, E)
    – Components by source: Application Developer 10, Off the Shelf 12, Globus Toolkit 0, Grid Community 0
    – Column captions, left to right: Users work with client applications; Application services organize VOs & enable access to other services; Collective services aggregate &/or virtualize resources; Resources implement standard access & management interfaces
                          How it Really Happens
                             (with the Grid)
[Figure: the same example application, built with Globus Toolkit and Grid community components]
    – Components shown: Web Browser, CHEF, Simulation Tool, Data Viewer Tool, CHEF Chat Teamlet, MyProxy, Certificate Authority, Globus Index Service, Telepresence Monitor, Globus MCS/RLS, 2 Compute Servers (each labeled Globus GRAM), 2 Cameras, 3 Database services (each labeled Globus DAI)
    – Components by source: Application Developer 2, Off the Shelf 9, Globus Toolkit 4, Grid Community 4
    – Column captions, left to right: Users work with client applications; Application services organize VOs & enable access to other services; Collective services aggregate &/or virtualize resources; Resources implement standard access & management interfaces
               Globus Is
    “Standard Plumbing” for the Grid
   Not turnkey solutions, but building blocks and tools
    for application developers and system integrators.
     – Some components (e.g., file transfer) go farther
       than others (e.g., remote job submission) toward
       end-user relevance.
   Since these solutions exist and others are already
    using them (and they’re free), it’s easier to reuse
    than to reinvent.
     – And compatibility with other Grid systems comes for
       free!




           Leveraging Existing
         and Proposed Standards
   SSL/TLS v1 (from OpenSSL) (IETF)
   LDAP v3 (from OpenLDAP) (IETF)
   X.509 Proxy Certificates (IETF)
   GridFTP v1.0 (GGF)
   OGSI v1.0 (GGF)
   And others on the road to standardization:
    WSRF (GGF, OASIS), DAI, WS-Agreement,
    WSDL 2.0, WSDM, SAML, XACML


       What Is the Globus Toolkit?
   The Globus Toolkit is a collection of solutions to
    problems that frequently come up when trying to
    build collaborative distributed applications.
   Heterogeneity
    – To date (v1.0 - v4.0), the Toolkit has focused on
      simplifying heterogeneity for application developers.
    – We aspire to include more “vertical solutions” in
      future versions.
   Standards
    – Our goal has been to capitalize on and encourage
      use of existing standards (IETF, W3C, OASIS, GGF).
    – The Toolkit also includes reference implementations
      of new/proposed standards in these organizations.

            What Does the
         Globus Toolkit Cover?




[Figure: Globus Toolkit coverage, contrasting “Today” with the “Goal”]
             Areas of Competence
   “Connectivity Layer” Solutions
    –   Service Management (WSRF)
    –   Monitoring/Discovery (WSRF and MDS)
    –   Security (GSI and WS-Security)
    –   Communication (XIO)
   “Resource Layer” Solutions
    – Computing / Processing Power (GRAM)
    – Data Access/Movement (GridFTP, OGSA-DAI)
   “Collective Layer” Solutions
    – Data Management (RLS, MCS, OGSA-DAI)
    – Monitoring/Discovery (MDS)
    – Security (CAS)

        What Is the Globus Toolkit?
   A Grid development environment
    – Develop new OGSA-compliant Web Services
    – Develop applications using Java or C/C++ Grid APIs
    – Secure applications using basic security mechanisms
   A set of basic Grid services
    –   Job submission/management
    –   File transfer (individual, queued)
    –   Database access
    –   Data management (replication, metadata)
    –   Monitoring/Indexing system information
   Tools and Examples
   The prerequisites for many Grid community tools
   Note: GT3 and GT4 releases include both WS and
    pre-WS components!
     How To Use the Globus Toolkit
   By itself, the Toolkit has surprisingly limited end user
    value.
    – There’s very little user interface material there.
    – You can’t just give it to end users (scientists, engineers,
      marketing specialists) and tell them to do something
      useful!
   The Globus Toolkit is useful to application developers
    and system integrators.
    – You’ll need to have a specific application or system in
      mind.
    – You’ll need to have the right expertise.
    – You’ll need to set up prerequisite hardware/software.
    – You’ll need to have a plan.



               Easy to Use –
       But Few Applications are “Easy”

   The uses that the Toolkit has been aimed at are
    not easy challenges!
   The Globus Toolkit makes them easier.
    – Providing solutions to the most common problems
      and promoting standard solutions
    – A well-designed implementation that allows many
      things to be built on it (lots of happy developers!)
    – 6+ years of providing support to Grid builders
    – Ever-improving documentation, installation,
      configuration, training




               “100,000 Computers”:
            A Healthy Computing Pyramid
[Figure: two computing pyramids, with the label “Global Community”]
    – Today: Supercomputer, Cluster, Desktop
    – Tomorrow?: Supercomputers (USE SPARINGLY), Specialized computers (2-3 SERVINGS), Clusters (100s of SERVINGS), Desktop (100,000 SERVINGS)
    Review: How it Really Happens
   Implementations are provided by a mix of
    – Application-specific code
    – “Off the shelf” tools and services
    – Tools and services from the Globus Toolkit
    – Tools and services from the Grid community
      (compatible with GT)
   Glued together by…
    – Application development
    – System integration

Iterative Design
           Ideal for cutting-
            edge activities where
            detailed needs and
            the “final goal”
            aren’t fully known
            ahead of time.
           Provides maximum
            adaptability, course
            correction.
           Produces useful
            results early.

Grid2003: An Operational Grid
 28 sites (2100-2800 CPUs) & growing
 400-1300 concurrent jobs
 8 substantial applications + CS experiments
 Running since October 2003




[Figure: map of Grid2003 sites, including Korea]

              http://www.ivdgl.org/grid2003
         Computation-Intensive
           Science: Grid2003
   GriPhyN - Grid Physics Network (NSF)
   iVDGL - International Virtual Data Grid
    Laboratory (NSF)
   LCG - LHC Computing Grid (EU)
   PPDG - Particle Physics Data Grid (DOE)




            Grid2003 Project Goals
   Ramp up U.S. Grid capabilities in anticipation of
    LHC experiment needs in 2005.
    –   Build, deploy, and operate a working Grid.
    –   Include all U.S. LHC institutions.
    –   Run real scientific applications on the Grid.
    –   Provide state-of-the-art monitoring services.
    –   Cover non-technical issues (e.g., SLAs) as well as
        technical ones.
   Unite the U.S. CS and Physics projects that are
    aimed at support for LHC.
    – Common infrastructure
    – Joint (collaborative) work


          Grid2003 Requirements
   General Infrastructure
   Support Multiple Virtual Organizations
   Production Infrastructure
   Standard Grid Services
   Interoperability with European LHC Sites
   Easily Deployable
   Meaningful Performance Measurements




             Grid2003 Applications
   6 VOs, 11 Apps
   CMS proton-proton collision
    simulation
   ATLAS proton-proton
    collision simulation
   LIGO gravitational wave
    search
   SDSS galaxy cluster
    detection
   ATLAS interactive analysis
   BTeV proton-antiproton
    collision simulation
   SnB biomolecular analysis
   GADU/GNARE genome
    analysis
   Various computer science
    experiments
    Example Grid2003 Workflows
[Figure: three example workflows: genome sequence analysis, Sloan Digital Sky Survey, physics data analysis]
          Grid2003 Components
   Security:
    – GT GSI, CAS, GSI-OpenSSH
   Monitoring
    – GT MDS, MonALISA, Ganglia
   Job Submission
    – GT GRAM, Condor-G, Chimera & Pegasus
   Data Tools
    – GT GridFTP, GT RLS, GT MCS



               Grid2003 Components
   Computers & storage at 28 sites (to date)
    – 2800+ CPUs
   Uniform service environment at each site
    – Globus Toolkit provides basic authentication, execution
      management, data movement
    – Pacman installation system enables installation of numerous
      other VDT and application services
   Global & virtual organization services
    – Certification & registration authorities, VO membership
      services, monitoring services
   Client-side tools for data access & analysis
    – Virtual data, execution planning, DAG management, execution
      management, monitoring
   IGOC: iVDGL Grid Operations Center


            Grid2003 Deployment




   Software installed at more than 25 U.S. LHC
    institutions, plus one Korean site.
   More than 2000 CPUs in total.
   More than 100 individuals authorized to use the Grid.
   Peak throughput of 500-900 jobs running concurrently,
    completion efficiency of 75%.
       Grid2003 Interesting Points
   Each virtual organization
    includes its own set of
    system resources
    (compute nodes,
    storage, etc.) and
    people. VO membership
    info is managed system-
    wide, but policies are
    enforced at each site.
   Throughput is a key
    metric for success, and
    monitoring tools are used
    to measure it and
    generate reports for each
    VO.
                  Grid2003 Metrics

Metric                                    Target      Achieved
Number of CPUs                            400         2762 (28 sites)
Number of users                           > 10        102 (16)
Number of applications                    > 4         10 (+CS)
Number of sites running concurrent apps   > 10        17
Peak number of concurrent jobs            1000        1100
Data transfer per day                     > 2-3 TB    4.4 TB max
            Grid2003 Summary
   Working Grid for wide set of applications
   Joint effort between application scientists,
    computer scientists
   Globus software as a starting point,
    additions from other communities as
    needed




    The Globus Toolkit “Ecosystem”
   Pieces of the Grid world-
    – Globus Toolkit and associated software
   Security
   Monitoring
   Resource management
   Portals
   Packaging



        Why Grid Security is Hard
   Resources being used may be valuable & the problems
    being solved sensitive
   Resources are often located in distinct administrative
    domains
    – Each resource has own policies & procedures
   Set of resources used by a single computation may be
    large, dynamic, and unpredictable
    – Not just client/server, requires delegation
   It must be broadly available & applicable
    – Standard, well-tested, well-understood protocols;
      integrated with wide variety of tools



                Security Tools
   Basic Grid Security Mechanisms
   Certificate Generation Tools
   Certificate Management Tools
    – Getting users “registered” to use a Grid
    – Getting Grid credentials to wherever they’re
      needed in the system
   Authorization/Access Control Tools
    – Storing and providing access to system-
      wide authorization information

    Basic Grid Security Mechanisms
   Basic Grid authentication and authorization
    mechanisms come in two flavors.
    – Pre-Web services
    – Web services
   Both are included in the Globus Toolkit, and both
    provide vital security features.
    –   Grid-wide identities implemented as PKI certificates
    –   Transport-level and message-level authentication
    –   Ability to delegate credentials to agents
    –   Ability to map between Grid & local identities
    –   Local security administration & enforcement
    –   Single sign-on support implemented as “proxies”
    –   A “plug in” framework for authorization decisions
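
The mapping between Grid and local identities can be illustrated with a minimal Python sketch. It assumes a mapping file in the style of the Globus grid-mapfile (a quoted certificate subject followed by one or more local account names). The sample subjects, the account names, and the function names parse_gridmap and local_user_for are hypothetical and chosen only for illustration.

```python
import shlex

# Hypothetical sample in grid-mapfile style: "certificate subject" local_account
SAMPLE_GRIDMAP = '''
# comment lines and blank lines are ignored
"/O=Grid/OU=Example/CN=Alice Researcher" alice
"/O=Grid/OU=Example/CN=Bob Analyst" bob,guest
'''


def parse_gridmap(text):
    """Return {subject name: [local accounts]} from grid-mapfile style text."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)  # handles the quoted subject name
        if len(parts) != 2:
            raise ValueError(f"malformed entry: {line!r}")
        subject, accounts = parts
        mapping[subject] = accounts.split(",")
    return mapping


def local_user_for(mapping, subject):
    """Map an authenticated Grid identity to its first local account, or None."""
    accounts = mapping.get(subject)
    return accounts[0] if accounts else None


if __name__ == "__main__":
    gridmap = parse_gridmap(SAMPLE_GRIDMAP)
    print(local_user_for(gridmap, "/O=Grid/OU=Example/CN=Alice Researcher"))
```

Because the lookup happens at each site, a site administrator keeps full control over which local account, if any, a Grid-wide identity is allowed to use.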
    Basic Grid Security Mechanisms
   Basic security mechanisms are provided as
    libraries/classes and APIs.
    – Integrated with other GT tools and services
    – Integrated with many Grid community tools
      and services (and applications & systems)
   A few stand-alone tools are also included.




               A Cautionary Note
   Grid security mechanisms are tedious to set up.
    – If exposed to users, hand-holding is usually required.
    – These mechanisms can be hidden entirely from end
      users, but still used behind the scenes.
   These mechanisms exist for good reasons.
    – Many useful things can be done without Grid
      security.
    – It is unlikely that an ambitious project could go into
      production operation without security like this.
    – Most successful projects end up using Grid security,
      but using it in ways that end users don’t see much.


         Globus Certificate Service
   An online service that issues low-quality GSI
    certificates
    – Intended for people who want to experiment with
      Grid components that require certificates but do not
      have any other means of acquiring certificates.
    – These certificates are not to be used on production
      systems.
   Not a true Certificate Authority (CA)
    – No revoking or reissuing certificates
    – No verification of identities
    – The service itself is not especially secure.


                     Simple CA
   A convenient method of setting up a certificate
    authority (CA).
    – The Certificate Authority can then be used to issue
      certificates for users and services that work with GSI
      and WS-Security.
    – Simple CA is intended for operators of small Grid
      testing environments and users who are not part of a
      larger Grid.
   Most production Grids will not accept certificates
    that are not signed by a well-known CA, so the
    certificates generated by Simple CA will usually not
    be sufficient to gain access to production services.


                            MyProxy
   MyProxy is a remote service
    that stores user credentials.
     – Users can request proxies
       for local use on any system
       on the network.
     – Web Portals can request
       user proxies for use with
       back-end Grid services.
   Grid administrators can pre-
    load credentials in the
    server for users to retrieve
    when needed.
   Greatly simplifies certificate
    management!
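
The idea of a credential repository that hands out short-lived proxies can be sketched in a few lines of Python. This is an in-memory toy and not the MyProxy protocol or API: the class ToyCredentialRepository, its methods, the 12-hour default lifetime, and the "/CN=proxy" suffix are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets
import time


class ToyCredentialRepository:
    """In-memory stand-in for a credential repository (not the MyProxy protocol)."""

    def __init__(self):
        self._store = {}  # username -> (salt, passphrase hash, credential subject)

    @staticmethod
    def _digest(passphrase, salt):
        return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 100_000)

    def store(self, username, passphrase, credential_subject):
        """Store a long-lived credential, protected by a passphrase."""
        salt = secrets.token_bytes(16)
        self._store[username] = (salt, self._digest(passphrase, salt), credential_subject)

    def get_proxy(self, username, passphrase, lifetime_s=12 * 3600, now=None):
        """Return a short-lived proxy record if the passphrase matches."""
        salt, expected, subject = self._store[username]
        if not hmac.compare_digest(expected, self._digest(passphrase, salt)):
            raise PermissionError("bad passphrase")
        issued = time.time() if now is None else now
        # Assumed naming: the proxy subject extends the user's own subject.
        return {"subject": subject + "/CN=proxy", "expires": issued + lifetime_s}


if __name__ == "__main__":
    repo = ToyCredentialRepository()
    repo.store("alice", "correct horse", "/O=Grid/OU=Example/CN=Alice Researcher")
    print(repo.get_proxy("alice", "correct horse", lifetime_s=3600))
```

A portal would call something like get_proxy on the user's behalf, so the long-lived credential never has to be copied to the portal host.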


              CAS: Community
            Authorization Service
   CAS allows resource providers
    to specify coarse-grained access
    control policies in terms of
    communities as a whole.
   Fine-grained access control is
    delegated to the community.
   Resource providers maintain
    ultimate authority over their
    resources (including per-user
    control and auditing) but are
    spared most day-to-day policy
    administration tasks.
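
The split between coarse-grained provider policy and fine-grained community policy can be sketched as a two-level check. This is only an illustration of the idea, not the CAS implementation: the policy tables, the community name, and the function is_allowed are hypothetical.

```python
# Coarse-grained policy kept by the resource provider: community -> allowed actions.
PROVIDER_POLICY = {
    "climate-vo": {"read", "write"},
}

# Fine-grained policy kept by each community: (member, path prefix) -> allowed actions.
COMMUNITY_POLICY = {
    "climate-vo": {
        ("alice", "/data/climate/"): {"read", "write"},
        ("bob", "/data/climate/public/"): {"read"},
    },
}


def is_allowed(community, member, path, action):
    """Grant access only if both the provider and the community allow it."""
    provider_ok = action in PROVIDER_POLICY.get(community, set())
    community_ok = any(
        m == member and path.startswith(prefix) and action in actions
        for (m, prefix), actions in COMMUNITY_POLICY.get(community, {}).items()
    )
    return provider_ok and community_ok


if __name__ == "__main__":
    print(is_allowed("climate-vo", "alice", "/data/climate/run1", "write"))
    print(is_allowed("climate-vo", "bob", "/data/climate/public/a.nc", "write"))
```

The provider's table stays small because it names only communities, while the per-member detail lives with the community.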


        Monitoring and Discovery
               Challenges
   Grid Information Service
   Requirements and characteristics
    – Uniform, flexible access to information
    – Scalable, efficient access to dynamic data
    – Access to multiple information sources
    – Decentralized maintenance
    – Secure information provision




      Monitoring/Discovery Tools
   Basic WSRF Infrastructure Components
   Specialized Monitoring/Discovery
    Components
    – Specialized collection/monitoring agents
    – Viewing and display tools for showing
      system information for a variety of
      specialized purposes




     WSRF Infrastructure Elements
   WS Core Monitoring Features
    – Every service produces Resource Properties,
      so monitoring is built right into WSRF
   Non-WSRF services can also provide
    information from wrappers
   Index Service
    – Collection point for a set of data (registry)
    – Also has last value of data in cache
    – Indexes can be set up for a variety of uses,
      projects
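
The "registry plus last-value cache" behavior of an index can be sketched as follows. This is a toy model and not the Globus Index Service API: the class ToyIndex and its methods register, refresh, and query are hypothetical names.

```python
import time


class ToyIndex:
    """Registry of information providers plus a cache of each one's last value."""

    def __init__(self):
        self._providers = {}  # name -> callable returning the current value
        self._cache = {}      # name -> (last value, time of last refresh)

    def register(self, name, provider):
        """Add an information provider to the registry."""
        self._providers[name] = provider

    def refresh(self, now=None):
        """Pull a fresh value from every registered provider into the cache."""
        stamp = time.time() if now is None else now
        for name, provider in self._providers.items():
            self._cache[name] = (provider(), stamp)

    def query(self, name):
        """Answer from the cache without contacting the provider."""
        return self._cache.get(name)


if __name__ == "__main__":
    index = ToyIndex()
    index.register("queue_length", lambda: 7)
    index.refresh()
    print(index.query("queue_length"))
```

Queries are cheap because they never touch the underlying resource, at the cost of returning data that is only as fresh as the last refresh.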

        Monitoring and Discovery
         Service in GT4 (MDS4)
   WS-RF compatible
   Monitoring of basic service data
   Primary use case is discovery of services
   Starting to be used for up/down statistics




                                                 71
      MDS4 Information Providers
   Code that generates resource property information
    – Were called service data providers in GT3
   XML Based – not LDAP
   Basic cluster data
    – Interface to Ganglia
    – GLUE schema
   Some service data from GT4 services
    – Start, timeout, etc
   Soft-state registration
   Push and pull data models
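
Soft-state registration means a registration lapses unless the provider keeps renewing it. A minimal sketch of that idea follows, assuming a fixed lifetime per registration; the class `SoftStateRegistry` and the provider names are hypothetical and do not reflect the MDS4 implementation.

```python
import time
from typing import Dict, List, Optional


class SoftStateRegistry:
    """Registrations expire unless the provider renews them in time."""

    def __init__(self, lifetime_s: float) -> None:
        self.lifetime_s = lifetime_s
        self._expires: Dict[str, float] = {}  # provider name -> expiry time

    def register(self, provider: str, now: Optional[float] = None) -> None:
        """Create or renew a registration (renewal is just re-registration)."""
        now = time.monotonic() if now is None else now
        self._expires[provider] = now + self.lifetime_s

    def live_providers(self, now: Optional[float] = None) -> List[str]:
        """Drop expired registrations and return the names still alive."""
        now = time.monotonic() if now is None else now
        self._expires = {p: t for p, t in self._expires.items() if t > now}
        return sorted(self._expires)


registry = SoftStateRegistry(lifetime_s=60)
registry.register("ganglia-provider", now=0)
registry.register("gram-provider", now=0)
registry.register("ganglia-provider", now=50)  # renewal
print(registry.live_providers(now=70))         # ['ganglia-provider']
```

No explicit deregistration is needed: a provider that crashes simply stops renewing and drops out of the registry.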


                                                        72
           Ganglia Cluster Toolkit
   Ganglia is a toolkit for monitoring clusters and
    aggregations of clusters (hierarchically).
   Ganglia collects system status information and makes it
    available via a web interface.
   Ganglia status can be subscribed to and aggregated
    across multiple systems.
   Integrating Ganglia with MDS services results in status
    information provided in the proposed standard GLUE
    schema, popular in international Grid collaborations.




                                                              73
            MDS4 Index Service
   Index Service is both registry and cache
   Subscribes to information providers
    – Data, datatype, data provider information
   Caches last value of all data
   In-memory storage is the default approach
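
The "registry plus last-value cache" role can be sketched in a few lines. The class `IndexService` below and its method names are hypothetical stand-ins, not the MDS4 API; it only illustrates that the index knows who is registered and remembers the most recent value each provider pushed.

```python
from typing import Any, Dict, Optional


class IndexService:
    """Keeps a registry of providers and the last value each one reported."""

    def __init__(self) -> None:
        self._registry: Dict[str, str] = {}  # provider -> datatype
        self._cache: Dict[str, Any] = {}     # provider -> last value (in memory)

    def register(self, provider: str, datatype: str) -> None:
        self._registry[provider] = datatype

    def on_update(self, provider: str, value: Any) -> None:
        """Called when a subscribed provider pushes a new value."""
        if provider not in self._registry:
            raise KeyError(f"unregistered provider: {provider}")
        self._cache[provider] = value

    def query(self, provider: str) -> Optional[Any]:
        """Answer from the cache without contacting the provider."""
        return self._cache.get(provider)


index = IndexService()
index.register("cluster-load", "float")
index.on_update("cluster-load", 0.4)
index.on_update("cluster-load", 0.9)
print(index.query("cluster-load"))  # 0.9
```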




                                                  74
            MDS4 Trigger Service
   Compound consumer-producer service
   Subscribe to a set of resource properties
   Set of tests on incoming data streams to evaluate
    trigger conditions
   When a condition matches, email is sent to pre-
    defined address


   GT3 tech-preview version in use by ESG
   An alpha GT4 version is in the currently
    available GT4 alpha release
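
The trigger pattern (subscribe to properties, test incoming values, act on a match) can be sketched as follows. Everything here is hypothetical: `TriggerService`, the property name `disk_free_gb`, and the use of a list in place of the outgoing email are illustration only.

```python
from typing import Callable, Dict, List, Tuple

Condition = Callable[[float], bool]
Action = Callable[[str], None]


class TriggerService:
    """Evaluates tests on incoming property values and fires actions."""

    def __init__(self) -> None:
        self._triggers: Dict[str, List[Tuple[str, Condition, Action]]] = {}

    def add_trigger(self, prop: str, name: str,
                    condition: Condition, action: Action) -> None:
        self._triggers.setdefault(prop, []).append((name, condition, action))

    def on_value(self, prop: str, value: float) -> None:
        """Run every test registered for this property against the new value."""
        for name, condition, action in self._triggers.get(prop, []):
            if condition(value):
                action(f"trigger '{name}' fired: {prop}={value}")


sent: List[str] = []  # stands in for the email that would be sent
service = TriggerService()
service.add_trigger("disk_free_gb", "low-disk", lambda v: v < 10, sent.append)
service.on_value("disk_free_gb", 42)  # condition does not match
service.on_value("disk_free_gb", 7)   # condition matches
print(sent)  # ["trigger 'low-disk' fired: disk_free_gb=7"]
```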


                                                          75
          MDS4 Archive Service
   Compound consumer-producer service
   Subscribe to a set of resource properties
   Data put into database (Xindice)
   Other consumers can contact database
    archive interface


   Will be Tech Preview in GT4 Final release




                                                76
      Computing/Processing Tools
   Workflow Managers
    – Organize and coordinate task execution
      within a complicated application
    – Often coordinates data movement and task
      execution
   Metaschedulers
    – Optimize use of distributed compute pools
   Virtual Data Tools
    – Manage the trade-off between data storage
      and processing power

                                                  77
         The Resource Management
                 Challenge
   Enabling secure, controlled remote access to
    heterogeneous computational resources and
    management of remote computation
    –   Authentication and authorization
    –   Resource discovery & characterization
    –   Reservation and allocation
    –   Computation monitoring and control
   Addressed by a set of protocols & services
    – GRAM protocol as a basic building block
    – Resource brokering & co-allocation services
    – GSI for security, MDS for discovery



                                                    78
          GRAM - Basic Job
    Submission and Control Service
   A uniform service interface for
    remote job submission and
    control
    – Includes file staging and I/O
      management
    – Includes reliability features
    – Supports basic Grid security
      mechanisms
    – Available in Pre-WS and WS
   GRAM is not a scheduler.
    – No scheduling
    – No metascheduling/brokering
    – Often used as a front-end to
      schedulers, and often used to
      simplify metaschedulers/brokers
                                        79
                   Condor-G
   The Condor project has produced a “helper
    front-end” to GRAM
    – Managing sets of subtasks
    – Reliable front-end to GRAM to manage
      computational resources
   Note: Condor-G is not Condor itself, which
    promotes high-throughput computing and the
    use of idle resources



                                                80
              Chimera “Virtual Data”
   Captures both logical and physical steps in a
    data analysis process.
    – Transformations (logical)
    – Derivations (physical)
   Builds a catalog.
   Results can be used to “replay” analysis.
    – Generation of DAG (via Pegasus)
    – Execution on Grid
   Catalog allows introspection of analysis
    process.
[Figure: Sloan Survey Data, galaxy cluster size distribution. Log-log plot of
 Number of Clusters (1 to 100000) against Number of Galaxies (1 to 100).]
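
Generating a DAG from recorded derivations amounts to linking each step to the steps that produce its inputs. The sketch below assumes a toy catalog of three derivations with made-up file names; it uses Python's standard `graphlib` module and is not the Chimera or Pegasus implementation.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set, Tuple

# Hypothetical catalog: derivation name -> (input files, output files).
derivations: Dict[str, Tuple[List[str], List[str]]] = {
    "extract": (["survey.fits"], ["galaxies.cat"]),
    "cluster": (["galaxies.cat"], ["clusters.cat"]),
    "histogram": (["clusters.cat"], ["size_dist.dat"]),
}


def build_dag(derivs: Dict[str, Tuple[List[str], List[str]]]) -> Dict[str, Set[str]]:
    """Map each derivation to the derivations whose outputs it consumes."""
    producer = {out: name for name, (_, outs) in derivs.items() for out in outs}
    return {
        name: {producer[f] for f in inputs if f in producer}
        for name, (inputs, _) in derivs.items()
    }


dag = build_dag(derivations)
order = list(TopologicalSorter(dag).static_order())
print(order)  # ['extract', 'cluster', 'histogram']
```

Replaying an analysis then means executing the derivations in this order; inputs with no producer (here `survey.fits`) are treated as raw data.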


                                                                                               81
  Pegasus Workflow Transformation
   Converts Abstract Workflow (AW) into Concrete
    Workflow (CW).
    – Uses Metadata to convert user request to
      logical data sources
    – Obtains AW from Chimera
    – Uses replication data to locate physical files
    – Delivers CW to DAGman
    – Executes using Condor
    – Publishes new replication and derivation data
      in RLS and Chimera (optional)
[Figure: Metadata Catalog, Chimera Virtual Data Catalog, Replica Location
 Service, DAGman, Condor, storage systems, and compute servers.]
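
The core of the abstract-to-concrete step is replacing logical file names with physical locations and binding each job to a site. The sketch below assumes a toy replica catalog and a naive "first replica" choice; the function `concretize`, the host names, and the data layout are hypothetical and do not reflect Pegasus internals.

```python
from typing import Dict, List

# Hypothetical replica data: logical file name -> physical locations.
replica_catalog: Dict[str, List[str]] = {
    "galaxies.cat": [
        "gsiftp://site-a.example.org/data/galaxies.cat",
        "gsiftp://site-b.example.org/data/galaxies.cat",
    ],
}

# Abstract workflow: jobs refer to files by logical name only.
abstract_workflow = [
    {"job": "cluster", "inputs": ["galaxies.cat"], "outputs": ["clusters.cat"]},
]


def concretize(workflow: List[dict], replicas: Dict[str, List[str]],
               site: str) -> List[dict]:
    """Bind each job to a site and each logical input to a physical file."""
    concrete = []
    for job in workflow:
        physical_inputs = {}
        for lfn in job["inputs"]:
            candidates = replicas.get(lfn)
            if not candidates:
                raise LookupError(f"no replica registered for {lfn}")
            physical_inputs[lfn] = candidates[0]  # naive choice: first replica
        concrete.append({"job": job["job"], "site": site,
                         "inputs": physical_inputs,
                         "outputs": list(job["outputs"])})
    return concrete


for step in concretize(abstract_workflow, replica_catalog, site="site-a"):
    print(step["job"], step["site"], step["inputs"])
```

A real planner would choose among replicas and sites using performance information; the first-replica rule is only a placeholder for that decision.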


                                                                                     82
                     Data Tools
   Virtual Data Tools
    – Manage the trade-off between data storage and
      processing power (already covered)
   Movement/Transfer Tools
    – Interfaces that meet specialized application or user
      needs
    – “Last mile” integration to specialized storage systems
   Optimization Tools
    – Help optimize the use of storage systems for
      specialized user communities



                                                               83
A Model Architecture for Data Grids
[Figure: data grid architecture. The Application sends an attribute
 specification to the Metadata Catalog and obtains a logical collection and
 logical file name. The Replica Catalog maps that name to multiple locations,
 and Replica Selection picks the selected replica using performance
 information and predictions from MDS and NWS. Data moves over the GridFTP
 control channel and data channel between Replica Location 1 (disk array),
 Replica Location 2 (disk cache, tape library), and Replica Location 3
 (disk cache).]

                                                                              84
                         GridFTP
   A high-performance, secure, reliable data transfer
    protocol optimized for high-bandwidth wide-area
    networks
    –   FTP with well-defined extensions
    –   Uses basic Grid security (control and data channels)
    –   Multiple data channels for parallel transfers
    –   Partial file transfers
    –   Third-party (direct server-to-server) transfers
    –   Reusable data channels
    –   Command pipelining
   GGF recommendation GFD.20
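
Parallel data channels and partial file transfers both rest on dividing a file into byte ranges. The sketch below shows only that arithmetic, under the assumption of one contiguous range per channel; `split_ranges` is a hypothetical helper, and the actual protocol's block handling is more flexible than this.

```python
from typing import List, Tuple


def split_ranges(file_size: int, channels: int) -> List[Tuple[int, int]]:
    """Return (offset, length) per channel, covering the file exactly once."""
    if file_size < 0 or channels < 1:
        raise ValueError("file_size must be >= 0 and channels >= 1")
    base, extra = divmod(file_size, channels)
    ranges = []
    offset = 0
    for i in range(channels):
        # The first `extra` channels carry one additional byte each.
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


print(split_ranges(10, 4))  # [(0, 3), (3, 3), (6, 2), (8, 2)]
```

Each (offset, length) pair is also what a partial file transfer would request on its own, which is why the two features combine naturally.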


                                                               85
           Striped GridFTP Service
   A distributed GridFTP service that runs on a
    storage cluster
    – Every node of the cluster is used to
      transfer data into/out of the cluster
    – Head node coordinates transfers
   Multiple NICs/internal busses lead to very high
    performance
    – Maximizes use of Gbit+ WANs
[Figure: two clusters, each with a parallel filesystem. Parallel Transfer:
 fully utilizes bandwidth of network interface on single nodes. Striped
 Transfer: fully utilizes bandwidth of Gb+ WAN using multiple nodes.]
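
One simple way to picture striping is a round-robin assignment of fixed-size blocks to the nodes of the cluster. This is an assumed layout for illustration; `stripe_blocks` and the node names are hypothetical, and the slides do not say which layout the striped server actually uses.

```python
from typing import Dict, List, Tuple


def stripe_blocks(file_size: int, block_size: int,
                  nodes: List[str]) -> Dict[str, List[Tuple[int, int]]]:
    """Assign (offset, length) blocks to nodes in round-robin order."""
    if block_size < 1 or not nodes:
        raise ValueError("need a positive block size and at least one node")
    layout: Dict[str, List[Tuple[int, int]]] = {node: [] for node in nodes}
    for index, offset in enumerate(range(0, file_size, block_size)):
        node = nodes[index % len(nodes)]
        layout[node].append((offset, min(block_size, file_size - offset)))
    return layout


print(stripe_blocks(10, 4, ["node0", "node1"]))
# {'node0': [(0, 4), (8, 2)], 'node1': [(4, 4)]}
```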

                                                                                                                   86
                    UberFTP
   UberFTP is an interactive (text prompt)
    client for GridFTP.
   Supports
    – Parallelism
    – Third-party transfer




                                              87
       RFT - File Transfer Queuing
   A WSRF service for queuing file transfer
    requests
    – Server-to-server transfers
    – Checkpointing for restarts
    – Database back-end for failovers
   Allows clients to request transfers and
    then “disappear”
    – No need to manage the transfer
    – Status monitoring available if desired
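
Checkpointing with a database back-end can be sketched with Python's built-in `sqlite3` module. The table layout and the helpers `open_queue`, `submit`, `checkpoint`, and `resume_point` are assumptions made for this example, not the RFT schema or interface.

```python
import sqlite3
from typing import Tuple


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the persistent transfer queue."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS transfers ("
        " id INTEGER PRIMARY KEY, src TEXT, dst TEXT,"
        " total INTEGER, bytes_done INTEGER DEFAULT 0,"
        " state TEXT DEFAULT 'pending')"
    )
    return db


def submit(db: sqlite3.Connection, src: str, dst: str, total: int) -> int:
    """Queue a request; the client may disconnect once this returns."""
    cur = db.execute(
        "INSERT INTO transfers (src, dst, total) VALUES (?, ?, ?)",
        (src, dst, total))
    db.commit()
    return cur.lastrowid


def checkpoint(db: sqlite3.Connection, transfer_id: int, bytes_done: int) -> None:
    """Record progress so a restarted service knows where to resume."""
    db.execute(
        "UPDATE transfers SET bytes_done = ?,"
        " state = CASE WHEN ? >= total THEN 'done' ELSE 'active' END"
        " WHERE id = ?",
        (bytes_done, bytes_done, transfer_id))
    db.commit()


def resume_point(db: sqlite3.Connection, transfer_id: int) -> Tuple[int, str]:
    """Return (bytes already transferred, state) for status or restart."""
    return db.execute(
        "SELECT bytes_done, state FROM transfers WHERE id = ?",
        (transfer_id,)).fetchone()


db = open_queue()
tid = submit(db, "gsiftp://a.example.org/f", "gsiftp://b.example.org/f", 1000)
checkpoint(db, tid, 400)
print(resume_point(db, tid))  # (400, 'active')
```

With a file path instead of `:memory:`, the queue survives a service restart, which is the point of the database back-end.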

                                               88
                    Example:
          Reliable File Transfer Service
[Figure: several clients request and manage file transfer operations through
 the service's interfaces (Grid Service, File Transfer, Notification Source,
 Policy). Internal state is exposed as service data elements (Pending,
 Performance, Policy, Faults). A Fault Monitor and a Performance Monitor
 query and/or subscribe to the service data. The service performs the data
 transfer operations.]
                                                                         89
                           OGSA-DAI
   OGSA interface for accessing XML and
    relational data stores
   Implements the GGF DAIS WG standard
    (in progress)
[Figure: OGSA-DAI interaction sequence (SOAP/HTTP, service creation, API
 interactions)]
    1a. Client sends request to Registry (DAI Service Group) for sources of
        data about “x”
    1b. Registry responds with Factory handle
    2a. Client sends request to Grid Data Service Factory for access to
        database
    2b. Factory creates GridDataService (GDS) to manage access
    2c. Factory returns handle of GDS to client
    3a. Client queries GDS with XPath, SQL, etc.
    3b. GDS interacts with XML / relational database
    3c. Results of query returned to client (or to a 3rd party)

    Figure courtesy of Malcolm Atkinson and Rob Baxter, UK eScience Center
                                                                                                              90
    MCS - Metadata Catalog Service
   A stand-alone metadata catalog service
    – WSRF service interface
    – Stores system-defined and user-defined
      attributes for logical files/objects
    – Supports manipulation and query
   Integrated with OGSA-DAI
    – OGSA-DAI provides metadata storage
    – When run with OGSA-DAI, basic Grid
      authentication mechanisms are available
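
Storing attributes per logical file and querying on them can be sketched with a small in-memory catalog. `MetadataCatalog`, its methods, and the attribute names are hypothetical and stand in for the service interface, which the slides do not show.

```python
from typing import Any, Dict, List


class MetadataCatalog:
    """Maps logical file names to attribute/value pairs."""

    def __init__(self) -> None:
        self._attributes: Dict[str, Dict[str, Any]] = {}

    def set_attributes(self, logical_name: str, **attributes: Any) -> None:
        """Add or update attributes (system-defined or user-defined)."""
        self._attributes.setdefault(logical_name, {}).update(attributes)

    def query(self, **criteria: Any) -> List[str]:
        """Return logical names whose attributes match every criterion."""
        return sorted(
            name for name, attrs in self._attributes.items()
            if all(attrs.get(key) == value for key, value in criteria.items())
        )


catalog = MetadataCatalog()
catalog.set_attributes("run42.dat", experiment="sdss", size_bytes=1024)
catalog.set_attributes("run43.dat", experiment="sdss", quality="good")
catalog.set_attributes("sim01.dat", experiment="cms")
print(catalog.query(experiment="sdss"))                  # ['run42.dat', 'run43.dat']
print(catalog.query(experiment="sdss", quality="good"))  # ['run43.dat']
```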


                                                91
     RLS - Replica Location Service
   A distributed system for tracking replicated data
    – Consistent local state maintained in Local Replica
      Catalogs (LRCs)
    – Collective state with relaxed consistency maintained
      in Replica Location Indices (RLIs)
   Performance features
    – Soft state maintenance of RLI state
    – Compression of state updates
    – Membership and partitioning information
      maintenance
Note:
    – RLS (developed by Globus Alliance and the DataGrid Project)
      replaces earlier components in the Globus Toolkit 2.x.
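
The two-level lookup (an index that says which catalogs know a name, and catalogs that hold the actual mappings) can be sketched as below. The classes, site names, and URLs are hypothetical, and the periodic soft-state update is reduced to an explicit `update_from` call.

```python
from typing import Dict, List, Set


class LocalReplicaCatalog:
    """Authoritative logical-to-physical mappings for one site."""

    def __init__(self, site: str) -> None:
        self.site = site
        self.mappings: Dict[str, Set[str]] = {}

    def add(self, logical_name: str, physical_name: str) -> None:
        self.mappings.setdefault(logical_name, set()).add(physical_name)

    def lookup(self, logical_name: str) -> List[str]:
        return sorted(self.mappings.get(logical_name, ()))


class ReplicaLocationIndex:
    """Knows only which sites hold a logical name; may lag behind the LRCs."""

    def __init__(self) -> None:
        self._sites: Dict[str, Set[str]] = {}

    def update_from(self, lrc: LocalReplicaCatalog) -> None:
        """Soft-state update: replace everything known about this LRC."""
        for sites in self._sites.values():
            sites.discard(lrc.site)
        for logical_name in lrc.mappings:
            self._sites.setdefault(logical_name, set()).add(lrc.site)

    def sites_for(self, logical_name: str) -> List[str]:
        return sorted(self._sites.get(logical_name, ()))


lrc_a, lrc_b = LocalReplicaCatalog("site-a"), LocalReplicaCatalog("site-b")
lrc_a.add("run42.dat", "gsiftp://site-a.example.org/d/run42.dat")
lrc_b.add("run42.dat", "gsiftp://site-b.example.org/d/run42.dat")
rli = ReplicaLocationIndex()
rli.update_from(lrc_a)
rli.update_from(lrc_b)
print(rli.sites_for("run42.dat"))  # ['site-a', 'site-b']
print(lrc_a.lookup("run42.dat"))   # ['gsiftp://site-a.example.org/d/run42.dat']
```

Between updates the index can be stale, which is the relaxed consistency the slide mentions; the catalog at each site remains the authority.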


                                                                    92
                 Web Portals


   Tools for building web interfaces that
    provide access to system/application
    capabilities




                                             93
                               CHEF/Sakai
   The CompreHensive collaborativE Framework
    (CHEF) is a flexible environment for
    supporting distributed learning and
    collaborative work.
   CHEF is rapidly evolving into Sakai, with
    emphasis on JSR-168 and localization.
   CHEF is highly extensible with support for
    JetSpeed, Velocity, and other portal interfaces.
                                                                       94
Open Grid Computing Environment
            (OGCE)
                   Extends CHEF/Sakai
                    to include support
                    for Grid services
                    – MyProxy
                    – GridPort
                    – GT services (GRAM,
                      GridFTP, MDS, etc.)
                    – Java CoG
                   Provides a “quick
                    start” for building
                    Grid-enabled portals.


                                            95
    System Packaging/Distribution
   Distribution and Packaging Tools
    – Getting software distributed and installed
      uniformly throughout a broad collaboration
    – Tools that help create integrated
      distributions that work on a wide variety of
      systems
   Integrated Distributions
    – Customized distributions of common Grid
      software


                                                     96
        Grid Packaging Tools (GPT)
   GPT is the packaging used for the Globus Toolkit,
    but it exists independently.
    – Adds metadata to tar.gz files, putting more
      “intelligence” into build/install/config
    – Tools for developers and users
   Focus is multiplatform, tricky builds
    –   Works on most Unix systems
    –   Source & Binary packages
    –   Dependency management
    –   Relocatable installations (multiple installs)
    –   Setup (config) awareness
    –   Bundles (aggregations of packages)

                                                        97
         Virtual Data Toolkit (VDT)
   VDT is a grid middleware distribution focused on the
    needs of the NSF-funded GriPhyN and iVDGL projects,
    both of which are focused on Physics and Astronomy
    applications.
    – Ease of use (and installation) is key.
   Contents
    – Globus Toolkit & Condor, Condor-G
    – Virtual Data Tools (Chimera, Pegasus, RLS)
    – Utilities (GSI-OpenSSH, UberFTP, MonaLisa, MyProxy,
      KX.509, etc.)
   Uses PACMAN for distribution, install, configuration.
   Deployed on Grid3 (28 major U.S. sites)

                                                            98
            GT2 Evolution To GT4
   ALL of GT2 functionality is in GT4
   What happened to the GT2 key protocols?
    – Security: Adapting X.509 proxy certs to integrate
      with emerging WS standards
    – GRIP/LDAP: Abstractions integrated into WSRF as
      resource properties
    – GRAM: ManagedJobFactory and related service
      definitions
    – GridFTP: Server updated, but not WSRF-compliant,
      RFT fills that role
   Also rendering collective services in terms of
    WSRF: RFT, RLS, CAS, etc.


                                                          99
   A Bit of Background
    – Grid Architecture Overview
    – Working With Applications
    – Role of Globus
   Pieces of the Globus Toolkit
    – And an Example Application
   Globus Toolkit 4.0 and Futures




                                     100
                            Apache Axis
                       Web Services Container
   Good news for Java WS developers: GT4.0 works
    with standard Axis* and Tomcat*
    – GT provides Axis-loadable libraries, handlers
    – Includes useful behaviors such as inspection,
      notification, lifetime mgmt (WSRF)
    – Others implement GRAM, etc.
   Major Globus contributions to Apache
    – ~50% of WS-Addressing code
    – ~15% of WS-Security code
    – Many bug fixes
    – WSRF code a possible next contribution
[Figure: GT bits and App bits layered on Security, Addressing, and Axis.]

    * Modulo Axis and Tomcat release cycle issues
                                                               102
   WS Core Enables Frameworks:
    E.g., Resource Management
                 Applications of the framework
           (Compute, network, storage provisioning,
       job reservation & submission, data management,
                  application service QoS, …)


       WS-Agreement                WS Distributed Management
   (Agreement negotiation)          (Lifecycle, monitoring, …)


        WS-Resource Framework & WS-Notification (*)
    (Resource identity, lifetime, inspection, subscription, …)


                     Web services
    (WSDL, SOAP, WS-Security, WS-ReliableMessaging, …)


* An evolution of Open Grid Services Infrastructure (OGSI)       103
             WSRF & WS-Notification
   Naming and bindings (basis for virtualization)
     – Every resource can be uniquely referenced, and has one or more associated
       services for interacting with it
   Lifecycle (basis for fault resilient state management)
     – Resources created by services following factory pattern
     – Resources destroyed immediately or scheduled
   Information model (basis for monitoring & discovery)
     – Resource properties associated with resources
     – Operations for querying and setting this info
     – Asynchronous notification of changes to properties
   Service Groups (basis for registries & collective svcs)
     – Group membership rules & membership management
   Base Fault type
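
The resource model above (factory-created resources, immediate or scheduled destruction, properties, change notification) can be sketched in plain Python. The classes `Resource` and `Factory` and the key format are hypothetical; real WSRF expresses these ideas as web service operations, not as local objects.

```python
from typing import Any, Callable, Dict, List, Optional

Listener = Callable[[str, str, Any], None]


class Resource:
    """A uniquely keyed resource with properties and change notification."""

    def __init__(self, key: str, termination_time: Optional[float]) -> None:
        self.key = key
        self.termination_time = termination_time  # None means no scheduled end
        self.properties: Dict[str, Any] = {}
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def set_property(self, name: str, value: Any) -> None:
        changed = name not in self.properties or self.properties[name] != value
        self.properties[name] = value
        if changed:  # notify subscribers only on an actual change
            for listener in self._listeners:
                listener(self.key, name, value)


class Factory:
    """Creates resources and destroys them immediately or on schedule."""

    def __init__(self) -> None:
        self.resources: Dict[str, Resource] = {}
        self._counter = 0

    def create(self, termination_time: Optional[float] = None) -> Resource:
        self._counter += 1
        resource = Resource(f"resource-{self._counter}", termination_time)
        self.resources[resource.key] = resource
        return resource

    def destroy(self, key: str) -> None:
        self.resources.pop(key, None)

    def sweep(self, now: float) -> None:
        """Destroy every resource whose scheduled termination time has passed."""
        for key, res in list(self.resources.items()):
            if res.termination_time is not None and res.termination_time <= now:
                self.destroy(key)


events = []
factory = Factory()
job = factory.create(termination_time=100.0)
job.subscribe(lambda key, name, value: events.append((key, name, value)))
job.set_property("state", "Active")
job.set_property("state", "Active")  # unchanged, so no notification
factory.sweep(now=150.0)
print(events, list(factory.resources))  # [('resource-1', 'state', 'Active')] []
```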




                                                                                   104
                       Globus 4.0 Structure
[Figure: GT4 client/server structure]
   CLIENT
    – Your Java, C, and Python clients
    – Interoperable WS-I-compliant SOAP messaging
    – X.509 credentials = common authentication
   SERVER
    – Java services in Apache Axis, plus GT libraries and handlers
    – Python hosting, GT libraries (pyGlobus WS Core)
    – C services using GT libraries and handlers (C WS Core)
    – Components shown: your Java, Python, and C services, GRAM, RFT,
      Delegation, Index, Trigger, Archiver, CAS, OGSA-DAI, GTCP, GridFTP,
      MyProxy, SimpleCA, RLS, Pre-WS GRAM, Pre-WS MDS
                                                                                                                               105
              What’s New in
        GT 4.0 (January 31, 2005)
   For all:
    – Additions: data, security, execution, XIO, …
    – Improved packaging, testing, performance, usability,
      doc, standards compliance (phew)
    – WS components ready for broader use
   For the end user:
    – More complementary tools & solutions
    – C, Java, Python APIs; command line tools
   For the developer:
    – Java (Axis/Tomcat) hosting greatly improved
    – Python (pyGlobus) hosting for the first time




                                                             107
           GT4.0 Release Schedule

 Date         Stability level             Features       Public interfaces
                                          added after?   changed after?
 Aug ‘04      Pre-Alpha                   Yes            Yes
 Dec ‘04      Full-featured development   No             Yes, but only if
                                                         significant benefits
 Spring ‘05   Beta-quality development    No             No
 Summer ‘05   Stable release (FINAL)      No             No
                                                        108
       We’re Getting a Lot of Help,
      But Could Do with a Lot More

   Testing and feedback
    – Users, developers, deployers: plan to use the
      software now & provide feedback
    – Tell us what is missing, what performance you need,
      what interfaces & platforms, …
    – Ideally, also offer to help meet needs (-:
   Related software, solutions, documentation
    – Adapt your tools to use GT4
    – Develop new GT4-based components
    – Develop GT4-based solutions
    – Develop documentation components


                                                            109
         Documentation Overview
   Current document drafts are publicly
    accessible
    – http://www-unix.globus.org/toolkit/docs/development/docmap.html

   We need reviewers!
    – Suggestions for ways we might improve our
      documentation are appreciated
   We need contributors!
    – We are happy to collaborate to write new
      documents


                                                                        110
                 Testing Overview
   Nightly builds and tests
   Calls for Community Testing; current calls include:
    – Delegation Service, CAS, RFT, GridFTP, RLS, WS GRAM, WS
      MDS, Java WS Core
   TestGrid at USC/ISI
    – Stand up services for several weeks
    – Perform stress tests
   TestGrid at LBNL
    – Focus on WS Core performance and interoperability tests
   Performance and reliability testing is a major focus
    – Bill Allcock (allcock@mcs.anl.gov) is coordinating this effort
   We welcome new testing collaborations!

                                                                       111
              How to Get Involved
Become a GT4 Friend!
   Open group of people from various organizations
    working with GT4 pre-release code and documents
    – Reporting problems in code and documents
    – Contributing ideas, tests, documentation
    – Building GT4-enabled applications
   Weekly telephone calls
   Discussion list
    – To subscribe to the GT4 friends list, send an email to
      majordomo@globus.org which contains the words
      “subscribe gt4-friends” in the message body




                                                               112
              What’s This About
             a Globus Company?
   Univa was announced yesterday (Dec 13, 2004)
    – http://biz.yahoo.com/prnews/041213/nym040_1.html
    – http://www.univa.com
   Steve Tuecke is CEO
   Both Carl Kesselman and Ian Foster are in
    advisory roles
   Basic concept: “Red Hat Linux for Globus”

   This will NOT affect the GT open source policy
   This WILL allow greater industrial involvement and
    investment in Grids


                                                         113
Conclusions




              120
        Overall, We are Doing Well
   Communities & individuals are, increasingly, using
    the Grid to advance their science
   Broad consensus on many key architecture concepts,
    if not always their implementation
   Significant base of open source software, widely
    used in applications & infrastructure
   Service-oriented arch facilitates cooperation on
    software development & code reuse
   Grid standards are making a difference on a daily
    basis: e.g., GSI, GridFTP



                                                         121
     Overall, We are Doing Well (2)
   A real understanding of how to operate Grid
    infrastructures is emerging
   Production infrastructures are appearing and are
    being relied upon for real science
   Productive international cooperation is occurring at
    many levels
   A vibrant community has formed and shows no
    signs of slowing down
   Real connections have been formed between
    computer science & applications



                                                           122
                Lessons Learned
   The Globus Toolkit consists of the basic building
    blocks needed
   But to meet applications’ needs, more should be
    examined:
    – The Grid community (collectively) has many useful
      tools that can be reused!
    – System integration expertise is mandatory.
   OGSA, WSRF, and community standards (GGF,
    OASIS, W3C, IETF) are extremely important in
    getting all of this to work together.
   There’s much more to be done!



                                                          124
       We’re Getting a Lot of Help,
        But Could Do with More

   Testing and feedback
    – Users, developers, deployers: plan to use the
      software now & provide feedback
    – Tell us what is missing, what performance you need,
      what interfaces & platforms, …
    – Ideally, also offer to help meet needs (-:
   Related software, solutions, documentation
    – Adapt your tools to use GT4
    – Develop new GT4-based components
    – Develop GT4-based solutions
    – Develop documentation components


                                                            126
                    Summary
   Things that are working
    – Key standards are emerging
    – Open source infrastructure appearing
    – Success stories & experience gained
   Challenges that remain
    –   Complexity of some WS infrastructure
    –   Missing specifications
    –   Limited practical experience
    –   Progress being made on all fronts


                                               127
                      Thanks to:
   Ian Foster, Carl Kesselman and Steve Tuecke
   Bill Allcock, Kate Keahey, Lee Liming, Gregor von
    Laszewski, Mike Wilde @ Argonne
   Globus Alliance members at Argonne, U.Chicago,
    USC/ISI, Edinburgh, PDC, NCSA
   Other partners in Grid technology, application, &
    infrastructure projects
   And thanks to DOE, NSF (esp. NMI and TeraGrid
    programs), NASA, IBM, and the UK eScience Program
    for generous support



                                                        128
    General Globus Help and Support
   Globus-discuss list
    – discuss@globus.org
    – http://globus.org/about/contacts.html
   Bugzilla
    – Bugzilla.globus.org
   GT4 Information
    – gt4-friends@globus.org
    – Weekly telecons for early testers



                                              129
             For More Information
   Jennifer Schopf
    – jms@mcs.anl.gov
    – www.mcs.anl.gov/~jms
   Globus Alliance
    – www.globus.org
   Global Grid Forum
    – www.ggf.org
   GlobusWORLD 2005
    – Feb 7-11, Boston
   The Grid, 2nd Edition (book)
    – www.mkp.com/grid2


                                                 130

				