Grid Computing Concepts (Conceptos de Grid Computing)
CACIC 2007 - Chaco
Mg. Javier Echaiz
D.C.I.C. – U.N.S.
http://cs.uns.edu.ar/~jechaiz
je@cs.uns.edu.ar


Grid Computing Everywhere
• Business: sectors like financial services, industrial manufacturing, energy…
• Humanitarian works
• Research: Health, Aerospace, Astronomy, Finance…
• Government




Grid Computing
• The internet took 20 years to be taken seriously by business. By comparison the grid is happening far more rapidly. Tom Hawk, IBM.
• Insight Research says the worldwide market for grid technology and services is doubling every year and will reach $5 billion by 2008.
• Grid computing is just one of the technologies the UK government says, in one of its latest reports, should receive more support and funding. (December 17, 2003)


Grid Computing
• "We really do believe that grid computing is real," CEO of Hewlett-Packard Carly Fiorina said. "It is driving the R&D in our industry. For the first time our energy is focused on something else than building a killer app or a hot box. We are more focused on making system that combines the best of IT and business. Imagine what is possible." (September 11, 2003)
• "The Grid will be the major new direction for IT," said Geoff Brown, technical director for ATS Core Technologies at Oracle. (October 28, 2002)




DEFINITIONS: Grid?
ELECTRICITY GRID:
• A network of high-voltage transmission lines and connections that supply electricity from a number of generating stations to various distribution centres in a country or a region, so that no consumer is dependent on a single station.
UTILITY GRID:
• (Term) used of any network that serves a similar purpose for other services.
www.oed.com


DEFINITIONS: Grid?
GRID:
The Grid is envisaged to be ‘the computing and data management infrastructure that will provide the electronic underpinning for a global society in business, government, science and entertainment’
Berman, Fox and Hey (2003:9)




DEFINITIONS: Grid?
GRID:
A virtual information processing environment where the user has the ‘illusion’ of a seamless single-source computing power which is actually distributed.


Why should you care?
• Ian Foster explains why we should care about Grids in three points:
    – Future
    – Reality
    – Vision




Why should you care?
• Grid is a disruptive technology [Vision]
    – It ushers in a virtualized, collaborative, distributed world.
Two interrelated opportunities
    1) Enhance economy, flexibility, access by virtualizing computing resources
    2) Deliver entirely new capabilities by integrating distributed resources
Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003


Why should you care?
Virtualization
Application Virtualization
• Automatically connect applications to services
• Dynamic & intelligent provisioning
Infrastructure Virtualization
• Dynamic & intelligent provisioning
• Automatic failover
Source: The Grid: Blueprint for a New Computing Infrastructure (2nd Edition), 2004
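
The "automatic failover" bullet under Infrastructure Virtualization can be illustrated with a minimal sketch. This is not taken from the slides or from any grid toolkit: the node functions, the job name, and the `NodeUnavailable` exception are all hypothetical, and a real grid scheduler would do far more (health checks, retries, re-provisioning).

```python
from typing import Callable, Sequence


class NodeUnavailable(Exception):
    """Raised by a (hypothetical) node that cannot run the job."""


def run_with_failover(job: str, nodes: Sequence[Callable[[str], str]]) -> str:
    """Try each node in order and return the first successful result."""
    errors = []
    for node in nodes:
        try:
            return node(job)
        except NodeUnavailable as exc:
            # Automatic failover: note the failure and move on to the next node.
            errors.append(str(exc))
    raise RuntimeError(f"no node could run {job!r}: {errors}")


def broken_node(job: str) -> str:
    # Hypothetical node that is always down.
    raise NodeUnavailable("node-a is down")


def healthy_node(job: str) -> str:
    # Hypothetical node that always succeeds.
    return f"{job} finished on node-b"


result = run_with_failover("render-frame-42", [broken_node, healthy_node])
print(result)
```

The caller only names the job; which node actually ran it is hidden, which is the point of the virtualization layer described on the slide.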




Why should you care?
The real and specific problem that underlies the Grid concept is coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.
Source: “The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001


The Grid
“Resource sharing & coordinated problem solving in dynamic … virtual organizations”
1. Enable integration of distributed service & resources
2. Using general-purpose protocols & infrastructure
3. To achieve useful qualities of service
Source: “The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001
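
As a rough illustration of "resource sharing in dynamic, multi-institutional virtual organizations", the sketch below models a virtual organization as a set of member institutions plus the resources they choose to share and with whom. All class, method, and institution names are hypothetical; this is a toy model under those assumptions, not the architecture described in the cited paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    name: str
    owner: str  # institution that contributes the resource


@dataclass
class VirtualOrganization:
    name: str
    members: set = field(default_factory=set)   # participating institutions
    shared: dict = field(default_factory=dict)  # resource name -> (Resource, allowed institutions)

    def join(self, institution: str) -> None:
        self.members.add(institution)

    def leave(self, institution: str) -> None:
        # Dynamic membership: leaving also withdraws the institution's resources.
        self.members.discard(institution)
        self.shared = {
            n: (r, a) for n, (r, a) in self.shared.items() if r.owner != institution
        }

    def share(self, resource: Resource, allowed: set) -> None:
        # Only members may contribute, and the owner decides who may use the resource.
        if resource.owner not in self.members:
            raise ValueError(f"{resource.owner} is not a member of {self.name}")
        self.shared[resource.name] = (resource, set(allowed))

    def can_use(self, institution: str, resource_name: str) -> bool:
        if institution not in self.members or resource_name not in self.shared:
            return False
        _, allowed = self.shared[resource_name]
        return institution in allowed


vo = VirtualOrganization("demo-vo")
vo.join("lab-a")
vo.join("lab-b")
vo.share(Resource("cluster-1", owner="lab-a"), allowed={"lab-a", "lab-b"})
before = vo.can_use("lab-b", "cluster-1")  # lab-a is still a member
vo.leave("lab-a")
after = vo.can_use("lab-b", "cluster-1")   # lab-a's cluster was withdrawn
print(before, after)
```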




Why should you care?
Terminology
• Grid has strong links with “Utility Computing”, “Autonomic Computing” and “Service Oriented Architecture”.


Why should you care?
• Grid addresses pain points now [Reality]
    Grids are built not bought, but are delivering real benefits in commercial settings
    – Low utilization of enterprise resources
    – High cost of provisioning for peak demand
    – Inadequate resources prevent use of advanced applications
    – Lack of information integration
Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
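
The first two pain points (low utilization, high cost of provisioning for peak demand) can be made concrete with a small calculation. The demand figures below are invented purely for illustration and do not come from the slides; the sketch only shows why pooling capacity across departments whose peaks fall at different times can raise utilization.

```python
# Hypothetical demand (in CPUs) per time slot for three departments.
# These numbers are made up for illustration; they are not from the source.
demand = {
    "trading":   [80, 20, 10, 10],
    "risk":      [10, 90, 20, 10],
    "reporting": [10, 10, 30, 70],
}

slots = len(next(iter(demand.values())))
total_used = sum(sum(d) for d in demand.values())

# Siloed: every department buys enough capacity for its own peak.
siloed_capacity = sum(max(d) for d in demand.values())

# Pooled: shared capacity only has to cover the peak of the combined demand.
combined = [sum(d[s] for d in demand.values()) for s in range(slots)]
pooled_capacity = max(combined)

# Utilization = CPU-slots actually used / CPU-slots available.
siloed_util = total_used / (siloed_capacity * slots)
pooled_util = total_used / (pooled_capacity * slots)

print(f"siloed: {siloed_capacity} CPUs, utilization {siloed_util:.0%}")
print(f"pooled: {pooled_capacity} CPUs, utilization {pooled_util:.0%}")
```

With these assumed numbers the pooled setup needs half the capacity of the siloed one; the benefit shrinks as the departments' peaks coincide.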




Why should you care?
Early Commercial Applications
Leading adopters (Oct 2003) *
• Financial services: 31%
• Life sciences: 26%
• Manufacturing: 18%
Grid Services Market Opportunity 2005
[Figure: application areas by sector, all resting on a “Gridified” Infrastructure]
• Energy: Seismic Analysis, Reservoir Analysis
• Financial Services: Derivatives Analysis, Statistical Analysis, Portfolio Risk Analysis
• Manufacturing: Mechanical/Electronic Design, Process Simulation, Finite Element Analysis, Failure Analysis
• LS / Bioinformatics: Cancer Research, Drug Discovery, Protein Folding, Protein Sequencing
• Entertainment: Digital Rendering, Massive Multi-Player Games, Streaming Media
• Other: Web Applications, Weather Analysis, Code Breaking/Simulation, Academic
Sources: IDC, 2000 and Bear Stearns - Internet 3.0 - 5/01, Analysis by SAI
Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003


Why should you care?
Grid Deployment Strategies
• A range of excellent commercial & open source products for resource federation
    – Federate enterprise computing resources
    – Federate enterprise information resources
    – Globus Toolkit®: inter-enterprise sharing
• But, “Grids are built, not bought”
    – Integration with other enterprise systems is needed to deliver complete solution
• Start small & with well-defined ROI case
    – Grow based on experience
Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003




Data Grids for High Energy Physics
There is a “bunch crossing” every 25 nsecs.
There are 100 “triggers” per second.
Each triggered event is ~1 MByte in size.
1 TIPS is approximately 25,000 SpecInt95 equivalents.
[Figure: tiered data flow, from the Online System down to the institutes]
• ~PBytes/sec into the Online System
• Online System to Offline Processor Farm (~20 TIPS): ~100 MBytes/sec
• Offline Processor Farm to Tier 0, CERN Computer Centre: ~100 MBytes/sec
• Tier 0 to Tier 1: ~622 Mbits/sec or Air Freight (deprecated)
• Tier 1: France Regional Centre, Germany Regional Centre, Italy Regional Centre, FermiLab (~4 TIPS)
• Tier 1 to Tier 2: ~622 Mbits/sec
• Tier 2: Caltech (~1 TIPS) and four Tier2 Centres (~1 TIPS each)
• Tier 2 to Institutes: ~622 Mbits/sec
• Institutes (~0.25 TIPS), each with a physics data cache: ~1 MBytes/sec
Physicists work on analysis “channels”. Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.


Data Grids for High Energy Physics
Fastest particle accelerator: Large Hadron Collider
When completed in 2005, CERN's Large Hadron Collider will send protons and ions from hydrogen nuclei rushing through a 17-mile circular tunnel at speeds of up to
                                                                                            52,200,000 miles per                                                                                                                 Tier 4
                                                                                            hour.                                                                                   Physicist workstations

                                                                                                                                                                                                                        Image courtesy Harvey Newman, Caltech
           Image courtesy Christian Richters: Source:Wired News                                                                                            17                                                                                                                                                               18






Mathematicians Solve NUG30: Quadratic Assignment Problem

[Figure: four locations (Location 1 to Location 4) with facilities placed on them.]

The distances are:
  • d(1,2) = 22,
  • d(1,3) = 53,
  • d(2,3) = 40,
  • d(3,4) = 55.

The required flows between facilities are:
  • f(2,4) = 1,
  • f(1,4) = 2,
  • f(1,2) = 3,
  • f(3,4) = 4.

The permutation p corresponding to this graphical solution is ( 2, 1, 4, 3 ).

Mathematicians Solve NUG30

  • Looking for the solution to the NUG30 quadratic assignment problem
  • An informal collaboration of mathematicians and computer scientists
  • Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)

NUG30 Solution:
14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23

MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin
Source: Shawn McKee, The Grid: The Future of High Energy Physics Computing? January 7, 2002
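
The four-location example and the Condor-G figure can be checked with a few lines of Python. This is a minimal sketch, not the MetaNEOS solver: it only evaluates the cost of the one permutation given on the slide, using the standard quadratic assignment objective (flow between two facilities times the distance between the locations they are placed on). The function name `assignment_cost` is hypothetical, and reading p as "facility i goes to location p(i)" is an assumption; for ( 2, 1, 4, 3 ) the inverse reading gives the same result because this permutation is its own inverse.

```python
# Data copied from the slide. Pairs are unordered, so each key is a frozenset.
distances = {
    frozenset({1, 2}): 22,
    frozenset({1, 3}): 53,
    frozenset({2, 3}): 40,
    frozenset({3, 4}): 55,
}
flows = {
    frozenset({2, 4}): 1,
    frozenset({1, 4}): 2,
    frozenset({1, 2}): 3,
    frozenset({3, 4}): 4,
}

# Assumption: p[i - 1] is the location assigned to facility i.
p = (2, 1, 4, 3)


def assignment_cost(perm, flows, distances):
    """Sum flow(i, j) * distance(perm(i), perm(j)) over all listed flows."""
    total = 0
    for pair, flow in flows.items():
        i, j = sorted(pair)
        locations = frozenset({perm[i - 1], perm[j - 1]})
        # Only the distances listed on the slide are known; this lookup
        # raises KeyError for a permutation that needs an unlisted distance.
        total += flow * distances[locations]
    return total


cost = assignment_cost(p, flows, distances)

# Sanity check on the Condor-G figure: 3.46E8 CPU seconds spread over 7 days.
cpu_seconds = 3.46e8
wall_seconds = 7 * 24 * 3600
average_processors = cpu_seconds / wall_seconds

print(cost)                       # 419
print(round(average_processors))  # 572
```

The second number suggests that, on average, roughly 570 processors were busy at any moment during the week, a little over half of the reported peak of 1009.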





Network for Earthquake Engineering Simulation

  • NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
  • On-demand access to experiments, data streams, computing, archives, collaboration

NEESgrid: Argonne, Michigan, NCSA, UIUC, USC

NEES (Network for Earthquake Engineering Simulation) Collaboratory

[Figure: NEES collaboratory diagram. Labels: Remote Users (Faculty, Students, Practitioners); Instrumented Structures and Sites; Laboratory Equipment; High-Performance Network(s); Field Equipment; Curated Data Repository; Leading Edge Computation; Global Connections (fully developed FY 2005 – FY 2014); Remote Users (K-12 Faculty and Students); Laboratory Equipment (Faculty and Students); U.Nevada Reno.]

www.neesgrid.org






Building a NEES Collaboratory: What the User Wants

Secure, reliable, on-demand access to data, software, people, and other resources (ideally all via a Web Browser)

Slide courtesy of Ian Foster

How it Really Happens (A Simplified View)

[Figure: layered architecture diagram. Components shown: Web Browser; Web Portal; Simulation Tool; Data Viewer Tool; Chat Tool; Credential Repository; Certificate authority; Registration Service; Telepresence Monitor; Data Catalog; Compute Server (×2); Camera (×2); Database service (×3).]

Layers, left to right:
  • Users work with client applications
  • Application services organize VOs & enable access to other services
  • Collective services aggregate &/or virtualize resources
  • Resources implement standard access & management interfaces

Slide courtesy of Ian Foster






How it Really Happens (without Grid Software)

[Figure: the same layered diagram as the simplified view. Components shown: Web Browser; Web Portal; Simulation Tool; Data Viewer Tool; Chat Tool; Credential Repository; Certificate authority; Registration Service; Telepresence Monitor; Data Catalog; Compute Server A; Compute Server B; Camera (×2); Database service C; Database service D; Database service E.]

Component tally by source:
  • Application Developer: 10
  • Off the Shelf: 12
  • Globus Toolkit: 0
  • Grid Community: 0

Layers, left to right:
  • Users work with client applications
  • Application services organize VOs & enable access to other services
  • Collective services aggregate &/or virtualize resources
  • Resources implement standard access & management interfaces

Slide courtesy of Ian Foster

How it Really Happens (with Grid Software)

[Figure: the same layered diagram with Grid components in place. Components shown: Web Browser; CHEF; Simulation Tool; Data Viewer Tool; CHEF Chat Teamlet; MyProxy; Certificate Authority; Globus Index Service; Telepresence Monitor; Globus MCS/RLS; Globus GRAM in front of each Compute Server (×2); Camera (×2); OGSA-DAI in front of each Database service (×3).]

Component tally by source:
  • Application Developer: 2
  • Off the Shelf: 9
  • Globus Toolkit: 5
  • Grid Community: 3

Layers, left to right:
  • Users work with client applications
  • Application services organize VOs & enable access to other services
  • Collective services aggregate &/or virtualize resources
  • Resources implement standard access & management interfaces

Slide courtesy of Ian Foster
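
The tallies on the two "How it Really Happens" slides can be compared directly. The following is a small sketch, assuming the numbers next to "Application Developer", "Off the Shelf", "Globus Toolkit" and "Grid Community" count the components supplied by each source; the dictionary names and the `share` helper are hypothetical and only serve the illustration.

```python
# Counts as read from the two slides (assumed: number of components per source).
without_grid = {
    "Application Developer": 10,
    "Off the Shelf": 12,
    "Globus Toolkit": 0,
    "Grid Community": 0,
}
with_grid = {
    "Application Developer": 2,
    "Off the Shelf": 9,
    "Globus Toolkit": 5,
    "Grid Community": 3,
}


def share(counts, source):
    """Fraction of all components that come from the given source."""
    return counts[source] / sum(counts.values())


custom_before = share(without_grid, "Application Developer")
custom_after = share(with_grid, "Application Developer")

print(f"custom components without Grid software: {custom_before:.0%}")
print(f"custom components with Grid software:    {custom_after:.0%}")
```

Under that reading, the components the application developer has to write drop from 10 to 2, with Globus Toolkit and Grid community software covering most of the difference.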






The 13.6 TF TeraGrid: Computing at 40 Gb/s

[Figure: four TeraGrid sites, each with site resources and links to external networks.]
  • Caltech: site resources, HPSS
  • Argonne: site resources, HPSS
  • SDSC: 4.1 TF, 225 TB; site resources, HPSS
  • NCSA/PACI: 8 TF, 240 TB; site resources, UniTree

TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne
www.teragrid.org






IVDGL: International Virtual Data Grid Laboratory

Sloan Digital Sky Survey is the most ambitious astronomical survey project ever undertaken.
The survey will map in detail one-quarter of the entire sky, determining the positions and absolute brightnesses of more than 100 million celestial objects.
It will also measure the distances to more than a million galaxies and quasars.

U.S. PIs: Avery, Foster, Gardner, Newman, Szalay
www.ivdgl.org
Image courtesy of http://www.sdss.org/news/releases/20050111.yardstick.html

IVDGL: International Virtual Data Grid Laboratory

[Figure: map of IVDGL sites and links. Legend: Tier0/1 facility; Tier2 facility; Tier3 facility; 10 Gbps link; 2.5 Gbps link; 622 Mbps link; Other link.]

U.S. PIs: Avery, Foster, Gardner, Newman, Szalay
www.ivdgl.org





     Grid3: An Operational Grid

          • 28 sites (2100-2800 CPUs) & growing
          • 400-1300 concurrent jobs
          • 8 substantial applications + CS experiments
          • Running since October 2003

          [Map of Grid3 sites, including Korea]
          http://www.ivdgl.org/grid3          Slide by Ian Foster


     Grid Physics Network (GriPhyN)

          Enabling R&D for advanced data grid systems, focusing in particular on the Virtual Data concept.

          [Architecture diagram, top to bottom:
             Users: Production Team, Individual Investigator, Other Users
             Interactive User Tools
             Virtual Data Tools; Request Planning and Scheduling Tools; Request Execution Management Tools
             Resource Management Services; Security and Policy Services; Other Grid Services
             Transforms; Raw data source; Distributed resources (code, storage, computers, and network)
           Experiments shown: ATLAS, CMS, LIGO, SDSS]

          www.griphyn.org; Slide from C. Kesselman/Cal(IT)2 presentation




     Why should you care?

          [Figure: Vision, Reality, Future]

          • An open Grid is to your advantage [Future]
               – Standards are being defined now that will determine the future of this technology

          Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003


     Grid Vision, Marketing, and Reality

          [Figure: Vision, Reality, Future]

          • Vision
               – Computing & data resources can be shared like content on the Web
          • Marketing
               – Have we got a [Data, compute, knowledge, information, desktop, PC, enterprise, cluster, …] Grid for you!
          • Reality
               – Commercial products mostly noninteroperable
               – Open source tools offer de facto standards, but are also far from a complete solution

          Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003




     Standards Matter!

          [Figure: Vision, Reality, Future]

          • Open, standard protocols
               – Enable interoperability
               – Avoid product/vendor lock-in
               – Enable innovation/competition on end points
               – Enable ubiquity
          • In Grid space, must address how we
               – Describe, discover, & access resources
               – Monitor, manage, & coordinate resources
               – Account & charge for resources
            For many different types of resource

          Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003


     Open Grid Services Architecture

          [Figure: Vision, Reality, Future]

          • Define a service-oriented architecture …
               – the key to effective virtualization
          • … that addresses vital “Grid” requirements
               – AKA utility, on-demand, system management, collaborative computing
               – in particular, distributed service management
          • … building on Web services standards
               – extending those standards where needed

          “The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration”, Foster, Kesselman, Nick, Tuecke, 2002




     Latest Step Forward: WS-Resource Framework

          [Figure: Vision, Reality, Future]

          • A family of six Web services specifications
               – A design pattern to specify how to use Web services to access “stateful” components
               – Message-based publish-subscribe to Web services

          [Diagram labels: Properties, Notification, Lifetime, Groups, Faults, References]

          Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003


     WS-Resource Framework Completes Grid-WS Convergence

          [Figure: Vision, Reality, Future]

          [Convergence diagram:
             Grid: GT1, GT2, OGSI
             Web: HTTP, WSDL, WS-*, WSDL 2, WSDM
             Annotations: “Started far apart in apps & tech”, “Have been converging”; both lines meet at WSRF]

          The definition of WSRF means that Grid and Web communities can move forward on a common base

          Source: Ian Foster’s presentation on “The Grid”, COMDEX 2003, Las Vegas, Nevada USA, November 18, 2003
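
The WS-Resource Framework slide describes a design pattern for reaching “stateful” components through Web services, plus message-based publish-subscribe. The following is a minimal in-memory sketch of that idea only. It is not the WSRF specifications or the Globus Toolkit API, and every name in it (`StatefulResource`, `ResourceService`, `create`, `lookup`) is hypothetical. It assumes a stateless front end that receives a key (playing the role of a reference), finds the stateful resource behind it, and exposes three of the labels from the slide's diagram: properties, lifetime, and notification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StatefulResource:
    """Hypothetical stand-in for a stateful component behind a service."""
    key: str
    properties: Dict[str, object] = field(default_factory=dict)
    termination_time: float = float("inf")  # lifetime: resource expires at this time
    _subscribers: List[Callable[[str, object], None]] = field(default_factory=list)

    def get_property(self, name: str) -> object:
        return self.properties[name]

    def set_property(self, name: str, value: object) -> None:
        # Update state, then publish the change to every subscriber.
        self.properties[name] = value
        for notify in self._subscribers:
            notify(name, value)

    def subscribe(self, callback: Callable[[str, object], None]) -> None:
        self._subscribers.append(callback)

    def is_alive(self, now: float) -> bool:
        return now < self.termination_time


class ResourceService:
    """Stateless front end: every request names the resource it targets."""

    def __init__(self) -> None:
        self._resources: Dict[str, StatefulResource] = {}

    def create(self, key: str, termination_time: float, **properties: object) -> str:
        self._resources[key] = StatefulResource(key, dict(properties), termination_time)
        return key  # the key plays the role of a reference to the resource

    def lookup(self, key: str, now: float) -> StatefulResource:
        resource = self._resources.get(key)
        if resource is None or not resource.is_alive(now):
            self._resources.pop(key, None)  # destroy expired resources
            raise KeyError(f"no live resource for key {key!r}")  # plays the role of a fault
        return resource


# Usage: create a resource, subscribe to changes, then update a property.
service = ResourceService()
ref = service.create("job-42", termination_time=100.0, status="queued")
events = []
service.lookup(ref, now=0.0).subscribe(lambda name, value: events.append((name, value)))
service.lookup(ref, now=1.0).set_property("status", "running")
```

Time is passed in explicitly (`now`) so the lifetime behaviour is easy to follow. A real deployment would carry these interactions in Web service messages rather than direct method calls.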




     The Evolution of the GRID

          1980s     Parallel computing clusters: improved performance from tightly coupled clusters and data sharing
          1990s     Grid 1: Extend the advances in parallel computing to geographically distributed systems
          2000      Grid II: Grid is a platform for integrating loosely coupled applications: some components running in parallel and some for linking disparate resources largely developed in the serial von Neumann paradigm (storage, visualisation, a-d/d-a converters and sensors)


     The Evolution of the GRID

          Currently there are clusters of very powerful computing/communications systems:

          (i) Systems for acquiring and processing digital data (Amazon.com or Oracle clusters)
          (ii) Systems for analysing and visualising information (CERN’s Large Hadron Collider, protein synthesis systems)
          (iii) Systems for imaging, analysis and visualisation of distributed data (weather prediction, satellite-based military and civilian systems)
          (iv) Systems that can link sensors and make predictions from real-time information (military systems, video surveillance)




     The Evolution of the GRID

          Developments in networking technologies, operating systems, clustered databases, application services and device technologies have enabled developers to build systems with literally millions of distributed nodes for providing:

          • Web-based services for personal and commercial transactions.
          • Content delivery networks that can cache web pages seamlessly.
          • Wireless networks have spawned ad-hoc distributed systems that, when linked to wide-area networks, lead to a complex distributed system.

          Problems of efficiency, reliability, accessibility and security are not addressed in ‘global’ terms.


     The Evolution of the GRID

          [Timeline figure, 1960 to 2000 (axis marks: 1960, 1970, 1975, 1980, 1985, 1990, 1995, 2000).
             Computing: Mainframes, Minicomputers, XEROX PARC worm, PCs, Crays, Workstations, MPPs, WS Clusters, PC Clusters, HTC, P2P, PDAs, Grids
             Communication: Sputnik, ARPANET, Email, Ethernet, TCP/IP, Internet Era, IETF, HTML, Mosaic, WWW Era, W3C, XML, Web Services]

          Source: www.gridbus.org




     The Evolution of the GRID

          [Figure. Vertical axis: Performance + QoS.
             Horizontal axis: Personal Device, SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Grid.
             Administrative Barriers: Individual, Group, Department, Campus, State, National, Globe, Inter Planet?, Universe?]

          Source: www.gridbus.org


     The Evolution of the GRID

          Grid is being developed not only to make distributed resources available to end-users but also to co-ordinate such usage for sharing and aggregation of resources.




     The Evolution of the GRID

          • Moore’s law improvements in computing produce highly functional end-systems.
          • The internet and burgeoning wired and wireless networks provide widespread connectivity.
          • Changing modes of working and problem solving emphasise teamwork, computation.
          • Network growth produces dramatic changes in topology and geography.


     The Evolution of the GRID

          • The first generation involved proprietary solutions for sharing high-performance computing resources.
          • The second generation introduced middleware to cope with scale and heterogeneity.
          • The third generation introduced a service-oriented approach leading to commercial projects in addition to the scientific projects now collectively known as e-Science.






             The Evolution of the GRID
             • The first generation
                 – FAFNER, I-WAY
             • The second generation
                 – Technologies: Globus, Legion
                 – Distributed object systems (Jini and RMI, the Common Component Architecture Forum)
                 – Grid resource brokers and schedulers
                 – Grid portals
                 – Integrated systems
                 – Peer-to-Peer computing
             • The third generation
                 – Service-oriented architecture (web services, OGSA, Agents)
                 – Information aspects: relation with the World Wide Web
                 – Live information systems

             2nd Generation example: Typical Portal
             [Figure: portal diagram; surviving labels: Applications, Remote Services, Hibernate]






                The Evolution of the GRID
                Globus Toolkit® History

          [Figure: downloads per month from ftp.globus.org, 1997 to 2002 (vertical axis 0 to 30,000). Does not include downloads from NMI, UK eScience, EU Datagrid, IBM, Platform, etc.]

          Events annotated on the chart:
          • DARPA, NSF, and DOE begin funding Grid work
          • NASA begins funding Grid work, DOE adds support
          • The Grid: Blueprint for a New Computing Infrastructure published
          • GT 1.0.0 released
          • Early application successes reported
          • NSF & European Commission initiate many new Grid projects
          • Anatomy of the Grid paper released
          • Significant commercial interest in Grids
          • Physiology of the Grid paper released
          • GT 2.0 released

          Source: Ian Foster’s presentation on “The First 50 Years”, British Computer Society, Lovelace Medal Award Presentation, May, 2003

                Building blocks of the Grid
          • Networks
          • Computational ‘nodes’ on the Grid
          • Pulling it all together
          • Common infrastructure: standards








               GRID: Key Issues
               Resources:     Discovery, Allocation, Scheduling
               Availability:  Access, Security, Networks
               Efficiency:    Economy, Management Administration
               Hardware:      Computers, Services, Networks
               Application:   Development, Testing

               GRID: Key Issues: Sharing
               • A biochemist will be able to exploit 10,000 computers to screen 100,000 compounds in an hour
               • 1,000 physicists worldwide will be able to pool resources for petaop analyses of petabytes of data
               • A multidisciplinary analysis in aerospace that couples code and data in geographically distributed organisations may be possible
               • Civil engineers collaborate to design, execute, and analyse shake table experiments
               • Climate scientists will be able to visualise, annotate, and analyse terabyte simulation datasets






               GRID: Key Issues: Sharing
               Online Access to Scientific Instruments
               [Figure: Advanced Photon Source; stages shown: real-time collection, tomographic reconstruction, wide-area dissemination, archival storage, desktop & VR clients with shared controls]
               DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago

               MORE DEFINITIONS
               • Resource
               • Network protocol
               • Network enabled service
               • Application Programming Interface (API)
               • Software Development Kit (SDK)
               • Syntax






          MORE DEFINITIONS: Resource
          • An entity that is to be shared.
                – E.g., computers, storage, data, software.
          • Does not have to be a physical entity.
                – E.g., Condor pool, distributed file system, …
          • Defined in terms of interfaces, not devices.
                – E.g., schedulers such as LSF and PBS define a compute resource.
                – Open/close/read/write define access to a distributed file system, e.g. NFS, AFS, DFS.

          MORE DEFINITIONS: Network protocol
          • A formal description of message formats and a set of rules for message exchange.
                – Rules may define the sequence of message exchanges.
                – A protocol may define a state change in an endpoint, e.g. a file system state change.
          • Good protocols are designed to do one thing.
                – Protocols can be layered.
          • Examples of protocols:
                – IP, TCP, TLS (was SSL), HTTP, Kerberos
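The point that a resource is "defined in terms of interfaces, not devices" can be illustrated with a small sketch. Everything named here (`FileResource`, `InMemoryFiles`, `copy`) is hypothetical and not part of any Grid toolkit; the in-memory backend merely stands in for a real distributed file system such as NFS or AFS.

```python
from typing import Protocol


class FileResource(Protocol):
    """Hypothetical interface: anything offering these four calls counts as a file resource."""

    def open(self, name: str) -> int: ...
    def read(self, handle: int) -> bytes: ...
    def write(self, handle: int, data: bytes) -> None: ...
    def close(self, handle: int) -> None: ...


class InMemoryFiles:
    """Toy backend (an assumption for illustration); a real one might wrap NFS or AFS."""

    def __init__(self) -> None:
        self._data: dict[str, bytearray] = {}
        self._handles: dict[int, str] = {}
        self._next = 0

    def open(self, name: str) -> int:
        # Create the file on first use and hand back a fresh handle.
        self._data.setdefault(name, bytearray())
        self._next += 1
        self._handles[self._next] = name
        return self._next

    def read(self, handle: int) -> bytes:
        return bytes(self._data[self._handles[handle]])

    def write(self, handle: int, data: bytes) -> None:
        self._data[self._handles[handle]].extend(data)

    def close(self, handle: int) -> None:
        del self._handles[handle]


def copy(resource: FileResource, src: str, dst: str) -> None:
    """Client code depends only on the interface, never on the device behind it."""
    s, d = resource.open(src), resource.open(dst)
    resource.write(d, resource.read(s))
    resource.close(s)
    resource.close(d)
```

Any other class that provides the same four methods could be passed to `copy` unchanged, which is the sense in which the interface, rather than the hardware, defines the resource.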






          MORE DEFINITIONS: Network enabled services
          • Implementation of a protocol that defines a set of capabilities.
                – Protocol defines interaction with service.
                – All services require protocols.
                – Not all protocols are used to provide services (e.g. IP, TLS).
          • Examples: FTP and Web servers.

          MORE DEFINITIONS: Application Programming Interface
          • A specification for a set of routines to facilitate application development.
          • Spec often language specific (or IDL).
                – Routine name, number, order and type of arguments; mapping to language constructs
                – Behaviour or function of routine
          • Examples
                – GSS API (security), MPI (message passing)
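To make the relationship between a protocol and a network enabled service concrete, here is a minimal sketch of a service that implements a made-up protocol. The protocol (verbs `HELLO`, `PUT`, `GET`), the class name `KeyValueService`, and the reply strings are all assumptions for illustration; no real Grid protocol is being reproduced. It shows the three ingredients named above: a message format, a sequencing rule, and a state change in the endpoint.

```python
class KeyValueService:
    """Toy service endpoint implementing a hypothetical three-message protocol.

    Message format:  "<VERB> [<key> [<value>]]", one message per string.
    Sequencing rule: HELLO must be accepted before any PUT or GET.
    State change:    PUT modifies the endpoint's store.
    """

    def __init__(self) -> None:
        self.store: dict[str, str] = {}
        self.greeted = False

    def handle(self, message: str) -> str:
        parts = message.strip().split(" ", 2)
        verb = parts[0].upper()
        if verb == "HELLO":
            self.greeted = True
            return "OK"
        if not self.greeted:
            # Enforce the sequencing rule of the protocol.
            return "ERR expected HELLO first"
        if verb == "PUT" and len(parts) == 3:
            self.store[parts[1]] = parts[2]  # endpoint state change
            return "OK"
        if verb == "GET" and len(parts) == 2:
            value = self.store.get(parts[1])
            return f"VALUE {value}" if value is not None else "ERR not found"
        return "ERR malformed message"
```

The protocol is the docstring; the service is the class that honours it. A second, independent implementation that follows the same rules would interoperate with the same clients.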






          MORE DEFINITIONS: Software Development Kit (SDK)
          • A particular instantiation of an API.
          • An SDK consists of libraries and tools.
                – Provides an implementation of the API specification.
          • Can have multiple SDKs for an API.
          • Examples of SDKs.
                – MPICH, Motif Widgets.

          MORE DEFINITIONS: Syntax
          • Rules for encoding information, e.g.
                – XML, Condor ClassAds, Globus RSL.
          • Distinct from protocols.
                – One syntax may be used by many protocols.
          • Syntaxes may be layered.
                – E.g., Condor ClassAds -> XML -> ASCII
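The layering of syntaxes (a record encoded as XML, which is in turn encoded as ASCII) can be sketched as follows. This is only an illustration under stated assumptions: the attribute/value dictionary stands in for a ClassAd and does not follow real Condor ClassAd syntax, and the element names `ad` and `attr` are invented.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record: dict[str, str]) -> str:
    """Encode attribute/value pairs (a stand-in for a ClassAd) as XML text."""
    root = ET.Element("ad")
    for name, value in record.items():
        ET.SubElement(root, "attr", name=name).text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_ascii(xml_text: str) -> bytes:
    """Encode the XML text as ASCII bytes, the lowest layer in this sketch.

    Raises UnicodeEncodeError if the text contains non-ASCII characters.
    """
    return xml_text.encode("ascii")


def ascii_to_record(payload: bytes) -> dict[str, str]:
    """Reverse both layers: ASCII bytes -> XML -> attribute/value pairs."""
    root = ET.fromstring(payload.decode("ascii"))
    return {attr.get("name"): attr.text or "" for attr in root.findall("attr")}
```

Each function knows about only one layer, so the XML layer could be swapped for another syntax without touching the ASCII step, and the same encoded bytes could be carried by many different protocols.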








          Grid Architectures and Technologies

          Contents
          • History and Evolution of Grid
          • Introduction to Grid Architecture
          • Key Components - Resource infrastructure
          • Services in the Web and the Grid
          • Technologies: Globus, Condor






          History and Evolution of Grid
          The emergence of virtual organisations
          [Figure: picture from Foster I. et al (2003)]

          History and Evolution of Grid
          The emergence of virtual organisations
          • Sharing resources:
                – The degree of service availability – which resources will be shared.
                – The authorization of the shared resource – who will be permitted.
                – The type of the relationship - Peer to peer.
                – A mechanism to understand the nature of the relationship.
                – The possible ways the resource will be used (memory, computing power, etc.).
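The sharing questions a virtual organisation has to answer (which resources are shared, who is permitted, and how the resource may be used) can be written down as data plus one check. This is a minimal sketch; `SharingRule`, `is_allowed`, and the example names are hypothetical and do not correspond to any Grid middleware API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharingRule:
    resource: str             # which resource is shared
    members: frozenset[str]   # who is permitted to use it
    uses: frozenset[str]      # how it may be used, e.g. "compute", "storage"


def is_allowed(rules: list[SharingRule], user: str, resource: str, use: str) -> bool:
    """Return True if any rule grants `user` this `use` of `resource`."""
    return any(
        r.resource == resource and user in r.members and use in r.uses
        for r in rules
    )


# Example policy for an imaginary virtual organisation.
rules = [
    SharingRule("cluster-a", frozenset({"alice", "bob"}), frozenset({"compute"})),
    SharingRule("archive-1", frozenset({"alice"}), frozenset({"storage"})),
]
```

With these rules, `is_allowed(rules, "bob", "cluster-a", "compute")` holds while `is_allowed(rules, "bob", "archive-1", "storage")` does not; a real VO would add authentication and negotiation on top of such a table.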






          Introduction to Grid Architecture
          What is Architecture?
          • Design, the way components fit together. The term is used particularly of processors, both individual and in general.

          Introduction to Grid Architecture
          Why discuss Architecture?
          • Descriptive
                – Provide a common vocabulary for use when describing Grid systems.
          • Guidance
                – Identify key areas in which services are required.
          • Prescriptive
                – Define standard protocols and APIs to facilitate creation of interoperable Grid systems and portable applications.






          Introduction to Grid Architecture
          The nature of Grid Architecture?
          • A grid architecture identifies fundamental system components, specifies the purpose and function of these components, and indicates how these components interact.

          Introduction to Grid Architecture
          The nature of Grid Architecture?
          • Grid protocols allow VO users and resources to negotiate, establish, manage and exploit sharing relationships.
                – Interoperability is a fundamental concern
                – The protocols are critical to interoperability
                – Services are important
                – We need to consider APIs and SDKs






          Introduction to Grid Architecture
          Grid Architecture Requirements
          • The components are
                – numerous
                – owned and managed by different, potentially mutually distrustful organisations and individuals
                – may be potentially faulty
                – have different security requirements and policies
                – heterogeneous
                – connected by heterogeneous, multilevel networks
                – have different resource management policies
                – are likely to be geographically separated

          Key Components
          The Hourglass Model
          [Figure: hourglass diagram. Labels on the hourglass: Applications, Diverse global services, Core Services and Abstractions (e.g. TCP, HTTP), Local OS. Labels beside it: User Applications, Collective services, Resource and Connectivity protocol, Fabric.]







          Key Components
          Layered Grid Architecture (By Analogy to Internet Architecture)
          [Figure: Grid layers drawn beside the Internet Protocol Architecture (Application, Transport, Internet, Link)]
          • Application
          • Collective (“Coordinating multiple resources”): ubiquitous infrastructure services, app-specific distributed services
          • Resource (“Sharing single resources”): negotiating access, controlling use
          • Connectivity (“Talking to things”): communication (Internet protocols) & security
          • Fabric (“Controlling things locally”): access to, & control of, resources

          Key Components
          Layered Grid Architecture: Fabric Layer
          • Just what you would expect: the diverse mix of resources that may be shared
                – Individual computers, Condor pools, file systems, archives, metadata catalogs, networks, sensors, etc.
          • Defined by interfaces, not physical characteristics







Key Components
Layered Grid Architecture: Connectivity Layer

• Communication
    – Internet protocols: IP, DNS, routing, etc.
• Security: Grid Security Infrastructure (GSI)
    – Uniform authentication, authorization, and message protection mechanisms in multi-institutional setting
    – Single sign-on, delegation, identity mapping
    – Public key technology, SSL, X.509, GSS-API
    – Supporting infrastructure: Certificate Authorities, certificate & key management, …
    – No password is ever exchanged
GSI: www.gridforum.org/security

Key Components
Layered Grid Architecture: Resource Layer

• This layer defines protocols for the secure negotiation, initiation, monitoring, control, accounting, and payment of sharing operations on individual resources.
    – Information Protocols (inform about the structure and state of the resource)
    – Management Protocols (negotiate access to a shared resource)






Key Components
Layered Grid Architecture: Resource Layer

• Grid Resource Allocation Mgmt (GRAM)
    – Remote allocation, reservation, monitoring, control of compute resources
• GridFTP protocol (FTP extensions)
    – High-performance data access & transport
• Grid Resource Information Service (GRIS)
    – Access to structure & state information (MDS)
• Network reservation, monitoring, control
• All built on connectivity layer: GSI & IP
GridFTP: www.gridforum.org
GRAM, GRIS: www.globus.org

Key Components
Layered Grid Architecture: Collective Layer

• Coordinating multiple resources
• Contains protocols and services that capture interactions among a collection of resources
• It supports a variety of sharing behaviours without placing new requirements on the resources being shared
• Sample services: directory services, co-allocation, brokering and scheduling services, data replication services, workload management services, cooperative services







Key Components
Layered Grid Architecture: Collective Layer

• Index servers aka metadirectory services
    – Custom views on dynamic resource collections assembled by a community
• Resource brokers (e.g., Condor Matchmaker)
    – Resource discovery and allocation
• Replica catalogs
• Replication services
• Co-reservation and co-allocation services
• Workflow management services
• Etc.
Condor: www.cs.wisc.edu/condor

Key Components
Layered Grid Architecture: Applications Layer

• User applications operate within the virtual organization (VO) environment
• Applications are constructed by calling upon services defined at any layer
• Each layer is well defined by protocols that provide access to its services
• Well-defined APIs also exist to work with these services
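The resource discovery and allocation role of a broker can be sketched in a few lines. This is only an illustration of the idea, not Condor's matchmaking mechanism: the `Resource` and `Job` classes, their fields, and the "least loaded machine with enough memory" policy are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """A resource advertisement, as an index server might list it (hypothetical)."""
    name: str
    memory_mb: int
    cpu_load: float  # 0.0 (idle) to 1.0 (saturated)


@dataclass
class Job:
    """A job request with a minimum memory requirement (hypothetical)."""
    name: str
    min_memory_mb: int


def match(job, resources):
    """Return the least-loaded resource that satisfies the job, or None."""
    candidates = [r for r in resources if r.memory_mb >= job.min_memory_mb]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.cpu_load)


pool = [
    Resource("node-a", memory_mb=2048, cpu_load=0.70),
    Resource("node-b", memory_mb=8192, cpu_load=0.20),
    Resource("node-c", memory_mb=16384, cpu_load=0.50),
]
chosen = match(Job("render", min_memory_mb=4096), pool)
print(chosen.name if chosen else "no match")
```

Discovery corresponds to filtering the advertised resources, and allocation to picking one of the survivors. A real broker would also handle reservations, policies, and failures.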






Key Components
Grid architecture in practice

[Figure]

Key Components
Where Are We With Architecture?

• No “official” standards exist
• But:
    – Globus Toolkit™ has emerged as the de facto standard for several important Connectivity, Resource, and Collective protocols
    – Technical specifications are being developed for architecture elements: e.g., security, data, resource management, information
    – Internet drafts submitted in security area






Services in the Web and the Grid: Web Services

• Define a technique for describing software components to be accessed, methods for accessing these components, and discovery methods that enable the identification of relevant service providers.
• A distributed computing technology (like CORBA, RMI…).
• They allow us to create loosely coupled client/server applications.

Services in the Web and the Grid: Web Services - Advantages

• Platform and language independent, because they use XML.
• Most use HTTP for transmitting messages (such as the service request and response)






Services in the Web and the Grid: Web Services - Disadvantages

• Overhead: Transmitting data in XML is not as efficient as a binary encoding.
• Lack of versatility: They allow only very basic forms of service invocation (Grid services make up for this lack of versatility).
    – Stateless: They can’t remember what you have done from one invocation to another
    – Non-transient: They outlive all their clients.

Services in the Web and the Grid: Web Services Architecture

• Find Web services which meet certain requirements (Universal Description, Discovery and Integration, UDDI)
• Services describe their own properties and methods (Web Services Description Language, WSDL)
• Format of requests (client) and responses (server) (Simple Object Access Protocol, SOAP)
• Message transfer protocol (Hypertext Transfer Protocol, HTTP)
Picture from Globus 3 Tutorial Notes www.globus.org






Services in the Web and the Grid: Invoking A Typical Web Service

[Figure: invoking a typical Web service. Picture from Globus 3 Tutorial Notes]

Services in the Web and the Grid: Web Service Addressing

• URI: Uniform Resource Identifier
• For our purposes, URIs and URLs are practically the same thing.
    – Example: http://webservices.mysite.com/weather/ar/WeatherService
• Web service URIs are not meant to be opened in a web browser; they are for software.
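To make the request format concrete, here is a minimal sketch that builds a SOAP 1.1 request envelope addressed to the example URI above. The operation name `getTemperature`, its `city` parameter, and the `urn:example:weather` namespace are hypothetical; the slides do not describe the WeatherService interface. Nothing is sent over the network: in a real invocation the envelope would be POSTed to the service URI over HTTP.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:weather"  # hypothetical service namespace
SERVICE_URI = "http://webservices.mysite.com/weather/ar/WeatherService"


def build_request(operation, params):
    """Build a SOAP 1.1 request envelope and return it as UTF-8 bytes."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


request = build_request("getTemperature", {"city": "Resistencia"})
print("POST", SERVICE_URI)
print(request.decode("utf-8"))
```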






Services in the Web and the Grid: Web Service Application

[Figure: Picture from Globus 3 Tutorial Notes]

Services in the Web and the Grid: What is a Grid Service?

• It provides a set of well-defined interfaces and follows specific conventions.
• It is a web service with improved characteristics and services.
    – Improvements:
        • Potentially Transient
        • Stateful
        • Delegation
        • Lifecycle management
        • Service Data
        • Notifications
• Examples: computational resources, programs, databases…






Services in the Web and the Grid: Factories

[Figure: Picture from Globus 3 Tutorial Notes]

Services in the Web and the Grid: GSH & GSR

• GSH: Grid Service Handle (URI)
    – Unique
    – Shows the location of the service
• GSR: Grid Service Reference
    – Describes how to communicate with the service
    – As we will use SOAP, our GSR will be a WSDL file.
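The factory idea and the role of a handle can be illustrated with a small in-process sketch. It is not the OGSI or Globus Toolkit API: the class names, the `create_service` and `resolve` methods, and the `http://example.org/counter` URI are all invented here. Each instance is stateful (it remembers a running total), potentially transient (it can be destroyed), and identified by a unique handle that plays the role of a GSH.

```python
import itertools


class CounterService:
    """A stateful, transient service instance (hypothetical example)."""

    def __init__(self, handle):
        self.handle = handle  # plays the role of the GSH: a unique URI
        self.total = 0        # state kept between invocations
        self.alive = True

    def add(self, amount):
        if not self.alive:
            raise RuntimeError("instance has been destroyed")
        self.total += amount
        return self.total

    def destroy(self):
        """Lifecycle management: the instance need not outlive its client."""
        self.alive = False


class CounterFactory:
    """Creates service instances on request and hands back their handles."""

    def __init__(self, base_uri):
        self.base_uri = base_uri
        self._ids = itertools.count(1)
        self.instances = {}

    def create_service(self):
        handle = f"{self.base_uri}/instance/{next(self._ids)}"
        self.instances[handle] = CounterService(handle)
        return handle

    def resolve(self, handle):
        """Stand-in for turning a handle into something a client can call."""
        return self.instances[handle]


factory = CounterFactory("http://example.org/counter")  # hypothetical URI
h1 = factory.create_service()
h2 = factory.create_service()
factory.resolve(h1).add(5)
factory.resolve(h1).add(3)
print(h1, factory.resolve(h1).total)
print(h2, factory.resolve(h2).total)
```

Two clients asking the factory for a service get separate instances with separate state, which is the contrast with the stateless, non-transient web services described earlier.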






Services in the Web and the Grid
Open Grid Services Architecture

OGSA: Definition
• OGSA defines what Grid services are, what they should be capable of, and what type of technologies they should be based on.
• OGSA does not give a detailed technical specification. Grid services are described with WSDL.

Services in the Web and the Grid
Open Grid Services Infrastructure

OGSI: Definition
• It is a formal and technical specification of the concepts described in OGSA.
• The Globus Toolkit 3 is an implementation of OGSI.
• Some other implementations are OGSI::Lite (Perl) and the UNICORE OGSA demonstrator from the EU GRIP project.
• The OGSI specification defines grid services and builds upon web services.






Services in the Web and the Grid: OGSI

• OGSI creates an extension model for WSDL called GWSDL (Grid WSDL). The reasons are:
    – Interface inheritance
    – Service Data (for expressing state information)
• Components:
    – Lifecycle
    – State management
    – Service Groups
    – Factory
    – Notification
    – HandleMap

Services in the Web and the Grid
Object Diagram of a Grid Service

[Figure: Picture from IBM DeveloperWorks]






Services in the Web and the Grid: Service Data Structure

    <wsdl:definitions xmlns:tns="abc"
        targetNamespace="mynamespace">
      <gwsdl:portType name="AbstractSearchEngine">
        <wsdl:operation name="search" />
        <!-- ... -->
        <sd:serviceData name="cachedURL"
            type="tns:cachedURLType"
            mutability="mutable"
            nillable="true"
            maxOccurs="1"
            minOccurs="0"
            modifiable="true"/>
      </gwsdl:portType>
    </wsdl:definitions>

Services in the Web and the Grid: OGSA, OGSI, GT3

[Figure: Picture from Globus 3 Tutorial Notes]
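As a rough illustration of how a client might read the service data declarations out of such a description, the sketch below parses a self-contained copy of the slide's fragment. The slide does not declare the `wsdl`, `gwsdl`, and `sd` prefixes, so the `urn:example:...` namespace URIs are placeholders assumed for this example, and `service_data` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs, assumed for this sketch only.
NS = {
    "wsdl": "urn:example:wsdl",
    "gwsdl": "urn:example:gwsdl",
    "sd": "urn:example:sd",
}

DOC = """
<wsdl:definitions xmlns:wsdl="urn:example:wsdl"
                  xmlns:gwsdl="urn:example:gwsdl"
                  xmlns:sd="urn:example:sd"
                  xmlns:tns="abc" targetNamespace="mynamespace">
  <gwsdl:portType name="AbstractSearchEngine">
    <wsdl:operation name="search"/>
    <sd:serviceData name="cachedURL" type="tns:cachedURLType"
                    mutability="mutable" nillable="true"
                    maxOccurs="1" minOccurs="0" modifiable="true"/>
  </gwsdl:portType>
</wsdl:definitions>
"""


def service_data(xml_text):
    """Map each serviceData element name to its attributes."""
    root = ET.fromstring(xml_text.strip())
    found = {}
    for port_type in root.findall("gwsdl:portType", NS):
        for element in port_type.findall("sd:serviceData", NS):
            found[element.get("name")] = dict(element.attrib)
    return found


for name, attrs in service_data(DOC).items():
    print(name, attrs["type"], attrs["mutability"])
```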






Technologies: Globus

Globus:
• Goals
• Layered Architecture
• Globus Services
• Limitations

Technologies: Globus Goals

• Low-level toolkit providing basic mechanisms such as communication, authentication, network information, and data access
• Long term goal – build an Adaptive Wide Area Resource Environment (AWARE)
• Not intended for application use, instead used to construct higher-level components






Technologies
Core Globus Services

• Communication Infrastructure (Nexus)
• Information Services (MDS)
• Remote File and Executable Management (GASS, RIO, and GEM)
• Resource Management (GRAM)
• Security (GSS)

Technologies
Communications (Nexus)

• 5 basic abstractions
    – Nodes
    – Contexts (Address spaces)
    – Threads
    – Communication links
    – Remote service requests
• Startpoints and Endpoints
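The relationship between startpoints, endpoints, communication links, and remote service requests can be pictured with a toy, single-process model. This is not the Nexus API: the class and method names are invented for illustration, and a real remote service request crosses address spaces and runs asynchronously, whereas this sketch simply calls the handler directly.

```python
class Endpoint:
    """Receiving end of a communication link, with a table of handlers."""

    def __init__(self, handlers):
        self.handlers = handlers  # handler name -> callable


class Startpoint:
    """Sending end; binding it to an endpoint forms a communication link."""

    def __init__(self):
        self.endpoint = None

    def bind(self, endpoint):
        self.endpoint = endpoint

    def remote_service_request(self, handler_name, payload):
        """Deliver the payload to the named handler at the bound endpoint."""
        if self.endpoint is None:
            raise RuntimeError("startpoint is not bound to an endpoint")
        self.endpoint.handlers[handler_name](payload)


inbox = []
endpoint = Endpoint({"store": inbox.append})
startpoint = Startpoint()
startpoint.bind(endpoint)
startpoint.remote_service_request("store", b"hello")
print(inbox)
```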






Technologies
Information Services

Metacomputing Directory Service - MDS
• Required information
    – Configuration details about resources
        • Amount of memory
        • CPU speed
    – Performance information
        • Network latency
        • CPU load
    – Application specific information
        • Memory requirements

Technologies
Remote file and executable management

• Global Access to Secondary Storage (GASS)
    – Basic access to remote files; supported operations include remote read, remote write and append
• Remote I/O (RIO)
    – Provides a distributed implementation of the MPI-IO parallel I/O API
• Globus Executable Management (GEM)
    – Enables loading and executing a remote file through the GRAM resource manager






Technologies
Resource management

• Resource Specification Language (RSL)
    – Provides a method for exchanging information about resource requirements between all of the components in the Globus resource management architecture
• Globus Resource Allocation Manager (GRAM)
    – Provides a standardized interface to all of the various local resource management tools that a site might have in place
    [Figure: GRAM layered over LSF, EASY-LL, NQE]
• DUROC
    – Provides a co-allocation service
    – It coordinates a single request that may span multiple GRAMs.

Technologies
Authentication Model

• Authentication is done on a “user” basis
    – Single authentication step allows access to all grid resources
• No communication of plaintext passwords
• Most sites will use conventional account mechanisms
    – You must have an account on a resource to use that resource
• Sites may use “generic” Grid accounts
    – Not common, but Globus can deal with it
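To give a feel for what a resource specification looks like, the sketch below renders a dictionary of requirements as a conjunctive RSL-style string of the form `&(attribute=value)...`. The helper name `to_rsl` is hypothetical and the quoting rule is deliberately simplified; real RSL has more operators and escaping rules than are shown here.

```python
def to_rsl(requirements):
    """Render a dict as a conjunctive RSL-style string: &(key=value)(key=value)..."""
    parts = []
    for key, value in requirements.items():
        text = str(value)
        if " " in text:
            text = '"' + text + '"'  # simplified: quote values containing spaces
        parts.append(f"({key}={text})")
    return "&" + "".join(parts)


print(to_rsl({"executable": "/bin/hostname", "count": 2}))
```

A string like this is what travels between the components of the resource management architecture: a client writes it, and a resource manager such as GRAM translates it for the local scheduler.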






                                       Technologies                                                                  Technologies
                         Grid Security Infrastructure

          • Each user has:
                – a Grid user id (called a Subject Name)
                – a private key (like a password)
                – a certificate signed by a Certificate
                  Authority (CA)
          • A “gridmap” file at each site specifies
            grid-id to local-id mapping

                         Certificate Based Authentication

          • User has a certificate, signed by a trusted
            “certificate authority” (CA)
                – Certificate contains user name and public key
                – Globus project operates a CA
          • A user proves their identity by encrypting a
            message with the private key; if the public key
            can decrypt it, the user indeed holds the private key.
          • No password is ever exchanged.
                                                                                105                                                                        106
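The grid-id to local-id mapping kept in the gridmap file can be sketched as a small lookup. This is a minimal illustration, not the Globus implementation: it assumes each line holds a quoted Subject Name followed by one or more comma-separated local accounts, and the subject names, account names, and the helper functions `parse_gridmap` and `local_account` are hypothetical.

```python
import shlex

# Hypothetical gridmap content; the subject names and accounts are made up.
SAMPLE_GRIDMAP = '''
# "Subject Name" local-account
"/O=Grid/OU=Example/CN=Jane Doe" jdoe
"/O=Grid/OU=Example/CN=Build Service" grid01,grid02
'''

def parse_gridmap(text):
    """Return {subject name: [local accounts]} for each non-comment line."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)  # honours the quotes around the subject name
        if len(parts) != 2:
            raise ValueError(f"malformed gridmap line: {raw!r}")
        subject, accounts = parts
        mapping[subject] = accounts.split(",")
    return mapping

def local_account(mapping, subject):
    """First local account mapped to a subject name, or None if unmapped."""
    accounts = mapping.get(subject)
    return accounts[0] if accounts else None

GRIDMAP = parse_gridmap(SAMPLE_GRIDMAP)
```

A site would consult such a table after authentication succeeds: an authenticated Subject Name with no entry simply has no local account to run under.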






                                 Technologies                                                                        Technologies
                          “Logging” onto the Grid

         • To run programs, authenticate to Globus:
               % grid-proxy-init
               Enter PEM pass phrase: ******
         • Creates a temporary, short-lived credential
           for use by our computations
               Private key is not exposed past grid-proxy-init

                          Simple job submission

         • globus-job-run provides a simple RSH
           compatible interface
               % grid-proxy-init
               Enter PEM pass phrase: *****
               % globus-job-run host program [args]


                                                                                107                                                                        108






                                       Technologies                                                                                 Technologies
                                         Limitations

          • Program needs to be compiled on
            remote machine
          • Gatekeepers usually run as root
          • Need to specify filenames as URLs
          • Need to specify machine names when
            executing programs

                                         Condor

          CONDOR:
          • It is a specialized job and resource
            management system. It provides:
                – Job management mechanism
                – Scheduling
                – Priority scheme
                – Resource monitoring
                – Resource management
                                                                                               109                                                                  110






                                       Technologies                                                                                 Technologies
                                 Condor Terminology
          • The user submits a job to an agent.
          • The agent is responsible for remembering jobs in
            persistent storage while finding resources willing
            to run them.
          • Agents and resources advertise themselves to a
            matchmaker, which is responsible for
            introducing potentially compatible agents and
            resources.
          • At the agent, a shadow is responsible for
            providing all the details necessary to execute a
            job.
          • At the resource, a sandbox is responsible for
            creating a safe execution environment for the job
            and protecting the resource from any mischief.

                                 Condor-G
          • Condor-G: computation management
            agent for Grid Computing
          • Merging of Globus and Condor technologies
          • Globus
                – Protocols for secure inter-domain
                  communications
                – Standardized access to remote batch
                  systems
          • Condor
                                                                                               111                                                                  112
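The matchmaker role described under Condor Terminology (agents and resources advertise themselves, and the matchmaker introduces compatible pairs) can be sketched roughly as follows. This is an illustration under simplifying assumptions, not Condor's ClassAd language: advertisements are plain Python dictionaries, requirements are Python predicates, and all names and attribute keys (`memory_mb`, `os`, `owner`) are hypothetical.

```python
def compatible(job_ad, machine_ad):
    """Both sides must accept each other, as in a two-way match."""
    return job_ad["requirements"](machine_ad) and machine_ad["requirements"](job_ad)

def matchmake(job_ads, machine_ads):
    """Introduce each job to the first free machine that is mutually compatible."""
    free = list(machine_ads)
    introductions = []
    for job in job_ads:
        for machine in free:
            if compatible(job, machine):
                introductions.append((job["name"], machine["name"]))
                free.remove(machine)  # a claimed machine is no longer offered
                break
    return introductions

# Advertisements from agents (jobs); every value here is made up.
JOB_ADS = [
    {"name": "job1", "owner": "alice",
     "requirements": lambda m: m["os"] == "LINUX" and m["memory_mb"] >= 2048},
    {"name": "job2", "owner": "bob",
     "requirements": lambda m: m["memory_mb"] >= 512},
]

# Advertisements from resources (machines), each with its own policy.
MACHINE_ADS = [
    {"name": "m1", "os": "LINUX", "memory_mb": 1024,
     "requirements": lambda j: j["owner"] != "bob"},
    {"name": "m2", "os": "LINUX", "memory_mb": 4096,
     "requirements": lambda j: True},
]

MATCHES = matchmake(JOB_ADS, MACHINE_ADS)
```

In this sketch the matchmaker only makes the introduction; claiming the resource and running the job (the shadow and sandbox roles) would happen afterwards between agent and resource.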







                                       Technologies                                                                                 Technologies
                                        Condor Kernel                                                                               Gateway Flocking
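          [Figure: Condor kernel. Components: User, Problem Solver, Agent, Shadow,
           Matchmaker, Resource, Sandbox. Labeled flows: plan of jobs, job, ClassAds,
           claim, details of the job, environment.]

          [Figure: Gateway flocking. Two pools connected through gateways (G).]
          Gateways pass information about participants
          between pools: Ma sends a request to Mb through
          the gateways, and Mb returns a match.
                                                                                               113                                                                  114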






                                       Technologies                                                                         Technologies
                                       Gateway Flocking

          • Structure of pools is preserved
          • Completely transparent: no modification for
            users
          • Sharing at organizational level
          • Technically complex: the gateway participates in
            all interactions in the Condor kernel

          Solution: Direct Flocking

                                       Direct Flocking

          [Figure: Direct flocking.] A also advertises to Condor Pool B

                                                                                       115                                                                                     116






          Some Projects…
                                         Slide Courtesy of Ian Foster

                                         Resources
          • Foster I., Kesselman C., Tuecke S. (2003) The anatomy of the Grid. In
            F. Berman, G. Fox, T. Hey (eds) Grid Computing: Making the Global
            Infrastructure a Reality, Chichester, John Wiley & Sons, pp. 171-199
          • Foster I., Kesselman C., Nick C.M., Tuecke S. (2003) The physiology of
            the Grid. In F. Berman, G. Fox, T. Hey (eds) Grid Computing: Making the
            Global Infrastructure a Reality, Chichester, John Wiley & Sons, pp. 217-246
          • Thain D., Tannenbaum T., Livny M. (2003) Condor and the Grid. In
            F. Berman, G. Fox, T. Hey (eds) Grid Computing: Making the Global
            Infrastructure a Reality, Chichester, John Wiley & Sons, pp. 217-246
          • Joseph J. (2003) A developer’s overview of OGSI and OGSI-based
            Grid computing. IBM developerWorks [Online] Available at
            <http://www-106.ibm.com/developerworks/grid/library/gr-ogsi/>
          • The Globus Alliance [Online] Available at http://www.globus.org
          • Foldoc, What is architecture? [Online] Available at
            http://foldoc.doc.ic.ac.uk
          • Remember to read Ian Foster’s paper “What is the Grid? A Three-point
            Checklist” http://www-fp.mcs.anl.gov/~foster/Articles/WhatIsTheGrid.pdf
                                                                                        117                                                                                     118








                                        Coming Next: Applications

                                        Backup Slides

                                        Security on Grids


                                                                                       119






                Globus Security: the Grid
               Security Infrastructure (GSI)

            • http://www.globus.org/security
            • Public key encryption for single sign-on
                – PEM: Privacy Enhanced Mail (encoding format for keys and certificates)
            • X.509 certificates for credentials
            • Secure Sockets Layer (SSL) communication
              protocol for message security
            • Standard: most sites conform to
                – Generic Security Service API (GSS-API) of the
                  Internet Engineering Task Force (IETF)
            • Certificate Authority (CA) blesses process

                GSI: Some Terms

            • IETF: Internet Engineering Task Force
            • PKI: Public Key Infrastructure
                – IETF definition: "The set of hardware, software, people,
                  policies and procedures needed to create, manage, store,
                  distribute, and revoke PKCs based on public-key
                  cryptography".
                – OpenSSL provides an Open Source implementation.
            • OpenSSL Project:
                – Collaborative effort to develop a robust, commercial-grade,
                  full-featured, and Open Source toolkit implementing the
                  Secure Sockets Layer (SSL v2/v3) and Transport Layer
                  Security (TLS v1) protocols as well as a full-strength general
                  purpose cryptography library.
            • PKC: Public Key Certificate
                – IETF definition: "A data structure containing the public key of
                  an end-entity and some other information, which is digitally
                  signed with the private key of the CA which issued it."
                                                                                                     121                                                                                                               122








                             GSI: Some Terms

        • CA: Certification Authority
              – An agency or organization that is able to publish and give out digital certificates
              – IETF definition: "An authority trusted by one or more users to create and assign
                public key certificates. Optionally the CA may create the user's keys. It is
                important to note that the CA is responsible for the public key certificates during
                their whole lifetime (which includes renewal, revocation, etc.), not just for issuing
                them."
        • DN: Distinguished Name
              – Used in X.509 certificates; always contained within the 'subject' field of a digital
                certificate. The distinguished name should be unique within all certificates issued
                by a certification authority. A distinguished name might look like this: C=UK,
                O=eScience, OU=Authority, CN=CA. The attributes (e.g. C (Country), O
                (Organisation), OU (Organisational Unit), CN (Common Name), etc.) all combine
                to form the DN.
        • X.509:
              – Recommendation X.509 specifies the authentication service for X.500
                directories, as well as the widely adopted X.509 certificate syntax. The initial
                version of X.509 was published in 1988, version 2 was published in 1993, and
                version 3 was proposed in 1994 and considered for approval in 1995. Version 3
                addresses some of the security concerns and limited flexibility that were issues in
                versions 1 and 2.

                             Security: SSL

        • SSL was designed to secure data exchanges between two applications
              – originally between Web server and browser.
              – The protocol is widely used and is compatible with most Web browsers.
              – At network level, the SSL protocol is inserted between the TCP/IP and HTTP
                layers.
        • SSL has been designed mainly to work with HTTP.
        • SSL uses a public key encryption method
              – A technique based on a pair of asymmetric keys for encryption and
                decryption: a public key and a private key.
              – The private key is used to encrypt data; the sender (e.g. web site) does not
                give it to anyone.
        • Public Key
              – Used to decrypt the information; it is sent to the receivers (Web
                browsers) through a certificate.
              – When using SSL on the Internet, the certificate is delivered through a
                certification authority, such as Verisign.
              – The Web site pays the CA to deliver a certificate, which guarantees the server
                authentication and contains the public key that allows data to be exchanged
                in a secured mode.
          (Image courtesy www.4d.fr)
                                                                                                     123                                                                                                               124
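The way the attributes combine to form a Distinguished Name can be illustrated by splitting the example DN from the slide back into its parts. This is a minimal sketch: `parse_dn` is a hypothetical helper, and it assumes a simple comma-separated DN with no escaped commas or multi-valued attributes, which real X.509 tooling has to handle.

```python
def parse_dn(dn):
    """Split a comma-separated DN into ordered (attribute, value) pairs."""
    pairs = []
    for part in dn.split(","):
        attribute, _, value = part.strip().partition("=")
        pairs.append((attribute, value))
    return pairs

def common_name(dn):
    """Return the CN (Common Name) attribute, or None if the DN has none."""
    for attribute, value in parse_dn(dn):
        if attribute == "CN":
            return value
    return None

# The example DN given in the slide above.
EXAMPLE_DN = "C=UK, O=eScience, OU=Authority, CN=CA"
EXAMPLE_PAIRS = parse_dn(EXAMPLE_DN)
```

The order of the pairs is kept because two DNs with the same attributes in a different order are not the same name.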






                   Globus GSI: Public Key Cryptography

            • PKI relies not on a single key (a password or a secret
              "code"), but on two keys.
                – Asymmetric encryption
                – Keys are numbers that are mathematically related in
                  such a way that if either key is used to encrypt a
                  message, the other key must be used to decrypt it.
            • How it works:
                – Entity (owner) has two keys: a public and a private key
                – Data encrypted with one key can only be decrypted
                  with the other.
                – The private key is known only to the entity
                – The public key is given to the world encapsulated in an
                  X.509 certificate
            • Important:
                – It is critical that private keys be kept private! Anyone
                  who knows the private key can easily impersonate the
                  owner.

                   GSI: Certificates

            • Every user and service on the Grid is identified via a certificate, which
                – contains information vital to identifying and authenticating the user or
                  service.
                – Certificates are public
            • GSI certificates are encoded in the X.509 certificate format and
              include four primary pieces of information:
                – Subject name, which identifies the person or object that the certificate
                  represents.
                – Public Key belonging to the subject.
                – Identity of a Certificate Authority (CA) that has signed the
                  certificate to certify that the public key and the identity both belong
                  to the subject.
                – Digital Signature of the named CA.
            • The CA is used to certify the link between public key and subject in the
              certificate
                – This is done when you get your certificate
                – To trust the certificate and its contents, the CA's certificate must be
                  trusted.
                – The link between the CA and its certificate must be established via
                  some non-cryptographic means, or else the system is not
                  trustworthy.
                                                                                                            125                                                                                                               126






              GSI: How Digital Signatures Work

          •   Using public key cryptography, it is possible to digitally "sign" a
              piece of information.
                – Signing information essentially means assuring a recipient of the
                  information that the information hasn't been tampered with since it
                  left your hands.
          •   Entity A digitally signs a piece of information:
                –   Computes a mathematical hash (H1) of the information
                –   Encrypts this hash using Entity A’s private key (encr-H1)
                –   Attaches the hash encr-H1 to the original message
                –   Makes sure that the Recipient has Entity A’s public key
                –   Makes sure the Recipient knows the hashing algorithm
          •   Recipient must verify that the signed message is authentic:
                – Computes a new hash (H2) of the original message using the same
                  hashing algorithm used by Entity A
                – Using Entity A’s public key, taken from A’s CA-signed certificate, the
                  recipient decrypts the hash encr-H1 that Entity A attached to the
                  message (giving hash H3)
                – IF H3 == H2
                      • Proves that Entity A signed the message and that the message has not
                        been changed since Entity A signed it.

              GSI: Mutual Authentication

          • Grid services work based on mutual authentication
                – If two parties have certificates, and if both parties trust
                  the CAs that signed each other's certificates, then the
                  two parties can prove to each other that they are who
                  they say they are.
          • Before mutual authentication can occur,
                – parties involved must first trust the CAs that signed
                  each other's certificates.
                – they must have copies of the CAs' certificates, which
                  contain the CAs' public keys
                – they must trust that these certificates really belong to
                  the CAs.
          • This is often a very manual process and one of the
            bottlenecks in setting up a grid:
                – Site admins (system, security, project, etc)
                                                                                                                        127                                                                              128
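The hash-then-sign steps (H1, encr-H1, H2, H3) can be followed in a few lines. This is a toy sketch only: it uses textbook RSA with a deliberately small key built from two known Mersenne primes and no padding, so it must never be used for real security. The function names, the key, and the sample messages are all assumptions made for illustration; `pow(E, -1, ...)` needs Python 3.8 or later.

```python
import hashlib

# Toy RSA key built from two known Mersenne primes. Far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent

def digest(message):
    """Hash the message and reduce it into the range of the RSA modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message, private_exponent=D):
    """Entity A: compute H1 and encrypt it with the private key (encr-H1)."""
    return pow(digest(message), private_exponent, N)

def verify(message, signature, public_exponent=E):
    """Recipient: recompute H2, decrypt encr-H1 into H3, and compare them."""
    h2 = digest(message)
    h3 = pow(signature, public_exponent, N)
    return h3 == h2

MESSAGE = b"run job 42 on host-a"      # made-up message
SIGNATURE = sign(MESSAGE)
```

Verification succeeds only when the message is unchanged and the signature was produced with the private key that matches the public key used to check it.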








              GSI: Mutual Authentication
          •   A connects to B and gives B its certificate (identity, public key,
              signing CA).
          •   B first makes sure that the certificate is valid:
                – B checks the CA's digital signature to confirm that the certificate
                  hasn't been tampered with.
          •   B must then make sure that A really is the person identified in the certificate:
                – B generates a random message and sends it to A, asking A to encrypt it.
                – A encrypts the message using its private key and sends it back to B.
                – B decrypts the message using A's public key.
                – If this results in the original random message, B knows that A is who he
                  says he is.
          •   Now that B trusts A's identity, the same operation must happen in reverse:
                – B sends A its certificate.
                – A validates the certificate and sends B a challenge message to be encrypted.
                – B encrypts the message and sends it back to A; A decrypts it and compares it
                  with the original.
                – If it matches, A knows that B is who she says she is.
          •   A and B have now established a connection to each other and are certain
              that they know each other's identities.

              GSI: Securing Private Keys
          •   The core GSI software provided by the Globus Toolkit expects the user's
              private key to be stored in a file in the local computer's storage.
          •   To prevent other users of the computer from stealing the private key, the
              file that contains the key is encrypted via a password (also known as a
              passphrase).
          •   To use the GSI, the user must enter the passphrase required to decrypt the
              file containing their private key.
          •   We have also prototyped the use of cryptographic smartcards in conjunction
              with the GSI.
          •   This allows users to store their private key on a smartcard rather than in a
              filesystem, making it even more difficult for others to gain access to the key.
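The challenge-response step of the Mutual Authentication slide can be sketched as follows. This is only an illustration under loud assumptions: it uses textbook RSA with tiny primes (not secure, and not what GSI or TLS actually run), it skips the CA signature check on the certificate, and every function name here (`make_keypair`, `authenticate`, and so on) is hypothetical.

```python
import secrets

# Toy textbook-RSA key pair from two small primes (illustration only, NOT secure).
def make_keypair(p, q, e=17):
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)            # modular inverse, Python 3.8+
    return (e, n), (d, n)          # (public key, private key)

def encrypt_with_private(message, private_key):
    d, n = private_key
    return pow(message, d, n)

def decrypt_with_public(ciphertext, public_key):
    e, n = public_key
    return pow(ciphertext, e, n)

def authenticate(prover_private, prover_public):
    """One direction of the handshake, seen from the verifier's side."""
    _, n = prover_public
    challenge = secrets.randbelow(n - 2) + 2                      # verifier: random message
    response = encrypt_with_private(challenge, prover_private)    # prover: encrypt with private key
    return decrypt_with_public(response, prover_public) == challenge  # verifier: compare

# Hypothetical key pairs for the two parties A and B.
a_public, a_private = make_keypair(61, 53)
b_public, b_private = make_keypair(89, 97)

# Mutual authentication: B checks A, then the same operation happens in reverse.
mutual = authenticate(a_private, a_public) and authenticate(b_private, b_public)
print("mutually authenticated:", mutual)
```

A party that does not hold the private key matching the certificate's public key cannot produce a response that decrypts back to the random challenge, which is the property the slide relies on.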






              GSI: Delegation and Single Sign-On
          •   The GSI provides a delegation capability: an extension of the standard SSL protocol which
              reduces the number of times the user must enter their passphrase. If a Grid computation
              requires that several Grid resources be used (each requiring mutual authentication), or if
              there is a need to have agents (local or remote) requesting services on behalf of a user, the
              need to re-enter the user's passphrase can be avoided by creating a proxy.
          •   A proxy consists of a new certificate (with a new public key in it) and a new private key. The
              new certificate contains the owner's identity, modified slightly to indicate that it is a proxy.
              The new certificate is signed by the owner, rather than a CA. (See diagram below.) The
              certificate also includes a time notation after which the proxy should no longer be accepted
              by others. Proxies have limited lifetimes.
          •   The proxy's private key must be kept secure, but because the proxy isn't valid for very long,
              it doesn't have to be kept quite as secure as the owner's private key. It is thus possible to
              store the proxy's private key unencrypted in a local storage system, as long as the
              permissions on the file prevent anyone else from reading it easily. Once a proxy is
              created and stored, the user can use the proxy certificate and private key for mutual
              authentication without entering a password.
          •   When proxies are used, the mutual authentication process differs slightly. The remote party
              receives not only the proxy's certificate (signed by the owner), but also the owner's
              certificate. During mutual authentication, the owner's public key (obtained from the owner's
              certificate) is used to validate the signature on the proxy certificate. The CA's public key is
              then used to validate the signature on the owner's certificate. This establishes a chain of
              trust from the CA to the proxy through the owner.
          •   Note that the GSI and software based on it (notably the Globus Toolkit, GSI-SSH, and
              GridFTP) are currently the only software that supports the delegation extensions to TLS
              (a.k.a. SSL). The Globus Project is actively working with the Grid Forum and the IETF to
              establish proxies as a standard extension to TLS so that GSI proxies may be used with
              other TLS software.

              [Diagram; only the label "Signature" is recoverable]
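The chain of trust described above (the CA signs the owner's certificate, the owner signs the proxy's certificate, and the proxy expires after a short lifetime) can be sketched as below. This is a toy model, not GSI's implementation: signatures are textbook RSA over a SHA-256 digest reduced modulo a tiny modulus, and the `Cert` class, the `/CN=proxy` suffix used to mark the proxy, and the hour-based timestamps are all illustrative assumptions.

```python
import hashlib
from dataclasses import dataclass

def make_keypair(p, q, e=17):
    # Toy RSA from tiny primes (illustration only, NOT secure).
    n, phi = p * q, (p - 1) * (q - 1)
    return (e, n), (pow(e, -1, phi), n)      # (public, private)

def digest(text, n):
    return int.from_bytes(hashlib.sha256(text.encode()).digest(), "big") % n

@dataclass
class Cert:
    subject: str
    public_key: tuple
    issuer: str
    not_after: int          # expiry, here simply "hours since some epoch"
    signature: int = 0

    def body(self):
        return f"{self.subject}|{self.public_key}|{self.issuer}|{self.not_after}"

def sign(cert, issuer_private):
    d, n = issuer_private
    cert.signature = pow(digest(cert.body(), n), d, n)
    return cert

def signed_by(cert, issuer_public):
    e, n = issuer_public
    return pow(cert.signature, e, n) == digest(cert.body(), n)

def validate_chain(proxy, owner, ca_public, now):
    return (now <= proxy.not_after and now <= owner.not_after
            and signed_by(proxy, owner.public_key)   # owner's public key validates the proxy
            and signed_by(owner, ca_public))         # CA's public key validates the owner

ca_pub, ca_priv = make_keypair(61, 53)
owner_pub, owner_priv = make_keypair(89, 97)
proxy_pub, proxy_priv = make_keypair(101, 113)

owner = sign(Cert("/O=Grid/CN=Alice", owner_pub, "/O=Grid/CN=CA", not_after=8760), ca_priv)
# The proxy carries the owner's identity, slightly modified, and is signed by the owner.
proxy = sign(Cert("/O=Grid/CN=Alice/CN=proxy", proxy_pub, owner.subject, not_after=12), owner_priv)

print("within lifetime:", validate_chain(proxy, owner, ca_pub, now=1))
print("after expiry:   ", validate_chain(proxy, owner, ca_pub, now=13))
```

Because the proxy is signed with the owner's key rather than the CA's, creating one needs the owner's passphrase only once; afterwards the short-lived proxy key does the authenticating.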







              GSI: Examples
          •   X.509 Certificate:
                   subject    : C=UK,O=eScience,OU=Cardiff,L=WeSC,CN=liviu joita
                   issuer     : C=UK,O=eScience,OU=Authority,CN=CA,E=ca-operator@grid-support.ac.uk
                   start date : Tue Nov 12 15:33:51 GMT 2002
                   end date   : Wed Nov 12 15:33:51 GMT 2003
          •   Distinguished Name:
                   CN=liviu joita,L=WeSC,OU=Cardiff,O=eScience,C=UK
          •   Main advantages of using GSI:
                – Single sign-on
                – Users do not have usernames/passwords; instead they have
                  public/private key pairs and identity certificates

              GSI: Securing Private Keys
          •   You get a cert by running grid-cert-request
          •   GSI expects the user's private key to be stored in a file in the local user filespace
                – The directory is readable only by its owner.
                – The file that contains the key is encrypted via a password (also known as a
                  passphrase).
                – To use GSI, the user must enter the passphrase required to decrypt the file
                  containing their private key.
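In the GSI: Examples slide, the Distinguished Name is the certificate subject with its components in reverse order. A minimal sketch of that conversion follows, together with the slash-separated form used for DNs elsewhere in these slides. The function names are hypothetical, and the naive comma split assumes that no component contains an escaped comma.

```python
def split_rdns(subject):
    """Split a comma-separated subject into its components (naive: no escaped commas)."""
    return [part.strip() for part in subject.split(",") if part.strip()]

def to_distinguished_name(subject):
    """Reverse the component order: most specific (CN) first."""
    return ",".join(reversed(split_rdns(subject)))

def to_slash_form(subject):
    """Render the same components as /C=.../O=.../CN=..."""
    return "".join("/" + rdn for rdn in split_rdns(subject))

subject = "C=UK,O=eScience,OU=Cardiff,L=WeSC,CN=liviu joita"
print(to_distinguished_name(subject))
print(to_slash_form(subject))
```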






              Using GSI: grid-cert-request -h
                        displays all options
           $ grid-cert-request [-help] [ options ...]
            Example Usage:
             Creating a user certifcate:
               grid-cert-request
             Creating a host or gatekeeper certifcate:
               grid-cert-request -host [my.host.fqdn]
             Creating a LDAP server certificate:
               grid-cert-request -service ldap -host [my.host.fqdn]

            Options:
             -version            : Display version
             -?, -h, -help,      : Display usage
             -usage
             -cn <name>,         : Common name of the user
             -commonname <name>
             -service <service>  : Create certificate for a service. Requires
                                   the -host option and implies that the generated
                                   key will not be password protected (ie implies -nopw).
             -host <FQDN>        : Create certificate for a host named <FQDN>
             -dir <dir_name>     : Changes the directory the private key and certificate
                                   request will be placed in. By default user
                                   certificates are placed in /home/mthomas/.globus, host
                                   certificates are placed in /etc/grid-security and
                                   service certificates are place in
                                   /etc/grid-security/<service>.
             -prefix <prefix>    : Causes the generated files to be named
                                   <prefix>cert.pem, <prefix>key.pem and
                                   <prefix>cert_request.pem
             -nopw,              : Create certificate without a passwd
             -nodes,
             -nopassphrase,
             -verbose            : Don't clear the screen
             -int[eractive]      : Prompt user for each component of the DN
             -force              : Overwrites preexisting certifictes
             -ca                 : Will ask which CA is to be used (interactive)
             -ca <hash>          : Will use the CA with hash value <hash>

              grid-cert-request -ca
              (lists CAs recognized by host)
           $ /usr/local/globus/globus-3.0.2/bin/grid-cert-request -verbose -ca nondefaultca=true

           The available CA configurations installed on this host are:

           1) 1c3f2ca8 - /DC=org/DC=DOEGrids/OU=Certificate Authorities/CN=DOEGrids CA 1
           2) 42864e48 - /C=US/O=Globus/CN=Globus Certification Authority
           3) 5fb2fc80 - /O=Louisiana State University/OU=CCT/OU=ca.cct.lsu.edu/CN=CCT CA
           4) 6349a761 - /O=DOE Science Grid/OU=Certificate Authorities/CN=Certificate Manager
           5) 860e3429 - /C=US/ST=Virginia/L=Charlottesville/O=University of Virginia/Email=pkimaster@virginia.edu/CN=UVA Standard Assurance SKP 1
           6) 9a1da9f9 - /C=US/O=UTAustin/OU=TACC/CN=TACC Certification Authority/0.9.2342.19200300.100.1.1=ca man
           7) 9d8753eb - /DC=net/DC=es/OU=Certificate Authorities/OU=DOE Science Grid/CN=pki1
           8) ad478c3d - /C=US/ST=Virginia/L=Charlottesville/O=NMI Testbed Grid/OU=Bridge CA/CN=BridgeCA
           9) c4d34612 - /C=US/ST=Alabama/L=Birmingham/O=University of Alabama at Birmingham/OU=UABGrid/CN=UAB GridCA
           10) cd88b13f - /O=Grid/OU=Texas Tech HPCC/OU=onera/CN=Globus Simple CA
           11) d1b603c3 - /DC=net/DC=ES/O=ESnet/OU=Certificate Authorities/CN=ESnet Root CA 1

           Enter the index number of the CA you want to sign your cert request:
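For scripted setups, the options in the grid-cert-request help text above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of the Globus Toolkit. It only builds the argument list, using flags that appear in the help text (`-service`, `-host`, `-cn`, `-dir`, `-prefix`, `-ca`, `-nopw`, `-force`), and it enforces the documented rule that `-service` requires `-host`.

```python
import shlex

def build_cert_request_command(host=None, service=None, directory=None,
                               prefix=None, common_name=None, ca_hash=None,
                               no_passphrase=False, force=False):
    """Assemble a grid-cert-request argument list from the documented options."""
    if service and not host:
        # The help text states that -service requires the -host option.
        raise ValueError("-service requires -host")
    args = ["grid-cert-request"]
    if service:
        args += ["-service", service]
    if host:
        args += ["-host", host]
    if common_name:
        args += ["-cn", common_name]
    if directory:
        args += ["-dir", directory]
    if prefix:
        args += ["-prefix", prefix]
    if ca_hash:
        args += ["-ca", ca_hash]
    if no_passphrase:
        args.append("-nopw")
    if force:
        args.append("-force")
    return args

# The three example usages from the help text, plus a forced re-run with a new CN.
print(shlex.join(build_cert_request_command()))
print(shlex.join(build_cert_request_command(host="my.host.fqdn")))
print(shlex.join(build_cert_request_command(service="ldap", host="my.host.fqdn")))
print(shlex.join(build_cert_request_command(common_name="Common Name", force=True)))
```

Returning a list rather than a single string keeps arguments with spaces (such as a common name) intact if the list is later passed to `subprocess.run`.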








              Requesting a Certificate
          •   To request a certificate, a user starts by generating a key pair.
          •   The private key is stored encrypted with a pass phrase the user gives.
          •   The public key is put into a certificate request.
              [Diagram: Private Key (encrypted, on local disk); Certificate Request (Public Key)]

              Certificate Issuance
          •   The user then takes the certificate request to the CA.
          •   The CA usually includes a Registration Authority (RA) which verifies the
              request:
                – The name is unique with respect to the CA.
                – It is the real name of the user (often checked by phone, etc.).
          •   The CA then signs the certificate request and issues a certificate for
              the user.
              [Diagram: Certificate Request (Public Key) -> Sign -> Certificate (Name, Issuer, Public Key, Signature)]
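The issuance steps on the two slides above (key pair, request, RA checks, CA signature) can be modeled roughly as follows. This is a toy sketch under stated assumptions: `ToyCA` is a hypothetical class, the identity check is reduced to a boolean supplied by the caller, and the signature is textbook RSA over a reduced SHA-256 digest with tiny primes, which is not secure. The certificate fields mirror the diagram labels (Name, Issuer, Public Key, Signature).

```python
import hashlib

def digest(text, n):
    return int.from_bytes(hashlib.sha256(text.encode()).digest(), "big") % n

class ToyCA:
    def __init__(self, name, p=61, q=53, e=17):
        self.name = name
        self.n, self.e = p * q, e
        self._d = pow(e, -1, (p - 1) * (q - 1))   # CA private exponent (toy RSA)
        self.issued_names = set()

    @staticmethod
    def _body(cert):
        return f'{cert["name"]}|{cert["issuer"]}|{cert["public_key"]}'

    def issue(self, request, identity_verified):
        name = request["name"]
        # RA check 1: the name must be unique with respect to this CA.
        if name in self.issued_names:
            raise ValueError("name is not unique with respect to this CA")
        # RA check 2: the name must be the requester's real name (checked out of band).
        if not identity_verified:
            raise ValueError("RA could not confirm the requester's real name")
        cert = {"name": name, "issuer": self.name, "public_key": request["public_key"]}
        cert["signature"] = pow(digest(self._body(cert), self.n), self._d, self.n)
        self.issued_names.add(name)
        return cert

    def verify(self, cert):
        return pow(cert["signature"], self.e, self.n) == digest(self._body(cert), self.n)

ca = ToyCA("/O=Grid/O=Globus/CN=Toy CA")
# The request carries only the public half of the user's key pair (toy values here).
request = {"name": "/O=Grid/O=Globus/OU=lisidi.cs.uns.edu.ar/CN=Javier Echaiz",
           "public_key": (17, 8633)}
certificate = ca.issue(request, identity_verified=True)
print(sorted(certificate))
print("signature valid:", ca.verify(certificate))
```

The private key never appears in the request or the certificate; it stays encrypted on the user's local disk, as the first slide notes.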






              Creating a Cert: "Simple" steps
          •   Get an account on the host
          •   Run grid-cert-request
                – ~/.globus directory created with PEM files
          •   Email the contents of the usercert_request.pem file to the CA of your choice
          •   The CA emails back your signed certificate, which contains your public key
              (usercert.pem)
          •   Place it in ~/.globus on the machines you want to run grid jobs on

              Creating the Certificate: output from grid-cert-request
           $ /usr/local/globus/globus-3.0.2/bin/grid-cert-request
           A certificate request and private key is being created.
           You will be asked to enter a PEM pass phrase.
           This pass phrase is akin to your account password,
           and is used to protect your key file.
           If you forget your pass phrase, you will need to obtain
           a new certificate.

           Using configuration from /etc/grid-security/globus-user-ssl.conf
           Generating a 1024 bit RSA private key
           ........................++++++
           .......++++++
           writing new private key to '/home/mthomas/.globus/userkey.pem'
           Enter PEM pass phrase:
           Verifying password - Enter PEM pass phrase:
           -----
           You are about to be asked to enter information that will be incorporated
           into your certificate request.
           What you are about to enter is what is called a Distinguished Name or a DN.
           There are quite a few fields but you can leave some blank
           For some fields there will be a default value,
           If you enter '.', the field will be left blank.
           -----
           Level 0 Organization [Grid]:Level 1 Organization [Globus]:Level 0 Organizational Unit [tacc.utexas.edu]:Name (e.g., John M. Smith) []:
           A private key and a certificate request has been generated with the subject:

           /O=Grid/O=Globus/OU=lisidi.cs.uns.edu.ar/CN=Javier Echaiz

           If the CN=Mary Thomas is not appropriate, rerun this
           script with the -force -cn "Common Name" options.

           Your private key is stored in /home/mthomas/.globus/userkey.pem
           Your request is stored in /home/mthomas/.globus/usercert_request.pem

           Please e-mail the request to the Globus CA ca@globus.org

           You may use a command similar to the following:

            cat /home/mthomas/.globus/usercert_request.pem | mail ca@globus.org

           Only use the above if this machine can send AND receive e-mail. if not,
           please mail using some other method.

           Your certificate will be mailed to you within two working days.
           If you receive no response, contact Globus CA at ca@globus.org








              Distinguished Names (DN)                                                                                                           Contents of
                                                                                                                             usercert_request.pem File
                                                                                                           This is a Certificate Request file:
                                                                                                                                                                         -----BEGIN CERTIFICATE REQUEST-----
                                                                                                           It should be mailed to ca@globus.org
                                                                                                                                                                         BgNVBAsTD3RhY2MudXRleGFzLmVkdTEU
                                                                                                                                                                              MBIGA1UEAxMLTWFyeSBUaG9tYXMw
                                                                                                                                                                              gZ8w
          • Globally unique identifier that                                                                ==========================================
                                                                                                                 ====                                                    DQYJKoZIhvcNAQEBBQADgY0AMIGJAoGB
                                                                                                                                                                              AJvOvU8PsbH4qIkaYaJkmGewcA/kC1B
                                                                                                           Certificate Subject:
            represents you individually                                                                      /O=Grid/O=Globus/OU=grid.cs.uns.edu.ar/CN=Javier
                                                                                                                                                                              x
                                                                                                                                                                         8yBahG8Uab1B5z2GR0xWIGv+IWoyp+04/+
                                                                                                               Echaiz                                                         X2071CLpebOX0A+/39foxzE+z7aXjI
                – /O=Grid/O=Globus/OU=lisidi.cs.uns.edu.ar/CN=Javier Echaiz
          • Mapfiles: map user login/account to DNs

The above string is known as your user certificate subject, and it
uniquely identifies this user.

To install this user certificate, please save this e-mail message
into the following file.

/home/je/.globus/usercert.pem

You need not edit this message in any way. Simply
save this e-mail message to the file.

If you have any questions about the certificate contact
the Globus CA at ca@globus.org

Fm14WL22Yn/K3uIGNSRwJoWOu+cENrNrrl2JfIQqEIiVG5dWyT+VMkX/wx9C69X3
84iy0cq1So5VAgMBAAGgADANBgkqhkiG9w0BAQQFAAOBgQAE5qYwJVlFe2yQDgmu
/b0ICwjxJ77kNiNZRvcIfo23N4eXMi0s3YWvWAI6/nd2cgsJyfDOErUhXRteLjFS
pLSyr7njgfs2Jm26u4248P8LJbN6dAVN4JGFdoWAKPNYXbL7v30MQF8G93obvSB+
r3JacgAqczOM1vQci3HBnStX3Q==
-----END CERTIFICATE REQUEST-----
$
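
The mapfile idea on the slide above (a certificate subject DN is mapped to a local login) can be illustrated with a minimal sketch. It assumes the common one-entry-per-line layout of a quoted DN followed by a single account name; the account name `je` is taken from the home directory shown in the e-mail and is only an illustrative assumption, as is the helper name `parse_grid_mapfile`.

```python
import shlex


def parse_grid_mapfile(text):
    """Return a dict mapping each certificate subject (DN) to a local account.

    Assumes one entry per line: a double-quoted DN, whitespace, one account.
    """
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex keeps the quoted DN together even though it contains spaces.
        fields = shlex.split(line)
        if len(fields) != 2:
            raise ValueError(f"malformed mapfile entry: {raw!r}")
        dn, account = fields
        mapping[dn] = account
    return mapping


# Hypothetical mapfile content built from the DN shown on the slide.
SAMPLE = '''
# "subject DN" local-account
"/O=Grid/O=Globus/OU=lisidi.cs.uns.edu.ar/CN=Javier Echaiz" je
'''

user_map = parse_grid_mapfile(SAMPLE)
print(user_map)
```

In practice the entries would be maintained with the grid-mapfile-* commands rather than edited by hand; the sketch only shows what the mapping expresses.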





                     Certificate Operations
          •   Certificate creation
                – grid-cert-request
                – grid-change-pass-phrase
          •   Certificate info & management
                – grid-proxy-init
                – grid-proxy-destroy
                – grid-proxy-info
                – grid-cert-info
                – grid-default-ca
          •   Mapfiles:
                – grid-mapfile-add-entry
                – grid-mapfile-check-consistency
                – grid-mapfile-delete-entry
          •   For more details see:
                – http://www-unix.globus.org/toolkit/docs/development/3.9.3/security/prewsaa/user/#commandline
                – Note: on the machine I was testing, not all commands were installed, and not all
                  of them worked.

                     Some more details about GLOBUS





                          Globus Programming
              • You can program to Globus using either
                Unix commands:
                    – e.g. grid-proxy-init, globus-job-submit,
                      globusrun, etc.
              • OR using a programming API.
                    – C and Java APIs exist
                                 int globus_gram_client_job_request(
                                       const char *resource_manager_contact,
                                       const char *description,
                                       int         job_state_mask,
                                       const char *callback_contact,
                                       char      **job_contact);

                          Globus Components
              [Diagram; recoverable labels: GSI, GASS]




               Globus Pre-WS Component Interaction Diagram
     [Diagram; the only recoverable label is GSI, which appears four times]
     GRAM: Grid Resource Allocation Manager
     GASS: Global Access to Secondary Storage
     MDS: Monitoring and Discovery Service
     GRIS: Grid Resource Information Service
     GIIS: Grid Index Information Service
                               From IBM Redbook SG24-6895-01, 2003: Intro to Grid Computing

                                    GRAM
     • Service that provides remote execution and
       status management of the request
     • When a job is submitted by a client, the request
       is sent to the remote host and handled by the
       gatekeeper daemon located in the remote host.
     • Then the gatekeeper creates a job manager to
       start and monitor the job.
     • When the job is finished, the job manager sends
       the status information back to the client and
       terminates.




                              GRAM Architecture
     [Diagram]
                               From IBM Redbook SG24-6895-01, 2003: Intro to Grid Computing

                              GRAM Elements
     • Clients
     • Gatekeeper daemon
     • Job Manager
     • Global Access to Secondary Storage
       (GASS)
     • Dynamically-Updated Request Online
       Coallocator (DUROC)
     • Resource Specification Language
       (RSL)




                                   GRAM Clients
          • Three clients:
                – globusrun
                – globus-job-run
                – globus-job-submit

                                   GRAM: globusrun
          • Executable that uses Resource
            Specification Language (see later in
            lecture)
                – RSL is powerful
                – GRAM knows what to do with RSL
                  depending on context (which machine,
                  what queues, is this job or file I/O, etc.)
          • The other 2 clients are script wrappers
            around globusrun




               GRAM: globus-job-run
        • Command line interface to job submission
        • Features staging of data and executables using
          Global Access to Secondary Storage (GASS)
        • The basic syntax is:
              – % globus-job-run 'contact string' command
        • The contact string specifies a machine, port, and
          service to send the request to.
              – syntax of contact string is machine:port/jobmanager-name.
              – default port is 2119, the default name is "jobmanager"
        • If you wanted to contact a jobmanager named
          "jobmanager-pbs" on port 2345, you would run
              – % globus-job-run localhost:2345/jobmanager-pbs /bin/date

               globus-job-run: Running Jobs
        •     Step 1: "login" onto the grid.
              –     run the grid-proxy-init command.
              –     Proxies last 12 hours by default.
        •     The simplest way to run a program on the grid is the
              globus-job-run command.
        •     The syntax is:
              globus-job-run <hostname> <program> <arguments>
        •     The program parameter must refer to the absolute path of the program. You
              can avoid this using the "-s" option before the program name. With "-s",
              Globus automatically transfers the program to the host on which it will be
              executed.
        •     You can start a multi-request using the "-:" delimiter. If you want all
              programs to have common parameters, use the
              "-args" option:
              globus-job-run -args 1 1024 \
              -: machine1 -s ./foo \
              -: machine2 -s ./bar
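
The contact-string rule stated for globus-job-run (machine:port/jobmanager-name, with port 2119 and the name "jobmanager" as defaults) can be sketched as a small parser. This is only an illustration of the rule as given on the slide, not the Globus implementation; the function name `parse_contact` is hypothetical, and any richer contact forms that real installations may accept are deliberately ignored.

```python
DEFAULT_PORT = 2119             # default port named on the slide
DEFAULT_SERVICE = "jobmanager"  # default job manager name named on the slide


def parse_contact(contact):
    """Split 'machine[:port][/service]' into (machine, port, service)."""
    hostport, _, service = contact.partition("/")
    machine, _, port = hostport.partition(":")
    if not machine:
        raise ValueError(f"no machine in contact string: {contact!r}")
    return (
        machine,
        int(port) if port else DEFAULT_PORT,
        service or DEFAULT_SERVICE,
    )


print(parse_contact("localhost:2345/jobmanager-pbs"))
print(parse_contact("localhost"))
```

The first call corresponds to the jobmanager-pbs example from the slide; the second shows both defaults being filled in.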


               GRAM: globus-job-run - staging files
        • globus-job-run -s flag
           – Stages file I/O
        • For example, if you have a script called "hello.sh" which
          is the following:
                         #!/bin/sh
                         /bin/date
        • Stage the script to be run:
                    % globus-job-run localhost -s myprog
              – causes a GASS server to be started on the local machine.
              – when the job reaches the job manager, it will contact your
                GASS server and read in the staged file, then submit the
                job to the local scheduler.
        • You also have the ability to stage in stdin, and stage out
          stdout using the same mechanism:
                 % globus-job-run localhost -stdin myin.txt -stdout myout.txt \
                    -s myprog

               GRAM: globus-job-submit
        • basic syntax is:
              % globus-job-submit 'contact string' command
        • batch interface
        • Similar to job-run, but does not support
          automatic staging of files
        • returns immediately with a contact string
              – use it to query the status of the job.
              – https://pitcairn.mcs.anl.gov:4567/12345/7654321
              – Commands:
                   • globus-job-status
                   • globus-job-get-output - once the job is done, collect
                     the output with this command
                   • globus-job-clean








                GRAM: globus-job-submit
        •       To submit batch jobs, use globus-job-submit.
        •       Syntax:
                globus-job-submit <hostname>/jobmanager <program>
                <arguments>
        •       Batch jobs are executed in the background, typically through queuing
                systems.
        •       You can remotely submit a job, log out, and come back later to collect
                results.
        •       The program must refer to an absolute path. You cannot use the "-s"
                option with globus-job-submit.
        •       After a successful submission a job handle is returned. Store this
                handle.
        •       Operations (note: handle is parameter):
                –     Check the status of a job: globus-job-status <handle>
                –     Retrieve the output of a job: globus-job-get-output <handle>
                –     Cancel a job: globus-job-cancel <handle>
                –     Clear the files produced by a job: globus-job-clean <handle>

                GRAM: Gatekeeper
        • gatekeeper daemon builds secure
          communication between clients and servers.
        • gatekeeper daemon is similar to the inetd
          daemon in terms of functionality
        • provides secure communication (GSI).
        • communicates with the GRAM client
          (globusrun) and authenticates the right to
          submit jobs.
        • after authentication, gatekeeper forks and
          creates a job manager, delegating the
          authority to communicate with clients.
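
The batch workflow described for globus-job-submit (submit, keep the returned handle, then query, collect, cancel, or clean up) can be sketched as a helper that only assembles the command lines. Nothing is executed here; the helper names are hypothetical, the cleanup command is spelled globus-job-clean as on the earlier globus-job-submit slide, and the handle is the sample one from that slide.

```python
FOLLOW_UP = {
    "status": "globus-job-status",
    "output": "globus-job-get-output",
    "cancel": "globus-job-cancel",
    "clean": "globus-job-clean",
}


def submit_argv(contact, program, *arguments):
    """Build the argv list for a batch submission."""
    if not program.startswith("/"):
        # The slide requires an absolute path, since "-s" staging is unavailable.
        raise ValueError("globus-job-submit needs an absolute program path")
    return ["globus-job-submit", contact, program, *arguments]


def follow_up_argv(action, handle):
    """Build the argv list for an operation on a previously returned handle."""
    return [FOLLOW_UP[action], handle]


handle = "https://pitcairn.mcs.anl.gov:4567/12345/7654321"  # sample handle from the slides
print(submit_argv("localhost/jobmanager", "/bin/date"))
for action in ("status", "output", "clean"):
    print(follow_up_argv(action, handle))
```

On a machine with the Globus clients installed, each list could be passed to `subprocess.run`; that step is left out so the sketch stays runnable anywhere.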








                          GRAM: Job Manager
             • created by gatekeeper as part of the job
               requesting process.
             • starts and monitors jobs on behalf of a GRAM
               client application.
             • provides the interfaces that control the
               allocation of each local resource manager,
               such as a job scheduler like PBS, LSF, or
               LoadLeveler
             • It interfaces with schedulers to start jobs
               based on a job request RSL string

                          GRAM: Job Manager
             • The job manager functions are:
                   – Parse the resource language (RSL) string, break
                     down the scripts.
                   – Allocate job requests to the local resource
                     managers
                         • The local resource manager is usually a job
                           scheduler like PBS, LSF, or LoadLeveler.
                               – interface written in Perl, which allows you to create a new
                                 job manager to the local resource manager, if necessary.
                   – Send callbacks to clients, if necessary
                   – Receive the status and cancel requests from
                     clients
                   – Send output results to clients using GASS, if
                     requested






                GRAM: Global Access to
               Secondary Storage (GASS)
             • GRAM uses GASS for providing the
               mechanism to transfer output files from
               servers to clients.
             • Some APIs are provided under the GSI
               protocol to furnish secure transfers.
                    – E.g. GridFTP
             • used by the globusrun command, the
               gatekeeper, and the job manager.

                GRAM: Dynamically-Updated Request
                   Online Coallocator (DUROC)
             • DUROC allows users to
                   – submit jobs to different job managers at different
                     hosts
                   – submit jobs to different job managers at the
                     same host
             • DUROC RSL script parsed at GRAM client
               and allocated to different job managers.


                     Resource Specification
                        Language (RSL)
            • RSL is the language used by the clients
              to submit a job.
            • All job submission requests are
              described in RSL, including the
              executable file and condition on which it
              must be executed.
            • You can specify the amount of memory
              needed to execute a job in a remote
              machine

                     Globus RSL scripts
            •     Designed to provide a way to control the grid and its
                  resources to run your programs.
            •     globus-job-run and globus-job-submit generate
                  and execute RSL scripts.
                  –     see the generated script by using the "-dumprsl" option just after
                        the command.
            •     RSL scripts have the following syntax:
                  &(relation 1)(relation 2)…(relation n)
                  –     Each relation specifies a different detail of the job to be run.
            •     Relation syntax is based on tokens (or parameters):
                  (token = value)
            •     A full list of the available relations:
                  http://www.globus.org/gram/gram_rsl_parameters.html


                           RSL has a Grammar
            • Each RSL string consists of a sequence of
              RSL tokens, whitespace, and comments.
            • The RSL tokens are either special syntax or
              regular unquoted literals
            • Follows a modified BNF for tokenization rules
            • Has regular expressions
            • Has large set of parameters which can be
              extended
                  – Note that they already include computational
                    requirements (max cpus, wallclock time, memory,
                    queues, etc.)

                     RSL: Resource co-allocation
            • "&" indicates single resource request to Globus
              Resource Allocation Manager (GRAM), conjunction of
              <attribute, value> pairs.
            • "+" indicates request for multiple resources
              (coallocation): introduces new variable scope
            • You can allocate multiple resources for your job, by
              grouping RSL expressions as shown below.
              +(&(relation 1.1)(relation 1.2)…(relation 1.n))
               (&(relation 2.1)(relation 2.2)…(relation 2.n))
               …
               (&(relation m.1)(relation m.2)…(relation m.n))
            • you can run a job with many different executables,
              many different hosts, many different parameters etc.
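
The co-allocation form, a "+" followed by several parenthesised "&" requests, can be sketched by composing one single-request string per resource. The helper names are hypothetical, the host names machine1 and machine2 are placeholders reused from the globus-job-run multi-request example, and values are again assumed to need no quoting.

```python
def single_request(relations):
    """Render '&(token=value)...' from (token, value) pairs (no quoting applied)."""
    return "&" + "".join(f"({token}={value})" for token, value in relations)


def multi_request(requests):
    """Render '+(&...)(&...)' from a list of relation lists, one per resource."""
    return "+" + "".join(f"({single_request(r)})" for r in requests)


multi = multi_request([
    [("resourceManagerName", "machine1"), ("executable", "/bin/date")],
    [("resourceManagerName", "machine2"), ("executable", "/bin/hostname")],
])
print(multi)
```

Each inner list becomes its own clause, which matches the slide's point that variables defined in one clause are scoped to that clause.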


                            RSL: Some Common Parameters

       •    (executable=string)
            Defines the program to be executed. The string is the absolute path to the program.
       •    (arguments=list)
            Specifies the arguments that will be supplied to the executable.
       •    (environment=list)
            Environment variables needed for the job. The list consists of space-separated
            pairs enclosed in parentheses, e.g. (CC "gcc").
       •    (directory=string)
            Specifies the job's working directory.
       •    (stdin=string)
            (stdout=string)
            (stderr=string)
            Provide redirection for the standard I/O streams. The string can be a file or a URL. Uses GASS.
       •    (count=integer)
            Specifies the number of processes to be run.
       •    (resourceManagerName=string)
            Specifies the grid machine where the job will be submitted by default.
       •    (maxCpuTime=integer)
            Specifies the maximum CPU run time in minutes.


                            RSL (cont.)

       •    Variables defined in one clause of a multi-request are not visible to the other clauses.

       •    RSL tokens:
              – The following characters can't appear as part of an unquoted literal:
                `+' (plus), `&' (ampersand), `|' (pipe), `(' (left paren),
                `)' (right paren), `=' (equal), `<' (left angle), `>' (right angle),
                `!' (exclamation), `"' (double quote), `'' (apostrophe),
                `^' (caret), `#' (pound), and `$' (dollar).

       •    Common RSL tokens:
              arguments, count, directory, executable, jobType,
              environment, maxTime, maxWallTime, gramMyjob,
              maxCpuTime, stdin, stdout, stderr, queue, project, dryRun,
              maxMemory, minMemory, hostCount
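
The rule about unquoted literals can be checked mechanically before a value is placed into an RSL string. The following is a minimal sketch in plain C, not part of the Globus API: the function name rsl_is_unquoted_literal is hypothetical, and treating whitespace and the empty string as "needs quoting" is an assumption added here, since the slide only lists the reserved punctuation characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Characters the slide lists as not allowed inside an unquoted RSL literal,
 * followed by whitespace (an assumption of this sketch, not from the slide). */
static const char RSL_NEEDS_QUOTES[] = "+&|()=<>!\"'^#$" " \t\r\n";

/* Hypothetical helper: returns true if `s` could be written without quotes.
 * The empty string is rejected as well (also an assumption). */
bool rsl_is_unquoted_literal(const char *s)
{
    assert(s != NULL);
    if (*s == '\0')
        return false;
    /* strcspn gives the length of the leading run that contains none of the
     * rejected characters; the literal is fine if that run is the whole string. */
    return s[strcspn(s, RSL_NEEDS_QUOTES)] == '\0';
}
```

Under these assumptions a value such as /bin/hostname passes the check, while a=b or $HOME would have to be quoted before being used in a relation like (executable=...).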

      Programming with Globus API

       • Command-line programs are named grid_* or globus_*
       • Function calls/APIs start with globus_*
       • Library binaries start with libglobus_*.a

       • Includes:
       #include <globus_common.h> //defines most common data structures
       and others, depending on which modules/functions are called in the program.

       • Module Activation/Deactivation:
       - Functions are arranged in several modules. The corresponding module must be activated
       before calling a function:
       - globus_module_activate(MODULE_NAME)
       - globus_module_deactivate(MODULE_NAME)
       - globus_module_deactivate_all()

       GLOBUS_SUCCESS (0) is returned if successful.

       Example Module Names:
                GLOBUS_GRAM_CLIENT_MODULE
                GLOBUS_IO_MODULE
                GLOBUS_GASS_COPY_MODULE
        Dependencies among module activations exist. Read the API documentation.


      MDS: Monitoring and Discovery Services

       Producers
       • Resources generate information (static or dynamic) about themselves, e.g. current CPU
         load, number of reserved jobs, etc. This information is collected by Information
         Providers. (You can also implement your own LDAP schema at this level.)
       • Information Providers pass this information to GRIS servers. A GRIS server collects
         such data for many local resources.
       • A GRIS registers its local information with a GIIS server for a host or domain
         (e.g. Univ. of Michigan). (GIIS servers are similar to DNS servers.)
       • GIIS servers from different hosts and domains register themselves with others,
         creating a hierarchy of GIIS servers and information.

       Consumers
       • An MDS client is the consumer of such information, e.g. a Grid scheduler, a
         job manager, a network status predictor (NWS) or an adaptive application.
         It can query a GRIS for local resource information, or a GIIS for Grid-wide information.

       Figure credit: Copyright @ IBM ITSO/Redbook SG246895
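
The GRIS/GIIS registration hierarchy and the two kinds of consumer query can be pictured with a toy in-memory model. This is only a sketch in plain C: it is not the MDS or LDAP API, all type and function names (Gris, Giis, giis_register_gris, giis_register_giis, gris_query, giis_query) are hypothetical, each GRIS is reduced to a single resource count, and the hierarchy is assumed to be acyclic.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 8  /* arbitrary capacity chosen for this sketch */

/* A GRIS holds information about local resources (here: just a count). */
typedef struct {
    const char *host;
    int resources;
} Gris;

/* A GIIS aggregates registered GRIS servers and lower-level GIIS servers. */
typedef struct Giis {
    const char *domain;
    const Gris *gris[MAX_CHILDREN];
    size_t n_gris;
    const struct Giis *giis[MAX_CHILDREN];
    size_t n_giis;
} Giis;

/* A GRIS registers with a GIIS. Returns 0 on success, -1 if the GIIS is full. */
int giis_register_gris(Giis *giis, const Gris *gris)
{
    assert(giis != NULL && gris != NULL);
    if (giis->n_gris == MAX_CHILDREN)
        return -1;
    giis->gris[giis->n_gris++] = gris;
    return 0;
}

/* A GIIS registers with another GIIS, forming the hierarchy. */
int giis_register_giis(Giis *parent, const Giis *child)
{
    assert(parent != NULL && child != NULL && parent != child);
    if (parent->n_giis == MAX_CHILDREN)
        return -1;
    parent->giis[parent->n_giis++] = child;
    return 0;
}

/* Local query: what a client learns by asking one GRIS directly. */
int gris_query(const Gris *gris)
{
    assert(gris != NULL);
    return gris->resources;
}

/* Wider query: everything visible from this GIIS downward (assumes no cycles). */
int giis_query(const Giis *giis)
{
    assert(giis != NULL);
    int total = 0;
    for (size_t i = 0; i < giis->n_gris; i++)
        total += gris_query(giis->gris[i]);
    for (size_t i = 0; i < giis->n_giis; i++)
        total += giis_query(giis->giis[i]);
    return total;
}
```

In this model a consumer that only needs one machine's state calls gris_query on that GRIS, while a scheduler looking across a domain calls giis_query on the GIIS at the level it cares about.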




      Useful Links & Readings

       • http://www.globus.org/gram
       • IBM Redbook SG24-6895-01 (2003): Intro to Grid Computing



