Introduction to grids


   Taavi Hupponen, CSC
Definition?

 There are as many definitions as there are grids…
 Power grid analogy really isn’t a very good one
 Grids aim to provide easy, efficient and secure access
  to distributed resources

 How to recognize a grid?
    •   Resource sharing (cpu, storage…)
    •   Spans organizational boundaries
    •   Security
    •   Based on open standards
Grid types

 Categorization of grids is more or less artificial; most grids fall into
  several categories

 Computational grids
    •   The traditional grid
    •   Connecting clusters, workstations and supercomputers
    •   Examples: EGEE, DEISA, SETI@home
 Data grids
    •   Easy, efficient and powerful access to data
    •   Uniform interface, distribution and replication of large data sets
    •   Examples: Bridges, BIRN, peer-to-peer file sharing networks like BitTorrent?
 Knowledge grids, services grids
Building blocks

 Most grids are built from the same basic blocks,
  including
    •   Computing elements
    •   Storage elements
    •   User interface
    •   Job management
    •   User management
    •   Security
Middleware

 The building blocks are implemented by the middleware of the grid
 Middleware acts between an application and the operating systems
  of the grid nodes
 The term ’middleware’ is used quite loosely; it can mean almost
  anything
 Examples:
    •   LCG-2 and gLite (EGEE)
    •   NorduGrid ARC (SweGrid, M-grid)
    •   UNICORE (DEISA)
    •   Globus Toolkit


 Unfortunately, different middlewares don’t work very well together.
  Work is being done to improve grid interoperability
Common grid user interfaces

 Command-line interfaces
    •   Still the most common way of using grids
    •   Almost like using a batch job system in a local cluster:
          Write the job description
          Submit the job
          Poll for status
          Get the results
    •   In addition: certificate handling
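
The four command-line steps above can be sketched in Python. This is only an illustration: ToyGrid is a hypothetical in-memory stand-in and not the interface of any real middleware, the job description fields are made up, and certificate handling is left out.

```python
class ToyGrid:
    """Hypothetical in-memory stand-in for a grid; no real middleware is contacted."""

    def __init__(self):
        self._jobs = {}
        self._next_id = 0

    def submit(self, description):
        job_id = self._next_id
        self._next_id += 1
        # Pretend every job needs two status polls before it finishes.
        self._jobs[job_id] = {"description": description, "polls_left": 2}
        return job_id

    def status(self, job_id):
        job = self._jobs[job_id]
        if job["polls_left"] > 0:
            job["polls_left"] -= 1
            return "RUNNING"
        return "FINISHED"

    def results(self, job_id):
        job = self._jobs[job_id]
        if job["polls_left"] > 0:
            raise RuntimeError("job has not finished yet")
        desc = job["description"]
        return "ran " + " ".join([desc["executable"]] + desc["arguments"])


def run_job(grid, description, max_polls=10):
    job_id = grid.submit(description)            # submit the job
    for _ in range(max_polls):                   # poll for status
        if grid.status(job_id) == "FINISHED":
            return grid.results(job_id)          # get the results
        # A real client would wait here between polls.
    raise TimeoutError("job did not finish in time")


# Write the job description (the field names are assumptions).
job_description = {"executable": "/bin/echo", "arguments": ["hello", "grid"]}
output = run_job(ToyGrid(), job_description)
```

With a real middleware each step is typically a separate command, and polling may go on for hours rather than a few iterations.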

 Graphical clients
    •   Often include workflow features

 Web portals
    •   Either hide or expose the grid middleware
    •   One portal for one or more grids (P-GRADE)
Security in grids

 With most grids, security has been considered from the
  beginning, unlike with, for example, the World Wide Web

 Grid security:
    •   Is based on Public Key Infrastructure (PKI), which is a robust
        security mechanism used by, for example, SSH and SSL
    •   Usernames and passwords are replaced by certificates
    •   Certificates are provided by trusted entities called Certificate
        Authorities
    •   PKI provides authentication, integrity and confidentiality
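
As a rough illustration of how a key pair gives authentication and integrity, the following sketch signs a message with a private key and checks it with the public key. It uses a toy RSA key built from tiny textbook primes, so it is not secure and it is not how any grid middleware implements certificates. All names are made up for the example, and confidentiality is not shown.

```python
import hashlib

# Toy RSA key pair from the textbook primes 61 and 53.
# Far too small to be secure; it only shows the principle.
N = 61 * 53   # public modulus (3233)
E = 17        # public exponent
D = 2753      # private exponent: (E * D) % ((61 - 1) * (53 - 1)) == 1


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can produce this value."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone who knows the public values (N, E) can check the signature."""
    return pow(signature, E, N) == digest(message)


message = b"job request from a hypothetical user"
signature = sign(message)
is_valid = verify(message, signature)                   # authentic and unmodified
is_forged_valid = verify(message, (signature + 1) % N)  # altered signature
```

A certificate applies the same idea one level up: the Certificate Authority signs the statement that a given public key belongs to a given identity.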
Virtual organisations

 Access to grid resources is often controlled at the Virtual
  Organisation (VO) level instead of per individual user

 VOs are based on collaboration, geographical location or
  scientific field

 Example: Biomed VO in EGEE

 In its simplest form, a VO is a list of user identities; it can also
  include the programs that are to be used
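
A minimal sketch of VO-level access control, assuming a VO is just a set of user identities and each resource lists the VOs it accepts. The VO names, identities and resource names below are all made up for illustration.

```python
# Hypothetical VO membership lists; the identities imitate certificate subject names.
VO_MEMBERS = {
    "biomed": {"/O=Example/CN=Alice", "/O=Example/CN=Bob"},
    "physics": {"/O=Example/CN=Carol"},
}

# Hypothetical resource policy: the VOs each resource accepts.
RESOURCE_VOS = {
    "cluster-a": {"biomed"},
    "cluster-b": {"biomed", "physics"},
}


def vos_of(user):
    """Return the set of VOs the user belongs to."""
    return {vo for vo, members in VO_MEMBERS.items() if user in members}


def may_use(user, resource):
    """Access is granted through VO membership, not per individual user."""
    return bool(vos_of(user) & RESOURCE_VOS.get(resource, set()))
```

The resource never needs to know individual users: adding a person to a VO is enough to give access to every resource that accepts that VO.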
Putting programs into the grid

 Programs installed by grid admins
   •   Either on all nodes or only on some
   •   There is usually a common set of programs that can be found on
       each node of a grid (basic utilities, compilers etc.)
   •   Nodes have mechanisms for advertising which programs are installed


 Programs installed by grid users
   •   Program is sent to the node with the job description and input data
   •   You need to consider hardware architecture, operating system and
       library issues
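
Both cases come down to matching a job against what the nodes advertise. A small sketch, assuming nodes publish their architecture, operating system and installed programs as plain records. The node names and field names are hypothetical and do not follow any real information system schema.

```python
# Hypothetical records of what each node advertises.
NODES = [
    {"name": "node1", "arch": "x86_64", "os": "linux", "programs": {"gcc", "blast"}},
    {"name": "node2", "arch": "ppc64", "os": "aix", "programs": {"gcc"}},
    {"name": "node3", "arch": "x86_64", "os": "linux", "programs": {"gcc"}},
]


def nodes_with_program(program):
    """Nodes where the admins have already installed the program."""
    return [n["name"] for n in NODES if program in n["programs"]]


def nodes_for_binary(arch, os_name):
    """Nodes that can run a binary the user sends along with the job."""
    return [n["name"] for n in NODES if n["arch"] == arch and n["os"] == os_name]
```

Library dependencies are not modelled here; in practice a user-supplied binary may also need to be statically linked or shipped with its libraries.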
What kind of problems fit into grids?

 Non-parallel problems
   •   As if running on a local workstation


 Embarrassingly parallel problems
   •   The problem is easily split into smaller independent jobs that
       can be distributed inside a site or even among several sites
   •   Very well suited for grids
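
To illustrate what "easily split into smaller independent jobs" means, the sketch below divides a sum of squares into chunks that share no data and combines the partial results at the end. A local thread pool stands in for the grid sites; this is an assumption made for the example, not a real grid submission.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """One independent job: it needs only its own range of input."""
    start, stop = chunk
    return sum(i * i for i in range(start, stop))


def split(total, n_jobs):
    """Split range(total) into at most n_jobs chunks (total must be positive)."""
    step = -(-total // n_jobs)  # ceiling division
    return [(lo, min(lo + step, total)) for lo in range(0, total, step)]


def run_distributed(total, n_jobs=4):
    chunks = split(total, n_jobs)
    # The thread pool plays the role of the sites that run the jobs.
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)  # combine the independent results


result = run_distributed(1000)
```

Because the jobs never communicate with each other, it does not matter whether they run on one cluster or on several sites.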


 Most problems are in-between and are best executed
  inside one site
Grid examples

 EGEE
   •   Grid of heterogeneous clusters and workstations
   •   Over 30,000 CPUs, 5 petabytes of storage
   •   EGEE project ended in March 2006, EGEE II started in April 2006
   •   Funded by EU FP6
   •   http://www.eu-egee.org


 DEISA
   •   Grid of supercomputers (mostly IBM)
   •   For High Performance Computing applications
   •   Funded by EU FP6
   •   http://www.deisa.org
Challenges

 Constant development makes it challenging for users and
  admins to keep up

 Distribution adds overhead, decreases control and
  transparency

 Usability issues

 Grid interoperability issues
Benefits

 Grids don’t increase resources – they make usage of
  existing resources more efficient
    •   Load balancing, putting idle resources to use


 Handling of large computations or data sets that aren’t
  possible within a single site (CERN LHC)

 Increased collaboration
Grids are still developing, but already
offer good opportunities.

						