					                               Grid Tutorial
                                         Norbert Podhorszki



                             Part I.
                       What are Grids and
                          e-Science?
EGEE is funded by the European Union under contract IST-2003-508833

                                                          EGEE Tutorial at University of Szeged? – Dec 7th, 2004 - 1
                 Acknowledgements
• This talk is based on a module of the tutorials delivered by
  the EGEE training team and slides from
   •   Andrew Grimshaw, University of Virginia
   •   Bob Jones, EGEE Technical Director
   •   Mark Parsons, EPCC
   •   the EDG training team
   •   Roberto Barbera, INFN
   •   Ian Foster, Argonne National Laboratories
   •   Jeffrey Grethe, SDSC
   •   The National e-Science Centre
   •   David Fergusson, ???
   •   Peter Kacsuk, MTA SZTAKI


          Goals of Part I
• Introduce grid concepts and definitions

• Why Grids?

• A brief outline of history leading to EGEE




           Overview
• What is different about grids?

• Characteristics of a grid

• eScience

• Applications (what’s in it for the working scientist)

• European grids, and the world




What is different about grids?




             What is Grid Computing?
• A Virtual Organisation is:

    • People from different institutions working towards a common goal
    • Sharing distributed processing and data resources


• Grid infrastructure enables virtual organisations

         “Grid computing is coordinated resource
         sharing and problem solving in dynamic,
         multi-institutional virtual organizations”
         (I.Foster)




Grids vs. Distributed Computing?




            A Real World Distributed Application


• SETI@home
  •   3.8M users in 226 countries
  •   1200 CPU years/day
  •   38 TF sustained (Japanese Earth
      Simulator is 40 TF peak)
  •   1.7 zettaflop in total over the last 3 years
      (10^21, beyond peta and exa …)
  •   Highly heterogeneous: >77
      different processor types




                                  Credit to Fran Berman
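
The SETI@home figures above can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope illustration: it assumes a 365-day year and reads the "1.7 zettaflop" figure as a total count of floating-point operations over the three years, which the slide does not state explicitly.

```python
# Figures quoted on the slide
CPU_YEARS_PER_DAY = 1200
SUSTAINED_FLOPS = 38e12        # 38 TF sustained
USERS = 3.8e6                  # 3.8M users
TOTAL_OPERATIONS = 1.7e21      # assumption: total operations over 3 years
YEARS = 3

DAYS_PER_YEAR = 365            # simplification: leap days ignored
SECONDS_PER_DAY = 86_400

# 1200 CPU-years delivered per day equals this many CPUs busy full time
equivalent_cpus = CPU_YEARS_PER_DAY * DAYS_PER_YEAR

# Sustained rate contributed by each full-time-equivalent CPU
flops_per_cpu = SUSTAINED_FLOPS / equivalent_cpus

# Share of a full-time CPU that an average registered user contributes
cpu_share_per_user = equivalent_cpus / USERS

# Average rate implied by the three-year total
average_rate = TOTAL_OPERATIONS / (YEARS * DAYS_PER_YEAR * SECONDS_PER_DAY)

print(f"Full-time-equivalent CPUs: {equivalent_cpus:,}")
print(f"Sustained rate per CPU:    {flops_per_cpu / 1e6:.0f} MFLOPS")
print(f"CPU share per user:        {cpu_share_per_user:.2f}")
print(f"Average rate over 3 years: {average_rate / 1e12:.0f} TF")
```

Under these assumptions the three-year average comes out at roughly half of the quoted sustained rate, which would be consistent with a user base that grew over the period.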
                 Grids vs. Distributed
                 Computing
• Distributed applications already exist, but they tend
  to be specialised systems intended for a single
  purpose or user group

• Grids go further and take into account:
  •   Different kinds of resources
       • Not always the same hardware, data and applications
  •   Different kinds of interactions
       • User groups or applications want to interact with Grids in different
         ways
  •   Dynamic nature
       • Resources and users added/removed/changed frequently


Grid vs. metacomputing




                  Motivations
Metacomputing:
• Grand challenge problems run for weeks and months even on
  supercomputers and clusters
• Various supercomputers/clusters must be connected by wide area
  networks in order to solve grand challenge problems in reasonable time

Grid:
• To form a computational grid similar to the information data access on
  the web
• Any computers/devices must be connected by wide area networks in
  order to form a universal source of computing power
• Grid = generalised metacomputing
  Original meaning of
  metacomputing


 Metacomputing = Supercomputing + Wide area network



Original goal of metacomputing:

• Distributed supercomputing to achieve
  higher performance than individual
  supercomputers/clusters can provide

           Distributed
           Supercomputing



Sites shown in the diagram: NCSA Origin, Caltech Exemplar, Maui SP, Argonne SP

• Issues:
   •   Resource discovery, scheduling
   •   Configuration
   •   Multiple communication methods
   •   Message passing (MPI)
   •   Scalability
   •   Fault tolerance




SF-Express Distributed Interactive Simulation: Caltech, USC/ISI
  What is a Metacomputer?
• A metacomputer is a collection of
   •   computers
   •   that are heterogeneous in every aspect
   •   geographically distributed
   •   connected by a wide-area network
   •   that form the image of a single computer
• Metacomputing means:
   •   network based
   •   distributed supercomputing




  What is a Grid?
• A Grid is a collection of
   •   computers, storage and other devices
   •   that are heterogeneous in every aspect
   •   geographically distributed
   •   connected by a wide-area network
   •   that form the image of a single computer
• Generalised metacomputing means:
   •   network based
   •   distributed computing




High-Throughput Computing
• Schedule many independent tasks
   • Parameter studies
   • Data analysis
• Issues:
   • Resource discovery
   • Data access
   • Scheduling
   • Reservation
   • Security
   • Accounting
   • Code management

[Figure labels: Deadline, Cost, Available Machines]

    Nimrod-G: Monash University
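
A parameter study of the kind listed above boils down to running the same program over every combination of input values, with no communication between runs. The following is a minimal local sketch of that pattern, not Nimrod-G itself: `simulate` is a hypothetical stand-in for a real job, and a thread pool stands in for the grid's worker nodes. None of the listed issues (resource discovery, reservation, security, accounting) are addressed here.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate(temperature, pressure):
    """Hypothetical stand-in for one independent job of a parameter study."""
    return temperature * pressure


def run_sweep(temperatures, pressures, max_workers=4):
    """Run every (temperature, pressure) combination as an independent task."""
    points = list(product(temperatures, pressures))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so values line up with points
        values = list(pool.map(lambda point: simulate(*point), points))
    return dict(zip(points, values))


results = run_sweep([300, 350], [1, 2, 5])
for point, value in sorted(results.items()):
    print(point, value)
```

Because the tasks are independent, the total run time is governed mainly by how many workers are available, which is why deadline and cost appear as the trade-off in the figure.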

Characteristics of a grid




         What are the characteristics of a
         Grid system?


• Numerous resources
• Ownership by mutually distrustful organizations & individuals
• Connected by heterogeneous, multi-level networks
• Different security requirements & policies required
• Different resource management policies
• Potentially faulty resources
• Geographically separated
• Resources are heterogeneous



              How Different 2004 is from 1994
• Moore’s law everywhere
    •   Instruments, detectors, sensors, scanners, …
    •   Organising their effective use is the challenge
•   Enormous quantities of data: Petabytes
    •   For an increasing number of communities
    •   Gating step is not collection but analysis
• Huge quantities of computing: >100 Top/s (tera-operations per second)
    •   Moore’s law gives us all supercomputers
    •   Organising their effective use is the challenge
• Ultra-high-speed networks: >10 Gb/s
    •   Global optical networks
    •   Bottlenecks: last kilometre & firewalls


                                           Exponential Growth

[Figure: performance per dollar spent versus number of years (0 to 5), with doubling times of 9, 12 and 18 months]
   •   Optical fibre (bits per second): Gilder’s Law (32X in 4 yrs)
   •   Data storage (bits per sq. inch): Storage Law (16X in 4 yrs)
   •   Chip capacity (# transistors): Moore’s Law (5X in 4 yrs)

Triumph of Light – Scientific American. George Stix, January 2001
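
The relationship between a doubling time and a growth multiple over a fixed period can be sketched as below. Pairing the 9, 12 and 18 month doubling times with optical fibre, data storage and chip capacity respectively is an assumption based on the order in which they appear in the figure.

```python
def growth_factor(doubling_months, years):
    """Growth multiple after `years` for a quantity that doubles every `doubling_months`."""
    return 2 ** (12 * years / doubling_months)


# Assumed pairing of doubling times with the three curves in the figure
curves = [("Optical fibre", 9), ("Data storage", 12), ("Chip capacity", 18)]

for name, months in curves:
    print(f"{name}: x{growth_factor(months, 4):.1f} in 4 years")
```

Under this assumption, pure doubling gives about 40X, 16X and 6X over four years, so the 32X and 5X quoted in the figure look like rounded or conservative values rather than exact results of the formula.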
             The main drivers behind Grid
• The relentless increase in microprocessor performance
  •   you can buy multi-gigaflop systems for less than €800
• The availability of reliable high performance networking
  •   in Europe the GEANT network links 32 countries at speeds of up to
      10Gbps (and beyond)
  •   in the UK we have gone from 100Mbps -> 10Gbps academic
      backbone since 2000
  •   1Gbps is commonly available to the desktop
• The desire to push the boundaries of scientific discovery by
  computational analysis and simulation – e-Science




eScience




           The Emergence of e-Science
• Invention and exploitation of advanced computational
  methods

   •   To generate, curate and analyse research data
        • From experiments, observations and simulations
        • Quality management, preservation and reliable evidence
   •   To develop and explore models and simulations
        • Computation and data at extreme scales
        • Trustworthy, economic, timely and relevant results
   •   To enable dynamic distributed virtual organisations
        • Facilitating collaboration with information and resource
          sharing
        • Security, reliability, accountability, manageability and agility



             Why use Grids for Science?
• Scale of the problems
  •   Science increasingly done through distributed global collaborations
      enabled by the internet
• Grids provide access to:
  •   Very large data collections
  •   Terascale computing resources
  •   High performance visualisation
  •   Connected by high-bandwidth networks


• e-Science is more than Grid Technology
          It is what you do with it that counts



          Challenges
• Must share data between thousands of scientists with
  multiple interests
• Must ensure that all data is accessible anywhere, anytime
• Must be scalable and remain reliable for more than a
  decade
• Must cope with different access policies
• Must ensure data security



                The Grid Vision
• Researchers perform their activities regardless of geographical location,
  interact with colleagues, share and access data
• Scientific instruments and experiments provide huge amounts of data
• The Grid: networked data processing centres and “middleware” software
  as the “glue” of resources


The Emergence of
Global Knowledge Communities




          Slide from Ian Foster’s ssdbm 03 keynote
           Applications

(What’s in it for working scientists)




             Grid Applications
• Medical/Healthcare (imaging,
  diagnosis and treatment)

• Bioinformatics (study of the human
  genome and proteome to understand genetic
  diseases)

• Nanotechnology (design of new
  materials from the molecular scale)

• Engineering (design optimization,
  simulation, failure analysis and remote
  instrument access and control)

• Natural Resources and the
  Environment (weather forecasting, earth
  observation, modeling and prediction of
  complex systems)

             CERN: Data intensive science in a
             large international facility

• The Large Hadron Collider (LHC)
   • The most powerful instrument ever built to
       investigate elementary particle physics

• Data Challenge:
   •   10 Petabytes/year of data !!!
   •   20 million CDs each year!

• Simulation, reconstruction, analysis:
   •   LHC data handling requires computing
       power equivalent to ~100,000 of today's
       fastest PC processors!

[Figure labels: Mont Blanc (4810 m), Downtown Geneva]
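
To put the Data Challenge figures in perspective, the sketch below converts them into a per-CD capacity and an average data rate. It assumes decimal petabytes (10^15 bytes) and a 365-day year; neither is stated on the slide.

```python
# Figures quoted on the slide
BYTES_PER_YEAR = 10e15    # 10 Petabytes/year (assumption: decimal petabytes)
CDS_PER_YEAR = 20e6       # 20 million CDs each year

SECONDS_PER_YEAR = 365 * 86_400   # simplification: leap days ignored

# Capacity per CD implied by the two slide figures
implied_cd_bytes = BYTES_PER_YEAR / CDS_PER_YEAR

# Average rate needed to record 10 PB spread evenly over a year
average_bytes_per_second = BYTES_PER_YEAR / SECONDS_PER_YEAR

print(f"Implied capacity per CD: {implied_cd_bytes / 1e6:.0f} MB")
print(f"Average data rate:       {average_bytes_per_second / 1e6:.0f} MB/s")
```

The implied capacity is somewhat below that of a typical CD, so the "20 million CDs" comparison reads as a rounded illustration rather than an exact conversion.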

                      CrossGrid
• 1. Interactive biomedical simulation and
  visualization

• 2. Flooding crisis team support

• 3. HEP distributed data analysis

• 4. Weather forecasting and air pollution
  modelling




         Connecting People: Access Grid

[Figure: an Access Grid room; labels: Remote video, Visualisation, Microphones, Cameras]
European grids

And the world




          Major EU GRID projects
European DataGrid (EDG)       www.edg.org

LHC Computing GRID (LCG)      cern.ch/lcg

CrossGRID                     www.crossgrid.org

DataTAG                       www.datatag.org

GridLab                       www.gridlab.org

EUROGRID                      www.eurogrid.org

European National Projects:
• INFNGRID,
• UK e-Science Programme,
• NorduGrid

                 EU DataGrid at a glance
People
• 500 registered users
• 12 Virtual Organisations
• 21 Certificate Authorities
• >600 people trained
• 456 person-years of effort
• 170 years funded

Application Testbed
• ~20 regular sites
• > 60,000 jobs submitted (since 09/03, release 2.0)
• Peak >1000 CPUs
• 6 Mass Storage Systems

Software
• > 65 use cases
• 7 major software releases (> 60 in total)
• > 1,000,000 lines of code

Scientific Applications
• 5 Earth Obs institutes
• 10 bio-medical apps
• 6 HEP experiments

             Grid projects
Many Grid development efforts, all over the world

 • NASA Information Power Grid
 • DOE Science Grid
 • NSF National Virtual Observatory
 • NSF GriPhyN
 • DOE Particle Physics Data Grid
 • NSF TeraGrid
 • DOE ASCI Grid
 • DOE Earth Systems Grid
 • DARPA CoABS Grid
 • NEESGrid
 • DOH BIRN
 • NSF iVDGL

 • DataGrid (CERN, ...)
 • EuroGrid (Unicore)
 • DataTag (CERN,…)
 • Astrophysical Virtual Observatory
 • GRIP (Globus/Unicore)
 • GRIA (Industrial applications)
 • GridLab (Cactus Toolkit)
 • CrossGrid (Infrastructure Components)
 • EGSO (Solar Physics)

 • UK – OGSA-DAI, RealityGrid, GeoDise, Comb-e-Chem, DiscoveryNet, DAME,
   AstroGrid, GridPP, MyGrid, GOLD, eDiamond, Integrative Biology, …
 • Netherlands – VLAM, PolderGrid
 • Germany – UNICORE, Grid proposal
 • France – Grid funding approved
 • Italy – INFN Grid
 • Eire – Grid proposals
 • Switzerland – Network/Grid proposal
 • Hungary – DemoGrid, Grid proposal
 • Norway, Sweden – NorduGrid





				