A Grid Briefing
Professor Carole Goble
University of Manchester

IT Managers' Day, Manchester Computing, 28-11-02
 What is the Grid and e-Science?
 The UK e-Science Programme
 The Global Programme
Take Home
 Grid is the latest attempt to do global
 distributed computing.
 Strong application pull.
 Strong industrial support.
 The technology is not ready (yet).
 It isn't going away.
 The UK is a leader & highly involved.
 The OST is committed to e-Science.
Why Grids?
  "Large-scale science and engineering are done
  through the interaction of people,
  heterogeneous computing resources,
  information systems, and instruments, all of
  which are geographically and organizationally
  dispersed. The overall motivation for 'Grids' is to
  facilitate the routine interactions of these
  resources in order to support large-scale
  science and engineering."
                             From Bill Johnston, 27 July 01
CERN: Large Hadron Collider (LHC)
Raw data: 1 Petabyte / sec
Filtered: 100 Mbyte / sec ≈ 1 Petabyte / year ≈ 1 million CD-ROMs

(Image: the CMS detector)
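A back-of-envelope check of the slide's filtered-data figures. This is a sketch: the 650 MB CD-ROM capacity and an effective data-taking time of about 10^7 seconds per year are assumptions, not stated on the slide.

```python
# Rough check of the slide's filtered-data figures.
# Assumptions (not from the slide): ~1e7 s of effective data-taking
# per year, and a CD-ROM holding ~650 MB.
PB = 10**15
filtered_rate = 100 * 10**6            # bytes/s after filtering (100 MB/s)
beam_seconds = 10**7                   # assumed effective run time per year
yearly_bytes = filtered_rate * beam_seconds
cd_capacity = 650 * 10**6              # bytes per CD-ROM (assumed)
cds_per_year = yearly_bytes / cd_capacity

print(yearly_bytes // PB)              # 1 petabyte per year
print(f"{cds_per_year / 1e6:.1f} million CD-ROMs")  # ~1.5 million
```

Under these assumptions the quoted 1 PB/year follows directly, and the "1 million CD-ROMs" is the right order of magnitude.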
Why Grids?
  A biochemist exploits 10,000 computers to
  screen 100,000 compounds in an hour;
  A biologist combines a range of diverse and
  distributed resources (databases, tools,
  instruments) to answer complex questions;
  1,000 physicists worldwide pool resources for
  petaop analyses of petabytes of data
  Civil engineers collaborate to design, execute,
  & analyze shake table experiments

                               From Steve Tuecke 12 Oct. 01
Why Grids? (contd.)
  Climate scientists visualize, annotate, &
  analyze terabyte simulation datasets
  An emergency response team couples real
  time data, weather model, population data
  A multidisciplinary analysis in aerospace
  couples code and data in four companies
  A home user invokes architectural design
  functions at an application service provider

                                  From Steve Tuecke 12 Oct. 01
 What is the Grid?
"Grid computing [is] distinguished from
   conventional distributed computing by its
   focus on large-scale resource sharing,
   innovative applications, and, in some
   cases, high-performance orientation... we
   review the 'Grid problem', which we
   define as flexible, secure, coordinated
   resource sharing among dynamic
   collections of individuals, institutions, and
   resources - what we refer to as virtual
   organizations."
 From "The Anatomy of the Grid: Enabling Scalable Virtual
         Organizations" by Foster, Kesselman and Tuecke
What is the Grid?
Resource sharing & coordinated problem solving
in dynamic, multi-institutional virtual
organizations
On-demand, ubiquitous access to computing,
data, and all kinds of services
New capabilities constructed dynamically and
transparently from distributed services
No central location, no central control, no
existing trust relationships …
Pooling Resources
e-Science and the Grid
'e-Science is about global collaboration in
  key areas of science, and the next
  generation of infrastructure that will
  enable it.'
'e-Science will change the dynamic of the
  way science is undertaken.'
John Taylor,
Director General of Research Councils,
Office of Science and Technology
Classical Grids
  Classical Grids emphasise sharing of
  physical resources.
  Existing Grid middleware (e.g. Globus,
  Condor, Unicore) allows resource discovery,
  resource allocation, data movement,
  certification …
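The resource discovery and allocation step these middleware systems perform can be sketched as a Condor-style matchmaking loop. This is a toy illustration of the idea, not the Globus or Condor API; the resource and job attribute names are invented for the example.

```python
# Toy Condor-style matchmaking: pair a job's requirements with an
# advertised resource from a pool. Attribute names are illustrative.
resources = [
    {"name": "cluster-a", "cpus": 8,  "mem_gb": 16,  "free": True},
    {"name": "cluster-b", "cpus": 64, "mem_gb": 256, "free": True},
]

def match(job, pool):
    """Return the first free resource satisfying the job's requirements."""
    for r in pool:
        if r["free"] and r["cpus"] >= job["cpus"] and r["mem_gb"] >= job["mem_gb"]:
            r["free"] = False          # allocate: mark the resource busy
            return r["name"]
    return None                        # no match: the job stays queued

job = {"cpus": 32, "mem_gb": 64}
print(match(job, resources))           # cluster-b satisfies the job
print(match(job, resources))           # None: the only large resource is taken
```

Real middleware layers certificates, queues and data staging on top of this core match-and-allocate cycle.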
(Diagram: the Biogrid system — Grid system 1, an Express5800/ISS PC
cluster of 8 x 2.2 GHz Xeon plus a management node; Grid system 2, a
78-node (156-CPU) NEC blade server; a Data Grid disk array of
3 x Express5800/140Ra-4; management stations and flat-neighborhood
networks linked by 1000Base-T, connected to Grid system 3.)
  Access portal for biomolecular modeling
  Interfaces to enable chemists and biologists
  to submit work to HPC facilities
  Visualization of the electrostatic field
  generated by a molecule.
Dr Krzysztof Nowinski (ICM)
    Remote control of instruments
     Sharing of the UHVEM (Ultra High Voltage Electron Microscope) at Osaka
     University with NCMIR (National Center for Microscopy and Imaging
     Research)
       3 million electron volts
       the most powerful microscope of its kind
(Diagram: the UHVEM in Osaka, Japan linked to NCMIR in San Diego via
Tokyo XP, JGN, TransPAC, STAR TAP (Chicago), vBNS and SDSC (UC San
Diego).)
    BioSim -- Molecular simulations as a tool for protein
    structure analysis
(Diagram: simulations feeding a compute Grid and a distributed MD
database, leading to novel biology.)
   Overall vision: simulation as an integral component of structural genomics
   Needs both capacity (many systems) and capability (large systems - HPCx)
   Molecular Dynamics database (distributed)
 From Klaus Schulten, Center for Biomolecular Modeling and Bioinformatics, Urbana-Champaign
Information Weaving and
Question Answering
  Large amounts of different kinds of data
  & many …
  Highly heterogeneous.
     Different types, algorithms, forms,
     implementations, communities, service …
  High autonomy.
  Highly complex, inter-related & volatile.

myGrid
  Personalised extensible environments for
  data-intensive in silico experiments in biology
  Straightforward discovery, interoperation,
  deployment & sharing of services
     Service-oriented architecture
     Integration and information: workflow & databases
     Provenance, propagating change, personalisation
  For bioinformaticians who are building tools
  and using or providing services
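An in silico experiment of this kind, chained services with provenance capture, can be sketched as follows. This is an illustration of the concept, not myGrid's actual implementation; the service names and data are invented.

```python
# Sketch: a two-step in silico workflow that records provenance
# (which service ran, on what input, when). Names are invented.
from datetime import datetime, timezone

provenance = []                        # one record per service invocation

def run_service(name, func, data):
    """Invoke a 'service' and append a provenance record for the call."""
    result = func(data)
    provenance.append({
        "service": name,
        "input": data,
        "output": result,
        "when": datetime.now(timezone.utc).isoformat(),
    })
    return result

# Two illustrative "services" chained into a workflow.
seq = run_service("fetch_sequence", lambda acc: "MKTAYIAK", "P12345")
hits = run_service("similarity_search", lambda s: [f"hit:{s[:4]}"], seq)

print(hits)                            # ['hit:MKTA']
print([p["service"] for p in provenance])
```

The provenance log is what lets a biologist later ask how a result was derived and rerun the experiment when an upstream database changes.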
BioGrid Projects
  Asia Pacific BioGRID
  North Carolina BioGrid
  Bioinformatics Research Network
  Osaka University BioGrid
  Indiana University BioArchive BioGrid
A Grid vs The Grid
  A Grid of resources: not just compute
  resources but databases, digital libraries,
  instruments, workflows, documents …
  These configurations are …
  Resources discovered, combined, used and
  disbanded as and when needed or available.
(Diagram: geographically distributed nodes, e.g. the UK Grid, joined
by a Gigabit IP network and Grid middleware.)
Grid Evolution
1st Generation Grid
  Computationally intensive, file access/transfer
  Bag of various heterogeneous protocols & toolkits
  Recognises the Internet, ignores the Web
  Academic teams
2nd Generation Grid                       We are here!
   Data intensive -> knowledge intensive
   Services-based architecture
   Recognises Web and Web services
   Global Grid Forum
   Industry participation
US Grid Projects
 NASA Information Power Grid
 DOE Science Grid
 DOE Earth Systems Grid
 DOE FusionGrid
 NSF National Virtual Observatory
 NSF GriPhyN
 DOE Particle Physics Data Grid
 NSF DTF TeraGrid
 NSF iVDGL
 NEESGrid
 NIH BIRN
EU Grid Projects
  DataGrid (CERN, ..)
  EuroGrid (Unicore)
  DataTag (TTT…)
  Astrophysical Virtual Observatory
  GRIP (Globus/Unicore)
  GRIA (Industrial applications)
  GridLab (Cactus Toolkit)
  CrossGrid (Infrastructure Components)
  EGSO (Solar Physics)
  COG (Semantic Grid)
National Grid Projects
  UK e-Science Grid
  Japan – Grid Data Farm, ITBL
  Netherlands – VLAM, DutchGrid
  Germany – UNICORE, Grid proposal
  France – Grid funding approved
  Italy – INFN Grid
  Eire – Grid-Ireland
  Poland – PIONIER Grid
  Switzerland - Grid proposal
  Hungary – DemoGrid, Grid proposal
  ApGrid – AsiaPacific Grid proposal
     UK e-Science Programme
(Diagram: programme structure under the DG Research Councils, with a
Steering Committee and Grid TAG; the Director has both an awareness
and co-ordination role and a management role.)
  Academic Application Support Programme:
  Research Councils (£74m), DTI (£5m)
     PPARC (£26m), BBSRC (£8m), MRC (£8m), NERC (£7m),
     ESRC (£3m), EPSRC (£17m), CLRC (£5m)
     £80m collaborative projects
  Generic Challenges: EPSRC (£15m), DTI (£15m)
  Industrial Collaboration (£40m)
                                                      From Tony Hey, 27 July 01
UK e-Science Initiative
  £120M programme over 3 years
  £75M is for Grid applications in all areas
  of science and engineering
  £10M for a supercomputer upgrade
  £35M 'Core Programme' to encourage
  development of generic 'industrial
  strength' Grid middleware
     Requires £20M additional 'matching'
      funds from industry
Key Elements of
UK Grid Development Plan
  Development of generic Grid middleware
  Network of Grid Core Programme e-Science Centres
     National Centre http://www.nesc.ac.uk/
     Regional Centres http://www.esnw.ac.uk/
  Grid IRC Grand Challenge Project
  Support for e-Science pilots
  Short-term funding for e-Science demonstrators
  Grid Network Team
  Grid Engineering Team
  Grid Support Centre
  Task Forces
                             Adapted from Tony Hey, 27 July 01
Key Elements of
UK Grid Development Plan (2)
  Grid Network Team
     CLRC-RAL
  Grid Support Centre
     CLRC-Daresbury, Manchester and Edinburgh
     http://www.grid-support.ac.uk
  Task Forces
     Database, led by Norman Paton
     Architecture, led by Malcolm Atkinson
  International Involvement via GGF
     http://www.gridforum.org/
e-Science Centres

(Map: e-Science centres including Belfast, Manchester, Oxford,
Cambridge, Cardiff and London.)

 Centres donate equipment to make a Grid
Access Grid
A day in the life of NeSC
e-Science Demonstrators
 Dynamic Brain Atlas
 Chemical Structures
 Mouse Genes
 Robotic Astronomy
 Collaborative Visualisation
 Medical Imaging/VR
Grid Middleware R&D
  £16M funding available for industrial
  collaborative projects
  £11M allocated to Centres' projects plus
  £5M for 'Open Call' projects
  Set up Task Forces
     Database Task Force
     Architecture Task Force
     Security Task Force
Grid Network Team
  Expert group to identify end-to-end network
  bottlenecks and other network issues
    e.g. problems with multicast for Access Grid
  Identify e-Science project requirements
  Funding a £0.5M traffic engineering/QoS project
  investigating MPLS using the SuperJANET network
  Funding a DataGrid extension project investigating
  bandwidth scheduling with PPARC
  Proposal for 'UKLight' lambda connection to Chicago
  and Amsterdam
UK e-Science Pilot Projects
 EPSRC: Comb-e-Chem, DAME, DiscoveryNet,
 GEODISE, myGrid, RealityGrid
 PPARC: GRIDPP
 NERC: Climateprediction.com, Oceanographic
 Grid, Molecular Environmental Grid,
 NERC DataGrid (NERC + …)
 BBSRC: Biomolecular Grid, Proteome
 Annotation Pipeline, High-Throughput
 Structural Biology, Global Biodiversity
EPSRC e-Science Projects (2)
  myGrid: Personalised Extensible Environments for
  Data-Intensive in silico Experiments in Biology
     Manchester, EBI, Southampton, Nottingham,
     Newcastle, Sheffield, GSK, AstraZeneca, IBM, Sun
  GEODISE: Grid-Enabled Optimisation and Design
  Search for Engineering
     Southampton, Oxford, Manchester, BAE, Rolls-Royce
  Discovery Net: High-Throughput Sensing Applications
     Imperial College, Infosense, …
Geodise Project
(Diagram: an engineer works through an ontology for engineering,
computation, optimisation and design search; an intelligent
application manager links CAD systems (CADDS, IDEAS, ProE, CATIA,
ICAD) and analysis codes (CFD, FEM, CEM) to optimisation services
(OPTIONS) and, via an intelligent resource provider, to computation
on parallel machines, clusters and pay-per-use Internet resource
providers using Globus, Condor and SRB; traceability is kept in a
database.)
Geodise will provide grid-based seamless access to an intelligent
knowledge repository, a state-of-the-art collection of optimisation
and search tools, industrial-strength analysis codes, and distributed
computing & data resources.
EPSRC e-Science Projects (1)
  Comb-e-Chem: Structure-Property Mapping
    Southampton, Bristol, Roche, Pfizer, IBM
  DAME: Distributed Aircraft Maintenance Environment
    York, Oxford, Sheffield, Leeds, Rolls-Royce
  RealityGrid: A Tool for Investigating Condensed
  Matter and Materials
    QMW, Manchester, Edinburgh, IC, Loughborough,
    Oxford, Schlumberger, …
DAME Project
(Diagram: in-flight data flows from the aircraft over a global
network, e.g. SITA, to a ground station; the airline, maintenance
centre and the DS&S Engine Health Center are linked by Internet,
e-mail and pager to a data centre.)
AstroGrid: Powering the Virtual Observatory
(Edinburgh, Belfast, Cambridge, Leicester, London, Manchester, RAL)
(Image: multi-wavelength views of the jet in M87, from top to
bottom: Chandra X-ray, HST optical, Gemini mid-IR, VLA radio.)
AstroGrid will provide advanced, Grid-based federation and data
mining tools to facilitate better and faster scientific output.
Picture credits: "NASA / Chandra X-ray Observatory / Herman Marshall
(MIT)", "NASA/HST/Eric Perlman (UMBC)", "Gemini Observatory/OSCIR",
"VLA/NSF/Eric Perlman (UMBC)/Fang Zhou, Biretta (STScI)/F Owen (NRA)"
IRC Grand Challenge Project
  Equator: Technological Innovation in
  Physical and Digital Life
  AKT: Advanced Knowledge Technologies
  DIRC: Dependability of Computer-Based Systems
  MIAS: From Medical Images and Signals
  to Clinical Information
e-Healthcare Grand Challenge
  Funding £0.5M projects to give a Grid
  dimension to these IRCs
  Funding £2M joint IRC projects with MIAS on
  e-Healthcare applications
  Example: breast cancer surgery
     normalization of mammography and
     ultrasound scans
     FE modelling of breast tissue
     deliver useful clinical information to the
     surgeon, ensuring privacy and security
e-Diamond: The Challenge
(Diagram: medical image analysis technology bridging the clinical
setting and the application.)
Open Grid Services
  Development of web services from W3C
  OGSA will provide
     Naming / Authorization / Security / Privacy
     Higher-level services: Workflow, Transactions,
     Data Mining, Knowledge Discovery, …
  Exploit synergy: commercial Internet with
  Grid services
OGSA-DAI
  Key middleware area for the UK Programme
  Develop high-quality data-centric
  middleware capability
     Total budget $5M (CP $2M)
     Three centres: Edinburgh, Manchester
     and Newcastle
     Industrial partners: IBM US, IBM Hursley
     and Oracle UK
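The data-centric middleware idea can be sketched as a service that accepts a query document and returns a response document, so clients never open a database connection directly. This is a sketch of the concept only, not the OGSA-DAI API; the class, resource name and schema are invented.

```python
# Sketch of a uniform data-access service in the OGSA-DAI spirit:
# clients send a request document to a named data resource. All
# names and the toy schema here are illustrative, not OGSA-DAI's.
import sqlite3

class DataService:
    """Wraps a data resource behind a document-style query interface."""
    def __init__(self, resource_id):
        self.resource_id = resource_id
        self.db = sqlite3.connect(":memory:")   # stand-in for a real database
        self.db.execute("CREATE TABLE genes (name TEXT, chromosome TEXT)")
        self.db.executemany("INSERT INTO genes VALUES (?, ?)",
                            [("BRCA1", "17"), ("TP53", "17"), ("CFTR", "7")])

    def perform(self, request):
        """Execute a query request and return a response document."""
        rows = self.db.execute(
            "SELECT name FROM genes WHERE chromosome = ?",
            (request["chromosome"],)).fetchall()
        return {"resource": self.resource_id, "results": [r[0] for r in rows]}

svc = DataService("uk.org.example/genedb")      # hypothetical resource name
print(svc.perform({"chromosome": "17"}))
```

The point of the indirection is that the same request format can front relational databases, flat files or XML stores, which is what makes the middleware "data-centric" rather than database-specific.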
Today's Grid
  A single system image
  Transparent wide-area access to large data banks
     Data access, transfer & replication:
     GridFTP, Giggle
  Transparent wide-area access to applications
  on heterogeneous platforms
  Transparent wide-area access to processing resources
     Computational resource discovery, allocation
     and process creation: GRAM, Unicore, Condor-G
  Security, certification, single sign-on
  authentication, AAA:
     Grid Security Infrastructure
Research Challenges
  Building a Future Infrastructure
     Developing a Semantic Grid
     Trusted Ubiquitous Systems
     Rapid Customized Assembly of Services
     Autonomic Computing
  Putting the Infrastructure to work
     Support for New Forms of Community
     Socio-Economic Impact
     Collaboratory IPR and legal issues
Reality Checks!!
The technology is ready
   Not true: it's emerging
      Building middleware, advancing standards,
      developing and building demonstrators.
      The computational grid is in advance of the
      data-intensive grid.
      Integration and curation are probably the obstacles.
      But!! It doesn't have to be all there to be useful.
We know how we will use grid services
   No: disruptive technology
      Lower the barriers to entry.
Take Home
 Grid is the latest attempt to do global
 distributed computing.
 Strong application pull.
 Strong industrial support.
 The technology is not ready (yet).
 It isn't going away.
 The UK is a leader & highly involved.
 The OST is committed to e-Science.
