

									Sun Grid Computing Projects

Wolfgang Gentzsch
Director Grid Computing
Sun Microsystems Inc
   Sun's Vision in the 80's and 90's:

     The Network is the Computer

   Integrated & “Integrate-able”
   End-to-End, Javacard -> J2EE
   Open public API's
   Scale
   Sun's Vision today:

     Everybody and Everything to the Net

   Innovation
   Partners
   Open public API's
   Scale
     The 1st Step:
       Grid Computing

   Coordinated sharing of distributed resources
   including computers, data archives, visualization
   and multiple remote instruments

   Integrated & “Integrate-able”
   End-to-End, Javacard -> J2EE
   Open public API's
   Scale, Partners, Security, Choice, Value
    The 2nd Step:
          Computing as a Utility
         Adding a business model to the Grid

                      What's a Utility?

●   On Demand: Get a service at your fingertips
●   From the Wall Socket: Don't care about the infrastructure
●   Metering & Billing: And pay as you go, for what you used

        Like electricity, water, gas, heat, telephony
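
As a rough illustration of "Metering & Billing: pay as you go", the sketch below sums metered CPU time per user and prices it at a flat rate. The record layout, the rate, and all names are assumptions made for this example; they are not taken from any Sun product.

```python
from collections import defaultdict

# Hypothetical usage records: (user, cpu_seconds). Illustrative data only.
USAGE = [("alice", 7200), ("bob", 1800), ("alice", 3600)]

RATE_PER_CPU_HOUR = 0.50  # assumed flat price, arbitrary currency unit


def bill(usage, rate_per_cpu_hour):
    """Sum metered CPU time per user and convert it to a charge."""
    seconds = defaultdict(int)
    for user, cpu_seconds in usage:
        seconds[user] += cpu_seconds
    return {
        user: round(total / 3600 * rate_per_cpu_hour, 2)
        for user, total in seconds.items()
    }


invoices = bill(USAGE, RATE_PER_CPU_HOUR)
```

The user pays only for what was metered, which is the utility model described above; a real system would add tiers, quotas, and audit trails.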
       Computing as a Utility:
   The Grid, N1, and Web Services

       Developers & Users View: Sun ONE

Web & Grid Services (Grid, OGSI)
   Msg, App, Dir, DB, Meter, Computing, Sysadmin
Operating Systems & Grid
Network, Storage
The 3rd Step:

The Grid,
the Operating System of the Internet:
 High Capacity: rich in resources
 High Capability: rich in options
 Persistent: stable infrastructure & knowledgeable workforce
 Evolutionary: able to adapt to new technologies & users
 Usable: accessible, robust, and easy-to-use
 Scalable: growth must be part of the design
 Flexible: able to support new applications
 Fault-tolerant: resilient to changes and errors

      For the Datacenter, this is N1 !
       Services sharing resources
Services 1       Services 2        Services 3

Virtual Store    Virtual Compute   Virtual Network
  N1: managing services, not servers
  Anyone, anywhere, anytime, any device, any data,
  connected to The Grid
Integration of new devices, data and information sources
Cell phones, PDAs, smart sensors, sensor arrays,
 health monitors
Devices embedded in cars, engines, roads, bridges, clothes,...
Huge amount of data for real-time analysis
Policies, grid economy, to maintain stability and efficiency
Organizational and societal structures, to bridge political
and social boundaries

of Increasing
Interest for Grids:
 Engineering, particle physics, astronomy, chemistry
 and materials, environmental science, bioscience and
genomics, education, digital libraries
 Societal challenges: network identity, safety, terrorism,
 global education, economic stability
 Healthcare: medical imaging, brain atlas, molecular
informatics, telemedicine
Grid Applications:

 On-demand & real-time: applications at your fingertips
 Adaptive & dynamic: run on best-suited system, adapting to
 dynamic changes of resources and performance variations
 Workflow: different components run on different resources
 Workbench: problem solving environments
 Collaborative Computing Frameworks: distributed people

 Far Future: “Throw any application at The Grid”
Grids Today              - Grids in 3 - 5 Years
Focus on Research          -   Focus on R&D and business
Compute-oriented           -   Petaflops linked w/ Petabytes
Proprietary interfaces     -   Standards: GGF, OGSA, DRMAA
“Mental Firewall”          -   Security, policies, identity
Difficult to build         -   Standards, services, solutions
Difficult to manage        -   Sun N1, IBM Autonomic, HP DC
Difficult to use           -   Grid Portals: transparent,
                               remote, secure
Many technologies          -   Globus Toolkit 3.x

                 Sun's Evolutionary Grid Strategy
     Expand existing technologies and integrate new
From Cluster Grids, To Enterprise Grids, To Global Grids, To THE GRID

Cluster Grid: Departmental Computing
• Simplest Grid deployment
• Maximum utilization of departmental resources
• Resources allocated based on priorities

Enterprise Grid: Enterprise Computing
• Resources shared within the enterprise
• Policies ensure computing on demand
• Gives multiple groups seamless access to enterprise resources

Global Grid: Internet Computing
• Resources shared over the Internet
• Global view of distributed datasets
• Growth path for enterprise Grids
       Sun Grid Services Environment
                     Web Interface
             Computing Portal & Portal Server

 Admin Tools for Systems, Services & Application: N1, Sun MC
 Development Tools: Sun ONE Developer Studio, Sun HPC Cluster Tools
 Global Grid Layer: Globus/Avaki/S1GE, GridXpert Synergy
 Web Services: SunONE

              Distributed Resource Management
                      Grid Engine Family
         Solaris/Linux/AIX/... Operating Environment
       Throughput and HPC Clusters, Enterprise Servers
                      Storage Systems
            Desktops and Information Appliances

       Sun Grid Computing Projects
- Sun Internal Projects
  - Develop Sun Grid Engine (SGE) and SGE Enterprise Edition
  - Develop Cluster and Enterprise Grid software stack
  - SGE/Jxta resource discovery, Grid Appliance
  - Testbed for 3rd party grid sw tools (GridXpert, TurboWorx,...)

- Sun Community Efforts
  - DRMAA, Distributed Resource Management Application API
  - Sun Grid Engine open source project

- Sun External Projects, with Partners
  - Sun Center of Excellence program, development from scratch
  - Integrating Sun Grid Engine with 3rd party software tools
  - Large funded Govt projects with SGE as technology testbed
          Grid Computing in Sun
- SW, Jonathan Schwartz: SGE, Integrations, GE Portal,
    N1, SunCluster, SunMC, Control Station, Jxta, Sun ONE
- VSP, Neil Knox: Grid solutions, Grid Racks, Low-Cost,
    Grid Reference Architectures
- ESP, Clark Masters: HPTC, SuperCluster
- Storage, Mark Canepa: DSP/Pirus, DataGrid
- Processors, David Yen: PNP Enterprise Grid (Ranch) + DReAM
- CTO, Greg Papadopoulos: Advanced grid technology +
     customer projects
- GSO, Robert Youngjohns: EDU, EDGE Computing,
     Grid Sales Programs
- Services, Pat Sueltz: Sun Grid PS, Grid Solutions,
     Ref.Architectures, Utility Computing

     SMI wide Grid Projects
Sun Virtual Grid Organization: HPTC, EDU, VSP, Storage, SW, GSO, PS, . . .
●   SGE & Sun HPC ClusterTools
●   SGE & Sun Management Center
●   SGE & Sun Cluster 3.0
●   SGE & Sun Grid Engine Portal
●   SGE & Sun ONE Portal Server
●   SGE & Sun ONE Studio
●   SGE & Sun Control Station
●   SGE & Sun Jumpstart/Quark/JET
●   SGE & Transfer Queue/MultiCluster
●   SGE & JXTA, Discovery
●   SGE & Jini
●   SGE & Solaris Resource Manager
●   SGE & N1 Provisioning Server
●   SGE & Pirus
●   ...
Core Technology: Sun Grid Engine Family
         Distributed Resource Management in Cluster & Enterprise Grids

●   Multi-platform, open source, standards
    - Today 7,000+ departmental, enterprise, global grids
●   Sun Grid Engine, SGE, free Web downloads for Solaris & Linux
     –    Identifies best-suited, least loaded resource for your work
     –    Queuing, prioritizing, scheduling
●   Sun Grid Engine, Enterprise Edition, free and $$$
     –    Policy-based, equitable sharing between groups & projects
     –    Alignment of resources with business goals
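
To make "identifies best-suited, least loaded resource for your work" concrete, here is a minimal sketch of that selection step. The `Host` fields, the matching rule (architecture plus a free slot), and the host names are assumptions for illustration; the actual Grid Engine scheduler considers many more attributes.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    arch: str         # e.g. "solaris" or "linux"
    load: float       # normalized load average, lower is better
    free_slots: int   # job slots still available on this host


def pick_host(hosts, required_arch):
    """Return the least loaded host that can run the job, or None."""
    candidates = [
        h for h in hosts if h.arch == required_arch and h.free_slots > 0
    ]
    return min(candidates, key=lambda h: h.load, default=None)


hosts = [
    Host("node1", "solaris", 0.9, 2),
    Host("node2", "linux", 0.1, 4),
    Host("node3", "solaris", 0.3, 1),
    Host("node4", "solaris", 0.05, 0),  # lowest load, but no free slot
]
chosen = pick_host(hosts, "solaris")
```

"Best-suited" is modeled by the filter and "least loaded" by the `min`; a job with no matching host simply stays queued.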

           Sun Grid Products Roadmap
   Grid:
      Sun Grid Engine 6.0, Avaki/SGE/Portal
      Scalability, analysis, monitoring, accounting,
      installation, administration, scheduler, standards

   Cluster & Enterprise:
      ClusterGrid SW Stack: Grid appliance, preload, software product

   Global Grid:
      Grid standards (OGSA), Globus GT3, research collaborations

   Timeline: FY03, FY04, FY05, FY06

Sun Proprietary – N1 Strategy Presentation v4.0 - 4th Feb 2003
         SGE/EE 6.0 Scalability
●   Database spooling of status data
     –   Status and accounting data (default/spool/...)
     –   Multiple databases supported
●   “Read-only daemon” alongside qmaster for client read
    requests (e.g. qstat, ...)
●   New communication system
     –   Threads replace commd
     –   Low level comm. based on “standards”
     –   Language independent (C/C++, Java, ...)
●   High-throughput scheduler
     –   Goal: Keep hosts busy (instead of balance load)
     –   Separate scheduling algorithm option
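
One possible reading of "keep hosts busy (instead of balance load)" is sketched below: a balancing policy compares hosts for every job, while a throughput policy fills any free slot immediately. Both functions, the slot counts, and the job names are assumptions for illustration and do not describe the actual SGE/EE 6.0 algorithms.

```python
def place_balanced(jobs, slots):
    """Load balancing: each job goes to the host with the most free slots."""
    free = dict(slots)
    placement = {}
    for job in jobs:
        host = max(free, key=free.get, default=None)
        if host is None or free[host] == 0:
            break  # cluster is full; remaining jobs stay queued
        free[host] -= 1
        placement[job] = host
    return placement


def place_throughput(jobs, slots):
    """Throughput: take the next free slot without comparing host loads."""
    free = dict(slots)
    placement = {}
    hosts = iter(free)
    host = next(hosts, None)
    for job in jobs:
        # Skip hosts that are already full; never revisit earlier hosts.
        while host is not None and free[host] == 0:
            host = next(hosts, None)
        if host is None:
            break  # no free slot anywhere
        free[host] -= 1
        placement[job] = host
    return placement


SLOTS = {"a": 2, "b": 2}      # hypothetical hosts and slot counts
JOBS = ["j1", "j2", "j3"]
balanced = place_balanced(JOBS, SLOTS)
packed = place_throughput(JOBS, SLOTS)
```

Both policies start all three jobs; the throughput variant does less work per scheduling decision, which matters when many short jobs arrive at once.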
SGE/EE 6.0     Analysis / Monitoring / Accounting
  ●   Database spooling of status data
  ●   Web-based tools for analysis,
      monitoring, accounting reports, etc.

SGE/EE 6.0    Ease of Installation & Administration
      Cluster Queues
          One queue config for arbitrary queue instances
          Flexible configuration (hostgroups)
      Automatic Install/Deinstall
          Hands-off install/deinstall procedures
          Integration with management tools (N1, SCS, ...)
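
The cluster queue idea ("one queue config for arbitrary queue instances", configured via hostgroups) can be sketched as a simple expansion step. The dictionary layout, the `@` prefix for host groups, the `queue@host` instance naming, and the host names are assumptions chosen for this example rather than the actual configuration format.

```python
# Hypothetical host groups; the "@" prefix marks a group rather than a host.
HOSTGROUPS = {
    "@solaris": ["host1", "host2"],
    "@linux": ["host3"],
}

# One cluster queue definition with a default and a per-group override.
CLUSTER_QUEUE = {
    "name": "batch.q",
    "hostlist": ["@solaris", "@linux"],
    "slots": {"default": 2, "@linux": 8},
}


def expand_cluster_queue(queue, hostgroups):
    """Expand one cluster queue definition into per-host queue instances."""
    instances = {}
    for entry in queue["hostlist"]:
        hosts = hostgroups[entry] if entry.startswith("@") else [entry]
        # A group-specific setting wins over the queue-wide default.
        slots = queue["slots"].get(entry, queue["slots"]["default"])
        for host in hosts:
            instances[f'{queue["name"]}@{host}'] = {"slots": slots}
    return instances


instances = expand_cluster_queue(CLUSTER_QUEUE, HOSTGROUPS)
```

Adding a host to a group then creates a new queue instance without touching the queue definition itself.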

    SGE/EE 6.0   Scheduler Functionality
●   Advance reservation
     –   Reservation / Preemption / Backfilling
     –   Powerful algorithms; 6.x to include
●   Throughput Scheduler
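
Reservation and backfilling can be illustrated with a deliberately conservative sketch: a wide job gets a reserved start time, and smaller jobs may run in the meantime only if they fit into the free slots and finish before that time. The function names, tuple layouts, and numbers are assumptions for this example, not the SGE/EE 6.0 implementation.

```python
def earliest_start(needed, free_slots, running):
    """Earliest time at which `needed` slots are free.

    running: list of (end_time, slots) for jobs currently executing.
    Returns None if the job can never fit.
    """
    start = 0
    for end_time, slots in sorted(running):
        if free_slots >= needed:
            break
        free_slots += slots   # slots released when this job ends
        start = end_time
    return start if free_slots >= needed else None


def backfill(free_slots, reservation_start, queue, now=0):
    """Start queued jobs that fit now and end before the reservation.

    queue: list of (name, slots, runtime) in priority order.
    """
    started = []
    for name, slots, runtime in queue:
        fits = slots <= free_slots
        done_in_time = now + runtime <= reservation_start
        if fits and done_in_time:
            free_slots -= slots
            started.append(name)
    return started


# 8-slot cluster: one running job holds 4 slots until t=10, 4 slots are free.
reservation = earliest_start(needed=8, free_slots=4, running=[(10, 4)])
queue = [("small", 2, 5), ("long", 2, 20), ("wide", 6, 3)]
started = backfill(free_slots=4, reservation_start=reservation, queue=queue)
```

Here "small" is backfilled, "long" would delay the reserved job, and "wide" does not fit; the reserved job still starts at its promised time.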

    SGE/EE 6.0   Standards Compliance
     DRMAA 1.0 Implementation
          Submitted to GGF
          C-binding (Java-binding to follow)
          Exact scope depends on funding level
          An opportunity for partners to contribute!
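
The job lifecycle that DRMAA standardizes (open a session, describe a job, submit it, wait for it, close the session) can be sketched with a toy stand-in. `LocalSession` below is hypothetical and simply runs jobs as local processes; it is not the DRMAA API. The comments name the corresponding calls and attributes of the DRMAA 1.0 C binding for orientation only.

```python
import subprocess
import sys


class LocalSession:
    """Toy stand-in for a DRM session; runs jobs as local processes."""

    def __init__(self):
        self._jobs = {}
        self._next_id = 1

    def run_job(self, template):
        """Submit the job described by the template; return a job id."""
        proc = subprocess.Popen(
            [template["remote_command"], *template["argv"]]
        )
        job_id = str(self._next_id)
        self._next_id += 1
        self._jobs[job_id] = proc
        return job_id

    def wait(self, job_id):
        """Block until the job finishes; return its exit status."""
        return self._jobs.pop(job_id).wait()

    def exit(self):
        """Close the session, reaping jobs that were never waited for."""
        for proc in self._jobs.values():
            proc.wait()
        self._jobs.clear()


session = LocalSession()                     # cf. drmaa_init
template = {                                 # cf. drmaa_allocate_job_template
    "remote_command": sys.executable,        # cf. drmaa_remote_command
    "argv": ["-c", "raise SystemExit(3)"],   # cf. drmaa_v_argv
}
job_id = session.run_job(template)           # cf. drmaa_run_job
exit_status = session.wait(job_id)           # cf. drmaa_wait
session.exit()                               # cf. drmaa_exit
```

The point of a standard API is that an application written against this lifecycle does not need to know which resource manager executes the job.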

       Cluster Grid SW Stack
●   Grid Engine
●   Control Station
●   HPC Cluster Tools
●   KickStart (Linux), JumpStart (Solaris)
●   WUI – Web User Interface
●   .....
  Remote Managed Grid
    Browser to Grid Appliance (Remote Grid Setup & Configuration)
    Grid Manager
    Internet: Download
    Workstation Clients (Linux or Solaris)
    Server Clients
    Linux Rack Server Clients

          Grid Engine Open Source

- 500,000 lines of source code
- Binaries: Compaq, HP, IBM, SGI,
  Linux for free, 1000s of downloads
- Workshop Regensburg 2002:
  30+ development partners
- Next SGE Workshop Sept'03
- Contributions to scheduler,
  broker, parallel, clients,
  Globus, GridEngine Portal,...

        Sun Grid Partner Projects, Examples
-   ICENI, Imperial College e-Science Networked Infrastructure, London
-   GRIDS, Grid Computing & Distributed Systems Lab, Melbourne
-   EZ-Grid, Sun Center of Excellence for Grid Computing, Houston
-   White Rose Grid, Universities of Leeds, Sheffield, York, UK
-   NCSV, Nanyang Center for Supercomp.& Visualization, Singapore
-   EPCC Edinburgh Sun Data & Compute Grid Project
-   HPCVL Canada, Secure innovative HPC/Grid environment
-   GridLab European Project for Grid Application Infrastructure
-   myGrid Infrastructure for an e-Biologist Workbench, Manchester
-   Sun Center of Excellence for BioInformatics, OSC Ohio
-   AIST Advanced Industrial Science & Technology Institute, Tokyo
-   ...

 UK e-Science Grid
$ 180 Mio in 3 years
for science and engineering

 Sun Grid Centers in UK:
 Edinburgh EPCC, Sun CoE HPC & Grid
 Cambridge, 2TeraFlops 10 SF15K
 Oxford, Computational Finance
 London IC, Sun CoE e-Science
 London UCL, Sun CoE Networks
 Manchester, MyGrid (BioGrid)
 Leeds, Sheffield, York: White Rose Grid
 Durham: Cosmology Engine Grid

 Map labels: DL, Newcastle, Manchester, Cambridge, RAL, Hinxton,
 Cardiff, London, Southampton
                    White Rose Grid

 Maxima   Snowdon      Pascali        Titania
            White Rose Grid, UK
- Campus Grid, SGEEE for local users, Globus for global users
- Univ. Leeds (Solaris/Linux), Sheffield (Solaris), York (Solaris)
- Web portal for access to local AND global resources
- SGEEE integrated with Sun HPC ClusterTools
- Grid portal development kit w/ Apache portal server
- Globus Toolkit with GRAM, MDS, GridFTP
- MyProxy for credential repository
- Globus/SGEEE integration:
   - GRAM: job submission and control through Globus
   - MDS: SGEEE information (e.g. queues) passed up to Globus
Peter Dew

             WRG Key Components
●   Globus Toolkit 2.0
Provides a secure means for inter-campus actions
    ●   Transferring jobs
    ●   Moving data
    ●   Gathering information about resources
●   Grid Engine Enterprise Edition
Manages the campus grid compute resources
    ●   Delivers a single interface for a heterogeneous grid
    ●   Guarantees a share of campus resource for grid and local
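
A guaranteed share can be approximated by always serving the group that is furthest below its entitlement. The sketch below shows that rule only; the share values, the usage figures, and the function name are assumptions for illustration and are not the Enterprise Edition policy engine or the White Rose Grid settings.

```python
def next_group(shares, usage, pending):
    """Pick the group with pending jobs that is furthest below its share.

    shares:  entitled fraction of the resource per group (sums to 1.0)
    usage:   resource consumed so far per group (any consistent unit)
    pending: number of queued jobs per group
    """
    total = sum(usage.values()) or 1  # avoid division by zero at start
    candidates = [g for g in shares if pending.get(g, 0) > 0]
    if not candidates:
        return None

    def used_vs_entitled(group):
        # A ratio below 1 means the group has had less than its share.
        return (usage.get(group, 0) / total) / shares[group]

    return min(candidates, key=used_vs_entitled)


SHARES = {"grid": 0.25, "local": 0.75}   # hypothetical split

first = next_group(SHARES, {"grid": 10, "local": 90}, {"grid": 3, "local": 3})
second = next_group(SHARES, {"grid": 50, "local": 50}, {"grid": 3, "local": 3})
```

In the first case grid users have had 10% against a 25% entitlement and are served next; in the second they are over their share, so local users go first.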
            WRG Key Components
●    Grid Portal Development Kit
Provides a portal interface into Globus Toolkit
     ●   Transferring jobs
     ●   Moving data
     ●   Gathering information about resources
●    MyProxy
    MyProxy provides a server with client-side utilities to store
    and retrieve delegated X.509 credentials via the Grid
    Security Infrastructure (GSI).
 WRG Architecture Overview


                                     White Rose Grid

 GT2.0    GT2.0             GT2.0           GT2.0

 GEEE     GEEE              GEEE            GEEE

Solaris   Linux            Solaris         Solaris

 Maxima   Snowdon             Pascali         Titania
Enterprise Edition Share Policies
 (Diagram: White Rose Grid nodes, each shown as a GT2.0 / GEEE / Solaris stack)
PROGRESS Architecture
   Grid Service Management System
   Portal + request description XRSL (XML/RSL)
   Globus
   Resources & Services

       Thank You !
