Grid Computing

					Grid Computing and
            Shawn Malhotra
  Monday, February 5th, 2007
Outline
   Background and definition
   Importance of middleware
   Globus Toolkit
   Sample Applications
What is Grid Computing?
   Computing model that leverages the power of many
    networked resources
   Not just CPUs
       Storage devices, special equipment (e.g. a telescope)
   Share resources across administrative domains
       Requires security features
       Different from traditional cluster computing
   Programmer sees a single ‘virtual computer’
   Web ↔ Information as Grid ↔ Computing Power
Why is Grid Computing Important?
   Helps solve computationally expensive problems
       Flexible enough to handle many small problems
   Share costly resources amongst institutions
       Federally funded research labs / academic institutions
   Make resources available to anybody
       Cost barrier is lowered
       ‘Pay as you go’ type service
       Increases overall bandwidth
Motivation for Middleware
   Need robust, efficient ways to pool resources
   Previous ‘ad-hoc’ methods not sufficient
   Need for standardization!
   Distributed Computing System (DCS)
       Developed at the University of California at Irvine
       Early 1970s
       Focus on CPU management
   Poor security solution
   Abandoned in the 1980s
Globus Toolkit
   Broader scope, more complete solution
       CPU Management
       Storage Management
       Monitoring Services
       More details to come …
   Most popular grid computing framework
   Implements several standards
Globus Toolkit - Overview
   Facilitates grid application development
       Open, extensible, flexible, high abstraction
Job Submission
   GRAM interface
       Grid Resource Allocation and Management
   Specify resource requirements and flow
   Uniform way to submit remote jobs
       Translate request for local resources
   Offers a variety of features
       Retrieve job status
       Send job signals (kill, start, restart)
   Uses Web services interface
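
As a rough illustration of the "uniform request, translated for local resources" idea above, the sketch below renders one scheduler-neutral job description in two local formats. The `JobRequest` class and the `to_pbs` / `to_condor` helpers are hypothetical names made up for this example; they are not part of GRAM or the Globus Toolkit, and the generated text is a simplified imitation of PBS and Condor submit formats.

```python
from dataclasses import dataclass, field


@dataclass
class JobRequest:
    """Scheduler-neutral job description (hypothetical, for illustration only)."""
    executable: str
    arguments: list = field(default_factory=list)
    cpus: int = 1
    wall_minutes: int = 60


def to_pbs(job: JobRequest) -> str:
    """Render the request as a simplified PBS-style batch script."""
    hours, minutes = divmod(job.wall_minutes, 60)
    command = " ".join([job.executable, *job.arguments])
    return "\n".join([
        "#!/bin/sh",
        f"#PBS -l nodes=1:ppn={job.cpus}",
        f"#PBS -l walltime={hours:02d}:{minutes:02d}:00",
        command,
    ])


def to_condor(job: JobRequest) -> str:
    """Render the same request as a simplified Condor-style submit description."""
    return "\n".join([
        f"executable = {job.executable}",
        f"arguments = {' '.join(job.arguments)}",
        f"request_cpus = {job.cpus}",
        "queue",
    ])


# One request, two local translations.
job = JobRequest("/bin/simulate", ["--steps", "1000"], cpus=4, wall_minutes=90)
pbs_script = to_pbs(job)
condor_submit = to_condor(job)
```

The user only ever writes the `JobRequest`; which translation is applied depends on the site that ends up running the job.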
Job Scheduling
   What happens after the job is submitted?
   Submitted to a scheduler
   Queues jobs, decides where/when to run
       Requirement matching, priority systems, etc.
   Abstracts resources from user
       Pool heterogeneous resources together
   Can have multiple layers of scheduling
       Local schedulers vs. Metaschedulers
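
The queueing, requirement matching and priority handling described above can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular grid scheduler works: the `Resource` and `Job` classes, the "lower number is more urgent" priority rule and the first-fit placement are all choices made for this example.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    free_cpus: int
    memory_gb: int


@dataclass
class Job:
    name: str
    priority: int   # assumption: lower number = more urgent
    cpus: int
    memory_gb: int


def schedule(jobs, resources):
    """Assign queued jobs to resources in priority order.

    Returns (placements, waiting): placements maps job name -> resource name,
    waiting lists the jobs that no resource can currently satisfy.
    """
    # The sequence number breaks priority ties by submission order and
    # keeps the heap from ever comparing Job objects directly.
    queue = [(job.priority, seq, job) for seq, job in enumerate(jobs)]
    heapq.heapify(queue)
    placements, waiting = {}, []
    while queue:
        _, _, job = heapq.heappop(queue)
        # Requirement matching: first resource with enough free CPUs and
        # enough memory (memory is only checked, not reserved, here).
        match = next(
            (r for r in resources
             if r.free_cpus >= job.cpus and r.memory_gb >= job.memory_gb),
            None,
        )
        if match is None:
            waiting.append(job.name)
            continue
        match.free_cpus -= job.cpus
        placements[job.name] = match.name
    return placements, waiting


resources = [Resource("desktop", 2, 4), Resource("cluster-node", 8, 32)]
jobs = [Job("render", 2, 4, 8), Job("urgent-fit", 1, 8, 16), Job("small", 3, 1, 1)]
placements, waiting = schedule(jobs, resources)
```

Here the urgent job takes the whole cluster node, the small job lands on the desktop, and `render` stays queued. A metascheduler could apply the same logic one level up, treating whole sites as the resources.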
Security
   Access to resources must be controlled
   Grid Security Infrastructure (GSI)
   Provides basic security constructs
       Certificate-based PKI system
       Supports single sign-on over the grid
       Supports delegation
   Access control left to individual services
       Infrastructure provides necessary info and control
   Uses Web services interface
Other Provided Modules
   Data management
       Facilitates file transfer, access to data stores
   Monitoring and discovery
       APIs to get status, subscribe to content
       Important since ‘grid’ is never down, only
   Collaboration tools
       Facilitates person-to-person collaboration
       Build web portals for chat, e-mail, etc.
Example Applications
   What can you build with such a toolkit?
   Applications range from the depths of the sea
    to the stars above!
       LOOKING    deep sea research
       Condor     batch computing infrastructure
       BIRN       medical resource pooling
       LEAD       meteorological data
       NVO        virtual observatory
Condor
   Workload management system
       Queuing, scheduling, prioritization, monitoring
   Pool desktops into batch system
       Use when idle, auto-detect when busy again
   ClassAd mechanism
       Novel way to match resources with requests
   Flocking
       Seamless combination of multiple networks
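
The matching idea behind ClassAds is that both sides advertise: a job and a machine each publish attributes plus requirements on the other party, and a match needs both sets of requirements to hold. The sketch below is loosely modelled on that idea using plain Python dictionaries and lambdas; it is not the ClassAd language, and all attribute names and values are invented for illustration.

```python
def matches(ad_a, ad_b):
    """Two ads match only if each one's requirements accept the other."""
    return ad_a["requirements"](ad_b) and ad_b["requirements"](ad_a)


def best_match(job_ad, machine_ads):
    """Return the compatible machine that the job ranks highest, or None."""
    candidates = [m for m in machine_ads if matches(job_ad, m)]
    return max(candidates, key=job_ad["rank"], default=None)


# Machine ads: attributes plus each owner's policy on which jobs may run.
machines = [
    {"name": "lab-pc", "os": "LINUX", "memory_mb": 2048, "idle": True,
     "requirements": lambda job: job["owner"] != "guest"},
    {"name": "office-pc", "os": "LINUX", "memory_mb": 4096, "idle": False,
     "requirements": lambda job: True},
    {"name": "server", "os": "LINUX", "memory_mb": 8192, "idle": True,
     "requirements": lambda job: job["image_size_mb"] <= 4096},
]

# Job ad: what the job needs, and how it ranks acceptable machines.
job = {
    "owner": "alice",
    "image_size_mb": 1024,
    "requirements": lambda m: (m["os"] == "LINUX" and m["idle"]
                               and m["memory_mb"] >= 1024),
    "rank": lambda m: m["memory_mb"],
}

chosen = best_match(job, machines)
```

The busy `office-pc` is rejected by the job's own requirements, and of the two remaining machines the job prefers the one with more memory.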

LOOKING
   Make tools / data related to oceanography
    available to all researchers
   ‘20,000 Terabits Beneath the Sea’
       Presented at iGrid2005
       Real-time high definition deep sea video
       Monitor active underwater volcanoes

BIRN
   Resource pooling
       Tools for research and
   Collaboration
       Common user interface
   Better hypotheses
       Use a distributed patient

LEAD
   Sharing meteorological resources
   Algorithm Development and Mining (ADaM)
       Works on observational data
       Provides analysis tools
   ARPS Data Assimilation System (ADAS)
       Provides visualization tools
   Earth Science Markup Language (ESML)
       Uniform way of expressing data
   Data Access Systems
       Allow uniform access to distributed data
NVO
   Expose the vast amount of astronomical data
    for all to use
       Telescopes will produce 7 petabytes per year by
   Standardized way of expressing data
       VOTable
   Creation of tools to produce required data
       ConeSearch
   Make accessing data like using real tools
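
A cone search returns the catalog sources that lie within a given angular radius of a position on the sky. The sketch below shows only that geometric filter, run locally on a tiny made-up catalog; the function names and the catalog entries are assumptions for illustration, and no real service or VOTable parsing is involved.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions (degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine form, which stays accurate for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cone_search(catalog, ra, dec, radius_deg):
    """Keep the sources within radius_deg of the position (ra, dec)."""
    return [src for src in catalog
            if angular_separation_deg(ra, dec, src["ra"], src["dec"]) <= radius_deg]


# Hypothetical catalog: right ascension and declination in degrees.
catalog = [
    {"id": "A", "ra": 10.00, "dec": 20.00},
    {"id": "B", "ra": 10.05, "dec": 20.02},
    {"id": "C", "ra": 12.00, "dec": 21.00},
]
hits = cone_search(catalog, ra=10.0, dec=20.0, radius_deg=0.1)
```

Sources A and B fall inside the 0.1 degree cone; C is roughly two degrees away and is filtered out.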
The WISDOM Project
   Analyze potential anti-malaria drugs
   Focus lab tests on promising compounds
   Uses up to 5000 computers in 27 countries
   Simulate drug interaction with malaria protein
       Test 80,000 drugs per hour, 140 million in total
   Shows the power of collaboration
       Many computers borrowed from particle physics
        simulator in the UK – GridPP
       Shared spare capacity
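
To put the WISDOM figures in perspective, a quick back-of-the-envelope calculation, assuming (purely for illustration) that the quoted hourly rate was sustained for the whole run:

```python
drugs_total = 140_000_000     # total compounds tested (from the slide)
drugs_per_hour = 80_000       # quoted test rate (from the slide)

# Assumption: the quoted rate holds for the entire run.
hours = drugs_total / drugs_per_hour   # 1750.0 hours
days = hours / 24                      # about 72.9 days
```

At that rate the full screen corresponds to about 1,750 hours, or roughly 73 days, of grid time.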
Grid Computing – The Future
   Currently the domain of ‘Big Science’
       Make it more mainstream for ‘Little Science’
       Technology is not the barrier
   Evolution of the standards
       Continued enhancement of the toolkit
   Better front-end design
       Promote peer-to-peer collaboration
   Security is still a challenge
Conclusions
   Grid computing is a powerful collaborative
    computing model
   Grid computing requires efficient, fully
    featured middleware to thrive
   Grid computing enables research and
    development that is not possible in isolation
The Need for Grid Solutions
   Grids are essential to sustain Moore’s Law as
    physical limitations will eventually limit what
    individual computing stations can achieve
   It will become less necessary as individual
    resources become more powerful since
    technology grows faster than the complexity
    of our research
The Corporate Barrier
   True grid computing will never be embraced
    by corporations due to security issues and
    sensitivity of data. This will limit the scope
    and power of the technology
   Much like Web 2.0 has caused a shift in
    corporate presence on the internet, a ‘Grid
    2.0’ will eventually force corporations to
    embrace this technology
Grid Middleware
   Middleware designed to manage a grid will
    eventually merge with software designed to
    handle multiple CPUs on one motherboard to
    form a common solution.
   Grid computing is far too different from multi-CPU
    processing to ever offer a common solution.
Expanding User Base
   Development of a good middleware solution
    that abstracts most details of the grid will
    bring grid computing to ‘Little Science’ and
    eventually individual users.
   The complexity of grid computing and lack of
    demand will prevent grid computing from ever
    becoming part of the mainstream.
