                                               NCSA
              Data and Computing Infrastructure for
             Scientific Inquiry and Decision Support


                                             Danny Powell
                                             Executive Director
         National Center for Supercomputing Applications
           University of Illinois at Urbana-Champaign

                                             CHPC National Meeting
                                               December 7, 2010

University of Illinois at Urbana-Champaign                  National Center for Supercomputing Applications
                  Outline
•   Basic info about NCSA

•   Blue Waters – PetaScale Computer

•   User (application) support role – changes over
    time

•   Examples of current scientific support role



Basic Facts about NCSA
• Applied Research Unit of the University of Illinois
    – Established in 1986 with funding from NSF and State of Illinois
    – Mission
        • Provide high-end computing resources to scientists and engineers
        • Develop software tools and software systems needed to make full use of
          advanced computing and data systems (Mosaic, NCSA HTTPd – the basis of the
          Apache web server, NCSA Telnet, D2K, MyProxy, numerous others…)
• Physical Facilities
    – NCSA Building: 140,000 sq.ft.
    – Advanced Computation Building: 14,000 sq.ft. (raised floor)
    – PetaScale Computing Building: 80,000 sq.ft. (20,000 sq.ft. raised floor)
• Computing Resources
    –   Mid-Range Supercomputing systems: 144 TF
    –   10+ Petaflop (1+ PF sustained) computer (IBM) – Fall 2011
    –   Archival storage system: 12 PB – growing to 500+ PB 2012
    –   Advanced visualization systems
• Staff (fte)
    – Technical/professional staff: 230+
    – Administrative/clerical: 25
    – Management: 8
Basic Facts about NCSA
•   Funding
     –   Federal Agencies, Industry, State of Illinois, Foundations, International sources
     –   Most projects are partnerships with others (88%)
           • Leveraging skills/resources of others
           • Goal to be viewed as the “Partner of Choice”
•   IACAT (Institute for Advanced Computing Applications and
    Technologies)
     –   Integrates applied research of NCSA with basic research teams
•   International Program
     –   25+ institutions from 14+ countries
     –   Faculty and student exchanges, joint projects, workshops, technology sharing
•   Industrial Program
     –   Nationally/internationally recognized for its level of functional interaction, technology
         transfer, and student engagement
     –   17+ companies (Fortune 50/100/500, smaller technology companies)




 NCSA Bridges Basic Research and Commercialization with Application

Product Life Cycle:
    Phase 0: Concept/Vision
    Phase 1: Feasibility
    Phase 2: Design/Development
    Phase 3: Prototyping
    Phase 4: Production/Deployment

Corresponding activities:
    Theoretical & Basic Research
      → Applied Prototyping & Development
      → Optimization & Robustification
      → Commercialization & Production (.com or .org)

NCSA bridges the gap between basic research (universities & labs) and
commercialization (private industry), applying technology to drive
economic development.
            Blue Waters
… enabling scientific discovery at the leading edge




                                   Blue Waters Project
              NSF's Strategy for Academic High-end Computing

[Chart: Science & Engineering Capability (logarithmic scale) vs. time,
2006–2011]
•   Track 1 System: UIUC/NCSA (≥1 PF sustained) – 10+ PF peak
•   Track 2 Systems: multiple systems, including UT/ORNL (~1 PF) and
    TACC (500+ TF)
•   Track 3 Systems: leading university HPC centers
    Blue Waters Project
    Input from Scientific Community
•   D. Baker, University of Washington – Protein structure refinement and determination
•   M. Campanelli, RIT – Computational relativity and gravitation
•   D. Ceperley, UIUC – Quantum Monte Carlo molecular dynamics
•   J. P. Draayer, LSU – Ab initio nuclear structure calculations
•   P. Fussell, Boeing – Aircraft design optimization
•   C. C. Goodrich – Space weather modeling
•   M. Gordon, T. Windus, Iowa State University – Electronic structure of molecules
•   S. Gottlieb, Indiana University – Lattice quantum chromodynamics
•   V. Govindaraju – Image processing and feature extraction
•   M. L. Klein, University of Pennsylvania – Biophysical and materials simulations
•   J. B. Klemp et al., NCAR – Weather forecasting/hurricane modeling
•   W. K. Liu, Northwestern University – Multiscale materials simulations
•   R. Luettich, University of North Carolina – Coastal circulation and storm surge modeling
•   M. Maxey, Brown University – Multiphase turbulent flow in channels
•   S. McKee, University of Michigan – Analysis of ATLAS data
•   M. L. Norman, UCSD – Simulations in astrophysics and cosmology
•   J. P. Ostriker, Princeton University – Virtual universe
•   J. P. Schaefer, LSST Corporation – Analysis of LSST datasets
•   P. Spentzouris, Fermilab – Design of new accelerators
•   W. M. Tang, Princeton University – Simulation of fine-scale plasma turbulence
•   A. W. Thomas, D. Richards, Jefferson Lab – Lattice QCD for hadronic and nuclear physics
•   J. Tromp, Caltech/Princeton – Global and regional seismic wave propagation
•   P. R. Woodward, University of Minnesota – Astrophysical fluid dynamics
    Blue Waters Project
    Attributes of Petascale Computing System
•    Maximize Core Performance
     … minimize number of cores needed for a given level of performance as
       well as lessen the impact of sections of code with limited scalability
•   Maximize S&E Application Scalability
     … low latency, high-bandwidth communications fabric
•   Solve Memory-intensive Problems
     … large amount of memory
     … low latency, high-bandwidth memory subsystem
•   Solve Data-intensive Problems
     … high-bandwidth I/O subsystem
     … large quantity of on-line disk, massive quantity of archival storage
•   Provide Reliable Operation
     … maximize system integration
     … mainframe reliability, availability, serviceability (RAS) technologies
 Integrated/Scalable System

Blue Waters will be one of the most powerful computers in the world
for scientific research when it comes on line in 2011.

Blue Waters (full system)
    ~10 PF peak, ~1 PF sustained
    >300,000 cores
    ~1.2 PB of memory
    >18 PB of disk storage
    500 PB of archival storage
    ≥100 Gbps connectivity

Blue Waters Building Block
    32 IH server nodes
    256 TF (peak), 32 TB memory, 128 TB/s memory bw
    4 storage systems (>500 TB)
    10 tape drive connections

IH Server Node
    8 QCMs (256 cores)
    8 TF (peak), 1 TB memory, 4 TB/s memory bw
    8 hub chips, power supplies, PCIe slots, fully water cooled

Quad-chip Module (QCM)
    4 Power7 chips
    1 TF (peak), 128 GB memory, 512 GB/s memory bw

Power7 Chip
    8 cores, 32 threads
    L1, L2, L3 cache (32 MB)
    Up to 256 GF (peak), 128 GB/s memory bw

Hub Chip
    1.128 TB/s bw

Blue Waters utilizes commercial components that will power other
systems, from servers to beyond Blue Waters.
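The component counts and peak rates in the hierarchy above compose
multiplicatively. A minimal arithmetic sketch (a cross-check in Python,
using only the counts quoted on the slide; the variable names are
illustrative, not part of any Blue Waters software) rolls them up:

```python
# Roll up peak performance through the Blue Waters hardware hierarchy,
# using the per-level counts quoted on the slide.

CORE_PEAK_GF = 32      # 256 GF per Power7 chip spread over 8 cores
CORES_PER_CHIP = 8     # Power7: 8 cores, 32 threads
CHIPS_PER_QCM = 4      # quad-chip module
QCMS_PER_NODE = 8      # IH server node
NODES_PER_BLOCK = 32   # Blue Waters building block

chip_peak_gf = CORE_PEAK_GF * CORES_PER_CHIP        # 256 GF per chip
qcm_peak_tf = chip_peak_gf * CHIPS_PER_QCM / 1000   # ~1 TF per QCM
node_peak_tf = qcm_peak_tf * QCMS_PER_NODE          # ~8 TF per node
block_peak_tf = node_peak_tf * NODES_PER_BLOCK      # ~256 TF per block

print(f"chip: {chip_peak_gf} GF, QCM: {qcm_peak_tf:.3f} TF, "
      f"node: {node_peak_tf:.2f} TF, block: {block_peak_tf:.1f} TF")
```

The roll-up reproduces the slide's figures (256 GF, ~1 TF, 8 TF,
256 TF), and roughly 40 such building blocks account for the quoted
~10 PF system peak.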
Integrated Complex Information Systems
•   Workflow
•   Data management
•   Compute resources
•   Software/Hardware optimization
•   Visualization tools and resources
•   Analytic tools
•   Collaborative environments
•   Resource sharing
•   Publishing support tools
•   …..
        Leading Edge Computational Science
                  is Changing…
Initially – single investigator support:
            code optimization
            porting to different architectures
            serial to parallel coding

Increasingly complex application solutions - more sophisticated users

Single investigator – to – multidisciplinary teams

Small investigative teams – to – communities of users

Local and national focus – to – global projects

New architectures – accelerators, cell architecture, heterogeneous
architectures, clouds, “ultra” scaling

Increasingly complex, integrated cyberinfrastructure

Computation-driven applications to data-driven applications
Advanced Information
     Systems for
 Data-Driven Science
… harnessing the power of the national/global
            cyberinfrastructure




Examples: Community Information
Management Infrastructure Projects
• Earthquake Engineering
     – Consequence based risk management for seismic events
•   Environmental Observatories
     – Ocean Observatories, Coupled Human/Natural Systems,
        BioDiversity
•   Atmospheric Modeling
     – Severe Weather Predictions, Regional Climate Modeling
•   Astronomy
     – Very large data transport, processing, and analysis pipelines
•   BioMedical Informatics
     – Multisource infectious disease surveillance and patient safety
•   Humanities/Social Science Research
     – Digital libraries, Text/Image analysis, social networks
•   Science Educational Support Systems
     – Teaching support and educational enhancement systems
                      Process for Building
        Integrated Application/Decision Support Systems

Inputs: Application Roadmaps, Technology Roadmaps, and User
Representatives' team participation.

Partners: TeraGrid, working groups, advisory committees, industrial
partners, international partners.

Process flow:
    Requirements Analysis & Specification
      → Development & System Integration
      → Prototype or Production Cyberenvironments
      → Situation Analysis
      → Knowledge and Decision Support

Supporting teams: the Cyberarchitecture Working Group and Integrated
Project Teams covering Portals & GUIs, Workflow Mgmt, S&E Applications,
Data Mining & Analysis, Visualization, Webservices, Collaboratories,
Middleware, and Security.
    Astronomy: The new era of great
    survey telescopes
• Processing requirements:
    – High-sensitivity, wide-field imaging
    – Demanding time- and frequency-domain non-imaging analysis and
      transient detection

• Data management and processing implications (e.g., for LSST):
    – Large O(10⁹)-object survey catalogs
    – High associated data rates (TB/s)
    – Compute processing rates (PF)
    – Petabyte and exabyte archives with sophisticated community
      access mechanisms

• Data rates are growing exponentially (e.g., for the SKA) and require
  fundamentally new approaches to data management and processing.
                Health Sciences
•   Infectious Disease Informatics
    – INDICATOR: outbreak detection, modeling, response
    – Endemic Disease project (data/model sharing, multiple
      countries – Africa, Central/South America, Asia)
•   Patient Safety
    – Algorithmic Medicine
    – Next Generation Medical Records
    – Personalized Medicine
•   Emergency Response
    – Multi-hospital resource coordination (Costa Rica)
Integrated Malaria Management
          Consortium




Using advanced information systems to help
              control malaria
                      Partnerships
•   National
    – Government support from 5 countries
    – Government-level discussions with 6 countries
    – Sub-government discussions with 14 countries
•   Disease
    –   Malaria
    –   Dengue Fever
    –   Chagas Disease
    –   HIV/AIDS
    –   Onchocerciasis (River Blindness)
              Weather Modeling
          Regional Climate Modeling
•   Earlier detection of severe weather systems
    (LEAD Project)
    – Multiple sensor data feeds (field, radar, satellite)
    – Coupling compute resources with data models
    – Concentrating resources for real-time, fine grid analysis
      of important weather patterns
    – Micro-climate analysis/prediction
•   Regional Climate Modeling
    – Compute and data intensive
    – Coupling different models
      Environmental Studies/Decision Support
• CLEANER/WATERS
  Collaborative Large-scale Engineering Analysis
  Network for Environmental Research


• Human-dominated, complex
  environmental systems, e.g.,
   – River basins
   – Coastal margins


• Ocean Observatories
  Infrastructure (OOI)
   – Influence on architectural design
   – System security, info visualization


• BioDiversity
   – Monitoring, identifying, cataloguing bird
     species in the field
NCSA Integrated Systems

[Screenshots: MAEviz (Memphis Test Bed), an earthquake consequence and
decision-support tool – seismic hazard maps (0.3g, 0.5g, 0.6g ground
motion), a consequence table, and a scheme-comparison chart plotting
life loss and dollar loss ($M) for No Action vs. mitigation Schemes #1
and #2 (rebuild, rehab, or no action per building type), at an
earthquake level of 5% PE in 50 years under equivalent cost analysis.]
                          Geospatial
             A Leading Element of Data Deluge

• About 80 percent of all data collected has a geographic component.
   That figure is steadily rising as the global knowledge economy
   continues to expand and location-awareness technologies (Google
   Earth, GPS, etc.) grow more powerful and popular.
    – http://www.boozallen.com/publications/leading-ideas/geospacerev-boozallen-
      pennstate/details/geospatial-technology-state-of-flux

    – http://www.apachecorp.com/explore/Browse_Archives/View_Article.aspx?Article.
      ItemID=982

    – http://fiscaloffice.summitoh.net/index.php/geographic-information

    – http://gis.oregon.gov/

    – http://www.esri.com/library/whitepapers/pdfs/sdwho.pdf



                                   Questions?


National Petascale Computing Facility
$72.5M, 25MW, LEED Gold+, 88,000 ft²




INDICator: An Infectious Disease Informatics CyberEnvironment

				