Indiana
           Kurt A. Seiffert

               April 2008
            Presentation Outline
• What is the TeraGrid?
• Indiana University’s data-centric computing
   – HPSS
   – Lustre
   – Data collections
• Science Gateways
• Bringing it all together
                   What is the TeraGrid?
•   An instrument (cyberinfrastructure) that delivers high-end IT resources -
    storage, computation, visualization, and data/service hosting - almost all of
    which are UNIX-based under the covers; some hidden by Web interfaces
     – A data storage and management facility: over 20 Petabytes of storage (disk
        and tape), over 100 scientific data collections
     – A computational facility - over 750 TFLOPS in parallel computing systems
        and growing
     – (Sometimes) an intuitive way to do very complex tasks, via Science
        Gateways, or get data via data services
•   A service: help desk and consulting, Advanced Support for TeraGrid
    Applications (ASTA), education and training events and resources
•   The largest individual cyberinfrastructure facility funded by the NSF, which
    supports the national science and engineering research community
•   Allocated via peer review (and without double jeopardy)
                                     February 24, 2012

TeraGrid: 11 Resource Partners, 1 Instrument
                       HPSS Configuration
[Diagram: two HPSS subsystems, IUB (Bloomington) and IUPUI, linked by a TCP/IP wide area network. At each site, users reach the HPSS movers over a research network, and the movers connect through an FC SAN to disk arrays and a tape library. The HPSS core servers appear on the IUB side.]
    What’s A Data Capacitor Really?
• 12 pairs Dell PowerEdge 2950
   – 2 x 3.0 GHz Dual Core Xeon
   – Myrinet 10G Ethernet
   – Dual port Qlogic 2432 HBA (4 x FC)
   – 2.6 Kernel (RHEL 4)
• 6 DDN S2A9550 Controllers
   – Over 2.4 GB/sec measured throughput each
   – 535 Terabytes of spinning SATA disk
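As a rough check on the figures above, the following Python sketch multiplies out the per-controller throughput. It assumes all six controllers can be driven at their measured rate at the same time and uses decimal units (1 TB = 1000 GB); neither assumption is stated on the slide, and the function names are illustrative only.

```python
# Back-of-the-envelope arithmetic for the Data Capacitor figures on the slide.
# Assumption (not from the slide): all controllers run at full rate at once.

CONTROLLERS = 6                    # DDN S2A9550 controllers
GB_PER_SEC_PER_CONTROLLER = 2.4    # "over 2.4 GB/sec measured throughput each"
TOTAL_DISK_TB = 535                # spinning SATA disk


def aggregate_throughput_gb_per_sec(controllers=CONTROLLERS,
                                    per_controller=GB_PER_SEC_PER_CONTROLLER):
    """Lower bound on combined throughput if every controller is saturated."""
    return controllers * per_controller


def hours_to_stream_all(total_tb=TOTAL_DISK_TB, gb_per_sec=None):
    """Time to read the full capacity once at the aggregate rate."""
    if gb_per_sec is None:
        gb_per_sec = aggregate_throughput_gb_per_sec()
    if gb_per_sec <= 0:
        raise ValueError("throughput must be positive")
    total_gb = total_tb * 1000     # decimal terabytes to gigabytes
    return total_gb / gb_per_sec / 3600


if __name__ == "__main__":
    print(f"Aggregate: {aggregate_throughput_gb_per_sec():.1f} GB/s")
    print(f"Full read: {hours_to_stream_all():.1f} hours")
```

Under these assumptions the combined rate is at least 14.4 GB/s, which is more than a single 10 Gigabit link (1.25 GB/s) can carry.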
             Bandwidth Challenge
• Annual Event at SC Conference in November
   – This year’s venue - Reno, Nevada
• This Year’s Theme - “Serving as a Model”
   – Can others do what you’re doing?
• Criteria for Judging
   – Did you fill a single 10 Gigabit connection?
   – How are you supporting science?
   – Did you use your production network?
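The first judging criterion comes down to simple arithmetic. The sketch below converts a 10 Gigabit link to bytes per second and reports how full the link is for a given measured rate. The measured rate in the usage example is a made-up illustration, not a result from the challenge, and the function names are hypothetical.

```python
# Link-utilization arithmetic for a single 10 Gigabit connection.
# The measured rate used in the example at the bottom is illustrative only.

LINK_GBIT_PER_SEC = 10.0


def link_capacity_gbyte_per_sec(link_gbit=LINK_GBIT_PER_SEC):
    """Convert link speed from gigabits to gigabytes per second."""
    return link_gbit / 8


def utilization(measured_gbit, link_gbit=LINK_GBIT_PER_SEC):
    """Fraction of the link in use (1.0 means the link is full)."""
    if link_gbit <= 0:
        raise ValueError("link speed must be positive")
    return measured_gbit / link_gbit


def seconds_to_move(terabytes, gbit_per_sec):
    """Time to move a data set of the given size at a sustained rate."""
    if gbit_per_sec <= 0:
        raise ValueError("rate must be positive")
    bits = terabytes * 1e12 * 8    # decimal terabytes to bits
    return bits / (gbit_per_sec * 1e9)


if __name__ == "__main__":
    example_rate = 8.0             # hypothetical sustained rate in Gbit/s
    print(f"Capacity: {link_capacity_gbyte_per_sec():.2f} GB/s")
    print(f"Utilization: {utilization(example_rate):.0%}")
    print(f"1 TB takes {seconds_to_move(1, example_rate):.0f} s")
```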
              The Challenge:
     Five Applications Simultaneously
• Acquisition and Visualization
  – Live Instrument Data
     • Chemistry
  – Rare Archival Material
     • Humanities
• Acquisition, Analysis, and Visualization
  – Trace Data
     • Computer Science
  – Simulation Data
     • Life Science
     • High Energy Physics
Bandwidth Challenge Configuration
             Digitization of “Sarvamoola Granthas”
• Sarvamoola Granthas – teachings of Shri Madhvacharya (1238-1317), a great Indian philosopher and proponent of Dvaita
• Sarvamoola Granthas is a collection of works with commentaries on various important scriptures such as the Vedas, Upanishads, Itihasas, Puranas, Tantras and Prakaranas
• All of the original manuscripts of the Sarvamoola Granthas were incised on palm leaves
• Mathas or Monasteries
   – Keepers of Palm Leaf Manuscripts
[Image: Shri Madhvacharya]
             Digitization of
         “Sarvamoola Granthas”
             Post processed images of the palm leaves

Sample images of the palm leaf of Sarvamoola Granthas illustrating the performance of the image processing algorithms: (a) stitched 8-bit grayscale image without normalization and contrast enhancement; (b) final image after contrast enhancement.
MutDB
               Science Gateways
• A Science Gateway is a domain-specific computing
  environment, typically accessed via the Web, that
  provides a scientific community with end-to-end support
  for a particular scientific workflow
• Science Gateways are distinguished from Web portals
  in that portals
  “present information from diverse sources in a unified way”
• Hides complexity (pay no attention to the grid behind the curtain)
        LEAD

•   Simple enough an undergraduate can use it!
•   National Center for Supercomputing Applications (NCSA) and IU teamed up to
    support WxChallenge weather forecast competition. 64 teams, 1000 students,
    ~16,000 CPU hours on Big Red
Purdue’s NanoHUB
But you don’t care - TeraGrid
[Diagram: Resource Partners RP 1, RP 2, and RP 3 sit on top of the shared TeraGrid Infrastructure (Accounting, Network, …), which exposes Compute, Viz, and Data Services.]
•   IU’s involvement as a TeraGrid Resource Partner is supported in part by the National Science Foundation under Grants No.
    ACI-0338618, OCI-0451237, OCI-0535258, and OCI-0504075.
•   The IU Data Capacitor is supported in part by the National Science Foundation under Grant No. CNS-0521433.
•   The Grid Infrastructure Group management of the TeraGrid, and Dane Skow's leadership thereof, is funded by NSF grant
•   Purdue’s involvement as a TeraGrid Resource Partner is supported in part by the National Science Foundation under Grant
    No. OCI-050399.
•   This research was supported in part by the Pervasive Technology Labs and the Indiana METACyt Initiative. Both Indiana
    University initiatives are supported by the Lilly Endowment, Inc.
•   This work was supported in part by Shared University Research grants from IBM, Inc. to Indiana University.
•   The LEAD portal is developed under the leadership of IU Professors Dr. Dennis Gannon and Dr. Beth Plale, and supported by
    NSF grant 331480. Marcus Christie and Suresh Marru of the Extreme! Computing Lab contributed the LEAD graphics.
•   The ChemBioGrid Portal is developed under the leadership of IU Professor Dr. Geoffrey C. Fox and Dr. Marlon Pierce and
    funded via the Pervasive Technology Labs (supported by the Lilly Endowment, Inc.) and the National Institutes of Health grant
    P20 HG003894-01.
•   Many of the ideas presented in this talk were developed under a Fulbright Senior Scholar’s award to Stewart, funded by the
    US Department of State and the Technische Universitaet Dresden.
•   Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not
    necessarily reflect the views of the National Science Foundation (NSF), National Institutes of Health (NIH), Lilly Endowment,
    Inc., or any other funding agency.
•   This work is made possible by the dedicated efforts of the expert staff of the Research Technologies Division of University
    Information Technology Services, the faculty and staff of the Pervasive Technology Labs, and the staff of UITS generally.
    Steve Simms, Erik Cornet, Mike Lowe, Scott Tiege, Michael Grobe, and Malinda Lingwall helped with this presentation.
•   Thanks to the faculty and staff with whom we collaborate locally at IU and globally (within the US via the TeraGrid, and
    internationally via collaboration with Technische Universitaet Dresden).
