									End to End Computing at ORNL

                                          Presented by
                                 Scott A. Klasky
      Computing and Computational Science Directorate
                    Center for Computational Sciences
    Petascale data workspace

2 Klasky_E2E_0611
    Impact areas of R&D
    Asynchronous I/O using data-in-transit (NxM) techniques
     (M. Parashar, Rutgers; M. Wolf, GT)
    Workflow automation
     (SDM Center; A. Shoshani, LBNL; N. Podhorszki, UC Davis)
    Dashboard front end
     (R. Barreto, ORNL; M. Vouk, NCSU)
    High-performance metadata-rich I/O
     (C. Jin, ORNL; M. Parashar, Rutgers)
    Logistical networking integrated into data analysis/visualization
     (M. Beck, UTK)
    CAFÉ common data model
     (L. Pouchard, ORNL; M. Vouk, NCSU)
    Real-time monitoring of simulations
     (S. Ethier, PPPL; J. Cummings, Caltech;
     Z. Lin, UC Irvine; S. Ku, NYU)
            Visualization in real-time monitor
            Real-time data analysis
   Hardware architecture
    [Diagram labels: restart files; Jaguar compute nodes; Ewok nodes;
     simulation control; job control; 40 G/s; sockets later; Lustre2; NFS; DB]

    Asynchronous petascale I/O
    for data in transit
     • High-performance I/O
        – Asynchronous
        – Managed buffers
        – Respects the firewall
     • Enable dynamic control with flexible MxN operations
        – Transform using the Seine framework

     [Diagram labels, top to bottom:
        User applications
        Seine coupling framework interface
        Shared space; Load balancing; Other program
        Directory layer; Storage layer
        Communication layer (buffer management)
        Operating system]
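
     The MxN operations named above move data that is decomposed over M
     sender processes onto N receiver processes. The following is a minimal
     sketch, assuming a 1-D array split into contiguous blocks. The function
     names block_bounds and mxn_plan are hypothetical and are not part of the
     Seine interface; the sketch only computes which index ranges would have
     to move, not the communication itself.

```python
def block_bounds(total, parts, rank):
    """Return the half-open index range [start, stop) owned by `rank`
    when `total` elements are split into `parts` contiguous blocks."""
    base, extra = divmod(total, parts)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def mxn_plan(total, m_senders, n_receivers):
    """List every (sender, receiver, start, stop) transfer needed to move
    a block-decomposed array from M senders to N receivers."""
    plan = []
    for s in range(m_senders):
        s_start, s_stop = block_bounds(total, m_senders, s)
        for r in range(n_receivers):
            r_start, r_stop = block_bounds(total, n_receivers, r)
            lo, hi = max(s_start, r_start), min(s_stop, r_stop)
            if lo < hi:  # the two blocks overlap, so data must move
                plan.append((s, r, lo, hi))
    return plan


if __name__ == "__main__":
    # Illustrative sizes only: 12 elements, 4 senders, 3 receivers.
    for sender, receiver, lo, hi in mxn_plan(12, 4, 3):
        print(f"sender {sender} -> receiver {receiver}: elements [{lo}, {hi})")
```

     Each sender only talks to the receivers whose blocks overlap its own,
     so the number of messages grows roughly with M + N rather than M * N
     for this kind of decomposition.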

    Workflow automation

                    • Automate the data processing pipeline using the
                      Kepler workflow system, including transfer of
                      simulation output to the e2e system, execution of
                      conversion routines, image creation, and archival
                    • Requirements for petascale
                       – Easy to use
                       – Dashboard front-end
                       – Autonomic
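
     The pipeline described above can be pictured as an ordered chain of
     stages. The sketch below is plain Python, not Kepler; every stage
     function and the run_pipeline helper are hypothetical placeholders
     that only record what a real stage would do.

```python
from typing import Callable, Dict, List

Stage = Callable[[Dict], Dict]


def transfer(state: Dict) -> Dict:
    # Placeholder: a real stage would copy simulation output to the e2e system.
    return {**state, "location": "e2e"}


def convert(state: Dict) -> Dict:
    # Placeholder: a real stage would run the conversion routines.
    return {**state, "format": "converted"}


def make_images(state: Dict) -> Dict:
    # Placeholder: a real stage would render images from the converted data.
    return {**state, "images": [f"{state['name']}.png"]}


def archive(state: Dict) -> Dict:
    # Placeholder: a real stage would send the results to archival storage.
    return {**state, "archived": True}


def run_pipeline(state: Dict, stages: List[Stage]) -> Dict:
    """Run the stages in order and record which ones completed."""
    done = []
    for stage in stages:
        state = stage(state)
        done.append(stage.__name__)
    return {**state, "completed": done}


if __name__ == "__main__":
    result = run_pipeline({"name": "timestep_0001"},
                          [transfer, convert, make_images, archive])
    print(result["completed"])
```

     Recording the completed stages gives a dashboard something to display
     and gives an automated system a point to resume from after a failure.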

    Dashboard front end
     • Desktop interface uses Asynchronous JavaScript and XML (AJAX);
       runs in a web browser
     • AJAX is a combination of technologies coming together in
       powerful ways (including XMLHttpRequest and JavaScript)
     • The user’s interaction with the application happens
       asynchronously, independent of communication with the server
     • Users can ... the page ... clearing it

    High-performance metadata-rich I/O
         Two-step process to produce files:
         Step 1:
            • Write out binary data + tags using parallel I/O on the XT3
              (may or may not use files; could use asynchronous I/O methods)
            • The tags contain the metadata that is placed inside the files
            • Workflow transfers this information to Ewok
              (InfiniBand cluster with 160 processors)

         Step 2:
            • Service on Ewok decodes the files into HDF5 files and places
              the metadata into an XML file (one XML file for all of the data)

         Cuts I/O overhead in GTC from 25% to <3%
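
         The two steps above can be sketched with a toy record format.
         Everything here is assumed for illustration: the record layout
         (length-prefixed tag followed by a counted run of 64-bit floats),
         the function names, and the XML element names are hypothetical and
         do not describe the actual format used on the XT3. The sketch stops
         at decoded arrays plus an XML metadata document; writing HDF5 would
         need a third-party library and is left out.

```python
import io
import struct
import xml.etree.ElementTree as ET


def write_tagged(stream, tag, values):
    """Step 1: append one record as <tag length><tag><count><float64 data>."""
    name = tag.encode("utf-8")
    stream.write(struct.pack("<I", len(name)))
    stream.write(name)
    stream.write(struct.pack("<I", len(values)))
    stream.write(struct.pack(f"<{len(values)}d", *values))


def decode_tagged(stream):
    """Step 2: read the records back and collect their metadata as XML."""
    arrays = {}
    root = ET.Element("metadata")
    while True:
        header = stream.read(4)
        if len(header) < 4:  # end of stream
            break
        (name_len,) = struct.unpack("<I", header)
        tag = stream.read(name_len).decode("utf-8")
        (count,) = struct.unpack("<I", stream.read(4))
        values = list(struct.unpack(f"<{count}d", stream.read(8 * count)))
        arrays[tag] = values
        ET.SubElement(root, "variable", name=tag, count=str(count))
    return arrays, ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    buffer = io.BytesIO()  # stands in for the file or in-transit buffer
    write_tagged(buffer, "density", [1.0, 2.5, 4.0])
    write_tagged(buffer, "potential", [0.25, 0.5])
    buffer.seek(0)
    arrays, xml_text = decode_tagged(buffer)
    print(arrays)
    print(xml_text)
```

         Keeping the writer this simple is the point of the split: the
         compute side only appends bytes, and the self-describing output
         is built later on the analysis cluster.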
    Logistical networking:
    High-performance ubiquitous and
    transparent data access over the WAN


    [Diagram: Jaguar (Cray XT3) and the Ewok cluster]


    CAFÉ common data model
    Combustion (S3D), Astrophysics (Chimera),
    Fusion (GTC/XGC) Environment
    Stores and organizes four types of information about a given run:
       • Provenance
       • Operational profile
       • Hardware mapping
       • Analysis metadata

    [Diagram labels: Provenance; CAFÉ model]

  A scientist can seamlessly find the input and output variables
  of a given run, along with the unit, average, minimum, and
  maximum values for a variable
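
  One way to picture the four kinds of run information and the
  per-variable lookups described above is a small record type. The class
  names, field names, and sample values below are hypothetical
  illustrations, not the CAFÉ schema.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List


@dataclass
class Variable:
    name: str
    unit: str
    role: str            # "input" or "output"
    values: List[float]

    def summary(self) -> Dict[str, float]:
        """Average, minimum, and maximum of the stored values."""
        return {"average": mean(self.values),
                "min": min(self.values),
                "max": max(self.values)}


@dataclass
class RunRecord:
    # The four categories of information named on the slide.
    provenance: Dict[str, str] = field(default_factory=dict)
    operational_profile: Dict[str, str] = field(default_factory=dict)
    hardware_mapping: Dict[str, str] = field(default_factory=dict)
    analysis_metadata: Dict[str, Variable] = field(default_factory=dict)

    def add_variable(self, var: Variable) -> None:
        self.analysis_metadata[var.name] = var

    def variables(self, role: str) -> List[str]:
        """Names of all variables with the given role."""
        return [v.name for v in self.analysis_metadata.values()
                if v.role == role]


if __name__ == "__main__":
    run = RunRecord(provenance={"code": "GTC"})
    # Made-up variable and values, for illustration only.
    run.add_variable(Variable("temperature", "keV", "output", [1.0, 2.0, 3.0]))
    print(run.variables("output"))
    print(run.analysis_metadata["temperature"].summary())
```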
    Real-time monitoring of simulations
     • Scientists need tools to let them see and manipulate their
       simulation data quickly
        – Archiving data, staging it to secondary or tertiary computing
          resources for annotation and visualization, staging it yet
          again to a web portal...
        – These work, but we can accelerate the rate of insight for some
          scientists by allowing them to observe data during a run
     • The per-hour facilities cost of running a leadership-class
       machine is staggering
     • Computational scientists should have tools to allow them to
       constantly observe and adjust their runs during their scheduled
       time slice, as astronomers at an observatory or physicists at
       a beam source can

     • Using end-to-end technology, we have created a monitoring
       technique for fusion codes (XGC, GTC)
        – Kepler is used to automate the steps. The data movement task
          will use the Rutgers/GT data-in-transit routines
     • Metadata are added during the movement from the XT3 to the IB
       cluster. Data move between NxM processors using the Seine framework
     • Data are archived into the High-Performance Storage System (HPSS);
       metadata are placed in a database
     • Visualization and analysis services produce data/information that
       are placed on the AJAX front end
     • Data are replicated from Ewok to other sites using logistical
       networking
        – Users access the files from our metadata server at ORNL
    Real-time monitoring
    Typical monitoring
       • Look at volume-averaged quantities
       • At four key times, this quantity looks good
       • Code had one error that didn’t appear in the typical ASCII
         output used to generate this graph
       • Typically, users run gnuplot/grace to monitor

    More advanced monitoring
       • In 5 seconds, move 300 MB and process the data
       • Need to use FFT for 3-D data, and then process data + particles
          – In 50 seconds (10 time steps), move and process the data
          – 8 GB for 1/100 of the 30 billion particles
       • Demand low overhead: <3%!
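
    The figures above imply sustained transfer rates that can be checked
    with simple arithmetic. The sketch assumes decimal units (1 MB = 10^6
    bytes, 1 GB = 10^9 bytes) and that each time window is fully available
    for the transfer. Reading the 8 GB figure as the amount moved in each
    50-second window is an interpretation of the slide, and the helper name
    is hypothetical.

```python
MB = 10**6  # assumed decimal megabyte
GB = 10**9  # assumed decimal gigabyte


def required_rate(num_bytes, seconds):
    """Minimum sustained rate in bytes per second to move the data in time."""
    return num_bytes / seconds


# 300 MB every 5 seconds.
field_rate = required_rate(300 * MB, 5)

# 8 GB every 50 seconds (10 time steps), under the interpretation above.
particle_rate = required_rate(8 * GB, 50)

# 1/100 of 30 billion particles, and the bytes available per sampled particle.
sampled_particles = 30 * 10**9 // 100
bytes_per_particle = 8 * GB / sampled_particles

if __name__ == "__main__":
    print(f"field data:    {field_rate / MB:.0f} MB/s")
    print(f"particle data: {particle_rate / MB:.0f} MB/s")
    print(f"sampled particles: {sampled_particles:,}")
    print(f"bytes per sampled particle: {bytes_per_particle:.1f}")
```

    Under these assumptions the field data needs about 60 MB/s and the
    particle sample about 160 MB/s, both sustained while the simulation
    keeps its I/O overhead under the 3% target.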


  Scott A. Klasky
  Lead, End-to-End Solutions
  Center for Computational Sciences
  (865) 241-9980
