

The Grid and Meteorology
   Ian Foster
 Argonne National Lab
 University of Chicago
    Globus Project
                                        Image Credit: Electronic Visualization Lab, UIC

          Meteorology and HPN Workshop, APAN 2003, Busan, August 26, 2003
        The Grid: why and what
         – Global knowledge communities
         – Resource sharing technologies
         – Open standards and software
        The Grid and meteorology
         – Opportunities
         – Espresso interface
         – Earth System Grid project
                It’s Easy to Forget
         How Different 2003 is From 1993
        Enormous quantities of data: Petabytes
          – For an increasing number of communities,
            gating step is not collection but analysis
        Ubiquitous Internet: 100+ million hosts
          – Collaboration & resource sharing the norm
        Ultra-high-speed networks: 10+ Gb/s
          – Global optical networks
        Huge quantities of computing: 100+ Top/s
          – Moore’s law gives us all supercomputers
         Consequence: The Emergence of
         Global Knowledge Communities
        Teams organized around common goals
         – Communities: “Virtual organizations”
        With diverse membership & capabilities
         – Heterogeneity is a strength not a weakness
        And geographic and political distribution
         – No location/organization possesses all
           required skills and resources
        Must adapt as a function of the situation
         – Adjust membership, reallocate
           responsibilities, renegotiate resources
                        For Example:
                     High Energy Physics
             Grid Technologies
         Address Key Requirements
        Infrastructure (“middleware”) for
         establishing, managing, and evolving
         multi-organizational federations
         – Dynamic, autonomous, domain independent
         – On-demand, ubiquitous access to
           computing, data, and services
        Mechanisms for creating and managing
         workflow within such federations
         – New capabilities constructed dynamically
           and transparently from distributed services
         – Service-oriented, virtualization
        The Grid World: Current Status
      Substantial number of Grid success stories
       – Major projects in science
       – Emerging infrastructure deployments
       – Growing number of commercial deployments
      Open source Globus Toolkit® a de facto
       standard for major protocols & services
       – Simple protocols & APIs for authentication,
         discovery, access, etc.: infrastructure
       – Large user and developer base
       – Multiple commercial support providers
      Global Grid Forum: community & standards
      Emerging Open Grid Services Architecture
               What We Can Do Today
      A core set of Grid capabilities are available and
       distributed in good quality form, e.g.
       – Globus Toolkit: security, discovery, access, data
         movement, etc.
       – Condor: scheduling, workflow management
       – Virtual Data Toolkit, NMI, EDG, etc.
      Deployed at moderate scales
       – WorldGrid, TeraGrid, NEESgrid, DOE SG, EDG, …
      Usable with some hand holding, e.g.
       – US-CMS event prod.: O(6) sites, 2 months
       – NEESgrid: earthquake engineering experiment
      NEESgrid Earthquake Engineering

      (Image: experiment site at U. Nevada Reno)
         CMS Event Simulation Production
        Production Run on the Integration Testbed
          – Simulate 1.5 million full CMS events for physics
            studies: ~500 sec per event on 850 MHz processor
          – 2 months continuous running across 5 testbed sites
          – Managed by a single person at the US-CMS Tier 1
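The figures on this slide imply a sustained processor count that can be checked directly. A minimal sketch: the event count, per-event cost, and two-month run length come from the slide; the resulting concurrency figure is derived, not stated in the original.

```python
# Back-of-envelope capacity check for the US-CMS production run above.
events = 1_500_000          # full CMS events simulated
sec_per_event = 500         # ~500 s per event on an 850 MHz processor
run_seconds = 60 * 86_400   # ~2 months of continuous running

cpu_seconds = events * sec_per_event
concurrent_cpus = cpu_seconds / run_seconds

print(f"{cpu_seconds:.2e} CPU-seconds total")              # 7.50e+08
print(f"~{concurrent_cpus:.0f} sustained 850 MHz CPUs")    # ~145
```

Roughly 145 processors kept busy for two months across the five testbed sites, coordinated by one operator.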
                Key Areas of Concern
        Integration with site operational procedures
         – Many challenging issues
        Scalability in multiple dimensions
         – Number of sites, resources, users, tasks
        Higher-level services in multiple areas
         – Virtual data, policy, collaboration
        Integration with end-user science tools
         – Science desktops
        Coordination of international contributions
        Integration with commercial technologies
        The Grid: why and what
         – Global knowledge communities
         – Resource sharing technologies
         – Open standards and software
        The Grid and meteorology
         – Opportunities
         – Espresso interface
         – Earth System Grid project
            The Grid and Meteorology:
        Inter-personal collaboration
         – E.g., Access Grid, CHEF
        On-demand access to simulation models
         – E.g., Espresso
        Access to, and integration of, data sources
         – E.g., Earth System Grid
        Dynamic, virtual computing resources
         – “Metacomputing”
        Integration of all of the above
         – Collaborative, computationally intensive
           analysis of large quantities of online data
           Espresso Modeling Interface
          (Michael Dvorak, John Taylor)
        “Meteorology on demand”
             Earth System Grid (ESG)

Goal: address obstacles to the sharing & analysis of data from earth system models
                     ESG: Strategies
        Move data a minimal amount, keep it close to
         point of origin when possible
         – Data access protocols, distributed analysis
        When we must move data, do it fast and with
         minimum human intervention
         – Storage Resource Management, fast networks
        Keep track of what we have, particularly
         what’s on deep storage
         – Metadata and Replica Catalogs
        Harness a federation of sites, web portals
         – GT -> Earth System Grid -> UltraDataGrid
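The "keep track of what we have" strategy above rests on a replica catalog: a mapping from logical file names to the physical copies held on disk, deep storage, and mirror sites. The real service (the Globus Replica Location Service) is distributed; the sketch below is a single-process toy with hypothetical host names, shown only to illustrate the mapping.

```python
# Toy replica catalog: logical file name -> set of physical locations.
# Host names and paths below are illustrative, not real ESG endpoints.
from collections import defaultdict


class ReplicaCatalog:
    def __init__(self):
        self._replicas = defaultdict(set)  # logical name -> physical URLs

    def register(self, lfn: str, pfn: str) -> None:
        """Record that a physical copy of `lfn` exists at `pfn`."""
        self._replicas[lfn].add(pfn)

    def unregister(self, lfn: str, pfn: str) -> None:
        """Forget one physical copy (e.g. purged from deep storage)."""
        self._replicas[lfn].discard(pfn)

    def lookup(self, lfn: str) -> set:
        """All known physical locations of a logical file."""
        return set(self._replicas.get(lfn, set()))


catalog = ReplicaCatalog()
catalog.register("ccsm/tas_monthly.nc",
                 "gsiftp://mirror.anl.example.org/esg/tas_monthly.nc")
catalog.register("ccsm/tas_monthly.nc",
                 "hpss://deep.llnl.example.org/esg/tas_monthly.nc")
print(catalog.lookup("ccsm/tas_monthly.nc"))
```

A client picks the cheapest replica (disk before tape, nearby before remote), which is exactly why "move data a minimal amount" and "know what's on deep storage" go together.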
           Distributed Data Access Protocols
        (ESG adds transparency, security, and (processing))
         – Typical application: application → netCDF library → local data
         – Distributed application: application → OPeNDAP client → OPeNDAP via http → OPeNDAP server → remote data
         – ESG: application → ESG client → OPeNDAP via Grid (ESG + DODS) → ESG server → remote big data
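In the OPeNDAP/DODS stacks above, a client fetches only the subset it needs by appending a constraint expression to the dataset URL, so a small hyperslab of a large remote array crosses the network rather than the whole file. A minimal sketch of constraint-URL construction; the server host and variable name are hypothetical:

```python
# Build an OPeNDAP data-request URL with an index-range constraint.
# Only the selected hyperslab of the remote variable is transferred.
# The base URL and variable below are illustrative, not a real server.
def opendap_subset_url(base_url, variable, *ranges):
    """Append a .dods constraint, e.g. tas[0:11][20:40] selects the
    first 12 time steps and latitude indices 20 through 40."""
    constraint = variable + "".join(f"[{lo}:{hi}]" for lo, hi in ranges)
    return f"{base_url}.dods?{constraint}"


url = opendap_subset_url(
    "http://esg.example.org/dap/ccsm/tas_monthly",
    "tas", (0, 11), (20, 40), (30, 60),
)
print(url)
# http://esg.example.org/dap/ccsm/tas_monthly.dods?tas[0:11][20:40][30:60]
```

The ESG column replaces plain http with Grid transport and security underneath the same client interface, which is the "transparency" annotation in the diagram.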
                ESG: Metadata Services
         – High-level metadata services: metadata aggregation, validation, display, and discovery
         – Core metadata services: metadata access (update, insert, delete, query) and a service translation library
         – Metadata holdings: Data & Metadata Catalog, Dublin Core database, mirrored Dublin Core XML files, COARDS database, comments as XML files
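The core "metadata access" layer above exposes four operations, update, insert, delete, and query, over the holdings. A toy sketch of that interface with Dublin Core-style fields; the real holdings are databases and XML file collections, and a dictionary stands in for them here. Record IDs and field values are invented for illustration.

```python
# Toy metadata-access service: insert / update / delete / query over
# Dublin Core-style records. A dict stands in for the real databases
# and XML-file holdings; all record contents here are hypothetical.
class MetadataStore:
    def __init__(self):
        self._records = {}  # record id -> {field: value}

    def insert(self, record_id, record):
        self._records[record_id] = dict(record)

    def update(self, record_id, **fields):
        self._records[record_id].update(fields)

    def delete(self, record_id):
        self._records.pop(record_id, None)

    def query(self, **criteria):
        """IDs of records whose fields match all given criteria."""
        return [
            rid for rid, rec in self._records.items()
            if all(rec.get(k) == v for k, v in criteria.items())
        ]


store = MetadataStore()
store.insert("run42", {"creator": "NCAR", "subject": "surface temperature"})
print(store.query(creator="NCAR"))  # ['run42']
```

The discovery and display services in the top layer would be built on exactly this kind of query interface, while aggregation and validation populate and check the holdings beneath it.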
               ESG: NcML Core Schema
    XML encoding of metadata (and data) of any generic netCDF file
    Objects: netCDF, dimension, variable, attribute
    Beta version reference implementation as Java Library


    Structure: a netCDF element contains nc:dimension, nc:variable (nc:VariableType), and nc:attribute children; variables in turn carry their own nc:attribute elements
          Collaborations & Relationships
        CCSM Data Management Group
        OPeNDAP/DODS (multi-agency)
        NSF National Science Digital Libraries
          Program (UCAR & Unidata THREDDS)
        U.K. e-Science and British Atmospheric
          Data Centre
        NOAA NOMADS and CEOS-grid
        Earth Science Portal group (multi-agency,
          international)
                 For More Information
    The Globus Project®
    Earth System Grid
    Global Grid Forum
    Background information

    GlobusWORLD 2004
                                      2nd Edition: November 2003
      – Jan 20–23, San Francisco
