Proposal Template

A Proposal to NPOESS IGS BAA
To Develop and Adapt HDF5 Technologies And Provide HDF5 Services For NPOESS Data Production and Exploitation

Mike Folk
The HDF Group
1901 South First Street, Suite 2C
Champaign, Illinois 61820
Email: contact@hdfgroup.org
Phone: (217) 265-7850
July 2006

TABLE OF CONTENTS
1. SUMMARY
2. INTRODUCTION
2.1 THE NPOESS CHALLENGE AND HDF5
2.2 HDF, EOSDIS AND NPOESS
3. NEEDS ASSESSMENT AND PROPOSAL OVERVIEW
3.1 ASSESSING NPP AND NPOESS DATA TECHNOLOGY NEEDS
3.2 PRIMARY FOCUS AREAS
3.3 PROJECT DURATION AND AREAS OF EMPHASIS
4. TASK-BY-TASK DESCRIPTION OF WORK
4.1 ENHANCING THE HDF5 LIBRARY TO ACHIEVE NPOESS GOALS
4.2 DEVELOPING NEW TOOLS AND TECHNOLOGIES
4.3 HIGH PRIORITY SUPPORT FOR NPOESS USERS

The HDF Group
HDF Technology Development and Services for NPOESS

1. SUMMARY

NPOESS faces formidable data management challenges. Choosing HDF5 as its distribution format helps address those challenges, for HDF5 is a proven technology for managing large, complex scientific data. However, the capabilities of HDF5 can only be fully exploited for NPOESS by applying the best expertise in their use, and that expertise is available in The HDF Group (THG).

THG experience in working with NPOESS staff, supplemented by a survey of key NPP and NPOESS participant groups, has identified three key areas in which The HDF Group can address important NPP and NPOESS requirements:

- Enhancing the HDF5 library to handle large data volume flow while at the same time serving very diverse communities. This includes modifying and optimizing the library to achieve high I/O performance with NPOESS data structures, adding new features to meet the needs of the broad NPOESS user communities, and supporting HDF on current and future NPOESS user platforms.
- Developing new tools and technologies, and adapting existing tools, to enable NPOESS data producers and users to view data content, to translate data into other formats, and to manage metadata.
- Providing high priority support for NPOESS users by creating a support structure that gives priority to NPOESS HDF users at all levels.

This work addresses two of the areas of need identified in the NPOESS IGS Program:
1. Data exploitation and new product development
2. Development of innovative technologies that may help to enable the NPOESS mission

2. INTRODUCTION

2.1 The NPOESS Challenge and HDF5

NPOESS has taken on a formidable, multifaceted data management challenge. It promises ultimately to deliver 8 terabytes of highly complex, scientifically sound data per day. NPOESS must deliver processed data to its central users within 30 minutes of observation, and to the world in 24 hours. NPOESS data must stand up to the rigors of scientific and military standards of quality, must be processed at enormously high speeds, and must be easy to access, understand and use by a vast and varied user population. These daunting and complex challenges can only be met by applying the very best technologies, and engaging the very best expertise in using those technologies.

In choosing HDF5 as its data distribution format, NPOESS has selected a technology with a proven record in meeting many of the challenges that NPOESS faces. HDF5 has demonstrated flexibility and scalability in applications ranging from physics to film-making, has shown a capacity for fast data accretion and access in uses from flight testing to weather radar, and has improved accessibility and usability through a broad array of visualization, analysis, and other tools.

As with any complex, powerful technology, the capabilities of HDF5 can only be fully exploited by applying the very best expertise to their use. This proposal seeks to add this second ingredient to NPOESS.
It seeks support for The HDF Group to partner with diverse facets of the NPOESS project to apply its unique expertise in making the very best use of the HDF5 software, in developing applications that use HDF5, in transferring its expertise to NPOESS staff and users, and in supporting HDF technologies at high levels of readiness and usability.

Not only are The HDF Group (THG) the principal architects of the HDF5 format and software, they are also the best source of knowledge about HDF5: knowledge about how to create and access HDF5 files in ways that most effectively meet specific performance requirements, how to use the powerful HDF5 data model to represent complex data most meaningfully and efficiently, and how to build applications and tools that provide users with the easiest, most natural access to HDF5-stored data.

2.2 HDF, EOSDIS and NPOESS

Delivering 3 terabytes of high quality earth science data per day, with an estimated 1.6 million users, NASA relies on The HDF Group to provide a range of services in support of EOSDIS. The HDF formats and software must remain robust, must run on nearly every computing system, and must evolve with changing technologies. HDF software must include the features, tools and documentation needed by every segment of society, from scientists to policy makers to school children. It must support rapid, efficient query and access to data, as well as its long-term preservation.

HDF Group activities in support of EOSDIS provide a model for how The HDF Group can enhance the quality, availability, and usefulness of NPOESS data. NPOESS introduces a new generation of remote sensing and satellite technology. NPOESS data volumes will be greater than those of EOS, and data will be delivered much more quickly, arriving within 30 minutes of observations. Efficient and effective data handling and distribution will be critical.
The HDF Group can ensure that these needs are met. The HDF Group can also work with the NPOESS Data Exploitation (NDE) team in the development of applications to enhance exploitation of existing data products and the development of new data products. New tools and technologies can be developed to ensure that the NPOESS data products will be used optimally for civil, military, and scientific purposes.

Although it has had no formal relationship with NPOESS or NPP, The HDF Group has been engaged with NPP, NPOESS and other staff in a number of ways. NPOESS support staff use the free HDF helpdesk for technical advice and troubleshooting assistance related to NPP or NPOESS activities, and HDF Group members have participated in technical meetings and interacted with NPOESS and NPP staff at annual HDF workshops and other Earth Sciences conferences.

3. NEEDS ASSESSMENT AND PROPOSAL OVERVIEW

3.1 Assessing NPP and NPOESS data technology needs

Recently, The HDF Group canvassed many groups responsible for the success of NPP/NPOESS data generation and exploitation to identify those technologies and activities that The HDF Group should focus on to best address NPP/NPOESS needs as described in the IGS Program Broad Area Announcement. Meetings included NPOESS IPO staff and management, the NPOESS Field Terminal Manager, NPOESS NPP staff, the NPOESS Data Exploitation team, the NPOESS Navy Central, NOAA’s NPOESS/EOS CLASS management, National Climatic Data Center staff with NPOESS involvement, and members of the IDPS development team at Raytheon.

From these activities, a clear view has emerged of NPP and NPOESS needs and priorities that The HDF Group can best help to address. The highest priority tasks are library enhancement, tool development, data usability, and user support.

The NPOESS IGS program identifies seven areas of need for new technical contributions to NPOESS. This proposal addresses two areas in particular:
1. Data exploitation and new product development
2. Development of innovative technologies that may help to enable the NPOESS mission

Data exploitation can only be effectively achieved when the tools for data access and management are robust and of high quality, appropriate to the applications and technological environments in which they are used, easy to use, and well-supported. In some cases, new data technologies are needed, reflecting both unique aspects of NPOESS data and its use, and a need to stay abreast of evolving complementary technologies. In other cases, existing technologies must be adapted to new requirements.

3.2 Primary focus areas

The HDF Group is uniquely able to develop technologies and provide services that will help the NPOESS program efficiently manage and distribute its data product, and that will help users access and use NPOESS data effectively. The proposed activities are in three primary areas.

- Enhancing the HDF5 library: To satisfy NPOESS’ need to handle large data volume flow, and at the same time to serve very diverse communities, it is crucial that the HDF5 library be enhanced to optimally address NPOESS access and storage options. Library enhancements will include modifying and optimizing the HDF5 library to achieve high I/O performance with NPOESS data structures, adding new features to meet the needs of the broad NPOESS user communities, and supporting HDF on current and future NPOESS user platforms.
- Developing new tools and technologies: Tools to enable NPOESS data producers and users to view data content, to translate data into other formats, and to manage metadata are very important for the immediate use of the NPOESS data products in science and elsewhere. The HDF Group will develop tools and utilities addressing needs of the four weather centrals, as well as those of NPOESS end-users.
- Providing high priority support for NPOESS users: As the NPOESS system replaces the existing weather system, more and more users will depend on HDF support on a regular basis. This need goes beyond the informal support that The HDF Group provides to the general user population. The HDF Group will create a support structure that gives priority to NPOESS HDF users at all levels.

3.3 Project duration and areas of emphasis

The proposed work covers three years. In the first year, the focus is on the most pressing needs associated with getting the project underway, including tasks that will support efficient data production and distribution, development of tools and other software that will be the basis for evaluating and understanding data products, and developing an understanding of the data flow and data access challenges in both the data production and data exploitation components of the project.

The optional second and third years will build on the experience of the first year to address library maintenance and performance needs, but gradually shift in focus to data exploitation needs. They include development of translation tools, tools for viewing, analyzing and performing simple manipulations of NPOESS data, and tools for browsing NPOESS data and metadata.

In all three years, the proposed work will offer priority support to NPOESS in the areas of platform maintenance, bug fixing for the HDF5 library and tools, and helpdesk response.

4. TASK-BY-TASK DESCRIPTION OF WORK

This section provides a detailed description of the proposed activities.

4.1 Enhancing the HDF5 library to achieve NPOESS goals

HDF5 library enhancements will emphasize (1) efficient access to NPOESS data products, (2) adding NPOESS-specific features and APIs and (3) porting and maintaining HDF5 on current and future NPOESS user platforms.

4.1.1. Achieving efficient access to NPOESS data products

Given the sizes of data volumes, it is important to tune and modify the library to obtain the maximum I/O performance in accessing NPOESS data structures. THG expertise can be very valuable in choosing how to use existing library capabilities.

Optimizing cache management. HDF5 uses a sophisticated cache to avoid unnecessary accesses when performing I/O operations. Optimal choices of cache sizes and replacement strategies depend on applications, data structure and data operations, particularly for high throughput read and write operations such as those performed in the IDPS, as well as in data exploitation. The HDF Group has extensive experience helping applications to optimize their use of the HDF5 cache.

Task 1: THG will test and study cache size and replacement strategies and provide guidance for NPOESS data products.

Effective use of data compression. NPOESS uses data compression technology to handle the huge volume of data. Besides compression technology itself, three important factors need to be considered:
1. Real-time data delivery to the four centrals requires the data compression technology to provide fast decoding and encoding times as well as good compression ratios.
2. Use of data compression inside HDF5 requires chunked storage, which must be carefully tuned to achieve good performance.
3. Different system settings may also affect the optimal use of compression technologies.

THG has unique expertise to achieve optimal use of HDF5 features to address these needs. THG has implemented several internal filters to improve compression performance and has done a number of performance studies with EOS and other data, leading to significant improvements in compression usage. Likewise, THG has experience tuning the use of chunks to achieve fast data access and improved storage.
Finally, THG has extensive experience understanding and solving HDF5 performance issues related to different systems, such as hardware architectures, operating systems, compilers and compiler flag settings.

Task 2: THG will test and identify effective chunking and compression settings for NPOESS data on computing systems requested by weather centrals and NDE staff. THG will provide guidelines on how to achieve good performance.

Tuning data structures for NPOESS products. The choice of HDF5 data structures can have a major influence on performance and storage requirements. For example, a key HDF5 structure in NPOESS data products is the “dataset region reference.” Region references in NPOESS data products have a simple rectangular shape. The HDF5 library is currently not optimized for I/O access to such shapes, but such optimizations could be done and would likely improve I/O access times for this important structure.

Task 3: THG will develop a new region reference I/O method within the library to ensure optimal performance when accessing regions.

4.1.2. Adding NPOESS-specific features and APIs

NPOESS data products will be used by many communities and users. It is important to stay informed about user requirements, and to respond by adding features to the library to meet users’ needs. Standard APIs can be especially helpful. The availability of standard APIs for common tasks can benefit applications at all levels by reducing coding time, code complexity and vulnerability to error. It also promotes standard ways to create and access NPOESS products, promoting interoperability and software re-use. Examples of APIs that have been suggested by the NPOESS community include:

A new API to access data pointed to by region references. Retrieving data pointed to by region references requires a complex series of function calls. The new API could encapsulate into a few routines the large number of function calls that occur whenever regions are accessed.
Task 4: THG will develop an easy-to-use API for efficient and error-free data access for region references.

A new API to move and update datasets of region references. When a dataset with region references is moved from one file to another, or the file is rewritten, the references of the datasets become invalid in the new file. The process of updating the values of references is complex and error prone. Standard functions can be implemented to address both of these concerns.

Task 5: THG will develop a set of simple routines for rewriting region references that preserves their validity.

Based on previous experience, it is very likely that other APIs will be identified to help reduce errors and increase programming speed for both data producers and users. The proposed support activities (Section 4.3) describe a mechanism for identifying such APIs.

4.1.3. Porting and maintaining HDF on current and future NPOESS user platforms

The diversity of NPOESS users dictates that HDF5 will need to be supported in many configurations. The challenge of making complex software work well on varying computing architectures, operating systems, compilers and other middleware is often underappreciated. Some platforms requested by NPOESS users (such as AIX 5.3 and some 64-bit platforms) are not fully supported by the current library, and new platforms will certainly be added in accordance with the needs of NPOESS users.

Task 6: THG will maintain full support of HDF5 on platforms used by the NPOESS program.

4.2 Developing new tools and technologies

Surveys of the NPOESS community identified the following needs for tools and utilities to assist NPOESS end users: to extract and display XML data/granules, to translate data into other formats, to view data content, to display statistics, and to manipulate image data.

4.2.1. Tools to extract and display data using XML profiles

Users often need to extract and display NPOESS granule data in human readable forms. The NPOESS product design facilitates this by the use of XML “profiles,” based on the NPOESS schema for organizing HDF5 files, which describe the structures of NPOESS products. Because HDF5 is not a simple sequential format, it is not a trivial matter to develop tools to use these profiles for extracting parts of HDF5 files. Currently, applications are hard coded for each type of data product to parse the XML profiles. This can be inefficient and error-prone.

Task 7: THG will develop a tool to parse XML profiles based on the NPOESS profile schema.

Providing an efficient tool to parse NPOESS XML profiles could make it easier for applications to extract and display NPOESS granule data in usable forms. In addition, a GUI or command-line tool could be developed, using this parser, to retrieve and display datasets or subsets of data based on the profiles.

Task 8: THG will develop a GUI or command-line tool to retrieve and display datasets or subsets of datasets using the XML profile.

4.2.2. Tools to translate HDF5 to other standard formats

File formats such as BUFR and GRIB2 are widely used in the meteorological community. Being able to convert HDF5 to BUFR and GRIB2 will greatly facilitate the use of NPOESS data products for those users. The HDF team has successfully developed and maintained tools to convert HDF5 to many other data formats such as DAP, HDF4, and GIF. Furthermore, since new features will be added to HDF5 as the NPOESS mission continues, THG is in a unique position to maintain the conversion tools to benefit the long term use of NPOESS data.
Task 9: THG will
a) Work with the NPOESS community to specify user requirements for converting HDF5 NPOESS data to GRIB2;
b) Create a mapping document that aims to set a standard mapping of NPOESS-HDF5 data to the GRIB2 format; and
c) Develop a conversion tool to efficiently convert NPOESS-HDF5 data to GRIB2.

4.2.3. Adding new Java Graphical User Interface (GUI) in HDFView

An HDF5 file with NPOESS data products contains an XML user’s block, groups, data arrays, data products (aggregation and granules) and attributes (file attributes, data product attributes and granule attributes). The diagram shows the structure of a basic NPOESS file.

There are two challenges to presenting such data in a meaningful way. First, current tools do not know the NPOESS Handbook, which is the key to showing the relationships of data objects. Second, there is no GUI tool to show NPOESS data products (references pointing to arrays or regions of arrays) in a user-friendly way.

HDFView, a visual browser and editor for HDF files, has a plug-in capability that allows users to add Graphical User Interface (GUI) components for showing file contents in ways that correspond to a particular application. For example, the HDF-EOS profile used by the EOSDIS project has three types of data objects: swath, grid and point. Since HDFView is a general tool, it shows HDF-EOS objects just as normal HDF5 datasets and groups. The HDF-EOS development team implemented an HDF-EOS plug-in for HDFView that displays data in a more effective and user-friendly way for EOS users.

Because of the uniqueness of the NPOESS data/metadata structure, NPOESS users have the same need for user-friendly GUI components. The proposed work will add the following GUI components to show NPOESS data content in a more meaningful and user-friendly way.
Task 10: THG will add a tree view showing the correct relationship of data products and more details of the NPOESS file structure, such as adding dashed connectors between data products (object/region references) and data arrays.

Task 11: THG will implement a metadata view presenting NPOESS metadata in a meaningful way according to the types of metadata (XML user block, file and data product attributes, and granule attributes), for example by showing the XML user block structure instead of plain text.

Task 12: THG will provide the ability to specify a region reference and then display the corresponding region in a spreadsheet format.

4.2.4. Other tools

As NPOESS approaches delivery, new utilities will be needed for both data production and data exploitation. THG has significant experience enhancing the usability of HDF5 data by extending existing HDF5 utilities and HDFView, as well as developing new HDF5 utilities. THG’s experience is particularly valuable in cases where efficient access is critical, as is the case with many NPOESS data products. Two tools that have already been requested are:

- A utility to show statistics about a dataset, such as averages, min/max, and standard deviation, in particular one that can deal with datasets too large to be loaded into memory.
- A utility to enable simple arithmetic and Boolean operations to be performed on two datasets, such as showing dataset differences or overlaying datasets (e.g. images).

Task 13: THG will develop a tool to display common statistics about a dataset, including a dataset that is too large to load into memory.

Task 14: THG will develop a utility to perform arithmetic and Boolean operations on two datasets.
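The statistics utility requested above has to work on datasets that do not fit in memory. One common way to do that is to read the data block by block and update the statistics incrementally. The following is only a minimal sketch of that idea, not the design of the proposed tool: it uses Welford's online algorithm for the mean and variance, and the names RunningStats, iter_blocks and summarize are hypothetical. An in-memory Python list stands in for the dataset here; an actual utility would presumably obtain each block through partial reads of the HDF5 dataset.

```python
import math
from typing import Iterable, Iterator, Sequence


class RunningStats:
    """Accumulates count, min, max, mean and variance one block at a time."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.minimum = math.inf
        self.maximum = -math.inf

    def update(self, block: Iterable[float]) -> None:
        """Fold one block of values into the running statistics (Welford)."""
        for value in block:
            self.count += 1
            delta = value - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (value - self.mean)
            self.minimum = min(self.minimum, value)
            self.maximum = max(self.maximum, value)

    def std(self) -> float:
        """Population standard deviation of all values seen so far."""
        if self.count == 0:
            return math.nan
        return math.sqrt(self.m2 / self.count)


def iter_blocks(values: Sequence[float], block_size: int) -> Iterator[Sequence[float]]:
    """Stand-in for block-wise reads: yields consecutive slices of the data."""
    for start in range(0, len(values), block_size):
        yield values[start:start + block_size]


def summarize(blocks: Iterable[Iterable[float]]) -> RunningStats:
    """Consume blocks one at a time, so only one block is held in memory."""
    stats = RunningStats()
    for block in blocks:
        stats.update(block)
    return stats


if __name__ == "__main__":
    data = [float(v) for v in range(1, 11)]
    stats = summarize(iter_blocks(data, block_size=3))
    print(stats.count, stats.minimum, stats.maximum, stats.mean, stats.std())
```

Because each block is discarded after it is folded in, memory use depends on the block size rather than the dataset size. The block size would be a tuning choice, plausibly aligned with the dataset's chunk layout.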
4.3 High priority support for NPOESS users

The HDF Group, in cooperation with the NASA ESDIS Project and the NPOESS IPO, already engages the community through its annual HDF workshop, which includes HDF training, presentations on commercial and non-commercial technologies, research updates, and breakout meetings on specific NPOESS issues. Also, HDF mailing lists keep interested NPOESS users abreast of HDF5 releases and other activities.

However, these activities address only some of NPOESS’ HDF5 support needs. As the NPOESS system replaces existing on-board and ground weather systems, more and more users will depend on HDF support on a regular basis. The HDF Group is available to provide some support, but with millions of HDF users and limited resources, there is always a long queue of feature requests and bug reports, and the ability to provide rapid, in-depth helpdesk support is limited.

We propose to overcome these barriers by creating a support structure that gives priority to NPOESS HDF users at all levels. This mechanism is similar to that employed with the EOS project and has the following components.

Task 15: THG will assign high priority to helpdesk requests from the NPOESS user community.

Task 16: THG will assign high priority to requests from the NPOESS user community for features and resolution of software anomalies (bug tracking, analysis, and fixes).

Task 17: THG will hold meetings and technical discussions, which include:
a) annual meetings with the IPO, Centrals and other designated participants, such as the NDE and appropriate vendors, in which The HDF Group gives a briefing on HDF activities of interest to NPOESS, and participants provide information about their needs and concerns;
b) participation in the NDE regular telecons and other activities; and
c) direct consulting with NPOESS users on technical issues such as application design and implementation.
