									     NetCDF

   Ed Hartnett
  Unidata/UCAR
ed@unidata.ucar.edu
                Unidata
• Unidata - helps universities acquire,
  display, and analyze Earth-system data.
• UCAR – University Corporation for
  Atmospheric Research - a nonprofit
  consortium of 66 universities.
    SDSC Presentation, July 2005
•    Intro to NetCDF Classic
•    Intro to NetCDF-4
          What is NetCDF?
• A conceptual data model for scientific
  data.
• A set of APIs in C, F77, F90, Java, etc. to
  create and manipulate data files.
• Some portable binary formats.
• Useful for storing arrays of data and
  accompanying metadata.
              History of NetCDF


• 1988: netCDF developed at Unidata.
• 1991: netCDF 2.0 released.
• 1996: netCDF 3.0 released.
• 2004: netCDF 3.6.0 released.
• 2005: netCDF 4.0 beta released.
             Getting netCDF
• Download latest release from the netCDF
  web page:
 http://www.unidata.ucar.edu/content/software/netcdf
• Builds and installs on most platforms with
  no configuration necessary.
• For a list of platforms on which netCDF
  versions have been built, and the output of
  building and testing netCDF, see the web site.
         NetCDF Portability
• NetCDF is tested on a wide variety of
  platforms, including Linux, AIX, SunOS,
  MacOS, IRIX, OSF1, Cygwin, and
  Windows.
• We test with native compilers when we
  can get them.
• 64-bit builds are supported with some
  configuration effort.
   What Comes with NetCDF
• NetCDF comes with 4 language APIs: C,
  C++, Fortran 77, and Fortran 90.
• Tools ncgen and ncdump.
• Tests.
• Documentation.
          NetCDF Java API
• The netCDF Java API is entirely separate
  from the C API.
• You don’t need to install the C API for the
  Java API to work.
• Java API contains many exciting features,
  such as remote access and more
  advanced coordinate systems.
 Tools to work with NetCDF Data
• The netCDF core library provides basic data
  access.
• ncgen and ncdump provide some helpful
  command line functionality.
• Many additional tools are available, see:
 http://www.unidata.ucar.edu/packages/netcdf/software.html
CDL – Common Data Language
• Grammar defined for displaying
  information about netCDF files.
• Can be used to create files without
  programming.
• Can be used (via ncgen) to generate a
  Fortran or C program that creates the file.
• Used by ncgen/ncdump utilities.
                     Example of CDL
netcdf foo { // example netCDF specification in CDL
dimensions:
    lat = 10, lon = 5, time = unlimited;

variables:
    int lat(lat), lon(lon), time(time);
    float z(time,lat,lon), t(time,lat,lon);
    double p(time,lat,lon);
    int rh(time,lat,lon);

    lat:units = "degrees_north";
    lon:units = "degrees_east";

data:
    lat = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
    lon = -140, -118, -96, -84, -52;
}
 Software Architecture of NetCDF-3
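[Diagram: layered boxes. The V3 C API is the base layer. Directly on
 top of it sit the V2 C API, the V3 C tests, ncgen, ncdump, the C++ API,
 and the F77 API. The V2 C tests sit on the V2 C API; the F77 tests and
 the F90 API sit on the F77 API.]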



• Fortran, C++ and V2 APIs are all built on the
  C API.
• Other language APIs (Perl, Python, MATLAB,
  etc.) use the C API.
      NetCDF Documentation
• Unidata distributes a NetCDF Users Guide
  which describes the data model in detail.
• A language-specific guide is provided for
  C, C++, Fortran 77, and Fortran 90 users.
• All documentation can be found at:
 http://my.unidata.ucar.edu/content/software/netcdf/docs
            NetCDF Jargon
• “Variable” – a multi-dimensional array of
  data, of any of 6 types (char, byte, short,
  int, float, or double).
• “Dimension” – information about an axis:
  its name and length.
• “Attribute” – a 1D array of metadata.
       More NetCDF Jargon
• “Coordinate Variable” – a 1D variable with
  the same name as a dimension, which
  stores the coordinate value for each index
  along that dimension.
• “Unlimited Dimension” – a dimension
  which has no maximum size. Data can
  always be extended along the unlimited
  dimension.
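As a minimal sketch of what a coordinate variable is used for, the plain C function below (no netCDF calls) maps a coordinate value to an index along its dimension. The name `nearest_index` is a hypothetical helper, not part of the netCDF API, and the latitude values in the usage note are the ones from the CDL example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a netCDF function): return the index whose
 * coordinate value is closest to `target`, given a 1D coordinate array
 * of length n (n >= 1). On a tie the lower index wins. */
size_t nearest_index(const int *coord, size_t n, int target)
{
    size_t best = 0;
    assert(coord != NULL && n >= 1);
    for (size_t i = 1; i < n; i++) {
        long long d_new = (long long)coord[i] - target;
        long long d_old = (long long)coord[best] - target;
        if (d_new < 0) d_new = -d_new;
        if (d_old < 0) d_old = -d_old;
        if (d_new < d_old)
            best = i;
    }
    return best;
}
```

With lat = 0, 10, ..., 90, a request for latitude 42 resolves to the index holding 40, and that index can then be used to address the data variables defined on the lat dimension.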
 The NetCDF Classic Data Model
• The netCDF Classic Data Model contains
  dimensions, variables, and attributes.
• At most one dimension may be unlimited.
• The Classic Data Model is embodied by
  netCDF versions 1 through 3.6.0.
• NetCDF is moving towards a new, richer
  data model: the Common Data Model.
          NetCDF Example
• Suppose a user wants to store
  temperature and pressure values on a 2D
  latitude/longitude grid.
• In addition to the data, the user wants to
  store information about the lat/lon grid.
• The user may have additional data to
  store, for example the units of the data
  values.
    NetCDF Model Example
• Dimensions: latitude, longitude.
• Variables: temperature (attribute Units: C),
  pressure (attribute Units: mb).
• Coordinate variables: latitude, longitude.
  Important NetCDF Functions
• nc_create and nc_open to create and open files.
• nc_enddef to leave define mode, nc_close to close the file.
• nc_def_dim, nc_def_var, nc_put_att_*, to define
  dimensions, variables, and attributes.
• nc_inq, nc_inq_var, nc_inq_dim, nc_get_att_* to
  learn about dims, vars, and atts.
• nc_put_vara_*, nc_get_vara_* to write and read
  data.
  C Functions to Define Metadata
/* Create the file, overwriting any existing file of the same name. */
if ((retval = nc_create(FILE_NAME, NC_CLOBBER, &ncid)))
    return retval;

/* Define the dimensions. */
if ((retval = nc_def_dim(ncid, LAT_NAME, LAT_LEN, &lat_dimid)))
    return retval;
if ((retval = nc_def_dim(ncid, LON_NAME, LON_LEN, &lon_dimid)))
    return retval;

/* Define the variables. Both share the (lat, lon) dimensions. */
dimids[0] = lat_dimid;
dimids[1] = lon_dimid;
if ((retval = nc_def_var(ncid, PRES_NAME, NC_FLOAT, NDIMS, dimids, &pres_varid)))
    return retval;
if ((retval = nc_def_var(ncid, TEMP_NAME, NC_FLOAT, NDIMS, dimids, &temp_varid)))
    return retval;

/* End define mode. */
if ((retval = nc_enddef(ncid)))
    return retval;
       C Functions to Write Data
/* Write the data. */
if ((retval = nc_put_var_float(ncid, pres_varid, pres_out)))
   return retval;
if ((retval = nc_put_var_float(ncid, temp_varid, temp_out)))
   return retval;

/* Close the file. */
if ((retval = nc_close(ncid)))
   return retval;
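The write example passes pres_out and temp_out without showing how they are laid out. The netCDF C interface takes values in row-major order, with the last dimension varying fastest. The sketch below shows that layout in plain C with no netCDF calls; the grid sizes, the helper names `grid_offset` and `fill_grid`, and the fill values are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define LAT_LEN 6   /* assumed; the slides do not give the grid size */
#define LON_LEN 12  /* assumed */

/* Offset of grid point (lat, lon) in a flat row-major buffer:
 * the last dimension (lon) varies fastest. */
size_t grid_offset(size_t lat, size_t lon)
{
    assert(lat < LAT_LEN && lon < LON_LEN);
    return lat * LON_LEN + lon;
}

/* Fill a [LAT_LEN][LON_LEN] array with placeholder values (illustrative
 * data only) so that each cell holds its own flat offset. */
void fill_grid(float out[LAT_LEN][LON_LEN])
{
    for (size_t lat = 0; lat < LAT_LEN; lat++)
        for (size_t lon = 0; lon < LON_LEN; lon++)
            out[lat][lon] = (float)grid_offset(lat, lon);
}
```

A C array declared as float pres_out[LAT_LEN][LON_LEN] already has this layout, so it matches a variable defined with dimids = {lat_dimid, lon_dimid}.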
       C Example – Getting Data
/* Open the file read-only. */
if ((retval = nc_open(FILE_NAME, NC_NOWRITE, &ncid)))
    return retval;

/* Read the data. Variable IDs 0 and 1 are pressure and temperature,
   in the order they were defined. */
if ((retval = nc_get_var_float(ncid, 0, pres_in)))
    return retval;
if ((retval = nc_get_var_float(ncid, 1, temp_in)))
    return retval;

/* Do something useful with the data… */

/* Close the file. */
if ((retval = nc_close(ncid)))
    return retval;
       Data Reading and Writing
              Functions
• There are 5 ways to read/write data of each
  type.
• var1 – reads/writes a single value.
• var – reads/writes an entire variable at once.
• vara – reads/writes an array section (a contiguous subset).
• vars – reads/writes a strided (subsampled) array section.
• varm – reads/writes a mapped array section, with a
  user-specified memory layout.
• Ex.: nc_put_vars_short writes a strided section of shorts.
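To make the difference between the vara and vars access patterns concrete, here is a minimal sketch in plain C (no netCDF calls) of the index arithmetic they imply for a 2D row-major array. The helper name `copy_section` is an illustrative assumption, not part of the netCDF API; the start, count, and stride names mirror the arguments of the nc_get_vara_* and nc_get_vars_* families.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a netCDF function): copy a strided section of
 * a row-major 2D array `src` with `ncols` columns into the packed buffer
 * `dst`.
 *   start[d]  = first index along dimension d
 *   count[d]  = number of values taken along dimension d
 *   stride[d] = step between taken indices (1 everywhere gives the
 *               contiguous "vara" pattern)
 * Returns the number of values copied. */
size_t copy_section(const float *src, size_t ncols,
                    const size_t start[2], const size_t count[2],
                    const size_t stride[2], float *dst)
{
    size_t n = 0;
    assert(stride[0] >= 1 && stride[1] >= 1);
    for (size_t i = 0; i < count[0]; i++) {
        size_t row = start[0] + i * stride[0];
        for (size_t j = 0; j < count[1]; j++) {
            size_t col = start[1] + j * stride[1];
            dst[n++] = src[row * ncols + col];
        }
    }
    return n;
}
```

For a 3 x 4 array holding the values 0 to 11, start = {1, 0}, count = {2, 2}, stride = {1, 2} selects every other column of the last two rows. The caller is responsible for keeping start + (count - 1) * stride inside the array bounds.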
                Attributes
• Attributes are 1-D arrays of any of the 6
  netCDF types.
• Read/write them with functions like:
  nc_get_att_float and nc_put_att_int.
• Attributes may be attached to a variable,
  or may be global to the file.
        NetCDF File Formats
• Starting with 3.6.0, netCDF supports two binary
  data formats.
• NetCDF Classic Format is the format that has
  been in use for netCDF files from the beginning.
• NetCDF 64-bit Offset Format was introduced in
  3.6.0 and stores 64-bit file offsets, allowing files
  larger than 2 GiB.
• Use classic format unless you need the large
  files.
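One way to tell the formats apart is by the first bytes of the file: classic files begin with 'C' 'D' 'F' followed by the byte 1, 64-bit offset files with 'C' 'D' 'F' followed by the byte 2, and HDF5 files (which netCDF-4 format files are) with the 8-byte HDF5 signature. The sketch below checks a header buffer in plain C; the enum and the function name `classify_header` are assumptions for illustration, not part of the netCDF API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative names; they are not part of the netCDF API. */
enum file_kind { KIND_UNKNOWN, KIND_CLASSIC, KIND_64BIT_OFFSET, KIND_HDF5 };

/* Classify a file from the first `len` bytes of its header. */
enum file_kind classify_header(const unsigned char *buf, size_t len)
{
    static const unsigned char hdf5_sig[8] =
        { 0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n' };

    assert(buf != NULL);
    if (len >= 8 && memcmp(buf, hdf5_sig, 8) == 0)
        return KIND_HDF5;
    if (len >= 4 && memcmp(buf, "CDF", 3) == 0) {
        if (buf[3] == 1)
            return KIND_CLASSIC;
        if (buf[3] == 2)
            return KIND_64BIT_OFFSET;
    }
    return KIND_UNKNOWN;
}
```

In practice the buffer would come from reading the first 8 bytes of the file with fread. This sketch only looks at offset 0 and does not handle an HDF5 file whose signature is placed after a user block.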
        NetCDF-3 Summary
• NetCDF is a software library and some
  binary data formats, useful for scientific
  data, developed at Unidata.
• NetCDF organizes data into variables, with
  dimensions and attributes.
• NetCDF has proven to be reliable, simple
  to use, and very popular.
      Why Add to NetCDF-3?
• Increasingly complex data sets call for
  greater organization.
• Size limits, unthinkably huge in 1988, are
  routinely reached in 2005.
• Parallel I/O is required for advanced Earth
  science applications.
• Interoperability with HDF5.
              NetCDF-4
• NetCDF-4 aims to provide the netCDF API
  as a front end for HDF5.
• Funded by NASA, executed at Unidata
  and NCSA.
• Includes reliable netCDF-3 code, and is
  fully backward compatible.
    NetCDF-4 Organizations
• Unidata/UCAR
• NCSA – The National Center for
  Supercomputing Applications
  University of Illinois at Urbana-Champaign
• NASA – NetCDF-4 was funded by NASA
  award number AIST-02-0071.
   New Features of NetCDF-4
• Multiple unlimited dimensions.
• Groups to organize data.
• New types, including compound types and
  variable length arrays.
• Parallel I/O.
    The Common Data Model
• NetCDF-4, scheduled for beta release this
  summer, will conform to the Common
  Data Model.
• Developed by John Caron at Unidata, with
  the cooperation of HDF, OpenDAP,
  netCDF, and other software teams, CDM
  unites different models into a common
  framework.
• CDM is a superset of the NetCDF Classic
  Data Model.
    The NetCDF-4 Data Model
• NetCDF-4 implements the Common Data Model.
• Adds groups; each group can contain variables,
  attributes, dimensions, and other groups.
• Dimensions are scoped so that variables in
  different groups can share dimensions.
• Compound types allow users to define new
  types, composed of atomic or other user-defined
  types.
• New integer and string types.
Software Architecture of NetCDF-4



[Diagram: the same layers as in the netCDF-3 architecture, except that
 the language APIs, tools, and tests now sit on a V4 C API, which in
 turn sits on both the V3 C API and HDF5.]
    NetCDF-4 Release Status
• Latest alpha release includes all netCDF-4
  features – depends on latest HDF5
  development snapshot.
• Beta release – due out in August, replaces
  artificial netCDF-4 constructs, and
  depends on a yet-to-be-released version
  of HDF5.
• Promotion from beta to full release will
  happen sometime in 2006.
          Building NetCDF-4
• NetCDF-4 requires that HDF5 version
  1.8.3 be installed. This is not released yet.
• The latest HDF5 development release
  works with the latest netCDF alpha
  release.
• To build netCDF-4, specify
  --enable-netcdf-4 at configure.
When to Use NetCDF-4 Format
• The new netCDF-4 features (groups, new
  types, parallel I/O) are only available for
  netCDF-4 format files.
• When you need HDF5 files.
• When portability is less important, until
  netCDF-4 becomes widespread.
          Versions and Formats

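• 1988: netCDF developed by Glenn Davis.
• 1991: netCDF 2.0 released.
• 1996: netCDF 3.0 released.
• 2004: netCDF 3.6.0 released.
• 2005: netCDF 4.0 beta released.
• Classic Format: available in every version.
• 64-Bit Offset Format: available from netCDF 3.6.0.
• NetCDF-4 Format: available from netCDF 4.0.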
      NetCDF-4 Feature Review
•   Multiple unlimited dimensions.
•   How to use groups.
•   Using compound types.
•   Other new types.
•   Variable length arrays.
•   Parallel I/O.
•   HDF5 Interoperability.
 Multiple Unlimited Dimensions
• Unlimited dimensions are automatically
  expanded as new data are written.
• NetCDF-4 allows multiple unlimited
  dimensions.
         Working with Groups
• Define a group, then use it as a container for
  the classic data model.
• Groups can be used to organize sets of data.
       An Example of Groups
[Diagram: three groups, Model_Run_1, Model_Run_2, and Model_Run_1a.
 Each group contains lat, lon, rh, and temp, a units attribute for rh
 and for temp, and a history attribute.]
 New Functions to Use Groups
• Open/create returns ncid of root group.
• Create a new group with nc_def_grp.
nc_def_grp(int parent_ncid, char *name, int *new_ncid);
• Learn about groups with nc_inq_grps.
 nc_inq_grps(int ncid, int *numgrps, int *ncids);
     C Example Using Groups
if (nc_create(FILE_NAME, NC_NETCDF4, &ncid)) ERR;
if (nc_def_grp(ncid, DYNASTY, &tudor_id)) ERR;
if (nc_def_dim(tudor_id, DIM1_NAME, NC_UNLIMITED, &dimid)) ERR;
if (nc_def_grp(tudor_id, HENRY_VII, &henry_vii_id)) ERR;
if (nc_def_var(henry_vii_id, VAR1_NAME, NC_INT, 1, &dimid, &varid)) ERR;
if (nc_put_vara_int(henry_vii_id, varid, start, count, data_out)) ERR;
if (nc_close(ncid)) ERR;
      Create Complex Types
• Like C structs, compound types can be
  assembled into a user defined type.
• Compound types can be nested – that is,
  they can contain other compound types.
• New functions are needed to create new
  types.
• V2 API functions are used to read/write
  complex types.
C Example of Compound Types
/* Create a file with a compound type. Write a little data. */
if (nc_create(FILE_NAME, NC_NETCDF4, &ncid)) ERR;
if (nc_def_compound(ncid, sizeof(struct s1), SVC_REC, &typeid)) ERR;
if (nc_insert_compound(ncid, typeid, BATTLES_WITH_KLINGONS,
                       HOFFSET(struct s1, i1), NC_INT)) ERR;
if (nc_insert_compound(ncid, typeid, DATES_WITH_ALIENS,
                       HOFFSET(struct s1, i2), NC_INT)) ERR;
if (nc_def_dim(ncid, STARDATE, DIM_LEN, &dimid)) ERR;
/* Pass the ID of the dimension just defined. */
if (nc_def_var(ncid, SERVICE_RECORD, typeid, 1, &dimid, &varid)) ERR;
if (nc_put_var(ncid, varid, data)) ERR;
if (nc_close(ncid)) ERR;
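The compound-type example passes each member's byte offset within the C struct. HOFFSET comes from HDF5 and, as far as I know, expands to the standard offsetof macro. The sketch below collects those offsets in plain C with no netCDF calls. The declaration of struct s1 is an assumption (the slide names members i1 and i2 but does not show the struct), and `member_info` and `describe_s1` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definition: the slide uses struct s1 with members i1 and i2
 * but does not show the declaration. */
struct s1 {
    int i1;
    int i2;
};

/* Hypothetical descriptor for one member of a compound type. */
struct member_info {
    const char *name;
    size_t offset;   /* byte offset inside the struct */
    size_t size;     /* size of the member in bytes */
};

/* Fill `out` (room for 2 entries) with the layout of struct s1.
 * Returns the member count. */
size_t describe_s1(struct member_info out[2])
{
    assert(out != NULL);
    out[0] = (struct member_info){ "i1", offsetof(struct s1, i1), sizeof(int) };
    out[1] = (struct member_info){ "i2", offsetof(struct s1, i2), sizeof(int) };
    return 2;
}
```

Taking offsets from the compiler rather than hard-coding them keeps the file's description of the type in step with any padding the compiler inserts between members.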
New Ints, Opaque, String Types
• Opaque types are bit-blobs of fixed size.
• String types allow multi-dimensional arrays
  of strings.
• New integer types: UBYTE, USHORT,
  UINT, UINT64, INT64.
      Variable Length Arrays
• Variable length arrays allow the efficient
  storage of arrays of variable size.
• For example: an array of soundings, each
  with a different number of elements.
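A rough picture of such a ragged array in plain C (no netCDF calls): each element pairs a length with a pointer to that many values. To my understanding netCDF-4's nc_vlen_t is organized in much the same way, but the struct and function below are hypothetical stand-ins, not the library's own types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one variable length element: a length plus
 * a pointer to that many values. */
struct vlen_float {
    size_t len;
    const float *p;
};

/* Total number of values stored across all n ragged elements. */
size_t vlen_total(const struct vlen_float *elems, size_t n)
{
    size_t total = 0;
    assert(elems != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += elems[i].len;
    return total;
}
```

Two soundings with 3 and 1 levels would be held as two such elements. Only the 4 real values need storing, instead of padding every sounding to the longest one.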
    Parallel I/O with NetCDF-4
• Must use configure option --enable-parallel when
  building netCDF.
• Depends on HDF5 parallel features, which
  require MPI.
• Must create or open file with nc_create_par or
  nc_open_par.
• All metadata operations are collective.
• Adding a new record is collective.
• Variable reads/writes are independent by
  default, but can be changed to do collective
  operations.
        HDF5 Interoperability
• NetCDF-4 can interoperate with HDF5 with a
  SUBSET of HDF5 features.
• Will not work with HDF5 files that have looping
  groups, references, and types not found in
  netCDF-4.
• HDF5 file must use new dimension scale API to
  store shared dimension info.
• If an HDF5 file follows the Common Data Model,
  NetCDF-4 can interoperate on the same file.
      Future Plans for NetCDF
• NetCDF 4.0 release in 2006.
• Beta for next major version of netCDF in
  Summer, 2006.
• Full compatibility with Common Data Model.
• Remote access, including remote subsetting of
  data.
• XML-based representation of netCDF metadata.
• Full Fortran 90 support, but limited F77 support.
     For Further Information
• netCDF mailing list:
  netcdfgroup@unidata.ucar.edu
• email Ed: ed@unidata.ucar.edu
• netCDF web site: www.unidata.ucar.edu

								