					    Recent Work in Progress
               John Caron, June 3, 2003



• THREDDS development
  – Dynamic Catalogs: DQC, Resolvers
  – IDD Data Server
  – ADDE Cataloger

• NetCDF development
  – NetCDF Markup Language (NcML)
  – More efficient Java I/O (NIO)
  – NetCDF/DODS/HDF5 Data Models
                THREDDS Catalogs
[Diagram, host hostname.edu. Components: Catalog Generator; HTTP Server
holding Catalog.xml and two CatalogRef.xml files; Data Server (DODS, ADDE,
FTP, HTTP) in front of the Datasets; Client Application.]
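   Dynamic Catalogs = Services
[Diagram, host hostname.edu. Components: Catalog Generator; HTTP Tomcat
Server hosting a Catalog Service (Catalog.xml, CatalogRef.xml), a Query
Resolver Service (DQC.xml), and a Resolver Service; Data Server (DODS, ADDE,
FTP, HTTP) in front of the Datasets; Client Application. The link between
the Resolver Service and the Client Application is labeled URI and URL.]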
 Dataset Query Capability (DQC)
• XML document.
• Describes what the user can ask for as a set of
orthogonal “selections”.
• On the client, a “query URL” is formed based on
the user’s choices, and sent to the server.
• The “query resolver” server finds which datasets
satisfy the query and returns a list of real dataset
URLs.
• The DQC describes the queries that the server is
capable of responding to.
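
The query-URL step described in these bullets can be sketched in Java. This is a minimal illustration, not the THREDDS implementation: the class name, the base URL, the selection names (`station`, `product`), and the `name=value` query format are all assumptions, since the slides do not say how a DQC encodes its selections.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of any THREDDS library.
class DqcQuerySketch {

    // Appends one URL-encoded name=value pair per user selection.
    static String buildQueryUrl(String baseUrl, Map<String, String> selections) {
        StringBuilder url = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> choice : selections.entrySet()) {
            url.append(separator)
               .append(encode(choice.getKey()))
               .append('=')
               .append(encode(choice.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Each entry stands for one orthogonal selection made by the user.
        Map<String, String> selections = new LinkedHashMap<>();
        selections.put("station", "KDEN");   // assumed selection name and value
        selections.put("product", "N0R");    // assumed selection name and value
        System.out.println(buildQueryUrl("http://hostname.edu/dqc/query", selections));
    }
}
```

Because the selections are orthogonal, each one contributes an independent pair to the URL; the query resolver on the server would then map the combination to real dataset URLs.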
          Resolver Services
• Logical Dataset, e.g. “latest ETA model run”
• Dataset with Service type “Resolver”
• On the client, the URI of the logical
  dataset is sent to the server
• The server finds what is available and
  returns a list of real dataset URLs.
                 ADDE Cataloger
[Diagram, host hostname.edu. Components: ADDE Cataloger; HTTP Tomcat Server
hosting a Catalog Service (Catalog.xml, CatalogRef.xml) and a Query Resolver
Service (DQC.xml); ADDE Data Server in front of the Datasets; IDD; Client
Application.]
     Summary IDD Data Server
Make as many of the IDD data feeds as possible available via THREDDS.
  – NCEP model data (catgen) (DODS)
  – Level 3 NEXRAD (custom server/DQC) (ADDE)
  – SSEC/Unidata satellite data (ADDE Cataloger) (ADDE)
  – Text data: METARs, surface obs, etc. (DQC/custom server); returns text or XML
  – Profiler data (custom server/DQC) (ADDE)
                      NetCDF 3
[Diagram. Sources: NetCDF File (local file or HTTP protocol), OpenDAP Dataset
(OpenDAP protocol), and NcML Dataset XML (virtual dataset), feeding the
NetCDF-3 library, whose API serves the Client Application.]
    NetCDF Markup Language
XML representation of netCDF metadata, uses
 XML Schema

• Core: existing netCDF data model
• Coordinate System: general and
  georeferencing coordinate system
• Dataset: redefine, aggregate, subset

• Luca Cinquini (NCAR/SCD/ESG), John Caron, Ethan
  Davis, Bob Drach (LLNL), Stefano Nativi (Florence),
  Russ Rew
 NcML Coordinate Systems
[Diagram. Components: NetCDF File and OpenDAP Dataset; Convention Parser;
Netcdf Dataset; NcML Dataset XML.]
Convention Parser
   • ATDRadar
   • AWIPS
   • COARDS
   • CF
   • CSM
   • GDV
   • NUWG
   • WRF
   • Zebra
GeoGrids, GeoTiffs, Geowhiz!
[Diagram. Components: NetCDF File and OpenDAP Dataset; Convention Parser;
Netcdf Dataset; GeoGrid factory; GeoGrid Dataset; VisAD / IDV; WCS Server
(OpenGIS WCS); GeoTiff Writer; GeoTiff File; "Strange land of GIS".]
NcML Dataset: “virtual view”
[Diagram. Components: NetCDF File and OpenDAP Dataset; NcML Dataset XML;
Dataset XML Parser (Java-netCDF 2.1); NetCDF Dataset; Client Application.]
              NcML Dataset
• Use NcML like CDL, to declare the contents of a
  netCDF file.
• Add, delete or rename Variables, Attributes, and
  Dimensions
• Subset Variables
• Reorder a Variable’s dimensions
• Aggregate multiple netCDF files, a la DODS
  Aggregation Server
• An NcML Dataset is a “virtual view”, or it can be copied
  to a local netCDF file.
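
The "virtual view" idea in this list can be sketched as a thin layer that renames or hides variables without touching the underlying data. This is a rough illustration under stated assumptions: `VirtualViewSketch` and its methods are hypothetical and are not the Java-netCDF API, and a plain map of arrays stands in for a netCDF file.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical classes for illustration; not the Java-netCDF API.
class VirtualViewSketch {
    private final Map<String, double[]> source;                          // stand-in for the file
    private final Map<String, String> renamed = new LinkedHashMap<>();   // new name -> original name
    private final Set<String> hidden = new HashSet<>();                  // names not visible in the view

    VirtualViewSketch(Map<String, double[]> source) {
        this.source = source;
    }

    // Exposes an existing variable under a new name and hides the old name.
    VirtualViewSketch rename(String originalName, String newName) {
        renamed.put(newName, originalName);
        hidden.add(originalName);
        return this;
    }

    // Removes a variable from the view; the source is left untouched.
    VirtualViewSketch remove(String name) {
        hidden.add(name);
        renamed.remove(name);
        return this;
    }

    // Returns the source array itself (no copy), or null if the name is not visible.
    double[] read(String name) {
        if (renamed.containsKey(name)) {
            return source.get(renamed.get(name));
        }
        return hidden.contains(name) ? null : source.get(name);
    }

    public static void main(String[] args) {
        Map<String, double[]> file = new LinkedHashMap<>();
        file.put("temp", new double[] {280.5, 281.0});
        file.put("debug", new double[] {0.0});

        VirtualViewSketch view = new VirtualViewSketch(file)
            .rename("temp", "temperature")
            .remove("debug");

        System.out.println(view.read("temperature").length); // data still comes from "temp"
        System.out.println(view.read("temp") == null);       // the old name is no longer visible
    }
}
```

The design point is that the view stores only the edits; making a local copy, as the last bullet mentions, would amount to writing out what `read` returns for every visible name.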
        2: NcML Datasets on a Server
[Diagram, host hostname.edu. Components: Catalog.xml; DODS Agg/Netcdf Server
(DODS, ADDE, FTP, HTTP) in front of the Datasets; Dataset XML Parser; NcML
Dataset XML; Client Application.]
 3: NcML Datasets via Catalogs
[Diagram. Components: Catalog.xml; NcML Dataset XML; NetCDF File and OpenDAP
Dataset; Catalog/Dataset XML Parser (Java-netCDF v 2.1.1); Client
Application.]
                   NIO
• Rewrite ucar.nc2 I/O layer using java.nio
  package (currently using ucar.netcdf)
• Uses memory mapping, bulk I/O transfer
• Prototype has 7x speedup on large files.
• Requires JDK 1.4+
• HTTP access must be rewritten
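
As a rough illustration of the memory-mapping approach named in these bullets, the sketch below maps a file read-only with `java.nio` and reads it sequentially. It is not the ucar.nc2 I/O layer: the class and method names are hypothetical, it reads raw bytes rather than netCDF records, and it uses try-with-resources (Java 7+) for brevity even though the `FileChannel.map` call itself dates from JDK 1.4.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical class for illustration; not part of ucar.nc2.
class MappedReadSketch {

    // Maps the file read-only and walks every byte, returning their sum.
    // A single mapping is limited to Integer.MAX_VALUE bytes (about 2 GB).
    static long sumBytes(String path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r");
             FileChannel channel = file.getChannel()) {
            MappedByteBuffer buffer =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long sum = 0;
            while (buffer.hasRemaining()) {
                sum += buffer.get() & 0xFF;   // treat each byte as unsigned
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java MappedReadSketch <file>
        long start = System.nanoTime();
        long sum = sumBytes(args[0]);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("byte sum " + sum + ", read in " + millis + " ms");
    }
}
```

With a mapping, the operating system pages file contents directly into the buffer, so the loop avoids the per-call copying of stream-based reads; how much that helps depends on the file size and platform.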
            NIO vs current Java
                            NIO (ms)   Current (ms)   Current/NIO
First access
 small (3.9 MB)                  281            671           2.4
 large (240 MB)                 3334          28221           8.5

Average of next 5 accesses
 small                            54            290           5.4
 large                          2239          16367           7.3

• Time in milliseconds to sequentially read the entire file
• Wintel 2 GHz, 1 GB main memory
• Java 1.4.2 -client
           NIO vs optimized C
                            NIO (ms)         C (ms)         C/NIO
First access
 small (3.9 MB)                  281            370           1.3
 large (240 MB)                 3334          19348           5.8

Average of next 5 accesses
 small                            54             24          0.44
 large                          2239            953          0.43

• Java 1.4.2 -client vs. VC 6.0 /O2
     NetCDF Data Model
[Diagram. Components: NetcdfFile; Dimension; Variable (with Attribute and
DataType); Attribute.]
DataType
               • byte
               • char
               • short
               • int
               • float
               • double
       OpenDAP Data Model
[Diagram. Components: Dataset (BaseType, Attribute); array (Dimension,
BaseType); structure / sequence (BaseType); Attributes attached to these
elements.]
BaseType
               • primitive (8)
               • string
               • array
               • grid
               • structure
               • sequence
               HDF5 Data Model
Groups: file directory structure inside HDF file.
Dataset: Datatype, DataSpace, Attribute, Data storage.
Datatype
                       • Fixed point
                       • Floating point
                       • Date/time
                       • String
                       • Bit field
                       • Opaque
                       • Compound
                       • Reference
                       • Enumeration
                       • Variable length
                       • Array
Data storage
                       • Compact
                       • External
                       • Layout
                       • Indexed
                       • Striped
   Possible Extensions to netCDF
             data model
• Add new data types:
   – Strings: variable length arrays of bytes, plus an encoding
     attribute.
   – Structures: collections of any other element types, allow nested
     structures.
   – Vector: a variable length 1D array of any type.
• Allow reusable structure definition = user defined data
  type.
• Allow unnamed, undeclared dimensions = anonymous
  dimensions.
• Allow multiple unlimited dimensions (outer dimension
  only)
• Compression. Push scale/offset into library, allow
  variable bit sizes.
• Explicit support for coordinate variables/axes.
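
The scale/offset idea in the compression bullet can be illustrated with the linear packing that netCDF files commonly express through the `scale_factor` and `add_offset` attribute conventions. The sketch below is only an assumption about what "pushing scale/offset into the library" might involve: the class name is hypothetical, it packs into a fixed 16-bit `short`, and it does not attempt the variable bit sizes the slide mentions.

```java
// Hypothetical class for illustration; not part of any netCDF library.
class ScaleOffsetSketch {

    // Packs a value into 16 bits: packed = round((value - offset) / scale).
    // Values outside the representable range would overflow; no check is made here.
    static short pack(double value, double scale, double offset) {
        return (short) Math.round((value - offset) / scale);
    }

    // Recovers an approximation of the original: value = packed * scale + offset.
    static double unpack(short packed, double scale, double offset) {
        return packed * scale + offset;
    }

    public static void main(String[] args) {
        double scale = 0.5;     // illustrative values only
        double offset = 10.0;
        short stored = pack(12.5, scale, offset);
        System.out.println(stored + " -> " + unpack(stored, scale, offset));
    }
}
```

If the library applied this transformation itself, a client would read and write real values while the file stored the smaller packed integers.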
 New NetCDF Data Model
[Diagram. Components: NetcdfFile; Dimension; Attribute; Variable (DataType,
Attribute); Structure (several DataTypes); Vector (Length, DataType).]
DataType
                • byte
                • short
                • int
                • long
                • float
                • double
                • String
                • Structure
                • Vector
                                NetCDF 4
[Diagram. Sources: NetCDF V.1 and 2 File and HDF5 File (local file or HTTP
protocol), OpenDAP Dataset (OpenDAP 4.0 protocol), and NcML Dataset XML
(virtual dataset), feeding the NetCDF 4 library, whose API serves the Client
Application.]

				