									                                   SDSC Technical Report 2003-04




 NARA Persistent Archives
NPACI Collaboration Project

           Status Report

              June 2003

              Reagan W. Moore
              Richard Marciano
               Charlie Cowart
              Sheau-Yen Chen
               Arcot Rajasekar
                Michael Wan
       San Diego Supercomputer Center
                  Joseph JaJa
           University of Maryland
                Paul Berkman
          George James Morgan III
              EvREsearch LTD
                 June 6, 2003



                Sponsored by
 National Archives and Records Administration
The views and conclusions contained in this document are those of the authors and should not be interpreted as
representing the official policies, either expressed or implied, of the National Archives and Records
Administration or the U.S. Government.




                                                                   Table of Contents


1. Introduction
2. Hardware Systems Status
3. Software System Status
   3.1 SRB Development
   3.2 Multivalent Browser Development
4. Collection Status
   4.1 Clinton Government Web Snapshot – NCES
   4.2 Image EAD Collection
   4.3 US Code
   4.4 Performance Testing
5. Budget
6. Outreach
7. Papers
8. Presentations, Workshops, and Meetings
Appendix A. NARA WEB SNAPSHOTS (NCES Website)
Appendix B. Statistics for the accessioned holding collection, provided on 3480 tape




1. Introduction

The National Partnership for Advanced Computational Infrastructure (NPACI), an NSF-sponsored program, is
collaborating with the National Archives and Records Administration (NARA) on a demonstration of prototype
persistent archives. The NPACI partners formally involved in the collaboration include the research staff of the
Data and Knowledge Systems group at the San Diego Supercomputer Center (SDSC), Robert Wilensky at the University
of California at Berkeley, and Joseph JaJa at the University of Maryland. The EvREsearch Corporation is
applying concept-based organization of material. Additional NPACI partners are informally involved,
including Stanford, which is collaborating on the integration of the WebBase web crawling technology with the
SDSC Storage Resource Broker data grid. This technology is used to crawl presidential web sites.

The goal of the development effort is to demonstrate support for automation of archival processes through use
of data grid technology. A prototype persistent archive is being implemented that includes systems at SDSC,
the University of Maryland, and NARA. These systems are in place and connected via wide area networks.

This status report describes progress in accessioning NARA collections, registration of the collections onto the
Storage Resource Broker data grid, the development effort that is being pursued to support federation of
multiple independent metadata catalogs, and standardization of persistent archive technology.

2. Hardware Systems Status

The standard Grid Brick hardware configuration remains the same.

                             Intel Celeron 1.7 GHz CPU
                             SuperMicro P4SGA PCI Local bus ATX mainboard
                             1 GB memory (266 MHz DDR DRAM)
                             3Ware Escalade 7500-12 port PCI bus IDE RAID
                             10 Western Digital Caviar 200-GB IDE disk drives
                             3Com Etherlink 3C996B-T PCI bus 1000Base-T
                             Redstone RMC-4F2-7 4U ten bay ATX chassis
                                        Table 1. Grid Brick configuration

SDSC has ordered a database grid brick for supporting the persistent archive metadata catalog. Its
configuration is considerably more robust than the standard Grid Brick, to ensure high availability. The
database grid brick configuration is shown in Table 2. The system will be used to run an MCAT metadata
catalog for testing the ability to federate catalogs.

                             Two 2.4GHz processors
                             2 CPU mother board (high-end board)
                             4GB memory total (interleaved DDR)
                             128-bit fast I/O bus
                             3Ware Escalade 7500-12 port PCI bus IDE RAID
                             16+2 SCSI 15K RPM 36 GB with RAID 1 or (1 +0)
                             Hot-swappable disk
                             2 Gigabit Ethernet connections on mother board
                             3 unit boxes with power supplies
                                           Table 2. Database Grid Brick
3. Software System Status

The Storage Resource Broker software has been installed at NARA, University of Maryland, and SDSC. A
separate MCAT metadata catalog is installed at each site. The systems function as three separate and
independent data management systems or zones. The Grid Bricks at each site have been cross-registered onto
each zone. Thus, by writing to the appropriate physical resource name, it is possible for the NARA zone to
store data onto the SDSC grid brick. Since eight grid bricks have been installed at SDSC, a logical resource
name, “nara-brick-sdsc”, has been implemented to control the SDSC storage repositories. Writing to
“nara-brick-sdsc” results in the storage of the digital entity onto the next physical resource in the list,
effectively load-leveling storage across the eight grid bricks. Table 3 gives the configuration that
is currently being used to name the storage resources. The same port number, 6618, is used at all three sites for
accessing the metadata catalogs.

                      Zone                     NARA Zone      U Md Zone       SDSC Zone
                      Logical resource name                                   nara-brick-sdsc
                      Physical resource name   erasrb-linux   bodleian-prod   nara-brick1-sdsc
                                                                              nara-brick2-sdsc
                                                                              nara-brick3-sdsc
                                                                              nara-brick4-sdsc
                                                                              nara-brick5-sdsc
                                                                              nara-brick6-sdsc
                                                                              nara-brick7-sdsc
                                                                              nara-brick8-sdsc

                               Table 3. Prototype Persistent Archive Configuration
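
The behavior of the logical resource name described above, where each write goes to the next physical resource in the list, can be illustrated with a minimal sketch. This is not SRB code: the `LogicalResource` class and its `select` method are hypothetical names introduced only for illustration, and the sketch assumes the eight SDSC bricks are named nara-brick1-sdsc through nara-brick8-sdsc.

```python
from itertools import cycle


class LogicalResource:
    """Hypothetical stand-in for an SRB logical resource (not the SRB API)."""

    def __init__(self, name, physical_resources):
        self.name = name
        self.physical_resources = list(physical_resources)
        # cycle() walks the list in order and wraps around at the end.
        self._next = cycle(self.physical_resources)

    def select(self):
        """Return the physical resource that should receive the next write."""
        return next(self._next)


# Assumed brick names: nara-brick1-sdsc ... nara-brick8-sdsc.
bricks = [f"nara-brick{i}-sdsc" for i in range(1, 9)]
nara_brick_sdsc = LogicalResource("nara-brick-sdsc", bricks)

# Ten consecutive writes: the ninth wraps back to the first brick.
placements = [nara_brick_sdsc.select() for _ in range(10)]
for target in placements:
    print(target)
```

A simple rotation of this kind spreads files evenly by count rather than by size; whether the SRB also weighs free space when choosing the next resource is not stated in this report.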

3.1 SRB Development

The SDSC Storage Resource Broker data grid is being extended to add features that are required by the
persistent archive prototype. Version 2.1 of the SRB was released in June 2003. This version included both
bug fixes for version 2.0 and new features:

   1. Better support for GSI (Grid Security Infrastructure). It is now easier to build SRB with support for
      GSI, via some new configure options. The separate AID library is no longer needed as the SRB release
      contains the needed source code that interfaces to the GSI libraries. The system supports GSI 2.2 (NMI
      2).

   2. Optional data encryption and compression. We had planned to implement GSI data encryption on
      network traffic, but found we could make it more secure, higher-performance, and also include data
      compression via a different approach. This system is accessed via scripts that operate on top of SRB
      Scommands and OpenSSL commands. Some of the SRB commands have been enhanced to encrypt key
      fields (file encryption keys) as they are transferred to/from the MCAT-enabled SRB server. This SRB
      field encryption is performed via OpenSSL libraries, which are also the foundation of GSI. For
      more information, see README.dataEncryptionCompression and
      http://www.npaci.edu/dice/srb/SecureAndOrCompressedData.html.

   3. SDSC Matrix, a service (SOAP/WSDL) oriented architecture for SRB. SDSC Matrix is a W3C
      standards based approach to provide a (web) service oriented architecture for data grids, digital libraries
      and persistent archives. The services include support for data movement, replication, access control,
      data set ingestion, retrieval, and container support. SDSC Matrix is a protocol layer on top of the SRB
      protocol stack providing interoperability, asynchronous messages, and stateful services. Matrix
      internally converts the SOAP protocol based messages to native SRB protocol messages. This is the
      first of a series of releases for SDSC Matrix, as it is under active development. See
      http://www.npaci.edu/DICE/SRB/matrix for more information.

4. JARGON, a pure Java API. JARGON is a pure Java API for developing programs with a grid interface.
   The API currently handles file I/O for local and SRB file systems and is easily extensible to other file
   systems. File handling with JARGON closely matches file handling in Sun's java.io API, which is familiar
   to most Java programmers. The previous SRB Java client library (used by SrbBrowser and the admin
   tool) was implemented via JNI, which connects to the C client code. See
   http://www.npaci.edu/DICE/SRB/jargon for more information.

5. Bulk load without requiring the use of a container. The Sbload utility has been modified to allow the
   bulk load of the content of a local directory to SRB without requiring the use of containers. If the [-c
   container] option is not specified in the command line, the bulk load is assumed to be carried out
   without the use of a container. In addition, a new API, srbBulkLoad(), has been created for such
   operations. The performance improvement without a container is not as great as with a container,
   since the container design fits the bulk load operation perfectly. Nevertheless, an improvement of
   up to a factor of 5 can be achieved. Support for Sbunload of files that are not in containers will
   be added next.

6. Listing of host-specific resources. A [-H hostname] option has been added to the SgetR utility to list all
   resources on the specified host.

7. Parallel transfer configuration. Two configurable parameters, MAX_THREAD and
   SIZE_PER_THREAD, were added to determine the number of threads for parallel transfer. These can
   be adjusted in the runsrb script, before starting a server.

8. SRB Python binding. The new SRB Python binding exposes some SRB client APIs, which allow any
   Python program to access SRB systems via POSIX-style file I/O calls, to query SRB system metadata
   for files, and to manipulate SRB systems, for example by creating and deleting a collection. A file,
   README.python, included in the SRB source tree, provides an overview of the SRB Python binding and an
   example of using its APIs.

9. Sput/Sget checksum, status, and retry features added. Sput -M will store the checksum of the original
   file into the MCAT as it is being stored. Sget -M will verify this checksum on the data as the file is
   retrieved. Both of these operate within the Sput/Sget programs, without additional disk I/O.
         Sget or Sput -V displays a progress bar.
         Sget or Sput -v shows a final message.
         Sget or Sput -V forces -v.
         Sput or Sget -R number, will cause retries up to 'number' times if an error occurs, with an
            exponential backoff sleep in between.

10. SRB_Install_Notes.doc - This is an excellent SRB installation note written by Dr. Michael Doherty of
    the e-Science Group at the Rutherford Appleton Laboratory in the UK. It describes from start to finish the
    installation of MCAT, MES (MCAT-enabled server), other SRB resource servers, and client software.
    The install note is based on a Linux system with Oracle 9i as the database for MCAT. However, most of
    the note is applicable to other operating systems and DBMSs.
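
The checksum and retry behavior described in item 9 (verify a stored checksum on retrieval, and retry with an exponential backoff sleep after an error) can be sketched as follows. This is an illustration of the general technique, not the Sput/Sget implementation: the function names, the use of MD5, and the one-second base delay are all assumptions, since the report does not state the checksum algorithm or the backoff schedule.

```python
import hashlib
import time


def checksum(data: bytes) -> str:
    # MD5 is an assumption; the report does not name the checksum algorithm.
    return hashlib.md5(data).hexdigest()


def transfer_with_retry(fetch, expected_checksum, retries,
                        base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns data matching expected_checksum.

    Makes one initial attempt plus up to `retries` retries. After the n-th
    failure (counting from 0) it sleeps base_delay * 2**n seconds, so the
    wait doubles each time.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            data = fetch()
            if checksum(data) != expected_checksum:
                raise IOError("checksum mismatch")
            return data
        except IOError as err:
            last_error = err
            if attempt < retries:
                sleep(base_delay * 2 ** attempt)
    raise last_error


# Usage with a simulated source that fails twice before succeeding.
attempts = {"count": 0}


def flaky_fetch():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise IOError("simulated network error")
    return b"payload"


delays = []  # record the sleeps instead of actually waiting
result = transfer_with_retry(flaky_fetch, checksum(b"payload"),
                             retries=3, sleep=delays.append)
```

Passing the sleep function as a parameter keeps the sketch testable without real delays; a production client would leave the default in place.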

A critical software component of the Prototype Persistent Archive will be the ability to federate the independent
MCAT metadata catalogs at NARA, University of Maryland, and SDSC. Federation requirements are being
developed in collaboration with the UK data grid, the Particle Physics Data Grid, the Biomedical Informatics
Research Network, and the National Partnership for Advanced Computational Infrastructure. The
implementation is proceeding in stages:

   1. MCAT request forwarding. This is the ability for an MCAT metadata catalog (or zone) to forward a
      service request to a second MCAT catalog (zone).
   2. Support for data and metadata copying from one zone to a second zone.
   3. Hierarchical organization of zones. This is the ability to register all metadata from a set of MCAT
      catalogs into a single MCAT catalog.
   4. Replication of metadata between zones. This requires consistency constraints for access control and
      metadata update.
   5. Simultaneous registration of metadata into multiple zones.

A first release, planned for September 2003, is intended to support the first three of these stages.
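
The first two stages listed above, request forwarding and copying of metadata between zones, can be illustrated with a toy model. Everything here is hypothetical: the `Zone` class, its methods, and the example path are introduced for illustration only and do not reflect the MCAT schema or the SRB federation interface, which were still under development at the time of this report.

```python
class Zone:
    """Hypothetical model of an MCAT zone; not the SRB federation API."""

    def __init__(self, name):
        self.name = name
        self.catalog = {}  # logical path -> metadata record (a dict)
        self.peers = {}    # zone name -> Zone

    def register_peer(self, zone):
        self.peers[zone.name] = zone

    def lookup(self, zone_name, path):
        """Stage 1: answer locally, or forward the request to the named zone."""
        if zone_name == self.name:
            return self.catalog.get(path)
        peer = self.peers.get(zone_name)
        if peer is None:
            raise KeyError(f"unknown zone: {zone_name}")
        return peer.lookup(zone_name, path)

    def copy_from(self, zone_name, path):
        """Stage 2: copy a metadata record from another zone into this one."""
        record = self.lookup(zone_name, path)
        if record is not None:
            self.catalog[path] = dict(record)  # independent local copy
        return record


# Usage: the NARA zone forwards a request to the SDSC zone, then copies the record.
nara = Zone("nara")
sdsc = Zone("sdsc")
nara.register_peer(sdsc)
sdsc.catalog["/holdings/file1"] = {"size": 1024}  # example path is made up

forwarded = nara.lookup("sdsc", "/holdings/file1")
nara.copy_from("sdsc", "/holdings/file1")
```

After the copy, the two zones hold independent records, which is why the later stages call for consistency constraints on access control and metadata update.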

3.2 Multivalent Browser Development

The University of California, Berkeley, continues development of the Multivalent document presentation
technology. A comparison of the data types found in the Presidential Web sites with the data types supported
by the Multivalent Browser is given in Table 4. The major challenge is support for the Microsoft Office
proprietary formats. With the release of Office 11, an option is expected that will support migration to an XML
encoding. Currently, XML support is partially available in the Multivalent Browser and will need to be
extended to manage Office 11 XML products.

It is worth noting that the top eight formats in terms of prevalence within Presidential Web Sites are all
supported.

                           Presidential Web site data types       Multivalent Browser data types
                                         html                                  html
                                          gif                                   gif
                                          jpg                                   jpg
                                         htm                                   htm
                                         xml                                  (xml)
                                          txt                                   txt
                                          pdf                                   pdf
                                          css                               html+css
                                         doc
                                          asp
                                          ppt
                                          xls

       Table 4. Comparison of data format types and display interfaces provided by the Multivalent Browser
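
The comparison in Table 4 amounts to a set difference between the two columns. The short sketch below reproduces it, assuming the left column lists formats in order of prevalence and treating the partial XML support as supported; the variable names are illustrative only.

```python
# Left column of Table 4, assumed to be in order of prevalence.
site_formats = ["html", "gif", "jpg", "htm", "xml", "txt",
                "pdf", "css", "doc", "asp", "ppt", "xls"]

# Right column of Table 4; xml is only partially supported.
browser_formats = {"html", "gif", "jpg", "htm", "xml", "txt", "pdf", "css"}

supported = [f for f in site_formats if f in browser_formats]
unsupported = [f for f in site_formats if f not in browser_formats]

print("supported:  ", ", ".join(supported))
print("unsupported:", ", ".join(unsupported))
```

The unsupported remainder consists of the Microsoft Office formats (doc, ppt, xls) and asp pages.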




4. Collection Status

The schedule for accessioning selected NARA digital holdings is on track. Additional information about
particular collection properties is presented in Appendices A and B. The size column in Table 5 lists the current
amount of data that has been registered into the prototype persistent archive.

    Site        Collection                 Size (GB)  # Files   Media      Activity                      Contact                       Due date
    SDSC, NARA  Accessioned Holdings       36         540       3480 tape  Documentation                 Fynette Eaton                 Accession by May
    SDSC        Presidential Websites      11         15,000    Web        Time Snapshot comparison      Sam McClure, Fletcher Burton  Accession by May
    EVR         Code of Fed. Regulations   1          54        Document   Anomaly detection, Knowledge  Fynette Eaton                 Arrange by May
    UMD         Image EAP                  1300       600,000   WORM       Media Migration               Steve Puglia                  Accession by June
    SDSC        Clinton Gov. Web Snapshot  54         176,000   CD-ROM     Time Change tracking          Fynette Eaton                 Accession by June


                      Table 5. Selected collections for testing on the persistent archive prototype

Additional collections will be accessioned in collaboration with NARA, as listed in Table 6. For each of these
collections, coordination is done through NARA to gain access. The USPTO claims collection is already at
SDSC.

    Site        Collection                  Size (GB)  # Files    Media     Activity                 Contact         Due date
    SDSC        AAD (accessioned holdings)  1                     Web       Documentation            Fynette Eaton   Accession by July
    EVR         USPTO Claims                150        2,000,000  Document  Organization, Knowledge  Reagan Moore    Arrange by July
    U Md        White House E-Gov           10                    CD-ROM    Constraints on Access    Fynette Eaton   Accession by Aug.
    EVR         US Code-Syracuse            1                     Document  Organization, Knowledge  TBD             Accession by Sept.
    SDSC, UMD   DOE Engineering Data        500                   Network   CadCAM drawings          John Zimmerman  Accession by Oct.
    UMD, SDSC   EDGAR SEC Filings           10                    Web       Dynamic Document         TBD             Accession by Oct.


                           Table 6. Additional collections to register into the SRB data grid




4.1 Clinton Government Web Snapshot – NCES
The NCES Web Snapshots (provided on 10 CDs) included the data organization listed below. Despite the
presence of 83 different file types, 97% of the files used one of the eight file types listed in Table 8.

               NCES Statistics
                  83 file types
                  53.5 GB total
                  2,767 folders
                  175,968 files
   Table 7. NCES statistics on data organization

               File Type        Percent of files
                 html               39%
                 csv                26%
                 jpg                13%
                 gif                12%
                 pdf                 3%
                 asp                 2%
                 xls                 1%
                 shtml               1%
   Table 8. Types of files in NCES Web Snapshot


4.2 Image EAD Collection

The University of Maryland has completed the accession of the EAD image collection. In addition to ingesting
the NARA image collection, U Md also loaded the EAP tracking MS Access databases into the user-defined
values of the MCAT. This enables simple searches and easier navigation of the collection. To
search the 8 million pieces of metadata, U Md used the Excalibur Text Datablade from IBM to efficiently index
the data. Arcot Rajasekar (SDSC) supplied U Md with a patch to the MCAT to allow U Md to call native
database functions.

The Perl module demonstrated during the March meeting has been cleaned up and released to SDSC to be
incorporated into the SRB. The released version currently supports only file I/O functions, but more intuitive
metadata access is under development and should be released soon. An internal prototype supporting metadata
has been used for several months. The Perl module is being used to present several views of the EAP collection
on the Web. A simple search against user-defined metadata was written. More advanced navigation of the
collection is being developed, which will allow mined relationships (NAIL number hierarchy) and user-defined
relationships to be browsed. To support advanced browsing, U Md is extracting the EAP metadata into separate
tables in order to search and index the data more efficiently.


4.3 US Code

EvREsearch is continuing to apply its 'digital integration system' to the United States Code for the purpose
of quantifying mark-up-language and metadata applications in relation to storage requirements, granularity,
searchability, digital ontologies and knowledge-discovery displays.


4.4 Performance Testing

The utility of data grid technology was first demonstrated by the replication of collections
between the three sites comprising the prototype persistent archive. The 36-GB Accessioned Holdings that
were read from 3480 tape at SDSC were replicated from the SDSC zone onto the NARA zone. The transfers
were done at a sustained rate of 1.37 MB/sec. We will repeat the tests using parallel I/O streams to achieve
higher data transfer rates. The statistics for the replication of the SDSC 3480 tape collection onto the NARA
brick are:
        count = 585 files
        size = 35.9 GB

The collection is organized on the SDSC brick as 3 subcollections (1 per NARA shipment: November,
December, and March).

The transfer to NARA consisted of a staging step followed by a data transfer:
       SDSC SRB brick -----> SDSC disk -----> NARA SRB brick
This procedure is a simple way to replicate data between two independent data collections.

       STAGING STATS:
               50 min 34 sec --> for NARA Shipment 1
               51 min 06 sec --> for NARA Shipment 2
               51 min 06 sec --> for NARA Shipment 3
       ----------------------
         2 hours 32 min 46 sec (total)

       TRANSFER STATS:
         2 hours 34 min 18 sec --> for NARA Shipment 1
         2 hours 46 min 59 sec --> for NARA Shipment 2
         1 hour 53 min 25 sec --> for NARA Shipment 3
       ----------------------
         7 hours 14 min 42 sec (total)
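
The totals and the sustained transfer rate can be checked from the per-shipment figures above. The sketch below assumes 1 GB = 1000 MB and that the rate is computed over the transfer step only (excluding staging); under those assumptions the result is just under 1.38 MB/sec, in line with the reported 1.37 MB/sec.

```python
def to_seconds(hours, minutes, seconds):
    """Convert an h/m/s duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds):
    """Convert seconds back to an (hours, minutes, seconds) tuple."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


# Per-shipment durations from the staging and transfer statistics.
staging = [to_seconds(0, 50, 34), to_seconds(0, 51, 6), to_seconds(0, 51, 6)]
transfer = [to_seconds(2, 34, 18), to_seconds(2, 46, 59), to_seconds(1, 53, 25)]

staging_total = sum(staging)
transfer_total = sum(transfer)

size_mb = 35.9 * 1000  # assumption: 1 GB = 1000 MB
rate_mb_per_sec = size_mb / transfer_total

print("staging total :", to_hms(staging_total))
print("transfer total:", to_hms(transfer_total))
print("sustained rate: %.3f MB/sec" % rate_mb_per_sec)
```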

5. Budget

The funding support for the prototype persistent archive is being used to support project tasks through August
2003. The spending to date that has been billed is listed in Table 9 for the periods October 2002 through
December 2002, January 2003 through 31 March 2003, and April 2003 through June 2003.

         Expenses              Oct-Dec 2002     Jan-Mar 2003     Apr-June 2003
         Salaries                   $18,201          $65,374           $27,111
         Benefits                    $8,179          $22,953           $12,437
         Equipment                   $4,004          $16,935            $8,206
         Travel                          $0           $1,877            $1,903
         Supplies                      $279             $925              $540
         Subcontracts
           General Atomics           $1,820           $7,133                $0
           Univ. Maryland           $58,740          $24,325          $134,546
           UC Berkeley              $11,601               $0            $7,491
           EvREsearch               $10,780               $0           $13,798
         Indirect costs             $11,925          $43,512           $18,851
         Quarterly Total           $125,531         $183,036          $224,883

               Table 9. Expenditures Billed on the NARA Prototype Persistent Archive Project

The remaining funds on the project are $66,550, to cover labor costs for July and August, 2003.
6. Outreach

At the Global Grid Forum meeting, a final draft of an information document on “Persistent Archive Concepts”
was presented. The paper has been submitted to the Global Grid Forum for approval. The final draft was
emended to address comments from the preservation community, including comments provided by Margaret
Hedstrom, William Underwood, and Mark Conrad.

7. Papers

1. R. Moore, A. Merzky, “Persistent Archive Concepts”, Global Grid Forum Persistent Archive Research
   Group, version four of the draft on Persistent Archive Information document, presented at Global Grid
   Forum 8 in Seattle, June 25, 2003.

8. Presentations, Workshops, and Meetings

To promote the development of common systems between the digital library community, the data grid
community, and the persistent archive community, presentations have been given at workshops, conferences,
project coordination meetings, federal review meetings, and at standards meetings. In particular, persistent
archives are being developed for the NSF National Science Digital Library (NSDL) and the California Digital
Library (CDL). Federation of data resources through data grids is being developed for the NSF National Virtual
Observatory (NVO), the NSF Grid Physics Network (GriPhyN), the NSF Southern California Earthquake
Center (SCEC), the NASA Information Power Grid (IPG), the DOE Particle Physics Data Grid (PPDG) and the
NLM Digital Embryo project. The goal is to promote common software solutions across all of these application
areas that will result in a standard implementation that can be used by other communities. The following
presentations were given by Reagan Moore.

1.   June 30, 2003    NSF           Data Grids for the El Tigre Texas grid, University of Texas, Austin
2.   June 25, 2003    GGF           Persistent Archive Concepts – GGF Informational Document
3.   June 25, 2003    GGF           Recommendation for Standard Operations at Remote Sites
4.   June 22, 2003    InterPARES    “e-archiving for posterity”, Leuven, Belgium, presentation by William
                                    Underwood
5.   June 19, 2003    DOE           High Performance Scientific Computing Workshop
6.   June 17, 2003    NSF           NSF Post-DL Futures Workshop
7.   June 13, 2003    AFIP          Embryology, Imaging and Education Conference
8.   June 9, 2003     DOE           Particle Physics Data Grid




Appendix A. NARA WEB SNAPSHOTS (NCES Website)

For each CD-ROM, the number of files, amount of data, and types of files are enumerated. The notable statistic
is that 97% of the files belonged to one of eight major file types. There were a total of 25 file types for which
only a single file was stored on the CD-ROM.


                 CD1      CD2     CD3     CD4     CD5     CD6        CD7     CD8     CD9     CD10   SUMs     %
    folders        257     432     268     240     260     29         700     538      40      3     2767
    files         3733    14055   12922   13370   14116   3764       64860   31237   17899    12    175968
    size          650M    643M    648M    624M    618M    547M       591M    617M    624M    140K   53.5GB

    ~gi                               1                                                                 1
    ldb               9                                                                                 9
    asp             185    1258     368     200     708          2                     104      4    2829    2%
    pdf             283     324     503    1010     604       1383     539    1174     145           5965    3%
    ai                                        1                                                         1
    agy               2                                                                                 2
    exe              64       4       9      34       1                                  1            113
    dll                      11                                                                        11
    asc                               1                                                                 1
    bak                       2       6               9         1               73                     91
    bmp                       1                                                                         1
    cache            71      11      73      48                                                       203
    cache+           69      11      76      48                                                       204
    cap                               1                                                                 1
    css                       9       3       2       3         1                        2             20
    chp                               1                                                                 1
    cif                               1                                                                 1
    class                     1               2                                          8             11
    dat                                                                                  1              1
    dd                7                                                                                 7
    dos               1       2       2                                                                 5
    dwt                                                                          1                      1
    eds                       5                                                                         5
    ext                       1                                                                         1
    (none)          142      21     162     111       3                          2       1            442
    gif             222    3673    2125    5812    5484        143      98    2692     603          20852    12%
    gio                                       3                                                         3
    html            119    4783    4673    3657    6655       1085   32426   13903     423          67724    39%
    idq                      12      10       1                                                        23
    inc              11     185      27       3      27                                 38      8     299
    htx               2      12       9       1                                                        24
    java                                      1                                                         1
    jbf                                       3                                                         3
    jpg             736    1478    2907      79     337                         22   16531          22090    13%
    jpeg                                              1                                                 1
    js                        2                       6         2        1                             11
    link              2               7       3                                                        12
    link-bbs          2               8       3                                                        13
    lst               6       6                                                                        12
    png                       3       2               4                         10                     19
    map                               1               7                                                 8
    me1               2               2                                                                 4
    me2               2               2                                                                 4
    eps                                                                          1                      1
    mdb               1       6                                                                         7
    csv                       4                                995   31794   12390                  45183    26%
    udl                               1                                                                 1
    dsn                                               2                                                 2
    xls               3       4      27    1093                                469      32           1628    1%
    dbf                               4                                                                 4
    ppt              16      32                       1                                                49
    scc                      19                      21                                                40
    sch               2                                                                                 2
    doc              17      56       4       2       2                         54                    135
    mov                                               2                                                 2
    bat                       1                                                                         1
    old               2       1       1                                                                 4
    org                       1                                                                         1
    psd                      30       2       5                                          2             39
    prn                                     133                                                       133
    ps                       23                                                                        23
    psp                                       1                                                         1
    rtf                      27                                                                        27
    sas                               3                                                                 3
    sf                        8                                                                         8
    shtm                     13       9                                                                22
    shtml            35    1588      15             200        152       2                           1992    1%
    sql                       4                       2                                  1              7
    sd2               1                                                                                 1
    sps                               3                                                                 3
    sty                               1                                                                 1
    sys               1                                                                                 1
    txt             295      82     529      81      14                          3                   1004
    tmp                       3               2      23                          1       1             30
    wav                                                                                  1              1
    asx               2       2                                                          3              7
    frm                               1                                                                 1
    wid                               1                                                                 1
    zip            1426     320     391     497                                  3       2           2639    2%
    wk1                             950     534                                439                   1923    1%
    wp                1      16       2                                                                19
    wp5               1                                                                                 1
    youth_indic                       1                                                                 1
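
A tally of this kind can be reproduced with a short script. The following is a minimal sketch in Python, not the procedure used to build the table above: the function names `tally_extensions` and `share_of_top` are hypothetical, and the rule for deriving a file type (the lowercased text after the last dot, with `(none)` for files that have no extension) is an assumption.

```python
import os
from collections import Counter


def tally_extensions(root):
    """Walk `root` and return (folders, files, total_bytes, per-extension counts)."""
    by_ext = Counter()
    folders = files = size = 0
    for dirpath, dirnames, filenames in os.walk(root):
        folders += len(dirnames)  # every subdirectory is listed once by its parent
        for name in filenames:
            files += 1
            size += os.path.getsize(os.path.join(dirpath, name))
            # Assumed rule: file type = lowercased suffix after the last dot.
            ext = os.path.splitext(name)[1].lstrip(".").lower()
            by_ext[ext or "(none)"] += 1
    return folders, files, size, by_ext


def share_of_top(by_ext, n):
    """Fraction of all files that belong to the n most common file types."""
    total = sum(by_ext.values())
    top = sum(count for _, count in by_ext.most_common(n))
    return top / total if total else 0.0
```

Running `tally_extensions` once per mounted CD-ROM and adding the resulting counters would give per-disc columns and an overall total; `share_of_top(by_ext, 8)` then gives the share of files held by the eight most common types.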




Appendix B. Statistics for the accessioned holding collection, provided on 3480 tape

count =       585 files, 298 tapes, sent to SDSC in 3 separate shipments
size =         36 GB

1. Decennial Census of Population & Housing,
(219 tapes, 312 files)

       Bureau of the Census records, Record Group 29.

2. ESAA (Emergency School Aid Act)
(1 tape, 78 files)
      Records of the Office of Education (Record Group 12) 1870-1983
       12.2.9 Records of the Office of Program Planning and Evaluation

3. Survey of Household Food Consumption
      Household Survey
             Basic Data
             Elderly Data w/weights, Winter 1978
      Individual Survey
             Basic Data, Summer 1977, Fall 1977, Winter 1978, Summ
             Low Income I Data, 1977-78
   Survey of the Characteristics of Households Receiving Food Stamps, Raw Data,
      February 1978
(30 tapes, 44 files)

       Records of the Food and Consumer Services

           462.2.1 Records of the Food Stamp Division
           462.2.2 Records of the Consumer and Food Economics Division

4. PUTNAM 1970 ALABAMA
(11 tapes, 39 files)
      Putnam study (1970 census) 381.3.4
       381.3.4 Records of the Office of Operations

5. Expenditure and Employment Data for the Criminal Justice System, Longitudinal Files,
      Data Dictionary 1971-1979
(1 tape, 27 files)

6. 26N-81W, Everglades Area, Florida, HYDRO 26W-81W
(1 tape, 22 files)

7. Council for Financial Aid to Education, Voluntary Support Survey (VOLSUP72)
      1971-1972 NCFPE
(1 tape, 16 files)
      RECORDS OF TEMPORARY COMMITTEES, COMMISSIONS, AND BOARDS, R.G. 220

8. Annual Import Data Bank (IA245), 1971, 1980, 1981, 1990, 1991
      Annual Export Data Bank (EA645), 1991
(10 tapes, 14 files)
       Record Group 29: Records of the Bureau of the Census


9. Records and Information Management System (RIMS), 1986, 1987, 1988, 1989
(12 tapes, 12 files)

10. COMPARE2 1977
(1 tape, 6 files)

11. Military Prime Contract File (MPCF), July 1966-June 1967
(2 tapes, 6 files)
      Record Group 330: Records of the Office of the Secretary of Defense

12. Central Personnel Data File (CPDF) Current Status Master File, Scrambled Version 1975
      Central Personnel Data File (CPDF) / Current Status File, Scrambled Version, 1976
(4 tapes, 4 files)
        RECORDS OF THE U. S. CIVIL SERVICE COMMISSION, R.G. 146

13. T00650 001 1 330-1976-039A O KHMER_7001_7404_NIPS N01773 18370N
      D 01000 01004 DUPLI.
(1 tape, 1 file)
       IN THE BOX BUT NOT READ YET – LABEL SAYS:
            To 0650 KHMER NIPS
            Unlabeled 9200 blocks
            Copy for AAD/Muller
      CAMBODIAN INCIDENTS FILE (KHMER), 1970-1974

14. CRIS Y89 Oct 1, 1988 – Sept 30, 1989
(1 tape, 1 file)
      Records of the Forest Service (Record Group 95) 1870-1989 ??

15. Elementary and Secondary School Civil Rights Survey, 1972
(1 tape, 1 file)
      441.4 RECORDS OF THE OFFICE OF THE ASSISTANT SECRETARY FOR CIVIL RIGHTS 1968-76

16. Southeast Asia Data Base (SEADAB), March – April 1972
(1 tape, 1 file)
      http://www.archives.gov/research_room/center_for_electronic_records/defense_department.html
      Records of the U.S. Joint Chiefs of Staff [JCS]
      (Record Group 218) 1941-78





								