(Current Status and Planning Document)
San Francisco Estuary Institute
SFEI is actively supporting the State Water Board’s need to have current and future
statewide ambient monitoring data available in a standardized, widely available data repository for use
in evaluating to what extent the SWAMP assessment questions are being answered, by becoming one of
the regional SWAMP-Nodes.
In preparation for becoming a node, SFEI has been working for several years with the SWAMP
Data Management Team (SWAMP DMT) at Moss Landing Marine Laboratories (MLML) and the
Department of Water Resources (DWR) to format SFEI's ambient monitoring data into SWAMP-
comparable formats, and to develop a compatible enterprise data system at SFEI with replication to
DWR’s BDAT/CEDEN statewide system.
STATUS OF SFEI’S SWAMP-NODE DEVELOPMENT
(Enterprise Data System/BDAT Replicate/SWAMP-Node development)
SFEI staff met with Karl Jacobs at DWR on January 18th, 2006 to begin work to create SFEI's
BDAT replicate (node). At this meeting the group concluded that the customary prerequisite for BDAT
node creation, creation of an enterprise data management system, would be carried out with assistance
from DWR staff. This document outlines our envisioned process to implement the SWAMP-Node at
SFEI, the status of SFEI’s progress, and current agreements of collaboration.
Part I. Creation of an enterprise data system for SFEI (see Figure 1)
With the assistance of Karl Jacobs’s group at DWR, SFEI will create an enterprise data system at
SFEI, based on MS SQL Server. In particular, Liz Cook will be making frequent visits to SFEI, and a
regular place to work has been provided. The enterprise system will be a database with a generalized
table structure designed to hold a number of different types of environmental data sets and be the single
location for most of SFEI’s environmental data.
The enterprise data system will house both SWAMP-comparable data and other data types. The
system will emulate the SWAMP v2.5 database design and have the capability to house (and manage)
all of SFEI’s contaminant monitoring data and some other types of information, such as those related to
habitat condition, watershed processes and functions, benthos, photos, and information related to
presenting results graphically using GIS.
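The idea of a generalized table structure can be pictured with a small sketch. It assumes SQLite as a stand-in for MS SQL Server, and every table name, column name, and value below is hypothetical and made up for illustration; it does not reproduce the SWAMP v2.5 design.

```python
import sqlite3

# Hypothetical schema, NOT the SWAMP v2.5 design: one generalized results
# table that can hold several kinds of monitoring data side by side.
SCHEMA = """
CREATE TABLE station (
    station_id   INTEGER PRIMARY KEY,
    station_code TEXT NOT NULL UNIQUE
);
CREATE TABLE result (
    result_id   INTEGER PRIMARY KEY,
    station_id  INTEGER NOT NULL REFERENCES station(station_id),
    data_type   TEXT NOT NULL,   -- e.g. 'chemistry', 'toxicity', 'benthos'
    matrix      TEXT,            -- e.g. 'water', 'sediment', 'tissue'
    parameter   TEXT NOT NULL,
    value       REAL,
    unit        TEXT,
    sample_date TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO station (station_id, station_code) VALUES (1, 'DEMO01')"
    )
    rows = [  # illustrative values only, not real monitoring results
        (1, "chemistry", "water", "copper", 2.1, "ug/L", "2005-08-01"),
        (1, "toxicity", "sediment", "amphipod survival", 87.0, "%", "2005-08-01"),
        (1, "benthos", "sediment", "taxa count", 14.0, "count", "2005-08-01"),
    ]
    conn.executemany(
        "INSERT INTO result (station_id, data_type, matrix, parameter,"
        " value, unit, sample_date) VALUES (?, ?, ?, ?, ?, ?, ?)",
        rows,
    )
    return conn


def count_by_type(conn):
    """Return the number of result rows per data type."""
    cur = conn.execute(
        "SELECT data_type, COUNT(*) FROM result"
        " GROUP BY data_type ORDER BY data_type"
    )
    return dict(cur.fetchall())
```

Because all data types share one results table in this sketch, a tool such as a web query only needs to understand a single structure, which is the kind of simplification the enterprise system is meant to provide.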
Initially, this system will be loaded with as much of SFEI’s contaminant monitoring data as is
practical, given time and cost constraints, but at a minimum all RMP Status and Trends Program data
will be included (water, sediment, bivalve and fish tissue contaminant concentrations, toxicity, and
associated measures). Table 1 contains a list of several datasets that SFEI could load to the enterprise
data system.
The enterprise data system will provide data to, but be independent of, the BDAT replicate
also being installed at SFEI (see Part II, below).
Enterprise data system benefits:
- Simplification of data management
- Simplification of data application development (e.g., Web Query Tool), due to standardized
data structures
- Opportunity to improve SFEI data system with expert outside assistance
Disadvantages:
- Initially more time-consuming and costly to implement
Enterprise Data System/BDAT Replicate/SWAMP-Node - Page 1
The group discussed SFEI’s role in data management aspects of the Surface Water Ambient
Monitoring Program (SWAMP; referred to as “client data” in this document) and the State’s need for
data management of SWAMP-comparable data from other state programs and grants (referred to as
“non-client data” in this document, and further explained in Part III, below).
SFEI confirmed that:
- All SWAMP client data (e.g., data from the Regional Boards) would be uploaded to
- (Pending additional funding) SFEI would participate as a regional “SWAMP-Node” to
receive, validate, and verify SWAMP-comparable (and appropriately formatted) non-client
data from other projects based in Region 2 and possibly other statewide projects
(e.g., State grants, Ag-Waiver).
Next step is to establish SFEI’s enterprise system
1. Data discovery: To assist in the scoping of the enterprise data system, SFEI has inventoried
SFEI datasets and prioritized them for inclusion into the system (Table 1).
2. Database schema: With assistance from SFEI staff, Liz Cook will customize the generic
enterprise database schema to SFEI’s informational needs (she will make the relational database
tables and relationships compatible with SFEI’s current datasets and nomenclature and/or adjust
the datasets to an overall standard of the new system). Liz will also assist with initial loading of
data to the database, and will define the mechanism/s for uploading future data.
Pending decision: The mechanism for uploading data to the enterprise system. Options
include:
1. copying Excel spreadsheets into an Access database and uploading to SQL Server once
data have been approved, or
2. using uploading applications to load Excel spreadsheets to the SQL Server.
3. Data loading: Liz and SFEI staff will upload datasets into the new enterprise data system.
Pending decision: The method for checking data prior to being uploaded to the enterprise
data system. Options include:
1. The SWAMP Client approach contains query and QA tools for checking data. Data pass
through this system and are checked according to inflexible SWAMP business rules.
2. A SQL Server staging area can be used to run loading applications to check the data
before uploading the data to the enterprise data system. This approach accommodates data that
are SWAMP-comparable but do not follow the exact SWAMP business rules.
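The difference between the two checking options can be sketched as a single check routine with a strict and a flexible mode. This is only an illustration under stated assumptions: the field names, the unit lookup list, and the rule itself are hypothetical, and the actual SWAMP business rules are not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical rules for illustration only; these are not the real
# SWAMP business rules or lookup lists.
REQUIRED_FIELDS = ("station_code", "sample_date", "parameter", "value", "unit")
ALLOWED_UNITS = {"ug/L", "mg/kg", "%"}  # assumed lookup list


@dataclass
class CheckResult:
    accepted: bool
    errors: list
    warnings: list


def check_record(record, strict):
    """Check one staged record.

    strict=True  mimics option 1: any rule violation rejects the record.
    strict=False mimics option 2: missing required fields still reject,
                 but a lookup-list mismatch is only reported as a warning.
    """
    errors, warnings = [], []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            errors.append(f"missing {field}")
    unit = record.get("unit")
    if unit not in (None, "") and unit not in ALLOWED_UNITS:
        message = f"unit {unit!r} not in lookup list"
        (errors if strict else warnings).append(message)
    return CheckResult(accepted=not errors, errors=errors, warnings=warnings)
```

A record with an unlisted unit would be rejected in strict mode but accepted with a warning in flexible mode, which is the trade-off the pending decision has to weigh.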
Part II. Install a BDAT replicate at SFEI
The group is moving forward with establishing a BDAT replicate at SFEI. This replicate will be
based on SFEI’s existing installation of SQL Server. The intention is to hold the entire contents of
BDAT, with the initial exception of time series data¹. SFEI’s RMP Status and Trends data will be
included in BDAT/CEDEN by being loaded into this local replicate. Once SFEI is operating as a regional
SWAMP-Node, non-client SWAMP-comparable datasets, mandated by the State to be loaded into a
centralized database, will be uploaded to BDAT/CEDEN by this same process after final verification/validation.
¹ Due to the high volume of existing time series data and the high flow rate of new data, the replication of time series data will
be initially skipped. After successful deployment of the rest of the replicate, the incorporation of the time series can be
The main reason for installing a BDAT replicate at SFEI is to support the creation of data retrieval,
analysis, and display tools tailored to local needs. These tools can be shared among CEDEN participants
to advance data management and use throughout the state.
Next steps to establishing BDAT replicate and developing upon it
1. Install replicate: Karl and his group, with assistance from Todd and Cristina, will create the
BDAT schema in SQL Server and perform an initial data load from the BDAT system. SFEI will
ensure proper storage is prepared (roughly 200 GB; confirm with Liz). Karl’s group and Todd
will write applications for exchanging real-time changes or deltas between replicates.
2. Tool development: SFEI staff will update existing data access tools, such as the RMP’s Web
Query, to use the new enterprise data system and/or BDAT replicate, and develop new
functionality to retrieve, analyze, and display monitoring results.
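The notion of exchanging deltas between replicates (step 1 above) can be illustrated with a simple snapshot comparison. This is a hedged sketch, not the mechanism DWR or SFEI will use: it assumes each table fits in memory as a mapping from primary key to row, works in one direction only, and the function names are hypothetical.

```python
def compute_delta(source, target):
    """Compare two table snapshots (dicts of primary key -> row tuple).

    Returns the changes needed to make `target` match `source`.
    """
    inserts = {k: v for k, v in source.items() if k not in target}
    updates = {
        k: v for k, v in source.items() if k in target and target[k] != v
    }
    deletes = sorted(k for k in target if k not in source)
    return {"insert": inserts, "update": updates, "delete": deletes}


def apply_delta(target, delta):
    """Return a copy of `target` with the delta applied."""
    merged = dict(target)
    merged.update(delta["insert"])
    merged.update(delta["update"])
    for key in delta["delete"]:
        merged.pop(key, None)
    return merged
```

Sending only the inserts, updates, and deletes keeps the exchanged volume small relative to a full reload; a production system would more likely rely on change tracking or the database's own replication features than on full snapshot comparison.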
The group acknowledged that engaging and innovative data displays are a key way to demonstrate
the value of, and garner continued support for, statewide data management initiatives.
Part III. Engage in data management of the State’s ambient monitoring data processing using the
SWAMP-Node at SFEI [currently no funding is available for this task]
Four SWAMP-Nodes are planned statewide to manage SWAMP and the State’s Ambient
Monitoring data that are required to be submitted into a centralized database in SWAMP-comparable
formats (non-client data). These non-client projects currently include about 880 datasets (NPDES data
- State Grants that generated ambient data (e.g., PRISM)
- Ag Waiver data (Region 3)
- Irrigated Lands Program (Region 5)
- NPDES receiving water monitoring data
SWAMP-Nodes will be located at Moss Landing Marine Laboratories (MLML), the Southern California
Coastal Water Research Project (SCCWRP), the Regional Water Quality Control Board (Region 5), and
the San Francisco Estuary Institute (SFEI).
- MLML – will serve all SWAMP-client data (statewide) and provide database
development and management for the SWAMP data management system. To ensure
ongoing comparability of the data and data review processes performed statewide,
MLML will organize user group meetings and provide technical and management
support to the other Nodes.
- SCCWRP – will manage non-client data for Regions 4, 7, 8, and 9
- REGION-5 – will manage non-client data for Region 5 (mostly the Irrigated Lands Program)
- SFEI – will manage non-client data for Regions 1 and 2. [We need to discuss if this is
the appropriate way of divvying up the workload. Might it not also serve to have certain
nodes specialize in certain types of data? For example, SFEI has been instrumental in
getting the bioaccumulation component of SWAMP 2.5 into shape. SCCWRP
apparently already has the CRAM data fields and business rules ready, and Region 5
may want to focus on the bioassessment data. Also, CCAMP has done a lot already and
needs to be involved]
Ambient monitoring data mandated by the State to be in SWAMP-comparable formats and loaded to
the centralized data management system may be sent to these Regional Nodes. Each Node will be
responsible for validating, verifying, and uploading data to the BDAT/CEDEN system. This responsibility will be
limited to reviewing the already formatted, SWAMP-comparable, electronic data deliverable (EDD) and
providing an overall data quality assessment prior to uploading the data to the centralized data
management system.
It is expected that specialized training in how to format project data into the required
SWAMP-comparable EDD formats will be provided to all grantees and projects that are required to
comply with SWAMP SOPs. The regional Nodes may provide some of these support services.
Next steps to establishing SFEI as a SWAMP-Node
1. Establish the enterprise system and BDAT replication system at SFEI: outlined above.
2. Solicit funding to establish resources at SFEI to support management of non-client datasets
and client training. [Need to develop a detailed work plan with hours and estimated costs after
additional discussion of the scope of this project with SWAMP managers.]
3. Develop web interface and data uploading tools at SFEI: collaborate with other SWAMP
participants to adapt existing tools for compatibility with SFEI’s enterprise system.
SFEI’S ROLE IN ONGOING SWAMP DATABASE DEVELOPMENT
AND DEVELOPING WEB-BASED DATA ACCESS AND ASSESSMENT TOOLS
SFEI will continue to have a role in SWAMP’s database development through participation in the
SWAMP DMT monthly meeting to develop, manage, and incorporate a wide variety of ambient
monitoring data types collected in California. Efforts are currently underway to incorporate benthos and
physical habitat data from freshwater, estuarine, wetland, and coastal monitoring projects.
List some of the current projects that need to be incorporated into the database for which SFEI is
participating in database development with the SWAMP DMT and DWR.
California Rapid Assessment Method (CRAM) – physical habitat information (descriptive
data with categorical assessments)
RMP-Benthic Pilot Study – estuarine benthic taxonomy (species identification, abundance,
Spatial data from GIS (projects??), others?? – physical characteristics of monitoring
locations (GIS layers, photos, descriptive information, contacts, etc… ??)
As mentioned in Part II, SFEI will have a BDAT replicate to support the creation of data retrieval,
analysis, and display tools tailored to local needs. An effort will be made to collaborate with
environmental managers and database developers of existing assessment tools at the State, local, and
private industry level.
RMP’s web based data retrieval system – Data Query Tool - http://www.sfei.org/RMP/report
Geospatial Waterbody System (Geo WBS) - http://www.ice.ucdavis.edu/geowbs/
Draft 27Apr2006 sl
Table 1. List of RMP and other data sets to be considered for inclusion in the Status and Trends SWAMP
Database & Enterprise system.
* indicates ready for uploading to the enterprise system (pending agreement by SFEI)
Ambient Monitoring Data from SF Estuary
Dataset | Priority | Ready | Current format / notes
RMP Status and Trends Water, Sediment & Bivalve contaminant monitoring (1993-2005) | high | * | in SWAMP comparable format
RMP Status and Trends Fish Contamination Pilot Study (2003) | high | | in SWAMP Bioaccumulation data
RMP-PS Fish Contamination Pilot Study (1994, 1997, 2000) | high | | in SWAMP Bioaccumulation data
RMP-PS Benthic Infauna Pilot Study (1994-2000) | high | | in a relational database format
RMP-SS California Toxics Rule (CTR) Priority Pollutants Ambient Monitoring | high | * | in SWAMP comparable format
RMP-SS Estuary Interface Pilot Study (1996-1999) | high | * | in SWAMP comparable format
RMP-SS Guadalupe River Study | high | | in RMP format
RMP-PS Mallard Island Study | high | | in RMP format
RMP Status and Trends Aquatic Episodic Toxicity Monitoring Program (1996-2003) | medium | | in a relational database
RMP-SS River Study (1993-94) | medium | * | in SWAMP comparable format
RMP-SS Wetlands (1995-96) Pilot Study | medium | * | in SWAMP comparable format
RMP-SS San Francisco Bay Atmospheric Deposition Pilot Study | medium | | data not managed by SFEI
RMP Status and Trends - Conductivity, Temperature and Depth (CTD) Profiles | med-low | | in Excel files and relational database
RMP-PS Biological Exposure and Effects Pilot Study (EEPS) | unknown | | various data types and data formats
CMR Sediment Contamination in San Leandro Bay | medium | | in a relational database
CMR Delta Splittail Study (2001-2002) | high-medium | |
CMR Coastal Intensive Sites Network (CISNET) San Pablo Bay Study (1999-2001) | medium | |
State Mussel Watch Data (1988-?) | high | | in SWAMP Bioaccumulation data
CMR NOAA/EMAP San Francisco Bay Sediment Data (2000-2001, about 200 sites) | no | | EPA to manage data
Environmental Working Group (EWG) Fish Study (2003) | no | | in a relational database
IEP net Delta Outflow Data (1984-2001, Chipps Island to San Francisco Bay) | no-drop | | already on web through other age
CMR South Bay Trace Organic Contaminants in Effluent (TOES) Study | unknown | | in RMP format
CMR Statewide Sediment Quality Guidelines Development Database (~1990-2002) | unknown | |
Bay Protection and Toxic Cleanup Program Sediment data (1992-1997) | unknown | * | in SWAMP comparable format
RMP Eh Sediment mini-study (2003, 2004) | unknown | | in RMP format