Building a comprehensive environmental geodatabase, the challenges
and the solutions
Ahmed Wagih Abdel-Latif, Ph.D.1
Namir F. Najjar, Ph.D.2
Mostafa AbouGhanem, M.Sc.3
Designing and implementing a geographic database for a single aspect of
environmental protection is a huge undertaking; however, it is dwarfed by the task of designing a
multi-aspect environmental geographic database. In this paper, the authors review the challenges
posed by the task of designing and implementing a comprehensive geographic database for
environmental protection covering Air, Water, Groundwater, Waste, Marine, Radiation and other
environmental data. The authors also review how they overcame these challenges and
reached a design that satisfied the needs of Saudi Aramco’s Environmental Protection groups. The authors
provide a roadmap for designing similar systems with multiple objectives, where objectives might intersect
or diverge.
Keywords: Geodatabase, GIS System Design, CASE Tools
The protection of the environment has become a major component in all human undertakings since it
became clear that any human activity may have serious adverse impacts on life on the planet. In its
quest to satisfy the Kingdom’s obligations as well as its own, Saudi Aramco (SA) has placed great
importance on the protection of the environment in its areas of operations as well as in the entire
Kingdom of Saudi Arabia.
The task of monitoring all aspects of the environment for SA areas of operations, which amount to a
significant percentage of the entire country’s area, requires dealing with huge amounts of data. The vast
majority of these data are either spatial data (maps) or have a strong spatial aspect. The best
possible scenario is to create an enterprise Geographic Information System (GIS) for SA’s Environmental
Protection that would house all available data and make them accessible to engineers and scientists.
1 Dar Al-Handasah Consultants.
2 Environmental Protection Department, Saudi Aramco.
3 eMap Division, Saudi Aramco.
Making all these data available through such a system poses many challenges. The first challenge is
the volume of the data, including historic data: large reports, readings from real-time sensors, and
field data, to name a few, all have to be entered into the system.
In addition to the volume challenge, there is the issue of sources: the data come from different
internal and external sources. This raises compatibility issues because of the various
formats, representations, extents, coordinate systems, scales, accuracies, precisions and other data
parameters. The task is made even more challenging by the very nature of environmental
data, which cover aspects that are not necessarily coherent, ranging from marine, radiation, and
air quality data to routine inspection sheets filled in by site scientists.
Having said that, creating a one-stop shop where all environmental data are
available to query, analyze, map, and report is a huge undertaking that needs to be leveraged by the
users of the newly designed system.
In the remainder of this paper, the authors discuss how the challenges mentioned above were
overcome to produce a harmonious environmental GIS database that fulfilled the duty of supporting the
mission statement of environmental protection as mandated by Saudi Aramco.
Saudi Aramco intends to build a comprehensive environmental GIS to help manage its mandate of
monitoring the environmental aspects of its operations. In order to do that, a comprehensive
methodology for system analysis, design, implementation, and support was devised. The resulting
system became part of the GIS portal of the company. To achieve that, the following steps were taken:
- Conduct a comprehensive assessment of the existing business processes and
essential information flow in and among different internal and external stakeholders;
- Complete an inventory and assessment of the current technical and human
infrastructure at the stakeholders' premises;
- Design and build a comprehensive environmental data model based on best
international and local practices and standards;
- Set the data and metadata standards, procedures, rules, sharing, accessibility, etc.
based on requirements and according to best practices;
- Design and develop a web-based GIS application that would allow for data browsing,
viewing, querying, editing, etc.;
- Implement a detailed staff development, capacity building, and knowledge-transfer
plan for selected personnel who will be tasked with maintaining the system in the future;
- Recommend the best alternatives for system configuration and setup in terms of
software, hardware, networking, etc. that meet the current and future requirements,
based on the available and potential server capacity hosted at SA’s IT.
The design and implementation of an Enterprise GIS requires the development of five components: Data,
Software, Hardware, People, and Methods (ESRI, 1999). The development of each of these components
is handled by one of the tasks shown in Figure 1. This paper discusses only the development of the
geographic database, which is covered by the System Design and Data Conversion tasks.
Figure 1: Components of GIS Implementation plan
The design of the geodatabase went through the following six steps:
1. User Needs Assessment
2. System Analysis and Conceptual Design
3. Logical System Design
4. Physical System Design
5. Data Conversion, QA/QC, and Loading
6. Metadata and Procedure Documentation
In order to paint a complete picture of the current situation, the design team conducted a thorough user
needs assessment study, which included conducting an awareness seminar, designing and distributing
questionnaire forms, and holding repeated face-to-face interviews with the stakeholders.
User Needs Assessment - Data Requirements
While data requirements were not the only information collected during this phase, they were an important
component in building a complete picture of the essential parts of the database to be created. In
addition to the questionnaire survey, the analysis used specific data inventory forms and follow-up
interviews to make sure that all aspects of the data used were captured, including types, sources, content,
currency, modality, frequency, custodianship, ownership and format.
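The data inventory described above can be pictured as one record per dataset, with one field per captured aspect. The following is only a minimal sketch: the class name, field names, and sample values are assumptions made for illustration and are not taken from the project's actual inventory forms.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DataInventoryRecord:
    """One row of a hypothetical data inventory form (all names are illustrative)."""
    name: str
    data_type: Optional[str] = None    # e.g. "point layer", "report"
    source: Optional[str] = None
    content: Optional[str] = None
    currency: Optional[str] = None     # how up to date the data are
    modality: Optional[str] = None
    frequency: Optional[str] = None    # how often the data are updated
    custodian: Optional[str] = None
    owner: Optional[str] = None
    file_format: Optional[str] = None


def missing_fields(record: DataInventoryRecord) -> List[str]:
    """List the aspects still unanswered, e.g. to raise in a follow-up interview."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# Made-up example of a partially completed form.
record = DataInventoryRecord(
    name="Monitoring wells",
    data_type="point layer",
    source="field survey",
    file_format="spreadsheet",
)
print(missing_fields(record))
```

A record with unanswered aspects is one way to decide which stakeholders need a follow-up interview; the real forms may well have been organized differently.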
Other requirements were also collected during the user needs assessment, including Application
Requirements, Training & Support Requirements, and Software & Hardware Requirements. The Aramco IT
team was also involved in this process to provide information on the standards and guidelines used at
Saudi Aramco and to determine the data naming conventions, coordinate system and projection,
coordinate domains, and data security requirements.
System Analysis & Conceptual Design
The process of system analysis started with the categorization of the results of the questionnaire survey
and interviews. These were then compared to industry standard data models extracted from ESRI’s Data
Model repository (ESRI, 1999). Several standard data models were researched: Marine,
Groundwater, Basemap, Atmospheric, Health, Hydro, and Environmental Regulated Facilities.
At the end of the analysis, the design team divided the GIS data that constitute the Environmental
Database into three main groups: Basemap Data, Framework Data, and Group Specific
data, as shown in Figure 2. These components are discussed at length in the next section.
Figure 2: Structure of the Environmental Geodatabase
Logical & Physical System Design
During the System Analysis phase, the design team produced the early conceptual data model, which
contained three components: Basemap data entities, Framework data entities, and Environmental
Group-specific data entities. The first task in designing the database was to examine the industry
standard data models that have been prepared for different aspects of the data model. These models
have reportedly gone through extensive review by the industry and are available for free
download from ESRI’s web site. Figure 3 illustrates the Arc Hydro Groundwater data model, which was used
in part as a template to produce and enhance Saudi Aramco’s Environmental Geodatabase Data Model. The
initial database was placed on the ArcSDE/Oracle development servers and linked to the application so
that each group could test the functionality provided for it. This step is crucial, as it helps
determine whether the data model supports the business functions of the users or requires
further tuning. Following approval of the development phase, the data model will be deployed into
the QA/QC environment and then into the production environment of Aramco IT.
Figure 3: Arc Hydro Groundwater Data Model – as an example of Industry Standard data models examined
The industry standard data models were subjected to gap analysis to decide whether to adopt them in full
or use them in part in the creation of the SA Environmental Data Model. To further the design objectives,
the project design team used CASE tools (MS Visio) to model and represent the geodatabase design. The
model produced in Unified Modeling Language (UML) during the logical design phase represented the entities,
attributes, relationships, and behavior of the different elements of the geodatabase. This model was converted
into physical tables in the enterprise geodatabase, first on the test server and then on the production server,
using Oracle 11g.
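The step from a logical model to physical tables can be illustrated with a small sketch. It is not the project's CASE-tool workflow: the logical model is reduced here to a plain dictionary, the entity and column names are hypothetical, and SQLite (from Python's standard library) stands in for the Oracle server so that the example runs anywhere.

```python
import sqlite3

# Hypothetical logical model: entity name -> {attribute name: SQL type}.
LOGICAL_MODEL = {
    "Facility": {
        "facility_id": "INTEGER PRIMARY KEY",
        "name": "TEXT NOT NULL",
        "longitude": "REAL",
        "latitude": "REAL",
    },
    "WeatherStation": {
        "station_id": "INTEGER PRIMARY KEY",
        "name": "TEXT NOT NULL",
        "longitude": "REAL",
        "latitude": "REAL",
    },
}


def to_ddl(entity: str, attributes: dict) -> str:
    """Turn one logical entity into a CREATE TABLE statement."""
    columns = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in attributes.items())
    return f"CREATE TABLE {entity} (\n  {columns}\n)"


# In-memory SQLite database used as a stand-in for the enterprise database.
conn = sqlite3.connect(":memory:")
for entity, attributes in LOGICAL_MODEL.items():
    conn.execute(to_ddl(entity, attributes))

tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

Running the same generation first against a test database and then against production mirrors the test-then-production order described above.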
The geodatabase model is composed of three main groups of entities, as follows:
1. Basemap Data
These are not actually environmental data, but rather entities that provide context to the
database, such as the roads network, airports, built-up areas, places, etc. This dataset is read directly
from the company’s main digital basemap database, and the task of maintaining it is not linked to the
maintenance of the main environmental geodatabase, which makes the task of maintaining the
environmental data easier.
2. Framework Data
These are data that are used by more than one environmental group. Figure 4 shows a simplified
representation of the geometric entities, which comprise three entities: the Facility
dataset, which holds the location and attributes of Saudi Aramco facilities; the Weather Station entity,
which holds the geometry and attributes of weather stations throughout the Kingdom; and the
Environmental Impact Assessment (EIA) entity, which covers the location
and area coverage of EIA reports and data produced for different projects.
Figure 4: Components of the framework data
The actual model for the framework dataset contains about 20 entities (including
tabular data) and 15 relationships, to meet the demands collected in the user requirements
analysis. This is shown in the logical design represented in Figure 5.
Figure 5: Logical design of the Framework dataset
3. Group Specific data
The third dataset is the group-specific dataset, which contains different entities relating to the different
needs of each group: air quality professionals will be mostly interested in managing air-related
datasets, while the marine group will be mostly interested in marine datasets, and so on. This division
prompted the design team to create different views for common entities, in the same way as was done
with the framework data, although here the separation is less obvious. Figure 6 illustrates the conceptual
geodatabase design for the environmental protection group-specific data within the overall model.
Figure 6: Some of the Environmental Protection groups-specific data entities
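The three-part structure (basemap, framework, group-specific) can be summarized as a small catalog that records which tier each dataset belongs to and which groups use it. This is a sketch under stated assumptions: the catalog layout, the function name, and the choice of sample datasets are illustrative, with entity names borrowed from the datasets described in this paper.

```python
# Hypothetical catalog: dataset name -> (tier, groups that own it).
# Basemap and framework datasets are shared, so their group set is empty.
CATALOG = {
    "Roads": ("basemap", set()),
    "Facility": ("framework", set()),
    "WeatherStation": ("framework", set()),
    "EmissionSource": ("group", {"air_quality"}),
    "Well": ("group", {"groundwater"}),
    "Habitat": ("group", {"marine"}),
}


def datasets_for(group: str) -> list:
    """Return the datasets a group works with: shared tiers plus its own entities."""
    visible = []
    for name, (tier, groups) in CATALOG.items():
        if tier in ("basemap", "framework") or group in groups:
            visible.append(name)
    return sorted(visible)


print(datasets_for("marine"))
print(datasets_for("air_quality"))
```

Every group sees the same basemap and framework datasets, while the group-specific entries differ, which is the separation the design aims for.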
A. Air Quality dataset
This dataset includes mainly emission sources and the tables relating to these emission sources. The
data model contains eight entities, and seven relationships. Figure 7 illustrates the logical design of the
air quality data.
Figure 7: Air Quality dataset entities and relationships
B. Environmental Health Dataset
The next dataset in the database is the environmental health dataset, which documents the
records of different installations and their inspection results for compliance with health regulations. The
dataset includes one geometrical entity, four tabular entities, and four
relationship entities. Figure 8 illustrates the Environmental Health entities within that dataset.
Figure 8: Environmental Health Dataset
C. Groundwater Dataset
The third dataset represents the groundwater entities. The main entity in this dataset is the Well entity.
It has six tabular entities in addition to six relationship entities. Figure 9 illustrates the entities making
up the Groundwater dataset.
Figure 9: Groundwater Data Entities
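The pattern of one geometric entity (the Well) linked to tabular entities through relationships can be sketched with two tables and a foreign key. The table and column names and the sample values below are hypothetical, and SQLite is used only as a convenient stand-in for the enterprise geodatabase, where relationship classes would play this role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Geometric entity, reduced here to a point with longitude and latitude.
conn.execute("""
    CREATE TABLE Well (
        well_id   INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        longitude REAL,
        latitude  REAL
    )""")
# One hypothetical tabular entity related to Well (one well, many readings).
conn.execute("""
    CREATE TABLE WaterLevel (
        reading_id  INTEGER PRIMARY KEY,
        well_id     INTEGER NOT NULL REFERENCES Well(well_id),
        measured_on TEXT,
        depth_m     REAL
    )""")

# Made-up sample rows for illustration only.
conn.execute("INSERT INTO Well VALUES (1, 'W-001', 50.10, 26.30)")
conn.executemany(
    "INSERT INTO WaterLevel (well_id, measured_on, depth_m) VALUES (?, ?, ?)",
    [(1, "2010-01-15", 12.4), (1, "2010-02-15", 12.9)],
)

readings = conn.execute(
    """SELECT w.name, l.measured_on, l.depth_m
       FROM Well AS w JOIN WaterLevel AS l ON l.well_id = w.well_id
       ORDER BY l.measured_on"""
).fetchall()
print(readings)

# A reading that points at a non-existent well is rejected by the relationship.
try:
    conn.execute("INSERT INTO WaterLevel (well_id, depth_m) VALUES (99, 1.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

The rejected insert shows why the relationships are modeled explicitly: tabular records cannot drift away from the geometric entity they describe.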
D. Marine Dataset
The main entity in this dataset is the Habitat entity; the dataset has four entities describing different
aspects of the marine environment. Figure 10 illustrates the marine entities dataset.
Figure 10: Marine Dataset
E. Radiation Dataset
This dataset has eighteen entities (seven geometrical and eleven tabular) and
eleven relationship entities. The geometrical entities represent the different types of
radiation sources, license areas, incident locations, etc., while the tabular entities hold the detailed
information about these geometrical entities. Figure 11 illustrates the radiation entities dataset.
Figure 11: Radiation Dataset
F. Waste Management Dataset
This dataset has six geometrical entities, including the spatial representation of landfills, containment
sites, landfarms, and hazardous containment sites, as well as one tabular entity and one relationship entity.
Figure 12 illustrates the waste management dataset entities.
Figure 12: Waste Management Dataset
G. Wastewater Dataset
The wastewater dataset is composed of four entities: two geometric, one tabular, and one
relationship. Most of the data in this area are related to either sampling points or treatment plants.
Figure 13 illustrates the entities of the wastewater dataset.
Figure 13: Wastewater dataset entities
Using ArcCatalog, the project design team converted the logical system design into its physical
representation in the enterprise geodatabase. Figure 14 illustrates the environmental geodatabase in
ArcCatalog. Separate views that group each team's datasets were provided
in two ways: the first is through the use of different views of the same dataset, while the
second is through the application interface, which limits users' access based on their interest.
Figure 14: Environmental Protection Geodatabase physical design
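The first of the two approaches, different views of the same dataset, can be sketched with ordinary database views. The table, the `env_group` column, and the sample rows are assumptions made for illustration, and SQLite again stands in for the enterprise database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SamplingPoint (
        point_id  INTEGER PRIMARY KEY,
        name      TEXT,
        env_group TEXT   -- hypothetical column naming the group that uses the row
    )""")
conn.executemany(
    "INSERT INTO SamplingPoint (name, env_group) VALUES (?, ?)",
    [("SP-1", "wastewater"), ("SP-2", "marine"), ("SP-3", "wastewater")],
)

# One view per group over the same physical table. View definitions cannot take
# bound parameters, so the names come from this fixed tuple, never from user input.
for group in ("wastewater", "marine"):
    conn.execute(
        f"CREATE VIEW {group}_points AS "
        f"SELECT point_id, name FROM SamplingPoint WHERE env_group = '{group}'"
    )

wastewater_names = [
    row[0] for row in conn.execute("SELECT name FROM wastewater_points ORDER BY name")
]
marine_names = [
    row[0] for row in conn.execute("SELECT name FROM marine_points ORDER BY name")
]
print(wastewater_names, marine_names)
```

Because the views share one table, an edit made through the common dataset is immediately visible to every group that is entitled to see it.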
Data Conversion, QA/QC, and Loading
Data were converted from different forms, including coordinate lists, reports, GIS files, and spreadsheets, as
well as specialized data such as real-time systems, although the latter are read through intermediate
software and fed into the company’s enterprise database before being read into the environmental
geodatabase. The basemap data were read and integrated into the final geodatabase, and into the
The resulting data were then tested using QA/QC procedures, and uploaded into the final geodatabase.
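The paper does not list the individual QA/QC procedures, so the following is only an illustration of the kind of automated check that can run before loading: required attributes must be present and coordinates must fall inside an expected extent. The extent values, field names, and sample records are placeholders, not project values.

```python
# Placeholder extent (min_lon, min_lat, max_lon, max_lat) in decimal degrees.
EXTENT = (34.0, 16.0, 56.0, 33.0)
REQUIRED = ("id", "lon", "lat")


def qa_check(record: dict, extent=EXTENT) -> list:
    """Return a list of problems found in one converted record (empty if clean)."""
    problems = [f"missing {key}" for key in REQUIRED if record.get(key) is None]
    lon, lat = record.get("lon"), record.get("lat")
    if lon is not None and lat is not None:
        min_lon, min_lat, max_lon, max_lat = extent
        if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
            problems.append("outside extent")
    return problems


# Made-up records: one clean, one with swapped coordinates, one without an id.
records = [
    {"id": "A1", "lon": 50.1, "lat": 26.3},
    {"id": "A2", "lon": 26.3, "lat": 50.1},
    {"id": None, "lon": 49.9, "lat": 25.9},
]
accepted = [r for r in records if not qa_check(r)]
rejected = [(r, qa_check(r)) for r in records if qa_check(r)]
print(len(accepted), [problems for _, problems in rejected])
```

Only the accepted records would go on to loading; the rejected ones carry their problem list back to whoever performed the conversion.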
Metadata and Procedure Documentation
The project design team used the ISO metadata stylesheet to represent the geodatabase metadata,
that is, descriptive data about the database elements. Some of the metadata were collected
automatically from the system, such as the spatial characteristics of the dataset, while others were
entered by the team, including source, reliability, scale, currency, frequency of update, operator, terms
of use, ownership, custodianship, etc.
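The split between automatically collected and manually entered metadata can be sketched as follows. The dictionary layout is a simplification chosen for illustration and does not follow the structure of the ISO stylesheet; the function names, field names, and sample values are assumptions.

```python
def spatial_extent(points):
    """Derive the bounding box automatically from (lon, lat) pairs."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return {"west": min(lons), "south": min(lats), "east": max(lons), "north": max(lats)}


def build_metadata(points, entered):
    """Combine system-derived values with the fields typed in by the team."""
    metadata = {"feature_count": len(points), "extent": spatial_extent(points)}
    metadata.update(entered)  # manual fields such as source, scale, currency
    return metadata


# Made-up coordinates and manual entries for illustration.
points = [(50.1, 26.3), (49.6, 25.4), (50.2, 26.7)]
metadata = build_metadata(
    points, {"source": "field survey", "update_frequency": "monthly"}
)
print(metadata)
```

Deriving the spatial part from the data themselves keeps it consistent after every load, while the manual part changes only when the team revises it.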
Once the geodatabase was designed, implemented, populated, and tested, the system was ready to
start showing results using the web-based mapping application. Figure 15 shows the screen displaying
the spatial query widget of the environmental application, which enables end users to formulate complex
queries on top of the map of the site to support their day-to-day reporting.
Figure 15: Querying different aspects of the environmental geodatabase
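A query of the kind such a widget issues typically combines a spatial filter with attribute conditions. The sketch below shows that combination on plain Python dictionaries; it does not reproduce the widget's implementation, and the function name, feature layout, and sample features are assumptions.

```python
def query(features, bbox, **attributes):
    """Return ids of features inside bbox whose attributes match all given values."""
    min_lon, min_lat, max_lon, max_lat = bbox
    hits = []
    for feature in features:
        inside = (min_lon <= feature["lon"] <= max_lon
                  and min_lat <= feature["lat"] <= max_lat)
        matches = all(feature.get(key) == value for key, value in attributes.items())
        if inside and matches:
            hits.append(feature["id"])
    return hits


# Made-up features for illustration.
features = [
    {"id": "W-1", "kind": "well", "lon": 50.10, "lat": 26.30},
    {"id": "W-2", "kind": "well", "lon": 46.70, "lat": 24.70},
    {"id": "S-1", "kind": "sampling_point", "lon": 50.15, "lat": 26.28},
]
wells_in_view = query(features, (50.0, 26.0, 50.5, 26.5), kind="well")
print(wells_in_view)
```

In a production system the same filter would be pushed down to the spatial database rather than evaluated in application code.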
This paper presented the methodology followed by the project design team to design and implement an
environmental geodatabase covering different aspects of the environment inside and outside Saudi
Aramco areas of operations, to support the company in its environmental protection and prevention
efforts. The design and implementation followed a six-step process that culminated in uploading,
checking, and testing the geodatabase, and making sure that it can be served through a web-based
mapping application designed for that purpose. CASE tools were instrumental in modeling and
designing this fairly sophisticated new geodatabase.
“Arc Hydro Groundwater Data Model”, from
http://www.archydrogw.com/ahgw/Arc_Hydro_Groundwater_Data_Model, as consulted on January 10,
ESRI, 1999, “ESRI Data Models Repository”, from http://resources.arcgis.com/content/data-models as
consulted on December 21, 2010.
Tomlinson, Roger, 2007, “Thinking about GIS: geographic information system planning for managers”,
Environmental Systems Research Institute, Inc., Redlands, CA: ESRI Press
Wright, D.J., Blongewicz, M.J., Halpin, P.N. and Breman, J., 2007, “Arc Marine:
GIS for a Blue Planet”, Redlands, CA: ESRI Press
Zeiler, Michael, 1999, “Modeling Our World: The ESRI Guide to Geodatabase Design”, Environmental
Systems Research Institute, Inc., Redlands, CA: ESRI Press