Authoritative Systems Database System Software Design Document

Document Sample
scope of work template
							System and Software Design Document
                 for the

  Authoritative Systems Database (ASDB)

              VERSION 4.0

              August 2004

              Prepared for the
    Joint Strike Fighter Program Office




               Prepared by




                PTI
       Paradigm Technologies, Inc.
       1215 S. Clark St, Suite 1104
          Arlington, VA 22202
         http://www.pti-usa.com/

       For comments and concerns:
         ASDB Project Manager
             Steven R. Hix
            hix@pti-usa.com
                               The ASDB System and Software Design Document



Table of Contents

1    Purpose........................................................................................................................ 4
2    Scope........................................................................................................................... 4
3    References................................................................................................................... 5
4    MISAT Members ........................................................................................................ 6
5    Software Development................................................................................................ 7
6    Interoperability and Reuse .......................................................................................... 8
7    Native Library........................................................................................................... 10
8    Instance Metadata Repository................................................................................... 11
9    Metadata-Based Meta-Models .................................................................................. 13
10      Metadata Mapping Model..................................................................................... 13
11      Extraction, Transform & Load Process for Metadata Mapping Model ................ 16
   11.1 Creating a CSV file from an ERwin ER diagram ............................................. 16
   11.2 Order for Tables to be loaded ........................................................................... 16
   11.3 Perl Scripts to run.............................................................................................. 17
   11.4 How to Run SQL*Loader ................................................................................. 17
   11.5 SQL*Loader Control Files to run ..................................................................... 18
12      Data Call ............................................................................................................... 19
13      Production Request ............................................................................................... 20
14      Points of Contact................................................................................................... 21
15      Starting the Reverse Engineering Process ............................................................ 25
16      Creation of the DBMS Model............................................................................... 25
   16.1 Identify Segments ............................................................................................. 25
   16.2 Record Physical Characteristics........................................................................ 26
   16.3 Add Meaningful Model Names ........................................................................ 27
   16.4 Identify Primary Keys....................................................................................... 28
   16.5 Select the Primary Key ..................................................................................... 28
   16.6 Identify Foreign Keys & Relationships ............................................................ 29
   16.7 Key Migration................................................................................................... 30
   16.8 How to Read a Relationship ............................................................................. 30
   16.9 How to Identify Foreign Keys and Relationships............................................. 31
   16.10     Provide Entity and Attribute Definitions ...................................................... 31
   16.11     Save Your DBMS Model.............................................................................. 32
17      Creation of the Project Information Model........................................................... 32
   17.1 Normalization ................................................................................................... 33
   17.2 Remove Violations of 1NF, 2NF, and 3NF ...................................................... 33
   17.3 Resolve Multiple Attribute Occurrences .......................................................... 33
   17.4 Identify Remaining Foreign Keys and Relationships ....................................... 34
   17.5 Identify Category Relationships ....................................................................... 34
   17.6 Create Associative Entities ............................................................................... 35
   17.7 Eliminate Implementation Characteristics........................................................ 35
   17.8 Eliminate Non-unique Entity and Attribute Names.......................................... 35
   17.9 Refine Business Rules....................................................................................... 35
18      A Complete Project Information Model ............................................................... 36


                                                                  2
                              The ASDB System and Software Design Document


19     Using ERwin......................................................................................................... 37
20     Data Modeling Using ERwin................................................................................ 38
21     Project Information Model Assessment................................................................ 41
22     Database Generation from ERwin ........................................................................ 41
23     Data Marts and Operational Data Stores .............................................................. 42
24     Population Test ..................................................................................................... 44
25     Converting the database into XML with Java....................................................... 47
26     Applying a schema to an XML document ............................................................ 48
27     Load Testing ......................................................................................................... 49
28     Domain Controller ................................................................................................ 53
29     Developmental System Hardware......................................................................... 53
30     Client Tier Components........................................................................................ 54
   30.1 Web Browser: Netscape Navigator or Microsoft Internet Explorer ................. 54
   30.2 Java Runtime Environment............................................................................... 54
   30.3 Java Certificate.................................................................................................. 54
31     DbVisualizer ......................................................................................................... 54
32     CS-RCS................................................................................................................. 55
33     ExamDiff............................................................................................................... 56
34     ERwin ................................................................................................................... 57
35     NASIC Software ................................................................................................... 57




                                                                3
                     The ASDB System and Software Design Document



Introduction

Paradigm Technologies Inc. (PTI) Data Management Division has applied its Secure
Process for Information Integration through Data Reverse Engineering (SPIIDRE)
solution to the creation of the Authoritative Systems Database (ASDB). The ASDB was
conceived as part of the Department of Defense pioneering effort to apply a Simulation
Based Acquisition (SBA) strategy and approach to the development of the Joint Strike
Fighter (JSF). This document, the System Software Design Document (SSDD), provides
a high-level description of the application of PTI’s SPIIDRE solution to the development
ASDB, to include detailing hardware and software specifications applicable to the design,
development and deployment process.

1 Purpose
The SSDD was intended to provide those persons involved in ASDB-related
development, maintenance, support, operations and training activities with sufficient,
detailed information concerning the setup, design, development and deployment
concepts, plans and actions to accomplish their respective tasks without reliance on the
authors.
Of note, the sections covering setup, specifically the internal classified and unclassified
networks, design software, database management system, etc. will be included at the end
of the document. PTI developed the ASDB using two local area networks (LAN), one
classified and one unclassified. The ASDB data models were developed using Computer
Associates AllFusion ERwin Modeling Suite, version 4.1. The ASDB database was
instantiated using Oracle 8i. More details about the hardware and software requirements
are covered later.

2 Scope
The SSDD was written to comply with direction provided in the JSF Modeling and
Simulation Support Plan (MSSP). It covers the following key areas:

•   Metadata
•   Gathering Data
•   Data Modeling
•   Datamart and Operational Datastore Development
•   Data Extraction, Transformation and Load
•   Digital System Model Development
•   Status
•   Access
•   Hardware
•   Software


                                            4
                    The ASDB System and Software Design Document


3 References
In composing this SSDD, the following documents were referenced:
Lockheed Martin. RAS Implementation Using Cocoon 2.x. Fort Worth, TX: May 2002.
Paradigm Technologies, Inc. ASDB System Design Plan. Arlington, VA: 2002.
Paradigm Technologies, Inc. Authoritative Systems Database (ASDB) Statement of
Work, Version 4.0. Arlington, VA: 2001.
Paradigm Technologies, Inc. Secure Process for Information Integration through Data
Reverse Engineering White Paper Arlington, VA: 2004.
United States Department of Defense. DOD Manual 5000.59: Reverse Engineering for
Data Integration and Sharing. Washington D.C.: Department of Defense, 1996
United States Department of Defense. DOD Manual 8320.1-M-1: Department of
Defense Data Integration and Standardization Procedures. Washington D.C.: Department
of Defense, 1998.
United States Department of Defense. Department of Defense Net-Centric Data Strategy.
Washington D.C.: Department of Defense Chief Information Officer, 2003.
Joint Strike Fighter Program Office Modeling & Simulation Support Plan, V5.3.1
Arlington VA. Feb 2001




                                         5
                           The ASDB System and Software Design Document




ASDBOverview

As part of an overall investment in improving data quality and consistency among the
models and simulations used to develop the JSF, the Joint Strike Fighter Program Office
(JSFPO) through their MSSP envisioned an integrated data repository containing
detailed, authoritative characteristic and performance information for RED, BLUE and
GREY weapons, system-of-systems, and sensors. This common repository, the
Authoritative Systems Database, would feed simulations used in both government and
industry1.    Those simulations were
categorized and included in two
collaborative environments, the Strike
Warfare Collaborative Environment
(SWCE)       and      Engineering      and
Manufacturing               Collaborative
Environment (EMCE). In addition to
the ASDB, the JSFPO planned for
several other information repositories,
all covered under the umbrella
organization of the Joint Authoritative
Modeling and Information System
(JAMIS). If completed, the JAMIS
would have contained an Operational                Figure 1: JSF Collaborative Environments
Context Database, a Multi-Spectral
Environment Database, the ASDB and
the Distributed Product Description (DPD). The Operational Context Database would
have housed the data used to describe scenarios, rules of engagement, tactics, doctrine
and order of battle. The Multi-Spectral Environment Database would include
information about the environment, terrain, and civil infrastructure. The DPD, developed
by the weapon system contractor, Lockheed Martin (LM), contains data about the JSF. A
JSFPO-led governing body, the Modeling Information Sources Action Team (MISAT),
of which PTI is a charter member, administers the JAMIS.

4 MISAT Members
1. Action Team Leader [Maj Joel Schubbe, (Joel.Schubbe@jsf.mil) JSFPO, USAF]

2. JSFPO Technical Lead [Jim Hollenbach, (JimH@simstrat.com) Simulation
   Strategies Inc]

3. Lockheed Martin Technical Lead [Dr. Henson Graves, (Henson.Graves@lmco.com)
   LM DPD Architect]


1Department of Defense Joint Strike Fighter Program Office. Modeling and Simulation Support Plan, version 5.2.
Washington, DC: 2000.



                                                          6
                     The ASDB System and Software Design Document


4. LM DPD/RAS Project Manager [Russ Campbell, (Russell.C.Campbell@lmco.com)
   LM]

5. ASDB Project Manager [Steve Hix, (Hix@pti-usa.com) Paradigm Technologies Inc]

6. JSF weapons representative [Curtis Erickson, (Curtis.Erickson@eglin.af.mil)
   AAC/ENMI, Orion Technologies Inc]

7. Natural/civil environment representative [Frank Zawada,
   (Frank.Zawada@jhuapl.edu) JHU APL]

8. Natural/civil environment representative [Floyd Adagio,
   (FAdagio@orionnetwork.net) Orion Networking]

9. JSF Threat Working Group representative [Dean Hinson, (Dean.Hinson@jsf.mil)
   JSF PO]

10. Operational context representative [TBD]

Other representatives as approved by the SBA Implementation Council.

5 Software Development
In the development of the ASDB, PTI adhered to
industry standards for sound software engineering
and database development.           PTI satisfied
requirements for level 2 of the Capability Maturity
Model Integrated (CMMI) and some requirements for
level 3.
The steps outlined in the following chapters have
been encapsulated into a small checklist. This list
amounts to a summary of the ASDB software spiral
                                                                 Figure 2: Standard Software
development process. Through this process, PTI is
                                                                        Development
able to rapidly begin development of prototypes and
refine their functionality and expand their capabilities
through a robust test and evaluation process.


                    1    Physical Data Model Creation
                    2    Project Information Model Creation
                    3    Data Mart/Operational Data Store Creation
                    4    Data Mart/Operational Data Store Population
                    5    Data Check
                    6    Project Information Model Rework
                    7    Data Mart/Operational Data Store Repopulation
                    8    XML Development
                    9    Data Set Visual Comparison Tests using ExamDiff
                    10   Data Set Load Test



                                              7
                     The ASDB System and Software Design Document


                    11   Project Information Model Rework
                    12   Data Mart/Operational Data Store Repopulation
                    13   XML Rework
                    14   Data Set Visual Comparison Tests using ExamDiff
                    15   Data Set Load Test
                              Figure 2: ASDB Development Process

6 Interoperability and Reuse
In an effort to foster interoperability and reuse of the ASDB within the DoD modeling
and simulation community, PTI has forged relationships with the Defense Modeling and
Simulation Office (DMSO) and the Defense Technical Information Center (DTIC).
Additionally, PTI has worked on the ASDB in cooperation with Johns Hopkins
University Applied Physics Laboratory (JHU/APL), the Applied Research Laboratories
of the University of Texas at Austin (ARL:UT), Air Force Studies and Analysis Agency
(AFSAA), Air Force Aeronautical System Center (ASC), Lockheed Martin Aeronautics
Co and a number of other contributors.
                                              PTI’s association with the DMSO began
                                              early in the ASDB development process
                                              and continues today. One of the aspects
                                              to this relationship has been PTI’s ability
                                              to adapt a key resource conceived within
                                              the Knowledge Integration Branch at
                                              DMSO for use in data integration efforts
                                              in the modeling and simulation
                                              community, Reverse Engineering for
                                              Data Integration and Sharing (REDIS).
                                              PTI has built on the model-driven
                                              approach promulgated by REDIS and its
         Figure 3: Stovepipes for Information developers, ARL:UT, and integrated this
                                              metadata discovery methodology in the
                                              SPIIDRE solution. This model-driven
approach is consistent with current US Government and more specifically, DoD data
standardization policy and directives.
Using REDIS as a foundation,
SPIIDRE’s           information
integration       infrastructure
allows us to fully leverage
connections             between
information from different
applications and databases.
Through this infrastructure, we
can aggregate data and
coordinate transactions across
back-end data sources. To the
user, all of this back-end
information appears as if it
                                                  Figure 4: The ASDB as the Data Provider

                                              8
                     The ASDB System and Software Design Document


came from one comprehensive, global database. Simulation users can come to the ASDB
and gather simulation-specific information, attracting analysts looking for a one-stop
shop for information. The information can be presented in simulation-specific format, or
alternatively, in its native format.
Prior to the ASDB, each analyst supporting one of the tools in the SWCE or EMCE had
to rely on data provided with the simulation, without a clear pedigree of metadata
indicating from where that data was derived and when. Those analysts were forced to
deal with a crisscross of various back-end sources.
With the ASDB concept, all of those analysts would have the opportunity to come to one
spot to gather their data. The connections to individual back-end sources can be
maintained in a more automated way and users would have been provided with detailed
metadata about the information used for their studies. With the ASDB in place, all of the
data needed to support JSF development-related analysis could have been contained in a
common location, easily maintained and configuration managed, with one central data
administration function. As an additional benefit to having all of this high-interest
information available via a common interface, users outside of the JSF enterprise could
gain access, if approved, to this information without having to go through the
intermediaries that typically stand between the individual analyst and the information he
or she needs. The benefit of this technology increases exponentially with successive
integration of applications and data sources.
In early 2004, a new DoD data management strategy is emerging. The DoD Chief
Information Officer has outlined a network centric approach to information
interoperability and data management. SPIIDRE is already there. In applying SPIIDRE
to the ASDB, PTI and the JSFPO are already using the DoD information integration and
data management approach.




                                           9
                     The ASDB System and Software Design Document




Metadata

Metadata means “data about data.” What kind of information is in the system? How is it
named? How is it formatted? How is it indexed? How is it organized? Where did it
come from? In what context is the use of this data appropriate? When an enterprise
establishes this information context, its users reap many benefits:
•   Information needed to understand, access, and use an organization’s information
    resources.
•   Enterprise-wide standardization and maintenance of the organization’s business
    definitions and business rules for data.
•   Dynamic information dictionary, a common definition of data structures and the data
    content represented by component data elements.
•   Enterprise wide decision support. Metadata provides the necessary data integration
    for effective cross-business unit analyses.
•   Data lifecycle management. Metadata captures the history of data over time.
•   Information exploration and extrapolation.
•   Information exchange among applications, both vertically and horizontally, within an
    organization’s information architecture.
•   An information baseline to help guide the data extract/transform/load (ETL)
    processes.
•   Allows the reusability of data without the risk of reinterpretation.

At the core of the SPIIDRE solution is metadata management. There are two essential
metadata aspects to consider: metadata management about the data and the development
of metadata models for structure integration. PTI’s SPIIDRE, as applied to the ASDB,
provides a solution to both problems. Metadata management involves storage and access
to metadata information in a metadata repository as well as a means to define and
manipulate metadata models through a graphical interface.
Think of metadata as a directory – a database about the database. This repository
contains information about the data stored within the databases that compose the entire
enterprise infrastructure. This level of metadata contains the same information that is
stored in data dictionaries, as well as information that is shared across databases. PTI’s
metadata management is composed of four pieces: a native library, an instance-value
level metadata repository, metadata meta-models, and a mapping metadata repository.

7 Native Library
The Native Library is just what the name suggests. This repository contains exact
replicas of the source information in its native format. In this regard, PTI’s application of
SPIIDRE for the JSFPO differs from the typical deployment in the private sector. In the
business world, the SPIIDRE solution would leave the data sources as is. The solution
provider would have the luxury of free access to those pieces of source information and
would not have to rely on the intelligence community or system program offices to
supply information to support their data consumers. For the ASDB, PTI gathers data


                                             10
                     The ASDB System and Software Design Document


from a variety of disparate organizations and replicate that information in a series of data
marts. The information contained in those data marts come in a variety of formats
ranging from on-line (SIPRNet) databases, simulation-specific data files, intelligence
production center reports, and ASCII text files. In each case, as this information arrives
at the PTI Data Management Division, it is stored in its native format within the Native
Library
The Native Library was established on the classified LAN (\\Asdbdb\ASDB\Library\
Native Library). It is a directory tree structure with individual branches to represent data
systems, either sources or simulations. Within each individual branch, versions of the
specific data system are stored separately – even when the vast majority of the
information could overlap. At the next level, individual folders are used to hold
information on Native Data, Source Code, Documentation, Models, and Tools.
Once this original information is cataloged in the Native Library, analysts make copies
for work on their individual tasks. At no time is the original altered in any way. As the
data is later used to populate databases and logged in the instance metadata repository, its
location in the Native Library is used as a reference.
The data structure of the Native Library is intuitive. Information is catalogued based on
simulation or data source. It is further separated by versions.

8 Instance Metadata Repository
As information arrives at PTI, it is copied from its native format and processing begins.
The first step in that processing is the harvesting of metadata about the source and data
provided. This is the first interface between incoming information and the metadata
repository, but not the last. During subsequent stages of processing, the data about the
data is entered in the metadata repository for analysis purposes. The fields populated in
the metadata repository were developed in cooperation with the MISAT. The group
developed a metadata model – an entity relationship diagram describing the entities and
attributes required to best represent the data and how they relate with other entities and
attributes (more on entity relationship diagrams in Chapter 3). After the incoming source
information is cataloged in the metadata repository, it can be linked, per instance value,
to the various data marts. Analysts can query the metadata fields to find information best
suited for their application and users outside of the JSF enterprise can search for data of
interest. Capturing metadata in this manner and linking it with individual instance values
in the data marts allows for rapid updates of information without reloading the entire data
mart. Additionally, this methodology provides flexibility in the ability to create a
composite system within the operational data stores from a variety of backend data
sources. For example: an analyst wants an SU-35 FLANKER representation for
BRAWLER. Most of the information contained in that dataset could be provided by
NASIC. Let’s say that this analyst has alternative information from a collateral United
Kingdom report. The analyst could provide that source information for inclusion in the
ASDB. The metadata would be captured and linked to the instance values in the Brawler
Operational Data Store and subsequently represented in its own text file when the new
DSM is presented. Now, when the new DSM is generated, metadata about all of the data
is available, including the UK-specific instance values. Additionally, other Brawler users
in the community would be provided notification about and access to this same source


                                            11
                     The ASDB System and Software Design Document


information for potential use in their studies. In the case of competing information
sources, where two or more data source are available to meet a single or multiple needs,
this metadata strategy allows individual data values to be tagged for use in comparison
studies.
The ASDB’s instance value metadata database is based on the JAMIS-approved Joint
Strike Fighter Metamodel – 12 January 04. The development of this model was an
evolutionary process spanning several years. The metadata model is stored on the
unclassified CS-RCS in the Metadata Project as “JAMIS metadata.” This model was
instantiated on the classified network and can be accessed using DbVisualizer.
Instance value metadata is entered manually using DbVisualizer, a tool that is covered in
more detail in Chapter 11. PTI analysts capture data about the information resource,
point of contact and process. Using the inline editor option, data can be entered into the
individual cells that need to be modified. Any errors are reported in the error log and
may require the involvement of the database administrator to resolve.




                               Figure 5: DbVisualizer Inline Editor

Because the metadata information is being filled directly into the database, it’s important
that the data be entered in the parent entities before the child entities or associative
entities. This is not an intuitive endeavor. An understanding of the underlying data
model is necessary to properly complete this task. The following is one way to fill the
metadata tables without violating this referential integrity:
INFO_RESOURCE
INFO_RESOURCE_CLASS
POC
PERSON
ORGANIZATION
POSITION
PERSON_ASSOC
PROCESS
PROCESS_MACHINE
TOOL
INFO_RESOURCE_POC_ROLE



                                               12
                      The ASDB System and Software Design Document


INFO_RESOURCE_CLASS_POC_ROLE
PROCESS_EXEC
PROCESS_EXEC_POC
PROCESS_EXEC_TOOL
INFO_RESOURCE_SOURCE
PROCESS_EXEC_INPUT
PROCESS_INPUT
INFO_RESOURCE_CLASS_SOURCE

9 Metadata-Based Meta-Models
Once the incoming information has been cataloged, the analysts at PTI dissect the data
format to construct a series of metadata-based entity relationship diagrams. These
models are developed to precisely represent each physical source of information and are
the centerpiece for future integration efforts. This topic will be discussed in detail in
Chapter 3, but suffice it to say this is an absolutely necessary, but labor-intensive task.
Once complete, we have an accurate graphical representation of the data fields for both
the incoming source information and the data requirements necessary to meet the
consumer’s needs. These individual data models are instantiated as databases and
populated with legacy data to test the integrity of the structure. After testing, the data can
be replaced, gradually or in bulk, with more current, authoritative information as it
becomes available. These new databases fall into two categories. The models and
subsequent databases generated to store the back-end source data are data marts. These
data repositories are logical representations of the original data format and they use
replication or extract, transform and load (ETL) tools for updates and queries. The
models and databases developed to store simulation-specific information are operational
data stores. There are differences between the two, which will be discussed in more
detail later. The next step is the combination of consumer and source data to detect
relationships.

10 Metadata Mapping Model
The Metadata Mapping Model and information repository provides the ability to connect
information within a data source or application with information in other data sources and
applications. This allows for cross-data-source development of application-specific data
queries for use in generating new DSMs. PTI Data Division analysts have worked with
simulation developers, intelligence producers, and model management to identify key
relationships between source information and simulation-specific data requirements.
This information is documented and stored in the classified CS-RCS repository
(SAI/Design/ASDB V2.0 Draft Report.xls).
In order to facilitate this conversion process, PTI has developed, inherited and modified a
number of software pieces from a variety of data suppliers. This conversion software
allows PTI to convert incoming data from non-relational to relational format and from
relational back to the target native format.



                                             13
                    The ASDB System and Software Design Document


The National Air and Space Intelligence Center provided five conversion programs for
use in developing Brawler datasets. These programs are classified and are stored on the
SGI in the following directory:
brawler_stuff/JSF.PARADIGM.CONVERTERS.22May02.
Information about their use and context is documented in the classified library:
\\Asdbdb\ASDB\Library\Native Library\Brawler Native\Brawler 6.4 Native\Brawler 6.4
Native Documentation.




                                          14
                                                                                  The ASDB System and Software Design Document




                                                           ASDB Metadata Mapping
                                                                                                                            LEGACY MODEL                               LEGACY SUBJECT AREA
   ASDB SUBJECT AREA
                                                                                            EXCLUSION CODE                   Legacy Model                               Legacy Subject Area
    Asdb Subject Area                                                                                                        Legacy Model Version                       Legacy Subject Area Version
    Asdb Subject Area Version                                                                Exclusion Code
                                                                                                                                                                        Legacy Model (FK)
                                                                                             Exclusion Description          Legacy Model Description                    Legacy Model Version (FK)
    Asdb Subject Area Description                                                                                           Legacy Model Notes
                                                                                                                            Legacy Model Dbms                           Legacy Subject Area Description
                                              ASDB SUBJECT AREA ENTITY MAP
                                               Asdb Entity Name (FK)                        STATUS CODE
                                               Asdb Subject Area (FK)                        Status Code
                                               Asdb Subject Area Version (FK)
                                                                                             Source Code Description                                                   LEGACY SUBJECT AREA ENTITY MAP
                                                                                                                                  LEGACY ENTITY                         Legacy Subject Area (FK)
                                                                                                                                   Legacy Entity                        Legacy Subject Area Version (FK)
ASDB ETL PROCESS SOURCE MAP                                                                                                                                             Legacy Model (FK)
 Asdb Etl Process (FK)                                                                                                             Legacy Entity Definition             Legacy Model Version (FK)
 Asdb Attribute (FK)                                                                                                               Legacy Entity Notes                  Legacy Entity (FK)
 Asdb Entity Name (FK)
                                                         ASDB ENTITY
                                                          Asdb Entity Name
                                                         Asdb Entity Definition
                                                         Asdb Entity Notes


                                                                                                                                                                                             LEGACY ETL PROCESS SOURCE MAP
                                                                                                                                                                                              Asdb Etl Process (FK)
                                ASDB ATTRIBUTE
                                                                                                                                                                                              Legacy Attribute (FK)
                                 Asdb Attribute                                        LEGACY TO ASDB ATTRIBUTE MAP                         LEGACY ATTRIBUTE                                  Legacy Entity (FK)
                                 Asdb Entity Name (FK)                                  Legacy Attribute (FK)                                Legacy Attribute
                                Asdb Attribute Base Unit                                Asdb Attribute (FK)                                  Legacy Entity (FK)
                                Asdb Attribute Definition                               Asdb Entity Name (FK)
                                                                                        Legacy Entity (FK)                                   Legacy Attribute Native Name
                                Asdb Attribute Notes                                                                                         Legacy Attribute Base Unit
                                Asdb Attribute Required Flag                           Status Code (FK)                                      Legacy Attribute Definition
                                Asdb Attribute Datatype Description                    Transform Type                                        Legacy Attribute Notes
                                Exclusion Code (FK)                                    Transform Name                                        Legacy Attribute Required Flag
                                Asdb Domain (FK)                                                                                             Legacy Attribute Datatype Description
                                                                                                                                             Exclusion Code (FK)
                                                                                                                                             Legacy Domain (FK)




                                    ASDB DOMAIN
                                     Asdb Domain                                       LEGACY TO ASDB DOMAIN MAP                                 LEGACY DOMAIN
                                     Asdb Domain Description                            Asdb Domain (FK)                                          Legacy Domain
                                     Exclusion Code (FK)                                Legacy Domain (FK)
                                                                                                                                                    Legacy Attribute Domain Description
                                                                                        Status Code (FK)                                            Exclusion Code (FK)
                                                                                        Domain Transform Type
                                                                                        Domain Transform Name
                                     ASDB DOMAIN VALUE
                                      Asdb Domain (FK)                                                                                                 LEGACY DOMAIN VALUE
                                                                                                                                                        Legacy Domain (FK)




                                                                                        ETL PROCESS
                                                                                         Asdb Etl Process
                                                                                        Asdb Etl Process Description
                                                                                        Asdb Etl Process Owner




    Target                                                                                                              Legacy
                                                                                          Figure 6: ASDB Metadata Mapping Model




                                                                                                                       15
                     The ASDB System and Software Design Document




11 Extraction, Transform & Load Process for Metadata Mapping Model
The ETL process involves taking the appropriate variables from the ER diagram of data
mart (table names, field names, data types, subject areas, etc.), and placing them in a
comma delimited (CSV) file. The data from the CSV files is transformed via a Perl script
into the proper format for loading into the database. The output is another CSV file that
is ready to be used with a SQL*Loader control file, which will load the information into
the appropriate table of the database. Note that due to table constraints, it is necessary to
load these tables in a particular order (detailed below).
The data for the correlation tables (EXCLUSION_CODE, STATUS_CODE,
LEGACY_TO_ASDB_ATT_MAP, LEGACY_TO_ASDB_DOMAIN_MAP, and ETL_
PROCESS) is retrieved from the Metadata Mapping Excel File, and placed into a CSV
file as well.

11.1 Creating a CSV file from an ERwin ER diagram
1) On the Taskbar, go to Tools -> Report Builder.
2) Click New
3) Select a section of either Entity (under Logical) or Table (under Physical)
4) Select the variables that you wish to display. Logical variables will give you logical
    ERwin definitions and values. Physical variables will give you the actual database
    definitions and values.
5) Change Export As to Text
6) On the Toolbar, Click Run
7) The result will be displayed in Microsoft Excel in a CSV file.
8) Remove the header row.
9) Name and save the file.


11.2 Order for Tables to be loaded
The following list is the order in which the Metadata Mapping Model must be populated.
The tables marked with an asterisk are optional and may not be used. If they are used,
however, they must be populated in this order.

ASDB_2_0_META
ASDB_LEGACY_LOGICAL
ASDB_TO_LEGACY
EXCLUSION_CODE
* ETL_PROCESS
LEGACY_MDEL
LEGACY_SBJECT_AREA
LEGACY_ENT
LEGACY_SBJECT_AREA_ENT_MAP
* LEGACY_DOMAIN
* LEGACY_DOMAIN_VAL
LEGACY_ATT


                                             16
                     The ASDB System and Software Design Document


* LEGACY_ETL_PROCESS_SRC_MAP
ADB_SBJECT_AREA
ADB_ENT
ADB_SBJECT_AREA_ENT_MAP
* ADB_DOMAIN
* ADB_DOMAIN_VAL
ADB_ATT
* ADB_ETL_PROCESS_SRC_MAP
STATUS_CODE
LEGACY_TOO_ADB_ATT_MAP
LEGACY_TOO_ADB_DOMAIN_MAP


11.3 Perl Scripts to run
These scripts are stored on the Classified RCS network under /Metadata
Controls/Conversion

•   parse_asdb_temp.pl – Transforms exclusion code, status code, and transformation
    code into a temporary table to be used when correlation of ASDB and Legacy is
    performed.
•   parse_asdb_to_legacy.pl – Transforms basic Entity and Attribute data for each legacy
    datamart. Needed to facilitate the correlation of ASDB to Legacy.
•   parse_exclusion_code.pl – Transforms and Loads the Exclusion Code data
•   parse_legacy_mdel.pl – Transforms and Loads the Legacy Model data.
•   parse_legacy_sbject_area.pl – Transforms the Legacy subject area data.
•   parse_legacy_ent.pl – Transforms the Legacy entity data.
•   parse_legacy_sbject_area_map.pl – Transforms the Legacy subject area entity map
    data.
•   parse_legacy_domain.pl – Transforms the Legacy domain data.
•   parse_legacy_att.pl – Transforms the Legacy attribute data.
•   parse_sbject_area.pl – Transforms the ASDB subject area data.
•   parse_asdb_ent.pl – Transforms the ASDB Entity data.
•   parse_sbject_ent_map.pl – Transforms the ASDB subject area entity map data.
•   parse_asdb_domain.pl – Transforms the ASDB domain data.
•   parse_asdb_att.pl – Transforms the ASDB attribute data.
•   parse_status_code.pl – Transforms and loads the Status Code data.
•   parse_legacy_too_adb_att_map.pl - Transforms the
    Legacy_to_ASDB_Attribute_Map data.


11.4 How to Run SQL*Loader
1) Go to Start -> Run, and type “cmd”.
2) Change directory to the /Metadata Controls/load
3) Usage is as follows:




                                           17
                     The ASDB System and Software Design Document


        Sqlldr USERID=username/password@ASDB CONTROL=ctl_filename
        LOG=path/log_filename BAD=path/bad_filename DATA=path/data_filename
4) The log file will show you any error messages that occur during the load. The bad
   file will show you the actual rows that were rejected by SQL*Loader. If you wish to
   repeat the load, you may correct these rows and load the BAD file as your new
   DATA file.


11.5 SQL*Loader Control Files to run
These control files are stored on the Classified RCS network under /Metadata Controls/
load.

•   asdb_2_0_meta.ctl – Loads the logical and physical representation of the data into
    one table for linking purposes
•   asdb_legacy_logical.ctl – Loads the logical representation of the data
•   asdb_to_legacy.ctl – Loads information transformed from parse_asdb_to_legacy.pl
    into database
•   asdb_temp.ctl – Loads information transformed from parse_asdb_temp.pl into
    database
•   etl_process.ctl – Loads information into the ETL_PROCESS table
•   legacy_sbject_area.ctl – Loads information into the LEGACY_SBJECT_AREA table
•   legacy_ent.ctl – Loads information into the LEGACY_ENT table
•   legacy_sbject_area_ent_map.ctl – Loads information into the
    LEGACY_SBJECT_AREA_ENT_MAP table
•   legacy_domain.ctl – Loads information into the LEGACY_DOMAIN table.
•   legacy_att.ctl – Loads information into the LEGACY_ATT table
•   legacy_etl_process_src_map.ctl – Loads information into the
    LEGACY_ETL_PROCESS_SRC_MAP table
•   asdb_sbject_area.ctl – Loads information into the ASDB_SBJECT_AREA table
•   asdb_ent.ctl – Loads information into the ASDB_ENT table
•   asdb_sbject_area_ent_map.ctl – Loads information into the
    ASDB_SBJECT_AREA_ENT_MAP table
•   asdb_domain.ctl – Loads information into the ASDB_DOMAIN table
•   asdb_att.ctl – Loads information in the ASDB_ATT table
•   legacy_too_adb_att_map.ctl - Loads information into the
    LEGACY_TOO_ADB_ATT_MAP table




                                           18
                    The ASDB System and Software Design Document




Gathering Data

To assist all elements of the JAMIS, the MISAT developed a Data Call mechanism
though which the JSFPO communicated with the M&S user community. The Data Call
was used to gather information about the simulations, their data structures, and some
metadata about sources and personnel. It was designed to benchmark the existing data
being used and to start a conversation with the appropriate Subject Matter Experts.

12 Data Call

   1. Name and version of SWCE tool:

   2. Model Manager Info

          Name:
          Position/Title, Organization:
          Telephone:
          Email Address:
          Website:
          Postal Address:

   3. What database management system (DBMS) does this tool use?

   4. Who will serve as the Model manager’s subject matter expert (SME) for
      data/information requirement matters pertaining to this tool? This person will
      serve as a member of the Modeling Information Sources Integrated Process Team
      (MISIPT).

          Name:
          Position/Title, Organization:
          Telephone:
          Email Address:
          Postal Address:

   5. What types of information about the Joint Strike Fighter (e.g., characteristics,
      attributes) and its weapons are required to support JSF representation in this tool?
      This question is asked to scope DPD and JWepsDB information requirements.
      Please include in this category all compound information that describes the JSF’s
      relationship/performance relative to other man-made systems or the natural
      environment (e.g., JSF Pk/Ps/exchange ratios versus threats; the impact of rain on
      JSF sensor performance).

          Information about the JSF and its weapons contained in input data sets for
          initialization/configuration of the tool.


                                           19
                    The ASDB System and Software Design Document



   6. What sources do you now use to obtain the above information?

   7. What information about other man-made systems or cultural features (e.g.,
      friendly weapons systems, threats, transportation networks, and structures) is
      required to support their representation in this tool?

          Information contained in input data sets for initialization/configuration of the
          tool.

          Information reflected in the software code for this tool (i.e., the hard-coded
          characteristics of other man-made systems or cultural features)

   8. What sources do you now use to obtain the above information?

   9. What information about the natural environment (i.e., terrain, oceans, atmosphere,
      and space) is required to support their representation in this tool?

          Natural environment information contained in input data sets for
          initialization/configuration of the tool.

          Natural environment information reflected in the software code for this tool
          (i.e., the hard-coded characteristics of the natural environment)?

   10. What sources do you now use to obtain the above information?

   11. What changes to the above responses, if any do you anticipate as a result of
       forthcoming modifications/improvements to this tool?

   12. Is there any other information or remarks you would like to provide to, or ask
       from, the DPD, JWepsDB, ASDB and/or MSEDB development teams?

In addition to gathering feedback via the Data Call, PTI contacted the Intelligence
Production Centers, Model Managers and simulation distributors (the Survivability/
Vulnerability Information Analysis Center (SURVIAC) for example) to obtain the
simulations, documentation and classified datasets.

Depending on the level of cooperation, it has taken anywhere from days to many months
to obtain materials from various sources and in some cases all requested and required
information never appeared at all.

13 Production Request
Data for Brawler and TRAP have required Production Requests (PR) through the JSFPO
while data for other tools has been gathered in cooperation the respective authoritative
source. For Brawler data, the National Air and Space Intelligence Center (NASIC)
produced a series of PR templates. Those templates are classified and stored on PTI’s


                                           20
                     The ASDB System and Software Design Document


classified LAN at \\Asdbdb\ASDB\Library\Native Library\Brawler Native\Brawler 6.5
Native\Brawler 6.5 Native Documentation\Brawler PR 10-2003. The PR contains all of
the information necessary to submit the PR. The fields are intuitive and can be updated
to reflect the most current requirements.

As a new release date nears, PTI analysts update and submit the PR to the JSFPO
intelligence community liaison, currently Dean Hinson. Mr. Hinson provides the
appropriate intelligence production center with the PR and tracks the results. The
intelligence production center may provide the data to the JSFPO directly or to PTI via
the SIPRNet.

14 Points of Contact
This is a listing of the points of contact for each of the simulations and data sources as of
August 2004. This list has been and will likely continue to be very dynamic.




                                             21
                                     The ASDB System and Software Design Document




Simulation   Name              Organization     E-mail                              Role            Telephone
Bluemax      Stephen Ames      ASC              Stephen.Ames@wpafb.af.mil           Model Manager   937-255-1276
Bluemax      Michael           BAH              Bennett_Michael@bah.com             Developer       937-255-4840
             Bennett
Brawler      Maj James         AFSAA            James.Mundy@pentagon.af.mil         Model Manager   703-588-8699
             Mundy
Brawler      Larry Taranto     ASC              Larry.Taranto@wpafb.af.mil          JSFPO Model     937-904-7064
                                                                                    Liaison
Brawler      Jose Pena         NASIC            Jose.Ayala-Pena@wpafb.af.mil        Source          937-257-2404
Brawler      Scott             NASIC            SJF44@naic.wpafb.af.mil             Source          937-257-2404
             Fullenkamp
Brawler      Matthew           NASIC            Matthew.Veness@wpafb.af.mil         Source          937-257-2404
             Veness
Brawler      Paul Freisthler   NASIC            Paul.Freisthler@wpafb.af.mil        Source          937-257-2404
Brawler      Al Gordon         SRS              Al.Gordon@fcva.asi.srs.com          Developer       703-358-4743
Brawler      Dan Crogan        SRS              Dan.Crogan@fcva.asi.srs.com         Developer       703-358-4743
ESAMS        James             ASC              James.Begovich@wpafb.af.mil         Model Manager   937-255-4174
             Begovich
ESAMS        Mike Miles        BAH              Miles_Mike@bah.com                  Developer       937-255-4358
ESAMS        Greg Born         SURVICE          GregB@survice.com                   Developer       760-446-2497
JIMM         1 Lt G.           ESC              Gregory.Tyler@Hanscom.AF.mil        Program         781-377-6052
             Marshall Tyler                                                         Manager
JIMM/GCS     Vern Panini       NAVAIR           vernon.panei@navy.mil               Database        301-342-6150
                                                                                    Manager
MIL-         Randy Levine      ASC              Randyll.Levine@wpafb.af.mil         Model Manager   937-255-0672
AASPEM II
MOSAIC       James             ASC              James.Begovich@wpafb.af.mil         Model Liaison   937-255-4174
             Begovich
MOSAIC       Jerry Dague       SAIC                                                 Developer       937-258-8215
Radguns      Ken Holet         NGIC             Kenneth.M.Holet@us.army.mil         Model Manager   434-980-7559
                                                                                    – Source
Shazam       Capt Stephen                                                           Model Manager
             Rosencrantz       ASC              Stephen.Rosencrantz@wpafb.af.mil    – Source        937-255-4358
SIMS         Sandra Brown      96CG/SCTOS       Sandra.Brown@eglin.af.mil           Source          850-882-2693
                               (FMIC)
Thunder
TRAP         Joe Herrmann      NASIC            JWH57@naic.wpafb.af.mil             Developer –     937-257-2653
                                                                                    Source
                                                 Figure 7 Points of Contact



                                                            22
                     The ASDB System and Software Design Document




Data Modeling

In any data integration solution, the goal is to be able to combine back-end data sources,
which may include a hodgepodge of legacy systems, multiple databases from multiple
suppliers, data housed on disparate operating platform environments, heterogeneous and
geographically dispersed database management and files systems with a customer’s
requirements. PTI’s Data Management Division recognizes that this is no small task and
often times, the initial scope of a project, such as the ASDB, expands to a degree larger
than anticipated. In order to complete a project of the ASDB’s magnitude, PTI had to
employ a solution flexible enough to handle these data connectivity problems, rapidly
changing technology, and transactions that needed to occur on a per-application, per
database basis. The SPIIDRE solution does all of these things and more. One of the keys
to its success is the model-driven approach to the construction of graphical diagrams of
the data systems for each physical source and consuming application to determine
structure and format metadata and the further integration of those models into meta-
models using metadata mapping techniques.
This solution:
•   Supports transactions across databases and consuming applications, making
    incorporation of data or business rules much easier (Example: New source data
    arrives from NASIC that supports Brawler, TRAP, and MOSAIC. Using the PTI
    solution, that new information can be easily fed to the consuming simulations in
    seconds rather than weeks.)
•   Makes reuse of integration techniques across heterogeneous environments much
    easier (Example: Data provided by the intelligence community arrives in the form of
    an HTML file. That data is linked via metadata to a field in an operational data store,
    converted using an ETL tool, and then feeds a simulation requiring ASCII text file
    inputs.)
•   Gives users the ability to implement integration solutions without needing detailed
    knowledge of the application infrastructure’s specific underlying technologies.
    (Example: An ESAMS user is able to download updated information about a
    Surface-to-Air threat, move that dataset to the data directory in ESAMS, compile and
    run a study. All of this done without having to understand the format of the missile
    files or how they interface with the source code.)
The starting point for data integration is the reverse engineering of the physical data
sources and consuming formats. As mentioned earlier, PTI has incorporated a DMSO-
sponsored methodology in the SPIIDRE solution called REDIS. REDIS stands for
Reverse Engineering for Data Integration and Sharing. This approach promulgates a
model-driven process to discover format and structure metadata about each of the data
source providers and information consumers. Using that metadata, the sources and
consumers are integrated in a series of Subject Area Integration (SAI) models. These
SAI models can then be populated. REDIS is similar in many ways to data warehousing,
in that it provides for the development of a common data repository where data from


                                            23
                     The ASDB System and Software Design Document


sources is copied to a target repository. Any subsequent changes to the data, source
format or consuming format must also be maintained. The size of this repository could
be huge and the effort spent to keep the data sources and copies in synchronization would
be considerable.
PTI has followed the REDIS example in its data modeling techniques, but has deviated in
that instead of developing SAI models, data integration is accomplished through the use
of metadata and triggers and stored procedures. These are a few of the comparisons
between using the “central database” approach and using the PTI SPIIDRE solution:


    Single Central Database                     SPIIDRE
    Cannot keep up with upgrades to other Can keep up with upgrades common
    databases                             to all databases or individual
                                          databases, allows programming to
                                          convert new data types to a standard
                                          format, and allows users to access
                                          information specific to individual
                                          source databases
    No programming interface for cross- High-level        components     and
    database-supplier transactions      programming interface (appearing like
                                        a single database to external
                                        developers/users) for cross-database-
                                        supplier transactions
    No provisions for handling similar Metadata directory that “tags” data
    data defined in multiple databases or that exists in multiple databases so
    at different levels of classification analysts have the option to use
                                          different values for the same
                                          parameter
                         Figure 8: Central Database SPIIDRE Comparison

PTI uses the REDIS process to uncover and document the metadata concerning format
and structure. The objective of REDIS' is to develop two levels of graphical data models,
a DBMS model followed by a Project Information Model (PIM). This section
concentrates on the presentation of the REDIS process, which centers on the application
of the IDEF1X data modeling methodology. As the reverse-engineering process is laid
out, brief explanations of IDEF1X symbols and concepts are introduced as they are
encountered. This section does not cover all aspects of IDEF1X. Only those basic
symbols/concepts, which provide an understanding of the reverse-engineering process,
will be presented and graphically demonstrated. However, the information that follows
will provide pertinent details to accomplish the objective.
Reverse engineering presents certain limitations, which are noted for reader awareness.
The process is unavoidably tedious and requires a considerable amount of time. It is an
iterative effort that needs dedicated participation of people in several key roles for



                                              24
                     The ASDB System and Software Design Document


success. Gathering and refining information forms the basis of the effort. Access to all
of the needed information is the major principal stumbling block in certain situations.

15 Starting the Reverse Engineering Process
The first step in the reverse-engineering process is the identification of the data used by a
system. For the purposes of this document, we call the data “portion” of a system being
reverse-engineered, the “data system”. The data system may take many physical forms.
For example, it may be a relational database, a set of flat-files, or even data embedded
within the source code of a program. The reverse engineering process is surprisingly
similar for most legacy data systems; no matter what their form or their origin (legacy
database or M&S data). It always begins at the physical level. The physical data system
is the foundation of the reverse-engineering task whether your data are managed by a
hierarchical, network, or relational database management system or managed in flat files.
Some parts of the reverse-engineering process may be more complex than others,
depending on your physical layout.            IDEF1X is partial to relational DBMS
implementations, but that does not preclude using it to document data managed by
hierarchical databases, network databases, or even flat files.
Several information sources should be sought. Any software or documentation connected
with the data system is a valuable source of information in developing the information
models. The degree of documentation detail may be insufficient, but it provides a
starting point. Technical reference material on the subject area may be needed to clear up
some modeling questions. Personnel resources are vital! Individuals who have created,
maintained, or used the data system can provide input to your modeling sessions.
Technical experts can verify modeling assumptions and serve as a sounding board when
business rules are finalized. Although a basic IDEF1X language description is presented
in the manual, individuals involved in the reverse-engineering process may consider
additional IDEF1X training.
A data-modeling tool that implements the IDEF1X methodology is beneficial; it can save
time in the long run, even though there may be a learning curve associated with its use.
Management of a project information model representing a large relational database with
over 100 tables and easily five times as many fields is a difficult task. PTI has used
Computer Associates ERwin Data Modeling tool in the development of the ASDB.

16 Creation of the DBMS Model
The DBMS model is an IDEF1X representation of the physical structure of the existing
data system. It is considered a graphical form of documentation of the actual physical
structure of the data system. The material to follow presents the tasks necessary to
complete a DBMS model. The topics are presented in an order that is suggested for
developing the DBMS model, but note that there may be some deviation in the order of
accomplishing these tasks.

16.1 Identify Segments
The identification of segments in your data system provides the foundation of your
DBMS model. Before we discuss how to identify segments, we will introduce two of the
building blocks of IDEF1X, the entity and the attribute.


                                             25
                     The ASDB System and Software Design Document


An entity is an object that may be real or abstract. A real
object is a person, place, or thing. An abstract object is an
idea or some identifiable activity. An attribute is a
characteristic of an entity. It provides a piece of
information about an entity.
The diagram syntax for an entity and its attributes is a
square-cornered box with the entity name above the left          Figure 10: Entity and Attributes
upper corner of the box and all attribute names placed in
the box, one line for each attribute.
A segment is a physically grouped set of information about a real or abstract object in a
data system. A segment in IDEF1X terms is an entity. The information consists of the
entity attributes. The physical layout of data to be modeled determines what we identify
as segments. If your data system is under a relational DBMS, a segment would be a table.
                                                    For a network or hierarchical database,
                                                    a segment would be a node. A flat file
                                                    is assembled under one or more record
                                                    types, and one record type is a
                                                    segment. Data segments embedded in
                                                    a model or simulation may take the
                                                    form of data structures or data tables.
                                                    Attributes in a DBMS are the field or
                                                    column names. For a flat file or
                                                    embedded data, attributes may be
                                                    uncovered by inspecting the data or
              Figure 11: Identifying Segments       examining application code for
                                                    variable names.

16.2 Record Physical Characteristics
After segments are identified, we record our findings either manually or electronically by
creating an entity box (square-cornered) for each identified segment. Label the entity
with the exact name of the segment. Within the entity box, include attribute names that
correspond to the information contained in the segment. If you have flat files, use the
record type as the entity name and the names that you use for manipulating your
individual pieces of flat file
information as the attribute names.
For DBMS files, the names of the
data tables or data structures can be
used as entity names. The following
diagram in is an example of the
transition of information from a flat
file with record types to entities and
attributes. After segments of the
physical layout are identified, each
attribute is labeled with its
corresponding physical data type.                Figure 12 Record Physical Characteristics
Supplying this information enhances


                                            26
                       The ASDB System and Software Design Document


the model for completeness and thoroughness. For purposes of mapping data from one
data system to another, it is vital to record the physical storage method for all attributes.
One data system may store a serial number as a character string, and another may store
the same serial number as a numeric. To build a bridge between the two, we must be
aware of the physical storage. Any DBMS has a particular set of data types. One type is
used for each attribute. A non-DBMS flat file or embedded data application may require
that you examine the data and the application code to determine the data type.
In addition to the data type, it is beneficial to specify whether or not the attribute value
can be null. Discovering this information now will facilitate modeling when relationships
are identified. Some attributes are allowed to take on a null value. On the other hand,
some are always required to have a value. Whether or not nulls are allowed can be
uncovered by examining your database or M&S application. Some DBMSs allow
developers to specify this characteristic. Other implementations are complex and use
application code and a special flag to represent nulls.
Physical characteristics are recorded in the DBMS model in a manner dependent upon the
manual method or automated tool you are using. The model is not changed by adding
physical characteristics, but is enhanced by displaying more detail. The next diagram
shows an example with physical characteristics included in the model.

16.3 Add Meaningful Model Names
Entities and Attributes are now given meaningful names to enhance readability of the
DBMS model. The reason for this addition is that meaningful names are not always used
in the physical implementation. In addition, participants to the modeling effort include
individuals who may not be familiar with the physical names. For example, some
database implementations assign physical names according to a naming scheme using
acronyms. Also, the framework in which the system is implemented (DBMS,
FORTRAN, flat files, etc.) may limit the space available for meaningful names. Usually,
only the database developers, currently working with the physical implementation, will
have a full and quick understanding of the physical data names. Thus, businesslike names
are required for best comprehension by modeling participants. Although businesslike
names are used through most of the reverse-engineering effort, the physical names must
be retained for purposes of mapping back to the original data system.
                                                       When businesslike names are added to
                                                       the DBMS model, a few points should
                                                       be kept in mind. For entities, select a
                                                       name that represents the collection of
                                                       attributes in the entity. For attributes,
                                                       select a name that is meaningful for the
                                                       occurrences of that attribute. In a
                                                       grammatical sense, the entity or
                                                       attribute name selected should be a
                                                       noun or noun phrase. Including a
                                                       qualifier in the entity or attribute name
                                                       enhances its significance. The name
                                                       given to either an entity or an attribute
       Figure 13: Add Business Names to Improve
                    Comprehension

                                                  27
                     The ASDB System and Software Design Document


is singular. For example, each occurrence of a customer in the entity CUSTOMER
represents one and only one customer and the occurrence of a Customer-Name for one
occurrence of a customer represents one and only one customer name. Be consistent
when naming entities and attributes. For example, attributes that contain number in their
names should consistently be named "Number", not sometimes "Number" and other
times "No." This diagram shows an example of a model with businesslike model names.
Also, be consistent in the capitalization and use of separators (spaces, dashes, and
underscores) in the entity and attribute names.

16.4 Identify Primary Keys
Another level of detail in the DBMS model is the presentation of the primary key in each
entity. It provides a stepping-stone to the identification of foreign keys and relationships.
Before we discuss the identification of primary keys, we will define IDEF1X candidate
and primary keys.
A candidate key is an attribute, or group of attributes, that uniquely identifies entity
instances. An instance is a single occurrence of an entity type in a data system. A primary
key is the preferred candidate key for uniquely identifying entity instances. Key selection
includes a couple of considerations besides unique identification of entity instances. The
attribute or group of attributes selected as a candidate key usually has a value that does
not change over time. Also, an attribute that can be null is disqualified from being a
candidate key. If an attribute can be null, there is a possibility that unique identity of
instances no longer holds.
The diagram syntax for a primary key involves the separation of the attributes into two
sets within an entity. One set of attributes consists of all those that make up the primary
key. The other set consists of those attributes that do not belong to the primary key.
Within an entity box, a horizontal line separates the two sets of, with one attribute per
line. The primary key attributes are positioned above the horizontal line, and all other
attributes are positioned below the line. This is a diagram of a model with primary key
notation
In a reverse-engineering effort, we have several sources to lead us to the set of candidate
keys of an entity. If the data system is managed by a DBMS, we can examine any
established indexing for an entity. An index is a good indicator of a candidate key.
Application code that manipulates the data system may provide clues to identifying
unique entity instances. For flat files or embedded data, the application code may order
the data. Simply examining the data for unique field values may uncover the identity of
candidate keys. For M&S data, the identifiers used for the object being modeled are
usually unique keys. The name of an attribute may also give a clue that an attribute is a
candidate key. Obvious names containing "ID" or "key" are a good indicator of a
candidate key. Database developers who are familiar with the data may provide the
candidate key information with minimal effort, since the database is their arena.

16.5 Select the Primary Key
A given data system may have several valid keys of which we must select one. Two
points to consider are choosing a key made up of as few attributes as possible (ideally,



                                             28
                      The ASDB System and Software Design Document


only one) and selecting the candidate key least likely to change over time. By changing,
we mean the changing of instance data.

16.6 Identify Foreign Keys & Relationships
More descriptive detail is revealed when the DBMS
model is furnished with foreign keys and
relationships. Before we discuss the identification
of foreign keys and relationships in the reverse-
engineering effort, we will introduce background
IDEF1X information for relationships and include
any related diagram syntax. The next few
paragraphs describe how IDEF1X can document
the relationships.
An IDEF1X relationship is a connection between
two entities where one entity, the parent, passes its
primary key to another entity, the child.
Additionally, the primary key inherited by a child         Figure 14: Establish Relationships
entity, through a relationship, is termed a foreign                 between Entities
key. For example, a CUSTOMER (the parent) places zero, one, or many ORDERs (the
child). When a CUSTOMER places an ORDER, the ORDER captures the primary key of
the CUSTOMER to maintain instance identity and maintain the association between the
two entities. The CUSTOMER passes its primary key, Customer-Number, to ORDER
where it is noted as a foreign key. The notation for an attribute of a child entity that is a
foreign key is (FK), after the attribute name.
The diagram syntax for a relationship is either a solid or a broken line connecting the
parent entity to the child entity with a solid dot on the child entity end of the line. Along
with a line connection, a relationship requires a verb phrase to annotate the relationship
from the parent entity to the child entity. The verb phrase should be constructed so that a
meaningful sentence can be stated about the relationship between the two entities. The
verb should not be over-simplified.
In other words, avoid using single
verbs like "has" or "is." See the
next diagram for another example
of a relationship (a FACILITY is
the location for zero, one, or many
DEPARTMENTs). (Note that the
DEPARTMENT entity is a round-
cornered box.)
We can examine a relationship a
little more closely by looking at its
cardinality.     A     relationship's
cardinality is the number of
expected child instances to parent
instances in a relationship. There
are     13     possible     IDEF1X
                                                     Figure 15: Establish Cardinality


                                             29
                        The ASDB System and Software Design Document


cardinalities. We will examine four of these, with their syntax.
•   One parent instance may have zero, one, or more related child instances. The dot on
    the child entity has no distinctive label.
•   One parent instance may have one or more related child instances. The dot on the
    child entity is distinguished with "P" to its right.
•   One parent instance may have zero or one related child instances. The dot on the child
    entity is distinguished with "Z" to its right.
•   One parent instance is related to exactly n child instances. The dot on the child entity
    is distinguished with some integer "n" to its right. The following diagram shows the
    described cardinalities (note that these are identifying relationships). Further
    discussion of relationship types is provided in the following paragraph.

16.7 Key Migration
Key migration is the movement of the primary key
of the parent entity to the child entity through a
relationship. Keys migrate through two types of
relationships, either an identifying relationship or a
non-identifying relationship. An identifying
relationship is a relationship where the primary key
attributes of the parent entity become part of the
primary key attributes of the child entity (the
primary key of the parent migrates above the
horizontal line in the child entity box). A non-
identifying relationship, on the other hand, is a
relationship where the primary key attributes of the
parent usually become part of the non-primary key
attributes of the child entity (the primary key of the                   Figure 16: Key Migration
parent migrates below the horizontal line in the
child entity box).

16.8 How to Read a Relationship
A relationship implies a business rule that can be stated in a logical sentence. The pieces
needed to form the sentence are the parent entity name, the child entity name, the
relationship verb phrase, and the cardinality at each end of the relationship. Typically,
business rules read from parent to child but can be read from child to parent by revising
the relationship verb phrase. A general form of the business rule is:
                                                             Zero, one or many
 One
 Zero or one <parent entity name> <relationship verb phrase> One or more       <child entity name>.
                                                             Zero or one
                                                             Exactly “n”

                                   Figure 17: Reading a Business Rule




                                                  30
                      The ASDB System and Software Design Document


16.9 How to Identify Foreign Keys and Relationships
The key to identifying foreign keys
and relationships is to carefully
examine attributes between entities in
conjunction with existing data. If the
entire primary key of one entity
appears as all or part of the primary
key of another entity, we probably
have an identifying relationship. If the
entire primary key of one entity
appears in the non-primary key
attributes of another entity, we
probably have a non-identifying
relationship. Also look for possible
role names when examining entity
attributes.                                        Figure 18: A Business Rule in IDEF1X
Verify the relationship and determine
the cardinality by querying your data or examining the referential integrity constraints
imposed by your data system; all child instances should have an associated parent
instance (unless the relationship is optional non-identifying). For embedded data or flat
files, the referential integrity may be maintained by the application. Patterns in the
existing data may also verify relationships or uncover overlooked relationships.
A key point to remember is that the DBMS model should only contain relationships that
are substantiated by the structure of your data. For example, a network or hierarchical
database will probably consist of independent entities because connections between
relationships are strictly through physical pointers; we would have to introduce new
attributes to present relationships. Relationship information will be addressed further in
the project information model. We will add further relationships there. So, don't be
concerned if some entities are not connected by relationships at this point.

16.10 Provide Entity and Attribute Definitions
One of the most important aspects of any information model is the documentation of
definitions. All entities and attributes must be fully described, through a definition, to
provide needed information as the reverse-engineering process progresses. For the
DBMS model, definitions are useful in identifying migrating keys and relationships.
Thorough and complete definitions are used to avoid any uncertainties. Also, definitions
help to resolve unanswered questions when the project information model is refined.
Definitions help to determine mappings from one project information model to another.
The information pool for definitions includes documentation, technical material and
subject matter experts. The difficulty of gathering definitions from these sources depends
on the degree of availability. The data analyst responsible for recording definitions begins
with any available data system documentation. The availability of complete, up-to-date
documents eliminates research time and does not tie up additional personnel resources. If
documentation is unavailable or insufficient, the data analyst provides some of the
obvious or intuitive definitions. If there are still missing definitions, the next option is to


                                              31
                     The ASDB System and Software Design Document


research technical material or to approach subject matter experts. Research is time
consuming and yields incomplete or assumed definitions, especially if the data analyst is
new to the information area. On the other hand, a subject matter expert saves the data
analyst time and a lot of guess work. The subject matter expert may find it beneficial to
examine existing data to isolate definitions.
The data analyst must guide the depth and content of the definitions. A definition is
sufficient if your audience for the subject area knows and understands the definition
without question. Problems encountered while gathering definitions may vary.
Documentation of the data system may contain ambiguous and incomplete definitions.
When interacting with subject matter experts, the data analyst may encounter responses
like "self explanatory" or "too technical to explain simply and briefly," or the expert may
be vague and reuse the word being defined. The key is to arrive at a definition that is
unambiguous to someone familiar with the subject area of interest. Definitions do not
have to be geared towards the lay person. The last resort to resolve inadequate definitions
is research time, preferably on the part of the subject matter expert. Finally note that you
should include any concerns and special information about the entity or attribute being
defined.
After an initial pass to document definitions, the subject matter expert reviews the
definitions. If there is more than one expert, there must be agreement with the definitions.
Definitions are added throughout the entire reverse-engineering process because some
will not be available when definition gathering is initiated for the DBMS model.

16.11 Save Your DBMS Model
A completed DBMS model is saved to preserve a reference to the existing physical data
system. This preservation provides valuable documentation from a graphical standpoint,
and provides easy access to vital information that is organized in central locations.
Hence, individuals within your business realm have an easily accessible, centralized
information source. Similarly, those outside of your business realm have available an
easy to assimilate, centralized information source about your existing data system. The
DBMS model is the foundation of the project information model that is covered in the
following paragraph.

17 Creation of the Project Information Model
The project information model (PIM) is an IDEF1X model, which moves away from a
physical representation into a logical description of the requirements of the legacy data
system. The DBMS model neither added nor deleted any entities or attributes. There
remained a one-to-one correspondence between actual data system segments and fields
and the resulting entities and attributes. In contrast, the PIM structure will be a
modification. New entities and attributes may be added, attributes may be moved from
one entity to another, and attributes may be deleted from the model. So, to preserve the
physical information, save the DBMS model. The goal of creating a PIM is to develop a
model that is free of inaccurate, inconsistent, and redundant information (termed a
normalized model) and to refine all business rules. The foundation for the PIM is the
completed physical DBMS information model that is taken to a logical level of detail.




                                            32
                      The ASDB System and Software Design Document


The material to follow presents a set of tasks to achieve a PIM. There is no set order
associated with the accomplishment of tasks.

17.1 Normalization
Normalization is the process of removing inaccurate, inconsistent, and redundant data
and modifying the structures that support them. For reasons of performance, a data
system may intentionally incorporate a non-normalized data structure by breaking
normalization rules. When normalization rules are broken, data integrity and reliability
are lowered and the burden of enforcing these rules is placed upon the application
software. The goal of normalization is to guarantee that there is one and only one way to
know a fact. "One fact in one place" is the phrase to remember when we think about
normalization. The discussion to follow introduces normalization specifics and basic
model checks to detect non-normalized data structures.

17.2 Remove Violations of 1NF, 2NF,
       and 3NF
Normalization rules incorporate five
levels: 1NF (first normal form), 2NF
(second normal form), 3NF (third normal
form), 4NF (fourth normal form), and 5NF
(fifth normal form). To achieve a sound
data structure, we need to achieve at least
a 3NF model; a model normalized to 3NF
is also in 1NF and 2NF. To attain 3NF
implies that entity instances depend on
"the key, the whole key and nothing but
the key." We will define normalization
levels through 3NF and provide a means                   Figure 8: Normalization
to resolve normalization violations.

17.3 Resolve Multiple Attribute Occurrences
Part of achieving a PIM is to free the model of
multiple attribute occurrences that are specified in
the DBMS model. A multiple attribute occurrence
is the use of two or more attributes in one entity
to represent the same information but with
different values in each attribute. Also, the
definition for each of these attributes is identical.
The reason for resolving these situations is to
clarify the model and provide a more direct means
of representing the business rule without
repeatedly using an attribute name, or using
attribute names that are qualified with an integer
number.                                                 Figure 20: Resolve Multiple Attribute
When reverse engineering, multiple attribute                        Occurrences
occurrences are easily detected by examining


                                              33
                     The ASDB System and Software Design Document


attribute names and definitions within an entity. An entity with repeated attribute names
and attribute names that are discriminated by a consecutive number are the best indicators
of multiply occurring attributes. You can verify your findings by observation of like
definitions for those attributes. As explained earlier, we resolve multiple attribute
occurrences by placing the multiple attributes into one attribute in a new dependent child
entity.

17.4 Identify Remaining Foreign Keys and Relationships
Through focus sessions with the modeling team, some hanging entities may be placed
into logical relationships by adding attributes to represent foreign keys. A hanging entity
is an independent entity that is not connected to any other entity in the model.
The modeling team determines through discussions if and where a hanging entity fits into
the model. For each logical relationship exposed, the information modeler adds missing
foreign keys, includes the appropriate relationship connection (i.e., identifying or non-
identifying relationship), provides a relationship label, and supplies a definition if
required for each added foreign key attribute.
Some entities should be dropped from the PIM. Entities that do not contribute to the
overall business but are used for implementation purposes should be discarded; for
example, an entity of authorized users is implementation dependent. No one outside your
business wants to know who entered a particular record in your data system and on what
date.

17.5 Identify Category Relationships
Categories of information are
represented in PIMs through
special relationships called
category         relationships.
Categories are a means of
graphically       representing
business rules. They specify
information        from       a
generalization standpoint at
one end of the relationship,
and corresponding to a
categorization standpoint on
the other end of the                             Figure 21: Category Relationships
relationship. The business
rules associated with category relationships are hidden in DBMS models. The physical
representation of a DBMS or flat file implementation does not directly accommodate
category relationships. We rely on the logical PIM to represent categories. The discussion
on how to identify category relationships during reverse engineering will demonstrate
how category relationships are hidden in the DBMS model. First, we will define IDEF1X
category relationships and explain the related diagram syntax.




                                            34
                         The ASDB System and Software Design Document


17.6 Create Associative Entities
The PIM should provide what can logically exist for the business. Implementation
constraints add to the complexity of achieving a PIM that purely distills the logical
requirements. One special case of an implementation constraint involving multiple
                                          relationships between the same pair of
                                          entities limits the flexibility of representing
                                          relationships. These special cases may be
                                          identified and resolved by modeling
                                          participants through detailed discussions
                                          that keep in mind the information to follow.
                                          The resolution to eliminating this implemen-
                                          tation constraint is to apply the concept of
                                          associative entities. We present the
                                          guidelines for uncovering and resolving this
                                          implementation constraint following an
                                          introduction to IDEF1X associative entities.
          Figure 22: Associative Entities




17.7 Eliminate Implementation Characteristics
Implementation characteristics in a model belong to the DBMS model only. In a PIM, we
want to represent what logically exists in the business, not what is physically limited by
the implementation or what is physically characteristic of the implementation. Physical
implementation characteristics revolve around design decisions that are made as a result
of specific requirements for the organization. The requirements range from accountability
of record entry and modification to ease of data entry. The modeling team carefully
reviews the current model to uncover implementation characteristics.
To determine whether an attribute, or an entity for that matter, supports an
implementation characteristic, the team must decide whether attribute or entity instance
data have meaning to anyone outside the organization. If the attribute or entity in
question has meaning only within the organization, we can drop that attribute or entity
which captures the implementation characteristic and revise any definitions that might
reference the dropped information.

17.8 Eliminate Non-unique Entity and Attribute Names
Entity and attribute names should be unique within the PIM. The DBMS model is
allowed non-unique attribute names. It is a choice initiated by the data system developers.
The PIM on the other hand is enhanced for communication purposes by eliminating non-
unique entity and attribute names. Communication of information represented by a PIM
may be misinterpreted without this enhancement. Using explicit names for entities and
attributes promotes clear discussions of model information.

17.9 Refine Business Rules
Business rules are formed throughout the reverse-engineering process. The refinement of
business rules in the PIM entails a review of the model in two different areas. One area of


                                             35
                                                                                                  The ASDB System and Software Design Document


review is the cardinality specified in each relationship. Another area is to check the
dependency of each entity as a result of established relationships. The two areas
combined determine the business rule. The goal of this exercise is to take another
perspective of the model and modify it to represent what is logically required and not
what was chosen as an implementation constraint. A discussion of each refinement area
follows, including IDEF1X explanations as needed.

18 A Complete Project Information Model
The PIM is complete when we can no longer incorporate new information in the model.
At this time, the PIM is ready for integration through the metadata mapping process.

                                                                                                                               Z




                                                                              P




                                                                                                                                                                                                                                                                                                                                         P



                                                                                  Z                                                                                                                                                                                                                                                                                              P
                                                                                                                                       Z




                                  P




                                                                                                                                                                                                                                                                                                                                             Z

                                                                                                                                                                                                                                                                                                                                     P




                                                                                                                                                                   Z
                                                                      Z




          P




                                      Z                                                                                                                                                                                                                                                                                                                          Z
                                                                                                                                                                       Z




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     P




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             Z               Z




      P


                                                                                                                                                                                                                                                                                                                                                                                                         Z                               P

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         P                       P




                                                                                                                                                                                       P

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 P

                                                                                                                                                                       Z
                                                                                                                                                                                                                                                                                                                                                                                                     Z                                                                                                                                                       P




                                          Z



                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         P
                                                                                                                                                                                                                                                                                                                                                                                                             Z


                                                                                                                                                                                                                                                                                                                                                                                                                         P




                                                                                                                                                                                                                                                                                                                                                                                                                                                     P
                                                                                                                                                                                                                                                                                                                                                                                                                 Z




                                                                                                                                                                                                                                                                                                                                                                                                                     P
                                                                                                  Brawler F ile Name                                                               Z




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     P
                                                                                                                                                                                                                                                                                                                                                                                                                 P



                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         P

                                                                                                                                                                                                                                                                                                                                                                                     Z


                                                                  Z


                                                                                                                                                                                                                                                                                                                                                                                                                 P


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     P


                                                          P                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              P
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     P
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             Z




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         P
                                                                  P                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          P

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 P

                                                                                                                                                                                                               Z
                                                      P




                                                                                                                                                                                                                                                                                                                                                                                                                                                         2




                                                                                                                                                                                                                   Z
                                                                                                                                                                                                                                                                                                                                                                                                                                                                         P




                                                                                                                                                                                                                                                                                                                                                 P



                                                                                                                                                                                                                                                                                                                                                                     P
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 P

                                                                                                                                                                                                                       Z
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             Z
                                                                                                                                                                                                                               Z




                                                                                                                                                                                                                                                                                                                                                                                                                                             P   P
                                                                                                                                                                                                                                                                                                                                                                                                                                                                             P
                                                                                                                                                                                                                                                                                                                                                                     P

                                                                                                                                                                                                                                                           P                                                                                                                                                                                                                         P
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             P
                                                                                                                                                                                                                           Z


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 Z

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     Z

                                                                                                                                                                                                                                                                                                                                                                                                                                                             P
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     Z
                                                                                                                                                                                                                                                                                                                                                                                                                                                             P                           P
                                                                                                                                                                                                   P
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             P


                                              P                                                                                                                                                                                                                                                                                                                                                                                                      P   P
                                                                                                                                                       Z
                                                                          Z                                                                                                                                                                                                                                                                                              P                                                                       P

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         P                                               P
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 P



                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     P
                                                                                                                                                                                               P
                                                                                                                                                                                                                                                                                                                                                                                                                                                                     P

                                                                                                                                                                                                                                                                                                                                                     P                                                                                                               P




                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 P




                                          P



                                                                                                                                                                                                                                                                                                                                                                                                                                             P           P

              P




                                                                                                                                                                                           P




                                                                                                                                                           Z




                  P
                                                  P




                                                                                                                                                                                                                                                                               Z




                                                                                                                                                                                                                                                                                                                                                                                         P




                                                                                                                                                               Z

                                                                                                                                                                                                                                                                                                                                                                                                         Z




                                                                                                                                                                               P                                                                                                                                                                         P




                                                                                                                                                                                                           P                                                                                                                                                                                         Z




                                                                                                                                                                                                                                                                                                                                                                                             Z

                                      Z                                                                                                                                                                                                                                                                                                                                                                                          Z
                                                                                                                                                                                                                                                                                                                                                                 Z




                              P




                                                                                                                                                                                                                                                                                                                                                             Z
                                                                                                                                                                                                                                                                                                                                                                                                                             Z



                                                                                                                                                                                                                                                                                                                                                                                                 Z
                                                                                                                                                                                                                                                                               P



                                                                                                                                                                                                       P




                                                                                                                                           Z




                                                                                                                                                                                                                                                                                                                                                                             Z

                                                                                                                                                                                                                                                                                                                                                                                                 P

                                                                                                                                                                                                                                                                                                                                                                                                                                     Z




                                                                                                                                                                                                                                                                                                                                                                                                                                                                 P
                                                                                                                                                                               P
                                                              Z




                                                                                                                                                                                   P




                                                                                                                                                                           P                       P




                                                                                                                                                                                                   P




                                                                                                                                                                                                                                           Z


                      P




                                                                                                                                                                                                                                                               P
                                                                                                                                                                                                                                               P




                          P                                                                                                                                                                                                                                        P




                                                                                                                                                   P


                                                                                                                                                                                                                                                   P
                                                                                                                                                                                                                                                                   P                   Attribute Value
                                                                                                                                   Z




                                                                                                                                                                                                                                                       P
                                                                                                                                                                                                                                       P
                                                                                                                                               P




                                                                      Z
                                                                                      Z                                Z

                                                                                                                                                                                                                                                                                   P
                                                                                                                                                                                                                                   P
                                                                                                                                                                                                                                                                           P




                                                                                              P
                                                                                                                                                                                                                                                                                                         Brawler System Name (F K)


                                                                                                                                                                                                                                                                       P



                                                                                                                                                                                                                                                                               P




                                                                                          P




                                                                                                                           P




                                                                                                                           Figure 23: Complete Project Information Model




                                                                                                                                                                                                                                                                   36
                     The ASDB System and Software Design Document




Using ERwin

CA’s AllFusion ERwin Data Modeler is the industry’s leading data modeling solution. It
can greatly enhance the design, generation and maintenance of high-quality, high-
performance databases, data warehouses and enterprise data resource models.
From a logical model of your information requirements and business rules that define
your database to a physical model optimized for the specific characteristics of your target
database, AllFusion ERwin Data Modeler helps you visually determine the proper
structure, key elements and optimal design for your database.
More than just a design tool, AllFusion ERwin Data Modeler is a powerful database
development tool, automatically generating tables and thousands of lines of stored
procedure and trigger code for leading databases.

19 Using ERwin
Once ERwin has been installed, following the directions included in Chapter 11, the
analyst can begin a new project.
Open ERwin. Use the file menu to select a new project. You’ll be asked if you want the
model to be physical, logical or both. Select both.
At this point, ERwin is setup and ready for modeling. If the analyst is not familiar with
ERwin, he can walk through the tutorial offered via the Help drop down menu.




                                            37
                      The ASDB System and Software Design Document




                                 Figure 24: ERwin Logical/Physical

20 Data Modeling Using ERwin
Once the analyst has completed the tutorial, he should have a general understanding of
how ERwin works. There are some ASDB-specific issues that should be addressed.
During the development of the ASDB, analysts have done their utmost to capture as
much information as possible about the entities and attributes in their data models. In
order to make the job of the analyst a little easier, it’s important to note some peculiarities
inherent with ERwin. ERwin was built primarily as a forward engineering database
development tool. Since the majority of the modeling work done using the SPIIDRE
solution requires reverse engineering, some special steps need to be followed.
During the DBMS development phase of the model development, the definitions for the
entities and attributes should be captured on the logical side. They will migrate to the
physical.
When applying the meaningful names to the physical tables and columns, switch over to
the logical side of the model. Use the user defined properties tab to define a native name
field. Copy the native name you’ve used to define the physical column to this value.
Capturing that information will help in mapping the attribute back to its original data.




                                                38
                     The ASDB System and Software Design Document




                                     Figure 25: ERwin UDP

Now, using CS-RCS, which is covered in Chapter 11, check out the ASDB naming
standards file. The check out process will generate a directory on your C drive, source
code. Once you’ve completed this step, go back to ERwin to continue with the setup
process. Under the Tools drop down menu, select Names and then Model Naming
Options. Under the General Tab, using the browse option, find the ASDB naming
standards file at C:\source-code\Naming Standards File\Design\ASDB.NSM. Under the
Logical Tab, select capitalization for Entities to upper and Attributes to initial. Under the
Physical Tab, select Tables to upper and Columns to upper. Under the Name Mapping
Tab, select Use Glossary for Entity to Table and Attribute to Column. Finally, under the
Duplicate Names Tab, select ask.
You can begin renaming your attributes. Do the renaming on the logical side. Once
you’ve completed an entity, switch over to the physical side of the model. You’ll see that
the naming standards have been applied and that the full, meaningful name you just
created has been abbreviated to a representative column heading.




                                             39
The ASDB System and Software Design Document




       Figure 26: ERwin Model Naming Options




                        40
                     The ASDB System and Software Design Document




Model Testing & Generation of Data Marts & Operational Data Stores

Given that the conceptual data models lay the foundation of all later design work, their
quality has a significant impact on the quality of the subsequent databases that are
ultimately implemented. There are few quantitative and objective measurement
instruments available to assess structural complexity and internal quality of conceptual
data models, with the idea that those attributes are related to the external quality of such
models. PTI’s SPIIDRE solution allows for the instantiation of the conceptual data
models as physical databases and then analysts conduct controlled experiments with
existing legacy data to ensure design quality. Once an assessment has been made on the
conceptual data models and subsequent implementations, the newly generated databases
are used to fill key roles in the information integration process. PTI uses data marts as a
means to store the replicated back-end source data and operational data stores, as a target
for the conversion software linking those sources with application-specific customer
needs.

21 Project Information Model Assessment
Assessing a PIM is difficult given the scarcity of quantitative and objective measurement
tools. Recognizing that the data models form the foundation for all future integration
efforts and therefore a validation of the integrity of the conceptual model is imperative,
PTI has developed a process to test each model using controlled experiments during
which each model is instantiated and load tested.
Once the data model is complete, a panel of peers reviews it. It is assessed for
understandability, compliance with standards and extensibility. After any resulting
modifications are made, the conceptual model is used to generate an Oracle database.
This is an automated process using the E/R modeling tool.

22 Database Generation from ERwin
Before beginning the generation process, the model has to be on the classified LAN and
table space has to have been created in the Oracle database. Once those steps have been
completed, use the following checklist to generate the database:
 1.    Open the E/R diagram using ERwin
 2.   Save the model to a new name which includes the word “Oracle,” such as Brawler
   Oracle PIM
 3.    Toggle to the physical side of the model
 4.    Click “Tools” on the main menu
 5.    Choose “Forward Engineering Schema”
 6.    Under the “Option” button, select the following:




                                            41
                      The ASDB System and Software Design Document


       a. Schema     uncheck
       b. View       uncheck
       c. Table     check “Create
          Table” and “Table Check”
       d. Column     “All”
       e. Index      uncheck
       f. Ref        keep the default,
          “PK-Alter” and “FK-On
          Delete-Alter”
       g. Trig       uncheck
       h. Options check
          “Comments”
 7.     Click “Generate”
 8.   Enter the user name,
   password and host string
 9.    Check through any errors
   that were captured during the              Figure 27: ERwin Automatically Generates the Database
   generation process and resolve
   those by making adjustments to
   the E/R diagram
 10. After any and all corrections are made to the E/R diagram, repeat steps 3-8 using
  the following options in the generation phase:
       a. Schema     uncheck
       b. View       uncheck
       c. Table      check “DROP TABLE”
       d. Column     uncheck
       e. Index      uncheck
       f. Ref        uncheck
       g. Trig       uncheck
       h. Options    uncheck
 11.    Restart the generation process until you have no errors

23 Data Marts and Operational Data Stores
The data marts are relatively straightforward structures. They are the physical
implementation of the completed PIM for the source data. Once instantiated as a
database, ETL tools are used to convert the source information from its native format into
the relational database. This allows the databases to be easily updated as new,
authoritative information is provided.


                                            42
                     The ASDB System and Software Design Document


For the ASDB, the data has been provided in a number of ways, each of which allows
some reuse of the ETL tools. In most cases, the authoritative data has been provided as
updates to an existing consuming data structure. For example, when we request an
update for TRAP, we received an updated TRAP missile file. The ETL tools used to load
that data into the database have been developed.
In other cases, the data will arrive in a source format. For example, NAIC provides data
in one of several source formats. Those have been modeled as individual back-end data
marts and can be loaded using the existing ETL tools. In most cases, the conversion of
these source files into the model-ready datasets is done outside of the database. When
that is the case, the data can be reloaded into the operational data store database using
existing ETL tools. For example, NASIC provides an aircraft export file. It’s entered
into the database using the ETL tools. The file is then processed and converted to
generate a portion of the Brawler aircraft dataset. Once that new dataset is complete, it’s
loaded into the database using the Brawler ETL tools.
While possible, data aggregation does not take place at this level. Cross-database data
mining can be accomplished once the data marts are linked via the SPIIDRE metadata.
At this stage, to an end-user, the ASDB interface appears to interact with one, global
database rather than a number of linked data marts.
The operational data stores are the physical implementation of the application specific
data requirements as modeled using SPIIDRE. The operational data stores are used to
cache data from multiple data sources into a relational database. Unlike data
warehousing, the size of the data store is relatively small, making replication efforts
quicker and easier. From a strategic information management perspective, the use of data
stores has distinct merits. It is a high-performance database that serves as a persistent,
long-term cache for data from one or multiple sources. Using SPIIDRE, these data stores
can be combined quickly and efficiently. Data extracted from the operational data stores
is transformed to generate DSMs.


Back-End Data Marts                                Operational Data Stores
TRAP                                               Brawler
NASIC Source Data                                  Thunder
SHAZAM                                             JIMM
BLUEMAX                                            ESAMS
SIMS                                               RADGUNS
                                                   MOSAIC
                                                   MIL-AASPEM
                          Figure 28 Data Marts and Operational Data Stores




                                                43
                     The ASDB System and Software Design Document




Using the SPIIDRE infrastructure as a focal point for gathering an unprecedented amount
of metadata, PTI can administer as wide a swath of source information as is desired by
the customer. SPIIDRE allows leveraging data relationships across data sources and
consuming applications without the effort of moving or copying the data into a central
repository.




                         Figure 29: Data Marts and Operational Data Stores




24 Population Test
After checking for any errors resulting from the implementation of the model, PTI conducts
a population test on the new database. PTI has developed a library of ETL to load legacy
data into the relational databases.
These ETL tools are for the most part Perl scripts that have been written for each data
mart. They grab the data from the source and convert it into a CSV format that can then
be loaded into each Oracle data mart using SQL Loader. There may be one Perl script for
one data mart or several Perl scripts. All of these files have been checked into PTI’s
Software Version Control System CS-RCS and can be found under each CSCI (Brawler,
Thunder, JIMM, etc) in the Version control library.




                                                44
                     The ASDB System and Software Design Document


In demonstrating that the data currently used in the legacy tool can be extracted from its
original format and used to populate the newly generated database, PTI analysts can
identify issues with structure, data type, size, domain, constraints, and a number of other
potential problems. When issues are discovered in the architecture of a given datamart
the changes are incorporated into the original Erwin and a new database is generated.
Then if any alterations must occur to the Perl scripts because of a change in the database
design that is accomplished then the data is reloaded into the datamart.
This step is accomplished for both the data sources and the consumers. For the
consuming models, the creation of the physical database and loading a sampling of the
existing legacy data provides a test of the data structures to ensure the proper physical
properties were captured.
In addition to providing a test of
the conceptual data model,
loading the source models
provides a linking between the
existing data sources and a more
manageable data management
system. This linking between
the native source and the new
data management structure can
be set up for automated updates
as new information arrives, thus
ensuring real time access to the
most current information.


                                                   Figure 30: Database Population




Application-Specific Transformations or Digital System Model
Development

From the data point of view, XML is simply a standardized approach to storing text-
based data in a hierarchical manner and to defining metadata about said data. XML is a
descendant of SGML (Standard Generalized Markup Language) and a sibling of HTML
(Hyper Text Markup Language), both ISO (International Standards Organization) and
Internet standards. XML was developed to add data management features similar to
those of SGML, but went beyond HTML by keeping the data and its structure together in
the same “message.” XML does not include the presentation of that data in an
application-specific format. This has to be created separately, as discussed in out section
on XSLTs. By providing the structure and data together, XML is an ideal medium for
migrating and sharing data. Users can build reusable code to process data from multiple
back-end sources for use in operational data stores. Conversely, data can be entered in
application-specific format in order to update multiple back-end sources.


                                            45
                     The ASDB System and Software Design Document


Standard        Description
Extensible      XSL enables you to present data in a paginated format. XSL supports
Stylesheet      the ability to apply formatting rules to elements, to apply formatting
Language        rules to pages to add things like headers and footers, and to render
                XML documents on various display technologies.
Extensible      XSLT enables you to transform data from one format to another.
Stylesheet      XSLT is often used to rearrange the order of the content within an
Language        XML document so that it makes the most sense for display. XSLT is
Transformations effectively used to transform data documents into presentation
                documents.
XML Schema      XML Schema enables you to define the structure and definition rules
                of an XML document. DTDs can only be used to define the structure.
                XML Schema provides the ability to specify data types to the level of
                precision that you see in programming languages and simple data
                modeling CASE tools. You can specify simple types such as strings or
                create your own “complex types” (data structures). You can also
                specify the cardinality and optionality. Simple validation rules can be
                defined as well. The greatest drawback of XML Schema is its
                complexity because it has a large feature set.
                                    Figure 31: XML Primer

SPIIDRE utilizes XML for data migration, transformation and storage. For data
migration purposes, XML’s portability is very useful. The XML document is platform
and application independent. In this neutral state, it is much easier to perform formatting
functions necessary to convert relational data into a non-relational format, such as
generating the ASCII text files needed to load data into a legacy simulation. Once the
data has been formatted, analysts and users can examine the input data sets prior to
simulation execution.
The business rules captured during the data modeling process define this structure. These
“structure rules” are collected and captured in a document linked to the XML file. This
schema is represented in a Document Type Definition (DTD), an XML Schema
Document (XSD) or in a variety of other formats.
These schema documents accomplish similar tasks, but in slightly different ways. When
linked to the XML document, they ensure the XML file follows the predefined structure.
It is this ability to provide consistency between XML files that allows for the
development of reusable code for processing data.
These reusable components allow users to view an XML file in a way they can
understand. To illustrate, let’s consider a set of data about an automobile tune up. The
data about the tune up consists of information about inspections of various components of
the car and replacement of faulty or worn parts. The data about the parts and specifics
about wear are all contained within the same repository, but depending on the user’s
specific needs, it may need to be delivered in differing formats. Consider the mechanic is
concerned with data about the parts that failed the inspection process, which parts must
be replaced, and the technical details of the replacement parts. Suppliers are concerned
with providing a suitable part for that particular car model. Customers want verification


                                             46
                     The ASDB System and Software Design Document


that the parts needed to be replaced, that the new parts are appropriate for their car, and
the cost of the replacement components. In applying XML to the ASDB, we have found
many similar examples.
Let’s consider the fact that many simulations need information about an F-15C. Each of
these applications requires similar information, but formatted in different ways. The
users know what they need and want to be provided information in a format they
recognize and understand. The most efficient means of delivering the data, therefore, is
in the format easiest for the users. This can be accomplished by applying application-
specific transformations for each simulation. The rules for accomplishing these
transformations are stored in an Extensible Stylesheet Language (XSLT) file.
All of PTI’s XSLTs are stored in their Version Controlled Library under the specific
CSCI (Brawler, TRAP, etc). The folder where these can be found in each CSCI is
“load”.

25 Converting the database into XML with Java
In order to accomplish this conversion, you will need the Java 1.3 Software Development
Kit (SDK) or higher installed on your computer. The Java application can be created in
any text editor program such as Textpad or other integrated development environments
such as one that PTI uses, Eclipse.
The process to create an XML document starts by using the org.apache.xerces.dom.
DocumentImpl class, specifying the XML filename, root, elements, and attributes to be
created.
For example:
File XMLFile = new File("xml", "organization.xml"));
Document doc = new DocumentImpl();
Attr xmlns = doc.createAttribute("xmlns:xsi");
Element root = doc.createElement("organization");
Element orgName = doc.createElement(“organizationName”);
Element orgAddress = doc.createElement(“organizationAddress”);

Using the Java.sql package, write an SQL query and retrieve the result into a ResultSet
object. The queries PTI used for the ASDB can be found in the load folder of each CSCI.
For example:
Statement stmt = conn.createStatement();
String query = "SELECT * FROM ORGANIZATION";
ResultSet queryResult = stmt.executeQuery(query);


Then you will loop through the ResultSet, assigning the results of the query to the
appropriate element in the XML document that you are creating.

Example:
while (queryResult.next()) {


                                            47
                     The ASDB System and Software Design Document


     orgName.appendChild(
     doc.createTextNode(queryResult.getString("ORG_NAME")));

      orgAddress.appendChild(
    doc.createTextNode(queryResult.getString("ORG_ADDR")));
}
Append the elements to the root and the document.
Example:
root.appendChild(orgName);
root.appendChild(orgAddress);
doc.appendChild(root);


Call the serialize () function and pass in the name of the XML file that you are creating.
Compile and run the Java application to generate the file.


26 Applying a schema to an XML document


To apply a schema to an XML document, a reference to the path and schema file name
must be made in the xsi:noNamespaceSchemaLocation attribute of the root element of
the XML document.


Example:
<infoResource
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="file:/C:/Eclipse/workspace/JAMIS/schema/dataItem
Class.xsd">
  <infoResourceID>MC-19</informationResourceID>
  <infoResourceName>AA-11 TRAP 4.1</informationResourceName>
…
</infoResource>


This reference now associates the XSD schema file with the XML document. To validate
the XML document against this schema, you may use the Validate Document command
under the XML command on the menubar in Eclipse. An alternate method of validating
the document is to use the Xalan Validate utility. To use Xalan Validate, open a
command window (Start->Run->type ‘cmd’), and switch to the directory of your XML
document.


To execute the Xalan Validate utility type


                                             48
                     The ASDB System and Software Design Document


java Validate file-or-dir-name [logfile]
giving the filename and logfile, if needed.

27 Load Testing
Once the DSM has been developed, where possible they are load testing using the
simulation. PTI has installed Brawler at its Crystal City location and has successfully
load tested PTI-produced Brawler datasets. The checklist for performing this load test is
available on PTI’s classified LAN at \\Asdbdb\ASDB\Library\Native Library\Brawler
Native\Brawler 6.4 Native\Brawler 6.4 Native Documentation.




                                              49
                      The ASDB System and Software Design Document




 System Hardware

 The classified LAN is a client server network built upon the Windows 2000 operating
 system and contains the equipment listed in the table below. The classified network is
 linked using cat 5 cable connected to each client computer and provides 100 mhz of
 bandwidth. The unclassified computer is also linked using cat 5 cable, however the 100
 mhz of bandwidth is then split in some rooms using a 10 mhz splitter, thus delivering 10
 mhz of bandwidth to each computer. The unclassified system is a peer-to-peer network
 using a variety of MS Windows operating systems and Server 2000. The unclassified
 equipment listing is described in the next table.


         Make/Model             Processor     Hard Drive   Storage     RAM        Other
ASDB     Dell Poweredge         Pentium 3     HD=2X18      IDE CD-     SDRA       Fast
Server   2450                   833 Mhz       G SCSI       ROM         M          Ethernet
                                                                       256MB      NIC
LAN      Dell Poweredge         Pentium 3     HD=4X36      IDE CD-     SDRA       Fast
Server   2450                   833 Mhz       G            ROM         M          Ethernet
                                                                       512MB      NIC
                                              SCSI
Client   Dell Optiplex GX       Pentium 4     HD=80G       IDE CD-     SDRA       Fast
         260                    2.53 Ghz                   ROM         M1G        Ethernet
                                                                                  NIC
Client   Dell Optiplex GX       Pentium 4     HD=80G       IDE CD-     SDRA       Fast
         260                    2.53 Ghz                   ROM         M1G        Ethernet
                                                                                  NIC
Client   Dell GX110             Pentium 3     HD=40G       IDE CD-     SDRA       Fast
                                766 Mhz                    ROM         M          Ethernet
                                                                       256M       NIC
Client   Dell GX110             Pentium 3     HD=40G       IDE CD-     SDRA       Fast
                                766 Mhz                    ROM         M          Ethernet
                                                                       256M       NIC
Client   Dell GX110             Pentium 3     HD=40G       IDE CD-     SDRA       Fast
                                766 Mhz                    ROM         M          Ethernet
                                                                       256M       NIC
Client   Dell GX110             Pentium 3     HD=40G       IDE CD-     SDRA       Fast
                                766 Mhz                    ROM         M          Ethernet
                                                                       256M       NIC
Client   Dell GX110             Pentium 3     HD=40G       IDE CD-     SDRA       Fast
                                766 Mhz                    ROM         M          Ethernet


                                            50
                                                            The ASDB System and Software Design Document


                                                                                                                                                                                                                                                                                                                                                                  256M                    NIC
Client                         Lexmark Optra                                                                                                                                                                                                                                                                                                                      SDRA                    Fast
                               K1220                                                                                                                                                                                                                                                                                                                              M 1M                    Ethernet
                                                                                                                                                                                                                                                                                                                                                                                          NIC
Client                         Hewlett Packard                                                                                                                                                                                                                                                                                                                    SDRA                    Fast
                               DLT1e                                                                                                                                                                                                                                                                                                                              M 2M                    Ethernet
                                                                                                                                                                                                                                                                                                                                                                                          NIC
Client                         E-Machine E-Tower                                                                                                                                                                                                        Pentium                          HD=4G                                             IDE CD-                SDRA                    Fast
                               433                                                                                                                                                                                                                      Celeron                                                                            ROM                    M 32M                   Ethernet
                                                                                                                                                                                                                                                        433 Mhz                                                                                                                           NIC
Client                         Dell GX150                                                                                                                                                                                                               Pentium 3                        HD=40G                                            IDE CD-                SDRA                    Fast
                                                                                                                                                                                                                                                        833 Mhz                                                                            ROM                    M                       Ethernet
                                                                                                                                                                                                                                                                                                                                                                  256M                    NIC
Client                         SGI 02-R10K                                                                                                                                                                                                              R10,000                          HD=18G                                            4 MM                   SDRA                    Fast
                                                                                                                                                                                                                                                                                                                                           Tape                   M                       Ethernet
                                                                                                                                                                                                                                                        195Mhz
                                                                                                                                                                                                                                                                                                                                           Drive                  256M                    NIC
                                                                                                                                                                                         Figure 32: Unclassified LAN Equipment




                                                                                                                                                                                                                                                                Windows Workstation                             Windows Workstation        Windows Workstaton     Windows Workstation       Windows Workstation


                                                                                                                                                                                                                                                         SD


                                                                       Pr o        itch
                                                                           Curve Sw 242 4M
                                                                       HP J 4 122B                                                                                                            10/100Base- T Ports

                                                                                 Module S tatus           1     2     3 4   5   6   1 14 15 1 17 18
                                                                                                                                     3       6        Link   1X   2X   3X   4X    5X    6X                          13X   14X   15X   16X   17X   18X
                                                                                    Self Test                                                         Mode




          Ethernet Switch 5-
                                                                                                          7     8     9 10 11 12     9       2
                                                                                                                                    1 20 21 2 23 24   Link
                                                                               s
                                                                            Con ole                                                                   Mode
                                                               Power
                                                                                                                                      Act Fdx 100
                                                                                                  Reset       Clear
                                                               Fault                                                                Mode S elect             7X   8X   9X   10X   11X   12X                         19X   20X   21X   22X   23X   24X




                port                                                                                                                                                                                                                                                                                            Fast Ethernet 10/100
                                                             Fast Ethernet Switch 16-port


                                            Tape Backup
                                                  SD
           PROLI ANT
                                 1850R




                         LAN Server                                                                                                                                                                                                                                                                                                                                                              Network Printer
                                                                                                                                                                                                                                                                Windows Workstation                                 Windows Workstation     Windows Workstation     Windows Workstation
                                                  SD
           PROLI ANT
                                 1850R




     ASDB Development Server



                                                       SD
                 PROLI ANT
                                    1850R




                                                                                                                                                                                                               SGI O2 Workstation

         ASDB Deployment Server

                                                                                                                                                                                                                                                                                          T AL K / DATA
                                                                                                                                                                                                                                                                                                    TAL K   RS CS TR RD TD CD




                                                                                                                                                                                                                                                                                      Fiber Converter
                             RAS Server


                              Class 5 Container

                                                                                                                                                                                                                                                                 3Com




                                                                                                                                                                                                                                                                                                                                Internet

                                                                                                                                                                                                                                                        ISP Router/Firewall/NAT




                                                                                                                                                                             Figure 33: Classified Network Architecture




                                                                                                                                                                                                                                                                                  51
                    The ASDB System and Software Design Document




         Make/Model                Processor        Hard              Storage   RAM     Other
                                                    Drive
Client   Hewlett Packard 6830      Pentium          HD=40G            IDE CD-   SDRAM   Fast
                                   Celeron                            ROM       128M    Ethernet
                                   633 Mhz                                              NIC
Client   Hewlett Packard 6830      Pentium          HD=40G            IDE CD-   SDRAM   Fast
                                   Celeron                            ROM       128M    Ethernet
                                   633 Mhz                                              NIC
Client   Hewlett Packard 6830      Pentium          HD=40G            IDE CD-   SDRAM   Fast
                                   Celeron                            ROM       128M    Ethernet
                                   633 Mhz                                              NIC
Client   Dell Dimension 4300       Pentium 4        HD=40G            IDE CD-   SDRAM   Fast
                                   1.4 Ghz                            ROM       256M    Ethernet
                                                                                        NIC
Client   Dell Dimension 4400       Pentium 4        HD=40G            IDE CD-   SDRAM   Fast
                                   1.6 Ghz                            ROM       256M    Ethernet
                                                                                        NIC
Client   Dell Dimension 4300       Pentium 4        HD=40G            IDE CD-   SDRAM   Fast
                                   1.4 Ghz                            ROM       256M    Ethernet
                                                                                        NIC
Client   Dell Dimension 4300       Pentium 4        HD=40G            IDE CD-   SDRAM   Fast
                                   1.4 Ghz                            ROM       256M    Ethernet
                                                                                        NIC
Client   Lexmark Optra K1220                                          IDE CD-   SDRAM   Fast
                                                                      ROM       256M    Ethernet
                                                                                        NIC
Client   Hewlett Packard                                              IDE CD-   SDRAM   Fast
         DLT1e                                                        ROM       256M    Ethernet
                                                                                        NIC
Client   Dell Inspiron 8100        Pentium 4        HD=20G            IDE CD-   SDRAM   Fast
                                   1.0 Ghz                            ROM       256M    Ethernet
                                                                                        NIC
Client   Compaq Presario           AMD              HD=25G            IDE CD-   SDRAM   Fast
                                   Duron                              ROM       512M    Ethernet
                                                                                        NIC

                              Figure 34: Unclassified LAN Equipment




                                               52
                             The ASDB System and Software Design Document


                                                       PTI ASDB Unclassified Network




Windows Workstation   Windows Workstation                  Windows Workstaton                     Windows Workstation                 Windows Workstation




                                                                                                   Fast Ethernet 10/100




Windows Workstation   Windows Workstation                  Windows Workstation                    Windows Workstation
                                                                                                                                                       Network Printer




                                                                                                                                                                                                                           7x   8x   9x       10x   11 x   12 x   7x   8x   9x       10x   11 x   1 2x




                                                                                                                                                                                          Ethern et
                                                                                                                                                                                                       C
                                                                                                                                                                                                           7 8 9 10111 2

                                                                                                                                                                                                       A   1 2 34 5 6      1x   2x   3x       4x    5x     6x     1x   2x   3x       4x    5x     6x
                                                                                                                                                                                                                                          A                                      B




                                                                                                                                                                                                      Fast Ethernet 16-port hub




                                                                                                                                         T AL K / DA TA
                                                                                                                                                   TAL K   RS CS TR RD TD CD




                                                                                 Fast Ethernet 10/100
                                                                                                                                     Fiber Converter
                                            Linux Workstation/
                                            Tomcat Server


                                                                                                                     3Com




                                                                                                                                                                               Internet

                                                                                                           ISP Router/Firewall/NAT




                                            Figure 35: Unclassified Network Architecture

28 Domain Controller
A Dell PowerEdge computer acts as the server for the classified network and the hard
drive has a RAID 1 configuration. Windows 2000 is the operating system with service
pack 2. Component System Revision Control software is used to accomplish version
control of all entered software programs. Active Perl is installed to provide data parsing
capability, Inoculate IT for Windows NT is installed to scan for viruses, MS Office 2000
Professional, Text Pad 4 to enhance parsing capabilities, Power Archiver to compress and
decompress files, ArcServe 2000 Advanced Edition to provide data back up and
recovery, and Hewlett-Packard Library and Tape Tools to assist in data backup. This
system also serves as the intranet information server with Apache Tomcat 4.0, Java 2
SDK Standard Edition and Cocoon installed.

29 Developmental System Hardware
Connected to the classified LAN this Dell PowerEdge Computer with a RAID 5 array
established on four hard drives and Windows 2000 Server Operating System with service
pack 2 operates as the ASDB developmental server. Loaded with Oracle 8i it houses the
developmental version of the ASDB and is subject to daily modifications. Access to the
database is done through a command line prompt or through DB Visualizer on the server


                                                                                 53
                     The ASDB System and Software Design Document


keyboard or on a client machine. Adobe Acrobat 5.0 and CS Revision Control System
are also loaded.


SystemSoftware

30 Client Tier Components

30.1 Web Browser: Netscape Navigator or Microsoft Internet Explorer
A program that uses the hypertext transfer protocol (HTTP) and secure HTTP (HTTPS)
to make requests of web servers throughout the Internet on behalf of the user. The web
browser transmits requests from the client to the server; translates hypertext markup
language (HTML) from the server into readable pages; and receives and executes Java
applets (Java applications that execute within a web browser).
Constraints: The web browser must be capable of displaying HTML 3.2 or higher and
running a Java Virtual Machine (JVM) compliant with Sun’s JDK 1.3.0.C or higher. The
connection that the web browser uses to access ASDB must not block TCP/IP port 80.
This component is external to ASDB, it is assumed that the user has an acceptable
browser installed and correctly configured.

30.2 Java Runtime Environment
Java Plug-in software enables enterprise customers to direct applets or JavaBeansTM on
their intranet web pages to run using Sun's Java 2 Runtime Environment, Standard
Edition (JRE) instead of the web browser's default virtual machine.
Constraints: Support for Microsoft Windows and Sun Solaris-based browsers.

30.3 Java Certificate
A signed applet file is a mechanism for ensuring an applet can be trusted on the client.
For a signed applet to be trusted, the client must trust the key that was used to sign the
applet file on the server providing the applet. The Java ™ Certificate provides the
mechanism for the client to authenticate the source of the applet.
Constraints: Support for Microsoft Windows and Sun Solaris-based browsers.

31 DbVisualizer
DbVisualizer is a cross-platform database tool for all major relational databases
developed by Minq Software (www.dbvis.com). DbVisualizer enables simultaneous
connections to many different databases through JDBC drivers. Just point and click to
browse the database structure, view detailed characteristics of database objects, edit table
data graphically, execute arbitrary SQL statements or SQL scripts, reverse engineer
primary/foreign key mappings graphically or why not let DbVisualizer chart your
database with its advanced charting options.
For purposes of developing and populating the ASDB, DbVisualizer has been used to
interface with the databases on several levels. While the majority of the initial and
follow-on population efforts were done through mass updates using Perl scripts, for


                                            54
                     The ASDB System and Software Design Document


purposes of maintaining the database, DbVisualizer has been used to do small-scale
individual updates through its grid-level input capabilities. This capability has been
explained in the metadata chapter.

32 CS-RCS
Component Software RCS (Revision Control System) monitors changes made in files
that are accessed by stand-alone or networked workstations. Based on the widely used
GNU RCS, Component Software RCS is fully integrated with Windows. RCS helps you
manage multiple file revisions by keeping a complete history of changes made to files.
You can see how and when a file was changed, and quickly return to a previous revision
of a file. This capability is crucial for files that are edited frequently, such as program
source files, word-processor documents and HTML documents.
For purposes of ASDB development, Component Software RCS has been used to manage
PTI-produced documents and programs on both the unclassified and classified networks.
The repositories are organized according to the CSCIs that were developed from the
Statement of Work. Within each CSCI additional folders have been created that
represent the software components or SCs. This helps to further organize the information
within the RCS. The PTI staff developed all of the documents contained within the
ComponentSoftware RCS repository. Materials provided from sources outside of the PTI
Data Management Division are stored in the Native Library – as described in the
metadata chapter.




                          Figure 36: Component Software RCS Repository



                                              55
                      The ASDB System and Software Design Document


Once all of the files have been checked in (see CS-RCS tutorial for specifics on check-in
and check-out procedures), any member of the team can accomplish revisions to those
files. Analysts are able to keep track of work others are doing and coordinate their efforts
with their team members. As a version of the ASDB is complete and distributed, PTI
using the tool’s mile stoning capabilities to specify which files and which versions of
those files were used for a specific release.

33 ExamDiff
ExamDiff is a freeware Windows tool for visual file comparison. Here is what an
ExamDiff      screen    after    comparing       files    may     look     like:




                           Figure 37: Examdiff Used For Visual Comparisons

Examdiff can be downloaded from http://www.prestosoft.com/. Once downloaded the
self extracting installation file is intuitive to execute. There is an excellent on-line tutorial
and helpfile for novice users.
PTI analysts have used ExamDiff as a visual comparison tool for purposes of checking
through the original datasets and the ASDB-generated DSMs. This is what we use to
accomplish our difference checks.


                                                 56
                     The ASDB System and Software Design Document


34 ERwin
CA’s AllFusion ERwin Data Modeler is the industry’s leading data modeling solution. It can
greatly enhance the design, generation and maintenance of high-quality, high-performance
databases, data warehouses and enterprise data resource models.
From a logical model of your information requirements and business rules that define your
database to a physical model optimized for the specific characteristics of your target
database, AllFusion ERwin Data Modeler helps you visually determine the proper structure,
key elements and optimal design for your database.

35 NASIC Software
The National Air and Space Intelligence Center provided five conversion programs for use
in developing Brawler datasets. These programs are classified and are stored on the SGI in
the following directory:
brawler_stuff/JSF.PARADIGM.CONVERTERS.22May02.
Information about their use and context is documented in the classified library:
\\Asdbdb\ASDB\Library\Native Library\Brawler Native\Brawler 6.4 Native\Brawler 6.4
Native Documentation.




                                           57

						
Related docs