GEOG Introduction to GIS

					GIS data model and data collection
     Learning objectives

   Define what geographic data models are and discuss their
    importance in GIS
   Outline the main geographic models used in GIS and their
    strengths and weaknesses
   Understand key topological concepts and why topology is
   Describe data collection work flows
   Understand the primary data capture techniques in remote
    sensing and surveying
   Become familiar with the secondary data capture techniques and
    understand the principles of data transfer and geographic data
     GIS data modeling

   Data modeling is a process of abstraction from the real world
    for the purpose of representation in a GIS (or other information
    system):
       How we organize data for use in a GIS
       The structure of the digital representation
       Converting the Earth into numbers that the computer understands
   Maps and GIS are models of reality (geospatial phenomena).
   A GIS emphasizes some aspects of reality in a database
       GIS Data Modeling

  [Figure: stages of data modeling, ranging from human oriented to computer oriented]
    Real world to conceptual model

   Perception is subjective
   Conceptual model: a formalized model to represent the real
    world with some simplification
     GIS Conceptual Data Models

   Object-view vs. Field-view of reality:
       Object-view: collection of discrete objects with spatial reference
        (discrete entities) e.g., buildings, trees, rivers, roads, bridges, towers…
            The object-view uses points, lines, or polygons to represent objects
             with discrete boundaries.

       Field-view: geographic phenomena that vary continuously throughout
        space e.g., elevation, temperature, air pressure, gravity, soil
            The field-view uses a grid of cells (rows and columns) to represent
             these continuous phenomena
      GIS data models

   Vector data model
    Discrete features, such as customer locations and data
    summarized by area, are usually represented using the vector
    model.
   Raster data model
    Continuous numeric values, such as elevation, and continuous
    categories, such as vegetation types, are represented using the
    raster model.
      Vector Data Model

   A coordinate-based data structure commonly used to represent
    linear geographic features.
   Vector data represents each feature as a row in a table, and
    feature shapes are defined by x,y locations in space (the GIS
    connects the dots to draw lines and outlines.)
   Features can be discrete locations or events, lines, or areas.
   Locations, such as the address of a customer or the spot where a crime
    was committed, are represented as points having a pair of
    geographic coordinates.
   Lines, such as streams or roads, are represented as a series of
    coordinate pairs.
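
The idea that each vector feature is a table row whose shape is a list of x,y pairs can be sketched in Python. This is a minimal illustration only: the `Feature` class, its field names, and the sample coordinates are assumptions made for the example, not part of any particular GIS package.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class Feature:
    """One row of a vector layer (hypothetical structure for illustration)."""
    fid: int
    geom_type: str            # "point", "line", or "polygon"
    coords: list              # list of (x, y) coordinate pairs
    attributes: dict = field(default_factory=dict)


def line_length(coords):
    """Sum the straight-segment lengths between consecutive vertices."""
    return sum(hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(coords, coords[1:]))


# A point is a single coordinate pair; a line is a series of pairs.
customer = Feature(1, "point", [(3.0, 4.0)], {"name": "customer A"})
road = Feature(2, "line", [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)], {"class": "road"})

print(line_length(road.coords))  # segments of 5.0 and 6.0 -> 11.0
```

"Connecting the dots" here is simply walking the coordinate list in order, which is why the length of a line falls out of a single pass over its vertices.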
      Vector Data Model

   Represents features on the earth's surface as points, lines or
    polygons.
   Point: nodes or vertices
   Line: arc or chain
   Polygon: area feature.
   Vector data is usually contained within the Layer database
    structure where points, lines and polygons are stored on separate
    layers.
   The question is how to build intelligence into a computer so that
    it can know not only the individual locations of specific spatial
    entities, but where things are in relation to other things. This is
    known as defining topology for a spatial database
      Vector Database Structure
   In the vector data model, x,y locations are used to specify point locations.
    These point locations can either represent point features (e.g., a building), or
    they can specify the start and end point of line segments. (eg. Point Attribute

   Arcs (lines) are composed of one or more straight line segments which join
    point vertices. An arc is described by its start node, which is the start point of
    the first segment, and its end node, which is the end point of the last
    segment.
         Since all area in our vector database belongs to some polygon (the area outside
         the mapped features being the universe polygon), if we travel along an arc in the
         direction it was digitized, we can specify a left polygon and a right polygon.
         Arcs can describe line features or form the boundary of polygons.

   Polygons are described by a unique identifier, and a list of the arcs that make
    up the polygon boundary. One arc can form the boundary between two
    polygons.
   Topology is the science and mathematics of relationships used to validate the
    geometry of vector entities, and for operations such as network tracing and
    tests of polygon adjacency

   Defining topology means adding enough additional information to a
    dataset that the computer can carry out spatial analysis functions,
    such as determining whether two features are adjacent.

   Topology explicitly defines spatial relationships. The principle in practice is
    quite simple; spatial relationships are expressed as lists. For example, a
    polygon is defined by the list of arcs comprising its border.
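
The "relationships as lists" principle above can be sketched with a small arc table. This is a hedged illustration: the `Arc` class, the polygon labels `A`, `B`, and `U` (standing for the universe polygon), and the three-arc dataset are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """An arc with its start/end nodes and the polygons on either side."""
    arc_id: int
    start_node: int
    end_node: int
    left_poly: str
    right_poly: str


# Two parcels, A (west) and B (east), share arc 2. All arcs run from node 1
# (south) to node 2 (north); "U" is the universe (outside) polygon.
arcs = [
    Arc(1, 1, 2, left_poly="U", right_poly="A"),  # outer west boundary of A
    Arc(2, 1, 2, left_poly="A", right_poly="B"),  # shared boundary
    Arc(3, 1, 2, left_poly="B", right_poly="U"),  # outer east boundary of B
]


def polygon_arcs(arcs):
    """Build the polygon -> list-of-arc-ids table from the arc records."""
    polys = {}
    for arc in arcs:
        for poly in (arc.left_poly, arc.right_poly):
            polys.setdefault(poly, []).append(arc.arc_id)
    return polys


def are_adjacent(arcs, p, q):
    """Two polygons are adjacent if some arc has one on each side."""
    return any({arc.left_poly, arc.right_poly} == {p, q} for arc in arcs)


print(polygon_arcs(arcs))          # {'U': [1, 3], 'A': [1, 2], 'B': [2, 3]}
print(are_adjacent(arcs, "A", "B"))  # True
```

Because the left and right polygons are stored with each arc, the adjacency test needs no coordinate geometry at all; it only scans the list.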
    Advantages and Disadvantages of Vector data model

   Advantages
       High Precision in locating where objects are
       Less Disk space
       Maps are more aesthetically pleasing

   Disadvantages
       Expensive to gather and clean data (digitizing)
       Expensive and time consuming to explicitly define topology
       Complex operations may be computationally time consuming
Application of Vector GIS

   Mostly used in utilities applications,

   Government applications, such as voting districts

   Planning applications, such as parcel mapping
    Raster Data Model

   Raster models are developed by building up a grid of cells, or pixels:
   Divides space into a matrix of cells (pixels) called a raster.
   Each cell has a single value attached to it. If more than one value occurs at
    a location, a rule must decide which value to store at that location
   In raster data, topology is implicitly defined by the matrix: neighbouring
    cells are found from their row and column positions
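
Two points from the list above lend themselves to a short sketch: choosing one value per cell, and finding a cell's neighbours from its row and column alone. The "most frequent value wins" rule and the function names are assumptions chosen for illustration; other rules (for example, value at the cell centre) are equally possible.

```python
from collections import Counter


def dominant_value(values):
    """Pick a single value for a cell: the most frequent observation."""
    return Counter(values).most_common(1)[0][0]


def neighbors(grid, row, col):
    """Return the values of the 4-connected neighbours of a cell."""
    n_rows, n_cols = len(grid), len(grid[0])
    found = []
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # up, down, left, right
        r, c = row + d_row, col + d_col
        if 0 <= r < n_rows and 0 <= c < n_cols:
            found.append(grid[r][c])
    return found


grid = [
    ["forest", "forest", "water"],
    ["forest", "urban",  "water"],
    ["urban",  "urban",  "water"],
]

print(dominant_value(["forest", "water", "forest"]))  # forest
print(neighbors(grid, 1, 1))  # ['forest', 'urban', 'forest', 'water']
```

No adjacency information is stored anywhere in `grid`; the neighbour lookup relies only on matrix positions.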
     Raster Input Methods / Sources:

   Rasterization of vector data / conversion of other data formats
   Scanning
   Remote Sensing Imagery
   DEMs (digital elevation models)

[Figure: raster data examples, thematic and continuous]
     Advantages and Disadvantages of Raster data model

   Advantages
       Layer overlays are fast and simple
       Topology is implicit in the grid structure, so it need not be built
       gathering raster data from scanning or remotely sensed images is cheap
        and fast

   Disadvantages
       The resolution of the dataset determines which features can be
        represented; coarse raster cells lose small features
       A fine-resolution dataset takes up large amounts of disk space
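
Both the overlay advantage and the disk-space disadvantage can be shown in a few lines. This is a sketch under stated assumptions: the two layers share the same extent and cell size, and the layer names, suitability values, and the 10 km extent are invented for the example.

```python
def overlay(layer_a, layer_b, combine):
    """Combine two aligned raster layers cell by cell."""
    return [[combine(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(layer_a, layer_b)]


def cell_count(width_m, height_m, cell_size_m):
    """Number of cells needed to cover an extent at a given cell size."""
    cols = -(-width_m // cell_size_m)   # ceiling division
    rows = -(-height_m // cell_size_m)
    return int(rows * cols)


# 1 = suitable, 0 = unsuitable (hypothetical suitability layers)
slope_ok = [[1, 0], [1, 1]]
soil_ok = [[1, 1], [0, 1]]

# A cell is suitable only where both layers are suitable.
print(overlay(slope_ok, soil_ok, min))   # [[1, 0], [0, 1]]

# A 10 km x 10 km area at 100 m and at 10 m cell size.
print(cell_count(10_000, 10_000, 100))   # 10000
print(cell_count(10_000, 10_000, 10))    # 1000000
```

The overlay is a single pass over matching cells, while making the cells ten times finer multiplies the number of stored values by one hundred.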
      Applications of Raster

   Most Raster Applications are environmental

   Natural resource management

   Deforestation
     Object data model

   An object is a self-contained package of information describing
    the characteristics and capabilities of an entity under study
   A collection of objects of the same type is called a class
   Examples of objects are oil wells, soil associations, stream
    catchments, and aircraft flight paths
   Three key facets of object data models
      Encapsulation

      Inheritance

      Polymorphism
         Object data model

   Encapsulation describes the fact that each object packages together a
    description of its state and behavior
      Example: a forest object packages its dominant tree type, average tree age, etc.

   Inheritance is the ability to reuse some or all of the characteristics of one
    object in another object
        Example: in a gas facility system, a new type of gas valve could be created by
         overwriting or adding properties to a similar existing valve
   Polymorphism describes the process whereby each object has its own
    specific implementation for operations like draw, create, and delete
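
The three facets map directly onto class features in an object-oriented language. The sketch below follows the gas valve example; the class names, attributes, and drawing symbols are hypothetical and do not come from any real utility data model.

```python
class Valve:
    """Encapsulation: state (diameter, open/closed) and behaviour together."""

    def __init__(self, valve_id, diameter_mm):
        self.valve_id = valve_id
        self.diameter_mm = diameter_mm
        self.is_open = True

    def close(self):
        self.is_open = False

    def draw(self):
        return f"valve {self.valve_id}: circle symbol"


class PressureReliefValve(Valve):
    """Inheritance: reuses Valve and adds one new property."""

    def __init__(self, valve_id, diameter_mm, release_pressure_kpa):
        super().__init__(valve_id, diameter_mm)
        self.release_pressure_kpa = release_pressure_kpa

    def draw(self):
        # Polymorphism: same operation name, class-specific implementation.
        return f"valve {self.valve_id}: triangle symbol"


network = [Valve(1, 50), PressureReliefValve(2, 80, 400)]
symbols = [valve.draw() for valve in network]
print(symbols)  # ['valve 1: circle symbol', 'valve 2: triangle symbol']
```

The calling code asks every object to `draw()` without checking its type; each class supplies its own version.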
        GIS data collection

 GIS can contain a wide variety of geographic data types originating from many
    diverse sources.

 From the perspective of creating geographic databases, it is convenient to
    classify raster and vector geographic data as primary and secondary.

   Primary geographic data sources are captured specifically for use in GIS by
    direct measurement.
       GIS data collection

   Typical examples of primary GIS sources include digital remote-sensing
    images such as SPOT and IKONOS Earth Satellite images and digital aerial
    photographs, and vector building survey measurements captured using a total
    survey station.
   Secondary sources are those that were originally captured for another purpose
    and need to be converted into a form suitable for use in a GIS project. In other
    words, secondary sources are those reused from earlier studies.

 Typical secondary sources include raster scanned color aerial photographs of
  urban areas, and USGS or other paper maps that can be scanned and vectorized.
Primary data capture
Secondary data capture
     GIS data collection

   Geographic data may be obtained in either digital or analog
    format.

   Analog data must always be digitized before being added to a
    geographic database.

   Depending on the format and characteristics of the digital
    data, considerable reformatting and restructuring
    may be required prior to import.
        Data collection work flow
   Planning: Establishing user requirements, garnering resources (staff, hardware,
    and software) and developing a project plan.

   Preparation: Obtaining data, redrafting poor-quality map sources, editing
    scanned map images, and removing noise

   Digitizing/transfer

   Editing/improvement: Techniques to validate the data, as well as correcting
    errors and improving quality

   Evaluation: The process of identifying project successes and failures, along
    with data quality assessment and quality control (QA/QC)
    Primary Geographic Data Capture- Raster data
   Remote sensing is the science of acquiring information about the Earth's
    surface without actually being in contact with it. This is done by sensing
    and recording reflected or emitted energy and processing, analyzing, and
    applying that information.
   In much of remote sensing, the process involves an interaction between
    incident radiation and the targets of interest.
        Primary Geographic Data Capture- Vector data

   Ground Surveying is based on the principle that the 3D location
    of any point can be determined by measuring angles and
    distances from other known points.
   Traditionally, surveyors used equipment like theodolites to
    measure angles, and tapes and chains to measure distances.
   Today these have been replaced by electro-optical devices called
    total stations, which can measure both angles and distances to an
    accuracy of 1 mm.
   Surveying is typically used for capturing buildings, land
    boundaries, and other objects that need to be located accurately.
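
The surveying principle above (a 3D location from angles and a distance measured at a known point) reduces to a little trigonometry. The sketch assumes the instrument reports an azimuth measured clockwise from north, a zenith angle measured from the vertical, and a slope distance; it ignores instrument and target heights. The function name and sample values are hypothetical.

```python
from math import sin, cos, radians


def polar_to_xyz(station, azimuth_deg, zenith_deg, slope_dist):
    """Compute a target's (x, y, z) from one observation at a known station."""
    x0, y0, z0 = station
    azimuth, zenith = radians(azimuth_deg), radians(zenith_deg)
    horizontal = slope_dist * sin(zenith)   # horizontal component of the distance
    rise = slope_dist * cos(zenith)         # height difference to the target
    return (x0 + horizontal * sin(azimuth),  # easting
            y0 + horizontal * cos(azimuth),  # northing
            z0 + rise)


# Known station at (1000, 2000, 50); target sighted due east, level, 100 m away.
x, y, z = polar_to_xyz((1000.0, 2000.0, 50.0), 90.0, 90.0, 100.0)
print(round(x, 3), round(y, 3), round(z, 3))  # 1100.0 2000.0 50.0
```

Each newly fixed point can then serve as the known station for the next observation.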
Primary Geographic Data Capture- Vector data

   GPS (Global Positioning System)
      GPS is a collection of 27 NAVSTAR satellites orbiting the Earth at
       a height of 12,500 miles, five monitoring stations, and individual
       receivers.
      GPS was originally funded by the U.S. Department of Defense, and for
       many years only military users had access to the most accurate
       data.
      Fortunately, this selective availability was removed in May 2000,
       so that now civilian and military users can fix x, y, z locations of
       objects relatively easily to an accuracy of better than 10 m with
       standard equipment.
     Secondary Geographic Data Capture

   Geographic Data Capture from secondary sources is the process of creating raster
    and vector files and databases from maps and other hard copy documents.
   Scanning is used to capture raster data.
   Table digitizing, heads-up digitizing, stereo-photogrammetry, and COGO data entry are
    used for the vector data.
   Scanned maps and documents are used extensively in GIS as background
    maps and data stores.
   Three reasons to scan hardcopy media for use in GIS:
        1. Documents, such as building plans and CAD drawings, are scanned to improve
         access, provide integrated database storage, and index them geographically.
        2. Film and paper maps, aerial photographs, and images are scanned and
         georeferenced so that they provide geographic context for other data (typically
         vector layers).
        3. Maps, aerial photos, and images are also scanned prior to vectorization.
      Vector Data Capture

   Manual digitizing is still the easiest and cheapest method of capturing
    vector data from existing maps.

   Heads-up digitizing and vectorization:

   The simplest way to create vectors from raster layers is to digitize vector
    objects manually straight off a computer screen using a mouse or digitizing
    cursor (heads-up digitizing). Converting raster data into vector data is
    called vectorization; the reverse is called rasterization.
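
Rasterization is the easier direction to show in code. The sketch below marks a cell when its centre falls inside a polygon, using a standard ray-casting point-in-polygon test. The cell-centre rule, the grid origin at the lower-left corner, and the sample triangle are assumptions made for the example.

```python
def point_in_polygon(x, y, ring):
    """Ray casting: count boundary crossings of a ray running right from (x, y)."""
    inside = False
    n = len(ring)
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):                      # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(ring, n_rows, n_cols, cell_size):
    """Return a grid with 1 where a cell centre lies inside the polygon.

    Row 0 is the top row; the grid's lower-left corner is at (0, 0).
    """
    grid = []
    for row in range(n_rows):
        y = (n_rows - row - 0.5) * cell_size
        grid.append([1 if point_in_polygon((col + 0.5) * cell_size, y, ring) else 0
                     for col in range(n_cols)])
    return grid


triangle = [(0.0, 0.0), (4.5, 0.0), (0.0, 4.5)]
for line in rasterize(triangle, 4, 4, 1.0):
    print(line)
# [1, 0, 0, 0]
# [1, 1, 0, 0]
# [1, 1, 1, 0]
# [1, 1, 1, 1]
```

The stepped edge in the output shows the usual cost of rasterization: the smooth vector boundary is approximated at the resolution of the cells.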
        Vector Data Capture

   Photogrammetry
       The science and technology of making measurements from pictures, aerial
        photographs, and images.
       Measurements are captured from overlapping pairs of photographs using
        stereo plotters. These build a model and allow 3D measurements to be
        captured, edited, stored, and plotted.
       Photogrammetry techniques are particularly suitable for highly accurate
        capture of contours, digital elevation models, and almost any type of object
        that can be identified on an aerial photograph or image.
       Orthophotographs result from using a DEM to correct distortions in an
        aerial photograph derived from varying land elevation. They have become
        popular because of their relatively low cost of creation and ease of
        interpretation as base maps.
        Vector Data Capture

   COGO
       Coordinate geometry.
       It is a vector data structure and method of data entry.
       The COGO system is widely used in North America to represent land
        records and property parcels
       COGO uses survey style bearings and distances to define each part of an
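
Entering a parcel from bearings and distances can be sketched as a simple traverse. The example assumes quadrant bearings written as text such as `N 90 E`; that notation, the function names, and the 100 m square parcel are illustrative assumptions, not the syntax of any specific COGO package.

```python
from math import sin, cos, radians


def bearing_to_azimuth(bearing):
    """Convert a quadrant bearing such as 'N 90 E' to an azimuth in degrees."""
    north_south, angle, east_west = bearing.split()
    angle = float(angle)
    if north_south == "N":
        return angle if east_west == "E" else 360.0 - angle
    return 180.0 - angle if east_west == "E" else 180.0 + angle


def traverse(start, legs):
    """Walk each (bearing, distance) leg from the start point; return all corners."""
    x, y = start
    points = [start]
    for bearing, distance in legs:
        azimuth = radians(bearing_to_azimuth(bearing))
        x += distance * sin(azimuth)   # change in easting
        y += distance * cos(azimuth)   # change in northing
        points.append((x, y))
    return points


# A 100 m square parcel described the way a survey plan would list it.
legs = [("N 0 E", 100.0), ("N 90 E", 100.0), ("S 0 E", 100.0), ("N 90 W", 100.0)]
corners = traverse((0.0, 0.0), legs)
closure_x, closure_y = corners[-1]
print([(round(px, 3), round(py, 3)) for px, py in corners[:-1]])
```

The distance between the last computed point and the start point is the closure error, a convenient check on the entered measurements.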
    Obtaining data from external sources
    (Data Transfer)

 Whether to build or buy a database?
 The best way to find geographic data is to search the Internet using one of
  the specialist geographic search engines, such as the US NSDI
  Clearinghouse or the Geography Network.
 Problem: Data encoded in different formats.
 Solution:
     Direct translation (for small systems that involve the translation of a
      small number of formats).
     Neutral intermediate format (efficient for large systems).
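
Why a neutral intermediate format pays off for large systems can be shown by counting translators. The sketch assumes every translator is one-way (format X to format Y), so direct translation between n formats needs n(n - 1) translators, whereas a neutral format needs one reader and one writer per format, or 2n. The function names are hypothetical.

```python
def direct_translators(n_formats):
    """One-way translators needed to convert between every pair of formats."""
    return n_formats * (n_formats - 1)


def neutral_translators(n_formats):
    """One-way translators needed when everything passes through a neutral format."""
    return 2 * n_formats


for n in (3, 10, 50):
    print(n, direct_translators(n), neutral_translators(n))
# 3 6 6
# 10 90 20
# 50 2450 100
```

With three formats the two approaches cost the same; beyond that, the direct approach grows quadratically while the neutral approach grows linearly.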
