Principles of Geographical Information Systems - PowerPoint

               Data Input and GIS

                   refer to Chapter 4
      Data input, verification, storage, and output

                       Text Book
Burrough, P. A. and R. A. McDonnell, 1998. Principles of
 Geographical Information Systems. Oxford University
                    Press, London.
•   Input of spatial data
•   Modes of data input
•   Rasterization and vectorization
•   Map preparation and digitizing
•   Remote Sensing: Special Raster Data Input
•   Integrating different data sources
•   External Databases
•   Exercise
            Input of spatial data
• Need to have tools to transform spatial data of
  various types into digital format
• Data input is a major bottleneck in application
  of GIS technology. Costs of input often
  consume 80% or more of project costs
• Many commercial GIS operations generate
  most of their revenue through data input
• Data input is labor intensive, tedious, and
  error-prone
• There is a danger that construction of the
  database may become an end in itself and the
  project may not move on to analysis of the
  data collected
       Input of spatial data (continued)
• Essential to find ways to reduce costs and
  maximize accuracy
• Need to automate the input process as much as
  possible, but automated input can create
  bigger editing problems later
• Source documents (maps) may often have to be
  redrafted to meet rigid quality requirements of
  automated input
• Sharing of digital data is one way around
  the input bottleneck. More and more spatial
  data is becoming available in digital form
       Input of spatial data (continued)
• Data input to a GIS involves encoding both the
  locational and attribute data
• The locational data is encoded as coordinates on
  a particular cartesian coordinate system
• Source maps may have different projections and
  scales
• Several stages of data transformation may be
  needed to bring all data to a common coordinate
  system
• Attribute data is often obtained and stored in
  tables (Database Management System)
       Input of spatial data (continued)

• There are two methods for spatial data
  input
• Primary methods
  Surveying, Photogrammetry, GPS, and Remote
  Sensing
• Secondary methods
  Digitization, Automatic line following,
  and scanning
             The input subsystem

• Designed to transfer data into the GIS from
  external sources (attribute and map data)
• Must allow for encoding in either raster or
  vector format
• Must provide a means for spatial referencing
  (projections, cartesian coordinate systems, etc)
• Must provide link between storage and editing
  subsystems (ensure input can be saved and any
  errors corrected)
       Modes of data input: Input Devices

•   Grid overlay
•   Keyboard
•   Digitizer
•   Scanner
•   Data in digital format (Total station, digital
    photogrammetry, remote sensing, GPS)
                  Grid overlay

• Grid on clear material is overlaid on map
• Identity of each cell in the grid is determined
  by what map features are in a particular cell
• Number or code is assigned to each class of
  map features, and used to label cells in grid
• After filling in the grid, numbers or codes are
  typed into the computer to produce a raster
• Pretty antiquated method, seldom used
                    Keyboard

• Keyboard entry of coordinates (X, Y, Z), (φ, λ, h),
  or angle and distance
• Input through keyboard is time consuming,
  but it is more accurate
• It is suitable for small areas, i.e. when the
  number of points/lines/areas is limited
• Because of its high accuracy, sometimes it
  is used in applications that need high quality
  e.g. cadastral mapping
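
Keyboard entry of an angle and a distance comes down to a small coordinate calculation. The following is a minimal sketch, assuming the angle is a bearing measured clockwise from grid north in degrees and the coordinates are planar eastings and northings. The function name `point_from_bearing` and the station values are hypothetical.

```python
import math

def point_from_bearing(easting, northing, bearing_deg, distance):
    """Return the (easting, northing) reached from a known point.

    Assumption: bearing_deg is measured clockwise from grid north.
    """
    theta = math.radians(bearing_deg)
    return (easting + distance * math.sin(theta),
            northing + distance * math.cos(theta))

# A boundary corner 100 m due east (bearing 90 degrees) of a known station.
e, n = point_from_bearing(1000.0, 2000.0, 90.0, 100.0)
print(round(e, 3), round(n, 3))
```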
         Digitizing: Digitizing Tablet
• The tablet consists of a flat surface, in which a grid
  of electronically active wires is embedded, and a
  mouse-like device (puck or stylus), usually with cross
  hairs. When the puck is moved over the tablet, its
  location is known because the grid of wires senses its
  position. The puck also has buttons that allow
  communication with the computer
• The grid acts like a Cartesian (X,Y) coordinate
  system. To input data, the map is taped to the
  digitizing tablet. The puck is placed over the feature
  of interest, and a message is sent to the computer
  through the buttons on the puck, e.g. a node is used to
  mark the beginning and end of a line feature, or the
  point where a polygon closes on itself
• Digitization is the process of converting
  existing maps to digital form (vector format)
• A digitizer is connected to a computer and
  map features are followed manually
• Digitizers are available in different sizes
  (A4, A3, A2, A0) and accuracies
  (e.g. 0.05 mm)
• Examples of digitizers are the CalComp 9500
  and Summagraphics tablets
• Digitizing the map contents can be done in two
  different modes: point or stream
• Point mode: the operator identifies the points
  to be captured explicitly by pressing a button
• Stream mode: points are captured at set time
  intervals (typically 10 per second) or on
  movement of the cursor by a fixed amount
• In point mode the operator selects points
  subjectively; two point-mode operators will
  not code a line in the same way
• Stream mode generates large numbers of
  points, many of which may be redundant
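
The distance-triggered variant of stream mode can be sketched in a few lines. This is an illustration only, not the logic of any particular digitizer driver: the function name `stream_capture`, the cursor track, and the 1 mm threshold are all hypothetical.

```python
import math

def stream_capture(track, min_move):
    """Keep a cursor position only when it has moved at least min_move
    from the last captured point (distance-triggered stream mode)."""
    if not track:
        return []
    captured = [track[0]]
    for x, y in track[1:]:
        last_x, last_y = captured[-1]
        if math.hypot(x - last_x, y - last_y) >= min_move:
            captured.append((x, y))
    return captured

# Hypothetical cursor track reported every 0.25 mm along a straight line.
track = [(i * 0.25, 0.0) for i in range(21)]   # 0.0 mm to 5.0 mm
points = stream_capture(track, min_move=1.0)
print(len(track), "positions reduced to", len(points), "points")
```

Even with the threshold, every captured point on this straight line except the two ends is redundant, which is why stream-mode data usually needs thinning afterwards.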
             Digitizing: Problems
• Paper maps are unstable: each time the map is
  removed from the digitizing table, the reference
  points must be re-entered when the map is
  affixed to the table again
• If the map has stretched or shrunk in the
  interim, the newly digitized points will be
  slightly off in their location when compared to
  previously digitized points
• Errors occur on these maps, and these errors
  are entered into the GIS database as well;
  the level of error in the GIS database is directly
  related to the error level of the source maps
      Digitizing: Problems (continued)
• Maps are meant to display information, and do
  not always accurately record locational
  information, for example, when a railroad,
  stream and road all go through a narrow
  mountain pass, the pass may actually be
  depicted wider than its actual size to allow for
  the three symbols to be drafted in the pass
• Discrepancies across map sheet boundaries can
  cause discrepancies in the total GIS database,
  e.g. roads or streams that do not meet exactly
  when two map sheets are placed next to each
  other
• User error causes overshoots and undershoots
• Types of scanners: line following and drum
• A line-following scanner is placed on a line and
  follows it using a guiding device such as a laser
• Two shortcomings:
   – 1. Lines are sampled at regular time or distance
     intervals (more complex parts of the line should have
     more samples, less complex parts need fewer)
   – 2. Where lines converge and then diverge (e.g. contours
     along a cliff, road intersections, etc.), the device
     does not know which line is which; broken lines
     (dashed, interrupted by a label, etc.) are also a problem
• Line following technology can be reproduced in
  a software environment (line tracing software)
                    Drum scanners
• Drum scanners (Fig 5.2, p. 129): as the drum rotates
  about its axis, a scanner head containing a light source
  and photo-detector reads the reflectivity of the target
  graphic and, by digitizing this signal, creates a single
  column of pixels from the graphic. The scanner head then
  moves along the axis of the drum to create the next column
  of pixels, and so on through the entire scan
• Systems may have a scan spot size of as little as 25
  micrometers and be able to scan graphics of the order
  of 1 meter on a side
• An alternative mechanism involves an array of
  photo-detectors which extract data from several rows of
  the raster simultaneously. The detector moves across the
  document in a swath; when all the columns have been
  scanned, the detector moves to a new swath of rows
• Initially, scanning produces a raster image, which can be
  converted to vector using on-screen digitizing or
  automated line tracing software

• Scanning is a process of converting existing
  maps to digital form (raster format)
• A scanner is connected to a computer and
  map features are scanned automatically
• Scanners are available in different sizes (A4,
  A3, A2, A0) and resolutions (300 dpi,
  600 dpi, 1000 dpi)
• Examples of scanners are UMAX-S12, HP

• Scanners are generally very expensive
• Editing can take nearly as long as manual
  digitizing would have taken
• Scanners should be thought of as time-saving
  devices only when maps are clear, show good
  contrast, and contain relatively simple
  content
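
To judge what the quoted scanner resolutions (300, 600, 1000 dpi) mean on the ground, the pixel size on paper can be multiplied by the map scale. The sketch below assumes a hypothetical source map at 1:50,000; the function name `ground_pixel_size_m` is illustrative.

```python
MM_PER_INCH = 25.4

def ground_pixel_size_m(dpi, scale_denominator):
    """Ground distance (metres) covered by one scanned pixel."""
    pixel_mm = MM_PER_INCH / dpi            # pixel size on the paper map
    return pixel_mm * scale_denominator / 1000.0

# Assumed map scale of 1:50,000 (not taken from the slides).
for dpi in (300, 600, 1000):
    print(dpi, "dpi ->", round(ground_pixel_size_m(dpi, 50_000), 2), "m per pixel")
```

By the same arithmetic, a digitizer accuracy of 0.05 mm corresponds to 2.5 m on the ground at 1:50,000.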
         Rasterization and vectorization
• Regardless of input device, it is necessary to determine
  if the final product will be raster or vector
• Most GIS programs allow conversion between the two,
  but problems are involved
• When converting vector to raster, the choice of cell size
  is important, but the results are generally satisfactory
• When converting raster to vector, lines become blocky and
  step-like, and the procedure cannot be reversed to recover
  the original vector line. Resolution (cell size) also has a
  direct effect on the spatial integrity of the object
• Spline algorithms apply a smoothing function to
  vector lines
• Examples of software that converts from raster to vector or
  vice versa are R2V and ArcScan under Arc/Info
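
The effect of cell size on vector-to-raster conversion can be seen with a small sketch. It is not the algorithm used by R2V or ArcScan; it simply samples a line segment densely and records the cells the samples fall in, an approximation chosen for brevity. The function name and coordinates are hypothetical.

```python
def rasterize_line(x0, y0, x1, y1, cell_size, samples=1000):
    """Return the set of (col, row) cells touched by a line segment.

    A simple sketch: the segment is sampled densely and each sample is
    assigned to the cell that contains it (not an exact grid traversal).
    """
    cells = set()
    for i in range(samples + 1):
        t = i / samples
        x = x0 + t * (x1 - x0)
        y = y0 + t * (y1 - y0)
        cells.add((int(x // cell_size), int(y // cell_size)))
    return cells

coarse = rasterize_line(0.5, 0.5, 9.5, 2.5, cell_size=5.0)
fine = rasterize_line(0.5, 0.5, 9.5, 2.5, cell_size=1.0)
print(len(coarse), "cells at cell size 5;", len(fine), "cells at cell size 1")
```

At the coarse cell size the sloping line collapses into a single row of cells, so its direction cannot be recovered when converting back to vector.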
     Map Preparation and Digitizing
• Identification of features to be digitized. Sometimes
  marked directly on the map or on clear overlay.
  Sometimes, identification of nodes vs. vertices
• The digitizing process usually starts with telling the
  computer about the coordinate system that the map is
  in. The digitizer operates in its own Cartesian coordinate
  system, so a relationship must be established between
  digitizer coordinates and map coordinates (transformation)
• Registration points, or tic marks, are identified. They allow
  you to remove the map from the tablet so others can use it,
  then put it back and re-register the input system
  using the tic marks. It is essential to locate these precisely
  because they provide the reference for all other spatial
  data entered
Reference Frameworks and Transformations

• Digitizer coordinates must be transformed to
  map coordinates using a minimum of four
  registration marks to account for:
  Translation: object movement
  Rotation: reorientation of the object
  Scale change: adjustment of the object size
      (Figure 5.4, p. 133, DeMers)
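
One way to combine translation, rotation, and scale change is a four-parameter similarity transformation fitted to the registration marks by least squares. The sketch below is an assumption about how this could be done, not the method of any specific GIS package. The function names and tic coordinates are hypothetical.

```python
import math

def fit_similarity(digitizer_pts, map_pts):
    """Least-squares fit of X = a*x - b*y + tx, Y = b*x + a*y + ty.

    (a, b) encode scale and rotation together; (tx, ty) is the translation.
    """
    n = len(digitizer_pts)
    mx = sum(x for x, _ in digitizer_pts) / n
    my = sum(y for _, y in digitizer_pts) / n
    mX = sum(X for X, _ in map_pts) / n
    mY = sum(Y for _, Y in map_pts) / n
    num_a = num_b = den = 0.0
    for (x, y), (X, Y) in zip(digitizer_pts, map_pts):
        dx, dy, dX, dY = x - mx, y - my, X - mX, Y - mY
        num_a += dx * dX + dy * dY
        num_b += dx * dY - dy * dX
        den += dx * dx + dy * dy
    a, b = num_a / den, num_b / den
    return a, b, mX - a * mx + b * my, mY - b * mx - a * my

def to_map(params, x, y):
    """Apply the fitted transformation to one digitizer coordinate."""
    a, b, tx, ty = params
    return a * x - b * y + tx, b * x + a * y + ty

def rms_error(params, digitizer_pts, map_pts):
    """Root-mean-square misfit at the registration marks."""
    sq = 0.0
    for (x, y), (X, Y) in zip(digitizer_pts, map_pts):
        px, py = to_map(params, x, y)
        sq += (px - X) ** 2 + (py - Y) ** 2
    return math.sqrt(sq / len(digitizer_pts))

# Hypothetical tic marks: tablet coordinates (cm) and map coordinates (m).
tablet = [(1.0, 1.0), (11.0, 1.0), (11.0, 9.0), (1.0, 9.0)]
ground = [(5100.0, 2100.0), (6100.0, 2100.0), (6100.0, 2900.0), (5100.0, 2900.0)]
params = fit_similarity(tablet, ground)
a, b, tx, ty = params
print("scale:", math.hypot(a, b), "rotation (deg):", math.degrees(math.atan2(b, a)))
print("RMS error at tics:", rms_error(params, tablet, ground))
print("tablet (6, 5) ->", to_map(params, 6.0, 5.0))
```

Two marks would be enough to solve for the four parameters; the extra marks provide redundancy, so the RMS error gives a check on how precisely the tics were located.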
Setting up the digitizing environment to handle errors
• Fuzzy tolerance attempts to account for errors caused
  by the "shaky hand". It is based on the idea that you will
  not be able to place the cursor at exactly the same location
  twice, and essentially defines a maximum separation
  distance. If two nodes are within the fuzzy tolerance,
  they are snapped together. The same idea applies to line
  features. It can be set before digitizing starts or
  applied in the post-digitizing editing process
• Other variables: map material shrinks and swells with
  changes in humidity and temperature, so a stable
  medium such as plastic (Mylar) is preferred
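
The snapping idea behind fuzzy tolerance can be sketched as follows. This is a simplified illustration, assuming each node is snapped to the first previously entered node within the tolerance. Real packages may resolve clusters differently, and the function name, node coordinates, and tolerance value are hypothetical.

```python
import math

def snap_nodes(nodes, tolerance):
    """Snap each node to the first earlier node lying within `tolerance`."""
    snapped = []
    for x, y in nodes:
        for sx, sy in snapped:
            if math.hypot(x - sx, y - sy) <= tolerance:
                x, y = sx, sy          # reuse the earlier node's position
                break
        snapped.append((x, y))
    return snapped

# The third node was meant to coincide with the first but missed slightly.
nodes = [(10.0, 10.0), (20.0, 15.0), (10.03, 9.98), (30.0, 5.0)]
print(snap_nodes(nodes, tolerance=0.05))
```

Too large a tolerance would merge nodes that are genuinely distinct, so the value should be chosen with the map scale and feature spacing in mind.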
                   What to Input
• Define your purpose beforehand, make sure the
  data you are using are suitable for the goals of your
  project, and pre-plan carefully
• Use data accurate enough for your purpose, but not
  data that are more accurate than it requires
• Check to see if data are already available
• Keep coverages simple and use the same map to extract
  different coverages when possible
• How Much to Input:
   – Scale dependent
   – General rule - more complex features at larger
     scales require more detail (more vertices, smaller
     cell size)
• Sample more for more information
          Methods of Vector Input

• Manual digitizing, Registration marks
• Location of nodes: lines must not become nodes
  and nodes must not become just points
• Building of topology
• Correcting of digitizing errors
• Transformation and projection
• Adding attribute data
• Checking the accuracy of attribute data
             Methods of Raster Input
• Presence/absence method: if the object occurs anywhere in
  a cell, it is recorded as present. Advantage: simplicity.
  Best method for coding points and lines (Fig. 5.6, p. 143)
• Centroid-of-cell method: presence is recorded only if the
  object is at the center of the cell. Disadvantage: less
  simple, requires calculating the centroid and the location
  of the object relative to it. Generally restricted to raster
  encoding of polygons
• Dominant type method: commonly used for encoding
  polygons into raster format. An object is recorded as
  present if it occupies more than 50% of the cell
• Percent occurrence: encodes not only
  presence/absence but also % occurrence (Urban/Rural)
• Generally, each attribute is recorded as a separate
  coverage e.g., one grid of percent urban, one of percent
  rural, percent water, percent forest, etc.
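
The four encoding rules can be compared on the same cell with a short sketch. It assumes the object is available as a point-in-object test and estimates coverage from a lattice of sample points, which is an approximation. Because of that, the presence test here could miss very small objects that an exact method would catch. The names `coverage_fraction`, `encode_cell`, and `urban` are hypothetical.

```python
def coverage_fraction(inside, col, row, cell_size, n=4):
    """Estimate the fraction of a cell covered by an object by testing an
    n x n lattice of sample points (an approximation, not an exact area)."""
    hits = 0
    for i in range(n):
        for j in range(n):
            x = (col + (i + 0.5) / n) * cell_size
            y = (row + (j + 0.5) / n) * cell_size
            hits += inside(x, y)
    return hits / (n * n)

def encode_cell(inside, col, row, cell_size):
    """Code one cell under each of the four raster input methods."""
    frac = coverage_fraction(inside, col, row, cell_size)
    cx, cy = (col + 0.5) * cell_size, (row + 0.5) * cell_size
    return {
        "presence": frac > 0,              # object occurs anywhere in the cell
        "centroid": bool(inside(cx, cy)),  # object covers the cell center
        "dominant": frac > 0.5,            # object occupies more than 50%
        "percent": 100.0 * frac,           # percent occurrence
    }

# Hypothetical "urban" polygon: a rectangle 1.25 units wide and 1 unit high.
def urban(x, y):
    return 0.0 <= x <= 1.25 and 0.0 <= y <= 1.0

for col in (0, 1):
    print(col, encode_cell(urban, col, 0, cell_size=1.0))
```

The second cell is a quarter covered, so it is coded as urban under presence/absence but not under the centroid or dominant type rules.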
Remote Sensing: Special Raster Data Input
• Remote sensing data is considered special
  raster data (in digital form). Image processing
  software can be used to extract/classify remote
  sensing imagery (covered later in the semester)
• Attention should be paid to geometric and
  radiometric corrections, the method of
  classification (supervised/unsupervised), and
  different radiometric, geometric, and temporal
  resolutions
• Institutional problems related to remote
  sensing data include availability of data
  (limited coverage, cloud cover), cost, education
  and training, and organizational infrastructure
            External Databases
• An efficient method of building a GIS
  database is to use existing data, which limits
  the time and cost necessary to develop the database
• Plenty of data are already available in
  different digital formats and on different media:
  9-inch tape, 8 mm tape, CD-ROM, etc.
• Need to evaluate data for its utility/quality
  for projects and ability to import
• Meta-data or data dictionary should be
  prepared for the GIS database (information
  about the content)
                 Exercise 2: Digitization
• 1. Load any image of the campus into ArcView.
• 2. Create three new themes (point, line, polygon) (hint: View/new
    – a. Create a point theme to show your classroom buildings, where you
      park, and where you have lunch.
    – b. Create a line theme of the paths that you walk between those points
    – c. Create a polygon theme of each building that has a point.
• 3. Give a unique id number to each point, line and polygon in each
  of the new themes' tables. (hint: Table/start editing)
• 4. Change the legend of each theme to show the different id
• 5. Give the points a new symbol that represent what is happening
• 6. Give the lines arrows to show what direction you are walking.
• 7. Create a layout that has a title, north arrow, your name, date,
  and a custom legend.
• 8. To create a custom legend:
    – a. First load the legend into a layout.
    – b. Second select the legend and right click simplify. (This will
      separate the legend)
    – c. Now edit the legend to give it a unique look and take out the
      polygon theme part of the legend.
    – d. Once you are finished with the legend, select everything in the legend.
      When everything is selected, go to graphics/group. This will group the
      graphics back together.
• 9. In the layout, select File and Export the layout to JPEG format,
  but before that make sure the JPEG (JFIF) image support
  Extension is loaded
• What to hand in.
    – 1. JPEG layout
    – 2. Layout contains
             – a. Campus image with three themes
             – b. Custom legend
             – c. Title, scale, north arrow and your name
