

Data Conversion & Integration
Workshop on International Standards, Contemporary Technologies and Regional Cooperation,
Noumea, New Caledonia, 04–08 February 2008
Data Conversion/Integration Process
• Data Inventory
• Existing hard-copy maps / digital data
• Data Collection (additional)
• Satellite Imagery, Aerial Photo, etc.
• Field Collection (hand-held devices: GPS, etc.)
• Data Input/Conversion
• Keyboard entry of coordinates
• Digitizing/Scanning/Raster-to-Vector
• Editing/Building Topology
• Data Integration
• Georeferencing/Geocoding
About Geographic Data
• Conversion of hardcopy to digital maps is the most time-
consuming task in GIS
• Up to 80% of project costs
• Example: estimated to be a US $10 billion annual market
• Labor intensive, tedious and error-prone
Data Inventory
• National overview maps
• 1:250,000 and 1:5,000,000 (small scale)
• show major civil divisions, urban areas, physical
features such as roads, rivers, lakes, elevation, etc.
• used for planning purposes
Data Inventory (cont.)
• Topographic maps: scales range from 1:25,000 to
1:250,000 (mid-scale)
• Town and city maps at large cartographic scales, showing
roads, city blocks, parks, etc. (1:1,000 to 1:5,000)
• Maps of administrative units at all levels of civil division
• Thematic maps showing population distribution for
previous census dates, or any features that may be useful
for census mapping
Existing Digital Data
• Digital maps
• Satellite imagery
• GPS coordinates
• Etc.
Data Collection
[Diagram labels: Capture (Aerial Photography, Remote Sensing, Surveying, GPS, Maps, Census & Surveys); GDB; GIS; Management]
Aerial photography
• Aerial photography is obtained using specialized cameras
on board low-flying planes. The camera captures the image
digitally or on photographic film.
• Aerial photography is the method of choice for mapping
applications that require high accuracy and fast completion
of the tasks.
• Photogrammetry—the science of obtaining measurements
from photographic images.
Aerial photography (cont.)
• Traditional end product: printed photos
• Today: digital image (scanned from photo) in standard graphics
format (TIFF, JPEG) that can be integrated in a GIS or desktop
mapping package
• Trend: fully digital process
• digital orthophotos
• corrected for camera angle, atmospheric distortions and
terrain elevation
• georeferenced in a standard projection (e.g. UTM)
• geometric accuracy of a map
• large detail of a photograph
Remote sensing process
[Diagram labels: Sources of Energy; Sensing System; Receiving station; Earth Surface]
GPS
• Collection of point data
• Stored as “waypoints”
• Accuracy dependent on device and environmental variables
Surveying
• Paper Based
• Manual recording of information
• Electronic Based
• Handheld device
Geographic data input/conversion
• Keyboard entry of coordinates
• Digitizing
• Scanning and raster to vector conversion
• Field work data collection using
• Global positioning systems
• Air photos and remote sensing
Keyboard entry
• keyboard entry of coordinate data
• e.g., point lat/long coordinates
• from a gazetteer (a listing of
place names and their
coordinates)
• from locations recorded on a
map
Latitude/Longitude coordinate conversion
• Latitude is the y-coordinate, longitude is the x-coordinate
• Common format is degrees, minutes, seconds
113° 15′ 23″ W, 21° 56′ 07″ N
• To represent lat/long in a GIS, we need to convert to
decimal degrees
-113.25639, 21.93528
• DD = D + (M + S / 60) / 60
• West longitudes and south latitudes are negative in decimal degrees
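The conversion formula above can be sketched in a few lines of Python. This is a minimal illustration only; the function name dms_to_dd and the hemisphere-letter argument are assumptions made for the example, not part of any GIS package.

```python
def dms_to_dd(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    # DD = D + (M + S / 60) / 60, as on the slide.
    dd = degrees + (minutes + seconds / 60) / 60
    # South latitudes and west longitudes are negative by convention.
    if hemisphere.upper() in ("S", "W"):
        dd = -dd
    return round(dd, 5)


# The slide's example: 113 deg 15' 23" W, 21 deg 56' 07" N
lon = dms_to_dd(113, 15, 23, "W")
lat = dms_to_dd(21, 56, 7, "N")
print(lon, lat)
```

Rounding to five decimal places matches the precision used in the slide's example; a real project would pick the precision that suits its source data.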
Data Conversion
• Conversion is often the easiest way to import digital
spatial data into a GIS
• Data transfer often relies on the exchange of data in
mostly proprietary file formats using the import/export
functions of commercial GIS packages
• Open-source data conversion software is becoming widely
available
Conversion of hardcopy maps to digital data
• Turning features that are visible on a hardcopy map into
digital point, line, polygon, and attribute information
• In many GIS projects this is the step that requires by far
the most time and resources
• Newer methods are arising to minimize this arduous step
Conversion of hardcopy maps to digital data (cont.)
• Digitizing
• Manual digitizing
• Heads-up digitizing
• Scanning
• Raster-to-Vector
Manual Digitizing
Most common form of coordinate data input
• Requires a digitizing table
• Ranging in size (25 x 25 cm to 150 x 200 cm)
• Ideally the map should be flat and not torn or folded
• Cost: hundreds (300) to thousands (5000)
Digitizing steps (how points are recorded)
• trace features to be digitized with pointing device (cursor)
• point mode: click at positions where direction changes
• stream mode: digitizer automatically records position at
regular intervals or when cursor moved a fixed distance
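As a rough illustration of the distance-based variant of stream mode, the sketch below keeps a cursor position only once the cursor has moved a fixed distance from the last recorded point. The function name stream_mode and the sample trace are hypothetical; real digitizing software handles this internally.

```python
import math


def stream_mode(cursor_positions, min_distance):
    """Record a position only when it is at least min_distance from the last recorded one."""
    if not cursor_positions:
        return []
    recorded = [cursor_positions[0]]
    for x, y in cursor_positions[1:]:
        last_x, last_y = recorded[-1]
        if math.hypot(x - last_x, y - last_y) >= min_distance:
            recorded.append((x, y))
    return recorded


# Hypothetical raw cursor trace in digitizer units.
trace = [(0.0, 0.0), (0.4, 0.0), (1.1, 0.0), (1.5, 0.0), (2.2, 0.1), (2.3, 0.1)]
vertices = stream_mode(trace, min_distance=1.0)
print(vertices)
```

A smaller distance follows curves more closely but produces more vertices to store and edit, which is the usual trade-off against point mode.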
Control Points
• If a large map is digitized in several stages and the map
has to be removed from the digitizing table occasionally, the
control points allow the exact re-registration of the map on
the digitizing board.
• Control points are chosen for which the real-world
coordinates in the base map’s projection system are known.
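One common way to use control points is to fit an affine transformation from digitizer units to real-world coordinates. The sketch below assumes exactly three non-collinear control points and solves for the six affine parameters with Cramer's rule. The function names and the sample coordinates are illustrative assumptions; production software typically uses more control points and a least-squares fit.

```python
def solve_3x3(m, v):
    """Solve the 3x3 linear system m * p = v with Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = v[r]
        solution.append(det(replaced) / d)
    return solution


def fit_affine(table_pts, world_pts):
    """Return a function mapping digitizer (x, y) to world (X, Y)."""
    m = [[x, y, 1.0] for x, y in table_pts]
    a, b, c = solve_3x3(m, [wx for wx, _ in world_pts])
    d, e, f = solve_3x3(m, [wy for _, wy in world_pts])

    def to_world(x, y):
        # X = a*x + b*y + c, Y = d*x + e*y + f
        return (a * x + b * y + c, d * x + e * y + f)

    return to_world


# Hypothetical control points: digitizer units (cm) and projected coordinates (m).
table = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
world = [(500000.0, 4000000.0), (501000.0, 4000000.0), (500000.0, 4001000.0)]
to_world = fit_affine(table, world)
print(to_world(5.0, 5.0))
```

Re-registering a map after it has been moved on the table amounts to digitizing the same control points again and refitting the transformation.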
Digitizing table
• Grid of wires in the table creates a magnetic field which is
detected by the cursor
• X/Y coordinates in digitizing units are fed directly into GIS
• High precision in coordinate recording
[Figure: digitizing table with x and y axes]
Heads-Up Digitizing I
• Features are traced from a map drawn on a transparent
sheet attached to the screen
• An option if no digitizer is available, but accuracy is very low
Heads-Up Digitizing II
• Common today is heads-up digitizing, where the operator
uses a scanned map, air photo or satellite image as a
backdrop and traces features with a mouse
• This method yields more accurate results
• Quicker and easier to retrace and save steps
Heads-Up Digitizing II
• Raster-scanned image on the computer screen
• Operator follows lines on-screen in vector mode
Digitizing Errors
• Undershoots
• Dangles
• Spurious Polygons
Digitizing errors
• Any digitized map requires considerable post-processing
• Check for missing features
• Connect lines
• Remove spurious polygons
• Some of these steps can be automated
Fixing Errors
• Some of the common digitizing errors shown in the figure
can be avoided by using the digitizing software’s snap
tolerances that are defined by the user
• For example, the user might specify that all endpoints of a
line that are closer than 1 mm from another line will
automatically be connected (snapped) to that line
• Small sliver polygons that are created when a line is
digitized twice can also be automatically removed
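The snap-tolerance idea can be sketched as follows: an endpoint is moved onto the nearest position on another line if that position lies closer than the tolerance. The function names and sample coordinates are assumptions for illustration; the tolerance is in the same units as the coordinates (for example, millimetres on the source map).

```python
import math


def closest_point_on_segment(p, a, b):
    """Return the point on segment a-b that is nearest to p."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a
    # Projection parameter along the segment, clamped to the segment ends.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_endpoint(endpoint, target_line, tolerance):
    """Snap endpoint onto target_line (a list of vertices) if it lies within tolerance."""
    best, best_dist = endpoint, tolerance
    for a, b in zip(target_line, target_line[1:]):
        q = closest_point_on_segment(endpoint, a, b)
        dist = math.hypot(endpoint[0] - q[0], endpoint[1] - q[1])
        if dist < best_dist:
            best, best_dist = q, dist
    return best


road = [(0.0, 0.0), (10.0, 0.0)]
print(snap_endpoint((5.0, 0.6), road, tolerance=1.0))  # undershoot within tolerance
print(snap_endpoint((5.0, 3.0), road, tolerance=1.0))  # too far away, left unchanged
```

Choosing the tolerance is a judgment call: too small and undershoots remain, too large and features that should stay separate get merged.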
Advantages and Disadvantages of Digitizing
Advantages
• It is easy to learn and thus does not require expensive
skilled labor
• Attribute information can be added during digitizing
process
• High accuracy can be achieved through manual
digitizing; i.e., there is usually minimal loss of accuracy
compared to the source map
Advantages and Disadvantages of Digitizing
Disadvantages
• It is a tedious activity, possibly leading to operator
fatigue and resulting quality problems which may require
considerable post-processing
• It is slow. Large-scale data conversion projects may thus
require a large number of operators and digitizing tables
• The accuracy of digitized maps is limited by the quality
of the source material
Scanning
A viable alternative to digitizing
• The map is placed onto the scanning surface where light
is directed at the map at an angle
• A photosensitive device records the intensity of light
reflected for each cell or pixel in a very fine raster grid
• In gray scale mode, the light intensity is converted
directly into a numeric value, for example into a number
between 0 (black) and 255 (white)
• In binary mode, the light intensity is converted into
white or black (0/1) cell values according to a threshold
light intensity
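The binary mode described above amounts to thresholding the gray-scale values. In this minimal sketch the threshold of 128 and the choice of 1 for dark cells (ink) and 0 for light cells (paper) are assumptions; the slide does not say which value is assigned to which.

```python
def to_binary(gray_rows, threshold=128):
    """Convert gray-scale cells (0 = black ... 255 = white) to 0/1 cells."""
    # Assumption: cells darker than the threshold become 1 (ink), others 0 (paper).
    return [[1 if value < threshold else 0 for value in row] for row in gray_rows]


# Hypothetical 3 x 4 gray-scale scan containing a short dark diagonal line.
scan = [
    [250, 245, 30, 240],
    [248, 20, 25, 251],
    [15, 243, 247, 249],
]
binary = to_binary(scan)
for row in binary:
    print(row)
```

In practice the threshold depends on the paper and ink of the source map and is usually tuned by inspecting a test scan.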
Scanning
• Electronic detector moves across map and records light
intensity for regularly shaped pixels
• Flat-bed scanner
• Drum-scanner (pictured)
Scanning (cont.)
Types of scanners
• Flat
• small format, low cost, good for small tasks
• Drum
• high precision but expensive and slow
• Feed
• fast, good precision, lower cost than drum
[Figure: scanner diagram; labels: R, G, B, computer, color splicing, optical sensor, pixel width]
Scanning (cont.)
• direct use of scanned images
• e.g., scanned air-photos
• digital topographic maps in raster format
Scanning (cont.)
• Scanner output is a raster data set that usually needs to be
converted into a vector representation
- manually (on-screen digitizing)
- automated (raster-to-vector conversion),
line tracing, e.g., MapScan
• Often requires considerable editing
Advantages and Disadvantages of Scanning
Advantages
• Scanned maps can be used as image backdrops for
vector information
• Scanned topographic maps can be used in combination
with digitized EA boundaries for the production of
enumerator maps
• Clear base maps or original color separations can be
vectorized relatively easily using raster-to-vector
conversion software
• Small-format scanners are relatively inexpensive and
provide quick data capture
Advantages and Disadvantages of Scanning
Disadvantages
• Converting large maps with small-format scanners
requires tedious re-assembly of the individual parts
• Large format, high-throughput scanners are expensive
• Despite recent advances in vectorization software
associated with scanning, considerable manual editing
and attribute labeling may still be required
Raster to Vector Conversion
Gets scanned/image data into vector format
• Automatic mode: the system converts all lines on the
raster image into sequences of coordinates automatically.
The automated raster-to-vector process starts with a line-
thinning algorithm
• Semi-automatic mode: the operator clicks on each line
that needs to be converted; the system then traces that line to
the nearest intersections and converts it into a vector
representation
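The slides do not name a particular line-thinning algorithm. As one well-known example, the sketch below implements Zhang-Suen thinning on a small binary grid (1 = line cell, 0 = background); it is an illustration of the idea, not the method used by any specific vectorization package.

```python
def zhang_suen_thin(grid):
    """Thin the 1-valued regions of a binary grid towards one-cell-wide lines."""
    img = [row[:] for row in grid]
    rows, cols = len(img), len(img[0])
    # Neighbours P2..P9, clockwise starting from the cell directly above.
    offsets = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

    def neighbours(r, c):
        return [img[r + dr][c + dc] if 0 <= r + dr < rows and 0 <= c + dc < cols else 0
                for dr, dc in offsets]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(rows):
                for c in range(cols):
                    if img[r][c] != 1:
                        continue
                    n = neighbours(r, c)
                    p2, _, p4, _, p6, _, p8, _ = n
                    count = sum(n)
                    # Number of 0 -> 1 transitions walking once around the cell.
                    transitions = sum(1 for a, b in zip(n, n[1:] + n[:1])
                                      if a == 0 and b == 1)
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= count <= 6 and transitions == 1 and ok:
                        to_clear.append((r, c))
            # Cells are cleared together after each sub-iteration.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img


# A three-cell-thick horizontal bar, as a scanned thick line might appear.
bar = [[0] * 9] + [[0] + [1] * 7 + [0] for _ in range(3)] + [[0] * 9]
thin = zhang_suen_thin(bar)
for row in thin:
    print("".join("#" if v else "." for v in row))
```

After thinning, only cells along the middle of the bar remain, and a tracing step can then convert the one-cell-wide line into a sequence of coordinates.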
OBIA Raster to Vector Conversion
• Object-Based Image Analysis (OBIA) is a tentative name
for a sub-discipline of GIScience devoted to partitioning
remote sensing (RS) imagery into meaningful image-objects,
and assessing their characteristics through spatial, spectral
and temporal scale. At its most fundamental level, OBIA
requires image segmentation, attribution, classification and
the ability to query and link individual objects (a.k.a.
segments) in space and time. In order to achieve this, OBIA
incorporates knowledge from a vast array of disciplines
involved in the generation and use of geographic information (GI).
Object-Based Image Analysis
OBIA Dwelling Identification
• Segmentation based
• Pixel based
• Automated digitizing
Object-Based Image Analysis
• Increasing demand for updated geo-spatial information,
rapid information extraction
• Complex image content of VHSR data needs to be
structured and understood
• Huge amount of data can only be utilized by automated
analysis and interpretation
• New target classes and high variety of instances
• Monitoring systems and update cycles
• Transferability, objectivity, transparency, flexibility
Editing
• Manual digitizing is error prone
• Objective is to produce an accurate representation of the
original map data
• This means that all lines that connect on the map must
also connect in the digital database
• There should be no missing features and no duplicate
lines
• Correct the most common types of errors
• Reconnect disconnected line segments, etc.
Some common digitizing errors
[Figure: examples of a spike, an undershoot, a missing line, an overshoot, and a line digitized twice]
Building Topology
• GIS determines relationships between features in the database
• System will determine intersections between two or more roads
and will create nodes
• For polygon data, the system will determine which lines define
the border of each polygon
• After the completed digital database has been verified to be
error-free, the final step is adding additional attributes
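The node-creation step can be illustrated with a small sketch that assigns node identifiers to line endpoints, records the from-node and to-node of each line, and reports nodes touched by only one line (dangle candidates). It assumes lines have already been split at intersections and that connected endpoints share exactly the same coordinates, for example after snapping. The function name and data layout are hypothetical, not how any particular GIS stores topology.

```python
def build_topology(lines):
    """Assign node ids to line endpoints and record from/to nodes for each arc."""
    node_ids = {}  # endpoint coordinate -> node id
    arcs = []      # (arc id, from node, to node)
    for arc_id, line in enumerate(lines):
        ends = []
        for point in (line[0], line[-1]):
            if point not in node_ids:
                node_ids[point] = len(node_ids)
            ends.append(node_ids[point])
        arcs.append((arc_id, ends[0], ends[1]))

    # Count how many arc ends meet at each node.
    degree = {node: 0 for node in node_ids.values()}
    for _, start, end in arcs:
        degree[start] += 1
        degree[end] += 1
    dangles = sorted(node for node, count in degree.items() if count == 1)
    return node_ids, arcs, dangles


# Three hypothetical road segments meeting at the intersection (5, 5).
roads = [
    [(0, 5), (5, 5)],
    [(5, 5), (10, 5)],
    [(5, 5), (5, 0)],
]
node_ids, arcs, dangles = build_topology(roads)
print(arcs)
print(dangles)
```

A node with only one arc is not always an error (a dead-end road is legitimate), so a list like this is best treated as a set of locations for an operator to check.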
Building Topology
• The building of relationships between objects
• Feature topology describes the spatial relationships between
connecting or adjacent geographic features such as roads
connecting at intersections
• The user typically does not have to worry about how the GIS
stores topological information
Converting Between Different Digital Formats
• All software systems provide links to other formats
• But the number and functionality of import routines varies
between packages
• Problems often occur because software developers are
reluctant to publish the exact file formats that their
systems use, which leads to instability of information
(e.g., file geodatabase [.gdb])
• Option of using a third data format
• Example: AutoCAD’s DXF format
Georeferencing/Geocoding
• Georeferencing
• Converting map coordinates to the real-world
coordinates corresponding to the source map’s
cartographic projection.
• Geocoding
• Attaching codes to the digitized features (geocoded
features)
• For example, each line representing a road would obtain a
code that refers to the road status (dirt road,
one lane road, two lane highway, etc.)
• Or a unique code that can be linked to a list of
street names.
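Attaching codes to features and linking them to lookup lists can be sketched with plain dictionaries. All code values, field names, and the describe function below are hypothetical and chosen only for illustration.

```python
# Hypothetical lookup tables keyed by code.
road_status = {1: "dirt road", 2: "one lane road", 3: "two lane highway"}
street_names = {101: "Tobler Street", 102: "Gall Street"}

# Each digitized line carries codes rather than descriptive text.
features = [
    {"geometry": [(0, 0), (100, 0)], "status_code": 3, "street_id": 101},
    {"geometry": [(0, 0), (0, 80)], "status_code": 1, "street_id": 102},
]


def describe(feature):
    """Resolve a feature's codes into (street name, road status)."""
    name = street_names.get(feature["street_id"], "unknown street")
    status = road_status.get(feature["status_code"], "unknown status")
    return (name, status)


for feature in features:
    print(describe(feature))
```

Keeping only codes on the geometry means that names and classifications can be corrected in one table without re-editing the digitized features.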
For attribute data:
• spreadsheets
• links to external database management systems (DBMS)
• tabulation programs (IMPS, Redatam)
Sample components of a digital EA map
[Figure: sample enumeration area (EA) map shown as component layers. Panels: Street Network; Buildings; Boundaries; Annotation and symbols; Building numbers; Neatlines and legend. Legend of the combined map: Enumeration Area Map; Province: Cartania; District: Chartes 032; Locality: Maptown 0221; EA-Code: 00361. Symbols: District, Locality, EA, Hospital, Church, School, Building number. Approximate scale bar: 0, 50, 100, 200 m. Census 2000, National Statistical Office, July 1998.]
A Simpler Alternative
• In many countries, EA map design may be simpler than in
this example
• Instead of a fully integrated digital base map in vector
format, rasterized images of topographic maps may be
used as a backdrop for EA boundaries
• Use what is already available!
A Simpler Alternative
• In some instances, map features may be more
generalized, for instance by using only the centerlines for
the streets and polygons for entire city blocks rather
than for individual houses
• This can include the use of free data as a baseline or
starting point in the creation or updating of census
related maps
Agencies to contact
• National geographic institute / mapping agency
• Military mapping services
• Province, district and municipal governments
• Various government or private organizations dealing with
spatial data
• Geological or hydrological survey
• Environmental protection authority
• Transport authority
• Utility and communication sector companies
• Land titling & surveying agencies
• Academic institutions
• Donor activities
[Flowchart of the data conversion and integration process; boxes listed in reading order]
• Sources of geographic information
• Identify existing data sources
• Additional geographic data collection
• Source boxes: paper maps, existing printed air photos and satellite images; field mapping products such as sketch maps; GPS coordinate collection; existing digital maps; digital air photos and satellite images
• Data conversion
• Digitizing
• Scanning
• Raster to vector conversion (automated or semi-automated)
• Generate lines and polygons
• Editing geographic features
• Construct topology for geographic features
• Digital map data integration
• Georeferencing (coordinate transformation and projection change)
• Coding (labelling) of digital geographic features
• Combine and integrate digital map sheets
• Additional delineation of EA boundaries
• Parallel activity: develop geographic attributes database