# GEOMETRIC CORRECTION

```
Thematic Information Extraction:
Pattern Recognition

Chapter 9
Classification

Multispectral classification may be performed using a variety of methods,
including:

• algorithms based on parametric and nonparametric statistics that use
ratio- and interval-scaled data and nonmetric methods that can also
incorporate nominal scale data;
• the use of supervised or unsupervised classification logic;
• the use of hard or soft (fuzzy) set classification logic to create hard or
fuzzy thematic output products;
• the use of per-pixel or object-oriented classification logic, and
hybrid approaches.
• Parametric methods such as maximum likelihood classification and
unsupervised clustering assume normally distributed remote sensor data and
knowledge about the forms of the underlying class density functions.

• Nonparametric methods such as nearest-neighbor classifiers, fuzzy classifiers,
and neural networks may be applied to remote sensor data that are not normally
distributed and without the assumption that the forms of the underlying
densities are known.

• Nonmetric methods such as rule-based decision tree classifiers can operate on
both real-valued data (e.g., reflectance values from 0 to 100%) and nominal
scaled data (e.g., class 1 = forest; class 2 = agriculture).
Supervised Classification

   In a supervised classification, the identity and location of some of the land-
cover types (e.g., urban, agriculture, or wetland) are known a priori through a
combination of fieldwork, interpretation of aerial photography, map analysis,
and personal experience.
   The analyst attempts to locate specific sites in the remotely sensed data that
represent homogeneous examples of these known land-cover types.
   These areas are commonly referred to as training sites because the spectral
characteristics of these known areas are used to train the classification
algorithm for eventual land-cover mapping of the remainder of the image.
   Multivariate statistical parameters (means, standard deviations, covariance
matrices, correlation matrices, etc.) are calculated for each training site. Every
pixel both within and outside the training sites is then evaluated and assigned to
the class of which it has the highest likelihood of being a member.
Unsupervised Classification

   In an unsupervised classification, the identities of land-cover
types to be specified as classes within a scene are not generally
known a priori because ground reference information is lacking
or surface features within the scene are not well defined.

   The computer is required to group pixels with similar spectral
characteristics into unique clusters according to some statistically
determined criteria.

   The analyst then re-labels and combines the spectral clusters into
information classes.
Hard vs. Fuzzy Classification

• Supervised and unsupervised classification algorithms typically use hard
classification logic to produce a classification map that consists of hard,
discrete categories (e.g., forest, agriculture).

• Conversely, it is also possible to use fuzzy set classification logic, which takes
into account the heterogeneous and imprecise nature of the real world.
Per-pixel vs. Object-oriented Classification

• In the past, most digital image classification was based on processing the
entire scene pixel by pixel. This is commonly referred to as per-pixel
classification.

• Object-oriented classification techniques allow the analyst to decompose the
scene into many relatively homogeneous image objects (referred to as patches
or segments) using a multi-resolution image segmentation process. The
various statistical characteristics of these homogeneous image objects in the
scene are then subjected to traditional statistical or fuzzy logic classification.

• Object-oriented classification based on image segmentation is often used for
the analysis of high-spatial-resolution imagery (e.g., 1 × 1 m Space Imaging
IKONOS and 0.61 × 0.61 m DigitalGlobe QuickBird).
Land-use and Land-cover Classification Schemes

• Land cover refers to the type of material present on the
landscape (e.g., water, sand, crops, forest, wetland, human-made
materials such as asphalt).

•    Land use refers to what people do on the land surface (e.g.,
agriculture, commerce, settlement).

The classes in a classification scheme should be mutually exclusive, exhaustive,
and hierarchical:

• Mutually exclusive means that there is no taxonomic overlap (or fuzziness)
among classes (e.g., deciduous forest and evergreen forest are distinct classes).

• Exhaustive means that all land-cover classes present in the landscape are
accounted for and none have been omitted.

• Hierarchical means that sublevel classes (e.g., single-family residential,
multiple-family residential) may be combined into a sensible higher-level
category (e.g., residential). This allows simplified thematic maps to be
produced when required.

It is also important for the analyst to realize that there is a
fundamental difference between information classes and spectral
classes.

• Information classes are those that human beings define.

• Spectral classes are those that are inherent in the remote sensor
data and must be identified and then labeled by the analyst.

Certain hard classification schemes can readily incorporate land-use and/or land-cover data
obtained by interpreting remotely sensed data, including:

• the American Planning Association Land-Based Classification System, which is oriented
toward detailed land-use classification;

• the United States Geological Survey Land-Use/Land-Cover Classification System for Use
with Remote Sensor Data and its adaptation for the U.S. National Land Cover
Dataset and the NOAA Coastal Change Analysis Program (C-CAP);

• the U.S. Department of the Interior Fish & Wildlife Service Classification of Wetlands and
Deepwater Habitats of the United States;

• the U.S. National Vegetation Classification System; and

• the International Geosphere-Biosphere Program (IGBP) Land Cover Classification System
modified for the creation of MODIS land-cover products.
U.S. Geological Survey Land-Use/Land-Cover Classification System

[Table: Four levels of the U.S. Geological Survey Land-Use/Land-Cover
Classification System for Use with Remote Sensor Data and the type of remotely
sensed data typically used to provide the information.]
Selecting the Optimum Bands for Image Classification:
Feature Selection

• Once the training statistics have been systematically collected from each band
for each class of interest, a judgment must be made to determine the bands
(channels) that are most effective in discriminating each class from all others.
• This process is commonly called feature selection. The goal is to delete from
the analysis the bands that provide redundant spectral information. In this way
the dimensionality (i.e., the number of bands to be processed) in the dataset
may be reduced.
• This minimizes the cost of the digital image classification process (but should
not affect the accuracy). Feature selection may involve both statistical and
graphical analysis to determine the degree of between-class separability in the
remote sensor training data.
• Using statistical methods, combinations of bands are normally ranked
according to their potential ability to discriminate each class from all others
using n bands at a time.
Select the Appropriate Classification Algorithm

• Various supervised classification algorithms may be used to assign an unknown pixel to
one of m possible classes. The choice of a particular classifier or decision rule depends
on the nature of the input data and the desired output. Parametric classification
algorithms assume that the observed measurement vectors Xc obtained for each class in
each spectral band during the training phase of the supervised classification are
Gaussian; that is, they are normally distributed. Nonparametric classification algorithms
make no such assumption.

• Widely adopted nonparametric classification algorithms include:
• one-dimensional density slicing,
• parallelepiped,
• minimum distance,
• nearest-neighbor, and
• neural network and expert system analysis.

• The most widely adopted parametric classification algorithm is maximum likelihood.
Unsupervised Classification

Unsupervised classification (commonly referred to as clustering) is an effective
method of partitioning remote sensor image data in multispectral feature
space and extracting land-cover information.

Compared to supervised classification, unsupervised classification normally
requires only a minimal amount of initial input from the analyst. This is
because clustering does not normally require training data.

• Unsupervised classification is the process where numerical operations are
performed that search for natural groupings of the spectral properties of pixels,
as examined in multispectral feature space.

• The clustering process results in a classification map consisting of m spectral
classes. The analyst then attempts a posteriori (after the fact) to assign or
transform the spectral classes into thematic information classes of interest (e.g.,
forest, agriculture).

• This may be difficult. Some spectral clusters may be meaningless because they
represent mixed classes of Earth surface materials. The analyst must understand
the spectral characteristics of the terrain well enough to be able to label certain
clusters as specific information classes.

Hundreds of clustering algorithms have been developed. Two
examples of conceptually simple but not necessarily efficient
clustering algorithms will be used to demonstrate the fundamental
logic of unsupervised classification of remote sensor data:

•    clustering using the Chain Method
•    clustering using the Iterative Self-Organizing Data Analysis
Technique (ISODATA).
Clustering Using the Chain Method

The Chain Method clustering algorithm operates in a two-pass mode (i.e., it passes
through the multispectral dataset two times).

Pass #1: The program reads through the dataset and sequentially builds clusters
(groups of points in spectral space). A mean vector is then associated with
each cluster.

Pass #2: A minimum distance to means classification algorithm is applied to the
whole dataset on a pixel-by-pixel basis whereby each pixel is assigned to one
of the mean vectors created in pass 1. The first pass, therefore, automatically
creates the cluster signatures (class mean vectors) to be used by the minimum
distance to means classifier.
ISODATA Clustering

• The Iterative Self-Organizing Data Analysis Technique (ISODATA) represents
a comprehensive set of heuristic (rule of thumb) procedures that have been
incorporated into an iterative classification algorithm. Many of the steps
incorporated into the algorithm are a result of experience gained through
experimentation.

• The ISODATA algorithm is a modification of the k-means clustering algorithm,
which includes a) merging clusters if their separation distance in multispectral
feature space is below a user-specified threshold and b) rules for splitting a
single cluster into two clusters.

• ISODATA is iterative because it makes a large number of passes through the
remote sensing dataset until specified results are obtained, instead of just two
passes.

• ISODATA does not allocate its initial mean vectors based on the analysis of
pixels in the first line of data the way the two-pass algorithm does. Rather, an
initial arbitrary assignment of all Cmax clusters takes place along an
n-dimensional vector that runs between very specific points in feature space. The
region in feature space is defined using the mean, µk, and standard deviation,
σk, of each band in the analysis. This method of automatically seeding the
original Cmax vectors makes sure that the first few lines of data do not bias the
creation of clusters.

ISODATA is self-organizing because it requires relatively little human input. A
sophisticated ISODATA algorithm normally requires the analyst to specify the
following criteria:

• Cmax: the maximum number of clusters to be identified by the algorithm (e.g.,
20 clusters). However, it is not uncommon for fewer to be found in the final
classification map after splitting and merging take place.

• T: the percentage of pixels whose class values must remain unchanged between
iterations for the algorithm to stop (a convergence threshold). When this
percentage is reached, the ISODATA algorithm terminates. Some datasets may never
reach the desired percentage unchanged. If this happens, it is necessary to
interrupt processing and edit the parameter.

• M: the maximum number of times ISODATA is to classify pixels and
recalculate cluster mean vectors. The ISODATA algorithm terminates when
this number is reached.

• Minimum members in a cluster (%): If a cluster contains fewer than the minimum
percentage of members, it is deleted and the members are assigned to an
alternative cluster. This also affects whether a class is going to be split (see
maximum standard deviation). The default minimum percentage of members is
often set to 0.01.
[Figure: Analog and Digital Image Analysis]
Object-oriented Image Segmentation

• This need has given rise to the creation of image classification algorithms based
on object-oriented image segmentation. The algorithms incorporate both
spectral and spatial information in the image segmentation phase.
• The result is the creation of image objects defined as individual areas with
shape and spectral homogeneity which one may recognize as segments or
patches in the landscape. In many instances, carefully extracted image objects
can provide a greater number of meaningful features for image classification.
• In addition, objects don’t have to be derived from just image data but can also
be developed from any spatially distributed variable (e.g., elevation, slope,
aspect, population density).
• Homogeneous image objects are then analyzed using traditional classification
algorithms (e.g., nearest-neighbor, minimum distance, maximum likelihood) or
knowledge-based approaches and fuzzy classification logic.

• There are many algorithms that can be used to segment an image into relatively
homogeneous image objects. Most can be grouped into two classes:
• edge-based algorithms, and
• area-based algorithms.

• Unfortunately, the majority do not incorporate both spectral and spatial
information, and very few have been used for remote sensing digital image
classification.

One of the most promising approaches to remote sensing image segmentation was
developed by Baatz and Schäpe (2000). The image segmentation involves
looking at individual pixel values and their neighbors to compute (Baatz et
al., 2001):

• a color criterion (hcolor), and

• a shape or spatial criterion (hshape).

• The object-oriented classification of a segmented image is substantially
different from performing a per-pixel classification.
• First, the analyst is not constrained to using just spectral information. He or
she may choose to use
a) the mean spectral information in conjunction with
b) various shape measures associated with each image object
(polygon) in the dataset.
• This introduces flexibility and robustness. Once selected, the spectral and
spatial attributes of each polygon can be input to a variety of classification
algorithms for analysis (e.g., nearest-neighbor, minimum distance,
maximum likelihood).

```
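
The two-pass logic described under "Clustering Using the Chain Method" can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the algorithm as implemented in any particular package: the function names (`build_clusters`, `classify_min_distance`) and the single `radius` threshold that decides when a pixel opens a new cluster are hypothetical, because the slides do not say how pass 1 decides to start a cluster.

```python
import math

def distance(a, b):
    """Euclidean distance between two measurement vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_clusters(pixels, radius):
    """Pass 1: read pixels sequentially and build cluster mean vectors.

    `radius` is a hypothetical threshold: a pixel farther than `radius`
    from every existing mean starts a new cluster.
    """
    means, counts = [], []
    for p in pixels:
        if means:
            i = min(range(len(means)), key=lambda k: distance(p, means[k]))
            if distance(p, means[i]) <= radius:
                # Fold the pixel into the running mean of its cluster.
                counts[i] += 1
                means[i] = [m + (x - m) / counts[i] for m, x in zip(means[i], p)]
                continue
        means.append(list(p))
        counts.append(1)
    return means

def classify_min_distance(pixels, means):
    """Pass 2: assign every pixel to the nearest cluster mean."""
    return [min(range(len(means)), key=lambda k: distance(p, means[k]))
            for p in pixels]

# Toy two-band brightness values: a dark group and a bright group.
pixels = [(10, 12), (11, 13), (50, 52), (12, 11), (51, 50), (49, 51)]
means = build_clusters(pixels, radius=10)
labels = classify_min_distance(pixels, means)
print(means, labels)
```

The cluster numbers returned in `labels` are spectral classes only; as the slides note, the analyst still has to relabel them as information classes afterwards.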
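
The seeding and stopping rules described under "ISODATA Clustering" can be illustrated with a simplified loop. The sketch below rests on several assumptions: it places `c_max` initial mean vectors evenly along the line from (µk − σk) to (µk + σk) in each band, then repeats a minimum-distance assignment until either the percentage of unchanged pixels reaches `t_percent` (the T criterion) or `max_iter` passes have run (the M criterion). The merge, split, and minimum-membership rules of full ISODATA are deliberately left out, so what remains is essentially k-means with ISODATA-style seeding and termination. All function and parameter names are hypothetical.

```python
import statistics

def sq_dist(a, b):
    """Squared Euclidean distance between two measurement vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def seed_means(pixels, c_max):
    """Place c_max initial means evenly between (mu - sigma) and (mu + sigma)."""
    bands = list(zip(*pixels))
    mu = [statistics.mean(b) for b in bands]
    sigma = [statistics.pstdev(b) for b in bands]
    seeds = []
    for i in range(c_max):
        # f runs from -1 to +1 along the seeding vector.
        f = 0.0 if c_max == 1 else -1.0 + 2.0 * i / (c_max - 1)
        seeds.append([m + f * s for m, s in zip(mu, sigma)])
    return seeds

def nearest(p, means):
    """Index of the mean vector closest to pixel p."""
    return min(range(len(means)), key=lambda k: sq_dist(p, means[k]))

def cluster(pixels, c_max, t_percent, max_iter):
    """Iterate assignment and mean update until T or M is reached."""
    means = seed_means(pixels, c_max)
    labels = [None] * len(pixels)
    iteration = 0
    for iteration in range(1, max_iter + 1):
        new_labels = [nearest(p, means) for p in pixels]
        unchanged = sum(a == b for a, b in zip(labels, new_labels))
        labels = new_labels
        # Recalculate each cluster mean from its current members.
        for k in range(len(means)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:  # an empty cluster keeps its previous mean
                means[k] = [statistics.mean(b) for b in zip(*members)]
        if 100.0 * unchanged / len(pixels) >= t_percent:
            break
    return labels, means, iteration

# Toy two-band brightness values: a dark group and a bright group.
pixels = [(10, 12), (11, 13), (12, 11), (50, 52), (51, 50), (49, 51)]
labels, means, passes = cluster(pixels, c_max=2, t_percent=95, max_iter=10)
print(labels, means, passes)
```

Because the seeds come from band statistics rather than from the first pixels read, reordering `pixels` does not change where the clusters start, which is the point the slides make about avoiding bias from the first few lines of data.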