									Introduction to Hyperspectral Imaging
      HSI Feature Extraction Methods




        Dr. Richard B. Gomez, Instructor
                             Outline

•   What is Hyperspectral Image Data?
•   Interpretation of Digital Image Data
•   Pixel Classification
•   HSI Data Processing Techniques
    – Methods and Algorithms (Continued)
        •   Principal Component Analysis
        •   Unmixing Pixel Problem
        •   Spectral Mixing Analysis
        •   Other
• Feature Extraction Techniques
    – N-dimensional Exploitation
    – Cluster Analysis
    What is Hyperspectral Image Data?
Hyperspectral image data is image data that is:

• In digital form, i.e., a picture that a computer can read,
manipulate, store, and display

• Spatially quantized into picture elements (pixels)

• Radiometrically quantized into discrete brightness
levels

• Available in the form of Radiance, Apparent Reflectance,
True Reflectance, or Digital Number
Difference Between Radiance and Reflectance

• Radiance is the variable directly measured by remote sensing
instruments
• Radiance has units of watt/steradian/square meter
• Reflectance is the ratio of the amount of light leaving a target to the
amount of light striking the target
• Reflectance has no units
• Reflectance is a property of the material being observed
• Radiance depends on the illumination (both its intensity and
direction), the orientation and position of the target, and the path of the
light through the atmosphere
• Atmospheric effects and the solar illumination can be compensated
for in digital remote sensing data. The result is called "apparent
reflectance"; it differs from true reflectance in that shadows and
directional effects on reflectance have not been accounted for
   Interpretation of Digital Image Data
• Qualitative Approach: Photointerpretation by a
human analyst/interpreter
   • On a scale large relative to pixel size
   • Limited multispectral analysis
   • Inaccurate area estimates
   • Limited use of brightness levels
• Quantitative Approach: Analysis by computer
   • At individual pixel level
   • Accurate area estimates possible
   • Exploits all brightness levels
   • Can perform true multidimensional analysis
     Data Space Representations




• Image Space - Geographic Orientation
• Spectral Signatures - Physical Basis for Response

• N-Dimensional Space - For Use in Pattern Analysis
Hyperspectral Imaging Barriers

• Scene - The most complex and dynamic part

• Sensor - Also not under analyst’s control

• Processing System - Analyst’s choices

[Diagram: processing chain with blocks Sensor, On-Board Processing, Preprocessing, Data Analysis, and Information Utilization; inputs labeled "Ephemeris, Calibration, etc." and "Human Participation with Ancillary Data"]
                 HSI Data Analysis Scheme*
           Finding Optimal Feature Subspaces

             • Feature Selection (FS)
             • Discriminant Analysis Feature Extraction (DAFE)

             • Decision Boundary Feature Extraction (DBFE)

             • Projection Pursuit (PP)


Available in MultiSpec via WWW at: http://dynamo.ecn.purdue.edu/~biehl/MultiSpec/

Additional documentation via WWW at:
http://dynamo.ecn.purdue.edu/~landgreb/publications.html

  *After David Landgrebe, Purdue University
Dimension Space Reduction
                Pixel Classification

• Labeling the pixels as belonging to particular spectral
classes using the spectral data available

• The terms classification, allocation, categorization, and
labeling are generally used synonymously

• The two broad classes of classification procedure are:
supervised classification and unsupervised classification

• Hybrid Supervised/Unsupervised Methods are available
 Classification Techniques



 Unsupervised

 Supervised

 Hybrid
Classification
             Classifier Options

• Correlation Classifier

      $g_i(X) = \dfrac{X^T \mu_i}{\sqrt{(X^T X)(\mu_i^T \mu_i)}}$

• Spectral Angle Mapper

      $g_i(X) = \cos^{-1}\left( \dfrac{X^T \mu_i}{\sqrt{(X^T X)(\mu_i^T \mu_i)}} \right)$

• Matched Filter - Constrained Energy Minimization

      $g_i(X) = \dfrac{X^T C_b^{-1} \mu_i}{\mu_i^T C_b^{-1} \mu_i}$

• Other types - “Nonparametric”
     Parzen Window Estimators
     Fuzzy Set - based
     Neural Network implementations
     K Nearest Neighbor - K-NN
     etc.
              Classification Algorithms
• Linear Spectral Unmixing (LSU)
     • Generates maps of the fraction of each endmember in a pixel
• Orthogonal Subspace Projection (OSP)
     • Suppresses background signatures and generates fraction maps
       like the LSU algorithm
• Spectral Angle Mapper (SAM)
     • Treats a spectrum like a vector; Finds angle between spectra
• Minimum Distance (MD)
     • A simple Gaussian Maximum Likelihood algorithm that does
       not use class probabilities
• Binary Encoding (BE) and Spectral Signature
  Matching (SSM)
     • Bit compare simple binary codes calculated from spectra
                    Unsupervised Classification
K-MEANS
• Use of statistical techniques to group n-dimensional data into their natural spectral
classes
• The K-Means unsupervised classifier uses a cluster analysis approach that requires
the analyst to select the number of clusters to be located in the data, arbitrarily
locates this number of cluster centers, then iteratively repositions them until optimal
spectral separability is achieved
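
The K-Means procedure described above can be sketched in a few lines. This is a minimal illustration, not the implementation of any particular package: the function names, the choice of random pixels as the arbitrary starting centers, and the toy two-band pixels are all assumptions made for the example.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(pixels, k, max_iter=100, seed=0):
    """Group pixels into k clusters; returns (labels, centers)."""
    rng = random.Random(seed)
    # Arbitrary initial centers: k distinct pixels chosen at random (assumption)
    centers = [list(p) for p in rng.sample(pixels, k)]
    labels = None
    for _ in range(max_iter):
        # Assign every pixel to its nearest center
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in pixels
        ]
        if new_labels == labels:
            break  # no pixel changed cluster, so the centers are stable
        labels = new_labels
        # Reposition each center at the mean of its member pixels
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                centers[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centers

# Toy two-band data with two well-separated groups (illustrative values)
pixels = [(10.0, 20.0), (11.0, 21.0), (10.5, 19.5),
          (50.0, 60.0), (51.0, 61.0), (49.5, 60.5)]
labels, centers = k_means(pixels, k=2)
print(labels)
```

The analyst still has to choose k; the sketch stops when no pixel changes cluster or when max_iter is reached.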


ISODATA (Iterative Self-Organizing Data Analysis Technique)
• IsoData unsupervised classification calculates class means evenly distributed in the data
space and then iteratively clusters the remaining pixels using minimum distance techniques
• Each iteration recalculates means and reclassifies pixels with respect to the new means
• This process continues until the number of pixels in each class changes by less than the
selected pixel change threshold or the maximum number of iterations is reached
           Supervised Classification
• Supervised classification requires that the user
select training areas for use as the basis for
classification
• Various comparison methods are then used to
determine if a specific pixel qualifies as a class
member
• A broad range of different classification methods,
such as Parallelepiped, Maximum Likelihood,
Minimum Distance, Mahalanobis Distance, Binary
Encoding, and Spectral Angle Mapper can be used
                  Parallelepiped

• Parallelepiped classification uses a simple
decision rule to classify multidimensional spectral
data
• The decision boundaries form an n-dimensional
parallelepiped in the image data space
• The dimensions of the parallelepiped are defined
based upon a standard deviation threshold from the
mean of each selected class
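
A minimal sketch of the parallelepiped rule follows, assuming the box for each class spans the class mean plus or minus a chosen number of standard deviations in every band. The function names, the default of 2 standard deviations, the first-match tie rule for overlapping boxes, and the training values are assumptions for illustration only.

```python
from statistics import mean, pstdev

def train_boxes(training, n_std=2.0):
    """Build per-class (low, high) bounds for every band.

    training maps a class name to a list of training spectra.
    """
    boxes = {}
    for name, samples in training.items():
        bands = list(zip(*samples))  # one tuple of values per band
        low = [mean(b) - n_std * pstdev(b) for b in bands]
        high = [mean(b) + n_std * pstdev(b) for b in bands]
        boxes[name] = (low, high)
    return boxes

def parallelepiped_classify(pixel, boxes):
    """Return the first class whose box contains the pixel, else None."""
    for name, (low, high) in boxes.items():
        if all(lo <= v <= hi for v, lo, hi in zip(pixel, low, high)):
            return name
    return None  # pixel falls outside every box and stays unclassified

# Hypothetical two-band training areas
training = {
    "grass": [(12, 40), (14, 44), (13, 42)],
    "water": [(5, 8), (6, 10), (7, 9)],
}
boxes = train_boxes(training)
print(parallelepiped_classify((13, 41), boxes))
print(parallelepiped_classify((30, 30), boxes))
```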
              Maximum Likelihood

• Maximum likelihood classification assumes that
the statistics for each class in each band are
normally distributed
• The probability that a given pixel belongs to a
specific class is then calculated
• Unless a probability threshold is selected, all
pixels are classified
• Each pixel is assigned to the class that has the
highest probability (i.e., the "maximum likelihood")
              Minimum Distance
• The minimum distance classification uses the
mean vectors of each region of interest (ROI)

• It calculates the Euclidean distance from each
unknown pixel to the mean vector for each class

• All pixels are classified to the closest ROI class
unless the user specifies standard deviation or
distance thresholds, in which case some pixels
may be unclassified if they do not meet the
selected criteria
Euclidean Distance
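
The minimum distance rule can be sketched as below, assuming the Euclidean distance d(x, m) = sqrt(sum over bands of (x_b - m_b)^2) between a pixel x and a class mean m. The function names, the optional max_distance threshold parameter, and the training values are hypothetical.

```python
import math

def class_means(training):
    """Mean vector of each class; training maps class name -> list of spectra."""
    return {
        name: [sum(band) / len(samples) for band in zip(*samples)]
        for name, samples in training.items()
    }

def minimum_distance_classify(pixel, means, max_distance=None):
    """Assign the pixel to the class with the nearest mean vector.

    If max_distance is given and the nearest mean is farther away than
    that, the pixel is left unclassified (None).
    """
    best = min(means, key=lambda name: math.dist(pixel, means[name]))
    if max_distance is not None and math.dist(pixel, means[best]) > max_distance:
        return None
    return best

# Hypothetical two-band regions of interest
training = {
    "grass": [(12, 40), (14, 44), (13, 42)],
    "water": [(5, 8), (6, 10), (7, 9)],
}
means = class_means(training)
print(minimum_distance_classify((12, 39), means))
print(minimum_distance_classify((30, 30), means, max_distance=5.0))
```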
            Mahalanobis Distance
• The Mahalanobis Distance classification is a
direction sensitive distance classifier that uses
statistics for each class

• It is similar to the Maximum Likelihood
classification, but assumes all class covariances
are equal and, therefore, is a faster method

• All pixels are classified to the closest ROI
class unless the user specifies a distance
threshold, in which case some pixels may be
unclassified if they do not meet the threshold
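      Bhattacharyya Distance

$B = \frac{1}{8} (\mu_1 - \mu_2)^T \left[ \frac{\Sigma_1 + \Sigma_2}{2} \right]^{-1} (\mu_1 - \mu_2) + \frac{1}{2} \ln \frac{\left| \frac{1}{2} (\Sigma_1 + \Sigma_2) \right|}{\sqrt{|\Sigma_1| \, |\Sigma_2|}}$

The first term is the Mean Difference Term; the second is the Covariance Term.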
           Binary Encoding Classification

• The binary encoding classification technique
encodes the data and endmember spectra into 0s and
1s based on whether a band falls below or above the
spectrum mean
• An exclusive OR function is used to compare each
encoded reference spectrum with the encoded data
spectra, and a classification image is produced
• All pixels are classified to the endmember with the
greatest number of bands that match unless the user
specifies a minimum match threshold, in which case
some pixels may be unclassified if they do not meet
the criteria
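
The encoding and comparison steps above can be sketched as follows. The function names and spectra are hypothetical, and treating a band exactly equal to the spectrum mean as 0 is an assumption, since the slide only distinguishes below from above.

```python
def encode(spectrum):
    """Encode each band as 1 if above the spectrum mean, else 0."""
    m = sum(spectrum) / len(spectrum)
    return [1 if v > m else 0 for v in spectrum]

def match_count(code_a, code_b):
    """Number of bands whose bits agree (XOR of the two bits is 0)."""
    return sum(1 for a, b in zip(code_a, code_b) if (a ^ b) == 0)

def binary_encoding_classify(pixel, endmembers, min_matches=None):
    """Assign the pixel to the endmember with the most matching bands.

    If min_matches is given and the best score is below it, return None.
    """
    code = encode(pixel)
    scores = {name: match_count(code, encode(ref))
              for name, ref in endmembers.items()}
    best = max(scores, key=scores.get)
    if min_matches is not None and scores[best] < min_matches:
        return None
    return best

# Hypothetical four-band endmember spectra
endmembers = {
    "vegetation": [5, 4, 30, 35],
    "soil": [20, 22, 18, 16],
}
print(binary_encoding_classify([6, 5, 28, 33], endmembers))
```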
     Spectral Angle Mapper Classification


• The Spectral Angle Mapper (SAM) is a
physically-based spectral classification that uses
the n-dimensional angle to match pixels to
reference spectra

• The SAM algorithm determines the spectral
similarity between two spectra by calculating the
angle between the spectra, treating them as
vectors in a space with dimensionality equal to
the number of bands
  • The SAM algorithm assumes that hyperspectral image
  data have been reduced to "apparent reflectance", with all
  dark current and path radiance biases removed
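 Spectral Angle Mapper (SAM) Algorithm
The SAM algorithm uses a reference
spectrum, r, and the spectrum found at each
pixel, t. The basic comparison algorithm to
find the angle α is (where nb = number of
bands in the image):

$\alpha = \cos^{-1}\left( \dfrac{\sum_{i=1}^{nb} t_i r_i}{\sqrt{\sum_{i=1}^{nb} t_i^2}\,\sqrt{\sum_{i=1}^{nb} r_i^2}} \right)$

                 OR

$\alpha = \cos^{-1}\left( \dfrac{t \cdot r}{\|t\| \, \|r\|} \right)$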
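
A short sketch of the angle computation, treating the pixel spectrum t and the reference spectrum r as vectors with one component per band. The function name and the sample spectra are illustrative assumptions; the clamp guards against rounding pushing the cosine slightly outside [-1, 1].

```python
import math

def spectral_angle(t, r):
    """Angle in radians between pixel spectrum t and reference spectrum r."""
    dot = sum(ti * ri for ti, ri in zip(t, r))
    norm_t = math.sqrt(sum(ti * ti for ti in t))
    norm_r = math.sqrt(sum(ri * ri for ri in r))
    cos_alpha = dot / (norm_t * norm_r)
    # Clamp to the valid domain of acos to absorb floating-point rounding
    cos_alpha = max(-1.0, min(1.0, cos_alpha))
    return math.acos(cos_alpha)

reference = [0.10, 0.20, 0.30]
brighter = [0.20, 0.40, 0.60]   # same shape, twice the brightness
different = [0.30, 0.20, 0.10]

print(spectral_angle(brighter, reference))   # close to zero
print(spectral_angle(different, reference))  # clearly larger
```

Because only the direction of the vector matters, a uniformly brighter or darker copy of the same spectrum gives an angle near zero.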
 Minimum Noise Fraction (MNF) Transformation
• The minimum noise fraction (MNF) transformation is used to determine the
inherent dimensionality of image data, to segregate noise in the data, and to reduce
the computational requirements for subsequent processing
• The MNF transformation consists essentially of two cascaded Principal
Components transformations
    • The first transformation, based on an estimated noise covariance matrix,
    decorrelates and rescales the noise in the data. This first step results in
    transformed data in which the noise has unit variance and no band-to-band
    correlations
    • The second step is a standard Principal Components transformation of the
    noise-whitened data.
• For further spectral processing, the inherent dimensionality of the data is
determined by examination of the final eigenvalues and the associated images
• The data space can be divided into two parts: one part associated with large
eigenvalues and coherent eigenimages, and a complementary part with near-unity
eigenvalues and noise-dominated images. By using only the coherent portions, the
noise is separated from the data, thus improving spectral processing results.
           N - Dimensional Visualization

• Spectra can be thought of as points in an n-dimensional
scatterplot, where n is the number of bands

• The coordinates of the points in n-space consist of
"n" values that are simply the spectral radiance or
reflectance values in each band for a given pixel

• The distribution of these points in n-space can be
used to estimate the number of spectral endmembers
and their pure spectral signatures
            Pixel Purity Index (PPI)
• The "Pixel-Purity-Index" (PPI) is a means of
finding the most "spectrally pure," or extreme
pixels in multispectral and hyperspectral images
• PPI is computed by repeatedly projecting n-dimensional
scatterplots onto a random unit vector
• The extreme pixels in each projection are
recorded and the total number of times each pixel
is marked as extreme is noted
• A PPI image is created in which the DN of each
pixel corresponds to the number of times that
pixel was recorded as extreme
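
The projection-and-count procedure above can be sketched as below. This is an illustration under stated assumptions: random directions are drawn from Gaussian components and normalized, only the single minimum and maximum of each projection are counted as extreme, and the function name and toy pixels are hypothetical.

```python
import random

def pixel_purity_index(pixels, n_projections=1000, seed=0):
    """Count how often each pixel is an extreme of a random projection."""
    rng = random.Random(seed)
    n_bands = len(pixels[0])
    counts = [0] * len(pixels)
    for _ in range(n_projections):
        # Random unit vector: Gaussian components scaled to length 1
        v = [rng.gauss(0.0, 1.0) for _ in range(n_bands)]
        norm = sum(c * c for c in v) ** 0.5
        v = [c / norm for c in v]
        # Project every pixel onto the vector
        proj = [sum(pb * vb for pb, vb in zip(p, v)) for p in pixels]
        # Record the two extreme pixels of this projection
        counts[proj.index(min(proj))] += 1
        counts[proj.index(max(proj))] += 1
    return counts

# Three "pure" corner pixels and three mixtures lying inside the triangle
pixels = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0),
          (3.0, 3.0), (5.0, 2.0), (2.0, 5.0)]
ppi = pixel_purity_index(pixels)
print(ppi)
```

In this toy case the mixed pixels lie strictly inside the triangle formed by the pure pixels, so they are never extreme and their counts stay at zero.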
             Matched Filter Technique

Matched filtering maximizes the response of a
known endmember and suppresses the response of
the composite unknown background, thus
"matching" the known signature
  • Provides a rapid means of detecting specific minerals
  based on matches to specific library or image endmember
  spectra
  • Produces images similar to the unmixing technique, but
  with significantly less computation
  • Results (values from 0 to 1) provide a means of
  estimating the relative degree of match to the reference
  spectrum, where “1” is a perfect match
                 Spectral Mixing

• Natural surfaces are rarely composed of a single
uniform material
• Spectral mixing occurs when materials with
different spectral properties are represented by a
single image pixel
• Researchers who have investigated mixing scales
and linearity have found that, if the scale of the
mixing is large (macroscopic), mixing occurs in a
linear fashion
• For microscopic or intimate mixtures, the mixing
is generally nonlinear
       Mixed Spectra Models


Mixed spectra effects can be formalized
in three ways:
  • A physical model
  • A mathematical model
  • A geometric model
Mixed Spectra Physical Model
Mixed Spectra Mathematical Model
Mixed Spectra Geometric Model
Mixture Tuned Matched Filtering (MTMF)

 • MTMF constrains the Matched Filtering as
 mixtures of the composite unknown background
 and the known target

 • MTMF produces the standard Matched Filter
 score images plus an additional set of
 “infeasibility images”, one for each endmember

 • The best match to a target is obtained when the
 Matched Filter score is high (near 1) and the
 “infeasibility” score is low (near 0)
   Principal Component Analysis (PCA)

• Calculation of new transformed variables
(components) by a coordinate rotation
• Components are uncorrelated and ordered by
decreasing variance
• First component axis aligned in the direction of
the highest percentage of the total variance in the
data
• Component axes are mutually orthogonal
• Maximum SNR and largest percentage of total
variance in the first component
Principal Component Analysis (PCA) (Cont)
  • The mean of the original data is the origin of the
    transformed system with the transformed axes of
    each component mutually orthogonal

  • To begin the transformation, the covariance
    matrix, C, is found. Using the covariance matrix,
    the eigenvalues, $\lambda_i$, are obtained from

                   $|C - \lambda_i I| = 0$

   where i = 1,2,...,n (n is the total number of original
   images and I is an identity matrix)
Principal Component Analysis (PCA) (Cont)
• The eigenvalues, $\lambda_i$, are equal to the variance of each
  corresponding component image

• The eigenvectors, $e_i$, define the axes of the
  components and are obtained from

                $(C - \lambda_i I) e_i = 0$

• The principal components are then given as
                PC = T • DN
  where DN is the digital number matrix of the original
  data and T is the (n x n) transformation matrix with
  matrix elements given by $e_{ij}$, i, j = 1,2,3,...n
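
For two bands the characteristic equation is a quadratic, so the whole transformation can be worked by hand. The sketch below is a two-band illustration only, with hypothetical function names and pixel values; it subtracts the band means before applying T, in line with the statement that the data mean is the origin of the transformed system.

```python
import math

def covariance_2band(pixels):
    """Band means and covariance terms (c11, c12, c22) of two-band pixels."""
    n = len(pixels)
    m1 = sum(p[0] for p in pixels) / n
    m2 = sum(p[1] for p in pixels) / n
    c11 = sum((p[0] - m1) ** 2 for p in pixels) / n
    c22 = sum((p[1] - m2) ** 2 for p in pixels) / n
    c12 = sum((p[0] - m1) * (p[1] - m2) for p in pixels) / n
    return (m1, m2), (c11, c12, c22)

def eigen_2x2(c11, c12, c22):
    """Solve |C - lambda I| = 0, then (C - lambda I) e = 0, for a 2x2 C."""
    trace = c11 + c22
    det = c11 * c22 - c12 * c12
    # Roots of lambda^2 - trace * lambda + det = 0, largest first
    disc = math.sqrt(max(trace * trace / 4.0 - det, 0.0))
    lam1, lam2 = trace / 2.0 + disc, trace / 2.0 - disc
    if abs(c12) < 1e-12:
        # C is already diagonal: the band axes are the eigenvectors
        vectors = [(1.0, 0.0), (0.0, 1.0)] if c11 >= c22 else [(0.0, 1.0), (1.0, 0.0)]
    else:
        vectors = []
        for lam in (lam1, lam2):
            e1, e2 = c12, lam - c11  # satisfies (c11 - lam) e1 + c12 e2 = 0
            norm = math.hypot(e1, e2)
            vectors.append((e1 / norm, e2 / norm))
    return (lam1, lam2), vectors

def principal_components(pixels):
    """Return eigenvalues, transformation matrix T (rows = eigenvectors), and PCs."""
    (m1, m2), (c11, c12, c22) = covariance_2band(pixels)
    eigenvalues, T = eigen_2x2(c11, c12, c22)
    pcs = [tuple(row[0] * (p[0] - m1) + row[1] * (p[1] - m2) for row in T)
           for p in pixels]
    return eigenvalues, T, pcs

pixels = [(10, 20), (12, 25), (14, 27), (16, 33), (18, 35)]
eigenvalues, T, pcs = principal_components(pixels)
print(eigenvalues)
```

With more than two bands the eigenvalues are normally found numerically rather than from a closed-form root.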
                A Matrix Equation

Problem: Find the value of vector x from
measurement of a different vector y, where they are
related by the matrix equation given by:
                       y = Ax
or
                       $y_i = \sum_j a_{ij} x_j$
Note1: If both A and x are known, it is trivial to find y
Note2: In our problem, y is the measurement, and A is
determined from the physics of the problem, and we
want to retrieve the value of x from y
            Mean and Variance
   Mean:

            $\langle x \rangle = \frac{1}{N} \sum_k x_k$

 Variance:

      $\mathrm{var}(x) = \frac{1}{N} \sum_k (x_k - \langle x \rangle)^2 = \sigma_x^2$


where k = 1,2,…,N
                Covariance

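$\mathrm{cov}(x,y) = \frac{1}{N} \sum_k (x_k - \langle x \rangle)(y_k - \langle y \rangle)$

         $= \frac{1}{N} \sum_k x_k y_k - \langle x \rangle \langle y \rangle$
Note1: cov(x,x) = var(x)
Note2: If the mean values of x and y are
zero, then
    $\mathrm{cov}(x,y) = \frac{1}{N} \sum_k x_k y_k$
Note3: Sums are over k = 1,2,…,N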
            Covariance Matrix
• Let x = (x1, x2, …, xn) be a random vector with
n components


• The covariance matrix of x is defined to be:

      $C = \langle (x - \mu)(x - \mu)^T \rangle$

      where         $\mu = (\mu_1, \mu_2, \ldots, \mu_n)^T$

      and           $\mu_k = \frac{1}{N} \sum_m x_{mk}$
        Summation is over m = 1,2,…, N
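
As a small worked example, the mean vector and covariance matrix can be estimated from N sample vectors by replacing the expectation with an average over the samples. The function names and the four two-component samples are illustrative assumptions; the 1/N normalization follows the definitions above.

```python
def mean_vector(samples):
    """Component-wise mean of N sample vectors."""
    n = len(samples)
    return [sum(component) / n for component in zip(*samples)]

def covariance_matrix(samples):
    """Covariance matrix with elements (1/N) * sum_m (x_mi - mu_i)(x_mj - mu_j)."""
    n = len(samples)
    mu = mean_vector(samples)
    dim = len(mu)
    return [
        [sum((s[i] - mu[i]) * (s[j] - mu[j]) for s in samples) / n
         for j in range(dim)]
        for i in range(dim)
    ]

samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
print(mean_vector(samples))        # [2.5, 5.0]
print(covariance_matrix(samples))  # [[1.25, 2.5], [2.5, 5.0]]

# Shortcut form for one off-diagonal element: (1/N) sum x_k y_k - <x><y>
n = len(samples)
shortcut = sum(x * y for x, y in samples) / n - 2.5 * 5.0
print(shortcut)                    # 2.5
```

The diagonal elements are the variances of the individual components, and the matrix is symmetric.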
       Gaussian Probability Distributions

• Many physical processes are well represented with
Gaussian distributions given by:


      $P(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma_x} \exp\left( -\dfrac{(x - \langle x \rangle)^2}{2 \sigma_x^2} \right)$


• Given the mean and variance of a Gaussian random
variable, it is possible to evaluate all of the higher moments

• The form of the Gaussian is analytically simple
Normal (Gaussian) Distribution
Scatterplots
         Spectral Signatures
Laboratory Data: Two classes of vegetation
[Figure: Discrete (Feature) Space, samples from two classes (Class 1, Class 2). Horizontal axis: % Reflectance at 0.67 µm; vertical axis: % Reflectance at 0.69 µm]
                                            Hughes Effect




G.F. Hughes, "On the mean accuracy of statistical pattern recognizers," IEEE Trans. Inform. Theory, Vol. IT-14, pp. 55-63, 1968.
Higher Dimensional Space Implications

High dimensional space is mostly empty.
 Data in high dimensional space is mostly in a
 lower dimensional structure.



Normally distributed data will have a tendency
 to concentrate in the tails; Uniformly
 distributed data will concentrate in the
 corners.
Higher Dimensional Space Geometry
• The number of labeled samples needed for
  supervised classification increases rapidly
  with dimensionality

In a specific instance, it has been shown that the number of samples
required increases linearly with dimensionality for a linear classifier and as
the square of the dimensionality for a quadratic classifier. It has been estimated
that the number increases exponentially for a non-parametric classifier.



 • For most high dimensional data sets, lower
   dimensional linear projections tend to be
   normal or a combination of normals.
       HSI Data Analysis Scheme*
                 *After David Landgrebe, Purdue University


[Diagram: 200 Dimensional Data -> Class Conditional Feature Extraction -> Feature Selection -> Classifier/Analyzer, with Class-Specific Information as an additional input]
     HSI Image of Washington DC Mall*

         Define Desired Classes




Training areas designated by polygons outlined in white

        *After David Landgrebe, Purdue University
  Thematic Map of Washington DC Mall*




         Legend: Roofs, Streets, Grass, Trees, Paths, Water, Shadows

         Operation                    CPU Time (sec.)       Analyst Time
         Display Image                18
         Define Classes                                     < 20 min.
         Feature Extraction           12
         Reformat                     67
         Initial Classification       34
         Inspect and Mod. Training                          ≈ 5 min.
         Final Classification         33
         Total                        164 sec = 2.7 min.    ≈ 25 min.

         (No preprocessing involved)
*After David Landgrebe, Purdue University
     Hyperspectral Imaging Barriers

Scene - Varies from hour to hour and sq. km to sq. km

Sensor - Spatial Resolution, Spectral bands, S/N

Processing System -
   • Classes to be labeled

   • Number of samples to define the classes

   • Features to be used

   • Complexity of the Classifier
            Operating Scenario
• Remote sensing by airborne or spaceborne
  hyperspectral sensors

• Finite flux reaching sensor causes spatial-
  spectral resolution trade-off

• Hyperspectral data has hundreds of bands of
  spectral information

• Spectrum characterization allows subpixel
  analysis and material identification
                               Spectral Mixture Analysis
Assumes reflectance from each pixel is caused by
a linear mixture of subpixel materials
[Figure: Mixed Spectra Example. Digital Count (0 to 14000) versus Wavelength (0.4 to 2.4 microns) for Parking Lot, Vegetation, and a 1:1 Mixture]
  Mixed Pixels and Material Maps



Input Image         Red Fraction Map     Green Fraction Map
PURE  PURE          1.0   0.0            0.0   1.0
PURE  MIXED         1.0   0.5            0.0   0.5
      Traditional Linear Unmixing

     $R_i = \sum_{e=1}^{N} R_{i,e} f_e + \varepsilon_i$,   i = 1…k

Constraint Conditions
   • Unconstrained:              no restriction on $f_e$

   • Partially Constrained:      $\sum_{\text{endmembers}} f_e = 1.0$

   • Fully Constrained:          $0.0 \le f_e \le 1.0$
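
To make the constraint conditions concrete, here is a sketch for the simplest case of two endmembers, where the sum-to-one constraint leaves a single unknown fraction that has a closed-form least-squares solution. The function name and the four-band spectra are hypothetical, and clipping the fraction to [0, 1] is used here as the fully constrained solution for this two-endmember case only.

```python
def unmix_two_endmembers(pixel, a, b, fully_constrained=True):
    """Least-squares fractions (f_a, f_b) with f_a + f_b = 1.

    Model per band i: pixel_i = f_a * a_i + (1 - f_a) * b_i + error_i,
    which gives f_a = (pixel - b) . (a - b) / |a - b|^2.
    """
    diff = [ai - bi for ai, bi in zip(a, b)]
    numerator = sum((ri - bi) * di for ri, bi, di in zip(pixel, b, diff))
    denominator = sum(di * di for di in diff)
    f_a = numerator / denominator
    if fully_constrained:
        # Keep the fraction physically meaningful: 0.0 <= f <= 1.0
        f_a = max(0.0, min(1.0, f_a))
    return f_a, 1.0 - f_a

# Hypothetical endmember spectra (digital counts in four bands)
parking_lot = [2000.0, 2500.0, 3000.0, 3200.0]
vegetation = [500.0, 800.0, 6000.0, 3000.0]
mixture = [0.5 * p + 0.5 * v for p, v in zip(parking_lot, vegetation)]

print(unmix_two_endmembers(mixture, parking_lot, vegetation))  # (0.5, 0.5)
```

With more endmembers the same model is solved as a general constrained least-squares problem across all bands.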
  Hierarchical Linear Unmixing Method
 • Unmixes broad material classes first
 • Proceeds to a group’s constituents only if the unmixed
   fraction is greater than a given threshold

                Example Materials Hierarchy

Mixed Pixel
    • Man-Made
        – Concrete
        – Metal
    • Water
    • Vegetation
        – Trees
            • Deciduous
            • Coniferous
        – Grass

Full Library: Concrete, Metal, Water, Vegetation, Trees, Deciduous Trees, Coniferous Trees, Grass
         Stepwise Unmixing Method
• Employs linear unmixing to find fractions

• Uses iterative regressions to accept only the
  endmembers that improve a statistics-based
  model

• Shown to be superior to classic linear method
  – Has better accuracy
  – Can handle more endmembers

• Quantitatively tested only on synthetic data
         Performance Evaluation
Error Metric:

  $SE = \frac{1}{N} \sum_{\text{pixels}} \sum_{\text{materials}} (f_{\text{truth}} - f_{\text{test}})^2$

  • Compare squared error from traditional,
    stepwise and hierarchical methods

  • Visually assess fraction maps for accuracy
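
The error metric can be computed as sketched below, assuming N is the number of pixels and that each fraction map is stored as one list of material fractions per pixel. The function name, the data layout, and the fraction values are assumptions for illustration.

```python
def squared_error(truth, test):
    """Sum of squared fraction differences over pixels and materials, divided by N.

    truth and test are lists with one entry per pixel; each entry is a
    list of fractions, one per material. N is taken as the pixel count.
    """
    n_pixels = len(truth)
    total = sum(
        (f_truth - f_test) ** 2
        for truth_px, test_px in zip(truth, test)
        for f_truth, f_test in zip(truth_px, test_px)
    )
    return total / n_pixels

# Two pixels, two materials (hypothetical fractions)
truth = [[1.0, 0.0], [0.5, 0.5]]
test = [[0.8, 0.2], [0.5, 0.5]]
print(squared_error(truth, test))
```

Running the same function on the outputs of the traditional, stepwise, and hierarchical methods gives directly comparable scores.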
           Endmember Selection

• Endmembers are simply material types
  – Broad classification: road, grass, trees…
  – Fine classification: dry soil, moist soil...


• Use image-derived endmembers to produce
  spectral library
  – Average reference spectra from “pure” sample
    pixels
  – Chose specific number of distinct endmembers
             Materials Hierarchy
• Grouped similar materials into 3-level hierarchy



– Level 1

– Level 2


– Level 3
Squared Error Results
   Stepwise Unmixing Comparisons

• Linear unmixing does poorly, forcing
  fractions for all materials

• Hierarchical approach performs better but
  requires extensive user involvement

• Stepwise routine succeeds using adaptive
  endmember selection without extra
  preparation
HSI Image of Washington DC Mall




  HYDICE Airborne System
  1208 Scan Lines, 307 Pixels/Scan Line
  210 Spectral Bands in 0.4-2.4 µm Region
  155 Megabytes of Data
  (Not yet Geometrically Corrected)
       Hyperspectral Imaging Potential

• Assume 10 bit data in a 100 dimensional space


• That is $(1024)^{100} \approx 10^{300}$ discrete locations


• Even for a data set of $10^6$ pixels, the probability
of any two pixels lying in the same discrete location
is extremely small
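
The arithmetic behind these figures can be checked with a few lines. The collision estimate is only a rough upper bound under the added assumption that pixels fall uniformly at random over the discrete locations, which real data do not; the function names are hypothetical.

```python
import math

def log10_cell_count(bits, dims):
    """log10 of the number of discrete locations, (2**bits)**dims."""
    return dims * math.log10(2 ** bits)

def log10_collision_bound(n_pixels, bits, dims):
    """log10 of an upper bound on P(any two pixels share a location).

    Union bound over all pixel pairs, assuming uniformly random placement.
    """
    n_pairs = n_pixels * (n_pixels - 1) / 2
    return math.log10(n_pairs) - log10_cell_count(bits, dims)

cells = log10_cell_count(bits=10, dims=100)
bound = log10_collision_bound(n_pixels=10 ** 6, bits=10, dims=100)
print(f"about 10^{cells:.0f} discrete locations")
print(f"collision probability below about 10^{bound:.0f}")
```

Working in logarithms avoids overflowing ordinary floating-point numbers with values this large.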

								