LAB 4 UNSUPERVISED CLASSIFICATION

					Geography 309                                        Lab 4                                              Page 1



                          LAB 4: UNSUPERVISED CLASSIFICATION
                                              Question Sheets

                                       Due Date: March 12

Objectives
   to study some of the mechanics of unsupervised classification

Preparation
   Re-read the sections on Defining the Classes (pg. 307), Land-use versus land-cover mapping
    (pp. 307-309), and Unsupervised classification (pp. 311-315) in your text.

Notes
1. Image classification is a complex task and it takes considerable effort to become comfortable
   with classification concepts. As in the previous Lab, you are asked to do some basic image
   manipulations “by hand” using a pencil and paper before attempting a similar operation on
   the computer. In Part B you will apply the concepts you have learned in Part A to classify a
   digital image on the computer. The manual techniques described here are very similar to the
   tasks performed by computer; however, the computer's great speed permits it to handle much
   larger images with more channels and greater radiometric range.

2. All Figures and Tables referenced in Part A are included on the Answer Sheets attached to the
   end of this Lab. Please use these Answer Sheets to submit your answers to Part A.

3. Detailed instructions for image classification using Geomatica can be found on-line by
   following the Geomatica Visual Guide link from the bottom of the course homepage.

4. At any time when you are working with the Geomatica software, you can save a "snapshot"
   of your work in progress by Saving it in a Project (look in the File menu). Be sure to save
   your projects in your personal directory space.


A. Manual Image Classification [1]
It is often useful to delineate spectrally distinct areas of an image, even when nothing is known
about the environmental character of the resulting subdivisions or classes. These classes are then
mapped and the map taken into the field to identify the classes. This procedure is known as
unsupervised classification, since no training sites are involved. The main advantage of an
unsupervised technique is that the classes are defined by their statistical characteristics,
usually computed over large geographical areas, rather than by a training sample that may be
quite unrepresentative of the class variability over the whole scene to be mapped.

[1] This material is derived from material presented in the publication Introduction to Digital Images and Digital Image Analysis Techniques by Tom Alföldi.


J.M. Piwowar                                                                                        2010.02.24




Many mathematical algorithms use various schemes to locate and separate the statistically
"cohesive" clusters in feature space that are likely to have environmental significance. Most of
these algorithms rely on finding areas of high pixel density (in feature space) which are
separated by regions of low density. The following task illustrates this with a less
sophisticated algorithm than those actually used in practice.

Question 1:                                                                              (1 mark)
The feature space representation of the image in Figure 1 is shown in Figure 2. We can
artificially define the high density clusters by eliminating (from view) all the low density cells.
Into Figure 3, copy those cells from Figure 2 which have a count of (density of) 3 or more.
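The counting and thresholding described above can also be sketched in code. The following is a minimal Python illustration, not the procedure Geomatica uses: it tallies the (band 'A', band 'B') pairs of the Figure 1 image and keeps the cells with a count of 3 or more. The function names feature_space and high_density are made up for this example.

```python
from collections import Counter

# Figure 1 digital numbers, one list per image line.
BAND_A = [
    [3, 4, 1, 1, 2, 2, 2],
    [4, 4, 4, 2, 1, 2, 2],
    [3, 3, 5, 2, 2, 2, 2],
    [5, 5, 2, 2, 2, 2, 2],
    [4, 5, 5, 2, 2, 2, 2],
    [4, 3, 3, 4, 4, 5, 5],
    [5, 5, 3, 4, 4, 5, 4],
]
BAND_B = [
    [7, 7, 0, 0, 0, 3, 3],
    [7, 7, 6, 0, 0, 3, 4],
    [4, 4, 7, 1, 1, 3, 4],
    [7, 7, 4, 4, 3, 4, 4],
    [6, 7, 7, 4, 5, 5, 5],
    [7, 5, 5, 5, 7, 7, 6],
    [7, 6, 6, 7, 7, 7, 7],
]


def feature_space(band_a, band_b):
    """Count how many pixels share each (band A, band B) pair."""
    counts = Counter()
    for line_a, line_b in zip(band_a, band_b):
        counts.update(zip(line_a, line_b))
    return counts


def high_density(counts, threshold=3):
    """Keep only the cells whose pixel count reaches the threshold."""
    return {cell: n for cell, n in counts.items() if n >= threshold}


counts = feature_space(BAND_A, BAND_B)
nuclei = high_density(counts)
for (a, b), n in sorted(nuclei.items()):
    print(f"band A = {a}, band B = {b}: {n} pixels")
```

Changing the threshold argument shows how sensitive the nuclei are to the density cut-off.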

You should now see three clusters of high density cells in Figure 3. These groupings of high
density cells should each only be considered the nucleus of a cluster. The next step is to define
the boundaries of each whole cluster by spreading out from the nucleus.

Question 2:                                                                                 (1 mark)
Let's call the cluster with a nucleus of only two cells Cluster ‘A’ or Class ‘A’. Identify each cell
that touches the nucleus cells of cluster ‘A’, by marking such cells with the letter ‘A’ in Figure 3.
There should be 10 such cells marked, counting even those cells that touch with a corner only.

Repeat the process for cluster ‘B’ (with three nucleus cells) using the letter ‘B’ for the
neighbouring cells, and also for cluster ‘C’ (with one nucleus cell) and using the letter ‘C’.

There will be a point of ambiguity where two clusters overlap and a cell is identified as
belonging to the neighbourhood of two clusters. A decision must be forced, so identify this
conflict cell as belonging to the cluster with the larger nucleus.

Draw the boundary for each cluster enclosing its complete neighbourhood in Figure 3. There
should be 11 cells inside the boundary for cluster ‘A’, 15 cells in cluster ‘B’, and six cells in
cluster ‘C’.
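The neighbourhood growth and the tie-breaking rule can be sketched as follows. This is an illustrative Python version under stated assumptions: the nucleus coordinates are read off Figure 2 at a threshold of 3, the feature space is taken to run from 0 to 9 on both axes, and the names grow and resolve are hypothetical.

```python
from itertools import product

# Nucleus cells as (band A, band B), read off Figure 2 at a threshold of 3.
NUCLEI = {
    "A": {(4, 7), (5, 7)},
    "B": {(2, 3), (2, 4), (2, 5)},
    "C": {(1, 0)},
}
GRID = range(10)  # both feature-space axes run from 0 to 9


def grow(nucleus):
    """Return the nucleus plus every in-grid cell touching it by an edge or a corner."""
    region = set()
    for (a, b), (da, db) in product(nucleus, product((-1, 0, 1), repeat=2)):
        cell = (a + da, b + db)
        if cell[0] in GRID and cell[1] in GRID:
            region.add(cell)
    return region


def resolve(nuclei):
    """Grow every cluster, then give contested cells to the larger nucleus."""
    regions = {name: grow(cells) for name, cells in nuclei.items()}
    # Visit clusters from largest nucleus to smallest so the larger one keeps its cells.
    order = sorted(nuclei, key=lambda name: len(nuclei[name]), reverse=True)
    claimed = set()
    for name in order:
        regions[name] -= claimed
        claimed |= regions[name]
    return regions


regions = resolve(NUCLEI)
for name in sorted(regions):
    print(name, len(regions[name]))
```

The printed sizes can be compared with the cell counts quoted above. The lab does not say what to do when two nuclei are the same size, so this sketch leaves such ties to the sort order.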

The clusters just created were defined as a uniform perimeter around each nucleus, without
regard for the presence or absence of actual image data. A more accurate cluster representation
can be obtained by combining the cluster definitions from Figure 3 with the pixel counts from
Figure 2.

Question 3:                                                                               (1 mark)
Transfer the cluster boundaries from Figure 3 back to the original feature space in Figure 2.
Now copy those cells in Figure 2 which fall inside the boundary for cluster ‘A’ and which have a
pixel density of 1 or greater into the corresponding cells in Figure 4. Identify those cells by the
letter ‘A’. Repeat the procedure for clusters ‘B’ and ‘C’.
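As a small illustration of this step, the sketch below applies the "inside the boundary and pixel density of 1 or greater" test to cluster ‘C’ only. It assumes the cluster ‘C’ boundary encloses the nucleus (1, 0) and its in-grid neighbours, and uses the counts from the bottom two rows of Figure 2; the function name occupied is hypothetical.

```python
# Counts from the bottom two rows of Figure 2, keyed by (band A, band B).
COUNTS = {(1, 0): 3, (2, 0): 2, (2, 1): 2}

# Cells inside the cluster 'C' boundary: the nucleus (1, 0) and its in-grid neighbours.
REGION_C = {(a, b) for a in (0, 1, 2) for b in (0, 1)}


def occupied(region, counts, minimum=1):
    """Keep the cells of a region that actually contain image pixels."""
    return {cell for cell in region if counts.get(cell, 0) >= minimum}


cells_c = occupied(REGION_C, COUNTS)
print(sorted(cells_c))
```

The same filter applies unchanged to the regions of clusters ‘A’ and ‘B’.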

Now that feature space has been (pseudo-) statistically subdivided into cohesive clusters, it
remains to map these clusters into their geographical locations.






Question 4:                                                                             (1 mark)
For each pixel in the image, retrieve the band 'A' and band 'B' spectral coordinates from the
digital maps of Figure 1. Determine which class to assign these coordinates to by looking them
up in Figure 4. Record the class by its representative symbol (A, B, C, or leave blank for
undefined) in Figure 5. Only the last three lines of pixels need to be considered, since the first
four lines have been mapped for you.
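The look-up just described amounts to a table keyed by the (band 'A', band 'B') pair. A minimal Python sketch follows, using only the first image line, which Figure 5 already shows as mapped. The partial LOOKUP table is an assumption chosen to cover just that line, and classify_line is a hypothetical name.

```python
# First image line of Figure 1.
BAND_A_LINE1 = [3, 4, 1, 1, 2, 2, 2]
BAND_B_LINE1 = [7, 7, 0, 0, 0, 3, 3]

# Partial feature-space table: only the cells needed for line 1 (illustrative).
LOOKUP = {
    (3, 7): "A",
    (4, 7): "A",
    (1, 0): "C",
    (2, 0): "C",
    (2, 3): "B",
}


def classify_line(band_a, band_b, lookup, undefined=" "):
    """Map each pixel's (band A, band B) pair to its class, or to blank if undefined."""
    return [lookup.get(pair, undefined) for pair in zip(band_a, band_b)]


print(" ".join(classify_line(BAND_A_LINE1, BAND_B_LINE1, LOOKUP)))
```

The result can be checked against line 1 of Figure 5. A pair that is missing from the table comes back blank, which is how an undefined pixel would appear.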

The unsupervised classification in Figure 5 shows the spatial distribution of three classes, ‘A’, ‘B’,
and ‘C’, but no attempt has been made to give each class a real label. Class identification is the
next step of the unsupervised classification process. This may be done by a variety of techniques,
notably airphoto interpretation, or actually visiting the site, if practical. It is not necessary to
completely cover the scene in question, however. Ground verification, or ground truthing, can
be directed to convenient, small, and representative locations in the image where the
environmental meaning of a variety of classes may be determined. For instance, the location
marked by a star in Figure 5 would be a suitable location to identify classes ‘A’, ‘B’, and ‘C’
because of their proximity to each other. By investigating the class definitions in a few such
locations, the class labels can be extrapolated to the larger scene with confidence.


Question 5:                                                                       (2 marks)
Assume that in the feature space of the image (Figure 4), band 'A' is representative of the
visible spectrum and band 'B' is a near-IR band. Suggest what the three classes (clusters)
represent.

Question 6:                                                                                (1 mark)
Why was one pixel not classified in the unsupervised classification of Figure 5?


B. Digital Image Classification
Now that you have seen how an unsupervised classification works with test data, you are ready to
try one using Geomatica. In an unsupervised classification, the computer examines your image
data and attempts to find clusters of pixel values naturally occurring among the different spectral
bands. These natural clusters typically represent different land covers. The analyst's job is to
attach meaningful labels to the spectral classes produced by an unsupervised classification based
on ancillary data gathered from field observations, aerial photographs, maps, and other sources.
In this lab, you are not expected to use any of these ancillary data sources; rather, you are to
base your class labels on a visual interpretation of the imagery.
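To make the idea of the computer "finding clusters" concrete, here is a minimal sketch of one common clustering loop (plain K-means) in Python. It is not Geomatica's implementation, whose details may differ. The nine sample points are (band 'A', band 'B') pairs that occur in Figure 1, while the three seed centres and the names nearest and kmeans are assumptions made for this illustration.

```python
def nearest(point, centres):
    """Index of the centre with the smallest squared Euclidean distance to point."""
    return min(
        range(len(centres)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centres[i])),
    )


def kmeans(points, seeds, max_iterations=20):
    """Alternate between assigning points to centres and moving centres to group means."""
    centres = [tuple(float(v) for v in seed) for seed in seeds]
    for _ in range(max_iterations):
        groups = [[] for _ in centres]
        for point in points:
            groups[nearest(point, centres)].append(point)
        # An empty group keeps its old centre; otherwise take the mean on each axis.
        updated = [
            tuple(sum(axis) / len(group) for axis in zip(*group)) if group else centre
            for group, centre in zip(groups, centres)
        ]
        if updated == centres:  # nothing moved, so the clustering has settled
            break
        centres = updated
    return centres, [nearest(point, centres) for point in points]


# (band A, band B) pairs taken from Figure 1; the seeds are arbitrary starting guesses.
points = [(1, 0), (2, 0), (2, 1), (2, 3), (2, 4), (2, 5), (4, 7), (5, 7), (5, 6)]
centres, labels = kmeans(points, seeds=[(1, 1), (3, 4), (5, 5)])
print(labels)
```

Different seed centres can lead to different final clusters, which is one reason the resulting spectral classes still need to be interpreted by the analyst.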

Before you begin …
    During the process of classification, you will need to add a new layer to the image data
       file you are working with. This means that you will need to work with a copy of the
       image that you have saved on your USB drive. Use the geometrically corrected image






         you created in the last lab. [2] If your image has black triangles on its edges, make a subset
         of only the image data portion before you begin. [3]

Assignment

1. Following the procedure as outlined in the Geomatica Visual Guide, set up your image for an
   Unsupervised Classification. Use the Session Configuration exactly as it is shown on the
   Unsupervised Classification web page.
2. Classify the image into 16 classes (Max. Class) using the K-Means algorithm.
         a. Assign meaningful Colours and Names to each of the classes created. Recall that
            unsupervised classes are based on spectral clusters in the data and may or may not be
            interpretable in real-world terms.
                  i. The USGS class labels listed at
                     http://landcover.usgs.gov/classes.php or in Table 11.1 of your text (use the
                     Level II classes) may be a useful guide.
                 ii. Since you probably don't know which farmer was growing what types of crops
                     on their fields in 2000 (when the image was acquired), you won't be able to
                     accurately label the crop classes – try to make an educated guess based on
                     their colours.
                iii. If there are two or more classes which appear to represent the same feature on
                     the ground, give them the same labels and colours. [4]
         b. Using the Classification Report, prepare a summary table to show the spatial extents
            of your classes across your image. Use the following headings in your table:


         Class #     Class Name     # pixels     Image Coverage
                                                 % of image     km²

Question 7:                                                                                               (1 mark)
Submit a copy of your classification summary table.
         c. Prepare a Map Composition to show your classified image. Follow the instructions
            for Simple Mapping in the Geomatica Visual Guide.
                  i. Include a Neatline, Border, Legend, and Title on your composition.
                 ii. Change the main title to something more meaningful than the default. Use
                     your name as the Sub-title.




[2] If you were not able to produce satisfactory results on Lab 3, you can copy the georeferenced image, UTM.pix, from the course disk to your personal folder.
[3] To subset an image, use the Clipping/Subsetting… function of the Tools menu.
[4] In practice, there are tools that you can use to merge several classes like this, but I would like you to keep them separate for now.




Question 8:                                                                                               (3 marks)
Submit your map composition of your K-Means classified image. [5][6]

Question 9:                                                                           (2 marks)
Using your textbook, or other reference(s), briefly describe how the K-Means algorithm works.
3. Classify the image into 16 classes (Desired Clusters) using the IsoData algorithm.
         a. You will need to add another new channel to the image file to save this second
            classification.
         b. Assign meaningful Names, Colours, and Descriptions to each of the classes created.
            Assign similar names and colours as you used for the K-Means classification.
         c. Using the Classification Report, prepare a summary table to show the spatial extents
            of your classes across your image. Use the following headings in your table:


         Class #     Class Name     # pixels     Image Coverage
                                                 % of image     km²

Question 10:                                                                                               (1 mark)
Submit a copy of your classification summary table.
         d. Prepare a Map Composition to show your classified image.

Question 11:                                                                                              (3 marks)
Submit your map composition of your IsoData classified image. [5][6]

Question 12:                                                                            (2 marks)
Using the course text, or other reference(s), briefly describe how the IsoData algorithm works.

4. Compare your K-Means and IsoData classifications. How do the total areas for each class
   compare between the two images? Do you perceive one as more representative of reality?

Question 13:                                                                                              (2 marks)
Prepare a paragraph summarizing your classification comparison.
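The "% of image" and "km²" columns of the summary tables in Questions 7 and 10 follow directly from the pixel counts. The Python sketch below shows the arithmetic only. The class counts are invented, the 30 m pixel size is an assumption (check the pixel size of your own image), and summarize is a hypothetical helper name.

```python
def summarize(pixel_counts, pixel_size_m):
    """Return (class, pixels, % of image, km2) rows from per-class pixel counts."""
    total = sum(pixel_counts.values())
    cell_km2 = (pixel_size_m ** 2) / 1_000_000  # m^2 per pixel converted to km^2
    return [
        (cls, n, 100 * n / total, n * cell_km2)
        for cls, n in sorted(pixel_counts.items())
    ]


# Hypothetical counts for three classes; replace with the Classification Report values.
example = {1: 2000, 2: 6000, 3: 2000}
rows = summarize(example, pixel_size_m=30)  # 30 m is an assumed pixel size
for cls, n, pct, km2 in rows:
    print(f"{cls:>7} {n:>9} {pct:>10.1f} {km2:>8.2f}")
```

Running the same function on the K-Means and IsoData counts gives two tables whose per-class areas can be compared side by side.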

Bonus Question
Question 14:                                                                  (1 bonus mark)
Repeat either one of your classifications but only select ETM bands 3, 4 and 7 as the Input
Channels in the Session Configuration. Compare your results to your first classification where
you used ETM bands 1, 2, 3, 4, 5, and 7. Would you say your results are very similar, similar,
different, or very different to those of your first classification? Why?


[5] I require colour prints of all your classified images. If you do not have access to a colour printer, you may submit your map file on a CD, e-mail it to me, or I can copy it onto my USB flash drive during the lab.
[6] If you are submitting your image as a file, send in the .jpg file; do not submit your .prj file.




NAME:                                                                                        MARK:

                         LAB 4: UNSUPERVISED CLASSIFICATION
                                                              Answer Sheets

                                                     Due Date: March 12

                  PIXELS                                 PIXELS
             1  2  3  4  5  6  7                    1  2  3  4  5  6  7
     L   1   3  4  1  1  2  2  2            L   1   7  7  0  0  0  3  3
     I   2   4  4  4  2  1  2  2            I   2   7  7  6  0  0  3  4
     N   3   3  3  5  2  2  2  2            N   3   4  4  7  1  1  3  4
     E   4   5  5  2  2  2  2  2            E   4   7  7  4  4  3  4  4
     S   5   4  5  5  2  2  2  2            S   5   6  7  7  4  5  5  5
         6   4  3  3  4  4  5  5                6   7  5  5  5  7  7  6
         7   5  5  3  4  4  5  4                7   7  6  6  7  7  7  7
                 BAND ‘A’                               BAND ‘B’
Figure 1: A two-band image.
Question 1:                                                                              (1 mark)
The feature space representation of the image in Figure 1 is shown in Figure 2. We can
artificially define the high density clusters by eliminating (from view) all the low density cells.
Into Figure 3, copy those cells from Figure 2 which have a count of (density of) 3 or more.



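     Band ‘B’ Intensities
        9 |  .  .  .  .  .  .  .  .  .  .
        8 |  .  .  .  .  .  .  .  .  .  .
        7 |  .  .  .  1  8  8  .  .  .  .
        6 |  .  .  .  1  2  2  .  .  .  .
        5 |  .  .  3  2  1  .  .  .  .  .
        4 |  .  .  7  2  .  .  .  .  .  .
        3 |  .  .  5  .  .  .  .  .  .  .
        2 |  .  .  .  .  .  .  .  .  .  .
        1 |  .  .  2  .  .  .  .  .  .  .
        0 |  .  3  2  .  .  .  .  .  .  .
          +------------------------------
             0  1  2  3  4  5  6  7  8  9
                Band ‘A’ Intensities
     (A dot marks an empty cell, i.e. a count of 0.)
Figure 2: Feature space representation.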







     Band ‘B’ Intensities
        9 |
        8 |
        7 |
        6 |
        5 |
        4 |
        3 |
        2 |
        1 |
        0 |
          +------------------------------
             0  1  2  3  4  5  6  7  8  9
                Band ‘A’ Intensities
Figure 3: High density clusters.

Question 2:                                                                                 (1 mark)
Let's call the cluster with a nucleus of only two cells Cluster ‘A’ or Class ‘A’. Identify each cell
that touches the nucleus cells of cluster ‘A’, by marking such cells with the letter ‘A’ in Figure 3.
There should be 10 such cells marked, counting even those cells that touch with a corner only.

Repeat the process for cluster ‘B’ (with three nucleus cells) using the letter ‘B’ for the
neighbouring cells, and also for cluster ‘C’ (with one nucleus cell) and using the letter ‘C’.

There will be a point of ambiguity where two clusters overlap and a cell is identified as
belonging to the neighbourhood of two clusters. A decision must be forced, so identify this
conflict cell as belonging to the cluster with the larger nucleus.

Draw the boundary for each cluster enclosing its complete neighbourhood in Figure 3. There
should be 11 cells inside the boundary for cluster ‘A’, 15 cells in cluster ‘B’, and six cells in
cluster ‘C’.








Question 3:                                                                               (1 mark)
Transfer the cluster boundaries from Figure 3 back to the original feature space in Figure 2.
Now copy those cells in Figure 2 which fall inside the boundary for cluster ‘A’ and which have a
pixel density of 1 or greater into the corresponding cells in Figure 4. Identify those cells by the
letter ‘A’. Repeat the procedure for clusters ‘B’ and ‘C’.

     Band ‘B’ Intensities
        9 |
        8 |
        7 |
        6 |
        5 |
        4 |
        3 |
        2 |
        1 |
        0 |
          +------------------------------
             0  1  2  3  4  5  6  7  8  9
                Band ‘A’ Intensities
Figure 4: Feature space.








Question 4:                                                                             (1 mark)
For each pixel in the image, retrieve the band 'A' and band 'B' spectral coordinates from the
digital maps of Figure 1. Determine which class to assign these coordinates to by looking them
up in Figure 4. Record the class by its representative symbol (A, B, C, or leave blank for
undefined) in Figure 5. Only the last three lines of pixels need to be considered, since the first
four lines have been mapped for you.

                            PIXELS
                    1   2   3   4   5   6   7
                1   A   A   C   C   C   B   B
                2   A   A   A   C   C   B   B
                3   B   B   A   C   C   B   B
        LINES   4   A   A   B   B   B   B   B
                5
                6
                7
Figure 5: Unsupervised classification.



Question 5:                                                                       (2 marks)
Assume that in the feature space of the image (Figure 4), band 'A' is representative of the
visible spectrum and band 'B' is a near-IR band. Suggest what the three classes (clusters)
represent.




Question 6:                                                                              (1 mark)
Why was one pixel not classified in the unsupervised classification of Figure 5?



