Computer Processing of Remotely-Sensed Images: An Introduction, Second Edition. Paul M. Mather, School of Geography, The University of Nottingham, UK.

Non-parametric feature selection methods do not rely on assumptions concerning the frequency distribution of the features. One such method, which has not been widely used, is proposed by Lee and Landgrebe (1993). Benediktsson and Sveinsson (1997) demonstrate its application.

8.10 CLASSIFICATION ACCURACY

The methods discussed in section 8.9 have as their aim the establishment of the degree of separability of the k classes to which the image pixels are to be allocated (though the Bhattacharyya distance is more like a measure of the probability of misclassification). Once a classification exercise has been carried out there is a need to determine the degree of error in the end-product. These errors could be thought of as being due to incorrect labelling of the pixels. Conversely, the degree of accuracy could be sought. First of all, if a method allowing a 'reject' class has been used then the number of pixels assigned to this class (which is conventionally labelled '0') will be an indication of the overall representativeness of the training classes. If large numbers of pixels are labelled '0' then the representativeness of the training data sets is called into question - do they adequately sample the feature space? The most commonly used method of representing the degree of accuracy of a classification is to build a confusion matrix (or error matrix). The elements of the rows of this matrix give the number of pixels that the operator has identified as being members of class i that have been allocated to classes 1 to k by the classification procedure (see Table 8.4). Element i of row i (the ith diagonal element) contains the number of pixels identified by the operator as belonging to class i that have been correctly labelled by the classifier. The other elements of row i give the number and distribution of pixels that have been incorrectly labelled. The classification accuracy for class i is therefore the number of pixels in cell ii divided by the total number of pixels identified by the operator from ground data as being class i pixels. The overall classification accuracy is the average of the individual class accuracies, which are usually expressed in percentage terms.

Table 8.4 Confusion or error matrix for six classes. The row labels (Ref.) are those given by an operator using ground reference data. The column labels (Class.) are those generated by the classification procedure. See text for explanation. The four right-hand columns are as follows: (i) number of pixels in class i from ground reference data; (ii) estimated classification accuracy (per cent); (iii) class i pixels in reference data but not given label i by the classifier; and (iv) pixels given label i by the classifier but not class i in reference data. The sum of the diagonal elements of the confusion matrix is 350, and the overall accuracy is therefore (350/410) x 100 = 85.4%. [The body of the table is not reproduced here; the column sums are 71, 72, 76, 67, 81 and 43, giving a total of 410 pixels.]

Some analysts use a statistical measure, the kappa coefficient, which is based on the information provided by the contingency matrix (Bishop et al., 1975). Kappa is computed from:

    kappa = ( N * Σ_i x_ii − Σ_i x_i+ x_+i ) / ( N² − Σ_i x_i+ x_+i ),   i = 1, ..., k

The x_ii are the diagonal entries of the confusion matrix. The notation x_i+ and x_+i indicates, respectively, the sum of row i and the sum of column i of the confusion matrix. N is the total number of pixels represented in the confusion matrix. Row totals (x_i+) for the confusion matrix shown in Table 8.4 are listed in the column headed (i) and column totals are given in the last row. The sum of the diagonal elements (Σ x_ii for k = 6 classes) is 350, and the sum of the products of the row and column marginal totals (Σ x_i+ x_+i) is 28 820. Thus the value of kappa is:

    kappa = (410 x 350 − 28 820) / (168 100 − 28 820) = 114 680 / 139 280 = 0.82
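These calculations are simple to script. The following Python sketch is illustrative only: the 3 x 3 matrix is invented (the body of Table 8.4 is not reproduced here) and the function name is ours; only the aggregate figures 410, 350 and 28 820 come from the text.

```python
def accuracy_and_kappa(m):
    """Overall accuracy (as a proportion) and the kappa coefficient
    of a square confusion matrix m (rows = reference, columns = classifier)."""
    k = len(m)
    n = sum(sum(row) for row in m)                 # N, total number of pixels
    diag = sum(m[i][i] for i in range(k))          # sum of the diagonal x_ii
    row_tot = [sum(row) for row in m]              # x_i+
    col_tot = [sum(m[i][j] for i in range(k)) for j in range(k)]  # x_+j
    marg = sum(row_tot[i] * col_tot[i] for i in range(k))
    overall = diag / n
    kappa = (n * diag - marg) / (n * n - marg)
    return overall, kappa

# Invented 3-class example, rows = reference classes, columns = classifier labels:
m = [[50, 3, 2],
     [4, 40, 6],
     [1, 5, 44]]
overall, kap = accuracy_and_kappa(m)
class_acc = [m[i][i] / sum(m[i]) for i in range(len(m))]  # per-class (row) accuracy

# Aggregate figures quoted in the text for Table 8.4:
n, diag, marg = 410, 350, 28820
kappa_84 = (n * diag - marg) / (n * n - marg)   # 114680/139280, about 0.82
```

The marginal-product term estimates the agreement that would be expected by chance alone; kappa discounts this chance agreement, which is why it can be markedly lower than the simple percentage accuracy.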
A value of zero indicates no agreement, while a value of 1.0 shows perfect agreement between the classifier output and the reference data. Monserud and Leemans (1992) suggest that a value of kappa of 0.75 or greater shows a 'very good to excellent' classifier performance, while a value of less than 0.4 is 'poor'. However, these guidelines are only valid when the assumption that the data are randomly sampled from a multinomial distribution, with a large sample size, is met.

Values of kappa are often cited when classifications are compared. If these classifications refer to different procedures (such as maximum likelihood and artificial neural networks) applied to the same data set, then comparisons of kappa values are acceptable, though the percentage accuracy (overall and for each class) provides as much, if not more, information. If the two classifications have different numbers of categories then it is not clear whether a straightforward logical comparison is valid. It is hard to see what additional information is provided by kappa over and above that given by a straightforward calculation of percentage accuracy. See Congalton (1991), Kalkhan et al. (1997), Stehman (1997) and Zhuang et al. (1995).

The confusion matrix procedure stands or falls by the availability of a test sample of pixels for each of the k classes.
The use of training-class pixels for this purpose is dubious and is not recommended - one cannot logically train and test a procedure using the same data set. A separate set of test pixels should be used for the calculation of classification accuracy. Users of the method should be cautious in interpreting the results if the ground data from which the test pixels were identified were not collected on the same date as the remotely-sensed image, for crops can be harvested or forests cleared. Other problems may arise as a result of differences in scale between test and training data and the image pixels being classified. So far as is possible, the test pixel labels should adequately represent reality.

The literal interpretation of accuracy measures derived from a confusion matrix can lead to error. Would the same level of accuracy have been achieved if a different test sample of pixels had been used? Figure 8.20 shows an extract from a hypothetical classified image and the corresponding ground reference data. If the section outlined in the solid line in Figure 8.20(a) had been selected as test data the user would infer that the classification accuracy was 100%, whereas if the area outlined by the dashed line had been selected then the accuracy would appear to be 75%. For a given spectral class there are a very large number of possible configurations of test data and each might give a different accuracy statistic. It is likely that the distribution of accuracy values could be summarised by a conventional probability distribution, for example the hypergeometric distribution, which describes a situation in which there are two outcomes to an experiment, labelled P (success) and Q (failure), and where samples are drawn from a population of finite size. If the population being sampled is large, the binomial distribution (which is easier to calculate) can be used in place of the hypergeometric distribution. These statistical distributions allow the evaluation of confidence limits, which can be interpreted as follows: if a very large number of samples of size N are taken and the true proportion of successful outcomes is P, then 95% of all the sample values will lie between P_l and P_u (the lower and upper 95% confidence limits around P). The values of the upper and lower confidence limits depend on (i) the level of probability employed and (ii) the sample size N. The confidence limits get wider as the probability level increases towards 100%, so that we can always say that the 100% confidence limits range from minus infinity to plus infinity. Confidence limits also get wider as the sample size N becomes smaller, which is self-evident.

Figure 8.20 Cover type categories derived from (a) ground reference data and (b) automatic image classifier. The choice of sample locations (solid or dashed lines in (a)) will influence the outcome of accuracy assessment measures.

Jensen (1986, p. 228) provides a formula for the calculation of the lower confidence limit associated with a classification accuracy value obtained from a sample of N pixels. The formula used to determine the required r% lower confidence limit, given the values of P, Q and N, is:
    s = P − [ z √(PQ/N) + 50/N ]

where z is the (100 − r)/100th point of the standard normal distribution. Thus, if r equals 95% then the z value required will be that which cuts off a probability of (100 − 95)/100, or 0.05, in the upper tail of the standard normal curve. The tabled z value for this point is z = 1.645. If r were 99% then z would be 2.326. To illustrate the procedure assume that, of 480 test pixels, 381 were correctly classified, giving an apparent classification accuracy (P) of 79.375%. Q is therefore (100 − 79.375) = 20.625%. If the lower 95% confidence limit is required then z equals 1.645 and

    s = 79.375 − [ 1.645 √(79.375 x 20.625 / 480) + 50/480 ]
      = 79.375 − [ 1.645 x 1.847 + 0.104 ]
      = 76.23%

This result indicates that, in the long run, 95% of samples with observed accuracies of 79.375% will have true accuracies of 76.23% or greater. As mentioned earlier, the size of the sample influences the width of the confidence limits. If the sample in the above example had been composed of 80 rather than 480 pixels then the lower 95% confidence limit would be

    s = 79.375 − [ 1.645 √(79.375 x 20.625 / 80) + 50/80 ]
      = 71.31%

This procedure can also be applied to individual classes in the same way as described above, with the exception that P is then the percentage of pixels correctly assigned to class j in a test sample of N_j pixels.

The confusion matrix can be used to assess the nature of erroneous labels besides allowing the calculation of classification accuracy. Errors of omission are committed when patterns that are really class i become labelled as members of some other class, whereas errors of commission occur when pixels that are really members of some other class become labelled as members of class i. Table 8.4 shows how these error rates are calculated. From these error rates the user may be able to identify the main sources of classification error and alter his or her strategy appropriately. Congalton et al. (1983), Congalton (1991) and Story and Congalton (1986) give more advanced reviews of this topic.

How to calculate the accuracy of a fuzzy classification might appear to be a difficult topic; refer to Gopal and Woodcock (1994) and Foody and Arora (1996). Burrough and Frank (1996) consider the more general problem of fuzzy geographical boundaries. The question of estimating area from classified remotely-sensed images is discussed by Canters (1997) with reference to fuzzy methods. Dymond (1992) provides a formula to calculate the root-mean-square error of this area estimate for 'hard' classifications (see also Lawrence and Ripple, 1996). Czaplewski (1992) discusses the effect of misclassification on areal estimates derived from remotely-sensed data, and Fitzgerald and Lees (1994) examine classification accuracy of multisource remote sensing data.

The use of single summary statistics to describe the degree of association between the spatial distribution of class labels generated by a classification algorithm and the corresponding distribution of the true (but unknown) ground cover types is rather simplistic. First, these statistics tell us nothing about the spatial pattern of agreement or disagreement. An accuracy level of 50% for a particular class would be achieved if all the test pixels in the upper half of the image were correctly classified and all those in the lower half of the image were incorrectly classified, assuming an equal number of test pixels in both halves of the image. The same degree of accuracy would be computed if the pixels in agreement (and disagreement) were randomly distributed over the image area. Secondly, statements of 'overall accuracy' levels can hide a multitude of sins. For example, a small number of generalised classes will usually be identified more accurately than would a larger number of more specific classes, especially if one of the general classes is 'water'. Thirdly, a number of researchers appear to use the same pixels to train and to test a supervised classification. This practice is illogical and cannot provide much information other than a measure of the 'purity' of the training classes.
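The lower-confidence-limit formula of Jensen (1986) quoted above is easy to check numerically. A minimal Python sketch (the function name is ours; z = 1.645 is the one-sided 95% point of the standard normal distribution):

```python
import math

def lower_confidence_limit(p, n, z=1.645):
    """Lower confidence limit (per cent) on an observed accuracy p (per cent)
    from a sample of n pixels. The 50/n term is a continuity correction."""
    q = 100.0 - p
    return p - (z * math.sqrt(p * q / n) + 50.0 / n)

# Worked examples from the text: 381 of 480 test pixels correct, P = 79.375%
s_480 = lower_confidence_limit(79.375, 480)   # about 76.2%
s_80 = lower_confidence_limit(79.375, 80)     # about 71.3%
```

Reducing the sample from 480 to 80 pixels widens the interval by some five percentage points, which illustrates the earlier remark that confidence limits widen as N becomes smaller.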
More thought should perhaps be given to the use of measures of confidence in pixel labelling. It is more useful and interesting to state that the analyst assigns label x to a pixel, with the probability of correct labelling being y, especially if this information can be presented in quasi-map form. A possible measure might be the relationship between the first and second highest membership grades output by a fuzzy classifier. The use of ground data to test the output from a classifier is, of course, necessary. It is not always sufficient, however, as a description or summary of the value or validity of the classification output.

8.11 SUMMARY

Compared to other chapters of this book, this chapter shows the greatest increase in size relative to the 1987 edition. To some extent this is a reflection of the author's own interests. However, the developments in classification methodology over the past 10 years have been considerable, and the problem has been what to omit. The introduction of artificial neural net classifiers, fuzzy methods, new techniques for computing texture features, and new models of spatial context have all occurred during the past decade. This chapter has hardly scratched the surface, and readers are encouraged to follow up the references provided at various points. I have deliberately avoided providing potted summaries of each paper or book to which reference is made in order to encourage readers to spend some of their time in the library. However, 'learning by doing' is always to be encouraged. The CD supplied with this book contains some programs for image classification. These programs are intended to provide the reader with an easy way into image classification. More elaborate software is required if methods such as artificial neural networks, evidential reasoning and fuzzy classification procedures are to be used. It is important, however, to acquire familiarity with the established methods of image classification before becoming involved in advanced methods and applications.

Despite the efforts of geographers following in the footsteps of Alexander von Humboldt over the past 150 years, we are still a long way from being able to state with any acceptable degree of accuracy the proportion of the Earth's land surface that is occupied by different cover types. At a regional scale, there is a continuing need to observe deforestation and other types of land cover change, and to monitor the extent and productivity of agricultural crops. More reliable, automatic methods of image classification are needed if answers to these problems are to be provided in an efficient manner. New methods are becoming available. The early years of the new millennium will see a very considerable increase in the volumes of Earth observation data being collected from space platforms, and much greater computer power (with intelligent software) will be needed if the maximum value is to be obtained from these data. An integrated approach to geographical data analysis is now being adopted, and this is having a significant effect on the way image classification is performed. The use of non-remotely-sensed data in the image classification process is providing the possibility of greater accuracy, while - in turn - the greater reliability of image-based products is improving the capabilities of environmental GIS, particularly with respect to studies of temporal change. All of these factors will present challenges to the remote sensing and GIS communities, and the focus of research will move away from specialised algorithm development to the search for methods that satisfy user needs and are broader in scope than the statistically based methods of the 1980s, which are still widely used in commercial GIS and image processing packages. If progress is to be made then high-quality interdisciplinary work is needed, involving mathematicians, statisticians, computer scientists and engineers as well as Earth scientists and geographers. The future has never looked brighter for researchers in this fascinating and challenging area.

8.12 QUESTIONS

1. Explain the following terms: labelling, classification, clustering, unsupervised, supervised, pattern, feature, pattern recognition, Euclidean space, per-pixel, per-field, texture, context, divergence, decision rule, spatial autocorrelation, prior probability, neuron, feed-forward, multi-layer perceptron, steepest descent, geostatistics, variogram, image segmentation, GLCM, fractal dimension, kappa.
2. What is meant by the term 'feature space'? How can you measure similarities between points (representing objects to be classified) in a feature space of n dimensions, where n > 3?
3. Compare the operation of the k-means and ISODATA unsupervised classifiers. Use the programs k-means and isodata (described in Appendix B) to carry out two unsupervised classifications of one of the test images on the CD. Summarise your experiences in note form.
4. The parallelepiped, supervised k-means and maximum likelihood classifiers are described as parametric. Explain. These three classifiers use, respectively, the extreme pixel values in each band, the mean pixel