Grand Challenges in Remote Sensing

Grand Challenges in Global Remote Sensing
John Townshend
    The stimulus from Paul Mather
• A man called Hilbert wrote a seminal paper in 1900
  that contained a list of problems that had to be
  overcome if maths was to develop.
• This provided a research focus for mathematicians
  around the world.
• Given the range of uses of RS data and the
  inadequacies of many of the techniques used to
  extract information from that data I suggest that the
  RS community …needs a remote sensing Hilbert to
  write a paper that focuses on land cover extraction
  (from a range of data of different scales and
  coverages, and the use to which this remotely-sensed
  information is put).
• To put it bluntly, would you be willing to write such a
  paper for PIPG (of which I'm an editor)?
Examples of David Hilbert’s 23 problems
          • The continuum hypothesis (that is,
            there is no set whose size is
            strictly between that of the integers
            and that of the real numbers)
          • The Riemann hypothesis (the real
            part of any non-trivial zero of the
            Riemann zeta function is ½) and
            Goldbach's conjecture (every even
            number greater than 2 can be
            written as the sum of two primes)
          • Solve all 7-th degree equations
            using functions of two parameters.
Some hard nuts to crack

•   Making progress, but:
•   Validation is grossly unsatisfactory.
•   Classification issues.
•   Separating emissivity and temperature.
•   Over-fitting
•   Failure to repudiate nonsense
•   Formats
•   Data Policy
•   Research to operations
Making progress
GlobCover (300m product)
    Landsat-5: Atmospheric Correction (Masek et al)

[Figure: 1990s Landsat-5 TM mosaic of the BOREAS study region (bands 3-2-1), shown as TOA reflectance and as surface reflectance; scale bars 100 km.]
[Figure: Landsat surface reflectance (SR) compared with MODIS SR at 500 m or 1 km using R² and RMSD; Landsat forest % also shown. Masek et al.]
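The R² and RMSD statistics named in the Landsat/MODIS comparison can be computed as in the minimal sketch below. It assumes the two products have already been resampled to a common grid and paired pixel by pixel; the function name `r2_and_rmsd` and the reflectance values are hypothetical, and R² is taken here as the squared Pearson correlation, which may differ from the definition used by Masek et al.

```python
import math


def r2_and_rmsd(reference, estimate):
    """Return (R^2, RMSD) for two equally long sequences of paired values.

    R^2 is the squared Pearson correlation coefficient (an assumption);
    RMSD is the root-mean-square difference between the pairs.
    """
    if len(reference) != len(estimate) or len(reference) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(reference)
    mean_x = sum(reference) / n
    mean_y = sum(estimate) / n
    sxx = sum((x - mean_x) ** 2 for x in reference)
    syy = sum((y - mean_y) ** 2 for y in estimate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reference, estimate))
    r2 = (sxy * sxy) / (sxx * syy)
    rmsd = math.sqrt(sum((x - y) ** 2 for x, y in zip(reference, estimate)) / n)
    return r2, rmsd


# Hypothetical red-band surface reflectances for five co-located pixels.
modis_sr = [0.05, 0.10, 0.15, 0.20, 0.25]
landsat_sr = [0.06, 0.11, 0.16, 0.21, 0.26]  # constant +0.01 offset

r2, rmsd = r2_and_rmsd(modis_sr, landsat_sr)
print(f"R2 = {r2:.3f}, RMSD = {rmsd:.3f}")
```

The example is chosen to show why both numbers are reported: a constant bias leaves R² at 1 while RMSD still registers the offset.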
[Figure: LAI distribution around August 12, 2000: MODIS product (A) and processed product (B)]

Fang, H., S. Liang, J. Townshend, R. Dickinson (2008). Spatially and temporally
continuous LAI data sets based on an integrated filtering method: Examples from North
America. Remote Sensing of Environment, 112, 75-93.
                      Monitoring Vegetation Fires in Amazonia
                                  Schroeder et al
Optimizing the combined use of MODIS
    and GOES fire detection data for

1.   Schroeder, W., Prins, E., Giglio, L., Csiszar, I., Schmidt,
     C., Morisette, J., and D. Morton (2008). Validation of
     GOES and MODIS active fire detection products using
     ASTER and ETM+ data. Remote Sensing of Environment,
     112 (5), 2711-2726, doi:10.1016/j.rse.2008.01.005.

2.   Schroeder, W., Csiszar, I., and Morisette, J. (2008).
     Quantifying the impact of cloud obscuration on remote
     sensing of active fires in the Brazilian Amazon. Remote
     Sensing of Environment, 112, 456-470.

3.   Schroeder, W., Morisette, J. T., Csiszar, I., Giglio, L.,
     Morton, D., and Justice, C. (2005). Characterizing
     vegetation fire dynamics in Brazil through multisatellite
     data: Common trends and practical issues. Earth
     Interactions, 9, Paper 13.

4.   Morisette, J.T., Giglio, L., Csiszar, I., Setzer, A.,
     Schroeder, W., Morton, D., and Justice, C. (2005).
     Validation of MODIS active fire detection products derived
     from two algorithms. Earth Interactions, 9, Paper 9.

[Figure: Integrated fire product for Brazilian Amazonia using 2005 MODIS and GOES data showing average number of detection days per year.]
  Making data available through the web in standard formats
   makes an enormous difference
  MODIS Global Fire Maps

  Wildfires in California: MODIS active fire detections superimposed
  with USFS park boundaries, hydrology, roads. User can query for
  fire detection attribute

From: Chris M Mayfield,
“Long time NASA MODIS users, we were unaware of the FIRMS
resource until that mid morning, but now, I can assure you that
FIRMS is very much a part of the NORTHCOM Team in protecting
the homeland.
Again, our many thanks and a very big BRAVO ZULU to all of you on
the FIRMS Team.”
   Davies et al. UMd

[Figure panels: SRTM DEM; ALOS PRISM DSM. Source: Remote Sensing Technology Center of Japan (RESTEC)]

    What is truth?
Operational LC validation framework (Strahler et al., 2006)
• Stages: existing global LC products; primary validation; comparative validation; updated validation/change; validation of new products
• Axis: degree of usability and flexibility
• Supporting elements: legend translations; link to regional datasets; data reprocessing; updated interpretations; interpretation (regional networks)
• Reference database: statistically robust, consistent, harmonized, updated, and accessible
• Design based sample of reference sites
• International consensus on technical issues
• “Best Practices”
        Validation is really hard.

• Scale matters a lot
• Making ground measurements and relating them to
  even 30m or 250m pixels is hard work and
• With too much inherent spatial variability relative to
  pixel size and locational rms errors you never know
  where your ground observations are in relation to the
    – Some areas can not be validated
•   Not to mention MTF/PSF.
•   Timing (or lack of it) is usually also an issue.
•   In rugged terrain we are usually screwed.
•   Validation of change detection is really, really hard.
• We have failed to make the case for validation so that enough
  funds are available!
• Few funds means that validation of all products is inadequate.
      • Stage 1 Validation – Product accuracy has been estimated using a small
        number of independent measurements from selected locations and time
      • Stage 2 Validation – Product accuracy has been assessed by a number of
        independent measurements, at a number of locations or times
        representative of the range of conditions portrayed by the product e.g. EOS
        Land Validation Core Sites, Fluxnet sites, Aeronet sites.
      • Stage 3 Validation - Product accuracy has been assessed by independent
        measurements in a systematic and statistically robust way representing
        global conditions e.g. IGBP DISCover Project – suggest that this be
• For any product can we truthfully give the errors in space and
  time to our own satisfaction?
• Sometimes there are no funds and no validation.
    Does validation allow us to assess value?
“The widely used leaf area products derived from satellite-observed surface
   reflectances contain substantial erratic fluctuations in time due to inadequate
   atmospheric corrections and observational and retrieval uncertainties.
These fluctuations are inconsistent with the seasonal dynamics of leaf area,
   known to be gradual.
Use in process-based terrestrial carbon models corrupts model behavior,
   making diagnosis of model performance difficult.
We propose a data assimilation approach
    Combines the satellite observations of Moderate Resolution Imaging
        Spectroradiometer (MODIS) albedo with a dynamical leaf model.
    Its novelty is that the seasonal cycle of the directly retrieved leaf areas is smooth
        and consistent with both observations and current understandings of processes
        controlling leaf area dynamics.”
Liu et al 2008

The point is that any sort of generic validation might not identify this problem.
We should assess value not in the abstract but in terms of usefulness.

• Classification often does not work well.
  – Many reasons.
  – Some arise because we still don’t know
    how to classify
• Robustness to error in training data.
• Class proportions
      Dealing with training site errors

• Training sets always contain errors
• Can we overcome this problem in classification?
  – Test the classifiers with varying amounts of errors
    introduced into the training set
  – Support Vector Machine (SVM) and Kernel
    Perceptron (KP) outperform Maximum Likelihood,
    Decision Tree, and ARTMAP Neural Network
  – SVM can tolerate as much as 30% error in the training set
• The soft-boundary design of modern SVM
  allows a proportion of errors to exist in the
  training set
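The test described above, introducing varying amounts of error into the training set, can be sketched as follows. This is only an illustration of the label-corruption step, not the procedure used in the study: the function name `corrupt_labels`, the class names, and the choice to flip a fixed fraction of labels uniformly at random are all assumptions. Each corrupted set would then be passed to the classifiers being compared.

```python
import random


def corrupt_labels(labels, error_rate, classes, seed=0):
    """Return a copy of `labels` with a fraction `error_rate` set to a wrong class.

    Hypothetical helper: labels to corrupt are drawn uniformly at random, and
    each is replaced by a different class chosen at random from `classes`.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    rng = random.Random(seed)
    n_flip = round(error_rate * len(labels))
    noisy = list(labels)
    for i in rng.sample(range(len(labels)), n_flip):
        alternatives = [c for c in classes if c != labels[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy


# Hypothetical two-class training set of 100 labelled pixels.
classes = ["forest", "non-forest"]
labels = ["forest"] * 50 + ["non-forest"] * 50

changed_counts = {}
for rate in (0.0, 0.1, 0.2, 0.3):
    noisy = corrupt_labels(labels, rate, classes)
    changed_counts[rate] = sum(a != b for a, b in zip(labels, noisy))
    print(f"error rate {rate:.0%}: {changed_counts[rate]} labels corrupted")
```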
              SVM Robust against subjective errors

[Figure panels: A. Overall condition of the experiment site; B. Change detection result of DT and SVM using 10% corrupted training data; D. Change detection result of DT and SVM using 20% corrupted training data]

[Figure: Error Resistance of Major Machine Learning Algorithms. x-axis: percentage of error in training data (0-50); y-axis: overall accuracy; curves: total accuracy of MLC, ARTMAP NN, DT, SVM, and KP.]
           Early Work on Training Design

• Class proportions impact on a priori probabilities
   –   Identified by Strahler in 1980
   –   Part of the Maximum Likelihood Classifier (MLC) framework
   –   Usage: to multiply with the probability of each pixel
   –   Contribution: Introduced the concept of “Class Prior”
   –   Issue: The concept was not used in training design

• Class proportions in the Population
   –   Identified by Hagner in 2001 and 2005
   –   Estimated using MLC
   –   Usage: to adjust the proportions in the training set for iterative MLC
   –   Contribution: Adaptive training design using “Class Prior”
   –   Issue: It is not MLC that needs training set design. MLC actually is
       largely invariant to training sets of different proportions, as is shown
       in Hagner’s own results.
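The "multiply with the probability of each pixel" usage of the class prior can be illustrated with a one-band Gaussian classifier. This is a minimal sketch under stated assumptions: the class means, standard deviations, priors, and the function names `gaussian_pdf` and `classify` are hypothetical, and a real MLC would use multivariate statistics estimated from training data.

```python
import math


def gaussian_pdf(x, mean, std):
    """Likelihood of value x under a 1-D normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def classify(x, class_stats, priors):
    """Pick the class maximising prior * likelihood for a single pixel value."""
    scores = {
        name: priors[name] * gaussian_pdf(x, mean, std)
        for name, (mean, std) in class_stats.items()
    }
    return max(scores, key=scores.get)


# Hypothetical per-class (mean, std) of one reflectance band.
stats = {"forest": (0.30, 0.05), "non-forest": (0.40, 0.05)}
pixel = 0.36  # slightly closer to the non-forest mean

equal_prior = classify(pixel, stats, {"forest": 0.5, "non-forest": 0.5})
forest_heavy = classify(pixel, stats, {"forest": 0.9, "non-forest": 0.1})
print(equal_prior, forest_heavy)
```

With equal priors the pixel goes to the nearer class; a strong forest prior moves the decision boundary and the same pixel is labelled forest.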
The Over/Under-Estimation Problem (Song et al)
[Figure: The Optimal Configuration of Training Data for SVM-based Forest Change Detection. x-axis: percentage of forest change pixels in the training data (0-100%); y-axis: accuracy (%); curves: user accuracy of forest change, producer accuracy of forest change, total accuracy.]
Modern Algorithms such as SVM are very susceptible to this problem.
But MLC is largely unaffected
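The three curves plotted in these figures (user accuracy, producer accuracy, and total accuracy of forest change) follow from a two-class confusion matrix. The sketch below uses the standard definitions; the function name `change_accuracies` and the pixel counts are hypothetical and are chosen to show an over-estimation case.

```python
def change_accuracies(tp, fp, fn, tn):
    """Return (user, producer, total) accuracy for the change class.

    tp: change pixels mapped as change      fp: no-change pixels mapped as change
    fn: change pixels mapped as no-change   tn: no-change pixels mapped as no-change
    """
    user = tp / (tp + fp)        # low when change is over-estimated (commission)
    producer = tp / (tp + fn)    # low when change is under-estimated (omission)
    total = (tp + tn) / (tp + fp + fn + tn)
    return user, producer, total


# Hypothetical result from a classifier trained with too many change samples:
# most real change is found, but many stable pixels are also flagged.
user, producer, total = change_accuracies(tp=90, fp=60, fn=10, tn=840)
print(f"user {user:.2f}, producer {producer:.2f}, total {total:.2f}")
```

Total accuracy stays high here even though 40% of the mapped change is wrong, which is why the per-class accuracies are plotted alongside it.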
  The Over/Under-Estimation Problem
[Figure: The Optimal Configuration of Training Data for MLC-based Forest Change Detection. x-axis: percentage of forest change pixels in the training data (0-100%); y-axis: accuracy (%); curves: user accuracy of forest change, producer accuracy of forest change, total accuracy.]

• Many methods need the class prior of the population
  to resample the training dataset
• The class prior of the population might be estimated
  through MLC.
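Resampling a training set so that its class proportions match an estimated population prior might look like the sketch below. It assumes the prior is already available (for example from a first MLC pass, as suggested above); the function name `resample_to_prior`, the class names, and drawing with replacement are assumptions rather than the method of any cited study.

```python
import random
from collections import Counter


def resample_to_prior(samples, prior, size, seed=0):
    """Draw about `size` (class, sample) pairs whose class mix follows `prior`.

    samples: dict mapping class name to a list of training samples
    prior:   dict mapping class name to its estimated population proportion
    Sampling is with replacement, so small classes can be up-weighted.
    """
    rng = random.Random(seed)
    total = sum(prior.values())
    resampled = []
    for name, proportion in prior.items():
        count = round(size * proportion / total)
        resampled.extend((name, rng.choice(samples[name])) for _ in range(count))
    rng.shuffle(resampled)
    return resampled


# Hypothetical training pool collected with equal effort per class.
pool = {
    "change": [f"change_px_{i}" for i in range(50)],
    "no-change": [f"stable_px_{i}" for i in range(50)],
}
# Hypothetical population prior: change covers 10% of the scene.
training = resample_to_prior(pool, {"change": 0.1, "no-change": 0.9}, size=200)
class_counts = Counter(name for name, _ in training)
print(class_counts)
```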
Almost impossible to separate surface emissivity
      and temperature accurately (Liang)

Surface leaving radiance is the sum of the surface emitted radiance and
reflected downward atmospheric radiation

$$L = \varepsilon B(T) + (1 - \varepsilon) F_d / \pi$$
Where $\varepsilon$ is surface emissivity, $B(\cdot)$ is the Planck function, and $F_d$ is the downward flux.
For most surfaces, since emissivity is close to 1 the reflected radiance is quite small. Thus

$$L \approx \varepsilon B(T)$$

It is almost impossible to separate two multiplied components, so we cannot determine
emissivity $\varepsilon$ and temperature $T$ accurately.

The alternative solution is to estimate upwelling radiation from thermal IR observations
for initialization/calibration/validation of land surface models.
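The ambiguity in the product of emissivity and the Planck function can be shown numerically: for one observed radiance in one band, every assumed emissivity yields a different but equally consistent temperature. The sketch below is illustrative only; the 11 µm wavelength, the 300 K surface, the emissivity values, and the function names are assumptions, and the reflected downward term is neglected as in the approximation above.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K = 1.380649e-23     # Boltzmann constant, J/K


def planck(wavelength_m, temp_k):
    """Blackbody spectral radiance B(T) in W m^-2 sr^-1 m^-1."""
    a = 2.0 * H * C**2 / wavelength_m**5
    b = H * C / (wavelength_m * K * temp_k)
    return a / math.expm1(b)


def inverse_planck(wavelength_m, radiance):
    """Temperature in K of a blackbody emitting `radiance` at this wavelength."""
    a = 2.0 * H * C**2 / wavelength_m**5
    return H * C / (wavelength_m * K * math.log1p(a / radiance))


wavelength = 11e-6                            # assumed thermal IR band centre
observed = 0.98 * planck(wavelength, 300.0)   # simulated "true" surface

# Every (emissivity, temperature) pair below reproduces the same radiance.
candidates = []
for emissivity in (0.94, 0.96, 0.98, 1.00):
    temperature = inverse_planck(wavelength, observed / emissivity)
    candidates.append((emissivity, temperature))
    print(f"emissivity {emissivity:.2f} -> T = {temperature:.2f} K")
```

A single-band measurement therefore cannot distinguish a warmer, less emissive surface from a cooler, more emissive one without extra information.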
            Some other issues

• The history of remote sensing information extraction
  is largely the history of over-fitting.
   – Those working on identification of spam have a one-shot
     externally organized test.
• Hyper-spectral RS.
   – Something is almost bound to be related to something.
   – How do we begin to move towards standard products?
   – Where is the underlying theory to determine them?
• Disparities in resolution of reanalysis products and
  typical land cover variability.
• Difficulty of getting global biomass at time and space
  resolutions appropriate for REDD and conservation.
  Standing up for what we believe in.

• 159 scientific papers have been found to base their
  conclusions heavily on FRA statistics (Grainger, 2008)
• We know FRA is garbage for land cover change so
  why don’t we say so? This should not be a challenge.
   Land cover and land use change.

• FRA Problems are twofold
• Having to deal with individual countries
• Confusion between land cover and land use
  – “Where part of a forest is cut down but replanted
    (reforestation), or where the forest grows back
    on its own within a relatively short period (natural
    regeneration), there is no change in forest area.”
  – But for those concerned with land cover these
    differences are real
The curious case of Canada in FRA 2005

  • Forest Area 1990          310,134,000 ha.*
  • Forest Area 2000          310,134,000 ha.*
  • Forest Area 2005          310,134,000 ha.

  “Canada reports only productive forest land;
    unproductive forests are classified as “other wooded
    land” even though many of them meet the FAO
    definition of forest land. This results in underreporting
    of more than 170 million hectares, or 40 percent of
    Canadian forest land.” (Matthews 2000).
  Note in FRA 2000 Canada reported only 244,571,000
    hectares for both 1990 and 2000!
            Issues with FRA

• Assuming we are interested in land cover
  and not land use
  – Global rates are wrong (much too low)
  – Changes in rates (by decade and half-decade)
    are wrong (Tropical deforestation rates from 80s
    to 90s supposedly declining when increasing).
  – Inter-continental variations are seriously
    mistaken (South America vs Africa)
  – Considerable inconsistencies between countries.
The importance of formats and
         data policy
How to ensure data are used
   On December 8, 2008, the USGS made the
    entire 36-year long Landsat archive available to
    anyone via the Internet at no cost.
     GeoTIFF format
     Orthorectified “GIS-ready”
     Calibrated across missions and instruments
    Questions for space agencies
• Why don’t you always provide the following:
  – User friendly formats allowing immediate ingestion
    into GIS’s.
  – Standardized meta-data.
  – Rapid response systems.
  – Ortho-rectified data for all resolutions 500m and
  – Atmospherically corrected data
  – Up to date Calibration data
  – Validation data for all products
          Six Problems with RS data policies
1. If people want to use remotely sensed data then they should pay
    –   They already have as citizens. Plus the driving force for most
        environmental remote sensing data is scientific or policy driven.
2. Making data available has an incremental cost.
    –   Resources raised are a tiny fraction of the total cost of the system.
3. There is a commercial future for all environmental remote sensing data.
    –   No evidence for mid and coarser resolution data.
4. Restrictive Data Policy is OK because remote sensing data is
    made available free to scientists.
    –   Why should scientists have preferential access compared with
        those in developing countries alleviating poverty?
5. Principal Investigators need an extended period of exclusive use
    –   Only to make sure the products are characterized so that “health
        warnings” can be attached.
6. Tell us why you want to use the data before we will let you have it
    –   Otherwise known as the “Papa ESA knows best policy”
            GEO Halls of Fame and
             Shame for Agencies

HALL OF FAME
• Free and open data policy
• Data easily accessible on line.
• Community specified formats
• Orthorectified
• Validated data sets

HALL OF SHAME
• Restrictive data policy with charging.
• Not on-line: difficult to order.
• Non-standard agency specified formats
• Not orthorectified
• Unvalidated data sets
           “Valley of death”.
[Report cover: From Research to Operations in Weather Satellites and Numerical Weather Prediction: Crossing the Valley of Death. Board on Atmospheric Sciences and Climate]

The term “Crossing the Valley of Death” is sometimes
used in industry to describe a fundamental challenge
for research and development (R&D) programs. For
technology investments, the transitions from
development to implementation are frequently difficult,
and, if done improperly, these transitions often result
in “skeletons in Death Valley.”
     Successful transitions from R&D
      to operational implementation
• Understanding of the importance (and risks) of the transition,
• Development and maintenance of appropriate transition
• Adequate resource provision,
• Continuous feedback (in both directions) between the R&D
  and operational activities.

“In the case of the atmospheric and climate sciences,
   inadequacies in transition planning and resource commitment
   can seriously inhibit the implementation of good research
   leading to useful societal benefits.” NRC.
Landsat>LDCM and MODIS>VIIRS clearly demonstrate
the enormous difficulties that can occur.
                   Fire (Justice)

• A near-term major challenge for the international
  community will be to develop the best available -
  validated Fire Disturbance ECVs.
• The Grand Challenge will be to secure the satellite
  fire observing system that is needed consisting of
   – 1) operational polar orbiters with appropriate saturation for
     fire characterization,
   – 2) operational global geostationary network with 500m
     resolution 30 minute repeat,
   – 3) operational global Landsat class observations with 3-5
     day repeat
         Who has the responsibility for
          doing things operationally?
• Broad consensus on methods to achieve operational
   – But we must adapt to rapidly changing technologies and data
     availability (Google and radar)
• Need to ensure commitment to:
   –   Supply of remote sensing data
   –   Generation of terrestrial products
   –   Operational validation process
   –   More broadly who will commit to generation of operational products
       such as ECVs?
• Which international body will oversee the work?
   – Who has both the formal responsibility and scientific and technical
   – Can not simply be left to agencies. Agencies are starting to lay claim to
     certain ECVs but with little oversight.
• Urgent need to establish roles and responsibilities.
            GEO and CEOS

• Internationally highly dependent on them.
• But both “best efforts” organizations.
• Much talk about cooperation but concepts
  such as virtual constellations will be very
  difficult without
  – Agreements on data policy
  – Agreements on formats and pre-processing
  – Common portals that work.
• Perhaps the greatest challenge is to get
  these organizations acting in an integrated
  coordinated fashion responding to user
Thank you
Time Series for Amazon Forests
[Figure: time series for 2000-2006; recovered axis labels: Solar, (mm/mo).]
       *Dry seasons are in grey shaded bars.
       The phase-shift between LAI and solar radiation suggests
       rainforests’ adaptation to anticipating more sunlight.

                                                                     Part Four
      Transitioning to operational
•   Get the data policy right
•   Standardization of formats
•   Orthorectification
•   Atmospheric correction
•   Use of improved algorithms

      Training                      Algorithm

• Performance of remote sensing studies in the
  real world largely relies on two factors:
  – 1. How well can algorithms handle unknown errors
  – 2. How to adaptively design the training set so that
    we can balance the overestimation/underestimation
[Figure: Global Agricultural Fires. Monthly fire counts (Jan-Dec); the recovered legend entries are 2002 and 2003. Korontzi et al. 2007]
