Eicken, GEOS 692 - Geoscience Image Processing Applications, Lecture Notes

1. Fundamentals of image processing
(Reading: Castleman, 1996, ch. 1; Gonzalez and Wintz, 1987, ch. 1, ch. 2 - Sections 2.1 to 2.3)

- basic aims of digital image processing and analysis are to (1) reduce the amount of surplus or non-
information, (2) derive an image model (e.g., through classification), (3) parameterize an image (e.g.,
through statistics of size or shape quantifiers for a given object class) for a given scene such as a set of
objects revealed in a sample cross-section (thin section) or a rendition of the earth's surface obtained from a
remote-sensing satellite

- images as generic entities; differences between, e.g., microstructural and remote-sensing applications are
mostly reduced to the specifics of geometric transformations, details of multispectral analysis, etc.

[Fig. 1.1: Schematic of hierarchies and sequences of image-processing tasks (based in part on Jähne, 1995).
Panels recoverable from the figure:
  Image acquisition: 3-D scene; optical image formation; 2-D image f(x,y); grey value / colour digital image f[x,y,c(1..n)]
  Hardware chain: object; optical system (microscope, photolens etc.); optical image; film, scanner, CCD etc.; storage device
  Image processing/enhancement operations: geometric transformation (distortion correction etc.); filtering (spatial transforms); decomposition, multi-scale & multi-grid data structure; results are a pre-processed image (features) and an enhanced/restored image
  Image analysis: segmentation (segmented image); classification (objects; classified image); size & shape analysis (objects)
  Image encoding: transformation & encoding (encoded image)]

1.1. Images

1.1.1. Greyshade image representation

- a definition of an image: 2-dimensional light-intensity function f(x,y), with the value f at point with
coordinates x and y indicating the brightness; for greyshade images f(x,y) is referred to as greylevel or
greyvalue (for colour images see below); for most image-processing applications, f(x,y) can be expressed
as the product f(x,y) = i(x,y) • r(x,y) of an illumination component i and a reflectance component r; in
transmitted light microscopy, r corresponds to an extinction component that depends on absorption and
scattering in the sample

- a greyshade image of G grey levels, consisting of N x N picture elements f(i,j) is then given by
\[
f(x,y) =
\begin{bmatrix}
f(0,0)   & f(0,1) & \cdots & f(0,N-1) \\
f(1,0)   & f(1,1) & \cdots & f(1,N-1) \\
\vdots   & \vdots & \ddots & \vdots   \\
f(N-1,0) & \cdots & \cdots & f(N-1,N-1)
\end{bmatrix}
\]

- for digital images both N and G are mostly considered as powers of 2:

\[ N = 2^{n}, \qquad G = 2^{m} \]
such that the total number of bits b required to represent an image is given by
\[ b = N \times N \times m \]

- the size in Bytes (8 bit each) of images of varying size and greyshade resolution is indicated in the table
(from Gonzalez & Wintz, 1987)
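
The storage relation b = N x N x m can be checked with a few lines of Python. This is a minimal sketch; the function names and the sample values of n and m are illustrative assumptions and are not taken from the Gonzalez & Wintz table.

```python
def image_storage_bits(n_exp: int, m: int) -> int:
    """Bits needed for an N x N image with N = 2**n_exp and G = 2**m grey levels."""
    side = 2 ** n_exp           # N, the number of pixels per row and per column
    return side * side * m      # b = N x N x m


def image_storage_bytes(n_exp: int, m: int) -> int:
    """Same quantity in bytes (8 bits each), rounded up to a whole byte."""
    return (image_storage_bits(n_exp, m) + 7) // 8


# Illustrative sizes (assumed values): N = 32, 256, 512 and m = 1, 8 bits per pixel
for n_exp in (5, 8, 9):
    for m in (1, 8):
        print(2 ** n_exp, 2 ** m, image_storage_bytes(n_exp, m))
```

For example, a 512 x 512 image with 256 grey levels (n = 9, m = 8) requires 262144 bytes.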

1.1.2. Colour image representation

- a multispectral image pixel constituted by n spectral bands can be represented by a vector DNp in colour
space as shown in the Figure below (Schowengerdt, 1997, p. 117); for dedicated multispectral applications
such as in remote sensing, the number and width of spectral bands is dictated by the application and the
sensor employed (discussed in more detail in Section 7 - Multispectral Analysis), whereas for
photographic/camera imaging, models are mostly based on classical colour theory

- classical three-colour theory (dating back to studies in the early 19th century by Young) describes the
perception αi of a colour i by the integral over the product of the colour sensitivity Si(λ) and the spectral
composition of the light C(λ)
\[ \alpha_i(C) = \int_{\lambda_{min}}^{\lambda_{max}} S_i(\lambda)\, C(\lambda)\, d\lambda \]

and it furthermore states that any given perceived colour can be represented by an appropriate mixture of
three primary colours (red, green, blue); as has been shown by Maxwell, Ostwald and others in the latter
half of the 19th century, the human eye distinguishes between red-yellow, green and blue light; for three
linearly independent primary light sources Pk(λ) that are contributing a fraction βk each to the total
luminance, the colour perception is then given by the colour mixing equation
                        3             
           α i (C ) = ∫  ∑ β k Pk (λ ) Si (λ )dλ
                         k =1         
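
The colour mixing equation is linear in the fractions βk, which a numerical sketch makes easy to verify. The Gaussian shapes, centre wavelengths and widths used below for the sensitivities Si(λ) and the primaries Pk(λ) are purely hypothetical stand-ins (they are not measured curves and do not come from the notes); the integral is approximated with a simple midpoint rule.

```python
import math

# Hypothetical centre wavelengths in nm (assumptions for illustration only)
SENSITIVITY_CENTRES = (600.0, 550.0, 450.0)   # S_1, S_2, S_3
PRIMARY_CENTRES = (610.0, 540.0, 460.0)       # P_1, P_2, P_3


def gaussian(lam, centre, width):
    """Unnormalized Gaussian used as a stand-in spectral curve."""
    return math.exp(-0.5 * ((lam - centre) / width) ** 2)


def sensitivity(i, lam):
    """Assumed colour sensitivity S_i(lambda)."""
    return gaussian(lam, SENSITIVITY_CENTRES[i], 40.0)


def primary(k, lam):
    """Assumed primary light source P_k(lambda)."""
    return gaussian(lam, PRIMARY_CENTRES[k], 15.0)


def perception(i, spectrum, lam_min=380.0, lam_max=780.0, steps=400):
    """alpha_i(C): midpoint-rule approximation of the integral of S_i * C."""
    dlam = (lam_max - lam_min) / steps
    total = 0.0
    for j in range(steps):
        lam = lam_min + (j + 0.5) * dlam
        total += sensitivity(i, lam) * spectrum(lam)
    return total * dlam


def mixture(betas):
    """C(lambda) = sum over k of beta_k * P_k(lambda)."""
    return lambda lam: sum(b * primary(k, lam) for k, b in enumerate(betas))


betas = (0.2, 0.5, 0.3)   # assumed luminance fractions
print([round(perception(i, mixture(betas)), 3) for i in range(3)])
```

Because the integral is linear, the response to the mixture equals the β-weighted sum of the responses to the individual primaries, which is what allows a colour to be matched by adjusting three fractions.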

- based on colour theory, any given colour can be represented by a mixture of three primary components at
different saturation levels β (you can try this yourself with the Java-based colour matching program on Brian Lesser's web page); in
principle, the same colour perception can be induced by different proportions of different spectral

Figure from Jain, 1989

- a number of different colour model coordinate systems are in general use (see Table below); while the
Saturation-Hue-Brightness scheme comes closest to the human perception of colour, Red-Green-Blue
systems conform to video norms (corresponding to the spectral characteristics of phosphorescence

- technically (e.g., in colour image processing applications such as the NIH Image clone used in this
course), colour images are commonly represented by three images for the R, G and B signal; as will be
shown later, image transforms can then be performed on the component images which are recombined
based on a colour model (see Table below)
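
The component-image approach can be sketched as follows. This is a minimal illustration using nested Python lists; the function names and the choice of grey-value inversion as the example transform are assumptions, and the sketch does not reproduce how NIH Image or its clones store images internally.

```python
def split_channels(rgb_image):
    """Split an image of (r, g, b) tuples into three greyshade component images."""
    red = [[px[0] for px in row] for row in rgb_image]
    green = [[px[1] for px in row] for row in rgb_image]
    blue = [[px[2] for px in row] for row in rgb_image]
    return red, green, blue


def merge_channels(red, green, blue):
    """Recombine three component images into one image of (r, g, b) tuples."""
    return [list(zip(r_row, g_row, b_row))
            for r_row, g_row, b_row in zip(red, green, blue)]


def invert(component, max_value=255):
    """Example transform applied to a single component image: grey-value inversion."""
    return [[max_value - v for v in row] for row in component]


image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (10, 20, 30)]]

r, g, b = split_channels(image)
inverted = merge_channels(invert(r), invert(g), invert(b))
print(inverted)
```

Any greyshade operation can be substituted for `invert`; the point is only that each component is processed as an ordinary single-band image before recombination.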
                                                                             Jain, 1989, p. 67
1.1.3. Image file formats

- digital image files are typically stored as binary files, with each pixel represented by an 8-bit or 16-bit
(short or long) integer/character that corresponds to the numeric value of the pixel grey value; in
uncompressed, raw image files these are generally written out as a continuous sequence of bytes
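
A headerless raw file of this kind can be sketched with Python's standard `struct` module. The function names are hypothetical, and the little-endian byte order chosen for 16-bit values is an assumption; a raw file carries no header, so width, height, bit depth and byte order must be known from elsewhere.

```python
import io
import struct


def _pixel_format(bits):
    """struct format for one pixel: unsigned 8-bit, or little-endian unsigned 16-bit."""
    if bits == 8:
        return "B"
    if bits == 16:
        return "<H"   # byte order is an assumption; raw files do not record it
    raise ValueError("only 8-bit and 16-bit pixels are handled in this sketch")


def write_raw(stream, image, bits=8):
    """Write the rows of grey values one after another, without any header."""
    fmt = _pixel_format(bits)
    for row in image:
        for value in row:
            stream.write(struct.pack(fmt, value))


def read_raw(stream, width, height, bits=8):
    """Read the image back; the dimensions must be supplied by the caller."""
    fmt = _pixel_format(bits)
    size = struct.calcsize(fmt)
    return [[struct.unpack(fmt, stream.read(size))[0] for _ in range(width)]
            for _ in range(height)]


# In-memory stand-in for a file on disk
buffer = io.BytesIO()
write_raw(buffer, [[0, 128, 255], [1, 2, 3]])
buffer.seek(0)
print(read_raw(buffer, width=3, height=2))
```

The same functions work on a file opened in binary mode (`open(path, "wb")` or `open(path, "rb")`).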

- standardized image formats that are in wider use in the realm of image processing and transfer include the
following (for a more detailed overview of graphics file formats and related information, see:; also
presents examples as well as advantages and drawbacks of different image formats):
          TIFF (Tagged Image File Format): lossless, compression limited to LZW
          JPEG/JFIF (Joint Photographic Experts Group): lossy, high compression factors
          GIF (Graphics Interchange Format): lossless, limited colour depth (indexed LUTs with 256 colours),
          relies on LZW compression technique that is licensed through Unisys (alternative: PNG)
          BMP (Windows Bitmap): lossless, Windows-specific
- for remote-sensing applications, GeoTIFF, a variant of the TIFF format that allows for the inclusion of
geolocation data in the file header, is of interest; furthermore, the Hierarchical Data Format (HDF), which has
been adopted by NASA for EOS data sets, is increasingly used to represent remote-sensing data sets
(detailed information can be found at the NCSA web site)

- colour or multispectral images are typically represented in one of three fashions: band-interleaved-by-
sample (BIS), band-interleaved-by-line (BIL) or band-sequential (BSQ)
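
The three storage orders differ only in the nesting of the loops over band, line and sample. The sketch below assumes the image is held as a nested list indexed as cube[band][row][column]; the function names are illustrative.

```python
def to_bsq(cube):
    """Band-sequential: all of band 1, then all of band 2, and so on."""
    return [v for band in cube for row in band for v in row]


def to_bil(cube):
    """Band-interleaved-by-line: line 1 of every band, then line 2 of every band, ..."""
    n_rows = len(cube[0])
    return [v for r in range(n_rows) for band in cube for v in band[r]]


def to_bis(cube):
    """Band-interleaved-by-sample: all band values of pixel 1, then of pixel 2, ..."""
    n_rows, n_cols = len(cube[0]), len(cube[0][0])
    return [band[r][c] for r in range(n_rows) for c in range(n_cols) for band in cube]


# Two bands of a 2 x 2 image (made-up values)
cube = [[[1, 2], [3, 4]],
        [[10, 20], [30, 40]]]

print(to_bsq(cube))   # [1, 2, 3, 4, 10, 20, 30, 40]
print(to_bil(cube))   # [1, 2, 10, 20, 3, 4, 30, 40]
print(to_bis(cube))   # [1, 10, 2, 20, 3, 30, 4, 40]
```

BIS keeps the full spectral vector of a pixel contiguous, whereas BSQ keeps each band contiguous as a complete greyshade image.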
1.2. Image acquisition and digitization

1.2.1 The human eye
- the human eye as a biological image acquisition system; animal and human eyes are generally optimized for
immediate comparison of features or patterns, motion detection and high resolution; sophisticated adaptation
mechanisms ensure efficiency under a wide range of conditions; visual acuity/resolution in the human eye
is highest in the fovea region, where the cones are concentrated (effective image resolution comparable to
high-performance CCD chips); human and animal eyes have evolved to meet the requirements of the
"real world"; the resulting performance in typical components of image-processing or -analysis tasks is
indicated in the Table below

[Fig. 1.3: Cross-section of human eye, spectral sensitivity and adaptation to variable illumination (figures from Gonzalez & Wintz, 1987, p. 17, and Russ, 1998, p. 13).
Labels recoverable from the figure:
  Cones: total of 6 to 7 x 10^6, one per nerve ending, colour & photopic (bright-light) vision; highest resolution (most cones located in fovea)
  Rods: total of 75 to 150 x 10^6, several per nerve ending, greyshade & scotopic (dim-light) vision; rods distributed throughout retina
  Blind spot
  Typical image projection sizes on retina on the order of a millimeter (at 1 MCone, resolution comparable to high-performance CCDs)]

Table 1.3. Image acquisition and processing: Human vs. machine vision systems (based on Gerhardt,
Rensselaer Polytechnic Institute)

Human eye                                                               Machine vision
• Parallel acquisition/processing                                       • Sequential processing
• Colour processing                                                     • Composite grey scale processing
• Non-uniform distribution of sensor elements                           • Uniform distribution of sensor elements
• Gestalt processing                                                    • Inductive/deductive processing
• Intrinsically 3-D processing                                          • Intrinsically 2-D processing

1.2.2. Camera systems

- except for specialized applications, electronic imaging tubes, which used to dominate industrial and
scientific imaging applications, are progressively being substituted by camera systems with solid-state
imaging devices; analogous to cathode-ray tube (CRT) monitors, electronic imaging tubes construct an
image based on an electron beam sweeping across a target that accumulates charges as a function of the
incoming photoelectrons; the scan geometry and timing underlie current video norms (such as the
National Television System Committee or NTSC specifications) and are hence of importance in
applications that include a video signal transfer somewhere along the line of processing
- as shown in Figure 1.4, a video image of 525 lines is a composite of two interlaced half-images that are
each refreshed at a frequency of 30 Hz (the 60 Hz synchronization for each consecutive half image is
provided by the AC power supply); European systems (such as CCIR) are based on a 50 Hz sync and
provide up to 625 lines at 768 pixels each
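
The composition of a frame from two interlaced half-images can be sketched by treating a frame as a list of scan lines. This is a simplification under stated assumptions: real NTSC fields carry 262.5 lines each, whereas the sketch works with whole lines, and the function names are hypothetical.

```python
def split_fields(frame):
    """Split a frame (list of scan lines) into its two interlaced half-images."""
    first_field = frame[0::2]     # scan lines 0, 2, 4, ...
    second_field = frame[1::2]    # scan lines 1, 3, 5, ...
    return first_field, second_field


def weave_fields(first_field, second_field):
    """Re-interleave two half-images into one full frame."""
    frame = []
    for i, line in enumerate(first_field):
        frame.append(line)
        if i < len(second_field):
            frame.append(second_field[i])
    return frame


frame = list(range(525))          # stand-in for 525 scan lines
first, second = split_fields(frame)
print(len(first), len(second))
```

When a moving scene is digitized from video, the two half-images are acquired at different times, so processing them separately is sometimes preferable to processing the woven frame.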

                                                          Fig. 1.4: Scan lines in NTSC video signal
                                                          (Castleman, 1996).

- solid-state cameras, in particular those based on charge-coupled device (CCD) arrays, dominate today's
commercial and scientific market; they are robust and distortion-free, exhibit near-linear sensitivity and allow for
photon integration over longer time periods
- the CCD consists of a silicon wafer with a metal surface that is positively charged to act as a potential
well for photoelectrons liberated by impinging photons; typically, individual pixels hold on the order of 10^5
to 10^6 photoelectrons; depending on the method of charge transfer, three main CCD types are recognized
(Fig. 1.5); thermal electrons released as a function of temperature result in dark current and determine noise

Fig. 1.5: CCD types distinguished by pixel data transfer mechanism (Castleman, 1996).

- colour CCD cameras can be equipped with a single CCD that is covered by a composite array of three-colour
filters (one per pixel; hence, the effective resolution and coverage may differ for different colours; this is
usually what is employed in low- to medium-grade digital cameras), or they contain three separate
(monochrome) CCDs that are illuminated by a wavelength-discriminating beamsplitter (usually employed
in high-grade cameras for scientific applications); a further type of camera employs filter sets or a single
liquid-crystal colour filter, which results in high-quality colour images for scientific applications but
precludes high sampling rates
1.2.3. Digitization

- the digitization process of, e.g., a video image signal, results in a quantization of both the spatial signal
(through convolution of the image scene and pixelation, which will be discussed in more detail at a later
stage) as well as the greyshade signal according to

\[ DN_p = \mathrm{int}\left[ a\, s(x,y) + b \right] \]
where x, y are the spatial coordinates of pixel p for a raw signal s, a and b correspond to the gain and the
offset (contrast and offset) of the signal and DN is the resulting digital number
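
The greyshade quantization can be sketched directly from the relation DN = int[a s(x,y) + b]. The clipping to the range of an m-bit converter is an added assumption (the relation itself does not state what happens outside that range), and the function names and sample values are illustrative.

```python
def digital_number(signal, gain, offset, bits=8):
    """DN = int[a*s + b], clipped to the range 0 .. 2**bits - 1 (clipping is an assumption)."""
    dn = int(gain * signal + offset)          # int() truncates toward zero
    return max(0, min(dn, 2 ** bits - 1))


def digitize(raw_image, gain, offset, bits=8):
    """Apply the quantization to every sample s(x, y) of a raw signal array."""
    return [[digital_number(s, gain, offset, bits) for s in row] for row in raw_image]


# Raw signal assumed to be normalized to the range 0 .. 1
raw = [[0.0, 0.25, 0.5],
       [0.75, 1.0, 1.2]]
print(digitize(raw, gain=255, offset=0))
```

Values above the converter range (1.2 in the last sample) saturate at the maximum digital number, which is why gain and offset are normally set so that the signal of interest fills, but does not exceed, the available range.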

- analog/digital (A/D) converters (often referred to as frame grabbers) employed to digitize, e.g., a video
signal are one of the bottlenecks of image processing and may introduce a significant error into the image
data, in particular in the case of low-grade digitizers (such as employed in cheaper PC audio-video cards)

1.2.4. Remote-sensing scanning systems
(Reading: Schowengerdt, 1997, ch. 1.4)

- image acquisition for satellite remote-sensing systems generally differs from other applications (such as video or
digital photography) in that an image or scene is acquired by sweeping a linear array of
sensors (or a single sensor) across the sampled surface; depending on the orientation and motion of the
scanner array, different types of such systems are recognized (Fig. 1.6)

Fig. 1.6: Different types of scanning methods in remote sensing systems (Schowengerdt, 1997, p. 19).

- the geometry of the surface sampled by a scanner system is characterized by the Ground Sample Interval
(GSI), Instantaneous Field of View (IFOV) and the Ground-Projected IFOV (GIFOV) that can be described
in terms of the altitude H of the sensor system, its focal length f and detector element width w and the
spatial sampling rate n/i (with n the number of pixels per inter-detector spacing i):
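\[ \mathrm{GSI} = \frac{i}{n}\,\frac{H}{f} \]
\[ \mathrm{IFOV} = 2 \arctan\left(\frac{w}{2f}\right) \cong \frac{w}{f} \]
\[ \mathrm{GIFOV} = 2 H \tan\left(\frac{\mathrm{IFOV}}{2}\right) = w\,\frac{H}{f} \]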
                                                               Fig. 1.7: Geometry and optics of single sensor
                                                               element in scanning remote sensing instrument
                                                               (Schowengerdt, 1997, p. 20).

Characteristics of different VIS Multispectral Scanning Systems

System         GSI (nadir), m    Swath, km    Radiom. resol., bits/channel    No. channels (VIS)
AVHRR          1100              2700         10                              1
MODIS          500/1000          2300         12                              10
Landsat TM     30                185          8                               3
Landsat MSS    80                185          6                               2
SPOT XS        20                60           8                               2
