Introduction to Computer Vision and Image Processing
1.1 Overview: Computer Imaging
1.2 Computer Vision
1.3 Image Processing
1.4 Computer Imaging System
1.6 Human Visual Perception
1.7 Image Representation
1.8 Digital Image File Format




                                              1.1
         Overview: computer imaging
Computer imaging – the acquisition and processing of visual
information by computers.
The computer representation of an image requires the equivalent of
many thousands of words of data; one picture truly is worth a
thousand words.
The “receiver” of the visual information: the human visual
system and the computer
 • Computer vision application – the processed (output) images
   are for use by a computer
 • Image processing applications – the output images are for
   human consumption.


                                                            1.2
Overview: computer imaging (cont.)


         Computer Imaging
          ├─ Computer Vision
          └─ Image Processing
                                     1.3
                Computer vision
In computer vision, the images are examined and acted upon by a
computer.
Image analysis – the examination of the image data to
facilitate solving a vision problem, involving
 • Feature extraction – the process of acquiring higher-level
    image information, such as shape or color information
 • Pattern classification – the act of taking this higher-level
    information and identifying objects within the image




                                                              1.4
            Computer vision (cont.)
Applications of computer vision:
 • Recognition – face, fingerprint, optical character
 • Trademark retrieval, image retrieval
 • Medical imaging – diagnose skin tumors, aid neurosurgeons
 • Law enforcement and security – identification of fingerprint,
   DNA analysis, retinal scans, facial scans, veins in the hand
 • Autonomous vehicles, target tracking and identification
 • Weather prediction, planetary observation
 • …


                                                             1.5
                Image processing
In image processing, the images are examined and acted upon
by people, including
 • Image restoration – the process of taking an image with
    some known, or estimated, degradation, and restoring it to
    its original appearance. (see Fig. 1.3-1)
 • Image enhancement – taking an image and improving it
    visually, typically taking advantage of the human visual
    system’s responses. (see Fig. 1.3-2)
 • Image compression – reducing the massive amount of data
    needed to represent an image by taking advantage of the
    redundancy inherent in most images.

                                                                 1.6
           Image processing (cont.)
Applications of image processing:
 • Medical imaging – medical image diagnostic, such as PET
   (Positron Emission Tomography), CT (Computerized
   Tomography), MRI (Magnetic Resonance Imaging)
 • Entertainment – special effects, image editing, creating
   artificial scenes and beings (computer animation)
 • Virtual reality
 • Multimedia communications




                                                              1.7
           Computer imaging systems
The computer imaging system consists of:
 • Hardware –
    » Image acquisition subsystem – camera, scanner, video player
    » Computer system –
        o Frame grabber – hardware that translates a standard analog video
          signal into a digital image; this process is called digitization (see
          Fig. 1.4-2)
            – Sampling – translating the continuous signal into digital form
               at a fixed rate
            – Quantization – translating the voltage of the signal into the
               brightness of the image (see Fig. 1.4-3)
    » Image display subsystem – monitor, printer, film, video recorder
• Image processing software (see Fig. 1.4-4)
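Sampling and quantization can be sketched in a few lines (a hypothetical sine-wave "voltage" and NumPy are assumed; the sample count and the 256 levels are illustrative choices, not from the text):

```python
import numpy as np

# Sampling: measure a continuous "voltage" signal at a fixed rate.
t = np.linspace(0.0, 1.0, 8)                      # 8 samples per unit time
signal = 0.5 * (1.0 + np.sin(2 * np.pi * t))      # continuous values in [0, 1]

# Quantization: map each voltage to one of 256 brightness levels (8 bits).
levels = 256
brightness = np.round(signal * (levels - 1)).astype(np.uint8)
```

More samples capture finer temporal detail; more levels capture finer brightness detail.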
                                                                            1.8
      Computer imaging systems (cont.)
The image brightness at a point depends on both the intrinsic
properties of the object and the lighting conditions in the scene.
An image can be regarded as a two-dimensional array of data, with
each data point referred to as a pixel (picture element).
The digital image can be represented as
    I(r, c) = the brightness of the image at the point (r, c)
where r = row and c = column
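This maps directly onto a 2-D array (a tiny made-up image and NumPy are assumed here):

```python
import numpy as np

# I[r, c] holds the brightness at row r, column c (values are illustrative).
I = np.array([[  0,  64, 128, 255],
              [ 32,  96, 160, 224],
              [ 16,  80, 144, 208]], dtype=np.uint8)

rows, cols = I.shape        # 3 rows, 4 columns
pixel = I[1, 2]             # brightness at row 1, column 2
```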




                                                                 1.9
           Human visual perception
Why study visual perception?
• Compression – compact the data as much as possible but
  still retain all the necessary visual information
• Enhancement – improving the image visually, which is achieved by
  understanding how visual information is perceived




                                                           1.10
             Human visual system
The human visual system has two primary components – the
eye and the brain, which are connected by the optic nerve (see
Fig. 1.6-1)
 • Eye – the image receiving sensor
 • Brain – the information processing unit
The visible light energy corresponds to electromagnetic waves with
wavelengths of about 380 to 825 nanometers (see Fig. 1.6-2 for the
electromagnetic spectrum).
In imaging systems, the spectrum is often divided into various
spectral bands, with each band defined by a range of wavelengths
(or frequencies). For example:
blue (400 ~ 500 nm), green (500 ~ 600 nm), red (600 ~ 700 nm)

                                                           1.11
            Human visual system (cont.)
The eye has two primary types of sensors – rods and cones,
which are distributed across the retina (see Fig. 1.6-3)
 • Cones –
    »   primarily used for daylight vision
    »   sensitive to color
    »   concentrated in the central region of the eye
    »   have a high resolution capability
 • Rods –
    »   primarily used in night vision
    »   see only brightness (not color)
    »   distributed across the retina
    »   have medium- to low-level resolution
 • Blind spot – a place on the retina where no sensors exist
                                                             1.12
         Human visual system (cont.)
The rods see only a spectral band (brightness), and cannot
distinguish color (see Fig. 1.6-4 (a) for the response of rods)
There are three types of cones, each responding to different
wavelengths of light energy; their responses are called the
tristimulus curves because all the colors we perceive are the
combined result of these three sensors' responses (see Fig. 1.6-4 (b)).
About 65% of the cones respond most to wavelengths corresponding
to red and yellow, 33% to green, and 2% to blue.




                                                             1.13
          Spatial frequency resolution
Resolution – the ability to separate two adjacent pixels; if we
can see two adjacent pixels as being separate, then we say that
we can resolve the two.
Spatial frequency – how rapidly the signal changes in space; here
the signal alternates between two brightness values, 0 and maximum
(see Fig. 1.6-5).
The spatial frequency concept must include the distance from the
viewer to the object as part of its definition.
Spatial frequency is therefore defined in terms of cycles per degree,
which provides a relative measure and eliminates the need to state
the distance explicitly (see Fig. 1.6-6).
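As a rough sketch of why cycles per degree absorbs the distance (the helper name and the tan-based formula below are our own, not from the text): a pattern printed at a fixed cycles-per-cm covers more cycles per degree of visual angle the farther away it is.

```python
import math

def cycles_per_degree(cycles_per_cm, distance_cm):
    # One degree of visual angle spans about d * tan(1 deg) cm at distance d.
    span_cm = distance_cm * math.tan(math.radians(1.0))
    return cycles_per_cm * span_cm

near = cycles_per_degree(10, 50)     # a 10 cycles/cm pattern, 50 cm away
far = cycles_per_degree(10, 100)     # twice as far: twice the cycles/degree
```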

                                                             1.14
     Spatial frequency resolution (cont.)
The physical limitations that affect the spatial frequency
response of the visual system are both optical and neural:
 • The spatial resolution is limited by the size of the sensors
   (rods and cones); we cannot resolve things smaller than the
   individual sensors.
 • The primary optical limitation is caused by the lens, which
   limits the amount of light and typically contains imperfections
   that cause distortion in our visual perception.
The spatial resolution is affected by the average (background)
brightness of the display (see Fig. 1.6-7).


                                                               1.15
              Brightness adaptation
The response of the human visual system actually varies based
on the average brightness observed and is limited by the dark
threshold and the glare limit.
Subjective brightness is a logarithmic function of the light
intensity incident on the eye (see Fig. 1.6-8).
Experimentally, we can detect only about 20 changes in
brightness in a small area within a complex image.
For an entire image, due to the brightness adaptation that our
vision system exhibits, about 100 different gray levels are
necessary to create a realistic image.


                                                             1.16
         Brightness adaptation (cont.)
False contouring (bogus lines) – if fewer gray levels are used,
gradually changing light intensity will not be accurately
represented, as seen in Fig. 1.6-9.
Mach band effect – when there is a sudden change in intensity, our
visual system overshoots at the edge, creating a scalloped effect.
This accentuates the edges and helps us distinguish and separate
objects within an image (see Fig. 1.6-10).




                                                              1.17
               Temporal resolution
The temporal resolution deals with how we respond to visual
information as a function of time.
See Fig. 1.6-11 for the temporal contrast sensitivity as a function
of frequency.
Flicker sensitivity – our ability to observe flicker in a video
signal displayed on a monitor.
The human visual system has a temporal cutoff frequency of about
50 hertz (cycles per second).




                                                                  1.18
              Image representation
The digital image I(r, c) is represented as a two-dimensional
array of data, where each pixel value corresponds to the
brightness of the image at the point (r, c).
In linear algebra terms, a two-dimensional array such as I(r, c) is
referred to as a matrix, and one row (or column) is called a
vector.
There are also multiband images (color, multispectral), which can
be modeled by a separate I(r, c) function for each band of
brightness information.




                                                               1.19
         Binary image representation
A binary image takes on two values, typically black and white,
or ‘0’ and ‘1’.
A binary image is referred to as 1 bit/pixel image because it
takes only one binary digit to represent each pixel.
Applications fields of binary images (see Fig. 1.7-1):
 • Optical character recognition (OCR)
 • Robot
 • Facsimile (FAX) image
Binary images are often created from gray-scale images via a
threshold operation – every pixel above the threshold value is
turned white (‘1’) and every pixel below it is turned black (‘0’)
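A minimal sketch of the threshold operation (NumPy and a tiny made-up gray-scale image are assumed; the threshold of 128 is an arbitrary choice):

```python
import numpy as np

gray = np.array([[ 10, 200],
                 [130,  90]], dtype=np.uint8)

threshold = 128
# Pixels above the threshold become white (1), the rest black (0).
binary = (gray > threshold).astype(np.uint8)
```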
                                                            1.20
      Gray-scale image representation
Gray-scale images are referred to as monochrome, or one-color
images – they contain only brightness, no color, information.
A typical image contains 8 bit/pixel data, which can represent
256 (0-255) different brightness (gray) levels –
 • Provides a “noise margin” by allowing for approximately twice as
   many gray levels as the human visual system requires
 • 8 bits make up a byte, the standard small storage unit of
   digital computers
In certain applications, such as medical imaging or astronomy,
12 or 16 bit/pixel representations are used.


                                                           1.21
          Color-image representation
Color images can be modeled as three-band monochrome image
data, where each band of data corresponds to a different color.
Typically, color images are represented as red, green, and blue,
or RGB images:
24 bits/pixel, 8 bits for each of the three color bands
A single pixel’s red, green, and blue values can be referred to as
a color pixel vector – (R, G, B), see Fig. 1.7-2.
In many applications, RGB color information is transformed into a
mathematical space that decouples the brightness information from
the color information, creating a more people-oriented way of
describing colors.
                                                               1.22
    Color-image representation (cont.)
 Additive color system (color of light)
• Primary colors- red (R), green (G), blue (B)
• Secondary colors – magenta (R+B), cyan (G+B), yellow (R+G)




                                                  1.23
     Color-image representation (cont.)
RGB color space: a unit cube with R, G, and B along the axes;
black at the origin, white at the opposite corner, and yellow (R+G),
cyan (G+B), and magenta (R+B) at the remaining corners.
                                                            1.24
     Color-image representation (cont.)
Subtractive color system (color of pigments, colorants)
 • Primary colors- magenta, cyan, yellow
 • Secondary colors- red, green, blue




                                                          1.25
      Color-image representation (cont.)
CMY color space: the same cube with complemented corner labels –
white at the origin and black at the opposite corner. The
conversion from RGB is

$$\begin{bmatrix} C \\ M \\ Y \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} - \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$
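With RGB values normalized to [0, 1], the conversion above is a single complement per component (NumPy assumed; the color below is illustrative):

```python
import numpy as np

rgb = np.array([1.0, 0.0, 0.0])   # pure red
cmy = 1.0 - rgb                   # no cyan, full magenta and yellow
```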
                                                                   1.26
     Color-image representation (cont.)
Characteristics of Colors –
 • brightness - intensity
 • chromaticity - hue and saturation
    • hue - dominant wavelength (color) perceived by human eyes
    • saturation - relative purity or the amount of white light mixed with a
      hue.
• tristimulus values - the amounts of red, green, blue (denoted
  as X, Y, and Z) needed to form any particular color.
• trichromatic coefficients - x, y, z with
  x = X /(X + Y + Z), y = Y/(X + Y + Z), z = Z/(X + Y + Z).
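By construction the three coefficients sum to 1, so only two of them are independent (the function name and the tristimulus values below are made up for illustration):

```python
def trichromatic_coefficients(X, Y, Z):
    # Normalize each tristimulus value by their sum.
    total = X + Y + Z
    return X / total, Y / total, Z / total

x, y, z = trichromatic_coefficients(20.0, 30.0, 50.0)
```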


                                                                         1.27
      Color-image representation (cont.)
HSL (hue/saturation/lightness) color space –
 • Lightness – the brightness of the color
 • Hue – the “color” (e.g., green, blue, orange)
 • Saturation – how much white is in the color (e.g., pink, pure
   red)
See Fig. 1.7-3 for HSL color space.




                                                              1.28
     Color-image representation (cont.)
The spherical coordinate transform (SCT)
 • decouples the brightness information from the color
   information
 • The transform equations are as follows:
      $L = \sqrt{R^2 + G^2 + B^2}$ : the length of the RGB vector
      $\angle A = \cos^{-1}(B/L)$ : the angle from the B-axis to the RG plane
      $\angle B = \cos^{-1}\!\left(\frac{R}{L \sin(\angle A)}\right)$ : the angle between the R- and G-axes
• See Fig. 2.4-5 for SCT transform.
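A sketch of the SCT under these equations (the function name is ours; R, G, B are assumed nonzero so both angles are defined). Scaling a color by a constant changes only L, which is what decoupling brightness from color means here:

```python
import math

def rgb_to_sct(R, G, B):
    L = math.sqrt(R * R + G * G + B * B)              # brightness (vector length)
    angle_a = math.acos(B / L)                        # angle from the B-axis
    angle_b = math.acos(R / (L * math.sin(angle_a)))  # angle within the RG plane
    return L, angle_a, angle_b

L1, a1, b1 = rgb_to_sct(0.2, 0.4, 0.6)
L2, a2, b2 = rgb_to_sct(0.4, 0.8, 1.2)   # same color, twice as bright
```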

                                                                          1.29
      Color-image representation (cont.)
One problem with the color spaces described so far is that they are
not perceptually uniform – two colors in one part of the color
space will not exhibit the same degree of perceptual difference as
two colors in another part, even though they are the same
“distance” apart (see Fig. 1.7-5)
We cannot define a metric to tell us how close, or far apart, two
colors are in terms of human perception.




                                                               1.30
     Color-image representation (cont.)
The CIE (Commission Internationale de l'Éclairage) has defined
internationally recognized standards.
For the RGB space, chromaticity coordinates are defined as:
$$r = \frac{R}{R + G + B}, \quad g = \frac{G}{R + G + B}, \quad b = \frac{B}{R + G + B}$$
The CIE has defined the standard CIE XYZ color space and the
perceptually uniform L*u*v*, L*a*b* color spaces.

                                                              1.31
       Color-image representation (cont.)
CIE chromaticity diagram:




                                            1.32
      Color-image representation (cont.)
Color spaces:
 • RGB model - for hardware (e.g., monitors, video cameras)
 • CMY model- for color printers
 • YUV (luminance and two color-difference signals) – British PAL
   television standard
 • YIQ (luminance, inphase, quadrature) – NTSC TV
   broadcasting
 • YDrDb – French SECAM broadcast system
 • YCrCb – CCIR-601 video standard
 • HSI (hue, saturation, intensity) - color image manipulation
 • HSV (hue, saturation, value) - color image manipulation
                                                             1.33
                       YUV Model
Y – luminance or total illumination
U, V – color difference values
Y = 0.299R + 0.587G + 0.114B
U = 0.493(B – Y)
V = 0.877(R – Y)
RGB → YUV:
$$\begin{bmatrix} Y \\ U \\ V \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.147 & -0.289 & 0.436 \\ 0.615 & -0.515 & -0.100 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$
YUV → RGB:
$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1.140 \\ 1 & -0.395 & -0.581 \\ 1 & 2.032 & 0 \end{bmatrix} \begin{bmatrix} Y \\ U \\ V \end{bmatrix}$$
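The two matrices are (approximate) inverses, which can be checked numerically (NumPy assumed; the color below is arbitrary, and the round trip is only close because the coefficients are rounded):

```python
import numpy as np

rgb_to_yuv = np.array([[ 0.299,  0.587,  0.114],
                       [-0.147, -0.289,  0.436],
                       [ 0.615, -0.515, -0.100]])
yuv_to_rgb = np.array([[1.0,  0.0,    1.140],
                       [1.0, -0.395, -0.581],
                       [1.0,  2.032,  0.0  ]])

rgb = np.array([0.5, 0.25, 0.75])
yuv = rgb_to_yuv @ rgb          # forward transform
back = yuv_to_rgb @ yuv         # close to the original color
```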

                                                                      1.34
                    YIQ Model
The luminance component (Y) and color information (I, Q) are
decoupled.
Useful in color TV broadcasting - for transmission efficiency
and maintaining compatibility with monochrome TV.
The human visual system is more sensitive to changes in luminance
than to changes in hue or saturation.




                                                           1.35
                 YIQ Model (cont.)
Y – luminance       Y = 0.299R + 0.587G + 0.114B
I – in-phase        I = V cos 33° − U sin 33°
Q – quadrature      Q = V sin 33° + U cos 33°
RGB → YIQ:
$$\begin{bmatrix} Y \\ I \\ Q \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ 0.596 & -0.274 & -0.322 \\ 0.211 & -0.522 & 0.311 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$
YIQ → RGB:
$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1.0 & 0.956 & 0.621 \\ 1.0 & -0.272 & -0.649 \\ 1.0 & -1.106 & 1.703 \end{bmatrix} \begin{bmatrix} Y \\ I \\ Q \end{bmatrix}$$
                                                   1.36
                    YDrDb Model
Y – luminance
Dr – difference in red       Dr = −1.902(R − Y)
Db – difference in blue      Db = 1.505(B − Y)
RGB → YDrDb:
$$\begin{bmatrix} Y \\ D_r \\ D_b \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -1.333 & 1.116 & 0.217 \\ -0.450 & -0.883 & 1.333 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$




                                                          1.37
                       YCrCb Model
Same as YUV but with different scale factors for the
color-difference components, chosen to allow integer math.
Y = 0.299R + 0.587G + 0.114B
Cr = 0.713(R – Y)
Cb = 0.564(B – Y)
RGB → YCrCb:
$$\begin{bmatrix} Y \\ C_r \\ C_b \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ 0.500 & -0.419 & -0.081 \\ -0.169 & -0.331 & 0.500 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$
YCrCb → RGB:
$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1 & 1.402 & 0 \\ 1 & -0.714 & -0.344 \\ 1 & 0 & 1.772 \end{bmatrix} \begin{bmatrix} Y \\ C_r \\ C_b \end{bmatrix}$$


                                                                         1.38
                    HSV Model
HSV - hue, saturation, and value




                          H : the angle around the vertical axis
                          S : the radial distance from the vertical axis
                          V : the height along the central axis




                                                                   1.39
                HSV Model (cont.)
$$V = Y / 256$$
$$S = \sqrt{(C_r - 128)^2 + (C_b - 128)^2} \,/\, 128$$
$$H = \sin^{-1}\!\left(\frac{C_r - 128}{128\,S}\right)$$
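A direct transcription of these three equations (the function name is ours; Y, Cr, Cb are assumed to be 8-bit values, and H is left at 0 when S = 0, where it is undefined):

```python
import math

def ycrcb_to_hsv(Y, Cr, Cb):
    V = Y / 256.0
    S = math.sqrt((Cr - 128) ** 2 + (Cb - 128) ** 2) / 128.0
    # H is undefined when S = 0 (gray); use 0 as a placeholder.
    H = math.asin((Cr - 128) / (128.0 * S)) if S > 0 else 0.0
    return H, S, V

H, S, V = ycrcb_to_hsv(128, 192, 128)   # offset in Cr only
```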




                                        1.40
                      HSI Model




H : the angle of the color vector with respect to the red axis
S : the distance from the point to the center of the triangle
I : measured along a line perpendicular to the triangle and passing
through its center
                                                               1.41
                 HSI Model (cont.)
RGB → HSI
$$I = \frac{R + G + B}{3}$$
$$S = 1 - \frac{3 \min(R, G, B)}{R + G + B}$$
$$H = \cos^{-1}\left\{ \frac{\frac{1}{2}[(R - G) + (R - B)]}{[(R - G)^2 + (R - B)(G - B)]^{1/2}} \right\}$$
  Note: 1. If B > G then H = 360° − H; finally let H = H / 360°
        2. If S = 0, H is undefined
        3. If I = 0, S is undefined
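A sketch of the forward conversion (the function name is ours; R, G, B are in [0, 1] and are assumed not all equal and not all zero, so S and H are defined):

```python
import math

def rgb_to_hsi(R, G, B):
    I = (R + G + B) / 3.0
    S = 1.0 - 3.0 * min(R, G, B) / (R + G + B)
    num = 0.5 * ((R - G) + (R - B))
    den = math.sqrt((R - G) ** 2 + (R - B) * (G - B))
    H = math.degrees(math.acos(num / den))
    if B > G:
        H = 360.0 - H
    return H / 360.0, S, I      # H normalized to [0, 1] as in note 1

H, S, I = rgb_to_hsi(1.0, 0.0, 0.0)   # pure red: H = 0, S = 1, I = 1/3
```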
                                                           1.42
                   HSI Model (cont.)
HSI → RGB
(1) RG sector (0° < H ≤ 120°):
$$b = \tfrac{1}{3}(1 - S), \quad r = \tfrac{1}{3}\left[1 + \frac{S \cos H}{\cos(60° - H)}\right], \quad g = 1 - (r + b)$$
    R = 3Ir, G = 3Ig, B = 3Ib




                                       1.43
                        HSI Model (cont.)
(2) GB sector (120° < H ≤ 240°): let H = H − 120°
$$r = \tfrac{1}{3}(1 - S), \quad g = \tfrac{1}{3}\left[1 + \frac{S \cos H}{\cos(60° - H)}\right], \quad b = 1 - (r + g)$$
(3) BR sector (240° < H ≤ 360°): let H = H − 240°
$$g = \tfrac{1}{3}(1 - S), \quad b = \tfrac{1}{3}\left[1 + \frac{S \cos H}{\cos(60° - H)}\right], \quad r = 1 - (g + b)$$
In both sectors, R = 3Ir, G = 3Ig, B = 3Ib.




                                                                  1.44
      Multispectral image representation
Multispectral images contain information outside the normal
human perceptual range, including infrared, ultraviolet, X-ray,
acoustic, or radar data.
Multispectral imaging systems include satellite systems, underwater
sonar systems, airborne radar, infrared imaging systems, and
medical diagnostic imaging systems.




                                                              1.45
              Digital image file format
Image data is divided into two primary categories:
 • Bitmap images (raster images) –
    » can be represented by the image model I(r, c)
    » the brightness value of each pixel is stored in some file format
    » e.g., BIN, PPM, TIFF, GIF (see Fig. 1.8-1)
• Vector images –
    » representing lines, curves, and shapes by storing only the key points
    » the key points are sufficient to define the shapes
    » rendering – the process of turning the key points into an image




                                                                          1.46
        Digital image file format (cont.)
Data visualization – the process of representing data as an
image
Remapping – the process of defining an equation that translates the
original data to the output data range, typically 0 to 255 for an
8-bit display (see Fig. 1.8-2)
 • Linear mapping
 • Logarithmic mapping
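Both remappings can be sketched as follows (the function names are ours; NumPy is assumed, and the input data is arbitrary):

```python
import numpy as np

def linear_remap(data):
    # Stretch the data range linearly onto 0..255.
    data = np.asarray(data, dtype=np.float64)
    lo, hi = data.min(), data.max()
    return np.round(255 * (data - lo) / (hi - lo)).astype(np.uint8)

def log_remap(data):
    # Compress a large dynamic range before scaling onto 0..255.
    data = np.asarray(data, dtype=np.float64)
    shifted = np.log1p(data - data.min())
    return np.round(255 * shifted / shifted.max()).astype(np.uint8)

out = linear_remap([0.0, 500.0, 1000.0])
```

The logarithmic version boosts dim values relative to bright ones, which is useful when the raw data spans several orders of magnitude (e.g., astronomy).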




                                                                1.47

				