Computer Representation of
Images
A picture function f(x,y) is a real-valued function
of two variables, having values that are
nonnegative and bounded,
0 ≤ f(x,y) ≤ L-1 for all (x,y)
When a picture is digitized, a sampling process is
used to extract from the picture a discrete set of
samples, and then a quantization process is
applied to these samples
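The two digitization steps can be sketched for a one-dimensional signal. This is a minimal illustration, not a model of any particular scanner or camera: the function names `sample` and `quantize`, the test signal `brightness`, and the choice of 8 samples and 4 levels are all assumptions made for the example.

```python
import math

def sample(f, t_start, t_end, num_samples):
    """Evaluate f at num_samples evenly spaced points in [t_start, t_end)."""
    step = (t_end - t_start) / num_samples
    return [f(t_start + i * step) for i in range(num_samples)]

def quantize(samples, levels, max_value):
    """Map each sample in [0, max_value] to an integer level in 0..levels-1."""
    out = []
    for s in samples:
        q = int(s / max_value * levels)
        out.append(min(q, levels - 1))  # clamp so s == max_value maps to levels-1
    return out

# Hypothetical continuous "picture function" with values in [0, 1].
def brightness(t):
    return 0.5 + 0.5 * math.sin(t)

samples = sample(brightness, 0.0, 2 * math.pi, 8)   # sampling
digital = quantize(samples, levels=4, max_value=1.0)  # quantization, L = 4
print(digital)
```

Fewer samples give a coarser picture in space; fewer levels give a coarser picture in brightness.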
[Figure: Sampling. A sampled function f plotted against t.]
[Figure: Quantization. The samples of f plotted against t, quantized to levels 0 to 3.]
Computer Representation of
Images
The resultant digital picture function f, or digital picture,
can be represented by a two-dimensional array of picture
elements, or pixels
The digital picture function f can be regarded as a
mapping from {0, ... , M-1} × {0, ... , N-1} to {0, ... , L-1}
The set {0, ... , L-1} is called the gray level set, and L is
the number of distinct gray levels
For example, if we use 3 bits to represent each pixel, then L is 8
If 8 bits are used, L is 256.
Computer Representation of
Images
Picture Elements: Pixel
Color,
gray-value images and
binary images (e.g., values 0 for black, 1 for white)
Example
Gray-value images can contain different numbers of
brightness levels:
Computer Representation of
Images
M and N represent the size of the digital picture and are
determined by the coarseness of the sampling
If the sampling interval is too large, the process by which
a digital image was produced may be apparent to human
viewers
This is the problem of undersampling
A digital image may be generated by a scanner, digital
camera, frame grabber, etc.
Computer Representation of
Images
Even if two images have the same number of pixels, their
quality may differ because of differences in how the
images are captured
More expensive digital cameras will have larger digital
sensors than less expensive ones (larger sensors cost
more)
So if the two cameras produce images with the same number of pixels, the pixels in the
larger array will represent a larger area – so more information is packed into each pixel
Sensor Arrays of Differing Size
Image Resolution and Image
Size
Sometimes we use the term “image resolution” to
refer to the size of the image in pixels – this is
imprecise (at best)
The size of the image (and indirectly the image resolution) depends on
the number of pixels per inch (along with size in pixels) associated with
an image (scanner resolutions are typically quoted in ppi – i.e. samples
per inch)
In a printer, where a number of dots may be needed to represent a pixel,
we may use the term dots per inch
To be still more precise, the image resolution (e.g. how well we can
resolve separate lines in an image) depends on the device used to form
the image
Image Resolution Test Pattern
1-bit Images
Each pixel is stored as a single bit (0 or 1), so also
referred to as a binary (or bilevel) image.
Such an image is also called a 1-bit monochrome image
since it contains no color.
Fig. 3.1 shows a 1-bit monochrome image (called “Lena”
by multimedia scientists - this is a standard image used to
illustrate many algorithms).
1-bit image
Grey-scale image
Raster Display of Digital Images
The special-purpose high-speed memory for
storing the image frames is called the frame buffer
The frame buffer is considered a component of
the graphics card, which is used to drive
bit-mapped displays
Bit-mapped Displays
Bit-mapped displays require a considerable amount of
video RAM. Some common sizes are 640x480 (VGA),
800x600 (SVGA), 1024x768 (XGA), and 1280x960.
Each of these has an aspect ratio of 4:3.
To get true color, 8 bits are needed for each of the three
primary colors, or 3 bytes/pixel. Thus, 1024x768 requires
2.3 MB of video RAM.
An alternative to truecolor is hicolor which uses 16 bits
per pixel (5 R, 6 G, 5 B or 5 for each with 1 bit unused)
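The memory figures above follow from simple arithmetic, sketched here. The function names `frame_bytes` and `pack_rgb565` are made up for this example, and the 5-6-5 bit order (red in the high bits) is one common layout, assumed here rather than taken from the slides.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Bytes needed to hold one full frame at the given color depth."""
    return width * height * bits_per_pixel // 8

def pack_rgb565(r, g, b):
    """Pack 8-bit R, G, B into 16 bits: 5 bits R, 6 bits G, 5 bits B."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

print(frame_bytes(1024, 768, 24))  # truecolor, 3 bytes/pixel
print(frame_bytes(1024, 768, 16))  # hicolor, 2 bytes/pixel
print(hex(pack_rgb565(255, 0, 0)))
```

Hicolor drops the low-order bits of each channel, so nearby shades collapse to the same 16-bit value.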
Bit-mapped Displays
To lessen this requirement, some computers have used 8
bits to indicate the desired color. This number is then used
as an index into a hardware table called the color palette
that contains 256 entries, each holding a 24-bit RGB value.
This is called indexed color. It reduces the required RAM
by 2/3, but allows only 256 colors.
This technique is also called pseudocolor
Sometimes each window on the screen has its own
mapping. The palette is changed when a new window gains
focus.
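Indexed color can be sketched as a table lookup. The palette contents below are invented for illustration (a gray ramp with one entry overwritten); a real palette is chosen to suit the image or window being displayed.

```python
# Hypothetical palette: 256 entries, each a 24-bit (R, G, B) tuple.
palette = [(i, i, i) for i in range(256)]
palette[25] = (200, 30, 30)  # made-up entry for the example

def to_rgb(indexed_image, palette):
    """Replace every 8-bit index by the RGB triple stored in the palette."""
    return [[palette[index] for index in row] for row in indexed_image]

indexed = [[0, 25], [25, 255]]   # one byte per pixel
rgb = to_rgb(indexed, palette)   # three bytes per pixel after lookup
print(rgb)
```

Swapping in a different `palette` changes every displayed color without touching the stored pixel indices, which is what happens when a window with its own mapping gains focus.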
Bit-mapped Displays
To display full-screen full-color multimedia on a
1024x768 display requires copying 2.3 MB of data
to the video RAM for every frame. For full-motion
video, 25 frames/sec are needed, for a total data rate
of about 59 MB/sec (1024 × 768 × 3 bytes × 25).
In liquid crystal displays (LCDs) as well as plasma
display panels (PDPs), and digital micromirror
displays (DMDs as in projectors) discrete pixels
are constructed on the display device
CRTs don’t have this characteristic
Bit-mapped Displays
Such a display can be driven digitally at the native
pixel count (one to one correspondence between
framebuffer pixels and display pixels)
When there is a mismatch between framebuffer
size and display size, the graphics system may
resample by primitive means
Drop or replicate pixels
Image quality suffers in this case
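Dropping or replicating pixels amounts to nearest-neighbor resampling. The sketch below is one simple way to do it, assuming the image is a list of rows; the name `resample_nearest` is hypothetical and real graphics hardware may choose source pixels differently.

```python
def resample_nearest(image, new_width, new_height):
    """Resize by picking, for each output pixel, one source pixel.

    Enlarging replicates pixels; shrinking drops them.
    """
    old_height = len(image)
    old_width = len(image[0])
    out = []
    for y in range(new_height):
        src_y = y * old_height // new_height
        row = [image[src_y][x * old_width // new_width] for x in range(new_width)]
        out.append(row)
    return out

small = [[1, 2],
         [3, 4]]
print(resample_nearest(small, 4, 4))
```

No new values are computed, so enlarged images look blocky and reduced images lose whatever detail sat in the dropped pixels.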
Dithering
When an image is printed, the basic strategy of dithering
is used, which trades intensity resolution for spatial
resolution to provide ability to print multi-level images on
2-level (1-bit) printers.
Dithering is used to calculate patterns of dots such that
values from 0 to 255 correspond to patterns that are more
and more filled at darker pixel values, for printing on a
1-bit printer.
Dithering
The main strategy is to replace a pixel value by a larger
pattern, say 2×2 or 4×4, such that the number of printed
dots approximates the varying-sized disks of ink used in
analog halftone printing (e.g., for newspaper photos).
1. Half-tone printing is an analog process that uses
smaller or larger filled circles of black ink to represent
shading, for newspaper printing.
2. For example, if we use a 2×2 dither matrix
Dithering
We can first re-map image values in 0..255 into the new
range 0..4 by (integer) dividing by 256/5. Then, e.g., if
the pixel value is 0 we print nothing, in a 2×2 area of
printer output. But if the pixel value is 4 we print all four
dots.
The rule is:
If the intensity is > the dither matrix entry then print an
on dot at that entry location: replace each pixel by an
n×n matrix of dots.
Note that the image size may be much larger, for a
dithered image, since replacing each pixel by a 4×4 array
of dots makes an image 16 times as large.
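The rule can be sketched as follows. The 2×2 matrix values used here (0, 2 / 3, 1) are an assumption, since the matrix itself is not reproduced in the text; the function name `dither_expand` is also made up. A 1 in the output means "print a dot".

```python
# Assumed 2x2 dither matrix; the slides' own matrix is not shown.
DITHER_2X2 = [[0, 2],
              [3, 1]]

def dither_expand(image, matrix=DITHER_2X2):
    """Replace each 0..255 pixel by an n x n block of 0/1 dots."""
    n = len(matrix)
    levels = n * n + 1  # 5 levels (0..4) for a 2x2 matrix
    out = [[0] * (len(image[0]) * n) for _ in range(len(image) * n)]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            level = value * levels // 256  # integer re-map of 0..255 to 0..n*n
            for i in range(n):
                for j in range(n):
                    if level > matrix[i][j]:
                        out[y * n + i][x * n + j] = 1
    return out

print(dither_expand([[0, 128, 255]]))
```

The output has n times as many rows and columns as the input, which is the size growth noted above.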
Dithering
A clever trick can get around this problem.
Suppose we wish to use a larger, 4×4 dither
matrix, such as
Dithering
An ordered dither consists of turning on the printer
output bit for a pixel if the intensity level is greater than
the particular matrix element just at that pixel position.
Fig. 3.4 (a) shows a grayscale image of “Lena”. The
ordered-dither version is shown as Fig. 3.4 (b), with a
detail of Lena's right eye in Fig. 3.4 (c).
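Ordered dithering keeps the image the same size by comparing each pixel against the single matrix entry at its own position. In this sketch the 4×4 matrix is a commonly used ordered-dither matrix, assumed here because the slides' matrix is not reproduced; `ordered_dither` is a hypothetical name.

```python
# A commonly used 4x4 ordered-dither matrix (assumed, not taken from the text).
DITHER_4X4 = [[0, 8, 2, 10],
              [12, 4, 14, 6],
              [3, 11, 1, 9],
              [15, 7, 13, 5]]

def ordered_dither(image, matrix=DITHER_4X4):
    """Return a 0/1 image of the same size as the 0..255 input."""
    n = len(matrix)
    levels = n * n + 1  # 17 levels (0..16) for a 4x4 matrix
    out = []
    for y, row in enumerate(image):
        out_row = []
        for x, value in enumerate(row):
            level = value * levels // 256
            # Compare against the matrix entry at this pixel's position.
            out_row.append(1 if level > matrix[y % n][x % n] else 0)
        out.append(out_row)
    return out

mid_gray = [[128] * 4 for _ in range(4)]
for row in ordered_dither(mid_gray):
    print(row)
```

A uniform mid-gray region turns on about half of the dots, so the average intensity is preserved while each output pixel needs only one bit.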
Dithering
Picture Operations
Digital images are transformed by means of one or more
picture operations
An operation transforms an image I into I’
An operation can be classified as a local operation if
I’(x,y) depends only on the values of some neighborhood
of the pixel (x,y) in I
If I’(x,y) depends only on the value I(x,y), then the
operation is called a point operation
Operations for which the value of I’(x,y) depends on all
of the pixels of I are called global operations
Picture Operations
An example of a local operation is an averaging
operator which has the effect of blurring an image
The new image I’ is found by replacing each pixel
(x,y) in I by the average of (x,y) and (for
example) the 8 neighbors of (x,y)
Pixels on the edge of the image require special
consideration
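A 3×3 averaging operator might look like the sketch below. The edge policy is an assumption: border pixels are averaged over only the neighbors that exist, which is one of several reasonable choices. The name `box_blur` and the integer division are also choices made for this example.

```python
def box_blur(image):
    """Replace each pixel by the mean of itself and its in-bounds neighbors."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            count = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total += image[ny][nx]
                        count += 1
            out[y][x] = total // count  # integer mean keeps gray levels integral
    return out

spike = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
print(box_blur(spike))
```

The single bright pixel is spread over its neighborhood, which is the blurring effect described above.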
Picture Operations
The gradient operation has the opposite effect - it
sharpens a picture, emphasizing edges
The gradient operator (and other local operators)
can be illustrated by a template
Imagine placing the center of the template on a given pixel (x,y) in I
To compute (x,y) in I’, multiply each neighbor by the corresponding
value in the template, sum the products, and divide by the number of pixels in the template
Picture Operations
Prewitt Operator:
-1  0  1          1  1  1
-1  0  1    or    0  0  0
-1  0  1         -1 -1 -1
Sobel Operator:
-1  0  1          1  2  1
-2  0  2    or    0  0  0
-1  0  1         -1 -2 -1
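Applying the two Sobel templates can be sketched as below. Several choices here are assumptions rather than the slides' method: the responses are combined as |Gx| + |Gy|, no division by the template size is applied, and border pixels are simply left at 0.

```python
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[1, 2, 1],
           [0, 0, 0],
           [-1, -2, -1]]

def apply_template(image, template, x, y):
    """Sum of template entries times the pixels under them, centered on (x, y)."""
    total = 0
    for i in range(3):
        for j in range(3):
            total += template[i][j] * image[y + i - 1][x + j - 1]
    return total

def sobel_magnitude(image):
    """|Gx| + |Gy| at interior pixels; border pixels stay 0 (an assumption)."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = apply_template(image, SOBEL_X, x, y)
            gy = apply_template(image, SOBEL_Y, x, y)
            out[y][x] = abs(gx) + abs(gy)
    return out

# Hypothetical test image with a vertical edge between columns 1 and 2.
edge = [[0, 0, 10, 10] for _ in range(4)]
print(sobel_magnitude(edge))
```

The first template responds to left-to-right changes and the second to top-to-bottom changes, so a vertical edge shows up only in `gx`.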
Sobel Operator
A Greyscale Image
255 235 180 14
245 220 140 10
24 35 25 14
10 8 8 6
Thresholding
Thresholding is an example of a point operation
Given a pixel value, we set the value of (x,y) in I’
to L-1 if (x,y) in I is greater than or equal to the
threshold and to 0 if (x,y) is less than the
threshold
The resulting image has only two values and is
thus a binary image
For example, given a threshold value of 200 the image of the previous
figure becomes the following
Binary Image
255 255 0 0
255 255 0 0
0 0 0 0
0 0 0 0
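The example above can be reproduced in a few lines. The function name `threshold` and the default `max_level=255` (that is, L-1 for an 8-bit image) are choices made for this sketch.

```python
def threshold(image, t, max_level=255):
    """Point operation: L-1 where the pixel is >= t, otherwise 0."""
    return [[max_level if v >= t else 0 for v in row] for row in image]

grey = [[255, 235, 180, 14],
        [245, 220, 140, 10],
        [24, 35, 25, 14],
        [10, 8, 8, 6]]

binary = threshold(grey, 200)
for row in binary:
    print(row)
```

Each output pixel depends only on the input pixel at the same position, which is what makes this a point operation.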
Global Operations
Nonlocal operations are called global operations
Examples include barrel distortion correction,
perspective distortion correction and rotation
Camera optics may cause barrel distortion. We
can correct the distortion if we know a few
control points
Barrel Distortion
[Figure: distortion T maps (x,y) in image f to (x',y') in image g; correction applies the inverse mapping T^-1, found by matching control points.]
Barrel Distortion
The mappings for barrel distortion are:
x’ = a1x + b1y + c1xy + d1
y’ = a2x + b2y + c2xy + d2
If we know four control points, each gives one equation in x’ and one
in y’, so we can solve for a1, b1, c1, d1, and a2, b2, c2, d2
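Solving for the eight coefficients reduces to two 4×4 linear systems that share the same matrix, with one row [x, y, xy, 1] per control point. The sketch below uses plain Gauss-Jordan elimination; the function names and the sample control points are invented, and no check is made for degenerate control points (which would make the system singular).

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_mapping(src_points, dst_points):
    """Return ((a1, b1, c1, d1), (a2, b2, c2, d2)) from four point pairs."""
    rows = [[x, y, x * y, 1.0] for x, y in src_points]
    coeff_x = solve_linear(rows, [xp for xp, _ in dst_points])
    coeff_y = solve_linear(rows, [yp for _, yp in dst_points])
    return tuple(coeff_x), tuple(coeff_y)

# Hypothetical control points: (x, y) positions and where they appear as (x', y').
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
dst = [(1.0, 2.0), (3.0, 2.0), (1.0, 5.0), (3.5, 4.5)]
coeff_x, coeff_y = fit_mapping(src, dst)
print(coeff_x, coeff_y)
```

Once the coefficients are known, the mapping can be evaluated at every pixel to undo the distortion.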
Contrast Enhancement
Contrast generally refers to a difference in grayscale
values in some particular region of an image function
Enhancing the contrast may increase the utility of the image.
Suppose we have a digital image for which the gray
values do not fill the available range. Suppose
our data cover a range (m,M), but that the available range
is (n,N). Then the following linear transformation
expands the values over the available range
g(x,y) = {[f(x,y)-m]/ (M-m)}[N-n]+n
This transformation may be necessary when an image has
been scanned, since the image scanner may not have been
adjusted to use its full dynamic range
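The linear transformation translates directly into code. In this sketch m and M are taken as the smallest and largest values actually present in the image, and results are rounded to integers; both are assumptions, as is the handling of a completely flat image.

```python
def stretch_contrast(image, n, N):
    """Apply g = (f - m) / (M - m) * (N - n) + n to every pixel."""
    m = min(min(row) for row in image)
    M = max(max(row) for row in image)
    if M == m:
        # Flat image: nothing to stretch, map everything to n (an assumption).
        return [[n for _ in row] for row in image]
    return [[round((v - m) / (M - m) * (N - n) + n) for v in row] for row in image]

dull = [[50, 90],
        [150, 50]]
print(stretch_contrast(dull, 0, 255))
```

The darkest input value lands on n and the brightest on N, with everything in between spaced proportionally.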
Contrast Enhancement
For many classes of images, the “ideal”
distribution of gray levels is a uniform
distribution
In general, a uniform distribution of gray levels
makes equal use of each quantization level and
tends to enhance low-contrast information
We can enhance the contrast of an image by
performing histogram equalization
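One common formulation of histogram equalization maps each gray level through the scaled cumulative histogram. The sketch below uses that formulation as an assumption; the slides do not spell out a formula, and the name `equalize` is made up.

```python
def equalize(image, levels=256):
    """Remap gray levels so the cumulative histogram becomes roughly linear."""
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1

    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)  # cumulative count at darkest used level
    if total == cdf_min:
        return [row[:] for row in image]  # only one gray level present

    mapping = [0] * levels
    for level, c in enumerate(cdf):
        if c > 0:
            mapping[level] = round((c - cdf_min) / (total - cdf_min) * (levels - 1))
    return [[mapping[v] for v in row] for row in image]

low_contrast = [[10, 10, 10, 11],
                [12, 12, 13, 13]]
print(equalize(low_contrast))
```

Gray levels that were crowded into 10..13 are spread across the full 0..255 range, with frequently occurring levels pushed further apart.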
Noise Removal
Noise smoothing for “snow” removal in TV
images was one of the first applications of digital
image processing
Certain types of “noise” are characteristic for
pictures
Noise arising from an electronic sensor generally appears as random,
additive errors or “snow”
In other situations, structured noise rather than random noise is present
in an image. Consider for example the scan line pattern of TV images
which may be apparent to viewers
Image Data Types
The most common data types for graphics and image file
formats are 24-bit color and 8-bit color.
Most image formats incorporate some variation of a
compression technique due to the large storage size of
image files. Compression techniques can be classified
into either lossless or lossy.
In a color 24-bit image, each pixel is represented by three
bytes, usually representing RGB.
Many 24-bit color images are actually stored as 32-bit
images, with the extra byte of data for each pixel used to
store an alpha value representing special effect
information (e.g., transparency).
8-Bit Color Images -
Pseudocolor
As stated before, some systems only support 8
bits of color information in producing a screen
image
The idea used in 8-bit color images is to store
only the index, or code value, for each pixel.
Then, e.g., if a pixel stores the value 25, the
meaning is to go to row 25 in a color look-up
table (LUT).
8-Bit Color Images
How to Devise a Color Lookup
Table
The most straightforward way to make 8-bit look-
up color out of 24-bit color would be to divide the
RGB cube into equal slices in each dimension.
The centers of each of the resulting cubes would
serve as the entries in the color LUT, while simply
scaling the RGB ranges 0..255 into the appropriate
ranges would generate the 8-bit codes.
Since humans are more sensitive to R and G than to
B, we could use 3 bits for R and G, and 2 bits for B
Can lead to edge artifacts though
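The 3-3-2 split can be sketched with bit shifts. The bit order inside the byte (R in the high bits, B in the low bits) and the function names are assumptions for this example.

```python
def rgb_to_332(r, g, b):
    """Scale 8-bit R, G, B down to 3, 3 and 2 bits and pack them into one byte."""
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)

def lut_entry_332(code):
    """Center of the RGB sub-cube that an 8-bit code stands for."""
    r_bin = code >> 5
    g_bin = (code >> 2) & 0b111
    b_bin = code & 0b11
    return (r_bin * 32 + 16, g_bin * 32 + 16, b_bin * 64 + 32)

code = rgb_to_332(200, 100, 50)
print(code, lut_entry_332(code))
```

Every color inside the same sub-cube maps to the same code, so smooth gradients can turn into visible bands at sub-cube boundaries.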
How to Devise a CLUT
Median-cut algorithm: A simple alternate
solution that does a better job for this color
reduction problem.
(a) The idea is to sort the R byte values and find
their median; then values smaller than the median
are labelled with a “0” bit and values larger than the
median are labelled with a “1” bit.
(b) This type of scheme will indeed concentrate bits
where they most need to differentiate between high
populations of close colors.
(c) One can most easily visualize finding the
median by using a histogram showing counts at
position 0..255.
(d) Fig. 3.11 shows a histogram of the R byte values
for the forestfire.bmp image along with the median
of these values, shown as a vertical line.
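The first step of the median-cut idea, labelling each pixel by which side of the R median it falls on, might be sketched as follows. Only this single split is shown; how ties at the median are labelled (here as 1) and the name `median_split_bit` are assumptions.

```python
def median_split_bit(values):
    """Return (median, bits): 0 for values below the median, otherwise 1."""
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]
    bits = [0 if v < median else 1 for v in values]
    return median, bits

# Hypothetical R byte values for six pixels.
r_values = [10, 200, 30, 220, 25, 210]
median, bits = median_split_bit(r_values)
print(median, bits)
```

Because the cut is placed at the median rather than at 128, each half holds about the same number of pixels, which is how the scheme spends its bits where the colors are densest.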
Median-Cut Algorithm
Popular Image File Formats
We will look at
GIF
JPEG
TIFF
EXIF
PS
PDF
GIF
GIF standard: (We examine GIF standard because it is
so simple yet contains many common elements.)
Limited to 8-bit (256) color images only, which, while
producing acceptable color images, is best suited for
images with few distinctive colors (e.g., graphics or
drawing).
GIF standard supports interlacing - successive display of
pixels in widely-spaced rows by a 4-pass display process.
GIF actually comes in two flavors:
1. GIF87a: The original specification.
2. GIF89a: The later version. Supports simple animation
via a Graphics Control Extension block in the data,
provides simple control over delay time, a transparency
index, etc.
GIF
Originally developed by CompuServe for platform-independent
image exchange via modem (Unisys held the patent on the LZW
compression it uses)
Compression using the Lempel-Ziv-Welch algorithm
(LZW) slightly modified
Localizes bit patterns which occur repeatedly
Variable bit length coding for repeated bit patterns
Well suited for image sequences (can have multiple
images in a file)
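The core of LZW can be sketched in a few lines. This is the textbook algorithm over bytes, not the GIF variant: GIF's modifications (clear and end-of-information codes, variable-width code packing) are deliberately left out, and `lzw_encode` is a name chosen for the example.

```python
def lzw_encode(data):
    """Textbook LZW: turn a bytes input into a list of integer codes."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate  # keep extending the known pattern
        else:
            codes.append(table[current])
            table[candidate] = next_code  # remember the new, longer pattern
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

print(lzw_encode(b"ABABABA"))
```

Repeated patterns are replaced by single codes, so seven input bytes become four codes here.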
GIF87
File format overview:
GIF87
Screen Descriptor comprises a set of attributes that belong to
every image in the file. According to the GIF87 standard, it is
defined as in Fig. 3.13.
GIF87
Color Map is set up in a very simple fashion as in Fig. 3.14.
However, the actual length of the table equals 2^(pixel+1) as given
in the Screen Descriptor.
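Reading the Screen Descriptor and deriving the color map length could look like the sketch below. The byte layout is stated as an assumption, since Fig. 3.13 is not reproduced here: a 6-byte signature, then little-endian width and height, a packed flags byte whose low 3 bits hold `pixel`, a background index, and one trailing byte. The sample header bytes are built by hand for the example.

```python
import struct

def parse_gif_header(data):
    """Parse the signature and Screen Descriptor from the first 13 bytes."""
    signature = data[:6]
    width, height, packed, background, _ = struct.unpack("<HHBBB", data[6:13])
    pixel = packed & 0b111  # low three bits of the packed byte
    return {
        "signature": signature.decode("ascii"),
        "width": width,
        "height": height,
        "has_global_color_map": bool(packed & 0x80),
        "color_map_entries": 2 ** (pixel + 1),
        "background_index": background,
    }

# Hand-built header for illustration: 640x480, global map flag set, pixel = 7.
header = b"GIF87a" + struct.pack("<HHBBB", 640, 480, 0x80 | 0x70 | 0x07, 0, 0)
info = parse_gif_header(header)
print(info)
```

With `pixel` equal to 7 the color map has 2^8 = 256 entries, the maximum GIF allows.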
GIF87
Each image in the file has its own image descriptor
GIF 87 Interlaced Display Mode
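The order in which rows arrive in an interlaced GIF can be sketched as four passes with different starting rows and spacings. The function name `gif_interlace_order` is made up for this example.

```python
def gif_interlace_order(height):
    """Row indices in the order an interlaced GIF stores them (4 passes)."""
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]  # (first row, row step) per pass
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order

print(gif_interlace_order(8))
```

After the first pass only every eighth row is known, which is enough for a viewer to show a coarse preview while the rest of the file is still arriving.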
JPEG
JPEG: The most important current standard for
image compression.
The human vision system has some specific
limitations and JPEG takes advantage of these to
achieve high rates of compression.
JPEG allows the user to set a desired level of
quality, or compression ratio (input divided by
output).
PNG
PNG format: standing for Portable Network
Graphics: meant to supersede the GIF standard,
and extends it in important ways.
Special features of PNG files include:
1. Support for up to 48 bits of color information - a
large increase.
2. Files may contain gamma-correction information for
correct display of color images, as well as alpha-
channel information for such uses as control of
transparency.
3. The display progressively displays pixels in a
2-dimensional fashion by showing a few pixels at a
time over seven passes through each 8×8 block of
an image.
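The seven-pass scheme (known as Adam7) can be sketched by listing, for each pass, where it starts inside an 8×8 block and how far apart its pixels are. The table below reflects the PNG interlacing pattern as commonly documented; the names `ADAM7` and `adam7_pass_map` are chosen for this example.

```python
# (x start, y start, x step, y step) for passes 1..7 within an 8x8 block.
ADAM7 = [
    (0, 0, 8, 8), (4, 0, 8, 8), (0, 4, 4, 8), (2, 0, 4, 4),
    (0, 2, 2, 4), (1, 0, 2, 2), (0, 1, 1, 2),
]

def adam7_pass_map(size=8):
    """Return a size x size grid giving the pass number that delivers each pixel."""
    grid = [[0] * size for _ in range(size)]
    for number, (x0, y0, dx, dy) in enumerate(ADAM7, start=1):
        for y in range(y0, size, dy):
            for x in range(x0, size, dx):
                grid[y][x] = number
    return grid

for row in adam7_pass_map():
    print(row)
```

Unlike GIF's row-only interlacing, early passes here are sparse in both directions, so a rough version of the whole image appears sooner.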
TIFF
TIFF: stands for Tagged Image File Format.
The support for attachment of additional information
(referred to as “tags”) provides a great deal of flexibility.
1. The most important tag is a format signifier: what type
of compression etc. is in use in the stored image.
2. TIFF can store many different types of image: 1-bit,
grayscale, 8-bit color, 24-bit RGB, etc.
3. TIFF was originally a lossless format but now a new
JPEG tag allows one to opt for JPEG compression.
4. The TIFF format was developed by the Aldus
Corporation in the 1980's and was later supported by
Microsoft.
EXIF
EXIF (Exchangeable Image File) is an image format
for digital cameras:
1. Compressed EXIF files use the baseline JPEG
format.
2. A variety of tags (many more than in TIFF) are
available to facilitate higher quality printing, since
information about the camera and picture-taking
conditions (flash, exposure, light source, white
balance, type of scene, etc.) can be stored and used
by printers for possible color correction algorithms.
3. The EXIF standard also includes specification of
file format for audio that accompanies digital
images. As well, it also supports tags for
information needed for conversion to FlashPix
(initially developed by Kodak).
Postscript
History:
Developed 1984 by Adobe
First time fonts became important to the general public
Functionality:
Integration of high-quality text, graphics and images
Full-fledged programming language
with variables, control structures and files
Vector based, can include bit-mapped graphics
Encapsulated PS for inclusion in other files
Postscript
Postscript Level-1:
Earliest version developed in 1980s
Scalable font concept (in contrast to fixed-size fonts available
until then)
Problem: no patterns available to fill edges of letters resulting in
medium quality
Postscript Level-2:
High-quality pattern filling
Greater number of graphics primitives
Color concept, both device-dependent and device-independent
ASCII files
Follow-up: Adobe’s Portable Document Format (PDF)
LZW compression
Windows Metafile
Microsoft Windows: WMF: the native vector
file format for the Microsoft Windows operating
environment:
1. Consists of a collection of GDI (Graphics Device
Interface) function calls, also native to the Windows
environment.
2. When a WMF file is “played” (typically using the
Windows PlayMetaFile() function) the described
graphics is rendered.
3. WMF files are ostensibly device-independent and
are unlimited in size.