AN ELECTRONIC EYE FOR THE VISUALLY IMPAIRED
(NANOTECHNOLOGY)
ABSTRACT

This paper proposes a method for detecting frontal pedestrian crossings from image data obtained with a single camera, as a travel aid for the visually challenged. The camera would be mounted on a pair of glasses and would be capable of detecting the existence and location of a pedestrian crossing, measuring the width of the road, and detecting the color of the traffic lights. The process of detecting a crossing is a pre-process, followed by the process for detecting the state of the traffic lights. It is important for the visually challenged to know whether or not a frontal area is a crossing. The existence of a crossing is detected in two steps. In the first step, edge detection and pattern detection are employed to identify the crossing. In the second step, the existence of a crossing is confirmed by checking the periodicity of the white lines on the road using projective invariants. Then the traffic light detector is used to check the pedestrian light and the time display. The calculated time is then compared with the average time needed for a blind person to cross. The observations are relayed through voice signals using the voice vision technology. Thus, this technology aids mobility for the visually impaired throughout the globe.

INDEX

Introduction
An overview of our Electronic Eye
Functioning of the system
Image Analyzer
Calculation of width of road and time required to cross it
Traffic light detector
Timing unit
Voice speech system
Functional block
Conclusion

INTRODUCTION

Blindness is among the most feared of all human ailments. Crossing busy roads can be a challenge even for people with good vision. For blind people, it is a perilous activity. Our electronic eye aims at helping millions of blind and visually impaired people lead more independent lives.

The electronic eye can be adapted to help the blind or visually impaired get around without a walking stick or seeing-eye dog. Canes and other travel aids with sonar or lasers can alert the user to approaching objects. Global Positioning Systems can tell what streets, restaurants, parks and other landmarks the user is passing. Devices like these are very good at giving locations and directions. But the limitations of GPS technology mean that they cannot pin down the location of a curb or crosswalk, and they frequently fail in areas that have many tall buildings and high traffic. None of these devices is able to specifically identify a crosswalk, nor do they have the potential for figuring out the state of the traffic signals.

An effective navigation system would improve the mobility of millions of blind people all over the world. Our new “eye” is intended to let blind people cross busy roads safely. Our “electronic eye”, which would be mounted on a pair of glasses, will be capable of detecting the existence and location of a pedestrian crossing, and at the same time measuring the width of the road to the nearest step and detecting the color of the traffic lights.
AN OVERVIEW OF OUR ELECTRONIC EYE
[Figure labels: a camera to capture the image of crossroads and traffic signals; a voice speech generator, used to instruct the user]

We have developed a system that is able to detect the existence of a pedestrian crossing in front of a blind person using a single camera. By measuring the width of the road and the color of traffic lights, this single camera can now give the blind all the information they need to cross a road in safety. The camera would be mounted at eye level and be connected to a tiny computer. It will relay information using a voice speech system and give vocal commands and information through a small speaker placed near the ear.
FUNCTIONING OF THE SYSTEM
[Block diagram: CAMERA, IMAGE ANALYSER, CROSS ROAD DETECTOR (output 1), TRAFFIC LIGHT DETECTOR (output 2), TIMING UNIT (output 3), VOICE SPEECH GENERATOR (inputs 1, 2, 3), TO USER'S EAR]

1 - Tells the user whether any cross road is present.
2 - Tells the user whether the traffic signal is favorable or not.
3 - Tells the user the time taken to cross the road.
The crosswalks commonly used in India are known as zebra crossings. They feature a series of thick white bands that run in the same direction as the vehicle traffic.

To detect the presence of a zebra crossing we use the “projective invariant”, which takes the distance between the white lines and a set of collinear points on the edges of the white lines. This gives an accurate way of detecting whether a crossing is present in a given image or not.

The length of a pedestrian crossing is measured by projective geometry. The camera makes an image of the white lines painted on the road, and then the actual distances are determined using the properties of geometric shapes as seen in the image.

The traffic light detector checks images for symmetrical shapes and compares them to a list of road signs. If the pedestrian light is ON, the voice speech system instructs the user to cross the road.

The timer unit calculates the average time required by the visually challenged person to cross the road and ‘tells’ it to the user via the voice speech system.

High-level scene interpretation applied to the processed images will produce a symbolic description of the scene. The symbolic description is then converted into verbal instructions appropriate to the needs of the user by using voice speech software.

IMAGE ANALYSER

The image analyzer contains the bitmap image, which has to be processed to detect the presence of a zebra crossing. Given an X-bit per pixel image, slicing the image at different planes (bit-planes) plays an important role in image processing.

EDGE DETECTION

One way to detect edges or variations within a region of an image is by using the gradient operator. There are several well-known gradient filters. In this experiment we use the Sobel gradients, which are obtained by convolving the image with two kernels, one for each direction.

CROSSROAD PATTERN DETECTION

The zebra crossing has alternate white bands running across the width of the road. This pattern has to be recognized to confirm the presence of a crossing. To detect basic shapes within the image, we make use of the Hough transform. At its simplest, the Hough transform can be used to detect straight lines from edges detected in an earlier processing step.

If the pixels detected fall on a straight line, then they can be expressed by the equation:
y = mx + c
The basis of the Hough transform is to translate the points in (x, y) space into (m, c) space using the equation:
c = (-x)m + y
Thus each point in (x, y) space (i.e. the image) represents a line in (m, c) space. Where three or more of these lines intersect, a value can be found for the gradient (m) and intercept (c) of the line that connects the (x, y) space points.
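The two steps described here, Sobel edge detection followed by Hough voting in (m, c) space, can be sketched as below. This is a minimal illustration in plain Python, not the system's implementation: the function names, the threshold, and the discrete list of candidate slopes are assumptions made for the example, and the image is assumed to be a list of rows of gray values.

```python
import math
from collections import Counter

# Sobel kernels: SOBEL_X responds to horizontal change, SOBEL_Y to vertical change.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(image, threshold):
    """Return (x, y) pixels whose Sobel gradient magnitude exceeds threshold."""
    height, width = len(image), len(image[0])
    edges = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            if math.hypot(gx, gy) > threshold:
                edges.append((x, y))
    return edges


def hough_line(points, slopes):
    """Vote in (m, c) space: each point (x, y) votes along c = (-x)m + y.

    Returns (m, c, votes) for the most-voted cell, or None if there are no points.
    """
    votes = Counter()
    for x, y in points:
        for m in slopes:
            c = round(-x * m + y, 6)   # rounding merges near-identical intercepts
            votes[(m, c)] += 1
    if not votes:
        return None
    (m, c), count = votes.most_common(1)[0]
    return m, c, count


# Four points on y = 2x + 1 all vote for the cell (m=2, c=1).
print(hough_line([(0, 1), (1, 3), (2, 5), (3, 7)], [-2, -1, 0, 1, 2]))
```

The (m, c) form used in the text cannot represent vertical lines, since their slope is infinite; a practical detector would need to handle that case separately.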
CALCULATION OF THE WIDTH OF THE ROAD AND TIME REQUIRED TO CROSS IT

Calculation of the width of the road is based on the concept of projective invariants. This requires us to define the term Cross Ratio.

[Figure: four collinear points P1, P2, P3, P4 and four lines L1, L2, L3, L4]
The cross ratio can be defined for the four collinear points as
(P1, P2, P3, P4) = (P1P3/P2P3)/(P1P4/P2P4)
where PiPj is the distance from Pi to Pj. The cross ratio of the four lines is given by
(L1, L2, L3, L4) = (sin θ13/sin θ23)/(sin θ14/sin θ24)
where θij is the angle between Li and Lj.

[Figure: lines L1, L2, L3, L4 with the points P1, P2, P3, P4 lying on them]

In the above figure, lines are constructed from the collinear points, and in the adjacent figure a line is formed by joining the points in the lines L1, L2, L3, L4. A useful fact is that the cross-ratio of the original four points is equal to the cross-ratio of the constructed lines.
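The invariance of the cross ratio is what makes it usable on a camera image: equally spaced stripe edges on the road keep the same cross ratio after perspective projection. The sketch below illustrates this numerically. The `project` function and its coefficients are hypothetical stand-ins for the camera, chosen only for the example.

```python
def cross_ratio(p1, p2, p3, p4):
    """Cross ratio (P1P3/P2P3)/(P1P4/P2P4) of four collinear points.

    Each point is given by its signed position along the common line.
    """
    return ((p3 - p1) / (p3 - p2)) / ((p4 - p1) / (p4 - p2))


def project(t, a, b, c, d):
    """A 1-D projective map t -> (a*t + b)/(c*t + d), with a*d - b*c != 0."""
    return (a * t + b) / (c * t + d)


road = [0.0, 1.0, 2.0, 3.0]                       # equally spaced stripe edges
image = [project(t, 2.0, 1.0, 0.5, 1.0) for t in road]

print(cross_ratio(*road))    # 4/3 for equally spaced points
print(cross_ratio(*image))   # the same value, up to rounding
```

Because four equally spaced points always give 4/3 under this definition, consecutive groups of four stripe edges along the virtual line can be tested against that value to check for the periodic pattern of a crossing.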
To detect the presence of a zebra crossing we use the “projective invariant”, which takes the distance between the white lines and a set of collinear points on the edges of the white lines. The system effectively draws a virtual line out into the road. If a crosswalk is present, the edges of the painted white lines will form a predictable series of points along the virtual line.

Let M and N be two distinct points of the projective space. Here we take points M and N as the points on the edges of the line formed on the image. The projective line between M and N consists of all points A of the form
A = λM + μN
Here (λ, μ) are the coordinates of A in the 2D linear subspace spanned by the coordinate vectors M and N. N is represented by (0, 1) and serves as the “origin”, and M is represented by (1, 0) and serves as the “point at infinity”. For an arbitrary point (λ, μ) ≠ (1, 0) we can rescale (λ, μ) to μ = 1 and represent A by its “affine coordinates” (λ, 1), or just λ for short. Since we have mapped M to infinity, this is just linear distance along the line from N.

The time required to cross the road is calculated on the assumption that the user covers a distance of one foot in a second on average. Generally, the time taken, T, to cross the road is given by
T = (calculated width of the road) / (distance covered by the user in one second)
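As a worked instance of the formula for T, the short sketch below divides the road width by the distance covered per second. The function name and the default speed of one foot per second are assumptions for illustration; the speed is a parameter so that it can be set per user.

```python
def crossing_time(road_width_ft, speed_ft_per_s=1.0):
    """T = calculated width of the road / distance covered by the user in one second."""
    if speed_ft_per_s <= 0:
        raise ValueError("walking speed must be positive")
    return road_width_ft / speed_ft_per_s


print(crossing_time(30.0))        # 30.0 seconds for a 30-foot road
print(crossing_time(30.0, 2.0))   # 15.0 seconds at two feet per second
```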
TRAFFIC LIGHT DETECTOR

The function of the traffic light detector is to recognize whether the pedestrian light is on for the user to cross the road. If the user happens to reach the road when the pedestrian light is already on, the time indicated by the timer display in the traffic light must be detected and compared with the time required by the user to cross the road. If the user can cross the road safely, the voice speech system will instruct the user to cross the road. This process can be done effectively by having an image database in the system and comparing it with the image obtained from the camera, to detect whether the pedestrian light is on and to detect the time left to cross the road.

We have a large number of images and wish to select some of them, which are similar to a certain image (for example, the image of the pedestrian light). So we need a content-based image database system, which accepts an image as its input and retrieves all similar images by using image properties such as color, texture, shape and keywords.

Every image is processed to recover the boundary contour, which is then represented by three global shape parameters and the maxima of the curvature zero-crossing contours in its Curvature Scale Space image.

CURVATURE SCALE SPACE COMPUTATION AND MATCHING

The CSS image is a multi-scale organization of the inflection points (or curvature zero-crossing points) of the contour as it evolves. Intuitively, curvature is a local measure of how fast a planar contour is turning. Contour evolution is achieved by first parametrizing the contour using arc length. This involves sampling the contour at equal intervals and recording the 2-D coordinates of each sampled point. The result is a set of two coordinate functions (of arc length), which are then convolved with a Gaussian filter of increasing width, or standard deviation. The next step is to compute curvature on each smoothed contour.

As a result, curvature zero-crossing points can be recovered and mapped to the CSS image, in which the horizontal axis represents the arc length parameter on the original contour and the vertical axis represents the standard deviation of the Gaussian filter.

The features recovered from a CSS image for matching are the maxima of its zero-crossing contours. The matching of two CSS images consists of finding the optimal horizontal shift of the maxima in one of the CSS images that would yield the best possible overlap with the maxima of the other CSS image. The matching cost is then defined as the sum of pairwise distances (in CSS) between corresponding pairs of maxima.

So, if an image of a pedestrian light in the image database finds a match with an image from the camera, the pedestrian can cross the road. The time in seconds left to cross the road is also detected, by matching the timer display against the images of numbers in the database.
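The CSS computation described above (Gaussian smoothing of the two coordinate functions at increasing widths, then locating curvature zero-crossings) can be sketched as follows. This is an illustrative outline only: it assumes a closed contour already sampled at roughly equal arc-length intervals, all function names are hypothetical, and the matching of maxima between two CSS images is not covered.

```python
import math


def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian, truncated at about three standard deviations."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_closed(values, sigma):
    """Circular convolution of one coordinate function with a Gaussian."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(values)
    return [sum(kernel[k + radius] * values[(i + k) % n]
                for k in range(-radius, radius + 1))
            for i in range(n)]


def curvature_zero_crossings(xs, ys, sigma):
    """Sample indices where the curvature of the smoothed contour changes sign."""
    sx, sy = smooth_closed(xs, sigma), smooth_closed(ys, sigma)
    n = len(sx)
    numerators = []
    for i in range(n):
        nxt = (i + 1) % n
        dx = (sx[nxt] - sx[i - 1]) / 2            # central first differences
        dy = (sy[nxt] - sy[i - 1]) / 2
        ddx = sx[nxt] - 2 * sx[i] + sx[i - 1]     # second differences
        ddy = sy[nxt] - 2 * sy[i] + sy[i - 1]
        # Curvature has the sign of x'y'' - y'x''; only the sign is needed here.
        numerators.append(dx * ddy - dy * ddx)
    return [i for i in range(n) if numerators[i - 1] * numerators[i] < 0]


def css_points(xs, ys, sigmas):
    """(arc-length index, sigma) pairs: the points that make up a CSS image."""
    return [(i, s) for s in sigmas for i in curvature_zero_crossings(xs, ys, s)]
```

A contour with concavities yields zero-crossings at small sigma that disappear as sigma grows and the contour becomes convex; the sigma values at which pairs of zero-crossings merge are the maxima used for matching.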
[Figure: OBTAINED IMAGE, IMAGE DATABASE, PEDESTRIAN LIGHT DETECTION, TIMER UNIT]
TIMING UNIT

The timing unit compares the calculated value T, the time required by the user to cross the road, with the time left to cross the road, T1, as identified from the image (traffic signal time). If T < T1, the system instructs the user to cross the road. Otherwise, it asks the user to wait until it is safe to cross the road.

[Figure: PEDESTRIAN LIGHT DETECTION, TIMER UNIT]
VOICE SPEECH SYSTEM

AUDITORY IMAGE REPRESENTATION

The images captured by the camera are swept from left to right at a little less than one image per second. The pixels in each column generate a particular sound pattern, consisting of a combination of frequencies based on that specific set of pixels. The result is an auditory signature, effectively an inverse spectrogram, that characterizes the particular image.

High-level scene interpretation applied to the processed images will produce a symbolic description of the scene. The symbolic description is then converted into verbal instructions appropriate to the needs of the user.

VOICE VISION

The VOICE VISION technology for the totally blind offers the experience of live camera views through sophisticated image-to-sound renderings. If we have a 64 × 64, 16 gray tone image, the 64-channel sound synthesis maps the image into an exponentially distributed frequency interval for a one second visual sound. The VOICE mapping: vertical positions of points in a visual sound are represented by pitch, while horizontal positions are represented by time-after-click. Brightness is represented by loudness. In this manner, pixels become... voices!
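The mapping just described (row to pitch, column to time, brightness to loudness) can be sketched as below. This is a simplified illustration, not the actual VOICE VISION software: the frequency range of 500 Hz to 5000 Hz, the sample rate, and the function names are assumptions chosen for the example.

```python
import math


def row_frequencies(rows=64, f_low=500.0, f_high=5000.0):
    """Exponentially spaced frequencies; row 0 (top of the image) gets the highest pitch."""
    ratio = f_high / f_low
    steps = max(rows - 1, 1)
    return [f_low * ratio ** ((rows - 1 - r) / steps) for r in range(rows)]


def image_to_sound(image, duration=1.0, sample_rate=16000, levels=16):
    """Sweep the columns left to right over `duration` seconds.

    Each pixel contributes a sine wave at its row's frequency, with an
    amplitude proportional to its gray level (0 .. levels - 1).
    """
    rows, cols = len(image), len(image[0])
    freqs = row_frequencies(rows)
    total = int(duration * sample_rate)
    samples = []
    for n in range(total):
        t = n / sample_rate
        col = min(cols - 1, n * cols // total)        # horizontal position -> time
        value = 0.0
        for r in range(rows):
            amplitude = image[r][col] / (levels - 1)  # brightness -> loudness
            if amplitude:
                value += amplitude * math.sin(2 * math.pi * freqs[r] * t)
        samples.append(value / rows)                  # keep the sum within [-1, 1]
    return samples
```

With exponential spacing, neighbouring rows differ by a constant frequency ratio rather than a constant number of hertz, which matches how pitch differences are perceived.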
FUNCTIONAL BLOCK

[Flowchart, read from top to bottom]
START
1. Capture the image.
2. Analyse the image (IMAGE ANALYSER).
3. Is there any cross road? (CROSS ROAD DETECTOR)
   YES: inform the user that a crossroad is detected, then calculate the width of the road.
4. Detect the traffic light. Is the light favourable? (TRAFFIC LIGHT DETECTOR)
   NO: inform the user to wait.
   YES: continue.
5. Calculate the time taken to cross the road (A).
6. Identify the time left in the traffic signal (B).
7. Is A < B? (TIMER UNIT)
   YES: inform the user to cross the road.
   NO: inform the user to wait.
STOP
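The decision logic of the functional block can be written as a short function. This is a sketch under stated assumptions: the function name, its parameters, and the message strings are hypothetical, and the detector results are passed in as plain values rather than computed from an image.

```python
def advise_user(crossing_found, light_favourable, time_needed_s, time_left_s):
    """Return the spoken messages for one captured image.

    time_needed_s is the time taken to cross the road (A);
    time_left_s is the time left in the traffic signal (B).
    """
    if not crossing_found:
        return []                       # nothing to announce for this image
    messages = ["Crossroad detected"]
    if not light_favourable:
        messages.append("Wait")
        return messages
    if time_needed_s < time_left_s:     # timer unit: cross only if A < B
        messages.append("Cross the road")
    else:
        messages.append("Wait")
    return messages


print(advise_user(True, True, 20.0, 35.0))   # ['Crossroad detected', 'Cross the road']
print(advise_user(True, True, 40.0, 35.0))   # ['Crossroad detected', 'Wait']
```

Keeping the comparison strict means that when the time needed equals the time left, the user is told to wait, which is the more cautious reading of the timing unit's rule.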
CONCLUSION
The development of mobility aids for the visually impaired is a challenging task that has
many potential solutions. A sophisticated mechanism designed to enhance the mobility of the
blind is intended to help people who cannot recover their eyesight by normal medical
procedures. Blind pedestrians in the greatest danger are those who must cross wide, busy
roads. This system, along with the available low-technology aids, can relieve the visually
challenged of dependence on others and help them lead normal lives. This effective navigation
system would improve the mobility of millions of blind people all over the world.