1. Introduction to Artificial Intelligence
Some people suggest that the computer advances of the last forty years are only
the beginning, and that even more dramatic breakthroughs in electronics and
computers are looming on the horizon. About 100 years ago, the United States had
progressed from an agricultural society and economy to an industrial one. The
Industrial Revolution centered on the steam engine and similar technologies and
resulted in a dramatic change in methods of factory production that led to an
urbanization of population patterns and major changes in American lifestyles.
Now, according to those experts who study trends, we are seeing an
Information Revolution, one that affects economic and societal patterns via the
transfer of information. Starting in about 1947 with the ENIAC (one of the first
general-purpose, electronic digital computers), the U.S. began to experience a
tremendous increase in its ability to collect and store information; and Americans
began to develop new technologies to manipulate and communicate that information.
Some people suggest that the “revolution” is actually over. They say that
the most important discoveries have been made (computers and communication
technology) and that we’re now in a stage of evolving or refining those
technologies and adjusting to the changes occurring in our personal lives, our
economy and our work. Whatever the stage, we are clearly in the midst of the
Information Age, living in an information-centered economy with all its rewards and problems.
Although some changes have occurred relatively fast, progress has been very slow
in giving computers “intelligence” and no major breakthrough or “intelligence
revolution” has occurred yet.
The discovery and harnessing of electricity coupled with the development of
vacuum tubes brought us to the first generation of computers. Then
semiconductors (transistors) brought in the second generation. Integrated circuits,
containing thousands of switches on a chip, were key to the third. Finally,
microscopic-sized circuits on a chip (LSI and VLSI) were the elements of the
fourth generation. Now scientists are looking for the technological breakthrough
to propel us into the fifth generation of computers. Many people argue that the
development of artificial intelligence will mark the beginning of this fifth generation.
Meanwhile, scientific research is being done on several fronts, all with the
purpose of: (a) increasing speed, memory and power of computers; (b) teaching
them to think like humans; (c) making them easier to use; (d) finding practical
applications for the existing technology. It may seem that there is no place to go.
After all, computers are now inexpensive and readily available to almost anyone
in the U.S. who wants to use one. Computers can calculate many times faster than
humans; computerized robots can work longer hours and data banks can
remember much more data than the human mind.
One of the major difficulties encountered early in AI was arriving at a definition and a clear
understanding of intelligence. Three definitions of intelligence are:
1. Intelligence is a state of grasping the truth, involving reason, concerned with
action about what is good or bad for human beings…
2. The test of first-rate intelligence is the ability to hold two opposite ideas in the
mind at the same time and still retain the ability to function.
3. The ability to learn or understand from experience, the ability to acquire and
retain knowledge and the ability to respond quickly and successfully to a new
situation; use of the faculty of reason in solving problems; directing one's conduct effectively.
Clearly, there is no consensus on a single definition of intelligence, though all the
definitions seem to agree that intelligent behavior would include the following:
coping with complexity.
A working definition of AI, then, would be programming computers to carry out tasks
that would require intelligence if carried out by human beings.
This gives us four possible goals to pursue in artificial intelligence:
Systems that think like humans.
Systems that act like humans.
Systems that think rationally.
Systems that act rationally.
TYPES OF ARTIFICIAL INTELLIGENCE
One common definition states that artificial intelligence (AI) is the area of
computer science that deals with the ways in which computers can be made to
perform cognitive functions ascribed to humans. It does not say what functions
are performed, to what degree they are performed or how these functions are
carried out. In fact, there are at least three different views on what the term
artificial intelligence means.
1. AI is the embodiment of human intellectual capabilities within a computer.
This view is called Strong AI.
2. AI is a set of computer programs that produce output that would be
considered to reflect intelligence if it were generated by humans.
3. AI is the study of mental faculties through the use of mental models
implemented on a computer. This view is called Weak AI.
2. COMPONENTS OF AI
Broadly speaking, AI consists of the following four components:
1) Natural Language Processing (NLP). This involves the study, understanding
   and processing of natural languages so as to provide natural language
   interfaces to information systems, machine translation, etc.
2) Computer Vision. This entails the ability to recognize shapes, features, etc.,
   automatically and in turn automate movement through robots.
3) Heuristic Problem Solving. This consists of solving problems that are provably
   hard by smart methods that quickly evaluate a small number of candidate
   solutions to get to the optimal or near-optimal solution.
4) Expert Systems. These comprise computer programs that are able to exhibit
   expert-like performance in a specific narrow application domain, which is the
   main focus of this text.
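To make component (3) concrete, here is a minimal sketch of a heuristic method: the nearest-neighbour rule for a small route-planning (travelling-salesman) problem. The city names and coordinates are invented for illustration; the point is that the method quickly evaluates only a few candidates at each step and settles for a near-optimal tour rather than searching exhaustively.

```python
import math

# Hypothetical city coordinates -- illustrative data only.
cities = {"A": (0, 0), "B": (1, 5), "C": (4, 1), "D": (6, 4)}

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def nearest_neighbour_tour(start, cities):
    """Greedy heuristic: always visit the closest unvisited city next.
    Fast, but only near-optimal -- exactly the trade-off described above."""
    unvisited = set(cities) - {start}
    tour, current = [start], start
    while unvisited:
        nxt = min(unvisited, key=lambda c: dist(cities[current], cities[c]))
        tour.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    return tour

print(nearest_neighbour_tour("A", cities))  # -> ['A', 'C', 'D', 'B']
```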
3. Regular Programming Versus AI Programming
Let's compare regular programming and AI programming in terms of three elements: input, processing and output.
INPUT: A sequence of alphanumeric symbols that is presented and
stored according to a given set of previously stipulated rules and that utilizes a
limited set of communication media, such as a keyboard, magnetic disk or tape.
PROCESSING: Manipulation of the stored symbols by a set of previously
defined algorithms. (An algorithm is a set of step-by-step instructions that
completely and unambiguously specify how to solve the problem in a finite
length of time.)
OUTPUT: A sequence of alphanumeric or graphic symbols, possibly in a given
set of colors, that represents the result of the processing and that is placed on such
a medium as a CRT screen, paper or magnetic disk or tape.
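The parenthetical definition of an algorithm can be illustrated with a classic example, Euclid's method for the greatest common divisor: every step is completely specified, and termination in a finite number of steps is guaranteed.

```python
def gcd(a, b):
    # Euclid's algorithm: each step is fully and unambiguously specified,
    # and the loop must terminate because the remainder strictly shrinks.
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(48, 18))  # -> 6
```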
Regular programming tends to be relatively inflexible in terms of the type and order of
input and output. Both numeric processing and character processing are done on an
item-by-item basis. If special data structures are needed, they must normally be
specified during development of the computer program. The easiest programs to write
are those that involve well-defined processes with very little variation. AI
programming, on the other hand, often mimics the more creative and less well-defined
functions that people perform. These include functions that are related to the five senses:
Input
Sight: one-dimensional linear symbols such as typed text, two-dimensional objects such as planar maps, three-dimensional scenes such as images.
Sound: spoken language, music, noises made by objects.
Touch: temperature, smoothness, resistance to pressure.
Smell: odors emanating from animate and inanimate objects.
Taste: sweet, sour, salty and bitter foodstuffs and chemicals.
Processing
Knowledge representation and pattern matching: the way concepts about the world are represented, organized, stored and compared.
Search: the way the representations of concepts are found and related to one another.
Logic: the way deductions are made and inferences drawn.
Problem solving: the way an overall approach is planned, organized and carried out.
Learning: the way new concepts are automatically added and previous concepts modified.
Output
Printed language and synthesized speech.
Manipulation of physical objects (via rotation and translation).
Locomotion (one-, two- and three-dimensional movement in space). Here
the word "space" may refer to any location: in the atmosphere, in a vacuum, under
ground, under water or in a hazardous environment.
4. Major areas of AI
Although research in artificial intelligence started in the mid-1950s, it is still in its
early stages of development. Progress is slow. Specifically, the areas of artificial
intelligence that are being pursued are:
(a) Expert system
(b) Advanced robotics
(c) Natural-language processing
(d) Voice synthesis
(e) Voice recognition
(f) Computer vision
(g) Symbolic and numeric processing
(h) Knowledge representation
(i) Pattern matching
(j) Logic inference
(k) Problem solving
(l) Social and ethical issues
Thinking humanly: The cognitive modeling approach
If we are going to say that a given program thinks like a human, we must have some way
of determining how humans think. We need to get inside the actual workings of human
minds. There are two ways to do this: through introspection--trying to catch our own
thoughts as they go by--or through psychological experiments. Once we have a
sufficiently precise theory of the mind, it becomes possible to express the theory as a
computer program. If the program's input/output and timing behavior matches human
behavior, that is evidence that some of the program's mechanisms may also be operating
in humans. For example, Newell and Simon, who developed GPS, the ``General Problem
Solver'' (Newell and Simon, 1961), were not content to have their program correctly
solve problems. They were more concerned with comparing the trace of its reasoning
steps to traces of human subjects solving the same problems. This is in contrast to other
researchers of the same time (such as Wang (1960)), who were concerned with getting
the right answers regardless of how humans do it. We will simply note that AI and
cognitive science continue to fertilize each other, especially in the areas of vision, natural
language, and learning.
Thinking rationally: The laws of thought approach
The Greek philosopher Aristotle was one of the first to attempt to codify ``right thinking,''
that is, irrefutable reasoning processes. His famous syllogisms provided patterns for
argument structures that always gave correct conclusions given correct premises. For
example, ``Socrates is a man; all men are mortal; therefore Socrates is mortal.'' These
laws of thought were supposed to govern the operation of the mind, and their study initiated the field called logic.
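As a brief sketch (not part of the original text), the Socrates syllogism can be mechanized in a few lines: a single rule maps the premise "x is a man" to the conclusion "x is mortal", and the program derives the conclusion from the stored fact.

```python
# The Socrates syllogism encoded as one inference rule: if x is a man,
# conclude x is mortal -- a tiny sketch of mechanized "right thinking".
facts = {("man", "Socrates")}
rules = [(("man",), "mortal")]  # (premise predicates, conclusion predicate)

def infer(facts, rules):
    derived = set(facts)
    for premises, conclusion in rules:
        for pred, subject in list(derived):
            if pred in premises:
                derived.add((conclusion, subject))
    return derived

print(("mortal", "Socrates") in infer(facts, rules))  # -> True
```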
Acting rationally: The rational agent approach
Acting rationally means acting so as to achieve one's goals, given one's beliefs. An agent
is just something that perceives and acts. (This may be an unusual use of the word, but
you will get used to it.) In this approach, AI is viewed as the study and construction of rational agents.
In the ``laws of thought'' approach to AI, the whole emphasis was on correct inferences.
Making correct inferences is sometimes part of being a rational agent, because one way
to act rationally is to reason logically to the conclusion that a given action will achieve
one's goals, and then to act on that conclusion. On the other hand, correct inference is not
all of rationality, because there are often situations where there is no provably correct
thing to do, yet something must still be done. There are also ways of acting rationally that
cannot be reasonably said to involve inference. For example, pulling one's hand off of a
hot stove is a reflex action that is more successful than a slower action taken after careful deliberation.
The State of the Art
International grandmaster Arnold Denker studies the pieces on the board in front
of him. He realizes there is no hope; he must resign the game. His opponent, Hitech,
becomes the first computer program to defeat a grandmaster in a game of chess.
``I want to go from Boston to San Francisco,'' the traveller says into the microphone.
``What date will you be travelling on?'' is the reply. The traveler explains she wants to go
October 20th, nonstop, on the cheapest available fare, returning on Sunday. A speech
understanding program named Pegasus handles the whole transaction, which results in a
confirmed reservation that saves the traveler $894 over the regular coach fare. Even
though the speech recognizer gets one out of ten words wrong, it is able to recover from
these errors because of its understanding of how dialogs are put together.
An analyst in the Mission Operations room of the Jet Propulsion Laboratory suddenly
starts paying attention. A red message has flashed onto the screen indicating an
``anomaly'' with the Voyager spacecraft, which is somewhere in the vicinity of Neptune.
Fortunately, the analyst is able to correct the problem from the ground. Operations
personnel believe the problem might have been overlooked had it not been for Marvel, a
real-time expert system that monitors the massive stream of data transmitted by the
spacecraft, handling routine tasks and alerting the analysts to more serious problems.
Cruising the highway outside of Pittsburgh at a comfortable 55 mph, the man in the
driver's seat seems relaxed. He should be--for the past 90 miles, he has not had to touch
the steering wheel. The real driver is a robotic system that gathers input from video
cameras, sonar, and laser range finders attached to the van. It combines these inputs with
experience learned from training runs and successfully computes how to steer the vehicle.
A leading expert on lymph-node pathology describes a fiendishly difficult case to the
expert system, and examines the system's diagnosis. He scoffs at the system's response.
Only slightly worried, the creators of the system suggest he ask the computer for an
explanation of the diagnosis. The machine points out the major factors influencing its
decision, and explains the subtle interaction of several of the symptoms in this case. The
expert admits his error, eventually.
From a camera perched on a street light above the crossroads, the traffic monitor watches
the scene. If any humans were awake to read the main screen, they would see ``Citroen
2CV turning from Place de la Concorde into Champs Elysées,'' ``Large truck of unknown
make stopped on Place de la Concorde,'' and so on into the night. And occasionally,
``Major incident on Place de la Concorde, speeding van collided with motorcyclist,'' and
an automatic call to the emergency services.
These are just a few examples of artificial intelligence systems that exist today. Not
magic or science fiction--but rather science, engineering, and mathematics, to which this
book provides an introduction.
The Transition from Lab to Life
The impact of computer technology, AI included, was soon felt. No longer was
computer technology the province of a select few researchers in laboratories. The personal
computer made its debut, along with many technological magazines. Foundations such as
the American Association for Artificial Intelligence were also started. With the
demand for AI development, there was also a push for researchers to join private companies. Some 150
companies, such as DEC, which employed an AI research group of 700 people, spent
$1 billion on internal AI groups.
Other fields of AI also made their way into the marketplace during the 1980s. One in
particular was the machine vision field. The work by Minsky and Marr was now the
foundation for the cameras and computers on assembly lines, performing quality control.
Although crude, these systems could distinguish different shapes in objects using black
and white differences. By 1985 over a hundred companies offered machine vision
systems in the US, and sales totaled $80 million.
The 1980's were not totally good for the AI industry. In 1986-87 the demand in AI
systems decreased, and the industry lost almost a half of a billion dollars. Companies
such as Teknowledge and Intellicorp together lost more than $6 million, about a third of
their total earnings. The large losses convinced many research leaders to cut back
funding. Another disappointment was the so-called "smart truck" financed by the Defense
Advanced Research Projects Agency. The project's goal was to develop a robot that could
perform many battlefield tasks. In 1989, due to project setbacks and unlikely success, the
Pentagon cut funding for the project.
Despite these discouraging events, AI slowly recovered. New technology in Japan was
being developed. Fuzzy logic, first pioneered in the US, has the unique ability to make
decisions under uncertain conditions. Neural networks were also being reconsidered as
possible ways of achieving artificial intelligence. The 1980s introduced AI to
the corporate marketplace and showed that the technology had real-life uses, ensuring it
would be a key technology in the 21st century.
AI put to the Test
The military put AI based hardware to the test of war during Desert Storm. AI-
based technologies were used in missile systems, heads-up-displays, and other
advancements. AI has also made the transition to the home. With the popularity of the
personal computer growing, the interest of the public has also grown. Applications for the Apple
Macintosh and IBM-compatible computers, such as voice and character recognition, have
become available. AI technology has also made steadying camcorders simple, using
fuzzy logic. With a greater demand for AI-related technology, new advancements are
becoming available. Inevitably, artificial intelligence has affected, and will continue to affect, our lives.
5. BRANCHES OF AI
logical AI
What a program knows about the world in general, the facts of the specific
situation in which it must act, and its goals are all represented by sentences of
some mathematical logical language. The program decides what to do by
inferring that certain actions are appropriate for achieving its goals. The first
article proposing this was [McC59]; [McC89] is a more recent
summary. [McC96] lists some of the concepts involved in logical AI. [Sha97] is
an important text.
search
AI programs often examine large numbers of possibilities, e.g., moves in a chess
game or inferences by a theorem-proving program. Discoveries are continually
made about how to do this more efficiently in various domains.
pattern recognition
When a program makes observations of some kind, it is often programmed to
compare what it sees with a pattern. For example, a vision program may try to
match a pattern of eyes and a nose in a scene in order to find a face. More
complex patterns, e.g. in a natural language text, in a chess position, or in the
history of some event are also studied. These more complex patterns require quite
different methods than do the simple patterns that have been studied the most.
representation
Facts about the world have to be represented in some way. Usually languages of
mathematical logic are used.
inference
From some facts, others can be inferred. Mathematical logical deduction is
adequate for some purposes, but new methods of non-monotonic inference have
been added to logic since the 1970s. The simplest kind of non-monotonic
reasoning is default reasoning, in which a conclusion is inferred by default,
but the conclusion can be withdrawn if there is evidence to the contrary. For
example, when we hear of a bird, we may infer that it can fly, but this conclusion
can be reversed when we hear that it is a penguin. It is the possibility that a
conclusion may have to be withdrawn that constitutes the non-monotonic
character of the reasoning. Ordinary logical reasoning is monotonic in that the set
of conclusions that can be drawn from a set of premises is a monotonically
increasing function of the premises.
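A minimal sketch of default reasoning as just described (the predicate names are illustrative): the conclusion "flies" is drawn by default from "bird" and withdrawn when the contrary fact "penguin" is added, so the conclusion set is not a monotonic function of the premises.

```python
def conclusions(premises):
    """Non-monotonic sketch: 'bird(x)' lets us infer 'flies(x)' by default,
    but adding 'penguin(x)' forces that conclusion to be withdrawn."""
    concs = set(premises)
    for p in premises:
        if p.startswith("bird(") and "penguin(" + p[5:] not in premises:
            concs.add("flies(" + p[5:])
    return concs

print(conclusions({"bird(tweety)"}))                     # flies(tweety) inferred
print(conclusions({"bird(tweety)", "penguin(tweety)"}))  # conclusion withdrawn
```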
common sense knowledge and reasoning
This is the area in which AI is farthest from human-level, in spite of the fact that it
has been an active research area since the 1950s. While there has been
considerable progress, e.g. in developing systems of non-monotonic reasoning
and theories of action, more new ideas are needed. The Cyc system contains a
large but spotty collection of common sense facts.
learning from experience
Programs do that. The approaches to AI based on connectionism and neural nets
specialize in that. There is also learning of laws expressed in logic. [Mit97] is
comprehensive undergraduate text on machine learning. Programs can only learn
what facts or behaviors their formalisms can represent, and unfortunately learning
systems are almost all based on very limited abilities to represent information.
planning
Planning programs start with general facts about the world (especially facts about
the effects of actions), facts about the particular situation and a statement of a
goal. From these, they generate a strategy for achieving the goal. In the most
common cases, the strategy is just a sequence of actions.
epistemology
This is a study of the kinds of knowledge that are required for solving problems in the world.
ontology
Ontology is the study of the kinds of things that exist. In AI, the programs and
sentences deal with various kinds of objects, and we study what these kinds are
and what their basic properties are. Emphasis on ontology begins in the 1990s.
heuristics
A heuristic is a way of trying to discover something, or an idea embedded in a
program. The term is used variously in AI. Heuristic functions are used in some
approaches to search to measure how far a node in a search tree seems to be from
a goal. Heuristic predicates that compare two nodes in a search tree to see if one
is better than the other, i.e., constitutes an advance toward the goal, may be more
useful. [My opinion]
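A small sketch of a heuristic function in search, under illustrative assumptions (a 5x5 grid world, with Manhattan distance as the estimate of remaining distance): greedy best-first search expands whichever frontier node the heuristic says seems closest to the goal.

```python
import heapq

def manhattan(node, goal):
    """Heuristic function: estimates how far a node seems to be from the
    goal (here, city-block distance on a grid)."""
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

def greedy_best_first(start, goal):
    # Always expands the frontier node with the smallest heuristic value.
    frontier = [(manhattan(start, goal), start)]
    seen = {start}
    while frontier:
        _, node = heapq.heappop(frontier)
        if node == goal:
            return True
        x, y = node
        for nxt in [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]:
            if nxt not in seen and 0 <= nxt[0] <= 4 and 0 <= nxt[1] <= 4:
                seen.add(nxt)
                heapq.heappush(frontier, (manhattan(nxt, goal), nxt))
    return False

print(greedy_best_first((0, 0), (4, 4)))  # -> True
```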
genetic programming
Genetic programming is a technique for getting programs to solve a task by
mating random Lisp programs and selecting the fittest over millions of generations. It
is being developed by John Koza's group.
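A much-simplified sketch of the genetic idea, evolving bit strings rather than Lisp programs (the "max-ones" fitness function and all parameters are invented for illustration): candidates are mated by crossover, mutated, and the fittest are selected in each generation.

```python
import random

random.seed(0)

def fitness(ind):
    # Toy "max-ones" fitness: the more 1-bits, the fitter the candidate.
    return sum(ind)

def mate(a, b):
    # Single-point crossover (the "mating" step), plus a one-bit mutation.
    cut = random.randrange(1, len(a))
    child = a[:cut] + b[cut:]
    child[random.randrange(len(child))] ^= 1
    return child

pop = [[random.randint(0, 1) for _ in range(12)] for _ in range(20)]
for _ in range(200):  # generations
    pop.sort(key=fitness, reverse=True)
    survivors = pop[:10]  # select the fittest half
    pop = survivors + [mate(random.choice(survivors), random.choice(survivors))
                       for _ in range(10)]
best = max(pop, key=fitness)
print(fitness(best))
```

Because the fittest candidates always survive, the best fitness never decreases, and after a few hundred generations the population converges toward the all-ones string.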
6. APPLICATIONS OF AI
game playing
You can buy machines that can play master level chess for a few hundred dollars.
There is some AI in them, but they play well against people mainly through brute
force computation--looking at hundreds of thousands of positions. To beat a world
champion by brute force and known reliable heuristics requires being able to look
at 200 million positions per second.
speech recognition
In the 1990s, computer speech recognition reached a practical level for limited
purposes. Thus United Airlines has replaced its keyboard tree for flight
information by a system using speech recognition of flight numbers and city
names. It is quite convenient. On the other hand, while it is possible to instruct
some computers using speech, most users have gone back to the keyboard and the
mouse as still more convenient.
understanding natural language
Just getting a sequence of words into a computer is not enough. Parsing sentences
is not enough either. The computer has to be provided with an understanding of
the domain the text is about, and this is presently possible only for very limited domains.
computer vision
The world is composed of three-dimensional objects, but the inputs to the human
eye and computers' TV cameras are two dimensional. Some useful programs can
work solely in two dimensions, but full computer vision requires partial three-
dimensional information that is not just a set of two-dimensional views. At
present there are only limited ways of representing three-dimensional information
directly, and they are not as good as what humans evidently use.
expert systems
A ``knowledge engineer'' interviews experts in a certain domain and tries to
embody their knowledge in a computer program for carrying out some task. How
well this works depends on whether the intellectual mechanisms required for the
task are within the present state of AI. When this turned out not to be so, there
were many disappointing results. One of the first expert systems was MYCIN in
1974, which diagnosed bacterial infections of the blood and suggested treatments.
It did better than medical students or practicing doctors, provided its limitations
were observed. Namely, its ontology included bacteria, symptoms, and treatments
and did not include patients, doctors, hospitals, death, recovery, and events
occurring in time. Its interactions depended on a single patient being considered.
Since the experts consulted by the knowledge engineers knew about patients,
doctors, death, recovery, etc., it is clear that the knowledge engineers forced what
the experts told them into a predetermined framework. In the present state of AI,
this has to be true. The usefulness of current expert systems depends on their
users having common sense.
heuristic classification
One of the most feasible kinds of expert system given the present knowledge of
AI is to put some information in one of a fixed set of categories using several
sources of information. An example is advising whether to accept a proposed
credit card purchase. Information is available about the owner of the credit card,
his record of payment and also about the item he is buying and about the
establishment from which he is buying it (e.g., about whether there have been
previous credit card frauds at this establishment).
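The credit-card example can be sketched as a toy heuristic classifier. The information sources, weights and thresholds below are invented for illustration, not taken from any real credit system; the point is combining several sources of evidence into one of a fixed set of categories.

```python
def classify_purchase(good_payment_record, amount, fraud_at_merchant):
    """Heuristic classification sketch: combine several information sources
    into one of a fixed set of categories (accept / reject).
    All weights and thresholds here are illustrative assumptions."""
    score = 0
    score += 2 if good_payment_record else -2   # the card owner's record
    score += -1 if amount > 1000 else 1         # the item being bought
    score += -3 if fraud_at_merchant else 0     # the establishment's history
    return "accept" if score > 0 else "reject"

print(classify_purchase(True, 50, False))    # -> accept
print(classify_purchase(True, 5000, True))   # -> reject
```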
7. What is a Neural Network?
A neural network is a software (or hardware) simulation of a biological brain
(sometimes called Artificial Neural Network or "ANN"). The purpose of a neural
network is to learn to recognize patterns in your data. Once the neural network has been
trained on samples of your data, it can make predictions by detecting similar patterns in
future data. Software that learns is truly "Artificial Intelligence".
Neural networks are a branch of the field known as "Artificial Intelligence". Other
branches include Case Based Reasoning, Expert Systems, and Genetic Algorithms.
Related fields include Classical Statistics, Fuzzy Logic and Chaos Theory. A Neural
network can be considered as a black box that is able to predict an output pattern when it
recognizes a given input pattern. The neural network must first be "trained" by having it
process a large number of input patterns and showing it what output resulted from each
input pattern. Once trained, the neural network is able to recognize similarities when
presented with a new input pattern, resulting in a predicted output pattern.
Neural networks are able to detect similarities in inputs, even though a particular input
may never have been seen previously. This property allows for excellent interpolation
capabilities, especially when the input data is noisy (not exact). Neural networks may be
used as a direct substitute for autocorrelation, multivariable regression, linear regression,
trigonometric and other regression techniques.
When a data stream is analyzed using a neural network, it is possible to detect important
predictive patterns that were not previously apparent to a non-expert. Thus the neural
network can act as an expert.
Artificial Neural Networks (ANN) are a relatively new approach to computing that
involves using an interconnected assembly of simple processing elements loosely based
on the animal neuron, a specialized biological cell, found only in the animal brain. A
generally accepted basic definition of an ANN is a network of many simple processors.
These simple processing elements are referred to as units, nodes, or neurons. These units
are connected by communication channels referred to as "connections" which carry
numeric data between nodes (see Figure 3). Each unit operates only on its local data and
on the inputs it receives via the connections. The processing ability of the network as a
whole is stored in the inter-unit connection strengths, or weights. These weights are
obtained by a process of adaptation to a set of training patterns, similar to the way neural
connections in the human brain are strengthened or weakened by some stimulus. Another
name for this model is connectionist architecture. This approach differs greatly from the
more traditional symbolic or expert system approach to artificial intelligence. Neural nets
have the ability to learn and derive meaning from complex, imprecise, or noisy data,
thus extracting patterns that would otherwise be imperceptible by other means. A trained
neural network can be thought of as an "expert" in the category of information it has
been given to analyze and it can then be given "what if" questions to answer on that
information. The greatest power of a neural network comes from its ability to generalize
from information it has seen to similar patterns of information that it has not seen.
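As a minimal, hedged illustration of the adaptation-by-weights idea (a single perceptron, far simpler than the networks discussed above): the unit's weights are adjusted on training patterns, after which it predicts on input it is given. Here it learns the logical AND of two inputs.

```python
def train_perceptron(samples, epochs=10, lr=0.1):
    """Adapt the connection weights to a set of training patterns."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out          # how wrong the unit was
            w[0] += lr * err * x1       # strengthen or weaken each
            w[1] += lr * err * x2       # connection accordingly
            b += lr * err
    return w, b

def predict(w, b, x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_samples)
print([predict(w, b, x1, x2) for (x1, x2), _ in and_samples])  # -> [0, 0, 0, 1]
```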
How Do Neural Networks and Expert Systems Differ?
Neural networks differ from both the expert system approach to artificial intelligence
and the traditional algorithmic approach to computing. Expert systems use rules and facts
to offer solutions to complex problems that would normally require a human expert.
These types of rule-based and symbolic solutions have a common thread in that they all
address relatively well defined problems solvable by some procedural method. In other
words, rule-based systems perform high level reasoning tasks. An example of such a
system is MYCIN, an Expert System for diagnosing and recommending treatment of
bacterial infections of the blood, developed by Shortliffe and associates at Stanford
University. To create such a system, hundreds or thousands of facts are entered into the
expert system. In addition to these facts, hundreds or thousands of rules that operate on
those facts are also entered. The facts and rules that operate on the facts are essentially
kept separate and any fact can affect any rule and vice versa. The system operates by
taking in facts representing a current problem, applying applicable rules to those facts,
generating new facts to which further rules are applied, and eventually producing a
conclusion to the initial set of facts. Expert systems are very powerful tools in that any
number of facts and rules can be entered into the system in any order. Conflicting facts and
rules may also be entered, assuming an appropriate conflict resolution scheme exists
within the expert system. Theoretically an expert system can solve any high level
reasoning task, provided the rules and facts of the problem have been entered. The
drawback to such a system is that the rules and facts must be known ahead of time and
they must be specified to the system. For some problems this is impossible. For example,
the most extreme high level artificial intelligence problem is common sense reasoning.
For a rule-based system to perform common sense reasoning, every fact and rule even
remotely connected to a common sense problem would have to be entered into the system.
The possible number of rules, facts, and potential conflicts make this impossible with
current programming tools. In fact, no artificial intelligence technique has been able to solve
this problem although recent research has pointed to a partial solution using hybrid
models, a combination of neural nets and expert systems.
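A toy sketch of the facts-plus-rules operation described above (the medical rule names are invented, not MYCIN's actual rules): facts and rules are kept separate, applicable rules fire on the current facts, and newly generated facts let further rules apply until a conclusion is reached.

```python
# Rules are (premises, conclusion) pairs -- illustrative, not real medicine.
rules = [
    ({"fever", "infection_site_blood"}, "possible_bacteremia"),
    ({"possible_bacteremia", "gram_negative"}, "suggest_antibiotic"),
]

def forward_chain(facts, rules):
    """Apply every applicable rule; new facts may trigger further rules."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)  # a new fact enters working memory
                changed = True
    return facts

result = forward_chain({"fever", "infection_site_blood", "gram_negative"}, rules)
print("suggest_antibiotic" in result)  # -> True
```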
Neural networks take an entirely different approach to artificial intelligence. Neural
networks seek to model (on a very rudimentary level) the biological action of the animal
brain. Neural networks operate on the idea that the conceptual (high level) representation
of information is not important. A neural network seeks to represent data in a distributed
fashion across many simple processing elements so that no single piece of that network
contains any meaningful information, only the network as a whole has any ability to
process, store, and produce information and make decisions. Because of this, very little
information can be gained by observing the network itself, only the actions of the
network are meaningful.
Each technique has its strengths and weaknesses, and although they are often thought of
as competitors, this is not true. Each is well suited to a type of artificial intelligence
problem. Expert systems are very well suited to well defined problems with facts and
rules. Where these rule-based techniques fall short is on low level perceptual tasks such
as vision, speech recognition, complex pattern matching, and signal processing. Rule-
based techniques also have difficulty dealing with fuzzy, imprecise, or incomplete data.
Data in a rule-based system must be in a precise format. Noisy or incomplete data may
confuse an expert system unless specific steps are taken to account for such variability.
This is where neural networks can do what expert systems cannot. Neural networks
distribute the representation of data across the whole network of neurons so no one part
of the network is responsible for any one fact or rule. This enables the network to deal
with errors in data and allows it to learn complex patterns that no human expert could
perceive and quantify in simple rule/fact form.
What are Neurons?
The power and flexibility of a neural network follows directly from the connectionist
architecture. This architecture begins with the simple neuron-like processing elements. A
real neuron is a specialized biological cell, found only in the animal brain, that processes
information and presumably stores data. As shown in Figure 1, a neuron is composed of a
cell body and two types of outreaching tree-like branches, the axon and the dendrites. A
neuron receives information from other neurons through its dendrites and transmits
information through the axon, which eventually branches into strands and substrands. At
the end of these substrands is the synapse, which is the functional unit between two
neurons. When an impulse (information) reaches a synapse, chemicals are released to
enhance or inhibit the receiver’s tendency to emit electrical impulses.
Figure 1: Biological Neuron
The synapse’s effectiveness can be adjusted by the impulses passing through it, so that
synapses can learn from the activities in which they participate. This dependence on a specific
sequence of impulses acts as a memory, possibly accounting for human memory, and
forms the basis for artificial neural network technology. Dendrites and axons form the
inputs and outputs of the neuron respectively. A neuron does nothing unless the collective
influence of all its inputs reaches some threshold level. When the threshold is reached,
the neuron produces a pulse that proceeds from the body to the axon branches.
Stimulation at some synapses encourages a neuron to fire, while at others firing is inhibited.
An artificial neuron, as conceptually shown in Figure 2, is structured to simulate a real
neuron with inputs (x1, x2,...xn) entering the unit and then multiplied by corresponding
weights (w1, w2,...wn) to indicate the strength of the "synapse." The weighted signals are
summed to produce an overall unit activation value. This activation value is compared to
a threshold level. If the activation level exceeds the threshold value, the neuron passes on
its data. This is the simplest form of the artificial neuron and is known as a perceptron.
Figure 2: Artificial Neuron (perceptron)
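The weighted-sum-and-threshold behavior just described can be sketched in a few lines of Python; the weights and threshold values below are illustrative, not drawn from any particular network:

```python
def perceptron(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of the inputs
    exceeds the threshold; otherwise stay silent (return 0)."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation > threshold else 0

# Two inputs with equal weights and a threshold of 1.5 (illustrative values)
print(perceptron([1, 1], [1.0, 1.0], 1.5))  # both inputs active: prints 1
print(perceptron([1, 0], [1.0, 1.0], 1.5))  # only one active: prints 0
```

Here the weights play the role of synapse strengths and the comparison against the threshold stands in for the all-or-nothing firing decision.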
Another interesting property of biological neurons is the way they also encode
information in terms of frequency. Real neurons not only pass on information in simple
electrical pulses, but it is the rate at which the pulses are emitted that encodes
information. This presents a major difference between simple perceptrons (artificial
neurons) and real neurons. This difference can be partially overcome by allowing the
artificial neuron to pass on a partial pulse based on a mathematical function known as an
activation function. This activation function is some type of mathematical function,
which allows the artificial neuron to simulate the frequency characteristics of the real
neuron’s electrical signals.
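The text does not name a specific activation function, but one common choice is the logistic sigmoid, which lets the artificial neuron emit a graded value between 0 and 1 rather than an all-or-nothing pulse:

```python
import math

def sigmoid(activation):
    """Logistic sigmoid: maps any activation value into (0, 1),
    a crude stand-in for a real neuron's firing rate."""
    return 1.0 / (1.0 + math.exp(-activation))

print(sigmoid(0.0))   # prints 0.5: a middling response at zero activation
print(sigmoid(5.0))   # near 1: strong stimulation, high "firing rate"
print(sigmoid(-5.0))  # near 0: strong inhibition, almost no firing
```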
How Is a Neural Network Built, and What Does It Look Like?
The single neuron described earlier can be structured to solve very simple problems;
however, it will not suffice for complex problems. Solving a complex problem requires
multiple neurons working together, an arrangement known as a neural network.
The artificial neuron is a simple element that can be made a part of a large collection of
neurons in which each neuron’s output is the input to the next neuron in line. These
collections of neurons usually form layers as shown in Figure 3. Although this multi-
layer structure can take on virtually any shape, the most common structure is called a
feedforward network and is pictured in Figure 3. The term feedforward comes from the
pattern of information flow through the network.
Figure 3: Example Multi-layer Perceptron
Data is transferred to the bottom layer, called the input layer, where it is distributed
forward to the next layer. This second layer, called a hidden layer,
collects the information from the input layer, transforms the data according to some
activation function, and passes the data forward to the next layer. The third layer, called
the output layer, collects the information from the hidden layer, transforms the data a
final time and then outputs the results.
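The input-to-hidden-to-output flow described above can be sketched as follows; the layer sizes, weight values, and the sigmoid activation are arbitrary placeholders:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weight_matrix):
    """Each neuron in the layer sums its weighted inputs and
    passes the result through the activation function."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, row)))
            for row in weight_matrix]

def feedforward(inputs, hidden_weights, output_weights):
    hidden = layer_forward(inputs, hidden_weights)   # input layer -> hidden layer
    return layer_forward(hidden, output_weights)     # hidden layer -> output layer

# 3 inputs feeding 2 hidden neurons feeding 1 output neuron
hidden_w = [[0.2, -0.5, 0.1], [0.4, 0.3, -0.2]]
output_w = [[0.7, -0.6]]
print(feedforward([1.0, 0.5, -1.0], hidden_w, output_w))
```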
The 3-layer structure shown in Figure 3 is a standard feedforward network, although
many variations of this network exist. For example, feedforward networks may have two or
more hidden layers, although the basic idea of any feedforward network is that
information passes from bottom to top only. Feedforward networks may have any number
of neurons per layer, although it is very common for networks to have a pyramid shape in
that the input layer is generally larger than the hidden layer, which is larger than the
output layer.
How Does a Neuron Work?
Artificial neural networks are built up from the simple idea of the perceptron or
artificial neuron. To understand the network it is necessary to understand the neuron. One
neuron is able to solve very simple problems, for example a simple logic problem known
as the logical AND. The logical AND problem assumes two premises and it says that
something is true if and only if both of the premises in the problem are true. For example,
if it is raining AND I go outside, then I will get wet. The two premises are 1) it is raining,
2) I go outside. For "I get wet" to be true, both of the premises must be true first. If either
one is true but the other is not, or if they are both false, I will not get wet. This type of
problem can be directly applied to a single neuron and a single neuron can classify all the
possible cases in the problem. Table 1 shows all the possible cases in this problem.
Notice there is only one possible way for the conclusion "I get wet" to be true which is
listed first in the table.
Premise 1         | Premise 2           | Conclusion
It Is Raining     | I Go Outside        | I Get Wet
It Is Not Raining | I Go Outside        | I Do Not Get Wet
It Is Raining     | I Do Not Go Outside | I Do Not Get Wet
It Is Not Raining | I Do Not Go Outside | I Do Not Get Wet
Table 1: Logical AND Problem (1)
Now consider a single neuron structured as shown in Figure 4. Assume that if a premise
is true it is equal to the number 1; if it is false, it is equal to 0. Also assume that the neuron
sends out a "signal" of 1 if the answer is "get wet" or it sends out a 0 if the answer is "do
not get wet." We can set an simple arbitrary activation function that says if the neuron
receives a combined signal of higher than 1.5 it will send out a 1 (get wet), otherwise it
will send out a 0 (do not get wet). This is all that is needed to solve this logic problem.
Figure 4: Simple Neuron Problem
If "it is raining" and "I go outside" are both true the neuron in Figure 4 will receive 1 + 1
= 2 which is greater than 1.5 and it will send out a signal of 1 which means "I get wet."
For all other cases the neuron will receive a total of only 1 or 0 and it will send out a 0
and "I will not get wet" will be the conclusion. In this way a conceptual problem such as
"will I get wet?" has been transformed to a mathematical problem. This type of
conversion from a conceptual problem that humans understand to a numerical problem
computers understand is termed encoding. Much of artificial intelligence is concerned
with encoding and data representation. Table 1 can now be shown numerically as Table 2.
Premise 1 | Premise 2 | Conclusion
1         | 1         | 1
0         | 1         | 0
1         | 0         | 0
0         | 0         | 0
Table 2: Logical AND Problem (2)
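The neuron of Figure 4 translates directly into code. The 1.5 threshold is the one given in the text, and the connection weights are taken to be 1 (implied by the 1 + 1 = 2 calculation above):

```python
def and_neuron(premise1, premise2):
    """Single neuron for logical AND: both connection weights are 1
    and the firing threshold is 1.5, as in the text."""
    activation = 1.0 * premise1 + 1.0 * premise2
    return 1 if activation > 1.5 else 0

# Reproduces every row of Table 2
for p1 in (1, 0):
    for p2 in (1, 0):
        print(p1, p2, and_neuron(p1, p2))
```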
By adjusting the strength of the inputs (weighting on the connections) and the way the
collective influence of the inputs is used (activation function), any simple problem such as
the one described here can be encoded and solved. For a more complex problem a single
neuron will not suffice. More complex problems require several neurons working
together as a neural network. Neural networks operate similarly to the single neuron except
they combine their outputs to handle complex problems.
How Do Neural Networks Learn?
Neural networks are known for their ability to learn a task when given the proper
stimulus. Usually neural networks learn through a process called Supervised Learning.
This learning requires sets of data where a set of inputs and outputs are known ahead of
time. For example, suppose a neural network were to be taught to recognize handwritten
characters. Several examples of each letter, written by different people, could be given to
the network. As the teacher, the neural programmer will have several examples of all
characters in the alphabet (inputs) and the programmer knows to which category (‘A’,
‘B’, ‘C’, etc.) each character belongs (output). Inputs (characters in this case) are then
given to the network. The network will produce some kind of output (probably wrong,
e.g. it will say an ‘A’ is an ‘O’). Initially, the network’s responses are totally random and
most likely incorrect. When the neural network produces an incorrect decision the
connections in the network are weakened so it will not produce that answer again.
Similarly, when the network produces a correct decision the connections in the network
are strengthened so it will become more likely to produce that answer again. Through
many iterations of this process, giving the network hundreds or thousands of examples,
the network will eventually learn to classify all characters it has seen. This process is
called supervised learning since the programmer guides the network’s learning through
the type and quality of data given to the network. For this reason neural networks are said
to be data-driven, and it is critical that the data given to the network is very carefully
selected to represent the information the network is to learn.
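A stripped-down sketch of this strengthen/weaken cycle is the classic perceptron learning rule. The task below (learning logical AND rather than handwritten characters) is deliberately tiny, but the idea of comparing actual output to desired output and nudging connections accordingly is the same:

```python
def predict(inputs, weights, bias):
    """Threshold unit: fire if the weighted sum plus bias is positive."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0

# Training examples: (inputs, desired output) for logical AND
examples = [([1, 1], 1), ([1, 0], 0), ([0, 1], 0), ([0, 0], 0)]

weights, bias, rate = [0.0, 0.0], 0.0, 0.1

for _ in range(20):  # many passes over the same examples
    for inputs, target in examples:
        error = target - predict(inputs, weights, bias)
        # Wrong answer: strengthen or weaken the active connections;
        # a correct answer gives error == 0 and leaves the weights alone.
        weights = [w + rate * error * x for w, x in zip(weights, inputs)]
        bias += rate * error

for inputs, target in examples:
    print(inputs, predict(inputs, weights, bias))
```

After the training loop the weights have settled on values that classify all four cases correctly.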
The real power of neural networks is not what they can learn but rather what they can do
with that information. A trained neural network will not only be able to identify and
classify data it has seen but it will generalize to similar data it has not seen. In the
handwriting example, if the network is given examples of characters from 20 different
people it will then be able to correctly identify characters written by almost any person,
whether it has seen those particular instances of characters or not. This represents a major
difference between artificial intelligence programming and conventional programming.
In conventional programming each and every character and every variation on every
character would have to be programmed into a computer before that computer could
identify all characters written by all people. Artificial intelligence techniques such as
neural networks work by generalizing from specific patterns to general patterns. This is
similar to human problem solving in that we very often reason from the specific to the
general (inductive reasoning). In this way neural networks can learn to classify groups of
data, match patterns in data, and approximate any mathematical function.
What Can Neural Networks Learn?
Theoretically neural networks can learn any computable function, whether or not that
function can be identified by the programmer or a mathematician. Neural networks are
especially useful for classification and function approximation problems which are
tolerant of some imprecision, which have lots of data available, but to which hard and
fast rules cannot easily be applied. Neural networks work on finding a best match to
inputs (premises) and outputs (conclusions) based on what they have seen in the past. In
this way, neural networks do not give a perfect solution; they give a "best" solution given
the information at hand. Neural networks, like all artificial intelligence techniques, are
based on the assumption that a slightly less than perfect solution that is acceptable is
better than a perfect solution which may be practically impossible to find and implement.
For example, in the handwriting example given earlier the neural network will learn to
identify most characters written by most people (greater than 99% accuracy) but it will
fail a small percentage of the time. This inaccuracy is considered acceptable because the
alternative is to use conventional programming and to create a database of every possible
character ever written and that ever will be written by every human. Just creating, let
alone using, such a database is practically impossible. In giving up a small measure of
accuracy an artificial intelligence technique such as a neural network can be implemented
in a matter of hours by a single programmer using a small fraction of the total
information to be learned. Artificial intelligence is inspired by biological intelligence in
that it is considered more important to have a fast, general, and very robust solution than
it is to have a perfect but time consuming solution.
Neural networks are basically function approximators and pattern matchers and in
general all neural networks perform only these functions. Neural networks may however
be applied to a variety of problems that can make use of their pattern matching and
approximation ability. Neural networks are not only used to classify and match data
directly but also for vision and speech recognition, prediction and forecasting, data
mining and extraction, and process control and optimization. Each of these tasks is
accomplished through the creative use of the neural network’s pattern matching ability.
Once trained, a neural network can be inserted as the heart of any decision making
system where the patterns of inputs and outputs are fed to a problem-specific application
that can interpret and process that data. For example, a neural network on its own has no
process control logic or process control ability, however a neural network can be given
process control data for a particular system and it can learn the workings of that control
system. Once that system has been learned the neural network can be inserted as the
decision making part of the control system. The neural network will not only replicate the
control system rules it has seen but it will also be able to generalize to unknown
conditions. This means that when the system as a whole is presented with new and
unseen situations the neural network will extrapolate from known conditions to unknown
conditions and provide a "best match" decision based on this new information.
7.Neural Networks and Machine Learning
Types of Neural Networks
The type and variety of artificial neural networks is virtually limitless, although neural
networks are classified according to two factors: the topology (shape) of the network and
the learning method used to train the network. For example, the most widely used
topology is the feedforward network and the most common learning method is the
backpropagation of errors. Backpropagation is a form of supervised learning in which a
network is given input and then the network’s actual output is compared to the correct
output. The network’s connections are adjusted to minimize the error between the actual
and the correct output. Feedforward networks that use backpropagation learning are so
common that these networks are commonly referred to as "backpropagation networks"
although this terminology is not correct. "Multi-layer feedforward" refers to the topology
and pattern of information flow in the network. "Backpropagation" refers to a specific
type of learning algorithm in which errors in the output layer are fed back through the
network. It is possible to use a feedforward architecture without backpropagation, or to
use backpropagation with another type of architecture. In any case, it has become
commonly accepted to call this combination of topology and learning method simply a
"backpropagation network."
Another common network structure is the recurrent or feedback network. Recurrent
networks are usually similar in shape to the feedforward network, although data may
pass backwards through the net or between nodes in the same layer (see Figure 5).
Networks of this type operate by allowing neighboring neurons to adjust other nearby
neurons either in a positive or negative direction.
Figure 5: Recurrent/Feedback Network
This allows the network to reorganize the strength of its connections by not only
comparing actual output against correct output but also by the interaction of neighboring
neurons. Recurrent networks are generally slower to train and to implement than
feedforward networks although they present several interesting possibilities including the
idea of unsupervised learning. In unsupervised learning the network is only given input
with no output and neurons are allowed to compete or cooperate to extract meaningful
information from the data. This is especially useful when analyzing data in search of
some pattern when no specific pattern is known to exist ahead of time.
A third network structure, also based on the feedforward architecture, is the functional
link network. This type of network, as shown in Figure 6, duplicates the input signal with
some type of transformation on the input. For example, consider a network that is
designed to input a series of past stock prices and the output is a predicted future stock
price. This network may have as input four past stock prices (last month, last week, and
the past two days). The output may be a single value such as tomorrow’s stock price. In a
functional link network additional inputs will also be given to the network, which are
some form of the original inputs. These additional inputs may be various products of the
original inputs, or they may be high and low values from the whole input set, or they may
be virtually any combination of mathematical functions that are deemed to contain value
for this set of input. In Figure 6 a functional link network is shown with four actual inputs
and two additional functional links, which in this example are products of the first two
and second two inputs. In this network the functional link is directly connected to the
output layer although the functional link may be directed toward the hidden layer. The
idea behind this type of network is to give the network as much information as possible
about the original input set by also giving it variations of the input set.
Figure 6: Functional Link Network
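The input-augmentation step of Figure 6 amounts to a simple preprocessing function. The product terms below mirror the example in the figure; the stock prices are placeholder values:

```python
def functional_link_inputs(raw_inputs):
    """Augment four raw inputs with two functional links: the products
    of the first pair and of the second pair, as in Figure 6."""
    x1, x2, x3, x4 = raw_inputs
    return raw_inputs + [x1 * x2, x3 * x4]

# Four past stock prices (placeholder values) become six network inputs
prices = [101.0, 102.5, 99.8, 100.2]
print(functional_link_inputs(prices))
```

The network itself is unchanged; it simply receives six inputs instead of four.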
In discussions of neural networks and artificial intelligence in general the topic of
learning is a central theme. True human-like learning is beyond all artificial intelligence
techniques although some learning techniques have been developed which allow
machines to mimic human intelligence. These techniques that allow computers to acquire
information with some degree of autonomy are collectively known as machine learning.
Machine learning is the artificial intelligence field of study that attempts to mimic or
actually duplicate human learning. There are many artificial intelligence techniques that
do not employ any type of learning, such as search and planning strategies. These types of
artificial intelligence methods rely on sophisticated search methods that can examine
massive amounts of data and very quickly pick out important information without
searching the entire set of data. These strategies do not learn but they do mimic a
human’s ability to quickly investigate different paths and select the one that seems most
productive. These kinds of techniques are fairly static in that as long as the information
they are given does not change they will always behave exactly the same.
Theoretically neural networks fall into the category of machine learning. Neural networks
are specifically designed to program themselves based on information they are given. A
neural programmer’s job is to set up the structure and learning ability of the network and
then provide the network with good information. If the network is designed correctly and
the information input to the network is of acceptable quantity and quality the network
will adapt to understand that information. In a sense neural networks exhibit the ability to
learn in a similar fashion to animal learning: they have a given structure (topology and
learning method), they are presented with stimulus (inputs), and they adapt to that stimulus.
Most practically implemented neural networks do not continue learning once they have
been trained and placed in service. Neural networks are usually designed to be taught
once and then the network is put to use. While in service they remain fixed and do not
adapt to changing conditions. So in this sense they are not truly "intelligent." There are
some examples of on-line adaptive networks that "learn as they go" and continually adapt
to changing conditions. Most on-line learning neural networks are experimental, although
a few practical networks have been constructed and deployed. These
networks continually retrain in small increments to adapt to changing conditions. This
presents an exciting area in neural network technology since a network that can reliably
learn in an on-line fashion can be put into service for a virtually indefinite period of time
and it will continue to acquire information and adapt to its environment. The most
sophisticated of these on-line networks can also adjust the number of nodes in their
hidden layer(s) although at the moment this is still largely experimental.
In practicality, expert systems are not generally considered in the category of machine
learning since they are built with a certain set of facts and rules and then put into service
and they generally do not adapt on their own. By this definition however most neural
networks must also be excluded from machine learning since neural network training can
be considered analogous to entering rules and facts to an expert system and then both
systems are simply put into service where they usually remain static. Expert systems can
however be updated at any time by entering new facts and rules which again is analogous
to a neural network that observes new conditions and is allowed to update itself to those
conditions while in service. Since both require a human to carefully prepare new
information and either enter that information or explicitly allow the system to acquire that
information they may both be considered in the category of machine learning. This is
especially true for the few theoretical and experimental expert systems that have the
ability to create new facts and rules autonomously by combining the already given facts
and rules. Some experimental and "toy" expert systems have been designed with the
ability to enter information to themselves from what they observe during operation and
from the interaction of the current set of rules and facts. These systems are not considered
completely practical at this time but there is no reason to believe they will not eventually
be brought into practical service.
Unfortunately neural networks and expert systems are like all artificial intelligence
techniques in that they can only solve problems for which they were designed and they
have no ability to change problem domains, cross reference learning, or restructure
themselves to a new problem. For example, a neural network that has been designed and
trained to drive a car (there are several examples of this) cannot learn to do character
recognition. If it is restructured to another task, it will not be able to perform the original
task. In addition, a neural network that learns one task such as driving a car will have no
ability to drive a motorcycle, a similar but different task. Neural networks, like all current
artificial intelligence techniques, are highly task-specific (narrow domain). The ability to
combine learning from different domains and acquire truly new information from that
combination is beyond all machine learning techniques. In narrow domains with
relatively stable conditions, there are many neural network and machine learning
solutions that perform extremely well and can learn.
8.Types of Neural Network Learning
There are generally three different ways to approach neural network learning: supervised
learning, unsupervised learning, and reinforcement learning. Supervised learning requires
the programmer to give the network examples of inputs and correct output for each given
input. In this way the network can compare what it has output against what it should
output and it can correct itself. Figure 7 shows the backpropagation method.
Backpropagation is the most widely used method for neural network training because it is
the easiest to implement and to understand, and it works reasonably well for most problems.
Unsupervised learning provides input but no correct output. A network using this type of
learning is only given inputs and the network must organize its connections and outputs
without direct feedback. There are several ways in which this type of learning is
accomplished; one is Hebbian learning and another is competitive learning. Hebbian
learning states that if neurons on both sides of a synapse are selectively and repeatedly
stimulated the strength of the synapse is increased. This type of learning is well suited to
data extraction and analysis in which a pattern is known to exist in some data but the type
and location of the pattern is unknown. Competitive learning uses a "winner take all"
strategy in which output neurons compete to decide which is the stronger and should
remain active while all others must remain passive for a given input. This type of learning
is used most often for categorization where categories of data are thought to exist within
a set of data but the exact nature of the categories is unknown. Unsupervised learning is
not yet as well understood or as widely implemented as supervised learning, but
the possibilities of unsupervised learning are very promising.
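The Hebbian rule stated above translates almost directly into code: the weight of a connection grows in proportion to the product of the activities on its two sides. The learning rate and activity values here are illustrative:

```python
def hebbian_update(weight, pre_activity, post_activity, rate=0.1):
    """Hebbian rule: the weight change is proportional to the product
    of the activities on both sides of the connection."""
    return weight + rate * pre_activity * post_activity

w = 0.5
w = hebbian_update(w, 1.0, 1.0)  # both neurons active: connection strengthened
print(w)  # 0.6
w = hebbian_update(w, 1.0, 0.0)  # post-synaptic neuron silent: no change
print(w)  # still 0.6
```

No target output appears anywhere in the update, which is what makes the rule unsupervised.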
Reinforcement learning is a method halfway between supervised and unsupervised
learning but it is usually considered a subtype of supervised learning. In reinforcement
learning a network is given input and although no specific target output is provided (as in
supervised learning) the network is "punished" when it does poorly and "rewarded" when
it does well. Punishment and reward in this sense take the form of weakening or
strengthening the connections between neurons. This means during the learning phase of
a network’s life there are three possibilities for the adaptation of the neurons in the
network. Connections may be selectively strengthened, selectively weakened, or they
may be left unchanged depending on how the network performs. In this type of learning
the network is given input and output is observed. Then output neurons are categorized as
being either right, wrong, or neutral. Output neurons that are judged incorrect, and all
neurons that provided input to that neuron, have their connections weakened. Similarly,
output neurons that are judged correct, and all neurons that provided input to that neuron,
have their connections strengthened. Output neurons that are neither right nor wrong are
left unchanged. With this learning method no specific output is targeted. The network
does not know what it should do, only that when it does something it is either right,
wrong, or neutral. This way the network is allowed to find information in data without
being told what the information is but at the same time it is guided to a solution. This
type of learning has been successfully applied to search problems in which a path to some
goal must be identified but the exact path is not known ahead of time. Theoretically these
types of networks are good candidates for on-line learning in a variable environment.
Networks employing reinforcement learning can be placed into environments where
decisions must be made and the outcome of those decisions is known ahead of time but
the exact decisions that need to be made are unknown.
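The strengthen/weaken/leave-unchanged scheme just described can be sketched as a single update step; the +1/-1/0 judgment values and the learning rate are illustrative:

```python
def reinforce(weights, inputs, judgment, rate=0.1):
    """Adjust the connections feeding an output neuron.
    judgment is +1 (rewarded), -1 (punished), or 0 (left unchanged)."""
    return [w + rate * judgment * x for w, x in zip(weights, inputs)]

w = [0.5, 0.5]
w = reinforce(w, [1.0, 0.0], +1)  # rewarded: strengthen the active connection
print(w)  # first weight grows; the inactive input's weight is untouched
w = reinforce(w, [1.0, 1.0], -1)  # punished: weaken every active connection
print(w)
```

Only the connections that contributed to the decision (nonzero inputs) are adjusted, which matches the description above.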
Figure 7: Backpropagation Learning
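The backpropagation cycle pictured in Figure 7 (forward pass, error measured at the output, error fed back, weights adjusted) can be sketched for a tiny 2-2-1 network. This is a minimal illustration trained on a single input/target pair; all starting weights and values are arbitrary:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# A tiny 2-2-1 network; all starting weights are arbitrary illustrative values
w_hidden = [[0.5, -0.4], [0.3, 0.8]]   # weights into the two hidden neurons
w_output = [0.6, -0.2]                 # weights into the single output neuron
inputs, target, rate = [1.0, 0.5], 0.9, 2.0

for _ in range(1000):
    # Forward pass: input -> hidden -> output
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_output, hidden)))

    # Backward pass: error at the output is fed back through the network
    delta_out = (target - output) * output * (1 - output)
    delta_hid = [delta_out * w_output[j] * hidden[j] * (1 - hidden[j])
                 for j in range(2)]

    # Adjust each connection in proportion to the error it contributed
    w_output = [w + rate * delta_out * h for w, h in zip(w_output, hidden)]
    w_hidden = [[w + rate * d * x for w, x in zip(row, inputs)]
                for row, d in zip(w_hidden, delta_hid)]

print(round(output, 3))  # the output converges toward the 0.9 target
```

A real training run would loop over many examples rather than one, but the per-step mechanics are exactly these.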
9.Neural Network Uses and Applications
Neural networks essentially are function approximators, pattern matchers, and
categorizers. They do very little outside of these basic functions, although these tasks can
be employed in a wide variety of powerful and complex applications. The following
represents some common and practically implemented solutions using neural networks.
Neural networks have been used very successfully in speech recognition tasks. Verbal
speech is encoded mathematically and input to the network and the network responds
with an action. Using a neural network for this purpose allows a person or multiple
people to speak with different tones and voices but the verbal command is still
understood by the network despite variations in tone, pitch, quality, etc.
Character recognition is accomplished by presenting the network with many
examples of handwritten characters and allowing the network to learn those
characters. Once trained, networks used for this task are remarkably accurate across
not only the characters they have seen but also characters they have never seen.
Neural networks have been constructed that process image data such as a photograph or x-
ray image. In the case of photographs neural networks have been trained to pick out
details in the photograph and identify portions of the image as being specific objects. In
x-ray images neural networks have been used to construct composite and 3-D images
from several flat x-ray images taken from different angles on the same bone structure.
Pattern Recognition & Categorization
Obviously pattern recognition and categorization are the most straightforward use for
a neural network. Neural networks can take virtually any set of data that contains one or
more patterns or categories and then extract those patterns. This is extremely useful for
any application that must sort data by category or make decisions based on some pattern.
Signal processing is closely related to pattern recognition and neural networks have
been used very successfully to reduce noise in corrupt electrical signals and to separate
individual signals from transmissions that contain multiple signals. Signal processing
neural networks have been used in a wide variety of problems. Two examples of this use
include noise reduction in phone lines and detecting engine misfires in engines that can
run as high as 10,000 RPM.
One of the newest and most important neural net uses is in process control and
optimization. Neural networks have been trained by allowing them to observe some
system, such as a piece of machinery, and then the network can take over control of that system. Not
only will the neural net control the system in normal operation but it will control that
system during unforeseen occurrences. Neural networks have been put to this use in tests
at NASA's Dryden Flight Research Center in Edwards, California using a modified F-15
aircraft. In this application a neural network was allowed to study normal flight
operations. The neural network learned how a correctly flying aircraft should behave.
Then if the aircraft suffered some type of damage the flight control system enables the
neural net and allows the network to correct mismatches between data on the plane's
airspeed, bearing, and the forces on its body versus what the network thinks the data
should be if the plane were flying normally. In this way the pilot can continue to fly a
damaged aircraft by controlling the plane as if it were undamaged. The neural network
does the job of transforming the pilot’s actions from normal operation to the necessary
operations given the plane is damaged in some way. The network was tested in high
performance maneuvers, such as tracking a target or performing a 360 degree roll. The
neural net managed to keep disabled planes under control even at supersonic speeds.
Process optimization is similar to process control in that a neural net is trained by
allowing it to observe some type of system in operation. In process control the
inputs are the system state and the outputs are the control positions that affect the
system. In process optimization the inputs and outputs are similar, but additional
inputs and/or outputs are also specified to represent some target state for the
system. For example, consider a vehicle that takes in fuel and air and produces
some speed. In operating this vehicle there are several factors that may be
important at any given moment, like speed, fuel consumption, wear on the
vehicle, safety, etc. Targeting one or more of these factors as most important
requires a careful balance of fuel, air intake, mechanical settings, and so on (e.g.,
if it is decided the vehicle must run at minimal fuel usage, it probably cannot
operate at maximum speed). Neural networks are used to balance system settings
so that one or more system factors can be maximized, minimized, or stabilized. In
the vehicle example a neural net could be set up to minimize fuel consumption by
carefully adjusting air intake, speed, and other mechanical settings that affect fuel
consumption. Process optimization represents one of the most challenging neural
network application areas. Expert systems have also been successfully applied to
both process control and optimization; they are the older and more traditional
way of applying an artificial intelligence solution in this domain. Expert systems,
however, have a few drawbacks: they still require a human expert to supply input
to the system, they must be tailor-made for each system, and they do not deal
well with unseen or imprecise data. Neural networks have the advantage that
they program themselves, provided of course they are given the proper input.
Neural networks are also very robust and deal well with unseen data. An expert
system faced with unknown or corrupt facts will not do anything, whereas a
neural network faced with unseen or corrupt data will respond with a "best
guess" answer. Provided the new data does not stray too far from the conditions
originally shown to the network, neural networks perform very well and can
extrapolate from the new information to a reasonable solution.
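The fuel-minimization example can be sketched as a search over a learned model of the process. The made-up fuel formula below stands in for a trained network; all numbers are invented.

```python
# Process-optimization sketch: hold one target factor fixed (speed) and
# search the controllable settings (air intake) for the best fuel figure.
def fuel_model(air_intake, speed):
    # Stand-in for a trained network: fuel rises when the intake is
    # starved (speed**2 / air_intake) and also when it is opened too far.
    return speed ** 2 / air_intake + air_intake

TARGET_SPEED = 60.0                                # the factor we hold fixed
candidates = [1.0 + 0.5 * i for i in range(300)]   # candidate air settings
fuel_used, air_setting = min(
    (fuel_model(a, TARGET_SPEED), a) for a in candidates
)
```

In a real installation the model would be the trained network itself and the search would run continuously as conditions change; here a one-shot grid search makes the idea visible.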
10.How to Determine If an Application is a Candidate
for a Neural Network:
There are several requirements and conditions a problem must meet if it is to be an
acceptable candidate for a neural network solution. First and foremost, the problem
must be tolerant of some level of imprecision. All artificial intelligence techniques
sacrifice some small measure of precision in favor of speed and tractability. This
imprecision may be very small, much less than one percent, or it may be relatively
large, such as ten percent. Neural network error rates tend to be below two percent;
for certain applications error rates can go as low as a very small fraction of one
percent. Any application that has zero tolerance for imprecision cannot be solved
with any artificial intelligence technique, including neural networks. For example,
digital data transmission algorithms must be perfectly precise. If even the tiniest
portion of a digital data transmission (e.g., sending a file over a network from
computer to computer) is corrupted, the entire transmission may be ruined.
Conversely, something like an analog voice or video transmission is very tolerant
of error. If a fraction of a second of a video or audio transmission is lost or
damaged, it may never be noticed by the observer. There are many such examples
of processes that can tolerate some small measure of error with no appreciable
impact on the problem.
Another requirement for a neural network solution is that abundant high-quality
data exists for both training and testing purposes. A neural network must be able to
observe the problem at hand, and it must be tested on that problem once it is trained
but before it is put into service. This may require massive amounts of training and
test data, depending on the complexity of the problem.
Related to the error tolerance requirement, neural networks (like all artificial
intelligence methods) work best when there exist one or more acceptable solutions
to a problem that are not necessarily the best solution. There are many problems for
which finding an acceptable solution is easy but finding the perfect solution
requires a practically impossible amount of resources. For example, there may be a
time-dependent problem in which "fast enough" is just as good as "fastest."
At the heart of any problem for which a neural network is the solution there must be
a pattern matching or categorization problem. This is not a difficult requirement to
meet, since pattern matching and categorization are inherent to a wide variety of
problems. Most of what humans do that is considered "intelligent" is really the
ability to quickly categorize and match what we see against what we know, and to
make a decision based on that match.
If a neural network is used in a process optimization or control application,
economics plays an important part in the neural network's usage. Neural networks
used in this area tend to be of marginal benefit; in other words, they provide
benefits at the edges of existing performance. In any system where small increases
in performance and efficiency translate to large changes in economic gain, a neural
network will prove very useful. This is especially true in systems where the small
gain in performance is very difficult to achieve but, once achieved, provides large
benefits.
An Example of a Neural Network
Imagine a highly experienced bank manager who must decide which customers will
qualify for a loan. His decision is based on a completed application form that contains ten
questions. Each question is answered by a number from 1 to 5 (some responses may be
subjective in nature).
Early attempts at "Artificial Intelligence" took a simplistic view of this problem. The
knowledge engineer would interview the bank manager(s) and decide that question
one is worth 30 points, question two is worth 10 points, question three is worth 15
points, and so on. Simple arithmetic was used to determine the applicant's total
rating, and a hurdle value was set for successful applicants. This approach helped to
give artificial intelligence a bad name.
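The simplistic weighted-sum scheme can be sketched in a few lines. Only the first three weights come from the text; the remaining weights and the hurdle value are invented for illustration.

```python
# Simplistic linear scoring of a ten-question loan application.
WEIGHTS = [30, 10, 15, 5, 10, 5, 10, 5, 5, 5]   # points per question (sums to 100)
HURDLE = 200                                     # hypothetical cutoff value

def rate(answers):
    """Total rating for ten responses, each a number from 1 to 5."""
    return sum(w * a for w, a in zip(WEIGHTS, answers))

def qualifies(answers):
    return rate(answers) >= HURDLE
```

The linearity is exactly the weakness the text goes on to point out: no fixed weight can express a rule such as "question two is meaningless when questions eight and nine are both high."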
The problem is that most real-life problems are non-linear in nature. Response #2 may be
meaningless if both response #8 and #9 are high. Response #5 should be the sole criterion
if both #7 and #8 are low.
Our ten-question application has almost 10 million possible responses. The bank
manager's brain contains a neural network that allows him to use "intuition".
Intuition allows the bank manager to recognize certain similarities and patterns that
his brain has become attuned to. He may never have seen this exact pattern before,
but his intuition can detect similarities, as well as deal with the non-linearities. He
is probably unable (and unwilling) to explain the very complex process of how his
intuition works. A complicated list of rules (called an "expert system") could be
drawn up, but these rules may give only a rough approximation of his intuition.
If we had a large number of loan applications as input, along with the manager's
decision as output, a neural network could be "trained" on these patterns. The inner
workings of the neural network have enough mathematical sophistication to
reasonably simulate the manager's intuition.
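Training on past decisions can be sketched with a tiny network learned by backpropagation. The two-question "applications", the deliberately non-linear approval rule, and the architecture are all invented for illustration; this is not any particular product's implementation.

```python
import math
import random

random.seed(42)

# Toy training data: approve (1) only when the two answers differ --
# a rule no weighted sum can capture.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

def sig(x):
    return 1.0 / (1.0 + math.exp(-x))

H = 4                                            # hidden units
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [random.uniform(-1, 1) for _ in range(H)]
w2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = random.uniform(-1, 1)

def forward(x):
    h = [sig(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return h, sig(sum(w * hi for w, hi in zip(w2, h)) + b2)

def mean_squared_error():
    return sum((forward(x)[1] - t) ** 2 for x, t in data) / len(data)

before = mean_squared_error()
lr = 0.5
for _ in range(5000):                            # training epochs
    for x, t in data:
        h, o = forward(x)
        d_o = (o - t) * o * (1 - o)              # output-layer error term
        for j in range(H):
            d_h = d_o * w2[j] * h[j] * (1 - h[j])  # hidden-layer error term
            w2[j] -= lr * d_o * h[j]
            for i in range(2):
                w1[j][i] -= lr * d_h * x[i]
            b1[j] -= lr * d_h
        b2 -= lr * d_o
after = mean_squared_error()
```

The hidden layer is what gives the network enough flexibility to fit the non-linear rule; a real loan network would simply use ten inputs and far more training examples.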
Another Example: A Real Estate Appraiser
Consider a real estate appraiser whose job is to predict the sale price of residential
houses. As with the bank loans example, the input pattern consists of a group of
numbers (for example: number of bedrooms, number of stories, floor area, age of
construction, neighborhood prices, size of lot, distance to schools, etc.). This
problem is similar to the bank loans example because it has many non-linearities
and is subject to millions of possible input patterns. The difference here is that the
output prediction will consist of a calculated value: the selling price of the house.
It is possible to train the neural network to simulate the opinion of an expert
appraiser, or to predict the actual selling price.
The above examples use a hypothetical bank manager and real-estate appraiser. Similar
examples could use a doctor, judge, scientist, detective, IRS agent, social worker,
machine operator or other expert. Even the behavior of some non-human physical process
could be modeled. NeuNet Pro includes several sample projects.
11.Introduction to Expert Systems:
Expert systems are one of the first and most practical applications derived
from research on artificial intelligence. Artificial intelligence (AI) is the area
of computer science in which scientists are striving to build machines that “think”
and “reason” in a fashion similar to humans. An expert system is software, based
on certain concepts of AI, that acts as a consultant or expert in a specific field or
discipline to help solve a problem or help make a decision. Expert systems are
also referred to as knowledge-based systems.
Expert systems attempt to supply both the knowledge and reasoning of human
beings. They are “expert” in only one field, topic or discipline; they can help
solve only a narrowly defined problem. The user provides data about a problem
through a keyboard and the computer responds with an answer and explanation
based on facts and rules that have earlier been extracted from human experts and
stored in the computer.
An expert system cannot entirely duplicate a human expert’s judgment or
make the final decision; but it can offer opinions, suggest possible diagnoses and
suggest various solutions to a problem. These programs are usually used as a
supplemental source of advice.
Because of their usefulness, expert systems are one of the first results of
AI research to become a viable commercial product. Oil companies use expert
systems to analyze geological data, while physicians use them to help diagnose
and treat illness. People in other types of diagnostic fields, professional assistance
and emergency management also take advantage of expert systems.
Until recently, most expert systems were designed for use only on large
computers because the programming demanded so much power and memory.
Now, many expert systems can be used with microcomputers; however, these
programs are still very expensive. Early work was done using PROLOG
(Programming in Logic), LISP (List Processing), and other specialized
programming languages. Now the trend is toward designing these systems for
microcomputers and with popular programming languages such as FORTRAN.
Conventional Programs versus Expert Systems:
An expert system differs from a traditional program that is used to solve a
problem (application software). In traditional software, there is the program and
the data that the program is given to work on. In an expert system, however, the
program is called the inference engine and the database has been replaced with a
knowledge base.
Traditional computer programs are composed of a detailed set of sequentially
organized instructions, an algorithm for the processing steps, and the computer
can do nothing but strictly follow the sequence of those instructions. Using
heuristic programming, an expert system can group instructions in any order and
allow different reactions to each situation it encounters. The exact processing
activities are determined by the data entered during a “consultation,” not by the
sequence of processing statements.
Heuristic programming is a key feature of expert systems. The main difference
between expert systems and traditional programs is the inclusion of heuristics,
rules of thumb about the problem. Heuristic programming is an attempt to
emulate human intuition, judgment and common sense. This type of program
allows the computer to recall earlier results and include them in its processing.
That newly gained knowledge is then added to its knowledge base and becomes a
basis for the next problem and its solution. Here, the computer has learned from
its own experience and mistakes, so when it encounters a new problem, it will
recall earlier results and consider them.
Difference between Expert Systems and Conventional Programming
Use of an inference engine rather than a program
Use of a knowledge base rather than a database
Processing determined by the data entered rather than by the sequence of
processing statements
An expert system is a knowledge-intensive program that solves a problem by
capturing the expertise of a human in limited domains of knowledge and
experience. An expert system can assist decision making by asking relevant
questions and explaining the reasons for adopting certain actions. Some of the
common characteristics of expert systems are the following:
They perform some of the problem-solving work of humans.
They represent knowledge in forms such as rules or frames.
They interact with humans.
They can consider multiple hypotheses simultaneously.
Today, expert systems are quite narrow, shallow, and brittle. They lack the
breadth of knowledge and the understanding of fundamental principles of a
human expert. Expert systems today do not “think” as a human being does. A
human being perceives significance, works with abstract models of causality, and
can jump to conclusions. Expert systems do not resort to reasoning from first
principles, do not draw analogies, and lack common sense.
Above all, expert systems are not generalized experts or problem solvers. They
typically perform very limited tasks that can be performed by professionals in a
few minutes or hours. Problems that cannot be solved by a human expert in the
same short period of time are far too difficult for an expert system. But by
capturing human expertise in limited areas, expert systems can provide real
benefits.
12.HOW EXPERT SYSTEMS WORK:
Four major elements compose an expert system: the knowledge domain or base,
the development team, the AI shell, and the user (see Figure 17.4). Subsequently,
we will describe each of these parts.
THE KNOWLEDGE BASE:
What is human knowledge? AI developers sidestep this thorny issue by asking a
slightly different question: How can human knowledge be modeled or represented in
a way that a computer can deal with it? This model of human knowledge used by
expert systems is called the knowledge base. Three ways have been devised to
represent human knowledge and expertise: rules, semantic nets, and frames.
RULES:
A standard structured programming construct is the IF-THEN construct, in which
a condition is evaluated; if the condition is true, an action is taken. For instance:
IF INCOME > $100,000 (condition)
THEN PRINT NAME AND ADDRESS (action)
A series of these rules can be a knowledge base. Any reader who has written
computer programs knows that virtually all traditional computer programs contain
IF-THEN statements. The difference between a traditional program and a rule-
based expert system program is primarily one of degree and magnitude. AI
programs can easily have 200 to 10,000 rules, far more than traditional programs,
which may have 50 to 100 IF-THEN statements. Moreover, in an AI program the
rules tend to be interconnected and nested to a far larger degree than in traditional
programs. The order in which the rules are searched depends in part on what
information the system is given. Multiple paths lead to the same result, and the
rules themselves can be interconnected. Hence the complexity of the rules in a
rule-based expert system is considerable.
Could you represent the knowledge in the Encyclopedia Britannica this way?
Probably not, because the rule base would be too large, and not all the knowledge
in the encyclopedia can be represented in the form of IF-THEN rules. In general,
expert systems can be used efficiently only in those situations where the domain of
knowledge is highly restricted (such as in granting credit) and involves no more
than a few thousand rules.
SEMANTIC NETS:
Semantic nets can be used to represent knowledge when the knowledge base is
composed of easily identified chunks or objects with interrelated characteristics.
Semantic nets can be much more efficient than rules. They use the property of
inheritance to organize and classify objects. A relationship like “Is-A” ties objects
together: “Is-A” is a pointer to all objects of a specific class. For instance, the
figure shows a semantic net used to classify kinds of automobiles. All specific
automobiles in the
lower part of the diagram inherit characteristics of the general categories of the
automobiles above them. Insurance companies can use such a semantic net to
classify cars into rating classes.
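An “Is-A” net like the automobile classification can be sketched as a lookup that walks up the inheritance chain; the categories and properties below are invented examples.

```python
# "Is-A" pointers: each object names its parent class.
is_a = {
    "sports_car": "automobile",
    "sedan": "automobile",
    "roadster": "sports_car",
}
# Properties attached at the most general level that they apply to.
properties = {
    "automobile": {"wheels": 4},
    "sports_car": {"rating_class": "high_risk"},   # e.g. an insurance rating
}

def lookup(obj, prop):
    while obj is not None:
        if prop in properties.get(obj, {}):
            return properties[obj][prop]
        obj = is_a.get(obj)                        # climb the Is-A pointer
    return None
```

Because a property is stored once and inherited everywhere below it, the net stays far smaller than an equivalent rule base.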
FRAMES:
Frames also organize knowledge into chunks, but the relationships are based on
shared characteristics rather than a hierarchy. This approach is grounded in the
belief that humans use “frames” or concepts to make rapid sense out of
perceptions. For instance, when a person is told to “look for a tank and shoot
when you see one,” experts believe humans invoke a concept or frame of what a
tank should look like. Anything that does not fit this concept of a tank is ignored.
In a similar fashion, AI researchers can organize a vast array of information into
frames. The computer is then instructed to search the database of frames and list
connections to other frames of interest. The user can then follow the various
pathways pointed to by the frames.
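Frame matching can be sketched as picking the concept whose expected characteristics best overlap what is observed; the frames and features are invented examples.

```python
# Each frame lists the characteristics expected of a concept.
frames = {
    "tank":  {"tracks", "turret", "armored"},
    "truck": {"wheels", "cargo_bed"},
}

def best_frame(observed):
    # Classify a perception by the frame it shares the most features with.
    return max(frames, key=lambda name: len(frames[name] & observed))
```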
THE DEVELOPMENT TEAM:
An AI development team is composed of one or several “experts,” who have a
thorough command of the knowledge domain, and knowledge engineers, who can
translate that knowledge into a set of rules, frames, or semantic nets. A
knowledge engineer is similar to a traditional systems analyst but has special
expertise in eliciting information and expertise from other professionals. The
knowledge engineer interviews the expert or experts and specifies the decision
rules and knowledge that must be captured by the system.
THE AI SHELL:
The AI shell is the programming environment of an expert system. AI systems
can be developed in just about any programming language, such as BASIC or
Pascal. In the early years of expert systems, computer scientists used specialized
programming languages such as LISP or Prolog that could process lists of rules
efficiently. Today a growing number of expert systems use either the C language
or, more commonly, AI shells that are user-friendly development environments.
AI shells can quickly generate user interface screens, capture the knowledge base,
and manage the strategies for searching the rule base. The best of these AI shells
generate C code, which can then be integrated into existing programs or tied into
existing data streams and databases.
THE INFERENCE ENGINE:
An inference engine works by searching through the rules and “firing” those
rules that are triggered by facts gathered and entered by the user. Basically, a
collection of rules is similar to a series of nested “IF” statements in a traditional
software program; however, the magnitude of the statements and the degree of
nesting are much greater in an expert system.
One of the most interesting parts of an expert system is the inference engine.
The inference engine is simply the search strategy used: forward chaining or
backward chaining.
In forward chaining, the inference engine begins with the information
entered by the user and searches the rule base to arrive at a conclusion. The
strategy is to “fire,” or carry out, the action of a rule when its condition is true.
In the figure, beginning on the left, if the user enters a client with income greater
than $100,000, the engine will fire all matching rules in sequence from left to
right. If the user then enters information indicating that the same client owns real
estate, another pass of the rule base will occur and more rules will fire. The rule
base can be searched each time the user enters new information. Processing
continues until no more rules can be fired.
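The forward-chaining pass can be sketched in code. The two rules paraphrase the figure's example (income over $100,000, ownership of real estate); their exact wording is a guess.

```python
# Rules as (conditions, conclusion) pairs.
rules = [
    ({"income > $100,000"}, "qualified lead"),
    ({"qualified lead", "owns real estate"}, "send sales rep"),
]

def forward_chain(facts):
    facts = set(facts)
    fired = True
    while fired:                         # re-scan the rule base after each pass
        fired = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)    # "fire" the rule
                fired = True
    return facts
```

Each new fact the user enters may enable further rules, which is why the engine keeps re-scanning until nothing more fires.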
In backward chaining, an expert system acts more like a problem solver who
begins with a question and seeks out more information to evaluate it. The
strategy for searching the rule base starts with a hypothesis and proceeds by
asking the user questions about selected facts until the hypothesis is either
confirmed or disproved. In the figure, ask the question, “Should we add this
person to the prospect database?” Begin on the right of the diagram and work
toward the left. You can see that the person should be added to the database if a
sales rep is sent, term insurance is granted, or a financial adviser is sent to visit
the prospect.
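Backward chaining can be sketched over the same kind of invented rule base: start from the hypothesis and recursively seek the facts that would confirm it, which is where the system's questions to the user come from.

```python
# Rules keyed by conclusion; each conclusion lists the condition sets
# that would establish it. The rule wording is invented.
rules = {
    "add to prospect database": [["send sales rep"]],
    "send sales rep": [["qualified lead", "owns real estate"]],
    "qualified lead": [["income > $100,000"]],
}

def prove(goal, known):
    if goal in known:                        # an established fact
        return True
    for conditions in rules.get(goal, []):   # any rule concluding the goal
        if all(prove(c, known) for c in conditions):
            return True
    return False
```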
THE USER:
The role of the user is both to pose questions of the system and to enter relevant
data to guide the system along. The user may employ the expert system as a
source of advice or to perform tedious and routine analysis tasks.
CREATING AN EXPERT SYSTEM:
The creation of an expert system is an involved process that requires
careful planning. The trend among those who need an ES is to acquire an
expert systems tool (shell) instead of writing the inference engine and
other code “from scratch”. The following are the major steps involved in
the creation of a knowledge-based system when an ES shell is used.
1. Select a domain and a particular task.
Choose a task that someone (an “expert”) can do well.
The performance of the task should be related to both breadth and
depth of knowledge.
The facts and rules should be stable.
The recommendations should be well defined.
2. Select the ES shell for implementation.
Decide what type of inference control is needed.
Decide what type of pattern-matching capability is needed.
Decide whether certainty factors are necessary.
Begin constructing a prototype system.
3. Acquire initial knowledge about the domain and task.
Identify the knowledge expert(s).
Select particular problems associated with each task.
Obtain, record, and cross-check factual knowledge from both reference
material and experts.
Obtain and record task-related rules from the experts and confirm them
to the degree possible.
Prepare a set of test cases.
4. Encode the knowledge, using the appropriate representation.
5. Execute and test the knowledge.
Evaluate the test cases.
Be alert for problems with consistency and completeness.
6. Refine the current knowledge and acquire additional domain knowledge.
Revise the rules as necessary.
Modify any facts that need revision.
Augment the system with information on additional domain tasks.
Repeat as often as necessary.
7. Complete any necessary interface code.
Demonstrate the system.
Make the system user-friendly.
8. Document the system.
Provide on-line and hard-copy documentation as necessary.
Document the consultation portion especially well.
Document the knowledge portion to the degree necessary.
If the expert system is to be coded from scratch, then many more concerns
must be addressed. They are related primarily to the design and coding of the
inference engine and the explanation subsystem. Coding from scratch can be a
long and difficult undertaking.
13. CAPABILITIES OF EXPERT SYSTEMS:
An expert system may also offer a number of other capabilities. For example:
1. It is often necessary to be able to easily remove (retract) given facts or
even remove (excise) given rules from consideration during a consultation.
Most expert systems provide this capability.
2. It is useful to be able to assign priorities to the firing of rules. This can be
accomplished by giving each rule a priority number that the inference
engine uses when deciding which rule to fire next.
3. Some expert systems permit combined forward and backward reasoning
(an example is opportunistic reasoning).
4. Certain systems permit the rules to invoke subprograms written in another
language (such as LISP, C or FORTRAN) to perform complex operations.
One example of this is knowledge-based simulation, wherein the knowledge-
based system invokes a discrete or continuous simulation system.
5. Some advanced expert systems combine facts with frames as well as rules
and thereby incorporate the capability of inheritance, wherein there is a
network of data structures and the “offspring” automatically inherit
properties of the “parents”.
6. As we have seen, one weakness of expert systems is their lack of so-called
deep knowledge. Without this type of knowledge, the expert system cannot
respond to a question or statement when there is no matching fact or rule.
Efforts are under way to build fundamental models to help solve this
problem.
7. Learning subsystems are currently being investigated as a way to simplify
the task of the knowledge base builder and the knowledge engineer so that
the knowledge base can be dynamically updated with reliable information.
A typical expert system, shell or tool has many capabilities. The number and
quality of the features also affect the cost of the system, so before selecting a
system, the user should have a good idea what capabilities she or he needs.
14.APPLICATIONS AND EXAMPLES OF EXPERT SYSTEMS:
After DENDRAL, CADUCEUS and MYCIN proved their usefulness, other
systems started to appear in a variety of fields. Today, expert systems are used in
areas such as economics, industry, medicine, education and data processing.
THE RCS (River Conservation Status) System:
South Africa’s river systems vary from practically pristine natural systems to
heavily exploited and degraded drainage ditches. The Olifants river system of the
southwestern Cape flows through a mountainous catchment of unique vegetation
and includes in its fauna eight fish species endemic to the system. In contrast, 150
km further south, the Black River flows in a concrete canal through Cape Town,
with effluent from the city’s main sewage works contributing up to 90% of the
flow. Other aspects of conservation concern are embodied in the Olifants River of
the Eastern Transvaal, one of the main drainage systems for the highly populated,
industrialized and mining-rich Witwatersrand area, which then flows through the
Kruger National Park.
Conservation in South Africa has been dominated by the economic and
popular appeal of the large animal populations in protected areas, so that until
recently, consideration of aquatic conservation has been confined to hippos,
crocodiles and a few fish species. Nevertheless, streams in undeveloped
catchments reflect unaltered natural conditions and could be preserved in this
state. Perhaps the most important issue is that South Africa is an arid country
where water is often the limiting resource for future development. Natural
freshwater lakes are unknown and groundwater reservoirs meager. Rivers in
South Africa are therefore under intense development pressure and the case for
conservation must be very strong to be given priority. A major problem has been
that no coherent river conservation policy has been developed and consequently it
has been extremely difficult for government agencies, planners and engineers to
understand and consider conservation priorities.
A need was therefore identified for a means of assessing the major
conservation attributes of rivers, for communicating these in a conceptually
simple manner to people who are not ecologists, and for investigating the likely
consequences of proposed river development schemes on the conservation of
rivers. Some aspects of conservation are quantifiable, but others involve
subjective value judgements. Expert systems are well suited for modeling and
decision making with such conservation problems.
The aims of the RCS project were to identify attributes of rivers which are
important for their conservation, to establish the relative nature and scale of this
importance, and then to quantify the conservation status of any particular river or
section of river. ‘Conservation status’ is defined as a measure of the relative
importance of the river for conservation and the extent to which it has been
disturbed from its natural state.
Given the required information about a river, the system must be able to provide:
A relative value of the conservation status of the river.
Relative values for different components of the river.
‘Confidence limits’ indicating how precisely the conservation status
can be measured and indicating where more accurate information is
needed.
A listing of the relative importance of each attribute in determining the
status of the river.
Opportunities for the user to manipulate the program to examine its
assumptions and change parameters.
The system was originally designed as a communication tool to describe
conservation priorities to managers, developers and planners in a consistent way,
so that ecological factors can be taken into account in plans for river exploitation
and development. The results are presented in a conceptually simple fashion, thus
allowing the non-specialist to appreciate the relative conservation status of rivers.
For the ecologist/conservationist, the primary function of the system is the
classification and mapping of rivers over an area. Within this function the system
can act as a conservation agency, classifying rivers of different conservation
status and also clearly identifying areas where more information is needed.
A second important function of the system is its use as a model to evaluate the
effects of planned changes. In this case the system uses information about the
river in its present state and then compares this with runs, using data or
predictions on river conditions following the planned changes. This should
provide a powerful tool for environmental impact assessment.
The construction, evaluation and testing of the system provided an extremely
valuable function in itself. This process forced the contributors to examine their
own assessment methods carefully. Many conservationists, for instance, make
‘intuitive’ judgments about the importance of particular sites. In fact, this
‘intuition’ comprises a complex net of interacting variables which are evaluated in
terms of an individual’s experience and the available information about the site.
To be forced to analyze these variables and their interrelationships often led to
considerable insight and also often helped to pinpoint areas of disagreement, so
that, while arguments were not eliminated, they were at least channeled into
specific resolvable problems.
In South Africa, the development of the RCS system has led to the identification
of those aspects which are generally felt to be the most important in conserving
rivers, and has provided a fair consensus as to the relative importance of different
attributes of rivers. The ability to present this consensus view to ecological
laymen charged with management responsibilities for rivers gives them a more
realistic opportunity to take conservation priorities into account at the planning
stage. Conservationists too often see managers, planners and developers as being
insensitive to environmental issues, when in fact a major part of the problem is
the inability of the conservationists to present a concise summary of their complex
points of view. It is unreasonable to expect laymen to unravel the multivariate
probabilities and diffuse intuitions of the conservation ethic. The RCS
encapsulates the more important components of river conservation status.
As with the other ecological systems discussed here, this system has acted as a
focus for a number of interested experts. It served to identify points of conflict
between experts and thus to identify areas where further research is required.
The FISHFARMER System:
The lack of aquaculture expertise has been identified as one of the major
constraints on the development of aquaculture in southern Africa. It was decided
to develop an expert system to assist potential aquaculturists in assessing the
aquaculture potential of various fish species in relation to particular sites and
culture methods.
Computers have played an increasingly important role in aquaculture in recent
years, but their use has been limited to data storage (to aid in pond management)
and the monitoring of oxygen levels and water flow. As far as is known,
FISHFARMER represents the first use of expert system techniques as a means of
assessing aquaculture potential. Great emphasis has been placed on the ‘user
friendliness’ of the system.
Fish culture has been practiced for over 2000 years, but it is only recently that
intensive aquaculture has developed into a commercial activity, which is
dependent upon advanced technology and scientific research. World output from
aquaculture increased by 40% between 1975 and 1984.
Wellesley and Burton propose a number of steps that should be taken if the
industry is to develop. These include the establishment of a lead agency to initiate
marketing, coordinate research and to promote the transfer of technology.
As a means to this end, it was decided to develop an expert system to evaluate
the aquaculture potential of a given organism, site and culture method. Such an expert
system would provide the ability to assess the aquaculture potential of a particular
site independently of human aquaculture experts. In addition, it was hoped that
the development of the system would have two side effects. In the first instance, it
would bring together information from a wide number of different sources and
secondly it would clearly identify ‘holes’ in the available knowledge.
The requirements of the system were the following:
1. Given available data on the biological, physical, financial and infrastructural
parameters pertaining to a potential site, the system should be able to:
   Evaluate the site with regard to its suitability for fish culture.
   If the site is suitable for fish culture, evaluate the species included in
   the knowledge base in relation to the environmental characteristics of
   the site.
   Provide a confidence value with each recommendation.
2. The system should be ‘user friendly’ in that it must:
   Explain (on demand) why a particular question is being asked.
   Explain (on demand) the rationale of its reasoning at any given point in the
   consultation.
   Explain (on demand) any technical terms and procedures associated
   with any of the questions.
   Allow manipulation of the system by the user to examine its
   assumptions and change parameters.
   Require a minimum of computer expertise for its use.
It is envisaged that Nature Conservation officers, fish farming consultants and
researchers will use the system and, of course, anyone interested in assessing the
aquaculture potential of a site.
FISHFARMER attempts to determine the optimum match between a particular
site, water source, market, species, culture method and financial resources. The
system uses information provided by the user in conjunction with that from its
own knowledge base in order to make the correct match. The system queries the
user in detail about the proposed site in order to determine whether the site,
culture and species complement each other.
Good use has been made of the relative importance measure. Thus water
temperature, being a particularly crucial environmental parameter, influences the
system’s decision far more than would, for example, the state of the site’s access
road, and so is given a higher rating. In addition to answering questions, the
user may ask for further background information about the questions being asked.
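The relative-importance weighting described above can be sketched in a few lines. Everything here (parameter names, weights and scores) is invented for illustration; it is not the actual FISHFARMER knowledge base.

```python
# Hypothetical sketch of FISHFARMER-style weighted matching: each site
# parameter is scored in [0, 1] and multiplied by a relative-importance
# weight, so water temperature sways the result far more than the state
# of the access road. All names, weights and scores are illustrative.

WEIGHTS = {
    "water_temperature": 5.0,  # particularly crucial, so rated highest
    "water_quality": 4.0,
    "market_distance": 2.0,
    "access_road": 1.0,        # far less influential
}

def suitability(scores):
    """Return a confidence value in [0, 1] for a site, given
    per-parameter scores in [0, 1]."""
    total = sum(WEIGHTS[p] * scores[p] for p in WEIGHTS)
    return total / sum(WEIGHTS.values())

site = {"water_temperature": 0.9, "water_quality": 0.8,
        "market_distance": 0.5, "access_road": 0.2}
print(round(suitability(site), 2))  # a poor access road barely moves the score
```

The normalised weighted sum doubles as the confidence value attached to each recommendation, which is one simple way of meeting the requirement above.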
The development of the FISHFARMER expert system does not lend itself to a
single researcher working in isolation; the design and contents of the knowledge
base constantly need to be challenged and discussed. This project has benefited
from the inputs of a wide variety of individuals: aquaculture researchers, computer
scientists and commercial fish farmers.
The FISHFARMER project has proved to be a most useful exercise from a
number of different points of view. It has consolidated information and
knowledge from a wide range of sources and people. Many areas which require
further in-depth research have been identified, and it has provided the user with a
useful tool to assist in solving problems and developing sites.
EMEX: An Expert System for Market Analysis and Forecasting
The aim of the EMEX system is to guide the user through the stages
of the model-building task. Its role is that of assistance. It is important that the
user, as the expert in the market being modeled, exercises judgment over the
results and suggestions that the system makes. The combination of the system’s
model building expertise and user’s market expertise together can provide a very
powerful insight into the operation of the market.
Ease of use is an important consideration for such a package. The
user and the system interact via a series of forms and menus. There are facilities
to allow the user to browse through the details of the current consultation, to
change previously entered information and at all times a help system can offer
advice relevant to the current context.
EMEX is now in regular use and continues to undergo refinement. To date
its performance has been judged by comparing the models it generates with the
models produced by experts, and on several occasions it has actually improved on
models originally built by the experts. Moreover, a model which might take an expert
half a day to build is likely to take around thirty minutes for EMEX, so
great time savings can also be made.
APES: An Expert System for Nutrition Education
Interaction with a knowledge base can be good for your health! This was the
starting point for an attempt at constructing an environment for exploration by
pupils, students, teachers and experts.
Part of this environment is an expert system developed to provide
advice on the nutritional properties of food and the needs of a healthy diet. This
requires an underlying knowledge base which itself includes a very large bank of
data. A powerful learning environment or micro-world is then capable of
interrogation, amendment and of supporting computer aided learning (CAL)
applications that represent, for example, a cooking process or the composition of a
diet for analysis.
A key feature of the project is the construction of a database, database
management system (DBMS) and a set of rules for presentation to and use by the
user. To keep the initial development simple, a non-probabilistic, backward-
chaining production rule system was chosen for implementation: Micro-Prolog
Professional and APES (Augmented Prolog for Expert System) produced by
Logic Programming Associates. Additional reasons for choosing this system were
that it was available for microcomputers already in use in schools and that
powerful learning packages had already been successfully built, particularly by
the Exeter Project under Jon Nichol at Exeter University. Application packages
based on Micro-Prolog have been well received by History teachers.
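A non-probabilistic, backward-chaining production rule system of the sort APES provides can be illustrated with a minimal sketch, written here in Python rather than Micro-Prolog. The facts, rules and the `high_in`/`good_for` vocabulary are illustrative assumptions, not the project's actual knowledge base.

```python
# Minimal sketch of a non-probabilistic, backward-chaining production
# rule system, in the spirit of APES but written in Python rather than
# Micro-Prolog. Facts and rules below are invented for illustration.

FACTS = {("high_in", "bran", "fiber")}

# Each rule is (head, body): the head holds if every body goal can be
# proved. "X" stands for a variable bound when the head is matched.
RULES = [
    (("good_for", "X", "constipation"), [("high_in", "X", "fiber")]),
]

def prove(goal):
    """Backward chain: a goal holds if it is a known fact, or if it
    matches a rule head whose body goals can all be proved."""
    if goal in FACTS:
        return True
    for head, body in RULES:
        if head[0] == goal[0] and head[2] == goal[2]:
            x = goal[1]  # bind variable X to the queried subject
            return all(prove((rel, x if subj == "X" else subj, obj))
                       for rel, subj, obj in body)
    return False

print(prove(("good_for", "bran", "constipation")))  # proved via the fiber rule
```

Backward chaining starts from the question the user asks and works back to the stored facts, which is why such a system can explain, at any point, why a particular question is being asked.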
The involvement of experts in knowledge engineering is a two-way process. As
they construct the knowledge base, so too they learn how to structure their
knowledge and make explicit the underlying rules of behavior. Furthermore, in
the field of education this process can be harnessed to improve and extend the
learning abilities of students.
A wide range of expert system shells and AI languages is becoming available;
even in education, specialist shells are being developed. How does one choose
which system to adopt? For this application we needed a shell which could work
with our own database module and a language which supports complex list handling
and external files, as well as possessing a syntax that is reasonably accessible to
nutritionists, teachers and students. APES suited these purposes: it is flexible,
extensible and available. The main drawback to this relatively open environment,
however, is that the interface is not as user friendly as some shells.
The database itself can be hidden from the ordinary user in two major ways.
For large applications the best strategy is to externalize very large files that cannot
be held in memory and use the indexed method of record retrieval available with
Micro-Prolog for reasonably fast access. The rules for extracting and
manipulating the data can be held in a closed APES module loaded in with APES
itself and an integral part of it. For prototyping, however, it should be sufficient to
code part of the file as Prolog clauses and make the low level DBMS rules non-
interactive, i.e. hidden from the user. APES allows for tailoring the environment
to suit the application.
Prototyping the knowledge base was relatively straightforward. Both bottom-up
and top-down methods were needed. Once the basic data needs were identified,
efficient rules for presenting meaningful information were written. At the same
time, teachers and experts sketched out the applications which would use this
information. The skill lay in matching the needs of the application with the
potential of the database. Differences arose over what programmers considered to
be the simple choice of clause names: to nutritionists and health experts, words
such as high in and good-for as in ‘foods high in fiber are good for constipation’
cause all sorts of problems which were not immediately apparent to us! A lesson
learnt here was that the experts need to be more familiar with what the system can
do in order to define the language to be presented to the user.
APES is not ideally suited to education. The interface is too complex, some of
its language too advanced and, in any case, specialized applications do require
specialized front-ends. Expert system shells aimed at the needs of education have
been developed, with Micro-Prolog as the host language, at Kingston College.
There is still a problem, though, that simplified shells for education restrict the
complexity of the micro-world available to pupils. Other avenues being explored
for greater ease of use include object oriented languages and environments, but it
is still early days.
Another difficulty encountered was lack of teacher familiarity with declarative
programming and the lack of suitable software systems in schools. There will
need to be an investment in training in fifth generation computing before we even
begin to discover the potential for interaction with knowledge bases and writing
small “expert system” programs. Ten years of similar investment in Logo is only
just beginning to show research results, though not all positive. Nonetheless, the
small amount of work we have done with teachers, pupils, students and health
educationalists convinces us that there is potential in educational applications of
expert systems.
THE DIABETES EXPERT SYSTEM:
The expert system DIABETES is used for tutoring and teaching medical
students, general practitioners and medical related staff on diabetes diagnoses and
management. Major complications of diabetes are dealt with as well as treatment
using insulin administration. The basic idea of the system is to present the user
with a general tool for experimenting with a large number of patient cases by the
choice of symptoms and history from a menu-driven interface. It is a system based
on production rules and can be an effective decision-making tool in such a
complex area of medicine as diabetes.
15. EXAMPLES OF SUCCESSFUL EXPERT SYSTEMS:
There are many successful expert systems. However there is no accepted
definition of successful. What is successful to an academic (“It works!”) may not
be successful to a corporation (“It cost a million dollars!”). While some of the
better-known expert systems are quite large and cost millions of dollars, others
are less expensive and tackle interesting but smaller problems. Some of the most
celebrated systems are not used to facilitate routine decision-making. Finding out
which successful systems are used on a daily basis is difficult because
corporations regard this information as proprietary. Nevertheless, we can briefly
describe some of the better-known commercial success stories.
1. Whirlpool uses the Consumer Appliance Diagnostic System (CADS) to help
its customer service representatives handle its 3 million annual telephone
inquiries. The system expedites customer service by directing customers to a
single source of help without delay. Previously, customers who had a problem or
question about Whirlpool products might have to be put on hold or directed to two
or three different representatives before their questions could be answered.
Whirlpool developed CADS using Aion’s Development System for OS/2 as its
expert system shell. Two knowledge engineers worked with one programmer and
three of the company’s customer service experts to capture 1000 rules for 12
product lines. By 1999, Whirlpool expects to use CADS to respond to 9 million
inquiries annually.
2. The National Aeronautics and Space Administration (NASA) developed
MARVEL, its Multimission Automation for Real-Time Verification of Spacecraft
Engineering Link, to monitor its Voyager missions without burning out analyst
after analyst. Spacecraft flights on long missions generate voluminous and critical
information that must be carefully analyzed. MARVEL monitors NASA’s
computer-command subsystem, which receives and executes commands from the
ground and also analyzes power, propulsion flight-data subsystems, and
telecommunications functions. NASA developed MARVEL with the assistance of
the equivalent of 1.5 full-time computer scientists and two mission experts.
MARVEL is based on Software Architecture and Engineering’s Knowledge
Engineering System expert system shell and runs on Sun workstations.
3. Countrywide Funding Corp. in Pasadena, California, a loan underwriting firm
with about 400 underwriters in 150 offices around the country, developed a
microcomputer-based expert system in 1992 to make preliminary creditworthiness
decisions on loan requests. The company had experienced rapid, continuing
growth and used the system to help ensure consistent and high-quality loan
decisions.
4. CLUES (Countrywide’s Loan Underwriting Expert System) has about 400
rules. Countrywide tested the system by having every loan application handled by
a human underwriter fed to CLUES. The system was refined until it agreed with
the underwriter in 95 percent of the cases. However, Countrywide will not rely on
CLUES to reject loans because the expert system cannot be programmed to
handle exceptional situations such as those involving a self-employed person or
complex financial schemes. An underwriter will review all rejected loans and will
make the final decision. CLUES has other benefits: using the system, an
underwriter can evaluate at least sixteen loan applications per day (Nash, 1993).
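The validation procedure described for CLUES (feeding every application already handled by a human underwriter to the system and measuring agreement) can be sketched as follows; the sample decisions are invented for illustration.

```python
# Sketch of the CLUES validation procedure: applications already handled
# by human underwriters are fed to the expert system, and the rule base
# is refined until agreement reaches the 95 percent target. The decisions
# below are invented sample data, not real loan cases.

human_decisions  = ["approve", "approve", "reject", "approve", "reject"]
system_decisions = ["approve", "approve", "reject", "reject", "reject"]

def agreement_rate(human, system):
    """Fraction of cases where the expert system matches the human."""
    return sum(h == s for h, s in zip(human, system)) / len(human)

rate = agreement_rate(human_decisions, system_decisions)
print(f"{rate:.0%}")  # below the 95% target, so the rules would be refined further
```

Measuring agreement against human experts, rather than against ground-truth loan outcomes, is what allowed the refinement to proceed before the system was trusted in production.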
5. The Digital Equipment Corporation (DEC) and Carnegie-Mellon University
developed XCON in the late 1970s to configure VAX computers on a daily basis.
The system configures customer orders and guides the assembly of those orders at
the customer site. XCON has been used for major functions such as sales and
marketing, manufacturing and production, and field service, and played a strategic
role at DEC (Sviokla, June 1990; Barker and O’Connor, 1989). It is estimated that
XCON and related systems saved DEC approximately $40 million per year.
6. The table describes other well-known expert systems in terms of their size and
programming languages. As can be seen, these systems generally have a
minimum of several hundred rules. Note Digital Equipment Corporation’s
XCON, which started out with 250 rules but expanded to about 10,000.
7. These examples show that expert systems can provide organizations with an
array of benefits, including reduced errors, reduced cost, reduced training time,
improved decisions, and improved quality and service. The Window on
Organization shows how expert systems can be applied to solve some problems in
medicine and health care.
16. PROBLEMS WITH EXPERT SYSTEMS:
A thorough understanding of expert systems also requires awareness of their
current limitations and problems.
1. Expert Systems Are Limited to Certain Problems
In answer to the question, “Why do some expert systems work?” critics point
out that virtually all successful expert systems deal with problems of classification
in which there are relatively few alternative outcomes and in which these possible
outcomes are all known in advance. Contrary to early promises, expert systems do
best at automating lower-level clerical functions. Even in these comparatively
simple situations, however, expert systems require large, lengthy, and expensive
development efforts. For these kinds of problems, hiring or training more experts
may be less expensive than building an expert system.
2. Important Theoretical Problems Exist
There are significant theoretical problems in knowledge representation. IF-
THEN knowledge exists primarily in textbooks. There are no adequate
representations for deep causal models or temporal trends. No expert system, for
instance, can write a textbook on information systems or engage in other creative
activities not explicitly foreseen by system designers. Many expert systems
cannot yet replicate knowledge that is intuitive, based on analogy and on a “sense of things.”
3. Expert Systems Are Not Applicable to Complex Managerial Problems
The applicability of expert systems to complex managerial problems is
currently highly limited. Many managerial problems generally involve drawing
facts and interpretations from divergent sources, evaluating the facts, and
comparing one interpretation of the facts with another, and do not involve
analysis or simple classification. Expert systems cannot address complex
problems requiring intuition to solve. Expert systems based on the prior
knowledge of a few known alternatives are unsuitable for the problems managers
face on a daily basis.
4. Expert Systems Are Expensive to Maintain
The knowledge base of expert systems is fragile and brittle; they cannot learn
or change over time. In fast-moving fields like medicine or the computer sciences,
keeping the knowledge base up to date is a critical problem. For applications of
even modest complexity, expert system code is generally hard to understand,
debug, and maintain. Adding new rules to a large rule-based program nearly
always requires revision of the control variables and conditions of earlier rules.
Which of these entries to change to make the next rule work is often far from obvious.
5. A More Limited Role For Expert Systems
Although expert systems lack the robust and general intelligence of human
beings, they can provide benefits to organizations if their limitations are well
understood. Expert systems have proved especially useful for certain types of
diagnostic problems. They can provide electronic checklists for lower-level
employees in service bureaucracies like banking, insurance, sales, and welfare
agencies. Elements of expert system technology have been incorporated into a
wide variety of products and services (Hayes-Roth and Jacobstein, 1994). (An
example would be expert tax advice provided in popular financial planning and
tax calculation software packages.) In limited areas, expert systems can help
organizations make higher-quality decisions using fewer people.