CATEGORY: Emerging Technologies POSTED ON: 10/14/2012
1. Introduction to Artificial Intelligence
Some people suggest that the computer advances of the last forty years are only the beginning, and that even more dramatic breakthroughs in electronics and computers are looming on the horizon. About 100 years ago, the United States progressed from an agricultural society and economy to an industrial one. The Industrial Revolution, centered on the steam engine and similar technologies, resulted in a dramatic change in methods of factory production that led to an urbanization of population patterns and major changes in American lifestyles. Now, according to those experts who study trends, we are seeing an Information Revolution, one that affects economic and societal patterns via the transfer of information. Starting in about 1947 with the ENIAC (one of the first general-purpose, electronic digital computers), the U.S. began to experience a tremendous increase in its ability to collect and store information, and Americans began to develop new technologies to manipulate and communicate that information. Some people suggest that the "revolution" is actually over. They say that the most important discoveries have been made (computers and communication technology) and that we are now in a stage of evolving or refining those technologies and adjusting to the changes occurring in our personal lives, our economy and our work. Whatever the stage, we are clearly in the midst of the Information Age, living in an information-centered economy with all its rewards and problems. Although some changes have occurred relatively fast, progress has been very slow in giving computers "intelligence", and no major breakthrough or "intelligence revolution" has occurred yet. The discovery and harnessing of electricity, coupled with the development of vacuum tubes, brought us to the first generation of computers. Then semiconductors (transistors) brought in the second generation.
Integrated circuits, containing thousands of switches on a chip, were key to the third. Finally, microscopic-sized circuits on a chip (LSI and VLSI) were the elements of the fourth generation. Now scientists are looking for the technological breakthrough that will propel us into the fifth generation of computers. Many people argue that the development of artificial intelligence will mark the beginning of that fifth generation. Meanwhile, scientific research is being done on several fronts, all with the purpose of: (a) increasing the speed, memory and power of computers; (b) teaching them to think like humans; (c) making them easier to use; (d) finding practical applications for the existing technology. It may seem that there is no place left to go. After all, computers are now inexpensive and readily available to almost anyone in the U.S. who wants to use one. Computers can calculate many times faster than humans, computerized robots can work longer hours, and data banks can remember much more data than the human mind.
AI: DEFINITION
One of the major difficulties encountered early in AI was the definition and a clear understanding of intelligence. Three definitions of intelligence that have been given are:
1. Intelligence is a state of grasping the truth, involving reason, concerned with action about what is good or bad for a human being.
2. The test of a first-rate intelligence is the ability to hold two opposite ideas in the mind at the same time and still retain the ability to function.
3. The ability to learn or understand from experience, the ability to acquire and retain knowledge, and the ability to respond quickly and successfully to a new situation; use of the faculty of reason in solving problems; directing conduct effectively.
Clearly, there is no consensus on a clear definition of intelligence, though all definitions seem to agree that intelligent behavior would include the following traits:
Learning
Managing ambiguity
Coping with complexity
Responding quickly
Reasoning
Inferencing
Prioritizing
A working definition of AI, then, would be to program computers to carry out tasks that would require intelligence if carried out by human beings. This gives us four possible goals to pursue in artificial intelligence:
Systems that think like humans.
Systems that act like humans.
Systems that think rationally.
Systems that act rationally.
TYPES OF ARTIFICIAL INTELLIGENCE
One common definition states that artificial intelligence (AI) is the area of computer science that deals with the ways in which computers can be made to perform cognitive functions ascribed to humans. It does not say what functions are performed, to what degree they are performed, or how these functions are carried out. In fact, there are at least three different views on what the term artificial intelligence means.
1. AI is the embodiment of human intellectual capabilities within a computer. This view is called Strong AI.
2. AI is a set of computer programs that produce output that would be considered to reflect intelligence if it were generated by humans.
3. AI is the study of mental faculties through the use of mental models implemented on a computer. This view is called Weak AI.
2. Components of AI
Broadly speaking, AI consists of the following four components:
1) Natural Language Processing (NLP). This involves the study, understanding and processing of natural languages so as to provide natural language interfaces to information systems, machine translation, etc.
2) Computer Vision. This entails the ability to recognize shapes, features, etc. automatically, and in turn to automate movement through robots.
3) Heuristic Problem Solving.
This consists of solving computationally hard problems by smart methods that quickly evaluate a small number of candidate solutions to reach the optimal or a near-optimal solution.
4) Expert Systems. These comprise computer programs that are able to exhibit expert-like performance in a specific narrow application domain, which is the main focus of this text.
3. Regular Programming Versus AI Programming
Let's compare regular programming and AI programming in terms of these segments.
REGULAR PROGRAMMING
INPUT: A sequence of alphanumeric symbols that is presented and stored according to a given set of previously stipulated rules and that utilizes a limited set of communication media, such as a keyboard, magnetic disk or magnetic tape.
PROCESSING: Manipulation of the stored symbols by a set of previously defined algorithms. (An algorithm is a set of step-by-step instructions that completely and unambiguously specify how to solve a problem in a finite length of time.)
OUTPUT: A sequence of alphanumeric or graphic symbols, possibly in a given set of colors, that represents the result of the processing and that is placed on such a medium as a CRT screen, paper, or magnetic disk or tape.
Regular programming tends to be relatively inflexible in terms of the type and order of input and output. Both numeric processing and character processing are done on an item-by-item basis. If special data structures are needed, they must normally be specified during development of the computer program. The easiest programs to write are those that involve well-defined processes with very little variation. AI programming, on the other hand, often mimics the more creative and less well-defined functions that people perform. These include functions that are related to the five senses.
AI PROGRAMMING
INPUT
Sight: one-dimensional linear symbols such as typed text, two-dimensional objects such as planar maps, three-dimensional scenes such as images of objects.
Sound: spoken language, music, noises made by objects.
Touch: temperature, smoothness, resistance to pressure.
Smell: odors emanating from animate and inanimate objects.
Taste: sweet, sour, salty and bitter foodstuffs and chemicals.
PROCESSING
Knowledge representation and pattern matching: the way concepts about the world are represented, organized, stored and compared.
Search: the way representations of concepts are found and related to one another.
Logic: the way deductions are made and inferences drawn.
Problem solving: the way an overall approach is planned, organized, and executed.
Learning: the way new concepts are automatically added and previous concepts revised.
OUTPUT
Printed language and synthesized speech.
Manipulation of physical objects (via rotation and translation).
Locomotion (one-, two-, and three-dimensional movement in space). Here the word "space" may refer to any location: in the atmosphere, in a vacuum, underground, under water, or in a hazardous environment.
4. Major Areas of AI
Although research in artificial intelligence started in the mid-1950s, it is still in its early stages of development. Progress is slow. Specifically, the areas of artificial intelligence that are being pursued are: (a) expert systems; (b) advanced robotics; (c) natural-language processing; (d) voice synthesis; (e) voice recognition; (f) computer vision; (g) symbolic and numeric processing; (h) knowledge representation; (i) pattern matching; (j) search; (k) logical inference; (l) problem solving; (m) learning; (n) social and ethical issues.
Thinking humanly: the cognitive modeling approach
If we are going to say that a given program thinks like a human, we must have some way of determining how humans think. We need to get inside the actual workings of human minds. There are two ways to do this: through introspection (trying to catch our own thoughts as they go by) or through psychological experiments.
Once we have a sufficiently precise theory of the mind, it becomes possible to express the theory as a computer program. If the program's input/output and timing behavior matches human behavior, that is evidence that some of the program's mechanisms may also be operating in humans. For example, Newell and Simon, who developed GPS, the "General Problem Solver" (Newell and Simon, 1961), were not content to have their program correctly solve problems. They were more concerned with comparing the trace of its reasoning steps to traces of human subjects solving the same problems. This is in contrast to other researchers of the same time (such as Wang (1960)), who were concerned with getting the right answers regardless of how the human mind does it. We will simply note that AI and cognitive science continue to fertilize each other, especially in the areas of vision, natural language, and learning.
Thinking rationally: the laws of thought approach
The Greek philosopher Aristotle was one of the first to attempt to codify "right thinking," that is, irrefutable reasoning processes. His famous syllogisms provided patterns for argument structures that always gave correct conclusions given correct premises. For example, "Socrates is a man; all men are mortal; therefore Socrates is mortal." These laws of thought were supposed to govern the operation of the mind, and initiated the field of logic.
Acting rationally: the rational agent approach
Acting rationally means acting so as to achieve one's goals, given one's beliefs. An agent is just something that perceives and acts. (This may be an unusual use of the word, but you will get used to it.) In this approach, AI is viewed as the study and construction of rational agents. In the "laws of thought" approach to AI, the whole emphasis was on correct inferences.
Making correct inferences is sometimes part of being a rational agent, because one way to act rationally is to reason logically to the conclusion that a given action will achieve one's goals, and then to act on that conclusion. On the other hand, correct inference is not all of rationality, because there are often situations where there is no provably correct thing to do, yet something must still be done. There are also ways of acting rationally that cannot reasonably be said to involve inference. For example, pulling one's hand off a hot stove is a reflex action that is more successful than a slower action taken after careful deliberation.
The State of the Art
International grandmaster Arnold Denker studies the pieces on the board in front of him. He realizes there is no hope; he must resign the game. His opponent, Hitech, becomes the first computer program to defeat a grandmaster in a game of chess. "I want to go from Boston to San Francisco," the traveler says into the microphone. "What date will you be traveling on?" is the reply. The traveler explains she wants to go October 20th, nonstop, on the cheapest available fare, returning on Sunday. A speech understanding program named Pegasus handles the whole transaction, which results in a confirmed reservation that saves the traveler $894 over the regular coach fare. Even though the speech recognizer gets one out of ten words wrong, it is able to recover from these errors because of its understanding of how dialogs are put together. An analyst in the Mission Operations room of the Jet Propulsion Laboratory suddenly starts paying attention. A red message has flashed onto the screen indicating an "anomaly" with the Voyager spacecraft, which is somewhere in the vicinity of Neptune. Fortunately, the analyst is able to correct the problem from the ground.
Operations personnel believe the problem might have been overlooked had it not been for Marvel, a real-time expert system that monitors the massive stream of data transmitted by the spacecraft, handling routine tasks and alerting the analysts to more serious problems. Cruising the highway outside of Pittsburgh at a comfortable 55 mph, the man in the driver's seat seems relaxed. He should be: for the past 90 miles, he has not had to touch the steering wheel. The real driver is a robotic system that gathers input from video cameras, sonar, and laser range finders attached to the van. It combines these inputs with experience learned from training runs and successfully computes how to steer the vehicle. A leading expert on lymph-node pathology describes a fiendishly difficult case to the expert system, and examines the system's diagnosis. He scoffs at the system's response. Only slightly worried, the creators of the system suggest he ask the computer for an explanation of the diagnosis. The machine points out the major factors influencing its decision, and explains the subtle interaction of several of the symptoms in this case. The expert admits his error, eventually. From a camera perched on a street light above the crossroads, the traffic monitor watches the scene. If any humans were awake to read the main screen, they would see "Citroen 2CV turning from Place de la Concorde into Champs Elysées," "Large truck of unknown make stopped on Place de la Concorde," and so on into the night. And occasionally, "Major incident on Place de la Concorde, speeding van collided with motorcyclist," and an automatic call to the emergency services. These are just a few examples of artificial intelligence systems that exist today. Not magic or science fiction, but rather science, engineering, and mathematics, to which this book provides an introduction.
The Transition from Lab to Life
The impact of computer technology, AI included, was felt widely.
No longer was computer technology confined to a select few researchers in laboratories. The personal computer made its debut, along with many technological magazines. Such foundations as the American Association for Artificial Intelligence also started. With the demand for AI development, there was also a push for researchers to join private companies. 150 companies, such as DEC with its AI research group of 700 personnel, spent $1 billion on internal AI groups. Other fields of AI also made their way into the marketplace during the 1980s. One in particular was the machine vision field. The work by Minsky and Marr was now the foundation for the cameras and computers on assembly lines performing quality control. Although crude, these systems could distinguish different shapes in objects using black-and-white differences. By 1985 over a hundred companies offered machine vision systems in the US, and sales totaled $80 million. The 1980s were not entirely good for the AI industry. In 1986-87 the demand for AI systems decreased, and the industry lost almost half a billion dollars. Companies such as Teknowledge and Intellicorp together lost more than $6 million, about a third of their total earnings. The large losses convinced many research leaders to cut back funding. Another disappointment was the so-called "smart truck" financed by the Defense Advanced Research Projects Agency. The project's goal was to develop a robot that could perform many battlefield tasks. In 1989, due to project setbacks and unlikely success, the Pentagon cut funding for the project. Despite these discouraging events, AI slowly recovered. New technology was being developed in Japan. Fuzzy logic, first pioneered in the US, has the unique ability to make decisions under uncertain conditions. Neural networks were also being reconsidered as possible ways of achieving artificial intelligence.
The 1980s introduced AI to its place in the corporate marketplace and showed that the technology had real-life uses, ensuring it would be a key technology in the 21st century.
AI Put to the Test
The military put AI-based hardware to the test of war during Desert Storm. AI-based technologies were used in missile systems, heads-up displays, and other advancements. AI has also made the transition to the home. With the popularity of the AI computer growing, the interest of the public has also grown. Applications for the Apple Macintosh and IBM-compatible computers, such as voice and character recognition, have become available. AI technology has also made steadying camcorders simple using fuzzy logic. With a greater demand for AI-related technology, new advancements are becoming available. Inevitably, artificial intelligence has affected, and will continue to affect, our lives.
5. Branches of AI
Logical AI. What a program knows about the world in general, the facts of the specific situation in which it must act, and its goals are all represented by sentences of some mathematical logical language. The program decides what to do by inferring that certain actions are appropriate for achieving its goals. The first article proposing this was [McC59]; [McC89] is a more recent summary; [McC96] lists some of the concepts involved in logical AI; [Sha97] is an important text.
Search. AI programs often examine large numbers of possibilities, e.g. moves in a chess game or inferences by a theorem-proving program. Discoveries are continually made about how to do this more efficiently in various domains.
Pattern recognition. When a program makes observations of some kind, it is often programmed to compare what it sees with a pattern. For example, a vision program may try to match a pattern of eyes and a nose in a scene in order to find a face. More complex patterns, e.g. in a natural language text, in a chess position, or in the history of some event, are also studied.
These more complex patterns require quite different methods than do the simple patterns that have been studied the most.
Representation. Facts about the world have to be represented in some way. Usually languages of mathematical logic are used.
Inference. From some facts, others can be inferred. Mathematical logical deduction is adequate for some purposes, but new methods of non-monotonic inference have been added to logic since the 1970s. The simplest kind of non-monotonic reasoning is default reasoning, in which a conclusion is inferred by default but can be withdrawn if there is evidence to the contrary. For example, when we hear of a bird, we may infer that it can fly, but this conclusion can be reversed when we hear that it is a penguin. It is the possibility that a conclusion may have to be withdrawn that constitutes the non-monotonic character of the reasoning. Ordinary logical reasoning is monotonic in that the set of conclusions that can be drawn from a set of premises is a monotonically increasing function of the premises.
Common sense knowledge and reasoning. This is the area in which AI is farthest from human level, in spite of the fact that it has been an active research area since the 1950s. While there has been considerable progress, e.g. in developing systems of non-monotonic reasoning and theories of action, yet more new ideas are needed. The Cyc system contains a large but spotty collection of common sense facts.
Learning from experience. Programs do that. The approaches to AI based on connectionism and neural nets specialize in this. There is also learning of laws expressed in logic. [Mit97] is a comprehensive undergraduate text on machine learning. Programs can only learn what facts or behaviors their formalisms can represent, and unfortunately learning systems are almost all based on very limited abilities to represent information.
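The bird/penguin default described above can be sketched in a few lines of code. This is only an illustrative sketch of the idea, not a real non-monotonic logic engine; the function and fact names are assumptions made for the example.

```python
# Minimal sketch of default (non-monotonic) reasoning, following the
# bird/penguin example above. Illustrative only; not a real logic system.

def can_fly(facts):
    """Default rule: a bird can fly, unless contrary evidence is present."""
    if "penguin" in facts:       # contrary evidence withdraws the default
        return False
    if "bird" in facts:          # conclusion inferred by default
        return True
    return None                  # nothing can be concluded

# Adding the premise "penguin" removes a conclusion: the set of
# conclusions is not a monotonically increasing function of the premises.
print(can_fly({"bird"}))             # True
print(can_fly({"bird", "penguin"}))  # False
```

Note that adding a premise shrank the conclusion set, which is exactly what ordinary monotonic deduction cannot do.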
Planning. Planning programs start with general facts about the world (especially facts about the effects of actions), facts about the particular situation, and a statement of a goal. From these, they generate a strategy for achieving the goal. In the most common cases, the strategy is just a sequence of actions.
Epistemology. This is a study of the kinds of knowledge that are required for solving problems in the world.
Ontology. Ontology is the study of the kinds of things that exist. In AI, the programs and sentences deal with various kinds of objects, and we study what these kinds are and what their basic properties are. Emphasis on ontology began in the 1990s.
Heuristics. A heuristic is a way of trying to discover something, or an idea embedded in a program. The term is used variously in AI. Heuristic functions are used in some approaches to search to measure how far a node in a search tree seems to be from a goal. Heuristic predicates that compare two nodes in a search tree to see if one is better than the other, i.e. constitutes an advance toward the goal, may be more useful. [My opinion.]
Genetic programming. Genetic programming is a technique for getting programs to solve a task by mating random Lisp programs and selecting the fittest over millions of generations. It is being developed by John Koza's group.
6. Applications of AI
Game playing. You can buy machines that can play master-level chess for a few hundred dollars. There is some AI in them, but they play well against people mainly through brute-force computation, looking at hundreds of thousands of positions. To beat a world champion by brute force and known reliable heuristics requires being able to look at 200 million positions per second.
Speech recognition. In the 1990s, computer speech recognition reached a practical level for limited purposes. Thus United Airlines has replaced its keyboard tree for flight information with a system using speech recognition of flight numbers and city names.
It is quite convenient. On the other hand, while it is possible to instruct some computers using speech, most users have gone back to the keyboard and the mouse as still more convenient.
Understanding natural language. Just getting a sequence of words into a computer is not enough. Parsing sentences is not enough either. The computer has to be provided with an understanding of the domain the text is about, and this is presently possible only for very limited domains.
Computer vision. The world is composed of three-dimensional objects, but the inputs to the human eye and computers' TV cameras are two-dimensional. Some useful programs can work solely in two dimensions, but full computer vision requires partial three-dimensional information that is not just a set of two-dimensional views. At present there are only limited ways of representing three-dimensional information directly, and they are not as good as what humans evidently use.
Expert systems. A "knowledge engineer" interviews experts in a certain domain and tries to embody their knowledge in a computer program for carrying out some task. How well this works depends on whether the intellectual mechanisms required for the task are within the present state of AI. When this turned out not to be so, there were many disappointing results. One of the first expert systems was MYCIN in 1974, which diagnosed bacterial infections of the blood and suggested treatments. It did better than medical students or practicing doctors, provided its limitations were observed. Namely, its ontology included bacteria, symptoms, and treatments, and did not include patients, doctors, hospitals, death, recovery, and events occurring in time. Its interactions depended on a single patient being considered. Since the experts consulted by the knowledge engineers knew about patients, doctors, death, recovery, etc., it is clear that the knowledge engineers forced what the experts told them into a predetermined framework.
In the present state of AI, this has to be true. The usefulness of current expert systems depends on their users having common sense.
Heuristic classification. One of the most feasible kinds of expert system, given the present knowledge of AI, is one that puts some information into one of a fixed set of categories using several sources of information. An example is advising whether to accept a proposed credit card purchase. Information is available about the owner of the credit card, his record of payment, the item he is buying, and the establishment from which he is buying it (e.g., whether there have been previous credit card frauds at this establishment).
7. What is a Neural Network?
A neural network is a software (or hardware) simulation of a biological brain (sometimes called an Artificial Neural Network or "ANN"). The purpose of a neural network is to learn to recognize patterns in your data. Once the neural network has been trained on samples of your data, it can make predictions by detecting similar patterns in future data. Software that learns is truly "artificial intelligence". Neural networks are a branch of the field known as artificial intelligence; other branches include case-based reasoning, expert systems, and genetic algorithms. Related fields include classical statistics, fuzzy logic and chaos theory. A neural network can be considered as a black box that is able to predict an output pattern when it recognizes a given input pattern. The neural network must first be "trained" by having it process a large number of input patterns and showing it what output resulted from each input pattern. Once trained, the neural network is able to recognize similarities when presented with a new input pattern, resulting in a predicted output pattern. Neural networks are able to detect similarities in inputs, even though a particular input may never have been seen previously.
This property allows for excellent interpolation capabilities, especially when the input data is noisy (not exact). Neural networks may be used as a direct substitute for autocorrelation, multivariable regression, linear regression, trigonometric and other regression techniques. When a data stream is analyzed using a neural network, it is possible to detect important predictive patterns that were not previously apparent to a non-expert. Thus the neural network can act as an expert. Artificial Neural Networks (ANNs) are a relatively new approach to computing that involves using an interconnected assembly of simple processing elements loosely based on the animal neuron, a specialized biological cell found only in the animal brain. A generally accepted basic definition of an ANN is a network of many simple processors. These simple processing elements are referred to as units, nodes, or neurons. The units are connected by communication channels referred to as "connections," which carry numeric data between nodes (see Figure 3). Each unit operates only on its local data and on the inputs it receives via the connections. The processing ability of the network as a whole is stored in the inter-unit connection strengths, or weights. These weights are obtained by a process of adaptation to a set of training patterns, similar to the way neural connections in the human brain are strengthened or weakened by some stimulus. Another name for this model is connectionist architecture. This approach differs greatly from the more traditional symbolic or expert system approach to artificial intelligence. Neural nets have the ability to learn and derive meaning from complex, imprecise, or noisy data, thus extracting patterns that would otherwise be imperceptible by other means. A trained neural network can be thought of as an "expert" in the category of information it has been given to analyze, and it can then be given "what if" questions to answer on that information.
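The units-and-weighted-connections picture described above can be sketched as a tiny feed-forward network. This is a minimal sketch, not any particular library's API: the weights are hand-picked for illustration (they happen to compute the XOR function), whereas a real network would obtain its weights by adaptation to training patterns, as described above.

```python
# Tiny feed-forward network of simple units, illustrating the
# connectionist model described above: each unit computes a weighted
# sum of its inputs plus a bias, passed through a sigmoid activation.
# Weights are hand-picked (they implement XOR); a trained network
# would learn such weights from examples.
import math

def unit(inputs, weights, bias):
    """One simple processing element with a sigmoid activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def network(x):
    """Two hidden units feeding one output unit."""
    h1 = unit(x, [10.0, 10.0], -5.0)    # roughly "x1 OR x2"
    h2 = unit(x, [-10.0, -10.0], 15.0)  # roughly "NOT (x1 AND x2)"
    return unit([h1, h2], [10.0, 10.0], -15.0)

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, round(network(x)))  # prints 0, 1, 1, 0 (XOR)
```

Note that no single unit "contains" the XOR function; the behavior lives entirely in the pattern of connection weights, which is the distributed-representation point made above.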
The greatest power of a neural network comes from its ability to generalize from information it has seen to similar patterns of information that it has not seen.
How Do Neural Networks and Expert Systems Differ?
Neural networks differ from both the expert system approach to artificial intelligence and the traditional algorithmic approach to computing. Expert systems use rules and facts to offer solutions to complex problems that would normally require a human expert. These types of rule-based and symbolic solutions have a common thread in that they all address relatively well-defined problems solvable by some procedural method. In other words, rule-based systems perform high-level reasoning tasks. An example of such a system is MYCIN, an expert system for diagnosing and recommending treatment of bacterial infections of the blood, developed by Shortliffe and associates at Stanford University. To create such a system, hundreds or thousands of facts are entered into the expert system. In addition to these facts, hundreds or thousands of rules that operate on those facts are also entered. The facts and the rules that operate on them are essentially kept separate, and any fact can affect any rule and vice versa. The system operates by taking in facts representing a current problem, applying applicable rules to those facts, generating new facts to which further rules are applied, and eventually producing a conclusion to the initial set of facts. Expert systems are very powerful tools in that any number of facts and rules can be entered into the system in any order. Conflicting facts and rules may also be entered, assuming an appropriate conflict resolution scheme exists within the expert system. Theoretically an expert system can solve any high-level reasoning task, provided the rules and facts of the problem have been entered. The drawback to such a system is that the rules and facts must be known ahead of time and must be specified to the system.
For some problems this is impossible. For example, the most extreme high-level artificial intelligence problem is common sense reasoning. For a rule-based system to perform common sense reasoning, every fact and rule even remotely connected to a common sense problem would have to be entered into the system. The possible number of rules, facts, and potential conflicts makes this impossible with current programming tools. In fact, no artificial intelligence technique has been able to solve this problem, although recent research has pointed to a partial solution using hybrid models, a combination of neural nets and expert systems. Neural networks take an entirely different approach to artificial intelligence. They seek to model (on a very rudimentary level) the biological action of the animal brain. Neural networks operate on the idea that the conceptual (high-level) representation of information is not important. A neural network seeks to represent data in a distributed fashion across many simple processing elements, so that no single piece of the network contains any meaningful information; only the network as a whole has the ability to process, store, and produce information and make decisions. Because of this, very little information can be gained by observing the network itself; only the actions of the network are meaningful. Each technique has its strengths and weaknesses, and although they are often thought of as competitors, this is not true. Each is well suited to a type of artificial intelligence problem. Expert systems are very well suited to well-defined problems with facts and rules. Where these rule-based techniques fall short is on low-level perceptual tasks such as vision, speech recognition, complex pattern matching, and signal processing. Rule-based techniques also have difficulty dealing with fuzzy, imprecise, or incomplete data. Data in a rule-based system must be in a precise format.
Noisy or incomplete data may confuse an expert system unless specific steps are taken to account for such variability. This is where neural networks can do what expert systems cannot. Neural networks distribute the representation of data across the whole network of neurons, so no one part of the network is responsible for any one fact or rule. This enables the network to deal with errors in data and allows it to learn complex patterns that no human expert could perceive and quantify in simple rule/fact form. What Are Neurons? The power and flexibility of a neural network follow directly from its connectionist architecture. This architecture begins with simple neuron-like processing elements. A real neuron is a specialized biological cell, found only in the animal brain, that processes information and presumably stores data. As shown in Figure 1, a neuron is composed of a cell body and two types of outreaching tree-like branches, the axon and the dendrites. A neuron receives information from other neurons through its dendrites and transmits information through the axon, which eventually branches into strands and substrands. At the end of these substrands is the synapse, which is the functional unit between two neurons. When an impulse (information) reaches a synapse, chemicals are released that enhance or inhibit the receiver's tendency to emit electrical impulses. Figure 1: Biological Neuron. A synapse's effectiveness can be adjusted by the impulses passing through it, so that synapses can learn from the activities in which they participate. This dependence on a specific sequence of impulses acts as a memory, possibly accounting for human memory, and forms the basis for artificial neural network technology. Dendrites and axons form the inputs and outputs of the neuron, respectively. A neuron does nothing unless the collective influence of all its inputs reaches some threshold level.
When the threshold is reached, the neuron produces a pulse that proceeds from the body to the axon branches. Stimulation at some synapses encourages a neuron to fire, while at others firing is discouraged. An artificial neuron, as conceptually shown in Figure 2, is structured to simulate a real neuron, with inputs (x1, x2,...xn) entering the unit and then being multiplied by corresponding weights (w1, w2,...wn) to indicate the strength of each "synapse." The weighted signals are summed to produce an overall unit activation value. This activation value is compared to a threshold level. If the activation level exceeds the threshold value, the neuron passes on its data. This is the simplest form of the artificial neuron and is known as a perceptron. Figure 2: Artificial Neuron (Perceptron). Another interesting property of biological neurons is the way they also encode information in terms of frequency. Real neurons do not pass on information in simple electrical pulses alone; the rate at which the pulses are emitted also encodes information. This is a major difference between simple perceptrons (artificial neurons) and real neurons. The difference can be partially overcome by allowing the artificial neuron to pass on a partial pulse based on a mathematical function known as an activation function, which allows the artificial neuron to simulate the frequency characteristics of the real neuron's electrical signals. How Is a Neural Network Built, and What Does It Look Like? The single neuron described earlier can be structured to solve very simple problems; however, it will not suffice for any complex problem. The solution to complex problems involves the use of multiple neurons working together; this is known as a neural network. The artificial neuron is a simple element that can be made part of a large collection of neurons in which each neuron's output is the input to the next neuron in line.
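As a concrete illustration, here is a minimal Python sketch of the two neuron variants just described: a hard-threshold perceptron, and a variant whose sigmoid activation function emits a graded signal that loosely mimics the firing-rate encoding of a real neuron. The function names are illustrative, not from any particular library.

```python
import math

def perceptron(inputs, weights, threshold):
    """Hard-threshold perceptron: fires (1) only when the weighted
    sum of its inputs exceeds the threshold; otherwise stays silent (0)."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation > threshold else 0

def graded_neuron(inputs, weights):
    """Variant with a sigmoid activation function: instead of an
    all-or-nothing pulse it emits a partial signal between 0 and 1,
    loosely simulating a real neuron's firing rate."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-activation))
```

With weights of 1 and a threshold of 1.5, the perceptron fires only when both inputs are active, while the graded neuron returns a value strictly between 0 and 1 for any input.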
These collections of neurons usually form layers, as shown in Figure 3. Although this multi-layer structure can take on virtually any shape, the most common structure is called a feedforward network and is pictured in Figure 3. The term feedforward comes from the pattern of information flow through the network. Figure 3: Example Multi-layer Perceptron. Data is transferred to the bottom layer, called the input layer, where it is distributed forward to the next layer. This second layer, called a hidden layer, collects the information from the input layer, transforms the data according to some activation function, and passes the data forward to the next layer. The third layer, called the output layer, collects the information from the hidden layer, transforms the data a final time, and then outputs the results. The 3-layer structure shown in Figure 3 is a standard feedforward network, although many variations of this network exist. For example, feedforward networks may have two or more hidden layers, although the basic idea of any feedforward network is that information passes from bottom to top only. Feedforward networks may have any number of neurons per layer, although it is very common for networks to have a pyramid shape, in that the input layer is generally larger than the hidden layer, which is larger than the output layer. How Does a Neuron Work? Artificial neural networks are built up from the simple idea of the perceptron, or artificial neuron. To understand the network it is necessary to understand the neuron. One neuron is able to solve very simple problems, for example a simple logic problem known as the logical AND. The logical AND problem assumes two premises and says that something is true if and only if both of the premises are true. For example, if it is raining AND I go outside, then I will get wet. The two premises are 1) it is raining, 2) I go outside. For "I get wet" to be true, both premises must be true first.
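The three-layer feedforward pass described above (input layer, hidden layer, output layer) can be sketched in a few lines of Python. The `forward` helper and its weight layout are illustrative assumptions, not a standard API: each row of a weight list holds the weights feeding one neuron of the next layer.

```python
def forward(inputs, hidden_weights, output_weights, activation):
    """One forward pass through a 3-layer feedforward network.
    hidden_weights[j] holds the weights from every input to hidden
    neuron j; output_weights[k] likewise for output neuron k."""
    hidden = [activation(sum(x * w for x, w in zip(inputs, ws)))
              for ws in hidden_weights]
    return [activation(sum(h * w for h, w in zip(hidden, ws)))
            for ws in output_weights]
```

A 2-2-1 "pyramid" network with an identity activation, for example, simply forms weighted sums layer by layer; swapping in a sigmoid activation gives the graded behavior discussed earlier.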
If either one is true but the other is not, or if they are both false, I will not get wet. This type of problem can be directly applied to a single neuron, and a single neuron can classify all the possible cases in the problem. Table 1 shows all the possible cases in this problem. Notice there is only one possible way for the conclusion "I get wet" to be true, which is listed first in the table.

Premise 1          Premise 2            Conclusion
It Is Raining      I Go Outside         I Get Wet
It Is Not Raining  I Go Outside         I Do Not Get Wet
It Is Raining      I Do Not Go Outside  I Do Not Get Wet
It Is Not Raining  I Do Not Go Outside  I Do Not Get Wet

Table 1: Logical AND Problem (1)

Now consider a single neuron structured as shown in Figure 4. Assume that if a premise is true it is equal to the number 1, and if it is false it is equal to 0. Also assume that the neuron sends out a "signal" of 1 if the answer is "get wet" and a 0 if the answer is "do not get wet." We can set a simple, arbitrary activation rule that says if the neuron receives a combined signal higher than 1.5 it will send out a 1 (get wet); otherwise it will send out a 0 (do not get wet). This is all that is needed to solve this logic problem. Figure 4: Simple Neuron Problem. If "it is raining" and "I go outside" are both true, the neuron in Figure 4 will receive 1 + 1 = 2, which is greater than 1.5, and it will send out a signal of 1, which means "I get wet." For all other cases the neuron will receive a total of only 1 or 0, it will send out a 0, and "I will not get wet" will be the conclusion. In this way a conceptual problem such as "will I get wet?" has been transformed into a mathematical problem. This type of conversion from a conceptual problem that humans understand into a numerical problem computers understand is termed encoding. Much of artificial intelligence is concerned with encoding and data representation. Table 1 can now be shown numerically as Table 2.
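The AND neuron of Figure 4 translates almost directly into code. A minimal Python sketch, assuming each input carries a weight of 1 and using the 1.5 threshold from the text (the function name is illustrative):

```python
def and_neuron(raining, go_outside):
    """Single neuron encoding the logical AND: each 0/1 premise enters
    with weight 1, so the 1.5 threshold is exceeded only when both
    premises are true."""
    activation = 1 * raining + 1 * go_outside
    return 1 if activation > 1.5 else 0
```

Enumerating all four input combinations reproduces the truth table: only (1, 1) produces an output of 1.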
Premise 1  Premise 2  Conclusion
1          1          1
0          1          0
1          0          0
0          0          0

Table 2: Logical AND Problem (2)

By adjusting the strength of the inputs (the weighting on the connections) and the way the collective influence of the inputs is used (the activation function), any simple problem such as the one described can be encoded and solved. For a more complex problem a single neuron will not suffice. More complex problems require several neurons working together as a neural network. Neural networks operate similarly to the single neuron except that they combine their outputs to handle complex problems. How Do Neural Networks Learn? Neural networks are known for their ability to learn a task when given the proper stimulus. Usually neural networks learn through a process called supervised learning. This learning requires sets of data where a set of inputs and outputs is known ahead of time. For example, suppose a neural network is to be taught to recognize handwritten characters. Several examples of each letter, written by different people, could be given to the network. As the teacher, the neural programmer has several examples of all characters in the alphabet (inputs) and knows to which category ('A', 'B', 'C', etc.) each character belongs (output). Inputs (characters in this case) are then given to the network. The network will produce some kind of output (probably wrong; e.g., it will say an 'A' is an 'O'). Initially, the network's responses are totally random and most likely incorrect. When the neural network produces an incorrect decision, the connections in the network are weakened so it will not produce that answer again. Similarly, when the network produces a correct decision, the connections in the network are strengthened so it will become more likely to produce that answer again. Through many iterations of this process, giving the network hundreds or thousands of examples, the network will eventually learn to classify all the characters it has seen.
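The strengthen/weaken cycle described above can be sketched with the classic perceptron learning rule. This minimal Python illustration teaches a single neuron the logical AND of Table 2 rather than handwritten characters; the training scheme shown (error-proportional weight updates) is one simple form of supervised learning, and the names are illustrative.

```python
def train_perceptron(examples, epochs=10, lr=1):
    """Supervised learning for one neuron: after each example the
    connections are strengthened or weakened in proportion to the
    error between the target output and the actual output."""
    weights, threshold = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in examples:
            output = 1 if sum(x * w for x, w in zip(inputs, weights)) > threshold else 0
            error = target - output              # 0 when the answer was right
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            threshold -= lr * error              # make firing easier or harder
    return weights, threshold

# Teaching the logical AND of Table 2:
examples = [([1, 1], 1), ([0, 1], 0), ([1, 0], 0), ([0, 0], 0)]
weights, threshold = train_perceptron(examples)
```

After a handful of passes over the four examples the weights and threshold settle on values that classify every row of the table correctly.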
This process is called supervised learning since the programmer guides the network's learning through the type and quality of data given to the network. For this reason neural networks are said to be data-driven, and it is critical that the data given to the network be very carefully selected to represent the information the network is to learn. The real power of a neural network is not what it can learn but rather what it can do with that information. A trained neural network will not only be able to identify and classify data it has seen but will generalize to similar data it has not seen. In the handwriting example, if the network is given examples of characters from 20 different people, it will then be able to correctly identify characters written by almost any person, whether it has seen those particular instances of characters or not. This represents a major difference between artificial intelligence programming and conventional programming. In conventional programming, each and every character and every variation on every character would have to be programmed into a computer before that computer could identify all characters written by all people. Artificial intelligence techniques such as neural networks work by generalizing from specific patterns to general patterns. This is similar to human problem solving in that we very often reason from the specific to the general (inductive reasoning). In this way neural networks can learn to classify groups of data, match patterns in data, and approximate any mathematical function. What Can Neural Networks Learn? Theoretically neural networks can learn any computable function, whether or not that function can be identified by the programmer or a mathematician. Neural networks are especially useful for classification and function approximation problems which are tolerant of some imprecision, which have lots of data available, but to which hard-and-fast rules cannot easily be applied.
Neural networks work by finding a best match between inputs (premises) and outputs (conclusions) based on what they have seen in the past. In this way, neural networks do not give a perfect solution; they give a "best" solution given the information at hand. Neural networks, like all artificial intelligence techniques, are based on the assumption that a slightly less than perfect solution that is acceptable is better than a perfect solution that may be practically impossible to find and implement. For example, in the handwriting example given earlier, the neural network will learn to identify most characters written by most people (greater than 99% accuracy) but it will fail a small percentage of the time. This inaccuracy is considered acceptable because the alternative is to use conventional programming and create a database of every possible character ever written, or that ever will be written, by every human. Just creating, let alone using, such a database is practically impossible. By giving up a small measure of accuracy, an artificial intelligence technique such as a neural network can be implemented in a matter of hours by a single programmer using a small fraction of the total information to be learned. Artificial intelligence is inspired by biological intelligence in that it is considered more important to have a fast, general, and very robust solution than to have a perfect but time-consuming solution. Neural networks are basically function approximators and pattern matchers, and in general all neural networks perform only these functions. Neural networks may however be applied to a variety of problems that can make use of their pattern matching and approximation ability. Neural networks are not only used to classify and match data directly but also for vision and speech recognition, prediction and forecasting, data mining and extraction, and process control and optimization.
Each of these tasks is accomplished through the creative use of the neural network's pattern matching ability. Once trained, a neural network can be inserted as the heart of any decision making system where the patterns of inputs and outputs are fed to a problem-specific application that can interpret and process that data. For example, a neural network on its own has no process control logic or process control ability; however, a neural network can be given process control data for a particular system and it can learn the workings of that control system. Once that system has been learned, the neural network can be inserted as the decision making part of the control system. The neural network will not only replicate the control system rules it has seen but will also be able to generalize to unknown conditions. This means that when the system as a whole is presented with new and unseen situations, the neural network will extrapolate from known conditions to unknown conditions and provide a "best match" decision based on this new information. 7.Neural Networks and Machine Learning Types of Neural Networks The type and variety of artificial neural networks is virtually limitless, although neural networks are classified according to two factors: the topology (shape) of the network and the learning method used to train the network. For example, the most widely used topology is the feedforward network and the most common learning method is the backpropagation of errors. Backpropagation is a form of supervised learning in which a network is given input and then the network's actual output is compared to the correct output. The network's connections are adjusted to minimize the error between the actual and the correct output. Feedforward networks that use backpropagation learning are so common that these networks are often referred to as "backpropagation networks," although this terminology is not strictly correct.
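A single backpropagation update can be sketched as follows. This minimal Python example uses a tiny 2-2-1 sigmoid network with hand-picked starting weights (all values are illustrative, not from any real application) and shows that one update moves the weights downhill, reducing the squared error on a training example.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, w_hidden, w_out):
    """Forward pass: inputs -> sigmoid hidden layer -> sigmoid output."""
    h = [sigmoid(sum(xi * wi for xi, wi in zip(x, ws))) for ws in w_hidden]
    y = sigmoid(sum(hi * wi for hi, wi in zip(h, w_out)))
    return h, y

def backprop_step(x, target, w_hidden, w_out, lr=0.5):
    """One backpropagation update: the output error is propagated back
    through the network and every weight is adjusted to shrink it."""
    h, y = forward(x, w_hidden, w_out)
    delta_out = (y - target) * y * (1 - y)          # output-layer error signal
    delta_hidden = [delta_out * wo * hj * (1 - hj)  # error fed back per hidden unit
                    for wo, hj in zip(w_out, h)]
    w_out = [wo - lr * delta_out * hj for wo, hj in zip(w_out, h)]
    w_hidden = [[w - lr * dj * xi for w, xi in zip(ws, x)]
                for ws, dj in zip(w_hidden, delta_hidden)]
    return w_hidden, w_out

# One update on a single example reduces the squared output error:
x, target = [1.0, 0.0], 1.0
w_hidden, w_out = [[0.1, 0.2], [0.3, 0.4]], [0.5, -0.5]
error_before = (forward(x, w_hidden, w_out)[1] - target) ** 2
w_hidden, w_out = backprop_step(x, target, w_hidden, w_out)
error_after = (forward(x, w_hidden, w_out)[1] - target) ** 2
```

Repeating this step over many examples and many epochs is what trains a full backpropagation network; real implementations vary the learning rate, add bias weights, and batch the updates.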
"Multi-layer feedforward" refers to the topology and pattern of information flow in the network. "Backpropagation" refers to a specific type of learning algorithm in which errors in the output layer are fed back through the network. It is possible to use a feedforward architecture without backpropagation, or to use backpropagation with another type of architecture. In any case, it has become commonly accepted to call this combination of topology and learning method simply a backpropagation network. Another common network structure is the recurrent or feedback network. Recurrent networks are similar (usually) in shape to the feedforward network although data may pass backwards through the net or between nodes in the same layer (see Figure 5). Networks of this type operate by allowing neighboring neurons to adjust other nearby neurons either in a positive or negative direction. Figure 5: Recurrent/Feedback Network This allows the network to reorganize the strength of its connections by not only comparing actual output against correct output but also by the interaction of neighboring neurons. Recurrent networks are generally slower to train and to implement than feedforward networks although they present several interesting possibilities including the idea of unsupervised learning. In unsupervised learning the network is only given input with no output and neurons are allowed to compete or cooperate to extract meaningful information from the data. This is especially useful when trying to analyze data searching for some pattern but no specific pattern is known to exist ahead of time. A third network structure, also based on the feedforward architecture, is the functional link network. This type of network, as shown in Figure 6, duplicates the input signal with some type of transformation on the input. For example consider a network that is designed to input a series of past stock prices and the output is a predicted future stock price. 
This network may have as input four past stock prices (last month, last week, and the past two days). The output may be a single value such as tomorrow's stock price. In a functional link network, additional inputs are also given to the network, which are some form of the original inputs. These additional inputs may be various products of the original inputs, or they may be high and low values from the whole input set, or they may be virtually any combination of mathematical functions that is deemed to contain value for this set of inputs. In Figure 6 a functional link network is shown with four actual inputs and two additional functional links, which in this example are products of the first two and second two inputs. In this network the functional link is directly connected to the output layer, although the functional link may instead be directed toward the hidden layer. The idea behind this type of network is to give the network as much information as possible about the original input set by also giving it variations of that input set. Figure 6: Functional Link Network. Machine Learning In discussions of neural networks, and of artificial intelligence in general, the topic of learning is a central theme. True human-like learning is beyond all artificial intelligence techniques, although some learning techniques have been developed that allow machines to mimic human intelligence. These techniques that allow computers to acquire information with some degree of autonomy are collectively known as machine learning. Machine learning is the artificial intelligence field of study that attempts to mimic or actually duplicate human learning. There are many artificial intelligence techniques that do not employ any type of learning, such as search and planning strategies. These types of artificial intelligence methods rely on sophisticated search methods that can examine massive amounts of data and very quickly pick out important information without searching the entire set of data.
These strategies do not learn, but they do mimic a human's ability to quickly investigate different paths and select the one that seems most productive. These kinds of techniques are fairly static in that, as long as the information they are given does not change, they will always behave exactly the same. Theoretically neural networks fall into the category of machine learning. Neural networks are specifically designed to program themselves based on information they are given. A neural programmer's job is to set up the structure and learning ability of the network and then provide the network with good information. If the network is designed correctly and the information input to the network is of acceptable quantity and quality, the network will adapt to understand that information. In a sense neural networks exhibit the ability to learn in a fashion similar to animal learning: they have a given structure (topology and learning method), they are presented with stimulus (inputs), and they adapt to that stimulus. Most practically implemented neural networks do not continue learning once they have been trained and placed in service. Neural networks are usually designed to be taught once, and then the network is put to use. While in service they remain fixed and do not adapt to changing conditions. So in this sense they are not truly "intelligent." There are some examples of on-line adaptive networks that "learn as they go" and continually adapt to changing conditions. Most on-line learning neural networks are experimental, although a few practical networks have been constructed and put into service. These networks continually retrain in small increments to adapt to changing conditions. This is an exciting area in neural network technology, since a network that can reliably learn in an on-line fashion can be put into service for a virtually indefinite period of time and will continue to acquire information and adapt to its environment.
The most sophisticated of these on-line networks can also adjust the number of nodes in their hidden layer(s), although at the moment this is still largely experimental. In practice, expert systems are not generally considered to be in the category of machine learning, since they are built with a certain set of facts and rules, put into service, and generally do not adapt on their own. By this definition, however, most neural networks must also be excluded from machine learning, since neural network training can be considered analogous to entering rules and facts into an expert system; both systems are then simply put into service, where they usually remain static. Expert systems can, however, be updated at any time by entering new facts and rules, which again is analogous to a neural network that observes new conditions and is allowed to update itself to those conditions while in service. Since both require a human to carefully prepare new information and either enter that information or explicitly allow the system to acquire it, they may both be considered in the category of machine learning. This is especially true for the few theoretical and experimental expert systems that have the ability to create new facts and rules autonomously by combining the already given facts and rules. Some experimental and "toy" expert systems have been designed with the ability to enter information into themselves from what they observe during operation and from the interaction of the current set of rules and facts. These systems are not considered completely practical at this time, but there is no reason to believe they will not eventually be brought into practical service. Unfortunately, neural networks and expert systems are like all artificial intelligence techniques in that they can only solve problems for which they were designed; they have no ability to change problem domains, cross-reference learning, or restructure themselves for a new problem.
For example, a neural network that has been designed and trained to drive a car (there are several examples of this) cannot learn to do character recognition. If it is restructured for another task, it will no longer be able to perform the original task. In addition, a neural network that learns one task such as driving a car will have no ability to drive a motorcycle, a similar but different task. Neural networks, like all current artificial intelligence techniques, are highly task specific (narrow domain). The ability to combine learning from different domains and acquire truly new information from that combination is beyond all machine learning techniques. In narrow domains with relatively stable conditions, however, there are many neural network and machine learning solutions that perform extremely well and can learn. 8.Types of Neural Network Learning There are generally three different ways to approach neural network learning: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning requires the programmer to give the network examples of inputs and the correct output for each given input. In this way the network can compare what it has output against what it should output, and it can correct itself. Figure 7 shows the backpropagation method. Backpropagation is the most widely used method for neural network training because it is the easiest to implement and to understand, and it works reasonably well for most problems. Unsupervised learning provides input but no correct output. A network using this type of learning is only given inputs, and the network must organize its connections and outputs without direct feedback. There are several ways in which this type of learning is accomplished; one is Hebbian learning and another is competitive learning. Hebbian learning states that if the neurons on both sides of a synapse are selectively and repeatedly stimulated, the strength of the synapse is increased.
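The Hebbian rule can be sketched as a one-line weight update in Python; the function signature and learning rate are illustrative assumptions.

```python
def hebbian_update(weights, inputs, outputs, lr=0.1):
    """Hebbian rule: the connection between two neurons is strengthened
    whenever both are active at the same time (delta_w = lr * x * y).
    weights[j] holds the connections from every input to output neuron j."""
    return [[w + lr * x * y for w, x in zip(ws, inputs)]
            for ws, y in zip(weights, outputs)]
```

Note that only connections whose input and output are both active change; a silent neuron on either side leaves the synapse untouched.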
This type of learning is well suited to data extraction and analysis in which a pattern is known to exist in some data but the type and location of the pattern are unknown. Competitive learning uses a "winner take all" strategy in which output neurons compete to decide which is the strongest and should remain active, while all others must remain passive for a given input. This type of learning is used most often for categorization, where categories of data are thought to exist within a set of data but the exact nature of the categories is unknown. Unsupervised learning is not yet as completely understood or as practically implemented as supervised learning, but its possibilities are very promising. Reinforcement learning is a method halfway between supervised and unsupervised learning, but it is usually considered a subtype of supervised learning. In reinforcement learning a network is given input, and although no specific target output is provided (as in supervised learning), the network is "punished" when it does poorly and "rewarded" when it does well. Punishment and reward in this sense take the form of weakening or strengthening the connections between neurons. This means that during the learning phase of a network's life there are three possibilities for the adaptation of the neurons in the network. Connections may be selectively strengthened, selectively weakened, or left unchanged, depending on how the network performs. In this type of learning the network is given input and the output is observed. Then output neurons are categorized as being either right, wrong, or neutral. Output neurons that are judged incorrect, and all neurons that provided input to those neurons, have their connections weakened. Similarly, output neurons that are judged correct, and all neurons that provided input to those neurons, have their connections strengthened. Output neurons that are neither right nor wrong are left unchanged.
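The competitive (winner-take-all) update and the three-way reinforcement update described above can be sketched as follows; the learning rates, function names, and +1/-1/0 judgement encoding are illustrative assumptions.

```python
def competitive_update(weights, x, lr=0.5):
    """Winner-take-all: the output neuron whose weight vector best
    matches the input stays active and is pulled toward the input;
    all other neurons remain passive and unchanged."""
    scores = [sum(w * xi for w, xi in zip(ws, x)) for ws in weights]
    winner = scores.index(max(scores))
    weights[winner] = [w + lr * (xi - w) for w, xi in zip(weights[winner], x)]
    return winner, weights

def reinforcement_update(weights, judgements, lr=0.5):
    """Three-way reinforcement step: connections into output neurons
    judged correct (+1) are strengthened, those judged wrong (-1) are
    weakened, and neutral neurons (0) are left unchanged."""
    return [[w * (1 + lr * j) for w in ws]
            for ws, j in zip(weights, judgements)]
```

In the competitive case only the winning neuron's connections move; in the reinforcement case each output neuron's connections scale up, scale down, or stay fixed according to its judgement.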
With this learning method no specific output is targeted. The network does not know what it should do, only that when it does something it is either right, wrong, or neutral. In this way the network is allowed to find information in data without being told what that information is, while at the same time being guided toward a solution. This type of learning has been successfully applied to search problems in which a path to some goal must be identified but the exact path is not known ahead of time. Theoretically these types of networks are good candidates for on-line learning in a variable environment. Networks employing reinforcement learning can be placed into environments where decisions must be made and the outcome of those decisions can be judged, but the exact decisions that need to be made are unknown. Figure 7: Backpropagation Learning. 9.Neural Network Uses and Applications Neural networks essentially are function approximators, pattern matchers, and categorizers. They do very little outside of these basic functions, although these tasks can be employed in a wide variety of powerful and complex applications. The following represents some common and practically implemented solutions using neural networks. Speech Recognition Neural networks have been used very successfully in speech recognition tasks. Verbal speech is encoded mathematically and input to the network, and the network responds with an action. Using a neural network for this purpose allows a person or multiple people to speak with different tones and voices while the verbal command is still understood by the network despite variations in tone, pitch, quality, etc. Character Recognition Character recognition is accomplished by presenting the network with many examples of handwritten characters and allowing the network to learn those characters. Once trained, networks used for this task are remarkably accurate not only on the characters they have seen but also on characters they have never seen before.
Image Processing Neural networks have been constructed that process image data such as a photograph or x-ray image. In the case of photographs, neural networks have been trained to pick out details in the photograph and identify portions of the image as being specific objects. With x-ray images, neural networks have been used to construct composite and 3-D images from several flat x-ray images taken from different angles of the same bone structure. Pattern Recognition & Categorization Obviously pattern recognition and categorization are the most straightforward uses for a neural network. Neural networks can take virtually any set of data that contains one or more patterns or categories and then extract those patterns. This is extremely useful for any application that must sort data by category or make decisions based on some pattern of information. Signal Processing Signal processing is closely related to pattern recognition, and neural networks have been used very successfully to reduce noise in corrupt electrical signals and to separate various signals from transmissions that contain multiple signals. Signal processing neural networks have been used in a wide variety of problems. Two examples of this use include noise reduction in phone lines and detecting misfires in engines that can run as high as 10,000 RPM. Process Control One of the newest and most important neural net uses is in process control and optimization. A neural network can be trained by allowing it to observe some system, such as a piece of machinery, and then it can take over control of that system. Not only will the neural net control the system in normal operation, but it will also control that system during unforeseen occurrences. Neural networks have been put to this use in tests at NASA's Dryden Flight Research Center in Edwards, California using a modified F-15 aircraft. In this application a neural network was allowed to study normal flight operations.
The neural network learned how a correctly flying aircraft should behave. Then, if the aircraft suffers some type of damage, the flight control system enables the neural net and allows the network to correct mismatches between data on the plane's airspeed, bearing, and the forces on its body versus what the network thinks the data should be if the plane were flying normally. In this way the pilot can continue to fly a damaged aircraft by controlling the plane as if it were undamaged. The neural network does the job of transforming the pilot's actions from normal operation to the operations necessary given that the plane is damaged in some way. The network was tested in high performance maneuvers, such as tracking a target or performing a 360 degree roll. The neural net managed to keep disabled planes under control even at supersonic speeds. Process Optimization Process optimization is similar to process control in that a neural net is trained by allowing it to observe some type of system in operation. In the case of process control the inputs are the system state and the outputs are the control positions that affect the system. In process optimization the inputs and outputs are similar, but additional inputs and/or outputs are also specified to represent some target state for the system. For example, consider a vehicle that takes in fuel and air and produces some speed. In operating this vehicle there are several factors which may be important at any given moment, like speed, fuel consumption, wear on the vehicle, safety, etc. Targeting one or more of these factors as most important requires a careful balance of fuel, air intake, mechanical settings, etc. (e.g., if it is decided the vehicle must run at minimal fuel usage, it probably cannot operate at maximum speed). Neural networks are used to balance system settings so that one or more system factors can be maximized, minimized, or stabilized.
In the vehicle example a neural net could be set up to minimize fuel consumption by carefully adjusting air intake, speed, and other mechanical settings that affect fuel consumption. Process optimization represents one of the most challenging neural network application areas. Expert systems have also been successfully applied to both process control and optimization; they are the older and more traditional way of applying an artificial intelligence solution in this domain. Expert systems, however, have a few drawbacks: they still require a human expert to provide input to the system, they must be tailor-made for each system, and they do not deal well with unseen or imprecise data. Neural networks have the advantage that they program themselves, provided of course they are given the proper input. Neural networks are also very robust and deal well with unseen data. An expert system faced with unknown or corrupt facts will not do anything, whereas a neural network faced with unseen or corrupt data will respond with a "best guess" answer. Provided the new data does not stray too far from the original conditions shown to the network, neural networks perform very well and can extrapolate from the new information to a reasonable solution.
10. How to Determine If an Application Is a Candidate for a Neural Network: There are several requirements and conditions a problem must meet if it is to be an acceptable candidate for a neural network solution. First and foremost, the problem must be tolerant of some level of imprecision. All artificial intelligence techniques sacrifice some small measure of precision in favor of speed and tractability. This imprecision may be very small, much less than one percent, or it may be relatively large, such as ten percent. Neural network error rates tend to be below two percent; for certain applications error rates can go as low as a very small fraction of one percent.
Any application that has zero tolerance for imprecision cannot be solved with any artificial intelligence technique, including neural networks. For example, digital data transmission algorithms must be perfectly precise. If even the tiniest portion of a digital data transmission (e.g., sending a file over a network from computer to computer) is corrupted, the entire transmission may be ruined. Conversely, something like an analog voice or video transmission is very tolerant of error. If a fraction of a second of a video or audio transmission is lost or damaged, it may never be noticed by the observer. There are many such examples of processes that can tolerate some small measure of error with no appreciable impact on the problem. Another requirement for a neural network solution is that abundant high-quality data exists for both training and testing purposes. A neural network must be able to observe the problem at hand, and it must be tested on that problem once it is trained but before it is put into service. This may require massive amounts of training and test data, depending on the complexity of the problem. Related to the error-tolerance requirement, neural networks (like all artificial intelligence methods) work best when there exist one or more acceptable solutions to a problem that are not necessarily the best solution. There are many problems for which finding an acceptable solution is easy but finding the perfect solution requires a practically impossible amount of resources. For example, there may be a time-dependent problem in which "fast enough" is just as good as "fastest." At the heart of any problem for which a neural network is the solution there must be a pattern-matching or categorization problem. This is not a difficult requirement to meet, since pattern matching and categorization are inherent to a wide variety of problems.
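The train-then-test discipline described above can be sketched with a toy perceptron on synthetic, linearly separable data: part of the data is held out, and the network's error rate is measured on that unseen portion before it would be "put into service." The data, the model, and the thresholds here are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic two-class data with a margin around the true boundary.
X = rng.uniform(-1, 1, size=(1000, 2))
X = X[np.abs(X[:, 0] + X[:, 1]) > 0.1]       # drop ambiguous borderline points
y = (X[:, 0] + X[:, 1] > 0).astype(int)      # the pattern to be learned

# Hold out data for testing *before* the model is put into service.
X_train, y_train = X[:600], y[:600]
X_test, y_test = X[600:], y[600:]

w = np.zeros(2); b = 0.0
for _ in range(50):                          # perceptron training epochs
    for xi, yi in zip(X_train, y_train):
        pred = int(w @ xi + b > 0)
        w += (yi - pred) * xi                # update weights only on mistakes
        b += (yi - pred)

test_err = np.mean((X_test @ w + b > 0).astype(int) != y_test)
print(f"held-out error rate: {test_err:.1%}")
```

Measuring the error rate on held-out data is what tells you whether the problem's tolerance for imprecision is actually being met.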
Most of what humans do that is considered "intelligent" is really the ability to quickly categorize and match what we see against what we know, and to make a decision based on that match. If a neural network is used in a process optimization or control application, economics plays an important part in the neural network's usage. Neural networks used in this area tend to be of marginal benefit; in other words, they provide benefits at the edges of existing performance. In any system where small increases in performance and efficiency translate to large changes in economic gain, a neural network will prove very useful. This is especially true in systems where the small gain in performance is very difficult to achieve but, once achieved, provides large benefits.
An Example of a Neural Network: Imagine a highly experienced bank manager who must decide which customers will qualify for a loan. His decision is based on a completed application form that contains ten questions. Each question is answered by a number from 1 to 5 (some responses may be subjective in nature). Early attempts at "artificial intelligence" took a simplistic view of this problem. The knowledge engineer would interview the bank manager(s) and decide that question one is worth 30 points, question two is worth 10 points, question three is worth 15 points, and so on. Simple arithmetic was used to determine the applicant's total rating, and a hurdle value was set for successful applicants. This approach helped to give artificial intelligence a bad name. The problem is that most real-life problems are non-linear in nature. Response #2 may be meaningless if both responses #8 and #9 are high. Response #5 should be the sole criterion if both #7 and #8 are low. Our ten-question application has almost 10 million possible responses. The bank manager's brain contains a neural network that allows him to use "intuition."
Intuition allows the bank manager to recognize certain similarities and patterns that his brain has become attuned to. He may never have seen this exact pattern before, but his intuition can detect similarities, as well as deal with the non-linearities. He is probably unable (and unwilling) to explain the very complex process by which his intuition works. A complicated list of rules (called an "expert system") could be drawn up, but these rules may give only a rough approximation of his intuition. If we had a large number of loan applications as input, along with the manager's decision as output, a neural network could be "trained" on these patterns. The inner workings of the neural network have enough mathematical sophistication to reasonably simulate the expert's intuition.
Another Example: A Real Estate Appraiser: Consider a real estate appraiser whose job is to predict the sale price of residential houses. As with the bank loans example, the input pattern consists of a group of numbers (for example: number of bedrooms, number of stories, floor area, age of construction, neighborhood prices, size of lot, distance to schools, etc.). This problem is similar to the bank loans example because it has many non-linearities and is subject to millions of possible input patterns. The difference here is that the output prediction consists of a calculated value: the selling price of the house. It is possible to train the neural network to simulate the opinion of an expert appraiser, or to predict the actual selling price. Note: the above examples use a hypothetical bank manager and real-estate appraiser. Similar examples could use a doctor, judge, scientist, detective, IRS agent, social worker, machine operator or other expert. Even the behavior of some non-human physical process could be modeled. NeuNet Pro includes several sample projects.
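The non-linear interactions in the bank-loan example can be made concrete with a toy comparison. A fixed weighted sum cannot express rules such as "response #2 is meaningless if #8 and #9 are both high" or "only #5 counts if #7 and #8 are low." All weights, answers, and rules below are hypothetical illustrations, not data from NeuNet Pro or the text.

```python
def linear_score(answers, weights):
    """Early 'AI' approach: a fixed weighted sum of the ten responses."""
    return sum(w * a for w, a in zip(weights, answers))

def manager_rule(answers):
    """Sketch of non-linear interactions the manager's intuition applies."""
    a = dict(enumerate(answers, start=1))   # 1-based question numbers
    if a[8] >= 4 and a[9] >= 4:             # interaction: #2 becomes meaningless
        a[2] = 0
    if a[7] <= 2 and a[8] <= 2:             # interaction: only #5 counts
        return a[5] * 20
    return sum(a.values())

weights = [30, 10, 15, 5, 5, 5, 10, 10, 5, 5]   # hypothetical point values
applicant = [3, 5, 2, 3, 1, 2, 1, 1, 3, 4]      # answers to questions 1..10

print("linear score:", linear_score(applicant, weights))   # ignores interactions
print("rule-based score:", manager_rule(applicant))        # interactions dominate
```

For this applicant the linear model produces a high score, while the interaction rules collapse the rating to a low one; a trained network can learn such interactions from examples without anyone writing the rules down.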
11. Introduction to Expert Systems: Expert systems are one of the first and most practical applications derived from research on artificial intelligence. Artificial intelligence (AI) is the area of computer science in which scientists strive to build machines that "think" and "reason" in a fashion similar to humans. An expert system is software, based on certain concepts of AI, that acts as a consultant or expert in a specific field or discipline to help solve a problem or help make a decision. Expert systems are also referred to as knowledge-based systems. Expert systems attempt to supply both the knowledge and the reasoning of human beings. They are "expert" in only one field, topic or discipline, and they can help solve only a narrowly defined problem. The user provides data about a problem through a keyboard, and the computer responds with an answer and explanation based on facts and rules that have earlier been extracted from human experts and stored in the computer. An expert system cannot entirely duplicate a human expert's judgment or make the final decision, but it can offer opinions, suggest possible diagnoses and propose various solutions to a problem. These programs are usually used as a supplemental source of advice. Because of their usefulness, expert systems were one of the first results of AI research to become a viable commercial product. Oil companies use expert systems to analyze geological data, while physicians use them to help diagnose and treat illness. People in other diagnostic fields, professional assistance and emergency management also take advantage of expert systems. Until recently, most expert systems were designed for use only on large computers because the programming demanded so much power and memory. Now many expert systems can be used with microcomputers; however, these programs are still very expensive.
Early work was done using PROLOG (Programming in Logic), LISP (List Processing), and other specialized programming languages. Now the trend is toward designing these systems for microcomputers and with popular programming languages such as FORTRAN and C.
Conventional Programs versus Expert Systems: An expert system differs from a traditional program used to solve a problem (application software). In traditional software there is the program and the data that the program is given to work on. In expert systems, however, the program is called the inference engine, and the database has been replaced with a knowledge base. Traditional computer programs are composed of a detailed set of sequentially organized instructions that comprise an algorithm for the processing steps; the computer can do nothing but strictly follow the sequence of instructions. Using heuristic programming, an expert system can apply instructions in any order and can react differently to each situation it encounters. The exact processing activities are determined by the data entered during a "consultation," not by the sequence of processing statements. Heuristic programming is a key feature of expert systems. The main difference between expert systems and traditional programs is the inclusion of heuristics, the rules of thumb about the problem. Heuristic programming is an attempt to emulate human intuition, judgment and common sense. This type of program allows the computer to recall earlier results and include them in its processing. That newly gained knowledge is then added to its knowledge base and becomes a basis for the next problem and its solution. Here the computer has learned from its own experience and mistakes, so when it encounters a new problem, it will recall earlier results and consider them.
Differences between Expert Systems and Conventional Programming:
Use of an inference engine rather than a program.
Use of a knowledge base rather than a database.
Processing determined by the data entered rather than by the sequence of program instructions.
An expert system is a knowledge-intensive program that solves a problem by capturing the expertise of a human in a limited domain of knowledge and experience. An expert system can assist decision making by asking relevant questions and explaining the reasons for adopting certain actions. Some of the common characteristics of expert systems are the following: they perform some of the problem-solving work of humans; they represent knowledge in forms such as rules or frames; they interact with humans; and they can consider multiple hypotheses simultaneously. Today's expert systems are quite narrow, shallow, and brittle. They lack the breadth of knowledge and the understanding of fundamental principles of a human expert. Expert systems today do not "think" as a human being does. A human being perceives significance, works with abstract models of causality, and can jump to conclusions. Expert systems do not reason from first principles, do not draw analogies, and lack common sense. Above all, an expert system is not a generalized expert or problem solver. Expert systems typically perform very limited tasks that can be performed by professionals in a few minutes or hours. Problems that cannot be solved by a human expert in the same short period of time are far too difficult for an expert system. But by capturing human expertise in limited areas, expert systems can provide organizational benefits.
12. HOW EXPERT SYSTEMS WORK: Four major elements compose an expert system: the knowledge base, the development team, the AI shell, and the user (see Figure 17.4). We will describe each of these parts in turn.
THE KNOWLEDGE BASE: What is human knowledge?
AI developers sidestep this thorny issue by asking a slightly different question: how can human knowledge be modeled or represented in a way that a computer can deal with it? This model of human knowledge used by expert systems is called the knowledge base. Three ways have been devised to represent human knowledge and expertise: rules, semantic nets, and frames.
RULES. A standard structured programming construct is the IF-THEN construct, in which a condition is evaluated; if the condition is true, an action is taken. For instance:
IF INCOME > $45,000 (condition)
THEN PRINT NAME AND ADDRESS (action)
A series of these rules can be a knowledge base. Any reader who has written computer programs knows that virtually all traditional computer programs contain IF-THEN statements. The difference between a traditional program and a rule-based expert system program is primarily one of degree and magnitude. AI programs can easily have 200 to 10,000 rules, far more than traditional programs, which may have 50 to 100 IF-THEN statements. Moreover, in an AI program the rules tend to be interconnected and nested to a far larger degree than in traditional programs. The order in which the rules are searched depends in part on what information the system is given. Multiple paths lead to the same result, and the rules themselves can be interconnected. Hence the complexity of the rules in a rule-based expert system is considerable. Could you represent the knowledge in the Encyclopedia Britannica this way? Probably not, because the rule base would be too large, and not all the knowledge in the encyclopedia can be represented in the form of IF-THEN rules. In general, expert systems can be efficiently used only in situations where the domain of knowledge is highly restricted (such as granting credit) and involves no more than a few thousand rules.
SEMANTIC NETS.
Semantic nets can be used to represent knowledge when the knowledge base is composed of easily identified chunks, or objects, of interrelated characteristics. Semantic nets can be much more efficient than rules. They use the property of inheritance to organize and classify objects. A relation like "Is-A" ties objects together: "Is-A" is a pointer to all objects of a specific class. For instance, the figure shows a semantic net used to classify kinds of automobiles. All specific automobiles in the lower part of the diagram inherit the characteristics of the general categories of automobiles above them. Insurance companies can use such a semantic net to classify cars into rating classes.
FRAMES. Frames also organize knowledge into chunks, but the relationships are based on shared characteristics rather than a hierarchy. This approach is grounded in the belief that humans use "frames," or concepts, to make rapid sense out of perceptions. For instance, when a person is told to "look for a tank and shoot when you see one," experts believe the human invokes a concept or frame of what a tank should look like; anything that does not fit this concept of a tank is ignored. In a similar fashion, AI researchers can organize a vast array of information into frames. The computer is then instructed to search the database of frames and list connections to other frames of interest. The user can then follow the various pathways pointed to by the system.
THE DEVELOPMENT TEAM. An AI development team is composed of one or several "experts," who have a thorough command of the knowledge domain, and knowledge engineers, who can translate that knowledge into a set of rules, frames, or semantic nets. A knowledge engineer is similar to a traditional systems analyst but has special expertise in eliciting information and expertise from other professionals. The knowledge engineer interviews the expert or experts and specifies the decision rules and knowledge that must be captured by the system.
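The "Is-A" inheritance of a semantic net maps naturally onto ordinary class inheritance. The sketch below is an illustrative assumption (the class names and rating-class attribute are invented for the automobile example, not taken from the figure):

```python
# "Is-A" links in a small semantic net of automobiles, modeled as classes.
# Objects lower in the hierarchy inherit characteristics from those above.

class Automobile:
    wheels = 4                        # shared characteristic of the top class

class Sedan(Automobile):              # Sedan Is-A Automobile
    rating_class = "standard"

class SportsCar(Automobile):          # SportsCar Is-A Automobile
    rating_class = "high-risk"        # insurers rate this category higher

class Roadster(SportsCar):            # Roadster Is-A SportsCar Is-A Automobile
    pass                              # inherits wheels and rating_class

print(Roadster.wheels)                # inherited from Automobile
print(Roadster.rating_class)          # inherited from SportsCar
```

Classifying a specific car into an insurance rating class then amounts to following its "Is-A" pointers upward until a category carrying the attribute is found.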
THE SHELL. The AI shell is the programming environment of an expert system. AI systems can be developed in just about any programming language, such as BASIC or Pascal. In the early years of expert systems, computer scientists used specialized programming languages such as LISP or Prolog that could process lists of rules efficiently. Today a growing number of expert systems use either the C language or, more commonly, AI shells that are user-friendly development environments. AI shells can quickly generate user-interface screens, capture the knowledge base, and manage the strategies for searching the rule base. The best of these AI shells generate C code, which can then be integrated into existing programs or tied into existing data streams and databases.
Inference engines in expert systems: An inference engine works by searching through the rules and "firing" those rules that are triggered by facts gathered and entered by the user. Basically, a collection of rules is similar to a series of nested IF statements in a traditional software program; however, the magnitude of the statements and the degree of nesting are much greater in an expert system. One of the most interesting parts of an expert system is the inference engine, which is characterized by the search strategy used: forward chaining or backward chaining. In forward chaining, the inference engine begins with the information entered by the user and searches the rule base to arrive at a conclusion. The strategy is to "fire," or carry out, the action of a rule when its condition is true. In the figure, beginning on the left, if the user enters a client with income greater than $100,000, the engine will fire all the rules in sequence from left to right. If the user then enters information indicating that the same client owns real estate, another pass of the rule base will occur and more rules will fire. The rule base can be searched each time the user enters new information; processing continues until no more rules can be fired.
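The repeated passes over the rule base described above can be sketched as a minimal forward-chaining loop. The rules below loosely mirror the credit example, but every threshold and rule name is an illustrative assumption:

```python
# A tiny rule base: each rule is (set of required facts, conclusion).
rules = [
    ({"income>100k"}, "grant_term_insurance"),
    ({"owns_real_estate"}, "send_financial_adviser"),
    ({"grant_term_insurance", "owns_real_estate"}, "add_to_prospect_db"),
]

def forward_chain(facts):
    """Fire every rule whose conditions hold; repeat until nothing new fires."""
    facts = set(facts)
    fired = True
    while fired:                       # one pass over the rule base per loop
        fired = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)  # "fire" the rule
                fired = True
    return facts

result = forward_chain({"income>100k", "owns_real_estate"})
print(sorted(result))
```

Note that the third rule can only fire after the first has added its conclusion, which is why the engine keeps re-searching the rule base until a pass produces no new facts.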
In backward chaining, an expert system acts more like a problem solver who begins with a question and seeks out more information to evaluate it. The strategy for searching the rule base starts with a hypothesis and proceeds by asking the user questions about selected facts until the hypothesis is either confirmed or disproved. In the figure, ask the question, "Should we add this person to the prospect database?" Begin on the right of the diagram and work toward the left. You can see that the person should be added to the database if a sales rep is sent, term insurance is granted, or a financial adviser will be sent to visit the client.
THE USER: The role of the user is both to pose questions of the system and to enter relevant data to guide the system along. The user may employ the expert system as a source of advice or to perform tedious and routine analysis tasks.
CREATING AN EXPERT SYSTEM: The creation of an expert system is an involved process that requires careful planning. The trend among those who need an ES is to acquire an expert system tool (shell) instead of writing the inference engine and other code "from scratch." The following are the major steps involved in the creation of a knowledge-based system when an ES shell is used.
1. Select a domain and a particular task. Choose a task that someone (an "expert") can do well. The performance of the task should be related to both the breadth and depth of knowledge. The facts and rules should be stable. The recommendations should be well defined.
2. Select the ES shell for implementation. Decide what type of inference control is needed. Decide what type of pattern-matching capability is needed. Decide whether certainty factors are necessary. Begin constructing a prototype system.
3. Acquire initial knowledge about the domain and task. Identify the knowledge expert(s). Select particular problems associated with each task.
Obtain and record factual knowledge from both reference material and experts, and cross-check it. Obtain and record task-related rules from the experts and confirm them to the degree possible. Prepare a set of test cases.
4. Encode the knowledge, using the appropriate representation: factual knowledge, inference knowledge, and control knowledge.
5. Execute and test the knowledge. Evaluate the test cases. Be alert for problems with consistency and completeness.
6. Refine the current knowledge and acquire additional domain knowledge. Revise the rules as necessary. Modify any facts that need revision. Augment the system with information on additional domain tasks and test again. Repeat as often as necessary.
7. Complete any necessary interface code. Demonstrate the system. Make the system user-friendly.
8. Document the system. Provide on-line and hard-copy documentation as necessary. Document the consultation portion especially well. Document the knowledge portion to the degree necessary.
If the expert system is to be coded from scratch, then many more concerns must be addressed. They relate primarily to the design and coding of the inference engine and the explanation subsystem. Coding from scratch can be a substantial undertaking.
13. CAPABILITIES OF EXPERT SYSTEMS: An expert system also offers other capabilities. For example:
1. It is often necessary to be able to easily remove (retract) given facts or even remove (excise) given rules from consideration during a consultation. Most expert systems provide this capability.
2. It is useful to be able to assign priorities to the firing of rules. This can be accomplished by giving each rule a priority number that the inference engine checks.
3. Some expert systems permit combined forward and backward reasoning (an example is opportunistic reasoning).
4. Certain systems permit the rules to invoke subprograms written in another language (such as LISP, C or FORTRAN) to perform complex operations.
One example of this is knowledge-based simulation, wherein the knowledge-based system invokes a discrete or continuous simulation system.
5. Some advanced expert systems combine facts with frames as well as rules and thereby incorporate the capability of inheritance, wherein there is a network of data structures and the "offspring" automatically inherit properties of the "parents."
6. As we have seen, one weakness of expert systems is their lack of so-called deep knowledge. Without this type of knowledge, the expert system cannot respond to a question or statement when there is no matching fact or rule. Efforts are under way to build fundamental models to help solve this problem.
7. Learning subsystems are currently being investigated as a way to simplify the task of the knowledge-base builder and the knowledge engineer so that the knowledge base can be dynamically updated with reliable information.
A typical expert system, shell or tool has many capabilities. The number and quality of the features also affect the cost of the system, so before selecting a system the user should have a good idea of what capabilities he or she needs.
14. APPLICATIONS AND EXAMPLES OF EXPERT SYSTEMS: After DENDRAL, CADUCEUS and MYCIN proved their usefulness, other systems started to appear in a variety of fields. Today, expert systems are used in areas such as economics, industry, medicine, education and data processing, among others.
The RCS (River Conservation Status) System: South Africa's river systems vary from practically pristine natural systems to heavily exploited and degraded drainage ditches. The Olifants river system of the Southwestern Cape flows through mountainous catchments of unique vegetation and includes in its fauna eight fish species endemic to the system. In contrast, 150 km further south, the Black River flows in a concrete canal through Cape Town, with effluent from the city's main sewage works contributing up to 90% of the flow.
Other aspects of conservation concern are embodied in the Olifants River of the Eastern Transvaal, one of the main drainage systems for the highly populated, industrialized and mining-rich Witwatersrand area, which then flows through the Kruger National Park. Conservation in South Africa has been dominated by the economic and popular appeal of the large animal populations in protected areas, so that until recently consideration of aquatic conservation was confined to hippos, crocodiles and a few fish species. Nevertheless, streams in undeveloped catchments reflect unaltered natural conditions and could be preserved in this state. Perhaps the most important issue is that South Africa is an arid country where water is often the limiting resource for future development. Natural freshwater lakes are unknown and groundwater reservoirs are meager. Rivers in South Africa are therefore under intense development pressure, and the case for conservation must be very strong to be given priority. A major problem has been that no coherent river conservation policy has been developed, and consequently it has been extremely difficult for government agencies, planners and engineers to understand and consider conservation priorities. A need was therefore identified for a means of assessing the major conservation attributes of rivers, for communicating these in a conceptually simple manner to people who are not ecologists, and for investigating the likely consequences of proposed river development schemes on the conservation of rivers. Some aspects of conservation are quantifiable, but others involve subjective value judgements. Expert systems are well suited to modeling and decision making for such conservation problems. The aims of the RCS project were to identify attributes of rivers which are important for their conservation, to establish the relative nature and scale of this importance, and then to quantify the conservation status of any particular river or section of river.
‘Conservation status’ is defined as a measure of the relative importance of the river for conservation and the extent to which it has been disturbed from its natural state. Given the required information about a river, the system must be able to provide: a relative value of the conservation status of the river; relative values for different components of the river; ‘confidence limits’ indicating how precisely the conservation status can be measured and where more accurate information is required; a listing of the relative importance of each attribute in determining the status of the river; and opportunities for the user to manipulate the program to examine its assumptions and change parameters. The system was originally designed as a communication tool to describe conservation priorities to managers, developers and planners in a consistent way, so that ecological factors can be taken into account in plans for river exploitation and development. The results are presented in a conceptually simple fashion, thus allowing the non-specialist to appreciate the relative conservation status of different rivers. For the ecologist/conservationist, the primary function of the system is the classification and mapping of rivers over an area. Within this function the system can act as a conservation agency, classifying rivers of different conservation status and also clearly identifying areas where more information is needed. A second important function of the system is its use as a model to evaluate the effects of planned changes. In this case the system uses information about the river in its present state and then compares this with runs using data or predictions on river conditions following the planned changes. This should provide a powerful tool for environmental impact assessment. The construction, evaluation and testing of the system provided an extremely valuable function in itself.
This process forced the contributors to examine their own assessment methods carefully. Many conservationists, for instance, make ‘intuitive’ judgments about the importance of particular sites. In fact, this ‘intuition’ comprises a complex net of interacting variables which are evaluated in terms of an individual’s experience and the available information about the site. Being forced to analyze these variables and their interrelationships often led to considerable insight and also often helped to pinpoint areas of disagreement, so that, while arguments were not eliminated, they were at least channeled into specific resolvable problems. In South Africa, the development of the RCS system has led to the identification of those aspects which are generally felt to be the most important in conserving rivers and has provided a fair consensus as to the relative importance of different attributes of rivers. The ability to present this consensus view to ecological laymen charged with management responsibilities for rivers gives them a more realistic opportunity to take conservation priorities into account at the planning stage. Conservationists too often see managers, planners and developers as insensitive to environmental issues, when in fact a major part of the problem is the inability of the conservationists to present a concise summary of their complex points of view. It is unreasonable to expect laymen to unravel the multivariate probabilities and diffuse intuitions of the conservation ethic. The RCS encapsulates the more important components of river conservation status. As with the other ecological systems discussed here, this system has acted as a focus for a number of interested experts. It served to identify points of conflict between experts and thus to identify areas where further research is required.
The FISHFARMER System: The lack of aquaculture expertise has been identified as one of the major constraints on the development of aquaculture in southern Africa.
It was decided to develop an expert system to assist potential aquaculturists in assessing the aquaculture potential of various fish species in relation to particular sites and culture methods. Computers have played an increasingly important role in aquaculture in recent years, but their use has been limited to data storage (to aid in pond management) and the monitoring of oxygen levels and water flow. As far as is known, FISHFARMER represents the first use of expert system techniques as a means of assessing aquaculture potential. Great emphasis has been placed on the ‘user friendliness’ of the system. Fish culture has been practiced for over 2000 years, but it is only recently that intensive aquaculture has developed into a commercial activity dependent upon advanced technology and scientific research. World output from aquaculture increased by 40% between 1975 and 1984. Wellesley and Burton propose a number of steps that should be taken if the industry is to develop. These include the establishment of a lead agency to initiate marketing, coordinate research and promote the transfer of technology. As a means to this end, it was decided to develop an expert system to evaluate the aquaculture potential of a given organism, site and culture method. Such an expert system would provide the ability to assess the aquaculture potential of a particular site independently of human aquaculture experts. In addition, it was hoped that the development of the system would have two side effects: first, it would bring together information from a wide number of different sources, and second, it would clearly identify ‘holes’ in the available knowledge. The requirements of the system were the following:
1. Given available data on the biological, physical, financial and infrastructural parameters pertaining to a potential site, the system should be able to evaluate the site with regard to its suitability for fish culture.
- If the site is suitable for fish culture, evaluate the species included in the knowledge base in relation to the environmental characteristics of the site.
- Provide a confidence value with each recommendation.
2. The system should be ‘user friendly’ in that it must:
- Explain (on demand) why a particular question is being asked.
- Explain (on demand) the rationale of its reasoning at any given point in the evaluation.
- Explain (on demand) any technical terms and procedures associated with any of the questions.
- Allow manipulation of the system by the user to examine its assumptions and change parameters.
- Require a minimum of computer expertise for its use and comprehension.
It is envisaged that Nature Conservation officers, fish farming consultants and researchers will use the system, and of course anyone interested in assessing the aquaculture potential of a site. FISHFARMER attempts to determine the optimum match between a particular site, water source, market, species, culture method and financial resources. The system uses information provided by the user in conjunction with that from its own knowledge base in order to make the correct match. The system queries the user in detail about the proposed site in order to determine whether the site, culture method and species complement each other. Good use has been made of the relative importance measure. Thus water temperature, being a particularly crucial environmental parameter, influences the system’s decision far more than would, for example, the state of the site’s access road, and so is given a higher rating. In addition to answering questions, the user may ask for further background information about the questions being asked. The development of the FISHFARMER expert system does not lend itself to a single researcher working in isolation; the design and contents of the knowledge base constantly need to be challenged and discussed.
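The relative-importance mechanism described above can be sketched in a few lines. This is a hypothetical illustration only, not the actual FISHFARMER implementation; the parameter names, weights and suitability scores are all invented for the example:

```python
# Minimal sketch of FISHFARMER-style weighted site evaluation (hypothetical).
# Each site parameter receives a suitability score in [0, 1] and a relative
# importance weight; crucial factors such as water temperature carry far
# more weight than minor ones such as the state of the access road.

def evaluate_site(scores, weights):
    """Return a confidence value in [0, 1]: the weighted mean suitability."""
    total_weight = sum(weights[p] for p in scores)
    weighted = sum(scores[p] * weights[p] for p in scores)
    return weighted / total_weight

# Illustrative weights and scores (not taken from the original system).
weights = {"water_temperature": 10, "water_quality": 8,
           "market_distance": 4, "access_road": 1}

site = {"water_temperature": 0.9, "water_quality": 0.8,
        "market_distance": 0.5, "access_road": 0.2}

confidence = evaluate_site(site, weights)
print(f"Site suitability confidence: {confidence:.2f}")
```

With this weighting, a poor access road barely dents an otherwise good site, whereas an unsuitable water temperature would dominate the result, which is the behavior the text describes.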
This project has benefited from the inputs of a wide variety of individuals: aquaculture researchers, computer scientists and commercial fish farmers. The FISHFARMER project has proved to be a most useful exercise from a number of different points of view. It has consolidated information and knowledge from a wide range of sources and people. Many areas which require further in-depth research have been identified, and it has provided the user with a useful tool to assist in solving problems and developing sites.
EMEX: An Expert System for Market Analysis and Forecasting
The aim of the EMEX system is to guide the user through the stages of the model-building task. Its role is that of an assistant. It is important that the user, as the expert in the market being modeled, exercises judgment over the results and suggestions that the system makes. The combination of the system’s model-building expertise and the user’s market expertise can together provide a very powerful insight into the operation of the market. Ease of use is an important consideration for such a package. The user and the system interact via a series of forms and menus. There are facilities to allow the user to browse through the details of the current consultation and to change previously entered information, and at all times a help system can offer advice relevant to the current context. EMEX is now in regular use and continues to undergo refinement. To date its performance has been judged by comparing the models it generates with the models produced by experts, and on several occasions it has actually improved models originally built by the experts. Moreover, a model which might take an expert half a day to build is likely to take around thirty minutes with EMEX, so great time savings can also be made.
APES: An Expert System for Nutrition Education
Interaction with a knowledge base can be good for your health!
This was the starting point for an attempt at constructing an environment for exploration by pupils, students, teachers and experts. Part of this environment is an expert system developed to provide advice on the nutritional properties of food and the needs of a healthy diet. This requires an underlying knowledge base which itself includes a very large bank of data. A powerful learning environment or micro-world is then capable of interrogation and amendment, and of supporting computer-aided learning (CAL) applications that might, for example, represent a cooking process or the composition of a diet for analysis. A key feature of the project is the construction of a database, a database management system (DBMS) and a set of rules for presentation to and use by the user. To keep the initial development simple, a non-probabilistic, backward-chaining production rule system was chosen for the implementation: Micro-Prolog Professional and APES (Augmented Prolog for Expert Systems), produced by Logic Programming Associates. Additional reasons for choosing this system were that it was available for microcomputers already in use in schools and that powerful learning packages had already been successfully built with it, particularly by the Exeter Project under Jon Nichol at Exeter University. Application packages based on Micro-Prolog have been well received by History teachers. The involvement of experts in knowledge engineering is a two-way process. As they construct the knowledge base, so too they learn how to structure their knowledge and make explicit the underlying rules of behavior. Furthermore, in the field of education this process can be harnessed to improve and extend the learning abilities of students. A wide range of expert system shells and AI languages is becoming available; even in education, specialist shells are being developed. How does one choose which system to adopt?
For this application we needed a shell which could work with our own database module and a language which supports complex list handling and external files, as well as possessing a syntax that is reasonably accessible to nutritionists, teachers and students. APES suited these purposes: it is flexible, extensible and available. The main drawback of this relatively open environment, however, is that the interface is not as user friendly as some shells. The database itself can be hidden from the ordinary user in two major ways. For large applications the best strategy is to externalize very large files that cannot be held in memory and use the indexed method of record retrieval available with Micro-Prolog for reasonably fast access. The rules for extracting and manipulating the data can be held in a closed APES module, loaded in with APES itself as an integral part of it. For prototyping, however, it should be sufficient to code part of the file as Prolog clauses and make the low-level DBMS rules non-interactive, i.e. hidden from the user. APES allows for tailoring the environment to suit the application. Prototyping the knowledge base was relatively straightforward. Both bottom-up and top-down methods were needed. Once the basic data needs were identified, efficient rules for presenting meaningful information were written. At the same time, teachers and experts sketched out the applications which would use this information. The skill lay in matching the needs of the application with the potential of the database. Differences arose over what programmers considered to be the simple choice of clause names: to nutritionists and health experts, words such as ‘high-in’ and ‘good-for’, as in ‘foods high in fiber are good for constipation’, caused all sorts of problems which were not immediately apparent to us! A lesson learnt here was that the experts need to be more familiar with what the system can do in order to define the language to be presented to the user.
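To make the backward-chaining idea concrete, here is a minimal Python analogy of the ‘high-in’/‘good-for’ rule quoted above. APES itself uses Micro-Prolog, not Python, and the facts below are invented for illustration:

```python
# Tiny backward-chaining sketch: a Python analogy for the Prolog rule
#   good_for(Food, constipation) :- high_in(Food, fiber).
# The facts are illustrative only, not taken from the APES knowledge base.

facts = {
    ("high_in", "bran", "fiber"),
    ("high_in", "lentils", "fiber"),
    ("high_in", "cheese", "fat"),
}

def good_for(food, condition):
    """Try to prove the goal good_for(food, condition) from rules and facts."""
    if condition == "constipation":
        # Backward step: the goal holds if the subgoal high_in(food, fiber)
        # can be found among the known facts.
        return ("high_in", food, "fiber") in facts
    return False  # no rule covers other conditions in this sketch

print(good_for("bran", "constipation"))    # True
print(good_for("cheese", "constipation"))  # False
```

The interpreter starts from the goal and works backwards to the facts that would justify it, which is exactly the non-probabilistic, backward-chaining behavior the text attributes to APES.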
APES is not ideally suited to education. The interface is too complex, some of its language too advanced and, in any case, specialized applications do require specialized front-ends. Expert system shells aimed at the needs of education have been developed at Kingston College, with Micro-Prolog as the host language. There is still a problem, though, that simplified shells for education restrict the complexity of the micro-world available to pupils. Other avenues being explored for greater ease of use include object-oriented languages and environments, but it is still early days. Another difficulty encountered was the lack of teacher familiarity with declarative programming and the lack of suitable software systems in schools. There will need to be an investment in training in fifth-generation computing before we even begin to discover the potential for interaction with knowledge bases and writing small “expert system” programs. Ten years of similar investment in Logo is only just beginning to show research results, though not all positive. Nonetheless, the small amount of work we have done with teachers, pupils, students and health educationalists convinces us that there is potential in educational applications of expert systems.
THE DIABETES EXPERT SYSTEM:
The expert system DIABETES is used for tutoring and teaching medical students, general practitioners and medically related staff on diabetes diagnosis and management. Major complications of diabetes are dealt with, as well as treatment using insulin administration. The basic idea of the system is to present the user with a general tool for experimenting with a large number of patient cases by choosing symptoms and history from a menu-driven interface. It is a system based on production rules and can be an effective decision-making tool in such a complex area of medicine as diabetes.
15. EXAMPLES OF SUCCESSFUL EXPERT SYSTEMS:
There are many successful expert systems.
However, there is no accepted definition of successful. What is successful to an academic (“It works!”) may not be successful to a corporation (“It cost a million dollars!”). While some of the better-known expert systems are quite large and cost millions of dollars, others are less expensive and tackle interesting but smaller problems. Some of the most celebrated systems are not used to facilitate routine decision-making. Finding out which successful systems are used on a daily basis is difficult because corporations regard this information as proprietary. Nevertheless, we can briefly describe some of the better-known commercial success stories.
1. Whirlpool uses the Consumer Appliance Diagnostic System (CADS) to help its customer service representatives handle its 3 million annual telephone inquiries. The system expedites customer service by directing customers to a single source of help without delay. Previously, customers who had a problem or question about Whirlpool products might have to be put on hold or directed to two or three different representatives before their questions could be answered. Whirlpool developed CADS using Aion’s Development System for OS/2 as its expert system shell. Two knowledge engineers worked with one programmer and three of the company’s customer service experts to capture 1000 rules for 12 product lines. By 1999, Whirlpool expects to use CADS to respond to 9 million calls annually.
2. The National Aeronautics and Space Administration (NASA) developed MARVEL, its Multimission Automation for Real-Time Verification of Spacecraft Engineering Link, to monitor its Voyager missions without burning out analyst after analyst. Spacecraft on long missions generate voluminous and critical information that must be carefully analyzed. MARVEL monitors NASA’s computer-command subsystem, which receives and executes commands from the ground, and also analyzes power, propulsion and flight-data subsystems, and telecommunications functions.
NASA developed MARVEL with the assistance of the equivalent of 1.5 full-time computer scientists and two mission experts. MARVEL is based on Software Architecture and Engineering’s Knowledge Engineering System expert system shell and runs on Sun workstations.
3. Countrywide Funding Corp. in Pasadena, California, a loan underwriting firm with about 400 underwriters in 150 offices around the country, developed a microcomputer-based expert system in 1992 to make preliminary creditworthiness decisions on loan requests. The company had experienced rapid, continuing growth and used the system to help ensure consistent and high-quality loan decisions.
4. CLUES (Countrywide’s Loan Underwriting Expert System) has about 400 rules. Countrywide tested the system by feeding every loan application handled by a human underwriter to CLUES as well. The system was refined until it agreed with the underwriter in 95 percent of the cases. However, Countrywide will not rely on CLUES to reject loans, because the expert system cannot be programmed to handle exceptional situations such as those involving a self-employed person or complex financial schemes. An underwriter will review all rejected loans and make the final decision. CLUES has other benefits: using it, an underwriter can evaluate at least sixteen loan applications per day, far more than was possible traditionally (Nash, 1993).
5. The Digital Equipment Corporation (DEC) and Carnegie-Mellon University developed XCON in the late 1970s to configure VAX computers on a daily basis. The system configures customer orders and guides the assembly of those orders at the customer site. XCON has been used for major functions such as sales and marketing, manufacturing and production, and field service, and played a strategic role at DEC (Sviokla, June 1990; Barker and O’Connor, 1989). It is estimated that XCON and related systems saved DEC approximately $40 million per year.
6. The table describes other well-known expert systems in terms of their size and programming languages.
As can be seen, these systems generally have a minimum of several hundred rules. Note Digital Equipment Corporation’s XCON, which started out with 250 rules but expanded to about 10,000.
7. These examples show that expert systems can provide organizations with an array of benefits, including reduced errors, reduced cost, reduced training time, improved decisions, and improved quality and service. The Window on Organization shows how expert systems can be applied to solve some problems in medicine and health care.
16. PROBLEMS WITH EXPERT SYSTEMS:
A thorough understanding of expert systems also requires awareness of their current limitations and problems.
1. Expert Systems Are Limited to Certain Problems
In answer to the question, “Why do some expert systems work?” critics point out that virtually all successful expert systems deal with problems of classification in which there are relatively few alternative outcomes and in which these possible outcomes are all known in advance. Contrary to early promises, expert systems do best at automating lower-level clerical functions. Even in these comparatively simple situations, however, expert systems require large, lengthy, and expensive development efforts. For these kinds of problems, hiring or training more experts may be less expensive than building an expert system.
2. Important Theoretical Problems Exist
There are significant theoretical problems in knowledge representation. IF-THEN knowledge exists primarily in textbooks. There are no adequate representations for deep causal models or temporal trends. No expert system, for instance, can write a textbook on information systems or engage in other creative activities not explicitly foreseen by system designers. Many expert systems cannot yet replicate knowledge that is intuitive, based on analogy and on a “sense of things.”
3.
Expert Systems Are Not Applicable to Complex Managerial Problems
The applicability of expert systems to complex managerial problems is currently very limited. Many managerial problems involve drawing facts and interpretations from divergent sources, evaluating the facts, and comparing one interpretation of the facts with another; they do not involve simple classification. Expert systems cannot address complex problems requiring intuition to solve. Expert systems based on prior knowledge of a few known alternatives are unsuitable for the problems managers face on a daily basis.
4. Expert Systems Are Expensive to Maintain
The knowledge bases of expert systems are fragile and brittle; they cannot learn or change over time. In fast-moving fields like medicine or the computer sciences, keeping the knowledge base up to date is a critical problem. For applications of even modest complexity, expert system code is generally hard to understand, debug, and maintain. Adding new rules to a large rule-based program nearly always requires revision of the control variables and conditions of earlier rules. Which of these entries to change to make the new rule work is often far from obvious.
5. A More Limited Role for Expert Systems
Although expert systems lack the robust and general intelligence of human beings, they can provide benefits to organizations if their limitations are well understood. Expert systems have proved especially useful for certain types of diagnostic problems. They can provide electronic checklists for lower-level employees in service bureaucracies like banking, insurance, sales, and welfare agencies. Elements of expert system technology have been incorporated into a wide variety of products and services (Hayes-Roth and Jacobstein, 1994). (An example would be expert tax advice provided in popular financial planning and tax calculation software packages.)
In limited areas, expert systems can help organizations make higher-quality decisions using fewer people.
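The maintenance problem described in point 4, where adding a rule forces revision of earlier rules, can be illustrated with a toy rule interpreter. Everything here is hypothetical, loosely inspired by the CLUES underwriting example; no real system's rules are shown:

```python
# Toy illustration of why adding a rule to a rule-based system often forces
# revision of earlier rules: with first-match-wins control, rule ORDER is
# itself a hidden condition. All rules and thresholds are hypothetical.

def classify(applicant, rules):
    """Fire the first rule whose condition matches; return its outcome."""
    for condition, outcome in rules:
        if condition(applicant):
            return outcome
    return "refer to a human underwriter"

rules = [
    (lambda a: a["income"] >= 50_000, "approve"),
    (lambda a: a["income"] < 50_000, "reject"),
]

applicant = {"income": 60_000, "self_employed": True}
print(classify(applicant, rules))  # approve

# A new, more specific rule for self-employed applicants is useless if merely
# appended: the two general income rules already match every applicant. It
# must be inserted BEFORE them, changing how the existing rule set behaves.
rules.insert(0, (lambda a: a["self_employed"], "refer to a human underwriter"))
print(classify(applicant, rules))  # refer to a human underwriter
```

Even in this four-rule program, the correct place for the new rule depends on every rule already present, which is why large rule bases become hard to maintain.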