Computers & Operations Research 27 (2000) 1023–1044

 Neural networks in business: techniques and applications for
                  the operations researcher
                                 Kate A. Smith *, Jatinder N.D. Gupta
                        School of Business Systems, Monash University, Clayton, VIC 3168, Australia
                         Department of Management, Ball State University, Muncie, IN 47306, USA


   This paper presents an overview of the different types of neural network models which are applicable when
solving business problems. The history of neural networks in business is outlined, leading to a discussion of
the current applications in business including data mining, as well as the current research directions. The role
of neural networks as a modern operations research tool is discussed.

Scope and purpose

   Neural networks are becoming increasingly popular in business. Many organisations are investing in
neural network and data mining solutions to problems which have traditionally fallen under the
responsibility of operations research. This article provides an overview for the operations research reader
of the basic neural network techniques, as well as their historical and current use in business. The paper is
intended as an introductory article for the remainder of this special issue on neural networks in business.
© 2000 Elsevier Science Ltd. All rights reserved.

Keywords: Neural networks; Operations research; Business; Data mining

1. Introduction

  Over the last decade, we have seen a rapid acceptance of new technologies like neural networks
and data mining methodologies for solving a wide range of business problems. Many of these
problems involve tasks that have typically been the domain of the operations researcher, like

  * Corresponding author. Tel.: +61-3-9905-5800; fax: +61-3-9905-5159.
  E-mail address: (K.A. Smith)

0305-0548/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0305-0548(99)00141-0
1024           K.A. Smith, J.N.D. Gupta / Computers & Operations Research 27 (2000) 1023–1044

forecasting, modelling, clustering, and classification. As the business world becomes more excited
about neural networks and data mining, however, it is important for the operations researcher to
realise that these technologies are really their own.
   While neural networks have developed from the field of artificial intelligence and brain modelling,
the operations research reader will recognise them for what they really are. Neural networks
are nothing more than function approximation tools which learn the relationship between independent
variables and dependent variables, much like regression or other more traditional approaches.
The principal difference between neural networks and statistical approaches is that neural networks
make no assumptions about the statistical distribution or properties of the data, and
therefore tend to be more useful in practical situations. Neural networks are also an inherently
nonlinear approach, which allows them to model complex data patterns accurately. There are
several types of neural networks, each with a different purpose, architecture and learning algorithm,
and these will be outlined in Section 2.
   In Section 3, we briefly review the history of neural networks from the perspective of business
applications. Five stages of neural network development are identified, together with the impact
each stage had on the business community. This leads into a discussion in Section 4 of the current
business application areas where neural networks are finding relevance. One of the main areas
where neural networks are proving to be useful is data mining. Data mining is becoming extremely
popular in the business world, as a solution methodology to a wide variety of problems where the
solution is believed to be hidden in the data warehouse. Neural networks form the backbone of
most of the data mining products available, and are an integral part of the knowledge discovery
process which is central to the methodology. This data mining methodology, as well as some of the
other knowledge discovery techniques, will be discussed in Section 5.
   Certainly, this is not the first paper to review neural networks. The developments in the field of
neural networks have been reviewed by several authors from various points of view. Wong et al.
[1–3] categorise the available literature using the number of publications in each area to identify
previous research and application trends, and identify future directions. Sharda [4] and Ignizio and
Burke [5] review the applications of neural networks in the forecasting, prediction and operations
research fields. Smith [6] surveys the application of neural networks to problems of combinatorial
optimization. Zhang and Huang [7] review the applications of neural networks in the area of
manufacturing. A previous special issue of Computers and Operations Research by Ignizio and
Burke [5] also presented some interesting developments in the use of artificial intelligence and
evolutionary programming for solving operations research problems. This paper (and special issue)
has a different focus, however. Our review emphasizes the historical progressions in the field of
neural networks and discusses the impact these had on the business community. The role of the
operations researcher in this current environment is then identified by reviewing neural network
developments in a series of application areas.
   This paper thus aims to introduce the operations research reader to neural techniques which
appear to have been received rather sceptically to date. Neural networks and data mining are not
magic solutions to problems, despite the message promoted by vendors of software products.
Operations researchers are likely to find success when using these techniques, however, because they
will understand the process and are likely to adhere to the methodology. Due to the strong demand
from business and industry, these approaches will become a valuable and highly marketable tool
for operations research groups in the near future.

                      Fig. 1. Architecture of MFNN. (note: not all weights are shown)

2. Neural network models

  In this section we provide details of three of the better known neural network models. Each
model is presented in terms of its purpose, architecture, and algorithm. Each of these models has
some similarity to more traditional statistical and operations research techniques, and the relation-
ships to the analogous traditional techniques are discussed.

2.1. Multilayered feedforward neural networks

   According to a recent study [2], approximately 95% of reported neural network business
application studies utilise multilayered feedforward neural networks (MFNNs) with the backpropagation
learning rule. This type of neural network is popular because of its broad applicability
to many problem domains of relevance to business: principally prediction, classification, and
modelling. MFNNs are appropriate for solving problems that involve learning the relationships
between a set of inputs and known outputs. They are a supervised learning technique in the sense
that they require a set of training data in order to learn the relationships.
   The MFNN architecture is shown in Fig. 1 and consists of two or more layers of neurons
connected by weights. The flow of information is from left to right, with inputs x being passed
through the network via the hidden layer of neurons to the output layer. The weights connecting
input element i to hidden neuron j are denoted by $w_{ji}$, while the weights connecting hidden neuron
j to output neuron k are denoted by $v_{kj}$.
   Each neuron calculates its output based on the amount of stimulation it receives from the given
input vector x. More specifically, a neuron's net input is calculated as the weighted sum of its
inputs, and the output of the neuron is based on a sigmoidal function indicating the magnitude of
this net input. That is, for the jth hidden neuron

      $net^h_j = \sum_i w_{ji} x_i$   and   $y_j = f(net^h_j)$,                                       (1)

while for the kth output neuron

      $net^o_k = \sum_j v_{kj} y_j$   and   $o_k = f(net^o_k)$.                                       (2)

Typically, the sigmoidal function f(net) is the well-known logistic function

      $f(net) = \frac{1}{1 + e^{-\lambda\, net}}$,                                                    (3)

where $\lambda$ is a parameter used to control the gradient of the function, although the only requirement
is that it be bounded between 0 and 1, monotonically increasing, and differentiable.
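
   As an illustration of Eqs. (1)-(3), the following minimal Python sketch computes the hidden and
output neuron values for a network with one hidden layer. The function names, the list-of-lists
weight layout and the numerical values are assumptions made for this example and are not taken from
the paper; the gradient parameter of Eq. (3) is called lam here.

```python
import math


def logistic(net, lam=1.0):
    """Logistic function of Eq. (3); lam controls the gradient (assumed name)."""
    return 1.0 / (1.0 + math.exp(-lam * net))


def forward(x, w, v, lam=1.0):
    """Forward pass of an MFNN with a single hidden layer.

    x: list of inputs x_i
    w: hidden weights, w[j][i] connects input i to hidden neuron j
    v: output weights, v[k][j] connects hidden neuron j to output neuron k
    Returns (y, o): hidden outputs y_j (Eq. 1) and network outputs o_k (Eq. 2).
    """
    y = [logistic(sum(w_ji * x_i for w_ji, x_i in zip(w_j, x)), lam) for w_j in w]
    o = [logistic(sum(v_kj * y_j for v_kj, y_j in zip(v_k, y)), lam) for v_k in v]
    return y, o


# Two inputs, two hidden neurons, one output; weights chosen arbitrarily.
x = [1.0, 0.0]
w = [[0.5, -0.5], [0.0, 1.0]]
v = [[1.0, -1.0]]
y, o = forward(x, w, v)
print(y, o)
```

   The second hidden neuron receives a net input of zero here, so its output is exactly 0.5, the
midpoint of the logistic function.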
   For a given input pattern, the network produces an output (or set of outputs) $o_k$, and this
response is compared to the known desired response of each neuron $d_k$. The weights of the network
are then modified to correct or reduce the error, and the next pattern is presented. The weights are
continually modified in this manner until the total error across all training patterns is reduced
below some pre-defined tolerance level (or the network has started to "overtrain" as measured by
deteriorating performance on the test set [8]).
   The weight update rule for the output layer weights V is given by

      $v_{kj}(t+1) = v_{kj}(t) + c\lambda\,(d_k - o_k)\, o_k (1 - o_k)\, y_j(t)$                        (4)

and for the hidden layer weights W by

      $w_{ji}(t+1) = w_{ji}(t) + c\lambda^2\, y_j (1 - y_j)\, x_i(t) \sum_k (d_k - o_k)\, o_k (1 - o_k)\, v_{kj}$.   (5)

Proof that these weight updates minimise the total average-squared error

      $E = \frac{1}{2P} \sum_{p=1}^{P} \sum_{k=1}^{K} (d_{pk} - o_{pk})^2$,                             (6)

where $d_{pk}$ is the desired output of neuron k for input pattern p, and $o_{pk}$ is the actual network
output of neuron k for input pattern p, relies on the fact that the algorithm (known as the
backpropagation learning algorithm) performs steepest descent on this error function [8].
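
   The update rules (4) and (5) and the error function (6) can be sketched in a few lines of Python.
This is a minimal illustration under stated assumptions rather than a reference implementation: the
function names, the weight layout (w[j][i], v[k][j]), the learning rate c = 0.5 and the single
training pattern are all hypothetical. The factors of lam come from differentiating the logistic
function (3) and equal one when lam = 1.

```python
import math


def logistic(net, lam=1.0):
    """Logistic function of Eq. (3) with gradient parameter lam."""
    return 1.0 / (1.0 + math.exp(-lam * net))


def forward(x, w, v, lam=1.0):
    """Eqs. (1) and (2): hidden outputs y_j and network outputs o_k."""
    y = [logistic(sum(a * b for a, b in zip(w_j, x)), lam) for w_j in w]
    o = [logistic(sum(a * b for a, b in zip(v_k, y)), lam) for v_k in v]
    return y, o


def backprop_step(x, d, w, v, c=0.5, lam=1.0):
    """Apply Eqs. (4) and (5) in place for one training pattern (x, d)."""
    y, o = forward(x, w, v, lam)
    # Output error term (d_k - o_k) o_k (1 - o_k), shared by both updates.
    delta = [(d_k - o_k) * o_k * (1.0 - o_k) for d_k, o_k in zip(d, o)]
    # Back-propagated sum over k, computed with v(t) before it is updated.
    back = [sum(delta[k] * v[k][j] for k in range(len(v))) for j in range(len(w))]
    for k in range(len(v)):  # Eq. (4): output layer weights
        for j in range(len(w)):
            v[k][j] += c * lam * delta[k] * y[j]
    for j in range(len(w)):  # Eq. (5): hidden layer weights
        for i in range(len(x)):
            w[j][i] += c * lam ** 2 * y[j] * (1.0 - y[j]) * x[i] * back[j]


def total_error(patterns, w, v, lam=1.0):
    """Total average-squared error of Eq. (6) over a list of (x, d) pairs."""
    err = 0.0
    for x, d in patterns:
        _, o = forward(x, w, v, lam)
        err += sum((d_k - o_k) ** 2 for d_k, o_k in zip(d, o))
    return err / (2.0 * len(patterns))


# One pattern, two inputs, two hidden neurons, one output (arbitrary values).
patterns = [([1.0, 0.0], [1.0])]
w = [[0.5, -0.5], [0.0, 1.0]]
v = [[1.0, -1.0]]
before = total_error(patterns, w, v)
backprop_step(patterns[0][0], patterns[0][1], w, v)
after = total_error(patterns, w, v)
print(before, after)
```

   In this small case a single update lowers the error on the pattern, as expected of a steepest
descent step with a moderate learning rate. Weights attached to an input of zero are left unchanged,
because the update in Eq. (5) is proportional to x_i.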
   There are many training issues involved in applying MFNNs successfully, including ensuring
that the learnt relationships generalise well to new data. To ensure this, data are typically divided
into a training and a test set, where the performance on the test set is used to indicate the
generalisation of the neural network results. Other issues involve optimal selection of the many
training parameters including the number of hidden neurons, the learning rate c, the initial weights,
and the slope $\lambda$ of the sigmoidal function. Convergence to local minima of the error function (6) is
also a concern, since the final combination of weights is then suboptimal and a residual error remains.
Researchers have recently started using heuristic approaches like genetic algorithms instead of the
backpropagation learning rule to determine the optimal weights for the MFNN to minimise the
total average-squared error [9–11].
   The MFNN, with an algorithm for determining the optimal weights for a given training set of
data (backpropagation or heuristic algorithm), can be seen as similar to any function approximation
technique like regression, where the weights are analogous to regression coefficients estimated

                              Fig. 2. Architecture of Hopfield neural network.

by least squares. The difference of course is the improved potential of the function approximation
when learning highly complex and nonlinear data due to the increased number of free parameters.

2.2. Hopfield neural networks

   While MFNNs learn the relationships between inputs and outputs in a supervised manner,
Hopfield neural networks are completely different, in function, architecture and approach. With
MFNNs, the neurons are connected in layers, and the weights are modified throughout the
algorithm to reflect the learning process. With Hopfield networks however, there is no layer
structure to the architecture, and the weights do not change. Hopfield networks [12] are a fully
interconnected system of N neurons as shown in Fig. 2 for N = 4. The weights of the network
$W_{ij}$ are fixed and symmetric ($W_{ij} = W_{ji}$), and store information about the memories or stable
states of the network. Each neuron has a state x which is bounded between 0 and 1. Neurons are
updated according to a differential equation, and over time an energy function is minimised. The
local minima of this energy function correspond to the stable states of the network.
   Hopfield networks are principally used to solve optimisation problems of the kind familiar to the
operations researcher. Hopfield and Tank [13] showed that the weights of a Hopfield network can
be chosen so that the process of neurons updating simultaneously minimises the Hopfield energy
function and the optimisation problem.
   Each neuron i updates itself according to the differential equation

      $\frac{d\,net_i}{dt} = -\frac{net_i}{\tau} + \sum_{j=1}^{N} W_{ij} x_j + I_i$,                    (7)
      $x_i = f(net_i)$,

where f(.) is a sigmoidal output function bounded by 0 and 1 like (3), and $\tau$ is a constant. These
equations are similar to the calculation of a neuron output in the MFNN except that a constant
term $I_i$ has been added to the net input of each neuron, and the time dynamics are now continuous
(although the process is usually simulated with a discrete Euler approximation). Each time

a neuron is updated in this manner, the energy function

      $E = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} W_{ij} x_i x_j - \sum_{i=1}^{N} I_i x_i$           (8)

is reduced. In fact, this energy function is a Liapunov function for the system and is guaranteed not
to increase [12]. This proof relies on the fact that the neuron update rules (7) result in steepest
descent of the energy function (8), just like the weight update rules (4) and (5) of the MFNN with
backpropagation result in steepest descent of the error function (6).
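
   The discrete Euler simulation of Eq. (7) mentioned above can be sketched as follows. This is
a minimal illustration under stated assumptions: the two-neuron weights, the constant terms, the time
constant tau, the sigmoid gradient lam and the step size dt are hypothetical values chosen for the
example, and all neurons are updated synchronously for simplicity instead of in a random sequence.

```python
import math


def logistic(net, lam=1.0):
    """Sigmoidal output function bounded by 0 and 1, as in Eq. (3)."""
    return 1.0 / (1.0 + math.exp(-lam * net))


def energy(x, W, I):
    """Hopfield energy function of Eq. (8)."""
    n = len(x)
    quad = sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    lin = sum(I[i] * x[i] for i in range(n))
    return -0.5 * quad - lin


def simulate(net, W, I, tau=1.0, lam=5.0, dt=0.01, steps=2000):
    """Euler simulation of Eq. (7); returns the final states and energy trace."""
    n = len(net)
    net = list(net)
    trace = []
    for _ in range(steps):
        x = [logistic(u, lam) for u in net]
        trace.append(energy(x, W, I))
        # Synchronous Euler step: every neuron sees the same state vector x.
        net = [net[i] + dt * (-net[i] / tau
                              + sum(W[i][j] * x[j] for j in range(n)) + I[i])
               for i in range(n)]
    x = [logistic(u, lam) for u in net]
    trace.append(energy(x, W, I))
    return x, trace


# Two mutually inhibiting neurons: E = 2*x0*x1 - x0 - x1, so the low-energy
# states are those in which exactly one neuron is switched on.
W = [[0.0, -2.0], [-2.0, 0.0]]
I = [1.0, 1.0]
x, trace = simulate([0.1, -0.1], W, I)
print(x, trace[0], trace[-1])
```

   Starting with a slight bias towards the first neuron, the states settle close to (1, 0), one of
the two local minima of this energy function, and in this small example the recorded energy does not
increase along the way.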
   The approach to solving optimisation problems using Hopfield networks is to choose the
weights $W_{ij}$ and constant terms $I_i$ to force the energy function and the optimisation objective
function to be equivalent. The optimisation problem is expressed as a single function to be
minimised, which incorporates all costs and constraints of the problem using a penalty function
approach. Notice that the weights $W_{ij}$ are simply the coefficients of the quadratic terms $x_i x_j$ in the
energy function, while the constant terms $I_i$ are the coefficients of the linear terms $x_i$. Once the
network weights and constants have been chosen, the neuron states x are randomly initialised, and
the neurons begin updating in a random sequence according to differential Eq. (7). Over time, the
energy function minimises until the neuron states have stabilised, and the final neuron states
correspond to a local minimum solution of the optimisation problem. This solution may not
necessarily be a feasible one or a good one since the penalty function treatment of the cost and
constraints means that a balance needs to be found between which components of the energy
function are minimised. Penalty function parameters need to be selected to reflect the relative
degree of difficulty in minimising each component of the energy function. Numerous researchers
have tried to alleviate this problem by modifying the energy function form [14], or by analytically
choosing values for the penalty parameters [15,16].
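
   As a worked illustration of this mapping, consider a hypothetical toy problem that is not taken
from the paper: choose exactly one of N items at minimum cost $c_i$, with the constraint enforced
through a penalty parameter A > 0 (the symbols $c_i$ and A are assumptions of this example).
Expanding the penalised objective and matching terms against the energy function (8) gives the
weights and constant terms directly:

```latex
\begin{align*}
F(x) &= \sum_{i=1}^{N} c_i x_i + A\Bigl(\sum_{i=1}^{N} x_i - 1\Bigr)^{2} \\
     &= A \sum_{i=1}^{N}\sum_{j=1}^{N} x_i x_j + \sum_{i=1}^{N} (c_i - 2A)\, x_i + A .
\end{align*}
% Matching terms with E = -\tfrac{1}{2} \sum_i \sum_j W_{ij} x_i x_j - \sum_i I_i x_i gives
\[
W_{ij} = -2A, \qquad I_i = 2A - c_i, \qquad F(x) = E(x) + A .
\]
```

   The weights are symmetric as required, and the constant A does not affect the location of the
minima. A larger A enforces the constraint more strongly at the expense of the cost term, which is
the balance between energy function components described above.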
   Clearly, Hopfield networks are a steepest descent technique for solving an optimisation problem
using a penalty function approach. The performance of Hopfield networks has been improved by
incorporating hill-climbing strategies into the neuron update equations (7), like simulated annealing
[17]. Variations of the Hopfield network include Boltzmann machines [18] and mean-field
annealing [19]. Enhancements to these approaches such as neuron normalisation [19] have
enabled certain hard constraints to be enforced by the neuron updating, rather than relying on
a penalty function approach. We refer the interested reader to Smith et al. [6,20] for a comprehensive
discussion of the issues involved with using Hopfield neural networks and their variations for
solving optimisation problems.

2.3. Self-organising neural networks

   For many decades, statisticians have used discriminant analysis and regression to model the
patterns within data when there are labelled training data (with inputs and known outputs)
available, and clustering techniques when no such data are available. These techniques find
analogies in neural networks, where MFNNs are used with backpropagation when training data
are available, and self-organising neural networks are used as a clustering technique when no
training data are available. Clustering has always been used to group the data based upon the
natural structure of the data. The objective of an appropriate clustering algorithm is that the degree

                             Fig. 3. Architecture of a SOFM with nine neurons.

of similarity of patterns within a cluster is maximised, while the similarity these patterns have with
patterns belonging to different clusters is minimised.
   Often patterns in a high-dimensional input space have a very complicated structure, but this
structure is made more transparent and simple when they are clustered in a one, two or three
dimensional feature space. Kohonen [21,22] developed self-organising feature maps (SOFMs) as
a way of automatically detecting strong features in large data sets. SOFMs find a mapping from the
high-dimensional input space to low-dimensional feature space, so the clusters that form become
visible in this reduced dimensionality.
   In comparison with the two previous neural network models discussed, the SOFM involves
adapting the weights to reflect learning (like the MFNN with backpropagation) but the learning is
unsupervised since the desired network outputs are unknown. Another significant difference
between the SOFM and the previous models is the architecture and the role of neuron locations in
the learning process. In the SOFM, input vectors are connected to an array of neurons, usually
one-dimensional (a row) or two-dimensional (a lattice). Fig. 3 shows this architecture for n inputs
and a square array of nine neurons.
   When an input pattern is presented to the SOFM, certain regions of the array will become active,
and the weights connecting the inputs to those regions will be strengthened. Once learning is
complete, similar inputs will result in the same region of the array becoming active or "firing".
Central to this idea is the notion of the ordering and physical arrangement of the neurons. With
SOFMs the ordering of the neurons is important since we are referring to regions of neurons firing.
If a neuron fires, it is likely that its neighbours will also fire, and thus for the first time we are
concerned with the physical location of the neurons. This idea has more biological justification
than the other neural models, since the human brain involves large regions of neurons operating in
a centralised and localised manner to achieve tasks. In the human brain, as in the SOFM, there is
usually a clear "winning neuron" which fires the most upon receiving an input signal, but the
surrounding neurons also get affected by this, firing a little, and the entire region becomes active.
   In order to replicate the response of the human brain in the SOFM, the learning process is
modified so that the winning neuron (defined as the neuron whose weights are most similar to the
input pattern) receives the most learning, but the weights of neurons in the neighbourhood of the

                  Fig. 4. Concept of neighbourhood size for a rectangular array of neurons.

winning neuron are also strengthened, although not as much. It is appropriate at this point to
define the concept of a neighbourhood in relation to the architecture of the SOFM. For a linear
array of neurons, the neighbours are simply the neurons to the left and right of the winner. This is
called a neighbourhood size of one. To achieve the effect of an active region of neurons, we need to
consider larger neighbourhood sizes, as shown in Fig. 4 for a rectangular array of neurons, with
a hexagonal neighbourhood structure.
   Initially the neighbourhood size around a winning neuron is allowed to be quite large to
encourage the regional response to inputs, but as the learning proceeds, the neighbourhood size is
slowly decreased so that the response of the network becomes more localised. The localised
response, which is needed to help clearly differentiate distinct input patterns, is also encouraged by
varying the amount of learning received by each neuron within the winning neighbourhood. The
winning neuron receives the most learning at any stage, with neighbours receiving less the further
away they are from the winning neuron.
   Let us denote the size of the neighbourhood around winning neuron m at time t by $N_m(t)$. The
amount of learning that every neuron i within the neighbourhood of m receives is determined by

      $c = \alpha(t) \exp(-\|r_i - r_m\| / \sigma^2(t))$,                                              (9)

where $\|r_i - r_m\|$ is the physical distance (number of neurons) between neuron i and the winning
neuron m. The two functions $\alpha(t)$ and $\sigma^2(t)$ are used to control the amount of learning each neuron
receives in relation to the winning neuron. These functions can be slowly decreased over time. The
amount of learning is greatest at the winning neuron (where i = m and $r_i = r_m$) and decreases the
further away a neuron is from the winning neuron, as a result of the exponential function. Neurons
outside the neighbourhood of the winning neuron receive no learning.
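
   A short numerical illustration of Eq. (9) may help. The two parameter functions are written
here as $\alpha(t)$ and $\sigma^2(t)$, and the values $\alpha = 0.5$ and $\sigma^2 = 0.5$ are
assumptions chosen for this example only. For the winning neuron and for neurons one and two
positions away from it, the amounts of learning are:

```latex
\begin{align*}
\|r_i - r_m\| = 0: \quad c &= 0.5\, e^{0}        = 0.5, \\
\|r_i - r_m\| = 1: \quad c &= 0.5\, e^{-1/0.5}   = 0.5\, e^{-2} \approx 0.068, \\
\|r_i - r_m\| = 2: \quad c &= 0.5\, e^{-2/0.5}   = 0.5\, e^{-4} \approx 0.009 .
\end{align*}
```

   Decreasing $\sigma^2(t)$ over time makes this fall-off steeper, so that learning concentrates on
the winning neuron, while decreasing $\alpha(t)$ scales down the learning received by every neuron.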
   Like the other neural network models considered thus far, the learning algorithm for the SOFM
follows the basic steps of presenting input patterns, calculating neuron outputs, and updating
weights. The differences lie in the method used to calculate the neuron output (this time based on
the similarity between the weights and the input), and the concept of a neighbourhood of weight
updates. The steps of the algorithm are as follows:

Step 1: Initialise
        - weights to small random values
        - neighbourhood size $N_m(0)$ to be large (but less than the number of neurons in one
           dimension of the array)
        - parameter functions $\alpha(t)$ and $\sigma^2(t)$ to be between 0 and 1
Step 2: Present an input pattern x through the input layer and calculate the closeness (distance) of
        this input to the weights of each neuron j:

             $d_j = \|x - w_j\| = \sqrt{\sum_i (x_i - w_{ji})^2}$.

Step 3: Select the neuron with minimum distance as the winner m
Step 4: Update the weights connecting the input layer to the winning neuron and its neighbouring
        neurons according to the learning rule

             $w_{ji}(t+1) = w_{ji}(t) + c\,[x_i - w_{ji}(t)]$,

        where $c = \alpha(t) \exp(-\|r_j - r_m\| / \sigma^2(t))$ for all neurons j in $N_m(t)$
Step 5: Continue from Step 2 for a fixed number of epochs; then decrease neighbourhood size, $\alpha(t)$ and $\sigma^2(t)$.
        Repeat until weights have stabilised.
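
The steps above can be sketched in Python for a linear array of neurons. This is a minimal
illustration under stated assumptions, not the authors' implementation: the function names, the
initial values of the two parameter functions (called alpha and sigma2 here), the decay factor of
0.8, the number of epochs per stage and the stopping rule (stop once the neighbourhood size has
reached zero, instead of testing whether the weights have stabilised) are all hypothetical choices.

```python
import math
import random


def learning_amount(r_i, r_m, alpha, sigma2):
    """Eq. (9): learning received by the neuron at position r_i when r_m wins."""
    return alpha * math.exp(-abs(r_i - r_m) / sigma2)


def winner(x, weights):
    """Steps 2 and 3: index of the neuron whose weights are closest to x."""
    dists = [math.sqrt(sum((x_i - w_ji) ** 2 for x_i, w_ji in zip(x, w_j)))
             for w_j in weights]
    return dists.index(min(dists))


def update(x, weights, m, size, alpha, sigma2):
    """Step 4: move the winner m and its neighbours towards x (in place)."""
    for j in range(max(0, m - size), min(len(weights), m + size + 1)):
        c = learning_amount(j, m, alpha, sigma2)
        weights[j] = [w_ji + c * (x_i - w_ji) for x_i, w_ji in zip(x, weights[j])]


def train(data, n_neurons, epochs_per_stage=10, seed=0):
    """Steps 1 to 5 for a linear array, with an assumed decay schedule."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Step 1: small random weights, large neighbourhood, parameters in (0, 1).
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
               for _ in range(n_neurons)]
    size, alpha, sigma2 = n_neurons - 1, 0.9, 0.9
    while True:
        for _ in range(epochs_per_stage):
            for x in data:
                update(x, weights, winner(x, weights), size, alpha, sigma2)
        if size == 0:
            break
        # Step 5: shrink the neighbourhood and decay both parameter functions.
        size, alpha, sigma2 = size - 1, alpha * 0.8, sigma2 * 0.8
    return weights


# A single hand-checkable update on three neurons with known weights.
weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
x = [0.9, 1.2]
m = winner(x, weights)
update(x, weights, m, size=1, alpha=0.5, sigma2=0.5)
print(m, weights)

# Full training run on two small groups of points.
data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
trained = train(data, n_neurons=3)
print(trained)
```

In the hand-checkable update the middle neuron wins and moves halfway towards the input, while its
two neighbours move by a much smaller amount, as Eq. (9) prescribes.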

SOFMs have been predominantly used for clustering and feature extraction, finding application as
a data mining technique. As such, they are comparable to traditional clustering techniques like the
k-means algorithm [23]. There has also been quite a significant amount of research undertaken in
using SOFMs for solving optimisation problems as an alternative to the Hopfield neural networks
discussed in the previous section. This involves combining the ideas of the SOFM with the elastic
net algorithm [24] to solve Euclidean problems like the travelling salesman problem [25,26]. In
recent work, a modified SOFM has been used to solve broad classes of optimisation problems by
freeing the technique from the Euclidean plane. We refer the reader to Smith et al. [6,20] for more
details of this and other self-organising approaches to optimisation.

2.4. Other neural network models

  There are many other different types of neural network models, each with their own purpose and
application areas. Most of these are extensions of the three main models we have discussed here.
Their potential application to problems of concern to the business world and the operations
researcher is unclear, but they are referenced here for completeness. These other neural network
models include adaptive resonance networks [27], radial basis networks [28], modular networks
[29], neocognitron [30], brain-state-in-a-box [31], to name just a few.

3. History of neural networks in business

  The history of neural network development can be divided into five main stages, spanning over
150 years. These stages are shown in Fig. 5, where key research developments in computing and

       Fig. 5. The five stages of neural network research development, and its business impact.

neural networks are listed along with evidence of the impact these developments had on the
business community. The subdivision of this history into five stages is not the only viewpoint, and
many other excellent reviews of historical developments have been written [8,32,33]. The five
stages proposed here, however, each reflect a change in the research environment and the
resourcing and interests of business.
   Much of the preliminary research and development was achieved during Stage 1 which here is
considered to be pre-World War II (i.e. prior to 1945). During this time most of the foundations for
future neural network research had been formed. The basic design principles of analytic engines
had been invented by Charles Babbage in 1834, which became the forerunner to the modern
electronic computer. The ability of these analytic engines and adding machines to automate tedious
calculations led to their widespread use by 1900 (the US government used such machines for the
1890 national census), and International Business Machines (IBM) was founded in 1914 to capture
this market. Meanwhile researchers in psychology had been exploring the human brain and
learning. William James' 1890 book Psychology (see James [34]) discussed some of the early
insights researchers had into the nature of brain activity. In 1904, Ivan Pavlov received a Nobel
Prize for his work on conditional learning (see Schultz [35]), which became extremely important
for subsequent researchers in neural networks. Between the two World Wars, Alan Turing
investigated computing devices which used the human brain as a paradigm, and the field of
artificial intelligence was born. This first stage of preliminary research concludes with the first basic
attempts to mathematically describe the workings of the human brain. McCulloch and Pitts' (1943)
paper entitled "A logical calculus of the ideas immanent in nervous activity" proposed a simple
neuron structure with weighted inputs and neurons which are either "on" or "off" [36]. At this
stage, however, these neural networks could not learn, and the lack of suitable computing resources
stifled experimentation.
   Stage 2 is characterised by the age of computer simulation. In 1946, Wilkes designed the first
operational stored-program computer. Over the ensuing years, the development of electronic
computers progressed rapidly, and in 1954 General Electric Company became the first corporation
to use a computer when they installed a UNIVAC I to automate the payroll system (see Turban
et al. [37]). The advances in computing enabled neural network researchers to experiment with
their ideas, and in 1949 Donald Hebb wrote The Organization of Behaviour, where he proposed
a rule to allow neural network weights to be adapted to reflect the learning process explored by
Pavlov [38]. In 1954, Marvin Minsky built the first NeuroComputer based on these principles. In
the summer of 1956, the Dartmouth Summer Research Project was held and attracted the leading
researchers at the time. The field of neural networks was officially launched at this meeting.
Rosenblatt's Perceptron model soon followed in 1957, and many simple examples were used to
show the learning ability of neural networks. By this stage the fields of artificial intelligence and
neural networks were causing much excitement amongst researchers, and the general public was
soon to become captivated by the idea of "thinking machines". In 1962, Bernard Widrow appeared
on the US documentary program Science in Action and showed how his neural network could learn
to predict the weather, blackjack, and the stock market. For the remainder of the 1960s this
excitement continued to grow.
   Then in 1969 a book was published which severely dampened this enthusiasm. The book
was Minsky and Papert's Perceptrons (1969), which proved mathematically that Perceptrons are
incapable of learning any problem containing data that are linearly inseparable [39]. The

consequence of their book was that much neural network research ceased. This is the third stage,
commonly called the "quiet years" from 1969 until 1982. During this time, however, there were
significant developments in the computer industry. In 1971 the first microprocessor was developed
by the Intel Corporation. Computers were starting to become more common in businesses
worldwide, and several computer companies and software companies were formed during the
mid-seventies. SPSS Inc. and Nestor Inc. in 1975 and Apple Computer Corporation in 1977 are
a few examples of companies which formed then, and later became heavily involved in neural
networks. In 1981, IBM introduced the IBM PC which brought computing power to businesses
and households across the world. While these rapid developments in the computing industry were
occurring, some researchers started looking at alternative neural network models which might
overcome the limitations observed by Minsky and Papert. The concept of self-organisation in the
human brain and neural network models was explored by Willshaw and von der Malsburg [40],
and consolidated by Kohonen in 1982 [21]. This work helped to revive interest in neural networks,
as did the efforts of Hopfield [12] who was looking at the concepts of storing and retrieving
memories. Thus, by the end of this third stage, research into neural networks had diversified, and
was starting to look promising again.
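   The limitation identified by Minsky and Papert can be illustrated with a small experiment. The following Python sketch is not taken from any of the works cited here: it assumes a single-layer perceptron with a step activation and the classic error-correction learning rule, and the names train_perceptron and accuracy are chosen purely for illustration. Trained on the linearly separable AND function the rule converges, whereas on the linearly inseparable XOR function no setting of the two weights and the bias can classify all four patterns correctly.

```python
# Illustrative sketch only: a single-layer perceptron with a step activation.
# Function names and parameter values are assumptions made for this example.

def predict(w, b, x1, x2):
    """Step activation applied to the weighted sum of the two inputs."""
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0


def train_perceptron(samples, epochs=100, lr=1.0):
    """Apply the error-correction rule until an epoch produces no mistakes."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            delta = target - predict(w, b, x1, x2)
            if delta != 0:
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
                errors += 1
        if errors == 0:  # every pattern classified correctly
            break
    return w, b


def accuracy(samples, w, b):
    """Fraction of patterns the trained perceptron classifies correctly."""
    correct = sum(1 for (x1, x2), t in samples if predict(w, b, x1, x2) == t)
    return correct / len(samples)


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]  # linearly separable
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # linearly inseparable

and_acc = accuracy(AND, *train_perceptron(AND))
xor_acc = accuracy(XOR, *train_perceptron(XOR))
print("AND accuracy:", and_acc)
print("XOR accuracy:", xor_acc)
```

   However long the training runs, the XOR accuracy stays below 100%, because no single straight line separates the two classes.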
   The period from 1983 until 1990 marks the 4th stage, in which neural network research blossomed. In 1983
the US government funded neural network research for the first time through the Defense
Advanced Research Projects Agency (DARPA), providing testament to the growing feeling of
optimism surrounding the field. An important breakthrough was then made in 1985 which
considerably influenced the future of neural networks. Backpropagation was discovered
independently by two researchers [41,42], providing a learning rule for neural networks which
overcame the limitations described by Minsky and Papert. In fact, backpropagation had
been proposed by Werbos [43] while he was a graduate student some 10 years earlier, but
remained undiscovered until after LeCun and Parker had published their work. The backpropagation
algorithm enabled complex problems, including linearly inseparable ones, to be learnt without the
limitations of Perceptrons.
Within a few years of its discovery the neural network field grew dramatically in size and momentum.
Rumelhart and McClelland's (1986) book [44], Parallel Distributed Processing, became the neural
network 'bible'. In 1987, the Institute of Electrical and Electronics Engineers (IEEE) held the 1st
International Conference on Neural Networks, and these conferences have been held annually ever
since. Many neural network journals emerged over the next few years, with notable ones being
Neural Networks in 1988, Neural Computation in 1989, and IEEE Transactions on Neural Networks in
1990. During this stage of rapid growth, the business world remained fairly untouched by neural
networks. A few companies specialising in neural networks formed, such as NeuralWare Inc. in
1987, and the reputation of neural networks in the business community was beginning to grow, but
it was not until the next stage that neural networks made their real and lasting impact in business.
   In 1991, the banks started to use neural networks to make decisions about loan applicants and
speculate about financial prediction (see Ref. [45]). This marks the start of the 5th Stage. Within
a couple of years many neural network companies had been formed including Neuraltech Inc. in
1993 and Trajecta Inc. in 1995. Many of these companies produced easy-to-use neural network
software containing a variety of architectures and learning rules. A survey of neural network
software products available in 1993 listed over 50 products, the majority of which were designed to
be run on a PC under Microsoft Windows (see DTI [46]). The impact on business was almost
instantaneous. By 1996, 95% of the top 100 banks in the US were utilising intelligent techniques
including neural networks [47]. Within competitive industries like banking, finance, retail, and
marketing, companies realised that they could use these techniques to help give them a 'competitive
edge'. In 1998, IBM announced a company-wide initiative for the estimated $70 billion
business intelligence market. Research during this 5th stage still continues, but it is now more
industry driven. Now that the business world is becoming increasingly dependent upon intelligent
techniques like neural networks to solve a variety of problems, new research problems are
emerging. Researchers are now devising techniques for extracting rules from neural networks, and
combining neural networks with other intelligent techniques like genetic algorithms, fuzzy logic
and expert systems. As more complex business problems are tackled, more research challenges are

4. Overview of business applications

   Over the last decade, neural networks have found application across a wide range of areas from
business, commerce and industry. In this section, an overview is provided of the kinds of business
problems to which neural networks are suited, with a brief discussion of some of the reported
studies relevant to each area. This overview is based upon some excellent review articles [3,48,49],
as well as many published studies.

4.1. Marketing

   The goal of modern marketing exercises is to identify customers who are likely to respond
positively to a product, and to target any advertising or solicitation towards these customers.
Target marketing involves market segmentation, whereby the market is divided into distinct groups
of customers with very different consumer behaviour. Market segmentation can be achieved using
neural networks by segmenting customers according to basic characteristics including demog-
raphics, socio-economic status, geographic location, purchase patterns, and attitude towards
a product [50]. Unsupervised neural networks can be used as a clustering technique to automati-
cally group the customers into segments based on the similarity of their characteristics [51].
Alternatively, supervised neural networks can be trained to learn the boundaries between customer
segments based on a group of customers with known segment labels, e.g. frequent buyer, occasional
buyer, rare buyer [52].
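   The unsupervised route to segmentation can be sketched in a few lines of Python. The example below is an assumption-laden illustration rather than the method of any study cited here: it uses simple winner-take-all competitive learning (a self-organising map with the neighbourhood function omitted), and the customer data, the starting prototypes and the function names nearest and competitive_learning are all hypothetical.

```python
# Illustrative sketch only: winner-take-all competitive learning used to
# group customers into segments. All data and names are hypothetical.

def nearest(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda k: sum((p - v) ** 2 for p, v in zip(prototypes[k], x)),
    )


def competitive_learning(data, prototypes, epochs=50, lr=0.3):
    """Repeatedly move the winning prototype a fraction lr towards each customer."""
    protos = [list(p) for p in prototypes]
    for _ in range(epochs):
        for x in data:
            k = nearest(protos, x)
            protos[k] = [p + lr * (v - p) for p, v in zip(protos[k], x)]
    return protos


# Hypothetical customers: (purchase frequency, average basket value),
# both already scaled to the range [0, 1].
customers = [
    (0.90, 0.80), (0.80, 0.90), (0.85, 0.75),  # frequent, high-value buyers
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.25),  # rare, low-value buyers
]
initial = [(0.6, 0.6), (0.4, 0.4)]  # assumed starting prototypes

prototypes = competitive_learning(customers, initial)
segments = [nearest(prototypes, c) for c in customers]
print(segments)
```

   No segment labels are supplied at any point: each customer is simply assigned to the prototype that has moved closest to it, which is the sense in which the grouping is automatic.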
   Once market segmentation has been performed, direct marketing can be used to sell a product to
customers without the need for intermediate action such as advertising or sales promotion.
Customers who are contacted are already likely to respond to the product since they exhibit similar
consumer behaviour as others who have responded in the past. In this way, marketers can save
both time and money by avoiding contacting customers who are unlikely to respond. Bounds and
Ross [53] showed that neural networks can be used to improve response rates from the typical one
to two percent, up to 95%, simply by choosing which customers to send direct marketing mail
advertisements to.
   Neural networks can also be used to monitor customer behaviour patterns over time, and to
learn to detect when a customer is about to switch to a competitor. The electronic storage of daily
transaction details enables us to anticipate consumer behaviours based upon learnt models, and
strategies can be devised for retaining customers who are identified as likely to switch to
a competitor (also known as 'churn'). Analysis of market research is also an area where
neural networks can be of benefit. Moutinho et al. [54] applied neural networks to analyse
responses to advertising and to determine the factors influencing the usage of ATMs by a bank's

4.2. Retail

   Businesses often need to forecast sales to make decisions about inventory, staffing levels, and
pricing. Neural networks have had great success at sales forecasting, due to their ability to
simultaneously consider multiple variables such as market demand for a product, consumers'
disposable income, the size of the population, the price of the product, and the price of com-
plementary products [52]. Forecasting of sales in supermarkets and wholesale suppliers has been
studied [55,56] and the results have been shown to perform well when compared to traditional
statistical techniques like regression, and human experts.
   The second major area where retail businesses can benefit from neural networks is in the area of
market basket analysis (see Bigus [57]). Hidden amongst the daily transaction details of customers
is information relating to which products are often purchased together, or the expected time delay
between sales of two products. Retailers can use this information to make decisions, for example,
about the layout of the store: if market basket analysis reveals a strong association between
products A and B then they can entice consumers to buy product B by placing it near product A on
the shelves. If there is a relationship between two products over time, say within 6 months of buying
a printer the customer returns to buy a new cartridge, then retailers can use this information to
contact the customer, decreasing the chance that the customer will purchase the product from
a competitor. Understanding competitive market structures between different brands has also been
attempted with neural network techniques [51].
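   The kind of association that market basket analysis looks for can be made concrete with a small Python sketch. It is not a neural network method and is not drawn from Bigus [57]: it simply counts how often pairs of products occur together and reports the standard support and confidence measures. The transactions, the min_support threshold and the function name pair_rules are assumptions made for illustration.

```python
# Illustrative sketch only: support and confidence for product pairs,
# computed by plain co-occurrence counting over hypothetical transactions.
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.3):
    """Return {(A, B): (support, confidence)} for rules 'A implies B'."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore duplicates within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both products
        if support >= min_support:
            rules[(a, b)] = (support, count / item_counts[a])
            rules[(b, a)] = (support, count / item_counts[b])
    return rules


transactions = [
    ["printer", "cartridge", "paper"],
    ["printer", "cartridge"],
    ["paper", "pens"],
    ["printer", "paper"],
    ["cartridge", "paper"],
]

rules = pair_rules(transactions)
for (a, b), (support, confidence) in sorted(rules.items()):
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

   A rule with high confidence, such as printer buyers also buying cartridges, is the sort of finding a retailer could act on when arranging shelves or timing a follow-up contact.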

4.3. Banking and finance

   One of the main areas of banking and finance that has been affected by neural networks is trading
and financial forecasting. Neural networks have been applied successfully to problems like derivative
securities pricing and hedging [58], futures price forecasting [59], exchange rate forecasting
[60] and stock performance and selection prediction [61–64]. The success stories are numerous
and have received much attention.
   There are many other areas of banking and finance that have been improved through the use of
neural networks, though. For many years, banks have used credit scoring techniques to determine
which loan applicants they should lend money to. Traditionally, statistical techniques have driven
the software. These days, however, neural networks are the underlying technique driving the
decision making [65,66]. Hecht-Nielson Co. have developed a credit scoring system which
increased profitability by 27% by learning to correctly identify good credit risks and poor credit
risks [48]. Neural networks have also been successful in learning to predict corporate bankruptcy
   A recent addition to the literature on neural networks in finance is the topic of wealth creation.
Neural networks have been used to model the relationships between corporate strategy, short-run
financial health, and the performance of a company [70]. This appears to be a promising new area
of application.
   Financial fraud detection is another important area of neural networks in business. Visa
International have an operational fraud detection system which is based upon a neural network,
and operates in five Canadian and 10 US banks [71]. The neural network has been trained to
detect fraudulent activity by comparing legitimate card use with known cases of fraud. The system
saved Visa International an estimated US$40 million within its first six months of operation alone
[72]. Neural networks have also been used in the validation of bank signatures [73], identifying
forgeries significantly better than human experts.

4.4. Insurance

   There are many areas of the insurance industry which can benefit from neural networks. Policy
holders can be segmented into groups based upon their behaviours, which can help to determine
effective premium pricing. Prediction of claim frequency and claim cost can also help to set
premiums, as well as to find an acceptable mix or portfolio of policy holder characteristics [74].
The insurance industry, like the banking and finance sectors, is constantly aware of the need to
detect fraud, and neural networks can be trained to learn to detect fraudulent claims or unusual
circumstances. The final area where neural networks can be of benefit is in customer retention [74].
Insurance is a competitive industry, and when a policy holder leaves, useful information can be
determined from their history which might indicate why they have left. Offering certain
customers incentives to stay, like reducing their premiums, or providing no-claims bonuses,
can help to retain good customers. Unfortunately, the competitive nature of the insurance
industry means that few details of successful applications of neural networks have been published.
The data mining company Trajecta advertises success within the
insurance industry, as does Risk Data Corporation (a subsidiary of Hecht-Nielson Company).
Risk Data Corporation used neural networks to detect fraudulent insurance claims for the
Workers' Compensation Fund of Utah, as well as estimating the financial impact of predicted
claims [75].

4.5. Telecommunications

   Like other competitive retail industries, the telecommunications industry is concerned with the
concepts of churn (when a customer joins a competitor) and winback (when an ex-customer returns).
Neural Technologies Inc. is a UK-based company which has marketed a product called DA Churn
Manager. Specifically tailored to the telecommunications industry, this product uses a series of
neural networks to: analyse customer and call data; predict if, when and why a customer is likely to
churn; predict the effects of forthcoming promotional strategies; and interrogate the data to find the
most profitable customers. Telecommunications companies are also concerned with product sales,
since the more reliant a customer becomes on certain products, the less likely they are to churn.
Market basket analysis is significant here, since if a customer has bought one product from
a common market basket (such as call waiting), then enticement to purchase the others (such as
caller identification) can help to reduce the likelihood that they will churn, and increase
profitability through sales.
  There are also many other applications of neural networks in the telecommunications industry,
and while these are more engineering applications than business applications, they are of interest to
the operations researcher because they involve optimisation. These include the use of neural
networks to assign channels to telephone calls [76], for optimal network design [77] and for the
efficient routing and control of traffic [78].

4.6. Operations management

   There are many areas of operations management, particularly scheduling and planning, where
neural networks have been used successfully. The scheduling of machinery [79], assembly lines
[17], and cellular manufacturing [80] using neural networks have been popular research topics
over the last decade. Other scheduling problems like timetabling [81], project scheduling [82] and
multiprocessor task scheduling [83] have also been successfully attempted. All of these approaches
are based upon the Hopfield neural network [12] and the realisation of Hopfield and Tank [13]
that these networks could solve complex optimisation problems. Recently, alternative neural
network approaches like neuro-dynamic programming [84] have also been used to solve related
   The use of neural networks in various operations planning and control activities is reviewed by
Garetti and Taisch [85], covering a broad spectrum of applications from demand forecasting to
shop floor scheduling and control. Balakrishnan et al. [86] use neural networks to integrate
marketing and manufacturing functions in an organization. A unique feature of this paper is the use
of both supervised and unsupervised learning modes in the neural network design. In addition,
using scheduling of jobs as an example, Gupta et al. [87,88] describe the use of neural networks for
selecting the most appropriate heuristic algorithm to use to solve a practical problem in operations
management. Neural networks have also been used in conjunction with simulation modeling to
learn better manufacturing system design [89].
   The other area of operations management which benefits from neural networks is quality control.
Neural networks can be integrated with traditional statistical control techniques to enhance their
performance. Examples of their success include a neural network used to monitor soda bottles to
make sure each bottle is filled and capped properly [90]. Neural networks can also be used as
a diagnostic tool, and have been used to detect faults in electrical equipment [91] and satellite
communication networks [92].
   Project management tasks have also been tackled using neural networks. Lind and Sulek [93]
report the use of MFNNs to forecast project completion times for knowledge work projects, while
Smith et al. [94] use neural networks for estimating several software metrics in software develop-
ment projects.

4.7. Other industries

   In this section we have examined some applications of neural networks to various sectors of
business: marketing, retail, banking and finance, insurance, telecommunications, and operations
management. There are of course many other industries which have benefitted from neural
networks over the last decade. Many commercially available products incorporate neural network
technology. IBM's computer virus recognition software IBM AntiVirus uses a neural network to
detect boot sector viruses. In addition to the viruses it was trained to detect, the software has also
caught approximately 75% of new boot viruses since the product was released. Sensory Inc. have
used neural networks to create a speech recognition chip, which is currently being used in
Fisher-Price electronic learning aids, and car security systems. Companies like Siemens use neural
networks to provide automation for manufacturing processes, saving operating costs and improv-
ing productivity. Handwritten character recognition software like that used in Apple Computer's
Newton MessagePad uses neural network technology as well. Details about many of these applica-
tions can be found in Knoblock [49].
   What emerges from this discussion is the complete diversity of the application areas which are
reaping the advantages and benefits of neural networks. The important point about these applications
is that they have effectively driven research over the last decade. Banks cannot reject a loan
applicant because their neural network advised them that the applicant would be a bad risk. They
must provide reasons why the application was not successful, and give suggestions as to how the
applicant could improve their chances next time. Because of these legal requirements, researchers
are now working on extracting rules from neural networks [95,96]. High demand for speech and
character recognition software means that researchers are constantly striving for faster and more
efficient algorithms to achieve the task. These demands from business and industry will continue to
drive research well into the next century.

5. Data mining

   And what is the role of data mining in this discussion? Data mining is an area which is captivating
the business world at the moment, and the operations researcher can find many opportunities for
engaging in consulting work or collaborative research with companies interested in data mining.
Data mining has emerged over recent years as an extremely popular approach to extracting
meaningful information from large databases and data warehouses [97]. The increased computerisa-
tion of business transactions, improvements in storage and processing capacities of computers, as
well as significant advances in knowledge discovery algorithms have all contributed to the evolution
of the field [57]. Neural networks (MFNNs and SOFMs) form the core of most commercial data
mining packages such as the SAS Enterprise Miner and the IBM Intelligent Miner. Other tools like
regression, classification (decision) trees, and advanced statistics modules are also often included.
   To the operations researcher, data mining is an opportunity to use traditional techniques, neural
networks, and other 'intelligent techniques' to help an organisation achieve its potential. While
data mining may therefore appear to be about using old techniques under a new name, it is the
methodology of data mining and the new range of applications that are generating the excitement.
There have been many studies published recently that demonstrate the benefits that can be brought
to an organisation through data mining [74,98–101].
   Data mining has not been without criticism, however, and it appears that some data mining
projects have been unsuccessful for a variety of reasons [102]. Perhaps the most perceptive quote
on this topic comes from Small [102], who observes:

  The new technology cycle typically goes like this: Enthusiasm for an innovation leads to
  spectacular assertions. Ignorant of the technology's true capabilities, users jump in without
  adequate preparation and training. Then, sobering reality sets in. Finally, frustrated and
  unhappy users complain about the new technology and urge a return to 'business as usual'.
   Certainly an understanding of the individual techniques that fall under the umbrella of data
mining, as well as adherence to a methodology, can prevent this scenario from occurring. It is for
this reason that the operations researcher is likely to find success when applying data mining: the
approach is a natural extension of an existing problem solving methodology. We refer the
interested reader to Berry and Linoff [103] for an excellent introduction to data mining
methodologies and techniques.

6. Conclusion

   This paper has reviewed neural network techniques in business from the perspective of the
operations researcher. The three main neural network approaches to solving business problems
have been introduced: multilayered feedforward neural networks, Hopfield neural networks, and
self-organising neural networks. Each of these techniques finds natural analogy with more traditional
statistical and operations research techniques, and these analogies have been discussed.
   There has been a certain amount of hype associated with neural and 'intelligent' techniques
which appears to have made the academic community sceptical about their merits. This is partly
due to the turbulent history of neural network development, which has been discussed in Section 3.
This paper has aimed to clarify the potential of these techniques in comparison with more
traditional approaches. The operations research reader will recognise neural network approaches
to solving business problems as very similar to statistical methods, with some relaxation of
assumptions and more flexibility.
   We have also provided an overview of some of the many business applications that have been
successfully tackled using neural networks. Data mining is one of the booming application areas at
the moment, and is an area where the operations researcher can find projects with industry. Neural
network research is now being driven by industry, as more business problems are attempted and
new research challenges emerge. Given the need for any successful research area to be responsive to
the interests of industry, the role of emerging technologies like neural networks and data mining in
operations research is clear.


Acknowledgements

   The authors express deep appreciation to the five reviewers for their constructive comments and
suggestions that improved the presentation of this paper.

References

 [1] Wong BK, Bodnovich TA, Selvi Y. A bibliography of neural network business application research: 1988–September
     1994. Expert Systems 1995;12(3):253–61.
 [2] Wong BK, Bodnovich TA, Selvi Y. Neural network applications in business: a review and analysis of the literature
     (1988–1995). Decision Support Systems 1997;19:301–20.
 [3] Wong BK, Lai VS, Lam J. A bibliography of neural network business application research: 1994–1998. Computers
     and Operations Research 2000;27(11–12):1045–76.
 [4] Sharda R. Neural networks for the MS/OR analyst: an application bibliography. Interfaces 1994;24:116–30.
 [5] Ignizio JP, Burke LI. Special Issue on: artificial intelligence, evolutionary programming and operations research.
     Computers and Operations Research 1996;23(6).
 [6] Smith KA. Neural networks for combinatorial optimization: a review of more than a decade of research.
     INFORMS Journal on Computing 1999;11(1):15–34.
 [7] Zhang HC, Huang SH. Applications of neural networks in manufacturing: a state-of-the-art survey. International
     Journal of Production Research 1995;33(3):705–28.
 [8] Zurada JM. An introduction to artificial neural systems. St. Paul: West Publishing, 1992.
 [9] Montana DJ. Neural network weight selection using genetic algorithms. In: Goonatilake S, Khebbal S, editors.
     Intelligent hybrid systems. Chichester: Wiley, 1995. p. 85–104.
[10] Sexton RS, Gupta JND, Smith BN, Montagno RV. Neural network training via genetic algorithm and backpropagation:
     an empirical comparison. Working paper, Dept. Management, Ball State University, Muncie,
     Indiana, 1998.
[11] Gupta JND, Sexton RS. Comparing backpropagation with a genetic algorithm for neural network training.
     Omega 1999;27:679–84.
[12] Hopfield JJ. Neural networks and physical systems with emergent collective computational abilities. Proceedings
     of the National Academy of Sciences of the USA 1982;79:2554–8.
[13] Hopfield JJ, Tank DW. Neural computation of decisions in optimization problems. Biological Cybernetics
[14] Brandt RD, Wang Y, Laub AJ, Mitra SK. Alternative networks for solving the travelling salesman problem
     and the list-matching problem. Proceedings International Conference on Neural Networks, Vol. 2, 1988. p.
[15] Hegde S, Sweet J, Levy W. Determination of parameters in a Hopfield/Tank computational network. Proceedings
     IEEE International Conference on Neural Networks, Vol. 2, 1988. p. 291–98.
[16] Lai WK, Coghill GG. Genetic breeding of control parameters for the Hopfield/Tank neural net. Proceedings
     International Joint Conference on Neural Networks, Vol. 4, 1992. p. 618–23.
[17] Smith KA, Palaniswami M, Krishnamoorthy M. Traditional heuristic versus Hopfield neural network approaches
     to a car sequencing problem. European Journal of Operational Research 1996;93:300–16.
[18] Ackley DH, Hinton GE, Sejnowski TJ. A learning algorithm for Boltzmann machines. Cognitive Science
[19] Van Den Bout DE, Miller III TK. Improving the performance of the Hopfield-Tank neural network through
     normalization and annealing. Biological Cybernetics 1989;62:129–39.
[20] Smith KA, Palaniswami M, Krishnamoorthy M. Neural techniques for combinatorial optimisation with applications.
     IEEE Transactions on Neural Networks 1998;9:1301–18.
[21] Kohonen T. Self-organized formation of topologically correct feature maps. Biological Cybernetics 1982;43:59–69.
[22] Kohonen T. Self-organisation and associative memory. New York: Springer, 1988.
[23] Hartigan JA. Clustering algorithms. New York: Wiley, 1975.
[24] Durbin R, Willshaw D. An analogue approach to the travelling salesman problem using an elastic net method.
     Nature 1987;326:689–91.
[25] Favata F, Walker R. A study of the application of Kohonen-type neural networks to the travelling salesman
     problem. Biological Cybernetics 1991;64:463–8.
[26] Goldstein M. Self-organizing feature maps for the multiple traveling salesman problem (MTSP). Proceedings
     IEEE International Conference on Neural Networks, 1990. p. 258–61.
[27] Carpenter GA, Grossberg S. The ART of adaptive pattern recognition by a self-organizing neural network. IEEE
     Computer 1988;21:77–88.
[28] Broomhead DS, Lowe D. Multivariable function interpolation and adaptive networks. Complex Systems
[29] Jacobs RA, Jordan MI. A competitive modular connectionist architecture. In: Lippmann RP et al., editors. Advances in neural
     information processing systems 3. San Mateo, CA: Morgan Kaufmann, 1991. p. 733–67.
[30] Fukushima K. Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition
     unaffected by shift in position. Biological Cybernetics 1980;36:193–202.
[31] Anderson JA, Silverstein JW, Ritz SA, Jones RS. Distinctive features, categorical perception, and probability
     learning: some applications of a neural model. Psychological Review 1977;84:413–51.
[32] Haykin S. Neural networks: a comprehensive foundation. Englewood Cliffs, NJ: McMillan, 1994.
[33] McCord Nelson M, Illingworth WT. A practical guide to neural nets. Reading, MA: Addison-Wesley, 1991.
[34] James W. Psychology: a briefer course. New York: Holt, 1890.
[35] Schultz DP, Schultz SE. A history of modern psychology. New York: Harcourt Brace, 1992.
[36] McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical
     Biophysics 1943;5:115–33.
[37] Turban E, McLean E, Wetherbe J. Information technology for management. New York: Wiley, 1996.
[38] Hebb DO. The organization of behavior: a neuropsychological theory. New York: Wiley, 1949.
[39] Minsky ML, Papert SA. Perceptrons. Cambridge, MA: MIT Press, 1969.
[40] Willshaw DJ, von der Malsburg C. How patterned neural connections can be set up by self-organization.
     Proceedings of the Royal Society of London, Series B 1976;194:431–45.
[41] LeCun Y. Une procedure d'apprentissage pour reseau a seuil assymetrique. Cognitiva 1985;85:599–604.
[42] Parker DB. Learning logic: Casting the cortex of the human brain in silicon. Technical Report TR-47. Center for
     Computational Research in Economics and Management, MIT, Cambridge, MA, 1985.
[43] Werbos PJ. Beyond regression: new tools for prediction and analysis in the behavioral sciences. Ph.D. Dissertation,
     Harvard University, Cambridge, MA, 1974.
[44] Rumelhart DE, McClelland JL. Parallel distributed processing: explorations in the microstructure of cognition.
     Cambridge, MA: MIT Press, 1986.
[45] PC/AI Magazine, May–June, 1991.
[46] DTI, Neural computing learning solutions. Directory of Neural Computing Suppliers, UK Department of Trade
     and Industry, 1993.
[47] Ernst and Young. American Bankers Association Special Report on Technology in Banking, 1996.
[48] Harston CT. Business with neural networks. In: Maren A, Harston C, Pap R, editors. Handbook of neural
     computing applications. CA: Academic Press, 1990.
[49] Knoblock C. Neural networks in real-world applications. IEEE Expert 1996;August:4–12.
[50] Dibb S, Simkin L. Targeting segments and positioning. International Journal of Retail and Distribution
     Management 1991;19:4–10.
[51] Reutterer T, Natter M. Segmentation based competitive analysis with MULTICLUS and topology representing
     networks. Computers and Operations Research 2000;27(11–12):1227–47.
[52] Venugopal V, Baets W. Neural networks and their applications in marketing management. Journal of Systems
     Management 1994;September:16–21.
[53] Bounds D, Ross D. Forecasting customer response with neural networks. Handbook of Neural Computation
[54] Moutinho L, Curry B, Davies F, Rita P. Neural networks in marketing. In: Computer modelling and expert
     systems in marketing. New York: Routledge, 1994.
[55] Kong JHL, Martin GM. A backpropagation neural network for sales forecasting. Proceedings IEEE International
     Conference on Neural Networks, Vol. 2, 1995. p. 1007–11.
[56] Thiesing FM, Middleberg U, Vornberger O. Short term prediction of sales in supermarkets. Proceedings IEEE
     International Conference on Neural Networks, Vol. 2, 1995. p. 1028–31.
[57] Bigus J. Data mining with neural networks. New York: McGraw-Hill, 1996.
[58] Hutchinson JM, Lo AW, Poggio T. A non-parametric approach to pricing and hedging derivative securities via
     learning networks. The Journal of Finance 1994;XLIX:851–89.
[59] Grudnitski G, Osburn L. Forecasting S&P and gold futures prices: an application of neural networks. The Journal
     of Futures Markets 1993;13:631–43.
[60] Leung MT, Chen AS, Daouk H. Forecasting exchange rates using general regression neural networks. Computers
     and Operations Research 2000;27(11–12):1093–1110.
[61] Barr DS, Mani G. Using neural nets to manage investments. AI Expert 1994;9:16–21.
[62] Kryzanowski L, Galler M, Wright DW. Using artificial neural networks to pick stocks. Financial Analysts
     Journal 1993;49:21–7.
[63] Motiwalla M, Wahab M. Predictable variation and profitable trading of U.S. equities: a trading simulation using
     neural networks. Computers and Operations Research 2000;27(11–12):1111–29.
[64] Swales GS, Yoon Y. Applying artificial neural networks to investment analysis. Financial Analysts Journal
[65] Jensen HL. Using neural networks for credit scoring. Managerial Finance 1992;18:15–26.
[66] West D. Neural network credit scoring models. Computers and Operations Research 2000;27(11–12):1131–52.
[67] Fletcher D, Goss E. Forecasting with neural networks: an application using bankruptcy data. Information
     & Management 1993;24:159–67.
[68] Udo G. Neural network performance on the bankruptcy classification problem. Computers & Industrial
     Engineering 1993;25:377–80.
[69] Wilson R, Sharda R. Business failure prediction using neural networks. Encyclopedia of Computer Science and
     Technology. New York: Marcel Dekker, 1997. Vol. 37(22), p. 193–204.
[70] St. John CH, Balakrishnan N, Fiet JO. Modeling the relationship between corporate strategy and wealth creation
     using neural networks. Computers and Operations Research 2000;27(11–12):1077–92.
[71] Goonatilake S, Treleaven P. Intelligent systems for finance and business. Chichester: Wiley, 1995.
[72] Holder V. War on suspicious payments. Financial Times, 7th February, 1995.
[73] Francett B. Neural nets arrive. Computer Decisions, 1989; 58–62.
[74] Smith KA, Willis RJ, Brooks M. An analysis of customer retention and insurance claim patterns using data
     mining: a case study. Journal of the Operational Research Society 2000, to appear.
[75] Hancock MF. Estimating dollar value outcomes of workers' compensation claims using radial basis function
     networks. In: Keller P et al., editors. Application of neural networks in environment, energy and health. Singapore:
     World Scientific, 1996. p. 199–208.
[76] Smith KA, Palaniswami M. Static and dynamic channel assignment using neural networks. IEEE Journal on
     Selected Areas in Communications 1997;15:238–49.
[77] Patterson R, Pirkul H. Heuristic procedure neural networks for the CMST problem. Computers and Operations
     Research 2000;27(11–12):1171–1200.
[78] Yuhas B, Ansari N. Neural networks in telecommunications. MA: Kluwer Academic Publishers, 1994.
[79] Foo YPS, Takefuji Y. Stochastic neural networks for job-shop scheduling. Proceedings of the IEEE International
     Conference on Neural Networks, Vol. 2, 1988. p. 275–290.
[80] Guerrero F, Lozano S, Canca D, Smith KA. Machine grouping in cellular manufacturing: a self-organising neural
     network. In: Bulsari AB et al., editors. Engineering benefits from neural networks. Turku, Finland: Systems
     Engineering Association, 1998. p. 374–77.
[81] Gislen L, Peterson C, Soderberg B. Teachers and classes with neural networks. International Journal of Neural
     Systems 1989;1:167–76.
[82] Padman R. Choosing solvers in decision support systems. A neural network application in resource-constrained
     project scheduling. In: Recent developments in decision support systems. Berlin: Springer, 1993.
[83] Ansari N, Zhang ZZ, Hou ESH. Scheduling computation tasks onto a multiprocessor system using mean field
     annealing of a Hopfield neural network. In: Wang J, Takefuji Y, editors. Neural networks in design and
     manufacturing. New Jersey: World Scientific, 1993.
[84] Secomandi N. Comparing neuro-dynamic programming algorithms for the vehicle routing problem with stochastic
     demands. Computers and Operations Research 2000;27(11–12):1201–25.
[85] Garetti M, Taisch M. Neural networks in production planning and control. Production Planning and Control
[86] Balakrishnan N, Chakravarty AK, Ghose S. Role of design-philosophies in interfacing manufacturing with
     marketing. European Journal of Operational Research 1997;103:453–69.
[87] Gupta JND, Tunc EA. Neural network approach to select scheduling heuristics for a two-stage hybrid flowshop.
     International Journal of Management and Systems 1997;13:283–98.
[88] Gupta JND, Sexton RS, Tunc EA. Selecting a scheduling heuristic through neural networks. INFORMS Journal
     of Computing 1999; in press.
 [89] Mollaghasemi M, LeCroy K, Georgiopoulos M. Application of neural networks and simulation modeling in
      manufacturing systems design. Interfaces 1998;28:100–14.
 [90] Glover DE. Neural nets in automated inspection. The Digest of Neural Computing 1988;2:1–17.
 [91] Jacubowicz O, Ramanujam S. A neural network model for fault diagnosis of digital circuits. Proceedings of the 1st
      IEEE International Conference on Neural Networks, Vol. 2, 1990. p. 611–14.
 [92] Casselman F, Acres JD. DASA/LARS, a large diagnostic system using neural networks. International Joint
      Conference on Neural Networks, Vol. 1, 1990. p. 565–72.
 [93] Lind MR, Sulek JM. A methodology for forecasting knowledge work projects. Computers and Operations
      Research 2000;27(11–12):1153–69.
 [94] Smith KA, Siew E, Milne B, Luxford K. Neural networks for software metrics estimation. In: Dagli C et al.,
      editors. Intelligent engineering systems through artificial neural networks. Vol. 9. New York: ASME Press, 1999.
      p. 1073–8.
 [95] Andrews R, Deiderich J, Tickle AB. Survey and critique of techniques for extracting rules from trained artificial
      neural networks. Knowledge-Based Systems 1995;8:373–83.
 [96] Lubinsky B, Kothari R. A function decomposition approach to rule formation and rule extraction. In: Dagli C et
      al., editors. Intelligent engineering systems through artificial neural networks. Vol. 7. New York: ASME Press,
      1997. p. 99–104.
 [97] French M. Mining for dollars: A $6.5 billion market by 2000. America's Network 1998;102:24.
 [98] Chan PK, Stolfo SJ. Toward scalable learning with non-uniform class and cost distributions: a case study in credit
      card fraud detection. Proceedings Fourth International Conference on Knowledge Discovery and Data Mining.
      Menlo Park, CA: AAAI Press, 1998. p. 164–8.
 [99] Filippidou D, Keane JA, Svinterikou S, Murray J. Data mining for business process improvement: applying the
      HyperBank approach. PADD98. Proceedings of the Second International Conference on the Practical Application
      of Knowledge Discovery and Data Mining, 1998. p. 155–66.
[100] Rauch J, Berka P. Knowledge discovery in financial data – a case study. Neural Network World
[101] Feelders AJ, le Loux AJF, van 't Zand JW. Data mining for loan evaluation at ABN AMRO: a case study. KDD-95
      Proceedings First International Conference on Knowledge Discovery and Data Mining. Menlo Park, CA: AAAI
      Press, 1995. p. 106–11.
[102] Small RD. Debunking data mining myths. Information Week 1997;20:55–60.
[103] Berry M, Linoff G. Data mining techniques. New York: Wiley, 1997.

   Kate A. Smith is a Senior Lecturer in the School of Business Systems at Monash University, Australia. She holds
a B.Sc(Hons) in Mathematics and a Ph.D. in Electrical Engineering, both from the University of Melbourne, Australia.
She is Director of the Data Mining Research Group in the Faculty of Information Technology at Monash University. Dr.
Smith has published a book on neural networks in business, and over 40 journal and international conference papers in
the areas of neural networks, combinatorial optimization, and data mining. Journals she has published in include
Computers and Operations Research, European Journal of Operational Research, IEEE Transactions on Neural Networks,
INFORMS Journal of Computing, Location Science, Journal of the Operational Research Society, IEEE Journal on Selected
Areas in Communications, etc. Dr. Smith serves as a referee for many journals in the field, and is a member of the
organizing committee for several international data mining and neural network conferences.
   Jatinder N.D. Gupta is Professor of Management, Information and Communication Sciences, and Industry and
Technology at Ball State University, Muncie, Indiana, USA. He holds a Ph.D. in Industrial Engineering (with
specialization in Production Management and Information Systems) from Texas Tech University. Coauthor of a
textbook in Operations Research, Dr. Gupta serves on the editorial boards of several national and international journals.
Recipient of an Outstanding Researcher Award from Ball State University, he has published numerous research and
technical papers in such journals as Computers and Operations Research, International Journal of Information Management,
Journal of Management Information Systems, Operations Research, IIE Transactions, Naval Research Logistics, European
Journal of Operational Research, etc. His current research interests include information technology, scheduling, planning
and control, organizational learning and effectiveness, systems education, and knowledge management.