                           Introduction: The mind as computer

How is it possible to have a science of the mind? To have a science of the mind, you have
to be able to observe the mind, or to infer its workings from something that is observable.
Let's take a look at these options in turn.

Observing the mind.

If you are a dualist, believing the mind to be immaterial, observation of the mind is
problematic. Observation seems to require some sort of physical contact with the thing
observed. It has to reflect light, distort a magnetic field, or collide with something. If the
mental is immaterial, it seems it must be undetectable. Of course, most dualists hold that
there is mind-body interaction: that pains cause screams, that intentions to move one's
arm cause one's arm to move, that bodily damage causes pain, and that light causes
visual experiences. This leads to the possibility of inferring the mental from its
environmental or bodily causes, and from its bodily effects. But there is a principled
difficulty with this idea. We may, indeed, infer a mental effect from a bodily cause, or a
mental cause from a bodily effect, but we will have no idea what these mental causes and
effects are like. We will, in short, have no idea what sort of mechanisms the mind harbors,
and hence no idea how it works to produce the effects it does, or to be affected as it is.
To get a feel for this, consider the following analogy. How might an early chemist have
explained chemical bonds? Two ideas dating from ancient times are (1) that atoms are
like burrs, and stick together, or (2) that they are like pieces of a three dimensional jigsaw
puzzle, or, more simply, fitted with hooks and eyes. The law of constant proportions--the
fact that elements regularly bond together in fixed ratios by weight--favors (2) over (1).
You get H2O because each hydrogen atom has one free hook, and each oxygen atom
has two free eyes. Burrs just clump willy-nilly. However, while (2) allows for H2, it ought,
contrary to fact, to be less stable than H2O, since it ought to be easier for two hooks to
come apart than for a hook to come out of an eye. And there shouldn't be any O2 at all.
You can do all this reasoning about (1) and (2) because you already understand, or can
readily investigate, the properties of burrs and of hooks and eyes. (1) and (2) simply
project those properties into the realm of the imperceptibly small. But suppose a defender
of (2) were to respond to the problem about O2 by suggesting that very, very small eyes,
unlike the medium sized ones with which we are all familiar, can hook together. Although
intended to save the theory from an objection, this reply would rather undermine it
altogether, for it would leave us clueless about what to expect of micro hooks and eyes,
thereby destroying the theory's explanatory and predictive power.
The moral of this story is not far to seek. Inferred mechanisms, if they are to have any
explanatory or predictive value, must be, to some extent anyway, understood
independently of the effects they are rung in to explain. And this is where the dualist is in
trouble, because mental mechanisms cannot, for the dualist, just be ordinary mechanisms
that happen to be hidden away in the mind. They only occur in the mind. They occur
nowhere else, and they do not operate on physical principles. So, although dualists
believe in mind-body interaction, the idea that the mental can be inferred from its bodily
causes and effects founders on the lack of any mental mechanisms to mediate bodily
causes and effects.
The answer to this difficulty seems obvious: introspection. The mind, as we all know, can
observe itself.
So the mind, according to the dualist, can be observed after all, and directly observed at
that. It is just the minds of others that cannot be directly observed. We will return shortly to
this idea. But first, let's see how the would-be psychologist fares if she is not a dualist, but
a materialist instead.
The materialist, surprisingly, has the same problem as the dualist because of something
we call Leibniz' Gap. Here is Leibniz' formulation of the Gap.
         “It must be confessed, moreover, that perception, and that which depends on it,
         are inexplicable by mechanical causes, that is, by figures and motions. And,
         supposing that there were a mechanism so constructed as to think, feel and have
         perception, we might enter it as into a mill. And this granted, we should only find on
         visiting it, pieces which push one against another, but never anything by which to
         explain a perception. This must be sought, therefore, in the simple substance, and
         not in the composite or in the machine.” (Leibniz, Monadology, sec. 17)
There is, as Leibniz points out in this famous passage, a gap between the concepts we
use to describe the mind, and those we use to describe the brain. So even if we are
convinced that the mind is the brain, or a process going on in the brain, physical
observation of the brain seems to give us data in the wrong vocabulary: synapses rather
than thoughts. When we look at a brain, even a living brain, we don't see thoughts. Or,
not to beg the question, we don't see anything we readily recognize as thoughts. If you
could put on Newton glasses and look at a billiard game in progress, you would see
vectors with centers of gravity at their tails. If you could put on Psychology glasses and
look at a living brain, you would, according to the materialist, see thoughts, and probably a
good deal more. But to build Psychology glasses, you would need to somehow bridge
Leibniz' Gap by correlating observed brain properties, events and processes with
psychological properties, events and processes. It seems that the only way you could do
this would be to rely on introspection to identify the psychological correlates of what you
could observe in the brain.
But this puts the materialist in the same boat as the dualist: relying on introspection to
generate an observational base in order to get a scientific psychology off the ground.
Leibniz' Gap may only be a conceptual gap, but it seems it is no easier to see across it
than it is to see across the metaphysical gap that separates the mind and body for the
dualist. And so it seemed to dualist and materialist alike that psychology must be founded
on introspection.

Structuralism

Unquestionably the most significant introspectionist program in the United States was the
"structuralism" of E. B. Titchener. Titchener was concerned to establish the claim that
the "new psychology" imported from Germany had made psychology a rigorous empirical
science. Lacking a nontrivial account of science, Titchener supported his claim by
emphasizing the analogies between psychology as he saw it and an established
experimental science--namely, physical chemistry. To understand Titchener's vision of
psychology, therefore, we do well to examine his model.
The core of physical chemistry in Titchener's time was the periodic table. The periodic
table allowed one to explain analytically an enormous number of the chemical properties
of compounds. It provided a list of chemical elements--i.e., components whose further
analysis is not theoretically significant for the explanation of properties in the intended
domain--together with a specification of the chemically important properties of those
elements. With these resources, it was possible to derive laws of composition--which
compounds are possible--and laws of instantiation--which properties a compound will
have given its constituents and structure--for a large number of the empirically established
chemical properties of substances. Titchener's idea was to provide for psychology what
the periodic table provided for chemistry, thereby making it possible to explain the
properties of mental events and processes by analyzing them into psychological
"elements": mental events that could not be further analyzed.
Since Titchener's elements are not things or substances but events, his program requires
some account of the origin of elements. He needed a general recipe for answering this
question: Why do we have just these elements present (in consciousness) at this time
rather than some others? There appear to be just three possible sources of elemental
mental events: either a current element is the effect of one or more previous elements, or
it is the effect of extramental stimuli, or both. Titchener allowed all three possibilities, but
he concentrated mainly on the second, probably because (i) extramental events are more
open to experimental manipulation, and (ii) under the influence of empiricist philosophy,
Titchener believed that perception is the most significant source of events in the mind.
The object of psychological theory, then, is to explain the origin and properties of the
contents of consciousness--e.g., feeling anger, the visual image of a pouncing cat, or the
experience of voluntary action. Suppose the feeling of anger has properties Q and R (as
revealed by introspection). To explain why anger has these properties, we are to
proceed by analyzing this feeling into its elements--call them x, y, and z. Then, appealing
to the properties of these elements and the laws of composition, we endeavor to show
why anger must have the properties Q and R. To explain why S was angry at some
particular time, we explain the occurrence of x, y, and z (tokens of the mental
element-types that make up anger) as effects of previous mental events and/or current
extramental stimuli. (Perhaps we shall also need to explain why conditions were propitious
for the combination of x, y, and z into anger. Compare: many chemical combinations
require a fair amount of heat, or a catalyst, to take place.)
The project, then, was to discover the fundamental and introspectively unanalyzable
elements of consciousness and to formulate the principles of combination whereby these
elements are synthesized into the complex and familiar experiences of ordinary life.
Every compound mental state or process was to be explained compositionally, the
characteristics of the whole derived from the characteristics of the parts and mode of
combination. Not surprisingly, introspectionists spoke of mental valences, of mental
equilibrium, and of mental elements neutralizing each other.
The fundamental project of mental analysis and synthesis never made significant
progress, however, and the reason is fairly clear. There were simply no technologies or
experimental procedures for analyzing a mental event or process: nothing like qualitative
analysis in chemistry. This was a fatal defect. You get a large explanatory payoff from
the strategy of explaining the observed properties of things in terms of the properties of
their elemental constituents and their mode of combination only if the properties of the
compound differ significantly from those of the constituents. Salt is nothing like chlorine or
sodium. But this means that simply observing salt will tell you nothing about its
constituents. You need to be able to analyze it--break it up into its components and isolate
them for study. Lacking anything comparable to the laboratory tools of analytical
chemistry, the student of introspective psychology was, in the end, simply left with passive
introspective observation, and this meant that no significant analysis could be forthcoming.
Everything, in effect, was an unanalyzable element.
Introspection as a method thus sorts ill with the explanatory strategy of the theory. This
strategy was to explain analytically the properties of the complex contents of
consciousness, and perhaps the capacities of the mind required for it to have such
contents. Since introspection is, at best, a form of observation, it can hope to yield
data--the properties of conscious contents--but it cannot hope to yield analyses. The
elements and their properties will not be "visible" when compounded unless we assume
that there is no serious composition at all. If we assume a bushel basket theory of
consciousness, a step even Hume did not take, then the properties of anger, say, will
simply be the union of the properties of the elements in consciousness when one is angry.
So the analysis of consciousness bogged down for lack of analytical tools. But the
correlative project to explain the elements of consciousness as responses to perceptual
stimulation did not, for here introspectionists had an experimental paradigm in the
Weber-Fechner experiments. Here is Titchener's textbook description of a typical
variation.
        Method--to find the numerical expression of Weber's law for noise--An ivory ball is let
        fall from two different heights, upon a hard-wood plate. The difference of intensity
        between the two sounds (i.e., the difference between the two heights of fall) must
        be slight. The two sounds are given in irregular order in different experiments (to
        avoid the influence of expectation), and the subject is required to say, in each case,
        whether the second is louder than the first. In 100 experiments, he will give a
        certain number of right answers, and a certain number of wrong.
        The method assumes that if the two sounds are just noticeably different in intensity,
        the subject will give about 80% right and 20% wrong answers. This proportion is
        calculated by what mathematicians call the 'law of probability.' Now suppose that a
        certain difference gave 70 right and 30 wrong answers in 100 experiments. We
        could calculate, by aid of the integral calculus, how much larger the difference must
        have been to give 80 right and 20 wrong-i.e., to be just noticeable. The calculated
        difference (difference of height of fall) is the numerator, and the original intensity
        (height of fall) of the weaker sound, the denominator, of the fraction which
        expresses Weber's law. [Titchener, 1897, pp. 81-82]

Here is a different and more fundamental description:

       Suppose that we are investigating the intensity of noise. We shall begin with a
       stimulus of moderate intensity: say, the noise made by the fall of an ivory ball upon
       a wood plate from a height of 90 cm. We will call the intensity of this sensation 1.
       If we gradually increase the height of fall, we shall reach a point at which the noise
       of the fall is just noticeably greater than the original noise. We may call the
       intensity of this second sensation 2. If we further increase the height of fall, we shall
      presently get a noise, 3, which is just noticeably louder than 2; and so on. Now
      what are the different heights of fall-i.e., intensities of stimulus-necessary to arouse
       sensations of the intensities 2, 3, 4, etc.? ... An addition of 30 cm. suffices to raise
       the intensity of sensation from 1 to 2; but if we are affected by the stronger stimulus
      120 cm., we must add more than 30 to it to change intensity 2 to intensity 3. In
      other words: change in the intensity of sensations does not keep even pace with
      change in the intensity of the stimuli which occasion them.
              Experiment enables us to replace this general statement of the relation of
      sensation intensity to stimulus intensity by a definite scientific law. If sensations
      are to increase in intensity by equal amounts, their stimuli must increase by
       relatively equal amounts. [Titchener, 1897, pp. 79-80]

This sort of experiment, begun by Weber (1834) and refined and expanded by Fechner
(1860), led to Fechner's law (mistakenly called Weber's law by Titchener and
Fechner--compare Gregory, 1981, pp. 500-505). Here was a law and a procedure for
testing and refining it. Does the law hold for all sensation? What are the values of the
constants of proportionality for each case? Work on these matters proceeded apace.
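The quantitative content is easy to restate. Weber's law says that the just noticeable
difference is a constant fraction of the stimulus already present; Fechner's version makes
sensation intensity grow with the logarithm of stimulus intensity. Here is a minimal numerical
sketch of our own (it is not part of Titchener's text), assuming the one-third Weber fraction
implicit in the ivory-ball example quoted above, where adding 30 cm to a 90 cm fall is just
noticeable:

    import math

    WEBER_FRACTION = 1.0 / 3.0   # 30 cm added to a 90 cm fall is just noticeable
    START_HEIGHT = 90.0          # the height of fall assigned sensation intensity 1

    def stimulus_for_sensation(n):
        """Height of fall needed to arouse sensation intensity n (1, 2, 3, ...).
        Each just-noticeable step multiplies the stimulus by (1 + Weber fraction), so
        equal steps in sensation require relatively, not absolutely, equal increases."""
        return START_HEIGHT * (1.0 + WEBER_FRACTION) ** (n - 1)

    def sensation_for_stimulus(height):
        """The Fechner form: sensation grows with the logarithm of the stimulus."""
        return 1.0 + math.log(height / START_HEIGHT) / math.log(1.0 + WEBER_FRACTION)

    for n in range(1, 6):
        h = stimulus_for_sensation(n)
        print(f"sensation {n}: height of fall = {h:.1f} cm "
              f"(recovered sensation = {sensation_for_stimulus(h):.2f})")

The printed heights (90, 120, 160, 213, 284 cm) reproduce the pattern in the second
quotation: the step from sensation 2 to sensation 3 requires an addition of 40 cm, not 30.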
But this apparent bright spot in the program proved to be an Achilles' heel. Although the
analysis of consciousness languished, there were no generally accepted or clearly
articulated canons for evaluating the structural-analytic explanations of complex mental
processes like anger envisioned above, especially as applied to consciousness. Hence
there was no way to determine whether the trouble was merely
practical or deeply conceptual. On the other hand, there did exist well-articulated and
generally accepted canons for the evaluation of the sort of explanations Fechner's law
was used to construct, since the idea is to explain an event in consciousness--e.g., a
discernible difference in loudness--as an effect of an external cause--a change in stimulus
strength. Critics knew how to hunt down and articulate problems with this sort of
paradigm, and they did so.
We can bring out the problem in a few lines. The canon requiring independent access to
causes and effects applies to the Weber-Fechner experiments as follows. Suppose we
find a subject whose responses don't fit the law. Is the subject (A) misdescribing his/her
sensations, or (B) psychologically idiosyncratic? For that matter, how do we know that a
subject responding normally is not in fact (C) psychologically idiosyncratic but
systematically misdescribing experience, rather than (D) psychologically normal? We
cannot compare a subject's descriptions with what is supposed to be described. Our only
access to the sensations of the subject is (i) inference from established connections with
responses, including verbal reports, and (ii) inference from established connections with
stimuli. Obviously we cannot establish connections of the sort required unless we can, at
least sometimes, distinguish (A) from (B) and (C) from (D). But we cannot make these
distinctions unless we can establish the connections. Since the accuracy of introspective
observation cannot be checked, it seems it cannot play the role required of scientific
observation. With introspection ruled out of court, the only way to measure sensation
intensity is to measure stimulus intensity and then calculate sensation intensity using
Fechner's law, but if we do this, we are using Fechner's law to define sensation intensity,
and we cannot then turn around and pretend to explain the intensity of a sensation by
appeal to stimulus intensity and Fechner's law. Once introspection is disqualified, we
have no access to sensation intensity other than the very law that is supposed to explain
it.

Behaviorism.

Thus it was that introspectionists became vulnerable to a powerful methodological attack.
The methodology of explanation by causal subsumption had been well entrenched by
Bacon, Berkeley, Hume, Mach, and Mill. The empiricist doctrine was (and is) that causal
laws have no explanatory power unless the causes and effects they subsume are
knowable independently of each other. It is ironic that this doctrine was so fatal to
introspectionism, for introspection was held to be the only avenue to noninferential
knowledge by the very empiricist philosophers who developed the line of argument that
killed introspectionist psychology. Locke's inverted spectrum problem returned to haunt
those who attempted to pick up where Book Two of the Essay Concerning Human
Understanding left off. The inverted spectrum problem turned on the conceptualist
assumption that linguistic training would inevitably disguise sufficiently systematic
"psychological" differences. This is just the possibility raised by (A) and (C).
Whatever we may think of this critique, the mere certainty that it could be formulated was
eventually enough to kill any psychology based on introspection. And just as Berkeley's
critique of representational realism seemed to point inevitably to a single alternative (if we
know about tables and can know only about ideas, then tables are ideas), so this critique
of introspectionism seemed to point inevitably to a single alternative. Introspection isn't
genuine observation, so the Weber-Fechner law cannot be about consciousness. It is
obviously about something, though, for experimenters certainly observe and record
something in the sort of experiment described by Titchener in the passages above.
What? Since experimenters record what their subjects say, the Weber-Fechner law must
correlate stimuli with "introspective behavior." This is verbal behavior, per accidens, in the
usual experimental set-up, but button pressings would do just as well.
This bit of diagnostics would probably have sufficed to produce behaviorism eventually,
but a number of other factors conspired to guarantee a quick takeoff. Two, I think, are
especially worthy of note. First, pragmatism was in vogue in United States philosophical
circles, and pragmatists emphasized the importance of understanding connections with
action to the understanding of traditional philosophical problems involving mental states
and processes. According to Dewey, for instance, the central mistake of empiricists and
rationalists alike is the supposition that knowledge and belief can be understood
independently of action, and treated as antecedent conditions to be investigated in their
own right. Refuting this supposition was always a central theme in Dewey's writings. It
isn't a long step from this to the doctrine that talk of mental states is just shorthand for talk
of identifiable behavioral patterns. John Watson, the founder of behaviorism, was a
student at Chicago at a time when the influence of Dewey's pragmatism was very strong
there.
The other significant factor was Pavlov's discovery of stimulus substitution. This was an
important discovery in its own right, but in the intellectual climate we have been
describing, it had a special significance, for it seemed to account for the sort of
phenomenon generally attributed to the association of ideas, without recourse to ideas.
Someone taken with the empiricist critique of introspection, and the pragmatist treatment
of doxastic states, could hardly have failed to conclude that Pavlov had shown that it was
stimuli and responses, not ideas, that were associated. This had to seem a major
breakthrough, for association was the only principle of learning on hand.
Analysis in the Behaviorism of Watson. By itself, stimulus substitution has no chance of
explaining complex behavior, or the introduction of new responses. Pavlovian
conditioning can simply link new stimuli to responses already in the organism's repertoire,
and experimentation quickly revealed that the stimuli and responses involved had to be
rather short and simple. The new Pavlovian principle of association, though
experimentally demonstrable, seemed to be no more explanatory than the old version.
Watson brightened this dim scene with a simple strategy: analyze an extended behavioral
pattern into a sequence of responses to stimuli produced by execution of the previous
response. Consider playing a tune from memory on the piano. Initially we have a set of
connections between perceiving a written note (stimulus) and striking the appropriate key.
Now striking a particular key produces a corresponding stimulus--visual and
kinesthetic--that always immediately precedes striking the next key specified by the score.
Thus repetitious playing from the score should produce stimulus substitution--perception of
a previous response substituting for perception of the next note in the score. When
substitution is complete, the score will be unnecessary.
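The structure of this analysis is easy to lay out explicitly. The following toy sketch is our
illustration, not Watson's: the organism is just a table of S-R connections, and stimulus
substitution adds entries linking the feedback from each keystroke to the next keystroke in
the practiced sequence.

    # A toy rendering of Watson's response-chain analysis of playing a tune from memory.
    score = ["note C", "note E", "note G", "note C2"]               # written notes (stimuli)
    keystrokes = ["strike C", "strike E", "strike G", "strike C2"]  # responses

    # Initial endowment: each written note already elicits the corresponding keystroke.
    connections = {note: key for note, key in zip(score, keystrokes)}

    def feedback(keystroke):
        """The visual and kinesthetic stimulus produced by making a keystroke."""
        return "felt " + keystroke

    # Stimulus substitution: the feedback from each keystroke comes to elicit the keystroke
    # that, during practice from the score, always immediately followed it.
    for i in range(len(keystrokes) - 1):
        connections[feedback(keystrokes[i])] = keystrokes[i + 1]

    def play_from_memory(first_stimulus):
        """Run the chain: each response produces the stimulus that elicits the next."""
        stimulus, produced = first_stimulus, []
        while stimulus in connections:
            response = connections[stimulus]
            produced.append(response)
            stimulus = feedback(response)
        return produced

    print(play_from_memory("note C"))   # the whole tune, with no further use of the score

Writing the chain out this way also makes the objection that follows vivid: two tunes that
share a note would demand conflicting entries in the same table.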
This analysis fails for a number of rather boring reasons--e.g., it runs afoul of the fact that
people can play more than one tune from memory without difficulty even though the
different tunes share notes. But the strategy was promising and exciting because the
problem of explaining acquisition and exercise of a complex behavioral capacity is
reduced to the problem of analyzing the capacity into simple antecedently explained
capacities. A compellingly general picture of psychological change emerges. An
organism begins with a genetically determined endowment of S-R connections, some of
which, perhaps, emerge only at certain maturation stages. This basic endowment is
expanded via stimulus substitution. The resulting connections are then combined in
more or less complex sequences to yield an organism with a respectably complex
behavioral repertoire.
This picture is still the operative picture underlying behaviorist psychology. The shifts
have been shifts in detail. First, contemporary behaviorism is far more liberal about
genetic endowment than was Watson--the tabula isn't nearly so rasa. Second, and more
important, classical Pavlovian conditioning is supplemented by operant conditioning.
Since operant conditioning builds on emitted behavior rather than on preexisting S-R
connections, this change affects the assumptions about genetic endowment. Also, since
operant conditioning produces "shaping," contemporary behaviorism has a source of
novel, unanalyzable, behaviors available as building blocks. But the basic Watsonian
picture remains: significant psychological change is the result of composition of
antecedently explained (via genetics or shaping) behaviors. Hence, psychological
explanation must proceed by analyzing observed behaviors into more tractable
components. Thus it is analysis, not subsumption under causal law, that is the central
explanatory strategy of behaviorism.
Watson conceived of an organism as a transducer whose input-output characteristics
could be altered over time. To characterize the organism psychologically at a
moment in time is to specify the current input-output properties--a bundle of S-R
connections. The goal of psychological theory is to specify transition laws that subsume
successive changes in momentary input-output properties--e.g., the law of stimulus
substitution. It is therefore ironic that Watson never introduced a single principle of this
type. Instead, his major achievement was the introduction of the analytical strategy into
behaviorism in the guise of the response chain. Watson was fond of saying that the point
of scientific psychology is the prediction and control of behavior. But Watson's analysis
of habit, even had it been sound, would not have increased the power of psychology to
predict or control responses at all, though it would have greatly increased its explanatory
power. For example, the problem about playing a tune from memory was not that it was
unpredictable: whether or not a subject could do this was predictable from the amount of
practice. Watson's analysis did not alter this situation at all, for Watson did not isolate a
stimulus, or stimulus history, that has as response playing a tune from memory. What
Watson did was explain the capacity to play a tune from memory by analyzing it into
antecedently understood (or anyway antecedently present) capacities of the organism.
This in turn allowed Watson to describe (inaccurately as things turned out) the conditions
under which this capacity would be acquired, but these were already known, and, in any
case, this is not predicting a response.
Watson's presentation of his analysis of habit formation in Behaviorism (1924) is
introduced as a response to his discussion of the sort of learning Thorndike studied: what
we now think of as operant conditioning.

       ... Let us put in front of the three-year-old child, whose habits of manipulation are
       well established, a problem box-a box that can be opened only after a certain thing
       has been done.... Before we hand it to him, we show him the open box containing
       several small pieces of candy and then we close it and tell him that if he opens it he
       may have a piece of candy. ... Let us suppose that he has 50 learned and
       unlearned separate responses at his command. At one time or another during his
       first attempt to open the box, let us assume that he displays, as he will, nearly all of
       them before he pushes the button hard enough to release the catch. The time the
       whole process takes, we will say, is about twenty minutes. When he opens it, we
       give him his bit of candy, close the box and hand it to him again. The next time he
       makes fewer movements; the third time fewer still. In 10 trials or less he can open
       the box without making a useless movement and he can open it in two seconds.
                Why is the time cut down, and why do movements not necessary to the
       solution gradually drop out of the series? This has been a hard problem to solve
       because no one has ever simplified the problem enough really to bring
       experimental technique to bear upon it. [p. 204]

This is not even prima facie a problem of prediction and control; it is a problem of
explanation. It isn't at all clear how Watson's analysis is supposed to help with this
particular problem, but it is quite clear that we have a capacity that wants explaining, not a
response that wants predicting, and that the explanatory strategy employed is analysis,
not subsumption. The learning curves obtained by Thorndike and others specify a
capacity of organisms. It was this capacity that Watson sought to explain by analyzing it
into the capacity for stimulus substitution and the antecedently available capacities
characterized by the organism's pretrial S-R connections.
Behaviorism eventually came to grief because it had no resources to explain the
acquisition of novel responses. Pavlovian conditioning attaches new stimuli to old
responses, but introduces no new responses. Operant conditioning alters the probability
that a given behavior in the repertoire will be emitted, but doesn't add to the repertoire. It
is one of the great ironies of the history of science that behaviorism, which identified
psychology with learning theory, was ultimately unable to accommodate the principled
acquisition of novel behavior--learning, in short.
There is a far more fundamental problem with behaviorism, however. Even if the
behaviorist program had succeeded, it would have, at best, specified our psychological
capacities; it wouldn't have explained them. To see this, one has only to note that
behaviorism has, in principle, no resources to explain why an organism can be
conditioned, or why some schedules of reinforcement work better than others. One can
say which laws of conditioning characterize which types of organism, but one cannot say
why.
This is no accident, of course. Behaviorism seeks to avoid the problem about observing
the mind by eliminating the mind from psychology. But it is breathtakingly obvious that
we are conditionable, that we can learn language, that we can recognize objects, and so
on and so on, because of the way our minds work. Introspection does tell us that much.
It just doesn't tell us how our minds work. So, we are back where we started: you cannot
have a science of the mind unless you can observe the mental, and the only way to do
that appears to be introspection. But introspection cannot be calibrated, and is maddeningly
passive and hence unsuitable as an analytical tool. Behaviorism got around this by
banishing the mind from psychology, but most of the interesting questions were banished
along with it.
Inference to unobservables. Suppose we admit that most of the mind is
unobservable: why is that a problem? After all, science is full of unobservables. Why
cannot the machinery of the mind be inferred from its observable manifestations in the
way that genes were inferred from inheritance patterns?
Here, it might seem, the materialist has a real advantage, for, if minds are physical
systems, then mental mechanisms are physical mechanisms. Our lately imagined
chemist was able to reason cogently about the pros and cons of the burr versus
hook-and-eye accounts of chemical bonding because burrs and hooks-and-eyes were
antecedently well-understood physical mechanisms. But where is the immaterialist going
to find a stock of well-understood mental mechanisms which might be postulated to
explain observable psychological phenomena?
But, once again, the apparent advantage enjoyed by the materialist intent on a science of
the mind is undermined by Leibniz' Gap. For, while there are lots of well-understood
physical mechanisms for sticking things together, there are none for producing thoughts,
feelings or perceptions or sticking them together. Or rather, there are only two, namely
inference and association.
Philosophers and scientists have always followed common sense in explaining the
acquisition of certain beliefs and desires by appeal to inference. People are planners, and
planning requires inferring what the world will be like when one comes to the time and
place of action, inferring what changes one's contemplated actions will make in the world,
and inferring what sub-goals one must achieve in order to achieve the goal of the plan.
The inferentially mediated interplay of belief, desire and intention that is familiar to
common sense seems capable of explaining a vast amount of the kind of behavior that is
characteristic of thinking beings. Leibniz, Helmholtz and Freud extended this idea to other
phenomena (notably perception and affect) by allowing for unconscious inference as well
as conscious inference. But it remained a conceptual mystery how a physical device could
be an inference engine, and attributing inferential powers to an immaterial mind simply
made the mystery metaphysical. Thus inference, though a powerful explanatory process,
was itself left unexplained.
Association fared no better: though a variety of phenomena--especially memory
phenomena--could be explained by appeal to principles of association, there was no
prospect of explaining association itself.

Computationalism.

Computationalism provides a way around this impasse by proposing that mental
processes are computational processes, i.e., by proposing that the mind is a computer,
and that inference and association are just two, albeit important, computational processes
endogenous to the mind.
Functionalism. Computationalism is a species of functionalism. Functionalism is best seen
as a proposed solution to the problem posed by Leibniz' Gap. The central idea is that mental
concepts specify their instances in terms of what they do--in terms of their
functions--rather than in terms of their intrinsic structures. Doorstop, valve-lifter, mouse
trap, can opener, pump, and calculator are all functional concepts. A great variety of
different physical structures can be pumps: hearts, propeller and case, vibrator and
one-way valve, centrifuges, piston-and-sleeve arrangements. What they have in common
is a function: to pump. Since they do not have a physical composition in common, you
cannot reduce being a pump to having a certain physical structure. Yet certain physical
structures are sufficient for pumping; nothing non-physical is required. Functionalism in
the philosophy of mind is the proposal that the problem imagined by Leibniz arises
because one cannot, in general, read off function from form. Wandering through the
mill-sized mind is not enough, as Leibniz pointed out. But, according to functionalism,
what is missing is not an immaterial soul but a functional analysis of the mill and its
component structures and processes. Wandering through an expanded engine, you would
not, simply by looking, realize that the cam-shaft is a valve lifter, or that the things it
moves are valves. Of course, most of us know enough simple push and pull mechanics
that we could make some shrewd guesses. But a comparable experiment with the
micro-chip at the heart of a calculator or computer would leave most of us on the wrong
side of a Leibnizian Gap.
Computationalism--the idea that the mind is what Haugeland calls an automatic formal
system--is the functionalist proposal that mental capacities can be analyzed and explained
as complex computational processes. This idea was given a huge impetus by the dual
discovery that inference could be treated as a computational process, and that that
process could be instantiated in a machine. Psychology's oldest and most powerful
explanatory primitive was finally given a materialistic explanation, though not yet a
biological one. Association proved relatively easy to implement computationally and found
its most important home in semantic networks.
Computationalism requires some fundamental enabling assumptions to turn it into a
serious research program. The first of these is that the mind is fundamentally an engine of
thought. Descartes held that the essence of mind is thought, and Locke that the essence of
mind is the capacity for thought. We think of this as the Mr. Spock assumption: a thinking
engine with no other mental characteristics would still count as a mind, but a system with
emotions, sensations and other non-cognitive mental processes that did not think
(assuming this is even possible) would not count as a mind. Computationalism proposes
to follow Descartes in the assumption that the place to start in understanding the mind is
thought or cognition. Other aspects of mentality can be added on later, much as terms for
friction and air resistance can be added to the basic pendulum equation once it is
articulated and understood.
The second fundamental assumption grounding Computationalism is that thought does
not require consciousness. This assumption, along with the first one, allows
computationalists to put off the explanation of consciousness until some future time.
Given these assumptions, computationalism looked like an attractive research program for
psychology. The idea that the mind is essentially a functionally specified computational
process running on the brain provides a bridge over Leibniz' Gap (functionalism), a supply
of mental mechanisms with precisely specified properties (anything you can program),
and medium independence: the possibility that thought can exist in a non-biological
computer, and hence can be investigated in the computer lab as well as in the
psychological lab. It was a powerful vision. And though it shows signs of fading today, it
was, and in some respects continues to be, a hugely prolific vision, fueling the initial birth
and development of what came to be called Cognitive Science.


Introduction: The mind as neural network

Top Down vs. Bottom Up.

Top Down. Computationalism is what is called a "top down" strategy. In the hands of the
computationalists, that strategy, classically characterized by Marr in the last section,
begins by identifying a task or capacity to be explained--i.e., with the explanandum (the
thing to be explained): the capacity to learn a language, or converse, or solve a problem,
etc. It then attempts to specify that capacity as a function or relation: What inputs produce
what outputs under what circumstances? Finally, that characteristic function or relation is
analyzed into components that have known computational realizations. (In practice, this
means analysis into components that can be programmed in LISP or some other standard
programming language.)
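To see the shape of the strategy in miniature, consider a deliberately trivial capacity,
square-root extraction (our illustration, not an example from the text): the capacity is
specified as an input-output function and then analyzed into components that have known
computational realizations.

    # Step 1: the explanandum -- the capacity to extract square roots.
    # Step 2: specify it as a function: given x >= 0, return y such that y * y is close to x.
    # Step 3: analyze that function into antecedently available components
    #         (subtraction, division, averaging, iteration).

    def improve(guess, x):
        """One component: refine a guess by averaging it with x / guess (Newton's method)."""
        return (guess + x / guess) / 2.0

    def close_enough(guess, x, tolerance=1e-10):
        """Another component: test whether the guess squares to (nearly) x."""
        return abs(guess * guess - x) < tolerance

    def square_root(x):
        """The analyzed capacity: iterate the improvement step until the test is passed."""
        guess = 1.0
        while not close_enough(guess, x):
            guess = improve(guess, x)
        return guess

    print(square_root(2.0))   # approximately 1.4142135...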
This strategy involves three assumptions and a precondition that are worth noting.
1. One underlying assumption of this approach is that cognitive functions are computable.
This is actually a rather strong and daring assumption. Most dynamical systems found in
nature cannot be characterized by equations that specify a computable function. Even
three bodies moving in Newtonian space do not satisfy this assumption. It is very much an
open question whether the processes in the brain that subserve cognition can be
characterized as the computation of a computable function.
2. Another underlying assumption of top-down computationalism as it is usually
characterized (and as we have just characterized it) is that cognitive capacities can be
specified independently of their realizations. But this is pretty patently false: There is no
input-output function the computation of which would constitute playing intelligent chess.
Or rather, there are a great many. Think of a chess system as a move generator, i.e., as a
function from board positions (current) to board positions (the move). In a given situation,
intelligent chess players might make any number of different moves. Indeed, the same
player might make different moves on different occasions. In practice, then, the only way to
specify a chess function is to actually write an algorithm for computing it. We cannot, in
general, expect to specify a cognitive function before we analyze and implement it.
3. A third underlying assumption of the top-down strategy, closely related to the second
assumption, is that we will be able to recognize and characterize the relevant inputs and
behaviors antecedently to serious attempts to explain how the latter are computed from the
former. Here the difficulty is that pre-analytic conceptions of behavior and its causes may
seriously misrepresent or distort what is actually going on. Connectionists often complain
that there is no reason to think that cognition in the brain is the manipulation of
representations that correspond to our ordinary concepts. Top-down strategists therefore
run the risk of characterizing the explananda in terms that cross-cut or distort the
categories that are actually causally relevant. This is a common fallacy in biology where
there is an almost irresistible temptation to believe that the morphological traits of
importance and interest to us must correspond to our genes in some neat way.
Computationalists are wont to reply that what Daniel Dennett calls the intentional
strategy--explaining behavior in terms of beliefs, desires and intentions--is enormously
successful, and hence that it cannot be fundamentally wrong to characterize cognition in
something like these commonsense terms.
So much for the assumptions. Now for the precondition: a successful application of the
top-down strategy requires that the target explanandum can be analyzed. Everyone who has
ever tried their hand at programming is familiar with this constraint. You cannot write a
program that computes bids in bridge, or computes square roots, if you do not know how
to compute bids in bridge or compute square roots. But many psychological capacities are
interesting explananda precisely because we have no idea how the task is done. This is
why artificial intelligence plays such a central role in computationalism. It requires very
considerable ingenuity to discover a way--any way--to construct 3D specifications of visual
space from retinal images, or to make it happen that two short sessions on many
problems are more effective than one long one.
 But even with success, there is a problem: having figured out a way to compute a
cognitive function, what reason is there to think that that is how our brains do the job? We
do not mean to suggest that there is no way of addressing this problem, only that it is a
problem that is bound to arise in a top-down framework. Computationalists are thus
inevitably left with a narrowed but still substantial Leibnizian Gap: the gap between a
computational description of psychological processes and a bio-neural description of the
processes in the brain.
Before we leave the topic of underlying assumptions and enabling conditions, it is worth
pausing to note that some of the central enabling assumptions of computationalism are
shared by connectionism. Both assume the possibility of Spock--i.e., that the mind is
basically a cognitive engine and only secondarily a seat of emotion, feeling and sensation.
Both assume that consciousness is inessential to the understanding of cognition. And
both assume that cognition doesn't require a biological brain, let alone an immaterial soul.
Both are thoroughly functionalist and materialist. And both are representationalist in that
both assume that cognition is to be understood as disciplined transformation over states
whose primary function is the representation of information relevant to the cognitive
capacity being exercised. The differences that divide computationalists and connectionists
are practically invisible against the scale that measures the distance between both and
behaviorism or structuralism.
Bottom Up. The top down strategy is explanandum driven: you begin with a capacity to
explain, and try to find a computational architecture that will have it. The bottom up
strategy is explanans (the explainer) driven: you start with a specification of the
architecture, and try to find a way to make it do the task.1 What connectionists have in
common is the assumption that cognitive capacities are built out of a stock of primitive
processes designed explicitly to be rather brain-like. They begin with the building blocks of
a simplified and idealized brain, and attempt to create systems that will behave in a
recognizably cognitive way. The connectionist thus seeks to narrow the Leibnizian Gap even
further, to that between a genuine bio-neural description of the brain and the simplified and
idealized "neural networks" that are their stock in trade.

1
In practice, most computationalists are actually bottom-uppers to some extent. This is
because, as a graduate student, you apprentice in a research group that is more or less
committed to a given architecture, and your job is to extend this approach to some new
capacity. It is just as well: pure top-downism, as described by Marr, is probably impossible.
Computationalist architectures, however, are not well grounded in the brain, so the problem
just rehearsed remains.
But a much narrowed Gap is not the only payoff. As it happens, it is possible to program
connectionist networks to do tasks that the programmer does not know how to do. All that
is required is a sufficiently representative "training set": a set of inputs paired with their
correct responses. Thus the Precondition of top-down computationalism, discussed
above, can be avoided. You can program a network to do a task you haven't the faintest
idea how to do. There is a downside to this, however: once you have trained a network,
you may still have little if any idea how it does the task. Studying an artificial network is, of
course, much easier than studying a living brain, so you are still substantially ahead. But
you are not home free.
Moreover, it is seldom noticed that one of the lately discussed assumptions required by
the top down approach is also required by bottom-uppers. Training sets must be
specified somehow, and the problem of how to conceptualize inputs and behaviors is no
easier for connectionists than it is for top-down computationalists. While connectionists
need not assume that networks operate on internal representations that correspond to
ordinary common-sense concepts, they are no better off than top-down computationalists
when it comes to conceptualizing the target explananda.

The Architecture.

A connectionist architecture is a network of simple units (see figure 1). Each unit has an
arbitrary number of inputs and outputs. Inputs and outputs are not messages, but simply
quantities of activation. Inputs may be positive (activation) or negative (inhibition), but they
are not distinguished in any way from each other by the unit that receives them; the
activations from each input are simply summed up to yield a net input. Thus, there is no
difference between three inputs of one unit of activation, and one input of three units of
activation; these have precisely the same effect. Each output at a given time is the same
as every other from that unit, and is a function of the total input. That function, called the
activation function, is characteristic of the unit and is assumed to be fixed.2
Units are connected by weighted connections (see figure 2). The output from the source
unit is multiplied by the weight of the connection to yield the input to the receiving unit.
Weights may change as a function of activation spreading from unit to unit, or may be
altered externally by the programmer or some automatic process.
Inputs to a network are given by setting activations on some pool of units. The input pool
may be any designated set of units including all of them. Activation is then spread through
the network, each unit computing its output from its input, each output being modified by
the relevant connection weight. Outputs are read as the pattern of activation on some set
of units designated as the output pool. Output might be the pattern of activation on the
output pool after a given time, or when the network reaches equilibrium, or when some
other criterion is met.

2
A typical activation function is a logistic function:

    output activation = 1 / (1 + e^-(total input))

This makes output a near linear function of input and keeps total output bounded without
introducing discontinuities.
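The machinery just described fits in a few lines of code. The sketch below is our own
illustration (the pool sizes and weights are arbitrary), using the logistic activation function
from note 2: each unit sums its weighted inputs, applies its activation function, and passes
the result along its outgoing connections.

    import math

    def logistic(net_input):
        """The activation function from note 2: 1 / (1 + e^-(net input))."""
        return 1.0 / (1.0 + math.exp(-net_input))

    def unit_output(weighted_inputs):
        """A unit simply sums its incoming activation and applies its activation function;
        three inputs of one unit of activation and one input of three units of activation
        have exactly the same effect."""
        return logistic(sum(weighted_inputs))

    def propagate(input_activations, weights):
        """Spread activation from an input pool to a receiving pool.  weights[i][j] is the
        weight on the connection from input unit j to receiving unit i; each output is
        multiplied by the connection weight to give the input it delivers."""
        return [unit_output(w * a for w, a in zip(row, input_activations))
                for row in weights]

    # Example: an input pool of two units feeding a receiving pool of three units.
    weights = [[0.5, -1.0],
               [2.0,  0.3],
               [-0.7, 0.7]]
    print(propagate([1.0, 0.0], weights))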
Networks are of two basic types (see figure 3). Feed-forward networks are organized
into layers. Each unit in a given layer may connect to other units in that layer (including
itself) or to units in the next layer "downstream", but may not connect to any unit that is
"upstream". In recurrent networks, any unit may connect to any other. Various patterns
of connectivity can produce hybrids of feed-forward and recurrence in the obvious ways.
The performance of a network is altered by modifying the connection weights. This is
typically called learning, but the most important techniques for modifying weights are best
thought of as programming techniques, since they are the result of some agent external to
the network modifying the weights in response to some feature of performance. These
techniques are generally called "supervised learning" techniques to distinguish them from
weight change that results automatically as a byproduct of activation flow and hence can
be conceptualized as the result of processes intrinsic to the connections themselves. The
operation of these processes is typically referred to as "unsupervised learning".
Backpropagation. By far the most important form of "supervised learning" is
backpropagation, a technique that applies straightforwardly only to feed-forward networks.
Backpropagation operates as follows. First, an error for each output unit is computed by
taking the difference between the actual activation of that unit on a given input, and the
activation it should have (read from the training set). This error is then used to compute a
modification of the weights on the unit's input connections. If the activation was too high,
each weight is slightly decreased; if it was too low, each weight is slightly increased. This
is done for each output unit.
To compute weight changes for connections leading to units in intermediate layers, the
correct activation values for those units must be estimated, since these are not given by
the training set. The details of the estimation process need not concern us here, but once
it is carried out, changes to the weights are computed as before.3 In practice, back
propagation is virtually guaranteed to eventually set a network's weights correctly if there
is any set of weights that will work at all.
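Written out as code, the procedure is short. The sketch below is our own illustration of the
method just described, using the error and weight-change formulas given in note 3; the
network size, learning rate, bias weights, and training set (the XOR problem) are illustrative
choices, not anything fixed by the text.

    import math, random

    random.seed(0)

    def logistic(x):
        return 1.0 / (1.0 + math.exp(-x))

    # A tiny feed-forward network: 2 input units, 3 hidden units, 1 output unit.
    # Each row of weights belongs to one receiving unit; the last entry is a bias weight.
    n_in, n_hid, n_out = 2, 3, 1
    w_hidden = [[random.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hid)]
    w_output = [[random.uniform(-1, 1) for _ in range(n_hid + 1)] for _ in range(n_out)]

    def layer(acts, weights):
        """Feed a pattern of activation forward through one layer of connections."""
        return [logistic(sum(w * a for w, a in zip(row, acts + [1.0]))) for row in weights]

    # The training set: inputs paired with their correct responses (here, XOR).
    training_set = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
                    ([1.0, 0.0], [1.0]), ([1.0, 1.0], [0.0])]

    rate = 0.5   # the learning-rate constant of note 3
    for epoch in range(10000):
        for inputs, targets in training_set:
            hidden = layer(inputs, w_hidden)
            outputs = layer(hidden, w_output)

            # Output-unit error: (target - actual) * f'(net); for the logistic function,
            # f'(net) is activation * (1 - activation).
            err_out = [(t - a) * a * (1 - a) for a, t in zip(outputs, targets)]

            # Hidden-unit error: f'(net) times the sum of downstream errors, each weighted
            # by the connection that carries it.
            err_hid = [h * (1 - h) * sum(err_out[k] * w_output[k][i] for k in range(n_out))
                       for i, h in enumerate(hidden)]

            # Weight change: learning rate * error at the receiving unit * activation at
            # the sending unit (the bias counts as an always-on sender).
            for i, err in enumerate(err_out):
                for j, a in enumerate(hidden + [1.0]):
                    w_output[i][j] += rate * err * a
            for i, err in enumerate(err_hid):
                for j, a in enumerate(inputs + [1.0]):
                    w_hidden[i][j] += rate * err * a

    # After training, the outputs should be close to the targets for most initial weights.
    for inputs, targets in training_set:
        print(inputs, "->", round(layer(layer(inputs, w_hidden), w_output)[0], 2))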

3
The error for each unit in the output layer is given by:

    error_i = (target_i - actual_i) * f'(net_i),

where f' is the first derivative of the activation function with respect to change in input. For
units in the intermediate layers, the error is given by:

    error_i = f'(net_i) * SUM_k (error_k * w_ki),

where the summation adds up, for each unit k in the layer downstream, the product of the
error at k and the weight between k and the unit whose error is being computed. The weight
change is then simply:

    delta_w_ij = r * (error_i) * (activation_j),

i.e., the change in the weight on the connection from unit j to unit i is a constant r (the
"learning rate") times the error at i times the activation at j.

Simplification and idealization.

Connectionist networks evidently don't begin to incorporate everything that is known about
neurons and their synaptic connections. Why not? Two reasons might be given. The first
is mathematical tractability. Connectionist networks are mathematically straightforward,
whereas actual neural networks are not. The second and more important reason for the
relative simplicity of connectionist networks is methodological. It is, perhaps, best to begin
with a simple and well-understood model, adding complications only when forced to do so
to accommodate otherwise recalcitrant phenomena. In that way, we are sure to have
some idea of what the extra complexity is actually for--what it makes possible. The
underlying hope is that connectionist networks will turn out to be idealized neural
networks.

Representation.

Information is stored in two ways in a connectionist network: fleetingly, in the activation
patterns that appear and disappear in various pools of units, and, more enduringly, in the
pattern of weights. In both of these cases, we find something very different from the
quasi-linguistic representations that make up the data structures of computationalist
models. It is precisely because connectionist representations do not seem to correspond
in any neat way to the linguistic expressions we use to articulate the contents of beliefs,
desires and intentions that controversy has arisen as to whether connectionist models can
explain cognitive capacities as they are ordinarily specified.
A little noticed aspect of this issue has to do with the fact that the typical connectionist
network actually computes a function which is a huge superset of the function it is
designed to compute. Thus NetTalk is said to compute a Wickel-feature representation of
the phonetic value of the middle letter in a seven letter string. And so it does. But consider
how inputs are represented (see figure 4). The assumption is that every unit is either on or
off, and that only one unit is on for each position. But the network will happily compute
outputs on input configurations that violate either or both of these constraints, i.e., on
inputs that have no canonical interpretation in the intended task domain at all. Since the
network is designed so that every possible output represents some phoneme or other,
each of these maverick inputs will be mapped onto a phoneme. How should we react to
this? Should we say that NetTalk doesn't really do the job it was intended to do? Or shall
we make a virtue of necessity and hypothesize that something comparable would happen
in the brain if maverick stimuli were introduced, perhaps by electrode? In computationalist
systems, maverick inputs generally generate error messages or crashes. Is this a virtue or
a vice?
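The point is easy to exhibit with a toy encoding of our own devising (nothing here is
NetTalk's actual scheme): the canonical inputs are one-unit-on vectors, but the network
returns an output for any vector whatever.

    import math

    def logistic(x):
        return 1.0 / (1.0 + math.exp(-x))

    LETTERS = "abc"   # a toy three-letter alphabet, a single letter position

    def encode(letter):
        """A canonical input: exactly one unit on for the position."""
        return [1.0 if c == letter else 0.0 for c in LETTERS]

    # One output unit per "phoneme"; the weights are fixed by hand for illustration.
    weights = [[ 4.0, -2.0, -2.0],   # output unit 1 responds to 'a'
               [-2.0,  4.0, -2.0],   # output unit 2 responds to 'b'
               [-2.0, -2.0,  4.0]]   # output unit 3 responds to 'c'

    def respond(vector):
        return [logistic(sum(w * a for w, a in zip(row, vector))) for row in weights]

    print(respond(encode("a")))        # canonical input: output unit 1 dominates
    print(respond([1.0, 1.0, 0.0]))    # maverick input (two units on): an output anyway
    print(respond([0.3, 0.0, 0.9]))    # maverick input (graded activation): likewise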

Graceful Degradation.

Closely related to this issue is a much touted property of connectionist networks.
"Graceful degradation" refers to the fact that degraded input, lesioned connections, or
damaged units, typically do not bankrupt a network, but lead to impaired but interpretable
performance, with the degree of impairment depending on the degree of damage. As any
programmer knows, damage to a computational system is generally catastrophic, yielding
an error, a crash, or nonsense rather than impaired but interpretable performance. The
fact that stroke and other kinds of damage to the brain often (but not always!) lead
similarly to degraded performance rather than to total dysfunction appears to be a factor strongly
in favor of the connectionist approach. If the brain implements an orthodox computational
system of the sort envisaged in the previous section, brain damage ought to uniformly
produce crashes, not degraded performance.
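
What "lesioning" a network amounts to can be shown in a minimal Python sketch; the random weights, the probe input, and the proportions of connections cut are illustrative assumptions. As more connections are zeroed out, the output drifts further from the intact output, but the network never simply crashes.

    import numpy as np

    rng = np.random.default_rng(3)
    W = rng.normal(size=(10, 20))            # stands in for a trained weight matrix
    probe = rng.normal(size=20)
    intact_output = W @ probe

    def lesion(weights, fraction, rng):
        # zero out a randomly chosen fraction of the connections
        damaged = weights.copy()
        damaged[rng.random(weights.shape) < fraction] = 0.0
        return damaged

    for fraction in (0.1, 0.3, 0.5):
        drift = np.linalg.norm(lesion(W, fraction, rng) @ probe - intact_output)
        print(f"{int(fraction * 100)}% of connections cut: output drift = {drift:.2f}")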

Real Time and the Hundred Step Rule.

Feldman and Ballard (19~~) point out that many basic cognitive functions are performed
in 500 milliseconds or less. Since it takes, on average, about 5 milliseconds for one
neuron to communicate with another in the brain, it follows that the brain must accomplish
these tasks in about 100 sequential computational steps. Typical orthodox
computationalist programs take thousands. The obvious inference is that these programs
err in one or more of the following ways:
        • They are too sequential; parallel computation requires fewer sequential steps.

       • They utilize the wrong primitives: more sophisticated primitive operations allow
       for more speed, but less flexibility.

        • They are digital rather than analogue: a basic physical component whose physical
        state changes are given by a function analogous to (isomorphic to) a psychological
        function computes that function in a single step in the time it takes for a physical
        state change. The primitive computations in a digital computationalist model are
        never computations of a psychological function.
It is no use to reply to these points by pointing out that thousands of steps reduce to
hundreds or less if we count each psychologically significant sub-routine, rather than the
computational steps in the routine, as single steps. For this would require the assumption
that each such sub-routine could be accomplished in a single (perhaps massively parallel)
neural step. Computationalist models do not even address the question of how this might
be possible; connectionist models address the question directly.
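
The arithmetic behind the hundred step rule can be put as a back-of-the-envelope Python sketch. The two timing figures come from the argument above; the layer width is an arbitrary assumption.

    # Roughly 500 ms per basic cognitive task, roughly 5 ms per neuron-to-neuron step.
    task_time_ms = 500
    neuron_step_ms = 5
    sequential_budget = task_time_ms // neuron_step_ms
    print(sequential_budget)                 # about 100 sequential steps

    # However wide a layer of units is, if they all update at once the whole layer
    # costs only one sequential step. Counted operation by operation, an orthodox
    # program for the same task typically runs to thousands of steps.
    units_updating_in_parallel = 10_000
    sequential_steps_for_that_layer = 1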

Spontaneous Generalization.

Another often touted property of connectionist networks is spontaneous generalization. In
the typical case, if a network is trained to respond to two activation vectors I1 and I2 with
output activation vectors O1 and O2 respectively, the network will automatically respond
to an input vector that is (geometrically) between I1 and I2 with an output vector that is
between O1 and O2. This sort of interpolation is one of many kinds of generalization that
appear in networks automatically and unbidden. (Whether it is wanted or not!) This is an
example of a psychologically significant process that has to be specifically programmed in
computationalist models, but appears in networks as a kind of side-effect, with no
additional computational resources required. It seems perverse to assume that a network
of neurons implements in some highly indirect way a computationalist algorithm for
generalization. A phenomenon that occurs spontaneously in the neurons is obliterated in
the implementation and artificially re-introduced as a complex computation!
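
A minimal Python sketch of the phenomenon, under the simplest possible assumption of a single linear layer fitted to two input-output pairs (the particular vectors are arbitrary): probing with an input midway between I1 and I2 yields an output midway between O1 and O2, an interpolation the network was never trained to produce.

    import numpy as np

    # Two training inputs and their target outputs (arbitrary illustrative vectors).
    I1, I2 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
    O1, O2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

    # Fit a single linear layer so that W @ I1 = O1 and W @ I2 = O2.
    inputs = np.stack([I1, I2])                               # shape (2, 3)
    targets = np.stack([O1, O2])                              # shape (2, 2)
    W = np.linalg.lstsq(inputs, targets, rcond=None)[0].T     # shape (2, 3)

    midpoint_input = (I1 + I2) / 2
    print(W @ midpoint_input)      # approximately (O1 + O2) / 2: an output between
                                   # O1 and O2, which the network was never trained to give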

Concluding Remarks.

The connectionist revolution has forever altered the once uniform landscape of Cognitive
Science as it existed under the sway of the computationalist paradigm. It has not utterly
supplanted computationalism by any means. And there are growing numbers who think
that, ultimately, only actual neuroscience will give us any real insight into the mind. We
have tried here to indicate briefly what connectionist networks are and why they are
attractive. It has not been our intention to adjudicate this dispute, but simply to
introduce one of the players.

Introduction: The Mind as Brain

Everyone who is not a dualist believes that mental processes are processes that go on in
the brain. If one's goal is a science of the mind, however, observation of the brain seems
to yield results on the wrong side of Leibniz' Gap. The Computationalist response to this
problem is to try to understand cognitive processes in abstraction from the brain or any
other "hardware" in which they might occur. The Computationalist strategy is to first
articulate a computational theory of cognition, and then to inquire into how the implicated
computational processes might be carried out in the brain. This strategy has some evident
merits. Since no one doubts that computational processes can be physically realized,
Computationalism is free from any dualist taint. Yet the problem of bridging Leibniz' Gap is
conveniently put off until some future date when we will surely know more about both
cognitive and neural processes. An evident drawback, however, is that there is no
guarantee that cognitive processes are computational processes at all, let alone that
cognition in biological brains will turn out to be the kind of processes we are led to
investigate by following a strictly top-down approach. Although that approach has had
some notable successes, it has also had some notable failures. It would not be
unreasonable to conclude that the difficulties faced by Computationalism might be due to
insufficient attention being paid to the only processes we know for sure are sufficient to
subserve mentality in general, and cognition in particular, namely brain processes.
Perhaps we should simply accept the fact that, as things currently stand, studying the
brain puts us on the wrong side of Leibniz' Gap, but hope that, as our knowledge
increases, the outlines of a bridge over the Gap will eventually appear.
Connectionists attempt to take a middle ground here, starting in the middle of the Gap, as
it were, and trying simultaneously to bridge to either side. Most neuroscientists, it seems,
are at least tolerant of the Connectionist strategy. But they are inclined to argue that
connectionist models are such vastly oversimplified models of the brain as to be very
likely misleading at best. If we are going to bridge Leibniz' Gap, we are going to have to
know a great deal more about the brain than we do now. This much is agreed on all
hands. So why not get on with it? And, since the brain is the only known organ of
mentality, whether natural or artificial, it seems only sensible to begin by trying to
understand how it works. Any other strategy arguably runs the risk of being a wild goose
chase, an attempt to make mentality out of stuff that just isn't up to the job.
This line of argumentation has been around at least since the seventeenth century, but it
had little practical consequence until relatively recently simply because there was no very
good way to study the brain. Except for some nice details, simple dissection is enough to
give a fairly complete structural picture of the heart. X-ray of the living heart pumping
radioactively tagged blood, together with open chest and open heart surgery, are enough
to give an equally complete picture of the functioning of the living heart and how that
functioning supervenes on the heart's anatomical structure. But nothing comparable will
do for the brain. It is far too complicated. Its functional parts, the neurons, are incredibly
small and numerous. The processes are subtle electro-chemical processes distributed
over millions of cells. The scale is forbidding, both the vastness of the number of parts and
processes, and the smallness of their physical size and duration.
Still, a variety of techniques have been available for some time. Dissection is by no
means useless, and autopsy of those with damaged or diseased brains with known
mental dysfunction has always been suggestive. The gross anatomy of the brain has
been known since at least 200 BC. The anatomy of the neuron, along with some of its
functional properties, began to emerge in the 19th C., aided by staining techniques and better
microscopes, and the discovery of a neuron in the squid that is conveniently large.
Synaptic junctions became "visible" in the 1950s with the use of the electron microscope.
The ensuing decades ushered in increasingly ingenious uses of lesion studies, in which
functional impairments are correlated with damage to specific brain areas.
Commissurotomy (lesioning of the commissures connecting the right and left
hemispheres) as a treatment for certain forms of severe epilepsy led to a mass of data
and speculation concerning hemispherical specialization, as did the introduction of the
Wada test, in which sodium amytal is injected into the left or right carotid artery resulting in
"paralysis" of the corresponding hemisphere. Electrode stimulation and recording studies
in both humans and animals became highly refined.
Even more recently, a variety of non-invasive techniques have finally allowed us to begin to
enter into the brain as into a mill, as Leibniz imagined. EEG (electroencephalography),
ERP (event-related potential, an averaging of EEGs over many trials), CAT
(computerized axial tomography), PET (positron emission tomography), and, most
recently, MRI (magnetic resonance imaging) and fMRI (functional MRI) have made it
possible to observe the brain and its processes at scales in space and time that are
beginning to make Leibniz' thought experiment a reality.
All of this technology makes the bottom-up strategy favored by many neuroscientists a
serious scientific possibility rather than a mere philosophical platform. But as that platform
becomes a practical reality, the vision that animated Cognitive Science for thirty
years has, unquestionably, begun to dim. Bottom-uppers have no particular reason to
focus on cognition--to assume, with Descartes, that discursive thought is the essence of
the mental. Nor, consequently, have they any particular reason to think that cognitive
phenomena form an autonomous domain whose principles might be articulated
independently of both non-cognitive mental phenomena and of their particular physical
realizations. That there can be scientific study of cognition, no one in the scientific
community seriously doubts. But it is increasingly unclear whether there can be Cognitive
Science as this was conceived by most computationalists and many connectionists.
Computationalism as a research strategy requires, as we saw, enabling assumptions that
make the possibility of a special science of cognition seem extremely plausible. Bottom-up
neuroscience has no such implications.

Introduction: Special Topics


Our final section is a short collection of seminal papers that introduced four issues that have
deeply affected the development of cognitive science. While many issues in cognitive
science are parochial to one or another specialized sub-discipline, the ones to be
discussed here--innateness, modularity, eliminativism, and (more recently) evolutionary
psychology--are overarching. No one actually involved in cognitive science can be
indifferent to these issues because they all have serious implications for each of the
approaches to the mind that have underwritten scientific research on cognition.

Innateness

Perhaps more than any other, the issue of innateness divides rationalist and empiricist
approaches to the science of cognition. Rationalist and empiricist alike believe that there
are innate cognitive capacities. Everyone believes we are born with the capacity to learn.
What divides rationalist from empiricist is the idea that there is innate knowledge; that
substantive and contingent information about the environment is part of the genetic
endowment of every biological cognizer. The central argument for this claim is very
simple and goes back to Plato: Certain things that are known could not be learned and
hence must be innate.
There are variations on this theme, having to do with why the alleged innate knowledge
cannot be learned. There is not enough time; the knowledge is acquired before
development of the (typically perceptual) capacities required to extract the information
from the environment; the knowledge in question is required for any other learning to take
place; the environment simply doesn't provide enough information. Looking over these
variations, it is easy to see that they depend on fairly detailed views about how learning
works, and about the developmental sequence. It is, of course, tempting to declare innate
whatever cannot be accommodated by one's own stories about learning mechanisms and
development. On the other hand, opponents of any particular nativist claim can be
required to put up or shut up: if you think something is learned, it is incumbent on you to
say how and when. Since it is enormously difficult to explain how anything is learned, this
has proved a very effective strategy.
Computationalists are more likely to be nativists than connectionists simply because it is
relatively easy to build a particular bit of innate knowledge into a computationalist model,
and relatively hard to do anything comparable with a connectionist model. (Strict
bottom-upper scan afford to be agnostic on this topic since, as things now stand, they do
not have specific enough architectural models to have implications for how innate
knowledge might be accommodated.) It is worth noting, however, that the issue of
innateness has tended to be formulated by cognitive scientists in terms that have a
comfortable home only in the computationalist framework. For only in that framework is it
natural to think of long term knowledge as something that comes in the form of the sort of
discrete propositions typically expressed by a sentence in a natural or artificial language.
In the 17th C. it was much debated whether our knowledge of God's nature and existence
is innate or acquired. That debate is instructive for at least two reasons. First, it is
instructive because it presupposed some knowledge that many today would claim we do
not have at all. But it is also instructive because it takes the issue to be which
linguistically expressible propositions are innate.
The issue of innateness as it was understood in the 17th C. loses focus if we ask whether
there is innate knowledge of how to do something, for this blurs the distinction between
innate knowledge, which only rationalists accept, and innate capacities, which both
rationalists and empiricists accept. The rationalist reply is typically that knowledge how
depends on knowledge that. We know how to drive a car because we know that turning
the wheel counter-clockwise moves the car to the left, and so on. But this reply is
evidently self-defeating in its full generality. Rationalist and empiricist alike concede that
there must be innate capacities to exploit the knowledge we have in learning, action and
perception. An encyclopedia cannot do anything. But there is a subtler issue here as well.
Computationalists quickly became aware that a great deal of information could be implicit
in the logical structure of a program. Production systems self-consciously blur the
distinction between information and the processes that operate on it almost completely,
with only working memory remaining as a repository of information untainted by process.
Whatever long-term knowledge is represented in the weights of a connectionist system is
not to be distinguished from the processes that operate on the short-term information
represented in the activation vectors. Both parties, it seems, will have trouble with the
distinction between knowledge and capacities that initially gave the issue its bite.
Even if there is little left of the once important philosophical issue concerning innate ideas,
there is still a great deal left to discover about what sort of information is implicit in our
genetic neuro-architectural endowment, and just how it is "in there." And here the old
arguments still have some force: is there enough time? Is there enough information in the
environment? What has to be presupposed in a system capable of extracting and
exploiting that information?

Modularity

From time immemorial, people have divided the mind into "parts." Plato distinguished
reason from appetite and will (##), and commonsense distinguishes perception from
thought, and both from the emotions. Frequently, memory is distinguished from all of
these as well. These distinctions are evidently functional. Though commonsense is
unclear what the emotions are for, this scheme of things has its rationale in speculation
about what it takes to get the job done. Perception takes in information, memory stores it,
reason processes it, storing some and using some to direct action. The emotions provide
desires or goals. These are the basic components of GPS, which was designed to solve
problems. Today, we would recognize these components as the basic components of a
planner, a system designed to formulate a plan (a series of actions) to achieve a goal in
the light of whatever information is already available or can be gleaned from the
environment either accidentally or as part of the plan.
This simple but elegant, compelling, and remarkably effective structure involves a
division of labor. It assumes special faculties for information acquisition (perception),
storage (memory), inference (reason), and goal generation (emotion/desire). Contemporary
talk of modules, however, involves a different kind of division of labor, a division
corresponding to different cognitive domains rather than to different requirements for
accomplishing a given task. For some time, cognitive scientists have taken seriously the
idea that the mind is a committee of specialized minds, each relatively complete in itself
and designed especially for a given kind of cognitive problem. Modules have been
proposed for language production and language understanding, with sub-modules for
phonology, syntax, and semantics, at a minimum. The same goes for vision, memory,
social reasoning, physical reasoning, and many others. An extreme form of the resulting
vision is admirably captured in the following quotation from Tooby and Cosmides (1995):
        [O]ur cognitive architecture resembles a confederation of hundreds or thousands of
        functionally dedicated computers (often called modules) designed to solve adaptive
        problems endemic to our hunter-gatherer ancestors. Each of these devices has its
        own agenda and imposes its own exotic organization on different fragments of the
        world. There are specialized systems for grammar induction, face recognition, for
        dead reckoning, for construing objects and for recognizing emotions from the face.
        There are mechanisms to detect animacy, eye direction, and cheating. There is a
        "theory of mind" module..., a variety of social inference modules...and a multitude
        of other elegant machines.

       This "Swiss Army Knife" model of the mind has many attractions. It allows the
cognitive scientist to attack cognition piecemeal. Building a parser is a lot easier than
building a whole mind. It makes the evolution of mind easier to understand, since it allows
various modules to evolve in relative independence of others in response to specific
selection pressures. It explains in a natural and compelling way how we can do so many
things without having a clue how we do them. It accommodates such puzzling facts as
that language competence is largely independent of IQ, and that social and physical
reasoning seem largely independent of each other. And finally, it helps to explain the
strange dissociations that are the stock in trade of the cognitive neuroscientist: people
who can write from dictation but cannot read, people who are blind but who think they can
see, people who are unaware of one half of their body, and many more.
       But there are problems as well. The brain exhibits remarkable plasticity as well as
specialization. While modularity doesn't depend on strict neural localization--after all, you
can program a computer with highly modular sub-routines, and the resulting modularity of
capacity will not be reflected in the hardware--localization of functions corresponding to
hypothesized modules would certainly make life easier. And without some constraints
from neuroscience, it seems that the modularization of mind could be imagined in many
ways consistent with the behavioral evidence.
       Once modules are admitted, each with its own implicit information, it is inevitable that
we will ask which, if any, are innate. Thus, modularity, as an hypothesis about the structure of
the mind, cohabits comfortably with nativist theories. In general, the degree of nativism in
a theory correlates positively with the degree of modularity it assumes. You can have
modularity without nativism, but it is hard to defend nativism without modularity. An innate
module comes equipped with its own proprietary information, whether implicit in the
processes, or explicit in the memory. Defenders of innate modules can defend nativism
without facing the tricky distinction between knowledge and capacities that was
presupposed by the original dispute over innate ideas.

Eliminativism

Eliminativism in the philosophy of mind is the doctrine that commonsense mental
concepts, especially those of belief, desire and intention, are seriously flawed, and should
ultimately be eliminated from a serious theory of cognition.
       Commonsense has it that it is beliefs, desires and intentions that make the mind go
'round. You want a sandwich. You form an intention to acquire one. You believe that
there is bread, peanut butter, and jelly in the kitchen. You have beliefs about where the
kitchen is relative to where you are now, and other beliefs about how to get to the kitchen.
You form an intention to go to the kitchen. You believe that, once there, you will be able to
construct a sandwich. Etc. That we all have beliefs, desires and intentions, and that they
interact to generate intelligent, goal directed behavior in this way, seems beyond question.
       But it isn't beyond question, for it has been questioned, and quite seriously. And the
implications are staggering: All of traditional epistemology, as well as moral and legal
reasoning, have been couched in the terms of commonsense psychology. If these prove
to be seriously flawed, it is not just our cognitive science that will have to change.

Evolutionary Psychology

Recently, evolutionary psychologists of a cognitive bent have argued that the study of
cognition needs to be informed and constrained by a consideration of what our various
cognitive capacities are for--what they evolved to accomplish. Cognitive capacities, the
argument goes, are adaptations, and hence the mind should be seen as a biological
system structured by natural selection. The argument can be simply summarized as
follows:

If you are a materialist, then you are committed (at least implicitly) to the view that
       The mind is what the brain does.
That is, our cognitive processes are instantiated as neurological processes. And unless
you are a creationist, you are also committed to the view that
       The brain was shaped by natural selection.
If you accept these two premises, you are also committed to accepting their logical
conclusion, namely, that
       The mind was shaped by natural selection.

      The strongest proponents of this argument take a very literal interpretation of this
conclusion. The mind is seen as a collection of independent cognitive functions or
modules each of which constitutes an adaptation to different environmental pressures. We
have already quoted Tooby and Cosmides to this effect. Stephen Pinker puts it even more
succinctly:

      "[T]he human mind..[is]..not a general-purpose computer but a collection of
      instincts adapted for solving evolutionarily significant problems -- the mind as a
      Swiss Army knife."
      (Pinker, 1994)

This view is certainly consistent with a neurological view of the adult brain, which is, to a
first approximation anyway, a collection of discrete specialized structures and substrates.
Specific neural circuits subserve specific cognitive functions, and damage to those circuits
produces selective impairments in cognition, not an across-the-board reduction in intellectual
function. This massive modularity view, however, does not sort well with what we
understand about neural plasticity during development. As one leading textbook on
cognitive neuroscience puts it:

       "The environment has profound effects on the <developing>brain. Such effects are
       clearly seen during sensitive periods, which are time periods during development
       when an organism is particularly sensitive to certain external stimuli." (Banich,
       1997)

        At first blush, the plasticity of the developing brain does not seem to sort well with
the view that natural selection shaped innate modules with specific functions. Instead, it
seems more consistent with a strictly empiricist view of intellectual function, one in which
the nature of one's reasoning strategies simply reflects environmental contingencies.
        But an evolutionary approach to cognition need not presuppose massive
modularity. An alternative conception, consistent with the idea that natural selection has
designed our cognitive architecture, is that complex social animals do not inherit
modules fully formed, but rather have a biological preparedness to develop them very
quickly for classes of problems that are critical to survival (and hence reproductive
success). Further, these predispositions can differ in their degree of canalization, that is, in
the degree to which the environment plays a role in their expression.
        As examples, consider the neurological changes that subserve the development of
binocular vision and language. Cortical binocular columns (used in depth perception) are
not present at birth, but appear in the visual cortex during a critical period after the infant
has received visual input. Other visual cortical cells show diffuse line orientation
"preferences" at birth, firing maximally to lines of a particular orientation (e.g., vertical), but
responding to lines of other orientations as well, albeit to a lesser degree. After receiving
visual input, however, these cell preferences are sharpened so that they respond
maximally only to lines of a particular orientation. Further, if visual input is restricted to
only a single orientation (e.g., the animal is exposed only to lines of vertical orientation),
the majority of cells will shift their preferences to match their visual experiences,
responding maximally to lines of vertical orientation even if their initial preferences were
for lines of other orientations.
        Like vision, language development also shows a complex pattern of interplay
between innate biases and environmental input. Deaf babies will begin to babble vocally
just as hearing babies do, but their babbling declines and eventually ceases, presumably
because they don't receive the auditory feedback hearing babies do. Infants are also born
with the capacity to hear all phonetic contrasts that occur in human communicative
systems, yet lose the capacity to distinguish among phonemes that are not marked in their
language community within the first year of life.
        As these examples show, biological preparedness comes in degrees, and is
probably best explained in terms of canalization (Ariew, 199~). A trait is said to be more or
less canalized as its expression is more or less independent of environmental influence.
A combination of genetic and environmental factors causes development to follow a
particular pathway, and once begun, development is bound to achieve a particular
end-state. Limb development is highly canalized in humans (humans everywhere grow
limbs in the same way) but not perfectly so, as the example of Thalidomide shows. As we
saw, language is highly canalized, though not so highly as limb development.
        The environment can influence trait development in many different ways. The most
interesting of these to the psychologist is learning. It is not often fully appreciated that
learning can affect the development of even highly canalized traits. Thus language,
though highly canalized, is still learned. Biology puts strong constraints on what properties
a language must have to be learnable (as a first language), and it virtually guarantees that
language will be learned in a huge variety of environments. This is what is meant by the
claim that there is a specific biological preparedness for language acquisition.
        As we noted earlier, when it comes to nativist theses about cognition, there is a
temptation to ask which information (rules, theories, etc.) is innate, and which is learned.
Couching the issue in terms of canalization or biological preparedness, however, allows
us to see things quite differently. Consider a jointly authored paper. We might ask who
authored which sections or paragraphs or even sentences. This is how people tend to
think of nature vs. nurture in the cognitive realm. But it could also happen that both
authors are responsible for every sentence, with the degree of responsibility varying from
sentence to sentence, or section to section. The suggestion is that we should think of our
cognitive abilities as all thoroughly co-authored. From this perspective the question is not
what ideas are contributed by the genes, and which by learning, but rather how canalized
the development of a given idea (or concept or ability) is: how much variability in the
learning environment will lead to the same developmental end-state. An advantage of this
way of thinking is that we see at once that nothing, not even limb-development, is
inevitable. And when we investigate things in this light, we are led to ask which variations
in the learning environment will divert the stream into a different and perhaps preferable
canal.
        However we conceive of the influence of natural selection on the mind, it seems
clear that natural selection has shaped the mind, and will continue to do so. This rather
uncontroversial idea, however, opens up a new source of evidence for cognitive scientists,
and generates a new set of constraints on cognitive theory. Can the strategies employed
so successfully by evolutionary biology in other domains yield substantial progress in the
study of the mind? We won't know unless we try.
