1 Introduction: The mind as computer

How is it possible to have a science of the mind? To have a science of the mind, you have to be able to observe the mind, or to infer its workings from something that is observable. Let's take a look at these options in turn.

Observing the mind. If you are a dualist, believing the mind to be immaterial, observation of the mind is problematic. Observation seems to require some sort of physical contact with the thing observed. It has to reflect light, distort a magnetic field, or collide with something. If the mental is immaterial, it seems it must be undetectable.

Of course, most dualists hold that there is mind-body interaction: that pains cause screams, that intentions to move one's arm cause one's arm to move, that bodily damage causes pain, and that light causes visual experiences. This leads to the possibility of inferring the mental from its environmental or bodily causes, and from its bodily effects. But there is a principled difficulty with this idea. We may, indeed, infer a mental effect from a bodily cause, or a mental cause from a bodily effect, but we will have no idea what these mental causes and effects are like. We will, in short, have no idea what sort of mechanisms the mind harbors, and hence no idea how it works to produce the effects it does, or to be affected as it is.

To get a feel for this, consider the following analogy. How might an early chemist have explained chemical bonds? Two ideas dating from ancient times are (1) that atoms are like burrs, and stick together, or (2) that they are like pieces of a three-dimensional jigsaw puzzle, or, more simply, fitted with hooks and eyes. The law of constant proportions--the fact that elements regularly bond together in fixed ratios by weight--favors (2) over (1). You get H2O because each hydrogen atom has one free hook, and each oxygen atom has two free eyes. Burrs just clump willy-nilly.
However, while (2) allows for H2, it ought, contrary to fact, to be less stable than H2O, since it ought to be easier for two hooks to come apart than for a hook to come out of an eye. And there shouldn't be any O2 at all.

You can do all this reasoning about (1) and (2) because you already understand, or can readily investigate, the properties of burrs and of hooks and eyes. (1) and (2) simply project those properties into the realm of the imperceptibly small. But suppose a defender of (2) were to respond to the problem about O2 by suggesting that very, very small eyes, unlike the medium-sized ones with which we are all familiar, can hook together. Although intended to save the theory from an objection, this reply would rather undermine it altogether, for it would leave us clueless about what to expect of micro hooks and eyes, thereby destroying the theory's explanatory and predictive power.

The moral of this story is not far to seek. Inferred mechanisms, if they are to have any explanatory or predictive value, must be, to some extent anyway, understood independently of the effects they are rung in to explain. And this is where the dualist is in trouble, because mental mechanisms cannot, for the dualist, just be ordinary mechanisms that happen to be hidden away in the mind. They only occur in the mind. They occur nowhere else, and they do not operate on physical principles. So, although dualists believe in mind-body interaction, the idea that the mental can be inferred from its bodily causes and effects founders on the lack of any mental mechanisms to mediate bodily causes and effects.

The answer to this difficulty seems obvious: introspection. The mind, as we all know, can observe itself. So the mind, according to the dualist, can be observed after all, and directly observed at that. It is just the minds of others that cannot be directly observed. We will return shortly to this idea.
But first, let's see how the would-be psychologist fares if she is not a dualist, but a materialist instead. The materialist, surprisingly, has the same problem as the dualist because of something we call Leibniz' Gap. Here is Leibniz' formulation of the Gap.

"It must be confessed, moreover, that perception, and that which depends on it, are inexplicable by mechanical causes, that is, by figures and motions. And, supposing that there were a mechanism so constructed as to think, feel and have perception, we might enter it as into a mill. And this granted, we should only find on visiting it, pieces which push one against another, but never anything by which to explain a perception. This must be sought, therefore, in the simple substance, and not in the composite or in the machine." (Leibniz, Monadology, sec. 17)

There is, as Leibniz points out in this famous passage, a gap between the concepts we use to describe the mind, and those we use to describe the brain. So even if we are convinced that the mind is the brain, or a process going on in the brain, physical observation of the brain seems to give us data in the wrong vocabulary: synapses rather than thoughts. When we look at a brain, even a living brain, we don't see thoughts. Or, not to beg the question, we don't see anything we readily recognize as thoughts. If you could put on Newton glasses and look at a billiard game in progress, you would see vectors with centers of gravity at their tails. If you could put on Psychology glasses and look at a living brain, you would, according to the materialist, see thoughts, and probably a good deal more. But to build Psychology glasses, you would need to somehow bridge Leibniz' Gap by correlating observed brain properties, events and processes with psychological properties, events and processes. It seems that the only way you could do this would be to rely on introspection to identify the psychological correlates of what you could observe in the brain.
But this puts the materialist in the same boat as the dualist: relying on introspection to generate an observational base in order to get a scientific psychology off the ground. Leibniz' Gap may only be a conceptual gap, but it seems it is no easier to see across it than it is to see across the metaphysical gap that separates the mind and body for the dualist. And so it seemed to dualist and materialist alike that psychology must be founded on introspection.

Structuralism. Unquestionably the most significant introspectionist program in the United States was the "structuralism" of E. B. Titchener. Titchener was concerned to establish the claim that the "new psychology" imported from Germany had made psychology a rigorous empirical science. Lacking a nontrivial account of science, Titchener supported his claim by emphasizing the analogies between psychology as he saw it and an established experimental science--namely, physical chemistry. To understand Titchener's vision of psychology, therefore, we do well to examine his model.

The core of physical chemistry in Titchener's time was the periodic table. The periodic table allowed one to explain analytically an enormous number of the chemical properties of compounds. It provided a list of chemical elements--i.e., components whose further analysis is not theoretically significant for the explanation of properties in the intended domain--together with a specification of the chemically important properties of those elements. With these resources, it was possible to derive laws of composition--which compounds are possible--and laws of instantiation--which properties a compound will have given its constituents and structure--for a large number of the empirically established chemical properties of substances.
Titchener's idea was to provide for psychology what the periodic table provided for chemistry, thereby making it possible to explain the properties of mental events and processes by analyzing them into psychological "elements": mental events that could not be further analyzed.

Since Titchener's elements are not things or substances but events, his program requires some account of the origin of elements. He needed a general recipe for answering this question: Why do we have just these elements present (in consciousness) at this time rather than some others? There appear to be just three possible sources of elemental mental events: either a current element is the effect of one or more previous elements, or it is the effect of extramental stimuli, or both. Titchener allowed all three possibilities, but he concentrated mainly on the second, probably because (i) extramental events are more open to experimental manipulation, and (ii) under the influence of empiricist philosophy, Titchener believed that perception is the most significant source of events in the mind.

The object of psychological theory, then, is to explain the origin and properties of the contents of consciousness--e.g., feeling anger, the visual image of a pouncing cat, or the experience of voluntary action. Suppose the feeling of anger has properties Q and R (as revealed by introspection). To explain why anger has these properties, we are to proceed by analyzing this feeling into its elements--call them x, y, and z. Then, appealing to the properties of these elements and the laws of composition, we endeavor to show why anger must have the properties Q and R. To explain why S was angry at some particular time, we explain the occurrence of x, y, and z (tokens of the mental element-types that make up anger) as effects of previous mental events and/or current extramental stimuli. (Perhaps we shall also need to explain why conditions were propitious for the combination of x, y, and z into anger.
Compare: many chemical combinations require a fair amount of heat, or a catalyst, to take place.)

The project, then, was to discover the fundamental and introspectively unanalyzable elements of consciousness and to formulate the principles of combination whereby these elements are synthesized into the complex and familiar experiences of ordinary life. Every compound mental state or process was to be explained compositionally, the characteristics of the whole derived from the characteristics of the parts and mode of combination. Not surprisingly, introspectionists spoke of mental valences, of mental equilibrium, and of mental elements neutralizing each other.

The fundamental issues of mental analysis and synthesis never made significant progress, however, and the reason is fairly clear. There were simply no technologies or experimental procedures for analyzing a mental event or process: nothing like qualitative analysis in chemistry. This was a fatal defect. You get a large explanatory payoff from the strategy of explaining the observed properties of things in terms of the properties of their elemental constituents and their mode of combination only if the properties of the compound differ significantly from those of the constituents. Salt is nothing like chlorine or sodium. But this means that simply observing salt will tell you nothing about its constituents. You need to be able to analyze it--break it up into its components and isolate them for study. Lacking anything comparable to the laboratory tools of analytical chemistry, the student of introspective psychology was, in the end, simply left with passive introspective observation, and this meant that no significant analysis could be forthcoming. Everything, in effect, was an unanalyzable element. Introspection as a method thus sorts ill with the explanatory strategy of the theory.
This strategy was to explain analytically the properties of the complex contents of consciousness, and perhaps the capacities of the mind required for it to have such contents. Since introspection is, at best, a form of observation, it can hope to yield data--the properties of conscious contents--but it cannot hope to yield analyses. The elements and their properties will not be "visible" when compounded unless we assume that there is no serious composition at all. If we assume a bushel basket theory of consciousness, a step even Hume did not take, then the properties of anger, say, will simply be the union of the properties of the elements in consciousness when one is angry.

So the analysis of consciousness bogged down for lack of analytical tools. But the correlative project to explain the elements of consciousness as responses to perceptual stimulation did not, for here introspectionists had an experimental paradigm in the Weber-Fechner experiments. Here is Titchener's textbook description of a typical variation.

Method--to find the numerical expression of Weber's law for noise--An ivory ball is let fall from two different heights, upon a hard-wood plate. The difference of intensity between the two sounds (i.e., the difference between the two heights of fall) must be slight. The two sounds are given in irregular order in different experiments (to avoid the influence of expectation), and the subject is required to say, in each case, whether the second is louder than the first. In 100 experiments, he will give a certain number of right answers, and a certain number of wrong. The method assumes that if the two sounds are just noticeably different in intensity, the subject will give about 80% right and 20% wrong answers. This proportion is calculated by what mathematicians call the 'law of probability.' Now suppose that a certain difference gave 70 right and 30 wrong answers in 100 experiments.
We could calculate, by aid of the integral calculus, how much larger the difference must have been to give 80 right and 20 wrong--i.e., to be just noticeable. The calculated difference (difference of height of fall) is the numerator, and the original intensity (height of fall) of the weaker sound, the denominator, of the fraction which expresses Weber's law. [Titchener, 1897, pp. 81-82]

Here is a different and more fundamental description:

Suppose that we are investigating the intensity of noise. We shall begin with a stimulus of moderate intensity: say, the noise made by the fall of an ivory ball upon a wood plate from a height of 90 cm. We will call the intensity of this sensation 1. If we gradually increase the height of fall, we shall reach a point at which the noise of the fall is just noticeably greater than the original noise. We may call the intensity of this second sensation 2. If we further increase the height of fall, we shall presently get a noise, 3, which is just noticeably louder than 2; and so on. Now what are the different heights of fall--i.e., intensities of stimulus--necessary to arouse sensations of the intensities 2, 3, 4, etc.? ... An addition of 30 cm. suffices to raise the intensity of sensation from 1 to 2; but if we are affected by the stronger stimulus 120 cm., we must add more than 30 to it to change intensity 2 to intensity 3. In other words: change in the intensity of sensations does not keep even pace with change in the intensity of the stimuli which occasion them. Experiment enables us to replace this general statement of the relation of sensation intensity to stimulus intensity by a definite scientific law. If sensations are to increase in intensity by equal amounts, their stimuli must increase by relatively equal amounts. [Titchener, 1897, pp.
79-80]

This sort of experiment, begun by Weber (1834) and refined and expanded by Fechner (1860), led to Fechner's law (mistakenly called Weber's law by Titchener and Fechner; compare Gregory, 1981, pp. 500-505). Here was a law and a procedure for testing and refining it. Does the law hold for all sensation? What are the values of the constants of proportionality for each case? Work on these matters proceeded apace.

But this apparent bright spot in the program proved to be an Achilles' heel. Although the analysis of consciousness languished, the fact was that there were no generally accepted or clearly articulated canons for the evaluation of the structural-analytic explanations envisioned above for complex mental processes like anger, especially as applied to consciousness. Hence there was no way to determine whether the trouble was merely practical or deeply conceptual. On the other hand, there did exist well-articulated and generally accepted canons for the evaluation of the sort of explanations Fechner's law was used to construct, since the idea is to explain an event in consciousness--e.g., a discernible difference in loudness--as an effect of an external cause--a change in stimulus strength. Critics knew how to hunt down and articulate problems with this sort of paradigm, and they did so.

We can bring out the problem in a few lines. The canon requiring independent access to causes and effects applies to the Weber-Fechner experiments as follows. Suppose we find a subject whose responses don't fit the law. Is the subject (A) misdescribing his/her sensations, or (B) psychologically idiosyncratic? For that matter, how do we know that a subject responding normally is not in fact (C) psychologically idiosyncratic but systematically misdescribing experience, rather than (D) psychologically normal? We cannot compare a subject's descriptions with what is supposed to be described.
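The two Titchener passages quoted above can be made concrete with a short numerical sketch. It is only an illustration under stated assumptions: the Weber fraction of 30/90 is read off Titchener's own example, the function names are hypothetical, and the cumulative normal curve standing in for the 'law of probability' is an assumption about the error model that the quoted text does not spell out.

```python
import math
from statistics import NormalDist


def jnd_ladder(start_height_cm, weber_fraction, steps):
    """Heights of fall that are successively just noticeably louder.

    Assumes Weber's relation: each just noticeable step adds a fixed
    fraction of the current stimulus, not a fixed number of centimetres.
    """
    heights = [float(start_height_cm)]
    for _ in range(steps):
        heights.append(heights[-1] * (1 + weber_fraction))
    return heights


def sensation_level(height_cm, start_height_cm, weber_fraction):
    """Fechner-style scale: number of just noticeable steps above the start.

    The starting stimulus is assigned level 1, as in Titchener's example.
    """
    return 1 + math.log(height_cm / start_height_cm) / math.log(1 + weber_fraction)


def difference_scaling(p_observed, p_target=0.80):
    """Factor by which a stimulus difference must grow to reach p_target.

    Assumption (not from the text): the proportion of right answers follows
    a cumulative normal curve in the size of the stimulus difference.
    """
    z = NormalDist().inv_cdf
    return z(p_target) / z(p_observed)


if __name__ == "__main__":
    # Titchener's example: 90 cm, then 120 cm, so the fraction is 30/90.
    print([round(h, 1) for h in jnd_ladder(90, 30 / 90, 3)])
    print(round(sensation_level(160, 90, 30 / 90), 3))
    print(round(difference_scaling(0.70), 2))
```

Under these assumptions the ladder runs 90, 120, 160 cm and so on, which agrees with Titchener's remark that more than 30 cm must be added to 120 cm to reach intensity 3. A difference that yields 70 right answers in 100 would have to be roughly 1.6 times larger to yield 80. Note that `sensation_level` is computed entirely from stimulus values: nothing in the sketch gives independent access to the sensation itself.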
Our only access to the sensations of the subject is through (i) inference from established connections with responses, including verbal reports, and (ii) inference from established connections with stimuli. Obviously we cannot establish connections of the sort required unless we can, at least sometimes, distinguish (A) from (B) and (C) from (D). But we cannot make these distinctions unless we can establish the connections. Since the accuracy of introspective observation cannot be checked, it seems it cannot play the role required of scientific observation. With introspection ruled out of court, the only way to measure sensation intensity is to measure stimulus intensity and then calculate sensation intensity using Fechner's law, but if we do this, we are using Fechner's law to define sensation intensity, and we cannot then turn around and pretend to explain the intensity of a sensation by appeal to stimulus intensity and Fechner's law. Once introspection is disqualified, we have no access to sensation intensity other than the very law that is supposed to explain it.

Behaviorism. Thus it was that introspectionists became vulnerable to a powerful methodological attack. The methodology of explanation by causal subsumption had been well entrenched by Bacon, Berkeley, Hume, Mach, and Mill. The empiricist doctrine was (and is) that causal laws have no explanatory power unless the causes and effects they subsume are knowable independently of each other. It is ironic that this doctrine was so fatal to introspectionism, for introspection was held to be the only avenue to noninferential knowledge by the very empiricist philosophers who developed the line of argument that killed introspectionist psychology. Locke's inverted spectrum problem returned to haunt those who attempted to pick up where Book Two of the Essay Concerning Human Understanding left off.
The inverted spectrum problem turned on the conceptualist assumption that linguistic training would inevitably disguise sufficiently systematic "psychological" differences. This is just the possibility raised by (A) and (C). Whatever we may think of this critique, the mere certainty that it could be formulated was eventually enough to kill any psychology based on introspection. And just as Berkeley's critique of representational realism seemed to point inevitably to a single alternative (if we know about tables and can know only about ideas, then tables are ideas), so this critique of introspectionism seemed to point inevitably to a single alternative. Introspection isn't genuine observation, so the Weber-Fechner law cannot be about consciousness. It is obviously about something, though, for experimenters certainly observe and record something in the sort of experiment described by Titchener in the passages above. What? Since experimenters record what their subjects say, the Weber-Fechner law must correlate stimuli with "introspective behavior." This is verbal behavior, per accidens, in the usual experimental set-up, but button pressings would do just as well. This bit of diagnostics would probably have sufficed to produce behaviorism eventually, but a number of other factors conspired to guarantee a quick takeoff. Two, I think, are especially worthy of note. First, pragmatism was in vogue in United States philosophical circles, and pragmatists emphasized the importance of understanding connections with action to the understanding of traditional philosophical problems involving mental states and processes. According to Dewey, for instance, the central mistake of empiricists and rationalists alike is the supposition that knowledge and belief can be understood independently of action, and treated as antecedent conditions to be investigated in their own right. Refuting this supposition was always a central theme in Dewey's writings. 
It isn't a long step from this to the doctrine that talk of mental states is just shorthand for talk of identifiable behavioral patterns. John Watson, the founder of behaviorism, was a student at Chicago at a time when the influence of Dewey's pragmatism was very strong there.

The other significant factor was Pavlov's discovery of stimulus substitution. This was an important discovery in its own right, but in the intellectual climate we have been describing, it had a special significance, for it seems to account for the sort of phenomenon generally attributed to the association of ideas, without recourse to ideas. Someone taken with the empiricist critique of introspection, and the pragmatist treatment of doxastic states, could hardly have failed to conclude that Pavlov had shown that it was stimuli and responses, not ideas, that were associated. This had to seem a major breakthrough, for association was the only principle of learning on hand.

Analysis in the Behaviorism of Watson. By itself, stimulus substitution has no chance of explaining complex behavior, or the introduction of new responses. Pavlovian conditioning can simply link new stimuli to responses already in the organism's repertoire, and experimentation quickly revealed that the stimuli and responses involved had to be rather short and simple. The new Pavlovian principle of association, though experimentally demonstrable, seemed to be no more explanatory than the old version.

Watson brightened this dim scene with a simple strategy: analyze an extended behavioral pattern into a sequence of responses to stimuli produced by execution of the previous response. Consider playing a tune from memory on the piano. Initially we have a set of connections between perceiving a written note (stimulus) and striking the appropriate key. Now striking a particular key produces a corresponding stimulus--visual and kinesthetic--that always immediately precedes striking the next key specified by the score.
Thus repetitious playing from the score should produce stimulus substitution--perception of a previous response substituting for perception of the next note in the score. When substitution is complete, the score will be unnecessary.

This analysis fails for a number of rather boring reasons--e.g., it runs afoul of the fact that people can play more than one tune from memory without difficulty even though the different tunes share notes. But the strategy was promising and exciting because the problem of explaining acquisition and exercise of a complex behavioral capacity is reduced to the problem of analyzing the capacity into simple antecedently explained capacities. A compellingly general picture of psychological change emerges. An organism begins with a genetically determined endowment of S-R connections, some of which, perhaps, emerge only at certain maturation stages. This basic endowment is expanded via stimulus substitution. The resulting connections are then combined in more or less complex sequences to yield an organism with a respectably complex behavioral repertoire.

This picture is still the operative picture underlying behaviorist psychology. The shifts have been shifts in detail. First, contemporary behaviorism is far more liberal about genetic endowment than was Watson--the tabula isn't nearly so rasa. Second, and more important, classical Pavlovian conditioning is supplemented by operant conditioning. Since operant conditioning builds on emitted behavior rather than on preexisting S-R connections, this change affects the assumptions about genetic endowment. Also, since operant conditioning produces "shaping," contemporary behaviorism has a source of novel, unanalyzable, behaviors available as building blocks. But the basic Watsonian picture remains: significant psychological change is the result of composition of antecedently explained (via genetics or shaping) behaviors.
Hence, psychological explanation must proceed by analyzing observed behaviors into more tractable components. Thus it is analysis, not subsumption under causal law, that is the central explanatory strategy of behaviorism.

Watson conceived of an organism as a transducer the input-output characteristics of which could be altered over time. To characterize the organism psychologically at a moment in time is to specify the current input-output properties--a bundle of S-R connections. The goal of psychological theory is to specify transition laws that subsume successive changes in momentary input-output properties--e.g., the law of stimulus substitution. It is therefore ironic that Watson never introduced a single principle of this type. Instead, his major achievement was the introduction of the analytical strategy into behaviorism in the guise of the response chain.

Watson was fond of saying that the point of scientific psychology is the prediction and control of behavior. But Watson's analysis of habit, even had it been sound, would not have increased the power of psychology to predict or control responses at all, though it would have greatly increased its explanatory power. For example, the problem about playing a tune from memory was not that it was unpredictable: whether or not a subject could do this was predictable from the amount of practice. Watson's analysis did not alter this situation at all, for Watson did not isolate a stimulus, or stimulus history, that has as response playing a tune from memory. What Watson did was explain the capacity to play a tune from memory by analyzing it into antecedently understood (or anyway antecedently present) capacities of the organism. This in turn allowed Watson to describe (inaccurately as things turned out) the conditions under which this capacity would be acquired, but these were already known, and, in any case, this is not predicting a response.
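Watson's response-chain analysis of playing a tune from memory, and the shared-note objection to it, can be sketched in a few lines. This is a toy model, not Watson's own formulation: the function names are hypothetical, notes are plain strings, and each S-R connection is assumed to link exactly one perceived note to exactly one next key press.

```python
def learn_chain(tune, chain=None):
    """Add S-R links for one tune: perceiving a struck note cues the next.

    Each link maps the stimulus produced by the previous response to the
    next response. A note can cue only one successor, so a later link for
    the same note overwrites an earlier one.
    """
    chain = {} if chain is None else dict(chain)
    for previous_note, next_note in zip(tune, tune[1:]):
        chain[previous_note] = next_note
    return chain


def play_from_memory(chain, first_note, length):
    """Run the chain: each response supplies the stimulus for the next."""
    notes = [first_note]
    while len(notes) < length and notes[-1] in chain:
        notes.append(chain[notes[-1]])
    return notes


if __name__ == "__main__":
    tune_a = ["C", "D", "E", "F"]
    tune_b = ["G", "D", "B", "A"]  # shares the note D with tune_a

    only_a = learn_chain(tune_a)
    print(play_from_memory(only_a, "C", 4))  # tune_a is reproduced

    both = learn_chain(tune_b, learn_chain(tune_a))
    print(play_from_memory(both, "C", 4))  # derails at the shared note
```

With a single tune whose notes are all distinct, the chain reproduces the tune without the score. Once a second tune sharing a note is learned, the link from that note is ambiguous and playback of the first tune derails, which is the failure noted above. The same thing happens within one tune whenever a note repeats.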
Watson's presentation of his analysis of habit formation in Behaviorism (1924) is introduced as a response to his discussion of the sort of learning Thorndike studied, which we now think of as operant conditioning.

... Let us put in front of the three-year-old child, whose habits of manipulation are well established, a problem box--a box that can be opened only after a certain thing has been done.... Before we hand it to him, we show him the open box containing several small pieces of candy and then we close it and tell him that if he opens it he may have a piece of candy. ... Let us suppose that he has 50 learned and unlearned separate responses at his command. At one time or another during his first attempt to open the box, let us assume that he displays, as he will, nearly all of them before he pushes the button hard enough to release the catch. The time the whole process takes, we will say, is about twenty minutes. When he opens it, we give him his bit of candy, close the box and hand it to him again. The next time he makes fewer movements; the third time fewer still. In 10 trials or less he can open the box without making a useless movement and he can open it in two seconds. Why is the time cut down, and why do movements not necessary to the solution gradually drop out of the series? This has been a hard problem to solve because no one has ever simplified the problem enough really to bring experimental technique to bear upon it. [p. 204]

This is not even prima facie a problem of prediction and control; it is a problem of explanation. It isn't at all clear how Watson's analysis is supposed to help with this particular problem, but it is quite clear that we have a capacity that wants explaining, not a response that wants predicting, and that the explanatory strategy employed is analysis, not subsumption. The learning curves obtained by Thorndike and others specify a capacity of organisms.
It was this capacity that Watson sought to explain by analyzing it into the capacity for stimulus substitution and the antecedently available capacities characterized by the organism's pretrial S-R connections.

Behaviorism eventually came to grief because it had no resources to explain the acquisition of novel responses. Pavlovian conditioning attaches new stimuli to old responses, but introduces no new responses. Operant conditioning alters the probability that a given behavior in the repertoire will be emitted, but doesn't add to the repertoire. It is one of the great ironies of the history of science that behaviorism, which identified psychology with learning theory, was ultimately unable to accommodate the principled acquisition of novel behavior--learning, in short.

There is a far more fundamental problem with behaviorism, however. Even if the behaviorist program had succeeded, it would have, at best, specified our psychological capacities; it wouldn't have explained them. To see this, one has only to note that behaviorism has, in principle, no resources to explain why an organism can be conditioned, or why some schedules of reinforcement work better than others. One can say which laws of conditioning characterize which types of organism, but one cannot say why. This is no accident, of course. Behaviorism seeks to avoid the problem about observing the mind by eliminating the mind from psychology. But it is breathtakingly obvious that we are conditionable, that we can learn language, that we can recognize objects, and so on and so on, because of the way our minds work. Introspection does tell us that much. It just doesn't tell us how our minds work.

So, we are back where we started: you cannot have a science of the mind unless you can observe the mental, and the only way to do that appears to be introspection. But introspection cannot be calibrated, and is maddeningly passive and hence unsuitable as an analytical tool.
Behaviorism got around this by banishing the mind from psychology, but most of the interesting questions were banished along with it.

Inference to unobservables. Suppose we admit that most of the mind is unobservable: why is that a problem? After all, science is full of unobservables. Why cannot the machinery of the mind be inferred from its observable manifestations in the way that genes were inferred from inheritance patterns?

Here, it might seem, the materialist has a real advantage, for, if minds are physical systems, then mental mechanisms are physical mechanisms. Our lately imagined chemist was able to reason cogently about the pros and cons of the burr versus hook-and-eye accounts of chemical bonding because burrs and hooks-and-eyes were antecedently well-understood physical mechanisms. But where is the immaterialist going to find a stock of well-understood mental mechanisms which might be postulated to explain observable psychological phenomena?

But, once again, the apparent advantage enjoyed by the materialist intent on a science of the mind is undermined by Leibniz' Gap. For, while there are lots of well-understood physical mechanisms for sticking things together, there are none for producing thoughts, feelings or perceptions or sticking them together. Or rather, there are only two, namely inference and association.

Philosophers and scientists have always followed common sense in explaining the acquisition of certain beliefs and desires by appeal to inference. People are planners, and planning requires inferring what the world will be like when one comes to the time and place of action, inferring what changes one's contemplated actions will make in the world, and inferring what sub-goals one must achieve in order to achieve the goal of the plan. The inferentially mediated interplay of belief, desire and intention that is familiar to common sense seems capable of explaining a vast amount of the kind of behavior that is characteristic of thinking beings.
Leibniz, Helmholtz and Freud extended this idea to other phenomena (notably perception and affect) by allowing for unconscious inference as well as conscious inference. But it remained a conceptual mystery how a physical device could be an inference engine, and attributing inferential powers to an immaterial mind simply made the mystery metaphysical. Thus inference, though a powerful explanatory process, was itself left unexplained. Association fared no better: though a variety of phenomena--especially memory phenomena--could be explained by appeal to principles of association, there was no prospect of explaining association itself.

Computationalism. Computationalism provides a way around this impasse by proposing that mental processes are computational processes, i.e., by proposing that the mind is a computer, and that inference and association are just two, albeit important, computational processes endogenous to the mind.

Functionalism. Computationalism is a species of functionalism. Functionalism is best seen as a proposed solution to the problem posed by Leibniz' Gap. The central idea is that mental concepts specify their instances in terms of what they do--in terms of their functions--rather than in terms of their intrinsic structures. Doorstop, valve-lifter, mouse trap, can opener, pump, and calculator are all functional concepts. A great variety of different physical structures can be pumps: hearts, propeller and case, vibrator and one-way valve, centrifuges, piston-and-sleeve arrangements. What they have in common is a function: to pump. Since they do not have a physical composition in common, you cannot reduce being a pump to having a certain physical structure. Yet certain physical structures are sufficient for pumping; nothing non-physical is required. Functionalism in the philosophy of mind is the proposal that the problem imagined by Leibniz arises because one cannot, in general, read off function from form.
Wandering through the mill-sized mind is not enough, as Leibniz pointed out. But, according to functionalism, what is missing is not an immaterial soul but a functional analysis of the mill and its component structures and processes. Wandering through an expanded engine, you would not, simply by looking, realize that the cam-shaft is a valve lifter, or that the things it moves are valves. Of course, most of us know enough simple push and pull mechanics that we could make some shrewd guesses. But a comparable experiment with the micro-chip at the heart of a calculator or computer would leave most of us on the wrong side of a Leibnizian Gap.

Computationalism--the idea that the mind is what Haugeland calls an automatic formal system--is the functionalist proposal that mental capacities can be analyzed and explained as complex computational processes. This idea was given a huge impetus by the dual discovery that inference could be treated as a computational process, and that that process could be instantiated in a machine. Psychology's oldest and most powerful explanatory primitive was finally given a materialistic explanation, though not yet a biological one. Association proved relatively easy to implement computationally and found its most important home in semantic networks.

Computationalism requires some fundamental enabling assumptions to turn it into a serious research program. The first of these is that the mind is fundamentally an engine of thought. Descartes held that the essence of mind is thought, and Locke that the essence of mind is the capacity for thought. We think of this as the Mr. Spock assumption: a thinking engine with no other mental characteristics would still count as a mind, but a system with emotions, sensations and other non-cognitive mental processes that did not think (assuming this is even possible) would not count as a mind.
Computationalism proposes to follow Descartes in the assumption that the place to start in understanding the mind is thought or cognition. Other aspects of mentality can be added on later, much as terms for friction and air resistance can be added to the basic pendulum equation once it is articulated and understood. The second fundamental assumption grounding computationalism is that thought does not require consciousness. This assumption, along with the first one, allows computationalists to put off the explanation of consciousness until some future time.

Given these assumptions, computationalism looked like an attractive research program for psychology. The idea that the mind is essentially a functionally specified computational process running on the brain provides a bridge over Leibniz' Gap (functionalism), a supply of mental mechanisms with precisely specified properties (anything you can program), and medium independence: the possibility that thought can exist in a non-biological computer, and hence can be investigated in the computer lab as well as in the psychological lab. It was a powerful vision. And though it shows signs of fading today, it was, and in some respects continues to be, a hugely prolific vision, fueling the initial birth and development of what came to be called Cognitive Science.

Introduction: The mind as neural network

Top Down vs. Bottom Up.

Top Down. Computationalism is what is called a "top down" strategy. In the hands of the computationalists, that strategy, classically characterized by Marr in the last section, begins by identifying a task or capacity to be explained--i.e., with the explanandum (the thing to be explained): the capacity to learn a language, or converse, or solve a problem, etc. It then attempts to specify that capacity as a function or relation: What inputs produce what outputs under what circumstances?
Finally, that characteristic function or relation is analyzed into components that have known computational realizations. (In practice, this means analysis into components that can be programmed in LISP or some other standard programming language.) This strategy involves three assumptions and a precondition that are worth noting.

1. One underlying assumption of this approach is that cognitive functions are computable. This is actually a rather strong and daring assumption. Most dynamical systems found in nature cannot be characterized by equations that specify a computable function. Even three bodies moving in Newtonian space do not satisfy this assumption. It is very much an open question whether the processes in the brain that subserve cognition can be characterized as the computation of a computable function.

2. Another underlying assumption of top-down computationalism as it is usually characterized (and as we have just characterized it) is that cognitive capacities can be specified independently of their realizations. But this is pretty patently false: there is no input-output function the computation of which would constitute playing intelligent chess. Or rather, there are a great many. Think of a chess system as a move generator, i.e., as a function from board positions (the current position) to board positions (the position after the move). In a given situation, intelligent chess players might make any number of different moves. Indeed, the same player might make different moves on different occasions. In practice, then, the only way to specify a chess function is to actually write an algorithm for computing it. We cannot, in general, expect to specify a cognitive function before we analyze and implement it.

3. A third underlying assumption of the top-down strategy, closely related to the second assumption, is that we will be able to recognize and characterize the relevant inputs and behaviors antecedently to serious attempts to explain how the latter are computed from the former.
Here the difficulty is that pre-analytic conceptions of behavior and its causes may seriously misrepresent or distort what is actually going on. Connectionists often complain that there is no reason to think that cognition in the brain is the manipulation of representations that correspond to our ordinary concepts. Top-down strategists therefore run the risk of characterizing the explananda in terms that cross-cut or distort the categories that are actually causally relevant. This is a common fallacy in biology, where there is an almost irresistible temptation to believe that the morphological traits of importance and interest to us must correspond to our genes in some neat way. Computationalists are wont to reply that what Daniel Dennett calls the intentional strategy--explaining behavior in terms of beliefs, desires and intentions--is enormously successful, and hence that it cannot be fundamentally wrong to characterize cognition in something like these commonsense terms.

So much for the assumptions. Now for the precondition: a successful application of the top-down strategy requires that the target explanandum can be analyzed. Everyone who has ever tried their hand at programming is familiar with this constraint. You cannot write a program that computes bids in bridge, or computes square roots, if you do not know how to compute bids in bridge or compute square roots. But many psychological capacities are interesting explananda precisely because we have no idea how the task is done. This is why artificial intelligence plays such a central role in computationalism. It requires very considerable ingenuity to discover a way--any way--to construct 3D specifications of visual space from retinal images, or to make it happen that two short sessions on many problems are more effective than one long one. But even with success, there is a problem: having figured out a way to compute a cognitive function, what reason is there to think that that is how our brains do the job?
We do not mean to suggest that there is no way of addressing this problem, only that it is a problem that is bound to arise in a top-down framework. Computationalists are thus inevitably left with a narrowed but still substantial Leibnizian Gap: the gap between a computational description of psychological processes and a bio-neural description of the processes in the brain.

Before we leave the topic of underlying assumptions and enabling conditions, it is worth pausing to note that some of the central enabling assumptions of computationalism are shared by connectionism. Both assume the possibility of Spock--i.e., that the mind is basically a cognitive engine and only secondarily a seat of emotion, feeling and sensation. Both assume that consciousness is inessential to the understanding of cognition. And both assume that cognition doesn't require a biological brain, let alone an immaterial soul. Both are thoroughly functionalist and materialist. And both are representationalist in that both assume that cognition is to be understood as disciplined transformation over states whose primary function is the representation of information relevant to the cognitive capacity being exercised. The differences that divide computationalist and connectionist are practically invisible against the scale that measures the distance between both and behaviorism or structuralism.

Bottom Up. The top down strategy is explanandum driven: you begin with a capacity to explain, and try to find a computational architecture that will have it. The bottom up strategy is explanans (the explainer) driven: you start with a specification of the architecture, and try to find a way to make it do the task.1 What connectionists have in common is the assumption that cognitive capacities are built out of a stock of primitive processes designed explicitly to be rather brain-like.
They begin with the building blocks of a simplified and idealized brain, and attempt to create systems that will behave in a recognizably cognitive way. The connectionist thus seeks to narrow the Leibnizian Gap even further to that between a genuine bio-neural description of the brain, and the simplified and idealized "neural networks" that are their stock in trade.

Footnote 1. In practice, most computationalists are actually bottom-uppers to some extent. This is because, as a graduate student, you apprentice in a research group that is more or less committed to a given architecture, and your job is to extend this approach to some new capacity. It is just as well: pure top-downism, as described by Marr, is probably impossible. Computationalist architectures, however, are not well-grounded in the brain, so the problem just rehearsed remains.

But a much narrowed Gap is not the only payoff. As it happens, it is possible to program connectionist networks to do tasks that the programmer does not know how to do. All that is required is a sufficiently representative "training set": a set of inputs paired with their correct responses. Thus the Precondition of top-down computationalism, discussed above, can be avoided. You can program a network to do a task you haven't the faintest idea how to do. There is a downside to this, however: once you have trained a network, you may still have little if any idea how it does the task. Studying an artificial network is, of course, much easier than studying a living brain, so you are still substantially ahead. But you are not home free. Moreover, it is seldom noticed that one of the lately discussed assumptions required by the top down approach is also required by bottom-uppers. Training sets must be specified somehow, and the problem of how to conceptualize inputs and behaviors is no easier for connectionists than it is for top-down computationalists.
While connectionists need not assume that networks operate on internal representations that correspond to ordinary common-sense concepts, they are no better off than top-down computationalists when it comes to conceptualizing the target explananda.

The Architecture. A connectionist architecture is a network of simple units (see figure 1). Each unit has an arbitrary number of inputs and outputs. Inputs and outputs are not messages, but simply quantities of activation. Inputs may be positive (activation) or negative (inhibition), but the unit that receives them does not distinguish them from each other in any way; the activations from each input are simply summed to yield a net input. Thus, there is no difference between three inputs of one unit of activation, and one input of three units of activation; these have precisely the same effect. Each output at a given time is the same as every other from that unit, and is a function of the total input. That function, called the activation function, is characteristic of the unit and is assumed to be fixed.2

Units are connected by weighted connections (see figure 2). The output from the source unit is multiplied by the weight of the connection to yield the input to the receiving unit. Weights may change as a function of activation spreading from unit to unit, or may be altered externally by the programmer or some automatic process. Inputs to a network are given by setting activations on some pool of units. The input pool may be any designated set of units including all of them. Activation is then spread through the network, each unit computing its output from its input, each output being modified by the relevant connection weight. Outputs are read as the pattern of activation on some set of units designated as the output pool. Output might be the pattern of activation on the output pool after a given time, or when the network reaches equilibrium, or when some other criterion is met.

Footnote 2. A typical activation function is a logistic function: $\text{output activation} = \frac{1}{1 + e^{-(\text{total input})}}$. This makes output a near linear function of input and keeps total output bounded without introducing discontinuities.

Networks are of two basic types (see figure 3). Feed-forward networks are organized into layers. Each unit in a given layer may connect to other units in that layer (including itself) or to units in the next layer "downstream", but may not connect to any unit that is "upstream". In recurrent networks, any unit may connect to any other. Various patterns of connectivity can produce hybrids of feed-forward and recurrence in the obvious ways.

The performance of a network is altered by modifying the connection weights. This is typically called learning, but the most important techniques for modifying weights are best thought of as programming techniques, since they are the result of some agent external to the network modifying the weights in response to some feature of performance. These techniques are generally called "supervised learning" techniques to distinguish them from weight change that results automatically as a byproduct of activation flow and hence can be conceptualized as the result of processes intrinsic to the connections themselves. The operation of these processes is typically referred to as "unsupervised learning".

Backpropagation. By far the most important form of "supervised learning" is back propagation, a technique that applies straightforwardly only to feed-forward networks. Backpropagation operates as follows. First, an error for each output unit is computed by taking the difference between the actual activation of that unit on a given input, and the activation it should have (read from the training set). This error is then used to compute a modification of the weights on the unit's input connections. If the activation was too high, each weight is slightly decreased; if it was too low, each weight is slightly increased.
This is done for each output unit. To compute weight changes for connections leading to units in intermediate layers, the correct activation values for those units must be estimated, since these are not given by the training set. The details of the estimation process need not concern us here, but once it is carried out, changes to the weights are computed as before.3 In practice, back propagation is virtually guaranteed to eventually set a network's weights correctly if there is any set of weights that will work at all.

Footnote 3. The error for each unit in the output layer is given by: $\text{error} = (\text{target value} - \text{actual value})\, f'(\text{total input})$, where $f'$ is the first derivative of the activation function with respect to change in input. For units in the intermediate layers, the error is given by: $\text{error}_i = f'(\text{total input}_i) \sum_k \text{error}_k\, w_{ki}$, where the summation term sums the products of the error at the kth unit in the layer downstream and the weight between that unit and the one whose error is being computed (unit i). Weight change is then simply: $\Delta w_{ij} = \eta\, (\text{error}_i)(\text{activation}_j)$, i.e., the change in weight between unit i and unit j is a constant (the "learning rate") times the error at i, the receiving unit, times the activation at j, the unit that sends activation to i.

Simplification and idealization. Connectionist networks evidently don't begin to incorporate everything that is known about neurons and their synaptic connections. Why not? Two reasons might be given. The first is mathematical tractability. Connectionist networks are mathematically straightforward, whereas actual neural networks are not. The second and more important reason for the relative simplicity of connectionist networks is methodological. It is, perhaps, best to begin with a simple and well-understood model, adding complications only when forced to do so to accommodate otherwise recalcitrant phenomena. In that way, we are sure to have some idea of what the extra complexity is actually for--what it makes possible.
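The architecture and the back propagation rule described above can be made concrete with a small sketch. What follows is only an illustration under stated assumptions, not a reconstruction of any particular published model: the network has a single hidden layer, every unit uses the logistic activation function of footnote 2, and the function names, the learning rate of 0.5, the initial weight range, and the single training pattern are all hypothetical choices made for the example.

```python
import math
import random

def logistic(total_input):
    """Logistic activation function: 1 / (1 + e^(-total input))."""
    return 1.0 / (1.0 + math.exp(-total_input))

def logistic_slope(output):
    """f'(total input) for the logistic, expressed via the unit's output."""
    return output * (1.0 - output)

def net_input(activations, weights):
    """Each incoming activation times its connection weight, summed."""
    return sum(a * w for a, w in zip(activations, weights))

def forward(inputs, hidden_weights, output_weights):
    """Spread activation from the input pool through one hidden layer."""
    hidden = [logistic(net_input(inputs, w)) for w in hidden_weights]
    outputs = [logistic(net_input(hidden, w)) for w in output_weights]
    return hidden, outputs

def backprop_step(inputs, targets, hidden_weights, output_weights, rate=0.5):
    """One weight update on one training pair; weights are changed in place."""
    hidden, outputs = forward(inputs, hidden_weights, output_weights)
    # Output layer: error = (target - actual) * f'(total input).
    output_errors = [(t - o) * logistic_slope(o)
                     for t, o in zip(targets, outputs)]
    # Hidden layer: error_i = f'(total input_i) * sum over k of error_k * w_ki.
    hidden_errors = [
        logistic_slope(h) * sum(output_errors[k] * output_weights[k][i]
                                for k in range(len(output_weights)))
        for i, h in enumerate(hidden)
    ]
    # Weight change = rate * error at receiving unit * activation at sender.
    for k, row in enumerate(output_weights):
        for i in range(len(row)):
            row[i] += rate * output_errors[k] * hidden[i]
    for i, row in enumerate(hidden_weights):
        for j in range(len(row)):
            row[j] += rate * hidden_errors[i] * inputs[j]
    return outputs

def train_single_pattern(steps=2000, seed=0):
    """Train a 2-2-1 network on one hypothetical input/target pair."""
    rng = random.Random(seed)
    hidden_weights = [[rng.uniform(-0.5, 0.5) for _ in range(2)]
                      for _ in range(2)]
    output_weights = [[rng.uniform(-0.5, 0.5) for _ in range(2)]]
    inputs, targets = [1.0, 0.0], [0.8]
    for _ in range(steps):
        backprop_step(inputs, targets, hidden_weights, output_weights)
    _, outputs = forward(inputs, hidden_weights, output_weights)
    return outputs[0]
```

Note that `net_input` treats three inputs of one unit of activation exactly like one input of three units, as the description of the architecture requires. The sketch trains on a single pattern only to keep the example short; a realistic training set would contain many input-response pairs.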
The underlying hope is that connectionist networks will turn out to be idealized neural networks.

Representation. Information is stored in two ways in a connectionist network: fleetingly, in the activation patterns that appear and disappear in various pools of units, and, more enduringly, in the pattern of weights. In both of these cases, we find something very different from the quasi-linguistic representations that make up the data structures of computationalist models. It is precisely because connectionist representations do not seem to correspond in any neat way to the linguistic expressions we use to articulate the contents of beliefs, desires and intentions that controversy has arisen as to whether connectionist models can explain cognitive capacities as they are ordinarily specified.

A little-noticed aspect of this issue has to do with the fact that the typical connectionist network actually computes a function which is a huge superset of the function it is designed to compute. Thus NetTalk is said to compute a Wickel-feature representation of the phonetic value of the middle letter in a seven letter string. And so it does. But consider how inputs are represented (see figure 4). The assumption is that every unit is either on or off, and that only one unit is on for each position. But the network will happily compute outputs on input configurations that violate either or both of these constraints, i.e., on inputs that have no canonical interpretation in the intended task domain at all. Since the network is designed so that every possible output represents some phoneme or other, each of these maverick inputs will be mapped onto a phoneme. How should we react to this? Should we say that NetTalk doesn't really do the job it was intended to do? Or shall we make a virtue of necessity and hypothesize that something comparable would happen in the brain if maverick stimuli were introduced, perhaps by electrode?
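The point about maverick inputs can be illustrated with a toy encoding. This is a sketch under assumptions of our own and is not NetTalk: the 27-symbol alphabet, the weights, and the function names are hypothetical, and a single logistic output unit stands in for a whole trained network. It shows only that a unit defined by summation and a fixed activation function returns a value for any pattern of input activation, whether or not that pattern satisfies the one-unit-on-per-position constraint.

```python
import math

# Hypothetical alphabet: 26 letters plus "_" for a blank position.
ALPHABET = "abcdefghijklmnopqrstuvwxyz_"
WINDOW = 7  # seven letter positions, as in the text

def encode_window(letters):
    """Canonical input: one block of units per position, one unit on per block."""
    if len(letters) != WINDOW:
        raise ValueError("expected exactly seven symbols")
    vector = []
    for symbol in letters:
        block = [0.0] * len(ALPHABET)
        block[ALPHABET.index(symbol)] = 1.0  # ValueError if symbol is unknown
        vector.extend(block)
    return vector

def is_canonical(vector):
    """True if every unit is on or off and each position has exactly one unit on."""
    size = len(ALPHABET)
    if len(vector) != WINDOW * size:
        return False
    blocks = [vector[p * size:(p + 1) * size] for p in range(WINDOW)]
    return all(
        all(v in (0.0, 1.0) for v in block) and sum(block) == 1.0
        for block in blocks
    )

def toy_output_unit(vector, weights):
    """A single logistic output unit; defined for any real-valued input vector."""
    total = sum(a * w for a, w in zip(vector, weights))
    return 1.0 / (1.0 + math.exp(-total))

# Arbitrary illustrative weights, not taken from any trained network.
WEIGHTS = [((i % 5) - 2) * 0.1 for i in range(WINDOW * len(ALPHABET))]

canonical = encode_window("network")
# Every unit half on: a pattern with no interpretation in the task domain.
maverick = [0.5] * (WINDOW * len(ALPHABET))
```

Here `toy_output_unit(maverick, WEIGHTS)` returns an activation just as readily as `toy_output_unit(canonical, WEIGHTS)` does, whereas the symbolic front end `encode_window` rejects a string it cannot interpret with an error.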
In computationalist systems, maverick inputs generally generate error messages or crashes. Is this a virtue or a vice?

Graceful Degradation. Closely related to this issue is a much touted property of connectionist networks. "Graceful degradation" refers to the fact that degraded input, lesioned connections, or damaged units typically do not bankrupt a network, but lead to impaired but interpretable performance, with the degree of impairment depending on the degree of damage. As any programmer knows, damage to a computational system is generally catastrophic, yielding an error, a crash, or nonsense rather than impaired but interpretable performance. The fact that stroke and other kinds of damage to the brain often (but not always!) lead similarly to degraded performance rather than total dysfunction appears to be a factor strongly in favor of the connectionist approach. If the brain implements an orthodox computational system of the sort envisaged in the previous section, brain damage ought to uniformly produce crashes, not degraded performance.

Real Time and the Hundred Step Rule. Feldman and Ballard (19~~) point out that many basic cognitive functions are performed in 500 milliseconds or less. Since it takes, on average, about 5 milliseconds for one neuron to communicate with another in the brain, it follows that the brain must accomplish these tasks in about 100 sequential computational steps (500 milliseconds at 5 milliseconds per step). Typical orthodox computationalist programs take thousands. The obvious inference is that these programs err in one or more of the following ways:

• They are too sequential; parallel computation requires fewer sequential steps.
• They utilize the wrong primitives: more sophisticated primitive operations allow for more speed, but less flexibility.
• They are digital rather than analogue: a basic physical component whose physical state changes are given by a function analogous to (isomorphic to) a psychological function computes that function in a single step in the time it takes for a physical state change. The primitive computations in a digital computationalist model are never computations of a psychological function.

It is no use to reply to these points by pointing out that thousands of steps reduce to hundreds or fewer if we count each psychologically significant sub-routine, rather than the computational steps in the routine, as single steps. For this would require the assumption that each such sub-routine could be accomplished in a single (perhaps massively parallel) neural step. Computationalist models do not even address the question of how this might be possible; connectionist models address the question directly.

Spontaneous Generalization. Another often touted property of connectionist networks is spontaneous generalization. In the typical case, if a network is trained to respond to two activation vectors I1 and I2 with output activation vectors O1 and O2 respectively, the network will automatically respond to an input vector that is (geometrically) between I1 and I2 with an output vector that is between O1 and O2. This sort of interpolation is one of many kinds of generalization that appear in networks automatically and unbidden (whether it is wanted or not!). This is an example of a psychologically significant process that has to be specifically programmed in computationalist models, but appears in networks as a kind of side-effect, with no additional computational resources required. It seems perverse to assume that a network of neurons implements in some highly indirect way a computationalist algorithm for generalization. A phenomenon that occurs spontaneously in the neurons is obliterated in the implementation and artificially re-introduced as a complex computation!

Concluding Remarks.
The connectionist revolution has forever altered the once uniform landscape of Cognitive Science as it existed under the sway of the computationalist paradigm. It has not utterly supplanted computationalism by any means. And there are growing numbers who think that, ultimately, only actual neuroscience will give us any real insight into the mind. We have tried here to indicate briefly what connectionist networks are and why they are attractive. It has not been our intention to adjudicate this dispute, but simply to introduce one of the players.

Introduction: The Mind as Brain

Everyone who is not a dualist believes that mental processes are processes that go on in the brain. If one's goal is a science of the mind, however, observation of the brain seems to yield results on the wrong side of Leibniz' Gap. The Computationalist response to this problem is to try to understand cognitive processes in abstraction from the brain or any other "hardware" in which they might occur. The Computationalist strategy is to first articulate a computational theory of cognition, and then to inquire into how the implicated computational processes might be carried out in the brain. This strategy has some evident merits. Since no one doubts that computational processes can be physically realized, Computationalism is free from any dualist taint. Yet the problem of bridging Leibniz' Gap is conveniently put off until some future date when we will surely know more about both cognitive and neural processes. An evident drawback, however, is that there is no guarantee that cognitive processes are computational processes at all, let alone that cognition in biological brains will turn out to be the kind of process we are led to investigate by following a strictly top-down approach. Although that approach has had some notable successes, it has also had some notable failures.
It would not be unreasonable to conclude that the difficulties faced by Computationalism might be due to insufficient attention being paid to the only processes we know for sure are sufficient to subserve mentality in general, and cognition in particular, namely brain processes. Perhaps we should simply accept the fact that, as things currently stand, studying the brain puts us on the wrong side of Leibniz' Gap, but hope that, as our knowledge increases, the outlines of a bridge over the Gap will eventually appear.

Connectionists attempt to take a middle ground here, starting in the middle of the Gap, as it were, and trying simultaneously to bridge to either side. Most neuroscientists, it seems, are at least tolerant of the Connectionist strategy. But they are inclined to argue that connectionist models are such vastly oversimplified models of the brain as to be very likely misleading at best. If we are going to bridge Leibniz' Gap, we are going to have to know a great deal more about the brain than we do now. This much is agreed on all hands. So why not get on with it? And, since the brain is the only known organ of mentality, whether natural or artificial, it seems only sensible to begin by trying to understand how it works. Any other strategy arguably runs the risk of being a wild goose chase, an attempt to make mentality out of stuff that just isn't up to the job.

This line of argumentation has been around at least since the seventeenth century, but it had little practical consequence until relatively recently simply because there was no very good way to study the brain. Except for some nice details, simple dissection is enough to give a fairly complete structural picture of the heart. X-ray of the living heart pumping radioactively tagged blood, together with open chest and open heart surgery, are enough to give an equally complete picture of the functioning of the living heart and how that functioning supervenes on the heart's anatomical structure.
But nothing comparable will do for the brain. It is far too complicated. Its functional parts, the neurons, are incredibly small and numerous. The processes are subtle electro-chemical processes distributed over millions of cells. The scale is forbidding, both the vastness of the number of parts and processes, and the smallness of their physical size and duration.

Still, a variety of techniques have been available for some time. Dissection is by no means useless, and autopsy of those with damaged or diseased brains with known mental dysfunction has always been suggestive. The gross anatomy of the brain has been known since at least 200 BC. The anatomy of the neuron, along with some of its functional properties, began to emerge in the 19th C., aided by staining techniques and better microscopes, and the discovery of a neuron in the squid that is conveniently large. Synaptic junctions became "visible" in the 1950s with the use of the electron microscope. The ensuing decades ushered in increasingly ingenious uses of lesion studies, in which functional impairments are correlated with damage to specific brain areas. Commissurotomy (lesioning of the commissures connecting the right and left hemispheres) as a treatment for certain forms of severe epilepsy led to a mass of data and speculation concerning hemispherical specialization, as did the introduction of the Wada test, in which sodium amytal is injected into the left or right carotid artery resulting in "paralysis" of the corresponding hemisphere. Electrode stimulation and recording studies in both humans and animals became highly refined. Even more recently, a variety of non-invasive techniques have finally allowed us to begin to enter into the brain as into a mill, as Leibniz imagined.
EEG (electroencephalograph), ERP (event-related potential, an averaging of EEGs over many trials), CAT (computerized axial tomography), PET (positron emission tomography), and, most recently, MRI (magnetic resonance imaging) and fMRI (functional MRI), have made it possible to observe the brain and its processes at scales in both space and time that are beginning to make Leibniz' thought experiment a reality.

All of this technology makes the bottom-up strategy favored by many neuroscientists a serious scientific possibility rather than a mere philosophical platform. But as that platform becomes a practical reality, the vision that made Cognitive Science a reality for thirty years has, unquestionably, begun to dim. Bottom-uppers have no particular reason to focus on cognition--to assume, with Descartes, that discursive thought is the essence of the mental. Nor, consequently, have they any particular reason to think that cognitive phenomena form an autonomous domain whose principles might be articulated independently of both non-cognitive mental phenomena and of their particular physical realizations. That there can be scientific study of cognition, no one in the scientific community seriously doubts. But it is increasingly unclear whether there can be Cognitive Science as this was conceived by most computationalists and many connectionists. Computationalism as a research strategy requires, as we saw, enabling assumptions that make the possibility of a special science of cognition seem extremely plausible. Bottom-up neuroscience has no such implications.

Introduction: Special Topics

Our final section is a short collection of seminal papers that introduced four issues that have deeply affected the development of cognitive science. While many issues in cognitive science are parochial to one or another specialized sub-discipline, the ones to be discussed here--innateness, modularity, eliminativism, and (more recently) evolutionary psychology--are overarching.
No one actually involved in cognitive science can be indifferent to these issues, because they all have serious implications for each of the approaches to the mind that have underwritten scientific research on cognition.

Innateness

Perhaps more than any other, the issue of innateness divides rationalist and empiricist approaches to the science of cognition. Rationalist and empiricist alike believe that there are innate cognitive capacities. Everyone believes we are born with the capacity to learn. What divides rationalist from empiricist is the idea that there is innate knowledge: that substantive and contingent information about the environment is part of the genetic endowment of every biological cognizer. The central argument for this claim is very simple and goes back to Plato: certain things that are known could not be learned and hence must be innate. There are variations on this theme, having to do with why the alleged innate knowledge cannot be learned. There is not enough time; the knowledge is acquired before development of the (typically perceptual) capacities required to extract the information from the environment; the knowledge in question is required for any other learning to take place; the environment simply doesn't provide enough information. Looking over these variations, it is easy to see that they depend on fairly detailed views about how learning works, and about the developmental sequence. It is, of course, tempting to declare innate whatever cannot be accommodated by one's own stories about learning mechanisms and development. On the other hand, opponents of any particular nativist claim can be required to put up or shut up: if you think something is learned, it is incumbent on you to say how and when. Since it is enormously difficult to explain how anything is learned, this has proved a very effective strategy.
Computationalists are more likely to be nativists than connectionists simply because it is relatively easy to build a particular bit of innate knowledge into a computationalist model, and relatively hard to do anything comparable with a connectionist model. (Strict bottom-uppers can afford to be agnostic on this topic since, as things now stand, they do not have specific enough architectural models to have implications for how innate knowledge might be accommodated.) It is worth noting, however, that the issue of innateness has tended to be formulated by cognitive scientists in terms that have a comfortable home only in the computationalist framework. For only in that framework is it natural to think of long-term knowledge as something that comes in the form of the sort of discrete propositions typically expressed by a sentence in a natural or artificial language. In the 17th century it was much debated whether our knowledge of God's nature and existence is innate or acquired. That debate is instructive for at least two reasons. First, it is instructive because it presupposed some knowledge that many today would claim we do not have at all. But it is also instructive because it takes the issue to be which linguistically expressible propositions are innate. The issue of innateness as it was understood in the 17th century loses focus if we ask whether there is innate knowledge of how to do something, for this blurs the distinction between innate knowledge, which only rationalists accept, and innate capacities, which both rationalists and empiricists accept. The rationalist reply is typically that knowledge how depends on knowledge that. We know how to drive a car because we know that turning the wheel counter-clockwise moves the car to the left, and so on. But this reply is evidently self-defeating in its full generality. Rationalist and empiricist alike concede that there must be innate capacities to exploit the knowledge we have in learning, action and perception.
An encyclopedia cannot do anything. But there is a subtler issue here as well. Computationalists quickly became aware that a great deal of information could be implicit in the logical structure of a program. Production systems self-consciously blur the distinction between information and the processes that operate on it almost completely, with only working memory remaining as a repository of information untainted by process. Whatever long-term knowledge is represented in the weights of a connectionist system is not to be distinguished from the processes that operate on the short-term information represented in the activation vectors. Both parties, it seems, will have trouble with the distinction between knowledge and capacities that initially gave the issue its bite. Even if there is little left of the once important philosophical issue concerning innate ideas, there is still a great deal left to discover about what sort of information is implicit in our genetic neuro-architectural endowment, and just how it is "in there." And here the old arguments still have some force: Is there enough time? Is there enough information in the environment? What has to be presupposed in a system capable of extracting and exploiting that information?

Modularity

From time immemorial, people have divided the mind into "parts." Plato distinguished reason from appetite and will (##), and commonsense distinguishes perception from thought, and both from the emotions. Frequently, memory is distinguished from all of these as well. These distinctions are evidently functional. Though commonsense is unclear what the emotions are for, this scheme of things has its rationale in speculation about what it takes to get the job done. Perception takes in information, memory stores it, reason processes it, storing some and using some to direct action. The emotions provide desires or goals.
These are the basic components of GPS, which was designed to solve problems. Today, we would recognize these components as the basic components of a planner, a system designed to formulate a plan (a series of actions) to achieve a goal in the light of whatever information is already available or can be gleaned from the environment, either accidentally or as part of the plan. This simple but elegant, compelling and remarkably effective structure involves a division of labor. It assumes special faculties for information acquisition (perception), storage (memory), inference (reason) and goal generation (emotion/desire). Contemporary talk of modules, however, involves a different kind of division of labor, a division corresponding to different cognitive domains rather than to different requirements for accomplishing a given task. For some time, cognitive scientists have taken seriously the idea that the mind is a committee of specialized minds, each relatively complete in itself and designed especially for a given kind of cognitive problem. Modules have been proposed for language production and language understanding, with sub-modules for phonology, syntax and semantics, at a minimum. The same goes for vision, memory, social reasoning, physical reasoning, and many others. An extreme form of the resulting vision is admirably captured in the following quotation from Tooby and Cosmides (1995):

[O]ur cognitive architecture resembles a confederation of hundreds or thousands of functionally dedicated computers (often called modules) designed to solve adaptive problems endemic to our hunter-gatherer ancestors. Each of these devices has its own agenda and imposes its own exotic organization on different fragments of the world. There are specialized systems for grammar induction, face recognition, for dead reckoning, for construing objects and for recognizing emotions from the face. There are mechanisms to detect animacy, eye direction, and cheating.
There is a "theory of mind" module..., a variety of social inference modules...and a multitude of other elegant machines.

This "Swiss Army Knife" model of the mind has many attractions. It allows the cognitive scientist to attack cognition piecemeal. Building a parser is a lot easier than building a whole mind. It makes the evolution of mind easier to understand, since it allows various modules to evolve in relative independence of others in response to specific selection pressures. It explains in a natural and compelling way how we can do so many things without having a clue how we do them. It accommodates such puzzling facts as that language competence is largely independent of IQ, and that social and physical reasoning seem largely independent of each other. And finally, it helps to explain the strange dissociations that are the stock in trade of the cognitive neuroscientist: people who can write from dictation but cannot read, people who are blind but who think they can see, people who are unaware of one half of their body, and many more. But there are problems as well. The brain exhibits remarkable plasticity as well as specialization. While modularity doesn't depend on strict neural localization--after all, you can program a computer with highly modular sub-routines, and the resulting modularity of capacity will not be reflected in the hardware--localization of functions corresponding to hypothesized modules would certainly make life easier. And without some constraints from neuroscience, it seems that the modularization of mind could be imagined in many ways consistent with the behavioral evidence. Once modules are admitted, each with its own implicit information, it is inevitable that we ask which, if any, are innate. Thus, modularity, as a hypothesis about the structure of the mind, cohabits comfortably with nativist theories. In general, the degree of nativism in a theory correlates positively with the degree of modularity it assumes.
You can have modularity without nativism, but it is hard to defend nativism without modularity. An innate module comes equipped with its own proprietary information, whether implicit in the processes, or explicit in the memory. Defenders of innate modules can defend nativism without facing the tricky distinction between knowledge and capacities that was presupposed by the original dispute over innate ideas.

Eliminativism

Eliminativism in the philosophy of mind is the doctrine that commonsense mental concepts, especially those of belief, desire and intention, are seriously flawed, and should ultimately be eliminated from a serious theory of cognition. Commonsense has it that it is beliefs, desires and intentions that make the mind go 'round. You want a sandwich. You form an intention to acquire one. You believe that there is bread, peanut butter and jelly in the kitchen. You have beliefs about where the kitchen is relative to where you are now, and other beliefs about how to get to the kitchen. You form an intention to go to the kitchen. You believe that, once there, you will be able to construct a sandwich. And so on. That we all have beliefs, desires and intentions, and that they interact to generate intelligent, goal-directed behavior in this way, seems beyond question. But it isn't beyond question, for it has been questioned, and quite seriously. And the implications are staggering: all of traditional epistemology, as well as moral and legal reasoning, has been couched in the terms of commonsense psychology. If these prove to be seriously flawed, it is not just our cognitive science that will have to change.

Evolutionary Psychology

Recently, evolutionary psychologists of a cognitive bent have argued that the study of cognition needs to be informed and constrained by a consideration of what our various cognitive capacities are for--what they evolved to accomplish.
Cognitive capacities, the argument goes, are adaptations, and hence the mind should be seen as a biological system structured by natural selection. The argument can be simply summarized as follows. If you are a materialist, then you are committed (at least implicitly) to the view that

The mind is what the brain does.

That is, our cognitive processes are instantiated as neurological processes. And unless you are a creationist, you are also committed to the view that

The brain was shaped by natural selection.

If you accept these two premises, you are also committed to accepting their logical conclusion, namely, that

The mind was shaped by natural selection.

The strongest proponents of this argument take a very literal interpretation of this conclusion. The mind is seen as a collection of independent cognitive functions or modules, each of which constitutes an adaptation to different environmental pressures. We have already quoted Tooby and Cosmides to this effect. Steven Pinker puts it even more succinctly: "[T]he human mind...[is]...not a general-purpose computer but a collection of instincts adapted for solving evolutionarily significant problems -- the mind as a Swiss Army knife." (Pinker, 1994) This view is certainly consistent with a neurological view of the adult brain, which is, to a first approximation anyway, a collection of discrete specialized structures and substrates. Specific neural circuits subserve specific cognitive functions, and damage to those circuits produces selective impairments in cognition, not across-the-board reduction in intellectual function. This massive modularity view, however, does not sort well with what we understand about neural plasticity during development. As one leading textbook on cognitive neuroscience puts it, "The environment has profound effects on the [developing] brain.
Such effects are clearly seen during sensitive periods, which are time periods during development when an organism is particularly sensitive to certain external stimuli." (Banich, 1997) At first blush, the plasticity of the developing brain does not seem to sort well with the view that natural selection shaped innate modules with specific functions. Instead, it seems more consistent with a strictly empiricist view of intellectual function, one in which the nature of one's reasoning strategies simply reflects environmental contingencies. But an evolutionary approach to cognition need not presuppose massive modularity. An alternative conception, consistent with the idea that natural selection has designed our cognitive architecture, is that complex social animals do not inherit modules fully formed but instead have a biological preparedness to develop them very quickly for classes of problems that are critical to survival (and hence reproductive success). Further, these predispositions can differ in their degree of canalization, that is, in the degree to which the environment plays a role in their expression. As examples, consider the neurological changes that subserve the development of binocular vision and language. Cortical binocular columns (used in depth perception) are not present at birth, but appear in the visual cortex during a critical period after the infant has received visual input. Other visual cortical cells show diffuse line orientation "preferences" at birth, firing maximally to lines of a particular orientation (e.g., vertical), but responding to lines of other orientations as well, albeit to a lesser degree. After receiving visual input, however, these cell preferences are sharpened so that they respond maximally only to lines of a particular orientation.
Further, if visual input is restricted to only a single orientation (e.g., the animal is exposed only to lines of vertical orientation), the majority of cells will shift their preferences to match their visual experiences, responding maximally to lines of vertical orientation even if their initial preferences were for lines of other orientations. Like vision, language development also shows a complex pattern of interplay between innate biases and environmental input. Deaf babies will begin to babble vocally just as hearing babies do, but their babbling declines and eventually ceases, presumably because they don't receive the auditory feedback hearing babies do. Infants are also born with the capacity to hear all phonetic contrasts that occur in human communicative systems, yet within the first year of life they lose the capacity to distinguish among phonemes that are not marked in their language community. As these examples show, biological preparedness comes in degrees, and is probably best explained in terms of canalization (Ariew, 199~). A trait is said to be more or less canalized as its expression is more or less independent of environmental influence. A combination of genetic and environmental factors causes development to follow a particular pathway, and once begun, development is bound to achieve a particular end-state. Limb development is highly canalized in humans (humans everywhere grow limbs in the same way) but not perfectly so, as the example of thalidomide shows. As we saw, language is highly canalized, though not so highly as limb development. The environment can influence trait development in many different ways. The most interesting of these to the psychologist is learning. It is not often fully appreciated that learning can affect the development of even highly canalized traits. Thus language, though highly canalized, is still learned.
Biology puts strong constraints on what properties a language must have to be learnable (as a first language), and it virtually guarantees that language will be learned in a huge variety of environments. This is what is meant by the claim that there is a specific biological preparedness for language acquisition. As we noted earlier, when it comes to nativist theses about cognition, there is a temptation to ask which information (rules, theories, etc.) is innate, and which is learned. Couching the issue in terms of canalization or biological preparedness, however, allows us to see things quite differently. Consider a jointly authored paper. We might ask who authored which sections or paragraphs or even sentences. This is how people tend to think of nature vs. nurture in the cognitive realm. But it could also happen that both authors are responsible for every sentence, with the degree of responsibility varying from sentence to sentence, or section to section. The suggestion is that we should think of our cognitive abilities as all thoroughly co-authored. From this perspective the question is not which ideas are contributed by the genes, and which by learning, but rather how canalized the development of a given idea (or concept or ability) is: how much variability in the learning environment will lead to the same developmental end-state. An advantage of this way of thinking is that we see at once that nothing, not even limb development, is inevitable. And when we investigate things in this light, we are led to ask which variations in the learning environment will divert the stream into a different and perhaps preferable canal. However we conceive of the influence of natural selection on the mind, it seems clear that natural selection has shaped the mind, and will continue to do so. This rather uncontroversial idea, however, opens up a new source of evidence for cognitive scientists, and generates a new set of constraints on cognitive theory.
Can the strategies employed so successfully by evolutionary biology in other domains yield substantial progress in the study of the mind? We won't know unless we try.