Noam Chomsky on Syntax - Syntactic Structures (1957), Aspects of the Theory of Syntax (1965)

“Colorless green ideas sleep furiously” has no meaning, yet we consider it a sentence; while the same meaningless collection of words with the order reversed, “Furiously sleep ideas green colorless”, is not a legal sentence. We accept some word orderings as grammatical but not others. This has nothing to do with semantics (meaning) -- only with structure: the abstract form or "shape" of the sentence. It certainly does not depend on our experience with those words, since we are extremely unlikely to have encountered them together in the past. The system of relations among words in a sentence that determines which word orders are legal is called syntax.

In each of the following sentences there are two different semantic interpretations possible: “Visiting relatives can be a nuisance. Flying planes can be dangerous. The shooting of the hunters disturbed me.” The abstract syntactic grouping of the words is what distinguishes the different interpretations. In the first sentence, for example, the syntactic form is different when “visiting” heads the subject (the act of visiting relatives) than when “relatives” does (relatives who are visiting). The different syntactic structures correspond to the different meanings, though the sequence of words is the same.

What we know when we know a language is a system of rules -- for instance, rules for the syntax of a well-formed sentence. These syntactic rules determine the grammaticality of sentences: if the sentence follows the rules it's okay, and if not, it's ungrammatical. “Colorless green ideas sleep furiously” is meaningless, yet it follows the rules; putting the words in reverse order does not.

Rules are required because language is generative (i.e., productive, or creative): an infinite number and variety of grammatically correct sentences can be produced from a finite number of words.
Completely novel sentences are routinely composed by every speaker, sentences that may never have been produced before in all of history. If I'm walking along and a hole opens up in the ground and I fall into a subterranean world where dinosaurs co-exist with primitive humans, I can easily talk about it when I get back. Without rules, maybe we'd have to have a repertoire of sentences that we've heard before and have practiced on occasion, which we might draw upon for appropriate responses in certain situations; but such a "catalog" of practiced sentences could hardly be expected to suffice for all the expressive needs that might possibly arise for a speaker (even if he lived his whole life without confronting a live dinosaur). And aside from the variety of expressive needs, the number of sentences to be learned would be virtually infinite -- far too many to even count, never mind memorize. One can make a new sentence just by using simple embedded phrases: “I know that you know that I know... (for as long as you like!)”

Rules provide a way of generating new instances in some principled way, even though those specific new instances have never been experienced before. In the case of multiplication, we know “2 x 2 = 4” from previous encounters by now; not so for “345828.343 x 8979432.67868”. Yet given the simple rules we learn in elementary school anyone can come up with that product in a matter of minutes, though it's probably safe to say that in the whole history of mathematics no one has ever before needed to multiply exactly those two randomly chosen quantities. The point is, rules can be used to generate novel sentences as easily as novel multiplication products.

We use rules, but that doesn't imply that we must know what they are.
We're allowed to change "I'm going to work" to "I'm gonna work", but we never change "I'm going to school" to "I'm gonna school"; the contraction rule depends on the sentence's structure, but we follow the rule perfectly without realizing that. Knowing how to change "going to" to "gonna" doesn't require knowing that the contraction applies only when "to" introduces an infinitive (which is why the first example only works when "work" is a verb!). A calculator is a good analogy -- it can add numbers according to rules built into its circuitry, but it can't tell you what the rules are that it uses when it does so.

The distinction between linguistic competence and performance is important. Competence refers to our implicit underlying knowledge of linguistic rules: we can tell legal from illegal sentences instantly, and the rules by which we do so are known implicitly by every language speaker. Performance is what happens in an actual speaking context: we make errors like using incomplete sentences when our attention is diverted, or failing to make the subject and verb agree, or using an incorrect tense. These errors may be due to factors such as attention or memory constraints (for instance, if we're trying to produce a very long sentence and we lose track of its beginning by the time we near the end, or if there is a very complex embedding of phrases) or even physiological constraints (we could run out of breath in trying to get a sentence out). But no one would suggest that syntactic errors in performance, in actual speaking, indicate some kind of deficit in basic syntactic competence. It's the abstract basis of linguistic performance -- competence -- that is really of interest.

The set of rules we want to identify should generate all the possible legal sentences in a language, and none of the illegal ones. The first part simply means that the rules must account for our great productive capacity to make any number of sentences to meet any occasion. The second part means that the rules we use to make our legal sentences must not be able to produce illegal sentences as well. In the case of “Colorless green ideas sleep furiously”, one rule we could try (to give a ridiculously stupid example) would be: a sentence just has to have a noun and a verb, and adjectives and adverbs are also allowed. Immediately we see that this rule is far too simple, because by ignoring word order it also allows “Furiously sleep ideas green colorless,” an obviously illegal sentence. We want to arrive at a rule system that will produce all and only the legal sentences.
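The "all and only" criterion can be made concrete with a small sketch. The lexicon, the class labels (ADJ, NOUN, VERB, ADV), and the function names below are assumptions made for illustration, not part of Chomsky's formalism. The first function encodes the deliberately naive rule above, which ignores word order; the second adds a fixed ordering of word classes.

```python
# Hypothetical mini-lexicon mapping words to word classes (assumed for this sketch).
LEXICON = {
    "colorless": "ADJ", "green": "ADJ", "ideas": "NOUN",
    "sleep": "VERB", "furiously": "ADV",
}

def word_classes(sentence):
    """Replace each word of the sentence with its word class."""
    return [LEXICON[word] for word in sentence.lower().split()]

def naive_rule(sentence):
    """The naive rule: a noun and a verb are required, adjectives and
    adverbs are allowed, and word order is ignored entirely."""
    classes = word_classes(sentence)
    allowed = {"NOUN", "VERB", "ADJ", "ADV"}
    return "NOUN" in classes and "VERB" in classes and set(classes) <= allowed

def ordered_rule(sentence):
    """An order-sensitive rule: the classes must appear in this exact sequence."""
    return word_classes(sentence) == ["ADJ", "ADJ", "NOUN", "VERB", "ADV"]

legal = "Colorless green ideas sleep furiously"
illegal = "Furiously sleep ideas green colorless"

print(naive_rule(legal), naive_rule(illegal))      # the naive rule accepts both
print(ordered_rule(legal), ordered_rule(illegal))  # the ordered rule accepts only the first
```

The naive rule gets the "all" for this tiny case but not the "only"; the ordered rule rejects the reversed string, at the cost of covering just one sentence pattern.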

We refer to a system of syntactic rules as a "grammar", and Chomsky talks about three types of possible grammars that might be responsible for our syntactic knowledge. The three types are finite state grammars, phrase structure grammars, and transformational grammars; he shows the first two to be inadequate and settles on a transformational generative grammar as the basis of syntax in human language. A finite state grammar consists of rules that specify a particular sequence of word classes. The rule
    ADJECTIVE ADJECTIVE NOUN VERB ADVERB

would produce “Colorless green ideas sleep furiously,” as well as “Hungry little rats run quickly.” Other sentences require rules for other sequences of word classes: for “The tall boy is in the garden” we have the rule
    DETERMINER ADJECTIVE NOUN VERB PREPOSITION DETERMINER NOUN

But for “The tall boy who has two heads is in the garden” we need
    DETERMINER ADJECTIVE NOUN PRONOUN VERB ADJECTIVE NOUN VERB PREPOSITION DETERMINER NOUN

which is a completely new rule even though the last two sentences are obviously related! Note another important shortcoming: the dependency between “boy” and “is” reaches across an embedded clause and therefore cannot be explained by this grammar -- that is, why should "is" agree with "boy" instead of with "heads"? Still another shortcoming is that although it would be possible to account for every possible word ordering using such rules, the system of rules would be so large and inefficient that it would also end up producing many illegal sentences. We get the all, but not the only.

A phrase structure grammar builds sentences not out of isolated word classes strung together, but instead out of groupings of word classes into phrases, such as NOUN PHRASE (NP), VERB PHRASE (VP), PREPOSITIONAL PHRASE (PP). Phrase structure rules specify how to “rewrite” the sentence. So as an example, at the highest level, rewrite SENTENCE as NP VP (roughly, "a 'sentence' has a 'subject' and a 'predicate'"); then rewrite NP as DETERMINER ADJECTIVE NOUN (“The tall boy --”); then rewrite VP as VERB PP; and rewrite PP as PREPOSITION DETERMINER NOUN (“-- is -- in the garden”). This grammar has the advantage of explaining connections across embedded phrases -- like the one in “The tall boy who has two heads is in the garden” above -- since its structure is a hierarchy of phrases rather than a simple linear chain of elements: “boy” and “is” are actually linked at a higher level in the structure.

It also allows embedding of phrases within themselves to make longer and more complex structures, a property called “recursiveness”. A simple illustration is provided by the two rules (1) rewrite PP as PREPOSITION NP, and (2) rewrite NP as DETERMINER NOUN PP. Any time we have a NOUN PHRASE that is the object of a preposition in rule (1), we can include in that NP a new PREPOSITIONAL PHRASE as rule (2) allows -- and then go back to (1) and include another NP, and then another PP, and so on.
Together these allow us to start with something like "John is in the garden" and then expand "garden" into "garden near the tree"; and then to say "tree in the yard... yard behind the building... building next to the statue,” and so forth: "John is in the garden near the tree in the yard behind the building next to the statue".

Still, there are cases in which even phrase structure rules do not fully capture relations between sentence structures. For example, when we change from the active voice to the passive voice, the two word orders are conveying the same basic proposition. “The cat chased the mouse” and “The mouse was chased by the cat” are, in some sense, the same sentence. How should we explain this difference in word order?

A transformational grammar is Chomsky's solution to this problem. Such a grammar starts with phrase structure rules, and adds another component: transformational rules. The phrase structure (PS) rules generate a basic sentence form, and then the many transformational (T) rules provide optional ways of rearranging the phrase constituents (i.e., the word order). For example, there is a transformational rule that changes statements into questions; another, illustrated here, changes a sentence from the active to the passive voice:

given the form NP1 VERB NP2,              The cat chased the mouse.  ("deep structure")
1) move NP2 to the beginning,             The mouse
2) insert a form of the verb “TO BE”,     was
3) followed by the VERB,                  chased
4) and then the word “BY”,                by
5) and finally NP1.                       the cat.  ("surface structure")

The form of the sentence produced by the PS rules is called the deep structure; the form after any optional T rules have been applied is called the surface structure. The active and passive forms of a sentence differ in their surface structures but share the same deep structure; conversely, two interpretations of an ambiguous sentence (“Visiting relatives can be a nuisance”) may have the same surface structure, but two different deep structures. Transformational rules together with phrase structure rules are, according to Chomsky, sufficient for generating all and only the legal sentences of a language, and also provide the best account of structural differences and similarities in those sentences.
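The recursive rewrite rules and the passive transformation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the PP in rule (2) is treated as optional so that the recursion can stop, the determiner is always "the", and the verb is passed in already in a form that serves as its past participle ("chased").

```python
def noun_phrase(nouns, prepositions):
    """Rule (2): NP -> DETERMINER NOUN PP (PP treated as optional here).
    Rule (1): PP -> PREPOSITION NP, which calls back into rule (2)."""
    head = "the " + nouns[0]
    if len(nouns) == 1:          # no nouns left to embed: stop the recursion
        return head
    # Attach a PP whose object is itself an NP (the recursive step).
    return head + " " + prepositions[0] + " " + noun_phrase(nouns[1:], prepositions[1:])

def passivize(np1, verb, np2, be_form="was"):
    """T rule on the form NP1 VERB NP2:
    1) move NP2 to the beginning, 2) insert a form of TO BE,
    3) then the VERB, 4) then BY, 5) and finally NP1."""
    sentence = " ".join([np2, be_form, verb, "by", np1])
    return sentence[0].upper() + sentence[1:] + "."

location = noun_phrase(
    ["garden", "tree", "yard", "building", "statue"],
    ["near", "in", "behind", "next to"],
)
print("John is in " + location)
print(passivize("the cat", "chased", "the mouse"))
```

The first call rebuilds the garden sentence by re-entering the NP rule once per embedded phrase; the second turns the active form into "The mouse was chased by the cat." A fuller treatment would operate on phrase trees rather than strings and would choose the form of "to be" by agreement.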

Chomsky's Critique of Behaviorism

Chomsky's (1959) review of Skinner's (1957) Verbal Behavior argued that

1) terms like “stimulus”, “response”, and “reinforcement” have a specific operational definition in the laboratory; but given a literal application to the case of language, they become implausible, and given a metaphorical application, the technical vocabulary of behaviorism is not as useful or precise as traditional, informal language. Some examples:

stimulus control of operant responses
- if a painting is a discriminative stimulus (S^D), the response R might be any of the following: “Dutch”; “clashes with the wallpaper”; “I thought you liked abstract work”; “never saw it before”; “tilted”; “hanging too low”; “beautiful”; “hideous”; “remember our camping trip last summer?”, etc.: it's completely unpredictable (unlike, say, pressing a bar when a light turns on in a Skinner box!)
- what stimulus controls the response of saying a person's name? Not the person herself: having that stimulus -- the person -- actually present seems to decrease the probability of a speaker saying her name! (Usually a proper name is used to refer to someone who's not present.)
- the “stimulus” is no longer an objective notion but is just projected backward from the response: saying “red” when viewing a red chair means the stimulus was its color; saying “chair” means the stimulus was its “chairness”. And then “response” cannot be an objective term either: there's no way of independently specifying what stimuli control it -- it's just anything people say!

reinforcement
- in the case of avoidance behavior (controlled by both punishment and negative reinforcement): responding to “your money or your life” would seem (on a literal reading) to require a past history of going unharmed on occasions when you turned over your money, and being killed on occasions when you didn't, so that the habit of giving up your money would be learned -- a history that obviously no one has.
- “The phrase ‘X is reinforced by Y (stimulus, state of affairs, event, etc.)’ is being used as a cover term for ‘X wants Y’, ‘X likes Y’, ‘X wishes that Y were the case’, etc....The only effect is to obscure the important differences among the notions being paraphrased.” For example, saying "Eric is reinforced by M&M's" glosses over distinctions concerning whether he wants them, or he likes them, or he wishes there were some in the cookies -- all of which are legitimately different things that might be intended.
- Chomsky says that according to Skinner, “a person can be reinforced though he emits no response at all, and ... the reinforcing ‘stimulus’ need not impinge on the ‘reinforced person’ or need not even exist (it is sufficient that it be imagined or hoped for).” Skinner also suggests some rather implausible extensions of the reinforcement concept to verbal behavior, including "future reinforcement" (e.g., of a writer's writing behavior), and "automatic self-reinforcement" (of talking to yourself) -- these are obviously more metaphorical than objective.

2) simply stringing words together into a chain couldn't produce the complexities of syntax learned by every (normal) child; instead, there must be some generalized pattern to word orderings (here he cites Lashley's paper on serial order in behavior) and probably some innate biological mechanism for learning those patterns!
- “the syntactic organization of an utterance is not something directly represented in any simple way in the physical structure of the utterance itself.”
- “the fact that all normal children acquire essentially comparable grammars of great complexity with remarkable rapidity suggests that human beings are somehow specially designed to do this.”

3) in spite of behaviorism's deliberate neglect of all processes internal to the organism, there is no real justification for ignoring the contribution of the organism's nervous system to the process of learning.
- “prediction of the behavior of a complex organism (or machine) would require, in addition to information about external stimulation, knowledge of the internal structure of the organism, the ways in which it processes input information and organizes its own behavior.”

4) the only apparent reason for adhering to a behaviorist interpretation of language structure is blind dogmatic acceptance of the behaviorist metatheory; in a preface added to a reprint of this review eight years later, Chomsky commented that “the general [behaviorist] point of view is largely mythology, and ... its widespread acceptance is not the result of empirical support, persuasive reasoning, or the absence of a plausible alternative.”

Chomsky on Language Acquisition

Chomsky argued that at least part of language's rule structure must be given innately (reminiscent of Kant!), based on observations such as the following:

1) the success of language learners: all neurologically normal children learn very rapidly, with apparent ease, and with no instruction, a system of syntactic rules too complex for even some very sophisticated computers to master. (By contrast, learning high school algebra -- a set of rules simple enough to fit in a pocket calculator -- requires intensive instruction and still defeats many students.) Furthermore, regardless of individual differences in experience, all kids converge on exactly the same set of rules for their language.

2) the evidence they hear ("poverty of the stimulus"): the data available to children from their parents and other speakers contains no negative evidence (i.e., of what is incorrect). Since children never even hear their parents produce illegal sentences, how do they learn to discriminate between legal and illegal sentences? They must be constrained a priori in the ways they will interpret the syntactic examples they are exposed to.
- syntactic errors tend not to get corrected by parents, who focus instead on the truth value of a child's utterances. That is, “the doggie is green” -- grammatical -- would probably get a “no it's not” response, but something like “doggie big” -- non-grammatical -- would probably get a “yes it is” response. But if the child says something incorrectly and is “reinforced” for it anyway (that is, the parent responds to the child regardless of the syntactic form of the child's utterance), why does the child get better at using correct syntactic forms in her utterances? Again, there must be prior constraints on the course of language learning.

3) the utterances they produce: some possible errors are never made in the course of learning syntax.
- many plausible hypotheses about syntactic rules are never even entertained by the child in his search for the correct rules. Question formation is an example:
  - in forming a question from the sentence “The man is tall”, the child learns to move the “is” to the front of the sentence: “Is the man _ tall?” (where the blank represents the place where “is” was). But...
  - for “The man who is in the garden is tall”, children never ask “Is the man who _ in the garden is tall?”; they'll always say (correctly) “Is the man who is in the garden _ tall?”
  - the rule says ‘move the “is” from the verb phrase to the front of the sentence’, referring to the structure of the sentence. Presumably the rule ‘move the first “is” to the front of the sentence’ (referring to position in a string of words, which happens to be the “is” in the relative clause) is not tried out because children are innately constrained to only hypothesize structure-dependent rules.

Chomsky's view of the language system

LANGUAGE ACQUISITION
experience: exposure to one's native language (from parents, siblings, etc.); such linguistic data is impoverished, exhibiting insufficient regularity to support acquisition of syntactic rules on its own
LAD: Language Acquisition Device, proposed by Chomsky as an "organ" of the brain which uses the minimal information available in the child's language experience to "flip the switches" (that is, choose from the possible settings) of the Universal Grammar, finally arriving at the particular phrase structure and transformational rules of that child's language
UG: Universal Grammar - the constraints on the possible kinds of rules for human languages, which are given innately in the "wiring" of every neurologically normal human brain

1) Phrase Structure Rules --> Deep Structure - basic form of sentence is generated, then ...
2) apply optional Transformational Rules --> Surface Structure - final form of sentence

LINGUISTIC PERFORMANCE
spoken sentence: surface structure as affected by memory limitations, cognitive demands, rate of speech, physiological constraints, speech environment, conversation conventions, etc.
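Returning to the question-formation example under "Chomsky on Language Acquisition": the difference between a position-based rule and a structure-dependent rule can be sketched as follows. The function names and the hand-supplied split into subject, auxiliary, and predicate are assumptions made for illustration; no parser is involved, and the structure is simply given to the second function.

```python
def linear_rule(words):
    """Structure-independent hypothesis: move the FIRST 'is' in the
    word string to the front, wherever it happens to occur."""
    i = words.index("is")
    return ["is"] + words[:i] + words[i + 1:]

def structural_rule(subject_np, aux, predicate):
    """Structure-dependent rule: move the main-clause 'is' (the one in
    the verb phrase) to the front, leaving the subject NP intact."""
    return [aux] + subject_np + predicate

def as_question(words):
    """Join words, capitalize the first letter, add a question mark."""
    text = " ".join(words)
    return text[0].upper() + text[1:] + "?"

simple = "the man is tall".split()
embedded = "the man who is in the garden is tall".split()

# On the simple sentence the two hypotheses cannot be told apart.
print(as_question(linear_rule(simple)))
print(as_question(structural_rule(["the", "man"], "is", ["tall"])))

# With a relative clause in the subject they diverge.
print(as_question(linear_rule(embedded)))
print(as_question(structural_rule("the man who is in the garden".split(), "is", ["tall"])))
```

Both rules fit the simple sentence, so simple data alone would not decide between them; only the structure-dependent rule yields the question children actually produce for the embedded case.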
