
Why NL is not CF

As proven in Stuart Shieber's 1985 paper, "Evidence against the context-freeness of natural language", and explained informally for people whose eyes glaze over formal proofs.

1 Motivation
• Important to theoretical linguists and philosophers of language. Central to Chomsky's innateness hypothesis, as well as to critics of transformational grammars and their derivatives.
• Should be of at least theoretical interest to computational linguists, because the computational processing difficulty of a language is directly linked to its formal complexity.
• Personal motivation: if I label myself a linguist, I should have more than just a vague idea of what this question is about.

2 Tiny subset of the question
• A. Why are the Swiss German constructions a proof that at least some NLs are not CFLs?
• B. How are the Swiss German examples crucially different from similar Dutch examples that are not a proof of the non-CF-ness of NLs?
• C. Are Polish (and other free word order languages) that contain sentences essentially identical to the Swiss German ones additional examples of non-CFLs?

3 The Chomsky Hierarchy (adapted from www.wikipedia.com)
Here α, β, γ are strings of terminals and nonterminals; A, B are nonterminals; x is a string of terminals.
• Type-0 – recursively enumerable languages – Turing machine – productions: α -> β (no restrictions)
• Type-1 – context-sensitive languages – linear-bounded non-deterministic Turing machine – productions: αAβ -> αγβ, where γ is not empty
• Type-2 – context-free languages – non-deterministic pushdown automaton – productions: A -> γ
• Type-3 – regular languages – finite-state automaton – productions: A -> x, A -> xB

4 Preliminaries – Regular Languages
• Productions: A -> xB, A -> x, where A and B are nonterminals and x is any string of terminals.
• Here is an example of a regular grammar:
• Vocabulary of terminals = {a, cat, dog, mouse, chased, scared, squeaked}
• Vocabulary of nonterminals = {S, VP, NP}
• S is the only initial symbol.
• S -> a mouse VP
• VP -> squeaked
• VP -> chased NP
• VP -> scared NP
• NP -> a N
• N -> cat
• N -> dog
• N -> mouse

5 Preliminaries – Regular Languages
• What sentences can this grammar recognize/generate?
• A mouse squeaked.
• A mouse chased a cat.
• A mouse chased a dog.
• A mouse chased a mouse.
• A mouse scared a cat.
• A mouse scared a dog.
• A mouse scared a mouse.

6 Preliminaries – Regular Languages
• If we want this grammar to generate sentences with a subject other than "a mouse", we have to add the following productions:
• S -> a cat VP
• S -> a dog VP
• This is inefficient.
• What's worse, it doesn't capture the NP generalization (i.e., it doesn't give us the "right" structure).
• We need S -> NP VP, but this is not a legitimate regular production.

7 Preliminaries – Regular Languages
• Consider a CFG that accepts the same sentences (with a slightly different nonterminal vocabulary):
• S -> NP VP
• NP -> DT N
• DT -> a
• N -> cat
• N -> dog
• N -> mouse
• VP -> VI
• VP -> VT NP
• VI -> squeaked
• VT -> chased
• VT -> scared

8 Preliminaries – Regular Languages
Crucial point: the regular grammar shown here recognizes the strings we want it to recognize, but it doesn't assign to them the structure we want. That is, the regular grammar weakly generates the language in question but doesn't strongly generate it. Why is this important?

9 Preliminaries – Regular Languages
• An example of why it is important:
• I saw the man with a telescope.
• Recognizing the string is not sufficient for expressing its syntactic ambiguity.
• We need some way of expressing the ambiguity to get at the two meanings.
• The reason I am drawing attention to the weak/strong distinction is that there seems to be some conceptual confusion: when people say "language x is CF", they don't always say whether they mean that it is strongly CF or just weakly CF.
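To make the weak/strong contrast concrete, here is a minimal sketch (my own illustration; the function names and the nested-list tree encoding are not from the slides) of a recursive-descent parser for the toy CFG just shown. Unlike a bare recognizer, it assigns each grammatical string a bracketed structure:

```python
# Recursive-descent parser for the toy CFG:
#   S -> NP VP,  NP -> DT N,  DT -> a,
#   VP -> VI | VT NP,  VI -> squeaked,  VT -> chased | scared,
#   N -> cat | dog | mouse
# Trees are encoded as nested lists; None means "not grammatical".

NOUNS = {"cat", "dog", "mouse"}
VI = {"squeaked"}          # intransitive verbs
VT = {"chased", "scared"}  # transitive verbs

def parse_np(words, i):
    # NP -> DT N
    if i + 1 < len(words) and words[i] == "a" and words[i + 1] in NOUNS:
        return ["NP", ["DT", "a"], ["N", words[i + 1]]], i + 2
    return None, i

def parse_vp(words, i):
    # VP -> VI
    if i < len(words) and words[i] in VI:
        return ["VP", ["VI", words[i]]], i + 1
    # VP -> VT NP
    if i < len(words) and words[i] in VT:
        np, j = parse_np(words, i + 1)
        if np:
            return ["VP", ["VT", words[i]], np], j
    return None, i

def parse_s(sentence):
    words = sentence.split()
    np, i = parse_np(words, 0)
    if np:
        vp, j = parse_vp(words, i)
        if vp and j == len(words):
            return ["S", np, vp]
    return None

assert parse_s("a mouse squeaked") == \
    ["S", ["NP", ["DT", "a"], ["N", "mouse"]], ["VP", ["VI", "squeaked"]]]
assert parse_s("mouse squeaked a") is None
```

A regular grammar over the same strings could answer yes/no just as well, but it could not return the NP/VP bracketing, which is exactly the weak vs. strong distinction at issue.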
In the context of formal languages and automata, people talk about weak generative capacity, but to us linguists strong generative capacity is of greater interest.

10 Preliminaries – Regular Languages
• In any case, for some sentences of English one cannot write a regular grammar at all. That is, some sentences cannot even be recognized by a regular grammar, let alone assigned the correct structure: they are not even weakly regular.
• A mouse a cat chased squeaked.
• A mouse a cat a dog scared chased squeaked.
• Example of center embedding: NP1 NP2 NP3 … V3 V2 V1
• There is no way to write a regular grammar for an arbitrary number of such embeddings.
• How do we know this for sure?

11 Preliminaries – Regular Languages
• Pumping theorem for finite-state (regular) languages:
• If a regular language over some alphabet E is an infinite set, then there are strings x, y, z over E, such that y is not the empty string, and xy^n z is in the language for all n >= 0.
• What does this mean?

12 Preliminaries – Regular Languages
• Example: L = {ab^n | n >= 0}
• Some strings in this language are: a, ab, abb, abbb, …
• There are strings xy^n z such that y is not empty and xy^n z is in the language for all n >= 0. For example, the string abb is such a string: x = a, y = b, and z = b. The following are all in the language:
• n = 0: xy^0 z = ab
• n = 1: xy^1 z = abb
• n = 2: xy^2 z = abbb
• Why does this have to be true?

13 Preliminaries – Regular Languages
• If a language is regular, then by definition there is some regular grammar that accepts it.
• By definition, a grammar has a finite set of productions. In our example, one grammar for this language could be: S -> aB, B -> bB, B -> ε
• But if the language is to consist of an infinite number of strings, then there are strings in this language that have more symbols in them than there are productions, so some production must be applied more than once to generate the string.
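The repetition argument just given can be sketched in a few lines of code (my own illustration; I use a regular expression as the membership test for L = {ab^n | n >= 0}, with the decomposition x = a, y = b, z = b from the example above):

```python
import re

# Membership test for the regular language L = { a b^n | n >= 0 }.
def in_L(s):
    return re.fullmatch(r"ab*", s) is not None

# Pumping the middle substring y any number of times stays inside L,
# because y corresponds to the recursive production B -> bB.
x, y, z = "a", "b", "b"
for n in range(10):
    assert in_L(x + y * n + z), (n, x + y * n + z)
```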
14 Preliminaries – Regular Languages
• Let's call x the substring read by the grammar up to the point at which the production that ends up being used more than once is used for the first time (in our example, we have the production S -> aB, so x = a).
• Let's call y the substring that is read when the production eventually used more than once is applied (in our example, the recursive production B -> bB, so y = b).
• Let's call z the substring that is read from that point to the end of the string (in this case z can result from B -> b, or even be empty and correspond to no production).
• But since the middle substring, y, is the result of applying a recursive production (in this example, B -> bB), we know that we can apply this production arbitrarily many times and thus make n in y^n arbitrarily large.

15 Preliminaries – Regular Languages
• So how would we use the pumping theorem to prove that the language containing the center-embedding sentences shown above cannot be regular?
• Consider the similar language L = {a^n b^n | n >= 0}: ab, aabb, aaabbb, aaaabbbb, …
• This language does not contain any string in which the number of b's does not equal the number of preceding a's, or in which any a's follow b's:
• *a
• *aab
• *abb
• *abab

16 Preliminaries – Regular Languages
• Imagine that L = {a^n b^n | n >= 0} were a regular language.
• Then there would be some string xyz such that y is not the empty string and xy^n z is in the language for all n >= 0.
• The substring y is the substring created by the recursive rule, and it therefore cannot contain both a's and b's, because pumping would then produce a's after b's.
• So the substring y must consist entirely of a's or entirely of b's.

17 Preliminaries – Regular Languages
• If y consists entirely of a's, then z (which is fixed) contains all the b's.
• But every time we apply the recursive rule that created y, we get more a's, and since z is fixed we cannot add a matching number of b's, so we can always produce a string with more a's than b's.
• If y consists entirely of b's, then x (which is fixed) contains all the a's. Every time we apply the recursive rule that created y, we get more b's, and since x is fixed we cannot add a matching number of a's, so we can always produce a string with more b's than a's.
• So there is no decomposition xy^n z that satisfies the conditions of the pumping theorem, and L = {a^n b^n | n >= 0} is not a regular language.

18 Preliminaries – Regular Languages
• Now how does this relate to the center-embedding examples?
• The set of sentences that exhibit center embedding as described above may be viewed as a special case of the L = {a^n b^n | n >= 0} language, with nouns being the a's and verbs being the b's: L = {(cat|dog|mouse)^n (chased|scared|squeaked)^n | n >= 0}.
• There is no way of writing a regular grammar that accepts strings with an arbitrarily large number of nouns followed by exactly the same number of verbs.

19 Preliminaries – Regular Languages
• So we have shown that the subset of English which consists of these types of sentences is not regular.
• But this doesn't in itself prove that English itself is not regular.
• In order to show that English is not regular, we need a few more steps in our proof.
• Regular languages are closed under intersection: intersecting a regular language with a regular language produces a regular language.

20 Preliminaries – Regular Languages
• If English were a regular language, then intersecting it with some other regular language would result in a regular language. We will find a regular language and show that intersecting it with English results in L = {(cat|dog|mouse)^n (chased|scared|squeaked)^n | n >= 0}, which we have already shown is not a regular language.
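As a toy illustration of the language this intersection is meant to produce, here is a recognizer sketch (my own; the token-based encoding is an assumption) for L = {(cat|dog|mouse)^n (chased|scared|squeaked)^n | n >= 0}. It needs one unbounded counter, in effect a pushdown stack, which is precisely the resource a finite-state machine lacks:

```python
# Recognizer for L = { N^n V^n | n >= 0 } over the toy vocabulary,
# where N is a noun and V a verb.  A single unbounded counter suffices,
# but no fixed finite set of states can keep the count.

NOUNS = {"cat", "dog", "mouse"}
VERBS = {"chased", "scared", "squeaked"}

def in_intersection(sentence):
    count = 0
    seen_verb = False
    for w in sentence.split():
        if w in NOUNS:
            if seen_verb:
                return False      # a noun after a verb: not of the form N^n V^n
            count += 1
        elif w in VERBS:
            seen_verb = True
            count -= 1
            if count < 0:
                return False      # more verbs than nouns so far
        else:
            return False          # token outside the toy vocabulary
    return count == 0             # equal numbers of nouns and verbs

assert in_intersection("mouse cat chased squeaked")
assert not in_intersection("mouse cat squeaked")
```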
• What language, when intersected with English, would produce L = {(cat|dog|mouse)^n (chased|scared|squeaked)^n | n >= 0}?
• L = (cat|dog|mouse)* (chased|scared|squeaked)* is clearly regular, and intersecting it with English results in L = {(cat|dog|mouse)^n (chased|scared|squeaked)^n | n >= 0}.
• So English is not regular.

21 Brief History
• Chomsky (1963) – NLs are not regular or CF. Proposed the transformational grammar model as an alternative.
• The notion that all NL phenomena are regular was put to rest. But the inadequacy of context-free grammars for handling NL proved more controversial: Chomsky's proofs of the non-CF-ness of NL were not accepted.
• Peters & Ritchie (1973) showed that Chomsky's transformational grammar framework was powerful enough to describe any recursively enumerable set – perhaps too powerful.
• Until 1985, all the arguments for the claim that NLs are not CF had been shown to be flawed (see the alleged counterexamples debunked in Pullum & Gazdar (1982)).
• Shieber (1985) provides the first syntactic counterexample to the claim that CFGs are powerful enough to generate NL. Shieber's argument survives to this day.

22 Dutch
• Dutch was initially introduced as a counterexample but later dismissed. The example, and the explanation of why it is not a counterexample, are presented in Bresnan, Kaplan, Peters & Zaenen (1982).
• Dutch has the following structures:
• …dat Jan Marie Piet de kinderen zag helpen laten zwemmen
• …that Jan Marie Piet the children see-PAST help-INF make-INF swim-INF
• '…that Jan saw Marie help Piet make the children swim'
• The structure is: …that NP1 NP2 NP3 NP4 V1 V2 V3 V4

23 Dutch
• Arbitrarily many of these NP–V pairs may be inserted to form longer sentences.
• The number of verbs and NPs must be the same.
• The first verb has to be tensed and it must agree with the first NP.
• All the other verbs have to be infinitives.
• The subcategorization constraints between the final NP and the final verb must be satisfied.
• This is an example of cross-serial dependencies.
• A language that has strings with arbitrarily long cross-serial dependencies is not a CFL – or so it would seem; as the Dutch case below shows, the claim needs care.

24 Center Embedding vs. Cross-Serial Dependencies
• Center embedding – the dependencies are nested: 1 2 3 … 3 2 1
• Cross-serial dependencies – the dependencies cross: 1 2 3 … 1 2 3

25 Context Free Languages
• L = {a^m b^n c^m d^n | m, n >= 0} is not context-free.
• We can use the pumping theorem for CFLs to show this.
• If L is an infinite CFL, then there is some constant K such that any string w in L longer than K can be factored into substrings w = uvxyz such that v and y are not both empty and uv^n xy^n z is in L for all n >= 0.
• What does this mean?

26 Context Free Languages
• Let's look first at L = {a^n b^n | n >= 1}.
• One CFG for this L would be: S -> aSb, S -> ab
• Take the factorization u = ε, v = a, x = ab, y = b, z = ε:
• n = 0: uv^0 xy^0 z = ab
• n = 1: uv^1 xy^1 z = aabb
• n = 2: uv^2 xy^2 z = aaabbb

27 Context Free Languages
• We can show that {a^m b^n c^m d^n} is not CF by using the pumping theorem.
• If {a^m b^n c^m d^n} were a CFL, then there would be some constant K such that any string in L longer than K, say a^K b^K c^K d^K, could be written as w = uvxyz such that v and y are not both empty and v and y are pumpable.

28 Context Free Languages
• v can't contain both a's and b's, because when pumped it would produce strings with a's after b's. Similarly, it cannot contain both b's and c's, or both c's and d's. The same goes for the other pumpable substring, y.
• So each of v and y must consist entirely of a's, entirely of b's, entirely of c's, or entirely of d's. Pumping v and y simultaneously can then change the counts of at most two of the four symbols, while membership in L requires the a's to match the c's and the b's to match the d's. So some pumped string falls outside L, and {a^m b^n c^m d^n} is not CF.

29 Dutch
• The Dutch example seems to exhibit the same cross-serial dependencies we just showed cannot be handled by a CFG.
• However, it is possible to write a CFG that accepts these Dutch strings.

30 Dutch
• We can divide the verbs as follows:
• 1. V-index: infinitive form; subcategorizes for a subject (swim)
• 2. V-tensed: tensed form; subcategorizes for a subject it agrees with and an S′ or S″ complement without a complementizer (saw)
• 3. V-infinitive: infinitive form; subcategorizes for a subject and an S′ or S″ complement without a complementizer (help, make)
• The grammar:
• 1. S -> NP-agr S′-agr-index V-index
• 2. S′-agr-index -> NP S′-agr-index V-infinitive
• 3. S′-agr-index -> NP S″-agr-index V-infinitive
• 4. S″-agr-index -> NP-index V-tensed
• 5. NP-index -> Jan | Piet | Marie | the children
• 6. V-tensed -> saw
• 7. V-index -> swim
• 8. V-infinitive -> help | make
• These productions accept the example sentence as well as the following sentences, which are perfectly grammatical:
• …that Jan Marie Piet the children see-PAST make-INF help-INF swim-INF
• …that Jan Marie the children Piet see-PAST help-INF make-INF swim-INF
• …that Jan Marie the children Piet see-PAST make-INF help-INF swim-INF

31 Dutch
• How come this works? Note that the only items we care about are the cross-referenced verbs and NPs:
• …that Jan Marie Piet the children see-PAST help-INF make-INF swim-INF
• '…that Jan saw Marie help Piet make the children swim'
• …that Jan Marie Piet the children see-PAST make-INF help-INF swim-INF
• '…that Jan saw Marie make Piet help the children swim'
• …that Jan Marie the children Piet see-PAST help-INF make-INF swim-INF
• '…that Jan saw Marie help the children make Piet swim'
• …that Jan Marie the children Piet see-PAST make-INF help-INF swim-INF
• '…that Jan saw Marie make the children help Piet swim'

32 Dutch
• We can recognize and generate all the grammatical strings with this grammar because the number of items we need to cross-reference is finite and all the other items are syntactically interchangeable.
• Note that the sentences above are all grammatical, but each has a different interpretation in Dutch.
• The final tree structure of each sentence will only reflect the order of the words in the sentence, and not all the cross-serial dependencies.
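What the grammar above checks weakly can be sketched as a linear scan (my own simplification; the English glosses saw/help/make/swim stand in for the Dutch verbs, "the-children" is a single token for convenience, and the minimum of two NPs reflects rules 1 and 4): the string must be n NPs, then a tensed verb, then infinitives, then the final verb, with as many verbs as NPs. Count-matching of this nested kind is context-free, like a^n b^n:

```python
# Weak-recognition sketch for the Dutch verb-raising strings:
#   NP^n  V-tensed  V-infinitive^(n-2)  V-index,   n >= 2
NPS = {"Jan", "Marie", "Piet", "the-children"}
V_TENSED = {"saw"}
V_INF = {"help", "make"}
V_INDEX = {"swim"}

def dutch_ok(tokens):
    n = 0
    while n < len(tokens) and tokens[n] in NPS:
        n += 1                                    # count the initial NPs
    verbs = tokens[n:]
    return (
        n >= 2
        and len(verbs) == n                       # as many verbs as NPs
        and verbs[0] in V_TENSED                  # first verb is tensed
        and all(v in V_INF for v in verbs[1:-1])  # middle verbs infinitival
        and verbs[-1] in V_INDEX                  # final verb, e.g. 'swim'
    )

assert dutch_ok("Jan Marie Piet the-children saw help make swim".split())
assert not dutch_ok("Jan Marie saw help make swim".split())
```

Note that the check never records which NP goes with which verb, matching the point that the grammar gets the string set right while losing the cross-serial pairings.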
• So we can write a grammar to recognize and generate all these strings, but not one that assigns them a structure that preserves the cross-serial dependencies.
• So the grammar above weakly generates the cross-serial examples but doesn't strongly generate them.
• This is sufficient if we are interested in classifying sentences as grammatical or ungrammatical, but is it sufficient for semantic interpretation? Probably not.

33 Swiss German
• How is the Swiss German example set presented by Shieber (1985) crucially different from the Dutch example set?
• …mer em Hans es haus halfed aastriiche
• …we Hans-DAT the house-ACC helped paint
• '…we helped Hans paint the house.'
• …mer d'chind em Hans es haus lond halfe aastriiche
• …we the children-ACC Hans-DAT the house-ACC let help paint
• '…we let the children help Hans paint the house.'

34 Swiss German
• …we the children-ACC Hans-DAT the house-ACC let help paint
• In Swiss German, verbs subcategorize for NPs with specific cases.
• Some verbs subcategorize for accusative NPs and some verbs subcategorize for dative NPs.
• The number of verbs subcategorizing for accusative NPs must be the same as the number of accusative NPs in the sentence, and the number of verbs subcategorizing for dative NPs must be the same as the number of dative NPs in the sentence.

35 Swiss German
• Why can't we produce arbitrarily long sentences of this type with a CFG?
• Simplest case: all the accusatives precede all the datives: NP-acc^m NP-dat^n V-acc^m V-dat^n
• This is the same pattern as a^m b^n c^m d^n, which is non-CF, as shown earlier.

36 Swiss German
• The Swiss German strings with all the accusatives preceding all the datives may be represented as …NP-acc^m NP-dat^n … V-acc^m V-dat^n…, i.e. the pattern a^m b^n c^m d^n embedded in fixed surrounding material.
• CFLs are closed under intersection with regular languages.
• If Swiss German were CF, then intersecting it with the regular language w a* b* x c* d* y (for the fixed strings w, x, y of the construction) would yield a CFL.
• But the intersection, w a^m b^n x c^m d^n y, is not CF, so Swiss German cannot be CF.

37 Swiss German
• Shieber's argument rests on the impossibility of writing a CFG that would ensure that the total numbers of accusatives and datives match, not on the order in which they appear.
• This argument is SUFFICIENT to establish Swiss German's status as a non-CFL.
• The argument does NOT DEPEND on the ORDER of the inner NPs and inner verbs.
• Surprising.
• Orders other than NP1 NP2 NP3 … NPn V1 V2 V3 … Vn are acceptable.
• So Polish and other similar examples provide exactly the same counterexamples that Swiss German does.

38 Conclusions
• If I am right, then one important point to take away is this: the often-cited Swiss German example is by no means unique. It just happens to be the first published counterexample showing that NL syntax is not CF. Once the inadequacy of CFGs was shown for one language, there is no need to show it for other languages.

39 Conclusions
• Another important point is that the proof shows that Swiss German, and thus NL, is not even weakly CF.
• Since in CL we are not interested in merely recognizing strings, weak generative capacity without strong generative capacity is of limited use to us, and plenty of examples supporting the claim that NLs are not strongly CF had been presented before Shieber's paper. So the paper's importance is perhaps more theoretical than practical from a CL point of view.

40 References
• Bresnan, Kaplan, Peters & Zaenen (1982) – Joan Bresnan, Ronald Kaplan, Stanley Peters, and Annie Zaenen. 1982. Cross-serial dependencies in Dutch. Linguistic Inquiry 13(4):613–635.
• Chomsky (1963) – Noam Chomsky. 1963. Formal properties of grammars. In R. D. Luce, R. R. Bush, and E. Galanter (eds.), Handbook of Mathematical Psychology, vol. II. New York: Wiley, pp. 323–418.
• Partee (1993) – Barbara H. Partee, Alice ter Meulen, and Robert Wall. Mathematical Methods in Linguistics. Corrected second printing of the first edition (Studies in Linguistics and Philosophy). Dordrecht: Kluwer.
• Peters & Ritchie (1973) – Stanley Peters and Robert Ritchie. 1973. On the generative power of transformational grammars. Information Sciences 6:49–83.
• Pullum & Gazdar (1982) – Geoffrey K. Pullum and Gerald Gazdar. 1982. Natural languages and context-free languages. Linguistics and Philosophy 4:471–504.
• Savitch (1987) – Walter J. Savitch, Emmon Bach, William Marsh, and Gila Safran-Naveh (eds.). 1987. The Formal Complexity of Natural Language. Dordrecht: D. Reidel.
• Shieber (1985) – Stuart M. Shieber. 1985. Evidence against the context-freeness of natural language. Linguistics and Philosophy 8:333–343. http://www.eecs.harvard.edu/~shieber/Biblio/Papers/shieber85.pdf
