4. Representations by twe16545


									 Lexicase ref erence manual, f ile 729ccb10-5af 4-4e5b-8069- 162af da36b2f .rtf           1
 Revised 08.09.2010, 13:48




                       Lexicase: an on-line reference manual [to hold.doc]
 4. Representations                             103 [reps]
      4.1. Dependency, stemmas and non-concatenative representations
 In this section I will present and illustrate the basic notational conventions of lexicase
 dependency grammar. The system is basically simple: a sentence or phrase is a network
 of words connected by binary unidirectional lexically licensed links. This information
 may be optionally represented as a dependency stemma, but this is only an expository
 convenience; all the information on dependency and linear precedence is included as
 atomic and valency (contextual) features in the lexical matrices themselves.
 Representing some of this information redundantly as edges in a stemma does not alter
 anything.

     In a phrase or sentence, each word is given an index based on its linear position, and
 each word depends on at most one regent word and/or governs one or more dependent
 words. The binary link between pairs of words is formalized by copying the linear index
 of the dependent into a valency feature of the regent, as illustrated in 1).

1) Dependency links

 General schema                         Example: 'Send money!'

 wordform                               send
 nndex                                  1ndex
 m[Fi]                                  2[N]
          wordform                               money
          mndex                                  2ndex
          Fi                                     N

 Fi = any atomic feature(s)
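
 The index-copying mechanism in 1) can be sketched in a few lines of Python. This
 is only an illustrative sketch, not part of the lexicase formalism: the names Word,
 link, and regent_of are hypothetical, each valency feature is modelled as a
 (dependent index, required feature) pair, and the class feature V on send is
 supplied here for completeness.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    form: str
    index: int                                   # linear position (1-based)
    features: set                                # atomic features, e.g. {"N"}
    valency: list = field(default_factory=list)  # (dependent index, feature) pairs


def link(regent, dependent, feature):
    """Record a dependency by copying the dependent's index into the regent."""
    if feature not in dependent.features:
        raise ValueError(f"{dependent.form} lacks the feature {feature}")
    regent.valency.append((dependent.index, feature))


def regent_of(word, words):
    """Return the word whose valency features mention this word's index."""
    for candidate in words:
        if any(i == word.index for i, _ in candidate.valency):
            return candidate
    return None  # no regent: the word is the root


send = Word("send", 1, {"V"})   # class V is an assumption of this sketch
money = Word("money", 2, {"N"})
link(send, money, "N")          # send now carries the valency feature 2[N]
print(send.valency)             # [(2, 'N')]
```

 Because the link is stored only in the regent's matrix, the stemma can be rebuilt
 from the lexical entries alone, which is the sense in which it is redundant.
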
      4.2. Minimal representations                          110
 The Chomskyan 'minimalist program' advocates 'bare phrase structure', constituent
 structure trees without node labels &&. Like other basic components of X-bar theory,
 this idea seems to have been taken without attribution from dependency grammar. Since
 1979/1988 && (Starosta 1979 EoPS &&, 1988:9,24), lexicase dependency trees have
 been 'bare' because given the constraints applied to lexicase representations, node labels
 are predictable from lexical information. This is far from saying that the two systems are
 now identical, however. Minimalism would have to shed its empty and 'light' categories
 and binary structure (and of course all movement rules) before it began to approximate
 the level of constraints and empirical content characteristic of lexicase.

     Node labels are redundant in lexicase dependency representations because of the
 locality constraint: all dependency links are strictly local. Stated in stemma terms, there
 can be no intermediate nodes such as N, N', or N" on the path from a word to its regent or
 dependent. Given locality, N, N', and N" must be non-distinct, and an N node is
 redundant because the same information is given by the word class feature [N]. For
 convenience of reference, a 'phrase' can be defined as a word and all the words that
 depend on it directly or indirectly. If the head word is an Adj, then the phrase headed by
 the Adj is an AdjP. However, phrases as such do not play a role in the statement of
 generalizations in a pure dependency grammar. 1

 It is in fact technically possible to streamline such representations even further. In 1),
 dependency stemmas are shown with a vertical mast above each word, and dependency
 link edges are tied to these masts. An even simpler approach would be to eliminate these
 masts and link words by edges directly, as shown in 2). (I have kept the lexicase
 convention of using horizontal edges to represent exocentric constructions.)

2) Minimal stemmas

      a)                                 took

                       boy                                  girl                        to

     the                                  the                                                   movies

                                                                                       the

      b)                                                                      were

                                          and                                        expected

                    women                                  men                                    to

     the                old                                                                              attend

     This notation has several advantages over the mast notation: 1) it is closer to
 Tesnière's original version, and 2) it emphasizes the fact that dependency representations
 are composed of direct word-to-word links. In spite of that, I have kept the mast
 notation, which is adapted from John Anderson's (Anderson 1971) and David Hays' &&
 and Jane Robinson's && dependency stemma representations. The mast notation is
 perhaps a bit easier to read when feature matrices are added to the lexical entries, and I
 think easier to draw precisely. No question of theory is involved, since the two notations
 give exactly the same information, and since all stemmas are redundant because the same
 information is included in the lexical entries.




 1
   Not all dependency grammars are pure. Some sadder but wiser ones use phrase
 structure machinery to account for coordination. This has for example been advocated by
 Dick Hudson &&, Petr Sgall &&, …. In a lexicase grammar, coordination is treated
 dependentially, though additional mechanisms are needed to account for missing
 elements.



      4.3. Command                  108
 The notion of command in a dependency grammar is much more straightforward than it
 is in Chomskyan constituent structure grammar because dependency grammars don't
 have a VP to get in the way of characterizing grammatical relatedness. In a (lexicase)
 stemma, two words linked by a single line (straight or bent) are grammatically related,
 and two words not connected by a single line are not grammatically related. Thus only
 grammatically related words can enter into agreement and government relationships.

     However, some notion of command going beyond the idea of immediate dependency
 is necessary or useful in a full grammar.

      1) A given word may carry information applying to its dependents and its
      dependents' dependents. As an example, the tense feature marked on a root
      predicate word has the word and its direct and indirect dependents---the 'clause'---
      in its semantic scope.

      2) As in Chomskyan grammar, and going back to Langacker 1969, lexicase refers
      to command in placing restrictions on possible coreference relationships between
      words in a clause.

      3) In accounting for information structure, the SPOT(LIGHT) carries the new
      information in the clause. The spot is a single word plus the words it commands,
      its direct and indirect dependents (cf. Starosta 19&&:&&&). Syntactic
      constructions such as cleft and pseudocleft sentences make it possible for a single
      word to command the appropriate new information, and contrastive intonation
      may be used when this is not syntactically possible.

      4) People trained in the constituency tradition are accustomed to thinking of
      grammatical relationships as applying to phrases rather than individual words.
      Thus in

3)

 The beagle in the drainpipe bayed mournfully.

      the person in the linguistic street has been taught to think of the whole phrase the
      beagle in the drainpipe as being the 'subject' of bayed or of bayed mournfully,
      though the lexicase counterpart of the relation 'grammatical subject', the
      Nominative dependent of a finite predicate, applies between two words.

     The Chomskyan anaphora literature is littered with conceptions of 'command'
 (c-command, m-command, &&) that have been tried and discarded in a vain attempt to
 account for possible coreference relationships from within a binary clause structure. To
 refer to a dependentially defined version of command, I have coined the term
 CAP-COMMAND (from Latin caput, capitis 'head'): a word cap-commands its immediate
 dependents. In 4),



4) Cap-command relationships

            boy
 that                   on
                                            bus
                                 the                      there

        (i) bus cap-commands the and there; equivalently, bus is the regent of the and
        there;

        (ii) on cap-commands bus (on is the regent of bus), since on has a single
        dependent sister bus;

        (iii) boy cap-commands that and on.

 The lexicase notion of COMMAND is defined in terms of cap-command: a word X
 commands a word Y if either (a) X cap-commands Y, or (b) X cap-commands Z and Z
 commands Y. Thus in 4), boy commands there because boy cap-commands on, on
 cap-commands bus, and bus cap-commands there. Working from the bottom up, bus
 cap-commands there, so bus commands there. On cap-commands bus and bus commands
 there, so on commands there. Boy cap-commands on and on commands there, so boy
 commands there. However, that does not command there, because that has no dependent
 sisters at all and so does not cap-command anything.
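
 The recursive definition of command can be checked mechanically. The following
 Python sketch encodes stemma 4) as a table from each regent to its immediate
 dependents; the names DEPENDENTS, cap_commands, and commands are hypothetical, and
 word forms are used as keys only because no form is repeated in this phrase (a
 fuller treatment would presumably use the linear indices instead).

```python
# Regent -> immediate dependents, read off stemma 4): "that boy on the bus there".
DEPENDENTS = {
    "boy": ["that", "on"],
    "on": ["bus"],
    "bus": ["the", "there"],
}


def cap_commands(x, y):
    """X cap-commands Y if Y is an immediate dependent of X."""
    return y in DEPENDENTS.get(x, [])


def commands(x, y):
    """X commands Y if X cap-commands Y, or X cap-commands a Z that commands Y."""
    if cap_commands(x, y):
        return True
    return any(commands(z, y) for z in DEPENDENTS.get(x, []))


print(commands("boy", "there"))   # True
print(commands("that", "there"))  # False
```

 The recursion terminates because a stemma is a tree: each step moves from a word to
 one of its dependents, and words without dependents end the search.
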
        4.4. Exo- and endocentricity                        106
 The most basic kind of valency feature is one marked on a whole syntactic class
 specifying which other class or classes are allowed or required as dependents. Such
 features may be optional or obligatory. For example, verbs as a class allow but don't
 require nouns, adverbs, prepositions, verbs and/or sentence particles as dependents,
 depending on the language. This kind of dependency is stated in terms of optional
 categorial valency features, and will be referred to as an endocentric dependency.

(1)
 V
 ?([N])
 ?([Adv])
 ?([P])
 ?([V])
 ?([Sprt])


 The '?' in the valence feature is a variable which may be replaced by the linear index of
 another word in the same syntactic structure.



      Two word classes, P (prepositions and postpositions) and Cnjc (conjunctions) take
 required dependents as part of their definitions. Thus a preposition must have a
 preceding dependent to qualify as a P, and a conjunction must occur between two
 dependents of the same type to qualify as a conjunction. Note that by this definition a
 'stranded' preposition is not a preposition but an adverb, and a 'subordinating conjunction'
 is not a conjunction but a preposition. The relation between a P or a Cnjc and its
 dependent(s) is referred to as an exocentric dependency, and is indicated in the lexical
 entry by an obligatory categorial valency feature (cf. (2) and (3)).

(2)
 P                           P                            P
 ?[N]                        ?[V]                         ?[P]


(3)
 Cnjc                        Cnjc
 ?[N]                        ?[V]
 ?[N]                        ?[V]
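
 The obligatory valency features in (2) and (3) amount to membership tests on the
 classes P and Cnjc. A minimal Python sketch, under stated assumptions: the function
 names are hypothetical, only the classes of the dependents are inspected, and the
 linear-order requirements (a conjunction standing between its two dependents) are
 left out.

```python
def qualifies_as_p(dependent_classes):
    """A P needs exactly one dependent, of class N, V, or P, as in (2)."""
    return len(dependent_classes) == 1 and dependent_classes[0] in {"N", "V", "P"}


def qualifies_as_cnjc(dependent_classes):
    """A Cnjc needs two dependents of the same class, N or V, as in (3)."""
    return (
        len(dependent_classes) == 2
        and dependent_classes[0] == dependent_classes[1]
        and dependent_classes[0] in {"N", "V"}
    )


print(qualifies_as_p(["N"]))          # True: an ordinary preposition
print(qualifies_as_p(["V"]))          # True: a traditional 'subordinating conjunction'
print(qualifies_as_p([]))             # False: a 'stranded' form is not a P
print(qualifies_as_cnjc(["N", "N"]))  # True
print(qualifies_as_cnjc(["N", "V"]))  # False: the dependents differ in class
```
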


    This endocentric-exocentric distinction is represented in a stemma by drawing
 endocentric dependents in the usual way under slanting lines, but hanging exocentric
 dependents under longer arcs under a horizontal bar (cf. 41.1).

41.1: Centricity in stemmas

      Endocentric                  Exocentric
                                   PP                        Coordination

      [2([W])]                     P                               Cnjc
                                   2[W]                            1[W]
                                                                   3[W]
             2ndex                        2ndex           1ndex            3ndex
             W                            W               W                W

     Exocentric constructions have special grammatical properties that motivate their
 special treatment in this notational system. In particular, they share the property of being
 transparent to some valency requirements imposed by their regents. Among the P -
 dependent combinations, the P - V combinations might be noted. They are the
 'subordinating conjunctions' of traditional grammar and some of the 'complementizers' of
 Chomskyan grammar, but are grouped together with traditional N-governing prepositions
 here because of their shared dependency property (contracting an exocentric dependency
 with a single dependent) and often their shared semantic specifications as well.



     Each word class may be further specified for the subclasses of dependents it allows or
 requires. Many of these subtypes are widespread or universal. Some of the typical
 configurations for verbs, nouns, prepositions and conjunctions are illustrated in 41.2
 to 41.4.

41.2: Dependents of verbs

 V                                       V
 2([N])           2ndex                  2([V])             2ndex
                  N                                         V
                  {Nom, Acc,                                {fint, -fint}
                   lctn, assn, trmn}


      In a dependency grammar, the verb is the control center for the whole clause. In
 previous lexicase descriptive studies on a number of languages, five word classes have
 been found to depend on Vs: N, V, Adv, P, and Sprt. (Cnjc occurs too, but is
 grammatically transparent and functions as its immediate dependents.) V - N links are
 the backbone of a sentence. Every noun has a case form, lexically marked on the noun
 itself and/or imposed by its regent, typically a verb. The most common ones are Nom
 ('grammatical subject'), lctn ('locative'; universally present) and Acc (present in every
 accusative language). Verbs may also take other verbs as dependents, and are
 subcategorized into [±fact] subsets depending on whether they require the dependent
 verbs to be finite (allowing their own Nom dependent) or non-finite (excluding their own
 dependent Nom noun). Dependents of nouns are also numerous; so far Det, Adj, N, V,
 and P have been found in lexicase studies.

     V dependents of nouns are relative clauses or noun complements, and again are
 distinguished into finite and non-finite subtypes. There is a bit more cross-linguistic
 variation in the set of N dependents. Thus not all languages have Det or Adj categories,
 and in English, only gerund nouns allow N dependents. (The Chomskyan analysis of -'s
 and of NP modifiers as 'Genitive' NPs is not linguistically supportable.)



41.3: Dependents of nouns

 N                                N                            N
 2([V])          2ndex            2([Det])       2ndex         2([Adj])        2ndex
                 V                               Det                           Adj
                 {fint, -fint}

 N                                N
 2([N])          2ndex            2([P])         2ndex
                 N                               P

41.4: Dependents of Ps and Cnjcs

 P                            P                                P
 2([N])          2ndex        2([V])         2ndex             2([P])      2ndex
                 N                           V                             P
                                             {fint, -fint}

                Cnjc                                               Cnjc
 1ndex          1[N]          3ndex                    1ndex       1[V]          3ndex
 N              3[N]          N                        V           3[V]          V

 As shown in this figure, we may usefully distinguish those prepositions that
 govern finite verbs (cf. English that, Thai wâa) and those that govern non-finite
 verbs (English to, Thai hây). A language may also distinguish between
 conjunctions that conjoin nouns (e.g. Mandarin Chinese        'and') and
 those that conjoin verbs (e.g. Mandarin         'or').
            4.4.1. Centricity
 Starting from the formal definition of head and attribute introduced above, it is
 possible to characterize the notions of endocentric and exocentric
 construction in a way which is somewhat more workable and theoretically
 productive than the traditional structuralist characterization in terms of
 substitutability
(Bloomfield 1933:194). Thus if a construction has only one head, that is, one
obligatory constituent, it is defined as endocentric, and the head must be a lexical
item rather than a construction by the one-bar constraint. The items in
parentheses on the right side of the arrow in a conventional phrase structure rule
are the attributes of the head, and must by the one-bar constraint also be
constructions rather than individual lexical items, since every word must have its
own one-bar projection (Pullum's 'Maximality' condition). 2 The lexical head of an
endocentric construction is written on a vertical line, directly under the label for
the construction as a whole if this redundant label is retained to facilitate
readability, and attribute nodes are written one step below the level of the head
word on lines slanted to the right or the left. Noun phrases and (verbal) clauses
for example are endocentric constructions, headed by nouns and verbs
respectively.

     An exocentric construction is any construction with more than one obligatory
member (Bloomfield 1933:194), and by the one-bar constraint, at least one of its
heads must be a lexical item rather than a construction. The other co-heads may
be individual lexical items (from the same syntactic class) or phrases. Following
a convention proposed by Cathy Hine, the dominating node for the construction is
stretched into a horizontal line, and the two or more co-heads of an exocentric
construction are written on vertical lines descending from the horizontal line
representing the node dominating the exocentric construction as a whole. For
example, a PP is an exocentric construction 3 which has two co-heads, a
preposition and an NP, PP, or S, and a coordinate construction NP such as my
Dad and Rufus in (12) has three or more co-heads: one (or more) coordinating
conjunctions (the lexical head of the construction) and two or more coordinated
NPs (the phrasal co-heads). The lexical head of an exocentric construction, such as
and in (12), is dominated by a vertical line of unit length, and the lexical heads of
the phrasal co-heads (the 'secondary heads' of the exocentric construction), in this
case the Ns Dad and Rufus, are dominated by vertical lines of two-unit length.
The use of horizontal bars to characterize exocentric constructions makes it
possible to reconcile (i) the requirement that heads be represented by vertical lines
with (ii) the fact that exocentric constructions have more than one head and (iii)
the eternal verity that parallel lines never meet (except at infinity). A side benefit
is that exocentric constructions are easier to spot in a tree.

2
  If they were single lexical items, they would by definition be co-lexical heads of
the construction, for example conjunctions in a phrase like Bob and Carol and
Ted and Alice, and the construction would not be endocentric.
3
  An alternative analysis due to Emonds (Emonds 1972:547-55) would treat some
prepositions as 'intransitive', thereby making PP an endocentric construction. In
fact, this is the usual dependency grammar approach as well (Anderson 1971),
and it remains a possibility within the lexicase framework. Facts about English
and Dutch preposition stranding tend to support it, but facts about German case
government tend to favor the exocentric solution.



    Note that the constituent my Dad and Rufus in Figure 3.26 is formally a cnjc'
('conjunction-bar'), since a phrase label is a one-bar projection of its lexical head,
and the lexical head of a coordinate construction is a coordinating conjunction.
However, my Dad and Rufus is also an NP in function, because a coordinate
construction is exocentric, and so the virtual matrix associated with my Dad and
Rufus contains the feature [+N] as well as [+cnjc], making the construction for
external subcategorizing purposes simultaneously a cnjc' and an N', that is, an NP.
Thus lexicase solves the old problem of how to put coordinate constructions in the
same category as their phrasal heads (Pullum 1985:24-6) without giving up the
succession principle as GPSG does (op.cit., p. 25). See section 6.4 for a more
detailed discussion.
     4.5. Intensional semantic representation               110
From the dependency point of view, a syntactic representation is a string of words
related to one another in terms of pairwise dependency relations. Every word but
one is dependent on some other word in the sentence, and the meaning of each
dependent serves to delimit the possible range of reference of its regent. In
addition, relational features such as case relations supply further information
about the relation obtaining between regents and their nominal dependents. Thus
semantic information is readily extracted from the syntactic representation,
because the representation links together those words which are semantically as
well as syntactically related. Such a dependency tree then constitutes not only a
syntactic representation of a sentence but at the same time an intensional semantic
one: the meaning of the sentence is any conceivable set of circumstances which is
not ruled out by the meanings of the individual words and the attributional and
case relationships obtaining among them. Any imaginable situation which is not
incompatible with this intensional representation is one of the readings of the
sentence. Thus the (incomplete) structure in Figure 3.32 is grammatical though
the situation it refers to is bizarre. The syntactic representation is simultaneously
a semantic representation stating that the rock is interpreted as an animate entity
which sees the rhubarb.

(15)

                                 saw
                                 +V
                                 [+AGT]
                                 [+PAT]
                                 +AGT
                                 +anmt
                                 +PAT
                                 +cncr

              rock                                  rhubarb
              +N                                    +N
              +cncr                                 +cncr
              -anmt                                 -anmt
              +AGT                                  +PAT

       the                                   the

                                Figure 3.32
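
The way such a structure yields a bizarre but grammatical reading can be sketched
in Python. This is a hedged illustration rather than the lexicase mechanism itself:
it assumes that the verb's matrix in (15) imposes [+anmt] on its AGT dependent and
[+cncr] on its PAT dependent, and the names LEXICAL, IMPOSED_BY_SAW, and interpret
are hypothetical.

```python
# Lexical features of the two nouns, transcribed from (15).
LEXICAL = {
    "rock": {"N": True, "cncr": True, "anmt": False},
    "rhubarb": {"N": True, "cncr": True, "anmt": False},
}

# Assumption of this sketch: features that 'saw' imposes on each case relation.
IMPOSED_BY_SAW = {"AGT": {"anmt": True}, "PAT": {"cncr": True}}


def interpret(noun, case_relation):
    """Overlay the regent's imposed features on the noun's lexical features.

    Returns the interpreted feature set together with the features whose
    lexical value was overridden; nothing is rejected as ungrammatical.
    """
    imposed = IMPOSED_BY_SAW[case_relation]
    overridden = sorted(
        name
        for name, value in imposed.items()
        if name in LEXICAL[noun] and LEXICAL[noun][name] != value
    )
    reading = {**LEXICAL[noun], **imposed}
    return reading, overridden


print(interpret("rock", "AGT"))     # the rock is read as animate; 'anmt' is overridden
print(interpret("rhubarb", "PAT"))  # no clash: rhubarb is already concrete
```

The design choice is that a clash is reported, not filtered out: the sentence stays
grammatical, and the overridden feature records why the situation is odd.
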
     4.6. Comparison with other types of representation
           4.6.1. Levels, structures, and functions   103
Grammatical frameworks differ in the number of distinct levels of representation
which they allow for a single sentence. Generalized Phrase Structure Grammar
for example has only one (Gazdar et al 1985:10), Generative Semantics has two
(semantic representation and surface structure), lexical-functional grammar has
three (c-structure, f-structure, and logical representation), the Standard Theory has
three (deep, surface, and semantic), Government and Binding Theory four or
maybe more (D-structure, S-structure, Logical Form, and possibly others; van
Riemsdijk and Williams 1986:173; Radford 1981:390), and relational grammar
has an unlimited number of distinct strata, though they are written in a single
composite graph.

Lexicase, like Generalized Phrase Structure Grammar, has only one stratum, or
two, depending on how you count: a single dependency representation, the
SIMPLE SYNTACTIC REPRESENTATION. This structure plus supplementary
information regarding grammatically determined coreference relationships
constitutes the AUGMENTED SYNTACTIC REPRESENTATION. I would contend that
the existence of these two kinds of representation does not make the framework
bistratal. The augmented syntactic representation is identical to the simple
syntactic representation except that it has added information. No word order has
changed, no nodes or words or morphemes have been added or subtracted, and
most importantly, no grammatical relations have been changed. This system is
quite different in power from a framework which allows a single surface
constituent to bear two distinct and mutually exclusive GRs in a single derivation,
as is done in relational grammar and classical and contemporary transformational
grammar. Government and Binding theory for example allows a single NP to be
an object at one level and a subject later (passive), or a lower subject at one level
and a higher subject at another (subject to subject raising), and relational grammar
allows even more radical correspondences, such as 'ascensions' in which a
possessor functions as an attribute of an N in one stratum and a clause-level
constituent in a later one. This is not allowed in a lexicase representation.

    According to the Frosch Air Mattress Principle (Helmut Frosch, personal
communication), when a grammatical framework is pushed down in one place, it
normally bulges up somewhere else. Consequently, we might expect that lexicase
grammatical representations and lexical rules will be far more complex than, say,
their Government and Binding Theory counterparts. In the case of lexical rules,
this expectation is to some extent fulfilled, since lexicase currently posits five
types of lexical rules to generate its monostratal grammatical representations
(Redundancy Rules, Subcategorization Rules, Inflectional Redundancy Rules,
Morphological rules, and Derivation Rules), while the Standard Theory needs
 only three (Phrase Structure Rules for deep structures and transformations,
 spelling rules, and maybe filters for surface structures). Whether this constitutes
 an overall complication with respect to GB and RG cannot however be
 determined until the latter two theories have reached comparable levels of
 formalization and coverage.
            4.6.2. Dependency and constituency         104
 A well-formed syntactic structure within the lexicase framework must satisfy all
 the requirements on possible trees (for example that the structure is connected,
 has one 'root' node, no crossing branches, and has a full word at the end of each
 branch). As an example, see Figure 3.26. Every word in a sentence is the head of
 its own construction, and every lexical item in a sentence but one, the main verb
 (or non-verbal predicator), is dependent on one and only one other lexical item, its
 REGENT. 4 The head of a construction carries semantic and grammatical
 information for the construction as a whole, and attributes modify (constrain the
 potential reference of) the heads of their constructions.
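
 The tree requirements listed above lend themselves to a mechanical check. The
 Python sketch below is one simplified reading of them, not a definitive statement
 of the lexicase constraints: a sentence is given as a list in which position i
 holds the linear index of word i's regent (0 for the root), and the function name
 is_well_formed is hypothetical. Since every node is a word, the requirement of a
 full word at the end of each branch holds by construction.

```python
def is_well_formed(regents):
    """regents[i - 1] is the index of word i's regent, or 0 for the root."""
    n = len(regents)
    if sum(1 for r in regents if r == 0) != 1:
        return False                       # exactly one root
    for i in range(1, n + 1):              # every word must reach the root
        seen, node = set(), i
        while node != 0:
            if node in seen or not 1 <= node <= n:
                return False               # cycle or dangling index
            seen.add(node)
            node = regents[node - 1]
    arcs = [(min(i, r), max(i, r)) for i, r in enumerate(regents, 1) if r != 0]
    for a, b in arcs:                      # no crossing branches
        for c, d in arcs:
            if a < c < b < d:
                return False
    return True


# Assumed indexing: the(1) bear(2) saw(3) my(4) Dad(5) and(6) Rufus(7)
print(is_well_formed([2, 3, 0, 5, 6, 3, 6]))  # True
print(is_well_formed([3, 4, 0, 3]))           # False: arcs 1-3 and 2-4 cross
```

 Storing a single regent per word builds in the requirement that each word depends
 on one and only one other word.
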

     A lexicase representation can be viewed as a network of dependencies
 obtaining between (actual or virtual) pairs of lexical items in a sentence. Each
 word is specified for the kinds of dependents it is allowed to take, including in the
 limiting case (for example Determiners) none at all. A word decides which
 classes or subclasses of words may, may not, or must occur as dependents,
 whether the dependents appear to the right or left of it, how they are ordered
 among themselves, whether they must appear in a particular inflectional category
 (government) or in the same inflectional form as the head (agreement), and how
 they are to be interpreted semantically (case relations and selectional interpretation),
 and the positive and negative contextual features in each lexical matrix must be
 met by the lexical items which depend on it in that tree.

(12)

                                        saw
                                        [+V]

                    bear                                        and
                    [+AGT]                                      [+cnjc]

        the                                     Dad                           Rufus
        [+Det]                                  +PAT                          +PAT
                                                +N                            +N
                                        my
                                        [+Det]
                                                                +cnjc
                                                                +N
                                                                +PAT

                              Figure 3.26

 4
   This point differentiates it from Hudson's version of dependency grammar,
 which is much more powerful in allowing a word to have any number of 'heads'
 [i.e. regents] (Hudson 1984:85).

    Lexicase grammatical representations neutralize the distinction between
phrase structure grammar and dependency grammar: lexicase
representations incorporate the information carried by the three different kinds of
tree structure contrasted by Winograd (Winograd 1983:80), namely dependency (head and
modifier), role structure (slot and filler), and phrase structure (immediate
constituents). A head is any item attached under a vertical line, and an attribute is
any item attached under a slanting line (dependency); the case role of a
constituent such as the bear in Figure 3.26 is the case role of its lexical head, in
this instance [+AGT] 'Agent' (role structure); and a constituent is any word plus
all its direct or indirect dependents (phrase structure; cf. Hudson 1984:92). A
construction can be defined as a set of functionally equivalent constellations of
words all of which have a head word drawn from the same syntactic class.
Conversely, the head of a construction can be defined as the indispensable
representative of that construction (Wilson 1966).
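
The definition of a constituent as a word plus all its direct or indirect dependents can be illustrated with a minimal sketch. The `Word` record, the `constituent` function, and the example sentence are all hypothetical illustrations (the sentence is not the one in Figure 3.26); the only thing assumed from the text is that each word has at most one regent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    index: int              # linear position in the sentence (1-based)
    form: str
    regent: Optional[int]   # index of the word's single regent; None for the root


def constituent(words: List[Word], head_index: int) -> List[str]:
    """Return the head word plus all its direct and indirect dependents, in linear order."""
    members = {head_index}
    changed = True
    while changed:                      # keep adding dependents of words already collected
        changed = False
        for w in words:
            if w.regent in members and w.index not in members:
                members.add(w.index)
                changed = True
    return [w.form for w in sorted(words, key=lambda w: w.index) if w.index in members]


# Hypothetical example sentence: "the bear chased my dog"
SENTENCE = [
    Word(1, "the", 2),
    Word(2, "bear", 3),
    Word(3, "chased", None),
    Word(4, "my", 5),
    Word(5, "dog", 3),
]

print(constituent(SENTENCE, 2))   # the constituent headed by "bear"
print(constituent(SENTENCE, 3))   # the whole sentence, headed by the verb
```

No separate constituent-structure tree is stored here: the phrase-structure information is derived entirely from the dependency links, which is the sense in which the two kinds of representation are neutralized.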

    Obviously, then, the concepts 'head', 'attribute', and 'construction' are very
important in such a grammar, and syntactic representations must minimally be
able to distinguish among these three notions (Hudson 1979c:4). However,
conventional Phrase Structure representations cannot. For example, consider the
PSRs in Figure 3.27. From these rules themselves, we can tell that PP is
exocentric and NP is endocentric, since NP has only one obligatory
immediate constituent and PP has more than one (Wilson 1966).



(a) PSR-2.            NP  →  (Det) (Adj) N

(b) PSR-5.            PP  →  P  { NP   }
                                { <PP> }
                                { S    }

                                       Figure 3.27
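
The test applied to the rules in Figure 3.27 (one obligatory immediate constituent means endocentric, more than one means exocentric) can be sketched as follows. The encoding of each rule as a list of (label, obligatory) pairs is my own illustrative assumption, as is treating the choice among NP, PP, and S in PSR-5 as a single obligatory slot.

```python
# Each rule lists its right-hand-side slots as (label, obligatory) pairs.
# Assumption: the alternatives after P in PSR-5 form ONE obligatory slot.
PSRS = {
    "NP": [("Det", False), ("Adj", False), ("N", True)],
    "PP": [("P", True), ("NP | PP | S", True)],
}


def centricity(rhs):
    """Classify a construction by counting its obligatory immediate constituents."""
    obligatory = [label for label, required in rhs if required]
    if not obligatory:
        raise ValueError("a rule needs at least one obligatory constituent")
    return "endocentric" if len(obligatory) == 1 else "exocentric"


for name, rhs in PSRS.items():
    print(name, centricity(rhs))
```

Note that this test says nothing about which obligatory constituent is the lexical head, or where it sits in the construction.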

In PP, the first constituent is the lexical head of the construction, and in NP, it is
the last. These distinctions are lost, however, when the usual tree representations
are drawn (see Figure 3.28).




(a)                                                    (b)

                             NP                                         PP

      Det                    Adj           N                 P                 NP

      that               crazy             dog               to        Det               N

                                                                        the          hills


                                       Figure 3.28

    The lexicase conception of dependency theory is a rigorous and restricted
version of Chomsky's X-bar analysis (Chomsky 1970; cf. Pullum 1985) and it
employs X-bar terminology. The ways in which lexicase representations differ
most strikingly from typical X-bar tree notation are (i) the omissibility of node
labels, which given lexicase constraints are redundant, and (ii) the 'flatness' of the
trees, and especially the absence of a distinction between VP and S, since given
lexicase constraints, such a distinction is impossible. It is in fact the reverence for
the binary NP-VP division at the sentence level in frameworks within the
Chomskyan tradition which necessitates making a distinction between
dependency and constituent structure in the first place. A subject is clearly
dependent on the verb, as shown by subcategorization, selection, theta-role
assignment, and/or subject agreement, yet in a Chomskyan NP-VP constituent
structure, the subject is not in the same constituent as the verb. Therefore,
dependency must be different from constituency. However, the lexicase one-bar
constraint on syntactic representations eliminates the distinction between VP and



S, and when the VP node is eliminated, the subject is within the syntactic domain
of the verb, so there is no need to separate constituent structure and dependency
structure.
                4.6.2.1.X-bar theory
weak version of DG
                4.6.2.2.Bare phrase structure
can't be done because of binarity

lexical items don't have the kind of information they need to project initial
complex binary structures



5. R E S T
     5.1. Constraints on dependency representations
                5.1.1.1.Constraints on dependency representations
    The formal notation for lexicase dependency representations is quite similar to
John Anderson’s notation (cf. Anderson 1971): there is a vertical ‘mast’ above
each word, to which dependency links to its regent and dependents are attached,
and ‘surface’ linear order is preserved. However, unlike Anderson’s notation,
lexicase word class and other features are marked under the words they apply to,
rather than at the tops of these masts, and no zero categories are allowed.
     5.2. Lexicalist, Dependency
     5.3. Binarity
     5.4. Aristotle, Hulda Hannah-Winkler (Aristotle's vicar on earth)
     5.5. Constrained dependency grammar versus other frameworks
           5.5.1. Stemmas
In classical Phrase Structure grammars, the entities introduced by a PS rule could
be almost anything: morphemes, words, word classes, phrases, dummy symbols,
or even bare matrices. In more recent GB analyses, PS representations still allow
dummies ('empty categories' such as t and PRO) and sublexical morphemes such
as Tense and AGR. The lexicon in a lexicase grammar on the other hand consists
not of morphemes (Halle 1973) but of actually occurring words (cf. Aronoff
1976:46, Hudson 1979c:19) and possibly also of stems, i.e. words minus
inflectional affixes (but cf. Ladefoged 1972 for some preliminary neurolinguistic
evidence suggesting that this may not be the case), and only full actually
occurring words can be 'inserted' in trees.




Cf. Tesnière's stemmas and constraints; multiple words attached to one node; verb
and aux as one unit even when they are separated in surface structure
           5.5.2. Deep structures and strata
`Dependency grammar is not a uniform concepts. There are as many different
dependency grammars as there are different linguistic theories. Different
dependency grammars call for different constraints in the spirit of a particular
linguistic theory to which a particular dependency grammar is subordinated. For
example, dependency grammar in Melchuk's linguistic theory is a multistratal
system while dependency grammar in Lexicase Grammatical Theory and Word
Grammar are monostratal. Dependency grammar in Word Grammar is a word-
based system while dependency grammar in Tesnière's Grammar is a function-
based system. Therefore a reasonable discussion of constraints on dependency
grammar calls for the understanding of the goals of particular linguistic theories.
Not every constraint is useful for every type of dependency grammar. It does not
make sense to impose the monostratal constraint on Melchuk's dependency
grammar or remove this constraint from the Lexicase dependency grammar. The
discussion of constraints on dependency grammars in the context of the goals of
particular linguistic is important in a methodological respect. Linguistic
constraints are an essential methodological device which makes a linguistic
theory an efficient tool for the discovery of laws of language. Research in
dependency grammars will be cross-fertilized by exchange of ideas.'
(Sebastian Shaumyan, email, 17 Aug 1996; record dated 18 Apr 2000)

Other constraints are particular to lexicase as compared to some related modern
dependency approaches. Most noteworthy among these are first the requirement
that syntactic representations be monostratal and concrete, disallowing any kind
of distinct Chomsky-style D-structure, or `tectogrammatical level' (TL; cf.
Hajičová and Sgall 1987:446-448), `deep word order' or underlying representation
(UR; cf. Sgall 1987:819), S-structure, LF, or `empty categories'. Thus all
grammatical representations are `base generated'; there are no rules to order or
reorder the constituents, change the dependency relationships, or distort the
original dependency stemmas in other ways.



                 5.5.2.1.multiple analyses:
                      5.5.2.1.1. (1) Distinct underlying representations and
                               transformations are disallowed.

MONOSTRATALITY: A lexicase grammar has only one level of representation.
There are no operations that change one syntactic representation into another
form. Thus there are no movement rules, no 'adjunctions', no deletions, no
insertions, and no 'fusions' (Halliday 1985:72). All structures are 'base-generated'.

      In a lexicase grammar, there is no distinction between deep and surface
      structures, levels, strata, or other simultaneous levels of syntactic
      representation.

This constraint is the most radical of the nine discussed in this paper, and is
sufficient to rule out almost all of the 'generative' grammars written between 1957
and 1975. As an example, if multistratal representations are allowed in a
grammar, any of the following seven underlying structures could be assigned to
the sentence 'John loves Mary' without violating any general metatheoretical
principles (see Figure 1.1). 5

(3)     Verb-initial                          Verb-final

           (a)                S                  (b)          S

           V                 NP        NP        NP          NP           V

         loves                N        N          N           N         loves

                             John    Mary       John        Mary




5
 . These examples are constructed assuming a grammatical framework which
expresses strata of analysis as phrase structure trees, but corresponding examples
could be given in other frameworks, for example Relational Grammar.




Abstract verb,                                     Abstract verb,

S as NP                                            S as NP

(c)                S                               (d)                            S

      V                      NP                          V          NP                 NP

 PRES                         S                      DO              N                  S

                   V         NP        NP                           John          V    NP          NP

                 love         N         N                                       love    N          N

                             John      Mary                                            John    Mary



Performative analysis


(e)                                     S


      V                NP         NP                         NP


DECLARE                N          N                           S


                                              V              NP            NP


                                            love              N             N


                                                             John          Mary




VP analysis                                              VP and INFL analysis


(f)                    S                                  (g)           S


      NP                       VP                               NP     INFL       VP


      John             V             NP                     John       TNS        V           NP


                    loves           Mary                               Pres      loves        Mary

                                          Figure 1.1

                                          (h) Discontinuous VP

    Since the beginning of transformational grammar there have been some
minimal constraints on possible phrase structure trees, such as the requirement
that lines cannot cross, but recently even this limitation has been weakened to
allow discontinuous constituents (Aoun 1979, cited in Chomsky 1981:128; cf.
O'Grady 1987), so that even trees such as the following and its notational
equivalents would be allowed.

                           S

             NP                 VP

      V                                   NP

loves        John                     Mary




Figure 1.2

What this means is that the metatheory in effect claims that such a simple N-V-N
sentence conceivably could be ambiguous in eight ways in some language.
Actually of course it could be ambiguous in more than eight ways, since (a) - (h)
do not begin to exhaust the possible underlying representations which could be
assigned to this sentence by a theory which allows a deep-surface distinction and
relatively unconstrained phrase structure representations and transformations.

    There has always been at least an unspoken assumption that such vacuous
ambiguity is not acceptable, and that a grammar should provide one and only one
analysis for each sentence with a single reading. However, all multistratal
frameworks do allow such multiple vacuous representations. That is what all the
grammatical arguing is about. It is because of this failure to constrain the
theories, and of the equally serious failure to write formal rules, that
transformational grammarians and adherents of other multistratal theories are



 forced to justify their analyses by argumentation rather than by empirical testing.
 I contend that any linguistic metatheory which allows so many different
 possibilities and offers no guidelines to decide between them has failed in its task
 to delimit the form of a possible grammar.

                       5.5.2.1.2. Monostratality and testability

     In order for a linguistic theory to make a testable claim about the nature of
 human language, it needs to be constrained so that only one analysis is possible
 for one sentence. That single analysis is the prediction that needs to be tested and
 if possible disconfirmed, and is the measure by which the theory stands or falls.
 By rejecting analyses with more than one level or stratum, lexicase has made a
 major step in the direction of a constrained theory. GPSG has imposed the same
 constraint, and other frameworks, such as Lexical Functional Grammar and
 Government and Binding Theory, are finally moving in that direction.

 Lexicase differs from the various Chomskyan grammatical frameworks in power,
 since there is no distinct deep structure and no transformational rules to relate two
 levels. The most disastrous innovation in modern syntax was the creation of Deep
 Structure. Giving Deep Structure (D-structure, Logical Form, etc.) to primitive
 linguistic tribes without adequately preparing them for it was like giving a loaded
 Uzi, safety off, to little Johnny on his way to kindergarten. Deep Structure is
 Power. A newly empowered linguist creates some arbitrary and implausible
 account of some phenomenon, cloaks it in his version of Deep Structure, severs
 the static line && between his construct and the phenomenon itself, and then
 what factual evidence can stand against him? Who or what will ever prove him
 wrong?

                       5.5.2.1.3. multistratal DG

      Maybe the scariest example of a basically clueless analysis bolted onto a
 menacing multistratal chassis is R.M.W. Dixon's 'Basic Theory' (Dixon 19&&),
 but the dependency camp has been infiltrated as well. One example is work by
 Prague School scholars Petr Sgall and Eva Hajičová. In general, I have a high
 regard for their work on information structure, and lexicase has been strongly
 influenced by it. However, I think the distinction they make between deep
 ('tectogrammatical') and surface levels in analyzing pairs of sentences such as
 a) and b) (repeated here) is linguistically unmotivated.

      a. I would never turn my back on Thorkel.

      b. Thorkel I would never turn my back on.



Sgall and Hajičová want to have a level of analysis which is free from 'function
words' (words which have no situational meaning), and which assigns the same
unmarked constituent order to sentences with the same 'propositional' (situational)
meaning. However, I have yet to see any supportable linguistic justification for
proposing these particular properties. It certainly can't be increased economy, as
Sgall claims (Sgall to appear), since a complex surface structure is replaced by a
simpler tectogrammatical structure plus the same old complex surface structure.
This extra level obscures what is otherwise a very insightful analysis (cf. Starosta
to appear d), but the authors haven't realized that because they think they have no
responsibility to lay out the exact path leading back from the dark deep world of
the Morlocks && to the bright surface Elysian Fields of the Eloi && (Wells
1895).

                      5.5.2.1.4. syntactic representations be monostratal and concrete

    Other constraints are particular to lexicase as compared to some related
modern dependency approaches. Most noteworthy among these are first the
requirement that syntactic representations be monostratal and concrete,
disallowing any kind of distinct Chomsky-style D-structure, or ‘tectogrammatical
level’ (TL; cf. Hajičová and Sgall 1987:446-448), ‘deep word order’ or
underlying representation (UR; cf. Sgall 1987:819), S-structure, LF, or ‘empty
categories’. Thus all grammatical representations are ‘base generated’; there are
no rules to order or reorder the constituents, change the dependency relationships,
or distort the original dependency stemmas (dependency trees) in other ways.

                      5.5.2.1.5. motivations for DS

                             5.5.2.1.5.1.relate S's

lexicase does by WFS

                             5.5.2.1.5.2.simplify DS, clean out irrelevant, meaningless S

no simplification, since SS still there

                             5.5.2.1.5.3.make all languages look like English
                                 5.5.2.1.5.3.1.       properties such as c-command based on Chomsky's
                                                  analysis of English
won't work for other kinds of languages, e.g. VSO

so just make all languages English in DS

                             5.5.2.1.5.4.Binarity
                                 5.5.2.1.5.4.1.       unmotivated fashion
                                     5.5.2.1.5.4.1.1.illogical extension of the S → NP VP
                                                   analysis for English



                               5.5.2.1.5.5.make grammar unfalsifiable
                                  5.5.2.1.5.5.1.      can be made to work in principle if have DS
                                          5.5.2.1.5.5.1.1.Empty categories?

  Does Hudson allow zero heads for DPs? I think And Rosta said he did.
                                          5.5.2.1.5.5.1.2.Lexical integrity

      The lexicase representation thus sticks quite close to the lexical ground,
  accepting as possible grammatical statements only those which can be predicated
  of the actual strings of lexical items which constitute the atoms of the sentence.
  Again, this constraint plus the constraints of earlier sections limit the class of
  possible grammars by excluding otherwise plausible analyses and deciding in
  favor of equally plausible analyses formulatable within the constrained lexicase
  framework. For example, it rules out such familiar-looking structural descriptions
  as:

(4)                 S

      NP          AUX                     VP

      N          Tense            V            NP

  John            Pres           love          Mary

  since Pres is not a word (Fig. 1.3). In a lexicase representation, the structural
  analysis of this sentence would be a monostratal tree with the tense-inflected verb
  loves as a terminal node, that is:

(5)                 S

  NP                V              NP

      N          loves                N

 John                             Mary



  Information about internal morphological structure is not needed in identifying
  the set of well-formed sentences (though it may of course be useful in decoding
  them), and normally the speaker is not aware of internal morphological structure,
  and does not need to be.
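
  The requirement that every terminal node be an actually occurring word can be sketched as a simple check. The toy lexicon and the function name below are hypothetical; the two terminal strings are those of structures (4) and (5) above.

```python
# Hypothetical toy lexicon of actually occurring word forms.
WORDS = {"John", "Mary", "love", "loves"}


def non_word_terminals(terminals, lexicon=WORDS):
    """Return the terminals that are not words, i.e. that violate lexical integrity."""
    return [t for t in terminals if t not in lexicon]


structure_4 = ["John", "Pres", "love", "Mary"]   # terminals of (4), with Tense split off
structure_5 = ["John", "loves", "Mary"]          # terminals of (5), with the inflected verb

print(non_word_terminals(structure_4))
print(non_word_terminals(structure_5))
```

  A representation passes only if the returned list is empty, so (4) is excluded because of Pres while (5) is admitted.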



      By allowing sublexical morphemes to appear as separate nodes in a tree,
  Chomskyan linguistics abolished the structuralist distinction between morphology
  and syntax, and in requiring all terminal nodes to be words, lexicase has
  reintroduced it.
  PERIPHERALITY: lexical heads must be phrase-peripheral.
  This constraint, proposed for example in Stowell (1981:70), seems to be an
  artifact of the Chomskyan determination to maintain a VP constituent at all costs
  and an almost equally firm categorial-grammar-like predilection for binary
  constituent structure. If a construction has two immediate constituents and one of
  them is the head, then plainly the head will either be at the left end of the
  construction or at the right end.
                                     5.5.2.1.5.5.1.3.X-bar levels
       5.5.2.1.5.5.1.3.1.                       Complements versus adjuncts
       5.5.2.1.5.5.1.3.2.                       The N' analysis

      The N' analysis: A noun phrase such as the fanatic lexicalist can be analyzed
  in terms of PSRs and conventional X-bar grammar as including a NOM or N'
  node between the NP and the N (see (a) in Figure 1.7). In a lexicase grammar,
  however, because of the one-bar constraint, there can be no node between the NP
  and the head noun, so only a one-bar analysis such as the (b) in Figure 1.7 is
  possible.

(a)                        NP                         (b)            NP

      DET                         NOM                       Det      Adj          N

      ART                  ADJ              N               the     fanatic   lexicalist

      the               fanatic       lexicalist

                                        Figure 1.7

  It has been suggested (cf. Radford 1981:104 and the references cited there,
  especially Hornstein and Lightfoot 1981) that (i) such intermediate nodes are
  required, and that (ii) because there may be more than one Adj preceding a noun,
  we must give up the Succession constraint and allow N' → ...N'... recursion. This
  analysis is based on the assumption that sequences such as the big bad wolf have
  the structure [NP Det [N' big [N' bad wolf ] ] ].


  The primary evidence for this assumption is the fact that 'one' can substitute either
  for bad wolf or for big bad wolf. At the end of the chapter, however, Radford
  presents some exercises which show that the one-substitution test gives



contradictory results, and which in fact call into question the use of any such
substitution test as a means to determine constituency. The other type of evidence
adduced for an N' analysis is coordinatability, but this is subject to an alternate
explanation in terms of gapping (see 6.4), so the case for N' is dismissed.

    In the lexicase analysis of NPs, there can be no N' distinct from NP. The one-
substitution data can be accounted for by analyzing one as an anaphoric noun
which allows no inner (subcategorizing) attributes, and the appearance of more
than one Adj before a noun is iteration rather than recursion. No special
relaxation of Succession is required to account for this. Part of the rule stating the
allowable attributes will look like the following:
RR-22. [+N]  →  [ ([+Adj])   ]
                [ -___[+Adj] ]

and if nothing further is said, the rule automatically allows any number of
adjectives to precede the noun.
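
The effect of RR-22 can be sketched as a check on a noun's dependents. This is only an illustration under my reading of the two features: ([+Adj]) is taken to mean that adjective dependents are optional, and -___[+Adj] that no adjective dependent may follow the noun. The function name and the (index, word class) encoding are hypothetical.

```python
def satisfies_rr22(noun_index, dependents):
    """Check a noun against RR-22.

    dependents is a list of (linear index, word class) pairs for the words
    that depend on the noun. Adjectives are optional and unlimited in number,
    but every adjective dependent must precede the noun.
    """
    return all(idx < noun_index for idx, cls in dependents if cls == "Adj")


# "the big bad wolf": wolf is word 4, with three dependents to its left.
print(satisfies_rr22(4, [(1, "Det"), (2, "Adj"), (3, "Adj")]))
# A noun with no adjectives at all also passes.
print(satisfies_rr22(2, [(1, "Det")]))
# A hypothetical adjective dependent after the noun violates the rule.
print(satisfies_rr22(2, [(1, "Det"), (3, "Adj")]))
```

Nothing in the check counts the adjectives, which is the point: iteration falls out from the absence of any further restriction, without nested N' structure.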
                                    5.5.2.1.5.5.1.4.Minor categories

The GKPS theory does not observe a particularly stringent set of conditions on
the X-bar system. Minor categories are left out of the bar system (Weak
Lexicality), and minor categories can be introduced as non-heads (a version of
Weak Maximality), for instance. (Pullum 1985:27)

                             5.5.2.1.5.6.Constraints on lexical representations

To ensure that a new and more powerful monster did not arise from the ashes of
the old, lexicase has started from something close to Chomsky's feature notation
for lexical matrices as proposed in Aspects and added new constraints where
possible. Constraints on lexical representation and the use of features at the
present time include the following:
                                5.5.2.1.5.6.1.   Constraints on possible lexical representations
                                5.5.2.1.5.6.2.   Feature decomposition of word classes

    Chomsky originally made the latter suggestion in order to be able to express
cross-categorial generalizations, especially with regard to the distributional
similarities between verbs and predicate adjectives and their respective
nominalizations. However, attempts to apply this consistently have resulted in
many clumsy and counterintuitive analyses. It has been shown within the lexicase
framework (Starosta forthcoming a:103-5; Lee 1972:65-6; Starosta, Pawley, and
Reid 1981:47-8, 50-62) that if we eliminate this awkward revision of phrase
structure grammar, the resulting simplified framework is still able to account for
the data which originally motivated it in terms of contextual features which carry
over in the process of lexical derivation, so that the generalization is captured in a
lexical rule rather than in the syntactic representation itself.


                              5.5.2.1.5.6.3.   Word class inventory
Syntactic categories are limited and atomic.

      (6) The inventory of major lexical classes (parts of speech) is limited to a
      small set of atomic categories.

 Only the following syntactically definable word classes ('parts of speech') are
 possible across languages: V (verb), Adv (adverb), Adj (adjective), N (noun), Det
 (determiner), P (preposition/postposition), Cnjc (conjunction), and Sprt (sentence
 particle), and no word can belong to more than one of these classes at the same
 time. Any other word class labels must either be alternative names for one of
 these eight classes, refer to subclasses of these classes, or be in error. Thus there
 is no 'Aux' class, no 'classifier' class, no 'Pronoun' class, etc., unless these are
 taken to be subclasses of the basic eight classes. This constraint, like the others in
 this section, is not an ex cathedra decree. Rather, it has been found in lexicase
 work on more than seventy languages that this seems to be the maximum number
 of word classes a language may contain. By the rules of the constraints game,
 anyone who claims that more than these eight classes are needed will need to
 justify that claim by showing that there are data which cannot be handled without
 postulating additional classes, and/or that assuming this set entails the loss of
 significant cross-linguistic generalizations. To date, this constraint has had very
 salutary consequences. For example, recognizing 'auxiliaries' as verbs, and
 forcing the reanalysis of 'classifiers' in Southeast and East Asian languages as
 nouns, has garnered some nice language-internal and cross-linguistic
 generalizations.

 Every word in a grammar is a member of one and only one of a restricted set of
 syntactic word classes or 'parts of speech', probably limited to the following: N, V,
 Adj, Adv, Det, P, cnjc ('conjunction'), and possibly SPart ('sentence particle').
 Every lexical entry is a member of one of these seven or eight classes. Each
 syntactic class is defined in terms of a single positive atomic feature drawn from
 this set of categories, such as [+N], [+Adj], etc., and no lexical item is marked
 positively for more than one of these features. All the categories are defined
 primarily in distributional terms, that is, in terms of sequential and non-sequential
 dependencies which they contract with other items in a phrase. Major syntactic
 categories are divided into syntactic subcategories based on differences in
 distribution. Thus nouns are divided into pronouns (no modifiers allowed),
 proper nouns (no adjectives and typically no determiners allowed), mass nouns
 (not pluralizable), etc., and similarly for the other syntactic classes. The
 contextual features associated with the words in these various distributional
 classes determine which words are dependent on which other words. They are in
 effect well-formedness conditions on the dependency trees associated with the
 words in a sentence. Morphological properties and derivational potential are of
 secondary importance and semantic or 'notional' criteria of tertiary importance.

    The requirement that major class features be atomic and mutually exclusive is
 meant to exclude analyses such as Hudson's (Hudson 1979c:16; Hudson 1984:98)



 in which gerunds are simultaneously marked positively for membership in the
 verb and noun classes, as well as elastic X-bar type GB analyses in which lexical
 categories are treated as feature complexes. For example, it rules out definitions
 of N, V, A, and P as [+N,-V], [-N,+V], [+N,+V], and [-N,-V] respectively
 (Stowell 1981:21; Gazdar, Klein, Pullum, and Sag 1985:20-1), Jackendoff's
 treatment of nouns for example as (Jackendoff 197:3):

 [ +Subj ]
 [ -Obj  ]
 [ +Comp ]

 and Chomsky's cutesy analysis of passive participles as simply [+V], a
 neutralization of verbs [+V,-N] and adjectives [+V,+N] (Chomsky 1981:55). It
 also excludes systems allowing feature matrices with internal structure, such as
 that developed in GPSG (Gazdar, Klein, Pullum, and Sag 1985:21), Bresnan's
 proposal to define categories as ordered pairs of an integer (representing bar level)
 and a bundle of feature specifications (Bresnan 1976, cited in Gazdar, Klein,
 Pullum, and Sag 1985:21), and analyses involving ordered pairs of sets of
 agreement feature specifications (Stucky 1981 and Horrocks 1983, cited with
 apparent approval in Gazdar, Klein, Pullum, and Sag 1985:21).
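
 The requirement that every lexical item carry exactly one positive major class feature can be sketched as a validator. The representation of a lexical matrix as a set of signed feature strings, the function name, and the subclass feature "+prnn" are hypothetical conveniences; the class labels are the eight listed above. The sketch checks only the positive class features and does not model the further ban on structured feature matrices.

```python
MAJOR_CLASSES = {"V", "Adv", "Adj", "N", "Det", "P", "Cnjc", "Sprt"}


def major_class(features):
    """Return the single major word class of a lexical matrix, or raise ValueError.

    features is a set of strings such as "+N" or "+prnn".
    """
    positive = sorted(
        f[1:] for f in features if f.startswith("+") and f[1:] in MAJOR_CLASSES
    )
    if len(positive) != 1:
        raise ValueError(f"expected exactly one major class feature, found {positive}")
    return positive[0]


print(major_class({"+N", "+prnn"}))   # a noun with a hypothetical subclass feature

try:
    major_class({"+N", "+V"})         # a gerund marked for both classes is rejected
except ValueError as err:
    print("rejected:", err)
```

 Subclass labels such as 'pronoun' or 'auxiliary' are handled as additional features inside one of the eight classes, never as a ninth class.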


                              5.5.2.1.5.6.4.   Triune sign
                                  5.5.2.1.5.6.4.1.tertiotic sign

 A lexicase grammar is a grammar of words, and words are TRIUNE SIGNS
 composed of sound, meaning, and distribution:

4) The triune sign




                                               sound




                                  meaning                distribution
 All words have all three components, since all three components are necessary.
 None can be predicted from the others. 6
                              5.5.2.1.5.6.4.2.Signs and lexical entries            44
                              5.5.2.1.5.6.4.3.Lexical disjointness                 28

A word can be a member of only one syntactic category.

 This constraint seems to be what is meant by Pullum's lexical disjointness
 condition (Pullum 1985:6), formulated as: If X⁰ → t and Y⁰ → t' are both in R, and
 X ≠ Y then t ≠ t'. Pullum refers to this condition as 'unrealistic', but
 doesn't elaborate. Perhaps he is concerned about the fact that an element like run
 can appear as either a noun or a verb. However, in this situation we are talking
 about a root run, which has a pronunciation and a meaning but not a syntactic
 category. In a lexicase grammar, a word (as opposed to a root) has a grammatical
 category as well as a pronunciation and a meaning. Thus if we find that run
 appears as a noun and as a verb, then we are dealing with two lexically distinct
 though homophonous words, run1 [+V] and run2 [+N]. The regular relationship
 between such pairs of words is captured in lexicase by means of derivation rules
 rather than by complex lexical entries.
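
 The 'separate entries plus derivation rule' position can be pictured with a
 small sketch. It is only an illustration under stated assumptions: the class
 name Word, the glosses, and the function zero_derive_noun are hypothetical
 names invented for this example, not part of the lexicase formalism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    form: str       # sound
    meaning: str    # meaning (an informal gloss, assumed for illustration)
    category: str   # distribution: exactly one atomic major class feature


def zero_derive_noun(verb: Word, gloss: str) -> Word:
    """Hypothetical derivation rule relating a [+V] word to a homophonous [+N] word.

    The gloss is passed in explicitly because the meaning of the derived
    item is not fully predictable from its source.
    """
    if verb.category != "+V":
        raise ValueError("this rule applies only to [+V] words")
    return Word(verb.form, gloss, "+N")


run1 = Word("run", "move fast on foot", "+V")
run2 = zero_derive_noun(run1, "an act of running")

# Two distinct entries share a form; neither entry carries two categories.
lexicon = {run1, run2}
print(sorted(w.category for w in lexicon))
```

 In this sketch the root run never appears as an entry; only the two
 categorized words do, and the rule merely states the regular relation between them.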

      In a lexicase grammar, a word is defined by three elements -- sound, meaning,
 and distribution -- rather than just the Saussurean semiotic sound and meaning.
 None of these three elements is predictable from the others, so all must be listed
 in the lexicon. This constraint rules out single entries which have subparts, say, a
 [+N] part marked for certain idiosyncratic features, a [+V] part with others, and
 the rest of the features held in common (Chomsky 197090), with 'redundancy
 rules' (ibid.) accounting for whatever is predictable across classes. However, the
 term 'redundancy rule' is inappropriate, because the information contained in such
 rules is not redundant: the existence, form, and meaning of a derived item is not
 completely predictable from its derivational source, and the need for rule features
 to make everything come out right at the end is just an admission of this fact. A
 rule feature is merely an address or a pointer, a clue in a treasure hunt game
 telling the searcher where to look next.

     Instead of positing combined entries and rule features, a lexicase grammar
 must establish separate lexical entries, one for the verb and one for the noun, and
 write a derivation rule to account for the regular relations between them. Because
 I take this position, I have been accused of disturbing 'the unity of the word'.
 However, I think the combined entry approach was never anything more than a
 lexicographic convention based on a neglect of the distributional dimension of the
 linguistic sign, a mystical belief in 'the unity of the word', and the availability of
 the alphabet. My experience with unwritten languages suggests that this 'unity'
 has little psychological justification. There is no more linguistic reason to list
 items together according to their common roots or (derivational) stems (that is,
 according to their etymological sources) than there is to group them according to
 their meanings (as in a thesaurus) or their distributions (as in grammar lessons or
 reference works for the preparation of teaching materials). Except for inflectional
 paradigms, 7 combined entries are just an arbitrary, 8 awkward, and formally
 unworkable 9 alternative representation for separate independent entries, in my
 opinion, and this constraint makes the potentially testable claim that such
 composite representations have no psychological validity.

 6
   Claims that distribution can be predicted from meaning (e.g. Levin 19&&:&&)
 are circular, because the 'semantic classes' used in doing the predicting are set up
 to match the grammatical properties in the first place.
                                 5.5.2.1.5.6.4.4.Die Einheit des Wortes
      5.5.2.1.5.6.4.4.1.                      No polysemy - lg

As a consequence of the triune sign principle, there is no such thing as 'polysemy'.
Words which differ in any of the three aspects are separately listed, and separately
learned. If words of identical form occur in parallel sets, as for example pairs of
English transitive and intransitive verbs such as drink (transitive) and drink
(intransitive), and if speakers can be shown to be aware of this fact, then it can be
accounted for grammatically in terms of analogical formulae ('derivation rules').
Listing homophonous words together under a single lemma as alternative forms
of the 'same word' is a lexicographical tradition with no linguistic motivation.
                             5.5.2.1.5.6.5.      Feature types
      (7) All lexical features are binary, and rule features and double contextual
      features are eliminated.
                                 5.5.2.1.5.6.5.1.Unary and Binary                    29
      5.5.2.1.5.6.5.1.1.                      Unary
      5.5.2.1.5.6.5.1.2.                      Binary
                                 5.5.2.1.5.6.5.2.Contextual
      5.5.2.1.5.6.5.2.1.                      Domain

The sisterhead constraint is a restriction on the domain of subcategorization of
lexical items: contextual features are marked only on the lexical heads of
constructions, and refer only to lexical heads of dependent sister constructions.
As I noted in my paper, 'A place for case' (Starosta 1976:3-5), the Standard Theory
leaves the matter of subcategorizational domain open; according to Chomsky's
Aspects, selectional restrictions apply between items which are grammatically
related, and grammatical relations are where you find them.

7
    . And maybe even including inflectional paradigms.
8
 . In order to maintain this fictional unity in a real generative grammar, it will be
necessary to list many essentially separate entries together in a clump under a
single phonological form when their semantic properties are subjectively felt to be
close enough to justify this stratagem (cf. Chomsky 1970:190; Hudson 1980b:6).
9
. Of course anything, including combined cross-categorial entries, can be made to
work with rule features, which is one reason why rule features are bad.

   On the other hand, a grammar meeting the sisterhead constraint clearly
commits itself about the domain of such features: contextual features can be met
or violated only by the lexical heads of sister attributes within the same
construction. Thus all grammatical relationships can be stated in terms of X (the
REGENT) and Comp_i (the DEPENDENT or ATTRIBUTE), and all pairs of items
standing in a regent-dependent affiliation are by definition grammatically
related. 10
     5.5.2.1.5.6.5.2.2.                Implicational
     1.1.1.1.1.1.1.1.1.case government
     1.1.1.1.1.1.1.1.2.agreement
     1.1.1.1.1.1.1.1.3.selection
                                5.5.2.1.5.6.5.3.R E S T

    In the last two years, linguists working in the lexicase framework have come
to recognize that there are two different kinds of contextual requirements marked
on lexical items, absolute and implicational (Khamisi 1985:341; Pagotto and
Starosta 1986:4-5; cf. Blake 1983:167). Absolute contextual features mark an item
as requiring or excluding a particular kind of dependent sister category in order
for the sentence to be well formed. For example, in English transitive verbs must
have overt objects, and finite verbs must have overt subjects. If either of these
requirements is not met, the sentence is clearly ungrammatical, e.g.

           (20)       *John resembled.

                             [+___[+N]]

           (21)       *Resembled a gross-breasted rosebeak.

                      [+[+N]___]

Similarly, pronouns may not cooccur with determiners, and the presence of a
determiner violating a negative absolute contextual feature results in
ungrammaticality:

           (22)       *The   I           love the      you.

                       [+Det] [-[+Det]]         [+Det] [-[+Det]]

10
  . Cf. Hudson 1979c:9-10 for a very similar approach to the domain of
subcategorization.

     However, some positive contextual restrictions can be violated without
 necessarily resulting in ungrammaticality. One kind of example is the
 requirement that finite verbs have subjects. In English this is almost absolute
 (except for 'headless relative clauses' and probably for imperatives), but in
 Swahili (Khamisi 1985:340-2), Spanish, and other 'pro-drop' languages it is not.
 In this situation we need to say that the finite verb expects a subject ([ [+Nom]])
 rather than that it requires one ([+[+Nom]]). The other common place where this
 comes up is in 'wh-movement' situations (cf. Pagotto 1985a:30 ff). Lexicase does
 not allow empty categories, so if a constituent is not there, it is not there; yet a
 missing 'wh-moved' category does not necessarily result in ungrammaticality:

            (23)       What did you [ S put it on           ]?
                                                [__[+N]]

 To account for these two different kinds of contextual requirements, then, lexicase
 grammar now distinguishes absolute from implicational features. Violation of an
 absolute polar feature is sufficient grounds for ungrammaticality, whereas an
 implicational feature results in ungrammaticality only if the implied constituent
 has not been identified by a rule of interpretation at the end of the derivation.
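
 The difference between the two kinds of feature can be made concrete with a
 minimal sketch. Everything named here is an assumption made for the
 illustration: the classes Ctx and Word, the function violations, the
 'left'/'right' encoding of linear position, the [+P] label on 'on', and the
 'interpreted' set standing in for whatever a rule of interpretation has
 identified by the end of the derivation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ctx:
    kind: str     # "require" (absolute +), "exclude" (absolute -), "expect" (implicational)
    feature: str  # feature sought on the head of a dependent sister, e.g. "+N"
    side: str     # "left" or "right" of the regent (assumed encoding of order)


@dataclass(frozen=True)
class Word:
    form: str
    features: frozenset
    contextual: tuple = ()


def violations(regent, left=(), right=(), interpreted=frozenset()):
    """Return the contextual features of `regent` that make the structure ill-formed."""
    sisters = {"left": left, "right": right}
    problems = []
    for ctx in regent.contextual:
        present = any(ctx.feature in dep.features for dep in sisters[ctx.side])
        if ctx.kind == "require" and not present:
            problems.append(ctx)
        elif ctx.kind == "exclude" and present:
            problems.append(ctx)
        elif ctx.kind == "expect" and not present and ctx not in interpreted:
            # An unmet expectation is fatal only if interpretation never identifies it.
            problems.append(ctx)
    return problems


john = Word("John", frozenset({"+N"}))
the = Word("the", frozenset({"+Det"}))
resembled = Word("resembled", frozenset({"+V"}),
                 (Ctx("require", "+N", "left"), Ctx("require", "+N", "right")))
pronoun_i = Word("I", frozenset({"+N"}), (Ctx("exclude", "+Det", "left"),))
gap = Ctx("expect", "+N", "right")
on = Word("on", frozenset({"+P"}), (gap,))

print(violations(resembled, left=[john]))            # cf. (20): object requirement unmet
print(violations(pronoun_i, left=[the]))             # cf. (22): excluded determiner present
print(violations(on))                                # expectation unmet and unidentified
print(violations(on, interpreted=frozenset({gap})))  # cf. (23): identified, no violation
```

 The sketch treats an absolute feature as checked immediately against the
 dependent sisters, while an implicational feature is only reported if nothing
 in the 'interpreted' set discharges it.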


                                  5.5.2.1.5.6.5.4.No double contextual features   31

No lexical entry may contain a double contextual feature.

 A double contextual feature is one in which a contextual feature is included
 within another contextual feature. For example, in order to maintain a GB-type
 analysis of infinitival complements, a feature might be proposed for verbs such as
 want which required the subject of an embedded infinitive to have a preceding
 accusative NP:

 want
 [ +V
   +[ -fint
      +[+Acc]___ ] ]

 Such an analysis would be ruled out by this constraint, since it requires a
 contextual feature nested within another contextual feature. It is important to
 impose this constraint, since without it the sisterhead constraint can be evaded and
 dependencies can be stated between any two arbitrarily chosen categories.
 Unfortunately, this constraint is also easy to evade even within the lexicase
 notational system. For example, instead of the feature [-[+Acc]___] in the above
 matrix, we could substitute some dummy non-contextual feature [+dmmy]:

 want
 [ +V
   +[ -fint
      +[+dmmy] ] ]

 and then mark infinitives one way or the other for this feature (SR-1), which is
 subsequently interpreted as a contextual feature (RR-12):

 SR-1.       [-fint]   →   [dmmy]

 RR-12.      [+dmmy]   →   [-[+Acc]___]

 Thus this constraint needs to be modified to prevent this kind of evasion as well
 as more direct violations. Unfortunately, as of this writing, the lexicase analysis
 of 'headless relative clauses' such as whatever bit John uses exactly this kind of
 mechanism, and it is not clear how to analyze this construction without it.
                                  5.5.2.1.5.6.5.5.No diacritic {rule} features   32

No lexical entry may contain a rule feature.

 This constraint requires that a grammar contain no lexical diacritic feature of the
 form [+Ri] or [-Ri], where 'Ri' is the number or address of a particular rule in a
 grammar, and where [+Ri] indicates that the item on which it is marked must
 undergo rule Ri in every derivation, while [-Ri] indicates that its host item may
 never undergo Ri in any derivation for which the structural analysis of that rule is
 met.

     Rule features are not nice, and I don't want them in my nice clean grammar.
 Rule features are lexical specifications marking a lexical item as an exception to
 some grammatical rule. They can be used to camouflage a leaky rule as a general
 and productive rule, and their ready availability has had the effect of making it
 easy for syntactic-rule-oriented syntacticians to state transformations or relation-
 changing rules as if they were true productive structural generalizations, and
 disregard the fact that most 'syntactic rules' are lexically governed. This
 constraint decreases the power of the metatheory in a very salutary way: it weeds
 out some otherwise possible but unnatural and ad hoc analyses.

     Such a constraint immediately excludes all possible grammars which adopt
 the kind of notation and machinery proposed in Lakoff's Irregularity in Syntax
 (Lakoff 1970:22-6). A direct consequence of the stricture against rule features is
 the elimination of all transformations or relation-changing rules which have any
 arbitrary lexical exceptions, since there is no way to block the application of such
rules if there are no rule features to refer to. Interestingly, this turns out to be
most of the transformations in the Chomskyan literature. The (partial)
generalization previously captured by such rules will then have to be captured
somewhere else, and the only other place to do it is the lexicon.

     As an example, consider the 'Dative Movement' rule. As is well known, there
are lexical exceptions to this rule, so the absence of rule features means that it
cannot be stated as a transformational rule or a Relational Grammar promotion
rule. Then where can this property of English grammar be stated, or can't it? It
should come as no surprise that it ends up in the lexicon. Thus the class of verbs
which occurs in the 'moved' environment (+[___NP NP]) for 'Dative Movement'
is treated as a separate syntactic class from the set of verbs which can occur in the
'unmoved' environment (+[___NP to/for NP]), and the relationship between the
two classes is accounted for in terms of a lexical derivation rule that derives the
+[___NP NP] verbs from their +[___NP PP] counterparts or vice versa (Starosta
and Nomura 1984:30). Since a derivation rule is a semi-productive analogical
process of word formation, gaps are allowed and expected. The psychological
claim made by this constraint then is that there is not a single class of verbs, some
of which allow themselves to participate in certain external reshufflings and
others which do not. Instead, there are two distinct classes of verbs, the 'moved'
and the 'unmoved' verbs. They are learned and stored separately, but because of a
high degree of overlap in the class membership, the speaker is able to move items
from one class to the other by analogy, and to understand new examples produced
by other speakers in the same way. This claim is a consequence of the constraint
against rule features, and one which in principle is subject to empirical testing.
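
The two-class treatment of 'Dative Movement' can be sketched as follows. This
is a hedged illustration only: the verbs chosen (give, send, donate), the
set-of-pairs lexicon, and the function names derive_moved and gaps are
assumptions introduced for the example, and the derivation is shown in one
direction although the text allows either.

```python
UNMOVED = "+[___NP PP]"   # 'unmoved' environment
MOVED = "+[___NP NP]"     # 'moved' environment

# Each (form, frame) pair is a separate lexical entry; no entry carries a rule feature.
# The membership shown is assumed for illustration.
lexicon = {
    ("give", UNMOVED), ("give", MOVED),
    ("send", UNMOVED), ("send", MOVED),
    ("donate", UNMOVED),   # no 'moved' counterpart is listed: a lexical gap
}


def derive_moved(entry):
    """Analogical derivation rule: propose a 'moved' counterpart for an 'unmoved' verb."""
    form, frame = entry
    return (form, MOVED) if frame == UNMOVED else None


def gaps(lex):
    """Forms whose proposed counterpart is not actually listed in the lexicon."""
    return sorted(form for form, frame in lex
                  if frame == UNMOVED and derive_moved((form, frame)) not in lex)


print(gaps(lexicon))
```

Because the rule only proposes entries, a missing counterpart is simply an
unlisted word; nothing has to be marked on 'donate' to block a transformation.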

    To cite another example, because of the unavailability of rule features,
predicate adjectives cannot be lexically identified with adnominal ones, marking
some items as positive or negative exceptions to Lakoff's WH-DEL and ADJ-
SHIFT transformations (Lakoff 1970). Instead, predicate adjectives must be listed
in the lexicon as a separate class from adnominal adjectives. The intersection of
the two sets is accounted for again in terms of a lexical derivation rule, and the
relative complements of the intersection (the words that occur in only one of the
two sets but not the other) are considered lexical gaps.

    Finally, this constraint also excludes the lexical specification of irregular
morphology allowed in GPSG (Gazdar, Klein, Pullum, and Sag 1985:34) and
Hudson's morphological cross-referencing system (Hudson 1980b:2,12,15).
Hudson groups irregular inflectional forms such as write : wrote : written into
single entries consisting of a common root, e.g. /rait/, plus a specification of
vowel alternations dependent on morphological contexts. This can be seen as
either a notational variant of separate lexical entries or as the old structuralist
replacive morph analysis, a way of marking a single lexical item to undergo its
own private morphophonemic rule, with the implied claim (which I consider
highly dubious) that the vowel changes reflect some psychologically real
morpheme division. When other irregular sets such as drive : drove : driven
exhibit the same alternations, Hudson allows them to refer to the same set of
vowel changes and contexts (Hudson 1980b:14-15). At this point we are surely
dealing with the equivalent of a minor rule in Lakoff's sense (Lakoff 1970), with
the vowel changes and contexts considered to be independent of any particular
lexical entry (Hudson 1980b:12). The cross-referencing notation is a rule feature
triggering the application of the minor rule, although Hudson explicitly denies this
(Hudson 1980b:15; cf. Hudson 1984:58). As such, it is excluded by the anti-rule
feature constraint, which thus provides an answer to his question (Hudson
1980b:4) as to where to draw the line between reasonable and unreasonable
analyses.


                             5.5.2.1.5.6.6.      Content
                                 5.5.2.1.5.6.6.1.Phonology - lg

Phonological representations are composed of segments in dependency relations
to each other, with the primary-stressed segment (if any) governing any lower-
stressed syllabic nuclei, which in turn may govern 'sonables' (David Stampe, p.c.),
which govern other consonants, etc. These dependency relations are similar to
syntactic dependency relations except that a segment may sometimes be governed
from two directions. Other than 'constituency' breaks between segments
depending on two different regents, e.g. two different syllabic peaks, there are no
internal boundaries of any kind in lexical representations. This having been said,
it must be confessed that this is partly wishful thinking, since no serious work on
dependency phonology has been done within the lexicase framework. The part
about the absence of internal boundaries however may be well motivated, since it
has been possible to find alternative accounts for putative boundary phenomena in
the cases investigated so far.
                                 5.5.2.1.5.6.6.2.Morphology
     5.5.2.1.5.6.6.2.1.                       Internal boundaries - lg

Lexicase morphology is 'seamless morphology', closely akin to the 'word-based
morphology' espoused by Rajendra Singh and his colleagues (cf.
Ford/Singh/Martohardjono 1997). Words, including compounds, have no internal
structure other than phonological structure, and no internal morphological
boundaries. Phenomena accounted for in other frameworks in terms of
morphological boundaries are either diachronic matters rather than syntactic
phenomena, and/or can be accounted for in terms of word formation strategies
that do not refer to or create internal morphological structure.
     5.5.2.1.5.6.6.2.2.                       No bound morphemes - lg

It is a corollary of the 'seamless morphology' principle that there are no bound
morphemes in the lexicon or anywhere else in a grammar. That is, there are no
stems, no affixes, and no words inside of other words. In this respect, the
seamless morphology position is more radical than that proposed in Anderson's
 'A-morphous morphology' (Anderson 1992), which allows for bound stems and
 for internal structure in compounds.
      5.5.2.1.5.6.6.2.3.                       No stems

 In inflecting languages, every member of every paradigm is a separate lexical
 entry, with its own form, meaning, and distribution. There are no bound 'stems'
 abstracted from paradigms and stored as representatives of paradigms. Whether a
 given fully inflected entry is actually stored in active memory or constructed on
 the fly from representative members of the paradigm ('principal parts', 'citation
 forms') is a separate empirical question that has to be decided on a case-by-case
 basis (cf. Hudson 1984, 27-28), but what goes into a sentence is a fully specified
 inflected word.
      5.5.2.1.5.6.6.2.4.                       No affixes

 The lexicon contains no bound affixes. In particular, it contains no affixes with
 elaborate argument structure properties such as those assumed in word formation
 approaches by linguists such as Williams, Selkirk, and Lieber (cf. Hendrick 1995,
 302). Instead, lexicase accounts for 'affixation' in terms of seamless word
 formation strategies rather than segmental affixes, and thus avoids the problems
 with zero, discontinuous, and non-segmental 'morphemes' that bedevil
 Chomskyan Item-and-Arrangement analyses and the earlier American
 structuralist IA analyses from which they are derived. Until recently, lexicase
 grammars kept the traditional distinction between inflectional and derivational
 morphology that was still assumed in Anderson's A-morphous morphology, for
 example, but lexicase is now inclining toward the position taken by Ford et al.
 (Ford/Singh/Martohardjono 1997, 3) that this distinction cannot be maintained.
      5.5.2.1.5.6.6.2.5.                       No compounds

 If compounds are words containing other words, then lexicase grammars can
 contain no compounds. Instead, 'compounds' turn out to be a special case of
 affixation, and fall under the seamless morphology stricture against internal
 morphological structure. This somewhat surprising conclusion is defended in
 several forthcoming papers by Singh and Dasgupta (Singh/Dasgupta 1998) and
 Starosta (Starosta to appear a, b).
                                  5.5.2.1.5.6.6.3.Syntax
                                  5.5.2.1.5.6.6.4.Semantics
                              5.5.2.1.5.6.7.      Abstractness and expressive power
                                  5.5.2.1.5.6.7.1.No abstract lexical entries         27

The lexicon contains only actually occurring words.

 The lexicon contains only full words, free forms with a syntactic category, a
 meaning, and a pronunciation. There are no affixes, derivational morphemes,
abstract verbs such as CAUSE, abstract underlying forms such as *ostrac for
ostracize, and no phonetically null elements such as Δ, PRO, or t.

                             5.5.2.1.5.7.Constraints on rules          34

[d:\rt\th\dpndc\holdlxclxsm.doc] grammar as rules; rule types
                                5.5.2.1.5.7.1.        Rules as the expression of generalizations in the
                                                 lexicon
    Intraphrase relations are still part of a speaker's linguistic knowledge, but they
are treated as extensions of relations between the head lexical items of these
sentences, and these relations in turn are accounted for by Word Formation
Strategies rather than via some arbitrary and abstract 'underlying' representation.
WFSs can refer only to preexisting inherent and local valency features present in
the lexical representations of these words and their phonological shapes, and so
don't affect the set of possible grammatical representations.

LP, ID, X-bar
                                    5.5.2.1.5.7.1.1.entry-internal
     5.5.2.1.5.7.1.1.1.                          Linking
                                    5.5.2.1.5.7.1.2.analogical
                                    5.5.2.1.5.7.1.3.regent-dependent
     5.5.2.1.5.7.1.3.1.                          Chaining
                                5.5.2.1.5.7.2.       No transformations                         34
By this constraint, a grammar contains no 'syntactic rules' in the traditional
Chomskyan sense at all; that is, there are no phrase structure rules or
transformations.

    'Lexicalism' as used here is not quite the same thing as Chomsky's 'lexicalist
hypothesis' (Chomsky 1970), which is also a constraint on the expressive power
of grammars. The lexicalist hypothesis in its strongest form can be characterized
as follows:

     'The strong lexicalist hypothesis of Jackendoff (1972) excludes all
     morphological phenomena from the syntax.' (Aronoff 1976:9)

     '(178) Syntactic rules cannot make reference to any aspects of word-internal
     structure.' (Anderson 1982; cited in Manning 1996:121-122)



MONOSTRATALITY: a lexicase grammar has only one level of representation.
There are no movement rules, no deletions, no insertions, no adjunctions, no
‘fusions’ (Halliday 1985:72), no empty categories, and no non-lexical categories.
The original generative grammars made extensive and crucial use of descriptive
devices called 'transformations' to relate overt 'surface structures' to meaning-
bearing 'deep structures', and to account explicitly for the systematic relation
between corresponding sentence patterns such as active and passive structures.
Among the people who have been seriously concerned about generative power
and what to do about it, there is a broadly held view that such transformations are
primarily to blame for the limited empirical content of most versions of generative
grammar, and that something must be done to limit this power. A first step in this
direction was Emonds' structure-preserving constraint (Emonds 1976). This was
an attempt within the Standard Theory tradition to limit the power of
transformational rules in order to 'clear the way for a generative model of a
language which will be both formally manageable and empirically significant'
(op. cit., endflap).

    By 1968, one faction of the MIT school had developed the transformational
machinery into an extremely abstract though quite inexplicit logic-oriented
system which came to be known as 'generative semantics'. Partly as a reaction to
the excesses of generative semantics, perhaps, Chomsky (Chomsky 1970)
proposed that some kinds of grammatical relatedness, in particular certain kinds
of nominalization, should be accounted for in terms of lexical rules rather than
transformations. This proposal, which is referred to as the lexicalist hypothesis,
has since spread to other areas of grammar and has been important in the genesis
of new grammatical frameworks within and beyond the bounds of MIT
linguistics, such as Bresnan and Kaplan's lexical-functional grammar (cf. Bresnan
1982a), Brame's Base Generated Syntax (Brame 1978), and Hudson's Daughter
Dependency Grammar (Hudson 1977) and Word Grammar (Hudson 1984), all of
which reject the use of transformations altogether, although in many cases they
seem to have created new devices of unknown power to replace them. 11

    Lexicase was apparently 12 the first generative framework to propose the total
abolishment of transformations, so that there is no distinction between deep and
surface structures, levels, strata, 'functional descriptions', or other simultaneous
levels of syntactic or semantic representation, and there are no grammatical rules
that map one sequence or hierarchy onto another; that is, there are no rules that
adjoin, delete, permute, relabel, or copy parts of one structure to produce another
structure.

    To invalidate the stricture against transformations and similar stratum-onto-
stratum mapping rules, it would be necessary to find some valid fact of grammar
which cannot be accounted for without invoking transformations or other equally
powerful descriptive devices. Much of the work in lexicase since 1970 has thus
been devoted to showing that facts formerly handled by transformations are
either not facts of grammar proper or can be handled in the lexicon without
resorting to transformations. For example, facts about discourse-related
phenomena such as clause-external anaphora, conjunction reduction, short
answers to questions, etc., must be considered part of performance, at least until
someone can show such phenomena to be amenable to a formal and explicit
description. On the other hand, grammatical processes such as Subject-Aux
Inversion in English have been shown (Starosta 1977) to be susceptible to a
purely lexical analysis, with no resort to transformations necessary.

11
     . For other references, see Hudson and McCord n.d.: 1.

12
     No mention in Newmeyer .ba.

 [d:\rt\th\dpndc\holdlxclxsm.doc], Subject-Aux Inversion
                              5.5.2.1.5.7.3.   No Phrase structure rules     35
There are no Phrase Structure rules (PSRs) as separate from lexical rules.

      (2) Phrase Structure Rules are eliminated as redundant.

 A final constraint is found to follow from those proposed above: Phrase
 Structure rules are redundant and can be eliminated, and all rules of grammar
 proper can be reformulated as generalizations about relations among lexical
 entries or relations between features within lexical entries. Phrase Structure rules
 are used in a grammar to describe the ways in which words are allowed to
 combine into phrases, clauses, and sentences. Given the constraints proposed in
 this section, however, this information can be provided by means of contextual
 features marked on the lexical heads of constructions. (This consequence will be
 discussed in detail in section 2.4.3.) Eliminating PSRs is nice in terms of
 simplicity, but more important, it constrains possible syntactic representations. To
 demonstrate the difference in power between ordinary PS rules and their lexicase
 counterparts, we need to find tree structures which can be generated by ordinary
 PSRs but not by lexical rules constrained as proposed in this section, and then
 show that all the structures generated by lexical rules are also generable by PSRs.
 It is easy to show that PSRs can generate many kinds of structure which are not
 possible in lexicase. The following common transformational grammar-type rules
 for example generate trees which violate the one-bar constraint, since there is no
 lexical category on the right side of the arrow:

            *S"      →       TOP S'
            *S'      →       COMP S
            *S       →       NP INFL VP
            *NP      →       DET NOM
            *NP      →       NP S
            *NP      →       S
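
The property these rules fail can be checked mechanically. The sketch below is
an illustration under stated assumptions, not a statement of the one-bar
constraint itself: it tests only the condition named in the text (a lexical
category on the right side of the arrow), it assumes that the lexical category
labels are N, V, A, and P, and the names LEXICAL, RULES, and has_lexical_head
are hypothetical.

```python
# Assumed set of lexical category labels for this illustration.
LEXICAL = frozenset({"N", "V", "A", "P"})

# The six starred rules from the text, as (left side, right side) pairs.
RULES = [
    ('S"', ["TOP", "S'"]),
    ("S'", ["COMP", "S"]),
    ("S", ["NP", "INFL", "VP"]),
    ("NP", ["DET", "NOM"]),
    ("NP", ["NP", "S"]),
    ("NP", ["S"]),
]


def has_lexical_head(rhs):
    """True if at least one right-side symbol is a lexical category."""
    return any(symbol in LEXICAL for symbol in rhs)


headless = [(lhs, rhs) for lhs, rhs in RULES if not has_lexical_head(rhs)]
print(len(headless) == len(RULES))      # every starred rule lacks a lexical head
print(has_lexical_head(["V", "NP"]))    # a hypothetical rule VP -> V NP passes the check
```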
    The elimination of Phrase Structure rules is not only motivated by questions
of expressive power, but can be justified in other ways as well. For example,
Jäppinen et al. (Jäppinen, Lehtola, and Valkonen 1986:461) observe that PS rules
are not appropriate for free word order languages such as Finnish, since such rules
are heavily dependent on configuration, and hence on fixed-order concatenation
(Nelimarkka, Jäppinen, and Lehtola 1984:169). Similar points have been made
within GPSG as justification for separating linear precedence from immediate
dominance. In retrospect, it is probably true that the invention of PSRs in the first
place was heavily influenced by the fact that the inventors were speakers of
languages with relatively fixed word order, and that 'Scrambling' transformations
had to be brought in solely to remedy the damage done by adopting an unsuitable
PSR notation in the first place. As will be seen below, the lexicase counterparts of
constituent structure rules are Inflectional Redundancy Rules, which can either
control or ignore linear precedence, and thus have the best of both worlds.

    Whenever a grammatical framework imposes the elimination of some
previously accepted but excessively powerful formal mechanism, there is always
the danger that the Frosch Air Mattress Principle will be found to apply:
whenever you push an air mattress down in one place, it bulges up somewhere
else. This is, for example, how Pullum (1985: 23-4) evaluates the formal
proposals made by Stowell (1981):

     In Stowell's theory, particular grammars provide the lexical entries that trim
     this universal CFL down to a particular language, and its entire content hangs
     on the nature of those devices....As so often within current linguistics,
     particularly GB, a well understood device (PS rules) is 'eliminated', and in its
     place appear a dozen devices for which no one has ever given definitions, let
     alone specified their expressive power when they are used in concert in full-
     scale rule systems.

    The situation in lexicase differs from Pullum's characterization of Stowell in
that lexicase formalisms are a refined and developed subset rather than a vague
and programmatic superset of formalisms proposed in Chomsky 1965. Lexicase
'Derivation Rules' have no clear counterparts in Aspects, but they too have their
origins in the Revised Standard Theory, being essentially a more rigorous
conceptualization and formalization of Jackendoff's 'redundancy rules'
(Jackendoff 1972: 41, Jackendoff 1975).

    In spite of the limited generative power of lexicase grammars incorporating
the constraints discussed above, it has turned out to be possible to describe quite
complex phenomena without going beyond the formal limitations imposed here.
In the nearly two decades since its inception and in the course of accounting for a
broad range of grammatical phenomena from close to fifty languages, lexicase has
(with one or two exceptions in the area of unbounded dependencies) become more
tightly constrained than it was at the beginning. In spite of this, even grammatical
constructions such as clefts, pseudo-clefts, and gerundive nominalizations can be
generated directly without losing generalizations, and the entire English auxiliary
system, including passivization, subject-aux inversion, 'affix-hopping', etc., has
been described in terms of a single set of explicit lexical rules obeying the
constraints (Starosta 1977). Thus there seem to be grounds for thinking that the
theory is on the right track.
                             5.5.2.1.5.7.4.   Other rules
   Since all morphemes in a lexicase representation are already combined into
words, there are no 'percolation rules', 'Spell-Out Rules' (cf. Akmajian and Heny
1975: 116) or 'readjustment rules' (Aronoff 1976: 5) outside the lexicon which put
morphemes together to form words. In fact, as we shall see, a lexicase core
grammar is 'pan-lexicalist' in Hudson's sense (Hudson 1979a, Hudson 1979b; cf.
Chapter 2), since there are no syntactic rules outside the lexicon at all.

[d:\rt\th\dpndc\holdlxclxsm.doc], DRs
     5.6. Non-terminal nodes
PRETERMINAL NODES: Interpreted as conditions on PSRs, the constraints in this
section require that at least one symbol on the right side of the arrow be an
obligatory lexical category symbol such as N or V. This category is the lexical
head of the constructions, and the phrasal category to the left of the arrow is a
projection of that category. In PSR-1 below, for example, the verb is the
immediate lexical head of a sentence, since all the rest of the categories are
optional, and the noun in PSR-2 is the immediate lexical head of the NP, since the
minimal NP is a single noun, for example an indefinite mass noun, proper noun,
or pronoun:

(18) PSR-1               S → (NP) V (NP)

        PSR-2            NP → (Det) (Adj) N
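
As a rough illustration of how the immediate lexical head can be read off such rules, the following Python sketch picks out the obligatory lexical symbols on the right side of a rule. The function name lexical_heads and the category inventory LEXICAL are assumptions introduced here, and parentheses are taken to mark optional symbols, as in (18).

```python
# Hypothetical inventory of lexical categories, assumed for this sketch only.
LEXICAL = {"N", "V", "Adj", "Adv", "P"}


def lexical_heads(rhs):
    """Return the obligatory lexical category symbols in a rule's right side.

    `rhs` is a space-separated string; a symbol wrapped in parentheses,
    such as "(NP)", is treated as optional and therefore cannot be the head.
    """
    heads = []
    for token in rhs.split():
        optional = token.startswith("(") and token.endswith(")")
        if not optional and token in LEXICAL:
            heads.append(token)
    return heads


psr_1 = lexical_heads("(NP) V (NP)")    # right side of PSR-1
psr_2 = lexical_heads("(Det) (Adj) N")  # right side of PSR-2
print(psr_1, psr_2)
```

On this reading PSR-1 yields V and PSR-2 yields N, matching the heads identified in the text; the optional (Adj) is skipped even though Adj is a lexical category.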

The one-bar constraint, however, prevents word-class and word-subclass node
labels such as N, Pron, Proper, V, V-intr, Det, Art (as distinct from Det), Adj,
Adj-degr, and so on from appearing in a lexicase representation. Both of these
types of category labels are incompatible with a strict interpretation of the X-bar
requirement (cf. Pullum 1985) that X be subcategorized only by Comp, where X'
→ X, *Compn. If we abide strictly by the one-bar constraint, word subclass nodes
may not appear in trees because if they did, there would be no legal way to
exclude sentences such as:

           (19) *Every me expired the arrogant Nelly

This sentence cannot be marked as ill-formed using the contextual features of
lexical entries, since in a strict X-bar theory contextual features could only refer to
Comps, that is, to dependent sisters, whereas if the lexical entries are dominated
by word-class symbols, they will have no sisters to be subcategorized by:




(19a)                              *S


               NP                  V                     NP


                             N   V-intr                   N


     Det                Pron     expired    Det          Adj         Proper


   Quant                 me                 Art       Adj-degr        Nelly


   every                                    the       arrogant


   We could make a special exception for major class X⁰ labels, as is
conventionally done without comment in X-bar analyses, but in lexicase this
would be pointless and redundant, since the information conveyed by a node label
such as N is already present in the form of the lexical category feature in the
matrix of the noun, as shown in (19b).

(19b)                                       *S


                         NP        V                                   NP


    DetP                     N   expired   DetP         AdjP            N

    Det                  me                Det          Adj           Nelly


   every                [+ N ]              the       arrogant


  [+ Det ]                                 [+Det]       [+Adj]


When these redundant node labels are eliminated, it is possible to maintain a strict
interpretation of the Comp subcategorization condition, known in lexicase since
1979 as the Sisterhead Constraint: a lexical item X is subcategorized only by its
sisterheads, that is, by the lexical heads of its dependent sisters or 'Comps':




(19c)                                     *S


                     NP      expired                          NP


     DetP            me        +V        DetP      AdjP      Nelly


     every            +N     -__ [+ N]    the     arrogant     +N


     [+Det]           -[]                [+Det]    [+Adj]      -[]




Then example (19c) is doubly ill-formed because (i) me and Nelly are marked by
[-[]] as excluding all Comps, that is, all dependent sisters, but each has a sister,
that is, another node dominated by the same mother; and (ii) expired is marked by
[-__[+N]] as not allowing a right sister whose lexical head is marked [+N], yet it
does have a right sister, and the lexical item directly dominated by the right sister
node is marked [+N].
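
The two kinds of violation just described can be checked mechanically. The sketch below is a minimal model under assumptions of my own: the Word class, its field names, and the function sisterhead_violations are hypothetical, [-[]] is modelled as a flag excluding all dependents, and [-__[+N]] as a set of categories banned for the head of a right sister.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    form: str
    cat: str                            # lexical category, e.g. "N" for [+N]
    no_dependents: bool = False         # models [-[]]: no dependent sisters
    no_right: frozenset = frozenset()   # models [-__[+X]]: banned right sisterheads
    left: list = field(default_factory=list)   # heads of left dependent sisters
    right: list = field(default_factory=list)  # heads of right dependent sisters


def sisterhead_violations(word):
    """Collect violations for a word and, recursively, for its dependents."""
    found = []
    if word.no_dependents and (word.left or word.right):
        found.append(f"{word.form}: [-[]] but has a dependent sister")
    for dep in word.right:
        if dep.cat in word.no_right:
            found.append(
                f"{word.form}: [-__[+{dep.cat}]] but right sisterhead "
                f"{dep.form} is [+{dep.cat}]"
            )
    for dep in word.left + word.right:
        found.extend(sisterhead_violations(dep))
    return found


# Example (19c): *Every me expired the arrogant Nelly
every = Word("every", "Det")
the = Word("the", "Det")
arrogant = Word("arrogant", "Adj")
me = Word("me", "N", no_dependents=True, left=[every])
nelly = Word("Nelly", "N", no_dependents=True, left=[the, arrogant])
expired = Word("expired", "V", no_right=frozenset({"N"}), left=[me], right=[nelly])

problems = sisterhead_violations(expired)
for line in problems:
    print(line)
```

Each check looks only at a word's own features and at the heads of its dependent sisters, which is the restriction the Sisterhead Constraint imposes. In this model the sentence yields one violation for expired and one each for me and Nelly, corresponding to points (ii) and (i) above.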

    Note that once we have gone this far, we can go one step farther. Given the
fact that every lexicase node is a one-bar projection of its head lexical item, node
labels are predictable and therefore redundant as well. That is, assuming that an
S, as a one-bar projection of V, is a V'¹³, that the node dominating a [+N] item is
an N', i.e. an NP, etc., (19d) conveys exactly the same information as (19c).

(19d)                            *


                             expired


                     me        +V                            Nelly


     every            +N      -_[+ N]     the     arrogant     +N


     [+Det]           -[]     [+Adj]     [+Det]    [+Adj]      -[]
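
The predictability of node labels can be illustrated with a small sketch. The function names and the CONVENTIONAL table are assumptions introduced here; the mapping of V' to S is the provisional one given in the text, which the footnote says will need adjusting for sentences with NP and PP predicates.

```python
# Conventional names for one-bar projections, as used in (19b) and (19c).
# The entry for V is provisional (see the footnote on NP and PP predicates).
CONVENTIONAL = {"V": "S", "N": "NP"}


def bar_label(head_cat):
    """One-bar projection of a lexical category, e.g. V -> V'."""
    return head_cat + "'"


def conventional_label(head_cat):
    """Traditional phrase label recoverable from the head's category."""
    return CONVENTIONAL.get(head_cat, head_cat + "P")


labels = {cat: (bar_label(cat), conventional_label(cat))
          for cat in ["V", "N", "Det", "Adj"]}
for cat, (bar, name) in labels.items():
    print(f"[+{cat}] head: node is {bar}, conventionally {name}")
```

Since every label in (19c) can be computed from the category feature of the head word in this way, omitting the labels as in (19d) loses no information.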

       5.7. An den?




¹³ This definition will have to be adjusted a bit below to allow for sentences with
NP and PP predicates.
