HPSG parser development at U-Tokyo
Takuya Matsuzaki
University of Tokyo
                  Topics
• Overview of U-Tokyo HPSG parsing system
• Supertagging with Enju HPSG grammar
 Overview of U-Tokyo parsing system
• Two different algorithms:
  – Enju parser: Supertagging + CKY algo. for TFS
  – Mogura parser: Supertagging + CFG-filtering


• Two disambiguation models:
  – one trained on PTB-WSJ
  – one trained on PTB-WSJ + Genia (biomedical)
     Supertagger-based parsing
     [Clark and Curran, 2004; Ninomiya et al., 2006]
  • Supertagging [Bangalore and Joshi, 1999]:
      Selecting a few lexical entries (LEs) for each word by using a
      probabilistic model of P(LE | sentence)

[Figure: candidate LEs for each word of "I like it" (feature structures with HEAD, SUBJ, and COMPS values), stacked from P: small (top) to P: large (bottom).]
     Supertagger-based parsing
     [Clark and Curran, 2004; Ninomiya et al., 2006]



     • Ignore the LEs with small probabilities
            LEs with P > threshold
[Figure: candidate LEs for each word of "I like it", ordered from P: small (top) to P: large (bottom) and cut by a threshold line; only the LEs with P > threshold are input to the parser.]
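The thresholding step can be sketched in a few lines. This is a minimal illustration, not the Enju implementation: the LE names, the probabilities, the threshold value, and the fallback to the single best LE when nothing passes the threshold are all assumptions made for the example.

```python
def select_lexical_entries(candidates, threshold):
    """Keep, for each word, only the LEs whose probability exceeds the threshold."""
    selected = []
    for word, scored_les in candidates:
        kept = [(le, p) for le, p in scored_les if p > threshold]
        # Assumption (not from the slides): fall back to the single best LE
        # so that no word is left without any lexical entry.
        if not kept:
            kept = [max(scored_les, key=lambda pair: pair[1])]
        selected.append((word, kept))
    return selected


# Hypothetical supertagger output: (LE name, P(LE | sentence)) per word.
candidates = [
    ("I", [("noun", 0.95), ("noun_mod", 0.05)]),
    ("like", [("verb_trans", 0.70), ("verb_intrans", 0.20), ("prep", 0.10)]),
    ("it", [("noun", 0.98), ("expletive", 0.02)]),
]

result = select_lexical_entries(candidates, threshold=0.15)
for word, kept in result:
    print(word, [le for le, _ in kept])
```

Lowering the threshold passes more LEs to the parser, which makes parsing slower but reduces the risk of having no well-formed parse.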
          Flow in Enju parser
1. POS tagging by a CRF-based model
2. Morphological analysis (inflected → base
   form) by the WordNet dictionary
3. Multi-Supertagging by a MaxEnt model
4. TFS CKY parsing + MaxEnt disambiguation on
   the multi-supertagged sentence
         Flow in Mogura parser
1. POS tagging by a CRF-based model
2. Morphological analysis (inflected → base
   form) by the WordNet dictionary
3. Supertagging by a MaxEnt model
4. Selection of (probably) constraint-satisfying
   supertag assignment
5. TFS shift-reduce parsing on the
   singly-supertagged sentence
     Previous supertagger-based parsing
     [Clark and Curran, 2004; Ninomiya et al., 2006]



     • Ignore the LEs with small probabilities
            LEs with P > threshold
[Figure: candidate LEs for each word of "I like it", cut by a probability threshold; only the LEs with P > threshold are input to the parser.]
Supertagging is “almost parsing”

                   HEAD verb
                    SUBJ <>
                   COMPS <>


       HEAD noun                HEAD verb
        SUBJ < >               SUBJ <NP>
       COMPS < >                COMPS <>


                       HEAD verb            HEAD noun
                      SUBJ <NP>              SUBJ < >
                     COMPS <NP>             COMPS < >

                        like
A dilemma in the previous method
 • Fewer LEs → faster parsing, but
 • Too few LEs → more risk of no well-formed
   parse trees



    HEAD noun       HEAD verb     HEAD noun
     SUBJ < >      SUBJ <NP>       SUBJ < >
    COMPS < >     COMPS <VP>      COMPS < >


       I             like            it
                            Mogura Overview
Input sentence ("I like it") → Supertagger → Enumeration of assignments → Deterministic disambiguation
[Figure: the supertagger produces ranked candidate LEs for each word; single-LE-per-word assignments are enumerated from them; deterministic disambiguation builds the parse tree for "I like it" from one assignment.]
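Enumeration of the maybe-parsable LE assignments
Supertagging result → Enumeration of the highest-prob. LE sequences → CFG-filter
[Figure: each word of "I like it" has candidate LEs ranked 1, 2, ...; LE sequences are enumerated as (1, 1, 1), (2, 1, 1), (1, 2, 1), ...; each sequence is passed to the CFG-filter, which outputs the sequences that pass, e.g. (2, 1, 1), ...]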
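One way to enumerate LE sequences from the highest probability downward is a best-first search over rank tuples. The sketch below assumes that the sequence probability is the product of the per-word probabilities (which holds for a point-wise model) and that each word's candidates are already sorted in descending order; the function name and the numbers are illustrative only.

```python
import heapq
import math
from itertools import islice


def enumerate_assignments(ranked):
    """Yield (probability, ranks) in non-increasing order of probability.

    ranked[i] holds the LE probabilities for word i, sorted in descending
    order; ranks[i] is the 0-based rank of the LE chosen for word i.
    """
    def score(ranks):
        return math.prod(ranked[i][r] for i, r in enumerate(ranks))

    start = (0,) * len(ranked)
    heap = [(-score(start), start)]  # max-heap via negated scores
    seen = {start}
    while heap:
        neg_prob, ranks = heapq.heappop(heap)
        yield -neg_prob, ranks
        # Successors: move exactly one word to its next-best LE.
        for i in range(len(ranks)):
            if ranks[i] + 1 < len(ranked[i]):
                nxt = ranks[:i] + (ranks[i] + 1,) + ranks[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-score(nxt), nxt))


# Hypothetical per-word LE probabilities for "I like it".
ranked = [[0.6, 0.4], [0.7, 0.3], [0.9, 0.1]]
first_three = [ranks for _, ranks in islice(enumerate_assignments(ranked), 3)]
# Printed with 1-based ranks to match the notation used on the slide.
print([tuple(r + 1 for r in ranks) for ranks in first_three])
```

Because the generator is lazy, enumeration can stop as soon as one sequence passes the filter.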
                     CFG-filter
• Parsing with a CFG that approximates the HPSG
  [Kiefer and Krieger, 2000; Torisawa et al, 2000]

  – Approximation = elimination of some constraints in the
    grammar (long-distance dep., number, case, etc.)

  – Covering property: if an LE assignment is parsable by the
    HPSG, it is also parsable by the approx. CFG

  – CFG parsing is much faster than HPSG parsing
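The filtering idea can be illustrated with a plain CKY recognizer over a toy grammar. This is only a sketch: the binary rules and the category names (`NP`, `V_trans`, `V_intrans`, `VP`, `S`) are hypothetical stand-ins, whereas the real approximating CFG is derived from the HPSG.

```python
def cky_accepts(tags, binary_rules, start):
    """Return True if the tag sequence derives `start` under binary rules.

    binary_rules maps (left, right) category pairs to a set of parents.
    The tags themselves serve as the categories of the one-word spans.
    """
    n = len(tags)
    if n == 0:
        return False
    # chart[i][k] holds the categories that span positions i..k.
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, tag in enumerate(tags):
        chart[i][i + 1].add(tag)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for left in chart[i][j]:
                    for right in chart[j][k]:
                        chart[i][k] |= binary_rules.get((left, right), set())
    return start in chart[0][n]


# Toy grammar (an assumption for illustration only).
rules = {
    ("V_trans", "NP"): {"VP"},
    ("NP", "VP"): {"S"},
    ("NP", "V_intrans"): {"S"},
}

assignments = [
    ["NP", "V_intrans", "NP"],  # rejected: nothing covers all three words
    ["NP", "V_trans", "NP"],    # accepted
]
passed = [a for a in assignments if cky_accepts(a, rules, "S")]
print(passed)
```

By the covering property, an assignment rejected by the CFG cannot be parsed by the HPSG either, so discarding it loses nothing; an accepted assignment may still fail under the full grammar.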
                    Results on PTB-WSJ
Parser                         Grammar       Accuracy             Speed
MST parser                     dependency    90.02% (LAS)         4.5 snt/sec
Sagae’s parser                 dependency    89.01% (LAS)         21.6 snt/sec
Berkeley parser                CFG           89.27% (LF1)         4.7 snt/sec
Charniak’s parser              CFG           89.55% (LF1)         2.2 snt/sec
Charniak’s parser + reranker   CFG           91.40% (LF1)         1.9 snt/sec
Enju parser                    HPSG          88.87% (PAS-LF1)     2.7 snt/sec
Mogura parser                  HPSG          88.07% (PAS-LF1)     22.8 snt/sec
  Supertagging with Enju grammar
• Input: POS-tagged sentence

• Number of supertags (lexical templates): 2,308

• Current implementation
   – Classifier: MaxEnt, point-wise prediction (i.e., no
     dependencies among neighboring supertags)
   – Features: words and POS tags in -2/+3 window

• 92% token accuracy (1-best, only on covered tokens)

• It’s “almost parsing”: 98-99% parsing accuracy (PAS F1)
  given correct lexical assignments
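As a rough illustration of the feature set, the sketch below extracts word and POS features around one token. It assumes that "-2/+3 window" means offsets -2 through +3 relative to the target word; the feature-string format and the `<PAD>` symbol are hypothetical, not taken from Enju.

```python
def window_features(words, pos_tags, i, left=2, right=3):
    """Word and POS features for token i over offsets -left..+right."""
    feats = []
    for off in range(-left, right + 1):
        j = i + off
        if 0 <= j < len(words):
            word, pos = words[j], pos_tags[j]
        else:
            # Positions outside the sentence get a padding symbol.
            word, pos = "<PAD>", "<PAD>"
        feats.append(f"w[{off:+d}]={word}")
        feats.append(f"p[{off:+d}]={pos}")
    return feats


words = ["I", "like", "it"]
pos_tags = ["PRP", "VBP", "PRP"]
feats = window_features(words, pos_tags, 1)
print(feats)
```

In point-wise prediction each token is classified from such features alone, with no dependency on the supertags chosen for neighboring tokens.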
                 Pointwise-Supertagging
                             Output


Lex. Ent.   S1     S2   S3    S4      S5   S6   S7   S8


POS tag     P1     P2   P3    P4      P5   P6   P7   P8


 Word       w1     w2   w3    w4      w5   w6   w7   w8



                             Input
   Supertagging: future directions
• Basic strategy: do more work in supertagging (rather
  than in parsing)
• Pros
   – Model/algorithm is simpler
     → Easy error analysis
     → Various features without extending the parsing algorithm
     → Fast trial-and-error cycle for feature engineering
• Cons
   – No tree structure
     → Feature design is sometimes tricky/ad hoc:
       e.g., “nearest preceding verb/noun” instead of “possible
       modifiee of a PP”
  Supertagging: future directions
• Recovery from POS-tagging error in
  supertagging stage
• Incorporation of shallow processing results
  (e.g., chunking, NER, coordination structure
  prediction) as new features
• Comparison across other languages/grammar
  frameworks
Thank you!
        Deterministic disambiguation
• Implemented as a shift-reduce parser
     – Deterministic parsing: only one analysis at one time
     – Next parsing action is selected using a scoring
       function


                 next action $a = \arg\max_{a \in A} F(a, S, Q)$

• F: scoring function
    (averaged-perceptron algorithm [Collins and Duffy, 2002])
• Features are extracted from the stack state S and
 lookahead queue Q
• A: the set of possible actions (the CFG forest is used as a “guide”)
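The control loop of such a parser can be sketched as follows. Everything here is a toy stand-in: `SCHEMAS` uses atomic category names instead of typed feature structures, `possible_actions` replaces the CFG-forest guide, and `toy_score` replaces the scoring function F learned with the averaged perceptron.

```python
# Toy schemas: name -> (left daughter, right daughter, mother).
SCHEMAS = {
    "Head_Comp": ("V_trans", "NP", "VP"),
    "Subj_Head": ("NP", "VP", "S"),
}


def possible_actions(stack, queue):
    """The action set A for the current state (toy replacement for the guide)."""
    actions = []
    if queue:
        actions.append("SHIFT")
    if len(stack) >= 2:
        for name, (left, right, _) in SCHEMAS.items():
            if (stack[-2], stack[-1]) == (left, right):
                actions.append(f"REDUCE({name})")
    return actions


def toy_score(action, stack, queue):
    """Hypothetical F(a, S, Q): simply prefers reducing over shifting."""
    return 1.0 if action.startswith("REDUCE") else 0.0


def parse(tags, score):
    """Deterministic parsing: keep one analysis, take the argmax action."""
    stack, queue, history = [], list(tags), []
    while True:
        actions = possible_actions(stack, queue)
        if not actions:
            break
        best = max(actions, key=lambda a: score(a, stack, queue))
        history.append(best)
        if best == "SHIFT":
            stack.append(queue.pop(0))
        else:
            name = best[len("REDUCE("):-1]
            # Replace the two topmost stack items with their mother.
            stack[-2:] = [SCHEMAS[name][2]]
    return stack, history


stack, history = parse(["NP", "V_trans", "NP"], toy_score)
print(history)
```

On this toy input the loop takes three SHIFT actions followed by REDUCE(Head_Comp) and REDUCE(Subj_Head).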
    Example

    Initial state
S                                         Q

       HEAD noun      HEAD verb   HEAD noun
        SUBJ < >     SUBJ <NP>     SUBJ < >
       COMPS < >    COMPS <NP>    COMPS < >




          I            like          it
                argmax F(a, S, Q) = SHIFT
S                                                          Q

    HEAD noun                          HEAD verb   HEAD noun
     SUBJ < >                         SUBJ <NP>     SUBJ < >
    COMPS < >                        COMPS <NP>    COMPS < >




       I                                like          it
                argmax F(a, S, Q) = SHIFT
S                                                   Q

    HEAD noun     HEAD verb                 HEAD noun
     SUBJ < >    SUBJ <NP>                   SUBJ < >
    COMPS < >   COMPS <NP>                  COMPS < >




       I            like                       it
                argmax F(a, S, Q) = SHIFT
S                                           Q

    HEAD noun     HEAD verb   HEAD noun
     SUBJ < >    SUBJ <NP>     SUBJ < >
    COMPS < >   COMPS <NP>    COMPS < >




       I           like         it
           argmax F(a, S, Q) = REDUCE(Head_Comp)

S                                                  Q

    HEAD noun          HEAD verb
     SUBJ < >        SUBJ <[1]NP>
    COMPS < >          COMPS <>



                    Head-Comp-Schema
       I
                  HEAD verb    HEAD noun
                 SUBJ <[1]>     SUBJ < >
                COMPS <NP>     COMPS < >


                  like              it
            argmax F(a, S, Q) = REDUCE(Subj_Head)
 S                                                  Q
       HEAD verb
        SUBJ <>
       COMPS <>


      Subj-Head-Schema

HEAD noun          HEAD verb
 SUBJ < >          SUBJ <[1]NP>
COMPS < >          COMPS <>


  I      HEAD verb           HEAD noun
         SUBJ <[1]>           SUBJ < >
        COMPS <NP>           COMPS < >


            like                  it

				