					Grammars and Parsing

       Allen’s Chapter 3,
Jurafsky & Martin’s Chapters 8-9




Syntax

•   Why is the structure of language (syntax)
    important?
•   How do we represent syntax?
•   What does an example grammar for
    English look like?
•   What strategies exist to find the structure
    in natural language?
•   A Prolog program to recognise English
    sentences
Syntax shows the role of words in a
sentence.


   John hit Sue
   vs
    Sue hit John

   Knowing which noun phrase is the subject
   tells us who hit whom.


Syntax shows how words are related in a
sentence.


Visiting aunts ARE boring.
      vs
Visiting aunts IS boring.

Subject-verb agreement disambiguates here:
   ARE means the aunts are visiting,
   IS means the act of visiting them.


Syntax shows how words are related between
sentences.
 (a) Italy was beating England. Germany too.
 (b) Italy was being beaten by England.
   Germany too.

 Parts of the second sentence are missing, so it
  cannot be understood on its own.

 But syntax, by parallel with the first sentence,
  lets us see what is missing.

But syntax alone is not enough

 Visiting museums can be boring

 This is not ambiguous for us, as we know there is
  no such thing as a "visiting museum", but syntax
  cannot show this to a computer.

 Compare with…
 Visiting aunts can be boring


How do we represent syntax?

Parse Tree




An example:


  – Parsing sentence:

  – "They are cooking apples."




Parse 1




Parse 2




How do we represent syntax?

List

Sue hit John
[s, [np, [proper_noun, Sue]],
    [vp, [v, hit],
         [np, [proper_noun, John]]]]



Chomsky Hierarchy

0 Unrestricted        any rule LHS -> RHS

1 Context-Sensitive   |LHS| <= |RHS|

2 Context-Free        |LHS| = 1

3 Regular             |RHS| = 1 or 2:  A -> a | aB,  or
                                       A -> a | Ba


What Makes a Good Grammar?



• Generality

• Selectivity

• Understandability

Generality of Grammars
Regular
{abd, ad, bcd, bd, abcd, …}
S -> a S1 | b S2 | c S3 | d
S1 -> b S2 | c S3 | d
S2 -> c S3 | d
S3 -> d

Context Free
{a^n b^n : n >= 1}
S -> ab | a S b

Context Sensitive
{a^n b^n c^n} or {abcddabcdd, abab, asease, …}
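
The three example languages can be tried out as Prolog DCGs. This is a minimal sketch, assuming SWI-Prolog; the nonterminal names (reg, anbn, abc, rep) are made up for illustration. The last recogniser is not a context-sensitive grammar: it counts with an extra DCG argument, which is one convenient way to go beyond what plain context-free rules can express.

```prolog
% Regular grammar from the slide: S, S1, S2, S3 become reg, reg1, reg2, reg3.
reg  --> [a], reg1 ; [b], reg2 ; [c], reg3 ; [d].
reg1 --> [b], reg2 ; [c], reg3 ; [d].
reg2 --> [c], reg3 ; [d].
reg3 --> [d].

% Context-free grammar for a^n b^n (n >= 1): S -> ab | a S b.
anbn --> [a], [b].
anbn --> [a], anbn, [b].

% a^n b^n c^n (n >= 1), recognised by counting rather than by a
% context-sensitive grammar. N is a Peano number: 0, s(0), s(s(0)), ...
abc --> { N = s(_) }, rep(a, N), rep(b, N), rep(c, N).

% rep(X, N): exactly N copies of the terminal X.
rep(_, 0)    --> [].
rep(X, s(N)) --> [X], rep(X, N).
```

For example, `?- phrase(reg, [a,b,d]).` and `?- phrase(abc, [a,a,b,b,c,c]).` succeed, while `?- phrase(reg, [b]).` and `?- phrase(anbn, [a,b,b]).` fail.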


   What strategies exist for trying to find the structure
   in natural language?
Top Down vs. Bottom Up

Bottom - Up                                      Top - Down
John, hit, the, cat                              s
prpn, hit, the, cat                              s -> np, vp
prpn, v, the, cat                                s -> prpn, vp
prpn, v, det, cat                                s -> John, v, np
prpn, v, det, n                                  s -> John, hit, np
np, v, det, n                                    s -> John, hit, det,n
np, v, np                                        s -> John, hit, the,n
np, vp                                           s -> John, hit, the,cat
s

Bottom - Up
  Better if many alternative rules for a phrase
  Worse if many alternative terminal symbols for each word

Top - Down
  Better if many alternative terminal symbols for each word
  Worse if many alternative rules for a phrase
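
The bottom-up column can be sketched as a naive reducer: find the right-hand side of some rule inside the current symbol list, replace it by the left-hand side, and repeat until only s is left. This is an illustrative sketch, assuming SWI-Prolog, and not an efficient shift-reduce parser; the rule/2 facts are an assumed encoding of the grammar implied by the trace, and john is written in lower case because capitalised names are variables in Prolog.

```prolog
% rule(LHS, RHS): grammar and lexicon as rewrite rules (assumed encoding).
rule(s,    [np, vp]).
rule(np,   [prpn]).
rule(np,   [det, n]).
rule(vp,   [v, np]).
rule(prpn, [john]).
rule(v,    [hit]).
rule(det,  [the]).
rule(n,    [cat]).

% bottom_up(+Symbols): succeeds if Symbols can be reduced to [s].
bottom_up([s]).
bottom_up(Symbols) :-
    rule(LHS, RHS),
    append(Front, Rest, Symbols),        % pick a position in the list
    append(RHS, Back, Rest),             % the RHS must occur there
    append(Front, [LHS|Back], Reduced),  % replace it by the LHS
    bottom_up(Reduced).
```

`?- bottom_up([john, hit, the, cat]).` succeeds; `?- bottom_up([hit, john, the, cat]).` fails. Prolog's backtracking tries the reductions in a different order from the trace above, but every successful path ends in the same final step, np, vp to s.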
What does an example grammar for
English look like?

• Re-write rules

1. sentence -> noun phrase , verb phrase
2. noun phrase -> art , noun
3. noun phrase -> art , adj , noun
4. verb phrase -> verb
5. verb phrase -> verb , noun phrase

Parsing as a search procedure
1. Select the first state from the possibilities list
       (and remove it from the list).

2. Generate the new states by trying every possible
   option from the selected state
       (there may be none if we are on a bad path).

3. Add the states generated in step 2 to the
   possibilities list
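
The three steps above can be sketched in Prolog as a loop over a possibilities list. This is a minimal sketch, assuming SWI-Prolog: rule/2 encodes re-write rules 1-5, lex/2 is a small assumed lexicon, and a state pairs the symbols still to be found with the words still to be read (instead of a position number). New states go on the front of the list, which makes the search depth first.

```prolog
% Re-write rules 1-5, written rule(LHS, RHS).
rule(s,  [np, vp]).
rule(np, [art, n]).
rule(np, [art, adj, n]).
rule(vp, [v]).
rule(vp, [v, np]).

% Assumed lexicon: lex(Word, Category).
lex(the, art).  lex(old, adj).  lex(dog, n).
lex(man, n).    lex(cried, v).

% parse(+Words): succeeds if Words is a sentence of the grammar.
parse(Words) :-
    search([state([s], Words)]).

% Step 1: select the first state; stop if nothing is left to find or read.
search([state([], [])|_]).
search([State|Backup]) :-
    findall(Next, next_state(State, Next), NewStates),  % step 2
    append(NewStates, Backup, Possibilities),           % step 3
    search(Possibilities).

% next_state(+State, -Next): expand the first symbol with a rule ...
next_state(state([Cat|Cats], Words), state(Symbols, Words)) :-
    rule(Cat, RHS),
    append(RHS, Cats, Symbols).
% ... or match it against the next word.
next_state(state([Cat|Cats], [Word|Words]), state(Cats, Words)) :-
    lex(Word, Cat).
```

`?- parse([the, dog, cried]).` and `?- parse([the, old, man, cried]).` succeed; `?- parse([dog, the, cried]).` fails once the possibilities list is empty. The sketch relies on the grammar having no left-recursive rules; with a rule such as np -> np pp a depth-first search of this kind would not terminate.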

Top down parsing
1   The   2   dog 3 cried 4

Step  Current state     Backup states         Comment
1     ((S) 1)                                 initial position
2     ((NP VP) 1)                             Rule 1
3     ((ART N VP) 1)    ((ART ADJ N VP) 1)    Rules 2 & 3
4     ((N VP) 2)        ((ART ADJ N VP) 1)    Match ART with the
5     ((VP) 3)          ((ART ADJ N VP) 1)    Match N with dog
6     ((V) 3)           ((V NP) 3)            Rules 4 & 5
                        ((ART ADJ N VP) 1)
7                                             Success

What strategies exist for trying to find
the structure in natural language?
Depth First vs. Breadth First

Depth First
• Try rules one at a time and backtrack if you get stuck
• Easier to program
• Less memory required
• Good if parse tree is deep

Breadth First
• Try all rules at the same time
• Can be faster
• Order of rules is not important
• Good if tree is flat
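
In a search over a list of pending states, the two strategies differ only in where newly generated states are put: on the front of the list (depth first) or on the back (breadth first). The sketch below, assuming SWI-Prolog, shows this on a small made-up tree; the child/2 facts and node names are hypothetical and stand in for parser states.

```prolog
% A small assumed search tree: child(Parent, Child).
child(s, a).   child(s, b).
child(a, a1).  child(a, a2).
child(b, b1).

% visit(+Strategy, +Agenda, -Order): Order is the sequence in which
% nodes are taken off the agenda.
visit(_, [], []).
visit(Strategy, [Node|Rest], [Node|Order]) :-
    findall(C, child(Node, C), Children),
    combine(Strategy, Children, Rest, Agenda),
    visit(Strategy, Agenda, Order).

% Depth first: new states first (stack). Breadth first: new states last (queue).
combine(depth_first,   New, Old, Agenda) :- append(New, Old, Agenda).
combine(breadth_first, New, Old, Agenda) :- append(Old, New, Agenda).
```

`?- visit(depth_first, [s], Order).` gives `Order = [s, a, a1, a2, b, b1]`, while `?- visit(breadth_first, [s], Order).` gives `Order = [s, a, b, a1, a2, b1]`. The breadth-first agenda holds a whole level of the tree at once, which is where the extra memory goes.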

An Example of Top-Down Parsing
1 The 2 old 3 man 4 cried 5




Depth First Search versus Breadth First




What does a Prolog program look like that
tries to recognise English sentences?


    s --> np, vp.

    np --> det, n.

    np --> det, adj, n.

    vp --> v, np.
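
To run these rules, Prolog also needs a lexicon. The sketch below, assuming SWI-Prolog, adds one (using the word lists from the recogniser on the next slide) and gives each nonterminal an extra argument that builds the parse tree as a Prolog term. The tree-building arguments are an addition for illustration and are not part of the grammar above.

```prolog
% Grammar rules; the argument of each nonterminal is its parse tree.
s(s(NP, VP))    --> np(NP), vp(VP).
np(np(D, N))    --> det(D), n(N).
np(np(D, A, N)) --> det(D), adj(A), n(N).
vp(vp(V, NP))   --> v(V), np(NP).

% Lexicon: consume one word W and check its category.
det(det(W)) --> [W], { member(W, [the, a, an]) }.
n(n(W))     --> [W], { member(W, [cat, dog, mat, meat, fish]) }.
adj(adj(W)) --> [W], { member(W, [big, fat, red]) }.
v(v(W))     --> [W], { member(W, [ate, saw, killed, pushed]) }.
```

`?- phrase(s(T), [the, cat, ate, the, fish]).` answers `T = s(np(det(the), n(cat)), vp(v(ate), np(det(the), n(fish))))`. The DCG translation threads the word list through the rules as a pair of hidden arguments, so no explicit append/3 calls are needed.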


What does a Prolog program look like that
tries to recognise English sentences?
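  % Recogniser by generate-and-test: each phrase predicate produces a
  % word list, and append/3 checks that the pieces join up to the input.
  sentence(S) :-
        noun_phrase(NP), verb_phrase(VP), append(NP,VP,S).
  % Noun phrase: determiner + noun.
  noun_phrase(NP) :-
        determiner(D), noun(N), append(D,N,NP).
  % Noun phrase: determiner + adjective + noun.
  noun_phrase(NP) :-
        determiner(D), adj(A), noun(N), append(D,A,AP), append(AP,N,NP).
  % Verb phrase: verb + noun phrase.
  verb_phrase(VP) :-
        verb(V), noun_phrase(NP), append(V,NP,VP).
  % Lexicon: each word is returned as a one-element list.
  determiner([D]) :- member(D,[the,a,an]).
  noun([N]) :- member(N,[cat,dog,mat,meat,fish]).
  adj([A]) :- member(A,[big,fat,red]).
  verb([V]) :- member(V,[ate,saw,killed,pushed]).
  % Example query: ?- sentence([the,cat,ate,the,fish]).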




Pattern matching as an
alternative (e.g., Eliza)
 •   This uses a database of input-output pairs.
 •   The input part of a pair is a template to be matched against the user
     input.
 •   The output part of the pair is given as the response.
         X computers Y => Do computers interest you?
         X mother Y => Tell me more about your family.
            But…
 Nothing is known about structure (syntax)
         I X you => Why do you X me?
         Fine for X = like, but not for X = do not know
 Nothing is known about meaning (semantics)
         I feel X => I'm sorry you feel X.
         Fine for X = depressed, but not for X = happy
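
The four templates above can be sketched in Prolog with append/3 doing the matching. This is an illustrative sketch, assuming SWI-Prolog, and not how Eliza itself was implemented; respond/2 is a made-up predicate name, sentences are lists of lower-case words, and "I'm" is written as i, am to keep the atoms simple.

```prolog
% respond(+Words, -Reply): the first matching template gives the reply.

% X computers Y => Do computers interest you?
respond(Words, [do, computers, interest, you]) :-
    append(_, [computers|_], Words).
% X mother Y => Tell me more about your family.
respond(Words, [tell, me, more, about, your, family]) :-
    append(_, [mother|_], Words).
% I X you => Why do you X me?
respond([i|Rest], Reply) :-
    append(X, [you], Rest),
    append([why, do, you|X], [me], Reply).
% I feel X => I'm sorry you feel X.
respond([i, feel|X], Reply) :-
    append([i, am, sorry, you, feel], X, Reply).
```

`?- respond([i, like, you], R).` gives `R = [why, do, you, like, me]`, but `?- respond([i, do, not, know, you], R).` gives `R = [why, do, you, do, not, know, me]`, the ungrammatical reply the slide warns about. Nothing in the templates knows that "do not know" needs different treatment from "like".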





				