# Natural Language Processing CS60057: Autumn 2006 Midterm

Computer Science & Engineering Department, IIT Kharagpur

Natural Language Processing CS60057
Autumn 2006 Midterm                                                                Full Marks: 50
Date: Sep 15, 2006                                                                 Time: 2 hours

1. In what ways do natural languages (i) resemble and (ii) differ from artificial languages? [6]

2. The Soundex algorithm is a method that can be used for representing people’s names. [3+3]
Write a Finite State Transducer to implement the first three steps of the
Soundex algorithm. Construct a second FST to implement steps (iv) and (v) of the
algorithm.
(i) Retain the first letter of the word.
(ii) Remove all occurrences of the following letters except from the first position:
'A', 'E', 'I', 'O', 'U', 'H', 'W', 'Y'.
(iii) Change letters from the following sets into the digit given:
'B', 'F', 'P', 'V' → 1
'C', 'G', 'J', 'K', 'Q', 'S', 'X', 'Z' → 2
'D', 'T' → 3
'L' → 4
'M', 'N' → 5
'R' → 6
(iv) Collapse each run of identical adjacent digits in the string that resulted after
step (iii) into a single digit (e.g., 666 is changed to 6).
(v) Pad the string that resulted from step (iv) with trailing zeros and return only the
first four positions, which will be of the form
<uppercase letter> <digit> <digit> <digit>.
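
As a reference for the input/output behaviour the two transducers are expected to reproduce, steps (i) to (v) can be sketched procedurally. This is a minimal sketch, not an FST and not standard Soundex: it follows the steps exactly as worded above, assumes a non-empty alphabetic name, and the function names `steps_1_to_3`, `steps_4_to_5`, and `soundex` are hypothetical.

```python
# Digit classes from step (iii).
CODES = {}
for letters, digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")]:
    for ch in letters:
        CODES[ch] = digit

DROPPED = set("AEIOUHWY")  # letters removed in step (ii)


def steps_1_to_3(name):
    """Keep the first letter, drop vowel-like letters, map the rest to digits."""
    name = name.upper()  # assumption: input is a non-empty alphabetic string
    first, rest = name[0], name[1:]
    kept = [ch for ch in rest if ch not in DROPPED]
    return first + "".join(CODES[ch] for ch in kept)


def steps_4_to_5(coded):
    """Collapse runs of identical digits, pad with zeros, keep four positions."""
    first, digits = coded[0], coded[1:]
    collapsed = []
    for d in digits:
        if not collapsed or collapsed[-1] != d:
            collapsed.append(d)
    return (first + "".join(collapsed) + "000")[:4]


def soundex(name):
    return steps_4_to_5(steps_1_to_3(name))


print(soundex("Robert"))   # R163
print(soundex("Jackson"))  # J250
```

Splitting the code at the same point as the question (steps (i) to (iii), then (iv) and (v)) mirrors the composition of the two transducers: the output of the first is the input of the second.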

3. The following table lists some bigram counts from the BERP domain.                [2+4+4]

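|         | I  | want | to  | eat | Chinese | food | lunch |
|---------|----|------|-----|-----|---------|------|-------|
| I       | 8  | 1087 | 0   | 13  | 0       | 0    | 0     |
| want    | 3  | 0    | 786 | 0   | 6       | 8    | 6     |
| to      | 3  | 0    | 10  | 860 | 3       | 0    | 12    |
| eat     | 0  | 0    | 2   | 0   | 19      | 2    | 52    |
| Chinese | 2  | 0    | 0   | 0   | 0       | 120  | 1     |
| food    | 19 | 0    | 17  | 0   | 0       | 0    | 0     |
| lunch   | 4  | 0    | 0   | 0   | 0       | 1    | 0     |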

(i) Explain add-one OR any one of the other smoothing methods.
(ii) Apply the smoothing method to these bigrams and compute the smoothed
estimates of the bigram probability table (containing P(wi|wj)).
(iii) Now, calculate the probability of the following sentence based on the
above probabilities:
I want Chinese lunch.
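
For orientation, add-one smoothing replaces the estimate C(w1 w2) / C(w1) with (C(w1 w2) + 1) / (C(w1) + V), where V is the vocabulary size. The sketch below applies that formula to the counts in the table. It rests on two assumptions that are not given in the question: the unigram count C(w1) is approximated by the sum of the row for w1, and V is taken to be the seven words shown. It also assumes that rows index the preceding word, which is consistent with the count of 1087 for "I want". The names `add_one` and `sentence_probability` are hypothetical.

```python
WORDS = ["I", "want", "to", "eat", "Chinese", "food", "lunch"]

# COUNTS[prev][i] is the count of the bigram (prev, WORDS[i]), copied from the table.
COUNTS = {
    "I":       [8, 1087, 0, 13, 0, 0, 0],
    "want":    [3, 0, 786, 0, 6, 8, 6],
    "to":      [3, 0, 10, 860, 3, 0, 12],
    "eat":     [0, 0, 2, 0, 19, 2, 52],
    "Chinese": [2, 0, 0, 0, 0, 120, 1],
    "food":    [19, 0, 17, 0, 0, 0, 0],
    "lunch":   [4, 0, 0, 0, 0, 1, 0],
}

V = len(WORDS)  # ASSUMPTION: vocabulary restricted to the seven words shown


def add_one(prev, word):
    """Add-one estimate of P(word | prev)."""
    row = COUNTS[prev]
    context_total = sum(row)  # ASSUMPTION: row sum stands in for the unigram count C(prev)
    return (row[WORDS.index(word)] + 1) / (context_total + V)


def sentence_probability(tokens):
    """Product of smoothed bigram probabilities; no sentence-start term is used."""
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= add_one(prev, word)
    return prob


p = sentence_probability(["I", "want", "Chinese", "lunch"])
print(p)
```

With true unigram counts and the full vocabulary size the numbers would differ; only the two lines marked as assumptions would need to change.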

4. Show a grammar that you can use to handle article-noun agreement in English. [8]
You will need a distinction between mass and count nouns (e.g., water, love, honesty vs. cup,
word, idea) and between singular count and plural count or mass indefinite articles (a vs.
some). Your grammar should accept the following:
 a cup
 some cups
 some water
but reject the following:
 *some cup
 *a cups
 *a water
Give the necessary grammar rules and the lexical entries for the words a, some, water,
cup, and cups, and show how the phrases would succeed or fail during parsing. You can
ignore the other details of the parser (assuming that the right entries or rules are
magically selected).

5. Suggest logical forms for the following sentences:                                  [4+6]
(i) PC teaches CS305 to UG students
(ii) A bus connects Kharagpur with Digha
Write a grammar/lexicon that computes the logical form for each of the above sentences.

6. Consider a small corpus consisting of the following sentences.                      [7+3]

I want to book a room. Book me a large room. I wish to read a book. Book a bed for the
night. I want a night light.

N : noun
V : verb
P : preposition
R : pronoun
A : article
J : adjective

N → book | room | bed | night | light
V → want | book | wish | light
Pp → to | for
A → a | the | an
Pro → I | you | me
J → night | large

Construct by hand a Markov model from this corpus with six states corresponding to the
six parts of speech. No smoothing is required. Show the transition probabilities as well as
the emission probabilities. Provide a tagging for the following sentence and evaluate the
probability of this tag sequence.
I book a large book.