
SIMS 290-2: Applied Natural Language Processing

Marti Hearst
November 15, 2004
Question Answering

 Today:
   Introduction to QA
   A typical full-fledged QA system
   A very simple system, in response to this
   An intermediate approach
 Wednesday:
   Using external resources
    – WordNet
     – Encyclopedias, Gazetteers
   Incorporating a reasoning system
   Machine Learning of mappings
   Other question types (e.g., biography, definitions)

    A Spectrum of Search Types

Question/Answer      What is the typical height of a giraffe?

Browse and Build     What are some good ideas for landscaping my
                     client’s yard?

Text Data Mining     What are some promising untried treatments
                     for Raynaud’s disease?
        Beyond Document Retrieval

        Document Retrieval
             Users submit queries corresponding to their information 
             needs.
             System returns (voluminous) list of full-length documents.
             It is the responsibility of the users to find information of 
             interest within the returned documents.
        Open-Domain Question Answering (QA)
             Users ask questions in natural language.
                  What is the highest volcano in Europe?
             System returns list of short answers.
                  … Under Mount Etna, the highest volcano in Europe, 
             perches the fabulous town …
             A real use for NLP




Adapted from slide by Surdeanu and Pasca                                     4
Questions and Answers

 What is the height of a typical giraffe?

    The result can be a simple answer, extracted from 
    existing web pages.
    Can specify with keywords or a natural language 
    query
     – However, most web search engines are not set up to 
       handle questions properly.  
     – Get different results using a question vs. keywords




    The Problem of Question Answering
  When was the San Francisco fire?
       … were driven over it. After the ceremonial tie was removed - it burned in the San
      Francisco fire of 1906 – historians believe an unknown Chinese worker probably drove
      the last steel spike into a wooden tie. If so, it was only…



        What is the nationality of Pope John Paul II?
            … stabilize the country with its help, the Catholic hierarchy stoutly held out for pluralism,
            in large part at the urging of Polish-born Pope John Paul II. When the Pope
            emphatically defended the Solidarity trade union during a 1987 tour of the…




  Where is the Taj Mahal?
      … list of more than 360 cities around the world includes the Great Reef in
      Australia, the Taj Mahal in India, Chartre’s Cathedral in France, and
      Serengeti National Park in Tanzania. The four sites Japan has listed include…


Adapted from slide by Surdeanu and Pasca                                                                    10
    The Problem of Question Answering

        Natural language question, not keyword queries:
            What is the nationality of Pope John Paul II?

        Short text fragment, not URL list:
            … stabilize the country with its help, the Catholic hierarchy stoutly held out for pluralism,
            in large part at the urging of Polish-born Pope John Paul II. When the Pope
            emphatically defended the Solidarity trade union during a 1987 tour of the…

Adapted from slide by Surdeanu and Pasca
    Question Answering from text

        With massive collections of full-text documents,
        simply finding relevant documents is of limited
        use: we want answers
        QA: give the user a (short) answer to their
        question, perhaps supported by evidence.
        An alternative to standard IR
             The first problem area in IR where NLP is really 
             making a difference.




Adapted from slides by Manning, Harabagiu, Kushmeric, and ISI    12
    People want to ask questions…
    Examples from AltaVista query log
          who invented surf music?
          how to make stink bombs
          where are the snowdens of yesteryear?
          which english translation of the bible is used in official catholic
          liturgies?
          how to do clayart
          how to copy psx
          how tall is the sears tower?
    Examples from Excite query log (12/1999)
          how can i find someone in texas
          where can i find information on puritan religion?
          what are the 7 wonders of the world
          how can i eliminate stress
          What vacuum cleaner does Consumers Guide recommend

Adapted from slides by Manning, Harabagiu, Kushmeric, and ISI                   13
    A Brief (Academic) History
        In some sense question answering is not a new research area
        Question answering systems can be found in many areas of NLP
        research, including:
             Natural language database systems
               – A lot of early NLP work on these
             Problem-solving systems
               – STUDENT (Winograd ’77)
               – LUNAR     (Woods & Kaplan ’77)
             Spoken dialog systems
               – Currently very active and commercially relevant
        The focus on open-domain QA, however, is new
             First modern system: MURAX (Kupiec, SIGIR’93):
               – Trivial Pursuit questions 
               – Encyclopedia answers
             FAQFinder (Burke et al. ’97)
             TREC QA competition  (NIST, 1999–present)

Adapted from slides by Manning, Harabagiu, Kushmeric, and ISI          14
    AskJeeves 

        AskJeeves is probably the most hyped example of
        “Question answering”
        How it used to work:
             Do pattern matching to match a question to their 
             own knowledge base of questions
             If a match is found, returns a human-curated answer 
             to that known question
             If that fails, it falls back to regular web search
             (Seems to be more of a meta-search engine now)
        A potentially interesting middle ground, but a fairly
        weak shadow of real QA


Adapted from slides by Manning, Harabagiu, Kushmeric, and ISI       15
    Question Answering at TREC
        Question answering competition at TREC consists of answering
        a set of 500 fact-based questions, e.g.,
             “When was Mozart born?”.
        Has really pushed the field forward.
        The document set
             Newswire textual documents from LA Times, San Jose Mercury 
             News, Wall Street Journal, NY Times etcetera: over 1M documents 
             now.
             Well-formed lexically, syntactically and semantically (were 
             reviewed by professional editors).
        The questions
             Hundreds of new questions every year, the total is ~2400
        Task
             Initially extract at most 5 answers: long (250B) and short (50B).
             Now extract only one exact answer.
             Several other sub-tasks added later: definition, list, biography.



Adapted from slides by Manning, Harabagiu, Kushmeric, and ISI                    16
         Sample TREC questions
1. Who is the author of the book, "The Iron Lady: A Biography of
Margaret Thatcher"?

2. What was the monetary value of the Nobel Peace Prize in 1989?

3. What does the Peugeot company manufacture?

4. How much did Mercury spend on advertising in 1993?

5. What is the name of the managing director of Apricot Computer?

6. Why did David Koresh ask the FBI for a word processor?

7. What is the name of the rare neurological disease with
   symptoms such as: involuntary movements (tics), swearing,
   and incoherent vocalizations (grunts, shouts, etc.)?

TREC Scoring

 For the first three years systems were allowed to return 5
 ranked answer snippets (50/250 bytes) to each question.

    Mean Reciprocal Rank Scoring (MRR):
      – Each question assigned the reciprocal rank of the first correct 
        answer. If correct answer at position k, the score is 1/k. 
        1, 0.5, 0.33, 0.25, 0.2, 0 for 1, 2, 3, 4, 5, 6+ position


    Mainly Named Entity answers (person, place, date, …)


 From 2002 on, the systems are only allowed to return a single
 exact answer and the notion of confidence has been introduced.
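
 A minimal sketch (not from the original slides) of the MRR scoring described
 above; the rank list and the 5-answer cutoff are illustrative.

    def mrr(ranks_of_first_correct, max_rank=5):
        """Mean Reciprocal Rank: a question scores 1/k if its first correct
        answer appears at rank k <= max_rank, and 0 otherwise."""
        total = 0.0
        for k in ranks_of_first_correct:          # None means no correct answer returned
            if k is not None and k <= max_rank:
                total += 1.0 / k
        return total / len(ranks_of_first_correct)

    # Example: first correct answers at ranks 1 and 3, and one question missed
    print(mrr([1, 3, None]))                      # (1 + 1/3 + 0) / 3 ≈ 0.444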


    Top Performing Systems
        In 2003, the best performing systems at TREC could answer
        approximately 60-70% of the questions
        Approaches and their degree of success have varied a great deal
             Knowledge-rich approaches, using a vast array of NLP 
             techniques stole the show in 2000-2003
               – Notably Harabagiu, Moldovan et al. ( SMU/UTD/LCC )
        Statistical systems starting to catch up
             AskMSR system stressed how much could be achieved by very 
             simple methods with enough text (and now various copycats)
             People are experimenting with machine learning methods
        A middle ground is to use a large collection of surface matching
        patterns (ISI)




Adapted from slides by Manning, Harabagiu, Kusmerick, ISI              19
Example QA System

 This system contains many components used by other systems,
 but is more complex in some ways
 Most work completed in 2001; there have been advances by this
 group and others since then.
 Next slides based mainly on:
    Paşca and Harabagiu, High-Performance Question Answering
    from Large Text Collections, SIGIR’01.
    Paşca and Harabagiu, Answer Mining from Online 
    Documents, ACL’01.
    Harabagiu, Paşca, Maiorano: Experiments with Open-Domain
    Textual Question Answering. COLING’00




    QA Block Architecture

    Q → Question Processing → Passage Retrieval → Answer Extraction → A

    Question Processing: captures the semantics of the question and selects
        keywords for passage retrieval; passes keywords and question semantics
        downstream (uses WordNet, a parser, and a named-entity recognizer)
    Passage Retrieval: extracts and ranks passages using surface-text
        techniques, over documents returned by Document Retrieval
    Answer Extraction: extracts and ranks answers from the passages using NL
        techniques (uses WordNet, a parser, and a named-entity recognizer)

Adapted from slide by Surdeanu and Pasca
    Question Processing Flow

    Q → Question parsing, which feeds three steps:
        Construction of the question representation → question semantic representation
        Answer type detection → AT category
        Keyword selection → keywords

Adapted from slide by Surdeanu and Pasca
      Question Stems and Answer Types

     Identify the semantic category of expected answers

    Question                                        Question stem   Answer type

    Q555: What was the name of Titanic’s captain?   What            Person



    Q654: What U.S. Government agency registers     What            Organization
    trademarks?

    Q162: What is the capital of Kosovo?            What            City

    Q661: How much does one ton of cement cost?     How much        Quantity



          Other question stems: Who, Which, Name, How hot...
          Other answer types: Country, Number, Product...



Adapted from slide by Surdeanu and Pasca                                           23
     Detecting the Expected Answer Type
       In some cases, the question stem is sufficient to indicate the
       answer type (AT)
             Why  → REASON
             When → DATE
       In many cases, the question stem is ambiguous
            Examples
              – What was the name of Titanic’s captain ?
              – What U.S. Government agency registers trademarks?
              – What is the capital of Kosovo?
            Solution: select additional question concepts (AT words) 
            that help disambiguate the expected answer type
            Examples
              – captain
              – agency
              – capital



Adapted from slide by Surdeanu and Pasca                                24
      Answer Type Taxonomy
Encodes 8707 English concepts to help recognize the expected answer type
Mapping to parts of WordNet done by hand
   Can connect to Noun, Adj, and/or Verb subhierarchies




    Answer Type Detection Algorithm

        Select the answer type word from the question representation.
             Select the word(s) connected to the question. Some 
             “content-free” words are skipped (e.g. “name”).
             From the previous set select the word with the highest 
             connectivity in the question representation.


        Map the AT word into a previously built AT hierarchy
             The AT hierarchy is based on WordNet, with some
             concepts associated with semantic categories, e.g. “writer”
             → PERSON.
        Select the AT(s) from the first hypernym(s) associated with a
        semantic category.
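
        A hedged sketch of this idea using NLTK's WordNet interface; the
        category anchor synsets below are hand-picked illustrations, not the
        system's actual AT hierarchy (requires the NLTK 'wordnet' data).

    from nltk.corpus import wordnet as wn

    # Illustrative category anchors (not the real 8707-concept hierarchy)
    CATEGORY_ANCHORS = {
        wn.synset('person.n.01'): 'PERSON',
        wn.synset('organization.n.01'): 'ORGANIZATION',
        wn.synset('location.n.01'): 'LOCATION',
        wn.synset('quantity.n.01'): 'QUANTITY',
    }

    def answer_type(at_word):
        """Return the category of the first hypernym ancestor that is an anchor."""
        for synset in wn.synsets(at_word, pos=wn.NOUN):
            # hypernym_paths() gives root-to-synset chains; scan each from the word upward
            for path in synset.hypernym_paths():
                for ancestor in reversed(path):
                    if ancestor in CATEGORY_ANCHORS:
                        return CATEGORY_ANCHORS[ancestor]
        return None

    print(answer_type('captain'))    # expected: PERSON
    print(answer_type('agency'))     # expected: ORGANIZATION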



Adapted from slide by Surdeanu and Pasca                                    26
          Answer Type Hierarchy

          Example: part of the PERSON subtree
              scientist / man of science: researcher, chemist, oceanographer, ...
              inhabitant / dweller / denizen: American, westerner, islander / island-dweller, ...
              performer / performing artist: actor, actress, tragedian, dancer, ballet dancer, ...

          Example question representations whose AT word maps into this subtree:
              “What researcher discovered the vaccine against Hepatitis-B?”
                  AT word “researcher” → PERSON
              “What is the name of the French oceanographer who owned Calypso?”
                  AT word “oceanographer” → PERSON

Adapted from slide by Surdeanu and Pasca
         Evaluation of Answer Type Hierarchy

            This evaluation done in 2001
            Controlled the variation of the number of WordNet synsets 
            included in the answer type hierarchy.
            Test on 800 TREC questions.

                           Hierarchy       Precision score
                           coverage        (50-byte answers)

                             0%                   0.296
                             3%                   0.404
                            10%                   0.437
                            25%                   0.451
                            50%                   0.461




              The derivation of the answer type is the main source of 
              unrecoverable errors in the QA system

Adapted from slide by Surdeanu and Pasca                                 28
    Keyword Selection

        Answer Type indicates what the question is looking
        for, but provides insufficient context to locate the
        answer in a very large document collection
        Lexical terms (keywords) from the question, possibly
        expanded with lexical/semantic variations provide the
        required context.




Adapted from slide by Surdeanu and Pasca                        29
          Lexical Terms Extraction
           Questions approximated by sets of unrelated words
           (lexical terms)
           Similar to bag-of-words IR models

          Question (from TREC QA track)                    Lexical terms

          Q002: What was the monetary value of the         monetary, value, Nobel,
                Nobel Peace Prize in 1989?                 Peace, Prize

          Q003: What does the Peugeot company              Peugeot, company,
                manufacture?                               manufacture

          Q004: How much did Mercury spend on              Mercury, spend, advertising,
                advertising in 1993?                       1993

          Q005: What is the name of the managing           name, managing, director,
                director of Apricot Computer?              Apricot, Computer

Adapted from slide by Surdeanu and Pasca
    Keyword Selection Algorithm

    1. Select all non-stopwords in quotations
    2. Select all NNP words in recognized named entities
    3. Select all complex nominals with their adjectival modifiers
    4. Select all other complex nominals
    5. Select all nouns with adjectival modifiers
    6. Select all other nouns
    7. Select all verbs
    8. Select the AT word (which was skipped in all previous steps)
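
    A simplified sketch of these heuristics using NLTK POS tags (not the
    authors' implementation): quoted phrases, proper nouns, remaining nouns and
    verbs, then the AT word; the named-entity and complex-nominal steps are only
    approximated. Requires the NLTK tokenizer, tagger and stopword data.

    import re
    from nltk import pos_tag, word_tokenize
    from nltk.corpus import stopwords

    def select_keywords(question, at_word=None):
        keywords, seen = [], set()
        stop = set(stopwords.words('english'))

        def add(w):
            if w.lower() not in seen:
                seen.add(w.lower())
                keywords.append(w)

        # 1. non-stopwords inside quotation marks
        for quoted in re.findall(r'"([^"]+)"', question):
            for w in quoted.split():
                if w.lower() not in stop:
                    add(w)

        tagged = pos_tag(word_tokenize(question))
        # 2. proper nouns (a rough stand-in for NNP words in named entities)
        for w, t in tagged:
            if t.startswith('NNP'):
                add(w)
        # 5-7. remaining nouns, then verbs (the AT word is skipped for now)
        for wanted in ('NN', 'VB'):
            for w, t in tagged:
                if t.startswith(wanted) and w.lower() not in stop and w != at_word:
                    add(w)
        # 8. finally the AT word itself
        if at_word:
            add(at_word)
        return keywords

    print(select_keywords("What is the capital of Kosovo?", at_word="capital"))
    # expected roughly: ['Kosovo', 'capital']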




Adapted from slide by Surdeanu and Pasca                                    31
    Keyword Selection Examples

        What researcher discovered the vaccine against Hepatitis-B?
             Hepatitis-B, vaccine, discover, researcher


        What is the name of the French oceanographer who owned
        Calypso?
             Calypso, French, own, oceanographer


        What U.S. government agency registers trademarks?
             U.S., government, trademarks, register, agency


        What is the capital of Kosovo?
             Kosovo, capital


Adapted from slide by Surdeanu and Pasca                              32
   Passage Retrieval

   (Same block architecture as shown earlier, now highlighting the Passage
   Retrieval stage: the keywords produced by Question Processing drive Document
   Retrieval, passages are extracted from the returned documents and ranked with
   surface-text techniques, and the ranked passages plus the question semantics
   are passed on to Answer Extraction.)

Adapted from slide by Surdeanu and Pasca
    Passage Extraction Loop
        Passage Extraction Component
            Extracts passages that contain all selected keywords
            Passage size dynamic
            Start position dynamic
        Passage quality and keyword adjustment
            In the first iteration use the first 6 keyword selection 
            heuristics
            If the number of passages is lower than a threshold
            → query is too strict → drop a keyword
            If the number of passages is higher than a threshold
            → query is too relaxed → add a keyword
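
        A hedged sketch of this feedback loop; retrieve_passages stands in for
        the real retrieval engine, and the thresholds and toy collection are
        illustrative.

    def passage_retrieval_loop(keywords, n_initial, retrieve_passages,
                               min_passages=2, max_passages=500, max_iters=5):
        """keywords are ordered by the selection heuristics; the first n_initial
        (heuristics 1-6) are used in the first iteration."""
        n_active = n_initial
        passages = []
        for _ in range(max_iters):
            passages = retrieve_passages(keywords[:n_active])   # all active keywords required
            if len(passages) < min_passages and n_active > 1:
                n_active -= 1        # query too strict: drop a keyword
            elif len(passages) > max_passages and n_active < len(keywords):
                n_active += 1        # query too relaxed: add a keyword
            else:
                break
        return passages

    # Toy usage with a fake retrieval function over a two-"document" collection
    docs = ["The capital of Kosovo is Pristina.", "Kosovo declared independence."]
    fake_retrieve = lambda kws: [d for d in docs if all(k.lower() in d.lower() for k in kws)]
    print(passage_retrieval_loop(["Kosovo", "capital"], n_initial=2,
                                 retrieve_passages=fake_retrieve))
    # Only one passage matches both keywords, so "capital" is dropped and both docs are returned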



Adapted from slide by Surdeanu and Pasca                                34
    Passage Retrieval Architecture

    Keywords → Passage Extraction (over the documents returned by Document
    Retrieval) → Passages → Passage Quality check:
        if quality is insufficient → Keyword Adjustment, then extract passages again
        if quality is sufficient   → Passage Scoring → Passage Ordering → Ranked Passages

Adapted from slide by Surdeanu and Pasca
    Passage Scoring
        Passages are scored based on keyword windows
        For example, if a question has a set of keywords {k1, k2, k3, k4}, and
        in a passage k1 and k2 are matched twice, k3 is matched once, and k4
        is not matched, four windows are built, one per combination of the
        matched k1 and k2 occurrences (the window boundaries drawn in the
        original figure do not survive in text form)
Adapted from slide by Surdeanu and Pasca
    Passage Scoring

        Passage ordering is performed using a radix sort that
        involves three scores:
             SameWordSequenceScore (largest)
               – Computes the number of words from the question that 
                 are recognized in the same sequence in the window
             DistanceScore (largest)
               – The number of words that separate the most distant 
                 keywords in the window
             MissingKeywordScore (smallest)
               – The number of unmatched keywords in the window
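
        A minimal sketch (assumed, not the authors' code) of this ordering: a
        lexicographic sort on the three scores, which yields the same order as
        the radix sort described above; the example windows are made up.

    def order_passages(windows):
        """Each window carries the three scores above; the sort follows the slide:
        larger SameWordSequenceScore and DistanceScore first, then fewer
        missing keywords."""
        return sorted(
            windows,
            key=lambda w: (
                -w['same_word_sequence'],   # primary
                -w['distance'],             # secondary
                w['missing_keywords'],      # tertiary (smaller is better)
            ),
        )

    windows = [
        {'id': 'A', 'same_word_sequence': 2, 'distance': 7, 'missing_keywords': 1},
        {'id': 'B', 'same_word_sequence': 3, 'distance': 4, 'missing_keywords': 0},
    ]
    print([w['id'] for w in order_passages(windows)])    # ['B', 'A']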




Adapted from slide by Surdeanu and Pasca                                37
 Answer Extraction

    (Same block architecture as shown earlier, now highlighting the Answer
    Extraction stage: the ranked passages and the question semantics are
    combined, with the help of WordNet, a parser, and a named-entity
    recognizer, to extract and rank the candidate answers A.)

Adapted from slide by Surdeanu and Pasca
           Ranking Candidate Answers

           Q066: Name the first private citizen to fly in space.


           Answer type: Person
           Text passage:
                “Among them was Christa McAuliffe, the first private
                citizen to fly in space. Karen Allen, best known for her
                starring role in “Raiders of the Lost Ark”, plays McAuliffe.
                Brian Kerwin is featured as shuttle pilot Mike Smith...”

           Best candidate answer: Christa McAuliffe



Adapted from slide by Surdeanu and Pasca                                  39
         Features for Answer Ranking

          relNMW number of question terms matched in the answer passage
          relSP number of question terms matched in the same phrase as the
                 candidate answer
          relSS number of question terms matched in the same sentence as the
                 candidate answer
          relFP flag set to 1 if the candidate answer is followed by a punctuation
                 sign
          relOCTW number of question terms matched, separated from the candidate
                  answer by at most three words and one comma
          relSWS number of terms occurring in the same order in the answer
                  passage as in the question
          relDTW average distance from candidate answer to question term matches




  SIGIR ‘01

Adapted from slide by Surdeanu and Pasca                                         40
        Answer Ranking based on 
        Machine Learning
          Relative relevance score computed for each pair of
          candidates (answer windows)
           relPAIR = w_SWS × Δrel_SWS + w_FP × Δrel_FP
                     + w_OCTW × Δrel_OCTW + w_SP × Δrel_SP + w_SS × Δrel_SS
                     + w_NMW × Δrel_NMW + w_DTW × Δrel_DTW + threshold
              If relPAIR is positive, the first candidate of the pair is more
              relevant
          Perceptron model used to learn the weights
          Scores in the 50% MRR for short answers, in the 60%
          MRR for long answers
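
          A hedged sketch of the pairwise perceptron described above; feature
          names follow the previous slide, while the learning rate, epochs and
          the toy training pair are illustrative, not the published configuration.

    FEATURES = ['SWS', 'FP', 'OCTW', 'SP', 'SS', 'NMW', 'DTW']

    def rel_pair(delta, w, threshold):
        """relPAIR for one candidate pair; delta[f] = rel_f(cand1) - rel_f(cand2)."""
        return sum(w[f] * delta[f] for f in FEATURES) + threshold

    def perceptron_train(pairs, epochs=10, lr=0.1):
        """pairs: (delta, label), label = +1 if the first candidate is the better one."""
        w = {f: 0.0 for f in FEATURES}
        threshold = 0.0
        for _ in range(epochs):
            for delta, label in pairs:
                pred = 1 if rel_pair(delta, w, threshold) > 0 else -1
                if pred != label:                    # standard perceptron update
                    for f in FEATURES:
                        w[f] += lr * label * delta[f]
                    threshold += lr * label
        return w, threshold

    # Toy pair: candidate 1 matches one more question term in the same sentence
    pair = ({'SWS': 0, 'FP': 0, 'OCTW': 0, 'SP': 0, 'SS': 1, 'NMW': 1, 'DTW': 0}, +1)
    w, b = perceptron_train([pair])
    print(rel_pair(pair[0], w, b) > 0)               # True after training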




Adapted from slide by Surdeanu and Pasca                                41
         Evaluation on the Web

         Test on 350 questions from TREC (Q250-Q600); extract 250-byte answers.

                                   Google   Answer extraction   AltaVista   Answer extraction
                                            from Google                     from AltaVista

      Precision score               0.29         0.44             0.15           0.37

      Questions with a correct      0.44         0.57             0.27           0.45
      answer among top 5
      returned answers

Adapted from slide by Surdeanu and Pasca
  Can we make this simpler?
       One reason systems became so complex is that they have to
       pick out one sentence within a small collection
            The answer is likely to be stated in a hard-to-recognize 
            manner.
       Alternative Idea:
            What happens with a much larger collection? 
            The web is so huge that you’re likely to see the answer 
            stated in a form similar to the question
       Goal: make the simplest possible QA system by exploiting this
       redundancy in the web
            Use this as a baseline against which to compare more 
            elaborate systems.
            The next slides based on:
              – Web Question Answering: Is More Always Better? Dumais, Banko, Brill, Lin, 
                Ng, SIGIR’02
              – An Analysis of the AskMSR Question-Answering System, Brill, Dumais, 
                and Banko, EMNLP’02. 


Adapted from slides by Manning, Harabagiu, Kusmerick, ISI                                    43
    AskMSR System Architecture

    (Figure: the five processing steps of the system, described on the following
    slides: 1. rewrite the question, 2. query the search engine, 3. gather
    n-grams from the snippets, 4. filter the n-grams, 5. tile the answers.)

Adapted from slides by Manning, Harabagiu, Kusmerick, ISI
    Step 1: Rewrite the questions

        Intuition: The user’s question is often syntactically
        quite close to sentences that contain the answer

              Where is the Louvre Museum located?
                   → The Louvre Museum is located in Paris

              Who created the character of Scrooge?
                   → Charles Dickens created the character of Scrooge.



Adapted from slides by Manning, Harabagiu, Kusmerick, ISI        45
   Query rewriting
   Classify question into seven categories

               Who is/was/are/were…?
               When is/did/will/are/were …?
               Where is/are/were …?

   a. Hand-crafted category-specific transformation rules
         e.g.: For “where” questions, move ‘is’ to all possible locations,
               then look to the right of the query terms for the answer.

              “Where is the Louvre Museum located?”
                →  “is the Louvre Museum located”
                →  “the is Louvre Museum located”        (nonsense, but OK: it’s
                →  “the Louvre is Museum located”         only a few more queries
                →  “the Louvre Museum is located”         to the search engine)
                →  “the Louvre Museum located is”

   b. Expected answer “Datatype” (e.g., Date, Person, Location, …)
         When was the French Revolution? → DATE
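
   A minimal sketch of rewrite rule (a) for this example; it illustrates the
   idea of moving the verb through the query, not the AskMSR implementation.

    def move_verb_rewrites(question, verb="is"):
        words = question.rstrip("?").split()
        rest = [w for w in words if w.lower() != verb.lower()][1:]   # drop wh-word and verb
        rewrites = []
        for i in range(len(rest) + 1):
            candidate = rest[:i] + [verb] + rest[i:]
            rewrites.append('"' + " ".join(candidate) + '"')         # quoted phrase query
        return rewrites

    for r in move_verb_rewrites("Where is the Louvre Museum located?"):
        print(r)
    # "is the Louvre Museum located", "the is Louvre Museum located", ...,
    # "the Louvre Museum located is"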



Adapted from slides by Manning, Harabagiu, Kusmerick, ISI                            46
      Query Rewriting - weighting

     Some query rewrites are more reliable than others.

          Where is the Louvre Museum located?

          Weight 5:  +“the Louvre Museum is located”
                     (if it matches, the answer is probably right)

          Weight 1:  +Louvre +Museum +located
                     (lots of non-answers could come back too)
Adapted from slides by Manning, Harabagiu, Kusmerick, ISI                      47
    Step 2: Query search engine

        Send all rewrites to a Web search engine
        Retrieve top N answers (100-200)
        For speed, rely just on search engine’s “snippets”,
        not the full text of the actual document




Adapted from slides by Manning, Harabagiu, Kusmerick, ISI     48
    Step 3: Gathering N-Grams

        Enumerate all N-grams (N=1,2,3) in all retrieved snippets
        Weight of an n-gram: occurrence count, each weighted by
        “reliability” (weight) of rewrite rule that fetched the document
            Example: “Who created the character of Scrooge?”

               Dickens                          117
               Christmas Carol                    78
               Charles Dickens                    75
               Disney                             72
               Carl Banks                         54
               A Christmas                        41
               Christmas Carol                    45
               Uncle                              31
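
        A hedged sketch of this step: count all 1-3 grams over the returned
        snippets, weighting each occurrence by the reliability of the rewrite
        that fetched it; the snippets below are made-up stand-ins for
        search-engine output.

    from collections import Counter

    def harvest_ngrams(snippets, max_n=3):
        """snippets: list of (text, rewrite_weight) pairs."""
        scores = Counter()
        for text, weight in snippets:
            tokens = text.split()
            for n in range(1, max_n + 1):
                for i in range(len(tokens) - n + 1):
                    scores[" ".join(tokens[i:i + n])] += weight
        return scores

    snippets = [
        ("Charles Dickens created the character of Scrooge", 5),
        ("Scrooge a character created by Dickens", 1),
    ]
    print(harvest_ngrams(snippets).most_common(3))   # highest-weighted n-grams first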



Adapted from slides by Manning, Harabagiu, Kusmerick, ISI                  49
    Step 4: Filtering N-Grams

        Each question type is associated with one or more
        “data-type filters” = regular expressions, e.g.:
             When…  → Date
             Where… → Location
             Who…   → Person
             What…  → (depends on the question)
        Boost score of n-grams that match the regexp
        Lower score of n-grams that don’t match the regexp
        Details omitted from paper….
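
        A hedged illustration of this filtering step: boost n-grams that match
        the expected type's regular expression and demote the rest; the DATE
        pattern is a simplified stand-in, not the system's actual filter.

    import re

    DATE_RE = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")      # a bare four-digit year

    def filter_ngrams(scores, type_re, boost=2.0, penalty=0.5):
        return {ng: s * (boost if type_re.search(ng) else penalty)
                for ng, s in scores.items()}

    print(filter_ngrams({"1756": 10, "Salzburg": 8}, DATE_RE))
    # {'1756': 20.0, 'Salzburg': 4.0}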


Adapted from slides by Manning, Harabagiu, Kusmerick, ISI   50
    Step 5: Tiling the Answers

    Greedily tile overlapping n-grams, starting from the highest-scoring one, and
    discard the n-grams that were merged. Example from the figure: “Charles Dickens”
    (score 20), “Dickens” (15) and “Mr Charles” (10) tile into “Mr Charles Dickens”
    (score 45). Repeat until no more n-grams overlap.

Adapted from slides by Manning, Harabagiu, Kusmerick, ISI
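
    A hedged sketch of the tiling step: repeatedly merge the two highest-scoring
    n-grams that overlap at their word boundaries, summing their scores. A greedy
    illustration of the idea, not the AskMSR implementation.

    def tile(a, b):
        """Return the merged string if a's word suffix overlaps b's prefix (or vice versa)."""
        for x, y in ((a, b), (b, a)):
            xw, yw = x.split(), y.split()
            for k in range(min(len(xw), len(yw)), 0, -1):
                if xw[-k:] == yw[:k]:
                    return " ".join(xw + yw[k:])
        return None

    def tile_answers(scored):                        # scored: {ngram: score}
        scored = dict(scored)
        merged = True
        while merged:
            merged = False
            items = sorted(scored.items(), key=lambda kv: -kv[1])
            for i, (a, sa) in enumerate(items):
                for b, sb in items[i + 1:]:
                    t = tile(a, b)
                    if t:
                        del scored[a], scored[b]     # discard the merged n-grams
                        scored[t] = scored.get(t, 0) + sa + sb
                        merged = True
                        break
                if merged:
                    break
        return scored

    print(tile_answers({"Charles Dickens": 20, "Dickens": 15, "Mr Charles": 10}))
    # expected: {'Mr Charles Dickens': 45}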
    Results
        Standard TREC contest test-bed (TREC 2001):
            ~1M documents; 900 questions
             Technique doesn’t do too well (though would have 
             placed in top 9 of ~30 participants)
               – MRR: strict: .34
               – MRR: lenient: .43
               – 9th place




Adapted from slides by Manning, Harabagiu, Kusmerick, ISI        52
Results
 From EMNLP’02 paper
   MRR of .577; answers 61% of questions correctly
   Would be near the top of TREC-9 runs
 Breakdown of feature contribution: (chart not reproduced in this text version)
    Issues

        Works best/only for “Trivial Pursuit”-style fact-based
        questions
        Limited/brittle repertoire of
             question categories
             answer data types/filters
             query rewriting rules




Adapted from slides by Manning, Harabagiu, Kusmerick, ISI        54
    Intermediate Approach:
    Surface pattern discovery
        Based on:
             Ravichandran, D. and Hovy E.H. Learning Surface Text Patterns for a
             Question Answering System, ACL’02
             Hovy, et al., Question Answering in Webclopedia, TREC-9, 2000. 


        Use of Characteristic Phrases
        "When was <person> born”
             Typical answers
               – "Mozart was born in 1756.”
               – "Gandhi (1869-1948)...”
             Suggests regular expressions to help locate correct answer
               – "<NAME> was born in <BIRTHDATE>”
               – "<NAME> ( <BIRTHDATE>-”



Adapted from slides by Manning, Harabagiu, Kusmerick, ISI                          55
    Use Pattern Learning
       Examples:
            “The great composer Mozart (1756-1791) achieved 
            fame at a young age”
            “Mozart (1756-1791) was a genius”
            “The whole world would always be indebted to the 
            great music of Mozart (1756-1791)”
       Longest matching substring for all 3 sentences is
       "Mozart (1756-1791)”
            Suffix tree would extract "Mozart (1756-1791)" as an 
            output, with score of 3
       Reminiscent of IE pattern learning
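
       A hedged sketch of this step: find the longest substring shared by all
       example sentences (the paper uses a suffix tree; brute force is enough
       to illustrate the idea).

    def longest_common_substring(sentences):
        shortest = min(sentences, key=len)
        best = ""
        for i in range(len(shortest)):
            for j in range(len(shortest), i + len(best), -1):
                cand = shortest[i:j]
                if len(cand) > len(best) and all(cand in s for s in sentences):
                    best = cand
        return best

    examples = [
        "The great composer Mozart (1756-1791) achieved fame at a young age",
        "Mozart (1756-1791) was a genius",
        "The whole world would always be indebted to the great music of Mozart (1756-1791)",
    ]
    print(longest_common_substring(examples))        # Mozart (1756-1791)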


Adapted from slides by Manning, Harabagiu, Kusmerick, ISI           56
    Pattern Learning (cont.)

         Repeat with different examples of same question type
              “Gandhi 1869”, “Newton 1642”, etc.


         Some patterns learned for BIRTHDATE
              a. born in <ANSWER>, <NAME>
              b. <NAME> was born on <ANSWER> , 
              c. <NAME> ( <ANSWER> -
              d. <NAME> ( <ANSWER> - )
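
         A hedged sketch of applying learned patterns as regular expressions;
         patterns (b) and (c) from the list above are used, the <NAME> slot is
         filled with the question term and <ANSWER> is captured. The example
         texts are illustrative.

    import re

    PATTERNS = [
        r"{name} was born on (?P<answer>[^,.]+)",     # b. <NAME> was born on <ANSWER> ,
        r"{name} \( (?P<answer>\d{{4}}) -",           # c. <NAME> ( <ANSWER> -
    ]

    def extract_birthdate(name, text):
        for template in PATTERNS:
            m = re.search(template.format(name=re.escape(name)), text)
            if m:
                return m.group("answer").strip()
        return None

    print(extract_birthdate("Mozart", "Mozart was born on 27 January 1756, in Salzburg."))
    print(extract_birthdate("Gandhi", "Gandhi ( 1869 - 1948 ) led India to independence."))
    # expected: "27 January 1756" and "1869"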




Adapted from slides by Manning, Harabagiu, Kusmerick, ISI       57
  QA Typology from ISI
      Typology of typical question forms—94 nodes (47 leaf nodes)
      Analyzed 17,384 questions (from answers.com)

      (Excerpt; the two-column layout of the original taxonomy does not survive
      conversion. Top-level categories include: THING; AGENT, with NAME (first
      names, company names), ANIMAL-HUMAN, PERSON, ORGANIZATION, GROUP-OF-PEOPLE,
      STATE-DISTRICT, CITY, COUNTRY; PLACE (geological formations, airports,
      colleges, ...); ABSTRACT, with LANGUAGE and QUANTITY (numerical, mass,
      monetary, temporal, energy, temperature, illumination, spatial quantities,
      percentages); UNIT (information, mass, energy, currency, temporal,
      temperature, illumination, spatial units); and TANGIBLE-OBJECT, with FOOD,
      SUBSTANCE (liquids, solids, gases), INSTRUMENT (including weapons and
      musical instruments), BODY-PART, GARMENT, PLANT, DISEASE.)

Adapted from slides by Manning, Harabagiu, Kusmerick, ISI
    Experiments

        6 different question types
             from Webclopedia QA Typology 
               –   BIRTHDATE
               –   LOCATION
               –   INVENTOR
               –   DISCOVERER
               –   DEFINITION
               –   WHY-FAMOUS




Adapted from slides by Manning, Harabagiu, Kusmerick, ISI   59
    Experiments: pattern precision

        BIRTHDATE:
             1.0          <NAME> ( <ANSWER> - )
             0.85         <NAME> was born on <ANSWER>,
             0.6          <NAME> was born in <ANSWER>
             0.59         <NAME> was born <ANSWER>
             0.53         <ANSWER> <NAME> was born
             0.50         - <NAME> ( <ANSWER>
             0.36         <NAME> ( <ANSWER> -
        INVENTOR
             1.0          <ANSWER> invents <NAME>
             1.0          the <NAME> was invented by <ANSWER>
             1.0          <ANSWER> invented the <NAME> in



Adapted from slides by Manning, Harabagiu, Kusmerick, ISI       60
    Experiments (cont.)

        DISCOVERER
             1.0          when <ANSWER> discovered <NAME>
             1.0          <ANSWER>'s discovery of <NAME>
             0.9          <NAME> was discovered by <ANSWER> in
        DEFINITION
             1.0          <NAME> and related <ANSWER>
             1.0          form of <ANSWER>, <NAME>
             0.94         as <NAME>, <ANSWER> and




Adapted from slides by Manning, Harabagiu, Kusmerick, ISI        61
    Experiments (cont.)

        WHY-FAMOUS
             1.0          <ANSWER> <NAME> called
             1.0          laureate <ANSWER> <NAME>
             0.71         <NAME> is the <ANSWER> of
        LOCATION
             1.0          <ANSWER>'s <NAME>
             1.0          regional : <ANSWER> : <NAME>
             0.92         near <NAME> in <ANSWER>
        Depending on question type, the patterns achieve high MRR (0.6–0.9),
        with higher results when using the Web than when using the TREC QA
        collection


Adapted from slides by Manning, Harabagiu, Kusmerick, ISI     62
    Shortcomings & Extensions

        Need for POS &/or semantic types
               – "Where are the Rocky Mountains?”
               – "Denver's new airport, topped with white fiberglass 
                 cones in imitation of the Rocky Mountains in the
                 background , continues to lie empty”
               – <NAME> in <ANSWER>
        NE tagger &/or ontology could enable system to
        determine "background" is not a location




Adapted from slides by Manning, Harabagiu, Kusmerick, ISI               63
    Shortcomings... (cont.)

        Long distance dependencies
             "Where is London?”
                 "London, which has one of the busiest airports in the 
                  world, lies on the banks of the river Thames”
             would require pattern like:
             <QUESTION>, (<any_word>)*, lies on <ANSWER>
             Abundance & variety of Web data helps system to 
             find an instance of patterns w/o losing answers to 
             long distance dependencies




Adapted from slides by Manning, Harabagiu, Kusmerick, ISI                 64
    Shortcomings... (cont.)
        System currently has only one anchor word
             Doesn't work for Q types requiring multiple words 
             from question to be in answer
               – "In which county does the city of Long Beach lie?”
               – "Long Beach is situated in Los Angeles County”
               – required pattern: 
                 <Q_TERM_1> is situated in <ANSWER> <Q_TERM_2>
        Does not use case
               – "What is a micron?”
               – "...a spokesman for Micron, a maker of
                 semiconductors, said SIMMs are..."
        If “Micron” had been capitalized in the question, this would have
        been a perfect answer


Adapted from slides by Manning, Harabagiu, Kusmerick, ISI             65
Question Answering

 Today:
   Introduction to QA
   A typical full-fledged QA system
   A very simple system, in response to this
   An intermediate approach
 Wednesday:
   Using external resources
    – WordNet
     – Encyclopedias, Gazetteers
   Incorporating a reasoning system
   Machine Learning of mappings
   Alternative question types
