
									Question Answering Technologies


          Lyubomyr Havrylyuk



         University of Konstanz


             Feb 07,2011
      Outline
                       QA Systems Historical Introduction

                       Modern approaches

                       Question focus recognition

                       Intelligent numerical answers generation

                       Answer recognition and extraction

                       Performance issues and error analysis

                       Conclusions




Question Answering systems                       Information Retrieval - Konstanz Uni   2
      QA systems historical review
        A Question Answering (QA) system is a system that automatically answers
        questions posed in natural language.


        QA systems originated in the early 1960s as systems for answering questions
        about a certain domain of knowledge.

        By 1965, a first generation of fifteen experimental question-answering
        systems had already been reviewed. These included a social-conversation
        machine, systems that translated from English into a limited logical calculus,
        and programs that attempted to answer questions from English text.




      “First generation“
       The two most famous QA systems of that time:
       • BASEBALL - answered questions about the US baseball league over a period of
       one year.
       • LUNAR - answered questions about the geological analysis of rocks returned by
       the Apollo moon missions.

       The common feature of all those systems is that they had a core database or
       knowledge system that was hand-written by experts of the chosen domain.


       First generation systems were often handicapped by:
       - the lack of adequate linguistic models
       - being written in low-level languages such as FAP and IPL
       - frequent blunders and poor updatability, because the systems did not store
       previously gained information



      “Second generation“ QA systems
         Second generation QA systems were:
         - Programmed in higher-level languages (e.g. Lisp, SNOBOL, ALGOL)
         - Easier to update, thanks to limited features for remembering
           previously mentioned topics and facts
         - Fact-retrieval systems that generalized several earlier approaches

    Data statements:
    1.     There are 5 fingers on a hand
    2.     There is one hand on an arm
    3.     There are 2 arms on a man

    Inference rule, written as a conditional statement with variables:

    1. If there are m X’s on a V and if there are n V’s on a Y,
    then there are m*n X’s on a Y.

    Question: How many fingers are on a man?
    Answer: 10

                                                             DEDUctive COMmunicator (DEDUCOM) Example
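
    The deduction above can be sketched in a few lines of Python. This is only an
    illustration of the m*n rule, not DEDUCOM's actual implementation; the fact
    table layout and the function name count_parts are assumptions, and the fact
    base is assumed to contain no cycles.

```python
# Facts are stored as (part, whole) -> count,
# read as "there are <count> <part>s on a <whole>".
facts = {
    ("finger", "hand"): 5,
    ("hand", "arm"): 1,
    ("arm", "man"): 2,
}


def count_parts(part, whole, facts):
    """Return how many `part`s are on a `whole`, or None if not derivable."""
    if (part, whole) in facts:
        return facts[(part, whole)]
    # Inference rule: m X's on a V and n V's on a Y  =>  m*n X's on a Y.
    for (x, v), m in facts.items():
        if x == part:
            n = count_parts(v, whole, facts)
            if n is not None:
                return m * n
    return None


print(count_parts("finger", "man", facts))  # 10
```

    The rule is applied twice here: fingers on an arm (5 * 1), then fingers on a
    man (5 * 2).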


      Modern QA Systems
       Nowadays interest in QA is increasing due to:

       - the popularity of Internet QA services (e.g. Ask.com, TrueKnowledge, EAGLi, etc.)

       - the recent evaluations of domain-independent QA systems organized in the
       context of the Text REtrieval Conference (TREC)

           TREC restrictions:

       1. At least one document in the test collection contains the answer to each
       test question

       2. Answer length is limited (e.g. 250 bytes)




      QA online systems examples


                                                               QA online service, with the list
                                                               of relevant online answers




                                                               QA online system, with the
                                                               answer in an excerpt from
                                                               online document




      Finding answer
   To find the answer to a question, several steps must be taken:

        the question semantics must be captured
                 by identifying: the expected answer type
                                 the question keywords

        the index of the document collection must be used

        the answer must be extracted




      Question representation
    Possible issues:
     - establishing the possible answer type
           e.g. PERSON, LOCATION, TIME, ORGANISATION, DATE, MONEY, NUMBER etc.

       - finding interdependencies between question keywords
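
    As a rough illustration of answer-type detection, the sketch below maps a
    question stem to one of the types listed above and keeps the remaining content
    words as keywords. The stem table, the stopword list and the function name
    analyze_question are all assumptions made for this example; real systems rely
    on much richer taxonomies and on syntactic analysis.

```python
import re

# Hypothetical stem-to-type table (longest stems first).
STEM_TO_TYPE = [
    ("how many", "NUMBER"),
    ("who", "PERSON"),
    ("where", "LOCATION"),
    ("when", "DATE"),
]

# Small illustrative stopword list, not a standard resource.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "are", "was", "there", "on", "to"}


def analyze_question(question):
    """Return (expected_answer_type, keywords) for a question string."""
    text = question.lower().strip(" ?")
    answer_type, stem = "UNKNOWN", ""
    for candidate, type_name in STEM_TO_TYPE:
        if text.startswith(candidate):
            answer_type, stem = type_name, candidate
            break
    # Keywords are the non-stopword tokens that follow the question stem.
    rest = text[len(stem):]
    keywords = [w for w in re.findall(r"[a-z0-9]+", rest) if w not in STOPWORDS]
    return answer_type, keywords


print(analyze_question("How many inhabitants are there in France?"))
# ('NUMBER', ['inhabitants', 'france'])
```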




    The answer type is the object of the verb visit, which is defined by the semantic category LANDMARK.
    The answer type replaces the question stem.




      Semantic mapping
      Syntactic dependencies vary across question reformulations and equivalent answers,
      which are made possible by the productive nature of natural language.
      The verbs see and visit are synonyms; visitor can be replaced by the possible actor pronoun I.


      Question ET2:
      What could I see in Reims?


      The unifying mapping of ET1 and ET2.




      The mapping helps to recognize equivalent answers when lexical and semantic alternations are allowed.


      It establishes dependency relations and defines the search space based on alternations of the
      question and answer concepts.


      Feedback supporting open-domain QA
      Justifying the correctness of an answer relies on a lexico-semantic knowledge base
      (e.g. WordNet).

      Sometimes answer fusion is needed.




      Question focus recognition
   Question focus is a noun phrase (NP) that is likely to be present in the answer.

   Question: Who was the first governor of Alaska?
   FOCUS = the first governor of Alaska
   FOCUS-HEAD = governor
   MODIFIERS-FOCUS-HEAD= ADJ first, COMP Alaska

       NPs whose head is a synonym of the question focus head are also looked for.
       Once delimited, NPs can be assigned a score for relevance ranking.

       “This score takes into account the origin of the NP and the modifiers found in the
       question: when the NP contains the modifiers present in the question, its score is
       increased. The best score is obtained when all of them are present.” [4]
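
       A minimal sketch of such a score follows. It only models the modifier part
       of the description quoted above (the origin of the NP is ignored), and the
       weights, the function name score_noun_phrase and the whitespace tokenization
       are assumptions, not the scoring actually used in [4].

```python
def score_noun_phrase(np_text, focus_head, modifiers, head_synonyms=()):
    """Score a candidate NP against the question focus.

    Returns 0.0 if the NP contains neither the focus head nor one of its
    synonyms; otherwise 1.0 plus the fraction of question modifiers present.
    """
    words = set(np_text.lower().split())
    heads = {focus_head.lower()} | {s.lower() for s in head_synonyms}
    if not heads & words:
        return 0.0
    if not modifiers:
        return 1.0
    matched = sum(1 for m in modifiers if m.lower() in words)
    return 1.0 + matched / len(modifiers)


modifiers = ["first", "Alaska"]
print(score_noun_phrase("the first governor of Alaska", "governor", modifiers))  # 2.0
print(score_noun_phrase("the governor of Alaska", "governor", modifiers))        # 1.5
```

       The best score (2.0 in this sketch) is reached only when every modifier of
       the question focus appears in the candidate NP.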



      Answer from a set of candidate answers
        Most systems provide the user with:
        - either a set of potential answers (ranked or not),
        - or the "best" answer according to some relevance criteria.

        But what about the information contained in the whole set of candidate answers?


    Example 1 :
    How many inhabitants are there in France?
    - Population census in France (1999): 60184186.
    - 61.7: number of inhabitants in France in 2004.


                             Example 2 :
                             What is the average age of marriage of women in 2004?
                             - In Iran, the average age of marriage of women was 21 years in 2004.
                             - In 2004, Moroccan women get married at the age of 27.



      Numeric results variation criteria
        Variation exists if, among the N retrieved frames or snippets, there are at
        least k different numerical values associated with different criteria
        (time, place, other restrictions), e.g. k = N / 4.
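
        One possible reading of this criterion is sketched below. The frame
        layout (dictionaries with value, time, place and restriction keys), the
        function name has_variation and the lower bound of two values are
        assumptions made for illustration.

```python
from math import ceil


def has_variation(frames, ratio=0.25):
    """Check the variation criterion with k = N * ratio (k = N / 4 by default)."""
    n = len(frames)
    # Assumption: at least two values are needed to speak of variation at all.
    k = max(2, ceil(n * ratio))
    values = {f["value"] for f in frames}
    criteria = {(f.get("time"), f.get("place"), f.get("restriction")) for f in frames}
    return len(values) >= k and len(criteria) >= k


# The two candidate answers of Example 1, with values as given on the slide.
frames = [
    {"value": 60184186, "time": 1999, "place": "France"},
    {"value": 61.7, "time": 2004, "place": "France"},
]
print(has_variation(frames))  # True
```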

        Numerical value varies according to:




      Variation criteria




      Building a trend
    When values vary over time, a trend can be drawn, and an explanation can be
    generated from a correlation coefficient (e.g. the Pearson correlation coefficient r).
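
    The Pearson coefficient itself is standard; how it is turned into an
    explanation is sketched here under assumptions. The threshold of 0.6, the
    wording of the labels and the function names are hypothetical, and the data
    points are toy values, not real statistics.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either sequence is constant.
    return cov / sqrt(var_x * var_y)


def describe_trend(r, threshold=0.6):
    """Map r to a coarse label; the threshold is an assumed cut-off."""
    if r >= threshold:
        return "increasing"
    if r <= -threshold:
        return "decreasing"
    return "no clear trend"


years = [1, 2, 3, 4]        # toy time points
values = [10, 12, 13, 15]   # toy measurements
print(describe_trend(pearson_r(years, values)))  # increasing
```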




           Variation mode: How many inhabitants are there in France?




      Numerical answer generation
    Once the extracted numerical values have been characterized, a cooperative answer can
    be generated.

    It is composed of two parts:
    - a direct answer, if available,
    - an explanation of the value variation.
    Direct answer generation is mainly guided by constraints, if any are
    explicitly stated in the question.

       Ct - constraints on time
       Cp - constraints on place
       Cr - constraints on restriction

            C = {Ct, Cp, Cr}


      Numerical answer generation
       A direct answer has to be generated from the set of snippets AC which
       satisfy the set of constraints C.
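
       Selecting the snippets that satisfy C can be sketched as a simple filter.
       The dictionary layout and the function names are assumptions; the two
       candidate values come from Example 2 earlier in these slides.

```python
def satisfies(frame, constraints):
    """True if the frame matches every constraint stated in the question."""
    return all(frame.get(name) == wanted for name, wanted in constraints.items())


def select_candidates(frames, constraints):
    """Return the subset of frames that satisfy all constraints."""
    return [f for f in frames if satisfies(f, constraints)]


candidates = [
    {"value": 21, "time": 2004, "place": "Iran", "restriction": "women"},
    {"value": 27, "time": 2004, "place": "Morocco", "restriction": "women"},
]

# Ct and Cp are stated in this hypothetical question, Cr is left open.
constraints = {"time": 2004, "place": "Iran"}
print([f["value"] for f in select_candidates(candidates, constraints)])  # [21]
```

       Constraints that the question does not state are simply left out of the
       dictionary, so they never exclude a snippet.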




      Example
          Question: What is the average age of marriage in France?

          A = {AC1;AC2} with:

          AC1 = {a1; a3; a5}, subset for restriction women,
          AC2 = {a2; a4; a6}, subset for restriction men.

          having :           a1= 27.7 a2=29.8
                             a3= 28   a4=30
                             a5= 28.5 a6=30.6


          Direct answer after the aggregation process:

          In 2000, the average age of marriage in France was about 30 years for men and 28
          years for women.
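
          The slide does not spell out the aggregation step. Assuming it is a
          rounded arithmetic mean over each subset, the figures above can be
          reproduced as follows; the function name aggregate is hypothetical.

```python
# Candidate values per restriction, as listed in the example.
subsets = {
    "women": [27.7, 28.0, 28.5],  # AC1 = {a1, a3, a5}
    "men": [29.8, 30.0, 30.6],    # AC2 = {a2, a4, a6}
}


def aggregate(values):
    """Assumed aggregation: arithmetic mean rounded to the nearest year."""
    return round(sum(values) / len(values))


direct_answer = {restriction: aggregate(vals) for restriction, vals in subsets.items()}
print(direct_answer)  # {'women': 28, 'men': 30}
```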



      Serial system representation
    A QA system represented as a serial system:




      Distribution of errors per system module




      Conclusion
        QA systems have been extended in recent years to explore critical new
        scientific and practical dimensions: automatic answering of temporal and
        geospatial questions, definitional questions, biographical questions, multilingual
        questions, and questions about different multimedia items.
        Nevertheless, the overall performance of QA systems is directly related to the
        depth of the NLP resources used, and is significantly enhanced by lexico-semantic
        information from large lexical databases of English and from online
        documents.
          Bottlenecks of QA systems:
        - the derivation of the expected answer type
        - the keyword expansion


        The main problem is the lack of powerful schemes and algorithms for modeling
        complex questions in order to derive as much information as possible, and for
        performing a well-guided search through thousands of text documents.


      References
                             1) Robert F. Simmons. 1970. Natural language question-answering systems: 1969.
                             Commun. ACM 13, 1 (January 1970), 15-30.

                             2) Marius Paşca and Sanda M. Harabagiu. 2001. Answer mining from on-line documents. In
                             Proceedings of the workshop on Open-domain question answering - Volume 12 (ODQA '01),
                             Vol. 12. Association for Computational Linguistics, Stroudsburg, PA, USA, 1-8.

                             3) Véronique Moriceau. 2006. Generating intelligent numerical answers in a question-answering
                             system. In Proceedings of the Fourth International Natural Language Generation
                             Conference (INLG '06). Association for Computational Linguistics, Stroudsburg, PA, USA,
                             103-110.

                             4) O. Ferret, B. Grau, M. Hurault-Plantet, G. Illouz, L. Monceaux, I. Robba, and A. Vilnat.
                             2001. Finding an Answer Based on the Recognition of the Question Focus. In 10th Text
                             Retrieval Conference.

                             5) Dan Moldovan, Sanda Harabagiu, and Mihai Surdeanu. 2003. Performance issues and
                             error analysis in an open-domain question answering system. ACM Trans. Inf. Syst. 21, 2
                             (April 2003), 133-154.




                             Thank you for your attention!



