Textual Entailment Using Univariate Density Model and Maximizin

							Recognizing Textual Entailment
   Progress towards RTE 4



              Scott Settembre
  University at Buffalo, SNePS Research Group
              ss424@cse.buffalo.edu
 Recognizing Textual Entailment Challenge (RTE) -
                     Overview
• The task is to develop a system that determines whether the first
  sentence of a given pair “entails” the second sentence

• The pair of sentences is called the Text-Hypothesis pair (or T-
  H pair)

• Participants are provided with 800 sample T-H pairs annotated
  with the correct entailment answers

• The final testing set consists of 800 non-annotated samples
                    Development set examples

• Example of a YES result
<pair id="28" entailment="YES" task="IE" length="short">
 <t>As much as 200 mm of rain have been recorded in portions of British
   Columbia , on the west coast of Canada since Monday.</t>
 <h>British Columbia is located in Canada.</h>
</pair>



• Example of a NO result

<pair id="20" entailment="NO" task="IE" length="short">
 <t>Blue Mountain Lumber is a subsidiary of Malaysian forestry transnational
   corporation, Ernslaw One.</t>
 <h>Blue Mountain Lumber owns Ernslaw One.</h>
</pair>
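
A minimal sketch of reading pairs in this format, assuming they sit inside a single root element (the wrapper name `entailment-corpus` and the helper `load_pairs` are assumptions made for illustration, not part of the slides):

```python
import xml.etree.ElementTree as ET

# The two pairs shown above, wrapped in a root element.
# The wrapper name "entailment-corpus" is an assumption.
SAMPLE = """<entailment-corpus>
<pair id="28" entailment="YES" task="IE" length="short">
 <t>As much as 200 mm of rain have been recorded in portions of British
   Columbia , on the west coast of Canada since Monday.</t>
 <h>British Columbia is located in Canada.</h>
</pair>
<pair id="20" entailment="NO" task="IE" length="short">
 <t>Blue Mountain Lumber is a subsidiary of Malaysian forestry transnational
   corporation, Ernslaw One.</t>
 <h>Blue Mountain Lumber owns Ernslaw One.</h>
</pair>
</entailment-corpus>"""


def load_pairs(xml_text):
    """Return one dict per <pair> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    pairs = []
    for pair in root.iter("pair"):
        pairs.append({
            "id": pair.get("id"),
            "task": pair.get("task"),
            "entailed": pair.get("entailment") == "YES",
            # Collapse the line breaks and indentation inside <t> and <h>.
            "text": " ".join(pair.findtext("t").split()),
            "hypothesis": " ".join(pair.findtext("h").split()),
        })
    return pairs


pairs = load_pairs(SAMPLE)
for p in pairs:
    print(p["id"], p["task"], p["entailed"], "|", p["hypothesis"])
```

The non-annotated test pairs would presumably lack the `entailment` attribute, in which case `entailed` simply comes out as `False` here and should be ignored.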
                   Entailment Task Types

• There are 4 different entailment tasks:

   – “IE” or Information Extraction
       • Text: “An Afghan interpreter, employed by the United States,
         was also wounded.”
       • Hypothesis: “An interpreter worked for Afghanistan.”


   – “IR” or Information Retrieval
       • Text: “Catastrophic floods in Europe endanger lives and cause
         human tragedy as well as heavy economic losses”
       • Hypothesis: “Flooding in Europe causes major economic
         losses.”
           Entailment Task Types - continued

• The two remaining entailment tasks are:

   – “SUM” or Multi-document summarization
       • Text: “Sheriff's officials said a robot could be put to use in
         Ventura County, where the bomb squad has responded to more
         than 40 calls this year.”
       • Hypothesis: “Police use robots for bomb-handling.”


   – “QA” or Question Answering
       • Text: “Israel's prime Minister, Ariel Sharon, visited Prague.”
       • Hypothesis: “Ariel Sharon is the Israeli Prime Minister.”
                    RTE3 - 2007 Results

• Our two runs submitted this year (2007) scored:
   – 62.62% (501 correct out of 800)
   – 61.00% (488 correct out of 800)
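
As a quick check of the arithmetic behind these two scores (the helper name `accuracy_percent` is just for illustration):

```python
def accuracy_percent(correct, total):
    """Accuracy as a percentage of correctly judged T-H pairs."""
    return 100.0 * correct / total


runs = [(501, 800), (488, 800)]
for correct, total in runs:
    print(f"{correct}/{total} = {accuracy_percent(correct, total):.3f}%")
```

The first run works out to 62.625%, which the slide reports truncated to 62.62%.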


• For the 3rd RTE Challenge of 2007, 62.62% ties for 12th out
  of 26 teams.
   – Top scores were 80%, 72%, 69%, and 67%.
   – Median: 61.75%
   – Range: 49% to 80% (up from 75.38% last year)
                     RTE3 - 2007 Results

• Category breakdown consistent with last year
   –   QA (question answering) average was 71%       [75%]
   –   IR (information retrieval) average was 66%    [63%]
   –   Summary average was 58%                       [61.5%]
   –   IE (information extraction) average was 52%   [51%]


• This relationship between the entailment categories was
  consistent between the groups as well.
                               Best Performers


• Hickl, one of the top performers, used techniques like these:
     – Lexical relationships, using WordNet *
     – N-gram, word similarity *
     – Anaphora resolution
     – Machine learning techniques *
        • Entailment corpora, more than provided by RTE
     – Logical inference
        • Using background knowledge

*Also used by our submission
                       Best Performers

• Another top performer, Tatu (72%), focused mainly on these
  techniques:
   – Lexical relationships, using WordNet
   – Anaphora resolution
   – Logical inference
      • Using background knowledge


• One good performance came out of LSA, Latent Semantic
  Analysis
   – A 67% score came out of using LSA (top 4 performer)
   – Only 3 teams used LSA; the other 2 scored low (58%, 55%)
                      List of Techniques Used

• Lexical similarity, using a dictionary/thesaurus source
    – WordNet, DIRT, and MSOffice dictionary used
• n-gram, word similarity (also “bag of words”)
• Syntactic matching and aligning
• Semantic role labeling
    – FrameNet, PropBank, VerbNet
• Corpus (web-based) statistics
    – LSA – Latent Semantic Analysis
• Machine Learning Classification
    – ANNs (neural networks), HMMs, SVMs (support vector machines)
• Anaphora resolution
• Entailment corpora, background knowledge
• Logical Inference
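
To make the "bag of words" item concrete, here is a minimal sketch of a word-overlap score, run on two example pairs from the earlier slides. It is a generic baseline for illustration only, not the method of any team listed here; the names `tokens` and `hypothesis_coverage` are made up for this sketch.

```python
import re


def tokens(sentence):
    """Lowercased set of word tokens (a deliberately crude tokenizer)."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def hypothesis_coverage(text, hypothesis):
    """Fraction of hypothesis words that also appear in the text."""
    t, h = tokens(text), tokens(hypothesis)
    if not h:
        return 0.0
    return len(h & t) / len(h)


# QA example, where the hypothesis is entailed by the text.
qa_text = "Israel's prime Minister, Ariel Sharon, visited Prague."
qa_hyp = "Ariel Sharon is the Israeli Prime Minister."

# IE example annotated NO in the development set.
ie_text = ("Blue Mountain Lumber is a subsidiary of Malaysian forestry "
           "transnational corporation, Ernslaw One.")
ie_hyp = "Blue Mountain Lumber owns Ernslaw One."

qa_score = hypothesis_coverage(qa_text, qa_hyp)
ie_score = hypothesis_coverage(ie_text, ie_hyp)
print(f"QA pair: {qa_score:.3f}")
print(f"IE pair: {ie_score:.3f}")
```

The NO pair covers 5 of 6 hypothesis words while the entailed QA pair covers only 4 of 7, which is one reason word overlap is usually treated as a single feature among several and not as a classifier on its own.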
              Logical Inference Techniques Used

• SNePS should be here!
• Extended WordNet or WordNet 3.0
    – Expresses word relationships in logic rules
• DIRT – a paraphrase database of world knowledge
    – Expresses equivalent paraphrases in terms of rules
    – e.g. X kills Y ↔ X attacks Y
    – note: this rule did not contain (“and Y dies”)
• FrameNet
    – Uses a Frame to express a relationship between “objects” in a script along with
      other “objects”, like roles, situations, events
• Use specifically developed semantic inference modules
• Oddly, no one used OpenCyc
        New Technique for our RTE 4 Submission

• Latent Semantic Analysis – LSA

• LSI (Latent Semantic Indexing) technique developed back in 1988, addressing search indexing

• LSA improved upon LSI in the 1990s, applied to summary and evaluation

• Important for us because the result can be expressed as a metric or a feature
  vector, which fits right into the RTE Tool

• Helps overcome the “poverty of the stimulus” problem, by
  “accommodating a very large number of local co-occurrence relations
  simultaneously” [Landauer, T. K., Foltz, P. W., & Laham, D. 1998]
                             How LSA Works

• The process includes
    – Setting up a matrix of words to words or words to documents
    – Performing a Singular Value Decomposition (SVD) on that matrix
    – Reducing the three resulting matrices by keeping only the largest singular values
      and dropping the rows and columns that go with the zeroed ones
    – We then reconstruct an approximation of the original matrix, which essentially
      relates words (cells) that had not been directly related to each other initially,
      and redistributes the correlation between them
    – Then, depending on what relationship one is trying to find, we can extract the
      feature vectors we wish to compare and calculate the cosine between them
    – Uh huh, so what does this all mean…


• Let’s look at an oversimplified example
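
Before the example, the steps above can be written compactly in the usual SVD notation. The symbols are assumptions introduced here for illustration: A is the word matrix, k is the number of singular values kept, and a, b are two rows (or columns) of the reduced matrix being compared.

```latex
A = U \Sigma V^{\top}
\qquad\longrightarrow\qquad
A_k = U_k \Sigma_k V_k^{\top} \approx A,
\qquad
\cos(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
```

Here U_k, Σ_k, and V_k keep only the parts of U, Σ, and V that belong to the k largest singular values.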
          LSA - Oversimplified Example – part 1

• We have two documents: D1 is about dogs, D2 cats

• D1 contains the words “dog” “pet” “leash” “walk” “bark”

• D2 contains the words “cat” “roam” “jump” “purr” “pet”

• At this level, we may not know if any of these words are related, especially
  if we have many documents and many words

• But, we can see that the word “pet” is in both documents

• This “may” imply that there is a relationship between some words in D1
  and D2, simply because “pet” occurred in both
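
A minimal sketch of this toy example in plain Python. It assumes a word-by-document count matrix and keeps only one singular dimension, found by power iteration so that no matrix library is needed; the function names are made up for this sketch, and a real LSA run would use far more documents and keep many more dimensions.

```python
import math

docs = {
    "D1": ["dog", "pet", "leash", "walk", "bark"],
    "D2": ["cat", "roam", "jump", "purr", "pet"],
}
doc_ids = list(docs)
vocab = sorted({w for words in docs.values() for w in words})

# Word-by-document count matrix: one row per word, one column per document.
A = [[docs[d].count(w) for d in doc_ids] for w in vocab]


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def top_right_singular_vector(matrix, iterations=200):
    """Approximate the leading right singular vector by power iteration on A^T A.

    Starts from the first unit vector, which is enough for this toy matrix.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0] + [0.0] * (n_cols - 1)
    for _ in range(iterations):
        av = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
        w = [sum(matrix[i][j] * av[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


v1 = top_right_singular_vector(A)

# Rank-1 reconstruction: project every word row onto v1 and map it back.
A1 = [[sum(a * b for a, b in zip(row, v1)) * x for x in v1] for row in A]

dog, cat = vocab.index("dog"), vocab.index("cat")
print("dog vs cat before reduction:", cosine(A[dog], A[cat]))
print("dog vs cat after reduction: ", round(cosine(A1[dog], A1[cat]), 6))
```

Before the reduction "dog" and "cat" share no document, so their cosine is 0; afterwards it is approximately 1, because the shared word "pet" ties the two documents together. With only two documents and one kept dimension every word collapses onto the same direction, so this shows the mechanism only, not a realistic similarity.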
                     LSA – How I Plan to Apply

• I will be creating two matrices
    – One matrix will contain data ONLY from entailed sentence pairs
    – Second matrix will be for non-entailed pairs


• Each matrix vector will contain word to word comparisons
    – Each row will contain a word that has been used in successful entailment
    – Each column will contain the passage to be entailed from
    – SVD performed on each, reduced, then combined again


• Now, to determine if a new pair is entailed
    –   We calculate the feature vector associated with each word and each matrix
    –   We then calculate the cosine between each word/matrix vector
    –   Then “combine” the cosines for each vector set (vector set from each matrix)
    –   Apply a linear discriminant function to classify entailment (from the RTE Tool of
        RTE3)
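
A rough sketch of the last three steps, under loudly stated assumptions: the slides do not say how the cosines are "combined" (the mean is used here as a placeholder), the word vectors are made-up numbers, and the discriminant weights are hypothetical and not taken from the RTE Tool.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def combine(cosines):
    """Collapse a set of cosines into one feature; using the mean is an assumption."""
    return sum(cosines) / len(cosines) if cosines else 0.0


def classify(features, weights, bias):
    """Linear discriminant: answer YES when w . x + b is positive."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return "YES" if score > 0 else "NO"


# Made-up (text-word vector, hypothesis-word vector) tuples for one new pair,
# as they might come out of the entailed and the non-entailed matrix.
entailed_space = [([0.9, 0.1], [0.8, 0.2]), ([0.5, 0.5], [0.6, 0.4])]
non_entailed_space = [([0.9, 0.1], [0.1, 0.9]), ([0.5, 0.5], [0.2, 0.8])]

features = [
    combine([cosine(t, h) for t, h in entailed_space]),
    combine([cosine(t, h) for t, h in non_entailed_space]),
]

# Hypothetical weights; in practice they would be fit on the annotated pairs.
label = classify(features, weights=(1.0, -1.0), bias=0.0)
print([round(f, 3) for f in features], label)
```

The point of the two-matrix design shows up in the weights: similarity in the entailed space pushes toward YES, and similarity in the non-entailed space pushes toward NO.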
                              LSA – Progress

• Developing LSA in ACL (Allegro Common Lisp)

• Using Matrix package from http://matlisp.sourceforge.net/

• Benefits of using LSA
    – No need to program rules or compare sentence structures
    – Mimics human performance [Landauer, Foltz, & Laham 1998]
    – No need to consider all information, since correlations can be created between
      words even if a specific word has not been seen before


• Drawbacks of using LSA
    – I need a lot more data, unsure how much (I may be able to calculate later)
    – Linear algebra is complicated for my small symbolic brain
    – I’m at the mercy of the literature, though I made some innovation in LSA use
                 RTE Challenge - Final Notes
• See the continued progress at:
http://www.cse.buffalo.edu/~ss424/rte3_challenge.html

• RTE Web Site:
http://www.pascal-network.org/Challenges/RTE3/

• Textual Entailment resource pool:
http://aclweb.org/aclwiki/index.php?title=Textual_Entailment_Resource_Pool

• Actual ranking released in June 2007 at:
http://acl.ldc.upenn.edu/W/W07/W07-14.pdf



November 15,                  SNeRG Meeting                       Scott

						