Knowledge Representation for Question Answering


       Deborah L. McGuinness

    Knowledge Systems Laboratory
         Stanford University
Overview
There are many ways to potentially improve question answering systems
      Partial KR&R impact spectrum:
      • Full-fledged KR (KB for content; KR tools for evolution and
        maintenance, e.g., HPKB, RKF, …)
      • “Lighter” KR
        - Markup of source information (e.g., DAML, OWL, …)
        - Query expansion using ontologies (e.g., FindUR,
          enhanced TAP, …)
        - Structured query (and answer pattern) language

      Take-home message: KR can be applied to many areas of the
      question answering task, can be used incrementally in
      partnership with other areas, and is ready for prime-time use.


                                   2                                   3/25/03
Some KR Options for Question Answering

Content-oriented processing
    - Header info
    - Hot links (smart tags, Sentius, …)
    - Term meaning markup in text
    - Ontological support for term defns*
    - KB of content (and assoc tools)*

Query(Answer)-oriented processing
    - Links for query terms
    - Query expansion*
    - Class-based answer presentation (TAP, pruning (CLASSIC), …)
    - KR for queries*
Mainstream KR: A few example programs

• DARPA Rapid Knowledge Formation (RKF)
    Goal: allow distributed teams of subject matter experts to
     quickly and easily build, maintain, and use knowledge bases
     without need for specialized training.
    Stanford Knowledge Systems Lab focus - Creating,
     Maintaining, and Integrating Understandable Knowledge Bases
    Next PI meeting – May 13-15
• DARPA High Performance Knowledge Base
  (HPKB)
    Predecessor to RKF
    Goal: advance the technology of how computers acquire,
     represent and manipulate knowledge
    KSL built tools to build, analyze, manipulate, and store KBs and led
     a large knowledge-building effort for evaluation tests.

Programs cont.
• ARDA’s Advanced Question & Answering for
  Intelligence (AQUAINT)
    Goal – Advance QA against structured and
     unstructured info
    KSL focus – ontology building support and tools
     (diagnostics, evolution, extraction, general reasoning
     (JTP), temporal reasoning, explanation, querying
     (DQL), partitioning, …)
• ARDA’s Novel Intelligence for Massive Data (NIMD)
    Goal – Avoid strategic surprise by helping analysts
     be more effective (focus attention on critical
     information and help
     analyze/prune/refine/explain/reuse/…)

KR&R
• Rich expressive languages for encoding information
  (FOL-based languages)
• Large hand-coded knowledge bases (e.g., HPKB, Cyc
  kb, RKF kb…)
• Semi-automatically generated kbs
• Question answering ranging from lookup and keyword
  retrieval to reasoning from general principles
• Integrated with deep and special purpose reasoners
  (SNARK, JTP, qualitative reasoning, …)
• Extensive environmental support (Chimaera, Shaken,
  Kraken, KA, Inference Web, …)
• Interest from outside: Vulcan, NI, …


             Chimaera: Ontology Environment Tool

An interactive web-based tool aimed at supporting:
    •Ontology analysis (correctness, completeness, style, …)
    •Merging of ontological terms from varied sources
    •Maintaining ontologies over time
    •Validation of input
• Features: multiple I/O languages, loading and merging into multiple
namespaces, collaborative distributed environment support, integrated
browsing/editing environment, extensible diagnostic rule language
• Used in commercial and academic environments, basis of some
commercial re-implementations (Ontobuilder/Ontoserver, …)
• Available as a hosted service from www-ksl-svc.stanford.edu
• Information: www.ksl.stanford.edu/software/chimaera
    Inference Web (w/Pinheiro da Silva)
Motivated by trust and reuse needs, IW provides a solution for
  explaining reasoning/retrieval tasks by storing, exchanging,
  combining, annotating, filtering, segmenting, comparing and
  rendering proofs and proof fragments provided by reasoners.
        Portable proof specification as an interlingua for proof
         interchange
        Proof browser for displaying IW proofs (possibly from multiple
         retrieval/inference engines) and for supporting follow-up
         questions
        Registry agents to record information used in proofs (e.g.,
         sources, provenance information, reasoners, rules, etc.)
• Used with JTP, DQL Server, Wine agent, …; ready for external users.
    http://www.ksl.stanford.edu/software/iw/
Moving to lighter options




Markup additions to content

Some types of information encoded in markup
   Provenance – author, date, source, authoritativeness ranking,
    subjective index, …
   Topic tags – author meta tags: content, keyword, …; third
    party topic tags: Yahoo categories, …
   Structural tags – title, author, …
   Type tags – using controlled vocabularies/ontologies (type =
    person, author, …)
   Property tags – hasEducationalDegree, hasEmailAddress, …


  Could use XML, RDF extensions such as DAML+OIL, OWL, …
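
The tag types listed above can be pictured as subject/predicate/object statements attached to a page. The sketch below is only an illustration in Python rather than in RDF or OWL syntax; the page URL, the `ex:` names, and the person are all hypothetical, and only the two property names come from the slide.

```python
# Hypothetical page and vocabulary; these names are not from a real schema.
PAGE = "http://example.org/people/jane"

markup = [
    # Provenance: author and date of the page
    (PAGE, "ex:author", "Jane Doe"),
    (PAGE, "ex:date", "2003-03-25"),
    # Topic tag supplied by the author
    (PAGE, "ex:topic", "knowledge representation"),
    # Type tag drawn from a controlled vocabulary
    ("ex:JaneDoe", "rdf:type", "ex:Person"),
    # Property tags (property names taken from the slide)
    ("ex:JaneDoe", "hasEducationalDegree", "PhD"),
    ("ex:JaneDoe", "hasEmailAddress", "jane@example.org"),
]


def values(triples, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(values(markup, "ex:JaneDoe", "hasEmailAddress"))
```

A question answering system could consult such statements directly, for example to answer "what is this person's email address?" without parsing free text.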

OWL: W3C’s WebOnt’s Markup Language

[Diagram: OWL language lineage]
   Web Languages: RDF/S, XML
   Formal Foundations: Frame Systems; Description Logics
   (FACT, CLASSIC, DLP, …)
   Lineage: DAML-ONT and OIL → DAML+OIL → OWL




                       Ontology Spectrum

[Diagram: spectrum from simple to expressive]
   1. Catalog/ID
   2. Terms/glossary
   3. Thesauri (“narrower term” relation)
   4. Informal is-a
   5. Formal is-a
   6. Formal instance
   7. Frames (properties)
   8. Value Restrs.
   9. Disjointness, Inverse, part-of…
   10. General Logical constraints

Markup such as DAML+OIL, OWL can be used to encode the spectrum

Source: AAAI 1999 - Ontologies Panel
OWL Sublanguages
• OWL Lite supports users primarily needing a classification hierarchy
  and simple constraint features. (For example, while it supports
  cardinality constraints, it only permits cardinality values of 0 or 1. It
  should be simpler to provide tool support for OWL Lite than for its more
  expressive relatives, and OWL Lite provides a quick migration path for
  thesauri and other taxonomies.)
• OWL DL supports users who need maximum expressiveness while
  their reasoning systems maintain computational completeness (all
  conclusions are guaranteed to be computed) and decidability (all
  computations will finish in finite time). OWL DL includes all OWL
  language constructs, but they can be used only under certain
  restrictions (for example, while a class may be a subclass of many
  classes, a class cannot be an instance of another class). OWL DL is
  named for its correspondence with description logics.
• OWL Full supports users who want maximum expressiveness and
  the syntactic freedom of RDF with no computational guarantees. For
  example, in OWL Full a class can be treated simultaneously as a
  collection of individuals and as an individual in its own right. OWL Full
  allows an ontology to augment the meaning of the pre-defined (RDF
  or OWL) vocabulary. It is unlikely that any complete and efficient
  reasoner will be able to support every feature of OWL Full.
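
As a rough illustration of how the three sublanguages differ, the sketch below classifies a toy ontology using only the two distinctions named on this slide: cardinality values other than 0 or 1, and classes used as instances. It is a deliberate simplification; real OWL species checking involves many more conditions, and the function name and input shapes are assumptions made for this example.

```python
def smallest_sublanguage(classes, type_assertions, cardinalities):
    """Pick the least expressive sublanguage that fits, using two checks only.

    classes:         set of class names
    type_assertions: iterable of (individual, class) pairs
    cardinalities:   iterable of (class, property, n) triples
    """
    # A class treated as an instance of another class needs OWL Full.
    if any(individual in classes for individual, _ in type_assertions):
        return "OWL Full"
    # OWL Lite only permits cardinality values of 0 or 1.
    if any(n not in (0, 1) for _, _, n in cardinalities):
        return "OWL DL"
    return "OWL Lite"


classes = {"Wine", "Grape", "Species"}
print(smallest_sublanguage(classes, [("w1", "Wine")], [("Wine", "hasMaker", 1)]))
print(smallest_sublanguage(classes, [("w1", "Wine")], [("Wine", "madeFrom", 3)]))
print(smallest_sublanguage(classes, [("Grape", "Species")], []))
```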
One option with simple taxonomies:
Query Expansion

Under some conditions, free text queries
 may be adequate with small
 enhancements.

Consider FindUR’s conditions:
 - web pages with few words
 - constrained domain
 - unconstrained query interface
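
A minimal sketch of taxonomy-based query expansion, assuming a small hand-written taxonomy: each query word is replaced by a disjunction of itself and its narrower terms. This is not FindUR's implementation (which used Verity topic sets and CLASSIC); the taxonomy contents and function names are hypothetical.

```python
# Hypothetical taxonomy: each term maps to its directly narrower terms.
TAXONOMY = {
    "doctor": ["physician", "pediatrician", "surgeon"],
    "physician": ["internist"],
}


def narrower_terms(term, taxonomy):
    """Return the term plus all transitively narrower terms, no duplicates."""
    seen, stack = [], [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(taxonomy.get(current, []))
    return seen


def expand_query(query, taxonomy):
    """Rewrite a free-text query as an AND of OR-groups of related terms."""
    clauses = []
    for word in query.lower().split():
        terms = narrower_terms(word, taxonomy)
        if len(terms) > 1:
            clauses.append("(" + " OR ".join(terms) + ")")
        else:
            clauses.append(word)
    return " AND ".join(clauses)


print(expand_query("Westfield doctor", TAXONOMY))
```

On pages with few words, a page that mentions only "pediatrician" would be missed by a literal search for "doctor"; the expanded query retrieves it.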


      FindUR Architecture

Content to Search:
 - Research Site
 - Technical Memorandum
 - Calendars (Summit 2005, Research)
 - Yellow Pages (Directory Westfield)
 - Newspapers (Leader)
 - Internal Sites (Rapid Prototyping)
 - AT&T Solutions
 - Worldnet Customer Care

[Diagram components]
 - Content (Web Pages or Databases)
 - Content Classification
 - Domain Knowledge: CLASSIC Knowledge Representation System

Search Technology:
 - Search Engine: Verity (and topic sets)

User Interface:
 - GUI supporting browsing and selection
 - Collaborative Topic Set Tool
 - Results (standard format) and Results (domain specific)
 - Verity SearchScript, Javascript, HTML, CGI, CLASSIC

One option with simple KBs:
TAP Activity-based search (w/McCool, Guha, Fikes)

Under some conditions, free text queries may
  benefit from contextual, dynamic additions to
  answers

Augment standard (Google) retrieval with
  information based on the type of the search term, if
  recognized
  - properties related to the concept (similar to
  the Jeeves follow-up yesterday)
  - retrieve data from known sources
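
The idea can be sketched as follows: run the ordinary text search, then, if the search term is recognized in a small KB, add a panel of properties appropriate to its type, filled from known sources. This is an illustrative sketch, not TAP's actual API; the lookup tables, the place name, the source name, and the `text_search` stand-in are all hypothetical.

```python
# Hypothetical knowledge: recognized terms, properties per type, known sources.
TYPE_OF = {"springfield": "City"}
PROPERTIES_BY_TYPE = {"City": ["hotels", "transportation", "attractions"]}
KNOWN_SOURCES = {
    ("springfield", "transportation"): "Springfield Transit (hypothetical source)",
}


def text_search(query):
    """Stand-in for a call to a standard search engine."""
    return [f"result for '{query}'"]


def augmented_search(query):
    """Return ordinary hits plus a type-driven side panel when possible."""
    results = {"hits": text_search(query), "sidebar": {}}
    term = query.lower().strip()
    term_type = TYPE_OF.get(term)
    if term_type is None:
        return results  # unrecognized term: plain results only
    for prop in PROPERTIES_BY_TYPE.get(term_type, []):
        # None marks a relevant property with no known source yet.
        results["sidebar"][prop] = KNOWN_SOURCES.get((term, prop))
    return results


print(augmented_search("Springfield")["sidebar"])
```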


[Screenshot: standard search results] Note gardens, ferry, transportation.
User might need to work to find common info about location, …

[Screenshot: TAP-augmented results] Activity-based info on right, plus
modified search. More refinement needed, but can help in sense
disambiguation.

Query Language

• Pattern Matching Languages may be used
  to specify portions of information to
  return from structured data sources.
• Can be viewed as pruning languages
  (Asking Queries about Frames – KR ’96)
• Can be viewed as query-answering
  dialogues (DQL 2003)
• Use a formal language to specify
  semantic relationships between queries,
  query answer, and knowledge base
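
To make the pattern-matching view concrete, here is a minimal sketch of a conjunctive triple-pattern matcher: a query is a list of patterns, terms beginning with `?` are variables, and each answer is a set of bindings that makes every pattern match a fact in the KB. It performs lookup only, with no inference, and is not the DQL specification; the function names and sample facts are assumptions for illustration.

```python
def match(pattern, triple, bindings):
    """Extend bindings so pattern equals triple, or return None on a clash."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Bind the variable, or check it against its existing binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def answer(query_patterns, kb):
    """Return every binding set that satisfies all patterns against kb."""
    solutions = [{}]
    for pattern in query_patterns:
        extended = []
        for bindings in solutions:
            for triple in kb:
                result = match(pattern, triple, bindings)
                if result is not None:
                    extended.append(result)
        solutions = extended
    return solutions


kb = [("W1", "COLOR", "WHITE"), ("W2", "COLOR", "RED"), ("W1", "BODY", "LIGHT")]
print(answer([("?w", "COLOR", "WHITE"), ("?w", "BODY", "?b")], kb))
```

The query pattern doubles as an answer pattern: only the bound variables are returned, which is one sense in which such languages prune what comes back from a structured source.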
DQL Example
Example taken from DQL Demo using JTP and the Wines KB. Given:

rdfs:subClassOf tkb:SEAFOOD-COURSE tkb:MEAL-COURSE
rdfs:subClassOf tkb:SEAFOOD-COURSE tkb:DRINK-HAS-WHITE-COLOR-RESTRICTION

Assuming the premise that a seafood course is served, one might ask about properties of the wine
    recommended to be served. In particular, a user might want to know what color wine to serve.

Given the premise:
rdf:type tkb:NEW-COURSE tkb:SEAFOOD-COURSE
tkb:DRINK tkb:NEW-COURSE tkb:W1

And the query:
    tkb:COLOR tkb:W1 ?x

The answer is returned:
Premise
                 rdf:type tkb:NEW-COURSE tkb:SEAFOOD-COURSE
                 tkb:DRINK tkb:NEW-COURSE tkb:W1

Bindings
                  tkb:COLOR tkb:W1 tkb:WHITE

Can use inference web to explain answers
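
The reasoning behind this answer can be reproduced with a small forward-chaining sketch. It is not how JTP works internally. It also assumes a reading of `tkb:DRINK-HAS-WHITE-COLOR-RESTRICTION` that the slide does not spell out: every `tkb:DRINK` filler of an instance of that class has `tkb:COLOR tkb:WHITE`. The `RESTRICTIONS` table and function names are hypothetical; triples are written predicate-first, as on the slide.

```python
SUBCLASS, TYPE = "rdfs:subClassOf", "rdf:type"

kb = {
    (SUBCLASS, "tkb:SEAFOOD-COURSE", "tkb:MEAL-COURSE"),
    (SUBCLASS, "tkb:SEAFOOD-COURSE", "tkb:DRINK-HAS-WHITE-COLOR-RESTRICTION"),
}
premise = {
    (TYPE, "tkb:NEW-COURSE", "tkb:SEAFOOD-COURSE"),
    ("tkb:DRINK", "tkb:NEW-COURSE", "tkb:W1"),
}
# Assumed meaning of the restriction class: (linking property, property, value).
RESTRICTIONS = {
    "tkb:DRINK-HAS-WHITE-COLOR-RESTRICTION": ("tkb:DRINK", "tkb:COLOR", "tkb:WHITE"),
}


def closure(facts):
    """Apply the two rules below until no new facts appear."""
    facts = set(facts)
    while True:
        new = set()
        for rel, x, cls in facts:
            if rel != TYPE:
                continue
            # Rule 1: an instance of a class is an instance of its superclasses.
            for rel2, sub, sup in facts:
                if rel2 == SUBCLASS and sub == cls:
                    new.add((TYPE, x, sup))
            # Rule 2: apply the assumed value restriction to linked fillers.
            if cls in RESTRICTIONS:
                link, prop, value = RESTRICTIONS[cls]
                for rel3, subj, filler in facts:
                    if rel3 == link and subj == x:
                        new.add((prop, filler, value))
        if new <= facts:
            return facts
        facts |= new


def ask(facts, prop, subject):
    """Return all values v such that (prop, subject, v) is derivable."""
    return sorted(v for p, s, v in closure(facts) if p == prop and s == subject)


print(ask(kb | premise, "tkb:COLOR", "tkb:W1"))
```

Recording which rule produced each derived fact would give the kind of proof trace that an explanation component could present.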

Conclusion

KR can be used to add intelligence to question
  answering tasks at many levels:
• Handcrafted KBs and queries can be built
  and maintained with tool assistance
• Lighter weight KR can be used effectively
  exploiting simple taxonomies, limited frame
  information, limited or extensive markup, etc.
• Can be used in combination with other
  approaches (e.g., AQUA here)
• Languages, tools, methodologies are available
  for non-KR experts to use

                                  Discussion
Position Papers:
-Ontologies come of age –
http://www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html

-Description Logics emerge from Ivory Towers
http://www.ksl.stanford.edu/people/dlm/papers/dls-emerge-abstract.html


Languages, Environments, Software:

-OWL - http://www.w3.org/TR/owl-features/ ,        http://www.w3.org/TR/owl-guide/

-Inference Web - http://www.ksl.stanford.edu/software/iw/

-Chimaera - http://www.ksl.stanford.edu/software/chimaera/

-FindUR - http://www.research.att.com/people/~dlm/findur/

-TAP – http://tap.stanford.edu/

-DQL - http://www.ksl.stanford.edu/projects/dql/


