Semantic eScience

									Foundations VI: Provenance



   Deborah McGuinness and Peter Fox
            CSCI-6962-01
     Week 12, November 30, 2009



                             References
•   PML: McGuinness, Ding, Pinheiro da Silva, Chang. PML 2: A Modular
    Explanation Interlingua. AAAI 2007 Workshop on Explanation-aware Computing,
    Vancouver, Canada, July 2007. Stanford Tech Report KSL-07-07.
    http://www.ksl.stanford.edu/KSL_Abstracts/KSL-07-07.html
•   Inference Web: McGuinness and Pinheiro da Silva. Explaining Answers from the
    Semantic Web: The Inference Web Approach. Web Semantics: Science,
    Services and Agents on the World Wide Web, Special Issue: International
    Semantic Web Conference 2003, edited by K. Sycara and J. Mylopoulos. Volume
    1, Issue 4, published Fall 2004.
    http://www.ksl.stanford.edu/KSL_Abstracts/KSL-04-03.html
•   McGuinness, D.L.; Zeng, H.; Pinheiro da Silva, P.; Ding, L.; Narayanan, D.;
    Bhaowal, M. Investigations into Trust for Collaborative Information Repositories:
    A Wikipedia Case Study. The Workshop on the Models of Trust for the Web
    (MTW'06), Edinburgh, Scotland, May 22, 2006.
    http://www.ksl.stanford.edu/KSL_Abstracts/KSL-06-05.html

•   More from http://inference-web.org/wiki/Publications
                    Semantic Web Methodology and
                   Technology Development Process
            •   Establish and improve a well-defined methodology vision for
                Semantic Technology based application development
            •   Leverage controlled vocabularies, etc.


            [Figure: development process cycle. Labels: Use Case; Small Team,
            mixed skills; Analysis; Use Tools; Develop model/ontology;
            Science/Expert Review & Iteration; Adopt Technology Approach;
            Leverage Technology Infrastructure; Rapid Prototype; Open World:
            Evolve, Iterate, Redesign, Redeploy; Evaluation]
    Ingest/pipelines: problem definition
•   Data is coming in faster, in greater volumes and outstripping our ability to perform
    adequate quality control

•   Data is being used in new ways and we frequently do not have sufficient
    information on what happened to the data along the processing stages to
    determine if it is suitable for a use we did not envision

•   We often fail to capture, represent and propagate manually generated
    information that needs to go with the data flows

•   Each time we develop a new instrument, we develop a new data ingest
    procedure and collect different metadata and organize it differently. It is then hard
    to use with previous projects

•   The task of event determination and feature classification is onerous and we
    don't do it until after we get the data




20080602 Fox VSTO et al.
                Use cases
• Who (person or program) added the comments
  to the science data file for the best vignetted,
  rectangular polarization brightness image from
  January 26, 2005, 1849:09 UT, taken by the
  ACOS Mark IV polarimeter?
• What was the cloud cover and atmospheric
  seeing conditions during the local morning of
  January 26, 2005 at MLSO?
• Find all good images on March 21, 2008.
• Why are the quick look images from March 21,
  2008, 1900UT missing?
• Why does this image look bad?
               Provenance
• Origin or source from which something
  comes, intention for use, who or what it was
  generated for, manner of manufacture, history
  of subsequent owners, sense of place and time
  of manufacture, production or discovery,
  documented in detail sufficient to allow
  reproducibility
• Knowledge provenance; enrich with
  ontologies and ontology-aware tools

    Semantic Technology Foundations
• PML – Proof Markup Language – used as the
  knowledge provenance interlingua
• Inference Web Toolkit – used to manipulate and
  access knowledge provenance
• OWL-DL ontologies (including SWEET and VSTO
  ontologies)




        Inference Web Explanation Architecture

  [Figure: Inference Web explanation architecture]
  • Left of figure: systems whose output is explained
          – SDS (OWL-S/BPEL): trace of web service discovery
          – Learners (*): learning conclusions
          – JTP/CWM (KIF/N3): theorem prover/rules
          – SPARK (SPARK-L): trace of task execution
          – UIMA (text analytics): trace of information extraction
  • Center of figure: WWW and Proof Markup Language (PML), covering
    trust, justification and provenance
  • Right of figure: Toolkit
          – IWTrust: trust computation
          – IW Explainer/Abstractor: end-user friendly visualization
          – IWBrowser: expert friendly visualization
          – IWSearch: search engine based publishing
          – IWBase: provenance registration

  • Semantic Web based infrastructure
  • PML is an explanation interlingua
          – Represent knowledge provenance (who, where, when…)
          – Represent justifications and workflow traces across system boundaries
  • Inference Web provides a toolkit for data management and
    visualization
Global View and More
[Figure: Views of Explanation. An explanation (in PML) is surrounded by its
views: filtered, focused, global, abstraction, discourse, trust, provenance]

                •      Explanation as a graph
                •      Customizable browser
                       options
                        –   Proof style
                        –   Sentence format
                        –   Lens magnitude
                        –   Lens width

                •      More information
                        –   Provenance metadata
                        –   Source PML
                        –   Proof statistics
                        –   Variable bindings
                 Provenance View
• Source metadata: name, description, …
• Source-Usage metadata: which fragment of
  a source has been used, and when
              Trust View
• (Preliminary) simple trust representation
• Provides colored (mouseable) view based on trust values
• Enables sharing and collaborative computation and
  propagation of trust values
[Figure: Trust Tab with a detailed trust explanation; fragment colored by
trust value]
                   Discourse View
•   (Limited) natural language interface
•   Mixed initiative dialogue
•   Exemplified in CALO domain
•   Explains task execution component
    powered by learned and human-generated
    procedures
Selected IW and PML Applications
• Portable proofs across reasoners: JTP (with temporal and
  context reasoners) (Stanford), CWM (W3C), SNARK (SRI),
  …
• Explaining web service composition and discovery (SNRC)
• Explaining information extraction (more emphasis on
  provenance – KANI, UIMA)
• Explaining intelligence analysts’ tools (NIMD/KANI)
• Explaining task processing (SPARK / CALO)
• Explaining learned procedures (TAILOR, LAPDOG /
  CALO)
• Explaining privacy policy law validation (TAMI)
• Explaining decision making and machine learning (GILA)
• Explaining trust in social collaborative networks (TrustTab)
• Registered knowledge provenance: IW Registrar
  (Explainable Knowledge Aggregation)
• Explaining natural science provenance – VSTO, SPCDIS,
  …
                PML1 vs. PML2
• PML1 was introduced in 2002
  – It has been used in multiple contexts ranging from
    explaining theorem provers to text analytics to machine
    learning.
  – It was specified as a single ontology

• PML2 improves PML1 by
  – Adopting a modular design: splitting the original ontology
    into three pieces: provenance, justification, and trust
      • This improves reusability, particularly for applications
        that only need certain explanation aspects, such as
        provenance or trust.
  – Enhancing explanation vocabulary and structure
      • Adding new concepts, e.g. information
      • Refining explanation structure
     PML Provenance Ontology
• Scope: annotating
  provenance metadata
• Highlights
  – Information
  – Source Hierarchy
  – Source Usage
   Referencing, Encoding and
 Annotating a Piece of Information
• Referencing a piece of information
   – using a URI
• Encoding the content of information
   – Complete quote:
     <hasRawString>(type TonysSpecialty SHELLFISH)</hasRawString>
   – Obtained from URL:
     <hasURL>http://inference-web.org/ksl/registry/storage/documents/tonys_fact.kif</hasURL>
• Annotations
   – For human consumption:
     <hasPrettyString>Tony's Specialty is ShellFish</hasPrettyString>
   – For machine consumption:
      • Language:
         <hasLanguage rdf:resource="http://inference-web.org/registry/LG/KIF.owl#KIF" />
      • Format:
         <hasFormat rdf:resource="http://inference-web.org//registry/FM/PDF.owl#PDF" />
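The properties on this slide can be assembled into one description of a piece of information. The following is a minimal sketch in Python using only the standard library. The namespace URI bound to `pmlp`, the enclosing `Information` element, and the helper name `build_information` are assumptions made for illustration; the slide itself shows only the unprefixed property names.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs (not shown on the slide).
PMLP = "http://inference-web.org/2.0/pml-provenance.owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def build_information(raw_string, pretty_string, language_uri, format_uri=None):
    """Hypothetical helper: build one Information element from slide properties."""
    ET.register_namespace("pmlp", PMLP)
    ET.register_namespace("rdf", RDF)
    info = ET.Element(f"{{{PMLP}}}Information")
    # Encoding the content: the complete quote.
    ET.SubElement(info, f"{{{PMLP}}}hasRawString").text = raw_string
    # Annotation for human consumption.
    ET.SubElement(info, f"{{{PMLP}}}hasPrettyString").text = pretty_string
    # Annotations for machine consumption.
    ET.SubElement(info, f"{{{PMLP}}}hasLanguage", {f"{{{RDF}}}resource": language_uri})
    if format_uri is not None:
        ET.SubElement(info, f"{{{PMLP}}}hasFormat", {f"{{{RDF}}}resource": format_uri})
    return info


if __name__ == "__main__":
    node = build_information(
        "(type TonysSpecialty SHELLFISH)",
        "Tony's Specialty is ShellFish",
        "http://inference-web.org/registry/LG/KIF.owl#KIF",
    )
    print(ET.tostring(node, encoding="unicode"))
```

The format annotation is optional in this sketch so that a KIF sentence is not mislabeled with a document format.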
              Source Hierarchy
• Source is the container of information
• Our source hierarchy offers
  – Many well-known sources such as
     • Sensor (e.g. geo-science)
     • InferenceEngine (e.g. reasoner)
     • WebService (e.g. workflow)
  – Finer granularity of source than just document
     • DocumentFragment (for text analytics)
                    Source Usage
• Source Usage
  – logs the action that accesses a source at a
    certain dateTime to retrieve information
  – is part of PML1
• Example: Source #ST was accessed on a
  certain date
 <pmlp:SourceUsage rdf:about="#usage1">
   <pmlp:hasUsageDateTime>2005-10-17T10:30:00Z</pmlp:hasUsageDateTime>
   <pmlp:hasSource rdf:resource="#ST"/>
 </pmlp:SourceUsage>
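A consumer of this record needs the source and the access time. The sketch below reads the fragment above with Python's standard library. It assumes the fragment is wrapped in an `rdf:RDF` element and that `pmlp` is bound to the namespace URI shown in the code; both are assumptions, and `read_source_usages` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Assumed namespace URIs (the slide shows only the prefixes).
PMLP = "http://inference-web.org/2.0/pml-provenance.owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The slide's fragment, wrapped in an assumed rdf:RDF root.
SNIPPET = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:pmlp="{PMLP}">
 <pmlp:SourceUsage rdf:about="#usage1">
   <pmlp:hasUsageDateTime>2005-10-17T10:30:00Z</pmlp:hasUsageDateTime>
   <pmlp:hasSource rdf:resource="#ST"/>
 </pmlp:SourceUsage>
</rdf:RDF>"""


def read_source_usages(xml_text):
    """Return one dict per SourceUsage: its id, its source and its UTC time."""
    root = ET.fromstring(xml_text)
    usages = []
    for usage in root.iter(f"{{{PMLP}}}SourceUsage"):
        when_text = usage.findtext(f"{{{PMLP}}}hasUsageDateTime")
        source = usage.find(f"{{{PMLP}}}hasSource")
        # Only the "Z" (UTC) form used on the slide is handled here.
        when = datetime.strptime(when_text, "%Y-%m-%dT%H:%M:%SZ")
        usages.append({
            "id": usage.get(f"{{{RDF}}}about"),
            "source": source.get(f"{{{RDF}}}resource"),
            "when": when.replace(tzinfo=timezone.utc),
        })
    return usages


if __name__ == "__main__":
    for record in read_source_usages(SNIPPET):
        print(record["id"], record["source"], record["when"].isoformat())
```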
       PML Justification Ontology
• Scope: annotating
  justification process
• Highlights
   – Template for question-
     answer/justification
   – Four types of justification
       Four Types of Justification
Goal               conclusion without justification

Assumption         conclusion assumed (using
                   Assumption Rule) asserted by an
                   InferenceEngine, no antecedent

Direct Assertion   conclusion directly asserted (using
                   DirectAssertion rule) by an
                   InferenceEngine, no antecedent

Regular            conclusion derived from antecedent
                   conclusions
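The four types can be told apart from two facts about a conclusion: whether it has a justification at all, and whether that justification has antecedents. The sketch below encodes that reading in Python. The classes `Node` and `InferenceStep` and the function `justification_type` are hypothetical names for illustration, not the PML classes themselves; the rule names are taken from the table above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InferenceStep:
    """Hypothetical stand-in for one justification of a conclusion."""
    rule: str
    antecedents: List[str] = field(default_factory=list)


@dataclass
class Node:
    """Hypothetical stand-in for a conclusion and its optional justification."""
    conclusion: str
    step: Optional[InferenceStep] = None


def justification_type(node: Node) -> str:
    """Classify a node into one of the four types in the table."""
    if node.step is None:
        return "Goal"  # conclusion without justification
    if node.step.antecedents:
        return "Regular"  # derived from antecedent conclusions
    if node.step.rule == "Assumption":
        return "Assumption"
    if node.step.rule == "DirectAssertion":
        return "Direct Assertion"
    # The table does not cover other antecedent-free rules.
    raise ValueError(f"unclassified rule without antecedents: {node.step.rule}")


if __name__ == "__main__":
    fact = Node("(type TonysSpecialty SHELLFISH)", InferenceStep("DirectAssertion"))
    print(justification_type(fact))
```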
             PML Trust Ontology
• Scope: annotate trust and
  belief assertions
• Highlights
  – Extensible trust representation
    (user may plug in their
    quantitative metrics using OWL
    class inheritance feature)
  – Has been used to provide a
    trust tab filter for Wikipedia;
    see McGuinness, Zeng, Pinheiro da
    Silva, Ding, Narayanan, and Bhaowal.
    Investigations into Trust for Collaborative
    Information Repositories: A Wikipedia
    Case Study. WWW2006 Workshop on the
    Models of Trust for the Web (MTW'06),
    Edinburgh, Scotland, May 22, 2006.
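The extensibility point can be pictured with ordinary class inheritance. The sketch below is only an analogy in Python, not the OWL ontology: a base class fixes the interface, and a user-defined subclass supplies its own quantitative metric. Every name here (`TrustValue`, `FloatTrustValue`, `VoteRatioTrust`, `colour_for`) and the colour thresholds are hypothetical.

```python
from abc import ABC, abstractmethod


class TrustValue(ABC):
    """Hypothetical base class: any trust metric must yield a score in [0, 1]."""

    @abstractmethod
    def as_float(self) -> float:
        ...


class FloatTrustValue(TrustValue):
    """A plain numeric trust value."""

    def __init__(self, value: float):
        if not 0.0 <= value <= 1.0:
            raise ValueError("trust value must lie in [0, 1]")
        self.value = value

    def as_float(self) -> float:
        return self.value


class VoteRatioTrust(TrustValue):
    """Hypothetical plug-in metric: share of positive votes."""

    def __init__(self, positive: int, total: int):
        if total <= 0 or not 0 <= positive <= total:
            raise ValueError("need 0 <= positive <= total and total > 0")
        self.positive = positive
        self.total = total

    def as_float(self) -> float:
        return self.positive / self.total


def colour_for(trust: TrustValue) -> str:
    """Map any trust metric to a display colour (thresholds are assumptions)."""
    score = trust.as_float()
    if score < 1 / 3:
        return "red"
    if score < 2 / 3:
        return "yellow"
    return "green"


if __name__ == "__main__":
    print(colour_for(FloatTrustValue(0.9)), colour_for(VoteRatioTrust(1, 4)))
```

Because `colour_for` depends only on the base class, a new metric can be added without changing the display code, which is the property the slide attributes to the extensible representation.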
Quick look browse




Visual browse




   Search and structured query




Search                    Structured
                          Query




Search




                   Next week
• Next class
  – Architecture and Middleware
• Questions?





								