1-introduction.pptx

							                                                                                                                                                                            1/25/2010




COMSM0301: 2009/10 - Lecture 1

    “Introduction to Learning from Structured Data”

    Oliver Ray (oray@cs.bris.ac.uk)
    Department of Computer Science
    University of Bristol
    25th January, 2010


Intelligent Systems

    [Figure: example systems plotted on two axes. REASONING runs from Analytic to Synthetic; REPRESENTATION runs from functional to relational. Labelled points: Ghost in the Shell, Deep Blue, HAL 9000, Robot Scientist.]




State-of-the-Art

    [Figure: the areas ML, DM, PA, O and KR plotted on the same axes (REASONING: Analytic to Synthetic; REPRESENTATION: functional to relational).]


A Hierarchy of Representations

    Along the REPRESENTATION axis, from functional to relational:

        ATTRIBUTE-VALUE LEARNING (AVL)
        MULTIPLE INSTANCE LEARNING (MIL)
        MULTI RELATIONAL LEARNING (MRL)
        INDUCTIVE LOGIC PROGRAMMING (ILP)




Database Perspective

    Along the REPRESENTATION axis, from functional to relational:

        Vector                 ATTRIBUTE-VALUE LEARNING (AVL)
        Spreadsheet            MULTIPLE INSTANCE LEARNING (MIL)
        relational database    MULTI RELATIONAL LEARNING (MRL)
        deductive database     INDUCTIVE LOGIC PROGRAMMING (ILP)


A Hierarchy of Reasoning

    Along the REASONING axis, from Synthetic down to Analytic:

        INDUCTION    Generalisation: From particular cases To general laws
        ABDUCTION    Explanation:    From observed effects To hidden causes
        DEDUCTION    Consequence:    From given knowledge To necessary implications








Syllogistic Perspective

    Along the REASONING axis, from Synthetic down to Analytic:

    INDUCTION
        These beans are white
        These beans are from this bag
        ---------------------------------
        All beans from this bag are white

    ABDUCTION
        All beans from this bag are white
        These beans are white
        ---------------------------------
        These beans are from this bag

    DEDUCTION
        All beans from this bag are white
        These beans are from this bag
        ---------------------------------
        These beans are white


Learning From Structured Data

    [Figure: LSD, ILP, ML, ALP, DM, LP, PA, O and KR plotted on the REASONING (Analytic to Synthetic) and REPRESENTATION (functional to relational) axes.]
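
The three bean syllogisms from the Syllogistic Perspective slide can be sketched in a few lines of Python. This is only an illustration: the function names and the dictionary encoding of a bean are assumptions made here, not part of the lecture. The point it shows is that deduction yields a necessary conclusion, while induction and abduction yield hypotheses that further evidence could overturn.

```python
# Illustrative encoding (an assumption): a bean is a dict with the keys
# "from_bag" (bool) and "colour" (str).

def induce(sample):
    """Induction: from observed cases to a general law.

    Hypothesise "all beans from this bag are white" if every observed
    bean from the bag is white. A later black bean would refute it.
    """
    from_bag = [b for b in sample if b["from_bag"]]
    return bool(from_bag) and all(b["colour"] == "white" for b in from_bag)


def abduce(rule_holds, colour):
    """Abduction: from an observed effect to a hidden cause.

    Given the rule and a white bean, hypothesise that the bean came from
    the bag. This is plausible, not certain: other bags may hold white beans.
    """
    return rule_holds and colour == "white"


def deduce(rule_holds, from_bag):
    """Deduction: from given knowledge to a necessary implication.

    Returns "white" when the rule holds and the bean is from the bag,
    otherwise None (nothing follows).
    """
    return "white" if rule_holds and from_bag else None


sample = [
    {"from_bag": True, "colour": "white"},
    {"from_bag": True, "colour": "white"},
    {"from_bag": True, "colour": "white"},
]
rule = induce(sample)            # hypothesised general law
cause = abduce(rule, "white")    # hypothesised explanation
colour = deduce(rule, True)      # necessary consequence
```

Adding one black bean from the bag to `sample` makes `induce` return `False`, which is the sense in which synthetic inference is defeasible.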




LSD: What?

    Supervised Machine Learning
    for
    Structured Knowledge Representations


LSD: Why?

    Extend the representation power of conventional machine learning systems to support real-world relational data (with internal structure and external relationships)

    Extend the reasoning power of conventional knowledge representation frameworks to support real-world synthetic inference (for uncertain and incomplete knowledge)

    Focus on the key challenges in representation and reasoning overlooked by conventional approaches, which assume that useful features and relevant knowledge are known in advance




LSD: How?

    1  upgrade learner
    2  downgrade representation
    3  hybridise inference

    [Figure: the three numbered steps and LSD marked on the REASONING (Analytic to Synthetic) and REPRESENTATION (functional to relational) axes.]


The Start: Learning as Func. Approx.

    I  Input Space (examples), with element e
    O  Output Space (targets), with element t
    H  Hypothesis Space (models), with element m
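
The function-approximation view (input space I, output space O, hypothesis space H) can be made concrete with a minimal sketch. Everything specific here is an assumption for illustration: the inputs are plain numbers, the targets are booleans, H is a small family of threshold models, and the four training pairs are toy values rather than data from the course.

```python
from typing import Callable, List, Tuple

Example = float                       # an element e of the input space I
Target = bool                         # an element t of the output space O
Model = Callable[[Example], Target]   # an element m of the hypothesis space H


def make_threshold(theta: float) -> Model:
    """One hypothesis: predict True when the input is at least theta."""
    return lambda e: e >= theta


def training_error(m: Model, data: List[Tuple[Example, Target]]) -> int:
    """Number of training pairs (e, t) on which m(e) differs from t."""
    return sum(m(e) != t for e, t in data)


# Toy training pairs (illustrative values only).
data = [(0.5, False), (1.5, False), (2.5, True), (3.5, True)]

# A small, finite hypothesis space: one threshold model per candidate theta.
thresholds = [0.0, 1.0, 2.0, 3.0, 4.0]

# Learning = searching H for the model that best fits the examples.
best_theta = min(
    thresholds,
    key=lambda th: training_error(make_threshold(th), data),
)
best_model = make_threshold(best_theta)
```

Here the search is exhaustive because H is tiny; the choice of H (what counts as a model) is exactly what changes when moving from attribute-value to structured representations.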








The End: Learning with Uncertain Info.

    Probabilistic Logic Learning (PLL)
        (i) entailment    (ii) interpretations    (iii) proofs

    Nonmonotonic Inductive Logic Programming (nmILP)
        (i) inverse entailment    (ii) least generalisation    (iii) inverse resolution


Course Administration

    Lecturer
        Oliver Ray                      MVB 2.xx     oray@cs.bris.ac.uk
    Attendance
        Week 13                         QB 1.69      Mon-Fri  9am - 1pm (4h)
        Week 14                         QB 1.69      Mon-Thur 2pm - 5pm (3h)
    Assessment
        Assignment 1                    20 %         Week 14
        Assignment 2                    30 %         Week 21
        Examination                     50 %         May/June
    Foundations
        Intro. to Machine Learn.        COMS30301
        Artif. Intell. & Logic Prog.    COMS30106




Course Timetable

                          Mon      Tue      Wed      Thu      Fri
                          25/01    26/01    27/01    28/01    29/01
    09:00 – 09:50         lec      lec      lec      lec      lec
    10:00 – 10:45         tut      tut      tut      tut      tut
    11:00 – 11:50         lec      lec      lec      PROJ     lec
    12:00 – 12:30         cwk      cwk      cwk      PROJ     cwk

                          Mon      Tue      Wed      Thu      Fri
                          01/02    02/02    03/02    04/02    05/02
    14:00 – 14:50         lec      lec      LAB      lec      CNS
    15:00 – 15:45         tut      tut      LAB      tut      CNS
    16:00 – 16:30         cwk      cwk      cwk      cwk      CNS


Course Objectives

    Appreciate the key limitations of traditional (attribute-value) Machine Learning methods and understand the practical need to overcome these limitations

    Appreciate the key methods for learning with expressive (relational, logical & probabilistic) representations and understand the trade-offs such methods bring about




Course Reading

    Relational Data Mining
    Inductive Logic Programming: Techniques and Applications
    Foundations of Inductive Logic Programming
    Logical and Relational Learning


Case Study: Mutagenesis

    Mutagenic compounds encourage the mutation of DNA and pose serious health risks which play a key role in drug development

    The mutagenic activity of many compounds is known from in-vitro studies, but it is not practical to test all compounds in this way

    Thus we want predictive models, or Structure Activity Relationships (SARs), that relate mutagenicity to physicochemical properties

    The mutagenesis data set contains 230 Aromatic and Heteroaromatic Nitro compounds, each described by 1 class label and 4 attributes:
        act - mutagenic activity (log TA98 Ames test)
        εLUMO - energy of lowest unoccupied molecular orbital
        logP - hydrophobicity (log octanol/water coefficient)
        Ia - indicator for an acenthrylene
        I1 - indicator for 3 or more fused rings

    The data set is usually split into 188 “regression friendly” compounds and 42 “regression unfriendly” compounds








[Six figure-only slides; only their titles survive:]

    Mutagenesis as Attribute-Value Learning
    Visualisation
    BiLinear Regression
    Mutagenesis as Attribute-Value Learning
    Basic Chemical Structures
    A New Structural Alert








The Bottom Line

    Most real-world data is highly structured: sets, lists, trees, graphs, space, time, and complex relationships within and between objects

    To use attribute-value learners, the structure of the data must be collapsed by extracting a predetermined set of features

    As there are a potentially infinite number of features, the data itself is often the most compact and complete representation

    In reality, identifying relevant features is the key learning problem; and new techniques are needed to learn with structured data


Tutorial 1: Represent This!

    Molecule        Chemical Structure                              Class
    formaldehyde    H2C=O (C bonded to O and to two H)              neg
    methane         CH4 (C bonded to four H)                        pos
    benzene         C6H6 (ring of six C, each bonded to one H)      neg
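
To see what "collapsing" structure into a predetermined set of features looks like, here is a minimal sketch using the three Tutorial 1 molecules. The feature set (counts of C, H and O atoms) and all names in the code are assumptions chosen for illustration; the lecture does not prescribe a particular feature set.

```python
from collections import Counter

# Atoms and class labels of the three Tutorial 1 molecules.
MOLECULES = {
    "formaldehyde": (["C", "O", "H", "H"], "neg"),
    "methane": (["C", "H", "H", "H", "H"], "pos"),
    "benzene": (["C"] * 6 + ["H"] * 6, "neg"),
}

# Hypothetical predetermined feature set: number of atoms of each element.
FEATURES = ("C", "H", "O")


def to_attribute_vector(atoms, features=FEATURES):
    """Collapse a molecule into a fixed-length tuple of element counts.

    All bond information is discarded, which is exactly the loss of
    structure that attribute-value learning forces on the data.
    """
    counts = Counter(atoms)
    return tuple(counts[f] for f in features)


# One row per molecule: (feature vector, class label).
table = {
    name: (to_attribute_vector(atoms), label)
    for name, (atoms, label) in MOLECULES.items()
}
```

Any two molecules with the same element counts but different bonding would map to the same row, so a learner working on `table` could never tell them apart.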




Molecules Schema

    [Entity-relationship diagram]
    Molecule (Name, Class)
        Contains: 1 Molecule to * Atoms
    Atom (Atom-ID, Element)
        Connects: marked "1 1" on the Atom side and "?" on the Bond side
    Bond (Bond-ID, Valency)


Molecules Database

    Molecule:
        Name            Class
        formaldehyde    neg
        methane         pos
        benzene         neg

    Atom:
        Atom-ID         Element
        atm_form_1      C
        atm_form_2      O
        atm_form_3      H
        atm_form_4      H
        atm_meth_1      C
        …               …
        atm_benz_12     H

    Bond:
        Bond-ID         Valency
        bnd_form_1      2
        bnd_form_2      1
        bnd_form_3      1
        bnd_meth_1      1
        …               …
        bnd_benz_12     1

    Contains:
        Molecule        Atom
        formaldehyde    atm_form_1
        formaldehyde    atm_form_2
        formaldehyde    atm_form_3
        formaldehyde    atm_form_4
        methane         atm_meth_1
        …               …
        benzene         atm_benz_12

    Connects:
        Bond            Atom1          Atom2
        bnd_form_1      atm_form_1     atm_form_2
        bnd_form_2      atm_form_1     atm_form_3
        bnd_form_3      atm_form_1     atm_form_4
        bnd_meth_1      atm_meth_1     atm_meth_2
        …               …              …
        bnd_benz_12     atm_benz_11    atm_benz_12
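
The Molecules Database can be tried out directly with Python's built-in `sqlite3` module. The sketch below is an illustration under stated assumptions: table and column names are lower-cased variants chosen here (`contains_atom` and `connects_atoms` stand in for Contains and Connects), and only the formaldehyde rows are loaded, since those are the only rows the slide lists in full.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE molecule (name TEXT PRIMARY KEY, class TEXT);
CREATE TABLE atom (atom_id TEXT PRIMARY KEY, element TEXT);
CREATE TABLE bond (bond_id TEXT PRIMARY KEY, valency INTEGER);
CREATE TABLE contains_atom (molecule TEXT, atom TEXT);
CREATE TABLE connects_atoms (bond TEXT, atom1 TEXT, atom2 TEXT);
""")

# Formaldehyde rows, copied from the slide.
conn.execute("INSERT INTO molecule VALUES ('formaldehyde', 'neg')")
conn.executemany("INSERT INTO atom VALUES (?, ?)", [
    ("atm_form_1", "C"), ("atm_form_2", "O"),
    ("atm_form_3", "H"), ("atm_form_4", "H"),
])
conn.executemany("INSERT INTO bond VALUES (?, ?)", [
    ("bnd_form_1", 2), ("bnd_form_2", 1), ("bnd_form_3", 1),
])
conn.executemany("INSERT INTO contains_atom VALUES (?, ?)", [
    ("formaldehyde", "atm_form_1"), ("formaldehyde", "atm_form_2"),
    ("formaldehyde", "atm_form_3"), ("formaldehyde", "atm_form_4"),
])
conn.executemany("INSERT INTO connects_atoms VALUES (?, ?, ?)", [
    ("bnd_form_1", "atm_form_1", "atm_form_2"),
    ("bnd_form_2", "atm_form_1", "atm_form_3"),
    ("bnd_form_3", "atm_form_1", "atm_form_4"),
])

# Relational query: the two elements joined by each bond of formaldehyde,
# together with the bond valency. This needs joins across four tables.
bonds = conn.execute("""
    SELECT a1.element, a2.element, b.valency
    FROM connects_atoms AS c
    JOIN bond AS b ON b.bond_id = c.bond
    JOIN atom AS a1 ON a1.atom_id = c.atom1
    JOIN atom AS a2 ON a2.atom_id = c.atom2
    JOIN contains_atom AS k ON k.atom = c.atom1
    WHERE k.molecule = 'formaldehyde'
    ORDER BY b.bond_id
""").fetchall()
```

The query answers a structural question (which elements are bonded, and how strongly) that no single row of an attribute-value table could express.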




Simplified Molecules Database

    Molecule:
        Name            Class
        formaldehyde    neg
        methane         pos
        benzene         neg

    Atom:
        Atom            Molecule        Element
        atm_form_1      formaldehyde    C
        atm_form_2      formaldehyde    O
        atm_form_3      formaldehyde    H
        atm_form_4      formaldehyde    H
        atm_meth_1      methane         C
        …               …               …
        atm_benz_12     benzene         H

    Bond:
        Atom1           Atom2           Valency
        atm_form_1      atm_form_2      2
        atm_form_1      atm_form_3      1
        atm_form_1      atm_form_4      1
        atm_meth_1      atm_meth_2      1
        …               …               …
        atm_benz_11     atm_benz_12     1

    (i) Bond becomes a weak entity
    (ii) Contains and Connects are both treated as attributes


Tutorial 2: Represent This!

    [Figure: two groups of five numbered trains, "1. TRAINS GOING EAST" and "2. TRAINS GOING WEST".]
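
The step from the full Molecules Database to the Simplified Molecules Database can be written as a small transformation. This is a sketch under assumptions: the function name `simplify` and the dictionary and tuple encodings are chosen here for illustration, and only the formaldehyde rows are used because they are the only ones listed in full.

```python
# Full schema, formaldehyde rows only (copied from the Molecules Database).
atom = {"atm_form_1": "C", "atm_form_2": "O", "atm_form_3": "H", "atm_form_4": "H"}
bond = {"bnd_form_1": 2, "bnd_form_2": 1, "bnd_form_3": 1}
contains = [
    ("formaldehyde", "atm_form_1"),
    ("formaldehyde", "atm_form_2"),
    ("formaldehyde", "atm_form_3"),
    ("formaldehyde", "atm_form_4"),
]
connects = [
    ("bnd_form_1", "atm_form_1", "atm_form_2"),
    ("bnd_form_2", "atm_form_1", "atm_form_3"),
    ("bnd_form_3", "atm_form_1", "atm_form_4"),
]


def simplify(atom, bond, contains, connects):
    """Fold the Contains and Connects relations into Atom and Bond.

    - Contains becomes a Molecule attribute on each Atom row.
    - Connects becomes the (Atom1, Atom2) pair on each Bond row, so Bond
      no longer needs its own Bond-ID and is identified by the atoms it
      joins (a weak entity).
    """
    atom_rows = [(a, molecule, atom[a]) for molecule, a in contains]
    bond_rows = [(a1, a2, bond[b]) for b, a1, a2 in connects]
    return atom_rows, bond_rows


atom_rows, bond_rows = simplify(atom, bond, contains, connects)
```

Five tables shrink to three with no loss of information here, because each atom belongs to exactly one molecule and each bond joins exactly two atoms.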



