Causality competition results - Causality Workbench

                  Results of the Causality Challenge
                        Isabelle Guyon, Clopinet
     Constantin Aliferis and Alexander Statnikov, Vanderbilt Univ.
        André Elisseeff and Jean-Philippe Pellet, IBM Zürich
               Gregory F. Cooper, University of Pittsburgh
                    Peter Spirtes, Carnegie Mellon

Causality Workbench                                      clopinet.com/causality
                      Causal discovery
   What affects…
        … your health?
        … climate changes?
        … the economy?

   Which actions will have beneficial effects?
                      Systemic causality

                      The system
                                   External agent




                      Feature Selection

                       [Figure: features X and target Y.]

                       Predict Y from features X1, X2, …
                       Select most predictive features.

                             Causation

                      [Figure: features X and target Y.]

                      Predict the consequences of actions:
                      Under “manipulations” by an external agent,
                      some features are no longer predictive.
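
The point of this slide can be illustrated with a small simulation. The sketch below assumes a hypothetical three-variable chain (cause -> Y -> effect) with Gaussian noise; the variable names, noise levels and sample sizes are illustrative choices, not part of the challenge data. In the natural setting both the cause and the effect correlate with Y. When an external agent sets the effect at random, the effect stops being predictive while the cause remains predictive.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def sample(n, manipulate_effect, rng):
    """Draw n samples from the toy chain cause -> Y -> effect.

    If manipulate_effect is True, an external agent sets the effect at
    random, which removes its dependence on Y (hypothetical setup).
    """
    cause, target, effect = [], [], []
    for _ in range(n):
        c = rng.gauss(0, 1)
        y = c + 0.5 * rng.gauss(0, 1)
        if manipulate_effect:
            e = rng.gauss(0, 1)            # value imposed from outside
        else:
            e = y + 0.5 * rng.gauss(0, 1)  # value produced by the system
        cause.append(c)
        target.append(y)
        effect.append(e)
    return cause, target, effect


rng = random.Random(0)
c0, y0, e0 = sample(5000, False, rng)
c1, y1, e1 = sample(5000, True, rng)

natural = {"cause": pearson(c0, y0), "effect": pearson(e0, y0)}
manipulated = {"cause": pearson(c1, y1), "effect": pearson(e1, y1)}

print("natural:    ", natural)
print("manipulated:", manipulated)
```

A feature selection method that relies only on correlation would pick the effect in the natural data, and its predictions would degrade on the manipulated data.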

                      Challenge Design




                       Available data

  • A lot of “observational” data.
        Correlation ≠ Causality!
  • Experiments are often needed, but:
        – Costly
        – Unethical
        – Infeasible
  • This challenge, semi-artificial data:
        – Re-simulated data
        – Real data with artificial “probes”


                      Four tasks
  Challenge
   datasets




     Toy
   datasets




                  On-line feed-back




                          Difficulties
            • Violated assumptions:
                  –   Causal sufficiency
                  –   Markov equivalence
                  –   Faithfulness
                  –   Linearity
                  –   “Gaussianity”
            • Overfitting (statistical complexity):
                  – Finite sample size
            • Algorithm efficiency (computational complexity):
                  – Thousands of variables
                  – Tens of thousands of examples
                         Evaluation

     • Fulfillment of an objective
                  • Prediction of a target variable
                  • Predictions under manipulations


     • Causal relationships:
                  • Existence
                  • Strength
                  • Degree


                        Setting

     • Predict a target variable (on training and
       test data).
     • Return the set of features used.
     • Flexibility:
            – Sorted or unsorted list of features
            – Single prediction or table of results
     • Complete entry = xxx0, xxx1, xxx2 results
       (for at least one dataset).

                                 Metrics
                • Results ranked according to the test set
                  target prediction performance “Tscore”:




                • We also assess the feature set directly with an
                  “Fscore”, which is not used for ranking.
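
The definition of the Tscore did not survive on this slide. A later slide is titled "AUC distribution", so the sketch below assumes the Tscore is an area under the ROC curve (AUC) for a binary target; this is an assumption, not a statement of the official scoring code. The function name `auc` and the toy labels and scores are illustrative.

```python
def auc(labels, scores):
    """Area under the ROC curve.

    Computed as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one, with
    ties counted as one half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l != 1]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy test set: true target values and a classifier's prediction scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]
tscore = auc(labels, scores)
print(tscore)
```

A score of 0.5 corresponds to random ranking and 1 to a perfect ranking of the test examples.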
                      Toy Examples




            Causality assessment
             with manipulations
  [Figure: LUCAS0 (natural) causal graph. Nodes: Anxiety, Peer Pressure,
   Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy,
   Lung Cancer, Attention Disorder, Coughing, Fatigue, Car Accident.]


            Causality assessment
             with manipulations
  [Figure: LUCAS1 (manipulated) causal graph, same nodes as LUCAS0.]


            Causality assessment
             with manipulations
  [Figure: LUCAS2 (manipulated) causal graph, same nodes as LUCAS0.]


               Goal driven causality
   • We define:
        V=variables of interest
        (e.g. MB (Markov blanket), direct causes, ...)
   • Participants return:
        S=selected subset, e.g. 4 11 2 3 1 (ordered or not).
   [Figure: example graph with nodes numbered 0 to 11.]

   • We assess causal relevance: Fscore=f(V,S).
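
The slide does not say which function f is used. As a rough sketch, and consistent with the 0.5 chance level that appears in the probe-based scoring slide, one plausible choice is an AUC-style score: features in S are ranked in the order returned, unselected features are tied at the bottom, and the features in V form the positive class. The function name `fscore`, the tie handling and the reference set `V` below are all hypothetical; only the list S and the node numbers 0 to 11 come from the slide.

```python
def fscore(selected, relevant, all_features):
    """AUC-style score of a ranked feature list against a reference set.

    selected:     features returned by a participant, best first (S)
    relevant:     reference set of causally relevant features (V)
    all_features: every candidate feature
    Features that were not selected share the lowest rank (score 0).
    """
    rank_score = {f: len(selected) - i for i, f in enumerate(selected)}
    pos = [rank_score.get(f, 0) for f in all_features if f in relevant]
    neg = [rank_score.get(f, 0) for f in all_features if f not in relevant]
    if not pos or not neg:
        raise ValueError("need both relevant and irrelevant features")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


all_features = list(range(12))  # nodes 0..11, as numbered in the slide's figure
S = [4, 11, 2, 3, 1]            # ordered subset shown on the slide
V = {4, 11, 2, 9}               # hypothetical reference set, NOT from the slide
score = fscore(S, V, all_features)
print(score)
```

With an unordered S, the same function can be used after giving every selected feature the same rank, which reduces the score to a comparison of selected versus unselected features.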
         Causality assessment
         without manipulation?




      Using artificial “probes”
  [Figure: LUCAP0 (natural) causal graph. Same nodes as the LUCAS graph
   (Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking,
   Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue,
   Car Accident), plus artificial probes labeled P1, P2, P3, PT.]
       Using artificial “probes”
  [Figure: LUCAP1&2 (manipulated) causal graph, same nodes as LUCAP0,
   with probes labeled P1, P2, P3, PT.]
             Scoring using “probes”

  • What we can compute (Fscore):
         – Negative class = probes (here, all “non-causes”, all manipulated).
         – Positive class = other variables (may include causes and non-causes).

  • What we want (Rscore):
         – Positive class = causes.
         – Negative class = non-causes.

  • What we get (asymptotically):
       Fscore = (NTruePos/NReal) × Rscore + 0.5 × (NTrueNeg/NReal)
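
The relation above can be checked with a little arithmetic. The sketch below assumes the following reading of the symbols, which the slide does not spell out: NTruePos is the number of real (non-probe) variables that are true causes, NTrueNeg is the number of real variables that are non-causes, and NReal is their sum. The function names and the example counts are illustrative.

```python
def expected_fscore(rscore, n_true_pos, n_true_neg):
    """Asymptotic Fscore implied by a given Rscore.

    n_true_pos: real variables that are true causes (assumed meaning)
    n_true_neg: real variables that are non-causes (assumed meaning)
    """
    n_real = n_true_pos + n_true_neg
    return (n_true_pos / n_real) * rscore + 0.5 * (n_true_neg / n_real)


def implied_rscore(fscore, n_true_pos, n_true_neg):
    """Invert the relation to recover the Rscore from a measured Fscore."""
    n_real = n_true_pos + n_true_neg
    return (fscore - 0.5 * n_true_neg / n_real) * n_real / n_true_pos


# Example: 100 real variables, of which 30 are causes, and Rscore = 0.9.
f = expected_fscore(0.9, 30, 70)   # 0.3 * 0.9 + 0.5 * 0.7
r = implied_rscore(f, 30, 70)
print(f, r)
```

Under this reading, real non-causes are indistinguishable from probes and contribute the chance level 0.5, so the Fscore is an affine function of the Rscore and ranks methods in the same order.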


                      Results




                      Challenge statistics
   •    Start: December 15, 2007.
   •    End: April 30, 2008.
   •    Total duration: 20 weeks.
   •    Last (complete) entry ranked:

                 Number of ranked entrants




               Number of ranked submissions


                               Learning curves
   [Figure: learning curves, one panel per dataset (REGED, SIDO, CINA,
    MARTI). x-axis: days into the challenge (0 to 140); y-axis: Tscore
    (0.3 to 1); legend entries 0, 1 and 2.]


                      AUC distribution




                      REGED
                                Gavin Cawley
                                Yin-Wen Chang
                                Mehreen Saeed
                                Alexander Borisov
                                E. Mwebaze & J. Quinn
                                H. Jair Escalante
                                J.G. Castellano
                                Chen Chu An
                                Louis Duclos-Gosselin
                                Cristian Grozea
                                H.A. Jen
                                J. Yin & Z. Geng Gr.
                                Jinzhu Jia
                                Jianming Jin
                                L.E.B & Y.T.
                                M.B.
                                Vladimir Nikulin
                                Alexey Polovinkin
                                Marius Popescu
                                Ching-Wei Wang
                                Wu Zhili
                                Florin Popescu
                                CaMML Team
                                Nistor Grozavu




                      SIDO
   [Figure: SIDO results by entrant; same legend of entrants as on the
    REGED slide.]




                      CINA
   [Figure: CINA results by entrant; same legend of entrants as on the
    REGED slide.]




                      MARTI
   [Figure: MARTI results by entrant; same legend of entrants as on the
    REGED slide.]




        Pairwise comparisons
   [Figure: pairwise comparisons of entrants; same legend of entrants as
    on the REGED slide.]




             Top ranking methods

     • According to the rules of the challenge:
            – Yin-Wen Chang: SVM => best prediction accuracy on
              REGED and CINA. Prize: $400 donated by Microsoft.
            – Gavin Cawley: Causal Explorer + linear ridge
              regression ensembles => best prediction accuracy on
              SIDO and MARTI. Prize: $400 donated by Microsoft.
     • According to pairwise comparisons:
            – Jianxin Yin and Prof. Zhi Geng’s group: Partial
              Orientation and Local Structural Learning => best
              on Pareto front, new original causal discovery
              algorithm. Prize: free WCCI 2008 registration.

        Pairwise comparisons
   [Figure: pairwise comparisons, one panel per dataset (REGED, SIDO,
    CINA, MARTI); same legend of entrants as on the REGED slide.]




                      Conclusion
    • We found a good correlation between causation
      and prediction under manipulations.
    • Several algorithms have proved effective at
      discovering causal relationships.
    • We still need to investigate what makes them
      fail in some cases.
    • We need to capitalize on the power of classical
      feature selection methods.