
Independent components analysis of starch deficient pgm mutants

GCB 2004
M. Scholz, Y. Gibon, M. Stitt, J. Selbig


Matthias Maneck - Journal Club WS 04/05
Overview

 Introduction
 Methods
       PCA  – Principal Component Analysis
       ICA – Independent Component Analysis
       Kurtosis

 Results
 Summary


Introduction – techniques

   Visualization techniques
       supervised
               use biological background information
       unsupervised
             present the major global information
             address general questions about the underlying data structure
             detect relevant components independently of background knowledge


Introduction – techniques

   PCA
       dimensionality reduction
       extracts relevant information related to the highest variance
   ICA
       optimizes an independence condition
       components represent different, non-overlapping information

Introduction – experiments

   Microplate assays of enzymes from Arabidopsis thaliana.
         pgm mutant vs. wild type
         continuous night
   Data matrix: i enzymes (rows) × j samples (columns)




Introduction – workflow

[Figure: workflow. Data (i enzymes × j samples) → PCA (PCs × j samples) → ICA (ICs × j samples) → Kurtosis → ICs (1st IC, 2nd IC, ...)]

PCA – principal component analysis

[Figure: plot of Enzyme 2 against Enzyme 1; both axes run from -4 to 4.]

PCA – principal component analysis

[Figure: plot of Enzyme 2 against Enzyme 1 (both axes from -4 to 4) with the 1st and 2nd principal components marked.]

PCA – principal component analysis

[Figure: the data in principal component coordinates, 2nd PC against 1st PC; both axes run from -4 to 4.]

PCA – calculation

[Figure: PCA calculation. The mean is subtracted from each row of the data matrix (i enzymes × j samples). From the centered data the covariance matrix (i enzymes × i enzymes) is computed. Its decomposition yields the eigenvalues λ1 ... λi and the eigenvectors x1 ... xi.]

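The calculation on this slide (subtract the mean, form the covariance matrix, take its eigenvalues and eigenvectors) can be sketched in NumPy. This is a minimal sketch and not the authors' code; the function name `pca` and the toy data are assumptions made for illustration.

```python
import numpy as np

def pca(data):
    """PCA of a data matrix with shape (i enzymes, j samples).

    Returns the eigenvalues in descending order and the matching
    eigenvectors as rows (one principal component per row).
    """
    # Subtract the mean of each enzyme (row) across all samples.
    centered = data - data.mean(axis=1, keepdims=True)
    # Covariance matrix between enzymes, shape (i, i).
    cov = np.cov(centered)
    # eigh is intended for symmetric matrices; eigenvalues come back ascending.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order].T

# Hypothetical toy data: 2 correlated "enzymes", 200 samples.
rng = np.random.default_rng(0)
enzyme1 = rng.normal(size=200)
enzyme2 = 0.8 * enzyme1 + 0.3 * rng.normal(size=200)
data = np.vstack([enzyme1, enzyme2])

eigenvalues, components = pca(data)
print("eigenvalues:", eigenvalues)
print("1st PC direction:", components[0])
```

`np.linalg.eigh` is used rather than `np.linalg.eig` because a covariance matrix is symmetric, so the eigenvalues are real and the eigenvectors are orthonormal.
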
PCA – dimensionality reduction

[Figure: Selected Components (PCs × i enzymes) · Data Matrix (i enzymes × j samples) = Reduced Data Matrix (PCs × j samples)]

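The matrix product on this slide can be written out directly. The following is a sketch under assumptions: the data are random placeholders, and the number of retained PCs (`n_pcs = 2`) is chosen arbitrarily for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data matrix: 5 enzymes (rows) x 40 samples (columns).
data = rng.normal(size=(5, 40))
centered = data - data.mean(axis=1, keepdims=True)

eigenvalues, eigenvectors = np.linalg.eigh(np.cov(centered))
order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first

n_pcs = 2  # number of PCs to keep (arbitrary choice for this example)
selected_components = eigenvectors[:, order[:n_pcs]].T  # (PCs, i enzymes)

# Reduced Data Matrix = Selected Components . Data Matrix
reduced = selected_components @ centered  # (PCs, j samples)
print(reduced.shape)
```

Each row of `reduced` holds the coordinates of all samples along one principal component, and the variance of the first row equals the largest eigenvalue.
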
PCA – principal component analysis

[Figure: plot of Enzyme 2 against Enzyme 1 (both axes from -4 to 4) with the 1st and 2nd principal components marked.]

PCA – principal component analysis

[Figure: the data reduced to the 1st PC (x-axis, from -4 to 4).]

PCA – principal component analysis

   Minimizes correlation between components.
   Components are orthogonal to each other.
   Delivers a transformation matrix that gives the influence of the enzymes on the principal components.
   PCs are ordered by the size of the eigenvalues of the covariance matrix.

[Figure: Reduced Data Matrix (PCs × j samples) = Selected Components (PCs × i enzymes) · Data Matrix (i enzymes × j samples)]

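The PCA properties listed on this slide can be checked numerically. The sketch below uses hypothetical correlated data (the matrix `mixing` only serves to create correlation between rows) and verifies that the components are orthogonal and that the PC scores are uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical data: 3 enzymes x 500 samples with correlated rows.
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, 0.7],
                   [0.3, 0.0, 1.0]])
data = mixing @ rng.normal(size=(3, 500))
centered = data - data.mean(axis=1, keepdims=True)

eigenvalues, eigenvectors = np.linalg.eigh(np.cov(centered))
order = np.argsort(eigenvalues)[::-1]
components = eigenvectors[:, order].T  # one PC per row, largest variance first
scores = components @ centered         # data in PC coordinates

# Components are orthogonal: the Gram matrix is the identity.
gram = components @ components.T
# PC scores are uncorrelated: their covariance matrix is diagonal,
# with the eigenvalues in descending order on the diagonal.
score_cov = np.cov(scores)
print(np.round(gram, 6))
print(np.round(score_cov, 6))
```
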
ICA – independent component analysis

[Figure: three speakers (Person 1, Person 2, Person 3) and three microphones (Mike 1, Mike 2, Mike 3).]

The microphone signals are mixed speech signals:

\[
\begin{aligned}
x_1(t) &= a_{11} s_1(t) + a_{12} s_2(t) + a_{13} s_3(t) \\
x_2(t) &= a_{21} s_1(t) + a_{22} s_2(t) + a_{23} s_3(t) \\
x_3(t) &= a_{31} s_1(t) + a_{32} s_2(t) + a_{33} s_3(t)
\end{aligned}
\]

ICA – independent component analysis

[Figure: mixing. Microphone Signals X (microphone signals × time t) = Mixing Matrix A (microphone signals × speech signals) · Speech Signals S (speech signals × time t)]

[Figure: demixing. Demixing Matrix A^-1 (speech signals × microphone signals) · Microphone Signals X (microphone signals × time t) = Speech Signals S (speech signals × time t)]

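The mixing model X = A S and its inverse S = A^-1 X can be illustrated with a small NumPy sketch. The three source signals and the entries of `A` are invented for the example. In a real ICA problem A is unknown and has to be estimated; here it is known, so the demixing is exact.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1000)
# Hypothetical source signals s1..s3 (stand-ins for three speakers).
S = np.vstack([
    np.sin(2 * np.pi * 5 * t),           # sine wave
    np.sign(np.sin(2 * np.pi * 3 * t)),  # square wave
    2 * ((t * 7) % 1) - 1,               # sawtooth
])

# Hypothetical mixing matrix A: entry a_kl is the weight of
# speaker l in microphone k.
A = np.array([[1.0, 0.5, 0.3],
              [0.4, 1.0, 0.6],
              [0.2, 0.7, 1.0]])

X = A @ S                           # microphone signals: X = A S
S_recovered = np.linalg.inv(A) @ X  # speech signals:     S = A^-1 X
print(np.allclose(S_recovered, S))
```
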
ICA – independent component analysis

The sum of distributions of the same type is more Gaussian than each distribution on its own.

[Figure: three histograms, each over the range 0 to 1, and a histogram of their sum over the range 0 to 3.]

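The statement that a sum is more Gaussian than its parts can be checked with the kurtosis measure used in this talk (0 for a Gaussian). The sketch below assumes uniformly distributed inputs on [0, 1] and assumes that σ is the sample standard deviation; both are assumptions made for the example.

```python
import numpy as np

def kurtosis(z):
    """sum((z_i - mu)^4) / ((n - 1) * sigma^4) - 3."""
    z = np.asarray(z, dtype=float)
    mu = z.mean()
    sigma = z.std(ddof=1)  # assumption: sample standard deviation
    return np.sum((z - mu) ** 4) / ((z.size - 1) * sigma**4) - 3.0

rng = np.random.default_rng(3)
n = 100_000
u1, u2, u3 = rng.uniform(0.0, 1.0, size=(3, n))

k_single = kurtosis(u1)         # theoretical value for a uniform: -1.2
k_sum = kurtosis(u1 + u2 + u3)  # theoretical value for the sum:   -0.4
print(k_single, k_sum)
```

The kurtosis of the sum lies closer to zero than that of a single uniform variable, which is the sense in which the sum is "more Gaussian".
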
ICA – independent component analysis

   Maximizes independence (non-Gaussianity) between components.
   ICA doesn’t work with purely Gaussian distributed data.
   Components are not orthogonal to each other.
   Delivers a transformation matrix that gives the influence of the PCs on the independent components.
   ICs are unordered.

[Figure: ICs (ICs × j samples) = Demixing Matrix (ICs × PCs) · Data Matrix (PCs × j samples)]

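To make "maximizing non-Gaussianity" concrete, here is a toy ICA for exactly two signals. It is a deliberately simple sketch and not the algorithm used in the paper: the data are whitened with PCA, and then a brute-force search over rotation angles picks the rotation whose outputs have the largest summed absolute kurtosis. The function names, the uniform sources and the mixing matrix are all assumptions for illustration.

```python
import numpy as np

def kurtosis(z):
    """sum((z_i - mu)^4) / ((n - 1) * sigma^4) - 3."""
    mu = z.mean()
    sigma = z.std(ddof=1)  # assumption: sample standard deviation
    return np.sum((z - mu) ** 4) / ((z.size - 1) * sigma**4) - 3.0

def whiten(x):
    """Center the rows of x, decorrelate them and scale to unit variance."""
    x = x - x.mean(axis=1, keepdims=True)
    eigenvalues, eigenvectors = np.linalg.eigh(np.cov(x))
    # Scale each eigenvector (column) by 1 / sqrt(its eigenvalue).
    return (eigenvectors / np.sqrt(eigenvalues)).T @ x

def ica_2d(x, n_angles=360):
    """Toy ICA: rotate whitened 2-row data to maximize summed |kurtosis|."""
    z = whiten(x)
    best_score, best_w = -np.inf, None
    # Rotations beyond 90 degrees only swap or sign-flip the outputs.
    for angle in np.linspace(0.0, np.pi / 2, n_angles, endpoint=False):
        c, s = np.cos(angle), np.sin(angle)
        w = np.array([[c, s], [-s, c]])
        y = w @ z
        score = abs(kurtosis(y[0])) + abs(kurtosis(y[1]))
        if score > best_score:
            best_score, best_w = score, w
    return best_w @ z

rng = np.random.default_rng(5)
# Two independent, non-Gaussian (uniform) hypothetical sources.
sources = rng.uniform(-1.0, 1.0, size=(2, 5000))
mixed = np.array([[1.0, 0.6], [0.4, 1.0]]) @ sources
ics = ica_2d(mixed)

# Absolute correlation between each recovered IC and each source.
corr = np.abs(np.corrcoef(ics, sources)[:2, 2:])
print(np.round(corr, 3))
```

Each recovered component should correlate strongly with exactly one source, up to order and sign, which matches the point that ICs are unordered. With purely Gaussian sources every rotation would give a kurtosis near zero, so the search would have nothing to maximize.
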
Kurtosis – significant components

   measure of non-Gaussianity

\[
\operatorname{kurtosis}(z) = \frac{\sum_{i=1}^{n} (z_i - \mu)^4}{(n - 1)\,\sigma^4} - 3
\]

      z – random variable (IC)
      μ – mean
      σ – standard deviation

   positive kurtosis → super-Gaussian

   negative kurtosis → sub-Gaussian

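The kurtosis formula can be implemented in a few lines. This sketch assumes that σ is the sample standard deviation (the slide does not say which estimator is meant), and the three test distributions are chosen only to show the sign of the result.

```python
import numpy as np

def kurtosis(z):
    """sum((z_i - mu)^4) / ((n - 1) * sigma^4) - 3."""
    z = np.asarray(z, dtype=float)
    n = z.size
    mu = z.mean()
    sigma = z.std(ddof=1)  # assumption: sample standard deviation
    return np.sum((z - mu) ** 4) / ((n - 1) * sigma**4) - 3.0

rng = np.random.default_rng(4)
n = 200_000
k_gauss = kurtosis(rng.normal(size=n))   # close to 0
k_super = kurtosis(rng.laplace(size=n))  # positive: super-Gaussian
k_sub = kurtosis(rng.uniform(size=n))    # negative: sub-Gaussian
print(k_gauss, k_super, k_sub)
```

The Laplace distribution (peaked, heavy tails) gives a positive value and the uniform distribution (flat) a negative one, while Gaussian samples give a value near zero.
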
Influence Values

   Which enzymes have the most influence on the ICs?

[Figure: Reduced Data Matrix (PCs × j samples) = Selected Components (PCs × i enzymes) · Data Matrix (i enzymes × j samples)]

[Figure: ICs (ICs × j samples) = Demixing Matrix (ICs × PCs) · Data Matrix (PCs × j samples)]

Influence Values

[Figure: Influence Matrix (ICs × i enzymes) = Demixing Matrix (ICs × PCs) · Selected Components (PCs × i enzymes)]

[Figure: ICs (ICs × j samples) = Influence Matrix (ICs × i enzymes) · Data Matrix (i enzymes × j samples)]

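The influence matrix combines the PCA and ICA steps into one transformation from enzymes to ICs. The sketch below uses random placeholder matrices in place of real PCA and ICA results, and ranking enzymes by absolute influence is an assumption about how "most influence" might be read.

```python
import numpy as np

rng = np.random.default_rng(6)
n_enzymes, n_samples, n_pcs = 6, 50, 3

# Hypothetical stand-ins; in practice these come from PCA and ICA.
data = rng.normal(size=(n_enzymes, n_samples))             # (i enzymes, j samples)
selected_components = rng.normal(size=(n_pcs, n_enzymes))  # (PCs, i enzymes)
demixing = rng.normal(size=(n_pcs, n_pcs))                 # (ICs, PCs)

# Two-step route: project onto the PCs, then demix.
reduced = selected_components @ data  # (PCs, j samples)
ics_two_step = demixing @ reduced     # (ICs, j samples)

# One-step route via the influence matrix.
influence = demixing @ selected_components  # (ICs, i enzymes)
ics_direct = influence @ data               # (ICs, j samples)

# Enzymes ranked by absolute influence on the 1st IC, largest first
# (assumption: magnitude is what counts as "influence").
ranking = np.argsort(-np.abs(influence[0]))
print(ranking)
```

Both routes give the same ICs because matrix multiplication is associative, so each row of `influence` can be read as the weights of the enzymes on one IC.
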
Results

   pgm mutant
       compares wild type and pgm mutant
       17 enzymes, 125 samples
               wild type, pgm mutant
   continuous night
       response to carbon starvation
       17 enzymes, 55 samples
               +0, +2, +4, +8, +24, +48, +72, +148 h

Results – pgm mutant

[Figure: pgm mutant results]

Results – continuous night

[Figure: continuous night results]

Results – combined

[Figures: combined results, three slides]

Summary

   ICA in combination with PCA has higher discriminating power than PCA alone.
   Kurtosis is used to select the optimal PCA dimension and to order the ICs.
   pgm experiment: the 1st IC discriminates between mutant and wild type.
   Continuous night: the 2nd IC represents the time component.
   The two most strongly implicated enzymes are identical.

References

   Scholz M., Gibon Y., Stitt M., Selbig J.: Independent
    components analysis of starch deficient pgm mutants.
   Scholz M., Gatzek S., Sterling A., Fiehn O., Selbig J.:
    Metabolite fingerprinting: an ICA approach.
   Blaschke, T., Wiskott, L.: CuBICA: Independent
    Component Analysis by Simultaneous Third- and Fourth-
    Order Cumulant Diagonalization. IEEE Transactions on
    Signal Processing, 52(5):1250-1256.
    http://itb.biologie.hu-berlin.de/~blaschke/
   Hyvärinen A., Karhunen J., Oja E.: Independent
    Component Analysis. J. Wiley. 2001.
