Posted on: 2/17/2012. Public domain.
Supplementary Material for “Dramatic reduction of dimensionality in large biochemical networks due to strong pair correlations”

Table of Contents:

1. Supplementary Figures S1-S4
2. Generation of Random Linear Networks
3. Nonlinear Feedback Reactions in a Reaction Network with Linear Architecture
4. Biological Networks (Fig. S6)
5. Supplementary Figures S7-S8
6. Details of the Simulation Method
7. Supplementary Figure S9
8. Failure of PCA in reducing dimensionality in a simple network
9. Analytical Calculations for the Linear Network
10. Supplement on Gramians
11. Studying the role of modifications likely to occur in experiments
12. Numbers of species and timepoints used for the biological networks
13. Effect of unit variance scaling (Fig. S12)
14. References

[Figure: "N=64 Linear Cascade: Top Two Components" — percentage explained (color scale, 90-100%) vs. time, by trial index, for 100 trials.]

Fig. S1. Time dependence of the percentage explained by the top two eigenvalues of the covariance matrix calculated for the linear network with N=64 species. Results for 100 trials (each associated with a set of rate constants and initial conditions randomly drawn from a uniform distribution over the range 0.1 to 10) are shown and demonstrate that more than 90% of the total variance is captured by the top two eigenvalues most of the time. All the trials were started with a non-zero concentration of the last species (c_64 ≠ 0) and zero concentrations for all other species. The calculation of the covariance matrix included all 64 species concentrations.

(A) (B) Fig. S2.
Minimum percentage explained by the top and the top two eigenvalues of the covariance matrix for the linear reaction network. (A) Distributions of the minimum of the percentage explained by the top eigenvalue over the entire time course of any trial, for N=8 to N=512. In each trial a set of rate constants and initial concentrations was randomly drawn from a uniform distribution between 0.1 and 10. All of the 10,000 trials were started with a non-zero concentration of the last species and zero concentrations for all other species. The calculation of the covariance matrix included all the species concentrations. (B) Distributions of the minimum percentage explained by the top two eigenvalues for the same cases as in (A).

(A) (B) (C) Fig. S3. Effect of initial conditions on the distributions of the minimum percentage explained by the eigenvalues in the linear network. We compare the distributions of the minimum percentage explained by the largest eigenvalue of the covariance matrix for two different types of initial conditions for the linear network with N=64. The two types correspond to (i) only the concentration of the last species is non-zero (c_64 ≠ 0) at t=0, whereas the rest of the species start with zero concentrations, and (ii) all the species concentrations have non-zero values at t=0. For both cases, the rate constants and the initial concentrations were varied in the same manner (as in Fig. S2). (A) Distributions for the largest eigenvalue. (B) Distributions for the top two eigenvalues. (C) Distributions for the top three eigenvalues. The top two eigenvalues seem to capture the variance more efficiently when all the species were given non-zero initial concentrations. The shapes of the distributions start to converge when three eigenvalues are considered.

(A) (B) Fig. S4. Comparisons between results of calculations including all the species vs. including only a set of independent species in the covariance matrix for the linear network.
(A) Distributions of the minimum percentage explained by the largest eigenvalue when the covariance matrix was calculated using all the species (black line) and only the independent species concentrations (red line) for N=64. The number of independent species was calculated by accounting for the conservation laws (e.g., the total concentration of all N species remains unchanged in the linear network, ∑_{i=1}^{N} c_i(t) = const) that produce linear dependence between different species concentrations. The parameters used for the simulation are the same as described in Fig. S2. (B) Distributions of the minimum percentage explained by the top three eigenvalues for the same cases as in (A).

Generation of Random Linear Networks

1. For a system of N species, we choose a subset of n_1 ≤ N species, where n_1 is drawn from a uniform random distribution ranging from 2 to N. The species in the sub-network, indexed c_1, …, c_{n_1}, are connected such that c_i reacts with c_{i+1} with first order forward and backward reactions for i < n_1. This step produces a connected reaction network with linear architecture. Then, in addition to these reactions, we choose all the pairs that are not directly connected to each other and introduce a pair of forward and backward first order reactions, each with probability ½. This second step produces branched structure in the sub-network.

2. The next step involves three sub-steps. A. If N − n_1 > 2, then we choose another sub-network of n_2 ≤ N − n_1 species following the same method as described in step 1. The reactions in the sub-network are also generated in the same manner. We continue this process until only a single species or a pair of species is left in the system; we then follow step B or C. B. If N − n_1 = 2, then we connect the remaining pair of species by first order forward and backward reactions. C. If N − n_1 = 1, then we connect the last species with a randomly chosen species in the last sub-network by first order forward and backward reactions.

After we generate the network architecture, which consists of a collection of disconnected sub-networks with branched and/or linearly connected reactions, we assign rate constants and initial conditions to all the reactions, with the rates drawn from the range [0.1, 10] and the initial concentrations from [0.1, 10]. At each trial, a new network architecture and a new set of reaction rates and initial conditions are generated.

Nonlinear feedback reactions in a reaction network with linear architecture

The species in the network react with each other as shown in the schematic figure above, where the wavy lines indicate the feedback reactions. In this network, species i interacts with species i−1 and i+1 following the mass-action kinetics

dc_i/dt = k_{i−1} c_{i−1} + k_{−i} c_{i+1} − (k_{−(i−1)} + k_i) c_i − k^f_{i−1} c_i c_{i−1} + k^f_i c_i c_{i+1}

The last two terms in the above ODE denote the feedback reactions.

Fig. S5. Minimum percentage explained by the top four eigenvalues in the linear reaction networks with non-linear feedback reactions. Distributions of the minimum percentage explained are shown for N=8 to N=512 for 10,000 trials. In each trial a set of rate constants and initial concentrations was randomly drawn from a uniform distribution in the range 0.1 to 10. Each trial was started with a non-zero concentration for the last species and zero concentrations for all other species.

(A) Ras Activation Network (B) EGFR Signaling Network (Kholodenko et al. (1999)) (C) EGFR Signaling Network (Blinov et al. (2006)) (D) NF-kB Activation Network

Fig. S6. Biological networks. (A) Ras activation network. Ras activation (transition from RasGDP to RasGTP) is carried out by two enzymes (Guanine nucleotide Exchange Factors, or GEFs), SOS and Rasgrp1.
SOS is recruited to the plasma membrane by the T cell receptor (TCR) signalosome (not shown in the figure), which is formed when TCRs interact with strong affinity antigens. In addition to a catalytic site (marked by a yellow circle) that mediates Ras activation, SOS also has an allosteric site which can bind both RasGDP and RasGTP (1). When the allosteric site is occupied by RasGTP, the rate of Ras activation via SOS becomes almost 75 times faster; this is the positive feedback. Rasgrp1 is recruited to the plasma membrane by Diacylglycerol (DAG), which is produced by signaling events downstream of TCR triggering. Unlike SOS, Ras activation by Rasgrp1 does not have a positive feedback. Ras deactivation is mediated by the enzyme RasGAP. (B) EGFR signaling network (coarse grained). EGFR signaling network where multiple receptor activation states are coarse-grained into a single state. The details regarding the signaling pathway shown here can be found at http://models.cellml.org/exposure/468382d300fa0027425f72c195afd11a and Ref. (2). (C) EGFR signaling network (including detailed receptor activation states). Details of the signaling network can be found in Ref. (3). The figure above was taken from Ref. (3) with the permission of the authors. (D) NF-kB activation network. In the resting state NF-κB is bound to the three isoforms of IκB (α, β, ε) in the cytoplasm. Following stimulation, IKK complexes get activated. IKK phosphorylates and degrades the IκB isoforms, releasing NF-κB from its bound state. Free NF-κB translocates to the nucleus (4). While IκBβ and IκBε are transcribed at a steady rate, transcription of IκBα is positively regulated by nuclear NF-κB. The nuclear IκB isoforms then bind their nuclear NF-κB partners and translocate them back to the cytoplasm, where, in the presence of IKK, the same cycle repeats itself, giving rise to oscillations in NF-κB's nuclear localization.
The BioNetGen files describing the reactions and the parameters used can be found at the link http://planetx.nationwidechildrens.org/~jayajit/pairwise_correlations/ .

[Figure: "Ras-Sos Network: Top Component" and "Ras-Sos Network: Top Two Components" — percentage explained (color scale) vs. time, by trial index, for 100 trials.]

Fig. S7 (A) Temporal evolution of the percentage explained by the largest eigenvalue for the Ras activation network until the steady state is reached, for 100 different trials (denoted as trial index on the y axis). In each trial, the kinetic rates and initial concentrations were varied around the base parameters (web supplement) by up to 50% from a uniform random distribution. (B) Time evolution of the percentage explained by the top two eigenvalues for the same trials as in (A).

[Figure: "Minimum Percent Explained by Top Two Components, Ras-Sos Network" — percent of trials vs. percent explained, for stable and bistable cases.]

(A) (B) (C) Fig. S8 Percentage explained by more than one principal component for the nonlinear networks. (A) The Ras activation network, (B) the EGFR signaling network, and (C) the NF-kB activation network. The parameters used are the same as in Fig. 2 in the main text.

Details of the Simulation Method:

We use the rule-based modeling software BioNetGen (5) to generate time courses for the species kinetics in a biochemical network. This software produces a set of ODEs corresponding to the mass-action kinetics describing the biochemical reactions in the networks. The ODEs are solved numerically in the software using the CVODE solver. We have used a time step (dt) proportional to the inverse of the fastest time scale (dt = τ_fastest/2) associated with the reactions. For the linear networks we calculate τ_fastest from the largest rate constant, i.e., τ_fastest = 1/k_largest, for each trial, as all the reactions are first order. For the networks with non-linear reactions, which consist of second order binding-unbinding reactions and first order reactions, we calculate the largest possible rate (k_largest) of the reactions for every trial by finding the maximum of the first order rates and the products of the binding rates with the largest species concentration. Following the same scheme we calculate the smallest possible rate (k_smallest) of a reaction in the linear networks (k_smallest = minimum of all the rates) and the non-linear networks (k_smallest = minimum of all the first order rates and the products of the binding rates with the largest species concentration). The BioNetGen files describing the reactions for all the networks studied can be found at the link http://planetx.nationwidechildrens.org/~jayajit/pairwise_correlations/. We have checked in specific cases that the results obtained from BioNetGen agree with results from higher accuracy calculations using Mathematica.

Calculation of the steady state: We calculated the eigenvalues of the correlation matrix, C_ij, for times until the system reached the steady state. When reaction kinetics is described by ODEs, the system reaches the steady state, in principle, at t → ∞. However, as the system approaches the fixed point, the relative changes in species concentrations compared to the steady state concentrations decrease with time. In our simulations, we ended the kinetics when we reached such a limit. We denote this time as the estimated steady state time (t_S). For all the networks (linear and non-linear, except the NF-κB network), t_S is calculated from the smallest time scale (∼1/k_smallest) in the system. We justified this choice by evolving the system for a longer time to check whether the relative Euclidean distance

d(t_1, t_2) = sqrt( ∑_{i=1}^{N} (c_i(t_1) − c_i(t_2))² ) / ∑_{i=1}^{N} c_i(t = 0)

between the species concentrations is smaller than a tolerance of 10^−6 when t_1 = t_S and t_2 = 2t_S. Table I below shows the values of these variables for the different networks we use. For some of the networks, the system reached close to the steady state concentrations even for times t < t_S; in such cases, we calculated the Euclidean distance d(t_end, t_S) and ended the simulation when d(t_end, t_S) ≤ 10^−6. The table below also provides details regarding this.

NF-κB network: This network can produce oscillations when the kinetic rates and the initial conditions fall within a particular range. We stop the simulations when 50 oscillations are detected in a trial. Our simulations indicated it was reasonable to assume that the system is well inside the limit cycle after 50 oscillations. For the kinetic trajectories that did not show oscillations we stopped the kinetics at a time t = 200 units, which guaranteed that the distributions for the minimum percentage explained did not change significantly even when the system was evolved farther in time. We used a less stringent criterion for estimating the steady states for this network compared to the other networks, since this case required a much smaller discretization time step (dt = 4×10^−4), which substantially increased the computation time for the range of parameters studied here.

Table I: Details regarding the solutions of the ODEs and steady state calculations

Network               | Estimated steady state time (t_S) | Simulation end time (t_end)
Linear cascade        | 10/k_smallest                     | Same as t_S if max{c_i} < 1; (10/k_smallest)·c_max when max{c_i} ≥ 1
Random linear cascade | 1/k_smallest                      | Same as t_S
Nonlinear feedback    | 10/k_smallest                     | Same as t_S if max{c_i} < 1; (10/k_smallest)·c_max when max{c_i} ≥ 1
Ras-SOS               | 1/k_smallest                      | Same as t_S
Kholodenko et al.     | 2/k_smallest                      | t_end < 2/k_smallest
Blinov et al.         | 2/k_smallest                      | t_end < 2/k_smallest

Fig. S9. Comparison with the Latin Hypercube Sampling method.
The linear network for N=64 was investigated with values of the rate constants and initial concentrations generated using the Latin Hypercube (6) sampling method (red histograms) for 10,000 trials. The parameters were varied in the same manner as described in Fig. S2. The histograms generated by using the built-in random number generator in Perl are shown (black) for comparison. Both sampling with a uniform distribution and Latin Hypercube sampling generate very similar distributions.

Failure of Principal Component Analysis in reducing dimensionality in a simple linear network

Consider a set of disconnected first order decay reactions:

X_1 →(k_1) ∅, X_2 →(k_2) ∅, …, X_i →(k_i) ∅, …, X_m →(k_m) ∅.    (1)

The kinetics of the above reactions is determined by a set of uncoupled linear ordinary rate equations,

dc_1/dt = −k_1 c_1, …, dc_i/dt = −k_i c_i, …, dc_m/dt = −k_m c_m.

If c_{i0}(t_i) is the initial concentration of the ith species (X_i), which starts decaying at t = t_i, then the concentration of that species at any time t (> t_i) is given by c_i(t) = c_{i0}(t_i) e^{−k_i(t−t_i)}.

Case I

First we consider a situation where all the reactions start at the same time, t_1 = t_2 = … = t_m = 0, with the same initial concentrations, i.e., c_{10}(0) = … = c_{m0}(0) = c_0. We also assume that all the species decay at the same rate, so k_1 = k_2 = … = k_m = 1. With the above simplifications, all the species follow identical kinetics, c_i(t) = c_0 e^{−t}. Thus the correlation matrix C_ij, defined as

C_ij(T) = (1/T) ∫_0^T dt c_i(t) c_j(t),

becomes

C_ij(T) = (c_0²/T) ∫_0^T dt e^{−2t} = (c_0²/2T)(1 − e^{−2T}).

Note that all the elements of the C_ij matrix are equal, and there is only one non-zero eigenvalue,

λ_1 = (m c_0²/2T)(1 − e^{−2T}),

while the rest of the eigenvalues vanish, i.e., λ_2 = λ_3 = … = λ_m = 0. Therefore, the largest eigenvalue captures 100% of the variance!
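The Case I result is easy to verify numerically; the sketch below approximates the time average by a Riemann sum (the values of m, c_0, and T are arbitrary):

```python
# Case I check: m identical decays c_i(t) = c0*exp(-t) give a correlation
# matrix C_ij = (1/T) \int_0^T c_i(t) c_j(t) dt with a single non-zero eigenvalue.
import numpy as np

m, c0, T, nt = 5, 2.0, 10.0, 200_000
t = np.linspace(0.0, T, nt, endpoint=False)   # left-Riemann time grid
dt = T / nt
traj = np.tile(c0 * np.exp(-t), (m, 1))       # every species follows the same curve

C = (traj @ traj.T) * dt / T                  # time-averaged correlation matrix
evals = np.sort(np.linalg.eigvalsh(C))[::-1]

expected_top = m * c0**2 / (2 * T) * (1 - np.exp(-2 * T))
frac = evals[0] / evals.sum()                 # fraction explained by the top component
```

The top eigenvalue matches (m c_0²/2T)(1 − e^{−2T}) and exhausts the trace, so the first principal component explains essentially 100% of the variance.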
This behavior is expected: all the species started decaying at the same time with the same rate (or "velocity"), so the trajectory of the system in the manifold spanned by the species concentrations is a straight line, and an eigenvector along that direction captures all the variation in the trajectory at all times. Now we construct a case where the reactions are introduced at different times, and show that a particular arrangement of those times creates a trajectory that requires all the eigenvectors to capture an equal amount of the variation.

Case II

Here we consider a situation where the reactions in (1) are introduced sequentially at different times {t_i}, such that t_m > t_{m−1} > t_{m−2} > … > t_2 > t_1 = 0. As in Case I, we assume all the rates are set to unity and all the species start with the same initial concentration, c_0. The correlation matrix, C_ij, is given by

C_ij(T, t_i, t_j) = (1/T) ∫_0^T dt c_i(t) c_j(t) = (c_0²/T) ∫_{t_i}^T dt e^{−(t−t_i)} e^{−(t−t_j)} = (c_0²/2T) e^{(t_i+t_j)} [e^{−2t_i} − e^{−2T}],

where t_i > t_j, and c_i(t) = 0 when t < t_i and c_j(t) = 0 when t < t_j. It is instructive to look at the case with two species (m = 2) first; we then generalize the result to any number of species. In this case,

C_11 = (c_0²/2T) [1 − e^{−2(T−t_1)}], C_12 = C_21 = (c_0²/2T) [e^{−(t_2−t_1)} − e^{−2T+(t_1+t_2)}], C_22 = (c_0²/2T) [1 − e^{−2(T−t_2)}],

where t_2 > t_1. The corresponding eigenvalues are

λ_{1,2} = [(C_11 + C_22) ± sqrt((C_11 − C_22)² + 4C_12²)] / 2.

The percentage explained by the largest component is given by

(1/2) [1 + sqrt((C_11 − C_22)² + 4C_12²) / (C_11 + C_22)] × 100%.

When T ≫ t_1, t_2, the above expression simplifies to

[(1/2)(1 + e^{−(t_2−t_1)}) + O(e^{−2T})] × 100%.

Therefore, when t_2 − t_1 ≫ 1, the percentage explained by the top component is very close to 50%, and both eigenvectors capture the total variance almost equally. Now we can easily generalize the result to any m > 2.
When T ≫ t_1, …, t_m,

C_ij(T, t_i, t_j) = (c_0²/2T) e^{−(t_i−t_j)} + O(e^{−2T}), for t_i > t_j,

and

C_ii(T, t_i, t_i) = (c_0²/2T) + O(e^{−2T}).

Therefore, when the times {t_i} for introducing the reactions are chosen such that t_i − t_j ≫ 1, the off-diagonal elements of the C_ij matrix are exponentially small compared to the diagonal elements. Thus, in this case we obtain a set of degenerate eigenvalues, all equal to c_0²/2T in the leading order, and the fraction explained by any single eigenvalue is 1/m (to the leading order).

Linear Model

In this section we provide analytic arguments to support the numerical result presented in the text that one or two principal components are sufficient to explain almost all the variance in an overwhelming number of cases for the linear cascade and its variants. In general the concentrations of the linear system satisfy the rate equations

dc_i(t)/dt = ∑_j M_ij c_j(t)    (1)

where the reaction matrix elements M_ij are constants that are linear in the rate constants. The concentrations that obey the set of reactions shown in Figure 1 in the text yield the matrix elements M_{i,i+1} = k_{−i}, M_{i,i−1} = k_{i−1} and M_{ii} = −k_i − k_{−(i−1)} for 1 ≤ i ≤ n, if we set k_{−1} = k_n = 0. The system obeys the conservation law that ∑_j c_j(t) is independent of time.

The eigenvalues and eigenvectors of M determine the time evolution of the system. One of the eigenvalues, say λ_0, will be zero, reflecting the conservation law, and the rest can be proved to be negative. Denote the eigenvalues by −λ_α (λ_α ≥ 0) and the right- and left-eigenvectors by v_{α,i} and u_{α,j} respectively. The solution can be written in the form

c_i(t) = ∑_{α=0}^{N−1} e^{−λ_α t} D_α v_{α,i}    (2)

where the coefficients D_α are determined by the initial conditions: D_α = ∑_i u_{α,i} c_i(t = 0). The first coefficient D_0 yields the sum of the concentrations.
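The eigenmode solution (2) can be checked directly on a small cascade: diagonalize M, propagate each mode, and compare with a brute-force Euler integration (a sketch; the four-species rate constants and initial condition below are arbitrary):

```python
# Check of Eq. (2): c(t) = sum_a exp(-lambda_a t) D_a v_a for a 4-species cascade
# c_1 <-> c_2 <-> c_3 <-> c_4 with arbitrary first order rates.
import numpy as np

k_f = [1.3, 0.7, 2.1]                 # forward rates  c_i -> c_{i+1} (arbitrary)
k_b = [0.5, 1.9, 0.8]                 # backward rates c_{i+1} -> c_i (arbitrary)
n = 4
M = np.zeros((n, n))
for i in range(n - 1):
    M[i + 1, i] += k_f[i]; M[i, i] -= k_f[i]          # forward reaction
    M[i, i + 1] += k_b[i]; M[i + 1, i + 1] -= k_b[i]  # backward reaction

w, V = np.linalg.eig(M)               # w = -lambda_a; columns of V: right eigenvectors
c0 = np.array([0.0, 0.0, 0.0, 3.0])   # only the last species populated at t = 0
D = np.linalg.inv(V) @ c0             # amplitudes D_a from the left eigenvectors

t = 0.7
c_modes = ((V * np.exp(w * t)) @ D).real   # eigenmode solution, Eq. (2)

c, dt = c0.copy(), 1e-5               # brute-force Euler integration for comparison
for _ in range(int(round(t / dt))):
    c = c + dt * (M @ c)
```

One eigenvalue of M is (numerically) zero, reflecting the conservation law, and the total concentration of the eigenmode solution stays equal to its initial value.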
We integrate the solutions from t = 0 up to t = τ to obtain the expression for the elements of the covariance matrix

C_ij(τ) = ∑_α ∑_β D_α D_β v_{α,i} v_{β,j} [ (1 − e^{−(λ_α+λ_β)τ}) / ((λ_α + λ_β)τ) − (1 − e^{−λ_α τ})(1 − e^{−λ_β τ}) / (λ_α τ · λ_β τ) ]    (3)

Only the terms in which both λ_α and λ_β are nonzero contribute to the above sums, and so the covariance matrix is effectively an (N−1) × (N−1) matrix. We write it in the form

C_ij(τ) = ∑_α ∑_β D_α D_β v_{α,i} v_{β,j} F_{αβ}(τ)

where

F_{αβ}(τ) ≡ (1 − e^{−(λ_α+λ_β)τ}) / ((λ_α + λ_β)τ) − (1 − e^{−λ_α τ})(1 − e^{−λ_β τ}) / (λ_α τ · λ_β τ).    (4)

Since F_{αβ}(τ) is independent of the initial conditions and depends only on the rate constants, the matrix denoted by F is easier to analyze.

An extreme limit of the cascade is one with degenerate time scales, given by reactions C_i ⇌ C_j with the same rate k for all pairs i ≠ j. In this case the reaction matrix M has elements M_{i,j} = k for i ≠ j and M_{ii} = −(N−1)k. The eigenvalues are easy to determine: there is one zero eigenvalue and N−1 degenerate eigenvalues with value −Nk. It is convenient to let the indices run from 0 to N−1: λ_0 = 0 and λ_α = Nk for α = 1, ···, N−1. It is easy to see that F_{0α} = 0 for all α, so we consider F_{αβ} with α and β ranging from 1 to N−1. Since all the λ_α are identical, all the matrix elements of F are the same, and we denote their common value by f. The eigenvalues of F are then easy to determine: it has one non-vanishing eigenvalue, (N−1)f, and the rest vanish. Since a single eigenvalue exhausts the trace, it is straightforward to show that PCA works perfectly for all initial conditions. Given that in this extreme case PCA works exactly, one can argue using matrix perturbation theory that for small deviations the performance of PCA will still be very good.

The other case we consider is one in which the rate constants are randomly distributed, yielding a uniform distribution for the eigenvalues of M.
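Before turning to the random case, the degenerate limit above can be confirmed in a few lines (a sketch; the values of N, k, and f are arbitrary):

```python
# Degenerate limit: M_ij = k for i != j, M_ii = -(N-1)k has one zero eigenvalue
# and N-1 eigenvalues equal to -Nk; an all-equal F matrix has a single
# non-vanishing eigenvalue (N-1)f.
import numpy as np

N, k = 6, 0.8                                  # arbitrary size and rate
M = k * np.ones((N, N)) - N * k * np.eye(N)    # diagonal: k - Nk = -(N-1)k
w = np.sort(np.linalg.eigvalsh(M))             # M is symmetric in this limit

f = 0.37                                       # common element of F (any value)
F = f * np.ones((N - 1, N - 1))
ev_F = np.sort(np.linalg.eigvalsh(F))[::-1]    # (N-1)f, then zeros
```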
We have verified numerically that the eigenvalues of M for the case displayed in Figure 1 of the text yield an approximately uniform distribution, with an excess at λ_α close to zero, for large N. We provide a variational bound for the percentage explained by the first principal component of F. Denote the largest eigenvalue of F by Λ_max. We use the variational bound Λ_max ≥ ∑_{α,β} v_α F_{αβ} v_β for any normalized vector v.

For a given value of N we consider a model in which the eigenvalues λ of the reaction matrix are distributed uniformly between 1/N and 1. Since F is a positive-definite matrix, we know from the Perron-Frobenius theorem that the eigenvector corresponding to the largest eigenvalue has all positive elements. This suggests choosing the trial vector v = (1/√(N−1)) (1, 1, ···, 1), which is guaranteed to have a positive overlap with the exact eigenvector corresponding to Λ_max. We then have Λ_max ≥ (1/(N−1)) ∑_α ∑_β F_{αβ}. We replace the sums over the eigenvalues by integrals with a uniform density. Similarly, we can evaluate the trace and find a lower bound on the percentage explained by the first principal component. The averaging over the eigenvalue distribution can be evaluated numerically, and the results for N = 1000 are displayed in Figure S10. In the limit N → ∞ the integrals can be evaluated exactly in terms of incomplete Gamma functions (not displayed) and the results are similar. We emphasize that our results are for the matrix F and not for the covariance matrix itself; therefore, for special initial conditions this bound may be violated.

We have studied the large-τ behavior as follows. Consider the matrix F for τ ≫ 1/λ_min, where λ_min is the eigenvalue of M with the least magnitude. Recall that F is an (N−1) × (N−1) matrix with the zero eigenvalue removed. Since all the exponentials are negligible, to leading order in τ^{−1} it has the form of a Cauchy matrix with

F_{αβ} = (1/τ) · 1/(λ_α + λ_β).

[Figure S10: Bound on the percentage explained by the first principal component as a function of τ; the bound decreases from about 0.9 at τ = 1 to about 0.5 at τ = 1000.]

We have chosen τ ≥ 1 since the largest eigenvalue is O(1). We use the variational bound as before and obtain

Λ_max / Tr F ≥ [(1/(N−1)) ∑_α ∑_β 1/(λ_α + λ_β)] / [∑_α 1/(2λ_α)].

Now we write the numerator as a sum over rows and use the fact that the harmonic mean is less than the arithmetic mean. For row α we have

1 / [(1/(N−1)) ∑_β 1/(λ_α + λ_β)] ≤ (1/(N−1)) ∑_β (λ_α + λ_β) = μ + λ_α,

where μ ≡ (1/(N−1)) ∑_α λ_α is the mean eigenvalue. We obtain

Λ_max / Tr F ≥ [∑_α 1/(λ_α + μ)] / [∑_α 1/(2λ_α)].

So far this is exact; if we assume that the eigenvalues λ_α of the reaction matrix are distributed uniformly between 1/N and 1, for large N we can simplify this further. We have μ = 1/2, and the numerator is approximately N log 3 for large N. The denominator involves a harmonic sum and is logarithmic: it is approximately (1/2) N log N. We obtain

Λ_max / Tr F ≥ 2 log 3 / log N.

Thus the fraction explained by the first component is bounded below by 2 log 3 / log N, which decays only as 1/log N. At N = 100, 500 and 1000 this yields bounds of 0.477, 0.3535 and 0.31, respectively. These bounds are consistent with our numerical results. We have been unable to establish better analytic bounds.

Supplement on Gramians

Consider a set of k vectors {v_j} (j = 1, ···, k) in an N-dimensional real vector space. The (symmetric) Gram matrix, or Gramian, is defined by G_ij = v_i · v_j. Note that the k vectors are linearly dependent if and only if there exist k constants c_j (not all of which are zero) such that ∑_{j=1}^{k} c_j v_j = 0. This corresponds to the equation ∑_{j=1}^{k} G_ij c_j = 0, and linear independence occurs if and only if det G ≠ 0. Thus the Gram determinant arises naturally in determining linear independence. The Gram determinant is non-negative, and it can be shown to be the squared (hyper)volume of the k-dimensional parallelepiped (or parallelotope) formed by the vectors.
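Both the determinant criterion and the volume interpretation are easy to check numerically (a sketch with arbitrary vectors in R⁴):

```python
# Gram determinant: positive for linearly independent vectors, zero for
# dependent ones; for two vectors it equals the squared parallelogram area.
import numpy as np

def gram_det(vectors):
    V = np.asarray(vectors, dtype=float)  # rows: the k vectors in R^N
    return np.linalg.det(V @ V.T)         # G_ij = v_i . v_j

a = np.array([1.0, 2.0, 0.0, 1.0])        # arbitrary vectors in R^4
b = np.array([0.0, 1.0, 1.0, 3.0])

g2 = gram_det([a, b])                      # squared area spanned by a and b
area_sq = a.dot(a) * b.dot(b) - a.dot(b) ** 2
g_dep = gram_det([a, b, 2 * a - 3 * b])    # dependent triple: determinant ~ 0
```

For two vectors the Gram determinant reduces to |a|²|b|² − (a·b)², the squared area of the parallelogram they span, and it vanishes for any linearly dependent set.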
This is easy to demonstrate in three dimensions.

As in the differential geometry of three dimensions, we consider a curve c in N dimensions and define its properties in terms of the N-dimensional vectors {c′, c″, c‴, ···, c^(k)}, where the prime denotes differentiation with respect to the arc length s, the arc length traced by the curve from a chosen initial point. The curve is planar (embedded in N dimensions) if only the first two of these vectors are linearly independent, and so on. Thus the local dimensionality of the curve can be obtained by computing the above vectors successively and calculating the corresponding Gram determinant. We have done this for k = 2 and k = 3 for the linear and nonlinear networks.

For completeness we provide some of the equations used in the program below. Let ċ, c̈, c⃛ denote the first, second, and third derivatives of the concentration vector with respect to time, respectively. The arc length is defined by (ds)² = dc · dc; therefore ṡ = √(ċ · ċ). We repeatedly use the fact that f′(s) ≡ df/ds = ḟ/ṡ to obtain the derivatives with respect to the arc length in terms of the time derivatives. We have

c′(s) = ċ/ṡ

c″(s) = c̈/ṡ² − (s̈/ṡ³) ċ

c‴(s) = (ṡ c⃛ − s⃛ ċ)/ṡ⁴ − 3 (s̈/ṡ²) c″(s)

These can be used to calculate the derivatives of c with respect to the arc length s, knowing the time derivatives. Now the equation of motion yields ċ:

ċ_i = ∑_j M_ij c_j(t) + ∑_{j,k} B_ijk c_j(t) c_k(t)

for a system with a quadratic nonlinearity, as in the case of the NF-κB network, for example. For the linear network B vanishes. We can obtain the higher-order time derivatives of c by differentiating iteratively. The time derivatives of s are displayed below:

ṡ = √(ċ · ċ)

s̈ = (c̈ · ċ)/ṡ

s⃛ = (c⃛ · ċ + c̈ · c̈)/ṡ − (s̈/ṡ) s̈

Knowing the vectors c′, c″, c‴, the Gram determinant can be evaluated numerically using the dot products as a function of time.

Studying the role of modifications likely to occur in experiments.

The EGFR model of Kholodenko et al. (Ref. 27 in the main text) was used to study how effects likely to occur in experimental datasets, where it is not possible to assay all molecular species or to measure protein activation at very small time intervals, modify the ability of the principal components to capture the variance in the data sets. In particular, we wanted to investigate why the variation captured (71%) by the top three components for the experiments in Gaudet et al. (Ref. 8 in the main text) was significantly less than the variation explained by the top three components for the in-silico networks we studied. We compared our results with the experiments in Gaudet et al. (Ref. 8 in the main text), where the EGFR signaling network is one of the cross-reacting signaling networks. All the simulations were performed using 1200 different parameter sets.

(i) PCA was performed on a subset of species instead of all the species that are generated by this signaling network. This mimics experiments, where it is difficult to assay every molecular species that is generated by the biochemical reactions. Here, we considered the species EGF, Grb2, Grb2_Sos, Shc, Sos, and R as described in the BioNetGen file (http://bionetgen.org/index.php/Egfr_path) for the above network, which contains 18 species.

(ii) We performed PCA on the data with the concentrations chosen in (i) scaled by their concentrations at t=0. This probes the effect of using fold changes instead of absolute concentrations of proteins to measure activation, as done in the analysis of experiments by Gaudet et al. (Ref. 8 in the main text).
(iii) Next we probed the effect of choosing fewer data points in time, separated by larger time intervals than those used for the results in Fig. 2D. There were more than 1000 time points in each kinetic trajectory used for Fig. 2D. We used 13 time points: 0·tS/1440, 5·tS/1440, 15·tS/1440, 30·tS/1440, 60·tS/1440, 90·tS/1440, 120·tS/1440, 240·tS/1440, 480·tS/1440, 720·tS/1440, 960·tS/1440, 1200·tS/1440, and 1440·tS/1440, where tS denotes the time at which the system reaches the steady state. The specific time points were selected to better match the time points in the data given in Gaudet et al. (Ref. 8 in the main text).

(iv) We added random noise to the species concentrations. This was done to investigate the effect of random variability in the data arising from experimental measurements in Gaudet et al. (Ref. 8 in the main text). In our simulations, for each protein concentration obtained from the solution of the ODEs, a random number was drawn from either [0.001, 1] or [1, 1000] with equal probability and multiplied with the concentration.

Conclusion: Our results show that introducing the effects described above can in some cases reduce the percentage explained by the top eigenvectors. On the other hand, we find that, overall, the above modifications increased the percentage explained by the top eigenvector; e.g., the peak of the minimum-percentage-explained distributions increased by about 30%. It is possible that for a specific set of parameters the minimum percentage explained decreased after the modifications. However, even in that case, the minimum value of the minimum-percentage-explained distributions is around 40% for all the modifications. Therefore, the effects considered above are unlikely to underlie the rather low value observed in the experimental dataset given in Gaudet et al. (Ref. 8 in the main text), where the top three principal components capture only 71% of the total variance. Fig.
Fig. S11: Results showing the effects of different modifications in the EGFR network (Kholodenko et al.).

[Histogram panels, Experiments 0-6 (frequency vs. minimum percent explained by the top component), are not reproduced here.]

A. Result of PCA when all the species in the network described by Kholodenko et al. were considered. The time points used are the same as in Fig. 2D, and 1200 different parameter sets were used. This graph represents the 'control' in our numerical experiments.

B. (Left) Results from the modification described in (i). (Right) Distribution of the percent change for each parameter set when (i) was carried out; a positive change implies an increase in the percent explained when (i) was carried out, and vice versa.

C. Results from the simulations when both effects described in (i) and (ii) were included.

D. Results from simulations when both modifications described in (i) and (iii) were included.
E. Results from simulations when the modifications described in (i), (ii), and (iii) were included.

F. Results from simulations when the modifications described in (i), (ii), and (iv) were included.

G. Results from simulations when all four modifications described in (i), (ii), (iii), and (iv) were included.

Table II: Number of species and the time points used in the data matrix for the biological networks studied in the paper.

Network                                                               Number of species
Ras Activation Network                                                14
EGFR Signaling Network: smaller version (Kholodenko et al., Ref. 27)  18
EGFR Signaling Network: larger version (Blinov et al., Ref. 26)       356
NF-κB Activation Network (Ref. 29)                                    26

The smallest time intervals for calculating the species concentrations, and the time the system takes to reach the steady-state concentrations, were chosen using the largest and smallest time scales estimated from the rate constants and initial concentrations in a reaction network. Therefore, we had a large number (>10^3) of time points in our simulations. See the section "Details of the simulation method" and Table 1 in the supplementary material for more details.
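The unit-variance scaling examined in Fig. S12 amounts to dividing each species' trajectory by its variance over all time points before performing PCA. A minimal numpy sketch follows; the trajectory matrix here is synthetic random data, with 14 columns standing in for the Ras-network species.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic kinetic trajectories: rows = time points, columns = species.
traj = rng.uniform(0.1, 10.0, size=(500, 14))

# Scale each species by the variance of its own trajectory over time.
scaled = traj / traj.var(axis=0)

# PCA: eigenvalues of the covariance matrix of the scaled data
# (np.cov centers the data internally).
eigvals = np.linalg.eigvalsh(np.cov(scaled, rowvar=False))
eigvals = eigvals[::-1]  # sort descending

# cumulative percent of total variance explained by the top k components
percent = 100 * np.cumsum(eigvals) / eigvals.sum()
top1, top2, top3 = percent[0], percent[1], percent[2]
```

Here `top1`, `top2`, and `top3` correspond to the quantities histogrammed in panels (A), (B), and (C) of Fig. S12, computed for one trial.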
Fig. S12: Effect of unit-variance scaling on the percentage explained by the top principal components in the Ras activation network.

[Histogram panels (A)-(C) (frequency vs. minimum percent explained by the top one, two, and three components under unit-variance scaling) are not reproduced here.]

We considered the kinetics of the molecular species involved in the Ras activation network (Fig. S6A). Each species concentration at any time point was scaled by the variance for that particular species, calculated using the concentrations of that species at all the time points (from t=0 to the end time of the simulation) as independent variables. Then we performed principal component analysis on the scaled kinetic trajectories at all the time points, as described in the main text. We show the distributions of the minimum percentage explained by the top one, two, and three components, calculated over 130 trials. The minimum percentage explained by the top component (shown in (A)) peaked at a smaller value than in the case where the variables were not scaled (see Fig. 2C in the main text). However, when we included the top two (shown in (B)) and top three (shown in (C)) principal components, the principal components in the scaled dataset captured a significant amount of the variation in the original data set (peak at 98% for three components).

References

1. Das J, et al. (2009) Digital signaling and hysteresis characterize Ras activation in lymphoid cells. Cell 136(2):337-351.
2. Kholodenko BN, Demin OV, Moehren G, & Hoek JB (1999) Quantification of short term signaling by the epidermal growth factor receptor. J Biol Chem 274(42):30169-30181.
3. Blinov ML, Faeder JR, Goldstein B, & Hlavacek WS (2006) A network model of early events in epidermal growth factor receptor signaling that accounts for combinatorial complexity. Biosystems 83(2-3):136-151.
4. Hoffmann A, Levchenko A, Scott ML, & Baltimore D (2002) The IkappaB-NF-kappaB signaling module: temporal control and selective gene activation. Science 298(5596):1241-1245.
5. Blinov ML, Faeder JR, Goldstein B, & Hlavacek WS (2004) BioNetGen: software for rule-based modeling of signal transduction based on the interactions of molecular domains. Bioinformatics 20(17):3289-3291.
6. McKay MD, Beckman RJ, & Conover WJ (1979) A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21(2):239-245.