Metabolic Pathway Analysis: Elementary Modes

The technique of Elementary Flux Modes (EFM) was developed prior to extreme pathways (EP) by Stephan Schuster, Thomas Dandekar and co-workers:
Pfeiffer et al. Bioinformatics 15, 251 (1999)
Schuster et al. Nature Biotech. 18, 326 (2000)

The method is very similar to the "extreme pathway" method: both construct a basis for metabolic flux states using methods from convex algebra. Extreme pathways are a subset of elementary modes, and for many systems both methods coincide. Are the subtle differences important?

21. Lecture WS 2003/04 Bioinformatics III

Review: Metabolite Balancing

For analyzing a biochemical network, its structure is expressed by the stoichiometric matrix S, consisting of m rows corresponding to the substances (metabolites) and n columns corresponding to the reactions; each entry is the stoichiometric coefficient of a metabolite in a reaction. A vector v denotes the reaction rates (mmol per g dry weight per hour) and a vector c describes the metabolite concentrations.

Because of the high turnover of metabolite pools one often assumes a pseudo-steady state (c(t) = constant), leading to the fundamental Metabolic Balancing Equation:

$$\frac{dc(t)}{dt} = 0 = S\,v \qquad (1)$$

Flux distributions v satisfying this relationship lie in the null space of S and are able to balance all metabolites.
Klamt et al. Bioinformatics 19, 261 (2003)

Review: Metabolic flux analysis

Metabolic flux analysis (MFA): determine, preferably, all components of the flux distribution v in a metabolic network during a certain stationary growth experiment. Typically some measured or known rates must be provided to calculate the unknown rates.

Accordingly, v and S are partitioned into the known part (vb, Sb) and the unknown part (va, Sa). Equation (1) then leads to the central equation for MFA describing a flux scenario:

0 = S v = Sa va + Sb vb.

The rank of Sa determines whether this scenario is redundant and/or underdetermined. Redundant systems can be checked for inconsistencies.
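The rank-based classification of a flux scenario can be sketched numerically. This is only a minimal illustration, assuming NumPy; the function name `classify_scenario` and the two-metabolite toy network are hypothetical and are not the FluxAnalyzer example. The unknown rates are obtained from Sa va = -Sb vb with a pseudoinverse, which is one possible choice of solver.

```python
import numpy as np

def classify_scenario(S, known_idx, v_known):
    """Partition S v = 0 into S_a v_a = -S_b v_b and classify it by rank(S_a)."""
    S = np.asarray(S, dtype=float)
    unknown_idx = [j for j in range(S.shape[1]) if j not in known_idx]
    S_a, S_b = S[:, unknown_idx], S[:, known_idx]
    rank = np.linalg.matrix_rank(S_a)
    underdetermined = bool(rank < S_a.shape[1])  # some v_a not uniquely calculable
    redundant = bool(rank < S_a.shape[0])        # balances can be cross-checked
    # Least-squares (minimum-norm) solution; unique only if not underdetermined.
    v_a = -np.linalg.pinv(S_a) @ (S_b @ np.asarray(v_known, dtype=float))
    return underdetermined, redundant, v_a

# Hypothetical toy network: R0: -> A, R1: A -> B, R2: B ->, R3: A ->
S = [[1, -1,  0, -1],   # metabolite A
     [0,  1, -1,  0]]   # metabolite B

# R0 and R3 measured: the two remaining rates are determined.
under, red, v_a = classify_scenario(S, known_idx=[0, 3], v_known=[10.0, 4.0])

# Only R0 measured: three unknowns but rank(S_a) = 2, hence underdetermined.
under2, red2, _ = classify_scenario(S, known_idx=[0], v_known=[10.0])
```

With R0 and R3 measured, both balances are needed to fix R1 and R2, so the scenario is neither redundant nor underdetermined. Measuring a third rate would make it redundant.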
In underdetermined scenarios, only some elements of va are uniquely calculable.
Klamt et al. Bioinformatics 19, 261 (2003)

Software: FluxAnalyzer

A network project constructed by FluxAnalyzer. Here, vb consists of R1 and R2, and va of R3 - R7, of which R3, R4 and R7 can be computed.

Biomass component 1: BC1 [g] = 2 [mmol] A + 1 [mmol] C
Biomass component 2: BC2 [g] = 1 [mmol] C + 3 [mmol] D

Stoichiometric matrix S:

   R1   R2   R3   R4   R5   R6   R7   biomass synthesis
    1   -1    0   -1    0    0    0   0.8
    0    1   -1    0    0    0    0   0
    0    0    0    1   -1    1    0   1
    0    0    1    0    1   -1    1   1.8

Klamt et al. Bioinformatics 19, 261 (2003)

Review: structural network analysis (SNA)

Whereas MFA focuses on a single flux distribution, techniques of Structural (Stoichiometric, Topological) Network Analysis (SNA) address general topological properties, overall capabilities, and the inherent pathway structure of a metabolic network.

Basic topological properties are, e.g., conserved moieties.

Flux Balance Analysis (FBA) searches for single optimal flux distributions (mostly with respect to the synthesis of biomass) fulfilling S v = 0 and, additionally, reversibility and capacity restrictions for each reaction ($\alpha_i \le v_i \le \beta_i$).
Klamt et al. Bioinformatics 19, 261 (2003)

Review: Metabolic Pathway Analysis (MPA)

Metabolic Pathway Analysis searches for meaningful structural and functional units in metabolic networks. The most promising, very similar approaches are based on convex analysis and use the sets of elementary flux modes (Schuster et al. 1999, 2000) and extreme pathways (Schilling et al. 2000).

Both sets span the space of feasible steady-state flux distributions by non-decomposable routes, i.e. no proper subset of the reactions involved in an EFM or EP can hold the network balanced using non-trivial fluxes.

MPA can be used to study, e.g.:
- routing and flexibility/redundancy of networks
- functionality of networks
- identification of futile cycles
- all (sub)optimal pathways with respect to product/biomass yield
- calculability studies in MFA
Klamt et al. Bioinformatics 19, 261 (2003)

Elementary Flux Modes

Start from a list of reaction equations and a declaration of reversible and irreversible reactions and of internal and external metabolites.

Example: the reaction scheme of monosaccharide metabolism (Fig. 1). It includes 15 internal metabolites and 19 reactions, so S has dimension 15 × 19.

It is convenient to reduce this matrix by lumping those reactions that necessarily operate together: {Gap,Pgk,Gpm,Eno,Pyk} and {Zwf,Pgl,Gnd}. Such groups of enzymes can be detected automatically. This reveals another two sequences, {Fba,TpiA} and {2 Rpe,TktI,Tal,TktII}.
Schuster et al. Nature Biotech 18, 326 (2000)

Elementary Flux Modes

Lumping the reactions in each sequence gives a reduced system. Construct the initial tableau T(0) by combining the identity matrix (left part, columns 1-9, one per reaction) with the transposed stoichiometric matrix (right part, columns 10-14, one per internal metabolite):

T(0):
                            left part    Ru5P  FP2  F6P  GAP  R5P
  reversible
    Pgi                     1 0 ... 0      0    0    1    0    0
    {Fba,TpiA}              0 1 ... 0      0   -1    0    2    0
    Rpi                     0 0 ... 0     -1    0    0    0    1
    {2Rpe,TktI,Tal,TktII}   0 0 ... 0     -2    0    2    1   -1
  irreversible
    {Gap,Pgk,Gpm,Eno,Pyk}   0 0 ... 0      0    0    0   -1    0
    {Zwf,Pgl,Gnd}           0 0 ... 0      1    0    0    0    0
    Pfk                     0 0 ... 0      0    1   -1    0    0
    Fbp                     0 0 ... 0      0   -1    1    0    0
    Prs_DeoB                0 0 ... 1      0    0    0    0   -1

Schuster et al. Nature Biotech 18, 326 (2000)

Elementary Flux Modes

Aim again: bring all entries of the right part of the matrix to 0.

E.g. 2*row3 - row4 gives a "reversible" row with 0 in column 10. New "irreversible" rows with a 0 entry in column 10 are obtained by row3 + row6 and by row4 + 2*row6.

In general, linear combinations of 2 rows of the same type of directionality go into the part of the respective type in the tableau. Combinations of rows of different types go into the "irreversible" part because at least 1 reaction is irreversible. Irreversible reactions can only be combined using positive coefficients.

T(1) (left part abbreviated to the combination of T(0) rows):
                                          Ru5P  FP2  F6P  GAP  R5P
  reversible
    Pgi                                     0    0    1    0    0
    {Fba,TpiA}                              0   -1    0    2    0
    2 Rpi - {2Rpe,TktI,Tal,TktII}           0    0   -2   -1    3
  irreversible
    {Gap,Pgk,Gpm,Eno,Pyk}                   0    0    0   -1    0
    Pfk                                     0    1   -1    0    0
    Fbp                                     0   -1    1    0    0
    Prs_DeoB                                0    0    0    0   -1
    Rpi + {Zwf,Pgl,Gnd}                     0    0    0    0    1
    {2Rpe,TktI,Tal,TktII} + 2 {Zwf,Pgl,Gnd} 0    0    2    1   -1

Schuster et al. Nature Biotech 18, 326 (2000)

Elementary Flux Modes

Aim: zero column 11. Include all possible (direction-wise allowed) linear combinations of rows.

T(2) (left part abbreviated to the combination of T(0) rows):
                                          Ru5P  FP2  F6P  GAP  R5P
  reversible
    Pgi                                     0    0    1    0    0
    2 Rpi - {2Rpe,TktI,Tal,TktII}           0    0   -2   -1    3
  irreversible
    {Gap,Pgk,Gpm,Eno,Pyk}                   0    0    0   -1    0
    Prs_DeoB                                0    0    0    0   -1
    Rpi + {Zwf,Pgl,Gnd}                     0    0    0    0    1
    {2Rpe,TktI,Tal,TktII} + 2 {Zwf,Pgl,Gnd} 0    0    2    1   -1
    {Fba,TpiA} + Pfk                        0    0   -1    2    0
    -{Fba,TpiA} + Fbp                       0    0    1   -2    0
    Pfk + Fbp                               0    0    0    0    0

Continue with columns 12-14.
Schuster et al. Nature Biotech 18, 326 (2000)

Elementary Flux Modes

In the course of the algorithm, one must avoid
- calculation of nonelementary modes (rows that contain fewer zeros than a row already present)
- duplicate modes (a pair of rows is only combined if it fulfills the condition $S(m_i^{(j)}) \cap S(m_k^{(j)}) \not\subseteq S(m_l^{(j+1)})$, where $S(m_l^{(j+1)})$ is the set of positions of 0 in this row)
- flux modes violating the sign restriction for the irreversible reactions.

Final tableau T(5) (left part; the right part is all zeros):

   Pgi  {Fba,TpiA}  Rpi  {2Rpe,TktI,Tal,TktII}  {Gap,Pgk,Gpm,Eno,Pyk}  {Zwf,Pgl,Gnd}  Pfk  Fbp  Prs_DeoB
    1       1        0            0                       2                  0         1    0      0
   -2       0        1            1                       1                  3         0    0      0
    0       2        1            1                       5                  3         2    0      0
    0       0        1            0                       0                  1         0    0      1
    5       1        4           -2                       0                  0         1    0      6
   -5      -1        2            2                       0                  6         0    1      0
    0       0        0            0                       0                  0         1    1      0

This shows that the number of rows may decrease or increase in the course of the algorithm. All constructed elementary modes are irreversible.
Schuster et al. Nature Biotech 18, 326 (2000)
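The tableau can be checked numerically. The following sketch, assuming NumPy, takes the right part of T(0) and the seven rows of the final tableau as read from the slides (the leading 0 of the third mode is assumed to have been lost in extraction) and tests steady state and sign feasibility. The rank-based nullity check is a swapped-in criterion, not the zero-set condition used by the tableau algorithm: nullity 1 on the support of a mode implies that no other balanced flux vector fits inside the same set of reactions.

```python
import numpy as np

reactions = ["Pgi", "{Fba,TpiA}", "Rpi", "{2Rpe,TktI,Tal,TktII}",
             "{Gap,Pgk,Gpm,Eno,Pyk}", "{Zwf,Pgl,Gnd}", "Pfk", "Fbp", "Prs_DeoB"]
irreversible = [4, 5, 6, 7, 8]          # indices of the irreversible rows

# Right part of T(0): rows = reactions, columns = Ru5P, FP2, F6P, GAP, R5P
S_T = np.array([
    [ 0,  0,  1,  0,  0],
    [ 0, -1,  0,  2,  0],
    [-1,  0,  0,  0,  1],
    [-2,  0,  2,  1, -1],
    [ 0,  0,  0, -1,  0],
    [ 1,  0,  0,  0,  0],
    [ 0,  1, -1,  0,  0],
    [ 0, -1,  1,  0,  0],
    [ 0,  0,  0,  0, -1],
])

# Left part of the final tableau: one elementary mode per row
modes = np.array([
    [ 1,  1, 0,  0, 2, 0, 1, 0, 0],
    [-2,  0, 1,  1, 1, 3, 0, 0, 0],
    [ 0,  2, 1,  1, 5, 3, 2, 0, 0],   # leading 0 assumed
    [ 0,  0, 1,  0, 0, 1, 0, 0, 1],
    [ 5,  1, 4, -2, 0, 0, 1, 0, 6],
    [-5, -1, 2,  2, 0, 6, 0, 1, 0],
    [ 0,  0, 0,  0, 0, 0, 1, 1, 0],
])

# One tableau step: 2*row3 - row4 (1-based) zeroes the Ru5P column.
combined = 2 * S_T[2] - S_T[3]

# Steady state: every metabolite balance of every mode is zero.
balanced = np.all(modes @ S_T == 0, axis=1)
# Feasibility: irreversible reactions carry non-negative flux.
feasible = np.all(modes[:, irreversible] >= 0, axis=1)

def nullity_on_support(mode):
    """Dimension of the null space of S restricted to the active reactions."""
    support = np.flatnonzero(mode)
    sub = S_T[support].T                # metabolites x active reactions
    return len(support) - np.linalg.matrix_rank(sub)

nullities = [nullity_on_support(m) for m in modes]
```

All seven rows pass the three checks, which supports the reading of the tableau columns as Ru5P, FP2, F6P, GAP, R5P.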
Elementary Flux Modes

Fig. 2: Graphical representation of the elementary flux modes of the monosaccharide metabolism. The numbers indicate the relative flux carried by the enzymes.
Schuster et al. Nature Biotech 18, 326 (2000)

Two approaches for Metabolic Pathway Analysis?

The pathway P(e) is an elementary flux mode if it fulfills conditions C1 - C3.

(C1) Pseudo steady-state: S e = 0. This ensures that none of the metabolites is consumed or produced in the overall stoichiometry.

(C2) Feasibility: rate $e_i \ge 0$ if reaction i is irreversible. This demands that only thermodynamically realizable fluxes are contained in e.

(C3) Non-decomposability: there is no vector v (unequal to the zero vector and to e) that fulfills C1 and C2 and for which P(v) is a proper subset of P(e). This is the core characteristic of EFMs and EPs and supplies the decomposition of the network into the smallest units able to hold the network in steady state.

C3 is often called "genetic independence" because it implies that the enzymes in one EFM or EP are not a subset of the enzymes from another EFM or EP.
Klamt & Stelling Trends Biotech 21, 64 (2003)

Two approaches for Metabolic Pathway Analysis?

The pathway P(e) is an extreme pathway if it fulfills conditions C1 - C3 AND conditions C4 - C5.

(C4) Network reconfiguration: each reaction must be classified either as an exchange flux or as an internal reaction. All reversible internal reactions must be split up into two separate, irreversible reactions (forward and backward reaction).

(C5) Systemic independence: the set of EPs in a network is the minimal set of EFMs that can describe all feasible steady-state flux distributions.
Klamt & Stelling Trends Biotech 21, 64 (2003)

Two approaches for Metabolic Pathway Analysis?
[Figure: example network. Labels: A(ext), B(ext), C(ext); A, B, C, D, P; reactions R1 - R9.]
Klamt & Stelling Trends Biotech 21, 64 (2003)

Reconfigured Network

[Figure: reconfigured network. Labels: A(ext), B(ext), C(ext); A, B, C, D, P; reactions R1 - R6, R7f, R7b, R8, R9.]

3 EFMs are not systemically independent:
EFM1 = EP4 + EP5
EFM2 = EP3 + EP5
EFM4 = EP2 + EP3
Klamt & Stelling Trends Biotech 21, 64 (2003)

Property 1 of EFMs

The only difference in the set of EFMs emerging upon reconfiguration consists in the two-cycles that result from splitting up reversible reactions. However, two-cycles are not considered meaningful pathways.

Valid for any network:
Property 1: Reconfiguring a network by splitting up reversible reactions leads to the same set of meaningful EFMs.
Klamt & Stelling Trends Biotech 21, 64 (2003)

Software: FluxAnalyzer

What is the consequence when all exchange fluxes (and hence all reactions in the network) are irreversible?

EFMs and EPs always coincide!
Klamt & Stelling Trends Biotech 21, 64 (2003)

Property 2 of EFMs

Property 2: If all exchange reactions in a network are irreversible, then the sets of meaningful EFMs (both in the original and in the reconfigured network) and EPs coincide.
Klamt & Stelling Trends Biotech 21, 64 (2003)

Comparison of EFMs and EPs

Problem: Recognition of operational modes: routes for converting exclusively A to P.
EFM (network N1): 4 genetically independent routes (EFM1-EFM4).
EP (network N2): The set of EPs does not contain all genetically independent routes. Searching for EPs leading from A to P via B, no pathway would be found.

Problem: Finding all the optimal routes: optimal pathways for synthesizing P during growth on A alone.
EFM (network N1): EFM1 and EFM2 are optimal because they yield one mole P per mole substrate A (i.e. R3/R1 = 1), whereas EFM3 and EFM4 are only suboptimal (R3/R1 = 0.5).
EP (network N2): One would only find the suboptimal EP1, not the optimal routes EFM1 and EFM2.

Problem: Analysis of network flexibility (structural robustness, redundancy): relative robustness of exclusive growth on A or B.
EFM (network N1): 4 pathways convert A to P (EFM1-EFM4), whereas for B only one route (EFM8) exists. When one of the internal reactions (R4-R9) fails, 2 pathways for production of P from A will always "survive". By contrast, removing reaction R8 already stops the production of P from B alone.
EP (network N2): Only 1 EP exists for producing P from substrate A alone, and 1 EP for synthesizing P from (only) substrate B. One might suggest that both substrates possess the same redundancy of pathways, but as shown by EFM analysis, growth on substrate A is much more flexible than on B.

Problem: Relative importance of single reactions: relative importance of reaction R8.
EFM (network N1): R8 is essential for producing P from substrate B, whereas for A there is no structurally "favored" reaction (R4-R9 all occur twice in EFM1-EFM4). However, considering the optimal modes EFM1 and EFM2, one recognizes the importance of R8 also for growth on A.
EP (network N2): Consider again biosynthesis of P from substrate A (EP1 only). Because R8 is not involved in EP1, one might think that this reaction is not important for synthesizing P from A. However, without this reaction it is impossible to obtain optimal yields (1 P per A; EFM1 and EFM2).

Problem: Enzyme subsets and excluding reaction pairs: suggest regulatory structures or rules.
EFM (network N1): R6 and R9 are an enzyme subset. By contrast, R6 and R9 never occur together with R8 in an EFM. Thus (R6,R8) and (R8,R9) are excluding reaction pairs. (In an arbitrary composable steady-state flux distribution they might occur together.)
EP (network N2): The EPs wrongly suggest that R4 and R8 are an excluding reaction pair, but they are not (EFM2). The enzyme subsets would be correctly identified. However, one can construct simple examples where the EPs would also suggest wrong enzyme subsets (not shown).

Problem: Pathway length: shortest/longest pathway for production of P from A.
EFM (network N1): The shortest pathway from A to P needs 2 internal reactions (EFM2), the longest 4 (EFM4).
EP (network N2): Neither the shortest (EFM2) nor the longest (EFM4) pathway from A to P is contained in the set of EPs.

Problem: Removing a reaction and mutation studies: effect of deleting R7.
EFM (network N1): All EFMs not involving the specific reactions build up the complete set of EFMs in the new (smaller) subnetwork. If R7 is deleted, EFMs 2, 3, 6, 8 "survive". Hence the mutant is viable.
EP (network N2): Analyzing a subnetwork implies that the EPs must be newly computed. E.g. when deleting R2, EFM2 would become an EP. For this reason, mutation studies cannot be performed easily.

Klamt & Stelling Trends Biotech 21, 64 (2003)
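Enzyme subsets and excluding reaction pairs, as used in the comparison above, can be read directly off the supports of the EFMs. A minimal sketch in plain Python follows; the three modes and the reaction names are hypothetical toy data (not the EFMs of network N1), and the helper names are illustrative only.

```python
from itertools import combinations

# Hypothetical EFMs, each given only by its support (set of active reactions).
efms = {
    "M1": {"R1", "R2", "R3"},
    "M2": {"R1", "R4", "R5"},
    "M3": {"R6", "R4", "R5"},
}

def occurrence(efms):
    """Map each reaction to the set of modes in which it is active."""
    reactions = sorted(set().union(*efms.values()))
    return {r: frozenset(m for m, support in efms.items() if r in support)
            for r in reactions}

def enzyme_subsets(efms):
    """Pairs of reactions that are active in exactly the same modes."""
    occ = occurrence(efms)
    return [(a, b) for a, b in combinations(occ, 2) if occ[a] == occ[b]]

def excluding_pairs(efms):
    """Pairs of reactions that never occur together in any mode."""
    occ = occurrence(efms)
    return [(a, b) for a, b in combinations(occ, 2) if not (occ[a] & occ[b])]

subsets = enzyme_subsets(efms)
excluding = excluding_pairs(efms)
```

Because the test runs over the complete set of EFMs, a pair is reported as excluding only if no elementary mode uses both reactions, which is the property that a reduced set of pathways can misjudge.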
Comparison of EFMs and EPs

Problem: Constraining reaction reversibility: effect of R7 limited to B → C.
EFM (network N1): For the case of R7, all EFMs but EFM1 and EFM7 "survive" because the latter ones utilize R7 with negative rate.
EP (network N2): In general, the set of EPs must be recalculated: compare the EPs in network N2 (R2 reversible) and N4 (R2 irreversible).

Klamt & Stelling Trends Biotech 21, 64 (2003)

Software: FluxAnalyzer

FluxAnalyzer has both EPs and EFMs implemented. It allows convenient studies of metabolic systems.
Klamt et al. Bioinformatics 19, 261 (2003)

Software: FluxAnalyzer

[Figure: representation of the stoichiometric matrix.]
Klamt et al. Bioinformatics 19, 261 (2003)

Application of elementary modes

Metabolic network structure of E. coli determines key aspects of functionality and regulation.

Compute EFMs for the central metabolism of E. coli.

Catabolic part: substrate uptake reactions, glycolysis, pentose phosphate pathway, TCA cycle, excretion of by-products (acetate, formate, lactate, ethanol).

Anabolic part: conversions of precursors into building blocks like amino acids, to macromolecules, and to biomass.
Stelling et al. Nature 420, 190 (2002)

Metabolic network topology and phenotype

The total number of EFMs for given conditions is used as a quantitative measure of metabolic flexibility.

Figure, panel a: relative number of EFMs, N, enabling deletion mutants in gene i (Δi) of E. coli to grow (growth abbreviated by µ) for 90 different combinations of mutation and carbon source. The solid line separates experimentally determined mutant phenotypes, namely inviability (1-40) from viability (41-90).

The number of EFMs for a mutant strain allows correct prediction of the growth phenotype in more than 90% of the cases.
Stelling et al. Nature 420, 190 (2002)
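The use of the EFM count as a flexibility measure can be illustrated with a small sketch, assuming NumPy. The EFM matrix, the reaction names and the function `relative_growth_modes` are hypothetical; the viability rule (a mutant is predicted viable if at least one growth-supporting mode survives the deletion) is stated here as an assumption of the sketch.

```python
import numpy as np

# Hypothetical EFM matrix: rows = modes, columns = reactions.
reactions = ["uptake", "r1", "r2", "r3", "growth"]
efms = np.array([
    [1, 1, 0, 0, 1],   # growth via r1
    [1, 0, 1, 0, 1],   # growth via r2
    [1, 0, 0, 1, 0],   # by-product route without growth
])

def relative_growth_modes(efms, reactions, deleted, growth="growth"):
    """Share of growth-supporting modes that do not use the deleted reaction."""
    g, d = reactions.index(growth), reactions.index(deleted)
    grows = efms[:, g] != 0          # modes with non-zero growth flux
    survives = efms[:, d] == 0       # modes that do not need the deleted reaction
    n_wild_type = int(grows.sum())
    n_mutant = int((grows & survives).sum())
    return n_mutant / n_wild_type

ratios = {r: relative_growth_modes(efms, reactions, r)
          for r in ["uptake", "r1", "r2", "r3"]}
# Assumed rule: viable if at least one growth-supporting mode remains.
predicted_viable = {r: ratio > 0 for r, ratio in ratios.items()}
```

In this toy case only the uptake deletion removes every growth-supporting mode, so it is the only deletion predicted to be inviable.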
Robustness analysis

The number of EFMs qualitatively indicates whether a mutant is viable or not, but does not describe quantitatively how well a mutant grows.

Define the maximal biomass yield Ymax as the optimum of:

$$Y_{i,X/S_k} = \frac{e_i^{\mu}}{e_i^{S_k}}$$

where $e_i^{\mu}$ and $e_i^{S_k}$ are the single reaction rates (growth and substrate uptake) in EFM i selected for utilization of substrate $S_k$.
Stelling et al. Nature 420, 190 (2002)

Software: FluxAnalyzer

[Figure: dependency of the mutants' maximal growth yield Ymax(Δi) (open circles) and the network diameter D(Δi) (open squares) on the share of elementary modes operational in the mutants. Data were binned to reduce noise.]
Stelling et al. Nature 420, 190 (2002)

Central metabolism of E. coli behaves in a highly robust manner because mutants with significantly reduced metabolic flexibility show a growth yield similar to wild type.

Growth-supporting elementary modes

[Figure: distribution of growth-supporting elementary modes in wild type (rather than in the mutants), that is, the share of modes having a specific biomass yield (the dotted line indicates equal distribution).]
Stelling et al. Nature 420, 190 (2002)

Multiple, alternative pathways exist with identical biomass yield.

Can regulation be predicted by EFM analysis?

Assume that optimization during biological evolution can be characterized by the two objectives of flexibility (associated with robustness) and of efficiency.

Flexibility means the ability to adapt to a wide range of environmental conditions, that is, to realize a maximal bandwidth of thermodynamically feasible flux distributions (maximizing the number of EFMs).

Efficiency could be defined as fulfilment of cellular demands with an optimal outcome, such as maximal cell growth, using a minimum of constitutive elements (genes and proteins, thus minimizing the number of EFMs).

These 2 criteria pose contradictory challenges.
Optimal cellular regulation needs to find a trade-off.
Stelling et al. Nature 420, 190 (2002)

Can regulation be predicted by EFM analysis?

Compute control-effective fluxes for each reaction l by determining the efficiency of any EFM $e_i$, relating the system's output to the substrate uptake and to the sum of all absolute fluxes. With flux modes normalized to the total substrate uptake, the efficiencies $\varepsilon_i(S_k, \cdot)$ for the two targets of optimization, growth and ATP generation, are defined as:

$$\varepsilon_i(S_k,\mu) = \frac{e_i^{\mu}}{\sum_l |e_i^l|} \quad\text{and}\quad \varepsilon_i(S_k,ATP) = \frac{e_i^{ATP}}{\sum_l |e_i^l|}$$

Control-effective fluxes $v_l(S_k)$ are obtained by averaged weighting of the product of reaction-specific fluxes and mode-specific efficiencies over all EFMs using the substrate under consideration:

$$v_l(S_k) = \frac{1}{Y^{max}_{X/S_k}} \cdot \frac{\sum_i \varepsilon_i(S_k,\mu)\,|e_i^l|}{\sum_i \varepsilon_i(S_k,\mu)} + \frac{1}{Y^{max}_{A/S_k}} \cdot \frac{\sum_i \varepsilon_i(S_k,ATP)\,|e_i^l|}{\sum_i \varepsilon_i(S_k,ATP)}$$

$Y^{max}_{X/S_k}$ and $Y^{max}_{A/S_k}$ are the optimal yields of biomass production and of ATP synthesis.

Control-effective fluxes represent the importance of each reaction for efficient and flexible operation of the entire network.
Stelling et al. Nature 420, 190 (2002)

Prediction of gene expression patterns

As cellular control on longer timescales is predominantly achieved by genetic regulation, the control-effective fluxes should correlate with messenger RNA levels.

Compute theoretical transcript ratios Θ(S1,S2) for growth on two alternative substrates S1 and S2 as ratios of control-effective fluxes.

Compare to experimental DNA-microarray data for E. coli growing on glucose, glycerol, and acetate.

Excellent correlation!

[Figure: calculated ratios between gene expression levels during exponential growth on acetate and exponential growth on glucose (filled circles indicate outliers), based on all elementary modes, versus experimentally determined transcript ratios. Lines indicate 95% confidence intervals for experimental data (horizontal lines), linear regression (solid line), perfect match (dashed line) and two-fold deviation (dotted line).]
Stelling et al. Nature 420, 190 (2002)

Prediction of transcript ratios

[Figure: predicted transcript ratios for acetate versus glucose for which, in contrast to panel a, only the two elementary modes with highest biomass and ATP yield (optimal modes) were considered.]

This plot shows only weak correlation. This corresponds to the approach followed by Flux Balance Analysis.
Stelling et al. Nature 420, 190 (2002)

Summary

EFM analysis is a robust method that offers great opportunities for studying functional and structural properties of metabolic networks.

Klamt & Stelling suggest that the term "elementary flux modes" should be used whenever the sets of EFMs and EPs are identical. In cases where they are not, EPs are a subset of EFMs.

It remains to be understood more thoroughly how much valuable information about the pathway structure is lost by using EPs.

Ongoing challenges:
- study really large metabolic systems by subdividing them
- combine the metabolic model with a model of cellular regulation.
Klamt & Stelling Trends Biotech 21, 64 (2003)
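As a supplement to the slides on regulation, the control-effective fluxes can be sketched in a few lines, assuming NumPy. The sketch assumes that each efficiency is the mode's output (growth or ATP) divided by the sum of its absolute fluxes, and that the control-effective flux is the efficiency-weighted average of absolute fluxes scaled by the optimal yield. The function name, the column layout and the two toy modes are hypothetical, and every mode is assumed to take up the substrate.

```python
import numpy as np

def control_effective_fluxes(E, uptake_idx, growth_idx, atp_idx):
    """Control-effective flux of every reaction for one substrate.

    E: EFM matrix (rows = modes, columns = reactions); every mode is
    assumed to have a non-zero substrate uptake.
    """
    E = np.asarray(E, dtype=float)
    E = E / np.abs(E[:, [uptake_idx]])        # normalize to unit substrate uptake
    A = np.abs(E)                             # absolute fluxes |e_i^l|
    total = A.sum(axis=1)                     # sum of absolute fluxes per mode
    eps_mu = E[:, growth_idx] / total         # growth efficiency per mode
    eps_atp = E[:, atp_idx] / total           # ATP efficiency per mode
    y_mu = E[:, growth_idx].max()             # optimal biomass yield
    y_atp = E[:, atp_idx].max()               # optimal ATP yield
    return ((eps_mu @ A) / (y_mu * eps_mu.sum())
            + (eps_atp @ A) / (y_atp * eps_atp.sum()))

# Hypothetical modes; columns: substrate uptake, growth, ATP generation
E = [[1, 1, 0],
     [1, 0, 2]]
v = control_effective_fluxes(E, uptake_idx=0, growth_idx=1, atp_idx=2)
```

A theoretical transcript ratio for two substrates would then be the ratio of the two control-effective fluxes of the same reaction, each computed from the EFMs that use the respective substrate.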