
Towards Identifying Lateral Gene Transfer Events
 L. Addario-Berry, M. Hallett, J. Lagergren


                       Presented By: Jeff Mathew
Roadmap
 Key terms
 τ-transfer problem
 H-moves and I-moves algorithm
 Tree generation for simulation
 Experimental results
 Conclusions and future work
Lateral transfer scenario
 LGT (lateral gene transfer) = HGT (horizontal gene transfer)
 Root of scenario tree must correspond to root
  of gene tree
 The scenario tree is connected and respects the
  direction of evolution implied by the arcs of T and
  S.
α-activity
 An α-active scenario for a gene tree and
  species tree allows at most α copies of
  a gene to exist simultaneously in the
  genome of an ancestral taxon.
 Authors focus on 1-active scenarios
  though intractability results have been
  proved earlier for α ≥ 1.
τ-transfer problem
   Input: Species tree S, gene tree T, integer τ

   Output: A τ* lateral transfer scenario for S and T,
    τ* ≤ τ

   Intractability result
    ◦ The decision version of the α-Active, τ-Transfer
      Problem (does there exist an α-active scenario with
      cost ≤ τ?) is NP-complete.

   τ is the number of lateral transfer events needed
    to explain the difference between S and T
Algorithm
   Two-phase approach
   Phase 1
    ◦ While H-fat or I-fat vertices remain
      Perform H-fat move or I-fat move
   At the end of phase 1, we are guaranteed
    that the scenario is 1-active. What about
    cycles?
   Phase 2
    ◦ Remove minimum number of LGT events from
      each candidate to make it acyclic.
   Running time: O(2^{4τ} n^2)
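
What Phase 2 asks for can be illustrated with a brute-force sketch in Python. This is not the authors' procedure: the graph encoding (arcs as pairs of vertex names, split into fixed arcs and transfer arcs) and the function names are assumptions made for this example only. It tries subsets of transfer arcs in order of increasing size, so the first subset whose removal leaves the graph acyclic is a minimum one.

```python
from itertools import combinations


def has_cycle(vertices, arcs):
    """Return True if the directed graph (vertices, arcs) contains a cycle."""
    successors = {v: [] for v in vertices}
    for u, v in arcs:
        successors[u].append(v)
    state = {v: 0 for v in vertices}  # 0 = unvisited, 1 = on stack, 2 = done

    def visit(u):
        state[u] = 1
        for w in successors[u]:
            # An arc back to a vertex on the current stack closes a cycle.
            if state[w] == 1 or (state[w] == 0 and visit(w)):
                return True
        state[u] = 2
        return False

    return any(state[v] == 0 and visit(v) for v in vertices)


def min_transfers_to_remove(vertices, fixed_arcs, transfer_arcs):
    """Smallest set of transfer arcs whose removal leaves the graph acyclic.

    Hypothetical helper: exponential in the number of transfer arcs,
    intended only to show what "minimum removal" means.
    """
    for k in range(len(transfer_arcs) + 1):
        for removed in combinations(transfer_arcs, k):
            kept = [a for a in transfer_arcs if a not in removed]
            if not has_cycle(vertices, list(fixed_arcs) + kept):
                return list(removed)
    return None  # the fixed arcs alone already contain a cycle
```

For example, with fixed arcs a→b and b→c and transfer arcs c→a and a→c, only c→a closes a cycle, so it is the single arc removed.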
Simulating species trees
 Create a random species tree S on n leaves
  with Θ(log n) expected depth.
 S is supposed to reflect the actual
  evolutionary relationships between taxa
    ◦ S is ultrametric. Therefore, edge-weights
      correspond to time.
    ◦ Randomly assign weights to every edge such
      that every root-to-leaf path has weighted sum
      1.
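
One way to obtain such a tree is sketched below in Python. The slides do not say which random tree model the authors used; uniform leaf splitting is an assumption chosen here because it yields logarithmic expected depth. Each vertex gets a time in [0, 1], with the root at 0 and every leaf at 1, so the edge weights (time differences) sum to 1 along every root-to-leaf path. All names are hypothetical.

```python
import random


def random_ultrametric_tree(n, rng):
    """Build a random binary tree on n leaves with root-to-leaf weight 1.

    Returns (children, time): children maps each internal vertex to its two
    children, and time maps each vertex to its distance from the root.
    """
    children = {}
    time = {0: 0.0}  # vertex 0 is the root
    leaves = [0]
    next_id = 1
    while len(leaves) < n:
        # Split a uniformly chosen leaf into two new leaves.
        parent = leaves.pop(rng.randrange(len(leaves)))
        left, right = next_id, next_id + 1
        next_id += 2
        children[parent] = (left, right)
        # Provisional times after the parent; final leaf times are set below.
        for child in (left, right):
            time[child] = rng.uniform(time[parent], 1.0)
            leaves.append(child)
    for leaf in leaves:
        time[leaf] = 1.0  # all leaves live at the present, time 1
    return children, time


def edge_weights(children, time):
    """Edge weight = time of the child minus time of the parent."""
    return {(p, c): time[c] - time[p]
            for p, kids in children.items() for c in kids}
```

Calling `random_ultrametric_tree(8, random.Random(0))` returns a tree with 8 leaves; passing the generator explicitly keeps runs reproducible.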
Simulating gene trees
 Begin with generated ultrametric species tree
 Lateral transfer events occur according to a
  Poisson process with mean rate λ
 Moving from root to leaves, for each vertex x0
  with children x1 and x2, examine both edges
    ◦ If the Poisson process provides us with a lateral
      transfer event along (x0, x1), we add it and point it to
      a randomly chosen edge alive at that point in time.
    ◦ Else add a speciation event for x1
    ◦ Repeat the analysis for (x0, x2)
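
The per-edge decision above can be sketched in Python as follows. The sketch assumes each vertex carries a time (as in an ultrametric species tree), looks only at the first Poisson event along the edge, and excludes the edge itself as a transfer target; these choices and all names are assumptions for illustration, not details taken from the paper.

```python
import random


def sample_transfer(edge, time, all_edges, lam, rng):
    """Decide whether a lateral transfer occurs along `edge`.

    Returns (when, target_edge) for a transfer, or None when the edge ends
    in an ordinary speciation event at its lower endpoint.
    """
    parent, child = edge
    # Waiting time to the first event of a Poisson process with rate lam.
    when = time[parent] + rng.expovariate(lam)
    if when >= time[child]:
        return None  # no event before the edge ends
    # Edges other than `edge` that are alive at time `when`.
    alive = [(p, c) for (p, c) in all_edges
             if (p, c) != edge and time[p] <= when < time[c]]
    if not alive:
        return None  # nowhere to transfer to
    return when, rng.choice(alive)
```

A full simulation would call this for (x0, x1) and then for (x0, x2) at every internal vertex, moving from the root towards the leaves.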
Degenerate Cases
 Simulation can result in plausible biological events
  that are not detectable by the algorithm.
 Useless transfers: LGTs that don’t change the
  gene tree
 Transfer-loss events: one child of a node is an LGT
  event and the other child is a loss event.
Results
 Plot legend:
             Ω   = number of repetitions
             τ   = true number of LGT events
             τ'  = cost of the minimum-cost LGT scenario found by the algorithm
             λ   = mean rate of LGTs from the Poisson process
Finding the saturation point
   The point at which the average τ' stops
    increasing.
   Random trees from a large pool were
    chosen as gene trees and species trees
    ◦ Trials suggest that the saturation point is slightly
      above n/2, i.e., when τ > n/2, the algorithm stops
      detecting new LGT events
   Thus, if τ' > n/2, the correspondence
    between T and S via LGT events is not very
    meaningful.
Conclusions
 Empirically verified feasibility of the
  τ-transfer algorithm
 Degenerate events such as transfer-loss
  events that result in over-estimates of
  transfers occur with low probability
 Achieved near-optimal scenarios when λ is
  low enough not to cause saturation
 The cycle elimination phase of the algorithm
  is very rarely needed in practice, implying an
  O(2^{2τ} n^2) running time.
Future work and open problems
   Use weighted gene trees and species trees
    ◦ Species trees are nearly ultrametric while gene trees
      are not
 Do fast algorithms exist when the input is a set of
  gene trees with no species tree?
 Tractability on larger phylogenies
 Can we consider gene duplication, lateral gene
  transfers, and other events simultaneously?
 Can we use probabilistic models that assign
  likelihoods to various events and optimize
  over such models in a tractable manner?

								