Genome dynamics in Bacillus megaterium



What genomic sequencing tells us about the genetic forces
             that shape Bacillus genomes




                       October 29, 2009
                  Dept. of Biological Sciences
                              NIU
                  The Genus Bacillus
•   Gram-positive, aerobic endospore-
    forming rod-shaped bacteria
•   “Normal” habitat: the soil (plus lots
    of other places)
     – Mostly mesophilic, but some grow
       as low as 0 °C and as high as 65 °C.
•   Pathogens: B. anthracis and B.
    cereus
•   Industrial uses: enzyme production,
    Bt insecticidal corn
•   Endospores very resistant to heat
    and chemicals
Relatives among the Firmicutes
                         A Bit of History
•Bacillus subtilis was originally named
Vibrio subtilis by Christian
Gottfried Ehrenberg in 1835. He
was the first to use the name
“bacteria”.
•Ferdinand Cohn (1872) renamed
the species Bacillus subtilis, as part
of his description of bacteria by their
shape (“bacillus” = “little stick”).
     --He is also responsible for
     bacteria being considered plants
     and not animals
•Robert Koch first showed that a
specific bacterium caused a specific
disease: B. anthracis and anthrax.
(1876)
•B. megaterium was first described
by Heinrich Anton de Bary in 1884.
      Bacillus’s Position in the Tree of Life
•   Anything called “Bacillus” in the 1800s would
    now be a member of the Firmicutes (“strong
    skin”), a phylum that contains the Gram-
    positive low G+C bacteria.
•   An alternative model, based on indels in
    universal genes, puts the Firmicutes near the
    root of the tree.
                   Bacillus Taxonomy
• Bacillus is a very old genus name, and it has been split
  several times.
• Bergey’s Manual of Systematic Bacteriology, first
  edition (1986) lists 32 valid species, with about an equal
  number of synonyms.
    – Based on morphology, biochemistry, some DNA-DNA
      hybridization, numerical taxonomy
• Carl Woese introduced the use of 16S rRNA sequences
  for phylogeny in 1977.
• Bergey’s Manual second edition (2004) splits the
  Bacillus genus into 4 families, with 37 genera in the
  Bacillaceae. Over 200 species.
    – Bacillus is still a genus, and still contains both B. subtilis
      and B. megaterium.
• As in other taxa, a common phenotype is well
  correlated with a common genotype
Ash et al. (1991)
Lett. Appl. Microbiol.
13:202-206.
                         Genome Sequencing
•   Strain QM B1551, containing 7 plasmids
•   NSF Grant, to Pat Vary and Jacques
    Ravel
      – Most lab work done at TIGR/U.
          Maryland
      – NIU’s role: annotating the 6000
          genes
•   Joined forces with Dieter Jahn’s group at
    the Technische Universität in
    Braunschweig, Germany, who were
    sequencing the DSM 319 plasmidless
    strain
•   In addition, there are about 20 other
    fully sequenced genomes from Bacillus
    and related genera
•   DSM319 has no plasmids, but at least 70
    genes on the QM plasmids have good
    homologues on the DSM chromosome
    (purple ring near the middle)
 Common Features, Genetic Forces
 Assuming that all Bacillus species descended from a common
ancestor, what is similar and different between them, and why?

 Common Features
• Morphological and biochemical characteristics
• 16S rRNA genes
• A group of common protein-coding genes
• Chromosomal synteny
• rRNA operons

 Genetic Forces
• Vertical descent
• Background substitution and indel mutations
• Horizontal gene transfer (about 10% different genes between QM and DSM)
• Intragenomic recombination
• Homogenization of rRNA operons, presumably by gene conversion
    16S Variation, Phylogeny, and Species
                Identification
• B. megaterium has 11 rRNA (rrn) operons
  on the chromosome in both sequenced
  strains, in the same genomic positions.
   – QM also has an rrn operon on plasmid
     pBM400, which is not found in DSM.
• The 16S genes in B. megaterium are 1540
  bp long and very similar, but not
  identical.
   – Gene conversion is thought to
     homogenize rRNA operons
   – Recombination between rrn operons leads
     to deletions

• The question addressed here: what effect
  does 16S variation within the genome
  have on phylogeny and species
  identification?
    Differences between 16S genes within B.
                 megaterium
• Seven identical 16S genes: the rrnE, rrnF, and rrnI genes in QM and the
  rrnA, rrnB, rrnF, and rrnK alleles in DSM.
   – Also, the rrnA and rrnB alleles in QM were identical to each other
   – Note the lack of clear vertical descent in this pattern

[Figure: number of operons with a variant allele (0 to 6) plotted against position (0 to 1530) in the 16S gene]

• Positions 461 and 474 are probably a stem-loop: all genes with an A at 461
  have a T at 474, and all genes with a G at 461 have a C at 474.

• Total of 20 sites with polymorphisms.
   – All but 4 are unique to a single operon
   – All but one of the shared polymorphisms are found in both QM and DSM

                        Avg    Min    Max
Within QM               3.7      0      9
Within DSM              2.0      0      4
Between QM and DSM      3.1      0      8
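
The average, minimum, and maximum values in the table above are pairwise mismatch counts between aligned 16S genes. A minimal sketch of that kind of summary is shown here, assuming the genes are already aligned to equal length. The sequences are short hypothetical strings, not real 16S data, and the function names are illustrative only.

```python
from itertools import combinations, product


def count_differences(a, b):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def summarize(pairs):
    """Return (average, minimum, maximum) differences over sequence pairs."""
    diffs = [count_differences(a, b) for a, b in pairs]
    return sum(diffs) / len(diffs), min(diffs), max(diffs)


# Hypothetical toy alleles standing in for the rrn 16S genes of two strains.
qm_alleles = ["ACGTACGT", "ACGTACGA", "ACGTACGT"]
dsm_alleles = ["ACGTACGT", "ACGAACGT"]

# Within-strain comparisons use every unordered pair of alleles.
within_qm = summarize(combinations(qm_alleles, 2))
# Between-strain comparisons pair each QM allele with each DSM allele.
between = summarize(product(qm_alleles, dsm_alleles))

print("within QM (avg, min, max):", within_qm)
print("between QM and DSM (avg, min, max):", between)
```

Identical alleles within a strain contribute zeros, which is why the minimum can be 0 even when the maximum is several positions.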
          Mismatch Differences in Completely Sequenced Genomes
num = num compared; max = max internal diffs. Pairwise values are mismatch differences between genomes.
Key: QM = B. megaterium QM B1551; DSM = B. megaterium DSM319; Banthra = B. anthracis; Bcereus = B. cereus;
Bthur = B. thuringiensis; Bweihen = B. weihenstephanensis; Bcyto = B. cytotoxicus; Bpumil = B. pumilus;
Bsub = B. subtilis; Bamylo = B. amyloliquefaciens; Blichen = B. licheniformis; Bhalo = B. halodurans;
Bclaus = B. clausii; Oceano = O. iheyensis; Anoxy = A. flavithermus; Gkausto = G. kaustophilus;
Gthermo = G. thermodenitrificans; Paeni = Paenibacillus sp.

genome    num  max      QM     DSM Banthra Bcereus   Bthur Bweihen   Bcyto  Bpumil    Bsub  Bamylo Blichen   Bhalo  Bclaus  Oceano   Anoxy Gkausto Gthermo
QM         12    8
DSM        11    4       0
Banthra    11    5      81      78
Bcereus    14    8      80      78       2
Bthur      14    4      74      71       2       2
Bweihen    14    5      80      78      11       9      10
Bcyto      13    9      80      81      32      32      31      38
Bpumil      7    4      84      83      87      87      86      90      84
Bsub       10   12      94      93      91      90      87      97      93      49
Bamylo      9    7      97      96      91      90      87      95      94      47      12
Blichen     7    5      98      97      88      87      84      91      93      54      30      27
Bhalo       8    5     106     105      99      99     100     101      98     100      89      85      80
Bclaus      7   16     123     121     120     119     117     125     116     119     109     110     111      89
Oceano      6   14     109     107     114     112     107     112     118     113     109     116     116     124     123
Anoxy       8    8     126     125     129     128     121     134     113     128     128     127     121     111     135     136
Gkausto     9   12     150     150     140     138     129     143     133     145     140     138     131     135     142     163     105
Gthermo    10   14     145     146     133     132     122     137     127     138     140     135     125     130     138     161      91      33
Paeni      12    8     173     171     171     169     161     173     174     167     171     175     172     163     159     172     174     194     194
  Differences in Completely Sequenced
                Genomes
• Maximum differences within any genome = 16 (B. clausii)
   – My basic argument: there is no point in having two different
     species which are less different than 16S genes within the same
     genome.
• Among the cereus group genomes, there are fewer
  differences between genes in B. cereus, B. anthracis, and
  B. thuringiensis than there are between genes in the same
  genome.
   – Also, B. weihenstephanensis has only very few differences from
     these
• B. subtilis and B. amyloliquefaciens are also very similar.

• Effects on phylogeny: pick a random 16S gene from each
  genome, align, count differences, do a neighbor-joining
  tree. 1000 reps.
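
The resampling idea in the last bullet can be sketched roughly as follows. This is not the analysis behind the slides: it only draws one 16S gene per genome at random, builds the mismatch-count distance matrix, and records which pair of genomes neighbor-joining would join first (the pair minimizing the standard Q criterion), rather than building full trees. The genome names and sequences are hypothetical toy data, and the function names are illustrative.

```python
import random
from collections import Counter


def count_differences(a, b):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def first_nj_join(names, seqs):
    """Return the pair of taxa that neighbor-joining would join first."""
    n = len(names)
    d = [[count_differences(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]
    row_sums = [sum(row) for row in d]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            # Neighbor-joining Q criterion: the smallest value is joined first.
            q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[0]:
                best = (q, names[i], names[j])
    return best[1], best[2]


def resample_first_joins(genomes, reps=1000, seed=0):
    """Pick one gene per genome at random, reps times, and tally the first join."""
    rng = random.Random(seed)
    names = sorted(genomes)
    tally = Counter()
    for _ in range(reps):
        picked = [rng.choice(genomes[name]) for name in names]
        tally[first_nj_join(names, picked)] += 1
    return tally


# Hypothetical toy alignment: each genome carries one or more 16S alleles.
genomes = {
    "meg": ["ACGTACGTACGT", "ACGTACGTACGA"],
    "fle": ["ACGTACGTACGA", "ACGTACGTTCGA"],
    "cer": ["ACGTTTGTACCT", "ACGTTTGTACCA"],
    "ant": ["ACGTTTGTACCT"],
    "sub": ["TCGAACGGACGT"],
}

tally = resample_first_joins(genomes, reps=1000)
for pair, count in tally.most_common():
    print(pair, count)
```

If more than one pair appears in the tally, the choice of allele alone is enough to change the first grouping, which is the effect the slide describes. A full analysis would build a complete tree in each replicate.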
        Neighbor-Joining Trees with Completely
                 Sequenced Genomes




•Different choices of which 16S genes to use lead to different phylogenies, both
at the species/subspecies level and at higher levels.
•The variable nodes in the cereus group and the halodurans/clausii group are
independent. Thus, these three trees represent 9 variants.
  Defining B. megaterium and distinguishing it from other
                    species, using 16S
•Comparison of B. megaterium
isolates from Genbank to QM-
rrnA

•A total of 185 isolates that were
>1390 bp (i.e. > 90% of full length)
and had fewer than 10 ambiguity
characters were aligned with
QM_rrnA, and the number of variant
positions was counted.
     •70% have 9 or fewer differences;
     •86% have 20 or fewer differences;
     •95% have 46 or fewer differences.

•Most isolates seem to fall into a
single group, but there may be some
significantly different subtypes in B.
megaterium.
     •Or, new species may be defined
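
The filtering and counting steps above might look something like the sketch below. The length and ambiguity thresholds come from the slide. Everything else is an assumption made for illustration: the function names, the use of IUPAC ambiguity codes, and the choice to skip gaps and ambiguous bases when counting variant positions.

```python
AMBIGUITY_CODES = set("RYKMSWBDHVN")  # IUPAC codes other than A, C, G, T
UNAMBIGUOUS = set("ACGT")


def passes_filter(seq, min_length=1390, max_ambiguous=10):
    """Keep sequences longer than min_length with fewer than max_ambiguous ambiguity codes."""
    bases = seq.replace("-", "").upper()
    ambiguous = sum(1 for base in bases if base in AMBIGUITY_CODES)
    return len(bases) > min_length and ambiguous < max_ambiguous


def count_variant_positions(reference, isolate):
    """Count aligned positions where both sequences have A/C/G/T and they differ.

    Assumption: gaps and ambiguity codes are skipped rather than counted.
    """
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for r, s in zip(reference.upper(), isolate.upper())
        if r in UNAMBIGUOUS and s in UNAMBIGUOUS and r != s
    )


def fraction_at_or_below(counts, threshold):
    """Fraction of isolates with at most `threshold` differences."""
    return sum(1 for c in counts if c <= threshold) / len(counts)


# Hypothetical difference counts for five isolates (not the GenBank data).
toy_counts = [0, 5, 9, 12, 50]
print(fraction_at_or_below(toy_counts, 9))
```

Applied to the real difference counts, `fraction_at_or_below` is the kind of calculation that would give cumulative figures such as the 70%, 86%, and 95% quoted above.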
  Positions of Nucleotide Variants in Genbank Isolates

•43% of the 1540 nucleotide
positions in the 16S gene
have at least one variant in
the B. megaterium strains
from Genbank.
•Most of the variation
occurs at the ends of the 16S
gene. This is also the region
where missing data is most
common.
      •PCR primers for 16S
      need to be internal to
      the gene
•The variant positions in the
middle were seen in QM and
DSM: the paired 461/474
positions, and 1140. There
are no major polymorphisms
outside the end regions that
are not seen in QM and
DSM.
             Closely Related Species
• How easy is it to distinguish between B. megaterium and closely
  related species?
• What species are closely related to B. megaterium? Different
  phylogenetic trees give different answers.
    – All of the species on the next slides appear to be more similar to
      B. megaterium than members of the cereus group on at least one
      phylogenetic tree.
    – All are in genus Bacillus except Lysinibacillus sphaericus.
    – Total of 344 strains used

Differences from QM-rrnA, by species:

species                   count   average   minimum
asahii                        4      80.5        67
azotoformans                  1      82          82
bataviensis                   5      67.8        62
benzoevorans                  4      72          65
circulans                    17      99.5        66
cohnii                        7      68.9        58
fastidiosus                   1      74          74
firmus                       51      81.5        66
flexus                       47      29.4         0
fumarioli                    17     103.1        71
funiculus                     7      97          92
halmapalus                    6      78.3        66
horikoshii                   16      83.8        69
infernus                      2      96          96
jeotgali                      1      78          78
koreensis                     1      87          87
luciferensis                  3      86.7        85
megaterium                  185      10.9         0
methanolicus                  4      76.5        74
niacini                      14      79.1        59
novalis                       3      72.7        71
panaciterrae                  5      90.2        82
psychrosaccharolyticus        1      73          73
simplex                      39      79.7        17
soli                          7      68.4        64
sphaericus                   75     119.4       103
vireti                        6      76.3        71
         Number of Differences from QM-rrnA for
                    Different Species
•Except for B. flexus and one B. simplex isolate, all strains are well-
differentiated from B. megaterium, with a minimum of 58 differences.

•B. flexus overlaps the B. megaterium distribution heavily. The average
B. flexus isolate had 29.4 differences from QM_rrnA, with some isolates
indistinguishable. The B. flexus type strain differs at 16 positions; the
B. megaterium type strain differs at 4 positions.

[Figure: histogram of percentage of isolates (0 to 60) against differences from QM rrnA, with series for B. firmus, B. circulans, B. flexus, B. simplex, B. megaterium, and others]

•The average B. simplex isolate had
79.4 differences from B.
megaterium; the one exceptional
strain had 17 differences (maybe
it’s a mis-labeling)
          Conclusions about 16S genes
•   Choosing different 16S genes from within genomes can affect the resulting
    phylogenetic trees.
     – The 16S genes within B. megaterium and other completely sequenced Bacillus genomes
       differ from each other by up to 16 positions.
     – Some species differ from other species at fewer positions than 16S genes differ within
       individual genomes.

•   Although most B. megaterium strains are very similar to QM and DSM, there are a few
    strains with very different 16S genes that may represent subtypes within B.
    megaterium, or which may ultimately be assigned to new species.

•   Most of the polymorphisms in the 16S genes are almost unique; all of the widespread
    B. megaterium polymorphisms are found in QM and DSM.

•   Most of the closely related species fall outside the range of variation seen within B.
    megaterium, but B. flexus is a major exception.
     – Some isolates of B. flexus are indistinguishable from B. megaterium, and most fall within the
       same range of variation seen in B. megaterium
   Common Genes and Synteny
• Bacillus is a relatively well-sequenced
  genus, with 11 complete genomes publicly
  available (not including B. megaterium).

• What genes are found in all Bacillus
  species, the core genome?
• Where on the chromosome are the
  conserved genes?
Bacillus Core Genome
QM vs. DSM Genes
Between Species
             Synteny Results
• The syntenic region around the origin of
  replication is shared throughout the Bacillaceae,
  including the genera Bacillus, Geobacillus,
  Oceanobacillus, and Anoxybacillus.
• 99% of the 2000 core genes are in the syntenic
  region.

• Next: rRNA operons and adjacent genes: concrete
  examples of conserved synteny.
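
A core genome, as asked about above, can be treated as the set of gene families present in every genome. The sketch below assumes that genes have already been assigned to shared family identifiers. The slides do not describe how that was done, so the identifiers, genome names, and function names here are all hypothetical.

```python
def core_genome(families_by_genome):
    """Return the gene families present in every genome."""
    family_sets = [set(families) for families in families_by_genome.values()]
    if not family_sets:
        return set()
    return set.intersection(*family_sets)


def fraction_in_region(core, families_in_region):
    """Fraction of core families that fall inside a given chromosomal region."""
    return len(core & set(families_in_region)) / len(core)


# Hypothetical family assignments for three genomes.
families_by_genome = {
    "genome1": ["fam1", "fam2", "fam3"],
    "genome2": ["fam1", "fam2", "fam4"],
    "genome3": ["fam2", "fam1"],
}
# Hypothetical set of families located in a syntenic region.
syntenic_families = {"fam1", "fam5"}

core = core_genome(families_by_genome)
print(sorted(core))
print(fraction_in_region(core, syntenic_families))
```

With real annotations, `fraction_in_region` is the kind of calculation that would yield a figure like the 99% reported for the syntenic region.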
                  rRNA operons (rrn)

• There are 11 rRNA operons on the B.
  megaterium chromosomes, plus one on plasmid
  pBM400 in the QM strain.
    – Other Bacillus species have 8-15 rrn operons
• The rrn operons are in the conserved synteny
  region.
    – Only in Bacillus and relatives


• rrn operons are all on the leading DNA strand:
  transcribed in the same direction as the
  replication fork moves.
• Most Bacillus rrn operons are on the right
  replichore, near the origin of replication
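
Whether a gene is transcribed in the same direction as the replication fork depends on which replichore it sits on. The sketch below assumes a circular chromosome whose right replichore runs from the origin toward the terminus in the direction of increasing coordinates. That convention, the coordinates, and the function names are assumptions made for illustration.

```python
def replichore(position, origin, terminus, genome_length):
    """Return 'right' if position lies between origin and terminus going clockwise."""
    offset = (position - origin) % genome_length
    terminus_offset = (terminus - origin) % genome_length
    return "right" if offset < terminus_offset else "left"


def is_codirectional(position, strand, origin, terminus, genome_length):
    """True if a gene is transcribed in the direction the replication fork moves.

    On the right replichore the fork moves toward increasing coordinates, so
    '+' strand genes are co-directional; on the left replichore '-' strand
    genes are.
    """
    side = replichore(position, origin, terminus, genome_length)
    return (side == "right" and strand == "+") or (side == "left" and strand == "-")


# Hypothetical coordinates for a 5 Mb circular chromosome.
GENOME_LENGTH = 5_000_000
ORIGIN = 0
TERMINUS = 2_500_000

print(replichore(100_000, ORIGIN, TERMINUS, GENOME_LENGTH))
print(is_codirectional(100_000, "+", ORIGIN, TERMINUS, GENOME_LENGTH))
print(is_codirectional(4_900_000, "-", ORIGIN, TERMINUS, GENOME_LENGTH))
```

Under this convention, an rrn operon near the origin on the right replichore would be expected on the '+' strand.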
From Stewart and
Cavanaugh, 2007,
J. Mol. Evol.
65:44-67
                       Common Sites
• Nearly all the rrn operons in the Bacillaceae can be found between sets
  of common flanking genes.
    – Sometimes with DNA insertions separating the rrn locus from one side
• A few unique rrn operons, including 2 in B. megaterium
• Not in Paenibacillus

Figure key:
A: DNA repair protein recF
B: DNA gyrase, subunit B
C: DNA gyrase, subunit A
D: inosine-5'-monophosphate dehydrogenase
E: D-alanyl-D-alanine carboxypeptidase
F: glutamine amidotransferase, synthase subunit
rrn operons in Bacillaceae are in specific sites
                       Variations
• Seven sites on the right replichore, plus one on the left
  replichore.
   – Also, a site shared within the cereus group, and two sites shared in
     Geobacillus and Anoxybacillus.
• Individual rrn sites can contain 0-5 rrn operons.
• Some sites are empty: the flanking genes are adjacent, with
  no rrn operon between
• A few sites are missing: the flanking genes are not present
  in the genome or are dispersed to very different locations.
• Tandem duplications of rrn operons are common
• Several variations caused by apparent intragenomic
  recombination
                    Tandem Duplication Copy Number
                                        Right replichore                          Left replichore
                                                                    shared         Anoxy/Geo     Geo
genome              total   rrnA  rrnBC   rrnD   rrnE   rrnF   rrnG   rrnH  cereus   rrnK    common  common
B. megaterium          11      1      2      1      1      1      1      1       0      1         0       0
B. anthracis           11      1      1      1      1      1      3      1       1      1         0       0
B. cereus              12      1      1      1      1      1      4      1       1      1         0       0
B. thuringiensis       14      1      1      2      1      1      5      1       1      1         0       0
B. weihensteph.        14      1      1      2      1      1      5      1       1      1         0       0
B. cytotoxicus         13      1      1      2      1      1      4      1       1      1         0       0
B. pumilus              7      1      0      1      1      2      0      1       1      1         0       0
B. subtilis            10      1      1      2      3      1      0      1       0      1         0       0
B. amyloliquefac.      10      1      1      2    2.5      1      0      1       0      1         0       0
B. licheniformis        7      1      1      1      1      1      0      1       0      1         0       0
B. halodurans           8      1      1      2      1      1      1      1       0      0         0       0
B. clausii              7      1      1      2      1      1      0      1       0      0         0       0
Oceanobacillus          7      0      0      3      1      1      0      1       0      1         0       0
Anoxybacillus           8      1      1      1      1      1      1      1       0      0         1       0
G. kaustophilus         9      1      1      1      1      1      1      1       0      0         1       1
G. thermodenitro.      10      1      1      1      1      1      1      1       0      0         1       1
  Intragenomic Recombination at rrn Sites
• rrn operons are almost identical, among the very few repeated sequences
  in bacterial genomes
    – A second example: insertion sequences (IS), which are mobile genetic elements
      found in many genomes (very few in B. megaterium).


• The presence of highly conserved genes and the consequences of
  intragenomic recombination in a circular genome constrain genome
  rearrangements.

• The arrangement of rrn operons and their sites can be understood as the
  result of three forces:
   – intragenomic recombination between rrn operons,
   – insertions/deletions of blocks of protein-coding genes,
   – recombination events within tandem arrays of rrn operons.
  Symmetrical Inversion Between Replichores

• Anoxybacillus
  flavithermus
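
A minimal sketch of what an inversion across the origin does to gene order and strand. It assumes a toy representation that is not in the original slides: the genome is a list of (name, strand) pairs in clockwise order starting at ori, and all gene names are hypothetical.

```python
def invert_across_ori(genome, k_left, k_right):
    """Invert a segment spanning ori in a circular genome.

    genome: list of (name, strand) pairs in clockwise order, where
    index 0 is the first gene clockwise of ori and the last entry is
    the first gene counter-clockwise of ori.
    k_right / k_left: number of genes in the inverted segment on the
    clockwise / counter-clockwise side of ori.
    Returns the new gene list, again starting at ori.
    """
    flip = {"+": "-", "-": "+"}
    n = len(genome)
    right = genome[:k_right]
    left = genome[n - k_left:]
    middle = genome[k_right:n - k_left]

    def inverted(segment):
        # An inversion reverses gene order and swaps strands.
        return [(name, flip[strand]) for name, strand in reversed(segment)]

    # Genes formerly left of ori now sit to its right, and vice versa.
    return inverted(left) + middle + inverted(right)


# Toy genome: "+" genes in the right replichore, "-" genes in the left.
toy = [("a", "+"), ("b", "+"), ("c", "+"),
       ("x", "-"), ("y", "-"), ("z", "-")]
print(invert_across_ori(toy, 2, 2))
```

With equal `k_left` and `k_right` (a symmetric inversion), each inverted gene keeps its distance from ori and the two replichores keep their lengths; with unequal values one replichore grows at the expense of the other.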
   Double Crossover Re-orders Flanking Genes
• B. pumilus
     Double Crossover in Flanking tRNA Regions

• B. amyloliquefaciens
  rrnE.
• Resulted in loss of 2/3
  of the 16S gene.
    – 23S and 5S intact
• Very little obvious
  homology on the right
  side.
    Tandem Duplication Events:
Duplication by Unequal Crossing Over

                     • rrnD in
                       Oceanobacillus
                       iheyensis
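
Unequal crossing over between misaligned repeats can be sketched as follows. This is a toy model under stated assumptions: the two parental copies are lists of labels, each rrn operon is the placeholder "rrn", and the labels and function name are hypothetical. It starts from an existing two-copy array and does not claim to reproduce the actual Oceanobacillus event.

```python
def unequal_crossover(chrom_a, chrom_b, i, j):
    """Recombine two copies of a region at misaligned repeats.

    chrom_a[i] and chrom_b[j] must both be repeat units ("rrn").
    Each product takes one parent's sequence up to the exchange
    point and the other parent's sequence from its exchange point on.
    """
    if chrom_a[i] != "rrn" or chrom_b[j] != "rrn":
        raise ValueError("crossover must occur within rrn repeats")
    product_1 = chrom_a[:i] + chrom_b[j:]
    product_2 = chrom_b[:j] + chrom_a[i:]
    return product_1, product_2


# Two identical sister copies carrying a tandem pair of operons.
region = ["flankL", "rrn", "spacer", "rrn", "flankR"]
# Pair the first operon of one copy with the second operon of the other.
short, long_ = unequal_crossover(region, region, 1, 3)
print(short.count("rrn"), long_.count("rrn"))
```

One product loses a copy and the other gains one, so the total number of operons across the two products is unchanged.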
Tandem Duplications in the cereus group rrnG site
•   Every deletion
    between adjacent
    rrn operons can
    be seen.

•   Deletion of genes
    between rrn 2
    and rrn 3
    (preserving one
    gene in the
    middle).

•   Region between
    rrn 3 and rrn 4
    completely
    replaced.
Intragenomic Recombination Conclusions
• Most rrn operons are found in the same sites in all Bacillus
  genomes
   – Differences in rrn operon number are mostly due to tandem
     duplications within these sites
• Intragenomic recombination is well documented in
  Bacillus genomes
   –   Anoxybacillus: symmetric inversion across ori
   –   B. pumilus: double crossover involving 3 regions
   –   Oceanobacillus rrnD: crossover (CO) between tandem copies
   –   rrnD in other species: at least 2 events
   –   cereus group rrnG: deletions between tandem copies (at least 4 different events)
   –   cereus group rrnG: replacement of inter-rrn region by presumed 2CO
   –   cereus group rrnG: deletion of inter-rrn region, leaving a central portion intact (2 deletions?).
   –   B. amyloliquefaciens rrnE: 2CO involving 3 regions, with the central section having the COs
       150 bp apart
   –   B. megaterium rrnBC: 2CO involving 3 regions, with little homology at one end
   –   Several other duplication/deletion events within tandem duplications
     Some Events NOT Observed
• The lack of certain events supports several current
  ideas.
    – to the extent that lack of evidence constitutes evidence.

• Crossovers between rrn sites: none seen, despite numerous CO events within rrnG
  in the cereus group, plus many other CO events
   – Supports the idea that the flanking genes are necessary
• Asymmetric CO across ori: none seen, but only one symmetric CO was observed, so
  the evidence is not strong.
    – Supports the idea that symmetric replichores are selectively advantageous
• Inversions within a replichore: all rrn in all species are on the leading
  strand, in both replichores.
    – supports the idea that replication and rrn transcription must proceed in the
      same direction
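
A rough way to test whether a gene is transcribed in the same direction as replication, given its position and strand. The coordinate conventions here are assumptions, not taken from the slides: positions increase clockwise, the right replichore runs clockwise from ori to ter, and "+" means transcription in the direction of increasing coordinates.

```python
def replichore(pos, ori, ter, length):
    """Return "right" if pos lies clockwise between ori and ter, else "left"."""
    dist_gene = (pos - ori) % length
    dist_ter = (ter - ori) % length
    return "right" if dist_gene < dist_ter else "left"


def is_codirectional(pos, strand, ori, ter, length):
    """True if transcription and replication proceed in the same direction.

    In the right replichore the fork moves toward increasing
    coordinates, so "+" genes are co-directional; in the left
    replichore the fork moves the other way, so "-" genes are.
    """
    side = replichore(pos, ori, ter, length)
    return (side == "right") == (strand == "+")


# Toy circular genome of 1000 bp with ori at 0 and ter at 500.
print(is_codirectional(100, "+", 0, 500, 1000))
print(is_codirectional(900, "-", 0, 500, 1000))
```

Applying such a check to every annotated rrn operon is one way to state the observation above precisely; it says nothing about why the orientation is conserved.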
      Some Unsolved Questions
• Replichore asymmetry:
   – most of the rrn are in the right replichore
   – compositional bias between replichores
• Mechanism of insertion/deletion/horizontal gene
  transfer
   – a big question. We are examining insertion sites for
     clues.
• Is there a common phylogeny for the conserved
  synteny region in B. megaterium?
   – Finding and analyzing allegedly unique events (indels
     and recombinations)
DNA Composition shows replichore asymmetry
Simple vs. Compound Insertions
                         Thanks!
•   NIU Biology Dept.
     – Pat Vary
     – Janaka Edirisinghe
     – Kirthi Kumar Kutumbaka
     – Sandhya Balasubramanian
     – Jenn Hintzsche
     – Chris Braun
     – Denise Tombolato
     – Judy Luke
     – Scott Grayburn
•   NIU Computer Science Dept.
     – Stephen Snow
     – Reva Freedman
     – Minmei Hou
•   Argonne National Labs
     – Ross Overbeek
     – Gordon Pusch
     – Terry Disz
•   TIGR/U. Maryland
     – Jacques Ravel
     – Mark Eppinger
     – MJ Rosovitz
•   Technische U. Braunschweig
     – Dieter Jahn
     – Boyke Bunk

								