Comparative genomics of bacteria NM

Oct 10 2006
Features and evolution of bacterial genomes

Reading
For today: 47-55 in G&M
For Thursday: 118-119, 122-126 in G&M

This week
1. Features of bacterial genomes
2. Processes in bacterial genome evolution
3. Functional categorization of genes (Thurs)

Features of Bacterial Genes and Genomes known before genomics era
• Size: 0.5-10 Mb (~500 to 10,000 genes); average gene length = ~1 kb
• Bacterial genomes are tightly packed with genes: little repetitive, transposable, and non-coding DNA; considered to have few or no pseudogenes
• Bacterial genes lack introns, are arranged as operons, and are located on both strands
• Chromosomes are (typically) circular, with a single origin and terminus region
• Gene order is conserved among closely related bacteria
• Close relatives can differ in gene content, e.g., genes conferring pathogenicity
• Base composition varies (25-75% G+C) but is similar in closely related species
• Base composition is relatively homogeneous over the entire chromosome
• Within a species, each codon position has a characteristic G+C content (and there is a species-specific pattern of codon usage)
All of these “facts” were based on a few model species, such as E. coli and Bacillus.

Bacterial genome sequencing as of Oct 2006
• 429 sequenced and available in Bacteria
• 36 in Archaea
• Many others in progress, e.g., at GOLD http://www.genomesonline.org/
• Finished sequences are more accurate than sequences of eukaryotic genomes
• Includes soil, marine, pathogenic, commensal, symbiotic, extremophilic, etc. bacteria
• Possible to compare distant and related species to infer evolutionary changes in genomes

Depicting Full Genome Sequences
[Figure: genome map. Tracks: genes on leading strand; genes on lagging strand; %GC; GC skew, (G-C)/(G+C); tRNA; rRNA; repeats (<30 nt). Adapted from Bao et al. 2002. Genome Research]

Features of Bacterial Genomes
• Gene Inventory
  – Genome as a bag of genes
• Non-coding sequences
  – Spacers and functional features
• Base Composition
  – Variation among and within genomes
• Gene Order and orientation

Gene Inventory
• Complete gene set
  – Information only available from full genome sequences
• Comparisons started with the 2nd genome sequenced (Mycoplasma genitalium, 1995)
  – 25-fold differences among sequenced bacteria in number of ORFs (470 to ~10,000)
    • Now 50-fold, because the Carsonella symbiont has only 180 ORFs (see Science on Friday Oct 13)
  – Distinct gene sets in genomes of similar size
• Observations so far indicate up to 25% difference in gene content for strains of the same species
• Much larger differences for unrelated bacteria

Complete gene inventories and functions
[Figure; genomes shown: Xylella fastidiosa, Yersinia pestis, Vibrio cholerae, Pseudomonas aeruginosa, Pasteurella multocida, Neisseria meningitidis, Haemophilus influenzae, Escherichia coli K12, Buchnera aphidicola]
Small genomes lack many genes for biosynthesis, transport, catabolism, signal transduction, and unknown function. There is little loss in genes for translation.

Minimal gene set
• Idea of “Minimal Genome” consisting of essential genes
• Each genome contains the minimal gene set + additional niche-specific genes
• The Minimal Gene Set would be those genes required for the central processes of the cell, expected to be universally distributed and to determine the minimal genome size
• Based on size determinations from Pulsed Field Gel Electrophoresis (CHEF gels), minimal genome sizes had been observed to be similar among diverse groups of bacteria: ~500 kb
  – Mycoplasmas
  – Rickettsiae
  – Symbiotic species of gamma-Proteobacteria
  – All tiny genomes are obligately host-dependent
  – Can be pathogenic or mutualistic in hosts
  – Can be close relatives of large genome organisms (Buchnera close to E. coli)

1996
Mushegian AR, Koonin EV. 1996. A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Proc Natl Acad Sci U S A 93:10268
“To derive such a set, we compared the 468 predicted M. genitalium protein sequences with the 1703 protein sequences encoded by the other completely sequenced small bacterial genome, that of Haemophilus influenzae. M. genitalium and H. influenzae belong to two ancient bacterial lineages, i.e., Gram-positive and Gram-negative bacteria, respectively. Therefore, the genes that are conserved in these two bacteria are almost certainly essential for cellular function. It is this category of genes that is most likely to approximate the minimal gene set. We found that 240 M. genitalium genes have orthologs among the genes of H. influenzae.”

Minimal gene set, 2005
• The idea of a “Minimal Genome” doesn’t work out as first envisioned.
• The number of universal genes has decreased with each genome sequenced…
• Only ~60 genes were universal among sequenced genomes by 2003
• This is possible due to “non-orthologous” displacement, i.e., different gene sets doing the same job.
Koonin Nature Rev. Microbiol 2003

A case of non-orthologous displacement: lysyl tRNA synthetase
• Class II (COG1190): present in most bacteria and eukaryotes, crenarchaeota
• Class I (COG1384): present in euryarchaeota, spirochetes, most alphaproteobacteria
• Perfectly complementary distribution among taxa
• Unrelated sequences: no indication of homology
This is one of many cases, but some are not so clear.
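
The shrinking universal gene set, and the way non-orthologous displacement hides a function that every genome still has, can be illustrated with plain set operations. This is only a minimal sketch: the genome names and gene memberships are invented, and the only identifiers taken from the slides are COG1190 and COG1384 for the two lysyl tRNA synthetase classes.

```python
from functools import reduce

# Hypothetical toy genomes: each maps to the set of ortholog-group IDs it encodes.
# Only COG1190 / COG1384 (lysyl tRNA synthetase class II / class I) come from the
# lecture; every other ID and all memberships are invented for illustration.
genomes = {
    "genome_A": {"COG0001", "COG0002", "COG0003", "COG1190"},
    "genome_B": {"COG0001", "COG0002", "COG1190"},
    "genome_C": {"COG0001", "COG0003", "COG1384"},
}

def universal_genes(genome_sets):
    """Ortholog groups present in every genome (strict intersection)."""
    return reduce(set.intersection, genome_sets)

def universal_functions(genome_sets, alternatives):
    """Functions covered in every genome by at least one alternative group."""
    return {
        function
        for function, groups in alternatives.items()
        if all(genes & groups for genes in genome_sets)
    }

# The strict core can only shrink or stay the same as genomes are added.
names = list(genomes)
core_sizes = [
    len(universal_genes([genomes[n] for n in names[: i + 1]]))
    for i in range(len(names))
]

# One function, two unrelated gene families that can each supply it.
alternatives = {"lysyl-tRNA synthetase": {"COG1190", "COG1384"}}
covered = universal_functions(list(genomes.values()), alternatives)

print(core_sizes)
print(covered)
```

In this toy data neither synthetase group is in the strict intersection, yet the function is present in all three genomes, which is the pattern the complementary distribution describes.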
COG = “cluster of orthologous groups”, a system for classifying genes by homology (topic for Thursday)
Koonin 2003

The ubiquitous genes
Category                                              Genes
Ribosomal proteins                                    30
Aminoacyl tRNA synthetases                            15
Translation factors                                   6
Enzymes of RNA and protein modification               3
Signal recognition components in secretion            3
Molecular chaperone/protease                          1
RNA polymerase subunits                               2
DNA polymerase subunit, exonuclease, topoisomerase    3
Total                                                 63
Koonin 2003

Defining the Minimal Genome
Combining genome sequence analysis and experimental studies
Hutchison CA, Peterson SN, Gill SR, Cline RT, White O, Fraser CM, Smith HO, Venter JC. 1999. Global transposon mutagenesis and a minimal Mycoplasma genome. Science 286:2165
[Figure; labels: Presumed Dispensable; Dispensable + Essential. Mycoplasma genitalium: 480 genes. Mycoplasma pneumoniae: 677 genes]

How much does gene set differ between genomes?
Shared gene content drops off quickly with divergence of orthologous genes.
MA Huynen, P Bork. 1998. Measuring genome evolution. PNAS 95:5849

Extensive divergence in gene sets of E. coli strains
R. Welch et al. PNAS 2002.
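
One simple way to put a number on how much gene sets differ is the fraction of genes two genomes share. The sketch below assumes gene content has already been reduced to sets of ortholog identifiers; the strain names and gene IDs are hypothetical, and normalizing by the smaller gene set is an assumption made here for illustration, not necessarily the measure used in the papers cited above.

```python
from itertools import combinations

# Hypothetical strains and gene IDs, invented for illustration only.
strains = {
    "strain_1": {"g1", "g2", "g3", "g4", "g5"},
    "strain_2": {"g1", "g2", "g3", "g6"},
    "strain_3": {"g1", "g2", "g7", "g8", "g9", "g10"},
}

def shared_fraction(a, b):
    """Shared genes divided by the size of the smaller gene set."""
    smaller = min(len(a), len(b))
    if smaller == 0:
        raise ValueError("gene sets must be non-empty")
    return len(a & b) / smaller

# Shared fraction for every pair of strains.
pairwise = {
    (x, y): shared_fraction(strains[x], strains[y])
    for x, y in combinations(sorted(strains), 2)
}

# Genes found in every strain versus genes found in at least one strain.
core = set.intersection(*strains.values())
all_genes = set.union(*strains.values())

for pair, value in pairwise.items():
    print(pair, round(value, 2))
print(len(core), len(all_genes))
```

Dividing by the smaller set keeps the value at 1.0 when one genome is a strict subset of the other, which suits comparisons between a reduced genome and a larger relative.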
Genome size and ORF content in fully sequenced Bacteria
[Figure: number of intact ORFs (1000-8000) vs. size of genome (1-8 Mb). Legend: obligate pathogen or symbiont; free-living or facultative pathogen. Labeled point: Mycobacterium leprae, with many recent and older pseudogenes]
For the most part, genomes are tightly packed with genes, very different from eukaryotes.
When genes are lost, they leave behind a legacy of decaying ORFs, which may be identifiable only if the loss is recent.

The smallest genome size and G+C content of any cellular organism
[Figure: GC content (%) vs. genome size (Mb). Legend: primary endosymbionts of insects; other bacteria. Labeled value: 16.5%]

Features of Bacterial Genomes
• Gene Inventory
  – Genome as a bag of genes
• Non-coding sequences
  – Spacers and functional features
• Base Composition
  – Variation among and within genomes
• Gene Order and orientation
  – Usually similar between close relatives
  – Symmetry of inversions seen between related genomes (“X-Plots”)
  – Almost complete lack of correspondence between distant relatives (different phyla) except for a few conserved operons
    • Ribosomal proteins
[Figures: Eisen et al. 2000]

Processes affecting bacterial genomes
• Mutational bias
• Lateral gene transfer
• Deletions and gene loss
• Rearrangements

Differences in Base Composition among Bacteria are caused by Mutational Biases
Sueoka 1962
Effect of mutational bias on genome base composition
[Figure: mutation between G+C and A+T]
• Affected by position on leading vs. lagging strand, transcription
• Much variation among genomes due to differences in mutational processes (repair gene sets, etc.)

Compositional Heterogeneity among Bacterial Genomes (Muto & Osawa, 1987)
[Figure: plot with axes running from 20 to 80]
Why is base composition most conservative at 2nd positions and least conservative at 3rd positions?
• Consistent with mutational biases as the main basis of differences among species in base composition.
• Stronger purifying selection at 2nd and 1st positions.

Early indicator of gene transfer: species-specific genes have atypical base compositions
Sequenced genes present in Salmonella, but not in E. coli (circa 1990)
Gene      Potential function            Map    %G+C   CAI
cbi       Cobinamide synthesis          41     59.3   .233
fljA      Flagellar synthesis           56     40.9   .210
fljB      Flagellar synthesis           56     52.3   .216
inv/spa   Host recognition/invasion     59     45.5   .261
nanH      Sialidase                     20.5   40.9   .263
ORF       Unknown                       98     38.2   .296
pagC      Envelope protein              25     43.4   .274
phoN      Phosphatase                   96     46.5   .248
rfc       LPS synthesis                 31     33.5   .175
sinR      Transcriptional control       7      39.8   .218
tctABCD   Tricarboxylate transport      57     55.0   .278
pgtE      Phosphoglycerate transport    –      45.3   .277
Salmonella genome is 52% G+C

Lateral gene transfer (LGT)
LGT can cause
• sporadic distribution of a gene among related taxa
• mosaic organisms, with some sequences having ‘atypical’ features
• distinct gene content for closely related bacteria
• unexpectedly high similarity between sequences from different taxa
• different phylogenies for different genes (“phylogenetic incongruence”)

Why is LGT important to bacterial evolution?
1. Genomes can no longer be viewed as constant.
2. May foil attempts to reconstruct the evolutionary relationships among organisms.
3. Conflicts with typical genetic and evolutionary processes.
4. Important in adaptation since acquired genes are often ecologically relevant.
(from Doolittle. 1999. Science)
[Figures: standard view of organismal relationships; Doolittle’s revised view of organismal relationships]
With rampant LGT, all genomes are chimeric and classification might be difficult.

Methods for detecting LGT
• Abrupt shifts in sequence features (such as %GC, codon use) along chromosome
• Conflicts among gene trees
• Gene content mapped onto phylogeny: genes present in some strains, absent in basal lineages
• Comparisons of gene order, associations with phage sequences…

Lateral Transfer as a Source of “Atypical” & “Species-specific” Genes
[Figure: gene absent (–) in E. coli and present (+) in Salmonella, acquired by LGT from a species with distinct G+C (+)]
One of the original ways that LGT was detected was through atypical sequence composition of recently acquired fragments.

Bacterial genomes have characteristic GC content but most contain distinct regions of deviant GC content
[Figure: base composition (deviation from mean; GC of sliding windows) along the chromosome. Streptococcus mutans, 37% G+C (from Ajdić et al. 2002. PNAS); Neisseria meningitidis, 52% G+C (from Tettelin et al. 2000. Science)]

Massive gene exchange in microbial genomes
Inferred from atypical base composition
The amount of LGT is possibly underestimated by this compositional approach: genes from species of similar G+C content are not detected.

Conflict of gene trees
[Figure: gene tree for orotate phosphoribosyltransferase, with taxa marked as Bacteria, Archaea, or Eukaryota. Taxa: Mycobacterium leprae, Mycobacterium tuberculosis, Streptomyces coelicolor, Aquifex aeolicus, Synechocystis sp., Pyrococcus horikoshii, Pyrococcus abyssi, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Archaeoglobus fulgidus, Campylobacter jejuni, Helicobacter pylori, Thermotoga maritima, Caulobacter crescentus, Deinococcus radiodurans, Halobacterium sp., Thermoplasma acidophilum, Caenorhabditis elegans, Chlamydophila pneumoniae, Xylella fastidiosa, Saccharomyces cerevisiae, Pseudomonas aeruginosa, Vibrio cholerae, Pasteurella multocida, Haemophilus influenzae, Neisseria meningitidis, Buchnera sp., Aeropyrum pernix, Sulfolobus solfataricus]

Phylogenetic Detection of LGT in Bacteria
Evidence based on best hits via BLAST: Aquifex aeolicus and Thermotoga maritima, both hyperthermophilic bacteria, each have a high percentage (20% & 25%, respectively) of ORFs most similar to archaeal genes.
Sometimes BLAST is not a good indicator of phylogeny.

Do approaches based on phylogenetic incongruence and on sequence composition reveal the same cases of gene transfer?
[Figure, two panels. Atypical sequence composition: gene number (0-400) vs. %GC3 (0-90) for E. coli, with 17.6% marked. Phylogenetic incongruence: the orotate phosphoribosyltransferase gene tree, same taxa as above]
Two aspects of the same phenomenon?

Comparing results from using gene content vs sequence features for detecting acquired genes: overall agreement
[Figure: transfer recognized by compositional features, by taxa distribution, or by both criteria; scale 10-40 kb (adapted from Lawrence & Ochman 2002)]

Inferring LGT by comparing genome contents within a phylogenetic context
[Figure: tree of taxa A, B, C. Acquisitions: present in A (+), absent in B and C (–). Losses: absent in A (–), present in B and C (+)]
Does gene content indicate the same gene transfers as phylogenetic incongruence?

Gene gain/loss vs. phylogenetic disruptions for genes present in all taxa
[Figure: six panels of trees with gene gain/loss counts on branches; group labels include Escherichia, Salmonella, Buchnera, Enterobacteriaceae, Streptococcus, and Rhizobiaceae. Gene-tree tallies per panel (This topo / Other topo / Unresolved), in extraction order: 1589 / 9 / 116; 1603 / 3 / 97; 237 / 0 / 169; 923 / 0 / 222; 93 / 28 / 564; 616 / 34 / 360. Adapted from Daubin et al. 2003 Science]
Genes gained and lost are based on gene sets of 3 taxa. Trees are derived from genes shared by all 4 taxa. “This topo” = topology of 16S rRNA tree.

The cohesion of bacterial genomes
[Figure: plot with axes rRNA topology, other topologies, and non-resolving, each from 0 to 1; LGT corresponds to other topologies. Legend: interspecies comparisons; intraspecies comparisons. Labeled points include Salmonella, Escherichia, Staphylococcus, Enterobacteria, Rhizobia, Buchnera, Staphylococcus aureus, Chlamydophila pneumoniae, Streptococcus, Escherichia coli. Adapted from Daubin et al. 2003 Science]

Conclusion: LGT is not a usual basis for phylogenetic incongruencies
• No clear evidence for rampant LGT of orthologous genes (frequencies of LGT are low; <5% for interspecies comparisons)
• But it does occur for a small proportion of genes.
This implies two classes of genes in bacterial genomes
1. Orthologous genes
  – Generally resistant to LGT, used for phylogenetics
2. Recently acquired genes
  – Most of unknown function (many annotated as phage or secretion proteins)
  – Most are orphans (no known homologs, phylogenetically uninformative)
  – Generally more A+T-rich than host genome
  – Often prophage-associated in some genomes
  – Not used much for phylogenetics due to erratic distribution
[Figure: phylogeny showing new genes arriving, some of which persist in a subclade. Tip labels are garbled in extraction and appear to include E. coli, S. typhimurium, Y. pestis KIM, Y. pestis CO92, B. aphidicola, W. brevipalpis, H. influenzae, P. multocida, V. cholerae, P. aeruginosa, X. fastidiosa, X. campestris, X. axonopodis]

Distribution and Age of Acquired Genes in E. coli
[Figure adapted from Lawrence & Ochman 1998]

Processes affecting bacterial genomes
• Mutational bias
• Lateral gene transfer
• Deletions and gene loss
• Rearrangements
• Duplication

Role of deletion in Bacteria
In the face of so much gene acquisition, why are bacterial genomes so small?
• Close packing of bacterial genomes implies that inactivated genes are removed relatively quickly.
• Deletional bias in mutation removes non-functional DNA.
• Selection is required to retain DNA in a genome.

Dynamics of Bacterial Genome Size
[Figure: processes acting on bacterial genome size: gene acquisition; duplication; deletions of fragments spanning multiple genes; gene inactivation by point mutation; pseudogene erosion by small deletions]
Chronic pathogens: low rates of DNA acquisition; a decrease in s and Ne increases gene loss through genetic drift
A. Mira et al. 2001 Trends in Genetics
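
The compositional method for detecting LGT described earlier in these notes (abrupt shifts in %GC along the chromosome, measured as the deviation of sliding-window GC from the genome mean) can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the window size, step, and threshold are arbitrary choices, and the test sequence is synthetic rather than a real genome.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string (case-insensitive)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def gc_deviation_windows(seq, window, step):
    """Yield (start, window GC minus whole-sequence GC) for each full window."""
    mean_gc = gc_fraction(seq)
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window]) - mean_gc

def atypical_windows(seq, window=100, step=100, threshold=0.15):
    """Start positions of windows whose GC deviates by more than threshold."""
    return [
        start
        for start, deviation in gc_deviation_windows(seq, window, step)
        if abs(deviation) > threshold
    ]

# Synthetic "chromosome": 50% GC background with a 100-bp AT-rich insert.
background = "ACGT" * 100          # 400 bp, GC = 0.50
insert = "AAAT" * 25               # 100 bp, GC = 0.00
chromosome = background[:200] + insert + background[200:]

print(round(gc_fraction(chromosome), 2))
print(atypical_windows(chromosome))
```

As the slides note, a scan like this cannot flag DNA acquired from a donor with G+C content similar to the recipient, so it gives a lower bound at best.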
