Gene Sequencing of Porites lobata Whitepaper

Shared by: C Gunnison
posted:
12/29/2007
Sequencing the Genome of the Coral, Porites lobata

Gary K. Ostrander (1), Steven L. Salzberg (2), Craig Downs (3), Karla Heidelberg (4), J. Craig Venter (4), and Claire M. Fraser (2)

(1) Johns Hopkins University, 3400 North Charles Street, 234 Mergenthaler Hall, Baltimore, MD 21218; 410/516-8215; gko@jhu.edu
(2) The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850; 301/315-2537; salzberg@tigr.org, cmfraser@tigr.org
(3) EnVirtue Biotechnologies, Inc., 35 W. Piccadilly Street, Winchester, Virginia 22601, U.S.A.; 540/23-0597, 540/723-0598; craigdowns@envirtue.com
(4) The Center for the Advancement of Genomics, 1901 Research Blvd., Suite 600, Rockville, MD 20850; 301/309-3400; Karla.Heidelberg@tcag.org, Craig.Venter@venterscience.org

Introduction

Coral reef ecosystems are arguably one of the most complex ecosystems on earth, and are home to literally thousands of species of invertebrates, vertebrates, and microbial organisms of which perhaps the majority have not yet been described. In fact, 34 of the 36 known animal phyla are found living within coral reef ecosystems. The primary infrastructure of these circumtropical ecosystems consists of the scleractinian or reef-building corals, which are characterized by a tightly coupled symbiotic relationship with dinoflagellate zooxanthellae. Numbering about 400 species, corals cover about 284,300 square kilometers worldwide (1) and are found in over 100 countries. Evolutionarily, scleractinian corals are relatively primitive animals and belong to the phylum Cnidaria, which also contains other simple organisms like jellyfish, sea anemones, and hydra. However, it is noteworthy that Cnidarians are evolutionarily very far removed from the sea urchin, sea squirt, flatworm, and other organisms whose genomes have either been sequenced or are in the NHGRI list of organisms approved for sequencing.
Corals begin life as ciliated planular larvae, which attach to a suitable substrate and remain sessile for the remainder of their adult lives, often for many hundreds of years. In doing so, they form massive structures such as the 1,200 mile long Great Barrier Reef along the northeast coast of Australia. Once attached, corals can reproduce by asexual budding to form colonies. Each colony is the result of a clonal expansion of a single larva. The scleractinian corals also lay down a calcium carbonate skeleton that is the characteristic infrastructure of the coral reef. The soft tissue or living part of the coral also contains photosynthetically competent dinoflagellate algae called zooxanthellae. The ability of scleractinian corals to deposit calcium carbonate skeletons and to form reef structures is generally attributed to their symbiotic relationship with zooxanthellae, which provide the corals with large quantities of organic materials, especially high-caloric value lipids and carbohydrates. The materials translocated by the zooxanthellae are believed to provide most of the energy for maintenance, tissue and skeletal growth, and possibly reproduction (reviewed in 2). There is now substantial evidence that the added effects of global climate and anthropogenic stressors have overwhelmed the natural plasticity of the reef ecosystems and have contributed to the loss of 27% of the world’s coral reefs (3). If current trends continue unabated, it is estimated that nearly 60% of the world’s coral reefs will be lost in the next 25 years through reduced coral growth rates, disease outbreaks, and increased mortality. Coincident with these events is the less obvious loss of genetic diversity within all coral reef populations, and the potential impacts on all the plant and animal species associated with the reef community. 
Nearly all species found in coral reef ecosystems have co-evolved over millions of years, and the loss of the corals, and ecosystem substrate, will adversely affect the entire reef ecosystem dynamics (1). Migration of these species to other habitats is not an option as they are specifically adapted to the reef ecosystem. As a further threat to coral reef ecosystems, the multiple anthropogenic stressors currently affecting coral reef communities may be reducing genetic variation and biological resilience to the point where both the reefs and the associated organisms may be losing the ability to rebound from naturally occurring disturbances, in a manner analogous to the Irish potato famine. Specifically, the current list of 25-30 fatal diseases impacting corals will likely expand as will the frequency and extent of phenomena called “coral bleaching”. Bleaching occurs when the corals lose their colored symbiotic algae, the bright white coral skeleton is then revealed through the now transparent tissue (i.e. appearing to be freshly “bleached”), and the coral soon dies (1-3). The potential loss of one of the most significant ecosystems on earth is but one reason to sequence the genome of a reef-building coral. Having the full sequence of a scleractinian coral will greatly enhance studies on reef ecology, evolution, gene discovery and the fundamental processes humans share with corals. Specifically, the evolutionary placement of corals, among the most primitive of animals, suggests the opportunity to increase our understanding of fundamental biological processes humans share with corals (e.g. aging and cancer) as well as an opportunity to continue to mine the reef for novel medicinal compounds. Herein we propose the sequencing of the scleractinian (i.e. reef-building) coral Porites lobata. 
This is among the most abundant coral species in the world, with a cosmopolitan distribution that includes the entire east coast of Africa, the Persian Gulf, the Red Sea, the Seychelles, throughout the Indo-Pacific including Australia and New Zealand north to Japan and Hawaii, and the entire west coast of Central America, Colombia and Peru (reviewed in 15). The proposed effort will cover 10^9 bases distributed on 48 chromosomes. As detailed in the 40+ letters provided in Appendix B, this effort has the support of the world-wide coral reef research community.

A. Specific Biological Rationales for the Utility of New Sequence Data

A.1. Improving human health.

To be sure, the primary rationale for sequencing the genome of Porites lobata is not to address issues of human health in the traditional NIH sense. Nonetheless, the availability of this information will in fact lead to significant positive impact on human health in two areas. First, as mentioned above and detailed elsewhere in this white paper, coral reefs around the world are rapidly declining. As reefs die, the accompanying populations of fish and invertebrates that are essential food sources for many emerging nations decrease. Presently we have only just begun to identify the disease vectors (primarily bacteria and viruses), though we have little idea of the causative agent(s) in their expression at this time. While it is unlikely that sequencing the coral genome will lead immediately to “treatments” for dying reefs, such information could reveal what anthropogenic factors (e.g. pollution, global warming, over-fishing) are involved and suggest ways that we might mitigate the damage. This information will also be invaluable to managers who will rely on the development of molecular markers to understand the recruitment of corals and other organisms to particular reef communities.
Twenty percent of the world’s fisheries depend on coral reefs, and 8% of the world’s population is directly dependent on reefs for their primary food source. Moreover, over 50% of federally managed fisheries species in the United States are tied to coral reefs (4,5). Globally, it has been estimated that reef habitats provide $375 billion each year to humans from fish and other edible organisms (27).

Second, coral reefs are arguably the most complex ecosystems on earth. Literally thousands of vertebrate, invertebrate, algal and microbial species coexist in relatively small areas. Many of their adaptation strategies are unique (e.g. photosynthetic algae living in the soft tissues of the corals providing energy to the corals; sea cucumbers that eviscerate their entire alimentary canal when attacked by predators only to re-grow a new one in a short period of time; many species of fishes that change sex when population sex-ratios are skewed). Maintenance of this incredible complexity involves unique molecules that exist along P. lobata biochemical pathways. In much the same way as novel compounds from the plants and animals of the rain forests show potential as cancer treatments and as other products of medicinal value, compounds from the world’s coral reef ecosystems have a similar potential. Sequencing the genome of P. lobata will be the first step in realizing that objective.

A.2. Informing human biology.

Corals are very primitive animals that occupy the lower reaches of the phylogenetic tree. The complexity of their major “systems” does not approach that of humans. For example, they have no circulatory, respiratory, digestive, or urogenital systems. They do not have any muscles and they do not exhibit cephalization, but instead have a simple nerve-net. Yet, they have a rigid skeleton of a bone-like material (discussed in detail in section A.6).
They eat, reproduce sexually and asexually, develop gametes, capture, digest and metabolize food, eliminate wastes, respond to tactile stimulation, exhibit soft tissue repair and even develop cancer. The fact that they share basic biological processes with humans presents the opportunity to study fundamental aspects of the biology of complex systems in a relatively simple organism. Moreover, in the way that non-human organisms have been utilized in the past (e.g. squid giant axon, yeast, and Drosophila), corals do in fact have the potential to inform human biology. In other words, the admittedly bizarre lifestyle of corals and the species with which they coexist will provide a unique opportunity to understand a large part of the easily accessible vocabulary of modern biology and how nature has put together genetic and biochemical systems.

A.3. Informing the human sequence

Recent sequencing of genomes from several highly evolved (or “crown”) eukaryotes has greatly enhanced our understanding of human gene function. One thing we have learned as a community is that genes often take on many different roles. Discovering these roles in other species helps us to understand better how the genes may function in humans. For example, many of the human genes catalogued in OMIM are only understood through their connection to a human disease; i.e., we know that a defect in the gene leads to a particular phenotype. This information, though valuable, does not always tell us about the molecular function of the gene. As we sequence and come to understand human gene orthologs in other species, we gain new insights into gene function. For this and other reasons, it is important that we sample a wide range of species on the evolutionary tree, and not limit our genome sequencing to the crown eukaryotes. P.
lobata represents a branch of the evolutionary tree that is almost completely unexplored, and the elucidation of its gene content will give us new perspectives on the thousands of genes that it almost certainly shares with humans. In addition to the direct comparisons between protein-coding genes, we will also be able to use the sequence of P. lobata to study conserved noncoding regions. Such regions might represent regulatory sites or new RNA genes, an area that has greatly expanded in just the past few years as thousands of new functional RNAs have been discovered.

A.4. Providing a better connection between the sequences of non-human organisms and the human sequence.

Our understanding of metazoan evolution is mostly based on a small number of complete genome sequences and large Expressed Sequence Tag (EST) datasets that represent only a few complex animals. Of necessity, therefore, deductions about the evolutionary origins and structures of the human genes are largely based on comparisons with a limited number of complete genomes, and these are primarily from crown eukaryotes (e.g., the insects Drosophila melanogaster and Anopheles gambiae; the nematode Caenorhabditis elegans; the yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe; and the flowering plant Arabidopsis thaliana). A few more-distant evolutionary species are now being sequenced, such as Tetrahymena thermophila, but these barely begin to sample the breadth of diversity among eukaryotes. The phylum Cnidaria is regarded as the sister group to the Bilateria, and is likely to be critically important to understanding the evolution of metazoan genetic and developmental complexity. For example, anthozoan cnidarians such as Porites possess the most “primitive” present-day nervous systems, a morphologically homogeneous nerve net. The complete genome sequence of Porites may lead to important advances in our understanding of the human nervous system.
Interestingly, the cnidarian nervous system appears to be entirely dispensable: it is possible to indefinitely culture hydra after destroying all nerve cells and the interstitial cells that give rise to them (6). Despite its simplicity, plasticity and regenerative capabilities, the cnidarian nervous system is patterned by genes related to those of vertebrates. For example, using the scleractinian coral Acropora, Miller et al. (7) found a large number of genes involved in specifying and patterning the advanced nervous systems of flies and mammals. Several of these genes are expressed in the coral in patterns that resemble those seen in vertebrates. Cnidarians are therefore potentially highly informative for many aspects of human nervous system specification and regeneration.

A.5. Expanding our understanding of basic biological processes relevant to human health

A.5.1. Aging

Studies have shown that telomeres, the tandem repeats of sequence found at the end of eukaryotic chromosomes, play an important role in the aging process. Loss of the telomere regions leads to chromosomal changes often associated with cancer and aging in humans and in fact may be causative. Coral colonies are known to be strikingly long-lived, with colonies over 600 years old commonly found in the world’s reefs. In light of this, it was first suggested that corals lacked the cellular components and/or genes associated with aging. However, it has been recently reported that coral chromosomes do in fact contain telomeres and there may be significant sequence homology with human telomeres (8). Thus, understanding the telomere regions and age-associated genes in long-lived species such as corals will provide insight into our own aging process, as well as the potential for innovative therapies for disease. In order to characterize the telomere regions in corals, we must first obtain the genetic sequence of these regions.

A.5.2. Cancer

In the late 1980s it was documented (and featured on the cover of the Journal of the National Cancer Institute) that corals are vulnerable to cancer (9). While the initial observations (i.e. calicoblastic epitheliomas) were restricted to a few stands of coral in the Florida Keys, in the intervening years the frequency of observations of hyperplasia and tumors among wild coral populations has steadily increased worldwide. Corals remain the most primitive organisms known to exhibit cancer epizootics. The fact that such a simple animal can develop cancer suggests opportunities for studies of the fundamental mechanisms associated with oncogenesis.

A.5.3. Basic Biology

We expect that fundamental biological processes shared between corals and humans will be prime targets for further study (e.g. neurobiology and developmental biology). Following the sequencing of the genome, we plan to annotate the genes as thoroughly as possible, using bioinformatics-based approaches, which will allow us to identify large numbers of genes, metabolic pathways, and other cellular mechanisms present in coral. These then provide an immediate substrate for further study and for direct comparison to the corresponding human pathways.

A.5.4. Disappearing Food and Economic Resource

The widespread decline of coral reefs has a direct impact on human health. As coral reefs are dying, we are also seeing a commensurate loss of economically important fisheries (e.g. grouper, snapper) and invertebrates (e.g. conch, lobster, crab). This loss is of vital importance to emerging nations where significant portions of the population depend on coral reefs for food or for tourism dollars. For example, on an annual basis tourism dollars related to coral reefs include $1.5 billion on the Great Barrier Reef (10), $2.5 billion in the Florida Keys (11), and likely another $2-3 billion across numerous nations in the Caribbean (12).
Although NHGRI sequencing programs are not designed to target socioeconomic problems, it is worth noting that the entire economies of small island nations will be lost if coral reefs continue to decline at their current rate from widespread “bleaching events” and other disease processes.

We envision a two-fold approach to increasing our understanding of coral disease processes. First, coral reefs are among the most productive of all ecosystems, yet they occur in oligotrophic tropical waters (13-15). This paradox of high coral productivity in low-nutrient oceanic waters has been an area of great interest for coral reef scientists (reviewed in 16). In addition to receiving translocated photosynthate (fixed carbon) from their zooxanthellae, corals can feed directly on particulates (e.g. zooplankton and microbial coated particles) and take up nutrients directly from the surrounding waters. Understanding the molecular and cellular interactions of these complex energy and nutrient acquisition systems and the functional integrity of symbiosis would advance our basic understanding of how coral reefs function and what causes breakdowns in the symbiotic relationship resulting in bleaching and eventual death of the colony. Knowing which genes are involved (i.e. which biochemical pathways) would significantly advance our understanding of reef health. A white paper is also being submitted by our colleagues (Doak et al.) in this cycle to the NHGRI to advocate the sequencing of the genome of the dinoflagellate Symbiodinium, the symbiotic zooxanthella found in corals. We and our co-workers in the field endorse this effort, as the simultaneous sequencing of both these species will be invaluable to researchers interested in the study of coral reef ecosystems.

Second, corals also harbor unique microbial communities. These coral bacterial communities are diverse, species specific, and similar in corals from widely separated reefs.
However, the nature of the relationship between corals and associated bacteria has yet to be established. Recent studies (17) suggest that coral-associated bacteria are regulated through nutrient limitation, and that this regulation breaks down with carbon (glucose) addition and disease. However, bacterial disease has also destroyed entire reef communities (reviewed in 18) and coral disease incidence has increased dramatically since the first reports in the early 1970s (3). Increasing numbers of coral colonies and species over wider geographical ranges have been affected by disease, resulting in extensive mortality throughout many of the world’s reefs. Despite the major ecological impacts of coral disease, the cause of most coral diseases remains unclear. Recent work from hydra and scleractinian corals demonstrates that anti-microbial peptides derived from coral pro-proteins significantly contribute to coral immunity, and that changes in the expression of these proteins are linked to increased susceptibility to disease (19,20, C. Woodley unpublished). Access to the Porites lobata sequence will provide a powerful tool in identifying and elucidating the genes that give rise to coral innate immunity. These discoveries will provide the platform for more accurate coral health assessment through advanced technology development. The better-defined assessment data, together with physicochemical and environmental data, can be synthesized into knowledge for practical use by managers.

A.6. Providing additional surrogate systems for human experimentation

Corals can function as suitable systems for human experimentation and have done so for over three decades. It is noteworthy that the calcium carbonate skeleton of Porites sp. has been tested since the 1970s, in vitro, in animal and in clinical human studies, as a potential bone graft substitute (reviewed in 21). Its mechanical properties and porosity resemble those of bone.
Its natural porous scaffold is biocompatible and osteoconductive, allowing bone cell attachment, growth and differentiation into the osteoblastic phenotype (22, 23). It is biodegradable, its resorption rate depending on the skeleton macroporosity, the implantation site and the species (24). It is not osteoinductive, but its porous scaffold acts as an adequate slow-release carrier for growth factors such as bone morphogenetic proteins BMP-2, BMP-7 and TGF-beta (25, 26). Sequencing the Porites sp. coral genome will greatly improve our understanding of the biology of coral skeletogenesis. It will provide access to primary peptide sequences that can be matched with sequences already obtained from proteins extracted from the organic matrix of other calcium biominerals (e.g. bone, dentin, molluscan nacre, etc.) and thus will help to identify proteins involved in the secretion and the maturation of skeletal organic matrix. In addition to the obvious benefits of the further development of bone and dental implants, identification of the coral molecules involved in the mineralization process will be a major step toward gaining insights into the fundamental mechanisms of biomineralization and the evolution of these processes. Other possibilities, such as corals as a model system for investigating the interface between neurophysiology and immunity (20), have been suggested, though none are sufficiently developed to warrant discussion herein. Perhaps the availability of the sequence will allow for maturation of these possibilities and bring new opportunities to light.

A.7. Facilitating the ability to do experiments

Our collective field is currently limited in ways that would be overcome by availability of the P. lobata sequence. We cannot currently develop targeted methods without an understanding of which genes are present and which biochemical pathways are utilized and thus suitable for exploitation.
For example, bleaching appears to be a generalized stress response on the part of corals (2). Dr. Angela Douglas and colleagues (University of York, UK) are currently using a Porites species (P. cylindrica) to evaluate the resistance and resilience of corals to bleaching. A complete genome would enable her to relate the physiological differences to specific stressors as well as particular proteins that are up- or down-regulated as a result. Specifically, a sequence would allow for selection of the most likely candidate genes for further study. At present, little is known of which genes are expressed in corals (e.g. HSP60s? HSP90s? SODs? GSTs? CYPs?). Likewise, we cannot exploit the structure of coral for studies of bone and dental implants because we lack the ability to readily identify and clone genes relevant to these processes. Availability of the sequence would facilitate work ongoing in North America (e.g. Dr. C. Demers and colleagues at Ecole Polytechnique, Montreal, Canada) and Europe (e.g. Dr. C. Delloye and colleagues at Service de Chirurgie Orthopedique et de Traumatologie, Belgium; Dr. S. Reis and colleagues at the Free University of Berlin, Germany; and Dr. E. Arnaud and colleagues in Paris, France [one of at least 4 institutions in Paris studying corals as bioimplants]), where preliminary studies are promising but slow. Likewise, because coral is one of the most primitive organisms that clearly “gets cancer,” we would like to understand the role of various oncogenes and tumor suppressors in the growth and development of coral. However, even though genes like retinoblastoma (Rb) and ras are present in coral, such genes have not been cloned in coral because of the general unavailability of genome resources in the species. The availability of the sequence would specifically facilitate the ongoing efforts of Dr.
Jeanette Rotchell (University of Sussex) who is studying the function of the Rb and ras genes during normal development in non-mammalian aquatic models. As established above, few organisms live in such a harsh and ever changing environment, and yet have developed the aging mechanisms to survive up to at least 600 years. Dr. Colleen Sinclair’s (Towson University) laboratory is currently focused on understanding the biochemistry of survival/aging in coral, and she is particularly interested in the genetic variation observed in those genes that contribute to this unique survival process. The proposed sequence data could be used to simultaneously investigate the role of DNA repair genes (and other relevant genes) in coral longevity. Specific questions of interest include: are there more or fewer DNA repair genes; have they evolved to exhibit more genetic variation to accommodate the increased environmental stress; what classes of repair genes are most similar and dissimilar to humans and mice; are there particular subclasses of genes that are more highly evolved and if so exactly what is that variation at the DNA sequence level? We believe if we had such information we could begin to postulate how repair genes accommodate environmental stressors. But, without the ability to easily study the genetic variation of even the most fundamental of genetic pathways (i.e. Rb, p53, and ras) in coral, pathways that are present from yeast and E. coli through to humans, we are unable to exploit this unique opportunity. At this point in time the cloning of most coral genes and the study of the relevant promoters, enhancers etc, is a slow and laborious process.

A.8. Expanding our understanding of evolutionary processes.

We are proposing to use P. lobata as a model for understanding the evolutionary processes affecting sessile marine invertebrates in general, as these appear to have evolved in very different ways from terrestrial animals.
One of the fundamental questions in coral ecology is that of speciation. Since coral and other sessile marine invertebrates release their gametes into the water column, where fertilization takes place, there are unparalleled opportunities for interspecific hybridization and introgression between species. While the exact nature of species in reef-building corals is a subject of continuing debate (28-30), some believe that corals are highly variable, widely distributed, and prone to interspecific hybridization. While hybridization between species has been shown in laboratory experiments, and population studies based on rRNA comparisons suggest that interspecific hybridization events occur, population genetic approaches indicate that they occur with much lower frequency than would be expected on the basis of in vitro breeding trials. This implies the existence of an (albeit imperfect) gamete recognition system in corals. However, even if such hybridization events occur only very rarely, they are likely to be important on evolutionary time scales, creating the capacity for adaptive evolution by increasing genomic diversity and heterozygosity. A complete genome of a model coral will prove invaluable to unraveling the evolutionary processes in the Cnidaria. Among other questions, the genome of P. lobata will allow us to study both large-scale and small-scale changes in chromosome structure and gene content by comparing it to other completed genomes. For example, the comparison (which one of us was involved in) of the mosquito A. gambiae to the distantly-related D. melanogaster revealed extensive intra-chromosome gene shuffling, and a likely chromosome fusion event since the two species’ divergence (31). It also revealed in great detail the expansion and contraction of many gene families through individual gene duplication.
Fine-scale alignments between individual genes showed how alternative splicing was surprisingly well conserved in many genes, although exon structure was less conserved in genes not subject to alternative splicing.

B. Strategic Issues in Acquiring New Sequence Data

B.1. The demand for the new sequence data.

Genome sequences have transformed research for hundreds of bacteria and dozens of eukaryotic species. The scientific community studying corals would similarly experience dramatic improvements in its ability to study these diverse species once the genome of one species becomes available. (Please see 40+ attached letters of support in Appendix B for specifics.) If approved, we anticipate sequencing the coral genome in less than one year, based on the current capacity of the Joint Technology Center (JTC) at TIGR and its affiliates. All sequence data will be released immediately through the NCBI Trace Archive, but more importantly, we will assemble the sequences and release those assemblies as rapidly as possible. The bioinformatics team at TIGR will run automated genome annotation programs that will identify putative genes and other genomic features, and all of this information will be released for the scientific community to use. It is important to emphasize here that all of this data will be freely available, with absolutely no restrictions on its use or redistribution. Thus in less than one year from its inception, this project will revolutionize the study of coral reef biology.

B.2. The suitability of the organism for experimentation and selection of the species.

Porites is the most commonly occurring genus of coral present in the world, spanning the Indian and Pacific Oceans and the Caribbean Sea (3). Some species have very restricted distributions, while others are found throughout the Tropics.
An enormous amount of intraspecific morphological variability exists, while at the same time similarities between species are striking; for example, intraspecific geographic differences in morphology can be as large as differences between species, in part due to possible natural hybridization and introgression (32,33), but also due to differences in external factors such as light, temperature and water flow (e.g. 34-36). Porites thus provides an ideal model system for examining speciation and evolution of scleractinian reef coral species in general, on both temporal and spatial scales. Moreover, in January 2002, a workshop was convened by the Coral Disease and Health Consortium, which included coral reef scientists from all over the world. Among the various priorities identified by the working subgroups was the sequencing of a coral genome. At the time a variety of species were suggested and discussed, including Porites lobata. As we initiated the preparation of this application we returned to our colleagues via the Coral ListServe, emails, and a focused discussion at a recent workshop on coral reef ecotoxicology. As evidenced in the accompanying letters in Appendix B, we have achieved a high level of consensus among our colleagues that P. lobata is the most appropriate species to sequence at this time. It is noteworthy that in late September 2003 the First Annual Meeting of Coral Reef Ecotoxicologists concluded with the agreement that P. lobata would be the “white rat” for toxicology studies involving corals.

B.3. The rationale for the complete sequence of the organism

The coral genome is relatively distant from other sequenced genomes, and as a result we expect to have very few syntenic regions that we can use as an aid to assembly. For the same reason, we would propose to obtain relatively deep coverage of the genome so that our assembly will contain essentially all genes as well as an accurate picture of large-scale chromosome structure.
Our experience with assembly of other large genomes suggests that we should generate at least three libraries of different sizes, namely a 3 kbp (small) insert library, a 10 kbp (medium) insert library, and a 50 kbp (large) insert library. As we have done with many of our genome projects, we will also construct a BAC library (150 kbp inserts) and generate BAC-end sequences. Please see attached letter of support from Dr. Pieter De Jong (Appendix B), who will construct the libraries. The larger libraries will be tremendously valuable in providing linking information that will permit the assembly to create large scaffolds, while the small library provides much of the contig structure and helps to disentangle small repeat regions.

The genome assembler that we use, the Celera Assembler (37), is able to take advantage of a mixture of insert sizes in both its contig and scaffold construction stages. We also have a separate scaffolding program, BAMBUS (open source software available from www.tigr.org/software), that can use other linking information, such as synteny to related species or physical maps, to build even larger scaffolds. To determine the optimal mix of small, medium, and large insert libraries, we will run simulations (using software that we have developed in-house) and also rely on our experience with several recent large genome assemblies. These include assemblies of mouse (2.4 Gbp), rat (2.6 Gbp), the fruit fly D. pseudoobscura (165 Mbp), the parasitic nematode Brugia malayi (110 Mbp), the parasite Schistosoma mansoni (300 Mbp), and the ciliate Tetrahymena thermophila. (See http://www.tigr.org/tdb/euk/ for details of TIGR’s ongoing eukaryotic genome projects.)

B.4. The cost of sequencing the genome and the state of readiness of the organism’s DNA for sequencing.
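The estimates reported in Table 1 (section B.4.1) reduce to a short calculation. The number of successful reads is genome size times coverage divided by read length; the number of attempted reads is that figure divided by the success rate; and the totals in the table are consistent with the per-read cost being applied to attempted reads. The sketch below is illustrative only: the function name `shotgun_estimate` and its parameter names are hypothetical and not part of any JTC or TIGR software, and the inputs are simply the values stated with Table 1.

```python
def shotgun_estimate(genome_bp, coverage, read_len_bp, success_rate, cost_per_read):
    """Return (successful_reads, attempted_reads, total_cost) for a shotgun project.

    Hypothetical helper for illustration; not part of any JTC or TIGR software.
    """
    # Reads that must succeed to reach the target coverage.
    successful = genome_bp * coverage / read_len_bp
    # Reads that must be attempted, given that only a fraction succeed.
    attempted = successful / success_rate
    # Assumption: cost is charged per attempted read, which matches Table 1.
    cost = attempted * cost_per_read
    return successful, attempted, cost


if __name__ == "__main__":
    for coverage in (6, 8):
        ok, tried, cost = shotgun_estimate(
            genome_bp=1_000_000_000,  # assumed 1 Gbp genome
            coverage=coverage,
            read_len_bp=800,          # read length stated with Table 1
            success_rate=0.87,        # success rate stated with Table 1
            cost_per_read=0.8213,     # dollars per read, from Table 1
        )
        print(f"{coverage}X: {ok:,.0f} successful, {tried:,.0f} attempted, "
              f"${cost / 1e6:.2f} million")
```

With these inputs the calculation gives roughly 8.62 million attempted reads and $7.08 million at 6X, and roughly 11.49 million attempted reads and $9.44 million at 8X, agreeing with Table 1 to within rounding. Changing `genome_bp` shows how strongly the budget depends on the assumed 1 Gbp genome size.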
B.4.1. Expected cost of sequencing the genome

We propose to undertake this project in conjunction with the NHGRI center at TIGR, using their Joint Technology Center (JTC) to generate the shotgun reads. Based on the current costs at the JTC, we expect the sequencing cost would be as shown in Table 1 below. We include two scenarios, 6X and 8X coverage, for comparison. We would recommend 8X coverage to obtain maximum scientific benefits from this project.

The argument for 8X coverage rather than a “lighter” shotgun project is simple: deeper coverage provides much longer contiguous sequences (contigs) and much larger scaffolds, with fewer and smaller gaps. As is well known, the high-throughput shotgun phase is by far the cheapest and fastest method for sequencing, and an investment at this phase pays off handsomely in the future. At lower coverage levels, the large number of gaps will prompt future researchers to run their own small-scale (and relatively expensive) genomic PCR reactions to walk across and close those gaps. Hundreds, possibly thousands, of such gap-closing efforts might take place in future years, and the likelihood that the results would be incorporated into the public assembly is small. Thus scientists may close the same gaps over and over again. It also becomes very difficult to obtain DNA from the same strain as time passes, raising the question of whether additional sequence really represents the same genome.

Table 1. Estimated cost for a coral genome project, assuming a 1 Gbp genome. Read lengths are assumed to be 800 bp and the sequencing success rate is 87%, both based on current JTC averages.

Coverage                  6X               8X
Attempted reads           8,620,000        11,494,000
Successful reads          7,500,000        10,000,000
Cost (@ $0.8213/read)     $7.08 million    $9.44 million

B.4.2. Limited closure effort

Although we are not proposing to close all gaps in the genome, we would propose to direct a limited amount of resources to closing gaps that occur in the middle of genes.
Because we expect gene density to be relatively low (as it is for other genomes in this size class), most gaps are likely to fall in intergenic regions. However, if a gap occurs within a coding region, it will be relatively easy and efficient to target additional sequencing reactions at such gaps. This limited gap closure would probably add less than 1% to the overall cost of the project, but it could yield significant benefits in terms of gene discovery.

B.4.3. Expected assembly characteristics

Although much less is known about the coral genome than about recent large mammalian genomes, we can anticipate that a genome of this size will assemble into a set of contigs and scaffolds with statistics similar to or better than those achieved for the recent rat genome. For the rat, the Celera Assembler was run (at Celera) on the sequences at 6.5X coverage, from a mixture of insert sizes, to produce 4,291 scaffolds and 115,529 contigs spanning 2.56 Gbp. The largest 360 scaffolds covered over 99% (2.54 Gbp) of the genome. The average contig size was 22 kbp. We expect an 8X project on coral to produce larger contigs and to cover well over 99% of the clonable regions of the genome. Note that using the theoretical (Lander-Waterman) model, we would expect average contig sizes of 31 kbp at 6X coverage and 80 kbp at 8X coverage.

B.4.4. State of readiness of the coral DNA for sequencing.

The DNA to be used for the sequencing of Porites lobata will be obtained from sperm of a single young male colony to be collected in Hawaii or Australia, depending on the time of year we initiate the project. Those who have studied senescence in corals have reported that as corals age their genomes deteriorate. Therefore, to ensure that the alleles we see are common in the population and not transitory somatic mutations, we will use young coral. Obtaining sperm from a coral colony is relatively easy if you know the spawning times at a particular location (and we do).
Porites lobata is a gonochoric species (i.e. separate male and female colonies) and it is relatively simple to ‘sex’ a colony by histologically sectioning a small tissue sample collected during the gametogenic cycle. Within ten days of the expected date of gamete release, a single colony of P. lobata will be moved from the field and held in a recirculating aquarium. When gamete packages are visible within each polyp, corals will be exposed to 3 M MgCl2 for about 4 hours, until the majority of gametes are released. Gametes will rise to the surface and can be collected with a fine hatched dipping net. Gametes will be washed with 0.2 micron-filtered artificial seawater and centrifuged at 120 x g. The gamete pellet will be resuspended in 0.2 micron-filtered artificial seawater, and the resulting gamete suspension will then be applied to a 10%/5% Percoll step gradient to purify the coral gametic tissue from contaminating plankton. Coral gametes will migrate through the Percoll density gradient and be collected using a 2.8 mL fine-neck transfer pipette. Gametes will be suspended in 0.2 micron-filtered artificial seawater at a 1:20 (v/v) ratio and centrifuged at 150 x g to remove Percoll contamination from the coral gametes.

Based on our previous experience, we will isolate the DNA using a Dojindo Tissue DNA isolation kit. The DNA will be quantified using the Molecular Probes PicoGreen DNA quantification kit, and assayed for relative contamination using the 260/280 spectrophotometric method. DNA will also be subjected to low-density agarose electrophoresis to confirm that extraction did not shear the DNA (i.e. that the DNA remains high molecular weight). To ensure absence of symbiotic dinoflagellate DNA contamination, P. lobata DNA will be assayed using a PCR diagnostic test with primers against Symbiodinium spp. ribosomal sequence and dinoflagellate actin 1. Using a 300 bp probe for dinoflagellate ribulose bisphosphate carboxylase-oxygenase (RuBisCO), P.
lobata DNA will be assayed using a Southern blotting protocol as an alternative method for detecting dinoflagellate contamination.

B.5. Other funding.

When the decision was made to develop a white paper for this initiative, we also began the quest for additional funds both for sequencing and to facilitate use of the data that will be generated. To date we have significant interest from the National Oceanic and Atmospheric Administration (NOAA), EnVirtue Biotechnologies and New England Biolabs (NEB) to organize, host, and fund an initial meeting for the coral genome community as the sequence becomes available. The meeting will be held at the Johns Hopkins University Montgomery County campus, which is adjacent to TIGR. The purpose of the meeting will be to (1) summarize the features of the sequence for the community; (2) establish consortia to attack specific questions raised herein; and (3) organize subgroups to focus on grantsmanship to further the use of the sequence. In addition, as this application goes forward in the review process, we will continue our discussions with other international agencies, federal agencies and private foundations in order to secure funds to offset sequencing costs.

Summary

As detailed above, corals reside at the lower levels of the phylogenetic tree, yet exhibit many shared characteristics with higher organisms including humans. Access to the genome of a distant colonial metazoan would be very useful from a comparative standpoint, and corals are an ideal candidate given the socioeconomic importance of coral reef ecosystems and their potential to inform ecology, evolutionary biology and human biology. A genomic sequence will also provide opportunities to enhance our understanding of the various disease processes currently decimating coral reefs worldwide. In short, sequencing the genome of the reef-building coral Porites lobata has the potential to significantly impact a number of diverse avenues of scientific investigation.
The TIGR bioinformatics team has extensive experience with genome annotation, having pioneered annotation methods for both prokaryotic and eukaryotic species. We will annotate the genome initially using highly automated methods, and then release this preliminary annotation to the community for enhancements and curation. Our group has collaborated on several highly successful international annotation efforts in recent years, including the annotation of the genomes of Arabidopsis thaliana and Plasmodium falciparum, each of which involved multiple international groups. We have developed shared annotation software (e.g., the open-source Manatee system, available from www.tigr.org/software) and standards for data exchange to facilitate these projects. We are currently involved in international consortia to annotate the eukaryotic genomes Trypanosoma brucei, Trypanosoma cruzi, Leishmania major, Aspergillus fumigatus, Aspergillus nidulans, and others. We will draw upon this experience to bring together a consortium of experts to annotate the coral genome.

Appendix A contains the complete citations of the work referenced in this document.

Appendix B contains:

• 40+ letters of support from the world-wide research community that would benefit from the sequencing of the genome of Porites lobata.
• A letter from Dr. Pieter De Jong that indicates his willingness to construct a BAC library.
• A letter from Dr. Cheryl Woodley, who is at the NOAA Center for Environmental Health and Biomolecular Research and is the Chair of the Coral Disease and Health Consortium. In addition to voicing her support for the project in terms of her own research, she is also working to obtain funds for both a workshop (see B.5) and the sequencing portion of this project.
• A letter from Dr. Craig Downs (President) indicating the willingness of EnVirtue to help support the workshop described in B.5.
• Finally, as this document was being completed we received an email from Dr. Don Comb, President of New England Biolabs, indicating his willingness to provide support through their Ocean Genome Legacy project at NEB.
