Comparative Microbial Genomics group
Center for Biological Sequence analysis
An Overview of Genome Databases
- or Where can I find up-to-date information about genomes?
Dave Ussery
DTU course #27101, Communicating Science: Comparative Genomics
Friday, 12 September 2008
Department of Systems Biology, Technical University of Denmark
September: Learn tools to compare genomes; Journal Clubs / podcast; Lectures on how to write papers!
October: Use tools to compare genomes; Journal Clubs / podcast; Write papers (!)
November: Posters due; Journal Clubs / podcast; Referee / publish
FASTA format

>gi|169754007|gb|ACA76706.1| histone family protein nucleoid-structuring protein H-NS
MSVMLQSLNNIRTLRAMAREFSIDVLEEMLEKFRVVTKERREEEEQQQRELAERQEKISTWLELMKADGI
NPEELLGNSSAAAPRAGKKRQPRPAKYKFTDVNGETKTWTGQGRTPKPIAQALAEGKSLDDFLI
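A FASTA record is a header line starting with ">" followed by one or more sequence lines. The sketch below shows one way to read the H-NS record in plain Python. The function name `parse_fasta` and the variable names are illustrative choices, not part of the course material, and the accession is pulled out by assuming the "gi|number|gb|accession|" header layout of this particular record.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its joined sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {h: "".join(parts) for h, parts in records.items()}


# The H-NS record from the slide, stored as a multi-line string.
HNS_FASTA = """>gi|169754007|gb|ACA76706.1| histone family protein nucleoid-structuring protein H-NS
MSVMLQSLNNIRTLRAMAREFSIDVLEEMLEKFRVVTKERREEEEQQQRELAERQEKISTWLELMKADGI
NPEELLGNSSAAAPRAGKKRQPRPAKYKFTDVNGETKTWTGQGRTPKPIAQALAEGKSLDDFLI
"""

records = parse_fasta(HNS_FASTA)
header, sequence = next(iter(records.items()))

# Assumption: the header follows the 'gi|<number>|gb|<accession>|' pattern,
# so the accession is the fourth '|'-separated field.
accession = header.split("|")[3]
print(accession, len(sequence))  # ACA76706.1 134
```

The length of 134 residues matches the "134 aa" stated in the GenBank version of the same record.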
GenBank format

LOCUS       ACA76706                 134 aa            linear   BCT 09-MAY-2008
DEFINITION  histone family protein nucleoid-structuring protein H-NS
            [Escherichia coli ATCC 8739].
ACCESSION   ACA76706
SOURCE      Escherichia coli ATCC 8739
  ORGANISM  Escherichia coli ATCC 8739
            Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;
            Enterobacteriaceae; Escherichia.
FEATURES             Location/Qualifiers
     source          1..134
                     /organism="Escherichia coli ATCC 8739"
                     /strain="ATCC 8739"
                     /db_xref="ATCC:8739"
                     /db_xref="taxon:481805"
     Protein         1..134
                     /product="histone family protein nucleoid-structuring
                     protein H-NS"
     CDS             1..134
                     /locus_tag="EcolC_1037"
                     /coded_by="CP000946.1:1125675..1126079"
                     /note="PFAM: histone family protein nucleoid-structuring
                     protein H-NS
                     KEGG: sdy:SDY_2859 DNA-binding protein"
                     /transl_table=11
                     /db_xref="InterPro:IPR001801"
ORIGIN
        1 msvmlqslnn irtlramare fsidvleeml ekfrvvtker reeeeqqqre laerqekist
       61 wlelmkadgi npeellgnss aaapragkkr qprpakykft dvngetktwt gqgrtpkpia
      121 qalaegksld dfli
//
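The /coded_by qualifier in the GenBank record links the protein back to its position on the genome sequence CP000946.1. As a quick consistency check, the sketch below computes the length of that span and compares it with the 134 aa protein. The helper name `coded_by_length` is hypothetical, and it only handles the simple "ACCESSION:start..end" form seen here, not complement() or join() locations.

```python
import re


def coded_by_length(qualifier):
    """Return the nucleotide length of a simple 'ACCESSION:start..end' location."""
    match = re.fullmatch(r"([A-Z0-9_.]+):(\d+)\.\.(\d+)", qualifier)
    if match is None:
        # complement(...) and join(...) locations are deliberately not handled
        raise ValueError(f"unsupported location: {qualifier!r}")
    start, end = int(match.group(2)), int(match.group(3))
    return end - start + 1  # GenBank coordinates are 1-based and inclusive


nt_length = coded_by_length("CP000946.1:1125675..1126079")
codons = nt_length // 3
protein_length = 134  # from the LOCUS line of the record

# 405 nt = 135 codons = 134 amino acids + 1 stop codon
print(nt_length, codons, codons - 1 == protein_length)  # 405 135 True
```

The extra codon is the stop codon, which is part of the coding region but is not translated into an amino acid.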
RefSeq records appear in a similar format as the GenBank records from which they are derived. However, they can be distinguished from GenBank records by their accession prefix, which includes an underscore, and a notation in the “comment” field that indicates the RefSeq status. RefSeq records can be accessed through NCBI’s Nucleotide and Protein databases, which are among the many databases linked through the Entrez search and retrieval system. When retrieving search results, users can choose to see all GenBank records or only RefSeq records by clicking on the appropriate tab at the top of the results page. Users can also choose to search only RefSeq records, or specific types of RefSeq records (such as mRNAs), by using the “Limits” feature in Entrez. Further information about the database can be obtained at the RefSeq homepage [http://www.ncbi.nlm.nih.gov/RefSeq].

Key Characteristics of GenBank versus RefSeq
GenBank (CP000828)                      | RefSeq (NC_009925)
Not curated                             | Curated
Author submits                          | NCBI creates from existing data
Only author can revise                  | NCBI revises as new data emerge
Multiple records for same loci common   | Single records for each molecule of major organisms
Records can contradict each other       |
No limit to species included            | Limited to model organisms
Data exchanged among INSDC members      | Exclusive NCBI database
Akin to primary literature              | Akin to review articles
Proteins identified and linked          | Proteins and transcripts identified and linked
Access via NCBI Nucleotide databases    | Access via Nucleotide & Protein databases
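The accession-prefix rule quoted above (RefSeq accessions include an underscore) is easy to apply in a script. The sketch below is a minimal check, assuming that a RefSeq accession starts with two capital letters followed by an underscore, as in NC_009925. The function name `is_refseq` is an illustrative choice, and the test says nothing about whether a record with that accession actually exists.

```python
import re

# Assumption: RefSeq accessions begin with a two-letter prefix and an
# underscore (for example NC_, NM_, NP_); GenBank accessions have no underscore.
REFSEQ_PREFIX = re.compile(r"[A-Z]{2}_")


def is_refseq(accession):
    """Return True if the accession has a RefSeq-style 'XX_' prefix."""
    return REFSEQ_PREFIX.match(accession) is not None


for acc in ["CP000828", "NC_009925", "ACA76706"]:
    label = "RefSeq" if is_refseq(acc) else "GenBank"
    print(acc, label)
```

Applied to the two accessions on the slide, CP000828 is classified as GenBank and NC_009925 as RefSeq.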
What about other [genome-related] databases?
Nucleic Acids Research, 2008, Vol. 36, Database issue D1 doi:10.1093/nar/gkm1139
EDITORIAL
The 2008 Database Issue of Nucleic Acids Research is the fifteenth in a series dedicated to databases in the field of molecular biology. These databases are essential resources for experimental and computational biologists alike and this compilation provides descriptions and updates of the most important of these databases, and serves to introduce newly compiled resources that provide specialist information in the biological area. The current issue presents 98 new databases (30 more than last year) and updates for 84 existing databases.

The 2008 Database Issue is not included in the print subscription to NAR. Instead, the Database Issue is freely available online to all under NAR’s open access model. However, print copies are available for separate purchase by institutions and individuals.

Michael Galperin has continued to produce and enlarge the Molecular Biology Database Collection, a compendium of databases that includes all those databases described in Nucleic Acids Research, as well as selected other databases relevant to biologists. NAR Online contains links to all of the databases in the compilation as well as brief summaries of their content. Individuals who wish to have their database listed in the Molecular Biology Database Collection or update a previous submission to the collection should contact Dr Michael Galperin directly (nardatabase@gmail.com).

After 5 years as the Database Issue Editor I am stepping down. It has been my great pleasure to watch the growth of so many wonderful database resources and to help provide a forum for describing this important work. I am very pleased to announce that Michael Galperin will take over editing the next Database Issue.

ALL authors wishing to submit articles for the 2009 Database Issue MUST contact Dr M. Galperin (nardatabase@gmail.com) with a pre-submission enquiry, no later than July 1, 2008, to check whether a submission will be suitable for the issue. The pre-submission enquiry must present a working web accessible database for review by the Editor. Articles describing new databases will need to be received by August 15, 2008 at the latest, and should be prepared according to the instructions on the Nucleic Acids Research website (http://nar.oupjournals.org/). Authors who are submitting articles providing update information on databases that have previously been featured in Nucleic Acids Research should note that the deadline for submission of those articles is September 15, 2008.

The database issue would not be possible without timely reports from hundreds of reviewers. Thanks to you all! I would also like to thank Deborah Wardle for excellent editorial assistance. Finally, I would like to thank Claire Bird, and the rest of the team at Oxford University Press for producing this important issue.

Alex Bateman