Bioinformatics Databases

Bioinformatics Databases
Amandeep S. Sidhu
Data Mining in Bioinformatics (Week 2)

Outline
  - Biological Data
  - Database Growth
  - Challenges of Large Databases
  - Genes
  - Gene Databases
  - Proteome
  - Protein Databases
  - Other Data
  - Data Searching – ANGIS
  - Summary
  - Exercises

Biological Data
  DNA and protein sequences are annotated with:
  - Source
  - Organism
  - Function
  - Updates
  - Etc.

Database Growth
  [Figures: database growth charts, three slides]

Challenges of Large Databases
  - Storage: indexing, physical layout, memory management
  - Modeling: relational, hierarchical, semi-structured
  - Update, query, analysis
  - Visualization
  - Efficiency
  - Interpretation

What is a Gene?
  - The physical and functional unit of heredity that carries information from one generation to the next
  - The DNA sequence necessary for the synthesis of a functional protein or RNA molecule

Genome
  - The chromosomal DNA of an organism
  - The number of chromosomes and the genome size vary significantly from one organism to another
  - The size of the genome and the number of genes do not necessarily determine organism complexity

Genome Comparison

  ORGANISM                               CHROMOSOMES (haploid)  GENOME SIZE (bp)  GENES
  Homo sapiens (Humans)                  23                     3,200,000,000     ~30,000
  Mus musculus (Mouse)                   20                     2,600,000,000     ~30,000
  Drosophila melanogaster (Fruit Fly)    4                      180,000,000       ~18,000
  Saccharomyces cerevisiae (Yeast)       16                     14,000,000        ~6,000
  Zea mays (Corn)                        10                     2,400,000,000     ???

Genome Databases
  - NCBI: http://www.ncbi.nlm.nih.gov/
  - GenBank: http://www.ncbi.nlm.nih.gov/Genbank/
  - SRS: http://srs.ebi.ac.uk/
  - GDB: http://gdbwww.gdb.org/
  - OMIM: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
  - …..

Proteome
  - The complete collection of proteins that can be produced by an organism
  - Can be studied either as a static entity (the sum of all possible proteins) or as a dynamic entity (all proteins found at a specific time point)

Proteome Databases
  - PDB & WWPDB
      http://www.rcsb.org/pdb/
      http://www.wwpdb.org/index.html
  - PIR, SWISS-PROT & UniProt
      http://pir.georgetown.edu/home.shtml
      http://au.expasy.org/sprot/
      http://www.expasy.uniprot.org/
  - InterPro
      http://www.ebi.ac.uk/interpro/
  - SCOP
      http://scop.mrc-lmb.cam.ac.uk/scop/

Other Data
  Annotations
  - PubMed: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi
  - GO: http://www.geneontology.org/
  - PO: http://proteinontology.info/ and http://proteomeontology.org/

Data Searching
  - ANGIS: http://www.angis.org.au/
  - ANGIS Demo
  - Login Details

Summary
  - Bioinformatics is truly interdisciplinary: biology (natural sciences), informatics, mathematics and statistics
  - Databases are large, semi-structured, incomplete and inaccurate
  - There is a wide range of problems; solutions combine knowledge from the sciences with algorithms and models from informatics, mathematics and statistics

Exercises
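As a starting point for the Genome Comparison figures, the sketch below computes approximate gene density (genes per million base pairs) for each organism. It is only an illustration of the point that genome size and gene count do not track organism complexity. The numbers are copied from the slide and are approximate; the names GENOMES, genes_per_megabase and density_table are hypothetical helpers introduced here, not part of the lecture material.

```python
# Figures copied from the Genome Comparison slide; gene counts are approximate.
# Each entry: (organism, chromosomes, genome size in base pairs, gene count or None)
GENOMES = [
    ("Homo sapiens", 23, 3_200_000_000, 30_000),
    ("Mus musculus", 20, 2_600_000_000, 30_000),
    ("Drosophila melanogaster", 4, 180_000_000, 18_000),
    ("Saccharomyces cerevisiae", 16, 14_000_000, 6_000),
    ("Zea mays", 10, 2_400_000_000, None),  # gene count shown as ??? on the slide
]


def genes_per_megabase(size_bp, genes):
    """Return genes per million base pairs, or None if the gene count is unknown."""
    if genes is None:
        return None
    return genes / (size_bp / 1_000_000)


def density_table(genomes=GENOMES):
    """Map each organism name to its approximate gene density."""
    return {
        name: genes_per_megabase(size_bp, genes)
        for name, _chromosomes, size_bp, genes in genomes
    }


if __name__ == "__main__":
    for name, density in density_table().items():
        shown = "unknown" if density is None else f"{density:.1f}"
        print(f"{name:28s} {shown} genes per Mb")
```

With these figures, yeast has by far the highest gene density and human the lowest, although the human genome is the largest in the table.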
