A Tutorial on Bioinformatics
Overview of Bioinformatics
A/P Shoba Ranganathan
Justin Choo
National University of Singapore
What is Bioinformatics?
Bioinformatics is “the study of the information content and information flow in biological systems and processes”.
- Michael Liebman in “Bioinformatics: An Editorial Perspective” (http://www.netsci.org/Science/Bioinform/feature01.html)
• Annotate -> store -> search/retrieve -> analyze -> visualize
• Nucleic acid sequence (genes and RNAs), protein sequence and structural information.
SARS & Its Implication ...
SARS - Bioinformatics In Action
Sequencing Of SARS …
Photo: the sequencing area of the lab. Taken from http://www.bcgsc.ca/bioinfo/SARS/
Partial Sequence of SARS …
>gi|30248028|gb|AY274119.3| SARS coronavirus TOR2, complete genome
ATATTAGGTTTTTACCTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTTCTCTAAACGA
ACTTTAAAATCTGTGTAGCTGTCGCTCGGCTGCATGCCTAGTGCACCTACGCAGTATAAACAATAATAAA
TTTTACTGTCGTTGACAAGAAACGAGTAACTCGTCCCTCTTCTGCAGACTGCTTACGGTTTCGTCCGTGT
TGCAGTCGATCATCAGCATACCTAGGTTTCGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTTC
TTGGTGTCAACGAGAAAACACACGTCCAACTCAGTTTGCCTGTCCTTCAGGTTAGAGACGTGCTAGTGCG
TGGCTTCGGGGACTCTGTGGAAGAGGCCCTATCGGAGGCACGTGAACACCTCAAAAATGGCACTTGTGGT
CTAGTAGAGCTGGAAAAAGGCGTACTGCCCCAGCTTGAACAGCCCTATGTGTTCATTAAACGTTCTGATG
CCTTAAGCACCAATCACGGCCACAAGGTCGTTGAGCTGGTTGCAGAAATGGACGGCATTCAGTACGGTCG
TAGCGGTATAACACTGGGAGTACTCGTGCCACATGTGGGCGAAACCCCAATTGCATACCGCAATGTTCTT
CTTCGTAAGAACGGTAATAAGGGAGCCGGTGGTCATAGCTATGGCATCGATCTAAAGTCTTATGACTTAG
GTGACGAGCTTGGCACTGATCCCATTGAAGATTATGAACAAAACTGGAACACTAAGCATGGCAGTGGTGC
ACTCCGTGAACTCACTCGTGAGCTCAATGGAGGTGCAGTCACTCGCTATGTCGACAACAATTTCTGTGGC
CCAGATGGGTACCCTCTTGATTGCATCAAAGATTTTCTCGCACGCGCGGGCAAGTCAATGTGCACTCTTT
CCGAACAACTTGATTACATCGAGTCGAAGAGAGGTGTCTACTGCTGCCGTGACCATGAGCATGAAATTGC
CTGGTTCACTGAGCGCTCTGATAAGAGCTACGAGCACCAGACACCCTTCGAAATTAAGAGTGCCAAGAAA
TTTGACACTTTCAAAGGGGAATGCCCAAAGTTTGTGTTTCCTCTTAACTCAAAAGTCAAAGTCATTCAAC
CACGTGTTGAAAAGAAAAAGACTGAGGGTTTCATGGGGCGTATACGCTCTGTGTACCCTGTTGCATCTCC
ACAGGAGTGTAACAATATGCACTTGTCTACCTTGATGAAATGTAATCATTGCGATGAAGTTTCATGGCAG
ACGTGCGACTTTCTGAAAGCCACTTGTGAACATTGTGGCACTGAAAATTTAGTTATTGAAGGACCTACTA
...
The complete genome of SARS, obtained from http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?30248028:NCBI:4812069
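A sequence record like the one on this slide is in FASTA format: a header line beginning with ">" followed by lines of sequence. As a minimal sketch (the function names parse_fasta and gc_content are illustrative, not part of any particular library), such a record could be read and summarised like this. Only the first two sequence lines from the slide are used as input.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def gc_content(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# First two sequence lines of the record shown on the slide.
example = """>gi|30248028|gb|AY274119.3| SARS coronavirus TOR2, complete genome
ATATTAGGTTTTTACCTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTTCTCTAAACGA
ACTTTAAAATCTGTGTAGCTGTCGCTCGGCTGCATGCCTAGTGCACCTACGCAGTATAAACAATAATAAA
"""

records = parse_fasta(example)
for header, seq in records:
    print(header)
    print(len(seq), "bases, GC fraction", round(gc_content(seq), 3))
```

For real work, an established parser such as the one in Biopython is usually a better choice than a hand-written one.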
Bioinformatics - Timeline
[Timeline figure: axis from 1980 to 2005 in five-year steps]
Single Structures
• Modeling & Geometry
• Forces & Simulation
• Docking
Sequences, Sequence-Structure Relationships
• Alignment
• Structure Prediction
• Fold recognition
Genomics
• Dealing with many sequences
• Gene finding & Genome Annotation
• Databases
Integrative Analysis
• Expression & Proteomics Data
• Data mining
• Simulation again… (whole cells?)
Biological Databases
• Collect, organise and classify data
• Query the dataset
• Retrieve entries based on keyword search
Databases shown: EMBL, PDB, GenBank
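The collect, query and retrieve steps above can be sketched with a small in-memory SQLite table. This is a toy illustration, not how EMBL, PDB or GenBank are actually implemented. The table layout and the function names are assumptions, and the two DEMO entries are hypothetical placeholders; only the SARS accession comes from this tutorial.

```python
import sqlite3


def build_database(entries):
    """Collect entries into an in-memory table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries ("
        "accession TEXT PRIMARY KEY, source_db TEXT, description TEXT)"
    )
    conn.executemany("INSERT INTO entries VALUES (?, ?, ?)", entries)
    conn.commit()
    return conn


def keyword_search(conn, keyword):
    """Retrieve (accession, description) rows whose description contains keyword.

    SQLite's LIKE is case-insensitive for ASCII letters by default; '%' and
    '_' inside the keyword act as wildcards.
    """
    rows = conn.execute(
        "SELECT accession, description FROM entries "
        "WHERE description LIKE ? ORDER BY accession",
        (f"%{keyword}%",),
    )
    return rows.fetchall()


entries = [
    ("AY274119.3", "GenBank", "SARS coronavirus TOR2, complete genome"),
    # The two entries below are hypothetical placeholders, not real records.
    ("DEMO0001", "hypothetical", "placeholder bacterial genome entry"),
    ("DEMO0002", "hypothetical", "placeholder plant protein entry"),
]

conn = build_database(entries)
for accession, description in keyword_search(conn, "genome"):
    print(accession, "|", description)
```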
Sequence Analysis Software
• What is the information contained in a biological sequence?
• How can we analyse it to gain knowledge?
• Does it contain any functional clues?
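One basic piece of information in a nucleotide sequence is the protein it may encode. The sketch below translates DNA with the standard genetic code and reports the first open reading frame it finds, assuming the coding strand is given and ignoring reverse-strand frames. The function names are illustrative, and the test fragment is a 30-base stretch taken from the partial SARS sequence shown earlier in this tutorial.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA codon by codon; '*' marks a stop, 'X' an unknown codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


def find_first_orf(dna):
    """Return the peptide from the first ATG up to the next in-frame stop."""
    start = dna.upper().find("ATG")
    if start == -1:
        return ""
    protein = translate(dna[start:])
    stop = protein.find("*")
    return protein if stop == -1 else protein[:stop]


fragment = "ATGGAGAGCCTTGTTCTTGGTGTCAACGAG"
print(translate(fragment))
```

A translated peptide is one kind of functional clue: it can then be compared against known proteins, which is the topic of the next slides.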
Sequence Comparison
• How can we compare a given sequence to the millions in the database?
• Which ones are truly related by evolution?
• What can the study of related sequences tell us?
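To give a feel for database comparison, the sketch below ranks database sequences by the fraction of short subsequences (k-mers) they share with a query, using Jaccard similarity. This is a deliberately simple stand-in, not the method used by production search tools such as BLAST, and a high score by itself does not show that two sequences are related by evolution. All names and the toy database are hypothetical.

```python
def kmers(seq, k=3):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b, k=3):
    """Jaccard similarity of the k-mer sets of two sequences (0.0 to 1.0)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka and not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def rank_database(query, database, k=3):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, jaccard(query, seq, k)) for name, seq in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy database with made-up sequences, for illustration only.
database = {
    "identical": "ACGTACGTAC",
    "similar": "ACGTACGTTT",
    "unrelated": "GGGGGGGGGG",
}

for name, score in rank_database("ACGTACGTAC", database):
    print(name, round(score, 2))
```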
Sequence Alignment
• After collecting a set of related sequences, how can we compare them as a set?
• How should we line up the sequences so that the most similar portions are together?
• What do we do with sequences of different lengths?
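The questions above are usually answered by inserting gaps so that similar portions line up. As a minimal sketch, the code below implements Needleman-Wunsch global alignment for a pair of sequences; aligning a whole set (multiple alignment) builds on the same idea but is not shown. The scoring values (match +1, mismatch -1, gap -2) are arbitrary assumptions chosen for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("ACGT", "AGT")
print(aligned_a)
print(aligned_b)
print("score:", total)
```

The shorter sequence is padded with "-" characters, which is how sequences of different lengths are handled.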
Protein Structure
• The function of a protein is a consequence of its folded state (Anfinsen, 1961)
• The 3D fold of a protein is called its structure
• In 3D, the business end of the protein has contributions from different regions of its sequence
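The last point can be made concrete by looking for residues that are far apart in the sequence but close together in space. The sketch below does this for a list of per-residue coordinates. The six coordinates are invented to form a hairpin shape and do not come from any real protein; the 6.0 Å cutoff and the minimum sequence separation of 3 are likewise arbitrary assumptions.

```python
import math


def long_range_contacts(coords, cutoff=6.0, min_separation=3):
    """Return index pairs (i, j) that are close in 3D but distant in sequence.

    coords is a list of (x, y, z) positions, one per residue, in sequence
    order. A pair counts as a contact if j - i >= min_separation and the
    Euclidean distance between the two positions is below cutoff.
    """
    contacts = []
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            if math.dist(coords[i], coords[j]) < cutoff:
                contacts.append((i, j))
    return contacts


# Hypothetical hairpin: residues 0-2 run one way, residues 3-5 run back.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (7.6, 5.0, 0.0),
    (3.8, 5.0, 0.0),
    (0.0, 5.0, 0.0),
]

contacts = long_range_contacts(toy_coords)
print(contacts)
```

In this toy fold the first and last residues end up next to each other, which is the kind of arrangement that lets distant parts of a sequence contribute to one functional site.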
Picture taken from http://www.strgen.org/
Visualization
• Using graphic tools to view structures
• Simple commands to analyse structures and active sites
• Different graphic representations and colouring schemes
Picture taken from http://www.nature.com/
Careers in Bioinformatics
Genomics:
• Genome sequencing of
  – Bacteria, viruses
  – Animals
  – Plants
• Comparative genomics
• Annotation and Mapping
• Gene Discovery
Careers in Bioinformatics
Functional Genomics (Gene Expression and Regulation):
• Control Regions
  – Switches
  – Circuits
  – Bypass
  – Feedback loops
• Environmental Effects
• Diseased States
• Chemical Consequences
Careers in Bioinformatics
Pharmacogenomics:
• SNPs
  – Regional, ethnic variations
  – Inheritance patterns
  – Radiological/ecological modifications
• Therapeutic target recognition
• Correlation of drug and expression effects
• Pathway Effects
Careers in Bioinformatics
Proteomics:
• Protein Profiling
  – Alternate splice variants
  – Orphan genes
  – Cryptic introns
• Gene Therapy
Careers in Bioinformatics
Structural Genomics:
• Experimental Protein structures
  – Apo state
  – Holo state
  – Structural modifications
• Membrane Proteins
• Homology Modelling
• Comparative Modelling
Careers in Bioinformatics
Drug and Vaccine Design:
• Screening Natural Products
  – Plants
  – Fungi
  – Bacteria
• Chemicals
• In silico modifications of ligands
• Vaccine design and delivery
Job Sectors
• Academia
• Research Institutes
• Biotechnology
• Bioinformatics
• Pharmaceutical
• Agriculture
• Biodiversity
The End