
Iowa State University
Bioinformatics and Computational Biology Graduate Program




     Homology Modeling Lab


              Michael Zimmermann
      Department of Biochemistry, Biophysics,
              and Molecular Biology
           Laurence H. Baker Center for
      Bioinformatics and Biological Statistics
             The Jernigan Laboratory


           What will we do today?
• Quick Perspective on BCB and Structures
  • Molecular Structure Files (PDB)
  • Fold Space
  • Reduced Models
• Homology Modeling
  – What is it?
  – Why is it useful?
  – What problems does it solve?
  – How do we do it?



                Perspective




                Detailed genetic information
                informs organism-wide views






  Classical Structure Determination
• Protein structures are mostly solved by:
  – X-ray crystallography
  – NMR spectroscopy
  – Electron microscopy tomography
• All of these methods require a lot of human input
  from highly trained specialists.
• They are time-consuming.
• Cost: $10,000 to $1,000,000 for one structure.




                        Fold Space
1. Continuous
     1. Evolution makes small, gradual changes (usually) in
         sequence
     2. TM-score relatedness network (Jeffrey Skolnick) of the PDB
         has an average pair-wise path length of 7 steps
2. Discrete
     1. The highly connected space clusters well and exhibits
         community structure
     2. Structural ontologies are filling in
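
The "average pair-wise path length" quoted above is a standard graph statistic. The following is a minimal sketch of how it could be computed with breadth-first search, assuming structures are nodes and an edge joins two structures whose TM-score exceeds some cutoff. The toy graph and its node names are hypothetical and are not taken from the PDB.

```python
from collections import deque


def average_path_length(graph):
    """Mean shortest-path length over all connected, ordered node pairs.

    `graph` maps each node to the list of its neighbors (undirected edges
    are listed in both directions).
    """
    total = 0
    pairs = 0
    for source in graph:
        # Breadth-first search gives shortest hop counts from `source`.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        total += sum(distance.values())
        pairs += len(distance) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


# Hypothetical toy network: four structures connected in a chain.
toy_network = {
    "s1": ["s2"],
    "s2": ["s1", "s3"],
    "s3": ["s2", "s4"],
    "s4": ["s3"],
}
toy_average = average_path_length(toy_network)
print(round(toy_average, 3))
```

A short average path through a large network is what makes fold space look continuous, even though the same network can still be clustered into discrete communities.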








                            61,086 total structures on 11-02-09


      Where to get Molecular Files
• http://www.rcsb.org/




• Advanced Search












                             Molecule Files
• The Protein Data Bank (PDB) file 1T3R
Record  Serial  AtomType  Residue  Chain  ResidueNumber       X       Y       Z  Occupancy  B-Factor  Element
ATOM         8  N         GLN      A                  2  25.279  22.419  34.914       1.00     21.01  N
ATOM         9  CA        GLN      A                  2  23.872  22.620  34.516       1.00     17.82  C
ATOM        10  C         GLN      A                  2  23.654  24.078  34.247       1.00     18.11  C
ATOM        11  O         GLN      A                  2  23.996  24.956  35.114       1.00     20.40  O
ATOM        12  CB        GLN      A                  2  22.926  22.138  35.611       1.00     19.10  C
ATOM        13  CG        GLN      A                  2  21.447  22.401  35.328       1.00     18.52  C
ATOM        14  CD        GLN      A                  2  20.558  21.549  36.121       1.00     21.32  C
ATOM        15  OE1       GLN      A                  2  20.145  20.502  35.662       1.00     22.49  O
ATOM        16  NE2       GLN      A                  2  20.336  21.926  37.380       1.00     21.05  N
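
As a rough illustration of how these columns map to data, here is a minimal sketch that reads one ATOM row in the whitespace-separated form printed on this slide. The `Atom` type and `parse_atom_line` function are hypothetical names introduced for this example. Real PDB files are fixed-width, and adjacent fields can run together, so a robust reader should slice by column position or use an established library rather than split on whitespace.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str            # atom name, labeled "AtomType" on the slide
    residue: str
    chain: str
    residue_number: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float
    element: str


def parse_atom_line(line: str) -> Atom:
    """Parse one whitespace-separated ATOM row as printed on the slide."""
    fields = line.split()
    if len(fields) != 12 or fields[0] != "ATOM":
        raise ValueError(f"not a 12-field ATOM row: {line!r}")
    return Atom(
        serial=int(fields[1]),
        name=fields[2],
        residue=fields[3],
        chain=fields[4],
        residue_number=int(fields[5]),
        x=float(fields[6]),
        y=float(fields[7]),
        z=float(fields[8]),
        occupancy=float(fields[9]),
        b_factor=float(fields[10]),
        element=fields[11],
    )


# The alpha-carbon row of GLN 2 from the 1T3R excerpt above.
row = "ATOM  9  CA  GLN  A  2  23.872  22.620  34.516  1.00  17.82  C"
atom = parse_atom_line(row)
print(atom.name, atom.residue, atom.residue_number, atom.b_factor)
```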





             Reduced Models






           Types of Reduced Models

• Alpha carbon only
• CABS (Andrzej Kolinski)
• Angle sampling (Tobin Sosnick and David Baker)
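
The simplest of these reductions, the alpha-carbon-only model, keeps a single point per residue. A minimal sketch follows, using the GLN 2 atoms of 1T3R from the Molecule Files slide. The tuple layout `(atom name, residue number, (x, y, z))` and the function name `alpha_carbon_trace` are assumptions made for this example.

```python
def alpha_carbon_trace(atoms):
    """Return (residue_number, (x, y, z)) for every alpha carbon."""
    return [(res_num, xyz) for name, res_num, xyz in atoms if name == "CA"]


# All nine atoms listed for GLN 2 in the 1T3R excerpt.
gln2_atoms = [
    ("N", 2, (25.279, 22.419, 34.914)),
    ("CA", 2, (23.872, 22.620, 34.516)),
    ("C", 2, (23.654, 24.078, 34.247)),
    ("O", 2, (23.996, 24.956, 35.114)),
    ("CB", 2, (22.926, 22.138, 35.611)),
    ("CG", 2, (21.447, 22.401, 35.328)),
    ("CD", 2, (20.558, 21.549, 36.121)),
    ("OE1", 2, (20.145, 20.502, 35.662)),
    ("NE2", 2, (20.336, 21.926, 37.380)),
]

trace = alpha_carbon_trace(gln2_atoms)
print(trace)  # nine atoms reduce to one point for this residue
```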






                 A Word of Caution
• Only a few small proteins have ever been folded
  from first-principles methods
  – In these cases, it took years of CPU time
  – Failures are much more common
  – This shows our lack of understanding
• Homology modeling is useful, but does not
  illuminate the underlying problem of protein folding
• Pfam is a ready source of examples where known
  methods break down


            Ubiquitin and SUMO-2
• The two proteins share 15% sequence identity, yet their
  structures differ by only 1.5 Å RMSD.
• These differences are likely within the range of
  structural flexibility.
• PDB entries shown: 1UBI and 1WM2
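
For reference, RMSD is the square root of the mean squared distance between matched atoms. The sketch below assumes the two coordinate lists are already superimposed and paired atom by atom. It does not perform the optimal superposition that a structure-comparison program would apply first, and the toy coordinates are made up rather than taken from 1UBI or 1WM2.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical example: every atom is displaced by 1.5 Å along z.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.5), (1.0, 0.0, 1.5)]
toy_rmsd = rmsd(model_a, model_b)
print(toy_rmsd)
```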








             Homology
             Modeling



          A few Definitions by Eugene V. Koonin
Homologs: genes sharing a common origin
Orthologs: genes originating from a single ancestral gene in the last
  common ancestor of the compared genomes
Paralogs: genes related via duplication
Co-orthologs: two or more genes in one lineage that are, collectively,
  orthologous to one or more genes in another lineage due to
  lineage-specific duplication(s)
Outparalogs: paralogous genes resulting from duplication(s)
  preceding a given speciation event
Inparalogs: paralogous genes resulting from lineage-specific
  duplication(s) subsequent to a given speciation event
Xenologs: members of an orthologous cluster that were acquired from a
  distant lineage (by horizontal gene transfer)


               Homology Modeling
• How do we use homology modeling?
  – template selection
  – sequence-to-structure alignment
  – model building
  – model selection and refinement
• Examples






                What is Homology?
• It is believed that the number of possible protein folds
  is limited.
• Proteins with sequence identity of at least 35%
  almost certainly share a common ancestor (homology).
• For almost 70% of known protein sequences, a
  structural homolog can be detected in the PDB,
  which means that homology models of reasonable
  resolution (~3 Å) may be obtained for ~50,000,000
  proteins.
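
Sequence identity is computed from an alignment, not from the raw sequences. The sketch below counts identical residues over the columns where neither sequence has a gap. This choice of denominator is an assumption, since other conventions (for example, dividing by the length of the shorter sequence) are also in use. The two aligned sequences are made-up toy strings.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical pairwise alignment with one gap column.
identity = percent_identity("MKT-LLV", "MRTALLI")
print(round(identity, 1))

# Compare against the 35% rule of thumb quoted on the slide.
likely_homologous = identity >= 35.0
```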





             Homology Modeling






                Template Detection
• Sequence-only methods:
  – BLAST or FASTA scan against the PDB.
  – PSI-BLAST scan against a sequence database.
• Profile comparison:
  – Profile-to-profile alignment against a structural database.
• Threading:
  – Optimal fitting of the modeled sequence to structures from
    the PDB.
• Metaservers:
  – Combination of all of the above (and others).




                     Position Specific Scoring Matrix

• Rows: amino acids (A, R, N, D, C, E, Q, …, V)
• Columns: sequence positions 1, 2, 3, …, N
• Entry P_ia: probability that the “i”-th position
  in the sequence is the “a”-th amino acid.
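
A minimal sketch of how such a matrix of probabilities could be estimated from a gap-free multiple sequence alignment is shown below. The function name, the toy alignment, and the uniform pseudocount are assumptions for illustration. Tools such as PSI-BLAST additionally weight sequences and convert the probabilities to log-odds scores against background frequencies, which this sketch does not do.

```python
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def position_probabilities(alignment, pseudocount=1.0):
    """Estimate P[i][a], the probability of amino acid a at position i."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    matrix = []
    for i in range(length):
        # Start every amino acid at the pseudocount so no probability is zero.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in alignment:
            residue = seq[i]
            if residue in counts:  # skip gaps and non-standard letters
                counts[residue] += 1.0
        total = sum(counts.values())
        matrix.append({aa: count / total for aa, count in counts.items()})
    return matrix


# Hypothetical alignment of three short sequences.
toy_alignment = ["ARND", "ARNE", "AKND"]
pssm = position_probabilities(toy_alignment)
print(round(pssm[0]["A"], 3), round(pssm[1]["R"], 3), round(pssm[1]["K"], 3))
```

Each column of the result sums to 1, and a conserved position (such as the first column here) concentrates its probability on one amino acid.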




          Sequence-to-Structure Alignment
• An alignment is generated along with template
  detection.
• When different template-detection sources are used
  (e.g., a metaserver), one may obtain different
  alignments to the same template.
• Recently, a new approach to this problem has
  been proposed: no initial alignment is made, and the
  modeled sequence is aligned to the template
  during the modeling step (Kolinski, 2008).




                   Modeling
• The template is used as a rigid scaffold. The modeling
  algorithm rebuilds missing parts, mainly loops.
• The template is used as a semi-flexible scaffold. Not
  only are missing parts rebuilt, but some adjustments
  to the template-based fragments are also made
  during modeling.
• Usually a large number of models is generated at
  this stage.
• MODELLER (A. Sali), Rosetta (D. Baker),
  CABS (A. Kolinski), UNRES (H. Scheraga)




Homology Modeling to (re)generate Loop Regions

                      1A14L.pdb

                      a – SICHO
                      b – CABS
                      c – REFINER
                      d – MODELLER

                      Green – Native loop structure
                      Red – modeled loop structure








Homology Modeling Example




Adapted from material by Mateusz Kurcinski, who is working
with Andrzej Kolinski and collaborators here.




               Thank You

Please direct questions you have about
 this material to michaelz at iastate.edu
       http://ribosome.bb.iastate.edu/people/



