       Secondary Structure
    Assignment from Structure

    PHAR 201/Bioinformatics I
       Philip E. Bourne
  Department of Pharmacology, UCSD

    Reading: Chapter 19, Structural
           Bioinformatics
            PHAR 201 Lecture 05, 2012   1
                Agenda

• Why secondary structure assignment is
  important
• Hydrogen bonding models
• DSSP (Kabsch-Sander) and its impact
• Other methods
• Conclusions


Reminder - Dihedral Angles




                                     From http://www.imb-jena.de

phi     -   dihedral angle about the N-Calpha bond
psi     -   dihedral angle about the Calpha-C bond
omega   -   dihedral angle about the C-N (peptide) bond
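
The three definitions above all reduce to the same calculation on four consecutive backbone atoms. What follows is a minimal sketch in Python, not part of the lecture: the function name `dihedral` and the toy coordinates in the usage note are illustrative assumptions. For phi the four atoms would be C(i-1), N(i), Calpha(i), C(i); for psi N(i), Calpha(i), C(i), N(i+1); for omega Calpha(i), C(i), N(i+1), Calpha(i+1).

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b0, b1)          # normal of the plane p0-p1-p2
    n2 = _cross(b1, b2)          # normal of the plane p1-p2-p3
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

With toy coordinates, `dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))` describes a trans arrangement, the situation of omega in a normal peptide bond.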

        Reminder - Helices
                              phi(deg)   psi(deg) H-bond pattern
------------------------------------------------------------------
right-handed alpha-helix        -57.8      -47.0        i+4
pi-helix                        -57.1      -69.7        i+5
3₁₀ helix                       -74.0       -4.0        i+3

(omega is ~180 deg in all cases)
-----------------------------------------------------------------

                                                          From http://www.imb-jena.de




          Reminder - Beta Strands
                              phi(deg)   psi(deg)   omega (deg)
------------------------------------------------------------------
beta strand                    -120        120         180
-----------------------------------------------------------------
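
To make the ideal values in the two reminder tables concrete, here is a small Python sketch that reports which ideal (phi, psi) pair a residue lies closest to. It is only an illustration: the names `IDEAL`, `angle_diff` and `nearest_ideal` are assumptions, and a nearest-point lookup like this is not one of the assignment methods discussed in this lecture.

```python
import math

# Ideal (phi, psi) values in degrees, copied from the two reminder tables.
IDEAL = {
    "alpha-helix": (-57.8, -47.0),
    "pi-helix": (-57.1, -69.7),
    "3-10 helix": (-74.0, -4.0),
    "beta strand": (-120.0, 120.0),
}

def angle_diff(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def nearest_ideal(phi, psi):
    """Name of the ideal conformation closest to (phi, psi)."""
    def dist(name):
        ideal_phi, ideal_psi = IDEAL[name]
        return math.hypot(angle_diff(phi, ideal_phi),
                          angle_diff(psi, ideal_psi))
    return min(IDEAL, key=dist)
```

The wrap-around in `angle_diff` matters because torsion angles are periodic: 170 and -170 degrees are only 20 degrees apart.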




                         Hydrogen bond patterns in beta sheets. Here a four-stranded
                         beta sheet containing three antiparallel strands and one
                         parallel strand is drawn schematically. Hydrogen bonds are
                         indicated with red lines (antiparallel strands) and green lines
                         (parallel strands) connecting the hydrogen and acceptor oxygen.

                            From http://broccoli.mfn.ki.se/pps_course_96/

    Why is consistent secondary
structure assignment from structure
             important?
• It is part of the definition of folds and domains
• It is a useful conceptualization for understanding
  structure
• It influences sequence alignment
• It is related to function
• It is useful in structure prediction –
  it defines regions on the templates
• It serves as a training set for machine learning algorithms
• It makes searching consistent – authors’
  assignments differ

  Example where secondary structure is important

• “Integrin-linked kinase” (Ilk) is a novel protein kinase fold with strong
  sequence similarity to known structures (Hannigan et al. 1996 Nature 379, 91-96)
• Aligns to Src kinases with BLAST e-value of 10^-19 and 27% identity
  (alignment shown is to a known Src kinase structure)
• Several key residues are conserved, but residues important to catalysis,
  including catalytic Asp, are missing
• Recent experimental evidence suggests that Ilk lacks kinase activity
  (Lynch et al. 1999 Oncogene 18, 8024-8032)

             150                                                200
Ilk____PSS   .......... .......... ........CC ....CEEEHH HHCCCCCCEE
Ilk____Seq   .......... .......... ........FK ....QLNFLT KLNENHSGEL
------------                               -+     +L-+++ KL-+---GE-
1fmk--_Seq   KHADGLCHRL TTVCPTSKPQ TQGLAKDAWE IPRESLRLEV KLGQGCFGEV
1fmk--_SS    HCCCCCCCCC CEECCCCCCC CCCCCCCCCE CCHHHEEEEE EEEECCCEEE
                                                           * * *

             200                                                250
Ilk____PSS   EEEECCCCE. EEEEEEECCC CCCCCHHHHH HHHHHHHHHC CCCEEEEEEE
Ilk____Seq   WKGRWQGND. IVVKVLKVRD WSTRKSRDFN EECPRLRIFS HPNVLPVLGA
------------ W+G+W-G+-  +-+K+LK-    +T+++-+F- +E---++-++ H++++-++++
1fmk--_Seq   WMGTWNGTTR VAIKTLKP.. .GTMSPEAFL QEAQVMKKLR HEKLVQLYAV
1fmk--_SS    EEEEECCCEE EEEEEECC.. .CCCCHHHHH HHHHHHHHCC CCCECCEEEE
                           *                   *

             250                                                300
Ilk____PSS   EECCCCEEEE EEHHHHCCCC HHHHHHCCCC CCCCHHHHHH HHHHHHHHHH
Ilk____Seq   CQSPPAPHPT LITHWMPYGS LYNVLHEGTN FVVDQSQAVK FALDMARGMA
------------ ++++P   -- ++T--M++GS L-++L-+-T+ --+--+Q-V+ +A+++A+GMA
1fmk--_Seq   VSEEP...IY IVTEYMSKGS LLDFLKGETG KYLRLPQLVD MAAQIASGMA
1fmk--_SS    ECCCC...EE EEEECCCCCE HHHHHCCCCC CCCCHHHHHH HHHHHHHHHH

             300                                                350
Ilk____PSS   HHHCCCCCEE CCCCCCCCEE ECCCCEEEEC CCCCEEECCC CCCCCCCCCC
Ilk____Seq   FLHTLEPLIP RHALNSRSVM IDEDMTARIS MADVKFSFQC PGRMYAPAWV
------------ ++++--- -  ---L-+++++ ++E+-+++++ ---+--         +---W-
1fmk--_Seq   YVERMNY..V HRDLRAANIL VGENLVCKVA DFGLAR.... ....FPIKWT
1fmk--_SS    HHHHHCC..C CCCCCHHHEE EECCCEEEEC CCCCCC.... ....CCHHHC
                          *    *              *
                          Cat. Loop

             350                                                400
Ilk____PSS   HHHHHHCCCC CCCCEEEEEE EEHHHHHHHH H.CCCCCCCC CHHHHHHHHH
Ilk____Seq   APEALQKKPE DTNRRSADMW SFAVLLWELV T.REVPFADL SNMEIGMKVA
------------ APEA++++-    ---++D+W SF++LL+EL+ T -+VP+-++ +N-E+-++V
1fmk--_Seq   APEAALYGR. ..FTIKSDVW SFGILLTELT TKGRVPYPGM VNREVLDQV.
1fmk--_SS    CHHHHHHCC. ..CCHHHHHH HHHHHHHHHH CCCCCCCCCC CHHHHHHHH.
             ***

        History of Assignment
• Originally left to the interpretation of the
  structural biologist – inconsistent
• 1983 – the Kabsch-Sander algorithm was written
  as an aid to secondary structure prediction; the
  prediction program as such never emerged, but what
  did emerge is perhaps the most consistent and
  widely accepted algorithm in all of structural
  bioinformatics
• Assignments are embodied in the DSSP
  program and its associated database of
  assignments

Inconsistent Author Assignment




    Hydrogen Bonding is Key to
       Automated Methods
• Why? - ~90% of backbone donors (NH)
  and acceptors (C=O) form hydrogen
  bonds
• 62% are intra-backbone
• Basic definition
  – Angle N–H…O greater than 120 degrees
  – H…O distance less than 2.5 Å
  – Note H’s not usually identified directly
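
The basic definition above can be written as a short geometric test. This is a minimal Python sketch under stated assumptions: the function names `angle_deg` and `is_hbond` are illustrative, coordinates are in Å, and the hydrogen position is assumed to be already known (in practice it usually has to be placed by calculation, as the last bullet notes).

```python
import math

def angle_deg(a, b, c):
    """Angle in degrees at vertex b formed by the points a-b-c."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    cos_t = sum(p * q for p, q in zip(v1, v2)) / (math.dist(a, b) * math.dist(c, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def is_hbond(n, h, o, min_angle=120.0, max_dist=2.5):
    """True if the N-H...O angle exceeds min_angle and H...O is below max_dist."""
    return math.dist(h, o) < max_dist and angle_deg(n, h, o) > min_angle
```

The two thresholds are exposed as parameters because different programs use slightly different cutoffs.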

Hydrogen Bond - Definition




      Coulomb Hydrogen Bond
     Calculation – used by DSSP
                  1 1 1 1                           
             + -                                    
      E = f    + + +                              
                 NO rHC' rHO rNC'
                  r                                   
•   f is a constant 332 Å kcal/e2
•   Delta is the + and – polar charge in electrons
•   Weakest H-bond –0.5 kcal/mole in DSSP
•   H not given – requires extrapolation – note assumes
    planar geometry for peptide bond
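
A minimal Python sketch of this calculation follows. Only f = 332 and the –0.5 kcal/mol cutoff come from the slide. The partial charges of 0.42 e and 0.20 e, and the rule for placing H (1.0 Å from N, parallel to the O to C direction of the preceding carbonyl), are assumptions taken from the Kabsch and Sander description. The function names are illustrative.

```python
import math

F = 332.0            # Å kcal/(mol e^2), from the slide
Q1, Q2 = 0.42, 0.20  # partial charges in e (assumed, not on the slide)
E_CUTOFF = -0.5      # kcal/mol, weakest H-bond accepted (from the slide)

def place_h(n, c_prev, o_prev, bond=1.0):
    """Extrapolate the amide H position.

    Assumption: H lies `bond` Å from N along the O->C direction of the
    preceding residue's carbonyl, i.e. a planar trans peptide bond.
    """
    d = math.dist(c_prev, o_prev)
    return tuple(ni + bond * (ci - oi) / d
                 for ni, ci, oi in zip(n, c_prev, o_prev))

def hbond_energy(c, o, n, h):
    """Electrostatic energy (kcal/mol) between a C'=O group and an N-H group."""
    return Q1 * Q2 * F * (1.0 / math.dist(n, o) + 1.0 / math.dist(h, c)
                          - 1.0 / math.dist(h, o) - 1.0 / math.dist(n, c))

def is_dssp_hbond(c, o, n, h):
    """True if the energy is below the -0.5 kcal/mol cutoff."""
    return hbond_energy(c, o, n, h) < E_CUTOFF
```

For a collinear C'=O…H–N arrangement with an N…O distance of 2.9 Å, this sketch gives an energy close to –3 kcal/mol, well below the cutoff.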

DSSP – Dictionary of Secondary Structures
               of Proteins
• Defined solely on the basis of the H-bonds
  found – from the list of bonds and the residues
  that form them, helix assignments are
  made as follows:
  – Alpha helix (H): start i -> i+4; end i-4 -> i
  – 3₁₀ helix (G): start i -> i+3; end i-3 -> i
  – Pi helix (I): start i -> i+5
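
The helix rules above can be sketched in a few lines of Python. This is a simplified illustration, not the DSSP program: it assumes the H-bond list is given as a set of (i, j) residue index pairs (C=O of residue i bonded to N-H of residue j), it marks a helix wherever two consecutive i -> i+n bonds occur, and it lets H take priority over G and I. The function name `assign_helices` is hypothetical.

```python
def assign_helices(n_res, hbonds):
    """Assign H, G, I (or C) to n_res residues from a set of (i, j) H-bonds.

    An n-turn exists at residue i if (i, i + n) is in hbonds. Two
    consecutive n-turns at i-1 and i make residues i .. i+n-1 helical.
    """
    ss = ["C"] * n_res
    # Assumed priority: alpha helix first, then 3-10, then pi.
    for n, code in ((4, "H"), (3, "G"), (5, "I")):
        turn = [(i, i + n) in hbonds for i in range(n_res)]
        for i in range(1, n_res):
            if turn[i - 1] and turn[i]:
                for k in range(i, min(i + n, n_res)):
                    if ss[k] == "C":   # do not overwrite a higher-priority code
                        ss[k] = code
    return "".join(ss)
```

For example, four consecutive i -> i+4 bonds starting at residue 0 in a 12-residue chain give a six-residue helix covering residues 1 to 6.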


DSSP – Dictionary of Secondary Structures
               of Proteins
• Similarly for beta sheets:
  – Residues (E) – have 2 H-bonds in the sheet or are
    surrounded by 2 H-bonds
  – Isolated residues (B) – beta bridge (figure example: 1GCS)
  – Beta bulges are also assigned E – may exist as up to 4
    residues on one side of the sheet and 1 on the other



          DSSP Nomenclature
•   H = alpha helix
•   G = 3₁₀ helix
•   I = pi helix
•   B = isolated beta bridge (single-residue sheet)
•   E = extended beta strand
•   T = hydrogen-bonded turn (e.g. a beta turn)
•   S = bend
•   C = coil
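
For working with assignment strings, the codes above can be kept in a lookup table. The Python sketch below also collapses them to three states (helix, strand, coil). That reduction is one common convention and an assumption here; the slide does not define it, and other groupings are in use. The names `DSSP_CODES`, `THREE_STATE` and `reduce_to_three_state` are illustrative.

```python
# One-letter codes as listed on the slide.
DSSP_CODES = {
    "H": "alpha helix",
    "G": "3-10 helix",
    "I": "pi helix",
    "B": "isolated beta bridge",
    "E": "extended beta strand",
    "T": "hydrogen-bonded turn",
    "S": "bend",
    "C": "coil",
}

# Assumed three-state grouping: helices -> H, bridge/strand -> E, rest -> C.
THREE_STATE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}

def reduce_to_three_state(ss):
    """Map an eight-state assignment string to H/E/C."""
    return "".join(THREE_STATE.get(code, "C") for code in ss)
```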

        Converse Situation?
• In our discussions of structure comparison
  and alignment, structure classification and
  (soon) domain assignment we learnt there
  was not one generally accepted method
• DSSP has for a long time been a generally
  accepted method



   DSSP as Implemented in the PDB




1ATP

     STRIDE – Empirical Hydrogen Bond
               Calculation
$$E_{hb} = E_r \cdot E_t \cdot E_p$$

$$E_r = \left(\frac{4\,r_m^{6}}{r^{6}} - \frac{3\,r_m^{8}}{r^{8}}\right) E_m$$

$$E_p = \cos^{2}(t_p)$$

$$E_t = \begin{cases}
[0.9 + 0.1\sin(2t_i)]\cos(t_o) & 0 < t_i \le 90^\circ \\
K_1\,[K_2 - \cos^{2}(t_i)]^{3}\cos(t_o) & 90^\circ < t_i \le 110^\circ \\
0 & 110^\circ < t_i
\end{cases}$$

- Derived from small molecule structures: r_m = 3.0 Å and E_m = –2.8 kcal/mol
- Total energy Ehb
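
A minimal Python sketch of the STRIDE energy terms follows, under stated assumptions. r_m and E_m are the values given on the slide. K1 and K2 are not given there; the values used below are an assumption, chosen so that E_t is continuous at 90 degrees and falls to zero at 110 degrees. The angles t_i, t_o and t_p are taken in degrees, and all function names are illustrative.

```python
import math

R_M = 3.0    # Å, optimal distance (from the slide)
E_M = -2.8   # kcal/mol, optimal energy (from the slide)

# Assumed constants (not on the slide): make E_t continuous at 90 degrees
# and zero at 110 degrees.
K2 = math.cos(math.radians(110.0)) ** 2
K1 = 0.9 / K2 ** 3

def e_r(r):
    """Distance term; has its minimum E_M at r = R_M."""
    x = R_M / r
    return E_M * (4.0 * x ** 6 - 3.0 * x ** 8)

def e_p(t_p):
    """Planarity term cos^2(t_p)."""
    return math.cos(math.radians(t_p)) ** 2

def e_t(t_i, t_o):
    """Angular term, piecewise in t_i (t_i = 0 is included for convenience)."""
    cos_o = math.cos(math.radians(t_o))
    if 0.0 <= t_i <= 90.0:
        return (0.9 + 0.1 * math.sin(math.radians(2.0 * t_i))) * cos_o
    if 90.0 < t_i <= 110.0:
        return K1 * (K2 - math.cos(math.radians(t_i)) ** 2) ** 3 * cos_o
    return 0.0

def e_hb(r, t_i, t_o, t_p):
    """Total empirical hydrogen bond energy."""
    return e_r(r) * e_t(t_i, t_o) * e_p(t_p)
```

Because the angular terms only scale the distance term, the most favorable value this sketch can return is E_m itself, reached at r = r_m with ideal angles.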

 STRIDE – Empirical Hydrogen
      Bond Calculation
• Uses Ehb and phi-psi torsional angle
  criteria
• Torsional angles define secondary
  structures according to the regions of the
  Ramachandran plot in which they fall
• Ehb is ignored if phi and psi are unfavorable



Comparison DSSP & STRIDE




          DSSP vs STRIDE
• STRIDE – added term in the expression of
  hydrogen bond energy
• STRIDE – selection of terminal residues
  through reliance on torsional angles
• STRIDE – stresses planarity of hydrogen
  bonds while allowing longer bonds
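
Differences between two assignment methods are easy to quantify once both are available as per-residue strings. The Python sketch below is an illustration only: the function names and the example strings in the usage note are made up and are not output from DSSP or STRIDE.

```python
def agreement(ss_a, ss_b):
    """Fraction of residues given the same code by both assignments."""
    if len(ss_a) != len(ss_b) or not ss_a:
        raise ValueError("assignments must be non-empty and of equal length")
    same = sum(a == b for a, b in zip(ss_a, ss_b))
    return same / len(ss_a)

def disagreements(ss_a, ss_b):
    """List of (position, code_a, code_b) where the two assignments differ."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(ss_a, ss_b)) if a != b]
```

With the made-up pair `"CHHHHHHC"` and `"CCHHHHCC"`, the two differences fall on the first and last residue of the helix, which is where methods that treat terminal residues differently tend to disagree.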



            Other Methods
• DEFINE – uses a distance criterion
  between Calpha atoms which varies
  slightly for each secondary structure type;
  allows modifications for curvature

• P-Curve – analysis of protein curvature –
  compares to ideal motifs – unknown motif
  defined by tilt, roll, etc. between peptide
  planes.

              Comparative Notes
• The last residues of a sheet or a helix are often still in the same
  conformation, although they no longer have hydrogen bonds in the
  structure. This translates to the observation that ends (caps) of
  regular secondary structure segments are not well defined.
• It seems that Cα-distance criteria (applied in DEFINE) alone can
  accommodate considerable distortion of the backbone, giving an
  excess of secondary structure assignments despite having reduced
  e considerably.
• DSSP is the only assignment scheme with a large peak for α-helices
  of four residues, many of which constitute single helical turns.
• DEFINE assigns more than twice as many sheets of length four than
  the other methods.
• P-Curve has a tendency to assign overly long elements of regular
  secondary structure.



Amino Acid Propensities Indicate the Role of
Side Chains in Defining Secondary Structure
      – Basis of Prediction Methods –
 Note that none of the assignment methods
                  use this

• Alpha helices – rich in ALA, LEU; poor in
  PRO and GLY
• Beta sheets – rich in VAL, ILE; poor in
  GLY, ASP, PRO
• 3₁₀ helices – rich in PRO; poor in ALA, LEU
• Beta bridges – poor in VAL, ILE
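
Statements such as "rich in" and "poor in" are usually expressed as a propensity: how often a residue type occurs in a given state relative to how often it occurs overall. The Python sketch below computes that ratio from a sequence and its assignment string. The function name and the toy data in the usage note are assumptions for illustration and do not reproduce any published propensity table.

```python
def propensity(seq, ss, aa, state):
    """Propensity of residue type `aa` for secondary structure `state`.

    Defined here as (fraction of `state` residues that are `aa`) divided by
    (fraction of all residues that are `aa`). Values above 1 mean enriched.
    """
    if len(seq) != len(ss):
        raise ValueError("sequence and assignment must have equal length")
    n_state = ss.count(state)
    n_aa = seq.count(aa)
    if n_state == 0 or n_aa == 0:
        raise ValueError("state or residue type does not occur")
    n_both = sum(1 for a, s in zip(seq, ss) if a == aa and s == state)
    return (n_both / n_state) / (n_aa / len(seq))
```

On the toy input `seq="AAAAVVVV"`, `ss="HHHHEEEE"`, alanine has a helix propensity of 2.0 and valine a helix propensity of 0.0. Real tables are built from large sets of assigned structures.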

   Newer Methods DSSPcont
• Use known alignments from multiple 3D
  structures or from multiple members of the
  NMR ensemble (DSSPcont)
• Consensus based approach
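
The consensus idea can be illustrated with a per-residue majority vote over several assignment strings, for example one per NMR model. This Python sketch is a plain vote under that assumption; it is not the DSSPcont algorithm, and the function names are hypothetical.

```python
from collections import Counter

def consensus(assignments):
    """Most frequent code at each position across equal-length assignments."""
    if not assignments or any(len(a) != len(assignments[0]) for a in assignments):
        raise ValueError("need at least one assignment, all of equal length")
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*assignments))

def support(assignments):
    """Fraction of assignments that agree with the consensus at each position."""
    cons = consensus(assignments)
    n = len(assignments)
    return [sum(a[i] == c for a in assignments) / n
            for i, c in enumerate(cons)]
```

The `support` values give a continuous measure of how well defined each position is, which is the kind of information a single assignment string cannot carry.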




                        Supersecondary Structures

                                                                             http://en.wikipedia.org/wiki/Meander_(art)

http://en.wikipedia.org/wiki/Zinc_finger




                             Zinc Finger Motif


            I-sites (Baker)
• I-sites – specific segments with common
  amino acid propensities
• Used by Rosetta to predict structure –
  perhaps the most successful method thus
  far
• Note considers only main chain hydrogen
  bonds – much of the tertiary structure is
  associated with side chain interactions

                     Summary
• DSSP remains the first and most popular approach
• STRIDE may have been developed as part of the EMBL
  ….
• DSSP has been coded a number of times from the paper
  often with different results – open source helps this today
• DSSP is perhaps the most accepted algorithm in all of
  structural bioinformatics
• It is not always clear whether the secondary structure
  assignments deposited with a structure are from DSSP
  or from the authors’ view
• Consistent searching requires that DSSP be used for all
  structures – early structures had no author assignments