					           PREDICTION OF
    3D STRUCTURE & FUNCTION OF
        PROTEINS & PEPTIDES


A THESIS SUBMITTED TO THE UNIVERSITY OF PUNE
             FOR THE DEGREE OF
   DOCTOR OF PHILOSOPHY IN BIOINFORMATICS




                     BY
         URMILA KULKARNI-KALE


          UNDER THE GUIDANCE OF
           PROF. A. S. KOLASKAR


          BIOINFORMATICS CENTRE
            UNIVERSITY OF PUNE
                  PUNE


                 October 2003




                                 ABSTRACT


The field of Bioinformatics is growing at a phenomenal rate in the post-genomic
era, and the task in the early years of the millennium is to demonstrate how
in silico simulations facilitate experiments in the laboratory and how this
knowledge can be applied to curing human diseases. The first step towards
this goal is to translate genomic data into biological knowledge by studying
the function of all known proteins. Both computational and experimental
approaches play a critical role in identifying the function of proteins.
Bioinformatics offers several approaches for the prediction of structure and
function of proteins on the basis of sequence and structural similarities. The
protein sequence → structure → function relationship is well established and
reveals that structural details at the atomic level help in understanding the
molecular function of proteins. Protein structure is conserved and can
accommodate up to 80% sequence variation. X-ray crystallography and NMR are
the experimental methods used to determine protein structure. They provide 3D
structure data for a relatively small number of proteins because of
time-consuming preparatory steps such as purification and crystallisation of
proteins. In addition, NMR is limited by the size of the protein molecule.
Computational approaches expand structural knowledge when applied to a
potentially large number of families of related proteins and thus help fill the
gap between the number of known protein sequences and the number of known
structures. The models obtained using comparative or homology modeling methods
can be used to understand the biological role of the protein molecules, as well
as to design strategies for the development of new drugs and vaccines.
This thesis deals with (i) prediction of the 3D structure of proteins using a
knowledge-based homology modeling approach; (ii) development of an algorithm to
predict and map sequential and conformational epitopes of protein antigens of
known structure; and (iii) prediction and structural analysis of antigenic
peptides.



Chapter 1: Prediction of structure and function of proteins: A review
The Bioinformatics approaches for the prediction of 3D structure and function of
proteins are reviewed in this Chapter. Methods for predicting the 3D structure
of proteins are reviewed with emphasis on knowledge-based homology modeling.
Homology modeling is the most reliable method for the prediction of protein
structure, provided that high-resolution structure data are available for at
least one homologous protein and the sequence similarity between the proteins of
unknown and known structures is >35%. The steps in homology modeling are:
   • Target identification and validation
   • Alignment of query protein (protein for which structure is to be predicted)
     with the template (homologous protein for which structure is known)
   • Determination of structurally conserved/framework regions and loop regions
   • Assignment of initial conformation to the main-chain atoms
   • Assignment of conformation to side-chains
   • Energy minimisation and geometry optimisation of the initial structure
   • Evaluation of predicted structure
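The >35% criterion mentioned above can be illustrated with a minimal sketch. It assumes that a gapped pairwise alignment of query and template is already available and uses percent identity over aligned columns as a simple stand-in for "sequence similarity"; the function names and the example sequences are hypothetical and are not taken from the thesis.

```python
def percent_identity(aligned_query: str, aligned_template: str) -> float:
    """Percent identity over the non-gap columns of a pairwise alignment.

    Both strings must come from the same alignment, with '-' marking gaps.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    pairs = [(q, t) for q, t in zip(aligned_query, aligned_template)
             if q != "-" and t != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for q, t in pairs if q == t)
    return 100.0 * matches / len(pairs)


def is_suitable_template(aligned_query: str, aligned_template: str,
                         threshold: float = 35.0) -> bool:
    """True if the template passes the >35% criterion (identity used as proxy)."""
    return percent_identity(aligned_query, aligned_template) > threshold
```

A similarity score based on a substitution matrix would be closer to what "similarity" usually means; identity is used here only to keep the sketch short.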
The ‘information content’ of sequence and structure can be used to gain an
insight into the molecular function of proteins. The sequence-to-function
approach is the most commonly used for prediction and annotation of protein
function. Sequence-based function prediction methods rely on sequence similarity
and evolutionary relationship between the proteins of known and unknown
functions. However, sequence-based methods are limited in their ability to
detect similarities between distantly related proteins.
The 3D structure of proteins is more conserved than the sequence. Therefore,
structure-based function prediction approaches are considered to be more
reliable, as function can be assigned to proteins irrespective of their
evolutionary relationship. However, structure-based approaches are limited by
the availability of protein structures. Hence, a unified approach involving an
array of methods based on sequence and structural similarities should be used to
assign/predict the molecular function of proteins.


Chapter 2: Prediction of 3D structure of the variable region of heavy chain
(VH) of RF-RC1, a human Rheumatoid Factor autoantibody
Molecular modeling studies of the heavy chain of human IgM Rheumatoid
Factor (RF) are described in this Chapter. RFs are among the well-characterised
human autoantibodies known to cause immunological abnormalities. The
structure of RF-RC1 VH was predicted using the X-ray crystal structure of an
IgM antibody. The molecule was modeled using the CVFF force field and a
distance-dependent dielectric constant of 4r_ij, where r_ij is the distance
between atoms i and j. Steepest Descents and Conjugate Gradients methods were
used to optimise the structure of the molecule. Splice regions were refined
first, so that the length and the planarity of the peptide bonds were
geometrically acceptable. The energy of the whole molecule was then minimised to
obtain an energetically favourable conformation of the heavy chain of RF-RC1.
The predicted structure was evaluated for its geometry and stereochemistry and
was found to be acceptable. The predicted structure was found to be similar to
the structures of various types of immunoglobulins, except for the CDRs. The
solvent accessibility of CDR3, which interacts with the antigen, was found to be
relatively higher than that of CDR1 and CDR2.
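The effect of a distance-dependent dielectric constant can be sketched with the generic Coulomb term. This is only the textbook functional form, assuming the dielectric is 4 times the inter-atomic distance in Ångströms and the usual conversion factor of about 332.06 kcal·Å/(mol·e²); it is not the CVFF implementation used in the thesis, and the function name is hypothetical.

```python
COULOMB_KCAL = 332.0637  # kcal*Angstrom/(mol*e^2), standard conversion factor


def coulomb_energy(q_i: float, q_j: float, r_ij: float,
                   dielectric_scale: float = 4.0) -> float:
    """Coulomb energy (kcal/mol) with a dielectric of dielectric_scale * r_ij.

    q_i, q_j are partial charges in units of e; r_ij is in Angstroms.
    Because the dielectric grows with distance, the energy falls off
    as 1/r_ij**2 rather than 1/r_ij.
    """
    if r_ij <= 0:
        raise ValueError("r_ij must be positive")
    epsilon = dielectric_scale * r_ij
    return COULOMB_KCAL * q_i * q_j / (epsilon * r_ij)
```

With this form, doubling the separation of two charges reduces their interaction energy fourfold, which is a common way of mimicking solvent screening when water is not modeled explicitly.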


Chapter 3: Prediction of 3D structure of the variable region of heavy chain
(VH) of WGH1, a human IgM- antibody from a patient with Wegener’s
Granulomatosis disease
This Chapter describes the prediction of 3D structure of the variable region of
heavy chain of WGH1, a human IgM- monoclonal antibody with anti-PR3
specificity from a patient with Wegener’s Granulomatosis (WG). WG is a
systemic disease of unknown etiology characterised by necrotising granulomas
and vasculitis affecting the upper and lower respiratory tracts and kidneys.
The structure of WGH1 VH was predicted using the protocol described in
Chapter 2. However, the structure of CDR3 was predicted using a hybrid
approach involving molecular dynamics simulations and energy minimisation.
CDR3 of WGH1 is 21 residues long with a unique composition of hydroxy and
acidic residues. Modeling was carried out using the CVFF force field with a
distance-dependent dielectric constant of 4r_ij. Molecular dynamics of CDR3 was
carried out at 300K for 500ps. The predicted structure was evaluated using
various criteria and was found to be acceptable. Molecular modeling studies of
the VH of WGH1 revealed that CDR3 adopts a unique conformation consisting of
2 anti-parallel β-strands joined by a hairpin bend. The surface of CDR3 is
relatively flat. The presence of negatively charged Aspartic acid residues on the
surface of CDR3 seems to be a requirement to interact with the positively
charged surface of PR3 antigen.


Chapter 4: Insight into strain-specific properties of Japanese encephalitis
virus using predicted structures of Envelope glycoprotein
Studies on knowledge-based homology modeling of envelope glycoprotein of
two strains of Japanese encephalitis virus (JEV), namely, Nakayama and Sri
Lanka are described in this Chapter. JEV is a mosquito-borne Flavivirus and is
an important human pathogen. The envelope glycoprotein (Egp) of JEV is a
major structural antigen, which is responsible for viral haemagglutination and for
eliciting neutralising antibodies. The structures of the 399-amino-acid-long
extracellular domain of Egp of the two JEV strains have been predicted using the
X-ray crystal structure of Egp of Tick-borne encephalitis virus (TBEV) as a
template. The Amber all-atom force field and a distance-dependent dielectric
constant of 4r_ij were used in the modeling studies. The structure of every loop
was predicted using a hybrid approach of molecular dynamics (at 300K for
500ps) and energy minimisation. The predicted structures of Egp of JEVN and
JEVS were optimised in the following order: splice regions, loop regions,
flanking SCRs, domains and whole molecule. The structures were further
optimised using a 10Å layer of water. Occupancy of dihedral angles in the
Ramachandran plot was checked and the residues adopting disallowed
conformations were corrected. The predicted structures were found to be
acceptable when evaluated using the programs WHAT_CHECK, PROCHECK
and ProsaII. The Egp of JEV has an extended structure with seven β-sheets and
two α-helices and folds into three domains. The predicted structures of Egp of
JEVN and JEVS were found to be structurally similar when compared. They also
exhibited structural similarity with the template structure of Egp of TBEV. There
are eight mutations in the Egp of JEVS. The effect of mutations on the local
and global conformation of the Egp was analysed using predicted structures.


Chapter 5: An algorithm to map sequential and conformational epitopes of
protein antigens of known structure
Several algorithms are available to predict the sequential antigenic determinants
of proteins. However, no methods are available to predict conformational
epitopes. This Chapter describes the development and evaluation of an
algorithm to map the sequential and conformational epitopes of protein antigens
of known structure. The proposed algorithm uses the solvent accessibility of
amino acids explicitly, whereas a few of the other algorithms use accessibility
only implicitly. The steps involved in the prediction of conformational epitopes
are given below:
   • Solvent accessibility of the residues is calculated using the Lee & Richards
     algorithm
   • Residues with accessibility ≥25% are termed accessible residues
   • A contiguous stretch of more than three accessible residues is termed an
     antigenic determinant
   • An antigenic determinant is extended towards the N’ and C’ termini only if
     accessible amino acid(s) are present after an inaccessible amino acid residue
   • The inter-atomic distances of every residue from the ith determinant to the
     residues from the remaining determinants are calculated. If the distance
     between a pair of residues from sequentially distinct determinants is ≤6Å,
     then such determinants are grouped and defined as a conformational epitope.
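The core of the steps above can be sketched as follows. This is an illustrative reading of the description, not the thesis implementation: it assumes that per-residue relative accessibilities (in %) have already been computed, that the thresholds are "at least 25%" and "at most 6Å" (the comparison signs are not legible in this copy), and that each residue is represented by a single coordinate rather than all of its atoms. The extension step towards the termini is omitted, and all function names are hypothetical.

```python
from math import dist


def find_determinants(accessibility, cutoff=25.0, min_len=4):
    """Return (start, end) index pairs (inclusive) for contiguous stretches
    of more than three residues whose accessibility is >= cutoff."""
    determinants, start = [], None
    for i, acc in enumerate(accessibility):
        if acc >= cutoff:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                determinants.append((start, i - 1))
            start = None
    if start is not None and len(accessibility) - start >= min_len:
        determinants.append((start, len(accessibility) - 1))
    return determinants


def group_epitopes(determinants, coords, max_dist=6.0):
    """Group determinants that have at least one residue pair within max_dist.

    coords[i] is one representative (x, y, z) point for residue i (assumption).
    Returns only groups with two or more determinants, i.e. candidate
    conformational epitopes.
    """
    parent = list(range(len(determinants)))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def close(a, b):
        return any(dist(coords[i], coords[j]) <= max_dist
                   for i in range(a[0], a[1] + 1)
                   for j in range(b[0], b[1] + 1))

    for a in range(len(determinants)):
        for b in range(a + 1, len(determinants)):
            if close(determinants[a], determinants[b]):
                parent[find(a)] = find(b)

    groups = {}
    for idx, det in enumerate(determinants):
        groups.setdefault(find(idx), []).append(det)
    return [g for g in groups.values() if len(g) > 1]
```

Determinants that remain ungrouped correspond to sequential epitopes, while each returned group brings together sequentially distinct stretches that lie close in space.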




This is the only method that uses 3D structure information to predict sequential
and conformational epitopes. The accuracy of this algorithm has been evaluated
using X-ray crystal structures of 21 antigen-antibody complexes available in the
Protein Data Bank (PDB). The algorithm was found to be 80% accurate in
mapping both the sequential and conformational epitopes.


Chapter 6: 3D structure prediction of B-cell epitopes of Envelope
glycoprotein of Japanese encephalitis virus
Sequential antigenic determinants of Egp of JEV predicted using the Kolaskar &
Tongaonkar method were mapped on the predicted structure of Egp of JEV. The
epitopes were also predicted using the algorithm described in Chapter 5. A
candidate sequential antigenic determinant, YSAQVGASQ (residues 155-163), was
identified, as it was predicted to be antigenic by both methods.
The conformational stability of the peptide YSAQVGASQ was studied by
stitching the oligopeptides NHGN and SENHGN to its N’ terminus and AAKFT to
its C’ terminus. The conformations of the resultant peptides, namely
YSAQVGASQ, YSAQVGASQAAKFT, NHGNYSAQVGASQ, NHGNYSAQVGASQAAKFT,
SENHGNYSAQVGASQ and SENHGNYSAQVGASQAAKFT, were predicted using a few
cycles of molecular dynamics (MD) simulations of one nanosecond duration each.
MD was carried out at 400K using the Amber all-atom force field and a
distance-dependent dielectric constant of 4r_ij. The equilibration phase of 50ps
was followed by the production phase, and the conformers were sampled at an
interval of 10ps. The structure of each peptide was predicted by analysing the
trajectory of the sampled conformations. The effect of stitching of peptides on
the conformation of the candidate peptide YSAQVGASQ (residues 155-163) is
discussed. The conformation of the peptides stabilised when stitched with
AAKFT. The region VGAS attained a helical turn in all the peptides except
NHGNYSAQVGASQ. On comparison, the predicted conformation of VGAS in the
peptide was found to be similar to the conformation of AQVG in the Egp of JEVN.
Based on the conformational analysis and the experimental results, the peptide
SENHGNYSAQVGASQ has been identified as a candidate B-cell epitope and the
peptide AAKFT has been proposed as a spacer for the peptide vaccine against JEV.
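The six peptides studied in this Chapter follow from combining the candidate determinant with the optional N’ and C’ extensions. The short sketch below only makes that combinatorial construction explicit; the sequences are taken from the text, while the function and constant names are hypothetical.

```python
from itertools import product

CORE = "YSAQVGASQ"                      # candidate determinant, residues 155-163
N_EXTENSIONS = ["", "NHGN", "SENHGN"]   # stitched to the N' terminus ("" = none)
C_EXTENSIONS = ["", "AAKFT"]            # stitched to the C' terminus ("" = none)


def stitched_peptides(core=CORE, n_ext=N_EXTENSIONS, c_ext=C_EXTENSIONS):
    """Return every N'-extension + core + C'-extension combination."""
    return [n + core + c for n, c in product(n_ext, c_ext)]
```

Calling `stitched_peptides()` yields the bare determinant plus the five extended peptides listed above, in the same order.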


Appendix I: Antigenic determinants reacting with Rheumatoid Factor:
Epitopes with different primary sequences adopt similar conformation
The human IgM Rheumatoid Factor (RF) reacts with antigenic determinants of the
Fc fragment of IgG (CH2 and CH3 domains) and β2-microglobulin. This section
includes analysis of the structures of antigenic peptides of the CH2 and CH3
domains of IgG and of human β2-microglobulin. The RF-reactive antigenic peptides
vary in sequence length and composition. It has been hypothesised that the
reactivity of RFs with sequentially different peptides results from
conformational similarity. Molecular modeling studies revealed that the
RF-reactive antigenic peptides share conformationally similar regions. The
regions are said to be conformationally similar if the dihedral angles of the
main-chain atoms of respective residues deviate by <30°. Furthermore, it was
observed that the RF-reactive peptides are present on the surface of the CH2 and
CH3 domains of IgG and on β2-microglobulin. The main-chain atoms of these
peptides are accessible when they are part of the respective proteins. These
observations indicate that autoantibodies such as RFs recognise main-chain
conformations of reactive epitopes and thereby react with a number of
sequentially different antigenic determinants.
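The similarity criterion used in this section can be sketched as a residue-by-residue comparison of main-chain dihedral angles. The sketch assumes that (phi, psi) pairs in degrees are already available for two equally long regions, and that "deviate by <30°" applies to each angle separately with wrap-around at ±180°; the function names are hypothetical.

```python
def angle_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def conformationally_similar(dihedrals_a, dihedrals_b, tolerance=30.0) -> bool:
    """True if every phi and psi of corresponding residues differs by < tolerance.

    Each argument is a list of (phi, psi) tuples, one per residue.
    """
    if len(dihedrals_a) != len(dihedrals_b):
        raise ValueError("regions must contain the same number of residues")
    return all(
        angle_difference(phi_a, phi_b) < tolerance
        and angle_difference(psi_a, psi_b) < tolerance
        for (phi_a, psi_a), (phi_b, psi_b) in zip(dihedrals_a, dihedrals_b)
    )
```

The wrap-around matters because, for example, 175° and -170° are only 15° apart even though their numerical difference is 345°.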




                    RELATED PUBLICATIONS
   • Urmila Kulkarni-Kale, Shriram Bhosale, G. Sunitha Manjari, Ashok
     Kolaskar, (2004). VirGen: A comprehensive viral genome resource.
     (Accepted for publication in the Database issue of Nucleic Acids Research).

   • Urmila Kulkarni-Kale & A. S. Kolaskar (2003). Prediction of 3D structure of
     envelope glycoprotein of Sri Lanka strain of Japanese encephalitis virus. In
     Yi-Ping Phoebe Chen (ed.), Conferences in research and practice in
     information technology. 19:87-96.

   • A. S. Kolaskar & Urmila Kulkarni-Kale (1999). Prediction of
     three-dimensional structure and mapping of conformational antigenic
     determinants of envelope glycoprotein of Japanese encephalitis virus.
     Virology. 261:31-42.

   • J. A. Davis, E. Peen, R. C. Williams, Jr., S. Perkins, C. C. Malone, W. T.
     McCormack, E. Csernok, W. L. Gross, A. S. Kolaskar, Urmila Kulkarni-Kale
     (1998). Determination of primary amino acid sequence and unique
     three-dimensional structure of WGH1, a monoclonal human IgM antibody
     with Anti-PR3 specificity. Clinical Immunology & Immunopathology.
     89:35-43.

   • R. C. Williams, Jr., C. C. Malone, A. S. Kolaskar, Urmila Kulkarni-Kale
     (1997). Conformational antigenic determinants reacting with rheumatoid
     factor: different primary sequences of reactive epitopes share similar shapes.
     Molecular Immunology. 34:543-556.

                                    Patent
Indian patent:                                13/DEL/2001
International Patent Publication No:          WO 02/053182 A1
                                              20040076634
Available on-line at:                         http://ipdl.wipo.int
                                              http://appft1.uspto.gov/netahtml/PTO/search-bool.html
Title:         Chimeric T helper-B cell peptide as a vaccine for Flaviviruses.
Inventors:     Dr. M. M. Gore*, Dr. S.S. Dewasthaly*,
               Prof. A.S. Kolaskar#, Urmila Kulkarni-Kale#
Affiliation:   * National Institute of Virology, Pune 411001. India
               # Bioinformatics Centre, University of Pune, Pune 411007. India.
Applicants:    Department of Biotechnology, National Institute of Virology,
               University of Pune.

                          Papers in preparation
   • Urmila Kulkarni-Kale & A. S. Kolaskar. An algorithm to predict the
     Antibody-binding sites.
   • Shailesh Dewasthaly, Urmila Kulkarni-Kale, M. M. Gore, A. S. Kolaskar.
     Prediction of conformationally stable, neutralising antibody inducing epitope
     on envelope glycoprotein of Japanese encephalitis virus.


