Protein-Protein Docking Method Used to Study Complex Protein Interactions

Tim Glennon & Dana Haley-Vicente
Accelrys Inc., 10188 Telesis Court, San Diego, CA 92121, USA

Abstract

There are ~1400 known protein-protein interactions in humans (DIP, http://dip.doe-mbi.ucla.edu/). Understanding these interactions is necessary to gain insight into molecular recognition and networks such as signal transduction pathways in cells. One key interaction is between regulator of G-protein signaling (RGS) proteins and the alpha subunit of G-proteins. RGS family proteins accelerate GTP hydrolysis by the G-protein alpha subunit, which leads to rapid recovery of signal transduction cascades. To analyze and validate the G-protein regulating system, we used ZDOCKpro for protein-protein docking and a complementary tool, Evolutionary Trace, for protein family analysis. Evolutionary Trace identified a cluster of residues in the RGS domain that includes the RGS/G-protein alpha subunit binding interface. These residues were then used to guide the protein-protein docking experiment. Out of 3600 docked complexes, ZDOCKpro predicted a good complex structure within the top 10 ranked complexes. Results improved with the addition of the Evolutionary Trace data, with the best-ranked ZDOCKpro complex having an interface RMSD of 4 Å. Overall, we find that protein-protein docking and complementary tools are useful for studying protein-protein interactions of unknown complex assemblies.

Introduction

Understanding protein-protein interactions is important for gaining insight into molecular recognition and networks, such as signal transduction pathways in cells (e.g. in human development and cancer). Protein-protein interactions regulate virtually every cellular process, including the workings of complex molecular machines like the spliceosome, proteasome or ubiquitination complex. Interactions between small molecules and proteins are well studied.
Liang et al. (1998) found small-molecule binding sites to be indentations, crevices or cavities, and the largest site is often the true binding site. For protein-protein interactions, Jones and Thornton (1997) found that binding sites tend to be flat and solvent accessible. Other characteristics varied with the type of complex (e.g. hydrophilic for heterodimers and hydrophobic for homodimers).

For several decades, experimental technologies have been used to detect protein-protein interactions. These include x-ray crystallography, competition binding, site-directed mutagenesis, yeast two-hybrid screening and co-immunoprecipitation. Increasingly, computational methods are being used to study protein-protein interactions without the need for in vitro experimentation. The most recent method is protein-protein docking: the computational determination of the structure of protein complexes or assemblies from individual protein structures. This method can be complemented by experimental data or other computational tools to improve the results of the docking experiment. Other computational methods include:
• predicting hydrophobic patches on the surface
• mapping sites
• identifying regions of one of the proteins by using a chemical probe (e.g. a tryptophan residue)
• locating interface residue clusters based on conserved and family-specific residues within a family of related proteins.

In this study, we analyzed and validated the G-protein regulatory system using ZDOCKpro for protein-protein docking and a complementary tool, Evolutionary Trace, for protein family analysis. The G-protein regulating system includes regulator of G-protein signaling (RGS) proteins and the alpha subunit of G-proteins.
The RGS family proteins accelerate GTP hydrolysis by the G-protein alpha subunit, which leads to rapid recovery of signal transduction cascades that are involved in controlling vision, cardiac function, and many aspects of neuroendocrine signaling.

(Figures) Regulation of the G-protein signal transduction cascade; G-protein signal transduction cascade.

the electrostatic potentials of residues and identifying possible interactions

Methodology

The goal of this study was to validate the binding of the unbound ligand, Rgs4 (1ezt, 205 residues), to the unbound receptor, Guanine Nucleotide-Binding Protein G-(iα1) (1git, 353 residues), using ZDOCKpro for protein-protein docking and Evolutionary Trace to guide the docking. The results were then compared to the crystal structure of the complex of the two proteins, 1agr.

(Figure) Rgs4 1ezt (ligand); Guanine Nucleotide-Binding Protein G-(iα1) 1git (receptor); 1agr complex crystal structure (with extra density corresponding to a longer receptor N-terminus).

Part 1: ZDOCKpro (Accelrys Inc., San Diego) is a protein-protein docking method developed by Prof. Zhiping Weng at Boston University. The program includes two main algorithms: ZDOCK (for fast, rigid-body, initial-stage docking using pairwise shape complementarity) and RDOCK (for CHARMm-based refinement of the complex poses generated by ZDOCK and ranking of the docked structures based on CHARMm electrostatic interaction energy and ACE desolvation energy). The following steps were performed in part 1 of this study:
1. Download experimental structures from the Protein Data Bank (PDB): unbound ligand (1ezt) and receptor (1git)
2. Generate 3600 docked complexes using a 15° rotational search in 6 dimensions (~2 hours, Linux)
3. Refine and rerank the top 500 docked complexes (~5 hours)
4. Validation: compare predicted complexes to the 1agr complex crystal structure and calculate the RMSD of interface residues

Part 2: Evolutionary Trace (ET, Lichtarge et al. 1996) is used to identify functionally and structurally important residues that cluster on a protein structure, given a set of similar protein sequences. For the RGS system, the cluster of residues indicates the binding interface (Sowa et al. 2000). The following steps were performed in Discovery Studio® Modeling 1.1 (Accelrys Inc., San Diego):
1. Download a protein structure, 1ezt (RGS ligand)
2. Use the 1ezt sequence to BLAST nr90 (90% non-redundant NCBI) to find similar sequences
• Select ~50 sequences based on alignment length and coverage size
3. Align sequences using Align123
4. Run Evolutionary Trace analysis
• Analyze conserved and class-specific (or family-specific) residues at different % sequence identity cut-offs (PIC)
• Identify possible interface residues on 1ezt

(A) Dendrogram of 42 RGS sequences. The vertical bar indicates an ~60% PIC at which several clusters appear (indicated by alternating pink and blue colored lines/nodes and sequence names). (B) Sequence alignment of the 42 RGS sequences with class-specific and conserved residues indicated, based on the ~60% PIC. (C) 1ezt protein structure shown in CPK representation, with class-specific and conserved residues mapped onto the structure based on the Evolutionary Trace results at ~60% PIC. The top left image shows a cluster of mainly class-specific residues at the binding interface of the RGS/Gαi protein. The 180° view displays very few class-specific residues. Results vary depending on the sequences chosen, the number of sequences, sequence lengths, the alignment and the PIC.

Part 3: ZDOCKpro with filtering: use the ET results (clustered conserved and class-specific residues) to filter predicted hits.
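The validation step in Part 1 (interface RMSD against 1agr) and the ET-based filtering in Part 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ZDOCKpro's implementation: the function names, the Cα-only coordinate dictionaries, the 10 Å distance cutoff and the 50% contact fraction are all hypothetical choices made for the example. The residue numbers are those labelled on the poster's interface figure. The RMSD function assumes that the receptors of the predicted and reference complexes have already been superimposed, so no further fitting is done.

```python
import math

# Residue numbers as labelled on the poster's interface figure (1ezt ligand).
ET_RESIDUES = [37, 41, 42, 121, 80, 82, 85, 88, 113, 117]


def passes_et_filter(ligand_ca, receptor_ca, et_residues,
                     cutoff=10.0, min_fraction=0.5):
    """Return True if enough ET residues of the ligand lie near the receptor.

    ligand_ca and receptor_ca map residue number -> (x, y, z) of the C-alpha
    atom. cutoff (in Angstrom) and min_fraction are illustrative assumptions,
    not values taken from ZDOCKpro.
    """
    present = [r for r in et_residues if r in ligand_ca]
    if not present:
        return False
    near = sum(
        1 for r in present
        if any(math.dist(ligand_ca[r], xyz) <= cutoff
               for xyz in receptor_ca.values())
    )
    return near / len(present) >= min_fraction


def interface_rmsd(predicted_ca, reference_ca, interface_residues):
    """RMSD (Angstrom) over interface C-alpha atoms, with no further fitting.

    Both coordinate sets must already share one frame, i.e. the receptors of
    the predicted and reference complexes have been superimposed beforehand.
    """
    common = [r for r in interface_residues
              if r in predicted_ca and r in reference_ca]
    if not common:
        raise ValueError("no shared interface residues to compare")
    total = sum(math.dist(predicted_ca[r], reference_ca[r]) ** 2
                for r in common)
    return math.sqrt(total / len(common))


if __name__ == "__main__":
    # Toy coordinates only; real input would be parsed from PDB files.
    receptor = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0)}
    reference = {r: (float(i), 5.0, 0.0) for i, r in enumerate(ET_RESIDUES)}
    near_pose = {r: (x + 3.0, y, z) for r, (x, y, z) in reference.items()}
    far_pose = {r: (x + 60.0, y, z) for r, (x, y, z) in reference.items()}

    for name, pose in [("near", near_pose), ("far", far_pose)]:
        kept = passes_et_filter(pose, receptor, ET_RESIDUES)
        rmsd = interface_rmsd(pose, reference, ET_RESIDUES)
        print(name, kept, round(rmsd, 2))
```

In this toy setup the filter discards the pose whose ET residues sit far from the receptor before any refinement would be spent on it, which mirrors the intent of filtering before the refinement and reranking stage. Using Cα atoms only keeps the sketch short; an all-atom or heavy-atom comparison would follow the same pattern.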
Results and Discussion

Results from ZDOCKpro revealed a good hit (4.3 Å RMSD of the interface residues) that ranked 2nd out of the top 500 refined and reranked docked complexes. Evolutionary Trace identified a cluster of residues in the RGS domain that includes the RGS-Gαi binding interface. These residues were included in the ZDOCKpro analysis (part 3) as a filter before refinement and reranking.

After including the Evolutionary Trace analysis data, the 2nd hit became the 1st hit, indicating that the results improved with this additional data. Note that there was another good hit (4.01 Å RMSD, based on the interface residues) ranked 13th. This 13th hit was ranked much lower before adding the ET data. Overall, additional information, whether from experimental data (e.g. mutagenesis data) or computational methods (e.g. ET data), helps to guide protein-protein docking.

Rank    RMSD (Å)
1       4.3
2       23.27
3       17.79
4       9.54
5       14.98
6       9.82
7       19.25
8       19.4
9       19.28
10      22.34
11      21.94
12      20.36
13      4.01

Table of top-ranked ZDOCKpro-predicted docked complexes out of 500 total refined and reranked hits. RMSD is based on the interface residues of the predicted complex compared to the 1agr crystal structure (receptors are superimposed). Hits ranked 1 and 13 are considered good based on RMSD of the interface residues.

(Top figure) Best-ranked ZDOCKpro-predicted docked complex after including the Evolutionary Trace analysis data. RGS (1ezt) ligand in purple and the Gαi (1git) receptor in blue.

(Top, right figure) Best-ranked ZDOCKpro-predicted docked complex superimposed (based on the receptors, Gαi) with the 1agr crystal structure complex. The RMSD of the interface residues is 4.3 Å. The conserved (red) and class-specific (pink) residues are indicated and shown on the 1agr RGS ligand. Predicted complex: RGS (1ezt) ligand in purple and the Gαi (1git) receptor in blue. Crystal structure complex (1agr): ligand (orange) and receptor (green).

(Figure) Conserved (red) and class-specific (pink) residue clusters (with a ~20% sequence identity cut-off, PIC) on the unbound RGS ligand (1ezt) at the RGS/Gαi protein-protein interface (Sowa et al. 2000). These residues were used to filter the data for ZDOCKpro and indicate a hot spot where any residue mutation is associated with a major evolutionary divergence, a feature that generally correlates with functional sites.
Conserved residues: GLU37, GLU41, ASN42, ARG121
Class-specific residues: GLU80, ASN82, SER85, ARG88, LEU113, ASP117

Email Dana Haley-Vicente at email@example.com for more information.

References

• Liang J., Edelsbrunner H. & Woodward C. Protein Sci. (1998) 7: 1884
• Jones S. & Thornton J. M. J. Mol. Biol. (1997) 272: 121-32 & 133-43
• Introduction images:
  • http://www.rpi.edu/dept/bcbp/molbiochem/MBWeb/mb1/part2/signals.htm
  • Trends in Cell Biology (1999) 9: 138-144
• ZDOCKpro references:
  • ZDOCKpro 1.0, Accelrys Inc., San Diego, CA, USA. http://www.accelrys.com/zdockpro
  • Chen R. & Weng Z. Proteins (2002) 47: 281-294
  • Chen R., Li L. & Weng Z. Proteins (2003) 52: 80-87
  • Li L.*, Chen R.* (* joint first authors) & Weng Z. Proteins (2003) 53: 693-707
• Evolutionary Trace references:
  • Discovery Studio Modeling 1.1, Accelrys Inc., San Diego, CA, USA. http://www.accelrys.com/dstudio
  • Lichtarge O., Bourne H. R. & Cohen F. E. J. Mol. Biol. (1996) 257: 342-358
  • Lichtarge O., Bourne H. R. & Cohen F. E. Proc. Natl. Acad. Sci. (1996) 93: 7507-7511
  • Sowa et al. PNAS (2000) 97: 1483-1488