Protein Folding - Results and Requirements
Charles L. Brooks III
The Scripps Research Institute
La Jolla, California
N+N Meeting, DC 04/04

Protein Folding - Results and Requirements
• Basic elements of protein folding free energy landscape calculations
• Folding studies predict folding pathways/mechanisms and inform new experiments
• Computational requirements and experiences with the GRID

General issues in condensed phase simulations
• Sampling, sampling, sampling
  – Anecdotal observations of individual events are exactly that - anecdotes, not a basis for discovery
  – Failures of high temperature unfolding studies linked to poor sampling of non-equilibrium events
  – Suspect AIMD simulations linked to non-convergence of statistical sampling
• Non-equilibrium “experiments” must be run 10s of times
• Statistical errors must be quantified

Measuring structure: protein folding “reaction” coordinates

Constructing folding free energy landscapes
• Solvated molecular dynamics trajectories at several temperatures yield multiple unfolding pathways connecting native and unfolded states
• Clustering in the space of ρ and Rg provides spanning initial conditions for biased free energy sampling

Funneled landscapes visualized and quantified
• Ideas from simulation have helped “evolve” our thinking about protein folding funnels and free energy landscapes

Small helical proteins fold via multiple pathways on a funnel-like landscape
Helix formation and collapse occur via at least two “pathways”
Concomitant collapse and helix formation
“Diffusion-collision” type mechanism

Folding is Downhill
• Microsecond folding dynamics of the F13W G29A mutant of the B domain of staphylococcal protein A by laser-induced temperature jump. George Dimitriadis, Adam Drysdale, Jeffrey K. Myers, Pooja Arora, Sheena E. Radford, Terence G. Oas, and D. Alastair Smith, PNAS, 101, 3809 (2004).
  – “The data suggest, therefore, that helix II is formed in the rate-limiting transition state, consistent with theoretical models of how this protein folds (27).”
  – “Using this value, the ΔG† for F13W* at 37°C is ~3/4kBT … in the regime of downhill folding.”
  – 27. Guo, Brooks and Boczko, PNAS, 94, 10161 (1997).

And centered around the HII/HI interface
• Testing protein-folding simulations by experiment: B domain of protein A. Satoshi Sato, Tomasz Religa, Valerie Daggett, and Alan R. Fersht. PNAS, in the press (2004).
  – “The rate-determining transition state for the folding of the B domain is constructed around a nearly fully formed H2, which is stabilized by hydrophobic interactions from H1. H1 itself is only weakly structured…”
  – “Boczko and Brooks (27) and Guo et al. (28) used a biased-sampling method and concluded that the interface between H1 and H2 forms first, starting around the T1 region, and then the H2-H3 interface forms later with mostly concomitant secondary and tertiary structure formation.”
  – 27. Boczko and Brooks, Science, 269, 393 (1995).
  – 28. Guo, Brooks and Boczko, PNAS, 94, 10161 (1997).
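
The free energy landscapes and the barrier heights quoted in units of kBT in the slides above both come from turning sampled populations along a reaction coordinate into a profile, F(q)/kBT = -ln P(q). The sketch below illustrates only that relation. It assumes unbiased samples of a single coordinate, whereas the calculations described in the slides use biased sampling, which would additionally need reweighting (not shown). The function names and the toy data are hypothetical.

```python
import math

def free_energy_profile(samples, lo, hi, n_bins):
    """Histogram samples on [lo, hi) and return (bin centers, F/kBT).

    F/kBT = -ln P(q), shifted so the lowest bin is zero.
    Bins that were never visited get F = +inf.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for q in samples:
        if lo <= q < hi:
            index = min(int((q - lo) / width), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside [lo, hi)")
    profile = [-math.log(c / total) if c > 0 else math.inf for c in counts]
    f_min = min(profile)
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    return centers, [f - f_min for f in profile]

def barrier_height(profile, i_start, i_end):
    """Highest point between two basin bins, relative to the starting basin."""
    return max(profile[i_start:i_end + 1]) - profile[i_start]

# Toy data (hypothetical): two equally populated basins separated by a
# region visited half as often.
toy_samples = [0.5] * 100 + [1.5] * 50 + [2.5] * 100
centers, profile = free_energy_profile(toy_samples, 0.0, 3.0, 3)
print("barrier in units of kBT:", barrier_height(profile, 0, 2))
```

In this toy case the middle bin is half as populated as either basin, so the barrier is ln 2 (about 0.69 kBT). A barrier this small relative to kBT is the sense in which the quoted experiments describe folding as downhill.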

New paradigms for experiment - single molecules, fast folding
• Exciting new experiments begin to elucidate the nature of folding barriers
• Simulation and theory suggest many single domain proteins may fold in downhill or nearly downhill fashion
• Simulation and experiment converge in characterizing folding timescales for small proteins
• Nearly barrier-less folding free energy landscapes observed in simulations under native-supporting conditions
• Experiment confirms small (~ a few kBT) free energy barriers

Folding of α/β protein G involves significant water participation
• Nature of the collapsed state involves
  – Significant solvent “lubrication”
  – Final barrier associated with solvent expulsion
Water in core of collapsed state
Barrier for final ascent to native state

Function and form - folding and assembly, folding and function
• New ideas are emerging from experiment and theory that link protein function to folding mechanism
• Protein assembly and folding too may be coupled
• PIN and YAP domains bind different consensus sequences
• FBP binds two consensus sequence types
• Multiphase folding as a hallmark of functional substates in protein function

Computational Requirements:
• Requirements
  – 100-200 sequential runs on ~64 processors for total resource requirement of ~3 months 512 T3E processor hours.
• GRID computing with Legion/Globus
  – 100-200 simultaneous runs of ~16-64 PEs of available machines on computational grid - each run of duration 8-12 hours.
• Application software
  – CHARMM ports to T3E, O2000, Alpha, Linux, utilizing CHARMM-MPI, MPI or Legion-MPI, Globus, MMTSB Tool Set

Constructing folding free energy landscapes via GRID “parameter-space” scans
• Different conformational regions are mapped to different regions on the grid
• Tightly coupled parallelism (16-64 nodes) exploited for each region of conformational space
• Loose coupling exploited to distribute calculations across geographically distributed GRID nodes.

GRID “parameter-space” scans: the reality
• Persistence in middleware platform focus is sketchy at best: Legion, Globus, OGSI?
  – The flavor du jour is a continually moving target
• Implementing new features to enable calculations within middleware framework is painful
  – Providing access to site-specific features, e.g., archival/retrieval
• Persistence of middleware infrastructure is poor or non-existent across many sites
  – Demonstration projects seem to work because coherence in site-specific attention can be maintained; long-term solutions have not existed.

Building a portal for the National computational grid

Grid Portal is a good idea to bring computing resource to broader community
• Lack of persistence in Globus behind portal makes it difficult to provide robust portal
• Little involvement of “application scientist” diminishes usefulness of final product.
• NSF needs to re-embrace the idea that application science and scientists should be driving the development of cyber-infrastructure

Other solutions: Ensemble Computing with the MMTSB Tool Set
• The ensemble computing paradigm is tightly integrated into the Tool Set
  – Ensemble data structures support repeated analysis on ensembles of biopolymer conformations or molecular structures
    • Energy and scoring of protein-ligand docking runs
    • 100’s - 1000’s of PB/GB-SA calculations
    • Use your imagination!
  – Master/slave parallel replica exchange simulations (see later lecture)
    • Ab initio protein and peptide folding
    • Loop modeling and homology modeling
    • Refinement and scoring of predicted structures
  – Ensemble construct integrated into much functionality of CHARMM, Amber, MONSSTER and other analysis tasks
http://mmtsb.scripps.edu

Harnessing computational grids for biomolecular simulation studies

Characterization of Desktop Grids
• Large numbers of heterogeneous desktop PCs connected to the Internet and Intranets
• Utilization of otherwise unused compute power
  – 80%-90% of CPU time is idle time
• Nodes join and leave the grid frequently
• Unreliable, loosely-coupled interconnections
• Intrusive users and malicious attacks
Desktop grid

Challenges using Desktop Grids
• How can we optimally utilize a large number of desktop PCs to stage high-quality, large-scale simulations?
• How can we deal with the heterogeneity of desktop grid platforms?
• Do the characteristics of the system (i.e. kind of resource, compute paradigms) affect the quality of the simulation results?
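
The master/slave replica exchange simulations listed under ensemble computing above suit loosely coupled resources because the replicas run independently and communicate only at swap attempts. What follows is a minimal sketch of the standard Metropolis swap test a master process could apply to energies reported by its workers. It is not the MMTSB Tool Set implementation. The function names, temperatures and energies are hypothetical, and energies are assumed to be in kcal/mol.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def swap_probability(e_i, t_i, e_j, t_j):
    """Acceptance probability for exchanging the configurations of two replicas.

    e_i is the potential energy of the configuration currently at
    temperature t_i, and likewise for e_j at t_j.
    p = min(1, exp[(beta_i - beta_j) * (e_i - e_j)])
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

def attempt_swaps(energies, temperatures, offset, rng):
    """Try swaps between neighboring pairs (k, k+1), k = offset, offset+2, ...

    energies[k] belongs to the configuration now at temperatures[k].
    Accepted swaps exchange the entries of `energies` in place, mirroring
    the exchange of configurations; the accepted pairs are returned.
    """
    accepted = []
    for k in range(offset, len(temperatures) - 1, 2):
        p = swap_probability(energies[k], temperatures[k],
                             energies[k + 1], temperatures[k + 1])
        if rng.random() < p:
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
            accepted.append((k, k + 1))
    return accepted

# Hypothetical four-replica temperature ladder with made-up energies.
temperatures = [300.0, 320.0, 340.0, 360.0]
energies = [-105.0, -110.0, -98.0, -101.0]
rng = random.Random(0)
print("accepted swaps:", attempt_swaps(energies, temperatures, 0, rng))
```

Alternating `offset` between 0 and 1 on successive calls lets every neighboring pair attempt an exchange. Only one energy per replica has to reach the master at each swap attempt, which keeps communication light.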

A Distributable Desktop Grid Environment for Macromolecular Modeling Built Around BOINC and CHARMM
• Protein-ligand docking applications implemented and useful
• Protein folding calculations (for structure prediction) ongoing for CASP6
• New “volunteer” resource for protein structure prediction, StructurePrediction@home
http://predictor.scripps.edu

Acknowledgements
• People: M. Crowley, M. Taufer, A. Grimshaw, M. Humphrey, J. Karanicolas, M. Feig
• Money: NIH (NCRR and others), NSF (CTBP, NPACI)
• Resources: TSRI, SDSC/NPACI, PSC