Molecular modeling – lab 1 PyMOL

PyMOL Tutorial

PyMOL is a tool for visualizing three-dimensional structures. In this tutorial you will learn the basic functions of PyMOL and how to write simple PyMOL scripts. Similar tools such as VMD and RasMol can be used instead, but we will mainly use PyMOL in this course. Follow the instructions below and answer all questions marked with Q: in a separate report file, which you will upload at the end of the tutorial.

What is PyMOL?

PyMOL is a free, open-source tool for visualizing molecules. Its homepage is http://pymol.sourceforge.net/, where you can find plenty of information and help if you want to expand your knowledge of PyMOL beyond this tutorial. More information can also be found at http://www.pymolwiki.org/. PyMOL is easy to install on your own computer and runs on most systems, including Linux, Unix, Mac and Windows. On our AFS system, however, a PyMOL module is already prepared, so all you have to do is type:

>> module add pymol/0.99rc6
>> pymol &

You should now have a PyMOL window open.

1. PDB-files

PyMOL reads molecular coordinates from a .pdb file, which is the standard representation of a protein structure. These files can be found at sites such as the Protein Data Bank (PDB) or, for membrane proteins, the Orientations of Proteins in Membranes (OPM) database. To start, you will need to get a pdb file from PDB. Look for the ABC-ATPase with pdb code 1f2t by searching for the structure in the query field at the top of the page. Once you have the structure in view, select "Download files" -> "PDB text" from the menu to the left. This will automatically save the file to your desktop. Note: the file extension has to be .pdb, not .txt or anything else, for PyMOL to read it.

Now you have a pdb file. Let's take a quick look at it: you can open the file in any text editor or view it with the less command in the terminal.
Each line in a .pdb file starts with a declaration of what that row contains, e.g. HEADER or TITLE. The rows are grouped into several sections, which are described in more detail in this document. We will only deal with a few of them here.

The first set of lines is called the "Title section" and mainly contains information about the molecules and the crystallographic conditions. The entries HEADER, OBSLTE, TITLE etc. are found there. There is also a "Primary Structure Section" with the entries DBREF (reference to other databases), SEQADV (differences in sequence to the database entry), SEQRES (the sequence of each chain) and MODRES (modifications to residues). Another interesting section is the "Secondary Structure Section", which defines where alpha-helices and beta-sheets are located.

Q1a: Look under the entries "HELIX" and "SHEET" in your pdb file. How many alpha-helices, beta-strands and beta-sheets can you find, and to which protein chain do they belong?

Finally, the most important section is the "Coordinate section", which gives the location of each atom in 3D space. It also lists the occupancy and temperature factor for each atom. The occupancy is the fraction of atomic density at a given center: if there are two equally occupied conformers, both will have an occupancy of 0.5. The normal value is 1 (range 0-1). The temperature factor (or B-factor) is a measure of the mean-square displacement of the atom from its position in the model; the normal range is 5-50. These values are found under the entries ATOM/HETATM (HETATM if the atom does not belong to the protein, e.g. a ligand or water). Look at the coordinates of your structure: each row gives the index of the atom, the atom name, the residue, the chain and the residue number, followed by the coordinates.

Q1b: What is the first residue in your structure? How many atoms does it contain?

The atom names are N, CA, C and O for the main-chain atoms, followed by the side-chain atoms (CB, CG1, CG2, ...).
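The coordinate rows have a fixed column layout, so they can also be read with a few lines of Python. The following is a minimal sketch and not part of PyMOL. It assumes the standard PDB column positions for ATOM/HETATM rows; the sample rows use invented coordinates (they are not taken from 1f2t), and parse_atom_line, read_atoms and first_residue are hypothetical helper names chosen for this illustration.

```python
# Sketch: read ATOM/HETATM rows by their fixed column positions.
# The rows below are invented for illustration; they are NOT from 1f2t.
SAMPLE = """\
ATOM      1  N   MET A   1      10.000  20.000  30.000  1.00 15.00
ATOM      2  CA  MET A   1      11.400  20.300  30.200  1.00 14.50
ATOM      3  C   MET A   1      12.100  21.500  30.900  1.00 14.00
ATOM      4  O   MET A   1      11.600  22.600  31.000  1.00 14.20
ATOM      5  CB  MET A   1      11.500  19.100  29.300  0.50 18.00
ATOM      6  N   GLY A   2      13.300  21.300  31.400  1.00 13.00
ATOM      7  CA  GLY A   2      14.100  22.400  32.000  1.00 12.50
HETATM    8  O   HOH A 101      15.000  16.000  17.000  1.00 25.00
"""


def parse_atom_line(line):
    """Split one ATOM/HETATM row into named fields.

    Python slices are 0-based, so PDB columns 7-11 become line[6:11], etc.
    """
    return {
        "record": line[0:6].strip(),      # columns 1-6: ATOM or HETATM
        "serial": int(line[6:11]),        # columns 7-11: atom index
        "name": line[12:16].strip(),      # columns 13-16: atom name
        "res_name": line[17:20].strip(),  # columns 18-20: residue name
        "chain": line[21],                # column 22: chain identifier
        "res_seq": int(line[22:26]),      # columns 23-26: residue number
        "xyz": (
            float(line[30:38]),           # columns 31-38: x
            float(line[38:46]),           # columns 39-46: y
            float(line[46:54]),           # columns 47-54: z
        ),
        "occupancy": float(line[54:60]),  # columns 55-60
        "b_factor": float(line[60:66]),   # columns 61-66
    }


def read_atoms(pdb_text):
    """Parse every ATOM/HETATM row in the text of a .pdb file."""
    return [
        parse_atom_line(line)
        for line in pdb_text.splitlines()
        if line.startswith(("ATOM  ", "HETATM"))
    ]


def first_residue(atoms):
    """Return all atoms that share chain and residue number with the first atom."""
    key = (atoms[0]["chain"], atoms[0]["res_seq"])
    return [a for a in atoms if (a["chain"], a["res_seq"]) == key]


atoms = read_atoms(SAMPLE)
residue = first_residue(atoms)
print(residue[0]["res_name"], [a["name"] for a in residue])
```

To try it on a downloaded structure, pass the text of that file to read_atoms instead of SAMPLE.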
By now you should be able to guess what each of these atom names stands for.

Q1c: Draw a peptide bond and indicate where each of the 4 atoms N, CA, C and O is located.

2. Looking at a structure in PyMOL

Now that you know what the pdb file looks like, let's open it in PyMOL. Use "File" -> "Open" and select your file. After loading a pdb file, you can manipulate it in PyMOL from either the menus or the command line. In this first section we will look at the structure using only the menus, and then you will use some Tcl/Tk GUI commands.

Mouse movements

To start, we will try zooming and rotating. Most movements can be made using the mouse together with the keys "Shift" and "Ctrl". The main combinations are shown below; these and more combinations are also listed in the bottom right corner of the PyMOL viewer if you forget them (see below). Try all the combinations below and see what happens to the molecule. Make sure that you have "3-Button Viewing" turned on; if not, click on the text.

Keyboard        Left mouse button      Middle mouse button                Right mouse button
(none)          Rotate in 3D           Translate->scroll / Button->move   Zoom/scale
Shift           Grow a selection box   Reduce a selection box             Move clipping plane (shadows distant atoms)
Control+Shift   Select                 Reset origin of rotation           Move clipping plane

Figure: Guide to mouse actions in the lower right corner of the PyMOL window.

Display options

Now we will begin manipulating the image. The simplest way of doing this is through the menu to the right (see picture below). There you have 5 buttons for each part of the structure: A (Action), S (Show), H (Hide), L (Label) and C (Color). Try the different types of representations in Show (S). Note: once you have turned a representation on, you hide it again using the Hide (H) menu. There are also some advanced options that can only be accessed through the Tcl/Tk GUI under the "Settings" menu, where you can change the transparency and select different types of cartoons.
Once you have tried all the settings, let's display the protein as cartoon and hide all lines and waters. The next step is coloring with the C-menu. Try the different colorings on the model and see what happens.

Selections

Now we will try different ways of making a selection. Each time, the selection you have made should appear in the menu to the right, as shown in the image above. Once you have a selection, you can manipulate it in the same way as you just did for the whole protein. You have already tried making selections with the mouse. In the bottom right corner you can click on the green text "Selecting" to change the selection mode. Try the different modes and see what happens. The results of your selections are displayed in the log of the PyMOL Tcl/Tk GUI. Furthermore, you can display the sequence, with your selections and coloring shown on it, at the top of the window by clicking on the S in the bottom right corner. Once you have a selection that you want to use for manipulation, you can rename it using the A-menu, and also delete it there when you no longer need it.

Tcl/Tk GUI Commands

Instead of clicking manually, you can make more specific selections using the command line in the Tcl/Tk GUI. The PyMOL command line is actually a Python command line.
A few commands that let you select different regions are:

> select symbol c - selects all carbon atoms
> sele name ca+cb - selects all alpha and beta carbons
> sele resn gly - selects all glycine residues
> sele resi 1-10 - selects residues 1 through 10
> sele resi 1+5+7 - selects 3 residues at different positions
> sele chain a - selects chain A (note: usually chain designations are single letters or numbers)
> sele resi 1-10 in chain a - selects residues 1-10 in chain A only
> sele ss h+s - selects all atoms assigned alpha helix or beta sheet (other options are l = loop, "" = unstructured)

Some more advanced selection options are:

> sele name ca gap 5 - selects everything at least 5 angstroms away from an alpha carbon
> sele name ca around 5 - selects everything within 5 angstroms of an alpha carbon
> sele name cg within 5 of name ca - selects all gamma carbons within 5 angstroms of any alpha carbon
> sele byres name ca around 5 - selects all atoms in all residues that have at least one atom within 5 angstroms of an alpha carbon. That is, if a single atom in a residue meets the selection criteria, the whole residue will be included in the new selection
> sele neighbor name ca - selects all atoms bonded to alpha carbons

Selections can also be combined with Boolean operators such as and, or and not. You can rename a selection, as it appears in the menu, using the command "set_name". For example:

> sele resn ala
> set_name sele, Alanines

will give you a selection named "Alanines" containing all alanine residues. Another way of doing the same is to write:

> sele Alanines = (resn;ala)

or:

> sele Alanines, resn ala

If you want to color this selection, use the command "color":

> color blue, Alanines - will color the selection named "Alanines"
> color blue, (resi 1-10) - will color specific residues

Now you should have enough knowledge to make some selections and color things yourself.
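To make the distance-based operators more concrete, the sketch below mimics "within ... of" and "byres" on a handful of invented atoms in plain Python. It only illustrates the logic described for these operators and is not PyMOL's implementation. The atom tuples, the helper names within and byres, and the choice to treat the cutoff as inclusive are all assumptions made for this example.

```python
import math

# Toy atoms: (residue number, atom name, (x, y, z)). All coordinates are invented.
PROTEIN = [
    (1, "CA", (0.0, 0.0, 0.0)),
    (1, "CB", (1.5, 0.0, 0.0)),
    (2, "CA", (3.8, 0.0, 0.0)),
    (2, "CG", (9.0, 0.0, 0.0)),
    (3, "CA", (20.0, 0.0, 0.0)),
    (3, "CB", (21.5, 0.0, 0.0)),
]
LIGAND = [(900, "O1", (6.0, 0.0, 0.0))]


def within(candidates, cutoff, targets):
    """Atoms from candidates that lie within cutoff angstroms of any target atom."""
    return [
        atom
        for atom in candidates
        if any(math.dist(atom[2], target[2]) <= cutoff for target in targets)
    ]


def byres(selection, all_atoms):
    """Expand a selection to every atom of each residue that it touches."""
    residues = {atom[0] for atom in selection}
    return [atom for atom in all_atoms if atom[0] in residues]


close_atoms = within(PROTEIN, 5.0, LIGAND)    # like: protein within 5 of ligand
close_residues = byres(close_atoms, PROTEIN)  # like: byres protein within 5 of ligand

print([(res, name) for res, name, _ in close_atoms])
print([(res, name) for res, name, _ in close_residues])
```

In this toy data the CA atom of residue 1 is more than 5 angstroms from the ligand, so it is missing from close_atoms, but it reappears in close_residues because another atom of the same residue is close enough. This is the difference between an atom-based and a residue-based selection.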
The next task is to show the structure as cartoon, color the two chains in different colors (did you answer Q1a correctly?) and then color the first, second and third alpha-helix in chain A in three other colors. Do you think that the residues between the second and third helices really are in a helical state? You can redefine the secondary structure state of a residue using the command:

> alter 127-128/, ss=''

If you do this for a region and then select show cartoon again, you should see the difference.

Q2: Describe how you find helix 1, 2 and 3 in chain A. What residue numbers define their start and end?

Another way to change the secondary structure assignment is to use A -> "assign sec. struc.". PyMOL calculates angles and will only assign perfect sheets and helices. This is done automatically when you display as cartoon if the pdb file does not contain any predefined secondary structure. Try this and see if you notice any difference.

Labeling

Now let's show the residue numbers for the first and last residues in each of the two helices. First, make a selection of these 4 residues. Then show their labels using the Label-menu (L) and "Residues".

3. Saving your image

After you have created your image, you may wish to save it as a picture so that you can include it in papers or presentations. The first step is to ray-trace the current view using the command "ray". But first, you may wish to change the background color to white with "bg_color white". With a white background you should also use "set depth_cue=0" and "set ray_trace_fog=0". To change the lighting of the object during a ray trace, use "set direct=0" (values from 0 to 1). Another setting to play with during ray tracing is "set orthoscopic=0". For a final rendering, "set antialias=1.0" will improve the quality of the image by smoothing jagged edges.
> bg_color white
> set depth_cue, 0
> set ray_trace_fog, 0
> set direct, 0
> set orthoscopic, 0
> set antialias, 1.0

When you run the ray tracing you can specify the size of the image: "ray 1000,1000" will give you an image of 1000 by 1000 pixels. Note: the higher the resolution, the longer the ray tracing takes, and it is a slow process! Start with a smaller resolution to save time, e.g. 500. First rotate the structure into the orientation in which you want to take the picture. Then type:

> ray 500,500

Note: do not click on the window after ray tracing and before saving the image, or the trace will be lost! You can then save the file in the format of your choice. PNG is a good choice; type:

> png directory/filename.png

Q3: Do this with the image that you created above and include it in your report.

4. Writing PyMOL scripts

When working with PyMOL you can save the commands you issue in a text file (script file). You can run this script at a later time, and all the commands in the file will then execute automatically. The standard script files for PyMOL are regular text files with the extension .pml. To invoke a script, type

> @scriptname.pml

in the command window. In a script you can use all the commands that you type in the Tcl/Tk GUI. If you are unsure of how a command works, you can find information about most things at http://www.pymolwiki.org/. You can also record all your menu selections in a log to find out which commands correspond to different actions. Select "File" -> "Log", and a file named log.pml (or a name of your choice) will be created in your current directory. A PyMOL script may look something like this (comments are indicated with #):

delete all                   # remove old data
load 1F2T.pdb                # open a structure in the window
select A, chain a
show ribbon, A
color red, A
color blue, resi 1-10 in A
hide (solvent)               # hide all the waters

Q4: Why do you think it is convenient to use a script file?

5. Looking at surfaces and interactions

Now let's take a new structure from PDB, 1c9w, which is a CHO-reductase with a bound NADP+, and make two selections: the ligand and the protein. Save all the commands that you use in a text file with the extension .pml. This should be a script that can produce the images of this section, and it should be included in the report. Note: this structure has an NADP ligand. Look in the .pdb file under the entry HETATM, in the coordinates section, to find out how this molecule is defined and what its "residue" name is.

The ligand

First, find the formula of the ligand NADP on the web. Then start by hiding the protein and coloring the ligand by atom type. It could be a good idea to visualize it as sticks. Note: by default all atoms except the hydrogens are displayed; to add them, use the command h_add. Look at what colors the different atoms have.

Q5a: Would you say that this is a polar molecule? What are the most polar parts?

The protein

Now hide the ligand for a while and show the surface of the protein. This gives you a feeling for what a protein "really" looks like; it takes up a lot more space than you might think. To get an idea about binding pockets it can be useful to look at its hydrophobicity. This is done by selecting (1C9W) -> A -> generate -> vacuum electrostatics -> protein contact potential (local), or with the command util.protein_vacuum_esp("1C9W",mode=2,quiet=0). This will approximate the electrostatics of surface patches. Note: this must be done on whole molecules and not on selections. Look at the protein and see if you can find any hydrophilic clefts that might be the ligand-binding pocket. Show the ligand again by clicking on it (and also on the whole molecule 1C9W). Look at it in the pocket; a good idea is to use slabs to zoom in within the molecule. Was it where you expected? Does the NADP molecule fit well within the pocket?

Q5b: Rotate the protein so that you have the pocket in view and save an image.
There are several types of manipulations that you could do on a molecule. For example, try these commands and see what they do. Could these perhaps be selections of the binding pocket of the protein?

> sele bind, protein1 within 5 of nadp
> sele bind2, byres protein1 within 5 of nadp

(where protein1 is a selection of the protein chain and nadp is a selection of the ligand)

Hydrogen bonds

To define all possible hydrogen bonds that two molecules can form, you must first add all hydrogens (they are not included in the .pdb file) and define all possible donor/acceptor atoms. This can be done with the commands:

> h_add nadp
> h_add protein1
> select don, (elem n,o and neighbor hydro)
> select acc, (elem o or (elem n and not neighbor hydro))

Q5c: Describe in your own words what this means. Does it fit with what you have learned about hydrogen bonds? What atoms form hydrogen bonds in organic molecules? What is the typical length of a hydrogen bond?

Then draw all distances between donors and acceptors that are shorter than 3.2 Å:

> dist HBA, (protein1 and acc),(nadp and don), 3.2
> dist HBD, (protein1 and don),(nadp and acc), 3.2

The labels show the distances between the atoms. If you don't want to see them, type:

> hide labels,HBA
> hide labels,HBD

Now modify the commands above to display all hydrogen bonds to the oxygen atoms of the phosphate groups in NADP.

Q5d: Which residues could form hydrogen bonds with the phosphate groups of NADP?

Q5e: Save a nice image where you can see the ligand in the binding pocket with hydrogen bonds and include it in your report.

You are now finished with the PyMOL tutorial, but you will use PyMOL a lot in the following labs, so you will soon learn some new PyMOL tricks. Please answer the questions and hand in the report (by e-mail to firstname.lastname@example.org) before the next lab. Convert pictures so that the report size is < 5 MB.