Protein Crystallography
Part II
Tim Grüne
Dept. of Structural Chemistry, Prof. G. Sheldrick
University of Göttingen
http://shelx.uni-ac.gwdg.de
tg@shelx.uni-ac.gwdg.de

Overview
• The Reciprocal Lattice
• The Ewald Sphere
• Data Processing and Scaling
• The Phase Problem
• SAD, MAD, MIR, RIP, et al.

Molecular Biology


Protein Crystallography II

Amplitudes and Phases
The electron density can be calculated from the structure factors via the Fourier transformation.

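\[
\begin{aligned}
\rho(x, y, z) &= \frac{1}{V_{\mathrm{unitcell}}} \sum_{h,k,l} F(h, k, l)\, e^{-2\pi i(hx+ky+lz)} \\
              &= \frac{1}{V_{\mathrm{unitcell}}} \sum_{h,k,l} |F(h, k, l)|\, e^{i\phi(h,k,l)}\, e^{-2\pi i(hx+ky+lz)}
\end{aligned}
\]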

This is easily done by a computer. The equation, however, contains two unknown quantities, amplitude |F (h, k, l)| and phase φ of the reflections. They must be known before anything can be computed. The first half of this talk deals with how to extract the first part, the amplitude, from diffraction experiments. The second half is concerned with how to retrieve the phases.
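As an illustration of how a computer evaluates the Fourier summation above, here is a minimal one-dimensional sketch in Python with NumPy. The two point "atoms", their weights, the index range h = −10 … 10 and the function names are all assumptions made for this example; they are not taken from a real data set or from any crystallographic program.

```python
import numpy as np

def structure_factor_1d(h, positions, weights):
    """F(h) = sum_j f_j * exp(+2*pi*i*h*x_j) for point atoms at fractional x_j."""
    return np.sum(weights * np.exp(2j * np.pi * h * positions))

def electron_density_1d(structure_factors, n_grid, cell_length=1.0):
    """rho(x) = (1/V) * sum_h F(h) * exp(-2*pi*i*h*x), sampled on n_grid points."""
    x = np.arange(n_grid) / n_grid
    rho = np.zeros(n_grid, dtype=complex)
    for h, f_h in structure_factors.items():
        rho += f_h * np.exp(-2j * np.pi * h * x)
    # The imaginary parts cancel because F(-h) is the complex conjugate of F(h).
    return rho.real / cell_length

# Hypothetical toy structure: two point atoms in a 1D unit cell.
positions = np.array([0.2, 0.7])   # fractional co-ordinates
weights = np.array([6.0, 8.0])     # stand-ins for the number of electrons

# Amplitude AND phase are known here, because we built F(h) from the model.
structure_factors = {h: structure_factor_1d(h, positions, weights)
                     for h in range(-10, 11)}

density = electron_density_1d(structure_factors, n_grid=100)
strongest_peak = int(np.argmax(density))        # grid index of the heavier atom
second_peak = int(np.argmax(density[:50]))      # strongest peak in the first half
```

In this toy case the phases are available because the structure factors were computed from the model. In a real experiment only the amplitudes are measured, which is the subject of the second half of the talk.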


The Reciprocal Lattice
[Figure: the real-space vectors a and b span the area A; the reciprocal vector is c∗ = A/Vunitcell]

The reciprocal lattice is an important concept in crystallography. It is created by three reciprocal lattice vectors, a∗, b∗, and c∗, derived from the real space vectors a, b, and c.

For an orthorhombic space group (all angles 90°), the reciprocal vectors are parallel to the real space vectors, with different lengths. For general space groups the reciprocal vectors are not parallel to those of the real space unit cell. But in any case, the volume of the reciprocal cell is the inverse of the volume of the real space cell, V∗ = 1/V.

The point group (symmetry without translations) is the same for the real and the reciprocal lattice. Therefore many symmetry related questions for crystals also apply to their reciprocal lattice.
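The relations on this slide can be checked numerically. The sketch below uses the standard cross-product definition of the reciprocal vectors (for example c∗ = a × b / V) and a hypothetical monoclinic cell; the cell dimensions and the function name are assumptions chosen only for illustration.

```python
import numpy as np

def reciprocal_vectors(a, b, c):
    """Return a*, b*, c* and the real-space cell volume V."""
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    volume = np.dot(a, np.cross(b, c))
    a_star = np.cross(b, c) / volume
    b_star = np.cross(c, a) / volume
    c_star = np.cross(a, b) / volume
    return a_star, b_star, c_star, volume

# Hypothetical monoclinic cell: a = 10, b = 20, c = 30 (Angstrom), beta = 110 degrees.
beta = np.radians(110.0)
a = np.array([10.0, 0.0, 0.0])
b = np.array([0.0, 20.0, 0.0])
c = np.array([30.0 * np.cos(beta), 0.0, 30.0 * np.sin(beta)])

a_star, b_star, c_star, volume = reciprocal_vectors(a, b, c)

# Volume of the reciprocal cell; it should equal 1/V.
reciprocal_volume = np.dot(a_star, np.cross(b_star, c_star))

# In a non-orthogonal cell a* is not parallel to a (non-zero cross product),
# while b* stays parallel to b because b is perpendicular to a and c.
a_star_parallel_to_a = np.allclose(np.cross(a_star, a), 0.0)
b_star_parallel_to_b = np.allclose(np.cross(b_star, b), 0.0)
```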


The Ewald Sphere
It is important to collect as complete data as possible, i.e., to record nearly all reflections up to the resolution limit (which is often due to the crystal quality). In order to understand which reflections are collected, one can look at the Ewald Sphere. It is constructed in reciprocal space. When a lattice point crosses the Ewald sphere, a reflection occurs in the direction determined by the centre of the sphere and the point of intersection. The angle 2θ is the same as the direction recorded on the detector, even though the Ewald sphere is constructed in reciprocal space.

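[Figure: Ewald sphere of radius r = 1/λ, showing the incident beam, the origin of the reciprocal lattice, and the diffraction angle 2θ]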
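The Ewald construction can be written down in a few lines. The following is a sketch under stated assumptions: the incident beam travels along +x, the origin of the reciprocal lattice sits where the beam leaves the sphere, and the function names and the example lattice point are hypothetical.

```python
import numpy as np

def sphere_centre(wavelength, beam_direction=(1.0, 0.0, 0.0)):
    """Centre of the Ewald sphere; the reciprocal-lattice origin lies on its surface."""
    beam = np.asarray(beam_direction, dtype=float)
    beam = beam / np.linalg.norm(beam)
    return -beam / wavelength

def is_on_ewald_sphere(point, wavelength, tolerance=1e-6):
    """True if the reciprocal lattice point lies on the sphere of radius 1/lambda."""
    distance = np.linalg.norm(np.asarray(point, dtype=float) - sphere_centre(wavelength))
    return abs(distance - 1.0 / wavelength) < tolerance

def two_theta_degrees(point, wavelength):
    """Angle between the incident beam and the direction centre -> lattice point."""
    centre = sphere_centre(wavelength)
    incident = -centre                                   # centre -> origin
    diffracted = np.asarray(point, dtype=float) - centre  # centre -> lattice point
    cosine = np.dot(incident, diffracted) / (
        np.linalg.norm(incident) * np.linalg.norm(diffracted))
    return np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0)))

# Hypothetical example: wavelength 1.0 Angstrom, one reciprocal lattice point.
wavelength = 1.0
point = np.array([-0.5, np.sqrt(3.0) / 2.0, 0.0])

reflecting = is_on_ewald_sphere(point, wavelength)
two_theta = two_theta_degrees(point, wavelength)
d_spacing = 1.0 / np.linalg.norm(point)   # resolution of this reflection
```

For this point the values agree with Bragg's law, λ = 2d sin θ, which is a quick consistency check on the construction.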


Limits of Data Collection
During data collection the crystal is rotated about an axis. The reciprocal lattice then rotates about the same axis. All lattice points that pass through the Ewald sphere during rotation are collected. Apart from the resolution limit (radius of the sphere, but more likely the quality of the crystal), two parts of reciprocal space cannot be collected:

[Figure: Ewald sphere of radius r = 1/λ, with the rotation axis, the grey shaded zone, and the camera limit marked]

The grey shaded zone can be minimised by changing the direction of the rotation axis with respect to the incident beam direction.


Data Collection
[Figure: the crystal rotates in the X-ray beam; successive frames cover 0–1°, 1–2°, ..., 179–180°]

Typical frame widths range from 0.2–1°. For a 180° scan, this gives 180–900 images. This is typical for proteins that diffract to moderate resolution. A more thorough data collection rotates the crystal about two axes. One easily ends up with a few thousand images.


Data Processing/ Integration
Data collection results in a list of images, each representing a wedge of the rotation of the crystal in the beam. The images are distorted sections of reciprocal space. Data integration has to reconstitute the original, undistorted lattice in 3 dimensions. It provides a (long) list with one line per reflection:

                                              detector co-ordinates
   H    K    L    Intensity        error          x        y     z[°]
  -3    0   -3    4.162E+03    1.537E+02     1181.5   1235.6    107.4
  -3   -3    0    2.747E+03   -1.075E+02     1110.9   1205.1     76.0
  -3    0    3    3.946E+03    1.451E+02     1156.2   1233.4     18.3
   1    1   -4    5.933E+03   -2.139E+02     1215.0   1226.7    165.0
   4    1   -1    5.640E+03   -2.064E+02     1209.5   1074.0     57.3


Data Integration — Flow Chart


Scaling I
Calculation of the electron density is based on an ideal crystal: infinitely large, perfect unit cell, but also perfect data collection. This is quite far from reality.
• Different regions of the detector have different sensitivity
• Beam instability: on some frames the total intensity can be higher than on others; this applies especially to synchrotrons
• The crystal is not perfectly centred in the beam
• Data may even be collected from several crystals


Scaling II
The (experimental) differences in intensities necessitate the scaling of the data: all reflections must be put on a common scale. To do so, one takes symmetry-related reflections into account: reflections that are related by one of the symmetry operators of the crystal’s space group must have equal intensities. Even in the simplest space group (P1) with no symmetries, scaling can be carried out because of Friedel’s law: reflections with negated indices, i.e., (h, k, l) and (−h, −k, −l), have the same intensity. That is because they are reflected from the same set of planes, but on opposite sides.
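To make the idea of a common scale concrete, here is a deliberately simplified sketch: a single least-squares scale factor between two batches that measured the same (or symmetry-equivalent) reflections. This one-parameter model, the function names and the intensity values are assumptions for illustration; actual scaling programs fit far more elaborate models.

```python
import numpy as np

def relative_scale(reference, other):
    """Scale k that minimises sum((reference - k * other)**2)."""
    reference = np.asarray(reference, dtype=float)
    other = np.asarray(other, dtype=float)
    return np.sum(reference * other) / np.sum(other * other)

def merge(reference, other):
    """Put `other` on the scale of `reference` and average equivalent reflections."""
    k = relative_scale(reference, other)
    return 0.5 * (np.asarray(reference, dtype=float) + k * np.asarray(other, dtype=float))

# Hypothetical intensities of five reflections measured in batch 1 ...
batch_1 = np.array([4162.0, 2747.0, 3946.0, 5933.0, 5640.0])
# ... and of their equivalents in batch 2, recorded with a 25 % weaker beam.
batch_2 = batch_1 / 1.25

scale = relative_scale(batch_1, batch_2)
merged = merge(batch_1, batch_2)
```

With noise-free data the scale factor is recovered exactly; with real data the equivalents disagree within their errors, and the residual disagreement is a measure of data quality.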


The Phase Problem

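\[
\rho(x, y, z) = \frac{1}{V_{\mathrm{unitcell}}} \sum_{h,k,l=-\infty}^{\infty} |F(h, k, l)|\, e^{i\phi(h,k,l)}\, e^{-2\pi i(hx+ky+lz)}
\]

To calculate the electron density, both |F(h, k, l)| and φ(h, k, l) are needed. The diffraction experiment gives |F(h, k, l)|, but not φ(h, k, l).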

The structure factor, from which we could calculate the electron density distribution of the crystal, is a complex quantity. It has an amplitude and a phase. Only the amplitude, but not the phase can be determined directly from a diffraction experiment. This loss can be compared with a projection on a plane wall: The eye may see a three dimensional object — but which face points forward?

This problem is known as the phase problem of crystallography.

The Importance of the Phase
Unfortunately, the phase of the structure factor contains the main information about the shape of the molecule.
[Figure: a picture is Fourier transformed (FT) into amplitudes |F(h, k, l)| and phases φ(h, k, l); after the inverse FT, the phase φ of the duck determines the picture.]

pictures from http://www.ysbl.york.ac.uk/~cowtan/fourier/fourier.html
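The same experiment can be repeated numerically. The sketch below combines the amplitudes of one image with the phases of another and measures which of the two the result resembles. Random arrays stand in for the pictures, and the function names are hypothetical; this is not how the figures referenced above were produced.

```python
import numpy as np

def swap_phases(phase_source, amplitude_source):
    """Inverse FT of |F| from one image combined with phi from another."""
    ft_phase = np.fft.fft2(phase_source)
    ft_amplitude = np.fft.fft2(amplitude_source)
    hybrid = np.abs(ft_amplitude) * np.exp(1j * np.angle(ft_phase))
    # The hybrid transform keeps Friedel symmetry, so the result is real
    # up to rounding errors.
    return np.fft.ifft2(hybrid).real

def correlation(u, v):
    """Linear correlation coefficient between two images."""
    return np.corrcoef(u.ravel(), v.ravel())[0, 1]

# Two unrelated test "pictures" (random numbers with a fixed seed).
rng = np.random.default_rng(0)
image_a = rng.normal(size=(64, 64))
image_b = rng.normal(size=(64, 64))

hybrid = swap_phases(phase_source=image_a, amplitude_source=image_b)

corr_with_phase_source = correlation(hybrid, image_a)
corr_with_amplitude_source = correlation(hybrid, image_b)
```

The hybrid correlates strongly with the image that supplied the phases and hardly at all with the image that supplied the amplitudes.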


Techniques for retrieving the Phases — Overview
One of the major efforts of macromolecular crystallography lies in determining good phases. The following are the most frequently used techniques:
1. direct methods (small molecules and high resolution only)
2. molecular replacement
3. isomorphous replacement
4. anomalous dispersion
5. exploitation of radiation damage


Direct methods
With small molecules (< 1000 unique atoms) and high resolution (better than 1.2 Å), one can manage to find the structure from random starting phases. The starting phases are optimised using the assumption that the structure consists of resolved atoms. This assumption imposes statistical restraints on the phase probability distribution.

Very small structures can also be solved by interpreting the Patterson function. This is a Fourier transform based on intensities rather than structure factors, i.e., it can be calculated from experimental data. The Patterson function has the property that a vector to a peak is also a vector connecting two atoms in the structure. For too many atoms, the peaks of the Patterson function come too close to be interpreted.
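The Patterson property described above can be seen in a one-dimensional toy example. The grid size, the two atom positions and the function name are assumptions for this sketch, and the 1/V normalisation is omitted.

```python
import numpy as np

def patterson_1d(intensities):
    """Fourier transform of the intensities |F(h)|^2; no phases are needed."""
    return np.fft.ifft(intensities).real

# Hypothetical 1D structure: two equal point atoms on a grid of 64 points.
n_grid = 64
density = np.zeros(n_grid)
density[10] = 1.0
density[25] = 1.0

# NumPy's sign convention differs from the one used on the slides,
# but the intensities do not depend on that choice.
structure_factors = np.fft.fft(density)
intensities = np.abs(structure_factors) ** 2   # this is what the experiment provides

patterson = patterson_1d(intensities)

# Peaks of the Patterson function, listed as grid indices.
peak_positions = [int(u) for u in np.flatnonzero(patterson > 0.5)]
```

Besides the origin peak, the peaks appear at u = 15 and u = 49 (which is −15 modulo 64), i.e. at plus and minus the vector connecting the two atoms. With N atoms there are N(N − 1) such peaks, which is why the map becomes crowded for large structures.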


Molecular Replacement
By November 2004, the PDB, the Protein Data Bank (http://www.rcsb.org/pdb), held more than 28,000 structures, both from X-ray crystallography and NMR. Fewer and fewer newly deposited structures reveal a new fold. Sequence homology between two proteins normally also implies structural similarity, and therefore chances are good that a new structure is similar to an already determined one.

One searches the unit cell with a known structure, or a fragment of one, for the correct orientation and position. These co-ordinates can then be used to calculate first phases for the experimental data. The search is done in two steps:

Rotational search: The Patterson function can be calculated both from the diffraction data and the search model. It does not depend on the position within the unit cell, but only on the orientation. Hence, we can calculate the Patterson for the model in different orientations, compare it with the Patterson of the data, and pick the orientation with the best agreement.

Translational search: The model is moved through the asymmetric unit, keeping the orientation found in the rotational search. At each point, the calculated structure factor amplitudes |Fc| are scored against the experimental data.

Problems: strong model bias (phases!); may sometimes not work even with 100% sequence homology (domain movements).


Isomorphous Replacement
Isomorphous replacement is based on the idea that introducing a small molecule into a protein or nucleic acid crystal alters the structure of the macromolecule hardly or not at all. On the other hand, a few heavy metal atoms can contribute detectably to the structure factors and hence introduce changes in the reflection intensities. Common heavy metals are Hg (80e−), Pb (82e−), Au (79e−), Pt (78e−), or U (92e−). They can be incorporated by co-crystallisation or by soaking after the crystals have grown. The first protein structures, like myoglobin or hemoglobin, were solved by isomorphous replacement.

G. Sheldrick


Isomorphous Replacement
In order to use the extra information, one needs at least two data sets: a native one (no heavy metal) and a derivative (with heavy metal).
[Figure: flow chart. derivative |FT| and native |FP| → difference → co-ordinates → |FH|, φH → Harker construction → |FT|, φT]

The co-ordinates of the heavy metal(s) can be derived via either direct methods or Patterson methods. From the co-ordinates one can calculate structure factors (amplitude and phase!). The phases for the derivative follow from the Harker construction.

The Harker Construction
With a single derivative, the Harker construction provides phases for the protein structure up to a twofold ambiguity:
1. Draw a circle with radius |FT|.
2. Draw the vector for the heavy atom, |FH|, φH.
3. From its endpoint, draw a circle with radius |FP|.
The two circles have two points of intersection, from which one reads the two possible phases φT for the derivative or (drawing the vector from the endpoint of the heavy atom vector) φP for the native structure.

[Figure: Harker construction with the circles of radius |FT| and |FP|, the heavy atom vector |FH|, φH, and the derivative vector |FT|, φT]

With only one derivative, one speaks of SIR, single isomorphous replacement, with more than one, one speaks of MIR, multiple isomorphous replacement. MIR removes the ambiguity of SIR. The more derivatives, the better the phases (and their errors) can be determined.
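The geometric Harker construction has a simple algebraic counterpart. Assuming the derivative structure factor is the vector sum F_T = F_P + F_H (which is what the two intersecting circles express), the cosine rule gives φP = φH ± arccos[(|FT|² − |FP|² − |FH|²) / (2 |FP| |FH|)]. The sketch below evaluates this for made-up numbers; the function name and all values are hypothetical.

```python
import cmath
import math

def harker_phases(f_p_amplitude, f_t_amplitude, f_h):
    """Return the two candidate native phases phi_P (in radians).

    f_p_amplitude: |F_P| of the native crystal
    f_t_amplitude: |F_T| of the derivative
    f_h:           complex heavy-atom structure factor (amplitude and phase)
    """
    f_h_amplitude = abs(f_h)
    phi_h = cmath.phase(f_h)
    cos_delta = (f_t_amplitude ** 2 - f_p_amplitude ** 2 - f_h_amplitude ** 2) / (
        2.0 * f_p_amplitude * f_h_amplitude)
    # Measurement errors can push the value slightly outside [-1, 1].
    cos_delta = max(-1.0, min(1.0, cos_delta))
    delta = math.acos(cos_delta)
    return phi_h + delta, phi_h - delta

# Made-up "true" structure factors, used only to generate test amplitudes.
true_f_p = cmath.rect(100.0, math.radians(40.0))
f_h = cmath.rect(20.0, math.radians(110.0))
true_f_t = true_f_p + f_h

candidates = harker_phases(abs(true_f_p), abs(true_f_t), f_h)

# Compare phases on the unit circle to avoid 2*pi wrap-around problems.
errors = [abs(cmath.exp(1j * phi) - cmath.exp(1j * math.radians(40.0)))
          for phi in candidates]
```

One of the two candidates reproduces the true phase of 40°; the other is the false solution of the SIR ambiguity, which a second derivative would rule out.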


Anomalous Dispersion
For a normal diffraction experiment, Friedel’s law is valid, which states that the intensities of the reflections (h, k, l) and (−h, −k, −l) are equal and that the phases of the underlying structure factors have opposite signs, φ(h, k, l) = −φ(−h, −k, −l). For heavy atoms, the wavelength of X-rays lies in a region where this is no longer true under all circumstances. This effect is due to absorption by these atoms at specific wavelengths. This wavelength is different for every type of atom and normally has to be determined before data collection by a fluorescence scan (scattering of X-rays at right angle to the incident beam).

The difference in intensities can be exploited by a Harker construction similar to isomorphous replacement, but with |FT| and |FP| replaced with |F(h, k, l)| and |F(−h, −k, −l)|. With this SAD (single-wavelength anomalous dispersion) approach, the twofold ambiguity for the phases remains.
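The breakdown of Friedel’s law can be reproduced with a small model calculation. In the sketch below the anomalous effect is modelled, as is commonly done, by giving the heavy atom a complex scattering factor; the three atom positions, the scattering factor values (including the imaginary part of 10 electrons) and the function names are made-up numbers for illustration only.

```python
import numpy as np

def structure_factor(hkl, positions, scattering_factors):
    """F(h,k,l) = sum_j f_j * exp(2*pi*i*(h*x_j + k*y_j + l*z_j))."""
    phases = 2j * np.pi * (positions @ np.asarray(hkl, dtype=float))
    return np.sum(scattering_factors * np.exp(phases))

def friedel_difference(hkl, positions, scattering_factors):
    """|F(h,k,l)| - |F(-h,-k,-l)|; zero when Friedel's law holds."""
    minus_hkl = [-index for index in hkl]
    return (abs(structure_factor(hkl, positions, scattering_factors))
            - abs(structure_factor(minus_hkl, positions, scattering_factors)))

# Hypothetical structure: two light atoms and one heavy atom (fractional co-ordinates).
positions = np.array([[0.10, 0.20, 0.30],
                      [0.40, 0.15, 0.70],
                      [0.25, 0.60, 0.05]])
hkl = (1, 2, 3)

# Real scattering factors: normal scattering, Friedel's law holds.
normal = np.array([6.0, 7.0, 78.0])
# Same atoms, but the heavy atom absorbs: its scattering factor becomes complex.
anomalous = np.array([6.0, 7.0, 78.0 + 10.0j])

difference_normal = friedel_difference(hkl, positions, normal)
difference_anomalous = friedel_difference(hkl, positions, anomalous)
```

The difference is small compared with the amplitudes themselves, which illustrates why anomalous data have to be measured accurately.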


SIRAS and MAD phasing
To overcome the twofold phase ambiguity, two methods can be applied:
1. SIRAS: Often a native crystal or dataset is available when SAD data are collected. This leads to the combination of SIR and SAD: SIRAS. SIR comes from the comparison of native to derivative, SAD from the derivative.
2. MAD: Instead of changing crystals, one can change the wavelength: the strength of the anomalous signal varies with the wavelength. This results in multi-wavelength anomalous dispersion, or MAD.


Some “exotic” experimental techniques
RIP: Radiation Induced Phasing makes use of the fact that radiation forms radicals. They damage the molecule, and apart from random destruction, carboxyl groups are removed and disulphides destroyed. For RIP, a normal data set is collected (“native”), then the crystal is exposed to a high dose of X-rays, then a second set (“derivative”) is collected.

Sulphur-SAD: exploitation of the very weak signal of native S (or P for nucleic acid structures).

Halide soaking: Iodide SIRAS or bromide MAD after a quick soak (10–30 s) in ≈ 1 M KI or NaBr.


Example Phases
[Figure panels: (initial) centroid phases; resolved twofold ambiguity; final (refined) phases]
