Chapter 1. Biochemistry & Medicine

Robert K. Murray, MD, PhD

INTRODUCTION

Biochemistry can be defined as the science concerned with the chemical basis of life (Gk bios "life"). The cell is the structural unit of living systems. Thus, biochemistry can also be described as the science concerned with the chemical constituents of living cells and with the reactions and processes they undergo. By this definition, biochemistry encompasses large areas of cell biology, of molecular biology, and of molecular genetics.

The Aim of Biochemistry Is to Describe & Explain, in Molecular Terms, All Chemical Processes of Living Cells

The major objective of biochemistry is the complete understanding, at the molecular level, of all of the chemical processes associated with living cells. To achieve this objective, biochemists have sought to isolate the numerous molecules found in cells, determine their structures, and analyze how they function. Many techniques have been used for these purposes; some of them are summarized in Table 1–1.

Table 1–1. The principal methods and preparations used in biochemical laboratories.

Methods for Separating and Purifying Biomolecules¹
  Salt fractionation (eg, precipitation of proteins with ammonium sulfate)
  Chromatography: Paper; ion exchange; affinity; thin-layer; gas-liquid; high-pressure liquid; gel filtration
  Electrophoresis: Paper; high-voltage; agarose; cellulose acetate; starch gel; polyacrylamide gel; SDS-polyacrylamide gel
  Ultracentrifugation
Methods for Determining Biomolecular Structures
  Elemental analysis
  UV, visible, infrared, and NMR spectroscopy
  Use of acid or alkaline hydrolysis to degrade the biomolecule under study into its basic constituents
  Use of a battery of enzymes of known specificity to degrade the biomolecule under study (eg, proteases, nucleases, glycosidases)
  Mass spectrometry
  Specific sequencing methods (eg, for proteins and nucleic acids)
  X-ray crystallography
Preparations for Studying Biochemical Processes
  Whole animal (includes transgenic animals and animals with gene knockouts)
  Isolated perfused organ
  Tissue slice
  Whole cells
  Homogenate
  Isolated cell organelles
  Subfractionation of organelles
  Purified metabolites and enzymes
  Isolated genes (including polymerase chain reaction and site-directed mutagenesis)

¹ Most of these methods are suitable for analyzing the components present in cell homogenates and other biochemical preparations. The sequential use of several techniques will generally permit purification of most biomolecules. The reader is referred to texts on methods of biochemical research for details.

A Knowledge of Biochemistry Is Essential to All Life Sciences

The biochemistry of the nucleic acids lies at the heart of genetics; in turn, the use of genetic approaches has been critical for elucidating many areas of biochemistry. Physiology, the study of body function, overlaps with biochemistry almost completely. Immunology employs numerous biochemical techniques, and many immunologic approaches have found wide use by biochemists. Pharmacology and pharmacy rest on a sound knowledge of biochemistry and physiology; in particular, most drugs are metabolized by enzyme-catalyzed reactions. Poisons act on biochemical reactions or processes; this is the subject matter of toxicology. Biochemical approaches are being used increasingly to study basic aspects of pathology (the study of disease), such as inflammation, cell injury, and cancer. Many workers in microbiology, zoology, and botany employ biochemical approaches almost exclusively. These relationships are not surprising, because life as we know it depends on biochemical reactions and processes. In fact, the old barriers among the life sciences are breaking down, and biochemistry is increasingly becoming their common language.

A Reciprocal Relationship Between Biochemistry & Medicine Has Stimulated Mutual Advances

The two major concerns for workers in the health sciences, and particularly physicians, are the understanding and maintenance of health and the understanding and effective treatment of diseases. Biochemistry has an enormous impact on both of these fundamental concerns of medicine. In fact, the interrelationship of biochemistry and medicine is a wide, two-way street. Biochemical studies have illuminated many aspects of health and disease, and conversely, the study of various aspects of health and disease has opened up new areas of biochemistry. Some examples of this two-way street are shown in Figure 1–1. For instance, a knowledge of protein structure and function was necessary to elucidate the single biochemical difference between normal hemoglobin and sickle cell hemoglobin. On the other hand, analysis of sickle cell hemoglobin has contributed significantly to our understanding of the structure and function of both normal hemoglobin and other proteins. Analogous examples of reciprocal benefit between biochemistry and medicine could be cited for the other paired items shown in Figure 1–1.

Another example is the pioneering work of Archibald Garrod, a physician in England during the early 1900s. He studied patients with a number of relatively rare disorders (alkaptonuria, albinism, cystinuria, and pentosuria; these are described in later chapters) and established that these conditions were genetically determined. Garrod designated these conditions as inborn errors of metabolism. His insights provided a major foundation for the development of the field of human biochemical genetics. More recent efforts to understand the basis of the genetic disease known as familial hypercholesterolemia, which results in severe atherosclerosis at an early age, have led to dramatic progress in understanding of cell receptors and of mechanisms of uptake of cholesterol into cells. Studies of oncogenes in cancer cells have directed attention to the molecular mechanisms involved in the control of normal cell growth. These and many other examples emphasize how the study of disease can open up areas of cell function for basic biochemical research.

Figure 1–1. Examples of the two-way street connecting biochemistry and medicine. Knowledge of the biochemical molecules shown in the top part of the diagram has clarified our understanding of the diseases shown in the bottom half, and conversely, analyses of the diseases shown below have cast light on many areas of biochemistry. Note that sickle cell anemia is a genetic disease and that both atherosclerosis and diabetes mellitus have genetic components.
[Diagram: under BIOCHEMISTRY, nucleic acids, proteins, lipids, and carbohydrates are paired, respectively, with genetic diseases, sickle cell anemia, atherosclerosis, and diabetes mellitus under MEDICINE.]

The relationship between medicine and biochemistry has important implications for the former. As long as medical treatment is firmly grounded in a knowledge of biochemistry and other basic sciences, the practice of medicine will have a rational basis that can be adapted to accommodate new knowledge. This contrasts with unorthodox health cults and at least some "alternative medicine" practices, which are often founded on little more than myth and wishful thinking and generally lack any intellectual basis.

NORMAL BIOCHEMICAL PROCESSES ARE THE BASIS OF HEALTH

The World Health Organization (WHO) defines health as a state of "complete physical, mental and social well-being and not merely the absence of disease or infirmity." From a strictly biochemical viewpoint, health may be considered that situation in which all of the many thousands of intra- and extracellular reactions that occur in the body are proceeding at rates commensurate with the organism's maximal survival in the physiologic state. However, this is an extremely reductionist view, and it should be apparent that caring for the health of patients requires a wide knowledge not only of biologic principles but also of psychologic and social principles.

Biochemical Research Has Impact on Nutrition & Preventive Medicine

One major prerequisite for the maintenance of health is that there be optimal dietary intake of a number of chemicals; the chief of these are vitamins, certain amino acids, certain fatty acids, various minerals, and water. Because much of the subject matter of both biochemistry and nutrition is concerned with the study of various aspects of these chemicals, there is a close relationship between these two sciences. Moreover, more emphasis is being placed on systematic attempts to maintain health and forestall disease, ie, on preventive medicine. Thus, nutritional approaches to, for example, the prevention of atherosclerosis and cancer are receiving increased emphasis. Understanding nutrition depends to a great extent on a knowledge of biochemistry.

Most & Perhaps All Disease Has a Biochemical Basis

We believe that most if not all diseases are manifestations of abnormalities of molecules, chemical reactions, or biochemical processes. The major factors responsible for causing diseases in animals and humans are listed in Table 1–2. All of them affect one or more critical chemical reactions or molecules in the body. Numerous examples of the biochemical bases of diseases will be encountered in this text; the majority of them are due to causes 5, 7, and 8. In most of these conditions, biochemical studies contribute to both the diagnosis and treatment. Some major uses of biochemical investigations and of laboratory tests in relation to diseases are summarized in Table 1–3.

Additional examples of many of these uses are presented in various sections of this text.

Table 1–2. The major causes of diseases. All of the causes listed act by influencing the various biochemical mechanisms in the cell or in the body.¹

1. Physical agents: Mechanical trauma, extremes of temperature, sudden changes in atmospheric pressure, radiation, electric shock.
2. Chemical agents, including drugs: Certain toxic compounds, therapeutic drugs, etc.
3. Biologic agents: Viruses, bacteria, fungi, higher forms of parasites.
4. Oxygen lack: Loss of blood supply, depletion of the oxygen-carrying capacity of the blood, poisoning of the oxidative enzymes.
5. Genetic disorders: Congenital, molecular.
6. Immunologic reactions: Anaphylaxis, autoimmune disease.
7. Nutritional imbalances: Deficiencies, excesses.
8. Endocrine imbalances: Hormonal deficiencies, excesses.

¹ Adapted, with permission, from Robbins SL, Cotran RS, Kumar V: The Pathologic Basis of Disease, 3rd ed. Saunders, 1984.

Table 1–3. Some uses of biochemical investigations and laboratory tests in relation to diseases.

1. Use: To reveal the fundamental causes and mechanisms of diseases.
   Example: Demonstration of the nature of the genetic defects in cystic fibrosis.
2. Use: To suggest rational treatments of diseases based on (1) above.
   Example: A diet low in phenylalanine for treatment of phenylketonuria.
3. Use: To assist in the diagnosis of specific diseases.
   Example: Use of the plasma enzyme creatine kinase MB (CK-MB) in the diagnosis of myocardial infarction.
4. Use: To act as screening tests for the early diagnosis of certain diseases.
   Example: Use of measurement of blood thyroxine or thyroid-stimulating hormone (TSH) in the neonatal diagnosis of congenital hypothyroidism.
5. Use: To assist in monitoring the progress (eg, recovery, worsening, remission, or relapse) of certain diseases.
   Example: Use of the plasma enzyme alanine aminotransferase (ALT) in monitoring the progress of infectious hepatitis.
6. Use: To assist in assessing the response of diseases to therapy.
   Example: Use of measurement of blood carcinoembryonic antigen (CEA) in certain patients who have been treated for cancer of the colon.

Impact of the Human Genome Project (HGP) on Biochemistry & Medicine

Remarkable progress was made in the late 1990s in sequencing the human genome. This culminated in June 2000, when leaders of the two groups involved in this effort (the International Human Genome Sequencing Consortium and Celera Genomics, a private company) announced that over 90% of the genome had been sequenced. Draft versions of the sequence were published in early 2001. It is anticipated that the entire sequence will be completed by 2003. The implications of this work for biochemistry, for all of biology, and for medicine are tremendous, and only a few points are mentioned here. Many previously unknown genes have been revealed; their protein products await characterization. New light has been thrown on human evolution, and procedures for tracking disease genes have been greatly refined. The results are having major effects on areas such as proteomics, bioinformatics, biotechnology, and pharmacogenomics. Reference to the human genome will be made in various sections of this text. The Human Genome Project is discussed in more detail in Chapter 54.

SUMMARY

• Biochemistry is the science concerned with studying the various molecules that occur in living cells and organisms and with their chemical reactions. Because life depends on biochemical reactions, biochemistry has become the basic language of all biologic sciences.
• Biochemistry is concerned with the entire spectrum of life forms, from relatively simple viruses and bacteria to complex human beings.
• Biochemistry and medicine are intimately related. Health depends on a harmonious balance of biochemical reactions occurring in the body, and disease reflects abnormalities in biomolecules, biochemical reactions, or biochemical processes.
• Advances in biochemical knowledge have illuminated many areas of medicine. Conversely, the study of diseases has often revealed previously unsuspected aspects of biochemistry. The determination of the sequence of the human genome, nearly complete, will have a great impact on all areas of biology, including biochemistry, bioinformatics, and biotechnology.
• Biochemical approaches are often fundamental in illuminating the causes of diseases and in designing appropriate therapies.
• The judicious use of various biochemical laboratory tests is an integral component of diagnosis and monitoring of treatment.
• A sound knowledge of biochemistry and of other related basic disciplines is essential for the rational practice of medical and related health sciences.

REFERENCES

Fruton JS: Proteins, Enzymes, Genes: The Interplay of Chemistry and Biology. Yale Univ Press, 1999. (Provides the historical background for much of today's biochemical research.)
Garrod AE: Inborn errors of metabolism. (Croonian Lectures.) Lancet 1908;2:1, 73, 142, 214.
International Human Genome Sequencing Consortium: Initial sequencing and analysis of the human genome. Nature 2001;409:860. (The issue [15 February] consists of articles dedicated to analyses of the human genome.)
Kornberg A: Basic research: The lifeline of medicine. FASEB J 1992;6:3143.
Kornberg A: Centenary of the birth of modern biochemistry. FASEB J 1997;11:1209.
McKusick VA: Mendelian Inheritance in Man. Catalogs of Human Genes and Genetic Disorders, 12th ed. Johns Hopkins Univ Press, 1998. [Abbreviated MIM]
Online Mendelian Inheritance in Man (OMIM): Center for Medical Genetics, Johns Hopkins University and National Center for Biotechnology Information, National Library of Medicine, 1997. http://www.ncbi.nlm.nih.gov/omim/ (The numbers assigned to the entries in MIM and OMIM will be cited in selected chapters of this work. Consulting this extensive collection of diseases and other relevant entries (specific proteins, enzymes, etc) will greatly expand the reader's knowledge and understanding of various topics referred to and discussed in this text. The online version is updated almost daily.)
Scriver CR et al (editors): The Metabolic and Molecular Bases of Inherited Disease, 8th ed. McGraw-Hill, 2001.
Venter JC et al: The Sequence of the Human Genome. Science 2001;291:1304. (The issue [16 February] contains the Celera draft version and other articles dedicated to analyses of the human genome.)
Williams DL, Marks V: Scientific Foundations of Biochemistry in Clinical Practice, 2nd ed. Butterworth-Heinemann, 1994.

Chapter 2. Water & pH

Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

Water is the predominant chemical component of living organisms. Its unique physical properties, which include the ability to solvate a wide range of organic and inorganic molecules, derive from water's dipolar structure and exceptional capacity for forming hydrogen bonds. The manner in which water interacts with a solvated biomolecule influences the structure of each. An excellent nucleophile, water is a reactant or product in many metabolic reactions. Water has a slight propensity to dissociate into hydroxide ions and protons. The acidity of aqueous solutions is generally reported using the logarithmic pH scale. Bicarbonate and other buffers normally maintain the pH of extracellular fluid between 7.35 and 7.45. Suspected disturbances of acid-base balance are verified by measuring the pH of arterial blood and the CO₂ content of venous blood. Causes of acidosis (blood pH < 7.35) include diabetic ketosis and lactic acidosis. Alkalosis (pH > 7.45) may, for example, follow vomiting of acidic gastric contents. Regulation of water balance depends upon hypothalamic mechanisms that control thirst, on antidiuretic hormone (ADH), on retention or excretion of water by the kidneys, and on evaporative loss. Nephrogenic diabetes insipidus, which involves the inability to concentrate urine or adjust to subtle changes in extracellular fluid osmolarity, results from the unresponsiveness of renal tubular osmoreceptors to ADH.

WATER IS AN IDEAL BIOLOGIC SOLVENT

Water Molecules Form Dipoles

A water molecule is an irregular, slightly skewed tetrahedron with oxygen at its center (Figure 2–1). The two hydrogens and the unshared electrons of the remaining two sp³-hybridized orbitals occupy the corners of the tetrahedron. The 105-degree angle between the hydrogens differs slightly from the ideal tetrahedral angle, 109.5 degrees. Ammonia is also tetrahedral, with a 107-degree angle between its hydrogens. Water is a dipole, a molecule with electrical charge distributed asymmetrically about its structure. The strongly electronegative oxygen atom pulls electrons away from the hydrogen nuclei, leaving them with a partial positive charge, while its two unshared electron pairs constitute a region of local negative charge.

Figure 2–1. The water molecule has tetrahedral geometry.

Water, a strong dipole, has a high dielectric constant. As described quantitatively by Coulomb's law, the strength of interaction F between oppositely charged particles is inversely proportional to the dielectric constant ε of the surrounding medium. The dielectric constant for a vacuum is unity; for hexane it is 1.9; for ethanol it is 24.3; and for water it is 78.5. Water therefore greatly decreases the force of attraction between charged and polar species relative to water-free environments with lower dielectric constants. Its strong dipole and high dielectric constant enable water to dissolve large quantities of charged compounds such as salts.

Water Molecules Form Hydrogen Bonds

An unshielded hydrogen nucleus covalently bound to an electron-withdrawing oxygen or nitrogen atom can interact with an unshared electron pair on another oxygen or nitrogen atom to form a hydrogen bond. Since water molecules contain both of these features, hydrogen bonding favors the self-association of water molecules into ordered arrays (Figure 2–2). Hydrogen bonding profoundly influences the physical properties of water and accounts for its exceptionally high viscosity, surface tension, and boiling point. On average, each molecule in liquid water associates through hydrogen bonds with 3.5 others. These bonds are both relatively weak and transient, with a half-life of a few picoseconds. Rupture of a hydrogen bond in liquid water requires only about 4.5 kcal/mol, less than 5% of the energy required to rupture a covalent O–H bond.

Figure 2–2. Left: Association of two dipolar water molecules by a hydrogen bond (dotted line). Right: Hydrogen-bonded cluster of four water molecules. Note that water can serve simultaneously both as a hydrogen donor and as a hydrogen acceptor.

Hydrogen bonding enables water to dissolve many organic biomolecules that contain functional groups which can participate in hydrogen bonding. The oxygen atoms of aldehydes, ketones, and amides provide pairs of electrons that can serve as hydrogen acceptors. Alcohols and amines can serve both as hydrogen acceptors and as donors of unshielded hydrogen atoms for formation of hydrogen bonds (Figure 2–3).

Figure 2–3. Additional polar groups participate in hydrogen bonding. Shown are hydrogen bonds formed between an alcohol and water, between two molecules of ethanol, and between the peptide carbonyl oxygen and the peptide nitrogen hydrogen of an adjacent amino acid.

INTERACTION WITH WATER INFLUENCES THE STRUCTURE OF BIOMOLECULES

Covalent & Noncovalent Bonds Stabilize Biologic Molecules

The covalent bond is the strongest force that holds molecules together (Table 2–1). Noncovalent forces, while of lesser magnitude, make significant contributions to the structure, stability, and functional competence of macromolecules in living cells. These forces, which can be either attractive or repulsive, involve interactions both within the biomolecule and between it and the water that forms the principal component of the surrounding environment.

Biomolecules Fold to Position Polar & Charged Groups on Their Surfaces

Most biomolecules are amphipathic; that is, they possess regions rich in charged or polar functional groups as well as regions with hydrophobic character. Proteins tend to fold with the R-groups of amino acids with hydrophobic side chains in the interior. Amino acids with charged or polar side chains (eg, arginine, glutamate, serine) generally are present on the surface in contact with water. A similar pattern prevails in a phospholipid bilayer, where the charged head groups of phosphatidyl serine or phosphatidyl ethanolamine contact water while their hydrophobic fatty acyl side chains cluster together, excluding water. This pattern maximizes the opportunities for the formation of energetically favorable charge-dipole, dipole-dipole, and hydrogen bonding interactions between polar groups on the biomolecule and water. It also minimizes energetically unfavorable contact between water and hydrophobic groups.

Hydrophobic Interactions

Hydrophobic interaction refers to the tendency of nonpolar compounds to self-associate in an aqueous environment. This self-association is driven neither by mutual attraction nor by what are sometimes incorrectly referred to as "hydrophobic bonds." Self-association arises from the need to minimize energetically unfavorable interactions between nonpolar groups and water.

Table 2–1. Bond energies for atoms of biologic significance.

Bond type   Energy (kcal/mol)     Bond type   Energy (kcal/mol)
O–O          34                   O=O          96
S–S          51                   C–H          99
C–N          70                   C=S         108
S–H          81                   O–H         110
C–C          82                   C=C         147
C–O          84                   C=N         147
N–H          94                   C=O         164
WATER & pH / 7 While the hydrogens of nonpolar groups such as the the backbone to water while burying the relatively hy- methylene groups of hydrocarbons do not form hydro- drophobic nucleotide bases inside. The extended back- gen bonds, they do affect the structure of the water that bone maximizes the distance between negatively surrounds them. Water molecules adjacent to a hy- charged backbone phosphates, minimizing unfavorable drophobic group are restricted in the number of orien- electrostatic interactions. tations (degrees of freedom) that permit them to par- ticipate in the maximum number of energetically WATER IS AN EXCELLENT NUCLEOPHILE favorable hydrogen bonds. Maximal formation of mul- tiple hydrogen bonds can be maintained only by in- Metabolic reactions often involve the attack by lone creasing the order of the adjacent water molecules, with pairs of electrons on electron-rich molecules termed a corresponding decrease in entropy. nucleophiles on electron-poor atoms called elec- It follows from the second law of thermodynamics trophiles. Nucleophiles and electrophiles do not neces- that the optimal free energy of a hydrocarbon-water sarily possess a formal negative or positive charge. mixture is a function of both maximal enthalpy (from Water, whose two lone pairs of sp3 electrons bear a par- hydrogen bonding) and minimum entropy (maximum tial negative charge, is an excellent nucleophile. Other degrees of freedom). Thus, nonpolar molecules tend to nucleophiles of biologic importance include the oxygen form droplets with minimal exposed surface area, re- atoms of phosphates, alcohols, and carboxylic acids; the ducing the number of water molecules affected. For the sulfur of thiols; the nitrogen of amines; and the imid- same reason, in the aqueous environment of the living azole ring of histidine. 
Common electrophiles include cell the hydrophobic portions of biopolymers tend to the carbonyl carbons in amides, esters, aldehydes, and be buried inside the structure of the molecule, or within ketones and the phosphorus atoms of phosphoesters. a lipid bilayer, minimizing contact with water. Nucleophilic attack by water generally results in the cleavage of the amide, glycoside, or ester bonds that hold biopolymers together. This process is termed hy- Electrostatic Interactions drolysis. Conversely, when monomer units are joined Interactions between charged groups shape biomolecu- together to form biopolymers such as proteins or glyco- lar structure. Electrostatic interactions between oppo- gen, water is a product, as shown below for the forma- sitely charged groups within or between biomolecules tion of a peptide bond between two amino acids. are termed salt bridges. Salt bridges are comparable in strength to hydrogen bonds but act over larger dis- O + tances. They thus often facilitate the binding of charged H3N OH + H NH molecules and ions to proteins and nucleic acids. O– Alanine Van der Waals Forces O Van der Waals forces arise from attractions between Valine transient dipoles generated by the rapid movement of electrons on all neutral atoms. Significantly weaker than hydrogen bonds but potentially extremely numer- H2O ous, van der Waals forces decrease as the sixth power of the distance separating atoms. Thus, they act over very O + short distances, typically 2–4 Å. H3 N NH Multiple Forces Stabilize Biomolecules O– The DNA double helix illustrates the contribution of O multiple forces to the structure of biomolecules. While each individual DNA strand is held together by cova- While hydrolysis is a thermodynamically favored re- lent bonds, the two strands of the helix are held to- action, the amide and phosphoester bonds of polypep- gether exclusively by noncovalent interactions. 
These tides and oligonucleotides are stable in the aqueous en- noncovalent interactions include hydrogen bonds be- vironment of the cell. This seemingly paradoxic tween nucleotide bases (Watson-Crick base pairing) behavior reflects the fact that the thermodynamics gov- and van der Waals interactions between the stacked erning the equilibrium of a reaction do not determine purine and pyrimidine bases. The helix presents the the rate at which it will take place. In the cell, protein charged phosphate groups and polar ribose sugars of catalysts called enzymes are used to accelerate the rate 8 / CHAPTER 2 of hydrolytic reactions when needed. Proteases catalyze H7O3+. The proton is nevertheless routinely repre- the hydrolysis of proteins into their component amino sented as H+, even though it is in fact highly hydrated. acids, while nucleases catalyze the hydrolysis of the Since hydronium and hydroxide ions continuously phosphoester bonds in DNA and RNA. Careful control recombine to form water molecules, an individual hy- of the activities of these enzymes is required to ensure drogen or oxygen cannot be stated to be present as an that they act only on appropriate target molecules. ion or as part of a water molecule. At one instant it is an ion. An instant later it is part of a molecule. Individ- Many Metabolic Reactions Involve ual ions or molecules are therefore not considered. We Group Transfer refer instead to the probability that at any instant in time a hydrogen will be present as an ion or as part of a In group transfer reactions, a group G is transferred water molecule. Since 1 g of water contains 3.46 × 1022 from a donor D to an acceptor A, forming an acceptor molecules, the ionization of water can be described sta- group complex A–G: tistically. 
$$\text{D–G} + \text{A} \rightleftharpoons \text{A–G} + \text{D}$$

The hydrolysis and phosphorolysis of glycogen represent group transfer reactions in which glucosyl groups are transferred to water or to orthophosphate. The equilibrium constant for the hydrolysis of covalent bonds strongly favors the formation of split products. The biosynthesis of macromolecules also involves group transfer reactions in which the thermodynamically unfavored synthesis of covalent bonds is coupled to favored reactions so that the overall change in free energy favors biopolymer synthesis.

Given the nucleophilic character of water and its high concentration in cells, why are biopolymers such as proteins and DNA relatively stable? And how can synthesis of biopolymers occur in an apparently aqueous environment? Central to both questions are the properties of enzymes. In the absence of enzymic catalysis, even thermodynamically highly favored reactions do not necessarily take place rapidly. Precise and differential control of enzyme activity and the sequestration of enzymes in specific organelles determine under what physiologic conditions a given biopolymer will be synthesized or degraded. Newly synthesized polymers are not immediately hydrolyzed, in part because the active sites of biosynthetic enzymes sequester substrates in an environment from which water can be excluded.

Water Molecules Exhibit a Slight but Important Tendency to Dissociate

The ability of water to ionize, while slight, is of central importance for life. Since water can act both as an acid and as a base, its ionization may be represented as an intermolecular proton transfer that forms a hydronium ion (H₃O⁺) and a hydroxide ion (OH⁻):

$$\mathrm{H_2O + H_2O \rightleftharpoons H_3O^+ + OH^-}$$

The transferred proton is actually associated with a cluster of water molecules. Protons exist in solution not only as H₃O⁺ but also as multimers such as H₅O₂⁺.

To state that the probability that a hydrogen exists as an ion is 0.01 means that a hydrogen atom has one chance in 100 of being an ion and 99 chances out of 100 of being part of a water molecule. The actual probability of a hydrogen atom in pure water existing as a hydrogen ion is approximately 1.8 × 10⁻⁹. The probability of its being part of a molecule thus is almost unity. Stated another way, for every hydrogen ion and hydroxyl ion in pure water there are 1.8 billion or 1.8 × 10⁹ water molecules. Hydrogen ions and hydroxyl ions nevertheless contribute significantly to the properties of water.

For dissociation of water,

$$K = \frac{[\mathrm{H^+}][\mathrm{OH^-}]}{[\mathrm{H_2O}]}$$

where brackets represent molar concentrations (strictly speaking, molar activities) and K is the dissociation constant. Since one mole (mol) of water weighs 18 g, one liter (L) (1000 g) of water contains 1000 ÷ 18 = 55.56 mol. Pure water thus is 55.56 molar. Since the probability that a hydrogen in pure water will exist as a hydrogen ion is 1.8 × 10⁻⁹, the molar concentration of H⁺ ions (or of OH⁻ ions) in pure water is the product of the probability, 1.8 × 10⁻⁹, times the molar concentration of water, 55.56 mol/L. The result is 1.0 × 10⁻⁷ mol/L.

We can now calculate K for water:

$$K = \frac{[\mathrm{H^+}][\mathrm{OH^-}]}{[\mathrm{H_2O}]} = \frac{[10^{-7}][10^{-7}]}{[55.56]} = 0.018 \times 10^{-14} = 1.8 \times 10^{-16}\ \mathrm{mol/L}$$

The molar concentration of water, 55.56 mol/L, is too great to be significantly affected by dissociation. It therefore is considered to be essentially constant. This constant may then be incorporated into the dissociation constant K to provide a useful new constant Kw termed the ion product for water. The relationship between Kw and K is shown below:

$$K = \frac{[\mathrm{H^+}][\mathrm{OH^-}]}{[\mathrm{H_2O}]} = 1.8 \times 10^{-16}\ \mathrm{mol/L}$$

$$K_w = (K)[\mathrm{H_2O}] = [\mathrm{H^+}][\mathrm{OH^-}] = (1.8 \times 10^{-16}\ \mathrm{mol/L})(55.56\ \mathrm{mol/L}) = 1.00 \times 10^{-14}\ (\mathrm{mol/L})^2$$

Note that the dimensions of K are moles per liter and those of Kw are moles² per liter². As its name suggests, the ion product Kw is numerically equal to the product of the molar concentrations of H⁺ and OH⁻:

$$K_w = [\mathrm{H^+}][\mathrm{OH^-}]$$

At 25 °C, Kw = (10⁻⁷)², or 10⁻¹⁴ (mol/L)². At temperatures below 25 °C, Kw is somewhat less than 10⁻¹⁴; and at temperatures above 25 °C it is somewhat greater than 10⁻¹⁴. Within the stated limitations of the effect of temperature, Kw equals 10⁻¹⁴ (mol/L)² for all aqueous solutions, even solutions of acids or bases. We shall use Kw to calculate the pH of acidic and basic solutions.

pH IS THE NEGATIVE LOG OF THE HYDROGEN ION CONCENTRATION

The term pH was introduced in 1909 by Sörensen, who defined pH as the negative log of the hydrogen ion concentration:

$$\mathrm{pH} = -\log [\mathrm{H^+}]$$

This definition, while not rigorous, suffices for many biochemical purposes. To calculate the pH of a solution:

1. Calculate hydrogen ion concentration [H⁺].
2. Calculate the base 10 logarithm of [H⁺].
3. pH is the negative of the value found in step 2.

For example, for pure water at 25 °C,

$$\mathrm{pH} = -\log [\mathrm{H^+}] = -\log 10^{-7} = -(-7) = 7.0$$

Low pH values correspond to high concentrations of H⁺ and high pH values correspond to low concentrations of H⁺.

Acids are proton donors and bases are proton acceptors. Strong acids (eg, HCl or H₂SO₄) completely dissociate into anions and cations even in strongly acidic solutions (low pH). Weak acids dissociate only partially in acidic solutions. Similarly, strong bases (eg, KOH or NaOH), but not weak bases (eg, Ca[OH]₂), are completely dissociated at high pH. Many biochemicals are weak acids. Exceptions include phosphorylated intermediates, whose phosphoryl group contains two dissociable protons, the first of which is strongly acidic.

The following examples illustrate how to calculate the pH of acidic and basic solutions.

Example 1: What is the pH of a solution whose hydrogen ion concentration is 3.2 × 10⁻⁴ mol/L?

$$\begin{aligned}
\mathrm{pH} &= -\log [\mathrm{H^+}] \\
&= -\log (3.2 \times 10^{-4}) \\
&= -\log (3.2) - \log (10^{-4}) \\
&= -0.5 + 4.0 \\
&= 3.5
\end{aligned}$$

Example 2: What is the pH of a solution whose hydroxide ion concentration is 4.0 × 10⁻⁴ mol/L? We first define a quantity pOH that is equal to −log [OH⁻] and that may be derived from the definition of Kw:

$$K_w = [\mathrm{H^+}][\mathrm{OH^-}] = 10^{-14}$$

Therefore:

$$\log [\mathrm{H^+}] + \log [\mathrm{OH^-}] = \log 10^{-14}$$

or

$$\mathrm{pH} + \mathrm{pOH} = 14$$

To solve the problem by this approach:

$$\begin{aligned}
[\mathrm{OH^-}] &= 4.0 \times 10^{-4} \\
\mathrm{pOH} &= -\log [\mathrm{OH^-}] \\
&= -\log (4.0 \times 10^{-4}) \\
&= -\log (4.0) - \log (10^{-4}) \\
&= -0.60 + 4.0 \\
&= 3.4
\end{aligned}$$

Now:

$$\mathrm{pH} = 14 - \mathrm{pOH} = 14 - 3.4 = 10.6$$

Example 3: What are the pH values of (a) 2.0 × 10⁻² mol/L KOH and of (b) 2.0 × 10⁻⁶ mol/L KOH? The OH⁻ arises from two sources, KOH and water. Since pH is determined by the total [H⁺] (and pOH by the total [OH⁻]), both sources must be considered. In the first case (a), the contribution of water to the total [OH⁻] is negligible. The same cannot be said for the second case (b):

| Concentration (mol/L) | (a) | (b) |
|---|---|---|
| Molarity of KOH | 2.0 × 10⁻² | 2.0 × 10⁻⁶ |
| [OH⁻] from KOH | 2.0 × 10⁻² | 2.0 × 10⁻⁶ |
| [OH⁻] from water | 1.0 × 10⁻⁷ | 1.0 × 10⁻⁷ |
| Total [OH⁻] | 2.00001 × 10⁻² | 2.1 × 10⁻⁶ |

Once a decision has been reached about the significance of the contribution by water, pH may be calculated as above.

The above examples assume that the strong base KOH is completely dissociated in solution and that the concentration of OH⁻ ions was thus equal to that of the KOH. This assumption is valid for dilute solutions of strong bases or acids but not for weak bases or acids.

The expressions for the dissociation constant (Ka) for two representative weak acids, R–COOH and R–NH₃⁺, are:

$$\mathrm{R{-}COOH \rightleftharpoons R{-}COO^- + H^+} \qquad K_a = \frac{[\mathrm{R{-}COO^-}][\mathrm{H^+}]}{[\mathrm{R{-}COOH}]}$$

$$\mathrm{R{-}NH_3^+ \rightleftharpoons R{-}NH_2 + H^+} \qquad K_a = \frac{[\mathrm{R{-}NH_2}][\mathrm{H^+}]}{[\mathrm{R{-}NH_3^+}]}$$

Because the numeric values of Ka for weak acids are negative exponential numbers, Ka is conveniently expressed as pKa, its negative logarithm.
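The three pH examples above can be reproduced with a short Python sketch. The function names are illustrative and not from the text. The sketch assumes Kw = 10⁻¹⁴ at 25 °C and, for Example 3, uses the text's simple approach of adding the 1.0 × 10⁻⁷ mol/L of OH⁻ contributed by water to the OH⁻ contributed by a fully dissociated strong base.

```python
import math

PKW = 14.0  # -log10 of the ion product of water at 25 °C (Kw = 1e-14)

def ph_from_h(h_conc):
    """pH = -log10([H+]), with [H+] in mol/L."""
    return -math.log10(h_conc)

def ph_from_oh(oh_conc):
    """Compute pOH = -log10([OH-]), then pH = 14 - pOH."""
    poh = -math.log10(oh_conc)
    return PKW - poh

def total_oh(strong_base_molarity, oh_from_water=1.0e-7):
    """Total [OH-] as the sum of the strong-base and water contributions.

    This is the additive approximation used in Example 3 of the text,
    not an exact equilibrium treatment.
    """
    return strong_base_molarity + oh_from_water

if __name__ == "__main__":
    print(round(ph_from_h(3.2e-4), 1))             # Example 1
    print(round(ph_from_oh(4.0e-4), 1))            # Example 2
    print(round(ph_from_oh(total_oh(2.0e-2)), 2))  # Example 3(a)
    print(round(ph_from_oh(total_oh(2.0e-6)), 2))  # Example 3(b)
```

For case (b), dropping the water term would change the total [OH⁻] from 2.1 × 10⁻⁶ to 2.0 × 10⁻⁶ mol/L, which is why the text treats the water contribution as significant only at very low base concentrations.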
Since weak electrolytes dissociate only slightly in solution, we must use the dissociation constant to calculate the concentration of [H⁺] (or [OH⁻]) produced by a given molarity of a weak acid (or base) before calculating total [H⁺] (or total [OH⁻]) and subsequently pH.

Functional Groups That Are Weak Acids Have Great Physiologic Significance

Many biochemicals possess functional groups that are weak acids or bases. Carboxyl groups, amino groups, and the second phosphate dissociation of phosphate esters are present in proteins and nucleic acids, most coenzymes, and most intermediary metabolites. Knowledge of the dissociation of weak acids and bases thus is basic to understanding the influence of intracellular pH on structure and biologic activity. Charge-based separations such as electrophoresis and ion exchange chromatography also are best understood in terms of the dissociation behavior of functional groups.

We term the protonated species (eg, HA or R–NH₃⁺) the acid and the unprotonated species (eg, A⁻ or R–NH₂) its conjugate base. Similarly, we may refer to a base (eg, A⁻ or R–NH₂) and its conjugate acid (eg, HA or R–NH₃⁺). Representative weak acids (left), their conjugate bases (center), and the pKa values (right) include the following:

| Weak acid | Conjugate base | pKa |
|---|---|---|
| R–CH₂–COOH | R–CH₂–COO⁻ | 4–5 |
| R–CH₂–NH₃⁺ | R–CH₂–NH₂ | 9–10 |
| H₂CO₃ | HCO₃⁻ | 6.4 |
| H₂PO₄⁻ | HPO₄²⁻ | 7.2 |

We express the relative strengths of weak acids and bases in terms of their dissociation constants.

pKa is defined as

$$\mathrm{p}K_a = -\log K_a$$

Note that pKa is related to Ka as pH is to [H⁺]. The stronger the acid, the lower its pKa value.

pKa is used to express the relative strengths of both acids and bases. For any weak acid, its conjugate is a strong base. Similarly, the conjugate of a strong base is a weak acid. The relative strengths of bases are expressed in terms of the pKa of their conjugate acids. For polyprotic compounds containing more than one dissociable proton, a numerical subscript is assigned to each in order of relative acidity. For a dissociation of the type

$$\mathrm{R{-}NH_3^+ \rightarrow R{-}NH_2}$$

the pKa is the pH at which the concentration of the acid R–NH₃⁺ equals that of the base R–NH₂.

From the above equations that relate Ka to [H⁺] and to the concentrations of undissociated acid and its conjugate base, when

$$[\mathrm{R{-}COO^-}] = [\mathrm{R{-}COOH}]$$

or when

$$[\mathrm{R{-}NH_2}] = [\mathrm{R{-}NH_3^+}]$$

then

$$K_a = [\mathrm{H^+}]$$

Thus, when the associated (protonated) and dissociated (conjugate base) species are present at equal concentrations, the prevailing hydrogen ion concentration [H⁺] is numerically equal to the dissociation constant, Ka. If the logarithms of both sides of the above equation are taken and both sides are multiplied by −1, the expressions would be as follows:

$$K_a = [\mathrm{H^+}]$$

$$-\log K_a = -\log [\mathrm{H^+}]$$

Since −log Ka is defined as pKa, and −log [H⁺] defines pH, the equation may be rewritten as

$$\mathrm{p}K_a = \mathrm{pH}$$

ie, the pKa of an acid group is the pH at which the protonated and unprotonated species are present at equal concentrations. The pKa for an acid may be determined by adding 0.5 equivalent of alkali per equivalent of acid. The resulting pH will be the pKa of the acid.

The Henderson-Hasselbalch Equation Describes the Behavior of Weak Acids & Buffers

The Henderson-Hasselbalch equation is derived below. A weak acid, HA, ionizes as follows:

$$\mathrm{HA \rightleftharpoons H^+ + A^-}$$

The equilibrium constant for this dissociation is

$$K_a = \frac{[\mathrm{H^+}][\mathrm{A^-}]}{[\mathrm{HA}]}$$

Cross-multiplication gives

$$[\mathrm{H^+}][\mathrm{A^-}] = K_a[\mathrm{HA}]$$

Divide both sides by [A⁻]:

$$[\mathrm{H^+}] = K_a\frac{[\mathrm{HA}]}{[\mathrm{A^-}]}$$

Take the log of both sides:

$$\log [\mathrm{H^+}] = \log \left(K_a\frac{[\mathrm{HA}]}{[\mathrm{A^-}]}\right) = \log K_a + \log \frac{[\mathrm{HA}]}{[\mathrm{A^-}]}$$

Multiply through by −1:

$$-\log [\mathrm{H^+}] = -\log K_a - \log \frac{[\mathrm{HA}]}{[\mathrm{A^-}]}$$

Substitute pH and pKa for −log [H⁺] and −log Ka, respectively; then:

$$\mathrm{pH} = \mathrm{p}K_a - \log \frac{[\mathrm{HA}]}{[\mathrm{A^-}]}$$

Inversion of the last term removes the minus sign and gives the Henderson-Hasselbalch equation:

$$\mathrm{pH} = \mathrm{p}K_a + \log \frac{[\mathrm{A^-}]}{[\mathrm{HA}]}$$

The Henderson-Hasselbalch equation has great predictive value in protonic equilibria. For example,

(1) When an acid is exactly half-neutralized, [A⁻] = [HA]. Under these conditions,

$$\mathrm{pH} = \mathrm{p}K_a + \log \frac{[\mathrm{A^-}]}{[\mathrm{HA}]} = \mathrm{p}K_a + \log \frac{1}{1} = \mathrm{p}K_a + 0$$

Therefore, at half-neutralization, pH = pKa.

(2) When the ratio [A⁻]/[HA] = 100:1,

$$\mathrm{pH} = \mathrm{p}K_a + \log \frac{[\mathrm{A^-}]}{[\mathrm{HA}]} = \mathrm{p}K_a + \log \frac{100}{1} = \mathrm{p}K_a + 2$$

(3) When the ratio [A⁻]/[HA] = 1:10,

$$\mathrm{pH} = \mathrm{p}K_a + \log \frac{1}{10} = \mathrm{p}K_a + (-1)$$

If the equation is evaluated at ratios of [A⁻]/[HA] ranging from 10³ to 10⁻³ and the calculated pH values are plotted, the resulting graph describes the titration curve for a weak acid (Figure 2–4).

Solutions of Weak Acids & Their Salts Buffer Changes in pH

Solutions of weak acids or bases and their conjugates exhibit buffering, the ability to resist a change in pH following addition of strong acid or base. Since many metabolic reactions are accompanied by the release or uptake of protons, most intracellular reactions are buffered. Oxidative metabolism produces CO₂, the anhydride of carbonic acid, which if not buffered would produce severe acidosis. Maintenance of a constant pH involves buffering by phosphate, bicarbonate, and proteins, which accept or release protons to resist a change in pH.

A solution of a weak acid and its conjugate base buffers most effectively in the pH range pKa ± 1.0 pH unit. Figure 2–4 also illustrates the net charge on one molecule of the acid as a function of pH. A fractional charge of −0.5 does not mean that an individual molecule bears a fractional charge, but that the probability that a given molecule has a unit negative charge is 0.5.
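The Henderson-Hasselbalch relationship and the buffering behavior described above can be explored with a minimal Python sketch. The function names and the pKa 5.0 buffer are illustrative assumptions. The second function assumes that each milliequivalent of strong base added converts one milliequivalent of HA into A⁻, and it ignores volume changes and activity effects.

```python
import math

def henderson_hasselbalch(pka, conj_base, acid):
    """pH = pKa + log10([A-]/[HA]); only the ratio of the two amounts matters."""
    return pka + math.log10(conj_base / acid)

def ph_after_strong_base(pka, conj_base, acid, base_added):
    """pH after adding strong base to a buffer (all amounts in meq).

    Assumes stoichiometric conversion of HA to A- by the added OH-.
    """
    if base_added >= acid:
        raise ValueError("added base exceeds the buffer's remaining acid")
    return henderson_hasselbalch(pka, conj_base + base_added, acid - base_added)

if __name__ == "__main__":
    pka = 5.0  # hypothetical weak acid
    # The three ratios discussed in the text: 1:1, 100:1, and 1:10.
    for ratio in (1.0, 100.0, 0.1):
        print(ratio, henderson_hasselbalch(pka, ratio, 1.0))
    # A half-neutralized buffer (0.5 meq A-, 0.5 meq HA) plus 0.1 meq KOH.
    print(round(ph_after_strong_base(pka, 0.5, 0.5, 0.1), 2))
```

Running the last call with starting compositions further from a 1:1 ratio shows a larger pH shift for the same amount of added base, which is the behavior the pKa ± 1 rule summarizes.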
For experiments using tissue extracts or enzymes, constant pH is maintained by the addition of buffers such as MES ([2-N-morpholino]ethanesulfonic acid, pKa 6.1), inorganic orthophosphate (pKa2 7.2), HEPES (N-hydroxyethylpiperazine-N′-2-ethanesulfonic acid, pKa 6.8), or Tris (tris[hydroxymethyl]aminomethane, pKa 8.3). The value of pKa relative to the desired pH is the major determinant of which buffer is selected.

Buffering can be observed by using a pH meter while titrating a weak acid or base (Figure 2–4). We can also calculate the pH shift that accompanies addition of acid or base to a buffered solution. In the example, the buffered solution (a weak acid, pKa = 5.0, and its conjugate base) is initially at one of four pH values. We will calculate the pH shift that results when 0.1 meq of KOH is added to 1 meq of each solution:

| | | | | |
|---|---|---|---|---|
| Initial pH | 5.00 | 5.37 | 5.60 | 5.86 |
| [A⁻] initial | 0.50 | 0.70 | 0.80 | 0.88 |
| [HA] initial | 0.50 | 0.30 | 0.20 | 0.12 |
| ([A⁻]/[HA]) initial | 1.00 | 2.33 | 4.00 | 7.33 |
| Addition of 0.1 meq of KOH produces | | | | |
| [A⁻] final | 0.60 | 0.80 | 0.90 | 0.98 |
| [HA] final | 0.40 | 0.20 | 0.10 | 0.02 |
| ([A⁻]/[HA]) final | 1.50 | 4.00 | 9.00 | 49.0 |
| log ([A⁻]/[HA]) final | 0.176 | 0.602 | 0.95 | 1.69 |
| Final pH | 5.18 | 5.60 | 5.95 | 6.69 |
| ΔpH | 0.18 | 0.23 | 0.35 | 0.83 |

Notice that the change in pH per milliequivalent of OH⁻ added depends on the initial pH. The solution resists changes in pH most effectively at pH values close to the pKa.

[Figure 2–4. Titration curve for an acid of the type HA. The heavy dot in the center of the curve indicates the pKa 5.0. Horizontal axis: pH. Vertical axes: meq of alkali added per meq of acid; net charge.]

Consideration of the net charge on macromolecules as a function of pH provides the basis for separatory techniques such as ion exchange chromatography and electrophoresis.

Acid Strength Depends on Molecular Structure

Many acids of biologic interest possess more than one dissociating group. The presence of adjacent negative charge hinders the release of a proton from a nearby group, raising its pKa. This is apparent from the pKa values for the three dissociating groups of phosphoric acid and citric acid (Table 2–2). The effect of adjacent charge decreases with distance. The second pKa for succinic acid, which has two methylene groups between its carboxyl groups, is 5.6, whereas the second pKa for glutaric acid, which has one additional methylene group, is 5.4.

Table 2–2. Relative strengths of selected acids of biologic significance. Tabulated values are the pKa values (−log of the dissociation constant) of selected monoprotic, diprotic, and triprotic acids.

| Class | Acid | pK1 | pK2 | pK3 |
|---|---|---|---|---|
| Monoprotic | Formic | 3.75 | | |
| Monoprotic | Lactic | 3.86 | | |
| Monoprotic | Acetic | 4.76 | | |
| Monoprotic | Ammonium ion | 9.25 | | |
| Diprotic | Carbonic | 6.37 | 10.25 | |
| Diprotic | Succinic | 4.21 | 5.64 | |
| Diprotic | Glutaric | 4.34 | 5.41 | |
| Triprotic | Phosphoric | 2.15 | 6.82 | 12.38 |
| Triprotic | Citric | 3.08 | 4.74 | 5.40 |

pKa Values Depend on the Properties of the Medium

The pKa of a functional group is also profoundly influenced by the surrounding medium. The medium may either raise or lower the pKa depending on whether the undissociated acid or its conjugate base is the charged species. The effect of dielectric constant on pKa may be observed by adding ethanol to water. The pKa of a carboxylic acid increases, whereas that of an amine decreases, because ethanol decreases the ability of water to solvate a charged species. The pKa values of dissociating groups in the interiors of proteins thus are profoundly affected by their local environment, including the presence or absence of water.

SUMMARY

• Water forms hydrogen-bonded clusters with itself and with other proton donors or acceptors. Hydrogen bonds account for the surface tension, viscosity, liquid state at room temperature, and solvent power of water.
• Compounds that contain O, N, or S can serve as hydrogen bond donors or acceptors.
• Macromolecules exchange internal surface hydrogen bonds for hydrogen bonds to water. Entropic forces dictate that macromolecules expose polar regions to an aqueous interface and bury nonpolar regions.
• Salt bonds, hydrophobic interactions, and van der Waals forces participate in maintaining molecular structure.
• pH is the negative log of [H⁺]. A low pH characterizes an acidic solution, and a high pH denotes a basic solution.
• The strength of weak acids is expressed by pKa, the negative log of the acid dissociation constant. Strong acids have low pKa values and weak acids have high pKa values.
• Buffers resist a change in pH when protons are produced or consumed. Maximum buffering capacity occurs ± 1 pH unit on either side of pKa. Physiologic buffers include bicarbonate, orthophosphate, and proteins.

REFERENCES

Segel IM: Biochemical Calculations. Wiley, 1968.
Wiggins PM: Role of water in some biological processes. Microbiol Rev 1990;54:432.

SECTION I
Structures & Functions of Proteins & Enzymes

3. Amino Acids & Peptides

Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

In addition to providing the monomer units from which the long polypeptide chains of proteins are synthesized, the L-α-amino acids and their derivatives participate in cellular functions as diverse as nerve transmission and the biosynthesis of porphyrins, purines, pyrimidines, and urea. Short polymers of amino acids called peptides perform prominent roles in the neuroendocrine system as hormones, hormone-releasing factors, neuromodulators, or neurotransmitters. While proteins contain only L-α-amino acids, microorganisms elaborate peptides that contain both D- and L-α-amino acids. Several of these peptides are of therapeutic value, including the antibiotics bacitracin and gramicidin A and the antitumor agent bleomycin. Certain other microbial peptides are toxic. The cyanobacterial peptides microcystin and nodularin are lethal in large doses, while small quantities promote the formation of hepatic tumors. Neither humans nor any other higher animals can synthesize 10 of the 20 common L-α-amino acids in amounts adequate to support infant growth or to maintain health in adults. Consequently, the human diet must contain adequate quantities of these nutritionally essential amino acids.

PROPERTIES OF AMINO ACIDS

The Genetic Code Specifies 20 L-α-Amino Acids

Of the over 300 naturally occurring amino acids, 20 constitute the monomer units of proteins. While a nonredundant three-letter genetic code could accommodate more than 20 amino acids, its redundancy limits the available codons to the 20 L-α-amino acids listed in Table 3–1, classified according to the polarity of their R groups. Both one- and three-letter abbreviations for each amino acid can be used to represent the amino acids in peptides (Table 3–1). Some proteins contain additional amino acids that arise by modification of an amino acid already present in a peptide. Examples include conversion of peptidyl proline and lysine to 4-hydroxyproline and 5-hydroxylysine; the conversion of peptidyl glutamate to γ-carboxyglutamate; and the methylation, formylation, acetylation, prenylation, and phosphorylation of certain aminoacyl residues. These modifications extend the biologic diversity of proteins by altering their solubility, stability, and interaction with other proteins.

Only L-α-Amino Acids Occur in Proteins

With the sole exception of glycine, the α-carbon of amino acids is chiral. Although some protein amino acids are dextrorotatory and some levorotatory, all share the absolute configuration of L-glyceraldehyde and thus are L-α-amino acids. Several free L-α-amino acids fulfill important roles in metabolic processes. Examples include ornithine, citrulline, and argininosuccinate that participate in urea synthesis; tyrosine in formation of thyroid hormones; and glutamate in neurotransmitter biosynthesis. D-Amino acids that occur naturally include free D-serine and D-aspartate in brain tissue, D-alanine and D-glutamate in the cell walls of gram-positive bacteria, and D-amino acids in some nonmammalian peptides and certain antibiotics.

Table 3–1. L-α-Amino acids present in proteins. (Structural formulas are written in condensed linear form.)

| Name | Symbol | Structural formula | pK1 (α-COOH) | pK2 (α-NH₃⁺) | pK3 (R group) |
|---|---|---|---|---|---|
| With Aliphatic Side Chains | | | | | |
| Glycine | Gly [G] | H–CH(NH₃⁺)–COO⁻ | 2.4 | 9.8 | |
| Alanine | Ala [A] | CH₃–CH(NH₃⁺)–COO⁻ | 2.4 | 9.9 | |
| Valine | Val [V] | (CH₃)₂CH–CH(NH₃⁺)–COO⁻ | 2.2 | 9.7 | |
| Leucine | Leu [L] | (CH₃)₂CH–CH₂–CH(NH₃⁺)–COO⁻ | 2.3 | 9.7 | |
| Isoleucine | Ile [I] | CH₃–CH₂–CH(CH₃)–CH(NH₃⁺)–COO⁻ | 2.3 | 9.8 | |
| With Side Chains Containing Hydroxylic (OH) Groups | | | | | |
| Serine | Ser [S] | HO–CH₂–CH(NH₃⁺)–COO⁻ | 2.2 | 9.2 | about 13 |
| Threonine | Thr [T] | CH₃–CH(OH)–CH(NH₃⁺)–COO⁻ | 2.1 | 9.1 | about 13 |
| Tyrosine | Tyr [Y] | See below. | | | |
| With Side Chains Containing Sulfur Atoms | | | | | |
| Cysteine | Cys [C] | HS–CH₂–CH(NH₃⁺)–COO⁻ | 1.9 | 10.8 | 8.3 |
| Methionine | Met [M] | CH₃–S–CH₂–CH₂–CH(NH₃⁺)–COO⁻ | 2.1 | 9.3 | |
| With Side Chains Containing Acidic Groups or Their Amides | | | | | |
| Aspartic acid | Asp [D] | ⁻OOC–CH₂–CH(NH₃⁺)–COO⁻ | 2.0 | 9.9 | 3.9 |
| Asparagine | Asn [N] | H₂N–CO–CH₂–CH(NH₃⁺)–COO⁻ | 2.1 | 8.8 | |
| Glutamic acid | Glu [E] | ⁻OOC–CH₂–CH₂–CH(NH₃⁺)–COO⁻ | 2.1 | 9.5 | 4.1 |
| Glutamine | Gln [Q] | H₂N–CO–CH₂–CH₂–CH(NH₃⁺)–COO⁻ | 2.2 | 9.1 | |
| With Side Chains Containing Basic Groups | | | | | |
| Arginine | Arg [R] | H₂N–C(=NH₂⁺)–NH–CH₂–CH₂–CH₂–CH(NH₃⁺)–COO⁻ | 1.8 | 9.0 | 12.5 |
| Lysine | Lys [K] | ⁺H₃N–CH₂–CH₂–CH₂–CH₂–CH(NH₃⁺)–COO⁻ | 2.2 | 9.2 | 10.8 |
| Histidine | His [H] | (imidazolyl)–CH₂–CH(NH₃⁺)–COO⁻ | 1.8 | 9.3 | 6.0 |
| Containing Aromatic Rings | | | | | |
| Histidine | His [H] | See above. | | | |
Table 3–1. L-α-Amino acids present in proteins. (continued)

| Name | Symbol | Structural formula | pK1 (α-COOH) | pK2 (α-NH₃⁺) | pK3 (R group) |
|---|---|---|---|---|---|
| Containing Aromatic Rings (continued) | | | | | |
| Phenylalanine | Phe [F] | C₆H₅–CH₂–CH(NH₃⁺)–COO⁻ | 2.2 | 9.2 | |
| Tyrosine | Tyr [Y] | HO–C₆H₄–CH₂–CH(NH₃⁺)–COO⁻ | 2.2 | 9.1 | 10.1 |
| Tryptophan | Trp [W] | (indolyl)–CH₂–CH(NH₃⁺)–COO⁻ | 2.4 | 9.4 | |
| Imino Acid | | | | | |
| Proline | Pro [P] | pyrrolidine ring bearing ⁺NH₂ and α-COO⁻ | 2.0 | 10.6 | |

Amino Acids May Have Positive, Negative, or Zero Net Charge

Charged and uncharged forms of the ionizable –COOH and –NH₃⁺ weak acid groups exist in solution in protonic equilibrium:

$$\mathrm{R{-}COOH \rightleftharpoons R{-}COO^- + H^+}$$

$$\mathrm{R{-}NH_3^+ \rightleftharpoons R{-}NH_2 + H^+}$$

While both R–COOH and R–NH₃⁺ are weak acids, R–COOH is a far stronger acid than R–NH₃⁺. At physiologic pH (pH 7.4), carboxyl groups exist almost entirely as R–COO⁻ and amino groups predominantly as R–NH₃⁺. Figure 3–1 illustrates the effect of pH on the charged state of aspartic acid.

Molecules that contain an equal number of ionizable groups of opposite charge and that therefore bear no net charge are termed zwitterions. Amino acids in blood and most tissues thus should be represented as in A, below.

A: R–CH(NH₃⁺)–COO⁻
B: R–CH(NH₂)–COOH

Structure B cannot exist in aqueous solution because at any pH low enough to protonate the carboxyl group the amino group would also be protonated. Similarly, at any pH sufficiently high for an uncharged amino group to predominate, a carboxyl group will be present as R–COO⁻. The uncharged representation B (above) is, however, often used for reactions that do not involve protonic equilibria.

[Figure 3–1. Protonic equilibria of aspartic acid. Form A: in strong acid (below pH 1), net charge = +1. Form B: around pH 3, net charge = 0. Form C: around pH 6–8, net charge = −1. Form D: in strong alkali (above pH 11), net charge = −2. The transitions are A to B at pK1 = 2.09 (α-COOH), B to C at pK2 = 3.86 (β-COOH), and C to D at pK3 = 9.82 (–NH₃⁺).]

pKa Values Express the Strengths of Weak Acids

The acid strengths of weak acids are expressed as their pKa (Table 3–1). The imidazole group of histidine and the guanidino group of arginine exist as resonance hybrids with positive charge distributed between both nitrogens (histidine) or all three nitrogens (arginine) (Figure 3–2). The net charge on an amino acid, the algebraic sum of all the positively and negatively charged groups present, depends upon the pKa values of its functional groups and on the pH of the surrounding medium. Altering the charge on amino acids and their derivatives by varying the pH facilitates the physical separation of amino acids, peptides, and proteins (see Chapter 4).

[Figure 3–2. Resonance hybrids of the protonated forms of the R groups of histidine and arginine.]

At Its Isoelectric pH (pI), an Amino Acid Bears No Net Charge

The isoelectric species is the form of a molecule that has an equal number of positive and negative charges and thus is electrically neutral. The isoelectric pH, also called the pI, is the pH midway between pKa values on either side of the isoelectric species. For an amino acid such as alanine that has only two dissociating groups, there is no ambiguity. The first pKa (R–COOH) is 2.35 and the second pKa (R–NH₃⁺) is 9.69. The isoelectric pH (pI) of alanine thus is

$$\mathrm{pI} = \frac{\mathrm{p}K_1 + \mathrm{p}K_2}{2} = \frac{2.35 + 9.69}{2} = 6.02$$

For polyfunctional acids, pI is also the pH midway between the pKa values on either side of the isoionic species. For example, the pI for aspartic acid is

$$\mathrm{pI} = \frac{\mathrm{p}K_1 + \mathrm{p}K_2}{2} = \frac{2.09 + 3.96}{2} = 3.02$$

For lysine, pI is calculated from:

$$\mathrm{pI} = \frac{\mathrm{p}K_2 + \mathrm{p}K_3}{2}$$

Similar considerations apply to all polyprotic acids (eg, proteins), regardless of the number of dissociating groups present. In the clinical laboratory, knowledge of the pI guides selection of conditions for electrophoretic separations. For example, electrophoresis at pH 7.0 will separate two molecules with pI values of 6.0 and 8.0 because at pH 7.0 the molecule with a pI of 6.0 will have a net negative charge, and that with a pI of 8.0 a net positive charge. Similar considerations apply to understanding chromatographic separations on ionic supports such as DEAE cellulose (see Chapter 4).

THE α-R GROUPS DETERMINE THE PROPERTIES OF AMINO ACIDS

Since glycine, the smallest amino acid, can be accommodated in places inaccessible to other amino acids, it often occurs where peptides bend sharply. The hydrophobic R groups of alanine, valine, leucine, and isoleucine and the aromatic R groups of phenylalanine, tyrosine, and tryptophan typically occur primarily in the interior of cytosolic proteins. The charged R groups of basic and acidic amino acids stabilize specific protein conformations via ionic interactions, or salt bonds. These bonds also function in "charge relay" systems during enzymatic catalysis and electron transport in respiring mitochondria. Histidine plays unique roles in enzymatic catalysis. The pKa of its imidazole proton permits it to function at neutral pH as either a base or an acid catalyst.

pKa Values Vary With the Environment

The environment of a dissociable group affects its pKa. The pKa values of the R groups of free amino acids in aqueous solution (Table 3–1) thus provide only an approximate guide to the pKa values of the same amino acids when present in proteins. A polar environment favors the charged form (R–COO⁻ or R–NH₃⁺), and a nonpolar environment favors the uncharged form (R–COOH or R–NH₂). A nonpolar environment thus raises the pKa of a carboxyl group (making it a weaker acid) but lowers that of an amino group (making it a stronger acid). The presence of adjacent charged groups can reinforce or counteract solvent effects. The pKa of a functional group thus will depend upon its location within a given protein. Variations in pKa can encompass whole pH units (Table 3–2).
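How pKa values and pH together determine net charge, and how a shifted pKa moves the isoelectric pH, can be illustrated with a small Python sketch. The group representation (a list of pKa and protonated-charge pairs), the function names, and the bisection search are assumptions made for illustration. The pKa values for alanine are those quoted in the text, and those for aspartic acid are the Figure 3–1 values (pK2 = 3.86), so the computed pI for aspartic acid is about 2.98 rather than the 3.02 obtained in the worked equation, which uses 3.96.

```python
def fraction_protonated(ph, pka):
    """Fraction of a group in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def net_charge(ph, groups):
    """Average net charge of a molecule at a given pH.

    groups: list of (pKa, charge_when_protonated) pairs; losing the proton
    lowers the charge of that group by one unit.
    """
    total = 0.0
    for pka, q_protonated in groups:
        f = fraction_protonated(ph, pka)
        total += f * q_protonated + (1.0 - f) * (q_protonated - 1)
    return total

def isoelectric_ph(groups, lo=0.0, hi=14.0, tol=1e-6):
    """Find the pH of zero net charge by bisection.

    Relies on net charge decreasing steadily as pH rises.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(mid, groups) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Carboxyl groups are neutral when protonated; amino groups carry +1.
ALANINE = [(2.35, 0), (9.69, +1)]
ASPARTIC_ACID = [(2.09, 0), (3.86, 0), (9.82, +1)]

if __name__ == "__main__":
    print(round(isoelectric_ph(ALANINE), 2))
    for ph in (1.0, 7.0, 12.0):
        print(ph, round(net_charge(ph, ASPARTIC_ACID), 2))
```

Replacing a pKa in one of the lists with a different value, as might apply to the same group buried in a protein, shows how the net charge at a fixed pH depends on the local environment.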
The pri- diverge from those listed by as much as three pH units mary alcohol group of serine and the primary thioalco- are common at the active sites of enzymes. An extreme hol (SH) group of cysteine are excellent nucleophiles example, a buried aspartic acid of thioredoxin, has a and can function as such during enzymatic catalysis. pKa above 9—a shift of over six pH units! However, the secondary alcohol group of threonine, while a good nucleophile, does not fulfill an analogous The Solubility and Melting Points role in catalysis. The OH groups of serine, tyrosine, of Amino Acids Reflect and threonine also participate in regulation of the activ- Their Ionic Character ity of enzymes whose catalytic activity depends on the phosphorylation state of these residues. The charged functional groups of amino acids ensure that they are readily solvated by—and thus soluble in— polar solvents such as water and ethanol but insoluble FUNCTIONAL GROUPS DICTATE THE in nonpolar solvents such as benzene, hexane, or ether. CHEMICAL REACTIONS OF AMINO ACIDS Similarly, the high amount of energy required to dis- Each functional group of an amino acid exhibits all of rupt the ionic forces that stabilize the crystal lattice its characteristic chemical reactions. For carboxylic acid account for the high melting points of amino acids groups, these reactions include the formation of esters, (> 200 °C). amides, and acid anhydrides; for amino groups, acyla- Amino acids do not absorb visible light and thus are tion, amidation, and esterification; and for OH and colorless. However, tyrosine, phenylalanine, and espe- SH groups, oxidation and esterification. The most cially tryptophan absorb high-wavelength (250–290 important reaction of amino acids is the formation of a nm) ultraviolet light. Tryptophan therefore makes the peptide bond (shaded blue). major contribution to the ability of most proteins to absorb light in the region of 280 nm. + H3N O H N O– N H Table 3–2. 
Typical range of pKa values for O O ionizable groups in proteins. SH Alanyl Cysteinyl Valine Dissociating Group pKa Range α-Carboxyl 3.5–4.0 Non-α COOH of Asp or Glu 4.0–4.8 Amino Acid Sequence Determines Imidazole of His 6.5–7.4 Primary Structure SH of Cys 8.5–9.0 The number and order of all of the amino acid residues OH of Tyr 9.5–10.5 in a polypeptide constitute its primary structure. α-Amino 8.0–9.0 Amino acids present in peptides are called aminoacyl ε-Amino of Lys 9.8–10.4 residues and are named by replacing the -ate or -ine suf- Guanidinium of Arg ~12.0 fixes of free amino acids with -yl (eg, alanyl, aspartyl, ty- AMINO ACIDS & PEPTIDES / 19 rosyl). Peptides are then named as derivatives of the SH carboxyl terminal aminoacyl residue. For example, Lys- O CH2 H Leu-Tyr-Gln is called lysyl-leucyl-tyrosyl-glutamine. The -ine ending on glutamine indicates that its α-car- C CH N boxyl group is not involved in peptide bond formation. CH2 N C CH2 CH2 H O COO– Peptide Structures Are Easy to Draw Prefixes like tri- or octa- denote peptides with three or H C NH3+ eight residues, respectively, not those with three or COO– eight peptide bonds. By convention, peptides are writ- ten with the residue that bears the free α-amino group Figure 3–3. Glutathione (γ-glutamyl-cysteinyl- at the left. To draw a peptide, use a zigzag to represent glycine). Note the non-α peptide bond that links the main chain or backbone. Add the main chain atoms, Glu to Cys. which occur in the repeating order: α-nitrogen, α-car- bon, carbonyl carbon. Now add a hydrogen atom to each α-carbon and to each peptide nitrogen, and an oxygen to the carbonyl carbon. Finally, add the appro- releasing hormone (TRH) is cyclized to pyroglutamic priate R groups (shaded) to each α-carbon atom. acid, and the carboxyl group of the carboxyl terminal prolyl residue is amidated. Peptides elaborated by fungi, N C Cα N C bacteria, and lower animals can contain nonprotein Cα N C Cα amino acids. 
The antibiotics tyrocidin and gramicidin S are cyclic polypeptides that contain D-phenylalanine O and ornithine. The heptapeptide opioids dermorphin H3C H H and deltophorin in the skin of South American tree + H3N C C N COO– frogs contain D-tyrosine and D-alanine. C N C C H H H CH2 CH2 O Peptides Are Polyelectrolytes – OOC OH The peptide bond is uncharged at any pH of physiologic Three-letter abbreviations linked by straight lines interest. Formation of peptides from amino acids is represent an unambiguous primary structure. Lines are therefore accompanied by a net loss of one positive and omitted for single-letter abbreviations. one negative charge per peptide bond formed. Peptides nevertheless are charged at physiologic pH owing to their Glu - Ala - Lys - Gly - Tyr - Ala carboxyl and amino terminal groups and, where present, E A K G Y A their acidic or basic R groups. As for amino acids, the net charge on a peptide depends on the pH of its environ- Where there is uncertainty about the order of a portion ment and on the pKa values of its dissociating groups. of a polypeptide, the questionable residues are enclosed in brackets and separated by commas. The Peptide Bond Has Partial Glu - Lys - (Ala , Gly , Tyr ) - His - Ala Double-Bond Character Although peptides are written as if a single bond linked Some Peptides Contain Unusual the α-carboxyl and α-nitrogen atoms, this bond in fact exhibits partial double-bond character: Amino Acids In mammals, peptide hormones typically contain only O O– the α-amino acids of proteins linked by standard pep- C C + tide bonds. Other peptides may, however, contain non- N N protein amino acids, derivatives of the protein amino H H acids, or amino acids linked by an atypical peptide bond. 
For example, the amino terminal glutamate of There thus is no freedom of rotation about the bond glutathione, which participates in protein folding and that connects the carbonyl carbon and the nitrogen of a in the metabolism of xenobiotics (Chapter 53), is peptide bond. Consequently, all four of the colored linked to cysteine by a non-α peptide bond (Figure atoms of Figure 3–4 are coplanar. The imposed semi- 3–3). The amino terminal glutamate of thyrotropin- rigidity of the peptide bond has important conse- 20 / CHAPTER 3 mixture of free amino acids is then treated with 6-amino- O R′ H H O N-hydroxysuccinimidyl carbamate, which reacts with 0.123 nm their α-amino groups, forming fluorescent derivatives 0. that are then separated and identified using high-pressure 121° 122° 13 C 120° C N C 2 nm liquid chromatography (see Chapter 5). Ninhydrin, also 117° 0. 14 7 nm widely used for detecting amino acids, forms a purple 53 120° N 120° 110° C nm C 0. 1 N product with α-amino acids and a yellow adduct with the imine groups of proline and hydroxyproline. 0.1 nm H O H R′′ H SUMMARY 0.36 nm • Both D-amino acids and non-α-amino acids occur Figure 3–4. Dimensions of a fully extended poly- in nature, but only L-α-amino acids are present in peptide chain. The four atoms of the peptide bond proteins. • All amino acids possess at least two weakly acidic (colored blue) are coplanar. The unshaded atoms are functional groups, RNH3+ and R COOH. the α-carbon atom, the α-hydrogen atom, and the α-R Many also possess additional weakly acidic functional group of the particular amino acid. Free rotation can groups such as OH, SH, guanidino, or imid- occur about the bonds that connect the α-carbon with azole groups. the α-nitrogen and with the α-carbonyl carbon (blue • The pKa values of all functional groups of an amino arrows). The extended polypeptide chain is thus a semi- acid dictate its net charge at a given pH. 
pI is the pH rigid structure with two-thirds of the atoms of the back- at which an amino acid bears no net charge and thus bone held in a fixed planar relationship one to another. does not move in a direct current electrical field. The distance between adjacent α-carbon atoms is 0.36 • Of the biochemical reactions of amino acids, the nm (3.6 Å). The interatomic distances and bond angles, most important is the formation of peptide bonds. which are not equivalent, are also shown. (Redrawn and reproduced, with permission, from Pauling L, Corey LP, • The R groups of amino acids determine their unique biochemical functions. Amino acids are classified as Branson HR: The structure of proteins: Two hydrogen- basic, acidic, aromatic, aliphatic, or sulfur-containing bonded helical configurations of the polypeptide chain. based on the properties of their R groups. Proc Natl Acad Sci U S A 1951;37:205.) • Peptides are named for the number of amino acid residues present, and as derivatives of the carboxyl quences for higher orders of protein structure. Encir- terminal residue. The primary structure of a peptide cling arrows (Figure 3 – 4) indicate free rotation about is its amino acid sequence, starting from the amino- the remaining bonds of the polypeptide backbone. terminal residue. • The partial double-bond character of the bond that Noncovalent Forces Constrain Peptide links the carbonyl carbon and the nitrogen of a pep- Conformations tide renders four atoms of the peptide bond coplanar and restricts the number of possible peptide confor- Folding of a peptide probably occurs coincident with mations. its biosynthesis (see Chapter 38). The physiologically active conformation reflects the amino acid sequence, REFERENCES steric hindrance, and noncovalent interactions (eg, hy- drogen bonding, hydrophobic interactions) between Doolittle RF: Reconstructing history with amino acid sequences. residues. Common conformations include α-helices Protein Sci 1992;1:191. 
and β-pleated sheets (see Chapter 5). Kreil G: D-Amino acids in animal peptides. Annu Rev Biochem 1997;66:337. Nokihara K, Gerhardt J: Development of an improved automated ANALYSIS OF THE AMINO ACID gas-chromatographic chiral analysis system: application to CONTENT OF BIOLOGIC MATERIALS non-natural amino acids and natural protein hydrolysates. Chirality 2001;13:431. In order to determine the identity and quantity of each Sanger F: Sequences, sequences, and sequences. Annu Rev Biochem amino acid in a sample of biologic material, it is first nec- 1988;57:1. essary to hydrolyze the peptide bonds that link the amino Wilson NA et al: Aspartic acid 26 in reduced Escherichia coli thiore- acids together by treatment with hot HCl. The resulting doxin has a pKa greater than 9. Biochemistry 1995;34:8931. Proteins: Determination of Primary Structure 4 Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD BIOMEDICAL IMPORTANCE Column Chromatography Proteins perform multiple critically important roles. An Column chromatography of proteins employs as the internal protein network, the cytoskeleton (Chapter stationary phase a column containing small spherical 49), maintains cellular shape and physical integrity. beads of modified cellulose, acrylamide, or silica whose Actin and myosin filaments form the contractile ma- surface typically has been coated with chemical func- chinery of muscle (Chapter 49). Hemoglobin trans- tional groups. These stationary phase matrices interact ports oxygen (Chapter 6), while circulating antibodies with proteins based on their charge, hydrophobicity, search out foreign invaders (Chapter 50). Enzymes cat- and ligand-binding properties. A protein mixture is ap- alyze reactions that generate energy, synthesize and de- plied to the column and the liquid mobile phase is per- grade biomolecules, replicate and transcribe genes, colated through it. Small portions of the mobile phase process mRNAs, etc (Chapter 7). 
Receptors enable cells or eluant are collected as they emerge (Figure 4–1). to sense and respond to hormones and other environ- mental cues (Chapters 42 and 43). An important goal of molecular medicine is the identification of proteins Partition Chromatography whose presence, absence, or deficiency is associated Column chromatographic separations depend on the with specific physiologic states or diseases. The primary relative affinity of different proteins for a given station- sequence of a protein provides both a molecular finger- ary phase and for the mobile phase. Association be- print for its identification and information that can be tween each protein and the matrix is weak and tran- used to identify and clone the gene or genes that en- sient. Proteins that interact more strongly with the code it. stationary phase are retained longer. The length of time that a protein is associated with the stationary phase is a function of the composition of both the stationary and PROTEINS & PEPTIDES MUST BE mobile phases. Optimal separation of the protein of in- terest from other proteins thus can be achieved by care- PURIFIED PRIOR TO ANALYSIS ful manipulation of the composition of the two phases. Highly purified protein is essential for determination of its amino acid sequence. Cells contain thousands of dif- ferent proteins, each in widely varying amounts. The Size Exclusion Chromatography isolation of a specific protein in quantities sufficient for Size exclusion—or gel filtration—chromatography sep- analysis thus presents a formidable challenge that may arates proteins based on their Stokes radius, the diam- require multiple successive purification techniques. eter of the sphere they occupy as they tumble in solu- Classic approaches exploit differences in relative solu- tion. The Stokes radius is a function of molecular mass bility of individual proteins as a function of pH (iso- and shape. 
A tumbling elongated protein occupies a electric precipitation), polarity (precipitation with larger volume than a spherical protein of the same mass. ethanol or acetone), or salt concentration (salting out Size exclusion chromatography employs porous beads with ammonium sulfate). Chromatographic separations (Figure 4–2). The pores are analogous to indentations partition molecules between two phases, one mobile in a river bank. As objects move downstream, those that and the other stationary. For separation of amino acids enter an indentation are retarded until they drift back or sugars, the stationary phase, or matrix, may be a into the main current. Similarly, proteins with Stokes sheet of filter paper (paper chromatography) or a thin radii too large to enter the pores (excluded proteins) re- layer of cellulose, silica, or alumina (thin-layer chro- main in the flowing mobile phase and emerge before matography; TLC). proteins that can enter the pores (included proteins). 21 22 / CHAPTER 4 R C F Figure 4–1. Components of a simple liquid chromatography apparatus. R: Reser- voir of mobile phase liquid, delivered either by gravity or using a pump. C: Glass or plastic column containing stationary phase. F: Fraction collector for collecting por- tions, called fractions, of the eluant liquid in separate test tubes. Proteins thus emerge from a gel filtration column in de- Ion Exchange Chromatography scending order of their Stokes radii. In ion exchange chromatography, proteins interact with the stationary phase by charge-charge interactions. Pro- Absorption Chromatography teins with a net positive charge at a given pH adhere to For absorption chromatography, the protein mixture is beads with negatively charged functional groups such as applied to a column under conditions where the pro- carboxylates or sulfates (cation exchangers). 
Similarly, tein of interest associates with the stationary phase so proteins with a net negative charge adhere to beads with tightly that its partition coefficient is essentially unity. positively charged functional groups, typically tertiary or Nonadhering molecules are first eluted and discarded. quaternary amines (anion exchangers). Proteins, which Proteins are then sequentially released by disrupting the are polyanions, compete against monovalent ions for forces that stabilize the protein-stationary phase com- binding to the support—thus the term “ion exchange.” plex, most often by using a gradient of increasing salt For example, proteins bind to diethylaminoethyl concentration. The composition of the mobile phase is (DEAE) cellulose by replacing the counter-ions (gener- altered gradually so that molecules are selectively re- ally Cl− or CH3COO−) that neutralize the protonated leased in descending order of their affinity for the sta- amine. Bound proteins are selectively displaced by grad- tionary phase. ually raising the concentration of monovalent ions in PROTEINS: DETERMINATION OF PRIMARY STRUCTURE / 23 A B C Figure 4–2. Size-exclusion chromatography. A: A mixture of large molecules (diamonds) and small molecules (circles) are applied to the top of a gel filtration column. B: Upon entering the column, the small molecules enter pores in the sta- tionary phase matrix from which the large molecules are excluded. C: As the mo- bile phase flows down the column, the large, excluded molecules flow with it while the small molecules, which are temporarily sheltered from the flow when in- side the pores, lag farther and farther behind. the mobile phase. Proteins elute in inverse order of the fied by affinity chromatography using immobilized sub- strength of their interactions with the stationary phase. strates, products, coenzymes, or inhibitors. 
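The dependence of net charge on pH, and its consequence for the choice of ion exchanger, can be illustrated with a short Henderson-Hasselbalch sketch. The pKa values below are typical textbook approximations supplied here as assumptions (they are not given in this section, and real pKa values shift with the local environment), and the function names are hypothetical.

```python
# Approximate pKa values for dissociating groups (ASSUMED, illustrative only).
PKA_POSITIVE = {"N_TERM": 8.0, "K": 10.5, "R": 12.5, "H": 6.0}            # +1 when protonated
PKA_NEGATIVE = {"C_TERM": 3.1, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}   # -1 when deprotonated


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of a linear peptide at a given pH."""
    positive = ["N_TERM"] + [aa for aa in sequence if aa in PKA_POSITIVE]
    negative = ["C_TERM"] + [aa for aa in sequence if aa in PKA_NEGATIVE]
    charge = 0.0
    for group in positive:
        # Fraction protonated (charge +1) = 1 / (1 + 10^(pH - pKa))
        charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[group]))
    for group in negative:
        # Fraction deprotonated (charge -1) = 1 / (1 + 10^(pKa - pH))
        charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[group] - ph))
    return charge


def isoelectric_point(sequence: str, tol: float = 1e-4) -> float:
    """Find the pH at which the estimated net charge is zero, by bisection."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid    # still positive, so the pI lies at a higher pH
        else:
            high = mid
    return (low + high) / 2.0


def suggested_exchanger(sequence: str, ph: float) -> str:
    """Name the exchanger type to which the peptide should adhere at this pH."""
    return "cation exchanger" if net_charge(sequence, ph) > 0 else "anion exchanger"


if __name__ == "__main__":
    peptide = "EAKGYA"  # Glu-Ala-Lys-Gly-Tyr-Ala
    for ph in (3.0, 7.0, 11.0):
        print(ph, round(net_charge(peptide, ph), 2), suggested_exchanger(peptide, ph))
    print("estimated pI:", round(isoelectric_point(peptide), 2))
```

Because the estimated charge falls steadily as the pH rises, a simple bisection is enough to locate the pI. The same calculation suggests why changing the pH of the mobile phase can move a protein from one exchanger type to the other.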
Since the net charge on a protein is determined by the pH (see Chapter 3), sequential elution of proteins may be achieved by changing the pH of the mobile phase. Alternatively, a protein can be subjected to consecutive rounds of ion exchange chromatography, each at a different pH, such that proteins that co-elute at one pH elute at different salt concentrations at another pH.

Hydrophobic Interaction Chromatography

Hydrophobic interaction chromatography separates proteins based on their tendency to associate with a stationary phase matrix coated with hydrophobic groups (eg, phenyl Sepharose, octyl Sepharose). Proteins with exposed hydrophobic surfaces adhere to the matrix via hydrophobic interactions that are enhanced by a mobile phase of high ionic strength. Nonadherent proteins are first washed away. The polarity of the mobile phase is then decreased by gradually lowering the salt concentration. If the interaction between protein and stationary phase is particularly strong, ethanol or glycerol may be added to the mobile phase to decrease its polarity and further weaken hydrophobic interactions.

Affinity Chromatography

Affinity chromatography exploits the high selectivity of most proteins for their ligands. Enzymes may be purified by affinity chromatography using immobilized substrates, products, coenzymes, or inhibitors. In theory, only proteins that interact with the immobilized ligand adhere. Bound proteins are then eluted either by competition with soluble ligand or, less selectively, by disrupting protein-ligand interactions using urea, guanidine hydrochloride, mildly acidic pH, or high salt concentrations. Stationary phase matrices available commercially contain ligands such as NAD+ or ATP analogs. Among the most powerful and widely applicable affinity matrices are those used for the purification of suitably modified recombinant proteins. These include a Ni2+ matrix that binds proteins with an attached polyhistidine “tag” and a glutathione matrix that binds a recombinant protein linked to glutathione S-transferase.

Peptides Are Purified by Reversed-Phase High-Pressure Chromatography

The stationary phase matrices used in classic column chromatography are spongy materials whose compressibility limits flow of the mobile phase. High-pressure liquid chromatography (HPLC) employs incompressible silica or alumina microbeads as the stationary phase and pressures of up to a few thousand psi. Incompressible matrices permit both high flow rates and enhanced resolution. HPLC can resolve complex mixtures of lipids or peptides whose properties differ only slightly. Reversed-phase HPLC exploits a hydrophobic stationary phase of aliphatic polymers 3–18 carbon atoms in length. Peptide mixtures are eluted using a gradient of a water-miscible organic solvent such as acetonitrile or methanol.

Protein Purity Is Assessed by Polyacrylamide Gel Electrophoresis (PAGE)

The most widely used method for determining the purity of a protein is SDS-PAGE: polyacrylamide gel electrophoresis (PAGE) in the presence of the anionic detergent sodium dodecyl sulfate (SDS). Electrophoresis separates charged biomolecules based on the rates at which they migrate in an applied electrical field. For SDS-PAGE, acrylamide is polymerized and cross-linked to form a porous matrix. SDS denatures and binds to proteins at a ratio of one molecule of SDS per two peptide bonds. When used in conjunction with 2-mercaptoethanol or dithiothreitol to reduce and break disulfide bonds (Figure 4–3), SDS separates the component polypeptides of multimeric proteins. The large number of anionic SDS molecules, each bearing a charge of −1, on each polypeptide overwhelms the charge contributions of the amino acid functional groups. Since the charge-to-mass ratio of each SDS-polypeptide complex is approximately equal, the physical resistance each peptide encounters as it moves through the acrylamide matrix determines the rate of migration. Since large complexes encounter greater resistance, polypeptides separate based on their relative molecular mass (Mr). Individual polypeptides trapped in the acrylamide gel are visualized by staining with dyes such as Coomassie blue (Figure 4–4).

Figure 4–3. Oxidative cleavage of adjacent polypeptide chains linked by disulfide bonds (shaded) by performic acid (left) or reductive cleavage by β-mercaptoethanol (right) forms two peptides that contain cysteic acid residues or cysteinyl residues, respectively.

Figure 4–4. Use of SDS-PAGE to observe successive purification of a recombinant protein. The gel was stained with Coomassie blue. Shown are protein standards (lane S) of the indicated mass, crude cell extract (E), high-speed supernatant liquid (H), and the DEAE-Sepharose fraction (D). The recombinant protein has a mass of about 45 kDa.

Isoelectric Focusing (IEF)

Ionic buffers called ampholytes and an applied electric field are used to generate a pH gradient within a polyacrylamide matrix. Applied proteins migrate until they reach the region of the matrix where the pH matches their isoelectric point (pI), the pH at which a peptide’s net charge is zero. IEF is used in conjunction with SDS-PAGE for two-dimensional electrophoresis, which separates polypeptides based on pI in one dimension and based on Mr in the second (Figure 4–5). Two-dimensional electrophoresis is particularly well suited for separating the components of complex mixtures of proteins.

Figure 4–5. Two-dimensional IEF-SDS-PAGE. The gel was stained with Coomassie blue. A crude bacterial extract was first subjected to isoelectric focusing (IEF) in a pH 3–10 gradient. The IEF gel was then placed horizontally on the top of an SDS gel, and the proteins then further resolved by SDS-PAGE. Notice the greatly improved resolution of distinct polypeptides relative to ordinary SDS-PAGE gel (Figure 4–4). [Axis labels: IEF, pH = 3 to pH = 10; SDS-PAGE.]

SANGER WAS THE FIRST TO DETERMINE THE SEQUENCE OF A POLYPEPTIDE

Mature insulin consists of the 21-residue A chain and the 30-residue B chain linked by disulfide bonds. Frederick Sanger reduced the disulfide bonds (Figure 4–3), separated the A and B chains, and cleaved each chain into smaller peptides using trypsin, chymotrypsin, and pepsin. The resulting peptides were then isolated and treated with acid to hydrolyze peptide bonds and generate peptides with as few as two or three amino acids. Each peptide was reacted with 1-fluoro-2,4-dinitrobenzene (Sanger’s reagent), which derivatizes the exposed α-amino group of amino terminal residues. The amino acid content of each peptide was then determined. While the ε-amino group of lysine also reacts with Sanger’s reagent, amino-terminal lysines can be distinguished from those at other positions because they react with 2 mol of Sanger’s reagent. Working backwards to larger fragments enabled Sanger to determine the complete sequence of insulin, an accomplishment for which he received a Nobel Prize in 1958.

THE EDMAN REACTION ENABLES PEPTIDES & PROTEINS TO BE SEQUENCED

Pehr Edman introduced phenylisothiocyanate (Edman’s reagent) to selectively label the amino-terminal residue of a peptide. In contrast to Sanger’s reagent, the phenylthiohydantoin (PTH) derivative can be removed under mild conditions to generate a new amino terminal residue (Figure 4–6). Successive rounds of derivatization with Edman’s reagent can therefore be used to sequence many residues of a single sample of peptide. Edman sequencing has been automated, using a thin film or solid matrix to immobilize the peptide and HPLC to identify PTH amino acids. Modern gas-phase sequencers can analyze as little as a few picomoles of peptide.

Figure 4–6. The Edman reaction. Phenylisothiocyanate derivatizes the amino-terminal residue of a peptide as a phenylthiohydantoic acid. Treatment with acid in a nonhydroxylic solvent releases a phenylthiohydantoin, which is subsequently identified by its chromatographic mobility, and a peptide one residue shorter. The process is then repeated. [Reaction scheme: phenylisothiocyanate (Edman reagent) and a peptide → a phenylthiohydantoic acid → (H+, nitromethane; H2O) → a phenylthiohydantoin and a peptide shorter by one residue.]

Large Polypeptides Are First Cleaved Into Smaller Segments

While the first 20–30 residues of a peptide can readily be determined by the Edman method, most polypeptides contain several hundred amino acids. Consequently, most polypeptides must first be cleaved into smaller peptides prior to Edman sequencing. Cleavage also may be necessary to circumvent posttranslational modifications that render a protein’s α-amino group “blocked”, or unreactive with the Edman reagent.

It usually is necessary to generate several peptides using more than one method of cleavage. This reflects both inconsistency in the spacing of chemically or enzymatically susceptible cleavage sites and the need for sets of peptides whose sequences overlap so one can infer the sequence of the polypeptide from which they derive (Figure 4–7). Reagents for the chemical or enzymatic cleavage of proteins include cyanogen bromide (CNBr), trypsin, and Staphylococcus aureus V8 protease (Table 4–1). Following cleavage, the resulting peptides are purified by reversed-phase HPLC, or occasionally by SDS-PAGE, and sequenced.

Figure 4–7. The overlapping peptide Z is used to deduce that peptides X and Y are present in the original protein in the order X → Y, not Y → X. [Labels: Peptide X; Peptide Y; Peptide Z; carboxyl terminal portion of peptide X; amino terminal portion of peptide Y.]

MOLECULAR BIOLOGY HAS REVOLUTIONIZED THE DETERMINATION OF PRIMARY STRUCTURE

Knowledge of DNA sequences permits deduction of the primary structures of polypeptides. DNA sequencing requires only minute amounts of DNA and can readily yield the sequence of hundreds of nucleotides. To clone and sequence the DNA that encodes a particular protein, some means of identifying the correct clone (eg, knowledge of a portion of its nucleotide sequence) is essential. A hybrid approach thus has emerged. Edman sequencing is used to provide a partial amino acid sequence. Oligonucleotide primers modeled on this partial sequence can then be used to identify clones or to amplify the appropriate gene by the polymerase chain reaction (PCR) (see Chapter 40). Once an authentic DNA clone is obtained, its oligonucleotide sequence can be determined and the genetic code used to infer the primary structure of the encoded polypeptide.

The hybrid approach enhances the speed and efficiency of primary structure analysis and the range of proteins that can be sequenced. It also circumvents obstacles such as the presence of an amino-terminal blocking group or the lack of a key overlap peptide. Only a few segments of primary structure must be determined by Edman analysis.

DNA sequencing reveals the order in which amino acids are added to the nascent polypeptide chain as it is synthesized on the ribosomes. However, it provides no information about posttranslational modifications such as proteolytic processing, methylation, glycosylation, phosphorylation, hydroxylation of proline and lysine, and disulfide bond formation that accompany maturation. While Edman sequencing can detect the presence of most posttranslational events, technical limitations often prevent identification of a specific modification.

Table 4–1. Methods for cleaving polypeptides.

    Method                  Bond Cleaved
    CNBr                    Met-X
    Trypsin                 Lys-X and Arg-X
    Chymotrypsin            Hydrophobic amino acid-X
    Endoproteinase Lys-C    Lys-X
    Endoproteinase Arg-C    Arg-X
    Endoproteinase Asp-N    X-Asp
    V8 protease             Glu-X, particularly where X is hydrophobic
    Hydroxylamine           Asn-Gly
    o-Iodosobenzene         Trp-X
    Mild acid               Asp-Pro
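The cleavage specificities of Table 4–1 and the overlap reasoning of Figure 4–7 can be sketched in a few lines. This is an illustration under stated assumptions, not a laboratory protocol: the rules are simplified to "cut on the carboxyl side of the listed residue" for trypsin and CNBr, the test sequence is invented, and the function names are hypothetical.

```python
from typing import List


def cleave_after(sequence: str, residues: str) -> List[str]:
    """Split a sequence on the carboxyl side of any residue in `residues`."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])   # carboxyl terminal remainder
    return peptides


def trypsin(sequence: str) -> List[str]:
    return cleave_after(sequence, "KR")     # Lys-X and Arg-X (Table 4-1)


def cnbr(sequence: str) -> List[str]:
    return cleave_after(sequence, "M")      # Met-X (Table 4-1)


def order_by_overlap(x: str, y: str, z: str, min_overlap: int = 2) -> str:
    """Use overlapping peptide z to decide whether x precedes y or y precedes x."""

    def bridges(first: str, second: str) -> bool:
        # z must contain the carboxyl terminal end of `first` followed
        # directly by the amino terminal start of `second`.
        for i in range(min_overlap, len(first) + 1):
            for j in range(min_overlap, len(second) + 1):
                if first[-i:] + second[:j] in z:
                    return True
        return False

    if bridges(x, y) and not bridges(y, x):
        return "X -> Y"
    if bridges(y, x) and not bridges(x, y):
        return "Y -> X"
    return "undetermined"


if __name__ == "__main__":
    protein = "AGMVLKSTFMEWR"               # hypothetical sequence
    x, y = trypsin(protein)                 # two tryptic peptides
    z = cnbr(protein)[1]                    # a CNBr peptide spanning the junction
    print(x, y, z, order_by_overlap(x, y, z))
```

Two cleavage methods are needed because peptides from a single digest only abut one another. A peptide from a second digest that spans a junction is what fixes the order.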
MASS SPECTROMETRY DETECTS COVALENT MODIFICATIONS

Mass spectrometry, which discriminates molecules based solely on their mass, is ideal for detecting the phosphate, hydroxyl, and other groups on posttranslationally modified amino acids. Each adds a specific and readily identified increment of mass to the modified amino acid (Table 4–2). For analysis by mass spectrometry, a sample in a vacuum is vaporized under conditions where protonation can occur, imparting positive charge. An electrical field then propels the cations through a magnetic field which deflects them at a right angle to their original direction of flight and focuses them onto a detector (Figure 4–8). The magnetic force required to deflect the path of each ionic species onto the detector, measured as the current applied to the electromagnet, is recorded. For ions of identical net charge, this force is proportionate to their mass. In a time-of-flight mass spectrometer, a briefly applied electric field accelerates the ions towards a detector that records the time at which each ion arrives. For molecules of identical charge, the velocity to which they are accelerated, and hence the time required to reach the detector, depends on their mass: heavier ions are accelerated to lower velocities and arrive later.

Figure 4–8. Basic components of a simple mass spectrometer. A mixture of molecules is vaporized in an ionized state in the sample chamber S. These molecules are then accelerated down the flight tube by an electrical potential applied to accelerator grid A. An adjustable electromagnet, E, applies a magnetic field that deflects the flight of the individual ions until they strike the detector, D. The greater the mass of the ion, the higher the magnetic field required to focus it onto the detector.

Conventional mass spectrometers generally are used to determine the masses of molecules of 1000 Da or less, whereas time-of-flight mass spectrometers are suited for determining the large masses of proteins. The analysis of peptides and proteins by mass spectrometry initially was hindered by difficulties in volatilizing large organic molecules. However, matrix-assisted laser-desorption (MALDI) and electrospray dispersion (eg, nanospray) permit the masses of even large polypeptides (> 100,000 Da) to be determined with extraordinary accuracy (± 1 Da). Using electrospray dispersion, peptides eluting from a reversed-phase HPLC column are introduced directly into the mass spectrometer for immediate determination of their masses.

Peptides inside the mass spectrometer are broken down into smaller units by collisions with neutral helium atoms (collision-induced dissociation), and the masses of the individual fragments are determined. Since peptide bonds are much more labile than carbon-carbon bonds, the most abundant fragments will differ from one another by units equivalent to one or two amino acids. Since, with the exception of leucine and isoleucine, the molecular mass of each amino acid is unique, the sequence of the peptide can be reconstructed from the masses of its fragments.

Tandem Mass Spectrometry

Complex peptide mixtures can now be analyzed without prior purification by tandem mass spectrometry, which employs the equivalent of two mass spectrometers linked in series. The first spectrometer separates individual peptides based upon their differences in mass. By adjusting the field strength of the first magnet, a single peptide can be directed into the second mass spectrometer, where fragments are generated and their masses determined. As the sensitivity and versatility of mass spectrometry continue to increase, it is displacing Edman sequencers for the direct analysis of protein primary structure.

Table 4–2. Mass increases resulting from common posttranslational modifications.

    Modification       Mass Increase (Da)
    Phosphorylation    80
    Hydroxylation      16
    Methylation        14
    Acetylation        42
    Myristylation      210
    Palmitoylation     238
    Glycosylation      162
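Because each modification adds a characteristic increment of mass (Table 4–2), a measured mass can be compared with the mass calculated for the unmodified peptide. The sketch below uses only the increments listed in Table 4–2; the tolerance default, the example masses, and the function name are assumptions made for illustration, and it considers a single modification at a time.

```python
# Mass increments in Da, taken from Table 4-2.
MASS_INCREASE = {
    "phosphorylation": 80,
    "hydroxylation": 16,
    "methylation": 14,
    "acetylation": 42,
    "myristylation": 210,
    "palmitoylation": 238,
    "glycosylation": 162,
}


def candidate_modifications(unmodified_mass, observed_mass, tolerance=1.0):
    """List the single modifications whose mass increase explains the observed shift.

    `tolerance` is the assumed measurement accuracy in Da.
    """
    shift = observed_mass - unmodified_mass
    return sorted(
        name
        for name, increase in MASS_INCREASE.items()
        if abs(shift - increase) <= tolerance
    )


if __name__ == "__main__":
    # Hypothetical peptide: calculated mass 1500.0 Da, observed 1580.2 Da.
    print(candidate_modifications(1500.0, 1580.2))
    # A shift of 15 Da is ambiguous at a tolerance of 1 Da.
    print(candidate_modifications(1500.0, 1515.0))
```

The second call shows why accuracy matters: when two increments differ by less than twice the tolerance, the shift alone cannot tell them apart.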
GENOMICS ENABLES PROTEINS TO BE IDENTIFIED FROM SMALL AMOUNTS OF SEQUENCE DATA

Primary structure analysis has been revolutionized by genomics, the application of automated oligonucleotide sequencing and computerized data retrieval and analysis to sequence an organism’s entire genetic complement. The first genome sequenced was that of Haemophilus influenzae, in 1995. By mid 2001, the complete genome sequences for over 50 organisms had been determined. These include the human genome and those of several bacterial pathogens; the results and significance of the Human Genome Project are discussed in Chapter 54. Where genome sequence is known, the task of determining a protein’s DNA-derived primary sequence is materially simplified. In essence, the second half of the hybrid approach has already been completed. All that remains is to acquire sufficient information to permit the open reading frame (ORF) that encodes the protein to be retrieved from an Internet-accessible genome database and identified. In some cases, a segment of amino acid sequence only four or five residues in length may be sufficient to identify the correct ORF.

Computerized search algorithms assist the identification of the gene encoding a given protein and clarify uncertainties that arise from Edman sequencing and mass spectrometry. By exploiting computers to solve complex puzzles, the spectrum of information suitable for identification of the ORF that encodes a particular polypeptide is greatly expanded. In peptide mass profiling, for example, a peptide digest is introduced into the mass spectrometer and the sizes of the peptides are determined. A computer is then used to find an ORF whose predicted protein product would, if broken down into peptides by the cleavage method selected, produce a set of peptides whose masses match those observed by mass spectrometry.

PROTEOMICS & THE PROTEOME

The Goal of Proteomics Is to Identify the Entire Complement of Proteins Elaborated by a Cell Under Diverse Conditions

While the sequence of the human genome is known, the picture provided by genomics alone is both static and incomplete. Proteomics aims to identify the entire complement of proteins elaborated by a cell under diverse conditions. As genes are switched on and off, proteins are synthesized in particular cell types at specific times of growth or differentiation and in response to external stimuli. Muscle cells express proteins not expressed by neural cells, and the type of subunits present in the hemoglobin tetramer undergo change pre- and postpartum. Many proteins undergo posttranslational modifications during maturation into functionally competent forms or as a means of regulating their properties. Knowledge of the human genome therefore represents only the beginning of the task of describing living organisms in molecular detail and understanding the dynamics of processes such as growth, aging, and disease. As the human body contains thousands of cell types, each containing thousands of proteins, the proteome (the set of all the proteins expressed by an individual cell at a particular time) represents a moving target of formidable dimensions.

Two-Dimensional Electrophoresis & Gene Array Chips Are Used to Survey Protein Expression

One goal of proteomics is the identification of proteins whose levels of expression correlate with medically significant events. The presumption is that proteins whose appearance or disappearance is associated with a specific physiologic condition or disease will provide insights into root causes and mechanisms. Determination of the proteomes characteristic of each cell type requires the utmost efficiency in the isolation and identification of individual proteins. The contemporary approach utilizes robotic automation to speed sample preparation and large two-dimensional gels to resolve cellular proteins. Individual polypeptides are then extracted and analyzed by Edman sequencing or mass spectroscopy. While only about 1000 proteins can be resolved on a single gel, two-dimensional electrophoresis has a major advantage in that it examines the proteins themselves. An alternative and complementary approach employs gene arrays, sometimes called DNA chips, to detect the expression of the mRNAs which encode proteins. While changes in the expression of the mRNA encoding a protein do not necessarily reflect comparable changes in the level of the corresponding protein, gene arrays are more sensitive probes than two-dimensional gels and thus can examine more gene products.

Bioinformatics Assists Identification of Protein Functions

The functions of a large proportion of the proteins encoded by the human genome are presently unknown. Recent advances in bioinformatics permit researchers to compare amino acid sequences to discover clues to potential properties, physiologic roles, and mechanisms of action of proteins. Algorithms exploit the tendency of nature to employ variations of a structural theme to perform similar functions in several proteins (eg, the Rossmann nucleotide binding fold to bind NAD(P)H, nuclear targeting sequences, and EF hands to bind Ca2+). These domains generally are detected in the primary structure by conservation of particular amino acids at key positions. Insights into the properties and physiologic role of a newly discovered protein thus may be inferred by comparing its primary structure with that of known proteins.
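The peptide mass profiling search described above can be sketched as follows. This is a minimal illustration rather than a real search engine: the monoisotopic residue masses are standard reference values supplied here as an assumption (they do not appear in this chapter), cleavage is simplified to trypsin cutting after every Lys or Arg (Table 4–1), and the ORF names and sequences are invented.

```python
from typing import Dict, List

# Monoisotopic residue masses in Da (ASSUMED standard reference values).
RESIDUE_MASS: Dict[str, float] = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(protein: str) -> List[str]:
    """Cut on the carboxyl side of every Lys or Arg (simplified trypsin rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Sum the residue masses and add one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


def count_matches(observed: List[float], protein: str, tolerance: float = 0.5) -> int:
    """Count observed masses explained by some predicted tryptic peptide."""
    predicted = [peptide_mass(p) for p in tryptic_peptides(protein)]
    return sum(
        1 for mass in observed
        if any(abs(mass - p) <= tolerance for p in predicted)
    )


def best_orf(observed: List[float], orfs: Dict[str, str], tolerance: float = 0.5) -> str:
    """Return the name of the ORF whose predicted digest matches the most masses."""
    return max(orfs, key=lambda name: count_matches(observed, orfs[name], tolerance))


if __name__ == "__main__":
    # Hypothetical ORF translations and a simulated set of measured masses.
    orfs = {"orfA": "MKTAYIAKQR", "orfB": "MSEQNNTEMTFK"}
    observed = [peptide_mass(p) + 0.1 for p in tryptic_peptides(orfs["orfA"])]
    print(best_orf(observed, orfs))
```

Leucine and isoleucine share one entry value in the table, which is the exception to unique residue masses noted earlier in the chapter.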
SUMMARY

• Long amino acid polymers or polypeptides constitute the basic structural unit of proteins, and the structure of a protein provides insight into how it fulfills its functions.
• The Edman reaction enabled amino acid sequence analysis to be automated. Mass spectrometry provides a sensitive and versatile tool for determining primary structure and for the identification of posttranslational modifications.
• DNA cloning and molecular biology coupled with protein chemistry provide a hybrid approach that greatly increases the speed and efficiency for determination of primary structures of proteins.
• Genomics, the analysis of the entire oligonucleotide sequence of an organism’s complete genetic material, has provided further enhancements.
• Computer algorithms facilitate identification of the open reading frames that encode a given protein by using partial sequences and peptide mass profiling to search sequence databases.
• Scientists are now trying to determine the primary sequence and functional role of every protein expressed in a living cell, known as its proteome.
• A major goal is the identification of proteins whose appearance or disappearance correlates with physiologic phenomena, aging, or specific diseases.

Proteins: Higher Orders of Structure (Chapter 5)

Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

Proteins catalyze metabolic reactions, power cellular motion, and form macromolecular rods and cables that provide structural integrity to hair, bones, tendons, and teeth. In nature, form follows function. The structural variety of human proteins therefore reflects the sophistication and diversity of their biologic roles. Maturation of a newly synthesized polypeptide into a biologically functional protein requires that it be folded into a specific three-dimensional arrangement, or conformation. During maturation, posttranslational modifications may add new chemical groups or remove transiently needed peptide segments. Genetic or nutritional deficiencies that impede protein maturation are deleterious to health. Examples of the former include Creutzfeldt-Jakob disease, scrapie, Alzheimer’s disease, and bovine spongiform encephalopathy (mad cow disease). Scurvy represents a nutritional deficiency that impairs protein maturation.

CONFORMATION VERSUS CONFIGURATION

The terms configuration and conformation are often confused. Configuration refers to the geometric relationship between a given set of atoms, for example, those that distinguish L- from D-amino acids. Interconversion of configurational alternatives requires breaking covalent bonds. Conformation refers to the spatial relationship of every atom in a molecule. Interconversion between conformers occurs without covalent bond rupture, with retention of configuration, and typically via rotation about single bonds.

PROTEINS WERE INITIALLY CLASSIFIED

Globular proteins are compact, are roughly spherical or ovoid in shape, and have axial ratios (the ratio of their longest to shortest dimensions) of not over 3. Most enzymes are globular proteins, whose large internal volume provides ample space in which to construct cavities of the specific shape, charge, and hydrophobicity or hydrophilicity required to bind substrates and promote catalysis. By contrast, many structural proteins adopt highly extended conformations. These fibrous proteins possess axial ratios of 10 or more.

Lipoproteins and glycoproteins contain covalently bound lipid and carbohydrate, respectively. Myoglobin, hemoglobin, cytochromes, and many other proteins contain tightly associated metal ions and are termed metalloproteins. With the development and application of techniques for determining the amino acid sequences of proteins (Chapter 4), more precise classification schemes have emerged based upon similarity, or homology, in amino acid sequence and structure. However, many early classification terms remain in common use.

PROTEINS ARE CONSTRUCTED USING MODULAR PRINCIPLES

Proteins perform complex physical and catalytic functions by positioning specific chemical groups in a precise three-dimensional arrangement. The polypeptide scaffold containing these groups must adopt a conformation that is both functionally efficient and physically strong. At first glance, the biosynthesis of polypeptides comprised of tens of thousands of individual atoms would appear to be extremely challenging.
When one considers that a typical polypeptide BY THEIR GROSS CHARACTERISTICS can adopt ≥ 1050 distinct conformations, folding into the conformation appropriate to their biologic func- Scientists initially approached structure-function rela- tion would appear to be even more difficult. As de- tionships in proteins by separating them into classes scribed in Chapters 3 and 4, synthesis of the polypep- based upon properties such as solubility, shape, or the tide backbones of proteins employs a small set of presence of nonprotein groups. For example, the pro- common building blocks or modules, the amino acids, teins that can be extracted from cells using solutions at joined by a common linkage, the peptide bond. A physiologic pH and ionic strength are classified as sol- stepwise modular pathway simplifies the folding and uble. Extraction of integral membrane proteins re- processing of newly synthesized polypeptides into ma- quires dissolution of the membrane with detergents. ture proteins. 30 PROTEINS: HIGHER ORDERS OF STRUCTURE / 31 THE FOUR ORDERS OF PROTEIN STRUCTURE The modular nature of protein synthesis and folding are embodied in the concept of orders of protein struc- ture: primary structure, the sequence of the amino 90 acids in a polypeptide chain; secondary structure, the folding of short (3- to 30-residue), contiguous segments of polypeptide into geometrically ordered units; ter- tiary structure, the three-dimensional assembly of sec- ψ 0 ondary structural units to form larger functional units such as the mature polypeptide and its component do- mains; and quaternary structure, the number and types of polypeptide units of oligomeric proteins and their spatial arrangement. – 90 SECONDARY STRUCTURE Peptide Bonds Restrict Possible – 90 0 90 Secondary Conformations φ Free rotation is possible about only two of the three co- valent bonds of the polypeptide backbone: the α-car- Figure 5–1. 
Ramachandran plot of the main chain bon (Cα) to the carbonyl carbon (Co) bond and the phi (Φ) and psi (Ψ) angles for approximately 1000 Cα to nitrogen bond (Figure 3–4). The partial double- nonglycine residues in eight proteins whose structures bond character of the peptide bond that links Co to the were solved at high resolution. The dots represent al- α-nitrogen requires that the carbonyl carbon, carbonyl lowable combinations and the spaces prohibited com- oxygen, and α-nitrogen remain coplanar, thus prevent- binations of phi and psi angles. (Reproduced, with per- ing rotation. The angle about the CαN bond is mission, from Richardson JS: The anatomy and taxonomy termed the phi (Φ) angle, and that about the CoCα of protein structures. Adv Protein Chem 1981;34:167.) bond the psi (Ψ) angle. For amino acids other than glycine, most combinations of phi and psi angles are disallowed because of steric hindrance (Figure 5–1). The conformations of proline are even more restricted occur in nature. Schematic diagrams of proteins repre- due to the absence of free rotation of the NCα bond. sent α helices as cylinders. Regions of ordered secondary structure arise when a The stability of an α helix arises primarily from hy- series of aminoacyl residues adopt similar phi and psi drogen bonds formed between the oxygen of the pep- angles. Extended segments of polypeptide (eg, loops) tide bond carbonyl and the hydrogen atom of the pep- can possess a variety of such angles. The angles that de- tide bond nitrogen of the fourth residue down the fine the two most common types of secondary struc- polypeptide chain (Figure 5–4). The ability to form the ture, the helix and the sheet, fall within the lower maximum number of hydrogen bonds, supplemented and upper left-hand quadrants of a Ramachandran by van der Waals interactions in the core of this tightly plot, respectively (Figure 5–1). packed structure, provides the thermodynamic driving force for the formation of an α helix. 
Since the peptide bond nitrogen of proline lacks a hydrogen atom to con- The Alpha Helix tribute to a hydrogen bond, proline can only be stably The polypeptide backbone of an α helix is twisted by accommodated within the first turn of an α helix. an equal amount about each α-carbon with a phi angle When present elsewhere, proline disrupts the confor- of approximately −57 degrees and a psi angle of approx- mation of the helix, producing a bend. Because of its imately − 47 degrees. A complete turn of the helix con- small size, glycine also often induces bends in α helices. tains an average of 3.6 aminoacyl residues, and the dis- Many α helices have predominantly hydrophobic R tance it rises per turn (its pitch) is 0.54 nm (Figure groups on one side of the axis of the helix and predomi- 5–2). The R groups of each aminoacyl residue in an α nantly hydrophilic ones on the other. These amphi- helix face outward (Figure 5–3). Proteins contain only pathic helices are well adapted to the formation of in- L-amino acids, for which a right-handed α helix is by terfaces between polar and nonpolar regions such as the far the more stable, and only right-handed α helices hydrophobic interior of a protein and its aqueous envi- 32 / CHAPTER 5 R R N C R C R N C C N C R C N C R C R N C C R N R C N C Figure 5–3. View down the axis of an α helix. The C side chains (R) are on the outside of the helix. The van C N der Waals radii of the atoms are larger than shown here; C hence, there is almost no free space inside the helix. C (Slightly modified and reproduced, with permission, from 0.54-nm pitch Stryer L: Biochemistry, 3rd ed. Freeman, 1995. Copyright (3.6 residues) N C © 1995 by W.H. Freeman and Co.) C N C 0.15 nm C N polypeptide chain proceed in the same direction amino C to carboxyl, or an antiparallel sheet, in which they pro- ceed in opposite directions (Figure 5–5). Either config- uration permits the maximum number of hydrogen bonds between segments, or strands, of the sheet. 
Most Figure 5–2. Orientation of the main chain atoms of a β sheets are not perfectly flat but tend to have a right- peptide about the axis of an α helix. handed twist. Clusters of twisted strands of β sheet form the core of many globular proteins (Figure 5–6). Schematic diagrams represent β sheets as arrows that ronment. Clusters of amphipathic helices can create a point in the amino to carboxyl terminal direction. channel, or pore, that permits specific polar molecules to pass through hydrophobic cell membranes. Loops & Bends Roughly half of the residues in a “typical” globular pro- The Beta Sheet tein reside in α helices and β sheets and half in loops, The second (hence “beta”) recognizable regular sec- turns, bends, and other extended conformational fea- ondary structure in proteins is the β sheet. The amino tures. Turns and bends refer to short segments of acid residues of a β sheet, when viewed edge-on, form a amino acids that join two units of secondary structure, zigzag or pleated pattern in which the R groups of adja- such as two adjacent strands of an antiparallel β sheet. cent residues point in opposite directions. Unlike the A β turn involves four aminoacyl residues, in which the compact backbone of the α helix, the peptide backbone first residue is hydrogen-bonded to the fourth, resulting of the β sheet is highly extended. But like the α helix, in a tight 180-degree turn (Figure 5–7). Proline and β sheets derive much of their stability from hydrogen glycine often are present in β turns. bonds between the carbonyl oxygens and amide hydro- Loops are regions that contain residues beyond the gens of peptide bonds. However, in contrast to the α minimum number necessary to connect adjacent re- helix, these bonds are formed with adjacent segments of gions of secondary structure. Irregular in conformation, β sheet (Figure 5–5). loops nevertheless serve key biologic roles. 
For many Interacting β sheets can be arranged either to form a enzymes, the loops that bridge domains responsible for parallel β sheet, in which the adjacent segments of the binding substrates often contain aminoacyl residues PROTEINS: HIGHER ORDERS OF STRUCTURE / 33 N C C R N C R C N R C C N R C C N C R C N C R C N O R C C N R C C N C R C N C R C N R C Figure 5–5. Spacing and bond angles of the hydro- C gen bonds of antiparallel and parallel pleated β sheets. Arrows indicate the direction of each strand. The hydro- gen-donating α-nitrogen atoms are shown as blue cir- cles. Hydrogen bonds are indicated by dotted lines. For Figure 5–4. Hydrogen bonds (dotted lines) formed clarity in presentation, R groups and hydrogens are between H and O atoms stabilize a polypeptide in an omitted. Top: Antiparallel β sheet. Pairs of hydrogen α-helical conformation. (Reprinted, with permission, bonds alternate between being close together and from Haggis GH et al: Introduction to Molecular Biology. wide apart and are oriented approximately perpendicu- Wiley, 1964.) lar to the polypeptide backbone. Bottom: Parallel β sheet. The hydrogen bonds are evenly spaced but slant in alternate directions. that participate in catalysis. Helix-loop-helix motifs provide the oligonucleotide-binding portion of DNA- binding proteins such as repressors and transcription factors. Structural motifs such as the helix-loop-helix dered regions assume an ordered conformation upon motif that are intermediate between secondary and ter- binding of a ligand. This structural flexibility enables tiary structures are often termed supersecondary struc- such regions to act as ligand-controlled switches that af- tures. Since many loops and bends reside on the surface fect protein structure and function. of proteins and are thus exposed to solvent, they consti- tute readily accessible sites, or epitopes, for recognition Tertiary & Quaternary Structure and binding of antibodies. 
While loops lack apparent structural regularity, they The term “tertiary structure” refers to the entire three- exist in a specific conformation stabilized through hy- dimensional conformation of a polypeptide. It indicates, drogen bonding, salt bridges, and hydrophobic interac- in three-dimensional space, how secondary structural tions with other portions of the protein. However, not features—helices, sheets, bends, turns, and loops— all portions of proteins are necessarily ordered. Proteins assemble to form domains and how these domains re- may contain “disordered” regions, often at the extreme late spatially to one another. A domain is a section of amino or carboxyl terminal, characterized by high con- protein structure sufficient to perform a particular formational flexibility. In many instances, these disor- chemical or physical task such as binding of a substrate 34 / CHAPTER 5 COOH H H CH2 N H Cα H Cα C H N O C O CH3 C O H N CH2OH Cα Cα H H Figure 5–7. A β-turn that links two segments of an- tiparallel β sheet. The dotted line indicates the hydro- gen bond between the first and fourth amino acids of the four-residue segment Ala-Gly-Asp-Ser. 30 15 or other ligand. Other domains may anchor a protein to 55 a membrane or interact with a regulatory molecule that N modulates its function. A small polypeptide such as 50 345 80 70 triose phosphate isomerase (Figure 5–6) or myoglobin 330 280 (Chapter 6) may consist of a single domain. By contrast, 90 protein kinases contain two domains. Protein kinases 150 185 350 catalyze the transfer of a phosphoryl group from ATP to 145 C a peptide or protein. The amino terminal portion of the 230 245 377 polypeptide, which is rich in β sheet, binds ATP, while 320 310 the carboxyl terminal domain, which is rich in α helix, 110 220 260 binds the peptide or protein substrate (Figure 5–8). The groups that catalyze phosphoryl transfer reside in a loop 300 258 positioned at the interface of the two domains. 
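The α-helix dimensions quoted earlier in this chapter (an average of 3.6 residues per turn and a pitch of 0.54 nm) imply a rise of about 0.15 nm and a rotation of about 100 degrees per residue. The short Python sketch below works through that arithmetic. The function names and the 30-residue example are illustrative assumptions, not part of the text, and the calculation treats the helix as ideal and perfectly regular.

```python
# Idealized alpha-helix geometry, using the two values given in the text.
RESIDUES_PER_TURN = 3.6   # average residues per complete turn
PITCH_NM = 0.54           # axial rise per complete turn, in nanometers


def rise_per_residue_nm() -> float:
    """Axial rise contributed by a single residue (pitch / residues per turn)."""
    return PITCH_NM / RESIDUES_PER_TURN


def rotation_per_residue_deg() -> float:
    """Angle swept about the helix axis by a single residue."""
    return 360.0 / RESIDUES_PER_TURN


def helix_length_nm(n_residues: int) -> float:
    """Approximate axial length of an ideal helix of n_residues (hypothetical helper)."""
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * rise_per_residue_nm()


if __name__ == "__main__":
    print(f"rise per residue: {rise_per_residue_nm():.2f} nm")
    print(f"rotation per residue: {rotation_per_residue_deg():.0f} degrees")
    print(f"30-residue helix: {helix_length_nm(30):.1f} nm long")
```

The 100-degree step per residue is also why residues spaced three to four positions apart lie on the same face of the helix, which is the basis of the amphipathic helices described above.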
205 120 In some cases, proteins are assembled from more than one polypeptide, or protomer. Quaternary struc- 170 125 ture defines the polypeptide composition of a protein and, for an oligomeric protein, the spatial relationships Figure 5–6. Examples of tertiary structure of pro- between its subunits or protomers. Monomeric pro- teins. Top: The enzyme triose phosphate isomerase. teins consist of a single polypeptide chain. Dimeric Note the elegant and symmetrical arrangement of al- proteins contain two polypeptide chains. Homodimers ternating β sheets and α helices. (Courtesy of J Richard- contain two copies of the same polypeptide chain, son.) Bottom: Two-domain structure of the subunit of a while in a heterodimer the polypeptides differ. Greek homodimeric enzyme, a bacterial class II HMG-CoA re- letters (α, β, γ etc) are used to distinguish different sub- ductase. As indicated by the numbered residues, the units of a heterooligomeric protein, and subscripts indi- single polypeptide begins in the large domain, enters cate the number of each subunit type. For example, α4 designates a homotetrameric protein, and α2β2γ a pro- the small domain, and ends in the large domain. (Cour- tein with five subunits of three different types. tesy of C Lawrence, V Rodwell, and C Stauffacher, Purdue Since even small proteins contain many thousands University.) of atoms, depictions of protein structure that indicate the position of every atom are generally too complex to be readily interpreted. Simplified schematic diagrams thus are used to depict key features of a protein’s ter- PROTEINS: HIGHER ORDERS OF STRUCTURE / 35 from water. Other significant contributors include hy- drogen bonds and salt bridges between the carboxylates of aspartic and glutamic acid and the oppositely charged side chains of protonated lysyl, argininyl, and histidyl residues. 
While individually weak relative to a typical covalent bond of 80–120 kcal/mol, collectively these numerous interactions confer a high degree of sta- bility to the biologically functional conformation of a protein, just as a Velcro fastener harnesses the cumula- tive strength of multiple plastic loops and hooks. Some proteins contain covalent disulfide (S S) bonds that link the sulfhydryl groups of cysteinyl residues. Formation of disulfide bonds involves oxida- tion of the cysteinyl sulfhydryl groups and requires oxy- gen. Intrapolypeptide disulfide bonds further enhance the stability of the folded conformation of a peptide, while interpolypeptide disulfide bonds stabilize the quaternary structure of certain oligomeric proteins. THREE-DIMENSIONAL STRUCTURE IS DETERMINED BY X-RAY CRYSTALLOGRAPHY OR BY NMR SPECTROSCOPY X-Ray Crystallography Since the determination of the three-dimensional struc- ture of myoglobin over 40 years ago, the three-dimen- sional structures of thousands of proteins have been de- Figure 5–8. Domain structure. Protein kinases con- termined by x-ray crystallography. The key to x-ray tain two domains. The upper, amino terminal domain crystallography is the precipitation of a protein under binds the phosphoryl donor ATP (light blue). The lower, conditions in which it forms ordered crystals that dif- carboxyl terminal domain is shown binding a synthetic fract x-rays. This is generally accomplished by exposing peptide substrate (dark blue). small drops of the protein solution to various combina- tions of pH and precipitating agents such as salts and organic solutes such as polyethylene glycol. A detailed three-dimensional structure of a protein can be con- tiary and quaternary structure. Ribbon diagrams (Fig- structed from its primary structure using the pattern by ures 5–6 and 5–8) trace the conformation of the which it diffracts a beam of monochromatic x-rays. 
polypeptide backbone, with cylinders and arrows indi- While the development of increasingly capable com- cating regions of α helix and β sheet, respectively. In an puter-based tools has rendered the analysis of complex even simpler representation, line segments that link the x-ray diffraction patterns increasingly facile, a major α carbons indicate the path of the polypeptide back- stumbling block remains the requirement of inducing bone. These schematic diagrams often include the side highly purified samples of the protein of interest to chains of selected amino acids that emphasize specific crystallize. Several lines of evidence, including the abil- structure-function relationships. ity of some crystallized enzymes to catalyze chemical re- actions, indicate that the vast majority of the structures MULTIPLE FACTORS STABILIZE determined by crystallography faithfully represent the TERTIARY & QUATERNARY STRUCTURE structures of proteins in free solution. Higher orders of protein structure are stabilized primar- Nuclear Magnetic Resonance ily—and often exclusively—by noncovalent interac- tions. Principal among these are hydrophobic interac- Spectroscopy tions that drive most hydrophobic amino acid side Nuclear magnetic resonance (NMR) spectroscopy, a chains into the interior of the protein, shielding them powerful complement to x-ray crystallography, mea- 36 / CHAPTER 5 sures the absorbance of radio frequency electromagnetic Folding Is Modular energy by certain atomic nuclei. “NMR-active” isotopes of biologically relevant atoms include 1H, 13C, 15N, and Protein folding generally occurs via a stepwise process. 31 P. The frequency, or chemical shift, at which a partic- In the first stage, the newly synthesized polypeptide ular nucleus absorbs energy is a function of both the emerges from ribosomes, and short segments fold into functional group within which it resides and the prox- secondary structural units that provide local regions of imity of other NMR-active nuclei. 
Two-dimensional organized structure. Folding is now reduced to the se- NMR spectroscopy permits a three-dimensional repre- lection of an appropriate arrangement of this relatively sentation of a protein to be constructed by determining small number of secondary structural elements. In the the proximity of these nuclei to one another. NMR second stage, the forces that drive hydrophobic regions spectroscopy analyzes proteins in aqueous solution, ob- into the interior of the protein away from solvent drive viating the need to form crystals. It thus is possible to the partially folded polypeptide into a “molten globule” observe changes in conformation that accompany lig- in which the modules of secondary structure rearrange and binding or catalysis using NMR spectroscopy. to arrive at the mature conformation of the protein. However, only the spectra of relatively small proteins, This process is orderly but not rigid. Considerable flexi- ≤ 20 kDa in size, can be analyzed with current tech- bility exists in the ways and in the order in which ele- nology. ments of secondary structure can be rearranged. In gen- eral, each element of secondary or supersecondary Molecular Modeling structure facilitates proper folding by directing the fold- ing process toward the native conformation and away An increasingly useful adjunct to the empirical determi- from unproductive alternatives. For oligomeric pro- nation of the three-dimensional structure of proteins is teins, individual protomers tend to fold before they as- the use of computer technology for molecular model- sociate with other subunits. ing. The types of models created take two forms. In the first, the known three-dimensional structure of a pro- tein is used as a template to build a model of the proba- Auxiliary Proteins Assist Folding ble structure of a homologous protein. 
In the second, Under appropriate conditions, many proteins will computer software is used to manipulate the static spontaneously refold after being previously denatured model provided by crystallography to explore how a (ie, unfolded) by treatment with acid or base, protein’s conformation might change when ligands are chaotropic agents, or detergents. However, unlike the bound or when temperature, pH, or ionic strength is folding process in vivo, refolding under laboratory con- altered. Scientists also are examining the library of ditions is a far slower process. Moreover, some proteins available protein structures in an attempt to devise fail to spontaneously refold in vitro, often forming in- computer programs that can predict the three-dimen- soluble aggregates, disordered complexes of unfolded sional conformation of a protein directly from its pri- or partially folded polypeptides held together by hy- mary sequence. drophobic interactions. Aggregates represent unproduc- tive dead ends in the folding process. Cells employ aux- PROTEIN FOLDING iliary proteins to speed the process of folding and to guide it toward a productive conclusion. The Native Conformation of a Protein Is Thermodynamically Favored Chaperones The number of distinct combinations of phi and psi angles specifying potential conformations of even a rel- Chaperone proteins participate in the folding of over atively small—15-kDa—polypeptide is unbelievably half of mammalian proteins. The hsp70 (70-kDa heat vast. Proteins are guided through this vast labyrinth of shock protein) family of chaperones binds short se- possibilities by thermodynamics. Since the biologically quences of hydrophobic amino acids in newly syn- relevant—or native—conformation of a protein gener- thesized polypeptides, shielding them from solvent. 
ally is that which is most energetically favored, knowl- Chaperones prevent aggregation, thus providing an op- edge of the native conformation is specified in the pri- portunity for the formation of appropriate secondary mary sequence. However, if one were to wait for a structural elements and their subsequent coalescence polypeptide to find its native conformation by random into a molten globule. The hsp60 family of chaperones, exploration of all possible conformations, the process sometimes called chaperonins, differ in sequence and would require billions of years to complete. Clearly, structure from hsp70 and its homologs. Hsp60 acts protein folding in cells takes place in a more orderly later in the folding process, often together with an and guided fashion. hsp70 chaperone. The central cavity of the donut- PROTEINS: HIGHER ORDERS OF STRUCTURE / 37 shaped hsp60 chaperone provides a sheltered environ- sheep, and bovine spongiform encephalopathy (mad ment in which a polypeptide can fold until all hy- cow disease) in cattle. Prion diseases may manifest drophobic regions are buried in its interior, eliminating themselves as infectious, genetic, or sporadic disorders. aggregation. Chaperone proteins can also “rescue” pro- Because no viral or bacterial gene encoding the patho- teins that have become thermodynamically trapped in a logic prion protein could be identified, the source and misfolded dead end by unfolding hydrophobic regions mechanism of transmission of prion disease long re- and providing a second chance to fold productively. mained elusive. Today it is believed that prion diseases are protein conformation diseases transmitted by alter- Protein Disulfide Isomerase ing the conformation, and hence the physical proper- ties, of proteins endogenous to the host. Human prion- Disulfide bonds between and within polypeptides stabi- related protein, PrP, a glycoprotein encoded on the lize tertiary and quaternary structure. 
However, disul- short arm of chromosome 20, normally is monomeric fide bond formation is nonspecific. Under oxidizing and rich in α helix. Pathologic prion proteins serve as conditions, a given cysteine can form a disulfide bond the templates for the conformational transformation of with the SH of any accessible cysteinyl residue. By normal PrP, known as PrPc, into PrPsc. PrPsc is rich in catalyzing disulfide exchange, the rupture of an SS β sheet with many hydrophobic aminoacyl side chains bond and its reformation with a different partner cys- exposed to solvent. PrPsc molecules therefore associate teine, protein disulfide isomerase facilitates the forma- strongly with one other, forming insoluble protease-re- tion of disulfide bonds that stabilize their native confor- sistant aggregates. Since one pathologic prion or prion- mation. related protein can serve as template for the conforma- tional transformation of many times its number of PrPc Proline-cis,trans-Isomerase molecules, prion diseases can be transmitted by the pro- tein alone without involvement of DNA or RNA. All X-Pro peptide bonds—where X represents any residue—are synthesized in the trans configuration. However, of the X-Pro bonds of mature proteins, ap- Alzheimer’s Disease proximately 6% are cis. The cis configuration is partic- Refolding or misfolding of another protein endogenous ularly common in β-turns. Isomerization from trans to to human brain tissue, β-amyloid, is also a prominent cis is catalyzed by the enzyme proline-cis,trans-iso- feature of Alzheimer’s disease. While the root cause of merase (Figure 5–9). Alzheimer’s disease remains elusive, the characteristic senile plaques and neurofibrillary bundles contain ag- SEVERAL NEUROLOGIC DISEASES gregates of the protein β-amyloid, a 4.3-kDa polypep- RESULT FROM ALTERED PROTEIN tide produced by proteolytic cleavage of a larger protein CONFORMATION known as amyloid precursor protein. 
In Alzheimer’s disease patients, levels of β-amyloid become elevated, and this protein undergoes a conformational transformation from a soluble α helix–rich state to a state rich in β sheet and prone to self-aggregation. Apolipoprotein E has been implicated as a potential mediator of this conformational transformation.

Prions

The transmissible spongiform encephalopathies, or prion diseases, are fatal neurodegenerative diseases characterized by spongiform changes, astrocytic gliomas, and neuronal loss resulting from the deposition of insoluble protein aggregates in neural cells. They include Creutzfeldt-Jakob disease in humans, scrapie in

Figure 5–9. Isomerization of the N-α1 prolyl peptide bond from a cis to a trans configuration relative to the backbone of the polypeptide.

COLLAGEN ILLUSTRATES THE ROLE OF POSTTRANSLATIONAL PROCESSING IN PROTEIN MATURATION

Protein Maturation Often Involves Making & Breaking Covalent Bonds

The maturation of proteins into their final structural state often involves the cleavage or formation (or both) of covalent bonds, a process termed posttranslational modification. Many polypeptides are initially synthesized as larger precursors, called proproteins. The “extra” polypeptide segments in these proproteins often serve as leader sequences that target a polypeptide to a particular organelle or facilitate its passage through a membrane. Others ensure that the potentially harmful activity of a protein such as the proteases trypsin and chymotrypsin remains inhibited until these proteins reach their final destination. However, once these transient requirements are fulfilled, the now superfluous peptide regions are removed by selective proteolysis. Other covalent modifications may take place that add new chemical functionalities to a protein. The maturation of collagen illustrates both of these processes.

Collagen Is a Fibrous Protein

Collagen is the most abundant of the fibrous proteins that constitute more than 25% of the protein mass in the human body. Other prominent fibrous proteins include keratin and myosin. These proteins represent a primary source of structural strength for cells (ie, the cytoskeleton) and tissues. Skin derives its strength and flexibility from a crisscrossed mesh of collagen and keratin fibers, while bones and teeth are buttressed by an underlying network of collagen fibers analogous to the steel strands in reinforced concrete. Collagen also is present in connective tissues such as ligaments and tendons. The high tensile strength that these structural roles demand requires elongated proteins characterized by repetitive amino acid sequences and a regular secondary structure.

Collagen Forms a Unique Triple Helix

Tropocollagen consists of three fibers, each containing about 1000 amino acids, bundled together in a unique conformation, the collagen triple helix (Figure 5–10). A mature collagen fiber forms an elongated rod with an axial ratio of about 200. Three intertwined polypeptide strands, which twist to the left, wrap around one another in a right-handed fashion to form the collagen triple helix. The opposing handedness of this superhelix and its component polypeptides makes the collagen triple helix highly resistant to unwinding, the same principle used in the steel cables of suspension bridges. A collagen triple helix has 3.3 residues per turn and a rise per residue nearly twice that of an α helix. The R groups of each polypeptide strand of the triple helix pack so closely that in order to fit, one must be glycine. Thus, every third amino acid residue in collagen is a glycine residue. Staggering of the three strands provides appropriate positioning of the requisite glycines throughout the helix. Collagen is also rich in proline and hydroxyproline, yielding a repetitive Gly-X-Y pattern (Figure 5–10) in which Y generally is proline or hydroxyproline.

Collagen triple helices are stabilized by hydrogen bonds between residues in different polypeptide chains. The hydroxyl groups of hydroxyprolyl residues also participate in interchain hydrogen bonding. Additional stability is provided by covalent cross-links formed between modified lysyl residues both within and between polypeptide chains.

Collagen Is Synthesized as a Larger Precursor

Collagen is initially synthesized as a larger precursor polypeptide, procollagen. Numerous prolyl and lysyl residues of procollagen are hydroxylated by prolyl hydroxylase and lysyl hydroxylase, enzymes that require ascorbic acid (vitamin C). Hydroxyprolyl and hydroxylysyl residues provide additional hydrogen bonding capability that stabilizes the mature protein. In addition, glucosyl and galactosyl transferases attach glucosyl or galactosyl residues to the hydroxyl groups of specific hydroxylysyl residues.

The central portion of the precursor polypeptide then associates with other molecules to form the characteristic triple helix. This process is accompanied by the removal of the globular amino terminal and carboxyl terminal extensions of the precursor polypeptide by selective proteolysis. Certain lysyl residues are modified by lysyl oxidase, a copper-containing protein that converts ε-amino groups to aldehydes. The aldehydes can either undergo an aldol condensation to form a C=C double bond or form a Schiff base (eneimine) with the ε-amino group of an unmodified lysyl residue, which is subsequently reduced to form a C–N single bond. These covalent bonds cross-link the individual polypeptides and imbue the fiber with exceptional strength and rigidity.
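The Gly-X-Y repeat described above for the collagen triple helix lends itself to a quick computational check. The following is a minimal sketch, not a standard tool: the function names are hypothetical, the example peptide is invented for illustration, and the letter "O" is assumed to stand for hydroxyproline (a convention common in collagen-peptide work but not used in the text). It reports how often glycine occupies every third position in each of the three possible reading frames.

```python
from typing import Tuple


def glycine_repeat_fraction(sequence: str, frame: int = 0) -> float:
    """Fraction of every-third positions, starting at `frame`, that hold glycine."""
    seq = sequence.upper()
    positions = range(frame, len(seq), 3)
    if len(positions) == 0:
        return 0.0
    hits = sum(1 for i in positions if seq[i] == "G")
    return hits / len(positions)


def best_frame(sequence: str) -> Tuple[int, float]:
    """Return (frame, fraction) for the frame with the highest glycine occupancy."""
    candidates = [(f, glycine_repeat_fraction(sequence, f)) for f in range(3)]
    # max() keeps the first frame when two frames tie.
    return max(candidates, key=lambda pair: pair[1])


# Invented collagen-like peptide (assumption): Gly at every third residue,
# with Pro (P) or hydroxyproline (O, assumed symbol) frequent at X and Y.
example_peptide = "GPOGPOGAOGPKGEO"
frame, fraction = best_frame(example_peptide)
print(f"best frame: {frame}, glycine occupancy: {fraction:.2f}")
```

A sequence that obeys the Gly-X-Y rule gives an occupancy of 1.0 in exactly one frame and close to zero in the other two; a globular protein would typically show no such frame preference.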
Figure 5–10. Primary, secondary, and tertiary structures of collagen.

Nutritional & Genetic Disorders Can Impair Collagen Maturation

The complex series of events in collagen maturation provides a model that illustrates the biologic consequences of incomplete polypeptide maturation. The best-known defect in collagen biosynthesis is scurvy, a result of a dietary deficiency of vitamin C required by prolyl and lysyl hydroxylases. The resulting deficit in the number of hydroxyproline and hydroxylysine residues undermines the conformational stability of collagen fibers, leading to bleeding gums, swelling joints, poor wound healing, and ultimately to death. Menkes’ syndrome, characterized by kinky hair and growth retardation, reflects a deficiency of the copper required by lysyl oxidase, which catalyzes a key step in formation of the covalent cross-links that strengthen collagen fibers.

Genetic disorders of collagen biosynthesis include several forms of osteogenesis imperfecta, characterized by fragile bones. In Ehlers-Danlos syndrome, a group of connective tissue disorders that involve impaired integrity of supporting structures, defects in the genes that encode α collagen-1, procollagen N-peptidase, or lysyl hydroxylase result in mobile joints and skin abnormalities.

SUMMARY

• Proteins may be classified on the basis of solubility, shape, or function or of the presence of a prosthetic group such as heme. Proteins perform complex physical and catalytic functions by positioning specific chemical groups in a precise three-dimensional arrangement that is both functionally efficient and physically strong.
• The gene-encoded primary structure of a polypeptide is the sequence of its amino acids. Its secondary structure results from folding of polypeptides into hydrogen-bonded motifs such as the α helix, the β-pleated sheet, β bends, and loops. Combinations of these motifs can form supersecondary motifs.
• Tertiary structure concerns the relationships between secondary structural domains. Quaternary structure of proteins with two or more polypeptides (oligomeric proteins) is a feature based on the spatial relationships between various types of polypeptides.
• Primary structures are stabilized by covalent peptide bonds. Higher orders of structure are stabilized by weak forces: multiple hydrogen bonds, salt (electrostatic) bonds, and association of hydrophobic R groups.
• The phi (Φ) angle of a polypeptide is the angle about the Cα–N bond; the psi (Ψ) angle is that about the Cα–Co bond. Most combinations of phi-psi angles are disallowed due to steric hindrance. The phi-psi angles that form the α helix and the β sheet fall within the lower and upper left-hand quadrants of a Ramachandran plot, respectively.
• Protein folding is a poorly understood process. Broadly speaking, short segments of newly synthesized polypeptide fold into secondary structural units. Forces that bury hydrophobic regions from solvent then drive the partially folded polypeptide into a “molten globule” in which the modules of secondary structure are rearranged to give the native conformation of the protein.
• Proteins that assist folding include protein disulfide isomerase, proline-cis,trans-isomerase, and the chaperones that participate in the folding of over half of mammalian proteins. Chaperones shield newly synthesized polypeptides from solvent and provide an environment for elements of secondary structure to emerge and coalesce into molten globules.
• Techniques for study of higher orders of protein structure include x-ray crystallography, NMR spectroscopy, analytical ultracentrifugation, gel filtration, and gel electrophoresis.
• Silk fibroin and collagen illustrate the close linkage of protein structure and biologic function. Diseases of collagen maturation include Ehlers-Danlos syndrome and the vitamin C deficiency disease scurvy.
• Prions (protein particles that lack nucleic acid) cause fatal transmissible spongiform encephalopathies such as Creutzfeldt-Jakob disease, scrapie, and bovine spongiform encephalopathy. Prion diseases involve an altered secondary-tertiary structure of a naturally occurring protein, PrPc. When PrPc interacts with its pathologic isoform PrPSc, its conformation is transformed from a predominantly α-helical structure to the β-sheet structure characteristic of PrPSc.

Chapter 6. Proteins: Myoglobin & Hemoglobin

Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

The heme proteins myoglobin and hemoglobin maintain a supply of oxygen essential for oxidative metabolism. Myoglobin, a monomeric protein of red muscle, stores oxygen as a reserve against oxygen deprivation. Hemoglobin, a tetrameric protein of erythrocytes, transports O2 to the tissues and returns CO2 and protons to the lungs. Cyanide and carbon monoxide kill because they disrupt the physiologic function of the heme proteins cytochrome oxidase and hemoglobin, respectively. The secondary-tertiary structure of the subunits of hemoglobin resembles that of myoglobin. However, the tetrameric structure of hemoglobin permits cooperative interactions that are central to its function. For example, 2,3-bisphosphoglycerate (BPG) promotes the efficient release of O2 by stabilizing the quaternary structure of deoxyhemoglobin. Hemoglobin and myoglobin illustrate both protein structure-function relationships and the molecular basis of genetic diseases such as sickle cell disease and the thalassemias.

HEME & FERROUS IRON CONFER THE ABILITY TO STORE & TO TRANSPORT OXYGEN

Myoglobin and hemoglobin contain heme, a cyclic tetrapyrrole consisting of four molecules of pyrrole linked by α-methylene bridges. This planar network of conjugated double bonds absorbs visible light and colors heme deep red. The substituents at the β-positions of heme are methyl (M), vinyl (V), and propionate (Pr) groups arranged in the order M, V, M, V, M, Pr, Pr, M (Figure 6–1). One atom of ferrous iron (Fe2+) resides at the center of the planar tetrapyrrole. Other proteins with metal-containing tetrapyrrole prosthetic groups include the cytochromes (Fe and Cu) and chlorophyll (Mg) (see Chapter 12). Oxidation and reduction of the Fe and Cu atoms of cytochromes is essential to their biologic function as carriers of electrons. By contrast, oxidation of the Fe2+ of myoglobin or hemoglobin to Fe3+ destroys their biologic activity.

Myoglobin Is Rich in α Helix

Oxygen stored in red muscle myoglobin is released during O2 deprivation (eg, severe exercise) for use in muscle mitochondria for aerobic synthesis of ATP (see Chapter 12). A 153-aminoacyl residue polypeptide (MW 17,000), myoglobin folds into a compact shape that measures 4.5 × 3.5 × 2.5 nm (Figure 6–2). An unusually high proportion of the residues, about 75%, is present in eight right-handed, 7–20 residue α helices. Starting at the amino terminal, these are termed helices A–H. Typical of globular proteins, the surface of myoglobin is polar, while the interior, with only two exceptions, contains only nonpolar residues such as Leu, Val, Phe, and Met. The exceptions are His E7 and His F8, the seventh and eighth residues in helices E and F, respectively, which lie close to the heme iron where they function in O2 binding.

Histidines F8 & E7 Perform Unique Roles in Oxygen Binding

The heme of myoglobin lies in a crevice between helices E and F oriented with its polar propionate groups facing the surface of the globin (Figure 6–2). The remainder resides in the nonpolar interior. The fifth coordination position of the iron is linked to a ring nitrogen of the proximal histidine, His F8. The distal histidine, His E7, lies on the side of the heme ring opposite to His F8.

The Iron Moves Toward the Plane of the Heme When Oxygen Is Bound

The iron of unoxygenated myoglobin lies 0.03 nm (0.3 Å) outside the plane of the heme ring, toward His F8. The heme therefore “puckers” slightly. When O2 occupies the sixth coordination position, the iron moves to within 0.01 nm (0.1 Å) of the plane of the heme ring. Oxygenation of myoglobin thus is accompanied by motion of the iron, of His F8, and of residues linked to His F8.
Figure 6–1. Heme. The pyrrole rings and methylene bridge carbons are coplanar, and the iron atom (Fe2+) resides in almost the same plane. The fifth and sixth coordination positions of Fe2+ are directed perpendicular to the plane of the heme ring, directly above and below it. Observe the nature of the substituent groups on the β carbons of the pyrrole rings, the central iron atom, and the location of the polar side of the heme ring (at about 7 o’clock) that faces the surface of the myoglobin molecule.

Apomyoglobin Provides a Hindered Environment for Heme Iron

When O2 binds to myoglobin, the bond between the first oxygen atom and the Fe2+ is perpendicular to the plane of the heme ring. The bond linking the first and second oxygen atoms lies at an angle of 121 degrees to the plane of the heme, orienting the second oxygen away from the distal histidine (Figure 6–3, left). Isolated heme binds carbon monoxide (CO) 25,000 times more strongly than oxygen. Since CO is present in small quantities in the atmosphere and arises in cells from the catabolism of heme, why is it that CO does not completely displace O2 from heme iron? The accepted explanation is that the apoproteins of myoglobin and hemoglobin create a hindered environment. While CO can bind to isolated heme in its preferred orientation, ie, with all three atoms (Fe, C, and O) perpendicular to the plane of the heme, in myoglobin and hemoglobin the distal histidine sterically precludes this orientation. Binding at a less favored angle reduces the strength of the heme-CO bond to about 200 times that of the heme-O2 bond (Figure 6–3, right), at which level the great excess of O2 over CO normally present dominates. Nevertheless, about 1% of myoglobin typically is present combined with carbon monoxide.
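The competition between CO and O2 described above can be put into rough numbers. The sketch below rests on a simplifying assumption that is not stated in the text: each heme site is always occupied by either O2 or CO, so the bound CO:O2 ratio equals the relative affinity multiplied by the CO:O2 ratio in solution. The function and parameter names are illustrative; only the affinity ratios (about 200 in the protein, 25,000 for isolated heme) and the 1% figure come from the text.

```python
def co_bound_fraction(relative_affinity: float, co_to_o2_ratio: float) -> float:
    """Fraction of heme sites carrying CO, assuming every site holds O2 or CO."""
    if relative_affinity <= 0 or co_to_o2_ratio < 0:
        raise ValueError("affinity must be positive and the ratio non-negative")
    x = relative_affinity * co_to_o2_ratio
    return x / (1.0 + x)


def co_to_o2_ratio_for_fraction(relative_affinity: float, fraction: float) -> float:
    """CO:O2 ratio needed to reach a given CO-bound fraction (inverse of the above)."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must lie in [0, 1)")
    return fraction / ((1.0 - fraction) * relative_affinity)


# CO:O2 ratio implied by about 1% CO-bound protein at a 200-fold preference.
ratio = co_to_o2_ratio_for_fraction(200.0, 0.01)

# The same ratio applied to isolated heme (25,000-fold preference).
unhindered = co_bound_fraction(25_000.0, ratio)
print(f"implied CO:O2 ratio: {ratio:.2e}")
print(f"CO-bound fraction without the hindered pocket: {unhindered:.2f}")
```

Under these assumptions, a CO:O2 ratio of roughly 5 in 100,000 accounts for 1% CO-bound protein, whereas the same ratio would leave more than half of unhindered heme sites occupied by CO, which illustrates the protective value of the distal histidine.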
Figure 6–2. A model of myoglobin at low resolution. Only the α-carbon atoms are shown. The α-helical regions are named A through H. (Based on Dickerson RE in: The Proteins, 2nd ed. Vol 2. Neurath H [editor]. Academic Press, 1964. Reproduced with permission.)

Figure 6–3. Angles for bonding of oxygen and carbon monoxide to the heme iron of myoglobin. The distal E7 histidine hinders bonding of CO in its preferred linear (180 degree Fe–C–O) geometry, perpendicular to the plane of the heme ring.

THE OXYGEN DISSOCIATION CURVES FOR MYOGLOBIN & HEMOGLOBIN SUIT THEIR PHYSIOLOGIC ROLES

Why is myoglobin unsuitable as an O2 transport protein but well suited for O2 storage? The relationship between the concentration, or partial pressure, of O2 (PO2) and the quantity of O2 bound is expressed as an O2 saturation isotherm (Figure 6–4). The oxygen-binding curve for myoglobin is hyperbolic. Myoglobin therefore loads O2 readily at the PO2 of the lung capillary bed (100 mm Hg). However, since myoglobin releases only a small fraction of its bound O2 at the PO2 values typically encountered in active muscle (20 mm Hg) or other tissues (40 mm Hg), it represents an ineffective vehicle for delivery of O2. However, when strenuous exercise lowers the PO2 of muscle tissue to about 5 mm Hg, myoglobin releases O2 for mitochondrial synthesis of ATP, permitting continued muscular activity.

Figure 6–4. Oxygen-binding curves of both hemoglobin and myoglobin (percent saturation versus gaseous pressure of oxygen in mm Hg). Arterial oxygen tension is about 100 mm Hg; mixed venous oxygen tension is about 40 mm Hg; capillary (active muscle) oxygen tension is about 20 mm Hg; and the minimum oxygen tension required for cytochrome oxidase is about 5 mm Hg. Association of chains into a tetrameric structure (hemoglobin) results in much greater oxygen delivery than would be possible with single chains. (Modified, with permission, from Scriver CR et al [editors]: The Molecular and Metabolic Bases of Inherited Disease, 7th ed. McGraw-Hill, 1995.)

THE ALLOSTERIC PROPERTIES OF HEMOGLOBINS RESULT FROM THEIR QUATERNARY STRUCTURES

The properties of individual hemoglobins are consequences of their quaternary as well as of their secondary and tertiary structures. The quaternary structure of hemoglobin confers striking additional properties, absent from monomeric myoglobin, which adapt it to its unique biologic roles. The allosteric (Gk allos “other,” steros “space”) properties of hemoglobin provide, in addition, a model for understanding other allosteric proteins (see Chapter 11).

Hemoglobin Is Tetrameric

Hemoglobins are tetramers composed of pairs of two different polypeptide subunits. Greek letters are used to designate each subunit type. The subunit compositions of the principal hemoglobins are α2β2 (HbA; normal adult hemoglobin), α2γ2 (HbF; fetal hemoglobin), α2S2 (HbS; sickle cell hemoglobin), and α2δ2 (HbA2; a minor adult hemoglobin). The primary structures of the β, γ, and δ chains of human hemoglobin are highly conserved.

Myoglobin & the Subunits of Hemoglobin Share Almost Identical Secondary and Tertiary Structures

Despite differences in the kind and number of amino acids present, myoglobin and the β polypeptide of hemoglobin A have almost identical secondary and tertiary structures. Similarities include the location of the heme and the eight helical regions and the presence of amino acids with similar properties at comparable locations. Although it possesses seven rather than eight helical regions, the α polypeptide of hemoglobin also closely resembles myoglobin.

Oxygenation of Hemoglobin Triggers Conformational Changes in the Apoprotein

Hemoglobins bind four molecules of O2 per tetramer, one per heme. A molecule of O2 binds to a hemoglobin tetramer more readily if other O2 molecules are already bound (Figure 6–4). Termed cooperative binding, this phenomenon permits hemoglobin to maximize both the quantity of O2 loaded at the PO2 of the lungs and the quantity of O2 released at the PO2 of the peripheral tissues. Cooperative interactions, an exclusive property of multimeric proteins, are critically important to aerobic life.

P50 Expresses the Relative Affinities of Different Hemoglobins for Oxygen

The quantity P50, a measure of O2 concentration, is the partial pressure of O2 that half-saturates a given hemoglobin. Depending on the organism, P50 can vary widely, but in all instances it will exceed the PO2 of the peripheral tissues. For example, values of P50 for HbA and fetal HbF are 26 and 20 mm Hg, respectively. In the placenta, this difference enables HbF to extract oxygen from the HbA in the mother’s blood. However, HbF is suboptimal postpartum since its high affinity for O2 dictates that it can deliver less O2 to the tissues.

The subunit composition of hemoglobin tetramers undergoes complex changes during development. The human fetus initially synthesizes a ζ2ε2 tetramer. By the end of the first trimester, ζ and ε subunits have been replaced by α and γ subunits, forming HbF (α2γ2), the hemoglobin of late fetal life.
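The contrast between hyperbolic and sigmoidal binding, and the meaning of P50, can be explored with the empirical Hill equation, Y = PO2^n / (P50^n + PO2^n). This equation does not appear in the text and is introduced here as an assumption: with n = 1 it gives a hyperbola, as for myoglobin, and with n greater than 1 a sigmoid, as for hemoglobin. The P50 values for HbA (26 mm Hg) and HbF (20 mm Hg) are those quoted above; the Hill coefficient of 2.8 for hemoglobin and the myoglobin P50 of 2.8 mm Hg are typical literature-style figures assumed for illustration only.

```python
def saturation(po2: float, p50: float, n: float = 1.0) -> float:
    """Fractional O2 saturation from the Hill equation (empirical model)."""
    if po2 < 0 or p50 <= 0 or n <= 0:
        raise ValueError("po2 must be >= 0; p50 and n must be > 0")
    return po2 ** n / (p50 ** n + po2 ** n)


def fraction_delivered(p50: float, n: float,
                       po2_lungs: float = 100.0,
                       po2_tissue: float = 20.0) -> float:
    """O2 unloaded between lung and tissue PO2, as a fraction of binding sites."""
    return saturation(po2_lungs, p50, n) - saturation(po2_tissue, p50, n)


# P50 values for HbA and HbF are from the text; n and the myoglobin P50
# are assumptions made for this sketch.
carriers = {
    "HbA": {"p50": 26.0, "n": 2.8},
    "HbF": {"p50": 20.0, "n": 2.8},
    "myoglobin": {"p50": 2.8, "n": 1.0},
}

for name, params in carriers.items():
    delivered = fraction_delivered(params["p50"], params["n"])
    print(f"{name}: delivers {delivered:.0%} of bound O2 between 100 and 20 mm Hg")
```

Under these assumptions HbA unloads well over half of its O2 between the lungs and active muscle, whereas myoglobin unloads only about a tenth. At any intermediate PO2, HbF is more saturated than HbA, consistent with transfer of O2 from maternal to fetal blood.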
While synthesis of β subunits begins in the third trimester, β subunits do not completely replace γ subunits to yield adult HbA (α2β2) until some weeks postpartum (Figure 6–5).

Figure 6–5. Developmental pattern of the quaternary structure of fetal and newborn hemoglobins, plotted as globin chain synthesis (% of total) against gestation and postnatal age in months for the α, γ (fetal), β (adult), δ, and ε and ζ (embryonic) chains. (Reproduced, with permission, from Ganong WF: Review of Medical Physiology, 20th ed. McGraw-Hill, 2001.)

Oxygenation of Hemoglobin Is Accompanied by Large Conformational Changes

The binding of the first O2 molecule to deoxyHb shifts the heme iron towards the plane of the heme ring from a position about 0.06 nm beyond it (Figure 6–6). This motion is transmitted to the proximal (F8) histidine and to the residues attached thereto, which in turn causes the rupture of salt bridges between the carboxyl terminal residues of all four subunits. As a consequence, one pair of α/β subunits rotates 15 degrees with respect to the other, compacting the tetramer (Figure 6–7). Profound changes in secondary, tertiary, and quaternary structure accompany the O2-induced transition of hemoglobin from the low-affinity T (taut) state to the high-affinity R (relaxed) state. These changes significantly increase the affinity of the remaining unoxygenated hemes for O2, as subsequent binding events require the rupture of fewer salt bridges (Figure 6–8). The terms T and R also are used to refer to the low-affinity and high-affinity conformations of allosteric enzymes, respectively.

Figure 6–6. The iron atom moves into the plane of the heme on oxygenation. Histidine F8 and its associated residues are pulled along with the iron atom. (Slightly modified and reproduced, with permission, from Stryer L: Biochemistry, 4th ed. Freeman, 1995.)

Figure 6–7. During transition of the T form to the R form of hemoglobin, one pair of subunits (α2/β2) rotates through 15 degrees relative to the other pair (α1/β1). The axis of rotation is eccentric, and the α2/β2 pair also shifts toward the axis somewhat. In the diagram, the unshaded α1/β1 pair is shown fixed while the colored α2/β2 pair both shifts and rotates.

Figure 6–8. Transition from the T structure to the R structure. In this model, salt bridges (thin lines) linking the subunits in the T structure break progressively as oxygen is added, and even those salt bridges that have not yet ruptured are progressively weakened (wavy lines). The transition from T to R does not take place after a fixed number of oxygen molecules have been bound but becomes more probable as each successive oxygen binds. The transition between the two structures is influenced by protons, carbon dioxide, chloride, and BPG; the higher their concentration, the more oxygen must be bound to trigger the transition. Fully oxygenated molecules in the T structure and fully deoxygenated molecules in the R structure are not shown because they are unstable. (Modified and redrawn, with permission, from Perutz MF: Hemoglobin structure and respiratory transport. Sci Am [Dec] 1978;239:92.)

After Releasing O2 at the Tissues, Hemoglobin Transports CO2 & Protons to the Lungs

In addition to transporting O2 from the lungs to peripheral tissues, hemoglobin transports CO2, the byproduct of respiration, and protons from peripheral tissues to the lungs. Hemoglobin carries CO2 as carbamates formed with the amino terminal nitrogens of the polypeptide chains.

CO2 + Hb–NH3+ ⇌ 2H+ + Hb–NH–COO−

Carbamates change the charge on amino terminals from positive to negative, favoring salt bond formation between the α and β chains.

Hemoglobin carbamates account for about 15% of the CO2 in venous blood. Much of the remaining CO2 is carried as bicarbonate, which is formed in erythrocytes by the hydration of CO2 to carbonic acid (H2CO3), a process catalyzed by carbonic anhydrase. At the pH of venous blood, H2CO3 dissociates into bicarbonate and a proton.

Deoxyhemoglobin binds one proton for every two O2 molecules released, contributing significantly to the buffering capacity of blood. The somewhat lower pH of peripheral tissues, aided by carbamation, stabilizes the T state and thus enhances the delivery of O2. In the lungs, the process reverses. As O2 binds to deoxyhemoglobin, protons are released and combine with bicarbonate to form carbonic acid. Dehydration of H2CO3, catalyzed by carbonic anhydrase, forms CO2, which is exhaled. Binding of oxygen thus drives the exhalation of CO2 (Figure 6–9). This reciprocal coupling of proton and O2 binding is termed the Bohr effect. The Bohr effect is dependent upon cooperative interactions between the hemes of the hemoglobin tetramer. Myoglobin, a monomer, exhibits no Bohr effect.

Figure 6–9. The Bohr effect. Carbon dioxide generated in peripheral tissues (by the Krebs cycle) combines with water to form carbonic acid, which dissociates into protons and bicarbonate ions. Deoxyhemoglobin acts as a buffer by binding protons and delivering them to the lungs. In the lungs, the uptake of oxygen by hemoglobin releases protons that combine with bicarbonate ion, forming carbonic acid, which when dehydrated by carbonic anhydrase becomes carbon dioxide, which then is exhaled.

Protons Arise From Rupture of Salt Bonds When O2 Binds

Protons responsible for the Bohr effect arise from rupture of salt bridges during the binding of O2 to T state hemoglobin. Conversion to the oxygenated R state breaks salt bridges involving β-chain residue His 146. The subsequent dissociation of protons from His 146 drives the conversion of bicarbonate to carbonic acid (Figure 6–9). Upon the release of O2, the T structure and its salt bridges re-form. This conformational change increases the pKa of the β-chain His 146 residues, which bind protons. By facilitating the re-formation of salt bridges, an increase in proton concentration enhances the release of O2 from oxygenated (R state) hemoglobin. Conversely, an increase in PO2 promotes proton release.

2,3-Bisphosphoglycerate (BPG) Stabilizes the T Structure of Hemoglobin

A low PO2 in peripheral tissues promotes the synthesis in erythrocytes of 2,3-bisphosphoglycerate (BPG) from the glycolytic intermediate 1,3-bisphosphoglycerate.

The hemoglobin tetramer binds one molecule of BPG in the central cavity formed by its four subunits. However, the space between the H helices of the β chains lining the cavity is sufficiently wide to accommodate BPG only when hemoglobin is in the T state. BPG forms salt bridges with the terminal amino groups of both β chains via Val NA1 and with Lys EF6 and His H21 (Figure 6–10). BPG therefore stabilizes deoxygenated (T state) hemoglobin by forming additional salt bridges that must be broken prior to conversion to the R state.

Residue H21 of the γ subunit of fetal hemoglobin (HbF) is Ser rather than His. Since Ser cannot form a salt bridge, BPG binds more weakly to HbF than to HbA. The lower stabilization afforded to the T state by BPG accounts for HbF having a higher affinity for O2 than HbA.

Figure 6–10. Mode of binding of 2,3-bisphosphoglycerate to human deoxyhemoglobin. BPG interacts with three positively charged groups on each β chain (the α-NH3+ of Val NA1, Lys EF6, and His H21). (Based on Arnone A: X-ray diffraction study of binding of 2,3-diphosphoglycerate to human deoxyhemoglobin. Nature 1972;237:146. Reproduced with permission.)

Adaptation to High Altitude

Physiologic changes that accompany prolonged exposure to high altitude include an increase in the number of erythrocytes and in their concentrations of hemoglobin and of BPG. Elevated BPG lowers the affinity of HbA for O2 (increases P50), which enhances release of O2 at the tissues.

NUMEROUS MUTANT HUMAN HEMOGLOBINS HAVE BEEN IDENTIFIED

Mutations in the genes that encode the α or β subunits of hemoglobin potentially can affect its biologic function. However, almost all of the over 800 known mutant human hemoglobins are both extremely rare and benign, presenting no clinical abnormalities. When a mutation does compromise biologic function, the condition is termed a hemoglobinopathy. The URL http://globin.cse.psu.edu/ (Globin Gene Server) provides information about, and links for, normal and mutant hemoglobins.

Methemoglobin & Hemoglobin M

In methemoglobinemia, the heme iron is ferric rather than ferrous. Methemoglobin thus can neither bind nor transport O2. Normally, the enzyme methemoglobin reductase reduces the Fe3+ of methemoglobin to Fe2+. Methemoglobin can arise by oxidation of Fe2+ to Fe3+ as a side effect of agents such as sulfonamides, from hereditary hemoglobin M, or consequent to reduced activity of the enzyme methemoglobin reductase.

In hemoglobin M, histidine F8 (His F8) has been replaced by tyrosine. The iron of HbM forms a tight ionic complex with the phenolate anion of tyrosine that stabilizes the Fe3+ form. In α-chain hemoglobin M variants, the R-T equilibrium favors the T state. Oxygen affinity is reduced, and the Bohr effect is absent. β-Chain hemoglobin M variants exhibit R-T switching, and the Bohr effect is therefore present.

Mutations (eg, hemoglobin Chesapeake) that favor the R state increase O2 affinity. These hemoglobins therefore fail to deliver adequate O2 to peripheral tissues. The resulting tissue hypoxia leads to polycythemia, an increased concentration of erythrocytes.

Hemoglobin S

In HbS, the nonpolar amino acid valine has replaced the polar surface residue Glu6 of the β subunit, generating a hydrophobic “sticky patch” on the surface of the β subunit of both oxyHbS and deoxyHbS. Both HbA and HbS contain a complementary sticky patch on their surfaces that is exposed only in the deoxygenated, T state. Thus, at low PO2, deoxyHbS can polymerize to form long, insoluble fibers. Binding of deoxyHbA terminates fiber polymerization, since HbA lacks the second sticky patch necessary to bind another Hb molecule (Figure 6–11). These twisted helical fibers distort the erythrocyte into a characteristic sickle shape, rendering it vulnerable to lysis in the interstices of the splenic sinusoids. They also cause multiple secondary clinical effects. A low PO2 such as that at high altitudes exacerbates the tendency to polymerize.

Figure 6–11. Representation of the sticky patch on hemoglobin S and its “receptor” on deoxyhemoglobin A and deoxyhemoglobin S. The complementary surfaces allow deoxyhemoglobin S to polymerize into a fibrous structure, but the presence of deoxyhemoglobin A will terminate the polymerization by failing to provide sticky patches. (Modified and reproduced, with permission, from Stryer L: Biochemistry, 4th ed. Freeman, 1995.)
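The single-residue change that defines HbS can be written out explicitly. The sketch below applies the β6 Glu to Val substitution to the first eight residues of the mature β chain, Val-His-Leu-Thr-Pro-Glu-Glu-Lys (one-letter code VHLTPEEK). That octapeptide is supplied here for illustration and is not quoted in the text, and the helper name `substitute` is hypothetical.

```python
def substitute(sequence: str, position: int, expected: str, replacement: str) -> str:
    """Replace the residue at a 1-based position, checking the original residue first."""
    index = position - 1
    if not 0 <= index < len(sequence):
        raise IndexError(f"position {position} is outside the sequence")
    if sequence[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# N-terminal octapeptide of the mature HbA beta chain (assumption, see lead-in).
HBA_BETA_N_TERMINUS = "VHLTPEEK"

# HbS: valine (V) replaces glutamate (E) at position 6 of the beta chain.
hbs_beta_n_terminus = substitute(HBA_BETA_N_TERMINUS, 6, "E", "V")
print(HBA_BETA_N_TERMINUS, "->", hbs_beta_n_terminus)
```

Checking the expected residue before substituting guards against off-by-one numbering errors, which are easy to make because residue numbering conventionally starts at 1 and, for the globins, excludes the initiator methionine.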
BIOMEDICAL IMPLICATIONS

Myoglobinuria

Following massive crush injury, myoglobin released from damaged muscle fibers colors the urine dark red. Myoglobin can be detected in plasma following a myocardial infarction, but assay of serum enzymes (see Chapter 7) provides a more sensitive index of myocardial injury.

Anemias

Anemias, reductions in the number of red blood cells or of hemoglobin in the blood, can reflect impaired synthesis of hemoglobin (eg, in iron deficiency; Chapter 51) or impaired production of erythrocytes (eg, in folic acid or vitamin B12 deficiency; Chapter 45). Diagnosis of anemias begins with spectroscopic measurement of blood hemoglobin levels.

Thalassemias

The genetic defects known as thalassemias result from the partial or total absence of one or more α or β chains of hemoglobin. Over 750 different mutations have been identified, but only three are common. Either the α chain (alpha thalassemias) or β chain (beta thalassemias) can be affected. A superscript indicates whether a subunit is completely absent (α0 or β0) or whether its synthesis is reduced (α+ or β+). Apart from marrow transplantation, treatment is symptomatic.

Certain mutant hemoglobins are common in many populations, and a patient may inherit more than one type. Hemoglobin disorders thus present a complex pattern of clinical phenotypes. The use of DNA probes for their diagnosis is considered in Chapter 40.

Glycosylated Hemoglobin (HbA1c)

When blood glucose enters the erythrocytes it glycosylates the ε-amino group of lysine residues and the amino terminals of hemoglobin. The fraction of hemoglobin glycosylated, normally about 5%, is proportionate to blood glucose concentration. Since the half-life of an erythrocyte is typically 60 days, the level of glycosylated hemoglobin (HbA1c) reflects the mean blood glucose concentration over the preceding 6–8 weeks. Measurement of HbA1c therefore provides valuable information for management of diabetes mellitus.

SUMMARY

• Myoglobin is monomeric; hemoglobin is a tetramer of two subunit types (α2β2 in HbA). Despite having different primary structures, myoglobin and the subunits of hemoglobin have nearly identical secondary and tertiary structures.
• Heme, an essentially planar, slightly puckered, cyclic tetrapyrrole, has a central Fe2+ linked to all four nitrogen atoms of the heme, to histidine F8, and, in oxyMb and oxyHb, also to O2.
• The O2-binding curve for myoglobin is hyperbolic, but for hemoglobin it is sigmoidal, a consequence of cooperative interactions in the tetramer. Cooperativity maximizes the ability of hemoglobin both to load O2 at the PO2 of the lungs and to deliver O2 at the PO2 of the tissues.
• Relative affinities of different hemoglobins for oxygen are expressed as P50, the PO2 that half-saturates them with O2. Hemoglobins saturate at the partial pressures of their respective respiratory organ, eg, the lung or placenta.
• On oxygenation of hemoglobin, the iron, histidine F8, and linked residues move toward the heme ring. Conformational changes that accompany oxygenation include rupture of salt bonds and loosening of quaternary structure, facilitating binding of additional O2.
• 2,3-Bisphosphoglycerate (BPG) in the central cavity of deoxyHb forms salt bonds with the β subunits that stabilize deoxyHb. On oxygenation, the central cavity contracts, BPG is extruded, and the quaternary structure loosens.
• Hemoglobin also functions in CO2 and proton transport from tissues to lungs. Release of O2 from oxyHb at the tissues is accompanied by uptake of protons due to raising of the pKa of histidine residues.
• In sickle cell hemoglobin (HbS), Val replaces the β6 Glu of HbA, creating a “sticky patch” that has a complement on deoxyHb (but not on oxyHb). DeoxyHbS polymerizes at low O2 concentrations, forming fibers that distort erythrocytes into sickle shapes.
• Alpha and beta thalassemias are anemias that result from reduced production of α and β subunits of HbA, respectively.

Chapter 7. Enzymes: Mechanism of Action

Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

with the ability to simultaneously conduct and independently control a broad spectrum of chemical processes.

BIOMEDICAL IMPORTANCE

Enzymes are biologic polymers that catalyze the chemical reactions which make life as we know it possible.
The presence and maintenance of a complete and balanced set of enzymes is essential for the breakdown of nutrients to supply energy and chemical building blocks; the assembly of those building blocks into proteins, DNA, membranes, cells, and tissues; and the harnessing of energy to power cell motility and muscle contraction. With the exception of a few catalytic RNA molecules, or ribozymes, the vast majority of enzymes are proteins. Deficiencies in the quantity or catalytic activity of key enzymes can result from genetic defects, nutritional deficits, or toxins. Defective enzymes can result from genetic mutations or infection by viral or bacterial pathogens (eg, Vibrio cholerae). Medical scientists address imbalances in enzyme activity by using pharmacologic agents to inhibit specific enzymes and are investigating gene therapy as a means to remedy deficits in enzyme level or function.

ENZYMES ARE EFFECTIVE & HIGHLY SPECIFIC CATALYSTS

The enzymes that catalyze the conversion of one or more compounds (substrates) into one or more different compounds (products) enhance the rates of the corresponding noncatalyzed reaction by factors of at least 10^6. Like all catalysts, enzymes are neither consumed nor permanently altered as a consequence of their participation in a reaction.

In addition to being highly efficient, enzymes are also extremely selective catalysts. Unlike most catalysts used in synthetic chemistry, enzymes are specific both for the type of reaction catalyzed and for a single substrate or a small set of closely related substrates. Enzymes are also stereospecific catalysts and typically catalyze reactions only of specific stereoisomers of a given compound: for example, D- but not L-sugars, L- but not D-amino acids. Since they bind substrates through at least “three points of attachment,” enzymes can even convert nonchiral substrates to chiral products.

ENZYMES ARE CLASSIFIED BY REACTION TYPE & MECHANISM

A system of enzyme nomenclature that is comprehensive, consistent, and at the same time easy to use has proved elusive. The common names for most enzymes derive from their most distinctive characteristic: their ability to catalyze a specific chemical reaction. In general, an enzyme’s name consists of a term that identifies the type of reaction catalyzed followed by the suffix -ase. For example, dehydrogenases remove hydrogen atoms, proteases hydrolyze proteins, and isomerases catalyze rearrangements in configuration. One or more modifiers usually precede this name. Unfortunately, while many modifiers name the specific substrate involved (xanthine oxidase), others identify the source of the enzyme (pancreatic ribonuclease), specify its mode of regulation (hormone-sensitive lipase), or name a distinguishing characteristic of its mechanism (a cysteine protease). When it was discovered that multiple forms of some enzymes existed, alphanumeric designators were added to distinguish between them (eg, RNA polymerase III; protein kinase Cβ).

To address the ambiguity and confusion arising from these inconsistencies in nomenclature and the continuing discovery of new enzymes, the International Union of Biochemists (IUB) developed a complex but unambiguous system of enzyme nomenclature. In the IUB system, each enzyme has a unique name and code number that reflect the type of reaction catalyzed and the substrates involved. Enzymes are grouped into six classes, each with several subclasses. For example, the enzyme commonly called “hexokinase” is designated “ATP:D-hexose-6-phosphotransferase E.C. 2.7.1.1.” This identifies hexokinase as a member of class 2 (transferases), subclass 7 (transfer of a phosphoryl group), sub-subclass 1 (alcohol is the phosphoryl acceptor). Finally, the term “hexose-6” indicates that the alcohol phosphorylated is that of carbon six of a hexose.
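The way an E.C. code number encodes class, subclass, and sub-subclass can be made concrete with a short sketch. The Python below is only an illustration, not part of the IUB system itself: the names `ECNumber`, `parse_ec`, and `class_name` are hypothetical helpers, and the class names are the six IUB classes named in this chapter.

```python
from typing import NamedTuple

# The six IUB enzyme classes, keyed by the first field of the code number.
IUB_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}


class ECNumber(NamedTuple):
    """Hypothetical container for the four fields of an E.C. code number."""
    enzyme_class: int
    subclass: int
    sub_subclass: int
    serial: int


def parse_ec(code: str) -> ECNumber:
    """Split a code such as 'E.C. 2.7.1.1' into its four numeric fields."""
    digits = code.upper().replace("E.C.", "").replace("EC", "").strip()
    parts = digits.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected four dot-separated integers, got {code!r}")
    ec = ECNumber(*(int(p) for p in parts))
    if ec.enzyme_class not in IUB_CLASSES:
        raise ValueError(f"unknown IUB class {ec.enzyme_class}")
    return ec


def class_name(ec: ECNumber) -> str:
    """Return the IUB class name for a parsed code number."""
    return IUB_CLASSES[ec.enzyme_class]


hexokinase = parse_ec("E.C. 2.7.1.1")
print(hexokinase.enzyme_class, class_name(hexokinase))  # class 2, transferases
print(hexokinase.subclass, hexokinase.sub_subclass)     # subclass 7, sub-subclass 1
```

Rejecting any code whose first field is not 1 through 6 is a design choice of this sketch; it catches codes that were garbled in transcription.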
Listed below are the six IUB classes of enzymes and the reactions they catalyze.

1. Oxidoreductases catalyze oxidations and reductions.
2. Transferases catalyze transfer of groups such as methyl or glycosyl groups from a donor molecule to an acceptor molecule.
3. Hydrolases catalyze the hydrolytic cleavage of C-C, C-O, C-N, P-O, and certain other bonds, including acid anhydride bonds.
4. Lyases catalyze cleavage of C-C, C-O, C-N, and other bonds by elimination, leaving double bonds, and also add groups to double bonds.
5. Isomerases catalyze geometric or structural changes within a single molecule.
6. Ligases catalyze the joining together of two molecules, coupled to the hydrolysis of a pyrophosphoryl group in ATP or a similar nucleoside triphosphate.

Despite the many advantages of the IUB system, texts tend to refer to most enzymes by their older and shorter, albeit sometimes ambiguous, names.

Figure 7–1 illustrates why the enzyme-catalyzed reduction of the nonchiral substrate pyruvate produces L-lactate rather than a racemic mixture of D- and L-lactate. The exquisite specificity of enzyme catalysts imbues living cells with the ability to simultaneously conduct and independently control a broad spectrum of chemical processes.

Figure 7–1. Planar representation of the “three-point attachment” of a substrate to the active site of an enzyme. Although atoms 1 and 4 are identical, once atoms 2 and 3 are bound to their complementary sites on the enzyme, only atom 1 can bind. Once bound to an enzyme, apparently identical atoms thus may be distinguishable, permitting a stereospecific chemical change.

PROSTHETIC GROUPS, COFACTORS, & COENZYMES PLAY IMPORTANT ROLES IN CATALYSIS

Many enzymes contain small nonprotein molecules and metal ions that participate directly in substrate binding or catalysis. Termed prosthetic groups, cofactors, and coenzymes, these extend the repertoire of catalytic capabilities beyond those afforded by the limited number of functional groups present on the aminoacyl side chains of peptides.

Prosthetic Groups Are Tightly Integrated Into an Enzyme’s Structure

Prosthetic groups are distinguished by their tight, stable incorporation into a protein’s structure by covalent or noncovalent forces. Examples include pyridoxal phosphate, flavin mononucleotide (FMN), flavin adenine dinucleotide (FAD), thiamin pyrophosphate, biotin, and the metal ions of Co, Cu, Mg, Mn, Se, and Zn. Metals are the most common prosthetic groups. The roughly one-third of all enzymes that contain tightly bound metal ions are termed metalloenzymes. Metal ions that participate in redox reactions generally are complexed to prosthetic groups such as heme (Chapter 6) or iron-sulfur clusters (Chapter 12). Metals also may facilitate the binding and orientation of substrates, the formation of covalent bonds with reaction intermediates (Co2+ in coenzyme B12), or interaction with substrates to render them more electrophilic (electron-poor) or nucleophilic (electron-rich).

Cofactors Associate Reversibly With Enzymes or Substrates

Cofactors serve functions similar to those of prosthetic groups but bind in a transient, dissociable manner either to the enzyme or to a substrate such as ATP. Unlike the stably associated prosthetic groups, cofactors therefore must be present in the medium surrounding the enzyme for catalysis to occur. The most common cofactors also are metal ions. Enzymes that require a metal ion cofactor are termed metal-activated enzymes to distinguish them from the metalloenzymes for which metal ions serve as prosthetic groups.

Coenzymes Serve as Substrate Shuttles

Coenzymes serve as recyclable shuttles, or group transfer reagents, that transport many substrates from their point of generation to their point of utilization. Association with the coenzyme also stabilizes substrates such as hydrogen atoms or hydride ions that are unstable in the aqueous environment of the cell. Other chemical moieties transported by coenzymes include methyl groups (folates), acyl groups (coenzyme A), and oligosaccharides (dolichol).

Many Coenzymes, Cofactors, & Prosthetic Groups Are Derivatives of B Vitamins

The water-soluble B vitamins supply important components of numerous coenzymes. Many coenzymes contain, in addition, the adenine, ribose, and phosphoryl moieties of AMP or ADP (Figure 7–2). Nicotinamide and riboflavin are components of the redox coenzymes NAD and NADP and FMN and FAD, respectively. Pantothenic acid is a component of the acyl group carrier coenzyme A. As its pyrophosphate, thiamin participates in decarboxylation of α-keto acids, and folic acid and cobamide coenzymes function in one-carbon metabolism.

Figure 7–2. Structure of NAD+ and NADP+. For NAD+, R = H. For NADP+, R = PO3^2−.

CATALYSIS OCCURS AT THE ACTIVE SITE

The extreme substrate specificity and high catalytic efficiency of enzymes reflect the existence of an environment that is exquisitely tailored to a single reaction. Termed the active site, this environment generally takes the form of a cleft or pocket. The active sites of multimeric enzymes often are located at the interface between subunits and recruit residues from more than one monomer. The three-dimensional active site both shields substrates from solvent and facilitates catalysis. Substrates bind to the active site at a region complementary to a portion of the substrate that will not undergo chemical change during the course of the reaction. This simultaneously aligns portions of the substrate that will undergo change with the chemical functional groups of peptidyl aminoacyl residues. The active site also binds and orients cofactors or prosthetic groups. Many aminoacyl residues drawn from diverse portions of the polypeptide chain (Figure 7–3) contribute to the extensive size and three-dimensional character of the active site.

Figure 7–3. Two-dimensional representation of a dipeptide substrate, glycyl-tyrosine, bound within the active site of carboxypeptidase A.

ENZYMES EMPLOY MULTIPLE MECHANISMS TO FACILITATE CATALYSIS

Four general mechanisms account for the ability of enzymes to achieve dramatic catalytic enhancement of the rates of chemical reactions.

Catalysis by Proximity

For molecules to react, they must come within bond-forming distance of one another. The higher their concentration, the more frequently they will encounter one another and the greater will be the rate of their reaction. When an enzyme binds substrate molecules in its active site, it creates a region of high local substrate concentration. This environment also orients the substrate molecules spatially in a position ideal for them to interact, resulting in rate enhancements of at least a thousandfold.

Acid-Base Catalysis

The ionizable functional groups of aminoacyl side chains and (where present) of prosthetic groups contribute to catalysis by acting as acids or bases. Acid-base catalysis can be either specific or general. By “specific” we mean only protons (H3O+) or OH− ions. In specific acid or specific base catalysis, the rate of reaction is sensitive to changes in the concentration of protons but independent of the concentrations of other acids (proton donors) or bases (proton acceptors) present in solution or at the active site. Reactions whose rates are responsive to all the acids or bases present are said to be subject to general acid or general base catalysis.

Catalysis by Strain

Enzymes that catalyze lytic reactions which involve breaking a covalent bond typically bind their substrates in a conformation slightly unfavorable for the bond that will undergo cleavage. The resulting strain stretches or distorts the targeted bond, weakening it and making it more vulnerable to cleavage.

Covalent Catalysis

The process of covalent catalysis involves the formation of a covalent bond between the enzyme and one or more substrates. The modified enzyme then becomes a reactant. Covalent catalysis introduces a new reaction pathway that is energetically more favorable, and therefore faster, than the reaction pathway in homogeneous solution. The chemical modification of the enzyme is, however, transient. On completion of the reaction, the enzyme returns to its original unmodified state. Its role thus remains catalytic. Covalent catalysis is particularly common among enzymes that catalyze group transfer reactions. Residues on the enzyme that participate in covalent catalysis generally are cysteine or serine and occasionally histidine. Covalent catalysis often follows a “ping-pong” mechanism, one in which the first substrate is bound and its product released prior to the binding of the second substrate (Figure 7–4).

Figure 7–4. Ping-pong mechanism for transamination. E-CHO and E-CH2NH2 represent the enzyme-pyridoxal phosphate and enzyme-pyridoxamine complexes, respectively. (Ala, alanine; Pyr, pyruvate; KG, α-ketoglutarate; Glu, glutamate).

SUBSTRATES INDUCE CONFORMATIONAL CHANGES IN ENZYMES

Early in the last century, Emil Fischer compared the highly specific fit between enzymes and their substrates to that of a lock and its key. While the “lock and key model” accounted for the exquisite specificity of enzyme-substrate interactions, the implied rigidity of the enzyme’s active site failed to account for the dynamic changes that accompany catalysis. This drawback was addressed by Daniel Koshland’s induced fit model, which states that when substrates approach and bind to an enzyme they induce a conformational change, a change analogous to placing a hand (substrate) into a glove (enzyme) (Figure 7–5). A corollary is that the enzyme induces reciprocal changes in its substrates, harnessing the energy of binding to facilitate the transformation of substrates into products. The induced fit model has been amply confirmed by biophysical studies of enzyme motion during substrate binding.

HIV PROTEASE ILLUSTRATES ACID-BASE CATALYSIS

Enzymes of the aspartic protease family, which includes the digestive enzyme pepsin, the lysosomal cathepsins, and the protease produced by the human immunodeficiency virus (HIV), share a common catalytic mechanism. Catalysis involves two conserved aspartyl residues which act as acid-base catalysts. In the first stage of the reaction, an aspartate functioning as a general base (Asp X, Figure 7–6) extracts a proton from a water molecule, making it more nucleophilic. This resulting nucleophile then attacks the electrophilic carbonyl carbon of the peptide bond targeted for hydrolysis, forming a tetrahedral transition state intermediate. A second aspartate (Asp Y, Figure 7–6) then facilitates the decomposition of this tetrahedral intermediate by donating a proton to the amino group produced by rupture of the peptide bond. Two different active site aspartates thus can act simultaneously as a general base or as a general acid. This is possible because their immediate environment favors ionization of one but not the other.

CHYMOTRYPSIN & FRUCTOSE-2,6-BISPHOSPHATASE ILLUSTRATE COVALENT CATALYSIS

Chymotrypsin

While catalysis by aspartic proteases involves the direct hydrolytic attack of water on a peptide bond, catalysis
by the serine protease chymotrypsin involves prior formation of a covalent acyl enzyme intermediate. A highly reactive seryl residue, serine 195, participates in a charge-relay network with histidine 57 and aspartate 102. Far apart in primary structure, in the active site these residues are within bond-forming distance of one another. Aligned in the order Asp 102-His 57-Ser 195, they constitute a “charge-relay network” that functions as a “proton shuttle.”

Binding of substrate initiates proton shifts that in effect transfer the hydroxyl proton of Ser 195 to Asp 102 (Figure 7–7). The enhanced nucleophilicity of the seryl oxygen facilitates its attack on the carbonyl carbon of the peptide bond of the substrate, forming a covalent acyl-enzyme intermediate. The hydrogen on Asp 102 then shuttles through His 57 to the amino group liberated when the peptide bond is cleaved. The portion of the original peptide with a free amino group then leaves the active site and is replaced by a water molecule. The charge-relay network now activates the water molecule by withdrawing a proton through His 57 to Asp 102. The resulting hydroxide ion attacks the acyl-enzyme intermediate and a reverse proton shuttle returns a proton to Ser 195, restoring its original state. While modified during the process of catalysis, chymotrypsin emerges unchanged on completion of the reaction. Trypsin and elastase employ a similar catalytic mechanism, but the numbers of the residues in their Ser-His-Asp proton shuttles differ.

Figure 7–5. Two-dimensional representation of Koshland’s induced fit model of the active site of a lyase. Binding of the substrate A-B induces conformational changes in the enzyme that align catalytic residues which participate in catalysis and strain the bond between A and B, facilitating its cleavage.

Figure 7–6. Mechanism for catalysis by an aspartic protease such as HIV protease. Curved arrows indicate directions of electron movement. (1) Aspartate X acts as a base to activate a water molecule by abstracting a proton. (2) The activated water molecule attacks the peptide bond, forming a transient tetrahedral intermediate. (3) Aspartate Y acts as an acid to facilitate breakdown of the tetrahedral intermediate and release of the split products by donating a proton to the newly formed amino group. Subsequent shuttling of the proton on Asp X to Asp Y restores the protease to its initial state.

Fructose-2,6-Bisphosphatase

Fructose-2,6-bisphosphatase, a regulatory enzyme of gluconeogenesis (Chapter 19), catalyzes the hydrolytic release of the phosphate on carbon 2 of fructose 2,6-bisphosphate. Figure 7–8 illustrates the roles of seven active site residues. Catalysis involves a “catalytic triad” of one Glu and two His residues and a covalent phosphohistidyl intermediate.

CATALYTIC RESIDUES ARE HIGHLY CONSERVED

Members of an enzyme family such as the aspartic or serine proteases employ a similar mechanism to catalyze a common reaction type but act on different substrates. Enzyme families appear to arise through gene duplication events that create a second copy of the gene which encodes a particular enzyme. The proteins encoded by the two genes can then evolve independently to recognize different substrates, resulting, for example, in chymotrypsin, which cleaves peptide bonds on the carboxyl terminal side of large hydrophobic amino acids; and trypsin, which cleaves peptide bonds on the carboxyl terminal side of basic amino acids.
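The difference in substrate recognition between trypsin and chymotrypsin can be illustrated with a toy digestion in Python. This is a simplified sketch under stated assumptions, not a model of real protease behavior: the residue sets (Lys and Arg for "basic", Phe, Trp, and Tyr for "large hydrophobic") are assumed for illustration, the peptide is a made-up sequence, and `digest` is a hypothetical helper that ignores every other influence on cleavage.

```python
# Assumed recognition sets, in one-letter amino acid code.
TRYPSIN_SITES = frozenset("KR")        # basic residues (assumption: Lys, Arg)
CHYMOTRYPSIN_SITES = frozenset("FWY")  # large hydrophobic (assumption: Phe, Trp, Tyr)


def digest(peptide, sites):
    """Cut on the carboxyl-terminal side of every residue found in `sites`."""
    fragments = []
    start = 0
    for i, residue in enumerate(peptide):
        # A recognized residue at the very end leaves nothing to cut off.
        if residue in sites and i < len(peptide) - 1:
            fragments.append(peptide[start:i + 1])
            start = i + 1
    fragments.append(peptide[start:])
    return fragments


peptide = "GAKFLRWTYE"  # hypothetical sequence, not taken from the text
print(digest(peptide, TRYPSIN_SITES))       # ['GAK', 'FLR', 'WTYE']
print(digest(peptide, CHYMOTRYPSIN_SITES))  # ['GAKF', 'LRW', 'TY', 'E']
```

The same peptide yields different fragments with each enzyme even though both cut the same kind of bond, which is the point made above about family members sharing a mechanism while acting on different substrates.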
The common Ser 195 ancestry of enzymes can be inferred from the presence Asp 102 His 57 of specific amino acids in the same position in each HOOC R2 family member. These residues are said to be conserved residues. Proteins that share a large number of con- 6 H N N H O served residues are said to be homologous to one an- O O other. Table 7–1 illustrates the primary structural con- Ser 195 servation of two components of the charge-relay Asp 102 His 57 network for several serine proteases. Among the most Figure 7–7. Catalysis by chymotrypsin. 1 The highly conserved residues are those that participate di- charge-relay system removes a proton from Ser 195, rectly in catalysis. making it a stronger nucleophile. 2 Activated Ser 195 attacks the peptide bond, forming a transient tetrahedral ISOZYMES ARE DISTINCT ENZYME intermediate. 3 Release of the amino terminal peptide is facilitated by donation of a proton to the newly FORMS THAT CATALYZE THE formed amino group by His 57 of the charge-relay sys- SAME REACTION tem, yielding an acyl-Ser 195 intermediate. 4 His 57 and Higher organisms often elaborate several physically dis- Asp 102 collaborate to activate a water molecule, which tinct versions of a given enzyme, each of which cat- attacks the acyl-Ser 195, forming a second tetrahedral in- alyzes the same reaction. Like the members of other termediate. 5 The charge-relay system donates a pro- protein families, these protein catalysts or isozymes ton to Ser 195, facilitating breakdown of tetrahedral in- arise through gene duplication. Isozymes may exhibit termediate to release the carboxyl terminal peptide 6 . subtle differences in properties such as sensitivity to ENZYMES: MECHANISM OF ACTION / 55 particular regulatory factors (Chapter 9) or substrate Lys 356 Lys 356 affinity (eg, hexokinase and glucokinase) that adapt Arg Arg + 352 + 352 them to specific tissues or circumstances. 
Some iso- 6– P + 6– P + zymes may also enhance survival by providing a “back- up” copy of an essential enzyme. 2– 2– Arg 307 Arg 307 – O – O H+ Glu + Glu + H P P 327 + 327 + THE CATALYTIC ACTIVITY OF ENZYMES His His 392 392 FACILITATES THEIR DETECTION Arg 257 His 258 1 Arg 257 His 258 2 The minute quantities of enzymes present in cells com- E • Fru-2,6-P2 E-P • Fru-6-P plicate determination of their presence and concentra- tion. However, the ability to rapidly transform thou- sands of molecules of a specific substrate into products Lys 356 Lys 356 imbues each enzyme with the ability to reveal its pres- Arg Arg + 352 + 352 ence. Assays of the catalytic activity of enzymes are fre- H + + quently used in research and clinical laboratories. H O Under appropriate conditions (see Chapter 8), the rate Arg 307 Arg 307 of the catalytic reaction being monitored is proportion- – – Glu + H P + Glu + Pi + ate to the amount of enzyme present, which allows its 327 327 + + concentration to be inferred. His His 392 392 Arg 257 His 258 3 Arg 257 His 258 4 Enzyme-Linked Immunoassays E-P • H2O E • Pi The sensitivity of enzyme assays can also be exploited to Figure 7–8. Catalysis by fructose-2,6-bisphos- detect proteins that lack catalytic activity. Enzyme- phatase. (1) Lys 356 and Arg 257, 307, and 352 stabilize linked immunoassays (ELISAs) use antibodies cova- the quadruple negative charge of the substrate by lently linked to a “reporter enzyme” such as alkaline charge-charge interactions. Glu 327 stabilizes the posi- phosphatase or horseradish peroxidase, enzymes whose tive charge on His 392. (2) The nucleophile His 392 at- products are readily detected. When serum or other tacks the C-2 phosphoryl group and transfers it to His samples to be tested are placed in a plastic microtiter 258, forming a phosphoryl-enzyme intermediate. Fruc- plate, the proteins adhere to the plastic surface and are tose 6-phosphate leaves the enzyme. (3) Nucleophilic immobilized. 
Any remaining absorbing areas of the well attack by a water molecule, possibly assisted by Glu 327 are then “blocked” by adding a nonantigenic protein such as bovine serum albumin. A solution of antibody acting as a base, forms inorganic phosphate. (4) Inor- covalently linked to a reporter enzyme is then added. ganic orthophosphate is released from Arg 257 and Arg The antibodies adhere to the immobilized antigen and 307. (Reproduced, with permission, from Pilkis SJ et al: 6- these are themselves immobilized. Excess free antibody Phosphofructo-2-kinase/fructose-2,6-bisphosphatase: A molecules are then removed by washing. The presence metabolic signaling enzyme. Annu Rev Biochem and quantity of bound antibody are then determined 1995;64:799.) by adding the substrate for the reporter enzyme. Table 7–1. Amino acid sequences in the neighborhood of the catalytic sites of several bovine proteases. Regions shown are those on either side of the catalytic site seryl (S) and histidyl (H) residues. Enzyme Sequence Around Serine S Sequence Around Histidine H Trypsin D S C Q D G S G G P V V C S G K V V S A A H C Y K S G Chymotrypsin A S S C M G D S G G P L V C K K N V V T A A H G G V T T Chymotrypsin B S S C M G D S G G P L V C Q K N V V T A A H C G V T T Thrombin D A C E G D S G G P F V M K S P V L T A A H C L L Y P 56 / CHAPTER 7 NAD(P)+-Dependent Dehydrogenases Are tated by the use of radioactive substrates. An alternative strategy is to devise a synthetic substrate whose product Assayed Spectrophotometrically absorbs light. For example, p-nitrophenyl phosphate is The physicochemical properties of the reactants in an an artificial substrate for certain phosphatases and for enzyme-catalyzed reaction dictate the options for the chymotrypsin that does not absorb visible light. How- assay of enzyme activity. 
Spectrophotometric assays ex- ever, following hydrolysis, the resulting p-nitrophen- ploit the ability of a substrate or product to absorb ylate anion absorbs light at 419 nm. light. The reduced coenzymes NADH and NADPH, Another quite general approach is to employ a “cou- written as NAD(P)H, absorb light at a wavelength of pled” assay (Figure 7–10). Typically, a dehydrogenase 340 nm, whereas their oxidized forms NAD(P)+ do not whose substrate is the product of the enzyme of interest (Figure 7–9). When NAD(P)+ is reduced, the ab- is added in catalytic excess. The rate of appearance or sorbance at 340 nm therefore increases in proportion disappearance of NAD(P)H then depends on the rate to—and at a rate determined by—the quantity of of the enzyme reaction to which the dehydrogenase has NAD(P)H produced. Conversely, for a dehydrogenase been coupled. that catalyzes the oxidation of NAD(P)H, a decrease in absorbance at 340 nm will be observed. In each case, the rate of change in optical density at 340 nm will be THE ANALYSIS OF CERTAIN ENZYMES proportionate to the quantity of enzyme present. AIDS DIAGNOSIS Many Enzymes Are Assayed by Coupling Of the thousands of different enzymes present in the human body, those that fulfill functions indispensable to a Dehydrogenase to cell vitality are present throughout the body tissues. The assay of enzymes whose reactions are not accompa- Other enzymes or isozymes are expressed only in spe- nied by a change in absorbance or fluorescence is gener- cific cell types, during certain periods of development, ally more difficult. In some instances, the product or re- or in response to specific physiologic or pathophysio- maining substrate can be transformed into a more logic changes. Analysis of the presence and distribution readily detected compound. 
of enzymes and isozymes (whose expression is normally tissue-, time-, or circumstance-specific) often aids diagnosis.

In other instances, the reaction product may have to be separated from unreacted substrate prior to measurement, a process facili-

Figure 7–9. Absorption spectra of NAD+ and NADH. Densities are for a 44 mg/L solution in a cell with a 1 cm light path. NADP+ and NADPH have spectra analogous to NAD+ and NADH, respectively. [Plot: optical density (0 to 1.0) versus wavelength (200 to 400 nm) for NAD+ and NADH.]

Figure 7–10. Coupled enzyme assay for hexokinase activity. The production of glucose 6-phosphate by hexokinase is coupled to the oxidation of this product by glucose-6-phosphate dehydrogenase in the presence of added enzyme and NADP+. When an excess of glucose-6-phosphate dehydrogenase is present, the rate of formation of NADPH, which can be measured at 340 nm, is governed by the rate of formation of glucose 6-phosphate by hexokinase. [Scheme: glucose → glucose 6-phosphate, catalyzed by hexokinase (ATP, Mg²⁺ → ADP, Mg²⁺); glucose 6-phosphate → 6-phosphogluconolactone, catalyzed by glucose-6-phosphate dehydrogenase (NADP+ → NADPH + H+).]

Nonfunctional Plasma Enzymes Aid Diagnosis & Prognosis

Certain enzymes, proenzymes, and their substrates are present at all times in the circulation of normal individuals and perform a physiologic function in the blood. Examples of these functional plasma enzymes include lipoprotein lipase, pseudocholinesterase, and the proenzymes of blood coagulation and blood clot dissolution (Chapters 9 and 51). The majority of these enzymes are synthesized in and secreted by the liver.

Plasma also contains numerous other enzymes that perform no known physiologic function in blood. These apparently nonfunctional plasma enzymes arise from the routine normal destruction of erythrocytes, leukocytes, and other cells. Tissue damage or necrosis resulting from injury or disease is generally accompanied by increases in the levels of several nonfunctional plasma enzymes. Table 7–2 lists several enzymes used in diagnostic enzymology.

Table 7–2. Principal serum enzymes used in clinical diagnosis. Many of the enzymes are not specific for the disease listed.

Serum Enzyme                                      Major Diagnostic Use
Aminotransferases
  Aspartate aminotransferase (AST, or SGOT)       Myocardial infarction
  Alanine aminotransferase (ALT, or SGPT)         Viral hepatitis
Amylase                                           Acute pancreatitis
Ceruloplasmin                                     Hepatolenticular degeneration (Wilson’s disease)
Creatine kinase                                   Muscle disorders and myocardial infarction
γ-Glutamyl transpeptidase                         Various liver diseases
Lactate dehydrogenase (isozymes)                  Myocardial infarction
Lipase                                            Acute pancreatitis
Phosphatase, acid                                 Metastatic carcinoma of the prostate
Phosphatase, alkaline (isozymes)                  Various bone disorders, obstructive liver diseases

Isozymes of Lactate Dehydrogenase Are Used to Detect Myocardial Infarctions

L-Lactate dehydrogenase is a tetrameric enzyme whose four subunits occur in two isoforms, designated H (for heart) and M (for muscle). The subunits can combine as shown below to yield catalytically active isozymes of L-lactate dehydrogenase:

Lactate Dehydrogenase Isozyme    Subunits
I1                               HHHH
I2                               HHHM
I3                               HHMM
I4                               HMMM
I5                               MMMM

Distinct genes whose expression is differentially regulated in various tissues encode the H and M subunits. Since heart expresses the H subunit almost exclusively, isozyme I1 predominates in this tissue. By contrast, isozyme I5 predominates in liver. Small quantities of lactate dehydrogenase are normally present in plasma. Following a myocardial infarction or in liver disease, the damaged tissues release characteristic lactate dehydrogenase isoforms into the blood. The resulting elevation in the levels of the I1 or I5 isozymes is detected by separating the different oligomers of lactate dehydrogenase by electrophoresis and assaying their catalytic activity (Figure 7–11).

ENZYMES FACILITATE DIAGNOSIS OF GENETIC DISEASES

While many diseases have long been known to result from alterations in an individual’s DNA, tools for the detection of genetic mutations have only recently become widely available. These techniques rely upon the catalytic efficiency and specificity of enzyme catalysts. For example, the polymerase chain reaction (PCR) relies upon the ability of enzymes to serve as catalytic amplifiers to analyze the DNA present in biologic and forensic samples. In the PCR technique, a thermostable DNA polymerase, directed by appropriate oligonucleotide primers, produces thousands of copies of a sample of DNA that was present initially at levels too low for direct detection.

The detection of restriction fragment length polymorphisms (RFLPs) facilitates prenatal detection of hereditary disorders such as sickle cell trait, beta-thalassemia, infant phenylketonuria, and Huntington’s disease. Detection of RFLPs involves cleavage of double-stranded DNA by restriction endonucleases, which can detect subtle alterations in DNA that affect their recognized sites. Chapter 40 provides further details concerning the use of PCR and restriction enzymes for diagnosis.

[Figure 7–11 scheme: lactate (SH2) is oxidized to pyruvate (S) by lactate dehydrogenase as NAD+ is reduced to NADH + H+; NADH reduces oxidized PMS, and reduced PMS reduces oxidized NBT (colorless) to reduced NBT (blue formazan). Electropherogram: lanes A (heart), B (normal), and C (liver), with bands numbered 5 to 1.]

Figure 7–11. Normal and pathologic patterns of lactate dehydrogenase (LDH) isozymes in human serum. LDH isozymes of serum were separated by electrophoresis and visualized using the coupled reaction scheme shown on the left (NBT, nitroblue tetrazolium; PMS, phenazine methylsulfate). At right is shown the stained electropherogram. Pattern A is serum from a patient with a myocardial infarct; B is normal serum; and C is serum from a patient with liver disease.
Arabic numerals denote specific LDH isozymes.

RECOMBINANT DNA PROVIDES AN IMPORTANT TOOL FOR STUDYING ENZYMES

Recombinant DNA technology has emerged as an important asset in the study of enzymes. Highly purified samples of enzymes are necessary for the study of their structure and function. The isolation of an individual enzyme, particularly one present in low concentration, from among the thousands of proteins present in a cell can be extremely difficult. If the gene for the enzyme of interest has been cloned, it generally is possible to produce large quantities of its encoded protein in Escherichia coli or yeast. However, not all animal proteins can be expressed in active form in microbial cells, nor do microbes perform certain posttranslational processing tasks. For these reasons, a gene may be expressed in cultured animal cell systems employing the baculovirus expression vector to transform cultured insect cells. For more details concerning recombinant DNA techniques, see Chapter 40.

Recombinant Fusion Proteins Are Purified by Affinity Chromatography

Recombinant DNA technology can also be used to create modified proteins that are readily purified by affinity chromatography. The gene of interest is linked to an oligonucleotide sequence that encodes a carboxyl or amino terminal extension to the encoded protein. The resulting modified protein, termed a fusion protein, contains a domain tailored to interact with a specific affinity support. One popular approach is to attach an oligonucleotide that encodes six consecutive histidine residues. The expressed “His tag” protein binds to chromatographic supports that contain an immobilized divalent metal ion such as Ni²⁺. Alternatively, the substrate-binding domain of glutathione S-transferase (GST) can serve as a “GST tag.” Figure 7–12 illustrates the purification of a GST-fusion protein using an affinity support containing bound glutathione. Fusion proteins also often encode a cleavage site for a highly specific protease such as thrombin in the region that links the two portions of the protein. This permits removal of the added fusion domain following affinity purification.

Figure 7–12. Use of glutathione S-transferase (GST) fusion proteins to purify recombinant proteins. (GSH, glutathione.) [Flow diagram: a plasmid encoding GST with a thrombin site (T) and cloned DNA encoding the enzyme are ligated together to give GST–T–Enzyme; cells are transfected, an inducing agent is added, and the cells are broken; the extract is applied to a glutathione (GSH) affinity column, where GST–T–Enzyme binds to GSH on the Sepharose bead; the protein is eluted with GSH and treated with thrombin to release the enzyme.]

Site-Directed Mutagenesis Provides Mechanistic Insights

Once the ability to express a protein from its cloned gene has been established, it is possible to employ site-directed mutagenesis to change specific aminoacyl residues by altering their codons. Used in combination with kinetic analyses and x-ray crystallography, this approach facilitates identification of the specific roles of given aminoacyl residues in substrate binding and catalysis. For example, the inference that a particular aminoacyl residue functions as a general acid can be tested by replacing it with an aminoacyl residue incapable of donating a proton.

SUMMARY

• Enzymes are highly effective and extremely specific catalysts.
• Organic and inorganic prosthetic groups, cofactors, and coenzymes play important roles in catalysis. Coenzymes, many of which are derivatives of B vitamins, serve as “shuttles.”
• Catalytic mechanisms employed by enzymes include the introduction of strain, approximation of reactants, acid-base catalysis, and covalent catalysis.
• Aminoacyl residues that participate in catalysis are highly conserved among all classes of a given enzyme activity.
• Substrates and enzymes induce mutual conformational changes in one another that facilitate substrate recognition and catalysis.
• The catalytic activity of enzymes reveals their presence, facilitates their detection, and provides the basis for enzyme-linked immunoassays.
• Many enzymes can be assayed spectrophotometrically by coupling them to an NAD(P)+-dependent dehydrogenase.
• Assay of plasma enzymes aids diagnosis and prognosis. For example, a myocardial infarction elevates serum levels of lactate dehydrogenase isozyme I1.
• Restriction endonucleases facilitate diagnosis of genetic diseases by revealing restriction fragment length polymorphisms.
• Site-directed mutagenesis, used to change residues suspected of being important in catalysis or substrate binding, provides insights into the mechanisms of enzyme action.
• Recombinant fusion proteins such as His-tagged or GST fusion enzymes are readily purified by affinity chromatography.

Enzymes: Kinetics 8
Victor W. Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

Enzyme kinetics is the field of biochemistry concerned with the quantitative measurement of the rates of enzyme-catalyzed reactions and the systematic study of factors that affect these rates. Kinetic analyses permit scientists to reconstruct the number and order of the individual steps by which enzymes transform substrates into products. The study of enzyme kinetics also represents the principal way to identify potential therapeutic agents that selectively enhance or inhibit the rates of specific enzyme-catalyzed processes. Together with site-directed mutagenesis and other techniques that probe protein structure, kinetic analysis can also reveal details of the catalytic mechanism. A complete, balanced set of enzyme activities is of fundamental importance for maintaining homeostasis. An understanding of enzyme kinetics thus is important for understanding how physiologic stresses such as anoxia, metabolic acidosis or alkalosis, toxins, and pharmacologic agents affect that balance.

CHEMICAL REACTIONS ARE DESCRIBED USING BALANCED EQUATIONS

A balanced chemical equation lists the initial chemical species (substrates) present and the new chemical species (products) formed for a particular chemical reaction, all in their correct proportions or stoichiometry. For example, balanced equation (1) below describes the reaction of one molecule each of substrates A and B to form one molecule each of products P and Q.

$$A + B \rightleftharpoons P + Q \tag{1}$$

The double arrows indicate reversibility, an intrinsic property of all chemical reactions. Thus, for reaction (1), if A and B can form P and Q, then P and Q can also form A and B. Designation of a particular reactant as a “substrate” or “product” is therefore somewhat arbitrary since the products for a reaction written in one direction are the substrates for the reverse reaction. The term “products” is, however, often used to designate the reactants whose formation is thermodynamically favored. Reactions for which thermodynamic factors strongly favor formation of the products to which the arrow points often are represented with a single arrow

$$A + B \rightarrow P + Q \tag{2}$$

Unidirectional arrows are also used to describe reactions in living cells where the products of reaction (2) are immediately consumed by a subsequent enzyme-catalyzed reaction. The rapid removal of product P or Q therefore precludes occurrence of the reverse reaction, rendering equation (2) functionally irreversible under physiologic conditions.

CHANGES IN FREE ENERGY DETERMINE THE DIRECTION & EQUILIBRIUM STATE OF CHEMICAL REACTIONS

The Gibbs free energy change ∆G (also called either the free energy or Gibbs energy) describes both the direction in which a chemical reaction will tend to proceed and the concentrations of reactants and products that will be present at equilibrium. ∆G for a chemical reaction equals the sum of the free energies of formation of the reaction products ∆GP minus the sum of the free energies of formation of the substrates ∆GS. ∆G⁰ denotes the change in free energy that accompanies transition from the standard state, one-molar concentrations of substrates and products, to equilibrium. A more useful biochemical term is ∆G⁰′, which defines ∆G⁰ at a standard state of 10⁻⁷ M protons, pH 7.0 (Chapter 10). If the free energy of the products is lower than that of the substrates, the signs of ∆G⁰ and ∆G⁰′ will be negative, indicating that the reaction as written is favored in the direction left to right. Such reactions are referred to as spontaneous. The sign and the magnitude of the free energy change determine how far the reaction will proceed. Equation (3)

$$\Delta G^0 = -RT \ln K_{eq} \tag{3}$$

illustrates the relationship between the equilibrium constant Keq and ∆G⁰, where R is the gas constant (1.98 cal/mol/K or 8.31 J/mol/K) and T is the absolute temperature in kelvins. Keq is equal to the product of the concentrations of the reaction products, each raised to the power of their stoichiometry, divided by the product of the substrates, each raised to the power of their stoichiometry.
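As a numerical illustration of equation (3), the following minimal Python sketch converts between ∆G⁰ and Keq and evaluates Keq from equilibrium concentrations. The function names are hypothetical, the default temperature of 298.15 K (25 °C) is an assumption not stated in the text, and the gas constant is the 8.31 J/mol/K value quoted above.

```python
import math

R_J_PER_MOL_K = 8.31  # gas constant as quoted in the text, J/mol/K


def keq_from_delta_g0(delta_g0_j_per_mol, temperature_k=298.15):
    """Equation (3) rearranged: Keq = exp(-dG0 / (R * T)).

    temperature_k defaults to 25 degrees C, an assumed value.
    """
    return math.exp(-delta_g0_j_per_mol / (R_J_PER_MOL_K * temperature_k))


def delta_g0_from_keq(keq, temperature_k=298.15):
    """Equation (3): dG0 = -R * T * ln(Keq), returned in J/mol."""
    return -R_J_PER_MOL_K * temperature_k * math.log(keq)


def keq_from_concentrations(products, substrates):
    """Keq from equilibrium concentrations.

    Each argument is a list of (concentration, stoichiometry) pairs;
    every concentration is raised to the power of its stoichiometry.
    """
    numerator = math.prod(conc ** n for conc, n in products)
    denominator = math.prod(conc ** n for conc, n in substrates)
    return numerator / denominator


if __name__ == "__main__":
    # A negative dG0 gives Keq greater than 1 (products favored).
    print(keq_from_delta_g0(-10_000.0))
    # A positive dG0 gives Keq less than 1 (substrates favored).
    print(keq_from_delta_g0(10_000.0))
```

Under these assumptions, a ∆G⁰ of zero corresponds to Keq = 1, and the two conversion functions are inverses of each other at any fixed temperature.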
as if they were “irreversible”:

For the reaction A + B → P + Q,

$$K_{eq} = \frac{[P][Q]}{[A][B]} \tag{4}$$

and for reaction (5)

$$A + A \rightleftharpoons P \tag{5}$$

$$K_{eq} = \frac{[P]}{[A]^2} \tag{6}$$

∆G⁰ may be calculated from equation (3) if the concentrations of substrates and products present at equilibrium are known. If ∆G⁰ is a negative number, Keq will be greater than unity and the concentration of products at equilibrium will exceed that of substrates. If ∆G⁰ is positive, Keq will be less than unity and the formation of substrates will be favored.

Notice that, since ∆G⁰ is a function exclusively of the initial and final states of the reacting species, it can provide information only about the direction and equilibrium state of the reaction. ∆G⁰ is independent of the mechanism of the reaction and therefore provides no information concerning rates of reactions. Consequently, and as explained below, although a reaction may have a large negative ∆G⁰ or ∆G⁰′, it may nevertheless take place at a negligible rate.

THE RATES OF REACTIONS ARE DETERMINED BY THEIR ACTIVATION ENERGY

Reactions Proceed via Transition States

The concept of the transition state is fundamental to understanding the chemical and thermodynamic basis of catalysis. Equation (7) depicts a displacement reaction in which an entering group E displaces a leaving group L, attached initially to R.

$$E + R{-}L \rightleftharpoons E{-}R + L \tag{7}$$

Midway through the displacement, the bond between R and L has weakened but has not yet been completely severed, and the new bond between E and R is as yet incompletely formed. This transient intermediate, in which neither free substrate nor product exists, is termed the transition state, E···R···L. Dotted lines represent the “partial” bonds that are undergoing formation and rupture.

Reaction (7) can be thought of as consisting of two “partial reactions,” the first corresponding to the formation (F) and the second to the subsequent decay (D) of the transition state intermediate. As for all reactions, characteristic changes in free energy, ∆GF and ∆GD, are associated with each partial reaction.

$$E + R{-}L \rightleftharpoons E \cdots R \cdots L \qquad \Delta G_F \tag{8}$$

$$E \cdots R \cdots L \rightleftharpoons E{-}R + L \qquad \Delta G_D \tag{9}$$

$$E + R{-}L \rightleftharpoons E{-}R + L \qquad \Delta G = \Delta G_F + \Delta G_D \tag{10}$$

For the overall reaction (10), ∆G is the sum of ∆GF and ∆GD. As for any equation of two terms, it is not possible to infer from ∆G either the sign or the magnitude of ∆GF or ∆GD.

Many reactions involve multiple transition states, each with an associated change in free energy. For these reactions, the overall ∆G represents the sum of all of the free energy changes associated with the formation and decay of all of the transition states. Therefore, it is not possible to infer from the overall ∆G the number or type of transition states through which the reaction proceeds. Stated another way: overall thermodynamics tells us nothing about kinetics.

∆GF Defines the Activation Energy

Regardless of the sign or magnitude of ∆G, ∆GF for the overwhelming majority of chemical reactions has a positive sign. The formation of transition state intermediates therefore requires surmounting of energy barriers. For this reason, ∆GF is often termed the activation energy, Eact, the energy required to surmount a given energy barrier. The ease, and hence the frequency, with which this barrier is overcome is inversely related to Eact. The thermodynamic parameters that determine how fast a reaction proceeds thus are the ∆GF values for formation of the transition states through which the reaction proceeds. For a simple reaction, where ∝ means “proportionate to,”

$$\text{Rate} \propto e^{-E_{act}/RT} \tag{11}$$

The activation energy for the reaction proceeding in the opposite direction to that drawn is equal to −∆GD.

NUMEROUS FACTORS AFFECT THE REACTION RATE

The kinetic theory (also called the collision theory) of chemical kinetics states that for two molecules to react they must (1) approach within bond-forming distance of one another, or “collide”; and (2) possess sufficient kinetic energy to overcome the energy barrier for reaching the transition state. It therefore follows that anything which increases the frequency or energy of collision between substrates will increase the rate of the reaction in which they participate.

Temperature

Raising the temperature increases the kinetic energy of molecules. As illustrated in Figure 8–1, the total number of molecules whose kinetic energy exceeds the energy barrier Eact (vertical bar) for formation of products increases from low (A), through intermediate (B), to high (C) temperatures. Increasing the kinetic energy of molecules also increases their motion and therefore the frequency with which they collide. This combination of more frequent and more highly energetic and productive collisions increases the reaction rate.

Figure 8–1. The energy barrier for chemical reactions. [Plot: number of molecules (0 to ∞) versus kinetic energy (0 to ∞) for curves A, B, and C, with the energy barrier shown as a vertical bar.]

Reactant Concentration

The frequency with which molecules collide is directly proportionate to their concentrations. For two different molecules A and B, the frequency with which they collide will double if the concentration of either A or B is doubled. If the concentrations of both A and B are doubled, the probability of collision will increase fourfold.

For a chemical reaction proceeding at constant temperature that involves one molecule each of A and B,

$$A + B \rightarrow P \tag{12}$$

the number of molecules that possess kinetic energy sufficient to overcome the activation energy barrier will be a constant. The number of collisions with sufficient energy to produce product P therefore will be directly proportionate to the number of collisions between A and B and thus to their molar concentrations, denoted by square brackets.

$$\text{Rate} \propto [A][B] \tag{13}$$

Similarly, for the reaction represented by

$$A + 2B \rightarrow P \tag{14}$$

which can also be written as

$$A + B + B \rightarrow P \tag{15}$$

the corresponding rate expression is

$$\text{Rate} \propto [A][B][B] \tag{16}$$

or

$$\text{Rate} \propto [A][B]^2 \tag{17}$$

For the general case when n molecules of A react with m molecules of B,

$$nA + mB \rightarrow P \tag{18}$$

the rate expression is

$$\text{Rate} \propto [A]^n[B]^m \tag{19}$$

Replacing the proportionality sign with an equal sign by introducing a proportionality or rate constant k characteristic of the reaction under study gives equations (20) and (21), in which the subscripts 1 and −1 refer to the rate constants for the forward and reverse reactions, respectively.

$$\text{Rate}_1 = k_1[A]^n[B]^m \tag{20}$$

$$\text{Rate}_{-1} = k_{-1}[P] \tag{21}$$

Keq Is a Ratio of Rate Constants

While all chemical reactions are to some extent reversible, at equilibrium the overall concentrations of reactants and products remain constant. At equilibrium, the rate of conversion of substrates to products therefore equals the rate at which products are converted to substrates.

$$\text{Rate}_1 = \text{Rate}_{-1} \tag{22}$$

Therefore,

$$k_1[A]^n[B]^m = k_{-1}[P] \tag{23}$$

and

$$\frac{k_1}{k_{-1}} = \frac{[P]}{[A]^n[B]^m} \tag{24}$$

The ratio of k1 to k−1 is termed the equilibrium constant, Keq. The following important properties of a system at equilibrium must be kept in mind:

(1) The equilibrium constant is a ratio of the reaction rate constants (not the reaction rates).
(2) At equilibrium, the reaction rates (not the rate constants) of the forward and back reactions are equal.
(3) Equilibrium is a dynamic state. Although there is no net change in the concentration of substrates or products, individual substrate and product molecules are continually being interconverted.
(4) The numeric value of the equilibrium constant Keq can be calculated either from the concentrations of substrates and products at equilibrium or from the ratio k1/k−1.

THE KINETICS OF ENZYMATIC CATALYSIS

Enzymes Lower the Activation Energy Barrier for a Reaction

All enzymes accelerate reaction rates by providing transition states with a lowered ∆GF for formation of the transition states. However, they may differ in the way this is achieved. Where the mechanism or the sequence of chemical steps at the active site is essentially the same as those for the same reaction proceeding in the absence of a catalyst, the environment of the active site lowers ∆GF by stabilizing the transition state intermediates. As discussed in Chapter 7, stabilization can involve (1) acid-base groups suitably positioned to transfer protons to or from the developing transition state intermediate, (2) suitably positioned charged groups or metal ions that stabilize developing charges, or (3) the imposition of steric strain on substrates so that their geometry approaches that of the transition state. HIV protease (Figure 7–6) illustrates catalysis by an enzyme that lowers the activation barrier by stabilizing a transition state intermediate.

Catalysis by enzymes that proceeds via a unique reaction mechanism typically occurs when the transition state intermediate forms a covalent bond with the enzyme (covalent catalysis). The catalytic mechanism of the serine protease chymotrypsin (Figure 7–7) illustrates how an enzyme utilizes covalent catalysis to provide a unique reaction pathway.

ENZYMES DO NOT AFFECT Keq

Enzymes accelerate reaction rates by lowering the activation barrier ∆GF. While they may undergo transient modification during the process of catalysis, enzymes emerge unchanged at the completion of the reaction. The presence of an enzyme therefore has no effect on ∆G⁰ for the overall reaction, which is a function solely of the initial and final states of the reactants. Equation (25) shows the relationship between the equilibrium constant for a reaction and the standard free energy change for that reaction:

$$\Delta G^0 = -RT \ln K_{eq} \tag{25}$$

If we include the presence of the enzyme (Enz) in the calculation of the equilibrium constant for a reaction,

$$A + B + Enz \rightleftharpoons P + Q + Enz \tag{26}$$

the expression for the equilibrium constant,

$$K_{eq} = \frac{[P][Q][Enz]}{[A][B][Enz]} \tag{27}$$

reduces to one identical to that for the reaction in the absence of the enzyme:

$$K_{eq} = \frac{[P][Q]}{[A][B]} \tag{28}$$

Enzymes therefore have no effect on Keq.

MULTIPLE FACTORS AFFECT THE RATES OF ENZYME-CATALYZED REACTIONS

Temperature

Raising the temperature increases the rate of both uncatalyzed and enzyme-catalyzed reactions by increasing the kinetic energy and the collision frequency of the reacting molecules. However, heat energy can also increase the kinetic energy of the enzyme to a point that exceeds the energy barrier for disrupting the noncovalent interactions that maintain the enzyme’s three-dimensional structure. The polypeptide chain then begins to unfold, or denature, with an accompanying rapid loss of catalytic activity. The temperature range over which an enzyme maintains a stable, catalytically competent conformation depends upon (and typically moderately exceeds) the normal temperature of the cells in which it resides. Enzymes from humans generally exhibit stability at temperatures up to 45–55 °C. By contrast, enzymes from the thermophilic microorganisms that reside in volcanic hot springs or undersea hydrothermal vents may be stable up to or above 100 °C.

The Q10, or temperature coefficient, is the factor by which the rate of a biologic process increases for a 10 °C increase in temperature. For the temperatures over which enzymes are stable, the rates of most biologic processes typically double for a 10 °C rise in temperature (Q10 = 2). Changes in the rates of enzyme-catalyzed reactions that accompany a rise or fall in body temperature constitute a prominent survival feature for “cold-blooded” life forms such as lizards or fish, whose body temperatures are dictated by the external environment. However, for mammals and other homeothermic organisms, changes in enzyme reaction rates with temperature assume physiologic importance only in circumstances such as fever or hypothermia.

Hydrogen Ion Concentration

The rate of almost all enzyme-catalyzed reactions exhibits a significant dependence on hydrogen ion concentration. Most intracellular enzymes exhibit optimal activity at pH values between 5 and 9. The relationship of activity to hydrogen ion concentration (Figure 8–2) reflects the balance between enzyme denaturation at high or low pH and effects on the charged state of the enzyme, the substrates, or both. For enzymes whose mechanism involves acid-base catalysis, the residues involved must be in the appropriate state of protonation for the reaction to proceed. The binding and recognition of substrate molecules with dissociable groups also typically involves the formation of salt bridges with the enzyme. The most common charged groups are the negative carboxylate groups and the positively charged groups of protonated amines. Gain or loss of critical charged groups thus will adversely affect substrate binding and thus will retard or abolish catalysis.

ASSAYS OF ENZYME-CATALYZED REACTIONS TYPICALLY MEASURE THE INITIAL VELOCITY

Most measurements of the rates of enzyme-catalyzed reactions employ relatively short time periods, conditions that approximate initial rate conditions. Under these conditions, only traces of product accumulate, hence the rate of the reverse reaction is negligible. The initial velocity (vi) of the reaction thus is essentially that of the rate of the forward reaction. Assays of enzyme activity almost always employ a large (10³–10⁷) molar excess of substrate over enzyme. Under these conditions, vi is proportionate to the concentration of enzyme. Measuring the initial velocity therefore permits one to estimate the quantity of enzyme present in a biologic sample.

SUBSTRATE CONCENTRATION AFFECTS REACTION RATE

In what follows, enzyme reactions are treated as if they had only a single substrate and a single product. While most enzymes have more than one substrate, the principles discussed below apply with equal validity to enzymes with multiple substrates.

For a typical enzyme, as substrate concentration is increased, vi increases until it reaches a maximum value Vmax (Figure 8–3). When further increases in substrate concentration do not further increase vi, the enzyme is said to be “saturated” with substrate. Note that the shape of the curve that relates activity to substrate concentration (Figure 8–3) is hyperbolic. At any given instant, only substrate molecules that are combined with the enzyme as an ES complex can be transformed into product. In addition, the equilibrium constant for the formation of the enzyme-substrate complex is not infinitely large. Therefore, even when the substrate is present in excess (points A and B of Figure 8–4), only a fraction of the enzyme may be present as an ES complex. At points A or B, increasing or decreasing [S] therefore will increase or decrease the number of ES complexes with a corresponding change in vi. At point C (Figure 8–4), essentially all the enzyme is present as the ES complex.
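The dependence of the ES fraction on [S] can be made concrete with a small sketch. It is an illustration only: it assumes simple single-site binding at equilibrium with substrate in large excess over enzyme, for which the fraction of enzyme present as ES is [S]/(Kd + [S]). The dissociation constant `k_d`, its value, and the function name are hypothetical and do not come from the text.

```python
def fraction_bound(substrate_conc, k_d):
    """Fraction of total enzyme present as the ES complex.

    Assumes single-site equilibrium binding with substrate in large
    excess, so that free [S] is approximately total [S]:
        [ES] / [E]total = [S] / (Kd + [S])
    """
    if substrate_conc < 0 or k_d <= 0:
        raise ValueError("substrate_conc must be >= 0 and k_d must be > 0")
    return substrate_conc / (k_d + substrate_conc)


if __name__ == "__main__":
    K_D = 1.0  # hypothetical dissociation constant, arbitrary units
    for label, s in (("low [S]", 0.1), ("intermediate [S]", 1.0), ("high [S]", 99.0)):
        print(f"{label}: fraction of enzyme as ES = {fraction_bound(s, K_D):.2f}")
```

Under these assumptions the fraction rises steeply at low [S] and approaches, but never quite reaches, 1 at high [S], which is the behavior the saturation argument above relies on.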
Since no free enzyme remains available for forming ES, further increases in [S] cannot increase the rate of the reaction. Under these saturating conditions, vi depends solely on (and thus is limited by) the rapidity with which free enzyme is released to combine with more substrate.

Figure 8–2. Effect of pH on enzyme activity. Consider, for example, a negatively charged enzyme (EH−) that binds a positively charged substrate (SH+). Shown is the proportion (%) of SH+ [\\\] and of EH− [///] as a function of pH. Only in the cross-hatched area do both the enzyme and the substrate bear an appropriate charge. [Plot: percent (0 to 100) versus pH (low to high).]

Figure 8–3. Effect of substrate concentration on the initial velocity of an enzyme-catalyzed reaction. [Plot: vi versus [S], with points A, B, and C on the curve and Vmax, Vmax/2, and Km marked.]

Figure 8–4. Representation of an enzyme at low (A), at high (C), and at a substrate concentration equal to Km (B). Points A, B, and C correspond to those points in Figure 8–3. [Diagram: panels A, B, and C showing enzyme (E) and substrate (S) molecules.]

THE MICHAELIS-MENTEN & HILL EQUATIONS MODEL THE EFFECTS OF SUBSTRATE CONCENTRATION

The Michaelis-Menten Equation

The Michaelis-Menten equation (29) illustrates in mathematical terms the relationship between initial reaction velocity vi and substrate concentration [S], shown graphically in Figure 8–3.

$$v_i = \frac{V_{max}[S]}{K_m + [S]} \tag{29}$$

The Michaelis constant Km is the substrate concentration at which vi is half the maximal velocity (Vmax/2) attainable at a particular concentration of enzyme. Km thus has the dimensions of substrate concentration. The dependence of initial reaction velocity on [S] and Km may be illustrated by evaluating the Michaelis-Menten equation under three conditions.

(1) When [S] is much less than Km (point A in Figures 8–3 and 8–4), the term Km + [S] is essentially equal to Km. Replacing Km + [S] with Km reduces equation (29) to

$$v_i = \frac{V_{max}[S]}{K_m + [S]} \qquad v_i \approx \frac{V_{max}[S]}{K_m} \approx \left(\frac{V_{max}}{K_m}\right)[S] \tag{30}$$

where ≈ means “approximately equal to.” Since Vmax and Km are both constants, their ratio is a constant. In other words, when [S] is considerably below Km, vi ∝ k[S]. The initial reaction velocity therefore is directly proportionate to [S].

(2) When [S] is much greater than Km (point C in Figures 8–3 and 8–4), the term Km + [S] is essentially equal to [S]. Replacing Km + [S] with [S] reduces equation (29) to

$$v_i = \frac{V_{max}[S]}{K_m + [S]} \qquad v_i \approx \frac{V_{max}[S]}{[S]} \approx V_{max} \tag{31}$$

Thus, when [S] greatly exceeds Km, the reaction velocity is maximal (Vmax) and unaffected by further increases in substrate concentration.

(3) When [S] = Km (point B in Figures 8–3 and 8–4):

$$v_i = \frac{V_{max}[S]}{K_m + [S]} = \frac{V_{max}[S]}{2[S]} = \frac{V_{max}}{2} \tag{32}$$

Equation (32) states that when [S] equals Km, the initial velocity is half-maximal. Equation (32) also reveals that Km is, and may be determined experimentally from, the substrate concentration at which the initial velocity is half-maximal.

A Linear Form of the Michaelis-Menten Equation Is Used to Determine Km & Vmax

The direct measurement of the numeric value of Vmax and therefore the calculation of Km often requires impractically high concentrations of substrate to achieve saturating conditions. A linear form of the Michaelis-Menten equation circumvents this difficulty and permits Vmax and Km to be extrapolated from initial velocity data obtained at less than saturating concentrations of substrate. Starting with equation (29),

$$v_i = \frac{V_{max}[S]}{K_m + [S]} \tag{29}$$

invert

$$\frac{1}{v_i} = \frac{K_m + [S]}{V_{max}[S]} \tag{33}$$

factor

$$\frac{1}{v_i} = \frac{K_m}{V_{max}[S]} + \frac{[S]}{V_{max}[S]} \tag{34}$$

and simplify

$$\frac{1}{v_i} = \left(\frac{K_m}{V_{max}}\right)\frac{1}{[S]} + \frac{1}{V_{max}} \tag{35}$$

Equation (35) is the equation for a straight line, y = ax + b, where y = 1/vi and x = 1/[S]. A plot of 1/vi as y as a function of 1/[S] as x therefore gives a straight line whose y intercept is 1/Vmax and whose slope is Km/Vmax. Such a plot is called a double reciprocal or Lineweaver-Burk plot (Figure 8–5). Setting the y term of equation (36) equal to zero and solving for x reveals that the x intercept is −1/Km.

$$0 = ax + b; \quad \text{therefore,} \quad x = \frac{-b}{a} = \frac{-1}{K_m} \tag{36}$$

Km is thus most easily calculated from the x intercept.

Km May Approximate a Binding Constant

The affinity of an enzyme for its substrate is the inverse of the dissociation constant Kd for dissociation of the enzyme substrate complex ES.

$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \tag{37}$$

$$K_d = \frac{k_{-1}}{k_1} \tag{38}$$

Stated another way, the smaller the tendency of the enzyme and its substrate to dissociate, the greater the affinity of the enzyme for its substrate. While the Michaelis constant Km often approximates the dissociation constant Kd, this is by no means always the case. For a typical enzyme-catalyzed reaction,

$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \overset{k_2}{\longrightarrow} E + P \tag{39}$$

the value of [S] that gives vi = Vmax/2 is

$$[S] = \frac{k_{-1} + k_2}{k_1} = K_m \tag{40}$$

When k−1 ≫ k2, then

$$k_{-1} + k_2 \approx k_{-1} \tag{41}$$

and

$$[S] \approx \frac{k_{-1}}{k_1} \approx K_d \tag{42}$$

Hence, 1/Km only approximates 1/Kd under conditions where the association and dissociation of the ES complex is rapid relative to the rate-limiting step in catalysis. For the many enzyme-catalyzed reactions for which k−1 + k2 is not approximately equal to k−1, 1/Km will underestimate 1/Kd.

The Hill Equation Describes the Behavior of Enzymes That Exhibit Cooperative Binding of Substrate

While most enzymes display the simple saturation kinetics depicted in Figure 8–3 and are adequately described by the Michaelis-Menten expression, some enzymes bind their substrates in a cooperative fashion analogous to the binding of oxygen by hemoglobin (Chapter 6). Cooperative behavior may be encountered for multimeric enzymes that bind substrate at multiple sites.
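Before cooperative kinetics are considered further, the double reciprocal procedure of equations (33) to (36) can be illustrated numerically. The sketch below is a minimal example under stated assumptions, not a recommended fitting method: it uses ordinary least squares on the reciprocals, assumes noise-free initial velocities, and the function name `lineweaver_burk_fit` and the example data (generated from equation (29) with Km = 2 and Vmax = 10 in arbitrary units) are hypothetical.

```python
def lineweaver_burk_fit(substrate_concs, initial_velocities):
    """Estimate (Km, Vmax) from a straight-line fit of 1/vi versus 1/[S].

    Per equation (35): slope = Km/Vmax and y intercept = 1/Vmax.
    Uses ordinary least squares; requires at least two distinct [S] values.
    """
    xs = [1.0 / s for s in substrate_concs]
    ys = [1.0 / v for v in initial_velocities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if n < 2 or sxx == 0.0:
        raise ValueError("need at least two distinct substrate concentrations")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # Km / Vmax
    intercept = mean_y - slope * mean_x  # 1 / Vmax
    v_max = 1.0 / intercept
    k_m = slope * v_max
    return k_m, v_max


if __name__ == "__main__":
    # Hypothetical, noise-free data generated from equation (29).
    true_km, true_vmax = 2.0, 10.0
    concs = [0.5, 1.0, 2.0, 4.0, 8.0]
    rates = [true_vmax * s / (true_km + s) for s in concs]
    k_m, v_max = lineweaver_burk_fit(concs, rates)
    print(f"Km = {k_m:.3f}, Vmax = {v_max:.3f}, x intercept = {-1.0 / k_m:.3f}")
```

With real, noisy measurements the reciprocal transformation magnifies errors in the smallest velocities, so the estimates from such a fit should be treated with caution.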
For enzymes that display positive cooperativity in binding substrate, the shape of the curve that relates changes in vi to changes in [S] is sigmoidal (Figure 8–6). Neither the Michaelis-Menten expression nor its derived double-reciprocal plots can be used to evaluate cooperative saturation kinetics. Enzymologists therefore employ a graphic representation of the Hill equation originally derived to describe the cooperative binding of O2 by hemoglobin. Equation (43) represents the Hill equation arranged in a form that predicts a straight line, where k′ is a complex constant.

$$\log\frac{v_i}{V_{max} - v_i} = n\log[S] - \log k' \tag{43}$$

Equation (43) states that when [S] is low relative to k′, the initial reaction velocity increases as the nth power of [S].

Figure 8–5. Double reciprocal or Lineweaver-Burk plot of 1/vi versus 1/[S] used to evaluate Km and Vmax.

Figure 8–6. Representation of sigmoid substrate saturation kinetics.

A graph of log vi/(Vmax − vi) versus log[S] gives a straight line (Figure 8–7), where the slope of the line n is the Hill coefficient, an empirical parameter whose value is a function of the number, kind, and strength of the interactions of the multiple substrate-binding sites on the enzyme. When n = 1, all binding sites behave independently, and simple Michaelis-Menten kinetic behavior is observed. If n is greater than 1, the enzyme is said to exhibit positive cooperativity. Binding of the first substrate molecule then enhances the affinity of the enzyme for binding additional substrate. The greater the value for n, the higher the degree of cooperativity and the more sigmoidal will be the plot of vi versus [S]. A perpendicular dropped from the point where the y term log vi/(Vmax − vi) is zero intersects the x axis at a substrate concentration termed S50, the substrate concentration that results in half-maximal velocity. S50 thus is analogous to the P50 for oxygen binding to hemoglobin (Chapter 6).

Figure 8–7. A graphic representation of a linear form of the Hill equation is used to evaluate S50, the substrate concentration that produces half-maximal velocity, and the degree of cooperativity n.

KINETIC ANALYSIS DISTINGUISHES COMPETITIVE FROM NONCOMPETITIVE INHIBITION

Inhibitors of the catalytic activities of enzymes provide both pharmacologic agents and research tools for study of the mechanism of enzyme action. Inhibitors can be classified based upon their site of action on the enzyme, on whether or not they chemically modify the enzyme, or on the kinetic parameters they influence. Kinetically, we distinguish two classes of inhibitors based upon whether raising the substrate concentration does or does not overcome the inhibition.

Competitive Inhibitors Typically Resemble Substrates

The effects of competitive inhibitors can be overcome by raising the concentration of the substrate. Most frequently, in competitive inhibition the inhibitor, I, binds to the substrate-binding portion of the active site and blocks access by the substrate. The structures of most classic competitive inhibitors therefore tend to resemble the structures of a substrate and thus are termed substrate analogs. Inhibition of the enzyme succinate dehydrogenase by malonate illustrates competitive inhibition by a substrate analog. Succinate dehydrogenase catalyzes the removal of one hydrogen atom from each of the two methylene carbons of succinate (Figure 8–8). Both succinate and its structural analog malonate (−OOC–CH2–COO−) can bind to the active site of succinate dehydrogenase, forming an ES or an EI complex, respectively. However, since malonate contains only one methylene carbon, it cannot undergo dehydrogenation.

Figure 8–8. The succinate dehydrogenase reaction (succinate − 2H → fumarate).
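The statement that raising the substrate concentration overcomes competitive inhibition can be illustrated numerically. The sketch below assumes the standard rate law for simple competitive inhibition, vi = Vmax[S]/(Km(1 + [I]/Ki) + [S]), which is not written out in this chapter but is consistent with the x intercept given in equation (47). Ki is the dissociation constant of the enzyme-inhibitor complex, and all parameter values are hypothetical.

```python
def competitive_rate(s, i, km, vmax, ki):
    """Initial velocity under simple competitive inhibition (assumed standard rate law)."""
    km_apparent = km * (1.0 + i / ki)  # the inhibitor raises the apparent Km only
    return vmax * s / (km_apparent + s)


# Hypothetical parameters in arbitrary but consistent units.
KM, VMAX, KI, INHIBITOR = 1.0, 100.0, 1.0, 10.0

substrate_levels = [1.0, 10.0, 100.0, 1000.0]
relative_activity = [
    competitive_rate(s, INHIBITOR, KM, VMAX, KI)
    / competitive_rate(s, 0.0, KM, VMAX, KI)
    for s in substrate_levels
]

for s, fraction in zip(substrate_levels, relative_activity):
    print(f"[S] = {s:7.1f}  fraction of uninhibited rate = {fraction:.3f}")
```

Under these assumed values the inhibited rate is about one sixth of the uninhibited rate when [S] equals Km, and it approaches the uninhibited rate as [S] becomes large, which is the kinetic signature of a competitive inhibitor.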
The formation and dissociation of the EI complex is a dynamic process described by

$$\text{EnzI} \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} \text{Enz} + \text{I} \tag{44}$$

for which the equilibrium constant Ki is

$$K_i = \frac{[\text{Enz}][\text{I}]}{[\text{EnzI}]} = \frac{k_1}{k_{-1}} \tag{45}$$

In effect, a competitive inhibitor acts by decreasing the number of free enzyme molecules available to bind substrate, ie, to form ES, and thus eventually to form product, as described below:

$$E \overset{\pm I}{\rightleftharpoons} E\text{-}I \qquad\qquad E \overset{\pm S}{\rightleftharpoons} E\text{-}S \longrightarrow E + P \tag{46}$$

A competitive inhibitor and substrate exert reciprocal effects on the concentration of the EI and ES complexes. Since binding substrate removes free enzyme available to combine with inhibitor, increasing the [S] decreases the concentration of the EI complex and raises the reaction velocity. The extent to which [S] must be increased to completely overcome the inhibition depends upon the concentration of inhibitor present, its affinity for the enzyme (expressed by Ki), and the Km of the enzyme for its substrate.

Double Reciprocal Plots Facilitate the Evaluation of Inhibitors

Double reciprocal plots distinguish between competitive and noncompetitive inhibitors and simplify evaluation of inhibition constants Ki. vi is determined at several substrate concentrations both in the presence and in the absence of inhibitor. For classic competitive inhibition, the lines that connect the experimental data points meet at the y axis (Figure 8–9). Since the y intercept is equal to 1/Vmax, this pattern indicates that when 1/[S] approaches 0, vi is independent of the presence of inhibitor. Note, however, that the intercept on the x axis does vary with inhibitor concentration, and that since the magnitude of −1/K′m is smaller than that of −1/Km, K′m (the “apparent Km”) becomes larger in the presence of increasing concentrations of inhibitor. Thus, a competitive inhibitor has no effect on Vmax but raises K′m, the apparent Km for the substrate.

Figure 8–9. Lineweaver-Burk plot of competitive inhibition. Note the complete relief of inhibition at high [S] (ie, low 1/[S]).

For simple competitive inhibition, the intercept on the x axis is

$$x = \frac{-1}{K_m\left(1 + \dfrac{[I]}{K_i}\right)} \tag{47}$$

Once Km has been determined in the absence of inhibitor, Ki can be calculated from equation (47). Ki values are used to compare different inhibitors of the same enzyme. The lower the value for Ki, the more effective the inhibitor. For example, the statin drugs that act as competitive inhibitors of HMG-CoA reductase (Chapter 26) have Ki values several orders of magnitude lower than the Km for the substrate HMG-CoA.

Simple Noncompetitive Inhibitors Lower Vmax but Do Not Affect Km

In noncompetitive inhibition, binding of the inhibitor does not affect binding of substrate. Formation of both EI and EIS complexes is therefore possible. However, while the enzyme-inhibitor complex can still bind substrate, its efficiency at transforming substrate to product, reflected by Vmax, is decreased. Noncompetitive inhibitors bind enzymes at sites distinct from the substrate-binding site and generally bear little or no structural resemblance to the substrate.

For simple noncompetitive inhibition, E and EI possess identical affinity for substrate, and the EIS complex generates product at a negligible rate (Figure 8–10). More complex noncompetitive inhibition occurs when binding of the inhibitor does affect the apparent affinity of the enzyme for substrate, causing the lines to intercept in either the third or fourth quadrants of a double reciprocal plot (not shown).

Figure 8–10. Lineweaver-Burk plot for simple noncompetitive inhibition.

Irreversible Inhibitors “Poison” Enzymes

Figure 8–11.
Representations of three classes of Bi-Bi reaction mechanisms. Horizontal lines represent the enzyme. Arrows indicate the addition of substrates and departure of products. Top: An ordered Bi-Bi reaction, characteristic of many NAD(P)H-dependent oxidoreductases. Center: A random Bi-Bi reaction, characteristic of many kinases and some dehydrogenases. Bottom: A ping-pong reaction, characteristic of aminotransferases and serine proteases.

In the above examples, the inhibitors form a dissociable, dynamic complex with the enzyme. Fully active enzyme can therefore be recovered simply by removing the inhibitor from the surrounding medium. However, a variety of other inhibitors act irreversibly by chemically modifying the enzyme. These modifications generally involve making or breaking covalent bonds with aminoacyl residues essential for substrate binding, catalysis, or maintenance of the enzyme’s functional conformation. Since these covalent changes are relatively stable, an enzyme that has been “poisoned” by an irreversible inhibitor remains inhibited even after removal of the remaining inhibitor from the surrounding medium.

MOST ENZYME-CATALYZED REACTIONS INVOLVE TWO OR MORE SUBSTRATES

While many enzymes have a single substrate, many others have two, and sometimes more than two, substrates and products. The fundamental principles discussed above, while illustrated for single-substrate enzymes, apply also to multisubstrate enzymes. The mathematical expressions used to evaluate multisubstrate reactions are, however, complex. While detailed kinetic analysis of multisubstrate reactions exceeds the scope of this chapter, two-substrate, two-product reactions (termed “Bi-Bi” reactions) are considered below.

Sequential or Single Displacement Reactions

In sequential reactions, both substrates must combine with the enzyme to form a ternary complex before catalysis can proceed (Figure 8–11, top). Sequential reactions are sometimes referred to as single displacement reactions because the group undergoing transfer is usually passed directly, in a single step, from one substrate to the other. Sequential Bi-Bi reactions can be further distinguished based on whether the two substrates add in a random or in a compulsory order. For random-order reactions, either substrate A or substrate B may combine first with the enzyme to form an EA or an EB complex (Figure 8–11, center). For compulsory-order reactions, A must first combine with E before B can combine with the EA complex. One explanation for a compulsory-order mechanism is that the addition of A induces a conformational change in the enzyme that aligns residues which recognize and bind B.

Ping-Pong Reactions

The term “ping-pong” applies to mechanisms in which one or more products are released from the enzyme before all the substrates have been added. Ping-pong reactions involve covalent catalysis and a transient, modified form of the enzyme (Figure 7–4). Ping-pong Bi-Bi reactions are double displacement reactions. The group undergoing transfer is first displaced from substrate A by the enzyme to form product P and a modified form of the enzyme (F). The subsequent group transfer from F to the second substrate B, forming product Q and regenerating E, constitutes the second displacement (Figure 8–11, bottom).

Most Bi-Bi Reactions Conform to Michaelis-Menten Kinetics

Most Bi-Bi reactions conform to a somewhat more complex form of Michaelis-Menten kinetics in which Vmax refers to the reaction rate attained when both substrates are present at saturating levels. Each substrate has its own characteristic Km value which corresponds to the concentration that yields half-maximal velocity when the second substrate is present at saturating levels. As for single-substrate reactions, double-reciprocal plots can be used to determine Vmax and Km. vi is measured as a function of the concentration of one substrate (the variable substrate) while the concentration of the other substrate (the fixed substrate) is maintained constant. If the lines obtained for several fixed-substrate concentrations are plotted on the same graph, it is possible to distinguish between a ping-pong enzyme, which yields parallel lines, and a sequential mechanism, which yields a pattern of intersecting lines (Figure 8–12).

Figure 8–12. Lineweaver-Burk plot for a two-substrate ping-pong reaction. An increase in concentration of one substrate (S1) while that of the other substrate (S2) is maintained constant changes both the x and y intercepts, but not the slope.

Product inhibition studies are used to complement kinetic analyses and to distinguish between ordered and random Bi-Bi reactions. For example, in a random-order Bi-Bi reaction, each product will be a competitive inhibitor regardless of which substrate is designated the variable substrate. However, for an ordered sequential mechanism (Figure 8–11, top), only product Q will give the pattern indicative of competitive inhibition when A is the variable substrate, while only product P will produce this pattern with B as the variable substrate. The other combinations of product inhibitor and variable substrate will produce forms of complex noncompetitive inhibition.

SUMMARY

• The study of enzyme kinetics (the factors that affect the rates of enzyme-catalyzed reactions) reveals the individual steps by which enzymes transform substrates into products.
• ∆G, the overall change in free energy for a reaction, is independent of reaction mechanism and provides no information concerning rates of reactions.
• Enzymes do not affect Keq. Keq, a ratio of reaction rate constants, may be calculated from the concentrations of substrates and products at equilibrium or from the ratio k1/k−1.
• Reactions proceed via transition states in which ∆GF is the activation energy. Temperature, hydrogen ion concentration, enzyme concentration, substrate concentration, and inhibitors all affect the rates of enzyme-catalyzed reactions.
• A measurement of the rate of an enzyme-catalyzed reaction generally employs initial rate conditions, for which the essential absence of product precludes the reverse reaction.
• A linear form of the Michaelis-Menten equation simplifies determination of Km and Vmax.
• A linear form of the Hill equation is used to evaluate the cooperative substrate-binding kinetics exhibited by some multimeric enzymes. The slope n, the Hill coefficient, reflects the number, nature, and strength of the interactions of the substrate-binding sites. A value of n greater than 1 indicates positive cooperativity.
• The effects of competitive inhibitors, which typically resemble substrates, are overcome by raising the concentration of the substrate. Noncompetitive inhibitors lower Vmax but do not affect Km.
• Substrates may add in a random order (either substrate may combine first with the enzyme) or in a compulsory order (substrate A must bind before substrate B).
• In ping-pong reactions, one or more products are released from the enzyme before all the substrates have added.

REFERENCES

Fersht A: Structure and Mechanism in Protein Science: A Guide to Enzyme Catalysis and Protein Folding. Freeman, 1999.
Schultz AR: Enzyme Kinetics: From Diastase to Multi-enzyme Systems. Cambridge Univ Press, 1994.
Segel IH: Enzyme Kinetics. Wiley Interscience, 1975.

Enzymes: Regulation of Activities 9
Victor W.
Rodwell, PhD, & Peter J. Kennelly, PhD

BIOMEDICAL IMPORTANCE

The 19th-century physiologist Claude Bernard enunciated the conceptual basis for metabolic regulation. He observed that living organisms respond in ways that are both quantitatively and temporally appropriate to permit them to survive the multiple challenges posed by changes in their external and internal environments. Walter Cannon subsequently coined the term “homeostasis” to describe the ability of animals to maintain a constant intracellular environment despite changes in their external environment. We now know that organisms respond to changes in their external and internal environment by balanced, coordinated changes in the rates of specific metabolic reactions. Many human diseases, including cancer, diabetes, cystic fibrosis, and Alzheimer’s disease, are characterized by regulatory dysfunctions triggered by pathogenic agents or genetic mutations. For example, many oncogenic viruses elaborate protein-tyrosine kinases that modify the regulatory events which control patterns of gene expression, contributing to the initiation and progression of cancer. The toxin from Vibrio cholerae, the causative agent of cholera, disables sensor-response pathways in intestinal epithelial cells by ADP-ribosylating the GTP-binding proteins (G-proteins) that link cell surface receptors to adenylyl cyclase. The consequent activation of the cyclase triggers the flow of water into the intestines, resulting in massive diarrhea and dehydration. Yersinia pestis, the causative agent of plague, elaborates a protein-tyrosine phosphatase that hydrolyzes phosphoryl groups on key cytoskeletal proteins. Knowledge of factors that control the rates of enzyme-catalyzed reactions thus is essential to an understanding of the molecular basis of disease. This chapter outlines the patterns by which metabolic processes are controlled and provides illustrative examples. Subsequent chapters provide additional examples.

REGULATION OF METABOLITE FLOW CAN BE ACTIVE OR PASSIVE

Enzymes that operate at their maximal rate cannot respond to an increase in substrate concentration, and can respond only to a precipitous decrease in substrate concentration. For most enzymes, therefore, the average intracellular concentration of their substrate tends to be close to the Km value, so that changes in substrate concentration generate corresponding changes in metabolite flux (Figure 9–1). Responses to changes in substrate level represent an important but passive means for coordinating metabolite flow and maintaining homeostasis in quiescent cells. However, they offer limited scope for responding to changes in environmental variables. The mechanisms that regulate enzyme activity in an active manner in response to internal and external signals are discussed below.

Figure 9–1. Differential response of the rate of an enzyme-catalyzed reaction, ∆V, to the same incremental change in substrate concentration at a substrate concentration of Km (∆VA) or far above Km (∆VB).

Metabolite Flow Tends to Be Unidirectional

Despite the existence of short-term oscillations in metabolite concentrations and enzyme levels, living cells exist in a dynamic steady state in which the mean concentrations of metabolic intermediates remain relatively constant over time (Figure 9–2). While all chemical reactions are to some extent reversible, in living cells the reaction products serve as substrates for, and are removed by, other enzyme-catalyzed reactions. Many nominally reversible reactions thus occur unidirectionally. This succession of coupled metabolic reactions is accompanied by an overall change in free energy that favors unidirectional metabolite flow (Chapter 10). The unidirectional flow of metabolites through a pathway with a large overall negative change in free energy is analogous to the flow of water through a pipe in which one end is lower than the other. Bends or kinks in the pipe simulate individual enzyme-catalyzed steps with a small negative or positive change in free energy. Flow of water through the pipe nevertheless remains unidirectional due to the overall change in height, which corresponds to the overall change in free energy in a pathway (Figure 9–3).

Figure 9–2. An idealized cell in steady state. Note that metabolite flow is unidirectional.

Figure 9–3. Hydrostatic analogy for a pathway with a rate-limiting step (A) and a step with a ∆G value near zero (B).

COMPARTMENTATION ENSURES METABOLIC EFFICIENCY & SIMPLIFIES REGULATION

In eukaryotes, anabolic and catabolic pathways that interconvert common products may take place in specific subcellular compartments. For example, many of the enzymes that degrade proteins and polysaccharides reside inside organelles called lysosomes. Similarly, fatty acid biosynthesis occurs in the cytosol, whereas fatty acid oxidation takes place within mitochondria (Chapters 21 and 22). Segregation of certain metabolic pathways within specialized cell types can provide further physical compartmentation. Alternatively, possession of one or more unique intermediates can permit apparently opposing pathways to coexist even in the absence of physical barriers. For example, despite many shared intermediates and enzymes, both glycolysis and gluconeogenesis are favored energetically. This could not be true if all the reactions were the same. If one pathway were favored energetically, the other would be accompanied by a change in free energy ∆G equal in magnitude but opposite in sign. Simultaneous spontaneity of both pathways results from substitution of one or more reactions by different reactions favored thermodynamically in the opposite direction. The glycolytic enzyme phosphofructokinase (Chapter 17) is replaced by the gluconeogenic enzyme fructose-1,6-bisphosphatase (Chapter 19). The ability of enzymes to discriminate between the structurally similar coenzymes NAD+ and NADP+ also results in a form of compartmentation, since it segregates the electrons of NADH that are destined for ATP generation from those of NADPH that participate in the reductive steps in many biosynthetic pathways.

Controlling an Enzyme That Catalyzes a Rate-Limiting Reaction Regulates an Entire Metabolic Pathway

While the flux of metabolites through metabolic pathways involves catalysis by numerous enzymes, active control of homeostasis is achieved by regulation of only a small number of enzymes. The ideal enzyme for regulatory intervention is one whose quantity or catalytic efficiency dictates that the reaction it catalyzes is slow relative to all others in the pathway. Decreasing the catalytic efficiency or the quantity of the catalyst for the “bottleneck” or rate-limiting reaction immediately reduces metabolite flux through the entire pathway. Conversely, an increase in either its quantity or catalytic efficiency enhances flux through the pathway as a whole. For example, acetyl-CoA carboxylase catalyzes the synthesis of malonyl-CoA, the first committed reaction of fatty acid biosynthesis (Chapter 21). When synthesis of malonyl-CoA is inhibited, subsequent reactions of fatty acid synthesis cease due to lack of substrates. Enzymes that catalyze rate-limiting steps serve as natural “governors” of metabolic flux. Thus, they constitute efficient targets for regulatory intervention by drugs. For example, inhibition by “statin” drugs of HMG-CoA reductase, which catalyzes the rate-limiting reaction of cholesterogenesis, curtails synthesis of cholesterol.

REGULATION OF ENZYME QUANTITY

The catalytic capacity of the rate-limiting reaction in a metabolic pathway is the product of the concentration of enzyme molecules and their intrinsic catalytic efficiency. It therefore follows that catalytic capacity can be influenced both by changing the quantity of enzyme present and by altering its intrinsic catalytic efficiency.

Control of Enzyme Synthesis

Enzymes whose concentrations remain essentially constant over time are termed constitutive enzymes. By contrast, the concentrations of many other enzymes depend upon the presence of inducers, typically substrates or structurally related compounds, that initiate their synthesis. Escherichia coli grown on glucose will, for example, only catabolize lactose after addition of a β-galactoside, an inducer that initiates synthesis of a β-galactosidase and a galactoside permease (Figure 39–3). Inducible enzymes of humans include tryptophan pyrrolase, threonine dehydrase, tyrosine-α-ketoglutarate aminotransferase, enzymes of the urea cycle, HMG-CoA reductase, and cytochrome P450. Conversely, an excess of a metabolite may curtail synthesis of its cognate enzyme via repression. Both induction and repression involve cis elements, specific DNA sequences located upstream of regulated genes, and trans-acting regulatory proteins. The molecular mechanisms of induction and repression are discussed in Chapter 39.

Control of Enzyme Degradation

The absolute quantity of an enzyme reflects the net balance between enzyme synthesis and enzyme degradation, where ks and kdeg represent the rate constants for the overall processes of synthesis and degradation, respectively. Changes in both the ks and kdeg of specific enzymes occur in human subjects.

$$\text{Amino acids} \overset{k_s}{\longrightarrow} \text{Enzyme} \overset{k_{deg}}{\longrightarrow} \text{Amino acids}$$

Protein turnover represents the net result of enzyme synthesis and degradation. By measuring the rates of incorporation of 15N-labeled amino acids into protein and the rates of loss of 15N from protein, Schoenheimer deduced that body proteins are in a state of “dynamic equilibrium” in which they are continuously synthesized and degraded. Mammalian proteins are degraded both by ATP- and ubiquitin-dependent pathways and by ATP-independent pathways (Chapter 29). Susceptibility to proteolytic degradation can be influenced by the presence of ligands such as substrates, coenzymes, or metal ions that alter protein conformation. Intracellular ligands thus can influence the rates at which specific enzymes are degraded.

Enzyme levels in mammalian tissues respond to a wide range of physiologic, hormonal, or dietary factors. For example, glucocorticoids increase the concentration of tyrosine aminotransferase by stimulating ks, and glucagon, despite its antagonistic physiologic effects, increases ks fourfold to fivefold. Regulation of liver arginase can involve changes either in ks or in kdeg. After a protein-rich meal, liver arginase levels rise and arginine synthesis decreases (Chapter 29). Arginase levels also rise in starvation, but here arginase degradation decreases, whereas ks remains unchanged. Similarly, injection of glucocorticoids and ingestion of tryptophan both elevate levels of tryptophan oxygenase. While the hormone raises ks for oxygenase synthesis, tryptophan specifically lowers kdeg by stabilizing the oxygenase against proteolytic digestion.

MULTIPLE OPTIONS ARE AVAILABLE FOR REGULATING CATALYTIC ACTIVITY

In humans, the induction of protein synthesis is a complex multistep process that typically requires hours to produce significant changes in overall enzyme level. By contrast, changes in intrinsic catalytic efficiency effected by binding of dissociable ligands (allosteric regulation) or by covalent modification achieve regulation of enzymic activity within seconds. Changes in protein level serve long-term adaptive requirements, whereas changes in catalytic efficiency are best suited for rapid and transient alterations in metabolite flux.

ALLOSTERIC EFFECTORS REGULATE CERTAIN ENZYMES

Feedback inhibition refers to inhibition of an enzyme in a biosynthetic pathway by an end product of that pathway. For example, for the biosynthesis of D from A catalyzed by enzymes Enz1 through Enz3,

$$A \overset{\text{Enz}_1}{\longrightarrow} B \overset{\text{Enz}_2}{\longrightarrow} C \overset{\text{Enz}_3}{\longrightarrow} D$$

high concentrations of D inhibit conversion of A to B. Inhibition results not from the “backing up” of intermediates but from the ability of D to bind to and inhibit Enz1. Typically, D binds at an allosteric site spatially distinct from the catalytic site of the target enzyme. Feedback inhibitors thus are allosteric effectors and typically bear little or no structural similarity to the substrates of the enzymes they inhibit. In this example, the feedback inhibitor D acts as a negative allosteric effector of Enz1.

In a branched biosynthetic pathway, the initial reactions participate in the synthesis of several products. Figure 9–4 shows a hypothetical branched biosynthetic pathway in which curved arrows lead from feedback inhibitors to the enzymes whose activity they inhibit. The sequences S3 → A, S4 → B, S4 → C, and S3 → → D each represent linear reaction sequences that are feedback-inhibited by their end products. The pathways of nucleotide biosynthesis (Chapter 34) provide specific examples.
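The passive form of regulation described earlier in this chapter (Figure 9–1) can be checked with a short calculation. The sketch below assumes simple Michaelis-Menten behavior and uses hypothetical values (Km = 1.0, Vmax = 100.0, arbitrary units) to compare the change in rate produced by the same increment in substrate concentration when [S] is at Km and when [S] is far above Km.

```python
def rate(s, km, vmax):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def rate_change(s, delta_s, km=1.0, vmax=100.0):
    """Change in rate when [S] rises from s to s + delta_s (hypothetical Km, Vmax)."""
    return rate(s + delta_s, km, vmax) - rate(s, km, vmax)


DELTA_S = 0.5

dv_a = rate_change(1.0, DELTA_S)   # [S] = Km, corresponding to ∆VA in Figure 9-1
dv_b = rate_change(20.0, DELTA_S)  # [S] = 20 x Km, corresponding to ∆VB

print(f"Rate change near Km:      {dv_a:.3f}")
print(f"Rate change far above Km: {dv_b:.3f}")
```

With these assumed numbers the same increment in [S] changes the rate roughly ninety times more near Km than at twenty times Km, which is why an enzyme working near saturation responds poorly to a rise in substrate level.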
Figure 9–4. Sites of feedback inhibition in a branched biosynthetic pathway. S1–S5 are intermediates in the biosynthesis of end products A–D. Straight arrows represent enzymes catalyzing the indicated conversions. Curved arrows represent feedback loops and indicate sites of feedback inhibition by specific end products.

The kinetics of feedback inhibition may be competitive, noncompetitive, partially competitive, or mixed. Feedback inhibitors, which frequently are the small molecule building blocks of macromolecules (eg, amino acids for proteins, nucleotides for nucleic acids), typically inhibit the first committed step in a particular biosynthetic sequence. A much-studied example is inhibition of bacterial aspartate transcarbamoylase by CTP (see below and Chapter 34).

Multiple feedback loops can provide additional fine control. For example, as shown in Figure 9–5, the presence of excess product B decreases the requirement for substrate S2. However, S2 is also required for synthesis of A, C, and D. Excess B should therefore also curtail synthesis of all four end products. To circumvent this potential difficulty, each end product typically only partially inhibits catalytic activity. The effect of an excess of two or more end products may be strictly additive or, alternatively, may be greater than their individual effect (cooperative feedback inhibition).

Figure 9–5. Multiple feedback inhibition in a branched biosynthetic pathway. Superimposed on simple feedback loops (dashed, curved arrows) are multiple feedback loops (solid, curved arrows) that regulate enzymes common to biosynthesis of several end products.

Aspartate Transcarbamoylase Is a Model Allosteric Enzyme

Aspartate transcarbamoylase (ATCase), the catalyst for the first reaction unique to pyrimidine biosynthesis (Figure 34–7), is feedback-inhibited by cytidine triphosphate (CTP). Following treatment with mercurials, ATCase loses its sensitivity to inhibition by CTP but retains its full activity for synthesis of carbamoyl aspartate. This suggests that CTP is bound at a different (allosteric) site from either substrate. ATCase consists of multiple catalytic and regulatory subunits. Each catalytic subunit contains four aspartate (substrate) sites and each regulatory subunit at least two CTP (regulatory) sites (Chapter 34).

Allosteric & Catalytic Sites Are Spatially Distinct

The lack of structural similarity between a feedback inhibitor and the substrate for the enzyme whose activity it regulates suggests that these effectors are not isosteric with a substrate but allosteric (“occupy another space”). Jacques Monod therefore proposed the existence of allosteric sites that are physically distinct from the catalytic site. Allosteric enzymes thus are those whose activity at the active site may be modulated by the presence of effectors at an allosteric site. This hypothesis has been confirmed by many lines of evidence, including x-ray crystallography and site-directed mutagenesis, demonstrating the existence of spatially distinct active and allosteric sites on a variety of enzymes.

Allosteric Effects May Be on Km or on Vmax

To refer to the kinetics of allosteric inhibition as “competitive” or “noncompetitive” with substrate carries misleading mechanistic implications. We refer instead to two classes of regulated enzymes: K-series and V-series enzymes. For K-series allosteric enzymes, the substrate saturation kinetics are competitive in the sense that Km is raised without an effect on Vmax. For V-series allosteric enzymes, the allosteric inhibitor lowers Vmax without affecting the Km.
Alterations in Km or Vmax REGULATORY COVALENT probably result from conformational changes at the cat- MODIFICATIONS CAN BE alytic site induced by binding of the allosteric effector REVERSIBLE OR IRREVERSIBLE at the allosteric site. For a K-series allosteric enzyme, this conformational change may weaken the bonds be- In mammalian cells, the two most common forms of tween substrate and substrate-binding residues. For a covalent modification are partial proteolysis and V-series allosteric enzyme, the primary effect may be to phosphorylation. Because cells lack the ability to re- alter the orientation or charge of catalytic residues, low- unite the two portions of a protein produced by hydrol- ering Vmax. Intermediate effects on Km and Vmax, how- ysis of a peptide bond, proteolysis constitutes an irre- ever, may be observed consequent to these conforma- versible modification. By contrast, phosphorylation is a tional changes. reversible modification process. The phosphorylation of proteins on seryl, threonyl, or tyrosyl residues, catalyzed FEEDBACK REGULATION by protein kinases, is thermodynamically spontaneous. IS NOT SYNONYMOUS WITH Equally spontaneous is the hydrolytic removal of these phosphoryl groups by enzymes called protein phos- FEEDBACK INHIBITION phatases. In both mammalian and bacterial cells, end products “feed back” and control their own synthesis, in many PROTEASES MAY BE SECRETED AS instances by feedback inhibition of an early biosyn- CATALYTICALLY INACTIVE PROENZYMES thetic enzyme. We must, however, distinguish between feedback regulation, a phenomenologic term devoid Certain proteins are synthesized and secreted as inactive of mechanistic implications, and feedback inhibition, precursor proteins known as proproteins. The propro- a mechanism for regulation of enzyme activity. For ex- teins of enzymes are termed proenzymes or zymogens. 
ample, while dietary cholesterol decreases hepatic synthesis of cholesterol, this feedback regulation does not involve feedback inhibition. HMG-CoA reductase, the rate-limiting enzyme of cholesterologenesis, is affected, but cholesterol does not feedback-inhibit its activity. Regulation in response to dietary cholesterol involves curtailment by cholesterol or a cholesterol metabolite of the expression of the gene that encodes HMG-CoA reductase (enzyme repression) (Chapter 26).

MANY HORMONES ACT THROUGH ALLOSTERIC SECOND MESSENGERS

Nerve impulses, and binding of hormones to cell surface receptors, elicit changes in the rate of enzyme-catalyzed reactions within target cells by inducing the release or synthesis of specialized allosteric effectors called second messengers. The primary or “first” messenger is the hormone molecule or nerve impulse. Second messengers include 3′,5′-cAMP, synthesized from ATP by the enzyme adenylyl cyclase in response to the hormone epinephrine, and Ca²⁺, which is stored inside the endoplasmic reticulum of most cells. Membrane depolarization resulting from a nerve impulse opens a membrane channel that releases calcium ion into the cytoplasm, where it binds to and activates enzymes involved in the regulation of contraction and the mobilization of stored glucose from glycogen. Glucose then supplies the increased energy demands of muscle contraction. Other second messengers include 3′,5′-cGMP and polyphosphoinositols, produced by the hydrolysis of inositol phospholipids by hormone-regulated phospholipases.

Selective proteolysis converts a proprotein by one or more successive proteolytic “clips” to a form that exhibits the characteristic activity of the mature protein, eg, its enzymatic activity. Proteins synthesized as proproteins include the hormone insulin (proprotein = proinsulin), the digestive enzymes pepsin, trypsin, and chymotrypsin (proproteins = pepsinogen, trypsinogen, and chymotrypsinogen, respectively), several factors of the blood clotting and blood clot dissolution cascades (see Chapter 51), and the connective tissue protein collagen (proprotein = procollagen).

Proenzymes Facilitate Rapid Mobilization of an Activity in Response to Physiologic Demand

The synthesis and secretion of proteases as catalytically inactive proenzymes protects the tissue of origin (eg, the pancreas) from autodigestion, such as can occur in pancreatitis. Certain physiologic processes such as digestion are intermittent but fairly regular and predictable. Others such as blood clot formation, clot dissolution, and tissue repair are brought “on line” only in response to pressing physiologic or pathophysiologic need. The processes of blood clot formation and dissolution clearly must be temporally coordinated to achieve homeostasis. Enzymes needed intermittently but rapidly often are secreted in an initially inactive form since the secretion process or new synthesis of the required proteins might be insufficiently rapid for response to a pressing pathophysiologic demand such as the loss of blood.

[Figure 9–6 diagram: Pro-CT, a single chain of residues 1–245; π-CT, same residues; α-CT, residues 14–15 and 147–148 removed, leaving peptides 1–13, 16–146, and 149–245 joined by S–S bonds.]

Figure 9–6. Selective proteolysis and associated conformational changes form the active site of chymotrypsin, which includes the Asp102-His57-Ser195 catalytic triad. Successive proteolysis forms prochymotrypsin (pro-CT), π-chymotrypsin (π-CT), and ultimately α-chymotrypsin (α-CT), an active protease whose three peptides remain associated by covalent inter-chain disulfide bonds.
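The residue numbering in Figure 9–6 can be followed with a small bookkeeping sketch. The Python below is an illustration only: it tracks which residue ranges survive each step and says nothing about conformation or disulfide bonds. The function names are hypothetical, and the single clip between residues 15 and 16 that gives π-chymotrypsin is an assumption read from the figure labels.

```python
# Illustrative sketch only: tracks residue ranges, not structure or chemistry.
# Assumption: pi-chymotrypsin arises from one clip between residues 15 and 16.

def clip(chains, after):
    """Cut the peptide bond that follows residue `after`; no residues are lost."""
    result = []
    for start, end in chains:
        if start <= after < end:
            result.extend([(start, after), (after + 1, end)])
        else:
            result.append((start, end))
    return result

def excise(chains, first, last):
    """Remove residues first..last (inclusive) from a list of (start, end) chains."""
    result = []
    for start, end in chains:
        if last < start or first > end:      # no overlap: keep the chain as is
            result.append((start, end))
            continue
        if start < first:                    # piece to the left of the excised stretch
            result.append((start, first - 1))
        if last < end:                       # piece to the right of the excised stretch
            result.append((last + 1, end))
    return result

def chain_of(chains, residue):
    """Return the (start, end) chain containing `residue`, or None if it was excised."""
    for start, end in chains:
        if start <= residue <= end:
            return (start, end)
    return None

pro_ct = [(1, 245)]                                   # prochymotrypsin: one chain
pi_ct = clip(pro_ct, after=15)                        # one clip gives two chains
alpha_ct = excise(excise(pi_ct, 14, 15), 147, 148)    # two dipeptides removed

print(pi_ct)     # [(1, 15), (16, 245)]
print(alpha_ct)  # [(1, 13), (16, 146), (149, 245)]

# Locate the catalytic triad named in the figure caption.
for name, position in [("His", 57), ("Asp", 102), ("Ser", 195)]:
    print(name, position, chain_of(alpha_ct, position))
```

In this bookkeeping, His 57 and Asp 102 fall in the 16–146 peptide and Ser 195 in the 149–245 peptide, so the triad spans two chains that must be brought together by folding.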
Activation of Prochymotrypsin Requires Selective Proteolysis

Selective proteolysis involves one or more highly specific proteolytic clips that may or may not be accompanied by separation of the resulting peptides. Most importantly, selective proteolysis often results in conformational changes that “create” the catalytic site of an enzyme. Note that while His 57 and Asp 102 reside on the B peptide of α-chymotrypsin, Ser 195 resides on the C peptide (Figure 9–6). The conformational changes that accompany selective proteolysis of prochymotrypsin (chymotrypsinogen) align the three residues of the charge-relay network, creating the catalytic site. Note also that contact and catalytic residues can be located on different peptide chains but still be within bond-forming distance of bound substrate.

REVERSIBLE COVALENT MODIFICATION REGULATES KEY MAMMALIAN ENZYMES

Mammalian proteins are the targets of a wide range of covalent modification processes. Modifications such as glycosylation, hydroxylation, and fatty acid acylation introduce new structural features into newly synthesized proteins that tend to persist for the lifetime of the protein. Among the covalent modifications that regulate protein function (eg, methylation, adenylylation), the most common by far is phosphorylation-dephosphorylation. Protein kinases phosphorylate proteins by catalyzing transfer of the terminal phosphoryl group of ATP to the hydroxyl groups of seryl, threonyl, or tyrosyl residues, forming O-phosphoseryl, O-phosphothreonyl, or O-phosphotyrosyl residues, respectively (Figure 9–7). Some protein kinases target the side chains of histidyl, lysyl, arginyl, and aspartyl residues. The unmodified form of the protein can be regenerated by hydrolytic removal of phosphoryl groups, catalyzed by protein phosphatases.

[Figure 9–7 diagram: a kinase, with Mg²⁺, uses ATP (releasing ADP) to convert Enz–Ser–OH to Enz–Ser–O–PO₃²⁻; a phosphatase, with Mg²⁺, uses H₂O (releasing Pi) to reverse the reaction.]

Figure 9–7. Covalent modification of a regulated enzyme by phosphorylation-dephosphorylation of a seryl residue.

A typical mammalian cell possesses over 1000 phosphorylated proteins and several hundred protein kinases and protein phosphatases that catalyze their interconversion. The ease of interconversion of enzymes between their phospho- and dephospho- forms in part accounts for the frequency of phosphorylation-dephosphorylation as a mechanism for regulatory control. Phosphorylation-dephosphorylation permits the functional properties of the affected enzyme to be altered only for as long as it serves a specific need. Once the need has passed, the enzyme can be converted back to its original form, poised to respond to the next stimulatory event. A second factor underlying the widespread use of protein phosphorylation-dephosphorylation lies in the chemical properties of the phosphoryl group itself. In order to alter an enzyme’s functional properties, any modification of its chemical structure must influence the protein’s three-dimensional configuration. The high charge density of protein-bound phosphoryl groups (generally −2 at physiologic pH) and their propensity to form salt bridges with arginyl residues make them potent agents for modifying protein structure and function. Phosphorylation generally targets amino acids distant from the catalytic site itself. Consequent conformational changes then influence an enzyme’s intrinsic catalytic efficiency or other properties. In this sense, the sites of phosphorylation and other covalent modifications can be considered another form of allosteric site. However, in this case the “allosteric ligand” binds covalently to the protein.

PROTEIN PHOSPHORYLATION IS EXTREMELY VERSATILE

Protein phosphorylation-dephosphorylation is a highly versatile and selective process. Not all proteins are subject to phosphorylation, and of the many hydroxyl groups on a protein’s surface, only one or a small subset are targeted. While the most common enzyme function affected is the protein’s catalytic efficiency, phosphorylation can also alter the affinity for substrates, location within the cell, or responsiveness to regulation by allosteric ligands. Phosphorylation can increase an enzyme’s catalytic efficiency, converting it to its active form in one protein, while phosphorylation of another converts it into an intrinsically inefficient, or inactive, form (Table 9–1).

Table 9–1. Examples of mammalian enzymes whose catalytic activity is altered by covalent phosphorylation-dephosphorylation.

                                Activity State¹
Enzyme                          Low     High
Acetyl-CoA carboxylase          EP      E
Glycogen synthase               EP      E
Pyruvate dehydrogenase          EP      E
HMG-CoA reductase               EP      E
Glycogen phosphorylase          E       EP
Citrate lyase                   E       EP
Phosphorylase b kinase          E       EP
HMG-CoA reductase kinase        E       EP

¹ E, dephosphoenzyme; EP, phosphoenzyme.

Many proteins can be phosphorylated at multiple sites or are subject to regulation both by phosphorylation-dephosphorylation and by the binding of allosteric ligands. Phosphorylation-dephosphorylation at any one site can be catalyzed by multiple protein kinases or protein phosphatases. Many protein kinases and most protein phosphatases act on more than one protein and are themselves interconverted between active and inactive forms by the binding of second messengers or by covalent modification by phosphorylation-dephosphorylation.

The interplay between protein kinases and protein phosphatases, between the functional consequences of phosphorylation at different sites, or between phosphorylation sites and allosteric sites provides the basis for regulatory networks that integrate multiple environmental input signals to evoke an appropriate coordinated cellular response. In these sophisticated regulatory networks, individual enzymes respond to different environmental signals. For example, if an enzyme can be phosphorylated at a single site by more than one protein kinase, it can be converted from a catalytically efficient to an inefficient (inactive) form, or vice versa, in response to any one of several signals. If the protein kinase is activated in response to a signal different from the signal that activates the protein phosphatase, the phosphoprotein becomes a decision node. The functional output, generally catalytic activity, reflects the phosphorylation state. This state or degree of phosphorylation is determined by the relative activities of the protein kinase and protein phosphatase, a reflection of the presence and relative strength of the environmental signals that act through each. The ability of many protein kinases and protein phosphatases to target more than one protein provides a means for an environmental signal to coordinately regulate multiple metabolic processes. For example, the enzymes 3-hydroxy-3-methylglutaryl-CoA reductase and acetyl-CoA carboxylase (the rate-controlling enzymes for cholesterol and fatty acid biosynthesis, respectively) are phosphorylated and inactivated by the AMP-activated protein kinase. When this protein kinase is activated either through phosphorylation by yet another protein kinase or in response to the binding of its allosteric activator 5′-AMP, the two major pathways responsible for the synthesis of lipids from acetyl-CoA both are inhibited. Interconvertible enzymes and the enzymes responsible for their interconversion do not act as mere on and off switches working independently of one another. They form the building blocks of biomolecular computers that maintain homeostasis in cells that carry out a complex array of metabolic processes that must be regulated in response to a broad spectrum of environmental factors.

Covalent Modification Regulates Metabolite Flow

Regulation of enzyme activity by phosphorylation-dephosphorylation has analogies to regulation by feedback inhibition. Both provide for short-term, readily reversible regulation of metabolite flow in response to specific physiologic signals. Both act without altering gene expression. Both act on early enzymes of a protracted, often biosynthetic metabolic sequence, and both act at allosteric rather than catalytic sites. Feedback inhibition, however, involves a single protein and lacks hormonal and neural features. By contrast, regulation of mammalian enzymes by phosphorylation-dephosphorylation involves several proteins and ATP and is under direct neural and hormonal control.

SUMMARY

• Homeostasis involves maintaining a relatively constant intracellular and intra-organ environment despite wide fluctuations in the external environment via appropriate changes in the rates of biochemical reactions in response to physiologic need.
• The substrates for most enzymes are usually present at a concentration close to Km. This facilitates passive control of the rates of product formation in response to changes in levels of metabolic intermediates.
• Active control of metabolite flux involves changes in the concentration, catalytic activity, or both of an enzyme that catalyzes a committed, rate-limiting reaction.
• Selective proteolysis of catalytically inactive proenzymes initiates conformational changes that form the active site. Secretion as an inactive proenzyme facilitates rapid mobilization of activity in response to injury or physiologic need and may protect the tissue of origin (eg, from autodigestion by proteases).
• Binding of metabolites and second messengers to sites distinct from the catalytic site of enzymes triggers conformational changes that alter Vmax or the Km.
• Phosphorylation by protein kinases of specific seryl, threonyl, or tyrosyl residues, and subsequent dephosphorylation by protein phosphatases, regulates the activity of many human enzymes. The protein kinases and phosphatases that participate in regulatory cascades which respond to hormonal or second messenger signals constitute a “bio-organic computer” that can process and integrate complex environmental information to produce an appropriate and comprehensive cellular response.

REFERENCES

Bray D: Protein molecules as computational elements in living cells. Nature 1995;376:307.
Graves DJ, Martin BL, Wang JH: Co- and Post-translational Modification of Proteins: Chemical Principles and Biological Effects. Oxford Univ Press, 1994.
Johnson LN, Barford D: The effect of phosphorylation on the structure and function of proteins. Annu Rev Biophys Biomol Struct 1993;22:199.
Marks F (editor): Protein Phosphorylation. VCH Publishers, 1996.
Pilkis SJ et al: 6-Phosphofructo-2-kinase/fructose-2,6-bisphosphatase: A metabolic signaling enzyme. Annu Rev Biochem 1995;64:799.
Scriver CR et al (editors): The Metabolic and Molecular Bases of Inherited Disease, 8th ed. McGraw-Hill, 2000.
Sitaramayya A (editor): Introduction to Cellular Signal Transduction. Birkhauser, 1999.
Stadtman ER, Chock PB (editors): Current Topics in Cellular Regulation. Academic Press, 1969 to the present.
Weber G (editor): Advances in Enzyme Regulation. Pergamon Press, 1963 to the present.

Molecular Genetics, Recombinant DNA, & Genomic Technology 40

Daryl K. Granner, MD, & P.
Anthony Weil, PhD

BIOMEDICAL IMPORTANCE*

The development of recombinant DNA, high-density, high-throughput screening, and other molecular genetic methodologies has revolutionized biology and is having an increasing impact on clinical medicine. Much has been learned about human genetic disease from pedigree analysis and study of affected proteins, but in many cases where the specific genetic defect is unknown, these approaches cannot be used. The new technologies circumvent these limitations by going directly to the DNA molecule for information. Manipulation of a DNA sequence and the construction of chimeric molecules (so-called genetic engineering) provides a means of studying how a specific segment of DNA works. Novel molecular genetic tools allow investigators to query and manipulate genomic sequences as well as to examine both cellular mRNA and protein profiles at the molecular level.

Understanding this technology is important for several reasons: (1) It offers a rational approach to understanding the molecular basis of a number of diseases (eg, familial hypercholesterolemia, sickle cell disease, the thalassemias, cystic fibrosis, muscular dystrophy). (2) Human proteins can be produced in abundance for therapy (eg, insulin, growth hormone, tissue plasminogen activator). (3) Proteins for vaccines (eg, hepatitis B) and for diagnostic testing (eg, AIDS tests) can be obtained. (4) This technology is used to diagnose existing diseases and predict the risk of developing a given disease. (5) Special techniques have led to remarkable advances in forensic medicine. (6) Gene therapy for sickle cell disease, the thalassemias, adenosine deaminase deficiency, and other diseases may be devised.

* See glossary of terms at the end of this chapter.

ELUCIDATION OF THE BASIC FEATURES OF DNA LED TO RECOMBINANT DNA TECHNOLOGY

DNA Is a Complex Biopolymer Organized as a Double Helix

The fundamental organizational element is the sequence of purine (adenine [A] or guanine [G]) and pyrimidine (cytosine [C] or thymine [T]) bases. These bases are attached to the C-1′ position of the sugar deoxyribose, and the bases are linked together through joining of the sugar moieties at their 3′ and 5′ positions via a phosphodiester bond (Figure 35–1). The alternating deoxyribose and phosphate groups form the backbone of the double helix (Figure 35–2). These 3′–5′ linkages also define the orientation of a given strand of the DNA molecule, and, since the two strands run in opposite directions, they are said to be antiparallel.

Base Pairing Is a Fundamental Concept of DNA Structure & Function

Adenine and thymine always pair, by hydrogen bonding, as do guanine and cytosine (Figure 35–3). These base pairs are said to be complementary, and the guanine content of a fragment of double-stranded DNA will always equal its cytosine content; likewise, the thymine and adenine contents are equal. Base pairing and hydrophobic base-stacking interactions hold the two DNA strands together. These interactions can be reduced by heating the DNA to denature it. The laws of base pairing predict that two complementary DNA strands will reanneal exactly in register upon renaturation, as happens when the temperature of the solution is slowly reduced to normal. Indeed, the degree of base-pair matching (or mismatching) can be estimated from the temperature required for denaturation-renaturation. Segments of DNA with high degrees of base-pair matching require more energy input (heat) to accomplish denaturation; or, to put it another way, a closely matched segment will withstand more heat before the strands separate. This reaction is used to determine whether there are significant differences between two DNA sequences, and it underlies the concept of hybridization, which is fundamental to the processes described below.

There are about 3 × 10⁹ base pairs (bp) in each human haploid genome. If an average gene length is 3 × 10³ bp (3 kilobases [kb]), the genome could consist of 10⁶ genes, assuming that there is no overlap and that transcription proceeds in only one direction. It is thought that there are < 10⁵ genes in the human and that only 1–2% of the DNA codes for proteins. The exact function of the remaining ~98% of the human genome has not yet been defined.

The double-helical DNA is packaged into a more compact structure by a number of proteins, most notably the basic proteins called histones. This condensation may serve a regulatory role and certainly has a practical purpose. The DNA present within the nucleus of a cell, if simply extended, would be about 1 meter long. The chromosomal proteins compact this long strand of DNA so that it can be packaged into a nucleus with a volume of a few cubic micrometers.

DNA Is Organized Into Genes

In general, prokaryotic genes consist of a small regulatory region (100–500 bp) and a large protein-coding segment (500–10,000 bp). Several genes are often controlled by a single regulatory unit. Most mammalian genes are more complicated in that the coding regions are interrupted by noncoding regions that are eliminated when the primary RNA transcript is processed into mature messenger RNA (mRNA). The coding regions (those regions that appear in the mature RNA species) are called exons, and the noncoding regions, which interpose or intervene between the exons, are called introns (Figure 40–1). Introns are always removed from precursor RNA before transport into the cytoplasm occurs. The process by which introns are removed from precursor RNA and by which exons are ligated together is called RNA splicing. Incorrect processing of the primary transcript into the mature mRNA can result in disease in humans (see below); this underscores the importance of these posttranscriptional processing steps. The variation in size and complexity of some human genes is illustrated in Table 40–1. Although there is a 300-fold difference in the sizes of the genes illustrated, the mRNA sizes vary only about 20-fold. This is because most of the DNA in genes is present as introns, and introns tend to be much larger than exons. Regulatory regions for specific eukaryotic genes are usually located in the DNA that flanks the transcription initiation site at its 5′ end (5′ flanking-sequence DNA). Occasionally, such sequences are found within the gene itself or in the region that flanks the 3′ end of the gene. In mammalian cells, each gene has its own regulatory region. Many eukaryotic genes (and some viruses that replicate in mammalian cells) have special regions, called enhancers, that increase the rate of transcription. Some genes also have DNA sequences, known as silencers, that repress transcription. Mammalian genes are obviously complicated, multicomponent structures.

Table 40–1. Variations in the size and complexity of some human genes and mRNAs.¹

Gene                      Gene Size (kb)   Number of Introns   mRNA Size (kb)
β-Globin                  1.5              2                   0.6
Insulin                   1.7              2                   0.4
β-Adrenergic receptor     3                0                   2.2
Albumin                   25               14                  2.1
LDL receptor              45               17                  5.5
Factor VIII               186              25                  9.0
Thyroglobulin             300              36                  8.7

¹ The sizes are given in kilobases (kb). The sizes of the genes include some proximal promoter and regulatory region sequences; these are generally about the same size for all genes. Genes vary in size from about 1500 base pairs (bp) to over 2 × 10⁶ bp. There is also great variation in the number of introns and exons. The β-adrenergic receptor gene is intronless, and the thyroglobulin gene has 36 introns. As noted by the smaller difference in mRNA sizes, introns comprise most of the gene sequence.

Genes Are Transcribed Into RNA

Information generally flows from DNA to mRNA to protein, as illustrated in Figure 40–1 and discussed in more detail in Chapter 39. This is a rigidly controlled process involving a number of complex steps, each of which no doubt is regulated by one or more enzymes or factors; faulty function at any of these steps can cause disease.

[Figure 40–1 diagram: DNA (5′ regulatory region, basal promoter with CAAT and TATA elements, transcription start site, exons separated by an intron, 5′ and 3′ noncoding regions, AATAAA poly(A) addition site) → transcription in the nucleus → primary RNA transcript → modification of 5′ and 3′ ends (cap and poly(A) tail) → removal of introns and splicing of exons → processed nuclear mRNA → transmembrane transport to the cytoplasm → translation → protein (NH2 to COOH).]

Figure 40–1. Organization of a eukaryotic transcription unit and the pathway of eukaryotic gene expression. Eukaryotic genes have structural and regulatory regions. The structural region consists of the coding DNA and 5′ and 3′ noncoding DNA sequences. The coding regions are divided into two parts: (1) exons, which eventually are ligated together to become mature RNA, and (2) introns, which are processed out of the primary transcript. The structural region is bounded at its 5′ end by the transcription initiation site and at its 3′ end by the polyadenylate addition or termination site. The promoter region, which contains specific DNA sequences that interact with various protein factors to regulate transcription, is discussed in detail in Chapters 37 and 39. The primary transcript has a special structure, a cap, at the 5′ end and a stretch of As at the 3′ end. This transcript is processed to remove the introns, and the mature mRNA is then transported to the cytoplasm, where it is translated into protein.

RECOMBINANT DNA TECHNOLOGY INVOLVES ISOLATION & MANIPULATION OF DNA TO MAKE CHIMERIC MOLECULES

Isolation and manipulation of DNA, including end-to-end joining of sequences from very different sources to make chimeric molecules (eg, molecules containing both human and bacterial DNA sequences in a sequence-independent fashion), is the essence of recombinant DNA research. This involves several unique techniques and reagents.

Restriction Enzymes Cut DNA Chains at Specific Locations

Certain endonucleases, enzymes that cut DNA at specific DNA sequences within the molecule (as opposed to exonucleases, which digest from the ends of DNA molecules), are a key tool in recombinant DNA research. These enzymes were called restriction enzymes because their presence in a given bacterium restricted the growth of certain bacterial viruses called bacteriophages. Restriction enzymes cut DNA of any source into short pieces in a sequence-specific manner, in contrast to most other enzymatic, chemical, or physical methods, which break DNA randomly. These defensive enzymes (hundreds have been discovered) protect the host bacterium against DNA from foreign organisms (primarily infective phages). However, they are present only in cells that also have a companion enzyme which methylates the host DNA, rendering it an unsuitable substrate for digestion by the restriction enzyme. Thus, site-specific DNA methylases and restriction enzymes always exist in pairs in a bacterium.

Restriction enzymes are named after the bacterium from which they are isolated. For example, EcoRI is from Escherichia coli, and BamHI is from Bacillus amyloliquefaciens (Table 40–2). The first three letters in the restriction enzyme name consist of the first letter of the genus (E) and the first two letters of the species (co). These may be followed by a strain designation (R) and a roman numeral (I) to indicate the order of discovery (eg, EcoRI, EcoRII). Each enzyme recognizes and cleaves a specific double-stranded DNA sequence that is 4–7 bp long. These DNA cuts result in blunt ends (eg, HpaI) or overlapping (sticky) ends (eg, BamHI) (Figure 40–2), depending on the mechanism used by the enzyme. Sticky ends are particularly useful in constructing hybrid or chimeric DNA molecules (see below). If the four nucleotides are distributed randomly in a given DNA molecule, one can calculate how frequently a given enzyme will cut a length of DNA. For each position in the DNA molecule, there are four possibilities (A, C, G, and T); therefore, a restriction enzyme that recognizes a 4-bp sequence cuts, on average, once every 256 bp (4⁴), whereas another enzyme that recognizes a 6-bp sequence cuts once every 4096 bp (4⁶). A given piece of DNA has a characteristic linear array of sites for the various enzymes dictated by the linear sequence of its bases; hence, a restriction map can be constructed. When DNA is digested with a given enzyme, the ends of all the fragments have the same DNA sequence. The fragments produced can be isolated by electrophoresis on agarose or polyacrylamide gels (see the discussion of blot transfer, below); this is an essential step in cloning and a major use of these enzymes.

Table 40–2. Selected restriction endonucleases and their sequence specificities.¹

Endonuclease   Sequence Recognized (Cleavage Sites Shown)   Bacterial Source
BamHI          G↓GATCC / CCTAG↑G                            Bacillus amyloliquefaciens H
BglII          A↓GATCT / TCTAG↑A                            Bacillus globigii
EcoRI          G↓AATTC / CTTAA↑G                            Escherichia coli RY13
EcoRII         ↓CCTGG / GGACC↑                              Escherichia coli R245
HindIII        A↓AGCTT / TTCGA↑A                            Haemophilus influenzae Rd
HhaI           GCG↓C / C↑GCG                                Haemophilus haemolyticus
HpaI           GTT↓AAC / CAA↑TTG                            Haemophilus parainfluenzae
MstII          CC↓TNAGG / GGANT↑CC                          Microcoleus strain
PstI           CTGCA↓G / G↑ACGTC                            Providencia stuartii 164
TaqI           T↓CGA / AGC↑T                                Thermus aquaticus YTI

¹ A, adenine; C, cytosine; G, guanine; T, thymine. Arrows show the site of cleavage; depending on the site, sticky ends (BamHI) or blunt ends (HpaI) may result. The length of the recognition sequence can be 4 bp (TaqI), 5 bp (EcoRII), 6 bp (EcoRI), or 7 bp (MstII) or longer. By convention, these are written in the 5′ → 3′ direction for the upper strand of each recognition sequence (shown first), and the lower strand is shown with the opposite (ie, 3′ → 5′) polarity. Note that most recognition sequences are palindromes (ie, the sequence reads the same in opposite directions on the two strands). A residue designated N means that any nucleotide is permitted.

[Figure 40–2 diagram: A. Sticky or staggered ends: BamHI cuts 5′-GGATCC-3′ / 3′-CCTAGG-5′ to give G + GATCC on the upper strand and CCTAG + G on the lower strand. B. Blunt ends: HpaI cuts 5′-GTTAAC-3′ / 3′-CAATTG-5′ to give GTT + AAC and CAA + TTG.]

Figure 40–2. Results of restriction endonuclease digestion. Digestion with a restriction endonuclease can result in the formation of DNA fragments with sticky, or cohesive, ends (A) or blunt ends (B). This is an important consideration in devising cloning strategies.

A number of other enzymes that act on DNA and RNA are an important part of recombinant DNA technology. Many of these are referred to in this and subsequent chapters (Table 40–3).

Table 40–3. Some of the enzymes used in recombinant DNA research.¹

Alkaline phosphatase
  Reaction: Dephosphorylates 5′ ends of RNA and DNA.
  Primary use: Removal of 5′-PO4 groups prior to kinase labeling to prevent self-ligation.
BAL 31 nuclease
  Reaction: Degrades both the 3′ and 5′ ends of DNA.
  Primary use: Progressive shortening of DNA molecules.
DNA ligase
  Reaction: Catalyzes bonds between DNA molecules.
  Primary use: Joining of DNA molecules.
DNA polymerase I
  Reaction: Synthesizes double-stranded DNA from single-stranded DNA.
  Primary use: Synthesis of double-stranded cDNA; nick translation; generation of blunt ends from sticky ends.
DNase I
  Reaction: Under appropriate conditions, produces single-stranded nicks in DNA.
  Primary use: Nick translation; mapping of hypersensitive sites; mapping protein-DNA interactions.
Exonuclease III
  Reaction: Removes nucleotides from 3′ ends of DNA.
  Primary use: DNA sequencing; mapping of DNA-protein interactions.
λ exonuclease
  Reaction: Removes nucleotides from 5′ ends of DNA.
  Primary use: DNA sequencing.
Polynucleotide kinase
  Reaction: Transfers terminal phosphate (γ position) from ATP to 5′-OH groups of DNA or RNA.
  Primary use: ³²P labeling of DNA or RNA.
Reverse transcriptase
  Reaction: Synthesizes DNA from RNA template.
  Primary use: Synthesis of cDNA from mRNA; RNA (5′ end) mapping studies.
S1 nuclease
  Reaction: Degrades single-stranded DNA.
  Primary use: Removal of “hairpin” in synthesis of cDNA; RNA mapping studies (both 5′ and 3′ ends).
Terminal transferase
  Reaction: Adds nucleotides to the 3′ ends of DNA.
  Primary use: Homopolymer tailing.

¹ Adapted and reproduced, with permission, from Emery AEH: Page 41 in: An Introduction to Recombinant DNA. Wiley, 1984.

Restriction Enzymes & DNA Ligase Are Used to Prepare Chimeric DNA Molecules

Sticky-end ligation is technically easy, but some special techniques are often required to overcome problems inherent in this approach. Sticky ends of a vector may reconnect with themselves, with no net gain of DNA. Sticky ends of fragments can also anneal, so that tandem heterogeneous inserts form. Also, sticky-end sites may not be available or in a convenient position. To circumvent these problems, an enzyme that generates blunt ends is used, and new ends are added using the enzyme terminal transferase. If poly d(G) is added to the 3′ ends of the vector and poly d(C) is added to the 3′ ends of the foreign DNA, the two molecules can only anneal to each other, thus circumventing the problems listed above. This procedure is called homopolymer tailing. Sometimes, synthetic blunt-ended duplex oligonucleotide linkers with a convenient restriction enzyme sequence are ligated to the blunt-ended DNA. Direct blunt-end ligation is accomplished using the enzyme bacteriophage T4 DNA ligase. This technique, though less efficient than sticky-end ligation, has the advantage of joining together any pairs of ends. The disadvantages are that there is no control over the orientation of insertion or the number of molecules annealed together, and there is no easy way to retrieve the insert.

Cloning Amplifies DNA

A clone is a large population of identical molecules, bacteria, or cells that arise from a common ancestor. Molecular cloning allows for the production of a large number of identical DNA molecules, which can then be characterized or used for other purposes. This technique is based on the fact that chimeric or hybrid DNA molecules can be constructed in cloning vectors (typically bacterial plasmids, phages, or cosmids), which then continue to replicate in a host cell under their own control systems. In this way, the chimeric DNA is amplified. The general procedure is illustrated in Figure 40–3.

[Figure 40–3 diagram: circular plasmid DNA is cut with EcoRI restriction endonuclease to give linear plasmid DNA with sticky ends (AATT/TTAA overhangs); human DNA cut with the same restriction nuclease gives a piece with the same sticky ends; annealing and DNA ligase yield a plasmid DNA molecule with a human DNA insert (recombinant DNA molecule).]

Figure 40–3. Use of restriction nucleases to make new recombinant or chimeric DNA molecules. When inserted back into a bacterial cell (by the process called transformation), typically only a single plasmid is taken up by a single cell, and the plasmid DNA replicates not only itself but also the physically linked new DNA insert. Since recombining the sticky ends, as indicated, regenerates the same DNA sequence recognized by the original restriction enzyme, the cloned DNA insert can be cleanly cut back out of the recombinant plasmid circle with this endonuclease. If a mixture of all of the DNA pieces created by treatment of total human DNA with a single restriction nuclease is used as the source of human DNA, a million or so different types of recombinant DNA molecules can be obtained, each pure in its own bacterial clone. (Modified and reproduced, with permission, from Cohen SN: The manipulation of genes. Sci Am [July] 1975;233:34.)

Bacterial plasmids are small, circular, duplex DNA molecules whose natural function is to confer antibiotic resistance to the host cell. Plasmids have several properties that make them extremely useful as cloning vectors. They exist as single or multiple copies within the bacterium and replicate independently from the bacterial DNA. The complete DNA sequence of many plasmids is known; hence, the precise location of restriction enzyme cleavage sites for inserting the foreign DNA is available. Plasmids are smaller than the host chromosome and are therefore easily separated from the latter, and the desired plasmid-inserted DNA is readily removed by cutting the plasmid with the enzyme specific for the restriction site into which the original piece of DNA was inserted.

Phages usually have linear DNA molecules into which foreign DNA can be inserted at several restriction enzyme sites. The chimeric DNA is collected after the phage proceeds through its lytic cycle and produces mature, infective phage particles. A major advantage of phage vectors is that while plasmids accept DNA pieces about 6–10 kb long, phages can accept DNA fragments 10–20 kb long, a limitation imposed by the amount of DNA that can be packed into the phage head.

Larger fragments of DNA can be cloned in cosmids, which combine the best features of plasmids and phages. Cosmids are plasmids that contain the DNA sequences, so-called cos sites, required for packaging lambda DNA into the phage particle. These vectors grow in the plasmid form in bacteria, but since much of the unnecessary lambda DNA has been removed, more chimeric DNA can be packaged into the particle head. It is not unusual for cosmids to carry inserts of chimeric DNA that are 35–50 kb long. Even larger pieces of DNA can be incorporated into bacterial artificial chromosome (BAC), yeast artificial chromosome (YAC), or E. coli bacteriophage P1-based (PAC) vectors. These vectors will accept and propagate DNA inserts of several hundred kilobases or more and have largely replaced the plasmid, phage, and cosmid vectors for some cloning and gene mapping applications. A comparison of these vectors is shown in Table 40–4.

Table 40–4. Cloning capacities of common cloning vectors.

Vector              DNA Insert Size
Plasmid pBR322      0.01–10 kb
Lambda charon 4A    10–20 kb
Cosmids             35–50 kb
BAC, P1             50–250 kb
YAC                 500–3000 kb

Because insertion of DNA into a functional region of the vector will interfere with the action of this region, care must be taken not to interrupt an essential function of the vector. This concept can be exploited, however, to provide a selection technique. For example, the common plasmid vector pBR322 has both tetracycline (tet) and ampicillin (amp) resistance genes. A single PstI restriction enzyme site within the amp resistance gene is commonly used as the insertion site for a piece of foreign DNA. In addition to having sticky ends (Table 40–2 and Figure 40–2), the DNA inserted at this site disrupts the amp resistance gene and makes the bacterium carrying this plasmid amp-sensitive (Figure 40–4). Thus, the parental plasmid, which provides resistance to both antibiotics, can be readily separated from the chimeric plasmid, which is resistant only to tetracycline. YACs contain replication and segregation functions that work in both bacteria and yeast cells and therefore can be propagated in either organism.

ferent recombinant clones is called a library. A genomic library is prepared from the total DNA of a cell line or tissue. A cDNA library comprises complementary DNA copies of the population of mRNAs in a tissue. Genomic DNA libraries are often prepared by performing partial digestion of total DNA with a restriction enzyme that cuts DNA frequently (eg, a four base cutter such as TaqI). The idea is to generate rather large fragments so that most genes will be left intact. The BAC, YAC, and P1 vectors are preferred since they can accept very large fragments of DNA and thus offer a better chance of isolating an intact gene on a single DNA fragment.

A vector in which the protein coded by the gene introduced by recombinant DNA technology is actually synthesized is known as an expression vector. Such vectors are now commonly used to detect specific cDNA molecules in libraries and to produce proteins by genetic engineering techniques. These vectors are specially constructed to contain very active inducible promoters, proper in-phase translation initiation codons, both transcription and translation termination signals, and appropriate protein processing signals, if needed. Some expression vectors even contain genes that code for protease inhibitors, so that the final yield of product is enhanced.

Probes Search Libraries for Specific Genes or cDNA Molecules

A variety of molecules can be used to “probe” libraries in search of a specific gene or cDNA molecule or to define
and quantitate DNA or RNA separated by electrophore- In addition to the vectors described in Table 40–4 sis through various gels. Probes are generally pieces of that are designed primarily for propagation in bacterial DNA or RNA labeled with a 32P-containing nu- cells, vectors for mammalian cell propagation and insert cleotide—or fluorescently labeled nucleotides (more gene (cDNA)/protein expression have also been devel- commonly now). Importantly, neither modification (32P oped. These vectors are all based upon various eukary- or fluorescent-label) affects the hybridization properties otic viruses that are composed of RNA or DNA of the resulting labeled nucleic acid probes. The probe genomes. Notable examples of such viral vectors are must recognize a complementary sequence to be effec- those utilizing adenoviral (DNA-based) and retroviral tive. A cDNA synthesized from a specific mRNA can be (RNA-based) genomes. Though somewhat limited in used to screen either a cDNA library for a longer cDNA the size of DNA sequences that can be inserted, such or a genomic library for a complementary sequence in mammalian viral cloning vectors make up for this the coding region of a gene. A popular technique for shortcoming because they will efficiently infect a wide finding specific genes entails taking a short amino acid range of different cell types. For this reason, various sequence and, employing the codon usage for that mammalian viral vectors are being investigated for use species (see Chapter 38), making an oligonucleotide in gene therapy experiments. probe that will detect the corresponding DNA fragment in a genomic library. If the sequences match exactly, A Library Is a Collection probes 15–20 nucleotides long will hybridize. cDNA probes are used to detect DNA fragments on Southern of Recombinant Clones blot transfers and to detect and quantitate RNA on The combination of restriction enzymes and various Northern blot transfers. 
Specific antibodies can also be cloning vectors allows the entire genome of an organ- used as probes provided that the vector used synthesizes ism to be packed into a vector. A collection of these dif- protein molecules that are recognized by them. MOLECULAR GENETICS, RECOMBINANT DNA, & GENOMIC TECHNOLOGY / 403 Ampicillin Tetracycline resistance gene resistance gene EcoRI EcoRI Tetracycline HindIII resistance gene HindIII PstI BamHI BamHI Cut open with PstI SalI PstI PstI SalI Then insert PstI-cut DNA Ampr Amps Tetr Tetr Host pBR322 Chimeric pBR322 Figure 40–4. A method of screening recombinants for inserted DNA fragments. Using the plasmid pBR322, a piece of DNA is inserted into the unique PstI site. This insertion disrupts the gene coding for a protein that pro- vides ampicillin resistance to the host bacterium. Hence, the chimeric plasmid will no longer survive when plated on a substrate medium that contains this antibiotic. The differential sensitivity to tetracycline and ampicillin can therefore be used to distinguish clones of plasmid that contain an insert. A similar scheme relying upon produc- tion of an in-frame fusion of a newly inserted DNA producing a peptide fragment capable of complementing an inactive, deleted form of the enzyme β-galactosidase allows for blue-white colony formation on agar plates con- taining a dye hydrolyzable by β-galactoside. β-Galactosidase-positive colonies are blue. Blotting & Hybridization Techniques Allow renatured, and analyzed for an interaction by hybridiza- Visualization of Specific Fragments tion with a specific labeled DNA probe. Colony or plaque hybridization is the method by Visualization of a specific DNA or RNA fragment which specific clones are identified and purified. Bacte- among the many thousands of “contaminating” mole- ria are grown on colonies on an agar plate and overlaid cules requires the convergence of a number of tech- with nitrocellulose filter paper. 
Cells from each colony niques, collectively termed blot transfer. Figure 40–5 stick to the filter and are permanently fixed thereto by illustrates the Southern (DNA), Northern (RNA), and heat, which with NaOH treatment also lyses the cells Western (protein) blot transfer procedures. (The first is and denatures the DNA so that it will hybridize with named for the person who devised the technique, and the probe. A radioactive probe is added to the filter, the other names began as laboratory jargon but are now and (after washing) the hybrid complex is localized by accepted terms.) These procedures are useful in deter- exposing the filter to x-ray film. By matching the spot mining how many copies of a gene are in a given tissue on the autoradiograph to a colony, the latter can be or whether there are any gross alterations in a gene picked from the plate. A similar strategy is used to iden- (deletions, insertions, or rearrangements). Occasionally, tify fragments in phage libraries. Successive rounds of if a specific base is changed and a restriction site is al- this procedure result in a clonal isolate (bacterial tered, these procedures can detect a point mutation. colony) or individual phage plaque. The Northern and Western blot transfer techniques are All of the hybridization procedures discussed in this used to size and quantitate specific RNA and protein section depend on the specific base-pairing properties molecules, respectively. A fourth hybridization tech- of complementary nucleic acid strands described above. nique, the Southwestern blot, examines protein•DNA Perfect matches hybridize readily and withstand high interactions. Proteins are separated by electrophoresis, temperatures in the hybridization and washing reac- 404 / CHAPTER 40 Southern Northern Western tions. Specific complexes also form in the presence of low salt concentrations. 
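The dependence of hybridization on base pairing can be illustrated with a short Python sketch. It is a toy model only, not a laboratory method: the function names, the example sequences, and the `max_mismatches` parameter are all assumptions introduced here, with the mismatch limit standing in very loosely for stringency (0 for stringent conditions, larger values for relaxed ones). Real hybridization also depends on temperature, salt concentration, and probe length, none of which is modeled.

```python
# Illustrative sketch only: hybridization is reduced to counting
# base-pair mismatches between a probe and each window of a target strand.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the strand that would base-pair with seq (A-T, G-C rules)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def hybridization_sites(probe, target, max_mismatches=0):
    """Return 0-based positions on target where the probe could pair.

    max_mismatches is a hypothetical stand-in for stringency:
    0 models stringent conditions, larger values model relaxed ones.
    """
    wanted = reverse_complement(probe)  # sequence the probe pairs with
    hits = []
    for start in range(len(target) - len(wanted) + 1):
        window = target[start:start + len(wanted)]
        mismatches = sum(a != b for a, b in zip(window, wanted))
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


# Made-up target carrying one perfect site and one single-mismatch site.
target = "AAACCTGAGGAGTTTCCTGTGGAGAAA"
probe = reverse_complement("CCTGAGGAG")

print(hybridization_sites(probe, target))                    # stringent
print(hybridization_sites(probe, target, max_mismatches=1))  # relaxed
```

Under the stringent setting only the perfectly complementary site is reported; allowing one mismatch also reports the related site, which mirrors in a crude way how the outcome changes with stringency.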
Less than perfect matches do not tolerate these stringent conditions (ie, elevated temperatures and low salt concentrations); thus, hybridization either never occurs or is disrupted during the washing step. Gene families, in which there is some degree of homology, can be detected by varying the stringency of the hybridization and washing steps. Cross-species comparisons of a given gene can also be made using this approach. Hybridization conditions capable of detecting just a single base pair mismatch between probe and target have been devised.

Figure 40–5. The blot transfer procedure. In a Southern, or DNA, blot transfer, DNA isolated from a cell line or tissue is digested with one or more restriction enzymes. This mixture is pipetted into a well in an agarose or polyacrylamide gel and exposed to a direct electrical current. DNA, being negatively charged, migrates toward the anode; the smaller fragments move the most rapidly. After a suitable time, the DNA is denatured by exposure to mild alkali and transferred to nitrocellulose or nylon paper, in an exact replica of the pattern on the gel, by the blotting technique devised by Southern. The DNA is bound to the paper by exposure to heat, and the paper is then exposed to the labeled cDNA probe, which hybridizes to complementary fragments on the filter. After thorough washing, the paper is exposed to x-ray film, which is developed to reveal several specific bands corresponding to the DNA fragment that recognized the sequences in the cDNA probe. The RNA, or Northern, blot is conceptually similar. RNA is subjected to electrophoresis before blot transfer. This requires some different steps from those of DNA transfer, primarily to ensure that the RNA remains intact, and is generally somewhat more difficult. In the protein, or Western, blot, proteins are electrophoresed and transferred to nitrocellulose and then probed with a specific antibody or other probe molecule. (Asterisks signify labeling, either radioactive or fluorescent.) (Figure panels: Southern, DNA, cDNA* probe; Northern, RNA, cDNA* probe; Western, protein, antibody* probe. Steps: gel electrophoresis, transfer to paper, add probe, autoradiograph.)

Manual & Automatic Techniques Are Available to Determine the Sequence of DNA

The segments of specific DNA molecules obtained by recombinant DNA technology can be analyzed to determine their nucleotide sequence. This method depends upon having a large number of identical DNA molecules. This requirement can be satisfied by cloning the fragment of interest, using the techniques described above. The manual enzymatic method (Sanger) employs specific dideoxynucleotides that terminate DNA strand synthesis at specific nucleotides as the strand is synthesized on purified template nucleic acid. The reactions are adjusted so that a population of DNA fragments representing termination at every nucleotide is obtained. By having a radioactive label incorporated at the end opposite the termination site, one can separate the fragments according to size using polyacrylamide gel electrophoresis. An autoradiograph is made, and each of the fragments produces an image (band) on an x-ray film. These are read in order to give the DNA sequence (Figure 40–6). Another manual method, that of Maxam and Gilbert, employs chemical methods to cleave the DNA molecules where they contain the specific nucleotides. Techniques that do not require the use of radioisotopes are commonly employed in automated DNA sequencing. Most commonly employed is an automated procedure in which four different fluorescent labels, one representing each nucleotide, are used. Each emits a specific signal upon excitation by a laser beam, and this can be recorded by a computer.

Oligonucleotide Synthesis Is Now Routine

The automated chemical synthesis of moderately long oligonucleotides (about 100 nucleotides) of precise sequence is now a routine laboratory procedure. Each synthetic cycle takes but a few minutes, so an entire molecule can be made by synthesizing relatively short segments that can then be ligated to one another. Oligonucleotides are now indispensable for DNA sequencing, library screening, protein-DNA binding, DNA mobility shift assays, the polymerase chain reaction (see below), site-directed mutagenesis, and numerous other applications.

Figure 40–6. Sequencing of DNA by the method devised by Sanger. The ladder-like arrays represent from bottom to top all of the successively longer fragments of the original DNA strand. Knowing which specific dideoxynucleotide reaction was conducted to produce each mixture of fragments, one can determine the sequence of nucleotides from the labeled end (asterisk) toward the unlabeled end by reading up the gel. Automated sequencing involves the reading of chemically modified deoxynucleotides. The base-pairing rules of Watson and Crick (A–T, G–C) dictate the sequence of the other (complementary) strand. (Asterisks signify radiolabeling.) (Figure panels: reactions containing radiolabel and ddGTP, ddATP, ddTTP, or ddCTP; sequence of original strand –A–G–T–C–T–T–G–G–A–G–C–T–3′; slab gel electrophoresis with lanes G, A, T, C.)

The Polymerase Chain Reaction (PCR) Amplifies DNA Sequences

The polymerase chain reaction (PCR) is a method of amplifying a target sequence of DNA. PCR provides a sensitive, selective, and extremely rapid means of amplifying a desired sequence of DNA. Specificity is based on the use of two oligonucleotide primers that hybridize to complementary sequences on opposite strands of DNA and flank the target sequence (Figure 40–7). The DNA sample is first heated to separate the two strands; the primers are allowed to bind to the DNA; and each strand is copied by a DNA polymerase, starting at the primer site. The two DNA strands each serve as a template for the synthesis of new DNA from the two primers. Repeated cycles of heat denaturation, annealing of the primers to their complementary sequences, and extension of the annealed primers with DNA polymerase result in the exponential amplification of DNA segments of defined length. Early PCR reactions used an E coli DNA polymerase that was destroyed by each heat denaturation cycle. Substitution of a heat-stable DNA polymerase from Thermus aquaticus (or the corresponding DNA polymerase from other thermophilic bacteria), an organism that lives and replicates at 70–80 °C, obviates this problem and has made possible automation of the reaction, since the polymerase reactions can be run at 70 °C. This has also improved the specificity and the yield of DNA.

DNA sequences as short as 50–100 bp and as long as 10 kb can be amplified. Twenty cycles provide an amplification of 10⁶ and 30 cycles of 10⁹. The PCR allows the DNA in a single cell, hair follicle, or spermatozoon to be amplified and analyzed. Thus, the applications of PCR to forensic medicine are obvious. The PCR is also used (1) to detect infectious agents, especially latent viruses; (2) to make prenatal genetic diagnoses; (3) to detect allelic polymorphisms; (4) to establish precise tissue types for transplants; and (5) to study evolution, using DNA from archeological samples after RNA copying and mRNA quantitation by the so-called RT-PCR method (cDNA copies of mRNA generated by a retroviral reverse transcriptase).
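The amplification figures quoted for 20 and 30 cycles follow from idealized doubling of the target in every cycle. The Python sketch below shows the arithmetic under that assumption. The function name and the `efficiency` parameter are hypothetical and introduced only for illustration; real reactions fall short of perfect doubling and eventually plateau, which this sketch does not model.

```python
def pcr_fold_amplification(cycles, efficiency=1.0):
    """Fold increase in target copies after a given number of cycles.

    efficiency is a hypothetical per-cycle fraction of templates copied;
    1.0 means every template is duplicated in every cycle (ideal doubling).
    """
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    return (1.0 + efficiency) ** cycles


for n in (20, 30):
    # Ideal doubling gives 2**20 (about 10**6) and 2**30 (about 10**9).
    print(n, f"{pcr_fold_amplification(n):.2e}")
```

With ideal doubling, 20 cycles give 2²⁰ = 1,048,576-fold amplification (about 10⁶) and 30 cycles give 2³⁰ = 1,073,741,824-fold (about 10⁹), in agreement with the figures in the text.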
There are an equal number of applications of PCR to problems in basic science, and new uses are developed every year.

Figure 40–7. The polymerase chain reaction is used to amplify specific gene sequences. Double-stranded DNA is heated to separate it into individual strands. These bind two distinct primers that are directed at specific sequences on opposite strands and that define the segment to be amplified. DNA polymerase extends the primers in each direction and synthesizes two strands complementary to the original two. This cycle is repeated several times, giving an amplified product of defined length and sequence. Note that the two primers are present in excess. (Figure panels: targeted sequence; start; cycle 1; cycle 2; cycle 3; cycles 4–n.)

PRACTICAL APPLICATIONS OF RECOMBINANT DNA TECHNOLOGY ARE NUMEROUS

The isolation of a specific gene from an entire genome requires a technique that will discriminate one part in a million. The identification of a regulatory region that may be only 10 bp in length requires a sensitivity of one part in 3 × 10⁸; a disease such as sickle cell anemia is caused by a single base change, or one part in 3 × 10⁹. Recombinant DNA technology is powerful enough to accomplish all these things.

Gene Mapping Localizes Specific Genes to Distinct Chromosomes

Gene localizing thus can define a map of the human genome. This is already yielding useful information in the definition of human disease. Somatic cell hybridization and in situ hybridization are two techniques used to accomplish this. In in situ hybridization, the simpler and more direct procedure, a radioactive probe is added to a metaphase spread of chromosomes on a glass slide. The exact area of hybridization is localized by layering photographic emulsion over the slide and, after exposure, lining up the grains with some histologic identification of the chromosome. Fluorescence in situ hybridization (FISH) is a very sensitive technique that is also used for this purpose. This often places the gene at a location on a given band or region on the chromosome. Some of the human genes localized using these techniques are listed in Table 40–5. This table represents only a sampling, since thousands of genes have been mapped as a result of the recent sequencing of the human genome. Once the defect is localized to a region of DNA that has the characteristic structure of a gene (Figure 40–1), a synthetic gene can be constructed and expressed in an appropriate vector and its function can be assessed; alternatively, the putative peptide, deduced from the open reading frame in the coding region, can be synthesized. Antibodies directed against this peptide can be used to assess whether this peptide is expressed in normal persons and whether it is absent in those with the genetic syndrome.

Table 40–5. Localization of human genes.¹

Gene                                             Chromosome    Disease
Insulin                                          11p15
Prolactin                                        6p23-q12
Growth hormone                                   17q21-qter    Growth hormone deficiency
α-Globin                                         16p12-pter    α-Thalassemia
β-Globin                                         11p12         β-Thalassemia, sickle cell
Adenosine deaminase                              20q13-qter    Adenosine deaminase deficiency
Phenylalanine hydroxylase                        12q24         Phenylketonuria
Hypoxanthine-guanine phosphoribosyltransferase   Xq26-q27      Lesch-Nyhan syndrome
DNA segment G8                                   4p            Huntington’s chorea

¹ This table indicates the chromosomal location of several genes and the diseases associated with deficient or abnormal production of the gene products. The chromosome involved is indicated by the first number or letter. The other numbers and letters refer to precise localizations, as defined in McKusick VA: Mendelian Inheritance in Man, 6th ed. Johns Hopkins Univ Press, 1983.

Proteins Can Be Produced for Research & Diagnosis

A practical goal of recombinant DNA research is the production of materials for biomedical applications. This technology has two distinct merits: (1) It can supply large amounts of material that could not be obtained by conventional purification methods (eg, interferon, tissue plasminogen activating factor). (2) It can provide human material (eg, insulin, growth hormone). The advantages in both cases are obvious. Although the primary aim is to supply products (generally proteins) for treatment (insulin) and diagnosis (AIDS testing) of human and other animal diseases and for disease prevention (hepatitis B vaccine), there are other potential commercial applications, especially in agriculture. An example of the latter is the attempt to engineer plants that are more resistant to drought or temperature extremes, more efficient at fixing nitrogen, or that produce seeds containing the complete complement of essential amino acids (rice, wheat, corn, etc).

Recombinant DNA Technology Is Used in the Molecular Analysis of Disease

A. NORMAL GENE VARIATIONS

There is a normal variation of DNA sequence just as is true of more obvious aspects of human structure. Variations of DNA sequence, polymorphisms, occur approximately once in every 500 nucleotides, or about 10⁷ times per genome. There are without doubt deletions and insertions of DNA as well as single-base substitutions. In healthy people, these alterations obviously occur in noncoding regions of DNA or at sites that cause no change in function of the encoded protein. This heritable polymorphism of DNA structure can be associated with certain diseases within a large kindred and can be used to search for the specific gene involved, as is illustrated below. It can also be used in a variety of applications in forensic medicine.

B. GENE VARIATIONS CAUSING DISEASE

Classic genetics taught that most genetic diseases were due to point mutations which resulted in an impaired protein. This may still be true, but if on reading the initial sections of this chapter one predicted that genetic disease could result from derangement of any of the steps illustrated in Figure 40–1, one would have made a proper assessment. This point is nicely illustrated by examination of the β-globin gene. This gene is located in a cluster on chromosome 11 (Figure 40–8), and an expanded version of the gene is illustrated in Figure 40–9. Defective production of β-globin results in a variety of diseases and is due to many different lesions in and around the β-globin gene (Table 40–6).

Figure 40–8. Schematic representation of the β-globin gene cluster and of the lesions in some genetic disorders. The β-globin gene is located on chromosome 11 in close association with the two γ-globin genes and the δ-globin gene. The β-gene family is arranged in the order 5′-ε-Gγ-Aγ-ψβ-δ-β-3′. The ε locus is expressed in early embryonic life (as α₂ε₂). The γ genes are expressed in fetal life, making fetal hemoglobin (HbF, α₂γ₂). Adult hemoglobin consists of HbA (α₂β₂) or HbA₂ (α₂δ₂). The ψβ is a pseudogene that has sequence homology with β but contains mutations that prevent its expression. A locus control region (LCR) located upstream (5′) from the ε gene controls the rate of transcription of the entire β-globin gene cluster. Deletions (solid bar) of the β locus cause β-thalassemia (deficiency or absence [β⁰] of β-globin). A deletion of δ and β causes hemoglobin Lepore (only hemoglobin α is present). An inversion (Aγδβ)⁰ in this region (colored bar) disrupts gene function and also results in thalassemia (type III). Each type of thalassemia tends to be found in a certain group of people, eg, the (Aγδβ)⁰ deletion inversion occurs in persons from India. Many more deletions in this region have been mapped, and each causes some type of thalassemia.

Table 40–6. Structural alterations of the β-globin gene.

Alteration        Function Affected                    Disease
Point mutations   Protein folding                      Sickle cell disease
                  Transcriptional control              β-Thalassemia
                  Frameshift and nonsense mutations    β-Thalassemia
                  RNA processing                       β-Thalassemia
Deletion          mRNA production                      β⁰-Thalassemia
                                                       Hemoglobin Lepore
Rearrangement     mRNA production                      β-Thalassemia type III

C. POINT MUTATIONS

The classic example is sickle cell disease, which is caused by mutation of a single base out of the 3 × 10⁹ in the genome, a T-to-A DNA substitution, which in turn results in an A-to-U change in the mRNA corresponding to the sixth codon of the β-globin gene. The altered codon specifies a different amino acid (valine rather than glutamic acid), and this causes a structural abnormality of the β-globin molecule. Other point mutations in and around the β-globin gene result in decreased production or, in some instances, no production of β-globin; β-thalassemia is the result of these mutations. (The thalassemias are characterized by defects in the synthesis of hemoglobin subunits, and so β-thalassemia results when there is insufficient production of β-globin.) Figure 40–9 illustrates that point mutations affecting each of the many processes involved in generating a normal mRNA (and therefore a normal protein) have been implicated as a cause of β-thalassemia.

Figure 40–9. Mutations in the β-globin gene causing β-thalassemia. The β-globin gene is shown in the 5′ to 3′ orientation. The cross-hatched areas indicate the 5′ and 3′ nontranslated regions. Reading from the 5′ to 3′ direction, the shaded areas are exons 1–3 and the clear spaces are introns 1 (I1) and 2 (I2). Mutations that affect transcription control (•) are located in the 5′ flanking-region DNA. Examples of nonsense mutations, mutations in RNA processing, and RNA cleavage mutations have been identified and are indicated. In some regions, many mutations have been found. These are indicated by the brackets.

D. DELETIONS, INSERTIONS, & REARRANGEMENTS OF DNA

Studies of bacteria, viruses, yeasts, and fruit flies show that pieces of DNA can move from one place to another within a genome. The deletion of a critical piece of DNA, the rearrangement of DNA within a gene, or the insertion of a piece of DNA within a coding or regulatory region can all cause changes in gene expression resulting in disease. Again, a molecular analysis of β-thalassemia produces numerous examples of these processes, particularly deletions, as causes of disease (Figure 40–8). The globin gene clusters seem particularly prone to this lesion. Deletions in the α-globin cluster, located on chromosome 16, cause α-thalassemia. There is a strong ethnic association for many of these deletions, so that northern Europeans, Filipinos, blacks, and Mediterranean peoples have different lesions all resulting in the absence of hemoglobin A and α-thalassemia.

A similar analysis could be made for a number of other diseases. Point mutations are usually defined by sequencing the gene in question, though occasionally, if the mutation destroys or creates a restriction enzyme site, the technique of restriction fragment analysis can be used to pinpoint the lesion. Deletions or insertions of DNA larger than 50 bp can often be detected by the Southern blotting procedure.

E. PEDIGREE ANALYSIS

Sickle cell disease again provides an excellent example of how recombinant DNA technology can be applied to the study of human disease. The substitution of A for T in the template strand of DNA in the β-globin gene (T for A in the coding strand) changes the sequence in the region that corresponds to the sixth codon from

       ↓
    C C T G A G G    Coding strand
    G G A C T C C    Template strand
             ↑

to

    C C T G T G G    Coding strand
    G G A C A C C    Template strand

and destroys a recognition site for the restriction enzyme MstII (CCTNAGG; denoted by the small vertical arrows; Table 40–2). Other MstII sites 5′ and 3′ from this site (Figure 40–10) are not affected and so will be cut. Therefore, incubation of DNA from normal (AA), heterozygous (AS), and homozygous (SS) individuals results in three different patterns on Southern blot transfer (Figure 40–10). This illustrates how a DNA pedigree can be established using the principles discussed in this chapter. Pedigree analysis has been applied to a number of genetic diseases and is most useful in those caused by deletions and insertions or the rarer instances in which a restriction endonuclease cleavage site is affected, as in the example cited in this paragraph. The analysis is facilitated by the PCR reaction, which can provide sufficient DNA for analysis from just a few nucleated red blood cells.

Figure 40–10. Pedigree analysis of sickle cell disease. The top part of the figure (A) shows the first part of the β-globin gene and the MstII restriction enzyme sites in the normal (A) and sickle cell (S) β-globin genes. Digestion with the restriction enzyme MstII results in DNA fragments 1.15 kb and 0.2 kb long in normal individuals. The T-to-A change in individuals with sickle cell disease abolishes one of the three MstII sites around the β-globin gene; hence, a single restriction fragment 1.35 kb in length is generated in response to MstII. This size difference is easily detected on a Southern blot. (The 0.2-kb fragment would run off the gel in this illustration.) (B) Pedigree analysis shows three possibilities: AA = normal (open circle); AS = heterozygous (half-solid circles, half-solid square); SS = homozygous (solid square). This approach allows for prenatal diagnosis of sickle cell disease (dash-sided square).

F. PRENATAL DIAGNOSIS

If the genetic lesion is understood and a specific probe is available, prenatal diagnosis is possible. DNA from cells collected from as little as 10 mL of amniotic fluid (or by chorionic villus biopsy) can be analyzed by Southern blot transfer. A fetus with the restriction pattern AA in Figure 40–10 does not have sickle cell disease, nor is it a carrier. A fetus with the SS pattern will develop the disease. Probes are now available for this type of analysis of many genetic diseases.

G. RESTRICTION FRAGMENT LENGTH POLYMORPHISM (RFLP)

The differences in DNA sequence cited above can result in variations of restriction sites and thus in the length of restriction fragments. An inherited difference in the pattern of restriction (eg, a DNA variation occurring in more than 1% of the general population) is known as a restriction fragment length polymorphism, or RFLP. An extensive RFLP map of the human genome has been constructed. This is proving useful in the human genome sequencing project and is an important component of the effort to understand various single-gene and multigenic diseases. RFLPs result from single-base changes (eg, sickle cell disease) or from deletions or insertions of DNA into a restriction fragment (eg, the thalassemias) and have proved to be useful diagnostic tools. They have been found at known gene loci and in sequences that have no known function; thus, RFLPs may disrupt the function of the gene or may have no biologic consequences.

RFLPs are inherited, and they segregate in a mendelian fashion. A major use of RFLPs (thousands are now known) is in the definition of inherited diseases in which the functional deficit is unknown. RFLPs can be used to establish linkage groups, which in turn, by the process of chromosome walking, will eventually define the disease locus. In chromosome walking (Figure 40–11), a fragment representing one end of a long piece of DNA is used to isolate another that overlaps but extends the first. The direction of extension is determined by restriction mapping, and the procedure is repeated sequentially until the desired sequence is obtained. The X chromosome-linked disorders are particularly amenable to this approach, since only a single allele is expressed. Hence, 20% of the defined RFLPs are on the X chromosome, and a reasonably complete linkage map of this chromosome exists. The gene for the X-linked disorder, Duchenne-type muscular dystrophy, was found using RFLPs. Likewise, the defect in Huntington’s disease was localized to the terminal region of the short arm of chromosome 4, and the defect that causes polycystic kidney disease is linked to the α-globin locus on chromosome 16.

Figure 40–11. The technique of chromosome walking. Gene X is to be isolated from a large piece of DNA. The exact location of this gene is not known, but a probe (*) directed against a fragment of DNA (shown at the 5′ end in this representation) is available, as is a library containing a series of overlapping DNA fragments. For the sake of simplicity, only five of these are shown. The initial probe will hybridize only with clones containing fragment 1, which can then be isolated and used as a probe to detect fragment 2. This procedure is repeated until fragment 4 hybridizes with fragment 5, which contains the entire sequence of gene X.

H. MICROSATELLITE DNA POLYMORPHISMS

Short (2–6 bp), inherited, tandem repeat units of DNA occur about 50,000–100,000 times in the human genome (Chapter 36). Because they occur more frequently, and in view of the routine application of sensitive PCR methods, they are replacing RFLPs as the marker loci for various genome searches.

I. RFLPS & VNTRS IN FORENSIC MEDICINE

Variable numbers of tandemly repeated (VNTR) units are one common type of “insertion” that results in an RFLP. The VNTRs can be inherited, in which case they are useful in establishing genetic association with a disease in a family or kindred; or they can be unique to an individual and thus serve as a molecular fingerprint of that person.

J. GENE THERAPY

Diseases caused by deficiency of a gene product (Table 40–5) are amenable to replacement therapy. The strategy is to clone a gene (eg, the gene that codes for adenosine deaminase) into a vector that will readily be taken up and incorporated into the genome of a host cell. Bone marrow precursor cells are being investigated for this purpose because they presumably will resettle in the marrow and replicate there. The introduced gene would begin to direct the expression of its protein product, and this would correct the deficiency in the host cell.

K. TRANSGENIC ANIMALS

The somatic cell gene replacement described above would obviously not be passed on to offspring. Other strategies to alter germ cell lines have been devised but have been tested only in experimental animals. A certain percentage of genes injected into a fertilized mouse ovum will be incorporated into the genome and found in both somatic and germ cells.

square centimeters. By coupling such DNA microarrays with highly sensitive detection of hybridized fluores-
Hundreds of transgenic animals cently labeled nucleic acid probes derived from mRNA, have been established, and these are useful for analysis of investigators can rapidly and accurately generate profiles tissue-specific effects on gene expression and effects of of gene expression (eg, specific cellular mRNA content) overproduction of gene products (eg, those from the from cell and tissue samples as small as 1 gram or less. growth hormone gene or oncogenes) and in discovering Thus entire transcriptome information (the entire col- genes involved in development—a process that hereto- lection of cellular mRNAs) for such cell or tissue sources fore has been difficult to study. The transgenic approach can readily be obtained in only a few days. Transcrip- has recently been used to correct a genetic deficiency in tome information allows one to predict the collection of mice. Fertilized ova obtained from mice with genetic hy- proteins that might be expressed in a particular cell, tis- pogonadism were injected with DNA containing the sue, or organ in normal and disease states based upon the coding sequence for the gonadotropin-releasing hormone mRNAs present in those cells. Complementing this high- (GnRH) precursor protein. This gene was expressed and throughput, transcript-profiling method is the recent de- regulated normally in the hypothalamus of a certain velopment of high-sensitivity, high-throughput mass number of the resultant mice, and these animals were in spectrometry of complex protein samples. Newer mass all respects normal. Their offspring also showed no evi- spectrometry methods allow one to identify hundreds to dence of GnRH deficiency. This is, therefore, evidence thousands of proteins in proteins extracted from very of somatic cell expression of the transgene and of its small numbers of cells (< 1 g). This critical information maintenance in germ cells. 
tells investigators which of the many mRNAs detected in transcript microarray mapping studies are actually trans- Targeted Gene Disruption or Knockout lated into protein, generally the ultimate dictator of phe- notype. Microarray techniques and mass spectrometric In transgenic animals, one is adding one or more copies protein identification experiments both lead to the gen- of a gene to the genome, and there is no way to control eration of huge amounts of data. Appropriate data man- where that gene eventually resides. A complementary— agement and interpretation of the deluge of information and much more difficult—approach involves the selec- forthcoming from such studies has relied upon statistical tive removal of a gene from the genome. Gene knock- methods; and this new technology, coupled with the out animals (usually mice) are made by creating a flood of DNA sequence information, has led to the de- mutation that totally disrupts the function of a gene. velopment of the field of bioinformatics, a new disci- This is then used to replace one of the two genes in an pline whose goal is to help manage, analyze, and inte- embryonic stem cell that can be used to create a het- grate this flood of biologically important information. erozygous transgenic animal. The mating of two such Future work at the intersection of bioinformatics and animals will, by mendelian genetics, result in a ho- transcript-protein profiling will revolutionize our under- mozygous mutation in 25% of offspring. Several hun- standing of biology and medicine. dred strains of mice with knockouts of specific genes have been developed. SUMMARY RNA Transcript & Protein Profiling • A variety of very sensitive techniques can now be ap- plied to the isolation and characterization of genes The “-omic” revolution of the last several years has cul- and to the quantitation of gene products. 
minated in the determination of the nucleotide se- quences of entire genomes, including those of budding • In DNA cloning, a particular segment of DNA is re- and fission yeasts, various bacteria, the fruit fly, the worm moved from its normal environment using one of Caenorhabditis elegans, the mouse and, most notably, hu- many restriction endonucleases. This is then ligated mans. Additional genomes are being sequenced at an ac- into one of several vectors in which the DNA seg- celerating pace. The availability of all of this DNA se- ment can be amplified and produced in abundance. quence information, coupled with engineering advances, • The cloned DNA can be used as a probe in one of has lead to the development of several revolutionary several types of hybridization reactions to detect methodologies, most of which are based upon high-den- other related or adjacent pieces of DNA, or it can be sity microarray technology. We now have the ability to used to quantitate gene products such as mRNA. deposit thousands of specific, known, definable DNA se- • Manipulation of the DNA to change its structure, so- quences (more typically now synthetic oligonucleotides) called genetic engineering, is a key element in cloning on a glass microscope-style slide in the space of a few (eg, the construction of chimeric molecules) and can MOLECULAR GENETICS, RECOMBINANT DNA, & GENOMIC TECHNOLOGY / 413 also be used to study the function of a certain frag- quences of a single strand of DNA or RNA. ment of DNA and to analyze how genes are regulated. Hybridization: The specific reassociation of com- • Chimeric DNA molecules are introduced into cells plementary strands of nucleic acids (DNA with to make transfected cells or into the fertilized oocyte DNA, DNA with RNA, or RNA with RNA). to make transgenic animals. 
Insert: An additional length of base pairs in DNA, • Techniques involving cloned DNA are used to locate generally introduced by the techniques of recom- genes to specific regions of chromosomes, to identify binant DNA technology. the genes responsible for diseases, to study how faulty Intron: The sequence of a gene that is transcribed gene regulation causes disease, to diagnose genetic but excised before translation. diseases, and increasingly to treat genetic diseases. Library: A collection of cloned fragments that rep- resents the entire genome. Libraries may be either genomic DNA (in which both introns and exons GLOSSARY are represented) or cDNA (in which only exons ARS: Autonomously replicating sequence; the ori- are represented). gin of replication in yeast. Ligation: The enzyme-catalyzed joining in phos- Autoradiography: The detection of radioactive phodiester linkage of two stretches of DNA or molecules (eg, DNA, RNA, protein) by visualiza- RNA into one; the respective enzymes are DNA tion of their effects on photographic film. and RNA ligases. Bacteriophage: A virus that infects a bacterium. Lines: Long interspersed repeat sequences. Blunt-ended DNA: Two strands of a DNA duplex Microsatellite polymorphism: Heterozygosity of a having ends that are flush with each other. certain microsatellite repeat in an individual. cDNA: A single-stranded DNA molecule that is Microsatellite repeat sequences: Dispersed or complementary to an mRNA molecule and is syn- group repeat sequences of 2–5 bp repeated up to thesized from it by the action of reverse transcrip- 50 times. May occur at 50–100 thousand loca- tase. tions in the genome. Chimeric molecule: A molecule (eg, DNA, RNA, Nick translation: A technique for labeling DNA protein) containing sequences derived from two based on the ability of the DNA polymerase from different species. 
E coli to degrade a strand of DNA that has been Clone: A large number of organisms, cells or mole- nicked and then to resynthesize the strand; if a ra- cules that are identical with a single parental or- dioactive nucleoside triphosphate is employed, the ganism cell or molecule. rebuilt strand becomes labeled and can be used as Cosmid: A plasmid into which the DNA sequences a radioactive probe. from bacteriophage lambda that are necessary for Northern blot: A method for transferring RNA the packaging of DNA (cos sites) have been in- from an agarose gel to a nitrocellulose filter, on serted; this permits the plasmid DNA to be pack- which the RNA can be detected by a suitable aged in vitro. probe. Endonuclease: An enzyme that cleaves internal Oligonucleotide: A short, defined sequence of nu- bonds in DNA or RNA. cleotides joined together in the typical phosphodi- Excinuclease: The excision nuclease involved in nu- ester linkage. cleotide exchange repair of DNA. Ori: The origin of DNA replication. Exon: The sequence of a gene that is represented PAC: A high capacity (70–95 kb) cloning vector (expressed) as mRNA. based upon the lytic E. coli bacteriophage P1 that Exonuclease: An enzyme that cleaves nucleotides replicates in bacteria as an extrachromosomal ele- from either the 3′ or 5′ ends of DNA or RNA. ment. Fingerprinting: The use of RFLPs or repeat se- Palindrome: A sequence of duplex DNA that is the quence DNA to establish a unique pattern of same when the two strands are read in opposite di- DNA fragments for an individual. rections. Footprinting: DNA with protein bound is resistant Plasmid: A small, extrachromosomal, circular mole- to digestion by DNase enzymes. When a sequenc- cule of DNA that replicates independently of the ing reaction is performed using such DNA, a pro- host DNA. tected area, representing the “footprint” of the Polymerase chain reaction (PCR): An enzymatic bound protein, will be detected. 
method for the repeated copying (and thus ampli- Hairpin: A double-helical stretch formed by base fication) of the two strands of DNA that make up pairing between neighboring complementary se- a particular gene sequence. 414 / CHAPTER 40 Primosome: The mobile complex of helicase and Spliceosome: The macromolecular complex respon- primase that is involved in DNA replication. sible for precursor mRNA splicing. The spliceo- Probe: A molecule used to detect the presence of a some consists of at least five small nuclear RNAs specific fragment of DNA or RNA in, for in- (snRNA; U1, U2, U4, U5, and U6) and many stance, a bacterial colony that is formed from a ge- proteins. netic library or during analysis by blot transfer Splicing: The removal of introns from RNA ac- techniques; common probes are cDNA molecules, companied by the joining of its exons. synthetic oligodeoxynucleotides of defined se- Sticky-ended DNA: Complementary single strands quence, or antibodies to specific proteins. of DNA that protrude from opposite ends of a Proteome: The entire collection of expressed pro- DNA duplex or from the ends of different duplex teins in an organism. molecules (see also Blunt-ended DNA, above). Pseudogene: An inactive segment of DNA arising Tandem: Used to describe multiple copies of the by mutation of a parental active gene. same sequence (eg, DNA) that lie adjacent to one Recombinant DNA: The altered DNA that results another. from the insertion of a sequence of deoxynu- Terminal transferase: An enzyme that adds nu- cleotides not previously present into an existing cleotides of one type (eg, deoxyadenonucleotidyl molecule of DNA by enzymatic or chemical residues) to the 3′ end of DNA strands. means. Transcription: Template DNA-directed synthesis Restriction enzyme: An endodeoxynuclease that of nucleic acids; typically DNA-directed synthesis causes cleavage of both strands of DNA at highly of RNA. specific sites dictated by the base sequence. 
Transcriptome: The entire collection of expressed Reverse transcription: RNA-directed synthesis of mRNAs in an organism. DNA, catalyzed by reverse transcriptase. Transgenic: Describing the introduction of new RT-PCR: A method used to quantitate mRNA lev- DNA into germ cells by its injection into the nu- els that relies upon a first step of cDNA copying of cleus of the ovum. mRNAs prior to PCR amplification and quantita- Translation: Synthesis of protein using mRNA as tion. template. Signal: The end product observed when a specific Vector: A plasmid or bacteriophage into which for- sequence of DNA or RNA is detected by autoradi- eign DNA can be introduced for the purposes of ography or some other method. Hybridization cloning. with a complementary radioactive polynucleotide Western blot: A method for transferring protein to (eg, by Southern or Northern blotting) is com- a nitrocellulose filter, on which the protein can be monly used to generate the signal. detected by a suitable probe (eg, an antibody). Sines: Short interspersed repeat sequences. SNP: Single nucleotide polymorphism. Refers to the fact that single nucleotide genetic variation in REFERENCES genome sequence exists at discrete loci throughout Lewin B: Genes VII. Oxford Univ Press, 1999. the chromosomes. Measurement of allelic SNP Martin JB, Gusella JF: Huntington’s disease: pathogenesis and differences is useful for gene mapping studies. management. N Engl J Med 1986:315:1267. snRNA: Small nuclear RNA. This family of RNAs Sambrook J, Fritsch EF, Maniatis T: Molecular Cloning: A Labora- is best known for its role in mRNA processing. tory Manual. Cold Spring Harbor Laboratory Press, 1989. Southern blot: A method for transferring DNA Spector DL, Goldman RD, Leinwand LA: Cells: A Laboratory from an agarose gel to nitrocellulose filter, on Manual. Cold Spring Harbor Laboratory Press, 1998. which the DNA can be detected by a suitable Watson JD et al: Recombinant DNA, 2nd ed. 
Scientific American probe (eg, complementary DNA or RNA). Books. Freeman, 1992. Southwestern blot: A method for detecting pro- Weatherall DJ: The New Genetics and Clinical Practice, 3rd ed. Ox- tein-DNA interactions by applying a labeled DNA ford Univ Press, 1991. probe to a transfer membrane that contains a rena- tured protein. Intracellular Traffic & Sorting of Proteins 46 Robert K. Murray, MD, PhD BIOMEDICAL IMPORTANCE the signal peptide are given below. Proteins synthesized on free polyribosomes lack this particular signal pep- Proteins must travel from polyribosomes to many dif- tide and are delivered into the cytosol. There they are ferent sites in the cell to perform their particular func- directed to mitochondria, nuclei, and peroxisomes by tions. Some are destined to be components of specific specific signals—or remain in the cytosol if they lack a organelles, others for the cytosol or for export, and yet signal. Any protein that contains a targeting sequence others will be located in the various cellular mem- that is subsequently removed is designated as a prepro- branes. Thus, there is considerable intracellular traffic tein. In some cases a second peptide is also removed, of proteins. Many studies have shown that the Golgi and in that event the original protein is known as a pre- apparatus plays a major role in the sorting of proteins proprotein (eg, preproalbumin; Chapter 50). for their correct destinations. A major insight was the Proteins synthesized and sorted in the rough ER recognition that for proteins to attain their proper loca- branch (Figure 46–2) include many destined for vari- tions, they generally contain information (a signal or ous membranes (eg, of the ER, Golgi apparatus, lyso- coding sequence) that targets them appropriately. Once somes, and plasma membrane) and for secretion. Lyso- a number of the signals were defined, it became appar- somal enzymes are also included. 
Thus, such proteins ent that certain diseases result from mutations that af- may reside in the membranes or lumens of the ER or fect these signals. In this chapter we discuss the intracel- follow the major transport route of intracellular pro- lular traffic of proteins and their sorting and briefly teins to the Golgi apparatus. Further signal-mediated consider some of the disorders that result when abnor- sorting of certain proteins occurs in the Golgi appara- malities occur. tus, resulting in delivery to lysosomes, membranes of the Golgi apparatus, and other sites. Proteins destined MANY PROTEINS ARE TARGETED for the plasma membrane or for secretion pass through BY SIGNAL SEQUENCES TO THEIR the Golgi apparatus but generally are not thought to carry specific sorting signals; they are believed to reach CORRECT DESTINATIONS their destinations by default. The protein biosynthetic pathways in cells can be con- The entire pathway of ER → Golgi apparatus → sidered to be one large sorting system. Many proteins plasma membrane is often called the secretory or exo- carry signals (usually but not always specific sequences cytotic pathway. Events along this route will be given of amino acids) that direct them to their destination, special attention. Most of the proteins reaching the thus ensuring that they will end up in the appropriate Golgi apparatus or the plasma membrane are carried in membrane or cell compartment; these signals are a fun- transport vesicles; a brief description of the formation damental component of the sorting system. Usually the of these important particles will be given subsequently. signal sequences are recognized and interact with com- Other proteins destined for secretion are carried in se- plementary areas of proteins that serve as receptors for cretory vesicles (Figure 46–2). These are prominent in the proteins that contain them. the pancreas and certain other glands. 
Their mobiliza- A major sorting decision is made early in protein tion and discharge are regulated and often referred to as biosynthesis, when specific proteins are synthesized ei- “regulated secretion,” whereas the secretory pathway ther on free or on membrane-bound polyribosomes. involving transport vesicles is called “constitutive.” This results in two sorting branches called the cytosolic Experimental approaches that have afforded major branch and the rough endoplasmic reticulum (RER) insights to the processes described in this chapter in- branch (Figure 46–1). This sorting occurs because pro- clude (1) use of yeast mutants; (2) application of re- teins synthesized on membrane-bound polyribosomes combinant DNA techniques (eg, mutating or eliminat- contain a signal peptide that mediates their attach- ing particular sequences in proteins, or fusing new ment to the membrane of the ER. Further details on sequences onto them; and (3) development of in vitro 498 INTRACELLULAR TRAFFIC & SORTING OF PROTEINS / 499 Proteins about 20–80 amino acids in length, which is not highly Mitochondrial conserved but contains many positively charged amino Nuclear acids (eg, Lys or Arg). The presequence is equivalent to (1) Cytosolic Peroxisomal a signal peptide mediating attachment of polyribosomes to membranes of the ER (see below), but in this in- Cytosolic stance targeting proteins to the matrix; if the leader se- Polyribosomes quence is cleaved off, potential matrix proteins will not ER membrane reach their destination. GA membrane Translocation is believed to occur posttranslation- (2) Rough ER Plasma membrane ally, after the matrix proteins are released from the cy- Secretory tosolic polyribosomes. Interactions with a number of Lysosomal enzymes cytosolic proteins that act as chaperones (see below) and as targeting factors occur prior to translocation. Figure 46–1. 
Diagrammatic representation of the Two distinct translocation complexes are situated two branches of protein sorting occurring by synthesis in the outer and inner mitochondrial membranes, re- on (1) cytosolic and (2) membrane-bound polyribo- ferred to (respectively) as TOM (translocase-of-the- somes. The mitochondrial proteins listed are encoded outer membrane) and TIM (translocase-of-the-inner by nuclear genes. Some of the signals used in further membrane). Each complex has been analyzed and sorting of these proteins are listed in Table 46–4. (ER, found to be composed of a number of proteins, some of endoplasmic reticulum; GA, Golgi apparatus.) which act as receptors for the incoming proteins and others as components of the transmembrane pores through which these proteins must pass. Proteins must be in the unfolded state to pass through the com- systems (eg, to study translocation in the ER and mech- plexes, and this is made possible by ATP-dependent anisms of vesicle formation). binding to several chaperone proteins. The roles of The sorting of proteins belonging to the cytosolic chaperone proteins in protein folding are discussed later branch referred to above is described next, starting with in this chapter. In mitochondria, they are involved in mitochondrial proteins. translocation, sorting, folding, assembly, and degrada- tion of imported proteins. A proton-motive force across the inner membrane is required for import; it is THE MITOCHONDRION BOTH IMPORTS made up of the electric potential across the membrane (inside negative) and the pH gradient (see Chapter & SYNTHESIZES PROTEINS 12). The positively charged leader sequence may be Mitochondria contain many proteins. Thirteen pro- helped through the membrane by the negative charge teins (mostly membrane components of the electron in the matrix. The presequence is split off in the matrix transport chain) are encoded by the mitochondrial by a matrix-processing peptidase (MPP). 
Contact genome and synthesized in that organelle using its own with other chaperones present in the matrix is essential protein-synthesizing system. However, the majority (at to complete the overall process of import. Interaction least several hundred) are encoded by nuclear genes, with mt-Hsp70 (Hsp = heat shock protein) ensures are synthesized outside the mitochondria on cytosolic proper import into the matrix and prevents misfolding polyribosomes, and must be imported. Yeast cells have or aggregation, while interaction with the mt-Hsp60- proved to be a particularly useful system for analyzing Hsp10 system ensures proper folding. The latter pro- the mechanisms of import of mitochondrial proteins, teins resemble the bacterial GroEL chaperonins, a sub- partly because it has proved possible to generate a vari- class of chaperones that form complex cage-like ety of mutants that have illuminated the fundamental assemblies made up of heptameric ring structures. The processes involved. Most progress has been made in the interactions of imported proteins with the above chap- study of proteins present in the mitochondrial matrix, erones require hydrolysis of ATP to drive them. such as the F1 ATPase subunits. Only the pathway of The details of how preproteins are translocated have import of matrix proteins will be discussed in any detail not been fully elucidated. It is possible that the electric here. potential associated with the inner mitochondrial mem- Matrix proteins must pass from cytosolic polyribo- brane causes a conformational change in the unfolded somes through the outer and inner mitochondrial preprotein being translocated and that this helps to pull membranes to reach their destination. Passage through it across. Furthermore, the fact that the matrix is more the two membranes is called translocation. 
They have negative than the intermembrane space may “attract” an amino terminal leader sequence (presequence), the positively charged amino terminal of the preprotein Plasma memb rane Cytosol Early Secretory endosome Constitutive storage (excretory) granule transport Prelysosome vesicle (or late endosome) Lysosome TGN Golgi trans apparatus medial cis N CG Endoplasmic reticulum Nuclear envelope Figure 46–2. Diagrammatic representation of the rough endoplasmic reticu- lum branch of protein sorting. Newly synthesized proteins are inserted into the ER membrane or lumen from membrane-bound polyribosomes (small black cir- cles studding the cytosolic face of the ER). Those proteins that are transported out of the ER (indicated by solid black arrows) do so from ribosome-free transi- tional elements. Such proteins may then pass through the various subcompart- ments of the Golgi until they reach the TGN, the exit side of the Golgi. In the TGN, proteins are segregated and sorted. Secretory proteins accumulate in secretory storage granules from which they may be expelled as shown in the upper right- hand side of the figure. Proteins destined for the plasma membrane or those that are secreted in a constitutive manner are carried out to the cell surface in trans- port vesicles, as indicated in the upper middle area of the figure. Some proteins may reach the cell surface via late and early endosomes. Other proteins enter prelysosomes (late endosomes) and are selectively transferred to lysosomes. The endocytic pathway illustrated in the upper left-hand area of the figure is consid- ered elsewhere in this chapter. Retrieval from the Golgi apparatus to the ER is not considered in this scheme. (CGN, cis-Golgi network; TGN, trans-Golgi network.) (Courtesy of E Degen.) 500 INTRACELLULAR TRAFFIC & SORTING OF PROTEINS / 501 to enter the matrix. 
Close contact between the membrane sites in the outer and inner membranes involved in translocation is necessary.

The above describes the major pathway of proteins destined for the mitochondrial matrix. However, certain proteins insert into the outer mitochondrial membrane, facilitated by the TOM complex. Others stop in the intermembrane space, and some insert into the inner membrane. Yet others proceed into the matrix and then return to the inner membrane or intermembrane space. A number of proteins contain two signaling sequences: one to enter the mitochondrial matrix and the other to mediate subsequent relocation (eg, into the inner membrane). Certain mitochondrial proteins do not contain presequences (eg, cytochrome c, which locates in the intermembrane space), and others contain internal presequences. Overall, proteins employ a variety of mechanisms and routes to attain their final destinations in mitochondria.

General features that apply to the import of proteins into organelles, including mitochondria and some of the other organelles to be discussed below, are summarized in Table 46–1.

Table 46–1. Some general features of protein import to organelles.¹

• Import of a protein into an organelle usually occurs in three stages: recognition, translocation, and maturation.
• Targeting sequences on the protein are recognized in the cytoplasm or on the surface of the organelle.
• The protein is unfolded for translocation, a state maintained in the cytoplasm by chaperones.
• Threading of the protein through a membrane requires energy and organellar chaperones on the trans side of the membrane.
• Cycles of binding and release of the protein to the chaperone result in pulling of its polypeptide chain through the membrane.
• Other proteins within the organelle catalyze folding of the protein, often attaching cofactors or oligosaccharides and assembling them into active monomers or oligomers.

¹ Data from McNew JA, Goodman JM: The targeting and assembly of peroxisomal proteins: some old rules do not apply. Trends Biochem Sci 1998;21:54.

IMPORTINS & EXPORTINS ARE INVOLVED IN TRANSPORT OF MACROMOLECULES IN & OUT OF THE NUCLEUS

It has been estimated that more than a million macromolecules per minute are transported between the nucleus and the cytoplasm in an active eukaryotic cell. These macromolecules include histones, ribosomal proteins and ribosomal subunits, transcription factors, and mRNA molecules. The transport is bidirectional and occurs through the nuclear pore complexes (NPCs). These are complex structures with a mass approximately 30 times that of a ribosome and are composed of about 100 different proteins. The diameter of an NPC is approximately 9 nm but can increase up to approximately 28 nm. Molecules smaller than about 40 kDa can pass through the channel of the NPC by diffusion, but special translocation mechanisms exist for larger molecules. These mechanisms are under intensive investigation, but some important features have already emerged.

Here we shall mainly describe nuclear import of certain macromolecules. The general picture that has emerged is that proteins to be imported (cargo molecules) carry a nuclear localization signal (NLS). One example of an NLS is the amino acid sequence (Pro)2-(Lys)4-Ala-Lys-Val, which is markedly rich in basic lysine residues. Depending on which NLS it contains, a cargo molecule interacts with one of a family of soluble proteins called importins, and the complex docks at the NPC. Another family of proteins called Ran plays a critical regulatory role in the interaction of the complex with the NPC and in its translocation through the NPC. Ran proteins are small monomeric nuclear GTPases and, like other GTPases, exist in either GTP-bound or GDP-bound states. They are themselves regulated by guanine nucleotide exchange factors (GEFs; eg, the protein RCC1 in eukaryotes), which are located in the nucleus, and Ran GTPase-activating proteins (GAPs), which are predominantly cytoplasmic. The GTP-bound state of Ran is favored in the nucleus and the GDP-bound state in the cytoplasm. The conformations and activities of Ran molecules vary depending on whether GTP or GDP is bound to them (the GTP-bound state is active; see discussion of G proteins in Chapter 43). The asymmetry between nucleus and cytoplasm, with respect to which of these two nucleotides is bound to Ran molecules, is thought to be crucial in understanding the roles of Ran in transferring complexes unidirectionally across the NPC. When cargo molecules are released inside the nucleus, the importins recirculate to the cytoplasm to be used again. Figure 46–3 summarizes some of the principal features in the above process.

Other small monomeric GTPases (eg, ARF, Rab, Ras, and Rho) are important in various cellular processes such as vesicle formation and transport (ARF and Rab; see below), certain growth and differentiation processes (Ras), and formation of the actin cytoskeleton (Rho). A process involving GTP and GDP is also crucial in the transport of proteins across the membrane of the ER (see below).

Figure 46–3. Schematic representation of the proposed role of Ran in the import of cargo carrying an NLS signal. (1) The targeting complex forms when the NLS receptor (α, an importin) binds NLS cargo and the docking factor (β).
(2) Docking occurs at filamentous sites that protrude from the NPC. Ran-GDP docks independently. (3) Transfer to the translocation channel is triggered when a RanGEF converts Ran-GDP to Ran-GTP. (4) The NPC catalyzes translocation of the targeting complex. (5) Ran-GTP is recycled to Ran-GDP by docked RanGAP. (6) Ran-GTP disrupts the targeting complex by binding to a site on β that overlaps with a binding site. (7) NLS cargo dissociates from α, and Ran-GTP may dissociate from β. (8) α and β factors are recycled to the cytoplasm. Inset: The Ran translocation switch is off in the cytoplasm and on in the nucleus. Ran-GTP promotes NLS- and NES-directed translocation. However, cytoplasmic Ran is enriched in Ran-GDP (OFF) by an active RanGAP, and nuclear pools are enriched in Ran-GTP (ON) by an active GEF. RanBP1 promotes the contrary activities of these two factors. Direct linkage of nuclear and cytoplasmic pools of Ran occurs through the NPC by an unknown shuttling mechanism. Pi, inorganic phosphate; NLS, nuclear localization signal; NPC, nuclear pore complex; GEF, guanine nucleotide exchange factor; GAP, GTPase-activating protein; NES, nuclear export signal; BP, binding protein. (Reprinted, with permission, from Goldfarb DS: Whose finger is on the switch? Science 1997;276:1814.)

Proteins similar to importins, referred to as exportins, are involved in export of many macromolecules from the nucleus. Cargo molecules for export carry nuclear export signals (NESs). Ran proteins are involved in this process also, and it is now established that the processes of import and export share a number of common features.

MOST CASES OF ZELLWEGER SYNDROME ARE DUE TO MUTATIONS IN GENES INVOLVED IN THE BIOGENESIS OF PEROXISOMES

The peroxisome is an important organelle involved in aspects of the metabolism of many molecules, including fatty acids and other lipids (eg, plasmalogens, cholesterol, bile acids), purines, amino acids, and hydrogen peroxide. The peroxisome is bounded by a single membrane and contains more than 50 enzymes; catalase and urate oxidase are marker enzymes for this organelle. Its proteins are synthesized on cytosolic polyribosomes and fold prior to import. The pathways of import of a number of its proteins and enzymes have been studied, some being matrix components and others membrane components. At least two peroxisomal-matrix targeting sequences (PTSs) have been discovered. One, PTS1, is a tripeptide (ie, Ser-Lys-Leu [SKL], but variations of this sequence have been detected) located at the carboxyl terminal of a number of matrix proteins, including catalase. Another, PTS2, consisting of about 26–36 amino acids, has been found in at least four matrix proteins (eg, thiolase) and, unlike PTS1, is cleaved after entry into the matrix. Proteins containing PTS1 sequences form complexes with a soluble receptor protein (PTS1R), and proteins containing PTS2 sequences complex with another, PTS2R. The resulting complexes then interact with a membrane receptor, Pex14p. Proteins involved in further transport of proteins into the matrix are also present. Most peroxisomal membrane proteins have been found to contain neither of the above two targeting sequences, but apparently contain others. The import system can handle intact oligomers (eg, tetrameric catalase). Import of matrix proteins requires ATP, whereas import of membrane proteins does not.

Interest in import of proteins into peroxisomes has been stimulated by studies on Zellweger syndrome. This condition is apparent at birth and is characterized by profound neurologic impairment, with patients often dying within a year. The number of peroxisomes can vary from being almost normal to being virtually absent in some patients. Biochemical findings include an accumulation of very long chain fatty acids, abnormalities of the synthesis of bile acids, and a marked reduction of plasmalogens. The condition is believed to be due to mutations in genes encoding certain proteins (so-called peroxins) involved in various steps of peroxisome biogenesis (such as the import of proteins described above), or in genes encoding certain peroxisomal enzymes themselves. Two closely related conditions are neonatal adrenoleukodystrophy and infantile Refsum disease. Zellweger syndrome and these two conditions represent a spectrum of overlapping features, with Zellweger syndrome being the most severe (many proteins affected) and infantile Refsum disease the least severe (only one or a few proteins affected). Table 46–2 lists some features of these and related conditions.

Table 46–2. Disorders due to peroxisomal abnormalities.¹

Disorder                                   MIM Number²
Zellweger syndrome                         214100
Neonatal adrenoleukodystrophy              202370
Infantile Refsum disease                   266510
Hyperpipecolic acidemia                    239400
Rhizomelic chondrodysplasia punctata       215100
Adrenoleukodystrophy                       300100
Pseudo-neonatal adrenoleukodystrophy       264470
Pseudo-Zellweger syndrome                  261510
Hyperoxaluria type 1                       259900
Acatalasemia                               115500
Glutaryl-CoA oxidase deficiency            231690

¹ Reproduced, with permission, from Seashore MR, Wappner RS: Genetics in Primary Care & Clinical Medicine. Appleton & Lange, 1996.
² MIM = Mendelian Inheritance in Man. Each number specifies a reference in which information regarding each of the above conditions can be found.

THE SIGNAL HYPOTHESIS EXPLAINS HOW POLYRIBOSOMES BIND TO THE ENDOPLASMIC RETICULUM

As indicated above, the rough ER branch is the second of the two branches involved in the synthesis and sorting of proteins. In this branch, proteins are synthesized on membrane-bound polyribosomes and translocated into the lumen of the rough ER prior to further sorting (Figure 46–2).

The signal hypothesis was proposed by Blobel and Sabatini partly to explain the distinction between free and membrane-bound polyribosomes. They found that proteins synthesized on membrane-bound polyribosomes contained a peptide extension (signal peptide) at their amino terminals which mediated their attachment to the membranes of the ER. As noted above, proteins whose entire synthesis occurs on free polyribosomes lack this signal peptide. An important aspect of the signal hypothesis was that it suggested, as turns out to be the case, that all ribosomes have the same structure and that the distinction between membrane-bound and free ribosomes depends solely on the former’s carrying proteins that have signal peptides. Much evidence has confirmed the original hypothesis. Because many membrane proteins are synthesized on membrane-bound polyribosomes, the signal hypothesis plays an important role in concepts of membrane assembly. Some characteristics of signal peptides are summarized in Table 46–3.

Table 46–3. Some properties of signal peptides.

• Usually, but not always, located at the amino terminal
• Contain approximately 12–35 amino acids
• Methionine is usually the amino terminal amino acid
• Contain a central cluster of hydrophobic amino acids
• Contain at least one positively charged amino acid near their amino terminal
• Usually cleaved off at the carboxyl terminal end of an Ala residue by signal peptidase

Figure 46–4 illustrates the principal features in relation to the passage of a secreted protein through the membrane of the ER. It incorporates features from the original signal hypothesis and from subsequent work. The mRNA for such a protein encodes an amino terminal signal peptide (also variously called a leader sequence, a transient insertion signal, a signal sequence, or a presequence). The signal hypothesis proposed that the protein is inserted into the ER membrane at the same time as its mRNA is being translated on polyribosomes, so-called cotranslational insertion. As the signal peptide emerges from the large subunit of the ribosome, it is recognized by a signal recognition particle (SRP) that blocks further translation after about 70 amino acids have been polymerized (40 buried in the large ribosomal subunit and 30 exposed). The block is referred to as elongation arrest. The SRP contains six proteins and has a 7S RNA associated with it that is closely related to the Alu family of highly repeated DNA sequences (Chapter 36). The SRP-imposed block is not released until the SRP-signal peptide-polyribosome complex has bound to the so-called docking protein (SRP-R, a receptor for the SRP) on the ER membrane; the SRP thus guides the signal peptide to the SRP-R and prevents premature folding and expulsion of the protein being synthesized into the cytosol.

Figure 46–4. Diagram of the signal hypothesis for the transport of secreted proteins across the ER membrane. The ribosomes synthesizing a protein move along the messenger RNA specifying the amino acid sequence of the protein. (The messenger is represented by the line between 5′ and 3′.) The codon AUG marks the start of the message for the protein; the hatched lines that follow AUG represent the codons for the signal sequence. As the protein grows out from the larger ribosomal subunit, the signal sequence is exposed and bound by the signal recognition particle (SRP). Translation is blocked until the complex binds to the “docking protein,” also designated SRP-R (represented by the solid bar) on the ER membrane. There is also a receptor (open bar) for the ribosome itself. The interaction of the ribosome and growing peptide chain with the ER membrane results in the opening of a channel through which the protein is transported to the interior space of the ER. During translocation, the signal sequence of most proteins is removed by an enzyme called the “signal peptidase,” located at the luminal surface of the ER membrane. The completed protein is eventually released by the ribosome, which then separates into its two components, the large and small ribosomal subunits. The protein ends up inside the ER. See text for further details. (Slightly modified and reproduced, with permission, from Marx JL: Newly made proteins zip through the cell. Science 1980;207:164. Copyright © 1980 by the American Association for the Advancement of Science.)

The SRP-R is an integral membrane protein composed of α and β subunits. The α subunit binds GDP and the β subunit spans the membrane. When the SRP-signal peptide complex interacts with the receptor, the exchange of GDP for GTP is stimulated. This form of the receptor (with GTP bound) has a high affinity for the SRP and thus releases the signal peptide, which binds to the translocation machinery (translocon) also present in the ER membrane. The α subunit then hydrolyzes its bound GTP, restoring GDP and completing a GTP-GDP cycle. The unidirectionality of this cycle helps drive the interaction of the polyribosome and its signal peptide with the ER membrane in the forward direction.

The translocon consists of a number of membrane proteins that form a protein-conducting channel in the ER membrane through which the newly synthesized protein may pass. The channel appears to be open only when a signal peptide is present, preserving conductance across the ER membrane when it closes. The conductance of the channel has been measured experimentally. Specific functions of a number of components of the translocon have been identified or suggested. TRAM (translocating chain-associated membrane) protein may bind the signal sequence as it initially interacts with the translocon, and the Sec61p complex (consisting of three proteins) binds the large subunit of the ribosome.

The insertion of the signal peptide into the conducting channel, while the other end of the parent protein is still attached to ribosomes, is termed “cotranslational insertion.” The process of elongation of the remaining portion of the protein probably facilitates passage of the nascent protein across the lipid bilayer as the ribosomes remain attached to the membrane of the ER. Thus, the rough (or ribosome-studded) ER is formed. It is important that the protein be kept in an unfolded state prior to entering the conducting channel; otherwise, it may not be able to gain access to the channel.

Ribosomes remain attached to the ER during synthesis of signal peptide-containing proteins but are released and dissociated into their two types of subunits when the process is completed. The signal peptide is hydrolyzed by signal peptidase, located on the luminal side of the ER membrane (Figure 46–4), and then is apparently rapidly degraded by proteases.

Cytochrome P450 (Chapter 53), an integral ER membrane protein, does not completely cross the membrane. Instead, it resides in the membrane with its signal peptide intact. Its passage through the membrane is prevented by a sequence of amino acids called a halt- or stop-transfer signal.

Secretory proteins and proteins destined for membranes distal to the ER completely traverse the membrane bilayer and are discharged into the lumen of the ER. N-Glycan chains, if present, are added (Chapter 47) as these proteins traverse the inner part of the ER membrane, a process called “cotranslational glycosylation.” Subsequently, the proteins are found in the lumen of the Golgi apparatus, where further changes in glycan chains occur (Figure 47–9) prior to intracellular distribution or secretion. There is strong evidence that the signal peptide is involved in the process of protein insertion into ER membranes. Mutant proteins, containing altered signal peptides in which a hydrophobic amino acid is replaced by a hydrophilic one, are not inserted into ER membranes. Nonmembrane proteins (eg, α-globin) to which signal peptides have been attached by genetic engineering can be inserted into the lumen of the ER or even secreted.

There is considerable evidence that a second translocon in the ER membrane is involved in retrograde transport of various molecules from the ER lumen to the cytosol. These molecules include unfolded or misfolded glycoproteins, glycopeptides, and oligosaccharides. At least some of these molecules are degraded in proteasomes. Thus, there is two-way traffic across the ER membrane.

PROTEINS FOLLOW SEVERAL ROUTES TO BE INSERTED INTO OR ATTACHED TO THE MEMBRANES OF THE ENDOPLASMIC RETICULUM

The routes that proteins follow to be inserted into the membranes of the ER include the following.

A. COTRANSLATIONAL INSERTION

Figure 46–5 shows a variety of ways in which proteins are distributed in the plasma membrane. In particular, the amino terminals of certain proteins (eg, the LDL receptor) can be seen to be on the extracytoplasmic face, whereas for other proteins (eg, the asialoglycoprotein receptor) the carboxyl terminals are on this face. To explain these dispositions, one must consider the initial biosynthetic events at the ER membrane. The LDL receptor enters the ER membrane in a manner analogous
to a secretory protein (Figure 46–4); it partly traverses the ER membrane, its signal peptide is cleaved, and its amino terminal protrudes into the lumen. However, it is retained in the membrane because it contains a highly hydrophobic segment, the halt- or stop-transfer signal. This sequence forms the single transmembrane segment of the protein and is its membrane-anchoring domain. The small patch of ER membrane in which the newly synthesized LDL receptor is located subsequently buds off as a component of a transport vesicle, probably from the transitional elements of the ER (Figure 46–2). As described below in the discussion of asymmetry of proteins and lipids in membrane assembly, the disposition of the receptor in the ER membrane is preserved in the vesicle, which eventually fuses with the plasma membrane. In contrast, the asialoglycoprotein receptor possesses an internal insertion sequence, which inserts into the membrane but is not cleaved. This acts as an anchor, and its carboxyl terminal is extruded through the membrane. The more complex disposition of the transporters (eg, for glucose) can be explained by the fact that alternating transmembrane α-helices act as uncleaved insertion sequences and as halt-transfer signals, respectively. Each pair of helical segments is inserted as a hairpin. Sequences that determine the structure of a protein in a membrane are called topogenic sequences. As explained in the legend to Figure 46–5, the above three proteins are examples of type I, type II, and type III transmembrane proteins.

Figure 46–5. Variations in the way in which proteins are inserted into membranes. This schematic representation, which illustrates a number of possible orientations, shows the segments of the proteins within the membrane as α-helices and the other segments as lines. The LDL receptor, which crosses the membrane once and has its amino terminal on the exterior, is called a type I transmembrane protein. The asialoglycoprotein receptor, which also crosses the membrane once but has its carboxyl terminal on the exterior, is called a type II transmembrane protein. The various transporters indicated (eg, glucose) cross the membrane a number of times and are called type III transmembrane proteins; they are also referred to as polytopic membrane proteins. (N, amino terminal; C, carboxyl terminal.) (Adapted, with permission, from Wickner WT, Lodish HF: Multiple mechanisms of protein insertion into and across membranes. Science 1985;230:400. Copyright © 1985 by the American Association for the Advancement of Science.)
[Diagram labels: extracytoplasmic face, phospholipid bilayer, cytoplasmic face. Proteins shown: various transporters (eg, glucose), insulin and IGF-I receptors, influenza neuraminidase, G protein–coupled receptors, asialoglycoprotein receptor, transferrin receptor, HLA-DR invariant chain, LDL receptor, HLA-A heavy chain, influenza hemagglutinin.]

B. SYNTHESIS ON FREE POLYRIBOSOMES & SUBSEQUENT ATTACHMENT TO THE ENDOPLASMIC RETICULUM MEMBRANE

An example is cytochrome b5, which enters the ER membrane spontaneously.

C. RETENTION AT THE LUMINAL ASPECT OF THE ENDOPLASMIC RETICULUM BY SPECIFIC AMINO ACID SEQUENCES

A number of proteins possess the amino acid sequence KDEL (Lys-Asp-Glu-Leu) at their carboxyl terminal. This sequence specifies that such proteins will be attached to the inner aspect of the ER in a relatively loose manner. The chaperone BiP (see below) is one such protein. Actually, KDEL-containing proteins first travel to the Golgi, interact there with a specific KDEL receptor protein, and then return in transport vesicles to the ER, where they dissociate from the receptor.

D. RETROGRADE TRANSPORT FROM THE GOLGI APPARATUS

Certain other non-KDEL-containing proteins destined for the membranes of the ER also pass to the Golgi and then return, by retrograde vesicular transport, to the ER to be inserted therein (see below).

The foregoing paragraphs demonstrate that a variety of routes are involved in assembly of the proteins of the ER membranes; a similar situation probably holds for other membranes (eg, the mitochondrial membranes and the plasma membrane). Precise targeting sequences have been identified in some instances (eg, KDEL sequences).

The topic of membrane biogenesis is discussed further later in this chapter.

PROTEINS MOVE THROUGH CELLULAR COMPARTMENTS TO SPECIFIC DESTINATIONS

A scheme representing the possible flow of proteins along the ER → Golgi apparatus → plasma membrane route is shown in Figure 46–6. The horizontal arrows denote transport steps that may be independent of targeting signals, whereas the vertical open arrows represent steps that depend on specific signals. Thus, flow of certain proteins (including membrane proteins) from the ER to the plasma membrane (designated “bulk flow,” as it is nonselective) probably occurs without any targeting sequences being involved, ie, by default. On the other hand, insertion of resident proteins into the ER and Golgi membranes is dependent upon specific signals (eg, KDEL or halt-transfer sequences for the ER). Similarly, transport of many enzymes to lysosomes is dependent upon the Man 6-P signal (Chapter 47), and a signal may be involved for entry of proteins into secretory granules. Table 46–4 summarizes information on sequences that are known to be involved in targeting various proteins to their correct intracellular sites.

Figure 46–6. Flow of membrane proteins from the endoplasmic reticulum (ER) to the cell surface. Horizontal arrows denote steps that have been proposed to be signal independent and thus represent bulk flow. The open vertical arrows in the boxes denote retention of proteins that are resident in the membranes of the organelle indicated. The open vertical arrows outside the boxes indicate signal-mediated transport to lysosomes and secretory storage granules. (Reproduced, with permission, from Pfeffer SR, Rothman JE: Biosynthetic protein transport and sorting by the endoplasmic reticulum and Golgi. Annu Rev Biochem 1987;56:829.)
[Diagram labels: ER, cis Golgi, medial Golgi, trans Golgi, cell surface, lysosomes, secretory storage vesicles.]

Table 46–4. Some sequences or compounds that direct proteins to specific organelles.

Targeting Sequence or Compound                      Organelle Targeted
Signal peptide sequence                             Membrane of ER
Carboxyl terminal KDEL sequence (Lys-Asp-Glu-Leu)   Luminal surface of ER
Amino terminal sequence (20–80 residues)            Mitochondrial matrix
NLS¹ (eg, Pro2-Lys2-Ala-Lys-Val)                    Nucleus
PTS¹ (eg, Ser-Lys-Leu)                              Peroxisome
Mannose 6-phosphate                                 Lysosome

¹ NLS, nuclear localization signal; PTS, peroxisomal-matrix targeting sequence.

CHAPERONES ARE PROTEINS THAT PREVENT FAULTY FOLDING & UNPRODUCTIVE INTERACTIONS OF OTHER PROTEINS

Exit from the ER may be the rate-limiting step in the secretory pathway. In this context, it has been found that certain proteins play a role in the assembly or proper folding of other proteins without themselves being components of the latter. Such proteins are called molecular chaperones; a number of important properties of these proteins are listed in Table 46–5, and the names of some of particular importance in the ER are listed in Table 46–6. Basically, they stabilize unfolded or partially folded intermediates, allowing them time to fold properly, and prevent inappropriate interactions, thus combating the formation of nonfunctional structures. Most chaperones exhibit ATPase activity and bind ADP and ATP. This activity is important for their effect on folding. The ADP-chaperone complex often has a high affinity for the unfolded protein, which, when bound, stimulates release of ADP with replacement by ATP. The ATP-chaperone complex, in turn, releases segments of the protein that have folded properly, and the cycle involving ADP and ATP binding is repeated until the folded protein is released.

Several examples of chaperones were introduced above when the sorting of mitochondrial proteins was discussed. The immunoglobulin heavy chain binding protein (BiP) is located in the lumen of the ER. This protein will bind abnormally folded immunoglobulin heavy chains and certain other proteins and prevent them from leaving the ER, in which they are degraded. Another important chaperone is calnexin, a calcium-binding protein located in the ER membrane. This protein binds a wide variety of proteins, including major histocompatibility complex (MHC) antigens and a variety of serum proteins. As mentioned in Chapter 47, calnexin binds the monoglucosylated species of glycoproteins that occur during processing of glycoproteins, retaining them in the ER until the glycoprotein has folded properly. Calreticulin, which is also a calcium-binding protein, has properties similar to those of calnexin; it is not membrane-bound. Chaperones are not the only proteins in the ER lumen that are concerned with proper folding of proteins. Two enzymes are present that play an active role in folding. Protein disulfide isomerase (PDI) promotes rapid reshuffling of disulfide bonds until the correct set is achieved. Peptidyl prolyl isomerase (PPI) accelerates folding of proline-containing proteins by catalyzing the cis-trans isomerization of X-Pro bonds, where X is any amino acid residue.

Table 46–5. Some properties of chaperone proteins.

• Present in a wide range of species from bacteria to humans
• Many are so-called heat shock proteins (Hsp)
• Some are inducible by conditions that cause unfolding of newly synthesized proteins (eg, elevated temperature and various chemicals)
• They bind to predominantly hydrophobic regions of unfolded and aggregated proteins
• They act in part as a quality control or editing mechanism for detecting misfolded or otherwise defective proteins
• Most chaperones show associated ATPase activity, with ATP or ADP being involved in the protein-chaperone interaction
• Found in various cellular compartments such as cytosol, mitochondria, and the lumen of the endoplasmic reticulum

Table 46–6. Some chaperones and enzymes involved in folding that are located in the rough endoplasmic reticulum.

• BiP (immunoglobulin heavy chain binding protein)
• GRP94 (glucose-regulated protein)
• Calnexin
• Calreticulin
• PDI (protein disulfide isomerase)
• PPI (peptidyl prolyl cis-trans isomerase)

TRANSPORT VESICLES ARE KEY PLAYERS IN INTRACELLULAR PROTEIN TRAFFIC

Most proteins that are synthesized on membrane-bound polyribosomes and are destined for the Golgi apparatus or plasma membrane reach these sites inside transport vesicles. The precise mechanisms by which proteins synthesized in the rough ER are inserted into these vesicles are not known. The vesicles involved in transport from the ER to the Golgi apparatus and vice versa, and from the Golgi to the plasma membrane, are mainly clathrin-free, unlike the coated vesicles involved in endocytosis (see discussions of the LDL receptor in Chapters 25 and 26). For the sake of clarity, the non-clathrin-coated vesicles will be referred to in this text as transport vesicles. There is evidence that proteins destined for the membranes of the Golgi apparatus contain specific signal sequences. On the other hand, most proteins destined for the plasma membrane or for secretion do not appear to contain specific signals, reaching these destinations by default.

The Golgi Apparatus Is Involved in Glycosylation & Sorting of Proteins

The Golgi apparatus plays two important roles in membrane synthesis. First, it is involved in the processing of the oligosaccharide chains of membrane and other N-linked glycoproteins and also contains enzymes involved in O-glycosylation (see Chapter 47). Second, it is involved in the sorting of various proteins prior to their delivery to their appropriate intracellular destinations. All parts of the Golgi apparatus participate in the first role, whereas the trans-Golgi is particularly involved in the second and is very rich in vesicles. Because of their central role in protein transport, considerable research has been conducted in recent years concerning the formation and fate of transport vesicles.

A Model of Non-Clathrin-Coated Vesicles Involves SNAREs & Other Factors

Vesicles lie at the heart of intracellular transport of many proteins. Recently, significant progress has been made in understanding the events involved in vesicle formation and transport. This has transpired because of the use of a number of approaches. These include establishment of cell-free systems with which to study vesicle formation. For instance, it is possible to observe, by electron microscopy, budding of vesicles from Golgi preparations incubated with cytosol and ATP. The development of genetic approaches for studying vesicles in yeast has also been crucial. The picture is complex, with its own nomenclature (Table 46–7), and involves a variety of cytosolic and membrane proteins, GTP, ATP, and accessory factors.

Table 46–7. Factors involved in the formation of non-clathrin-coated vesicles and their transport.

• ARF: ADP-ribosylation factor, a GTPase
• Coatomer: A family of at least seven coat proteins (α, β, γ, δ, ε, β′, and ζ). Different transport vesicles have different complements of coat proteins.
• SNAP: Soluble NSF attachment factor
• SNARE: SNAP receptor
• v-SNARE: Vesicle SNARE
• t-SNARE: Target SNARE
• GTP-γ-S: A nonhydrolyzable analog of GTP, used to test the involvement of GTP
• NEM: N-Ethylmaleimide, a chemical that alkylates sulfhydryl groups
• NSF: NEM-sensitive factor, an ATPase
• Rab proteins: A family of ras-related proteins first observed in rat brain; they are GTPases and are active when GTP is bound
• Sec1: A member of a family of proteins that attach to t-SNAREs and are displaced from them by Rab proteins, thereby allowing v-SNARE–t-SNARE interactions to occur.

Based largely on a proposal by Rothman and colleagues, anterograde vesicular transport can be considered to occur in eight steps (Figure 46–7). The basic concept is that each transport vesicle bears a unique address marker consisting of one or more v-SNARE proteins, while each target membrane bears one or more complementary t-SNARE proteins with which the former interact specifically.

Step 1: Coat assembly is initiated when ARF is activated by binding GTP, which is exchanged for GDP. This leads to the association of GTP-bound ARF with its putative receptor (hatched in Figure 46–7) in the donor membrane.

Step 2: Membrane-associated ARF recruits the coat proteins that comprise the coatomer shell from the cytosol, forming a coated bud.

Step 3: The bud pinches off in a process involving acyl-CoA (and probably ATP) to complete the formation of the coated vesicle.

Step 4: Coat disassembly (involving dissociation of ARF and coatomer shell) follows hydrolysis of bound GTP; uncoating is necessary for fusion to occur.

Step 5: Vesicle targeting is achieved via members of a family of integral proteins, termed v-SNAREs, that tag the vesicle during its budding. v-SNAREs pair with cognate t-SNAREs in the target membrane to dock the vesicle.

It is presumed that steps 4 and 5 are closely coupled and that step 4 may follow step 5, with ARF and the coatomer shell rapidly dissociating after docking.

Step 6: The general fusion machinery then assembles on the paired SNARE complex; it includes an ATPase (NSF; NEM-sensitive factor) and the SNAP (soluble NSF attachment factor) proteins. SNAPs bind to the SNARE (SNAP receptor) complex, enabling NSF to bind.

Step 7: Hydrolysis of ATP by NSF is essential for fusion, a process that can be inhibited by NEM (N-ethylmaleimide). Certain other proteins and calcium are also required.

Step 8: Retrograde transport occurs to restart the cycle. This last step may retrieve certain proteins or recycle v-SNAREs. Nocodazole, a microtubule-disrupting agent, inhibits this step.

Figure 46–7. Model of the steps in a round of anterograde vesicular transport. The cycle starts in the bottom left-hand side of the figure, where two molecules of ARF are represented as small ovals containing GDP. The steps in the cycle are described in the text. Most of the abbreviations used are explained in Table 46–7. The roles of Rab and Sec1 proteins (see text) in the overall process are not dealt with in this figure. (CGN, cis-Golgi network; BFA, Brefeldin A.) (Adapted from Rothman JE: Mechanisms of intracellular protein transport. Nature 1994;372:55.) (Courtesy of E Degen.)

Brefeldin A Inhibits the Coating Process

The following points expand and clarify the above.

(a) To participate in step 1, ARF must first be modified by addition of myristic acid (C14:0), employing myristoyl-CoA as the acyl donor. Myristoylation is one of a number of enzyme-catalyzed posttranslational modifications, involving addition of certain lipids to specific residues of proteins, that facilitate the binding of proteins to the cytosolic surfaces of membranes or vesicles. Others are addition of palmitate, farnesyl, and geranylgeranyl; the two latter molecules are polyisoprenoids containing 15 and 20 carbon atoms, respectively.

(b) At least three different types of coated vesicles have been distinguished: COPI, COPII, and clathrin-coated vesicles; the first two are referred to here as transport vesicles. Many other types of vesicles no doubt remain to be discovered. COPI vesicles are involved in bidirectional transport from the ER to the Golgi and in the reverse direction, whereas COPII vesicles are involved mainly in transport in the former direction. Clathrin-containing vesicles are involved in transport from the trans-Golgi network to prelysosomes and from the plasma membrane to endosomes. Selection of cargo molecules by vesicles appears to be primarily a function of the coat proteins of vesicles. Cargo molecules may interact with coat proteins either directly or via intermediary proteins that attach to coat proteins, and they then become enclosed in their appropriate vesicles.

(c) The fungal metabolite brefeldin A prevents GTP from binding to ARF in step 1 and thus inhibits the entire coating process. In its presence, the Golgi apparatus appears to disintegrate, and fragments are lost. It may do this by inhibiting the guanine nucleotide exchanger involved in step 1.

(d) GTP-γ-S (a nonhydrolyzable analog of GTP often used in investigations of the role of GTP in biochemical processes) blocks disassembly of the coat from coated vesicles, leading to a build-up of coated vesicles.
(e) A family of Ras-like proteins, called the Rab protein family, is required in several steps of intracellular protein transport, regulated secretion, and endocytosis. They are small monomeric GTPases that attach to the cytosolic faces of membranes via geranylgeranyl chains. They attach in the GTP-bound state (not shown in Figure 46–7) to the budding vesicle. Another family of proteins (Sec1) binds to t-SNAREs and prevents interaction between them and their complementary v-SNAREs. When a vesicle interacts with its target membrane, Rab proteins displace Sec1 proteins and the v-SNARE–t-SNARE interaction is free to occur. It appears that the Rab and Sec1 families of proteins regulate the speed of vesicle formation, opposing each other. Rab proteins have been likened to throttles and Sec1 proteins to dampers on the overall process of vesicle formation.

(f) Studies using v- and t-SNARE proteins reconstituted into separate lipid bilayer vesicles have indicated that they form SNAREpins, ie, SNARE complexes that link two membranes (vesicles). SNAPs and NSF are required for formation of SNAREpins, but once they have formed they can apparently lead to spontaneous fusion of membranes at physiologic temperature, suggesting that they are the minimal machinery required for membrane fusion.

(g) The fusion of synaptic vesicles with the plasma membrane of neurons involves a series of events similar to that described above. For example, one v-SNARE is designated synaptobrevin and two t-SNAREs are designated syntaxin and SNAP 25 (synaptosome-associated protein of 25 kDa). Botulinum B toxin is one of the most lethal toxins known and the most serious cause of food poisoning. One component of this toxin is a protease that appears to cleave only synaptobrevin, thus inhibiting release of acetylcholine at the neuromuscular junction and possibly proving fatal, depending on the dose taken.

(h) Although the above model describes non-clathrin-coated vesicles, it appears likely that many of the events described above apply, at least in principle, to clathrin-coated vesicles.

THE ASSEMBLY OF MEMBRANES IS COMPLEX

There are many cellular membranes, each with its own specific features. No satisfactory scheme describing the assembly of any one of these membranes is available. How various proteins are initially inserted into the membrane of the ER has been discussed above. The transport of proteins, including membrane proteins, to various parts of the cell inside vesicles has also been described. Some general points about membrane assembly remain to be addressed.

Asymmetry of Both Proteins & Lipids Is Maintained During Membrane Assembly

Vesicles formed from membranes of the ER and Golgi apparatus, either naturally or pinched off by homogenization, exhibit transverse asymmetries of both lipid and protein. These asymmetries are maintained during fusion of transport vesicles with the plasma membrane. The inside of the vesicles after fusion becomes the outside of the plasma membrane, and the cytoplasmic side of the vesicles remains the cytoplasmic side of the membrane (Figure 46–8). Since the transverse asymmetry of the membranes already exists in the vesicles of the ER well before they are fused to the plasma membrane, a major problem of membrane assembly becomes understanding how the integral proteins are inserted into the lipid bilayer of the ER. This problem was addressed earlier in this chapter.

Figure 46–8. Fusion of a vesicle with the plasma membrane preserves the orientation of any integral proteins embedded in the vesicle bilayer. Initially, the amino terminal of the protein faces the lumen, or inner cavity, of such a vesicle. After fusion, the amino terminal is on the exterior surface of the plasma membrane. That the orientation of the protein has not been reversed can be perceived by noting that the other end of the molecule, the carboxyl terminal, is always immersed in the cytoplasm. The lumen of a vesicle and the outside of the cell are topologically equivalent. (Redrawn and modified, with permission, from Lodish HF, Rothman JE: The assembly of cell membranes. Sci Am [Jan] 1979;240:43.)

Phospholipids are the major class of lipid in membranes. The enzymes responsible for the synthesis of phospholipids reside in the cytoplasmic surface of the cisternae of the ER. As phospholipids are synthesized at that site, they probably self-assemble into thermodynamically stable bimolecular layers, thereby expanding the membrane and perhaps promoting the detachment of so-called lipid vesicles from it. It has been proposed that these vesicles travel to other sites, donating their lipids to other membranes; however, little is known about this matter. As indicated above, cytosolic proteins that take up phospholipids from one membrane and release them to another (ie, phospholipid exchange proteins) have been demonstrated; they probably play a role in contributing to the specific lipid composition of various membranes.

Lipids & Proteins Undergo Turnover at Different Rates in Different Membranes

It has been shown that the half-lives of the lipids of the ER membranes of rat liver are generally shorter than those of its proteins, so that the turnover rates of lipids and proteins are independent. Indeed, different lipids have been found to have different half-lives. Furthermore, the half-lives of the proteins of these membranes vary quite widely, some exhibiting short (hours) and others long (days) half-lives. Thus, individual lipids and proteins of the ER membranes appear to be inserted into it relatively independently; this is the case for many other membranes.

The biogenesis of membranes is thus a complex process about which much remains to be learned. One indication of the complexity involved is the number of posttranslational modifications that membrane proteins may be subjected to prior to attaining their mature state. These include proteolysis, assembly into multimers, glycosylation, addition of a glycophosphatidylinositol (GPI) anchor, sulfation on tyrosine or carbohydrate moieties, phosphorylation, acylation, and prenylation, a list that is undoubtedly not complete. Nevertheless, significant progress has been made; Table 46–8 summarizes some of the major features of membrane assembly that have emerged to date.

Table 46–8. Major features of membrane assembly.
• Lipids and proteins are inserted independently into membranes.
• Individual membrane lipids and proteins turn over independently and at different rates.
• Topogenic sequences (eg, signal [amino terminal or internal] and stop-transfer) are important in determining the insertion and disposition of proteins in membranes.
• Membrane proteins inside transport vesicles bud off the endoplasmic reticulum on their way to the Golgi; final sorting of many membrane proteins occurs in the trans-Golgi network.
• Specific sorting sequences guide proteins to particular organelles such as lysosomes, peroxisomes, and mitochondria.

Various Disorders Result From Mutations in Genes Encoding Proteins Involved in Intracellular Transport

Some of these are listed in Table 46–9; the majority affect lysosomal function. A number of other mutations affecting intracellular protein transport have been reported but are not included here.

Table 46–9. Some disorders due to mutations in genes encoding proteins involved in intracellular membrane transport.¹
Disorder²: Protein involved
• Chédiak-Higashi syndrome, 214500: Lysosomal trafficking regulator
• Combined deficiency of factors V and VIII, 227300: ERGIC-53, a mannose-binding lectin
• Hermansky-Pudlak syndrome, 203300: AP-3 adaptor complex β3A subunit
• I-cell disease, 252500: N-Acetylglucosamine 1-phosphotransferase
• Oculocerebrorenal syndrome, 309000: OCRL-1, an inositol polyphosphate 5-phosphatase
¹ Modified from Olkkonen VM, Ikonen E: Genetic defects of intracellular-membrane transport. N Engl J Med 2000;343:1095. Certain related conditions not listed here are also described in this publication. I-cell disease is described in Chapter 47. The majority of the disorders listed above affect lysosomal function; readers should consult a textbook of medicine for information on the clinical manifestations of these conditions.
² The numbers after each disorder are the OMIM numbers.

SUMMARY

• Many proteins are targeted to their destinations by signal sequences. A major sorting decision is made when proteins are partitioned between cytosolic and membrane-bound polyribosomes by virtue of the absence or presence of a signal peptide.
• The pathways of protein import into mitochondria, nuclei, peroxisomes, and the endoplasmic reticulum are described.
• Many proteins synthesized on membrane-bound polyribosomes proceed to the Golgi apparatus and the plasma membrane in transport vesicles.
• A number of glycosylation reactions occur in compartments of the Golgi, and proteins are further sorted in the trans-Golgi network.
• Most proteins destined for the plasma membrane and for secretion appear to lack specific signals and reach their destinations by a default mechanism.
• The role of chaperone proteins in the folding of proteins is presented, and a model describing budding and attachment of transport vesicles to a target membrane is summarized.
• Membrane assembly is discussed and shown to be complex. Asymmetry of both lipids and proteins is maintained during membrane assembly.
• A number of disorders have been shown to be due to mutations in genes encoding proteins involved in various aspects of protein traffic and sorting.

REFERENCES

Fuller GM, Shields DL: Molecular Basis of Medical Cell Biology. McGraw-Hill, 1998.
Gould SJ et al: The peroxisome biogenesis disorders. In: The Metabolic and Molecular Bases of Inherited Disease, 8th ed. Scriver CR et al (editors). McGraw-Hill, 2001.
Graham JM, Higgins JA: Membrane Analysis. BIOS Scientific, 1997.
Griffith J, Sansom C: The Transporter Facts Book. Academic Press, 1998.
Lodish H et al: Molecular Cell Biology, 4th ed. Freeman, 2000. (Chapter 17 contains comprehensive coverage of protein sorting and organelle biogenesis.)
Olkkonen VM, Ikonen E: Genetic defects of intracellular-membrane transport. N Engl J Med 2000;343:1095.
Reithmeier RAF: Assembly of proteins into membranes. In: Biochemistry of Lipids, Lipoproteins and Membranes. Vance DE, Vance JE (editors). Elsevier, 1996.
Sabatini DD, Adesnik MB: The biogenesis of membranes and organelles. In: The Metabolic and Molecular Bases of Inherited Disease, 8th ed. Scriver CR et al (editors). McGraw-Hill, 2001.