dnaG Primase dependent Origins of DNA Replication

Vol. 254, No. 24, Issue of December 25, pp. 12615-12628, 1979
Printed in U.S.A.

dnaG (Primase)-dependent Origins of DNA Replication
NUCLEOTIDE SEQUENCES OF THE NEGATIVE STRAND INITIATION SITES OF BACTERIOPHAGES St-1, φK, AND α3*

(Received for publication, May 22, 1979)

John Sims,‡ Daniel Capon, and David Dressler
From the Biological Laboratories, Harvard University, Cambridge, Massachusetts 02138

   The simplest known origins of DNA replication occur in the single-stranded bacteriophages. In one set of phages, negative strand synthesis is initiated by a single protein, the product of the Escherichia coli replication gene dnaG. Evidently, in these phages (G4, St-1, φK, and α3) the origin for negative strand synthesis consists of a nucleic acid element capable of direct recognition by the dnaG priming protein.
   We have located and sequenced the origins of negative strand synthesis in St-1, φK, and α3, and compared them with the origin sequence previously determined for G4. In each case, the point at which the negative strand is initiated can be identified at the nucleotide level. The data lead to the following conclusions:
   1. In all four phages, the negative strand initiation site occurs within an intercistronic region of approximately 135 bases. While in G4, the origin lies between genes specifying the viral coat proteins F and G, the origin is shifted in St-1, φK, and α3 to a position between coat protein genes G and H.
   2. Extensive nucleotide conservation exists at the negative strand origin, but does not extend into the adjacent coding regions. The conserved origin DNA occurs in two regions, 42 and 45 bases long, which are separated by 13 bases of divergent sequence.
   3. Correlated with the two stretches of conserved nucleotide sequence are two regions of potential secondary structure. The start point of negative strand synthesis lies just prior to one of these hairpins.
   Similarities in both primary sequence and secondary structure can be found between the negative strand origins of G4, St-1, φK, and α3 and the general origin regions of bacteriophage λ and of E. coli.

   The dnaG protein of Escherichia coli is responsible for the initiation of DNA strands. Given an appropriate single-stranded DNA template in vitro, the enzyme synthesizes a short oligonucleotide which serves as a primer for elongation by DNA polymerases (Bouché et al., 1975). The dnaG protein has also been implicated in the initiation of Okazaki fragments in vivo (Lark, 1972; Louarn, 1974).
   Two types of priming reaction involving the dnaG protein have been discovered through studies in vitro using single-stranded phages (Schekman et al., 1974). These experiments involve the conversion of viral positive strand circles to duplex rings via the synthesis of a complementary, or negative, strand.
   1. With φX174 DNA as substrate, the initiation of negative strand synthesis is a complex reaction, involving seven proteins (Wickner and Hurwitz, 1974; Schekman et al., 1974). In a prepriming step, the dnaC protein and initiation factors i, n, and n′ (also called X, Y, and Z) transfer a molecule of dnaB protein to the DNA binding protein-covered positive strand template. The dnaG protein, recognizing this DNA·dnaB protein complex, then synthesizes a short oligonucleotide primer which remains bound to the DNA. After the special initiation polymerase (primase) has functioned, the nascent negative strand is elongated by DNA polymerase III holoenzyme. In the case of φX, it appears that negative strand synthesis can be initiated at many (perhaps random) sites on the positive strand template (Eisenberg and Denhardt, 1974; McMacken et al., 1977; McMacken and Kornberg, 1978).
   2. In contrast to φX, the priming of negative strand synthesis in phage G4 (Godson, 1974) (and also in the independently isolated phages St-1, φK, and α3 (Bradley, 1970; Taketo, 1976)) is enzymologically much simpler. Studies in vivo have demonstrated that negative strand synthesis in these phages requires the dnaG but not the dnaB protein (Bowes, 1974; Derstine et al., 1976; Kodaira and Taketo, 1977; Taketo, 1976; Taketo and Kodaira, 1978). In vitro experiments have extended this result: the dnaB and dnaC proteins and factors X, Y, and Z (i, n, and n′) are not required for negative strand synthesis (Zechel et al., 1975; Bouché et al., 1975; Wickner, 1977). Evidently, these phage DNAs, when associated with DNA binding protein, possess a site which by itself contains the information necessary for the dnaG protein to bind to the template and initiate primer synthesis. Because of the relative simplicity of negative strand synthesis in G4, St-1, φK, and α3, we have focused our attention on these phages.
   The purpose of this study has been to define the nature of the template site at which the initiation of negative strand synthesis occurs. This site is the DNA complement of the dnaG protein, forming the second half of the protein-nucleic acid interaction required for strand initiation.
   We have previously reported the nucleic acid sequence for the negative strand origin of one of these phages, G4 (Sims and Dressler, 1978), as have Fiddes et al. (1978). The G4 study showed that the negative strand origin is located in a 135-base untranslated region between two coat protein genes (F and G), and possesses extensive secondary structure. It also established that the oligonucleotide synthesized in vitro from G4 DNA by the dnaG protein (Bouché et al., 1978) is in fact complementary to DNA located at the physiological origin of negative strand synthesis (Hourcade and Dressler, 1978). The studies with G4 provided the first example of an origin of DNA synthesis, dependent on the Escherichia coli replication protein dnaG, for which the exact start point of synthesis could be defined at the nucleotide level.
   We report here first the location, and then the DNA sequences, for the negative strand origins of phages St-1, φK, and α3. These sequences were obtained so that they could be compared with the sequence of the G4 initiation site, in order to learn which features are conserved among the different origins and therefore likely to be important for recognition by the dnaG protein. We had anticipated that the potential sequence homology would be short and possibly interrupted, as are the promoter sequences recognized by RNA polymerase. Instead, the initiation region in all four phages was found to be highly conserved, with about 100 bases of DNA retained in common. This conservation appears functionally significant because it does not extend into the coding sequences that lie on either side. In all four origin sequences, the conserved DNA element can be used to form two regions of potential secondary structure, which may be involved in folding the initiation site DNA into a compact structure for recognition by the dnaG priming protein.
   As in G4, the negative strand initiation sites in St-1, φK, and α3 are intercistronic, and occur between two coat protein genes. However, in phage G4, the origin lies between viral genes F and G, whereas in phages St-1, φK, and α3 this very similar sequence is shifted to a position between genes G and H.

EXPERIMENTAL PROCEDURES

   Viruses: Phage St-1 was obtained from S. Wickner (National Institutes of Health). It was isolated at Stoke-on-Trent, England in 1964 (Bradley, 1964, 1970). Phages φK and α3 were obtained from A. Taketo (Kanazawa University). They were isolated in Japan and Edinburgh, respectively (Taketo, 1976; Bradley, 1970).
   Host Cells: St-1 and φK grow on E. coli K-12; α3 grows on E. coli C. The specific strains used to prepare phage stocks and viral DNAs were MM 28 (W3350 gal- str') and HF4704 (thy- uvrA-), respectively.
   DNA: E. coli DNA, from the K12 strain MM 294, was a kind gift of Peter LoMedico; it had been purified by the method of Blin and Stafford (1976). Plasmid pMB9 DNA and a restriction fragment (Hae 203) containing part of the lac operon were gifts of Hunt Potter and Barbara Meyer.
   Enzymes: Restriction enzymes were purchased from New England Biolabs, except for Eco RI, which was obtained from Bethesda Research Laboratories. In addition, some Alu I was a gift of U. Siebenlist; some Hae III a gift of R. Tizard and P. Farabaugh; and some Hinf I a gift of G. Sutcliffe and D. Hourcade. DNA polymerase I, its Klenow fragment, and polynucleotide kinase were purchased from Boehringer Mannheim. RNase A was obtained from Sigma, and DNase I from Worthington. Bacterial alkaline phosphatase was purchased from Worthington. For use, it was diluted to 1 mg/ml in 100 mM Tris (pH 8.0), 5 mM MgCl₂, dialyzed, and stored in the same buffer.
   Chemicals: [γ-³²P]ATP was purchased from New England Nuclear or synthesized by the procedure of Glynn and Chappell (1964), as described by Maxam and Gilbert (1977). [α-³²P]deoxynucleoside triphosphates were purchased from New England Nuclear. Dideoxythymidine triphosphate was obtained from P-L Biochemicals, bovine serum albumin (Fraction V) from Miles, ethidium bromide from Calbiochem, acrylamide and bisacrylamide from Bio-Rad, urea from R-Plus, dimethyl sulfate from Aldrich, hydrazine from Eastman, and piperidine from Fisher.

Methods

   Preparation of Phage DNA: In all cases, host cells were grown in a medium containing, per liter, 10 g of bacto-Tryptone, 5 g of KCl, 10 mM MgSO₄, and 2 mM CaCl₂. For strain HF4704, the medium was supplemented with 20 µg/ml of thymine.
   To prepare double-stranded, supercoiled viral DNA, cells at about 3 × 10^[?]/ml were infected with phage at a multiplicity of 2 to 3. Chloramphenicol was present at 30 µg/ml to suppress the late-life-cycle synthesis of phage. After 6 to 8 h of aeration at 37°C, the cells were pelleted and cleared lysates were prepared by the method of Clewell and Helinski (1970). The viral DNA was then centrifuged to equilibrium in CsCl/ethidium bromide, and the lower band, containing the covalently closed DNA, was recovered and extracted with CsCl-saturated isoamyl alcohol to remove the dye. The DNA was then freed of RNA by passage over a column of Bio-Gel A-15m (Bio-Rad) previously equilibrated with 10 mM Tris-HCl (pH 7.4), 1 mM EDTA, 10 mM NaCl. Yields were 0.5 to 1 mg of supercoiled phage DNA from 1 liter of infected cells for St-1 and φK, and 0.05 to 0.1 mg/liter for α3.
   Sequencing of the Origins: Restriction fragments from the origin regions were obtained by digestion of fully duplex rings. Restriction digests were carried out at 37°C in a buffer containing 6.7 mM Tris-HCl (pH 7.4), 6.7 mM MgCl₂, 6.7 mM 2-mercaptoethanol, and 200 µg/ml of bovine serum albumin. For Hae III and Hinf I digestions, NaCl was added to 60 mM. Fragments were recovered from gels as described by Maxam and Gilbert (1977). Nonradioactive fragments on preparative gels were located by uv shadowing (Hassur and Whitlock, 1974).
   For end-labeling, restriction fragments were dephosphorylated for 30 min at 37°C in 100 mM Tris-HCl (pH 8.0), using 0.005 unit of bacterial alkaline phosphatase per pmol of 5′ ends. End-labeling using polynucleotide kinase was performed essentially as described by Maxam and Gilbert (1977), except that fragments with protruding 5′ ends were not heat-denatured prior to the kinase reaction. Strand separations and sequencing reactions were carried out according to the procedures of Maxam and Gilbert (1977), except for the modifications noted in Sims and Dressler (1978); the "strong adenine/weak guanine" and "alternative guanine" reactions were used for cleavage at the purines. For resolution of the cleavage products, 20% acrylamide, 7 M urea gels (bis¹/acrylamide ratio, 1:30) were run in 50 mM Tris/borate (pH 8.3), 1 mM EDTA. Double that buffer concentration was used for 10% or 8% acrylamide, 7 M urea gels (bis/acrylamide ratio, 1:20). Earlier gels were 1.5 mm thick; later we used 0.38-mm-thick gels for better resolution (Sanger and Coulson, 1978).

RESULTS AND DISCUSSION

Identification of the Negative Strand Origins

   In order to sequence the negative strand initiation sites of St-1, φK, and α3, it was first necessary to locate these origins on the viral genomes. In the previous study of phage G4, this was done by infecting cells with uv-irradiated phage and allowing negative strand synthesis to proceed until blocked by a uv-induced lesion. The partially duplex circles were then recovered from the cells and analyzed with restriction enzymes to determine from which point the genome had become double-stranded, thus defining the location of the origin of negative strand synthesis (Hourcade and Dressler, 1978). The DNA sequence at the origin was then determined, and the initiation site was defined at the nucleotide level by comparing the DNA sequence in the region of the origin with the sequence of the RNA primer made from G4 DNA in vitro (Sims and Dressler, 1978; Bouché et al., 1978).
   To locate the negative strand origins in St-1, φK, and α3, we used a more rapid technique based on the same principles, but carried out in vitro. The viral positive strand circles were primed with dnaG protein, and the primers were elongated with DNA polymerase I using the four normal dNTPs and, in addition, a dideoxynucleoside triphosphate. Incorporation of a dideoxy residue causes chain termination (Atkinson et al., 1969), and synthesis is therefore expected to produce a population of partially duplex circles. The newly synthesized negative strands will extend from the origin of synthesis (at their 5′ ends) to the random position at which a dideoxy nucleotide has been incorporated (at their 3′ ends). They can be made radioactive by using a ³²P-labeled deoxyribonucleoside triphosphate during synthesis.

   * This study has been funded by Research Grants NP-57 from the American Cancer Society and GM-17088 from the National Institutes of Health and by Research Career Development Award GM-70440 from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
   ‡ Current address, Medical Research Council Laboratory of Molecular Biology, Hills Road, Cambridge, England.
   ¹ The abbreviations used are: bis, N,N′-methylenebisacrylamide; dNTP, deoxynucleoside triphosphate; ddNTP, 2′,3′-dideoxynucleoside triphosphate; SSC, 0.15 M NaCl, 0.015 M sodium citrate; 1 M phosphate buffer, 0.4 M NaH₂PO₄, 0.6 M Na₂HPO₄; SDS, sodium dodecyl sulfate.
   Restriction enzyme digestion of the partially duplex rings, followed by denaturation, will convert the newly synthesized negative strands to a collection of radioactive fragments. These fragments may be resolved on a gel to identify which portion of the template served as the initiation site for negative strand synthesis.
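
The digestion step can be pictured with a minimal sketch. It assumes a circular genome whose coordinates increase in the direction of negative strand synthesis. The function name, the genome length, and the site positions are hypothetical illustrations, not the St-1 restriction map.

```python
def digest_nascent_strand(origin, strand_len, cut_sites, genome_len):
    """Lengths of the labeled pieces released from one nascent strand.

    The strand starts at `origin` and extends `strand_len` bases in the
    direction of synthesis around a circle of `genome_len` bases. Only
    restriction sites lying inside the copied (duplex) region are cut.
    """
    if not 0 < strand_len <= genome_len:
        raise ValueError("strand_len must be between 1 and genome_len")
    # Distance of every site from the origin, measured along the direction of synthesis.
    offsets = sorted((site - origin) % genome_len for site in cut_sites)
    cuts = [d for d in offsets if 0 < d < strand_len]
    bounds = [0] + cuts + [strand_len]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical map: origin at 1000 on a 6000-base circle, four restriction sites.
pieces = digest_nascent_strand(1000, 1500, [1105, 1905, 3105, 5000], 6000)
print(pieces)  # [105, 800, 595]
```

In this illustration the first piece runs from the origin to the first cut, the second is a complete restriction fragment, and the last is a partial fragment ending at the chain-terminating residue. Only the complete fragments comigrate with bands from a digest of fully duplex rings.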
   When the dideoxynucleotide is used in high concentration, its incorporation should be relatively frequent, and the newly synthesized negative strands are expected to be short. Upon digestion of the partially duplex rings with a given restriction enzyme, only a few radioactive restriction fragments should be produced. Assuming that priming occurs at a unique site, these fragments will represent the segment of the viral chromosome immediately downstream from the origin. In the presence of a low concentration of ddNTP, on the other hand, synthesis is expected to proceed most or all of the way around the circle before being arrested. In this case, almost all negative strand fragments should be present in a restriction enzyme digest of the partially duplex rings; those that are rare or missing would correspond to the region of the genome just prior to the origin. This procedure for locating the origins worked well, and will be illustrated using the data obtained with phage St-1 as an example.
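
How the terminator concentration shapes the band pattern can be sketched with a toy model. It assumes, as a simplification not taken from the paper, that every base added carries the same independent chance `stop_prob` of ending the chain, whereas in the experiment a dideoxynucleotide can terminate only at positions where its normal counterpart would be incorporated. The function name and all fragment lengths are hypothetical.

```python
def fragment_representation(lead_in, fragment_lengths, stop_prob):
    """Fraction of nascent strands that fully cover each restriction fragment.

    `lead_in` is the distance from the origin to the first cut;
    `fragment_lengths` lists the fragments in the order they are copied.
    """
    if not 0.0 <= stop_prob <= 1.0:
        raise ValueError("stop_prob must be a probability")
    fractions = []
    distance = lead_in
    for length in fragment_lengths:
        distance += length  # bases copied once this fragment is complete
        fractions.append((1.0 - stop_prob) ** distance)
    return fractions


fragments = [800, 1200, 600, 2000, 1295]  # hypothetical, in order of synthesis
high = fragment_representation(105, fragments, 0.003)   # much terminator
low = fragment_representation(105, fragments, 0.0001)   # little terminator
for length, h, l in zip(fragments, high, low):
    print(f"{length:5d} bases  high ddNTP: {h:.3f}  low ddNTP: {l:.3f}")
```

Under these assumptions, the fragments nearest the origin dominate when the terminator is abundant. When it is scarce, every fragment appears, and the last ones to be copied (those just prior to the origin) remain the weakest, which is the reasoning used in the text to orient the origin.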
   Fig. 1 shows the fragments of the negative strand released upon Hae III digestion of partially duplex St-1 circles, which were produced by incubating single-stranded viral DNA with replication proteins in the presence of the four deoxyribonucleoside triphosphates and varying amounts of the chain-terminating analogue, ddTTP. In Lanes 1 through 8, left to right, progressively lower amounts of ddTTP were included in the reaction mixtures. The two lanes marked M are marker lanes containing the full set of Hae III fragments produced by digestion of fully double-stranded St-1 circles.
   FIG. 1. Localization of the St-1 negative strand origin using dnaG-primed synthesis in the presence of the chain-terminating compound, dideoxythymidine triphosphate. Partially duplex St-1 circles were made in vitro by dnaG-primed synthesis in the presence of the four normal dNTPs and varying amounts of ddTTP (as described below). Lanes 1 through 8 show the Hae III digestion products of the partially duplex rings, after denaturation and electrophoresis through a polyacrylamide gel. Reaction 1 contained the highest concentration of ddTTP; reaction 8, the lowest. The lanes marked M display the end-labeled Hae III digestion products of fully duplex St-1 circles. It can be seen that in the presence of a high concentration of ddTTP, only a short negative strand is made. As discussed in the text, the negative strand is initiated about 105 bases upstream from the fragment Hae 4, as judged by the length of the new band N. Detailed procedures: partially duplex St-1 circles were synthesized by incubating single-stranded viral DNA (0.6 nmol, as nucleotide) with DNA binding protein (0.66 µg), dnaG protein (0.075 unit), and E. coli DNA polymerase I (14 units), in 25-µl reaction mixtures containing 20 mM Tris-HCl (pH 7.5), 10 mM MgCl₂, 3.5 mM dithiothreitol, 20 µg/ml of rifampicin, 430 µg/ml of bovine serum albumin, 40 µM ADP, 40 µM dATP, dGTP, and dCTP, 5 µM [α-³²P]dTTP (specific activity, 46 Ci/mmol), and varying amounts of ddTTP

   With the highest concentrations of ddTTP (Lanes 1 and 2), only three bands appear. Two of these are identified as Hae 3 and Hae 4, the third and fourth largest fragments present in a Hae III digest of fully duplex St-1 rings (compare Lanes 1 and 2 with Lane M). Since Hae 4 is the more heavily represented fragment, it is identified as the piece immediately downstream from the initiation site, followed by Hae 3. The third fragment in the experimental digests, labeled N, is a 105-nucleotide band not present in a digest of fully duplex St-1 rings, and is interpreted as representing the portion of the negative strand lying between the origin and the first Hae III cut.
   As less ddTTP is used in the reactions (Lanes 3 to 6), more bands appear, until in reactions containing the lowest concentrations of the chain terminator, all of the Hae III negative strand fragments are represented (Lanes 7 and 8). However, two bands, Hae 8 and Hae 10, appear late and remain fainter than the rest of the fragments. One of these two is thus
  tentatively      assigned as the origin-containing                fragment,      and
                                                                                            (Lane 1,320 pM; Lane 2, 174 pM; Lane 3,91 pM; Lane 4,40 pM; Lane
  the other as the fragment immediately                  upstream. (We assume               5, 22 pM; Lane 6, 11 pM; Lane 7, 6 pM; Lane 8, no ddTTP                                               added).
  that the reason the origin-containing                fragment is represented              After 20 min at 3O”C, the reactions                               were stopped           by addition            of
  at all is due to either (a) strand displacement                        or (b) nick        EDTA to 50 mM. After addition                        of carrier tRNA, part of each reaction
  translation     through the initiation         site region, since DNA po-                 mixture        was extracted          with phenol and transferred                     into 10 mM Tris-
  lymerase I was used in these experiments.)                                                HCl (pH 7.4), 1 mM EDTA                       by centrifugation            through       Sephadex           G-50
                                                                                            (Neal-and          Florini, 1973). The partially               duplex rings were then digested
     A localization      of the St-l origin with respect to the cleavage                    with Hue III. The nroducts                     were lvophilized,            resuspended           in 20 111of
  sites of another restriction enzyme was obtained by analyzing                              formamide           containing       0.1% xylene cyan01 and bromphenol                             biue:del
  the partially duplex rings with Hpa II. The data are shown in                              natured        at 9O”C, and loaded                 onto a 4% acrylamide,                  7 M urea gel.
  Fig. 2. The first restriction          fragment downstream                 from the        Another        portion       of the partially        duplex rings was analyzed                    with Hpa
  origin is Hpa         12, followed     by Hpa          3. The new band, N,                II (see Fig. 2).
  representing       the distance from the origin to Hpa 12, is 90 to
  95 bases long (Fig. 2). In reactions in which synthesis has                               strand synthesis in phage St-l begins at a specific site located
  proceeded most of the way around the circle, Hpa 6,6, and 9                               approximately     105 bases upstream from fragment Hue 4 and
  are all quite faint, with Hpa 8 being the least represented.                              about 90 to 95 bases upstream from fragment Hpa 12. The
     From these experiments, it can be concluded that negative                              origin is contained within either Hue 8 or Hue 10, and prob-
12618                                    dnaG (Primase)-dependent Origins of DNA Replication
ably within Hpa 8. This information  was sufficient to proceed.                            proceeded to sequence directly the small number of fragments
It was more efficient at this point to analyze the relevant                                known to lie in the vicinity of the origin (Fig. 3).
fragments and obtain overlapping    sequences than to define
the St-l restriction enzyme map more extensively. We thus                                             Sequencing of the Negative Strand Origins
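The dilution series in the Fig. 1 legend can be summarized as the molar ratio of chain terminator to normal dTTP in each reaction. The short sketch below only restates that arithmetic from the concentrations given in the legend (5 µM dTTP and 320 to 0 µM ddTTP). The names `DTTP_UM`, `DDTTP_UM`, and `ddttp_to_dttp_ratio` are illustrative and not from the paper, and no attempt is made to model how strongly the polymerase discriminates between the two nucleotides.

```python
# Concentrations transcribed from the Fig. 1 legend, in micromolar.
DTTP_UM = 5.0
DDTTP_UM = {1: 320.0, 2: 174.0, 3: 91.0, 4: 40.0,
            5: 22.0, 6: 11.0, 7: 6.0, 8: 0.0}


def ddttp_to_dttp_ratio(lane: int) -> float:
    """Molar ratio of ddTTP (chain terminator) to dTTP in one reaction."""
    return DDTTP_UM[lane] / DTTP_UM


if __name__ == "__main__":
    # Lane 1 has the most terminator per dTTP, lane 8 has none.
    for lane in sorted(DDTTP_UM):
        print(f"Lane {lane}: ddTTP/dTTP = {ddttp_to_dttp_ratio(lane):.1f}")
```

The ratio falls steadily from lane 1 to lane 8, which matches the observation that progressively longer negative strands, and therefore more restriction fragments, appear across the gel.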
   FIG. 2. Localization of the St-1 origin. [Gel image: lanes 1, 2, M, 3, 4, 5, 6, M, 7, 8; bands labeled by Hpa II fragment number.] This figure shows an experiment identical to that described in the legend to Fig. 1, except that the partially duplex St-1 circles (the same preparation as used in Fig. 1) were digested with Hpa II. Hpa II was also used to generate the markers. The first region of the positive strand template to become duplex is represented by fragments N and Hpa 12, with synthesis starting 90 to 95 bases prior to fragment Hpa 12.

Sequencing of the Negative Strand Origins

   The fragments relevant to the St-1 origin (Hae 8 and 10, Hpa 6, 8, and 9) were isolated preparatively by digestion of fully duplex St-1 rings. Each restriction fragment was then end-labeled using polynucleotide kinase and [γ-32P]ATP. Since the technique used for sequencing required strands labeled only at one end, aliquots of each fragment were tested for their ability to be cut by a number of different restriction enzymes, and also for their strand separation properties. If the strands separated well, this method was used to generate singly end-labeled pieces; if not, a second enzyme which would cut the primary piece was found. The singly end-labeled DNA fragments thus produced were sequenced using the chemical cleavage technique developed by Maxam and Gilbert (1977).

   As can be seen in Fig. 4, the sequencing data for St-1 showed that Hae 4 and Hpa 12 (the two pieces identified from the dideoxy experiments as being immediately downstream from the origin) overlapped, as expected. The terminal 14 base pairs of Hpa 12, which were not contained in Hae 4, were found in Hae 8. Moreover, Hae 8 and Hpa 8 overlapped. These data are completely consistent with the information from the dideoxy experiments, both in terms of the relative locations of the fragments and in the assignment of the origin to a site approximately 105 bases upstream from fragment Hae 4 and 90 bases upstream from fragment Hpa 12. The sequences obtained from the Hae and Hpa fragments also revealed cutting sites for additional restriction enzymes. Sequencing from these ends (as described in Fig. 3) was then carried out to provide completely confirmed positive and negative strand sequence data for the immediate region of the origin and several hundred base pairs on either side.

   The data relevant to the St-1 negative strand origin are summarized in Figs. 3 and 4. Fig. 3 gives the origin-region restriction map and the sequencing strategy used. Fig. 4 shows the St-1 sequence in the region surrounding the initiation site, written in double-stranded form, and containing, in addition to the origin, parts of the flanking structural genes.

Location of the St-1 Initiation Site at the Nucleotide Level

   The approximate position of the initiation site within the St-1 origin-region was determined by using the information

   FIG. 3. Restriction enzyme maps of St-1, φK, and α3 in the region surrounding the negative strand origin, and sequencing strategies used. [Map scale: +400, +300, +200, +100, +1, -100, -200, -300.] Restriction maps: these were derived from the final St-1, φK, and α3 sequences. The 5' to 3' direction of the positive strand is left to right. Sequencing strategies: each arrow indicates the sequence information obtained from a given end-labeled DNA strand. The base of the arrow represents the labeled 5' end of the strand; the solid line shows the sequence that could be read from the gels used to separate the cleavage products. Dots indicate portions of the sequence that were not resolved on the gels used in that particular analysis.
     St-1  positive strand  5' ...GTYRACCTTAGCTTTGCTGGTACCTCATACCCGATTGTTGGTATTGTTCGTTTTGAGTCTGCTTTTGACC
           negative strand  3' ...CARYTGGAATCGAAACGACCATGGAGTATGGGCTAACAACCATAACAAGCAAAACTCAGACGAAAACTGG

     [Sequence figure; only the first row (legible position marker +380) is readable. Surviving annotations from the remaining rows: the Hpa 12/Hpa 8 and Hpa 8/Hpa 6 junctions, the position marker +1, the primer 3'...GGAAGGGAGGA 5', and the start of the downstream reading frame, MetLeuGlySerIleIle.]

   FIG. 4. The DNA sequence of bacteriophage St-1 surrounding the negative strand origin. The horizontal bar covers the region identified from the dideoxy mapping technique as containing the negative strand initiation site. The nucleotide marked +1 indicates the 5' nucleotide of the primer synthesized from St-1 single-stranded DNA in vitro.² The flanking structural genes (whose codon-derived amino acid sequences are shown) were identified by their homology to φX174 and G4 genes G and H (the homologies are shown in Figs. 7 and 8).
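Fig. 4 draws the primer 3' to 5' (GGAAGGGAGGA) beneath the positive strand, so that each primer base sits under the template base it pairs with. A minimal sketch of that relationship follows. It reverses the primer as drawn to obtain its 5' to 3' reading and takes the reverse complement to obtain the stretch of positive strand that would template it. The function and constant names are illustrative. The primer is written in the DNA alphabet, which is safe here only because this primer contains no uridine.

```python
# Watson-Crick pairs, DNA alphabet only.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with seq, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Primer as drawn in Fig. 4, read left to right (3' to 5').
PRIMER_AS_DRAWN_3_TO_5 = "GGAAGGGAGGA"

# The same primer read in the conventional 5' to 3' direction.
primer_5_to_3 = PRIMER_AS_DRAWN_3_TO_5[::-1]

# Positive-strand segment (5' to 3') that would template this primer.
# Its last base pairs with the primer's 5' nucleotide, i.e. position +1.
template_5_to_3 = reverse_complement(primer_5_to_3)

if __name__ == "__main__":
    print("primer   5'", primer_5_to_3, "3'")
    print("template 5'", template_5_to_3, "3'")
```

Because the primer consists only of purines, the templating stretch of the positive strand is a run of pyrimidines.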

available from the dideoxy experiments. From these experiments, it was possible to conclude that the origin of negative strand synthesis lies about 105 nucleotides prior to fragment Hae 4 and about 90 nucleotides prior to fragment Hpa 12 (Figs. 1 and 2). This position is marked by a horizontal bar in Fig. 4. The St-1 initiation site could then be defined at the nucleotide level by comparing the sequence in the immediate vicinity of the origin with the sequence of an RNA oligonucleotide primer synthesized from St-1 DNA in vitro by the dnaG priming protein. The RNA primer has the sequence 5' pppAGGAGGGAAGG-OH 3'.² As seen in Fig. 4, the primer is complementary to the positive strand beginning at the nucleotide marked +1.

The φK and α3 Origins

   The negative strand origins of phages φK and α3 were located and sequenced using the same procedures as described for St-1. The information obtained from the dideoxy mapping technique is included in the φK and α3 restriction enzyme maps of Fig. 3. Figs. 5 and 6 contain the φK and α3 origin sequences, written in double-stranded form. Once again, the initiation sites were located at the nucleotide level by comparison with the RNA primers synthesized by the dnaG priming protein in vitro. The RNA primers of phages φK and α3 are identical to that of St-1.²,³

   The ability to define negative strand initiation sites for St-1, φK, and α3 confirmed what previously had been supposition;
   ² D. Capon, M. Gefter, and S. Wickner, unpublished results.
   ³ E. Benz, personal communication.
     φK    positive strand  5' ...CCGGTCGTGTGCTTGGTACCATCGCTACGACTCAGGTTATTCGTGAG
           negative strand  3' ...GGCCAGCACACGAACCATGGTAGCGATGCTGAGTCCAATAAGCACTC

     [Sequence figure; the intermediate rows are not legible. The first row above carries the position markers +180, +160, and +140. Surviving annotations from the intermediate rows: the position marker +1, the primer 3'...GGAAGGGAGGA 5', and the start of the downstream reading frame, MetLeuGlySerIleIleGlyGlyIleGlySerSer. The final row follows.]

     CTGCTCGGAGGAATTGCTTCCGGCGGTATCTCCAGCCTCCTTAATAAAATGTTCAGTAAAATGCCAGAACATGCCGCCTCTTCTGCTGGCC... 3'
     GACGAGCCTCCTTAACGAAGGCCGCCATAGAGGTCGGAGGAATTATTTTACAAGTCATTTTACGGTCTTGTACGGCGGAGAAGACGACCGG... 5'
   FIG. 5. The DNA sequence of bacteriophage φK surrounding the negative strand origin. The horizontal bar covers the region identified from the dideoxy mapping technique as containing the negative strand initiation site. The nucleotide marked +1 indicates the 5'-nucleotide of the primer synthesized from φK single-stranded DNA in vitro.² The adjacent structural genes were identified as genes G and H as described in the legend to Fig. 4. In the region we have studied, St-1 and φK are almost identical phages (10 changes out of 336 base pairs sequenced), despite the fact that St-1 was isolated in Stoke-on-Trent, England in 1964 while φK was found in central Japan 10 years later.
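Two statements in Fig. 5 and its legend lend themselves to a quick check: that the sequence is written in double-stranded form, and that St-1 and φK differ at 10 of 336 sequenced base pairs. The sketch below tests that the first row of the φK figure is base-paired position by position, and converts the quoted change count into a percent identity. The helper names are illustrative; the sequences are transcribed from the figure.

```python
# Watson-Crick pairs, DNA alphabet only.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

# First row of the phiK sequence figure (positive strand 5'->3',
# negative strand 3'->5', aligned position by position).
PHIK_POSITIVE = "CCGGTCGTGTGCTTGGTACCATCGCTACGACTCAGGTTATTCGTGAG"
PHIK_NEGATIVE = "GGCCAGCACACGAACCATGGTAGCGATGCTGAGTCCAATAAGCACTC"


def is_base_paired(top: str, bottom: str) -> bool:
    """True if every base in top pairs with the base directly below it."""
    return len(top) == len(bottom) and all(
        PAIRS[a] == b for a, b in zip(top, bottom)
    )


def percent_identity(changes: int, length: int) -> float:
    """Percent of positions that are unchanged."""
    return 100.0 * (length - changes) / length


if __name__ == "__main__":
    print("rows base-paired:", is_base_paired(PHIK_POSITIVE, PHIK_NEGATIVE))
    print(f"St-1 vs phiK identity: {percent_identity(10, 336):.1f}%")
```

Ten changes in 336 base pairs corresponds to roughly 97% identity over the region sequenced.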

     α3    positive strand  5' ...CCGGTCGTGTGATTGGTACCATCGCTACGACT
           negative strand  3' ...GGCCAGCACACTAACCATGGTAGCGATGCTGA

     [Sequence figure; the intermediate rows are not legible. The first row above carries the position markers +180 and +160. Surviving annotations from the intermediate rows: the position marker +1, the primer 3'...GGAAGGGAGGA 5', and the start of the downstream reading frame, MetLeuGlyAlaValVal... The final row, which follows position markers running from -160 to -220, is given below.]

     TTGACTAACGGCGGCGGCGCCATTTCTATGGATAATGACCAAGGTATTCAGTCTGCTATTCAAGGCTCGAATGTTCCTCCTGCTGGTCAG... 3'
     AACTGATTGCCGCCGCCGCGGTAAAGATACCTATTACTGGTTCCATAAGTCAGACGATAAGTTCCGAGCTTACAAGGAGGACGACCAGTC... 5'

   FIG. 6. The DNA sequence of bacteriophage α3 surrounding the negative strand origin. The horizontal bar covers the region identified from the dideoxy mapping technique as containing the negative strand initiation site. The nucleotide marked +1 indicates the 5'-nucleotide of the primer synthesized from α3 single-stranded DNA in vitro.²,³ The adjacent structural genes were identified as genes G and H as described in the legend to Figure 4. α3 is very similar to St-1 and φK within both the origin-containing intercistronic region and the downstream gene G. However, it differs considerably from St-1 and φK within the upstream gene H.
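The similarity between α3 and φK can be illustrated on the one stretch of positive strand that is legible in both sequence figures. The sketch below lists the positions at which the two rows differ. It assumes that both rows begin at the same coordinate, which is suggested by the shared +180 marker but cannot be confirmed from the figures as they survive. The function name is illustrative.

```python
# Legible first-row positive-strand sequences from the phiK and alpha3
# figures. ASSUMPTION: both rows start at the same coordinate.
PHIK_ROW = "CCGGTCGTGTGCTTGGTACCATCGCTACGACT"
ALPHA3_ROW = "CCGGTCGTGTGATTGGTACCATCGCTACGACT"


def mismatches(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (0-based offset, base in a, base in b) for each difference."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    diffs = mismatches(PHIK_ROW, ALPHA3_ROW)
    print(f"{len(diffs)} difference(s) in {len(PHIK_ROW)} positions: {diffs}")
```

Under that alignment assumption, the two phages differ at a single position in this 32-nucleotide stretch, a C in φK against an A in α3. The same difference appears in the complementary bases of the two negative strand rows.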

namely, that because these phages require only the dnaG protein to initiate their negative strands, DNA synthesis begins at a specific site.

The Four Origins

The Origins Are Intercistronic

   The G4 negative strand initiation site lies in what is evidently an untranslated region between two coat protein genes, F and G (Sims and Dressler, 1978; Fiddes et al., 1978). Beginning about 115 bases downstream from the origin, the DNA codes for a protein essentially identical to the coat protein F of the related phage φX; 23 nucleotides upstream from the initiation site are triplets which translate into a protein very similar to that of φX gene G.

   The St-1, φK, and α3 negative strand origins also appear to lie in an intercistronic region between two viral coat protein genes (see Figs. 4-6). Translational stop signals interrupt the origin region DNA in all three possible reading frames. The longest potential peptide involving the intercistronic region that could be synthesized in any of these phages is 42 amino acids long, starting at a GTG at position +198 in St-1 and φK and terminating with a TAG at position +72. The longest possible protein entirely within the intercistronic region would be 17 amino acids long, between the GTG at +86 and the TAG at +35 found in both α3 and φK. It is not known whether any of these potential peptides are made during the phage life cycle.

   Interestingly, while the intercistronic region containing the origin in G4 is found between genes F and G, in the case of St-1, φK, and α3, the origin is found between sequences that code for proteins similar in amino acid sequence to φX and G4 genes G and H. This is shown in Figs. 7 and 8, in which the amino acid sequences (deduced from the DNA sequences) of
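The peptide lengths quoted for the potential reading frames (42 and 17 amino acids) follow directly from the coordinates of the GTG and TAG triplets. The sketch below reproduces that arithmetic. It assumes that each coordinate refers to the first nucleotide of its triplet, that coordinates decrease by one per nucleotide in the 5' to 3' direction of the positive strand, and that both triplets lie on the same side of +1. The function name is illustrative.

```python
def codons_before_stop(start_pos: int, stop_pos: int) -> int:
    """Number of codons from an initiation triplet up to, but not
    including, an in-frame stop triplet.

    ASSUMPTION: positions mark the first nucleotide of each triplet and
    decrease 5' to 3' along the positive strand, as in Figs. 4 to 6.
    """
    span = start_pos - stop_pos
    if span <= 0:
        raise ValueError("stop triplet must lie downstream of the start")
    if span % 3 != 0:
        raise ValueError("start and stop triplets are not in the same frame")
    return span // 3


if __name__ == "__main__":
    # GTG at +198 to TAG at +72 (St-1 and phiK).
    print(codons_before_stop(198, 72))
    # GTG at +86 to TAG at +35 (alpha3 and phiK).
    print(codons_before_stop(86, 35))
```

Both coordinate pairs are separated by a multiple of three nucleotides (126 and 51), which is consistent with each GTG and its TAG lying in the same reading frame.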
           St-l      gene      G         . . . . LEU-SER-PHE-ALA-GLY-THR-SER-TYR-PRO-ILE-VAL-GLY-ILE-VAL-ARG-PHE-GLU-(                                                                                           )-SER-
                     gene      G         .,.    . SER     LEU   SER     ASN    VAL    PRO    ALA     ASP     MET      II          ALA     PHE     ALA            ILE         ,,     ,,         ,,        VAL         ALA
                     gene      G         . . ..ILE        ALA     I,    ASP    ALA    ASP    PRO     LYS     PHE    PHE           ALA     CYS     LEU              I,        I,     ,I         ,,        (                 )

                     ASP    GLY    VAL     VAL       /I     II   ALA      VAL     PRO    (    ) ALA     LEU         TYR          ASP      VAL       ,I             ,I       I,      I,    (         )    THR        PHE

                     SER    SER    SER     VAL       I,     I,   THR      LEU     PRO    (    ) THR     ALA         TYR          ASP      VAL       I,             I,     LE”     ASN     (         )    GLY        ARG

                     ASN    ASN      I,   LYS     ALA    ILE      I,   PHE    LYS        I,  ALA    VAL       ,,                  ILE       I,   SER           II           II      I,         II        VAL          II
                     HIS    ASP      I,      ,,   TYR TYR       THR    VAL    LYS        I,    ,,   VAL       II                  ILE       II   VAL         LEU            ,I      ,I         I,        PRO          II

                            ASP      ,I    TYR     ALA      I,    ILE     MET    LEU   TRP       I,   ASN    ALA                            II   ALA         SER          THR     ILE    SER                I,     VAL
           z           ::     I/     II    TYR       II     II    PHE     MET    VAL   TRP       ,I   ASN    (   1 PC;,                     I,   ALA         THR          LYS     CYS    ARG                I,     LEU
           a3                                                                                                                                    . .           II           I,      I,   ILE                II       I,

                     LEU    SER    VAL    ASN       ,I     I,   ASN       II     II   ALA    THR               I,        I,         I,      I,      I,            I,      STOP
          ;i         VAL    SER    LEU    ASN       ,I     ,,     ,,    LYS      II   ILE    ILE             CYS         II         II     II      II             II      STOP
          a3           II     I(     II     ,I      II     II     II    HIS      II     II     II              II        II         II     II      I,             II      STOP
    FIG. 7. The gene downstream from the origin in St-1, φK, and α3 corresponds to gene G of φX and G4.
The amino acid sequences, deduced from the DNA sequences, of the COOH-terminal portion of the genes
downstream from the St-1 and α3 origins can be aligned with the amino acid sequence of gene G of phages
φX and G4. φK, because of its virtual identity with St-1, has not been written out. The gene G amino
acid sequence is written out in full for St-1; for φX, G4, and α3, nonidentical amino acids are written
out, while a ditto indicates that the same amino acid is present as in St-1. Parentheses mark gaps which
have been inserted to maximize homology. The amino acid sequences for gene G of φX and G4 (used for
this comparison) are taken from Sanger et al. (1978) and Godson et al. (1978), respectively.

          St-l        gene         H            MET-LEU-GLY-SER-ILE-ILE-GLY-GLY-ILE-GLY-SER-SER-LEU-LEU~GLY-GLY-LEU-ALA-SER
                      gene         H             t,    PHE       II      ,,      I,   ALA       II     II     II    ALA     II           ALA       II        ALA            II     II      II           MET        GLY
          ifs         gene         H             II PHE          II    ALA            ALA       I,     I,     ,I    ALA     II           ALA       II        ALA            II     II    ALA            MET          I,
           a3         gelze        H             8,      ID      I,    ALA    VldL    VAL       81     I,     ,,    ALA     II           ALA       II        ALA          SER      ,,    ALA             II          II

                    LYS    LEU     PHE    GLY     GLY    GLY     GLN    (                                                                                               ) SER      ,,    ASP               II      THR
          ;;        LYS    LEU     PHE    GLY     GLY    GLY     GLN    (                                                                                               1 LYS      II    ALA               II      GLY
          cr3       LYS    LEU     PHE    GLY     GLY    SER     GLN    ( ) ARG        GLY    VAL      u    VAL                    II    GLN     GLN         (          1 GLY       I,   GLU               II        ,I

          St-l      GLY-LEU-THR-ASN-GLY-GLN-GLY-(                                                                             )-THR-ILE-GLY-MET-ASPTHR-ASP-ASP-ALA-GLY-
                           ILE(       )GLN        ,,            ASN    VAL    LE”    ALA     SER    ASP      ASN    ASN           VAL     VAL      II  ALA     (    ) ASN                     II           II        II
          ;;          ::   ILE(       )GLN        ,,            ASP    VAL    LEU    ALA     THR    ASP      ASN    ASN                   VAL      II    II    ( ) GLY                        II          II         I,
          CL3         ,,      ,I         ,I       4,       ,I   GLY      I,   (                                               ) .:A         I,   SER     II      II   ASN                     I,        GLN          II

                            LYS       II   ALA        II    II     II      ,I   THR       II     II   ASN    SER                 GLN     GLU     ALA         ALA          PRO      I(    ALA            ILE          ,,
                            LYS       I,    ALA      ,I     II     II      I!      II     ,I     I,   ASN    PRO                 ASP     GLU     ALA         ALA          PRO      I)    PHE            VAL          I,
          a3          II      II      II ALA         II     II     II      ,I     II      II     I,      II    ,I                  ,I      ,I    . .     .

    FIG. 8. The gene upstream from the origin in St-1, φK, and α3 corresponds to gene H of φX and G4.
The amino acid sequences, deduced from the DNA sequences, of the NH2-terminal portion of the genes
upstream from the St-1 and α3 origins can be aligned with the gene H sequences of phages φX and G4.
φK, because of its virtual identity with St-1, has not been written out. The amino acid sequence for
St-1 gene H is written out in full; for φX, G4, and α3, nonidentical amino acids are written out, while
a ditto indicates that the same amino acid is present as in St-1. Parentheses mark gaps which have been
inserted to maximize homology. The gene H amino acid sequences of φX and G4 are taken from Sanger et
al. (1978) and Godson et al. (1978), respectively.
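The amino acid sequences in Figs. 7 and 8 are described as deduced from the DNA sequences. A minimal sketch of that deduction is given here, assuming the standard genetic code and using the first 36 nucleotides of St-1 gene H as printed in Fig. 10 (with the alignment gap removed). The function and variable names are illustrative and do not come from the paper.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter names, written as in Figs. 7 and 8.
THREE_LETTER = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}


def translate(dna):
    """Translate a positive-strand DNA sequence, reading from its first base."""
    dna = "".join(dna.split()).upper()  # drop alignment gaps (whitespace)
    residues = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            residues.append("STOP")
            break
        residues.append(THREE_LETTER[amino_acid])
    return "-".join(residues)


# Start of St-1 gene H, copied from Fig. 10 (gap after ATGCTT is an alignment gap).
ST1_GENE_H_START = "ATGCTT GGAAGTATCATTGGAGGTATTGGCTCATCG"

print(translate(ST1_GENE_H_START))
# MET-LEU-GLY-SER-ILE-ILE-GLY-GLY-ILE-GLY-SER-SER
```

The result agrees with the first twelve residues written out for St-1 gene H in Fig. 8.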
                                                                  +180                                    +160                                     +140
                                                                    +                                       +                                       J-
                           St-1            5'                CCGGTCGTGTGCTTG               GTACCATCGCTAC        GACTCAGGTTATTCGTGA           GTATCAGGTCCT
                           φK              5'                CCGGTCGTGTGCTTG               GTACCATCGCTAC        GACTCAGGTTATTCGTGA           GTATCAGGTCCT
                           α3              5'                CCGGTCGTGTGATTG               GTACCATCGCTAC        GACTCAGGTTATTCATGA           ATACCAAGTCCT
                           G4              5'                CTAAATTTAACATTAACGT               TTATCGTCACATGCCT        ACGACA      CGTGACTCAATCATGACCT

                                                             0             n                                                                                             1
                     end     0;‘   gene4                           +100                             +BO                                +60
                 CGTAACGC         AAC AAAGGCCGCCCCTCTACTGGTCAGATACCTGCCCAATGTGGGGCGGACCGTGCC                                                          TACGGAGATACTCGAG
                                                             HHHIHHHW                                            HHH                          H H                     H t

               +*0                                     +20                                   +1                             -20
                                                                                                          i             F==--=+

                                           H      H              H                H        H      H                                                       HI      C      n

                                                       -60                                          -80                                -100
                ATT GGAGGTATTG        GCTC     :TC                   GCT          GCTCGGAGGACTTGCTTCCGGCGGTATCTCCAGT:T                                     3’
                ATT GGAGGCATTG        GTTC     ATC                   GCT          GCTCGGAGGAATTGCTTCCGGCGGTATCTCCAGCCT                                     3’
                GTT GGTGGCATTG        CCTC     AGC                    CT       TAGCTAGT     GGTGCCGCTTC      GAAATTATTTGGAGGCT                             3’
                ATTTCTAAGCACAATGCTCCAATTAACTCTA                                    CTC    AG    CTTGCTGCT     ACTAAAACCCCAG                   CT           3’

                                                                                                   IU                  u          uu

    FIG. 9. Comparison of the four negative strand origins. The positive strand sequences for G4, St-1,
φK, and α3 have been aligned at the negative strand initiation site. The nucleotide marked +1 indicates
the 5'-nucleotide of the primer synthesized from each phage in vitro. Short gaps have been introduced
at places in the DNA sequence to maximize homology. The numbering refers to St-1 and, because of the
gaps, differs slightly for the other three phages. The boxes under the sequences indicate nucleotides
or stretches of nucleotides which are the same in all four phages. The conserved sequences referred to
in the text extend from positions -18 to +27 and from positions +41 to +82. The G4 sequence is taken
from Sims and Dressler (1978).
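The boxes in Fig. 9 mark alignment columns that are identical in every phage. The sketch below shows one way such columns could be found for rows that are already aligned and of equal length. It uses only the first 15-nucleotide block of the St-1, φK, and α3 rows as printed in Fig. 9; the G4 row is left out because its alignment gaps are not recoverable from this copy. The function names are illustrative.

```python
def conserved_columns(rows):
    """Return the 0-based indices at which every aligned row has the same base."""
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("rows must be aligned to the same length")
    return [i for i in range(length) if len({row[i] for row in rows}) == 1]


def conserved_runs(rows):
    """Group conserved columns into (start, end) stretches, both ends inclusive."""
    runs = []
    for i in conserved_columns(rows):
        if runs and runs[-1][1] == i - 1:
            runs[-1] = (runs[-1][0], i)   # extend the current stretch
        else:
            runs.append((i, i))           # begin a new stretch
    return runs


# First block of the Fig. 9 alignment (three of the four phages).
ALIGNED = {
    "St-1":   "CCGGTCGTGTGCTTG",
    "phiK":   "CCGGTCGTGTGCTTG",
    "alpha3": "CCGGTCGTGTGATTG",
}

rows = list(ALIGNED.values())
print(len(conserved_columns(rows)), "of", len(rows[0]), "columns identical")
print(conserved_runs(rows))
```

For this block, 14 of the 15 columns are identical, and the single α3 substitution splits them into two stretches. A full comparison would need all four rows with their gaps restored.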

the St-1, φK, and α3 structural genes flanking the origin region are compared with the gene G and H
amino acid sequences of φX and G4. Thus, the approximately 135-base intercistronic region appears to
have been shifted, essentially intact, to a different location on the St-1, φK, and α3 genomes.

                Conservation of the Origin Sequence

    Fig. 9 shows a comparison of the G4, St-1, φK, and α3 positive strand templates at the negative
strand origin. The four origins have been aligned, using the initiating nucleotide of the RNA primer
as a reference point and inserting occasional gaps to maximize homology.
    The most notable aspect of this sequence comparison is the high degree of nucleotide conservation
in the area of the initiation site. The areas of identity between the four sequences, which are
indicated by boxes in Fig. 9, cluster into two regions. These conserved regions are 45 and 42 bases
long, and are separated by 13 bases of divergent sequence. The 45-base conserved region spans the
actual start point of negative strand synthesis and extends from nucleotide -18 to nucleotide +27.⁴
The second conserved element, 42 bases long, lies downstream, encompassing nucleotides +41 to +82.
    The high degree of base conservation in the immediate region of the origin does not extend into
the coding regions on either side. This is not simply because the flanking genes code for different
proteins: genes F and G in G4, and genes G and H in St-1, φK, and α3. When one aligns all four gene G
sequences or all four gene H sequences (see the boxes in Fig. 10, and compare with Fig. 9), it remains
evident that the nucleotide conservation in the intercistronic origin region is much greater than the
nucleotide conservation in the coding regions. Presumably, selective pressures on the structural genes
are exerted at the amino acid rather than at the nucleotide level, and some variation in primary
structure is compatible with the proteins retaining their functions. The lack of equivalent divergence
in the intercistronic region suggests that these nucleotides are used directly to form a structural
element recognized by the dnaG priming protein.

    ⁴ The distal part of this sequence, the AAGGA from -14 to -18, probably serves as part of the
ribosome binding site for the adjacent structural gene (Shine and Dalgarno, 1974) and may be conserved
for that reason.

                Secondary Structure

    How does the highly conserved sequence in G4, St-1, φK, and α3 provide an origin for the dnaG
priming protein? The diameter of a spherical protein with the molecular weight of the dnaG protein
(Mr = 60,000) (Wickner et al., 1973; Rowen and Kornberg, 1978) can be calculated to be about 50 Å.
This is sufficient to span approximately 15 base pairs of double-stranded DNA, and about the same
length of extended single-stranded DNA. Assuming that the dnaG protein functions as a monomer, it
follows that if the conserved 45- and 42-base stretches are involved in the recognition process, they
must adopt a more compact configuration.
    It is possible to draw potential regions of secondary structure within the origin sequence. This
has been done in Figs. 11 and 14, A and B. Estimates of the stability of these hairpins are also given,
calculated as described by Wang (1978), using

the free energy values for nonhelical regions compiled by Tinoco et al. (1973).
    As shown in Fig. 11, two hairpin loops are conserved in all four phages, and correlate generally
with the two conserved nucleotide stretches. The first region of secondary structure involves part of
the 45-base conserved element, and includes bases +6 to +26. It forms a hairpin with an 8-base-pair
stem and a 5-base turnaround. The stems are similar in all cases, but the bases in the loop differ
between G4 and the other three phages. This hairpin lies five bases downstream from the start point of
primer synthesis.
    The other region of secondary structure involves the downstream half of the second conserved
segment. A small hairpin, with a 6- to 8-base-pair stem and a 6-base turnaround, can be drawn for all
four phages (shown in Fig. 11). In this hairpin, as in the one upstream, the stems are similar but the
loops differ between G4 and the other three phages. The second hairpin is not as unique as the first,
since the same DNA can

          St-1          gene            G       5'      ..CCGGTCGTGTGCTTGGTA                CCATCGCTACGACTCAGGTTATTCGTGAGTATCAG                                     GTCCTTCAGCCGCTTAAATAA
          φK            gene            G       5'     . . CCGGTCGTGTGCTTGGTA               CCATCGCTACGACTCAGGTTATTCGTGAGTATCAG                                     GTCCTTCAGCCGCTTAAATAA
          α3            gene            G       5'     ..CCGGTCGTGTGATTGGTA                 CCATCGCTACGACTCAGGTTATTCATGAATACCAA                                    GTCCTTCAGCCGCTTAAATAA
          G4            gene            G       5’     . . CCTCTACTATCTCTGGCGTCC                TCTCTGTTAATCAGGTAAACCGTGAA                                 GCAACCGTCCTTCAACCTCTGAAATAA

                                                                                               H H HH                        HHH-t-R1                                                HHH             I

          St-1            gene              H                   ATGCTT   GGAAGTATCATTGGAGGTATTGGCTCATCGCT                              GCTCGGAGGACTTGCTTCCGG                          CGG TATC
          φK              gene              H                   ATGCTT  GGTAGTATCATTGGAGGCATTGGTTCATCGCT                               GCTCGGAGGAATTGCTTCCGG                          CGG TATC
          α3              gene              H                   ATG TTAGGTGCCGTTGTTGGTGGCATTGCCTCAGC                             CTTAGCT    AGTGGTGCCGCTTC                         G AAATTATT

          G4              gene              H                   ATGTTT  GGCTCTATCGCTGGCGGTATCGCCTCCGCACTT                              GC CGGAGGCCTTA       T                    GGGTAAATTATT

                                                                I      HHt          l-l,          HH                                                                      n

          St-1            TCCAG     TCTC                        CTTAATAAAATGTTT        AGTAAAATGCCAGAA        CACGCCGCCTCTTCTGCTGGCCTT                                        ACTAATGGTCAAGG
          φK              TCCAG   C CTC                         CTTAATAAAATGTT       CAGTAAAATGCCAGAA         CATGCCGCCTCTTCTGCTGGCC....
          α3              TGGAGGCTCTCAGC                              G T GGA   GTTTCTGTT       ATGC   AGCAAGGTGCCGAGTCT            GCCGGA                             TTGACTAACGGCGGCGG
          G4              TGGCGGCGGTCAG                                                                       TCCGCCGACTCT          ACCGGAAT                                 CCAA   GGCAACGT

    FIG. 10. Relative lack of nucleotide conservation outside the origin region. The nucleotide
sequences for genes G and H of phages St-1, φK, α3, and G4 have been aligned, and gaps introduced to
maximize homology. From this figure, one can see the lesser degree of nucleotide conservation within
the coding regions, as compared to the high degree of conservation in the origin region (compare the
boxes here with those in Fig. 9). The sequences of G4 genes G and H are taken from Godson et al. (1978).
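The size argument made in the Secondary Structure section can be checked with a short calculation. The sketch below assumes a compact, unhydrated sphere, a partial specific volume of 0.73 cm³/g, and a rise of 3.4 Å per base (B-form helix); these are common textbook values and are not stated in the paper. Only the molecular weight (60,000) and the conserved-region positions (-18 to +27 and +41 to +82) come from the text. The helper `span_length` assumes the numbering has no position 0, which is consistent with the stated 45-base length.

```python
import math

AVOGADRO = 6.022e23               # molecules per mole
PARTIAL_SPECIFIC_VOLUME = 0.73    # cm^3 per gram; assumed typical protein value
RISE_PER_BASE = 3.4               # angstroms per base; assumed B-form value


def sphere_diameter_angstrom(molecular_weight, v_bar=PARTIAL_SPECIFIC_VOLUME):
    """Diameter of a solid sphere holding one protein molecule of this weight."""
    volume_cm3 = molecular_weight * v_bar / AVOGADRO
    volume_a3 = volume_cm3 * 1e24             # 1 cm^3 = 1e24 cubic angstroms
    radius = (3.0 * volume_a3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return 2.0 * radius


def span_length(start, end):
    """Number of bases from start to end inclusive, with no position 0."""
    if start < 0 < end:
        return end - start
    return end - start + 1


diameter = sphere_diameter_angstrom(60_000)
print(f"diameter: about {diameter:.0f} angstroms")
print(f"bases spanned by one protein: about {diameter / RISE_PER_BASE:.0f}")

for start, end in [(-18, 27), (41, 82)]:
    bases = span_length(start, end)
    print(f"{start:+d} to {end:+d}: {bases} bases, "
          f"about {bases * RISE_PER_BASE:.0f} angstroms if fully extended")
```

Under these assumptions the diameter comes out near 50 Å, or roughly 15 bases, while either conserved stretch would be about three times longer if extended. This is the basis for the conclusion that the stretches must fold into a more compact form.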

                                                                                                                                     T    A    G
                                   TG                                                                                                              -r.e         bcnl
                                                                                                                                     T         C
                               G        T                                                                                                G-C
                               -5.3     kcal
                               A        A                                                                                                C.G
                           C-G                                                                                                           G.C
                           C.G                                                                                                           G.C
                           C.G                                                                                                           A.T
                           G.C                                                                                                           A.T
                           T.A                                                                                                           G.C
     5’          GATGCC.GACCGAGCCGTACGGAGATACCCGATAAACTAGGAACGTG.CCTCCTGCTAA~~~CAAAAAGGAG                                                                                                                 3’
                                                       +i,                    +i,                     +t,                  +i,                             +i                       -10
                 α3 Positive Strand

                                                                                                                                     A    A G
                                                                                                                                                   -7.7 kcal
                                        TG                                                                                           A      C
                                    A            T                                                                                       G-C
                                                       -1.5    kcal
                                    A            G                                                                                       C-G
                                  C.G                                                                                                    G.C
                                  C.G                                                                                                    G.C
                                  C.G                                                                                                    C.G
                                  G.C                                                                                                    A.T
                                  TG                                                                                                     G.C
     5’          CAGATACC.GACCGTGCCTACGGAGATACTCGAGTCTCCGATACATG.CCTACTGCAAAGCCAAAAGGACTA                                                                                                                 3’
                                       1                                                                                                                                              I
                                      +a0                +i,                 +t,                                           +i,                             +i                       -10
                 G4 Positive Strand

                                    G            T                                                                                   T    A    G -6.8
                                                 A     4.0     kcal                                                                                         kcal
                                    A                                                                                                T         C
                                        C.G                                                                                              G.C
                                        C.G                                                                                              C.G
                                        C.G                                                                                              G.C
                                        TG                                                                                      ::;
                                        C.G                                                                                     A.T
     5’      GTTATGT&GAGCCGTACGGAGATACC                                                        CGATAAACTAGGAACGTG.CCTCCTGCTAAGCCCAAAAAAGGA                                                                3’
                                                       +:.                    +b,                     +I,                  +i,                             +I                       20
             St-1 Positive Strand

    FIG. 11. The positive strands of G4, St-1, and α3, drawn to show potential regions of secondary
structure in the vicinity of the negative strand initiation site. The figure shows two hairpin loops
which are conserved in all four origin-region sequences. The pattern of secondary structure of φK is
essentially identical to that of St-1, and has therefore not been written out. The stabilities of the
hairpins were estimated as described by Wang (1978), using the free energy values for nonhelical
regions compiled by Tinoco et al. (1973). Alternative downstream hairpins, the structures of which
differ more from phage to phage than those shown here, are presented in Fig. 14, A and B.
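Potential hairpins of the kind drawn in Fig. 11 can be located by searching a strand for two inverted, complementary arms separated by a short loop. The sketch below is a simplified illustration and not the procedure used in the paper: it counts only uninterrupted Watson-Crick pairs, ignores G·T pairs, and makes no free-energy estimate (the paper's estimates rely on the Tinoco et al. values, which are not reproduced here). The test sequence is synthetic and the loop-size limits are arbitrary assumptions.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def best_hairpin(seq, min_loop=3, max_loop=8):
    """Return the hairpin with the longest uninterrupted stem, or None.

    i is the last base of the left arm and j the first base of the right arm;
    the stem is extended outward from the loop while the bases pair.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq)):
        for loop_len in range(min_loop, max_loop + 1):
            j = i + loop_len + 1
            if j >= len(seq):
                break
            k = 0
            while (i - k >= 0 and j + k < len(seq)
                   and (seq[i - k], seq[j + k]) in WATSON_CRICK):
                k += 1
            if k > 0 and (best is None or k > best["stem_bp"]):
                best = {
                    "stem_bp": k,
                    "left_arm": seq[i - k + 1:i + 1],
                    "loop": seq[i + 1:j],
                    "right_arm": seq[j:j + k],
                }
    return best


# Synthetic example (not a phage sequence): 6-bp stem around a TTTT loop.
EXAMPLE = "AAGCGGACTTTTGTCCGCAA"
print(best_hairpin(EXAMPLE))
```

Ranking candidate structures the way the figure does would require adding the stacking and loop free energies to this search; stem length alone is only a rough proxy for stability.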

be formed into an alternative       larger hairpin (see Fig. 14, A                    Thus, one encounters a paradox. The bacterium                      possesses
and B). These larger hairpins are formed from DNA partly                           a protein that by itself can begin strands by recognizing                        a
within the downstream       conserved region and partly outside                    relatively    simple, defined nucleic acid sequence (present in
it. The large hairpins shown in Fig. 14, A and B appear slightly                   G4, St-1, φK, and α3), but it appears unlikely that this protein-
more stable than the smaller ones depicted in Fig. 11, but                         nucleic acid interaction is used frequently for strand initiation
differ in size and shape more significantly from phage to phage                    in the most obvious place, that is, for the formation of frag-
than do the alternative smaller hairpins. It is difficult to say                   ments in the bacterial DNA growing point. Why then is the
with certainty which structure would exist inside a cell. The                      dnaG protein able to recognize this specific sequence? Does
potential for hairpin formation appears limited to the origin-                     the sequence exist somewhere in the bacterial chromosome,
containing intergenic region; comparable regions of secondary                      perhaps for some special initiation           event?
structure have not been found within the flanking structural                          To explore this question we synthesized a 111-base radio-
genes.                                                                             active DNA strand containing the St-1 origin sequence (see
    An observation which lends support to the hypothesis that                      Fig. 12). This fragment was then used as a probe (Fig. 13) for
the origin-region    secondary structure is functional    is the ex-               hybridization       against E. coli DNA restriction                 fragments,
istence of compensatory      base changes. In several cases, pairs                 which had been separated on an agarose gel and transferred
of well separated nucleotide changes result in the substitution                    to a nitrocellulose      filter by the method of Southern                (1975).
of an AT for a GC base pair in the stem of a hairpin. One                          The conserved sequence present in G4, St-1, φK, and α3 is
example occurs in the hairpin nearest the primer (the right-                       extensive enough that if a similar sequence were also present
hand hairpin in Figs. 11 and 14, A and B); a GC base pair in                       in E. coli, one or a few of the immobilized                   bacterial DNA
G4 has been changed to an AT base pair in the other three                          fragments should hybridize with the probe. However, we could
phages. Another      example is found at the base of the large                     not detect hybridization           of the origin probe to any E. coli
downstream hairpin (see Fig. 14A), where an AT pair in a3 is                       fragments, even with long exposure times.
replaced by a GC pair in St-l.                                                        Against this negative result, two basic controls were carried

                                                                                                                                                                                       Downloaded from by guest, on July 12, 2011
    The regions of conserved secondary structure,         assuming                 out.
they exist in uiuo, would serve to make the origin-region       more                   1. A sequence known to occur once in the E. coli chromo-
compact. However, additional        folding of the DNA, possibly                   some, part of the lactose operon, could readily be detected
using non-Watson-Crick       base pairs and involving interactions                 using a 200-base-pair lac operon restriction fragment as probe.
between the two hairpins, would still be required to bring the                     Thus, the hybridization          techniques employed were sensitive
initiation   site down to a size that could be recognized       by a               enough to detect homologous              DNA present at the level of a
protein with the dimensions of the dnaG protein.                                   single copy per genome.
                                                                                       2. A control was also performed to show that a related but
    Does the dnaG Recognition             Element     Occur in E. coli?            not completely homologous sequence could be detected. Spe-
    The four closely related viral DNA sequences we have                           cifically, the St-l probe was used to hybridize                   to G4 DNA,
determined function as recognition           elements that allow an E.             present at the level of single copy E. coli DNA. The St-l
coli replication      protein, dnaG, to initiate        a DNA strand.              origin-region     probe and the G4 DNA differ in about 20% of
Presumably, the bacterial protein does not exist solely to serve                   their bases. Under the conditions used the St-l probe hybrid-
the parasitic interests of these single-stranded          phages. There-           ized well to G4 DNA; a strong band could be seen for both
fore, the question arises as to whether sequences like that                        DNAs after 1 or 2 days of exposure time. This efficient cross-
found in G4, St-l, +K, and a3 occur in the bacterial chromo-                       hybridization     shows that an E. coli sequence related to the
some and are used for the initiation          of DNA strands.                      St-l origin probe (and containing            hairpins similar to those in
    A fist thought would be that the formation                 of Okazaki          the viral origins) could have easily been detected even if it
fragments in the bacterial DNA growing points might involve                        had about 20% mismatch.
a protein-nucleic      acid interaction    such as the one that occurs                 The data thus suggest that the highly conserved DNA
in these phages. However, this idea presents certain problems.                     element that occurs in G4, St-l, +K, and a3 is not found in the
Although      the dnaG protein is known to play a role in the                      E. coli chromosome.”
initiation   of Okazaki pieces at the growing point (Lark, 1972;                       Given the above result, one is again forced to ask why the
Louarn, 1974), the conserved dnaG recognition                 element we           bacterial cell would manufacture             a protein capable of inter-
have found is quite large, and its repetitive placement along                      acting with a specific DNA sequence that the cell itself does
the bacterial chromosome           to allow the initiation     of Okazaki           not contain. One possibility, which cannot be excluded, is that
fragments would be disruptive to the coding potential of the                        the extensive nucleotide         conservation      in the origin region is
DNA. Such a sequence repeated numerous times could also                             not all related to dnaG protein recognition,            and that the dnaG
potentiate recombination         events leading to deletions or inver-              protein recognizes only a small or interrupted                    part of the
sions.                                                                             conserved DNA. A short or interrupted                 recognition     sequence
    Moreover,     there is the additional       problem     of the dnaB             may not have been able to form a stable hybrid with the
protein, which is also known to be active at the growing point                      probe.
(Wechsler and Gross, 1971; Ueda et al., 1978). In vitro exper-                         However, it seems likely that DNA conserved near an origin
iments with negative strand synthesis in +X (Wickner and                           of replication    is important      for the function of that origin, and
Hurwitz,     1974; Schekman et al., 1974) indicate the involve-                    therefore we favor the following model. We assume that the
ment of the dnaB protein in dnaG-mediated              strand initiation,          dnaG protein recognizes a compact tertiary structure on the
a reaction easily imagined to be analogous to the initiation              of
an Okazaki fragment. If the initiation           of Okazaki fragments              interpretation.         Unlike the case with &aG”               mutants,    the synthesis      of
does involve the dnaB protein, then synthesis in the bacterial                     Okazaki       fragments        in &uB”        cells is apparently       normal   at the non-
DNA growing point would be less like the simple G4, St-l,                          permissive         temperature         (Lark and Wechsler,        1975).
+K, a3 reaction and more closely related to the enzymatically                          6 While these experiments                were in progress,        the sequence       of the
                                                                                   origin of replication            of the E. coli chromosome           was reported      (Meijer
more complex, and less site-specific, initiation            reaction seen
                                                                                   et al., 1979; Sugimoto              et al., 1979). It does not possess the conserved
with +X.5                                                                          sequences         found      in G4, St-l, +K, and 013, in agreement                 with the
  ‘Actually,   the   limited   data   available   do not    support   such   an    hybridization         results.
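The comparison described in the last footnote, and the logic of the 20% mismatch control, can be pictured as a computational analogue of the filter hybridization: slide the probe along both strands of a target and report any window whose mismatch fraction is at or below a threshold. The following is only an illustrative sketch. The function names and the toy sequences are hypothetical, and real hybrid stability depends on base composition, mismatch distribution, and secondary structure, none of which this simple count models.

```python
# Illustrative sketch only: names and sequences are hypothetical, not from the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatch_fraction(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def find_related_sites(probe, target, max_mismatch=0.20):
    """Scan both strands of target for windows resembling probe.

    Returns a list of (strand, offset, mismatch_fraction) tuples for every
    window whose mismatch fraction is <= max_mismatch. The 0.20 default
    mirrors the roughly 20% divergence quoted for St-1 versus G4.
    """
    hits = []
    n = len(probe)
    for strand, seq in (("+", target), ("-", reverse_complement(target))):
        for i in range(len(seq) - n + 1):
            frac = mismatch_fraction(probe, seq[i:i + n])
            if frac <= max_mismatch:
                hits.append((strand, i, frac))
    return hits


if __name__ == "__main__":
    toy_probe = "GATTACAGGC"              # hypothetical 10-base probe
    related = "TTTTGATTACTGGCTTTT"        # contains the probe with one change
    unrelated = "TTTTTTTTTTTTTTTTTT"      # no resemblance to the probe
    print(find_related_sites(toy_probe, related))
    print(find_related_sites(toy_probe, unrelated))
```

In this picture the G4 control corresponds to a target that yields a hit at the 20% threshold, and the E. coli result corresponds to a target that yields none.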
[Fig. 12 diagram: template-primer complex spanning the end of gene G and the start of gene H, with labels "template", "primer for probe synthesis", and "+ RNase"; 5' and 3' ends marked.]

   FIG. 12. Synthesis of a single-stranded radioactive probe specific for the origin region. The general strategy was to form a template-primer complex by assembling appropriate restriction fragments, as diagramed in the figure. To copy the origin sequence, the primer was first extended by a single ribonucleotide, and then elongated with DNA polymerase I and ³²P-labeled deoxynucleoside triphosphates. Subsequent cleavage of the product with RNase and electrophoresis through a denaturing gel allowed the isolation of a single-stranded, 111-base radioactive fragment specific for the origin region. Detailed procedures: the primed template was formed from the positive strand of an St-1 Hpa II restriction fragment (extending from +93 to -83) and the negative strand of an overlapping Alu/Hpa fragment (extending from -19 to -83). The positive strand of the Hpa fragment was separated from its complement on an 8% polyacrylamide gel. About 0.5 pmol (as piece) of this single-stranded fragment was then mixed with about 1.5 pmol of the double-stranded St-1 Alu/Hpa fragment in a final volume of 16 µl, containing 12.5 mM Tris-HCl (pH 7.4), 62.5 mM NaCl, and 1.25 mM dithiothreitol. The St-1 Alu/Hpa fragment was denatured by boiling for 3 min and then allowed to anneal (55°C, 45 min) to the positive strand Hpa fragment to form the primed template. Elongation of the primer was carried out in two stages. First, a single ribonucleotide was added to the primer. Two microliters of 250 µM rCTP, 1 µl of 13.4 mM MnCl₂, and 1 µl (about 0.9 unit) of DNA polymerase I (Klenow fragment) were added to the hybridization solution, and the mixture incubated for 15 min at room temperature. The sequence of the template DNA limits synthesis to the incorporation of a single rCMP residue. Ribonucleotide addition was terminated with 1 µl of 17 mM EDTA. Further extension of the primer to synthesize the 111-base radioactive probe was carried out by addition of 20 µl of a solution containing 20 mM Tris-HCl (pH 7.4), 20 mM MgCl₂, 2 mM dithiothreitol, 200 µM dGTP, 200 µM dTTP, 10.5 mM dCTP, and 4 µM [α-³²P]ATP (specific activity about 400 Ci/mmol). After allowing polymerization to occur for 30 min at 37°C, the reaction was stopped with 6 µl of 0.2 M EDTA. The probe was cleaved from the primer at the position of the ribonucleotide by addition of 6 µl of heat-treated RNase A (10 mg/ml) and incubation for another 60 min at 37°C. Six microliters of 1 M NaOH and 40 µl of 10 M urea containing 0.1% bromphenol blue and xylene cyanol were then added, and the DNA was denatured by heating for 4 min at 90°C before loading directly onto an 8% acrylamide, 7 M urea gel. After electrophoresis, the 111-base synthesized fragment was identified by autoradiography, excised, and eluted as described under "Experimental Procedures." The procedure used was adapted from the methods of Barnes (1978) and Sanger et al. (1977).

[Fig. 13 autoradiograph: lanes labeled A through F.]

   FIG. 13. Hybridization of single-stranded St-1 origin-region DNA to E. coli DNA. A radioactive origin-sequence probe (specific activity, ~4 × 10⁸ cpm/µg) was prepared as described in the legend to Fig. 12. This probe was hybridized to E. coli DNA fragments derived by digestion with four separate restriction enzymes: Bam H-I (Lane B), Pst I (Lane C), Eco RI (Lane D), or Xho I (Lane E). (Four enzymes were used so as not to destroy any potential origin-like sequences.) The DNA had been electrophoresed through an agarose gel and transferred to nitrocellulose paper. No hybridization was visible in the lanes containing E. coli DNA. As controls, the probe also was hybridized to Eco RI-digested St-1 and G4 DNA (Lanes F and A), present at the level of single copy E. coli sequences. The probe hybridized to both viral DNAs with comparable efficiencies. As a further control, a 200-base pair lac restriction fragment, made radioactive by nick translation (specific activity, ~1 × 10⁷ cpm/µg), was used to demonstrate the ability to detect the single copy lac operon DNA present in Eco RI-digested bacterial DNA (Lane G). Detailed procedures: Xho I, Bam H-I, and Pst I restriction enzyme digestions were performed as described under "Experimental Procedures," using 60 mM NaCl in the Bam and Pst reactions and 150 mM NaCl in the Xho reaction mixtures. Eco RI digestions were carried out in 0.1 M Tris-HCl (pH 7.4), 5 mM MgCl₂, 2 mM 2-mercaptoethanol, 50 mM NaCl, and 200 µg/ml of bovine serum albumin. Completeness of digestion of phage DNAs was assayed by electron microscopy. Completeness of digestion of E. coli DNA was determined by adding either φX174 or pMB9 supercoils, depending on the enzyme used, to the same or a parallel reaction mixture and following the disappearance of circles with the electron microscope. The E. coli restriction fragments were separated on a horizontal 0.7% agarose gel run in 0.04 M Tris/acetate, 0.02 M sodium acetate (pH 7.6) buffer containing 1 mM EDTA. Five micrograms of DNA were loaded in each well (13 × 1.5 mm). Eight nanograms of St-1 and G4 DNAs were electrophoresed in parallel lanes; this amount corresponded to a single copy E. coli sequence, given a size of 4 × 10⁶ base pairs for the bacterial chromosome. To ensure accurate concentration measurements, all DNA samples were treated with 100 µg/ml of RNase A (heated 15 min at 75°C in 0.15 M NaCl before use) for 3 h at 37°C in 50 mM Tris-HCl (pH 7.5), 10 mM EDTA (Blin and Stafford, 1976). NaCl was then added to 0.5 M and the DNA passed over a column of Bio-Gel A-15m equilibrated in 0.5 M NaAc (pH 7.5), 1 mM EDTA. DNA concentrations were then determined by absorbance at 260 nm. Southern transfers (Southern, 1975) were performed by placing the gel on a 2- to 3-inch stack of Whatman No. 3MM paper partially submerged in the transfer buffer (20 × SSC, 1 M NH₄Ac). The nitrocellulose filter (Millipore) was placed on top of the gel and this in turn was covered with more 3MM paper and then with paper towels. After overnight transfer, the filter was washed briefly in 1 M NH₄Ac and dried well before baking. Filters were presoaked 12 to 24 h at 45-47°C in 5 × SSC, 0.1 M phosphate buffer, 1 × Denhardt's (1966) solution. The immobilized DNA was then exposed to the radioactive probe for 24 to 36 h at 45-47°C, with shaking, in 5 × SSC, 100 mM phosphate buffer, 1 × Denhardt's solution, 10 mM EDTA, and 0.5% Sarkosyl. The filters were washed in 6 × SSC, 1 × Denhardt's solution, 0.1% SDS, changing the wash solution once an hour for 6 h. They were then dried and autoradiographed at -70°C using intensifying screens. Pilot experiments had shown that 45°C was the optimum temperature for cross-hybridization of the St-1 single-stranded probe to G4 DNA under the conditions used.
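The single-copy loading stated in the legend to Fig. 13 can be checked with a short calculation. The sketch below assumes that DNA mass scales linearly with length, that is, that both the genomic and the viral DNA are double-stranded; this is an assumption, since the legend does not state it. The function names are hypothetical. Only the three numbers (5 µg loaded per well, 8 ng of viral DNA, 4 × 10⁶ base pairs per chromosome) come from the legend.

```python
# Minimal sketch of the single-copy-equivalent arithmetic from the Fig. 13 legend.
# Assumption: mass is proportional to length in base pairs for both DNAs.

def single_copy_equivalent_ng(genomic_dna_ug, genome_bp, target_bp):
    """Mass in ng of a target_bp sequence present once per genome
    within genomic_dna_ug micrograms of genomic DNA."""
    return genomic_dna_ug * 1000.0 * target_bp / genome_bp


def implied_target_bp(target_ng, genomic_dna_ug, genome_bp):
    """Target length in bp for which target_ng equals one copy per genome."""
    return target_ng / (genomic_dna_ug * 1000.0) * genome_bp


GENOME_BP = 4e6    # chromosome size used in the legend
LOADED_UG = 5.0    # E. coli DNA loaded per well
VIRAL_NG = 8.0     # St-1 or G4 DNA loaded in the control lanes

if __name__ == "__main__":
    # Length of viral DNA for which 8 ng matches one copy per genome (about 6400 bp).
    print(implied_target_bp(VIRAL_NG, LOADED_UG, GENOME_BP))
    # Mass of a 200-bp single-copy stretch in 5 µg of genomic DNA, in ng.
    print(single_copy_equivalent_ng(LOADED_UG, GENOME_BP, 200))
```

Under these assumptions the 8 ng loading corresponds to a target of roughly 6400 base pairs, and the portion of each well homologous to the 200-base-pair lac probe amounts to about a quarter of a nanogram.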
[Fig. 14 diagrams: hairpin structures drawn for panels A (St-1, φK, α3), B (G4 positive strand), C, and D, each shown 5' to 3'; the drawn sequences and free-energy labels are not recoverable from the extracted text.]

   FIG. 14. Comparison of the St-1, φK and α3 (A), and G4 (B) negative strand origins with the general origin regions of bacteriophages λ (C) and φ80 (D), and of E. coli (E). It can be seen that all of these dnaG-dependent origin regions contain 1) similarly-spaced large and small hairpin loops; and 2) a 5- or 6-base pyrimidine-rich segment, ending with a T, immediately upstream from the smaller hairpin. The sequences show the template strand and the nucleotide at which synthesis starts (which is numbered). These are known for the single-stranded phages and postulated for the λ, φ80, and E. coli origins. The E. coli sequence is taken from Meijer et al. (1979) and Sugimoto et al. (1979); the λ and φ80 sequences are taken from Grosschedl and Hobom (1979) and Denniston-Thompson et al. (1977). The essential region containing the HindIII cutting site (AAGCTT) of the E. coli origin region (Messer et al., 1979) is located at the base of the small hairpin. The Eco RI site (GAATTC) in λ and φ80 (which lies within the area of greatest sequence conservation between the lambdoid phage origins) occurs at the top of the small loop. Sequences to the right of this Eco RI site are not absolutely required for origin function (Furth et al., 1977), but they stimulate replication 20- to 30-fold (Grosschedl and Hobom, 1979). Deletions in λ which eliminate either the
                           C.G                                                                                                                           G.C                                                             downstream          loop or large segments                      of
                           T.A                                                                                                                       c  T
                       A         A                                                                                                                 C.G                                                                    the region between the two loops abolish
                       A A
                                                                                                                                                                                                                         origin function         (Denniston-Thompson                     et
                     G.C                                                                                                                           C.G                                                                   al., 1977). The similarity                   between        the
                     C.G                                                                                                                           T.A
,'    CTT          C C.GTAGACAC                                                     CCAATACAAGAACAAGAACAGTAARAGAT                                C T.ATCC                       TC                TGAC       GC   L'
                                                                                                                                                                                                                         small hairpins         in X and E. coli has also
                                                                                                                                                                                                                         been noted by Meijer               et al. (1979).

E                                                                  C
                                                                              A -4.3 <JLIL
                                                                   C          A
                                                                        A G.C G
                                                                        GGT'AG                                                                   A           c     c   .!.I                *vi,
                                                              TC                                                                                 T C.G G
                                                         GA                               AT                                                       A.T
                                                         TA                               AG                                                       T4
                                                              AG                                                                                   A.T
                                                                        TGT         GGG                                                            G.C
                                        C.G                                                                                                        G.C
                                        A.T                                                                                                        T.A
                                        A.T                                                                                                        T.A
,'    GGAAAGGAICATI.ATACACAACTCAAAAAAC~GAACAACAG~~G~TC~GCTTCC~GACAG                                                                                                                                               ,'


DNA template, but that during replication of the bacterial chromosome this structure may not be formed by the DNA itself, but rather by the binding of the dnaB or other proteins to single-stranded template DNA. We picture the initiation site sequences present in G4, St-1, φK, and α3 as having evolved so that they spontaneously adopt this active conformation. Under this view, the virus is using some of its DNA to form a structural element for direct recognition by the dnaG priming protein.

Comparison with Other Origin Sequences

Do the nucleotide sequences conserved in G4, St-1, φK, and α3 appear at other origins of DNA synthesis which are believed to utilize the dnaG protein? At this time, the exact initiation site of primer synthesis is known at the nucleotide level only for the single-stranded phages. Thus, no fixed reference point exists to aid in aligning the G4, St-1, φK, and α3 sequences with other dnaG-dependent origin region sequences. Moreover, difficulties may arise from the fact that proteins in addition to the dnaG protein are involved in the initiation process at other origins, and these other proteins may alter the specificity of the priming protein.

We have previously pointed out (Sims and Dressler, 1978; see also Fiddes et al., 1978) limited similarities in primary sequence between the G4 origin and the general region of the bacteriophage λ origin (Denniston-Thompson et al., 1977). The present study allows us to test the significance of that comparison by asking whether the bases common to G4 and λ are among those conserved in the St-1, φK, and α3 sequences. In fact, many of them are not, and therefore that comparison loses much of its initial appeal.

However, further studies with bacteriophage λ have suggested that the sequence used in the original comparison did not include all of the DNA necessary for optimal λ origin function. Recently a more extensive sequence of the origin region in λ and several of its close relatives has been published by Grosschedl and Hobom (1979). Comparing these extended lambdoid phage origin-region sequences with those of G4, St-1, φK, and α3, we find certain pronounced similarities (shown in Fig. 14, C and D). Each of the lambdoid phages can form a hairpin of about the same size as the one found in G4, St-1, φK, and α3 near the primer initiation site. Moreover, immediately 3' to this hairpin on the λ and φ80 l strands are 5 or 6 pyrimidines followed by a G, as is also the case in G4, St-1, φK, and α3. On the other side of this hairpin are about 40 bases of DNA with no potential for forming secondary structure, and then a region very rich in inverted repeats. Thus, we would suggest that in λ the dnaG protein may recognize a structure similar to that formed by the intercistronic region DNA in G4, St-1, φK, and α3, and begin synthesis at position 1164 of the Grosschedl and Hobom (1979) sequence with a purine-rich primer complementary to the l strand.

We have also looked for a region with analogous properties in the sequence of the E. coli origin (the only other general origin region believed to need the dnaG protein for which DNA sequencing data are available) (Meijer et al., 1979; Sugimoto et al., 1979). A region exists which shows the same general pattern of primary sequence and secondary structure; it corresponds to a strand initiation at position 252. This section of E. coli origin-region DNA is shown in Fig. 14E.

If these are in fact the sequences at which synthesis starts in the λ (434, 21, φ80) and E. coli origins, then an origin for dnaG-dependent strand initiation may consist of (a) one large and one small hairpin separated by 30 to 40 bases; and (b) a short pyrimidine-rich sequence ending in TG, located immediately upstream from the smaller loop, and coding for a primer which begins with A⁷ and whose first segment is composed predominantly of purines.⁸ Structure would thus be more important than sequence for recognition by the dnaG protein. This interpretation is strengthened by the finding that, although the G4, St-1, φK, α3 sequence as such is not found in E. coli, the origin regions in E. coli and the lambdoid phages can be drawn to possess a similar secondary structure.

In λ and E. coli, the origin-containing DNA normally exists in a double-stranded configuration. Perhaps some of the additional proteins that are essential for the function of these origins work by acting as site-specific denaturation proteins, freeing the individual DNA strands to participate in the formation of a secondary or tertiary structure that is directly available to the single-stranded phages.

⁷ The dnaG protein appears to use an adenine nucleotide for initiation on all templates tested (Rowen and Kornberg, 1978; McMacken and Kornberg, 1978; Wickner, 1978).

⁸ The chief drawback to this comparison is the modest stability of some of the hairpins, in particular, the downstream loop in φ80 and the "primer" loop in E. coli. However, other proteins are probably involved in maintaining the origins of these double-stranded chromosomes in a single-stranded configuration and could stabilize the weak structures. Indeed, such stabilization (or the lack of it) might contribute to the regulation of the initiation process.

Acknowledgments: We wish to thank A. Taketo and S. Wickner for phage strains, P. Farabaugh, D. Hourcade, U. Siebenlist, G. Sutcliffe, and R. Tizard for restriction enzymes, P. LoMedico, H. Potter, and B. Meyer for gifts of DNA, E. Benz for communication of results prior to publication, and A. Maxam and J. Wang for discussions.

REFERENCES

Atkinson, M. R., Deutscher, M. P., Kornberg, A., Russell, A. F., and Moffatt, J. G. (1969) Biochemistry 8, 4897-4904
Barnes, W. (1978) J. Mol. Biol. 119, 83-99
Blin, N., and Stafford, D. (1976) Nucleic Acids Res. 3, 2302-2308
Bouche, J.-P., Rowen, L., and Kornberg, A. (1978) J. Biol. Chem. 253, 765-769
Bouche, J.-P., Zechel, K., and Kornberg, A. (1975) J. Biol. Chem. 250,
Bowes, J. M. (1974) J. Virol. 13, 1400-1403
Bradley, D. E. (1964) J. Gen. Microbiol. 35, 471-482
Bradley, D. E. (1970) Can. J. Microbiol. 16, 965-971
Clewell, D. B., and Helinski, D. R. (1970) Biochemistry 9, 4428-4440
Denhardt, D. T. (1966) Biochem. Biophys. Res. Commun. 23, 641-646
Denniston-Thompson, K., Moore, D. D., Kruger, K. E., Furth, M. E., and Blattner, F. R. (1977) Science 198, 1051-1056
Derstine, P. L., Dumas, L. B., and Miller, C. A. (1976) J. Virol. 19, 915-924
Eisenberg, S., and Denhardt, D. (1974) Proc. Natl. Acad. Sci. U. S. A. 71, 984-988
Fiddes, J. C., Barrell, B. G., and Godson, G. N. (1978) Proc. Natl. Acad. Sci. U. S. A. 75, 1081-1085
Furth, M., Blattner, F., McLeester, C., and Dove, W. (1977) Science 198, 1046-1051
Glynn, I., and Chappell, J. (1964) Biochem. J. 90, 147-149
Godson, G. N. (1974) Virology 58, 272-289
Godson, G. N., Barrell, B. G., Staden, R., and Fiddes, J. C. (1978) Nature 276, 236-247
Grosschedl, R., and Hobom, G. (1979) Nature 277, 621-627
Hassur, S. M., and Whitlock, H. W. (1974) Anal. Biochem. 59, 162-164
Hourcade, D., and Dressler, D. (1978) Proc. Natl. Acad. Sci. U. S. A. 75, 1652-1656
Kodaira, K.-I., and Taketo, A. (1977) Biochim. Biophys. Acta 476, 149-155
Lark, K. G. (1972) Nature New Biol. 240, 237-240
Lark, K. G., and Wechsler, J. A. (1975) J. Mol. Biol. 92, 145-163
Louarn, J.-M. (1974) Mol. & Gen. Genet. 133, 193-200
Maxam, A., and Gilbert, W. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 560-564

McMacken, R., Ueda, K., and Kornberg, A. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 4190-4194
McMacken, R., and Kornberg, A. (1978) J. Biol. Chem. 253, 3313-3319
Meijer, M., Beck, E., Hansen, F. G., Bergmans, H. E. N., Messer, W., von Meyenburg, K., and Schaller, H. (1979) Proc. Natl. Acad. Sci. U. S. A. 76, 580-584
Messer, W., Meijer, M., Bergmans, H., Hansen, F., von Meyenburg, K., Beck, E., and Schaller, H. (1979) Cold Spring Harbor Symp. Quant. Biol. 43, 139-145
Neal, M., and Florini, J. (1973) Anal. Biochem. 55, 328-330
Rowen, L., and Kornberg, A. (1978) J. Biol. Chem. 253, 758-764
Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
Sanger, F., Coulson, A. R., Friedmann, T., Air, G. M., Barrell, B. G., Brown, N. L., Fiddes, J. C., Hutchison, C. A., III, Slocombe, P. M., and Smith, M. (1978) J. Mol. Biol. 125, 225-246
Sanger, F., and Coulson, A. R. (1978) FEBS Lett. 87, 107-110
Schekman, R., Weiner, A., and Kornberg, A. (1974) Science 186, 987-993
Shine, J., and Dalgarno, L. (1974) Proc. Natl. Acad. Sci. U. S. A. 71, 1342-1346
Sims, J., and Dressler, D. (1978) Proc. Natl. Acad. Sci. U. S. A. 75,
Southern, E. M. (1975) J. Mol. Biol. 98, 503-517
Sugimoto, K., Oka, A., Sugisaki, H., Takanami, M., Nishimura, A., Yasuda, Y., and Hirota, Y. (1979) Proc. Natl. Acad. Sci. U. S. A. 76, 575-579
Taketo, A. (1976) Mol. & Gen. Genet. 148, 25-29
Taketo, A., and Kodaira, K.-I. (1978) Biochim. Biophys. Acta 517, 55-64
Tinoco, I., Borer, P., Dengler, B., Levine, M., Uhlenbeck, O., Crothers, D., and Gralla, J. (1973) Nature New Biol. 246, 40-41
Ueda, K., McMacken, R., and Kornberg, A. (1978) J. Biol. Chem. 253, 261-269
Wang, J. C. (1978) DNA Synthesis, Present and Future (Molineux, I., and Kohiyama, M., eds), pp. 347-366, Plenum Publishing Corp., New York
Wechsler, J., and Gross, J. (1971) Mol. & Gen. Genet. 113, 273-284
Wickner, S. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 2815-2819
Wickner, S. (1978) Annu. Rev. Biochem. 47, 1163-1191
Wickner, S., and Hurwitz, J. (1974) Proc. Natl. Acad. Sci. U. S. A. 71, 4120-4124
Wickner, S., Wright, M., and Hurwitz, J. (1973) Proc. Natl. Acad. Sci. U. S. A. 70, 1613-1618
Zechel, K., Bouche, J.-P., and Kornberg, A. (1975) J. Biol. Chem. 250, 4684-4689
