Proc. Natl. Acad. Sci. USA
Vol. 89, pp. 9564-9568, October 1992
Evolution


Characterization of the genes encoding phycoerythrin in the red
alga Rhodella violacea: Evidence for a splitting of the rpeB gene
by an intron
     (rhodophyta/plastid genome/cloning/sequencing/phycobilisome)
C. BERNARD*, J. C. THOMAS*, D. MAZEL†‡, A. MOUSSEAU*, A. M. CASTETS†, N. TANDEAU DE MARSAC†,
AND J. P. DUBACQ*§
*Laboratoire des Biomembranes et Surfaces Cellulaires Végétales (Centre National de la Recherche Scientifique, Unité de Recherches Associée 0311), Ecole
Normale Supérieure, 46 rue d'Ulm, 75230 Paris Cedex 05, France; and †Unité de Physiologie Microbienne (Centre National de la Recherche Scientifique,
Unité de Recherches Associée 1129), Institut Pasteur, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
Communicated by Pierre Joliot, June 2, 1992 (received for review February 17, 1992)

ABSTRACT   The phycobilisome of the eukaryotic unicellular red alga Rhodella violacea presents in some respects an organization that is intermediate between those of the homologous counterparts found in cyanobacteria (the putative chloroplast progenitor) and more advanced, pluricellular red algae. This suggests evolutionary relationships that we investigated at the genome level. The present work describes the sequences of two rhodophytan phycobilisome genes, rpeA and rpeB. These chloroplast genes encode the α and β subunits of phycoerythrin, the major component of the light-harvesting antennae and one of the most abundant cellular proteins in these algae. The amino acid sequences deduced from both rpeA and rpeB present strong homologies with those previously reported for phycoerythrin subunits of cyanobacteria, rhodophyta, and cryptomonads. The main difference with the corresponding cyanobacterial genes was the unexpected occurrence of an intervening sequence that split rpeB into two exons. This intervening sequence presents characteristics of group II introns but lacks several structural domains. Transcriptional analyses showed that the two rpe genes are cotranscribed and that the major RNA species detected corresponds to a mature mRNA lacking the intron. As the phycobiliproteins form a group of closely related polypeptides in cyanobacteria and rhodophyta, the molecular events affecting the corresponding genes, such as the rpeB intron, may be a clue to elucidate some aspects of the molecular processes involved in the evolution of plastid genes.

Phylogenetic analyses (1) showing that cyanobacteria, prochlorophytes, and eukaryotic chloroplasts derive from a common ancestor reinforce the widely accepted theory for an endosymbiotic origin of chloroplasts. Structurally and functionally, the rhodophytan chloroplasts are very closely related to cyanobacteria, having phycobilisomes as light-harvesting antennae (2, 3). These antennae are multimolecular complexes regularly arrayed on the stromal surface of the thylakoid membranes. They consist of two types of proteins: the phycobiliproteins and the linker polypeptides. The major phycobiliproteins are allophycocyanin, phycocyanin, and phycoerythrin (PE), each consisting of two different polypeptides (α and β subunits) present in equimolecular amounts. Several linker polypeptides, each specifically associated with the different phycobiliprotein complexes, contribute to maintain the physical integrity of the structure of the phycobilisome and to optimize its light-harvesting and energy-transfer capability. In red algae, the PE (either B-PE or R-PE) is present as an (αβ)₆γ assembly, with the γ subunit also carrying chromophores (4, 5).
In cyanobacteria, the genetics of the photosynthetic apparatus is now well documented; in particular, most of the genes encoding phycobilisome components have been characterized from several strains (6, 7). In contrast, in rhodophyta, no genes encoding phycobilisome proteins have been described with the exception of those mentioned in two preliminary reports (8, 9). In these eukaryotic organisms, the regulation of the synthesis of phycobilisome components is complex, some of these components being under the control of both nuclear and plastid factors (10).
In order to study the genetic control of phycobilisome components in rhodophyta and their evolutionary relationships with cyanobacteria, we first examined the genes encoding PE in Rhodella violacea. This unicellular marine species contains B-PE, a characteristic pigment of red algae, in a cyanobacterial-like phycobilisome structure (11). This suggests that R. violacea might be considered as a primitive stage in the evolution of the red algal phycobilisome. On the other hand R. violacea displays a nucleoplastid regulation of the synthesis of PE and its associated linker polypeptides, the γ subunits (C.B., A.M., and J.C.T., unpublished work). We report here the physical organization and the nucleotide sequence of the rpeA and rpeB genes¶, which encode the PE α and β subunits, respectively. These genes are located on the chloroplast genome of R. violacea. An intervening sequence was discovered in the rpeB gene. To our knowledge, there have been no reports of introns in protein-encoding genes in rhodophyta or cyanobacteria.

Abbreviations: cpDNA, plastid DNA; ORF, open reading frame; PE, phycoerythrin; nt, nucleotide(s).
‡Present address: Unité de Physiologie Cellulaire, Institut Pasteur, 25 rue du Docteur Roux, 75724 Paris Cedex 15, France.
§To whom reprint requests should be addressed.
¶The sequence reported in this paper has been deposited in the GenBank data base (accession no. L02188).
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

MATERIALS AND METHODS
Materials. Restriction enzymes were purchased from either Boehringer Mannheim or Genofit (Geneva). [α-32P]dCTP (3000 Ci/mmol; 1 Ci = 37 GBq), [α-[35S]thio]dATP (1000 Ci/mmol), Hybond N membrane, and nick-translation kits were from Amersham. The Cyclone system used to generate deletion mutants was from IBI (Genofit). The kilobase sequencing system was from BRL. Phage M13 derivatives and plasmid pTZ18R were from Pharmacia. Enzymes were used according to the manufacturer's instructions. All chemicals were reagent grade.
Culture Conditions. The unicellular marine red alga R. violacea (strain 115-79 from Göttingen University) was grown photoautotrophically in sterile seawater modified as described (12). The culture was incubated at 21°C in glass
culture tubes that were continuously flushed with sterile air and maintained under illumination with fluorescent tubes (40 μE·m⁻²·s⁻¹; 1 μE = 1 μmol of photons) and a 16 hr light/8 hr dark photoperiod.
DNA and RNA Purification. Plastid DNA (cpDNA) from R. violacea was isolated by procedures described for chromophyte algae (13). Total RNA from R. violacea was isolated as described for cyanobacteria (14).
Genomic Library Construction and Hybridization. Partial DNA libraries were constructed by ligation of HindIII plastid DNA fragments [1.5-2.5 kilobases (kb)] into the HindIII site of pTZ18R and by ligation of Pvu II plastid DNA fragments (0.8-1.5 kb) into the HincII site of pTZ18R. Standard methods were used for in situ colony hybridizations (15). Southern and Northern transfers and hybridizations were performed as described (16). The probe used was a 1-kb Xba I-EcoRI fragment containing the 3' end of the cpeB gene and the whole cpeA gene from Calothrix PCC7601 (14).
DNA Sequence Analysis. After DNA fragments were subcloned into pTZ18R, overlapping clones were obtained by using the Cyclone system from IBI adapted to single-stranded pTZ18R (17). Sequencing was carried out by using the kilobase sequencing system from BRL on single-stranded DNA with the M13 reverse primer. Sequence data were managed and analyzed using the program developed by the Unité d'Informatique Scientifique of the Institut Pasteur (F-75015, Paris).
Amino Acid Sequence Analysis. To determine the N-terminal amino acid sequence of the PE β subunit, phycobiliproteins were separated by two-dimensional lithium dodecyl sulfate/PAGE. The first dimension was an isoelectric focusing separation (18) in a 5% acrylamide gel containing a 4:1 mixture of Serva ampholytes pH 4-6 and pH 3-7. The second dimension was a separation according to molecular weight (19) in a 9-18% acrylamide/N,N'-methylene bisacrylamide gel (30:0.8, wt/wt) using lithium dodecyl sulfate instead of SDS. The PE β subunit was then transferred to Immobilon-P membrane according to the protocol provided by Millipore. The amino acid sequence was determined on an Applied Biosystems 470A protein sequencer.
Synthesis of the Intron-Specific DNA Fragment by the Polymerase Chain Reaction (PCR). The two 30-mer oligonucleotides, corresponding to the 5' and 3' extremities of the intron sequence, were synthesized with a Milligen/Biosearch oligonucleotide synthesizer. These primers were used to amplify the total sequence of the internal rpeB intron by PCR as described (20).

RESULTS
Gene Cloning. The high degree of homology shared by all the PE sequences determined previously (14, 21, 22) led us to clone the R. violacea PE subunit genes by means of DNA heterologous hybridization experiments. A probe containing the cpeA gene and the 3' end of the cpeB gene from Calothrix PCC7601 was hybridized to Southern blots of nuclear and plastid (cpDNA) fractions of R. violacea DNA digested with HindIII. Only the cpDNA fraction showed a positive signal, corresponding to a HindIII fragment ≈1.7 kb long. This fragment was further isolated from partial libraries constructed in pTZ18R (clone 1, Fig. 1). The recombinant plasmids pENSB1 and pENSB2 contained the HindIII cpDNA fragment in both orientations. The nucleotide sequence of this fragment revealed that it carried the entire rpeA gene but only the 3' end of the rpeB gene. To clone the complete rpeB gene, a second hybridization experiment was performed, using a HindIII-Pvu II fragment containing the 3' end of the R. violacea rpeB gene as a probe. A Pvu II cpDNA fragment of ≈1 kb was identified and further cloned into pTZ18R (clone 2, Fig. 1). The recombinant plasmids pENSB3 and pENSB4, which contained this Pvu II fragment in both orientations, were further studied and sequenced. A physical map of this region of the R. violacea plastid genome is presented in Fig. 1.
Nucleotide and Amino Acid Sequence Analysis. The nucleotide sequence of the cpDNA region corresponding to the HindIII and the Pvu II overlapping fragments is shown in Fig. 2. Translation of this sequence in the various reading frames was compared with the amino acid sequences of the PE α and β subunits of Porphyridium cruentum (23). The R. violacea rpeA gene was identified first. Its gene product showed 85% identity with the PE α subunit of P. cruentum (Fig. 3 and Table 1). In contrast, no open reading frame (ORF) corresponding to a complete PE β-subunit sequence was found. However, two independent regions with homology to the PE β subunit of P. cruentum were detected. The first region encoded the 13 N-terminal residues of the β subunit and the
[Figure 1: restriction map showing rpeB (split by the intron) and rpeA, the positions of clones 1 and 2, and probes A-D; scale bar, 100 bp.]

  FIG. 1. Restriction map of the Pvu II-HindIII DNA fragment that contains the rpeB and rpeA genes from R. violacea. The rpeB gene is split
by an intron. Clone 1 represents the 1.7-kb HindIII fragment and clone 2 the 1-kb Pvu II fragment. Probes A-D correspond to the DNA fragments
used for transcriptional analyses (see Results). P, Pvu II; H, HindIII; D, Dra I; bp, base pairs.
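The layout of the mapped locus can be restated as simple coordinate arithmetic. The sketch below is illustrative only: it uses the segment lengths reported in Results (39, 341, 495, 43, and 495 nt), counts positions from the first nucleotide of the rpeB start codon, and assumes that each 495-nt coding segment includes its stop codon. The segment labels and function names are hypothetical and do not come from the paper.

```python
# Segment lengths in nucleotides, in 5'->3' order, as reported in Results.
# Labels are illustrative names, not identifiers used by the authors.
SEGMENTS = [
    ("rpeB exon 1", 39),
    ("rpeB intron", 341),
    ("rpeB exon 2", 495),
    ("intergenic region", 43),
    ("rpeA", 495),
]


def layout(segments):
    """Return (name, start, end) tuples, 1-based and inclusive,
    counted from the first nucleotide of the rpeB start codon."""
    coords, start = [], 1
    for name, length in segments:
        end = start + length - 1
        coords.append((name, start, end))
        start = end + 1
    return coords


def residues(coding_nt):
    """Residues encoded by a coding length, assuming (our assumption)
    that the length includes exactly one stop codon."""
    codons, remainder = divmod(coding_nt, 3)
    if remainder:
        raise ValueError("coding length is not a multiple of 3")
    return codons - 1


for name, start, end in layout(SEGMENTS):
    print(f"{name:18s} {start:5d}-{end:<5d}")

beta_residues = residues(39 + 495)   # spliced rpeB coding sequence
alpha_residues = residues(495)       # rpeA coding sequence
print(beta_residues, alpha_residues)
```

Under these assumptions the spliced rpeB coding sequence gives 177 residues, which agrees with the 13 N-terminal residues plus the remaining 164 residues described in the text, and rpeA gives 164 residues.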

                                                                                                                   M       L       D        A       F      S        R       V        V       V      N           S       04
       CTGAAACAATAAAATCCTTCAATTAATTAGGGAGAGATCAATGCTAGATGCATTTTCAAGAGTCGTAGTAAATTCTGATGTAAGGCTTCAATTGTATTTTAACCAATAAAA5CCTTA
                                                                                                                                       50                                                                                                                                       100

       TATATAAATATAACATTTACC ATTCAAAT TAAAGATTACAAATTTT AT ACTTTCTCTTCT AGTCGATTT AAT AAAT AACAAAGGGT ATTTTATTATGAAAC AATGATTCTCTGTGCTTT A
                                   150                                                      200

        TCCAATTAAAGATTATTACTAGATTAGTTTAGATCATAGTAAAAAATACTAAATTTTTGACATTTAATGAATATATTACATTTAGAGCAGAGATTTATTTGATAAAAAATlAAAAAGCT
                         250                                                                                                                                   300                                                                                                                                          350

         V                                                   V                                        VI                                           VI                   ;T          K       A       A       Y           V       G       G       S       D       L       Q       A       L       K       K       F       I       A       D
        GAATTTAATTAACCCTAATTTTTCAGTTATCTATGTAAAAACCAAGATTTTTATACTATTACAAAAGCTGCTTACGTAGGCGGAAGTGACTTACAAGCTTTAAAGAAATTTATLGCTGAT
                                                                                                           400                                     *                                                                                                450

         G       N       K       R       L       D       S       V       N       A        I       V        S       N       A       S       C       V       V       S        D       A       V       S       G           M       I       C       E       N       P       G       L       I       T       P       G       G       N       C
        GGTAACAAACGTCTTGATTCAGTAAACGCTATTGTTTCAAATGCTAGTTGTGTTGTTTCTGACGCAGTTTCTGGTATGATTTGTGAAAACCCAGGTCTTATTACTCCAGGTGGTAACTGC
                                                     500                                                                                                                                    550                                                                                                                                     600

         Y       T       N       R       R       M       A       A       C       L        R       D        G       E       I       I       I. R            Y       V        S       Y       A       L       L           S       G       D       P       S       V       L       E       D       R       C       L       N       G       L
        TATACGAACCGTCGTATGGCTGCTTGCTTACGTGACGGAGAAATTATTATCCGTTACGTATCTTATGCTCTTTTATCTGGTGATCCATCTGTTCTTGAAGACAGATGTTTAAACGGATTA
                                                                                                                                       650                                                                                                                                      700

         K       E       T       Y       I       A       L       G       V       P        T       N        S       N       A       R       A       V       D        I       M       K       A       S       V           V       A       L       I       N       N       T       A       T       L       R       K       M       P       T
        AAAGAAACTTATATTGCTTTAGGTGTTCCTACTAACTCTAACGCAAGAGCAGTAGACATCATGAAAGCTTCTGTTGTAGCTTTAATTAATAATACTGCAACTTTACGTAAAATGCCTACT
                                                                                 750                                                                                                                                        800

         P       S       G       D       C       S       A       L       A        A       E       A        G       S       Y       F       D       R       V       N        S       A       L       S       *                                                                                                                               M
        CCTTCAGGAGACTGTTCTGCTTTAGCTGCTGAAGCAGGTAGTTACTTTGACAGAGTAAATTCTGCTCTTAGCTAAACATAGAGTCGTTTTATAATATAAAGACAATAGGAGATAACTTAT
                         850                                                                                                                                       900                                                                                                                                      950

             K       S       V       I       T       T       V       I       S        A       A       D        A       A       G       R       F       P       T        A       S       D       L       E       S           V       Q       G       N       I       Q R             A       S       A       R       L       E       A       A
        GAAATCAGTTATTACAACTGTTATAAGTGCAGCTGATGCTGCAGGCCGTTTTCCAACTGCTTCAGATTTAGAATCTGTACAAGGAAACATTCAAAGAGCTAGTGCTAGACTAGAAGCAGC
                                                                                                          1000                                                                                                                                      1050

             E       K       L       A       G       N       Y       E       A        V       V       K        E       A       G       D       A       C       F        A       K       Y       A       Y       L           K       N       A       G       E       A       G       D       S       Q       E       K       I       N       K
        TGAGAAATTAGCTGGTAATTATGAAGCTGTAGTTAAAGAAGCAGGTGATGCTTGTTTTGCTAAGTATGCTTACTTAAAAAATGCTGGAGAAGCAGGTGATAGCCAAGAAAAAATTAATAA
                                                 1100                                                                                                                                       1150                                                                                                                                    1200

             C       Y       R       D       V       D       H       Y       M        R       L       I        N       Y       C       L       V       V       G        G       T       G       P       V       D           E       W       G       I       A       G       A       R       E       V       Y       R       T       L       N
        GTGCTATCGTGATGTGGATCACTATATGAGATTAATCAACTACTGTTTAGTAGTTGGTGGAACTGGACCAGTAGACGAATGGGGTATCGCAGGTGCAAGAGAAGTATATCGTACTTTAAA
                                                                                                                                   1250                                                                                                                                         1300
             L       P       T       A       S       Y       V       A       A        F       A       F        A       R       N       R       L       C       C        P       R       D       M       S           A       Q       A       G       V       E       Y       A       A       Y       L       D       Y       V       I       N
        TTTACCAACAGCTTCATACGTAGCAGCGTTTGCTTTTGCTCGTAATAGATTATGTTGTCCTAGAGATATGTCTGCTCAAGCAGGTGTTGAATATGCAGCATACTTAGATTATGTTATCAA
                                                                             1350                                                                                                                                       1400

             A       L       S       *
        TGCATTATCTTAGTTTTTAGCAAATCATTGCTTTCCACAATTTTTCTAAATTAGCAATTTATTTATTATCTAZATAThGATAATAAATAAATTGCTAATTTAACCTTTTTTATACAC
                                                                                                                  1550 1557
                         1450

   FIG. 2. Nucleotide and deduced amino acid sequence of the rpeB and rpeA genes from R. violacea. Vertical arrows indicate the sites of intron
splicing. Horizontal arrows below the DNA sequence indicate repeated sequences. Sequences implicated in the formation of the putative domains
V and VI of the intron are overlined. Asterisk below the DNA sequence indicates the nucleotide which could be required for the lariat formation.
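The splice junction marked in Fig. 2 can be checked computationally. The following is a minimal sketch, not the authors' procedure: it joins the 39-nt first exon to the first 21 nt of the second exon, both read from the Fig. 2 sequence, translates the result with the standard genetic code (assumed here to apply to this plastid gene), and compares it with the N-terminal sequence determined for the PE β subunit, MLDAFSRVVVNSDTKAAYVG. The variable and function names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons of an upper-case DNA string."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First rpeB exon (39 nt) as read from Fig. 2.
EXON1 = "ATGCTAGATGCATTTTCAAGAGTCGTAGTAAATTCTGAT"
# First 21 nt of the second rpeB exon as read from Fig. 2.
EXON2_START = "ACAAAAGCTGCTTACGTAGGC"
# N-terminal sequence determined by protein sequencing (from the text).
N_TERMINAL = "MLDAFSRVVVNSDTKAAYVG"

SPLICED = EXON1 + EXON2_START
print(translate(SPLICED))
print(translate(SPLICED) == N_TERMINAL)
```

In this sketch the junction falls between the codon for Asp-13 (GAT) and the codon for Thr-14 (ACA), as stated in the text.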
second region, located 341 nucleotides (nt) downstream, encoded the remaining 164 amino acids of the β subunit. These two regions corresponded to two different ORFs separated by many stop codons. The absence of detection by heterologous hybridization of any other DNA fragment homologous to the cpeB gene suggested that the cloned fragment contained the single rpeB gene present in the R. violacea genome. We therefore concluded that the coding region was interrupted by an intervening sequence.
To check for the splicing sites of the intervening sequence, we purified the PE β subunit and determined its N-terminal amino acid sequence. This sequence, MLDAFSRVVVNSDTKAAYVG, exactly fitted the fusion of the deduced amino acid sequences from the two ORFs described above. Further, it allowed precise location of the splicing sites of the intron between the codons corresponding to the aspartic (D, no. 13) residue and the threonine (T, no. 14) residue. The organization of rpe genes in R. violacea was consequently found to be as follows: 5'-rpeB' (39 nt)-intron (341 nt)-'rpeB (495 nt)-intergenic region (43 nt)-rpeA (495 nt)-3'.
Transcriptional Analysis. Total RNA extracted from R. violacea was hybridized with various DNA probes. A Pvu II fragment corresponding to the whole rpeB gene, including the intervening sequence, and the 5' end of the rpeA gene (probe C; see Fig. 1) revealed three mRNA species, a major one of ≈1.4 kb and two minor ones of about 1.7 and 3 kb (Fig. 4). A HindIII fragment internal to the second rpeB exon (probe B; see Fig. 1) and a Pvu II-Dra I fragment internal to rpeA (probe A; see Fig. 1) revealed the major transcript of 1.4 kb (Fig. 4); a longer exposure also showed signals corresponding to the two minor species of 1.7 and 3 kb (data not shown). Finally, we used a fragment generated by PCR amplification (probe D; see Fig. 1), which was internal to the intervening sequence. This probe unambiguously revealed the two minor RNA species of 1.7 and 3 kb (Fig. 4). The slight hybridization signal around 1.4 kb was probably due to the detection of the major rpe mRNA by traces of the larger DNA fragment used as template for the intron PCR amplification still present in probe D.

DISCUSSION
Cyanobacteria and rhodophyta are the only organisms to contain phycobilisomes, and comparison of their phycobiliprotein-encoding genes may yield clues for understanding their phylogenetic relationships. Although the complete amino acid sequence of PE from one member of the rhodophyta, P. cruentum, has been reported (23), the only PE genes so far characterized are from cyanobacteria (cpe genes). A previous study of the organization of the genome in Porphyra yezoensis permitted the localization of the rpe genes on the plastid chromosome (25) as in the case of R. violacea (this study). The analysis of the sequenced fragment shows that the R. violacea rpeB and rpeA genes have a physical organization similar to their cyanobacterial counterparts except for the R. violacea rpeB gene, which is split into two exons. As in cyanobacteria, rpeB is located up-
α-Subunit:

R. v.      MKSVITTVISAADAAGRFPTASDLESVQGNIQRASARLEAAEKLAGNYEAVVKEAGDACFAKYAYLKNAG
P. c.      MKSVITTVVSAADAAGRFPSNSDLESIQGNIQRSAARLEAAEKLAGNHEAVVKEAGDACFAKYAYLKNPG
C. 7601    MKSVVTTVIAAADAAGRFPSTSDLESVQGSIQRAAARLEAAEKLANNIDAVATEAYNACIKKYPYLNNSG
S. 6701    MKSVITTVVAAADAAGRFPSTSDLESVQGSIQRAAARLEAAEKLAANLDAVAKEAYDAAIKKYSYLNNAG
S. WH8020  MKSVITTVVGAADSASRFPSASDMESVQGSIQRAAARLEAAEKLSANYDAIAQRAVDAVYAQYPNGATGR
           **** ***  *** * ***  ** ** ** ***  *********  *  *    *  *    *

R. v.      EA-GDSQEKINKCYRDVDHYMRLINYCLVVGGTGPVDEWGIAGAREVYRTLNLPTASYVAAFAFARNRLCC
P. c.      EA-GENQEKINKCYRDVDHYMRLVNYDLVVGGTGPLDEWGIAGAREVYRTLNLPTSAYVASIAYTRDRLCV
C. 7601    EA-NSTDTFKAKCARDIKHYLRLIQYSLVVGGTGPLDEWGIAGQREVYRALGLPTAPYVEALSFARNRGCA
S. 6701    EA-NSTDTFKAKCLRDIKHYLRLINYSLVVGGTGPLDEWGIAGQREVYRTLGLPTAPYVEALSFARNRGCS
S. WH8020  QPRQCATEGKEKCKRDFVHYLRLINYCLVTGGTGPLDELAINGQKEVYKALSIDAGTYVAGFSNMRNDGCS
                      ** **  ** **  * ** ***** **  * *  ***  *      **      *   *

R. v.      PRDMSAQAGVEYAAYLDYVINALS  164
P. c.      PRDMSAQAGVEFSAYLDYLINALS  164
C. 7601    PRDMSAQALTEYNALLDYAINSLS  164
S. 6701    PRDLSAQALTEYNSLLDYVINSLS  164
S. WH8020  PRDMSAQALTAYNTLLDYVINSLG  164



                               PZ-Subunit          :

                                                         10                   20                   30                       40                          50                      60              70
                          R.    v.     MLDAFSRVVVNSDTKAAYVGGSDLQALKKFIADGNKRLDSVNAIVSNASCVVSDAVSGMICENPGLITPG
                          P.    C.     MLDAFSRVVVNSDAKAAYVGGSDLQALKSFIADGNKRLDAVNSIVSNASCMVSDAVSGMICENPGLISPG
                          C.    7601   MLDAFSRAVVSADASTSTV--SDIAALRAFVASGNRRLDAVNAIASNASCMVSDAVAGMICENQGLIQAG
                          S.  6701 MLDAFSRAVVSADSKTAPIGGDDLNQLRSFIASGNRRLDAVNAIASNASCMVSDAVAGMICENTGLIQAG
                          S.WE8020 MLDAFSRKAVSADSSGAFIGGGELASLKSFIADGNKRLDAVNALSSNAACIVSDAVAGICCENTGLTAPN
                          C.0      MLDAFSRVVTNADAKAAYVGGADLQALKKFISEGNKRLDAVNSVVSNASCIVSDAVSGMICENPSLISPG
                                       *******                  *                    *        *        *         **    ***       **            ***       *       *****      *        ***   *


                                                         80                   90                  100                   110                            120                  130                140
                          R.    v.     GNCYTNRRMAACLRDGEIIIRYVSYALLSGDPSVLEDRCLNGLKETYIALGVPTNSNARAVDIMKASVVA
                          P.    C.     GNCYTNRRHAACLRDGEIILRYVSYALLAGDASVLEDRCLNGLKETYIALGVPTNSSIRAVSIMKAQAVA
                          C.    7601   GNCYPNRRMAACLRDAEIVLRYVTYALLAGDASVLDDRCLNGLKETYAALGVPTTSTVRAVQIMJAQAAA
                          8.    6701   GNCYPNRRHAACLRDAEIILRYVSYALLAGDASVLDDRCLNGLKETYTALGVPLQSTARAVAIMKAQAAA
                          S.WE8020 GGVYTNRKHAACLRDGEIVLRYVSYALLAGDASVLQDRCLNGLRETYAALGVPTGSAARAVAIMKAASAA
                          C.0          GNCYTNRRHAACLRDGEIILRYVSYALLSGDSSVLEDRCLNGLKETYSSLGVPANSNARAVSIMKACAVA
                                       *       *   **    *******         **        ***   ****          **    ***       *******                ***        ****           *       ***    ****      *


                                              150       160      170     180
                          R.    v.     LINNTATL --RKMPTPSGD--CSALAAEAGSYFDRVNSALS                                                                            177
                          P.    C.     FITNTATE-------RKMSFAAGD--CTSLASEVASYFDRVGAAIS                                                                        177
                          C.    7601   HIQDTPSEARAGAKLRKMGTPVWEDRCASLVAFASSYFDRVISALS                                                                        184
                          S.    6701   HIQDNPSEALAGAKLRKMGTPVWEDRCASLVAESSSYFDRVIAALS                                                                        186
                          S.WE8020 LITNTNSQP------KKAAVTQGD--CSSLAGEAGSYFDAVISAIS                                                                            178
                          C.,      FINNTASQ-------RKLSTPQGD--CSGLASECASYFDKVTAAIS                                                                            177
                                           *                             *                    *        *              ****       *        *    *



   FIG. 3. Amino acid alignments of the PE α and β subunits of R. violacea (R.v.), P. cruentum (P.c.) (23), Synechocystis PCC6701 (S. 6701)
(21), Calothrix PCC7601 (C. 7601) (14), Synechococcus sp. WH8020 (S.WH8020) (22), and Cryptomonas Φ (C.Φ) (24). Stars indicate identical
amino acid residues in the different sequences.
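
The identity values in Table 1 follow from pairwise comparison of aligned sequences such as those in Fig. 3. As a minimal sketch in Python, the R.v. and P.c. α-subunit rows of Fig. 3 can be compared as follows. The function name `percent_identity` and the choice of denominator (every alignment column in which at least one of the two sequences has a residue) are assumptions made for illustration, since the counting convention is not stated in the text.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of compared alignment columns that hold the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # column is empty in both sequences, so it is skipped
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no residues to compare")
    return 100.0 * identical / compared


# Alpha-subunit rows of Fig. 3, concatenated over the three alignment blocks.
RV_ALPHA = (
    "MKSVITTVISAADAAGRFPTASDLESVQGNIQRASARLEAAEKLAGNYEAVVKEAGDACFAKYAYLKNAG"
    "EA-GDSQEKINKCYRDVDHYMRLINYCLVVGGTGPVDEWGIAGAREVYRTLNLPTASYVAAFAFARNRLCC"
    "PRDMSAQAGVEYAAYLDYVINALS"
)
PC_ALPHA = (
    "MKSVITTVVSAADAAGRFPSNSDLESIQGNIQRSAARLEAAEKLAGNHEAVVKEAGDACFAKYAYLKNPG"
    "EA-GENQEKINKCYRDVDHYMRLVNYDLVVGGTGPLDEWGIAGAREVYRTLNLPTSAYVASIAYTRDRLCV"
    "PRDMSAQAGVEFSAYLDYLINALS"
)

identity = percent_identity(RV_ALPHA, PC_ALPHA)
print(f"R.v. vs P.c. alpha subunit: {identity:.1f}% identity")
```

With the two rows as transcribed here, 140 of the 164 compared positions are identical, which rounds to the 85% reported for this pair in Table 1.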
stream from rpeA. In prokaryotes the Shine-Dalgarno sequences act as ribosome binding sites. In
R. violacea, such sequences are found at 5 bp and 6 bp upstream from the initiation codon of rpeB
and rpeA, respectively (see Fig. 2). Furthermore, an inverted repeat of 28 nt, which is able to
form a stable stem structure, is located 33 bp downstream from rpeA. This structure could
correspond to a transcription terminator and/or could stabilize the PE mRNAs by blocking
exonuclease degradation (26).
   The transcriptional analysis shows that rpeB and rpeA are always cotranscribed and organized
in an operon. As mentioned above, hybridization experiments suggest that no other rpe gene copies
exist in the plastidial or nuclear genomes. Consequently the major, 1.4-kb mRNA species detected
most likely corresponds to the mature mRNA species devoid of the intervening sequence, with the
two rpeB exons being fused to allow an efficient translation of this gene. The Northern
hybridization results indicate that the two larger mRNAs detected, which have apparent sizes of
1.7 and 3 kb, contain the intervening sequence and could correspond to different unmatured mRNA
species. A probe corresponding to a fragment located 400 bp downstream from rpeA did not detect
either the 3-kb or the 1.7-kb mRNA (data not shown). Therefore, if the apparent size of the
largest mRNA (3 kb) was not due to a specific structural modification, the 3-kb and 1.7-kb mRNAs
could potentially represent another gene located upstream from rpeB. Further study of this DNA
region will allow clarification of this point.

Table 1. Amino acid sequence identity (in percent) between the PE subunits aligned in Fig. 3

             R.v.    P.c.    C. 7601    S. 6701    S. WH8020
R.v.          -       85       70         72          57
P.c.          83      -        66         68          54
C. 7601       69      71       -          90          63
S. 6701       69      74       87         -           66
S. WH8020     66      68       61         67          -
C.Φ           81      84       61         65          66

   Identities between α subunits are shown in the upper right half, and those between β subunits
in the lower left. Abbreviations are as in Fig. 3.

   The comparison of the R. violacea PE subunit sequences with the PE sequences previously
reported shows a high degree of conservation among the sequences regardless of their origin
(Fig. 3 and Table 1). The highest homology is found with the PE sequence from P. cruentum, another
rhodophytan (85% and 83% identity for the α and β subunits, respectively). The β subunit is more
closely related to the Cryptomonas Φ PE β subunit (81% identity) than to homologous polypeptides
of cyanobacteria (Table 1). With the cyanobacterial sequences, the percent identity decreases to
70% and 69% with the α and β subunits from Calothrix
PCC7601, respectively, and to 72% and 69% with those from Synechocystis PCC6701.

   FIG. 4. Autoradiograms of Northern blots of total RNA from R. violacea after hybridization
with probes A (α subunit), B (β subunit), C (α and β subunits and intron), and D (intron).
Fragment sizes are given in kilobases. (Bands are marked at 3.0, 1.7, and 1.4 kb.)

   The rpeB intron shares several structural features specific for the group II introns but none
of those specific for the group I introns (27, 28). For example, the first 5 nt in the 5'
extremity of the rpeB intron sequence, GUAAG, resemble the consensus GUGYG. In addition, the
highly conserved nucleotide required for lariat formation, an A located 7 or 8 nt upstream from
the 3' splicing site in group II intron sequences, is also found in the rpeB intron (Fig. 2).
Moreover, several parts of the rpeB intron sequence could be folded into secondary structures,
and the characteristics of two of them suggest that they could correspond to domains V and VI of
a typical group II intron (Fig. 2) (27). However, this intron is significantly shorter than the
described group II introns for which self-splicing has been demonstrated (29-31). Further, the
rpeB intron seems to be devoid of sequences corresponding to the structural domains I-IV of
typical group II introns. These features recall the characteristics of some of the Euglena or
Chlamydomonas reinhardtii group II-like introns (27, 32). It has been proposed that the splicing
of these introns would require trans-acting cofactors, which could be proteins or RNA species
providing the intron with the missing domains. A related mechanism has also been demonstrated for
the trans splicing of the psaA gene in C. reinhardtii (33). The characteristics of the rpeB
intron lead us to propose such a mechanism, implying trans-acting cofactor(s), for its efficient
splicing.
   From this study it appears that the PE genes of the eukaryotic alga R. violacea have kept most
characteristics of their prokaryotic ancestor but for the presence of a group II intron, typical
of eukaryotic organisms. To our knowledge, the R. violacea intron is the first described from
chloroplast genomes of red algae. In cyanobacteria, from which the putative endosymbiotic
progenitor of the rhodophytan chloroplast would have originated, only class I introns have been
described so far. They are restricted to tRNA-Leu (34, 35) and none have been localized in
protein-encoding genes. Preliminary results on the characterization of the PE-encoding genes from
two other red algae, Polysiphonia boldii and Aglaothamnion neglectum (8, 9), show that the
physical organization of the rpe genes appears similar to that of R. violacea, although the rpeB
gene is devoid of introns. Thus, it might be of interest to correlate the occurrence of this
group II intron with the endosymbiotic evolution of chloroplasts.

   We thank A. Joder, G. Zabulon, and C. Passaquet for technical assistance; F. Michel,
E. Queinnec, J. Houmard, and V. Capuano for helpful discussions; W. Saurin for computer analysis;
and I. Old for critical reading of the manuscript. This work was supported by the Ecole Normale
Supérieure, the Centre National de la Recherche Scientifique (U.R.A. 1129 and 0311/G.D.R. 1002),
and the Institut Pasteur.

 1. Turner, S., Burger-Wiersma, T., Giovannoni, S. J., Mur, L. R. & Pace, N. R. (1989) Nature (London) 337, 380-382.
 2. Gantt, E. (1981) Annu. Rev. Plant Physiol. 32, 327-347.
 3. Glazer, A. N. (1989) J. Biol. Chem. 264, 1-4.
 4. Mörschel, E., Wehrmeyer, W. & Koller, K. P. (1980) Eur. J. Cell Biol. 21, 319-327.
 5. Klotz, A. V. & Glazer, A. N. (1985) J. Biol. Chem. 260, 4856-4863.
 6. Tandeau de Marsac, N. (1991) in Cell Culture and Somatic Cell Genetics of Plants, eds. Bogorad, L. & Vasil, I. K. (Academic, New York), Vol. 7, pp. 417-446.
 7. Bryant, D. A. (1991) in Cell Culture and Somatic Cell Genetics of Plants, eds. Bogorad, L. & Vasil, I. K. (Academic, New York), Vol. 7, pp. 257-300.
 8. Roell, M. K. & Morse, D. E. (1991) J. Phycol. 27, Suppl., 15 (abstr.).
 9. Apt, K. E. & Grossman, A. R. (1991) J. Phycol. 27, Suppl., 366 (abstr.).
10. Egelhoff, T. & Grossman, A. R. (1983) Proc. Natl. Acad. Sci. USA 80, 3339-3343.
11. Mörschel, E., Koller, K. P., Wehrmeyer, W. & Schneider, H. (1977) Cytobiology 16, 118-129.
12. Starr, R. C. (1978) J. Phycol. 14, Suppl., 47-100.
13. Douglas, S. E. (1988) Curr. Genet. 14, 591-598.
14. Mazel, D., Guglielmi, G., Houmard, J., Sidler, W., Bryant, D. A. & Tandeau de Marsac, N. (1986) Nucleic Acids Res. 14, 8279-8290.
15. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab., Cold Spring Harbor, NY), 2nd Ed.
16. Damerval, T., Castets, A. M., Guglielmi, G., Houmard, J. & Tandeau de Marsac, N. (1989) J. Bacteriol. 171, 1445-1452.
17. Mazel, D., Houmard, J. & Tandeau de Marsac, N. (1988) Mol. Gen. Genet. 211, 296-304.
18. O'Farrell, P. H. (1975) J. Biol. Chem. 250, 4007-4021.
19. Laemmli, U. K. (1970) Nature (London) 227, 680-685.
20. Kaminski, P. A. & Elmerich, C. (1991) Mol. Microbiol. 5, 665-673.
21. Anderson, L. K. & Grossman, A. R. (1990) J. Bacteriol. 172, 1297-1305.
22. Wilbanks, S. M., De Lorimier, R. & Glazer, A. N. (1991) J. Biol. Chem. 266, 9535-9539.
23. Sidler, W., Kumpf, B., Sutor, F., Klotz, A. V., Glazer, A. N. & Zuber, H. (1989) Biol. Chem. Hoppe-Seyler 370, 115-124.
24. Reith, M. & Douglas, S. (1990) Plant Mol. Biol. 15, 585-592.
25. Shivji, M. S. (1991) Curr. Genet. 19, 49-54.
26. Brawerman, G. (1987) Cell 48, 5-6.
27. Michel, F., Umesono, K. & Ozeki, H. (1989) Gene 82, 5-30.
28. Dujon, B. (1990) Ann. Inst. Pasteur 1, 181-194.
29. Peebles, C. L., Perlman, P. S., Mecklenburg, K. L., Petrillo, M. L., Tabor, J. A., Jarrell, K. A. & Cheng, H. L. (1986) Cell 44, 213-223.
30. Van der Veen, R., Arnberg, A. C., Van der Horst, G., Bonen, L., Tabak, H. F. & Grivell, L. A. (1986) Cell 44, 225-234.
31. Schmelzer, C. & Schweyen, R. J. (1986) Cell 46, 557-565.
32. Choquet, Y., Goldschmidt-Clermont, M., Girard-Bascou, J., Kück, U., Bennoun, P. & Rochaix, J. D. (1988) Cell 52, 903-913.
33. Goldschmidt-Clermont, M., Choquet, Y., Girard-Bascou, J., Michel, F., Schirmer-Rahire, M. & Rochaix, J. D. (1991) Cell 65, 135-143.
34. Xu, M. Q., Kathe, S. D., Goodrich-Blair, H., Nierzwicki-Bauer, S. A. & Shub, D. A. (1990) Science 250, 1566-1569.
35. Kuhsel, M. G., Strickland, R. & Palmer, J. D. (1990) Science 250, 1570-1573.
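
The comparison made in the Discussion between the first 5 nt of the rpeB intron (GUAAG) and the group II consensus (GUGYG) can be illustrated with a short Python sketch. It is an illustration only: the function name `consensus_matches` and the reduced ambiguity table are assumptions introduced here, and the sketch checks nothing beyond a position-by-position match against the standard nucleotide ambiguity codes (Y standing for a pyrimidine, C or U).

```python
# Reduced table of nucleotide ambiguity codes for RNA (illustrative subset).
AMBIGUITY_RNA = {
    "A": "A",
    "C": "C",
    "G": "G",
    "U": "U",
    "R": "AG",    # purine
    "Y": "CU",    # pyrimidine
    "N": "ACGU",  # any base
}


def consensus_matches(sequence: str, consensus: str) -> list[bool]:
    """For each position, report whether the base is allowed by the consensus."""
    if len(sequence) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    return [base in AMBIGUITY_RNA[code] for base, code in zip(sequence, consensus)]


matches = consensus_matches("GUAAG", "GUGYG")
print(matches, f"{sum(matches)} of {len(matches)} positions agree")
```

Under these assumptions, positions 1, 2, and 5 agree with the consensus, while the two central positions (A instead of G, and A instead of a pyrimidine) do not, which is why the text describes the 5' end as resembling the consensus.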