					Proc. Natl. Acad. Sci. USA
Vol. 90, pp. 6829-6833, July 1993
Plant Biology

A tobacco gene family for flower cell wall proteins with a
proline-rich domain and a cysteine-rich domain
HEN-MING WU, JITAO ZOU, BRUCE MAY, QING GU, AND ALICE Y. CHEUNG*
Department of Biology, Yale University, P.O. Box 6666, New Haven, CT 06511
Communicated by J. E. Varner, March 29, 1993

ABSTRACT   Flowering is known to be associated with the induction of many cell wall
proteins. We report here five members of a tobacco gene family (CELP, Cys-rich
extensin-like protein) whose mRNAs are found predominantly in flowers and encode
extensin-like Pro-rich proteins. CELP mRNAs accumulate most abundantly in vascular and
epidermal tissues of floral organs. In the pistil, CELP mRNAs also accumulate in a thin
layer of cells between the transmitting tissue and the cortex of the style and in a
surface layer of cells of the placenta in the ovary. This unique accumulation pattern of
CELP mRNAs in the pistil suggests a possible role in pollination and fertilization
processes. CELP genes encode a class of plant extracellular matrix proteins that have
several distinct structural features: a Pro-rich extensin-like domain with Xaa-Pro3-7
motifs and Xaa-Pro doublets, a Cys-rich region, and a highly charged C terminus. The
extensin-like domains in these proteins differ significantly in their length and these
differences appear to be results of both long and short deletions within the coding
regions of their genes. Furthermore, the number of charged amino acid residues in the
C-terminal region varies among the CELPs. These structural differences may contribute to
functional versatility in the CELPs. On the other hand, the Cys-rich domain is highly
conserved among CELPs and the positions of the Cys residues are conserved, suggesting
that this region may have a common functional role. The presence of a Pro-rich domain and
a Cys-rich domain in these CELPs is reminiscent of a class of hydroxyproline-rich
glycoproteins, solanaceous lectins, that are believed to be important in cell-cell
recognition. The structure of these CELPs indicates that they may be multifunctional and
that their genes may have arisen from recombinational events.

Abbreviations: CELP, Cys-rich extensin-like protein; Hyp, hydroxyproline; HRGP, Hyp-rich
glycoprotein.
*To whom reprint requests should be addressed.
†The sequences reported in this paper have been deposited in the GenBank data base
(accession nos. CELP-1 to CELP-5, L13439-L13443, respectively).
The publication costs of this article were defrayed in part by page charge payment. This
article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C.
§1734 solely to indicate this fact.

The primary walls of plant cells are thin and flexible but strong and can accommodate
cell expansion and changes in cell shape as cells grow and differentiate. Besides
providing structural integrity, cell walls are also the sites where cell-cell
interactions as well as interactions between plants and their environment take place
(1-5). Cell-wall proteins are diverse and include many proteins that are believed to play
structural roles and others that are involved in defense or other cellular and
biochemical processes. The best characterized cell wall protein genes are those for
several classes of hydroxyproline (Hyp)-rich glycoproteins (HRGPs) (2, 6). One class
includes the extensins, which typically have numerous Ser-Pro(Hyp)4 motifs throughout the
entire protein. Another class encodes what are collectively known as Pro (Hyp)-rich
proteins, which have multiple copies of the pentapeptide Val-Tyr-Lys-Pro-Pro or its
variants. There are at least two additional classes of glycosylated Pro (Hyp)-rich
proteins, arabinogalactan proteins and solanaceous lectins. Gly-rich cell wall protein
genes have also been reported (7).
   Differential expression of cell wall protein genes may contribute to meeting the
functional and physical demands on different cell types and at different developmental
stages (8-10). A flower is morphologically complex and composed of multiple tissue types
(11). Flowering is known to stimulate or induce the expression of several cell-wall
protein genes (12-18). These extracellular matrix proteins may be involved in the
organization of the floral apex from a vegetative apex or may fulfill special structural
and functional requirements for flower cell walls. Some flowering-associated
extracellular matrix proteins are also stress-inducible or pathogen-related and may
contribute to the overall defense system in flowers (12, 19, 20).
   We report here the characterization of a family of tobacco genes and cDNAs (CELPs,
Cys-rich extensin-like proteins) for a set of proteins that have an extensin-like
domain.† CELP mRNAs accumulate almost exclusively in the four floral organs: sepal,
petal, stamen, and pistil. CELP mRNAs accumulate in multiple cell types in each floral
organ. They are abundant in the vascular bundles and the epidermis, except in the pistil
where they are most abundant in the cells demarcating the transmitting tissue and the
cortex and in the cell layer lining the placenta to which ovules are attached. The CELPs
have three distinct structural domains: a Pro-rich extensin-like domain, a Cys-rich
domain, and a highly charged C terminus. These features make the CELPs structurally
distinct from tobacco extensin (21). The possible functional roles for this class of
extracellular matrix proteins will be discussed in light of the available structural and
expression information.

                           MATERIALS AND METHODS

   cDNA and Genomic Library Construction and Screening. CELP cDNA clones were isolated
from a tobacco floral cDNA library as described (20). A λ Dash II genomic N. tabacum DNA
library was made as described (22). CELP genomic clones were isolated from an unamplified
library using a 32P-labeled CELP-1c probe prepared by random priming (22). Hybridization
and washes were carried out at 68°C in buffers as described (15).
   Nucleotide Sequence Analysis. Nucleotide sequence analysis was carried out by the
dideoxynucleotide sequencing method using double-stranded DNA and Sequenase Version 2
(United States Biochemical) according to the manufacturer's recommendations. All the
nucleotide sequences were determined on both strands of DNA.
   RNA Expression Analysis. RNA preparation, gel electrophoresis, blot analysis, and in
situ hybridizations have been described (20).
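
The structural features summarized above (percent Pro, Xaa-Pro3-7 motifs, Xaa-Pro doublets, and a charged C terminus) are the kind of quantities that can be tallied directly from a deduced protein sequence. The Python sketch below is illustrative only and is not the procedure used in this study. The toy sequence is hypothetical and is not a CELP sequence, the function names are invented for the example, "charged" is assumed to mean Asp, Glu, Lys, and Arg, and a doublet is assumed to be a non-Pro residue followed by exactly one Pro.

```python
import re
from collections import Counter

# Hypothetical toy sequence (one-letter codes) for illustration only;
# it is NOT one of the CELP sequences reported in this paper.
TOY_SEQ = "MKLAWPPPPCPSPPPRPWPCDEKKE"

# Assumption: "charged" means Asp (D), Glu (E), Lys (K), and Arg (R).
CHARGED = frozenset("DEKR")


def composition(seq):
    """Return each residue's share of the sequence as % of total residues."""
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def xaa_pro_runs(seq):
    """Yield (Xaa, run_length) for each non-Pro residue followed by a maximal Pro run."""
    # P+ is greedy, so each match captures the whole run of consecutive prolines.
    for match in re.finditer(r"([^P])(P+)", seq):
        yield match.group(1), len(match.group(2))


def pro_motifs(seq, min_pro=3, max_pro=7):
    """List Xaa-Pro(n) motifs with min_pro <= n <= max_pro."""
    return [x + "P" * n for x, n in xaa_pro_runs(seq) if min_pro <= n <= max_pro]


def pro_doublets(seq):
    """List Xaa-Pro doublets (assumed here: Xaa followed by exactly one Pro)."""
    return [x + "P" for x, n in xaa_pro_runs(seq) if n == 1]


def charged_count(seq):
    """Count residues that belong to the assumed charged set."""
    return sum(1 for aa in seq if aa in CHARGED)


if __name__ == "__main__":
    print(f"Pro: {composition(TOY_SEQ)['P']:.1f}% of total residues")
    print("Xaa-Pro3-7 motifs:", pro_motifs(TOY_SEQ))
    print("Xaa-Pro doublets:", pro_doublets(TOY_SEQ))
    print("Charged residues in last 5 aa:", charged_count(TOY_SEQ[-5:]))
```

Applying `composition` to a slice of the sequence (for example, only the Pro-rich region) would give a domain-level percentage rather than a whole-protein one; the slice boundaries would have to come from the alignment and are not computed here.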
6830    Plant Biology: Wu et al.    Proc. Natl. Acad. Sci. USA 90 (1993)

                           RESULTS

   CELP-1 and CELP-1c, a Gene and Its cDNA for an Extensin-Like Pro-Rich Protein.
CELP-1c, a cDNA derived from a class of highly expressed flower mRNAs, was isolated from
a tobacco floral cDNA library. CELP-1c has an open reading frame corresponding to a
209-aa Pro-rich protein (Fig. 1). The first of four closely spaced Met residues (NT
37-39) (Fig. 1) is assumed to function as the initiation codon of the deduced CELP-1
protein since the 5' end of the mRNA is located at the predominant transcription
initiation site of the CELP-1 gene at the thymidine denoted NT 1 (data not shown).
   The most striking feature of the deduced CELP-1 protein is in the 74-aa central region
that contains a series of Pro-rich sequences. The Pro residues are distributed in seven
Xaa-Pro3-5 motifs (Xaa being Trp, Cys, or Ser) and 18 Xaa-Pro doublets (Xaa being Trp,
Phe, Cys, Arg, or Gln). Overall, Pro residues are 26.3% of the entire CELP-1 protein but
they make up 66.2% of the Pro-rich domain (see Table 1). The Xaa-Pro3-5 motifs in the
predicted CELP are similar to the Ser-Pro4 repeats found in many plant-cell-wall HRGPs
(4, 6). CELP-1, especially its C-terminal half, is relatively Cys-rich (Fig. 1 and Table
1). Moreover, 9 of the last 11 aa are charged. Characterization of the CELP-1 gene
revealed a 400-bp intron that interrupts Gln196 and Val197 (data not shown), thus
separating the region encoding the highly charged C terminus of the CELP-1 from the rest
of the gene.

Table 1. Amino acid composition of CELP-1, -2, -3, -4, and -5, tobacco extensin, and
potato lectin

                                        % of total residues
                                    CELP
Residue          1            2            3            4            5            TE*      PL†
Ala              4.78         5.1          6.66         5.69         7.45         1.88     3.70
Arg              6.22         5.61         3.63         3.79         2.48         0        0.82
Asn              2.87         2.55         3.63         2.53         2.48         0.31     5.34
Asp              3.82         4.08         4.24         3.16         3.72         0
Cys‡             6.70         6.63         7.87         8.22         8.07         0        11.5
  (underlined)   9.87         [illegible]  [illegible]  [illegible]  [illegible]
Gln              2.87         3.57         2.42         3.79         3.72         0.31     6.99
Glu              5.26         5.61         6.06         5.69         5.59         0.31
Gly              0.95         1.02         1.81         1.26         2.48         0.31     11.5
His              0.478        0.51         0            0.63         0            4.71     0
Ile              3.34         3.06         5.45         3.79         4.96         0        1.23
Leu              5.74         5.61         7.87         8.86         9.31         2.20     1.23
Lys              5.26         5.10         4.84         6.32         4.96         12.26    3.70
Met              2.39         3.57         1.81         1.26         2.48         0.628    0.41
Phe              2.39         2.04         4.24         3.16         2.48         0.31     0
Pro§             26.3         26           26           18.9         19.25        39.93    28.3
  (underlined)   [illegible]  67.1         [illegible]  65.7         [illegible]
Ser              8.13         7.14         7.27         6.96         6.83         10.06    11.1
Thr              2.87         3.06         3.63         4.43         3.10         5.03     6.58
Trp              2.39         2.55         0.60         1.89         2.48         0        3.29
Tyr              2.39         2.04         4.24         3.16         3.72         14.46    3.29
Val              4.78         5.10         3.6          6.32         4.34         7.23     0.41
   Total         209¶         196          166          159          161          318      243‖

   TE, tobacco extensin; PL, potato lectin.
*Ref. 21.
†Ref. 23.
‡Underlined numbers represent percent of Cys residues in the C-terminal domain.
§Underlined numbers represent percent of Pro residues in the Pro-rich domain.
¶Total number of amino acid residues in unprocessed proteins.
‖Estimated total number.

   FIG. 1. Nucleotide sequence of the CELP-1 cDNA and the deduced amino acid sequence of
the CELP-1. Arrow, predominant transcription initiation site of the CELP-1 mRNA;
triangle, putative signal peptide processing site; diamond, location of the intron in
CELP-1 precursor mRNA. Initiation and termination codons are in italic type. The Pro-rich
domain is underlined. The charged amino acid residues at the C terminus are indicated by
asterisks. The putative polyadenylylation signal is overlined. The numbers on the left
and right indicate first NT positions and the last amino acid positions, respectively, of
each row.

   CELP-1-Related Genes and cDNAs Encode Proteins with Similar Structural Features.
CELP-1 gene is a member of a complex multiple gene family (data not shown). Several other
CELP-1c-hybridizing cDNAs and genomic clones have also been characterized. CELP-1c, -2c,
-3c, and -4c cDNAs and the coding region of CELP-5 gene share significant sequence
homology (data not shown). Although the five deduced CELPs vary significantly in length
(between 159 and 209 aa) (Fig. 2), they all have the same overall primary structure: a
Pro-rich domain, followed by a Cys-rich region and a highly charged C terminus.
   Similar to CELP-1c, CELP-2c and -3c appear to be full-length cDNAs whereas CELP-4c
does not have an ATG codon at the 5' end of its open reading frame. The N termini of the
deduced CELPs are hydrophobic and have features characteristic of signal peptides (Fig.
2) (24). A putative signal peptide processing site is assigned in front of the conserved
Gln22 (CELP-1) residue found in all five CELPs. There is one potential N-glycosylation
site (25) in CELP-1, -3, -4, and -5 (Fig. 2). Some of the Pro residues in the Pro-rich
domain of CELPs are most probably hydroxylated and these Hyp residues may provide
additional sites for glycosylation (2, 4, 6).
   Sequence analysis of CELP-5 gene revealed that it also has an intron (391 bp) that,
similar to the one in CELP-1 gene, interrupts the same Gln-Val dipeptide close to the 3'
end of the coding region. Recently, the nucleotide sequences of two partial cDNAs for
extensin-like flower-specific genes (pMG02 and pMG04) have been described (13). These
partial sequences corresponded to part of the sequences found in CELP-1c and CELP-2c.
Whether MG02 and MG04, and CELP-1c and -2c are for the same genes remains to be
determined.
   The Pro-Rich Domains of the CELPs Show Significant Length and Motif Pattern
Variations. The Pro-rich domains in the CELPs are flanked by a conserved tetrapeptide (Trp-
                 Plant Biology: Wu et Plant Biology: Wu et al.
                                      al.                                                                                    ~~~~~Proc. Nati. Acad. Sci. USA 90 (1993)      6831

       EL'p-1                   MVAAFLFCS[
                                                                                                                  conserved between CELP-1 and -3 whereas an entirely
      CELP-2                                       QQVAThREW
                                                                                                                  different hexapeptide was found in CELP-2 and -4. The
      CELP-4        PVKGLL'IL
                                                                       LAN-----tDG NELQ0-IFPFfl
                                                                       %%V---rG NEL.QL-WIFJ
                                                                                                                  hexapeptide in CELP-5 has 3 aa that are identical to those in
                  IAm4QLL'IL     t1?AILEV          CLT7EMLV            VA--- £QG    5QL-WEWK            44        CELP-1 and -3 and 3 aa that are identical to those in CELP-2
                                                                                                                  and -4. From the nucleotide sequences encoding these
                                 WPPPPFWPCP PP-RmPRPRP
                                 T.TnnnnrT.TfVln    --
                                                                                         RPRPrP.qPPP    96        hexapeptides (Fig. 3), it could be argued that the hexapeptide
      CELP-3      IFCLJF        WP--RPWTCP     -PPP              ----CPPPPPPP
                                                                                        RPRFCPSPPP      93
                                                                                                                  in CELP-5 was derived from those in CELP-1 and -3 via three
      CELP-4      IFCLJF        FP--RPYFCP         PPRPRPRP--
                                                                       - -.1            RP--CPDPPP
                                                                                                                  single-base-pair changes that resulted in changes in 3 aa. The
                                                                                                                  nucleotide sequences in CELP-5 could then have undergone
                  PPRFRCPSP PPFFPPSPPPP---SP PP                                         A  ENIYRC      143
                                                                                                                  additional changes in 3 bp to yield a total of the 6 aa
                  PP--------QPRPP SPPPPPPPSP PPP
                              ------- CPPPP---s P p
                                                                                                                  substitutions seen in CELP-2 and CELP-3.
                  p         p
                              ------  CPPPP----
                     ----------CPP--- p --
                                                                                        A ir-           97           Cys-Rich Domains in CELPs Are Highly Conserved. The
      CELP-5                                                                            &;EAKIM         98
                                                                                                                  Cys-rich regions of the CELPs are highly conserved (Fig. 2).
      CE.P-1      NFEIKEDC                         SCPCYKYAED
                                                                                                                  Most strikingly, the eight Cys residues in this region are in
      CELP-2        TFN
                                                   SCPCYKYAED          LaQvLrnLE        AYCDUDSPCK
                                                                                                       181        conserved positions in these proteins. This structural con-
                  04ENITSILEW   cPTh-KSIIa         SCPCYKYAEN
                                                                                                                  servation suggests that the Cys-rich domain of CELPs may
      CELP-5      iaerTUI9C
                                                                       LGVELIALQ        A=L            148        have common functional significance.
                                                                                                                     The C Terminus of CELPs Is Highly Charged and Is Coded
                                                                                                       209        in a Separate Exon in CELP Genes. The CELP C terminus is
      CE1LP-3     GLQIIMISM EY-K
                                                                                                       166        defined as the region encoded by the second exon of the
      CELP-4      GLtO.IICSKE EE                                                                       159
                                                                                                                  CELP-1 and CELP-5 genes beglnning with the last conserved
      CELP-5      OLCU R5N LE                                                                          160
                                                                                                                  Val residue (Fig. 2). More than half of the C-terminal amino
   FIG.     2.   Deduced amino acid sequences of CELP-1, -2, -3, -4, and                                          acid residues in this region are charged. Despite the similar
-5. The first amino acid residues shown                          are   either the first Met codon                 charge property, each CELP is unique in the composition or
(in    CELP-1c,      -2c, -3c, and -5)         or      the first codon in the 5' end of                      a    the total number of charged amino acid residues in this
cDNA sequence   (in CELP-4c). The Pro-rich regions are boxed.                                                     region, thus providing variability in this region of the CELPs.
Conserved amino acid residues among all five proteins on the N- and
                                                                                                                     CELPs Are Different from Extensins but Their Primary
C-terminal sides of the Pro-rich domain                                    underlined. The

         Cys residues in the C-terminal half of CELPs are indicated by
                                                                                                                  Structure Is Reminiscent of Solanaceous Lectins. A striking
asterisks. The charged amino acid residues at the C termini are in
                                                                                                                  difference between CELPs and extensins is the low Tyr
italic type. The potential glycosylation sites in CELP-1, -3, -4, and                                    -S
                                                                                                                  content and the absence of His residues in the presumed
proteins are indicated by dots. Gaps (dashes) are introduced to allow                                            mature CELPs (Fig. 2 and Table 1). Tyr residues in extensins
maximum alignment of these proteins. The numbers on the right                                                    are closely associated with the Ser-Pro4 motifs and can form
indicate, amino acid position for each of the CELPs. Triangle,                                                   isodityrosine linkages via intramolecular crosslinking (26,
putative signal peptide processing site; diamond, position of intron in                                          27), rendering them highly insoluble in the cell-wall matrix.
CELP-1 and CELP-5 genes.                                                                                         On the contrary, CELPs are relatively rich in Cys residues
                                                                                                                 whereas extensins are in general low in or devoid of this
Pro-Phe-Pro) and an Ala-Pro doublet at the N and C termini,                                                      amino acid residue (Table 1). A number of these Cys groups
respectively (Fig. 2). The Pro content in these proteins is                                                      are found within the Xaa-Pro3-7 motifs and among the Xaa-
between 19 and 26% for the total protein and between 65 and
                                                                                                                 Pro doublets. Some of these Cys residues may form disulfide
71% for the Pro-rich domains (Table 1). The Xaa-Pro3-7 MOtifS                                                    bonds under the proper oxidative-reductive conditions and
in CELPs differ from the Ser-Pro4 MOtifS in extensins in that
                                                                                                                 may be important participants in inter- and intramolecular
Xaa may be Ser, Cys, or Trp. Despite the significant homol-                                                      interactions involving the CELPs in the extracellular matrix.
ogy, CELPs differ among themselves in the number and in                                                          On the other hand, the amino acid compositions of the
the characteristics of the Xaa-Pro3-7 motifs and the Xaa-Pro                                                         CELPs, the presence of distinct Pro-rich and Cys-rich
doublets (Fig. 2). The Pro-rich domains in these proteins                                                            domains, and their solubility properties (A.Y.C. and H.-m.W.,
range from 35 to 74 aa. Most of the differences appear to                                                            unpublished data) are reminiscent of the solanaceous lectins,
result from deletions of relatively long regions of DNA. Some                                                        a class of Hyp-rich proteins that are also rich in Ser and Cys
minor insertions and deletions might be responsible for the                                                          (and Gly) residues (Table 1) (23, 28, 29). Solanaceous lectins
shorter range heterogeneity. These differences adequately                                                        are structurally very different from other known lectins and
account for the overall length variations among the CELPs.                                                       are believed to be important to cell-cell interactions.
   Striking homology exists in a pentapeptide and an octapep-                                                       CELP mRNAs Are Predominant in Flowers and Accumulate
tide flanking the Pro-rich regions of the CELPs (Fig. 2).                                                        in Specific Cell Types, Especially in the Pistil. CELP mRNAs
Interestingly, a highly variable 6-aa domain is located on the                                                   accumulate almost exclusively in the four floral parts (Fig. 4
C-terminal side of the conserved octapeptides after the                                                          A and B). Their levels are very high throughout flower
Pro-rich regions (Figs. 2 and 3). This domain was 100%                                                           development. They begin to decline by anthesis (Fig. 4C) and
                                                                                                                 are greatly reduced in developing and mature fruits (data not
                   [Fig. 3 sequence alignment panel; residues not legible]
                                                                                                                     shown). In situ hybridization of CELP probes to tissue
                                                                                                                     sections indicates that CELP mRNAs accumulate in a
                                                                                                                     cell-specific manner in floral tissues (Fig. 5). CELP mRNA levels
                                                                                                                     are the highest in the epidermis and vascular bundles of petals
                                                                                                                     (Fig. 5A) and sepals (data not shown). In the stamen, CELP
                                                                                                                     mRNAs are confined to the vascular bundles of the filament
                                                                                                                     (Fig. 5C), whereas there are only low levels of these mRNAs
                                                                                                                     in the connective and vascular tissues of anthers (data not
                                                                                                                     shown). The most dramatic cell-specific CELP mRNA
  FIG. 3. Nucleotide sequences encoding the highly variable region
                                                                                                                     accumulation pattern is observed in the pistil. In addition to
C-terminal to the Pro-rich domains of CELPs. Base-pair changes are                                                   being present in the vascular bundles, CELP mRNAs
shown in lowercase type. Amino acid substitutions are shown in                                                       accumulate to high levels in a narrow region between the
boldface and outlined types. Numbers at the end of each line indicate                                                transmitting tissue and the cortex of the style (Fig. 5 D and F). In
the position of the last amino acid residue shown for each protein.                                                  the ovary, CELP mRNAs concentrate in a narrow row of
   FIG. 4. RNA blot analysis of CELP mRNA accumulation patterns.
(A) RNA from root (lane R), stem (lane S), leaves (lane L), and
flowers (lane F). Prolonged exposure of this autoradiogram revealed
a very low level of CELP mRNAs in leaves. (B) Sepal (Se), petal (Pe),
stamen (St), and pistil (Pi) RNA from flowers at stage 5-8 (30). (C)
RNA from various floral developmental stages. The number above
each lane designates floral developmental stage (30). These blots
were hybridized with 32P-labeled CELP-1 cDNA probe. Other
CELP cDNA probes yielded identical results.
cells lining the placenta that are a continuation from the stylar
transmitting tissue to which ovules are attached (Fig. 5 G and
J).

   FIG. 5. In situ hybridization analysis of cellular accumulation
patterns of CELP mRNAs. Flowers at stage 5-6 were used. Bright-field
micrographs of flower sections hybridized with 35S-labeled
CELP antisense RNA (A, C, D, F, G, and J) and control sense RNA
(B, E, H, and I). (A and B) Cross sections of petal. (x140.) The cells
without silver grain deposition in B reacted with tissue stain more
efficiently than cells shown in A and so have a much darker outline.
(C) Cross section of a filament. (x55.) (D and E) Longitudinal
sections of a pistil. (x20.) The dark area in the vascular bundle in E
was from autofluorescence of the vascular tissues. (F) Cross section
of a pistil. (x55.) (G and H) Longitudinal sections of an ovary. (x15.)
(I) Cross section of a pistil. (x20.) (J) Higher magnification of one
region of the ovary section shown in G. (x55.) ep, epidermis; vb,
vascular bundle; co, cortex; tt, transmitting tissue; pl, placenta; ov,
ovule.

                          DISCUSSION

We have described here a family of genes and cDNAs
(CELP) for a set of proteins with an extensin-like Pro-rich
domain, a Cys-rich domain, and a highly charged C-terminal
domain (Fig. 2). CELP mRNAs accumulate almost exclusively
in flowers and they are present in diverse cell types.
Protein blot analysis and subcellular localization of CELPs
indicate that they are glycoproteins localized largely to the
cell walls (A.Y.C. and H.-m.W., unpublished data), properties
similar to many other HRGPs. However, the unique
structure of these floral CELPs (Fig. 2) and the expression
pattern for their genes (Figs. 4 and 5) distinguish them from
several other female or male reproductive organ-specific Pro
(Hyp)-rich proteins (13-17).
   Structurally, being Pro-rich proteins with low Tyr contents,
the CELPs must rely on chemical reactions other than
the irreversible isodityrosine linkages presumably utilized by
the Tyr-rich HRGPs to integrate into the cell-wall network
(26, 27). It is likely that the Cys residues play a role in this
aspect. In the Pro-rich domains, there are a number of
Cys-Pro3-7 motifs and Cys-Pro doublets (Fig. 2). These Cys
residues should be able to participate in disulfide bond
formation under the proper oxidative-reductive conditions,
allowing for interactions among CELPs, other Cys groups,
and sulfated nonproteinaceous components in the walls.
Since disulfide bonds are reversible, interactions mediated by
these Cys residues are thus labile and may change as developmental,
functional, and environmental demands on the
flower change.
   The complexity of the CELP gene family, its diverse
cellular expression pattern in floral tissues, and the presence
of distinct structural regions in CELPs suggest that they are
multifunctional. The variabilities observed in individual
structural regions among the CELPs may be able to modulate
the functions mediated by them. The differences in the
Xaa-Pro3-7 motifs and their interruptions by different lengths
of Xaa-Pro doublets may affect the extent and the pattern of
glycosylation in the CELPs, thus further amplifying their
differences. These may then provide even more structural
variability and functional versatility to this family of proteins
as they are incorporated into the extracellular matrix of the
different floral cell types.
   The C-terminal half of the CELPs has eight Cys residues
(=10% of the residues) (Fig. 2 and Table 1), which are at
conserved positions in these proteins. The Cys residues
inevitably contribute to the structure of CELPs, which
should be important to their activities. The possibility that
CELPs may have a role in cell-cell interactions should be
considered especially in light of their similarity to the solanaceous
lectins, which are soluble HRGPs with a Pro-rich
domain and a Cys-rich domain. It is known that the sugar-binding
property of potato lectin resides in the Cys-rich
domain and disruption of disulfide bonds interferes with this
activity (28). Because of their ability to recognize other
glyco-moieties, and their locations, both cellular and extracellular
(29, 31), it has been suggested that solanaceous
lectins have important roles in cell-cell interactions. The
unique expression pattern of CELP mRNA accumulation in
stylar cells demarcating the transmitting tissue suggests a
possible role in restricting the path of pollen tube growth to
within the transmitting tissue. Accumulation of CELP gene
products on the placenta surface cells suggests that they may
contribute to ovule development or to pollen tube entrance to
the ovules.
   Presence of several distinct structural domains suggests that
CELPs may not be entirely embedded in the fibrous wall
matrix. Different domains of these proteins may be available
for interactions with structures contiguous to the cell walls.
It is especially interesting to note the presence of a highly
charged C terminus in these CELPs (Fig. 2). These charged
groups may be available to interact with other charged
moieties in the extracellular matrix. Moreover, the variability
observed in the C termini among the CELPs should allow
more versatility for these interactions. Furthermore, the
highly charged C termini in CELPs are similar to that of an HRGP
from Volvox, which has five negatively charged amino acid
residues followed by five positively charged residues close to
its C terminus (32). This C-terminal region was implicated in
the biogenesis of the extracellular matrix during Volvox
embryogenesis. Although the significance of the highly
charged C termini in the CELPs remains to be determined, it
is likely that this domain has a specific functional role in
flower extracellular matrix.
   Because of the motif similarity shared among many Pro-rich
proteins, it had been suggested that recombination
played a role in the evolution of these cell wall protein genes
(6, 33). It has also been suggested that an ancestral cytosine-rich
DNA sequence for an extensin gene existed. Gene
duplication and subsequent recombinations resulted in the
complex array of gene families encoding the many Pro-rich
proteins. It was also suggested that the solanaceous lectins
may have arisen from the recombination between a sequence
encoding a Pro-rich protein and one encoding a Cys-rich
protein. The structure of the CELP genes exemplifies these
hypotheses. The Pro-rich regions and the Cys-rich regions in
the CELPs could be a result of recombination between two
distinct DNA fragments. The length variability in the Pro-rich
domains (Fig. 2) reflects possible unequal cross-over in this
region that resulted in large deletions in some of these genes.
Smaller insertions and deletions can account for the shorter-range
variations (see Fig. 2). It is probable that recombinogenic
activities within the G+C-rich DNA sequences played
a significant role in the evolution of the Pro-rich domains in
CELPs. In addition to recombination, exon shuffling (34), a
mechanism believed to be important to the evolution of
proteins with multiple functional domains, might be responsible
for constructing the highly charged C-terminal domains
in these proteins.
   The many functional roles for cell walls require the physical,
biological, and chemical complexity observed for the
extracellular matrix of plant cells. While a vast amount of
information is available concerning the biological properties
of the plant cell surface and the chemical properties of its
constituents (1-3), how the cell walls are constructed and
how their functions are mediated remain largely a puzzle. The
many roles that some of the cell wall proteins are believed to
play in plant growth, development, and its defense remain to
be unequivocally demonstrated. The perplexing issues that
have been addressed here concerning the CELP gene family
will help focus efforts toward unraveling the structural and
functional significance of these proteins and understanding
how they contribute to the properties necessary for the
extracellular matrix of floral tissues.

   We thank Dr. Jean Haley for her comments on the manuscript and
Mrs. Nancy Carrignan for her patient assistance in the preparation
of this manuscript. This work was supported by a grant from the
McKnight Foundation to Yale University. J.Z. was a postdoctoral
fellow supported by the Rockefeller Foundation.

 1. Varner, J. E. & Lin, L.-S. (1989) Cell 56, 231-239.
 2. Roberts, K. (1989) Curr. Opin. Cell Biol. 1, 1020-1027.
 3. Roberts, K. (1990) Curr. Opin. Cell Biol. 2, 920-928.
 4. Cassab, G. I. & Varner, J. E. (1988) Annu. Rev. Plant Physiol.
    Plant Mol. Biol. 39, 321-353.
 5. Lamb, C. J., Lawton, M. A., Dron, M. & Dixon, R. P. (1989)
    Cell 56, 215-224.
 6. Showalter, A. M. & Varner, J. E. (1989) in The Biochemistry of
    Plants: A Comprehensive Treatise, ed. Marcus, A. (Academic,
    New York), Vol. 15, pp. 485-520.
 7. Condit, C. M. & Keller, B. (1990) in Organization and Assembly
    of Plant and Animal Extracellular Matrix, eds. Adair, W. S.
    & Mecham, R. P. (Academic, New York), pp. 119-135.
 8. Hong, J. C., Nagao, R. T. & Key, J. L. (1989) Plant Cell 1,
    937-943.
 9. Ye, Z.-H. & Varner, J. E. (1991) Plant Cell 3, 23-37.
10. Wyatt, R. E., Nagao, R. T. & Key, J. L. (1992) Plant Cell 4,
    99-110.
11. Esau, K. (1977) Anatomy of Seed Plants (Wiley, New York).
12. Neale, A. D., Wahleithner, J. A., Lund, M., Bonnett, H. T.,
    Kelly, A., Meeks-Wagner, D. R., Peacock, W. J. & Dennis,
    E. S. (1990) Plant Cell 2, 673-684.
13. De S. Goldman, M. H., Pezzotti, M., Seurinck, J. & Mariani,
    C. (1992) Plant Cell 4, 1041-1051.
14. Chen, C.-G., Cornish, E. D. & Clarke, A. E. (1992) Plant Cell
    4, 1053-1062.
15. Cheung, A. Y., May, B., Gu, Q. & Wu, H.-M. (1993) Plant J.
    3, 151-160.
16. Baldwin, T. C., Coen, E. S. & Dickinson, H. G. (1992) Plant
    J. 2, 733-739.
17. Evrard, J.-L., Jako, C., Saint-Guily, A., Weil, J.-H. & Kuntz,
    M. (1991) Plant Mol. Biol. 16, 271-281.
18. Ori, N., Sessa, G., Lotan, T., Himmelhoch, S. & Fluhr, R.
    (1990) EMBO J. 9, 3429-3436.
19. Fraser, R. S. S. (1981) Physiol. Plant Pathol. 19, 69-76.
20. Gu, Q., Kawata, E. E., Morse, M.-J., Wu, H.-M. & Cheung,
    A. Y. (1992) Mol. Gen. Genet. 234, 89-96.
21. Memelink, J. (1988) Dissertation (University of Leiden, The
    Netherlands).
22. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular
    Cloning: A Laboratory Manual (Cold Spring Harbor Lab.
    Press, Plainview, NY), 2nd Ed.
23. Van Holst, G.-J., Martin, S. R., Allen, A. K., Ashford, D.,
    Desai, N. N. & Neuberger, A. (1986) Biochem. J. 233, 731-736.
24. von Heijne, G. (1986) Nucleic Acids Res. 14, 4683-4690.
25. Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K. &
    Watson, J. (1989) Molecular Biology of the Cell (Garland, New
    York), 2nd Ed.
26. Fry, S. C. (1986) Annu. Rev. Plant Physiol. 37, 165-186.
27. Epstein, L. & Lamport, D. T. A. (1984) Phytochemistry 23,
    1242-1246.
28. Allen, A. K. (1983) in Chemical Taxonomy, Molecular Biology,
    and Function of Plant Lectins, eds. Goldstein, I. J. & Etzler,
    M. E. (Liss, New York), pp. 71-85.
29. Casalongué, C. & Pont Lezica, R. (1985) Plant Cell Physiol. 26,
    1533-1539.
30. Koltunow, A. M., Treuttner, J., Cox, K. H., Wallroth, M. &
    Goldberg, R. B. (1990) Plant Cell 2, 1201-1224.
31. Jeffree, C. E. & Yeoman, M. M. (1981) New Phytol. 87,
    463-471.
32. Ertl, H., Hallmann, A., Wenzl, S. & Sumper, M. (1992) EMBO
    J. 11, 2055-2062.
33. Showalter, A. M. & Rumeau, D. (1990) in Organization and
    Assembly of Plant and Animal Extracellular Matrix, eds. Adair,
    W. S. & Mecham, R. P. (Academic, New York), pp. 247-281.
34. Gilbert, W. (1987) Cold Spring Harbor Symp. Quant. Biol. 52,
    901-906.
