Proc. Natl. Acad. Sci. USA
Vol. 90, pp. 6829-6833, July 1993
A tobacco gene family for flower cell wall proteins with a
proline-rich domain and a cysteine-rich domain
HEN-MING WU, JITAO ZOU, BRUCE MAY, QING GU, AND ALICE Y. CHEUNG*
Department of Biology, Yale University, P.O. Box 6666, New Haven, CT 06511
Communicated by J. E. Varner, March 29, 1993
ABSTRACT Flowering is known to be associated with the induction of many cell wall proteins. We report here five members of a tobacco gene family (CELP, Cys-rich extensin-like protein) whose mRNAs are found predominantly in flowers and encode extensin-like Pro-rich proteins. CELP mRNAs accumulate most abundantly in vascular and epidermal tissues of floral organs. In the pistil, CELP mRNAs also accumulate in a thin layer of cells between the transmitting tissue and the cortex of the style and in a surface layer of cells of the placenta in the ovary. This unique accumulation pattern of CELP mRNAs in the pistil suggests a possible role in pollination and fertilization processes. CELP genes encode a class of plant extracellular matrix proteins that have several distinct structural features: a Pro-rich extensin-like domain with Xaa-Pro3-7 motifs and Xaa-Pro doublets, a Cys-rich region, and a highly charged C terminus. The extensin-like domains in these proteins differ significantly in their length and these differences appear to be results of both long and short deletions within the coding regions of their genes. Furthermore, the number of charged amino acid residues in the C-terminal region varies among the CELPs. These structural differences may contribute to functional versatility in the CELPs. On the other hand, the Cys-rich domain is highly conserved among CELPs and the positions of the Cys residues are conserved, suggesting that this region may have a common functional role. The presence of a Pro-rich domain and a Cys-rich domain in these CELPs is reminiscent of a class of hydroxyproline-rich glycoproteins, solanaceous lectins, that are believed to be important in cell-cell recognition. The structure of these CELPs indicates that they may be multifunctional and that their genes may have arisen from recombinational events.

The primary walls of plant cells are thin and flexible but strong and can accommodate cell expansion and changes in cell shape as cells grow and differentiate. Besides providing structural integrity, cell walls are also the sites where cell-cell interactions as well as interactions between plants and their environment take place (1-5). Cell-wall proteins are diverse and include many proteins that are believed to play structural roles and others that are involved in defense or other cellular and biochemical processes. The best characterized cell wall protein genes are those for several classes of hydroxyproline (Hyp)-rich glycoproteins (HRGPs) (2, 6). One class includes the extensins, which typically have numerous Ser-Pro(Hyp)4 motifs throughout the entire protein. Another class encodes what are collectively known as Pro (Hyp)-rich proteins, which have multiple copies of the pentapeptide Val-Tyr-Lys-Pro-Pro or its variants. There are at least two additional classes of glycosylated Pro (Hyp)-rich proteins, arabinogalactan proteins and solanaceous lectins. Gly-rich cell wall protein genes have also been reported (7).

Differential expression of cell wall protein genes may contribute to meeting the functional and physical demands on different cell types and at different developmental stages (8-10). A flower is morphologically complex and composed of multiple tissue types (11). Flowering is known to stimulate or induce the expression of several cell-wall protein genes (12-18). These extracellular matrix proteins may be involved in the organization of the floral apex from a vegetative apex or may fulfill special structural and functional requirements for flower cell walls. Some flowering-associated extracellular matrix proteins are also stress-inducible or pathogen-related and may contribute to the overall defense system in flowers (12, 19, 20).

We report here the characterization of a family of tobacco genes and cDNAs (CELPs, Cys-rich extensin-like proteins) for a set of proteins that have an extensin-like domain.† CELP mRNAs accumulate almost exclusively in the four floral organs: sepal, petal, stamen, and pistil. CELP mRNAs accumulate in multiple cell types in each floral organ. They are abundant in the vascular bundles and the epidermis, except in the pistil where they are most abundant in the cells demarcating the transmitting tissue and the cortex and in the cell layer lining the placenta to which ovules are attached. The CELPs have three distinct structural domains: a Pro-rich extensin-like domain, a Cys-rich domain, and a highly charged C terminus. These features make the CELPs structurally distinct from tobacco extensin (21). The possible functional roles for this class of extracellular matrix proteins will be discussed in light of the available structural and expression information.

MATERIALS AND METHODS

cDNA and Genomic Library Construction and Screening. CELP cDNA clones were isolated from a tobacco floral cDNA library as described (20). A λ Dash II genomic N. tabacum DNA library was made as described (22). CELP genomic clones were isolated from an unamplified library using a 32P-labeled CELP-1c probe prepared by random priming (22). Hybridization and washes were carried out at 68°C in buffers as described (15).

Nucleotide Sequence Analysis. Nucleotide sequence analysis was carried out by the dideoxynucleotide sequencing method using double-stranded DNA and Sequenase Version 2 (United States Biochemical) according to the manufacturer's recommendations. All the nucleotide sequences were determined on both strands of DNA.

RNA Expression Analysis. RNA preparation, gel electrophoresis, blot analysis, and in situ hybridizations have been described (20).

Abbreviations: CELP, Cys-rich extensin-like protein; Hyp, hydroxyproline; HRGP, Hyp-rich glycoprotein.
*To whom reprint requests should be addressed.
†The sequences reported in this paper have been deposited in the GenBank data base (accession nos. CELP-1 to CELP-5, L13439-L13443, respectively).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
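The structural features named above (Pro content, Xaa-Pro3-7 motifs, Xaa-Pro doublets, and charged residues) lend themselves to simple sequence bookkeeping. The Python sketch below is not the procedure used in this study; it is only an illustration under stated assumptions. The two sequences are hypothetical toy strings, not verified CELP sequences, and the function names are invented for this example. A motif is assumed to be one non-Pro residue followed by 3 to 7 consecutive Pro residues, a doublet is assumed to be one non-Pro residue followed by a single Pro, and only Asp, Glu, Lys, and Arg are counted as charged.

```python
import re

# Hypothetical toy sequences for illustration only (not verified CELP data).
TOY_PRO_RICH = "PPRPRPRPRPCPSPPPPPRP"
TOY_C_TERMINUS = "VIKLSKEEEKKK"

# Assumed motif definition: one non-Pro residue (Xaa) followed by 3 to 7 Pro,
# with no additional Pro immediately after the run.
MOTIF_RE = re.compile(r"[^P]P{3,7}(?!P)")

# Assumed doublet definition: one non-Pro residue followed by exactly one Pro.
DOUBLET_RE = re.compile(r"[^P]P(?!P)")

# Assumption: His is not counted as charged.
CHARGED = set("DEKR")


def percent_pro(seq: str) -> float:
    """Return Pro residues as a percentage of all residues in seq."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * seq.count("P") / len(seq)


def find_pro_motifs(seq: str) -> list[str]:
    """Return every Xaa-Pro(3-7) motif found in seq, left to right."""
    return MOTIF_RE.findall(seq)


def find_pro_doublets(seq: str) -> list[str]:
    """Return every Xaa-Pro doublet found in seq, left to right."""
    return DOUBLET_RE.findall(seq)


def count_charged(seq: str) -> int:
    """Count Asp, Glu, Lys, and Arg residues in seq."""
    return sum(1 for residue in seq if residue in CHARGED)


if __name__ == "__main__":
    print(percent_pro(TOY_PRO_RICH))
    print(find_pro_motifs(TOY_PRO_RICH))
    print(find_pro_doublets(TOY_PRO_RICH))
    print(count_charged(TOY_C_TERMINUS), "of", len(TOY_C_TERMINUS))
```

The lookahead `(?!P)` keeps a run of more than seven Pro residues from being counted as a shorter motif, and it keeps the first Pro of a longer run from being counted as a doublet. Other definitions of these motifs are possible and would give different counts.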
RESULTS

CELP-1 and CELP-1c, a Gene and Its cDNA for an Extensin-Like Pro-Rich Protein. CELP-1c, a cDNA derived from a class of highly expressed flower mRNAs, was isolated from a tobacco floral cDNA library. CELP-1c has an open reading frame corresponding to a 209-aa Pro-rich protein (Fig. 1). The first of four closely spaced Met residues (NT 37-39) (Fig. 1) is assumed to function as the initiation codon of the deduced CELP-1 protein since the 5' end of the mRNA is located at the predominant transcription initiation site of the CELP-1 gene at the thymidine denoted NT 1 (data not shown).

The most striking feature of the deduced CELP-1 protein is in the 74-aa central region that contains a series of Pro-rich sequences. The Pro residues are distributed in seven Xaa-Pro3-5 motifs (Xaa being Trp, Cys, or Ser) and 18 Xaa-Pro doublets (Xaa being Trp, Phe, Cys, Arg, or Gln). Overall, Pro residues are 26.3% of the entire CELP-1 protein but they make up 66.2% of the Pro-rich domain (see Table 1). The Xaa-Pro3-5 motifs in the predicted CELP are similar to the Ser-Pro4 repeats found in many plant-cell-wall HRGPs (4, 6). CELP-1, especially its C-terminal half, is relatively Cys-rich (Fig. 1 and Table 1). Moreover, 9 of the last 11 aa are charged. Characterization of the CELP-1 gene revealed a 400-bp intron that interrupts Gln196 and Val197 (data not shown), thus separating the region encoding the highly charged C terminus of the CELP-1 from the rest of the gene.

Table 1. Amino acid composition of CELP-1, -2, -3, -4, and -5, tobacco extensin, and potato lectin

                         % of total residues
Residue   CELP-1  CELP-2  CELP-3  CELP-4  CELP-5  TE*     PL†
Ala       4.78    5.1     6.66    5.69    7.45    1.88    3.70
Arg       6.22    5.61    3.63    3.79    2.48    0       0.82
Asn       2.87    2.55    3.63    2.53    2.48    0.31    5.34
Asp       3.82    4.08    4.24    3.16    3.72    0
Cys‡      6.70    6.63    7.87    8.22    8.07    0       11.5
Gln       2.87    3.57    2.42    3.79    3.72    0.31    6.99
Glu       5.26    5.61    6.06    5.69    5.59    0.31
Gly       0.95    1.02    1.81    1.26    2.48    0.31    11.5
His       0.478   0.51    0       0.63    0       4.71    0
Ile       3.34    3.06    5.45    3.79    4.96    0       1.23
Leu       5.74    5.61    7.87    8.86    9.31    2.20    1.23
Lys       5.26    5.10    4.84    6.32    4.96    12.26   3.70
Met       2.39    3.57    1.81    1.26    2.48    0.628   0.41
Phe       2.39    2.04    4.24    3.16    2.48    0.31    0
Pro§      26.3    26      26      18.9    19.25   39.93   28.3
Ser       8.13    7.14    7.27    6.96    6.83    10.06   11.1
Thr       2.87    3.06    3.63    4.43    3.10    5.03    6.58
Trp       2.39    2.55    0.60    1.89    2.48    0       3.29
Tyr       2.39    2.04    4.24    3.16    3.72    14.46   3.29
Val       4.78    5.10    3.6     6.32    4.34    7.23    0.41
Total     209¶    196     166¶    159¶    161¶    318¶    243‖

Underlined sub-rows are only partly legible: for Cys, 9.87 (other values illegible); for Pro, 67.1 and 65.7 (other values illegible).
TE, tobacco extensin; PL, potato lectin.
*Ref. 21.
†Ref. 23.
‡Underlined numbers represent percent of Cys residues in the C-terminal domain.
§Underlined numbers represent percent of Pro residues in the Pro-rich domain.
¶Total number of amino acid residues in unprocessed proteins.
‖Estimated total number.

[Fig. 1 sequence listing not legible.]
FIG. 1. Nucleotide sequence of the CELP-1 cDNA and the deduced amino acid sequence of the CELP-1. Arrow, predominant transcription initiation site of the CELP-1 mRNA; triangle, putative signal peptide processing site; diamond, location of the intron in CELP-1 precursor mRNA. Initiation and termination codons are in italic type. The Pro-rich domain is underlined. The charged amino acid residues at the C terminus are indicated by asterisks. The putative polyadenylylation signal is overlined. The numbers on the left and right indicate first NT positions and the last amino acid positions, respectively, of each row.

CELP-1-Related Genes and cDNAs Encode Proteins with Similar Structural Features. CELP-1 gene is a member of a complex multiple gene family (data not shown). Several other CELP-1c-hybridizing cDNAs and genomic clones have also been characterized. CELP-1c, -2c, -3c, and -4c cDNAs and the coding region of CELP-5 gene share significant sequence homology (data not shown). Although the five deduced CELPs vary significantly in length (between 159 and 209 aa) (Fig. 2), they all have the same overall primary structure: a Pro-rich domain, followed by a Cys-rich region and a highly charged C terminus.

Similar to CELP-1c, CELP-2c and -3c appear to be full-length cDNAs whereas CELP-4c does not have an ATG codon at the 5' end of its open reading frame. The N termini of the deduced CELPs are hydrophobic and have features characteristic of signal peptides (Fig. 2) (24). A putative signal peptide processing site is assigned in front of the conserved Gln22 (CELP-1) residue found in all five CELPs. There is one potential N-glycosylation site (25) in CELP-1, -3, -4, and -5 (Fig. 2). Some of the Pro residues in the Pro-rich domain of CELPs are most probably hydroxylated and these Hyp residues may provide additional sites for glycosylation (2, 4, 6).

Sequence analysis of CELP-5 gene revealed that it also has an intron (391 bp) that, similar to the one in CELP-1 gene, interrupts the same Gln-Val dipeptide close to the 3' end of the coding region. Recently, the nucleotide sequences of two partial cDNAs for extensin-like flower-specific genes (pMG02 and pMG04) have been described (13). These partial sequences corresponded to part of the sequences found in CELP-1c and CELP-2c. Whether MG02 and MG04, and CELP-1c and -2c are for the same genes remains to be determined.

The Pro-Rich Domains of the CELPs Show Significant Length and Motif Pattern Variations. The Pro-rich domains in the CELPs are flanked by a conserved tetrapeptide (Trp-
Pro-Phe-Pro) and an Ala-Pro doublet at the N and C termini, respectively (Fig. 2). The Pro content in these proteins is between 19 and 26% for the total protein and between 65 and 71% for the Pro-rich domains (Table 1). The Xaa-Pro3-7 motifs in CELPs differ from the Ser-Pro4 motifs in extensins in that Xaa may be Ser, Cys, or Trp. Despite the significant homology, CELPs differ among themselves in the number and in the characteristics of the Xaa-Pro3-7 motifs and the Xaa-Pro doublets (Fig. 2). The Pro-rich domains in these proteins range from 35 to 74 aa. Most of the differences appear to result from deletions of relatively long regions of DNA. Some minor insertions and deletions might be responsible for the shorter range heterogeneity. These differences adequately account for the overall length variations among the CELPs.

Striking homology exists in a pentapeptide and an octapeptide flanking the Pro-rich regions of the CELPs (Fig. 2). Interestingly, a highly variable 6-aa domain is located on the C-terminal side of the conserved octapeptides after the Pro-rich regions (Figs. 2 and 3). This domain was 100% conserved between CELP-1 and -3 whereas an entirely different hexapeptide was found in CELP-2 and -4. The hexapeptide in CELP-5 has 3 aa that are identical to those in CELP-1 and -3 and 3 aa that are identical to those in CELP-2 and -4. From the nucleotide sequences encoding these hexapeptides (Fig. 3), it could be argued that the hexapeptide in CELP-5 was derived from those in CELP-1 and -3 via three single-base-pair changes that resulted in changes in 3 aa. The nucleotide sequences in CELP-5 could then have undergone additional changes in 3 bp to yield a total of the 6 aa substitutions seen in CELP-2 and CELP-3.

[Fig. 2 sequence alignment not legible.]
FIG. 2. Deduced amino acid sequences of CELP-1, -2, -3, -4, and -5. The first amino acid residues shown are either the first Met codon (in CELP-1c, -2c, -3c, and -5) or the first codon in the 5' end of a cDNA sequence (in CELP-4c). The Pro-rich regions are boxed. Conserved amino acid residues among all five proteins on the N- and C-terminal sides of the Pro-rich domain are underlined. The Cys residues in the C-terminal half of CELPs are indicated by asterisks. The charged amino acid residues at the C termini are in italic type. The potential glycosylation sites in CELP-1, -3, -4, and -5 proteins are indicated by dots. Gaps (dashes) are introduced to allow maximum alignment of these proteins. The numbers on the right indicate amino acid position for each of the CELPs. Triangle, putative signal peptide processing site; diamond, position of intron in CELP-1 and CELP-5 genes.

[Fig. 3 sequence comparison not legible.]
FIG. 3. Nucleotide sequences encoding the highly variable region C-terminal to the Pro-rich domains of CELPs. Base-pair changes are shown in lowercase type. Amino acid substitutions are shown in boldface and outlined types. Numbers at the end of each line indicate the position of the last amino acid residue shown for each protein.

Cys-Rich Domains in CELPs Are Highly Conserved. The Cys-rich regions of the CELPs are highly conserved (Fig. 2). Most strikingly, the eight Cys residues in this region are in conserved positions in these proteins. This structural conservation suggests that the Cys-rich domain of CELPs may have common functional significance.

The C Terminus of CELPs Is Highly Charged and Is Coded in a Separate Exon in CELP Genes. The CELP C terminus is defined as the region encoded by the second exon of the CELP-1 and CELP-5 genes beginning with the last conserved Val residue (Fig. 2). More than half of the C-terminal amino acid residues in this region are charged. Despite the similar charge property, each CELP is unique in the composition or the total number of charged amino acid residues in this region, thus providing variability in this region of the CELPs.

CELPs Are Different from Extensins but Their Primary Structure Is Reminiscent of Solanaceous Lectins. A striking difference between CELPs and extensins is the low Tyr content and the absence of His residues in the presumed mature CELPs (Fig. 2 and Table 1). Tyr residues in extensins are closely associated with the Ser-Pro4 motifs and can form isodityrosine linkages via intramolecular crosslinking (26, 27), rendering them highly insoluble in the cell-wall matrix. On the contrary, CELPs are relatively rich in Cys residues whereas extensins are in general low in or devoid of this amino acid residue (Table 1). A number of these Cys groups are found within the Xaa-Pro3-7 motifs and among the Xaa-Pro doublets. Some of these Cys residues may form disulfide bonds under the proper oxidative-reductive conditions and may be important participants in inter- and intramolecular interactions involving the CELPs in the extracellular matrix. On the other hand, the amino acid compositions of the CELPs, the presence of distinct Pro-rich and Cys-rich domains, and their solubility properties (A.Y.C. and H.-m.W., unpublished data) are reminiscent of the solanaceous lectins, a class of Hyp-rich proteins that are also rich in Ser and Cys (and Gly) residues (Table 1) (23, 28, 29). Solanaceous lectins are structurally very different from other known lectins and are believed to be important to cell-cell interactions.

CELP mRNAs Are Predominant in Flowers and Accumulate in Specific Cell Types, Especially in the Pistil. CELP mRNAs accumulate almost exclusively in the four floral parts (Fig. 4 A and B). Their levels are very high throughout flower development. They begin to decline by anthesis (Fig. 4C) and are greatly reduced in developing and mature fruits (data not shown). In situ hybridization of CELP probes to tissue sections indicates that CELP mRNAs accumulate in a cell-specific manner in floral tissues (Fig. 5). CELP mRNA levels are the highest in the epidermis and vascular bundles of petals (Fig. 5A) and sepals (data not shown). In the stamen, CELP mRNAs are confined to the vascular bundles of the filament (Fig. 5C), whereas there are only low levels of these mRNAs in the connective and vascular tissues of anthers (data not shown). The most dramatic cell-specific CELP mRNA accumulation pattern is observed in the pistil. In addition to being present in the vascular bundles, CELP mRNAs accumulate to high levels in a narrow region between the transmitting tissue and the cortex of the style (Fig. 5 D and F). In the ovary, CELP mRNAs concentrate in a narrow row of
cells lining the placenta that are a continuation from the stylar transmitting tissue to which ovules are attached (Fig. 5 G and J).

FIG. 4. RNA blot analysis of CELP mRNA accumulation patterns. (A) RNA from root (lane R), stem (lane S), leaves (lane L), and flowers (lane F). Prolonged exposure of this autoradiogram revealed a very low level of CELP mRNAs in leaves. (B) Sepal (Se), petal (Pe), stamen (St), and pistil (Pi) RNA from flowers at stage 5-8 (30). (C) RNA from various floral developmental stages. The number above each lane designates floral developmental stage (30). These blots were hybridized with 32P-labeled CELP-1c DNA probe. Other CELPc probes yielded identical results.

FIG. 5. In situ hybridization analysis of cellular accumulation patterns of CELP mRNAs. Flowers at stage 5-6 were used. Bright-field micrographs of flower sections hybridized with 35S-labeled CELP antisense RNA (A, C, D, F, G, and J) and control sense RNA (B, E, H, and I). (A and B) Cross sections of petal. (x140.) The cells without silver grain deposition in B reacted with tissue stain more efficiently than cells shown in A and so have a much darker outline. (C) Cross section of a filament. (x55.) (D and E) Longitudinal sections of a pistil. (x20.) The dark area in the vascular bundle in E was from autofluorescence of the vascular tissues. (F) Cross section of a pistil. (x55.) (G and H) Longitudinal sections of an ovary. (x15.) (I) Cross section of a pistil. (x20.) (J) Higher magnification of one region of the ovary section shown in G. (x55.) ep, epidermis; vb, vascular bundle; co, cortex; tt, transmitting tissue; pl, placenta; ov, ovule.

DISCUSSION

We have described here a family of genes and cDNAs (CELP) for a set of proteins with an extensin-like Pro-rich domain, a Cys-rich domain, and a highly charged C-terminal domain (Fig. 2). CELP mRNAs accumulate almost exclusively in flowers and they are present in diverse cell types. Protein blot analysis and subcellular localization of CELPs indicate that they are glycoproteins localized largely to the cell walls (A.Y.C. and H.-m.W., unpublished data), properties similar to many other HRGPs. However, the unique structure of these floral CELPs (Fig. 2) and the expression pattern for their genes (Figs. 4 and 5) distinguish them from several other female or male reproductive organ-specific Pro (Hyp)-rich proteins (13-17).

Structurally, being Pro-rich proteins with low Tyr contents, the CELPs must rely on chemical reactions other than the irreversible isodityrosine linkages presumably utilized by the Tyr-rich HRGPs to integrate into the cell-wall network (26, 27). It is likely that the Cys residues play a role in this aspect. In the Pro-rich domains, there are a number of Cys-Pro3-7 motifs and Cys-Pro doublets (Fig. 2). These Cys residues should be able to participate in disulfide bond formation under the proper oxidative-reductive conditions, allowing for interactions among CELPs, other Cys groups, and sulfated nonproteinaceous components in the walls. Since disulfide bonds are reversible, interactions mediated by these Cys residues are thus labile and may change as developmental, functional, and environmental demands on the flower change.

The complexity of the CELP gene family, its diverse cellular expression pattern in floral tissues, and the presence of distinct structural regions in CELPs suggest that they are multifunctional. The variabilities observed in individual structural regions among the CELPs may be able to modulate the functions mediated by them. The differences in the Xaa-Pro3-7 motifs and their interruptions by different lengths of Xaa-Pro doublets may affect the extent and the pattern of glycosylation in the CELPs, thus further amplifying their differences. These may then provide even more structural variability and functional versatility to this family of proteins as they are incorporated into the extracellular matrix of the different floral cell types.

The C-terminal half of the CELPs has eight Cys residues (≈10% of the residues) (Fig. 2 and Table 1), which are at conserved positions in these proteins. The Cys residues inevitably contribute to the structure of CELPs, which should be important to their activities. The possibility that CELPs may have a role in cell-cell interactions should be considered especially in light of their similarity to the solanaceous lectins, which are soluble HRGPs with a Pro-rich domain and a Cys-rich domain. It is known that the sugar-binding property of potato lectin resides in the Cys-rich
domain and disruption of disulfide bonds interferes with this activity (28). Because of their ability to recognize other glyco-moieties, and their locations, both cellular and extracellular (29, 31), it has been suggested that solanaceous lectins have important roles in cell-cell interactions. The unique expression pattern of CELP mRNA accumulation in stylar cells demarcating the transmitting tissue suggests a possible role in restricting the path of pollen tube growth to within the transmitting tissue. Accumulation of CELP gene products on the placenta surface cells suggests that they may contribute to ovule development or to pollen tube entrance to the ovules.

Presence of several distinct structural domains suggests that CELPs may not be entirely embedded in the fibrous wall matrix. Different domains of these proteins may be available for interactions with structures contiguous to the cell walls. It is especially interesting to note the presence of a highly charged C terminus in these CELPs (Fig. 2). These charged groups may be available to interact with other charged moieties in the extracellular matrix. Moreover, the variability observed in the C termini among the CELPs should allow more versatility for these interactions. Furthermore, the highly charged C termini in CELPs are similar to a HRGP from Volvox, which has five negatively charged amino acid residues followed by five positively charged residues close to its C terminus (32). This C-terminal region was implicated in the biogenesis of the extracellular matrix during Volvox embryogenesis. Although the significance of the highly charged C termini in the CELPs remains to be determined, it is likely that this domain has a specific functional role in flower extracellular matrix.

Because of the motif similarity shared among many Pro-rich proteins, it had been suggested that recombination played a role in the evolution of these cell wall protein genes (6, 33). It has also been suggested that an ancestral cytosine-rich DNA sequence for an extensin gene existed. Gene duplication and subsequent recombinations resulted in the complex array of gene families encoding the many Pro-rich proteins. It was also suggested that the solanaceous lectins may have arisen from the recombination between a sequence encoding a Pro-rich protein and one encoding a Cys-rich protein. The structure of the CELP genes exemplifies these hypotheses. The Pro-rich regions and the Cys-rich regions in the CELPs could be a result of recombination between two distinct DNA fragments. The length variability in the Pro-rich domains (Fig. 2) reflects possible unequal cross-over in this region that resulted in large deletions in some of these genes. Smaller insertions and deletions can account for the shorter-range variations (see Fig. 2). It is probable that recombinogenic activities within the G+C-rich DNA sequences played a significant role in the evolution of the Pro-rich domains in CELPs. In addition to recombination, exon shuffling (34), a mechanism believed to be important to the evolution of proteins with multiple functional domains, might be responsible for constructing the highly charged C-terminal domains in these proteins.

The many functional roles for cell walls require the physical, biological, and chemical complexity observed for the extracellular matrix of plant cells. While a vast amount of information is available concerning the biological properties of the plant cell surface and the chemical properties of its constituents (1-3), how the cell walls are constructed and how their functions are mediated remain largely a puzzle. The many roles that some of the cell wall proteins are believed to play in plant growth, development, and its defense remain to be unequivocally demonstrated. The perplexing issues that have been addressed here concerning the CELP gene family will help focus efforts toward unraveling the structural and functional significance of these proteins and understanding how they contribute to the properties necessary for the extracellular matrix of floral tissues.

We thank Dr. Jean Haley for her comments on the manuscript and Mrs. Nancy Carrignan for her patient assistance in the preparation of this manuscript. This work was supported by a grant from the McKnight Foundation to Yale University. J.Z. was a postdoctoral fellow supported by the Rockefeller Foundation.

1. Varner, J. E. & Lin, L.-S. (1989) Cell 56, 231-239.
2. Roberts, K. (1989) Curr. Opin. Cell Biol. 1, 1020-1027.
3. Roberts, K. (1990) Curr. Opin. Cell Biol. 2, 920-928.
4. Cassab, G. I. & Varner, J. E. (1988) Annu. Rev. Plant Physiol. Plant Mol. Biol. 39, 321-353.
5. Lamb, C. J., Lawton, M. A., Dron, M. & Dixon, R. P. (1989) Cell 56, 215-224.
6. Showalter, A. M. & Varner, J. E. (1989) in The Biochemistry of Plants: A Comprehensive Treatise, ed. Marcu, A. (Academic, New York), Vol. 15, pp. 485-520.
7. Condit, C. M. & Keller, B. (1990) in Organization and Assembly of Plant and Animal Extracellular Matrix, eds. Adair, W. S. & Mecham, R. P. (Academic, New York), pp. 119-135.
8. Hong, J. C., Nagao, R. T. & Key, J. L. (1989) Plant Cell 1, 937-943.
9. Ye, Z.-H. & Varner, J. E. (1991) Plant Cell 3, 23-37.
10. Wyatt, R. E., Nagao, R. T. & Key, J. L. (1992) Plant Cell 4, 99-110.
11. Esau, K. (1977) Anatomy of Seed Plants (Wiley, New York).
12. Neale, A. D., Wahleithner, J. A., Lund, M., Bonnett, H. T., Kelly, A., Meeks-Wagner, D. R., Peacock, W. J. & Dennis, E. S. (1990) Plant Cell 2, 673-684.
13. De S. Goldman, M. H., Pezzotti, M., Seurinck, J. & Mariani, C. (1992) Plant Cell 4, 1041-1051.
14. Chen, C.-G., Cornish, E. D. & Clarke, A. E. (1992) Plant Cell 4, 1053-1062.
15. Cheung, A. Y., May, B., Gu, Q. & Wu, H.-M. (1993) Plant J. 3, 151-160.
16. Baldwin, T. C., Coen, E. S. & Dickinson, H. G. (1992) Plant J. 2, 733-739.
17. Evrard, J.-L., Jako, C., Saint-Guily, A., Weil, J.-H. & Kuntz, M. (1991) Plant Mol. Biol. 16, 271-281.
18. Ori, N., Sessa, G., Lotan, T., Himmelhoch, S. & Fluhr, R. (1990) EMBO J. 9, 3429-3436.
19. Fraser, R. S. S. (1981) Physiol. Plant Pathol. 19, 69-76.
20. Gu, Q., Kawata, E. E., Morse, M.-J., Wu, H.-M. & Cheung, A. Y. (1992) Mol. Gen. Genet. 234, 89-96.
21. Memelink, J. (1988) Dissertation (University of Leiden, The Netherlands).
22. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab. Press, Plainview, NY), 2nd Ed.
23. Van Holst, G.-J., Martin, S. R., Allen, A. K., Ashford, D., Desai, N. N. & Neuberger, A. (1986) Biochem. J. 233, 731-736.
24. von Heijne, G. (1986) Nucleic Acids Res. 14, 4683-4690.
25. Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K. & Watson, J. (1989) Molecular Biology of the Cell (Garland, New York), 2nd Ed.
26. Fry, S. C. (1986) Annu. Rev. Plant Physiol. 37, 165-186.
27. Epstein, L. & Lamport, D. T. A. (1984) Phytochemistry 23, 1242-1246.
28. Allen, A. K. (1983) in Chemical Taxonomy, Molecular Biology, and Function of Plant Lectins, eds. Goldstein, I. J. & Etzler, M. E. (Liss, New York), pp. 71-85.
29. Casalongué, C. & Pont Lezica, R. (1985) Plant Cell Physiol. 26, 1533-1539.
30. Koltunow, A. M., Treuttner, J., Cox, K. H., Wallroth, M. & Goldberg, R. B. (1990) Plant Cell 2, 1201-1224.
31. Jeffree, C. E. & Yeoman, M. M. (1981) New Phytol. 87, 463-471.
32. Ertl, H., Hallmann, A., Wenzl, S. & Sumper, M. (1992) EMBO J. 11, 2055-2062.
33. Showalter, A. M. & Rumeau, D. (1990) in Organization and Assembly of Plant and Animal Extracellular Matrix, eds. Adair, W. S. & Mecham, R. P. (Academic, New York), pp. 247-281.
34. Gilbert, W. (1987) Cold Spring Harbor Symp. Quant. Biol. 52, 901-906.