
A Corpus-based Study of Connectors: Research
from the CAS Learner Corpus of English Essays




             Haiyang Ai, Gong Peng
Graduate University, Chinese Academy of Sciences
Outline of the talk

   Introduction

   Previous Studies

   Methodology and Corpus Building

   Results and Discussion

   Conclusion and Pedagogical Implication
Definition of connectors

   Connectors are devices used to state the
    relationship between units of discourse
    (Biber et al, 1999)

   Including conjunctions, some adverbs (e.g.
    firstly, namely, alternatively), and some
    prepositional phrases (e.g. in brief, in fact,
    of course)
Classification of connectors
   Quirk et al’s (1985) framework

    A Comprehensive Grammar of the
      English Language

   Addition of a corroborative category
    - (Granger & Tyson, 1996)
    - (Altenberg & Tapper, 1998)
Quirk et al’s (1985) framework
   listing
       enumerative       e.g. for a start, finally
       additive
           equative      e.g. in the same way, likewise
           reinforcing   e.g. moreover, further
   summative             e.g. in sum, altogether
   appositive            e.g. for example, namely
   resultive             e.g. as a result, consequently
   inferential           e.g. therefore, in that case, otherwise
   contrastive
       reformulatory     e.g. more precisely, rather
       replacive         e.g. better, again
       antithetic        e.g. by contrast, instead
       concessive        e.g. in any case, however
   transitional
       discoursal        e.g. by the way, incidentally
       temporal          e.g. in the meantime, meanwhile
Connectors investigated (68 items)
   Listing:
    first, second, third, firstly, secondly, thirdly,
    finally, furthermore, in addition, moreover,
    lastly, last but not least, to begin with,
    in the first place, in the second place,
    similarly, for one thing, for another
   Summative:
    to sum up, to conclude, in summary, in short,
    in brief, in conclusion, overall, all in all,
    altogether
   Appositive:
    that is, that is to say, in other words, for
    instance, for example, namely, e.g. (eg),
    i.e. (ie)
Connectors investigated (68 items)
   Resultive:
    consequently, hence, therefore, thus, as a
    result, as a consequence, in consequence
   Inferential: otherwise, in that case
   Contrastive:
    however, although, (even) though, on the other
    hand, instead, after all, on the contrary, in
    contrast, besides, nevertheless, anyway, still,
    by contrast, nonetheless, alternatively
   Transitional:
    meanwhile, eventually, subsequently, originally
   Corroborative:
    actually, in fact, of course, indeed, apparently
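A minimal sketch of how items like those listed above could be counted automatically. It is an illustration only, not the procedure used in the study: the list below is a small subset of the 68 connectors, and the function name count_connectors is hypothetical.

```python
import re
from collections import Counter

# Illustrative subset of the connectors investigated, not the full list.
CONNECTORS = [
    "first", "firstly", "however", "therefore",
    "for example", "in fact", "on the other hand",
]

def count_connectors(text, connectors=CONNECTORS):
    """Count whole-word, case-insensitive matches of each connector."""
    lowered = text.lower()
    counts = Counter()
    for connector in connectors:
        # \b keeps "first" from matching inside "firstly".
        pattern = r"\b" + re.escape(connector) + r"\b"
        counts[connector] = len(re.findall(pattern, lowered))
    return counts

sample = "First, the data are real. However, in fact, results differ. Firstly, we count."
print(count_connectors(sample))
```

Plain string matching cannot tell a connector from the same word in another function (e.g. "first" in "the first essay", or "still" as a time adverb), so hits of this kind still need manual checking.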
Rationales to use corpus data
   Corpus data are real and authentic =>
    empirical study
   Combine the intuitions of many, hence more
    objective (McEnery & Wilson, 2001)
   Corpora are precious resources for testing
    out linguistic hypotheses (Meyer, 2002)
   Learner corpus serves as the meeting
    point of corpus linguistics and SLA
    (Granger 1998)
    => pioneer: Sylviane Granger, ICLE
Research questions
   What is the semantic distribution of connectors?

   What are the 10 most frequently used
    connectors?

   Which connectors are overused?

   What are the differences and similarities
    compared with related studies, and why
    (universal features vs. transfer-related features)?
Hypothesis
   Hypothesis:
       PhD students at GUCAS would overuse
    connectors in their English writing

   Formulated based on
      Previous studies from HK and Taiwan
       (Crewe 1990, Field & Yip 1992, Milton &
       Tsang 1993, Bolton et al 2002, Chen
       2006)

       The authors’ own observations
Significance
   Systematic, corpus-based connector
    studies of PhD students’ writing at
    GUCAS => shed some light on the
    persistent cohesion & coherence
    problems in ESL/EFL writing

   Quantitative analysis can give teachers
    (esp. at GUCAS) a better idea of
    what needs to be done

   The construction of the CASCLEE
    computer learner corpus itself (Resources)
Previous corpus-based studies
   Milton & Tsang (1993)
       high ratio of overuse of the entire range of
        connectors (HKUST vs. Brown, LOB)

   Granger & Tyson (1996)
       108 connectors, CIA method
       overuse <= L1 transfer

   Altenberg & Tapper (1998)
       timed + untimed essays
       underuse (resultive, contrastive) <=
        preference for less formal connectors
Previous corpus-based studies
   Bolton et al (2002)
        Overuse exists in both groups (ICE-HK vs.
        ICE-GB)

       Raised 3 methodological issues

   Chen (2006)
       Most recent, published in IJCL, Taiwanese EFL
        learners

       Connectors slightly overused

       Increase learners’ awareness of register differences
Corpus building
   Corpus name: CASCLEE - CAS Corpus of
    Learner English Essays
   Corpus Size: 494 essays, 120,836 words,
    covering timed and untimed writing
   Data analysis:
    WordSmith Tools 4.0 + manual extraction
   Sampling & Representativeness
   Learner Background & Register of text
Method: CIA
   Contrastive interlanguage analysis
    (Granger 1996)
       L2 vs. L1

       L2 vs. L2

   Reference corpora

       Informative Writings of BNC Sampler
        Corpus (L1)

       The ICLE French Subcorpus (L2)
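A sketch of how an L2 vs. L1 comparison could be tested for overuse. The slides do not say which statistic was used; the log-likelihood ratio below is a common choice in corpus comparison and is an assumption here, as are the function name log_likelihood, the counts, and the reference corpus size in the usage example.

```python
import math

def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood ratio (G2) for one item observed in two corpora."""
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected counts if the item were equally frequent in both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

# Hypothetical counts: 120 hits in the 120,836-word learner corpus
# against 40 hits in a reference corpus assumed to hold 200,000 words.
g2 = log_likelihood(120, 120_836, 40, 200_000)
overused = 120 / 120_836 > 40 / 200_000
print(round(g2, 2), "overuse" if overused else "underuse")
```

With one degree of freedom, a G2 value above 3.84 corresponds to p < 0.05. The direction (overuse or underuse) has to be read from the relative frequencies, since G2 itself is never negative.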
Overall frequencies (normalised)

   Overall connector usage in the three corpora (per 10,000 words)

   Corpus                       Frequency
   CASCLEE                          131.9
   BNC Sampler-Informative           46.7
   ICLE-French                       99.5
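Normalisation puts corpora of different sizes on a common base. A minimal sketch, assuming the usual definition (raw count divided by corpus size, multiplied by the base); the function name and the raw count in the usage line are hypothetical and not taken from the study.

```python
def normalised_frequency(raw_count, corpus_size, base=10_000):
    """Occurrences per `base` running words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count / corpus_size * base

# Hypothetical raw count of 1,000 hits in the 120,836-word CASCLEE.
per_10k = normalised_frequency(1000, 120_836)
per_100k = normalised_frequency(1000, 120_836, base=100_000)
print(round(per_10k, 1), round(per_100k, 1))
```

The same function covers both bases used in the talk: per 10,000 words for the overall figures and per 100,000 words for the semantic categories.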
Semantic distribution

   Semantic distribution of connectors in the three corpora
   (normalised frequency, NF, per 100,000 words)

   Category         CASCLEE   BNC Sampler-Informative   ICLE-French
   listing            577.6                      44.7         116.1
   summative           77.8                       4.7          28.4
   appositional       116.7                      53.6         196.5
   resultive           84.4                      75.0         137.2
   inferential         18.2                       8.1          12.5
   contrastive        322.8                     192.8         264.7
   transitional        11.6                      25.0          14.2
   corroborative      110.1                      62.7         225.9
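A short sketch that works with the per-category figures on this slide: it sums each corpus column and computes a simple CASCLEE-to-BNC ratio per category. The ratio is only an illustrative descriptive measure, not a statistic reported in the talk, and the variable names are hypothetical.

```python
# Normalised frequencies per 100,000 words, in the order
# (CASCLEE, BNC Sampler-Informative, ICLE-French).
RATES = {
    "listing":       (577.6, 44.7, 116.1),
    "summative":     (77.8, 4.7, 28.4),
    "appositional":  (116.7, 53.6, 196.5),
    "resultive":     (84.4, 75.0, 137.2),
    "inferential":   (18.2, 8.1, 12.5),
    "contrastive":   (322.8, 192.8, 264.7),
    "transitional":  (11.6, 25.0, 14.2),
    "corroborative": (110.1, 62.7, 225.9),
}
CORPORA = ("CASCLEE", "BNC Sampler-Informative", "ICLE-French")

# Column totals per 100,000 words, one per corpus.
totals = [round(sum(row[i] for row in RATES.values()), 1) for i in range(3)]

# Ratio above 1 means the category is more frequent in CASCLEE than in BNC.
ratio_vs_bnc = {cat: row[0] / row[1] for cat, row in RATES.items()}

for name, total in zip(CORPORA, totals):
    print(f"{name}: {total} per 100,000 words")
for cat, ratio in sorted(ratio_vs_bnc.items(), key=lambda kv: -kv[1]):
    print(f"{cat}: {ratio:.2f}")
```

The column totals come to 1319.2, 466.6 and 995.5 per 100,000 words, which agrees with the overall figures of 131.9, 46.7 and 99.5 per 10,000 words. Listing and summative connectors show the largest ratios against the BNC Sampler, while transitional connectors are the only category with a ratio below 1.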
Top 10 most frequently used connectors

 Rank      CASCLEE        BNC Sampler-info.     ICLE-French

  1           first           however              indeed

  2          second           although            however

  3         however             thus              therefore

  4         secondly        (even) though         of course

  5       for example         therefore          moreover

  6         although         for example        for example

  7       (even) though       of course         for instance

  8          finally           indeed              in fact

  9          firstly           instead              thus

  10        of course        in addition      on the other hand
Quantitative difference: Overuse

   Overused connectors
       Group A (see Table 4)

       Group B (see Table 5)
Comparing with related studies

   Altenberg & Tapper (1998)
    Overuse of furthermore, for instance, still,
    of course (CASCLEE also)

   Bolton et al (2002)
    overuse exists in both ICE-HK & ICE-GB

   Chen (2006)
    slightly overused
Major findings

   PhD students overused a whole
    range of connectors (hypothesis
    supported)

   They significantly overused listing
    and summative connectors

   Overuse of connectors exists in both
    CASCLEE and the ICLE French subcorpus
Conclusion
   Objectives and contributions
       Build the CASCLEE learner corpus
       Analyze connectors based on Quirk et
        al’s (1985) framework
   Methodology: contrastive interlanguage
    analysis
       L2 vs. L1 (CASCLEE vs. BNC Sampler-info)
       L2 vs. L2 (CASCLEE vs. ICLE-French)
Pedagogical Implication
   Pedagogical implication
       Focus on contrastive, resultive and appositional
        connectors, over 70%
       Listing connectors should be addressed
       Correct forms of connectors

   Looking forward…
       More large-scale, corpus-based studies on EFL
        learners’ connector usage
       Probe into the possible causes for certain connector
        usage patterns
The End !

						