               The paired format in the Cambridge Speaking Tests
               Julie Norton

               Recent articles in this journal (Foot 1999; Saville and Hargreaves 1999) have
               focused on the advantages and disadvantages of the paired format of the
               Cambridge Speaking Tests. This article aims to contribute to the debate by
               considering how the pairing of candidates may impact upon the language
               sample produced and could affect the assessment process. Data from the
               Speaking Tests are presented which suggest that pairing potentially affects
               linguistic performance if one candidate has higher linguistic ability than the
               other, or if candidates know each other. Pairing also seems to affect the
               amount of talk produced, and depending on the gender make-up of the pair
               there appear to be qualitative differences in the respective participation in the
               tests of each partner.

Introduction
               Cambridge ESOL (part of the University of Cambridge Local
               Examinations Syndicate—UCLES) has adopted a number of procedures,
               such as examiner training and scripted rubrics, to ensure the Cambridge
               Speaking Tests are administered in a fair and professional manner.
               Whilst these measures clearly contribute to the success of the assessment
               process, the findings of this study suggest that other areas of test
               standardization, namely the pairing of candidates, merit more detailed
               consideration and further research. This is particularly the case since
               candidate pairings are often organized on a rather ad hoc basis at ‘open
               centres’, depending upon who is present on the day, whereas candidates
               from the same school who have attended a regular course of study may
               have had the opportunity to practise with their partner for the speaking
               tests over a period of several months.
               This article explores the effects of the paired format upon the discourse
               produced and candidate performance during the Cambridge Speaking
               Tests. It aims to make explicit how the pairing of candidates could result
               in the adoption of particular communication strategies and potentially
               affect candidate assessment. Given the numerous and interrelated
               variables that impinge on the assessment process, it is impossible to
               show a definitive correlation between pairing and the final marks
               awarded to respective candidates. Close inspection of interview data from
               the Speaking Tests does, however, allow us to gain insights into the
               relationship between particular pairings and linguistic output, and make
               tentative links to assessment.

                    More generally, this article also aims to raise awareness of the effects
                    one’s speech may have upon one’s interlocutor and the sociocultural
                    factors which impinge on this, be it in a testing situation or classroom
                    context, in the belief that deeper understanding of this phenomenon is
                    relevant and useful for all English language teaching professionals and
                    learners of English.
                    The article firstly summarizes the arguments for and against
                    the adoption of the paired format in the Cambridge Speaking
                    Tests. The background to this study, the methodology, and the
                    presentation and discussion of examples from the interview data
                    then follow.

The paired format of the Cambridge Speaking Tests
                    Numerous advantages and disadvantages of the paired format of the
                    Cambridge Speaking Tests, and oral tests in general, have been cited in
                    previous articles in this journal (Foot 1999; Saville and Hargreaves 1999;
                    Együd and Glover 2001). Saville and Hargreaves (1999) emphasize the
                    following advantages of the paired format of the Cambridge Speaking
                    Tests: candidates are more relaxed; they have the possibility of more
                    varied patterns of interaction during the tests; and this format can lead to
                    positive washback in the classroom by encouraging learners to interact
                    together in preparation for the test. Foot (1999), on the other hand,
                    highlights several potential problems: candidates may not necessarily
                    perform ‘better’ if they are more relaxed; nervous candidates could make
                    their partner feel more nervous; candidates who do not know each other
                    may feel more anxious about interacting with a stranger. In addition,
                    Foot claims that there is insufficient evidence about the effect of
                    differences in language ability, L1 background, social relationships, and
                    factors such as age, personality, and social class upon linguistic
                    performance in the paired interview format. Many of Foot’s points
                    appear valid and are supported to some extent by the findings of
                    this study. It would certainly be useful for Cambridge ESOL to
                    publish details of the research referred to by Saville and Hargreaves
                    (1999: 45) on the benefits of the paired interview format to allay
                    these concerns.

Background to the study
                    The data for this study were collected as part of my doctoral research on
                    the under-performance of Japanese candidates in the Cambridge
                    Speaking Tests. Hence, a large number of Japanese candidates
                    feature in the data sample (twenty-seven Japanese candidates and six
                    European candidates), and their cultural background and
                    communication styles cannot be overlooked when assessing the
                    particular pairings in the interview data which are presented below.
                    The data consist of transcriptions of twenty-seven Cambridge
                    Speaking Tests: ten First Certificate in English (FCE); ten Certificate
                    in Advanced English (CAE) and seven Certificate of Proficiency in
                    English (CPE). The original intention was to include ten CPE Speaking
                    Tests in the data sample, but there was a low number of Japanese
                    entries for this examination at the time of data collection, so this was
                    not possible.

The Cambridge Speaking Tests: description and procedures
                       An oral proficiency interview has been defined as ‘a sample of extended
                       discourse, as a hybrid of interview and conversational interaction, and as
                       an instance of communication across cultures’ (Ross and Berwick 1992:
                       160). This aptly describes the Cambridge Speaking Tests discussed in
                       this study. Each speaking test consists of an introduction, three separate
                       tasks and a closing. The tasks are carried out in the same order in each
                       speaking test and the examiner follows an ‘interlocutor frame’ for test
                       standardization purposes. According to Cambridge ESOL, ‘rubrics have
                       been extensively trialled and great care has been taken to ensure that both
                       candidates are treated equally’ (UCLES 1998: 19).
                       The FCE and CAE Speaking Tests are conducted in pairs (two candidates
                       and two examiners). One examiner acts as interlocutor whilst the other
                       examiner assesses the candidates. Only one paired interview (one
                       examiner and two candidates) features in the CPE Speaking Tests in this
                       data sample, because individual interviews (one examiner and one
                       candidate) were the more common format for this examination at the
                       time of data collection. This article, therefore, focuses on the FCE and
                       CAE data. It is worth noting, however, that the paired format has since
                       been adopted for CPE Speaking Tests (June 2003).

The examiners
                       Ten oral examiners, who speak British English as their first language,
                       participated in the FCE and CAE Speaking Tests in the combinations
                       shown in Tables 1 and 2.

Methodology
Data collection, description, and transcription conventions
                       The data sample consists of video recordings of the FCE Speaking Tests
                       and audio recordings of the CAE Speaking Tests. The video data were
                       recorded by UCLES during a piloting of the Revised FCE speaking test at
                       an examination centre in the United Kingdom in December 1995. The
                       CAE data were also collected during examining sessions in 1995–1996
                       in the United Kingdom. In all cases, filming and taping were carried out
                       as unobtrusively as possible. Candidates were informed by the respective
                       examination centres that the interviews would be recorded for research
                       purposes. The extent to which this affected candidates’ performance
                       cannot be determined definitively. As an experienced oral
                       examiner, I can verify that the recorded interviews do not seem any
                       different from other Cambridge Speaking Tests. The interview data were
                       later transcribed following the transcription conventions presented in
                       Psathas (1995). The examiner is identified as ‘E’ (or ‘E1’ and ‘E2’ in
                       interviews where two examiners are present) and the candidates by
                       initials in the transcriptions. Each Speaking Test is numbered and line
                       numbering is used as a transcription convenience.

                       Table 1  Speaking Test format
                       Examiners                          Candidates
                       2 females                          2 males (1 Japanese and 1 Swiss)
                                                          2 females (1 Japanese and 1 Swedish)
                                                          2 females (1 Japanese and 1 Spanish)
                       2 males                            2 females (both Japanese)
                                                          2 males (both Japanese)
                                                          1 male and 1 female (both Japanese)
                                                          1 male and 1 female (both Japanese)

                            Table 2  Speaking Test format
                            Examiners                            Candidates
                            1 male and 1 female                  2 females (both Japanese)
                                                                 2 females (1 Spanish and 1 Japanese)
                                                                 1 male and 1 female (French and Japanese respectively)
                            2 males                              1 male and 1 female (Korean and Japanese respectively)
                                                                 2 females (1 Finnish and 1 Japanese)
                                                                 2 females (both Japanese)
                            2 females                            2 females (1 Danish and 1 Japanese)
                                                                 1 male and 1 female (Swedish and Japanese respectively)

The data
The effects of the paired format
                            Certain examination centres pre-assign candidates to particular pairs, or
                            allow them to choose a partner before the day of the test. The pairs may
                            be given the opportunity to practise together during class time and are
                            often encouraged to prepare for the speaking test together outside school.
                            This is not true for all centres. In some cases, candidates from a
                            particular school may be required to take the speaking test at a larger
                            ‘open centre’ with candidates they do not know. In this case, pairing may
                            be randomly organized by the open centre co-ordinator immediately
                            prior to the speaking test. In this data sample, candidates who attended
                            class and prepared for the test together and candidates who met for the
                            first time immediately prior to the test are represented. In the FCE data
                            sample, candidates appear to have studied together, but it is unclear how
                            well individual candidates know each other. In the case of the CAE data,
                            all candidates claim to know each other well, with the exception of
                            candidates R.T. and A.L. (CAE.3) who, despite studying in the same class,
                            know each other by name only, and candidates M.S. and I.M. (CAE.5),
                            and M.K. and J.A. (CAE.8) who have met for the first time immediately
                            prior to the test.

Pairing and linguistic performance
                            In this study, particular pairings appear to have some effect upon
                            linguistic performance. The repetition of certain syntactic structures and
                            lexical items is noted in the contributions of particular pairs. Candidates
                            R.M. and T.K. (FCE.5), for example, employ conditional structures with
                            ‘if’ to link their contributions more frequently than other FCE candidates;
                            that is, they appropriate syntactic structures from each other’s discourse.
                            Examples of this are presented in Table 3.

                            Table 3  Appropriation of syntactic structures (See Appendix for key to transcription symbols.)
                            103. T.K.: . . . if I can speak English more
                            104. better/ erm (1.0) erm/ I can/ I can work (2.0) more . . .
                                 (FCE.5, L103–104)
                            265. R.M.: . . . so the/ if we go to a/ if we go to a place like this picture/ we can
                                 meet a lot of
                            266. people
                            274. T.K.: Yes/ but if I can/ if I/ if I can/ erm/ if I can be/ a F1 racer/ I want
                            339. T.K.: Uhm/ but if you go to
                                 New York/ you can see (points at visual material)
                            341. R.M.: But dangerous/ so:/ if you go to New York
                            366. R.M.: Uhm/ yeah/ but/ er/ if you go to Egypt
                            400. T.K.: If I go to Egypt/ to eat something
                            402. R.M.: Ah I see/ I see/ it’s very difficult/ okay/ (hhh) erm/ if I go to/ if I go
                                 to these three
                            403. countries . . .
                            442. T.K.: Er (1.0) during the week days/ erm/ I al-/ I always/ er/ I always
                                 under the stress/ if I on holiday/ I can relax


                            By the final stage of the speaking test, even the examiner’s syntax follows
                            this pattern:
                            Example 1
                            486. R.M.: . . . because if I/ er/ go/ if I go/ if I go on holiday with my
                                 family/ for example/ with
                            487. my parents (hhh) / the different age/ well the gap is very big
                            489. T.K.: ,Uhm uhm .
                            490. E: And if I came to: Japan on holiday
                            492. R.M.: Yeah
                            494. E: . . . erm/ what places would you recommend me to visit
                            It could, of course, be argued that certain tasks require candidates to
                            speculate and hypothesize, and that conditional structures are, therefore,
                            highly predictable. The high frequency of this pattern in this interview is
                            noteworthy, however, compared to its use in other interviews in this data
                            sample.
                            Appropriation also occurs at the lexical level, a feature which has been
                            noted in native speaker interaction (Tannen 1984). For this reason, being
                            paired with a candidate who has higher linguistic ability may be
                            beneficial for lower level candidates who are able to incorporate some of
                            their partner’s expressions into their own speech. Candidate M.T.
                            (FCE.2), a Japanese candidate, for example, employs ‘otherwise’ (L381)
                            immediately following her partner, candidate S.B.’s use of this item
                            (L329). It is worth noting that candidate S.B. is Swedish and is awarded
                            top marks of five on each assessment scale. Candidate M.T. achieves the
                            highest score of all the Japanese candidates in the FCE data sample with
                            top-band scores of five for Fluency and Interactive Communication,
                            and four on the other four assessment scales (Grammatical Accuracy,
                            Prosodic Features, Individual Sounds, Vocabulary Resource).
                            Example 2 below shows how ‘otherwise’ is first used by the Swedish
                            candidate in a discussion of the importance of feeling comfortable in
                            one’s work environment. Example 3 shows the Japanese candidate using
                            this lexical item in a discussion of class sizes a few lines later.
                            Example 2
                            326. M.T.: Erm/ every job/ I would say/ its the same thing in/ er/ every

                    328. S.B.: Erm/ you just/ you just have to like your job/ you just have to
                         enjoy going to work
                    329. otherwise
                    (FCE.2, L326–329, my emphasis)
                    Example 3
                    375. M.T.: Well/ if it’s kindergarten/ erm/ maybe less than twenty/
                         because you have to take
                    376. care of them/ all of them
                    378. E: Uhm
                    380. M.T.: . . . otherwise/ you know/ er/ some will hurt/ some will hurt
                         other pe-/ other/ other children . . .
                    (FCE.2, L368–381, my emphasis)
                    Similar examples of lexical appropriation are noted in the CAE interview
                    data but are not presented here due to word limit constraints.
                    The influence of one candidate’s performance upon the linguistic
                    behaviour of the other is also apparent when the Japanese candidate, R.T.,
                    follows the lead of her partner (a Finnish female candidate) and spells her
                    name in the opening stage of the test (CAE.3: L23), although she receives
                    no specific instructions from the examiner to do this.
                    Example 4
                    11. E: . . . my name’s G/ and my colleague/ M/ and your names are
                    13. A.L.: My name’s A/ [A]
                    15. E: [A]
                    17. A.L.: Yes/ A (spells it)
                    19. (2.0)
                    21. E: Now-
                    23. R.T.: My name is R (spells it)
                    (CAE.3, L11–23, my emphasis)

Pairing: psychological and gender issues
                    Ikeda (1998: 93) suggests that candidates in oral proficiency interviews
                    should be allowed to select their own partner to reduce anxiety levels.
                    The findings of this study support this view, although this small data
                    sample clearly cannot offer conclusive evidence of this. In interview
                    CAE.7, for example, candidate I.F. (Japanese) is paired with a close friend,
                    candidate M.J. (Danish). Both candidates are obviously at ease as they
                    joke, banter and refer to each other by nickname in the opening stage of
                    the test. This sets the tone for the whole interview, which is very relaxed
                    and characterized by lots of laughter. Candidate I.F. attains high scores of
                    seven (out of a possible eight) for the Interactive Communication and
                    Task Achievement assessment scales, and also receives a high score of six
                    for Fluency. Her total of thirty-seven marks out of a possible forty-eight
                    across all assessment scales places candidate I.F. in joint top
                     position with three other Japanese female candidates in this data sample.
                     Three of these four top-scoring Japanese females are paired with a friend
                     whom they claim to know well. (See Table 4.)
                     More talk is produced in interview CAE.7 than in any other CAE
                     interview (the mean number of words per CAE interview is 2552.75;
                     interview CAE.7 includes 3116 words) and the level of participation of
                     the respective candidates is fairly evenly balanced. (See Table 5.)
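                     Percentages of words and turns of the kind reported in Table 5 can be
                     obtained from a transcript by simple counting. The Python sketch below is
                     illustrative only and is not the procedure used in this study: the function
                     name `participation`, the whitespace tokenization, the reading of mean
                     utterance length as words per turn, and the invented sample turns are all
                     assumptions made for the example.

```python
from collections import Counter


def participation(turns):
    """Return {speaker: (pct_words, pct_turns, mean_words_per_turn)}.

    `turns` is a list of (speaker, text) pairs, one pair per turn.
    Assumption: a "word" is any whitespace-separated token containing at
    least one letter or digit, so a bare pause marker such as "/" is skipped.
    """
    words = Counter()
    turn_counts = Counter()
    for speaker, text in turns:
        tokens = [t for t in text.split() if any(c.isalnum() for c in t)]
        words[speaker] += len(tokens)
        turn_counts[speaker] += 1
    total_words = sum(words.values())
    total_turns = sum(turn_counts.values())
    return {
        s: (
            round(100 * words[s] / total_words, 2),
            round(100 * turn_counts[s] / total_turns, 2),
            round(words[s] / turn_counts[s], 2),
        )
        for s in turn_counts
    }


# Invented sample: examiner "E" and two hypothetical candidates "A" and "B".
sample_turns = [
    ("E", "what places would you recommend"),
    ("A", "if you go to Kyoto / you can see temples"),
    ("B", "yeah"),
    ("A", "and the food is good"),
]
shares = participation(sample_turns)
for speaker, (pct_words, pct_turns, mean_len) in shares.items():
    print(speaker, pct_words, pct_turns, mean_len)
```

                     Because the examiner’s turns are included in the totals, the candidates’
                     percentages in such a count do not sum to 100, as is also the case in Table 5.
                     Timed pauses such as (1.0) would be counted as words by this simple filter
                     and would need to be removed first.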
                     Whilst the high level of participation can certainly be attributed to the
                     higher linguistic abilities of candidates I.F. and M.J., it is also possibly
                     indicative of how well these two candidates get on outside the testing
                     situation: they are friends who enjoy each other’s company. This cannot
                     be proved to have any impact on the assessment of these candidates, and
                     all quantitative findings presented here must be interpreted with
                     caution due to the small sample size. These findings do, however,
                     perhaps indicate that further research into the potential benefits of
                     being paired with a ‘friend’ is necessary in the interests of test
                     Candidate M.K. (CAE.8), who is paired with a stranger, a Swedish male
                     candidate (J.A.) of the same age, at an open centre, appears nervous and
                     reticent. She has the lowest mean utterance length of all CAE candidates
                     and is one of the lowest scoring candidates. (See Table 4.) Table 6 reflects
                     the huge disparity in participation of the respective candidates in this
                     interview.
                     Candidate M.K.’s limited linguistic ability may be a major factor in her
                     lack of extended contributions and failure to develop the interaction.
                     Another possible explanation for candidate M.K.’s reticence, which is
                     supported in the literature, may be the potentially negative effect of this
                     pairing: ‘The Japanese are generally not inclined to state an opinion on
                     issues or discuss topics at length with a stranger’ (White 1989: 70). This
                     may have contributed to M.K.’s nervousness and reluctance to speak.

                     Table 4  Amount of talk and total scores
                     Candidate         Gender         Mean                     Nationality   Total marks
                                       (M/F)          utterance length                       awarded
                     M.K.              F               5.9                     Japanese      25
                      S.S.             F               6.5                     Japanese      34
                      S.K.             F               7.19                    Japanese      32
                      Y.S.             F               9.05                    Japanese      37
                     R.T.              F               9.39                    Japanese      19
                      A.K.             F               9.43                    Japanese      35
                      I.F.             F               9.55                    Japanese      37
                     K.W.              F              11.4                     Japanese      34
                      Y.N.             F              11.59                    Japanese      37
                     M.S.              F              21.19                    Japanese      37
                     J.A.              M              11.11                    Swedish       48
                      A.L.             F              12.66                    Finnish       25
                      I.M.             F              16.04                    Spanish       38
                     (* candidates who claim to be paired with someone they know well)

                     Table 5  Participation in interview CAE.7
                     Candidate                  Total % of words                   Total % of turns
                     I.F.                       36.78                              38.58
                     M.J.                       44.54                              40.84

                   Female Japanese candidates paired with male candidates of any
                   nationality in this data sample adopt a floor-supporting role in the three-
                   way discussion task by using more back-channelling tokens and allowing
                   their male partners to take the floor first. In three of the five interviews in
                   this data sample where male and female candidates are paired together,
                   the male candidates produce more talk. (See Table 7.) Again, this is
                   hardly conclusive, given the sample size, but there is evidence of more
                   equally balanced participation when two female Japanese candidates are
                   paired together in the data included in this study. (See Table 5 which is
                   representative of other female-female CAE pairings in this data sample.)
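                   The count of three interviews out of five can be checked directly against the
                   percentages of words reported in Table 7. The short Python sketch below does
                   this; the variable names are hypothetical, and the figures are copied from
                   Table 7.

```python
# Percentage of words per candidate in the five mixed-gender interviews,
# copied from Table 7 (F = female candidate, M = male candidate).
table7_words = {
    "FCE.5": {"F": 25.18, "M": 33.00},
    "FCE.6": {"F": 29.00, "M": 28.64},
    "CAE.2": {"F": 36.59, "M": 30.25},
    "CAE.6": {"F": 30.27, "M": 37.88},
    "CAE.8": {"F": 20.77, "M": 53.64},
}

# Interviews in which the male candidate produces the larger share of words.
male_led = [test for test, share in table7_words.items() if share["M"] > share["F"]]

print(f"{len(male_led)} of {len(table7_words)} interviews: {male_led}")
```

                   On these figures the male candidate produces more talk in FCE.5, CAE.6, and
                   CAE.8, while the female candidate produces slightly more in FCE.6 and CAE.2.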

Discussion         It is impossible to correlate one aspect of a candidate’s performance with
                   the awarding of a particular score in the speaking tests. The examiner’s
                   demeanour, gender, topic choice, or the candidate’s familiarity with a
                   task type are some of the many variables which could influence the type
                   of language elicited and the amount of talk produced. How this in turn
                   impacts upon the final assessment of a candidate is enormously complex.
                   It is unfeasible and beyond the scope of this study to explore these issues.
                   The data from this study can, however, offer insights into aspects of oral
                   testing which appear under-researched and ripe for further exploration.
                   In this respect, it would seem prudent to give greater consideration to the
                   pairing of candidates, particularly in cases where the L1 politeness
                   conventions of the candidates could affect participation.
                   Watanabe’s (1993) study of group discussions in Japanese reveals how
                   Japanese participants discuss procedural issues regarding turn allocation
                   before beginning the discussion. This is organized according to the
                   hierarchical status of the participants in order to protect the ‘face’ of the
                   most senior person by allowing him or her to take the floor last, once the
                   various views have been presented. Lazaraton (1991: 18) notes that the
                   turn-taking system may cause problems for non-native speakers in
                   speaking tests if this differs from the one operating in their native
                   culture. Female Japanese candidates could be hesitant to express their
                   views in the Cambridge Speaking Tests because they wish to maintain
                   ‘face’ and do not wish to disagree with their male interlocutor, be it a
                   male examiner or a male candidate. This could compromise their
                   performance on the Interactive Communication assessment scale of the
                   Cambridge Speaking Tests which rewards candidates who attempt to
                   initiate topics and elaborate upon their responses.

                   Table 6  Participation in interview CAE.8
                   Candidate                  Total % of words                   Total % of turns
                   M.K.                       20.77                              32.25
                   J.A.                       53.64                              44.2

                   Table 7  Male and female and marks awarded
                   Candidate        Test          Gender         Total %    Total %       Total
                                                  (M/F)          of words   of turns      score
                   T.K.             FCE.5         F              25.18      29.02         15/30
                   R.M.             FCE.5         M              33.00      33.30         17/30
                   Y.M.             FCE.6         F              29.00      33.51         22/30
                   S.A.             FCE.6         M              28.64      32.43         17/30
                   K.W.             CAE.2         F              36.59      38.71         34/48
                   A.T.             CAE.2         M              30.25      30.88         35/48
                   Y.N.             CAE.6         F              30.27      31.76         37/48
                   C.R.             CAE.6         M              37.88      31.38         38/48
                   M.K.             CAE.8         F              20.77      32.25         25/48
                   J.A.             CAE.8         M              53.64      44.20         48/48

                         It seems impossible to evaluate linguistic proficiency without taking into
                         account pragmatic and sociocultural competence which are also
                         necessary for successful communication in the target language. As van
                         Lier (1989) has pointed out, a ‘will-not talk’ candidate may be
                         confounded with a ‘cannot talk’ candidate which would most certainly
                         result in negative assessment in tests of spoken English. In defence of
                         Cambridge ESOL’s paired format, it should be noted that these speaking
                         tests are not devised as ‘stand-alone’ tests of oral proficiency. A
                         candidate’s results in the Cambridge speaking test must be considered in
                         relation to scores on other papers of the test to gain an overall view of his
                         or her communicative competence in English (Saville and Hargreaves
                         1999, personal communication). Further research is nonetheless
                         required to investigate these aspects of pairing upon linguistic
                         performance.

Conclusion               The data presented in this article raise awareness of potential problems
                         concerning the elicitation of language samples suitable for assessment
                         purposes and highlight areas which merit further research to ensure that
                         fair and appropriate testing procedures are adopted. Following Együd
                         and Glover (2001), it would seem worthwhile to find out student
                         preferences for paired or individual interviews in speaking tests. It would
                         also be interesting to investigate examiner perceptions of the advantages
                         and disadvantages of the paired format. Finally, this study suggests the
                         value of gathering more data on social and cultural factors which may
                         potentially influence linguistic performance and assessment, specifically,
                         with regard to pairing effects on candidate performance. The data
                         presented here represent one type of reliability evidence that can be
                         collected, and could, of course, be considered together with rater
                         reliability analyses conducted on the Cambridge Speaking Tests. Insights
                         into variables such as these in the testing process are useful not only for
                         examiners and test item writers but also for all English language teachers
                         who are engaged in the preparation of candidates for tests of spoken
                         English.
                         Final version received May 2004

Appendix
           Key to Transcription Symbols (based on Psathas, 1995)
           (xxx) = unclear speech
           /     = short pause
           (3.0) = 3 second pause
           ...   = holding the floor
           :::   = sound stretch
           -     = cut off
           , .   = soft speech
           (hhh) = laughter

