A Note on the Wilcoxon-Mann-Whitney Test for 2 x k Ordered Tables

Author(s): John D. Emerson and Lincoln E. Moses
Source: Biometrics, Vol. 41, No. 1 (Mar., 1985), pp. 303-309
Published by: International Biometric Society
BIOMETRICS 41, 303-309
March 1985

        A Note on the Wilcoxon-Mann-Whitney Test for 2 x k Ordered Tables
                               John D. Emerson
    Department of Mathematics, Middlebury College, Middlebury, Vermont 05753, U.S.A.
                               Lincoln E. Moses
    Division of Biostatistics, Stanford University School of Medicine,

Biological and medical investigations often use ordered categorical data. When two groups are to
be compared and the data for the groups fall in three or more ordered categories, the
Wilcoxon-Mann-Whitney (WMW) test uses information in the ordering to give a test that is usually
powerful against shift alternatives. However, such applications of WMW often involve distributions
for which extensive ties play an important role. Newly available computer programs for performing
exact tests give deeper insights into the characteristics of the exact WMW distributions and the
suitability of normal approximations. We offer practical advice, based on experience with published
biomedical data sets and on numerical studies of hypothetical ordered tables, for the use of WMW
and its normal approximations.

1. Introduction
Ordered categorical variables arise frequently in biomedical research. In some instances these
variables provide qualitative information that is intrinsically imprecise; in other situations they
arise from classifying measurements into ranges of measured values. Moses, Emerson, and Hosseini
(1984) identified 47 instances of ordered categorical variables appearing in 32 of the 168 articles
in Volume 306 (1982) of the New England Journal of Medicine. These investigators also reported that
the statistical analyses rarely used the inherent ordering; instead, they typically used a
chi-square test of homogeneity. Furthermore, articles often indicated that collapsing of ordered
categories was employed prior to calculating a chi-square statistic. An example illustrates these
points.
  Santoro et al. (1982) reported a randomized controlled clinical trial to compare two combination
drug treatments for advanced Hodgkin's disease. Patient response to treatment was assessed and
classified by tumor remission as complete, partial, or none. The results for the seventy-five
patients were as follows:
                                              Response
                                    Complete   Partial   None   Totals
Treatment with MOPP (a)                27         2        9      38
Treatment with MOPP and ABVD (b)       34         3        0      37
    (a) MOPP is Mechlorethamine, Vincristine, Procarbazine, and Prednisone.
    (b) ABVD is Doxorubicin, Bleomycin, Vinblastine, and Dacarbazine.

Key words: Exact test in contingency tables; Mann-Whitney test; [...] methods;
   Ordered categorical data; Wilcoxon rank sum test.
  In reporting the results of their statistical comparisons of the two treatments, Santoro et al.
clearly indicated that they had used chi-square tests to obtain P-values and that they had made a
continuity correction for a small subgroup. In these analyses they collapsed the table in two ways:

                         27      11         and         29         9
                         34       3                     37         0

Their paper presented P-values of .02 (without a continuity correction) for the first table and
.005 (with continuity correction) for the second table. Note that the first collapsed table
corresponds to categorizing patients as showing or failing to show complete remission and the
second categorizes them as having or not having any remission. The choice of collapsing has a
nonnegligible effect on the P-value, and this is true regardless of whether one employs a
continuity correction to chi-square.
  The Wilcoxon-Mann-Whitney (WMW) procedure provides an alternate analysis of the table of Santoro
et al. that incorporates the natural ordering of the response categories and avoids collapsing. It
gives a test of the hypothesis of no difference in response to treatment against the shift
alternative that one treatment affords a better response. The exact two-sided P-value using the
tied configuration is .010, if we agree to measure the extent of departures from the null
hypothesis by using the distance (in either direction) of the WMW statistic from the assumed mean.
A normal approximation gives a two-sided P-value of .012 (without continuity correction). The
statistical analysis indicates that an alternating drug combination of MOPP and ABVD is better than
MOPP alone, at significance level .01.
  An algorithm of Mehta, Patel, and Tsiatis (1984) makes the calculation of exact P-values for the
WMW statistic readily available even for fairly large tables. This paper makes use of their
algorithm in an examination of the application of WMW to 2 x k ordered tables. Other approaches to
ordered categorical data have also been considered in the literature, including those based on odds
ratios. We refer to Armitage (1955), Simon (1974), Andrich (1979), Patefield (1982), Agresti
(1983), and to the references therein.

2. The Exact Distribution and Its Approximations
One can describe the WMW test using either a Wilcoxon rank sum or Mann-Whitney $U$ statistic. We
choose the second formulation and adopt the notation of Armitage (1971, Chapter 13). Let
$x_1, \ldots, x_{n_1}$ denote a random sample from a random variable $X$, and let
$y_1, \ldots, y_{n_2}$ denote a random sample from $Y$. Then the Mann-Whitney statistic is

    $$U_{XY} = |\{(x_i, y_j) : x_i < y_j\}| + \tfrac{1}{2}\,|\{(x_i, y_j) : x_i = y_j\}|,$$

where $|S|$ is the number of elements in a set $S$.
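For a 2 x k ordered table the pair counts in this definition can be accumulated category by category rather than pair by pair. The following is a minimal sketch under the assumption that each row of the table lists the counts of one group in category order; the function name `mann_whitney_u` is illustrative.

```python
def mann_whitney_u(x_counts, y_counts):
    """U_XY = #{(x_i, y_j): x_i < y_j} + 0.5 * #{(x_i, y_j): x_i = y_j}.

    x_counts and y_counts are the two rows of a 2 x k ordered table,
    listed from the lowest category to the highest.
    """
    u = 0.0
    x_below = 0  # x observations in categories strictly below the current one
    for x_c, y_c in zip(x_counts, y_counts):
        u += y_c * x_below        # pairs with x_i < y_j
        u += 0.5 * x_c * y_c      # tied pairs count one half
        x_below += x_c
    return u

# Santoro et al. table, categories ordered complete, partial, none.
mopp = [27, 2, 9]
mopp_abvd = [34, 3, 0]
u_xy = mann_whitney_u(mopp, mopp_abvd)
u_yx = mann_whitney_u(mopp_abvd, mopp)
print(u_xy, u_yx, sum(mopp) * sum(mopp_abvd) / 2)
```

The two statistics always add to $n_1 n_2$, so either one can be referred to the null mean $n_1 n_2/2$.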
  Under the hypothesis that $X$ and $Y$ have identical distributions, $U_{XY}$ has mean
$n_1 n_2/2$; when ties are absent, $U_{XY}$ can assume an integer value from 0 through $n_1 n_2$
and is symmetrically distributed about its mean. When ties occur among the $x_i$ and the $y_j$, the
attainable values of $U_{XY}$ are integer or half-integer and need not be equally spaced. The exact
distribution depends on the configuration of ties and is no longer symmetric around $n_1 n_2/2$. It
typically exhibits more erratic behavior when there are ties (Lehman, 1961; Klotz, 1966; and Klotz
and Teng, 1977). Calculation of the exact null distribution of $U_{XY}$, although well understood
in principle (see Lehmann, 1975; § 1.4), has only recently become feasible for any but the smallest
data sets. A computer algorithm of Mehta et al. (1984) provides the exact calculations for most
practical data sets.
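To make the exact calculation concrete, here is a small sketch that enumerates the null distribution of $U_{XY}$ conditional on the observed tie configuration (the column totals). It is a plain dynamic program over categories, not the network algorithm of Mehta et al. (1984), and it is only meant for tables of modest size. The function names, and the choice of the smaller tail as the one-sided P-value, are our assumptions.

```python
from collections import defaultdict
from math import comb

def exact_u_distribution(col_totals, n1):
    """Exact null distribution of 2 * U_XY given the column totals and n1.

    Returns {2 * U_XY: probability}. Doubling keeps every key an integer.
    """
    # states[(m, u2)] = number of ways to place m first-group observations
    # in the categories seen so far with doubled statistic u2.
    states = {(0, 0): 1}
    for t in col_totals:
        nxt = defaultdict(int)
        for (m, u2), ways in states.items():
            for a in range(min(t, n1 - m) + 1):   # a of the t go to group 1
                b = t - a                         # the rest go to group 2
                nxt[(m + a, u2 + 2 * b * m + a * b)] += ways * comb(t, a)
        states = nxt
    total = comb(sum(col_totals), n1)
    ways_by_u2 = defaultdict(int)
    for (m, u2), ways in states.items():
        if m == n1:
            ways_by_u2[u2] += ways
    return {u2: ways / total for u2, ways in ways_by_u2.items()}

def exact_one_sided_p(x_counts, y_counts):
    """Smaller-tail exact P-value for the observed 2 x k table."""
    col_totals = [x + y for x, y in zip(x_counts, y_counts)]
    dist = exact_u_distribution(col_totals, sum(x_counts))
    obs, m = 0, 0
    for x_c, y_c in zip(x_counts, y_counts):
        obs += 2 * y_c * m + x_c * y_c
        m += x_c
    lower = sum(p for u2, p in dist.items() if u2 <= obs)
    upper = sum(p for u2, p in dist.items() if u2 >= obs)
    return min(lower, upper)

p_santoro = exact_one_sided_p([27, 2, 9], [34, 3, 0])
print(f"exact one-sided P = {100 * p_santoro:.2f}%")
```

For a tiny table such as rows (1, 8, 0) and (8, 0, 1), the tail can also be counted by hand: 82 of the 48620 equally likely allocations are at least as extreme as the observed one, which is the kind of check that makes a sketch like this easier to trust.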
  Normal approximations for the null distribution of $U_{XY}$ are well known [see Armitage (1971)
and Lehmann (1975)]. The approximations are very good when $n_1$ and $n_2$ are at least 10 and no
ties are present. Ties cause the variance of $U_{XY}$ to shrink; if $n = n_1 + n_2$ observations
take on $c$ distinct values and $t_j$ is the number of observations tied at the $j$th value, then
the variance is

    $$\operatorname{var}(U_{XY}) = \frac{n_1 n_2 (n + 1)}{12}
        - \frac{n_1 n_2}{12\, n (n - 1)} \sum_{j=1}^{c} \left(t_j^3 - t_j\right).$$
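As an illustration of how much the ties term matters, the sketch below evaluates the variance with and without it for the Santoro et al. table, whose column totals are 61, 5, and 9. The function name `u_variance` and its `tie_correction` flag are illustrative choices.

```python
def u_variance(n1, n2, col_totals, tie_correction=True):
    """Null variance of U_XY for a 2 x k table with the given column totals."""
    n = n1 + n2
    var = n1 * n2 * (n + 1) / 12.0
    if tie_correction:
        tie_sum = sum(t ** 3 - t for t in col_totals)
        var -= n1 * n2 * tie_sum / (12.0 * n * (n - 1))
    return var

var_tied = u_variance(38, 37, [61, 5, 9])
var_untied = u_variance(38, 37, [61, 5, 9], tie_correction=False)
print(f"with ties correction:    {var_tied:.1f}")    # roughly 4096.4
print(f"without ties correction: {var_untied:.1f}")  # roughly 8904.7
```

Here the correction more than halves the variance, so omitting it would substantially understate the standardized deviation of $U_{XY}$ from its mean.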

  Normal approximations can be made either with or without a correction for continuity. The
continuity correction is ordinarily made by shifting the value of the statistic being approximated
closer to the origin by one-half unit. In her study of small tables with ties, Lehman (1961)
concluded that the correction (using a half-unit shift) usually improves the normal approximation
and tends to give conservative P-values. Her findings supported the assertion of Kruskal and Wallis
(1952) that a continuity correction improves the approximation when a P-value is greater than .02,
but may worsen it otherwise. Lehmann (1975) recommends the use of the continuity correction in
general. He does not use it when ties between $x$'s and $y$'s produce a lack of equal spacing in
the values of $U_{XY}$; note that shifting the value by one-half unit no longer moves the test
statistic halfway to the next possible value.
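A sketch of the two normal approximations follows. It takes the observed statistic and its variance as given: for the Santoro et al. table, $U_{XY} = 543$, $n_1 = 38$, $n_2 = 37$, and the tie-corrected variance is about 4096.4, all of which follow from the definitions above. The function name is illustrative, and the half-unit shift is applied to the deviation from the null mean.

```python
import math

def normal_one_sided_p(u, n1, n2, var_u, continuity=False):
    """One-sided normal-approximation P-value in the observed direction."""
    dev = abs(u - n1 * n2 / 2.0)      # distance of U_XY from its null mean
    if continuity:
        dev = max(dev - 0.5, 0.0)     # shift one-half unit toward the mean
    z = dev / math.sqrt(var_u)
    return 0.5 * math.erfc(z / math.sqrt(2.0))   # standard normal upper tail

p_plain = normal_one_sided_p(543, 38, 37, 4096.4)
p_cc = normal_one_sided_p(543, 38, 37, 4096.4, continuity=True)
print(f"without c.c.: {100 * p_plain:.3f}%")
print(f"with c.c.:    {100 * p_cc:.3f}%")
```

The two values differ only in the third significant digit, and doubling the uncorrected one gives approximately the two-sided .012 quoted in the Introduction.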
  The normal approximation to the distribution of $U_{XY}$ is supported by a limit theorem:
$(U_{XY} - n_1 n_2/2)/[\operatorname{var}(U_{XY})]^{1/2}$ tends to be a standard normal variable as
$n_1$ and $n_2$ get large, provided that each $t_j/n$ is bounded away from 1 as $n$ approaches
infinity (see Lehmann, 1975, Appendix, for a proof). The result may suggest that, for finite
samples with a large proportion of ties in a particular category, the normal approximation need not
be good even for fairly large values of $n_1$ and $n_2$.
  A number of authors, including Lehman (1961), Klotz (1966), Lehmann (1975, Chapter 1), and Klotz
and Teng (1977), have recognized that normal approximations for WMW are often inaccurate when the
data are heavily tied. Until new algorithms made calculation of the exact distribution feasible, it
was difficult to study these approximations except in very small data sets. Algorithms, including
that of Mehta et al. (1984), now make it easier to compare exact WMW distributions with their
approximations and to recommend when exact calculations are needed.

3. Analyses of Published Biomedical Tables
In their survey of all 168 articles in Volume 306 (January-June 1982) of the New England Journal of
Medicine, Moses et al. (1984) identified all twenty-seven 2 x k ordered contingency tables that
were explicitly presented or could be easily reconstructed from information in the articles. Among
these, they identified six tables for which the results of statistical analyses were well reported,
and five other tables whose analyses were only partly reported but which offered useful
illustrations of ordered tables for which WMW could be used. We present the eleven tables here to
illustrate the exact calculations of P-values and their normal approximations, because we believe
that these tables are representative of those for which WMW is useful.
  The papers containing these tables almost always reported two-sided or omnibus P-values for the
analyses presented (usually chi-square); here we give one-sided P-values in order to circumvent
technical difficulties that arise in defining a two-sided P-value for a nonsymmetric distribution
(see Gibbons and Pratt, 1975). Table 1 presents the eleven published tables, their exact one-sided
P-values from a WMW analysis, and normal approximations for these P-values without and with a
continuity correction.
                                     Table 1
 Eleven 2 x k ordered contingency tables published in Volume 306 of The New England Journal of
 Medicine. One-sided P-values are provided for the exact Wilcoxon-Mann-Whitney test and using
 normal approximations without and with a continuity correction (c.c.).

Example       Frequencies          Totals   One-sided P-value as percent
                                              Exact    Normal  Normal with c.c.
   A       9    0    1    1            11      .044       .10         .11
           1    3    6    4            14
   B      21    0    2    0            23       7.4       5.0         5.2
          15    3    1    2            21
   C      27    2    9                 38       .41       .62         .64
          34    3    0                 37
   D       5    2   12                 19      .050      .042        .044
          20    5    4                 29
   E       4    3   11                 18      .048      .036        .040
          21    4    5                 30
   F      14   33                      47       3.4       1.9         3.0
          48  235                     283
   G      18    4   14                 36      20.8      17.9        18.1
           7    3    2                 12
   H      44   62   57                163     .0015     .0015       .0015
          19   53   89                161
   I       2    6    5    5            18       5.9       5.7         5.8
           2   10    6    0            18
   J       1    7   10                 18       .51       .38         .40
           8    6    4                 18
   K      26    3    3    6   90      128      47.4      47.4        47.5
          16    2    1   17   76      112

  The examples suggest that, for the tables with more than ten counts in each group, either normal
approximation does well for most purposes; still, the agreement between the exact tests and their
approximations is substantially worse than that for similar comparisons of groups free of ties. The
use of a continuity correction makes very little difference for these tables; the last two columns
of Table 1 are in close agreement.
  Example B illustrates that the adoption of a normal approximation instead of an exact test can
change the outcome of a level-.05 significance test. Note in Example F the discrepancy in P-values
despite the large frequencies. It is in part attributable to the strong asymmetry in the exact
hypergeometric distribution for this (collapsed) 2 x 2 table.
  We stress the importance of the ties correction factor in the variance for the normal
approximation; we have used it for all normal approximations presented in this paper. Without it,
the normal approximation to WMW for Example C gives a one-sided P-value (not shown in the table) of
4.6%, more than 10 times the correct value of .41%.

4. Analysis of Hypothetical Tables
We considered many hypothetical 2 x k tables in attempting to enlarge the insights we gained from
exploring the published data sets. Table 2 presents results for ten of these tables; we selected
some of these examples to illustrate that, when many ties occur, normal approximations to WMW can
do somewhat worse than the data sets of Table 1 seem to suggest. For the first seven examples, no
single category contains more than half of the total frequencies. One category contains most of the
data in each of the last three examples.
                                       Table 2
 Ten 2 x k ordered contingency tables selected to compare exact one-sided P-values with normal
 approximations. The examples illustrate relatively large differences between exact P-values and
 those obtained from normal approximations.

Example         Frequencies               Totals   One-sided P-value as percent
                                                   Exact    Normal  Normal with c.c.
   A       1    8    0                       9       .17       .31         .36
           8    0    1                       9
   B       2    6    1                       9       2.2       1.0         1.1
           7    2    0                       9
   C       3    8    1                      12       1.5        .7          .8
           9    3    0                      12
   D       4   10    1                      15       2.4       1.3         1.4
          10    5    0                      15
   E       3    9    3    0                 15       .49       .63         .67
          13    3    0    2                 18
   F      11   17    1                      29       7.0       4.9         5.1
          17   12    0                      29
   G      10   10   10   10   20            60       1.7       1.5         1.5
          30   30   30   30   20           140
   H      18    1    1    0                 20       5.3       7.6         7.8
          15    0    0    5                 20
   I      50    1    1    1    0    0       53       4.5       7.0         7.1
          45    0    0    1    2    4       52
   J       1   50    2    1    0    0       54       4.8       7.1         7.2
           0   45    0    1    2    3       51

  Examples A, B, and C illustrate tables with $n_1$ and $n_2$ at 12 or less; in these tables, the
normal approximation gives P-values too high by a factor of approximately 2 or too low by a factor
of approximately 2. Examples D-G show that, even for much larger tables, the normal approximation
to the WMW P-value can differ from the exact value by a nonnegligible amount in either direction;
for these examples the normal approximation usually produces P-values that are within 50% of the
exact value. The last three examples show that, when most counts fall in a single category, the
normal approximation can lead to a different conclusion from the exact value even when 50 or more
observations are in each group. Thus, investigators may want to use exact WMW P-values for some
relatively large data sets, when most observations fall in a single category.
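The comparisons in this paragraph can be tabulated directly from the Exact and Normal columns of Table 2. In the sketch below, "within 50%" is read as a relative error of at most one half of the exact value, and a "different conclusion" as the two P-values falling on opposite sides of 5%; both readings are our assumptions about the intended meaning.

```python
# (exact, normal without c.c.) one-sided P-values in percent, from Table 2.
table2 = {
    "A": (0.17, 0.31), "B": (2.2, 1.0), "C": (1.5, 0.7), "D": (2.4, 1.3),
    "E": (0.49, 0.63), "F": (7.0, 4.9), "G": (1.7, 1.5), "H": (5.3, 7.6),
    "I": (4.5, 7.0), "J": (4.8, 7.1),
}

outside_50 = []    # normal P-value more than 50% away from the exact one
crosses_5 = []     # exact and normal P-values disagree at the 5% level
for label, (exact, normal) in table2.items():
    ratio = normal / exact
    if abs(ratio - 1.0) > 0.5:
        outside_50.append(label)
    if (exact < 5.0) != (normal < 5.0):
        crosses_5.append(label)
    print(f"{label}: normal/exact = {ratio:.2f}")

print("outside 50%:", outside_50)
print("conclusion changes at 5%:", crosses_5)
```

On this reading the small tables A-C and the heavily concentrated table I fall outside the 50% band, while a level-.05 test would change its verdict for tables F, I, and J.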

5. Recommendations
Our examination of published tables from the biomedical literature, consideration of many
hypothetical tables, theoretical results, and an informal review of relevant literature lead us to
offer the following recommendations for the use of WMW for analyzing ordered 2 x k contingency
tables:
  (i) We recommend the exact calculations unless the group sizes, $n_1$ and $n_2$, are 10 or
      more. We make this recommendation regardless of the extent of ties, and we refer to
      Mehta et al. (1984) for a suitable algorithm.
 (ii) We recommend that, when ties appear in the data, the ties correction to the variance
      always be used when applying the normal approximation.
(iii) For ordered tables with larger group sizes and with many ties, a normal approximation
      usually provides a P-value that is within 50% of the exact value. When a P-value is
      close to a traditional significance level (for example, 5%), investigators may prefer to
      report an exact value.
 (iv) When over half of all observations fall in a single category, an approximate P-value
      may be somewhat unreliable, even when both $n_1$ and $n_2$ are larger than 10.
  (v) A continuity correction for a normal approximation makes little difference when both
      $n_1$ and $n_2$ are 10 or more. We recommend against its use for WMW with ordered
      tables because of the unequal spacing of values of the WMW statistic.
  We believe that the Wilcoxon-Mann-Whitney procedure for investigating shift alternatives is often
well suited to analyzing a 2 x k ordered table. Our recommendations should aid investigators in the
practical aspects of this analysis.
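The recommendations above can be summarized as a simple screening rule. The sketch below is one possible encoding, not a procedure given by the authors: the function name, the returned wording, and the exact thresholds (group size below 10, one category holding more than half of all observations) are our reading of items (i), (ii), (iv), and (v).

```python
def recommend_analysis(x_counts, y_counts):
    """Suggest how to compute a WMW P-value for a 2 x k ordered table."""
    n1, n2 = sum(x_counts), sum(y_counts)
    n = n1 + n2
    largest_category = max(x + y for x, y in zip(x_counts, y_counts))
    if min(n1, n2) < 10:
        # Item (i): small groups call for the exact calculation.
        return "exact: a group has fewer than 10 observations"
    if largest_category > n / 2:
        # Item (iv): heavy concentration makes the approximation unreliable.
        return "exact: over half of all observations fall in one category"
    # Items (ii) and (v): always correct the variance for ties, and skip
    # the continuity correction; item (iii) still applies near 5%.
    return "normal approximation with ties correction and no continuity correction"

print(recommend_analysis([1, 8, 0], [8, 0, 1]))                    # Table 2, A
print(recommend_analysis([50, 1, 1, 1, 0, 0], [45, 0, 0, 1, 2, 4]))  # Table 2, I
print(recommend_analysis([9, 0, 1, 1], [1, 3, 6, 4]))              # Table 1, A
```

Item (iii) is not automated here: when an approximate P-value lands near a conventional threshold, reporting the exact value remains a matter for the investigator's judgment.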

We thank members of the Harvard Study Group on Statistics in the Biomedical Sciences who offered
valuable comments; they include John Bailar, Graham Colditz, Hossein Hosseini, Katherine Godfrey,
Robert Lew, Philip Lavori, Thomas Louis, Frederick Mosteller, Taylor-Halvorsen, and John
Williamson. We also benefited from the advice of David Tritchler, Cyrus Mehta, and David Hoaglin.
Dr Mehta made available to us the program of Mehta, Patel, and Tsiatis (1984) which gives exact
calculations of WMW. We thank Mary Schaefer and Marilyn Thorpe for preparing this manuscript.
  Preparation of this manuscript was facilitated by Grant RF-79026 from the Rockefeller Foundation.

Biological and medical investigations often use ordered qualitative data. When two groups must be
compared and the data fall into three or more classes, the Wilcoxon-Mann-Whitney (WMW) test takes
the ordering into account and is usually powerful against shift-type alternatives. However, such
applications of WMW often involve distributions in which ties play an important role. New computer
programs that perform exact tests give a better understanding of the characteristics of the exact
WMW distributions and of the suitability of the normal approximations. We offer practical advice,
based on experience with published biomedical data and on numerical studies of ordered tables, on
the use of WMW and its normal approximations.

Agresti, A. (1983). Testing homogeneity for ordinal categorical variables.
   Biometrics
Andrich, D. (1979). A model for contingency tables having an ordered response
   classification. Biometrics 403-415.
Armitage, P. (1955). Tests for linear trends in proportions and frequencies.
   Biometrics 11, 375-386.
Armitage, P. (1971). Statistical Methods in Medical Research. New York: Wiley.
Gibbons, J. D. and Pratt, J. W. (1975). P-values: Interpretation and
   methodology. The American Statistician 20-25.
Klotz, J. H. (1966). The Wilcoxon, ties, and the computer. Journal of the
   American Statistical Association 772-787.
Klotz, J. H. and Teng, J. (1977). One-way layout for counts and the exact
   enumeration of the Kruskal-Wallis distribution with ties. Journal of the
   American Statistical Association 72,
Kruskal, W. H. and Wallis, W. A. (1952). Use of ranks in one-criterion
   variance analysis. Journal of the American Statistical Association 47,
   583-621.
Lehman, S. Y. (1961). Exact and approximate distributions for the Wilcoxon
   statistic with ties. Journal of the American Statistical Association 56,
   293-298.
Lehmann, E. L. (1975). Nonparametrics: Statistical Methods Based on Ranks.
   San Francisco: Holden-Day.
Mehta, C. R., Patel, N. R., and Tsiatis, A. A. (1984). Exact significance
   testing to establish the equivalence of two treatments being compared on
   the basis of ordered categorical data. Biometrics 40, 819-825.
Moses, L. E., Emerson, J. D., and Hosseini, H. (1984). Analyzing data from
   ordered categories. New England Journal of Medicine 311, 442-448.
Patefield, W. M. (1982). Exact tests for trends in ordered contingency tables.
Santoro, A., Bonadonna, G., Bonfante, V., and Valagussa, P. (1982).
   Alternating drug combinations in the treatment of advanced Hodgkin's
   disease. The New England Journal of Medicine 306,
Simon, G. (1974). Alternative analyses for the singly-ordered contingency
   table. Journal of the American Statistical Association 69, 971-976.

                        Received                    1984.
