ROC Curves & Wilcoxon and Mann-Whitney Tests

             Lindsay Jacks
         Tutorial Presentation
   CHL 5210 Categorical Data Analysis
          October 16th, 2007




Outline
   Binary Classification Model

   ROC Curve

   Area under the ROC Curve

   Nonparametric Methods

       Mann-Whitney Test

       Wilcoxon Signed-Rank Test

   SAS Code

Binary Classification Model
Confusion Matrix:
   True Positive
       The actual value is positive and it is classified as positive

   False Negative (Type II Error)
       The actual value is positive but it is classified as negative

   True Negative
       The actual value is negative and it is classified as negative

   False Positive (Type I Error)
       The actual value is negative but it is classified as positive

Evaluation Metrics
   True Positive Rate (TPR)
       Positives correctly classified / Total positives


    = Sensitivity


   False Positive Rate (FPR)
       Negatives incorrectly classified / Total negatives

    = 1 - Specificity
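As a minimal sketch of the two rates defined above, the following computes TPR and FPR from the four confusion-matrix counts. The function name `rates` and the counts are hypothetical, chosen only for illustration.

```python
def rates(tp, fn, tn, fp):
    """Return (TPR, FPR) from the four confusion-matrix counts."""
    tpr = tp / (tp + fn)  # positives correctly classified / total positives
    fpr = fp / (fp + tn)  # negatives incorrectly classified / total negatives
    return tpr, fpr


# Hypothetical counts, for illustration only.
tpr, fpr = rates(tp=40, fn=10, tn=45, fp=5)
print(f"TPR (sensitivity)     = {tpr:.2f}")
print(f"FPR (1 - specificity) = {fpr:.2f}")
```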

ROC Curve
Receiver Operating Characteristic (ROC) curve:

   A technique for visualizing, organizing and selecting
    classifiers based on their performance

   Two-dimensional graph in which the TPR is plotted on the Y
    axis and the FPR is plotted on the X axis

    Sensitivity vs. (1 – Specificity)

   Depicts relative tradeoffs between benefits (true positives)
    and costs (false positives)
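One way to see how the curve arises is to sweep a classification threshold over the predicted scores and record (FPR, TPR) at each step. The sketch below assumes a case is called positive when its score is at or above the threshold; the function name `roc_points` and the scores and labels are made up for illustration.

```python
def roc_points(scores, labels):
    """Return ROC points (FPR, TPR), one per distinct threshold.

    labels: 1 = positive, 0 = negative. A case is classified as
    positive when its score is >= the threshold (an assumed rule).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing is positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical predicted probabilities and true classes.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

roc = roc_points(scores, labels)
for fpr, tpr in roc:
    print(f"FPR = {fpr:.2f}   TPR = {tpr:.2f}")
```

Lowering the threshold can only add positives, so the points move from (0, 0) toward (1, 1).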

ROC Curve
   The relationship between sensitivity and specificity can be
    described in the graph below:



   The best possible prediction
    method produces a point in
    the upper left corner
    representing 100% sensitivity
    and 100% specificity


   If a diagnostic procedure
    has no predictive value, the
    ROC curve is the diagonal
    line from (0, 0) to (1, 1):
    sensitivity equals
    1 - specificity at every
    threshold


ROC Space
   Each prediction result or
    one instance of a confusion
    matrix represents one point
    in the ROC space

   A completely random guess
    gives a point along the
    diagonal line (B)

   Points above the diagonal
    line (A, C’) indicate good
    classification results

   Points below the diagonal
    line (C) indicate results
    worse than random guessing;
    inverting the predictions
    reflects such a point to
    above the diagonal (C’)
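A small sketch of how a point in ROC space can be read. Swapping every predicted class turns a point (FPR, TPR) into (1 - FPR, 1 - TPR), which moves a point below the diagonal to above it. The coordinates used for C are hypothetical; the original figure's values are not available here.

```python
def describe(fpr, tpr):
    """Say where a classifier's (FPR, TPR) point lies relative to the diagonal."""
    if tpr > fpr:
        return "above the diagonal (better than random)"
    if tpr < fpr:
        return "below the diagonal (worse than random)"
    return "on the diagonal (no better than random)"


def invert(fpr, tpr):
    """ROC point obtained by swapping every predicted class."""
    return (1 - fpr, 1 - tpr)


c = (0.75, 0.25)       # hypothetical point below the diagonal
c_prime = invert(*c)   # its mirror image above the diagonal

print("C :", c, describe(*c))
print("C':", c_prime, describe(*c_prime))
```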

Area under ROC curve (AUC)
   The area under the ROC curve depends on how much the test-score
    distributions of the two groups (normal and diseased) overlap;
    here each is drawn as a normal curve

   The greater the overlap of the
    curves, the smaller the area
    under the ROC curve (the lower
    the predictive power of the test)

   The area of overlap indicates
    where the test cannot distinguish
    normal from disease

   When the normal distribution
    curves overlap totally, the ROC
    curve turns into a diagonal line

Area under ROC curve (AUC)
   To compare classifiers we may want to reduce the ROC
    performance to a single scalar value representing expected
    performance
                       Calculate the AUC

   Since the AUC is a portion of the area of the unit square, its
    value will always be between 0 and 1

   However, because random guessing produces the diagonal
    line between (0, 0) and (1, 1), which has an area of 0.5, no
    realistic classifier should have an AUC less than 0.5

   An ideal classifier has an area of 1
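The two reference values above (0.5 for the diagonal, 1 for an ideal classifier) can be checked with the trapezoidal rule, one common way to compute the area under an empirical ROC curve. This is a sketch; the function name `auc_trapezoid` and the `example` points are illustrative assumptions.

```python
def auc_trapezoid(points):
    """Area under a piecewise-linear ROC curve given as (FPR, TPR) points."""
    pts = sorted(points)  # order by FPR, then TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # one trapezoid per segment
    return area


diagonal = [(0.0, 0.0), (1.0, 1.0)]            # random guessing
ideal = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]   # passes through the upper left corner
example = [(0.0, 0.0), (0.0, 0.5), (0.25, 0.5), (0.25, 0.75),
           (0.5, 0.75), (0.5, 1.0), (1.0, 1.0)]  # hypothetical classifier

print("diagonal:", auc_trapezoid(diagonal))
print("ideal:   ", auc_trapezoid(ideal))
print("example: ", auc_trapezoid(example))
```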


Area under ROC curve (AUC)
   Important statistical property: AUC is equivalent to the
    probability that the classifier will rank a randomly chosen
    positive instance higher than a randomly chosen negative
    instance


   This probability equals the
    Mann-Whitney U statistic
    divided by the number of
    positive-negative pairs
    (n1 × n2)
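The link between the AUC and the Mann-Whitney statistic can be checked numerically. The sketch below computes the AUC twice: by counting positive-negative pairs in which the positive scores higher (ties count one half), and from the rank sum T of the positives via U = T - n1(n1+1)/2. The helper names and the scores are illustrative assumptions.

```python
def average_ranks(values):
    """Ranks from lowest to highest; tied values get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def auc_by_pairs(pos, neg):
    """Share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_ranks(pos, neg):
    """AUC as U / (n1 * n2), with U derived from the positives' rank sum."""
    ranks = average_ranks(list(pos) + list(neg))
    t = sum(ranks[:len(pos)])               # rank sum of the positives
    u = t - len(pos) * (len(pos) + 1) / 2   # Mann-Whitney U
    return u / (len(pos) * len(neg))


# Hypothetical scores for the positive and negative cases.
pos = [0.9, 0.8, 0.6, 0.4]
neg = [0.7, 0.55, 0.3, 0.2]

auc_pairs = auc_by_pairs(pos, neg)
auc_ranks = auc_by_ranks(pos, neg)
print(auc_pairs, auc_ranks)  # both are 13/16 = 0.8125 for these scores
```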


Comparing two ROC curves:
 The graph represents the areas
  under two ROC curves, A and B.
  Classifier B has greater area and
  therefore better average
  performance

ROC Curve: Applications
   ROC analysis provides a tool to select possibly optimal
    models and to discard suboptimal ones

   Related to cost/benefit analysis of diagnostic decision
    making

   Widely used in medicine, radiology, psychology; recently
    becoming more popular in areas like machine learning and
    data mining

   The area under the ROC curve is equivalent to the (scaled)
    Mann-Whitney statistic; however, summarizing the ROC curve
    into a single number loses information about the pattern of
    tradeoffs along the curve




Nonparametric Methods
   Parametric methods usually require interval- or ratio-scaled
    data; nonparametric methods can also be applied to ordinal
    (ranked) data

   Provide an alternative series of statistical methods that
    require no or very limited assumptions to be made about
    the data

   Require no assumptions about the population probability
    distributions

     Distribution-free methods




Mann-Whitney Test
   Also known as Mann-Whitney-Wilcoxon (MWW) or Wilcoxon
    rank-sum test

   A nonparametric alternative to the two-sample t-test which
    is based solely on the order in which the observations from
    the two samples fall

   Method for determining whether there is a difference
    between two populations

Requirements:
   Data must be ordinal or continuous measurements
   The two samples must be independent


Mann-Whitney Test
   Null hypothesis H0: The two populations are identical.
Process:
   Combine independent samples into one sample (n=n1+n2)
   Rank the combined data from lowest to highest values, with
    tied values being assigned the average of the tied rankings
   Compute T, the sum of the ranks for the observations in
    the first sample
       If the two populations are identical, the average rank in the
        first sample should be close to the average rank in the second
        sample (the rank sums themselves are comparable only when n1 = n2)
   Compare the observed value of T to the sampling
    distribution of T for identical populations
Mann-Whitney Test
Sampling distribution of T for identical populations (under H0)


   Mean             $\mu_T = \dfrac{n_1(n_1+n_2+1)}{2}$

   Variance         $v_T = \dfrac{n_1 n_2 (n_1+n_2+1)}{12}$

   Test Statistic   $z = \dfrac{T - \mu_T}{\sqrt{v_T}}$, asymptotically N(0,1) under H0
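A worked sketch of the rank-sum procedure and the normal approximation above, on two small made-up samples. It applies no tie correction to the variance, and with samples this small the normal approximation is rough, so an exact test would normally be preferred. All names and data are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks from lowest to highest; tied values get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(sample1, sample2):
    """Return (T, mean, variance, z, two-sided p) via the normal approximation."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))  # rank the combined data
    t = sum(ranks[:n1])                                   # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2
    var = n1 * n2 * (n1 + n2 + 1) / 12
    z = (t - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))                  # two-sided normal p-value
    return t, mean, var, z, p


# Hypothetical independent samples.
sample1 = [1.1, 2.3, 3.8, 4.0, 5.5]
sample2 = [4.7, 6.1, 6.9, 7.2, 8.4, 9.0]

t, mean, var, z, p = rank_sum_test(sample1, sample2)
print(f"T = {t}, mean = {mean}, variance = {var}")
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```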

Wilcoxon Signed-Rank Test
   A nonparametric alternative to the paired t-test for the case
    of two related samples or repeated measurements on a
    single sample

   Method for determining whether there is a difference
    between two populations

Requirements:
   Data must be interval measurements
   Does not require assumptions about the form of the distribution of
    the measurements



Wilcoxon Signed-Rank Test
   Test assumes there is information in the magnitudes of the
    differences between paired observations, as well as the signs

   Null hypothesis H0: The two populations are identical.

Process:
   Compute the differences between the paired observations (discard
    any differences of zero)
   Rank the absolute value of the differences from lowest to highest,
    with tied differences being assigned the average ranking of their
    positions
   Give the ranks the sign of the original difference in the data
   Sum the signed ranks and determine whether the sum is
    significantly different from zero


Wilcoxon Signed-Rank Test
Sampling distribution of T for identical populations (under H0)


   Mean             $\mu_T = 0$

   Variance         $v_T = \dfrac{n(n+1)(2n+1)}{6}$

   Test Statistic   $z = \dfrac{T}{\sqrt{v_T}}$, asymptotically N(0,1) under H0
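A worked sketch of the signed-rank procedure on made-up paired data, taking n to be the number of nonzero differences and T the sum of the signed ranks. No tie correction is applied to the variance, and the normal approximation is only a rough guide for so few pairs. All names and data are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks from lowest to highest; tied values get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def signed_rank_test(first, second):
    """Return (T, n, variance, z, two-sided p) for paired observations."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # discard zero differences
    n = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])            # rank the absolute differences
    t = sum(r if d > 0 else -r for r, d in zip(ranks, diffs))  # sum of signed ranks
    var = n * (n + 1) * (2 * n + 1) / 6
    z = t / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))                      # two-sided normal p-value
    return t, n, var, z, p


# Hypothetical paired measurements (for example, before and after treatment).
first = [10.0, 12.5, 9.0, 14.0, 11.0, 13.5, 12.0, 15.0, 10.5, 16.0]
second = [8.5, 13.0, 7.0, 10.5, 12.0, 11.0, 12.0, 11.0, 7.5, 11.5]

t, n, var, z, p = signed_rank_test(first, second)
print(f"T = {t}, n = {n}, variance = {var}")
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```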



SAS Code
   ROC Curve

       %ROCPLOT macro
           Produces a plot showing the ROC curve associated
            with a fitted binary-response model

           Plot of the sensitivity against 1-specificity values
            associated with the observations' predicted event
            probabilities


**You must first run the LOGISTIC procedure to fit the desired
  model

SAS Code
   ROC Curve

       %ROC macro
           Nonparametric comparison of areas under correlated
            ROC curves
           Provides point and confidence interval estimates of
            each curve's area and of the pairwise differences
            among the areas
           Tests of the pairwise differences are also given

**You must first run the LOGISTIC procedure to fit each of the
  models whose ROC curves are to be compared

SAS Code
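   Mann-Whitney-Wilcoxon Test
        PROC NPAR1WAY WILCOXON;
          CLASS variable;     /* grouping variable with two levels */
          VAR variable;       /* response variable to be ranked */
          EXACT WILCOXON;     /* request the exact p-value */
        RUN;



   Wilcoxon Signed-Rank Test
        PROC UNIVARIATE;
          VAR variable;       /* the paired difference (see note) */
        RUN;

    Note: You must first create the paired difference in a DATA step;
    PROC UNIVARIATE will not calculate it. The signed-rank test is
    reported in the "Tests for Location" table.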