Buyse GPC ENSAI May 2010

					Generalized pairwise comparisons
  of prioritized outcomes in the
       two-sample problem

               Marc Buyse, ScD
         IDDI, Louvain-la-Neuve, and
     I-BioStat, Hasselt University, Belgium
            marc.buyse@iddi.com
                           Outline

•   Key problems in clinical development
•   An example in cancer
•   A bit of theory
•   Back to the example
•   Another example in ophthalmology
•   Conclusions
KEY PROBLEMS IN CLINICAL DEVELOPMENT
Development costs are too high…
                    Development times are too long…

[Figure: development time (Dev_Days) and approval time (Approv_Days), x-axis in days from 0 to 20,000; annotated values: 20% = 2400, 50% = 4500.]

Ref: Steven Hirschfeld, FDA (personal communication)
             Too few new drugs are approved…




Ref: Arthur D. Little’s views on key Pharma trends, March 31, 2010
AN EXAMPLE IN CANCER
               Advanced colorectal cancer
                      420 subjects with previously
                         untreated metastatic
                           colorectal cancer

                                     R
                      210                     210


     LV5FU2 + oxaliplatin                             LV5FU2


new combination of 5-fluorouracil,       standard regimen of 5-fluorouracil
    leucovorin and oxaliplatin                       and leucovorin



       until disease progression, intolerance to treatment, or death
                                       Progression-free survival

[Figure: progression-free survival curves (x-axis: Month, 0 to 25; y-axis: progression-free survival, 0% to 100%) for LV5FU2+oxaliplatin (n=210) and LV5FU2 (n=210). HR = 0.66, P = 0.0003]

                                       Survival

[Figure: overall survival curves (x-axis: Month, 0 to 35; y-axis: overall survival, 0% to 100%) for LV5FU2+oxaliplatin (n=210) and LV5FU2 (n=210). HR = 0.83, P = 0.12]

Oxaliplatin approved for metastatic colorectal cancer
•   In France (AFSSAPS) in 1996
•   In Europe (EMEA) in 1999
•   In the US (FDA) in 2002
                                  Problems?
1.       The two endpoints, overall survival (OS) and progression-free
         survival (PFS), are analyzed separately.
         One endpoint (PFS) suggests a statistically significant benefit, the
         other (OS) does not. On balance, do we claim the treatment to be better?
2.       Neither endpoint is perfect:
     •       PFS is not confounded by other treatments, is less affected by
             unrelated causes of death, and has more events
     •       OS is clinically most relevant and is measured without bias or error

3.       PFS ignores the time between progression and death.
         The time to first event ignores subsequent events. Thus, LV5FU2 +
         oxaliplatin might prolong the PFS of some patients, but shorten their
         remaining survival afterwards.
4.       Traditional methods of analysis cannot differentiate
         between a modest benefit in all patients and a large benefit
         in some patients.
A BIT OF THEORY
                        General Setup
                            Eligible subjects


                                    R



       Treatment (T )                                Control (C )


  Let Xi be the outcome of                      Let Yj be the outcome of
i th subject in T (i = 1, … , n )          j th subject in C (j = 1, … , m )
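                   Recall the Wilcoxon test
Xi and Yj are realizations of a continuous or an ordered discrete variable.
Let S1 , S2 , … , Sn be the ordered ranks, in the combined sample of n + m
subjects, of the outcomes observed in T.
Wilcoxon (1945) proposed the test statistic

$$W = \sum_{i=1}^{n} S_i$$

with expectation

$$E(W) = \frac{n(n + m + 1)}{2}$$

and variance

$$\mathrm{Var}(W) = \frac{nm(n + m + 1)}{12}$$
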
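      The Mann-Whitney form of the Wilcoxon test
The Wilcoxon test statistic can be derived from consideration of all
     possible pairs of subjects, one from each treatment group.
Let

$$U_{ij} = \begin{cases} +1 & \text{if } X_i > Y_j \\ -1 & \text{if } X_i < Y_j \\ 0 & \text{if } X_i = Y_j \end{cases}$$

The Wilcoxon-Mann-Whitney test statistic W can be written as

$$W = \frac{n(n + m + 1)}{2} + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{m} U_{ij}$$
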
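
The link between the rank-sum form and the pairwise form can be checked numerically. The sketch below assumes no ties and uses the scoring U_ij = +1, -1 or 0 for X_i > Y_j, X_i < Y_j or X_i = Y_j; the data and the function names `pair_score` and `rank_sum` are hypothetical, chosen only for illustration.

```python
def pair_score(x, y):
    """Return +1 if x > y, -1 if x < y, 0 if the two outcomes are tied."""
    return (x > y) - (x < y)


def rank_sum(treated, control):
    """Wilcoxon rank sum of the treated group (assumes no tied values)."""
    pooled = sorted(treated + control)
    return sum(pooled.index(x) + 1 for x in treated)


treated = [5.1, 7.3, 9.0]        # hypothetical outcomes X_i (n = 3)
control = [4.2, 6.5, 6.9, 8.8]   # hypothetical outcomes Y_j (m = 4)
n, m = len(treated), len(control)

# Sum of pairwise scores over all n * m pairs.
total = sum(pair_score(x, y) for x in treated for y in control)

w_from_ranks = rank_sum(treated, control)
w_from_pairs = n * (n + m + 1) / 2 + total / 2

print(total, w_from_ranks, w_from_pairs)
```

With these seven values the treated ranks are 2, 5 and 7, so both routes give the same statistic.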
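         Gehan generalized the Wilcoxon test
Gehan (1965) generalized the Wilcoxon test to the case of censored
    outcomes. Letting $\tilde{X}_i$ and $\tilde{Y}_j$ denote censored observations, the
    pairwise comparison indicator is now

$$U_{ij} = \begin{cases} +1 & \text{if } X_i > Y_j \text{ or } \tilde{X}_i \ge Y_j \\ -1 & \text{if } X_i < Y_j \text{ or } X_i \le \tilde{Y}_j \\ 0 & \text{otherwise} \end{cases}$$
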
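
A minimal sketch of a Gehan-type pairwise score for right-censored times, assuming each subject is described by a follow-up time and a flag saying whether the event was observed. The function name `gehan_score` and its argument layout are assumptions made for this illustration, not part of the original slides.

```python
def gehan_score(x, x_event, y, y_event):
    """Score one (treated, control) pair of follow-up times.

    x, y: follow-up times; x_event, y_event: True if the event was
    observed at that time, False if the time is censored.
    """
    # Treated subject is known to outlast control: the control event was
    # observed, and the treated event or censoring time is at least as late.
    if y_event and (x > y or (not x_event and x >= y)):
        return 1
    # Mirror case: the treated event was observed no later than the
    # control event or censoring time.
    if x_event and (x < y or (not y_event and x <= y)):
        return -1
    # Ties between observed events, or pairs whose ordering cannot be
    # determined because of censoring.
    return 0


print(gehan_score(10, False, 6, True))   # censored at 10, control event at 6
print(gehan_score(8, True, 6, False))    # control censored before treated event
```

The first call scores +1 because the treated subject was still event-free after the control event; the second scores 0 because the control subject may or may not have outlasted the treated subject.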
               First, generalize the test further
                for a single outcome measure
Now let Xi and Yj be observed outcomes for any outcome measure
     (continuous, time to event, binary, categorical, …)
All we require is that the pairwise comparison of observed outcomes Xi
      and Yj be able to classify the pair as favoring T , C , or neither (if
      outcomes Xi and Yj are tied or if either outcome is missing).


                                     pairwise
                               Xi                Yj
                                    comparison



                    favors T                     favors C
                    (favorable)                  (unfavorable)

                               neutral   uninformative
         Continuous outcome measure

      Pairwise comparison          Pair is
      Xi − Yj > δ *                favorable
      |Xi − Yj| ≤ δ *              neutral
      Xi − Yj < −δ *               unfavorable
      Xi or Yj missing             uninformative

* δ chosen to reflect clinical relevance; δ = 0 is Wilcoxon test
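
The classification rule for a continuous outcome can be sketched as a small function. This is an illustration under stated assumptions: larger values are taken to be better, missing outcomes are represented by `None`, and the name `classify_continuous` and the threshold argument `delta` are hypothetical.

```python
def classify_continuous(x, y, delta=0):
    """Classify a pair (x from treatment, y from control) given a threshold.

    delta is the smallest difference considered clinically relevant;
    delta = 0 reduces to the ordering used by the Wilcoxon test.
    """
    if x is None or y is None:
        return "uninformative"
    diff = x - y
    if diff > delta:
        return "favorable"
    if diff < -delta:
        return "unfavorable"
    return "neutral"


# Hypothetical changes in a score, with a 5-unit relevance threshold.
print(classify_continuous(12, 4, delta=5))    # difference of 8
print(classify_continuous(6, 4, delta=5))     # difference of 2
print(classify_continuous(None, 4, delta=5))  # missing treated outcome
```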
           Time to event outcome measure

          Pairwise comparison                          Pair is
      Xi − Yj > τ * or X̃i − Yj > τ *                  favorable
      |Xi − Yj| ≤ τ *                                  neutral
      Xi − Yj < −τ * or Xi − Ỹj < −τ *                unfavorable
      otherwise                                        uninformative

* τ chosen to reflect clinical relevance; τ = 0 is Gehan test
  (X̃i and Ỹj denote censored observations)
        Binary outcome measure

    Pairwise comparison                    Pair is
    Xi = 1, Yj = 0                         favorable
    Xi = 1, Yj = 1 or Xi = 0, Yj = 0       neutral
    Xi = 0, Yj = 1                         unfavorable
    Xi or Yj missing                       uninformative
           Generalized pairwise comparisons
Let Xi and Yj be vectors of observed outcomes for any number of
      occasions of a single outcome measure, or any number of outcome
      measures that can be prioritized.
All we require is that the pairwise comparison of prioritized outcomes Xi
      and Yj be able to classify the pair as favorable, unfavorable, or
      neither.
Next, generalize the test to prioritized repeated
 observations of a single outcome measure…

  Occasion with     Occasion with       Pair is
  higher priority   lower priority
    favorable          ignored         favorable
   unfavorable         ignored        unfavorable
     neutral           ignored          neutral
  uninformative       favorable        favorable
  uninformative      unfavorable      unfavorable
  uninformative        neutral          neutral
  uninformative     uninformative    uninformative
     Last, generalize the test to several
      prioritized outcome measures…

Outcome with      Outcome with        Pair is
higher priority   lower priority
  favorable          ignored         favorable
 unfavorable         ignored        unfavorable
   neutral           ignored          neutral
uninformative       favorable        favorable
uninformative      unfavorable      unfavorable
uninformative        neutral          neutral
uninformative     uninformative    uninformative
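
The priority rule in the two tables above can be sketched as follows: the pair takes the label of the highest-priority outcome (or occasion) that is not uninformative. The function name `classify_prioritized` is hypothetical, and the sketch follows the tables literally, so a neutral result at a higher priority ends the evaluation.

```python
def classify_prioritized(labels):
    """Combine per-outcome labels for one pair into a single label.

    labels: results of the pairwise comparison for each outcome, ordered
    from highest to lowest priority. Lower priorities are consulted only
    while every higher priority is uninformative.
    """
    for label in labels:
        if label != "uninformative":
            return label
    return "uninformative"


print(classify_prioritized(["favorable", "unfavorable"]))
print(classify_prioritized(["uninformative", "unfavorable"]))
print(classify_prioritized(["uninformative", "uninformative"]))
```

The same function handles any number of priority levels, since it simply scans the list in order.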
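         A general measure of treatment effect
Extend the previous definition of Uij

$$U_{ij} = \begin{cases} +1 & \text{if the pair } (X_i, Y_j) \text{ is favorable} \\ -1 & \text{if the pair } (X_i, Y_j) \text{ is unfavorable} \\ 0 & \text{otherwise} \end{cases}$$

and let

$$U = \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} U_{ij}$$

U is the difference between the proportion of favorable pairs and the
     proportion of unfavorable pairs. We call this general measure of
     treatment effect the « proportion in favor of treatment » (Δ).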
       The proportion in favor of treatment (Δ)
Δ is a linear transformation of the probabilistic index, P (X > Y ) :

$$\Delta = 2\,P(X > Y) - 1$$

         Situation                      P (X > Y )      Δ
         T uniformly worse than C       0               −1
         T no different from C          0.5             0
         T uniformly better than C      1               +1
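       The proportion in favor of treatment (Δ)

For a binary variable, Δ is equal to the difference in proportions

$$\Delta = p_T - p_C$$

where p_T and p_C are the proportions of subjects with the favorable outcome in T and C.

For a continuous variable, Δ is related to the effect size d (for normally
      distributed outcomes with common variance)

$$\Delta = 2\,\Phi\!\left(\frac{d}{\sqrt{2}}\right) - 1$$

where Φ is the standard normal distribution function.

For a time-to-event variable, Δ is related to the hazard ratio θ and the
      proportion of informative pairs f (under proportional hazards)

$$\Delta = f\,\frac{1 - \theta}{1 + \theta}$$
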
             A re-randomization test for Δ

The test statistic U (or Δ) no longer has known expectation
     and variance.
An empirical distribution of Δ can be obtained through
     re-randomization.
Tests of significance and confidence intervals follow suit.
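
A minimal sketch of a re-randomization test for a single continuous outcome, assuming complete data and a two-sided P-value; the function names, the number of re-randomizations and the data are hypothetical, and this is one possible implementation rather than the one used for the analyses in this talk.

```python
import random


def net_benefit(treated, control, delta=0):
    """Proportion of favorable pairs minus proportion of unfavorable pairs."""
    score = 0
    for x in treated:
        for y in control:
            if x - y > delta:
                score += 1
            elif x - y < -delta:
                score -= 1
    return score / (len(treated) * len(control))


def rerandomization_test(treated, control, n_perm=2000, seed=0):
    """Return the observed statistic and a two-sided re-randomization P-value."""
    rng = random.Random(seed)
    observed = net_benefit(treated, control)
    pooled = list(treated) + list(control)
    n = len(treated)
    extreme = 0
    for _ in range(n_perm):
        # Re-assign subjects to the two groups at random, keeping group sizes.
        rng.shuffle(pooled)
        stat = net_benefit(pooled[:n], pooled[n:])
        if abs(stat) >= abs(observed):
            extreme += 1
    # Add one to numerator and denominator so the P-value is never zero.
    return observed, (extreme + 1) / (n_perm + 1)


treated = [12, 15, 9, 20, 17]   # hypothetical outcomes in T
control = [8, 11, 7, 13, 10]    # hypothetical outcomes in C
observed, p_value = rerandomization_test(treated, control)
print(observed, p_value)
```

Here 21 of the 25 pairs favor T and 4 favor C, so the observed statistic is 17/25 = 0.68. Percentiles of the same empirical distribution could be used to build a confidence interval.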
BACK TO THE EXAMPLE
           Prioritized outcomes for patients with
                metastatic colorectal cancer

Priority                              Outcomes
   1              Time to death with pairwise difference ≥ 12 months
   2          Time to death with pairwise difference ≥ 6 but < 12 months
   3              Time to death with pairwise difference < 6 months
Prioritized outcomes for patients with
     metastatic colorectal cancer


Priority                 Outcomes
   1            Time to death from any cause
   2       Time to objective progression of disease
     Prioritized outcomes for patients with
early HER2/neu-overexpressing breast cancer


   Priority                Outcomes
      1          Time to death from any cause
      2       Occurrence of congestive heart failure
      3            Time to distant metastases
      4       Occurrence of second invasive cancer
      5             Time to local recurrence
                          Progression-free survival
                     GENERALIZED PAIRWISE COMPARISONS
                                (44,100 pairs)

    Difference in PFS          Oxaliplatin better   Standard better   Cumulative Δ   P-value *
    At least 12 months         2.6%                 1.1%              1.5%           0.090
    Between 6 and 12 months    15.5%                5.4%              11.6%          <0.0001
    Less than 6 months         35.5%                22.9%             24.2%          <0.0001

*   Unadjusted for multiplicity
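
The cumulative column can be reproduced from the two "better" columns: each row adds its own net difference (favorable minus unfavorable) to the running total. A short arithmetic check, using the percentages from the PFS table; the variable names are illustrative only.

```python
# Percentages of pairs at each level of difference in PFS.
oxaliplatin_better = [2.6, 15.5, 35.5]   # >= 12 months, 6 to 12 months, < 6 months
standard_better = [1.1, 5.4, 22.9]

cumulative = []
running = 0.0
for fav, unfav in zip(oxaliplatin_better, standard_better):
    running += fav - unfav              # net difference contributed by this row
    cumulative.append(round(running, 1))

print(cumulative)
```

The row-level net differences are 1.5, 10.1 and 12.6 percentage points, which accumulate to 1.5%, 11.6% and 24.2%.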
                                   Overall survival
                     GENERALIZED PAIRWISE COMPARISONS
                                (44,100 pairs)

    Difference in OS           Oxaliplatin better   Standard better   Cumulative Δ   P-value *
    At least 12 months         10.9%                6.5%              4.4%           0.043
    Between 6 and 12 months    14.7%                10.8%             8.3%           0.038
    Less than 6 months         17.0%                15.2%             10.1%          0.050

*   Unadjusted for multiplicity
                  Magnitude of benefits

[Figure: bar chart comparing OS and PFS at each level of difference; x-axis from 0 to 0.15.]

    Difference      OS       PFS
    ≥ 12 months     0.044    0.015
    ≥ 6 months      0.039    0.101
    < 6 months      0.018    0.126

                              Prioritized outcomes
                     GENERALIZED PAIRWISE COMPARISONS
                                (44,100 pairs)

    Difference in          Oxaliplatin better   Standard better   Cumulative Δ   P-value *
    Time to death          42.6%                32.5%             10.1%          0.050
    Time to progression    9.1%                 4.4%              14.8%          0.0054

*   Unadjusted for multiplicity
ANOTHER EXAMPLE
         Age-related Macular Degeneration
                            592 subjects with
                         neovascular age-related
                          macular degeneration

                                      R
                       296                        296


          Pegaptanib                                       Sham


Intravitreous injections of 3 mg of       Sham injections (with a syringe applied
  pegaptanib (an anti–vascular             on the surface of the eye to simulate
    endothelial growth factor)                 the pressure of an injection)



                  every 6 weeks over a period of 54 weeks
                                Endpoints

[Figure: standardized visual acuity letter chart]

Measurement of visual acuity (number of letters of a standardized chart
   correctly read) every 6 weeks
                                  Mean visual acuity over time

[Figure: mean visual acuity over time (x-axis: Week, 0 to 54; y-axis: mean visual acuity, 35 to 55) for pegaptanib 3 mg and sham.]

                                Endpoints

[Figure: standardized visual acuity letter chart, annotated “clinically relevant loss”: 15 letters = 3 lines]

Primary endpoint: loss of < 15 letters of visual acuity at one year (prevention
   of major vision loss)
       The whole data, and nothing but the data:
            measurements of visual acuity

Wk     0    6    12   18   24   30   36   42   48   54
Pt 1   43   25   22   11   15   13   11   7    11   11

Pt 2   75   69   63   65   60   73   51        53

Pt 3   71   68   73   75   67

Pt 4   51   41   51   36   38   38   37   37

Pt 5   42   50   52   48   47   48   42   42   40   39

Pt 6   55   55   63   61   66   69   64   63   72   64

Pt 7   29   48   43   44        43   43   43   45   47
               Measurements of visual acuity
            with last observation carried forward

Wk     0      6   12   18   24   30   36   42   48   54
Pt 1   43    25   22   11   15   13   11   7    11   11

Pt 2   75    69   63   65   60   73   51   51   53   53

Pt 3   71    68   73   75   67   67   67   67   67   67

Pt 4   51    41   51   36   38   38   37   37   37   37

Pt 5   42    50   52   48   47   48   42   42   40   39

Pt 6   55    55   63   61   66   69   64   63   72   64

Pt 7   29    48   43   44   44   43   43   43   45   47
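
Last observation carried forward can be sketched in a few lines. The example uses the three patients above who have missing visits, with `None` standing for a missing measurement; the function name `locf` is illustrative.

```python
def locf(values):
    """Replace each missing value (None) with the most recent observed value."""
    filled = []
    last = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Visual acuity at weeks 0, 6, ..., 54 (None = missing visit).
pt2 = [75, 69, 63, 65, 60, 73, 51, None, 53, None]
pt3 = [71, 68, 73, 75, 67, None, None, None, None, None]
pt7 = [29, 48, 43, 44, None, 43, 43, 43, 45, 47]

for series in (pt2, pt3, pt7):
    print(locf(series))
```

A value missing before any observation would stay `None` under this sketch, since there is nothing to carry forward.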
            Measurements of visual acuity
               at week 0 and week 54

Wk     0    6   12   18   24   30   36   42   48   54
Pt 1   43                                          11

Pt 2   75                                          53

Pt 3   71                                          67

Pt 4   51                                          37

Pt 5   42                                          39

Pt 6   55                                          64

Pt 7   29                                          47
             Measurements of visual acuity
            changes from week 0 to week 54

Wk     0    54   diff
Pt 1   43   11   -32

Pt 2   75   53   -22

Pt 3   71   67    -4

Pt 4   51   37   -14

Pt 5   42   39    -3

Pt 6   55   64   +9

Pt 7   29   47   +18
            Loss < 15 letters in visual acuity
               between weeks 0 and 54

Wk     0    54   diff   B
Pt 1   43   11   -32    0

Pt 2   75   53   -22    0

Pt 3   71   67    -4    1

Pt 4   51   37   -14    1

Pt 5   42   39    -3    1

Pt 6   55   64   +9     1

Pt 7   29   47   +18    1
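
The reduction from measurements to the binary endpoint B can be sketched as follows, using the seven patients above (week 54 values after last observation carried forward). The variable names are illustrative.

```python
baseline = [43, 75, 71, 51, 42, 55, 29]   # visual acuity at week 0
week54 = [11, 53, 67, 37, 39, 64, 47]     # visual acuity at week 54 (after LOCF)

# Change from baseline; negative values are losses of letters.
changes = [w - b for b, w in zip(baseline, week54)]

# B = 1 if the patient lost fewer than 15 letters (or gained), else 0.
responder = [1 if change > -15 else 0 for change in changes]

print(changes)
print(responder)
```

Patient 4 (loss of 14 letters) and patient 7 (gain of 18 letters) receive the same value of B, which illustrates how much information the binary endpoint discards.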
                           Binary endpoint

                              STANDARD ANALYSIS

Loss of < 15 letters at 1 year   Pegaptanib (N = 296)   Sham (N = 296)   Difference   P-value
                                 65.2%                  55.4%            9.8%         0.0123
December 2004
                              Problems?

1.   Binary endpoint ignores gains in vision
     Changes in vision on a continuous scale would be more sensitive to any
     change in vision
2.   Binary endpoint only considers one time point (1 year)
     Time to loss of 3 lines uses all time points
3.   Binary endpoint is clinically relevant but insensitive
     Repeated measures models use all data and are very likely to be most
     sensitive
4.   Binary endpoint requires data imputation
     Time to loss of 3 lines and repeated measures models do not require
     imputation
               Ophthalmology - binary endpoint

                           STANDARD ANALYSIS

Loss of < 15 letters at 1 year   Pegaptanib (N = 296)   Sham (N = 296)   Difference   P-value *
                                 65.2%                  55.4%            9.8%         0.015
* χ² test


                          PAIRWISE COMPARISONS
                               (87,616 pairs)

Pegaptanib = Sham   Pegaptanib > Sham   Pegaptanib < Sham   Δ       P-value
51.6%               29.1%               19.3%               9.8%    0.015
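
For a binary endpoint, the pairwise percentages follow directly from the two response rates: a pair is favorable when the pegaptanib subject responds and the sham subject does not, and unfavorable in the reverse case. A short check of the unstratified figures, assuming every one of the 296 × 296 pairs is informative; the variable names are illustrative.

```python
p_t = 0.652   # proportion losing < 15 letters on pegaptanib
p_c = 0.554   # proportion losing < 15 letters on sham

favorable = p_t * (1 - p_c)         # pegaptanib responder, sham non-responder
unfavorable = (1 - p_t) * p_c       # pegaptanib non-responder, sham responder
neutral = 1 - favorable - unfavorable
delta = favorable - unfavorable     # equals p_t - p_c

for name, value in [("neutral", neutral), ("favorable", favorable),
                    ("unfavorable", unfavorable), ("delta", delta)]:
    print(f"{name}: {100 * value:.1f}%")
```

The products give about 29.1% favorable, 19.3% unfavorable and 51.6% neutral pairs, and their net difference is the 9.8% difference in proportions.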
               Ophthalmology - binary endpoint

                       STANDARD ANALYSIS (STRATIFIED)

Loss of < 15 letters at 1 year   Pegaptanib (N = 296)   Sham (N = 296)   Difference   P-value *
                                 65.2%                  55.4%            9.8%         0.012
* Cochran-Mantel-Haenszel test


                           PAIRWISE COMPARISONS
                                (12,907 pairs)

Pegaptanib = Sham   Pegaptanib > Sham   Pegaptanib < Sham   Δ        P-value
51.9%               29.6%               18.5%               11.1%    0.0095
          Ophthalmology - continuous endpoint
                   GENERALIZED PAIRWISE COMPARISONS
                              (12,907 pairs)

Change in VA at 1 year   Pegaptanib > Sham   Pegaptanib < Sham   Δ        P-value
At least 6 lines         11.4%               4.9%                6.5%     0.0013
At least 5 lines         4.4%                2.6%                8.3%     0.0011
At least 4 lines         5.1%                3.0%                10.4%    0.0007
At least 3 lines         6.2%                4.0%                12.6%    0.0005
At least 2 lines         7.9%                5.6%                14.9%    0.0003
At least 1 line          8.6%                7.2%                16.3%    0.0005
Less than 1 line         9.6%                8.6%                17.1%    0.0007
        Ophthalmology - continuous endpoint
                GENERALIZED PAIRWISE COMPARISONS
                           (12,907 pairs)

Change in visual acuity   Pegaptanib > Sham   Pegaptanib < Sham   Δ        P-value
At week 54                48.0%               35.0%               13.0%    0.0036
At week 48                3.1%                1.0%                15.1%    0.0010
At week 42                1.1%                0.9%                15.3%    0.00093
At week 36                2.4%                0.9%                16.8%    0.0003
At week 30                0.8%                1.0%                16.6%    0.0003
At week 24                0.9%                0.6%                16.9%    0.0003
At week 18                0.7%                0.3%                17.3%    0.0003
At week 12                1.2%                0.4%                18.1%    0.0002
At week 6                 0.4%                0.0%                18.5%    0.0002
CONCLUSIONS
           Generalized Pairwise Comparisons

1.   are equivalent to well-known non-parametric tests in simple cases
2.   allow testing for differences thought to be clinically relevant
3.   allow any number of prioritized outcomes of any type to be
     analyzed simultaneously
4.   naturally lead to a universal measure of treatment effect, Δ
        The proportion in favor of treatment (Δ)

1.   is a universal measure of treatment effect that can be calculated
     for any type of outcome measure
2.   is directly related to classical measures of treatment effect
     (difference in proportions, effect size or hazard ratio)
3.   for time-related outcomes such as survival, provides descriptive
     statistics on treatment effects in terms of differences in times to
     event
                                 References


Buyse M. Generalized pairwise comparisons for prioritized outcomes in the
   two-sample problem. Statistics in Medicine, 2010. DOI: 10.1002/sim.3923.
Buyse M. Reformulating the hazard ratio to enhance communication with
   clinical investigators. Clinical Trials 5:641-2, 2008.

				