1
FOOD AND DRUG ADMINISTRATION
CENTER FOR DRUG EVALUATION AND RESEARCH
SIXTY-THIRD MEETING
OF THE
ONCOLOGIC DRUGS ADVISORY COMMITTEE
8:01 a.m.
Friday, September 17, 1999
Kennedy Ballroom
Holiday Inn
8777 Georgia Avenue
Silver Spring, Maryland
2
ATTENDEES
COMMITTEE MEMBERS:
RICHARD L. SCHILSKY, M.D., Chair - for Roferon-A
Associate Dean for Clinical Research
Biological Sciences Division
University of Chicago
The University of Chicago Medical Center
5841 South Maryland Avenue, MC1140
Chicago, Illinois 60637
KAREN M. TEMPLETON-SOMERS, PH.D., Executive Secretary
Advisors & Consultants Staff, HFD-21
Food and Drug Administration
5600 Fishers Lane
Rockville, Maryland 20857
DOUGLAS W. BLAYNEY, M.D.
Medical Director, Oncology Program
The Robert and Beverly Lewis Family
Cancer Care Center
Pomona Valley Hospital Medical Center
1910 Royalty Drive
Pomona, California 91767
DAVID H. JOHNSON, M.D.
Director, Division of Medical Oncology
Department of Medicine
Vanderbilt University Medical School
1956 The Vanderbilt Clinic
Nashville, Tennessee 37232
DAVID P. KELSEN, M.D.
Chief, Gastrointestinal Oncology Service
Memorial Sloan-Kettering Cancer Center
1275 York Avenue
New York, New York 10021
SCOTT M. LIPPMAN, M.D.
Professor of Medicine and Cancer Prevention
The University of Texas M.D. Anderson Cancer Center
Department of Clinical Cancer Prevention
1515 Holcombe Boulevard, HMB 11.192c, Box 236
Houston, Texas 77030
3
ATTENDEES (Continued)
COMMITTEE MEMBERS: (Continued)
KIM A. MARGOLIN, M.D.
Staff Physician
Department of Medical Oncology and
Therapeutics Research
City of Hope National Medical Center
1500 East Duarte Road
Duarte, California 91010
STACY R. NERENSTONE, M.D., Acting Chair - for Taxol
Associate Clinical Professor
Oncology Associates, P.C.
Helen & Harry Gray Cancer Center
Hartford Hospital
85 Retreat Avenue
Hartford, Connecticut 06106
JODY L. PELUSI, F.N.P., PH.D., Consumer Representative
Cancer Program Coordinator
Maryvale Hospital
102 W. Campbell Avenue
Phoenix, Arizona 85031
DEREK RAGHAVAN, M.D., PH.D.
Associate Director
Head of Medical Oncology
University of Southern California
Norris Comprehensive Cancer Center
1441 Eastlake Avenue, Room 3450
Los Angeles, California 90033
RICHARD M. SIMON, D.SC.
Chief, Biometric Research Branch
National Cancer Institute
Executive Plaza North, Room 739
Bethesda, Maryland 20892
4
ATTENDEES (Continued)
COMMITTEE CONSULTANTS:
JAMES E. KROOK, M.D.
Principal Investigator
Duluth CCOP
400 East Third Street
Duluth, Minnesota 55805
KATHLEEN LAMBORN, PH.D.
Professor
Department of Neurological Surgery
University of California, San Francisco
350 Parnassus Street, Room 805, Box 0372
San Francisco, California 94143
COMMITTEE GUEST:
JOHN KIRKWOOD, M.D.
Medical Oncology, N-758
University of Pittsburgh
200 Lothrop Street
Pittsburgh, Pennsylvania 15213-2546
PATIENT REPRESENTATIVES:
KENNETH McDONOUGH - for Roferon-A
North Huntingdon, Pennsylvania
SANDRA ZOOK-FISCHLER - for Taxol
New York, New York
FOOD AND DRUG ADMINISTRATION STAFF:
MASSIMO CARDINALI, M.D.
ROBERT JUSTICE, M.D.
PATRICIA KEEGAN, M.D.
PETER LACHENBRUCH, PH.D.
JAMES O'LEARY, M.D.
JAY SIEGEL, M.D.
ROBERT TEMPLE, M.D.
GRANT WILLIAMS, M.D.
5
ATTENDEES (Continued)
ON BEHALF OF BRISTOL-MYERS SQUIBB:
DON BERRY, PH.D.
RENZO CANETTA, M.D.
CRAIG HENDERSON, M.D.
LARRY NORTON, M.D.
DAVID TUCK, M.D.
ON BEHALF OF HOFFMANN-LA ROCHE, INC.:
ANTONIO BUZAID, M.D.
LONI da SILVA
SAM GIVENS, PH.D.
PROFESSOR JEAN-JACQUES GROB
LEON HOOFTMAN, M.D.
MAURIZIO RAMISIO, PH.D.
ELIZABETH WASSNER, PHARM.D.
ALSO PRESENT:
MARGARET VOLPE
MARISSA WEISS, M.D.
6
C O N T E N T S - MORNING SESSION
NDA 20-262/S-033, TAXOL (paclitaxel) Injection
BRISTOL-MYERS SQUIBB COMPANY
Indicated for the Adjuvant Treatment of
Node-Positive Breast Cancer
Administered Sequentially to Standard Combination Therapy
AGENDA ITEM PAGE
CONFLICT OF INTEREST STATEMENT
by Dr. Karen Templeton-Somers 9
OPEN PUBLIC HEARING PRESENTATION
by Margaret Volpe 12
BRISTOL-MYERS SQUIBB PRESENTATION
Introduction - Dr. David Tuck 13
Breast Cancer Chemotherapy -
by Dr. Larry Norton 16
Intergroup 0148 Results -
by Dr. Craig Henderson 31
Concluding Remarks - by Dr. Renzo Canetta 49
QUESTIONS FROM THE COMMITTEE 51
FDA PRESENTATION
by Dr. James O'Leary 98
QUESTIONS FROM THE COMMITTEE 108
OPEN PUBLIC HEARING PRESENTATION
by Dr. Marissa Weiss 113
COMMITTEE DISCUSSION AND VOTE 117
7
C O N T E N T S - AFTERNOON SESSION
BLA 97-1001, ROFERON-A
HOFFMANN-LA ROCHE INC.
Indicated for Use as Adjuvant Treatment of
Surgically Resected Malignant Melanoma
Without Clinical Evidence of Nodal Disease,
AJCC stage II (Breslow thickness greater than 1.5 mm, N0)
AGENDA ITEM PAGE
CONFLICT OF INTEREST STATEMENT
by Dr. Karen Templeton-Somers 161
OPEN PUBLIC HEARING 163
UPDATE ON THE PRELIMINARY RESULTS OF EST 1690
(ECOG Intergroup Study of Intron A for the
Adjuvant Treatment of Melanoma) -
by Dr. John Kirkwood 163
HOFFMANN-LA ROCHE INC. PRESENTATION
Introduction - by Ms. Loni da Silva 184
Clinical Overview of Malignant Melanoma -
by Dr. Antonio Buzaid 185
Data on Roferon-A in the Treatment of Stage II
Malignant Melanoma - by Dr. Leon Hooftman 194
QUESTIONS FROM THE COMMITTEE 209
FDA PRESENTATION
by Dr. Massimo Cardinali 232
by Dr. Peter Lachenbruch 235
QUESTIONS FROM THE COMMITTEE 241
COMMITTEE DISCUSSION AND VOTE 244
8
P R O C E E D I N G S
(8:01 a.m.)
DR. NERENSTONE: Good morning. I'd like to thank
everybody for coming and starting on time.
I'd like to start with going around the table and
introducing the committee members. If we could start with
Dr. Krook.
DR. KROOK: Jim Krook, medical oncologist,
Duluth, Minnesota.
DR. JOHNSON: David Johnson, medical oncologist,
Vanderbilt University.
MS. ZOOK-FISCHLER: Sandra Zook-Fischler,
Patient Rep.
DR. PELUSI: Jody Pelusi, oncology nurse
practitioner in Phoenix, Arizona.
DR. RAGHAVAN: Derek Raghavan, medical
oncologist, University of Southern California.
DR. BLAYNEY: Doug Blayney, medical oncologist,
Pomona, California.
DR. NERENSTONE: Stacy Nerenstone, medical
oncologist, Hartford, Connecticut.
DR. TEMPLETON-SOMERS: Karen Somers, Executive
Secretary to the committee, FDA.
9
DR. LIPPMAN: Scott Lippman, medical oncologist,
M.D. Anderson Cancer Center.
DR. LAMBORN: Kathleen Lamborn, biostatistician,
University of California, San Francisco.
DR. MARGOLIN: Kim Margolin, medical oncology and
hematology, City of Hope, Los Angeles.
DR. O'LEARY: James O'Leary, medical reviewer at
the FDA.
DR. WILLIAMS: Grant Williams, medical team
leader, FDA.
DR. JUSTICE: Bob Justice, acting Division
Director, FDA.
DR. NERENSTONE: Thank you.
Dr. Somers will now read the conflict of interest
statement.
DR. TEMPLETON-SOMERS: The following
announcement addresses the issue of conflict of interest with
regard to this meeting and is made a part of the record to
preclude even the appearance of such at this meeting.
Based on the submitted agenda for the meeting and
all financial interests reported by the committee
participants, it has been determined that all interests in
firms regulated by the Center for Drug Evaluation and Research
10
present no potential for an appearance of a conflict of interest
at this meeting with the following exceptions.
Dr. Richard Schilsky and Dr. Richard Simon are
excluded from participating in today's discussion and vote
concerning Taxol.
In addition, in accordance with 18 U.S.C.
208(b)(3), full waivers have been granted to Drs. David Kelsen,
Stacy Nerenstone, William Gradishar, Kathleen Lamborn, and
Ms. Sandra Zook-Fischler, which permit them to participate
in all official matters concerning Taxol.
Further, Dr. Kim Margolin has been granted a
limited waiver which permits her to participate in the
committee's discussion of Taxol without voting privileges.
A copy of the waiver statements may be obtained
by submitting a written request to the agency's Freedom of
Information Office, room 12A-30 of the Parklawn Building.
In addition, we would like to disclose for the
record that Dr. Scott Lippman has an interest which does not
constitute a financial interest within the meaning of 18 U.S.C.
208(a), but which could create the appearance of a conflict.
The agency has determined, notwithstanding his interest, that
the interests of the government in his participation outweighs
the concern that the integrity of the agency's programs and
11
operations may be questioned. Therefore, Dr. Lippman may
participate fully in today's discussion and vote concerning
Taxol.
Further, because of Dr. James Krook's and Dr. David
Johnson's past interests involving Taxol, the agency has
determined, notwithstanding their interests, that the
interests of the government in their participation outweighs
the concern that the integrity of the agency's programs and
operations may be questioned. Therefore, Dr. Krook and Dr.
Johnson will be permitted to participate in today's discussion
of Taxol without voting privileges.
In the event that the discussions involve any other
products or firms not already on the agenda for which an FDA
participant has a financial interest, the participants are
aware of the need to exclude themselves from such involvement,
and their exclusion will be noted for the record.
With respect to all other participants, we ask
in the interest of fairness that they address any current or
previous financial involvement with any firm whose products
they may wish to comment upon.
Thank you.
I'd also like to remind people that Dr. Gradishar
was not able to travel to this meeting because of the weather.
12
Thank you.
DR. NERENSTONE: We are now going to open the
public hearing part of the meeting. We have one speaker who
has been asked, Margaret Volpe of the Y-ME National Breast
Cancer Organization. Ms. Volpe?
MS. VOLPE: Good morning. My name is Margaret
Volpe from Y-ME National Breast Cancer Organization, and I
have no financial connections with Bristol-Myers Squibb.
Thank you for allowing us to submit this statement
to the committee. I am here today on behalf of the Y-ME
National Breast Cancer Organization to express our position
regarding the potential approval of Taxol injection for the
adjuvant treatment of node-positive breast cancer administered
sequentially to standard combination therapy.
Y-ME National Breast Cancer Organization is a
nonprofit patient advocate organization whose mission is to
decrease the impact of breast cancer, create and increase
treatment awareness, and ensure, through information,
empowerment, and peer support, no one faces breast cancer
alone. We have 26 chapters nationwide, numerous publications,
and several outstanding public education programs. Y-ME has
no financial connection to Bristol-Myers Squibb Company.
The addition of Taxol to the adjuvant treatment
13
of node-positive women after standard chemotherapy,
doxorubicin and cyclophosphamide, represents a major
advancement in the treatment of breast cancer. The results
of the CALGB study 9344 showed that the addition of Taxol
increased overall survival and disease-free survival rates.
Y-ME believes that women and men diagnosed with
breast cancer should have access to as many treatment options
as possible. We believe the approval of Taxol in the adjuvant
setting will add a valuable option.
Thank you.
DR. NERENSTONE: Thank you very much.
Are there other public speakers at this time?
(No response.)
DR. NERENSTONE: If not, then we'll continue with
the sponsor presentation.
DR. TUCK: Thank you. Good morning. I'm David
Tuck from clinical oncology at Bristol-Myers Squibb.
We plan to present this morning the data from the
supplemental new drug application for the use of Taxol for
adjuvant treatment of node-positive breast cancer.
The initial presentation this morning will be by
Dr. Larry Norton, who will discuss current approaches to
adjuvant therapy for breast cancer. He will be followed by
14
Dr. Craig Henderson, who will present the results from the
pivotal study Intergroup 0148. Following this, Dr. Renzo
Canetta from Bristol-Myers Squibb will present some concluding
remarks, and then we will accept questions.
First of all, I would like to welcome our external
consultants today. All of them had to make extraordinary
travel arrangements to get here today, and we appreciate that.
But I would like to mention in particular the heroic efforts
that Dr. Don Berry made to get here from Houston, driving in
all night last night, at least the last leg, and arriving just
a little while ago.
Dr. Stephen George, the Director of the CALGB
Statistical Center, also participated in the preparation of
the NDA but was not available today.
Dr. Craig Henderson was the study chair for the
pivotal study.
And Dr. Larry Norton is the Chair of the CALGB
Breast Committee.
The activity of Taxol is well established in a
variety of settings with metastatic disease for breast cancer.
Early in the development, Taxol was shown to have high response
rates in metastatic breast cancer in phase II trials, including
heavily pretreated patients and patients who had failed
15
anthracycline therapy.
In 1994, a large randomized study led to the
initial approval by the FDA of Taxol for the second-line
treatment of metastatic disease using a dose of 175 milligrams
per meter squared over 3 hours.
In 1998, based on a large randomized trial,
Herceptin was approved to be used in combination with Taxol
using a dose of 175 milligrams per meter squared over 3 hours
for the first-line treatment of HER2 positive metastatic breast
cancer.
The pivotal trial, which is going to be presented
today, is an intergroup trial, INT-0148, which looked at both
doxorubicin dose escalation as well as the addition of Taxol
versus no further therapy as part of the
cyclophosphamide/doxorubicin adjuvant chemotherapy regimen
for node-positive breast cancer.
The coordinating group was the CALGB, and most
of the major cooperative groups in the U.S. participated,
including the Eastern Cooperative Oncology Group, the North
Central Cancer Treatment Group, and the Southwest Oncology
Group.
A total of 3,170 patients were accrued between
May 1994 and April 1997. This pivotal study then is the largest
16
randomized trial of chemotherapy in the adjuvant treatment
of breast cancer that has ever been submitted to the FDA.
As you will hear today, the results of this study
show that Taxol, given with standard dosage following standard
chemotherapy, demonstrates significant advantages in
disease-free and overall survival.
The safety profile in this setting is consistent
with the large experience accumulated with this approved dose
and schedule.
Therefore, we propose the following indication:
Taxol administered sequential to standard combination
chemotherapy is indicated for the adjuvant treatment of
node-positive breast cancer.
Now I'd like to have Larry Norton discuss adjuvant
chemotherapy.
DR. NORTON: Thank you. Good morning. My job
is to sort of introduce the topic by giving some background
and by showing some context. In this regard, I'd like to start
off with the next slide which describes sort of the basic core
kernel of knowledge of what we know at the present time about
the adjuvant chemotherapy of breast cancer.
We know for sure that adjuvant chemotherapy
improves disease-free and overall survival. We know that the
17
use of multiple agents, so-called polychemotherapy, is
superior in this regard to the use of a single agent,
monochemotherapy. We know that multiple cycles of
administration is superior to a single exposure. This is
largely a single perioperative exposure in some very early
trials. We know that there are no major advantages to
durations of therapy exceeding 3 months, and we know that the
anthracycline combinations are slightly better than CMF, which
is probably the world's most studied regimen, that the
anthracycline combinations are somewhat superior.
Now, how do we know all this? We know this clearly
from individual large studies, but also from the worldwide
overview that's being conducted based in Oxford, England every
five years. This activity, with which you're all familiar,
puts together all of the investigators in the world who have
done randomized trials, published and unpublished, for the
treatment of breast cancer, as well as other therapeutic
approaches in early disease.
Presented here is just a basic summary of some
of the key points for prolonged polychemotherapy, meaning more
than one cycle and involving more than one drug, on reducing
the annual odds of recurrence and death. One of the really
key things from this worldwide activity is not only putting
18
together the world's experience, but also the way that the
efficacy of therapy is expressed as a reduction in the annual
odds of an event.
For example, if you look at the CMF combination
versus no chemotherapy with over 8,000 randomized patients
throughout the world, there's a reduction in the annual odds
of recurrence by 24 percent. That's very statistically
significant, as shown here in yellow, with this being the
standard deviation. So, 2 standard deviations would be the
borderline for significance.
Death is reduced by 14 percent per year.
Chemotherapy. This plus stands for additional
agents, such as vincristine and prednisone and other such
agents, compared to no such therapy, is in the same ball park
of efficacy showing no real advantage.
Nevertheless, anthracycline combinations versus
CMF with almost 7,000 patients randomized shows an incremental
benefit for the doxorubicin or other anthracyclines of 12
percent in recurrence and an additional decrement in the annual
odds of death by 11 percent.
A very important observation is that longer
regimens versus shorter regimens of various trials involving
6,000 patients, that there's no statistically significant
19
difference between the longer versus the shorter regimens.
Now, how does this translate to the familiar time
to event curves? In this case we're doing the event being
recurrence. If you take a simulated example shown here in
yellow of no therapy being applied in the adjuvant setting
for a patient with very poor risk breast cancer, relapsing
at an average rate of 15 percent per year, you can see that
the curve goes down by about 15 percent with each year, and
at the end of 10 years, you're left with 20 percent of patients
free of disease.
CMF, if it reduces that 15 percent by 24 percent,
leaves you a residual risk of recurrence of 11.4 percent per
year, and that graphs out as this magenta curve.
AC involving an anthracycline reduces that 11.4
percent by 12 percent, leaving 10 percent. So, the light blue
line is 10 percent less each year than the year immediately
preceding it and that this is the overall benefit.
So, this is how reductions in the annual odds
translates to time to event curves. We should keep this in
mind as Craig in a few minutes presents the data for the use
of paclitaxel in the adjuvant setting.
Now, we know a few other things which are very
relevant to planning research and analyzing research. We know
20
from CALGB study 8541 that looked at three different dose levels
of chemotherapy, that Adriamycin doses, doxorubicin doses,
less than 40 milligrams per meter squared are inferior to the
now standard dose of 60 milligrams per meter squared. This
study did not go above 60 milligrams per meter squared.
We know from the NSABP study B-22, that
cyclophosphamide doses greater than 600 milligrams per meter
squared are not superior, rendering this dose now the standard
in wide use.
And we know from the worldwide overview that
chemotherapy seems more effective in estrogen receptor
negative than estrogen receptor positive disease. And I say
"seems" because the tests for interactions are somewhat
complicated and don't always reach statistical significance,
but there certainly is a trend in that direction.
I'll show you what we mean by that. If you look
at the impact of polychemotherapy versus no polychemotherapy
in young patients under 50, the impact in patients with estrogen
receptor negative disease is larger than the impact in patients with
estrogen receptor positive disease. In fact, it's large
enough in terms of survival that it's statistically significant
here, but in the ER positive subset, it's not statistically
significant.
21
For patients who are older, 50 and older, again
the same thing is seen. The impact in ER negative disease
is greater than in ER positive disease, and again for survival,
the impact is significant here, but you don't even see a
significant impact on survival for ER positive disease in the
older age group.
Now, building upon this data set, where can we
go to improve? These are some of the possibilities for where
we can go, and these were certainly in consideration in the
design of the intergroup study that we're presenting to you
today.
One is, can you do better escalating the dose of
the anthracycline? The previous CALGB study stopped at 60
milligrams per meter squared.
Is there any advantage to integrating new agents
such as other chemotherapy drugs or biological agents?
And if we are going to integrate them, how should
we do so? What is the best way to apply them in a drug schedule?
I will show you in a few minutes a consideration of one approach
which is called dose density or dose dense sequential therapy.
But first, if we are going to integrate a new
chemotherapeutic agent, which one should we use?
Well, the four that have recently been approved
22
for the treatment of advanced breast cancer are shown here.
The first one, of course, was paclitaxel, docetaxel
following, capecitabine recently, and this not being a
chemotherapy drug, this is the monoclonal antibody directed
to the extracellular domain of HER2.
Well, of these, this was the one that was clearly
available and had clearly demonstrated attractive features
at the time that the study was designed in 1991-1992. So,
the data we'll present to you today involves the use of
paclitaxel, but I will show a little later how other agents
are integrated into this overall treatment approach.
Why paclitaxel? It's active as first
chemotherapy for stage IV disease with response rates
approaching 60 percent in two very carefully done phase II
studies and now universally corroborated in hundreds of trials
throughout the world.
It's also active after extensive prior
chemotherapy, including patients whose disease is refractory
to anthracycline. It's not just regression and regrowth, but
flat-out failure of anthracycline, with responses seen
to paclitaxel, and overall after extensive prior disease,
response rates as high as 30 percent are seen at the NCI, at
Memorial Sloan-Kettering Cancer Center, and now worldwide in
23
multiple corroborating studies. So, it seems like a very
reasonable drug to use, especially after standard therapy that
may involve an anthracycline.
Now, this demonstrates a simulation of a tumor
that's growing in a curvilinear fashion on a semi-logarithmic
plot, the so-called Gompertzian curve, and then responding
to various doses of therapy with regression and regrowth, as
you see. Leaving cells behind, even a small number of cells,
one can get rapid regrowth, replenishment, and eventually
recurrence at about 10 to the 11th cells and death at about
10 to the 12th cells.
Well, one concept that certainly has appealed to
many people to try to improve upon this is just to escalate
the dose of the chemotherapy, and that's shown on the next
click where each dose of drug is higher. You get more
regression with each dose of therapy, but as you can see,
there's a very interesting biological phenomenon, which is
that as the tumor gets smaller, it regrows more quickly, and
that eventual regrowth is such that the eventual outcome in
terms of relapse-free and overall survival can be extremely
modest. This can actually explain a great deal of data that
we're seeing lately in terms of the use of very high doses
of chemotherapy purely on a kinetic basis.
24
Now, there is one other approach that makes sense
and actually from a mathematical modeling view is more
rigorous, and that's shown on the next slide. The next slide
shows the standard dose intensity we're using as a comparison,
but I'll show you here with this simulation that we're giving
the same dose of drugs, but just pulling them closer together
in time. This is termed dose density. You can see it's the
same dose of drug, the same efficacy with the first cycle.
The second cycle is more efficacious because it's given sooner
when the tumor is smaller and so on, and in this simulation,
you actually get eradication with four doses of exactly the
same chemotherapy, just done more closely together in time.
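The dose density argument can be illustrated with a toy calculation. This is a rough sketch under stated assumptions, not the model behind the slides: it uses plain Gompertzian regrowth between doses and a fixed log-kill per dose, and every parameter value (starting burden, plateau size, growth rate, 2-log kill, 21-day versus 14-day intervals) is hypothetical. With these numbers the tumor is reduced further by the denser schedule but is not eradicated, which the slide's simulation apparently did achieve.

```python
import math


def cells_after_doses(n0, capacity, growth_rate, log_kill, interval_days, doses):
    """Tumor cell number immediately after the last of `doses` treatments.

    Growth between doses follows the Gompertz closed form
    ln(N(t)/K) = ln(N0/K) * exp(-b * t); each dose removes a fixed
    number of logs of cells (hypothetical simplification).
    """
    x = math.log(n0 / capacity)          # x = ln(N/K), negative below the plateau
    for dose in range(doses):
        if dose > 0:
            # Gompertzian regrowth during the interval since the previous dose.
            x *= math.exp(-growth_rate * interval_days)
        x -= log_kill * math.log(10)     # fixed log-kill for this dose
    return capacity * math.exp(x)


# Hypothetical parameters, chosen only for illustration.
common = dict(n0=1e10, capacity=1e12, growth_rate=0.05, log_kill=2, doses=4)
standard = cells_after_doses(interval_days=21, **common)
dense = cells_after_doses(interval_days=14, **common)

print(f"every 3 weeks: {standard:.2e} cells remain")
print(f"every 2 weeks: {dense:.2e} cells remain")
```

The same four doses leave fewer cells behind when given every 2 weeks, because each later dose arrives before as much regrowth has occurred.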
Now, how does this relate to the current study?
That's shown on the next simulation where you have two
sub-lines growing, one responsive to one therapy, one
responsive only to the other. It certainly seems to be a
rational, intuitive thing to come in with the other dose of
drug here because the tumor cells are growing. But you can
see, when you do that, you are actually spreading the doses
far apart of both the red treatment for the red cells and the
white treatment for the white cells, so the dose density is
very poor for both treatment plans. As a consequence of which,
both sub-lines are actually grossly sub-adequately treated.
25
This can be overcome -- next simulation, please
-- by giving all of this therapy first in a dose dense fashion,
as we showed in earlier simulations, allowing this tumor to
grow but then coming in with dose dense therapy for these tumor
cells and therefore, because it's dose dense, causing
eradication of the subpopulation. This simulation,
therefore, shows how sequential therapy is actually a form
of dose dense therapy.
Well, this was actually tested prospectively by
Bonadonna, Buzzoni, and colleagues in a trial in stage II breast
cancer patients with 4 or more involved axillary lymph nodes,
involving doxorubicin sequentially with CMF or the alternation
of CMF with doxorubicin, a carefully designed trial where the
doses are exactly the same, the time between therapy is exactly
the same, duration the same. Everything is the same except
that this is sequential, as shown in my second simulation,
and this is alternating, as shown in the first.
As predicted by the model, there is superiority
in both relapse-free survival and in overall survival by the
use of the sequential Adriamycin followed by CMF versus the
alternation of the two treatment plans.
Well, the CALGB, in preparation for applying this
concept in the stage II setting, first did a pilot study that
26
was presented by George Demetri at ASCO in '97 in node-positive
breast cancer patients. It was a very large size pilot
involving 172 patients with node-positive stage II or IIIa
disease. It involved an escalated dose of cyclophosphamide
-- this is before the B-22 data became available -- involving
G-CSF for actually 5 cycles with doxorubicin at 75 milligrams
per meter squared. This was obviously a very aggressive
treatment program. Following this, patients received 4 cycles
of paclitaxel at 175 milligrams per meter squared as a 3-hour
infusion every 3 weeks for 4 doses.
Of the 172 patients, 145 reached the paclitaxel
stage, and of those, about 90 percent were able to complete
the paclitaxel. During that period, the only major toxicities
were the grade IV neutropenia in a quarter of the patients,
grade IV thrombocytopenia in 4 percent of the patients, all
short-lived toxicities from which the patients recovered very
rapidly with no sequelae.
As a consequence of this, this was regarded as
a pilot, and the intergroup study that we'll present to you
today was designed according to this model. It's shown here
and Craig Henderson will show it to you again shortly. The
cyclophosphamide dose was reduced because of data to 600
milligrams per meter squared. That's the cyclophosphamide
27
dose. The doxorubicin dose was -- patients were randomized
between 60, 75 or 90 milligrams per meter squared, this
requiring G-CSF, to test the concept of dose escalation of
the anthracycline. Then patients were either crossed over
or not to paclitaxel at standard dosage and sequence. Patients
with hormone responsive disease, starting with estrogen
responsive and then changed by amendment to progesterone
receptor positive, received tamoxifen for 5 years thereafter.
Well, that trial obviously is going to be presented
to you in a great deal of detail. I just want to close by
showing the relationship between that trial and other trials
that were started before the results of this trial were
available and afterward, just to put it into global context
of where the American cooperative groups are going.
NSABP started their study called B-28 in a
comparable group of patients. They started accruing to this
trial about 16 months or so after we started accruing to the
intergroup study that we'll present as the pivotal trial today.
Another major difference between that trial and
the trial we'll present today is that the dose of paclitaxel
is higher. It's 225 milligrams per meter squared. The trial
has an endpoint of survival, so that it will require a longer
follow-up to give results. Concomitant tamoxifen was used
28
for hormone receptor positive disease for 5 years, and the
eligibility was very broad, involving all patients with hormone
receptor positive disease or patients who are over age 50
regardless of hormone receptor status, meaning that a much
larger percentage of the patients received tamoxifen. Because
this study was started later, because it has a survival
endpoint, it has finished accruing, but no data is available.
No analysis has been done, and we do not have any information
about this trial at the present time.
CALGB, upon closure of the study, the pivotal trial
study, opened this study which also now has closed to full
patient accrual which took the regimen that I've just presented
to you from our study and compared it with three others. One
of the other trials that was done using dose dense sequential
therapy was done at Memorial Sloan-Kettering by Cliff Hudis,
et al. involving doxorubicin, followed by paclitaxel, followed
by cyclophosphamide, so-called ATC. Everything was given
every 2 weeks to maximize dose density by the use of G-CSF
permitting that manipulation. So, the intergroup CALGB trial
involved this regimen and the same regimen given every 2 weeks
to see if that dose density makes a difference, and this regimen
also given every 2 weeks the standard way and every 3 weeks
without the G-CSF, so you have a two-by-two factorial design.
29
A very rapidly accruing trial, but much too early. No data
has been provided on this study at the present time.
Also before the results from the pivotal trial,
this study was initiated as an intergroup study coordinated
by SWOG in patients with 4 to 9 positive lymph nodes, stage
II or IIIa breast cancer, using the ATC regimen in actually
augmented doses, as was originally done by Hudis, et al., and
comparing it to an induction with AC, followed by high dose
chemotherapy requiring hematopoietic stem cell support, STAMP
I or STAMP V. This study is about halfway completed with its
accrual and continues to accrue well.
Lastly in this category is a trial that's about
to be coordinated for the intergroup by ECOG that takes the
same regimen as is in the pivotal trial, AC followed by
paclitaxel, and also randomizes patients to three other
possibilities: paclitaxel done weekly, which is actually more
dose dense, a variety of paclitaxel, and docetaxel done every
3 weeks and weekly. So, there will be a comparison of schedule
here, as well as comparison of different taxanes.
Now, the last, of course, important thing to keep
in mind is that the integration of biological agents has long
been considered a real possibility for improving prognosis,
and the biological agent we have to work with, because of
30
approval, is of course trastuzumab, or Herceptin, the anti-HER2
antibody.
Based on the data that led to approval of Taxol
with Herceptin, that integration into the adjuvant setting
is being conducted by a number of trials. The NSABP trial
will involve HER2 positive disease, use the same design as
the pivotal trial that's being presented today, but add
Herceptin during and after chemotherapy for these patients
who have HER2 positive disease in a randomized fashion.
The North Central Cancer Treatment Group will be
coordinating an intergroup study that has some other features,
the same basic crossover design involving paclitaxel alone,
paclitaxel alone followed by Herceptin, or paclitaxel with
Herceptin followed by Herceptin, asking the same basic
questions but also asking the question is the simultaneous
exposure to Herceptin an important feature of this particular
regimen or not.
Lastly the CALGB has designed a two-by-two-by-two
factorial experiment in stage IIIb, or locally inoperable
breast cancer, of AC followed by the weekly paclitaxel that
the North Central Group will be coordinating, with surgery
and radiotherapy to follow, with three randomizations of the
Zinecard or not during AC to minimize cardiac effects to show,
we hope, that the dexrazoxane does not impede the doxorubicin
efficacy in this setting, Herceptin or not during the
paclitaxel, and then Herceptin or not to complete a year after
the paclitaxel. So, all the critical questions will be
addressed in this particular trial.
Hence, this approach, the sequential dose dense
approach, has some real advantages. In the study we're
presenting to you, it integrates paclitaxel, which is active
as a single agent and active post anthracycline. We'll be
showing you data that it significantly augments the efficacy
of chemotherapy in the adjuvant setting.
It does so in a way that actually minimizes
incremental toxicity, and as we all know, the combination of
taxanes with anthracyclines can have considerable incremental
toxicity. And we'll demonstrate to you that we can minimize,
truly minimize, that incremental toxicity by the sequential
approach.
And the sequential approach also allows the
integration of biological therapies such as Herceptin, as I've
just presented to you.
Thank you very much.
The next speaker will be Craig Henderson, who
chaired the pivotal trial, and he will be presenting the data
on this trial to you.
DR. HENDERSON: Thank you. Good morning. It's
always a pleasure to be able to present and discuss with this
group.
This is an intergroup study addressing two
questions, a Taxol and doxorubicin question. It was led by
the Cancer and Leukemia Group B and involved substantial
participation as well by ECOG, SWOG, and the North Central
Group.
The study rationale has really been presented I
think quite nicely by Larry. Just to remind you, based on
everything we know, the dose response for doxorubicin may be
steep. Cyclophosphamide, obviously, had been ruled out, and
so we concentrated on doxorubicin dose escalation.
We know that Taxol and doxorubicin are not
cross-resistant from a number of studies. So, Taxol was a
logical drug to add here.
Finally, sequential use of AC and Taxol allowed
us to evaluate two separate questions, that is, the doxorubicin
dose and a promising new drug.
Our study objectives then were quite simple: to
assess the effects of three doxorubicin doses, 60, 75, and
90, in combination with a fixed dose of cyclophosphamide; and
to assess the effects of sequential addition of Taxol following
cyclophosphamide.
Now, we very consciously tried to make this a
large, simple trial in many ways, which I think is increasingly
more important. The number of patients that you accrue and
having a large trial is probably more important than fine
definitions, and in addition to that, it means that when you
finish, the results are going to be applicable to a broad
population of patients.
So, this included all patients who had operable
breast cancer where you could remove the entire tumor with
clear margins. Patients had to be node-positive. Treatment
had to start within 84 days from the last surgery, whether
that was lumpectomy or node dissection. No non-surgical
treatment was allowed, and they had to have normal liver
function.
It was a three-by-two design, asking first in three
arms either 60, 75, or 90 per meter squared of doxorubicin
the doxorubicin dose question, and in one of two arms the Taxol
versus no Taxol. We gave 4 cycles every 3 weeks of the
cyclo/adria and we gave 4 cycles every 3 weeks of the Taxol.
Again, cyclophosphamide remained constant. Patients on the
highest dose of doxorubicin received G-CSF routinely, while
patients on the other two arms received G-CSF in accordance
with the label for G-CSF in the product insert. Patients on
the 75 and 90 per meter squared dose received doxorubicin on
day 1 and day 2, that is, split because of our concerns of
cardiotoxicity, while these patients received it as a bolus
in the usual fashion. When Taxol was given, 175 milligrams
per meter squared over 3 hours was administered based on the
fact that this is the approved dose and is the most commonly
used dose in the community at the present time.
So, study design. Three-by-two with
stratification based on nodal groups only, 1 to 3, 4 to 9,
and 10-plus.
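Note (not part of the spoken record): the three-by-two layout described above can be sketched by enumerating the arms. This is a minimal illustration only; the variable names and the `describe` helper are hypothetical, and only the doses, the Taxol yes/no factor, and the nodal strata come from the testimony.

```python
from itertools import product

# Factors as stated in the testimony; names are illustrative.
doxorubicin_doses = [60, 75, 90]   # mg per meter squared
taxol_options = [False, True]      # AC alone vs. AC followed by Taxol
nodal_strata = ["1-3", "4-9", "10+"]  # stratification by positive nodes

# Crossing the two factors gives the six treatment arms.
arms = [
    {"doxorubicin_mg_m2": dose, "taxol": taxol}
    for dose, taxol in product(doxorubicin_doses, taxol_options)
]

def describe(arm):
    """Return a short human-readable label for one arm."""
    label = f"AC (doxorubicin {arm['doxorubicin_mg_m2']} mg/m2)"
    return label + (" followed by Taxol" if arm["taxol"] else "")

for arm in arms:
    print(describe(arm))
```

Because every dose level appears both with and without Taxol, the same patients contribute to both the dose comparison and the Taxol comparison.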
Tamoxifen was given for 5 years for all patients
that were ER positive, and regardless of the arm to which the
patient was randomized, tamoxifen was begun on week 24 so that
tamoxifen duration or the duration of exposure did not become
a confounding factor.
Radiation therapy, however, was given immediately
after the completion of chemotherapy, so that in the patients
randomized to cyclo/adria, that would be after 3 months; for
those randomized to cyclo/adria plus Taxol, that would be 6
months.
We powered the study to detect the effect of Taxol,
the effect of doxorubicin dose, and the interaction between
Taxol and doxorubicin dose.
Our median disease-free survival for our power
calculations was assumed to be 6 years without Taxol.
Our power was 95 percent to detect a 25 percent
decrease in the hazard rate from the addition of Taxol.
Based on these assumptions, we planned to accrue
3,000 patients over 3 years, and we assumed that we would have
1,800 occurrences 4 years thereafter.
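Note (not part of the spoken record): for readers who want a feel for how a hazard reduction and a power target translate into a number of events, here is a rough sketch using Schoenfeld's approximation for a two-arm log-rank comparison. It is not the trial's actual calculation. The two-sided alpha of 0.05, the 1:1 allocation, and the function name are assumptions, and the sketch ignores the doxorubicin dose and interaction questions that the study was also powered for, so its result should not be expected to match the planned 1,800 events.

```python
from math import ceil, log
from statistics import NormalDist

def schoenfeld_events(hazard_ratio, power, alpha=0.05):
    """Approximate events needed for a 1:1 two-arm log-rank test.

    Schoenfeld's formula: d = 4 * (z_{1-alpha/2} + z_{power})^2 / ln(HR)^2.
    alpha is two-sided and is an assumption, not a figure from the testimony.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return ceil(4 * (z_alpha + z_power) ** 2 / log(hazard_ratio) ** 2)

# A 25 percent decrease in the hazard rate corresponds to a hazard ratio of 0.75.
events = schoenfeld_events(hazard_ratio=0.75, power=0.95)
print(events)
```

Under these simplified assumptions the Taxol comparison alone needs several hundred events, far fewer than the full design.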
The randomization was central. Data management
was conducted by the Cancer and Leukemia Group B using its
standard procedures.
There was an independent data safety monitoring
board. They were the only ones who saw the data. In fact,
as the PI in the study, the first indication I even had of
the trends that were happening in this study were 6 weeks before
the data were presented at ASCO. They did an interim safety
analysis every 6 months. They did analyses of disease-free
survival after 450, 900, 1,350, and a planned 1,800 events.
So, we've completed this analysis and had dramatic effects
that the data safety monitoring board felt justified for
publication.
3,170 patients were accrued. However, between
giving informed consent and the time when they received the
first dose of treatment, a certain number of patients dropped
out, leaving 3,121 who received at least their first course
of therapy. Usual policy in the Cancer and Leukemia Group
B is to omit these patients from the analysis. So, everything
you will see now is based on the 3,121 patients who were
randomized and treated. We do not have data and did not follow
up the patients who elected to drop out of the study.
Accrual was from May 1st, 1994 to April 15th, 1997.
So, we accomplished the accrual goals in slightly less than
the planned 3 years.
We had a preplanned interim analysis based on 450
events, so it was actually done at 453 events. And the data
safety monitoring board decided that the results were such
that it was important to release them to the public and that
patients who were participating in it, making future decisions,
deserved to know the results of these analyses in March of
1998.
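Note (not part of the spoken record): the planned analysis points can be expressed as fractions of the final event count. The sketch below uses only the event counts given in the testimony; the variable names are illustrative.

```python
# Planned disease-free survival analyses, in events, as stated.
planned_looks = [450, 900, 1350, 1800]
final_events = planned_looks[-1]

# Fraction of the planned total information available at each look.
information_fractions = [n / final_events for n in planned_looks]

# The first interim analysis was actually run at 453 events.
actual_first_look = 453
actual_fraction = actual_first_look / final_events

print(information_fractions)
print(round(actual_fraction, 3))
```

The first look therefore came at about a quarter of the planned events.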
And in May of 1998, we presented them to ASCO,
and at that time had a 22 percent reduction in risk of recurrence
and a 26 percent reduction in mortality.
Now, it was after that that we began a
collaboration with Bristol-Myers Squibb for the first time.
They were not involved in the design or management of this
trial at any point before that. The interactions of BMS
were with the National Cancer Institute, but not directly with
the Cancer and Leukemia Group B.
In October of 1998, BMS and the CALGB had a pre-sNDA
meeting with the FDA. It was decided to update the trial and
have a larger database, and that was conducted in December
of 1998. And the sNDA submission was in April of 1999.
Now, just to give you some sense of the differences
between the first presentation at ASCO, May 1998, and the
time of the sNDA, the median follow-up at the first presentation
was 20 months; for the data that you're looking at today, 30
months.
Number of events for disease-free survival: 453
in the first analysis; 624 today.
For overall survival, the number of events: 200
at the time of ASCO; 342 today.
Just to put this in perspective, in 1979 a National
Cancer Institute consensus conference decided that it was
appropriate to recommend adjuvant chemotherapy to all
premenopausal node-positive women, and at that point, the
number of events in these two categories from all trials
worldwide was less than half of what was available at the time
of the ASCO meeting. I state that to underscore the power
of this very large trial.
The pretreatment characteristics are well
balanced between the two arms in all subsets.
You will notice particularly that about two-thirds
of the women are premenopausal, which I think is understandable
in a study of chemotherapy of this intensity.
The number of women who had 1 to 3 positive and
4 to 9 positive nodes, however, is about the same. The 10
positive node group is somewhat smaller, reflecting the fact
that this is less prevalent in this society as a rule; that
is, among breast cancer patients, having more than 10 positive
nodes is not that common in the United States.
Secondly, patients who were enrolled in this trial
had to be offered participation in a randomized trial
evaluating high dose chemotherapy in bone marrow first, and
if they declined that, then they could participate in this
trial.
About two-thirds of the patients were treated with
a modified radical mastectomy.
About two-thirds of the patients were receptor
positive.
Now, among all the patients who were enrolled and
started on course number 1, you can see that there is no
significant difference between those randomized to AC and those
randomized to AC plus Taxol in terms of dropout over these
first 4 courses. So, approximately 3 to 4 percent of patients
in the two arms dropped out over their first 4 courses of AC.
Now, among the patients who then went on and had
been all previously randomized to Taxol, there were 4 percent
who said, look it, I've had enough and decided not to go on
as they had been previously randomized. So, we have 92 percent
of all the patients randomized to AC plus Taxol who started
on course number 1 of Taxol and there's a 7 percent dropout
rate during those 4 courses of Taxol.
This shows you now the disease-free survival
differences between AC, shown in white, and AC plus Taxol,
shown in yellow. You'll notice that at the 1-year point,
almost all of the patients who had been randomized had reached
that point and had a year of follow-up. At the time of even
the initial analysis, all patients were a year from
randomization and at least 6 months from the completion of
chemotherapy.
You can see that even at 3 years of follow-up,
the number of patients at risk exceeds 600, which is
considerably more than most randomized trials in the adjuvant
setting in the past.
We see that these differences are highly
significant, based on a multivariate Cox model. This is the
model that was used. It shows, first of all, the comparison
of Taxol with no Taxol and the risk ratio is .78 or a 22 percent
reduction, highly significant.
On the other hand, when we look at doxorubicin
dose, for example, comparing 60 with 90, we see no advantage
from adding dose.
We see that there is a twofold increased risk if
you had 10 positive nodes instead of 1. There's an increased
risk, which is statistically significant for patients with
larger tumors than with smaller tumors. However, there is
no difference in patients who are pre- and post-menopausal
in terms of disease-free survival.
Finally, patients who were receptor negative had
about a two-and-a-half-fold increase in risk compared to those
who were receptor positive.
If we look at the same data now for overall
survival, shown here in white is the AC. Shown in yellow again,
AC plus Taxol. Highly significant in our Cox model, and this
shows you the model Taxol versus no Taxol, a 26 percent
reduction in risk. Highly significant. No evidence of effect
of doxorubicin dose. Again, positive nodes, tumor size show
an increased risk. Estrogen receptor negative, increased
risk. Here we also see an increased risk of dying -- this
is dying of any cause now -- among the post-menopausal compared
to the pre-menopausal, which isn't surprising considering that
it's an older population.
Now, just to look at the two different times that
we analyzed the data, we see that the results are identical.
At the time of ASCO, a 22 percent and 26 percent reduction
in risk of recurrence and mortality; at the present time, 22
and 26 percent.
Now, we saw no evidence of a dose effect whatsoever
for doxorubicin. This shows you the three curves for
disease-free survival, the white being the 60, the yellow being
the 75, and the blue being the 90 per meter squared, and also
for overall survival. You see no evidence of effect.
Further, we could show that individually, for
example, the effects of adding Taxol to 60 milligrams per meter
squared of doxorubicin are greater than the effects of giving
90 per meter squared of doxorubicin alone, which is only one
part of the evaluation showing no evidence of an interaction
between doxorubicin dose and paclitaxel addition.
Now, we did a number of subset analyses. These
were not necessarily planned subset analyses and are
confounded, obviously, by multiple comparisons, but I think
most physicians and I would imagine most of the ODAC panel
would be interested in seeing these, so we've summarized them
here.
I think the take-home points are, first of all,
that we saw a similar effect in almost all of the subsets we
looked at, certainly the node-positive groups where there is
no significant difference in the effect of adding paclitaxel
in these groups, tumor size, and interestingly in terms of
menopausal status.
Secondly, the size of the effect is quite
substantial in all cases, ranging from 20 to 25 percent.
Now, the one exception to that is in patients
who have receptor positive versus receptor negative tumors.
This was not a planned subset analysis and it's not one that
has traditionally been done either by the Cancer and Leukemia
Group B or, until very recently, by any groups. The overview
data that you saw from Larry Norton is a first that they have
actually looked at that.
We looked at this a little bit further and here
we can show you the disease-free survival hazard ratios by
receptor status. So, here is the hazard ratio with 95 percent
confidence intervals for the entire study. So, we're at about
78 percent there, or 0.78.
Now, we look at the same thing, but just for those
patients who are receptor positive and for those patients who
are receptor negative. You can see that there is a greater
effect. Even though the confidence intervals overlap here
quite substantially, there appears to be a greater effect in
the patients who were receptor negative compared to those who
were receptor positive.
We can see the same thing in terms of overall
survival. The overall survival of the group as a whole with
the hazard ratio here being .74, as I showed you earlier, with
the effects in the receptor positive and in the receptor
negative patients. Again, considerable overlap but the
appearance of a greater advantage in the receptor negative
patients.
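Note (not part of the spoken record): the percentage reductions quoted in the testimony follow directly from the hazard ratios, as this small sketch shows. The function name is illustrative; the two hazard ratios (.78 for disease-free survival, .74 for overall survival) are the ones stated by the speaker.

```python
def percent_reduction(hazard_ratio):
    """Convert a hazard ratio into a percent reduction in risk."""
    return round((1 - hazard_ratio) * 100)

dfs_reduction = percent_reduction(0.78)  # disease-free survival
os_reduction = percent_reduction(0.74)   # overall survival

print(dfs_reduction, os_reduction)
```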
Now, to summarize then what I have just gone over
in terms of efficacy, we conclude the following. The addition
of Taxol following standard combination chemotherapy in
patients with node-positive breast cancer reduces the risk
of recurrence by 22 percent and reduces the risk of death by
26 percent. And if you do that in terms of annual odds of
recurrence, you come up with exactly the same number.
There is no evidence of a dose response to
doxorubicin for doses above 60 per meter squared.
There is no evidence of an interaction between
doxorubicin dose and Taxol.
And the benefits of Taxol in various subsets,
including the receptor subsets, are consistent with the effects
of chemotherapy in the worldwide overview.
Now, to turn to safety, the first thing it's
important to understand about safety is that this study was
designed to intensely evaluate the first 325 patients. We
concentrated on those patients because we did not feel, in
the design of this study, that it was necessary to collect
extensive safety data on cyclophosphamide, doxorubicin, and
paclitaxel, drugs in which there are already huge safety
databases. On the other hand, we were escalating the
doxorubicin dose quite substantially, and we wanted to make
sure that we monitored that very carefully.
So, the first 325 patients we obtained CBCs, for
example, twice weekly. We required safety information on all
types of toxicity, and we collected and put in our database
anything that was grade 2 or above. These 325 patients were
appropriately distributed among the major participants, so
they weren't all from the CALGB. In other words, we had the
same number from CALGB, ECOG, SWOG, and a slightly smaller
number reflecting a smaller group from the North Central.
Now, our original plan, or at least the original
plan that I had in my mind and a number of the people on the
Breast Committee, was to only report ADRs after collection
of these data very intensely and very carefully. However,
as happens oftentimes with groups, there was a continuing
discussion of whether we should stop all collection of data
and get only ADRs, which we did by default for 1,815 patients,
or whether we should collect more information mainly because
of issues regarding presentation of the data and so on.
So, we made an amendment to the protocol here as
a consensus among the different points of view, and for the
last 981 patients, we collected grade 4 and 5 hematologic
toxicity and we collected grade 3 and above non-hematologic
toxicity routinely.
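Note (not part of the spoken record): the three safety-reporting groups described above can be checked against the 3,121 treated patients reported earlier in the testimony. The dictionary keys are illustrative labels, not protocol terms.

```python
# Patients per safety-reporting tier, as stated in the testimony.
reporting_tiers = {
    "intensive monitoring (grade 2 and above)": 325,
    "ADR reporting only": 1815,
    "amended reporting (grade 4-5 heme, grade 3+ non-heme)": 981,
}

treated_patients = 3121  # randomized and treated, as stated
total_in_tiers = sum(reporting_tiers.values())

# Share of the treated population in each tier, in percent.
shares = {
    tier: round(100 * count / treated_patients, 1)
    for tier, count in reporting_tiers.items()
}

print(total_in_tiers == treated_patients)
print(shares)
```

The three groups account for every treated patient, and well over half of them fall in the tier where only ADRs were reported.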
Now, some investigators, having started with the
intense reporting, continued to submit that even though it
wasn't required by the protocol in the interim.
The take-home point is these are the data that
are going to be most precise and represent the most careful
monitoring for safety and those are the ones that I will
emphasize. I will show you all of the patients together as
well in separate columns as we go along.
First of all, grade 3-4 hematologic toxicity.
Patients randomized either to AC or AC plus Taxol in the early
population. First of all, you see that there is no difference
in the overall hematologic toxicity in these two arms.
Secondly, you see that, as you would expect with
the very intense therapy, that you have a high incidence of
leukopenia and granulocytopenia. We'll talk about the degree
to which this occurred in just the Taxol part in a few moments.
You see that the numbers in the total population
are smaller, but again, you see no difference when you look
at the total population in the hematologic toxicity in patients
randomized to AC or randomized to AC plus Taxol.
Sequelae to hematologic toxicity, that is,
infection, fever, hemorrhage. The requirement for platelet
transfusions, requirement for red blood cell transfusions is
also not significantly different. There's an appearance of
a significant difference here, for example, in the incidence
of infection, but among the 14 percent of patients randomized
to AC plus Taxol who had infection, which constitutes 23
patients, 21 of the 23 patients had the infections while they
were receiving the AC, not while they were receiving the Taxol.
So, only 2 out of these 23 patients had an additional infection
as a result of Taxol directly.
And the same thing is true for patients with fever.
There were 4 patients, or 3 percent, who had fever that was
grade 3 or grade 4, and all of them on the AC therapy.
We looked at a variety of non-hematologic
toxicities, first of all, cardiovascular, neuromotor,
alopecia, nausea and vomiting, diarrhea, stomatitis, and
abnormalities of liver or renal function. We see no
significant differences either in the early population or
overall among patients randomized to AC or those randomized
to AC plus Taxol.
The greatest difference is in stomatitis. Again,
that's greater actually in the patients randomized to AC only
rather than those randomized to AC plus Taxol.
Now, we looked very specifically at
non-hematologic toxicities that are commonly associated with
Taxol: neurosensory, neuropathies, arthralgia, myalgias, or
hypersensitivity reactions. It's not surprising, since these
are associated with Taxol, that there is a higher incidence
among the patients randomized to the Taxol arm in the study
than there are to the AC. However, the total percentage of
grade 3-grade 4 toxicities in these three categories is
relatively modest.
Other adverse events. Hospitalization, no
difference. Late cardiac disease, no difference. This is
being monitored on every follow-up form and has been
consistently. So, this applies to the entire population of
patients.
Secondary malignancies occurred in 2 percent of
the patients. No difference in AC and AC plus Taxol. The
incidence is about what we would expect to see in most adjuvant
therapy trials, and also as with most trials, about half of
all the second malignancies are second breast cancers.
Now, looking specifically at toxicities that occur
while patients were receiving Taxol, again looking first at
the hematologic toxicities, grade 3 and grade 4, early
population here and the total population here, we see that
17 percent of the patients had a grade 3/grade 4 leukopenia
while getting Taxol; 46 percent had grade 3/grade 4
granulocytopenia. As previously, thrombocytopenia and anemia
are fairly uncommon with Taxol therapy.
The sequelae, infection, fever, hemorrhage,
requirement for platelet and blood transfusions occurred in
1 percent or less of the population.
We look at non-hematologic toxicity, again
specifically during Taxol therapy, the same group that I showed
you before, and you can see again it occurs very infrequently,
at most 1 percent of the patients.
Finally, we look at non-hematologic toxicity for
those things that are known to be associated with paclitaxel
and are unique to that drug, neurosensory, arthralgia, myalgia,
and hypersensitivity. This is only while now the patients
are getting Taxol and the numbers are very similar to what
you saw before.
Finally, you remember that at the beginning of
the presentation I showed you the dropout rate over the course
of therapy. What were the reasons why patients dropped out?
First of all, why did patients drop out of AC?
First of all, the patients here who were randomized to AC and
the patients here who were randomized to AC plus Taxol, but
these two columns represent the dropout from AC itself. First
of all, 95 to 96 percent of patients completed all 4 courses,
as I've shown you before.
2 percent of the patients on each arm requested
that they drop out for one reason or another. That's not
specified on the case report forms. 1 percent of the patients,
again the same in both arms, because of specific toxicities,
and then a small number because of disease progression or a
mixed category.
Now, we had 1 patient here who died within 30 days
of having gotten a dose of chemotherapy, so still on active
dose. That particular patient was on the AC only arm and that
patient had respiratory failure and cardiac failure which was
assessed to be due to neoplastic process.
Now, among these 1,570 patients randomized to AC
plus T and got AC, you remember I showed you earlier that only
1,449 of those patients went on to receive paclitaxel. Now,
of this group, 92 percent completed treatment. The reason
for not completing it, 1 percent patient request, 6 percent
because of toxicity, a small number for disease progression
and other, and there were 2 patient deaths within 30 days of
a chemotherapy regimen. One had a hypersensitivity reaction
as a cause of death, and one patient had a brain infarction
with subsequent sepsis.
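Note (not part of the spoken record): the 92 percent figure quoted earlier for patients who went on to Taxol can be reproduced from the two counts given here. Variable names are illustrative; only the counts come from the testimony.

```python
randomized_to_ac_plus_taxol = 1570  # received AC on the AC plus Taxol arm
started_taxol = 1449                # went on to receive paclitaxel

did_not_start_taxol = randomized_to_ac_plus_taxol - started_taxol
percent_started = 100 * started_taxol / randomized_to_ac_plus_taxol

print(did_not_start_taxol)
print(round(percent_started))
```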
So, in conclusion, we believe that we've shown
that the benefit of adding Taxol to standard
anthracycline-containing therapy is similar to adding
chemotherapy to surgery. The basis for saying that is that when
you look -- and you saw the numbers earlier from Dr. Norton
-- at chemotherapy versus nil, you see a reduction in the odds
of death or reduction in the annual odds of recurrence that
are about the same as we have shown here in adding paclitaxel
to doxorubicin.
The robustness of the results of this large study
is supported by the consistency of the treatment outcomes in
the two points of analysis, that is, first a presentation at
ASCO in 1998 and the presentation today.
And finally, the addition of a single agent Taxol
to standard combination chemotherapy is very well tolerated
compared to most things that we do as medical oncologists today.
I thank you for your attention.
DR. CANETTA: Thank you, Craig.
I will just offer a very few concluding remarks
to wrap up our presentation.
We believe that the data that we have shown
actually follow in the footsteps of what we have found out
about the effects of Taxol in breast cancer and I think it
is comforting to see that as you move to earlier stages of
disease, the magnitude of the benefit increases. The pivotal
study, whose results you've seen presented, is the largest
trial that's ever been submitted to this agency for the approval
of a new chemotherapeutic agent in node-positive breast
carcinoma.
The comparison of Taxol versus no further therapy
does demonstrate there is a significant effect, a significant
benefit in the two important endpoints in the setting of the
disease, disease-free survival and overall survival.
I'd like to point out that when you look at the
subset analysis, multiplicity of analysis, but one piece of data is
very, very comforting and very reassuring. No matter what
subset you look at, there is always a positive effect of Taxol,
and that is very, very solid evidence that it is the drug that
is exerting an effect.
Finally, although Taxol is a cytotoxic agent, I
think that what we have seen in terms of the safety profile,
even in this setting, is very, very consistent with what had
been seen with exactly the same dosages of Taxol that have
been approved for a long time in the treatment of this disease
and in treatment of other diseases.
Therefore, we do propose that Taxol administered
sequentially to standard combination therapy be indicated for
the treatment of node-positive breast cancer.
And the dosage and schedule that we recommend is
the classical standard dosage of 175 milligrams per square
meter given intravenously over 3 hours every 3 weeks for 4
courses, as you have seen.
I'd be glad to take questions from the committee.
DR. NERENSTONE: Thank you very much.
We're going to open up now for questions from the
committee to the sponsor. I would like to take the Chair's
prerogative for just a moment and ask two points of
clarification.
One, on the patients who died on the Taxol, one
had a septic related death. Can you tell me what dose
of doxorubicin that patient had received prior to the Taxol?
DR. CANETTA: We need to check that.
DR. NERENSTONE: While you're looking at that,
the second question is really sort of a clarification of the
toxicity slides. When Dr. Henderson reviewed the toxicity
data, especially of the grade 3 and 4 toxicities, his numbers
were early population and then a percentage for the total
population. But in fact, aren't those numbers incorrect
because you didn't have data on 1,800 patients in the middle
group who did not have recording of grade 3 and 4 toxicity?
They only had reporting of ADRs.
DR. CANETTA: I think I can address that. The
early population, as Dr. Henderson said, is the one that has
been intensely monitored, and that's very obvious when you
look at granulocytopenia. Twice a week counts result in 90
percent incidence of grade 3 or 4 granulocytopenia in the early
population. The late population, every patient was included
in the denominator, but you need to remember that all the
serious adverse events have been reported even after the early
population. So, when you look at severe toxicity, of course,
you have a slight underestimate, but I think it's very
reassuring that for clinically important toxicities -- and
you have the infection example -- the incidence is actually
the same whether you monitor intensively or whether you don't
monitor intensively.
DR. NERENSTONE: Okay.
DR. CANETTA: For that patient, Dr. Tuck will give
you some details.
DR. TUCK: That patient was on the high dose of
doxorubicin, 90 milligrams.
DR. NERENSTONE: Other questions? Dr. Blayney.
DR. BLAYNEY: You didn't specify as part of the
trial protocol what premedications were used with paclitaxel.
Could you review that? And as part of your proposed labeling,
do you propose a premedication regimen with paclitaxel?
DR. CANETTA: Yes. During the Taxol phase, the
standard three class of agents premedication was administered
with a steroid, H1 and H2 blocker. We do maintain that in
this proposed dosage we will retain the same type of
premedication.
DR. BLAYNEY: Did the patient who died of -- it
was reported as an anaphylactic event -- receive the
premedication?
DR. CANETTA: Yes. That patient did receive
premedication. It is very unfortunate, but severe
hypersensitivity reaction can still occur despite
premedication in a very, very small percentage of patients.
DR. BLAYNEY: Are there other medicines that you
would caution physicians to avoid as part of the paclitaxel
administration? For instance, trastuzumab or Herceptin?
DR. CANETTA: I think it's important to point out
that there is nothing special about this patient population
vis-a-vis the pharmacologic behavior of Taxol. So, all the
type of cautions that are already attached in the current
package insert for Taxol for this dosage and schedule of Taxol
will be maintained. So, whatever we say that refers to Taxol
for metastatic disease will also refer to this population.
We are not in the possession of data of the use
of Taxol and Herceptin in combination in the adjuvant setting,
and we cannot refer, at least in our package insert, so we've
been told by the agency, to the Herceptin data. So, I think
patients and care providers will have to be directed to the
Herceptin package insert.
DR. BLAYNEY: Thank you.
DR. NERENSTONE: Dr. Raghavan?
DR. RAGHAVAN: I have two questions. I guess Dr.
Henderson drew out the issue of receptor positive disease and
showed that there was a reduction, but probably the least
significant level of reduction. I'm just interested just to
confirm that the randomization was not stratified for receptor
status.
And secondly, the group with 10 nodes positive
disease seemed also to be one with a relatively small impact,
and the question on that relates to does Dr. Henderson feel
the study was well powered to identify clearly the level of
difference in that context.
So, the questions are receptor positivity. Was
the stratification included for receptors? Second question,
lymph node 10 plus. Were there enough cases to have a strong
feeling of where that fits into the scheme of things?
DR. CANETTA: I'll let Dr. Henderson answer.
DR. HENDERSON: First of all, there was no
stratification based on receptor status.
Secondly, when you read over the statistical
section -- and I very carefully checked this, writing the paper
-- there is no mention even of the possibility of doing that
subset analysis. That was an unplanned subset analysis and
even the overview data that we've shown you weren't out at
that point. This idea of doing subset analyses in receptor
positive patients is something that really has popped up in
the last couple years, maybe even in the last year, year and
a half, and not something that was done before that.
The second question had to do with the power within
the group that has more than 10 positive nodes. The way I
look at this is to ask the statisticians to say can you tell
me that there is a significant difference, using a regression
model, in these three groups, even though it would appear that
way just by eyeballing it. And the answer has come back
repeatedly no. There is not evidence of a significant
difference.
Now, I believe that that's because of the
difference in the power in the first two groups, 1 to 3 and
4 to 9, versus the 10 group. But using a test for trend, for
example, you do not see a significant difference.
DR. NERENSTONE: Dr. Lippman.
DR. LIPPMAN: Yes, I really had a related question
to Dr. Raghavan's regarding the subset analyses, because this
will come up again I guess in the FDA presentation. I'd like
some thoughts from your statisticians perhaps on the issue
of subset analyses because, particularly if you look at overall
survival in the two different receptor groups, it's 17 percent
reduction in the positive group and 29 percent, so still
substantial in both groups. It wasn't a prespecified subset
analysis, and I guess from Dr. Henderson's presentation, it
has never been done in a prespecified way in any large phase
III adjuvant study. When you look at the graph and the
confidence interval overlap on the overall survival between the two,
it's pretty large. So, how strong is that particular subset
analysis for clinical recommendations to patients?
DR. CANETTA: Dr. Don Berry will address this.
DR. BERRY: Subset analyses are problematic, as
you know. This was unplanned. Is the result strong? Is the
result real? I don't know. I don't think anybody can say.
I think that it is a subset analysis and that there is no
difference between the two. It may turn out, as we go down
the line, that other studies show that there is a relationship
and that's one of the reasons we announced the study when we
did is so people could look at this question. I don't think
it's very strong.
DR. LAMBORN: While you're up there, could I just
ask a clarification? The actual test for a difference or for
an interaction was non-significant or what was the p value?
I recognize that it is a subset analysis. We don't have the
information about the potential difference.
DR. BERRY: It actually was significant at the
time of the ASCO presentation in terms of disease-free
survival. It is not significant now. Am I correct in that
statement? The test for interaction using a Cox model in which
receptor status and Taxol is included in the interaction term.
I don't believe that it is significant now, but it was at
the time of ASCO.
DR. NERENSTONE: Dr. Williams, did you have a
question?
DR. WILLIAMS: I do have a question regarding Dr.
Henderson's statement about looking at subgroups on receptor
status. Somewhat different but extremely closely related is
looking at the effect of chemotherapy in patients who have
received tamoxifen. Obviously, that's the very same group
we're talking about here, not just their receptor status, but
the fact that all patients were supposed to receive tamoxifen.
Certainly it looks like in the overview that was addressed
specifically, and I would imagine that goes back some years.
Whether or not you do it within a trial is another question,
but clearly it was specifically addressed as a concept that
there might or might not be an effect in this group.
DR. CANETTA: We have a few slides to show and
Dr. Henderson will present.
DR. HENDERSON: First of all, we didn't show you
the data separately, actually prepared slides, for the overview
data ER and tamoxifen. The reason we didn't show them to you
-- and I don't know whether we have them here. We can -- is
that my feeling was that when you look at the overview data,
the interaction is stronger for ER than it is for tamoxifen.
Now, if you look at the four groups, because the
way the overview is set up, it's under 50 and over 50. You
don't have the whole population put together, as I'll
underscore in just a minute. That's the way the data were
shown to you.
For example, the tests for interaction on all but
one of the subsets for ER are negative. Only one of them is
positive.
DR. WILLIAMS: Could you clarify what you mean
by that?
DR. HENDERSON: Well, if you do a formal test for
interaction so that you say is there an interaction between
the effects of therapy and the presence or absence of an
estrogen receptor or the effects of therapy and the presence
or absence of tamoxifen, the formal tests for interaction are
negative.
As you know, that's not a very strong or very robust
statistical test to use and some people aren't enthusiastic
about it at all, but nonetheless, that was done as a formal
evaluation and led people like Richard Peto to say we don't
see a significant difference in those two populations.
Let me just show you briefly. First of all, these
are the results using the Kaplan-Meier estimates for AC and
AC plus Taxol disease-free at 1 year, 2 years, and 3 years.
This is for the entire population.
The point that we're going to make is that it's
important to look at your patients at risk and look at the
confidence intervals around the estimates in the receptor
positive patients at each of these points. This is for the
entire population of patients, but if you look at just the
receptor positive subset, you'll see that as we get further
out, the confidence intervals around any differences grow
larger at each point.
The take-home point then is that our ability to
use just a single point, such as 3 years, which was put into
the questions and the summary of the questions, is probably
inappropriate. You want to look at the growing effects, and
you can see a difference with fairly tight intervals of about
1 percent at 1 year in the ER positive patients in absolute
difference in disease-free survival and about 2 percent at
2 years. At 3 years you see a smaller effect, but with very,
very wide confidence intervals.
DR. TEMPLE: Is that for the whole population,
Craig?
DR. HENDERSON: Pardon.
DR. TEMPLE: That's for the whole population.
Right?
DR. HENDERSON: Yes. No. This is for the whole
population. The slide I wanted up here -- we just made a
mistake. Sorry about that -- was patients who were receptor
positive. And maybe they'll get that up for you in a moment.
DR. WILLIAMS: So, where would be the appropriate
-- I mean, in a normal adjuvant trial, we would have enough
data that we would have a 5-year survival and that would be
probably a fairly appropriate place to look. This is just
as close to the plateau as one can get with these data, which
are somewhat premature. If you want an estimate for women
of what's going to be the case based on these data, you have
to pick some point other than a hazard ratio which has little
meaning.
DR. HENDERSON: Why do you think a hazard ratio has
little meaning?
DR. WILLIAMS: Because there's an absolute risk
of death from breast cancer in particular women, and that
absolute risk times the relative change in that risk is your
benefit. A 20 percent benefit, if there's a 1 percent risk
to start with, doesn't mean much.
So, these women obviously have much less risk of
recurrence, and that relative risk, regardless of how confident
you are of it, overall means less in that setting.
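Dr. Williams's arithmetic can be made concrete with a minimal sketch. Only the 20 percent relative benefit and the 1 percent baseline risk come from his remarks; the function name and the 30 percent baseline used for contrast are illustrative assumptions, not figures from the trial.

```python
def absolute_benefit(baseline_risk: float, relative_reduction: float) -> float:
    """Absolute risk reduction = baseline risk x relative risk reduction."""
    return baseline_risk * relative_reduction


# Dr. Williams's example: a 20 percent relative benefit on a 1 percent risk.
low_risk = absolute_benefit(0.01, 0.20)

# Hypothetical higher-risk patient; the 30 percent baseline is assumed.
high_risk = absolute_benefit(0.30, 0.20)

print(f"1 percent baseline:  {low_risk:.1%} absolute reduction")
print(f"30 percent baseline: {high_risk:.1%} absolute reduction")
```

The same relative reduction yields a very different absolute gain depending on the starting risk, which is the reason he gives for wanting an absolute estimate at a fixed time point.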
DR. HENDERSON: I would take a slightly different
point of view. First of all, in terms of using hazards or,
as we have done in the last 15 years in the breast cancer
literature, using reductions in odds of death or reductions
in odds of recurrence, the annual reduction in odds of death
or the annual reduction in odds of recurrence have been constant
across all the subgroups that we've looked at carefully with
one exception well established, that is, between ER and
tamoxifen. So, when you use tamoxifen, the reduction in odds
is much greater in receptor positive than receptor negative
patients.
We're working hard on that question to say is that
true for HER2 positive patients, but I would say that's still
a point of great controversy and we certainly haven't looked
at it yet in the adjuvant setting with any statistical power.
Now, we have a third possible interaction where
the reduction in odds is different. That's a hypothesis,
hypothesis generated in part by this trial, that maybe there
is an interaction between chemotherapy and receptor status
that is a qualitative rather than a pure quantitative
interaction.
Now, when you accept those three, now you go back
to all the other subsets. Until proven otherwise by careful
prospective trials, it is reasonable to take the reduction
in annual odds, which is almost always, I'd say, very, very
close to the difference in hazard. In other words, 1 minus
the hazard ratio is going to be very close, within a percentage
or two, in almost all cases to the reduction in odds.
Now, for a doctor practicing, what I usually
encourage doctors to do is say calculate what the risk is to
your patient. You have to consider these qualitative
interactions, but for all other subsets, take your estimate
of 10-year mortality and multiply that by the reduction in
annual odds. That's doable because what we have seen in almost
all studies that are done is the reduction in annual odds is
constant. In fact, if you look at the longest trials we have,
the ovarian ablation trials which go back to 1948, you can
show that the reduction in odds is constant up through 25 years
at almost all time points. So, what is going to be dependent
is what are going to be the effects within or the risks within
that particular group.
So, I would say that for the overall analysis,
I certainly wouldn't call these premature data when you have
this much statistical power, but for the subset certainly these
would be early data.
DR. WILLIAMS: Your statement that you expect the
same proportional reduction in these groups -- didn't the
overview show a different proportional reduction like 19
percent for the 50- to 59-year group that received tamoxifen
versus a higher percent, around 30 percent, for the groups
overall? So, the proportional reduction in recurrence was
not estimated to be the same for patients who had received
tamoxifen versus the other patients studied.
DR. HENDERSON: That's a good point. I probably
should put that into a fourth category. We have a tendency,
and have for some time, to a priori divide our patients into
pre- and post-menopausal. So, that's a very well taken point.
And the effects in older and younger women of chemotherapy
are clearly different. For tamoxifen they're not clearly
different.
DR. WILLIAMS: That's not older and younger.
This is the patients who had received tamoxifen, those trials,
plus or minus chemo versus the other patients. It wasn't
specifically an age factor, and that's exactly the question
we have here, the patients who received tamoxifen versus those
who didn't.
DR. TEMPLE: You don't show tamoxifen yes or no.
Actually the data look even more different when you do.
DR. HENDERSON: I'll show you those data right
now. Okay? So, let's go back one slide.
This was the slide I wanted first. This is just
now looking at disease-free survival for the receptor positive
subset for the 3 years follow-up. The point that I was trying
to make and describe to you before were the differences in
the confidence intervals around a 3-year figure, for example,
compared to either a 1 or a 2-year figure, just emphasizing
follow-up is important, the duration and the number of patients
at risk.
Next slide please.
DR. TEMPLE: Craig, before you leave that, we're
familiar with the treacheries of subset analyses. Okay? We
know that. This is a little striking, though. Two-thirds
of the patients randomized seem to have not much going on and
all of the good action is in one-third.
So, I guess one question you need to address is,
when does something that you didn't plan overwhelm you so much,
look so strong that you should believe it anyway? Some people
would say the answer is never, and I always quote Salim Yusuf
and all that. We all do that.
But still, that's the question here. This is
two-thirds of the population. It's not some little subset
that emerged, and it can be defined either by receptor status
or by the use of another therapy, tamoxifen. Concomitant therapy is
the sort of subset one does look at. That's not pulled out
of left field exactly.
DR. HENDERSON: Let me address that question, but
let me finish the first one, which is just looking at the hazard
risk for the two populations, the receptors which I showed
you a moment ago, these again. Disease-free survival. These
are the data that I showed you for disease-free survival.
Next, overall survival.
Next, this is now for tamoxifen, disease-free
survival. This is the overall estimate. This is now the
patients who did not get tamoxifen and those who did get
tamoxifen. Looking at this as receptor positive/tamoxifen
or receptor negative/tamoxifen and so on is not very
informative because the number of patients in these subsets,
other than the two major ones, get down to 125 patients to
150 patients at risk. So, we don't think that that's very
meaningful. So, this is disease-free survival.
Next, overall survival again for the group as a
whole and then the two subsets where you see wide overlaps
for the tamoxifen, just as you did for the receptor.
Now, next slide please. This is getting now to
more directly addressing your question. This is the effects
of adjuvant chemotherapy in estrogen receptor positive
patients from the overview. Now, again as I've told you, we
have to look at younger women and older women separately because
that's the way the data are available to us. Again, we see
this same difference -- this is younger women -- in the effects
of therapy in the receptor positive versus receptor negative.
Among older women, it's even more marked, but again an overlap
in the confidence intervals.
Next, please. And the effects now in terms of
reduction in annual odds of death. Again, you can see that
when you look at the younger women -- and you're looking now
at adjuvant chemotherapy, over 1,000 women now in this subset
-- you see that for the receptor positive patients, there in
fact is not a statistically significant survival benefit from
adjuvant chemotherapy either in the receptor positive women
under age 50 or the receptor positive patients over age 50,
while it's in the group who are receptor negative in which
you see significant survival advantages. Again, you see this
same pattern of difference.
Next slide, please. Now, for this particular
study, I think it's too early to make a firm conclusion because
in the receptor positive subset, there appears to be a smaller
benefit, but the relative effects are quite similar to what
you see in the overview. And we believe that as time goes
on and we have more events, particularly in this particular
subset, the picture will become clearer.
So, now coming back to your direct question, when
do you decide on the basis of a subset analysis, even if it's
very large, that you are not going to give therapy to that
particular group or that you're going to change therapy on
the basis of an unplanned subset analysis? I would go so far
as to say that thus far I've been resistant to doing that
consistently across the board in all cases. It seems to me
that what you do is a subset analysis. You generate a
hypothesis and then you go out and test it.
The best example in my experience is in the issue
of HER2. Should we use HER2 to select patients for therapy?
Our first subset analysis, which we published a few years
ago, showed a p value which was way out there. I don't
remember. .001 or .0001. And then our subsequent analysis
wasn't quite as clear. When we look at all of the data, it's
still being sorted out with results that are not totally
consistent. Is this due to doxorubicin? Is it due to dose?
There were people who were prepared to argue on
the basis of that first study, which is a very large study,
1,800 patients in the entire study randomized -- or 1,500.
I've forgotten the number that were in the HER2 subset, but
it was about 600 I think. So, it was a very large subset
analysis. There were people who were saying we should declare
a change in therapy at that time, others who said let's wait.
I personally was in that latter group and I would be in that
latter group here as well. I think that the issue here is
probably not an issue of Taxol. This is an issue of
chemotherapy and probably applies across the board.
But I've been writing for a number of years on
the issues of chemotherapy in older and younger women and some
of these issues whether we should give chemotherapy at all.
The way I usually present this is to say your first question
is, is chemotherapy appropriate in a particular patient? And
then your second question is, if it's appropriate, then what
is the marginal advantage of going from CMF to cyclo/adria,
of cyclo/adria to cyclo/adria plus Taxol? Then what is the
marginal increase in toxicity? And then asking the patient
whether that's worth it to that patient. So, to me that's
the thinking that you go through, but you wouldn't jump to
the end of that process and say, I'm going to not give Taxol
for this particular group of patients, but I would give
cyclo/adria to that group of patients. I don't think that
that's the appropriate sequence for thinking out the problem
as a clinician.
Does that answer the question you're asking, in
other words, when and why?
DR. WILLIAMS: I hate to keep going here. This
is not our usual format. But this is the most central point
for us.
I want to ask Dr. Berry, who mentioned the point
about the interaction. I do remember now where I read that
and it was in your study report that there was interaction
either with tamoxifen or the estrogen receptor. So, I would
imagine it holds up for these data, and if it was really present
at ASCO, that means that there was a very strong interaction
almost certainly at two times because you had less data then.
If it was positive then with less data, that means that the
effect was even stronger.
DR. BERRY: Yes. I want to correct something that
I said to Dr. Lamborn.
By the way, I'm responsible for this subset
analysis. I plead guilty to that. It's difficult for me not
to look at these things, and my attitude was similar to Dr.
Temple's. I must say over time I've been moved in the other
direction.
This is the disease-free survival, Cox regression,
and you see the usual covariates, number of positive nodes,
et cetera, menopausal status, not significant. This was the
issue that Dr. Lamborn raised. The interaction between Taxol
and ER status is statistically significant but barely, and
the next slide shows the corresponding thing for survival and
it's not statistically significant.
At the time, Dr. Williams, of ASCO, indeed it was
more highly significant than this.
And Dr. Temple is right. We don't have the
corresponding Cox regressions for interaction with tamoxifen,
but there is a somewhat stronger, although not incredibly
stronger, interaction with tamoxifen.
I would like to address something else that Dr.
Williams raised. Could I have the next slide? This is the
hazards over time, and this is a compelling picture for me.
There are three curves on here. One is the AC plus Taxol.
Another is the AC alone, and the third curve, the one that
extends out here -- and I can't tell the difference between
these two colors and I guess it doesn't make any difference.
But this one is our previous study, CALGB 8541. These are
hazards, which means that one calculates the number of
recurrences in a given time period, divided by the number at
risk in that time period. So, it's like an actuarial
comparison.
What that means is that these comparisons at 3
months and 9 months are really independent. The set of
occurrences in this time period is different from this, is
different from this, and you see that the benefit -- the hazard
ratio that we're talking about is averaged over this entire
time period. You see the benefit of Taxol occurs early, and
these are like four or five independent analyses. They're
all in favor of Taxol.
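The interval hazards Dr. Berry describes can be sketched as follows. This is a simplified illustration of "recurrences in a time period divided by the number at risk in that period"; it omits the half-interval censoring adjustment that a full actuarial calculation would usually apply. All counts are hypothetical and are not taken from the CALGB slides.

```python
def interval_hazards(at_risk_start, events, censored):
    """Hazard per interval: events divided by the number at risk in it."""
    hazards = []
    at_risk = at_risk_start
    for n_events, n_censored in zip(events, censored):
        hazards.append(n_events / at_risk)
        # Patients who recurred or were censored leave the risk set,
        # so each interval is computed on a different set of patients.
        at_risk -= n_events + n_censored
    return hazards


# Hypothetical counts for three consecutive follow-up intervals.
control = interval_hazards(1000, events=[40, 35, 30], censored=[10, 20, 30])
treated = interval_hazards(1000, events=[30, 28, 27], censored=[10, 20, 30])

# One hazard ratio per interval; each is based on that interval's events only.
ratios = [t / c for t, c in zip(treated, control)]
for i, ratio in enumerate(ratios, start=1):
    print(f"interval {i}: hazard ratio {ratio:.2f}")
```

Because each interval's hazard is computed only from patients still at risk, several intervals pointing in the same direction give a series of separate comparisons rather than one comparison repeated.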
The point I want to make here is that the benefit
of chemotherapy -- and it's not just in this study, but in
every study in node-positive breast cancer -- occurs early.
After 3 or 5 years, there is essentially no benefit. The
overview looks exactly like this, and the hazard for
node-positive disease returns to the hazard for node-negative
disease. If you were node-positive 5 years ago when you had
breast cancer and you're still alive and disease-free now,
you're essentially like you were node negative at diagnosis.
So, I think it's compelling that the benefit is
in the early time period. It's exactly where we would expect
the benefit to be for a chemotherapy.
DR. NERENSTONE: Dr. Lippman?
DR. LIPPMAN: As a non-statistician, I tend to
have a very negative view of subset analyses because, first
of all, this is a secondary analysis and a subset analysis.
When you look at the subset table, the changes over time,
although under disease-free survival there's a bigger
difference in receptor status, they come together under overall
survival, and there are much bigger differences, for instance,
when you subset out the nodal groups. So, I think in terms
of planning patient management on this, this is why I raise
this, whether we're confident about an unplanned, secondary,
subset analysis.
DR. CANETTA: I would tend to agree with Dr.
Lippman's statement. I think that in this subset analysis
story, what again it is important to keep in mind -- and we're
all aware of the vagaries of subset analyses, we're all aware
of the problems of multiple analyses. But one consistent thing
that has happened in this subset analysis is that no matter
what subset you look at and no matter what endpoint you look
at, because this is true for both disease-free survival and
for overall survival, every single analysis comes with a
direction in favor of the use of Taxol. And that is consistent
with what Dr. Berry was talking about.
DR. NERENSTONE: Dr. Lamborn?
DR. LAMBORN: I'd like to ask one question about
the subset analysis. Sometimes things will happen over the
course of the trial where you have new information, and
therefore, while it is a subset analysis, there is a medical
logic to why you're looking at it, where perhaps you didn't
originally plan it. What I thought I have heard is that there
has now been a large evaluation of adjuvant chemotherapy which
said that the risk reduction would be expected to be
substantially less in the node-positive. So, in a sense, this
is not one of a whole set of cases. So, I just wanted to make
sure I understood what it was we were saying.
DR. CANETTA: Dr. Norton or Dr. Henderson? Can
we give a chance to both of them?
DR. HENDERSON: If it had happened the way you
described --
DR. LAMBORN: Excuse me. ER positive.
DR. HENDERSON: There are two possible scenarios
here. One scenario is that the committee of investigators
or the CALGB breast group said, look it, this is becoming an
important question and turned to our statistical group and
said, let's look at it because the hypothesis has been
generated. Now let's look at it in our data. That's one
scenario. That kind of a scenario implies what you were
suggesting. There are other people that have generated a
hypothesis. People are beginning to think about it and now
going forward.
The other hypothesis is you've got somebody
sitting there saying, well, let me just look at the data and
see what happens in this group and happens in this group and
happens in this group. As you know, the probability of getting
a false positive result in subsets when you do that approaches
50 percent. So, that's why we usually don't do that.
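The inflation in false positives that Dr. Henderson refers to can be illustrated with a short calculation. This sketch assumes independent tests, each run at a 0.05 significance level; both are simplifying assumptions, since subset tests on the same trial are rarely independent and the transcript does not say how many subsets were examined.

```python
def prob_any_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


for n_tests in (1, 5, 10, 13, 14):
    chance = prob_any_false_positive(n_tests)
    print(f"{n_tests:2d} subset tests: {chance:.1%} chance of a false positive")
```

Under these assumptions the chance of at least one spurious finding passes 50 percent at about fourteen subset tests, which is why an unplanned look across many subgroups is treated as hypothesis generating.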
Now, which scenario applies to what we showed you?
The latter, not the former. The first time that I had ever
seen these data, had ever thought about it and so on was when
the data were sent to me after the data safety monitoring
committee released the data. It had not been something that
had been discussed or planned or anything prior to that. So,
it was not something where the scientists and the physicians
involved in the study generated and said, let's ask this
question, but rather an individual looking at it privately
came to that conclusion. So, that's why I describe it as a
hypothesis generating subset analysis rather than a test of
the question.
DR. NORTON: Could I just clarify this again just
to sort of emphasize it again? Because I think there's a danger
here that there's a lot of people who potentially could really
benefit from Taxol who may not end up getting it depending
upon what this committee does, and I think it would be a very
bad thing if that happens. The reason I'm saying that is
because let's just look at these curves again in this thing.
These are overall because there are a lot of
patients here. You subdivide it. You get wider confidence
limits. Of course, that's always going to happen. And you
see that the effect by ER negative/ER positive, that this is
now subdivided and there's a little bit less effect in ER
positive and a little bit better effect in ER negative, and
they average out to an overall effect. This is for
disease-free survival.
Overall survival, same thing. They subdivide
out. The real issue here -- I mean, the median points here,
the central point of effect is still good. It's just that
the confidence limits widen out, and that's why we see this.
And the confidence limits widen out because we're dealing
with a subset analysis here.
Next slide, same thing. It moves in a positive
direction, but wider subset analysis.
Next slide. This is by overall survival by
tamoxifen use, the same basic thing.
Next slide. The point I want to make is if you
look at the whole worldwide overview, you're dealing with much
larger numbers. Obviously, these get further away from the
0 line, the no effect line, because you're looking at
chemotherapy versus nothing. Before we were looking for Taxol
adding to AC, which is already good treatment. So, the
magnitude of the effect is going to be somewhat reduced. But
it's the same basic direction. The reason why these are
impressive is because the larger numbers involved bring the
confidence limits down and so it pulls it away from the line
of no effect.
Next slide, please. In fact, when we start to
do this with more reasonable comparisons, this is the effect
on subsets by age in the overview, you see that basically you
do, indeed, come to conclusions that the impact of therapy
on the ER positive group, whether they're older or they're
younger, starts to even get into that category. They start
to actually get into this no effect kind of group.
Now, universally worldwide, we're giving
chemotherapy to ER and PR positive patients that are
pre-menopausal and post-menopausal. If this number were not
7,000 but this number were 70,000 or 100,000, then the
confidence limits would shrink down and the patients would
clearly be receiving benefit. There's absolutely no doubt
about this. Because we're dealing with a trial that's a huge
trial of over 3,000 patients, but it's not 20,000 patients,
with the exact, same magnitude of the effects here, that we
could be misled into denying patients therapy that could be
lifesaving for them. And I think that we really have to be
aware of this as a potential danger. It's really not a matter
of subset type things. It's a matter of when you subset, you
have a smaller number of patients and you have wider confidence
limits.
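Dr. Norton's point about sample size can be sketched numerically. The example uses a common approximation for a trial with equal allocation, in which the standard error of the log hazard ratio is roughly the square root of 4 divided by the total number of events. The hazard ratio of 0.80 and the event counts are hypothetical and are not the trial's figures.

```python
import math


def hr_confidence_interval(hazard_ratio, total_events, z=1.96):
    """Approximate 95% CI for a hazard ratio, assuming 1:1 allocation."""
    standard_error = math.sqrt(4.0 / total_events)  # SE of log hazard ratio
    log_hr = math.log(hazard_ratio)
    lower = math.exp(log_hr - z * standard_error)
    upper = math.exp(log_hr + z * standard_error)
    return lower, upper


# Same hypothetical effect (HR 0.80) observed with more and more events.
intervals = {d: hr_confidence_interval(0.80, d) for d in (100, 400, 1600)}
for events, (lower, upper) in intervals.items():
    verdict = "excludes" if upper < 1.0 else "includes"
    print(f"{events:5d} events: {lower:.2f} to {upper:.2f} ({verdict} no effect)")
```

With the point estimate held fixed, only the number of events changes whether the interval crosses the line of no effect, which is the behavior he attributes to subdividing the trial population.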
There are very good kinetic reasons why the effects
are so. ER positive disease grows more slowly. The effect
of chemotherapy may be less because it's growing more slowly,
as is universally seen in all models we've looked at. But
also, it takes longer to see a benefit because it takes longer
for patients to relapse. So, for very good kinetic and logical
reasons we get these basic effects, exactly the same effects
we see for chemotherapy universally in all of our experience
as summarized in the worldwide overview.
DR. NERENSTONE: Dr. Johnson, did you have a
question?
DR. JOHNSON: Yes, I had a couple and it had
nothing to do with subset analysis --
(Laughter.)
DR. JOHNSON: -- although I'm thinking about
asking one now.
(Laughter.)
DR. JOHNSON: I had two questions. One had to
do with the cardiac toxicity which seemed shockingly low to
me, especially in light of yesterday's presentation where we
saw a lot of data about the use of single agent doxorubicin.
I guess it matters how one assesses the cardiac toxicity in
order to make that determination.
So, it wasn't very clear to me how that was done
in this trial, even in that first 300 patients. Were they
required to receive MUGA scans, for example, and if so, on
what basis and how frequent?
As a corollary to that, do we know what the late
developing cardiac toxicity might be in an individual who
receives AC followed by Taxol? We know, I think, a lot about
giving the two together, but what about the sequential use
of these?
DR. CANETTA: For the cardiac toxicity, can we
show that?
While the data are being sorted out, let me make
a statement concerning your last question, the sequential
effect. The monitoring of this trial continues and continues
for late cardiac effects and for secondary neoplasm, as you
know. Very recently in August, we filed the 120-day safety
update, which is mandated by law, to this NDA. I can tell
you that there was no difference again between the incidence
of cardiac effects occurring late in patients who received
AC as compared to patients who received AC followed by Taxol.
By the same token, there was no difference in the incidence
of secondary malignancies even with the 120-day safety update.
Here is the data. This is the data for the cardiac
toxicity during the period of follow-up. As you can see, we
decided to display this by doxorubicin dose, given the fact
that there was the 60, the 75, and the 90 milligrams per square
meter dosage. There seems to be a certain increase of cardiac
toxicity that is not really related to Taxol but appears to
be more related to the dosage with Adriamycin administered.
That's not surprising.
DR. HENDERSON: I think the important thing,
comparing yesterday and today, is the fact that the maximum
dose of doxorubicin, cumulative dose in the study is 360 per
meter squared. As you know, you don't really see a lot before
you get to that point.
The second thing is that when you're randomizing
3,170 patients and you multiply that by the cost of the MUGAs,
if you're obtaining them on a regular basis, the costs are
astronomical. We didn't feel that the costs justified the
kind of intense monitoring that took place in the study you
heard yesterday or, for example, in the Zinecard preparations.
So, we had a baseline MUGA on all the patients. We require
that every single follow-up form provide information on whether
there have been any cardiac events of any type since the last
follow-up form. So, unlike some of the data where it's hit
and miss, this is one of the things that has been monitored
on every follow-up form from day 1.
I was just checking the exact day. I think it's
5 years, but there is a required MUGA, as part of the long-term
follow-up, and we felt that it was more important to look at
this for all patients at the same point in time, but some time
out. As you know, cardiotoxicities often do not manifest early
and particularly not in an adjuvant setting. It becomes more
manifest particularly when the patients relapse and undergo
the extra stress to the heart and the various things that affect
it.
So, I think that given 360 per meter squared is
your maximum dose and given the fact that we're not intensely
looking for things, that this is probably very reasonable to
what a practicing oncologist would see.
DR. CANETTA: If we can show the slide, let me
back my statement with the actual numbers. This is the 120-day
safety update. As you can see, these are percentages, and
there is no difference between the two treatment arms. This
is consistent with what was presented in the NDA.
DR. JOHNSON: Now, what does cardiac function
mean?
DR. CANETTA: This is left ventricular ejection
fraction as contained in the follow-up form.
DR. JOHNSON: Is that statistically different?
DR. CANETTA: It's a reduction of the LVEF.
DR. JOHNSON: I don't understand. So, 40
patients in the AC had a reduction versus 56. Nearly 50 percent
more? Is that what you're saying?
DR. TUCK: Because of the way the data was
reported, it's not possible to give, for instance, a breakdown
of the not specified. This could include a variety of
different kinds of --
DR. JOHNSON: No. I'm looking at cardiac
function there. It says cardiac function, 40 under AC, 56
under ACT, total 96. I think those two add up.
DR. TUCK: It's not statistically significant
according to the statisticians.
DR. JOHNSON: Just in response to Dr. Henderson's
comment from yesterday. Actually the data yesterday showed
-- and I agree that clinically we don't see much in the way
of cardiac toxicity, but in that intensely monitored group,
actually the largest number of events, as it were, occurred
between 300 and 399 milligrams per meter squared of doxorubicin
of left ventricular ejection fraction decline, which if that
in turn is a marker or a surrogate endpoint I guess for
subsequent cardiac problems, it might be interesting to know
in that first 300 patients. I like the idea of doing the late
follow-up, though. I think that's critical.
The second question I have, though -- and actually
Dr. Norton addressed this in his overview, and I'm appreciative
of what he had to say about the number of cycles, but I want
to go back and ask this question very specifically. That is,
is the difference here Taxol, or is the difference here cycles
of therapy? And if it's the difference in therapy, I would
sort of like the impression of the two breast cancer experts
on their thoughts about this.
DR. CANETTA: Dr. Norton will give you the answer.
DR. NORTON: We thought about this very hard, and
I think you don't know for any individual patient obviously
if you're eradicating all, let's say, the AC sensitive cells
with 4 or if you need 5 cycles. There probably is some small
percentage of patients that would benefit from a little bit
more of a monotherapy, but it's probably going to be very small.
Obviously, we thought about this very intensively both in
the design of the analysis and the study.
If you look at the worldwide overview, this splits
it down by -- these are all the longer versus shorter regimens,
and these are the regimens that were longer than shorter ones,
but the shorter ones are at least 6 months, and the more relevant
ones are longer versus regimens that are less than 6 months,
especially these last four which are basically 6 cycles versus
3 cycles of something, three of them with CMF and one of them
with epirubicin. As you can see overall there, if there is
an effect at all of duration, it's in the 7 percent reduction
range with a standard deviation of 4, which doesn't meet
statistical significance. Even if you look at the most
relevant ones, you can see that the confidence limits really
overlap the no-effect curve for longer versus shorter. This
one actually goes in the other direction. These may go in
a direction, but it's a very, very slight effect.
This particular study with longer follow-up was
recently reported, and the confidence limits just barely shrunk
down to make it. This is the only one. It's an outlier effect,
and it took a long follow-up to basically see the effect.
So, there may be an effect of duration, but it's a very slight
effect and it doesn't come close to the magnitude of the effect
we're seeing in this trial.
The next, by the way, just shows the exact, same
thing. This next slide just shows for mortality. In
mortality the points I was making are made even more clearly.
DR. JOHNSON: Now I'm going to try to expand on
this just a little bit, and the statisticians may come to my
rescue here because I'm going to ask sort of a statistical
question I think. How confident can we be of these data that
this is not simply a duration effect? In other words, you've
just shown us a 7 percent difference, and the magnitude of
the difference here I see is quite large, in the 25 percent
range. In other words, do those two confidence intervals
overlap or are they really separate --
DR. NORTON: Well, in the overview it's 7 percent
for recurrence-free survival, less for overall survival.
Neither of them reach statistical significance. Here we're
talking about 22 percent and 26 percent reduction in death
rate, both very statistically significant and early on.
Obviously we have a very large trial and a large number of
patients giving us great power, so that's why we're seeing
it early on. But these are the 7 percent and less than 7
percent, not statistically significant, with 15-year
follow-up. You see? So, if we're seeing these kinds of
magnitudes this early, you can imagine how good it's going
to look in 15 years. So, I think it's really very clear we're
seeing something very different here than any kind of subtle
duration effect.
DR. NERENSTONE: Our time is running a bit short.
Drs. Temple, Lamborn, Raghavan, and Blayney all have
questions. Dr. Temple?
DR. TEMPLE: When you showed the subset analyses
for the overall population, one of the things that was, I guess,
impressive was that whatever the number of nodes, tumor size,
et cetera, the hazard ratios were all the same. Did you happen
to do that for the tamoxifen treated and for the no-tamoxifen
groups? I'm absolutely sure I know what the answer is -- I
mean, I know what the result of that is going to be, but each
subset is going to show nothing on the tamoxifen treated
patients. Right?
DR. CANETTA: I think actually Dr. Henderson
already showed that. We can show it again, the hazard ratio
bar graphs by tamoxifen treatment.
DR. TEMPLE: For each of the node subsets and tumor
size subsets, things like that.
I'm making the point that to achieve a hazard ratio
of approximately 1, you're going to have to have the same effect
in all of the subsets that were impressive before because they
all showed the effect. It's just that there's a consistent
finding. What to make of it is a tough question. Do you
understand the analysis --
DR. CANETTA: Yes. We'll try to pull out the
data, if we can.
DR. BERRY: I don't understand the question, Dr.
Temple. Are you saying that if you restrict to those who were
treated with tamoxifen, what do you get? If you restrict to
those who were not, what do you get? Are you saying if you
look within 1 to 3 nodes, do you get the same effect for
tamoxifen interaction?
DR. TEMPLE: Yes. One of the things that's always
impressive in a large database is that you look at all the
reasonable subsets and you find the same effect in all of them.
That was done for the entire population.
My guess is if you do that, dividing the population
up into the tamoxifen treated and the non-tamoxifen treated,
or receptor positive/non-receptor positive, you will see the
same phenomenon. The subsets will all look terrific for the
receptor negative ones and the subsets will all look like
nothing for the receptor positive ones or the tamoxifen treated
patients.
DR. BERRY: Yes, you are absolutely correct in
what you say. If you look at 1 to 3 positive nodes, 4-plus
positive nodes, and you look at the potential interaction with
tamoxifen, it's essentially the same in both.
And the effect of Taxol is the same in both. In
fact, it's essentially statistically significant in both of
those groups.
DR. NORTON: These are the actual data that we
pulled up because we analyzed. This is the overall effect,
which is good, narrow confidence limits. The one thing that
moves up here is -- and this is the one. This is the ER positive
or hormone receptor positive getting tamoxifen. It moves up.
The others are down. This even could be a statistical fluke
outlier, frankly, because the others move in the direction.
But even here, even in this subset, the midpoint is still
below the 0 line.
Remember, we're talking about subsets of subsets,
129 patients, 800 patients, 150 patients. So, when you start
to get subsets of subsets, you're going to get variable data.
DR. TEMPLE: I didn't mean the very small ones.
It's just the observation you made before, that when you break
it down by receptor status, it looks different. It looks even
more different actually when you break it down by whether or
not they were treated with tamoxifen because in the small subset
of receptor positive people who weren't treated with tamoxifen,
Taxol looked okay again.
DR. NORTON: Yes, but it's a subset of a subset,
and who knows what to make of this. This was all unplanned.
Patients were not randomized to tamoxifen. It's hard to know
what to make of that.
DR. WILLIAMS: Before you leave that slide, I
don't think that's a random subset, though. That is the group
that would benefit from tamoxifen. The others wouldn't. So,
it's not at all illogical that if the tamoxifen effect was
competing for the chemotherapy effect, that that group alone
would show it.
DR. NORTON: Yes. Obviously you would see it in
that effect, and it would be a lesser effect. But we're dealing
now with 1,900 patients. We're not dealing with 190,000
patients. You know what I mean? It's not a matter of
direction. It's exactly as the overview. It's a matter of
the confidence limits and it's a question of how far you want
to drive it. But it's entirely consistent with our whole
worldwide experience over 15-20 years.
DR. TEMPLE: That's been said multiple times.
The idea that there's a difference between the
groups in the overview is one thing. You're talking here about
hazard ratios that are very close to 1.
One thing that Dr. Berry may want to comment on
-- it has come up several times -- that patients were not
stratified by receptor status. What I always learned is that
when you're talking about a characteristic that's very common
in a large study, such as receptor status or something like
that, it's a pretty fair assumption that patients were randomly
assigned to treatments whether they were receptor negative
or positive. You're talking about 2,000 patients and 1,000
patients. That is not likely to be a problem. There are
plenty of other problems in interpretation, but that doesn't
seem like it would be one of them.
DR. BERRY: Yes, I absolutely agree.
DR. NERENSTONE: Dr. Lamborn.
DR. LAMBORN: I'd like to go to a whole different
topic, which is the issue of how do you interpret the p values
in this environment and the issue that came up of the fact
that there was a decision to announce the results early, that
the results have to be interpreted in the context of interim
analyses, and there's obviously the recognition that if you
look at the data multiple times, that you have an inflation
of the p value.
I would like to get a sense from the thinking of
the group that made the decision to make the announcement early.
As I understand it, there was a change from the original
stopping rule planned and the announcements were made early.
So, I'd just like a discussion of that and the implications
of that for our ability to evaluate how strong this data is.
DR. CANETTA: The data safety monitoring board
of the CALGB proceeded with this decision. I'd like Dr. Berry
to discuss it. I just want to make the point that Bristol-Myers
Squibb was not part of the DSMB and appropriately so.
DR. BERRY: This is to discuss a bit about the
DSMB deliberations. I can't tell you what the DSMB
deliberations were in closed session because I was not there.
I was not on the DSMB. I reported to the DSMB, and so I can
tell you what my deliberations were.
I was the person who drove from Charlotte in the
wee, small hours of the morning and lost sleep over this study.
That is not the first time I've lost sleep over this study.
I lived with it in the days when I was the only one who knew
the results. We presented to the DSMB blinded results by three
arms. They did not know that the three best performing arms
were the Taxol arms, and I lost sufficient sleep that I wanted
them to share my grief and I unblinded them in the early days
of the study, December 1996 -- not early days, but after 2,000
patients, when patient accrual was continuing. My question
to myself and the DSMB is, is it reasonable to continue with
accrual of this study in view of the results?
So, I'll address to some extent Dr. Lamborn's
questions about significance testing, adjusting for interim
analyses, announcing early versus early stopping, the
factorial design and early monitoring, the receptor status
interactions we've talked about, potential for treatment
crossovers, predicted probabilities and power calculations
versus ethics.
At the time that we announced the results of the
study, all patients had completed therapy. In fact, the last
patient was entered on April 15th of 1997, a year before we
announced the results.
The predicted probabilities of positive
significance results after 1,800 events were considered, and
delayed announcement might have denied some women the potential
benefit of receiving Taxol. That was a critical issue.
The O'Brien-Fleming -- this was based on four
analyses including the final analysis at 1,800 events. Of
course, it wouldn't be a final analysis. We'll continue to
monitor this study -- indicates a p value of .000007. It's
extremely conservative, and we did not reach it. So, strictly
speaking, the results at that time were not statistically
significant, even though the nominal p value, the actual p
value if we ignore interim analyses, was .007.
O'Brien-Fleming boundaries were proposed for
early stopping. This is not a question of early stopping.
We had stopped the study. There was no accrual of the study.
The question was should we announce the results early or not.
There was no consideration in the protocol to
adjust for a significance level for the factorial design.
That was my fault and so, strictly speaking, we couldn't obey
what the protocol told us to do.
The predictive probabilities -- and this was very
important to the DSMB I'm led to believe -- of a statistically
significant result, if we went to the 1,800 events in May of
1998 for Taxol versus no Taxol, the probability of statistical
significance was 93 percent. At the current time with 624
events, it's 99 percent; that is, if we were to continue and
monitor it to 1,800 events, it's very likely we'd get
statistical significance.
DR. LAMBORN: Could you just clarify under what
assumption?
DR. BERRY: Yes. This is a Bayesian calculation
assuming a non-informative prior.
This is related to Dr. Williams' question about
1 year, 2 years, et cetera. The data are essentially in at
1 year. There is a highly statistically significant
difference at 1 year, and so if we were to go 30 years from
now, this observation is essentially the same now as it will
be then, and this is about a 40 percent reduction in
disease-free survival. And similarly, about a 45 percent
reduction in death.
This is a picture I showed you before, and in this
region we have essentially complete data. So, these results
are not going to change even if we were to follow up longer.
One further question about this subset thing.
I didn't mention it. One of the reasons for announcing the
results was precisely so that laboratories could address this
question of Taxol versus tamoxifen or Taxol versus hormone
receptor status, and that is being done. To my knowledge,
the only extant explanation, to address one of Dr. Lamborn's
earlier questions, biologically for the relationship is HER2/neu
and estrogen receptor status. There's a negative
relationship between the two. HER2/neu is known to affect
Taxol. There are some people who publish results showing
sensitivity, some showing resistance. If indeed there's
sensitivity, then this might explain some of the interaction,
but it cannot explain all of the interaction.
DR. NERENSTONE: Dr. Raghavan? Dr. Blayney.
DR. BLAYNEY: On page 20 of your briefing
document, you talk about the patients who were over 65 years
old. 94 percent of your patients were less than 65 years of
age. Dr. Henderson went through a nice step-wise progression
of how he counsels a patient regarding the benefits of
chemotherapy. For those breast cancer women who are estrogen
receptor negative who are over 65 or for those breast cancer
women who might be candidates for chemotherapy who are over
65, do you feel comfortable in proceeding to the last step
of your progression, which includes AC followed by Taxol, based
on this data?
DR. CANETTA: Before we discuss the efficacy
subset, let me make a statement that I think is pertinent to
this. As part of our study report, we did analyze toxicity
in this subset of patients, and I can tell you that when you
look only at the AC plus Taxol arm and you compare younger
patients or 65 and older patients, the incidence of grades
3 and 4 granulocytopenia in the entire population is 50 percent
for the younger patients, 55 percent for the older patients.
The incidence of infection is 6 percent, 6 percent. So, it
doesn't appear at this level of safety consideration that this
population suffers significantly more.
I should add an important thing, though. In
August, we submitted to the FDA a complete reanalysis of all
our NDA pivotal trials done with Taxol in breast cancer, in
all the other tumor types, where we reanalyzed the safety
according to the age of the patient. These encompassed
actually a review of a fairly large database, more than 3,000
patients. It has been submitted to the agency as part of the
modification of the package insert so as to provide this type
of information to the care provider. And there doesn't seem
to be an increased risk of toxicity in the older population.
That is consistent not only with the finding of the study
but with the overall experience with this compound.
DR. BLAYNEY: So, the febrile neutropenia is an
acute toxicity. I think part of the issue I face in dealing
with over 65 women is sort of the more the chronic or longer-term
toxicity.
DR. CANETTA: We can show the data, but again in
terms of mere incidence, there is no difference between the
younger patients and the older patients in this study, nor
in the overall database for Taxol for other stages of this
disease and for other tumor types.
Can we show the data?
DR. NERENSTONE: We're running short on time.
Did that answer your question, Dr. Blayney?
DR. BLAYNEY: There's a small number of patients,
6 percent of your 3,000. That's 180 patients were over 65.
Is that significant? How comfortable can we be in advising
the FDA that this is relevant to 65-year-old and older women?
DR. CANETTA: Again, when you put things in
perspective, the reason of our comfort is that this is almost
200 patients in this study, but we have the entire experience
with Taxol in the treatment of cancer that supports that.
That's what makes us more comfortable with the fact that elderly
patients will not be at an undue risk of toxicity receiving
Taxol at these dosages and at this schedule.
DR. BERRY: I just want to make one comment about
that. It is, of course, a very small subset. I just looked
at the disease-free survival effect of Taxol in the greater
than 65. It's exactly the same as in the younger patients.
DR. BLAYNEY: Thank you.
DR. NERENSTONE: And, Dr. Pelusi, did you have
one more question?
DR. PELUSI: I just want to make a comment in terms
of quality of life and I think that that is some of the things
that have come out either in the long term, cardiac toxicities,
as well as our older patients. I think it becomes very valuable
to all of us as we're trying to decide which patients should
go or be encouraged, if you will, or given options in different
treatment, what really is the effect of quality of life because
as we start to see different approaches to the same thing,
the question is what is the quality of life. Nowhere did I
see any quality of life studies at this particular time, and
I think it might be interesting long term to see if that can
be added, not just necessarily toxicities, but what do those
toxicities translate into for quality of life for the patients.
DR. CANETTA: Unfortunately, for this particular
trial, instruments of quality of life were not used. I have
to say that the surrogate marker for quality of life would
be the interpretation of toxicity, acute and chronic toxicity.
As you have seen, we've been monitoring in the longer follow-up
for cardiac events, for secondary malignancies.
I can tell you that the toxicities that were
induced by Taxol during the Taxol phase consisted chiefly of
neurosensory toxicity. The vast majority of the patients who
dropped out of Taxol did so because of neurotoxicity, and that
was reversible, and 14 patients altogether dropped out for
hypersensitivity reaction out of the 1,400 patients.
Obviously this stopped as Taxol was stopped. As for the other
toxicities, alopecia, unfortunately, is a side effect of
Taxol. It's fully reversible. And there is no sign that Taxol
added toxicity.
On the other hand, again we're talking about a
survival advance here. Therefore, I think you need to put
that in perspective with the efficacy.
DR. PELUSI: And I do appreciate that, but again
when we look at overall quality of life, there are additional
things other than those specific things. I do agree with you
on that, but again there are family issues as well.
DR. NERENSTONE: I'd like to thank everyone and
the sponsor.
We'll take a break now and I'd like everyone back
at 10:20. We are running behind. Thank you.
(Recess.)
DR. O'LEARY: Good morning, members of the
committee, ladies and gentlemen. My name is James O'Leary
and I will be presenting the FDA review of the supplement for
Taxol for adjuvant treatment of breast cancer.
Before I begin, I would like to recognize the
members of the review team who were instrumental in helping
the FDA perform this review.
As I said, I'll skip this first slide since the
sponsor already went over the proposed indication.
We're all familiar with the title of the study,
and the sponsor also addressed this.
So, I will go on to the third slide. I would just
like to bring up at this point that the applicant has performed
the first interim analysis as prespecified in the protocol
to take place at 450 events. The data presented in this
analysis represents an update to that first interim analysis.
Two more interim analyses are scheduled to take place at 900
events and 1,350 events, and the final analysis will take place
when 1,800 events have occurred.
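As an illustration of the monitoring schedule just described, the sketch below computes the information fraction at each planned look (450, 900, 1,350, and 1,800 events) and the cumulative two-sided alpha spent under a Lan-DeMets O'Brien-Fleming-type spending function. The choice of spending function, the overall alpha of 0.05, and all function names are assumptions made for illustration. The transcript does not state how the protocol's boundaries were actually computed, so these numbers should not be expected to match any value quoted during the meeting.

```python
import math
from statistics import NormalDist

# Planned analysis points, in number of events, as stated in the transcript.
PLANNED_EVENTS = [450, 900, 1350, 1800]

# Assumed overall two-sided significance level (not stated in the transcript).
ALPHA = 0.05


def information_fractions(events):
    """Fraction of the final event count available at each planned look."""
    final = events[-1]
    return [e / final for e in events]


def obf_alpha_spent(t, alpha=ALPHA):
    """Cumulative two-sided alpha spent at information fraction t.

    Lan-DeMets O'Brien-Fleming-type spending function:
        alpha*(t) = 2 - 2 * Phi(z_{alpha/2} / sqrt(t))
    which equals erfc(z_{alpha/2} / sqrt(2 * t)).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.erfc(z / math.sqrt(2 * t))


if __name__ == "__main__":
    for events, t in zip(PLANNED_EVENTS, information_fractions(PLANNED_EVENTS)):
        print(f"{events:>5} events  t={t:.2f}  alpha spent={obf_alpha_spent(t):.6f}")
```

The point of the sketch is qualitative: under this kind of rule very little alpha is available at the first look, which is why a nominal p value that looks small can still fall short of the early boundary.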
Accrual by arm, the sponsor already addressed
this. There was equal distribution of patients to each arm.
And I'll get right into the FDA analysis. The
FDA agrees with the applicant's analysis of the overall
disease-free survival in the population studied. However,
the core of my discussion will focus on results of this study
in subgroups defined by hormone receptor status, particularly
those patients with estrogen receptor and progesterone
receptor negative tumors, those patients with estrogen receptor
positive and/or progesterone receptor positive tumors, and
finally those patients with ER and/or PR positive tumors who
received tamoxifen. Although these analyses represent
subgroup analyses, I think that the large number of patients
in each group and the notable number of events occurring in
each group lends credibility to these analyses.
First of all, in the group of patients with
receptor negative tumors composed of over 1,000 patients, the
apparent beneficial effect of Taxol is dramatic, with the
hazard ratio of 0.66 suggesting almost a 34 percent reduction
in risk of recurrence.
When the results of the disease-free survival
analysis for the receptor negative patients are plotted, this
graph, which was submitted by the sponsor, shows a substantial
difference in disease-free survival in favor of the Taxol
treated patients. The agency estimated disease-free survival
estimates at 3 years using unadjusted Kaplan-Meier curves.
The results of this analysis showed that the Taxol treated
patients had an estimated 3-year disease-free survival rate
of 67.3 percent compared to 56.8 percent for the control group.
This difference represented by the two survival curves at
3 years is quite noteworthy at 10.5 percent.
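For readers unfamiliar with how an unadjusted Kaplan-Meier estimate at a fixed time point is obtained, the following is a minimal sketch of the product-limit calculation. The follow-up times and event flags in the example are hypothetical and are not trial data; the function names are likewise illustrative. Only the two 3-year rates in the final lines (67.3 and 56.8 percent) come from the transcript.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the patient had an event at that time, 0 if censored
    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(pairs) and pairs[i][0] == t:
            n_events += pairs[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


if __name__ == "__main__":
    # Hypothetical follow-up in years; 1 = relapse or death, 0 = censored.
    times = [1, 2, 3, 4, 5]
    events = [1, 0, 1, 0, 1]
    curve = kaplan_meier(times, events)
    print(survival_at(curve, 3))

    # Difference between the two 3-year rates quoted in the transcript.
    print(round(67.3 - 56.8, 1))
```

The estimate is called unadjusted because it uses only time and event status per arm, with no covariates.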
The next subgroup that we analyzed in terms of
disease-free survival consisted of over 2,000 patients who
had ER positive and/or PR positive tumors. The agency derived
a hazard ratio of 0.93 with a p value of 0.56, which is similar
to the sponsor's value for this analysis.
These statistical calculations at this interim
analysis provide little justification for believing that
Taxol, sequential to AC, confers added benefit to patients
with ER positive and/or PR positive tumors. The following
graph, which was also included in the sponsor's submission,
shows that there's no appreciable difference between the
disease-free survival curves.
The agency once again performed estimates of
3-year disease-free survival. The results, 81.6 percent for
the AC treated patients and 81.2 percent for the AC plus Taxol
treated patients, provide no evidence that the Taxol treated
patients who had ER positive and/or PR positive tumors evinced
any benefit from the addition of four cycles of Taxol in
adjuvant therapy for their node-positive disease.
The findings in the ER positive and/or PR positive
subset of patients prompted the FDA to perform an additional
analysis on those patients who had hormone receptor positive
tumors and received tamoxifen. Even though this represents
a more specific subgroup than the previously identified group,
it consisted of a sizable number of patients at close to 2,000.
The analysis of this subgroup is even less suggestive of a
trend toward Taxol effect with a hazard ratio of close to 1.
The most closely related analysis performed by
the sponsor is disease-free survival in all tamoxifen treated
patients. As can be seen in the sponsor's graph, there is
no appreciable difference in the disease-free survival curves
for Taxol treated patients compared to the control group.
In summary, the agency is in agreement with the
sponsor on the overall positive effect of Taxol. However,
these overall positive results are based on the findings in
the ER/PR negative group of patients. The evidence for a Taxol
effect in the receptor positive or tamoxifen treated patients
appears to be insufficient.
In this trial, the efficacy endpoints were
disease-free survival and overall survival. Objective
disease relapse was used to evaluate disease-free survival
and was defined as the appearance of local recurrence or distant
metastases at any site or death due to any cause. The most
common reason for failure was the occurrence of distant
metastases, with the second most common reason for failure
being local disease recurrence.
Taxol demonstrated efficacy in decreasing the odds
of both distant recurrence and local recurrence. This chart
shows that the effect of sequential Taxol in decreasing the
odds of recurrence was similar for both distant and local rates
of recurrence.
Before I go on, I would like to present a quick
overview of the other definition for objective disease relapse
in this protocol which was death due to any cause. At a median
follow-up of 30.1 months, a total of 342 deaths had been
reported. 192 deaths had occurred in the AC treated group,
which is comparable to 12 percent of the population, and 150,
or 10 percent, of those treated with AC plus Taxol had died.
The corresponding percentages of survivors are shown on the
right-hand side of the figure.
As we saw in the analysis of disease-free survival,
according to the three identified subgroups, when we interpret
the results in overall survival with respect to the same three
subgroups, a similar pattern emerges. The positive results
for the entire study population are driven by the very
noteworthy beneficial effect of Taxol in the ER negative/PR
negative population.
The first graph, this graph, and all subsequent
graphs were taken from the sponsor's submission. This first
graph compares overall survival in receptor negative patients
treated with AC versus AC plus Taxol. Those treated with
sequential Taxol derived a substantial survival advantage.
Sponsor and agency hazard ratios were consistent. The sponsor
reported a hazard ratio of 0.72 with a corresponding p value
of 0.11.
In those patients with ER positive and/or PR
positive tumors, there was no appreciable difference in overall
survival when the AC treated group was compared to the AC plus
Taxol treated patients. The sponsor calculated a hazard ratio
of 0.83 with a corresponding p value of 0.31.
The lack of evidence for effect with sequential
adjuvant Taxol after 4 cycles of AC is even more pronounced
when comparing AC treated versus AC plus Taxol treated patients
who had hormone receptor positive tumors and received
tamoxifen. The sponsor's hazard ratio of 0.92 and p value
of 0.63 reflect all patients treated with tamoxifen.
Since the reported toxicities for AC were
comparable and occurred with equal frequency during the AC
part of treatment in all patients, I will not repeat them here.
Instead I will focus on the toxicity associated with 4
additional cycles of Taxol, which is not without risk.
The early population, as the sponsor indicated
earlier, consisted of the first 325 patients that were accrued
to the trial. The protocol specified complete reporting of
all adverse events that were grade 2 or higher for this cohort
of patients. Therefore, the figures in blue represent the
most accurate toxicity profile for Taxol in this trial. The
incidence of adverse events was reported as the worst grade
per patient. This does not tell us if the same worst grade
toxicity recurred in subsequent cycles of therapy. Women of
all age groups experienced more non-hematologic toxicities
with the addition of Taxol. The risk profile is expected based
on the known toxicities associated with the use of Taxol with
the most notable toxicities including hypersensitivity
reactions, neurosensory events, arthralgias/myalgias,
diarrhea, and neuromotor toxicity. In summary, the impact
of 4 additional months of therapy should not be discounted.
The women suffered some morbidity and some decrease in quality
of life.
82 patients, or 6 percent, of those randomized
to treatment with AC plus Taxol discontinued therapy during
Taxol due to drug-related toxicity. In comparison, 15
patients withdrew from therapy in the AC arm, and 17 patients
randomized to the AC plus Taxol regimen withdrew during the
AC portion of their treatment.
2 patients died acutely from Taxol toxicity. 1
patient had a brain infarct subsequent to sepsis, and 1 patient
experienced a hypersensitivity reaction. The patient who died
during AC treatment died of respiratory disease which was
attributed by the investigator to disease progression and not
related to drug toxicity.
Some issues to consider. For the entire study
population, the overall results of the trial are very positive.
The use of Taxol reduced the recurrence rate or risk of
recurrence by 22 percent with a hazard ratio of 0.78 and reduced
the risk of death by 26 percent with a hazard ratio of 0.74.
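The percentage reductions quoted throughout this presentation follow directly from the hazard ratios: the relative reduction in risk is one minus the hazard ratio. A short sketch of that arithmetic, using only hazard ratios stated in the transcript (the function name is illustrative):

```python
def percent_risk_reduction(hazard_ratio):
    """Relative reduction in hazard, in percent, implied by a hazard ratio."""
    return (1.0 - hazard_ratio) * 100.0


if __name__ == "__main__":
    # Hazard ratios as quoted in the transcript.
    quoted = {
        "recurrence, all patients": 0.78,
        "death, all patients": 0.74,
        "recurrence, receptor negative": 0.66,
        "recurrence, receptor positive": 0.93,
    }
    for label, hr in quoted.items():
        print(f"{label}: HR {hr:.2f} -> {percent_risk_reduction(hr):.0f} percent reduction")
```

A hazard ratio close to 1, such as the 0.93 reported for receptor positive patients, therefore corresponds to a reduction of only a few percent.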
Although the FDA usually views subset analyses
with trepidation and great caution, the agency feels that the
results in this trial with respect to the identified subgroups
are compelling. The subgroups are large, with a notable number
of events occurring in each. The subgroups represent
medically plausible populations. In fact, the protocol
specified different treatment for patients in each subgroup
to receive or not receive tamoxifen.
And finally, the overall results of this trial
seem to be driven by the findings in the receptor negative
population treated with Taxol.
Furthermore, 4 additional cycles of chemotherapy
are not without risk. As we saw, 82 patients discontinued
Taxol therapy because of drug-related toxicity and 2 patients
died acutely of drug-related toxicity during Taxol therapy.
Based on these data from an interim analysis, it seems to
me that the lack of evidence of a Taxol effect in patients
with receptor positive tumors treated with tamoxifen would
not justify the added toxicity of 4 additional cycles of Taxol
chemotherapy.
In summary, based on the current interim data,
the net beneficial outcome in disease-free survival and overall
survival reported for all AC plus Taxol treated patients
appears to be derived from those patients with tumors that
were hormone receptor negative for both estrogen and
progesterone. This group comprised about one-third of the
entire study population.
I believe there is sufficient evidence to approve
Taxol as adjuvant therapy subsequent to the combination of
doxorubicin and cyclophosphamide in patients with
node-positive breast cancer who have tumors that are negative
for both estrogen and progesterone receptors. This
recommendation is based on the striking improvement
demonstrated for disease-free survival and overall survival
in this subgroup.
Two-thirds of the study population had tumors
which were hormone receptor positive. Per protocol, these
patients received tamoxifen. At the first interim analysis
of this trial, there seems to be no evidence of benefit from
4 additional courses of chemotherapy with Taxol after AC in
patients who will receive tamoxifen. The effect of Taxol
cannot be discerned in this group of patients.
Therefore, based on the currently available
interim data, I do not believe there is sufficient evidence
to recommend approval for Taxol as adjuvant therapy sequential
to the combination of doxorubicin and cyclophosphamide in
patients with node-positive receptor positive breast cancer.
This recommendation is based on the near unity in the hazard
ratio and no trend toward statistical significance, along with
3-year disease-free survival estimates showing no difference.
I must say that the result of future interim analyses and/or
the final analysis may alter this current recommendation.
Thank you for your attention.
DR. NERENSTONE: Thank you. We'll now open up
to questions from the committee. Dr. Williams?
DR. WILLIAMS: I want to make a statement for the
team.
I think we've had a very good discussion with
breast cancer experts and with the company and the team's
presentation.
We made a recommendation here but I really think
that at this point in time we're really more asking what's
the right thing to do. I really think that this is a very
tough call. I just wanted to sort of communicate the FDA's
current position on this.
DR. NERENSTONE: Thank you.
Dr. Kelsen?
DR. KELSEN: It seems to me that the major issue
that you've raised, since there's general agreement on a
recommendation for non-estrogen receptor and progesterone
receptor patients, is how bad is it to take 4 cycles of Taxol
for ER/PR positive patients when we don't yet have full
evidence of benefit, but you're basing it on a subset analysis.
As I look at, I guess it's slide 32 from the
sponsor's presentation, looking not at grade 2 toxicity but
grade 3 or 4 toxicity because, although no one wants any
toxicity, the key issues are serious toxicities, those numbers
are very small for the AC followed by T arm for serious grade
3 toxicity unless I'm misreading this, either hematologic or
non-hematologic toxicities.
DR. O'LEARY: I believe it was in the range of
about 15 percent --
DR. KELSEN: Leukopenia, 9 percent;
granulocytopenia, 21 percent; less than 1 percent or 1 percent
for everything else, including cardiovascular, nausea,
vomiting, whatever. Slides 34 and 32.
So, we're basing our recommendation to not give
therapy to ER or PR positive patients on a subset analysis
with trends that are slightly below the unity point. And
that's not a very comfortable feeling to withhold therapy that
may change the cure rate. So, you have to be pretty comfortable
I think that it's the right thing to do because it will be
several years before we know for sure that this is not effective
therapy in making this decision.
DR. O'LEARY: The next interim analysis will
occur? Can the sponsor tell us?
DR. BERRY: The 900 will be probably 12, 18 months
from now. I'm not sure.
DR. CANETTA: If I can just make a point. I wonder
whether it is appropriate to call these interim analyses
because the definition of interim analysis applied to the
stopping rules for the protocol. This study that's been
reported has not been stopped. It has been completed. So,
I don't think that there is a compelling reason to go back
to 900 events or 1,350 events as the protocol wrote that would
have been done in the event that the protocol had to stop.
The protocol has been completed.
DR. WILLIAMS: I think the protocol was designed
to perform analyses based on number of events, and I call that
an interim analysis. I think we'd be interested in the data
as they were designed to be collected and we would make
decisions based on those at each particular time. I'm not
quite sure I understand your distinction. Certainly we can't
stop the trial, but we're certainly going to look at the data
when there's twice as much as there is now.
DR. NERENSTONE: Dr. Temple and then Dr. Margolin.
DR. TEMPLE: I don't think interim here was meant
to imply that there's anything wrong with it. I think Dr.
O'Leary was just expressing the hope that perhaps with more
data, there might be a benefit seen in that subpopulation.
I imagine everybody sort of hopes for that. It wasn't a
statement that the data aren't persuasive for some information
now.
DR. NERENSTONE: Dr. Margolin.
DR. MARGOLIN: I think, although we don't have
this data and we won't really from this study or maybe from
the next or next after that early trialists group, we have
to consider that the addition of Taxol is going to have an
impact on all groups similar to the addition of chemotherapy
to hormonal therapy in patients with ER positive disease.
Since there are often different levels of
limitation or caution that can be placed on drug approvals,
one option that we've seen the FDA do sometimes -- and I would
wonder if that's being considered -- is not to limit the actual
sentence that's written for the indication in the approval,
but to have very prominently in the package insert the data
from this trial cautioning that the proof of benefit of Taxol
in the ER positive patients who receive tamoxifen has not yet
been demonstrated beyond all doubt.
DR. NERENSTONE: Dr. Justice?
DR. JUSTICE: The answer to that question is yes.
We can put in the clinical study section, if the committee
recommends, a full disclosure of the issues. We have that
in indications as well, but definitely in the clinical/pharm.
DR. NERENSTONE: Dr. Kelsen.
DR. KELSEN: This is a procedural question.
We're hopefully going to see large scale trials in a number
of solid tumors over the next few years, many of which may
not have a subgroup analysis planned of this type. What will
the position of the agency be, let's say, if we do a colon
cancer trial and we're lucky enough to get 5,000 patients in
it? And there are a number of subgroups in colon cancer.
We're not going to do subgroup analyses in all of them. How
shall we approach that as these adjuvant trials come through?
DR. NERENSTONE: Dr. Temple?
DR. TEMPLE: Well, we're usually on the other side
of this argument.
(Laughter.)
DR. TEMPLE: We're historically skeptical about
subgroup analyses, especially when they try to salvage an
otherwise negative study.
I think the theme here is that this sort of grabs
you by the hair more than most of them do. We are, in general,
resistant to making much out of the many possible subset
analyses that show up in trials. So, we have the same attitude
that the company is expressing. It's just that when you see
two-thirds of a study with a hazard ratio of approximately
1, you sort of have to say, well, what should I do with this?
So, I would consider this quite exceptional. We don't usually
celebrate the small differences that are inevitable in any
trial. So, it's not a difference in attitude. We're very
skeptical. But as Jim said, this sort of grabs you.
DR. NERENSTONE: Are there other questions from
the committee?
(No response.)
DR. NERENSTONE: Okay, thank you very much.
At this point, I've been asked to reopen the public
hearing and Dr. Marissa Weiss would like to address the
committee.
DR. WEISS: Good morning. My name is Marissa
Weiss. I'm a physician oncologist specializing in breast
cancer, and I'm here today representing my nonprofit
educational organization, Living Beyond Breast Cancer, which
is Philadelphia based but a national organization. Our
mission is to help all women affected by breast cancer live
as long as possible with the best quality of life.
I am here on my own. I was invited by myself.
Bristol-Myers is one of many companies that buys a few seats
at our table for our annual gala, which is next week, and all
of you are invited. There will be 800 people there.
(Laughter.)
DR. WEISS: I'd just like to start by putting this
into perspective. We're all here in the room for the same
reason, which is 40 percent of 180,000 newly diagnosed women
with breast cancer will have their lymph nodes involved, and
as Dr. Henderson said, of the 3,000 people on this study, over
half were expected to have a recurrence. So, this is a large
group of women, 72,000 women diagnosed each year, with nodes
involved, and over half are still predicted to recur over the
long term. So, we desperately need effective treatments for
these women.
I am struck by the incremental benefit that Taxol
offers to women who have already completed their Adriamycin
and Cytoxan chemotherapy. It's very impressive, and the shape
of the curves, two parallel curves, over time -- those two
points of analysis -- they're identical. But also the curves
start to plateau out. So, I feel comfortable with the
reliability of that data.
Also, we've had a longer experience with Taxol
than just this study. This is not the first study. We have
a lot of information about toxicity, not necessarily after
AC chemotherapy.
These data do cover the highest risk period in
this particular population of women with nodes positive, the
first 3 years being the highest risk period. These data are
just short of 3 years.
Just to say for all of us in the room who have
already given our patients the benefit of Adriamycin and
Cytoxan chemotherapy, what this study does show is at least
dose intensification of Adriamycin doesn't buy you anything
more. So, we've got this group of women who have gotten the
benefit of the best standard chemotherapy and giving more of
it doesn't do a damned thing. So, the point is what more can
we do for these women that's substantially different, and it
seems that Taxol does do that without significant incremental
side effects.
Clearly additional chemotherapy being involved
for 4 more months, quality of life issues are definitely there.
But we all know that for those women on this study -- and
most of them are young women in the prime of their lives.
They're going to choose it. They can trust that they with
their doctor can have a discussion that says, based on this
potential incremental benefit in your situation, do you want
to accept these additional incremental side effects? I have
to say that the people I represent want to have that option.
In terms of the subset analyses, I'm happy to see
that the estrogen receptor negative patient who hasn't had
the benefit from tamoxifen over these years and is very envious
of the woman who's estrogen receptor positive who gets
tamoxifen, but this is really good news for them.
But in terms of the subset analyses, you could
really take that pretty far. For example, is there a spectrum?
You've shown us that the women that are hormone receptor
negative, both estrogen and progesterone receptor negative,
have the greatest benefit. If you look at the women who were
either ER positive or PR positive, they don't see as great
a benefit. There may be a continuous spectrum of benefit from
starting from those patients who were both ER/PR negative
having the greatest benefit and those patients who were both
ER/PR positive who are also taking tamoxifen and stick with
their tamoxifen, they're going to see the least benefit because
those people of this group are going to do the best anyway.
So, any incremental benefit is going to be hard to measure,
particularly over this period of time. 3,000 patients is a
lot of patients, but maybe not large enough.
So, these data are very compelling to me, and I
am concerned about the subset analyses, and I think if you
really want to put weight on these subset analyses, I'd like
to see a spectrum of the differential effect that Taxol gives
after AC for every combination of the hormone receptor
positivity and negativity, starting from all ER/PR positive
to the ER/PR negative and the different combinations, different
numbers, and also if the patients stick to tamoxifen or they
don't because we all have patients who are ER/PR positive who
can't take it for some reason or who start taking it and stop
taking it. Then your hands are tied. What more can I do for
this woman who's in front of me? We're talking about women
whose lymph nodes are involved. You're talking about people
whose long-term survival is 50 percent over long term, and
we want to make things better.
So, as a physician and as an advocate for the 30,000
breast cancer patients nationally who are members of our
organization, I think that Taxol should be approved and be
available to the patient and the doctor with an up-front
discussion. I really favor this being part of the package
insert, where a doctor is guided by the package insert and
says, we're in this situation now. You've had the benefit
of this. What is your style of making decisions? Do you want
to do everything possible today to make sure you never see
the cancer again? And make sure that the decision to proceed
with this is an informed one.
Thank you.
DR. NERENSTONE: Thank you very much, Dr. Weiss.
Now I'd like to open up the committee discussion.
First, are there any general comments from the committee?
Dr. Raghavan?
DR. RAGHAVAN: I think everybody has identified
just how difficult one part of this is. I came in this morning
thinking the FDA were absolutely wrong, and Grant Williams
is a thoughtful reviewer and I was surprised that he would
actually do an about-face and allow subset analysis in with
FDA blessing and, in fact, castigated him as I arrived in.
But listening to the discussion, the faster Larry
Norton talked, the more confused I became and came out of it
feeling that maybe he was wrong. He made one statement that
troubled me a lot, which is the smaller the sample size, the
broader the confidence interval, and that's not a generically
true statement. It's only true if you have a scatter of points.
If everybody has a similar survival with a small sample size,
then the confidence intervals don't widen. It's a small point,
but it just got me to thinking that it isn't that simple.
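The point about sample size and scatter can be illustrated with a simple mean rather than a survival curve. This is only a sketch under stated assumptions: it uses a normal-approximation 95 percent interval for a mean, and the data values are made up for illustration and are not trial data. The width of such an interval depends on both the sample size and the spread of the values.

```python
import math
import statistics


def ci_half_width(values, z=1.96):
    """Normal-approximation half-width of a 95% confidence interval for the mean."""
    n = len(values)
    spread = statistics.stdev(values)  # sample standard deviation
    return z * spread / math.sqrt(n)


# Hypothetical values, for illustration only.
scattered_small = [20, 35, 50, 65, 80]   # small sample, wide scatter
scattered_large = scattered_small * 20   # same scatter, 100 observations
identical_small = [50, 50, 50, 50, 50]   # small sample, no scatter

for name, data in [
    ("scattered, n=5", scattered_small),
    ("scattered, n=100", scattered_large),
    ("identical, n=5", identical_small),
]:
    print(f"{name}: half-width {ci_half_width(data):.2f}")
```

With scatter present, the smaller sample gives the wider interval. With no scatter at all, the interval has zero width even at n=5, which is the exception being described. Intervals for survival estimates are computed differently, so this shows the principle only.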
I listened to Dr. Weiss just now and I was thinking
that she was oversimplifying things as well.
I think the reality is Taxol is a terrifically
useful drug for some people, but it's a drug that causes side
effects and people potentially have anaphylactic reactions.
And we shouldn't just say this is an all or nothing thing
in which it's either all good or all bad.
Now, I think everybody has conceded that in ER
negative patients, there's a really substantial survival
benefit, both overall and disease-free. That's terrific.
It means that for ER negative patients this is a major step
forward, and Larry Norton's conceptual thinking has influenced
us on this. And it's a huge step forward, and I think that's
great.
Of course, what we're struggling with now is the
fact that there's such a major impact on the outcome of that
smaller group that it could easily have weighted the overall
study. And it's pretty hard not to look at the survival curves
and say they really sit one on top of the other, notwithstanding
the fact that it's a subset analysis.
I think Dr. Temple's point is a little different
because the subset that is being looked at is actually bigger
than any other subset in the whole study.
So, in the discussion that ensues, I hope that
the rhetoric that we've been hearing doesn't sway us. I think
the reality of the situation is there's one group of about
1,000 patients that were ER negative/PR negative and didn't
get tamoxifen or, for that matter, did get tamoxifen where
the hazard ratio clearly favors approval.
It's not quite that simple, I don't think, with
the ER positives who got tamoxifen. The question, of course,
is if a woman is having chemotherapy and is going through the
tail end of it, which is normally when it's the toughest and
the most wearing, if they're on tamoxifen, you want to be sure
that you're actually giving them something back for adding
Taxol.
So, I'd like to hear the breast experts around
the table and elsewhere talking a little more about it, not
just to make a very simple one-liner that subset analyses are
bad because I think this is one of the more difficult decisions
we've had to make at the committee.
DR. NERENSTONE: Any takers?
I'll plunge in a little bit, Derek. I think one
of the things as a practitioner that I agree the lack of
significant effect is -- "concerning" is too great a word,
and I think you're right. There is no question about the ER/PR
negative patients.
The survival curves are very close, but there is
an effect. The curves never cross, at least not from my
non-statistical eyeballing of the curves, suggesting that it
is very possible in the future that they will separate. Maybe
we should have a statistical discussion about that. What is
the likelihood that we will get an effect with more events
and further follow-up because I think that's the question.
Remember, this is a subset analysis and the study is very
positive.
What I think clinicians want to avoid at this point
is denying patients possibly curative therapy, although
everyone will admit the effect is going to be small, on the
basis of a subset analysis where we know the benefit is going
to be small.
Dr. Lamborn, can you comment on that?
DR. LAMBORN: The problem, of course, as has been
identified, is as soon as you go into subset analysis, you
have to consider how much you believe this is based on prior
medical judgment that these groups are going to be different
versus you've just taken a whole series of subsets.
But the closest I can come, based on the
information you have right now, is to reference back to I think
it is the last slide that was in the FDA presentation where
they looked at the ER positive and/or PR positive tumors and
looked at the 3-year disease-free survival. And you asked
did it cross. Obviously, it slightly crossed in terms of
disease-free survival because that's 81.9 on the AC plus Taxol
compared to 82.7 for the AC group. But they are so much on
top of each other, what do you call "cross"?
But the other thing is your hazard ratio, which
is a .98, which is pretty close to 1 -- when we were talking
about equivalence yesterday, we would have said, .98, wow,
they've really demonstrated equivalence. You do see a
confidence interval. Again, you have to remember to interpret
that in light of the fact that they've looked at multiple
analyses.
But that's sort of the best I could do for you
in terms of trying to gauge the potential of what we will see,
and there's no reason, I guess, to expect that as you move
forward, if you believe the modeling assumptions, that you're
going to change that number. You would assume that this number
is where it will about fit. The confidence interval would
get narrower, but the estimate would stay about the same.
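The expectation that the interval narrows while the estimate stays near .98 can be sketched numerically. This is a rough illustration only: it assumes the common rule of thumb that the standard error of the log hazard ratio is about sqrt(4 / events) under 1:1 randomization, and the event counts are hypothetical, since the subgroup's event counts are not stated here.

```python
import math


def hazard_ratio_ci(hazard_ratio, events, z=1.96):
    """Approximate 95% CI for a hazard ratio.

    Uses SE(log HR) ~ sqrt(4 / events), a rule of thumb that assumes
    1:1 randomization and similar event counts in each arm. It is not
    the analysis actually used in the trial.
    """
    se = math.sqrt(4.0 / events)
    log_hr = math.log(hazard_ratio)
    return math.exp(log_hr - z * se), math.exp(log_hr + z * se)


# Hypothetical event counts, chosen only to show the trend.
for events in (200, 400, 800):
    low, high = hazard_ratio_ci(0.98, events)
    print(f"{events} events: HR 0.98, approx 95% CI {low:.2f} to {high:.2f}")
```

Under these assumptions, doubling the number of events shrinks the interval around the same point estimate, so more follow-up would sharpen the answer without necessarily moving it.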
DR. NERENSTONE: The sponsor said, in their
defense, that they thought the 1-year was more accurate because
more patients had gotten to that point. Do you agree or not
agree with that?
DR. LAMBORN: To the extent that we're describing
where the value will actually be at the end of all the analyses,
clearly the 1-year result is not going to change since everybody
has moved beyond that point. I don't remember what the 1-year
result was for this particular group of patients.
DR. BERRY: I don't think we gave it to you.
DR. LAMBORN: That's why we don't remember it.
(Laughter.)
DR. LAMBORN: Do you have it?
DR. BERRY: But you're talking about the ER
positive.
DR. LAMBORN: That's right.
DR. BERRY: We didn't do that. You're talking
about the ER positive, and we didn't show that. We do have
it I think.
DR. LAMBORN: I think it would be helpful if we
could see it.
DR. HENDERSON: I did show those data, and the
point that I was trying to make was that as you go along,
the confidence interval gets wider and wider. I will give
you those numbers in just a second here.
There you go. So, you can see that there's a small
benefit at 1 year, fairly narrow confidence intervals around
each of the estimates, a slightly larger benefit at 2 years,
slightly larger but still fairly tight intervals around the
estimates, and then no difference at 3 years but wider
confidence intervals around both of them. I think that's the
data set that you're asking for.
DR. LAMBORN: That is specifically it because I
think the issue we're being asked is what do we expect to see
down the line. I think the only thing we can say is what we
see now is our best estimate of what we would expect to see,
and in some instances we're pretty sure of what we're going
to see in terms of final data.
DR. BERRY: Excuse me. I want to point out that
the reduction at 1 year is essentially what we see overall
and, in fact, is better, if you go back to that please. Compare
97.7 versus 96.5. The reduction is about a third in this ER
positive group.
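The "about a third" figure can be reproduced from the two disease-free survival percentages. A minimal sketch, assuming that 97.7 is the AC plus Taxol arm and 96.5 the AC arm, and that the reduction is measured on the proportion of patients with an event (100 minus the disease-free percentage); the function name is illustrative:

```python
def relative_reduction_in_events(dfs_treated, dfs_control):
    """Relative reduction in event rate, given disease-free survival in percent."""
    events_treated = 100.0 - dfs_treated
    events_control = 100.0 - dfs_control
    return (events_control - events_treated) / events_control


# 1-year figures quoted by Dr. Berry for the ER positive group.
one_year = relative_reduction_in_events(97.7, 96.5)
print(f"1 year: {one_year:.1%} fewer events with Taxol")

# 3-year figures quoted earlier by Dr. Lamborn (81.9 on AC plus Taxol, 82.7 on AC).
three_year = relative_reduction_in_events(81.9, 82.7)
print(f"3 years: {three_year:.1%}")
```

A negative value at 3 years means slightly more events in the Taxol arm at that time point, which matches the remark that the curves "slightly crossed".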
DR. NERENSTONE: Dr. Margolin.
DR. MARGOLIN: I'm sure everybody knows this, but
I think we need to remember this business about ER and PR
positivity and how positive in measuring, and the interaction
with pre- and post-menopausal need to be kept in mind as well
as we, those of us who are in the clinic treating patients,
have to make a judgment every single time we make a
recommendation to a patient about her adjuvant therapy.
The NSABP has tried, in some of their retrospective
analyses, to look at their outcomes in various studies as
grouped by level of ER and PR positivity, and they've taken
the stance in many of their studies prospectively that they
don't care. They just put everybody over 50 on tamoxifen.
So, I think that, again, rather than trying to
say this is group A and group B, we really have quite a spectrum
and it makes more biological sense to look at it that way.
DR. NERENSTONE: Dr. Lippman.
DR. LIPPMAN: Again, since we're about to discuss
a recommendation based on a subset analysis and the issue of
consistency from meeting to meeting -- it actually came up
at the last meeting on another drug. But the issue that Dr.
Kelsen mentioned and I think Dr. Temple indicated his thoughts
on this is that one of the things is the idea that one looks
very skeptically on subset analyses from a negative trial.
I guess the question I have is, is there any reason
to think that there's more importance or less importance or
more validity or less validity to a subset analysis based on
whether the primary endpoint of the study is positive or
negative?
DR. NERENSTONE: Would someone from FDA like to
answer that? Dr. Temple.
DR. TEMPLE: For what it's worth, one of the
requirements that a sponsor faces in submitting an application
is that we ask them to look at whether effects are similar
in men and women, old and young, black and white, generally
by looking at an overview of the data, pooling the available
trials, and looking at those. Now, those are three demographic
figures. It's not 20 subsets. It's three. And many people
would condemn that and say that's just exploratory nonsense,
and you really should pin it down.
But I think there's a feeling that it is worth
looking for these things, and if the differences appear very
large, you sort of do your best with them. I think most people
would say that's the rule on subsets. You should be skeptical.
You shouldn't do it willy-nilly. You should be aware of how
many things you're looking at.
So, one of the things you'd consider is how
plausible, among the various things one is looking at, would
it be to look at other therapy. Well, a lot of times the other
therapy people are on is one of the first things you'd consider
in looking at plausible subsets. So, a lot depends on whether
there's 40 subsets out of which you're pulling it or only one,
and medical plausibility and all that.
So, I don't think I could give a rule. We're
generally skeptical about these things. That's our rule.
But no one would ever say they're never credible.
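The reason the number of subsets matters can be made concrete. A minimal sketch, assuming each subset is tested independently at a significance level of 0.05 and that there is no true effect in any of them (both are simplifying assumptions, and real subsets overlap and correlate):

```python
def chance_of_spurious_finding(num_subsets, alpha=0.05):
    """Probability that at least one of num_subsets independent tests
    is 'significant' at level alpha when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** num_subsets


# Subset counts mentioned in the discussion: one, three, 20, and 40.
for k in (1, 3, 20, 40):
    print(f"{k} subsets: {chance_of_spurious_finding(k):.0%} chance of a false signal")
```

Under these assumptions, looking at one prespecified, medically plausible subset carries far less risk of a chance finding than pulling one result out of 40.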
DR. LIPPMAN: The comment that I was making,
because it really did come up at the last meeting and you
commented here, is just the issue of whether the study is
actually positive or negative in terms of the primary endpoint
and whether that changes the validity, statistically or
otherwise, of subset analyses.
DR. WILLIAMS: Maybe I could add something. I
believe our usual action that we take on the basis of subgroup
analyses would be to put them in the labeling. Usually we
have a positive trial and usually we would say, well, there
seems to be less or more effect. So, we don't have that problem
if we have a negative trial. There's nothing to put in the
labeling. So, I mean, maybe that's what we usually did.
DR. TEMPLE: One is also, let's say, more
skeptical when the overall trial is negative because the urge
to find a subset with an effect becomes overwhelming. Maybe
there's less of an urge, maybe this is more spontaneous. These
are all nuanced and no good rules.
But I think it's fair to say most people think
you should. One of the great things about the overview
analyses is there were so many patients in them that you can
start to do credible subset analyses. So, Richard Peto who
started both this and is a very powerful skeptic of subset
analyses -- he's famous for showing that people with -- I guess
aspirin doesn't work if you were born under certain zodiacal
signs, which he did not consider support for astrology, but
support for not doing subset analyses.
DR. JOHNSON: Do you remember which sign?
(Laughter.)
DR. TEMPLE: Gemini was one where it didn't work.
(Laughter.)
DR. TEMPLE: Libra and Gemini. So, those among
you for whom that's relevant will know.
But at the same time as he's a known skeptic of
these, one of the things you can do when you have 50,000 patients
randomized is start to look and perhaps learn something. So,
everybody who looks at this has mixed emotions. They all say
don't do it, and they, every once in a while, find themselves
to be persuaded anyway.
DR. LIPPMAN: But I think we can all agree this
is a very intriguing finding. We talked about the value of
hypothesis generation and so on. I think the issue really
is -- I don't think any of us would disagree with putting this
information in the clinical section of it. The question is
whether to put it up front to really say that we're sure this
should affect patient care as a hypothesis testing point, and
I don't think that's what happened here.
DR. BERRY: Dr. Nerenstone, I don't know the
protocol here. Can I address some of the things that have
been discussed?
DR. NERENSTONE: Dr. Johnson.
DR. JOHNSON: Well, actually like yesterday,
we've sort of gone back and forth between questions and
discussion. I would like to just put forward some thoughts
about this.
Like everyone else, I too am a little bit concerned
about the subset analyses. I think had the study shown
equivalence, let's say, and then a subset analysis had been
done with 2,000 out of the 3,000 patients that was positive,
I'm not sure we would have accepted that as an indication to
go forward and approve the product.
I agree with everything that Derek said. The
reason I think we're a bit concerned about it is a point that
has been made by others about the biological plausibility of
the subset, which is a group that was ER/PR positive that got
tamoxifen and obviously benefitted from that.
The biological facts are -- and we've known this
for a long time -- is that there's a difference in how the ER/PR
positive tumor progresses, the growth if you will, the kinetics
if you will, of that tumor, therefore the events may not be
evident as early in the process as they would be with ER negative
tumors. That may be what we're observing.
My personal preference -- again, I'm allowed to
speak but not vote, like at home.
(Laughter.)
DR. JOHNSON: I'm talking about my home home, you
know, with my wife and daughter.
It seems to me that we ought to accept the overall
result of this large, powerful trial. And then I like Dr.
Margolin's and Dr. Lippman's suggestion that we put forward
the data in the package insert which guides the clinician and
the patient as to what benefit he or she may obtain from this.
I can tell you from having seen these data -- and
I, like Derek, was wondering if Grant had lost his mind because
yesterday he obviously lost his mind and again today he's lost
it.
(Laughter.)
DR. JOHNSON: But I'm also very persuaded by the
data as were shown, and I'm not sure how now I'm going to handle
the patient that I have at home with positive nodes who's ER/PR
positive. Candidly I've been going to using the sequential
therapy, and now that I see these data, I'm a bit hesitant
about that. But nevertheless, I like that as an option and
I think these data prove that. I suspect -- this is my
prediction -- that we will see a difference as time goes on,
but related to the biology of the tumor types rather than just
some sort of specific interaction with Taxol per se.
DR. NERENSTONE: Ms. Fischler.
MS. ZOOK-FISCHLER: Well, as the patient rep, I
have to take a patient's position, and I think that is that
patients need the options. As I'm listening, while I do see
-- and it jumped off the page at me as well -- that the ER/PR
negative women had the greater advantage, I didn't see that
the women who were estrogen positive were at a disadvantage.
They just weren't at as great an advantage.
But as the patient, I would like to be able to
sit down with my doctor and decide what's best for me. I also
know from working with women in SHARE, the group that I'm
affiliated with, I've seen many women who can't tolerate
tamoxifen. So, for those women, it would be a very important
option to be able to have Taxol. So, I would like to see it
go ahead with Dr. Margolin's proviso.
DR. NERENSTONE: Dr. Kelsen.
DR. KELSEN: These data now have been available
for some time I think to the breast cancer specialist community.
How has it influenced your studies? Larry, you just showed
us a whole series of trials that are underway. When patients
enter those trials and they are ER positive or ER negative,
are they being treated differently in the Taxol-containing
studies?
DR. NORTON: No, absolutely not. That was a very
major consideration in the design of all these trials, and
everybody felt that this type of subset analysis was
inappropriate for guiding future decisions, especially because
we want more data. And if we make that decision, then it's
a self-fulfilling prophecy. We won't have the data and we
won't have that kind of information. So, that's why you'll
notice that there's a taxane and in fact Taxol in all the current
and future plans in the cooperative groups.
DR. KELSEN: So that we will never have a
prospectively randomized study in which women who are ER or
PR positive or both are randomly assigned to receive T or not
to receive T after AC with tamoxifen.
DR. NORTON: That is not currently planned. This
is what Dr. Temple said. I think that with all of these trials
involving taxanes and with patients with ER positive disease
being treated as well as ER negative and patients getting
tamoxifen or not, we're going to have a huge data set that
we could then do some very reasonable subset analyses of in
this regard, and that that's going to really give us the power
for making that determination long term rather than the
randomization.
In terms of the randomization, since you bring
it up, it's an ethical consideration. It's exactly what we
decided. Let's say we decided not to give Taxol to ER positive
patients. Let's say 5 years from now we find out that indeed
the curves start to separate as we get past 3 and a half, 4
years and the tamoxifen effect wears off and the curves start
to separate. We've cost a lot of women their lives by making
that decision.
If we decide, however, to give Taxol and it turns
out long term not to be effective, what have we really cost
them? We've caused some toxicity, but compared to what they've
received with the AC and compared to many other things we do
in oncology, it's really very minimal.
So, balanced with that minimal toxicity versus
the potential for saving lives, the intergroup decided to
include Taxol for everybody regardless of hormone receptor
positivity.
DR. NERENSTONE: There is another study. The
NSABP study is closed. It was not randomized I don't believe.
I mean, ER/PR was not in the randomization. I do believe
it was in the stratification, and that study is now closed
to accrual but did randomize AC plus or minus Taxol to stage
II patients. So, there will be another group of patients
along.
Dr. Margolin?
DR. MARGOLIN: Just as a point of clarification
mainly to Ms. Zook-Fischler, I think we recognize that the
data for the small number of patients who were estrogen receptor
positive but didn't end up on tamoxifen is no more convincing
of a Taxol effect than the whole group at large. So, I don't
think for the patient who can't take tamoxifen, we can say
that Taxol supplants that and it replaces the effect of
tamoxifen.
DR. NERENSTONE: Dr. Temple.
DR. TEMPLE: I just wanted to be sure I understood.
There are going to be further data on Taxol, yes or no, in
the receptor subtypes, although not most of the ongoing trials
because everybody is getting Taxol there. But there are at
least a couple trials where one will be able to look at it.
If they're stratified, that's more than
sufficient. You can't randomize to receptor status, but
whether stratified or not, both statuses are sufficiently
common in the population that you'll get effective
randomization I think anyway.
DR. NERENSTONE: Dr. Lamborn.
DR. LAMBORN: Could I ask that we hear Dr. Berry's
additional comments or clarification?
DR. NERENSTONE: Yes, thank you.
DR. BERRY: Thank you.
I completely agree with Dr. Temple concerning
looking at subsets and the strength of the subset. If there
is something that grabs you by the hair or knocks your socks
off, I look at it and I believe it.
The question is, does this knock your socks off?
And the appropriate analysis is exactly what Dr. Lamborn
suggested, namely we do a Cox proportional hazards model,
adjusting for all the other covariates, and we ask if there is
an interaction between the use of Taxol and estrogen receptor/PR
status. The answer was for overall survival, there's no
significant interaction. For disease-free survival, there
is a .036 p value.
Now, in doing interim analyses, we adjust for
multiple looks. In doing subset analyses, we adjust for
multiple subsets. How many subsets did I look at? I looked
at nodes. I looked at tamoxifen. I looked at menopausal
status. I looked at tumor size. How many? I don't know.
A half dozen, 10? Even if I looked at two subsets and adjust
this p value accordingly, it is not statistically significant.
This is not an effect that knocks your socks off.
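Note on the arithmetic: Dr. Berry does not name the correction he has in mind, so the sketch below assumes a simple Bonferroni adjustment, in which the p value is multiplied by the number of subsets examined. The 0.05 threshold, the function name, and the list of subset counts are illustrative assumptions, not figures from the record; only the .036 p value comes from the testimony. With two subsets the adjusted value is 0.072, which is above 0.05.

```python
def bonferroni_adjust(p_value: float, n_comparisons: int) -> float:
    """Return the Bonferroni-adjusted p value, capped at 1.0."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return min(1.0, p_value * n_comparisons)


ALPHA = 0.05           # assumed conventional threshold, not stated in the transcript
P_INTERACTION = 0.036  # disease-free survival interaction p value quoted by Dr. Berry

# Two subsets is Dr. Berry's "even if" case; 6 and 10 reflect his
# "a half dozen, 10?" guess at how many subsets were examined.
for n_subsets in (1, 2, 6, 10):
    adjusted = bonferroni_adjust(P_INTERACTION, n_subsets)
    verdict = "significant" if adjusted < ALPHA else "not significant"
    print(f"{n_subsets:>2} subset(s): adjusted p = {adjusted:.3f} ({verdict})")
```

Bonferroni is the most conservative common choice; other corrections would give somewhat different adjusted values.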
Two final points. One is Dr. Lippman's question.
The vagaries of subset analyses are identical whether it's
a negative study or a positive study. The same problems arise.
Another point about sample size and confidence
intervals. If you take a random subset of a set of patients
and look at the size of the confidence interval, it has to
increase. So, Dr. Norton's statement I would agree with.
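Note on the confidence interval point: the widening Dr. Berry describes can be sketched under the common assumption that the standard error of an estimate shrinks in proportion to one over the square root of the sample size. The patient counts below are hypothetical and are not the trial's enrollment; for a survival endpoint the relevant count would more properly be events rather than patients.

```python
import math


def ci_width_ratio(full_n: int, subset_n: int) -> float:
    """Factor by which a confidence interval widens when the sample shrinks
    from full_n to subset_n, assuming standard error is proportional to
    1 / sqrt(n)."""
    if not 0 < subset_n <= full_n:
        raise ValueError("subset_n must be between 1 and full_n")
    return math.sqrt(full_n / subset_n)


FULL_N = 1000  # hypothetical total, for illustration only

for fraction in (1.0, 0.5, 0.25):
    subset_n = int(FULL_N * fraction)
    ratio = ci_width_ratio(FULL_N, subset_n)
    print(f"{fraction:.0%} of patients -> interval about {ratio:.2f}x as wide")
```

Under this assumption, a random quarter of the patients gives an interval twice as wide as the full group's.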
DR. NERENSTONE: Thank you.
Other questions from the committee?
(No response.)
DR. NERENSTONE: If not, then I'd like to go to
the questions from the FDA. I will skip all of the preamble
-- it just goes over the discussion and the data that we've
already seen -- and go right to the questions, which is the
last page of the handout.
Do the results of this trial provide highly
reliable and statistically strong evidence of an important
clinical benefit from Taxol in patients with node-positive
breast cancer?
Discussion?
(No response.)
DR. NERENSTONE: Okay, then let's see a show of
hands. All the people who say yes?
(A show of hands.)
DR. NERENSTONE: That's 8 yeses. That's everyone
who is voting.
The second question. Do the results of this trial
provide evidence of clinical benefit from Taxol in patients
with node-positive, receptor-positive breast cancer who also
receive tamoxifen as adjuvant therapy?
Comments please? Dr. Lamborn.
DR. LAMBORN: I guess I have some problem with
the question as it's posed because if you just were to say
look at this subset and look at the data, then you have one
answer to the question. If you ask the question of you have
overall results and you've now done a subset analysis, do you
have convincing evidence that in fact the result is different
for the receptor positive group, then I think that it becomes
a slightly different issue. So, I don't know if others see
this as a -- I think it's really the latter question that we
can address from this data.
DR. NERENSTONE: Dr. Williams?
DR. WILLIAMS: I would suggest that you address
it any way you want to. It's the decision I think someone
is faced with when they have a woman in this situation, based
on anything you think is appropriate, including the evidence
from this trial, whichever evidence you want to consider and
what you've seen presented.
DR. NERENSTONE: I'm not sure I see the difference
between question number 2 and question number 3, the first
part. They really feed into one another. Maybe we should
go to question number 3 which is really the crux of the
discussion, which is, for which population with node-positive
breast cancer -- all patients, patients with receptor negative
tumors, patients with receptor negative tumors plus others
who cannot receive adjuvant tamoxifen -- should this indication
be approved? In deciding this, issues include the toxicity
of Taxol, the size and the medical plausibility of the subgroup,
and the unplanned nature of the subset analysis.
Discussion? Dr. Raghavan.
DR. RAGHAVAN: Well, I started by taking the
devil's advocate view partly because I believed it and partly
because I was asking questions. I think the discussion
actually resolved my concerns pretty comfortably. I'm a
long-term opponent of subset analyses, and I think that even
though this is a bigger subset than average, whoever made the
point that the damage we would do by withholding the drug with
the knowledge base we have is more than the damage we would
do by letting it through.
I'm totally sympathetic to the position of the
FDA. I think it's their job to raise questions like this and
it's our job to deliberate on the data that are presented,
not to do it in a trivial way, but in fact go through it very
carefully.
Some of the early discussion I thought did
trivialize the question, and I think now the discussion has
been of a nature that when we look back in 10 years, my hunch
is that once again Dave Johnson is wrong and the curves won't
diverge. And he can't vote. So, who cares?
(Laughter.)
DR. RAGHAVAN: But I think his point is correct,
which is that until we have data, then we should be conservative
in favor of the patient. Therefore, these latter questions
probably become moot. What we do is we advise the FDA.
They've heard the clear sense of equipoise, but the jury is moving
towards feeling that the data support an approval for
node-positive disease with caveats in the package insert.
So, I make the comment because I was the person
at this part of the discussion that raised the questions, and
I just want to comment that I'm pretty comfortable that my
questions have been resolved.
DR. NERENSTONE: Ms. Zook-Fischler?
MS. ZOOK-FISCHLER: Yes. The question asks for
which group of people it should be approved, and if it's
approved for all patients, that doesn't mean all patients need
to take that treatment. But it does open up all the
possibilities for the patient and her physician, and I think
that's what's really important here.
DR. NERENSTONE: Dr. Margolin.
DR. MARGOLIN: Well, just really a reiteration
of what I said earlier. This is a very tiny point, but I would
not leave in the package insert or any sort of subcomment that
patients with ER positive tumors who cannot receive adjuvant
tamoxifen -- we still don't know which makes you achieve less
benefit with Taxol, the fact that you are receptor positive
or the fact that you were receptor positive and received
tamoxifen.
DR. NERENSTONE: Other comments? Dr. Blayney.
DR. BLAYNEY: I view statistics as a way to
scientifically approach biology and the biology of breast
cancer in this particular discussion. ER positive breast
cancer is a slowly growing tumor. We don't eradicate and cure
some of those patients and the time that that makes itself
manifest is longer.
I'm new to the regulatory advice arena, but I agree
with Dr. Raghavan that I think, as presented, the data is
persuasive to me that we should advise them to approve this
for node-positive breast cancer patients, but with the caveat
that the data is what it is in 1999, and the second caveat
that I made earlier, that in over 65-year-olds, the data is
what it is, and that should also be considered by physicians
advising their women patients.
DR. NERENSTONE: Dr. Lippman?
DR. LIPPMAN: Yes, I actually agree with Dr.
Johnson on both points. In terms of biologic plausibility,
there certainly is biologic plausibility that with time we
might see an effect in ER positive patients because the effect
that we see will take longer to manifest, if it's really slow
growing, based on Dr. Norton's kinetic argument. But we don't
know, but there's biologic plausibility there.
First of all, people will see this published, and
putting this information in the package insert will lead to
deliberations like Dr. Johnson just mentioned. People will
interpret this and it will affect, I think, the types of
patients possibly and when it's being used. I think the
information will be there and will guide us, and with time,
we'll have more information.
DR. NERENSTONE: Other comments from the
committee?
(No response.)
DR. NERENSTONE: What I'd like to do then is we'll
take the first question as all patients, and if it passes,
then obviously we don't have to do a subgroup. For the
population with node-positive breast cancer, starting with
all patients, should this indication be approved? All those
who say yes?
(A show of hands.)
DR. NERENSTONE: 8.
The second question and I think the sense of the
committee was that a package insert should reflect the relative
data that was presented here. Does that need to be voted on,
or do you have the sentiment of the committee?
DR. WILLIAMS: Could I get some more detail on
that? Let me give you an example. Aredia package insert was
altered because of an apparent different size of effect in
hormone treated breast cancer patients versus chemotherapy
treated patients. That was put in the indications section,
a statement referring them to the clinical trials section.
I don't think a lot of people read the clinical trials section.
It will mean a lot to the company. I think they
will not want it in the indications section. Most companies
do not want their indications section to be cluttered with
a statement talking about something somewhat negative.
So, I would wonder where you thought this would
be appropriate, what level of concern should it be brought
to, and if there's a statement that were to be put in the
indications section, you might have some discussion about what
it would say.
DR. NERENSTONE: Comments? Dr. Margolin, that
was initially your suggestion.
DR. MARGOLIN: I'm not sure that I really
understand what Grant is saying vis-a-vis the way the question
reads. I thought the question was whether we want --
DR. WILLIAMS: It's a new question.
DR. MARGOLIN: Oh.
DR. WILLIAMS: This has to do with what kind of
statement you want in the package insert, whether you want
something in the indications section referring people to the
clinical trials section where some data may be, or whether
you want them, if they have the concern, to go find the
indications section and look for the data.
DR. NERENSTONE: Dr. Temple, would you like to
comment?
DR. TEMPLE: Well, just to illustrate. It could
say for the treatment of patients following other therapy with
node-positive breast cancer. It could also say, see clinical
trials section for discussion of unbelievable difference
between two --
(Laughter.)
DR. TEMPLE: Or some variation of that.
DR. NERENSTONE: Maybe relative clinical.
DR. TEMPLE: Yes. So, you flag it and that gives
you some hope that someone will read the section although,
as Grant says, who knows?
DR. NERENSTONE: Dr. Lippman and then Dr. Kelsen.
DR. LIPPMAN: Again, just in terms of consistency
and setting a new precedent, I think if we do that, that kind
of comment could be made on almost every drug that's approved.
We could refer them in this case to people with a lot of
positive nodes. So, I guess the question is, since there are
subsets in a lot of these, this could be something that is
put in, this kind of thing in a lot of approvals, and do we
want to go there?
DR. TEMPLE: Well, you sort of have to trust me
on this, but you don't see things this striking all that often.
Now, one of the things about subset analyses is
nobody pays any attention to them at all unless they're
plausible and striking. So, there's a sort of self-fulfilling
prophecy here and you can be misled and that's why people worry
about it. But it's unusual to see anything that interesting
in a large fraction of the patients treated. That doesn't
happen every day at least partly because we don't pay any
attention to them even if they're sort of large unless they
seem credible and involve a large fraction of the population.
So, I guess I would say you don't have to worry
that we're going to throw these every time because we're highly
resistant to that suggestion. It's more a question of whether
this is different enough or striking enough to merit unusual
treatment.
DR. LIPPMAN: But again, just in terms of
clarification of what Dr. Berry just said, if these data were
presented in a different way, adjusting for the number of subset
analyses, I understand that they would not have been even
statistically significant.
DR. TEMPLE: Yes. That's essentially always
going to be true. If you have 10 subsets and Bonferronize,
you'll never overcome that. So, you have to do more subjective
things like think how plausible it is and think how many subsets
there really were that were that interesting. It's a very
hard problem. That's why we usually reject them.
DR. NERENSTONE: Dr. Kelsen.
DR. KELSEN: I think we should put something in
the package insert about this difference. I'm not sure where
it would go yet, but I wonder how we'll handle it -- how you'll
handle it I guess -- at 2 years or 3 years from now when one
of these two things is going to be true. One, there is a late
effect. ER patients do benefit, that warning or whatever you
want to call it, caveat should be removed. Two, we're wrong.
Even though the toxicities are relatively acceptable for an
increase in cure rate, there is no difference and therefore
the package insert should be changed. How will that be
handled?
DR. TEMPLE: Well, if the data start to look really
good for that subset, I think that's going to be not a problem
because the company will take care of reminding us of those
data. If it poops along and looks sort of the same, I guess
we might even come back to you. If it now looks really
overwhelming, maybe we've learned something true or maybe other
available data will contribute to that. So, we'll arrange
with the sponsor to provide the follow-up. I'm sure they will
be glad to do that.
DR. KELSEN: If it was going to be done in that
way, then I would probably stick in the indications, see the
clinical trials section, since it seems to me that we're so
uncertain at this point, rather than put it in the indications
section.
DR. JOHNSON: Presumably in that section, you
would have the very analysis that has been shown to us with
those differences.
DR. TEMPLE: In the clinical trials or the
indication?
DR. JOHNSON: No, in the clinical trials section.
DR. TEMPLE: Yes, that's exactly right.
DR. WILLIAMS: The question is exactly what sort
of statement would be in the indications section that would
be pointing you to the clinical trials statement. What is
the sense of the committee? Should it be there's little data
or preliminary data show, et cetera?
DR. JOHNSON: No. What I would do is based on
what the trial was designed to do. I would say it's indicated
for node-positive breast cancer. Then I would put,
parentheses, see clinical trial data.
DR. WILLIAMS: Okay. So, from what I've heard
from two so far is that you would not make a special statement
in the indications section that would try to describe the sense
of what's going to be in the clinical trials section.
DR. JOHNSON: We've had this same conversation
about toxicity issues in the past where we've allowed the
sponsor or FDA has required that certain data be placed in
there, and we've simply directed the physician to that area.
DR. WILLIAMS: The difference here is that
oftentimes we will direct people to another section, but it
will be in such a context, they'll know why they're looking.
We might say, especially look because of the ER positive
findings. Then they would know to look to the section.
Another approach that sounded like what you were saying
is approve it and go look in the clinical trials section.
Is that what you're saying?
DR. JOHNSON: Well, again, I think that what the
study did was looked at node-positive patients. So, again,
I would say it's approved for node-positive --
DR. WILLIAMS: The indication would be
node-positive patients. That's no question. The next
sentence might be to guide them to the clinical trials section
for a particular purpose. The purpose of putting it in the
indications section is to make it prominent.
DR. JOHNSON: No, I understand that, but it also
suggests that the comment that you would like to put there
would be, and especially pay attention to the ER positive/PR
positive tamoxifen treated. And I wouldn't say that. I
personally would just say see the clinical data.
By the way, I'm stunned -- stunned -- actually
that you think we don't read these package inserts.
(Laughter.)
DR. JOHNSON: And I want you to know, Bob, I
personally trust you.
(Laughter.)
DR. NERENSTONE: Go ahead, Dr. Temple.
DR. TEMPLE: Well, I'd just like to hear a little
more from everybody. There's a huge range of things one could
say, but I think the assumption based on what you just said
is it will say for node-positive patients. You can then say,
see clinical trials. My bias is you tell people to do that
and you don't tell them why, they don't pay much attention
to you. So, one could say, see clinical trials and mention
an unplanned subset analysis that suggested a possible
difference based on receptor status. That's not as extreme
as saying, don't use it, but it does point out what the area
of problem might be, and then they'll go see it. So, unless
you didn't think that was a good idea, that's probably what
we would plan to do.
DR. NERENSTONE: Dr. Margolin.
DR. MARGOLIN: I was just going to suggest some
wording to the effect of near the indications say, see clinical
trials for important information about receptor positive
patients, and then in the clinical trials section, just before
you show the graphs and the tables, just a statement that not
that it doesn't work, not that we're waiting, but just say
Taxol has not been proven to benefit patients with ER positive
tumors who are receiving tamoxifen or just with ER positive
tumors in overall survival and that the benefit in disease-free
survival --
DR. TEMPLE: That's actually a relatively strong
statement. I think the other sense I get is that most people
wouldn't want anything quite that strong, but those are the
nuances. I think we have a pretty good sense of what people
want.
DR. BLAYNEY: How easy is it to change the package
insert in these various sections? I know Dr. Johnson would
jump right on it when you did change it.
(Laughter.)
DR. BLAYNEY: So, how easy is it to change these
inserts in the indications and clinical trial information?
DR. TEMPLE: Probably you've got to ask the
companies that too. We think it's not very hard if you've
got data that support it.
DR. BLAYNEY: In 3 years, for instance, if an
analysis is published suggesting that there is benefit in ER
positive patients, is that an easy thing for you all to put
into the clinical trials section?
DR. TEMPLE: If it's convincing, it's very easy.
It could be changed in a very short order. We're familiar
with the study. It's just the same analyses that have been
done, extended by a little bit. It's a very easy change to
make if the data are there.
DR. BLAYNEY: In that case, I would advocate
putting a statement in the indications and including in that
indication the phrase "unplanned subset analysis." I think
that's fair warning and a fair statement of the data upon which
we advised you today.
DR. NERENSTONE: Dr. Lippman?
DR. LIPPMAN: I guess I don't fully agree. I
think the issue of unplanned, planned, secondary subset
analyses -- we think a lot about that. I think one only needs
to think about selenium and raloxifene and other issues to
understand how that is accepted and understood elsewhere.
If I were to say anything, I would say, see clinical trials
section for detailed analyses and subset analyses. End of
sentence without pulling anything out.
DR. NERENSTONE: Dr. Margolin.
DR. MARGOLIN: I strongly agree with that. I
think the word "unplanned" is sort of meaningless. It's the
numbers and the fact that it wasn't prestratified and things
like that and not the fact that you didn't plan to do it but
now you did it. That's really irrelevant. It's a misleading
word I think.
DR. NERENSTONE: Other comments? Dr. Blayney.
DR. BLAYNEY: I meant to convey the fact that a
subset analysis is recognized not to be statistically rigorous.
So, however you would want to flag that for people I think
could be useful for practicing physicians.
DR. TEMPLE: I think what we'd probably try to
do is mention in the indications section what area of the
clinical trials is of interest, that is, it refers to receptor
status, and then in the clinical trials section, one would
discuss the nature of the analysis and all that stuff. If
we put too much into the indications section, we're sort of
taking away the indication, which is what a number of people
have said you don't really want to do. So, we want to introduce
a note of caution and get people to read that section, but
we don't want to deny the indication because that was your
recommendation.
DR. NERENSTONE: Dr. Lamborn.
DR. LAMBORN: I think that you've sort of hit on
exactly what I get the sense is that we have here, which is
something in the indication and something that might point
them to where the area is that they would want to look for
further information but not something that took the indication
away.
DR. NERENSTONE: If everybody will turn to the
last question, for the patient group designated by ODAC in
question number 3 -- and that is all patients, which is what
we voted on -- should Taxol be approved for use subsequent
to standard combination chemotherapy or only for use after
treatment with doxorubicin and cyclophosphamide, the
chemotherapy used in the trial?
Comments? Dr. Kelsen.
DR. KELSEN: It would seem to me that if we did
that, you'd sort of be saying that the standard of care for
node-positive women is only AC, or at least you might be
implying that the standard of care for node-positive women,
as far as the non-Taxol part of treatment, is only AC and that
no other regimen might be acceptable. If you approved it only
for use with AC and with no other treatment, would that not
be implying that that was the only acceptable standard of care
with Taxol? That's a question.
DR. NERENSTONE: Dr. Margolin.
DR. MARGOLIN: Well, I think this is probably one
of the toughest questions because we don't have the numbers
to look at anything else, but you also don't want to be so
rigid as to say that, even though this was a study that was
done, this is the only setting in which it might work. I think
the most important thing is the question of whether the
interaction with Adriamycin is the compelling thing and you
don't want people to be using oddball regimens like
melphalan-based regimens. So, perhaps a compromise to the
effect of Adriamycin-based adjuvant therapy which you know
99 percent of regimens are going to include Adriamycin, Cytoxan
with or without something else.
DR. NERENSTONE: Dr. Raghavan?
DR. RAGHAVAN: Yes, I agree with that. The one
caveat is that I think we've spent the morning talking about
data and what's presented, and we haven't heard anything about
Taxol following anything else. So, I'm comfortable with what
Kim said, which is Adriamycin-based regimens, but I don't know
from anything I've heard in the last 4 hours what Taxol does
after CMF. I know there are data that relate to that. They
just haven't been presented. So, I think we should work within
the confines of what the discussion was. If the company had
wanted a broader indication, they might have presented data
that related to it. So, I think flushed with enthusiasm for
having done good work, we want to still remain within the bounds
of sanity.
DR. NERENSTONE: Other discussion? Dr. Kelsen.
DR. KELSEN: I'm very comfortable with the
doxorubicin-containing regimen.
DR. NERENSTONE: Dr. Lippman.
DR. LIPPMAN: Yes, I am too for the reason of sort
of biologic plausibility since it wasn't looked at here, but
it certainly is consistent with the mechanisms.
DR. NERENSTONE: Just one comment. I'm also
concerned about additive toxicities, certainly with CMF, you
could have prolonged neutropenia, and how many doses of CMF?
Would you get six? Would you get four? And the added
toxicity of Taxol after 4 to 6 cycles of daily Cytoxan for
14 days -- I think your toxicity profile could well be quite
different, and we don't have the data here to do that.
My question, though, is what about the dose of
Adriamycin. Do we make a comment about that as well, or is
that not necessary?
Would the FDA like to address that?
DR. WILLIAMS: I'd sort of like Dr. Temple's
opinion on that. The study has three doses of doxorubicin,
but of course this isn't the doxorubicin labeling. The study
basically found no difference in effect with -- the lowest
dose seemed to be acceptable. How should this label or
especially dosage administration --
DR. TEMPLE: That's difficult, and Bob and I were
just talking about this. The labeling for cytotoxic adjuvant
therapy is grossly deficient. We just approved epirubicin,
so we finally have one thing that's covered. None of the others
are. So, the solution is not so easy.
I think what we usually do in that case is describe
what was done, which takes care of the immediate problem.
How to get the new doxorubicin finding into labeling is hard,
given that it's not labeled for that use. I think we need
to try to think about how to do it, and I don't know the answer
yet.
DR. WILLIAMS: But your answer is that we
shouldn't necessarily address it in this label in terms of
indications section.
DR. TEMPLE: That would be most odd to basically
label another drug, and it's not really the Taxol part of the
study. But I'd be interested in hearing what people say.
It certainly ought to get into the label somewhere.
DR. NERENSTONE: Dr. Margolin.
DR. MARGOLIN: Most people who are aware of these
data are aware of the doxorubicin data from this trial and
what the NSABP has done over and over again. I think if you
just simply use the word "standard" doxorubicin-based
chemotherapy, most people think standard and think 60 times
4, and you're going to have very little variation from that.
DR. NERENSTONE: Yes, Dr. Williams.
DR. WILLIAMS: As you know, I think just two days
ago we approved epirubicin which is an anthracycline. So,
the question is, would you feel comfortable broadening this
to anthracycline?
DR. NERENSTONE: Discussion from the committee?
Dr. Johnson.
DR. JOHNSON: I'll talk about that in just a
second, but we did hear yesterday in a survey, when we were
talking about another product, that I think the figure was
86 percent of women currently receiving adjuvant treatment
are getting a doxorubicin-based regimen. So, even if we
summarily exclude CMF, it's not a high percentage of patients.
Those were the data we saw yesterday.
But I have another concern. I actually think
Kim's suggestion is the right one, with this minor concern,
and that is 4 cycles of AC or 6 cycles of FAC or classic CAF?
Again, there the issue about other toxicities, including
cardiac toxicities, is another issue. I mean, it comes up.
I think it's likely to be a relatively minor issue, but I
don't know that we know that either. It goes back to what
do we know, the data we have, and whether or not one should
be willing to do this.
Again, my personal bias -- and we've repeatedly
had these discussions around this table -- is that I believe
we should leave flexibility for the physician treating the
patient and the patient to make a decision, as long as we can
provide appropriate guidelines and caveats. In this case,
if one were to use the language that Kim used, perhaps it might
be appropriate to say standard therapy and then say the study
was done with four cycles of AC, and then leave it to the
treating physicians to interpret that data in an appropriate
manner.
Oh, epirubicin. Personally again I would go back
to the language that Kim used, doxorubicin-containing therapy,
not to suggest that you shouldn't use epirubicin, but the study
was done with doxorubicin therapy.
DR. NERENSTONE: Other comments? Dr. Lippman.
DR. LIPPMAN: I'd just like to clarify Dave's
point. So, in the indication, you would put standard therapy.
You wouldn't specify doxorubicin-containing, but you would
put in parentheses the study was done with --
DR. JOHNSON: No. I would use the term "standard
doxorubicin-containing adjuvant chemotherapy," but I would
make it clear in the data set that it was 4 cycles of AC.
I do think that a lot of physicians use AC, but
candidly, at least where I practice, in the region in which
I practice, 4 cycles of AC is not what most of the physicians
use. It may be what they ought to use, but that's not what
most of the physicians use.
DR. NERENSTONE: Dr. Lippman.
DR. LIPPMAN: Well, if you're going to put sort
of in parentheses in the indication what the study used, would
you want to, since we're basing this on sort of biologic
plausibility of mechanism -- that's the doxorubicin-based
therapy. Would you want to broaden it to anthracycline-based
therapy? The study used 4 cycles of AC.
DR. JOHNSON: I'm less comfortable doing that
personally. Again, if the committee and the FDA decide to
do it, I'm fine with it, but again, I'd like to try to stick
with the data at hand.
DR. NERENSTONE: Yes.
DR. JUSTICE: I think the number of cycles issue
would be something we would address in the clinical study
section normally, and we're already referring to it. So, I
think we can cover it there.
DR. NERENSTONE: Other discussion?
(No response.)
DR. NERENSTONE: Do you need a vote on this, or
do you have a sense of the committee?
DR. WILLIAMS: I think we have a sense. I'm not
sure what we're going to do.
DR. NERENSTONE: Fair enough.
Well, thank you, everybody, for sitting through
this. We'll adjourn now and reconvene at 1 o'clock. Thank
you.
(Whereupon, at 11:52 a.m., the committee recessed,
to reconvene at 1:00 p.m., this same day.)
AFTERNOON SESSION
(1:03 p.m.)
DR. SCHILSKY: My thanks to Dr. Nerenstone for
standing in for me this morning.
We'd like to begin again with introduction of the
committee members since we do have different people at the
table at different sessions. So, Dr. Nerenstone?
DR. NERENSTONE: Stacy Nerenstone, medical
oncology, Hartford, Connecticut.
DR. JOHNSON: I'm David Johnson, medical oncology
at Vanderbilt University.
MR. McDONOUGH: Kenneth McDonough, Patient
Representative, Pittsburgh, PA.
DR. PELUSI: Jody Pelusi, oncology nurse
practitioner, Phoenix, Arizona, and consumer rep.
DR. RAGHAVAN: Derek Raghavan, medical
oncologist, University of Southern California.
DR. BLAYNEY: Doug Blayney, medical oncologist,
Pomona, California.
DR. SCHILSKY: Richard Schilsky, medical
oncologist, University of Chicago.
DR. TEMPLETON-SOMERS: Karen Somers, Executive
Secretary to the committee, FDA.
DR. LIPPMAN: Scott Lippman, medical oncologist,
University of Texas, M.D. Anderson Cancer Center.
DR. LACHENBRUCH: Peter Lachenbruch, FDA,
statistician.
DR. CARDINALI: Massimo Cardinali, FDA.
DR. KEEGAN: Patricia Keegan, Division of
Clinical Trials, CBER.
DR. SIEGEL: Jay Siegel, Office of Therapeutics,
CBER.
DR. SCHILSKY: Thank you.
Karen has a conflict of interest statement.
DR. TEMPLETON-SOMERS: The following
announcement addresses the issue of conflict of interest with
regard to this meeting and is made a part of the record to
preclude even the appearance of such at this meeting.
Based on the submitted agenda for the meeting and
all financial interests reported by the committee
participants, it has been determined that all interests in
firms regulated by the Center for Drug Evaluation and Research
present no potential for an appearance of a conflict of interest
at this meeting with the following exceptions.
Dr. Kim Margolin is excluded from participating
in today's discussion and vote concerning Roferon.
In addition, in accordance with 18 U.S.C.
208(b)(3), a full waiver has been granted to Dr. Scott Lippman
which permits him to participate in all official matters
concerning Roferon.
A copy of the waiver statements may be obtained
by submitting a written request to the agency's Freedom of
Information Office, room 12A-30 of the Parklawn Building.
In the event that the discussions involve any other
products or firms not already on the agenda for which an FDA
participant has a financial interest, the participants are
aware of the need to exclude themselves from such involvement,
and their exclusion will be noted for the record.
With respect to FDA's invited guest, there are
reported involvements which we believe should be made public
to allow the participants to objectively evaluate his comments.
Dr. John Kirkwood would like to disclose that he has an
interest in Schering-Plough's interferon alpha 2b. He also
has received grants, consulting fees, and speaking fees from
Schering and speaking fees from Roche.
With respect to all other participants, we ask
in the interest of fairness that they address any current or
previous financial involvement with any firm whose products
they may wish to comment upon.
I'd also like to announce that Dr. Janice Dutcher
was unable to attend due to weather problems and that Dr. Scott
Lippman has stalwartly agreed to take over the role of
discussant.
Thank you.
DR. SCHILSKY: Thank you, Karen.
There's no one listed on the agenda as having
requested to speak at the open public hearing, but is there
anyone in the room who wishes to make a statement to the
committee?
(No response.)
DR. SCHILSKY: If not, we'll move right on with
the remainder of the agenda.
As Karen mentioned, the FDA has invited Dr. John
Kirkwood from the University of Pittsburgh to make a
presentation to the committee to help provide us some context
in which to consider the sponsor's application today. Dr.
Kirkwood?
DR. KIRKWOOD: Dr. Schilsky, Dr. Keegan, I'm
delighted to have the opportunity to review with you the updated
information on E1690, the intergroup trial of high dose and
low dose interferon in high risk melanoma patients.
This trial was commenced based upon background
data that I think everyone is well aware of, objective responses
in approximately 16 percent of patients in large collected
series treated with all varieties of interferon alpha 2,
durable responses in about 5 percent of these patients, which
are very comparable to what we know from interleukin-2,
subsequently approved for the therapy of metastatic melanoma.
A variety of antitumor effects in vitro and
immunomodulatory effects, including up-regulation of MHC class
I and class II antigens, have been the focus of a variety of studies
that I won't have time to talk about today.
The trial 1684, which was the pivotal basis for
the approval of interferon alpha 2b at high dosage for high
risk melanoma patients, included 287 patients, half randomized
to high dose interferon for a year, the other half observed.
As you all know, this showed very significant relapse-free
survival improvements to a p value of .004, overall survival
impact to a significance of .04, and a quality of life
improvement, as well as cost efficacy, which is comparable
to that of accepted adjuvant chemotherapies for other solid
tumors.
The trial data that I think you're all well aware
of showed an impact which included durable response and now
out to 10 years, no significant difference with the data that
was published at 7 years, as you see reported here for the
alpha 2b high dose trial 1684; survival impact which was also
significant and which is also now updated to 10 years without
change in this pattern.
The trial 1690 that I'll talk about today was
designed in 1990 when the relapse-free survival benefit of
1684 was recognized, but certainly no survival impact had yet
been observed. It was conducted between February of 1991 and
June of 1995, and an important element that I didn't put in
the chronology here is that in July of 1995 this committee
considered the application for alpha 2b and approved it for
adjuvant therapy of high risk patients with melanoma using
the high dose regimen that we had developed in E1684.
In May of 1998, some two to three years before
what we had anticipated would be the closure of 1690 at the
scheduled number of 200 deaths or relapses, the data safety
monitoring committee decided to unblind this trial because
of the slowing number of events, the basis for this, the
improved prognosis that I'll come back to discuss in the E1690
experience.
And over the summer of 1998, there were both
external and internal audits of the data which corroborated
all of the database that we had in ECOG.
In the fall of 1998, a statistical analysis was
presented to the FDA and to CTEP on October 13th, and in
November, this was placed on the web and summarized as an
abstract presented at the European Society for Medical
Oncology.
Between March and April of 1999, data on salvage
therapies, which I will review with you today, were collected,
and this was all presented briefly to ASCO in May of 1999.
The trial 1690 included 642 patients, a third
randomized to high dose interferon given for 1 year, a third
to low dose for 2 years, and a third to observation.
The trial had one important difference in the
eligibility in that patients who had primary cutaneous
melanomas greater than 4 millimeters of Breslow depth were
allowed with or without regional lymph node dissection, a key
distinction from the E1684 trial such that 80 percent of the
patients who entered this trial had clinically node-negative
but not pathologically established node-negative disease.
We included about 10 to 20 percent of patients who had regional
lymph node metastases presenting as primary disease in the
regional lymph nodes, but half of patients presented and
entered this trial with recurrent lymph node metastatic
disease.
The trial analysis that I'll report to you today
included 642 patients in the intention-to-treat analysis, all
patients who entered the trial. 34 cases were ineligible,
and so all of the demographic analyses will focus upon the
95 percent of patients in this trial, 608, who met eligibility
requirements.
The goals of this study were an endpoint first
which was used for all monitoring committee decisions and for
the decision to unblind, which was relapse-free survival; a
second primary goal, overall survival analysis. And the
design was to pick up 83 percent power for a 10 percent increase
in cure or a 50 percent increase in either the median
relapse-free or overall survival. And two two-sided log rank
tests were specified for analysis.
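The power statement above can be related to a number of events through Schoenfeld's approximation for the log rank test. The sketch below is illustrative only. It assumes exponential survival (so that a 50 percent increase in the median corresponds to a hazard ratio of 1.5), equal allocation between the two arms being compared, and a two-sided significance level of .05; none of these is stated in the presentation, and the function name `required_events` is hypothetical.

```python
from math import log
from statistics import NormalDist


def required_events(hazard_ratio, power, alpha=0.05):
    """Schoenfeld approximation: number of events needed for a two-sided
    log rank test comparing two equally sized arms."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return 4 * (z_alpha + z_power) ** 2 / log(hazard_ratio) ** 2


# Assumed inputs: 83 percent power, hazard ratio 1.5, alpha .05 (assumption).
events = required_events(hazard_ratio=1.5, power=0.83)
print(f"Approximate events needed per pairwise comparison: {events:.0f}")
```

A stricter significance level, for example to allow for the two comparisons against observation, would raise the required number of events.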
I will also report to you Cox analyses, adjusting
for all the prognostic variables that we recognized, and a
comparison to the E1684 data as well as an analysis of the
salvage therapies that have now been gone through in detail
for 93 percent of the patients on the trial.
The demography of the patients entering this trial
included 25 percent of patients who were node-negative, N0;
34 percent who had 1 node involved; 21 percent who had 2 or
3 nodes involved; and 20 percent who had 4 or more nodes
involved. This contrasts with the E1684 trial which had only
11 percent of patients with T4 node-negative disease.
The analysis of the outcomes for relapse-free
survival show a hazard ratio for prolongation of time to relapse
or improvement in the fraction of relapse, 1.28, with all of
the 95 percent confidence intervals above 1, a p value of .05.
The low dose interferon impact was 1.09 hazard
ratio, crossing the value of 1, with a p value of .17.
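The hazard ratios quoted above are greater than 1 when treatment is favorable, which suggests they are expressed as the observation hazard divided by the treatment hazard. Under that assumption (it is not stated explicitly in the presentation), the short sketch below converts each ratio into the corresponding fractional reduction in relapse hazard on the treated arm. The function name is hypothetical.

```python
def relapse_hazard_reduction(hazard_ratio):
    """Convert an observation-over-treatment hazard ratio into the
    fractional reduction in relapse hazard on the treated arm."""
    return 1 - 1 / hazard_ratio


# Hazard ratios as quoted in the presentation.
for arm, hr in [("high dose", 1.28), ("low dose", 1.09)]:
    reduction = relapse_hazard_reduction(hr)
    print(f"{arm}: hazard ratio {hr} -> {reduction:.1%} lower relapse hazard")
```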
The surprise in this trial was that survival was
not impacted at all on either of the therapeutic arms, and
we'll come back to discuss that later.
The plots for the relapse-free survival
illustrated with high dose interferon in all of these as yellow,
low dose interferon as red, and observation as blue, revealed
the data that's consistent with the hazard ratios I presented
before, survival plots overlapping in all three of the arms.
Hazard function analysis shows, similar to the
E1684 trial, an early impact of the high dose interferon
illustrated in yellow here. The relapse risk of patients who
were observed, somewhat less than we had seen in the E1684
trial, and the values for the hazard functions for the low
dose interferon arm intermediate between the high dose and
the observation plots.
Subset analyses, although I know these are
somewhat fraught with problems, show a consistency of impact
across all of the stratification groups that we analyzed both
by stage of disease and by nodal category, the exception for
this being the 1-node-positive group for which the hazard ratio
was 1.0. As you see, the node-negative population, hazard
ratio 1.46, the node-positive populations also about
equivalent, but this one group of single node-positive
patients, clearly the outlier in the subset analyses.
I should back up to say that the one group that
by itself achieved nominal significance was this one group
of 2 to 3 node-positive patients, and for this group, the hazard
ratio of 1.92 associated with the curves that I have on the
next slide for this group achieving significance, as is shown
here, in the subset alone.
The toxicity of interferon alpha 2b given at high
dosage in this trial was about equivalent to what we saw in
the E1684, the single exception being that we had no toxic
deaths on the high dose interferon arm. In fact, the only
two toxic deaths were observed both on the low dose interferon
arm, one of a cerebrovascular accident, one of a myocardial
infarction.
The toxicity required dose reduction during the
induction first month of therapy in 44 percent of patients
for toxicity reasons, not relapse in this particular case.
Maintenance arm treatment associated with a requirement for
dose delay or dose reduction in half of patients over the
subsequent 11 months. And again a similar fraction to the
earlier trial, 75 percent of patients were able to stay on
treatment throughout the period of a year of treatment.
The average daily dose delivered in the 1690 trial
was above that which was delivered in 1684, in the induction
phase, 18.5 million units per meter squared as the median dose;
8.2 as opposed to 8.1 during the maintenance phase.
Comparing the absolute and relative impact of 1684
and 1690, we have here the impact in terms of relapse-free
survival for the high dose interferon arm. 37 percent over
26 percent continuously free of disease at 5 years in the E1684
trial; 44 percent as opposed to 35 percent in the 1690 trial.
This increment in terms of absolute percentage points is 11
percent in the 1684 trial, 9 percent in the 1690 trial; the
relative increment 42 percent in the 1684 trial, 25 percent
in 1690. As we've earlier mentioned, there is no difference
in the overall survivals at 5 years, as is shown here.
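The absolute and relative increments quoted above follow from the rounded 5-year relapse-free percentages. The sketch below reproduces the arithmetic from those rounded figures only; the helper name `increments` is hypothetical, and small differences from the quoted relative figures may reflect rounding of the inputs.

```python
def increments(treated_pct, observed_pct):
    """Return (absolute, relative) improvement of treated over observed,
    in percentage points and in percent of the observed value."""
    absolute = treated_pct - observed_pct
    relative = 100 * absolute / observed_pct
    return absolute, relative


# 5-year continuous relapse-free percentages as quoted: (high dose, observation).
trials = {"E1684": (37, 26), "E1690": (44, 35)}
for name, (treated, observed) in trials.items():
    absolute, relative = increments(treated, observed)
    print(f"{name}: +{absolute} points absolute, +{relative:.1f}% relative")
```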
The conclusions we drew then at the first analysis
of this were that the high dose interferon arm improves
relapse-free survival with a hazard ratio of 1.28, a continuous
relapse-free survival of 9 percent improved at 5 years, log
rank p of .05, a Cox analysis, .03, as I'll show you in a minute,
and is consistent with the 1684 trial.
Secondly, the subset data, which in the 1684 trial
had showed no benefit for the node-negative population, were
here refuted and the node-positive and node-negative
populations behaved very, very consistently in this trial so
that there seems to be a consistent effect across the risk
groups that we studied.
Low dose interferon had a lower absolute reduction
in relapse rate, a hazard ratio of 1.09, a log rank of .16,
and a nonsignificant value by Cox analysis, and that none of
the treatments tested in this trial had altered survival at
5 years, for which we will review some other analyses now.
The questions that we developed then were whether
patient populations differed between the two studies or whether
the treatment results differed between the studies. The
conclusions we'll draw from data that I'll now show you are
that there are major differences between these populations
in terms of the observation arm outcomes, that the observation
arm outcomes differ by .01 significance for relapse-free
survival and .001 for overall survival, and that there is no
study effect. There is no difference between the impact of
high dose interferon in 1684 as it is compared to 1690 between
the trials.
The Cox model analyses, adjusting for treatment,
showed a significant study effect, as I mentioned already,
.01 for relapse-free, .001 for overall survival. The Cox model
treatment by study analyses demonstrated consistency with the
interaction term .55 uncorrected to .90 as it was corrected
between the 1684 and the 1690 studies, saying that there was
not a difference between the impact of interferon in 1684 and
1690.
Adjusting for staging and nodal stratification
variables in 1690, the high dose treatment effect was
significant in Cox model analysis to a p value of .03.
The differences in the aggregate populations
studied in 1690, the solid line, and 1684, the dotted line,
here are shown for relapse-free survival. So, this is all
patients entered into the whole 1684 study here and all patients
in the 1690 study here, and you see that this is the basis
for the significance of .01 for the improvement in relapse-free
survival between the studies.
Even greater is the difference between the overall
survival of the 1690 population in solid white here and the
1684 population in the dotted white here, significant to a
value of .001.
The largest discrepancy was already identified
in the single node-positive population. Here you see the
observation arm with 1 node positive, untreated in 1690, and
the observation arm in 1684 compared where the value is almost
the same even though it's a much smaller subset between the
two studies. So, a radical difference in the survivorship
and the relapse-free interval for these populations.
Comparing the 1690 to the 1684 studies, within
study arms, the hazard ratios that we can show suggest that
consistent improvement in the relapse-free survival, 1.21
times better for the high dose interferon arm of 1690 compared
to the high dose interferon arm of 1684; overall survival
consistently better, 1.23, the hazard ratio for 1690 high dose
interferon compared to high dose interferon 1684. But the
observation arm compared within these two studies shows an
improvement which is greater than that for the treated arm,
and the greatest improvement of all is the 1.64 hazard ratio
for the untreated arms of the two trials compared in terms
of overall survival.
Looking at the stratification groups that we had
entered patients into these trials and comparing again the
two studies by subsets, we see that all of the subsets analyzed
in 1690, whether by nodes positive on this plot or by the stage
groupings that were used on the top plot, show a consistent
improvement of the outcome for the high dose interferon in
1690 as opposed to 1684. The one
discrepancy here, the single node-positive group that we've
already talked about.
For the observation group comparing the two trials
in subset analysis, we see that the one group that does not
show an improvement in the outcome for the 1690 trial is the
node-negative group, and this group, you will recall, is the
group that we entered into 1690 without node dissection so
that we know this group is heterogeneous and contains perhaps
20 or more percent who had nodes involved. So, this is the
explanation for the hazard decrement in that group.
Comparing graphically the outcome of 1684 on top
and 1690 on the bottom, observation groups in blue and treatment
groups in yellow, you see that the lighter bar is the
relapse-free interval where we have an improvement in the
relapse-free interval in 1684, which is about equivalent or
even better in the 1690 trial. We have a post-relapse survival
which is about 2 years in each of these after relapse for all
groups, save for the observation group of 1690.
Displayed in a table, the numbers are 2.1 years,
1.8 years, 2.6 years for the post-relapse survival of the
treated and the observation groups, except for this observation
group of the 1690 trial where this is 4.34 years survival post
relapse and an overall survival from time of entry to trial
of nearly 6 years, really unheard of in trials that we've done
beforehand.
So, how could this have occurred? The questions
were, did this arise from entry demographic changes between
the two studies; stage migration, Will Rogers phenomenon; or
changes in definitive surgery; or perhaps in post-relapse
salvage therapies that were used for these patients?
The demographics of patients between 1684 and 1690
is here portrayed. The node-positive population in 1684 was
89 percent of patients who entered this trial. It was only
75 percent of the 1690 trial. The recurrent disease population
was 65 percent of the 1684 trial, but it was only half of the
1690 trial.
Conversely, the T4 population, the most favorable
subset of entry stratification groups, was 11 percent of the
1684 trial and 25 percent of the 1690 trial. Of this
population, 80 percent were not dissected as they came into
the 1690 trial, offering the frequent opportunity for surgical
salvage and entry to treatment, as you recall, with July 1995
approval of interferon, through the back door off protocol
with the very same agent that we were testing in the original
trial.
In summary, of relapse sites of disease of the
patients on all arms, there was no difference in the
distribution of relapses between high dose arm, low dose arm,
and observation. That is to say, the impact we saw was
generalized across all groups in the trial. There was a
significant fraction of regional, nonvisceral relapses for
which surgical salvage, as I've already mentioned, was a
possibility and subsequent off-protocol therapy was feasible.
This is a graphical display of the regional,
surgically salvageable relapses in 1690 arm A, high dose; arm
B, low dose; and arm C, observation. You see here the 26
relapses, here the 37, 38 relapses that had the opportunity
for subsequent surgical salvage and subsequent systemic
treatment by a variety of routes.
So, we went back between February and April of
1999, analyzed those of the 642 patients in the trial for whom
we could get data. Relapses constituted 357 patients at that
time. 331, or 93 percent, of the data were obtained on these
subsequent data sweeps: 228 by on-site audits, 103 by queries
of institutions where 1 or less patients had been accrued to
the trial. Only 26 patients had missing data, only 5 from
the observation arm.
These are the systemic biological salvage
therapies or biochemotherapy salvage therapies used for all
patients in the high dose arm and all patients in the
observation arm displayed. And I will go through these in
detail, so I won't dwell longer upon this table, given the
short time.
Interleukin-2 was approved in the interim period
while this trial was unfolding. We surmised that this might
have been one of the therapies that would have accounted for
the differences in outcome. Of the 114 failures from high
dose interferon, only 13 received interleukin-2. Of the 121
from observation, 22 received interleukin-2. This difference
is not a difference. It doesn't achieve significance, and
we looked at the impact of this therapy and it also did not
make a difference in terms of the outcome of these patients
for their post-relapse survival.
Biochemotherapy was also in increasing favor.
Biochemotherapy was given to only 7 of 114 high dose failures,
where it was given to 20 of the 121 observation failures.
This difference is a difference, but it didn't, in terms of
post-relapse survival, have any further connotations. There
were not longer survivals amongst the recipients of
biochemotherapy than those who did not receive this, as I can
show you later.
The interferon salvage of the patients who failed
high dose interferon was 17 of 114. The numbers in parentheses
here are just the high dose recipients. This contrasted
against 37 of 121 patients who failed observation and this
difference was the most significant that we observed to a p
value of .004. The impact of the interferon treatment of these
patients illustrated graphically was a 2.2 year post-relapse
survival of the treated patients as opposed to a .8 year median
survival for the patients who were not treated.
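The three salvage-therapy comparisons above can each be laid out as a 2x2 table of recipients versus non-recipients among the 114 high dose failures and the 121 observation failures. The presentation does not say which test produced the quoted p value of .004, so the sketch below assumes a Pearson chi-square test with 1 degree of freedom and no continuity correction; the function name is hypothetical. Under that assumption the interferon comparison comes out close to the quoted value, the biochemotherapy comparison is nominally significant, and the interleukin-2 comparison is not.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, two-sided p value)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p_value


HIGH_DOSE_FAILURES = 114
OBSERVATION_FAILURES = 121

# Recipients of each salvage therapy as quoted: (high dose arm, observation arm).
salvage = {
    "interleukin-2": (13, 22),
    "biochemotherapy": (7, 20),
    "interferon": (17, 37),
}

results = {}
for therapy, (hd, obs) in salvage.items():
    stat, p = chi_square_2x2(
        hd, HIGH_DOSE_FAILURES - hd, obs, OBSERVATION_FAILURES - obs
    )
    results[therapy] = p
    print(f"{therapy}: chi-square = {stat:.2f}, p = {p:.3f}")
```

A Fisher exact test would be a reasonable alternative for tables of this size and would give somewhat different p values.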
We wondered what this had to do with the surgical
salvage of regional disease. How did this differ between
regional and systemic disease? So, the next plot shows you
in the solid lines regional disease failures who received
interferon as opposed to those who did not in the solid blue
and solid yellow lines, systemic relapses who received
interferon in the dotted yellow as opposed to systemic relapses
who did not receive interferon in the blue. And you see that
the impact was greater for those patients who had regional,
salvageable, operable disease.
We wondered whether this was just a surrogate for
treatability, the patients who looked better got treated and
therefore did better. This is a plot of those who received
chemotherapy or other forms of non-interferon-containing
therapy illustrated here, as contrasted to the interferon,
and there was a difference here as well.
So, the conclusions that I draw are that if we
look at trials that have demonstrated relapse-free survival
and overall survival impact, 1684 is what we have. If we look
for continuous relapse-free survival impact, we have 1684 and
we have 1690. I've not had time to date to talk much about
the NCCTG 83-7052 trial that was reported in the same year
as the 1684 trial, but in fact, for the subset of node-positive
patients, high risk patients showed exactly the same trend.
Pending, we have a series of studies: the 1694 trial
of ganglioside GM2 versus interferon, which will be completed
within the next 2 weeks with 851 patients; the Sunbelt trial,
a 3,000-patient trial, which is currently ongoing and about
half done; and the EORTC 18952 trial which is being conducted
in Europe testing two intermediate dosages. So, this data
is coming in from a variety of new vantage points.
Of the data that is completed and in hand, we have
the 1684 trial, the NCCTG trial that I mentioned already with
262 patients, 162 who had nodal involvement and who comprised
the basis for this Cox analysis positive for the impact in
that trial of 3 months of therapy, and the 1690 trial that
I mentioned already in detail today.
These are the trials that are pending, and I don't
need to spend longer on this since we're short on time.
But I think the conclusions that I draw or the
implications that I draw from this are that we have established
the adjuvant role of high dose interferon alpha 2b, and it
is consistent with the findings that we have in 1690. We have
salvage data for melanoma recurrences that I wouldn't have
predicted and I don't think anybody else on our committee would
have predicted but are interesting and that suggest that for
resectable nonvisceral as well as visceral disease there is
an impact that I think we hadn't before anticipated.
The endpoints for future trials, I think a key
point of consideration for this committee, because I think
we have to worry from now on that any trial that focuses upon
overall survival will have to deal with salvage of patients
that is hard to constrain for trials conducted in the era when
you have alternative therapies.
And we really need prognostic and response
indicators that are much shorter time lines to data than any
of the clinical endpoints that we talked about.
It's 1:30, Rich.
DR. SCHILSKY: John, thank you very much.
We'll take a few questions from the committee if
there are any information items you want clarification on.
Dr. Blayney.
DR. BLAYNEY: The 1690 trial included an
observation arm. Is this an ethical thing to do given the
results of the 1684 trial, or what figured into your
deliberations?
DR. KIRKWOOD: Good question. 1690 was started
before any survival impact was apparent, as I've shown in the
chronology of time line. At the time that we first had
statistically significant survival and relapse interval data
from 1684, we had already completed all accrual and all
follow-up on all patients in 1690.
DR. BLAYNEY: How did you handle patients who had
sentinel lymph node dissection in the 1690 trial?
DR. KIRKWOOD: As Rich said I was going to get
my legs cut off if I didn't stop at 1:30, I took those slides
out. Those analyses were all conducted. I actually expected
we would see a significantly larger fraction of patients with
sentinel node mapping done as a basis of entry to this trial.
In fact, it turns out that less than 5 percent of the patients
who were node-negative had any sentinel procedure done and
less than 5 percent of patients in any of the other groups
of 1 node, 2 to 3 node, or 4 or more nodes positivity had sentinel
node procedure. So, it was a very small component of the
surgical practice in this trial probably because it happened
just before the wave of this hit the surface.
DR. SCHILSKY: John, let me just ask you two
things. In the 1690 trial, what was the dose of the low dose
interferon?
DR. KIRKWOOD: It was the exact same dose that
you'll hear further about today given for 2 years. We actually
deliberated, when we designed 1690, whether we should give
3 million units 3 times a week forever, and I was the lone vote
on our committee to actually push for that. We actually
stopped at 2 years because people thought it was impossible
to carry patients past 2 years of this therapy without knowledge
about outcome.
DR. SCHILSKY: Just to be clear, the low dose
interferon in the 1690 trial didn't demonstrate any benefit
with respect to either disease-free or overall survival?
DR. KIRKWOOD: As I showed in the hazard ratio
analysis and as we have in subset analyses that I didn't have
time to present, it did show an impact and it showed an impact
which was intermediate on average between the high dose and
the observation.
DR. SCHILSKY: That was statistically
significant?
DR. KIRKWOOD: It was not statistically
significant in overview. The p value was .16.
DR. SCHILSKY: Thanks.
Any other questions for Dr. Kirkwood? Dr.
Lippman.
DR. LIPPMAN: I just want to clarify. You went
through the data pretty quickly because of time. I understand
that. But just to clarify this good survival on the
observation arm in 1690, the biggest difference between the
salvage therapies involved the interferon.
DR. KIRKWOOD: True.
DR. LIPPMAN: Do you think that that was in part
the explanation for the better survival on the observation
arm?
DR. KIRKWOOD: There's a component that may have
been played by biochemotherapy, but I think the interferon
salvage is the only explanation we presently have for that
greater survival of the patients in the observation arm.
DR. SCHILSKY: Dr. Simon.
DR. SIMON: Is there any documented randomized
trial evidence for the use of effectiveness of interferon in
recurrent patients commensurate with what you're claiming from
this sort of nonrandomized comparison?
DR. KIRKWOOD: We have done a number of those
trials and we've done them in small enough series that I think
none of them has had the power required to detect this kind
of an impact that we're seeing here. I think that there's
not adequate data.
DR. SIMON: Well, what was the size of the trials
you did?
DR. KIRKWOOD: 20, 30 patients. They were phase
I/phase II trials.
DR. SIMON: They were randomized trials?
DR. KIRKWOOD: No. These are phase I/phase II
trials.
DR. SIMON: So, there have been no --
DR. KIRKWOOD: There have been no randomized
trials that I'm aware of that have tested the impact of this
--
DR. SIMON: So, there's really no randomized
documentation --
DR. KIRKWOOD: Right.
DR. SIMON: -- that that really is a real effect.
DR. KIRKWOOD: True.
DR. SCHILSKY: Okay, John, thank you very much.
We'll proceed to the sponsor's presentation.
MS. da SILVA: Thank you. Good afternoon,
everyone, ladies and gentlemen of the advisory committee and
FDA. I'm Loni da Silva, Program Director of Regulatory Affairs
at Hoffmann-La Roche, and this afternoon we'll be discussing
Roferon-A for stage II treatment of malignant melanoma.
The proposed indication which we are seeking is
adjuvant therapy of and prevention of recurrence in surgically
resected stage II malignant melanoma, Breslow tumor thickness
greater than 1.5 millimeters, in patients without clinically
detectable lymph node metastases at a low dose of Roferon-A,
3 million units, subcutaneously 3 times weekly for 18 months.
Our presentations this afternoon will consist of
two speakers. Our first speaker is Dr. Antonio Buzaid, the
Executive Director of the Oncology Center, Hospital
Sirio-Libanes, Sao Paulo, Brazil, who is also the former
Medical Director of the Melanoma Unit at Yale and former
Director of the Melanoma Skin Center at M.D. Anderson. He
will be discussing the clinical overview of malignant melanoma
and concentrating also on the difference in the staging between
specifically stage II and stage III.
He will be followed then by Dr. Leon Hooftman,
who is our Director of Oncology at Hoffmann-La Roche. He will
be presenting our data on Roferon-A in the treatment of stage
II malignant melanoma.
Specifically we'll be focusing on these key
points. As I said previously, you will hear the differences
between the disease stagings, specifically stage II and stage
III, and that our data shows a prolonged disease-free interval
compared to no treatment, that disease-free interval is our
primary endpoint and is a good predictor for overall survival.
There is a strong trend towards increase in overall survival,
and with low dose Roferon-A, it has a well established safety
192
profile.
With that, I would like to call Dr. Antonio Buzaid.
DR. BUZAID: Good afternoon, Chairman, members
of the committee.
My focus and task today is to provide an overview
on prognostic factors of patients with melanoma stage I and
II, briefly also in stage III disease, and finally provide
a snapshot on adjuvant therapy of melanoma.
As you all know, the incidence of melanoma is
growing markedly worldwide. In fact, in the U.S. by the year
2000, 1 of 75 Americans will have the diagnosis of melanoma.
As far as the staging is concerned, we currently
have four stages for melanoma. Stages I and II pertain to
patients with primary melanoma. Concerning the next
presentation, clinical stage II disease are those with Breslow
depth greater than 1.5 millimeters. Stage III disease was
just presented by John, and it's basically patients with nodal
metastases and also in-transit metastases, and stage IV is
basically distant disease.
Most patients with melanoma present with stage
I and II disease at the time of diagnosis. Obviously, the
prognosis is very different otherwise it wouldn't be called
stage I, II, and III. But it's important to emphasize a few
193
things here.
First of all, in the stage I and II category, the
slope of the curve goes down very slowly, while here, as you
can see, stage III disease is a very rapid drop. In fact,
about 80 percent of the patients with stage III disease recur
in the first 3 years, while only half of the patients with
stage II disease. These patients probably have a lower
microscopic tumor burden because imaging studies are usually
negative in this setting. Although they recur, they recur
in a much slower fashion, while patients with stage III
disease probably have a larger microscopic tumor burden because
you can see that with CT-scans, but the curve drops reasonably
rapidly.
Let's focus on the prognosis of primary melanoma,
that is, stages I and II. Looking at one of the largest
databases, almost 5,000 patients, the University of Alabama
and Sydney Melanoma Unit database, the most important
factors are the Breslow depth or obviously tumor thickness,
ulceration, the location of the primary, the pathologic stage,
whether or not the nodes were involved regionally, level of
invasion, Clark level, sex, and age. But the most powerful
factor is obviously Breslow depth.
The Breslow, as you all know, is measured from
194
the granular layer of the epidermis to the deepest melanoma
cell that can be seen in the microscope, and there is obviously
a direct correlation between tumor thickness and outcome.
It's for patients less than 1 millimeter, 1 to 2, 2 to 4, and
greater than 4 millimeters.
We know well that this correlation is direct but
not linear; in fact, it is relatively linear up to 5 millimeters
or so, 4 to 5 millimeters, and then it flattens out somewhat.
So, very thick lesions, if you have an 8 millimeter or a 6,
it may not make a tremendous difference, but if you have a
2 versus 4, the jump is tremendous.
Now, let's focus a little bit on disease-free
survival and overall survival. There are very few series in
the medical literature that present data on disease-free
survival in primary melanoma. This is the largest data set,
5,000 patients from Duke University, and the only one that
actually has both curves clearly outlined. There are
important messages here.
The first one is obviously -- this is shown by
tumor thickness in groups between 0.76 and 1.5, 1.5 to 4 in
blue, and finally orange, greater than 4 millimeters. The
solid line is overall survival; the dashed line, disease-free
survival.
195
First of all, there is obviously a direct
correlation between disease-free survival and overall
survival, as you would expect in melanoma. This is not
testicular cancer, where you can salvage almost everybody with
chemotherapy.
Now, on the other hand, there is about a 25 percent
difference, absolute difference, that you see in general, about
20-25 percent for almost each category, and you need to
understand why this is happening here. So, you have patients
that recurred but haven't died. These are patients with
primary melanoma. The major element that explains the
difference between disease-free survival and overall survival
here is surgery because two-thirds of the patients with primary
melanoma recur regionally, in general nodal metastasis, and
about 40 percent of the patients that recur with nodal
metastasis, you can salvage them with surgery. This gives
you about 40 percent out of two-thirds, which is about 20 or
so percent of the patients. So, the major difference between
disease-free and overall survival is explained by surgery for
regional metastases. Nonetheless, still the majority of the
patients that recur eventually die, at least about 70 percent
of them.
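
The arithmetic in this passage can be written out as a short sketch. The two input fractions are the rounded figures quoted in the talk; the variable names are illustrative only and do not come from any trial dataset. The product works out to roughly a quarter of recurring patients, which is in line with the speaker's "20 or so percent" salvaged and "about 70 percent" who eventually die.

```python
# Rounded figures as quoted in the talk (not patient-level data).
regional_share = 2 / 3    # share of recurrences that are regional, mostly nodal
surgical_salvage = 0.40   # share of nodal recurrences salvaged by surgery

# Share of all recurring patients who are salvaged by surgery,
# and the complement: recurring patients who eventually die.
salvaged_share = regional_share * surgical_salvage
died_share = 1 - salvaged_share

print(f"salvaged by surgery: {salvaged_share:.0%}")  # 27%
print(f"eventually die:      {died_share:.0%}")      # 73%
```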
Sentinel node mapping is a novel technique for
196
melanoma, although very old for other cancers. It started
in melanoma in 1992. In sentinel node biopsy, basically we
inject a blue dye and/or a radioactive material and try to
find the first node the melanoma cells would drain to if they
were to metastasize. That's the concept of sentinel node,
and basically after the injection, you find the blue node and
send it to pathology. We know that there is a strong
correlation between this node and the remainder of the nodal
basin. If this node is negative, there's about a 98 percent
chance the rest will be negative. If it is positive, it's
positive.
One of the largest databases in sentinel node
mapping is from M.D. Anderson and the Lee Moffitt Cancer Center. It's
about 500 or so patients recently published in the Journal
of Clinical Oncology. As you can see here, there was a direct
correlation between tumor thickness and the chances of having
positive microscopic nodes. That's identical data to the
elective node dissection in the past. As you can see here,
pertaining to this particular presentation, greater than 1.5
millimeter Breslow depth has about a 22 percent chance of having
microscopic nodal metastases. So, about 80 percent of the
patients will be node-negative.
When you have such a database where all patients
197
underwent sentinel node mapping, we've learned that the most
powerful prognostic factor, if you do have that piece of
information, is the sentinel node histologic status. In the
multi-variate analysis, this is the most significant factor
followed by Breslow depth. If you do not have sentinel node
information, Breslow depth is the most powerful prognostic
factor.
This is the actual Kaplan-Meier survival curve
for disease-free survival. All patients studied. The
negative patients, the curve goes up, so it's a more favorable
subset now, and those with positive nodes, obviously the curves
do go down and go down relatively rapidly. This is
disease-free. But not everybody has died yet. As you can
see, about half of them have already died, and the majority
of patients with sentinel node have only 1 positive node.
That's why the curves look so favorable.
This leads to the next topic which is the prognosis
of patients with regional metastases, primarily nodal
metastases. Like all the other cancers in oncology, the number
of positive nodes is the most powerful prognostic factor for
patients with nodal metastases. Presence of extranodal
extension is also an adverse factor, and patients with
dual nodal basin versus only one nodal basin are a more
198
unfavorable group.
This is a Kaplan-Meier using an overlay graphic
technique. What you can see from this slide here is that if
you have nodal metastases, at least half of the patients will
eventually die, and in fact, looking at all curves in general,
about 70 percent of the patients will die. That is about 30
percent of the patients in general will be alive at 10 years,
if you have nodal metastases.
Again, this difference pertains to the number of
positive nodes. That is, patients with 1 node in general have
about a 40 percent chance of being alive at 10 years. Patients
with multiple nodes have usually about a 20 percent chance
of being alive at 10 years. Patients with extranodal extension
have about a 10 to 20 percent chance as well.
Now, as I pointed out before, if a patient has
a primary in the back and this patient has 2 lymph nodes involved
in one axilla, this patient fares a little bit better than
a patient that would have both axillas involved in a primary
in the back, that is, 1 node on the left and 1 on the right.
This patient will fare worse than one that has 2 nodes in
one site only. This is single nodal basin versus dual nodal
basin for the same number of nodes.
Finally, as far as subcutaneous and intradermal
199
metastases, what we call in general in-transit metastases,
the patients have a poor prognosis. Again similar to the
patients with nodal metastases, about 70 percent of them in
general will be dead at 10 years. This is similar to patients
with local recurrences.
A snapshot on adjuvant therapy. As you all know,
melanoma is the most serious type of skin cancer, which has
a high chance, depending on the prognosis of the patient, to
metastasize. Multiple attempts have been made in order to
reduce this risk of recurrence. In the past -- this is all
randomized phase III studies from stages I up to III --
chemotherapy has been employed, and the drug that has been
most widely studied was dacarbazine. Other regimens, some of
them somewhat bizarre regimens, have also been studied and
showed no impact in disease-free or overall survival.
Nonspecific immunotherapy, such as BCG, C. parvum,
transfer factor, or gamma interferon, and levamisole, somewhat
controversial but also considered negative definitely in this
country, showed no impact in disease-free or overall survival.
As you all know, when you combine things that don't work,
they usually don't work well. We've done that in oncology
as well. DTIC plus BCG is of no benefit in terms of overall
survival or disease-free survival.
200
Vaccines have a tremendous appeal for the
population. Whether it helps patients with melanoma, we don't
know. What we know to date is there are two randomized trials
reported. They're relatively small studies, but both were
negative. The first trial is of the vaccinia melanoma
oncolysate, VMO. It was as negative as you can imagine. The
p value was 0.99 and 0.88. The Memorial Sloan-Kettering
program using a ganglioside had a very modest impact on
disease-free survival and has been evaluated further in larger
randomized trials, but again it was preliminarily negative.
Other vaccine programs are ongoing and the results are not
as of yet available.
Finally, interferons. John Kirkwood has
presented in absolute detail the ECOG 1690 and the ECOG 1684
data. He also alluded to the North Central Cancer Treatment
Group protocol and WHO 16. It's important to emphasize that
these studies were conducted in patients primarily with
node-positive disease. The ECOG trials, about 80 percent of
the patients had basically node-positive disease; the North
Central, at least two-thirds have node-positive disease; and
WHO was completely node-positive disease. So, these studies
are really different, different population of patients
compared to the trials that will be discussed today.
201
The trials that will be discussed today will be
two studies, two randomized trials, which include patients
with clinical stage II disease, that is, patients with primary
greater than 1.5 millimeters and clinically node-negative.
And I will pass now to Dr. Hooftman. Thank you.
DR. HOOFTMAN: Good afternoon, ladies and
gentlemen, members of the committee, and FDA. My name is Leon
Hooftman. I'm one of the R&D directors for oncology for
Hoffmann-La Roche.
It's my pleasure this afternoon to present you
the data that form the basis of the license application that's
under discussion. We are here today to get the recommendation
of the advisory committee with regard to the license
application concerning low dose Roferon-A for adjuvant therapy
of stage II melanoma patients, that is, clinical stage II
melanoma, clinically node-negative melanoma.
I will do my job reasonably well if I am able to
discuss four specific important messages that form the basis
of this presentation.
Further to what Dr. Buzaid said, I would like to
emphasize the fact that currently there's no recognized
standard therapy available for patients with stage II melanoma.
Secondly, there's a distinct difference for
202
disease prognosis, as well as disease state, between stage
II and stage III melanoma.
Thirdly, we believe that low dose interferon alpha
2a prolongs disease-free interval in a patient population that
consists only of stage II melanoma patients.
And last but not least, there is a robust and strong
correlation between disease-free interval as a parameter and
the important long-term outcome parameter, which is overall
survival.
To come back to one of these points -- and I
apologize for the reasonably simple nature of this slide --
we have studied a low dose variety of Roferon-A for stage II
melanoma only. The ECOG 1684 and 1690 studies have a certain
proportion of patients with stage IIb, but the main body of
the study is about stage III, which is node-positive disease.
The Cascinelli study only studies stage III
disease, but with a low dose, the same dose as we have studied
in our trials.
What is also important to note is that a certain
proportion of all patients with stage I/II and a certain
proportion of all patients with stage II will develop stage
III disease, a certain proportion of patients thereof will
develop metastatic disease which is not curable.
203
I would like to discuss now the two large-scale,
randomized, multi-center trials that form the basis of our
license application, one pivotal, one supportive, that were
conducted in France and Austria, respectively.
The first study we call our pivotal study. It's
the French study performed by the French Melanoma Group that
started in January 1990, and the lead investigator was
Professor Grob. This study recruited 499 patients.
The study that we use for supportive purposes is
the study performed by the Austrian Melanoma Group, and that
study recruited 311 patients, started almost at the same time,
February 1990. The lead investigator here is Professor
Pehamberger, and both investigators are here with us today.
These larger studies prospectively studied the
usefulness of a low dose of interferon, 3 million units, given
3 times a week for a duration of 18 months, in order to be
able to bring down the incidence of recurrence of disease,
in other words, as adjuvant therapy, for stage II melanoma.
The design of the first study that I am going to
discuss is as follows. This is the pivotal study as conducted
by the French Melanoma Group in France. This well-controlled
study started, as I said, in January 1990, and patients were
recruited until January 1994, over a 4-year period.
204
The patient population of this study consisted
of clinical stage II melanoma patients only, that is, patients
without clinical, palpable lymph nodes, in other words,
clinically node-negative.
The dose used was 3 million units subcutaneously
given 3 times a week for 18 months.
Patients were randomized within 6 weeks after
surgery. Stratification by center was applied, but not by
risk factors. I will get back to that later.
Here you see depicted the conduct of this pivotal
study. As I said, it was initiated in January 1990, and the
primary efficacy analysis was done in January 1994 when all
patients were recruited. We'll have to go back to that later.
246 patients went into the observation arm. 253
patients ended up in the Roferon-A arm. Treatment duration
was for 18 months for all patients. Prospective follow-up,
as per protocol, was for 36 months, meaning that all patients
were followed up for 36 months, but the patients that had been
in the study longer had a follow-up of up to 7 years.
At that point, the prospective part of the study
finishes and a retrospective section of this study starts.
Patients were asked to provide a second, new consent and were
205
seen once by the clinician in order to be able to collect data
for long-term follow-up.
The primary efficacy endpoint, as used in this
study, was disease-free interval. This is the time between
initiation of therapy and relapse. This primary efficacy
analysis was conducted as a sequential analysis. This part
of the study was conducted as a sequential trial. A triangular
test was used. The alpha was 5 percent; the beta, 10 percent;
in other words, with 90 percent power.
The assumptions for the design of this study were
as follows. At 3 years, the investigators expected that 60
percent of all patients in the observation arm would be
relapse-free, and what they wanted to do was increase this figure
to 75 percent for the Roferon-A patients, an absolute increase
of 15 percent. For that purpose, they needed 104 relapses,
and all together at the time they thought they needed 452
patients.
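
As a rough cross-check on design assumptions of this kind, the number of events for a conventional fixed-sample log-rank test can be approximated with Schoenfeld's formula. This is a sketch under stated assumptions, not the investigators' calculation: it assumes the 60 and 75 percent figures are 3-year relapse-free proportions, proportional hazards, 1:1 allocation, and a two-sided alpha of 5 percent with 90 percent power. The function name `required_events` is hypothetical. The trial itself used a sequential triangular test, so the 104 relapses quoted above need not match the fixed-sample figure, which comes out at roughly 127 to 128 events.

```python
from math import log
from statistics import NormalDist


def required_events(p_control, p_treated, alpha=0.05, power=0.90):
    """Schoenfeld approximation for a two-sided, fixed-sample log-rank
    test with 1:1 allocation. Inputs are event-free proportions at the
    same time point; proportional hazards is assumed."""
    # Under proportional hazards, S_treated = S_control ** hazard_ratio.
    hazard_ratio = log(p_treated) / log(p_control)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2


events = required_events(0.60, 0.75)
print(f"implied hazard ratio: {log(0.75) / log(0.60):.2f}")
print(f"fixed-sample events needed: {events:.1f}")
```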
Three sequential analyses were performed. At the
last sequential analysis, a sample size adjustment was
performed as well, and a sample size adjustment was used in
this trial in order to be able to stop recruitment in the study
at the moment in time that enough data would be collected to
be able to answer the predefined question and show the
206
predefined difference.
A first interim analysis was performed in July
'92, when the total number of relapses was 59: 34 in
the observation arm and 25 in the Roferon arm. A second
sequential analysis in April '93, but the main efficacy
analysis was performed as the third interim analysis, the third
sequential analysis, in January '94.
At that moment in time, there were 134 relapses
in total, 80 in the observation arm and 55 in the Roferon arm,
a difference of 25.
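
The relapse counts quoted for the interim analyses can be tallied with a few lines of code. This sketch uses only the numbers as spoken; the dictionary layout is illustrative. It shows that the July 1992 figures add up, while the January 1994 arm counts sum to 135 rather than the stated total of 134. The transcript does not indicate which of those figures is off.

```python
# Relapse counts as quoted: (stated total, observation arm, Roferon-A arm).
reported = {
    "first analysis, July 1992": (59, 34, 25),
    "third analysis, January 1994": (134, 80, 55),
}

for label, (total, observation, roferon) in reported.items():
    arm_sum = observation + roferon
    status = "consistent" if arm_sum == total else f"arms sum to {arm_sum}"
    print(f"{label}: stated total {total}, "
          f"difference {observation - roferon}, {status}")
```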
The null hypothesis of this analysis of this part
of the trial was that observation was the same as Roferon-A.
At that moment in time, this null hypothesis was rejected.
A p value was reached of .038. This demonstrated that, at
that moment in time, Roferon-A statistically significantly
prolonged disease-free interval as compared to observation.
Quite separately from this main efficacy analysis,
a long-term analysis was performed for all patients with at
least 3 years follow-up. These were further exploratory
analyses of the primary efficacy endpoint and analyses of
secondary efficacy endpoints, as there are overall survival
and safety. They were performed at the end of the study.
That was the time when all patients had reached at least 36
207
months in the trial. And I remind you that treatment continued
for 18 months.
For this long-term analysis, we used an eligible
patient population. The total number of patients recruited
was 499. The eligible patient population consisted of 489
patients. We think that this is very close to an ITT, an
intent-to-treat, population.
As you can see here, these were the patients
excluded from these long-term analyses. The reasons for
exclusion, as listed here, are in fact violations and would
have normally been considered exclusion criteria as per
protocol. The 5 patients that had no injection initially
agreed to participate in the trial, but then immediately
withdrew their consent because they didn't want to have the
3 times weekly injections.
One has to appreciate that at the time of
initiation of this trial, it was not clear to the patients
what their potential benefit of this therapy would be, and
therefore the threshold, at least that was the risk a priori
-- the threshold for withdrawal from the trial would be low.
That has not materialized, fortunately, because withdrawal
in total -- and I'll get back to that later -- was not
considerable.
208
A little bit about demographics. As I said
before, stratification by center was done, but not by the more
relevant risk factors. However, with this number of patients,
499, it balanced out beautifully. Differences were very
small. There was no statistical difference, for example, for
the more powerful of the risk categories that we have used,
which is Breslow thickness.
Further to Dr. Buzaid's presentation, you see here
the categories of Breslow tumor thickness. Our patients in
this stage II melanoma patient population consisted of patients
with tumor thickness of 1.5 millimeters and more. This should
be looked at in categories and not as a continuous variable
because, obviously, these subcategories follow in some ways
anatomical boundaries.
This is one of the busiest slides that I'm going
to show this afternoon, and it will take me some time to guide
you through it, but this is quite a crucial slide for the message
of the presentation.
This is the long-term analysis on eligible
patients for disease-free interval, and disease-free interval,
the time from initiation of therapy to relapse, the difference
remains significant. This analysis was done when the median
follow-up time was 4.4 years. That means that the
209
first patients were up to 7 years in the trial and the last
patient entered 36 months.
The time to 25 percent relapse -- and I do not
show that on the slide here -- was 1.3 years in the observation
arm and 2.1 years in the Roferon arm, a rather remarkable
reduction of 25 percent, or 10 months.
The p value for the Kaplan-Meier estimates, as
you see here, is .035.
The number of relapses in the Roferon arm in total
was 100; in the observation arm, 119; a difference of 19.
Last but not least -- and that is perfectly
justified by the protocol -- if one would do a cutoff analysis,
something that most simple people like myself would understand
better, if one would do a cutoff at 3 years, then the percentage
of relapses here would be 32 percent and 49 percent in the
observation arm, a difference of 17 percent. With
stratification by center, that carries a p value of .005.
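
The summary figures in this passage can be reproduced with simple arithmetic. The sketch below uses only the numbers quoted here (time to 25 percent relapse of 1.3 versus 2.1 years, and 3-year relapse proportions of 49 versus 32 percent); the variable names are illustrative, and the relative reduction is a derived quantity that the speaker does not state.

```python
# Time to 25 percent relapse, in years, as quoted for each arm.
obs_t25_years = 1.3
rof_t25_years = 2.1
delay_months = (rof_t25_years - obs_t25_years) * 12

# Proportion relapsed at the 3-year cutoff, as quoted for each arm.
obs_relapsed_3y = 0.49
rof_relapsed_3y = 0.32
absolute_diff = obs_relapsed_3y - rof_relapsed_3y
relative_reduction = absolute_diff / obs_relapsed_3y  # derived, not quoted

print(f"delay to 25% relapse: {delay_months:.1f} months")
print(f"absolute difference at 3 years: {absolute_diff:.0%}")
print(f"relative reduction at 3 years: {relative_reduction:.0%}")
```

The delay works out to just under 10 months and the absolute difference to 17 percentage points, matching the figures given by the speaker.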
Breslow thickness, as presented before by Dr.
Buzaid, is a powerful risk parameter or prognostic factor.
We show this slide here today of the Kaplan-Meier estimates
for the specific subsets of Breslow thickness only to show
that the impact, the effect, for all categories is similar.
I also need to inform you that there was no
210
interaction between this risk parameter and the outcome as
disease-free interval, nor was there any interaction between
age and sex and this outcome parameter.
Before I start explaining this slide, it's my task
to bring across to you that this study was never designed to
evaluate overall survival. I'll try and explain that.
A sequential analysis was performed and a
triangular design was used. That means that discontinuation
of recruitment into the trial was done at the moment in time
that there were enough events to answer the question about
disease-free interval. By nature of things, there will always
be more events such as relapses than death. Therefore, it
is a little bit unreasonable to expect that one would be able
to show a difference for an outcome parameter which has fewer
events like death.
As it happens, we come close with a p value of
.059. But the only thing we can conclude from that is that
there is a strong trend.
However, as I said before, there is a robust
correlation between disease-free interval and overall
survival, and I will get back to that when I conclude this
talk.
There were 59 deaths in the Roferon arm in this
211
analysis and 76 deaths in the observation arm. It's obvious
that at 6 years, at the tail end of the curve, like with the
other curve, there are few patients in the analysis simply
because median follow-up time here as well was 4.4 years.
Dr. Buzaid showed a slide in his presentation where
he put together disease-free interval or time to relapse and
overall survival. Sorry. This is disease-free interval
obviously for both and here is overall survival.
I would like to show to you what the difference
is between the two with regard to events. 100 relapses in
the Roferon arm, 119 in the observation arm. 59 deaths in
the Roferon arm, 76 in the observation arm. One difference
of 19. One difference of 17.
I think that the crux of my argument for this
afternoon is that if we manage to delay or prevent recurrence
in this disease, it is possible that we may delay death as
an event. I think that that is an important thing to keep
in mind.
The shapes of these curves are similar, but that's
the only thing I can say about them.
It's very important for a regimen that has to be
continued for 18 months that tolerability is more than
acceptable. We have looked at the adverse event pattern of
212
this dose used in this study, 3 million units 3 times a week,
and we have concluded that the pattern of adverse events that
we observed is not different from the pattern of adverse events
that we see with the use of this drug in other indications.
There are no surprises and there are no events
that suggest the sort of toxicity that one would relate to
a higher dose of this drug that we have also seen in other
studies with our drug in the past.
So, here you see the percentages of the patients
with flu-like symptoms, asthenia, headache, nausea/vomiting,
depression, and dizziness being the most commonly reported
adverse events in this trial.
If we then look at the percentage of patients with
grade 3-4 toxicity, then these percentages are low. Again,
this is a well-established safety profile that we know and
have seen several times before with the use of this drug.
What is important to show, however, is that there
is a certain withdrawal rate, and this withdrawal rate is 14
percent. 35 patients withdrew from treatment over the course
of 18 months. The majority of these withdrawals happened
around the 1-year time point. More importantly, they were
for events such as asthenia, flu-like symptoms, dizziness,
depression, usually grade 1-2. There were 9 patients, though,
213
with grade 3-4 that withdrew, and you see them described here.
There were 2 patients withdrawn for severe increases in liver
enzymes.
I will now move on to discuss the study that formed
the supportive data for this application, the study performed
by the Austrian Melanoma Group. Recruitment took place
between 1990 and 1994, roughly in parallel with the French
study. This was also a prospective, randomized, multi-center
trial. Patients had Breslow tumor thickness of 1.5
millimeters and more, in other words, clinically node-negative
patients, exactly the same patient population as we had in
the other study.
The primary efficacy parameter was also the same,
disease-free interval, time from initiation of therapy to
relapse.
The dose was the same, the regimen slightly
different, and the treatment duration was different. 3
million units were given 5 times weekly, once daily for 5 days,
for a duration of 3 weeks, sort of an induction regimen. The
maintenance part was, however, the same as I described for
the previous trial.
I base this part of the presentation on the
publication database. The data that I've presented and
214
present from the publication, this publication has a patient
number of 311: 154 in the Roferon arm, 157 in the observation
arm. There is currently a database that has 330 patients,
as 19 CRFs were collected after the publication cutoff.
Demographics. Again, I show Breslow thickness
as a risk parameter only, and here as well, whereas there was
no stratification for this parameter, both arms are well
balanced. There is certainly no statistically significant
difference between the two. There are only small differences
that are not clinically relevant.
These are the Kaplan-Meier estimates for this
study, also for disease-free interval. Here you see the
observation arm. Here you see the Roferon arm.
This analysis was done in September 1995 when
patients had been in the study for at least 1 year and observed
and followed up for at least 1 year. So, recruitment took
3 years, 154 here and 157 on the other side.
37 patients relapsed in the Roferon arm, 57 in
the observation arm. The p value was less than .05.
Here you see our overall conclusions. We have
seen parallel efficacy in two independent studies with 800
and more patients in these studies all together.
The reduction in recurrence rates or time to
215
recurrence of 25 percent in our view is clinically meaningful.
This translates into prolongation of disease-free interval
of 9 to 10 months. The time to 25 percent relapse in the French
study, in the pivotal study, was 1.3 years in the observation
arm and 2.1 years in the Roferon arm. If we cut off at 3 years,
32 percent of patients have relapsed in the Roferon arm and
44 percent in the observation arm.
We have seen a strong trend towards increase in
overall survival that is properly correlated with the increase
we have seen that is statistically significant for disease-free
interval.
This drug has a well established safety profile.
The withdrawal rate over 18 months in this study was low.
It was 14 percent, but in view of the fact that patients did
not know exactly what their advantage was going to be, this
was very reasonable. The drug was therefore well tolerated.
Patients could continue with work and lead an essentially
normal life. This is important for a prophylaxis regimen and
a regimen that relies on compliance and has to be maintained
for 18 months.
We designed low dose Roferon-A for a situation
whereby there's a low tumor burden and an intermediate to high
risk of recurrence. What this therapy does is it may prevent
216
or delay the dreadful moment of disease recurrence. It may,
therefore, delay death as visceral metastases directly lead
to death within 12 to 18 months.
We, therefore, recommend low dose interferon alpha
2a, otherwise called Roferon-A, therapy as adjuvant therapy
of stage II melanoma patients. These are patients with
clinically node-negative melanoma. This translates into a
Breslow tumor thickness of more than 1.5 millimeters. We
recommend a treatment duration of 18 months.
This brings me to the end of my presentation.
Thank you.
DR. SCHILSKY: Thank you very much.
Are there questions from the committee members
for the sponsor? Dr. Raghavan?
DR. RAGHAVAN: These are two quite large sets of
data and you're asking us to accept disease-free interval as
a good surrogate of overall survival.
The one thing that troubles me and puzzles me is
the time of recruitment to these two trials was for the French
trial January 1990 to December 1993, and the Austrian trial
sometime in 1990 to 1994. By my calculations, you should have
follow-up data conservatively to 9 years and maybe to 10 years,
and yet the survival curves that you present show weak power
217
out at 6 years. So, effectively you're presenting old data
that haven't been updated and yet asking us to accept
disease-free survival rather than overall survival. Could
you clarify why that is?
DR. HOOFTMAN: I would not immediately agree with
that. With this proposal for this therapy in an indication
of stage II melanoma, median time to death is 7 to 8 years.
Our median follow-up is 4.4 years. We are, however, getting
closer to the moment in time where we could produce longer
follow-up data.
DR. RAGHAVAN: No. I'm sorry. I guess I asked
the question without clarity and I apologize.
I understand what you just said, but the reality
of the situation is that even your disease-free survival
curves, unless I'm misinterpreting them, don't go out to the
full time that would be eligible for the duration of
follow-up. It looks to me like the data that you've shown us, whether
they're disease-free or total survival, are old data. I can't
understand if you had patients entered in 1990 who you propose
are still alive, which I hope is the case, why the survival
curves have so few cases at 6 years that are still going.
It doesn't make sense to me.
Why have you censored at 6 years? Why do the
curves not go out at least to the 9-year point?
DR. SCHILSKY: Would you please identify
yourself?
DR. WASSNER: I'm Elizabeth Wassner. I'm working
in oncology in Basel.
The dossier has been submitted two years ago.
These are the data that you reviewed.
Now, if we look at 5-year survival data, which
is actually a reliable time point in the study, we've got a
p value of 0.021, which is even more significant than what
we've presented here.
DR. SCHILSKY: Can we just clarify that perhaps
by hearing a brief summary of the registration history? You
just said that the materials were submitted two years ago and
that that's the data that we're reviewing today.
DR. WASSNER: Yes.
DR. SCHILSKY: Since you originally submitted the
data two years ago, have you provided any update to those data?
DR. WASSNER: We haven't been requested to do
that, but it is planned, of course, to look longer into these
data. But right now this is the data we have, and we're
actually claiming overall disease-free survival and this is,
I think, mature data. Overall survival, of course, would
request 10-year follow-up in this population, and an end of
recruitment, which is December 1993. 10-year data are still
far away.
MS. da SILVA: Just to clarify the regulatory
history of the submission, we originally submitted our
application in September 1997 and the year time clock for acting
on that with FDA was in September of 1998 when we received
questions and responses from them. We then took into account
their comments and resubmitted a response in March of 1999,
which included a second study with the Austrian publication,
and then we are here before you today, of course. We were
notified in July, so we have not submitted an update as of
yet.
DR. SCHILSKY: Thank you.
Other questions? Dr. Nerenstone.
DR. NERENSTONE: I'm not familiar at all with
these clinical trials groups. We're usually given a little
bit more information about frequency of follow-up or how
patients are clinically staged. That's sort of important in
a study where it's a disease-free interval difference that
you're looking at. Can you tell me how often these patients
are followed and what kind of tests are done, whether liver
function tests are done, CT scans, or clinical, and how often
that interval is?
DR. HOOFTMAN: Can I please defer this question
to Professor Grob who was the lead investigator of this trial?
PROFESSOR GROB: Jean-Jacques Grob, dermatology,
France.
Both groups were followed exactly in the same way.
People were examined every 3 months and they underwent CT
scan and x-ray explorations every 6 months, exactly in the
same way in the two groups.
DR. NERENSTONE: And were laboratory evaluations
done as well at every 3-month follow-up?
PROFESSOR GROB: Yes.
DR. NERENSTONE: Were CNS relapses considered
relapse?
PROFESSOR GROB: Yes.
DR. SCHILSKY: Could I just pursue that before
you sit down? Because, as I understand it, the follow-up was
done for 36 months according to the protocol, and then there
was an effort made I guess by the company to then ascertain
again the clinical status of all the patients sometime after
the protocol-prescribed follow-up was completed.
So, can you tell us something about what the
follow-up of the patients was in that interval of time from
when the protocol-specified follow-up ended until the data
were collected again from all the participating sites? Did
the investigators continue to follow the patients on the same
schedule? Do we have a way of verifying in fact that they
were followed on the same schedule with the same tests being
done at the same intervals on both arms?
PROFESSOR GROB: Well, I would say that we were
out of the limits of the protocol, but most patients were
followed exactly in the same way and some were followed more
closely because the follow-up protocol is a little bit less
tight than the usual process in France. The only way to check
it would be to come back to the files because a point was made
after.
DR. SCHILSKY: Yes. It is a bit of a concern
because the ascertainment of relapse status in a sense could
be very unbalanced in that interval of time when the protocol
was no longer necessarily being followed. Since that's the
primary endpoint that we're looking at here, I think we have
some concern about whether in fact patients were followed
exactly in the same way. It was an unblinded study. There
could have been biases in favor or against the treatment that
were in the minds of the physicians or the patients.
Okay. Other questions from the committee? Dr.
Johnson?
DR. JOHNSON: I think I read and understood Dr.
Hooftman's presentation to say that the pivotal trial was
designed without consideration of the usual prognostic factors
being used for stratification purposes. I believe that was
correct. Is that correct?
DR. HOOFTMAN: I wouldn't say without
consideration, but there was no stratification for the more
powerful risk categories such as Breslow, nor for age or sex.
However, as I showed you on the slide, there was no imbalance
between the two.
DR. JOHNSON: I won't be too melodramatic, but
I'm very surprised that a study of this size undertaken at
the time that this was would have done that, to be honest.
I'm just very surprised. This is not new information really.
I just don't understand why a trial of this size would be
undertaken without proper consideration of known prognostic
factors.
What you showed us was a Breslow depth. You
haven't shown us the other prognostic factors I don't believe.
DR. HOOFTMAN: Can we call up these? We have some
backup slides, with permission.
I can already start and answer the question.
There was no imbalance at all with regard to the risk categories
of Breslow tumor thickness, age, sex, location of primary or
pathology.
DR. JOHNSON: Do you have location?
DR. HOOFTMAN: Here you see depicted the sites
of melanoma or location of primary.
DR. SCHILSKY: Anything else you want to see,
David?
DR. JOHNSON: Yes. Well, I want to ask a couple
of other questions.
You gave us the overall survival data and you
mentioned the number of deaths, but I don't recall. Were all
of those deaths due to melanoma?
DR. HOOFTMAN: No, they were not all due to
melanoma.
DR. JOHNSON: Can you give us the causes of death
on the two arms?
DR. HOOFTMAN: 4 deaths were not related to
melanoma, 2 in each arm.
DR. JOHNSON: The other question I have, I was
also surprised at the differences in the number of patients
not eligible on the treatment arm. I believe there were 9
patients, if I'm not mistaken, versus 1 on the observation
arm.
DR. HOOFTMAN: That's correct.
DR. JOHNSON: The skeptic that I tend to be, if
all 9 of those patients had, in fact, progressed, what would
that have done to your DFI curves and the observation arm had
remained the same? Would it still be statistically
significantly different?
DR. HOOFTMAN: That is a perfectly reasonable
question.
DR. JOHNSON: I thought so.
(Laughter.)
DR. HOOFTMAN: Can I defer this to my colleague,
Sam Givens, the statistical expert?
DR. GIVENS: My name is Dr. Sam Givens. I'm a
statistician at Hoffmann-La Roche.
Yes, that is a good question. Let me start off
by answering it in one way, and that is that the sequential
analysis that was done, which was defined in the protocol as
the primary analysis to stop recruitment of the trial, was
done on all patients. There were no exclusions in that
analysis and that analysis was significant at the .038 level.
I think they naively did not include Breslow in
their anticipated statistical analysis for that sequential
stop. Their thought was that if they're balanced, they'll
be okay, and the other aspect was, when we followed the patients
longer, the expectation was to include that category into the
final analysis.
As to the question of if all 9 of those patients
had died, I believe that reduces the difference in survival
by 9 and would drop it from 19 to 10. My expectation is
certainly that that would have lost significance.
DR. JOHNSON: I'm asking also DFI. This is
overall survival. I'm asking for DFI as well, which is the
only endpoint that you showed a statistically significant
difference.
DR. GIVENS: So, now you're saying in the
hypothetical situation on DFI, if we had known all 9 of those
patients had had a relapse.
DR. JOHNSON: Correct.
DR. GIVENS: Well, those 9 patients were included
in the analysis with what we knew about them, but I think that
had all 9 of those died that -- or had all 9 of those relapsed,
I would anticipate that they would not be significant.
DR. SCHILSKY: Dr. Lippman.
DR. LIPPMAN: Actually I had a comment and a
question, but before that, just following up on the last point,
all 9 patients were included in an intent-to-treat analysis
that was presented in terms of disease-free and overall
survival?
DR. GIVENS: The sequential analysis that was done
included all patients. There were no patients who had been
eliminated at that time that led to the stopping of the trial
-- stopping of recruitment. Sorry.
DR. LIPPMAN: So, I think that answers that
question, Dave, if they were included.
DR. JOHNSON: Well, actually I don't think that's
what I heard. What I heard is that those 9 were not included
in that analysis. Maybe in the stopping of the trial but not
in the analysis of the DFI.
DR. SIMON: If I could clarify what I heard, it
sounded like they were included at the interim analysis that
led to the stopping of recruitment, but they were excluded
in the analysis based on further follow-up.
DR. JOHNSON: That's right. That's what I
understood, and the numbers reflect that I think there.
DR. GIVENS: You are both correct with that
statement.
DR. SIEGEL: Can I get a clarification? Dr. Simon
just referred to the analysis that led to the stopping of the
trial as an interim analysis. If I understood the
presentation, that's the analysis you presented as the primary
analysis with the .038. This analysis is the analysis when
everybody had 3 years of follow-up, which you presented as
a secondary analysis, and then additional follow-up beyond
3-year data -- you haven't presented those data. Is that a
correct understanding?
DR. HOOFTMAN: It's almost correct. The primary
efficacy analysis was for disease-free interval. It was at
the same time the analysis that determined the discontinuation
of recruitment in the trial. You have to set that apart from
the long-term analysis that is an exploratory type of analysis.
The third analysis was solely -- it was done
retrospectively, but to get more information with regard to
overall survival. The trial and the protocol as such was
written for a 36-month course. That means that the last
patient entered reached 36 months and then the long-term
analysis was performed.
DR. SCHILSKY: Dr. Lippman.
DR. LIPPMAN: I just have to clarify one other
thing. Maybe I'm just missing the point. Hypothetically we
assume what happened if they all progressed, and that's a big
concern when they're eliminated from an intent-to-treat
analysis. But we don't have to be hypothetical here. Right?
You have follow-up on those and they were included in your
analysis? We know as much as we know about those patients?
DR. HOOFTMAN: These are the patients that were
excluded from this long-term type of analysis. 5 of these
patients never received an injection because they, so to say,
got cold feet and they didn't want to be in the study once
it was clear what was going to happen. 3 patients had the
wrong diagnosis. The patient that you see at the top of the
list had stage IV and died after a few days. The second patient
had a Clark level I tumor. The third patient had lymphoma.
The fourth patient had a previous melanoma, which was also
an exclusion criteria, and the 1 patient in the observation
arm had a previous melanoma.
DR. LIPPMAN: So that that would add 3 relapses,
if they were included in patients that had the right eligibility
criteria.
DR. JOHNSON: Well, no. I would say 5 at a
minimum, the 5 who withdrew their consent. To me that's not
an intent-to-treat analysis. That's a "I took out 5 people
I didn't want to include" analysis.
DR. LIPPMAN: The question that I had actually
is this issue of disease-free interval and the importance of
that. Actually in the context of everything that we've heard
this afternoon, the first presentation by Dr. Kirkwood and
this, I actually was very disturbed by the finding of 1690
and the explanations for that in which you saw significant
improvements in disease-free but absolutely nothing, not even
a trend in survival. In this case there's a significant effect
in disease-free survival and a .056 which translates to 59
deaths, if I read the slide correctly, in Roferon, and 76 in
the observation arm. So, it's certainly consistent and in
the right direction.
But I want to get to the explanation that was given
by Dr. Kirkwood, at least that I asked earlier, that the major
aspect of that difference in survival he thought could have
been explained by salvage interferon. So, the question here,
have you looked at patients? Two issues. One, on the
observation arm, if there was a drop-in rate on the interferon.
Certainly it has been available and people have been talking
about interferon and melanoma for a long time. And two, at
relapse, the differences between the arms in terms of salvage
interferon.
DR. HOOFTMAN: Would you please repeat the
question?
DR. LIPPMAN: So, the question is, on the
observation arm, of the patients that recurred, what was the
salvage therapy? Were a substantial number of the recurrences
on the observation arm treated with interferon at recurrence?
DR. HOOFTMAN: The only thing I can do in this
situation is ask Professor Grob to answer the question. I
think that the difference with what Dr. Kirkwood's group has
done is that we have not formally retrieved that information
in a retrospective fashion.
PROFESSOR GROB: If I understood you well, the
question is what kind of therapy did the patient receive after
relapse. We do not have this information in our data. Of
course, we can go to the files, but I think really that none
of the therapy of metastatic disease, of distant metastatic
disease, visceral metastases has shown any effect on the
overall survival. So, this is my first answer.
And the second would be that it is highly likely
that the treatment after recurrence were well balanced between
the two groups. But the effect of the treatment on the overall
survival, I would be happy to get one.
DR. LIPPMAN: The reason I bring that up is
I was surprised also by the presentation of Dr. Kirkwood that
there was a major difference between the arms in terms of who
had gotten interferon, and that that was the best explanation
at least that exists, as I understand, for the fact that you
see an improvement in disease-free survival but nothing in
terms of survival. If that was even a potential confounder
in this study, that might account for why your p value is .056
instead of .049. Could that have played an effect if what
Dr. Kirkwood told us is correct?
PROFESSOR GROB: Well, this is an explanation and
a hypothesis which was provided by Dr. Kirkwood. I would say
I don't share this explanation because really I don't think
that either IL-2 or chemotherapy or interferon can really
change the overall survival. At least this has not been
established in the literature, neither in my experience.
DR. SCHILSKY: Dr. Simon?
DR. SIMON: I had a few questions. One is you
indicated there were 35 patients who withdrew from treatment.
How were they handled in the analysis?
DR. HOOFTMAN: You're asking a question about the
35 patients --
DR. SIMON: Yes.
DR. HOOFTMAN: -- the 14 percent who withdrew from
treatment?
DR. SIMON: Right.
DR. HOOFTMAN: As usual, they were all included.
DR. SIMON: Their follow-up continued as for the
patients who did not withdraw from treatment?
DR. HOOFTMAN: That's correct.
DR. SIMON: I would like to get some clarification
about the database that was used for the analysis, not for
the interim analysis because my experience is at a time of
interim analysis, there are delays in reporting and that's
really not necessarily a very accurate database, particularly
in a multi-center study with many centers involved and
particularly when you're using something like a triangular
test in which the protocol says you do analyses after every
20 recurrences. I don't really think that's practical in a
multi-center study, and I have questions about the accuracy
of the database in a situation like that. So, I would like
clarification. So, for me, that's really not the definitive
analysis.
I would like clarification of what additional
follow-up was performed and what kind of auditing was done
and how long each patient was followed and what proportion
of the patients were lost to follow-up not for the interim
analysis but for the subsequent analysis.
DR. HOOFTMAN: I understand the question. Can
I give the work to a statistical colleague who was intrinsically
involved at the time?
DR. RAMISIO: My name is Dr. Maurizio Ramisio,
statistician, Hoffmann-La Roche, Basel.
The database that was used for the third sequential
analysis is unfortunately not available anymore. We collected
complete information on all the patients in the beginning of
1996 and, as Dr. Hooftman said, getting a new informed consent
from all the patients. The follow-up analysis that has been
presented is based on those data.
The triangular test analysis that has been
presented is based on the data of the 1st of January 1994,
which are not available any longer.
We have simulated an analysis at the time of the
1st of January 1994 by putting a cutoff, using the data that
we have to date, but putting a cutoff on the 1st of January
1994. The result that we have got with this analysis is still
significant, is 0.035 on the log rank test. But again, we
are not able to reproduce the analysis of that time.
DR. SIMON: So, the .035 represents an estimated
significance level at the time that that interim analysis was
performed?
DR. RAMISIO: This is what I'm saying now. What
has been presented by Dr. Hooftman is the result which was
obtained by Professor Chastang at that time doing the third
sequential analysis on the data which was available at that
time.
DR. SIMON: Suppose we forget about sequential
analysis. Can you just clarify what is the most complete data
available?
DR. RAMISIO: All right. The most complete data
available is the data that have been collected in the beginning
of 1996, and this is the data that have been presented as
follow-up analysis by Dr. Hooftman.
As I said before, if we do a cutoff on that set
of data, which has been quality controlled, and source
documents verified, and we do the analysis as it would have
been done on the 1st of January 1994, we get a log rank test
with a p value of 0.035.
DR. SIMON: Suppose you don't do a cutoff and you
just do the analysis with all of the data.
DR. RAMISIO: If we do the analysis with all of
the data -- I don't remember what was the significance. If
we do the analysis on disease-free interval, including all
the patients, so intent-to-treat, including all the 499
patients, we have to exclude 2 who had no follow-up visit at
all. They went into the study. They were randomized but had
no visit at all. So, if we analyze that -- I'm sorry. I must
find the right page.
Here. The disease-free interval -- the
significance, stratifying by center, is 0.074. If we do the
analysis on the eligible patient population, so excluding the
10 patients that we have discussed about before, we get a p
value, which is 0.035. This is including all the data
available up to the beginning of 1996.
If we do the analysis as it was prescribed by the
protocol, we said an analysis will be performed at the end
of the study, which could be interpreted as when all the
patients will have had 3 years follow-up. The p value becomes
0.005.
Is this answering your question?
DR. SIMON: What was the last point? If you do
what?
DR. RAMISIO: The protocol prescribed a primary
analysis, which was the sequential, and said, unfortunately
a little bit unclearly, a further analysis will be performed
at the end of the study. So, it is a matter of interpretation
what is the end of that study.
In another place, the protocol says the patients
will have to be followed for 3 years. So, an interpretation
of the end of the study might be when all the patients will
have been followed for 3 years. So, if we do an analysis
cutting all the data following the 3 years, so treating as
censored all the patients who had a relapse after the 3 years,
we obtain a log rank test with a p value of 0.005.
If we do not do that, if we take all the data
considering a median follow-up of 4.4 years, where some
patients have been followed up for 3 years and some have been
followed up for 6 years and more, then we get, on the eligible
patients population, a p value of 0.035 and, on the ITT
population, a p value of 0.074.
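[Illustrative note, not part of the proceedings.] The 3-year analysis described above truncates each patient's follow-up at a fixed cutoff, so that a relapse observed after the cutoff is counted as a censored observation at the cutoff. A minimal sketch of that data step follows. The function name `censor_at` and the five patient records are hypothetical and are not taken from the trial database.

```python
from typing import List, Tuple


def censor_at(times: List[float], events: List[int],
              cutoff: float) -> Tuple[List[float], List[int]]:
    """Administratively censor follow-up at `cutoff` years.

    A patient followed beyond the cutoff is recorded as event-free at the
    cutoff, even if a relapse was observed later.
    """
    new_times: List[float] = []
    new_events: List[int] = []
    for t, e in zip(times, events):
        if t > cutoff:
            new_times.append(cutoff)
            new_events.append(0)  # a later relapse no longer counts
        else:
            new_times.append(t)
            new_events.append(e)
    return new_times, new_events


# Hypothetical records: years to relapse or last contact; 1 = relapse, 0 = censored.
times = [1.2, 2.8, 3.5, 4.4, 6.1]
events = [1, 0, 1, 1, 0]

times_3y, events_3y = censor_at(times, events, cutoff=3.0)
print(times_3y, events_3y)
```

In this toy data set, the two relapses at 3.5 and 4.4 years drop out of the event count once the cutoff is applied, which is why the cutoff and full-follow-up analyses can give different p values.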
DR. SIMON: One other question. You didn't
present any data on sites of recurrence, which ones were
resectable, which weren't. Do you have that data?
DR. HOOFTMAN: Yes, we have that information.
We just have to find it.
As you can see here, the recurrences were mainly
regional or local as opposed to visceral.
DR. SCHILSKY: Dr. Blayney.
DR. BLAYNEY: Thank you. I have three questions.
As has been alluded to earlier, in an analysis
where you're looking at disease-free interval, there's a
potential for bias introduced into the ascertainment of the
data points because patients may be lost to follow-up, the
ones that recur may die without knowledge of the investigator.
Without a prospective plan for follow-up, this is of some
concern in trying to interpret the data. I guess I would have
some more comfort if you could tell me how many patients were
lost to follow-up and how these were handled in your analysis.
DR. HOOFTMAN: Please bear with us until we find
that information.
Can I defer this question to Dr. Sam Givens?
DR. WASSNER: We only lost something like 6
patients to follow-up in the long-term follow-up in the
no-treatment arm and 8 patients in the treatment arm over the
7 years of the trial.
DR. BLAYNEY: So, since those numbers are equal,
I'm understanding that there's probably a -- or roughly equal,
there's no bias, likely there would be no follow-up bias in
that.
DR. WASSNER: No. And less than 2 percent of the
patients have been lost to follow-up over this period.
DR. BLAYNEY: In your slide number 111, you have
a p value of .038. Now, maybe Dr. Simon's question got to
this issue, but is that p value adjusted for multiple analyses?
DR. WASSNER: Yes. This value has been adjusted
only for that, only for the multiple analysis, not for any
prognostic factors.
DR. BLAYNEY: Thirdly, why did you choose or why
was it chosen to give patients 3 million units and not adjust
based on body surface area or some other measure of size?
DR. HOOFTMAN: The decision by the clinicians
separately for the French study, as well as for the Austrian
-- they made that decision separately and not knowing from
each other what they exactly were going to do -- was based
on the fact that they were looking for the dose that could
be maintained for a long time and the lower dose that was
effective, which was 3 million units, as used in other
indications, for example, hairy cell leukemia, at the time.
DR. SCHILSKY: Let me just make a comment to the
committee. I'm bound and determined to keep us on schedule
this afternoon because I know that some committee members will
have to be leaving. So, we have about 3 minutes left for
questions. So, let me just ask you to just keep your questions
very focused.
Dr. Raghavan, do you have a question?
DR. RAGHAVAN: I just wanted clarification of one
quick thing. I think I understood somebody from the sponsor
to say the database is no longer available. What does that
mean and why?
DR. GIVENS: What that means is that they did not
save the database when they did the publication. They kept
adding to the database and making corrections. So, the
database as of today is the most up-to-date that we have, but
we don't have a copy of precisely what they used when they
did the sequential analysis, which is why we went back and
said, let's cut off all data that should have been collected
on visits up until the 1st of January and do the analysis again.
DR. SCHILSKY: Dr. Nerenstone?
DR. NERENSTONE: Very briefly, first of all, was
there central pathologic review?
DR. HOOFTMAN: No, there was not.
DR. NERENSTONE: We've heard about how many
patients were withdrawn because of adverse experiences.
However, you have no information about what actual dose was
given, what kind of delays there were in the patients who were
on treatment for specific toxicity or even for the asthenia,
depression, and flu-like symptoms. Do you have any other data
available about that?
DR. HOOFTMAN: Yes, we have. We have information
with regard to dose reductions. About 83 patients, 33 percent,
in the Roferon arm had their dose reduced temporarily.
DR. SCHILSKY: Any other questions from the
committee?
(No response.)
DR. SCHILSKY: If there are none, then let's break
for about 14 minutes and reconvene promptly at 3:15. Shorter
if we can.
(Recess.)
DR. SCHILSKY: We'd like to continue with the FDA
presentation.
DR. CARDINALI: Good afternoon. My name is
Massimo Cardinali. I will introduce the FDA perspective on
this application.
First, I would like to acknowledge the review team
that worked on this application. Dr. Neeman did the bulk of
the statistical review, and Dr. Tiwari also participated in
the review. Dr. Gupta in the last week or so did some
additional analysis.
This slide is to remind you of the approved indication
for this product. The indication for the hairy cell leukemia
has the closest dosage to the one that the company is seeking
for this application.
This is the indication that the company is seeking
for this product as presented in the submission.
I'll briefly go over the events that took place.
You see in white the company and in yellow the agency. The
supplemental application was submitted in 1997. The company
provided us with the translated protocol and statistical plan
and database for the Grob study, as well as the available
literature at the time on the subject and an unpublished report.
This was the study WHO 16, the Cascinelli study.
We finished our review in March of '98, and Dr.
Neeman asked the company for some additional information on
the Grob study and that was received in May of that year.
The monitoring of the French centers was completed
in May of '98.
We issued a complete review letter in August of
that year. The database and data that the company provided
was perceived to be not sufficient for approval by the agency,
and we requested a database for the other study with Roferon
that was available, as well as some additional clarification
on the Grob study. The information was provided in November
of that year, and the paper for the Pehamberger study was
submitted to the application in March of '99.
We received about a month ago the translated study
protocol for the Pehamberger study and early this month the
data set that Dr. Gupta analyzed.
I will go briefly to the structure of the two
studies. The Grob study was conducted between 1990 and 1994.
The inclusion criteria, essentially patients with
AJCC stage II and no previous therapy was in the provision
of the protocol. And the performance status was set as ECOG
less than or equal to 2.
The endpoint specified in the protocol,
disease-free interval, and as secondary endpoints, overall
survival and tolerability of the treatment.
The dose administered was 3 million units 3 times
per week subcutaneous for a total duration of 18 months.
The study conducted in Austria was started
approximately at the same time and with the same duration as the
French study. The inclusion criteria were almost identical
in terms of the staging of the disease. There was no systemic
therapy within 3 months of inclusion in the study and the
performance status was a little more stringent.
The material that we received did not specify the
endpoint, and there was no statistical plan in the protocol.
Again, the studies are very similar. The
difference that we can observe is the duration of the treatment.
The study had an induction phase of a 3-week duration and
then it was continued at 3 million units 3 times per week for
a year.
I'll leave the floor to Dr. Lachenbruch that will
summarize the results and the statistical analysis.
DR. LACHENBRUCH: Thank you. I'm almost an
imposter up here in that the primary analysis was done by Dr.
Neeman at the FDA and then later Dr. Tiwari did this work.
The study by Grob, M 23031, is the primary trial
that was submitted to the FDA. This trial was planned to have
sequential looks every 20 events. However, the timing was
not adhered to and three looks were done.
As you can see here in a triangular test, a score
Z is computed, and if the null hypothesis is true, that will
be around 0, and a variance V is also computed which is
proportional to the number of events at the time of analysis.
If the points exceed the upper boundary, the null hypothesis
is rejected, as you see. On January 1st, '94 when the analysis
was done, it did exceed the null hypothesis.
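[Illustrative note, not part of the proceedings.] For survival data, the score Z and variance V described above are commonly taken to be the log-rank statistic and its null variance. The sketch below computes them on a small hypothetical data set. The function names, the sign convention (positive Z favors the treated arm), and the patient records are all assumptions. The stopping boundaries of the triangular test are not reproduced here, because their parameters are not given in the discussion, and the p value shown is the ordinary fixed-sample one, not one adjusted for sequential looks.

```python
import math
from typing import List, Tuple


def logrank_score(times: List[float], events: List[int],
                  treated: List[int]) -> Tuple[float, float]:
    """Return (Z, V) for a two-arm log-rank comparison.

    Z is expected minus observed events in the treated arm, summed over
    distinct event times; V is the hypergeometric variance of that sum.
    """
    z = 0.0
    v = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        n = sum(1 for u in times if u >= t)                      # at risk, both arms
        n1 = sum(1 for u, g in zip(times, treated) if u >= t and g == 1)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d1 = sum(1 for u, e, g in zip(times, events, treated)
                 if u == t and e == 1 and g == 1)
        z += d * n1 / n - d1
        if n > 1:
            v += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return z, v


def two_sided_p(z: float, v: float) -> float:
    """Fixed-sample two-sided p value, treating Z / sqrt(V) as standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0 * v))


# Hypothetical data: the three untreated patients relapse first.
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
events = [1, 1, 1, 1, 1, 1]
treated = [0, 0, 0, 1, 1, 1]

z, v = logrank_score(times, events, treated)
print(f"Z = {z:.3f}, V = {v:.4f}, p = {two_sided_p(z, v):.3f}")
```

With equal allocation and few ties, V is roughly one quarter of the number of events, which matches the remark that V is proportional to the number of events at the time of analysis.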
During the FDA review, we requested that the
sponsor submit more mature data from the additional follow-up
that they have, and our analyses are all based on an
intent-to-treat at this time of final analysis.
This is a graph you've seen before. The medians
are indicated. Because of the number of relapses at and before
this time of the medians, the estimate of the medians may be
somewhat variable. This again is based on the ITT population
and not the per-protocol population. This results in an
additional 9 patients being added to the overall population,
and the significance level that we see here is .095 as opposed
to the .038 from the sponsor's analysis. This is no doubt
due to both the additional data, more mature data, and the
additional patients.
The overall survival is shown here, again with
the ITT population. We came up with a .09 p value.
We also decided to examine some additional
analyses which are exploratory, and these are, indeed, post
hoc but I think they are of some importance. This slide shows
the effect on relapse-free survival of the covariate alone,
and that's important to realize. Thus, the Breslow thickness
has a p value of less than .001. That is for the effect of
Breslow thickness on survival. It is not a p value for Roferon
given Breslow thickness.
Among these data, the p value for Roferon is
larger, i.e., less significant, than for any of the others.
Also, I should point out that Dr. Neeman used the Breslow
thickness as a continuous rather than as a categorical
variable.
We also attempted to find a best model for using
the covariates, and in this case we found that Breslow
thickness, age, and sex gave the best model. Adding Roferon
treatment to those three led to a p value for Roferon of .25.
The sponsor, Roche, did do a similar analysis. They
dichotomized age as greater than 50 or less than 50. The
differences may be due to more mature data, the use of age,
or the additional patients.
The results are of marginal significance. The p
value at the time of the termination of the study is .038,
but after the data had matured, it was .095.
We received the Pehamberger data last week, and
we have been unable to do a detailed and rigorous analysis
of the results. We received a translation of the protocol
about a week earlier.
We attempted to reproduce the analyses that
appeared in the article and will present some comments. The
inclusion criteria, of course, are essentially the same as
for the Grob study. The analytic plan was not presented in
the protocol and endpoints were not specified. We used
relapse-free survival and overall survival, and we've also
done some adjustments for Breslow depth and did a corresponding
analysis including age and gender as we did with the Grob study.
Here we see the relapse-free survival, and we found
a p value of .04 and median for controls is 4. The Roferon
group did not reach a median.
In doing the same proportional hazards model, we
find quite similar results. Breslow thickness is highly
significant; age, significant; sex, somewhat less; and Roferon
is, of course, .04.
At the same time we did the adjustment for Breslow
alone, which is what was reported in the Pehamberger article,
and found a p value of .1, and if we adjust for Breslow
thickness, age, and sex, we had a p value of .22, quite similar
and comparable to the p of .25 that was seen in the Grob study.
Again, our conclusions seem to show that there
was a moderate effect of Roferon by itself, which is the primary
analyses that are presented by the company. However,
adjusting for Breslow thickness and other variables does seem
to reduce the effect.
Based on this, we felt that it was appropriate
to begin planning an overview of the published literature.
So, we are doing this to combine the evidence. What we want
to do is substantiate the evidence of efficacy from known
studies of adjuvant interferon in melanoma, and for this
purpose, we will use studies of both Roferon and Intron. These
are exploratory and we want to emphasize that the data support
from Roche will be the only material that is used in any
decisions regarding this product. We will be using
relapse-free survival and overall survival, as they are the
generally accepted outcomes. And we are in the process of
obtaining data from investigators.
We will be looking at Roferon and Intron trials.
We want them to be randomized, concurrent controlled trials,
and so far all have an observational control and are for
adjuvant therapy.
We have searched a number of databases seen here.
The trials that we have identified and the studies come from
North America, Europe, Australia, and New Zealand. We will
be looking to get estimates of the odds ratio by means of ratio
of medians, and that's very nice if you happen to have
exponential survival. That's for the statisticians. And the
Peto method is basically a log rank type method.
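As a brief worked note on the two methods just mentioned, offered as a sketch under stated assumptions rather than as the FDA's actual computation: if survival in each arm is assumed to be exponential, the ratio of medians determines the ratio of hazards, and the Peto one-step method builds its estimate from the observed-minus-expected events of the log rank test. The symbols below (lambda, m, O, E, V) are introduced here for illustration and do not appear in the presentation.

```latex
% Assumption: exponential survival in each arm, S(t) = e^{-\lambda t}.
% The median m solves S(m) = 1/2:
\[
  m = \frac{\ln 2}{\lambda}
  \quad\Longrightarrow\quad
  \frac{\lambda_{\text{treated}}}{\lambda_{\text{control}}}
  = \frac{m_{\text{control}}}{m_{\text{treated}}} .
\]
% Peto one-step estimate for study i, where O_i is the observed number of
% events on treatment, E_i the number expected under the null hypothesis,
% and V_i the (hypergeometric) log rank variance:
\[
  \ln \widehat{\mathrm{OR}}_i \approx \frac{O_i - E_i}{V_i},
  \qquad
  \ln \widehat{\mathrm{OR}}_{\text{pooled}}
  \approx \frac{\sum_i (O_i - E_i)}{\sum_i V_i},
  \qquad
  \operatorname{Var}\!\left(\ln \widehat{\mathrm{OR}}_{\text{pooled}}\right)
  \approx \frac{1}{\sum_i V_i} .
\]
```

The first relation holds only under the exponential assumption, which is why the ratio of medians is described as convenient "if you happen to have exponential survival."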
We will also be looking for estimates of survival,
either relapse-free or total survival at 3 years. We'll be
looking at Kaplan-Meier estimates, 95 percent confidence
intervals, and so forth.
So far the studies that we have found are those
from Dr. Creagan, Dr. Cascinelli, Dr. Grob, Dr. Pehamberger,
which all were using Roferon. We've seen five studies from
Kokoschka, Kirkwood, Cornbleet, Rusciani, and the Kirkwood
ECOG 1690.
This slide provides estimates of the percent
improvement and confidence intervals for relapse-free survival
that we have seen thus far. A square is placed at the estimate
for the difference in proportions. The whiskers are the 95
percent confidence intervals. A positive value is favorable
for interferon. So, if the whiskers cross the line, it is
not possible to rule out a difference of 0 between observation
and interferon.
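The reading of the plot described above can be sketched in a few lines of Python. This is a minimal illustration, not the FDA's method: it assumes simple binomial proportions with a normal-approximation (Wald) interval and ignores censoring, whereas the presentation refers to Kaplan-Meier estimates. The function names and the patient counts are hypothetical and chosen only to give a difference of about 8 percent.

```python
import math


def diff_in_proportions_ci(free_trt, n_trt, free_ctl, n_ctl, z=1.96):
    """Difference in proportions (treated minus control) with a Wald CI.

    free_trt / free_ctl: hypothetical counts of patients relapse-free at
    the chosen time point; n_trt / n_ctl: patients per arm.
    A positive difference is favorable for the treated arm.
    """
    p_trt = free_trt / n_trt
    p_ctl = free_ctl / n_ctl
    diff = p_trt - p_ctl
    # Standard error of a difference of two independent proportions.
    se = math.sqrt(p_trt * (1 - p_trt) / n_trt + p_ctl * (1 - p_ctl) / n_ctl)
    return diff, diff - z * se, diff + z * se


def crosses_zero(lower, upper):
    """True when the whiskers include 0, so no difference cannot be ruled out."""
    return lower <= 0.0 <= upper


# Hypothetical counts, not data from any of the trials discussed.
diff, lower, upper = diff_in_proportions_ci(150, 250, 130, 250)
print(f"difference = {diff:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
print("whiskers cross 0:", crosses_zero(lower, upper))
```

With these made-up counts the estimated improvement is 8 percentage points, yet the interval still includes 0, which is the situation described for whiskers that cross the line.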
The size of the box, that is the area, is
proportional to the sample size. These generally indicate
a consistent improvement of about 8 to 9 percent over
observation. We don't have reliable 5-year data at the present
time to conduct a similar display.
In overall survival, we see the same picture.
As you can see, there's a bit less of an impressive difference
in these. We did not have the data from Dr. Pehamberger for
survival. The difference is around overall about 4 to 5
percent.
Our next steps will be to get individual data from
studies and perform the analyses that we have indicated above.
The information contained in the literature does not permit
sufficiently detailed analyses.
To summarize, for relapse-free survival, all
studies do point in the same direction. These are marginally
significant or barely not significant, and there's a moderate
early effect. But we don't have a lot of data for longer term
effects.
For overall survival, there is a consistent trend
toward improvement but evidence is not that strong, and I have
in my notes, parentheses, "yet" with a question mark. We did
not show it, but there do seem to be fairly similar results
with high and low dose and with node-positive and node-negative
disease from the material that we've seen.
Thank you.
DR. SCHILSKY: Thank you very much.
Questions for the FDA? Dr. Raghavan?
DR. RAGHAVAN: I'm totally mystified as to why
you went through that statistical exercise because the best
data points come from a product that isn't even up for
submission. So, I just wondered why you spent all your time
doing this and what the point was.
DR. LACHENBRUCH: The purpose here was to really
look for evidence combining all of the Roferon data. Over
here, we see that there are four studies, and so what we would
like to do is be able to draw information from all of these.
So, what we see is overall there does seem to be a significant
improvement in 3-year survival.
DR. SCHILSKY: Other questions? Dr. Simon?
DR. SIMON: I guess I wouldn't put much credence
in a meta-analysis based on literature data. There may be
exclusions. There are all kinds of biases in published
reports. The fact that they're published may be publication
bias. If you're planning on doing an individual case
meta-analysis, I would say go ahead and do it, but I don't
find it useful to present a meta-analysis based on
publications.
DR. LACHENBRUCH: These are very preliminary
results, and we are trying to get the data at the present time.
So, I would agree with you.
DR. KEEGAN: I think to some extent the reason
why these data were presented was that up until very recently,
the only information we had was from a single study. So, this
was our attempt to see what other information was available
in support of this application. We're not saying it's optimal
information, but it was all that we had available.
DR. CARDINALI: As a note, the Pehamberger and
Grob study data is from the publication, not from the data set
we have analyzed.
DR. SCHILSKY: Dr. Simon.
DR. SIMON: Do you have any insight for the French
study as to why the significance level, say, for relapse-free
survival, after adjustment for thickness, age, and sex, changed
so much? Were there any imbalances?
DR. LACHENBRUCH: No. For a covariate analysis,
as you know, the purpose is not necessarily to adjust for
imbalance, although that can be one use of it, but these happen
to be important prognostic factors for survival. So, what
we're saying is we'd like to look at these after we have adjusted
for these.
DR. SCHILSKY: Dr. Lippman.
DR. LIPPMAN: Just a quick clarification. In
your last conclusion slide, you said that there were similar
results with high and low dose. Is that what we just saw from
Dr. Kirkwood with Intron or is that with Roferon?
DR. LACHENBRUCH: I believe that was for the
Roferon, the study of Dr. Creagan and the Grob and --
DR. SCHILSKY: Other questions from the committee
members?
(No response.)
DR. SCHILSKY: Okay, thank you.
Let me point out to the committee members that
there's a slightly different set of questions than the ones
that were in the blue folder, and those should have been put
at your place right after lunch. It looks like this. It's
a two-page thing. It has only one of these meta-analysis
charts. I think the content of the questions is largely the
same, but these are the questions that we should be focusing
on at this point.
Before we get into the questions, actually I'd
like clarification of one point from the FDA because most of
these questions are posed in such a way that they ask us to
consider the results of the sponsor's data in conjunction with
the overview analysis that was just presented. Now, I was
quite sure I heard the FDA presenter say that the overview
analysis would not be taken into consideration by FDA in
assessment of the sponsor's application. So, could we get
some clarification on that?
DR. LACHENBRUCH: Yes. What I said was no Intron
data would be taken into account.
DR. SCHILSKY: I see. It's a little bit difficult
for us to sort out from those meta-analyses which ones had
Intron data and which ones had Roferon data.
DR. SIEGEL: Let me clarify something. First of
all, the Roferon data were the top part of all those slides
and are on the second page of the questions.
The FDA has a policy regarding use of literature
in support of applications for new indications for already
approved drugs. The gist of the policy says that literature
data, especially if consistent and compelling from multiple
sites, can be important, but the value of the data is largely
dependent on the ability to substantiate it through finding
protocols, data sets, ensuring that there were intent-to-treat
analyses, and the normal things. So, these are things I think
that, as a matter of policy and procedure, should not be
ignored, but I think that the weaknesses or concerns that have
been highlighted are important ones to take into account.
DR. SCHILSKY: Okay, thank you.
Maybe we'll just get on with the questions then.
Yes, Scott.
DR. LIPPMAN: I know that we're not considering
Intron here, but I think the data are relevant in the sense
that -- two issues. One is the biological plausibility
mechanism and the other is consistency within the committee
in terms of approval.
Again, we talk about the fact that there's very
little data. So, we have one study of 500 patients which,
at least in the FDA presentation, we've talked about those
mysterious 9 cases and how that would affect. But at least
in the FDA presentation, it was significant. Every one of
the boxes is -- it's modest, but it's positive both in terms
of disease-free and overall survival, and the whiskers come
very close, just past the survival curve of 0, as opposed to
another situation where we're using interferon where it's
approved and where you don't see that pattern even with a very
high dose in terms of survival. And we've heard some
explanations of that. It's really a question of whether we
should take that issue, the consistency, the biology, the
mechanism, into account in some of these discussions.
DR. SCHILSKY: I don't think we should ignore the
universe of information that we're aware of and we have
available to us.
I just want to get clarification on this again.
First of all, the meta-analyses with respect to the Roferon
data, which is what's on our question sheet -- so, there are
four studies listed for disease-free survival and three listed
for overall survival. Of those, only the Grob study would
appear to show a significant benefit with respect to
disease-free survival as it's listed here. However, as the
more detailed analysis of the study was presented to us, there
are questions as to, in fact, whether even that study shows
a significant difference in disease-free interval. So,
although the trend appears to be in favor of interferon in
each of these examples, there's very little in the way of a
statistically significant benefit for interferon.
Further, it's fair to say that, I guess, in a sense
these are at best incomplete meta-analyses for the reasons
Dr. Simon mentioned, that this information is just based upon
data you could glean from published reports in the literature,
not from the actual patient data that's contained within those
reports. Correct? Okay.
Scott?
DR. LIPPMAN: Just to clarify, because with all
the discussion, I guess I was sort of surprised when I look
at this. I'm not talking about the meta-analysis, just the
big box of 500 patients under Grob. It is significant, doesn't
cross the line. I haven't read the recent set of questions,
but one of them was should we recommend approval based on one
large randomized trial. So, I'd like to clarify maybe from
the FDA if they're going to stick with this box. In that case,
that is statistically significant and survival is close and
the other studies corroborate that. So, I'd just like to
clarify.
DR. SIEGEL: Well, I guess a lot of people have
addressed different parts of this question. I'll take my turn.
That box was an endpoint that was chosen in part
because it was, I think, the easiest endpoint to get on all
of the trials, and it's endpoint data truncated at 3 years.
That's the endpoint that the Grob data looked the best at
because, in fact, the curves have maximal separation at about
3 years and start coming together after 3 years. As noted,
that study had 3 years of planned and prescheduled follow-up,
so it's not an irrelevant time period for that study. But
at best, let's say that the primary time for follow-up is
ambiguous in the protocol and difficult to determine. As we
determine it, the intent-to-treat analysis of the most complete
available data set was at the .095 level and with covariate
correction at the .25 level.
We'll stand behind that analysis. It's one of
several analyses. We won't stand behind it as like the one
that tells the story. I don't think, given the ambiguities
of the protocol and the flaws and strengths of different
analyses, that there's probably not one p value that you can
hang your hat on and say this tells you the statistical
significance of the trial.
DR. SCHILSKY: Are we ready to go to the questions?
Let me just read the first question. There's a two-paragraph
summary. Then the question is, does the committee find that
the results of a single multi-center, randomized, controlled
trial, in conjunction with the overview analysis of the three
randomized, controlled trials of Roferon-A, provide
substantial evidence that Roferon-A prolongs the disease-free
interval in patients with surgically resected melanoma?
Is there discussion on that before we vote? Dr.
Lippman.
DR. LIPPMAN: I will just say that the real
fundamental issue that I'm having a problem with is the floating
p values. Given that we've heard a lot of discussion on this
and still no real consensus, I don't think, in terms of what
is either reasonable or meant or intended, that's going to
fundamentally affect how I vote anyway on this.
DR. SCHILSKY: Well, I think we've seen the data
as presented by the sponsor. We've seen the data as presented
by the FDA with the adjustments to the p value, if you will,
based upon the other covariate prognostic factors. We've
seen, for what it's worth, the preliminary meta-analysis.
So, is there anything else you would like to know before you
vote on this?
DR. LIPPMAN: I think fundamentally if we knew
exactly in the design what the primary endpoint was -- was
it a 3-year? I think that's where the debate is.
DR. SCHILSKY: It appears that we don't know that
because it wasn't well specified.
DR. KEEGAN: That's correct. The protocol really
is open to quite a bit of interpretation as to when that final
analysis was to have occurred and exactly what it was to consist
of.
DR. SIMON: I will say, however, that my
experience is if you have an endpoint, that your most accurate
analysis is the one based on the longest follow-up and that's
what you should hang your hat on and not one that was simulated
based on what might have happened some years ago. So, anyway,
I guess that's one issue.
The other issue is for myself I guess I just have
some basic uncertainty about the quality of the data from that
trial, the potential biases in follow-up. It looked like there
was too much of an emphasis that the main analysis would have
been the one that was essentially an interim analysis that
stopped the recruitment. Then there were sort of ad hoc
attempts to increase follow-up. I just am left with some
uncertainty as to how accurate that additional follow-up was.
So, I myself, in addition to the variable p values, just have
some uncertainty in the credibility of that data.
DR. SCHILSKY: Dr. Keegan.
DR. KEEGAN: I would say that the protocol did
not specify what the continued follow-up should be after 36
months, and when we requested the additional data, it was
necessary for the company to go back to the investigators,
who then reconsented patients to get the information. From
the monitoring inspections of some of the sites, it's clear
that there wasn't a rigidly adhered to schedule for follow-up.
We did also ask the company to analyze the data
to determine whether or not there was a systematic bias in
terms of the follow-up, and it didn't appear that the follow-up
was systematically biased towards one or the other arm. It
was equally -- I won't characterize it as haphazard, but
definitely not done according to a rigid schedule. But that
seemed to be present in both arms.
One other point I'd like to make in terms of the
policy is that for a single study in support of effectiveness,
one of the criteria that FDA uses is that the trial have a
statistically significant result that's fairly robust such
that we would have confidence that the result would be
reproducible. At best, the p value here is .04, and our concern
at the time of even the review of the data with the most
up-to-date follow-up that we could get through 1997 suggested
to us that that result, although statistically significant,
would not meet that condition of being so robust that we were
convinced that it was a reproducible result, which is why we
encouraged the company to go back and obtain additional study
data.
DR. SCHILSKY: Dr. Johnson?
DR. JOHNSON: Yes. I didn't realize this was
going to take a lot of discussion, but since Scott seems
conflicted, let me go through a number of reasons why I think
this is a poor study.
First of all, I'm not sure I accept the endpoint
as one that's therapeutically efficacious. DFI, in the
absence of a survival benefit, is of uncertain benefit in my
view. We can debate that but there are plenty of diseases
where DFI can be prolonged and survival is not. And we don't
do the therapy that prolongs the DFI. Small cell lung cancer
immediately comes to mind. There are 10 randomized trials
out there showing DFI is prolonged, survival is not. No one
uses maintenance chemotherapy in that disease.
If they had shown me some quality of life benefit
to that DFI, that symptoms had improved or some other meaningful
patient benefit, then perhaps I could have accepted that as
an endpoint of value, but I don't. And I didn't see that data.
Thirdly, again, I find it shocking -- and I think
that's the word -- that a study of this size would be undertaken
without appropriate stratification for known prognostic
endpoints. That being said, even more importantly, there was
no quality control of pathology. We have no idea whether these
patients were equally balanced other than what they tell us.
There was no central review of the patient pathology. They
could have all been one stage in the Roferon arm and quite
another in the other, just on the basis of that inequity.
All we have is a report. They've told us there was no central
pathology review.
Candidly, I just think that the overall data are
highly questionable. I agree with Richard. I think these
are not the quality of data that we see come to this agency
that generates approval by this body. That's my perspective
on this, and personally I don't see how we can vote anything
other than no on this question.
DR. SCHILSKY: Dr. Raghavan?
DR. RAGHAVAN: Yes. I think I always feel sorry
for the FDA because they're victims and they get beaten up
by everyone, but as a taxpayer I really have to say that I
don't think you've done as well as you usually do this time.
You've left it to the committee to identify a whole series
of very bad statistical concepts and poor quality data. I
shouldn't have to remind you: garbage in, garbage out no
matter what the p value. I just feel very disappointed that
we've had to go through this exercise.
Dr. Lippman has tried very hard to be fair, and
I recognize and respect that. For those of us who are crusty
veterans who have seen outstandingly good data over the years,
this is not an example of that. And bending over backwards
to bring in Intron data that were approved based on good quality
data and then tainting that information based on very poor
quality information with bad follow-up sets up a precedent
that I think is kind of disappointing. And I would hate
people to leave here starting to question decisions made in
the past based on good data when we've now added a bunch of
information that's out-of-date, hard to quantify,
irreproducible, et cetera.
And I just felt I wanted to make that comment.
I apologize for beating you up, but you deserve it.
(Laughter.)
DR. SIEGEL: Allow me to respond in part, although
I don't want to take up too much time with this.
First of all, I think it's a mischaracterization
to suggest that it took the committee to identify the flaws
in this data. I don't think there was a flaw discussed here
that was not identified by the FDA. The FDA did an
intent-to-treat analysis from the beginning. We carefully
inquired and investigated about the relevance of the follow-up
data, the quality of the follow-up data, and the choice of
the endpoints, and made a presentation of the data, I think,
that accurately reflects our perception.
As to the question of why these data were brought
before the committee, perhaps this requires a bit of
understanding of time lines. At the time we need to make a
decision about scheduling a committee, it's usually a couple
months before the committee. As we have made clear in the
presentation, we had felt that based on the Grob study alone,
there was no reason to discuss or consider approval of this
application.
What we had available to us at the period two months
before this committee was a published report from the
Pehamberger study that showed a p value of .02 and new
information from the company that they were, in fact, going
to be able to get the data set and the protocol. Those, as
you've heard, I'm sure for a good reason, took longer than
anticipated to get. So, they arrived within the last week
or two. You've seen the preliminary analyses of those. The
study did not look like what we expected it to look like, but
I think with that perspective, perhaps you can better
appreciate where we've come from.
DR. SCHILSKY: All right. Thank you.
In the interest of time, I'm going to call for
the vote. I think we're probably ready. Let me just restate
briefly the question. Does the committee find that the results
of a single multi-center, randomized, controlled trial
provide substantial evidence that Roferon-A prolongs the
disease-free interval in patients with surgically resected
melanoma?
All those who would vote yes, please raise your
hand.
(No response.)
DR. SCHILSKY: That's 0 yes.
All those who would vote no?
(A show of hands.)
DR. SCHILSKY: 7 no.
Abstentions?
(A show of hands.)
DR. SCHILSKY: 1 abstention. Sorry. 2
abstentions.
DR. SIEGEL: I think we're done.
DR. SCHILSKY: That's what I was about to ask
because the second question says, assuming that the answer
to question 1 is yes, well, we know now what the answer to
question 1 is. So, I think that completes the committee's
deliberations. Thank you all very much.
(Whereupon, at 4:02 p.m., the committee was
adjourned.)