Clinical Research Database (CRDB) Overview
Dawn Caron-Fabio
Manager, Data Management Resource
Office of Clinical Research
CRDB Milestones
Pilot Version: November 1991
Release 1.0: July 1992
Release 2.0: April 1994
Release 3.0: November 1996
Release 3.3: October 1999
Conversion to Web: Summer 2004
CRDB Objectives
Institutional accountability for research data
Flexibility to accommodate a wide variety of studies, goals, and data types
User friendliness
Prospective registration and randomization
Easy access for global reporting of institutional clinical research
Minimal disruption from frequent staffing changes
CRDB Goals and Challenges
Data coding
Standardized data codes
Security
Ensuring protocol-specific access
Controlling read/write access
Flexibility and Ease of use
Accommodate a large number of protocols
Easy to enter and retrieve data
CRDB Current Status
366,776 distinct patients in CRDB as of 3/2/2007
MSKCC Patients 329,068 (89.7%)
Non-MSKCC Patients 37,708 (10.3%)
>148,000* patient records in >3,700 IRB protocols
>483,000* patient records in >290 prospective databases
* Patients may be registered to multiple protocols and/or databases
CRDB Design and Structure
Hardware
Database Server Cluster
2 Compaq AlphaServer GS80 machines, each with 4 EV68/1224 CPUs and 6 GB of memory
Cluster is connected to a shared EVA storage array containing 2.5 TB of raw disk space in either mirrored or RAID5 configuration
26-slot tape changer with 2 DLT4 tape drives
CRDB Design and Structure
Software
HP Alpha Tru64 Operating System, v5.1A
HP Alpha TruCluster 1.5
Oracle 9i RAC database software
CRDB Network Diagram
[Figure: network diagram. Components labeled in the diagram:]
Database servers crdbds1 and crdbds2 (HP AlphaServer GS80), joined by a high-speed cluster interconnect (memory channel)
Application servers crdbas1 and crdbas2 (HP RP5470)
Application server crdbas3 (HP ProLiant DL380)
Two StorageWorks SANSwitch 2/8-EL SAN switches
EVA disk array
MSL 6030 DLT backup device
Gigabit network switch (HP ProCurve 5304XL)
MSKCC 1000BaseT LAN, WAN, and Internet
Standard and non-standard workstations, remote users, and printers
Connection types shown: Storage Area Network (SAN) connections and Local Area Network (LAN) connections
Protocol Set-up
PI specifies the data requirements of the protocol
DMR staff define protocols using an on-screen form:
– Eligibility criteria
– Randomization criteria (if the protocol is a randomized trial)
– Applications (protocol-specific subset of data entry forms, reports, etc.)
– Users and their privileges
– Protocol-specific subsets of data codes
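The items on the set-up form can be pictured as one record per protocol. The following is a minimal sketch, assuming Python dataclasses. The class and field names (`ProtocolDefinition`, `user_privileges`, and so on), the application names, and the user names are hypothetical and do not reflect the actual CRDB schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Privilege(Enum):
    READ = "read"
    WRITE = "write"


@dataclass
class ProtocolDefinition:
    """Hypothetical container for the items DMR staff define per protocol."""
    protocol_id: str
    eligibility_criteria: list[str] = field(default_factory=list)
    # Left empty when the protocol is not a randomized trial.
    randomization_criteria: list[str] = field(default_factory=list)
    # Protocol-specific subset of data entry forms, reports, etc.
    applications: set[str] = field(default_factory=set)
    user_privileges: dict[str, set[Privilege]] = field(default_factory=dict)
    data_codes: set[str] = field(default_factory=set)

    def can(self, user: str, privilege: Privilege) -> bool:
        """True only if the user was granted this privilege on this protocol."""
        return privilege in self.user_privileges.get(user, set())


# Illustrative values only.
protocol = ProtocolDefinition(
    protocol_id="00-093",
    eligibility_criteria=["Is the patient's WBC above N?"],
    applications={"registration_form", "toxicity_report"},
    user_privileges={
        "data_manager": {Privilege.READ, Privilege.WRITE},
        "statistician": {Privilege.READ},
    },
)

print(protocol.can("statistician", Privilege.WRITE))  # False
```

Keeping privileges inside the protocol record is one way to express protocol-specific access: a user with no entry for a protocol has no access to it by default.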
Protocol Structure
Users can work on only one protocol at a time
– In data entry, a user can enter data on only one patient on one protocol at a given time
Hierarchical
– You can establish a “parent” or “umbrella” protocol covering multiple IRB protocols
– This enables you to analyze multiple protocols across the same disease, service, department, etc.
Hierarchical Protocol Structure
Example – Department of Medicine
MP-DOM
MP-BRST MP-GI MP-HEM MP-LEUK
00-093 01-011 01-129
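A small sketch of how an umbrella hierarchy like this one might support analysis across protocols, assuming a simple child-to-parent mapping. The extracted diagram does not show which umbrella each IRB protocol belongs to, so the placements below are assumptions, and the registration counts are invented for illustration.

```python
# Child -> parent links. The placement of the three IRB protocols is assumed.
PARENT = {
    "MP-BRST": "MP-DOM",
    "MP-GI": "MP-DOM",
    "MP-HEM": "MP-DOM",
    "MP-LEUK": "MP-DOM",
    "00-093": "MP-BRST",  # assumed placement
    "01-011": "MP-GI",    # assumed placement
    "01-129": "MP-GI",    # assumed placement
}


def descendants(protocol: str) -> set[str]:
    """All protocols at any depth below `protocol`."""
    found = set()
    for child, parent in PARENT.items():
        if parent == protocol:
            found.add(child)
            found |= descendants(child)
    return found


# Hypothetical registration counts per IRB protocol.
REGISTRATIONS = {"00-093": 12, "01-011": 7, "01-129": 30}


def total_registrations(protocol: str) -> int:
    """Sum registrations over a protocol and everything beneath it."""
    members = descendants(protocol) | {protocol}
    return sum(REGISTRATIONS.get(p, 0) for p in members)


print(total_registrations("MP-GI"))   # 37
print(total_registrations("MP-DOM"))  # 49
```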
Patient Registration
When a patient is registered to a protocol:
– Demographic data are retrieved from SMS; the last 30 days of lab data are retrieved from the Laboratory Computer System
– User is prompted with protocol eligibility questions
– If an eligibility question can be answered from the retrieved lab data (e.g., “Is the patient’s WBC above N?”), the system answers it automatically
CRDB automatically determines eligibility
– The user can register the patient even if ineligible, but this is recorded
User is prompted with randomization strata questions
MSKCC pharmacy later consults CRDB
– To confirm registration and treatment assignment before dispensing
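The eligibility step described above could be sketched roughly as follows. This is an illustration only: the function names, the lab dictionary layout, and the WBC threshold of 3.0 (the slide says only "N") are assumptions, not the CRDB implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class EligibilityQuestion:
    text: str
    # Returns True/False when the labs can answer the question, else None.
    from_labs: Optional[Callable[[dict], Optional[bool]]] = None


def wbc_above(threshold: float) -> Callable[[dict], Optional[bool]]:
    """Build a lab-based check; yields None when no WBC value is on file."""
    def check(labs: dict) -> Optional[bool]:
        value = labs.get("WBC")
        return None if value is None else value > threshold
    return check


def register(questions, labs, ask_user):
    """Answer each question from labs when possible, otherwise ask the user."""
    answers = {}
    for q in questions:
        auto = q.from_labs(labs) if q.from_labs else None
        answers[q.text] = auto if auto is not None else ask_user(q.text)
    eligible = all(answers.values())
    # Registration proceeds either way; the eligibility outcome is recorded.
    return {"registered": True, "eligible": eligible, "answers": answers}


questions = [
    EligibilityQuestion("Is the patient's WBC above 3.0?", wbc_above(3.0)),
    EligibilityQuestion("Has the patient signed consent?"),
]
result = register(questions, labs={"WBC": 2.1}, ask_user=lambda text: True)
print(result["registered"], result["eligible"])  # True False
```

Here the WBC question is answered from the retrieved labs without prompting, the consent question falls back to the user, and the ineligible patient is still registered with the outcome recorded.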
Prospective Databases
POEs (points of entry):
– “Private” (usually service-specific) non-IRB prospective databases
– Collect data on, e.g., all patients entering a particular service or department, regardless of whether they will ever go on an IRB protocol
Set up on request from the investigator, who decides what data should be collected
– DMR staff set up the database in much the same way as an IRB protocol
– Data feeds from other hospital systems happen automatically (flow starts at patient “registration”)
CRDB Data Sharing
[Diagram: CRDB at the center, sharing data with the systems below]
In MSKCC: SMS, Pharmacy, Clinical Labs, OR System, Surgical Complications, PIMS, Disease Mgm, IDB (ARC), Pathlite
Outside MSKCC: Industrial Sponsors, CDUS (CTEP)
Additional CRDB Collaborations
Information Systems
System Tracking and Reporting (STAR) System
Pancreatic Cyst Tracking Application
Computational Biology
Study Tracker
Pathlite
NCI caBIG
Clinical Trials Management System Contract
Tissue Banks and Pathology Tools Contract
Design and implement a public interface using caTissue (along with the Inter-Prostate SPORE Biomarker Study)
CRDB Specialized Programs
AICT (T Cell) Laboratory Database
Cell Marker Laboratory Database
Mouse Tracking Database
Patient Survivorship Form
Family History Pedigrees
Health, Habits, and History Questionnaire
Scannable Forms (Teleforms)
Coding of Lab Values to CTC Toxicity Grades
Analysis Tables
CRDB Specialized Programs (continued)
PET Scan Forms
Biostatistics Collaboration Form
Departmental Conference Sheets (e.g., BMT and Sarcoma)
Comprehensive Phase Based Reports
Specimen Protocol Registration Form
Pharmacy Patient Query Form
Specimen Tracking System
CRDB/TPS Specimen Tracking
CRDB - Specimen Tracking Banks
Tissue Procurement Service
Sarcoma Bank
Myelodysplastic Syndrome Bank
Clinical Chemistry Lab
Specimen Tracking Bank Data 2006
Clinical Chemistry Lab Specimen Accession 2006
                               July  August  Sept  Oct   Nov   Dec
# Patients with New Specimens  4199  5328    6298  6410  6280  5964
# Specimens Stored             5532  7468    8792  8967  8809  8098
Specimen Tracking Bank Data
Tissue Procurement Specimen Accession and Distribution 2004-2006
                               2004   2005   2006
# Patients with New Specimens  3821   4895   5033
# Specimens Stored             14238  23391  24031
# Specimens Distributed        3042   3847   4091
– Single Patient Request       13     17     14
– Clinical Review              615    683    698
– Pilot Research Project       343    582    489
– Clinical Trial               2071   2565   2890
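The four request-type rows appear to break down the "# Specimens Distributed" row. A quick check, with the figures copied from the table and variable names chosen only for this sketch, shows that they sum to the distributed totals for each year.

```python
# Figures copied from the table above.
distributed = {2004: 3042, 2005: 3847, 2006: 4091}
by_request_type = {
    "Single Patient Request": {2004: 13, 2005: 17, 2006: 14},
    "Clinical Review": {2004: 615, 2005: 683, 2006: 698},
    "Pilot Research Project": {2004: 343, 2005: 582, 2006: 489},
    "Clinical Trial": {2004: 2071, 2005: 2565, 2006: 2890},
}

for year, total in distributed.items():
    parts = sum(row[year] for row in by_request_type.values())
    print(year, parts, parts == total)
# 2004 3042 True
# 2005 3847 True
# 2006 4091 True
```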
TPS Specimen Tracking Design Team:
Chair of the Dept. of Biostatistics
Director of the Office of Clinical Research
Clinical Research Database Administrator
Database Developers (2)
Data Management Resource Staff (2)
Tissue Procurement Service Staff (TPS Director and Coordinator)
[Figure: Specimen Flow from Surgery to Distribution from Tumor Bank. Labeled elements: PI, Patient, Patient Consent Required, Standard Clinical Tests, OR, Pathology, Tissue Procurement, Request, Specimen Banking Process, Tumor Bank; numbered steps 1 to 4.]
Goals for the System:
1. Relational Database Design
2. Coherent Management of Tissue Requests
3. Information Management of Resource Depletion
4. Linkage of Distributed Specimens with Clinical Outcome Data
5. Ability to Distribute Specimens and Related Data Anonymously
6. Improved Accession of Tissue
Database Relationships for Specimen Tracking and Research Results:
[Diagram. Labeled elements:]
Specimen Tracking System
CRDB (Clinical Outcome Data), linked by Patient MRN and Specimen ID#
Pathological Research Results, linked by Specimen ID# only
Decision Support Tables
CRDB = Clinical Research Database
Specimen Tracking Table Structure
Data Entry Forms/Tables:
Specimen Accession Form
– Specimen Accession
– Specimen Part
– Specimen Bank
– Specimen Bank Detail
Specimen Request Form
– Specimen Request
– Specimen Request Detail
Specimen Distribution Form
– Specimen Distribution
Specimen Accession Tables:
Specimen Accession
– Information about the accession of individual specimens and clinical details about the patient from whom the specimen was obtained
Specimen Part(s)
– Details about the individual part(s) that are created from a single surgical procedure
Specimen Bank
– Catalogues all specimens received by the bank
Specimen Bank Detail
– Details about all specimens and their distribution
Specimen Accession: Data Elements
Patient Identifiers & Diagnosis Information
Pathology Date & Accession #
Procurement Date
Service Code
Specimen Part #
Specimen Identifiers (Site and Cell Type)
Bank #
Specimen Type (e.g., Tissue, Blood)
Amount of Specimen Available
Distribution ID
Allocation Time
Distribution Time
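The accession tables and data elements listed above could be laid out relationally along the following lines. This is a sketch only, using Python's built-in `sqlite3` with an in-memory database rather than Oracle. The column names, key choices, the assignment of elements to tables, and the sample row are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE specimen_accession (
    accession_id     INTEGER PRIMARY KEY,
    patient_mrn      TEXT NOT NULL,
    pathology_date   TEXT,
    accession_number TEXT,
    procurement_date TEXT,
    service_code     TEXT
);
CREATE TABLE specimen_part (
    part_id      INTEGER PRIMARY KEY,
    accession_id INTEGER NOT NULL REFERENCES specimen_accession(accession_id),
    part_number  INTEGER NOT NULL,
    site         TEXT,
    cell_type    TEXT
);
CREATE TABLE specimen_bank (
    bank_number   INTEGER PRIMARY KEY,
    part_id       INTEGER NOT NULL REFERENCES specimen_part(part_id),
    specimen_type TEXT              -- e.g., Tissue, Blood
);
CREATE TABLE specimen_bank_detail (
    detail_id         INTEGER PRIMARY KEY,
    bank_number       INTEGER NOT NULL REFERENCES specimen_bank(bank_number),
    amount_available  REAL,
    distribution_id   INTEGER,      -- NULL until the specimen is distributed
    allocation_time   TEXT,
    distribution_time TEXT
);
""")

# Invented sample data.
conn.execute("INSERT INTO specimen_accession VALUES "
             "(1, 'MRN-0001', '2006-07-01', 'S06-1', '2006-07-01', 'GI')")
conn.execute("INSERT INTO specimen_part VALUES (1, 1, 1, 'Colon', 'Adenocarcinoma')")
conn.execute("INSERT INTO specimen_bank VALUES (100, 1, 'Tissue')")
conn.execute("INSERT INTO specimen_bank_detail VALUES (1, 100, 0.5, NULL, NULL, NULL)")

# Specimens still in the bank, described without any patient identifier.
available = conn.execute("""
    SELECT b.bank_number, p.site, p.cell_type, d.amount_available
    FROM specimen_bank b
    JOIN specimen_part p ON p.part_id = b.part_id
    JOIN specimen_bank_detail d ON d.bank_number = b.bank_number
    WHERE d.distribution_id IS NULL
""").fetchall()
print(available)  # [(100, 'Colon', 'Adenocarcinoma', 0.5)]
```

In this layout the patient identifier lives only in the accession table, so a query over the bank tables can describe a specimen by bank number, site, and cell type alone.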
Specimen Accession Form (Page 1)
Specimen Accession Form (Page 2)
Specimen Request Tables:
Specimen Request
– Contains information about the individual request for specimens
Specimen Request Detail
– Information about the types of specimen that are requested (including the specific site(s) and cell type(s) needed to fill the request)
Specimen Request: Data Elements
Request ID
Requester Information
Request Date
Distribution Type (e.g., Anonymous)
Type of Request
– IRB Protocol
– Pilot Study
– Clinical Review
– Single Patient Request
Specimen Information (Site and Cell Type)
Billing Criteria
Specimen Request Form (Page 1)
Specimen Request Form (Page 2)
Specimen Distribution Table:
Specimen Distribution
Information about specimens distributed from the bank. This process includes:
- Selection of Current Requests
- Ability to Query Available Specimens in Bank
- Distribution of Specimens Based on Patient’s Consent Status
- Anonymization of Information
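The query-and-match part of this process might look something like the sketch below. The consent values (`"full"`, `"banking_only"`) and the rule that identified distribution requires full consent are assumptions introduced for illustration; the slides do not spell out the actual consent categories or matching logic.

```python
from dataclasses import dataclass


@dataclass
class BankedSpecimen:
    bank_number: int
    site: str
    cell_type: str
    patient_consent: str      # hypothetical values: "full", "banking_only"
    distributed: bool = False


@dataclass
class Request:
    request_id: int
    site: str
    cell_type: str
    # "non_anonymized", "investigator_anonymized", or "anonymized"
    distribution_type: str


# Assumed rule for illustration: identified distribution needs full consent.
ALLOWED = {
    "full": {"non_anonymized", "investigator_anonymized", "anonymized"},
    "banking_only": {"investigator_anonymized", "anonymized"},
}


def candidates(request: Request, bank: list) -> list:
    """Undistributed specimens matching the request and permitted by consent."""
    return [
        s for s in bank
        if not s.distributed
        and s.site == request.site
        and s.cell_type == request.cell_type
        and request.distribution_type in ALLOWED.get(s.patient_consent, set())
    ]


bank = [
    BankedSpecimen(100, "Colon", "Adenocarcinoma", "full"),
    BankedSpecimen(101, "Colon", "Adenocarcinoma", "banking_only"),
    BankedSpecimen(102, "Colon", "Adenocarcinoma", "full", distributed=True),
    BankedSpecimen(103, "Breast", "Ductal carcinoma", "full"),
]

identified = Request(1, "Colon", "Adenocarcinoma", "non_anonymized")
coded = Request(2, "Colon", "Adenocarcinoma", "anonymized")
print([s.bank_number for s in candidates(identified, bank)])  # [100]
print([s.bank_number for s in candidates(coded, bank)])       # [100, 101]
```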
Specimen Distribution: Data Elements
Distribution ID
Request ID
Additional Data From Previous Tables by Way of Hidden Patient Identifiers
Specimen Distribution Form (Search)
Specimen Distribution Form (Allocation)
Specimen Distribution Form (Validation)
Specimen Distribution Form (Summary)
Non-Anonymized (patient identity known):
Patients give full consent for the specific molecular or genetic tests that will be conducted
No restrictions are placed on the coding variables (each specimen may be identified by patient identifiers, facilitating a direct link with clinical outcome data)
Investigator Anonymized (samples coded but linked to patient identity):
Investigators will be unable to identify the patient
Specimens will be distributed with a coded identification number that will identify the specimen
Clinical information may also be supplied via the Specimen Bank #
Anonymized (unlinked to patient identity):
Once a specimen is distributed it can never be re-linked to the patient
All clinical data must be available at the time of distribution
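The three distribution types differ in what accompanies the specimen and in whether a link back to the bank is retained. The sketch below illustrates that difference under stated assumptions: the record fields, the `release_record` function, and the use of a random hex code as the coded identification number are all hypothetical.

```python
import secrets

# Hypothetical clinical fields that travel with a specimen.
CLINICAL_FIELDS = ("site", "cell_type", "diagnosis")


def release_record(specimen: dict, distribution_type: str, link_table: dict) -> dict:
    """Build the record that accompanies a distributed specimen."""
    clinical = {k: specimen[k] for k in CLINICAL_FIELDS}
    if distribution_type == "non_anonymized":
        # Patient identity known: identifiers go out with the specimen.
        return {"patient_mrn": specimen["patient_mrn"],
                "bank_number": specimen["bank_number"], **clinical}
    if distribution_type == "investigator_anonymized":
        # Coded: the bank keeps the code-to-bank-number link, the investigator does not.
        code = secrets.token_hex(4)
        link_table[code] = specimen["bank_number"]
        return {"specimen_code": code, **clinical}
    if distribution_type == "anonymized":
        # Unlinked: no identifier and no stored link, so clinical data
        # must be complete at the time of distribution.
        return clinical
    raise ValueError(f"unknown distribution type: {distribution_type}")


specimen = {
    "patient_mrn": "MRN-0001",   # invented sample values
    "bank_number": 100,
    "site": "Colon",
    "cell_type": "Adenocarcinoma",
    "diagnosis": "Colon cancer",
}
links = {}
for kind in ("non_anonymized", "investigator_anonymized", "anonymized"):
    print(kind, sorted(release_record(specimen, kind, links)))
```

Only the investigator-anonymized path writes to `links`, which is what allows the bank, but not the investigator, to supply further clinical information later.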
Additional Specimen Forms
Liquid Accession Form
Clinical Chemistry Lab:
The Next Steps…
CRDB
Availability of CRDB on the Web
Initial Projects
- Specimen Protocol Registration (06-107)
- Minimal Data Set Data Collection
Specimen Tracking
Incorporation of the new 06-107 Consent Process