Supporting Scientific Analysis and
Decision Support Workflows with
caBIG® Tools and Technology
The Cancer Genome Atlas
Radiology Project
Eliot Siegel, M.D.
Joseph Chen, M.D.
University of Maryland School of
Medicine Department of Diagnostic
Radiology
Introduction
• One of the major original goals of caBIG
was to determine how to create a
system that would enable extraction of
data for research or clinical decision
support that would:
• Allow access to a variety of types and sources of data
including genomic, proteomic, clinical, lab,
demographic, and diagnostic imaging
• Take advantage of analytic potential of grid computing
to combine and cross-reference these for analysis for
research and clinical care
• The caBIG Imaging workspace has
worked to build basic tools toward this
goal and the TCGA imaging workspace
project represents an example of the
potential for caBIG to have a major
impact on the way in which data are
shared, research is conducted, and patient
care is provided
• Our work has important touch points
with all of the other caBIG workspaces
and we are hoping by providing this
presentation that we can get creative
suggestions and ideas for research,
clinical use, and clinical trial purposes
• We believe that our project has major
implications for the way that imaging data
can be collected for clinical trials as an
integrated part of clinical trial
management software
• The project closely relies on caBIG
projects that involve data integrity and
security, vocabulary and common data
elements, grid performance, genomic
analysis, and integrative cancer research
Introduction to the caBIG in Vivo
Imaging Workspace
• caBIG in vivo Imaging workspace established
April 2005, a little more than a year after the
establishment of the other caBIG workspaces
• NCI funded effort by far the biggest and most
productive effort in imaging informatics today
• Subject matter experts from around the country
with representation from major Universities,
informatics experts, industry, NCI
Review of Relevant Workspace
Projects XIP, AIM, Middleware, NBIA
Rapid application
development
environment for
diagnostic imaging
tasks that researchers
and others use to
create targeted
workflows customized
for specific projects
[Figure: XIP architecture diagram. Labeled components: XIP Application Builder; XIP Class Library; Auto Conversion Tool; XIP LIB (VTK, ITK, ...); XIP Modules (host independent); XIP Application (medical imaging workstation, standalone application, or web-based application); XIP Host Adapter; DICOM WG23 interface; XIP Host (can be replaced with any DICOM WG23-compatible host) with host-specific plug-in libraries; DICOM, HL7, and other services per IHE profiles; caGRID services via Imaging Middleware.]
Annotation and Image Markup
(AIM): Being Adopted by an Increasing Number of
Research and Commercial Systems
Represents a “standard” means of adding information/knowledge to an image in a research or
clinical environment to allow easy and automated search for image “content”
Imaging Middleware
(including GridCAD and Virtual PACS)
Grid computing has received
surprisingly little attention. One
application has been to allow
multiple computers to work in
parallel on a single task such as
CAD detection of lung nodules or
to give multiple opinions using
multiple algorithms
Middleware software is used to
create interoperability between
DICOM devices and the caGRID
which uses a service oriented
architecture
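The "multiple opinions using multiple algorithms" idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the three detector functions, their size thresholds, and the candidate measurements are hypothetical stand-ins, not GridCAD code, and a local thread pool stands in for grid nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for CAD algorithms. Each takes candidate nodule
# diameters (mm) and returns the set of candidate indices it flags.
def detector_a(diameters):
    return {i for i, d in enumerate(diameters) if d >= 4.0}

def detector_b(diameters):
    return {i for i, d in enumerate(diameters) if d >= 6.0}

def detector_c(diameters):
    return {i for i, d in enumerate(diameters) if 3.0 <= d <= 30.0}

def multiple_opinions(diameters, detectors):
    """Run every detector concurrently and count votes per candidate index."""
    with ThreadPoolExecutor(max_workers=len(detectors)) as pool:
        results = list(pool.map(lambda detect: detect(diameters), detectors))
    votes = {}
    for flagged in results:
        for idx in flagged:
            votes[idx] = votes.get(idx, 0) + 1
    return votes

def consensus(votes, min_votes):
    """Return candidate indices flagged by at least min_votes detectors."""
    return sorted(idx for idx, count in votes.items() if count >= min_votes)

# Made-up candidate diameters, for illustration only.
candidates = [2.5, 4.2, 7.0, 35.0]
votes = multiple_opinions(candidates, [detector_a, detector_b, detector_c])
majority = consensus(votes, min_votes=2)
```

In a grid setting each detector would run on a separate node; the vote-counting step is the part that turns independent results into "multiple opinions."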
NBIA: National Biomedical
Imaging Archive
• Initially designed as repository for LIDC and
RIDER CT lung nodule studies
• Expanded to include multiple additional types
of image collections with role based security to
share with public or a selected group or to
support ongoing clinical trials or other reader
studies
• Open source and free
• Meant to be “federated” to create virtual
database across multiple instances of NCIA
software
NBIA Demo: Home Page
NBIA Demo: Using the
Search Criteria
NBIA Demo: Search Results/
Selecting Images for Download
NBIA Demo: Image Visualization
NBIA PC DICOM Viewer: Cedara i-Response
NBIA Mac DICOM Viewer: OSIRIX
Download or “Virtual PACS”
The Cancer Genome
Atlas (TCGA) In Vivo
Imaging Project
Initial Phase
TCGA
• The Cancer Genome Atlas
• Collaboration between National Human
Genome Research Institute and NCI
• The Cancer Genome Atlas (TCGA) is a
comprehensive and coordinated effort to
accelerate our understanding of the genetics
of cancer using innovative genome analysis
technologies.
The Cancer Genome Atlas
• TCGA researchers have identified four
distinct molecular subtypes of
glioblastoma multiforme (GBM), and
demonstrated that response to
aggressive chemotherapy and radiation
differed by subtype
• These findings, reported in the January
19 issue of Cancer Cell, may result in
more personalized approaches to
treating groups of GBM patients based
on their genetic alterations
TCGA Second Study in Cancer Cell
• Another study published in April by The Cancer
Genome Atlas Research Network also in
Cancer Cell used epigenomic profiling
• Maps specific chemical changes or 'marks' to different
areas of the genome, to reveal a new subtype of
Glioblastoma Multiforme (GBM)
• Most patients with GBM survive only 12-15
months after their initial diagnosis
• However, patients with this specific subtype,
called Glioma CpG Island Methylator Phenotype
(G-CIMP), have a median survival of three years
Goals of TCGA Imaging Workspace
Project
• Investigate the added value of highly
structured interpretation and quantification of
MRI images of the TCGA dataset using AIM
• Determine the correlation between MRI
imaging and genotypic information and
response to therapy and prognosis
• Revise Cell article to include impact of MRI
data
• Determine the potential for these tools in
routine clinical practice
Feature Set – Controlled Vocabulary
• 20 features clustered by categories.
• Lesion Location
• Morphology of Lesion Substance
• Morphology of Lesion Margin
• Alterations in Vicinity of Lesion
• Extent of Resection
• Goal is to capture imaging features of
entire tumor and imaging features of
resection specimen.
Examples Non-standardized Features
May correspond to Angiogenesis,
Oxygenation, Apoptosis, Cellularity
• Infiltration
• Margination
• Edema
• Non-enhancing tumor.
• Enhancement
• Irregular
• Nodular
• Indistinct
• Infiltrative
• Necrosis
• Physiologic
• Diffusion
• Perfusion
Well marginated Non-enhancing
Infiltrative & Necrotic Type
Nodular Predominantly Non-enhancing
Three Workstations (Osirix [Mac], Clear Canvas [PC], and a Purpose-Built XIP
Workstation) Were Modified to Retrieve TCGA Images from the NBIA
Database, Use a Standardized Template, and Save Interpretations
and Quantitative Measurements to the AIM Data Service on caGRID
Osirix / iPad Assistant Demo
Osirix / iPad Workstation
XIP / AVT Workstation
Clear Canvas Workstation
Purpose of TCGA Radiology Phase II Project
Project Goals
Utilize multiple CBIIT/caBIG® technologies together to create
a practical system to capture diagnostic imaging “knowledge”
in a structured, standardized manner and to allow for the
integration with genomic and clinical data
Have at least two radiologists interpret the TCGA MRI brain
images associated with the Cancer Cell article
Utilize caBIG tools to create a repository of the qualitative
and quantitative information associated with the analysis of
the images
Utilize caBIG tools to perform cross database comparisons
for research purposes
Demonstrate potential of caBIG tools to assist in clinical
decision support
Review of Scope of Phase II:
Access to TCGA Clinical Data on caGRID and Imaging
Interpretation of TCGA Imaging Studies and Use of
B2B and caIntegrator 2 for Analysis of Data
Initial Tasks
1. 88 TCGA Radiology* cases in NBIA read by two experienced neuroradiologists using the Clear Canvas Workstation.
2. Develop bi-directional search capability in the AIM Data Service.
3. Develop Unique ID Fields within AIM Data Service.
Additional Tasks Performed
1. Creation of a data service utilizing data from the Cancer Cell article published 12/09.
2. Stand up data service as a grid service at Emory.
3. Deploy AIME RESTful web interface.
4. Develop an XML to table report mechanism for AIME Data Service.
*82 out of the 88 cases had complete data sets
Achievements:
Radiology Reading
TCGA cases in NBIA have been read by at least two funded neuro-radiologists:
• A radiologist fills out AIM based reporting template.
• New annotation data is saved on AIME.
• New markups created on Workstation and saved to the AIME.
• Existing markups and annotation retrieved from AIM Data Service at Emory (AIME).
• Images retrieved from NBIA at CBIIT.
Achievements:
AIM Tasks
Achieve AIME bidirectional query capability (to reach full CQL compliance)
The AIME unique ID field population task is completed (to support queries from caB2B)
Query “up” AND “down” the document
hierarchy, following bidirectional associations in
Domain Model. Unsupported previously.
“id” field populated with
unique values instead
of “0”
Achievements:
AIM Export Script
Script created to read AIME data and output it into a spreadsheet.
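A minimal sketch of what such an XML-to-table export might look like is shown here. It is not the project's script: the XML layout, element and attribute names, and patient identifiers are invented for illustration and do not follow the real AIM schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified annotation XML (NOT the real AIM schema).
SAMPLE_XML = """<annotations>
  <annotation patient="PATIENT-001" reader="R1">
    <observation name="Hemorrhage" value="Yes"/>
    <observation name="Crosses Midline" value="No"/>
  </annotation>
  <annotation patient="PATIENT-002" reader="R2">
    <observation name="Hemorrhage" value="No"/>
  </annotation>
</annotations>"""

FIELDS = ["patient", "reader", "observation", "value"]

def annotations_to_rows(xml_text):
    """Flatten each observation into one row keyed by patient and reader."""
    root = ET.fromstring(xml_text)
    rows = []
    for ann in root.findall("annotation"):
        for obs in ann.findall("observation"):
            rows.append({
                "patient": ann.get("patient"),
                "reader": ann.get("reader"),
                "observation": obs.get("name"),
                "value": obs.get("value"),
            })
    return rows

def rows_to_csv(rows):
    """Serialize rows as CSV text that a spreadsheet program can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

rows = annotations_to_rows(SAMPLE_XML)
csv_text = rows_to_csv(rows)
```

One row per observation keeps the table "long," which makes it easy to pivot or filter by feature name in a spreadsheet.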
Achievements:
TCGA Cancer Cell Data Service
Because the existing TCGA Grid Data
Service is not currently available, we
created our own grid data service to host
genomic and clinical data from the 12/09
Cancer Cell article.
• Built a data model for Cancer Cell genomic
and clinical data
• Used caCORE SDK 4.2 to quickly
generate an application from this model
• Used caGrid Introduce SDK to create a
Grid data service from the SDK model
• Deployed data service at Emory
• Created scientific queries for caB2B
• Successfully queried 3 disparate caGrid
data services (AIM, NBIA, TCGA Cancer
Cell) with caB2B
• Documented insights gained from the
process of setting up our own data and grid
service
Achievements:
caB2B Query of NBIA, AIM and TCGA CC
Data Services
• Imported models for AIM and TCGA
data services into caB2B and manually
loaded URLs for these services
• Created groups of related classes
across NBIA, AIM and TCGA CC data
models
• Built scientific queries to exercise
queries joining NBIA, AIM and TCGA
CC data using the B2B thick client
• Exposed these queries through caB2B
3.1 web client
• Successfully queried 3 disparate
caGrid data services (AIM, NBIA,
TCGA Cancer Cell) with caB2B
• There are limitations regarding speed
of return of data
• Documented performance limitations in
detail along with other insights gained
during the process of configuring caB2B
for this project.
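Conceptually, a query joining the three services links records by a shared patient identifier. The sketch below shows only that idea; it is not caB2B code. The dictionaries stand in for the TCGA Cancer Cell, AIM, and NBIA services, and all identifiers and values are made-up sample data.

```python
# Hypothetical in-memory stand-ins for the three data services.
clinical = {  # TCGA Cancer Cell style: patient id -> clinical/genomic fields
    "P1": {"gender": "FEMALE", "subtype": "Proneural"},
    "P2": {"gender": "MALE", "subtype": "Classical"},
    "P3": {"gender": "FEMALE", "subtype": "Proneural"},
}
annotations = {  # AIM style: patient id -> image observations
    "P1": {"Hemorrhage": "Yes"},
    "P3": {"Hemorrhage": "No"},
}
image_series = {  # NBIA style: patient id -> image series identifiers
    "P1": ["series-a"],
    "P2": ["series-b"],
}

def join_services(clinical, annotations, image_series, predicate):
    """Select clinical records matching predicate, then attach any
    annotations and image series that share the same patient id."""
    results = []
    for pid in sorted(clinical):
        record = clinical[pid]
        if not predicate(record):
            continue
        results.append({
            "patient": pid,
            **record,
            "observations": annotations.get(pid, {}),
            "series": image_series.get(pid, []),
        })
    return results

# Example: female patients with the "Proneural" genomic subtype.
matches = join_services(
    clinical, annotations, image_series,
    lambda r: r["gender"] == "FEMALE" and r["subtype"] == "Proneural",
)
```

In this sketch, patients missing from one source still appear with empty observations or series, which mirrors the fact that not every case has data in every service.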
Achievements:
Additional Analysis with caIntegrator2
• caIntegrator2 team added a feature to
support integration with AIM grid data
service to load annotations
• caIntegrator2 Study: Combine TCGA
Cancer Cell data (from CSV), AIM data
from grid service, and images from
NBIA production grid service.
• Created scientifically relevant
queries based on image observations
and clinical data
• Generated Kaplan-Meier plots of
survival based on certain
observations and genomic subtypes
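For readers unfamiliar with how a Kaplan-Meier curve is computed, here is a minimal sketch of the standard product-limit estimator. It is not caIntegrator2 code, and the survival times and vital-status flags are invented sample values, not TCGA data.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate of survival.

    durations: survival or follow-up time per patient (e.g. days).
    events:    True if death was observed, False if censored.
    Returns a list of (time, survival_probability) at each time
    where at least one death occurred.
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        n_at_risk -= removed
    return curve

# Invented sample: five patients, two of them censored.
sample_days = [100, 200, 200, 300, 400]
sample_died = [True, True, False, True, False]
curve = kaplan_meier(sample_days, sample_died)
```

Comparing groups (for example, by an image observation or genomic subtype) amounts to running the estimator once per group and plotting the resulting step curves together.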
Achievements:
Cross Program Coordination
Diverse Group of Contributors
Imaging and ICR Facilitators (Ed and Juli)
Life Science CAT (led by Robert)
Cancer Imaging Program
Imaging SMEs (Emory, Stanford, NW, UMD, UVA, TJU)
Multiple Contractors (Booz Allen, SAIC, 5AM, Sapient)
Grid KC (OSU)
caBIG® Imaging Enterprise Use
Case Project: TCGA Radiology
Experience with
caB2B and caIntegrator2
July 15th 2010
caB2B “Thick” Desktop Client
Search for AIM annotation:
“Thickness of the Enhancing Margin Thick”
Search for Gender from TCGA Patient data
caB2B “Thick” Desktop Client
Summary of all matches
in AIM and TCGA data
caB2B “Thin” Web Client
Searching for TCGA data
Includes TCGA, AIM, and NBIA services
caB2B “Thin” Web Client
Searching for Female patients with the
“Proneural” Genomic Subtype
caB2B “Thin” Web Client
Export results to
CSV for
further analysis
caB2B “Thin” Web Client
Add formula in Excel
caIntegrator2
Study deployed with
TCGA Cancer Cell
data (from
spreadsheet), AIM and
NBIA image data from
grid services
caIntegrator2
This saved query shows only Age at First Diagnosis and whether Hemorrhage exists. Other columns can be added.
We can export to CSV for additional analysis as in caB2B
caIntegrator2
Representative image
from annotated image
series in NBIA
Achievements:
Preliminary Scientific Findings
• Survival of patients with greater
thickness of enhancement (who appear
to have had tumors with a thicker “rim”)
was significantly shorter than those
who had less.
• Survival of patients who had larger
thickness of enhancement tumors
with hemorrhage was significantly
shorter than those who did not.
• Survival of patients who had
tumors that crossed midline was
significantly shorter than those
who did not.
Opportunities to Further Deploy TCGA Related
Imaging and Life Sciences Technologies
Cancer Imaging Program:
- Continued TCGA Genotype/Phenotype Research with CBIIT, NIH Clinical Center
- Quantitative Imaging Network Program
- Cancer UK Research Program
- All Ireland Initiative Program
Radiation Research Program
- RTOG 0522 Study
NIAMS Osteoarthritis Study
- Annotation of radiology data
- Integration of radiology data with other OAI data types
How the TCGA Radiology Project Fits Into
the caBIG® Imaging Program Roadmap
The Workstation provides a template for the type of visualization service
that we wish to make available as part of the suite of Imaging web-based
services.
The AIM Data Service is part of the proposed suite of web-based services
offered by CBIIT.
All of the TCGA technologies are part of the proposed software refactoring
for SAIF/ECCF compliance.
Proposed Next Steps for TCGA Radiology
1. Ongoing operation and maintenance of NBIA, Clear Canvas, AIM Data Service
and TCGA Cancer Cell Data Service.
2. Communication to community that radiologists can continue to read the cases
and add to the AIM TCGA data set
3. CIP recruited additional radiologists to read the cases since the AIM model
allows any number of readers to refer to one or more instances of the AIM data
service
4. CIP also says that they are working with TCGA sites to get additional TCGA
radiology cases to be loaded on CBIIT’s NBIA.
1. Plan to create a hosted instance of AIM Data Service,
and TCGA Cancer Cell Data Service at CBIIT and in the
cloud
2. Communication to community that researchers can
now query across the three data services. CIP is also
working with Carl Schaefer and Robert Clifford to
begin to do research correlations among the clinical,
genomic and image annotation data.
3. Solicit feedback from community regarding desired
features for the Workstation and AIM Data Service.
Future Plans
• Provide software to NCI clinical cancer centers for
their own clinical trials/research studies involving
diagnostic imaging
• Extend work from in-vivo Imaging to pathology
Future Plans for TCGA Imaging Project
• Include higher order analysis, such as quantitative
diffusion imaging and perfusion imaging metrics,
that could be more sensitive predictors of disease
severity, candidates for effective therapy, and
expected outcomes combining human with semi-
automated and automated analysis of images
Future Plans for TCGA Project
• Ultimately would like to develop a “service” that
has capability to provide immediate feedback for
radiologist or oncologist on patient survival,
patient treatment, etc.
• Incorporate genomic and other data display
during radiology interpretation at a workstation
General Access TCGA Data
• The TCGA Study is currently available in
limited access [on the QA tier].
• We plan to offer the study for public
consumption [on the production tier] by the
end of September.
• The TCGA Radiology caIntegrator Study
contains 82 cases with at least one radiology
interpretation
• The radiology interpretation data is provided
in AIM format.
• The total amount of data in the caInt TCGA
Rad Study includes:
• 202 patient cases from the TCGA Cancer Cell Article
• 196 of the 202 cases have valid genomic subclass and clinical
data loaded into caIntegrator
• 196 of those 196 cases have valid genomic
expression data from caArray
• 88 of those 196 cases have radiology images in
the NBIA TCGA collection
• 82 of those 88 cases have radiology
annotation/tumor characteristic data from AIME
• The caInt TCGA Rad Study pulls its data
from:
• NBIA (Images)
• Cancer Cell Data Service (Clinical and
Genomic Subtypes)
• AIM Data Service
• caArray (Genomic Microarray Data)
The TCGA Radiology caIntegrator Study has the
following clinical data provided by
the authors of the 12/09 Cancer Cell article.
• Patient Barcode (Unique ID)
• Genomic Subtypes
• Gender
• Vital Status (at time data was gathered)
• Age at First Diagnosis
• Survival (Days)
• Percent Tumor Nuclei
• Percent Tumor Necrosis
Demonstration of Interactive Use of
caIntegrator2 to Explore TCGA Data
Including Radiology Phenotypic Data
• Dr. Joseph Chen
• Instructor University of Maryland School
of Medicine Department of Diagnostic
Radiology