SC2003 Tutorial Sample Visuals
Additional presentations can be found at the following URLs:
http://medicine.osu.edu/Informatics/talks/powerpoint/rockville04-03-23final.ppt
http://medicine.osu.edu/Informatics/talks/powerpoint/CCG%20Presentation.ppt
http://medicine.osu.edu/informatics/talks/powerpoint/janiesweb.ppt
ftp://ftp.amnh.org/pub/people/djanies/poyusersguide.doc
ftp://ftp.amnh.org/pub/people/djanies/poyscreenshot.doc
ftp://ftp.amnh.org/pub/people/djanies/janies&wheeler.pdf
http://medicine.osu.edu/Informatics/talks/powerpoint/MSA-SAC2003.ppt
Joel Saltz
Chair and Professor, Biomedical Informatics Department
Professor, Computer and Information Science, Pathology
The Ohio State University
Overview
• Grid based clinical research infrastructure
• OSU Grid Software Infrastructure
• CALGB and cardiac clinical research studies
Grid based clinical trials support
Worldwide Scope for Clinical Research Studies
• 1000s of potential clinical research sites; different studies involve different subsets of sites
• Different sites can use different names for the same entity
– Semantic grid, SNOMED, LOINC
• Support for authentication, encryption, anonymization, role-based data access
• Support for grid data aggregation
• Grid-based coordination of clinical studies
• Must leverage pre-existing medical information systems; each of these is a complex, trigger-based federated system
Clinical Research Grid: Types of Information
• Radiological Studies
• Pathology
• Molecular (Proteomics, gene expression)
• Genetic, Epigenetic (SNPs, haplotype analysis)
• Laboratory, pharmacy, outcome data
Aggregation of Data in Virtual Information Warehouses
Figure: Virtual Information Warehouses aggregating Clinical Genomic Data, Clinical Data, Tissue Bank, and Lab Data
Clinical Research Grid: More than just data aggregation
• Iteratively define clinical protocols
– Changes arise from scientific and institutional review and from ongoing analysis of study data
• Patient accrual
– Identify suitable patients, obtain patients’ consent for study
• Execution of protocol
– Maintain and execute rule base in order to carry out treatment and testing specified by protocol
– Ongoing assessment of patient data determines patient treatment, what data will be obtained, which specimens will be collected, how the specimens will be processed, which tests will be carried out…
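As a rough illustration of executing a protocol rule base against ongoing patient data, the sketch below evaluates a small list of rules and returns the actions that fire. The `Rule` class, the field names (`lab_value`, `study_week`), the thresholds, and the action strings are all hypothetical placeholders, not part of any OSU system and not clinical guidance.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Patient = Dict[str, float]


@dataclass(frozen=True)
class Rule:
    """One protocol rule: a condition over patient data and the action it triggers."""
    name: str
    condition: Callable[[Patient], bool]
    action: str


def evaluate(rules: List[Rule], patient: Patient) -> List[str]:
    """Return the actions of every rule whose condition holds, in rule order."""
    return [rule.action for rule in rules if rule.condition(patient)]


# Hypothetical rule base; field names and thresholds are placeholders only.
RULES = [
    Rule("low_lab_value",
         lambda p: p.get("lab_value", float("inf")) < 1.0,
         "hold treatment"),
    Rule("week_4_specimen",
         lambda p: p.get("study_week") == 4,
         "collect specimen"),
    Rule("week_4_imaging",
         lambda p: p.get("study_week") == 4,
         "schedule imaging"),
]

if __name__ == "__main__":
    print(evaluate(RULES, {"lab_value": 0.8, "study_week": 4}))
```

Re-running `evaluate` each time new patient data arrives is one simple way to let ongoing assessment drive which treatments, tests, and specimen collections come next.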
Patient safety and protocol optimization
Ongoing analysis of data from overall study and of data from individual patient
Analysis
Prediction of patient outcome, effectiveness of treatment; relationship of genomic data to pathophysiological measurements, outcome
Data streamed to Analysis
Drives accrual, protocol changes, choice of laboratory, imaging, genomic testing
Analysis subscribes to data updates and generates requests for data
Data
Diagnosis, Treatment, Laboratory, Imaging, Proteomic, Gene Expression, Gene Sequence
Workflow
Execution of rule-based protocols, execution of algorithms that specify tests and treatments, coordination of patient consenting, specimen collection and analysis
Data-driven algorithms: patient accrual; clinical, laboratory, and genomic testing
Overview of Clinical Research Grid
• Customized access control
– Institutions and patients decide what data to share
OSU Information Warehouse
• Ad-hoc data warehouses
– Each research project and consortium can maintain its own data view
– Institutional databases linked to grid
• Grid based molecular dataset and image analysis
– Images as first class objects
Clinical Research Environment at Single Site: OSU IS Infrastructure
Components of Local Information System
• Electronic medical record
• Clinician order entry, clinical protocol specification and tracking
• Laboratory System
• Digital Radiology (PACS)
• Datagate: triggers invoked by message monitoring
• Appointments, billing
• Logistics – scheduling people and resources
• Information warehouse
• Security, single sign-on
Cardiology (CFP) Cardiology ECG (Marquette) Labs, Pathology (Sunquest) Discharge Instructions (Homegrown) Respiratory Therapy (CliniVision)
Results Reporting Data Flow
Clinical Repositories Palm Labs (K1) Radiology (K3)
Interface Engine (Datagate)
Cardiology/ECG (K5) Respiratory/Access Databases (K7) Dictated Reports (K9) All Results
RT Consult Notifications, Labs
Lifetime Clinical Record/ CAPI (Siemens) Soarian/ Clinical Access (Siemens) Bedside (CliniComp)
Signed and Prelim Dictated Reports
Labs, Rad, Card
Reports to be signed
Blood Gases (StatLab)
Consult Paging
Radiology (IDXRad) PACS Images (AGFA) Gastro (ProVation) Planned Consults (Siemens)
Dictated Reports:
Autofax Discharge Summary, Discharge Instructions, Referring Notes, Operative Notes, Letters, Clinic Notes, etc.
Information Warehouse (Homegrown) GCRC (Homegrown) Nephrology (Homegrown)
Labs
CVC Notes (Siemens)
Medical Records (SoftMed)
Includes:
- Radiation Oncology
- ED/Trauma
- Staff Notes
- History & Physicals
- Office letters, summaries
- Consultations
- Discharge letters
- EMG Results
Transplant (Translink) Surgery Risk Mgmt (Homegrown) Oncology (Homegrown)
Transcribed Reports (MedQuist)
Department Reports (Word Macro)
Order Entry Data Flow
Consults
Lab Orders Addon Orders (Send Num) Number Assn Order Status Upd 1. A. B. C.
Labs (Sunquest)
Lab Orders (Q5) Radiology Orders (Q7) Dietary Orders (Q9)
Rad Orders
Clinical Order Entry (Siemens) A.
Lab Add-On Orders (L6) Radiology Add-On Orders (L8) Lab Number Assignment (Q5) Radiology Number Assignment (Q7)
Datagate Interface Engine (SeeBeyond)
Addon Orders (Send Num) Number Assn Order Status Upd
Radiology (IDXRad)
Lab Pendings/Results Rad Sched/Results C. Lifetime Clinical Results CAPI C.
Dietary Orders 1.
Dietary (CBORD)
(Siemens)
Legend 1 - HIS Orders - No Ancillary Change
1. Physician places order. Sent to ancillary via interface or paper.
C. If order fine: Radiology is scheduled, arrived, completed via order status update. Lab orders are collected, received and finalled.
Legend 2 - Ancillary Add-on Order
A. Ancillary places order. Sent to COE.
B. Ancillary order number recorded, new HIS number assigned and sent. Ancillary stores HIS number.
Legend 3 - HIS Orders - Ancillary Changes
1. Physician places order. Sent to ancillary via interface.
A. Ancillary cancels order by sending cancel message to COE.
A. Ancillary places correct add-on order and sends to COE.
B. COE receives order, creates HIS order number and sends "Number Assigned" msg to ancillary.
C. Order status updates show status as described in Legend 1.
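The Legend 3 exchange (physician order, ancillary cancel, corrected add-on order, "Number Assigned" reply) can be sketched as a tiny in-memory simulation. This is an illustration only: the class names, the `HIS-n` and `ANC-n` number formats, and the dictionary message layout are assumptions, and no real interface engine or HL7 messaging is modelled.

```python
from itertools import count


class OrderEntry:
    """Stand-in for the clinical order entry (COE) side; assigns HIS order numbers."""

    def __init__(self):
        self._ids = count(1)
        self.status = {}  # HIS order number -> status

    def place_order(self):
        his_number = f"HIS-{next(self._ids)}"  # hypothetical number format
        self.status[his_number] = "ordered"
        return his_number

    def receive_cancel(self, his_number):
        self.status[his_number] = "cancelled"

    def receive_addon(self, ancillary_number):
        """Create a HIS number for an ancillary add-on and reply "Number Assigned"."""
        his_number = self.place_order()
        return {"type": "Number Assigned",
                "ancillary_number": ancillary_number,
                "his_number": his_number}


class Ancillary:
    """Stand-in for an ancillary system (lab, radiology) that can add or cancel orders."""

    def __init__(self):
        self._ids = count(1)
        self.his_numbers = {}  # ancillary order number -> HIS order number

    def new_addon(self):
        return f"ANC-{next(self._ids)}"  # hypothetical number format

    def receive_number_assigned(self, message):
        self.his_numbers[message["ancillary_number"]] = message["his_number"]


def legend3_scenario():
    coe, ancillary = OrderEntry(), Ancillary()
    original = coe.place_order()            # 1. physician places order
    coe.receive_cancel(original)            # A. ancillary cancels it
    addon = ancillary.new_addon()           # A. ancillary places corrected add-on
    reply = coe.receive_addon(addon)        # B. COE assigns a new HIS number
    ancillary.receive_number_assigned(reply)
    return coe, ancillary


if __name__ == "__main__":
    coe, ancillary = legend3_scenario()
    print(coe.status, ancillary.his_numbers)
```

Keeping the ancillary-to-HIS number mapping on the ancillary side mirrors the legend's "Ancillary stores HIS number" step, so later order status updates can refer to the HIS number.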
Patient Management Data Flow ADT and Scheduling
Outpatient Registration/Sched (IDX)
IDX Sched IDX BAR IDX Reg Outpatient Schedules Outpatient ADT Reg's All "SMS" ADT ABN (MediQuant) Fax Server Housekeeping Page on Admit, ED Discharges Notices to Referrings
Ancillary Systems: Bedside (CliniComp) Cardiology (CFP) Dietary (CBORD) Gastro/Endo (ProVation) Home Health (Delta)
Secondary Systems:
ECG (Marquette)
ADT
Hospital Info System (Siemens)
OSUMC Patient Accounting James Patient Accounting OSUMC/James Patient Management
Inpatient, ED, LD, ASU ADT Hospital Based "IDX" ADT
Datagate/EGate Interface Engine (SeeBeyond)
Labs/Path (Sunquest) Labs/URL (Antrim) Materiel Sys. (OmniCell) Med Records (SoftMed)
Instruments
Materiel Sys (Class) Transcription (MedQuist) Case Cart Dispensing (Pyxis) PACS (AGFA) Modalities (Kodak, Agfa)
All East ADT Non-Hospital "IDX" ADT All ADT Direct Database Applications
OR (ORIS) Pharmacy (Pharmakon) Radiology (IDXRad) Rad Onc (Siemens) Respiratory (CliniVision) Research Databases: GCRC (Homegrown) Visit DB: Bedboard ADT Q/A Ref Phys App Patient Transport Respiratory (StatLab) Switchboard (XTend)
East Patient Accounting
East Patient Management
MPI - EAD (Siemens)
Imaging (Siemens)
All ADT
New "Soarian" Info System (Siemens)
Clinical Access
Onc Outcomes (Homegrown) Transplant (Homegrown) Surgery Risk (Homegrown)
Infrastructure for Clinical Trials Support at OSU
Knowledge Base CRIS Application& Rules Engine
Email Notifications
Information warehouse; Clinician order entry Web Server
OSU Hospital 802.11b Wireless Network
Web Pages
Handheld Web Pages
Tablet PC
Tablet PC
AvantGo
HotSync
Pocket PC, Palm OS PDA, Palm OS PDA, Tablet PC
PC/Mac Web Browser
Tablet PC, Palm OS PDA
Layered on OSU Information Warehouse
CPR Decision Support
Facilitate Best Practice through POE
Standard templates/defaults for complex orders
Order sets designed to support evidence based clinical guidelines
Ohio Clinical Trial Research Consortium
Federate Emerging Databases
Infrastructure to relate, combine & produce meta data
Deformation Segmentation Quantification
Slide courtesy of Arthur Toga
Two groups have developed BIRN project partnerships, so far:
• Mouse BIRN - Animal Models of Disease / Multi Scale / Multi Method - MS Mouse and DAT KOM (a schizophrenic and otherwise interesting mouse animal model)
• Brain Morphology BIRN - Targets: neuroanatomical correlates of neuropsychiatric illness (Unipolar Depression, mild Alzheimer's Disease (AD), mild cognitive impairment (MCI))
OSU Grid Software Infrastructure
• Support for optimized query, processing, analysis of distributed datasets
• Integration of software with NSF PACI software suite (Globus, SRB, Network Weather Service)
• Collaboration with BIRN
Software Support for Data Driven Applications
• DataCutter: Component Framework for Combined Task/Data Parallelism:
– Filtering/Program coupling Service: Distributed C++ component framework
• GridDB Lite: Large Data Query Layered on DataCutter
– Indexing: Multilevel hierarchical indexes based on R-tree indexing method.
• Data Cluster/Decluster/Range Query
• Active Proxy G: Active Semantic Data Cache
– Employ user semantics to cache and retrieve data – Store and reuse results of computations
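To make the idea of a multilevel hierarchical index for range queries concrete, here is a simplified two-level bounding-box index in the spirit of an R-tree. It is a sketch under stated assumptions, not GridDB Lite's implementation: the chunk names, the fixed `fanout`, and the sort-based grouping are illustrative choices, and a real R-tree balances and splits nodes on insertion.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
Chunk = Tuple[str, Box]                  # (chunk name, bounding box)


def intersects(a: Box, b: Box) -> bool:
    """True when two closed rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(boxes: List[Box]) -> Box:
    """Smallest rectangle enclosing all the given rectangles."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def build_index(chunks: List[Chunk], fanout: int = 2) -> List[Tuple[Box, List[Chunk]]]:
    """Group chunks into upper-level nodes, each with an enclosing bounding box."""
    ordered = sorted(chunks, key=lambda c: (c[1][0], c[1][1]))
    groups = [ordered[i:i + fanout] for i in range(0, len(ordered), fanout)]
    return [(union([box for _, box in group]), group) for group in groups]


def range_query(index: List[Tuple[Box, List[Chunk]]], query: Box) -> List[str]:
    """Return names of chunks overlapping the query, pruning whole nodes first."""
    hits = []
    for node_box, group in index:
        if not intersects(node_box, query):
            continue  # every chunk under this node is skipped without inspection
        hits.extend(name for name, box in group if intersects(box, query))
    return hits


if __name__ == "__main__":
    chunks = [("c0", (0, 0, 10, 10)), ("c1", (10, 0, 20, 10)),
              ("c2", (0, 10, 10, 20)), ("c3", (10, 10, 20, 20))]
    print(range_query(build_index(chunks), (12, 2, 18, 8)))
```

The benefit of the hierarchy is the early `continue`: a query that misses a node's bounding box never touches the chunks beneath it.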
DataCutter
• Flow control between components
• Schedulers place filters on grid processors (scheduler API)
• Stream based communication
• MetaChaos data descriptor, data mapping support for inter-component data transfers
• Data aggregation implemented as a component
NPACKage
Combined Data/Task Parallelism
Figure: filters E0 through EN, R0 through R2, Ra0 through Ra2, and M placed on host1 through host5 across Cluster 1, Cluster 2, and Cluster 3 (slide dated 9/11/2002)
Download at www.datacutter.org
Integrating DataCutter with existing Grid toolkits: SRB, Globus, NWS
– SRB integration: Subset and filter datasets
– Globus integration: DataCutter uses Globus’ resource discovery, resource allocation, authentication, and authorization services.
– Network Weather Service (NWS) integration: NWS is used for system monitoring.
Distributed by NPACI as NPACKage
GridDB Lite: Select Operation on Grid Data Distributed Array
Figure: Query Planning and Query Execution. Components shown: Query, Index, Filter, Data Source, Data Cache, Partition, Distribution Generation Service, Data Mover Service, Distributed Program
Multi-Query Optimization: Active Proxy G
• Goal: minimize the total cost of processing a series of queries by creating an optimized access plan for the entire sequence [Kang, Dietz, and Bhargava]
• Approach: minimize the total cost of processing a series of queries through data and computation reuse
• [IPDPS2002, SC2002, ICS02]
q1, q2: This blue slab is the same as in q1
q3: We have seen the pieces of q3 computed for other queries in the past
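The reuse pattern in the q1, q2, q3 picture can be sketched as block-level memoization: each query covers a range of data blocks, and only blocks not seen in earlier queries are computed. This is a minimal sketch, not Active Proxy G's design; the `SemanticCache` class, the integer block identifiers, and the squaring function standing in for an expensive computation are all assumptions.

```python
from typing import Callable, Dict, List


class SemanticCache:
    """Caches per-block results so overlapping queries reuse earlier computation."""

    def __init__(self, compute: Callable[[int], int]):
        self._compute = compute
        self._store: Dict[int, int] = {}
        self.computed = 0  # blocks computed from scratch
        self.reused = 0    # blocks answered from the cache

    def query(self, start: int, stop: int) -> List[int]:
        """Return results for blocks in [start, stop), computing only the missing ones."""
        results = []
        for block in range(start, stop):
            if block in self._store:
                self.reused += 1
            else:
                self._store[block] = self._compute(block)
                self.computed += 1
            results.append(self._store[block])
        return results


if __name__ == "__main__":
    cache = SemanticCache(lambda block: block * block)  # stand-in for costly work
    cache.query(0, 4)   # q1: everything is new
    cache.query(2, 6)   # q2: overlaps q1 on blocks 2 and 3
    cache.query(1, 5)   # q3: fully covered by pieces of q1 and q2
    print(cache.computed, cache.reused)
```

In this toy run the third query triggers no new computation at all, which is the situation the q3 annotation describes.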
Grid Based Image Analysis Toolkit
• Framework to support distributed image processing applications
• Use DataCutter, VTK, and ITK
• Provide a standard framework for describing image processing workflow and its data in order to enable creation of image processing Grid Services.
• DataCutter – Distributed workflow system used for building applications that can operate in a cluster computing environment.
• VTK – The Visualization Toolkit, used for creating visualization applications of all kinds that manipulate and view image data.
• ITK – The Insight Segmentation and Registration Toolkit, which is quickly becoming the standard toolkit for the archival and invention of image analysis algorithms.
NPACI Telescience, BIRN and Microscopy
Support for Telescience Portal using VTK, ITK
• Goal
– Remote access to and processing of subsets of large, distributed images.
– Even single images can be very large (a few hundred MB to tens of GB per image for montaged digitized microscopy images).
Figure: a query window within an image of 40,000 by 40,000 pixels
• Support by DataCutter for
– Basic database operations: Indexing, querying, and subsetting of large images and image datasets.
– Image processing supported by VTK (Visualization Toolkit) and ITK (Insight Segmentation and Registration Toolkit) layered on DataCutter.
– Use of heterogeneous, distributed clusters for data processing.
• DataCutter is part of NPACKage, which also includes SRB, Globus, and Network Weather Service as an integrated suite of tools.
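Subsetting a large image comes down to working out which stored pieces a query window touches, so that only those are read and shipped. The sketch below assumes the image is stored as square tiles; the 40,000-pixel side comes from the slide, while the 1,000-pixel tile size and the function name are assumptions made for illustration.

```python
from typing import List, Tuple


def tiles_for_window(x0: int, y0: int, width: int, height: int,
                     tile: int = 1000, image_side: int = 40_000) -> List[Tuple[int, int]]:
    """Return (row, col) indices of the tiles that a query window overlaps.

    The window is clipped to the image; tile size is an assumed storage layout.
    """
    if width <= 0 or height <= 0:
        raise ValueError("window must have positive width and height")
    if not (0 <= x0 < image_side and 0 <= y0 < image_side):
        raise ValueError("window origin lies outside the image")
    x1 = min(x0 + width, image_side)   # exclusive right edge
    y1 = min(y0 + height, image_side)  # exclusive bottom edge
    cols = range(x0 // tile, (x1 - 1) // tile + 1)
    rows = range(y0 // tile, (y1 - 1) // tile + 1)
    return [(row, col) for row in rows for col in cols]


if __name__ == "__main__":
    wanted = tiles_for_window(1500, 2500, 1000, 500)
    total = len(tiles_for_window(0, 0, 40_000, 40_000))
    print(f"{len(wanted)} of {total} tiles needed: {wanted}")
```

With these assumed sizes, a small query touches only a couple of the 1,600 tiles, which is why subsetting near the data source cuts the volume that has to move.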
Telescience Portal and DataCutter Demo at NPACI 2003 All Hands Meeting (March ’03)
Figure: Telescience Portal (web portal), storage systems behind the Storage Resource Broker (SRB), and compute clusters with Globus running DataCutter processing filters with VTK/ITK
• With DataCutter
– Some of the processing can be done near data sources to reduce volume of data.
– Compute intensive operations can be executed on collections of compute clusters.
• Middleware Tools
– DataCutter: subsetting, filtering, and processing of data in a distributed environment
– Globus: authentication, resource allocation, and remote process execution
– SRB: file I/O to different storage systems
Radiology: Clinical Studies using Dynamic Contrast Imaging
• 1000s of dynamic image sets per clinical study
• Iterative investigation of image quantification, image registration and image normalization techniques
• Assess techniques’ ability to correctly characterize anatomy and pathophysiology
– Biopsy results
– Changes in tumor structure and activity over time with treatment
• Images from many sites including NIH, Heidelberg, Oklahoma, Ohio State
• Collaboration with Michael Knopp, MD
DCE-MRI of a juvenile osteosarcoma in the lower arm in a young patient. This is an example image taken from a dynamic series during passage of a contrast agent. With a two-compartment model, different pharmacokinetic parameters can be calculated. A region of interest (ROI) analysis of a highly vascularized tumor area (“hot region”, yellow circle) allows calculation of a signal-intensity curve. This area shows a fast up-slope, a high amplitude and a fast wash-out of the contrast agent.
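The three curve features named in the caption (up-slope, amplitude, wash-out) can be read off a region-of-interest signal-intensity curve with simple arithmetic. The sketch below is a descriptive summary only and does not fit the two-compartment pharmacokinetic model; the definitions used (baseline as the first sample, slopes as straight lines to and from the peak) and the sample numbers are assumptions chosen for illustration.

```python
from typing import Dict, Sequence


def curve_descriptors(times: Sequence[float], signal: Sequence[float]) -> Dict[str, float]:
    """Summarize a signal-intensity curve by amplitude, up-slope, and wash-out slope.

    Assumed definitions: baseline is the first sample, amplitude is peak minus
    baseline, and each slope is a straight line between two samples.
    """
    if len(times) != len(signal) or len(signal) < 2:
        raise ValueError("need matching time and signal samples (at least two)")
    baseline = signal[0]
    peak = max(range(len(signal)), key=lambda i: signal[i])
    amplitude = signal[peak] - baseline
    up_slope = amplitude / (times[peak] - times[0]) if peak > 0 else 0.0
    if peak < len(signal) - 1:
        wash_out = (signal[-1] - signal[peak]) / (times[-1] - times[peak])
    else:
        wash_out = 0.0  # curve still rising at the last sample
    return {"amplitude": amplitude, "up_slope": up_slope, "wash_out": wash_out}


if __name__ == "__main__":
    # Made-up sample curve: time in seconds, signal in arbitrary units.
    print(curve_descriptors([0, 10, 20, 30, 40], [100, 180, 220, 200, 180]))
```

A "hot region" in the caption's sense would show a large positive `up_slope`, a large `amplitude`, and a clearly negative `wash_out`.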
prior to therapy
after 2 cycles
after 4 cycles
DCE-MRI of a malignant breast tumor during neo-adjuvant chemotherapy. During therapy monitoring it could be demonstrated that the bulk of the tumor responded very well to chemotherapy (after 2 and 4 cycles) whereas tiny tumor “islands” showed no response and remained still active (c3, arrows).
Advanced Analysis of DCE-MR Images
Flowchart: Computer-aided Quality Control checks for contrast enhancement and no severe image artifact (NO: No DCE Evaluation). If YES, check for Detectable Motion in Dataset. If motion is detected, check whether it is Motion Correctable (YES: Motion Correction; NO: No DCE Evaluation). Datasets that pass go on to Advanced DCE-MRI Evaluation.
Other nodes shown: Coregistration (cross-modality or longitudinal study), Computer-aided tumor detection, Computer-aided tumor differentiation, Algorithmic tumor classification (3D subclassification of heterogeneous lesion)
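The quality-control gating in the flowchart can be written as a short decision function. The branch order below is inferred from the node labels and the YES/NO markers, since the arrows did not survive in this text, so treat it as an assumed reading of the diagram; the function and argument names are hypothetical.

```python
from typing import List


def plan_dce_evaluation(has_enhancement: bool, severe_artifact: bool,
                        motion_detected: bool, motion_correctable: bool) -> List[str]:
    """Return the processing steps a dataset would go through under the assumed flow."""
    steps = ["computer-aided quality control"]
    if not has_enhancement or severe_artifact:
        return steps + ["no DCE evaluation"]
    if motion_detected:
        if not motion_correctable:
            return steps + ["no DCE evaluation"]
        steps.append("motion correction")
    return steps + ["advanced DCE-MRI evaluation"]


if __name__ == "__main__":
    print(plan_dce_evaluation(True, False, True, True))
```

Encoding the gate this way makes it easy to apply the same acceptance criteria to the thousands of image sets a multi-site study produces.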
System Support
Data Analysis Runtime Support
Being built on a number of toolkits and frameworks for execution in cluster and Grid environments.
• DataCutter
– A component framework for distributed execution of queries and data analysis operations.
– Allows execution of a chain of processing operations on parallel and distributed machines.
• GridDB-Lite
– Support for subsetting of large, distributed datasets, for user-defined filtering, and for data transfer.
– Builds on DataCutter
• GridMD
– Grid-enabled meta-data management services
• Active Proxy-G
– Caching of query results and data in Grid environments.
• NPACKage
– A suite of tools for application development and execution in a Grid environment
– Includes Globus, DataCutter, Network Weather Service, and Storage Resource Broker
• For more information
– www.datacutter.org
– www.medicine.osu.edu/informatics
Distributed Execution using DataCutter
• Middleware for subsetting and processing of large datasets in a distributed environment
• Distributed C++ component framework
• Provides support for data and task parallelism
• Application-specific components for processing data
– filters: logical unit of computation
• init, process, finalize interface
– streams: how filters communicate
• unidirectional buffer pipes
• uses fixed size buffers
– manually specify filter connectivity and filter-level characteristics
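The filter and stream model above can be mimicked in a few lines: each filter exposes init, process, and finalize, and data moves between filters in fixed-size buffers. This is a single-process Python sketch for illustration only. DataCutter itself is a distributed C++ framework, and the class names, method signatures, and `run_pipeline` helper here are assumptions rather than its API.

```python
from typing import Iterator, List


class Filter:
    """Minimal filter interface: init once, process each buffer, finalize once."""

    def init(self) -> None:
        pass

    def process(self, buffer: List[int]) -> List[int]:
        raise NotImplementedError

    def finalize(self) -> List[int]:
        return []


class Threshold(Filter):
    """Passes through only the values at or above a cutoff."""

    def __init__(self, cutoff: int):
        self.cutoff = cutoff

    def process(self, buffer: List[int]) -> List[int]:
        return [value for value in buffer if value >= self.cutoff]


class Sum(Filter):
    """Accumulates everything it sees and emits one total when the stream ends."""

    def init(self) -> None:
        self.total = 0

    def process(self, buffer: List[int]) -> List[int]:
        self.total += sum(buffer)
        return []

    def finalize(self) -> List[int]:
        return [self.total]


def stream(data: List[int], buffer_size: int) -> Iterator[List[int]]:
    """Cut data into fixed-size buffers (the last one may be shorter)."""
    for start in range(0, len(data), buffer_size):
        yield data[start:start + buffer_size]


def run_pipeline(filters: List[Filter], data: List[int], buffer_size: int = 4) -> List[int]:
    """Push each buffer through the manually specified chain of filters."""
    for f in filters:
        f.init()
    outputs: List[int] = []
    for buffer in stream(data, buffer_size):
        for f in filters:
            buffer = f.process(buffer)
        outputs.extend(buffer)
    for f in filters:
        outputs.extend(f.finalize())
    return outputs


if __name__ == "__main__":
    print(run_pipeline([Threshold(5), Sum()], list(range(10))))
```

The filter list passed to `run_pipeline` plays the role of the manually specified connectivity: the order of the list is the order in which buffers flow.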
Image Processing for the Grid
• A toolkit that makes it easy to create parallel, distributed image processing applications.
• Built on the DataCutter framework.
• Ability to harness a heterogeneous grid of compute and storage hardware.
• Enables using Insight Segmentation and Registration Toolkit (ITK) and Visualization Toolkit (VTK) in a grid-based computation environment.
• Dynamic loading of filters
• XML based workflow model.
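An XML-based workflow model might describe which filters to load and how streams connect them. The slides do not show the toolkit's schema, so the element and attribute names below (`workflow`, `filter`, `stream`, `name`, `library`, `from`, `to`) and the library file names are invented purely for illustration; the parsing uses Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Hypothetical workflow description; this schema is an assumption, not the toolkit's.
WORKFLOW_XML = """
<workflow>
  <filter name="reader" library="libreader.so"/>
  <filter name="smooth" library="libsmooth.so"/>
  <filter name="writer" library="libwriter.so"/>
  <stream from="reader" to="smooth"/>
  <stream from="smooth" to="writer"/>
</workflow>
"""


def load_workflow(text: str) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Parse filters (name -> library to load) and streams (source, destination)."""
    root = ET.fromstring(text.strip())
    filters = {f.get("name"): f.get("library") for f in root.findall("filter")}
    streams = [(s.get("from"), s.get("to")) for s in root.findall("stream")]
    for source, destination in streams:
        if source not in filters or destination not in filters:
            raise ValueError(f"stream {source} -> {destination} names an unknown filter")
    return filters, streams


if __name__ == "__main__":
    filters, streams = load_workflow(WORKFLOW_XML)
    print(filters)
    print(streams)
```

Naming a library per filter is one way a description like this could support dynamic loading: the runtime would load each named library on whichever host the filter is placed.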