Biomedical Informatics Middleware, caGrid and Design Templates
Joel Saltz, MD, PhD, Chair, Department of Biomedical Informatics, OSU College of Medicine/OSUCCC
Outline of Talk • Design Templates in Translational Research • caGrid Roadmap and Middleware support for Design Templates
Pattern Language
Design Patterns
• Patterns serve as a good team communications medium
• Patterns are extracted from working designs
• Patterns capture the essential parts of a design in compact form
• Patterns can be used to record and encourage reuse of “best practices”
• Good patterns are difficult and time consuming to write
(1996)
Design Templates
for Translational Research
• Coordinated Systems-Level Attack on Focused Problem • Prospective clinical research study • Multiscale Investigations that encompass genomics, epigenetics, (micro)anatomic structure and function • Secondary Data Analysis • Adaptive Image Guided Intervention • Ad-hoc discovery, query, invocation of discrete services
caGrid in One Slide
• caGrid Components • Language (metadata, ontologies) • Security (GAARDS) • Advertisement and Discovery • Workflow • Grid Service Graphical Development Toolkit (Introduce) • Efficient Bulk Data Transport (IVI middleware) • DICOM compatibility (IVI middleware)
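The Advertisement and Discovery component listed above can be illustrated with a toy example. This is only a minimal sketch, assuming an in-memory index: the `ServiceAd` and `Registry` names, the concept labels, and the example.org URLs are all hypothetical and are not part of the caGrid API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceAd:
    """Hypothetical advertisement record; not a caGrid data structure."""
    url: str
    concepts: frozenset  # semantic concepts the service says it exposes


@dataclass
class Registry:
    """Toy in-memory index standing in for a discovery service."""
    ads: list = field(default_factory=list)

    def advertise(self, url, concepts):
        # A service publishes its metadata so that clients can find it later.
        self.ads.append(ServiceAd(url, frozenset(concepts)))

    def discover(self, *required):
        # Return URLs of services whose metadata covers every required concept.
        wanted = set(required)
        return [ad.url for ad in self.ads if wanted <= ad.concepts]


registry = Registry()
registry.advertise("https://example.org/imaging", {"Image", "Patient"})
registry.advertise("https://example.org/genomics", {"Gene", "Patient"})
found = registry.discover("Patient", "Gene")
```

The point of the sketch is that clients search by what a service describes (its metadata), not by a hard-coded address.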
Design Template: Coordinated Systems-Level Attack on Focused Problem
The Cardiovascular Research Grid (PI Rai Winslow -- JHU, OSU, UCSD)
The D. W. Reynolds Cardiovascular Clinical Research Center
Who should receive ICDs?
Large patient cohort (~ 1,200) at high risk for sudden cardiac death
Genetic Variability Gene Expression Profiling
• All have CAD, LV dysfunction, received ICD placement
• Multi-scale data from each patient
• Patients with appropriate ICD firings are defined as high risk, patients without as low risk
• Challenge – discover biomarkers that are predictive of high risk
Protein Expression Electrophysiological Profiling Data
Multi-Modal Imaging Data Analysis And Modeling
Test biomarkers on novel (~500) patient population
Algorithmic challenge: Statistical Learning With Multi-Scale Cardiovascular Data
• Goal – predict risk of SCD and therefore identify patients to receive ICDs
• Develop learning methods that work in the “small sample regime”
• Develop methods capable of combining features across different levels of biological organization: SNP, transcriptome, proteome, electrophysiological (ECG), imaging, clinical
• caGrid middleware integrated with BIRN Portal
Slide adapted from Rai Winslow
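The two learning goals above (few patients, features drawn from several biological scales) can be made concrete with a small sketch. It is not the method used by the Cardiovascular Research Grid: a nearest-centroid classifier with leave-one-out validation is swapped in purely for illustration, and every feature name and value below is made up.

```python
import statistics


def fit_scaler(rows):
    """Per-feature mean and standard deviation, estimated on training rows only."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return means, sds


def scale(row, means, sds):
    """Put features measured in different units on a comparable scale."""
    return [(v - m) / s for v, m, s in zip(row, means, sds)]


def predict(train_x, train_y, x):
    """Nearest-centroid rule: assign the label of the closest class mean."""
    best = None
    for label in sorted(set(train_y)):
        members = [r for r, y in zip(train_x, train_y) if y == label]
        center = [statistics.fmean(c) for c in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(x, center))
        if best is None or dist < best[0]:
            best = (dist, label)
    return best[1]


def leave_one_out_accuracy(x, y):
    """Hold out each patient in turn, a common choice when samples are scarce."""
    hits = 0
    for i in range(len(x)):
        train_x, train_y = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        means, sds = fit_scaler(train_x)
        scaled = [scale(r, means, sds) for r in train_x]
        hits += predict(scaled, train_y, scale(x[i], means, sds)) == y[i]
    return hits / len(x)


# Made-up toy values, one entry per patient; each block is a different scale.
genomic = [[0], [1], [2], [2], [0], [1]]                       # hypothetical risk-allele count
ecg = [[410.0], [430.0], [470.0], [480.0], [400.0], [460.0]]   # hypothetical QT interval, ms
imaging = [[38.0], [35.0], [25.0], [22.0], [40.0], [28.0]]     # hypothetical LV ejection fraction, %
labels = [0, 0, 1, 1, 0, 1]  # 1 = appropriate ICD firing (high risk)

# Combine scales by concatenating each patient's feature blocks.
features = [g + e + i for g, e, i in zip(genomic, ecg, imaging)]
accuracy = leave_one_out_accuracy(features, labels)
```

Concatenation is the simplest way to combine levels of biological organization. The scaler is refit inside each fold so the held-out patient never influences the standardization.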
Design Template: Multiscale Translational Research • Investigations that encompass genomics, epigenetics, (micro)anatomic structure and function • Tools include microscopy, Radiology imaging, high throughput molecular analyses, simulation
• Tumor microenvironment characterization • Brain and nervous system function • Cardiac arrhythmia research • Nervous system regeneration
Tumor Microenvironment
• Cancer is a complex phenomenon • A tumor is an organ • Structural and functional differentiation within tumor • Molecular pathways are time and space dependent • “Field effects” – gradient of genetic, epigenetic changes • Anatomy, physiology, molecular biology of cancer
Tumor microenvironment research: True Multiscale Information Integration
Tumor Microenvironment
Team led by Kun Huang, OSU
Brain Atlas: Pipeline to Support Atlas-Based Query of Gene Expression in Brain (joint with BIRN/UCSD)
Design Template: Prospective clinical research studies • Osteoarthritis Initiative • Most studies carried out by Cancer cooperative groups • Framingham Study • Women’s Health Initiative
caGrid Roadmap and Support for Design Templates
• Data and Analytic Services
  • Integrate existing relational, XML database systems, parallel database and file systems, OWL/RDF databases, high-performance Grid nodes, multi-core systems, on-demand computing, data-intensive computing
• Support for Federated Query and Orchestration of Grid Services
  • Federated query support, semantic query support, enhanced workflow support, HPC issues: coordination and communication between parallel services, large data transfers, mapping to fine-grain workflows on an HPC Grid node, interaction and relationship between coarse-grain Grid workflows and fine-grain dataflows/workflows
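The federated query idea above can be sketched as a fan-out and merge. This is a minimal illustration under stated assumptions, not caGrid's federated query engine: the `make_service` and `federated_query` helpers, the site names, and the records are hypothetical, and in-process callables stand in for remote Grid data services.

```python
from concurrent.futures import ThreadPoolExecutor


def make_service(name, records):
    """Return a callable standing in for a remote data service (hypothetical)."""
    def query(predicate):
        # Tag each matching record with its origin so provenance survives the merge.
        return [dict(r, source=name) for r in records if predicate(r)]
    return query


def federated_query(services, predicate):
    """Send the same predicate to every service in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=len(services)) as pool:
        partials = pool.map(lambda svc: svc(predicate), services)
        # pool.map yields results in the order the services were given.
        return [row for part in partials for row in part]


# Made-up records held at two hypothetical sites.
site_a = make_service("site_a", [{"id": 1, "age": 61}, {"id": 2, "age": 47}])
site_b = make_service("site_b", [{"id": 3, "age": 70}])
rows = federated_query([site_a, site_b], lambda r: r["age"] > 60)
```

Each site evaluates the predicate against its own data, so only matching rows travel back to the caller. A real deployment would also need to handle large data transfers and coordination between parallel services, as noted under HPC issues.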
caGrid Roadmap
• Semantic Infrastructure
  • Semantic annotations for services, relationship between semantics and data structures, systematic curation vs community freedom, semantic query support
• Interoperability
  • Support for enterprise service buses (ESB), enterprise systems, other grids, IHE
  • Support for transactions such as clinical reminders, critical action value handling, adverse event reporting
caGrid Roadmap
• Security
  • Compliance with Regulatory and Federal eAuthentication Guidelines, establishment of Grid-wide User Directory, security interoperability with institutional security frameworks, enterprise systems
• Governance of caGrid Middleware Development
  • Management of development of core caGrid system, community contributed extensions and components, release cycles, testing environment, caGrid and community application development on caGrid
caGrid and Translational Biomedical Informatics
• caGrid is intended to support data federation, analytical service invocation, metadata management, and security requirements driven by translational research design templates
• caGrid is funded by NCI, but the infrastructure can be used in any translational research effort
• The design templates are not disease specific (and many can easily be extended beyond biomedicine)
• There is an ongoing community caGrid Roadmap process to architect future software
• Technical details are not difficult to find: high-level descriptions of caGrid and the caGrid security infrastructure are available as 2008 JAMIA articles (Langella et al., Oster et al.); caGrid source code and documentation are available online
Thank you
Acknowledgments
• The caGrid team (caGrid 1.0): Scott Oster, Stephen Langella, Shannon Hastings, David Ervin, Ravi Madduri, Tahsin Kurc, Frank Siebenlist, Ian Foster, Krishnakant Shanbhag, Peter Covitz
• OSU Imaging Informatics/HPC team: Berkant Barla Cambazoglu Ph.D., Umit V. Catalyurek Ph.D., Metin N. Gurcan Ph.D., Kun Huang Ph.D., Tony C. Pan M.S., Ashish Sharma Ph.D., Manuel Ujaldon Ph.D., Olcay Sertel, Antonio Ruiz, Vijay Kumar, Raghu Machiraju
• Jim Purdy, Walter Bosch from Advanced Technology Consortium
• Eliot Siegel, Paul Mulhorn, Michael McNitt-Gray, all SMEs and participants in the caBIG in-vivo imaging workspace