Virtual Data Language exemplar: MRI atlas generation
Mike Wilde, University of Chicago / Argonne National Lab
Virtual Data is a Large Team Effort
The GriPhyN Virtual Data System is the work of Ian Foster, Jens Voeckler, Mike Wilde and Yong Zhao, University of Chicago and Argonne National Laboratory, and Ewa Deelman, Gaurang Mehta, and Karan Vahi, USC Information Sciences Institute.
MRI applications and datasets were provided by Jack Van Horn and Jed Dobson, fMRI Data Center, Dartmouth College.
MICCAI Tutorial
www.griphyn.org/chimera
18 Sep 2004
Virtual Data Describes analysis workflow
[Figure: workflow graph in which file1 through file8 are linked by the program invocations "simulate -t 10 …", "reformat -f fz …", "conv -I esd -o aod", "summarize -t 10 …" and "psearch -t 10 …"; one file is labeled "Requested dataset"]
The recorded virtual data “recipe” here is:
– Files: 8 < (1,3,4,5,7), 7 < 6, (3,4,5,6) < 2
– Programs: 8 < psearch, 7 < summarize, (3,4,5) < reformat, 6 < conv, (1,2) < simulate
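The recipe above can be read as a small dependency graph. The following is a minimal sketch, assuming that "8 < (1,3,4,5,7)" means file 8 is derived from files 1, 3, 4, 5 and 7, and that "8 < psearch" means file 8 is produced by psearch. The names FILE_DEPS, PRODUCER, ancestors and derivation_plan are hypothetical and are not part of the Virtual Data System.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# File number -> files it is derived from (assumed reading of the "<" notation).
FILE_DEPS = {
    8: {1, 3, 4, 5, 7},
    7: {6},
    6: {2},
    5: {2},
    4: {2},
    3: {2},
    2: set(),
    1: set(),
}

# File number -> program that produces it.
PRODUCER = {
    8: "psearch",
    7: "summarize",
    6: "conv",
    5: "reformat",
    4: "reformat",
    3: "reformat",
    2: "simulate",
    1: "simulate",
}


def ancestors(target):
    """Return every file that must exist before `target` can be derived."""
    seen = set()
    stack = [target]
    while stack:
        for dep in FILE_DEPS[stack.pop()]:
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def derivation_plan(target):
    """List (file, program) pairs in an order that respects the dependencies."""
    needed = ancestors(target) | {target}
    subgraph = {f: FILE_DEPS[f] & needed for f in needed}
    order = TopologicalSorter(subgraph).static_order()
    return [(f, PRODUCER[f]) for f in order]


if __name__ == "__main__":
    for file_no, program in derivation_plan(8):
        print(f"file{file_no} <- {program}")
```

Requesting file 7 instead of file 8 yields a shorter plan, because only files 2 and 6 lie upstream of it.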
VDL: Virtual Data Language Describes Data Transformations
Transformation
– Abstract template of program invocation
– Similar to "function definition"
Derivation
– “Function call” to a Transformation
– Store past and future:
  > A record of how data products were generated
  > A recipe of how data products can be generated
Invocation
– Record of a Derivation execution
These XML documents reside in a “virtual data catalog” (VDC), a relational database
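The relationship between the three record types can be illustrated with a small in-memory model. This is only a sketch: the real catalog stores XML documents in a relational database, and the class names, fields and methods below (Catalog, define, derive, record, was_run) are hypothetical and do not reflect the actual VDC schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Transformation:
    """Abstract template of a program invocation (like a function definition)."""
    name: str
    params: list


@dataclass
class Derivation:
    """Binds actual arguments to a Transformation (like a function call)."""
    name: str
    transformation: str
    args: dict


@dataclass
class Invocation:
    """Record of one execution of a Derivation."""
    derivation: str
    exit_code: int


@dataclass
class Catalog:
    """Toy stand-in for a virtual data catalog, held in memory."""
    transformations: dict = field(default_factory=dict)
    derivations: dict = field(default_factory=dict)
    invocations: list = field(default_factory=list)

    def define(self, tr):
        self.transformations[tr.name] = tr

    def derive(self, dv):
        # A derivation must supply every parameter of its transformation.
        tr = self.transformations[dv.transformation]
        missing = set(tr.params) - set(dv.args)
        if missing:
            raise ValueError(f"missing arguments: {sorted(missing)}")
        self.derivations[dv.name] = dv

    def record(self, inv):
        if inv.derivation not in self.derivations:
            raise KeyError(inv.derivation)
        self.invocations.append(inv)

    def was_run(self, dv_name):
        """True if the derivation has at least one successful invocation."""
        return any(
            i.derivation == dv_name and i.exit_code == 0
            for i in self.invocations
        )
```

In this model a derivation without an invocation acts as a recipe for data that can still be produced, and a derivation with a successful invocation acts as a record of data that was produced.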
VDL Describes Workflow via Data Dependencies
TR tr1(in a1, out a2) {
  argument stdin = ${a1};
  argument stdout = ${a2};
}
TR tr2(in a1, out a2) {
  argument stdin = ${a1};
  argument stdout = ${a2};
}
DV x1->tr1(a1=@{in:file1}, a2=@{out:file2});
DV x2->tr2(a1=@{in:file2}, a2=@{out:file3});

file1 -> x1 -> file2 -> x2 -> file3
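The ordering of x1 before x2 is never stated explicitly; it follows from file2 being an output of one derivation and an input of the other. A minimal sketch of that inference, assuming each derivation is reduced to its input and output file lists (the dictionary layout and function names are hypothetical, not part of VDL):

```python
# Each derivation reduced to the files it reads and writes.
DERIVATIONS = {
    "x1": {"tr": "tr1", "in": ["file1"], "out": ["file2"]},
    "x2": {"tr": "tr2", "in": ["file2"], "out": ["file3"]},
}


def dependency_edges(derivations):
    """Return (producer, consumer) pairs implied by shared files."""
    producer = {
        f: name for name, dv in derivations.items() for f in dv["out"]
    }
    edges = set()
    for name, dv in derivations.items():
        for f in dv["in"]:
            if f in producer:
                edges.add((producer[f], name))
    return edges


def external_inputs(derivations):
    """Return files that are consumed but produced by no derivation."""
    produced = {f for dv in derivations.values() for f in dv["out"]}
    consumed = {f for dv in derivations.values() for f in dv["in"]}
    return consumed - produced


if __name__ == "__main__":
    print(dependency_edges(DERIVATIONS))
    print(external_inputs(DERIVATIONS))
```

Files returned by external_inputs are the ones that must already exist before the workflow can start.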
Image analysis algorithms used
UCLA Automatic Image Registration (AIR)
http://bishopw.loni.ucla.edu/AIR5/index.html
MRI ATLAS Generation Workflow
[Figure: workflow graph; each numbered job is listed with the files shown beneath it]
Inputs: 3a.h 3a.i, 4a.h 4a.i, 5a.h 5a.i, 6a.h 6a.i, ref.h ref.i
align_warp/1 -> 3a.w
align_warp/3 -> 4a.w
align_warp/5 -> 5a.w
align_warp/7 -> 6a.w
reslice/2 -> 3a.s.h 3a.s.i
reslice/4 -> 4a.s.h 4a.s.i
reslice/6 -> 5a.s.h 5a.s.i
reslice/8 -> 6a.s.h 6a.s.i
softmean/9 -> atlas.h atlas.i
slicer/10 -> atlas_x.ppm
slicer/12 -> atlas_y.ppm
slicer/14 -> atlas_z.ppm
convert/11 -> atlas_x.jpg
convert/13 -> atlas_y.jpg
convert/15 -> atlas_z.jpg
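The job and file names above follow a regular pattern, so the same workflow shape can be generated for any number of subjects. The sketch below reproduces the fifteen job names for the four subjects shown. The input files attached to each job are an assumption, since the figure's edges did not survive extraction, and the function name atlas_workflow is hypothetical.

```python
def atlas_workflow(subjects, axes=("x", "y", "z")):
    """Build the atlas job list; jobs are numbered in creation order."""
    jobs = []

    def add(program, inputs, outputs):
        jobs.append({
            "id": f"{program}/{len(jobs) + 1}",
            "inputs": inputs,
            "outputs": outputs,
        })

    sliced = []
    for s in subjects:
        # Assumed inputs: the subject image pair plus the reference pair.
        add("align_warp", [f"{s}.h", f"{s}.i", "ref.h", "ref.i"], [f"{s}.w"])
        add("reslice", [f"{s}.w"], [f"{s}.s.h", f"{s}.s.i"])
        sliced += [f"{s}.s.h", f"{s}.s.i"]

    # One averaging step over all resliced images produces the atlas.
    add("softmean", sliced, ["atlas.h", "atlas.i"])

    for ax in axes:
        add("slicer", ["atlas.h", "atlas.i"], [f"atlas_{ax}.ppm"])
        add("convert", [f"atlas_{ax}.ppm"], [f"atlas_{ax}.jpg"])
    return jobs


if __name__ == "__main__":
    for job in atlas_workflow(["3a", "4a", "5a", "6a"]):
        print(job["id"], "->", " ".join(job["outputs"]))
```

With N subjects this layout yields 2N + 7 jobs, of which only the softmean step depends on every subject.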
Deployment on Grid3
International Data Grid with dozens of sites and thousands of processors
Operated jointly by the U.S. Grid projects iVDGL, GriPhyN and PPDG, and the U.S. participants in the LHC experiments ATLAS and CMS
Project highlights include:
– Participation by more than 25 sites across the US and Korea, which collectively provide more than 2000 CPUs
– Resources used by 7 different scientific applications, including 3 high energy physics simulations and 4 data analyses in high energy physics, biochemistry, astrophysics and astronomy
– More than 100 individuals currently registered with access to the Grid
– A peak throughput of 500-900 jobs running concurrently, with a completion efficiency of approximately 75%
Virtual Data Language Enables Grid-wide distributed computation
[Figure: Site Status on 11/19/03 (http://www.ivdgl.org/grid2003)]
Grid 3 this morning…
Virtual Data Language Enables Grid-wide distributed computation
Atlases from 25, 100, and 590 image datasets processed on distributed grid resources
Test virtual data workflow locally, run same VDL code on wide-area Grid
MRI Example: AIR Tools
TR air::align_warp( in reg_img, in reg_hdr, in sub_img, in sub_hdr, m, out warp ) {
  argument = ${reg_img};
  argument = ${sub_img};
  argument = ${warp};
  argument = "-m " ${m};
  argument = "-q";
}
TR air::reslice( in warp, sliced, out sliced_img, out sliced_hdr ) {
  argument = ${warp};
  argument = ${sliced};
}
TR air::warp_n_slice( in reg_img, in reg_hdr, in sub_img, in sub_hdr, m = "12", io warp, sliced, out sliced_img, out sliced_hdr ) {
  call air::align_warp( reg_img=${reg_img}, reg_hdr=${reg_hdr}, sub_img=${sub_img}, sub_hdr=${sub_hdr}, m=${m}, warp = ${out:warp} );
  call air::reslice( warp=${in:warp}, sliced=${sliced}, sliced_img=${sliced_img}, sliced_hdr=${sliced_hdr} );
}
TR air::softmean( in sliced_img[], in sliced_hdr[], arg1 = "y", arg2 = "null", atlas, out atlas_img, out atlas_hdr ) {
  argument = ${atlas};
  argument = ${arg1} " " ${arg2};
  argument = ${sliced_img};
}
MRI Example: AIR Tools
DV air::i3472_3->air::warp_n_slice(
  reg_hdr = @{in:"3472-3_anonymized.hdr"},
  reg_img = @{in:"3472-3_anonymized.img"},
  sub_hdr = @{in:"3472-3_anonymized.hdr"},
  sub_img = @{in:"3472-3_anonymized.img"},
  warp = @{io:"3472-3_anonymized.warp"},
  sliced = "3472-3_anonymized.sliced",
  sliced_hdr = @{out:"3472-3_anonymized.sliced.hdr"},
  sliced_img = @{out:"3472-3_anonymized.sliced.img"}
);
DV air::i3472_4->air::warp_n_slice(
  reg_hdr = @{in:"3472-3_anonymized.hdr"},
  reg_img = @{in:"3472-3_anonymized.img"},
  sub_hdr = @{in:"3472-4_anonymized.hdr"},
  sub_img = @{in:"3472-4_anonymized.img"},
  warp = @{io:"3472-4_anonymized.warp"},
  sliced = "3472-4_anonymized.sliced",
  sliced_hdr = @{out:"3472-4_anonymized.sliced.hdr"},
  sliced_img = @{out:"3472-4_anonymized.sliced.img"}
);
…
DV air::i3472_6->air::warp_n_slice(
  reg_hdr = @{in:"3472-3_anonymized.hdr"},
  reg_img = @{in:"3472-3_anonymized.img"},
  sub_hdr = @{in:"3472-6_anonymized.hdr"},
  sub_img = @{in:"3472-6_anonymized.img"},
  warp = @{io:"3472-6_anonymized.warp"},
  sliced = "3472-6_anonymized.sliced",
  sliced_hdr = @{out:"3472-6_anonymized.sliced.hdr"},
  sliced_img = @{out:"3472-6_anonymized.sliced.img"}
);
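These derivations differ only in the subject dataset id, so for larger collections they are natural candidates for generation by a script. The following is a minimal sketch that renders one warp_n_slice derivation as text in the layout used above. The function name warp_n_slice_dv is hypothetical, and the naming rules (an "i" prefix with hyphens mapped to underscores, and the "_anonymized" file suffix) are inferred from the examples rather than taken from any documented convention.

```python
def warp_n_slice_dv(subject, reference, namespace="air"):
    """Render one warp_n_slice derivation as VDL text.

    `subject` and `reference` are dataset ids such as "3472-4".
    """
    name = "i" + subject.replace("-", "_")
    ref = f"{reference}_anonymized"
    sub = f"{subject}_anonymized"
    lines = [
        f"DV {namespace}::{name}->{namespace}::warp_n_slice(",
        f'  reg_hdr = @{{in:"{ref}.hdr"}},',
        f'  reg_img = @{{in:"{ref}.img"}},',
        f'  sub_hdr = @{{in:"{sub}.hdr"}},',
        f'  sub_img = @{{in:"{sub}.img"}},',
        f'  warp = @{{io:"{sub}.warp"}},',
        f'  sliced = "{sub}.sliced",',
        f'  sliced_hdr = @{{out:"{sub}.sliced.hdr"}},',
        f'  sliced_img = @{{out:"{sub}.sliced.img"}}',
        ");",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Emit one derivation per subject, all registered against the same reference.
    for subject in ["3472-3", "3472-4", "3472-5", "3472-6"]:
        print(warp_n_slice_dv(subject, reference="3472-3"))
```

The generated text would still need to be loaded into the virtual data catalog with the system's own tools, which this sketch does not cover.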
MRI Example: AIR Tools
…
DV air::a3472_3->air::softmean(
  sliced_img = [
    @{in:"3472-3_anonymized.sliced.img"},
    @{in:"3472-4_anonymized.sliced.img"},
    @{in:"3472-5_anonymized.sliced.img"},
    @{in:"3472-6_anonymized.sliced.img"}
  ],
  sliced_hdr = [
    @{in:"3472-3_anonymized.sliced.hdr"},
    @{in:"3472-4_anonymized.sliced.hdr"},
    @{in:"3472-5_anonymized.sliced.hdr"},
    @{in:"3472-6_anonymized.sliced.hdr"}
  ],
  atlas = "atlas",
  atlas_img = @{out:"atlas.img"},
  atlas_hdr = @{out:"atlas.hdr"}
);
For Information and Software
Virtual Data System
– www.griphyn.org/chimera: Overview, papers, software
Grids and Grid Software
– www.globus.org - The Globus Toolkit
– www.cs.wisc.edu/condor - The Condor Project
– www.griphyn.org/vdt - Virtual Data Toolkit
– www.ivdgl.org/grid2003 - Using Grid3
Acknowledgements: Virtual Data is a Large Team Effort
The GriPhyN Virtual Data System is the work of Ian Foster, Jens Voeckler, Mike Wilde and Yong Zhao, University of Chicago and Argonne National Laboratory, and Ewa Deelman, Gaurang Mehta, and Karan Vahi, USC Information Sciences Institute.
fMRI applications and datasets were provided by Jack Van Horn and Jed Dobson, fMRI Data Center, Dartmouth College.
Acknowledgements
GriPhyN and iVDGL are supported by the National Science Foundation
The Globus Alliance and PPDG are supported in part by the US Department of Energy, Office of Science; by the NASA Information Power Grid program; and by IBM