Scalability Requirements and
Implementation Options
Big Picture Goals for BIRN
• Provide IT Infrastructure for Biomedical
Research
– Best Practices from CS/Grid Community
– Adapted to Biomedical Needs
– Highly Available and Scalable
• Provide High Speed Access
– Internet2 and TeraGrid
– Overflow to Commercially Available Services
when Research Facilities are Overloaded
Current Implementation
• Relies on Dedicated Hardware Resources
Per Site
– To Ensure Compatibility
– Consistent Target for Software Stack
• Requires Dedicated HW/SW Support at
BIRN-CC and Site
• Not Scalable Across Sites or Within
Institutions
– Finite Disk and Compute Resources per Rack
– Custom Installations, Troubleshooting
Goals for a Virtualized BIRN Infrastructure
• Meet Specifications of Current BIRN
– Security, Audit Trails, Access Controls
– BIRN Application Software Stack
• Migrate to Commodity Computing
– Machine Instances Created as Needed
– Distributed, Redundant Data Centers
– Outsource Facilities, Maintenance, Administration…
• Provide Predictable Costs and Availability
to Biomedical Research Customers
Amazon EC2 / S3 as Possible
Implementation
• Known Costs per CPU Hour / Gigabyte of Data
Transferred
– Allows Detailed Budgeting for Grants
• BIRN Software Stack Provided on Boot Disk
Images
– Known Application Environment to Support Research
Needs
• BIRN Audits Accounts, Permissions
• All Administration is Virtual
– Avoid Hardware Headaches
– Effort Scales Easily
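The budgeting point above can be illustrated with a minimal sketch, assuming a flat rate per CPU hour and per gigabyte transferred. The names `Rates` and `estimate_cost` are hypothetical, and the rate values in the usage lines are placeholders, not actual Amazon prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Flat usage rates in US dollars (placeholder values, not real prices)."""
    cpu_hour_usd: float
    transfer_gb_usd: float


def estimate_cost(rates: Rates, cpu_hours: float, transfer_gb: float) -> float:
    """Return the estimated cost for a given amount of compute and transfer."""
    compute = cpu_hours * rates.cpu_hour_usd
    transfer = transfer_gb * rates.transfer_gb_usd
    # Round to cents so the figure can be dropped into a grant budget line.
    return round(compute + transfer, 2)


# Example: 1000 CPU hours and 500 GB transferred at assumed rates.
example_rates = Rates(cpu_hour_usd=0.10, transfer_gb_usd=0.20)
example_total = estimate_cost(example_rates, cpu_hours=1000, transfer_gb=500)
```

Because both rates are known in advance, a project could plug in its expected usage per grant year and obtain a line-item estimate before any resources are provisioned.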
Practical Requirements
• BIRN Rocks Distributions Ported to Xen
• S3FS FUSE Module Needed
– Likely to be Developed by Community
– Not yet available
– May Require BIRN Extensions
• Make BIRN Authentication Control Login to
Virtual BIRN Servers
• Provide Rights-Based Access to / Audit of S3FS
Data Resources Based on BIRN Credentials
• Provide User Tools and Best Practices to
Integrate with Workstations and Desktops Inside
Firewalls
– sshfs, sftpdisk, rsync, unison…
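The rights-based access and audit requirement above could look roughly like the following sketch. It is an illustration only: the class `AccessPolicy`, its methods, and the example paths are hypothetical and do not describe any existing BIRN or S3FS interface. It assumes rights are granted per user on a path prefix and that every check, allowed or denied, is recorded.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class AccessPolicy:
    # (user, path prefix) -> set of rights such as {"read", "write"}
    grants: Dict[Tuple[str, str], Set[str]] = field(default_factory=dict)
    # Append-only audit trail of (user, path, right, allowed)
    audit_log: List[Tuple[str, str, str, bool]] = field(default_factory=list)

    def grant(self, user: str, prefix: str, right: str) -> None:
        """Give `user` the named right on everything under `prefix`."""
        self.grants.setdefault((user, prefix), set()).add(right)

    def check(self, user: str, path: str, right: str) -> bool:
        """Decide whether the access is allowed and record the decision."""
        allowed = any(
            u == user and path.startswith(prefix) and right in rights
            for (u, prefix), rights in self.grants.items()
        )
        self.audit_log.append((user, path, right, allowed))
        return allowed


# Example with hypothetical user names and paths.
policy = AccessPolicy()
policy.grant("alice", "/birn/projA/", "read")
can_read = policy.check("alice", "/birn/projA/scan1.img", "read")
can_write = policy.check("alice", "/birn/projA/scan1.img", "write")
```

In a real deployment the user identity would come from BIRN credentials rather than a plain string, and the audit trail would be written to durable storage instead of an in-memory list.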
To Be Determined…
• Feasibility of S3FS
• Performance of S3/EC2
• Migration Path for BIRN Users
• Other Unknowns…