Virtualization & Life Science R&D
2010 BioIT World Web Symposia Series
http://www.bio-itworldsymposia.com/Bioitsymposia_Content.aspx?id=97682




                                                 chris@bioteam.net - http://www.bioteam.net
Who am I?
• I’m from the BioTeam
  ▫ Independent consulting shop
  ▫ Staffed by scientists forced to learn
    IT to get our own research done

• Found a fun business niche
  ▫ Bridging the “gap” between
    science, IT & high performance
    computing

• This matters today because …
  ▫ We work daily in pharma, biotech,
    academic, government & military
    sites
      … often we can talk/share our
       experiences
  ▫ We have no financial
    entanglements with providers of
    virtualization products

Disclaimer
• When it comes to virtualization
  I’m beholden to nobody
• Plan on speaking only on what
  I’ve done, seen and
  implemented
• Think of me as an on-the-
  ground spy reporting on trends
  and applications, not a pundit,
  “expert” or “visionary”
  ▫ No excuses for not doing
    your own due diligence




Overview
I am not planning on …
• Explaining what virtualization is
• Selling you on a particular product or strategy
• Covering the pros & cons in any vague or generic sense

I do plan to talk
• … specifically about products and strategies we often see in research IT
  environments
• What I feel are the best benefits to IT and the researcher
• Will also be specific about what my own company is doing


Meta-Topics for Today
• Why virtualize?
• Benefits: IT Organizations
• Benefits: Research Organizations
• Benefits: Everyone
• Some practical advice




Reasons to virtualize
Hype aside, what are the real benefits?




Why virtualize?
Three main areas of benefit …

• Enhances Research
• Enhances IT Operations
• Saves Money




Why virtualize?
Enhancing research

1. Gives researchers new capabilities
2. Increases the speed at which researchers can
   gain access to existing capabilities
3. Ideal meeting ground for when informatics & IT
   have potentially conflicting goals




Why virtualize?
Research: New Capabilities

• Desktop level virtualization gives researchers a
  manageable way to escape the confines of a
  managed corporate desktop OS image
• Uses:
   ▫ Access to Linux
   ▫ Sandbox environment for software dev & testing
   ▫ Testing of entire workflows & even small compute
     farms


Linux virtualization on Apple OS X




Why virtualize?
Research: Extend/Enhance Existing Capabilities

• Generally boils down to being able to do things
  faster
   ▫ Cliché example is rapidly provisioning new servers
     for research informatics workflows or development
      ▪ Weeks or months to provision if hardware needs to
        be purchased & installed
      ▪ Days or weeks to provision if hardware is already
        onsite



Why virtualize?
Research: Extend/Enhance Existing Capabilities

• Also a nice way for Enterprise IT to meet Research IT
  halfway:
   ▫ Need: scientists often use webservers, databases and
     webapps for simple sharing of data among
     workgroups & projects
   ▫ Problem: scientists don’t typically code for security
     and their apps rarely rise to the level of needing to be
     added formally to the enterprise software portfolio
   ▫ Solution: enterprise qualified, patched & secured
     LAMP stack for researchers

Why virtualize?
And this brings me to …

• Virtual OS images can be an ideal “middle
  ground” between research informatics &
  Enterprise IT
   ▫ Especially IT support organizations …




Why virtualize?
Finding Common Ground between Research & Enterprise IT
• Absolutely standard, non-controversial needs in
  many research environments:
   ▫ Access to Linux
   ▫ Ability to easily install software, libraries and patches
     via internet & other repositories without having to file a
     helpdesk ticket or ask someone for permission
   ▫ Elevated access privileges to handle file and owner
     permissions in a project or workgroup environment
   ▫ Desire to run web, database and application servers
   ▫ Elevated access privileges to control servers &
     services
   ▫ Quick & dirty code & scripts for short-term & one-off
     projects
Why virtualize?
Finding Common Ground between Research & Enterprise IT

• Given the laundry list of requirements in the
  previous slide, how can Enterprise IT possibly
  support these crazy R&D people who need root
  access and write bad or insecure code?

   ▫ Provisioning enterprise-approved VM images
     makes this a manageable problem
   ▫ The VM image is also the ideal “line in the sand”
     when it comes to support

Why virtualize?
Finding Common Ground between Research & Enterprise IT

• The “blessed” Linux VM for researchers:
   ▫ Fully patched OS & kernel
   ▫ Already integrated with Active Directory
   ▫ Root password known only to IT
      ▪ Elevated privileges via managed /etc/sudoers file
      ▪ Anyone who needs sudo is allowed to have it
      ▪ Sudo actions logged to external host for audit trail
   ▫ Configurable periodic snapshot-based backups
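
The sudo arrangement on this slide could be expressed as two small config fragments. The sketch below renders them from Python. The group name `research` and the log host `loghost.example.com` are hypothetical, and it assumes sudo logs through syslog with rsyslog doing the forwarding, which the slides do not specify.

```python
"""Render the two config fragments behind a "blessed" research VM.

Assumptions (not from the slides): researchers belong to a directory
group called "research", sudo logs through syslog, and rsyslog forwards
those messages to a hypothetical host named loghost.example.com.
"""


def sudoers_fragment(group: str) -> str:
    """Return a sudoers drop-in granting full sudo to one group."""
    lines = [
        # Send sudo's own log records to the authpriv syslog facility.
        "Defaults syslog=authpriv",
        # Members of the group may run any command as any user.
        f"%{group} ALL=(ALL) ALL",
    ]
    return "\n".join(lines) + "\n"


def rsyslog_forward_rule(log_host: str, port: int = 514) -> str:
    """Return an rsyslog rule forwarding authpriv messages over TCP."""
    # "@@" means TCP in rsyslog's legacy forwarding syntax; "@" is UDP.
    return f"authpriv.* @@{log_host}:{port}\n"


if __name__ == "__main__":
    print(sudoers_fragment("research"), end="")
    print(rsyslog_forward_rule("loghost.example.com"), end="")
```

A rendered sudoers fragment should be checked with `visudo -c -f <file>` before it is baked into the image, since a syntax error in sudoers can lock everyone out of sudo.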


Why virtualize?
Finding Common Ground between Research & Enterprise IT

• The Middle Ground
   ▫ In exchange for having unlimited freedom on the
     Linux VM, researchers understand that they give
     up a certain amount of IT support & handholding
   ▫ If the researchers freeze, kill, crash or corrupt the
     Linux VM during their work:
      ▪ IT will restore the VM from the last known-good
        snapshot or backup image
      ▪ No individual troubleshooting or support: if they
        “break” the system they will be given an older
        working backup image, nothing else.
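
The restore policy above reduces to "pick the newest snapshot that IT has marked as good". A minimal sketch, assuming a hypothetical `Snapshot` record with a `known_good` flag (how that flag gets set, and the actual rollback call, depend on the virtualization platform and are not shown):

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical snapshot record; field names are illustrative."""
    name: str
    taken_at: datetime
    known_good: bool  # assumed to be set by IT after a health check


def last_known_good(snapshots: Iterable[Snapshot]) -> Optional[Snapshot]:
    """Return the newest snapshot flagged known-good, or None."""
    good = [s for s in snapshots if s.known_good]
    return max(good, key=lambda s: s.taken_at, default=None)


if __name__ == "__main__":
    history = [
        Snapshot("monday", datetime(2010, 5, 3), True),
        Snapshot("tuesday", datetime(2010, 5, 4), True),
        Snapshot("wednesday", datetime(2010, 5, 5), False),
    ]
    print(last_known_good(history).name)
```

Anything the researcher changed after the chosen snapshot is lost, which is exactly the trade the slide describes.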
Why virtualize?
Enhance IT Operations




Reduce Operational Burden
Remote mgmt. is non-trivial
1.   Remote power control
2.   Serial console switch
3.   Serial console cabling
4.   IP KVM Device


•    All these devices (and often
     more) are required to
     successfully provide a 100%
     “lights-out” remote
     management capability

“Lights-out” mgmt is baked into virtualized platforms




Manage your virtual IT with an Apple iPad (shown: DesktopConnect.app via SSH tunnel)




Why virtualize?
Enhance IT Operations

• The “lights-out” stuff is nice but not a huge win
   ▫ You still need remote power, serial & KVM to your
     hypervisor hosts after all!

• The real win is what comes next …




“Scriptable Infrastructure” is a BIG DEAL




      This single command will start a 5GB managed MySQL database in the Amazon
      cloud for $0.11/hour. The database is automatically patched, managed and
      backed up. Planned enhancements include auto-scaling & snapshots.
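
The command itself was a screenshot and is not reproduced here. As a rough illustration of the same idea, the sketch below requests a managed MySQL database through an API call instead of a command line. It assumes the boto3 SDK's RDS client and its `create_db_instance` call; the identifier, instance class and credentials are hypothetical, and current service limits and prices may differ from the figures quoted on the slide.

```python
"""Sketch: request a managed MySQL database through an API call.

This is not the command from the slide. It assumes the boto3 SDK's RDS
client, whose create_db_instance call accepts the keyword arguments
built below. All concrete values passed in are hypothetical.
"""


def build_request(identifier: str, storage_gb: int, instance_class: str,
                  user: str, password: str) -> dict:
    """Collect the keyword arguments for create_db_instance."""
    return {
        "DBInstanceIdentifier": identifier,
        "AllocatedStorage": storage_gb,  # size in GB
        "DBInstanceClass": instance_class,
        "Engine": "mysql",
        "MasterUsername": user,
        "MasterUserPassword": password,
    }


def launch_database(rds_client, **kwargs):
    """Send the request; rds_client would be e.g. boto3.client("rds")."""
    return rds_client.create_db_instance(**build_request(**kwargs))
```

Passing the client in as an argument keeps the function testable without credentials or network access, and the whole request lives in version control like any other script.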

Why virtualize?
Scriptable VM infrastructure!




And just because we can … (SSH host management via iPad)




Why virtualize?
Save Money

• Can yield significant financial savings
• Four main ways
  1. API & management tools reduce admin staff
     burden
  2. Green IT (performance per watt, avg. utilization)
  3. Reduced physical footprint
  4. Storage efficiencies (*ymmv)



Case Study
Large Cancer Research Institute

• On-campus research datacenter almost full
• Explosive growth in demand for CPU and
  storage
   ▫ Driven by next-gen DNA sequencing …
• No additional electrical power available
• No additional HVAC/cooling resources available

• What to do?

Case Study
Large Cancer Research Institute – “Virtual Colocation Facility”
Project
• Datacenter turned into “virtual colo” facility
   ▫ What customers?
      ▪ Campus research labs & PIs
• Started small, initial focus on server
  consolidation
   ▫ Replaces the biggest or least efficient boxes
• As the HVAC/power envelope grew, added serious
  central storage and backup systems

Case Study
Large Cancer Research Institute – “Virtual Colocation Facility”
Project
• Results?
   ▫ Moderate gains:
      ▪ Increase in admin staff productivity
      ▪ Better monitoring & reporting capability
   ▫ Major gains:
      ▪ Server consolidation & more efficient hardware
        drove cooling & power requirements way down
      ▪ Now well within the envelope the existing facility can handle
      ▪ Instant ROI when measured against the cost of new
        datacenter construction
Case Study
Large Cancer Research Institute – “Virtual Colocation Facility”
Project
• Additional gains for IT:
   ▫ Central storage system is “VM aware”
      ▪ More efficient disk utilization via thin provisioning
      ▪ Additional efficiency/backup gains with de-dupe and
        other content-optimization techniques
      ▪ Savings measured in many dozens of terabytes
      ▪ Each efficiency win on disk yields downstream
        efficiencies with tapes, backup & replication
        resources
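
A toy illustration of why de-dupe pays off when many VM images share the same OS blocks: count how many fixed-size blocks are actually unique. The 4 KB block size and SHA-256 fingerprints are assumptions made for the sketch; real storage arrays use their own chunking and hashing schemes.

```python
import hashlib
from typing import Tuple


def dedupe_stats(data: bytes, block_size: int = 4096) -> Tuple[int, int]:
    """Return (total_blocks, unique_blocks) for fixed-size blocks."""
    # Split the data into consecutive blocks of block_size bytes.
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    # Identical blocks hash to the same fingerprint, so the set size
    # is the number of blocks that would need to be stored once.
    unique = {hashlib.sha256(block).digest() for block in blocks}
    return len(blocks), len(unique)


if __name__ == "__main__":
    # Three identical "OS" blocks plus one block of distinct data.
    image = b"A" * 4096 * 3 + b"B" * 4096
    total, unique = dedupe_stats(image)
    print(f"{total} blocks written, {unique} blocks stored")
```

Every block that is not stored is also a block that never has to be backed up or replicated, which is the downstream effect the last bullet describes.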

Case Study
Large Cancer Research Institute – “Virtual Colocation Facility”
Project
• Additional gains for Researchers & PIs:
   ▫ Have the expansion (CPU & disk) that they
     needed
   ▫ Delegated management tools let researchers
     “own” and control their own servers
      ▪ Especially the systems they had managed pre-consolidation
        (an important political issue)
   ▫ Delegated management of storage pools also let
     researchers self-manage disk resources & quotas

Case Study
Large Cancer Research Institute – “Virtual Colocation Facility”
Project
• End result:
   ▫ Significant capability expansion within same
     physical, cooling & power envelope; no additional
     datacenter required
   ▫ Major centralized storage efficiency gains with thin
     provisioning & content optimization
   ▫ IT staff can be more productive
   ▫ Research staff have the same (if not more)
     administrative control over “their” systems

Additional Benefits
Items I could not wedge in elsewhere …




Virtualization is a step towards the cloud …
• As a hype-averse anti-marketing cynic even I
  see the value of cloud platforms
• Virtualizing local resources puts you one step
  closer to an easier cloud migration in the future
  ▫ Two main ways:
     ▪ Direct VM movement via commercial companies
       such as CloudSwitch.com
     ▪ Open Virtualization Format (“OVF”) is a no-brainer
       middle ground. I expect to see cloud providers
       supporting OVF import/export in the future*

Virtualization is audit/process friendly
• In audit-intensive environments it is
  straightforward to write the documents and
  idempotent deployment steps that will build a
  consistent OS image time and time again
• Combined with a good configuration
  management system you have an auditable
  system build process with change management
  & tracking built in
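
"Idempotent" here means a deployment step can be run any number of times and always leaves the system in the same state. A minimal sketch of one such step, using a hypothetical `ensure_line` helper that adds a line to a config file only if it is missing:

```python
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file; return True if changed."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # already in the desired state, nothing to do
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "example.conf"
        print(ensure_line(target, "setting = on"))  # first run changes the file
        print(ensure_line(target, "setting = on"))  # second run is a no-op
```

The returned flag is what makes the step auditable: a build log that records which steps reported a change shows exactly what each run did to the image.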



Virtualization is orchestration/CM friendly
• I made the same mistake with the BioTeam
  virtualization platform as I originally did with the
  Amazon Cloud in 2007-2009
  ▫ Mistake: wasting too much time and effort building
    & managing unique OS images for different roles
    & tasks
• Then I saw the light …
  ▫ OpsCode Chef (www.opscode.com)
     ▪ Bork!


Chef lets you …
Treat your infrastructure as code

•   Manage configuration as idempotent resources
•   Put resources together as recipes
•   Group recipes into roles
•   Track it all like source code
•   Configure your servers
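
The resource, recipe and role layering can be modelled in a few lines. This is a plain Python model of the idea, not Chef's own Ruby DSL or API, and every recipe, role and resource name in it is hypothetical:

```python
from typing import Dict, List, Set

# Hypothetical recipes: each is an ordered list of resources.
RECIPES: Dict[str, List[str]] = {
    "base": ["package:ntp", "service:ntp"],
    "apache": ["package:httpd", "service:httpd"],
    "mysql": ["package:mysql-server", "service:mysqld"],
}

# Hypothetical roles: each is an ordered list of recipes.
ROLES: Dict[str, List[str]] = {
    "webserver": ["base", "apache"],
    "lamp": ["base", "apache", "mysql"],
}


def expand_role(role: str) -> List[str]:
    """Flatten a role into its ordered, de-duplicated resource list."""
    resources: List[str] = []
    for recipe in ROLES[role]:
        for resource in RECIPES[recipe]:
            if resource not in resources:
                resources.append(resource)
    return resources


def converge(role: str, state: Set[str]) -> List[str]:
    """Apply any missing resources to `state`; return what changed."""
    changed = [r for r in expand_role(role) if r not in state]
    state.update(changed)
    return changed


if __name__ == "__main__":
    server: Set[str] = set()
    print(converge("lamp", server))  # first run applies every resource
    print(converge("lamp", server))  # second run changes nothing
```

Because the recipes and roles are just data in a file, they can be tracked, diffed and reviewed like source code, and the same generic OS image can be converged into whichever role is needed.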




As a result of Chef …
BioTeam now manages only 4 Linux AMIs in the cloud
• 32 & 64 bit CentOS Linux
• 32 & 64 bit Debian Linux
   ▫ … using chef-solo or Chef Server we can orchestrate any
     cloud server into any role we need in a matter of minutes.
• My internal IT “to-do” list involves BioTeam’s corporate
  XenServer VM platform:
   ▫ Build a single, patched and LDAP-aware stripped down
     CentOS image
   ▫ Orchestrate it via Chef
   ▫ If I do it correctly
      ▪ … my Chef recipes will work locally & on the cloud
      ▪ Infrastructure agnosticism rocks!
Recap




My $.02 on virtualization in R&D
In a nutshell …

• Ideal “middle ground” between research
  informatics efforts & enterprise IT groups
• Significant administrative burden savings
• Significant “Green IT” facility/footprint savings
• Potentially large gains in storage efficiency
• Cloud & Orchestration friendly




Some final advice
An attempt to provide some specific tips …




Final Advice - 1
• Consider hiring an expert
 ▫ This is not a new field; best practices exist
 ▫ Don’t waste time making the mistakes that others
   have already discovered
• Involve your storage architects from day 1
 ▫ Many of the virtualization benefits are realized on
   top of a solid shared storage pool




Final Advice - 2
• VMware is not the only game in town
 ▫ There are worthy competitors out there
• Commercial is not your only option
 ▫ BioTeam uses the free Citrix XenServer platform
 ▫ … does 100% of what we require
 ▫ … and 90% of what we would “like to have”




Final Advice - 3
• Don’t get fooled by hardware vendors & price tags
 ▫ Yes, for many large enterprise projects it DOES
   make sense to use Tier1 server and storage
   products …
 ▫ This is not true for all projects and all use-cases
• BioTeam runs its lab & business operations using:
 ▫ Citrix XenServer (free)
 ▫ Server iron from SiliconMechanics.com
 ▫ OpenFiler NAS software (free) for our storage pools
    ▪ Lots of good high and midrange storage options out
      there (NexSan, etc.)
end;
Questions?



Comments/feedback: chris@bioteam.net

Thanks!



