Cambridge, MA USA
Amazon Web Services
Schedule & Location
• Half-Day “AWS for Sysadmins” - Not yet scheduled
• Day One “AWS Architecture” - March 3, 2010
• Day Two “AWS Architecture” - March 8, 2010
Technical training can be incredibly boring without interactive components and opportunities for self-paced learning.
We are trying to balance the syllabus in order to get the right level of content delivered without turning the session
into a 100% PowerPoint-driven, sleep-inducing lecture. Our decisions regarding treatment of classroom exercises,
live demos and recorded screencasts are based largely on estimated class size. Right now we are operating under
the assumption that the class size for the Half-Day “Sysadmin” will be small enough to allow focus on heavy
interactivity and student exercises. We are also assuming that the attendee count for the 2-Day session will be large
enough that the sessions will be more “lecture style” and will make heavy use of live instructor-led demos and
screencasts rather than individual labs and tasks for each student.
Feedback on the proposed syllabus is appreciated:
• Chris Dagdigian <email@example.com>
• Adam Kraut <firstname.lastname@example.org>
Internet access from the training facility will be required.
Attendees of the Half-Day “AWS for Sysadmins” session should bring laptops. Mac OS X or Linux systems should
have Java 1.5 (JDK or SDK) installed and available so that the AWS command line tools can be installed and
configured. For Windows users, an SSH client such as PuTTY should be present so that login sessions to a BioTeam
training Linux server can be established.
Ideally all attendees should already have personal accounts set up with Amazon Web Services
(http://aws.amazon.com). This is not required for training as shared credentials belonging to BioTeam will be used
for exercises, labs and demonstrations.
We may in the future put out a call for names and email addresses of attendees in case we want to share BioTeam
AMI access with individual AWS accounts or provide unique LDAP/OpenID credentials to people attending the Two-
Day Architecture session. This is still being discussed internally.
Comments or questions can be addressed directly to Chris Dagdigian <email@example.com>
Half Day “AWS for SysAdmins” Course
In a half-day presentation, cover the primary essential topics required for someone with Linux and/or systems skills
to conquer the initial AWS learning curve and get up and running with AWS. At the end of the day, each attendee
will have root access to a cloud server instance that they have booted, configured, bundled and registered themselves.
I. AWS Logistics
• Credentials, keys & access identifiers
• Installing AWS utilities and configuring local environment
Goal: Cover the minimum necessary steps to prep attendees for diving right in to performing real actions on
EC2 and other AWS product offerings using personal laptops or BioTeam shell accounts on a training server.
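As a rough illustration of the "configuring local environment" step, the sketch below checks for the environment variables that the classic Java-based EC2 API command line tools read (JAVA_HOME, EC2_HOME, EC2_PRIVATE_KEY, EC2_CERT). The function name is made up for this example, and the check is only a convenience, not part of the course material.

```python
import os

# Variables read by the Java-based EC2 API command line tools.
REQUIRED_VARS = ("JAVA_HOME", "EC2_HOME", "EC2_PRIVATE_KEY", "EC2_CERT")


def missing_ec2_settings(env=None):
    """Return the names of required variables that are unset or empty.

    `env` defaults to the real process environment; passing a plain dict
    makes the check easy to try out without touching the shell.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_ec2_settings()
    if missing:
        print("Not yet configured:", ", ".join(missing))
    else:
        print("Local environment looks ready for the EC2 tools.")
```

Running the script on a laptop or training shell account before class gives a quick list of what still needs to be set.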
II. Immersion / Learn by doing
• Boot a Linux server in the cloud
• Connect as root via AWS credential keys
Goal: Each attendee in the class will have root on a cloud instance they have booted themselves
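The two actions in this topic can be sketched as the command lines an attendee would type. The Python helpers below only assemble the argument lists; they do not contact AWS. The AMI ID, keypair name, key file and hostname are placeholders invented for the example, and option spellings should be confirmed against the `--help` output of the installed tools.

```python
def run_instance_command(ami_id, keypair, instance_type="m1.small", count=1):
    """Build an ec2-run-instances command line as an argument list."""
    return ["ec2-run-instances", ami_id,
            "-k", keypair,          # name of the EC2 keypair to inject
            "-t", instance_type,    # instance size
            "-n", str(count)]       # number of instances to start


def ssh_command(key_file, public_dns):
    """Build the ssh command used to log in as root with the keypair file."""
    return ["ssh", "-i", key_file, "root@" + public_dns]


if __name__ == "__main__":
    # Placeholder values only; substitute real ones during the exercise.
    print(" ".join(run_instance_command("ami-00000000", "training-key")))
    print(" ".join(ssh_command("training-key.pem", "ec2-host.example.com")))
```

The public DNS name passed to `ssh_command` would come from the instance listing once the server reaches the running state.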
III. Server & AMI Customization
• Add software, run "yum update", install SSH keys etc.
• Bundling changes & Uploading to S3
• Registering a new AMI
Goal: Show how to make local changes and installs within the AMI so that we can then "rebundle" the AMI
image, upload it and get it registered as a new AMI. End result is students learn how to make changes and
see them persist. Process also shows how slow the bundle/upload process is - leading well into future
discussions regarding configuration management best practices.
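The rebundle cycle described above is three commands run in order: bundle the running volume, upload the bundle to S3, then register the uploaded manifest. The sketch below lists them as argument lists, assuming the default bundle prefix `image` and a `/mnt` working directory. The bucket name, account ID and credential values in the demo are placeholders, and flags can differ between tool versions.

```python
def rebundle_commands(bucket, account_id, private_key, cert,
                      access_key, secret_key, bundle_dir="/mnt"):
    """Return the bundle, upload and register commands in execution order."""
    # ec2-bundle-vol writes image.manifest.xml when no prefix is given.
    manifest = bundle_dir + "/image.manifest.xml"
    return [
        # 1. Snapshot the running root volume into bundle parts.
        ["ec2-bundle-vol", "-d", bundle_dir,
         "-k", private_key, "-c", cert, "-u", account_id],
        # 2. Push the parts and manifest to an S3 bucket (the slow step).
        ["ec2-upload-bundle", "-b", bucket, "-m", manifest,
         "-a", access_key, "-s", secret_key],
        # 3. Register the manifest in S3 as a new AMI.
        ["ec2-register", bucket + "/image.manifest.xml"],
    ]


if __name__ == "__main__":
    # Placeholder values only.
    for cmd in rebundle_commands("my-ami-bucket", "123456789012",
                                 "pk.pem", "cert.pem", "ACCESS", "SECRET"):
        print(" ".join(cmd))
```

The first two commands run on the instance being bundled; the registration step can be run from any machine with the API tools configured.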
IV. Access Control, Security & Management
• Altering and applying new security policies to your AMI
• Granting access to your AMI to others
• Quick tour of management and monitoring interfaces
• Various GUI and API methods for control
Goal: Now that attendees know how to boot a bare-bones OS instance, make changes and bundle the resulting
custom OS into a new AMI of their own, we are in a good position to talk about security and access control
as well as everything related to management and monitoring of cloud systems.
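To make the security discussion concrete, here is a toy model of how an EC2 security group evaluates inbound traffic: nothing is allowed unless some rule matches the protocol, port and source address. The `Rule` tuple and `is_allowed` function are invented for this sketch and the addresses come from documentation ranges; real groups also support group-to-group rules, which are left out.

```python
import ipaddress
from collections import namedtuple

# One inbound allow rule: protocol, inclusive port range, source CIDR block.
Rule = namedtuple("Rule", "protocol from_port to_port cidr")


def is_allowed(rules, protocol, port, source_ip):
    """Return True if any rule permits the connection; default is deny."""
    addr = ipaddress.ip_address(source_ip)
    for rule in rules:
        if (rule.protocol == protocol
                and rule.from_port <= port <= rule.to_port
                and addr in ipaddress.ip_network(rule.cidr)):
            return True
    return False


if __name__ == "__main__":
    rules = [
        Rule("tcp", 22, 22, "192.0.2.0/24"),  # SSH from one network only
        Rule("tcp", 80, 80, "0.0.0.0/0"),     # HTTP from anywhere
    ]
    print(is_allowed(rules, "tcp", 22, "192.0.2.10"))
    print(is_allowed(rules, "tcp", 22, "198.51.100.7"))
```

Restricting SSH to a known source range while leaving a service port open to everyone is a common starting policy for a freshly booted instance.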
V. Getting Work Done
• Self-organizing Grid Engine clusters
• More traditional AWS workflows with SQS queues
Goal: End the session with something practical that can be used as a base for further exploration. We'll walk
through the steps required to alter our AMI images so that when booted in arbitrary numbers they will
self-organize into a functional Grid Engine compute farm. This may or may not lead into discussions about how to
support existing/legacy apps and workflows in a cloud environment. We would also like to show another example
of a more traditional AWS application workflow architecture. For time reasons this may be done via
instructor demo in conjunction with recorded screencasts.
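One possible way for identical instances to self-organize is sketched below. It is an assumption made for illustration, not necessarily the method used in class. EC2 gives each instance in a launch request a launch index through instance metadata, so every node can apply the same rule: the lowest index becomes the Grid Engine master and the rest become execution hosts. The hostnames in the demo are made up.

```python
def assign_roles(nodes):
    """Pick a master and execution hosts from (launch_index, hostname) pairs.

    Every node can run this on the same input and reach the same answer,
    so no coordination service is needed for the election.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    ordered = sorted(nodes)  # sorts by launch index first
    master = ordered[0][1]
    exec_hosts = [host for _, host in ordered[1:]]
    return {"master": master, "exec_hosts": exec_hosts}


if __name__ == "__main__":
    # Hypothetical three-node launch.
    cluster = [(2, "node-c.internal"), (0, "node-a.internal"),
               (1, "node-b.internal")]
    print(assign_roles(cluster))
```

A boot script built on this idea would still need a way for each node to learn the full node list, for example by querying the instance listing for its own reservation.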
AWS Architecture - Day 1 of 2
Scheduled: March 3, 2010 - location TBD
Objective: Progress iteratively through the topics essential for building out larger or more production-focused
workstreams on the AWS platform. Day One will focus on the basic foundations and will use the Maq assembler as
an example use case for building out a more traditional (or ‘legacy’) workflow on Amazon AWS. Due to estimated
session size, this training will be lecture, discussion and live demo driven.
I. Intro & Logistics
II. AWS Overview
Goal: There are a huge number of AWS service and product offerings. We’ll cover the ones of most interest
to people involved in informatics and high performance computing.
III. Mapping Informatics to the Cloud
Goal: Cover the major environmental, performance and architecture differences between HPC, grid and cluster
environments and the AWS cloud environment.
IV. AWS: Billing & Credential Management
Goal: Briefly cover the logistics and mechanisms behind organizational billing and credential management
with focus on the newly announced AWS ‘consolidated billing’ offering.
V. AWS: EC2 Overview
Goal: Light introduction to Amazon EC2 to cover definitions & capabilities before we start making heavy use
of EC2 instances in live demos and recorded screencasts.
VI. AWS: Configuration Management
Goal: Configuration management of EC2 AMIs is a major component in deploying cloud applications in a
reliable, repeatable and easy-to-manage process. For this topic, we will be using Chef Server
(http://www.opscode.com/chef/) to demonstrate configuration management of cloud-based server AMIs.
VII. AWS: Identity Management
Goal: There are some cases where individual access via SSH keys may not be sufficient (such as with web
applications). Topic will be covered with a demonstration of either LDAP server integration or OpenID integration.
VIII. AWS: Monitoring & Reporting
Goal: Discuss and demonstrate a number of different monitoring & reporting options. Specific focus on Amazon
CloudWatch (AWS product offering), Server Density (commercial solution from www.serverdensity.com),
Hyperic HQ Open Source Edition (open source solution from www.hyperic.com) and syslog-ng
(www.balabit.com) for log file consolidation.
IX. Putting it all together: Maq Assembler
Goal: Using the Maq assembler algorithm as our demonstration use-case we will discuss and show several
different legacy deployment methods utilizing Amazon Web Services. The “legacy” methods are for supporting
existing applications and workstreams that may have been built for HPC clusters and compute farms.
Day One will showcase the “legacy” methods while Day Two will showcase a more traditional cloud architecture
using current AWS best practices.
Time permitting we will discuss and demonstrate each of the following:
• “Cloud Bursting” - Local persistent servers capable of harnessing EC2 nodes on-demand. Demo may
involve commercial products from www.univa.com (“UnivaCluster & UniCloud”) or Sun Microsystems
(Grid Engine SDM & Cloud Services Adapter).
• Standalone large Grid Engine instance - Single server solution for completing a particular workload.
Grid Engine will be used as the task/job scheduler.
• Self-assembling Grid Engine clusters - Multi-node self-organizing Grid Engine clusters within AWS. Of
particular interest for development, testing and supporting of legacy HPC workstreams and applications.
X. Wrap-up & Discussion
Goal: Discuss and review the topics of the day with particular focus on identifying attendee interest in areas
that were not covered or were not covered enough. Time is being left open in the “Day Two” schedule to
handle inclusion of additional topics or demonstrations.
AWS Architecture - Day 2 of 2
Scheduled: March 8, 2010 - location TBD
Objective: Continue progressing iteratively through the topics essential for building out larger or more production-
focused workstreams on the AWS platform. Day Two will focus on continued use of the Maq assembler as our
example use case. The focus today will be on architecting solutions using current AWS products and best practices.
Due to estimated session size, this training will be lecture, discussion and live demo driven.
I. Intro & Logistics
II. AWS: S3 Overview
Goal: Coverage of the object-based AWS storage service.
III. AWS: EBS Overview
Goal: Coverage of the block-based AWS storage service.
IV. AWS: Data Movement
Goal: Data movement in and out of “the cloud” is problematic for data-heavy fields like life science informatics.
Cover known issues, alternatives such as the Amazon physical ingest/outgest services and where to
“draw the line”. Possibly demonstrate GridFTP and/or Aspera for network data movement.
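The "draw the line" question is largely arithmetic: how long a dataset takes to move over the available link compared with shipping physical media. The sketch below works that out. The 80% link efficiency and the 72-hour shipping turnaround are assumptions chosen for the example, not measured figures.

```python
def transfer_hours(size_gb, link_mbps, efficiency=0.8):
    """Estimate hours to move `size_gb` gigabytes over a `link_mbps` link.

    Uses decimal units (1 GB = 10**9 bytes, 1 Mbps = 10**6 bits/s).
    `efficiency` is the assumed fraction of nominal bandwidth achieved.
    """
    bits = size_gb * 8 * 1000 ** 3
    seconds = bits / (link_mbps * 1000 ** 2 * efficiency)
    return seconds / 3600


def should_ship_disks(size_gb, link_mbps, shipping_hours=72, efficiency=0.8):
    """Return True when the network transfer would outlast shipping."""
    return transfer_hours(size_gb, link_mbps, efficiency) > shipping_hours


if __name__ == "__main__":
    for size_gb in (100, 1000, 10000):
        hours = transfer_hours(size_gb, 100)
        print(f"{size_gb} GB over 100 Mbps: about {hours:.1f} hours, "
              f"ship disks: {should_ship_disks(size_gb, 100)}")
```

Under these assumptions the break-even point on a 100 Mbps link sits between one and ten terabytes, which is well within the size of a single sequencing project.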
V. AWS: SQS Overview
Goal: Review and demonstrate the AWS SQS service, often a central component of cloud-resident workflows.
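The behaviour that makes SQS useful for work distribution is the visibility timeout: a received message is hidden from other consumers for a while, and reappears if the worker never deletes it. The in-memory toy below imitates only that behaviour with an explicit clock so it can run without an AWS account. `ToyQueue` is invented for this sketch; the real service deletes by receipt handle and does not guarantee ordering.

```python
import itertools


class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}            # id -> [body, hidden_until]
        self._ids = itertools.count(1)

    def send(self, body):
        """Add a message that is visible immediately; return its id."""
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, 0]
        return msg_id

    def receive(self, now):
        """Return (id, body) for one visible message and hide it, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Remove a message once its work has completed successfully."""
        self._messages.pop(msg_id, None)


if __name__ == "__main__":
    queue = ToyQueue(visibility_timeout=30)
    job = queue.send("align sample 1")
    print(queue.receive(now=0))    # worker A takes the job
    print(queue.receive(now=10))   # hidden from worker B
    print(queue.receive(now=30))   # worker A crashed, job is visible again
    queue.delete(job)
```

Because an unfinished job returns to the queue on its own, worker nodes can be started and stopped freely without a central scheduler tracking them.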
VI. AWS: Additional Topics
Goal: Placeholder topic for areas identified during Day One as needing more depth, discussion or demonstration.
VII. Putting it all together: Maq Assembly Revisited
Goal: Review the “legacy” Maq solutions shown in Day One and discuss the pros and cons of those approaches.
Continue on with discussion of current-day AWS best practices culminating in a revised/revisited
Maq demonstration using more traditional cloud workflow methods.
VIII. Wrap-up & Discussion