NASA's cloud computing odyssey:
From Earth to Mars

By Rohan Pearce | Computerworld Australia | 14 December 12

When NASA's Curiosity rover landed on Mars at the end of its 563
million kilometre journey, it was a triumph for engineering. And it
was also a triumph for IT.

The anxiety of watching the rover's descent wasn't confined to rocket
scientists at the Jet Propulsion Laboratory; it was shared by NASA software
engineer Khawaja Shams, a member of JPL's Operations Planning Software
Lab, who experienced what he describes as the toughest day in his career.
Shams has responsibility for the pipeline that makes sure that the data
collected by Curiosity gets back to Earth where it can be used by scientists
around the world. And in the lead-up to touchdown, he was responsible for
building cloud architecture that could let millions of people around the world
observe the historic landing, watching the data and images streamed back by
Curiosity in real time.

Data gathered by Curiosity on Mars is sent to one of the orbiters around the
planet, then from there to the Deep Space Network: a collection of satellite
antennas 70 metres in diameter situated around the world, spaced 120
degrees apart so that the DSN can see in any direction at any time. From
there, data is sent to JPL.

Within JPL there are always-on nodes that pre-process the data and upload it
to an S3 bucket on Amazon's cloud. While the data is going into S3, NASA is
provisioning EC2 nodes that get the data from Amazon's storage service and
process it, then re-store the results in S3. From there, scientists around the
world can download the data directly.
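
The flow described above can be sketched in a few lines of Python. This is a minimal illustration, not JPL's actual pipeline: an in-memory dictionary stands in for the S3 bucket, and the key prefixes (raw/ and processed/), the function names and the placeholder processing step are all assumptions made for the example.

```python
from typing import Dict

# An in-memory stand-in for an S3 bucket: object key -> bytes (assumption).
Bucket = Dict[str, bytes]


def upload_raw(bucket: Bucket, name: str, payload: bytes) -> str:
    """Always-on node at JPL: store pre-processed downlink data."""
    key = f"raw/{name}"
    bucket[key] = payload
    return key


def process(payload: bytes) -> bytes:
    """Placeholder for the real image processing (hypothetical)."""
    return payload.upper()


def run_worker(bucket: Bucket) -> int:
    """Elastic worker: fetch raw objects, process them, re-store the results."""
    done = 0
    # sorted() builds a list first, so adding keys inside the loop is safe.
    for key in sorted(k for k in bucket if k.startswith("raw/")):
        out_key = "processed/" + key[len("raw/"):]
        if out_key in bucket:
            continue  # already processed, so re-running a worker is harmless
        bucket[out_key] = process(bucket[key])
        done += 1
    return done


bucket: Bucket = {}
upload_raw(bucket, "sol0001.img", b"bits from mars")
processed_count = run_worker(bucket)
```

Because the workers keep no state of their own and only read from and write to the shared store, any number of them can be started or shut down as the volume of incoming data changes.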

On the day Curiosity touched down, the system succeeded magnificently by
most measures. But getting there was a long journey, and it's the story of how
NASA moved from relying on what it could provide on-premise to using the
cloud to do the heavy lifting. The implications go far beyond Curiosity: cloud
computing is a key factor in letting NASA cope with the onslaught of data its
missions, both in the Solar System and on Earth, are increasingly delivering to
the world's scientists.

But, years before Curiosity tweeted that it had landed safely on the fourth
planet from the Sun, JPL's cloud odyssey had a somewhat less auspicious
start; at least from Shams' perspective. And it began with procurement gone

32GB of what?

"In 2008," Shams told Computerworld Australia, "I had to order a set of
servers -- actually for the Curiosity mission -- and I started off by sending an
email to my favourite IT person saying, 'Hey I need these machines.' And then
I got an email back saying 'Okay, how much RAM do you need?' 'Okay well
here's how much RAM I need.' How much CPU do you need? How much hard
drive do you need? What operating system?

"It was a lot of emails that transpired in this process and it turned out -- I'd
asked, I think for 32 gigs of RAM and they thought I needed 32 gigs of hard
drive space and we got the wrong order.

"So talking to Tom [Soderstrom, NASA IT CTO at JPL] and Jim [Rinaldi], our
CIO, we went over and we did a retrospective on what this procedure was

JPL's 'Seven Minutes of Terror' and the cloud

During the opening keynote at Amazon Web Services' re:Invent conference in
Las Vegas, Shams took to the stage to recount the toughest night of his
career: The landing of NASA's Curiosity rover on Mars, and his role in making
sure it would be shared with the world in real time.

"I'm going to take us back today to the night of August 5th, 2012, when 350
million miles away the Curiosity Mars rover is about to complete its journey to
the surface of Mars. Engineers at NASA have worked tirelessly for nearly a
decade to make this day a possibility. They have executed countless
simulations, tested every individual component. But the system as a whole is
about to be tested for the first time.

"We can feel success on the horizon, we can taste the victory. But the
smallest mistake can take it all away. Tonight, everything must be perfect.
Millions of people are watching this around the world, relying on us to safely
land Curiosity. We have deployed Web experiences and video streaming
solutions on the cloud to share tonight with the whole world. Tonight,
everything must be perfect.

"And tonight we either succeed remarkably or fail spectacularly. We eagerly
look towards the sky with anticipation because we know that it all comes down
to the next seven minutes: The Seven Minutes of Terror.

"Having developed the data processing pipeline for Curiosity, tonight is the
toughest test of my career. As soon as Curiosity lands, bits will flow through
my pipeline to be processed, stored then distributed as images to the mission
operators and scientists, as well as the rest of the world.

"My heart is pounding as I steal a glance at the AWS health dashboard to
ensure that all the services that Curiosity relies on today are up and running.
EC2 -- check. S3 -- check. SWF -- check. VPC, ELB, autoscaling, Route53,
SimpleDB, RDS -- check. CloudFront, CloudFormation, CloudWatch -- check.

"It's show time.

"The world watches on the video streaming solution deployed on the cloud. A
solution that could withstand the failure of over a dozen data centres and still
deliver the live video stream from JPL to you. A solution that could scale over
a terabit per second, but one that only requires us to provision and pay for
exactly as much capacity as we need. A solution that is possible today only
due to the invention of cloud computing.

"The excitement builds around the world as Curiosity enters the Martian
atmosphere and its temperature rapidly rises to over 1600 degrees. I take a
glance at our CloudWatch console to realise that our streaming solution has
gone up to over 40Gbps. I calmly launch another CloudFormation stack to
increase our capacity and register it to a Route53 domain.

"Curiosity deploys its parachute and Mark II and our streaming solution
exceeds 70Gbps. I calmly launch another CloudFormation stack. Curiosity
jettisons its parachute and activates its jetpack as it approaches the surface.

"Back on Earth we get a surprising call. The main JPL website, still running on
traditional infrastructure, is crumbling under the crushing load of millions of
excited users. We act quickly and route all JPL website traffic to the
CloudFront-based Mars site. Our traffic exceeds over 100Gbps but the cloud
hasn't even started to sweat. No. The sweat is only on my forehead as I
anxiously await the next bandwidth milestone so I can add more capacity.

"We are a scant 21 metres from the surface of Mars as Polyphony, our data
processing pipeline, provisions EC2 in anticipation of the bits [of data] coming
to Earth. The Sky Crane lowers the rover down to the surface of Mars.
Curiosity has landed on Mars.

"The whole world celebrates with us and we are just as thrilled. We have
made every JPLer, no, every NASA employee, no, every American proud. We
have made humanity proud, for we have landed a one-ton mobile lab on the
surface of Mars. Mission accomplished.

"Or is it?

"The landing engineers have been successful tonight, but my test has just
begun. The bits flow from Curiosity to the Odyssey orbiter to the Deep Space
Network and then finally into JPL. An intricate orchestration process co-
ordinated by Simple Workflow magically causes nodes at JPL to upload these
bits onto an S3 bucket. EC2 nodes pick up these bits and process them into

"These elastically provisioned EC2 nodes will process images rapidly and
within seconds of bits arriving to Earth, the first Mars images will be on your
iPad, Android and laptop screens around the world. On August 5th we had
two successes: we landed Curiosity on Mars and we shared the first pictures
from Mars with you in real time. You saw the first pictures from Mars at the
same time that we did.

"Mission accomplished."

It became one of JPL's earliest discussions about cloud computing. "Jim
Rinaldi came up on the spot with the vision of -- you, Khawaja, should not have
to buy a machine, you should have to rent them. And you should be able to
get them on demand. That was the vision that he painted," Shams says.

"So we will provision instead of purchase. And that was I think the key vision
that enabled us to adopt cloud computing because that's effectively what it
allowed us to do.

"It allowed us to come in and instead of me saying, 'Well I need these
machines' and then communicating with a bunch of different people down the
pipeline and then waiting for the order to be shipped, and then waiting for
somebody to install it physically, and then someone to install the operating
system, only to find out it was wrong. We just come in and say, 'Well I need
five machines on the cloud with this image and if it's wrong, well, okay, that's
fine, let me just make two other clicks and correct that mistake.' So that was
one of the earliest conversations."

Too... much... data

Cloud was always, in retrospect, going to be a natural fit
for JPL. The organisation has around 5000 people and, Shams says, "we are
busier than we ever have been before". "We've got missions that are going all
over the Solar System, we have landed missions on Mars recently, and we
have been to every planet in the Solar System. And recently we've started
having much more focus on earth science. And the problem is that with earth
science we have the opportunity to get a lot more data."

"So we're really busy, we've got all these missions going all over the Solar
System and beyond, and our data centres are getting filled to capacity and
our data needs are growing faster than ever," Shams explains.

"In come the earth science missions -- and over the next couple of years
we're going to be getting two terabytes of data per day from some of these
missions. And this is a scale that is orders of magnitude bigger than what
we've ever seen before.

"So we're running out of space, we're running out of capacity. We want to be
able to use the physical space that we have at our laboratory for people and
for science rather than running infrastructure. We're also noticing that cloud
vendors are starting to offer these capabilities and infrastructure at a much
lower cost. And add to that the elasticity that is available to us in the cloud
diminishes our cost even more."

This combination of on-premise infrastructure reaching its limits, an onslaught
of data and the limited timeline of some of JPL's missions -- some only last for
six months -- made cloud an inviting option for the organisation. When a
mission is underway, JPL "really process the data as much as we can for
those six months [for example], and then that infrastructure is going to go to
waste after that. So with cloud computing, we're able to say, 'Okay well we're
just going to pay for it while we use it and turn it off when we're done.'"

Using cloud for computationally intensive processing mitigates the risks
associated with capital investment, Shams adds. Before employing cloud
computing, the IT infrastructure for a mission would be purchased a year or
more in advance: It would be tested and then put in change control
configuration and not used until the actual mission took place.

"Now there's a risk here that if the launch is unsuccessful we have made all
this investment and this infrastructure's not going to be used," Shams says.
"The other risk is that we have paid too much for this infrastructure because
we bought it a year in advance. So now move forward four years -- cloud
computing. Let's say there's a hundred machines that we needed. We have
the opportunity to bring up the hundred machines [in the cloud], test
everything worked, shut them down, don't pay for anything and if the mission
is successful -- which almost every time it is -- we just launch those machines,
just as if we left them, and start paying for them immediately."
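
The trade-off Shams describes can be made concrete with some back-of-the-envelope arithmetic. The sketch below uses the hundred machines and the six-month mission length mentioned in the text; the purchase price, hourly rate and length of the pre-launch test are invented purely for illustration and are not real NASA or AWS figures.

```python
def upfront_cost(machines: int, price_per_machine: float) -> float:
    """Capital cost of buying hardware in advance, used or not."""
    return machines * price_per_machine


def on_demand_cost(machines: int, hourly_rate: float, hours_used: float) -> float:
    """Operational cost of renting machines only while they run."""
    return machines * hourly_rate * hours_used


MACHINES = 100               # the example figure from the text
PRICE_PER_MACHINE = 5000.0   # hypothetical purchase price
HOURLY_RATE = 0.50           # hypothetical rental rate per machine-hour
TEST_HOURS = 24              # hypothetical pre-launch test run
MISSION_HOURS = 180 * 24     # roughly a six-month mission

bought = upfront_cost(MACHINES, PRICE_PER_MACHINE)
rented_if_success = on_demand_cost(MACHINES, HOURLY_RATE, TEST_HOURS + MISSION_HOURS)
rented_if_failure = on_demand_cost(MACHINES, HOURLY_RATE, TEST_HOURS)

print(f"bought up front:       {bought:>10,.0f}")
print(f"rented, mission flies: {rented_if_success:>10,.0f}")
print(f"rented, launch fails:  {rented_if_failure:>10,.0f}")
```

Whatever the real prices are, the shape of the result is the point: if the launch fails, the rented option costs only the test run, while the purchased hardware has already been paid for in full.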

Getting to the point where Shams could ramp up instances in AWS for the
data pipeline from Curiosity to Earth was not straightforward, however.

There are a lot of government regulations NASA has to abide by. The good
news, Shams says, is that all the downlink data the space agency collects can
be released into the public domain, which was an important factor that let JPL
experiment with cloud computing while navigating these regulations.

NASA has worked closely with cloud vendors to find ways of using cloud that
don't fall afoul of the law; for example Amazon established its GovCloud in the
Oregon Region, which offers the same security as AWS public cloud but is
compliant with ITAR -- the International Traffic in Arms Regulations -- which
governs the movement of sensitive data: Data won't be shifted offshore and
the facilities are staffed only by "US persons" (a category that includes US
citizens and certain permanent US residents, among others).

And while it may have only taken a bungled procurement order and a
conversation to set things in motion -- making it happen took a lot longer. For
example, the first contract that JPL signed with AWS took eight months.

"At that time, cloud vendors weren't ready for the enterprise," Shams says.
"They were still built for the start-up with a credit card, or Joe Smith with a
credit card. Having to deal with the enterprise was something they were
learning first hand."

Dealing with a government agency involves an even steeper learning curve,
due to the regulations that must be abided by when dealing with an
organisation like NASA. But that eight months wasn't wasted: not only was the
contract signed in the end, but Soderstrom conducted a retrospective to
identify blockages in the process that could be removed to ensure smoother
sailing as NASA continued to embrace cloud.

"The reason why it took us eight months is because there was a long
communication chain," Shams says.

The convoluted communications chain involved Shams communicating with
NASA's procurement team, which would have to talk to NASA's legal team as
well as Amazon's sales team. On top of that JPL's security team was also an
obvious stakeholder.

On the Amazon side, they also had their security team and compliance and
legal teams. The upshot was a communication pipeline ripe with potential for
blockages and channelled through Khawaja and the procurement team.

"You can imagine," Shams says, "IT security tells me something, I tell it to the
procurement guys, who will then talk to the sales guy, who will then talk to
IT security guy, who will then come back to them, then go to procurement who
will then go to Khawaja who will then go to security. This is completely

In the wake of the lengthy negotiations over the first contract, Tom
Soderstrom, NASA IT CTO at JPL, figured out what Shams describes as a
"magic formula": Bringing stakeholders on each side face to face to discuss
the issues involved.

When Amazon established its GovCloud Region, NASA needed to sign a new
contract with AWS. This time, it only took a month. This idea of approaching a
cloud vendor as more of a partner has continued, with peers on each side
having meetings with each other to foster a collaborative relationship.

Shams believes this approach, of bringing together peers across the customer
and the vendor, is applicable for large enterprises beyond NASA. It stops
things getting lost in translation as communications go up and down the chain,
and removes bottlenecks.

He cites as an example the collaboration between JPL's security team, and
AWS's. "The IT security team's goal has been identified as, well, you're going
to make cloud computing secure or tell me why it can't be done. So it's now
part of their pay cheque ... to go figure this out. So they have to go to talk to
the IT security team [at AWS].

"The IT security team [at AWS] has to make cloud computing secure anyway
at Amazon because that's their bread and butter. Now you've got two teams
with the same motive, right, and they're going to talk to each other without any
bottlenecks. So I do think for any large enterprise, this is a magic formula: to
identify the peers and to talk to them as much as you possibly can."

Cloud means enterprises need to move beyond a routine customer-vendor
relationship, requiring a deeper level of collaboration. Shams and Soderstrom
sit on cloud vendors' customer advisory boards, letting them have input into
the direction of product development.

"We provide insights into how cloud is being used in our organisation and
what are the features that are missing, or what are the key enablers that are
missing, that will allow us to adopt cloud more effectively," Shams says.

NASA also has relationships with some cloud vendors' internal product teams.
"They'll bounce some ideas off of us, and that helps us influence them, but it
also helps us understand where things are going and it helps us get guidance
to ensure that we're using the best practices."

The cloud changes IT

Shams says that Soderstrom's approach to IT at JPL has been a key factor in
the shift to cloud. His approach and that of the Office of the CIO is to treat IT
as an enabler of new capabilities, rather than a hindrance. Shams cites the
example of cloud security -- "IT security would be very cautious of, 'Hey! You
mean you're going to put your data on whose servers?!'"

"So what Tom's doing with them, for instance," Shams says, "is he's telling
them don't say no -- say how? Or say, why not? So rather than 'Hey guys,
don't do this', it's more like 'How do I do it more securely, more effectively and
if really I can't, if I'm really doing something very stupid, tell me what else I can
do to still meet my requirements.'"

JPL is still adjusting to the impact that cloud has on how IT operates, for
example the shift from capital expenditure to operational expenditure. At first
this shift made some project managers uneasy, Shams says, because of the
impression that this would make their budgets more unpredictable. However,
"they quickly realised they have more control of the budget now all of a sudden
because they can control how much capacity they can invest in."

"Typically if you bought a bunch of hardware before the mission started
producing data you might not even realise that you've bought too much or too
little," Shams says.

"So now, based on the amount of money you have available and based on
the changing requirements, you have the agility to redefine how much
infrastructure you're actually going to use and pay for.

"We are noticing so, for instance, for some of the projects I'm working on,
when I'm putting the budgets [together] I'm actually putting in operational cost
-- this is how much money we're intending to spend every month -- rather than
at the start of the project I'll go buy these 40 terabytes of hard drives and set
them up accordingly."

"It's literally: as the project progresses we'll continue to pay a monthly [fee],"
Shams says.

Although it was AWS's cloud that did the heavy lifting for Shams' data pipeline,
NASA has a multi-cloud approach.

The agency has an internal document that sets out its strategy for cloud
computing, as identified by Soderstrom. The idea boils down to 'Use the right
cloud for the right job'.

Soderstrom developed a tool called CASM: the Cloud Applicability Suitability
Model. It's a simple questionnaire that NASA staff can fill in, answering
questions about their project's needs -- latency requirements, regulatory
requirements and the data-to-compute ratio, for example.

"Based on these questions, we assess whether their data belongs in our
private cloud or in the public cloud, and within the public cloud, it's 'Which
public cloud?'" Shams says.

"And within the private cloud, does it belong in the supercomputer centre,
does it belong in our regular data centre, does it belong in a virtualized
environment with VMware, or does it belong in an OpenStack environment...
things like that."

"So these questionnaires help us assess where to place the environment," he
says.

"But there's no edict from NASA or anybody that says 'use Amazon' or 'use
Microsoft' or 'use Google'. It's literally about having the right cloud for the right
job, and it's literally on a per application basis that we make this decision."
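
To give a feel for how a questionnaire of this kind can drive a per-application decision, here is a minimal sketch. It is not CASM itself: the three questions are taken from the examples in the text, but the field names, the threshold, the unit and the rules that map answers to a placement are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Answers:
    """Hypothetical questionnaire answers for one application."""
    regulated_data: bool          # subject to regulatory restrictions?
    needs_low_latency: bool       # must sit close to on-site systems?
    data_to_compute_ratio: float  # GB moved per CPU-hour (assumed unit)


def suggest_placement(a: Answers, ratio_limit: float = 10.0) -> str:
    """Map answers to a coarse placement; the rules are illustrative only."""
    if a.regulated_data:
        return "private cloud"
    if a.needs_low_latency:
        return "private cloud"
    if a.data_to_compute_ratio > ratio_limit:
        # Assumption: moving a lot of data for little compute favours
        # keeping the work next to where the data already lives.
        return "private cloud"
    return "public cloud"


web_site = Answers(regulated_data=False, needs_low_latency=False,
                   data_to_compute_ratio=0.5)
print(suggest_placement(web_site))
```

A real model would go on to choose among specific public clouds or among the private options Shams lists; the sketch stops at the first branch of that decision.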

When Curiosity landed on red soil

When Curiosity ended one phase of its mission by hitting red soil and started
the main, most important part of its job, JPL witnessed firsthand how the
elasticity of cloud can, in some cases, be a game changer. The Mars Science
Laboratory website, which ran on AWS, was a "good foray" into cloud
computing when it comes to Web hosting, Shams says, surviving the
onslaught of massive amounts of traffic.

In the wake of its success -- JPL's regular website went down during the
Curiosity landing due to the volume of traffic, so it was redirected to the MSL
site, which remained up -- websites across NASA are beginning to be
migrated to the cloud. Unlike a service such as Netflix -- also a heavy user of
Amazon's public cloud -- which knows that it's going to have massive traffic on
a daily basis, NASA's traffic tends to spike and ebb, depending on public excitement
about different missions.

"We get a lot of attention and then it will die down, and then we land a rover,
get a lot of attention, and it dies down," Shams says. "It's a very elastic
environment that's basically built for cloud computing."

With a service like S3, NASA can store data and then not worry about going in
and adding more services as interest ramps up, because it will scale
automatically behind the scenes. Shams adds that the storage service also
means that backups aren't a concern because data will be automatically
replicated across multiple data centres, and daemons will regularly check the
integrity of data to make sure nothing has been lost, re-replicating it as

Another advantage of using cloud for Web hosting is security: holes have to
be opened in firewalls to allow page requests in and data out, which can
create a vulnerability. If a machine is running on NASA's network, there's the
risk that the compromising of a Web server might open up the rest of the
organisation's network to attack. With cloud, you can put a Web server in an
isolated environment, "so somebody penetrates your website -- that's all
they've gotten into."

Despite wariness over cloud computing, the security team at JPL has also
found other advantages over on-premise hardware. Cloud computing can be
used to combat uncontrolled IT sprawl and give security far more oversight.

"Cloud computing is way more secure than me setting up a server at a desk
under my cubicle," Shams says.

"We will see a major shift toward cloud computing for websites across NASA,"
Shams says. It's an "ongoing process" that's being endorsed within JPL. It's
"being enabled by our Office of the CIO, and they're doing everything they can
to make it happen as quickly as possible."

When Curiosity touched down on Mars, there were two successes, not one,
Shams said during a presentation at AWS's re:Invent conference. The rover
was landed successfully, and NASA was able to share the moment with the
rest of the world. And while the magnitude of the latter feat may go unnoticed
by some, its implications for IT are significant, to say the least.

"Mission accomplished," Shams told the conference.

Not always smooth sailing

Although Curiosity may have provided a highly visible success story, the path
to the cloud has not always been smooth sailing for JPL. One of the early
stress-inducing incidents encountered was an apparent attack on their cloud

JPL uses Amazon Web Services' Virtual Private Cloud (VPC) offering, which
lets an enterprise cordon off a set of EC2 instances and connect them to their
internal network over a VPN, treating them as extensions of internal
infrastructure. JPL set up a VPC and started running instances in it, and one
morning at 6am, Shams says, they got a phone call saying that a node in the
VPC was under attack.

"We all panic and we're looking around to see what might be going on, who
might be on to us and who's trying to compromise our system and if there's an
internal breach... And three hours later we're still trying to figure out what
actually happened," he recalls.

"It turns out that our IT security team, the same people who monitor the alarm
that went off, also have a system that does penetration testing of all of our
systems. And because our machines were in the VPC, they also went and
said 'Oh hey, you're a Web server, I'm going to start throwing all this traffic
at you and see if you succumb to one of my SQL injections for instance'."

"So it was a self-inflicted-alarm," Shams says. However, it was actually "really
good", he adds, "because, one, it helped us ensure that our testing
infrastructure that we have for internal resources is still working and, two,
when things do go wrong in the VPC we still figure out, just like our
infrastructure. It was an attestation to the fact that we're able to leverage our
internal infrastructure to protect the resources in the cloud, just like they would
the resources on-premise."

Another lesson learned the hard way, albeit one that involved fewer panicked
phone calls, was the importance of collaboration with cloud vendors:
approaching the relationship as more of a partnership than a pure vendor-
customer relationship.

"With cloud there are so many features that are coming in so fast," Shams
says, and when JPL started its cloud journey circa 2008-2010, Shams was "in
a very exciting development role". "I would be developing services to build on
top of cloud capabilities and I'd develop a service, it would have two or three
more bugs left, and we'd hear that Amazon was about to release this other
service that's going to do everything that we've written, except a lot better.
And at that point we would throw away our code and say, 'Okay we'll just use

The lesson, Shams says, was to be open with the vendor about what you're
trying to build and what's missing in their services, "so that we can actually
stay on the same page as to what might be coming up and what we should
build and what we shouldn't build. Kind of understand, principally, what the
vendor is interested in building and what they're interested in letting others
build on top of that."

Rohan Pearce travelled to Amazon re:Invent as a guest of Amazon.

