Making the Most of Your Storage Budget
A Storage eBook

This content was adapted from the Enterprise Storage Forum Web site.
Contributors: Drew Robb, Henry Newman, and Paul Shread.

Table of Contents

What’s Selling In the Data Storage Market?
Despite Economy, Storage Bargains Hard to Find
How You Can Save on Data Storage Costs
Brother, Can You Spare a Petabyte?
Data Corruption: Dedupe’s Achilles Heel

Making the Most of Your Storage Budget, a Storage eBook. © 2009, WebMediaBrands Inc.

What’s Selling In the Data Storage Market?
By Drew Robb

With EMC’s SAN sales falling by about 20 percent in the first quarter and the rest of the data storage market also under pressure, observers could be forgiven for thinking that nothing is selling out there.

But there were bright spots even within EMC’s report (Celerra unified storage arrays, for example, continue to sell at a double-digit rate), and other storage technologies have managed to catch on in the downturn either because they offer users a way to save on storage costs or they offer such a compelling value that users are willing to spend on them even in a tough economy.

Economic downturns can also be where dramatic change occurs and buying patterns shift, often permanently. New darlings can emerge and seize the moment, displacing old faithfuls that are no longer regarded as current or cost-effective. And once changed by economic necessity, the new habits that emerge can become permanent.

The big winners, this time, appear to be flash technology and data deduplication, while some areas of the tape and Fibre Channel SAN markets are suffering. But even within those categories, there are specific areas which are thriving. And not surprisingly in the current environment, open source-based storage technologies are gaining ground as a cost-savings measure.

“What’s selling well are items that are essential to keeping business running along with technologies that can help reduce costs, boost efficiency or productivity as opposed to discretionary or nice to have items,” said Greg Schulz, StorageIO Group founder and senior analyst. “Dedupe, flash, and virtual tape libraries are some obvious examples.”

Solid Sales for Solid State
Schulz points to solid state drives (SSDs, or flash technology) as having gained a lot of ground recently, particularly for read- and write-intensive applications as a way to boost performance and efficiency. This correlates well with what storage vendors are saying.

Jim Cates, senior director of storage development at Sun Microsystems, has noticed a big uptick in interest in SSDs.

“While people used to use striped disk for high IOPS, they are now tending toward SSD,” said Cates. “The price of flash is low enough that they want to use it for high IOPS data instead of DRAM, which is cost-prohibitive.”

Pat Wilkison, vice president of marketing and business development at SSD manufacturer STEC, echoes this view.

But even within the SSD segment, buyers have changed their ways. Instead of purchasing some small flash drives, they now want larger models in order to get the greatest capacity per dollar.

For example, STEC’s ZeusIOPS product line features different versions for different needs. At one extreme is the highest Input/Output Operations per Second (IOPS) model, and at the other extreme is high capacity (40 percent more capacity, but still with decent IOPS). Thus STEC now offers 750GB products and is experiencing what it describes as meaningful demand for 1.5TB sizes, which will begin shipping in the fourth quarter.

“We have noticed a bias toward cost savings and an emerging market that doesn’t need breakneck I/O but wants higher-capacity, as SSD is better than cache,” said Wilkison. “Our high-capacity, lower I/O market has grown from 0 percent to 30 percent of total orders in recent months.”

EMC was the first storage vendor to market with SSDs and has been pushing flash drives as a means of enhancing storage tiering. In the EMC vision, flash becomes the first tier, Fibre Channel disk is tier two and SATA becomes tier three, a model that other vendors are also promoting.

Users look to a few SSD drives for the most heavily utilized data, then a small amount of FC drives for less utilized data, and low-cost, high-capacity SATA for the bulk of information. As that latter information isn’t accessed too often (or isn’t mission-critical), it can comfortably reside on bulk SATA drives.

“It’s rare that they need more than a half-dozen to a dozen flash drives,” said Ken Steinhardt, vice president and CTO of customer operations at EMC. “The last six months have seen an acceleration of the usage of flash for the first tier.”

Deduping the Way to Profits
Steinhardt has also noticed a marked shift toward deduplication technology. The massive amount of duplicate data in any system makes this technology a compelling value proposition, he said.

Just look at Data Domain, the dedupe pioneer that, even in a recession, was still growing at a 30 percent to 40 percent rate before it was acquired by EMC.

EMC itself is pushing dedupe on several fronts, as are a host of others. Quantum, a dedupe partner for both EMC and Dell, is heavily promoting the deduplication capabilities of its DXi series appliances, a strategy that is paying off. Comparing its December quarter results to those in the previous quarter, disk and software sales increased about 44 percent, while tape automation sales were roughly flat.

“Sale of Quantum’s deduplication technology, which goes to market both through our branded DXi products and through our OEMs, was the biggest driver for the 44 percent increase in disk and software sales,” said Steve Whitner, disk product marketing manager for Quantum. “During the last quarter, we also benefited from the rapid market adoption of the EMC Disk Library products that include Quantum’s dedupe software.”

Tape, Fibre Channel Hang On
Quantum and EMC, of course, made their money on old-school storage platforms such as tape and Fibre Channel. While those fields are hurting to some degree, they certainly aren’t all bad. Like mainframes in the 1990s, tape’s demise exists mainly in the heads of competitors and pundits. Far from falling off the cliff, tape technology retains a strong user base.

“Tape continues to be leveraged for bulk data protection, backup and archiving,” said Schulz.

Sun’s Cates noted that declines in tape are being felt in small autoloaders and libraries with fewer than 50 cartridges. That market is being gobbled up by disk. On the other side of the coin, though, Sun is seeing some desire to upload large repositories onto tape as opposed to trying to manage it all on disk.

“We are also experiencing growth in consolidation opportunities: moving several small libraries into one centralized unit, as it is more cost-efficient,” said Cates. “In addition, we are seeing an uptick in enterprise storage systems in general.”

Recent stats compiled by Dell’Oro Group confirm this. Fourth quarter 2008 Fibre Channel sales were overall about even with the prior quarter. Like every other sector of storage, though, there were stronger and weaker elements to the market. Fibre Channel switch sales rose, for example, primarily due to higher prices rather than volume of sales. Users are clearly buying into the latest generation of switches, with their new features available at a premium. This includes 8 Gbps Fibre Channel and Fibre Channel over Ethernet (FCoE).

Late last year, users began trials of Cisco’s Nexus 5000 switch with FCoE software and adapters from Emulex and QLogic, said Tam Dell’Oro, president of Dell’Oro Group.

On the downside, host bus adapter (HBA) numbers were down from both the previous quarter and the year-ago quarter. Dell’Oro believes this is a function of the significant declines reported in the server market.

“The Fibre Channel adapter market is not feeling the rewards of users migrating to the higher-priced, higher-featured products,” Dell’Oro said. “Instead, this market is characterized by an increasing portion of lower-priced blade server adapters.”
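
For readers who want to see the three-tier model described earlier in this article (flash first, Fibre Channel second, SATA for the bulk) in concrete terms, here is a minimal sketch in Python. It assumes placement is decided from a daily read count and a mission-critical flag. The `Volume` class, the `assign_tier` function and both thresholds are hypothetical illustrations; the article quotes no numeric cut-offs and this is not any vendor's actual policy.

```python
from dataclasses import dataclass

# Hypothetical cut-offs chosen only for illustration; the article gives none.
FLASH_MIN_READS_PER_DAY = 1000
FC_MIN_READS_PER_DAY = 50


@dataclass
class Volume:
    name: str
    reads_per_day: int
    mission_critical: bool = False


def assign_tier(vol: Volume) -> str:
    """Return 'flash', 'fc' or 'sata' for a volume.

    The most heavily used data goes to flash, moderately used or
    mission-critical data goes to Fibre Channel, and everything
    else lands on bulk SATA.
    """
    if vol.reads_per_day >= FLASH_MIN_READS_PER_DAY:
        return "flash"
    if vol.reads_per_day >= FC_MIN_READS_PER_DAY or vol.mission_critical:
        return "fc"
    return "sata"


if __name__ == "__main__":
    for vol in [Volume("orders-db", 5000), Volume("mail", 200), Volume("archive", 3)]:
        print(vol.name, "->", assign_tier(vol))
```

In practice the cut-offs would be tuned so that only a handful of volumes qualify for the top tier, in line with Steinhardt's remark that users rarely need more than a dozen flash drives.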


Despite Economy, Storage Bargains Hard to Find
By Drew Robb

You’d think with the economy struggling that now would be a good time for a discount on that data storage array you’ve been coveting from the likes of EMC or NetApp, but so far storage pricing seems to be holding up.

Auto dealers and retailers offer big discounts to boost sales when the economy sours, so why aren’t data storage companies doing it? You’d think that with budgets being cut drastically, storage vendors and resellers would be offering fantastic savings on the cost of new hardware or software. But rebate and trade-in deals to rival the automotive industry don’t appear to be on the way just yet.

“On some products and technologies, pricing is holding, particularly for those where there continues to be strong demand,” said Greg Schulz, senior analyst and founder of StorageIO Group.

Mind you, there are some deals to be had, and prices in general are heading southward. One anonymous user witnessed a deal with Compellent and EMC where both were forced to drop their prices significantly on a midrange array due to price pressure from the Sun 7000 series.

Jim Dougherty, lead engineer at Plixer International Inc. of Sanford, Maine, is noticing more price cuts gradually creeping in.

“You are seeing the better deals or leverage on the larger items/quotes,” he said. “The vendors know that the business is out there and will do whatever they can to obtain it.”

Others report a steady if unspectacular decline in prices that began well before the current woes.

“Pricing was already dropping due to the arrival on the market of products built from general purpose server components,” said Jason Williams, CTO and COO of Digitar Inc. of Boise, Idaho, a company that has saved a lot of money with commodity hardware and open source storage. “With more companies embracing those solutions due to financial pressures, traditional proprietary vendors are being forced to lower their prices to stay competitive.”

Schulz, for instance, mentions that 4Gb Fibre Channel SAN products have become more affordable, thanks to the arrival of 8Gb Fibre Channel, with Fibre Channel over Ethernet (FCoE) lurking just around the corner. He has also noticed a price drop in 10Gb Ethernet ports as well as many midrange storage systems, including those using high-performance Fibre Channel or SAS disk drives.

“From a storage system perspective, particularly for entry-level solutions, some real bang for the buck can be found” in solutions such as the EMC Clariion AX4, Dell MD1000/3000 series, HP MSA2000, IBM DS3000, NetApp FAS2000 and Nexsan SATAbeast systems, among others, said Schulz. “Prudent buyers that can plan and leverage their purchasing plans and capacity plans have great opportunities to leverage current vendor incentives and promotions,” he said.

Similarly, Chris Beck, a network administrator for the City of Fontana, Calif., has observed that products seem to be cheaper than before. He replaced an HP EVA 5000
with a Xiotech Emprise 7000 as the city’s core production storage system.

“The cost of our Emprise 7000 was about a third of the cost that we paid for our EVA 5000 back in 2001, and the EVA 5000 had less than half the capacity back then that it does now,” said Beck. “Even the new EVA 8100 that was our second option was less than half the cost of the original EVA that we purchased.”

Maintenance and Services Deals
One area where the better deals are to be had appears to be in services. According to Schulz and Williams, the bargains are often in multi-year support contracts.

“Anything that will ensure multi-year revenue to a vendor can be used now as a great bargaining tool on the rest of the hardware in the deal,” said Williams.

Tim Chester, CIO of Pepperdine University in Malibu, Calif., agrees. While he isn’t seeing much in the way of hardware price cuts, he is noticing more value-add in ongoing deals. This includes free consulting and more help on implementation. He has also noticed far more cold calling from vendors, which bodes well for easier negotiations going forward.

Resellers and consultants are noticing it too.

“Companies are definitely pushing back on vendors wherever possible,” said Chip Nickolett, owner of Comprehensive Consulting Solutions Inc. of Brookfield, Wisc. “Usually there is some threat of discontinuing use or migrating off a product used as leverage to renegotiate an existing multi-year agreement or achieve more favorable terms on renewals.”

And don’t expect too much customer loyalty in the current climate. The likelihood is that users will lose their long-term preferences when a potential usurper provides a low enough offer.

“I will go outside of normal channels to find that price,” said Rainer Mueller, IT analyst for the City of Encinitas, Calif. “I will also play one vendor against another; unfortunately, it has come to that.”

So far he hasn’t seen much in the way of cut-rate storage. What he has found, though, are desktop systems with more bells and whistles at better prices than a year or two ago. He’s also seen some especially aggressive pricing in the antivirus software market.

“Newer companies tend to go after the pricing provided by older, more established companies,” said Mueller. “We received a bid from a newer player that offers us three years of coverage compared to what we paid for one year with the older company.”

Leasing, SaaS and Open Source
All of this may add up to radical changes in buying patterns over the long term. With dollars for outright purchasing growing tighter, leasing may make a comeback.

“I’m seeing and hearing a pickup in leasing activity, which has been rather light to non-existent for the past several years, as a means of stretching dollars and cash flow,” said Nickolett.

Other possible shifts in the market might appear in the areas of open source software and Software as a Service (SaaS).

“SaaS and open source have become vehicles for newer vendors to provide a credible threat,” said Nickolett.

He suggests a complete proof of concept effort to demonstrate the technical capabilities of such alternatives, followed by a plan to migrate 10 percent to 15 percent of your IT footprint to that platform as part of a strategic cost reduction effort.

“You’ll soon have your vendor’s attention,” said Nickolett. “I personally believe that our current economic crisis is the change agent that will drive SaaS and open source to the next level of widespread enterprise adoption.”

How You Can Save on Data Storage Costs
By Drew Robb

Data storage has been a major and growing part of IT budgets for many years, so it’s not surprising that cost-cutters have been taking a hard look at storage costs in the worst economic downturn in more than 50 years.

We’ve compiled a few tips and technologies that can help in the new era of frugality. Some are free but may require some investment in resources, while others promise a rapid return on investment (ROI), in some cases offering a payback in less than six months.

Dedupe Express
Two areas tend to dominate discussions about storage costs: flash-based solid state drives (SSDs) and data deduplication.

Starting with the latter, dedupe has become almost a badge of honor among storage vendors. While Data Domain popularized the technology, it’s hard to find a storage vendor that doesn’t offer the technology these days. The likelihood is that this technology will eventually become standard for storage and backup purposes.

“Data deduplication means you can squeeze a lot more data into a lot less space,” said Mike Sparkes, product marketing manager for entry disk systems at Quantum. “It helps you save money in numerous ways.”

He gives the example of a small software development company with a single site. By installing a dedupe appliance for $12,000, it saved that amount alone in tape media costs in its first year. When you factor in a reduction in backup management efforts by 250 to 300 hours per year, shorter backup windows (33 percent) and time to restore reduced by 75 percent, this makes dedupe a relatively easy item to sell to even the most tightfisted CFO.

A Data Domain customer, Great River Energy of Maple Grove, Minn., reported data compression rates as high as 100 to one on some applications. As a result, restores were done in half the time and backup administration has been cut down from one day per week to ten minutes, said Joe Gleason, IT Systems Engineer for Great River Energy.

The savings occurred on many fronts, such as a 45 percent reduction in wattage.

“Compared to a scenario in which we would expand on our legacy tape library platform, this solution has provided us with significant power, cooling and data center footprint advantages,” said Gleason.

Flash in the SAN
Like dedupe, flash drives are constantly in the news these days. While they don’t match up on a capacity/cost basis against Fibre Channel (FC) disk, prices are dropping rapidly.

“The price of flash is down 76 percent in the past year,” said Ken Steinhardt, vice president and CTO of customer operations at EMC. “Every day that goes by, it keeps getting cheaper.”

According to Pat Wilkison, vice president of marketing and business development at SSD maker STEC, the most obvious value proposition is to use SSDs to reduce the amount of memory needed by a system. Flash works out at orders of magnitude cheaper than RAM and has enough
capacity these days that an entire database can be saved on flash for super fast I/O.

“Flash offers an immediate and tangible ROI, as you gain high performance and require less memory,” said Wilkison.

Tiering Up Over Flash
Some vendors, like EMC, Compellent and Sun Microsystems, have taken SSDs a step further as a new high-performance storage tier. This reduces the need for Fibre Channel disks and speeds performance for the most mission-critical applications.

Steinhardt suggests placing the most heavily utilized data on flash, then offloading it to FC for medium utilization, and running the bulk of data on low-priced SATA drives.

“This new kind of system tiering is being driven by flash and low-cost, high-capacity SATA,” said Steinhardt. “This cuts your power, cooling and space costs, and reduces reliance on a large pool of expensive FC drives.”

SATA-fied Storage Customers
Moosa Matariyeh, an enterprise storage specialist at CDW, takes things a step further and advises those looking to save on storage costs to dump FC and SAS for SATA.

“Migrate data from Fibre Channel, small computer system interface (SCSI) or SAS storage to less expensive SATA,” he said. “This also saves on cooling and power costs as well as rack space, as the SATA drives you would be migrating to have more density.”

He offers an example in which costs reflect only the drive street price and do not include any enclosure or additional equipment needed: SAS storage can be as low as $1.25 a GB, while SATA drives from the same manufacturer cost as little as 15 cents per GB, or nearly 90 percent less. Add in power and cooling costs and the total cost of ownership (TCO) can be dramatically less.

“Migration can be done manually, or a software package can be implemented to automatically monitor the age of files,” said Matariyeh. EMC, Symantec and CommVault “are among those offering software packages to manage this functionality.”

HP Sees Opportunity in Data Deduplication
By Paul Shread

HP sees the bidding war between EMC and NetApp for Data Domain as evidence of the potential of deduplication, and the company says it’s ready to do battle.

“The size of the deal and the bidding between EMC and NetApp was a testament to the size of the market opportunity,” said Kyle Fitze, HP’s marketing director for Storage Platforms. “We want to be there and we want to compete aggressively in the market.”

HP has offered dedupe for more than a year through its partnership with Sepaton, and the company also offers its HP Labs-developed D2D Backup Systems for remote offices and small businesses.

HP’s Data Protector software also offers host-based dedupe capabilities, and the company announced a reseller agreement with compression and dedupe specialist Ocarina Networks, which also works on image files, for HP’s NAS offerings.

“This is a space that HP is taking seriously,” said Fitze.

Deduplication technology reduces data, speeds up restores, and helps minimize bandwidth usage during replication, he said. Fitze said HP sees dedupe as part of an overall capacity optimization strategy that also includes thin provisioning, snapshots, pooling and virtualization, as storage users become more focused on freeing “trapped capacity and perfor-
(continued on page 11)

Keep Consolidating
Consolidating has been a staple in IT now for most of this decade. And it just keeps right on going. The more you get rid of data center sprawl and Indiana Jones-esque warehouses full of endless rows of servers, the lower your management,
power, cooling and space expenses will be. With one caveat:           Leasing, the Cloud and Open Source
don’t throw out perfectly good equipment to consolidate, as           Leasing is making a comeback as a way to stretch storage
that delays ROI considerably. Wait till gear is at or beyond end      dollars, says Chip Nickolett of Comprehensive Consulting
of life and then bring in the consolidation cavalry.                  Solutions of Brookfield, Wisc.

“In storage, it is all about consolidation, consolidation,            Open source storage technologies have also been catching
consolidation,” said Shaun Walsh, vice president of corpo-            on in the weaker economy. Sun has built its Open Storage
rate marketing at Emulex. “In addition to extending the life of       line on open source software and commodity hardware, while
current hardware and lowering administration and storage              vendors like Zmanda have used open source technologies as
costs, one of the hidden benefits of aggressive consolidation         a way to break into the storage market.
is saving on the ongoing cost of service and maintenance —
service contracts on many older systems often cost more an-           Nickolett suggests a modest open source implementation as
nually than purchasing new storage.”                                  a good way of getting your vendor’s attention.

Use the Windows Storage SIS Feature                                   And lastly, while cloud-based storage services have been
CDW’s Matariyeh offers one free tip for controlling un-               slow to catch on in the enterprise space, this year has seen
structured data, which he regards as the biggest issue in             the arrival of a new startup that claims its service can make
storage because it contains so many file types, is coming             primary data storage in the cloud a reality.
from different sources and is growing at the fastest rate.
Estimates are that 70 to 80 percent of the data in data               One of the more interesting uses of a cloud storage service
centers today is unstructured and growing at more than                has been Twitter, which uses Amazon’s Simple Storage
65 percent a year.                                                    Service (S3) to store avatar icons. Perhaps finding well
                                                                      targeted uses for online storage services is an avenue
“One simple way to help free up space is to activate a                worth exploring.
currently existing function in your Windows Storage
Servers,” he said. “Single-instance storage (SIS) is a feature
built into Windows Storage Server 2003 which will take a
look at all the data within the volumes and reduce duplicates
to one file.”

For example, if a department sends out a 1 MB PowerPoint
document to 30 people and each one saves it in their “My
Documents” directory, that is 30 MB of space from a single
file. WSS will reduce this down to one copy and point all
users to the one copy. That is a 29 MB savings in space from
activating a feature already available in the system.

Get Rid of Certain File Types
Sometimes, it’s the simple things that can make daily work
experience easier. Storage managers can focus on the easy
items that provide the biggest payback, either in reclaimed
storage or data protection for business continuity. For
example, it isn’t difficult to spot and remove any personal,
unnecessary, or large file types from expensive corporate
storage resources.

“Such work is often already mandated as part of compli-
ance directives that outlaw files whose name ends with
.mpeg, .mpg, .mp3, .wav, .pst, .log, .bak, and so on, said
Stefan Kochishan, director of mainframe product marketing
at CA. “These files can then be deleted to increase usable
storage space.”
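
Before switching on single-instance storage or purging file types, it can help to estimate how much space is actually at stake. The Python sketch below is one way to do that, and it is not how Windows Storage Server itself works: it assumes files count as duplicates when their SHA-256 digests match, and the names `survey` and `file_digest`, along with the extension list (taken from the Kochishan quote), are illustrative choices. It only reads and reports; nothing is deleted.

```python
import hashlib
import os
from collections import defaultdict

# Extensions quoted in the article; adjust to match local policy.
FLAGGED_EXTENSIONS = {".mpeg", ".mpg", ".mp3", ".wav", ".pst", ".log", ".bak"}


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def survey(root):
    """Walk root and report duplicate content and flagged file types.

    Returns (reclaimable_bytes, flagged_paths). Files are never modified.
    """
    by_digest = defaultdict(list)  # digest -> [(path, size), ...]
    flagged = []
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if os.path.splitext(name)[1].lower() in FLAGGED_EXTENSIONS:
                flagged.append(path)
            by_digest[file_digest(path)].append((path, os.path.getsize(path)))
    # Every copy beyond the first of identical content could be reclaimed.
    reclaimable = sum(
        size for copies in by_digest.values() for _path, size in copies[1:]
    )
    return reclaimable, sorted(flagged)
```

Run against 30 saved copies of the same 1 MB presentation, `survey` would report 29 MB as reclaimable, which matches the example in the text.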


                  Brother, Can You Spare
                        a Petabyte?
                                                        By Henry Newman

Let’s face it: Times are tough and there’s a lot of pressure to cut costs. I hear it all the time from my customers.

But it’s not as simple as choosing the cheapest data storage technology. If you care about your data, and if you’re reading this you probably do, then you need to consider the technology and reliability tradeoffs of storage technologies, whether you’re an enterprise, small business or even an individual home user (my own home backup and data protection scheme borders on the paranoid). Storage costs aren’t just about the price of the hardware or software; they’re about operating and maintenance costs, and the cost of lost or corrupt data.

When I am trying to help customers understand the technology tradeoffs, the first thing I do is to try to understand what their requirements are. Usually I get a glazed look or get told to just solve the problem, and sometimes I’m told that the requirement is for storage that’s as cheap as possible. Very few people actually understand their requirements, and even fewer know how to apply them.

SATA, SAS and Tape
Let’s look at the example of choosing between different types of disk and tape drives. You might say these can all be taken care of by RAID, but there are some important things to consider; I think that even the bean counters don’t want you to put the company’s data at risk.

The biggest issue is the hard error rate of the technology. Every disk and tape drive has a hard error rate, specified as the average number of bits that can be read or written before the device returns an error saying that it cannot be accessed. There are lots of reasons for hard errors, such as media errors, head errors and media failures. It doesn’t matter what the cause is; what matters is how often it happens for each of the devices. If you have a hard error with a RAID-5 LUN, the LUN will need to be rebuilt, and hopefully you won’t get another hard error or the data will be lost. With RAID-6, another hard error is still not catastrophic, as you have two parity devices.

You know what they say about lies and statistics, but the hard error rates below come from drive manufacturers for both disk and tape.

Device             Hard error rate  Equivalent   PB          Days to hit    Days to hit
                   (in bits)        in bytes     equivalent  at 120 MB/sec  at 200 MB/sec
Consumer SATA      10E+14           12.5E+13     0.89        92             55
Enterprise SATA    10E+15           12.5E+14     8.88        920            552
Enterprise SAS/FC  10E+16           12.5E+15     88.82       9,198          5,519
LTO                10E+17           12.5E+16     888.18      91,982         55,189
T10000B            10E+19           12.5E+18     88,817.84   9,198,247      5,518,949

You have to remember that the bit error rate (BER), which is also known as the hard error rate, is completely different from the annualized failure rate (AFR) of the device. One way to look at it is the failure of a single access compared to the failure of the whole device. Sometimes with some RAID controllers, the failure of a single access is the failure of the device, but you have to remember that BER is measured in bits of transfer and AFR is measured in hours. A device can fail just sitting doing nothing, but the BER is based on device usage. If you care about your data, this is a critical issue.

Some lower-end storage systems use consumer-level SATA drives, which if used heavily can fail pretty quickly. The problem is that in RAID devices, sometimes if one device fails, other devices will fail during rebuild. The bottom line is that you need to consider the disk drives and your exposure to data loss as part of any storage decision. Buying the cheapest stuff on the market might get you the storage you want, but it might wind up costing you your data.

The cost per GB for SAS and Fibre Channel drives is much higher than SATA, but few people realize that for important data you should include the reliability calculation as part of the decision-making process. If your data is critically important to your organization, having a BER that’s ten times better is an important consideration; clearly the cost difference per GB between SATA and SAS/FC isn’t nearly as great. Even in tough times, it is important to consider not just the initial cost, but the cost of losing what is important.

continued from page 8
mance in the existing environment.” Boosting capacity utilization from, say, 25 percent to 50 percent can mean big savings for end users, Fitze said.

Fitze did not provide the number of HP dedupe users, but he said the number is “growing all the time.”

David Shoup, technology manager for the Mohegan Tribe in Connecticut, said the tribal government chose HP’s Sepaton-based Virtual Library System to work with its HP EVA SAN environment. The tribe also switched to HP’s Data Protector software at the same time for centralized management. Shoup said he briefly looked at Data Domain, but the HP VLS made more sense for the HP environment.

“It’s worked rather well for us,” he said, as backups have fallen from more than 24 hours to 8 to 10 hours even as the tribe has backed up more data. He said he’s seeing about a 5.5 to 1 deduplication ratio on changed data.

Asked what he’d like to see added to the VLS, Shoup said he would like to see more automation, but nonetheless said he finds it “straightforward” to operate and is generally a satisfied customer.

Tape Versus Deduped Disk
I have seen no reputable study showing that disk and tape costs per GB are even close. Tape always wins on cost, but do you have to write everything to tape?

Data deduplication has become one of the fastest-growing segments of the storage market, if not the fastest. There are many companies that provide dedupe technology. Some are integrated hardware platforms, while others are just software. Some of the claims of 50 to 1 reduction in the amount of data backed up are realistic in environments such as VMware, but other environments such as media files do not get anywhere near that ratio and compression is often similar or even better.

Dedupe can speed up the backup process if there is enough bandwidth to the dedupe device compared with the bandwidth to tape. With tape latency and other issues, dedupe will likely be a big winner over standard tape backup from a time perspective, and depending on the size of the backup and the number of tapes, tape slots and the cost of the dedupe system, it can even be a cost savings. Of course, the real issue for backup isn’t backing up the data, but restoring it. Keep in mind that the dedupe platform can expand data faster than it can likely write it to the channel.

One of the biggest complaints I hear about tape is that it is slow. The latency for tape to load and thread and be ready hasn’t changed much since the advent of the tape cartridge, but that isn’t the real issue. More often than not, the real issue for backup and tape performance is that tapes are faster today than most networks that they are attached to. Take the following facts. In 2000, LTO uncompressed data rates were 20 MB/sec and most networks were 1Gb, or realistically about 80 MB/sec to 90 MB/sec, so the network was four or more times faster, and about half that with compression.

LTO-4 today boasts a 120 MB/sec uncompressed data rate, 240 MB/sec compressed, and with 10GbE networks to the backup server, you have a little more breathing room, but not much. But the problem is that very few people have an end-to-end 10GbE network, and remember you will be bound by the slowest point on the network. The same is true with tape: if you are using FC-2 with LTO-4, for example, FC-2 has a 200 MB/sec limit and LTO-4 with compression is 240 MB/sec. Add to this that most people put multiple tape drives on the same FC connection and you have a performance issue that is again caused by the network.

This is why, if you are going to use tape (which is, after all, not only cheaper than disk but also more reliable if handled and stored properly), you need to stream the device at full rate, including compression, to use it efficiently, so disk-to-disk-to-tape (D2D2T) is the way to go. To accomplish this requires using either a VTL or backup software that manages a D2D2T framework, and this usually is an added expense for the software. The tradeoff between D2D2T, VTLs and dedupe, or a combination of one or more, is a complex decision that depends on the dedupability of your data, the state of your network, the cost of the additional hardware and software and other factors such as power, training and floor space. One benefit of a D2D2T system could be deduping data before writing to tape, thus saving even more money.

And another factor to consider: if you’re eliminating multiple copies of data, make sure the one you’re keeping is right. Check with your dedupe vendor to make sure they have proper checks for ensuring data integrity and reliability (see Data Corruption: Dedupe’s Achilles Heel).

The disk and tape tradeoffs are pretty clear. Tape is cheaper and potentially more reliable than disk, but you need the right infrastructure to make it efficient. Dedupe has promise for saving on storage costs, but cheap disks carry the potential for data loss. With apologies to Rush, you can’t get Something For Nothing in the data storage market, but hopefully you now know something about spending your money wisely.
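
The “slowest point” rule for tape streaming can be put into numbers. The following Python sketch is a back-of-the-envelope model, not a benchmark: it uses the data rates quoted in the article, assumes that drives on a shared Fibre Channel link split its bandwidth evenly, and takes 85 MB/sec as a stand-in for the 80 to 90 MB/sec quoted for 1Gb networks. The function names are made up for illustration.

```python
def effective_rate(*stage_rates_mb_s):
    """A data path runs no faster than its slowest stage (rates in MB/sec)."""
    return min(stage_rates_mb_s)


def per_drive_share(link_rate_mb_s, drives_on_link):
    """Bandwidth each drive gets if a link is shared evenly (an assumption)."""
    return link_rate_mb_s / drives_on_link


LTO4_NATIVE = 120      # MB/sec uncompressed, from the article
LTO4_COMPRESSED = 240  # MB/sec compressed, from the article
FC2_LINK = 200         # MB/sec FC-2 limit, from the article
GBE_REALISTIC = 85     # MB/sec, assumed midpoint of the quoted 80 to 90

# One LTO-4 drive alone on FC-2: the link, not the drive, sets the pace.
single_drive = effective_rate(LTO4_COMPRESSED, FC2_LINK)

# Two drives on the same FC-2 link each get less than the native rate.
shared_link = effective_rate(LTO4_COMPRESSED, per_drive_share(FC2_LINK, 2))

# Feeding the drive over a 1Gb network is slower still.
over_gbe = effective_rate(LTO4_COMPRESSED, FC2_LINK, GBE_REALISTIC)

# A drive can only stream if the path delivers at least its rated speed.
can_stream_native = shared_link >= LTO4_NATIVE
```

Under these assumptions a second drive on the same FC-2 link already drops each drive below the 120 MB/sec it needs to stream uncompressed data, which is the case for staging backups on disk first.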


                       Data Corruption:
                     Dedupe’s Achilles Heel
                                                        By Henry Newman

Data de-duplication is one of the hottest technologies in storage these days, and users and vendors alike are climbing on the bandwagon. There are vendors building hardware products, others building software products, and some doing both.

I am not going to compare products or different vendor technologies, but I am going to look at an important issue you need to ask your vendor about if you’re considering purchasing data de-duplication hardware or software, and that is data corruption.

You might wonder what de-duplication has to do with data corruption, and I’ll get to that in a minute. But it’s important to note that I’m writing this article from a generic hardware and software point of view. Some vendors’ products may or may not address all or part of the problems I will discuss in this article. It’s up to you to understand what you are buying and to ask the vendors the right questions. Caveat emptor.

A Trip Down the Data Path
I wrote an article on a data corruption experience I had where I compared a few bits and the ASCII characters had changed dramatically; in fact, most of the bytes went bad in the example I gave.

The point of the article was that bits occasionally go bad, sometimes sooner rather than later. It does not matter if it is high-end enterprise Fibre Channel, where it might happen far less often than with cheap SATA. It might not even be the drives or the controller; it could be that the memory of the machine corrupted the data or the CPU or something else. The bottom line is that at some point your digital data in the digital world will be corrupted. Although the likelihood varies based on the operating system, the hardware, and the software, it can happen even on IBM mainframes running MVS, although the potential is far lower than any other system given the amount and number of parity and checksums calculated and checked.

A Swiss laboratory last year published a paper on data corruption and its sources that is worth reading.

You might wonder what all this has to do with data de-duplication. In a nutshell, if you de-duplicate your data and the hash area for the data de-duplication hardware or software gets corrupted, you can lose all of your data. If you’re going to get rid of duplicate data, it’s critical that the data you have be right.

For example, what if the data comparison hash was data that was corrupt at the time the data was read, but the data on the disk is still good? If you read it again, you will likely get the correct data. But what if the hash data written on disk was bad or went bad, would you still be able to read your files? Let’s step through these two examples and see what happens. As a reminder, I am doing this generically and the examples might or might not work for a set of vendors based on their hardware and software.

Case 1: Corrupted Data Read
If you read data from a disk and the data you read was corrupted for any reason (disk drive, channel, controller, or other reason) and then started to apply the corrupted data to new data, you would have a major problem. When you read the information again from disk to de-duplicate it, it would not be the same.

If you compare the data that you read with the incoming data, the data in memory will be bad, so any data that you find a match with will be compared with data that will be different the next time it is read. So basically any new data from the point of the data read with the corrupted read will be compared incorrectly and therefore be unreadable.

If the hash is then reread for some reason and is read correctly, any subsequent data read will be just fine. Other than that, it will be a debugging nightmare, one which I am pretty sure is unrecoverable and a significant amount of data will be lost. The scary part is that some of the data is good and some of the data is bad, and figuring that out is likely not possible without some serious detective work.

Case 2: Corrupted Data Hash Data
What if the data on disk gets corrupted and is bad from the start? This is a similar problem to the first case, except that with Case 1 you have good data, then bad data, and then likely good data. With this case, the hash that was created is in memory and is good, but the hash on disk is bad. That means you have data that was created with a good hash, but once the hash is read from disk, the data will be bad. The good news, if there is any, is that once the hash is read from disk back into memory, it will be the same, so the problem should be limited. But you will have data you create that cannot be un-de-duplicated for the time period that the data was created with the original in-memory hash. So when you go to un-de-duplicate the data months or years later, you will have bad data until you re-read the hash from disk and then have good data from that point on. Again, this is a debugging nightmare and likely impossible to figure out.

What You Need to Ask Vendors
I am a firm believer in the reality of undetected data corruption. It has happened to me and I have seen it happen to others, and sooner or later it will happen to you. I am also a firm believer in the new T10 Data Integrity Field standard, which passes an 8-byte checksum from the host to the disk and has the disk confirm the checksum, which should be generally available from a number of vendors likely later this year. I personally like this standard, as some of it is implemented in hardware in the data path, including the disk drives, and is from the same people that brought you the SCSI protocol.

There are file systems that do checksums, but if a file system is doing checksums and correcting the data, then you have two issues:

  • The file system must read the data back to the server before the checksum can be confirmed or rejected. It is not checked when the data is written to the device by some of the hardware in the path.

  • The server CPU must calculate the checksum and also confirm it when the file is read back in. There is a significant effect on the server doing all of this checksum activity. This includes increased memory bandwidth requirements and utilized CPU caches, requiring applications to potentially reload from memory and memory bandwidth usage to increase by the checksum calculation.

This is an issue if you are running applications that use significant server resources.

There are products that have their own file systems and checksums and address some of my concerns about data corruption, but not all vendors have products that have this functionality built into their offerings. This is just one of the areas that you should be concerned about with data de-duplication. It should not be the only consideration for the evaluation of a vendor’s offering, but it should be one of the high-priority considerations. Vendors might say that this is your problem when you ask the question, and that your environment should be running something like T10 DIF. Wrong answer. Vendors need to be thinking about your hardware and software before you ever ask a question, and if they leave the problem to you, then I would be running the other way.

Data de-duplication is a great tool for some environments, but as with everything complex, it requires some careful planning and execution.
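
One concrete question to put to a vendor is whether stored data is checked against its hash every time it is read back. The Python sketch below shows the general idea with a toy in-memory store. It is hypothetical and generic, not any vendor’s design: the class name, the 4 KB chunk size and the choice of SHA-256 are all assumptions made for illustration, and it only detects a bad chunk; it does not repair one.

```python
import hashlib


class VerifyingDedupeStore:
    """Toy content-addressed store: each unique chunk is kept once, keyed by hash."""

    def __init__(self):
        self.chunks = {}  # hex digest -> chunk bytes (stands in for disk)

    def write(self, data, chunk_size=4096):
        """Store data and return the list of chunk digests needed to rebuild it."""
        recipe = []
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # duplicates are stored once
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Rebuild data, re-hashing each chunk so corruption is reported, not returned."""
        parts = []
        for digest in recipe:
            chunk = self.chunks.get(digest)
            if chunk is None:
                raise KeyError("missing chunk " + digest)
            if hashlib.sha256(chunk).hexdigest() != digest:
                raise ValueError("chunk " + digest + " failed verification")
            parts.append(chunk)
        return b"".join(parts)
```

Because every file that shares a chunk depends on the same stored copy, a single bad chunk affects all of them. Verifying on read at least turns silent corruption into an explicit error that can be traced to one chunk.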
