At the Core
➢ examines electronic discovery: its
relationship to records management
and its future RIM implications
➢ discusses how e-mail, storage,
backup, software, and operating
systems are evolving
➢ offers predictions for electronic
discovery in 2010
34 The Information Management Journal • November/December 2003
The past 10 years have proved
that the escalating costs of data collection
and review in discovery, as well as the
complexity of the systems themselves,
demand a major realignment of how
business data is maintained
Deborah H. Juhnke
The year is 2010. Margaret Techway, a highly placed, first-generation,
holographic memory engineer, has recently left her company, Innovations
Inc., to join market-newcomer 3-D Strategies. Upon her departure, the
“data-freeze” provision of Innovations’ e-risk management policy was
implemented automatically. A remotely performed, quick forensic review of
her primary workstation uncovers suspicious activity during the previous two
weeks, which gives Innovations cause to file a lawsuit against Techway and
3-D Strategies for trade secret theft. The challenges of proving the case,
however, are just beginning. Blogs, biometric keys, and blades are only a few
of the technological hurdles attorneys will face in developing the case.
Because instant messaging (IM) has replaced e-mail as the preferred form of
business communication but has not been consistently monitored or saved at
Innovations, there are no e-mail archives to search. What files there are had
been copied to a removable thumb drive and taken by Techway, leaving little
evidence of their removal. Asking for the thumb drive in discovery will be
only half the battle, however, because Techway’s thumbprint is necessary
to access the drive. 3-D Strategies has adopted blade servers that are
configured in a redundant array of inexpensive disks (RAID) format, meaning that
Innovations’ attorneys cannot simply ask for “the server” drive. The increased
capacities and more complicated backup models hamper the plaintiff’s
attempts to narrow the scope of digital data discovery.
Finally, because Techway has participated in an unstructured public weblog
(blog) dedicated to the discussion of new technologies (and sanctioned by
Innovations), there are some questions regarding whether the trade secrets
taken were, in fact, secrets anymore.
This brief vignette illustrates several points:
• Reliance on the “document” paradigm must change. In years past, discovery
was comparatively simple. Ask for documents, get paper. But no longer. Much
of what constitutes relevant discovery today and in the future will not,
cannot, or should not be printed.
• Constant vigilance in understanding new technology as it relates to
electronic discovery is required. Remember when there was no such thing as a
personal digital assistant (PDA)? Over the past 10 years, fledgling
technologies such as cell phones, digital documents, Web cams, and IM have
become mainstream, and new sources of digital data present themselves daily.
These new technologies offer risks along with rewards. Organizations must
accept that both technology and redesigned processes will be required to help
manage, search, and produce an increasing variety and volume of data. As
volumes increase and sources multiply, it will no longer be possible to
gather and review all data.
• Computer-based discovery cannot be treated like paper-based discovery. The
quill pen has given way to the digital pen, creating a responsibility to
respect and protect this more fragile form of evidence.
When viewed in light of recent corporate scandals, topics such as these are
more relevant than ever to records managers, lawyers, and corporate
management. The past decade has provided some lessons, but there are many
more to learn.
The Document Is Dead
There was a time when documents were described in discovery as “writings of
every kind and description that are fixed in any form of physical media.” The
problem is that the common legal definition of a document is conceptually
misleading in the context of electronic discovery issues. This is
particularly true for collection and review of voice, video, databases, and
Internet-based communications. When addressing these types of data, the
average person’s concept of a document – something that may be printed, read,
and held in a person’s hand – begins to blur.

Although expanding the legal definition of a document to include electronic
data creates the obligation to produce such data in discovery, it offers no
guidance on how that production should be carried out. Consequently, there is
significant variation in methods used to produce electronic data for
discovery. The assumed intent of production is to provide meaningful
information, but there are ways in which this intent may be intentionally or
inadvertently circumvented.

With paper documents or even word processing files, the meaning is fairly
clear. There is a beginning, an end, and a logical structure. True documents
tend to be self-contained, or at worst, refer to other documents in support
of their content. This makes fitting digital data into the conceptual
framework of a document particularly troublesome.

There have been attempts in the past five years or so to shoehorn digital
data generated in discovery into the document paradigm, including printing it
to paper, printing it to image, extracting it into file structures, and
posting it to the Web for review. As technology advances, however, these
techniques will become less suitable. They will fall short in their ability
to accommodate all relevant forms of data and must evolve to remain viable.
Likewise, forays into electronic discovery that have been limited to the
collection and review of e-mail should be made cautiously: the good stuff may
be left behind. The case where relevant data is found buried in a single
field within a corporate database is only one example.

Predictions for Electronic Discovery in 2010
• Indiscriminate conversion and production of data will end.
• The “document” will be replaced by the “dataset.”
• Calculated (not random) sampling will be standard.
• Language used to request and describe electronic discovery will become
more specific.
• There will be more use of technology and techniques for filtering,
including search-and-review tools based on artificial intelligence models.
• Computer technology will no longer be Microsoft-centric.
• E-mail will give way to other forms of communication as the primary source
of data discovery.
• There will be a need for the wider use of experts, consultants, and
attorney “specialists.”
• The judiciary will become more educated and experienced in the use and
abuse of electronic discovery.
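The remark above about relevant data buried in a single field within a
corporate database can be made concrete with a small sketch. The example
below is illustrative only: it assumes a SQLite database, and the table
names, column names, and sample rows are hypothetical, borrowed from the
opening vignette rather than from any real system. It walks every table and
every column and reports each text field that contains a search term, which
is the kind of review an e-mail-only collection would miss.

```python
import sqlite3


def find_term_in_text_fields(conn, term):
    """Return (table, column, rowid, value) for each text cell containing term."""
    hits = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'")]
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        for column in columns:
            # Identifiers are interpolated, so this assumes a trusted schema.
            query = (
                f'SELECT rowid, "{column}" FROM "{table}" '
                f'WHERE typeof("{column}") = ? '
                f'AND instr(lower("{column}"), lower(?)) > 0'
            )
            for rowid, value in conn.execute(query, ("text", term)):
                hits.append((table, column, rowid, value))
    return hits


def build_demo_db():
    """Build a tiny in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, notes TEXT)")
    conn.executemany(
        "INSERT INTO orders (customer, notes) VALUES (?, ?)",
        [("Acme", "standard terms"),
         ("3-D Strategies", "rep mentioned holographic memory specs")])
    conn.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO staff (name) VALUES ('M. Techway')")
    return conn


if __name__ == "__main__":
    for hit in find_term_in_text_fields(build_demo_db(), "holographic"):
        print(hit)
```

A real corporate database would call for vendor-specific tooling and a
forensically sound copy; the sketch only shows why the unit of review is the
field within a dataset rather than the printable document.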
Key Discovery Technologies

Technology: Instant messaging
What It Is:
• allows immediate communication via the Internet
• similar to e-mail, but without constraint, tracking, or preservation
Example: AOL, MSN Messenger
Electronic Discovery Issues:
• enables users to circumvent corporate e-mail
• no record unless saved proactively

Technology: Alternative e-mail
What It Is: e-mail systems that operate outside the corporate environment
Example:
• PocoMail
• ISP-based e-mail such as Yahoo
Electronic Discovery Issues:
• enables users to circumvent corporate e-mail
• no record unless saved proactively

Technology: Biometrics
What It Is: security based on personal physical characteristics, such as
retinal scan and fingerprint
Example: thumbprint access on PDAs or USB port drives
Electronic Discovery Issues: can confound discovery and data retrieval
efforts by making access difficult or impossible

Technology: Filtering software
What It Is: filters spam or other messages and files
Example:
• Spam Assassin
• filters embedded in ISP services such as AOL, Earthlink, and MSN
Electronic Discovery Issues:
• cannot assume that data sent was received
• on subscription services alone (not business e-mail), 11.7 percent of
messages requested were never received, according to Information World

Technology: Collaboration software
What It Is: enables communication between companies and individuals in
remote or Web-based environments
Example:
• Eroom
• WebEx
Electronic Discovery Issues:
• may be overlooked as source of data
• may be only copy of relevant data
• difficult to monitor for data preservation
• eventually may become part of the operating system itself

Technology: Virtual offices
What It Is: business model whereby employees in selected departments work
from home
Example: Jet Blue reservation agents
Electronic Discovery Issues: dispersed data

Technology: Portable storage
What It Is: small, removable storage devices holding up to 40 GB and costing
only about $400
Example:
• Pocket Drive
• Microdrive (IBM)
• SanDisk CompactFlash
Electronic Discovery Issues:
• hard to find
• hard to track
• easy to steal data

Technology: Blogs
What It Is: Web-based personal or topic-specific bulletin boards
Example: See blogger.com for examples
Electronic Discovery Issues:
• ad-hoc nature
• difficult to track, collect, or identify
• if found, could be good evidence

Technology: Blade servers
What It Is: network-based servers based on “blades” that are added to a
chassis, enabling many servers to be housed in a small space and boosting
network efficiency
Example: IBM, HP, others
Electronic Discovery Issues:
• more difficult for the untrained user to see
• holds more data
• more difficult to seize and review, as they are generally formatted as RAID

Technology: Digital files (beyond word processing)
What It Is: former analog files that are now digitized, including voice and
audio
Example: .wav and .MP3
Electronic Discovery Issues: an often-forgotten source of relevant data,
particularly when used to broadcast corporate information

Technology: Data mining
What It Is: programs that enable data from a variety of sources to be viewed
in the aggregate and from varying perspectives
Example: generally customized or business-specific, such as for hotel
industry or manufacturing
Electronic Discovery Issues: presumption is that all information is locatable
because it is in data warehouse

Technology: World Wide Web
What It Is: all content formatted for the Internet or a corporate intranet
Example: any Web site
Electronic Discovery Issues:
• another overlooked source of evidentiary information
• difficult to track and preserve

Technology: Digital convergence
What It Is: bringing together digital data of many types and sources into a
single location
Example: cell phones
Electronic Discovery Issues:
• creates another good source of digital evidence
• hard to monitor and track

Technology: Peer-to-peer networking
What It Is: communication protocol that enables PCs to talk directly to one
another without sharing access to a centralized server
Example: Groove networks
Electronic Discovery Issues:
• difficult to track and monitor data maintained in this environment
• poses problems for data preservation efforts because of its decentral-
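The blade-server entry above notes that RAID-formatted servers are harder to
seize and review. One reason can be sketched in a few lines, under a stated
assumption: the array uses plain striping with no parity or mirroring (RAID
levels differ, and a mirrored array would keep complete copies on each disk).
The function names, disk count, and stripe size below are hypothetical and
chosen only for illustration. The sketch spreads a file across three
simulated disks and shows that no single disk holds the whole text, so asking
for one drive does not yield a readable file.

```python
def stripe(data: bytes, disk_count: int, stripe_size: int) -> list[bytes]:
    """Distribute data round-robin across disks in fixed-size stripes."""
    disks = [bytearray() for _ in range(disk_count)]
    for start in range(0, len(data), stripe_size):
        disk_index = (start // stripe_size) % disk_count
        disks[disk_index] += data[start:start + stripe_size]
    return [bytes(disk) for disk in disks]


def reassemble(disks: list[bytes], stripe_size: int, total_length: int) -> bytes:
    """Rebuild the original data; this needs every disk and the stripe size."""
    out = bytearray()
    offsets = [0] * len(disks)
    disk_index = 0
    while len(out) < total_length:
        chunk = disks[disk_index][offsets[disk_index]:offsets[disk_index] + stripe_size]
        if not chunk:
            break  # ran out of data before reaching total_length
        out += chunk
        offsets[disk_index] += stripe_size
        disk_index = (disk_index + 1) % len(disks)
    return bytes(out)


if __name__ == "__main__":
    memo = b"PRICING MEMO: confidential roadmap"
    disks = stripe(memo, disk_count=3, stripe_size=4)
    for number, contents in enumerate(disks):
        print(f"disk {number}: {contents!r}")
    print(reassemble(disks, stripe_size=4, total_length=len(memo)))
```

Real arrays use much larger stripes and often add parity, but the practical
consequence for discovery is the same: the request has to cover the full set
of disks and the array configuration, not a single drive.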
New data types and greater reliance on electronic communication also present
a significant records management challenge – one that must be addressed by
changes in process and in the technology used to manage that process.

So What’s New?
From the Fortune 50 to the “mom-and-pop,” organizations are increasingly
implementing digital technologies. Unfortunately, the impact of new and more
ephemeral data sources on records management and litigation is the farthest
thing from the minds of those who implement new technology. Collaboration
software, data warehouses, ISP-hosted e-mail, and Web-based content all
present opportunities for indiscriminate archiving and dissemination of
corporate information. Such consequences are often lost, however, in the
cost-benefit discussions among IT staff and corporate management.

In 1995, Microsoft Windows was predominant, there were few personal
computers, and the PDA had not yet been born. Storage was measured in
megabytes, not gigabytes, and only “gear-heads” and professors wandered the
Internet. Fast forward to 2003 and consider the current landscape: cyber
hacking, computer viruses, the Linux operating system, terabytes and
petabytes, Internet cafes, and cell phones that take pictures. What was once
the stuff of science fiction and spy movies is now mainstream. So how do
these advances impact electronic discovery?

It is increasingly difficult to identify and collect the most appropriate
evidence. In this respect, technology is both a blessing and a curse. A curse
because each year brings new places where relevant data may lurk and ways to
exploit the weaknesses in data management structures; a blessing because as
each weakness is identified, inventive companies develop the tools to bolster
or eradicate it.

The enormous popularity of do-it-yourself in everything from home repair to
self-help is filtering into the field of digital discovery, sometimes with
disastrous results. Inadvertent overwriting of data and failure to preserve
are two areas in which the do-it-yourselfer risks exposure and sanctions. The
days of simply collecting e-mail from an Outlook server and calling discovery
done are waning, if not already gone.

Those who find this preposterous should consider the Sarbanes-Oxley Act. Its
document retention provisions alone mandate a higher standard of care. When
taken in the context of litigation and discovery, however, Sarbanes-Oxley
goes well beyond monetary sanctions to the specter of jail time. Thus, where
to focus attention becomes increasingly important.

Instant Messaging and E-mail
IM is an immediate issue for most companies. It is ubiquitous, generally
unmonitored, and a great way to circumvent restrictive corporate e-mail
policies. [Editor’s Note: see “IM:
Invaluable New Business Tool or Records Management Nightmare?” on page 27.]
According to a study quoted in Information Week, “By 2007, businesses will be
supporting 182 million IM users”; PC World estimates IM users will top 250
million. But when misused, IM can be used to leak everything from financial
data to source code. For example, consider the possibility of an IM thread
about pricing between competing companies and its implications for antitrust
violations.

Almost as bad as IM misuse is the fact that commercial Internet service
providers such as AOL and Yahoo have introduced more sophisticated encryption
options and premium e-mail services that enable customers to store more
e-mail in their personal accounts for longer time periods. As cell phones and
PDAs converge, they, too, will harbor data that may be subject to both
retention and discovery.

The effects of increasing e-mail volume are becoming evident. Last year, as a
cost-saving measure and in response to a 100-percent increase in e-mail in
two years, EDS asked its employees to save messages in their local Microsoft
Outlook inboxes, rather than on the Exchange server. This short-term fix is
just one example of how companies react to an immediate problem without
considering the long-term impact. Compounding the situation is the fact that
many users have not been trained to use their e-mail systems effectively,
making it much more difficult to retrieve and isolate relevant e-mail.

The good news is that the new version of Exchange, code-named Titanium,
promises to protect messaging from hackers and integrates an automatic backup
component that takes regular snapshots of the data. It also will further the
centralization of e-mail to fewer servers, facilitating both discovery and
data retention.

Storage
Storage has become personal. Corporate servers are no longer the exclusive
keepers of corporate data. Thumb drives, flash cards, and micro-drives are
now capable of holding gigabytes of data that can be downloaded simply and
secretly. Employees can more easily take their work (or anything else) home
or to a competitor. Gaining access to such devices in discovery may become
critical in cases involving trade-secret theft, for example. Biometrics and
hardware-based security can still foil an investigator’s attempts to access
the data, however. IBM is placing storage of biometric factors and encryption
keys on a dedicated processor on the computer’s motherboard. To gain access,
some removable media require fingerprint recognition or putting the device
into its host computer.

On the high end, storage area networks (SANs) are replacing the need to add
larger hard drives to individual servers. Both SANs and outsourced data
warehousing can easily be overlooked as a relevant data cache.

Backup
It appears the industry may also be moving beyond backup tapes into the world
of “data protection appliances,” a phrase that is not a euphemism for file
cabinets. Tape backup, which is linear, subject to failure, and tedious for
data recovery, is being challenged by small computer system interface (SCSI)
devices that keep an initial copy of a protected drive and log changes at
intervals as short as 30 seconds. Disk-based backups such as these may soon
supplant backup tapes whose only goal is data recovery rather than data
archiving. For now, tapes continue to grow and by 2010, super advanced
intelligent tape (SAIT) may hold as much as 4 terabytes per tape.

The underlying issue, however, is too much data. As storage becomes less
expensive and more data is generated, the temptation is simply to keep it
available. If that trend continues, the potential liability and cost of
gathering and filtering this data for litigation will be staggering. Consider
that the reported average storage capacity of a company’s Windows NT servers
is 43 terabytes. To put this number in perspective, if 43 terabytes of
documents were printed, they would stack over 800 miles high.

Rapid Restore, a new IBM ThinkPad feature, creates a hidden service partition
that backs up the entire system image, from data files to registry settings,
with periodic updates. Although not the same as an evidentiary image, this
backup will let users locate and restore single files that have been
corrupted or deleted. That is good news for discovery but bad news for those
trying to maintain tight controls.

Software and Operating Systems
Integrated messaging, version control, audit trail, and event notification
are all components of the latest online collaboration tools. Objectively,
they are excellent tools for streamlining such business processes as product
development, corporate management, and more. When litigation threatens,
however, they are just one more place where data may lurk.

Technology is slowly moving away from a Microsoft-centric view of business
computing. Linux and other open-source platforms will heighten the variety
and complexity of internal data review and storage. Futuristic applications
such as visualization and mapping technology, rather than the printed report,
may ultimately hold the best evidence. It is therefore critical that
corporate managers, attorneys, and records managers understand
current and future technologies and their effect on both retention
requirements and proactive discovery in litigation. For example, will
cell-site data (which antenna towers or wireless facilities a cell phone
accesses) or .wav files be important in the company’s next litigation?
Probably not, but IM and collaborative software probably will be.

Two Roads Diverged
Imagine that a person is carrying five pingpong balls back and forth across
the room in her hands. Each time she crosses the room, another ball is added.
After only a few trips she starts to drop a ball here and there, and she
suddenly realizes that she could put all the pingpong balls into a box to
make the task easier. She continues to carry the box back and forth, each
trip adding another ball, but now baseballs, basketballs, and footballs are
added. The box finally becomes too heavy to carry, however, and she
eventually drops all the balls.

This not-so-subtle metaphor helps illustrate how most people have thus far
approached computer-based discovery (i.e., continuing to follow the same
practices used for paper-based discovery), seeking only to contain the
increasing amount and variety of data in a larger container. But take a step
back and consider whether carrying all those balls back and forth was

The costs and time associated with computer-based discovery can be greatly
minimized with a little prior planning. Careful selection of datasets,
filtering, and sampling techniques offer ways to focus discovery efforts and
limit unnecessary collection. Needless to say, if a comprehensive e-risk
management plan is implemented prior to litigation, the amount of data
available for review will likely be much smaller. For example, the “2003
E-Mail Rules, Policies and Practices Survey,” co-sponsored by American
Management Association, The ePolicy Institute, and Clearswift, revealed a
lack of e-mail retention and deletion training and policies in U.S.
corporations. According to Nancy Flynn, ePolicy Institute’s executive
director, “... only 27 percent of the [1,100 U.S. companies that participated
in the survey] are doing any training about retention and deletion of e-mail,
and only 34 percent have any retention and deletion policies at all.”

A do-it-yourself trend is beginning to emerge, as lawyers and IT personnel
take on more responsibility for managing electronic discovery. Large
companies may want to build in-house expertise in electronic discovery.
However, they must recognize that they will require significant training and
an ongoing program to update them on current tools and technologies.

The law is not settled as to form, scope, and cost of electronic discovery.
Two recent cases, Zubulake v. UBS Warburg LLC, 2003 ILRWeb (P&F) 2253 [SDNY,
2003], and Rowe Entertainment, Inc. v. The William Morris Agency, Inc., 205
F.R.D. 421 (January 16, 2002), offer guidance but do not acknowledge the
coming storm created by the compression of court dockets and the expansion of
information and new technologies.

The Future of Electronic Discovery
Some technologies will flourish, some will die, some will just keep hanging
on. Predicting which will survive is like predicting the outcome of the next
Kentucky Derby. One thing is certain, however: Our computing environments
will continue to change and impact discovery in litigation.

Clearly, information managers must develop an understanding of hardware and
software beyond that gained through personal experience to adequately pursue
or defend electronic discovery in litigation. It is likewise easy to take a
“been there, done that” attitude toward electronic discovery, but the times
are quickly changing. AOL alone now generates 3 terabytes of logs a month. A
single advanced server, when clustered, can hold a whopping 11 petabytes
(11,000 terabytes) of data.

Emerging best practices for data retention and preservation can help
corporate counsel address these issues proactively. They will require
rethinking in terms of how to approach discovery. Records and information
management systems have not historically been deployed with litigation in
mind, but perhaps they should be. The escalating costs of data collection and
review in discovery, as well as the complexity of the systems themselves,
demand a major realignment of how data is maintained in the ordinary course
of business.

Thus, an e-risk management plan has become an imperative. As with most
things, a focus on minimizing risk now will yield benefits in the future.

Deborah H. Juhnke is Vice President of Seattle-based Computer Forensics Inc.
and is a leader in the field of electronic data recovery. She may be
contacted at firstname.lastname@example.org.

References
“Corporate Instant Messaging Ready to Take Off.” Information Week. 2 April 2003.
“E-mail Habits Are Risky Business.” PCWorld.com. 24 June 2003.
“The E-mail Scandal.” Infoworld.com. 25 November 2002.
“How Secure Is Instant Messaging?” PC World. October 2002.
“More Than an In-Box.” InformationWeek.com. 6 May 2002.
“2003 Infoworld Storage Survey.” Infoworld.com. 2003.