International Biopharmaceutical Association Publication

                       Data Storage in Clinical Trials: New Approaches

                                      By Tatiana Golod


Clinical trials are about capturing data: clinical researchers collect, review, and accept
clinical data and then enter it into a database. Starting in the preclinical laboratories,
computer-based tools and software modules help analyze the data on which the decision
whether to move forward with clinical trials rests.
The world of IT, of course, is much bigger than data and databases. There is word
processing, e-mail, voice-mail, image processing, multimedia and, most of all, the
World Wide Web.

Nowadays, the perceived need for multiple studies and the increasing use of
multinational mega-trials are producing nearly unmanageable quantities of data. Data
resides everywhere today, internally and externally: in the laboratory, on the
manufacturing floor, on network drives, in legacy systems, and on individual desktops.
During the trial and the post-marketing study period, data accumulates from clinical
observations, tests, and various analyses. That information is collected over the 12-15
years of a drug's research and development. The data generated to support a new drug
application (NDA) can contain up to 300,000 pages of information.
It is a major challenge for pharmaceutical companies to handle the huge amount of data
generated throughout a trial, contained in documents in multiple formats and sometimes
requiring updating during the trial. Along with the long-running discussion about
reducing the time and cost of clinical development of new drugs, the continuously
growing volume of collected information is a pressing concern throughout the
pharmaceutical industry.

Until a few years ago, pharmaceutical companies relied almost exclusively on data from
studies monitored by internal clinical staff. Generally, the data was created in various
operational databases and stored with neither a metadata structure (metadata is data
that describes other data: how, when, and by whom a particular set of data was
collected, and how the data is formatted) nor a standardized facility for searches and
archiving.
Today, the importance of metadata has increased so much that courts and regulatory
agencies are declaring that metadata must be treated on a par with the information it
describes, and must be preserved together with that information. There is a growing
body of law that gives third parties access to the metadata created and used
operationally within systems, and that imposes evidentiary obligations for that metadata
upon system managers - obligations that they previously had only for the information
itself.
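As an illustration of the idea, a metadata record preserved alongside a clinical data set might look like the following minimal sketch. The field names here are hypothetical, invented for this example, and do not come from any particular regulation or system:

```python
# A hypothetical metadata record stored alongside a clinical data file.
# Field names are illustrative only; real systems define their own schemas.
metadata = {
    "described_file": "lab_results_batch_042.csv",   # the data being described
    "collected_by": "Site 12 laboratory staff",      # by whom it was collected
    "collected_on": "1999-10-04",                    # when it was collected
    "collection_method": "double-entry keystroke",   # how it was collected
    "format": "CSV, comma-delimited, ISO-8859-1",    # how the data is formatted
}

# Treating metadata on a par with the data means archiving both together,
# so neither can be produced without the other.
archive_record = {"data_file": metadata["described_file"], "metadata": metadata}
print(archive_record["metadata"]["collected_on"])  # prints "1999-10-04"
```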

In the past, the data generated during clinical research was recorded on paper or through
hybrid paper-electronic systems. Traditional data capture uses ink and paper with
double-entry keystroke data entry. The gap between data generated in the lab and their potential
value in the market is widening. The reason is paper. The output for each sophisticated
instrument is a printed report. Thus, despite a lab’s ability to speed data generation, the
only method for converting data into usable information is to manually piece together a
puzzle of paper.
Emerging technologies such as internet data collection systems, handheld computers, voice
recognition, and wearable monitoring devices are becoming alternatives for collecting
clinical data. These technologies enable direct capture using internet and intranet
connectivity. With the advent of the FDA rules (21 CFR Part 11) that accept electronic
records and electronic signatures as equivalent to paper-based records, more and more
electronic capture takes place.
Currently, the drive toward an information strategy points to the need for a universal
clinical data interchange format.
An organization must first structure itself to learn by giving decision makers more
rapid access to expertise and information; then it must identify and adopt technologies
able to integrate, store, access, and manage dissimilar data within a single pool from
anywhere in the corporation, while complying with the federal regulations for
electronic records.

The major challenge of this decade for best practice organizations will be to capture and
organize the information and change their organizational cultures to more openly share
information on widely accessed systems. To be effective, an organization must enable
performance, reduce rework, and eliminate work that can be automated.
Open sharing of information raises other issues. The central problem is maintaining the
security of sensitive information. Another is the variation in design and terminology
between companies, countries, drugs, and even individual studies of the same drug.
Traditionally, data has been stored in formats defined by commercial data management
systems, which invariably use differing logical data models and thus make data
interchange difficult. Most systems were optimized for double data key entry, not for
integrating data from the emerging data-capture tools.
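To make the interchange problem concrete, here is a minimal sketch of mapping two systems' differing logical models onto one common record. The vendor field names are invented for illustration; they do not correspond to any actual product or standard:

```python
# Two hypothetical vendor systems store the same blood-pressure reading
# under different field names and conventions - a toy version of the
# logical-model mismatch that makes data interchange difficult.
vendor_a = {"subj": "1001", "sbp_mmhg": 128, "visit_no": 2}
vendor_b = {"patient_id": "1001", "systolic": 128, "visit": "V2"}

def normalize_a(rec):
    # Map vendor A's model onto a common interchange record.
    return {"subject": rec["subj"], "systolic_bp": rec["sbp_mmhg"],
            "visit": f"V{rec['visit_no']}"}

def normalize_b(rec):
    # Map vendor B's model onto the same common record.
    return {"subject": rec["patient_id"], "systolic_bp": rec["systolic"],
            "visit": rec["visit"]}

# Once both feeds share one logical model, pooled analysis becomes trivial.
assert normalize_a(vendor_a) == normalize_b(vendor_b)
```

In practice this role is played by an agreed interchange standard rather than ad hoc mapping code, but the principle - one common logical model, many source formats - is the same.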

Organizations no longer want to spend time searching for information by relying on
someone's recollection of a similar problem in the past.
Managers have to learn to eliminate data that are not relevant to the information they
need; to organize, analyze, and interpret the data; and then to focus the resulting
information on action.
Certainly, scientific information has value, but having a huge database does not mean it is
necessarily worth a lot. Because the modern laboratory generates the bulk of its data in
digital form, it is critical to have the ability to archive and readily retrieve these data.
Archiving can improve the throughput of your on-line data significantly, but at the cost of
gumming up the analysis of your off-line data. Quality, speed, and cost are the variables
that must be optimized.
Today’s technologies can capture different data from multiple sources: laboratory
instruments, graphical interfaces, documents, and more. By itself, however, data is not
promptly usable. It must be searchable and retrievable regardless of its origin, and
organized into a format that can be used to deal with the crisis of the moment.
Systems that address this, commonly known as Scientific Data Management Systems
(SDMS), gather data from a multitude of applications, create a common electronic
format, store machine-readable electronic records, transform that data into
human-readable reports, and store those reports in databases. An SDMS automates the
collection, storage, and sharing of scientific data. The software is able to capture data
from a range of sources, independently of the generating application, and to store it in a
central database. SDMS technology improves the flow of information by enabling
scientists to unify, share, and fully reuse scientific data generated by a vast array of
laboratory applications, creating a unified repository of electronic online laboratory
reports. From this storage, information can be quickly viewed, mapped, analyzed,
organized, shared, and reused. Consequently, companies can manage all of their lab
information using a single software architecture and a common information exchange
environment. This database management technology dramatically reduces the amount of
time wasted looking for results data on multiple analytical platforms.

The automated system backs up analytical data and methods in near real time. For all
applications, the software can read the folder names in the directory tree where the data
are stored. These names can be captured and stored in the database as context-sensitive
keys to assist in locating data at a later date. The archive engine typically resides on a
single dedicated PC, sweeping the network to copy data from user-defined sources. This
engine is responsible for the archiving process, inserting context-sensitive key
information into the database, and recording system-generated transactions and
messages that occur during the archiving process. The user defines an archive profile,
which effectively tells the engine what products it is archiving data from, where the data
are located, how often to look for new data, and what keys to map into the database to
assist with data retrieval in the future. A client that restores information may be installed
on multiple PCs where access to data in the database catalog is required. Restore clients
may be located in the laboratory or in offices, and they provide the ability to search
through the database catalog using either browsing techniques or direct ad hoc queries.

SDMS technology has provided a new capability for managing and applying information
throughout the scientific enterprise. With its unique capacity to archive, search, and
organize data through one central electronic repository, many activities that were once
performed manually are now fully automated. Quantities of raw data generated
throughout the laboratory can be shared, analyzed, and interpreted. Thus, data become
highly valuable information.
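The archive-engine idea described above - sweep user-defined sources, capture folder names as context-sensitive keys, and let restore clients query the catalog - can be sketched roughly as follows. The profile fields, folder layout, and catalog structure here are invented for illustration, not taken from any actual SDMS product:

```python
import os

# Hypothetical archive profile: where the data are located, and how the
# directory levels under that root map to context-sensitive keys.
profile = {
    "source_root": "/lab/instruments/hplc",   # user-defined source to sweep
    "key_levels": ["instrument", "study"],    # folder level -> key name
}

catalog = []  # stands in for the SDMS database catalog


def sweep(profile):
    """Walk the source tree, recording each file with keys taken from
    the folder names along its path."""
    root = profile["source_root"]
    for dirpath, _dirnames, filenames in os.walk(root):
        parts = os.path.relpath(dirpath, root).split(os.sep)
        keys = dict(zip(profile["key_levels"], parts))
        for name in filenames:
            catalog.append({"path": os.path.join(dirpath, name), **keys})


def restore_query(**criteria):
    """A restore client's ad hoc query against the catalog."""
    return [rec for rec in catalog
            if all(rec.get(k) == v for k, v in criteria.items())]
```

A real engine would also copy the files to archive storage and log its transactions; this sketch only shows how folder names become searchable keys.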

Another issue that arises from the growing amount of collected information is the
physical capacity of the storage itself.
Data management and archiving solutions don't really change the amount of data
you're trying to store; here, hardware technologies rule. Even the concept of combining
multiple storage servers into a redundant ring of storage devices - clustered storage
systems - must eventually give way to new storage solutions. Given that the customer
attributes of disk storage that matter most are cost per megabyte, data rate, and access
time, storage technologies must continue to improve in order to keep pace with the
rapidly increasing demand. Data storage systems have made enormous technical
advances toward that goal.

Remember the big improvement from floppy discs to CDs, expanding storage from
around 1 megabyte to 650 megabytes? Then we moved on to DVD, bumping capacity up
to 4.7 gigabytes (1 gigabyte = 1024 megabytes). The shorter wavelength of a blue-laser
DVD versus the conventional red-laser DVD enables smaller and more closely spaced
pits (representing bits of data) to be stored along the two-dimensional plane of the disc.
This advance leads to a roughly fivefold increase in the amount of data that the laser can
read on the disc. A blue-laser DVD maxes out at 100 gigabytes of capacity.
However, both magnetic data storage (the hard disk and magnetic tape) and conventional
optical data storage (floppy discs, CDs, and DVDs with their standard capacity of 4.7
gigabytes), in which individual bits are stored as distinct magnetic or optical changes on
the surface of a recording medium, are approaching physical limits beyond which
individual bits may be too small or too difficult to store. Storing information throughout
the volume of a medium, not just on its surface, offers an intriguing high-capacity
alternative.

Holographic data storage is a volumetric approach which, although conceived decades
ago, has made recent progress toward practicality with the appearance of lower-cost
enabling technologies, significant results from longstanding research efforts, and progress
in holographic recording materials. An entire page of information is stored at once as an
optical interference pattern within a thick, photosensitive optical material. A signal beam
is encoded with digital information using a spatial light modulator and interferes with a
reference beam, originating from the same laser, to form the data-encoded hologram. A
light-sensitive material, usually a photopolymerizable medium, records the hologram.
The data can be retrieved later by illuminating the recorded hologram with a beam
identical to the reference beam.
InPhase Technologies, which started out as a research group within Lucent Technologies'
Bell Laboratories, has demonstrated a prototype holographic storage system with a
density of 200 gigabits per square inch. The company plans to commercially launch a
300-gigabyte disc in 2006. By the end of the decade, it
hopes to have a 1.6-terabyte (1 terabyte = 1024 gigabytes) disc ready.
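Those figures can be roughly cross-checked. The disc and hub dimensions below are assumed purely for illustration (InPhase's actual media geometry and formatting overhead are not given here):

```python
import math

# Raw capacity implied by 200 gigabits per square inch on a disc-shaped medium.
# Disc and hub radii are assumed, illustrative figures: a 130 mm disc, 30 mm hub.
density_gbit_per_in2 = 200
disc_radius_in = 65 / 25.4   # 130 mm disc
hub_radius_in = 15 / 25.4    # 30 mm unused hub

usable_area_in2 = math.pi * (disc_radius_in ** 2 - hub_radius_in ** 2)
raw_capacity_gb = density_gbit_per_in2 * usable_area_in2 / 8  # gigabits -> GB

print(f"usable area ~ {usable_area_in2:.1f} in^2")
print(f"raw capacity ~ {raw_capacity_gb:.0f} GB")
```

A raw figure near 500 GB is at least consistent with a 300-gigabyte formatted product once error correction and formatting overhead are subtracted, though the exact overhead is not stated here.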
The real limitation of holographic data storage is not the technology, however, but the
marketplace. Holographic technology is meant to eventually replace blue-laser DVD,
but that is surely only a question of time.

So, obviously, smart management of your storage operations means knowing the latest
critical hardware elements together with the software solutions for managing the data
warehouse, the data management processes and tools available, the latest compliance
requirements and new regulations, and more.


1. Joseph F. Noferi and Daniel E. Worden, “Planning for Failure,” Applied Clinical
    Trials, October 1999, 44–48.
2. Code of Federal Regulations, Title 21, Part 11 (U.S. Government Printing Office,
    Washington, DC).
3. Thomas Petzinger, Jr., “Talking About Tomorrow. Peter Drucker: The Arch-Guru of
    Capitalism Argues That We Need a New Economic Theory and a New Management
    Model,” The Wall Street Journal, 1 January 2000, R34.
4. “What Every CIO Needs to Know About Metadata.”
5. Douglas Dixon, “Plain Talk About Technology for Home & Business.”
6. IBM, “Holographic Data Storage.”
7. Alexander H. Tullo, “Data Storage in 3-D.”
8. Industrial News Room, ThomasNet.
9. Joshua Greenbaum, “The Trouble with Terabytes.”
10. James Conley, “Springing into Action with Data Management Systems.”
11. Joseph F. Noferi and Daniel E. Worden, “Advantage Management.”
