Data storage is an important part of a database management
system (DBMS). The topic can be divided into two questions:
1. How does a computer system store and manage very
large volumes of data? and
2. What representations and data structures best support
efficient manipulation of these data?
The Memory Hierarchy:
A typical computer system has several different
components in which data can be stored.
These components have data capacities of various ranges.
The cost per byte of these components also varies.
The device with the smallest capacity offers the fastest
access speed and has the highest cost per byte.
The schematic of the memory hierarchy can be shown as:
[Figure: memory hierarchy schematic. Surviving labels: memory, DBMS, Disk, Virtual Memory, File System.]
Cache is the lowest level of the memory hierarchy.
It is an integrated circuit, or a part of the processor’s
chip, that is capable of holding data or machine
instructions. It can be accessed in a few nanoseconds.
Data in the cache are copies of certain locations in main memory.
When a machine executes instructions:
(1) It looks in the cache for both the instructions and the
data used by those instructions.
(2) If it doesn’t find them there, it goes to main
memory and copies the instruction or data into the
cache. It takes 10-100 nanoseconds to move data from main
memory to the processor or cache.
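The two-step lookup above can be sketched as a toy model. This is only an illustration: the `Cache` class, the contents of `MAIN_MEMORY`, the capacity of four entries, and the "evict the oldest entry" rule are all assumptions made for the example and are not taken from these notes.

```python
# Toy model of the lookup steps above. All names, sizes, and the eviction
# rule are hypothetical choices for illustration only.
MAIN_MEMORY = {addr: addr * 2 for addr in range(1024)}  # made-up contents


class Cache:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.lines = {}   # address -> copy of the main-memory value
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        # (1) Look for the item in the cache first.
        if addr in self.lines:
            self.hits += 1
            return self.lines[addr]
        # (2) On a miss, go to main memory and copy the item into the cache.
        self.misses += 1
        if len(self.lines) >= self.capacity:
            # Evict the entry that was inserted first (a simplification).
            oldest = next(iter(self.lines))
            del self.lines[oldest]
        self.lines[addr] = MAIN_MEMORY[addr]
        return self.lines[addr]


cache = Cache()
values = [cache.read(a) for a in (10, 11, 10, 10, 12)]
print(values, cache.hits, cache.misses)  # [20, 22, 20, 20, 24] 2 3
```

Repeated reads of address 10 are served from the cache, so only the first access to each address pays the slower trip to main memory.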
Secondary storage: Typically, in 2008, a single disk has
a capacity of up to a terabyte.
Data can be read or written between cache and processor at
the speed of processor instructions, commonly a few nanoseconds.
On the other hand, moving an instruction or data item
between cache and main memory takes much longer,
perhaps 100 nanoseconds.
Moving data between disk and main memory 10
As capacious as a collection of disk units can be, there
are databases much larger than what can be stored in the
disks of a single machine, or even a substantial collection
To serve such needs, tertiary storage devices have been
developed to hold data volumes measured in terabytes.
Tertiary storage is characterized by significantly higher
read/write times than secondary storage, but also by much
larger capacity (petabytes) and smaller cost per byte than
is available from magnetic disks.
Volatile and Nonvolatile Storage:
An additional distinction among storage devices is
whether they are volatile or nonvolatile.
A volatile device forgets what is stored in it when the
power goes off.
A nonvolatile device, on the other hand, is expected to
keep its contents intact even for long periods when the
device is turned off.
Devices such as magnetic disk and tape drives are
nonvolatile. Essentially, all secondary and tertiary
storage devices are nonvolatile. But main memory is
generally volatile, while virtual memory is always
Essentially every computer has some sort of secondary
storage, which is a form of storage that is both
significantly slower and significantly more spacious than
main memory, yet is essentially random-access, with
relatively small differences among the times required to
access different data items.
The disk is considered the support for both a virtual
memory and a file system.
Files are moved between disk and main memory in
blocks, under the control of the operating system or the
Moving a block from the disk to the main memory is a
disk read; moving a block from main memory to the disk
is a disk write.
Either can be referred to as a disk I/O. Certain parts of
main memory are used to buffer files, that is, to hold
block-sized pieces of these files.
A database management system manages data blocks
itself, rather than relying on the operating system’s file
manager to move blocks between main and secondary
The use of secondary storage is one of the important
characteristics of database management systems, and
secondary storage is almost exclusively based on
Mechanics of Disks:
The two principal moving pieces of a disk drive are a disk
assembly and a head assembly. The disk assembly
consists of one or more circular platters that rotate around
a central spindle. The upper and lower surfaces of
each platter are covered with a thin layer of magnetic
material on which the bits are stored.
The locations in which the bits are stored are organized
into tracks, which are concentric circles on a single platter.
Tracks occupy most of a surface. Tracks are organized
into sectors, which are segments of the circle that are
separated by gaps. Gaps often represent 10% of the total
track and are used to identify the beginning of a sector.
The second movable piece is the head assembly, which holds
the disk heads. There is one head for each surface, riding
extremely close to the surface. A head reads the magnetism
passing under it, and can also alter the magnetism to
write information on the disk. The heads are each attached
to an arm, and the arms for all the surfaces move in and
out together, being parts of a rigid head assembly.
The Disk Controller:
One or more disk drives are controlled by a disk
controller, which is a small processor capable of:
1. Controlling the mechanical actuator that moves the
head assembly, to position the heads at a particular
radius. At this radius, one track from each surface will
be under the head for read or write. The tracks under
the heads at the same time are called a cylinder.
2. Selecting a surface from which to read or write and
selecting a sector from the track on that surface that is
under the head. The controller is also responsible for
knowing when the rotating spindle has reached the
point where the desired sector is beginning to move
under the head.
3. Transferring the bits read from the desired sector to
the computer’s main memory or transferring the bits to
be written from the main memory to the intended
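To make the cylinder, surface, and sector selection concrete, here is a minimal sketch that maps a linear sector number to a (cylinder, surface, sector) triple. The layout is an assumption made for illustration: sectors are numbered so that a whole cylinder is filled before the heads move. The function name `locate` and the figure of 8 sectors per track are hypothetical; the 10 surfaces match the typical drive described in these notes. Real controllers may number sectors differently.

```python
from typing import NamedTuple


class DiskAddress(NamedTuple):
    cylinder: int   # radius at which the head assembly is positioned
    surface: int    # which head (surface) is selected
    sector: int     # which sector of the track on that surface


def locate(sector_no, surfaces=10, sectors_per_track=8):
    """Map a linear sector number to a disk address.

    Assumed layout (hypothetical): all tracks of one cylinder are filled
    before moving the heads to the next cylinder.
    """
    sectors_per_cylinder = surfaces * sectors_per_track
    cylinder, rest = divmod(sector_no, sectors_per_cylinder)
    surface, sector = divmod(rest, sectors_per_track)
    return DiskAddress(cylinder, surface, sector)


print(locate(0))    # DiskAddress(cylinder=0, surface=0, sector=0)
print(locate(79))   # DiskAddress(cylinder=0, surface=9, sector=7)
print(locate(85))   # DiskAddress(cylinder=1, surface=0, sector=5)
```

Under this assumed numbering, consecutive sectors stay on the same cylinder as long as possible, so reading them requires no movement of the head assembly.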
Disk Storage Characteristics:
Disk storage is in flux, as the space needed to store a bit
shrinks rapidly. Some of the typical measures associated
with the disks are:
Rotation speed of the Disk Assembly: One rotation
every 10 milliseconds is common, although higher and
lower speeds are also found.
Number of Platters per Unit: A typical disk drive has
about five platters and therefore 10 surfaces.
Number of Tracks per Surface: A surface may have as
many as 100,000 tracks per inch, although diskettes
have a much smaller number.
Number of Bytes per Track: Common disk drives have
1 million bits per inch along a track.
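Two of the measures above can be checked with simple arithmetic. The sketch below converts the rotation time into rotations per minute and counts surfaces from platters, assuming two usable surfaces per platter as stated earlier. The function names are hypothetical helpers for this example.

```python
def rotations_per_minute(rotation_time_ms):
    # One minute is 60,000 milliseconds.
    return 60_000 / rotation_time_ms


def surface_count(platters):
    # Each platter has an upper and a lower surface.
    return 2 * platters


print(rotations_per_minute(10))  # 6000.0
print(surface_count(5))          # 10
```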
Disk Access Characteristics:
It is important to understand how the data are
manipulated. Since all computation takes place in the
main memory or cache, the only issue as far as the disk is
concerned is how to move blocks of data between disk
and main memory. The data blocks are read and written when:
a) The heads are positioned at the cylinder containing the
track on which the block is located, and
b) The sector containing the block moves under the disk
head as the entire disk assembly rotates.
Thus, the time between the moment at which the
command to read a block is issued and the time that the
contents of the block appear in main memory can be
broken into the following:
The time taken by the processor and the disk controller
to process the request.
The time to position the head assembly at the proper
cylinder, called the seek time.
The time for the disk to rotate so that the first of the sectors
containing the block reaches the head, called the
rotational latency.
The transfer time, during which the sectors of the block,
and any gaps between them, rotate past the head.
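The four components above simply add up. The sketch below computes an illustrative total under stated assumptions: the 10 ms rotation time is the typical figure given in these notes, and the average rotational latency is taken as half a rotation (assuming the desired sector is equally likely to be anywhere on the track). The processing time, seek time, and fraction of the track occupied by the block are hypothetical values chosen only for the example.

```python
ROTATION_MS = 10.0  # one rotation every 10 ms, as quoted in these notes


def access_time_ms(processing_ms, seek_ms, rotational_latency_ms, transfer_ms):
    """Sum of the four components of the time to read one block."""
    return processing_ms + seek_ms + rotational_latency_ms + transfer_ms


# Hypothetical inputs, for illustration only.
processing_ms = 0.1            # processor + controller overhead (assumed)
seek_ms = 5.0                  # head movement to the cylinder (assumed)
block_fraction_of_track = 0.02 # share of the track the block occupies (assumed)

# On average the disk turns half a rotation before the block arrives.
rotational_latency_ms = ROTATION_MS / 2
# The block passes under the head in its share of one rotation.
transfer_ms = ROTATION_MS * block_fraction_of_track

total = access_time_ms(processing_ms, seek_ms, rotational_latency_ms, transfer_ms)
print(round(total, 2))
```

With these made-up inputs, seek time and rotational latency dominate the total, while the transfer itself is a small fraction.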
The process of writing a block is analogous to
reading a block. The disk heads are positioned at the
proper cylinder and wait for the proper sector to rotate under
the head, but instead of reading the data we use the head to
write the new data. The minimum, maximum, and
average times are exactly the same as for reading.
It is not possible to modify a block on the disk directly.
Instead, we do the following:
1. Read the block into main memory.
2. Make whatever changes to the block are desired in the
main-memory copy of the block.
3. Write the new contents of the block back onto the disk.
4. If appropriate, verify that the write was done
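The read, modify, write, verify cycle above can be sketched with an ordinary file standing in for the disk. This is a minimal illustration, not how a DBMS actually performs block I/O: the 4096-byte `BLOCK_SIZE` and the helper names `read_block`, `write_block`, and `update_block` are assumptions made for the example.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # hypothetical block size in bytes


def read_block(f, block_no):
    # Step 1: copy the block from "disk" into a main-memory buffer.
    f.seek(block_no * BLOCK_SIZE)
    return bytearray(f.read(BLOCK_SIZE))


def write_block(f, block_no, data):
    # Step 3: write the whole block back and ask the OS to flush it.
    if len(data) != BLOCK_SIZE:
        raise ValueError("a block must be written as a whole")
    f.seek(block_no * BLOCK_SIZE)
    f.write(data)
    f.flush()
    os.fsync(f.fileno())


def update_block(f, block_no, offset, new_bytes):
    block = read_block(f, block_no)
    # Step 2: change only the main-memory copy.
    block[offset:offset + len(new_bytes)] = new_bytes
    write_block(f, block_no, block)
    # Step 4: read the block back and compare. This may be served from the
    # operating system's cache, so it is only a weak check.
    return read_block(f, block_no) == block


with tempfile.TemporaryFile() as f:
    f.write(bytes(BLOCK_SIZE * 3))   # a "disk" of three zero-filled blocks
    ok = update_block(f, 1, 100, b"hello")
    f.seek(BLOCK_SIZE + 100)
    found = f.read(5)

print(ok, found)  # True b'hello'
```

Even though only five bytes change, the whole block travels to main memory and back, which is why the block is the unit of cost for disk I/O.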