1cs257_storage_ch2a

Data Storage



Data storage is an important part of a database
management system. The topic can be divided into two
questions:

1. How does a computer system store and manage very
  large volumes of data?

2. What representations and data structures best support
  efficient manipulation of these data?

The Memory Hierarchy:

A typical computer system has several different
components in which data can be stored.

These components have data capacities of various ranges.
The cost per byte of these components also varies.

The devices with the smallest capacity offer the fastest
access speed and have the highest cost per byte.

The memory hierarchy can be shown schematically as:


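[Figure: the memory hierarchy. Levels, from top to bottom:
Tertiary Storage, Disk, Main Memory, Cache. Virtual Memory
and File System sit between Disk and Main Memory. The
labels "DBMS" and "Programs, Main-memory DBMS" mark the
users of these levels.]
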
Cache:

Cache is the lowest level of the memory hierarchy.

It is an integrated circuit, or a part of the processor’s
chip, that is capable of holding data or machine
instructions. It can be accessed in a few nanoseconds.




Main Memory:

Data in the cache are copies of certain locations in
main memory.

When a machine executes instructions,
 (1) it looks in the cache for both the instructions and
 the data they use.

 (2) If it doesn’t find them there, it goes to main
 memory and copies the instructions or data into the
 cache. It takes 10-100 nanoseconds to move data from
 main memory to the processor or cache.
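
The two-step lookup above can be sketched as a toy model. This is
only an illustration, not how hardware caches are built: the names
main_memory, cache, stats and load are hypothetical, both levels are
plain Python dictionaries keyed by address, and eviction from the
cache is ignored.

```python
# Toy model of the lookup described above. The names and the
# dictionary-based representation are assumptions for illustration.
main_memory = {addr: addr * 10 for addr in range(8)}  # address -> value
cache = {}                                            # copies of some locations
stats = {"hits": 0, "misses": 0}


def load(addr):
    """Return the value at addr, going to main memory only on a miss."""
    if addr in cache:                # step (1): look in the cache first
        stats["hits"] += 1
        return cache[addr]
    stats["misses"] += 1             # step (2): not found, go to main memory
    cache[addr] = main_memory[addr]  # ... and copy the item into the cache
    return cache[addr]


first = load(3)    # miss: the item is copied from main memory
second = load(3)   # hit: the item is served from the cache
print(first, second, stats)
```

The second call to load never touches main_memory, which is the whole
point of keeping a copy in the faster level.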


Secondary Storage: Typically, in 2008, a single disk has a
capacity of up to a terabyte.

Data can be read or written between the cache and the
processor at the speed of processor instructions, commonly
a few nanoseconds.
On the other hand, moving an instruction or data item
between cache and main memory takes much longer,
perhaps 100 nanoseconds.

Moving data between disk and main memory takes about 10
milliseconds.
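
To make the gaps between levels concrete, the access times quoted
above can be compared directly. This is rough arithmetic, not a
measurement: "a few nanoseconds" is assumed to be 2 ns, and the
variable names are made up for this sketch.

```python
# Access times from the text, expressed in nanoseconds.
NS = 1
MS = 1_000_000 * NS

cache_ns = 2 * NS          # assumption: "a few nanoseconds" taken as 2 ns
main_memory_ns = 100 * NS  # "perhaps 100 nanoseconds"
disk_ns = 10 * MS          # "10 milliseconds"

memory_vs_cache = main_memory_ns // cache_ns
disk_vs_memory = disk_ns // main_memory_ns

print(f"main memory is about {memory_vs_cache}x slower than cache")
print(f"disk is about {disk_vs_memory:,}x slower than main memory")
```

Under these figures, the step from main memory to disk (a factor of
100,000) is far larger than the step from cache to main memory.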


Tertiary Storage:

As capacious as a collection of disk units can be, there
are databases much larger than what can be stored in the
disks of a single machine, or even a substantial collection
of machines.

To serve such needs, tertiary storage devices have been
developed to hold data volumes measured in terabytes.

Tertiary storage is characterized by significantly higher
read/write times than secondary storage, but also by much
larger capacity (petabytes) and smaller cost per byte than
is available from magnetic disks.

Volatile and Nonvolatile Storage:

An additional distinction among storage devices is
whether they are volatile or nonvolatile.

A volatile device forgets what is stored in it when the
power goes off.




A nonvolatile device, on the other hand, is expected to
keep its contents intact even for long periods when the
device is turned off.

Devices such as magnetic disks and tape drives are
nonvolatile. Essentially all secondary and tertiary
storage devices are nonvolatile. Main memory, by
contrast, is generally volatile, and virtual memory is
always volatile.

Secondary Storage:



Essentially every computer has some sort of secondary
storage, which is a form of storage that is both
significantly slower and significantly more spacious than
main memory, yet is essentially random-access, with
relatively small differences among the times required to
access different data items.

The disk supports both virtual memory and a file system.

Files are moved between disk and main memory in
blocks, under the control of the operating system or the
database system.

Moving a block from the disk to the main memory is a
disk read; moving a block from main memory to the disk
is a disk write.

Either can be referred to as a disk I/O. Certain parts of
main memory are used to buffer files, that is, to hold
block-sized pieces of these files.

A database management system manages data blocks
itself, rather than relying on the operating system’s file
manager to move blocks between main and secondary
memory.
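
Block-at-a-time transfer can be sketched with an ordinary file
standing in for the disk. Everything here is an assumption made for
illustration: the 4096-byte block size (the text does not give one),
the hypothetical BlockFile class, and the counters. The counters
record logical block transfers only, since the operating system may
still cache the file behind the scenes.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size; the text does not specify one


class BlockFile:
    """Hypothetical helper that moves data one whole block at a time."""

    def __init__(self, path):
        self.f = open(path, "r+b")
        self.reads = 0
        self.writes = 0

    def read_block(self, n):
        """Disk read: copy block n into a main-memory buffer."""
        self.f.seek(n * BLOCK_SIZE)
        self.reads += 1
        return bytearray(self.f.read(BLOCK_SIZE))

    def write_block(self, n, buf):
        """Disk write: copy a main-memory buffer back to block n."""
        assert len(buf) == BLOCK_SIZE
        self.f.seek(n * BLOCK_SIZE)
        self.f.write(buf)
        self.f.flush()
        self.writes += 1

    def close(self):
        self.f.close()


# Create a three-block scratch file, then read, change and rewrite block 1.
fd, path = tempfile.mkstemp()
os.write(fd, bytes(3 * BLOCK_SIZE))
os.close(fd)

bf = BlockFile(path)
buf = bf.read_block(1)       # first disk I/O
buf[0:5] = b"hello"          # the change happens in the buffer only
bf.write_block(1, buf)       # second disk I/O
check = bf.read_block(1)     # third disk I/O
total_ios = bf.reads + bf.writes
bf.close()
os.remove(path)
print(bytes(check[:5]), total_ios)
```

Changing five bytes still costs whole-block transfers, which is why
the number of disk I/Os is the usual measure of cost.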

Disks:

The use of secondary storage is one of the important
characteristics of database management systems, and
secondary storage is almost exclusively based on
magnetic disks.

Mechanics of Disks:

The two principal moving pieces of a disk drive are a disk
assembly and a head assembly. The disk assembly
consists of one or more circular platters that rotate
around a central spindle. The upper and lower surfaces of
each platter are covered with a thin layer of magnetic
material on which the bits are stored.

The locations where the bits are stored are organized
into tracks, which are concentric circles on a single
platter. Tracks occupy most of a surface. Tracks are
organized into sectors, which are segments of the circle
separated by gaps. Gaps often represent 10% of the total
track and are used to identify the beginnings of sectors.



The second movable piece, the head assembly, holds
the disk heads. There is one head for each surface, riding
extremely close to the surface. A head reads the magnetism
passing under it, and can also alter the magnetism to
write information on the disk. The heads are each attached
to an arm, and the arms for all the surfaces move in and
out together, being parts of a rigid head assembly.



The Disk Controller:



One or more disk drives are controlled by a disk
controller, which is a small processor capable of:

1. Controlling the mechanical actuator that moves the
  head assembly, to position the heads at a particular
  radius. At this radius, one track from each surface will
  be under a head for reading or writing. The tracks that
  are under the heads at the same time are said to form a
  cylinder.

2. Selecting a surface from which to read or write, and
  selecting a sector from the track on that surface that is
  under the head. The controller is also responsible for
  knowing when the rotating spindle has reached the
  point where the desired sector is beginning to move
  under the head.

3. Transferring the bits read from the desired sector to
  the computer’s main memory or transferring the bits to
  be written from the main memory to the intended
  sector.
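
One way to picture what the controller selects is to treat a sector's
location as a (cylinder, surface, sector) triple. The sketch below is
an illustration under stated assumptions, not a description of any
real controller: the geometry constants are hypothetical, every track
is assumed to hold the same number of sectors, and DiskAddress,
to_linear and from_linear are names invented here.

```python
from collections import namedtuple

# Hypothetical geometry, chosen only for illustration.
SURFACES = 10            # e.g. five platters with two surfaces each
SECTORS_PER_TRACK = 100  # assumed to be the same for every track

DiskAddress = namedtuple("DiskAddress", "cylinder surface sector")


def to_linear(addr):
    """Number sectors so that all sectors of one cylinder are consecutive."""
    track = addr.cylinder * SURFACES + addr.surface
    return track * SECTORS_PER_TRACK + addr.sector


def from_linear(n):
    """Inverse of to_linear: recover the triple the controller must select."""
    track, sector = divmod(n, SECTORS_PER_TRACK)
    cylinder, surface = divmod(track, SURFACES)
    return DiskAddress(cylinder, surface, sector)


addr = DiskAddress(cylinder=3, surface=4, sector=7)
n = to_linear(addr)
print(n, from_linear(n))
```

Numbering sectors cylinder by cylinder keeps consecutive sectors
reachable without moving the head assembly, since all tracks of a
cylinder are under the heads at the same time.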


Disk Storage Characteristics:




Disk technology is in flux, as the space needed to store a
bit shrinks rapidly. Some of the typical measures
associated with disks are:

• Rotation Speed of the Disk Assembly: One rotation
  every 10 milliseconds is common, although higher and
  lower speeds are also found.

• Number of Platters per Unit: A typical disk drive has
  about five platters and therefore 10 surfaces.

• Number of Tracks per Surface: A surface may have as
  many as 100,000 tracks per inch, although diskettes
  have a much smaller number.

• Number of Bytes per Track: Common disk drives have
  1 million bits per inch along a track.
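
The measures above can be turned into a few derived figures. This is
back-of-the-envelope arithmetic only. The 1.5-inch track radius is an
assumption (the text gives no dimensions), and the 10% gap fraction is
the figure quoted earlier for gaps between sectors, assumed here to be
included in the bits-per-inch density.

```python
import math

ROTATION_MS = 10           # from the text: one rotation every 10 ms
PLATTERS = 5               # from the text: about five platters
BITS_PER_INCH = 1_000_000  # from the text: density along a track
GAP_FRACTION = 0.10        # from the text: gaps are often 10% of a track
TRACK_RADIUS_IN = 1.5      # assumption: radius of one example track

rpm = 60_000 // ROTATION_MS   # milliseconds per minute / ms per rotation
surfaces = 2 * PLATTERS       # two recording surfaces per platter

# Bits along one track of the assumed radius, then the usable bytes
# once the gaps are subtracted.
track_bits = 2 * math.pi * TRACK_RADIUS_IN * BITS_PER_INCH
usable_bytes = track_bits * (1 - GAP_FRACTION) / 8

print(f"{rpm} RPM, {surfaces} surfaces")
print(f"about {usable_bytes / 1e6:.2f} MB usable on the example track")
```

Outer tracks are longer than inner ones, so the per-track figure
depends heavily on the assumed radius.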


Disk Access Characteristics:

It is important to understand how the data are
manipulated. Since all computation takes place in
main memory or cache, the only issue as far as the disk is
concerned is how to move blocks of data between disk
and main memory. Data blocks are read or written
when:

a) The heads are positioned at the cylinder containing the
  track on which the block is located, and
b) The sectors containing the block move under the disk
  head as the entire disk assembly rotates.

Thus, the time between the moment at which the
command to read a block is issued and the time that the
contents of the block appear in main memory can be
broken into the following:

• The time taken by the processor and the disk controller
  to process the request.

• The time to position the head assembly at the proper
  cylinder, called the seek time.

• The time for the disk to rotate so that the first of the
  sectors containing the block reaches the head, called the
  rotational latency.

• The transfer time, during which the sectors of the block,
  and any gaps between them, rotate past the head.
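
The four components simply add up, which a short calculation can
show. Only the 10-millisecond rotation time comes from the text. The
overhead, seek and transfer figures below are assumptions picked to
make the arithmetic concrete, and read_time_ms is a hypothetical
helper. The average rotational latency is taken as half a rotation,
on the assumption that the desired sector is equally likely to be
anywhere on the track when the seek ends.

```python
ROTATION_MS = 10.0   # from the text: one rotation every 10 ms

# Assumed figures, for illustration only.
OVERHEAD_MS = 0.1    # processor and controller processing time
AVG_SEEK_MS = 6.0    # average time to reach the proper cylinder
TRANSFER_MS = 0.25   # time for the block's sectors and gaps to pass


def read_time_ms(overhead_ms, seek_ms, rotational_latency_ms, transfer_ms):
    """Total time to read one block: the sum of the four components."""
    return overhead_ms + seek_ms + rotational_latency_ms + transfer_ms


# Best case: heads already at the cylinder, sector about to arrive.
best = read_time_ms(OVERHEAD_MS, 0.0, 0.0, TRANSFER_MS)

# Average case: average seek plus half a rotation of latency.
average = read_time_ms(OVERHEAD_MS, AVG_SEEK_MS, ROTATION_MS / 2, TRANSFER_MS)

print(f"best case:    {best:.2f} ms")
print(f"average case: {average:.2f} ms")
```

With these numbers the mechanical parts (seek and rotation) account
for nearly all of the average time, and the transfer itself is a
small fraction.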


Writing Blocks:

The process of writing a block is quite analogous to
reading a block. The disk heads are positioned at the
proper cylinder, and we wait for the proper sector to
rotate under the head, but instead of reading the data, we
use the head to write the new data. The minimum, maximum,
and average times are exactly the same as for reading.


Modifying Blocks:

It is not possible to modify a block on the disk directly.
Instead, we do the following:

1. Read the block into main memory.

2. Make whatever changes to the block are desired in the
  main-memory copy of the block.

3. Write the new contents of the block back onto the
  disk.

4. If appropriate, verify that the write was done
  correctly.
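
The four steps can be sketched with an ordinary file standing in for
the disk. This is a minimal illustration under stated assumptions:
the 4096-byte block size and the modify_block function are
hypothetical, and the verification step simply re-reads the block,
which on a real system may be answered from an operating-system cache
rather than from the disk surface.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size; the text does not specify one


def modify_block(path, block_no, offset, new_bytes):
    """Read-modify-write one block; return True if the write verifies."""
    if offset + len(new_bytes) > BLOCK_SIZE:
        raise ValueError("change does not fit inside one block")
    with open(path, "r+b") as f:
        # 1. Read the block into main memory.
        f.seek(block_no * BLOCK_SIZE)
        block = bytearray(f.read(BLOCK_SIZE))
        # 2. Change the main-memory copy of the block.
        block[offset:offset + len(new_bytes)] = new_bytes
        # 3. Write the new contents of the block back.
        f.seek(block_no * BLOCK_SIZE)
        f.write(block)
        f.flush()
        os.fsync(f.fileno())
        # 4. Verify by reading the block again and comparing.
        f.seek(block_no * BLOCK_SIZE)
        return f.read(BLOCK_SIZE) == bytes(block)


# Usage: a two-block scratch file, changing three bytes in block 1.
fd, path = tempfile.mkstemp()
os.write(fd, bytes(2 * BLOCK_SIZE))
os.close(fd)

ok = modify_block(path, 1, 10, b"abc")
with open(path, "rb") as f:
    data = f.read()
os.remove(path)
print(ok, data[BLOCK_SIZE + 10:BLOCK_SIZE + 13])
```

Note that the file keeps its size: the change replaces bytes inside
the block, and the whole block travels in both directions.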



