Introduction to Database Management Systems
(Chapters 13 and 14 of the textbook)
School of Computing and Engineering
University of Missouri-Kansas City
A file is a collection or set (ordered or unordered) of data elements stored on a storage media.
A system software module is responsible for managing (reading, processing, deleting, etc.) a file.
Let us see how much work goes in a simple file operation.
The system performs the following steps to write a character to a sequential file.
Example: write (f1, c); where f1 = text file name. c = a char variable contains a value 'P'.
0. In the beginning 'P' is at memory location c.
1. The program asks O/S to write the contents of c to file f1.
2. The O/S passes the job to the File Manager.
3. The file manager looks up f1 in the directory to see if f1 is open and available for use,
allowed access types, and what physical file the logical name f1, corresponds to.
4. The file manager searches a file allocation table for the physical location of the sector
(last) that is to contain the byte.
5. The file manager makes sure that the last sector in the file has been stored in a system I/O
buffer in the main memory, then deposits the 'P' into its proper position in the buffer.
6. The file manager gives instructions to the I/O processor about where the byte is stored in
the main memory and where it needs to be sent on the disk.
7. The I/O processor puts the data in the proper format for the disk. It may also buffer the
data to send it out in chunks of the proper size for the disk.
8. The I/O processor sends the data to the disk controller.
9. The controller instructs the drive to move the read/write head to the proper track, waits
for the desired sector to come under the read/write head, then sends the byte to the driver
to be deposited bit by bit, on the surface of the disk. (Remember this byte has its
companion and each companion is deposited on the disk bit by bit).
The complete the entire set of operations about 10-12,000 assembly instructions are
Logical Structure of a file
Field: Smallest (fixed) indivisible logical unit of a file. A field holds a part of some data
Record: A set of logically related fields. Cardinality of this set may be fixed or variable,
i.e., a record size may be fixed or variable. A file, therefore, is a collection of
logically related records
A Field A Record A File
SSN Name Age Phone # SSN Name Age Phone #
2/22 CS470: Introduction to File Organizations. Vijay Kumar
A record can be of fixed or variable size. Most commonly fixed size records are used. The
file structure has been illustrated here in a tabular form but note that a file is not just a table its
records can be organized in many different ways each way defines a unique organization.
Operations on file
Create: Create a new file.
Write: Write to a file
Read: Read a part of a file
Rewind: Go to the beginning of the file
Delete: Remove a file from the system.
Close: Close a file (at the end of manipulation).
Open: Open a file for processing.
Modify: Modify a field value.
Relationship of a file with the storage media
Common storage media are disk, optical disk, and tame. Disk is a random storage device
because a record of a file can be accessed from the place it is stored. In some cases a file
acquires the attributes of the storage media on which it is stored, i.e., magnetic tape. To access a
record of a file stored on a magnetic tape, unlike disk files, the entire file has to search
sequentially before reaching the record.
Disk File structure: Several platters, several cylinders, several tracks on a cylinder, several
sectors on a track (Sector = Block (fixed by manufacturer)).
Access path: Disk No → Cylinder No →Track No → Sector No
Unit of data transfer between memory and disk: Block.
Relationship between logical and physical units
A fixed record size may be larger or equal to or smaller than a disk block. For storage
purpose a unit called Blocking factor (Bfr) is defined, which is the number of records in a block
and computed as follows:
Blocking factor - Bfr (Number of records in a disk block) = ⎣B/R⎦ where B is disk block size in
number of records and R is file record size. Thus, unused space in a block: B - (Bfr * R) bytes
Disk parameters: Seek time, rotational latency time and block transfer time.
Seek time: The time it takes to move the read/write arm to the correct cylinder. It is the largest in
cost. Average seek time is the same as the time it takes to traverse one third of the cylinders.
Rotational latency time: The time the disk unit takes to move or to position the read/write on to
the beginning of the sector from where the file records are stored.
Block transfer time: Time for the read/write head to pass over a disk block. During the block
transfer time, the bytes of data can be transferred between the disk and main memory.
3/22 CS470: Introduction to File Organizations. Vijay Kumar
Rotational latency and Block transfer time
Average rotational latency (r) = 1/2 of one disc revolution = 0.5. rpm = No. of rotations per min.
1 60 × 1000 0.5 × 60 × 1000
Time for 1 revolution= min = ms. Time for half revolution = ms.
rpm rpm rpm
Common speed for disk rotation is 3600 rpm so average latency time =
0.5 × 60 × 1000
ms = 8.3ms
Let data transfer speed = t bytes/ms and block size = B, then the Block transfer time (btt) = B/t
In a formatted data on the disk there is a gap between two consecutive blocks. Some control
information (information about the following block) is stored in this gap. In reading a set of
blocks these gaps are also read so we need to take into account the time to read these gaps in the
computation of block transfer time.
Suppose t' = data transfer speed for formatted data (block and the gap). So the effective
block transfer time Ebt = B/t'. We use IBM 3380 disk in our example to compute various
parameter values. The total capacity of the disk is over 2 billion bytes. Table 1 lists the values
of some useful parameters of this disk.
Parameter Definition Value
B Block size 2400 bytes
t Data transfer speed 3000 bytes/ms
btt Block transfer time 0.8ms
t' Formatted data (interblock gap) speed 2857 bytes/ms
ebt Effective block transfer time 0.84 ms (= B/t')
N No. of cylinders 885 per spindle
r Average rotational latency 8.3 ms
s Average seek time 16 ms
Time to read 10 blocks sequentially
It is the sum of average seek time, average rotational latency time, and 10 effective transfer
time. Effective block transfer time must be used, since the head must move over the interblock
gap. So reading time = s + r + 10 × ebt = (16 + 8.3 + 8.4) = 32.7 ms. Note that the seek time
and the rotational latency time are significant with respect to the transfer time.
Time to read 10 blocks randomly
In this case individual seek time and rotational latency time are required for each block. So
reading time = 10 × (s + r + btt) = 10 × (16 + 8.3 + 0.84) = 10 × 25.14 = 251 ms. The random
time for reading 10 blocks is higher than the sequential time by a factor of 8. Thus, time for
reading 100 sequential blocks = 0.1083 sec. Time for reading 100 random blocks = 2.510 sec.
4/22 CS470: Introduction to File Organizations. Vijay Kumar
The time for reading randomly 100 blocks is 25 times larger than time for reading 100 blocks
One of the main objectives of file organization is to speed up the file retrieval time, that is, to
reduce the I/O time. This can be done by improving disk access speed but that has a limit
because of the hardware constraints. The other way is to store file records in such a way that
records can be accessed concurrently or in parallel. A hardware solution to this is using RAID
structure. A RAID is an array of several disks which can be accessed in parallel. A software
solution is through some file organization. We discuss a number of file organizations. Note that
no single file organization is the most efficient for all types of data access. For this reason a
database management system uses a number of different file organizations for storing the same
set of data.
There are basically three categories of file organizations (a) Sequential organization, (b)
organization using Index, and (c) Random organization. We begin with sequential category.
Files of these categories are called sequential files.
We proceed as follows; (a) understand the file organization, (b) understand how a set of
operations are performed, and (c) analyze the time required to performed each operation. This
will help us to identify which file is suitable for what kind of data processing.
In this organization records are written consecutively when the file is created. Records in a
sequential file can be stored in two ways. Each way identifies a file organization.
• Pile file: Records are placed one after another as they arrive (no sorting of any kind).
• Sorted file: Records are placed in ascending or descending values of the primary key.
Pile file: The total time to fetch (read) a record from a pile file requires seek time (s), rotational
latency time (r), Block transfer time (btt), and number of blocks in the file (B). We also need a
key field of the record (K). In pile file we do not know the address of a record, so we search
sequentially. The possibilities are (a) the desired record may be in the first block or (b) it may be
in some middle block or (c) it may be in the last block.
1 B 1 B × ( B + 1) B
So the average number of blocks read × ∑i = × ≈ . Thus the time to find
B i =1 B 2 2
(read) a record in a pile file is approximately: TF = × ebt . The fetching (searching) process is
shown in the flow diagram (Figure 1) and the example illustrates the time involved.
5/22 CS470: Introduction to File Organizations. Vijay Kumar
Read next block
Read next record in the block
Is K = the key value of the record? Y
N Last record of this block?
All blocks searched?
Display message: "Record does not exist" and exit the search
Figure 1. Search in a sequential file organization.
File Reorganization: In file reorganization all records, which are marked to be deleted are
deleted and all inserted records are moved to their correct place (sorting). File reorganization
• read the entire file (all blocks) in RAM.
• remove all the deleted records.
• write all the modified blocks at a different place on the disk.
Time to read the entire old file blocks = B (No. of blocks in the file) × ebt.
Time to write out new file blocks (n) = (n /Bfr) × ebt.
Total reorganization time TR = (b + n/Bfr)× ebt.
Record size: 400 bytes.
File size: 100,000 records 40 megabytes.
Storage disk: IBM 3380.
Block size: 2400 bytes.
Number of blocks in the file (B): 40 mb/2400 = 16,667.
The total time to read a record is TF: 8333 × 0.84 ms = 7 secs.
This time is tolerable for infrequent reads. However, if frequency of reads is large, then this
time is not acceptable. Suppose we want to look for 104 names of patients. For this the total
search time is = 7 × 104 ms = 19 hrs. and time to read through the entire file with independent
fetches is TX (independent fetches) = 100,000 × 7000ms > 8 days.
Missouri has about 5 million personal income tax record of 400 bytes each. This will fit on
one IBM 3380 drive. The time to read this file with independent searches is:
5000000 × 7000
TX = = 2 years .
1000 × 60 × 60 × 24 × 365
6/22 CS470: Introduction to File Organizations. Vijay Kumar
Inserting a record: To insert a record, it is placed at the end of the file. No need to sort
(ascending or descending order) the file. However, the file may have duplicate records. The
total time involved in this process is TI (insert time) = s (seek) + r (latency) + btt + r = 42 ms
(on IBM 3380).
Deleting or modifying a record: This will require to fetch the block containing the record, find
the record in the block and just mark it deleted, then write the modified block to the disk. Total
time required: TD or TM = TF + 2r.
Sorted Sequential File: In a sorted file, first the record is inserted at the end of the file and then
moved to its correct location (ascending or descending). Records are stored in order of the
values of the key field.
A sequential file organization
A sequential file usually has an overflow area. This area is to avoid sorting the file after
every deletion, insertion and/or modification. All records, which were added after the file was
first populated go to this overflow area. The overflow area is not itself sorted it is a pile file with
fixed size record. At some convenient time the entire file is sorted where records from the
overflow area go to their correct position in the main file.
Record ordering: Stored in an ascending or descending order of key = K. Record Rj precedes
record Rk iff K(Rj) <= K(Rk).
Retrieval: Records are retrieved in a consecutive order. The order of record storage determines
order of retrieval. During retrieval several required operations (partial result output etc.) can be
Insert, delete and modify (Update): Sequential files are usually not updated in place. Each
operation regenerates a new version of the file. The new version contains the up-to-date
information and the old version is usually saved for recovery. The new version becomes the old
version and the next update to the file uses this old version to generate next new version. The
intensity or frequency of use of a sequential file is defined by a parameter called “Hit ratio”,
which defines is defined as follows:
No. of records accessed for responding to a query
Hit ratio = .
Total number of records in the file
Desirable: high hit ratio value. This means a larger number of records are accessed to respond to
a query. Interactive transactions have very low hit ratio.
Advantages of sequential file
• Good for batch transactions.
• Simple to implement
7/22 CS470: Introduction to File Organizations. Vijay Kumar
• Good for report generation, statistical computation and inventory control.
• Not good for interactive transactions
• High overheads in file processing for simple queries.
Index File Organization
Index organization tries to reduce access time. It may not reduce the storage requirement of
a file. We define some important terms of this organization.
Index: An index is similar to a pointer with an additional property. This additional property
allows an index to identify the record it is pointing to. For example, an index to a record of
employees points and identifies an employee record. This means that an index has some
semantics associated with it. Formally, an index maps the key space to the record space. Index
for a file may be created on the primary or secondary keys.
Index types: Primary, Nondense, and Dense.
Primary Index: An ordered file of index record. This file contains index records which are of
fixed length. Each index record has two fields:
• one field holds the primary key of the data file record.
• the other holds pointer to disk block where the data file is stored.
Nondense index: No. of entries in the index file << no. of records in the data file.
Dense index: No. of entries in the index file = no. of records in the data file.
Example: How record search time is reduced by indexing.
Sequential search (no indexing)
File size = 30,000 records. Record size = 100 bytes (R). Block size = 1024 bytes (B).
Blocking factor = ⎣1024/100⎦ =10 records per block.
No. of blocks for the data file = 30,000/10 = 3000 blocks.
A binary search on the data file ⎣log2 3000⎦ = 12 blocks accesses.
Primary key = 9 bytes. Block pointer = 6 bytes. Index entry size = 9 + 6 = 15 bytes.
Index file blocking factor = ⎣1024/15⎦ = 68 entries. No. of blocks for index file = 3000/68 =
Binary search on index file = ⎣log2 45⎦ = 6 blocks accesses. Total no. of accesses = 6 + 1 = 7.
Index Sequential File (ISAM - Index Sequential Access Method)
Introduced by IBM in early 1960s. This file organization closely related to the physical
characteristics of the storage media. It has two parts: (a) Index Part and (b) Data Part.
Index Part: This part stores pointers to the actual record location on the disk. It may have several
levels and each level represents some physical layout such as sector, cylinder, of the storage
Data Part: It holds actual data records and is made up of two distinct areas: Prime area and
Overflow area. The prime area holds records of the file. The overflow area holds records when
the prime area overflows.
8/22 CS470: Introduction to File Organizations. Vijay Kumar
Organization: Data records are stored on disk tracks in their ascending key values.
Organization: Data records are stored on disk tracks in their ascending key values.
Track No. Key Record Key Record Key Record Key Record
Track 1 10 rec10 14 rec14 18 rec18 20 rec20
Track 2 26 rec26 34 rec34 41 rec41 60 rec60
Track 3 72 rec72 73 rec73 74 rec74 75 rec75
Track 4 77 rec77 78 rec78 79 rec79 82 rec82
Track 5 89 rec89 91 rec91 93 rec93 98 rec98
To locate a data record a Track Index is required.
Track No. Highest key on the track
Operation on the file: Read: record 79.
Sequential access: Search sequentially from track 1, record 10. Very slow.
Semi-random access: (a) Search track index, (b) find the track that contains a key greater than
79. This takes us to track 4. Search track 4 sequentially to find record 79.
Problem: For a large file track index would be large. A large index is not convenient to manage.
Solution: Create Cylinder index to manage track index.
Cylinder Highest key
Track index for cylinder 1 Track index for cylinder 2 Track index for cylinder 3
Track Highest key Track Highest key Track Highest key
1 20 1 107 1 201
2 60 2 122 2 210
3 75 3 148 3 223
4 82 4 163 4 259
Find record with key value 248.
• Search Cylinder index to find correct cylinder. Correct cylinder 3.
• Search track index for cylinder 3. Correct track 4.
• Search track 4 (sequentially) for the desired record.
9/22 CS470: Introduction to File Organizations. Vijay Kumar
Problem: Cylinder index may become very large.
Example: File size = 5000 tracks of data. Disk Type: 3350 (IBM), 16 platters/volume, that is,
30 surfaces/volume, and 555 tracks/surface, that is, 555 cylinders/volume. A cylinder has 30
tracks, so to store the entire file 5000/30 =167 cylinders are required (cylinder wise data storage,
i.e., store file records on the first track of cylinder 1, then first track of cylinder 2, and so on. ).
Apart from the file, there are other controlling information such as track, cylinder, and master
indexes, therefore, all 30 tracks are not used to store data. 25 tracks are used for data and the rest
are for control information. Result: 5000/25 = 200 cylinders for indexes and data, which means
there are 200 entries in a cylinder index.
Each entry of index record may be large. The size of cylinder index might grow with the file
size. To reduce the search time for a cylinder index, a master index is created, which holds the
address of cylinders storing the records.
Overflow area: It accommodates overflow record from the prime area. If the record key
indicates that the record must go on a track, which is full then some records from this track is
moved to the overflow area. With overflow area the track index takes the following form:
Track No. Highest key on track Highest key in overflow area Addr. of the first overflow
1 20 20 Null
2 60 60 Null
3 75 75 Null
4 82 82 Null
5 98 98 Null
Deletion: Simple. Find the desired record. Put a special marker into the record to indicate that it
has been deleted.
Example: Insert record 55.
Destination: Track 2. Track 2 is full. Move record 60 to the overflow area. Modify the track
10 14 18 20 1 20 20 Null
26 34 41 55 2 55 60 Address of Rec. 60
72 73 74 75 3 75 75 Null
77 78 79 82 4 82 82 Null
89 91 93 98
Add record 42. Result Prime Area Track Index. Other records can be inserted similarly. When
the overflow area is full then records are inserted in overflow area on the disk, however, this
10 14 18 20 1 20 20 Null
26 34 41 42 2 42 60 Address of Rec. 55
72 73 74 75 3 75 75 Null
77 78 79 82 4 82 82 Null
89 91 93 98
10/22 CS470: Introduction to File Organizations. Vijay Kumar
Implementation: The cylinder index is usually kept in the memory. The cylinder index gives the
address of the correct track index, which is located on the disk. The track index is fetched and the
address of the record is found.
Performance: Time to fetch a record: TF = r + s + btt + r + btt
As new records are added, the ISAM file degraded in performance. It becomes slow and
often had to be reorganized at high cost. This is why ISAM is not the choice of indexing method
for new systems. Let us suppose that all overflows is on the same cylinder. Then to look up a
record, we obtain cylinder track index and search the primary area. If the record is not there then
we must follow the chain of pointers to various records in the overflow area of the cylinder. If
this overflow area does not contain the record then the second overflow area on another cylinder
(not necessarily the adjacent cylinder). So a search for a record may require several seeks.
Similarly, reading the file in order after many insertions becomes quite complicated. Each linked
list must be followed, record-by-record, to read sequentially. This can require a large number of
seeks. ISAM is not used any more. It is replaced by B+ structure.
VSAM (Virtual Storage Access Method)
Structure: It has three parts: (a) Index set, (b) Index sequence set, and (c) Data blocks.
Index set: Set of records. Identical to ISAM Index file. Each index record points to one of the
lower level index block.
Index set record structure: Each record has two fields:
Index sequence set
High key value of the block being pointed to Ptr to the block
Index set B
High key value of the block being pointed to Ptr to the block L
High key value of the block being pointed to Ptr to the block M
In an index set (block) the index records are maintained physically in ascending sequence by
primary key value, The index records in the index set are nondense, meaning that there is only
one index record for a lower level index block.
Index Sequence Set: Nondense. Each record in this set points to one Control Interval.
Control Interval: Set of data records, which are physically stored in ascending key
Control Area: An ordered set of control intervals and Free control intervals for a file.
Distributed free space: Set of free control intervals in a control area.
The sequence set contains index records that point to every control interval allocated in the
file. The following diagram illustrates Index sequence set, Control intervals, Distributed free
space and Control area. The size of control area and the number of control intervals may be
predefined. At the end of file initialization the unfilled control intervals are set aside as
distributed free space.
11/22 CS470: Introduction to File Organizations. Vijay Kumar
D A B C D
H E F H
M J M
P N P Control intervals
S Q S
Free Free control interval
Free Distributed free space
Free control interval
Index sequence set
The complete structure of a VSAM look like:
Index sequence set
A Control interval
Index set B Control interval
L Control interval
M Control interval
T Control interval
Sequential access: It can begin from the beginning of the file or from any other point.
Random access: Through the index set to the index sequence set to the data.
Update (insert a new record): During insert the following situation might occur:
a. Space available in the target control interval.
b. New record does not fit in the target control interval.
c. The control area is completely full.
Space available in control interval
• Bring desired control interval in the buffer.
• Move key values higher than the target record up in the control interval.
• Insert the desired record in the control interval
• Write the control interval back to the file.
A D F
A D F
A CD F
Not enough room in the target control interval: The available room in the control area is
insufficient to accommodate the new record. In order to insert C we must get a new control
interval. If there is a free control interval in the control area then the records with higher key
values from the control interval are copied into the new control interval. If there is no free
control interval in a control area then a new control area is allocated and initialized. The
initialization process copies all the control intervals with records having value smaller than the
new record. The new record is then copied into the proper place and the rest of the records are
then copied. The old control area is discarded, but this seldom happens.
12/22 CS470: Introduction to File Organizations. Vijay Kumar
C Record to be added
A B D F
Index sequence set
A B C Original control interval
Control interval split
D F New control interval
C A B C
Mofified index sequence set
The control area is completely full: In this situation an entire new control area is created,
added to the file, and then populated.
File Size and File creation: User may specify the number and size of control intervals and
control areas. If not specified, VSAM will use some default value. The size of control interval
dictates the size of the unit of I/O data transfer. It is important for performance reasons to make
the control interval a multiple of the actual physical blocks on the disk itself.
File name: Given by the user. This name is stored in the master catalogue.
Freespace: The user has the capability of allocating distributed free space throughout the file.
This space will not be used when the file is populated. However, this will be available for control
interval spliting tasks.
Record size: Greater flexibility. The user must specify not only the maximum record length but
also the average record size. VSAM can use the average size to perform calculations as to how
large, for example, a control interval must be.
File population: One of the main ways is to sort all data records by their key field and present
this to the system. This is called mass insert and can optimize the process of how to move
records within control intervals.
The following diagram illustrates this process:
Add records A, B, C, and D A B C D
Add records E, F, G, H, and I A B C D E F G H I
Mass insert into empty file
Original file A D K Z
Add record L A D K L Z
A D K L M N O P
Add records M, N, O, P, Q, and R Q R Z
Mass insert between existing records
Indexing provides the most powerful and flexible tools for organizing files. However, (a)
indexes take up space and time to keep them organized and (b) the amount of I/O increases with
13/22 CS470: Introduction to File Organizations. Vijay Kumar
the size of index. This problem can be minimized by direct file organization where the address
of the desired record can be found directly (no need of indexing or sequential search). Such file
are created using some hashing function so they are called hashing organization or hashed files.
Two types of hashing: (a) Static and (b) Dynamic.
Static: The address space size is predefined and does not grow or shrink with file.
Dynamic: The address space size can grow and shrink with file.
Key Space: Set of Primary keys.
Address Space: Set of home addresses.
Hash function (Primary Key) → Home address of the record.
Home address → file manager → disk address.
Requirements: 1. A predefined set of home addresses (address space). 2. A hashing function.
3. Primary key (Hashed key).
There is a unique relationship between the key and the record address. Any add, delete or
change in a record must not break this relationship. If this bond is broken then the record is no
longer accessible by hashing.
1. With hashing, the address generated is random.
2. No obvious connection between key and the home address. (For this reason hashing is
sometimes referred as randomizing.)
Record distribution in address space: a. Uniform. b. Random.
Best Worst Acceptable
Record Key Address Record Key Address Record Key Address
1 1 1
2 2 2
A 3 A 3 A 3
B 4 B 4 B 4
C 5 C 5 C 5
D 6 D 6 D 6
E 7 E 7 E 7
F 8 F 8 F 8
G 9 G 9 G 9
No synonyms All synonyms Few synonyms
Collision: When a hashing function generates the same home address for more than one record.
Solutions for collision: Progressive overflow (also called open addressing)
5 N 6 R
H(K) Home address 6 R First try (full) 7 J Wrap around
7 J Second try (full) 8 M
8 M Third try (full) H(T) Home address 9 S
Fourth try. K's home address
Collision Repair by Progressive Overflow Home address search beyond the end of file
• What happens if there is a search for a record, but the record was never placed in the file?
14/22 CS470: Introduction to File Organizations. Vijay Kumar
• What happens if a record is hashed at a filled address and only the predecessor location is
These problems increase the average search length.
Example: Consider the following keys and their home addresses:
Key Home address Actual address No. of accesses to retrieve a record
Adams 20 0
Bates 21 --
Cole 21 20 Adams 1
Dean 22 21 Bates 1
Evans 20 22 Cole 2
23 Dean 2
24 Evans 5
Average search length = (1 + 1 + 2 + 2 + 5)/5 = 2.2. Acceptable average search length = 2.0.
Anything higher than this is regarded high.
Improvement (reduce the average search length): by storing more than one record at a single
home address. Storing records in Buckets
Bucket: A logical unit of storage where more than one records are stored and the set of records
set is retrieved in one disk access (one I/O).
The home addresses of these records are the same. On sector-addressing disks, a bucket typically
consists of one or more sectors. On block-addressing disks a bucket might be a block (physical,
defined by the manufacturer). Consider the following set of keys, to be loaded into a hash file.
Key Home address Buckets
Green 30 30 Green Hall
Hall 30 31
Jenks 32 32 Jenks
King 33 33 King Land Marks
Land 33 Nutt is an overflow record
Each address (30, 31 etc.) identifies a bucket capable of holding the records corresponding to
three synonyms. Only the record corresponding to Nutt cannot be accommodated in a home
Access record: Green. 1. Load the buffer in the memory. 2. Search the bucket serially.
Deletions: More complicated than adding a record for two reasons:
1. The slot freed by the deletion must not be allowed to hinder later searches.
2. It should be possible to reuse the freed slot for later additions.
In progressive overflow a search for a record terminates if an open address is encountered.
Problems in searching a record
15/22 CS470: Introduction to File Organizations. Vijay Kumar
Four keys: Adams, Jones, Morris and Smith. Collisions: Adam and Smith at address 5, and
Jones and Morris at 6.
The following diagram illustrates the setup:
Record Home address Actual address File
Adam 5 5 5 Adam
Jones 6 6 6 Jones
Morris 6 7 7 Morris
Smith 5 8 8 Smith
File before deletion
Now suppose Morris is deleted, leaving an empty space. A search for Smith will starts at address
5, then looks at addresses 6 and 7. Since address 7 is now empty, it is reasonable for the program
to conclude that Smith's record is not in the file.
Solution: Mark the deleted record with a tombstone. The search for a record will not stop at a
tombstone. This solution makes insertion difficult.
Example: Record deleted: Morris. Record to be inserted: Smith.
The program gets the address 5 and tries to find an empty slot. During the search it encounters a
tombstone. The record Smith is inserted here.
Problem: The File has duplicate Smith.
Solution: The program must search the entire file and then go back to the first tombstone and
insert the record.
Another problem with a tombstone is it should not be inserted when the last record or a record,
which does not have any successor, is deleted.
Dynamic Hashing (Extensible Files also called Linear Hashing)
Files grow and shrink in database systems. The static hashing organization is not suitable for
these files since they are unable to shrink and expand with files.
Solution: Dynamic hashing. It manages expansion by
• Splitting a bucket when it becomes full.
• Distributing records between old and new buckets.
Uses multiple hashing functions. These functions are related and the selection of the function
depends upon the level of bucket split. Division is the basis of hashing. We assume
N = Number of buckets. Address = key mod N (hashing into any bucket).
When a bucket is full then to accommodate an insert:
• The bucket is split. A different but related hashing function is used to distribute the
records of the old bucket between the old and the new buckets.
• The desired record is inserted in these buckets using the current hashing function.
16/22 CS470: Introduction to File Organizations. Vijay Kumar
N = 100, C = 4 (bucket size). Record Keys: 508, 17308, 208, 408.
Hashing function: address = key mod 100. Records hashed into the bucket as follows:
The bucket is full. New record 1208 is to be inserted there. The insertion of 1208 forces a
split. A new bucket is acquired on the disk. The reorganization of these records is done first in
the memory buffer then the records are written on the disk buckets (old and new). The five
records will be rehashed using function: address = key mod 2N (key mod 200).
The following diagram shows the state of the file after the distribution of the four old records
and the insertion of new record with key 1208.
Buckets after rehashing
Bucket 8 Bucket 108
Suppose the record 1608 is now inserted into bucket 8 and then record 5808 comes. We need to
split bucket 8 a second time, and on this occasion we use: address = key mod 4N (key mod 400).
Bucket 208 is allocated to the file. The file after a second split
2408 508 208
1208 17308 5808
Bucket 8 Bucket 108 Bucket 208
In general, if we are splitting a bucket that has been involved as destination of S previously
splits, we use the function: address = key mod (2s+1 × N).
• Waste of space (unused buckets after each split).
• If two buckets n and n+j are in use then all buckets between n and n+j must be available
to the algorithm for inserting records.
• The history of insertion must be saved to access a record. This is because many related
hashing functions are used in the search operation.
In this scheme there is no relationship between the buckets. The position of one bucket is not
related to the position of any bucket, which was the case in virtual hashing. Initially files are
organized in N buckets. These buckets reside on the disk but each of them is pointed to by a cell
in the memory, and bucket address is therefore not as important as in Virtual hashing. Suppose N
17/22 CS470: Introduction to File Organizations. Vijay Kumar
(number of buckets) = 20. C (bucket size in number of records) = 3. We have the following
1 2 3 Index level 0
Buc. 1 Buc. 2 Buc. 3
A hash function transforms a key into an address of a level 0 index element (1 through 3 in this
case). The appropriate bucket is accessed by the pointer.
Insertion: The hashing function gives the address of an index element and the record is inserted
in the bucket pointed to by that node. If the target bucket is full, a new bucket is allocated to the
file. The C + 1 records are redistributed between the two buckets. These two buckets then are
pointed to by the children of the original index node as shown in the following diagram:
1 2 3
Index level 0
Index level 1
1 2 3 21
Hash File after Bucket Split
• When redistributing records between two buckets (bucket split), how do we decide which
record goes to which bucket?
• A record search begins from the Index level 0. When searching a record, how do we find
which of the two buckets pointed to by the index block contains the record?
Solution: To solve the above problems, a second function is used. This function, given a key
(usually primary), returns an arbitrarily long binary string (1's and 0's) of fixed length
(predefined) as shown below. This string is referred to here as B (Key).
Key H (Key) B (Key)
157 2 10100...
95 1 00011...
88 1 01100...
205 2 10010...
13 1 10111...
125 1 10001...
6 1 01000...
301 1 00110...
How a B (Key) is used?: The bits of B are used to identify the right bucket for inserting and for
searching the desired record. If the bucket being split is pointed to from a node at level I in the
tree, then we use the value of the (I + 1)st digit of B (key) to determine if a record goes to the left
(old) or to the right (new) bucket. The following diagram shows the insertions of the first five
18/22 CS470: Introduction to File Organizations. Vijay Kumar
1 2 Index level 0
Bucket 1 Bucket 2
File after 5 insertions
Insert 125 and 6
The hashing on key 125 gives 1 so it should be inserted in bucket pointed by node 1 of level
0. The bucket is full so it splits. A new bucket is allocated to the file. The full bucket was pointed
to by a node at level 0, so we use the first digit in B (key) string to decide the destination bucket
of a record 125. Records 95 and 88 go to the left bucket, and records 13 and 125 go to the right
bucket. Record 6 goes to Bucket 1.
1 2 Index level 0
Index level 1
95 157 13
88 205 125
Bucket 1 Bucket 2 Bucket 3
Split after inserting 125. No split in inserting 6
Insert record 301
H(301) = 1, so its B (key) takes it to bucket 1. Bucket 1 is full so a split occurs. The bucket to
be split is being pointed to by a leaf at index level 1. So in distributing the records (old) and
inserting new record (301), 2nd bit (Index level 1+1 = 2) of B key is used for distribution. The
structure of the file after inserting 301 is:
Index level 0
Index level 1
Index level 2
95 157 88 13
301 205 6 125
Bucket 1 Bucket 2 Bucket 3 Bucket 4
Bucket Split to accommodate 301
Indexing and Hashing
Hashing of any form is not at all good for sequential access. So a process, which requires a
few logically contiguous records may not be processed efficiently and indexing is the best
organization for such requirements. The performance of index organization degrades when the
volume of updates and insertion increases.
There are a large number of different file organizations such as extendible hashing, linear
hashing, B-trees group (B+-tree, B-tree, etc.) but all are based in some form of indexing and
hashing. In a database system, a number of different file organizations are used to store data. In
19/22 CS470: Introduction to File Organizations. Vijay Kumar
many cases the same data is stored in multiple file organizations to improve data retrieval for
different types of queries (key query, range query, mult-key queries, etc.). Students are strongly
advised to study these file organizations from other books.
System allocates disk space to file blocks. Disk space management has two problems: (a)
several orders of magnitude slower disk access time and (b) an order of magnitude larger number
of blocks to deal with. Moreover, the variability and dynamic changes in file sizes make
prediction of their resource requirements difficult and unreliable in most cases.
A good space management mechanism should take into consideration:
1. Processing speed of sequential and random access to files, and allocation and
deallocation of blocks.
2. Ability to make use of multisector and multitrack transfers.
3. Disk utilization.
4. Main memory requirements of a given algorithm.
Processing speed: Speed of sequential and random access to files and not the allocation and
deallocation of blocks.
Multisector and multitrack transfer: Least number of I/O commands for large transfers, i.e.,
the entire track or a large number of sectors.
Disk utilization: Percentage of disk space allocatable to files, i.e., minimal fragmentation.
Main memory utilization: Generates low table fragmentation.
Methods of space allocation: Contiguous and non-contiguous.
In this method disk blocks are allocated contiguously. It also maintains logical contiguity. If
we represent the disk space as a 2-dimensional matrix, then a contiguous allocation can be
arranged as follows. LB indicates logical block of the file.
0 1 2 3 4
5 6 7 8 9
10 LB0 11 LB1 12 LB2 13 LB3 14 LB4
15 16 17 18 19
1. Estimate file size.
2. Search BFD (Basic File Directory) to find (First Fit or Best Fit) contiguous disk blocks.
3. Allocate the set of disk blocks to the file.
4. Update BFD.
5. Update free disk table.
Possible access type: sequential and random.
Sequential: Address of the first block and file size is given in the BFD. Read all the blocks
Random: Address of the first block + (block number × block size).
Sequential access is fast. Multitrack or multisector transfer can be conveniently used.
20/22 CS470: Introduction to File Organizations. Vijay Kumar
Deallocation: Easy. Find the file in the BFD. Overwrite just the type field and leave the address
and size field as they are.
Advantages of contiguous allocation
• Easy implementation.
• Very little directory and memory space is needed for keeping track of contiguous files.
Disadvantages of contiguous allocation
• Disk fragmentation: internal and external.
• File size must be pre-declared and pre-claimed especially in creating files, alright in
copying files. For example, what would be the size of a file when creating it using editor?
• Must provide an overflow area for managing the file expansion.
Noncontiguous allocation: Chaining. A disk based version of linked list.
Implementation: A few bytes of each disk block are reserved for forward pointer to the next
block in the sequence. The directory contains the head pointer. The following figure shows the
0 1 2 3 LB1 [next 18] 4
5 6 7 8 LB4 [next - nil] 9
10 LB0 [next 3] 11 12 13 14
15 16 17 19 LB2 [next 20] 20 LB3 [next 8]
• Well suited for sequential access.
• Multisector transfer is difficult.
• Direct access very slow, in fact it is not easily possible since all intermediate disk blocks
must be visited before going to the desired block.
• Expansion and contraction of files can be easily managed, therefore, pre-declaring of file
size is not necessary.
• Minimum external fragmentation, therefore disk compaction is not necessary.
• Space for disk pointer is usually below 1%, its effect on the disk utilization is negligible.
• Allows easy handling of bad blocks by omitting them from both file and free chain.
• Insertion and deletion of file blocks in the middle of the chain is easy.
• Pointers are sensitive data and their correctness can be questionable.
0 1 2 3 4 LB0
5. 6 7 LB3 8 9
10 11 12 13 14
15 16 LB4 17 LB1 18 LB2 19
Noncontiguous allocation: Indexing. Based on index sequential file. The first disk block of the
file is reserved as index block and the addresses of the index block is stored in the file directory.
The following figure illustrates the allocation.
21/22 CS470: Introduction to File Organizations. Vijay Kumar
• Both random and sequential accessing of indexed files requires one disk access per level
of indexing to locate the address of the target block.
• Multisector transfer is not easy.
22/22 CS470: Introduction to File Organizations. Vijay Kumar