File design


   Unit 5
                Definition of a File
A file is a collection of related records
   organized for a particular purpose
   (a data set).
                           File Structure

Database (collection of files)
   File (collection of logical records)
      Record (tuple; logical record)
         Data item (attribute; smallest addressable unit)
       Types of Computer Files
1.   Master File
2.   Transaction File
3.   Transfer File
4.   Work File
5.   Output file
6.   Dump File
                   Master file:

A file which is a major source of reasonably
    stable information about entities is called
    a master file.

     When records are added or deleted from a master
       file or when individual fields of records are
       changed the master file is said to be updated.

     Ex: In payroll processing, the file that contains the
        records of all employees is the master file.
             Transaction file:
• A request to update the master file is
  called a transaction.
• In online transaction processing systems
the master file is updated as soon as a
  transaction occurs.
• In batch-processing systems the
  transactions are collected together in a file
  called transaction file.
• Periodically the transaction file is used to
  update the master file.
For example, in a payroll system each
employee's record is retrieved from the
master file every month. The employee's
deductions are found by retrieving the
corresponding record of monthly deductions
from the transaction file, and the
employee's pay-slip is generated.

If a transaction record requires some changes in
   the master record the change is made and the
   master record is written back.
      Batch Processing in Sequential File
[Figure: the master file (top half of the disk) and the transaction file
(bottom half of the disk) are read by the program, which produces the
pay slips and writes back the updated master file.]
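
The batch update described above can be sketched in Python as a single pass over both files. This is only a minimal illustration, assuming both files are already sorted on the same key; the field names (emp_id, salary, deduction) and the sample values are hypothetical and do not come from the slides.

```python
def batch_update(master, transactions):
    """Apply a sorted transaction file to a sorted master file in one pass."""
    updated, pay_slips, unmatched = [], [], []
    t = 0
    for record in master:
        record = dict(record)  # copy, so the old master is left intact
        deductions = 0
        # Transactions with a smaller key have no master record.
        while t < len(transactions) and transactions[t]["emp_id"] < record["emp_id"]:
            unmatched.append(transactions[t])
            t += 1
        # Apply every transaction that matches this master record.
        while t < len(transactions) and transactions[t]["emp_id"] == record["emp_id"]:
            deductions += transactions[t]["deduction"]
            t += 1
        record["total_deducted"] = record.get("total_deducted", 0) + deductions
        updated.append(record)  # "write back" the updated master record
        pay_slips.append((record["emp_id"], record["salary"] - deductions))
    unmatched.extend(transactions[t:])
    return updated, pay_slips, unmatched


# Hypothetical sample data, sorted by emp_id.
master = [
    {"emp_id": 1, "salary": 3000},
    {"emp_id": 2, "salary": 2500},
    {"emp_id": 3, "salary": 4000},
]
transactions = [
    {"emp_id": 1, "deduction": 200},
    {"emp_id": 2, "deduction": 150},
    {"emp_id": 3, "deduction": 300},
]

new_master, pay_slips, unmatched = batch_update(master, transactions)
print(pay_slips)
```

Because both inputs are read strictly in key order, neither file is ever searched, which is why this style of processing suits sequential media.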
            File Activity Ratio
• The ratio of the number of transaction
  records to the number of records in the
  master file is called the file activity ratio.
• File activity ratio = No. of transaction
  records / No. of master records
• In the payroll system the file activity ratio
  is unity, since every employee record is
  processed each month.
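
As a small worked example, the ratio can be computed directly from the two record counts. The counts below (1000 employees, 50 enquiries) are assumed for illustration only.

```python
def file_activity_ratio(n_transactions, n_master):
    """Number of transaction records divided by number of master records."""
    if n_master <= 0:
        raise ValueError("the master file must contain at least one record")
    return n_transactions / n_master


# Payroll: every one of the (assumed) 1000 employees has a transaction.
payroll_ratio = file_activity_ratio(1000, 1000)

# A low-activity file: only 50 of 1000 master records are touched.
enquiry_ratio = file_activity_ratio(50, 1000)

print(payroll_ratio, enquiry_ratio)
```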
              Transfer File
• This type of file carries data from one
  processing stage to another.
• Ex: The transaction file may be input to a
  sorting process. The output file of sorted
  records constitutes a transfer file for input
  into the next process, i.e. the updating of
  the master file.
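
The sorting stage in this example might look like the sketch below, which writes the sorted transactions to a transfer file that the update stage then reads. The CSV layout, the file name and the field names are assumptions made for the illustration.

```python
import csv
import os
import tempfile

FIELDS = ["emp_id", "deduction"]  # hypothetical record layout


def sort_stage(transactions, transfer_path):
    """Sort the transactions on the key and write them as the transfer file."""
    ordered = sorted(transactions, key=lambda r: r["emp_id"])
    with open(transfer_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(ordered)


def read_transfer_file(transfer_path):
    """Read the transfer file back as input to the next processing stage."""
    with open(transfer_path, newline="") as f:
        return [{name: int(row[name]) for name in FIELDS}
                for row in csv.DictReader(f)]


unsorted_transactions = [
    {"emp_id": 3, "deduction": 300},
    {"emp_id": 1, "deduction": 200},
    {"emp_id": 2, "deduction": 150},
]
transfer_path = os.path.join(tempfile.mkdtemp(), "sorted_transactions.csv")
sort_stage(unsorted_transactions, transfer_path)
sorted_records = read_transfer_file(transfer_path)
print([r["emp_id"] for r in sorted_records])
```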
              Output File
• This contains information usually extracted
  from one or more master files for output
  from the system.
• It may be in printed form for dispatch to
  customers or on magnetic medium as
  input to another process.
• Ex: Invoices sent to customers.
•     Weekly sales summaries as an input to
  monthly accounting process.
                     Dump file
• This is a copy of computer-held data at a particular point
  in time. It is also called a backup file.
• It may be a complete copy of the master file, retained
  to help recovery in the event of possible future
  corruption of the master file.
• It can also be a copy of part of a program in which a
  possible fault is being investigated.
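
A dump of a master file, and recovery from it, can be sketched with the standard library as follows. The time-stamped naming scheme, the file names and the sample contents are assumptions for the illustration.

```python
import os
import shutil
import tempfile
import time


def take_dump(master_path, dump_dir):
    """Copy the master file into dump_dir under a time-stamped name."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dump_name = os.path.basename(master_path) + "." + stamp + ".dump"
    dump_path = os.path.join(dump_dir, dump_name)
    shutil.copy2(master_path, dump_path)  # copies contents and metadata
    return dump_path


def recover(dump_path, master_path):
    """Overwrite a corrupted master file with the dumped copy."""
    shutil.copy2(dump_path, master_path)


work_dir = tempfile.mkdtemp()
master_path = os.path.join(work_dir, "master.dat")
with open(master_path, "w") as f:
    f.write("1,Asha\n2,Ravi\n")  # hypothetical master records

dump_path = take_dump(master_path, work_dir)

# Simulate corruption of the master file, then recover from the dump.
with open(master_path, "w") as f:
    f.write("corrupted")
recover(dump_path, master_path)

with open(master_path) as f:
    recovered = f.read()
print(recovered)
```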
               Archival File
• This is for long term storage of information
  about the organization’s business.

• Library file: Refers to the file containing
  application programs, utility programs,
  system software etc.
 File Design Consideration Factors
• Operational purpose: Rapid-response systems dictate the
  direct-access method of file processing. For daily, weekly or monthly
  needs, where file activity is high and there is no immediate need for
  information, sequential organization will be effective.
• Hardware: The availability of hardware decides the choice of media
  for the files.
• Method of access: If the proposed system requires real-time
  operation or on-line enquiry, then direct access will be essential.
  Even if direct access is not a prerequisite, the use of direct-access
  media may be desirable if it reduces the sorting time, enables the
  file to be split into smaller files, or significantly reduces the
  processing time.
• File size: Small files can be kept on disc to improve
  availability. Very large files can be kept on tape
  because the raw cost of tape is 1/10th that of disc.
• Output requirements: Output files can be kept on disc.
  This permits greater versatility of processing selective
• Input requirements: A transaction file is normally
  subjected to validation, control and sorting. Detailed
  analysis of the requirements decides the storage medium.
• File activity:
    a) The frequency of reading and updating of the file.
    b) The percentage of the file required during the run.
    c) Whether it will be used by more than one program.
  All these issues decide the file design.
• Speed of processing:
    a) Data transfer time: the time necessary to transfer a sector
       between the disk and the memory buffer.
    b) Latency time (the time spent waiting for the target sector to
       appear under the read/write heads) and seek time (the time
       necessary for the read/write heads to travel to the target
       track) on disk.
    c) Stop/start time of tape.
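
The three disk components above add up to the time needed to reach and read one sector. The sketch below assumes that, on average, the target sector is half a rotation away; the drive figures (8 ms seek, 6000 rpm, 1 ms transfer) are made up for the example.

```python
def average_access_time_ms(seek_ms, rpm, transfer_ms):
    """Seek time + average latency + data transfer time, in milliseconds."""
    rotation_ms = 60_000 / rpm       # time for one full rotation
    latency_ms = rotation_ms / 2     # on average, half a rotation
    return seek_ms + latency_ms + transfer_ms


# Assumed figures: 8 ms seek, 6000 rpm (10 ms per rotation), 1 ms transfer.
access_ms = average_access_time_ms(seek_ms=8.0, rpm=6000, transfer_ms=1.0)
print(access_ms)
```

With these figures the average latency is 5 ms, so seek and latency together dominate the transfer time.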

• Cost: Tape is cheaper for storing large files. Tapes can
  be posted (transported) more easily than disks, which are
  bulky and heavy.
• File volatility: This term refers to the rate at which
  records are added to or deleted from a file. A static file
  (general details of employees, i.e. a master file)
  has a low percentage, while a dynamic file (monthly
  transactions of inventory) has a high percentage of
• Growth potential of file: Usually files are designed with
  a plan based on their anticipated growth over a period of
• Data format: The packing density of data plays a role in
  deciding file design. There should be a balance between
  time savings (resulting from high-density packing) and
  time losses (caused by high data overflow percentages).
• File density:
  Density (%) = (number of data characters / number of character
  positions available) × 100
  The greater the density of a file the lower storage
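
The density formula translates directly into code. The character counts used here are assumed for illustration.

```python
def file_density(data_chars, positions_available):
    """Percentage of the available character positions holding data."""
    if positions_available <= 0:
        raise ValueError("positions_available must be positive")
    return data_chars / positions_available * 100


# Assumed example: 7500 data characters in 10000 available positions.
density = file_density(7500, 10000)
print(density)
```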
• Frequency of maintenance: This means
  the number of updates to the file in a certain period
  of time.
  Two types of maintenance are assumed in this
    a) Transactions are processed as and when they
    b) Transactions are withheld until an
       economic-sized batch is ready for processing.
• Blocking: The records can be grouped
   together in multiple blocks within
   fixed-length tracks.
  This way storage utilization can be improved
   and data transfer is faster. If the
   hit rate is low, blocking may be a
   disadvantage, as more time is spent on
   transferring and unpacking a block of data
   than on processing a single record.
  If the hit rate is high, blocking results in
   faster data transfer.
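
The trade-off can be made concrete with a little arithmetic. The sketch below assumes fixed-length records that are not split across blocks; the block size, record size and record counts are hypothetical.

```python
import math


def blocking_factor(block_size, record_size):
    """Number of whole records that fit in one block."""
    return block_size // record_size


def blocks_needed(n_records, block_size, record_size):
    """Number of blocks required to hold n_records."""
    return math.ceil(n_records / blocking_factor(block_size, record_size))


def worst_case_bytes(n_hits, block_size):
    """Bytes moved when every wanted record lies in a different block."""
    return n_hits * block_size


BLOCK, RECORD, N = 4096, 100, 10_000  # assumed sizes and record count

factor = blocking_factor(BLOCK, RECORD)
full_pass_transfers = blocks_needed(N, BLOCK, RECORD)

# High hit rate: a full pass needs far fewer transfers than one per record.
# Low hit rate: fetching 5 scattered records moves 5 whole blocks,
# although only 5 * RECORD bytes are actually wanted.
low_hit_bytes_moved = worst_case_bytes(5, BLOCK)
low_hit_bytes_wanted = 5 * RECORD
print(factor, full_pass_transfers, low_hit_bytes_moved, low_hit_bytes_wanted)
```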
• File growth: The data organization must be
  designed to accommodate the steady
  growth of the file if it happens.

The file has to be physically organized on
 the appropriate device, and the method of
 accessing the data needs to be determined,
 based on all the above factors.
Sequential and Random Access File Organization
  Sequential File Organization
• In sequential file organization records are
  sorted and arranged in ascending or
  descending order of a key attribute.
• These records are stored in secondary
  storage in sorted order.
• Records are retrieved for processing in the
  same order as they are stored.
• Whenever records are added to or
  deleted from the file, the file has to be
  re-sorted.
• If additions and deletions are frequent then
  repeated sorting will be costly.
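
The cost of maintaining sorted order can be seen in a short sketch. Note that it swaps in a common alternative to a full re-sort: only the batch of additions is sorted, and it is then merged with the existing file. Either way the whole file is rewritten. Records are modelled as hypothetical (key, name) tuples.

```python
import heapq


def record_key(record):
    """The key attribute is assumed to be the first field of the tuple."""
    return record[0]


def add_records(sorted_file, additions):
    """Sort the additions, then merge them into a new sorted file."""
    batch = sorted(additions, key=record_key)
    return list(heapq.merge(sorted_file, batch, key=record_key))


def delete_records(sorted_file, keys_to_delete):
    """Rewrite the file, leaving out the records with the given keys."""
    doomed = set(keys_to_delete)
    return [r for r in sorted_file if record_key(r) not in doomed]


old_file = [(1, "A"), (3, "C"), (5, "E")]
new_file = add_records(old_file, [(4, "D"), (2, "B")])
pruned_file = delete_records(new_file, [3])

# Sequential retrieval: records come back in the order they are stored.
print([record_key(r) for r in new_file])
print([record_key(r) for r in pruned_file])
```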
 Random Access File Organization
• In random file organization records are not
  kept in sorted order.
• Individual records are stored and retrieved
  independent of other records in the file.
• A record can be inserted, deleted and
  updated without affecting the other records
  in the file.
• A random access file is stored in a disk by
  transforming the key of each record to a
  physical address on the disk.
• F (key value) =Address of record
• F is the mapping function.
• When a record is to be accessed, the key is
  again transformed by the mapping
  function F to obtain its address.
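
One simple choice of mapping function F is the key modulo the number of record slots, multiplied by the record length. The sketch below uses that choice with fixed-length records in an in-memory file. The record layout, the slot count and the use of linear probing when two keys map to the same address are all assumptions of this example, not something the slides specify. Deletion is left out because it would need extra bookkeeping under linear probing.

```python
import io
import struct

RECORD = struct.Struct("<i20s")  # assumed layout: 4-byte key, 20-byte name
N_SLOTS = 11                     # assumed number of record slots
EMPTY = -1                       # key value marking an unused slot


def address(key, probe=0):
    """Mapping function F: key -> byte offset of a record slot."""
    return ((key + probe) % N_SLOTS) * RECORD.size


def create_file():
    """Create a file in which every slot is marked empty."""
    f = io.BytesIO()
    for _ in range(N_SLOTS):
        f.write(RECORD.pack(EMPTY, b""))
    return f


def store(f, key, name):
    """Insert or update a record without touching any other record."""
    for probe in range(N_SLOTS):
        offset = address(key, probe)
        f.seek(offset)
        slot_key, _ = RECORD.unpack(f.read(RECORD.size))
        if slot_key in (EMPTY, key):
            f.seek(offset)
            f.write(RECORD.pack(key, name.encode()))
            return
    raise RuntimeError("file is full")


def fetch(f, key):
    """Transform the key again with F and read the record at that address."""
    for probe in range(N_SLOTS):
        f.seek(address(key, probe))
        slot_key, name = RECORD.unpack(f.read(RECORD.size))
        if slot_key == key:
            return name.rstrip(b"\0").decode()
        if slot_key == EMPTY:
            return None
    return None


customers = create_file()
store(customers, 1001, "Asha")  # 1001 % 11 == 0
store(customers, 1012, "Ravi")  # 1012 % 11 == 0 as well, so it probes on
print(fetch(customers, 1001), fetch(customers, 1012), fetch(customers, 7))
```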
• Random access file organization is used
  whenever arbitrary records need to be
  accessed. Examples:
1. Accessing the record of a customer
   when the customer deposits or
   withdraws money.
2. Railway reservation file
• A bank may have 10000 customers out of
  which 500 will have transactions on any
  given day.
• Activity ratio = 500/10000 = 0.05
Activity ratio   Volatility   Preferred file organization   Application
≈ 1              ≈ 0          Sequential                    Payroll
≈ 0.05           ≈ 0          Random                        Enquiry
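
The table can be read as a rough rule of thumb: a file with high activity and low volatility suits sequential organization, otherwise random organization is preferred. The sketch below encodes that reading; the 0.5 cut-off is an arbitrary assumption made here, not a figure from the slides.

```python
def preferred_organization(activity_ratio, volatility, threshold=0.5):
    """Rule of thumb only; the threshold value is an assumed cut-off."""
    if activity_ratio >= threshold and volatility < threshold:
        return "sequential"
    return "random"


bank_ratio = 500 / 10000  # the bank example: 500 of 10000 customers

payroll_choice = preferred_organization(1.0, 0.0)
bank_choice = preferred_organization(bank_ratio, 0.0)
print(payroll_choice, bank_choice)
```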
