
Appendix File Organization Storage Structure

					         Appendix C

File Organization & Storage Structure
                 Agenda
• Definition
• Types of Storage
• Types of File Organization
               Definition
• Logical record & physical record
• File organization
• Access method
      Types of File Organization
•   Heap (unordered)
•   Sequential (ordered or sorted)
•   Hash (direct or random)
•   Index
                         Heap
• Unordered structure
• Pros
   – Simple
   – No overhead
• Cons
   – Slow
   – Wastes space (deleted records leave unused gaps)
• For
   – Bulk-loaded
   – Short file
   – Retrieving 80% of the file
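
A minimal sketch of the heap trade-offs listed above, assuming an in-memory list stands in for the file. The class name HeapFile, the "id" field, and the use of None to mark a deleted slot are illustrative assumptions, not part of the original slides.

```python
# Hypothetical in-memory model of a heap (unordered) file.
class HeapFile:
    def __init__(self):
        self.slots = []  # records in arrival order; None marks a deleted slot

    def insert(self, record):
        # No ordering to maintain, so a new record is simply appended.
        self.slots.append(record)

    def find(self, key):
        # Linear scan: in the worst case every slot is examined.
        for record in self.slots:
            if record is not None and record["id"] == key:
                return record
        return None

    def delete(self, key):
        # The slot is only marked as deleted; its space is not reused
        # until the file is reorganized.
        for i, record in enumerate(self.slots):
            if record is not None and record["id"] == key:
                self.slots[i] = None
                return True
        return False

    def wasted_slots(self):
        return sum(1 for record in self.slots if record is None)
```

Inserting is cheap, while find and delete both scan the file, which is why the structure suits bulk loading, short files, or queries that read most of the file anyway.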
                  Ordered
• Sorted according to a field value or primary
  key field
• Pros
  – Binary search
  – Sequential processing
• Con
  – Slow for retrieving information needed by
    management (searches on fields other than the ordering field)
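
A short sketch of why ordering helps on the ordering field and not elsewhere, assuming two parallel in-memory lists stand in for the sorted file. The key values, names, and function names are made up for illustration.

```python
from bisect import bisect_left

# Hypothetical sorted file: records kept in ascending order of the key field.
keys = [101, 205, 310, 422, 518]
names = ["Ann", "Bob", "Cy", "Dee", "Eve"]

def find_by_key(key):
    # Binary search on the ordering field: O(log n) comparisons.
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return names[i]
    return None

def find_by_name(name):
    # The file is not sorted on this field, so the search
    # degrades to a linear scan of the whole file.
    for key, candidate in zip(keys, names):
        if candidate == name:
            return key
    return None
```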
                            Hash
• Terminology
   – Hash field, hash key
   – Collision, synonyms
   – Bucket, slots
• Types
   – Folding
   – Division-remainder
• Collision handling
   – Open addressing or unchained overflow
   – Chained overflow
   – Multiple hashing
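
The two hash function types and chained overflow can be sketched as follows. This is an illustration under assumptions: keys are non-negative integers, folding splits the decimal digits into two-digit parts, and each bucket's Python list stands in for the bucket together with its overflow chain. All names are hypothetical.

```python
def division_remainder(key, buckets):
    # Bucket number is the remainder of the key divided by the bucket count.
    return key % buckets

def folding(key, buckets, width=2):
    # Split the key's digits into parts, add the parts, then reduce the sum
    # to a valid bucket number.
    digits = str(key)
    parts = [int(digits[i:i + width]) for i in range(0, len(digits), width)]
    return sum(parts) % buckets

class ChainedHashFile:
    def __init__(self, buckets=7):
        self.buckets = [[] for _ in range(buckets)]

    def insert(self, key, record):
        # Synonyms (keys that collide) are appended to the same chain.
        slot = division_remainder(key, len(self.buckets))
        self.buckets[slot].append((key, record))

    def find(self, key):
        # Only the chain for the key's bucket is searched.
        slot = division_remainder(key, len(self.buckets))
        for stored_key, record in self.buckets[slot]:
            if stored_key == key:
                return record
        return None
```

With 7 buckets, keys 15 and 22 both map to bucket 1 under division-remainder, so they are synonyms and share one chain.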
         Direct (Random or Hash)

• Pro
  – Random processing
• Cons
  – Poor for sequential processing (records are not stored in key order)
  – Updating (reorganization)
                       Indexes
• Terminology
  – Primary index (one for each file)
  – Secondary index for unique field or non-unique field
    (several for each file)
  – Clustering index for clustering attribute (non-key field
    or non-unique field)
  – Sparse index for some of the search key values
  – Dense index for every search key value
• Types
  –   Linked list
  –   Inverted file
  –   Indexed sequential
  –   B+-tree
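
The difference between a dense and a sparse index can be sketched as below, assuming a sorted data file split into blocks of two records. The block contents and function names are illustrative only.

```python
from bisect import bisect_right

# Hypothetical data file: records sorted by key and grouped into blocks.
blocks = [
    [(10, "a"), (20, "b")],
    [(30, "c"), (40, "d")],
    [(50, "e"), (60, "f")],
]

# Dense index: one entry for every search key value -> (block number, slot).
dense = {
    key: (b, s)
    for b, block in enumerate(blocks)
    for s, (key, _) in enumerate(block)
}

# Sparse index: one entry per block, holding the first key in that block.
sparse_keys = [block[0][0] for block in blocks]

def dense_lookup(key):
    position = dense.get(key)
    if position is None:
        return None
    b, s = position
    return blocks[b][s][1]

def sparse_lookup(key):
    # Find the last block whose first key is <= the search key,
    # then scan inside that block.
    b = bisect_right(sparse_keys, key) - 1
    if b < 0:
        return None
    for stored_key, value in blocks[b]:
        if stored_key == key:
            return value
    return None
```

The sparse index here has 3 entries against 6 for the dense one, at the cost of a short scan inside the located block; it only works because the file is sorted on the indexed field.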
            Indexed Sequential
• Structure
   – Primary area
   – Index area: track no, highest key on the track, highest
     key in the overflow, address of first overflow record
   – Overflow area: address, record, pointer
• Types
   – Indexed Sequential Access Method (ISAM)
   – Virtual Storage Access Method (VSAM)
• Pro
   – Sequential & random processing
• Con
   – Wastes space (deletion)
   – Inefficient due to overflow
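
A rough sketch of a lookup through the three areas described above. The track numbers, overflow addresses, and key values are invented for illustration, and the convention that a track without overflow repeats its highest track key in the overflow field is an assumption.

```python
# Primary area: track number -> sorted keys stored on that track.
primary = {0: [10, 20, 30], 1: [40, 50, 60]}

# Overflow area: address -> (record key, pointer to next overflow record).
overflow = {100: (32, 101), 101: (35, None)}

# Index area entries:
# (track no, highest key on the track, highest key in the overflow,
#  address of first overflow record)
index = [(0, 30, 35, 100), (1, 60, 60, None)]

def lookup(key):
    """Return ("track", n) or ("overflow", address), or None if absent."""
    for track, high_track, high_overflow, first_overflow in index:
        if key <= high_track:
            # The key can only be in the primary area of this track.
            return ("track", track) if key in primary[track] else None
        if key <= high_overflow:
            # Follow the overflow chain one record at a time.
            address = first_overflow
            while address is not None:
                stored_key, next_address = overflow[address]
                if stored_key == key:
                    return ("overflow", address)
                address = next_address
            return None
    return None
```

Each record reached through the overflow chain costs an extra access, which is the inefficiency the slide refers to.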
                          B+-Tree
• Terminology
   – Node
   – Root
   – Parent
   – Child
   – Leaf
   – Depth
   – Balanced tree
   – Degree or order (n)
• Rules
   – Root has at least two children (unless it is a leaf)
   – Each non-root internal node has between ⌈n/2⌉ and n pointers (children)
   – Each leaf holds between ⌈(n-1)/2⌉ and (n-1) key values
   – A non-leaf node holds one fewer key value than pointers
   – Balanced tree
   – Ordered values in leaf
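
The occupancy rules and a search can be sketched as follows, assuming an order n = 4 tree built by hand as nested tuples. The tuple layout, the sample keys, and the convention that a key equal to a separator is found in the right-hand child are assumptions made for illustration.

```python
from bisect import bisect_right
from math import ceil

def occupancy_bounds(n):
    # Minimum and maximum fill for a tree of order n.
    return {
        "internal_pointers": (ceil(n / 2), n),
        "leaf_keys": (ceil((n - 1) / 2), n - 1),
    }

# Hypothetical tree of order 4.
# Internal node: ("node", keys, children); leaf: ("leaf", keys).
leaf1 = ("leaf", [5, 10])
leaf2 = ("leaf", [20, 30])
leaf3 = ("leaf", [40, 50, 60])
root = ("node", [20, 40], [leaf1, leaf2, leaf3])  # 2 keys, 3 pointers

def search(node, key):
    # Descend from the root; every leaf sits at the same depth
    # because the tree is balanced.
    while node[0] == "node":
        _, keys, children = node
        node = children[bisect_right(keys, key)]
    return key in node[1]
```

For n = 4 the bounds are 2 to 4 pointers per internal node and 2 to 3 keys per leaf, which the sample tree satisfies.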
        Points to Remember
• Definition
• Types of Storage
• Types of File Organization
               Assignment
• Review Appendix C
• Read chapters 1 and 2 (skip relational
  calculus)
• Homework due date:

				