Distributed Database Management Systems Chapter 10 Distributed Database


									Chapter 10
   Distributed Database
   Management Systems

   Database Systems:
   Design, Implementation, and
   Management, Seventh Edition, Rob
   and Coronel
In this chapter, you will learn:

 What a distributed database
 management system (DDBMS) is and
 what its components are
 How database implementation is
 affected by different levels of data and
 process distribution
 How transactions are managed in a
 distributed database environment
 How database design is affected by
 the distributed database environment
The Evolution of Distributed
Database Management Systems

 Distributed database management
 system (DDBMS)
   Governs storage and processing of
   logically related data over
   interconnected computer systems in
   which both data and processing
   functions are distributed among several
   sites



The Evolution of Distributed Database
Management Systems (continued)

 Centralized database required that
 corporate data be stored in a single
 central site
 Dynamic business environment and
 centralized database’s shortcomings
 spawned a demand for applications
 based on data access from different
 sources at multiple locations

DDBMS Advantages and
Disadvantages

 Advantages include:
   Data are located near “greatest
   demand” site
   Faster data access
   Faster data processing
   Growth facilitation
   Improved communications




DDBMS Advantages and
Disadvantages (continued)

 Advantages include (continued):
   Reduced operating costs
   User-friendly interface
   Less danger of a single-point failure
   Processor independence




DDBMS Advantages and
Disadvantages (continued)

 Disadvantages include:
   Complexity of management and control
   Security
   Lack of standards
   Increased storage requirements
   Increased training cost




Characteristics of Distributed
Database Management Systems
 Application interface
 Validation
 Transformation
 Query optimization
 Mapping
 I/O interface




Characteristics of Distributed
Database Management Systems (continued)

 Formatting
 Security
 Backup and recovery
 DB administration
 Concurrency control
 Transaction management


Characteristics of Distributed
Database Management Systems (continued)

 Must perform all the functions of
 centralized DBMS
 Must handle all necessary functions
 imposed by distribution of data and
 processing
   Must perform these additional functions
   transparently to the end user



DDBMS Components
 Must include (at least) the following
 components:
   Computer workstations
   Network hardware and software
   Communications media
   Transaction processor (application
   processor, transaction manager)
     Software component found in each
     computer that requests data



DDBMS Components (continued)
 Must include (at least) the following
 components (continued):
   Data processor or data manager
     Software component residing on each
     computer that stores and retrieves data
     located at the site
     May be a centralized DBMS




Levels of Data and Process
Distribution




Single-Site Processing,
Single-Site Data (SPSD)

 All processing is done on single CPU
 or host computer (mainframe,
 midrange, or PC)
 All data are stored on host
 computer’s local disk
 Processing cannot be done on end
 user’s side of system


Single-Site Processing,
Single-Site Data (SPSD) (continued)

 Typical of most mainframe and
 midrange computer DBMSs
 DBMS is located on host computer,
 which is accessed by dumb
 terminals connected to it
 Also typical of first generation of
 single-user microcomputer
 databases

Multiple-Site Processing,
Single-Site Data (MPSD)

 Multiple processes run on different
 computers sharing single data
 repository
 MPSD scenario requires network file
 server running conventional
 applications that are accessed
 through LAN
 Many multiuser accounting
 applications, running under
 personal computer network, fit such
 a description
Multiple-Site Processing,
Multiple-Site Data (MPMD)
 Fully distributed database management
 system with support for multiple data
 processors and transaction processors at
 multiple sites
 Classified as either homogeneous or
 heterogeneous
 Homogeneous DDBMSs
   Integrate only one type of centralized DBMS
   over a network



Multiple-Site Processing,
Multiple-Site Data (MPMD) (continued)

 Heterogeneous DDBMSs
   Integrate different types of centralized
   DBMSs over a network
 Fully heterogeneous DDBMS
   Support different DBMSs that may even
   support different data models
   (relational, hierarchical, or network)
   running under different computer
   systems, such as mainframes and
   microcomputers
Distributed Database
Transparency Features

 Allow end user to feel like
 database’s only user
 Features include:
   Distribution transparency
   Transaction transparency
   Failure transparency
   Performance transparency
   Heterogeneity transparency


Distribution Transparency

 Allows management of physically
 dispersed database as though it
 were a centralized database
 Following three levels of distribution
 transparency are recognized:
   Fragmentation transparency
   Location transparency
   Local mapping transparency
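
The three levels differ in how much the end user must know about fragments and their locations. The sketch below builds the query a user would have to write at each level. It assumes a hypothetical EMPLOYEE relation split into fragments E1, E2, and E3 stored at sites NY, ATL, and MIA. The relation, fragment, and site names, the date predicate, and the `NODE` keyword are illustrative pseudo-SQL only; none of them come from the slides or from any real SQL dialect.

```python
# Hypothetical layout: fragment name -> site that stores it (illustrative only).
FRAGMENTS = {"E1": "NY", "E2": "ATL", "E3": "MIA"}
PREDICATE = "EMP_DOB < '1940-01-01'"  # made-up selection condition


def build_query(level: str) -> str:
    """Return the pseudo-SQL a user must write at a given transparency level."""
    if level == "fragmentation":
        # Highest level: the user names only the global relation.
        return f"SELECT * FROM EMPLOYEE WHERE {PREDICATE};"
    if level == "location":
        # The user must know fragment names, but not where they are stored.
        parts = [f"SELECT * FROM {frag} WHERE {PREDICATE}" for frag in FRAGMENTS]
        return " UNION ".join(parts) + ";"
    if level == "local_mapping":
        # Lowest level: the user must name both the fragment and its site.
        parts = [
            f"SELECT * FROM {frag} NODE {site} WHERE {PREDICATE}"
            for frag, site in FRAGMENTS.items()
        ]
        return " UNION ".join(parts) + ";"
    raise ValueError(f"unknown transparency level: {level}")


for level in ("fragmentation", "location", "local_mapping"):
    print(f"{level}: {build_query(level)}")
```

The lower the level of transparency, the more of the physical layout leaks into the user's query.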


Transaction Transparency

 Ensures database transactions will
 maintain distributed database’s
 integrity and consistency




Distributed Requests and
Distributed Transactions
 Distributed transaction
   Can update or request data from
   several different remote sites on
   network
 Remote request
   Lets single SQL statement access data
   to be processed by single remote
   database processor
 Remote transaction
   Composed of several requests;
   accesses data at single remote site

Distributed Requests and Distributed
Transactions (continued)

 Distributed transaction
    Allows transaction to reference several
    different (local or remote) DP sites
 Distributed request
    Lets single SQL statement reference
    data located at several different local or
    remote DP sites
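
The four categories can be told apart by counting SQL statements and the DP sites each statement references. The following is a minimal sketch under that reading; the function name `classify` and the site labels are hypothetical, and the sketch assumes every named site is remote from the TP issuing the work.

```python
from typing import List, Set


def classify(requests: List[Set[str]]) -> str:
    """Classify a unit of work by the DP sites each SQL statement references.

    `requests` holds one set of site names per SQL statement.
    """
    if not requests or any(not sites for sites in requests):
        raise ValueError("every request must reference at least one site")
    if any(len(sites) > 1 for sites in requests):
        # At least one single statement spans several sites.
        return "distributed request"
    all_sites = set().union(*requests)
    if len(all_sites) > 1:
        # Each statement touches one site, but the transaction spans several.
        return "distributed transaction"
    return "remote request" if len(requests) == 1 else "remote transaction"


print(classify([{"B"}]))          # one statement, one site
print(classify([{"B"}, {"B"}]))   # several statements, same site
print(classify([{"B"}, {"C"}]))   # several statements, different sites
print(classify([{"B", "C"}]))     # one statement spanning two sites
```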




Distributed Concurrency Control

 Multisite, multiple-process
 operations are much more likely to
 create data inconsistencies and
 deadlocked transactions than are
 single-site systems
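
One reason deadlocks are harder to handle across sites is that no single site may see the whole cycle of waiting transactions. The sketch below illustrates this with wait-for graphs, a common detection technique that the slides do not describe; it is brought in here only as an illustration. The transaction names, site contents, and helper functions are all hypothetical.

```python
from typing import Dict, List, Set

WaitFor = Dict[str, Set[str]]  # transaction -> transactions it waits for


def has_cycle(graph: WaitFor) -> bool:
    """Depth-first search for a cycle in a wait-for graph."""
    visiting: Set[str] = set()
    done: Set[str] = set()

    def visit(node: str) -> bool:
        if node in done:
            return False
        if node in visiting:
            return True  # back edge: a cycle of waiting transactions
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))


def merge(local_graphs: List[WaitFor]) -> WaitFor:
    """Combine per-site wait-for graphs into one global graph."""
    merged: WaitFor = {}
    for graph in local_graphs:
        for txn, waits_on in graph.items():
            merged.setdefault(txn, set()).update(waits_on)
    return merged


site_a = {"T1": {"T2"}}  # at site A, T1 waits for a lock held by T2
site_b = {"T2": {"T1"}}  # at site B, T2 waits for a lock held by T1

print(has_cycle(site_a), has_cycle(site_b))  # each site checked alone
print(has_cycle(merge([site_a, site_b])))    # the combined global graph
```

Each local graph is acyclic; only the merged graph reveals that T1 and T2 are waiting on each other.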




Two-Phase Commit Protocol
 Distributed databases make it
 possible for transaction to access
 data at several sites
 Final COMMIT must not be issued
 until all sites are prepared to commit
 their parts of transaction
 Two-phase commit protocol
 requires each individual DP’s
 transaction log entry be written
 before database fragment is
 actually updated
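
The protocol can be sketched as a coordinator that collects a vote from every DP before issuing the final COMMIT, with each DP writing its log entry before touching its fragment. This is a simplified, in-memory illustration only: the `DataProcessor` class, its methods, and the log format are hypothetical, and real implementations must also handle timeouts, crashes, and recovery.

```python
from typing import List, Optional


class DataProcessor:
    """Hypothetical DP participant; names and methods are illustrative."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.log: List[str] = []            # transaction log (written first)
        self.data: List[str] = []           # stands in for the database fragment
        self.pending: Optional[str] = None  # change waiting for the final verdict

    def prepare(self, txn: str, change: str) -> bool:
        # Phase 1: write the log entry before the fragment is touched.
        self.log.append(f"{txn} PREPARE {change}")
        self.pending = change
        return self.can_commit

    def commit(self, txn: str) -> None:
        # Phase 2: only now is the fragment actually updated.
        self.log.append(f"{txn} COMMIT")
        if self.pending is not None:
            self.data.append(self.pending)
        self.pending = None

    def abort(self, txn: str) -> None:
        self.log.append(f"{txn} ABORT")
        self.pending = None


def two_phase_commit(txn: str, change: str, sites: List[DataProcessor]) -> bool:
    """Coordinator: commit everywhere only if every site votes yes."""
    votes = [site.prepare(txn, change) for site in sites]
    if all(votes):
        for site in sites:
            site.commit(txn)
        return True
    for site in sites:
        site.abort(txn)
    return False


ok_sites = [DataProcessor("A"), DataProcessor("B")]
print(two_phase_commit("T1", "x=1", ok_sites))

mixed_sites = [DataProcessor("C"), DataProcessor("D", can_commit=False)]
print(two_phase_commit("T2", "y=2", mixed_sites))
```

If any single site votes no, every site aborts and no fragment is changed.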
Performance Transparency
and Query Optimization

 Objective of query optimization routine is
 to minimize total cost associated with
 execution of request
 Costs associated with request are
 function of:
   Access time (I/O) cost
   Communication cost
   CPU time cost
 Must provide distribution transparency as
 well as replica transparency
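
As a rough illustration of the cost function, the sketch below totals the three cost components for each candidate plan and picks the cheapest. It assumes a simple unweighted sum; the `Plan` class, the plan names, and all the numbers are made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Plan:
    """Hypothetical candidate execution plan; cost units are arbitrary."""
    name: str
    io_cost: float             # access time (I/O) cost
    communication_cost: float  # cost of moving data between sites
    cpu_cost: float            # CPU time cost

    @property
    def total_cost(self) -> float:
        # Assumption: the three components are simply added together.
        return self.io_cost + self.communication_cost + self.cpu_cost


def cheapest(plans: List[Plan]) -> Plan:
    """Return the plan with the lowest total cost."""
    return min(plans, key=lambda plan: plan.total_cost)


candidates = [
    Plan("ship whole table", io_cost=10.0, communication_cost=50.0, cpu_cost=5.0),
    Plan("filter at remote site", io_cost=12.0, communication_cost=8.0, cpu_cost=6.0),
]
best = cheapest(candidates)
print(best.name, best.total_cost)
```

In this made-up case the plan with slightly higher I/O and CPU cost still wins because it moves far less data over the network.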
Performance Transparency
and Query Optimization (continued)

 Replica transparency
   DDBMS’s ability to hide existence of
   multiple copies of data from user
 Query optimization techniques
 include:
   Manual or automatic
   Static or dynamic
   Statistically based or rule-based
   algorithms

Distributed Database Design

 Data fragmentation
   How to partition database into
   fragments
 Data replication
   Which fragments to replicate
 Data allocation
   Where to locate those fragments and
   replicas


Data Fragmentation

 Breaks single object into two or
 more segments or fragments
 Each fragment can be stored at any
 site over computer network
 Information about data
 fragmentation is stored in
 distributed data catalog (DDC),
 from which it is accessed by TP to
 process user requests
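
A minimal sketch of how a TP might consult the DDC follows. The catalog layout, the CUSTOMER relation, the fragment and site names, and the `locate` function are all assumptions made for illustration; the slides do not specify how the catalog is structured.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical distributed data catalog:
# relation -> list of (fragment name, site, region covered by the fragment).
DDC: Dict[str, List[Tuple[str, str, str]]] = {
    "CUSTOMER": [
        ("CUST_EAST", "site_ny", "EAST"),
        ("CUST_WEST", "site_la", "WEST"),
    ],
}


def locate(relation: str, region: Optional[str] = None) -> List[Tuple[str, str]]:
    """Return the (fragment, site) pairs the TP must contact for a request.

    With no region filter, every fragment of the relation is needed.
    """
    return [
        (fragment, site)
        for fragment, site, frag_region in DDC[relation]
        if region is None or frag_region == region
    ]


print(locate("CUSTOMER"))          # request touching all customers
print(locate("CUSTOMER", "WEST"))  # request restricted to one region
```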
Data Fragmentation (continued)
 Strategies
   Horizontal fragmentation
     Division of a relation into subsets
     (fragments) of tuples (rows)
   Vertical fragmentation
     Division of a relation into attribute
     (column) subsets
   Mixed fragmentation
     Combination of horizontal and vertical
     strategies
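
The three strategies can be sketched on a small in-memory relation. The CUSTOMER rows, column names, and helper functions below are made up for illustration. The key column is repeated in each vertical fragment, which is an assumption that allows the original rows to be rebuilt by joining on it.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

# Made-up CUSTOMER relation, used only for illustration.
CUSTOMER: List[Row] = [
    {"cus_num": 1, "cus_name": "Ana", "cus_state": "TN", "cus_balance": 120.0},
    {"cus_num": 2, "cus_name": "Ben", "cus_state": "FL", "cus_balance": 0.0},
    {"cus_num": 3, "cus_name": "Cyd", "cus_state": "TN", "cus_balance": 75.5},
]


def horizontal(rows: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Horizontal fragment: a subset of rows, all columns kept."""
    return [dict(row) for row in rows if predicate(row)]


def vertical(rows: List[Row], key: str, columns: List[str]) -> List[Row]:
    """Vertical fragment: a subset of columns plus the key column."""
    return [{col: row[col] for col in [key, *columns]} for row in rows]


# Horizontal: one fragment per state.
tn_rows = horizontal(CUSTOMER, lambda r: r["cus_state"] == "TN")
fl_rows = horizontal(CUSTOMER, lambda r: r["cus_state"] == "FL")

# Vertical: contact data in one fragment, billing data in another.
contact = vertical(CUSTOMER, "cus_num", ["cus_name", "cus_state"])
billing = vertical(CUSTOMER, "cus_num", ["cus_balance"])

# Mixed: split by state first, then split that state's rows by column.
tn_billing = vertical(tn_rows, "cus_num", ["cus_balance"])

print(len(tn_rows), len(fl_rows))
print(tn_billing)
```

Horizontal fragments are recombined with a union of rows; vertical fragments are recombined with a join on the shared key.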


Data Replication

 Storage of data copies at multiple
 sites served by computer network
 Fragment copies can be stored at
 several sites to serve specific
 information requirements
   Can enhance data availability and
   response time
   Can help to reduce communication and
   total query costs

Data Replication (continued)
 Replication scenarios
   Fully replicated database
     Stores multiple copies of each database
     fragment at multiple sites
     Can be impractical due to amount of
     overhead
   Partially replicated database
     Stores multiple copies of some database
     fragments at multiple sites
     Most DDBMSs are able to handle the
     partially replicated database well

Data Replication (continued)

 Replication scenarios (continued)
   Unreplicated database
     Stores each database fragment at single
     site
     No duplicate database fragments
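
The three scenarios differ only in how many sites hold a copy of each fragment. A minimal sketch of that distinction follows, assuming a hypothetical allocation map from fragment name to the set of sites storing it; the function and the fragment and site names are illustrative.

```python
from typing import Dict, Set


def replication_scenario(allocation: Dict[str, Set[str]]) -> str:
    """Classify a database by how many sites hold a copy of each fragment.

    `allocation` maps fragment name -> set of sites storing a copy.
    """
    if not allocation or any(not sites for sites in allocation.values()):
        raise ValueError("every fragment must be stored at one or more sites")
    copies = [len(sites) for sites in allocation.values()]
    if all(n == 1 for n in copies):
        return "unreplicated"          # no duplicate fragments
    if all(n > 1 for n in copies):
        return "fully replicated"      # every fragment has multiple copies
    return "partially replicated"      # only some fragments are copied


print(replication_scenario({"F1": {"A"}, "F2": {"B"}}))
print(replication_scenario({"F1": {"A", "B"}, "F2": {"B"}}))
print(replication_scenario({"F1": {"A", "B"}, "F2": {"A", "B"}}))
```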




Data Allocation

 Deciding where to locate data
 Allocation strategies
   Centralized data allocation
     Entire database is stored at one site
   Partitioned data allocation
     Database is divided into several disjoint
     parts (fragments) and stored at several
     sites



Data Allocation (continued)

 Allocation strategies (continued)
   Replicated data allocation
     Copies of one or more database
     fragments are stored at several sites
 Data distribution over computer
 network is achieved through data
 partition, data replication, or
 combination of both


Client/Server vs. DDBMS

 Client/server architecture refers to
 way in which computers interact to
 form system
 Features user of resources, or
 client, and provider of resources, or
 server
 Can be used to implement a DBMS
 in which client is the TP and server
 is the DP

Client/Server vs. DDBMS
(continued)

 Client/server advantages
   Less expensive than alternate
   minicomputer or mainframe solutions
   Allow end user to use microcomputer’s
   GUI, thereby improving functionality
   and simplicity
   More people in job market have PC
   skills than mainframe skills
   PC is well established in workplace

Client/Server vs. DDBMS
(continued)

 Client/server advantages
 (continued)
   Numerous data analysis and query
   tools exist to facilitate interaction with
   DBMSs available in PC market
   Considerable cost advantage to
   offloading applications development
   from mainframe to powerful PCs



Client/Server vs. DDBMS
(continued)

 Client/server disadvantages
   Creates more complex environment
     Different platforms (LANs, operating
     systems, and so on) are often difficult to
     manage
   An increase in number of users and
   processing sites often paves the way
   for security problems



Client/Server vs. DDBMS
(continued)
 Client/server disadvantages
 (continued)
   Possible to spread data access to much
   wider circle of users
     Increases demand for people with broad
     knowledge of computers and software
     Increases burden of training and cost of
     maintaining the environment




C. J. Date’s Twelve Commandments
for Distributed Databases

 Local site independence
 Central site independence
 Failure independence
 Location transparency
 Fragmentation transparency
 Replication transparency



C. J. Date’s Twelve Commandments
for Distributed Databases (continued)

 Distributed query processing
 Distributed transaction processing
 Hardware independence
 Operating system independence
 Network independence
 Database independence



Summary
 Distributed database stores logically
 related data in two or more physically
 independent sites connected via computer
 network
 Distributed processing is division of logical
 database processing among two or more
 network nodes
 Distributed databases require distributed
 processing
 Main components of DDBMS are
 transaction processor and data processor
Summary (continued)
 Current database systems can be
 classified by extent to which they support
 processing and data distribution
 Homogeneous distributed database
 system integrates only one particular type
 of DBMS over computer network
 Heterogeneous distributed database
 system integrates several different types
 of DBMSs over computer network


Summary (continued)
 DDBMS characteristics are best described
 as set of transparencies
 Transaction is formed by one or more
 database requests
 Distributed concurrency control is
 required in network of distributed
 databases
 Distributed DBMS evaluates every data
 request to find optimum access path in
 distributed database


Summary (continued)

 The design of distributed database
 must consider fragmentation and
 replication of data
 Database can be replicated over
 several different sites on computer
 network
 Client/server architecture refers to
 way in which two computers
 interact over computer network to
 form a system

								