CHAPTER 15: QUESTIONS AND ANSWERS 1. What does it mean to say that a transaction is a user-defined concept? Why is it important that transactions are user defined? Ans: For a DBMS, a transaction is a user-defined concept means that the transaction designer determines the database operations in a transaction, not the DBMS. For example, making an airline reservation may involve reservations for the departure and return. To the user, the combination of the departure and the return is a transaction, not the departure and the return separately. It is important that transactions are user- defined because a transaction can involve any number of reads and writes to a database. To provide the flexibility of user-defined transactions, a DBMS cannot restrict transactions to only a specified number of reads and writes to a database. 2. List transactions with which you have interacted in the last week. Ans: Students' responses will vary. However examples might be: Pay credit card bills online, withdraw money from ATM, make an airline reservation, buy a book online, and reserve a rental car. 3. Explain the purpose of the SQL statements START TRANSACTION, COMMIT, and ROLLBACK. How do these statements vary across DBMSs? Ans: The START TRANSACTION and COMMIT statements define the statements in a transaction. Any other SQL statements between them are part of the transaction. The ROLLBACK statement causes all effects of a transaction to be removed. Some DBMSs use the keyword BEGIN instead of START. Some DBMSs such as Oracle do not use a statement to explicitly start a transaction. A new transaction begins with the next SQL statement following a COMMIT statement. 4. Briefly explain the meaning of the ACID properties. How do concurrency control and recovery management support the ACID properties? Ans: The ACID properties are atomic, consistent, isolated, and durable. Atomic means that a transaction cannot be subdivided. The transaction is all or nothing. Consistent means that a transaction satisfies integrity constraints after completion. Isolated means that transactions do not interfere with each other except in allowable ways. A transaction should never overwrite changes made by another transaction. In addition, a transaction may be restricted from interfering in other ways such as not viewing the uncommitted changes made by other transactions. Durable means that any changes resulting from a transaction are permanent. No failure will erase any changes after a transaction terminates. Concurrency control and recovery management ensure that transactions meet the ACID properties. Recovery management involves actions to deal with failures such as communication errors and software crashes. Concurrency control involves actions to control interference among multiple, simultaneous users of the database. 5. Briefly explain the meaning of transparency as it relates to computer processing. Why is transparency important for concurrency control and recovery management? Ans: For a DBMS, transparency means that the inner details of transaction services are invisible. Transparency is important for concurrency control and recovery management because the services that ensure ACID transactions are difficult to implement. 6. What costs are associated with concurrency control and recovery management? In what role, database administrator or database programmer, would you assess these costs? Ans: Concurrency and recovery, though invisible, involve overhead that may adversely impact performance or require additional resources to reach an acceptable level of performance. More computing resources such as memory, disk space, and parallel processing may be useful to improve performance. A database administrator or database programmer in the role of database designer would assess these costs. 7. What is the objective of concurrency control? How is the measure used in the objective related to waiting time? Ans: The objective of concurrency control is to maximize transaction throughput while preventing interference among multiple users. Transaction throughput is measured as the number of transactions processed per time unit, typically transactions per minute. From a user’s perspective, transaction throughput is related to response time. Higher transaction throughput means lower response times. 8. What is a hot spot? How are hot spots related to interference problems? Ans: A hot spot is common data that multiple users try to change. Without adequate concurrency control, users may interfere with each other on hot spots. 9. Discuss the consequences of each kind of interference problem. Which problem seems to be the most serious? Ans: There are three consequences on interference. Lost Update: a problem in which one user’s update overwrites another user’s update. Uncommitted Dependency: a problem in which one transaction reads data written by another transaction before the other transaction commits. If the second transaction aborts, the first transaction is using data that no longer exists. An uncommitted dependency cannot cause a problem unless a rollback occurs. Inconsistent Retrieval: This class of problems involves inconsistency among multiple retrievals of a subset of data. An incorrect summary occurs when a transaction calculating a summary function, reads some values before another transaction changes the values but reads other values after another transaction changes the values. The phantom read problem occurs when a transaction executes a query with record conditions. Then, another transaction inserts or modifies data that the query would retrieve. Finally, the original transaction executes the same query again. The second query execution retrieves different records than the first execution. The non repeatable read problem occurs when a transaction reads the same value more than one time. In between reading the data item, another transaction modifies the data item. Lost update is the most serious interference problem because changes to a database are inadvertently lost. 10. What is a lock? Briefly explain the differences between shared (S) and exclusive (X) locks? Ans: Locks provide a way to prevent other users from accessing part of the database being used. Before accessing part of the database, a lock must be obtained. Other users must wait if trying to obtain a conflicting lock on the same part of the database. A shared (S) lock must be obtained before reading part of the database, while an exclusive (X) lock must be obtained before writing. Any number of users can hold a shared lock on the same part of the database. However, only one user can hold an exclusive lock. 11. What operations are performed by the lock manager? Ans: The concurrency control manager can perform two operations on lock records. The lock action adds a record to the lock table. Likewise, the unlock or release action deletes a record from the lock table. 12. What is a deadlock and how do most DBMSs handle deadlocks? Ans: A deadlock is a problem of mutual waiting. One transaction has a resource that another transaction needs, and a second transaction holds a resource that the first transaction needs. To control deadlocks, most DBMSs use a time-out policy. The concurrency control manager aborts (with a ROLLBACK statement) any transaction waiting for more than a specified time. 13. What is locking granularity? What are the trade-offs of holding locks at a finer level versus a coarser level of granularity? Ans: Locking granularity is the size of the database item locked. Holding locks at a finer level decreases waiting time, but increases overhead. Holding locks at a coarser level reduces overhead (number of locks held), but increases waiting time. 14. What is an intent lock? Why are intent locks used on items of coarse granularity? Ans: An intent lock is a lock on a large database item that indicates that the user intends to lock smaller items contained in the larger item. Intent locks are used on items of coarse granularity to alleviate blocking (waiting time) that would occur if shared or exclusive locks were held. In addition, intent locks allow efficient detection of conflicts among items of varying granularity. 15. Why is the third condition of 2PL typically simplified so that locks are released at the end of a transaction? Ans: The third condition of 2PL is typically simplified so that at least exclusive locks are released at the end of a transaction. Simplifying the definition of 2PL makes the protocol easier to enforce and prevents the uncommitted dependency problem. Holding locks until end of transaction obviates the difficult problem of predicting when a transaction may release locks. Holding write locks until end of transaction also prevents the uncommitted dependency problem because changes are not exposed to other transactions until permanent (after commit). However, releasing shard locks before end of transaction may lead to occurrences of the incorrect summary, the non repeatable read, and the phantom read problems. 16. What is the appeal of optimistic concurrency control approaches? Why might optimistic concurrency control approaches not be used even if they provide better expected performance? Ans: The appeal of optimistic concurrency control approaches is that transactions are permitted to access the database without acquiring locks. The performance of optimistic approaches depends on the frequency of conflicts. If conflicts increase, the performance of optimistic approaches decreases. Even if conflicts are rare, optimistic approaches can have more variability because the penalty for conflicts is larger in optimistic approaches. Pessimistic approaches resolve conflicts by waiting. Optimistic approaches resolve conflicts by rolling back and restarting. Restarting a transaction may delay a transaction more than waiting for a resource to be released. 17. Explain the difference between volatile and nonvolatile storage. Ans: Volatile storage loses its state when the power is disconnected. Nonvolatile storage does not lose its state when the power is disconnected. 18. Explain the effects of local, system, and device failures on active and past transactions. Ans: The effect of local failures can be distinguished in two kinds, program-detected failure and abnormal termination. Both kinds of failure affect one active transaction. The other two types of failure have more serious consequences but are not frequent. A system failure is an abnormal termination of the operating system. A system failure affects all active transactions. A device failure affects all active and past transactions stored on the device. 19. Why is force writing the most fundamental tool of recovery management? Ans: Force writing is the most fundamental tool of recovery management because it allows the recovery manager to reason when data are written to nonvolatile storage. Without the ability to control the timing of write operations to non-volatile storage, recovery is not possible. Force writing means that the DBMS, not the operating system, controls when data are written to nonvolatile storage. 20. What kind of redundant data is stored in a log? Why is management of the log critical to recovery? Ans: A typical log record contains a unique log sequence number (LSN), a transaction identifier, a database action, a time, a row identifier, a column name, and values (old and new). Management of the log is critical to recovery because a log can add considerable storage overhead. Because of this large size, many organizations have both an on-line log stored on disk and an archive log stored on magnetic tape. In addition to the size, the integrity of the log is crucial. Some DBMSs can maintain redundant logs to provide non- stop processing in case of a log failure. 21. What is the checkpoint interval? What is the trade-off in determining the checkpoint interval? Ans: The checkpoint interval is defined as the period between checkpoints. The interval can be expressed as a time (such as five minutes) or as a size parameter such as the number of committed transactions, the number of log pages, or the number of database pages. The checkpoint interval is a design parameter. A small interval reduces restart work but causes more overhead to record checkpoints. A large interval reduces checkpoint overhead but increases restart work. 22. What processing occurs at checkpoint time? Ans: Recording a checkpoint may involve considerable disruption in transaction processing as all transaction activity ceases while a checkpoint occurs. No new transactions can begin and existing transactions cannot initiate new operations during a checkpoint. The length of the disruption depends on the type of checkpoint used. In a cache-consistent checkpoint, buffer pages (log pages and some dirty database pages) remaining in memory are written to disk and then the checkpoint record is written to the log. A page is dirty if it has been changed by a transaction. 23. What restart work is necessary for a media failure? Ans: The restart work for a media failure is the restoring from the most recent backup and applying the redo operator to all committed transactions. 24. What restart work is necessary for local and system failures under the immediate update approach? Ans: In restart work for local failure under the immediate update approach, all log records of the transaction are found. The undo operation is then applied to each log record of the transaction. In restart work for system failure under the immediate update approach, both undo and redo operations may be involved because all active users are affected. T2 transactions (started before the checkpoint and finished after the checkpoint) must be redone from the checkpoint because only database changes prior to the checkpoint are stable. T3 transactions (started after the checkpoint and finished after the checkpoint) must be redone entirely because database changes are not guaranteed to be stable even though some changes may be recorded on disk. T4 and T5 transactions (not committed at the failure time) must be undone entirely because some database changes after the checkpoint may be recorded on disk 25. What restart work is necessary for local and system failures under the deferred update approach? Ans: Local failures under the deferred update approach are handled without any restart work because no database changes occur until after a transaction commits; the transaction is aborted without any undo work. System failures require redo operations only. T4 and T5 transactions (not yet committed) do not require undo operations because no database changes are written to disk until after a transaction commits. T2 and T3 transactions (committed after the checkpoint) require redo operations because it is not known whether all database changes are stable. T2 transactions (started before the checkpoint) must be redone from the first log record rather than just from the checkpoint as in the immediate update approach. 26. What is a transaction boundary? Why can an inappropriate choice for transaction boundary lead to poor performance? Ans: A transaction boundary is an important decision of transaction design in which an application in which a collection of SOL statements is divided into one or more transactions. An inappropriate choice for transaction boundary may cause the transaction duration to be longer and keep other transactions waiting, which leads to poor performance. 27. What criteria should be used in selecting a transaction boundary? Ans: In selecting a transaction boundary, the criteria that should be used are not only the number of reads and writes to a database but also the time spent waiting for user responses. The choice of a transaction boundary should balance the duration of a transaction with the need to enforce integrity constraints during the execution of a transaction. 28. Why must constraints such as the debit–credit constraint be enforced as part of a transaction rather than between transactions? Ans: Because constraint checking must occur by the end of a transaction. If the SQL statement to post a debit and a credit are placed in the same transaction, the DBMS can enforce the accounting constraint at the end of a transaction. If they are placed in separate/between transactions, constraint checking cannot occur until after both transactions are committed. 29. Explain the difference between system-independent and system-dependent hot spots. Why is it useful to identify hot spots? Ans: System-independent hot spots are parts of a table that many users simultaneously may want to change. System-dependent hot spots depend on the DBMS. It is useful to identify hot spots so that a designer can understand how transaction boundaries can affect performance. 30. Explain the three choices for transaction boundary of a hierarchical form. Ans: When designing a hierarchical form, the database programmer has three choices for the transaction boundary: (1) the entire form, (2) the main form as one transaction and all subform lines as a second transaction, and (3) the main form as one transaction and each subform line as a separate transaction. The third choice is usually preferred because it provides transactions with the shortest duration. However, constraint checking may force the choice to (1) or (2). 31. How can deadlock possibility be influenced by the choice of a transaction boundary? Ans: If the transaction boundary choice includes multiple resources (enrollment seats, inventory items, flight seats), deadlocks can occur when transactions have mutual waiting among the hot spots. If the transaction boundary choice only includes a single resource, deadlocks will not occur. However, there may be other reasons that influence the transaction boundary choice besides deadlock possibility. 32. What is the purpose of the SQL isolation levels? Ans: The isolation level specifies the degree to which a transaction is separated from the actions of other transactions. A transaction designer can balance concurrency control overhead with potential interference problems by specifying the appropriate isolation level. 33. How do the isolation levels achieve more concurrent access? Ans: The isolation levels vary according to the duration and type of locks held. The isolation levels are SERIALIZABLE (all locks held until EOT), REPEATABLE READ (short-term predicate locks), READ COMMITTED (no predicate locks, short-term shared locks), and READ UNCOMMITTED (no locks held). 34. What isolation level can be dangerous and why? Ans: Although SQL:1999 provides SERIALIZABLE as the default level, some DBMS vendors such as Oracle and Microsoft SQL Server use READ COMMITTED as the default level. READ COMMITTED can be a dangerous default level as it permits a variation of the lost update problem known as the scholar’s lost update. 35. Provide an example of a constraint for which deferred enforcement may be appropriate. Ans: Students' responses may vary. However, an example might be an enrollment transaction with constraints about co-requisites. The co-requisite constraints should be checked after all courses are selected, not after selecting each course in which to enroll. 36. What SQL statements and clauses involve constraint timing specification? Ans: SQL statements and clauses involve constraint timing specification are the SET CONSTRAINTS statement and the DEFERRABLE INITIALLY DEFERRED and DEFERRABLE INITIALLY IMMEDIATE clauses. 37. What is the role of the DBA in specifying of constraint timing? Ans: The DBA uses the constraint timing clause for constraints that may need deferred checking. 38. What is the role of the transaction designer in specifying of constraint timing? Ans: The transaction designer may specify whether deferrable constraints are deferred or immediately enforced using the SET CONSTRAINTS statement. 39. What is the purpose of a save point? Ans: SQL:1999 provides the SAVEPOINT statement to allow partial rollback of a transaction. Save points are useful in complex transactions that have tentative operations that a user can cancel. 40. What is a workflow and how is it related to database transactions? Ans: A workflow is a collection of related tasks structured to accomplish a business process. Workflow defines the order of transactions, the conditions for transactions, and the results of performing transactions. 41. What are the differences between human-oriented and computer-oriented workflows? Ans: In human-oriented workflows, humans provide most of the judgment to accomplish work. In computer-oriented workflow, the computer determines the processing of work. 42. Provide an example of a workflow with high task complexity and another example with high task structure. Ans: Students' responses may vary. However, an example might be: For high task complexity, the example may be the decision to grant a loan that involves complex reasoning using many variables. For high task structure, the example may be processing an insurance claim that may have conditions about denying the claim, litigating the claim, and investigating the claim. 43. Discuss the enabling technologies for workflow management. What role does transaction management play in workflow management? Ans: For workflow management, three enabling technologies are important: Distributed Object Management: Workflows can involve many types of data in remote locations. Workflow Specification and Implementation: Workflow involves identifying the tasks and specifying the relationships among tasks. Then the implementation may involve diverse hardware, software, and people. Customized Transaction Management: To more fully support workflow management, some industry leaders have proposed that transaction management should be customized according to workflow requirements. 44. What are limitations of transaction management to support workflows? Ans: Limitations of transaction management to support workflows are: Some transactions may involve subtasks, changing the notation of atomicity. Some workflows involve tasks with a long duration because of user interaction. Transaction management may not work well. Some transactions may be performed by legacy systems that do not support ACID properties. Some workflows may require transactions to be undone after they are complete. PROBLEM SOLUTIONS The problems provide practice using transaction-defining SQL statements, testing your knowledge of concurrency control and recovery management, and analyzing design decisions about transaction boundaries and hot spots. 1. Identify two transactions that you have encountered recently. Define pseudo code for the transactions in the style of Figures 15.1, 15.2, and 15.3. Ans: Here is pseudo code for some transactions in which you may be familiar. START TRANSACTION display greeting get course offering request select offering record if seats remain in the course offering then update seatsremain of offering record insert enrollment record send message to billing department else inform student that the offering is full end if on error: ROLLBACK COMMIT Pseudo Code for a Course Offering Registration START TRANSACTION display greeting get event request and credit card data select event record if no credit verification then inform customer of credit problem else if seats remain then update seatsremain of event record insert billing record send message to the credit card company send message to mail tickets else inform customer that event is full end if on error: ROLLBACK COMMIT Pseudo Code for a Concert Ticket Purchase 2. Identify hot spots in your transactions from problem 1. Ans: The seats remaining field of the course offering table is a hot spot in the course registration transaction. The seats remaining field in the event table is a hot spot in the ticket purchase transaction. 3. Using a timeline, depict a lost update problem using your transactions from problem 1 if no concurrency control is used. Ans: The following timeline shows a lost update problem for the seats remaining field. Transaction A Time Transaction B Read SR (10) t1 t2 Read SR (10) if SR > 0 then SR = SR -1 t3 t4 if SR > 0 then SR = SR -1 Write SR (9) t5 t6 Write SR (9) 4. Using a timeline, depict an uncommitted dependency problem using your transactions from problem 1 if no concurrency control is used. Ans: The following timeline shows an uncommitted dependency problem for the seats remaining field. Transaction A Time Transaction B Read SR (10) t1 SR = SR - 1 t2 Write SR (9) t3 t4 Read SR (9) Rollback t5 5. Using a timeline, depict a nonrepeatable read problem using your transactions from problem 1 if no concurrency control is used. Ans: The following timeline shows a non repeatable read problem for the seats remaining field. Transaction A Time Transaction B Read SR (10) t1 t2 Read SR t3 SR = SR - 1 t4 Write SR (9) Read SR (9) t5 6. Explain whether deadlock would be a problem using your transactions from problem 1 if locking is used. If a deadlock is possible, use a timeline to demonstrate deadlock with your transactions. Ans: Deadlock is not a problem for either transaction if both transactions obtain locks in the same order: first on the row to update and then on the row to insert. If the transactions do not obtain locks in the same order, a deadlock might occur if the concurrency control manager locks the entire page instead of a record. 7. Use the Accounting Database tables and the Accounting Register (shown below) to answer problems 7.1 to 7.7. Comments are listed after the tables and the form. Account(AcctNo, Name, Address, Balance, LastCheck, StartDate) Entry(EntryNo, AcctNo, Date, Amount, Desc) Category(CatNo, Name, Description) EntryLine(EntryNo, CatNo, Amount, Description) Accounting Register for Wells Fargo Credit Line Entry No. E101 Date: 3/11/2001 Description: Purchases at OfficeMax Amount: $442.00 Invoice No. I101 Category Description Amount Office supplies Envelopes 25.00 Equipment Fax machine 167.00 Computer software MS Office upgrade 250.00 • The primary keys in the tables are underlined. The foreign keys are italicized. • The Accounting Register records activities on an account, such as a line of credit or accounts receivable. The Accounting Register is designed for use by the accounting department of moderate-size businesses. The sample form shows one recorded entry, but a register contains all recorded entries since the opening of the account. • The main form is used to insert a record into the Entry table and update the Balance field of the Account table. Accounts have a unique name (Wells Fargo Line of Credit) that appears in the title of the register. Accounts have other attributes not shown on the form: a unique number (name is also unique), start date, address, type (Receivable, Investment, Credit, etc.), and current balance. • In the subform, the user allocates the total amount of the entry to categories. The Category field is a combo box. When the user clicks on the category field, the category number and name are displayed. Entering a new subform line inserts a row into the EntryLine table. • The Description field in the subform describes a row in the EntryLine table rather than the Category table. 7.1 What are the possible transaction boundaries for the Accounting Register form? 7.2 Select a transaction boundary from your choices in problem 7.1. Justify your choice using the criteria defined in Section 15.4.1. 7.3 Identify system-independent hot spots that result from concurrent usage (say, many clerks in the accounting department) of the Accounting Register. For each hot spot, explain why it is a hot spot. 7.4 Identify system-dependent hot spots that result from concurrent usage (say, many clerks in the accounting department) of the Accounting Register. You may assume that the DBMS cannot lock finer than a database page. 7.5 Describe a lost update problem involving one of your hot spots that could occur with concurrent usage of the Accounting Register. Use a timeline to depict your example. 7.6 Describe a dirty read situation involving one of your hot spots that could occur with concurrent usage of the Accounting Register. Use a timeline to depict your example. Ans: 7.1 The three choices for transaction boundary are (i) the entire form as one transaction, (ii) the main form and subform as separate transactions, and (iii) the main form as one transaction and each line of the subform as separate transactions. 7.2 Choice (iii) is usually preferred because it provides transactions with the shortest duration. However choice (iii) does not permit checking the constraint that an entry is fully allocated to categories. Therefore, choice (i) is the only choice that supports this important constraint. 7.3 Balance and last check number are system independent hot spots in the account table. Each transaction changes the balance of a checking account. If many users access the same account, the balance will be locked by every user. The last check number is updated when paying by check. If there are many checking transactions, the last check number must be updated frequently. 7.4 The next available pages in the entry and entryline tables are system-dependent hot spots if page locks are used. For example, each subform line inserts a new row in the entryline table. If records locks are not used, the entire page is locked preventing other users from inserting new rows. This is true even if the new rows are for transactions of different accounts. 7.5 User 1 Time User 2 Read balance (500) t1 t2 Read balance (500) Update balance = balance - 100 (400) t3 t4 Update balance = balance - 200 (300) 7.6 User 1 Time User 2 Update balance to 250 t1 t2 Read balance (250) Rollback t3 t4 Update balance = balance - 100 (150) 7.7 A deadlock involves mutual waiting. There should not be a deadlock problem with the account table because a transaction affects only 1 row. Users may wait to obtain an exclusive lock on a row of the table but mutual waiting should not occur. Likewise, a deadlock should not occur with page locking of the entryline table because a user must obtain a lock on the next page before proceeding. Because two users cannot simultaneously obtain a lock on the next page, they will not wait on each other. The deadlock analysis does not depend on whether the locks on all subform rows are held until EOT. Assuming that locks are obtained in the same order by every transaction, there should not be a deadlock from mutual waiting over entryline and entry page locks. If the category table has a balance field that is updated for every transaction line, a deadlock is possible. For example, one transaction may acquire a lock on auto expenses category and another transaction on auto gasoline expenses. If these transactions then need locks on the categories held by the other transaction, a deadlock occurs. 8. Use the Patient tables and the Patient Billing Form (shown in the problem narrative in the textbook) to answer problems 8.1 to 8.4. Comments are listed after the tables and the form. Patient(PatSSN, PatName, PatCity, PatAge) Doctor(DocNo, DocName, DocSpecialty) Bill(BillNo, PatSSN, BillDate, AdmitDate, DischargeDate) Charge(ChgNo, BillNo, ItemNo, ChgDate, ChgQty, DocNo) Item(Itemno, ItemDesc, ItemUnit, ItemRate, ItemQty) • The main form is used to insert a record into the Bill table. Fields from the Patient table are read-only in the main form. • The subform can be used to insert a new row into the Charge table. Fields from the Doctor and the Item tables are read-only. • When a subform line is entered, the associated item row is updated. The form field Qty is subtracted from the current value in the field ItemQOH (item quantity on hand). 8.1 What are the possible transaction boundaries for the Patient Billing Form? 8.2 Select a transaction boundary from your choices in problem 8.1. Justify your choice using the criteria defined in Section 15.4.1. 8.3 Identify system-independent hot spots that result from concurrent usage (say, many health providers) of the Patient Billing Form. For each hot spot, explain why it is a hot spot. 8.4 Identify system-dependent hot spots that result from concurrent usage of the Patient Billing Form. You may assume that the DBMS cannot lock finer than a database page. Ans: 8.1 The three choices for transaction boundary are (i) the entire form as 1 transaction, (ii) the main form and subform as separate transactions, and (iii) the main form as one transaction and each line of the subform as separate transactions. 8.2 Choice (iii) is usually preferred because it provides transactions with the shortest duration. Choice (iii) is preferred here because there are no critical constraints connecting the main form to the subform or all subform lines together. 8.3 The field itemqoh is a system independent hot spot in popular rows of the item table. This is a hot spot no matter where the transaction boundary is set. 8.4 The next available pages in the bill and charge tables are system dependent hot spots if page locks are used. For example, each subform line inserts a new row in the charge table. If records locks are not used, the entire page is locked preventing other users from inserting new rows. This is true even if the new rows are for transactions of different bills. 9. Use the Airline Reservation database tables and the Flight Reservation Form (shown below) to answer problems 9.1 to 9.4. Comments are listed after the tables and the form. Flight(FlightNo, DepCityCode, ArrCityCode, DepTime, ArrTime, FlgDays) FlightDate(FlightNo, FlightDate, RemSeats) Reservation(ResNo, CustNo, ResDate, Amount, CrCardNo) ReserveFlight(ResNo, FlightNo, FlightDate) Customer(CustNo, CustName, Custstreet, CustCity, CustState, CustZip) City(CityCode, CityName, Altitude, AirportConditions) Flight Reservation Form Reservation No. R101 Today’s Date: 8/26/2001 Credit Card No. CC101 Amount: $442.00 Customer No. C101 Customer Name Jill Horn Flight Schedule Flight No. Date Dep City Dep Time Arr City Arr Time F101 8/26/2001 DNV 10:30AM CHG 11:45AM F201 8/31/2001 CHG 10:00AM DNV 1:20PM • The primary keys in the tables are underlined. The foreign keys are italicized. Note that the combination of ResNo, FlightNo, and FlightDate is the primary key of the ReserveFlight table. The combination of FlightNo and FlightDate is a foreign key in the ReserveFlight table. The foreign key refers to the FlightDate table. • The Flight Reservation Form is somewhat simplified as it accommodates only a single class of seating, no reserved seats, and no meals. However, commuter and low-cost airlines often have these restrictions. • The main form is used to insert a record into the Reservation table. The fields from the Customer table are read-only. • The subform is used to insert new rows in the ReserveFlight table and update the field RemSeats in the FlightDate table. The fields from the Flight table are read-only. 9.1 Select a transaction boundary for the Flight Reservation Form. Justify your choice using the criteria defined in Section 15.4.1. 9.2 Identify system-independent hot spots that result from concurrent usage (say, many reservation agents) of the Flight Reservation Form. For each hot spot, explain why it is a hot spot. 9.3 Identify system-dependent hot spots that result from concurrent usage of the Flight Reservation Form. You may assume that the DBMS cannot lock finer than a database page. 9.4 Is deadlock likely to be a problem with concurrent usage of the Flight Reservation Form? If deadlock is likely, provide an example as justification. Ans: 9.1 The entire form should be a transaction because a reservation is not feasible unless all flights are confirmed. Customers do not normally accept a reservation unless all flights in the reservation are acceptable. 9.2 The field remseats is a system independent hot spot in popular rows of the flightdate table. This is a hot spot no matter where the transaction boundary is set. 9.3 The next available pages in the reservation and reserveflight tables are system dependent hot spots if page locks are used. For example, each subform line inserts a new row in the reserveflight table. If records locks are not used, the entire page is locked preventing other users from inserting new rows. This is true even if the new rows are for transactions of different bills. 9.4 There is a possibility of deadlocks with popular rows of the flightdate table because most transactions involve more than one flight. In the timeline below, SR1 and SR2 represent the seats remaining field in two rows of the flightdate table. However, deadlocks will be rare if the legs in a reservation are locked trip order. For example, if a reservation includes legs from Denver to Chicago and Chicago to New York, there will not be deadlocks with other transactions requesting the same two legs in the same order. There could be a deadlock with more than two transactions involving reservations of multiple intersecting legs but this situation should be rare. Transaction A Time Transaction B XLock SR1 t1 t2 XLock SR2 XLock SR2 (wait) t3 t4 XLock SR1 (wait) 10. The following timeline shows the state of transactions with respect to the most recent backup, checkpoint, and failure. Use the timeline when solving the problems in subparts of this problem. backup checkpoint failure time T1 T2 T3 T4 T5 T6 T7 T8 T9 10.1 Describe the restart work if transaction T3 is aborted (with a ROLLBACK statement) after the checkpoint but prior to the failure. Assume that the recovery manager uses the deferred update approach. 10.2 Describe the restart work if transaction T3 is aborted (with a ROLLBACK statement) after the checkpoint but prior to the failure. Assume that the recovery manager uses the immediate update approach. 10.3 Describe the restart work if a system failure occurs. Assume that the recovery manager uses the deferred update approach. 10.4 Describe the restart work if a system failure occurs. Assume that the recovery manager uses the immediate update approach. 10.5 Describe the restart work if a device failure occurs. Ans: 10.1 If transaction T3 is aborted after the checkpoint, there is no restart work if the recovery manager uses the deferred update approach. Database writes are not permanent until after the commit. 10.2 If transaction T3 is aborted after the checkpoint, the recovery manager uses undo operations if the immediate update approach is used. The recovery manager applies the undo operator to all log records for transaction T3. The undo operations are applied in reverse order beginning with the most recent (last written) log record. 10.3 After a system failure, the recovery manager uses redo operations if the deferred update approach is used. The recovery manager takes the following actions: No action for T1 and T4 because they committed before the most recent checkpoint. No action for T6, T8, and T9 because they did not commit before the system failure. No database writes are permanent because the deferred update approach is used. Redo log records starting from the oldest log record: T2, T3, T5, T7 Restart transactions that did not commit before failure: T6, T8, T9 10.4 After a system failure, the recovery manager uses undo and redo operations if the immediate update approach is used. The recovery manager takes the following actions: No action: T1 and T4 because they committed before the most recent checkpoint. Undo log records starting from most recent log record backwards: T6, T8, and T9 because they did not commit before the system failure. Some database writes may be permanent because the immediate update approach is used. Redo log records starting from the oldest log record after the checkpoint: T2, T3, T5, T7 Restart transactions that did not commit before failure: T6, T8, T9 10.5 After a media failure, the recovery manager takes the following actions: Use the most recent backup to restore the database. Redo log records for all transactions committed after backup: T1, T2, T3, T4, T5, T7. For each transaction, redo forward starting with the oldest log record. Restart uncommitted transactions: T6, T8, T9 11. Use the World Wide Web to review transaction processing benchmarks. Why has the debit–credit benchmark been superseded by other benchmarks? How many transactions per minute are reported for various DBMSs? Inspect the code for one or more benchmark transactions. Can you identify hot spots in the transactions? Ans: The debit-credit benchmark is one of the oldest benchmarks for transaction processing. The history of the debit-credit benchmark is presented by Kim Stanley, the Chief Operating Officer of the Transaction Processing Council (TPC), in the article “History and Overview of the TPC” (http://www.tpc.org/articles/tpc.overview.history.1.html). Some important points from this article about the debit-credit benchmark and its limitations are listed below: The debit-credit benchmark was originally designed by IBM to test the performance of Automated Teller Machines. The benchmark was originally known as TP1. The TP1 benchmark measured batch-processing performance of a system handling ATM transactions. The benchmark was flawed because it ignored network and user interaction, thus inflating benchmark results. The benchmark process was flawed because it did not have proper supervision to ensure consistent and repeatable results across vendors. In addition, there was no auditing to ensure compliance with benchmark standards. The TP1 benchmark evolved into two benchmarks that were controlled by the TPC. TPC-A measures online performance of ATM transactions including network and user interaction time. TPC-B measures batch performance of ATM transactions ignoring network and user interaction times. TPC-A and TPC-B, although better defined and controlled by the TPC, have been superseded by other benchmarks. ATM transactions are not representative of more complex transactions that are common today. New benchmarks include TPC-C, an order entry benchmark, TPC-H, an ad-hoc decision support benchmark, TPC-W, an electronic commerce benchmark, and TPC-R, a repetitive decision support benchmark, are the current benchmarks supported by the TPC. The TPC publishes current performance results on its website (www.tpc.org). In May 2002, there are results for the four supported benchmarks (TPC-C, TPC-W, TPC-R, and TPC-H). To understand the results, you should study the benchmark specification and the performance measures used. For TPC-C, two measures are used: the number of transactions processed per minute (tpmC) and the total system cost divided by tpmC (Price/tpmC). In May 2002, the best result is Microsoft SQL Server 2000 Enterprise Edition running under Windows 2000 Advanced Server with COM+ as a teleprocessing monitor using a HP ProLiant DL760-900-256P server. The tpmC number is 709,220 and the Price/tpmC number is 14.96 US$. To see possible hot spots, you need to understand the logical database design, the transaction statements, and the details of the database management system. You should especially look for popular records in a table that are updated by many simultaneous users such as the seats remaining field in popular flight records, the quantity on hand of popular inventory records, and current balance of popular accounts. You should also consider the usage of page locking on insert operations. Page locking can result in congestion on insert operations because the next available page is being locked by all transactions that perform an insert operation on a specified table. Index records and pages can also be the source of system hot spots. To see system hot spots, it may be necessary to monitor resource usage during transaction processing to detect waiting for a particular resource.