Transactions Management and Concurrency Control Chapter 9 Databases are shared resources. We must expect and plan for the likelihood that several users will attempt to access and manipulate data at the same time. With concurrent processing, involving updates, a database without concurrency control will be compromised due to interference between users. There are to basic approaches to concurrency control; a pessimistic approach (involving locking) and an optimistic approach . In a multiuser environment, were data in only being read, concurrency control in not a problem. However, if one or more users are updating data, then potential problems with maintaining data integrity arise. If two or more users are attempting to update the same data at the same time ,via transactions, each is potentially sending, in our case, SQL statements to the CPU that are potentially conflicting re the integrity of the data. Given all the SQL instructions available to the CPU, CPUs process one instruction at a time. One of the goals of CPU management is to keep the CPU busy while non CPU functions, I/O functions, are being performed. It accomplishes this goal via algorithems that manage CPU memory utilization. As multiple transactions are submitted while other processing is occurring against the database, the transactions are usually interleaved, with the CPU switching among the transactions so that some protion of each transaction is performed as the CPU addresses each transaction in turn. However, in performing its interleaving functions against multiple users attempting to change the same data, database integrity must be maintained. The process of maintaining this data integrity is called Concurrency Control. Why does a multi-user database environment give us a special interest in transaction management and concurrency control? Multiple user applications What happens when: 1. two people want to view or change the same record, 2. we wish to restrict some users and not others access to the database, 3. we prevent certain users from changing the database, 4. we set different permissions for different users? We begin by exploring what a transaction is, what its components are, and why it must be managed carefully in a single-user database environment. Even more so in a multi-user database environment. Transaction A transaction represents a real-world event such as the sale of a product. It represents a logical unit of work. No portion of a transaction stands by itself. For example: a product sale has an effect on inventory and, if it is a credit sale, it has an effect on customer balances. A transaction must take a database from one consistent state to another. Therefore, all parts of a transaction must be executed or the transaction must be aborted. All transactions have 4 propertities, these are: 1. atomicity: Unless all parts of the transaction are executed, the transaction is aborted. 2. durability: Once a transaction is committed, it cannot be rolled back. 3. serializability: The result of concurrent transactions is the same as though the transactions were executed in serial order. 4. isolation: Data used by one transaction cannot be used by another transaction until the first transaction is completed. The Isolation Property: The most common problem encountered when multiple users attempt to updata a database without adequate concurrency control is that of lost updates. A lost update would be one where John and Marsha have a joint checking account and both want to withdraw some cash at the same time from different ATM locations. John's transaction reads the account balance at $1000 and he proceeds to withdraw $200. Before the transaction writes the new balance $800, Marsha's transaction reads the account balance still at $1000. She then withdraws $300 leaving a balance of $300. Her transaction then writes this account balance, which replaces the one written by John's transaction. Thus the balance is shown to be $700 when, in fact, the balance should be $500. The bank is not happy! This sequence of events violates the transaction isolation property. The Serializability Property: Concurrent transactions need to be processed so that they do not interfere with each other. If one transaction were entirely processed at a time before another transaction, no interference would occur. Procedures that process transactions so the the outcome is the same as this are called serializable. Processing transactions using a serializable schedule will give the same results as if the transactions had been processed one after the other. The most common approach to achieve Serializability is via locking . With locking, any data that are retrieved by a user for updating must be locked, or denied to other users, until the update is complete (committed) or aborted (rolled back). Consider the above transactions of John and Marsha where record locks are in use. Since John's transaction will update this record, the application program locks this record before reading it into main memory. John proceeds to withdraw $200, and the new balance $800 is computed and updated. Marsha has initiated a withdrawal transaction shortly after John, but her transaction cannot access the account record until John's transaction has returned the updated record to the database and unlocked the record. The locking mechanism thus enforces a sequential updating process that prevents erroneous updates. Types of Locks With the above, we illustrated locks that prevent all access to locked items. We can lock all or any part of a database from Database to Table to Record Block (Page) to Record to Field. In practice the database administrator can generally choose between two types of locks; shared and exclusive. An Exclusive Lock prevents another transaction from reading and therefore updating a record until it is unlocked. Exclusive Lock is placed on a record as it is accessed for update. A Shared Lock allows other transactions to read (but not update) a record. Again, consider the above transactions of John and Marsha with the use of shared and exclusive locks for the checking account example. When John initiates his transaction, the program places a read lock on his account record, since he is reading the record to check the account balance. When John requests a withdrawal, the program attempts to place an exclusive lock (write lock) on the record, since this is an update operation. However, Marsha has already initiated a transaction that has placed a read lock on the same record. As a result, John's request for an exclusive lock to update the record is denied. if another record has a read lock, another user can not obtain an exclusive (write) lock. Since John has a prior read lock on the record when Marsha requests a withdrawal, her request for an exclusive lock to update the record is also denied. So they wait…????? A Deadlock results when two or more transactions have locked a common resource, and each must wait for the other to unlock that resource. John's transaction is waiting for Marsha's transaction to remove the read lock from the account record, and vice versa. There are two basic ways to resolve deadlocks; 1. Prevention and 2. Resolution. Deadlock prevention requires an ability to predict and then lock all required records at the beginning of a given transaction. Unfortunately, it is often difficult to predict in advance what records will be required to process a transaction, where multiple records are involved. Thus, deadlock prevention is not always practical. The more common approach is to allow deadlocks to occur but to build mechanisms into the DBMS for detecting and breaking the deadlocks. Versioning Locking is referred to as a pessimistic concurrency control mechanism, because of the assumption that two or more transactions may be processed at the same time. Another approach called, versioning, takes the optimistic approach that most of the time other users do not want the same record, or if they do, they only want to read (but not update) the record. With versioning, there is no form of locking. Each transaction is restricted to a view of the database as of the time that transaction started, and when a transaction modifies a record, the DBMS creates a new record version instead of overwriting the old record. Eg. John reads the record containing the account balance, successfully withdraws $200, and the new balance $800 is posted to the account with a COMMIT statement. Meanwhile, Marsha has also read the account record and requested a withdrawal, which is posted to her local version (view) of the account record. When Marsha's transaction attempts to COMMIT it discovers the update conflict, via the timestamped transaction of John, and her transaction is aborted. "Cannot complete transaction at this time" She can then restart the transaction, working from the correct starting balance of $800. The best way to understand versioning is to imagine a central records room, corresponding to the database. The records room has a service window. Users (corresponding to transactions) arrive at the window and request documents (corresponding to database records). However, the original documents never leave the records room. Instead, the clerk (corresponding to the DBMS) makes copies of the requested documents and timestamps them. Users then take their private copies (or versions) of the documents to their own workplace and read them and/or make changes. When finished, they return their marked-up copies to the clerk. The clerk merges the changes from marked-up copies into the central database. When there is no conflict (for example, when only one user has made changes to a set of database records), that user's changes are merged directly into the public (or central) database. Suppose instead that there is a conflict; for example, two users have made conflicting changes to their private copy of the database. In this case, the changes made by one of the users are committed to the database (the transactions are timestamped, so that the earlier transaction can be given priority.) The other user must be told that there was a conflict, and his work cannot be committed to the central database. He must check out another copy of the data records and repeat the previous work. Under the optimistic assumption, this type of rework will be exception rather than the rule. Thus, The DBMS's component in charge of concurrency control is the scheduler. The scheduler establishes the order in which the operations within concurrent transactions are executed. The scheduler interleaves the execution of database operations to ensure the serializability of transactions. To determine the appropriate execution order, the scheduler bases its actions on concurrency control algorithms such as locking or timestamping. Most multi-user DBMSs automatically initiate and enforce locking procedures. All lock information is managed by a lock manager, which is responsible for assigning and policing the locks used by the transactions. The scheduler via its control algorithms attempts to manage the trade-offs between maximum data integrity, minimum user deadlocks and maximum Central Processing Unit (CPU) efficiency. SQL as a Tool for Transaction and Concurrency Control SQL provides transaction support through: COMMIT (permanently saves changes to disk) and ROLLBACK (restores the previous database state) Example: SQL statements that might be used in transaction management and how they work. The following transaction registers the credit sale of a product to a customer. The credit sale transaction must do all of the following: 1. create the invoice record 2. update the inventory data and 3. update the customer account data BEGIN TRANSACTION <-- Start of transaction INSERT INTO INVOICE <-- Adds Invoice Record (INV_NUM, INV_DATE, ACCNUM, TOTAL) VALUES (1020,10/10/92,'60120010',3500); UPDATE INVENTORY <-- Updates the quantity on hand of the product SET ON_HAND = ON_HAND - 100 WHERE PROD_CODE = '345TYX'; UPDATE ACCREC <-- Updates the balance of the customer account SET BALANCE = BALANCE + 3500 WHERE ACCNUM = '60120010'; COMMIT; In SQL, the transaction begins automatically with the first SQL statement, or the user can start with the BEGIN TRANSACTION statement. The SQL transaction ends when the last SQL statement is found and/or the program ends the user cancels the transaction COMMIT or ROLLBACK statements are found The DBMS will ensure that all SQL statements are executed and completed before committing all work. If, for any reason, one or more of the SQL statements in the transaction cannot be completed, the entire transaction is aborted and the database is rolled back to its previous consistent state. Show Commit and RollBack Example "a:\CommitRollBack.exe" The VB Code behind CaseyToYesac and YesacToCasey Private Sub Command1_Click() 'Using Transactions to Control Changes Dim db As Database, rs As Recordset, ws As Workspace Set ws = DBEngine.Workspaces(0) Set db = ws.OpenDatabase("a:\TeamStats97.mdb") Set rs = db.OpenRecordset("PlayerStats", dbOpenTable) ws.BeginTrans Do Until rs.EOF If rs![Name] = "Casey" Then rs.Edit rs![Name] = "Yesac" rs.Update End If rs.MoveNext Loop If MsgBox("Save all changes?", vbQuestion + vbYesNo, "Save Changes") = vbYes Then ws.CommitTrans Else ws.Rollback End If rs.Close db.Close End Sub Microsoft Jet Engine and Access allows on to choose between: optimistic and pessimistic locking. With pessimistic locking, Access locks a record whenever someone starts to change the contents of that record. Pessimistic locking implies that when a user edits a record, Access pessimistically assumes that someone else will also want to edit that same record and so it needs to lock it. This method give an exclusive locking to the user such that only one person at a time can have access to a record. A second person can not gain access to the record until person one has finished with the record and has saved the changes. With an optimistic locking strategy the second user is able to make changes to a given record while the first person is also editing it. Optimistic locking implies that, when a user edits a record, Access optimistically assumes that no-one else will want to edit the same record, and so does not lock it. This can be a very dangerous scenario. Ie. when person one saves the record, nothing out of the ordinary happens. But, when the second person tries to save the record he/she is presented with a dialog box that describes a conflict which the second person must resolve. The second person is presented with three choices. They are: 1. Save the record which will overwrite the first person's changes without that person knowing about them. Not a Solution! 2. Copy the change to a clipboard, look at the changes made by person one and decide if the new changes should be made. Versioning. 3. Drop the changes made by the second person. Exclusive Lock. An additional point is that Access does not lock individual records. It locks data pages. A data page contains a given amount of data and may contain one or more records. Thus, when a record is locked all records on that page are also locked. If a record spans two pages both pages will be locked. So which locking strategy should you choose? There is no simple answer, but in general: If it is unlikely that any two users will want to amend the same or adjacent records simultaneously one can use optimistic locking. However, If it is likely that two or more users may want to amend the same or adjacent records simultaneously one should choose the exclusive pessimistic locking. For Access, Choose from the Advanced Page of the Tools/Options dialog box: 1. No Locks optimistic locking, a data page is locked only when a record on the page is actually being saved 2. All Records entire table is locked when any record is edited 3. Edited Records pessimistic locking, a data page is locked when someone starts to edit a record on it. Transaction Log The transaction log is a special DBMS table that contains a description of all the database transactions executed by the DBMS. The database transaction log plays a crucial role in maintaining database concurrency control and integrity. The information stored in the database is used by the DBMS to recover the database after a transaction is aborted (RolledBack) or after a system failure. The transaction log is usually stored in a different hard disk or in a different media (tape) to prevent the failure caused by a media error. The transaction log maintains a record of all database transactions that changed the database. For example, the insertion of a new row to an INVOICE table, the update to the P_ON_HAND attribute for a row identified by '345TYX' in a PRODUCT table, and the update of ACCT_BALANCE attribute for a row identified by '60120010' in an ACCOUNT table. The transaction log also records the transaction's beginning and end in order to help the DBMS to determine the operations that must be rolled back if the transaction is aborted (RollBack). Note: Only the current transaction may be rolled back, not all the previous transactions. If the database must be recovered via RollBack, the DBMS will: Return the database to the initial pre-transaction state.
Pages to are hidden for
"Transactions Management and Concurrency Control"Please download to view full document