Embed
Email

doc file

Document Sample

Shared by: linzhengnd
Categories
Tags
Stats
views:
6
posted:
11/14/2011
language:
English
pages:
7
CS 223 Final Exam : Take Home.

Posted Wednesday, 3/16/11. Due Friday Noon on 3/18/11.









Instructions:

 The take home exam is based on the honor code and trust the instructor has in you that

you will solve the questions individually. It is usually quite easy for the instructor to

determine if indeed students have violated the honor code. Such a suspicion will not

only negatively impact your grades, but will push the instructor to offer only “in-class”

exams. The take-home exam has a lot of advantages – e.g., providing opportunity for

you to think without a pressure of time, ensure that t your exam performance accurately

reflects your level of understanding, learn something new through exams, etc. Please

do follow the honor code to ensure that future generation of students continue to get the

same opportunities that are being given to you.



 This is an OPEN notes OPEN book exam. So feel free to consult any reference material

you want. The only thing you are barred from consulting is other people. Extensive

searching on the Web for answers to the question will most likely not help. All you truly

need is your class notes + time to think through the solutions.



 Please begin answer to each question on a NEW page. I would prefer typed answers

(so that I do not make mistakes due to poor handwriting) but will accept hand written

exam sheets as well in case you do not have time to type your answers. Do not forget to

number your pages.



 Slip in your answer sheets in my office – BH 2082 BEFORE Noon on Friday. This is a

hard deadline. No exam copy will be accepted after that. No exceptions! So make sure

that you have a nicely typed, stapled answer sheet turned in before Noon.



 Please make sure to write your name on EACH page of the answer sheets you turn in.



 It is preferred that you include your answer sheets in a envelope before slipping it in my

office. This way there is lesser change of pages being misplaced and/or damaged.









1|Page

Question 1: (25 points) In a TP system, when an application requests a begin transaction, the

system assigns it a trid that is used to identify the transaction. Trid is an important concept in

implementing transactions both in a centralized and a distributed TP system. For instance, in a

centralized system, when a transaction makes a RPC call to a resource manager, the call

contains with it the transaction’s trid which is used by the resource manager to identify the

transaction (e.g., the resource manager will acquire all locks it needs for the transaction using

the trid). Likewise, in a distributed system, when an application at node A wishes to access a

resource manager at node B, it makes a remote procedure call (RPC) to the remote resource

manager. The RPC carries with it a trid of the transaction at node A. If this is the first invocation

for the transaction at node A, the transaction's trid is registered at node B and becomes the trid

for the transaction at node B. The resource manager at node B called on behalf of a transaction

at node A, itself may need to call some other resource manager at another node C and the

transaction may spread to multiple nodes. Irrespective of which node the transaction executes

on, it is always identified by the same trid which is the trid assigned to it on the node at which

the transaction started.









Let us consider a distributed system with three nodes A, B, and C. Let us assume that these

three systems were manufactured by different vendors and each of these systems has a

different representation of trid which cannot be used across multiple systems – that is, trid for a

transaction generated say at node A cannot be used to identify the transaction at node B, and

vice versa. Other than that, the three system are fully compatible and support all other

mechanisms needed for implementing distributed transactions – e.g., they use the exact same

version of 2PC protocol etc. for committing transactions.



One approach to execute distributed transactions in such a heterogeneous distributed system is

for each node to support a gateway for the transaction requests arriving to it from the other

remote systems-- for example, A supports a gateway to B and C. When a request arrives to A

from B carrying the trid (B.trid) the appropriate gateway generates a new trid for the transaction

-- say A.trid, that is compatible with the system at A. The gateway stores a translation table

containing the association between the transaction's trids -- that is, it stores in the table (B.trid,

A.trid). The gateway can now deliver the request to the appropriate resource manager using

A.trid as the transaction's identifier. Future requests from B for the same transaction that arrive

to A will have the same B.trid value. The gateway will appropriately modify the trid associated

with the call to A.trid using its translation table.









2|Page

 Do the entries in the translation tables maintained have to be made durable? If so, when

should an entry be made durable -- at the beginning when the RPC arrives, when a

commit message for a transaction arrives? Explain. (5 points)

 Illustrate a scenario in which, if the above method is used, different request on behalf of

the same transaction (say that originated at B) execute under more than one trid at

system A. (7 points)

 Can something go wrong if different requests of the same transaction execute under

more than one trid at system A? Will serializability be sacrificed? Will atomicity of

transaction be sacrificed? List out all the possible problems you can think of along with

a justification for your answer. (6 points)

 Can you design mechanisms to overcome the problem that does not require major

changes to the existing code and do not require systems to change their trid

implementation but can overcome the problems you identified? (7 points)









3|Page

Question 2: (25 points) Consider a relation R with three attributes A, B, and C. Let there exist

a B-tree index on each of the three attributes. Furthermore, let us assume that each of the

indices are unique (no duplicate records for any of the attribute values) and the relation

contains the following records {, , , .. }. The system uses

multi-granularity locking protocol.



Let us first assume that the system uses dynamic key ranges as lockable granules. For

instance, for attribute A, these granules correspond to : range_A(1) = (-infinity, 1], range_A(2) =

(1,2], … range_A(10} = (9,10] , and range_A(infinity) = (10,infinity). Likewise ranges are defined

for attribute B ( that is, range_B(1), range_B(2), …) and also for attribute C (that is, range_C(1),

range_C(2), etc.) .



Specify exactly which key range locks and record level locks will be acquired and for what

duration for the following operations:



 (5 points) A query that selects all records from R that satisfy the condition 2



Now let us assume that instead of supporting ranges as explicit granules, key ranges are

implicitly locked by the transaction by locking the record as in the case of ARIES KVL. That is, if

a transaction requires a IS lock on say a range for one of the attributes and an S lock on the

record, it simply acquires S lock on record which implicitly locks the corresponding range in S-

mode (which is more restrictive than IS). l Specify exactly which locks are acquired and for

what duration for the following operations:



 (5 points) A query that selects all records from R that satisfy the condition 2



 ( 5 points) Which of the two strategies offers higher concurrency:



Strategy 1: when key ranges are explicitly identified as lockable granules.



Strategy 2: when record locking implicitly locks corresponding key ranges.



Illustrate the answer using a concrete example. The example should show a transaction whose

operation would be blocked by one of the strategies but permitted by another.









4|Page

Question 3: (15 points) Consider a shared-nothing multiprocessor database system wherein

data is partitioned across N different machines. Transactions may span multiple processors and

the system uses 2 phase locking coupled with two-phase commit. Each processor P_i maintains

a separate log manager LM_i that writes a log file LF_i. Each log manager follows the standard

logging protocols – viz., force log at commit and write ahead logging. An LSN value is

associated with the log records written by a log manager LM_i and logs are made persistent by

LM_i in the LSN order – thus, for instance, if a log record with LSN value v1 has been made

persistent by LM_i, then all log records written by LM_i that have a LSN value less than v1 must

already have been made persistent by LM_i. However, LSN values are generated by the log

managers independently and are not synchronized. Thus, it is quite possible that a log with LSN

value say 10 at LM_1 may still be in memory at processor P_1 while the log with LSN value 20

LM_2 is persistent at processor P_2.



 (5 points) Let us first assume that the system uses logical logging to implement

atomicity of transactions. To ensure atomicity of actions constituting a transaction (viz. to

prevent action consistency and partial failure problems), the system acquires appropriate

page latches/short duration locks and writes physical log records for each operation

corresponding to a logical action. In case of failures, each log manager uses multi-level

recovery. Does multi-level recovery guarantee atomicity in such a multi-log

multiprocessor system? Notice that the operations constituting a logical action may

span multiple processors. If so, create an example illustrating the problem. If not,

intuitively argue why such a situation cannot arise.



 (5 points) Again let us assume logical logging is being used just as in the previous

question. Except operations that correspond to a single logical action of a transaction

are such that they are constrained to be on a single processor and hence logs for all

operations corresponding to a single logical action are written to the same log file.

Notice that logs corresponding to different logical actions could, of course, span multiple

log files since a transaction may consist of actions at multiple-processor. Assuming that

multi-level recovery is used, can action consistency/partial failures arise in such a

scenario? If so, create an example illustrating the problem. If not, intuitively argue why

such a situation cannot arise.



 (5 points) Now consider that instead of logical logging, physiological logging was being

used wherein each action constituting a transaction is constrained to be on a single page

(which by definition resides on a single processor). Also, it is assumed that in addition to

WAL and FL@C each processor also follows the FIX protocol to guarantee that no

partially updated page can be chosen as a victim in a buffer replacement policy. Can the

action consistency and partial failures arise in this case? Again, create an example

illustrating the problem or argue why such a situation cannot arise.









5|Page

Question 4: (25 points)

Let us assume that the system uses a 2PC protocol with presumed abort optimization for

atomic commitment of transactions. Refer to the figure in the class notes (or one of the books)

to see how 2PC works during normal processing and its various timeout and recovery actions.





 (5 points) In 2PC, a cohort forces the prepare log record to disk before replying to the

coordinator with its vote. Explain using an example what could go wrong if the cohort did

not force the prepare log record to disk before sending its vote to the coordinator.



 (5 points) In 2PC, the coordinator forces the commit log record to disk before sending

commit messages to other cohorts. Explain what could go wrong if the coordinator did

not force the commit log record to disk before sending the commit message to the

cohorts.



In the 2PC protocol, the coordinator on timeout while waiting for acknowledgements from

cohorts for a commit message, resends the commit message to the cohorts who have not as yet

responded with their acknowledgements. Furthermore, a cohort on timeout waiting for a commit

message from the coordinator, probes the coordinator for the commit decision. (i.e., the cohort

sends a message to the coordinator asking for the commit decision for the transaction.)



On first glance, it appears that one of these two timeout actions is redundant and not necessary.

For example, assume that the coordinator on timeout waiting for commit acknowledgement from

cohort resends the commit message but the cohort takes no timeout action waiting for the

commit message. In this case, since the coordinator will take a timeout action and resend the

commit message (since the cohort would not have sent its acknowledgement), the cohort, if it

waits long enough, would be able to learn about the committment of the transaction.



 (7.5 points) Is the timeout action by the coordinator to resend the commit message to

the cohort that has not responded with the acknowledgement required even though

cohort, on timeout, will probe the coordinator for the status of the transaction? If yes,

explain what would go wrong if the coordinator takes no timeout action.





 (7.5 points) Is the timeout action by the cohort to probe the coordinator for the commit

decision required even though, on timeout, the coordinator will resend the commit

message to the cohort. If yes, explain what would go wrong if the cohort takes no

timeout action.









6|Page

Question 5: (10 points)



Succinctly, in about a para or two describe your course project, what did you do, what did you

find hard, what you liked about it, what you did not. Is the documentation about the project on

the web sufficiently detailed, if you did the standard class project. If you did a research oriented

project, are you happy about your choice? What could the instructor have done (constructive

suggestions only please  ) to improve your experience with the project.









7|Page



Related docs
Other docs by linzhengnd
i-Health
Views: 0  |  Downloads: 0
State employees recall events of September 11
Views: 7  |  Downloads: 0
0804050421330_2110
Views: 4  |  Downloads: 0
Listino2009 - Meetup
Views: 0  |  Downloads: 0
TwoSurveyCalculator
Views: 0  |  Downloads: 0
Guidelines.xlsx
Views: 0  |  Downloads: 0
APPALACHIA AND THE OZARKS
Views: 2  |  Downloads: 0
Proliferation Studies
Views: 0  |  Downloads: 0
By registering with docstoc.com you agree to our
privacy policy

You are almost ready to download!

You are almost ready to download!