Carnegie Mellon Univ.
Dept. of Computer Science
15-415 - Database Applications

C. Faloutsos
Recovery


Carnegie Mellon
                  General Overview
•   Relational model - SQL
•   Functional Dependencies & Normalization
•   Physical Design & Indexing
•   Query optimization
•   Transaction processing
     – concurrency control
     – recovery
Carnegie Mellon       15-415 - C. Faloutsos   2
                  Transactions - dfn
= unit of work, eg.
     move $10 from savings to checking


Atomicity (all or none)      -> recovery
Consistency
Isolation (as if alone)      -> concurrency control
Durability
                  Overview - recovery
• problem definition
     – types of failures
     – types of storage
• solution #1: Write-ahead log
     – deferred updates
     – incremental updates
     – checkpoints
• (solution #2: shadow paging)
                  Recovery
•   Durability - types of failures?
•   disk crash (ouch!)
•   power failure
•   software errors (deadlock, division by zero)




        Reminder: types of storage
• volatile (eg., main memory)
• non-volatile (eg., disk, tape)
• “stable” (“never” fails - how to implement
  it?)




              Classification of failures:
(ordered from frequent and ‘cheap’ to rare and expensive)
   • logical errors (eg., div. by 0)
   • system errors (eg., deadlock - pgm can run later)
   • system crash (eg., power failure - volatile storage is lost)
   • disk failure
                  Problem definition
• Records are on disk
• for updates, they are copied into memory
• and flushed back to disk, at the discretion
  of the O.S.! (unless forced-output:
  ‘output(B)’, analogous to fflush() followed by fsync())
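A minimal sketch of forced output in Python, assuming an ordinary file stands in for block B. The helper name `forced_write` and the file name are hypothetical; the point is only that a plain write may sit in buffers until the program explicitly forces it out.

```python
import os
import tempfile

def forced_write(path: str, data: bytes) -> None:
    """Write data and force it out to the storage device before returning."""
    with open(path, "wb") as f:
        f.write(data)          # lands in the process's own buffer
        f.flush()              # hand the buffer over to the operating system
        os.fsync(f.fileno())   # ask the O.S. to push it to the device

# Hypothetical block B holding the new value of X.
path = os.path.join(tempfile.mkdtemp(), "block_B.bin")
forced_write(path, b"X=6")

with open(path, "rb") as f:
    print(f.read())            # b'X=6'
```

Without the `flush()` and `os.fsync()` calls, the moment at which the bytes reach the disk is left to the runtime and the O.S.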



           Problem definition - eg.:
read(X)
X=X+1
write(X)
[figure: the page holding X=5 is read from disk into a buffer in main memory; after X=X+1 the buffer holds X=6 while the page on disk still holds X=5]
buffer joins an output queue,
but it is NOT flushed immediately!
Q1: why not?
Q2: so what?
            Problem definition - eg.:
read(X)
read(Y)
X=X+1
Y=Y-1
write(X)
write(Y)
[figure: buffers in main memory hold X=6 and Y=3; the pages on disk hold X=5 and Y=3]
Q2: so what?
Q3: how to guard against it?
                  Solution #1: W.A.L.
•   redundancy, namely
•   write-ahead log, on ‘stable’ storage
•   Q: what to replicate? (not the full page!!)
•   A:
•   Q: how exactly?



                  W.A.L. - intro
• replicate intentions: eg:
     <T1 start>
     <T1, X, 5, 6>
     <T1, Y, 4, 3>
     <T1 commit> (or <T1 abort>)




                  W.A.L. - intro
• in general: transaction-id, data-item-id,
  old-value, new-value
• assumption: each log record is immediately
  flushed to stable storage
• each transaction writes a log record first,
  before doing the change
• when done, write a <commit> record & exit
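A minimal sketch of the log-record format and the "log first, then change" rule, in Python. The names `Update`, `Marker`, `start`, `write` and `commit` are hypothetical, and a plain list stands in for stable storage (a real log would be forced to disk on every append).

```python
from dataclasses import dataclass
from typing import Dict, List, Union

@dataclass(frozen=True)
class Update:
    txn: str    # transaction-id
    item: str   # data-item-id
    old: int    # old-value
    new: int    # new-value

@dataclass(frozen=True)
class Marker:
    txn: str
    kind: str   # "start", "commit" or "abort"

log: List[Union[Update, Marker]] = []        # assumption: stands in for stable storage
buffer: Dict[str, int] = {"X": 5, "Y": 4}    # in-memory copies of the data items

def start(txn: str) -> None:
    log.append(Marker(txn, "start"))

def write(txn: str, item: str, new: int) -> None:
    log.append(Update(txn, item, buffer[item], new))  # log record first...
    buffer[item] = new                                # ...then the change itself

def commit(txn: str) -> None:
    log.append(Marker(txn, "commit"))

start("T1")
write("T1", "X", 6)
write("T1", "Y", 3)
commit("T1")
for record in log:
    print(record)
```

The four records produced correspond to `<T1 start>`, `<T1, X, 5, 6>`, `<T1, Y, 4, 3>` and `<T1 commit>`.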

         W.A.L. - deferred updates
• idea: prevent OS from flushing buffers,
  until (partial) ‘commit’.
• After a failure, “replay” the log




         W.A.L. - deferred updates
• Q: how, exactly?
     – value of W on disk?
     – value of W after recov.?
     – value of Z on disk?
     – value of Z after recov.?
Log (earlier records first):
     <T1 start>
     <T1, W, 1000, 2000>
     <T1, Z, 5, 10>
     <T1 commit>
     --- crash ---
         W.A.L. - deferred updates
• Q: how, exactly?
     – value of W on disk?
     – value of W after recov.?
     – value of Z on disk?
     – value of Z after recov.?
Log (earlier records first):
     <T1 start>
     <T1, W, 1000, 2000>
     <T1, Z, 5, 10>
     --- crash ---
         W.A.L. - deferred updates
• Thus, the recovery algo:
     – redo committed transactions
     – ignore uncommitted ones
Log (earlier records first):
     <T1 start>
     <T1, W, 1000, 2000>
     <T1, Z, 5, 10>
     --- crash ---
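The recovery rule for deferred updates can be sketched in Python as follows. This is an illustration under stated assumptions: log records are plain tuples, a dict stands in for the disk, and the function name `recover_deferred` is hypothetical.

```python
def recover_deferred(log, disk):
    """Redo the updates of committed transactions; ignore all others."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:                       # forward scan, oldest record first
        if rec[0] == "update" and rec[1] in committed:
            _, _txn, item, _old, new = rec
            disk[item] = new              # only the new value is needed
    return disk

log_with_commit = [
    ("start", "T1"),
    ("update", "T1", "W", 1000, 2000),
    ("update", "T1", "Z", 5, 10),
    ("commit", "T1"),
]
log_without_commit = log_with_commit[:-1]   # crash before <T1 commit>

# With deferred updates the disk still holds the old values at the crash.
print(recover_deferred(log_with_commit, {"W": 1000, "Z": 5}))     # {'W': 2000, 'Z': 10}
print(recover_deferred(log_without_commit, {"W": 1000, "Z": 5}))  # {'W': 1000, 'Z': 5}
```

Nothing is ever undone here, because no uncommitted change was allowed to reach the disk in the first place.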
         W.A.L. - deferred updates
Observations:
- no need to keep ‘old’ values
- Disadvantages?
Log (earlier records first):
     <T1 start>
     <T1, W, 1000, 2000>
     <T1, Z, 5, 10>
     --- crash ---
         W.A.L. - deferred updates
- Disadvantages?
(eg., “increase all balances by 5%”)
May run out of buffer space!
Hence:




   W.A.L. - incremental updates
- log records have ‘old’ and ‘new’ values.
- modified buffers can be flushed at any time

Each transaction:
- writes a log record first, before doing the
   change
- writes a ‘commit’ record (if all is well)
- exits
     W.A.L. - incremental updates
• Q: how, exactly?
     – value of W on disk?
     – value of W after recov.?
     – value of Z on disk?
     – value of Z after recov.?
Log (earlier records first):
     <T1 start>
     <T1, W, 1000, 2000>
     <T1, Z, 5, 10>
     <T1 commit>
     --- crash ---
     W.A.L. - incremental updates
• Q: how, exactly?
     – value of W on disk?
     – value of W after recov.?
     – value of Z on disk?
     – value of Z after recov.?
Log (earlier records first):
     <T1 start>
     <T1, W, 1000, 2000>
     <T1, Z, 5, 10>
     --- crash ---
     W.A.L. - incremental updates
• Q: recovery algo?
• A:
     – redo committed xacts
     – undo uncommitted ones
• (more details: soon)
Log (earlier records first):
     <T1 start>
     <T1, W, 1000, 2000>
     <T1, Z, 5, 10>
     --- crash ---
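A minimal Python sketch of the undo/redo rule for incremental updates, assuming tuple log records and a dict standing in for the disk; `recover_incremental` is a hypothetical name. Since buffers may be flushed at any time, the disk state at the crash is unknown, so the example assumes W had already been flushed and Z had not.

```python
def recover_incremental(log, disk):
    """Undo uncommitted transactions, then redo committed ones."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in reversed(log):             # undo: backward scan, restore old values
        if rec[0] == "update" and rec[1] not in committed:
            disk[rec[2]] = rec[3]
    for rec in log:                       # redo: forward scan, install new values
        if rec[0] == "update" and rec[1] in committed:
            disk[rec[2]] = rec[4]
    return disk

log_without_commit = [
    ("start", "T1"),
    ("update", "T1", "W", 1000, 2000),
    ("update", "T1", "Z", 5, 10),
]
log_with_commit = log_without_commit + [("commit", "T1")]

# Assumed disk state at the crash: W already flushed, Z not yet.
print(recover_incremental(log_without_commit, {"W": 2000, "Z": 5}))  # {'W': 1000, 'Z': 5}
print(recover_incremental(log_with_commit, {"W": 2000, "Z": 5}))     # {'W': 2000, 'Z': 10}
```

Unlike the deferred scheme, the old values in the log are essential here: they are the only way to remove the flushed change to W.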
     W.A.L. - incremental updates
Observations
• “increase all balances by 5%” - problems?
• what if the log is huge?
Log (earlier records first):
     <T1 start>
     <T1, W, 1000, 2000>
     <T1, Z, 5, 10>
     --- crash ---
              W.A.L. - check-points
Idea: periodically, flush buffers
Q: should we write anything on the log?
Log (earlier records first):
     <T1 start>
     <T1, W, 1000, 2000>
     <T1, Z, 5, 10>
     ...
     <T500, B, 10, 12>
     --- crash ---
              W.A.L. - check-points
Q: should we write anything on the log?
A: yes!
Q: how does it help us?
Log (earlier records first):
     <T1 start>
     <T1, W, 1000, 2000>
     <T1, Z, 5, 10>
     <checkpoint>
     ...
     <checkpoint>
     <T500, B, 10, 12>
     --- crash ---
              W.A.L. - check-points
Q: how does it help us?
     A=? on disk?
     A=? after recovery?
     B=? on disk?
     B=? after recovery?
     C=? on disk?
     C=? after recovery?
Log (earlier records first):
     <T1 start>
     ...
     <T1 commit>
     ...
     <T499, C, 1000, 1200>
     <checkpoint>
     <T499 commit>
     <T500 start>
     <T500, A, 200, 400>
     <checkpoint>
     <T500, B, 10, 12>
     --- crash ---
              W.A.L. - check-points
Q: how does it help us?
I.e., what is the recovery algorithm?
Log (earlier records first):
     <T1 start>
     ...
     <T1 commit>
     ...
     <T499, C, 1000, 1200>
     <checkpoint>
     <T499 commit>
     <T500 start>
     <T500, A, 200, 400>
     <checkpoint>
     <T500, B, 10, 12>
     --- crash ---
              W.A.L. - check-points
Q: what is the recovery algorithm?
A:
  - undo uncommitted xacts (eg., T500)
  - redo the ones committed after the last checkpoint (eg., none)
Log (earlier records first):
     <T1 start>
     ...
     <T1 commit>
     ...
     <T499, C, 1000, 1200>
     <checkpoint>
     <T499 commit>
     <T500 start>
     <T500, A, 200, 400>
     <checkpoint>
     <T500, B, 10, 12>
     --- crash ---
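A hedged Python sketch of recovery with checkpoints, using the visible part of the log above (the elided `...` portions are left out, not guessed). It assumes a checkpoint flushes all buffers, so redo only needs to look at records after the last checkpoint; the function name and the disk contents at the crash are assumptions.

```python
def recover_with_checkpoints(log, disk):
    """Undo uncommitted xacts; redo updates of committed xacts after the last checkpoint."""
    last_ckpt = max((i for i, rec in enumerate(log) if rec[0] == "checkpoint"),
                    default=-1)
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in reversed(log):                    # undo: restore old values
        if rec[0] == "update" and rec[1] not in committed:
            disk[rec[2]] = rec[3]
    for rec in log[last_ckpt + 1:]:              # redo: only after the last checkpoint
        if rec[0] == "update" and rec[1] in committed:
            disk[rec[2]] = rec[4]
    return disk

log = [
    ("update", "T499", "C", 1000, 1200),
    ("checkpoint",),
    ("commit", "T499"),
    ("start", "T500"),
    ("update", "T500", "A", 200, 400),
    ("checkpoint",),
    ("update", "T500", "B", 10, 12),
]

# Assumed disk state at the crash: A and C were flushed by a checkpoint,
# and B happened to be flushed as well.
print(recover_with_checkpoints(log, {"A": 400, "B": 12, "C": 1200}))
# {'A': 200, 'B': 10, 'C': 1200}
```

T500 is undone (A and B return to their old values), and nothing is redone because no transaction committed after the last checkpoint.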
      W.A.L. - w/ concurrent xacts
Log helps to rollback transactions (eg., after a deadlock + victim selection)
Eg., rollback(T500): go backwards on log; restore old values
Log (earlier records first):
     <T1 start>
     <checkpoint>
     <T499 commit>
     <T500 start>
     <T500, A, 200, 400>
     <T300 commit>
     <checkpoint>
     <T500, B, 10, 12>
     <T500 abort>
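A minimal sketch of rollback(T500) in Python, assuming tuple log records and a dict for the in-memory buffers; the function name `rollback` follows the slide, everything else is illustrative.

```python
def rollback(log, buffer, txn):
    """Go backwards on the log, restoring old values written by txn."""
    for rec in reversed(log):
        if rec[0] == "start" and rec[1] == txn:
            break                          # nothing older belongs to txn
        if rec[0] == "update" and rec[1] == txn:
            buffer[rec[2]] = rec[3]        # restore the old value
    log.append(("abort", txn))             # record that txn was rolled back

log = [
    ("start", "T1"),
    ("checkpoint",),
    ("commit", "T499"),
    ("start", "T500"),
    ("update", "T500", "A", 200, 400),
    ("commit", "T300"),
    ("checkpoint",),
    ("update", "T500", "B", 10, 12),
]
buffer = {"A": 400, "B": 12}

rollback(log, buffer, "T500")
print(buffer)    # {'A': 200, 'B': 10}
print(log[-1])   # ('abort', 'T500')
```

Records of other transactions (here T300 and T499) are skipped; only T500's own updates are reversed, newest first.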
      W.A.L. - w/ concurrent xacts
- recovery algo?
- undo uncommitted ones
- redo ones committed after the last checkpoint
Log (earlier records first):
     <T1 start>
     ...
     <T300 start>
     ...
     <checkpoint>
     <T499 commit>
     <T500 start>
     <T500, A, 200, 400>
     <T300 commit>
     <checkpoint>
     <T500, B, 10, 12>
      W.A.L. - w/ concurrent xacts
- recovery algo?
- undo uncommitted ones
- redo ones committed after the last checkpoint
- Eg.?
[figure: timeline of transactions T1, T2, T3, T4 along a time axis, with two checkpoints (ck) followed by the crash]
      W.A.L. - w/ concurrent xacts
- recovery algo? specifically:
- find latest checkpoint
- create the ‘undo’ and ‘redo’ lists
[figure: timeline of transactions T1, T2, T3, T4 along a time axis, with two checkpoints (ck) followed by the crash]
        W.A.L. - w/ concurrent xacts
[figure: timeline of transactions T1, T2, T3, T4 along a time axis, with two checkpoints (ck) followed by the crash]
Log (earlier records first):
     <T1 start>
     <T2 start>
     <T4 start>
     <T1 commit>
     <checkpoint>
     <T3 start>
     <T2 commit>
     <checkpoint>
     <T3 commit>
       W.A.L. - w/ concurrent xacts
  <checkpoint> should also contain a list of ‘active’
  transactions (= not committed yet)
Log (earlier records first):
     <T1 start>
     <T2 start>
     <T4 start>
     <T1 commit>
     <checkpoint>
     <T3 start>
     <T2 commit>
     <checkpoint>
     <T3 commit>
       W.A.L. - w/ concurrent xacts
  <checkpoint> should also contain a list of ‘active’
  transactions
Log (earlier records first):
     <T1 start>
     <T2 start>
     <T4 start>
     <T1 commit>
     <checkpoint {T4, T2}>
     <T3 start>
     <T2 commit>
     <checkpoint {T4, T3}>
     <T3 commit>
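With the active list stored in each checkpoint, the ‘undo’ and ‘redo’ lists can be built from the last checkpoint alone. A sketch in Python, assuming tuple log records and that the crash happens right after the last record shown; `undo_redo_lists` is a hypothetical name.

```python
def undo_redo_lists(log):
    """Build the undo and redo sets from the most recent checkpoint onwards."""
    last = max((i for i, rec in enumerate(log) if rec[0] == "checkpoint"),
               default=None)
    if last is None:
        undo, tail = set(), log
    else:
        undo, tail = set(log[last][1]), log[last + 1:]   # start from the active list
    redo = set()
    for rec in tail:
        if rec[0] == "start":
            undo.add(rec[1])               # started after the checkpoint
        elif rec[0] == "commit":
            undo.discard(rec[1])           # committed, so it must be redone instead
            redo.add(rec[1])
    return undo, redo

log = [
    ("start", "T1"),
    ("start", "T2"),
    ("start", "T4"),
    ("commit", "T1"),
    ("checkpoint", ("T4", "T2")),
    ("start", "T3"),
    ("commit", "T2"),
    ("checkpoint", ("T4", "T3")),
    ("commit", "T3"),
]

undo, redo = undo_redo_lists(log)
print(undo, redo)   # {'T4'} {'T3'}
```

T1 and T2 committed before the last checkpoint, so their changes were already flushed and they appear in neither list.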
       W.A.L. - w/ concurrent xacts
  Recovery algo:
  - build ‘undo’ and ‘redo’ lists
  - scan backwards, undoing ops by the ‘undo’-list transactions
  - go to most recent checkpoint
  - scan forward, re-doing ops by the ‘redo’-list xacts
Log (earlier records first):
     <T1 start>
     <T2 start>
     <T4 start>
     <T1 commit>
     <checkpoint {T4, T2}>
     <T3 start>
     <T2 commit>
     <checkpoint {T4, T3}>
     <T3 commit>
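The four steps above can be sketched in Python as follows. The log on the slide shows no update records, so the two updates below (`D` by T4 and `C` by T3) and the disk contents are hypothetical additions for illustration; checkpoints are assumed to flush all buffers.

```python
def recover(log, disk):
    ckpts = [i for i, rec in enumerate(log) if rec[0] == "checkpoint"]
    last = ckpts[-1] if ckpts else -1
    # Step 1: build the 'undo' and 'redo' lists.
    undo = set(log[last][1]) if ckpts else set()
    redo = set()
    for rec in log[last + 1:]:
        if rec[0] == "start":
            undo.add(rec[1])
        elif rec[0] == "commit":
            undo.discard(rec[1])
            redo.add(rec[1])
    # Step 2: scan backwards, undoing ops of undo-list xacts,
    # until the start record of every such xact has been seen.
    pending = set(undo)
    for rec in reversed(log):
        if not pending:
            break
        if rec[0] == "update" and rec[1] in undo:
            disk[rec[2]] = rec[3]
        elif rec[0] == "start":
            pending.discard(rec[1])
    # Steps 3 and 4: from the most recent checkpoint, scan forward and redo.
    for rec in log[last + 1:]:
        if rec[0] == "update" and rec[1] in redo:
            disk[rec[2]] = rec[4]
    return disk

log = [
    ("start", "T1"), ("start", "T2"), ("start", "T4"),
    ("update", "T4", "D", 7, 8),            # hypothetical update
    ("commit", "T1"),
    ("checkpoint", ("T4", "T2")),
    ("start", "T3"), ("commit", "T2"),
    ("checkpoint", ("T4", "T3")),
    ("update", "T3", "C", 1, 2),            # hypothetical update
    ("commit", "T3"),
]

# Assumed disk at the crash: D=8 was flushed by a checkpoint, C=2 was not.
print(recover(log, {"C": 1, "D": 8}))       # {'C': 2, 'D': 7}
```

The backward scan has to reach past both checkpoints to find T4's update, while the forward scan never looks at anything before the last checkpoint.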
       W.A.L. - w/ concurrent xacts
  Observations:
  - during checkpoints: assume that no changes are allowed by
    xacts (otherwise, ‘fuzzy checkpoints’)
  - recovery algo: is idempotent (ie., can work, even if there is a
    failure during recovery!)
  - how to handle buffers of stable storage?
Log (earlier records first):
     <T1 start>
     <T2 start>
     <T4 start>
     <T1 commit>
     <checkpoint {T4, T2}>
     <T3 start>
     <T2 commit>
     <checkpoint {T4, T3}>
     <T3 commit>
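One way to see why recovery is idempotent: log records carry values, not operations, so replaying a record twice installs the same value twice. The Python sketch below illustrates this; the contrasting delta-style log is a hypothetical counter-example, not something the lecture proposes.

```python
def redo_values(log, disk):
    """Redo by assigning the logged new value, as in <T, item, old, new>."""
    for _txn, item, _old, new in log:
        disk[item] = new
    return disk

def redo_deltas(deltas, disk):
    """Hypothetical contrast: a log of increments is not safe to replay twice."""
    for item, delta in deltas:
        disk[item] += delta
    return disk

log = [("T1", "W", 1000, 2000), ("T1", "Z", 5, 10)]
once = redo_values(log, {"W": 1000, "Z": 5})
twice = redo_values(log, redo_values(log, {"W": 1000, "Z": 5}))
print(once == twice)      # True

d_once = redo_deltas([("W", 1000)], {"W": 1000})
d_twice = redo_deltas([("W", 1000)], redo_deltas([("W", 1000)], {"W": 1000}))
print(d_once, d_twice)    # {'W': 2000} {'W': 3000}
```

If a crash interrupts recovery, running it again from the start simply reassigns the same values.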
                  Shadow paging
• keep old pages on disk
• write updated records on new pages on disk
• if successful, release old pages; else release
  ‘new’ pages
• not used in practice - why not?



                  Shadow paging
• not used in practice - why not?
• may need too much disk space (“increase all
  by 5%”)
• may destroy clustering/contiguity of pages.
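A toy Python sketch of the shadow-paging idea: old pages stay on disk, updates go to new pages, and either the old or the new pages are released. The class `ShadowPagedStore`, its page-table representation and the `fail` flag are all assumptions made for illustration; switching the page table is assumed to be atomic.

```python
class ShadowPagedStore:
    def __init__(self, pages):
        self.disk = dict(enumerate(pages))   # physical page id -> contents
        self.table = list(self.disk)         # logical page no. -> physical page id
        self.next_id = len(pages)

    def read(self, logical):
        return self.disk[self.table[logical]]

    def run(self, updates, fail=False):
        """Apply {logical page: new contents}; fail=True simulates an abort."""
        new_table = list(self.table)
        new_pages = []
        for logical, contents in updates.items():
            pid = self.next_id
            self.next_id += 1
            self.disk[pid] = contents        # write updated record on a NEW page
            new_table[logical] = pid
            new_pages.append(pid)
        if fail:
            for pid in new_pages:            # unsuccessful: release 'new' pages
                del self.disk[pid]
            return
        old_pages = [self.table[logical] for logical in updates]
        self.table = new_table               # the switch (assumed atomic)
        for pid in old_pages:                # successful: release old pages
            del self.disk[pid]

store = ShadowPagedStore(["a", "b"])
store.run({0: "A"}, fail=True)
print(store.read(0))      # a
store.run({1: "B"})
print(store.read(1))      # B
print(len(store.disk))    # 2
```

While a transaction runs, every page it touches exists twice on disk, and the new copies land wherever free pages happen to be, which is where the space and clustering concerns above come from.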




                  Other topics
• against loss of non-volatile storage: dumps
  of the whole database on stable storage.




                  Conclusions
• Write-Ahead Log, for loss of volatile
  storage,
• with incremental updates,
• and checkpoints.
• On recovery: undo uncommitted; redo
  committed transactions.

