

									                           UNIVERSITY of WISCONSIN-MADISON
                              Computer Sciences Department

CS 537                                                       Andrea C. Arpaci-Dusseau
Introduction to Operating Systems                             Remzi H. Arpaci-Dusseau
                                                                    Haryadi S. Gunawi

                         Page Replacement
 Questions answered in this lecture:
       Example of Paging, TLB, and VM, all in one
       Page replacement policy
       LRU implementation
       Clock algorithm

            Virtual Memory (Recap)
   • A virtual page can be in memory (in a physical page frame) or
     in a backing store (e.g. the swap area on the disk)
   • Why it works: processes exhibit spatial and temporal locality
   • Extend page table (add present bit)
   • Flow: (see illustration in next slide)
       – An instruction is executed
       – The instruction needs to access a page
       – If present=0 during address translation, then a page fault occurs
           • A page fault is basically a trap to the OS (i.e. the instruction is halted in the
             middle, and the OS gains control to swap in the page)
           • Need HW support to be able to trap to the OS
       – After the trap finishes, the instruction must be re-executed
           • An instruction may have side effects; when the trap returns, the instruction should
             be restarted with its side effects reversed (needs HW support)
2 Decisions the OS must make
   • Page selection (when to bring a page in)
       – Prepaging, demand paging, or combination of the two
       – In practice: demand paging
   • Page replacement (Today)
 Paging, TLB, and Virtual Memory (before)

   P2 Page Table
        VPN #       Frame #    Present
          1            22         1
          0            21         1

   P1 Page Table
        VPN #       Frame #    Present
        1 (Mem)       700         0       (page is in the swap area, Blk #700)
        0 (SP)         20         1

   Physical memory: OS, Frame # 22, Frame # 21, Frame # 20
   Swap area: Blk #700

   Processor: Mov (SP)+ Mem(1100)
        SP           VA 0880
        Mem(1100)    VA 1100

   TLB
        VPN         PPN       Valid
         0           20         1
         1           22         0
 Paging, TLB, and Virtual Memory (after)

   P2 Page Table
        VPN #       Frame #    Present
          1            22         1
          0           710         0       (page is in the swap area, Blk #710)

   P1 Page Table
        VPN #       Frame #    Present
        1 (Mem)        21         1
        0 (SP)         20         1

   Physical memory: OS, Frame # 22, Frame # 21, Frame # 20
   Swap area: Blk #710

   Processor: Reexecute Mov …, and done

   TLB
        VPN         PPN       Valid
         0           20         1
         1           21         1
                                Flow of control
4 components involved: CPU, TLB, Address translator (in MMU), and OS

1) Processor obtains content of SP
     •   Go to TLB: TLB hit!
     •   VA: 0880, PA: 200880
2) Write to Mem(1100)
     •   Go to TLB: TLB miss!
     •   Go to Address Translator (in MMU)
           – Get P1’s PT
           – Check present bit for index 1
           – Present=0 → Page fault! Go to OS (trap to OS)
3) OS performs Page Replacement
     •   Swap-out a page
           –   (for now), pick random, Frame 21
           –   Pick a free block in the swap area (e.g. 710)
           –   Copy content of frame 21 to block 710
           –   Update corresponding P2’s Page Table entry
     •   Swap-in the requested page
           – Copy content of block 700 to frame 21
           – Update P1’s page table entry (index 1)
           – Update TLB entry
4) Processor re-executes instruction
     •   Tricky due to side effects (e.g. mov (SP)+ already increments the SP pointer before the
         page fault occurs)
     •   Need internal support from the CPU to rewind side effects (more in last lecture)
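
The flow above can be sketched as a small simulation. This is only a sketch at the granularity of page numbers (offsets and the actual data copy are left out). The names `PTE`, `translate`, `handle_page_fault`, and `access` are hypothetical, and the victim (P2's VPN 0 in frame 21) and the free swap block (710) are hard-coded to match the slide rather than chosen by a policy.

```python
from dataclasses import dataclass

@dataclass
class PTE:
    location: int   # frame number if present, otherwise swap block number
    present: bool

class PageFault(Exception):
    pass

# State taken from the "before" slide.
page_tables = {
    "P1": {0: PTE(20, True), 1: PTE(700, False)},
    "P2": {0: PTE(21, True), 1: PTE(22, True)},
}
# Only valid TLB entries of the running process (P1) are kept: VPN -> PPN.
tlb = {0: 20}

def translate(pid, vpn):
    if vpn in tlb:                      # TLB hit
        return tlb[vpn]
    pte = page_tables[pid][vpn]         # TLB miss: consult the page table
    if not pte.present:
        raise PageFault(vpn)            # trap to the OS
    tlb[vpn] = pte.location
    return pte.location

def handle_page_fault(pid, vpn, victim_pid, victim_vpn, free_block):
    victim = page_tables[victim_pid][victim_vpn]
    frame = victim.location
    # Swap out: frame contents would be copied to free_block here.
    victim.location, victim.present = free_block, False
    # Swap in: the faulting page's swap block would be copied into the frame.
    pte = page_tables[pid][vpn]
    pte.location, pte.present = frame, True
    tlb[vpn] = frame                    # update the TLB entry

def access(pid, vpn):
    try:
        return translate(pid, vpn)
    except PageFault:
        handle_page_fault(pid, vpn, "P2", 0, 710)
        return translate(pid, vpn)      # re-execute the access

print(access("P1", 0))   # 20 (TLB hit)
print(access("P1", 1))   # 21 (page fault, then re-executed)
```

After the two accesses, the page tables and the TLB match the "after" slide. The sketch does not model rewinding instruction side effects, which is the hard part in real hardware.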

                  Page Replacement
Which page in main memory should be selected as victim?

Random: Replace any page at random
   • Ex: ABCD in memory, E comes in, pick any from ABCD
   • Advantages: Easy to implement
   • Works okay sometimes, why?
       – When memory is not severely over-committed
       – When a process’ memory accesses are random

FIFO: Replace page that has been in memory the longest
   • ABCD in memory, A was swapped in before BCD, E comes in, then A is
     swapped out
   • Intuition: First referenced long time ago, done with it now
   • Advantages:
       – Fair: All pages receive equal residency
       – Easy to implement (circular buffer)
   • Disadvantage:
       – Some pages may always be needed, but FIFO evicts them anyway once they are
         the oldest (arrival order does not necessarily reflect how memory is used)
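
Both policies are easy to simulate. The following is a minimal sketch that counts page faults for a reference string; the function names and the `seed` parameter are hypothetical. It is run on the reference string ABCABDADBCB with three frames, which is the example used later in this lecture.

```python
import random
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults under FIFO replacement."""
    frames = deque()            # oldest resident page at the left
    faults = 0
    for page in refs:
        if page in frames:
            continue            # hit: FIFO order is not updated on a reference
        faults += 1
        if len(frames) == nframes:
            frames.popleft()    # evict the page that was brought in first
        frames.append(page)
    return faults

def random_faults(refs, nframes, seed=0):
    """Count page faults when the victim is picked at random."""
    rng = random.Random(seed)
    frames = []
    faults = 0
    for page in refs:
        if page in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            frames[rng.randrange(nframes)] = page   # overwrite a random frame
        else:
            frames.append(page)
    return faults

print(fifo_faults("ABCABDADBCB", 3))    # 7
print(random_faults("ABCABDADBCB", 3))  # depends on the seed
```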
     Page Replacement Continued
OPT: Replace page not used for longest time in future
   • Ex: ABCD in memory, E comes in, and future accesses are
     AAABBBCCCDDD, then throw away D
   • Advantages: Guaranteed to minimize number of page faults
   • Disadvantages: Requires that OS predict the future
      – Not practical, but good for comparison

LRU: Replace page not used for longest time in past
   • Ex: AAAAABBCCD, and then E comes, A is thrown out
   • Intuition: Use past to predict the future
      – But does not account for frequency (e.g. although A was frequently accessed
        in the past, A is the one that has not been used for the longest time)
   • Advantages:
      – With locality, LRU approximates OPT
   • Disadvantages:
      – Harder to implement, must track which pages have been accessed
      – Does not handle all workloads well
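
OPT and LRU can be sketched the same way. This is an illustrative simulation rather than how an OS would implement either policy: OPT looks ahead in the reference string, which a real system cannot do. The function names are hypothetical. The reference string ABCABDADBCB with three frames is the example from this lecture.

```python
def lru_faults(refs, nframes):
    """Count page faults under LRU replacement."""
    frames = []                 # least recently used page at index 0
    faults = 0
    for page in refs:
        if page in frames:
            frames.remove(page)         # hit: will be re-added as most recent
        else:
            faults += 1
            if len(frames) == nframes:
                frames.pop(0)           # evict the least recently used page
        frames.append(page)
    return faults

def opt_faults(refs, nframes):
    """Count page faults under OPT (needs the whole future reference string)."""
    frames = []
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            future = refs[i + 1:]

            def next_use(p):
                # Distance to the next reference; "never again" sorts last.
                return future.index(p) if p in future else len(future)

            frames.remove(max(frames, key=next_use))
        frames.append(page)
    return faults

print(lru_faults("ABCABDADBCB", 3))  # 5
print(opt_faults("ABCABDADBCB", 3))  # 5
print(lru_faults("AAAAABBCCD" + "E", 4))  # 5: A, B, C, D, E each fault once
```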
Page Replacement Example (with Answer)
Page reference string: ABCABDADBCB
Three pages of physical memory
  • Which pages are in memory? And which access causes a page fault?
  • In the tables below, a letter marks a page fault (the page is loaded into
    that frame) and a "." marks a hit

FIFO: 7 page faults
      A B C A B D A D B C B
  1   A     .   D   .   C
  2     B     .   A
  3       C           B   .

OPT: 5 page faults
      A B C A B D A D B C B
  1   A     .     .     C
  2     B     .       .   .
  3       C     D   .
  (at the last C, either A in frame 1 or D in frame 3 can be replaced)

LRU: 5 page faults
      A B C A B D A D B C B
  1   A     .     .     C
  2     B     .       .   .
  3       C     D   .
   Page Replacement Comparison
Add more physical memory: what happens to the number of page faults?
  • LRU, OPT: Add more memory, guaranteed to have
    fewer (or same number of) page faults
      – The pages held in a smaller memory are guaranteed to be a
        subset of the pages held in a larger memory
  • FIFO: Add more memory, usually have fewer page faults
     – Belady’s anomaly: May actually have more page faults!
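
Belady's anomaly can be observed with a short simulation. The reference string below (1 2 3 4 1 2 5 1 2 3 4 5) is a commonly used textbook example and is not taken from these slides; the function names are hypothetical. The sketch compares FIFO and LRU when memory grows from three to four frames.

```python
from collections import deque

def fifo_faults(refs, nframes):
    frames = deque()            # oldest resident page at the left
    faults = 0
    for page in refs:
        if page in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            frames.popleft()
        frames.append(page)
    return faults

def lru_faults(refs, nframes):
    frames = []                 # least recently used page at index 0
    faults = 0
    for page in refs:
        if page in frames:
            frames.remove(page)
        else:
            faults += 1
            if len(frames) == nframes:
                frames.pop(0)
        frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3), fifo_faults(refs, 4))  # 9 10  (more memory, more faults)
print(lru_faults(refs, 3), lru_faults(refs, 4))    # 10 8  (more memory, fewer faults)
```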

                   Implementing LRU
Software Perfect LRU
   •   OS maintains ordered list of physical pages by reference time
   •   When page is referenced: Move page to front of list
   •   When need victim: Pick page at back of list
   •   Trade-off: Slow on memory reference, fast on replacement. Why?
        – OS trap for each memory reference
        – On replacement, OS gains control anyway, so just do a little extra work
Hardware Perfect LRU
   •   Associate register with each page
   •   When page is referenced: Store system clock in register
   •   When need victim: Scan through registers to find oldest clock
   •   Trade-off: Fast on memory reference, slow on replacement
       (especially as size of memory grows)
In practice, do not implement Perfect LRU
   • LRU is an approximation anyway, so approximate more
   • Goal: Find an old page, but not necessarily the very oldest
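
The two perfect-LRU designs above can be contrasted in a short sketch. The class names `ListLRU` and `TimestampLRU` are hypothetical, and a counter stands in for the system clock. The list version keeps the most recently used page at the end of an ordered structure (the slide says "front"; the direction is arbitrary), so it does work on every reference and is cheap on replacement. The timestamp version is cheap on every reference but scans all pages on replacement.

```python
import itertools
from collections import OrderedDict

class ListLRU:
    """Software perfect LRU: reorder a list on every reference."""
    def __init__(self):
        self.order = OrderedDict()      # least recently used page first

    def reference(self, page):
        self.order[page] = True
        self.order.move_to_end(page)    # work done on every memory reference

    def victim(self):
        page, _ = self.order.popitem(last=False)   # cheap: take the oldest
        return page

class TimestampLRU:
    """Hardware-style perfect LRU: one 'register' (timestamp) per page."""
    def __init__(self):
        self.clock = itertools.count()  # stand-in for the system clock
        self.stamp = {}

    def reference(self, page):
        self.stamp[page] = next(self.clock)   # cheap: a single store

    def victim(self):
        page = min(self.stamp, key=self.stamp.get)  # scan every register
        del self.stamp[page]
        return page

for policy in (ListLRU(), TimestampLRU()):
    for page in "AAAAABBCCD":
        policy.reference(page)
    print(type(policy).__name__, policy.victim())   # both pick A
```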
                        Clock Algorithm
   • A clock with a hand, page frames around the clock, the hand
     sweeps the pages to find a victim page
   • Keep use (or reference) bit for each page frame
   • When page is referenced: set use bit
   • (Note: since the HW sets the use bit (not the OS), must have HW
     support to implement clock algorithm)
   • Idea:
       – Look for page with use bit cleared (has not been referenced for a while)
       – Never loops infinitely: the hand clears use bits as it sweeps, so it
         eventually reaches a page with use bit 0
   • Algorithm, on page fault:
       – Check use bit
             • 1 → clear use bit, advance hand, and repeat algorithm
             • 0 → replace this page (write to disk if dirty, update the old and new TLB
               entries), advance hand, stop
Example:
   • When system boots? After a while?
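               Clock Algorithm Example
VPN references: A B C D B E B F B G, with three page frames
Each entry is "use bit, VPN" for one PPN (e.g. "1 C" under PPN 3 means
PPN# 3, use bit 1, VPN# C); ".." marks an empty frame

 Step   Event                                PPN 1   PPN 2   PPN 3
   1    (initial state)                      0 ..    0 ..    0 ..
   2    ref A: page comes in                 1 A     0 ..    0 ..
   3    ref B: page comes in                 1 A     1 B     0 ..
   4    ref C: page comes in                 1 A     1 B     1 C
   5    ref D: fault, use bit 1 cleared      0 A     1 B     1 C
   6           use bit 2 cleared             0 A     0 B     1 C
   7           use bit 3 cleared             0 A     0 B     0 C
   8           D comes in, replaces A        1 D     0 B     0 C
   9    ref B: hit, use bit set              1 D     1 B     0 C
  10    ref E: fault, use bit 2 cleared      1 D     0 B     0 C
  11           E comes in, replaces C        1 D     0 B     1 E
  12    ref B: hit, use bit set              1 D     1 B     1 E
  13    ref F: fault, use bit 1 cleared      0 D     1 B     1 E
  14           use bit 2 cleared             0 D     0 B     1 E
  15           use bit 3 cleared             0 D     0 B     0 E
  16           F comes in, replaces D        1 F     0 B     0 E
  17    ref B: hit, use bit set              1 F     1 B     0 E
  18    ref G: fault, use bit 2 cleared      1 F     0 B     0 E
  19           G comes in, replaces E        1 F     0 B     1 G

    Steps 2 to 4: clock is moving slow (e.g. advance 3x in 3 references)
    Steps 5 to 8: clock is moving fast (e.g. advance 3x in 1 reference)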
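
The clock algorithm and the example trace above can be reproduced with a short simulation. This is a minimal sketch under some assumptions: the class name `Clock` and its fields are hypothetical, the hardware's setting of the use bit is modeled inside `reference`, a newly loaded page starts with its use bit set (as in the example), and dirty pages and TLB updates are ignored.

```python
class Clock:
    def __init__(self, nframes):
        self.pages = [None] * nframes   # VPN held by each frame (None = empty)
        self.use = [0] * nframes        # use (reference) bit per frame
        self.hand = 0                   # index of the frame under the hand

    def _advance(self):
        self.hand = (self.hand + 1) % len(self.pages)

    def reference(self, page):
        """Reference a page; return True if it caused a page fault."""
        if page in self.pages:
            self.use[self.pages.index(page)] = 1   # hit: HW sets the use bit
            return False
        # Page fault: sweep until a frame with use bit 0 is found.
        while self.use[self.hand] == 1:
            self.use[self.hand] = 0     # give this page a second chance
            self._advance()
        self.pages[self.hand] = page    # replace the victim
        self.use[self.hand] = 1
        self._advance()
        return True

clock = Clock(3)
faults = sum(clock.reference(p) for p in "ABCDBEBFBG")
print(faults)        # 7
print(clock.pages)   # ['F', 'B', 'G']
print(clock.use)     # [1, 0, 1]
```

The sweep always terminates: after at most one full revolution every use bit has been cleared, so the hand finds a frame with use bit 0.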
