									CSL718 : Memory Hierarchy


       Cache Memories
        6th Feb, 2006


        Anshul Kumar, CSE IITD
          Memory technologies
• Semiconductor (random access)
   – Registers
   – SRAM
   – DRAM
   – FLASH
• Magnetic (random + sequential)
   – FDD
   – HDD
• Optical (random + sequential)
   – CD
   – DVD


          Hierarchical structure

Level      Speed      Size       Cost / bit
CPU
Memory     Fastest    Smallest   Highest
Memory
Memory     Slowest    Biggest    Lowest



System Configuration (e-bay price: Rs. 37,500):
   Processor:    Intel P4 3.2GHz (800FSB) 1024k CPU with Hyper Threading
   CPU Fan:      P4 Heavy Duty Cooling Fan With Heat Sink
   Motherboard:  D915G express chipset 800FSB (up to 3.6GHz support)
   Memory:       1GB DDR400 PC3200 DUAL CHANNEL RAM
   Video Card:   GeForce FX 6200 256MB 16x PCI-e video with TV out
   Hard drive:   160GB 7200RPM UDMA-150 SATA
   CD drive:     52x32x52x16x CDRW + DVD ROM drive
   Floppy drive: Sony 1.44MB 3.5" drive
   Sound:        AC 97 6 ch 5.1 Full duplex digital sound, stereo speakers
   Network:      10/100 RJ45 onboard network (Ethernet, cable or DSL)
   Modem:        56k v92 modem
   Ports:        Six USB 2.0 ports, 1 serial, 1 parallel, 1 microphone jack
   Case:         Black i BOX 522 Mid Tower 400w power supply (front USB)
   Keyboard:     Black PS2 Windows Keyboard
   Mouse:        Black PS2 Scroll Mouse
   Monitor:      17" SAMSUNG 793S MONITOR
  Main Memory for Pentium IV
  DDR (double data rate) DRAM

      Size               Interface    Price
    128 MB                PC-333     Rs. 599
    256 MB                PC-333     Rs. 1,299
      1 GB                PC-333     Rs. 4,999
      1 GB                PC-400     Rs. 5,299

            Disk drives
   Seagate Barracuda 7200 RPM
       Capacity           Price
          40 GB          Rs. 2,999
          80 GB          Rs. 3,499
         120 GB          Rs. 4,499
         160 GB          Rs. 4,799
         200 GB          Rs. 5,500
         250 GB          Rs. 6,999
         300 GB          Rs. 9,900
         400 GB          Rs. 14,950
  Data transfer between levels
[Figure: a processor access to a memory level results in a hit or a miss; data transfer takes place between levels.]
Unit of transfer = block




            Principle of locality
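• Temporal Locality
   – references repeated in time (a recently used location is likely to be used again soon)
• Spatial Locality
   – references repeated in space (locations near a recently used one are likely to be used soon)
   – Special case: Sequential Locality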




     Memory Hierarchy Analysis
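Memory $M_i$:                    $M_1, M_2, \ldots, M_n$
Capacity $s_i$:                  $s_1 < s_2 < \cdots < s_n$
Unit cost $c_i$:                 $c_1 > c_2 > \cdots > c_n$
Total cost $C_{total}$:          $\sum_i c_i \, s_i$
Access time $t_i$:               $\tau_1 + \tau_2 + \cdots + \tau_i$  ($\tau_i$ at level $i$)
                                 $\tau_1 < \tau_2 < \cdots < \tau_n$
Hit ratios $h_i(s_i)$:           $h_1 < h_2 < \cdots < h_n = 1$
Effective time $T_{eff}$:        $\sum_i m_i \, h_i \, t_i = \sum_i m_i \, \tau_i$
Miss before level $i$, $m_i$:    $(1-h_1)(1-h_2) \cdots (1-h_{i-1})$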
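
The two forms of the effective time can be checked numerically. The sketch below is only an illustration: the function name, the three-level hierarchy, and its hit ratios and per-level times are made-up values, not figures from the lecture.

```python
from math import prod

def effective_time(hit_ratios, level_times):
    """Return T_eff computed two ways: sum(m_i*h_i*t_i) and sum(m_i*tau_i).

    hit_ratios[i]  : h_i, with the last level's hit ratio equal to 1
    level_times[i] : the time spent at level i alone
    """
    n = len(hit_ratios)
    # m_i = (1-h_1)...(1-h_{i-1}): probability of missing every earlier level
    miss_before = [prod(1 - h for h in hit_ratios[:i]) for i in range(n)]
    # t_i = cumulative time to reach and access level i
    access_time = [sum(level_times[: i + 1]) for i in range(n)]
    by_hits = sum(m * h * t for m, h, t in zip(miss_before, hit_ratios, access_time))
    by_levels = sum(m * tau for m, tau in zip(miss_before, level_times))
    return by_hits, by_levels

# Hypothetical 3-level hierarchy (cache, main memory, disk), times in cycles
by_hits, by_levels = effective_time([0.95, 0.99, 1.0], [1, 50, 100000])
print(by_hits, by_levels)  # both are about 53.5 cycles
```

Even with a 95% hit ratio at the first level, the rare accesses that reach the slowest level dominate the result in this made-up example.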

                   Cache Types
Instruction | Data | Unified | Split
       Split vs. Unified:
       • Split allows specializing each part
       • Unified allows best use of the capacity
On-chip | Off-chip
       • on-chip : fast but small
       • off-chip : large but slow
Single level | Multi level

                  Cache Policies
•   Placement            what gets placed where?
•   Read                 when? from where?
•   Load                 order of bytes/words?
•   Fetch                when to fetch new block?
• Replacement            which one?
• Write                  when? to where?



          Block placement strategies

       Direct mapped              Set associative      Fully associative
[Figure: an 8-block cache shown three ways: direct mapped (block # 0 to 7), set associative (set # 0 to 3), and fully associative. Each panel has Data, Tag and Search rows.]
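
The three placement strategies differ only in how many frames a given memory block may occupy. The sketch below is a minimal illustration, assuming the usual modulo mapping of block address to set; the function name `candidate_frames` and the block address 12 are hypothetical, while the 8-frame cache and the 4-set arrangement follow the figure.

```python
def candidate_frames(block_addr, num_frames, ways):
    """Cache frames in which a memory block may be placed.

    ways == 1           -> direct mapped
    1 < ways < frames   -> set associative
    ways == num_frames  -> fully associative
    """
    num_sets = num_frames // ways
    set_index = block_addr % num_sets   # assumed modulo mapping
    first = set_index * ways            # frames of a set are stored contiguously
    return list(range(first, first + ways))

print(candidate_frames(12, 8, 1))  # direct mapped: [4]
print(candidate_frames(12, 8, 2))  # 2-way, 4 sets: [0, 1]
print(candidate_frames(12, 8, 8))  # fully associative: [0, 1, 2, 3, 4, 5, 6, 7]
```

The number of candidate frames is also the number of tags that must be searched on each access.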


  Organization/placement policy

Cache:    Set 1 ... Set S
Set:      Sector 1, Sector 2, ..., Sector SE, LRU
Sector:   Tag, Block 1, Block 2, ..., Block B
Block:    VDS, AU 1, AU 2, ..., AU A


                Addressing Cache
Address fields:  Sector Name | Set Index | Block | Displacement
   Sector Name     compared to tags
   Set Index       selects set
   Block           selects block
   Displacement    selects AU


Early select: access data after tag matching
Late select: access data while tag matching

         Cache organization example
[Figure: 8 sets (1 to 8). Each set holds 2 sectors; each sector has a Tag and 2 blocks; each block has V and D bits and 2 AUs.]
      Cache access mechanism
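[Figure: direct-mapped cache with 4096 entries (index 0 to 4095). The 32-bit address (bits 31 to 0) is split into an 18-bit tag, a 12-bit index and a 2-bit byte offset. Each entry holds a valid bit (v), an 18-bit tag and 32 bits of data. The index selects an entry; comparing the stored tag with the address tag gives Hit, and the entry supplies Data.]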
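
The access mechanism in this figure can be sketched in a few lines. This is a minimal model under stated assumptions: the field widths (18-bit tag, 12-bit index, 2-bit byte offset) come from the figure, while the names `split_address` and `DirectMappedCache` are hypothetical and the data array is left out.

```python
TAG_BITS, INDEX_BITS, OFFSET_BITS = 18, 12, 2  # widths from the figure

def split_address(addr):
    """Split a 32-bit byte address into (tag, index, byte_offset)."""
    byte_offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, byte_offset

class DirectMappedCache:
    """4096 one-word entries, each holding (valid, tag); data omitted."""

    def __init__(self):
        self.valid = [False] * (1 << INDEX_BITS)
        self.tags = [0] * (1 << INDEX_BITS)

    def access(self, addr):
        """Return True on a hit; on a miss, install the tag and return False."""
        tag, index, _ = split_address(addr)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            self.valid[index], self.tags[index] = True, tag
        return hit

cache = DirectMappedCache()
print(split_address(0x12345678))   # (18641, 1438, 0)
print(cache.access(0x12345678))    # False: first access misses
print(cache.access(0x12345678))    # True: same word hits
```

Two addresses that share an index but differ in tag evict each other, which is the conflict behaviour that set associativity reduces.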


           Cache with 4 word blocks
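[Figure: cache with 1024 entries (index 0 to 1023) and 4-word blocks. The 32-bit address (bits 31 to 0) is split into an 18-bit tag, a 10-bit index, a 2-bit block offset and a 2-bit byte offset. Each entry holds a valid bit (v), an 18-bit tag and four 32-bit data words. Tag comparison gives Hit; the block offset drives a Mux that selects one word as Data.]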
       4-way set associative cache
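[Figure: 4-way set associative cache with 256 sets (index 0 to 255). The 32-bit address (bits 31 to 0) is split into a 20-bit tag, an 8-bit index, a 2-bit block offset and a 2-bit byte offset. In each set, every way holds a valid bit (v), a 20-bit tag and 128 bits of data, with its own tag comparator and a Mux that selects a 32-bit word. The comparator outputs give Hit and drive a final Mux that selects Data.]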
                   Read policies
• Sequential or concurrent
   – sequential: initiate memory access only after detecting a miss
   – concurrent: initiate memory access along with cache access, in anticipation of a miss
• With or without forwarding
   – without forwarding: give data to CPU after filling the missing block in cache
   – with forwarding: forward data to CPU as it gets filled in cache

                   Read Policies
[Figure: timing diagrams with Cache and Memory rows for each policy; cache accesses are labelled 1 and the memory access T.]

Sequential Simple:    $T_{eff} = (1 - p_m) \cdot 1 + p_m (T + 2)$
Concurrent Simple:    $T_{eff} = (1 - p_m) \cdot 1 + p_m (T + 1)$
Sequential Forward:   $T_{eff} = (1 - p_m) \cdot 1 + p_m (T + 1)$
Concurrent Forward:   $T_{eff} = (1 - p_m) \cdot 1 + p_m \, T$
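
The four expressions for Teff differ only in the extra cycles added to T on a miss, so they are easy to compare side by side. The sketch below is illustrative: the function name `t_eff` and the values chosen for pm and T are assumptions, not numbers from the lecture.

```python
def t_eff(miss_ratio, mem_time, miss_overhead):
    """T_eff = (1 - pm) * 1 + pm * (T + overhead), with a 1-cycle cache hit."""
    return (1 - miss_ratio) * 1 + miss_ratio * (mem_time + miss_overhead)

# Extra cycles added to T on a miss, taken from the four formulas
OVERHEAD = {
    "sequential simple": 2,
    "concurrent simple": 1,
    "sequential forward": 1,
    "concurrent forward": 0,
}

pm, T = 0.05, 20  # hypothetical miss ratio and memory access time in cycles
for policy, extra in OVERHEAD.items():
    print(f"{policy:20s} T_eff = {t_eff(pm, T, extra):.2f}")
```

With these made-up values the results are 2.05, 2.00, 2.00 and 1.95 cycles, so the gain from each refinement scales with pm.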
                       Load policies

Example: 4 AU block (AU 0, 1, 2, 3), cache miss on AU 1

• Block Load
• Load Forward
• Fetch Bypass (wrap around load)
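
The sketch below shows the order in which the AUs of the block would arrive under each load policy. It assumes the common textbook meaning of the three names, which the slide itself does not spell out: block load brings the whole block starting at AU 0, load forward brings only the AUs from the missed one onward, and fetch bypass starts at the missed AU and wraps around. The function name `load_order` is hypothetical.

```python
def load_order(block_size, missed_au, policy):
    """Order in which the AUs of a block are brought into the cache."""
    if policy == "block load":      # whole block, starting at AU 0
        return list(range(block_size))
    if policy == "load forward":    # from the missed AU to the end of the block
        return list(range(missed_au, block_size))
    if policy == "fetch bypass":    # start at the missed AU, then wrap around
        return [(missed_au + k) % block_size for k in range(block_size)]
    raise ValueError(f"unknown policy: {policy}")

for policy in ("block load", "load forward", "fetch bypass"):
    print(policy, load_order(4, 1, policy))
```

For the 4 AU block with a miss on AU 1, this gives [0, 1, 2, 3], [1, 2, 3] and [1, 2, 3, 0] respectively.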
                  Fetch Policies
• Fetch on miss (demand fetching)
• Software prefetching
• Hardware Prefetching




                  Fetch Policies
• Demand fetching
   – fetch only when required (miss)
• Hardware prefetching
   – automatically prefetch next block
• Software prefetching
   – programmer decides to prefetch
   – questions: how much ahead (prefetch distance)? how often?
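
The effect of prefetching the next block can be seen with a small miss counter. This is a simplified sketch under stated assumptions: the cache never evicts, the prefetch is triggered only on a miss and reaches one block ahead, and the function name `count_misses` and the access pattern are hypothetical.

```python
def count_misses(block_refs, prefetch_next=False):
    """Count misses for a sequence of block numbers in a cache that never evicts.

    With prefetch_next=True, a miss on block b also brings in block b + 1.
    """
    present, misses = set(), 0
    for b in block_refs:
        if b not in present:
            misses += 1
            present.add(b)
            if prefetch_next:
                present.add(b + 1)
    return misses

refs = [word // 4 for word in range(64)]  # 64 sequential words, 4 words per block
print(count_misses(refs))                      # 16: one miss per block
print(count_misses(refs, prefetch_next=True))  # 8: every other block is prefetched
```

For a purely sequential pattern the one-block lookahead halves the misses in this model; a real prefetcher must also finish the transfer before the block is needed, which is where the prefetch distance matters.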
    Software Control of Cache
Software visible cache
   –   mode selection (WT, WB etc)
   –   block flush
   –   block invalidate
   –   block prefetch




          Replacement Policies
•   Least Recently Used (LRU)
•   Least Frequently Used (LFU)
•   First In First Out (FIFO)
•   Random
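
The four policies can be compared on a small reference string. The sketch below is a minimal model, assuming a fully associative cache and counting misses only; the function name `simulate`, the tie-breaking rule for LFU (earliest inserted block) and the reference string are assumptions made for illustration.

```python
import random
from collections import OrderedDict

POLICIES = ("LRU", "LFU", "FIFO", "Random")

def simulate(refs, capacity, policy, seed=0):
    """Count misses in a fully associative cache holding `capacity` blocks."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    rng = random.Random(seed)
    cache = OrderedDict()  # block -> use count; order = arrival (or recency for LRU)
    misses = 0
    for b in refs:
        if b in cache:
            cache[b] += 1
            if policy == "LRU":
                cache.move_to_end(b)       # most recently used goes last
            continue
        misses += 1
        if len(cache) == capacity:
            if policy in ("LRU", "FIFO"):
                cache.popitem(last=False)  # least recent (LRU) or oldest arrival (FIFO)
            elif policy == "LFU":
                del cache[min(cache, key=cache.get)]  # lowest use count
            else:
                del cache[rng.choice(list(cache))]    # Random
        cache[b] = 1
    return misses

refs = [1, 2, 3, 1, 4, 1, 5, 2]
for policy in ("LRU", "LFU", "FIFO"):
    print(policy, simulate(refs, 3, policy))  # LRU 6, LFU 6, FIFO 7
print("Random", simulate(refs, 3, "Random"))
```

On this string FIFO evicts the frequently reused block 1 simply because it arrived first, which costs it one more miss than LRU or LFU.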




                  Write Policies
• Write Hit
   – Write Back
   – Write Through
• Write Miss
   – Write Back
   – Write Through (with or without Write Allocate)
Buffers are used in all cases to hide latencies
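
The difference in memory traffic between the write policies can be sketched with a simple counter. This is a simplified model under stated assumptions: the cache never evicts, dirty blocks under write back are written once in a final flush, a write miss without write allocate goes straight to memory, and buffers are not modelled. The function name `memory_traffic` and the operation list are hypothetical.

```python
def memory_traffic(ops, policy, write_allocate=True):
    """Return (block_fetches, memory_writes) for a cache that never evicts.

    ops    : sequence of ("r" or "w", block_number)
    policy : "WT" (write through) or "WB" (write back)
    """
    if policy not in ("WT", "WB"):
        raise ValueError(f"unknown policy: {policy}")
    cached, dirty = set(), set()
    fetches = writes = 0
    for op, block in ops:
        hit = block in cached
        if op == "r":
            if not hit:
                fetches += 1
                cached.add(block)
            continue
        if not hit and write_allocate:  # write miss: bring the block in first
            fetches += 1
            cached.add(block)
            hit = True
        if policy == "WT" or not hit:
            writes += 1                 # the write goes to memory now
        else:
            dirty.add(block)            # WB: defer until the block leaves the cache
    return fetches, writes + len(dirty)

ops = [("w", 0)] * 10 + [("r", 1)]
print(memory_traffic(ops, "WT"))                        # (2, 10)
print(memory_traffic(ops, "WB"))                        # (2, 1)
print(memory_traffic(ops, "WT", write_allocate=False))  # (1, 10)
```

Repeated writes to one block reach memory every time under write through but only once under write back, at the cost of tracking a dirty bit per block.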



								