					 7-1                                                                                 Chapter 7 - Memory




                     Computer Architecture and
                          Organization
                          Miles Murdocca and Vincent Heuring




                                                  Chapter 7 – Memory



Computer Architecture and Organization by M. Murdocca and V. Heuring   © 2007 M. Murdocca and V. Heuring




                                      Chapter Contents
         7.1 The Memory Hierarchy
         7.2 Random-Access Memory
         7.3 Memory Chip Organization
         7.4 Case Study: Rambus Memory
         7.5 Cache Memory
         7.6 Virtual Memory
         7.7 Advanced Topics
         7.8 Case Study: Associative Memory in Routers
         7.9 Case Study: The Intel Pentium 4 Memory System








       The Memory
        Hierarchy






        Functional Behavior of a RAM Cell




   Static RAM cell (a) and dynamic RAM cell (b).


                    Simplified RAM Chip Pinout








     A Four-Word
     Memory with
     Four Bits per
     Word in a 2D
     Organization







         A Simplified Representation of the
            Four-Word by Four-Bit RAM






       2-1/2D Organization of a 64-Word by
                  One-Bit RAM






   Two Four-Word by Four-Bit RAMs are
     Used in Creating a Four-Word by
              Eight-Bit RAM






Two Four-Word by Four-Bit RAMs Make
  up an Eight-Word by Four-Bit RAM







   Dual In-Line
      Memory
      Module

 • 256 MB dual in-line
   memory module organized
   for a 64-bit word with 16
   16M × 8-bit RAM chips
   (eight chips on each side
   of the DIMM).
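
The capacity and word width quoted above can be checked with a few lines of arithmetic. This is only a sketch: the constant names are illustrative, and it assumes that the eight chips on one side of the module are read in parallel to supply the 64-bit word.

```python
# Figures taken from the slide text; names are illustrative only.
CHIPS = 16                     # RAM chips on the module
WORDS_PER_CHIP = 16 * 2**20    # "16M" addressable locations per chip
BITS_PER_CHIP_WORD = 8         # each chip is 8 bits wide
CHIPS_PER_SIDE = 8             # assumption: one side supplies one full word

total_bits = CHIPS * WORDS_PER_CHIP * BITS_PER_CHIP_WORD
total_mb = total_bits // 8 // 2**20                    # bits -> bytes -> MB
module_word_bits = CHIPS_PER_SIDE * BITS_PER_CHIP_WORD

print(f"{total_mb} MB module, {module_word_bits}-bit word")
```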








  Dual In-Line
     Memory
     Module

• Schematic diagram of
  256 MB dual in-line
  memory module.
  (Source: adapted from
  http://www-s.ti.com/sc/ds/tm4en64kpu.pdf.)






      A ROM Stores Four Four-Bit Words






   A Lookup Table (LUT) Implements an
             Eight-Bit ALU






                                             Flash Memory




   • (a) External view of flash memory module and (b) flash module
     internals. (Source: adapted from HowStuffWorks.com.)


            Cell Structure for Flash Memory




 • When a sufficient negative charge is placed on the dielectric material,
   it prevents current from flowing from source to drain when the word
   line is selected. This is the logical 0 state. When the dielectric
   material is not charged, current flows between the bit and word lines,
   which is the logical 1 state.


                                      Rambus Memory
        • Comparison of DRAM and RDRAM configurations.






                                      Rambus Memory
        • Rambus technology on the Nintendo 64 motherboard (left)
          enables cost savings over the conventional Sega Saturn
          motherboard design (right).




      • Nintendo 64 game console:






           Placement of Cache Memory in a
                 Computer System




   • The locality principle: a recently referenced memory location is likely to
     be referenced again (temporal locality); a neighbor of a recently
     referenced memory location is likely to be referenced (spatial locality).




 An Associative Mapping Scheme for a
           Cache Memory






                Associative Mapping Example
  • Consider how an access to memory location (A035F014)_16 is mapped to
    the cache for a 2^32-word memory. The memory is divided into 2^27 blocks
    of 2^5 = 32 words per block, and the cache consists of 2^14 slots:




     • If the addressed word is in the cache, it will be found in word (14)_16 of a
        slot that has tag (501AF80)_16, which is made up of the 27 most
        significant bits of the address. If the addressed word is not in the cache,
        then the block corresponding to tag field (501AF80)_16 is brought into an
        available slot in the cache from the main memory, and the memory
        reference is then satisfied from the cache.
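
The field split for associative mapping can be reproduced with shifts and masks. The sketch below assumes the 27-bit tag and 5-bit word field described above; the function and constant names are illustrative.

```python
ADDRESS_BITS = 32
WORD_BITS = 5                          # 2^5 = 32 words per block
TAG_BITS = ADDRESS_BITS - WORD_BITS    # the remaining 27 bits form the tag

def split_associative(addr):
    """Return (tag, word) for a 32-bit address under associative mapping."""
    word = addr & ((1 << WORD_BITS) - 1)   # low 5 bits select the word
    tag = addr >> WORD_BITS                # high 27 bits identify the block
    return tag, word

tag, word = split_associative(0xA035F014)
print(hex(tag), hex(word))   # 0x501af80 0x14
```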







   Associative Mapping Area Allocation
  • Area allocation for associative mapping scheme based on bits stored:






                               Replacement Policies
  • When there are no available slots in which to place a block, a
    replacement policy is implemented. The replacement policy governs
    the choice of which slot is freed up for the new block.

  • Replacement policies are used for associative and set-associative
    mapping schemes, and also for virtual memory.

  • Least recently used (LRU)

  • First-in/first-out (FIFO)

  • Least frequently used (LFU)

  • Random

  • Optimal (used for analysis only – look backward in time and reverse-
    engineer the best possible strategy for a particular sequence of
    memory references.)
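
To make the difference between two of these policies concrete, the sketch below counts misses for LRU and FIFO on a short reference string. The reference string and the three-slot cache size are hypothetical, chosen only so that the two policies diverge; the function names are illustrative.

```python
from collections import OrderedDict, deque

def count_misses_lru(refs, n_slots):
    """Count misses when the least recently used block is evicted."""
    cache = OrderedDict()              # keys ordered from least to most recent
    misses = 0
    for block in refs:
        if block in cache:
            cache.move_to_end(block)   # a hit refreshes the block's recency
        else:
            misses += 1
            if len(cache) == n_slots:
                cache.popitem(last=False)   # evict the least recently used
            cache[block] = True
    return misses

def count_misses_fifo(refs, n_slots):
    """Count misses when the oldest loaded block is evicted."""
    queue = deque()                    # load order, oldest on the left
    resident = set()
    misses = 0
    for block in refs:
        if block not in resident:      # a hit does not change the load order
            misses += 1
            if len(queue) == n_slots:
                resident.discard(queue.popleft())
            queue.append(block)
            resident.add(block)
    return misses

refs = [0, 1, 2, 0, 3, 0, 1]           # hypothetical block reference string
print(count_misses_lru(refs, 3), count_misses_fifo(refs, 3))   # 5 6
```

On this particular string FIFO evicts block 0 just before it is reused, which costs one extra miss; other reference strings can favor either policy.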



    A Direct Mapping Scheme for Cache
                 Memory






                           Direct Mapping Example
  • For a direct mapped cache, each main memory block can be mapped to
    only one slot, but each slot can receive more than one block. Consider
    how an access to memory location (A035F014)_16 is mapped to the
    cache for a 2^32-word memory. The memory is divided into 2^27 blocks of
    2^5 = 32 words per block, and the cache consists of 2^14 slots:




• If the addressed word is in the cache, it will be found in word (14)_16 of slot
   (2F80)_16, which will have a tag of (1406)_16.
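
The same address can be split for direct mapping in a few lines. The sketch assumes the layout implied by the example: a 5-bit word field, a 14-bit slot field (2^14 slots), and the remaining 13 bits as the tag. The names are illustrative.

```python
WORD_BITS = 5     # 32 words per block
SLOT_BITS = 14    # 2^14 slots in the cache

def split_direct(addr):
    """Return (tag, slot, word) for a 32-bit address under direct mapping."""
    word = addr & ((1 << WORD_BITS) - 1)
    slot = (addr >> WORD_BITS) & ((1 << SLOT_BITS) - 1)
    tag = addr >> (WORD_BITS + SLOT_BITS)     # the 13 most significant bits
    return tag, slot, word

tag, slot, word = split_direct(0xA035F014)
print(hex(tag), hex(slot), hex(word))   # 0x1406 0x2f80 0x14
```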







              Direct Mapping Area Allocation
  • Area allocation for direct mapping scheme based on bits stored:






        A Set Associative Mapping Scheme
               for a Cache Memory






         Set-Associative Mapping Example
  • Consider how an access to memory location (A035F014)_16 is mapped to
    the cache for a 2^32-word memory. The memory is divided into 2^27 blocks
    of 2^5 = 32 words per block, there are two blocks per set, and the cache
    consists of 2^14 slots:




  • The leftmost 14 bits form the tag field, followed by 13 bits for the set field,
    followed by five bits for the word field:
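
The field values for this example can be computed directly from the 14/13/5 split described above. This is a sketch with illustrative names, not a reproduction of the missing figure.

```python
WORD_BITS = 5     # 32 words per block
SET_BITS = 13     # 2^14 slots / 2 blocks per set = 2^13 sets

def split_set_associative(addr):
    """Return (tag, set_index, word) for a 32-bit address."""
    word = addr & ((1 << WORD_BITS) - 1)
    set_index = (addr >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = addr >> (WORD_BITS + SET_BITS)      # the leftmost 14 bits
    return tag, set_index, word

tag, set_index, word = split_set_associative(0xA035F014)
print(hex(tag), hex(set_index), hex(word))   # 0x280d 0xf80 0x14
```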






                Set Associative Mapping Area
                         Allocation
• Area allocation for set associative mapping scheme based on bits stored:





               Cache Read and Write Policies





 Hit Ratios and Effective Access Times
  • Hit ratio and effective access time for single level cache:




  • Hit ratios and effective access time for multi-level cache:








               Direct Mapped Cache Example
  • Compute hit ratio and
    effective access time for
    a program that executes
    from memory locations
    48 to 95, and then loops
    10 times from 15 to 31.

  • The direct mapped
    cache has four 16-word
    slots, a hit time of 80 ns,
    and a miss time of 2500
    ns. Load-through is
    used. The cache is
    initially empty.
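
The example can be simulated in a few lines. The sketch below assumes one memory reference per location, inclusive address ranges, and that the 2500 ns miss time covers the whole access under load-through; the function and variable names are illustrative.

```python
SLOTS = 4
WORDS_PER_SLOT = 16
HIT_NS = 80
MISS_NS = 2500

def simulate(addresses):
    """Return (hits, misses) for a direct mapped cache that starts empty."""
    resident = [None] * SLOTS          # block number held by each slot
    hits = misses = 0
    for addr in addresses:
        block = addr // WORDS_PER_SLOT
        slot = block % SLOTS           # direct mapping: one possible slot per block
        if resident[slot] == block:
            hits += 1
        else:
            misses += 1
            resident[slot] = block     # replace whatever the slot held
    return hits, misses

# Locations 48..95 once, then locations 15..31 ten times.
trace = list(range(48, 96)) + list(range(15, 32)) * 10
hits, misses = simulate(trace)
hit_ratio = hits / (hits + misses)
eat_ns = (hits * HIT_NS + misses * MISS_NS) / (hits + misses)
print(hits, misses, round(hit_ratio, 3), round(eat_ns, 1))
```

Under these assumptions the run produces 213 hits and 5 misses out of 218 references, a hit ratio of about 97.7% and an effective access time of about 135.5 ns.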




   Table of Events for Example Program







    Calculation of Hit Ratio and Effective
     Access Time for Example Program






                        Multi-level Cache Memory
As an example, consider a two-level cache in which the L1 hit time is 5 ns,
the L2 hit time is 20 ns, and the L2 miss time is 100 ns. There are 10,000
memory references, of which 90 miss in the L1 cache but hit in the L2 cache
and 10 miss in both caches. Compute the hit ratios of the L1 and L2 caches
and the overall effective access time.

H1 is the ratio of the number of times the accessed word is in the L1 cache
to the total number of memory accesses. There are a total of 90 + 10 = 100
L1 misses, of which 10 are also L2 misses, and so:




                                               (Continued on next slide.)



          Multi-level Cache Memory (Cont'd)
H2 is the ratio of the number of times the accessed word is in the L2 cache
to the number of times the L2 cache is accessed, and so:




 The effective access time is then:




                           = 5.23 ns per access
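
The arithmetic behind this result can be checked with a short sketch. It assumes the reading in which 90 references miss in L1 but hit in L2 and 10 references miss in both caches, since that combination reproduces the 5.23 ns figure; the variable names are illustrative.

```python
T1_NS, T2_NS, T_MISS_NS = 5, 20, 100   # L1 hit, L2 hit, L2 miss times

total = 10_000
l2_hits = 90        # assumption: miss in L1, hit in L2
l2_misses = 10      # miss in both L1 and L2
l1_misses = l2_hits + l2_misses        # every L2 access is an L1 miss
l1_hits = total - l1_misses

h1 = l1_hits / total                   # L1 hit ratio over all accesses
h2 = l2_hits / l1_misses               # L2 hit ratio over L2 accesses only
t_eff = (l1_hits * T1_NS + l2_hits * T2_NS + l2_misses * T_MISS_NS) / total

print(f"H1 = {h1:.0%}, H2 = {h2:.0%}, T_eff = {t_eff:.2f} ns")
```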



                         Neat Little LRU Algorithm
  • A sequence is shown for the Neat Little LRU Algorithm for a cache with
    four slots. Main memory blocks are accessed in the sequence: 0, 2, 3,
    1, 5, 4.
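
The matrix bookkeeping behind this algorithm can be sketched as follows: keep an n-by-n bit matrix, and on every access to slot i set row i to all 1s and then clear column i; the slot whose row is all 0s is the least recently used. The sketch assumes a fully associative four-slot cache in which a missing block is placed in the first all-zero row, which may not match the slot assignment in the missing figure. All names are illustrative.

```python
def touch(matrix, i):
    """Record an access to slot i: set row i to 1s, then clear column i."""
    n = len(matrix)
    for j in range(n):
        matrix[i][j] = 1
    for r in range(n):
        matrix[r][i] = 0

def lru_slot(matrix):
    """Return the index of the first row that is all 0s (the LRU slot)."""
    for i, row in enumerate(matrix):
        if not any(row):
            return i
    raise RuntimeError("no all-zero row; matrix invariant broken")

def run(blocks, n_slots=4):
    """Replay a block reference sequence and return the final slot contents."""
    matrix = [[0] * n_slots for _ in range(n_slots)]
    slots = [None] * n_slots
    for block in blocks:
        if block in slots:
            i = slots.index(block)     # hit: only refresh the recency bits
        else:
            i = lru_slot(matrix)       # miss: reuse the least recently used slot
            slots[i] = block
        touch(matrix, i)
    return slots

print(run([0, 2, 3, 1, 5, 4]))   # [5, 4, 3, 1]
```

With this placement assumption, blocks 0, 2, 3, and 1 fill slots 0 to 3, block 5 then replaces block 0, and block 4 replaces block 2.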





                                      Cache Coherency
 • The goal of cache coherence is to ensure that every cache sees the
   same value for a referenced location, which means making sure that any
   shared operand that is changed is updated throughout the system.

 • This brings us to the issue of false sharing, which reduces cache
   performance when two operands that are not shared between processes
   share the same cache line. The situation is shown below. The problem is
   that each process will invalidate the other’s cache line when writing data
   without a real need, unless the compiler prevents this.






                                                       Overlays
• A partition graph for a program with a main routine and three subroutines:






                                           Virtual Memory
  • Virtual memory is stored in a hard disk image. The physical memory
    holds a small number of virtual pages in physical page frames.

  • A mapping between a virtual and a physical memory:






                                                  Page Table
  • The page table maps between virtual memory and physical memory.





                                Using the Page Table

 • A virtual address is
   translated into a physical
   address:




                                                                       Typical page table entry
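
A minimal sketch of the translation step is given below. It assumes a hypothetical page size of 1024 words (a 10-bit offset) and a simplified page table entry holding only a present bit and a page frame number; the entry layout in the figure may contain more fields. All names are illustrative.

```python
PAGE_BITS = 10    # hypothetical: 1024 words per page

class PageFault(Exception):
    """Raised when the referenced page is not in physical memory."""

def translate(vaddr, page_table):
    """Map a virtual address to a physical address through the page table."""
    page = vaddr >> PAGE_BITS                   # high bits: virtual page number
    offset = vaddr & ((1 << PAGE_BITS) - 1)     # low bits pass through unchanged
    entry = page_table.get(page)
    if entry is None or not entry["present"]:
        raise PageFault(page)                   # the OS would load the page here
    return (entry["frame"] << PAGE_BITS) | offset

# Hypothetical table: page 0 is resident in frame 3, page 1 is on disk.
page_table = {
    0: {"present": True, "frame": 3},
    1: {"present": False, "frame": None},
}
print(translate(5, page_table))   # 3077, i.e. frame 3, offset 5
```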


                  Using the Page Table (cont'd)
• The
  configuration of
  a page table
  changes as a
  program
  executes.

• Initially, the page
   table is empty. In
   the final
   configuration,
   four pages are in
   physical
   memory.




                                             Segmentation
  • A segmented memory allows two users to share the same word
    processor code, with different data spaces:






                                            Fragmentation

• (a) Free area of memory after initialization; (b) after fragmentation;
  (c) after coalescing.






                Translation Lookaside Buffer
  • An example TLB holds 8 entries for a system with 32 virtual pages and
    16 page frames.
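
The role of the TLB can be sketched as a small cache placed in front of the page table. The sizes below follow the example (8 entries, 32 virtual pages, 16 page frames), but the page-to-frame mapping and the least-recently-used eviction are assumptions made for illustration, as are all names.

```python
from collections import OrderedDict

TLB_ENTRIES = 8

class TLB:
    """A tiny translation cache in front of a page table (illustrative)."""

    def __init__(self, page_table):
        self.page_table = page_table   # virtual page -> frame, resident pages only
        self.entries = OrderedDict()   # ordered from least to most recently used
        self.hits = self.misses = 0

    def lookup(self, page):
        if page in self.entries:       # TLB hit: no page table access needed
            self.hits += 1
            self.entries.move_to_end(page)
            return self.entries[page]
        self.misses += 1
        frame = self.page_table[page]  # a KeyError here models a page fault
        if len(self.entries) == TLB_ENTRIES:
            self.entries.popitem(last=False)   # assumption: evict the LRU entry
        self.entries[page] = frame
        return frame

# Hypothetical mapping: 16 of the 32 virtual pages are resident, one per frame.
page_table = {page: page // 2 for page in range(0, 32, 2)}

tlb = TLB(page_table)
frames = [tlb.lookup(page) for page in (4, 4, 6, 4)]
print(frames, tlb.hits, tlb.misses)   # [2, 2, 3, 2] 2 2
```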






                             Putting it All Together






            Content Addressable Memory –
                     Addressing
      • Relationships between random access memory and content
        addressable memory:






                                      Overview of CAM

• Source: (Foster, C. C., Content Addressable Parallel Processors,
  Van Nostrand Reinhold Company, 1976.)






            Addressing Subtrees for a CAM






            Associative Memory in Routers

                              • A simple network with
                                three routers.



 • The use of associative
   memories in high-end routers
   reduces the lookup time by
   allowing a search to be performed in a single operation.

 • The search is based on the destination address, rather than the
   physical memory address.

 • Access methods for this memory have been standardized into an
    interface interoperability agreement by the Network Processing Forum.
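
The contrast between address-based and content-based access can be sketched in software. In the sketch below the loop stands in for comparisons that an associative memory would perform on all entries at once; the routing entries, port names, and function names are hypothetical, and the match is exact, which is a simplification.

```python
def ram_read(memory, address):
    """Random access memory: the caller supplies a location, gets its contents."""
    return memory[address]

def cam_search(entries, key):
    """Content addressable memory: the caller supplies a key and gets the data
    stored with the matching entry. Hardware compares every entry in a single
    operation; this loop only models the result, not the timing."""
    for stored_key, data in entries:
        if stored_key == key:
            return data
    return None                        # no entry matched

# Hypothetical forwarding entries: (destination address, outgoing port).
routes = [
    ("10.0.0.1", "port 0"),
    ("10.0.0.2", "port 1"),
    ("10.0.1.7", "port 2"),
]

print(ram_read(routes, 1))             # lookup by location
print(cam_search(routes, "10.0.1.7"))  # lookup by destination address
```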



        Block Diagram of Dual-Read RAM
 • A dual-read or dual-port
   RAM allows any two
   words to be
   simultaneously read
   from the same memory.







    The Intel Pentium 4 Memory System



