Computer Organization and Architecture - PowerPoint

Chapter 2

Computer Evolution and Performance
 Designing for Performance
 Performance Measurement
 The basic building blocks of today's computers are the same as those of the
 IAS computer.

 But many tricks have been invented to improve performance:
 Pipelining
 On board cache
 On board L1 & L2 cache
 Branch prediction
 Data flow analysis
 Speculative execution
 Processor speed increased
 Memory capacity increased
 Memory speed lags behind processor speed (see the figure)
   Increase number of bits retrieved at one time
     Make DRAM “wider” rather than “deeper”
     Using wide data path
   Change DRAM interface to make it more efficient
    by including a cache
   Reduce frequency of memory access
     By incorporating more complex caches and more cache on
      the processor chip
   Increase interconnection bandwidth between
    processor and memory
     By using higher speed buses
     Using hierarchy of buses
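
The effect of a "wider" data path can be sketched with a small calculation. This is only an illustration: the function name and the figures (32 vs. 64 bits, 100 million transfers per second) are assumptions chosen for the example, not values from the slides, and the result is a peak figure that ignores latency and contention.

```python
def peak_bandwidth_bytes_per_s(width_bits: int, transfers_per_s: float) -> float:
    """Peak bytes moved per second over a data path of the given width."""
    return (width_bits / 8) * transfers_per_s

# Hypothetical figures, chosen only for illustration.
narrow = peak_bandwidth_bytes_per_s(width_bits=32, transfers_per_s=100e6)
wide = peak_bandwidth_bytes_per_s(width_bits=64, transfers_per_s=100e6)

print(narrow)         # 400000000.0 bytes/s
print(wide)           # 800000000.0 bytes/s
print(wide / narrow)  # 2.0: doubling the width doubles the peak bandwidth
```

At the same transfer rate, retrieving twice as many bits at a time doubles the peak bandwidth, which is the point of making DRAM wider rather than deeper.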
The main goal is to increase CPU speed
1.   Increase hardware speed of processor
     Shrinking the size of logic gates
     More gates can be packed together more tightly,
      increasing the clock rate
     Propagation time for signals reduced
     Increase in clock rate means operations execute more rapidly
2.   Increase size and speed of caches
     Dedicating part of processor chip to cache memory
     Cache access times drop significantly
3.   Change processor organization and architecture
     In a way that increases effective speed of execution
     Using parallelism
   Typically two or three levels of cache between
    processor and main memory
   Chip density increased
    More cache memory on chip enabling faster cache access

   Original Pentium chip devoted about 10% of chip
    area to cache
   Pentium 4 devotes about half of the chip area to cache
 Enable parallel execution of instructions
 A pipeline works like an assembly line, enabling
 different stages of execution of different
 instructions to occur at the same time along the pipeline
 Superscalar architecture allows multiple
 pipelines within a single processor so that
 instructions that do not depend on one
 another can be executed in parallel
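
The assembly-line idea can be sketched as a cycle count. This is an idealized model under stated assumptions: every stage takes exactly one clock cycle, there are no stalls or hazards, and in the superscalar case all instructions are independent. The function names and the values of n and k are hypothetical.

```python
import math

def cycles_unpipelined(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through all stages before the next one starts."""
    return n_instructions * n_stages

def cycles_pipelined(n_instructions: int, n_stages: int, n_pipelines: int = 1) -> int:
    """Idealized count: the first group fills the pipeline, then one group
    completes per cycle. n_pipelines > 1 models a superscalar design and
    assumes every instruction is independent of the others."""
    if n_instructions == 0:
        return 0
    groups = math.ceil(n_instructions / n_pipelines)
    return n_stages + groups - 1

n, k = 100, 5  # illustrative values, not from the slides
print(cycles_unpipelined(n, k))   # 500
print(cycles_pipelined(n, k))     # 104
print(cycles_pipelined(n, k, 2))  # 54
```

Real processors fall short of these counts because dependent instructions and branches introduce stalls, which is why techniques such as branch prediction and speculative execution matter.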
   8080
     first general purpose microprocessor
     8 bit data path to memory
     Used in first personal computer, the Altair
   8086
     much more powerful
     16 bit
     instruction cache that prefetches a few instructions
     8088 (a variant of the 8086) used in the first IBM PC
   80286
     16 Mbyte memory addressable instead of just 1 Mbyte
   80386
    • 32 bit
    • Support for multitasking
 80486
    sophisticated powerful cache and instruction pipelining
    built in maths co-processor
 Pentium
    Superscalar technique
    Multiple instructions executed in parallel
 Pentium Pro
    Increased superscalar organization
    Aggressive register renaming
    branch prediction
    data flow analysis
    speculative execution
 Pentium II
   Incorporate MMX technology
   Designed to process graphics, video & audio
 Pentium III
   Incorporate additional floating point instructions
      for 3D graphics
 Pentium 4
   Arabic rather than Roman numerals
   Further floating point and multimedia enhancements
 Itanium
     64 bit organization
 Itanium 2
When we say one computer is faster than
     another, what do we mean?
   The time between the start and the completion of an
    event is response time or execution time
   The total amount of work done in a given time is throughput
   The phrase “machine X is faster than Y” means that the
    response time or execution time is lower on X than Y for
    the given task
   The phrase “the throughput of X is 1.3 times higher
    than Y” means that the number of tasks completed per
    unit time on machine X is 1.3 times the number
    completed on Y
\[
\frac{\text{Execution time}_Y}{\text{Execution time}_X} = n
\]

    This means machine X is n times faster than
    machine Y

\[
n = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = \frac{\text{Performance}_X}{\text{Performance}_Y}
\]
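
The relation between execution time and performance can be checked numerically. The sketch below assumes performance is defined as the reciprocal of execution time; the function names and the timings (10 s and 15 s) are hypothetical.

```python
import math

def times_faster(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that machine X is n times faster than machine Y."""
    return exec_time_y / exec_time_x

def performance(exec_time: float) -> float:
    """Performance taken as the reciprocal of execution time (an assumption)."""
    return 1.0 / exec_time

time_x, time_y = 10.0, 15.0  # seconds, illustrative values only
n = times_faster(time_x, time_y)
print(n)  # 1.5

# The same n falls out of the performance ratio.
print(math.isclose(n, performance(time_x) / performance(time_y)))  # True
```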

 Improve performance = increase performance
 Improve execution time = decrease execution time
 The performance improvement to be
  gained from using some faster mode of
  execution is limited by the fraction of the
  time the faster mode can be used.
 It defines the speedup that can be gained by
  using a particular enhancement.
\[
\text{Speedup}
= \frac{\text{Performance for entire task using the enhancement when possible}}{\text{Performance for entire task without using the enhancement}}
= \frac{\text{Execution time for entire task without using the enhancement}}{\text{Execution time for entire task using the enhancement when possible}}
\]
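\[
\text{Execution time}_{\text{new}}
= \text{Execution time}_{\text{old}} \times \left( (1 - \text{Fraction}_{\text{enhanced}}) + \frac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}} \right)
\]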

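The overall speedup is the ratio of the
 execution times:
\[
\text{Speedup}_{\text{overall}}
= \frac{\text{Execution time}_{\text{old}}}{\text{Execution time}_{\text{new}}}
= \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}
\]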
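
The two formulas above (this relation is commonly known as Amdahl's Law) can be sketched as a pair of small functions. The function names, the input checks, and the sample values (a 100-second task, half of it enhanced by a factor of 2) are assumptions made for illustration.

```python
def execution_time_new(time_old: float, fraction_enhanced: float,
                       speedup_enhanced: float) -> float:
    """New execution time when a fraction of the task runs speedup_enhanced times faster."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must lie in [0, 1]")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    return time_old * ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

def overall_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Ratio of old to new execution time; the old time cancels out."""
    return 1.0 / execution_time_new(1.0, fraction_enhanced, speedup_enhanced)

old = 100.0  # seconds, illustrative value only
new = execution_time_new(old, fraction_enhanced=0.5, speedup_enhanced=2.0)
print(new)                        # 75.0
print(old / new)                  # about 1.333
print(overall_speedup(0.5, 2.0))  # about 1.333, the same ratio
```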
 Example: An enhancement runs 10 times
 faster than the original machine, but it is
 usable 40% of the time. The overall speedup is then:

    fE = 0.4
    sE = 10
     Speedup = 1/((1-0.4) + 0.4/10)
             = 1.56
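
To see how the usable fraction limits the gain, the same example can be repeated with ever faster enhancements. This is a sketch: the helper name and the list of enhancement speedups other than 10 are assumptions added for illustration.

```python
def overall_speedup(f_e: float, s_e: float) -> float:
    """Overall speedup for usable fraction f_e and enhancement speedup s_e."""
    return 1.0 / ((1.0 - f_e) + f_e / s_e)

f_e = 0.4
limit = 1.0 / (1.0 - f_e)  # bound approached as s_e grows without limit

speedups = [overall_speedup(f_e, s_e) for s_e in (2, 10, 100, 1000)]
for s_e, s in zip((2, 10, 100, 1000), speedups):
    print(f"sE = {s_e:4d}: speedup = {s:.4f}")
print(f"upper bound: {limit:.4f}")
```

With fE = 0.4 the overall speedup stays below 1/(1 - 0.4), roughly 1.67, however fast the enhanced mode becomes.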
