Computer Architecture Instruction Set Architecture

					Computer Architecture
           Chapter 4


             Spring 2005
    Department of Computer Science
         Kent State University
             Performance

• In order to evaluate architectural decisions
  we must be able to measure their impact on
  performance
• There are many aspects to the performance
  of a computer system
• Performance cannot be distilled down to a
  single number
             Airplane Example
Airplane      Passenger   Cruising     Cruising      Passenger throughput
              capacity    range (mi)   speed (MPH)   (passengers × MPH)

Boeing 777       375         4630          610             228,750
Boeing 747       470         4150          610             286,700
Concorde         132         4000         1350             178,200
DC-8-50          146         8720          544              79,424
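
The point of the airplane table is that the "best" airplane depends on which metric is chosen. A minimal Python sketch of that comparison follows; the data is copied from the table, while the names `AIRPLANES`, `throughput`, and `best_by` are illustrative and not from the slides.

```python
# Airplane data copied from the table above:
# name -> (passenger capacity, cruising range in miles, cruising speed in MPH)
AIRPLANES = {
    "Boeing 777": (375, 4630, 610),
    "Boeing 747": (470, 4150, 610),
    "Concorde": (132, 4000, 1350),
    "DC-8-50": (146, 8720, 544),
}


def throughput(name):
    """Passenger throughput = capacity x cruising speed (passengers x MPH)."""
    capacity, _range_mi, speed = AIRPLANES[name]
    return capacity * speed


def best_by(metric):
    """Return the airplane name that maximizes the given metric function."""
    return max(AIRPLANES, key=metric)


# Each metric selects a winner independently of the others.
largest = best_by(lambda name: AIRPLANES[name][0])
longest_range = best_by(lambda name: AIRPLANES[name][1])
fastest = best_by(lambda name: AIRPLANES[name][2])
highest_throughput = best_by(throughput)

print("capacity:  ", largest)
print("range:     ", longest_range)
print("speed:     ", fastest)
print("throughput:", highest_throughput)
```

Three different airplanes come out on top across the four metrics, which is the same situation a single performance number would hide for computers.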
      Defining Performance

• Execution time (aka response time) is the
  time it takes to complete a task
  – The lower the execution time the higher the
    performance
• Throughput is the amount of work done in a
  given time
  – The higher the throughput the higher the
    performance
            Measuring Time

• Wall-clock time or response time is the total time
  to complete a task including OS overhead, I/O
  time, etc.
• CPU time is the time the CPU spends working on the
  task, excluding time spent waiting for I/O devices
• User CPU time is the time the CPU spends executing
  the program itself; time spent performing OS tasks
  on its behalf is called system CPU time
• Time can also be measured in clock cycles
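
The difference between wall-clock, user CPU, and system CPU time can be observed directly. The sketch below uses Python's standard `time.perf_counter` and `os.times`; the helper `measure` and the two sample tasks are hypothetical names chosen for illustration, and the resolution of the CPU figures depends on the operating system.

```python
import os
import time


def measure(task):
    """Run task() once; return wall-clock, user CPU, and system CPU seconds."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    task()
    cpu_end = os.times()
    wall_end = time.perf_counter()
    return {
        "wall": wall_end - wall_start,              # total elapsed time
        "user": cpu_end.user - cpu_start.user,      # CPU time in program code
        "system": cpu_end.system - cpu_start.system,  # CPU time in the OS
    }


def busy_loop():
    """CPU-bound task: wall time and user CPU time should be close."""
    total = 0
    for i in range(200_000):
        total += i * i
    return total


def sleepy():
    """Waiting task: wall time passes while almost no CPU time is used."""
    time.sleep(0.05)


print("busy_loop:", measure(busy_loop))
print("sleepy:   ", measure(sleepy))
```

On a typical system the sleeping task shows a wall-clock time near 0.05 s with CPU times near zero, which is the gap between response time and CPU time described above.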
                Clock Cycles

• Clock cycle time (clock period) is the length of
  one clock cycle in seconds (ps, ns)
• Clock rate (clock frequency) is the number of
  clock cycles per second measured in Hertz (GHz,
  MHz)
• Clock rate = 1 / Clock cycle time
   – Make sure units match
• Shortcuts
   – Clock rate (MHz) × Clock cycle time (ns) = 1000
   – Clock rate (GHz) × Clock cycle time (ps) = 1000
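
The unit shortcuts follow directly from Clock rate = 1 / Clock cycle time. A small Python sketch, with illustrative function names that are not part of the slides:

```python
def clock_rate_hz(cycle_time_s):
    """Clock rate in Hz from clock cycle time in seconds."""
    return 1.0 / cycle_time_s


def cycle_time_ns_from_mhz(rate_mhz):
    """Shortcut: Clock rate (MHz) x Clock cycle time (ns) = 1000."""
    return 1000.0 / rate_mhz


def cycle_time_ps_from_ghz(rate_ghz):
    """Shortcut: Clock rate (GHz) x Clock cycle time (ps) = 1000."""
    return 1000.0 / rate_ghz


print(cycle_time_ns_from_mhz(500))   # 500 MHz clock, cycle time in ns
print(cycle_time_ps_from_ghz(4))     # 4 GHz clock, cycle time in ps
print(clock_rate_hz(2e-9))           # 2 ns cycle time, rate in Hz
```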
        CPU Execution Time

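\[
\begin{aligned}
\text{CPU time} &= \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time} \\
                &= \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}
\end{aligned}
\]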

• The instruction count (IC) is the number of
  instructions that the processor executes
• Cycles per instruction (CPI) is the average number
  of clock cycles to execute one instruction
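
As a worked example of the CPU time equation, the Python sketch below evaluates both forms. The workload numbers (10^9 instructions, CPI of 2.0, 2 GHz clock) are made up for illustration.

```python
def cpu_time_from_rate(instruction_count, cpi, clock_rate_hz):
    """CPU time (s) = Instruction count x CPI / Clock rate."""
    return instruction_count * cpi / clock_rate_hz


def cpu_time_from_cycle_time(instruction_count, cpi, cycle_time_s):
    """CPU time (s) = Instruction count x CPI x Clock cycle time."""
    return instruction_count * cpi * cycle_time_s


# Hypothetical workload: 1e9 instructions, average CPI 2.0, 2 GHz clock.
ic = 1e9
cpi = 2.0
rate = 2e9                 # Hz
cycle_time = 1.0 / rate    # seconds (0.5 ns)

print(cpu_time_from_rate(ic, cpi, rate))
print(cpu_time_from_cycle_time(ic, cpi, cycle_time))
```

Both calls give about one second, since the two forms differ only in whether the clock is expressed as a rate or as a period.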
          Instruction Count

• Static instruction count is the number of
  instructions in a stored program
• Dynamic instruction count is the number of
  instructions the processor executes while running
  a program
• For evaluating performance, we always look at the
  dynamic instruction count
• When comparing processors that implement the
  same instruction set and run the same compiled
  program, IC is identical and can be ignored
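
To make the static/dynamic distinction concrete, here is a sketch of a toy interpreter in Python. The five-instruction "program" and its opcodes (`set`, `work`, `dec`, `loop`, `halt`) are invented for this example and do not correspond to any real instruction set.

```python
# A hypothetical toy program: each entry is one stored instruction.
PROGRAM = [
    ("set", 5),    # 0: counter = 5
    ("work",),     # 1: loop body (does nothing in this sketch)
    ("dec",),      # 2: counter -= 1
    ("loop", 1),   # 3: if counter != 0, jump back to instruction 1
    ("halt",),     # 4: stop
]


def run(program):
    """Execute the toy program and return the dynamic instruction count."""
    pc = 0          # index of the next instruction to execute
    counter = 0
    executed = 0    # number of instructions actually executed
    while True:
        op = program[pc]
        executed += 1
        if op[0] == "set":
            counter = op[1]
        elif op[0] == "dec":
            counter -= 1
        elif op[0] == "loop" and counter != 0:
            pc = op[1]
            continue
        elif op[0] == "halt":
            return executed
        pc += 1


static_count = len(PROGRAM)    # instructions in the stored program
dynamic_count = run(PROGRAM)   # instructions executed while running

print("static: ", static_count)
print("dynamic:", dynamic_count)
```

The stored program has 5 instructions, but the loop body runs five times, so 17 instructions are executed; CPU time depends on the second number.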
      Cycles Per Instruction

• CPI is the average number of cycles it takes to
  complete an instruction
• Typically CPI is calculated as a weighted average
  over all the instruction classes



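\[ \text{CPI} = \sum_i \text{CPI}_i \times C_i \]

• CPI_i is the average cycle count of instruction class i;
  C_i is taken here to be the weight of that class (its
  fraction of the dynamic instruction count)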
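
The weighted average can be computed in two equivalent ways, depending on whether each class weight is given as a fraction of executed instructions or as a raw instruction count. The slides do not define C_i explicitly, so both readings are sketched below in Python; the instruction mix is hypothetical.

```python
def average_cpi_from_fractions(classes):
    """classes: list of (cpi_i, fraction_i) pairs; fractions must sum to 1."""
    total_fraction = sum(fraction for _cpi, fraction in classes)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("class fractions must sum to 1")
    return sum(cpi * fraction for cpi, fraction in classes)


def average_cpi_from_counts(classes):
    """classes: list of (cpi_i, count_i) pairs; divides total cycles by total IC."""
    total_cycles = sum(cpi * count for cpi, count in classes)
    total_instructions = sum(count for _cpi, count in classes)
    return total_cycles / total_instructions


# Hypothetical mix: ALU (CPI 1, 50%), load/store (CPI 2, 30%), branch (CPI 3, 20%).
mix_fractions = [(1, 0.5), (2, 0.3), (3, 0.2)]
mix_counts = [(1, 500), (2, 300), (3, 200)]

print(average_cpi_from_fractions(mix_fractions))
print(average_cpi_from_counts(mix_counts))
```

Both calls give an average CPI of about 1.7 for this mix.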
              Benchmarks

• A benchmark is a suite of programs designed to
  measure performance of a computer system
• Synthetic benchmarks try to measure low-level
  performance by repeating short blocks of code
• Benchmarks based on real programs are generally
  much better at predicting performance on real
  workloads
• The SPEC benchmarks are a good example of
  benchmarks based on real programs
                  Speedup

• Speedup tells us how many times faster our
  system is after making some improvement
• That is, a speedup of 2 means the new version is
  twice as fast as the old one


\[ \text{Speedup} = \frac{\text{Time before improvement}}{\text{Time after improvement}} \]
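
A minimal Python sketch of the speedup ratio, using made-up execution times:

```python
def speedup(time_before, time_after):
    """Speedup = time before improvement / time after improvement."""
    if time_before <= 0 or time_after <= 0:
        raise ValueError("execution times must be positive")
    return time_before / time_after


# Hypothetical measurement: a task drops from 10 s to 5 s.
print(speedup(10.0, 5.0))
```

A result above 1 means the new version is faster; a result below 1 means the change made things slower.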
               Amdahl's Law

• Amdahl's law provides a limit on the improvement in
  system performance from an improvement in one part
  of the system
• Demonstrates the law of diminishing returns
• f is the fraction of the computation that is improved
• s is the speedup of the improvement

\[ \text{Speedup} = \frac{1}{(1 - f) + \dfrac{f}{s}} \]
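
The diminishing returns can be seen by holding f fixed and increasing s. The Python sketch below assumes, purely for illustration, that 80% of the computation is improved (f = 0.8).

```python
def amdahl_speedup(f, s):
    """Overall speedup when a fraction f of the computation is sped up by s."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - f) + f / s)


# Hypothetical case: 80% of the computation benefits from the improvement.
f = 0.8
for s in (2, 4, 8, 16, 1000):
    print(f"s = {s:5d}  overall speedup = {amdahl_speedup(f, s):.3f}")

# Upper bound as s grows without limit: 1 / (1 - f)
print("limit:", 1.0 / (1.0 - f))
```

However large s becomes, the overall speedup in this case stays below 1 / (1 - f) = 5, because the unimproved 20% still takes the same time.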
          Amdahl's Law (2)

\[ \text{Speedup} = \frac{\text{Time before improvement}}{\dfrac{\text{Time affected}}{\text{Amount of improvement}} + \text{Time unaffected}} \]

• Alternate form of Amdahl's law based on actual
  execution time instead of fractions
• This is the form used by the book
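
The time-based form can be sketched the same way in Python. The figures below (80 s affected, 20 s unaffected, improvement of 4) are hypothetical and chosen to match a fraction f = 0.8 of a 100 s run.

```python
def time_after_improvement(time_affected, time_unaffected, improvement):
    """Time after = time affected / amount of improvement + time unaffected."""
    return time_affected / improvement + time_unaffected


def amdahl_speedup_from_times(time_affected, time_unaffected, improvement):
    """Speedup = time before improvement / time after improvement."""
    time_before = time_affected + time_unaffected
    return time_before / time_after_improvement(
        time_affected, time_unaffected, improvement
    )


# Hypothetical run: 100 s total, of which 80 s is sped up by a factor of 4.
print(time_after_improvement(80.0, 20.0, 4.0))
print(amdahl_speedup_from_times(80.0, 20.0, 4.0))
```

This gives 40 s after the improvement and a speedup of 2.5, the same answer the fraction-based form gives for f = 0.8 and s = 4.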
 Bad Performance Measures

• Clock rate
• Millions of instructions per second (MIPS)
• Floating-point operations per second
  (FLOPS)
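
The slide lists these measures without explanation. One commonly cited problem with MIPS is that it ignores how many instructions a program needs, so a higher MIPS rating can go with a longer execution time. The Python sketch below illustrates this with two hypothetical compiled versions of the same program on a 1 GHz processor; all numbers are invented for the example.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time (s) = Instruction count x CPI / Clock rate."""
    return instruction_count * cpi / clock_rate_hz


def mips(instruction_count, execution_time_s):
    """MIPS = Instruction count / (Execution time x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


CLOCK_RATE = 1e9  # 1 GHz, hypothetical

# Version A: many simple instructions. Version B: fewer, slower instructions.
time_a = cpu_time(5e9, 1.0, CLOCK_RATE)
time_b = cpu_time(2e9, 2.0, CLOCK_RATE)
mips_a = mips(5e9, time_a)
mips_b = mips(2e9, time_b)

print(f"A: {time_a:.1f} s, {mips_a:.0f} MIPS")
print(f"B: {time_b:.1f} s, {mips_b:.0f} MIPS")
```

Version A has twice the MIPS rating of version B, yet version B finishes first, so ranking by MIPS would pick the slower one.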
           Power Efficiency

• Power usage is quickly becoming an
  important factor in performance
• For devices like cell phones and PDAs, a
  faster processor isn't useful if it significantly
  reduces battery life
• Modern embedded processors are often
  designed to be power-aware

				