# ECE 445 – Computer Organization


```					ECE 445 – Computer Organization

Pipelined Datapath and Control
(Lecture #13)

The slides included herein were taken from the materials accompanying
Computer Organization and Design, 4th Edition, by Patterson and Hennessy,
and were used with permission from Morgan Kaufmann Publishers.
Material to be covered ...

Chapter 4: Sections 5 – 9, 13 – 14

Fall 2010             ECE 445 - Computer Organization   2
Performance of the Single-Cycle MIPS

Example: MIPS Clock Rate

   Determine the clock rate for the MIPS architecture, assuming the following:
      The MIPS is a single-cycle machine
         1 clock cycle per instruction
         CPI = 1
      Access time for memory units = 200 ps
      Operation time for ALU and adders = 100 ps
      Access time for register file = 50 ps

Example: MIPS Clock Rate

Instruction Class                 Functional Units used by the Instruction Class
ALU Instruction        Inst. Fetch      Register            ALU        Register
Load Word              Inst. Fetch      Register            ALU        Memory     Register
Store Word             Inst. Fetch      Register            ALU        Memory
Branch                 Inst. Fetch      Register            ALU
Jump                   Inst. Fetch

Example: MIPS Clock Rate

Instruction Class    Instr. fetch   Register read   ALU      Data memory   Register write   Total
ALU Instruction      200 ps         50 ps           100 ps   0 ps          50 ps            400 ps
Load Word            200 ps         50 ps           100 ps   200 ps        50 ps            600 ps
Store Word           200 ps         50 ps           100 ps   200 ps        0 ps             550 ps
Branch               200 ps         50 ps           100 ps   0 ps          0 ps             350 ps
Jump                 200 ps         0 ps            0 ps     0 ps          0 ps             200 ps

Example: MIPS Clock Rate

    The clock cycle time for a machine with a single clock cycle per
    instruction is determined by the longest instruction.
       In this example, the load word instruction requires 600 ps.
    The clock rate is then
       Clock rate = 1 / Clock cycle time
       Clock rate = 1 / 600 ps ≈ 1.67 GHz

Performance Issues
   Longest delay determines clock period
     Critical path: load word (lw) instruction
     Instruction memory → register file → ALU → data memory → register file
   Not feasible to vary clock period for different instructions
   Violates design principle
     Making the common case fast
   Improve performance by pipelining

How does pipelining work?

§4.5 An Overview of Pipelining
Pipelining Analogy
   Pipelined laundry: overlapping execution
      Parallelism improves performance

   Four loads:
      Speedup = 8/3.5 = 2.3
   Non-stop:
      Speedup = 2n/(0.5n + 1.5) ≈ 4
              = number of stages

Objective:

Keep all stages of the pipeline busy at all times.

Pipelining: Improving Performance

                     Latency      Max. Throughput
Non-Pipelined        2 hours      0.5 loads/hour
Pipelined            2 hours      2 loads/hour

Latency = time from the start of one load to the end of the same load
          (the length of time for each load).
Maximum Throughput = number of loads completed per hour, assuming all
          stages of the pipeline are busy at all times.

Pipelining: Improving Performance

Pipelining improves performance by increasing
instruction throughput, rather than decreasing
execution time of an individual instruction.

The MIPS Pipeline

MIPS Pipeline

      Five stages, one step per stage
–      IF    : Instruction fetch from memory
–      ID    : Instruction decode & register read
–      EX    : Execute operation or calculate address
–      MEM   : Access memory operand
–      WB    : Write result back to register

MIPS Pipeline

Pipeline Performance
     Assume time for stages is
        100 ps for register read or write
        200 ps for other stages
     Compare pipelined datapath with single-cycle datapath

Instr      Instr fetch   Register read   ALU op   Memory access   Register write   Total time
lw         200 ps        100 ps          200 ps   200 ps          100 ps           800 ps
sw         200 ps        100 ps          200 ps   200 ps                           700 ps
R-format   200 ps        100 ps          200 ps                   100 ps           600 ps
beq        200 ps        100 ps          200 ps                                    500 ps

Pipeline Performance
Single-cycle (Tc= 800ps)

Why is the clock period 800ps?

Pipelined (Tc= 200ps)

Why is the clock period 200ps?

Pipeline Speedup
    If all stages are balanced
       i.e., all take the same time
       Time between instructions (pipelined)
          = Time between instructions (nonpipelined) / Number of stages
    If not balanced, speedup is less
    Speedup due to increased throughput
       Latency (time for each instruction) does not decrease

Pipelining and ISA Design
    MIPS ISA designed for pipelining
    All instructions are 32 bits
       Easier to fetch and decode in one cycle
       cf. x86: 1- to 17-byte instructions
    Few and regular instruction formats
       Can decode and read registers in one step
       Can calculate address in 3rd stage, access memory in 4th stage
    Alignment of memory operands
       i.e., on word boundaries
       Memory access takes only one cycle

Pipeline Summary
The BIG Picture
    Pipelining improves performance by increasing instruction throughput
       Executes multiple instructions in parallel
       Each instruction has the same latency
    Subject to hazards (hazards will be discussed in upcoming lectures)
       Structure, data, control
    Instruction set design affects complexity of pipeline implementation

§4.6 Pipelined Datapath and Control
MIPS Pipelined Datapath

Pipeline registers
   Need registers between stages
     Why? To hold information produced in the previous cycle

Pipeline Operation
    Cycle-by-cycle flow of instructions through the
pipelined datapath
    “Single-clock-cycle” pipeline diagram
   Shows pipeline usage in a single cycle
   Highlight resources used
    “Multi-clock-cycle” diagram
   Graph of operation over time

    We’ll look at “single-clock-cycle” diagrams for

Wrong register number. Why?

EX for Store

MEM for Store

WB for Store

Multi-Cycle Pipeline Diagram
    Form showing resource usage

Multi-Cycle Pipeline Diagram

Single-Cycle Pipeline Diagram
    State of pipeline in a given cycle


```
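The timing arithmetic in the slides (single-cycle clock rate, pipelined clock period, and the laundry speedup) can be checked with a short Python sketch. The delay values are copied from the two timing tables in the slides. The function and variable names are illustrative and not part of the course material, and the `(stages + n - 1)` cycle count assumes an ideal five-stage pipeline with no hazards or stalls.

```python
# Per-unit delays in picoseconds, copied from the two timing tables in the slides.
# Column order: instruction fetch, register read, ALU, data memory, register write.
CLOCK_RATE_EXAMPLE = {
    "ALU":    [200, 50, 100, 0, 50],
    "Load":   [200, 50, 100, 200, 50],
    "Store":  [200, 50, 100, 200, 0],
    "Branch": [200, 50, 100, 0, 0],
    "Jump":   [200, 0, 0, 0, 0],
}
PIPELINE_EXAMPLE = {
    "lw":       [200, 100, 200, 200, 100],
    "sw":       [200, 100, 200, 200, 0],
    "R-format": [200, 100, 200, 0, 100],
    "beq":      [200, 100, 200, 0, 0],
}


def single_cycle_period(table):
    """Single-cycle clock period: the slowest instruction's total delay."""
    return max(sum(delays) for delays in table.values())


def pipelined_period(table):
    """Pipelined clock period: the slowest single stage of any instruction."""
    return max(max(delays) for delays in table.values())


def clock_rate_ghz(period_ps):
    """Convert a clock period in picoseconds to a clock rate in GHz."""
    return 1000.0 / period_ps


def pipelined_time(n, period_ps, stages=5):
    """Time for n instructions in an ideal pipeline (assumes no stalls)."""
    return (stages + n - 1) * period_ps


def laundry_speedup(n):
    """Laundry analogy: 2 hours per load sequentially vs. 0.5-hour stages."""
    return 2.0 * n / (0.5 * n + 1.5)


if __name__ == "__main__":
    period = single_cycle_period(CLOCK_RATE_EXAMPLE)
    print(period, "ps ->", round(clock_rate_ghz(period), 2), "GHz")
    print("single-cycle Tc:", single_cycle_period(PIPELINE_EXAMPLE), "ps")
    print("pipelined Tc:", pipelined_period(PIPELINE_EXAMPLE), "ps")
    print("laundry speedup, 4 loads:", round(laundry_speedup(4), 1))
```

Under these assumptions the sketch reproduces the slide values: a 600 ps period (about 1.67 GHz) for the first example, Tc = 800 ps single-cycle versus Tc = 200 ps pipelined for the second, and a laundry speedup of about 2.3 for four loads that approaches 4 as the number of loads grows.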