ECE 445 – Computer Organization





           Pipelined Datapath and Control
                                (Lecture #13)




    The slides included herein were taken from the materials accompanying
  Computer Organization and Design, 4th Edition, by Patterson and Hennessy,
     and were used with permission from Morgan Kaufmann Publishers.
            Material to be covered ...




            Chapter 4: Sections 5 – 9, 13 – 14




Fall 2010             ECE 445 - Computer Organization   2
            Performance of the Single-Cycle MIPS




            Example: MIPS Clock Rate

   Determine the clock rate for the MIPS
    architecture, assuming the following:
               The MIPS is a Single Cycle Machine
                     1 clock cycle per instruction
                     CPI = 1
               Access time for memory units = 200 ps
               Operation time for ALU and adders = 100 ps
               Access time for register file = 50 ps

            Example: MIPS Clock Rate


   Instruction Class                 Functional Units used by the Instruction Class
ALU Instruction        Inst. Fetch      Register            ALU        Register
Load Word              Inst. Fetch      Register            ALU        Memory     Register
Store Word             Inst. Fetch      Register            ALU        Memory
Branch                 Inst. Fetch      Register            ALU
Jump                   Inst. Fetch




            Example: MIPS Clock Rate


 Instruction Class    Instr. Memory   Register Read   ALU Operation   Data Memory   Register Write   Total
 ALU Instruction      200 ps          50 ps           100 ps          0 ps          50 ps            400 ps
 Load Word            200 ps          50 ps           100 ps          200 ps        50 ps            600 ps
 Store Word           200 ps          50 ps           100 ps          200 ps        0 ps             550 ps
 Branch               200 ps          50 ps           100 ps          0 ps          0 ps             350 ps
 Jump                 200 ps          0 ps            0 ps            0 ps          0 ps             200 ps




            Example: MIPS Clock Rate

      The clock cycle time for a machine with a
       single clock cycle per instruction will be
       determined by the longest instruction.
               In this example, the load word instruction
                requires 600 ps.
      The clock rate is then
            Clock rate = 1 / Clock Cycle Time
            Clock rate = 1 / 600 ps = 1.67 GHz
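
The calculation in this example can be reproduced with a short script. This is only a sketch under the slide's assumptions (200 ps memories, 100 ps ALU and adders, 50 ps register file); the names UNITS_USED, DELAY_PS and the two functions are illustrative and do not come from the textbook.

```python
# Unit delays in picoseconds, as assumed in the example.
DELAY_PS = {
    "instr_mem": 200,   # instruction memory access
    "reg_read": 50,     # register file read
    "alu": 100,         # ALU or adder operation
    "data_mem": 200,    # data memory access
    "reg_write": 50,    # register file write
}

# Functional units used by each instruction class (hypothetical table name).
UNITS_USED = {
    "ALU instruction": ["instr_mem", "reg_read", "alu", "reg_write"],
    "Load word": ["instr_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "Store word": ["instr_mem", "reg_read", "alu", "data_mem"],
    "Branch": ["instr_mem", "reg_read", "alu"],
    "Jump": ["instr_mem"],
}


def instruction_delay_ps(instr_class):
    """Sum the delays of every unit the instruction class passes through."""
    return sum(DELAY_PS[unit] for unit in UNITS_USED[instr_class])


def single_cycle_clock():
    """Return (clock period in ps, clock rate in GHz) for a single-cycle machine."""
    period_ps = max(instruction_delay_ps(c) for c in UNITS_USED)
    # 1 / (1 ps) = 1000 GHz, so the rate in GHz is 1000 / period_ps.
    return period_ps, 1000.0 / period_ps


if __name__ == "__main__":
    for instr_class in UNITS_USED:
        print(f"{instr_class:16s} {instruction_delay_ps(instr_class)} ps")
    period, rate = single_cycle_clock()
    print(f"Clock period = {period} ps, clock rate = {rate:.2f} GHz")
```

The slowest class (load word) sets the period, so every other instruction class pays for time it does not need.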

                  Performance Issues
   Longest delay determines clock period
           Critical path: load word (lw) instruction
           Instruction memory → register file → ALU → data
            memory → register file
   Not feasible to vary clock period for different
    instructions
   Violates design principle
           Making the common case fast
   Improve performance by pipelining
            How does pipelining work?




                                                                                          §4.5 An Overview of Pipelining
                      Pipelining Analogy
       Pipelined laundry: overlapping execution
               Parallelism improves performance


                                                         Four loads:
                                                              Speedup
                                                               = 8/3.5 ≈ 2.3
                                                         Non-stop (n loads, n large):
                                                              Speedup
                                                               = 2n / (0.5n + 1.5) ≈ 4
                                                               = number of stages
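
The two speedup figures above can be checked numerically. The sketch below assumes four laundry stages of 0.5 hours each, which is inferred from the slide's numbers (2 hours per load, 8/3.5 for four loads) rather than stated on it; the function names are illustrative.

```python
STAGES = 4          # assumed number of laundry stages
STAGE_HOURS = 0.5   # assumed time per stage, in hours


def sequential_hours(n_loads):
    """Each load finishes completely before the next one starts."""
    return n_loads * STAGES * STAGE_HOURS


def pipelined_hours(n_loads):
    """The first load takes the full latency; each later load adds one stage time."""
    return STAGES * STAGE_HOURS + (n_loads - 1) * STAGE_HOURS


def speedup(n_loads):
    return sequential_hours(n_loads) / pipelined_hours(n_loads)


if __name__ == "__main__":
    for n in (1, 4, 100, 10_000):
        print(f"{n:6d} loads: speedup = {speedup(n):.3f}")
```

With these assumptions pipelined_hours(n) equals 0.5n + 1.5, so the speedup approaches the number of stages only as the number of loads grows.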


                      Objective:

  Keep all stages of the pipeline busy at all times.




  Pipelining: Improving Performance


                                         Latency          Max. Throughput
                 Non-Pipelined            2 hours          0.5 loads/hour
                 Pipelined                2 hours          2 loads/hour

            Latency: the length of time for each load does not change.
            Max. throughput: assumes all stages of the pipeline are busy at all times.


            Latency = time from start of one load to the end of same load.
            Maximum Throughput = # of loads completed per hour.




  Pipelining: Improving Performance



    Pipelining improves performance by increasing
    instruction throughput, rather than decreasing
      execution time of an individual instruction.




            The MIPS Pipeline




                     MIPS Pipeline

      Five stages, one step per stage
     –      IF    : Instruction fetch from memory
     –      ID    : Instruction decode & register read
     –      EX    : Execute operation or calculate address
     –      MEM   : Access memory operand
     –      WB    : Write result back to register




            MIPS Pipeline




                   Pipeline Performance
     Assume time for stages is
           100ps for register read or write
           200ps for other stages
     Compare pipelined datapath with single-cycle
      datapath

Instr      Instr fetch   Register read   ALU op    Memory access   Register write   Total time
lw         200 ps        100 ps          200 ps    200 ps          100 ps           800 ps
sw         200 ps        100 ps          200 ps    200 ps                           700 ps
R-format   200 ps        100 ps          200 ps                    100 ps           600 ps
beq        200 ps        100 ps          200 ps                                     500 ps
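
The table can be turned into a small comparison of the two designs. This is a sketch under the stage times assumed above; the function names and the "fill plus one instruction per cycle" timing model (no hazards or stalls) are simplifying assumptions.

```python
# Stage times in picoseconds, as assumed on this slide.
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

# Stages that do useful work for each instruction type.
STAGES_USED = {
    "lw": ["IF", "ID", "EX", "MEM", "WB"],
    "sw": ["IF", "ID", "EX", "MEM"],
    "R-format": ["IF", "ID", "EX", "WB"],
    "beq": ["IF", "ID", "EX"],
}


def total_time_ps(instr):
    """Time an instruction needs when its stages run back to back."""
    return sum(STAGE_PS[s] for s in STAGES_USED[instr])


def single_cycle_period_ps():
    """The single-cycle clock must fit the slowest instruction."""
    return max(total_time_ps(i) for i in STAGES_USED)


def pipelined_period_ps():
    """The pipelined clock must fit the slowest stage."""
    return max(STAGE_PS.values())


def run_time_ps(n_instructions, pipelined):
    """Time to run n instructions, ignoring hazards and stalls."""
    if pipelined:
        cycles = len(STAGE_PS) + n_instructions - 1
        return cycles * pipelined_period_ps()
    return n_instructions * single_cycle_period_ps()


if __name__ == "__main__":
    print("Single-cycle Tc:", single_cycle_period_ps(), "ps")
    print("Pipelined Tc:   ", pipelined_period_ps(), "ps")
    print("3 instructions: ", run_time_ps(3, False), "ps vs",
          run_time_ps(3, True), "ps")
```

Note that in the pipelined design a register read or write still occupies a full 200 ps cycle even though it needs only 100 ps.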

            Pipeline Performance
                Single-cycle (Tc= 800ps)



                                                    Why is the clock period 800ps?




                 Pipelined (Tc= 200ps)



                                                    Why is the clock period 200ps?




                     Pipeline Speedup
    If all stages are balanced
           i.e., all take the same time
           Time between instructions (pipelined)
            = Time between instructions (nonpipelined) / Number of stages
    If not balanced, speedup is less
    Speedup due to increased throughput
           Latency (time for each instruction) does not
            decrease
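
A quick numeric check of the balanced and unbalanced cases, as a sketch only. The 800 ps and 200 ps values are the clock periods from the pipeline performance example; the function names are illustrative.

```python
def ideal_time_between_instructions(nonpipelined_ps, n_stages):
    """Balanced case: the nonpipelined time divides evenly among the stages."""
    return nonpipelined_ps / n_stages


def steady_state_speedup(nonpipelined_ps, pipelined_ps):
    """Ratio of the times between successive instruction completions."""
    return nonpipelined_ps / pipelined_ps


if __name__ == "__main__":
    nonpipelined_ps = 800   # single-cycle clock period from the example
    actual_ps = 200         # pipelined clock period, set by the slowest stage
    ideal_ps = ideal_time_between_instructions(nonpipelined_ps, 5)
    print("Ideal time between instructions: ", ideal_ps, "ps")
    print("Ideal speedup: ", steady_state_speedup(nonpipelined_ps, ideal_ps))
    print("Actual speedup:", steady_state_speedup(nonpipelined_ps, actual_ps))
```

Because the register stages need only 100 ps but the clock is 200 ps, the stages are not balanced and the steady-state speedup is 4 rather than 5.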

                 Pipelining and ISA Design
    MIPS ISA designed for pipelining
           All instructions are 32 bits
                Easier to fetch and decode in one cycle
                cf. x86: 1- to 17-byte instructions
           Few and regular instruction formats
                Can decode and read registers in one step
           Load/store addressing
                Can calculate address in 3rd stage, access memory in
                 4th stage
           Alignment of memory operands
                i.e. on word boundaries
                Memory access takes only one cycle
                    Pipeline Summary
  The BIG Picture
    Pipelining improves performance by increasing
     instruction throughput
           Executes multiple instructions in parallel
           Each instruction has the same latency
    Subject to hazards (hazards will be discussed in upcoming lectures)
           Structure, data, control
    Instruction set design affects complexity of
     pipeline implementation

                                                          §4.6 Pipelined Datapath and Control
            MIPS Pipelined Datapath




                     Pipeline registers
   Need registers between stages. Why?
           To hold information produced in previous cycle
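
A minimal model of how pipeline registers pass information from one stage to the next on each clock edge. The register names IF/ID, ID/EX, EX/MEM and MEM/WB follow the textbook's naming; the functions clock_edge and run, and the use of an instruction name to stand in for the latched data, are simplifying assumptions.

```python
PIPELINE_REGS = ["IF/ID", "ID/EX", "EX/MEM", "MEM/WB"]


def clock_edge(regs, fetched):
    """Return the pipeline register contents after one rising clock edge.

    Each register captures what the stage on its left produced during the
    cycle that just ended, so all registers update at the same time.
    """
    new_regs = {"IF/ID": fetched}
    for left, right in zip(PIPELINE_REGS, PIPELINE_REGS[1:]):
        new_regs[right] = regs[left]
    return new_regs


def run(program, n_cycles):
    """Clock the pipeline n_cycles times and record the registers after each edge."""
    regs = dict.fromkeys(PIPELINE_REGS)   # all registers start empty (None)
    history = []
    for cycle in range(n_cycles):
        fetched = program[cycle] if cycle < len(program) else None
        regs = clock_edge(regs, fetched)
        history.append(regs)
    return history


if __name__ == "__main__":
    for edge, regs in enumerate(run(["lw", "sub", "add"], 6), start=1):
        print(f"after edge {edge}: {regs}")
```

Without these registers, the value a stage produced for one instruction would be overwritten as soon as the next instruction entered that stage.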




                       Pipeline Operation
    Cycle-by-cycle flow of instructions through the
     pipelined datapath
           “Single-clock-cycle” pipeline diagram
                Shows pipeline usage in a single cycle
                Highlight resources used
           “Multi-clock-cycle” diagram
                Graph of operation over time


    We’ll look at “single-clock-cycle” diagrams for
     load word and store word.
            IF for Load, Store, …




            ID for Load, Store, …




            EX for Load




            MEM for Load




                              WB for Load




             Wrong register number. Why?
            Corrected Datapath for Load
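
One way to see why the write register number must travel with the load is the sketch below. It assumes an ideal five-stage pipeline with no stalls, so that while instruction i is in WB, instruction i + 3 is in ID and its fields sit in the IF/ID register. The program list and the function name are hypothetical.

```python
# (mnemonic, destination register number) pairs; a hypothetical program.
PROGRAM = [("lw", 10), ("sub", 11), ("add", 12), ("lw", 13), ("add", 14)]


def write_register_at_wb(program, index, corrected):
    """Register number presented to the register file when instruction
    `index` (0-based) reaches the WB stage.

    corrected=False: the number comes from the IF/ID register, which by
    then holds the instruction three positions later in the program.
    corrected=True: the number is carried along with the instruction
    through the ID/EX, EX/MEM and MEM/WB pipeline registers.
    """
    if corrected:
        return program[index][1]
    in_id = index + 3   # instruction occupying ID during this WB cycle
    return program[in_id][1] if in_id < len(program) else None


if __name__ == "__main__":
    print("Uncorrected:", write_register_at_wb(PROGRAM, 0, corrected=False))
    print("Corrected:  ", write_register_at_wb(PROGRAM, 0, corrected=True))
```

In the uncorrected case the first load would write its data into register 13, the destination of a later instruction, instead of register 10.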




            EX for Store




            MEM for Store




            WB for Store




            Multi-Cycle Pipeline Diagram
    Form showing resource usage




            Multi-Cycle Pipeline Diagram
    Traditional form
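
A text version of the traditional multi-cycle diagram can be generated with a few lines of code. This sketch assumes an ideal pipeline in which one instruction enters per cycle and nothing stalls; the function name and the example instruction list are illustrative, not taken from the slide's figure.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]
CELL = 5   # column width of one clock cycle, in characters


def pipeline_diagram(instructions):
    """Return a multi-cycle pipeline diagram: one row per instruction,
    one column per clock cycle (CC1, CC2, ...)."""
    n_cycles = len(instructions) + len(STAGES) - 1
    width = max(len(name) for name in instructions)
    header = " " * width + "".join(
        f"{'CC' + str(c):>{CELL}}" for c in range(1, n_cycles + 1))
    rows = [header]
    for i, name in enumerate(instructions):
        cells = [""] * n_cycles
        for s, stage in enumerate(STAGES):
            cells[i + s] = stage   # instruction i enters IF in cycle i + 1
        rows.append(name.ljust(width) + "".join(f"{c:>{CELL}}" for c in cells))
    return "\n".join(rows)


if __name__ == "__main__":
    print(pipeline_diagram(["lw", "sub", "add", "lw", "add"]))
```

Reading down a single column of the output gives the state of the pipeline in that clock cycle.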




        Single-Cycle Pipeline Diagram
    State of pipeline in a given cycle
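
The state of the pipeline in one cycle can be computed directly from the instruction order. As a sketch, this assumes one instruction enters IF per cycle with no stalls; the function name stages_in_cycle is illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def stages_in_cycle(instructions, cycle):
    """Map each stage to the instruction occupying it during the given
    1-based clock cycle, or None if the stage is idle."""
    occupancy = {}
    for s, stage in enumerate(STAGES):
        i = cycle - 1 - s   # instruction i enters IF in cycle i + 1
        occupancy[stage] = instructions[i] if 0 <= i < len(instructions) else None
    return occupancy


if __name__ == "__main__":
    program = ["lw", "sub", "add", "lw", "add"]
    print(stages_in_cycle(program, 5))
```

For a five-instruction program, cycle 5 is the only cycle in which every stage is busy.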




             Questions?





								