					                CPU Design Steps
  1. Analyze instruction set operations using independent
     ISA => RTN => datapath requirements.
  2. Select required datapath components, connections &
     establish clock methodology.
  3. Assemble datapath meeting the requirements.
  4. Analyze the implementation of each instruction to
     determine the setting of control points that effect the
     register transfer.
  5. Design & assemble the control logic.


                                            EECC550 - Shaaban
(Chapter 5.4)                                  #1 Lec # 5 Winter 2003 1-6-2004
Single Cycle MIPS Datapath: CPI = 1, Long Clock Cycle
[Figure: single-cycle MIPS datapath. Jump not included; includes ORI.
 Instruction<31:0> fields: Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>.
 Components: PC, two adders (PC+4 and branch target), PC Ext, instruction
 memory, 32 32-bit registers (Rw, Ra, Rb; busA = R[rs], busB = R[rt], busW),
 extender, main ALU (Zero output), data memory (WrEn, Adr, Data In), muxes.
 Control signals: Branch, PCSrc, RegDst, RegWr, ExtOp, ALUSrc, ALUctr,
 MemWr, MemtoReg.]
 Drawbacks of Single-Cycle Processor
• Long cycle time.
• All instructions must take as much time as the slowest:
   – Cycle time for load is longer than needed for all other
     instructions.
• Real memory is not as well-behaved as idealized memory:
   – Cannot always complete data access in one (short) cycle.
• Cannot pipeline (overlap) the processing of one
  instruction with the previous instructions.
   – (instruction pipelining, chapter 6).

Abstract View of Single Cycle CPU
[Figure: abstract view of the single-cycle CPU. Main control (input: op)
 and ALU control (input: fun) drive the datapath blocks Next PC, PC,
 Instruction Fetch, Register Fetch, Ext, ALU, Mem Access, Data Mem, and
 Reg. Wrt (Result Store).
 Control signals: Branch, Jump, ExtOp, ALUSrc, ALUctr, MemRd, MemWr,
 RegDst, RegWr. Status signal: Equal.]

 One CPU clock cycle: duration C = 8ns
 One instruction per cycle: CPI = 1
Single Cycle Instruction Timing
Arithmetic & Logical:
  PC -> Inst Memory -> Reg File -> mux -> ALU -> mux -> setup
Load (critical path; determines CPU clock cycle, C):
  PC -> Inst Memory -> Reg File -> mux -> ALU -> Data Mem -> mux -> setup
Store:
  PC -> Inst Memory -> Reg File -> mux -> ALU -> Data Mem
Branch:
  PC -> Inst Memory -> Reg File -> cmp -> mux




Clock Cycle Time & Critical Path
[Figure: clock waveform (Clk) marking one CPU clock cycle; duration
 C = 8ns here.]

• Critical path: the slowest path between any two storage devices.
• Clock cycle time is a function of the critical path, and must be
  greater than:
   – Clock-to-Q + Longest Path through the Combinational Logic +
     Setup
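
The bound above can be sketched in a few lines of Python. This is only an illustration: the function name and every delay value below are hypothetical, chosen so the result lands on the 8ns cycle used in these slides.

```python
def min_cycle_time(clk_to_q_ns, logic_paths_ns, setup_ns):
    """Lower bound on the cycle: Clock-to-Q + longest logic path + setup."""
    return clk_to_q_ns + max(logic_paths_ns) + setup_ns


# Hypothetical combinational path delays in ns (not taken from the slides).
paths = {"arith": 5.0, "load": 7.0, "store": 6.0, "branch": 4.0}

critical = max(paths, key=paths.get)             # instruction on the critical path
bound = min_cycle_time(0.5, paths.values(), 0.5)  # assumed Clock-to-Q and setup
print(critical, bound)
```

Only the longest path matters: shortening any other path leaves the bound unchanged.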

 Reducing Cycle Time: Multi-Cycle Design
• Cut combinational dependency graph by inserting registers / latches.
• The same work is done in two or more shorter cycles, rather than one
  long cycle.
[Figure: before => after. Before: one block of acyclic combinational
 logic between two storage elements. After: the logic is split into
 acyclic combinational logic (A) and (B), with an added storage element
 between them.]

Instruction Processing Steps



Instruction Fetch:   Obtain instruction from program storage.
                         Instruction ← Mem[PC]
Next Instruction:    Update program counter to address of next instruction.
                         PC ← PC + 4       (For MIPS)
   (Instruction Fetch and Next Instruction are common steps for all
    instructions.)
Instruction Decode:  Determine instruction type.
                     Obtain operands from registers.
Execute:             Compute result value or status.
Result Store:        Store result in register/memory if needed
                     (usually called Write Back).



Partitioning The Single Cycle Datapath
    Add registers between steps to break into cycles
[Figure: the single-cycle datapath (Next PC, PC, Instruction Fetch,
 Operand Fetch, Exec, Mem Access, Data Mem, Reg. File / Result Store) cut
 into five cycles. Control signals: Branch, Jump, ExtOp, ALUSrc, ALUctr,
 MemRd, MemWr, RegDst, RegWr.]
  1. Instruction Fetch Cycle (IF)
  2. Instruction Decode Cycle (ID)
  3. Execution Cycle (EX)
  4. Data Memory Access Cycle (MEM)
  5. Write back Cycle (WB)

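Example Multi-cycle Datapath
[Figure: multi-cycle datapath with Next PC, PC, Instruction Fetch, IR,
 Reg File, A, B, Ext, ALU, R, Mem Access, Data Mem, M, and Reg. File
 write-back. IR feeds the control unit.
 Control signals: Branch, Jump, ExtOp, ALUSrc, ALUctr, MemRd, MemWr,
 MemToReg, RegDst, RegWr. Status signal: Equal.]
Stage delays:
  Instruction Fetch (IF)     2ns
  Instruction Decode (ID)    1ns
  Execution (EX)             2ns
  Memory (MEM)               2ns
  Write Back (WB)            1ns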
Registers added (register write enable control lines not shown):
IR:   Instruction register.
A, B: Two registers to hold operands read from register file.
R:    (or ALUOut) holds the output of the main ALU.
M:    (or Memory data register, MDR) holds data read from data memory.
CPU Clock Cycle Time: Worst cycle delay = C = 2ns (ignoring MUX, CLK-Q delays)
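
As a rough check, the multi-cycle clock is set by the slowest stage, while a single-cycle clock must cover the whole path. The sketch below uses the stage delays labeled on the datapath slide. It assumes the single-cycle critical path (a load) passes through all five stages and, like the slide, ignores MUX and Clk-Q delays. The variable names are illustrative.

```python
# Stage delays in ns, as labeled on the multi-cycle datapath slide.
STAGE_NS = {"IF": 2, "ID": 1, "EX": 2, "MEM": 2, "WB": 1}

# Multi-cycle: every cycle must be long enough for the slowest stage.
multi_cycle_c = max(STAGE_NS.values())

# Single-cycle (assumption: a load uses all five stages back to back).
single_cycle_c = sum(STAGE_NS.values())

print(multi_cycle_c, single_cycle_c)
```

Under these assumptions the two values are 2ns and 8ns, matching the cycle times quoted in these slides.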
Operations (Dependent RTN) for Each Cycle

                  R-Type         Logic Immediate            Load                     Store                    Branch
IF   Instruction  IR ← Mem[PC]   IR ← Mem[PC]               IR ← Mem[PC]             IR ← Mem[PC]             IR ← Mem[PC]
     Fetch

ID   Instruction  A ← R[rs]      A ← R[rs]                  A ← R[rs]                A ← R[rs]                A ← R[rs]
     Decode       B ← R[rt]                                                          B ← R[rt]                B ← R[rt]

EX   Execution    R ← A + B      R ← A OR ZeroExt(imm16)    R ← A + SignExt(imm16)   R ← A + SignExt(imm16)   Zero ← R[rs] - R[rt]
                                                                                                              If Zero = 1:
                                                                                                                PC ← PC + 4 + (SignExt(imm16) x 4)
                                                                                                              else (i.e. Zero = 0):
                                                                                                                PC ← PC + 4

MEM  Memory                                                 M ← Mem[R]               Mem[R] ← B
                                                                                     PC ← PC + 4

WB   Write        R[rd] ← R      R[rt] ← R                  R[rt] ← M
     Back         PC ← PC + 4    PC ← PC + 4                PC ← PC + 4
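
The per-cycle register transfers in this table can be traced with a small Python sketch. It is a simplified model, not the actual control or datapath: the pre-decoded instruction dictionaries, the `execute` function, and the register and memory contents are all hypothetical, and fetching from instruction memory is skipped (the dictionary plays the role of IR).

```python
def sign_ext16(imm):
    """Sign-extend a 16-bit immediate to a Python int."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def execute(instr, pc, regs, mem):
    """Run one instruction through its cycles; return (new_pc, cycles)."""
    op, imm = instr["op"], instr.get("imm", 0)
    cycles = ["IF"]                               # IR <- Mem[PC]
    a, b = regs[instr["rs"]], regs[instr["rt"]]   # A <- R[rs], B <- R[rt]
    cycles.append("ID")
    cycles.append("EX")
    if op == "beq":
        zero = (a - b) == 0                       # Zero <- R[rs] - R[rt]
        target = pc + 4 + sign_ext16(imm) * 4
        return (target if zero else pc + 4), cycles
    if op == "add":
        r = (a + b) & 0xFFFFFFFF                  # R <- A + B
    elif op == "ori":
        r = a | (imm & 0xFFFF)                    # R <- A OR ZeroExt(imm16)
    elif op in ("lw", "sw"):
        r = (a + sign_ext16(imm)) & 0xFFFFFFFF    # R <- A + SignExt(imm16)
    else:
        raise ValueError(f"unsupported op: {op}")
    if op == "sw":
        cycles.append("MEM")
        mem[r] = b                                # Mem[R] <- B
        return pc + 4, cycles
    if op == "lw":
        cycles.append("MEM")
        r = mem[r]                                # M <- Mem[R]
    cycles.append("WB")
    dest = instr["rd"] if op == "add" else instr["rt"]
    regs[dest] = r                                # R[rd] <- R, R[rt] <- R or R[rt] <- M
    return pc + 4, cycles


# Hypothetical machine state for the example.
regs = [0] * 32
regs[1], regs[2] = 100, 7
mem = {108: 42}
pc, lw_cycles = execute({"op": "lw", "rs": 1, "rt": 3, "imm": 8}, 0, regs, mem)
pc, beq_cycles = execute({"op": "beq", "rs": 3, "rt": 3, "imm": 0xFFFE}, pc, regs, mem)
print(lw_cycles, beq_cycles, pc)
```

The load passes through all five cycles, while the branch finishes in EX, which is where the per-instruction cycle counts come from.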




MIPS Multi-Cycle Datapath:
            Five Cycles of Load
         Cycle 1   Cycle 2   Cycle 3   Cycle 4   Cycle 5
 Load    IF        ID        EX        MEM       WB

 1- Instruction Fetch (IF):
      Fetch the instruction from instruction memory.
 2- Instruction Decode (ID):
      Operand register fetch and instruction decode.
 3- Execute (EX):
      Calculate the effective memory address.
 4- Memory (MEM):
      Read the data from the data memory.
 5- Write Back (WB):
      Write the loaded data to the register file. Update PC.
Multi-cycle Datapath Instruction CPI
 • R-Type/Immediate: Require four cycles, CPI = 4
    –   IF, ID, EX, WB

 • Loads: Require five cycles, CPI = 5
    –   IF, ID, EX, MEM, WB

 • Stores: Require four cycles, CPI = 4
    – IF, ID, EX, MEM

 • Branches/Jumps: Require three cycles, CPI = 3
    – IF, ID, EX
 • Average program CPI: 3 ≤ CPI ≤ 5
   depending on program profile (instruction mix).
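
The average CPI is the per-class cycle counts above weighted by the instruction mix. A minimal Python sketch follows; the mix fractions are made up for illustration and do not come from the lecture.

```python
# Cycles per instruction class, from the list above.
CPI = {"r_type_imm": 4, "load": 5, "store": 4, "branch_jump": 3}


def average_cpi(mix):
    """Weighted-average CPI for a mix given as fractions that sum to 1."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("instruction mix fractions must sum to 1")
    return sum(CPI[kind] * fraction for kind, fraction in mix.items())


# Hypothetical instruction mix, for illustration only.
mix = {"r_type_imm": 0.5, "load": 0.2, "store": 0.1, "branch_jump": 0.2}
avg = average_cpi(mix)
print(round(avg, 3))
```

Whatever the mix, the result stays between 3 (all branches/jumps) and 5 (all loads).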

             Single Cycle Vs. Multi-Cycle CPU

Single Cycle Implementation: Clk period 8ns (125 MHz)
      Cycle 1 (8 ns)      Cycle 2 (8 ns)
      Load                Store, then Waste

Multiple Cycle Implementation: Clk period 2ns (500 MHz)
      Cycle:   1    2    3    4    5    6    7    8    9    10
      Instr:   Load                     Store               R-type
      Step:    IF   ID   EX   MEM  WB   IF   ID   EX   MEM  IF

Single-Cycle CPU:
      CPI = 1, C = 8ns
      One million instructions take
      I x CPI x C = 10^6 x 1 x 8x10^-9 = 8 msec

Multi-Cycle CPU:
      CPI = 3 to 5, C = 2ns
      One million instructions take from
      10^6 x 3 x 2x10^-9 = 6 msec
      to 10^6 x 5 x 2x10^-9 = 10 msec
      depending on instruction mix used.
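
The same arithmetic can be reproduced in Python. The break-even CPI computed at the end is not stated on the slide; it follows from setting the two execution times equal. The function and variable names are illustrative.

```python
def exec_time_ms(instr_count, cpi, cycle_ns):
    """Execution time in milliseconds: I x CPI x C."""
    return instr_count * cpi * cycle_ns / 1e6


I = 10**6
single_ms = exec_time_ms(I, 1, 8)        # single-cycle: CPI = 1, C = 8ns
multi_best_ms = exec_time_ms(I, 3, 2)    # multi-cycle, all 3-cycle instructions
multi_worst_ms = exec_time_ms(I, 5, 2)   # multi-cycle, all 5-cycle instructions

# Multi-cycle wins only while CPI x 2ns < 1 x 8ns, i.e. average CPI < 4.
breakeven_cpi = 8 / 2
print(single_ms, multi_best_ms, multi_worst_ms, breakeven_cpi)
```

With these cycle times, the multi-cycle design is faster only for instruction mixes whose average CPI is below 4.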

Finite State Machine (FSM) Control Model
• State specifies control points for Register Transfer.
• Transfer occurs upon exiting state (falling edge).

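[Figure: FSM controller and a state-diagram fragment. Controller: inputs
 (conditions) feed the next-state logic, which updates the control state;
 output logic derives the outputs (control points) from the control state.
 Diagram: Last State -> State X (register transfer control points) ->
 Next State, where the next state depends on input.]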

                  Control Specification For Multi-cycle CPU
  Finite State Machine (FSM) - State Transition Diagram

  Instruction fetch:        IR ← MEM[PC]
  Decode / operand fetch:   A ← R[rs]
                            B ← R[rt]

  After decode, the path taken depends on the instruction:

                 Execute                   Memory           Write-back
  R-type         R ← A fun B                                R[rd] ← R
                                                            PC ← PC + 4
  ORi            R ← A or ZX                                R[rt] ← R
                                                            PC ← PC + 4
  LW             R ← A + SX                M ← MEM[R]       R[rt] ← M
                                                            PC ← PC + 4
  SW             R ← A + SX                MEM[R] ← B
                                           PC ← PC + 4
  BEQ & ~Zero    PC ← PC + 4
  BEQ & Zero     PC ← PC + 4 + SX || 00

  The last state of every path returns to instruction fetch.

  13 states:
  4 State Flip-Flops needed
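
The figure of 13 states and 4 state flip-flops can be checked with a short Python sketch. It assumes a plain binary state encoding, as in these slides. The helper name and the per-path bookkeeping are illustrative.

```python
def state_flip_flops(num_states):
    """Minimum flip-flops needed for a binary-encoded state register."""
    return max(1, (num_states - 1).bit_length())


# States after decode on each path of the state transition diagram.
path_states = {
    "R-type": 2,        # execute, write-back
    "ORi": 2,           # execute, write-back
    "LW": 3,            # execute, memory, write-back
    "SW": 2,            # execute, memory
    "BEQ & Zero": 1,    # execute
    "BEQ & ~Zero": 1,   # execute
}
total_states = 2 + sum(path_states.values())   # plus shared fetch and decode
print(total_states, state_flip_flops(total_states))
```

Four flip-flops can encode up to 16 states, so three encodings remain unused here.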
     Traditional FSM Controller
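Truth or Transition Table columns:
  state | op | cond | next state | control points

[Figure: next-state and output logic block. Inputs: op and Equal from the
 datapath, plus the current State from the state register. Outputs: next
 State (fed back to the state register) and control points to the
 datapath. Bus widths labeled on the figure: 6, 4, 11.]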
  Traditional FSM Controller

datapath + state diagram => control

• Translate RTN statements into
  control points.
• Assign states.
• Implement the controller.

               Mapping RTNs To Control Points Examples
                        & State Assignments

State  Code  Instruction   Step                     RTN                       Control points shown
  0    0000  all           instruction fetch        IR ← MEM[PC]              imem_rd, IRen
  1    0001  all           decode / operand fetch   A ← R[rs]; B ← R[rt]      Aen, Ben
  2    0010  BEQ & Zero    Execute                  PC ← PC + 4 + SX || 00
  3    0011  BEQ & ~Zero   Execute                  PC ← PC + 4
  4    0100  R-type        Execute                  R ← A fun B               ALUfun, Sen
  5    0101  R-type        Write-back               R[rd] ← R; PC ← PC + 4    RegDst, RegWr, PCen
  6    0110  ORi           Execute                  R ← A or ZX
  7    0111  ORi           Write-back               R[rt] ← R; PC ← PC + 4
  8    1000  LW            Execute                  R ← A + SX
  9    1001  LW            Memory                   M ← MEM[R]
 10    1010  LW            Write-back               R[rt] ← M; PC ← PC + 4
 11    1011  SW            Execute                  R ← A + SX
 12    1100  SW            Memory                   MEM[R] ← B; PC ← PC + 4

States 2, 3, 5, 7, 10 and 12 return to instruction fetch (state 0000).
Detailed Control Specification - State Transition Table

       Current  Op field  Z  Next  IR  PC      Ops  Exec           Mem    Write-Back
       State                           en sel  A B  Ex Sr ALU S    R W M  M-R Wr Dst
IF     0000     ??????    ?  0001  1
ID     0001     BEQ       0  0011              1 1
       0001     BEQ       1  0010              1 1
       0001     R-type    x  0100              1 1
       0001     ORI       x  0110              1 1
       0001     LW        x  1000              1 1
       0001     SW        x  1011              1 1
BEQ    0010     xxxxxx    x  0000      1  1
       0011     xxxxxx    x  0000      1  0
R      0100     xxxxxx    x  0101                   0  1  fun 1
       0101     xxxxxx    x  0000      1  0                               0   1  1
ORI    0110     xxxxxx    x  0111                   0  0  or  1
       0111     xxxxxx    x  0000      1  0                               0   1  0
LW     1000     xxxxxx    x  1001                   1  0  add 1
       1001     xxxxxx    x  1010                                  1 0 1
       1010     xxxxxx    x  0000      1  0                               1   1  0
SW     1011     xxxxxx    x  1100                   1  0  add 1
       1100     xxxxxx    x  0000      1  0                        0 1

Note: the two BEQ states (0010 and 0011) can be combined into one state.
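
The next-state column of this table can be encoded directly as a lookup, which makes it easy to trace the states each instruction visits. The Python sketch below is a software model for illustration, not a hardware implementation. The dictionary layout and function names are assumptions, and only the next-state column is modeled (no control outputs).

```python
# Next state out of decode (state 0001), keyed by (op, Z).
# Z is None where the table shows "x" (don't care).
DECODE_NEXT = {
    ("BEQ", 0): "0011", ("BEQ", 1): "0010",
    ("R-type", None): "0100", ("ORI", None): "0110",
    ("LW", None): "1000", ("SW", None): "1011",
}

# Every other state has a single successor regardless of op and Z.
FIXED_NEXT = {
    "0000": "0001", "0010": "0000", "0011": "0000",
    "0100": "0101", "0101": "0000", "0110": "0111", "0111": "0000",
    "1000": "1001", "1001": "1010", "1010": "0000",
    "1011": "1100", "1100": "0000",
}


def next_state(state, op, z):
    """Look up the next state for the current state, opcode class and Z."""
    if state == "0001":
        return DECODE_NEXT[(op, z if op == "BEQ" else None)]
    return FIXED_NEXT[state]


def states_visited(op, z=0):
    """States traversed by one instruction, starting at fetch (0000)."""
    state, visited = "0000", []
    while True:
        visited.append(state)
        state = next_state(state, op, z)
        if state == "0000":
            return visited


lw_path = states_visited("LW")
beq_taken_path = states_visited("BEQ", z=1)
print(lw_path, beq_taken_path)
```

The length of each path equals the number of cycles the instruction takes, for example five states for LW and three for BEQ.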

Alternative Multiple Cycle Datapath (In Textbook)
• Minimizes Hardware: 1 memory, 1 ALU
[Figure: textbook multi-cycle datapath with a single ideal memory (RAdr,
 WrAdr, Din, Dout), an Instruction Reg, Reg File (Ra, Rb, Rw, busA, busB,
 busW), Extend and << 2 blocks, one ALU with ALU Control, and the ALU Out
 and Target registers.
 Control signals: PCWr, PCWrCond, PCSrc, BrWr, IorD, MemWr, IRWr, RegDst,
 RegWr, ExtOp, MemtoReg, ALUSelA, ALUSelB, ALUOp. Status signal: Zero.]

 Alternative Multiple Cycle Datapath (In Textbook)




• Shared instruction/data memory unit
• A single ALU shared among instructions
• Shared units require additional or widened multiplexors
• Temporary registers to hold data between clock cycles of the instruction:
     • Additional registers: Instruction Register (IR),
       Memory Data Register (MDR), A, B, ALUOut

Alternative Multiple Cycle Datapath With Control Lines
                 (Fig 5.33 In Textbook)



[Figure: multi-cycle datapath with control lines; labeled values: PC + 4, Branch Target]




(ORI not supported, Jump supported)
                    Operations In Each Cycle

                         R-Type                         Logic Immediate                Load                           Store                          Branch

IF   Instruction Fetch   IR ← Mem[PC]                   IR ← Mem[PC]                   IR ← Mem[PC]                   IR ← Mem[PC]                   IR ← Mem[PC]
                         PC ← PC + 4                    PC ← PC + 4                    PC ← PC + 4                    PC ← PC + 4                    PC ← PC + 4

ID   Instruction Decode  A ← R[rs]                      A ← R[rs]                      A ← R[rs]                      A ← R[rs]                      A ← R[rs]
                         B ← R[rt]                      B ← R[rt]                      B ← R[rt]                      B ← R[rt]                      B ← R[rt]
                         ALUout ← PC +                  ALUout ← PC +                  ALUout ← PC +                  ALUout ← PC +                  ALUout ← PC +
                           (SignExt(imm16) x4)            (SignExt(imm16) x4)            (SignExt(imm16) x4)            (SignExt(imm16) x4)            (SignExt(imm16) x4)

EX   Execution           ALUout ← A + B                 ALUout ← A OR ZeroExt[imm16]   ALUout ← A + SignExt(imm16)    ALUout ← A + SignExt(imm16)    If Equal = 1: PC ← ALUout

MEM  Memory                                                                            M ← Mem[ALUout]                Mem[ALUout] ← B

WB   Write Back          R[rd] ← ALUout                 R[rt] ← ALUout                 R[rt] ← M
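
The per-cycle register transfers listed above can be traced with a small software model. The following is a minimal sketch, not the course's implementation: the pre-decoded `instr` dictionary, the function names, the use of a plain `dict` for memory, and the choice of `add` as the only R-type function are all assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def sign_ext16(imm16):
    """Sign-extend a 16-bit field to a Python int."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def run_instruction(instr, pc, regs, mem):
    """Apply the per-cycle register transfers for one instruction.

    `instr` is a hypothetical pre-decoded dict with keys
    kind, rs, rt, rd, imm16. Returns (new_pc, cycles_used).
    """
    kind, imm16 = instr["kind"], instr["imm16"]

    # IF: IR <- Mem[PC]; PC <- PC + 4 (IR is modeled by `instr` itself)
    pc = (pc + 4) & MASK32

    # ID: A <- R[rs]; B <- R[rt]; ALUout <- PC + (SignExt(imm16) x 4)
    a, b = regs[instr["rs"]], regs[instr["rt"]]
    alu_out = (pc + sign_ext16(imm16) * 4) & MASK32

    # EX: the first cycle that depends on the instruction class
    if kind == "beq":
        if a == b:                       # If Equal = 1: PC <- ALUout
            pc = alu_out
        return pc, 3
    if kind == "rtype":
        alu_out = (a + b) & MASK32       # add stands in for any R-type function
    elif kind == "ori":
        alu_out = a | (imm16 & 0xFFFF)   # A OR ZeroExt(imm16)
    elif kind in ("lw", "sw"):
        alu_out = (a + sign_ext16(imm16)) & MASK32
    else:
        raise ValueError(f"unsupported instruction kind: {kind}")

    # MEM: only loads and stores touch memory
    if kind == "sw":
        mem[alu_out] = b                 # Mem[ALUout] <- B
        return pc, 4
    if kind == "lw":
        m = mem.get(alu_out, 0)          # M <- Mem[ALUout]
        regs[instr["rt"]] = m            # WB: R[rt] <- M
        return pc, 5

    # WB for R-type (writes rd) and logic immediate (writes rt)
    dest = instr["rd"] if kind == "rtype" else instr["rt"]
    regs[dest] = alu_out
    return pc, 4
```

Note that the branch target is computed in ID for every instruction, before the opcode is known to be a branch; only the branch class uses it in EX. The returned cycle counts (3 for branch, 4 for R-type, logic immediate and store, 5 for load) follow from how many rows of the table each class occupies.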



High-Level View of Finite State Machine Control




•   First steps are independent of the instruction class
•   Then a series of sequences that depend on the instruction opcode
•   Then the control returns to fetch a new instruction.
•   Each box in the diagram represents one or more states.

Instruction Fetch (IF) and Decode (ID) FSM States
[Figure: the IF state followed by the ID state]




Load/Store Instructions FSM States
[Figure: load/store states for the EX, MEM, and WB cycles]


R-Type Instructions FSM States
[Figure: R-type states for the EX and WB cycles]




Branch Instruction: Single EX State
Jump Instruction: Single EX State
[Figure: one EX state for the branch instruction and one EX state for the jump instruction]




FSM State Transition Diagram (From Book)
[Figure: state transition diagram with states grouped by cycle: IF, ID, EX, MEM, WB]
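          Finite State Machine (FSM) Specification

   State   Cycle                 Instruction   Register transfers                         Next state
   0000    Instruction fetch     all           IR ← MEM[PC]; PC ← PC + 4                  0001
   0001    Decode                all           A ← R[rs]; B ← R[rt]; ALUout ← PC + SX     by opcode: 0100, 0110, 1000, 1011, 0010
   0100    Execute               R-type        ALUout ← A fun B                           0101
   0110    Execute               ORi           ALUout ← A op ZX                           0111
   1000    Execute               LW            ALUout ← A + SX                            1001
   1011    Execute               SW            ALUout ← A + SX                            1100
   0010    Execute               BEQ           If A = B then PC ← ALUout                  0000 (to instruction fetch)
   1001    Memory                LW            M ← MEM[ALUout]                            1010
   1100    Memory                SW            MEM[ALUout] ← B                            0000 (to instruction fetch)
   0101    Write-back            R-type        R[rd] ← ALUout                             0000 (to instruction fetch)
   0111    Write-back            ORi           R[rt] ← ALUout                             0000 (to instruction fetch)
   1010    Write-back            LW            R[rt] ← M                                  0000 (to instruction fetch)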
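
The state diagram above can be written down as a next-state table and walked in software. The following is a minimal sketch under stated assumptions: the 4-bit state codes come from the diagram, while the names `DISPATCH`, `NEXT`, and `state_sequence` are hypothetical and chosen only for this illustration.

```python
FETCH, DECODE = "0000", "0001"

# First execute state for each instruction class (dispatch out of decode).
DISPATCH = {
    "R-type": "0100",
    "ORi": "0110",
    "LW": "1000",
    "SW": "1011",
    "BEQ": "0010",
}

# Unconditional successor of every state other than decode.
NEXT = {
    "0000": DECODE,
    "0100": "0101", "0101": FETCH,                   # R-type: execute, write-back
    "0110": "0111", "0111": FETCH,                   # ORi: execute, write-back
    "1000": "1001", "1001": "1010", "1010": FETCH,   # LW: execute, memory, write-back
    "1011": "1100", "1100": FETCH,                   # SW: execute, memory
    "0010": FETCH,                                   # BEQ: execute only
}


def state_sequence(op):
    """Return the list of states visited by one instruction of class `op`."""
    seq = [FETCH]
    state = FETCH
    while True:
        # Decode is the only state whose successor depends on the opcode.
        state = DISPATCH[op] if state == DECODE else NEXT[state]
        if state == FETCH:
            return seq
        seq.append(state)


for op in DISPATCH:
    print(op, state_sequence(op), "cycles:", len(state_sequence(op)))
```

The length of each sequence is the number of clock cycles that instruction class takes, since the machine spends one cycle in each state.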
            MIPS Multi-cycle Datapath
             Performance Evaluation
• What is the average CPI?
   – State diagram gives CPI for each instruction type.
   – Workload (program) below gives frequency of each type.

   Type           CPIi for type     Frequency    CPIi x freqi
   Arith/Logic      4               40%               1.6
   Load             5               30%               1.5
   Store            4               10%               0.4
   Branch           3               20%               0.6
                                  Average CPI:        4.1

    An average CPI of 4.1 is better than the CPI of 5 that would result
    if every instruction took the same number of clock cycles (5).
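
The average CPI above is a frequency-weighted sum of the per-type CPIs. As a minimal sketch of that arithmetic (the names `WORKLOAD` and `average_cpi` are hypothetical; the numbers are the ones from the table):

```python
# Instruction mix from the table: type -> (CPI for type, frequency).
WORKLOAD = {
    "Arith/Logic": (4, 0.40),
    "Load": (5, 0.30),
    "Store": (4, 0.10),
    "Branch": (3, 0.20),
}


def average_cpi(mix):
    """Frequency-weighted average CPI: sum of CPI_i * freq_i."""
    return sum(cpi * freq for cpi, freq in mix.values())


cpi = average_cpi(WORKLOAD)
print(f"Average CPI: {cpi:.1f}")
# Ratio against a design in which every instruction takes 5 cycles,
# assuming the same clock cycle time and instruction count.
print(f"Ratio vs. fixed 5 cycles: {5 / cpi:.2f}")
```

Under the stated assumption of equal clock cycle time and instruction count, the ratio 5 / 4.1 is about 1.22, which quantifies the advantage of letting each instruction type use only the cycles it needs.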

				