# ECE 274


```
ECE 274 - Digital Logic
Lecture 21

Lecture 21 – Optimization
Mealy vs. Moore

Announcements
  All requests for regrades must be submitted in writing within one week of the distribution of graded material
    This could possibly result in a lower score for the requested problem
    Other problems within the same assignment might also be regraded
      I.e., regrades for problems not specifically requested will NOT lower your score

Moore vs. Mealy FSMs
  FSM implementation architecture
    State register and logic
    More detailed view
      Next state logic – function of present state and FSM inputs
      Output logic
        If function of present state only – Moore FSM
        If function of present state and FSM inputs – Mealy FSM
  [Figure: (a) Moore and (b) Mealy implementation architectures. FSM inputs I feed the next-state logic; the state register (clk) holds present state S and loads next state N; the output logic drives FSM outputs O. Mealy FSM adds this: a connection from the FSM inputs I to the output logic.]
  [Figure: Mealy FSM example. Inputs: b; Outputs: x. States: S0, S1. Arc labels: /x=0, b/x=1, b’/x=0.]
  Graphically: show outputs with arcs, not with states

Mealy FSMs May Have Fewer States
  Soda dispenser example: Initialize, wait until enough, dispense
    Moore: 3 states; Mealy: 2 states
  [Figure (a): Moore. Inputs: enough (bit); Outputs: d, clear (bit). States: Init (d=0, clear=1), Wait, Disp (d=1). Init goes to Wait; Wait stays on enough’ and goes to Disp on enough; Disp returns to Init. Timing diagram (clk, enough, clear, d) with State: I W W D I.]
  [Figure (b): Mealy. Inputs: enough (bit); Outputs: d, clear (bit). States: Init, Wait. Init goes to Wait on arc /d=0, clear=1; Wait stays on enough’ and returns to Init on arc enough/d=1. Timing diagram (clk, enough, clear, d) with State: I W W I.]

Mealy vs. Moore
  Q: Which is Moore, and which is Mealy?
  • A: Mealy on left, Moore on right
    – Mealy outputs on arcs, meaning outputs are function of state AND INPUTS
    – Moore outputs in states, meaning outputs are function of state only
  [Figure: the Mealy (left) and Moore (right) wristwatch FSMs described on the next slide.]

Mealy vs. Moore Example: Beeping Wristwatch
  Button b
    Sequences mux select lines s1s0 through 00, 01, 10, and 11
      Each value displays different internal register
    Each unique button press should cause 1-cycle beep, with p=1 being beep
    Must wait for button to be released (b’) and pushed again (b) before sequencing
  Note that Moore requires unique state to pulse p, while Mealy pulses p on arc
  Tradeoff: Mealy’s pulse on p may not last one full cycle
  [Figure: Mealy. Inputs: b; Outputs: s1, s0, p. States: Time, Alarm, Date, Stpwch. Arc labels:
    Time:   b’/s1s0=00, p=0   b/s1s0=00, p=1
    Alarm:  b’/s1s0=01, p=0   b/s1s0=01, p=1
    Date:   b’/s1s0=10, p=0   b/s1s0=10, p=1
    Stpwch: b’/s1s0=11, p=0   b/s1s0=11, p=1]
  [Figure: Moore. Inputs: b; Outputs: s1, s0, p. Arcs are labeled b and b’. States and outputs:
    Time:   s1s0=00, p=0      S2: s1s0=00, p=1
    Alarm:  s1s0=01, p=0      S4: s1s0=01, p=1
    Date:   s1s0=10, p=0      S6: s1s0=10, p=1
    Stpwch: s1s0=11, p=0      S8: s1s0=11, p=1]

Mealy vs. Moore Tradeoff
  Mealy outputs change mid-cycle if input changes
    Note earlier soda dispenser example
      Mealy had fewer states, but output d not 1 for full cycle
    Represents a type of tradeoff
  [Figure (a): Moore soda dispenser. Inputs: enough (bit); Outputs: d, clear (bit). States: Init (d=0, clear=1), Wait, Disp (d=1); arcs enough’ and enough. Timing diagram (clk, enough, clear, d) with State: I W W D I.]
  [Figure (b): Mealy soda dispenser. Inputs: enough (bit); Outputs: d, clear (bit). States: Init, Wait; arcs /d=0, clear=1; enough’; enough/d=1. Timing diagram (clk, enough, clear, d) with State: I W W I.]

Implementing a Mealy FSM
  Straightforward
    Convert to state table
    Derive equations for each output
    Key difference from Moore: External outputs (d, clear) may have different value in same state, depending on input values
  [Figure: Mealy soda dispenser. Inputs: enough (bit); Outputs: d, clear (bit). States: Init, Wait. Arcs: /d=0, clear=1 (Init to Wait); enough’/d=0 (stay in Wait); enough/d=1 (Wait to Init).]

Mealy and Moore can be Combined
  Final note on Mealy/Moore
    May be combined in same FSM
  [Figure: Combined Moore/Mealy FSM for beeping wristwatch example. Inputs: b; Outputs: s1, s0, p. States carry the Moore outputs: Time (s1s0=00), Alarm (s1s0=01), Date (s1s0=10), Stpwch (s1s0=11). Each state has Mealy arcs b’/p=0 and b/p=1.]

6.4 Datapath Component Tradeoffs
  Can make some components faster (but bigger), or smaller (but slower), than the straightforward components we previously built
  We’ll build a faster (but bigger) adder
    Could also do for the other datapath components

Built carry-ripple adder in Ch 4
  Similar to adding by hand, column by column
  Con: Slow
    Output is not correct until the carries have rippled to the left
    4-bit carry-ripple adder has 4*2 = 8 gate delays
  Pro: Small
    4-bit carry-ripple adder has just 4*5 = 20 gates
  [Figure: 4-bit carry-ripple adder built from four full adders (FA). Inputs: a3 b3, a2 b2, a1 b1, a0 b0, ci; outputs: co, s3, s2, s1, s0.]
  [Figure: addition by hand.
    carries:    c3  c2  c1  cin
    B:          b3  b2  b1  b0
    A:        + a3  a2  a1  a0
          cout  s3  s2  s1  s0]

Two-level combinational logic design process
  Recall that 4-bit two-level adder was big
    Pro: Fast
      2 gate delays
    Con: Large
      Truth table would have 2^(4+4) = 256 rows
      500 gates
  [Figure: Two-level: AND level followed by ORs; inputs a3 b3, a2 b2, a1 b1, a0 b0, cin; outputs co, s3, s2, s1, s0.]
  [Plot: Transistors (0 to 10000) versus N (1 to 8).]
  Is there a compromise design?
    Between 2 and 8 gate delays
    Between 20 and 500 gates

Idea
  Modify carry-ripple adder – For a stage’s carry-in, don’t wait for carry to ripple, but rather directly compute from inputs of earlier stages rather than waiting for carry to ripple to current stage
  [Figure: carry-ripple adder (inputs a3 b3, a2 b2, a1 b1, a0 b0, c0) with a "look" block ahead of stages 1, 2, and 3 producing c1, c2, c3 for the FAs. Stages are labeled stage 3 to stage 0; outputs are c4 (cout), s3, s2, s1, s0.]
  Notice – no rippling of carry

• Want each stage’s carry-in bit to be function of external inputs only (a’s, b’s, or c0)
  Full adder:
    s = a xor b xor c
    co = ab + ac + bc
  [Figure: four FAs with inputs a3 b3 c3, a2 b2 c2, a1 b1 c1, a0 b0 c0; carry-outs co, co2, co1, co0; sums s3, s2, s1, s0.]
  Stage 0: Carry-in is already an external input: c0
  Stage 1: c1 = co0
    c1 = b0c0 + a0c0 + a0b0
  Stage 2: c2 = co1
    co1 = b1c1 + a1c1 + a1b1
    c2 = b1c1 + a1c1 + a1b1
    c2 = b1(b0c0 + a0c0 + a0b0) + a1(b0c0 + a0c0 + a0b0) + a1b1
    c2 = b1b0c0 + b1a0c0 + b1a0b0 + a1b0c0 + a1a0c0 + a1a0b0 + a1b1
  Continue for c3

Carry lookahead logic function of external inputs
  No waiting for ripple
Equations get too big
  Not efficient
  Need a better form of lookahead
    c1 = b0c0 + a0c0 + a0b0
    c2 = b1b0c0 + b1a0c0 + b1a0b0 + a1b0c0 + a1a0c0 + a1a0b0 + a1b1
    c3 = b2b1b0c0 + b2b1a0c0 + b2b1a0b0 + b2a1b0c0 + b2a1a0c0 + b2a1a0b0 + b2a1b1 +
         a2b1b0c0 + a2b1a0c0 + a2b1a0b0 + a2a1b0c0 + a2a1a0c0 + a2a1a0b0 + a2a1b1 + a2b2
  [Figure: adder with inputs a3 b3, a2 b2, a1 b1, a0 b0, c0; "look" blocks produce c3, c2, c1 for FA stages 3 to 0; outputs c4 (cout), s3, s2, s1, s0.]

Have each stage compute two terms
  Propagate: P = a xor b
  Generate: G = ab
Compute lookahead from P and G terms, not from external inputs
  Why P & G? Because the logic comes out much simpler
    Very clever finding; not particularly obvious
  Why those names?
    G: If a and b are 1, carry-out will be 1 – “generate” a carry-out of 1 in this case
    P: If only one of a or b is 1, then carry-out will equal the carry-in – propagate the carry-in to the carry-out in this case
  [Figure (a): addition by hand.
    carries:  c4  c3  c2  c1  c0
    B:            b3  b2  b1  b0
    A:          + a3  a2  a1  a0
            cout  s3  s2  s1  s0
   Bit-0 examples:
    c1 c0:   1 0    1 1    1 1    1 1
    b0:        1      1      0      1
    a0:      + 1    + 1    + 1    + 0
    s0:        0      1      0      0
   if a0b0 = 1 then c1 = 1 (call this G: Generate)
   if a0 xor b0 = 1 then c1 = 1 if c0 = 1 (call this P: Propagate)]

With P & G, the carry lookahead equations are much simpler
  Equations before plugging in
    c1 = G0 + P0c0
    c2 = G1 + P1c1
    c3 = G2 + P2c2
    cout = G3 + P3c3
  After plugging in:
    c1 = G0 + P0c0
    c2 = G1 + P1c1 = G1 + P1(G0 + P0c0)
    c2 = G1 + P1G0 + P1P0c0
    c3 = G2 + P2c2 = G2 + P2(G1 + P1G0 + P1P0c0)
    c3 = G2 + P2G1 + P2P1G0 + P2P1P0c0
    cout = G3 + P3G2 + P3P2G1 + P3P2P1G0 + P3P2P1P0c0
  [Figure (b): stages 3 to 0 with inputs a3 b3, a2 b2, a1 b1, a0 b0, cin. Each stage sends G and P to the carry-lookahead logic and receives its carry (G3 P3 c3, G2 P2 c2, G1 P1 c1, G0 P0 c0); outputs cout, s3, s2, s1, s0.]

Call this sum/propagate/generate (SPG) block
  [Figure (b): one SPG block per bit (inputs a3 b3, a2 b2, a1 b1, a0 b0, cin). The blocks pass P3 G3, P2 G2, P1 G1, P0 G0, and c0 to the carry-lookahead logic; outputs cout, s3, s2, s1, s0.]
  [Figure (c): carry-lookahead logic, Stage 1 to Stage 4, implementing:
    c1 = G0 + P0c0
    c2 = G1 + P1G0 + P1P0c0
    c3 = G2 + P2G1 + P2P1G0 + P2P1P0c0
    cout = G3 + P3G2 + P3P2G1 + P3P2P1G0 + P3P2P1P0c0]

[Figure: 4-bit carry-lookahead adder. Four SPG blocks (inputs a, b, cin; outputs P, G) take a3 b3, a2 b2, a1 b1, a0 b0, and c0. The 4-bit carry-lookahead logic takes P3 G3, P2 G2, P1 G1, P0 G0 and produces c3, c2, c1, and cout. Outputs: cout, s3, s2, s1, s0.]
  Fast -- only 4 gate delays
    Each stage has SPG block with 2 gate levels
    Carry-lookahead logic quickly computes the carry from the propagate and generate bits using 2 gate levels inside
  Reasonable number of gates -- 4-bit adder has only 26 gates
  • 4-bit adder comparison (gate delays, gates)
    – Carry-ripple: (8, 20)
    – Two-level: (2, 500)
    – CLA: (4, 26)
      o Nice compromise

Problem: Gates get bigger in each stage
  4th stage has 5-input gates
  32nd stage would have 33-input gates
    Too many inputs for one gate
    Would require building from smaller gates, meaning more levels (slower), more gates (bigger)
  [Figure: Stage 4 of the carry-lookahead logic; gates get bigger in each stage.]
One solution: Connect 4-bit CLA adders in ripple manner
  But slow
  [Figure: four 4-bit adders (each with inputs a3a2a1a0 b3b2b1b0 and outputs cout s3s2s1s0) chained through their carries. Inputs: a15-a12 b15-b12, a11-a8 b11-b8, a7a6a5a4 b7b6b5b4, a3a2a1a0 b3b2b1b0. Outputs: cout, s15-s12, s11-s8, s7s6s5s4, s3s2s1s0.]

Better solution -- Rather than rippling the carries, just repeat the carry-lookahead concept
  Requires minor modification of 4-bit adder to output P and G
  [Figure: four 4-bit adders ("These use carry-lookahead internally"), each with inputs a3a2a1a0 b3b2b1b0 and outputs P, G, cout, s3s2s1s0. Adder inputs: a15-a12 b15-b12, a11-a8 b11-b8, a7a6a5a4 b7b6b5b4, a3a2a1a0 b3b2b1b0. A second-level 4-bit carry-lookahead logic block takes P3 G3, P2 G2, P1 G1, P0 G0 and supplies c3, c2, c1 and P, G, cout. Outputs: s15-s12, s11-s8, s7-s4, s3-s0.]

Hierarchical CLA concept can be applied for larger adders
  32-bit hierarchical CLA
    Only about 8 gate delays (2 for SPG block, then 2 per CLA level)
    Only about 14 gates in each 4-bit CLA logic block
  [Figure: SPG blocks feed eight 4-bit CLA logic blocks; their P, G, c signals feed two 4-bit CLA logic blocks, which feed one 2-bit CLA logic block.]
  Q: How many gate delays for 64-bit hierarchical CLA, using 4-bit CLA logic?
  A: 16 CLA-logic blocks in 1st level, 4 in 2nd, 1 in 3rd -- so still just 8 gate delays (2 for SPG, and 2+2+2 for CLA logic). CLA is a very efficient method.

Design Challenge
  Convert the following Moore FSM to the nearest Mealy FSM equivalent.
  Due: Next Lecture (Monday, November 28)
  Extra Credit (Homework): 2 points

```
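
The soda dispenser comparison in the lecture (Moore: 3 states; Mealy: 2 states) can be checked with a small cycle-by-cycle simulation. The sketch below is not from the lecture. The names `moore_step`, `mealy_step`, and `run` are hypothetical, and any output the slides leave unlabeled in a state or on an arc is assumed to be 0.

```python
# Minimal sketch of the soda dispenser FSMs (assumption: unlabeled outputs are 0).

def moore_step(state, enough):
    """Moore: outputs (d, clear) depend on the present state only."""
    outputs = {"Init": (0, 1), "Wait": (0, 0), "Disp": (1, 0)}[state]
    if state == "Init":
        nxt = "Wait"
    elif state == "Wait":
        nxt = "Disp" if enough else "Wait"
    else:  # Disp
        nxt = "Init"
    return nxt, outputs


def mealy_step(state, enough):
    """Mealy: outputs (d, clear) depend on the present state AND the input."""
    if state == "Init":
        return "Wait", (0, 1)   # arc /d=0, clear=1
    if enough:
        return "Init", (1, 0)   # arc enough/d=1
    return "Wait", (0, 0)       # arc enough'/d=0


def run(step, inputs, state="Init"):
    """Apply one input value per clock cycle; record (state, d, clear) per cycle."""
    trace = []
    for enough in inputs:
        nxt, (d, clear) = step(state, enough)
        trace.append((state, d, clear))
        state = nxt
    return trace


moore_trace = run(moore_step, [0, 0, 1, 0, 0])
mealy_trace = run(mealy_step, [0, 0, 1, 0])
print([s[0] for s, _, _ in moore_trace])  # state initials, Moore
print([s[0] for s, _, _ in mealy_trace])  # state initials, Mealy
```

With `enough` asserted in the third cycle, the Moore machine raises `d` one cycle later in its extra Disp state, while the Mealy machine raises `d` in the same cycle as the input. That is the tradeoff the slides describe: fewer states, but an output that follows the input within the cycle.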
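
The carry-lookahead equations from the lecture (c1 = G0 + P0c0 through cout = G3 + P3G2 + P3P2G1 + P3P2P1G0 + P3P2P1P0c0) can be exercised in software. The following is a minimal sketch, not the lecture's implementation: the function names `spg_bits` and `cla_add4` are hypothetical, and each sum bit is computed as P xor carry-in, which is the full-adder sum a xor b xor c written in terms of P.

```python
# Minimal sketch of a 4-bit carry-lookahead adder using P and G terms.

def spg_bits(a, b, n=4):
    """Per-bit propagate (a xor b) and generate (a and b) lists, LSB first."""
    P = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    G = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    return P, G


def cla_add4(a, b, c0=0):
    """Add two 4-bit values; every carry is computed directly from P, G, and c0."""
    P, G = spg_bits(a, b)
    c1 = G[0] | (P[0] & c0)
    c2 = G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0)
    c3 = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & c0)
    cout = (G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
            | (P[3] & P[2] & P[1] & G[0])
            | (P[3] & P[2] & P[1] & P[0] & c0))
    carries = [c0, c1, c2, c3]
    # Sum bit i is P[i] xor the carry into stage i.
    s = sum((P[i] ^ carries[i]) << i for i in range(4))
    return cout, s


# Exhaustive check against ordinary integer addition.
for a in range(16):
    for b in range(16):
        for c0 in (0, 1):
            cout, s = cla_add4(a, b, c0)
            assert (cout << 4) | s == a + b + c0

print(cla_add4(0b1011, 0b0110))
```

None of the carry expressions refers to another carry, so no carry ripples between stages. The widest product term, in `cout`, has five inputs, which matches the lecture's remark that the 4th stage has 5-input gates.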