Introduction to basic concepts on asynchronous circuit design

					Bridging the gap between
  asynchronous design
     and designers


        Hao Zheng




                 Outline

What is an asynchronous circuit?
Asynchronous communication
Asynchronous design styles (Micropipelines)
Asynchronous logic building blocks
Control specification and implementation
Delay models and classes of async circuits
Why asynchronous circuits?



         Synchronous circuit


  R       CL       R       CL       R       CL         R




                          CLK

Implicit (global) synchronization between blocks
Clock period > Max Delay (CL + R)
 Time is an independent physical variable (quantity)
               Asynchronous circuit

Ack

         R       CL       R       CL       R        CL        R



 Req


       Explicit (local) synchronization:
                          Req / Ack handshakes

                    Time = events + quantity
         Time does not exist if nothing happens (Aristotle)
   Motivation for Asynchronous

Asynchronous design is often unavoidable:
  Asynchronous interfaces, arbiters etc.



Modern clocking is multi-phase and distributed – and
virtually ‘asynchronous’ (cf. GALS – next slide):
  Mesochronous (clock travels together with data)

  Local (possibly stretchable) clock generation



Robust asynchronous design flow is coming (e.g. VLSI
programming from Philips, NCL from Theseus Logic,
fine-grain pipelining from Fulcrum)


Motivation (Technology Aspects)

Low power
  Automatic clock gating

Electromagnetic compatibility
  No peak currents around clock edges

Security
  No ‘electro-magnetic difference’ between logical ‘0’
   and ‘1’ in dual-rail code
Robustness
  High immunity to technology and environment
   variations (temperature, power supply, ...)



 Motivation (Designer’s View)

Modularity for system-on-chip design
 Plug-and-play interconnectivity

Average-case performance
 No worst-case delay synchronization

Many interfaces are asynchronous
 Buses, networks, ...




  Globally Async Locally Sync (GALS)

Asynchronous       Clocked Domain
World


  Req1                                 Req3
                 R       CL      R

   Ack1                                Ack3

                     Local CLK         Req4
   Req2


   Ack2        Async-to-sync Wrapper   Ack4


         Key Design Differences

Synchronous logic design:
   proceeds without taking timing correctness
    (hazards, signal ack-ing etc.) into account
   Combinational logic and memory latches
    (registers) are built separately
   Static timing analysis of CL is sufficient to
    determine the Max Delay (clock period)
   Fixed set-up and hold conditions for latches




         Key Design Differences

Asynchronous logic design:
   Must ensure hazard-freedom, signal ack-ing, local
    timing constraints
   Combinational logic and memory latches (registers)
    are often mixed in “complex gates”
   Dynamic timing analysis of logic is needed to
    determine relative delays between paths
To avoid complex issues, circuits may be built as
Delay-insensitive and/or Speed-independent
(Muller’s theory vs Huffman asynchronous automata)



Verification and Testing Differences
Synchronous logic verification and testing:
  Only the functional correctness aspect is verified and
   tested
  Testing can be done with standard ATE and at low
   speed
Asynchronous logic verification and testing:
  In addition to functional correctness, the temporal aspect
   is crucial: e.g. causality and order, deadlock-freedom
  Testing must cover faults in complex gates
   (logic+memory) and must proceed at the normal
   operating rate
  Delay fault testing may be needed




    Synchronous communication



1        1         0          0        1        0

Clock edges determine the time instants where data
must be sampled

Data wires may glitch between clock edges (set-up/hold
times must be satisfied)

Data are transmitted at a fixed rate
(clock frequency)

                  Dual Rail
1          1                             1

                    0        0                    0


Two wires per bit, each at L (low) or H (high)
  “LL” = “spacer”, “LH” = “0”, “HL” = “1”



n-bit data communication requires 2n wires

Each bit is self-timed

Other delay-insensitive codes exist (e.g. k-of-n) and
event-based signalling (choice criteria: pin and power
efficiency)
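
The dual-rail code on this slide can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the names encode_bit, decode_pair, encode_word and word_is_valid are hypothetical, and since the slide does not assign a meaning to "HH", the sketch assumes that state is illegal.

```python
# Dual-rail code from the slide: each bit uses two wires.
# "LL" = spacer, "LH" = 0, "HL" = 1.
# Assumption: "HH" is not defined on the slide and is treated as illegal here.
SPACER = "LL"
CODE = {0: "LH", 1: "HL"}
DECODE = {"LH": 0, "HL": 1}


def encode_bit(bit):
    """Return the two-wire codeword for one data bit."""
    return CODE[bit]


def decode_pair(pair):
    """Return 0 or 1 for a valid codeword, None for the spacer."""
    if pair == SPACER:
        return None
    if pair not in DECODE:
        raise ValueError(f"illegal dual-rail state: {pair!r}")
    return DECODE[pair]


def encode_word(bits):
    """Encode n bits onto 2n wires, returned as a list of wire pairs."""
    return [encode_bit(b) for b in bits]


def word_is_valid(pairs):
    """A word is complete once every bit has left the spacer state."""
    return all(decode_pair(p) is not None for p in pairs)


word = encode_word([1, 1, 0, 0, 1, 0])
print(word)                          # six pairs, i.e. 12 wires for 6 bits
print(word_is_valid(word))
print(word_is_valid(["HL", "LL"]))   # second bit has not arrived yet
```

Because each bit announces its own arrival, the receiver needs no separate validity wire; it only has to wait until no pair is still in the spacer state.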
                 Bundled Data


1           1         0          0           1          0
    Validity signal
      Similar to an aperiodic local clock



    n-bit data communication requires n+1 wires

    Data wires may glitch while the validity signal is not asserted.

    Signaling protocols
      level sensitive (latch)

      transition sensitive (register): 2-phase / 4-phase
        Example: Memory Read Cycle

Valid address


Address         A                    A

Valid data


Data                D                    D


     Transition signaling, 4-phase


          Example: Memory Read Cycle

Valid address


Address         A                    A

Valid data


 Data               D                    D


     Transition signaling, 2-phase
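
The difference between the two memory read cycles can be made concrete by listing the events on the two validity wires. The sketch below is an illustration under assumptions: the wire names valid_address and valid_data are taken from the figure labels, the function names are hypothetical, and both wires are assumed to start low.

```python
def four_phase(transfers):
    """4-phase: both wires rise and then return to zero on every transfer."""
    events = []
    for _ in range(transfers):
        events += ["valid_address+", "valid_data+",
                   "valid_address-", "valid_data-"]
    return events


def two_phase(transfers):
    """2-phase: every edge, rising or falling, signals one transfer."""
    events = []
    level = 0  # level of both wires between transfers (assumed to start low)
    for _ in range(transfers):
        sign = "+" if level == 0 else "-"
        events += [f"valid_address{sign}", f"valid_data{sign}"]
        level ^= 1
    return events


print(len(four_phase(2)), four_phase(2))
print(len(two_phase(2)), two_phase(2))
```

Two 2-phase transfers produce the same four edges as a single 4-phase transfer, which is why 2-phase signaling halves the number of transitions per transfer at the cost of edge-sensitive control logic.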


       Asynchronous Modules
                         DATA
   Data IN               PATH              Data OUT


                 start          done
        req in                            req out
       ack in         CONTROL             ack out


Signaling protocol:
reqin+ start+ [computation] done+ reqout+ ackout+ ackin+
reqin- start-    [reset]   done- reqout- ackout- ackin-
(more concurrency is also possible)
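
As a minimal sketch, the signaling protocol listed above can be replayed on six wires that all start low. The event order is copied from the slide; the replay function and the check that every event really changes its wire are assumptions added for illustration.

```python
# One full cycle, in the order given on the slide.
CYCLE = [
    "reqin+", "start+", "done+", "reqout+", "ackout+", "ackin+",
    "reqin-", "start-", "done-", "reqout-", "ackout-", "ackin-",
]
SIGNALS = ("reqin", "start", "done", "reqout", "ackout", "ackin")


def replay(events):
    """Replay events on wires that start low; reject an edge that is not a change."""
    level = {name: 0 for name in SIGNALS}
    for event in events:
        name, edge = event[:-1], event[-1]
        target = 1 if edge == "+" else 0
        if level[name] == target:
            raise ValueError(f"{event} does not change {name}")
        level[name] = target
    return level


print(replay(CYCLE[:6]))   # after the first half every wire is high
print(replay(CYCLE * 3))   # after whole cycles every wire is low again
```

Every signal rises once and falls once per cycle, so the module is back in its initial state after the reset half and the cycle can repeat indefinitely.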
    Asynchronous Latches: C element
                             Vdd
A                        A       B
            C        Z
B
                             Z
                         B       A
                                                Z
    A   B       Z+       B       A
                             Z
    0   0       0                      Static Logic
    0   1       Z                    Implementation
    1   0       Z        A       B
                                     [van Berkel 91]
    1   1       1
                             Gnd
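
The truth table above (Z+ follows the inputs when they agree and holds otherwise) can be written as a small next-state function. This is a behavioural sketch only, with hypothetical function names; it ignores all gate and wire delays. The second function is the majority form z = ab + zb + za, and the loop checks that both agree on every input and state combination.

```python
def c_element(a, b, z):
    """Next output Z+ of a C-element for inputs a, b and current output z."""
    if a == b:
        return a      # both inputs agree: output follows them
    return z          # inputs differ: output holds its value


def c_element_majority(a, b, z):
    """Same behaviour as the majority function z = ab + zb + za."""
    return (a & b) | (z & b) | (z & a)


# Exhaustive check that the two descriptions agree.
for a in (0, 1):
    for b in (0, 1):
        for z in (0, 1):
            assert c_element(a, b, z) == c_element_majority(a, b, z)

# Drive the element through one full cycle of its inputs.
z = 0
outputs = []
for a, b in [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]:
    z = c_element(a, b, z)
    outputs.append(z)
print(outputs)
```

The output rises only after both inputs have risen and falls only after both have fallen, which is what makes the C-element useful for joining two handshake signals.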
    C-element: Other Implementations
     Vdd                   Vdd
A                      A
                                 Weak inverter
B                      B
                Z                          Z


B                      B

A            Dynamic   A            Quasi-Static

     Gnd                   Gnd
         Dual-Rail Logic

A.t
B.t             C.t

                        Dual-rail AND gate
A.f
                C.f
B.f




Valid behavior for monotonic environment
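
A behavioural sketch of the dual-rail AND gate follows. It assumes the usual mapping C.t = A.t AND B.t and C.f = A.f OR B.f, which the figure suggests but the extracted text does not spell out; values are (t, f) pairs and the names are hypothetical.

```python
# Dual-rail values as (true_rail, false_rail) pairs.
SPACER = (0, 0)
TRUE = (1, 0)
FALSE = (0, 1)


def dual_rail_and(a, b):
    """Assumed mapping: C.t = A.t AND B.t, C.f = A.f OR B.f."""
    a_t, a_f = a
    b_t, b_f = b
    return (a_t & b_t, a_f | b_f)


for a in (FALSE, TRUE):
    for b in (FALSE, TRUE):
        print(a, b, dual_rail_and(a, b))

print(dual_rail_and(SPACER, SPACER))  # spacer in, spacer out
print(dual_rail_and(FALSE, SPACER))   # output is already valid: one false input decides
```

In a monotonic environment each rail changes at most once between a spacer and the next codeword, so the output rails of this gate also change at most once and never pass through the (1, 1) state.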


    Completion Detection



    Dual-rail                 C     done
     logic
                •
                •
                •
•
•
•
                    Completion detection tree
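
Completion detection can be sketched behaviourally as follows. The sketch assumes that each bit contributes a validity signal (t OR f) and that the tree of C-elements behaves like one multi-input C-element: done rises when every bit is valid and falls when every bit is back at the spacer. The class name is hypothetical.

```python
class CompletionDetector:
    """Behavioural model of a completion detection tree (delays ignored)."""

    def __init__(self):
        self.done = 0

    def update(self, word):
        """word is a list of (t, f) rail pairs; returns the new done value."""
        valid = [t | f for t, f in word]   # per-bit validity (assumed OR gate)
        if all(valid):
            self.done = 1                  # all bits carry a codeword
        elif not any(valid):
            self.done = 0                  # all bits are back at the spacer
        return self.done                   # otherwise hold, like a C-element


detector = CompletionDetector()
steps = [
    [(0, 0), (0, 0)],   # spacer
    [(1, 0), (0, 0)],   # first bit valid, second still empty
    [(1, 0), (0, 1)],   # whole word valid
    [(0, 0), (0, 1)],   # reset in progress
    [(0, 0), (0, 0)],   # spacer again
]
trace = [detector.update(word) for word in steps]
print(trace)
```

The hold behaviour in the mixed cases is what lets done act as a handshake signal: it changes only after the slowest bit has changed.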

Differential Cascode Voltage Switch Logic

                start


            Z.f                             Z.t

                                                  done

                                      A.t
          C.f     B.f   A.f           B.t     N-type
                                      C.t
                                              transistor
                                              network
                              start
3-input AND/NAND gate
    Examples of Dual-Rail Design

Asynchronous dual-rail ripple-carry adder (A.
Martin, 1991)
   Critical delay is proportional to log N (N = number of
    bits)
   32-bit adder delay (1.6 µm MOSIS CMOS): 11 ns versus
    40 ns for synchronous
   Async cell transistor count = 34 versus synchronous =
    28
More recent success stories (modularity and
automatic synthesis) of dual-rail logic come from NULL
Convention Logic (NCL) from Theseus Logic
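
One way to see why a self-timed ripple-carry adder can have a delay that grows like log N is to measure how far carries actually ripple. The sketch below is not Martin's design or measurement; it assumes uniformly random operands and simply records the longest carry chain per addition. All names are hypothetical.

```python
import random


def longest_carry_chain(a, b, n):
    """Longest run of bit positions that a single carry ripples through."""
    longest = run = 0
    carry = 0
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        if ai & bi:                 # generate: a new carry starts here
            carry, run = 1, 1
        elif (ai ^ bi) and carry:   # propagate: the carry ripples one more bit
            run += 1
        else:                       # kill, or nothing to propagate
            carry, run = 0, 0
        longest = max(longest, run)
    return longest


def average_chain(n, trials=2000, seed=1):
    """Mean longest carry chain over random n-bit operand pairs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += longest_carry_chain(rng.getrandbits(n), rng.getrandbits(n), n)
    return total / trials


for n in (8, 16, 32, 64):
    print(n, round(average_chain(n), 2))
```

Under this assumption the average chain is far shorter than N, so an adder that signals completion as soon as its carries settle usually finishes well before the worst case that a clock period must cover.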

Bundled-Data Logic Blocks



              Single-rail logic
                                  •
          •                       •
          •                       •
          •

  start             delay             done


   Conventional logic + matched delay
 Micropipelines (Sutherland 89)
         Micropipeline (2-phase) control blocks

                                    r1    g1
                                    d1            Request-
    C                                             Grant-Done
   Join                             r2            (RGD)Arbiter
                   Merge            d2    g2


  sel                                r1
   outf                out           a1    r
in                  in 0                   a
   outt                out           r2              Call
                       1             a2
Select             Toggle
       Micropipelines (Sutherland 89)

Aout        delay                   delay       Ain
                    C                       C



        L   logic   L   logic   L   logic   L



        C                       C
Rin                     delay                   Rout

               DataPath / Control



       L      logic     L      logic     L      logic   L




Rin                                                              Rout
Aout                        CONTROL                              Ain

           Synthesis of control is a major challenge
Control specification

 A+

       A
 B+


 A-    B


 B-            A input
               B output
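
A specification of this kind can be executed as a token game on a small Petri net. The sketch below assumes that the figure shows the cycle A+, B+, A-, B- and back to A+; the extracted text lists the transitions in that order but does not show the arcs. The place names p0 to p3 and the function names are hypothetical.

```python
# Each transition maps to (places consumed, places produced).
# Assumed cyclic order: A+ -> B+ -> A- -> B- -> A+ ...
TRANSITIONS = {
    "A+": ({"p0"}, {"p1"}),
    "B+": ({"p1"}, {"p2"}),
    "A-": ({"p2"}, {"p3"}),
    "B-": ({"p3"}, {"p0"}),
}


def enabled(marking):
    """Transitions whose input places all hold a token."""
    return sorted(t for t, (pre, _) in TRANSITIONS.items() if pre <= marking)


def fire(marking, transition):
    """Move tokens from the input places to the output places."""
    pre, post = TRANSITIONS[transition]
    if not pre <= marking:
        raise ValueError(f"{transition} is not enabled")
    return (marking - pre) | post


marking = {"p0"}          # initial marking: waiting for the input to rise
trace = []
for _ in range(8):
    transition = enabled(marking)[0]
    marking = fire(marking, transition)
    trace.append(transition)
print(trace)
```

With A as the input and B as the output, the environment fires A+ and A- while the circuit must produce B+ and B- in response; the net makes that causal order explicit.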


Control specification

 A+


 B-
       A                B
 A-


 B+



 Control specification

A+        B+

               A
     C+
                   C     C
A-        B-   B


     C-



 Control specification

A+        B+

               A
     C+
                   C     C
     A-
               B
     B-

     C-



          Control Specification
                             Ri+    Ro+

Ri                Ro
          FIFO            Ao+       Ai+
          cntrl
Ao                Ai
                             Ri-    Ro-

                          Ao-       Ai-
     Ri
                         C         Ro
     Ao       C



                                   Ai

   Gate vs Wire delay models

Gate delay model: delays in gates, no delays in wires




Wire delay model: delays in gates and wires




 Delay Models for Async. Circuits
Bounded delays (BD): realistic for gates and wires.
   Technology mapping is easy, verification is
    difficult
Speed independent (SI): Unbounded (pessimistic)
delays for gates and “negligible” (optimistic) delays
for wires.
   Technology mapping is more difficult,
    verification is easy
Delay insensitive (DI): Unbounded (pessimistic)
delays for gates and wires.
   DI class (built out of basic gates) is almost empty
Quasi-delay insensitive (QDI): Delay insensitive
except for critical wire forks (isochronic forks).
   In practice it is the same as speed independent
[Figure: circuit classes BD, DI, SI, QDI]
       Environment models

Slow enough environment = Fundamental mode
(Inputs change AFTER system has settled)

Reactive environment = I/O mode
(Inputs may change once the first output changes)




    Correctness of a Circuit wrt Delay
              Assumptions
             C-element: z = ab + zb + za


a
                              a
b
                z             b           z




              Resistance

Concurrent models for specification
 CSP, Petri nets, ...: no more FSMs

Difficult to design
 Hazards, synchronization

Complex timing analysis
 Difficult to estimate performance

Difficult to test
 No way to stop the clock




   But ... some success stories

Philips
AMULET microprocessors
Sharp
Intel (RAPPID)
Start-up companies:
  Theseus Logic, Fulcrum, Self-Timed Solutions
Recent blurb: It's Time for Clockless Chips, by
Claire Tristram (MIT Technology Review, v.
104, no. 8, October 2001:
http://www.technologyreview.com/magazine/oct01/tristram.asp)
 ….

				