Thesis Presentation

					Asynchronous 8051 Microcontroller Presentation



                    By:

                Ryan Mabry

               April 18, 2005
                        Agenda



•   8051 Background
•   Motivation
•   Architecture
•   Design Flow
•   Design Implementation
•   Results
•   Challenges
•   Conclusion
                    8051 Background


•   Developed by Intel in 1980
•   Widely used in embedded systems
•   Still very popular after 25 years on the market
•   Official 8051 family designation is MCS-51
•   Based on Harvard Architecture –
          Separate memory for instructions and data
          ROM stores program instructions
          RAM stores program data
                8051 Background Continued


• 8051 Predecessor was the 8048
      Used in IBM’s first PC keyboard
• Enhanced version of 8051 is 8052
      Increased Internal Memory Capacity
      Additional Timer
      More Registers
                           Motivation


• Project is based on the synthesizable VHDL 8051 model
  developed by the Dalton Project at the University of
  California, Riverside (http://www.cs.ucr.edu/~dalton/8051)
• Two Goals
        A) Develop asynchronous 8051
        B) Use synchronous design tools in the process
• Asynchronous Advantages
        A) Lower Power Consumption
        B) No clock skew
                    Motivation Continued


• Asynchronous Disadvantages
      A) No complete tool flow for asynchronous design
      B) No global clock: communication must be done
         through handshaking or other methods
      C) Must ensure timing and data integrity when using
         asynchronous communication methods
                        Synchronous Architecture

[Block diagram: I8051_CTR (controller) with Clock and rst inputs, connected to:]
•   I8051_DEC: Op-code, ip
•   I8051_ROM: rd, addr, data
•   I8051_ALU: Src-1, Src-2, Src-3, Carry-in 1 & 2, des,
    Carry-out 1 & 2, Overflow, ALU-Op-code
•   I8051_RAM: rd, wr, addr, Is-bit-addr, data, Data-bit
•   Ports: rd, wr, addr, data_out, data_in
                              Asynchronous Architecture

[Block diagram: I8051_CTR (controller) with rst input, connected to:]
•   I8051_DEC: Op-code, ip
•   I8051_ROM: rd, addr, data
•   I8051_ALU: Src-1, Src-2, Src-3, Carry-in 1 & 2, des,
    Carry-out 1 & 2, Overflow, ALU-Op-code
•   I8051_RAM: rd, wr, addr, Is-bit-addr, data, Data-bit
•   Ports: rd, wr, addr, data_out, data_in
•   ALU Wrapper and CTR Wrapper: exchange req and ack
•   Clocking Element: generates Clock
                  Architecture Differences


• Clock is generated on board the asynchronous 8051
• Clock is stopped while the controller waits for the ALU to
  complete an operation
       - Implemented through handshaking signals
       generated by the ALU and controller wrappers
• No excess cycles in the asynchronous controller
       - Excess cycles are clock cycles in the synchronous
       version where the controller does nothing while waiting
       for the ALU to complete an operation
Asynchronous Design Flow

  Functional Simulation
    -> Synthesis of Synchronous Blocks
    -> Timing Analysis
    -> Asynchronous Wrapper Design
    -> Timing Simulation
            Asynchronous Design Flow Continued


• Functional Simulation – Verify functionality of design
        A) Standard VHDL compilers cannot synthesize
        VHDL code that implements asynchronous logic
        B) This project used ModelSim
        C) Compare controller registers, memory contents
        and instructions executed in the asynchronous and
        synchronous versions and verify that they match
• Synchronous Block Synthesis – Synthesize the synchronous
  parts of both 8051 microcontrollers. This project used
  Ambit BuildGates.
           Behavioral Code -> Verilog Netlist
            Asynchronous Design Flow Continued


• Timing Analysis – Generate delay numbers
       A) Cadence Encounter generates parasitics for circuits
       B) Use Synopsys PrimeTime for critical path analysis
       C) Import parasitics and Verilog netlist into PrimeTime
       D) Remove successive ALU operations to get delay
       numbers
              e.g. remove the division case from the ALU to
              obtain the critical path delay for multiplication
       E) Also generate critical path numbers for the RAM, ROM,
       decoder, and controller modules
            Asynchronous Design Flow Continued


• Asynchronous Wrapper Design
       A) Implement delay elements for wrappers in
       Cadence Composer schematic editor
       B) Combinational logic elements in wrappers
       can be designed in VHDL code and then imported
       C) Wire two parts together in schematic
• Timing Simulation
       Unable to test implementation of asynchronous design
       since university does not have post-synthesis timing
       simulator installed.
             Design Implementation - Handshaking


• Controller needs ALU Operation to be performed:
       A) Assert request line
       B) Stop Clock
• Once ALU Operation is finished:
       A) Assert acknowledge line
       B) Start Clock
• Deassert request line
• Deassert acknowledge line

  [State diagram: Req+ / Stop Clock -> Ack+ / Start Clock -> Req- -> Ack- -> back to Req+]
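
The handshake sequence above can be sketched as a small Python model. This is only an illustration, not the project's VHDL: the function names `clock_running` and `handshake_trace` are hypothetical, and the clock-gating rule (clock stopped only while req=1 and ack=0) is taken from the Clocking Unit slide.

```python
from typing import List, Tuple


def clock_running(req: int, ack: int) -> bool:
    """Clock is stopped only while a request is outstanding (req=1, ack=0)."""
    return not (req == 1 and ack == 0)


def handshake_trace() -> List[Tuple[str, int, int, bool]]:
    """Walk one ALU operation through the four handshake events.

    Returns a list of (event, req, ack, clock_running) tuples.
    """
    req, ack = 0, 0
    trace = [("idle", req, ack, clock_running(req, ack))]
    # Event order follows the slide: Req+, Ack+, Req-, Ack-
    steps = [("Req+", "req", 1), ("Ack+", "ack", 1),
             ("Req-", "req", 0), ("Ack-", "ack", 0)]
    for event, signal, value in steps:
        if signal == "req":
            req = value
        else:
            ack = value
        trace.append((event, req, ack, clock_running(req, ack)))
    return trace


if __name__ == "__main__":
    for event, req, ack, running in handshake_trace():
        state = "running" if running else "stopped"
        print(f"{event:5s} req={req} ack={ack} clock {state}")
```

In this model the clock is stopped only between Req+ and Ack+, which is the interval in which the ALU is working.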
                  Design Implementation – ALU Wrapper


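[Block diagram: the ALU Opcode feeds Select Logic, which drives S2 S1 S0.
 Req passes through delay elements for Logical Operations, Add/Subtract,
 Multiply, and Divide. Three 2-to-1 muxes controlled by S0, S1, and S2
 select the delayed signal; the output of the last mux is Ack.]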
     Design Implementation – ALU Wrapper Continued


• Remove operations from ALU to obtain delay numbers
• Buffers used as building block for each delay element
       - Delay of 114ps (Used 100ps to simplify design)
• PrimeTime was used for critical path analysis
• Apply 50% safety margin to initial numbers to account for
  operating conditions – temperature changes and
  voltage fluctuations

   ALU Ops               Delay (ps)   Buffers
   Division                37000        163
   Multiplication          15800         30
   Add & Subtract          12800         38
   Logical Operations       9000         90
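
As a rough cross-check of the table, the sketch below assumes the delay elements are cascaded, so each stage only has to add the difference between its delay and the previous stage's delay, at 100 ps per buffer. That cascading assumption and the function name `incremental_buffers` are not stated on the slides. Under it, the counts for logical, add/subtract, and multiplication come out to 90, 38, and 30, matching the table. The division stage comes out to 212 rather than the 163 shown, so some other sizing rule appears to apply there.

```python
import math

BUFFER_DELAY_PS = 100  # slide: measured 114 ps, 100 ps used to simplify design

# Delays from the table, ordered from shortest to longest operation.
DELAYS_PS = [
    ("Logical Operations", 9000),
    ("Add & Subtract", 12800),
    ("Multiplication", 15800),
    ("Division", 37000),
]


def incremental_buffers(delays, buffer_delay_ps=BUFFER_DELAY_PS):
    """Buffers each stage must add on top of the stages before it (assumed cascade)."""
    counts = {}
    previous_ps = 0
    for name, delay_ps in delays:
        counts[name] = math.ceil((delay_ps - previous_ps) / buffer_delay_ps)
        previous_ps = delay_ps
    return counts


if __name__ == "__main__":
    for name, count in incremental_buffers(DELAYS_PS).items():
        print(f"{name:20s} {count:4d} buffers")
```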
       Design Implementation – Controller Wrapper

[Block diagram: CTR Wrapper with inputs ALU Op-code and Ack, and output Req]



• Asserts request signal while controller is waiting for ALU
  to complete operation
• Deasserts request signal once acknowledge signal from
  ALU wrapper is received
• Implemented in VHDL code
                          Design Implementation – Controller Modifications


            • Excess Cycles Eliminated
                  Example: ADDC_1 instruction takes 8 clock cycles
                  in synchronous controller and 6 clock cycles in
                  asynchronous controller
when OPC_ADDC_1 =>
  case exe_state is
    …
    when ES_5 =>          -- excess cycle, eliminated in asynchronous version
      exe_state <= ES_6;
    when ES_6 =>          -- excess cycle, eliminated in asynchronous version
      exe_state <= ES_7;
    when ES_7 =>
      SHUT_DOWN_ALU;
      cpu_state <= CS_1;
      exe_state <= ES_0;
  end case;
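
The cycle counts quoted for ADDC_1 can be reproduced with a short Python sketch. It assumes that the execution states run from ES_0 to ES_7 and that each state takes exactly one clock cycle. The states hidden behind the ellipsis in the VHDL fragment are not shown on the slide, so that part is an assumption, and the function name `cycle_count` is hypothetical.

```python
# Assumed execution states for ADDC_1: ES_0 .. ES_7, one clock cycle each.
SYNC_STATES = [f"ES_{i}" for i in range(8)]

# States the slide marks as excess cycles.
EXCESS_STATES = {"ES_5", "ES_6"}


def cycle_count(states, skipped=frozenset()):
    """Count clock cycles, one per state that is not skipped."""
    return sum(1 for state in states if state not in skipped)


if __name__ == "__main__":
    print("synchronous :", cycle_count(SYNC_STATES))
    print("asynchronous:", cycle_count(SYNC_STATES, EXCESS_STATES))
```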
          Design Implementation – Clocking Unit

[Circuit diagram: req and ack gate an inverter chain whose output is Clock]

• When req='1' and ack='0' the clock is stopped. Otherwise
  it behaves as a synchronous clock
• Delay of the inverter chain is longer than the critical path
  in the RAM to avoid timing violations
• Critical path in the RAM module is 30.9ns
• Since each inverter has a delay of 50ps, the inverter chain
  is 682 inverters long
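
The chain-length figures can be checked with simple arithmetic. The sketch below uses only the numbers on the slide; the function names are hypothetical. At 50 ps per inverter, 619 inverters is the shortest chain whose delay strictly exceeds 30.9 ns, and the 682-inverter chain gives 34.1 ns, about 10% above the RAM critical path. The slides do not say how that extra margin was chosen.

```python
RAM_CRITICAL_PATH_PS = 30_900  # 30.9 ns, from the slide
INVERTER_DELAY_PS = 50         # from the slide
CHAIN_LENGTH = 682             # from the slide


def min_chain_length(critical_path_ps: int, inverter_delay_ps: int) -> int:
    """Smallest inverter count whose total delay strictly exceeds the critical path."""
    return critical_path_ps // inverter_delay_ps + 1


def chain_delay_ps(length: int, inverter_delay_ps: int) -> int:
    """Total propagation delay through a chain of identical inverters."""
    return length * inverter_delay_ps


if __name__ == "__main__":
    minimum = min_chain_length(RAM_CRITICAL_PATH_PS, INVERTER_DELAY_PS)
    delay = chain_delay_ps(CHAIN_LENGTH, INVERTER_DELAY_PS)
    margin_pct = (delay / RAM_CRITICAL_PATH_PS - 1) * 100
    print(f"minimum chain length: {minimum} inverters")
    print(f"delay of {CHAIN_LENGTH} inverters: {delay} ps ({margin_pct:.1f}% margin)")
```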
                            Results

• Targeted VTVT standard cell library developed by Virginia
  Tech VLSI for Telecommunications.
• Asynchronous 8051 consumes more area due to onboard
clock and wrappers. RAM dominates both chip areas
       Asynchronous Cell Area: 72400
       Synchronous Cell Area: 65662
• Divmul program from the Dalton website used as a rough
benchmark of both designs in ModelSim
       Asynchronous Simulation Time: 172,030ns
       Synchronous Simulation Time: 221,390ns
• Asynchronous 8051 is roughly 28.7% faster while using
10% more area than synchronous version
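
The percentages follow directly from the figures above. The sketch below recomputes them; the helper name `percent_increase` is hypothetical. "28.7% faster" corresponds to the ratio of synchronous to asynchronous simulation time. Expressed as a reduction in simulation time, the same data gives about 22.3%.

```python
ASYNC_TIME_NS = 172_030
SYNC_TIME_NS = 221_390
ASYNC_CELL_AREA = 72_400
SYNC_CELL_AREA = 65_662


def percent_increase(value: float, baseline: float) -> float:
    """How much larger value is than baseline, in percent."""
    return (value / baseline - 1) * 100


# Speedup: synchronous time relative to asynchronous time.
speedup_pct = percent_increase(SYNC_TIME_NS, ASYNC_TIME_NS)
# Area overhead: asynchronous area relative to synchronous area.
area_overhead_pct = percent_increase(ASYNC_CELL_AREA, SYNC_CELL_AREA)
# Same timing data expressed as a reduction in simulation time.
time_reduction_pct = (1 - ASYNC_TIME_NS / SYNC_TIME_NS) * 100

if __name__ == "__main__":
    print(f"speedup:        {speedup_pct:.1f}%")
    print(f"area overhead:  {area_overhead_pct:.1f}%")
    print(f"time reduction: {time_reduction_pct:.1f}%")
```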
                            Challenges

• Had to learn all of the different tools
       A) Technical assistance was available for Ambit
       Buildgates and Cadence Encounter
       B) Resorted to user manuals and the Internet for
       Synopsys Primetime
• Learned other tools not necessary to design flow
       - Time spent learning Synopsys Design Analyzer and
       Timemill could have been better spent in later stages
       of design flow
                             Conclusion

• Converting an existing synchronous design to an
asynchronous design takes a lot of work
• Using synchronous design tools in the asynchronous design
flow made the process much easier
• Since no post-synthesis timing simulator is installed, the
timing correctness of the asynchronous design could not be
verified
• I would like to thank Narender Hanchate for his time in
helping me learn most of the tools used in this project

				