Timing Issues for Low-Energy Design

From Algorithms to Systems-on-a-Chip in a Semester
EE225C - 2000
Borivoje Nikolić
Fall 2000 - EE225C
• Course topics:
  – Communication systems oriented
  – Building blocks
     • Datapaths, arithmetic (adders, multipliers, MACs,
       dividers, CORDICs)
     • Parallelization, pipelining, unrolling, etc.
     • Transformations: FIR filters, Viterbi decoders
  – Systems
     • Finite wordlengths, ADCs, AGC, adaptive equalizers,
       sequence detection
     • Applied to xDSL, Gigabit Ethernet, wireless, disk drives
                 Projects
• 18 students
• Two phases:
  – Block design
  – Putting a system together
• Simulink + Module Compiler + functional equivalence (VSS)
[Figure: design flow. MCL code feeds Module Compiler, which produces a std. cell netlist and behavioral VHDL; a Simulink model feeds Simulink, which produces test vectors; VHDL simulation produces a correspondence report.]
        Design Projects
– Timing recovery for CDMA
– OFDM receiver with multi-antenna support
– 3G Turbo decoder
– LDPC iterative decoder
– Polyphase filter bank
– RAKE receiver
– Adaptive image-reject mixer
– Decoder for maskless lithography
               OFDM Receiver
• Similar to 802.11a system specification
• Blocks
    –   Synchronization
    –   FFT
    –   Viterbi decoder
    –   SVD
• System Integration and Simulation
• Students: Hayun Tang, Ning Zhang, Dejan
  Markovic, Yun Chiu
                               SVD for multi-antenna
[Figure: four-antenna transmitter and receiver block diagrams.
 Transmitter labels: bits, Coding and Modulation, S/P (1 to 48) on each of 4 streams, 4x4 SVD blocks with transposes (fed from Rx), IFFT (48 to 64), cyclic prefix, P/S, D/A, RF.
 Receiver labels: RF, A/D, S/P, cyclic prefix removal (cyclic pref^-1), FFT (64 to 48), 4x4 SVD blocks with transposes (from Rx, to Tx), P/S, Decoding and Demodulation, bits.]
           CDMA Baseband
Students: Josie Ammer, Mike Sheets
• Design a 1.6 Mbps DSSS timing recovery unit
• Modulation
  – Length-31 PN code
  – QPSK symbol constellation
• System specifications
  – Maximum frequency offset of ±200 kHz
  – Minimum input SNR of +1 dB
  – Input is in-phase and quadrature samples at 200 MHz with 7 bits each
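A length-31 PN code is typically a maximal-length sequence from a 5-bit shift register. The sketch below is only an illustration: the slides do not give the generator polynomial, so x^5 + x^2 + 1 (a known primitive polynomial), the seed, and the function names are all assumptions. It also shows the two-valued circular autocorrelation that makes such a code useful for timing acquisition.

```python
def lfsr_pn31(seed=0b00001):
    """Length-31 m-sequence from a 5-bit LFSR.

    Assumed recurrence: a[n+5] = a[n+2] XOR a[n], i.e. x^5 + x^2 + 1.
    Returns chips mapped as bit 0 -> +1, bit 1 -> -1.
    """
    state = seed & 0x1F
    if state == 0:
        raise ValueError("seed must be non-zero")
    chips = []
    for _ in range(31):
        out = state & 1                    # oldest bit leaves the register
        chips.append(1 - 2 * out)
        feedback = (state ^ (state >> 2)) & 1
        state = (state >> 1) | (feedback << 4)
    return chips


def circular_correlation(chips, shift):
    """Correlate the sequence with a circularly shifted copy of itself."""
    n = len(chips)
    return sum(chips[i] * chips[(i + shift) % n] for i in range(n))


pn = lfsr_pn31()
peak = circular_correlation(pn, 0)
sidelobes = {circular_correlation(pn, k) for k in range(1, 31)}
print(peak, sidelobes)
```

A timing-acquisition block can search for the shift that produces the correlation peak; every other shift gives the same small sidelobe value.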
                   CDMA Baseband
[Figure: timing recovery block diagram. Blocks: Coarse Timing Acquisition; Frequency Offset Estimation and Fine Timing Acquisition; Rotate and Correlate; PLL; Controller; two MUXes (sel1, sel2). Inputs: streams (8*14), PN_PILOT, PN_DATA, CONTROL. Outputs: HARD SYMB, SOFT SYMB. Bus labels: 8*14, 3*14, 1*14, 1*24. Control signals: sym strobe, pilot det, pilot mode, freq est, correction, s_start, s_end, clear, start, sel, clk, PN, en.]
                         RAKE Receiver
Student: Tufan Karalar

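[Figure: correlator of one RAKE finger. An I/Q input (label 8) and a delay line S[n], S[n-1], S[n-2], S[n-3] built from Z^-1 elements; each of the four taps (0th to 3rd) is multiplied with R[n], accumulated (Σ), and weighted by C*0 to C*3. Internal buses are labeled 22 and 24; the output is I/Q. Figure annotation: "Third multipath component can be observed in here".]
• One finger dissipates 4 mW, runs at 25 MHz, and has an area of 0.4 mm²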
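The correlator structure in the figure (delayed taps, accumulation, conjugate weights C*k) can be sketched as follows. This is a floating-point illustration under assumptions, not the fixed-point hardware: the function name, the argument layout, and the toy two-path channel are hypothetical.

```python
def rake_combine(received, code, coeffs):
    """Correlate each delayed tap with the code, then combine the taps.

    received: complex samples, at least len(code) + len(coeffs) - 1 long
    code:     spreading chips (+1 / -1)
    coeffs:   assumed channel estimates C_k, one per multipath delay k
    """
    total = 0j
    for k, c_k in enumerate(coeffs):
        # Tap k correlates the code against the input delayed by k chips.
        acc = sum(received[n + k] * code[n] for n in range(len(code)))
        # Weighting by the conjugate aligns the phases before summing.
        total += c_k.conjugate() * acc
    return total


# Toy example: one symbol (+1) spread by a 3-chip code over two paths.
code = [1, 1, -1]
coeffs = [1 + 0j, 0.5j]
received = [1, 1 + 0.5j, -1 + 0.5j, -0.5j]   # path 0 plus delayed path 1
print(rake_combine(received, code, coeffs))
```

With conjugate weighting, the contribution of each path adds in phase, so the second path increases the decision statistic instead of acting as interference.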
     Polyphase Filter Bank
[Figure: polyphase filter bank block diagram; rate labels 480 MHz and 15 MHz*.]
Students: Kevin Camera, Changchun Shi
  Adaptive Image-Reject Mixer
[Figure: receiver chain with an image tone at the input: RF filter, LNA, quadrature mixer driven by LO1 (I/Q), second quadrature mixer driven by LO2 (I/Q), A/D, DSP.]
• Image-rejection ratio (IRR) is reduced by circuit mismatches
   – Phase mismatch in quadrature oscillators
   – Gain (ΔA) mismatch in I and Q paths
• Need 60 dB IRR
[Plot: image rejection (dB, axis 45 to 70) versus phase mismatch φ (deg., axis 0.02 to 0.18), titled "Mixer 2 Gain", with curves for 1 + ΔA = 1.001, 1.003, and 1.01.]
        Students: Gabriel Desjardins, Isaac Sever
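The dependence of image rejection on gain and phase mismatch can be illustrated with the textbook expression for a single quadrature pair, IRR = (1 + 2g·cos φ + g²) / (1 − 2g·cos φ + g²), where g = 1 + ΔA. This is a sketch under that assumption; the slides do not state which model produced the plot, and the function name is hypothetical.

```python
import math


def image_rejection_db(gain_ratio, phase_deg):
    """Image-rejection ratio in dB for gain ratio g = 1 + dA and phase error.

    Uses the standard single-quadrature-pair model (an assumption here).
    """
    g = gain_ratio
    c = math.cos(math.radians(phase_deg))
    wanted = 1.0 + 2.0 * g * c + g * g
    image = 1.0 - 2.0 * g * c + g * g
    if image == 0.0:
        return math.inf            # perfect matching: no image leaks through
    return 10.0 * math.log10(wanted / image)


for g in (1.001, 1.003, 1.01):
    row = [round(image_rejection_db(g, p), 1) for p in (0.02, 0.10, 0.18)]
    print(g, row)
```

Under this model, a 60 dB target requires the gain error to be close to 0.1% and the phase error to stay near a tenth of a degree, which is why the mixer is tuned adaptively.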
Adaptation via Spectral Estimation
• Two Components
   – Discrete Fourier Transform, Finite State Machine
• FSM uses DFT output to make gain and
  phase tuning decisions

[Figure: adaptation loop. Mixer + A2D → DFT → FSM, with the FSM driving GAIN TUNE and PHASE TUNE back to the mixer; bus labels 13, 32, 6, 6.]
                         Adaptation via LMS
LMS update equation: G_A[n+1] = G_A[n] + μ X[n] ε[n]
[Figure: Gain Tune and Phase Tune blocks drive the Mixer & ADC; the I channel and Q channel outputs feed back to each tuning block.]
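The LMS update can be illustrated with a generic scalar loop. This is a minimal sketch, not the project's tuning loop: the step size, the test signal, and the choice of error signal (difference between a reference and the gain-scaled input) are assumptions made for the example.

```python
import math


def lms_scalar_gain(x, d, mu=0.05, g0=0.0):
    """Adapt a single gain g so that g * x[n] tracks the reference d[n].

    Update: g[n+1] = g[n] + mu * x[n] * err[n], with err[n] = d[n] - g[n] * x[n].
    """
    g = g0
    history = []
    for xn, dn in zip(x, d):
        err = dn - g * xn
        g = g + mu * xn * err
        history.append(g)
    return g, history


# Hypothetical example: recover a 1% gain mismatch from a sinusoidal input.
x = [math.cos(0.3 * n) for n in range(2000)]
d = [1.01 * xn for xn in x]
g_final, _ = lms_scalar_gain(x, d)
print(g_final)
```

A smaller step size mu converges more slowly but leaves less residual jitter in the gain once noise is present.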
                    3G Turbo Decoder
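[Figure: Encoder, parallel concatenation: input u_k feeds Encode 1 directly and Encode 2 through an interleaver, giving outputs x_s, x_p1, x_p2 (together x). Decoder: received y (y_s, y_p1, y_p2) is processed by SISO 1 and SISO 2, connected through an interleaver and a de-interleaver (π, π^-1), producing the estimate û_k.]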
                    Students: Stephanie Augsburger, Chris Savarese
   SISO Block: SOVA Implementation
• Standard Viterbi algorithm plus soft output
• Reliability Measure Unit computes soft outputs
• Less complex than MAP
• Expected higher BER than MAP

   SISO Block: MAP Implementation
• Double Viterbi algorithm: forward and backward
• Soft output is a Log-Likelihood Ratio (LLR)
• More complex than SOVA
• Expected BER improvement over SOVA

High-Speed Iterative Decoder
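[Figure: Outer Encoder → Inner Encoder → channel with added Noise → Inner Decoder ⇄ Outer Decoder, with an interleaver and de-interleaver (π, π^-1) between the two decoders.]
• Outer: Turbo (convolutional) or low-density parity-check code
• Inner: Channel with MAP (BCJR) or SOVA decoder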


                   Students: Yeo, Zlatanovici
                                  MAP Decoder
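Bi-directional trellis decoding
[Figure: MAP decoder datapath with two memories (depths 3L and L), three ACSA units, an ACSA tree, and a MUX. ACSA detail: inputs SM1, BM1, SM2, BM2; a subtractor, Abs(), MSB select, and LUT produce SM'.]
BCJR (Bahl, Cocke, Jelinek, Raviv) algorithm:
log(e^A0 + e^A1) = max{A0, A1} + log(1 + e^(-|A0 - A1|))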
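The identity above is what lets an add-compare-select unit with a small correction table replace the exponentials. A minimal sketch follows; the table step (0.125) and size (64 entries) are illustrative assumptions, not values taken from the design.

```python
import math


def max_star_exact(a0, a1):
    """log(e^a0 + e^a1) computed as max plus a correction term."""
    return max(a0, a1) + math.log1p(math.exp(-abs(a0 - a1)))


def build_lut(step=0.125, entries=64):
    """Correction term log(1 + e^-d) sampled at d = 0, step, 2*step, ..."""
    return [math.log1p(math.exp(-i * step)) for i in range(entries)]


def max_star_lut(a0, a1, lut, step=0.125):
    """Same operation with the correction read from a lookup table."""
    index = int(abs(a0 - a1) / step)
    correction = lut[index] if index < len(lut) else 0.0
    return max(a0, a1) + correction


LUT = build_lut()
print(max_star_exact(1.5, -2.0), max_star_lut(1.5, -2.0, LUT))
```

Dropping the correction entirely gives the max-log approximation; the table recovers most of the difference at the cost of one subtract, one absolute value, and one lookup, which matches the Abs() and LUT blocks in the figure.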
                            LDPC Decoder
[Figure: two datapaths.
 Bit-to-Check (each bit node connected to 4 check nodes): inputs R_j1,i to R_j4,i and the Prior pass through f blocks and 3D delays; a LUT and MSB stage produce Q_i,j1 to Q_i,j4.
 Check-to-Block (each check node connected to 36 bit nodes): Q_i,j from memory passes through a LUT, f blocks with /36 accumulation, and two 36-cycle FIFOs; the sign term is the product of sgn(Q_i',j) over i' in Row[j] \ i; a LUT produces R_j,i (R_j-36,i to memory).]
         •   Fine-grained pipelining
         •   Carry-save operations
         •   Shift registers for memory and automatic pipelining for address decoding
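The check-node side of the figure (f blocks, a running sum, and a product of signs that excludes the target bit) matches the usual sum-product update. The sketch below assumes f is Gallager's function f(x) = -log(tanh(x/2)); the slides do not define f, and the function names and clamping limits are hypothetical.

```python
import math


def f(x):
    """Assumed Gallager function; it is its own inverse for x > 0."""
    x = min(max(x, 1e-9), 30.0)          # clamp to keep log and tanh finite
    return -math.log(math.tanh(x / 2.0))


def check_to_bit(q):
    """Messages from one check node to each of its connected bit nodes.

    q: incoming bit-to-check messages (log-likelihood ratios).
    Each output excludes the message that came from the target bit.
    """
    total = sum(f(abs(v)) for v in q)
    negatives = sum(1 for v in q if v < 0)
    out = []
    for v in q:
        magnitude = f(total - f(abs(v)))             # sum over the other bits
        others_negative = negatives - (1 if v < 0 else 0)
        sign = -1.0 if others_negative % 2 else 1.0  # product of other signs
        out.append(sign * magnitude)
    return out


print(check_to_bit([1.2, -0.8, 2.5, 0.4]))
```

Computing the full sum once and subtracting each bit's own term is what allows a serial datapath to process all connected bits with a FIFO rather than recomputing the sum for every output.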
           Maskless Lithography
[Figure: data path. Storage disks (640 GBit; 25 to 1, all compressed layers) → 1.1 GBit/s → processor board (64 GBit memory; 25 to 1, single compressed layer) → 400 GBit/s → Decompress → 10 TBit/s → Writers. On-chip hardware: I/O, Demux + Buffer, parallel decompression paths, and writers built on a ‘smart’ memory array. Chip outline: 10 mm by 20 mm, with Decomp. and Writers regions.]
    Decoding for Maskless Lithography

• Huffman decoding
• 1D match decoding
[Figure: 400 GBit/s input → Huffman Decode (1 bit / cycle) → Buffer → Stream Decode → LZ Systolic Array (1 pixel / cycle) → 10 TBit/s → Writers (10 kHz flash).]
                Students: Vito Dai, Yasheh Shroff, Mason Freed
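A Huffman decoder that consumes one bit per cycle can be modeled as a walk down a binary tree, returning to the root each time a symbol is emitted. The sketch below is illustrative only: the codebook and symbol names are hypothetical and are not the project's code tables.

```python
def build_decode_tree(codebook):
    """Build a nested-dict binary tree from a prefix-free codebook."""
    root = {}
    for symbol, bits in codebook.items():
        node = root
        for b in bits[:-1]:
            node = node.setdefault(b, {})
        node[bits[-1]] = symbol          # leaf holds the decoded symbol
    return root


def decode_bits(bitstream, root):
    """Decode a string of '0'/'1' characters, one tree step per bit."""
    out = []
    node = root
    for b in bitstream:
        nxt = node[b]
        if isinstance(nxt, dict):
            node = nxt                   # still inside a codeword
        else:
            out.append(nxt)              # codeword complete
            node = root
    if node is not root:
        raise ValueError("bitstream ended in the middle of a codeword")
    return out


# Hypothetical codebook for illustration.
CODEBOOK = {"white_run": "0", "black_run": "10", "match": "110", "literal": "111"}
TREE = build_decode_tree(CODEBOOK)
print(decode_bits("0101101110", TREE))
```

Because each step depends on the previous one, a single such decoder is limited to one bit per cycle, which is consistent with using many parallel decompression paths to reach the aggregate data rate.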
     Separate Class Project
SCF 1
0.25 µm CMOS
Fully functional first time




 SCF 2
 Bob Brodersen, Mats Torkelsen, Nathan Chan
 Using the new design flow
           What did we learn?
•   Flow works surprisingly well
•   Easy to learn
•   Still fragile
•   Need to add support for SRAM
•   Need block-level timing analysis
•   Faster designs will need regular placement

				