A new concept to use 3D vertical integration technology for fast pattern recognition
Ted Liu, Jim Hoff, Grzegorz Deptuch, Ray Yarema

Fermilab

Questions or Comments: jimhoff@fnal.gov
Introduction and Outline
 The development of 3D technology for the
  solution of the fast pattern recognition
  problem is part of a broader, ongoing R&D
  effort that includes both 2D and 3D
  solutions.
 This talk will cover:
 ◦ An introduction to the problem
 ◦ A description of the Associative Memory solution
 ◦ A new concept – VIPRAM – that uses emerging
   3D technology
The Obvious Problem…

There are enormous challenges in implementing pattern recognition for a
tracking trigger at LHC (L1&L2), due to

1. The much higher occupancy and event rates at the LHC
2. The much more massive detectors
3. The larger number of channels in their tracking volumes

There is a clear need to develop/improve the hardware-based pattern
recognition technology to advance the state-of-the-art for the future

[Figure: simulation panels labeled 10^32 cm^-2 s^-1, 10^33, 10^34 and 10^35]
The Challenges

To increase the pattern density by 3 orders of magnitude (from the
original AMchips) and increase the speed by more than a factor of 3
while reducing power consumption (or at least dramatically reducing
the rate of increase of power consumption) [1].

[1] Based on the extensive simulation studies by the ATLAS FTK Collaboration
Some Obvious Questions…
 Can’t we just use what we currently have
  and just make bigger PC boards or more
  of them?
 ◦ No. This results in severe speed bottlenecks
   and power issues.
 Can’t we just use commercial CAMs?
 ◦ No. CAMs are part of the fast pattern
   recognition process, but not all of it. Alone,
   CAMs lack certain necessary features, making
   them unsuitable for fast track triggering.
It’s not a CAM; it’s a PRAM
 A CAM (Content Addressable Memory)
  is a classical digital system building block




[Figure: an array of CAM cells; an input pattern is presented and the matching cells flag Match]

  • One pattern at a time
  • Each CAM cell responds or does not respond to the current pattern
  • There is no memory of previous matches
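The three properties above can be sketched in software. This is only a minimal behavioral model, assuming each CAM cell stores one integer word; the class name `CAM` and the stored values are made up for illustration and are not taken from any AMchip design.

```python
class CAM:
    """Behavioral sketch of a content addressable memory (hypothetical model)."""

    def __init__(self, stored_words):
        # One stored word per cell; the list index plays the role of the cell address.
        self.stored_words = list(stored_words)

    def search(self, word):
        # Every cell compares its stored word with the word presented now.
        # Nothing is kept between calls, so earlier matches are forgotten.
        return [addr for addr, stored in enumerate(self.stored_words)
                if stored == word]


cam = CAM([3, 7, 1, 7])      # illustrative contents only
first = cam.search(7)        # addresses of cells holding 7
second = cam.search(1)       # independent of the previous search
```

Each call to `search` answers only for the word presented at that moment, which is the limitation the PRAM is meant to remove.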
It’s not a CAM; it’s a PRAM
 A PRAM, on the other hand, is a Pattern
  Recognition Associative Memory.

[Figure: hit addresses arrive on Layers 1 to 4 (for example Address 4, Address 1, Address 79); cells on each layer flag Match, and a complete pattern flags Road!]
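A small software model may help to show what the PRAM adds to a CAM: each pattern remembers which of its layers have already matched, and a road is flagged once all of them have. This is a sketch under stated assumptions, not the AMchip logic itself. The class name `PRAM`, the 0-based layer numbering and the example addresses are all hypothetical.

```python
class PRAM:
    """Behavioral sketch of a pattern recognition associative memory."""

    def __init__(self, patterns):
        # patterns[p][layer] is the hit address pattern p expects on that layer.
        self.patterns = [tuple(p) for p in patterns]
        self.n_layers = len(self.patterns[0])
        self.clear()

    def clear(self):
        # One match flag per pattern and per layer, reset at the start of an event.
        self.matched = [[False] * self.n_layers for _ in self.patterns]

    def hit(self, layer, address):
        # A hit is compared with that layer's cell in every pattern,
        # and any match is stored rather than forgotten.
        for p, pattern in enumerate(self.patterns):
            if pattern[layer] == address:
                self.matched[p][layer] = True

    def roads(self):
        # A road is found when every layer of a pattern has matched.
        return [p for p, flags in enumerate(self.matched) if all(flags)]


pram = PRAM([(4, 1, 79, 4), (4, 4, 12, 4)])   # two made-up 4-layer patterns
for layer, address in [(0, 4), (1, 1), (1, 4), (2, 79), (3, 4)]:
    pram.hit(layer, address)
found = pram.roads()
```

Here the hits may arrive in any order and several per layer; only pattern 0 ends up with all four layers matched.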
History and the “traditional” effort
 The AMchips were invented and developed in Italy,
  resulting in the AMchip03, which is currently being
  used by CDF.
 There is an ongoing effort, led by Italians, to improve
  on the AMchip03 design. We are now a part of this
  collaboration.
 The idea, of course, is to increase pattern density and
  speed and to optimize the performance.
 Design in deep sub-micron processes. The current
  target is 65 nm.
Limitations in 2D…
A Single PRAM Cell
(in 2 dimensions)
In the older version of the AMchip, the match lines were a source of
speed limitation because of their length and capacitance. The Glue
Logic was large and slow.

[Figure: a single PRAM cell in 2D, labeled CAM Cells, Match Storage, Match lines and Glue Logic, with the annotation Length -> Capacitance -> Reduced Speed]
THE CONCEPT – VIPRAM
Vertically Integrated Pattern Recognition
Associative Memory

 A Reduced Footprint and therefore greater pattern
  density.
 Shorter Match lines and therefore greater speed.
 Less Capacitance and therefore reduced power
  consumption.
 Each detector layer corresponds to a single tier.
 All communication goes from the “CAM Tiers” to the
  single “Control Tier”.
 The PRAM concept is tailor-made for 3D design.

[Figure: 3D PRAM cell with annotations reading “Much Shorter”]
Another Single CAM Cell
    (this time in 3 dimensions)
 Viewing this structure as a pseudo-layout, some of
  the aforementioned benefits become even more
  obvious.
 The 3-dimensional design of the VIPRAM makes the
  PRAM appear like a 2-dimensional array of “tubes”,
  each dedicated to a single pattern.
 Communication with the outside world during normal
  operation is done solely through the Control Tier
  (the blue tier on top).
Pattern recognition for tracking
is naturally a task in 3D
[Figure labels: road, track]
Majority Logic – Old Version
[Figure: the Match Lines feed an Adder; a Digital Comparator compares the sum with a User-defined Threshold and produces the Road Flag]
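The adder-plus-comparator arrangement can be written down in a few lines. This is a minimal sketch, assuming the comparator fires when the number of matched layers is greater than or equal to the threshold (the slide does not state the comparison). The function name `road_flag_adder` is hypothetical.

```python
def road_flag_adder(match_lines, threshold):
    """Old-style majority logic: count the matches, then compare."""
    matched_layers = sum(1 for m in match_lines if m)   # the Adder
    # The Digital Comparator; ">=" is an assumption of this sketch.
    return matched_layers >= threshold


flag_three_of_four = road_flag_adder([1, 1, 0, 1], threshold=3)
flag_two_of_four = road_flag_adder([1, 0, 0, 1], threshold=3)
```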
Majority Logic – New Version

[Figure: Pass Transistor Logic. Three rows of four two-input (1/0) multiplexers, one column per match line (Match1 to Match4), with each match line driving the Sel inputs of its column; the output is the Match Pattern]
Majority Logic – New Approach
   For each stage…

     Stage Input   Stage Output: Match   Stage Output: Mismatch
     111           111                   011
     011           011                   001
     001           001                   000
     000           000                   000

   In the end…

     Majority Pattern   Meaning
     111                Perfect Match
     011                1 Missing Layer
     001                2 Missing Layers
     000                3 or More Missing Layers
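The two tables can be read as a small state machine: a 3-bit pattern passes through one stage per layer, unchanged on a match and shifted by one position on a mismatch. The sketch below follows the tables directly. It assumes the first stage is fed 111, which the tables imply but do not state; the names `stage`, `majority_pattern` and `MEANING` are hypothetical.

```python
PERFECT = (1, 1, 1)   # assumed input to the first stage

MEANING = {
    (1, 1, 1): "Perfect Match",
    (0, 1, 1): "1 Missing Layer",
    (0, 0, 1): "2 Missing Layers",
    (0, 0, 0): "3 or More Missing Layers",
}


def stage(stage_input, match):
    """One stage: pass the pattern through on a match, shift in a 0 on a mismatch."""
    if match:
        return stage_input
    return (0,) + stage_input[:-1]


def majority_pattern(match_lines):
    """Chain one stage per layer and return the final 3-bit pattern."""
    pattern = PERFECT
    for match in match_lines:
        pattern = stage(pattern, match)
    return pattern


one_missing = majority_pattern([1, 0, 1, 1])
label = MEANING[one_missing]
```

No adder or comparator is needed: the number of leading zeros in the final pattern is the number of missing layers, saturating at three.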
Can 3D exploit even more advantages
from the new Majority Logic?

 Yes. We have divided the 3D design by
  detector layer (i.e. each CAM Tier is
  dedicated to one detector layer).
  Therefore, any logical division by
  detector layer results in functions that
  can be sub-divided by tier.
[Figure: the pass transistor logic drawn as four stacked stages, Match1 at the bottom to Match4 at the top; each stage has three Sel-controlled 0/1 multiplexers, and the Match Pattern comes out at the top]
Readout
 The top tier (a.k.a. the Control Tier) is a
  two-dimensional array of elements whose position
  is indicative of its address and that contains an
  indication of whether or not a road was found.
  Compare this with a pixel array, which is a
  two-dimensional array of elements whose position
  is indicative of its address and that contains an
  indication of whether or not a hit was found.
 In other words, high-speed readout architectures
  for pixel arrays can and should be used for
  VIPRAM readout.
Design for Simplicity
 The VIPRAM has two types of tiers, CAM and Control.
  In the final design, there will be several CAM tiers
  and only one Control tier.
 Each CAM tier is functionally identical to the others,
  but must maintain a unique relationship to the
  Control tier in order to work. In other words,
  patterns that come into the Control Tier from
  Detector “1” must be sent to the CAM tier dedicated
  to Detector “1”. Similarly, when data is sent from
  CAM tier #3, the Control Tier must know it came from
  CAM tier #3 and not some other CAM tier.
 How can this be done without requiring unique mask
  sets for each CAM tier?
Great minds think alike?
 Having gone part-way through this design procedure,
  the collaboration had the opportunity to meet with
  Bob Patti of Tezzaron, who has been involved in 3D
  memory design from the beginning.
 Tezzaron’s 3D Memories follow exactly this
  arrangement of Control Tier and (in Tezzaron’s case)
  Memory Tier.
 In other words, we are following a beaten path, not
  blazing a new trail.
The Diagonal Via




The Diagonal Via was patented by Bob
Patti and Tezzaron in 2000. It converts
vertical position to horizontal position and
allows a common mask set to provide unique
access to each layer.
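One way to picture how identical tiers can still be addressed individually is the toy model below. It is only a reading of the one-sentence description above, not Tezzaron's actual layout: every tier is assumed to tap position 0 of the incoming bus and to hand the remaining lines on to the next tier shifted by one position. The function names `tier_pass` and `drive_stack` are hypothetical.

```python
def tier_pass(bus):
    """One tier, identical for every tier in the stack (same mask set)."""
    local_signal = bus[0]        # each tier always listens at position 0
    outgoing = bus[1:] + [0]     # the diagonal step: line i+1 lands on position i
    return local_signal, outgoing


def drive_stack(control_bus, n_tiers):
    """Send a bus from the Control Tier through n_tiers identical tiers."""
    received = []
    bus = list(control_bus)
    for _ in range(n_tiers):
        local_signal, bus = tier_pass(bus)
        received.append(local_signal)
    return received


# Line k of the control bus is meant for tier k (0-based, counted from the Control Tier).
seen_by_tiers = drive_stack([0, 0, 1, 0], n_tiers=4)
```

In this model the tier's vertical position in the stack decides which horizontal bus position it ends up seeing, although no tier contains anything unique.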
Conclusions and Future Work
 The VIPRAM is a new concept, and we are now
  developing a collaboration with Fermilab,
  University of Chicago, INFN and Argonne.
  ◦ The immediate goal is a proof of principle
  ◦ The ultimate goal is a 3 orders-of-magnitude
    increase in performance (density+speed).
 At present, we are seeking funding for the
  VIPRAM development.
 You will hear from us again at the next TIPP
  (please pick a nice place for my wife…)
Background
Figure 13 - Pass Transistor Multiplexors in the Majority Logic
VIPRAM –
A Vertically Integrated PRAM
 Modern technology
  provides us with another
  approach…and another
  dimension.
 At first, the idea was
  extremely simple –
  increase the pattern
  density by stacking
  otherwise normal AMchips.
  The outputs of existing
  AMchips are already in a
  daisy chain. The stacked
  AMchips would not need
  to “know” that they were
  part of a stack.
VIPRAM –
A Vertically Integrated PRAM
 This was necessarily modified to include “wrapping”
  an AMchip in circuitry that dealt with the 3D
  stacking, leaving an AMchip core that was identical
  to the 2D AMchips that are under development.
Not the first to consider 3D Content Addressable Memory
 Oh and Franzon [1] first suggested the advantages of
  3D design on CAMs in 2007.
 Their idea involved vertically integrating the CAM
  cell itself so that the Matchline was vertical. This
  minimized its length and therefore its capacitance.
 The method is highly impractical since it requires
  f(N) 3D layers, where N is the number of bits in the
  CAM cell.

[Figure: CAM Bit Cells stacked on 3D Layer 1, 3D Layer 2, 3D Layer 3, ..., 3D Layer N, joined by a vertical Matchline]

[1] E.C. Oh and P.D. Franzon, “Design Considerations
and Benefits of Three-Dimensional Ternary Content
Addressable Memory”, IEEE Custom Integrated Circuits
Conference, 2007, p. 591
Again, this is a PRAM not a CAM
 There is a perfectly natural, 3D functional division
  in a PRAM. Each detector layer gets its own 3D layer.
 The vertical interconnect is not the CAM match line,
  but the Road line.
 Moreover, each detector layer has independent data
  lines for both pattern matching and pattern loading,
  and this is a natural consequence of this
  architecture.

[Figure: 3D Layer 1, 3D Layer 2, 3D Layer 3, ..., 3D Layer N, each with cells flagging Match; the stack flags Road!]
How can we improve on this design?

[Figure: chip layout. ~80% AM bank (4 blocks of 1280 6-layer patterns); ~20% control & interface, annotated “Move to another tier in 3D”]
How can we fundamentally improve on this design?

[Figure: within each pattern block, the Majority block (~30%) is still in standard cell; it can also be moved to the control/interface tier in 3D]
Fischer Tree (Mephisto Logic)
 P. Fischer introduced the Mephisto readout
  architecture [1].
 We found “Fischer Tree” easier to say.
 It is a self-selecting, self-addressing priority
  encoding architecture that performs the task in
  O(log N) time.

[1] “First implementation of the MEPHISTO binary
readout architecture for strip detectors”, Nuclear
Instruments and Methods in Physics Research Section
A: Accelerators, Spectrometers, Detectors and
Associated Equipment, Volume 461, Issues 1-3, 1 April
2001, Pages 499-504 (8th Pisa Meeting on Advanced
Detectors)
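The logarithmic behavior can be illustrated with a generic tree-shaped priority encoder. This is not a gate-level reproduction of the MEPHISTO circuit, only a sketch of the same idea under simplifying assumptions: the number of inputs is a power of two, and the lowest address has priority. The function names `fischer_tree_select` and `read_out` are hypothetical.

```python
def fischer_tree_select(flags):
    """Return (found, address) of the lowest-address set flag via a binary OR tree."""
    n = len(flags)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes a power-of-two width"

    # Upward pass: levels[0] holds the leaves, each higher level ORs pairs.
    levels = [[bool(f) for f in flags]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] or prev[i + 1] for i in range(0, len(prev), 2)])

    if not levels[-1][0]:
        return False, None

    # Downward pass: at each level take the left child if it is flagged,
    # otherwise the right one. Each choice contributes one address bit,
    # so the depth is log2(n).
    index = 0
    for level in reversed(levels[:-1]):
        left = 2 * index
        index = left if level[left] else left + 1
    return True, index


def read_out(flags):
    """Read all set flags in priority order, clearing each one after selection."""
    flags = list(flags)
    addresses = []
    while True:
        found, address = fischer_tree_select(flags)
        if not found:
            return addresses
        addresses.append(address)
        flags[address] = 0


road_addresses = read_out([0, 0, 1, 0, 1, 0, 0, 0])
```

The software loops run sequentially, but in hardware each tree level is a layer of gates, so the selection settles in a number of gate delays proportional to log2(N).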
Fischer Tree (Mephisto Logic)
 Fischer Trees can be stacked if need be, so the
  two-dimensional array in the Control Tier can be
  handled this way.
 An alternate approach could take each output and
  push it into a stack.

[Figure: one Fischer Tree per column (Col 1, Col 2, Col 3, …, Col N), with a further Fischer Tree below them]
								