

COVER STORY

Stacked & Loaded: Xilinx SSI, 28-Gbps I/O Yield Amazing FPGAs

by Mike Santarini
Publisher, Xcell Journal
Xilinx, Inc.

Xcell Journal, First Quarter 2011

‘More than Moore’ stacked silicon interconnect technology and 28-Gbps transceivers lead new era of FPGA-driven innovations.

Xilinx recently added to its lineup two innovations that will further expand the application possibilities and market reach of FPGAs. In late October, Xilinx® announced it is adding stacked silicon interconnect (SSI) FPGAs to the high end of its forthcoming 28-nanometer Virtex®-7 series (see Xcell Journal, Issue 72). The new architecture connects several dice on a single silicon interposer, allowing Xilinx to field Virtex-7 FPGAs that pack as many as 2 million logic cells, twice the logic capacity of any other announced 28-nm FPGA, and so deliver next-generation capabilities in the current generation of process technology.

Then, in late November, Xilinx announced the Virtex-7 HT line of devices. Leveraging this SSI technology to combine FPGA and high-speed transceiver dice on a single IC, the Virtex-7 HT devices are a giant technological leap forward for customers in the communications sector and for the growing number of applications requiring high-speed I/O. These new FPGAs carry up to sixteen 28-Gbps transceivers along with dozens of 13.1-Gbps transceivers in the same device, facilitating the development of 100-Gbps communications equipment today and 400-Gbps communications line cards well in advance of established standards for equipment running at that speed.

MORE THAN MOORE
Ever since Intel co-founder Gordon Moore published his seminal article “Cramming More Components onto Integrated Circuits” in the April 19, 1965 issue of Electronics magazine, the semiconductor industry has doubled the transistor counts of new devices every 22 months, in lockstep with the introduction of every new silicon process. Like other companies in the semiconductor business, Xilinx has learned over the years that to lead the market, it must keep pace with Moore’s Law and create silicon on each new generation of process technology or, better yet, be the first company to do so.

Now, at a time when the complexity, cost and thus risk of designing on the latest process geometries are becoming prohibitive for a greater number of companies, Xilinx has devised a unique way to more than double the capacity of its next-generation devices, the Virtex-7 FPGAs. By introducing one of the semiconductor industry’s first stacked-die architectures, Xilinx will field a line of the world’s largest FPGAs. The biggest of these, the 28-nm Virtex-7 XC7V2000T, offers 2 million logic cells along with 46,512 kbits of block RAM, 2,160 DSP slices and 36 GTX 10.3125-Gbps transceivers. The Virtex-7 family includes multiple SSI FPGAs as well as monolithic FPGA configurations. Virtex-7 is the high end of the 7 series, which also includes the new low-cost, low-power Artix™ FPGAs and the midrange Kintex™ FPGAs, all implemented on the unified Application Specific Modular Block Architecture (ASMBL).

The new SSI technology is more than just a windfall for customers itching to use the biggest FPGAs the industry can muster. The successful deployment of stacked dice in a mainstream logic chip marks a huge semiconductor engineering accomplishment. Xilinx is delivering a stacked silicon chip at a time when most companies are just evaluating stacked-die architectures in hopes of reaping capacity, integration, PCB real-estate and even yield benefits. Most of these companies are looking to stacked-die technology simply to keep up with Moore’s Law; Xilinx is leveraging it today as a way to exceed it and as a way to mix and match complementary types of dice on a single IC footprint to offer vast leaps forward in system performance, bill-of-materials (BOM) savings and power efficiency.

THE STACKED SILICON ARCHITECTURE
“This new stacked silicon interconnect technology allows Xilinx to offer next-generation density in the current generation of process technology,” said Liam Madden, corporate vice president of FPGA development and silicon technology at Xilinx. “As die size gets larger, the yield goes down exponentially, so building large dice is quite difficult and very costly. The new architecture allows us to build a number of smaller dice and then use a silicon interposer to connect those smaller dice lying side-by-side on top of the interposer so they appear to be, and function as, one integrated die” (Figure 1).

Figure 1 – The stacked silicon architecture places several dice (aka slices) side-by-side on a silicon interposer. The interposer provides more than 10K routing connections between slices at roughly 1-ns latency.

The dice are interconnected via layers in the silicon interposer in much the same way that discrete components are interconnected on the many layers of a printed-circuit board (Figure 2). Each die connects to the silicon interposer layer by means of multiple microbumps. The architecture also uses through-silicon vias (TSVs) that run through the passive silicon interposer to facilitate direct communication between regions of each die on the device and resources off-chip (Figure 3). Data flows between adjacent FPGA dice across more than 10,000 routing connections.

Madden said using a passive silicon interposer rather than going with a system-in-package or multichip-module configuration has huge advantages. “We use regular silicon interconnect or metallization to connect up the dice on the device,” said Madden. “We can get many more connections within the silicon than you can with a system-in-package. But the biggest advantage of this approach is power savings. Because we are using chip interconnect to connect the dice, it is much more economical in power than connecting dice through big traces, through packages or through circuit boards.”

In fact, the SSI technology provides more than 100 times the die-to-die connectivity bandwidth per watt, at one-fifth the latency, without consuming any high-speed serial or parallel I/O resources.

Madden also notes that the microbumps are not directly connected to the package. Rather, they are interconnected to the passive interposer, which in turn is linked to the adjacent die. This setup offers great advantages by shielding the microbumps from electrostatic discharge. By positioning the dice next to each other and interfacing them to the ball-grid array, the device avoids the thermal flux, signal integrity and design tool flow issues that would have accompanied a purely vertical die-stacking approach.

As with the monolithic 7 series devices, Xilinx implemented the SSI members of the Virtex-7 family in TSMC’s 28-nm HPL (high-performance, low-power) process technology, which Xilinx and TSMC developed to create FPGAs with the right mix of power efficiency and performance (see cover story sidebar, Xcell Journal, Issue 72).

Figure 2 – Xilinx’s stacked silicon technology uses passive silicon-based interposers, microbumps and TSVs. The cross-section shows four 28nm FPGA slices on the silicon interposer, with through-silicon vias leading down to the C4 bumps, the package substrate and the BGA balls.

Microbumps
• Access to power / ground / IOs
• Access to logic regions
• Leverages ubiquitous image sensor microbump technology

Through-Silicon Vias (TSVs)
• Only bridge power / ground / IOs to C4 bumps
• Coarse pitch, low density aid manufacturability
• Etch process (not laser drilled)

Passive Silicon Interposer
• 4 conventional metal layers connect microbumps & TSVs
• No transistors means low risk and no TSV-induced performance degradation
• Etch process (not laser drilled)

Side-by-Side Die Layout
• Minimal heat flux issues
• Minimal design tool flow impact

NO NEW TOOLS REQUIRED
While the SSI technology offers some radical leaps forward in terms of capacity, Madden said it will not force a radical change in customer design methodologies. “One of the beautiful aspects of this architecture is that we were able to establish the edges of each slice [individual die in the device] along natural partitions where we would have traditionally run long wires had these structures been in our monolithic FPGA architecture,” said Madden. “This meant that we didn’t have to do anything radical in the tools to support the devices.” As a result, “customers don’t have to make any major adjustments to their design methods or flows,” he said.

At the same time, Madden said that customers will benefit from adding floor-planning tools to their flows because they now have so many logic cells to use.

A SUPPLY CHAIN FIRST
While the design is in and of itself quite innovative, one of the biggest challenges of fielding such a device was in putting together the supply chain to manufacture, assemble, test and distribute it. To create the end product, each of the individual dice must first be tested extensively at the wafer level, binned and sorted, and then attached to the interposer. The combined structure then needs to be packaged and given a final test to ensure connectivity before the end product ships to customers.

Madden’s group worked with TSMC and other partners to build this supply chain. “This is another first in the industry, as no other company has put in place a supply chain like this across a foundry and OSAT [outsourced semiconductor assembly and test],” said Madden.

“Another beautiful aspect of this approach is that we can use essentially the same test approach that we use in our current devices,” he went on. “Our current test technology allows us to produce known-good dice, and that is a big advantage for us because in general, one of the biggest barriers of doing stacked-die technology is how do you test at the wafer level.”

Figure 3 – Actual cross-section of the 28-nm Virtex-7 device. TSVs can be seen connecting the microbumps (dotted line, top) through the silicon interposer.

Because the stacked silicon technology integrates multiple Xilinx FPGA dice on a single IC, it logically follows that the architecture would also lend itself to mixing and matching FPGA and other dice to create entirely new devices. And that’s exactly what Xilinx did with its ultrafast Virtex-7 HT line, announced just weeks after the SSI technology rollout.

DRIVING COMMUNICATIONS TO 400 GBPS
The new Virtex-7 HT line of devices is targeted squarely at communications companies that are developing 100- to 400-Gbps equipment. The Virtex-7 HT combines on a single IC multiple 28-nm FPGA dice, bearing dozens of 13.1-Gbps transceivers, with 28-Gbps transceiver dice. The result is a device with a formidable mix of logic cells as well as cutting-edge transceiver performance and reliability.

The largest of the Virtex-7 HT line includes sixteen GTZ 28-Gbps transceivers, seventy-two 13.1-Gbps transceivers plus logic and memory, offering transceiver performance and capacity far greater than competing devices (see Video 1, XilinxInc#p/c/71A9E924ED61B8F9/1/eTHjt67ViK0).

Video 1 – Dr. Howard Johnson introduces the 28-Gbps transceiver-laden Virtex-7 HT.
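
As a rough check on the 400-Gbps positioning, the transceiver counts quoted above for the largest Virtex-7 HT device can simply be totaled. The sketch below is illustrative arithmetic only: it sums raw line rates in one direction and ignores line-coding, framing and protocol overhead, none of which the article specifies. The function and variable names are invented for this example.

```python
# Raw serial bandwidth of the largest Virtex-7 HT device, per direction.
# Lane counts and line rates are taken from the article; the helper and
# variable names are hypothetical, and coding overhead is ignored.

def aggregate_gbps(lanes: int, rate_gbps: float) -> float:
    """Raw line rate of `lanes` identical transceivers, in Gbps."""
    return lanes * rate_gbps

fast_total = aggregate_gbps(16, 28.0)   # sixteen 28-Gbps GTZ transceivers
slow_total = aggregate_gbps(72, 13.1)   # seventy-two 13.1-Gbps transceivers
total = fast_total + slow_total

print(f"28-Gbps lanes:   {fast_total:7.1f} Gbps")
print(f"13.1-Gbps lanes: {slow_total:7.1f} Gbps")
print(f"combined:        {total:7.1f} Gbps")
```

On these assumptions the sixteen 28-Gbps lanes alone carry 448 Gbps of raw line rate, which is consistent with aiming a single device at a 400-Gbps line card, while the seventy-two 13.1-Gbps lanes add roughly 943 Gbps for other interfaces.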

“We leveraged stacked interconnect technology to offer Virtex-7 devices with a 28G capability,” said Madden. “By offering the transceivers on a separate die, we can optimize our 28-Gbps transceiver performance and electrically isolate functions to offer an even higher degree of reliability for applications requiring cutting-edge transceiver performance and reliability.”

With the need for bandwidth exploding, the communications sector is frantically racing to establish new networks. The wireless industry is scrambling to produce equipment supporting 40-Gbps data transfer today, while wired networking is approaching 100 Gbps. FPGAs have played a key role in just about every generation of networking equipment since their inception (see cover stories in Xcell Journal, Issues 65 and 67).

Communications equipment design teams traditionally have used FPGAs to receive signals sent to equipment in multiple protocols, translate those signals to common protocols that the equipment and network use, and then forward the data to the next destination. Traditionally, companies have placed a processor between the FPGAs that monitor and translate incoming signals and the FPGAs that forward signals to their destination. But as FPGAs advance and grow in capacity and functionality, a single FPGA can both send and receive, while also performing processing, to add greater intelligence and monitoring to the system. This lowers the BOM and, more important, reduces the power and cooling costs of networking equipment, which must run reliably 24 hours a day, seven days a week.

In a white paper titled “Industry’s Highest-Bandwidth FPGA Enables World’s First Single-FPGA Solution for 400G Communications Line Cards,” Xilinx’s Greg Lara outlines several communications equipment applications that can benefit from the Virtex-7 HT devices (see support/documentation/white_papers/wp385_V7_28G_for_400G_Comm_Line_Cards.pdf).

To name a few, Virtex-7 HT FPGAs can find a home in 100-Gbps line cards supporting OTU-4 (Optical Transport Unit) transponders. They can be used as well in muxponders or service aggregation routers, in lower-cost 120-Gbps packet-processing line cards for highly demanding data processing, in multiple 100G Ethernet ports and bridges, and in 400-Gbps Ethernet line cards. Other potential applications include base stations and remote radio heads with 19.6-Gbps Common Public Radio Interface requirements, and 100-Gbps and 400-Gbps test equipment.

JITTER AND EYE DIAGRAM
A key to playing in these markets is ensuring the FPGA transceiver signals are robust, reliable and resistant to jitter or interference and to fluctuations caused by power system noise. For example, the CEI-28G specification calls for 28-Gbps networking equipment to have extremely tight jitter budgets.

Signal integrity is a crucial factor for 28-Gbps operation, said Panch Chandrasekaran, senior marketing manager of FPGA components at Xilinx. To meet the stringent CEI-28G jitter budgets, the transceivers in the new Xilinx FPGAs employ phase-locked loops (PLLs) based on an LC tank design and advanced equalization circuits to offset deterministic jitter.

“Noise isolation becomes a very important parameter at 28-Gbps signaling speeds,” said Chandrasekaran. “Because the FPGA fabric and transceivers are on separate dice, the sensitive 28-Gbps analog circuitry is isolated from the digital FPGA circuits, providing superior isolation compared to monolithic implementations” (Figures 4a and 4b).

Figure 4a – Xilinx 28-Gbps transceiver displays an excellent eye opening and jitter performance (using PRBS31 data pattern).

Figure 4b – This is a competing device’s 28-Gbps signal using a much simpler PRBS7 pattern. The signal is extremely noisy with a significantly smaller eye opening. Eye size is shown close to relative scale.

The FPGA design also includes features that minimize lane-to-lane skew, allowing the devices to support stringent optical standards such as the Scalable Serdes Framer Interface standard (SFI-S).

Further, the GTZ transceiver design eliminates the need for designers to employ external reference resistors, lowering the BOM costs and simplifying the board design. A built-in “eye scan” function automatically measures the height and width of the post-equalization data eye. Engineers can use this diagnostic tool to perform jitter budget analysis on an active channel and optimize transceiver parameters to get optimal signal integrity, all without the expense of specialized equipment.

ISE® Design Suite software tool support for 7 series FPGAs is available today. Virtex-7 FPGAs with massive logic capacity, thanks to the SSI technology, will be available this year. Samples of the first Virtex-7 HT devices are scheduled to be available in the first half of 2012. For more information on Virtex-7 FPGAs and SSI technology, visit http://www. 7-series-fpgas.htm.
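
The captions of Figures 4a and 4b contrast eye diagrams taken with PRBS31 and PRBS7 test patterns. To illustrate why PRBS31 is the more demanding stimulus, here is a minimal sketch of a linear-feedback shift register (LFSR) pattern generator. It assumes the conventional polynomials x^7 + x^6 + 1 for PRBS7 and x^31 + x^28 + 1 for PRBS31; the article does not say which polynomials were used for the measurements, and the function names are illustrative.

```python
from itertools import islice

def prbs(order: int, tap: int, seed: int = 1):
    """Yield bits from a Fibonacci LFSR for x^order + x^tap + 1 (assumed form)."""
    mask = (1 << order) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        # Feedback bit is the XOR of the two tapped stages (1-indexed).
        bit = ((state >> (order - 1)) ^ (state >> (tap - 1))) & 1
        state = ((state << 1) | bit) & mask
        yield bit

def longest_run(bits) -> int:
    """Length of the longest run of identical consecutive bits."""
    best = run = 0
    prev = None
    for b in bits:
        run = run + 1 if b == prev else 1
        best = max(best, run)
        prev = b
    return best

PRBS7_PERIOD = 2**7 - 1      # 127 bits
PRBS31_PERIOD = 2**31 - 1    # too long to enumerate here

# Two consecutive PRBS7 periods, to show that the pattern repeats.
one_period = list(islice(prbs(7, 6), PRBS7_PERIOD))
next_period = list(islice(prbs(7, 6), PRBS7_PERIOD, 2 * PRBS7_PERIOD))

print("PRBS7 repeats after 127 bits:", one_period == next_period)
print("PRBS7 longest run:", longest_run(one_period + next_period))
print("PRBS31 period in bits:", PRBS31_PERIOD)

# The same generator produces PRBS31; only a short prefix is drawn here.
prbs31_prefix = list(islice(prbs(31, 28), 64))
```

A maximal-length sequence of order n repeats after 2^n − 1 bits and contains runs of up to n identical bits, so PRBS31 presents the link with far longer runs and more low-frequency content than PRBS7. An open eye under PRBS31 is therefore the harder result to achieve.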

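
Madden’s remark in “The Stacked Silicon Architecture” that yield falls exponentially with die size can be made concrete with the textbook Poisson yield model, Y = exp(−D·A), where A is die area and D is defect density. The article does not say which yield model or figures Xilinx uses, so the defect density and die areas below are purely hypothetical, and the sketch ignores interposer, assembly and test costs.

```python
import math

def poisson_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Fraction of defect-free dice under the Poisson yield model."""
    return math.exp(-defects_per_cm2 * area_cm2)

def silicon_per_good_device(die_area_cm2: float, dice_per_device: int,
                            defects_per_cm2: float) -> float:
    """Wafer area (cm^2) consumed per working device.

    Assumes dice are tested at wafer level so that only known-good dice
    are assembled; interposer and assembly losses are ignored.
    """
    y = poisson_yield(die_area_cm2, defects_per_cm2)
    return dice_per_device * die_area_cm2 / y

D0 = 0.5           # hypothetical defect density, defects per cm^2
SLICE_AREA = 1.5   # hypothetical area of one FPGA slice, cm^2
SLICES = 4         # four slices, as drawn in Figure 2

# One large die with the same total area versus four known-good slices.
monolithic = silicon_per_good_device(SLICE_AREA * SLICES, 1, D0)
stacked = silicon_per_good_device(SLICE_AREA, SLICES, D0)

print(f"monolithic die yield: {poisson_yield(SLICE_AREA * SLICES, D0):.3f}")
print(f"single slice yield:   {poisson_yield(SLICE_AREA, D0):.3f}")
print(f"silicon per good device: {monolithic:.1f} vs {stacked:.1f} cm^2")
```

Under this model the silicon consumed per good device differs by a factor of exp(D·A·(n − 1)) for n equal slices of area A, which is the exponential penalty on large dice that Madden describes.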