

									            Modular Arrays – The solution for DSM high NRE prices
                                                 Gideon Amir

Modular Arrays, a concept invented by the FPGA companies and modified using Gate Array methodology,
have recently been gaining a lot of attention from customers. The problem is that high-end FPGAs, although
offering very low NRE investment, are still limited in performance and density and cost too much for volume
production. CBIC NREs, on the other hand, have skyrocketed with the introduction of 0.13u and below
technologies. This article describes the Modular Array concept, explains why customers are excited about this
technology and surveys the new products on the market.

The Custom Logic Landscape

The custom logic landscape, of which the Modular Array is a variant, is best understood by considering how
the logic cells are physically organized to implement a system and, beyond that, how the devices are
configured to implement the desired functions.

A Cell Based IC (CBIC) is created by assembling a collection of “standard” cells, each with an optimal
implementation of a particular bit of logic. The design task becomes mapping the design into a completely
“blank slate” of silicon by choosing from a large library of building block cells, and interconnecting them as
necessary. A relatively large number of different collections of standard cells can implement the same

Structured ASICs simplify the complexity of custom silicon design by inverting the problem. They do this by
providing a fabric of identical building block cells that are prearranged in a series of sizes and complexities.
This means the design task is mapping the circuit into a fixed arrangement of known cells, rather than
mapping standard cells to the design. A wide variety of technologies fall into the Structured ASIC domain as
illustrated in Figure 1.

                                  Figure 1: The Structured ASIC Spectrum

Gate Arrays. The oldest Structured ASIC technology has been around since the late '70s. The name is a
misnomer, since the Gate Array is actually an array of individual transistors, connected through one or two
metal layers to create the desired logic. Gate arrays were a wonderful technology at the time, providing
flexibility of design with fast turnaround time (TAT). Unfortunately, this technology is not efficient enough
to handle the more complex designs of today. As a result, Gate Arrays are only used today in relatively
simple, logic-only designs and are not even available in technologies below 0.25u.

FPGA and CPLD. These technologies were developed in the late '80s and are still going strong. Their big
advantages, field programmability and near-zero NRE, are offset by high production cost and lower
performance than CBIC solutions. Larger and faster versions are coming onto the market every year. This
variant of the Structured ASIC group has turned out to be the heavyweight member of the family and has
been increasing its market share at the expense of CBIC and Gate Arrays, the two other heavyweights in the
ASIC arena.

Embedded Cell Arrays. This approach is an attempt to combine the flexibility inherent in CBIC
ASICs with the fast TAT of Gate Arrays. Most of the design is implemented in CBIC, including memories,
I/Os and IP blocks. A section of the die is implemented as a blank Gate Array. When the logic design is
finalized, the gate array section can be customized in a relatively short time, and logic iterations are
cheap. Still, the initial NRE is at least as expensive as standard CBIC.

Platform Structured ASIC. This is a relatively recent idea that takes the Embedded Cell Array closer to a
standard product. An Embedded Array is built with SOC components such as a CPU, I/F standards, RAM
etc. and a UDL (User Defined Logic) block which is either FPGA-based or Gate Array-based. The
biggest problem with this approach is that, most of the time, the resources on the platform do not match
the application requirements. In the few lucky cases when they do, this is a wonderful solution.

Modular Array ASIC. This is the new kid on the block and the subject of this paper.

The Modular Array ASIC – What is it?

The Modular Array ASIC, sometimes also called a Structured Array, is basically an array of Logic Modules,
similar to FPGA modules (see for example Figure 2), interspersed with Memory modules. The periphery of the
device contains configurable I/O cells, PLLs and support structures such as DFT, JTAG etc.

                     Figure 2: Modular Array Logic Modules (LightSpeed Corp.)

Figure 3 shows a typical Modular Array architecture:
                          Figure 3: A typical Modular Array Architecture

The device can be customized by interconnecting the desired logic modules, memory modules and I/Os
using 2 to 4 metal layers and vias, depending on the manufacturer. Figure 4 shows a cross section of
such a Modular Array:

                    Figure 4: Modular Array Cross Section (LightSpeed Corp.)

This array can be fabricated up through the first few metal layers as if it were a standard product,
almost as a cross between an FPGA and a gate array. Then the base wafers can be warehoused,
waiting for an order.

A customer design, meanwhile, is accepted at the register-transfer or netlist level and mapped onto the
logic cells and memory blocks. The wafers are pulled from inventory, the upper metal layers completed
to implement the customer's design and the finished wafers treated from there on out as if they were
any other sort of ASIC.

Since the hard parts (the critical mask layers, the majority of processing steps and most of the
expensive masks) are shared among all the users of the technology, NRE can be quite small, as can
turnaround time (TAT). And if the architectural design is clever, the devices give up very little compared
with a full-up cell-based design. This is where the real benefit lies. As DSM (Deep Sub-Micron)
technology masks become incredibly expensive, a 0.13u CBIC can easily exceed $600K in NRE costs.
The Modular Array NRE on the other hand can be quite low, with prices in the range of $100K to
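
The NRE trade-off described above can be illustrated with a simple break-even calculation. The sketch below is only an illustration: it takes the $600K CBIC figure and the $100K low end of the Modular Array range from the text, treats FPGA NRE as exactly zero, and uses per-unit prices that are purely hypothetical placeholders, not vendor data. The function names are likewise invented for this example.

```python
def total_cost(nre, unit_cost, volume):
    """Total program cost: one-time NRE plus per-unit cost times volume."""
    return nre + unit_cost * volume


def break_even_volume(nre_a, unit_a, nre_b, unit_b):
    """Volume above which option B (higher NRE, lower unit cost)
    becomes cheaper than option A (lower NRE, higher unit cost)."""
    if unit_a <= unit_b:
        raise ValueError("option B must have the lower unit cost")
    return (nre_b - nre_a) / (unit_a - unit_b)


# NRE figures: FPGA assumed to be zero, Modular Array at the $100K low end
# quoted in the text, CBIC at the $600K quoted in the text.
FPGA_NRE, MODULAR_NRE, CBIC_NRE = 0, 100_000, 600_000

# HYPOTHETICAL unit prices in dollars, chosen only to show the shape of
# the trade-off; they do not come from the article or any vendor.
FPGA_UNIT, MODULAR_UNIT, CBIC_UNIT = 50.0, 10.0, 6.0

fpga_vs_modular = break_even_volume(FPGA_NRE, FPGA_UNIT, MODULAR_NRE, MODULAR_UNIT)
modular_vs_cbic = break_even_volume(MODULAR_NRE, MODULAR_UNIT, CBIC_NRE, CBIC_UNIT)

print(f"Modular Array beats FPGA above {fpga_vs_modular:,.0f} units")
print(f"CBIC beats Modular Array above {modular_vs_cbic:,.0f} units")
```

With these placeholder prices, the Modular Array is the cheapest option between 2,500 and 125,000 units. Real numbers will shift both crossover points, but the middle band between the FPGA and the CBIC is the region the technology targets.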

A matter of flavor

Not surprisingly, each of the structured-ASIC product lines announced to date has a flavor of its own,
reflective of the vendor that produced it. The families from AMI and Lightspeed reflect, to a greater or
lesser degree, those companies' experiences in the FPGA-conversion market. The Chip Express and
NEC offerings are distinctly more ASIC-flavored. Altera's HardCopy is explicitly promoted as an FPGA
without the programmable interconnect. LSI Logic's RapidChip is more accurately described as a
customizable application-specific standard product. Despite the strong conceptual similarities, each of
these lines has architectural details, implementation choices and tool affinities that reflect its heritage.
And these differences can be important in finding the best family for a particular application.

Some of the differences are obvious. For example, Altera's HardCopy devices are nothing more nor
less than Altera FPGAs with the programmable interconnect replaced by hard interconnect lines. The
underlying logic cell and memory structure matches that of the equivalent FPGA. The intended design
flow for the devices is to develop the application using the FPGA flow, then simply move it across.

LSI's RapidChip presents a different scenario. Devices in the family are essentially application-specific
platforms composed of all the major intellectual-property (IP) blocks LSI deems necessary for a
particular application. The first announced member of the family, for example, has 48 high-speed I/O
ports, six phase-locked loops, a double-data-rate DRAM port and a number of SRAM ports as diffused
IP blocks. In the die area not used by these blocks, LSI provides approximately 3 million usable gates
of uncommitted logic. These are used to adapt the big IP blocks to a particular customer's needs.

The other four families fit better into the generalization of a cross between an FPGA and a gate array.
Except for a few SerDes blocks or high-speed I/O pads that are showing up in some lines (notably
NEC's), the chips consist of uncommitted logic cells and memory blocks: lightly structured blank slates
on which the users may write as they will. Yet even in these devices the structure is significant.

Under the hood

Looking beneath the surface of all the devices, differences start to appear. Perhaps the most obvious is
in the logic cells of which the devices are composed. And it is the logic cell that separates most of the
families from their nearest ancestors, gate arrays. In all cases except one, the logic cells are a good
deal larger than the transistor pairs or quads that were used in sea-of-gates gate arrays.

The devices with the most complex logic cells are, no surprise, those with a heritage of FPGA
conversion. AMI's logic cell, for example, comprises "some RAM and some simple logic functions," said
AMI vice president Vince Hopkin. The company, like most of its competitors, doesn't care to discuss the
details of its logic fabric without a nondisclosure agreement. But Hopkin added that the cell was
sufficiently complex to implement at least a simple flip-flop in a single cell.
Lightspeed's basic logic block is apparently even more complex, including multiplexers, a logic gate, a
fully scannable flip-flop and buffers. As in FPGA architectures, part of the power and the challenge of the
architecture lies in the ability to share resources in a single logic block among different nets.

At the other extreme, the Chip Express logic cell comprises only three muxes and three inverters. Two
cells are required to implement a simple D-type flip-flop. And LSI's logic element is even smaller, based on
an underlying metal-programmable logic cell technology from Virage Logic (Fremont, Calif.). Virage's
metal-programmed technology may also underlie the NEC logic element.
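
To make the idea of a mux-and-inverter logic cell more concrete, the sketch below models one textbook way to build a D-type flip-flop from 2:1 multiplexers: each mux with feedback acts as a level-sensitive latch, and two latches clocked on opposite phases form a master-slave flip-flop. This is a generic construction, not Chip Express's actual cell wiring, which the vendors do not disclose; the class and function names are invented for this example.

```python
def mux2(sel, a, b):
    """2:1 multiplexer: returns a when sel is 0, b when sel is 1."""
    return b if sel else a


class MuxLatch:
    """Level-sensitive D latch: a mux whose output feeds back to its 0-input."""

    def __init__(self):
        self.q = 0

    def update(self, enable, d):
        # Transparent when enable is 1, holds the stored value when enable is 0.
        self.q = mux2(enable, self.q, d)
        return self.q


class MuxFlipFlop:
    """Rising-edge D flip-flop built from two mux latches (master-slave)."""

    def __init__(self):
        self.master = MuxLatch()
        self.slave = MuxLatch()

    def update(self, clk, d):
        # An inverter on the clock makes the master transparent while clk is low.
        self.master.update(1 - clk, d)
        # The slave is transparent while clk is high and passes the master's value.
        return self.slave.update(clk, self.master.q)


ff = MuxFlipFlop()
# (clk, d) pairs: d is sampled while clk is low and appears at q when clk goes high.
for clk, d in [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]:
    print(f"clk={clk} d={d} q={ff.update(clk, d)}")
```

Note that q ignores the change in d while the clock stays high, which is the edge-triggered behavior expected of a flip-flop rather than a latch.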

The devices also differ considerably in their approach to power and clock distribution. The Lightspeed
devices distribute power with a top metal layer, dropping it down to dedicated power-routing segments in
the first four, non-user-configured metal layers. Similarly, Lightspeed constructs clock trees after layout,
using both the wires in user-dependent metal layers 5 and 6 and dedicated, carefully spaced clock
segments in layers 3 and 4. Chip Express, in keeping with its gate-array-like flow, builds delay-locked
loops and clock trees from logic elements and user-determined metal. Power routing is fixed in metal 1.
NEC does the bulk of its balanced clock tree routing in metal 1-3, but leaves enough flexibility to permit
multiple clock regions and gated clocks. Power is delivered through a fixed grid transparent to the
customer. AMI also delivers power in a fixed grid in the lower layers, but synthesizes its clocks in upper
metal layers.

LSI Logic, with its large, fixed IP blocks, uses a common power grid across its entire planned family of
devices. Clocks are synthesized through a proprietary system called the Clock Factory, with access to
most of the metal layers. Unlike the other vendors, LSI banks wafers at metal-1, so most layers can
respond to the needs of the customer design.

Test strategies also vary somewhat. Lightspeed is unusual in this regard, offering full visibility and
controllability of all the flip-flops on the chip and also, uniquely, of many of the purely combinatorial cells.
The result is what Lightspeed terms unprecedented access for test and diagnosis. This access also
makes it possible to capture user clocks, apply them at speed and extract signatures from the output of a
block for delay-fault testing.

NEC provides scan chains that include all the flip-flops on the chip, as well as boundary scan for the I/O
pads. Senior product-design engineer Gary Smith described the resulting methodology as "a cross
between mux scan and LSSD [level-sensitive scan design]." AMI similarly embeds scan cells in its array
and organizes the scan chains in accord with the customer's design. Chip Express inserts scannable flip-
flops into the user design before layout, then reorders the scan chains after layout. LSI follows a similar
methodology for the user-defined portion of its chips, blending the resulting chains with the scan chains
provided in the company's diffused IP. All the vendors supply some form of memory built-in self-test, and
all provide loopbacks or other self-test mechanisms for high-speed I/O.
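
The scan-based test approach that all these vendors share can be sketched with a small behavioral model. The code below is a generic illustration of a single scan chain (shift a pattern in, pulse one functional clock, shift the captured response out); it does not represent any particular vendor's scan cell or methodology, and the class and method names are invented for this example.

```python
class ScanChain:
    """Behavioral model of flip-flops stitched into one scan chain."""

    def __init__(self, length):
        self.flops = [0] * length

    def shift(self, bit_in):
        """One scan clock: every flop takes its neighbor's value.
        Returns the bit that falls off the end of the chain."""
        bit_out = self.flops[-1]
        self.flops = [bit_in] + self.flops[:-1]
        return bit_out

    def load(self, pattern):
        """Shift a full test pattern in so that flops == pattern afterwards."""
        for bit in reversed(pattern):
            self.shift(bit)

    def capture(self, logic):
        """One functional clock: flops capture the combinational logic's response."""
        self.flops = list(logic(self.flops))

    def unload(self):
        """Shift the captured state out, returned in flop order."""
        out = [self.shift(0) for _ in range(len(self.flops))]
        return list(reversed(out))


chain = ScanChain(4)
chain.load([1, 0, 1, 1])
# Stand-in for the user's logic: here every flop simply captures its inverse.
chain.capture(lambda state: [bit ^ 1 for bit in state])
print(chain.unload())  # [0, 1, 0, 0]
```

Comparing the unloaded response with the expected one is what lets a tester observe and control every flip-flop through a handful of pins, which is why all five vendors build scan into the fabric or insert it before layout.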

At this stage, it is fair to say that the structured-ASIC lineup includes distinct variations on a common
theme. Choosing the right one will depend on the application, the degree of vendor involvement the
customer wishes and predispositions about design flows. But the choices are there.

Gideon Amir, CEO, Advanced Semiconductor Technology
