CHAPTER 1

1.1 INTRODUCTION

Fast-paced progress in devising and assembling nanoelectronic devices suggests that it will be possible to manufacture large-scale computational nanofabrics within 10-15 years. Aside from greatly increasing concerns with complexity and scalability, nanoelectronic fabrics share a characteristic that will impact the entire design hierarchy. Irrespective of the winning technologies, e.g., semiconductor nanowires and/or carbon nanotubes, it is widely recognized that devices and interconnect at the nanoscale will exhibit fault densities much higher than state-of-the-art silicon technology. Indeed, they will have a defect density much higher than that of current silicon technologies and are likely to be much more susceptible to transient faults, sometimes referred to as soft faults. The increases are, in part, due to the physical dimensions being considered. From a materials perspective, decreasing the size of structures increases the ratio of surface area to volume, making imperfections on surfaces or material boundaries more critical to the proper function of nanoscale interconnects and devices. Furthermore, at such reduced scales, the discrete nature of atomic matter and charge becomes significant. Namely, a single charge or defect may significantly impact the structural stability of a nanodevice as well as its sensitivity to the electrostatic environment. These observations point to a reliability problem that is intrinsic to nanoscale regimes, and is thus here to stay. Overcoming this problem will require work at many levels, including devising more reliable devices, interconnects, manufacturing processes, and materials. At the same time, one needs to start devising design principles, abstractions, and tools to enable system-level designers to address the projected increases in faults.
These will not only be pivotal to conclusively demonstrating the viability of nanotechnologies, but also critical to moving them from labs to production. Current design methodologies and tools take high reliability for granted, and thus their direct application to designing nanosystems would lead to exceedingly low yields. A paradigm shift in design methods and tools is thus required, placing defect and fault tolerance at the forefront, and recognizing these as inextricably tied to system performance. Moreover, given the substantial degree of uncertainty inherent to nanoscale technologies, design will have to be framed in a probabilistic setting if effective optimization of performance, yield, and other key figures of merit is to be achieved.

It has been demonstrated that nanoelectronics technology is well suited to building reconfigurable computational fabrics. This is significant because reconfigurability provides a powerful tool to circumvent the uncertainty associated with distributions of defects. However, designing complex systems to be instantiated through reconfiguration poses a major scalability challenge: defect mapping and configuration must be performed on a per-chip basis. This paper proposes a novel probabilistic design paradigm targeting reconfigurable architected nanofabrics and shows that it lays a promising foundation towards comprehensively addressing, at the system level, the density and reliability challenges posed by emerging nanotechnologies. First, it provides a hierarchy of design abstractions aimed at ensuring scalability, not only during a nanosystem's synthesis, but also in the defect mapping and configuration phases. Scalability must be jointly addressed across these phases; this is one of the innovative aspects of our approach. Second, the proposed hierarchy is based on abstractions that enable an effective integration of fault tolerance and defect avoidance techniques in design methodologies.
We will show that this is the key to enabling the design of robust nanosystems. A third innovation embodied in our approach is that it provides an adequate probabilistic framework in which to consider critical system-level design trade-offs between performance, yield, and per-chip costs (Margarida Jacome, et al., 2004).

1.1.1 NANOELECTRONIC DEVICES

Nanoelectronics refers to the use of nanotechnology in electronic components, especially transistors. Although the term nanotechnology is generally defined as utilizing technology less than 100 nm in size, nanoelectronics often refers to transistor devices that are so small that inter-atomic interactions and quantum mechanical properties need to be studied extensively. As a result, present transistors do not fall under this category, even though these devices are manufactured with 45 nm, 32 nm, or 22 nm technology. Nanoelectronics is sometimes considered a disruptive technology because present candidates are significantly different from traditional transistors. Some of these candidates include hybrid molecular/semiconductor electronics, one-dimensional nanotubes and nanowires, and advanced molecular electronics.

1.1.1.1 Transistor

A transistor is a semiconductor device used to amplify and switch electronic signals and power. It is composed of a semiconductor material with at least three terminals for connection to an external circuit. A voltage or current applied to one pair of the transistor's terminals changes the current flowing through another pair of terminals. Because the controlled output power can be much more than the controlling input power, a transistor can amplify a signal. Today, some transistors are packaged individually, but many more are found embedded in integrated circuits.

1.1.1.2 Carbon Nanotube

Carbon nanotubes are allotropes of carbon with a cylindrical nanostructure.
Nanotubes have been constructed with a length-to-diameter ratio of up to 132,000,000:1, significantly larger than for any other material. These cylindrical carbon molecules have unusual properties, which are valuable for nanotechnology, electronics, optics, and other fields of materials science and technology. In particular, owing to their extraordinary thermal conductivity and mechanical and electrical properties, carbon nanotubes find applications as additives to various structural materials, for instance in baseball bats, car parts, and even golf clubs, where nanotubes form only a tiny portion of the material. Nanotubes are members of the fullerene structural family, which also includes the spherical buckyballs, and the ends of a nanotube may be capped with a hemisphere of the buckyball structure. Their name is derived from their long, hollow structure, with walls formed by one-atom-thick sheets of carbon called graphene. These sheets are rolled at specific and discrete angles, and the combination of the rolling angle and radius decides the nanotube properties, for example, whether the individual nanotube shell is a metal or a semiconductor. Nanotubes are categorized as single-walled nanotubes and multi-walled nanotubes. Individual nanotubes naturally align themselves into ropes held together by van der Waals forces, more specifically, pi stacking. Applied quantum chemistry, specifically orbital hybridization, best describes chemical bonding in nanotubes. The chemical bonding of nanotubes is composed entirely of sp2 bonds, similar to those of graphite. These bonds, which are stronger than the sp3 bonds found in alkanes, provide nanotubes with their unique strength.

1.1.1.3 Nanowire

A nanowire is a nanostructure with a diameter of the order of a nanometer (10^-9 meters). Alternatively, nanowires can be defined as structures that have a thickness or diameter constrained to tens of nanometers or less and an unconstrained length.
At these scales, quantum mechanical effects are important, which gave rise to the term "quantum wires". Many different types of nanowires exist, including metallic (e.g., Ni, Pt, Au), semiconducting (e.g., Si, InP, GaN), and insulating (e.g., SiO2, TiO2). Molecular nanowires are composed of repeating molecular units, either organic (e.g., DNA) or inorganic (e.g., Mo6S9-xIx). Nanowires could be used, in the near future, to link tiny components into extremely small circuits. Using nanotechnology, such components could be created out of chemical compounds.

Fig. 1. Nanowire used in a transistor (www.nanowerk.com)

1.1.2 Nanoelectronic architectures

Nanoelectronics offers the potential for much denser circuitry than is possible with current CMOS technology, but presents a number of challenges that must be addressed if we are to successfully exploit it. At the size scales we are considering (< 15 nm), conventional lithography will lack the resolution necessary to create the individual devices normally combined to create larger circuits. Sensitivity to noise, subatomic particles, and even quantum uncertainty at this scale will increase the rate of transient faults to the point that redundancy and fault tolerance will become necessary for logic functions as well as memory. And regardless of fabrication technique, nanoscale circuits will likely contain defects so numerous that it would be uneconomical to simply discard circuits containing a single defect; some form of defect tolerance will be necessary to achieve acceptable yields. These challenges have led us to focus on configurable crossbar architectures where each cross point within a crossbar can be independently configured to activate an electronic device, such as a resistor, diode, or transistor. Crossbars are one of the easiest structures to build using nanoimprint lithography. Their high degree of redundancy offers a simple strategy for defect tolerance.
Their regular structure makes them easy to analyze for devising a strategy for fault tolerance. Configurability offers the potential for using a single nanoelectronic fabric for a large number of applications, much like field programmable gate arrays, reducing design costs. The architectures we describe here are necessarily speculative to varying degrees, since many of the crossbar devices we hypothesize are either not yet functional in the laboratory, or are functional but incompletely characterized. Configurable nanoscale transistor crossbars, for example, do not yet exist; consequently we use simplified, idealized models for them (Snider, et al., 2005).

1.1.2.1 Photolithography

The processes used in traditional top-down manufacturing at the nanoscale influence all levels of the design process. In particular, regular layouts are preferred. For example, as feature sizes become smaller than the illumination wavelength used in photolithography, resolution enhancement techniques (RETs) must be used to ensure printability. RETs are already restricting the layouts that are manufacturable. Chemical mechanical polishing (CMP) is another example of a process/layout interaction. To maintain a uniform thickness of the wafer, CMP requires a uniform layout density. The complexity of generating a reliable mask set which produces reliable chips also limits the ability to create arbitrary patterns of wires. There is ongoing research into solving these problems; for example, restricted design rules and process-aware routing aim to reduce the impact of RETs on layout. Dummy features added to each layer help to make irregular layouts more uniform for CMP. However, as scaling continues, designs with regular layouts will be favored. One possible solution to these issues, as well as to the high cost of mask sets, is to share some or all of the masks between designs. Structured ASICs and FPGAs are two possible approaches.
A structured ASIC is based on a prefabricated set of logic blocks which can be customized by each user. The customization requires significantly fewer masks than a traditional ASIC. Furthermore, the prefabricated layers are generally the ones which have the highest critical dimensions, removing the most expensive masks from the customization process. SRAM-based FPGAs are even more regular, and all customization occurs post-manufacturing, completely eliminating the need for custom masks. Irregular layouts are a natural byproduct of ASIC or custom design. Making layouts more regular generally requires more area and hurts performance. However, at the nanoscale this tradeoff may be worthwhile.

1.1.2.2 Molecular Scale Electronics

Layout restrictions are even more onerous in bottom-up manufacturing processes such as those used for molecular scale electronics (MSE). The key to understanding the restrictions imposed by MSE is that, unlike traditional semiconductor manufacturing, the manufacture and assembly of the devices occur in separate steps. First, the devices are created, and then they are assembled into a system. Due to the scale of the individual devices, the assembly techniques favor regular patterns. We divide the MSE assembly methods into three broad categories: probe-based methods, nanoimprint techniques, and self-assembly-based approaches.

Probe-based methods use the tip of a scanning tunneling microscope (STM) or an atomic force microscope (AFM) to write on a surface. One of the most promising of these methods is dip pen lithography (DPL), which uses an AFM tip to deliver molecules to a surface. This technique can create arbitrary patterns, but is limited by the write time of the probe. Parallel versions of DPL have been demonstrated, but either the line widths increase or the individual probes are not individually addressable. In the latter case, complex patterns may be generated, but each pattern must be replicated by all the probes.
Nanoimprint techniques use a master to stamp a pattern directly into the substrate. E-beam lithography or DPL would be used to create a master with irregular patterns. However, issues relating to lift-off, master creation, and contact printing may, for irregular patterns, limit the pitch to 60 nm. Printing by superlattice nanowire pattern transfer (SNAP) can create parallel arrays of wires with a pitch of <20 nm. The SNAP master is formed by layering materials of different hardness through molecular beam epitaxy (MBE). The composite is cleaved and the cleaved edge is used as the master. The softer of the two materials is partially etched, leaving a parallel set of voids on the cleaved edge. Metal can be placed in the voids and then contact printed onto a surface. The resulting wires have a very large aspect ratio.

There is a multitude of self-assembly-based approaches, which include flow-based alignment, electric field alignment, self-assembled monolayers, Langmuir-Blodgett (LB) films, etc. All of these approaches create highly regular assemblies. For example, 2D meshes of nanowires have been made by combining self-assembly and photolithography. The nanowires are placed in a fluid which is compressed, which in turn causes the wires to align along their long axis. The nanowires are then transferred to another surface. A second set of aligned nanowires is placed orthogonally on top of the first. Photolithography is then used to create individual meshes from the two layers.

A more deterministic approach to self-assembly uses the hybridization of DNA to guide the assembly process. DNA-directed synthesis proceeds by attaching an unpaired DNA strand to a device and the complementary strand to another device. The two strands will hybridize, assembling the two devices. This method has been used to join nanoscale devices as well as micron-scale devices.
A more complex process, based on DNA crossover, has been used to attach a single-wall carbon nanotube FET to a particular location and then use the DNA as a template for the creation of metallic contacts. Another approach uses the DNA to create a template on which to assemble other structures. A final example programs self-assembly patterns through the use of DNA tiles. Each tile has multiple single-stranded ends that lie in the plane and act as programmable binding sites. The tiles are designed so that they self-assemble into a desired shape, e.g., a demultiplexer or a 2D mesh. Additionally, the tiles can be functionalized with molecules in the third dimension, allowing them to serve as a scaffold onto which additional structures can be placed.

The MSE fabrication and assembly primitives will most readily create highly regular structures. Clearly, post-fabrication customization is required to implement custom circuits. By lucky coincidence, a two-terminal programmable non-volatile switch can be implemented at a single crossing of two wires. Furthermore, the state of the switch can be programmed using the same wires that carry the signal during execution of the circuit. Thus, the area overhead associated with solid-state reconfigurable fabrics is not present in their molecular cousins.

1.1.2.3 Self Assembled NanoStructure

Although more than one of these possible new devices may prove to be controllable enough to be used in a new class of electronics, we still must consider Moore's second law, which deals with the ever-increasing cost of an integrated circuit fabrication facility. Manufacturing facilities currently cost several billion (10^9) U.S. dollars, and their cost is projected to continue doubling with almost every generation of smaller devices. The main costs of the facility are the expensive lithography needed to form ever smaller features and the requirement for ever cleaner environments to prevent defects.
Even one defect in a critical part of the circuit can destroy an entire chip. To reduce the need for fine lithography, self-assembly techniques are being explored. The basic idea of self-assembly in this context is to use natural forces to form a device feature, although its position may be determined by coarser lithography. Thus, the less expensive equipment associated with a previous technology generation may be used, along with self-assembly techniques, to fabricate circuits for the next generation of devices. One technique for forming such nanostructures relies on the stress resulting from the lattice mismatch of a substrate and a deposited layer. As the layer is deposited, the atoms in the layer register with the lattice of the underlying material, even though the intrinsic lattice constant of the depositing material is different. The resulting stress tends to deform the depositing material, eventually forming three-dimensional structures. If the lattice mismatch is the same in two orthogonal directions, zero-dimensional nanoislands form. An example is a nanoisland formed by the deposition of Ge on Si, with its 4% lattice mismatch (Kamins, et al., 2001).

1.1.3 Defect Tolerance

Although fine nanowires and nanoislands can be formed by self-assembly, a significant number of the structures formed by any thermodynamically controlled fabrication process will be defective. The effect of defects can be minimized by basing much of the circuit on a simple and dense crossbar array architecture, which can be made tolerant of defects. The crossbar array essentially contains two orthogonal sets of wires separated by a layer that forms the devices. Wherever two wires intersect, a device is automatically formed. A nonlinear element, e.g., a diode, contained within the device allows efficient programming of individual devices.
However, the device must have a well-defined threshold to allow switching of only the desired element, without affecting other devices in the same row or column of the array. Because the crossbar array is very regular, its repetitive structure allows it to be configured to avoid random defects. The defects are located by initially testing the array, and then the array is programmed to avoid the defects and to accomplish the desired electronic task. Thus, an architecture based on crossbar arrays can be tolerant of defects by virtue of its simple structure and configurability. The defect-tolerant architecture can be used even with conventional devices to reduce the need for cleaner environments as device features become smaller, and the principle has been demonstrated (Kamins, et al., 2001) using conventional field programmable gate arrays.

1.1.3.1 Defect Tolerant Architectures

For nanoscale crossbars, we assume that the main type of defect is that introduced during manufacture rather than during operation. This is reasonable for plausible technologies, which involve high temperatures during manufacture, and hence a relative ease of introducing defects, but low temperatures during operation, with much less chance of creating new defects. In this situation, an appropriate system architecture requires a compiler to arrange for desired circuit behaviors by using only correctly functioning junctions within crossbars, as determined by a testing phase after manufacture. This approach of avoiding known defects yields a defect-tolerant architecture. It contrasts with methods dealing with defect-induced faults that appear during operation, perhaps intermittently, e.g., using majority votes from replicated hardware. For simplicity, we restrict our attention here to defects leading to unconfigurable junctions, rather than defects that short out a junction or adjacent wires, or that unexpectedly cause a break in a wire.
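The test-then-configure idea can be illustrated with a small sketch. It is only a toy model under stated assumptions, not the compiler of the cited work: columns are kept fixed, each logical row lists the column junctions it must configure, and the hypothetical function `map_rows` looks for an assignment of logical rows to distinct physical rows whose required junctions are all outside the defect table (a standard augmenting-path bipartite matching).

```python
from typing import Dict, List, Optional, Set, Tuple


def map_rows(required: List[Set[int]],
             n_phys_rows: int,
             defects: Set[Tuple[int, int]]) -> Optional[Dict[int, int]]:
    """Assign each logical row to a distinct physical row.

    required[i] is the set of columns logical row i must configure.
    defects holds (physical_row, column) pairs found unconfigurable in testing.
    Returns {logical_row: physical_row}, or None if no assignment exists.
    """
    def compatible(logical: int, phys: int) -> bool:
        # A physical row is usable if none of the needed junctions is defective.
        return all((phys, col) not in defects for col in required[logical])

    owner: Dict[int, int] = {}  # physical row -> logical row currently placed there

    def try_place(logical: int, seen: Set[int]) -> bool:
        # Augmenting-path step: take a free row, or evict and re-place its owner.
        for phys in range(n_phys_rows):
            if phys in seen or not compatible(logical, phys):
                continue
            seen.add(phys)
            if phys not in owner or try_place(owner[phys], seen):
                owner[phys] = logical
                return True
        return False

    for logical in range(len(required)):
        if not try_place(logical, set()):
            return None  # the function cannot be implemented on this fabric
    return {logical: phys for phys, logical in owner.items()}


if __name__ == "__main__":
    needed = [{0, 1}, {1, 2}, {0, 2}]          # hypothetical logic rows
    defect_table = {(0, 1), (2, 2)}            # hypothetical test results
    print(map_rows(needed, 4, defect_table))
```

A real compiler would also permute columns and handle other defect types; the sketch only shows how a defect table constrains placement and how the search can conclude that no implementation exists.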
In this scenario, we can test the circuits to determine which junctions are defective, and then use the remaining ones to implement the circuit. That is, a compiler takes the required logic function and a table of defects to find a way to implement that function on the defective crossbar fabric. This leads to the central question addressed in this section, which is posed for a given defect rate and a given crossbar size. A related problem is the computational difficulty for the compiler to identify an implementation, or to conclude that no implementation is possible. Decreasing the allowable defect rate for a nanoelectronic fabric will generally require more difficult and costly manufacturing. Increasing the allowable defect rate will make it less likely that a desired circuit can be implemented, and can also result in longer run times for the compiler to identify a way to implement the circuit while avoiding defects. A further aspect of this problem is that logic functions can often be written in different but logically equivalent forms. In Figure 2, the ripple carry implementation (top) translates directly to a diode crossbar implementation (bottom) using feedback from some of the outputs to the inputs (gray lines). Regenerative buffers (left-pointing triangles) between stages regenerate signals degraded by diode and resistor voltage drops. The input wire marked A0 with an overbar gives the complement of input bit A0, and similarly for the other inputs. Note that the carry bit between successive stages of the crossbar implementation must be presented in both original and complemented forms.
(Snider, Hogg, et al., 2005)

Figure 2: A 3-bit adder which adds two 3-bit numbers (denoted as the bits A2A1A0 and B2B1B0) to produce a 4-bit sum (S3S2S1S0). (Snider, Hogg, et al., 2005)

1.1.3.2 Fault tolerance

We distinguish faults, which are transient perturbations of the circuit that could cause it to function incorrectly, from defects, which are permanent flaws in the nanofabric. Faults, as we have defined them, are also referred to as single event upsets (SEUs) or soft errors. They can be caused by subatomic particles striking a conductor within a circuit, briefly altering the charge, and hence the voltage, in that conductor. Nanoelectronic structures will be more susceptible to faults than conventional CMOS since
(1) Fewer electrons will be used to represent logic states in nanologic circuits, increasing their sensitivity to changes in charge.
(2) Devices will likely have a much greater variability in device parameters, reducing voltage safety margins.
(3) Quantum probabilities will become more visible at the nanoscale, so that devices may not behave as reliably or predictably as one would like.
(G. Snider, et al., 2005)

1.1.3.3 Von Neumann Multiplexing Based Defect-Tolerance

The multiplexing-based fault tolerance scheme replaces a single logic device by a multiplexed unit with N copies of every input and output of the device. The N devices in the multiplexing unit process the copies of the inputs in parallel to give N outputs. Each element of the output set will be identical and equal to the original output of the logic device if all the copies of the inputs and devices are reliable. However, if the inputs or devices are in error, the outputs will not be identical. The basic design of a von Neumann multiplexing technique consists of two stages: the executive stage, which performs the basic function of the processing unit to be replaced, and the restorative stage, which reduces the degradation in the executive stage caused by erroneous inputs and faulty devices.
In the case of the executive stage for a MAJ multiplexing system, the majority gate and its inputs and outputs are duplicated N times. The unit U represents a random permutation of the input signals; that is, for each input set to the N copies of the majority gate, three input signals are randomly chosen from the three separate input bundles respectively. The restorative stage is made using the same technique as the executive stage, duplicating the outputs of the executive stage to use as its inputs. Note that applying this approach only once will invert the result; therefore two steps are required. To give a more effective restoration mechanism, this stage can be iterated (Debayan Bhaduri, et al., 2004).

Figure 3. Majority gate implemented with four NAND gates (Debayan Bhaduri, et al., 2004).

1.1.3.4 R fold modular redundancy

The concept of triple modular redundancy (TMR) is to have three units working in parallel, and to compare their outputs with a majority gate. TMR can then provide an assemblage that behaves like one of its constituent components, but with an improved probability of working. The trade-off is that instead of n devices, at least 3n devices plus a majority gate are needed to make this new unit. R-fold modular redundancy (RMR) is a generalization of TMR where, instead of three, we have R units working in parallel. In our analysis we have assumed a chip with N devices, with the probability $p_f$ of an individual device failing. The probability $P_{\mathrm{fail}}^{\mathrm{chip}}$ of a complete chip failing during the working lifetime is minimized for some optimum cluster size ($N_c$), under the condition $N_c p_f \ll 1$. A cluster represents the unit which is replicated R times; outputs from each of the units are compared in a majority gate and the output is determined. It would be a more realistic approach to fix the logical depth of the system (say, to D = 10) and then determine values of $N_c$ for each R. However, in the proposed model we calculate the maximum theoretical improvement offered by this technique.
Imperfect majority gates have B outputs and mB devices. First we assume that a module of $N_c$ devices works only if every single device in the module works:

$$P_{\mathrm{works}}^{\mathrm{module}} = (1 - p_f)^{N_c} \qquad (1)$$

Now the probability that a module fails, if $N_c p_f \ll 1$, is

$$P_{\mathrm{fail}}^{\mathrm{module}} = 1 - P_{\mathrm{works}}^{\mathrm{module}} \approx N_c p_f .$$

A group consisting of R modules and a majority gate works correctly when at least (R + 1)/2 modules work correctly and when the majority gate also works. The number of devices in a group is $R N_c + mB$, so the total number of groups is $N_{\mathrm{groups}} = N/(R N_c + mB)$. The chip fails if any of the groups fail, hence the probability that the whole chip with N devices fails is approximately

$$P_{\mathrm{fail}}^{\mathrm{chip}} \approx \frac{N}{R N_c + mB}\left[\binom{R}{(R+1)/2}\,(N_c p_f)^{(R+1)/2} + mB\,p_f\right].$$

Here we assume that the errors in each module are uncorrelated, i.e., that common mode or common cause failures are not present in the redundant system. The equation $dP_{\mathrm{fail}}^{\mathrm{chip}}/dN_c = 0$ gives the optimum module size ($N_c$) for a given $p_f$, which is substituted into the expression for $P_{\mathrm{fail}}^{\mathrm{chip}}$, yielding the minimum chip failure probability.

Figure 4. The basic elements of fault tolerant techniques: (a) RMR, (b) CTMR (Debayan Bhaduri, et al., 2004).

1.1.3.5 Cascaded triple modular redundancy

The TMR process can be repeated by combining three of the TMR units with another majority gate to form a second-order TMR unit with even higher reliability, a technique called cascaded TMR (CTMR). If all three of the units work independently, then the probability of the assemblage (three units plus majority gate) working, $P_{\mathrm{works}}^{(1)}$, is given by

$$P_{\mathrm{works}}^{(1)} = P_{\mathrm{works}}^{\mathrm{mg}}\left[\left(P_{\mathrm{works}}^{\mathrm{unit}}\right)^{3} + 3\left(P_{\mathrm{works}}^{\mathrm{unit}}\right)^{2}\left(1 - P_{\mathrm{works}}^{\mathrm{unit}}\right)\right] \qquad (4)$$

where $P_{\mathrm{works}}^{\mathrm{mg}}$ is the probability that the majority gate works and $P_{\mathrm{works}}^{\mathrm{unit}}$ is the probability that a single unit works, which for a unit of $N_c$ devices follows equation (1). The performance with additional stages of TMR is obtained by repeated application of this formula.

Figure 5. The probability of obtaining a defective (TMR/CTMR) (Debayan Bhaduri, et al., 2004).

Figure 5 demonstrates the effectiveness of the CTMR technique. It shows that there is no advantage in using CTMR for units containing a small number of devices, when the majority gates are made from the same devices as the units that they are monitoring. However, at least in principle, improvement is possible for units with large $N_c$.
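The RMR and CTMR estimates above can be explored numerically with a short sketch. The formulas coded here are assumptions reconstructed from the definitions in the text (a module works only if all of its devices work; a group fails when at least (R + 1)/2 modules fail or when its majority gate of mB devices fails; each CTMR stage applies a 2-out-of-3 vote followed by a majority gate). All function names are hypothetical.

```python
from math import comb


def module_fail(n_c: int, p_f: float) -> float:
    """Probability that a module of n_c devices fails (any single device fails)."""
    return 1.0 - (1.0 - p_f) ** n_c


def rmr_chip_fail(n: float, n_c: int, p_f: float, r: int, m_b: int) -> float:
    """Leading-order chip failure estimate for R-fold modular redundancy.

    Assumes n_c * p_f << 1, odd r, and uncorrelated module errors.
    """
    if r % 2 == 0:
        raise ValueError("r must be odd for a majority vote")
    n_groups = n / (r * n_c + m_b)
    k = (r + 1) // 2                      # failed modules needed to outvote the rest
    group_fail = comb(r, k) * (n_c * p_f) ** k + m_b * p_f
    return n_groups * group_fail


def ctmr_works(p_unit: float, p_mg: float, order: int) -> float:
    """Probability that a CTMR assemblage of the given order works."""
    p = p_unit
    for _ in range(order):
        # At least two of three independent units work, and the majority gate works.
        p = p_mg * (p ** 3 + 3 * p ** 2 * (1.0 - p))
    return p


if __name__ == "__main__":
    for n_c in (10, 100, 1000):
        print(n_c, rmr_chip_fail(n=1e6, n_c=n_c, p_f=1e-6, r=3, m_b=10))
    print(ctmr_works(p_unit=0.9, p_mg=1.0, order=2))
```

Sweeping `n_c` for fixed `p_f` and `r` gives a rough numerical view of the optimum cluster size that the condition on the derivative describes analytically.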
There are three regions in each set of curves: (a) $N_c p_f > \ln 2$, where redundancy affords no advantage; (b) $10^{-3} < N_c p_f < \ln 2$, where redundancy is most effective; and (c) $N_c p_f < 10^{-3}$, where only first-order redundancy offers an advantage. In case (b), the effectiveness of redundancy scales as a power law with the order of CTMR. The failure probability is [expression missing in the source]. For case (c), the effectiveness of redundancy depends on the ratio $mB/N_c$; it can be shown that in region (c) the failure probabilities are [expressions missing in the source].

1.1.3.6 NAND multiplexing

Fifty years ago, von Neumann was the first person to consider the use of redundant components to overcome the effects of defective devices. He described the now well-known technique of multiple redundancy, but he also described another method, which we have called NAND multiplexing for brevity. Von Neumann showed that this method could in principle enable a circuit to work even if the individual devices had a failure rate of ∼0.01. The problem with this technique was that it required enormous levels of redundancy. Although this seems very unrealistic, very high levels of redundancy may be needed for nanocomputers, since it may be very hard to make huge numbers of nanoscale devices with good reliability. We have examined whether this technique can be used with smaller levels of redundancy, and we have also examined its performance for different circuit sizes. In essence, the basic technique of multiplexing is similar to RMR, but instead of having a majority gate to decide on the proper output, the output is carried on a bundle of wires; e.g., for a single-bit output one would have R wires in a bundle which carries the output to the next stage. Therefore, in this method, processing units of any size are replaced by multiplexed units containing $N_{\mathrm{bundle}}$ lines for every single input and output. Essentially, a multiplex unit consists of two stages. The first, the executive stage, performs the basic function of the processing unit in parallel.
The second, the restorative stage, reduces the degradation caused by the executive stage and thus acts as a nonlinear amplifier of the output. The example of the executive stage given in figure 6 is a simple NAND (two-input gate), but it could be a unit with an arbitrary number of gates. Now the signals to and from units are not carried in single lines but in bundles (Nbundle = 4 in the example in figure 6) and the unit is replicated the appropriate number of times. If the inputs and processing units are perfectly reliable, then the lines comprising each output should be identically stimulated (1) or unstimulated (0). However, due to errors in the input data as well as errors occurring in the processing of the inputs by faulty devices, not all of the output lines in each output will be identically stimulated. Thus, for multiplexed networks, the final outputs are considered to be 1 if more than (1 − Δ) Nbundle lines are stimulated and 0 if fewer than Δ Nbundle lines are stimulated, where Δ is a critical level that is predefined (0 < Δ < 0.5). Stimulation levels in between are considered to be undecided.

Figure 6. The basic elements of NAND multiplexing (after von Neumann) fault tolerant techniques. The specific logic gates shown are only for the purpose of illustration. (Nikolić, et al., 2002)

Figure 7. The basic structure for the reconfiguration technique theory. The specific logic gates shown are only for the purpose of illustration. (Nikolić, et al., 2002)

For the sake of simplifying calculations, von Neumann assumed the most basic logic gate in a chip to be the NAND gate. This is a universal logic gate and can be used to build the basic logic gates NOT and NOR. Thus, a NAND network equivalent can replace any conventional architecture. The use of only NAND gates in extremely large scale integration (XLSI) architectures may also be justified in that construction is simplified through the use of identical repeating subunits.
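The bundle decision rule and a single executive stage can be sketched as follows. This is a minimal illustration under stated assumptions, not the model analyzed in the cited work: the critical level is written `delta`, each NAND gate independently inverts its output with a hypothetical probability `eps`, and the random permutation unit is modeled by shuffling one input bundle. The function names are hypothetical.

```python
import random
from typing import List, Optional


def nand_executive_stage(x: List[int], y: List[int], eps: float,
                         rng: random.Random) -> List[int]:
    """One executive stage: pair lines of two bundles at random, NAND each pair.

    Each gate flips its output with probability eps (assumed fault model).
    """
    if len(x) != len(y):
        raise ValueError("bundles must have the same number of lines")
    y_perm = y[:]
    rng.shuffle(y_perm)                  # random permutation unit
    out = []
    for a, b in zip(x, y_perm):
        bit = 1 - (a & b)                # ideal NAND
        if rng.random() < eps:
            bit ^= 1                     # faulty gate inverts its output
        out.append(bit)
    return out


def decide(bundle: List[int], delta: float) -> Optional[int]:
    """Interpret a bundle: 1, 0, or None when the stimulation level is undecided."""
    if not 0.0 < delta < 0.5:
        raise ValueError("delta must satisfy 0 < delta < 0.5")
    stimulated = sum(bundle)
    n = len(bundle)
    if stimulated > (1.0 - delta) * n:
        return 1
    if stimulated < delta * n:
        return 0
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    ones = [1] * 100
    out = nand_executive_stage(ones, ones, eps=0.01, rng=rng)
    print(sum(out), decide(out, delta=0.1))
```

A restorative stage could be modeled by feeding the output bundle into both inputs of two further executive stages in sequence, which is consistent with the remark that one application inverts the result.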
(Nikolić et al., 2002)

1.1.3.7 Successful fault tolerance precedents

One of the most frequently cited examples of fault tolerance in systems with a medium to high ratio of defective components is the reconfigurable architecture of TERAMAC, designed by HP researchers. In this example, a computer is built from a reconfigurable structure made up of around 8 million components. The system was shown to tolerate 10% defective programmable cells. The principle of tolerance was based on reconfigurability: first, a complete testing phase determined the set of defective components; then the system was reconfigured to use healthy components only, avoiding the defective ones. This is one of the principles of the reconfiguration technique: test, find, replace. The same technique is used in reconfigurable memories, which integrate spare components, and even in multi-core systems, where at every startup the system tests the cores and disables the defective or marginal ones.

In spite of the apparent success of this technique, it cannot be used in computers that exhibit transient errors, such as those caused by noise, which is unavoidable. Testing the whole system takes orders of magnitude longer than the interval between transient faults. Another drawback is that the method is not efficient if the defect rate is very high, which is what we expect in nanodevice architectures. In nanosystems, we will find a significant number of defects at every level, section or subsection.

1.1.4 Other fault tolerant techniques adequate for transient errors

Hardware fault tolerant structures implemented through massive redundancy have been reported previously. The most relevant are NAND multiplexing, N modular redundancy (NMR) and averaging cells.

1.1.4.1 NAND MULTIPLEXING technique

It was introduced by von Neumann in 1956.
The design is based on implementing a functional gate in three stages: an executive stage, a restorative stage and an output stage. In each of these stages the redundancy level is given by N (with typical values of 100 or 1000). Figure 6 shows the results of this technique for three different levels of redundancy (N = 10, 100 and 1000). Observe that for high values of N the error probability is limited even for a significant individual gate error rate. However, many publications have shown that this technique has a limitation: the system cannot tolerate global errors when the individual error rate is higher than approximately 1%.

1.1.4.2 The NMR or TMR technique

NMR and TMR are based on N-tuple or triple modular redundancy of the basic system. The system evaluates the answers from the N (or 3) modules and accepts as correct the one returned by the majority. The scheme is simple and efficient, but only for very low error rates, as shown in figure 8.

Figure 8. Error tolerance provided by TMR. The vertical axis is the error rate of the TMR system and the horizontal axis is the error probability of each of the three components. (Nikolić et al., 2002)

1.1.5 Basic Concept: Reliability, Entropy, and Model of Computation

This model relates the information theoretic entropy and the thermal entropy of computation in a way that connects reliability to entropy. It has been shown that the thermodynamic limit of computation is KbT ln 2, where KbT is the thermal energy, expressed in normalized units relative to the logic state energy (Kb is the Boltzmann constant and T is the temperature in Kelvin). This means that the minimum entropy loss due to irreversible computation amounts to thermal energy that is proportional to this value.
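To give the KbT ln 2 limit a concrete scale, the following minimal sketch evaluates it in joules at a few temperatures and also expresses KbT relative to a logic-state energy. The function names and the example temperatures are assumptions made for illustration; they do not come from the tools discussed in this document.

```python
import math

# Boltzmann constant in J/K (exact value fixed by the 2019 SI definition).
K_B = 1.380649e-23


def thermal_limit_joules(temperature_kelvin):
    """Return Kb * T * ln(2) in joules for the given absolute temperature."""
    if temperature_kelvin <= 0:
        raise ValueError("temperature must be positive")
    return K_B * temperature_kelvin * math.log(2)


def normalized_kbt(temperature_kelvin, logic_energy_joules):
    """Return Kb * T expressed relative to a (hypothetical) logic-state energy."""
    if logic_energy_joules <= 0:
        raise ValueError("logic energy must be positive")
    return K_B * temperature_kelvin / logic_energy_joules


if __name__ == "__main__":
    # Illustrative temperatures only; they are not taken from the source.
    for t in (77.0, 300.0, 400.0):
        print(f"T = {t:5.1f} K  ->  KbT ln 2 = {thermal_limit_joules(t):.3e} J")
```

At T = 300 K the limit is about 2.87e-21 J per irreversible bit operation. A normalized KbT near 1 would mean that the logic-state energy is comparable to the thermal energy.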
If we consider energy levels close to these thermal limits, the reliability of computation is likely to be affected; if we can keep our systems far from the temperature values that might bring them close to this amount of entropy loss, the reliability is likely to improve. This idea has also been taken into account in prior work, where a Gibbs distribution based technique is used to characterize the computations performed by Boolean gates and networks. NANOLAB automates this non-discrete model of computation. This automation entails parameterized library functions that compute the output distributions, given the input distributions, for all the generic logic gates. These can be used to calculate the probability of the state configurations and the entropy values at the primary outputs of any arbitrary Boolean network.

Figure 9. Generic Triple Modular Redundancy Configuration. (Debayan Bhaduri, et al., 2004).

The usefulness of such a library and methodology is illustrated by modeling and computing various reliability figures of merit for interesting defect tolerant architectures. One of the architectures shown as an illustration is a triple modular redundancy (TMR) configuration, where three NAND gates work in parallel and their outputs are compared by a majority gate. The TMR configuration is shown in Figure 9, and the units in the figure are NAND gates.

Figure 10. The probability of the TMR output for different values of KbT. (Debayan Bhaduri, et al., 2004).

The probability p(z) of the TMR output z for the different state configurations of z is plotted in Figure 10 for different values of KbT. The results obtained by our methodology are consistent with our analysis. As can be observed, the probability at z = 0 is very high for all values of KbT and decreases as z approaches 1 in all cases.
This is because the three NAND gates in the TMR configuration shown in Figure 9 have input probability distributions where p(inputs = 0) = 0.1 and p(inputs = 1) = 0.9. Therefore, the output z of the majority logic gate in Figure 9 has a high probability of being at 0. It is also observed that as the KbT values increase, intermediate values of the output state become more likely, to the extent that they are almost as probable as z being logic low or high. (Debayan Bhaduri, et al., 2004).

1.1.5.1 PTM Theory

In this section, we describe the PTM algebra and some key operations to manipulate PTMs and compute reliability. First we discuss the basic operations needed to describe circuits and to compute circuit PTMs from gate PTMs. Next, we define additional operations to extract reliability information, eliminate variables and handle fanout efficiently. Finally we discuss how PTMs capture signal correlation and a wide variety of errors.

1.1.5.2 PTM Algebra

Consider a circuit C with n inputs and m outputs. We order the inputs for purposes of PTM representation and label them in_0 ... in_(n−1); similarly, the m outputs are labeled out_0 ... out_(m−1). The circuit C can be represented by a 2^n × 2^m PTM M. The rows of M are indexed by an n-bit vector whose values range from 000...0 to 111...1. The row indices correspond to truth assignments of the circuit's inputs. Therefore, if i = i_0 i_1 ... i_(n−1) is an n-bit vector, then row M(i) gives the output probability distribution for the n input values in_0 = i_0, in_1 = i_1, ..., in_(n−1) = i_(n−1). Each entry in M thus gives the conditional probability that a certain output combination occurs given a certain input combination.

1.1.6 A MATLAB Based Tool

In this section, we discuss how information theoretic entropy coupled with thermal entropy can be used as a metric for reliability evaluation.
We also present the automation methodology for the NANOLAB tool with a detailed example. The approach not only provides a different, non-discrete model of computation; it also relates the information theoretic entropy and the thermal entropy of computation in a way that connects reliability to entropy. It has been shown that the thermodynamic limit of computation is KT ln 2, where K is the Boltzmann constant and T is the temperature in Kelvin; KT is expressed in normalized units relative to the logic energy. The logic energy at a particular node of a Markov network depends only on its neighborhood. The thermodynamic limit of computation means that the minimum entropy loss due to irreversible computation amounts to thermal energy that is proportional to this value. Because of the reduced operating voltage levels in nanoarchitectures, the entropy loss can become significant, and reliability may suffer when the computation is carried out close to this thermal limit. However, as we show by considering various defect tolerant architectural techniques, the effects of computing close to the thermal limit can be reduced substantially by choosing appropriate configuration parameters for TMR, CTMR and similar schemes.

NANOLAB consists of a library of functions implemented in MATLAB. The library functions are based on this probabilistic, non-discrete model of computation and can handle discrete energy distributions at the inputs and interconnects of any specified architectural configuration. We have also developed libraries that can compute the energy distribution at the outputs given continuous distributions at the inputs and interconnects, which introduces the notion of signal noise. The tool therefore supports the modeling of both discrete and continuous energy distributions. The library functions work for any generic one-, two- and three-input logic gates and can be extended to handle n-input logic gates.
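As a rough illustration of the kind of gate-level computation such a library performs, the sketch below evaluates a Gibbs-style output distribution for a two-input NAND gate with discrete, independent inputs. It is written in Python rather than MATLAB and is not NANOLAB code. The compatibility polynomial, the weighting exp(f / KbT), the grid of output levels and all function names are assumptions chosen for illustration.

```python
import math


def nand_compatibility(x0, x1, z):
    """Assumed compatibility function: equals 1 when z == NAND(x0, x1) and
    0 otherwise for binary arguments; it interpolates linearly for z in [0, 1]."""
    return z + x0 * x1 - 2 * x0 * x1 * z


def nand_output_distribution(p_x0_one, p_x1_one, kbt, levels=11):
    """Return (zs, probs): output levels in [0, 1] and their probabilities.

    For each level z, the binary inputs are marginalized out, weighting every
    input combination by its probability and by exp(f / kbt) (Gibbs form with
    energy U = -f, an assumption of this sketch).
    """
    zs = [i / (levels - 1) for i in range(levels)]
    weights = []
    for z in zs:
        w = 0.0
        for x0 in (0, 1):
            for x1 in (0, 1):
                p0 = p_x0_one if x0 else 1.0 - p_x0_one
                p1 = p_x1_one if x1 else 1.0 - p_x1_one
                w += p0 * p1 * math.exp(nand_compatibility(x0, x1, z) / kbt)
        weights.append(w)
    total = sum(weights)  # normalization constant Z
    return zs, [w / total for w in weights]


if __name__ == "__main__":
    for kbt in (0.1, 0.5, 1.0):
        zs, probs = nand_output_distribution(0.9, 0.9, kbt)
        print(f"KbT = {kbt}: p(z=0) = {probs[0]:.3f}, p(z=1) = {probs[-1]:.3f}")
```

With both inputs at p(1) = 0.9, the sketch puts more weight on z = 0 than on z = 1, and the distribution flattens as KbT grows, which matches the qualitative behaviour described in this chapter.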
The inputs of these gates are assumed to be independent of each other. The functions are parameterized and take as inputs the logic compatibility function and the initial energy distribution for the inputs of a gate. If the input distribution is discrete, the energy distribution at the output of a gate is computed by marginalizing over the set of possible values of the inputs. The functions return the output probability distributions as vectors, which indicate the probability of the output node being at different energy levels between 0 and 1. These probabilities are also calculated for different values of KT in order to analyze thermal effects on the node. The Belief Propagation algorithm is used to propagate these probability values to the next node of the network, where the next marginalization is performed. The tool can also calculate entropy values at different nodes of the logic network. It also verifies that, for each logical component of a Boolean network, the valid states have a lower energy level than the invalid states, as has been shown theoretically.

Our tool also includes a library of functions that can model noise as uniform distributions, Gaussian distributions or combinations of these, depending on the user specifications. The probability of the energy levels at the output of a gate is calculated by similar marginalizing techniques. Arbitrary Boolean networks in any redundancy based defect tolerant architectural configuration can be analyzed by writing simple MATLAB scripts that use the NANOLAB library functions. In addition, generic fault tolerant architectures such as TMR and CTMR are being converted into library functions so that they can be used in larger Boolean networks, where more than one kind of defect tolerant scheme may be applied for higher reliability of computation.

1.2 LITERATURE REVIEW

Nanotechnology is a field of science and technology that controls matter on a scale between 1 and 100 nanometers.
Miniaturization is the key to increasing performance in electronics. Atomically Precise Manufacturing is the ability to manufacture materials and structures at the atomic or molecular size scale. This technology integrates the knowledge and low manufacturing cost of chemistry with the knowledge and flexibility of engineering. Its goal is to achieve the highest possible degree of precision. Embedded systems are increasingly becoming connected through wireless networking. These devices now form the basis of many of today's consumer products, including cell phones and video game controllers.

Nanoelectronics offers the potential for much denser circuitry than is possible with current CMOS technology, but it presents a number of challenges that must be addressed if we are to exploit it successfully. At the size scale we are considering (< 15 nm), conventional lithography will lack the resolution necessary to create the individual devices that are normally combined into larger circuits. Sensitivity to noise, subatomic particles, and even quantum uncertainty at this scale will increase the rate of transient faults to the point that redundancy and fault tolerance will become necessary for logic functions as well as memory. And, regardless of fabrication technique, nanoscale circuits will likely contain defects so numerous that it would be uneconomical simply to discard circuits containing a single defect; some form of defect tolerance will be necessary to achieve acceptable yields.

Recent years have seen a dramatic improvement in the size and speed of electronic devices; the exponential pace of microelectronics is well known. Although current trends may continue for some time, inevitable road blocks loom. Whether or not one can predict with confidence how long the exponential path can be extended, it makes sense to explore now more radical technologies that could leapfrog conventional CMOS and enable scaling to continue unhindered down to molecular sizes.
It is helpful to appreciate that current MOS transistors are direct descendants of the electromechanical switches first used by Zuse to code digital information in an electronic form. Representing binary information by turning a current switch on or off has been one of the most fruitful ideas in the history of technology. This paradigm does, however, have serious drawbacks as device sizes are reduced. The interconnect problem is one. One needs to distribute signals over large distances, which involves charging long lines. Remarkable complexity attends the routing of signals on multiple levels. At the other end, as transistors become smaller, the quantization of charge both in the channel and in the doping layer becomes significant. When reduced to nanometer scales, current switches may not be the best way to code information. (Lent et al., 2005).

Today's most advanced integrated technology has a critical manufacturing dimension of 32 nm, and this is still decreasing. Following the scaling predicted by Gordon Moore, device sizes have shrunk continuously from around the 350 nm generation a decade ago. The generations of this last decade, from the 350 nm to the 90 nm node, have been called the pre-giga-transistor era. The range from 90 to 22 nm (a technology predicted to be feasible today) is called the giga-transistor era, and the range from 22 nm to the ultimate feasible size (presumably 12 or 6 nm) is called the tera-transistor era. (Carmen García, et al., 2004).

Physicists and chemists are starting to demonstrate a variety of sub-10 nm electron devices that can be fabricated bottom-up without relying on nanoscale patterning, for example by self-assembly of pre-synthesized molecules from solution. Simultaneously, scientists and engineers are beginning to demonstrate a number of techniques for building and engineering wires with nanometer scale widths.
These devices are emerging just as we encounter increasing challenges in control, variability, and reliability while scaling our traditional, top-down CMOS fabrication techniques. These encouraging developments have stimulated intense work on architectures for future nanoelectronic circuits. Some of these architectures are based on the use of active nanodevices alone. However, these proposals must deal either with the signal restoration and device isolation challenges of two-terminal devices or with the self-assembly challenges of molecular three-terminal nanodevices (e.g., transistors). This is why one of the most promising solutions is to combine nanoscale devices and nanowires with larger CMOS components. The CMOS subsystem can also serve as a reliable medium connecting the nanoscale circuit blocks and providing I/O functions. This paper gives a brief review of such hybrid CMOS/nanowire/nanodevice integrated circuits. We start with a discussion of possible active and passive components of such circuits in order to show that they create a new set of constraints and issues to address in circuit design, essentially requiring a completely new design mindset. We then discuss several hybrid circuit architectures as examples of how these issues could be addressed. Together, these new issues and architectures present a new set of challenges for design automation (Dehon, Likharev et al., 2004).

The first ACS architecture is called the MT ACS architecture. It is an extension of the conventional ACS architecture and contains a multi-threaded processor, memory, I/O devices and a reconfigurable coprocessor connected to the bus. For a given application, one part of the application is executed on the processor and the remaining part is implemented on the reconfigurable coprocessor. For the portions of the application mapped onto the reconfigurable hardware, CED, fault location and recovery techniques described in the cited work can be used.
Multi-threading can be used to implement fault tolerance in the processor. Fault tolerance is accomplished by using multiple threads of computations and algorithms. With the MT ACS architecture, recovery from permanent faults in the processor requires board swapping or replacement of the microprocessor chip. Thus, the MT ACS architecture needs human intervention for recovery and may not be suitable for unmanned applications. (Subhasish Mitra et al., 2000).

The paper which follows is based on notes taken by R.S. Pierce on five lectures given by the author at the California Institute of Technology in January 1952. They have been revised by the author, but they reflect, apart from stylistic changes, the lectures as they were delivered. The author intends to prepare an expanded version for publication, and the present write-up, which is imperfect in various ways, therefore does not represent the final and complete publication. The neurological connections may then also be explored somewhat further. The present write-up is nevertheless presented in this form because the field is in a state of rapid flux, and therefore, for ideas that bear on it, an exposition without too much delay seems desirable. The analytical table of contents which precedes this will give a reasonably close orientation about the contents; indeed, the title should be fairly self-explanatory. The subject matter is the role of error in logics, or in the physical implementation of logics, in automata synthesis. Error is viewed, therefore, not as an extraneous and misdirected or misdirecting accident, but as an essential part of the process under consideration, its importance in the synthesis of automata being fully comparable to that of the factor which is normally considered, the intended and correct logical structure (J. von Neumann, et al., 1952).

New technologies for building nanometer scale devices are expected to provide the means for constructing much denser logic and thinner wires.
These technologies provide a mechanism for the construction of a useful Avogadro computer that makes efficient use of an extremely large number of small devices computing in parallel in the presence of defects and uncertainty. But the economic fabrication of complete circuits at the nanometer level remains challenging because of the difficulty of connecting nanodevices to one another. Also, the shrinking feature size will make defect density control very expensive; in the near future we will be unable to manufacture large, defect-free integrated circuits. Thus, designing reliable system architectures that can work around these problems at run time becomes important.

Also, at the nanometer scale, the focus of micro-architecture will move from processing to communication. General computer architectures to date have been based on principles that differentiate between memory and processing and rely on communication over buses. Nanoelectronics promises to change these basic principles. Processing will be cheap and copious, interconnection expensive and prevalent. This will tend to move computer architecture in the direction of locally connected, redundant and reconfigurable hardware meshes that merge processing and memory. At the same time, because of fundamental limitations at the nanoscale, micro-architects will be presented with new design challenges. For instance, the methodology of using global interconnections and assuming error-free computing may no longer be possible.

Because of the small feature size, there will be an extremely large number of nanodevices at a designer's disposal. This will lead to redundancy based defect tolerant architectures, and thus some conventional techniques such as TMR, CTMR and multistage CTMR may be implemented to obtain high reliability. However, too much redundancy does not necessarily lead to higher reliability, since the defects affect the redundant parts as well.
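The point that defects also affect the redundant parts can be illustrated with a small calculation for a single TMR block whose majority voter can fail. This is a minimal sketch under stated assumptions: the three units fail independently with the same probability, and a voter fault inverts the voted bit. The function names and the example probabilities are hypothetical.

```python
def majority_of_three_fails(p_unit):
    """Probability that at least two of three independent units are wrong."""
    return 3 * p_unit**2 * (1 - p_unit) + p_unit**3


def tmr_failure(p_unit, p_voter):
    """Failure probability of a TMR block with an imperfect voter.

    Assumption: a voter fault inverts the voted bit, so the block output is
    wrong when exactly one of (majority wrong, voter faulty) holds.
    """
    m = majority_of_three_fails(p_unit)
    return m * (1 - p_voter) + (1 - m) * p_voter


if __name__ == "__main__":
    p_unit = 0.01
    for p_voter in (0.0, 0.001, 0.01):
        p_block = tmr_failure(p_unit, p_voter)
        verdict = "better" if p_block < p_unit else "worse"
        print(f"p_voter = {p_voter}: TMR failure = {p_block:.6f} "
              f"({verdict} than a single unit)")
```

With p_unit = 0.01 and a perfect voter, the block fails with probability of roughly 3e-4. If the voter is as unreliable as the units (p_voter = 0.01), the block fails slightly more often than a single unprotected unit, so the added redundancy brings no benefit under these assumptions.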
As a result, in-depth analysis is required to find the optimal redundancy level for each specific architecture. Exhaustive analysis of the NAND multiplexing technique shows that the arbitrary addition of unreliable devices can decrease the reliability of an architecture. This means that, for each specific architecture and a given failure distribution of devices, once an optimal redundancy level is reached, any increase or decrease in the number of devices may lead to less reliable computation. Note also that redundancy may be applied at different levels of granularity, such as gate level, logic block level, functional unit level etc. Our work shows that deciding the correct granularity level for a specific Boolean network is crucial in achieving an optimal redundancy level. We have developed an automated tool for evaluating the reliability of various alternative defect tolerant architectures for any arbitrary Boolean network. This probabilistic model checking based tool, named NANOPRISM, can automatically evaluate reliability in terms of redundancy and granularity levels and, most importantly, show the trade-offs and saturation points. By saturation point we mean the point at which the curve of granularity based redundancy versus reliability reaches a plateau, so that no further improvement in reliability can be obtained by increasing redundancy or granularity levels (Debayan, et al., 2004).

1.2.4 Defect tolerant computing

In 1952, von Neumann studied the problem of constructing reliable computation from unreliable devices, introducing a redundancy technique called NAND multiplexing. He showed that, if the failure probabilities of the gates are sufficiently small and failures are independent, then computations may be done with a high probability of correctness. Later, it was shown that a logarithmic redundancy is necessary for some Boolean function computations, and sufficient for all Boolean functions.
Pippenger showed that von Neumann's construction works only when the probability of failure per gate is strictly less than 1/2, and that computation in the presence of noise requires more layers of redundancy. NAND multiplexing was compared to other techniques for fault tolerance, and theoretical calculations showed that the redundancy level must be quite high to obtain acceptable levels of reliability. Formally, a defect tolerant architecture is one which uses techniques to mitigate the effects of defects in the devices that make up the architecture, and which guarantees a given level of reliability. (Gethin Norman, et al., 2005).

New technologies for building nanometer scale devices are expected to provide the means for constructing much denser logic and thinner wires. But the economic fabrication of complete circuits at the nanometer level remains challenging because of the difficulty of connecting nanodevices to one another. Also, the shrinking feature size will make defect density control very expensive; in the near future we will be unable to manufacture large, defect-free integrated circuits. Thus, designing reliable system architectures that can work around these problems at run time becomes important. Because of the small feature size, there will be an extremely large number of nanodevices at a designer's disposal. This will lead to redundancy based defect tolerant architectures, and thus some conventional techniques such as Triple Modular Redundancy (TMR), Cascaded Triple Modular Redundancy (CTMR) and multistage iterations of these may be implemented to obtain high reliability. We have developed tools to evaluate such reliability/redundancy trade-offs of different redundant architectural configurations for arbitrary Boolean networks. In particular, we describe a MATLAB based tool called nanolab, and a probabilistic model checking based tool named nanoprism.
Recently, a probabilistic model of computation based on Markov Random Fields (MRF) has been proposed. It introduces a new information encoding and computation scheme in which signals are interpreted as logic low or high over a continuous energy distribution. The inputs and outputs of the gates in a combinational block are realized as nodes of a Markov network, and the logic function of each gate is interpreted by a Gibbs distribution based transformation. Nanolab automates this probabilistic design methodology. Nanoprism, in turn, consists of libraries built on PRISM for different redundancy based defect tolerant architectural configurations. These libraries also support modeling of redundancy at different levels of granularity, such as gate level, logic block level, logic function level, unit level etc. (Debayan Bhaduri, et al., 2004).

As CMOS approaches its physical limit, nanoscale devices such as Resonant Tunneling Diodes, Quantum dot Cellular Arrays and molecular electronics have been proposed because of their drastically improved operating speeds, low power consumption and device densities reaching 10^12 devices/cm^2. Consequently, defect and fault tolerance has assumed paramount importance as a design objective in the emerging nanoelectronic environment. Related research on defect tolerance in nanoelectronic systems includes system level approaches and logic level solutions. These approaches reconfigure the redundant hardware to bypass the permanently faulty units. To deal with transient faults and provide fault tolerance capability, N Module Redundancy and NAND multiplexing can be applied at the logic gate level. These simple hardware redundancy based schemes at the logic and lower levels require an immense amount of redundancy to tolerate the large failure rates of the emerging nanotechnologies.
Research approaches to architectural level fault tolerance schemes for processors based on CMOS technology mainly deal with a low and relatively fixed fault rate and are not applicable to the nanoelectronic environment. Essentially, computational models for computing system architectures based on nanoelectronic devices should satisfy two core requirements. First, correctness of computations is a fundamental requirement. The overall system should operate reliably even though the underlying nanotechnology is unreliable. A second requirement is to implement a high performance system. The large number of computation units should be used to dramatically speed up system performance. In doing so, several unique challenges need to be addressed, including (i) how to translate the speed-up afforded by nanoelectronic devices into high performance at the system level and (ii) how to organize the abundant computational resources to trade off fault tolerance against system performance in the presence of high and time-varying failure rates.

In this paper, we propose a fault tolerance computational model for nanoelectronic processors. Essentially, the correctness of every instruction is confirmed by multiple execution instances through hardware redundancy complemented by a time redundancy approach. To achieve system performance, multiple computation branches can proceed in parallel in a speculative manner. Hardware resource growth in the speculative computations is controlled so that the performance boost does not come at excessive hardware expense. An instance of the proposed computation model can be found in the cited work. We set up an experimental framework to validate the effectiveness of the proposed approach and investigate the variations of the algorithm under different parameters.
Experimental data confirm that, in terms of both hardware and latency, the proposed computation model can provide fault tolerance for pipelined instructions even under a high and variable fault rate, and can achieve high system performance with low hardware overhead. The experimental results further show that, for different fault rate ranges, a solution quite close to the optimal can be achieved through the proper selection of parameters in the proposed computation model. The paper is organized as follows. A description of the proposed fault tolerance computation model is provided, together with the hardware allocation algorithm for a pipelined architecture. The experimental setup, results and analysis are then provided (Wenjing Rao, et al., 1999).

CHAPTER 2

2.1 AIM AND OBJECTIVES

The main aim of this work is to evaluate the reliability of defect tolerance in nanoelectronic devices. The objectives of the work include:
Design of a nanoelectronic device
Analysis for error detection
Correction by using simulation techniques
Simulating the device using MATLAB

2.2 WORK PLAN

Signal hazard calculation
Entropy calculation
Redundancy calculation
Defect tolerance calculation

2.3 MATERIALS AND METHODS

2.3.1 Granularity and Measure of Redundancy

Redundancy based defect tolerance can be implemented for Boolean networks at different levels of granularity. For a specific logic circuit, all the gates could be replicated as a particular CTMR configuration while the overall architecture is some other CTMR configuration. For example, each gate in a logic circuit could be a kth order CTMR configuration and the overall circuit could be an nth order configuration, where k ≠ n. The different levels of granularity at which redundancy can be applied are gate level, logic block level, logic function level etc.
We discuss some interesting results later which show that reliability does depend on the granularity level at which redundancy is injected into a specific Boolean network.

2.3.2 Modeling Single gate TMR, CTMR

We explain the PRISM model construction of a single gate TMR, CTMR and multistage iterations of these. The first approach is to model the system directly as given in Figure 1. For each redundant unit and for the majority voting logic, separate PRISM modules are constructed and then combined through synchronous parallel composition. However, such an approach leads to the well-known state space explosion problem. At the same time, we observe that the actual values of the inputs and outputs of each logic block are not important; one needs to keep track only of the total number of stimulated (and non-stimulated) inputs and outputs. Furthermore, to compute these values without having to store all the outputs of the units, we replace the set of replicated units working in parallel with units working in sequence. This folds space into time; in other words, it reuses the same logic unit over time rather than replicating it in space. This approach does not influence the performance of the system, since each unit works independently and the probability of each gate failing is also independent.

Different orders of CTMR configurations are built incrementally from the models of the previous iterations. Here, too, two approaches emerge. One approach is to incorporate the PRISM modules of the previous CTMR iteration as the redundant functional units for the current order and to add a majority voting logic. This causes the model to grow exponentially as higher orders of configuration are reached.
The other approach is to use the already calculated probability values (the probability of being a logic low or high) at the output of the last CTMR configuration as the output probability distributions of the three redundant functional units of the current order of a CTMR configuration. Let us consider a TMR configuration of a NAND gate. For a given input probability distribution and failure distribution of the NAND gates, the PRISM model checks whether the output of the configuration is in an invalid state and computes the probability of being in that state. This gives the output distribution of the TMR, including the invalid state configurations. A CTMR configuration uses three TMR logic units and a majority voter. The probability distribution obtained for the TMR block can be used directly in this configuration, thus reducing the state space. The NANOPRISM libraries are parameterized, and different input and error probabilities can be plugged in to check reliability measures.

2.3.3 Tools Involved

NANOPRISM DTMC based libraries
Digital Electronics Tools

2.3.3.1 MATLAB

MATLAB is a high-level technical computing language and interactive environment for algorithm development, data visualization, data analysis, and numeric computation. Using MATLAB, we can solve technical computing problems faster than with traditional programming languages such as C, C++, and FORTRAN. MATLAB can be used in a wide range of applications, including signal and image processing, communications, control design, test and measurement, financial modeling and analysis, and computational biology. Add-on toolboxes (collections of special-purpose MATLAB functions, available separately) extend the MATLAB environment to solve particular classes of problems in these application areas. MATLAB also provides a number of features for documenting and sharing work: MATLAB code can be integrated with other languages and applications, and MATLAB algorithms and applications can be distributed.
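Returning to the second modeling approach of Section 2.3.2, in which the output distribution of one CTMR order is reused as the unit distribution of the next, the idea can be sketched as a simple recursion. The sketch below is a deliberate simplification and not the NANOPRISM model: it tracks only the probability that a unit's output is wrong (rather than the full low, high and invalid distribution), it assumes that units fail independently, and it assumes that an imperfect voter simply flips the majority result with a fixed probability. The names majority_error, ctmr_error, gate_error and voter_error are hypothetical, and Python is used purely for illustration.

```python
def majority_error(e: float) -> float:
    """Probability that at least two of three independent units are wrong,
    given that each unit is wrong with probability e."""
    return 3 * e**2 * (1 - e) + e**3


def ctmr_error(gate_error: float, voter_error: float, order: int) -> float:
    """Output error probability of a CTMR of the given order.

    Assumptions (illustrative, not from the source): order 0 is a plain
    TMR; every voter flips its result with probability voter_error.
    """
    e = gate_error
    for _ in range(order + 1):
        m = majority_error(e)
        # The output is wrong if the majority is wrong and the voter works,
        # or if the majority is right and the voter flips it.
        e = m * (1 - voter_error) + (1 - m) * voter_error
    return e


# Explore how the error probability changes with the redundancy order.
for order in range(3):
    print(order, ctmr_error(gate_error=0.05, voter_error=0.01, order=order))
```

Each additional order costs one more evaluation of the same formula, whereas nesting the modules of the previous order triples the size of the model. With an unreliable voter the recursion can also make the output worse than a single gate, which is in line with the observation that a configuration has an optimal redundancy level.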
CHAPTER 3

3.1 RESULTS AND DISCUSSION

3.1.1 Reliability and Bit Transfer of a Traditional NAND Gate and Nano NAND Gate

The bit transfer analysis was performed for a traditional NAND gate and a nano NAND gate. At the end of the analysis, it was observed that the performance of the nano NAND gate was superior to that of the traditional NAND gate. The elapsed time was 0.000102 seconds for the traditional NAND gate and 0.000035 seconds for the nano NAND gate.

Figure 11. Bit transfer of a traditional NAND Gate and Nano NAND Gate

It is found that the traditional NAND gate took 8.4 seconds to transfer eight bits of data, whereas the nano NAND gate took only 2.4 seconds. With the nano NAND gate, data transfer is faster and hence the reliability of defect tolerance is high.

Figure 11 shows the entropy curves at the outputs for a single NAND gate and a CTMR configuration at different KbT values. The output probability distribution of a NAND gate is asymmetrical. This should be expected, since only one input combination produces a logic 0 at the output.

Figure 12. Entropy of a CTMR NAND configuration when inputs are uniformly distributed

Figure 12 shows the entropy curves for different orders of a NAND CTMR configuration at different KbT values. The inputs in this case are uniformly distributed. It can be observed that the 0th order CTMR has a higher entropy value (less logic margin) than the 1st order CTMR at lower KbT values, and in both cases the entropy values almost converge at KbT = 0.5. At this point the logic margins at the output of the CTMR become insignificant and there is total disorder: computation becomes unpredictable. The entropy shoots up at a KbT value of 0.25, indicating that the optimal redundancy level for a NAND CTMR configuration is the 1st order. This shows that every architecture has an optimal redundancy level.

Figure 13. Entropy of a CTMR NAND configuration when inputs are logic high with 0.8 probability

Figure 13 indicates the entropy values when the inputs are non-uniformly distributed, with a probability of 0.8 of being logic high. In this case too, it is observed that the 1st order CTMR is the optimal redundancy level for the configuration. The entropy values are higher than in Figure 12 for all values of KbT. This is because a higher probability of the inputs being logic high makes it more likely that the output of the NAND gate is 0, and only one input combination can cause this to happen, i.e. when both inputs are high. Thus, the logic margins are somewhat reduced and the entropy has higher values than in the previous experiment.

Figure 14. Entropy of NAND Gate

Figure 14 indicates the entropy values when the inputs are non-uniformly distributed, with a probability of 0.8 of being logic high. In this case too, it is observed that the 1st order CTMR is the optimal redundancy level for the configuration. The entropy values are higher than in Figure 12 for all values of KbT. This is because a higher probability of the inputs being logic high makes it more likely that the output of the NAND gate is 0, and only one input combination can cause this to happen, i.e. when both inputs are high. Thus, the logic margins are somewhat reduced and the entropy has higher values than in the previous experiment.

3.2 CONCLUSION

In this work, two automation methodologies that can expedite the reliability evaluation of defect tolerant architectures for nanodevices are simulated. In particular, NANOLAB, which is built on MATLAB, and a probabilistic model checking based tool called NANOPRISM are used. These tools can determine optimal redundancy and granularity levels for any specific logic network, given the failure distribution of the devices and the desired reliability of the system.
These can also be used to inject signal noise at inputs and interconnects, and so recreate the practical situations that circuits are subjected to during normal operation. NANOLAB applies an approach to reliability analysis that is based on Markov Random Fields as the probabilistic model of computation, whereas NANOPRISM applies probabilistic model checking techniques to calculate the likelihood of occurrence of probabilistic transient defects in devices and interconnections. Thus, there is an inherent difference in the model of computation between these two approaches.

3.3 REFERENCES

1. André DeHon, Konstantin K. Likharev (2004). "Hybrid CMOS/Nanoelectronic Digital Circuits, Devices, Architectures, and Design Automation". Journal of Nanotechnology. Vol. 54, pp. 152-158.
2. André DeHon (2008). "Defect and fault tolerance". Journal of Electrical and System Engineering. Vol. 37, pp. 830-852.
3. Carmen García, Francesc Moll, Antonio Rubio (2004). "Electronic System Design Paradigms In The Technologies Of The Year 2020". Journal of Faulty of Telecommunication. Journal of IEEE. Vol. 85, pp. 541-557.
4. Craig S. Lent and P. Douglas Tougaw (2008). "A Device Architecture for Computing with Quantum Dots". Vol. 54, pp. 152-158.
5. Debayan Bhaduri, Sandeep K. Shukla, Paul S. Graham, Maya Gokhale (2005). "Reliability/Redundancy Trade off Evaluation for Multiplexed Architectures Used to Implement Quantum Dot Based Computing". Journal of Fermat lab. Vol. 121, pp. 1-16.
6. Debayan Bhaduri, Sandeep Shukla (2003). "NANOPRISM: A Tool for Evaluating Granularity vs. Reliability Trade offs in Nano Architectures". Journal of Fermat lab. Vol. 54, pp. 52-59.
7. Debayan Bhaduri, Sandeep Shukla (2005). "Comparing Reliability Redundancy Trade offs for Two Von Neumann Multiplexing Architectures". Vol. 54, pp. 15-19.
8. Debayan Bhaduri, Sandeep Shukla, Heather Quinn (2004). "Reliability Driven Probabilistic Design Paradigm for Transient Error Tolerant Architectures on Nanofabrics". Vol. 54, pp. 25-29.
9. Debayan Bhaduri, Sandeep K. Shukla (2008). "Probabilistic Analysis of Self Assembled Molecular Networks". Vol. 5, pp. 123-151.
10. Debayan Bhaduri, Sandeep Shukla (2004). "Tools and Techniques for Evaluating Reliability of Defect Tolerant Nano Architectures". Journal of Fermat lab. Vol. 96, pp. 40-46.
11. Debayan Bhaduri, Sandeep Shukla (2004). "Reliability Analysis of Nano Architectures in the Presence of Thermal Perturbations and Signal Noise". Journal of Fermat lab. Vol. 51, pp. 152-158.
12. Gethin Norman, David Parker, Marta Kwiatkowska (2005). "Evaluating the Reliability of Defect Tolerant Architectures for Nanotechnology with Probabilistic Model Checking". Vol. 78, pp. 15-19.
13. James R. Heath, Philip J. Kuekes, Gregory S. Snider, R. Stanley Williams (1998). "A Defect Tolerant Computer Architecture: Opportunities for Nanotechnology". Journal of Jstore. Vol. 280, pp. 1716-1721.
14. Kamins and R. Stanley Williams (2001). "Trends in Nanotechnology: Self Assembly and Defect Tolerance". Journal of Nanotechnology. Vol. 5, pp. 111-118.
15. Marta Kwiatkowska, David Parker, Yi Zhang (2003). "Dual Processor Parallelisation of Symbolic Probabilistic Model Checking". In: School of Computer Science, University of Birmingham, Edgbaston, Birmingham. Vol. 14, pp. 152-171.
16. Marta Kwiatkowska, Gethin Norman, David Parker (2004). "Probabilistic Symbolic Model Checking with PRISM: A Hybrid Approach". Journal of Vol. 71, pp. 1501-1511.
17. Margarida Jacome, Chen He, Gustavo de Veciana, and Stephen Bijansky (2004). "Defect Tolerant Probabilistic Design Paradigm for Nanotechnologies". Journal of Jstore. Journal of Computer Engineering. Vol. 19, pp. 15-25.
18. Marta Kwiatkowska, Gethin Norman, David Parker (2004). "Probabilistic Symbolic Model Checking with PRISM: A Hybrid Approach". Vol. 18, pp. 251-271.
19. Mike Butts (2002). "Survey of Nanoscale Digital System Technology". Journal of Nano circuits. Vol. 12, pp. 1-6.
20. Nikolić, A. Sadek and M. Forshaw (2002). "Fault tolerant techniques for Nanocomputers". Journal of Nanotechnology. Vol. 13, pp. 357-362.
21. Sandeep K. Shukla, Ramesh Kam, Seth Copen Goldstein, Forrest Brewer (2004). "Nano, Quantum, and Molecular Computing: Are we ready for the Validation and Test Challenges?". Journal of Quantum computation. Vol. 6, pp. 3-7.
22. Smitha Krishnasawamy, George (2008). "Probabilistic Transfer Matrices in Symbolic Reliability Analysis of Logic Circuits". Journal of Design Automation of Electronic Systems. Vol. 13, pp. 8.1-8.35.
23. Snider, P. Kuekes, T. Hogg, R. Stanley Williams (2005). "Nanoelectronic architectures". Journal of Materials Science and Processing. Vol. 80, pp. 1183-1195.
24. Stan Z. Li (1995). "Markov Random Field Models for pose Estimation in object Recognition". Vol. 17, pp. 151-158.
25. Tian Ban, Lirida Naviner (2011). "Progressive module redundancy for fault-tolerant designs in nanoelectronics". Journal of Jstore. Vol. 41, pp. 14-29.
26. Wei Lu and Charles M. Lieber (2007). "Nanoelectronics from the bottom up".
27. Wenjing Rao, Alex Orailoglu, Ramesh Karri (2005). "Architectural Level Fault Tolerant Computation in Nanoelectronic Processors". Journal of Nanotechnology. Vol. 11, pp. 1-8.
