A Linear Tradeoff Model for Determining the Economic Viability of 100% Wafer Flatness Inspection and Sorting

David Myers, Texas Instruments, Dallas, Texas; Larry Beckwith, National Semiconductor, Santa Clara, California; Murray Bullis, Materials & Metrology, Sunnyvale, California; Laszlo Fabry, Wacker Siltronic, Burghausen, Germany; Howard Huff, International Sematech, Austin, Texas; Bill Hughes, MEMC Electronic Materials, St. Peters, Missouri; Mototaka Kamoshida, NEC TOKIN, Sendai, Japan; Paul Langer, Komatsu Silicon America, Allentown, Pennsylvania; Don McCormack, International Sematech, Austin, Texas; Noel Poduje, ADE Corporation, Westwood, Massachusetts

ABSTRACT
A linear tradeoff model is developed by making various assumptions about the capability of flatness measuring equipment, the ability of the flatness specification to predict die failure due to flatness deviations, the characteristic distribution of site flatness values, and the revenue losses associated with wafer testing and die yield loss due to flatness deviation. It is found that if the die price exceeds a critical value, it is more cost-effective to 100% inspect and sort, while if the die price is less than this value, it is more cost-effective not to inspect and sort. If the process capability is high and each rejected site causes only one die failure and if the defective sites are randomly distributed, the critical value is equal to the value of the wafer. Poor process capability, multiple die failure for each defective site, and non-random site failures all serve to decrease the critical value below the value of the wafer. The model calculations imply that specifications allowing less than 100% usable area are not appropriate when 100% inspection and sorting is required. When there is only one die per site, the inspection of partial sites would not appear to be appropriate. However, their inclusion allows more of the wafer surface area to be inspected, albeit at the possible expense of introducing additional metrology errors. Both incorrect specification limits and metrology inaccuracies would affect the results obtained by the model, but neither of these factors has yet been fully evaluated.

INTRODUCTION: FLATNESS MODELS
Wafer flatness is a critical parameter for silicon wafers because of the effect it could have on the ability of photolithographic systems in a wafer fab to accurately print circuit features during the manufacture of integrated circuits. The photolithographic systems are known as steppers because they expose a fixed area of the wafer to a circuit feature image, and then “step” to an adjacent area and repeat the process until the entire wafer is exposed. The fixed area is known as an exposure field, or stepper field, and its size in current (2002) steppers is typically 25 mm × 25 mm, although the exact dimensions depend on the stepper manufacturer. The image contained within the exposure field consists of circuit features that ultimately produce one or more integrated circuits, or die, when wafer processing is completed. Therefore, each stepper field contains one or more die.

Modern steppers tilt the wafer and focus the image to remove as much dependency on the wafer topography as possible when exposing each field. However, topographic extremes that are within the exposure field (i.e., peaks and valleys on the wafer), if large enough, distort the circuit feature image and might cause die yield loss. To reduce this risk of loss, wafer fab engineers specify a maximum peak-to-valley height (range) on the silicon wafers they purchase.

It is not practical for silicon wafer manufacturers to own steppers for wafer characterization. Instead they use measurement instruments designed for the specific purpose of wafer inspection to measure the range.
The measurement technology of these tools does not operate in the same way as stepper exposures, so the wafer metrology only approximates stepper performance. However, attempts are made to collect and report wafer data in a form that is as relevant as possible to stepper performance. For example, the magnitude of the range is strongly dependent on the size of the area measured, so wafer data are reported on the basis of 25 mm × 25 mm sites (or other dimensions that mimic the field sizes of current steppers), and the corresponding metric is called site flatness. Photolithography engineers may shift the layout of stepper fields left-right or up-down on a wafer to maximize the number of die that fit on the wafer; site flatness measurements sometimes are also specified with an x (left-right) or y (up-down) layout shift, or offset, relative to the center of the wafer. Finally, since multiple die are often contained within a single exposure field, integrated circuit manufacturers often expose a field near the edge of the wafer so that some of the die are inside the wafer, where they can yield, and some are not. Data that are relevant to this practice are collected on wafers by including partial sites (defined as a site with some of its area outside the Fixed Quality Area (FQA), i.e., the area inside the nominal edge exclusion [1], but having at least its centerpoint inside this area).

Silicon wafer suppliers typically inspect 100% of their wafers for site flatness and use the results of the inspection to screen the product wafers. Any wafers that do not conform to the customer’s specified maximum range requirement are discarded before the remaining wafers are shipped. Typically the discarded wafers have only one nonconforming site, which could represent as little as ~0.65% of the fixed quality area [2]. With so much usable silicon apparently being wasted, it is reasonable to ask whether the 100% inspection strategy is really the most rational course of action.
Consequently, an investigation of various models related to wafer flatness inspection was carried out [3] to determine whether it is more economical from an industry-wide cost of ownership perspective to perform a sample inspection or to continue 100% inspection and sort. The results of that investigation are reported here.

During the development of the linear tradeoff model described below, many of the authors frequently expressed skepticism about two basic assumptions: (1) current metrology gives accurate results and (2) flatness specifications are strictly correct. These assumptions imply that all sites that fail inspection result in die yield loss, and none of the sites that pass inspection result in die yield loss for flatness-related loss mechanisms. Nevertheless, it was decided to develop the model as completely as possible using these assumptions and then return to the issue after the model was completed to determine how (or if) the conclusions would be altered by changing the assumptions. Under these conditions, the linear tradeoff model for site flatness was developed as follows:

1. Determine the distribution of site flatness typically found in a wafer population and the resulting site failure distribution (Site Loss model).
2. Develop a model for the fraction of wafers that are discarded because of sites failing (Wafer Loss model).
3. Estimate the revenue losses incurred by the supplier as a result of inspecting the wafers, i.e., the loss due to discarding or downgrading the failed wafers (Supplier Revenue Loss model).
4. Calculate the number of dice that fail to yield because of out-of-spec site flatness (Die Loss model).
5. Develop assumptions involved in converting die yield loss to wafer fab revenue losses (User Revenue Loss model).
6. Combine the Supplier Revenue Loss and the User Revenue Loss models into a Linear Tradeoff model.

Each of these models can be elaborated in terms of its sensitivity to various parameters (or assumptions about parameters). In fact, a full understanding of the revenue-loss tradeoffs between silicon wafer suppliers and users (models 3 and 5 listed above) cannot be attained unless the models encompass at least a rudimentary sensitivity analysis. This report is accompanied by a Microsoft® [4] Excel spreadsheet that contains the calculations used for all of the following figures. The caption for each figure gives the applicable worksheet (tab). Readers can therefore insert their own data in order to apply the model to their particular circumstances.

[1] Fixed quality area, nominal edge exclusion, and other wafer parameters related to flatness measurement are defined in SEMI® M1, Specifications for Polished Monocrystalline Silicon Wafers. SEMI is a registered trademark of Semiconductor Equipment and Materials International, San Jose, CA. Website: www.semi.org.
[2] If a 2-mm nominal edge exclusion and a measurement site size of 25 mm × 25 mm with zero offsets and inclusion of partial sites are used on a 300 mm wafer, some of the partial sites have as little as 72% of their area inside the FQA. Failure of such a site would imply a maximum nonconforming area of 0.72 × (25 mm)² = 450 mm². Because the total FQA is π × (148 mm)² = 68,813 mm², such a failed site would occupy only a little more than 0.65% of the FQA.
[3] This investigation was conducted by the Physical Models and Statistical Distributions team of the Starting Materials sub technical working group of the International Technology Roadmap for Semiconductors throughout 2001 and 2002.

SITE LOSS MODEL
First, consider the model for site loss, which is derived from the distribution of site flatness typically found in a wafer population. If the site flatness specification (usually abbreviated SFQR [1]) is plotted on the x-axis, the distribution can be expressed as the probability for site failure, or the fraction of sites that will fail in the wafer population.
The distribution for site flatness is usually described as being lognormal [5]. Microsoft® Excel has two functions that are useful in probability calculations involving lognormal distributions. The function used to generate a lognormal cumulative distribution function is

CDF = LOGNORMDIST(xi, μ, σ). (1)

Note that the parameters μ and σ are the mean and standard deviation, respectively, of ln(xi) and thus do not have the same meaning as the usual symbols for the mean and standard deviation of a normal (Gaussian) distribution. Excel does not have a corresponding built-in lognormal Probability Density Function (PDF), but the function for a normal distribution can be used:

PDF = (1/xi) NORMDIST(ln(xi), μ, σ, FALSE). (2)

(The argument FALSE means the distribution is a PDF and not a CDF; and again, the parameters μ and σ are the mean and standard deviation, respectively, of ln(xi).)

[4] Microsoft is a registered trademark of Microsoft Corporation, Redmond, WA. Excel is copyrighted by Microsoft Corporation.

An example of a lognormal PDF generated with this function is shown in Figure 1. The function LOGNORMDIST(xi, μ, σ) gives the integral of this distribution, as is also shown in the example marked CDF in the figure. Note, however, that the integral of the PDF gives the fraction of sites with SFQR equal to or less than a given value. For example, almost all of the sites on the wafers in the present example have an SFQR < 0.3 µm, and almost none have an SFQR < 0.01 µm. The fraction of sites that have SFQR greater than a given value is 1 − LOGNORMDIST(xi, μ, σ), as is also shown in the example in the figure. The function (1 − CDF) can be interpreted as the probability of a site failing, or alternatively, the fraction of sites, fs, that fail in a large sample (e.g., as is obtained by measuring every site on a large number of wafers):

fs = 1 − CDF = 1 − LOGNORMDIST(USL, μ, σ), (3)

where USL is the upper specification limit for acceptable SFQR.
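For readers working outside Excel, the three spreadsheet expressions in Eqs (1) to (3) can be sketched in Python using only the standard library. This is a minimal illustration, not the contents of the accompanying spreadsheet: the function names are hypothetical, and the parameter values μ = −2.5 and σ = 0.5 are the illustrative ones used for the example distribution in this report.

```python
import math


def lognormal_cdf(x, mu, sigma):
    """Fraction of sites with SFQR <= x; plays the role of LOGNORMDIST(x, mu, sigma)."""
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lognormal_pdf(x, mu, sigma):
    """Lognormal density; plays the role of (1/x) * NORMDIST(ln(x), mu, sigma, FALSE)."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def site_failure_fraction(usl, mu, sigma):
    """Eq (3): fs = 1 - CDF evaluated at the upper specification limit (USL)."""
    return 1.0 - lognormal_cdf(usl, mu, sigma)


if __name__ == "__main__":
    # Illustrative parameters: mean and standard deviation of ln(SFQR in um).
    MU, SIGMA = -2.5, 0.5
    for usl in (0.05, 0.10, 0.20, 0.30):
        fs = site_failure_fraction(usl, MU, SIGMA)
        print(f"USL = {usl:.2f} um  ->  fs = {fs:.4f}")
```

With these parameters a USL of 0.1 µm gives a site failure fraction of roughly 35%, in line with the example quoted for Figure 1. Substituting fitted values of μ and σ for a real process is left to the reader.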
[Figure 1 axes: Frequency (PDF) and Fraction of Sites (CDF, 1 − CDF) vs. SFQR (µm)]

Figure 1 [SFQR model.xls, sheet “Lognormal”]. SFQR process distributions modeled by standard Excel spreadsheet functions. The cumulative distribution function (CDF), which shows the fraction of sites with SFQR less than or equal to the value on the x-axis, is modeled by Eq (1). The probability density function (PDF) is modeled by Eq (2). The fraction of sites with SFQR greater than the value on the x-axis is given by 1 − CDF [Eq (3)]. The diamond on the PDF curve shows the value of the arithmetic mean of the ln(xi) values, and the crosses on the same curve represent the ln(xi) values that are integral multiples of the standard deviation of the ln(xi) distribution away from the mean. For this example, μ was taken as −2.5 and σ was taken as 0.5. Although these values were derived from experimental data, they should not be considered representative of production capability. For example, if the specification for maximum SFQR is 0.1 µm, approximately 35% of the sites in the illustrated distribution fail.

[5] For example, see Appendix 3 of SEMI M32, Guide to Statistical Specifications, published in September 1998.

All three of these functions are illustrated in the “Lognormal” worksheet of the Excel spreadsheet, “SFQR model.xls.”

Assumptions of the Model
The lognormal model for site flatness distribution is an empirical model based on a long period of industrial experience. There may be other types of multi-parameter, skewed distributions that would fit a given set of data better (e.g., Weibull distributions), but the assumption that site flatness is distributed lognormally is a convenient tradeoff between accuracy of fit and model simplicity.

Parameter Estimation
Given an experimentally observed SFQR distribution (frequency of occurrence vs.
SFQR), and assuming the distribution is lognormal, the parameters μ and σ can be estimated by calculating

μ = Σ ln(xi) fi / Σ fi, and
σ² = Σ fi [ln(xi) − μ]² / [(Σ fi) − 1],

where the xi are the SFQR values and the fi are the frequencies of occurrence of the SFQR value with the same index, i. An example of this procedure is given in the “Parameter Estimation” worksheet of the Excel spreadsheet “SFQR model.xls.” This worksheet also shows the experimental data from which the example parameters, μ and σ, were derived.

WAFER LOSS MODEL
If the probability of one site failing is fs, the probability of finding one or more failed sites in a sample of N sites can be calculated using the binomial probability distribution [6]. In Excel, that probability is given by [7]

Probability of ≥ 1 failed site = 1 − BINOMDIST(0, N, fs, TRUE) (4)

where the argument “TRUE” makes it a cumulative binomial distribution. If N, the size of the sample, is equal to the total number of sites on the wafer, Sw, the equation can be written as

fw = 1 − BINOMDIST(0, Sw, fs, TRUE) (5)

where fw is the probability of a wafer failing because one or more sites failed. This model can be coupled with the lognormal site loss model to give a distribution of wafer losses as a function of SFQR value. Figure 2 shows the cumulative fraction of both failed sites and failed wafers, using the following parameter values: μ = −2.5, σ = 0.5, Sw = 52 (25 mm × 25 mm full and partial sites on a 200 mm wafer with 2 mm nominal edge exclusion).

Several interesting insights emerge from this analysis. Firstly, it is seen that a very small site failure rate results in a very large wafer failure rate. In effect, the site failure rate is “magnified” because every site on a wafer has an independent opportunity to fail. Secondly, the only other parameter besides site failure probability that affects the wafer failure rate is the number of sites per wafer.
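The magnification of site loss into wafer loss can be checked with a short Python sketch of Eq (5), again using only the standard library. The function name and the optional `max_failed_sites` argument (the r of the generalized form fw = 1 − BINOMDIST(r, Sw, fs, TRUE)) are illustrative choices rather than anything taken from the spreadsheet. For r = 0 the expression reduces to fw = 1 − (1 − fs)^Sw.

```python
import math


def wafer_failure_fraction(fs, sites_per_wafer, max_failed_sites=0):
    """fw = 1 - BINOMDIST(r, Sw, fs, TRUE), where r = max_failed_sites.

    With r = 0 (the case used in this report) a wafer fails if one or
    more of its Sw sites fails.
    """
    cumulative = sum(
        math.comb(sites_per_wafer, k)
        * fs**k
        * (1.0 - fs) ** (sites_per_wafer - k)
        for k in range(max_failed_sites + 1)
    )
    return 1.0 - cumulative


if __name__ == "__main__":
    fs = 0.002  # illustrative site failure fraction of 0.2%
    # Site counts for 25 mm x 32 mm and 25 mm x 25 mm sites on 200 mm and 300 mm wafers.
    for sw in (36, 52, 88, 112):
        fw = wafer_failure_fraction(fs, sw)
        print(f"Sw = {sw:3d}  ->  fw = {fw:.3f}  (fw/fs = {fw / fs:.1f})")
```

Running the sketch for a 0.2% site failure fraction gives wafer failure fractions between roughly 7% (36 sites) and 20% (112 sites), which illustrates why a small site loss is a large concern for wafer suppliers.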
All other things being equal, the larger the number of sites, the greater the probability of failing a wafer. This leads to a rather complex dependence of wafer reject rate on site size. For a given SFQR, smaller sites have a lower probability of site rejection, which results in a lower probability of wafer rejection, but this tendency competes with the larger number of sites per wafer resulting in a higher probability of wafer rejection. Thirdly, the ratio of fw to fs is always greater than 1 (until the reject rates of both wafers and sites reach 100%, when the ratio is exactly 1.0).

The development of this model allows the calculation of fw as a function of fs. Figure 3 shows this relationship for several values of Sw, the number of sites per wafer. The following numbers of sites per wafer were chosen to correspond to 25 mm × 25 mm sites on 200 mm (52) and 300 mm (112) wafers, and 25 mm × 32 mm sites on 200 mm (36) and 300 mm (88) wafers. In all cases, the offsets are zero and partial sites are included. These are the most common site sizes chosen for the 130-nm technology node and beyond. From this graph, one can understand the concern of the wafer makers. A site rejection rate of only 0.2% causes a wafer rejection rate of 7% to 20%, depending on the site size and wafer diameter.

[6] D. W. McCormack, Jr., “A Simple Approach for Comparing Costs Under No Inspection and 100% Inspection for Site Level Data,” Proceedings of the 2002 Conference on Modeling and Analysis of Semiconductor Manufacturing, Arizona State University, Tempe, AZ, April 10-12, 2002.
[7] In more general terms, fw = 1 − BINOMDIST(r, Sw, fs, TRUE), where r is the maximum number of failed sites on an acceptable wafer. For example, set r = 0 to find the fraction of wafers that fail with one or more failed sites, r = 1 to find the fraction of wafers that fail with two or more failed sites, etc. For purposes of this development, it is assumed that 100% of the wafer is within specification (i.e., r = 0).

[Figure 2 axes: Fractional Loss (curves for SITES and WAFERS) vs. SFQR Limit (µm) for 100% Inspection & Sort]

Figure 2 [SFQR model.xls, Sheet “Loss Models”]. Site loss and wafer loss models shown on the same graph. The site loss model is lognormal [Eq (3)]; the wafer loss model is binomial [Eq (5)]. The parameters μ and σ for this example are the same as in Figure 1.

Assumptions of the model
In this model it is assumed that the probability of rejecting any given site is the same as rejecting any other site, or in other words, that there is no pattern or spatial dependence of the probability of site rejection. This is almost certainly untrue, because it is known that partial sites, located on the wafer edge, have a higher probability of failing than full sites that are located in the interior of the wafer. This violates the assumption and makes the model estimate of wafer rejection rate too high. However, it is still possible to continue the analysis to its conclusion, then (once the framework is established) return to this point and see how the conclusions would be altered when the assumptions of the binomial model are violated.

[Figure 3 axes: Fractional Wafer Loss, fw (0% to 100%) vs. Fractional Site Loss, fs (0% to 10%)]

Sites per Wafer, Sw    Wafer Diameter    Site Size
36                     200 mm            25 mm × 32 mm
52                     200 mm            25 mm × 25 mm
88                     300 mm            25 mm × 32 mm
112                    300 mm            25 mm × 25 mm

Figure 3 [SFQR model.xls, Sheet “fw vs fs”]. Fraction of rejected wafers (fw) as a function of the fraction of rejected sites (fs), for four different numbers of sites per wafer (Sw).

SUPPLIER REVENUE-LOSS MODEL
The models discussed above provide a way to predict the number of wafers rejected because of site flatness inspection.
Now consider how that information might be used to help build a supplier revenue-loss model, which is based on the costs incurred by inspecting the wafers for site flatness, i.e., the cost of discarding or downgrading the failed wafers. The costs of performing the inspection, including hidden costs such as misclassification, are also considered. Unfortunately, no cost model for silicon wafer manufacturing is publicly available. Wafer price information obviously is made available privately to each customer by each wafer supplier, and estimates of wafer pricing may be available from industry reports and newsletters. Supplier revenue losses based on wafer prices are used in this report to illustrate trends and relative costs, but it always should be kept in mind that the revenue losses stated in this paper are arbitrary and are used for illustration only. A user of the model can verify the conclusions reached in this paper by replacing the illustrative revenue-loss numbers in the “fw vs fs” worksheet of the Excel spreadsheet, “SFQR model.xls,” with privately obtained data to construct a user-specific model. With those caveats in mind, the following observations can be made:

• It can be assumed that the price of a wafer currently includes the cost of inspection, since current practice is to inspect 100% of the prime wafers sold to manufacturers of advanced semiconductors.
• Flatness inspection is done only after processing is irreversible, i.e., any attempt to rework the failed wafers degrades the flatness rather than improving it.
• Rejected polished CZ wafers can be downgraded and sold as test wafers, and in principle so could p/p wafers, although the practice does not appear to be common at present. In these cases, the revenue loss due to rejecting a wafer may be taken as the price difference between a prime and downgraded wafer.
• Some wafers that are rejected cannot be downgraded and sold into a different category, but can only be discarded. In particular, this is true of p/p+ epitaxial wafers. In this case, the supplier’s revenue loss associated with inspecting and rejecting a wafer can be taken as the full price of the wafer.

As a reminder, for now the assumption is that the metrology is completely accurate, so there is no risk of incorrectly rejecting a good site or of incorrectly accepting a bad site.

The revenue losses in Table 1 are used for purposes of illustrating the application of the model. Wafer prices change over time and vary depending on user and supplier, so once again each user of the model is urged to substitute values appropriate to his or her application and determine how it affects the results of the model analysis.

Table 1. Illustrative Wafer Revenue Loss Estimates.

Prime Wafer Type \ Wafer Diameter                                      200 mm    300 mm
p/p+ epitaxial wafer                                                   $110      $500
polished wafer (price difference between prime and downgrade)         $15       $90
p/p epitaxial wafer (price difference between prime and downgrade)    $60       $215

DIE LOSS MODEL
The Die Loss model describes the number of dice that fail to yield because of site flatness failures. Site flatness failures are caused when the metrology asserts that the site does not conform to a specified SFQR criterion. Die failures related to out-of-specification site flatness are caused when local flatness deviations of the wafer interact with the photolithographic system to cause a pattern distortion large enough to result in parametric or functional loss of the die. The metrology attempts to model the photolithographic system. Therefore, development of a valid die loss model requires an understanding of how the metrology interprets local flatness deviations, and how the photolithographic pattern exposed by the stepper is affected by these local flatness deviations. Both of these topics are rather complex subjects, beyond the scope of this paper.
The following two issues, however, can be considered: the number of potentially good dice inside partial sites, and the flatness deviations of the wafer surface inside failed sites (viz., the size of the feature that causes the site to fail). Each of these is considered in turn.

The number of potentially good dice inside partial sites is a function of the exact positioning of each partial site and the way in which the full site is subdivided into subsites (i.e., the number of dice per site), so meaningful model assumptions would be not only user-specific but also application-specific. In addition, further analysis requires the flatness metrology “site map” to be exactly congruent with the stepper “shot map”, or a 1:1 correlation cannot be made between subsite locations and die locations. Even assuming the site map and shot map match one another, each one has placement uncertainties on the wafer that may cause them to diverge from true congruity.

As for the feature size that causes a site to fail, when one site includes only one die, any feature that causes a site to fail causes the die to fail, so the feature size that causes the failure is, in a sense, irrelevant. When one site includes multiple dice, however, the feature size does matter. For example, which is the more plausible assumption: that a majority of sites fail because of features (wafer topography) that are greater than ¾ of the site size, or because of features that are less than ¼ of the site size? Until these questions can be answered, it is only speculation whether it is more reasonable to assume that one die or four dice will fail if a site is subdivided into four dice (subsites). Unfortunately, virtually nothing appears to be known about the relationship between feature size and site failure, except in very broad terms.

Because of all these uncertainties, it apparently is not possible to relate the number of dice that fail to yield to the number of site failures in a meaningful way.
However, the model is developed by first assuming that one die fails for each failed site, and then returning, after the model is developed, to see how the conclusions might be affected if a site failure causes more than one die failure.

USER REVENUE-LOSS MODEL
The conversion of die yield loss to wafer fab revenue loss is simply a matter of assigning a unit selling price per die and multiplying it by the die yield loss. Die selling price appears to be the best metric because it represents lost revenue to the wafer user, which is what the user was trying to avoid by imposing a specification on site flatness. However, it should be noted that this approach involves several implicit assumptions:

1) The amount of revenue lost by a wafer user as a result of a site failing for site flatness is assumed to be the selling price of the packaged die and not its wafer fab manufacturing cost. This keeps both the supplier and user terms of all equations in units of lost revenue.
2) It is assumed that site flatness failures cause loss in multiprobe yield instead of die reliability issues, so revenue loss due to reliability failures can be ignored. It is recognized that, in principle, any failure mechanism that can cause multiprobe loss could also generate die that marginally pass multiprobe and become reliability risks, but in the present model it is assumed that the number of such marginal escapes caused by site flatness is negligibly small.
3) Each site failure is assumed to result in a single die failure.
4) The spatial distribution of site flatness failures is assumed to be random (no clustering). This assumption is probably valid as long as the site failure rate (fs) is low, because empirical data at present indicate that most failures are one site per wafer until the specification causes a fairly large number of wafers to be rejected. (The actual value of “fairly large” can be quantitatively determined from supplier data.) When the number of rejected sites per wafer becomes significantly larger than 1.0, present empirical data suggest that defective sites do tend to cluster around the edge of the wafer.

LINEAR TRADEOFF MODEL
During the above development, several mathematical cost tradeoff models were considered, but a rather simple one has proven to be the most useful. It is derived as follows: Compare the revenue lost by a supplier when a sort is performed to the revenue lost by a user when the sort is not performed. The sort is assumed to be done on the results of a 100% inspection, using a sort limit equal to the SFQR specification. The revenue lost by the supplier (if the sort is done) is taken to be equal to the value of the wafers rejected by doing the 100% inspection and sort. The revenue lost by the user (if the sort were not done) is taken to be the value of all the dice that fail because of sites greater than the SFQR specification limit in the entire shipment. Assume that all the supplier’s costs related to site flatness inspection and screening are spread across the remaining wafers and raise their sales price to absorb the loss. The revenue lost by the supplier from doing the sort would be

$s = Pw fw N (6)

where Pw is the supplier’s revenue loss due to failure of a wafer to meet the flatness (SFQR) specification, fw is the fraction of wafers rejected by the sort, and N is the number of wafers in the shipment. The revenue lost by the user from not doing the sort would be

$u = Pd fd Dw N (7)

where Pd is the selling price of a die, fd is the fraction of dice that fail because of wafer flatness, Dw is the number of dice per wafer, and N is the number of wafers in the shipment. This needs to be expanded somewhat because only the fraction of rejected sites is known from the inspection, not the fraction of rejected dice. First, we have

Dw = Sw Ds (8)

where Sw is the number of sites per wafer and Ds is the (average) number of dice per site.
(Note that the last equation is not strictly true unless $D_s$ is allowed to be a non-integer, because partial sites do not all contain the same number of subsites, or dice, as full sites, and different partial sites can even differ from one another in their number of dice per site.) It can also be shown that

\[ f_d = \frac{f_s n_{ds}}{D_s} \tag{9} \]

where $f_s$ is the fraction of sites with SFQR greater than the specification limit (all the dice that fail are in these sites), and $n_{ds}$ is the number of dice that fail for each site that fails (initially assumed to be 1). [8]

[8] The validity of Equation (9) is not obvious, so the following example may be useful. Suppose there are 100 sites with 4 dice/site, for a total of 400 dice. Further suppose that 2 sites fail the SFQR limit, and that the topography that led to the site failures causes 2 dice to fail on one of the sites and 3 dice to fail on the other. Then $f_d$ = 5 die failures / 400 dice = 0.0125 by direct calculation. To calculate $f_d$ from Equation (9), we need $f_s$ = 2 site failures / 100 sites = 0.02 and $n_{ds}$ = 5 die failures / 2 site failures = 2.5 die failures per site failure, so that $f_d$ = (0.02)(2.5)/4 = 0.0125, which matches the result of the direct calculation.

Substituting Equations (8) and (9) into Equation (7) gives

\[ \$_u = \frac{P_d f_s n_{ds} S_w D_s N}{D_s} = P_d f_s n_{ds} S_w N \tag{10} \]

If a wafer fab uses the same wafer type (wafer specification) to manufacture multiple devices with different selling prices, the highest die price should be used in the model, because it is the one that causes the largest amount of lost revenue. Usually the highest-priced die will also be one of the fab's largest dice and, therefore, among the most sensitive to site flatness.

Before continuing, consider the relative magnitudes of the supplier and user revenue losses. Recall that $\$_s$ is the supplier's revenue loss when the sort is done and $\$_u$ is the user's revenue loss when the sort is not done. If $\$_s > \$_u$, doing the inspection and sort causes more lost revenue overall than omitting it. In other words, from a systems perspective it is more cost-effective to stop doing the sort. On the other hand, if $\$_s < \$_u$, doing the inspection and sort causes less lost revenue overall than omitting it, so it is more cost-effective to retain the sort.

Equation (6) states that $\$_s = P_w f_w N$, and Equation (10) has $\$_u = P_d f_s n_{ds} S_w N$. For any given set of wafers that are ready to be inspected, sorted, and shipped, the revenue loss per wafer ($P_w$) and the number of wafers ($N$) are fixed. Since the wafers have a given topographic feature height distribution, once the inspection criteria (site size and placement; SFQR sort limit) are decided, everything else ($f_w$, $f_s$, $n_{ds}$, and $S_w$) is also fixed except the die price, $P_d$. Since $\$_u$ is directly proportional to the die price, a lower die price favors omitting the sort, and a higher die price favors retaining the sort. This makes sense, because yield loss (or gain) is more important for more expensive dice. Therefore, a critical die price can be defined for which $\$_s = \$_u$. If the user's die prices are lower than this critical die price, it is more economical to drop the inspection and sort, and if the user's die prices are higher than this critical die price, the inspection and sort should be continued. Of course, the value of the critical die price depends on the fixed values of all the other parameters in Equations (6) and (10).

These relationships can be visualized by plotting the quantity ($\$_s - \$_u$) vs. $P_d$, as shown for 200 mm and 300 mm wafers in Figure 4. These curves are obtained by calculating

\[ \$_s - \$_u = P_w f_w N - P_d f_s n_{ds} S_w N \tag{11} \]

The details of the calculation are as follows. The number of wafers per shipment, $N$, is common to both terms of the equation and so can be omitted (or taken as equal to one). The number of sites per wafer, $S_w$, is taken as 52 for 200 mm wafers and 112 for 300 mm wafers (see Figure 3). The fractions of rejected sites, $f_s$, and rejected wafers, $f_w$, are computed from the lognormal model with $\mu = -2.5$ and $\sigma = 0.5$ and from the binomial distribution, respectively, as was done earlier. For the time being, $n_{ds}$ is assumed to be 1, i.e., when a site fails it causes one and only one die failure. The implications of changing this assumption will be discussed later.

[Figure 4: two panels plotting Supplier Loss − User Loss (y-axis) against Die Selling Price (x-axis). Panel (a): y-axis from −10 to 10 dollars, x-axis from 0 to 150 dollars. Panel (b): y-axis from −40 to 60 dollars, x-axis from 0 to 600 dollars. In both panels the region above zero is labeled DO NOT INSPECT and the region below zero is labeled INSPECT.]

Figure 4 [SFQR model.xls, Sheet “Linear Model”]. Supplier revenue loss minus user revenue loss ($\$_s - \$_u$) as a function of die selling price ($P_d$) for (a) 200 mm wafers and (b) 300 mm wafers. Illustrative wafer prices are taken from Table 1 [p/p+: , ; p/p: ,; polished: ,]. Note that it is more economical to drop the wafer inspection and sort step for positive values of $\$_s - \$_u$, but retention of the wafer inspection and sort step is more economical for negative values of $\$_s - \$_u$. The lines are parallel because, for a given wafer diameter, all the parameters are held constant except for wafer value (revenue loss) and die price. Changing the wafer value moves the line up and down; changing the die price generates a linear graph because die price is the independent variable plotted on the x-axis, and the function being plotted is of the form y = mx + b.

The lines cross the x-axis [(Supplier Revenue Loss) − (User Revenue Loss) = 0] at the critical die price, $P_{d,c}$, when, from Equation (11),

\[ P_w f_w N = P_{d,c} f_s n_{ds} S_w N \tag{12} \]

Solving for $P_{d,c}$ gives the equation for the critical die price:

\[ P_{d,c} = \frac{P_w f_w}{f_s n_{ds} S_w} \tag{13} \]

Effect of supplier process capability

It is very instructive to look now at the ratio of $f_w$ to $f_s$ and, in particular, at how this ratio varies with the wafer supplier's process capability.
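The calculation behind Equations (11) through (13) can be sketched in a few lines of Python. This is only an illustrative sketch, not the spreadsheet used for Figure 4: the function names, the upper specification limit (chosen here so that ln(USL) = −1) and the wafer revenue loss of 100 dollars are hypothetical values introduced for the example, and the wafer reject fraction assumes independent, binomially distributed site failures with no failed sites allowed on an accepted wafer.

```python
import math
from statistics import NormalDist


def site_reject_fraction(usl, mu, sigma):
    """f_s: fraction of sites with SFQR > USL under a lognormal model."""
    z = (math.log(usl) - mu) / sigma
    return 1.0 - NormalDist().cdf(z)


def wafer_reject_fraction(f_s, sites_per_wafer):
    """f_w: fraction of wafers with one or more failed sites (binomial model)."""
    return 1.0 - (1.0 - f_s) ** sites_per_wafer


def loss_difference(p_d, p_w, f_w, f_s, n_ds, s_w, n_wafers=1):
    """Equation (11): supplier revenue loss minus user revenue loss."""
    return p_w * f_w * n_wafers - p_d * f_s * n_ds * s_w * n_wafers


def critical_die_price(p_w, f_w, f_s, n_ds, s_w):
    """Equation (13): die price at which the two losses are equal."""
    return p_w * f_w / (f_s * n_ds * s_w)


# Lognormal parameters quoted in the text.
MU, SIGMA = -2.5, 0.5
# Hypothetical inputs, chosen only for illustration.
USL = math.exp(-1.0)   # assumed specification limit in micrometers
P_W = 100.0            # assumed wafer revenue loss in dollars
S_W = 52               # sites per 200 mm wafer, including partial sites
N_DS = 1               # one die failure per failed site

f_s = site_reject_fraction(USL, MU, SIGMA)
f_w = wafer_reject_fraction(f_s, S_W)
p_dc = critical_die_price(P_W, f_w, f_s, N_DS, S_W)

print(f"f_s = {f_s:.5f}, f_w = {f_w:.4f}, critical die price = {p_dc:.2f}")
```

With these assumed inputs the loss difference is positive for die prices below the critical price (omit the sort) and negative above it (retain the sort), which is the sign convention used in Figure 4.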
When dealing with normal distributions, process capability is typically characterized with a $C_{pk}$ value. The corresponding statistic for lognormal distributions is the effective $C_{pk}$ ($EC_{pk}$), calculated from

\[ EC_{pk} = \frac{\ln(\mathrm{USL}) - \mu}{3\sigma} \]

where $\mu$ and $\sigma$ are, again, the mean and standard deviation of the natural log of the site flatness. Figure 5 and Table 2 show $f_w/f_s$ normalized to $S_w$, the number of 25 mm × 25 mm sites per wafer, as a function of $EC_{pk}$ for wafers with 52 (200 mm) and 112 (300 mm) sites, including partial sites. Because $f_s$ depends only on $EC_{pk}$, and not on the particular values of USL, $\mu$, and $\sigma$ from which $EC_{pk}$ is computed, the results depend only on $EC_{pk}$.

[Figure 5: plot of $(f_w/f_s)/S_w$ (y-axis, 0.0 to 1.0) against $EC_{pk}$ (x-axis, 0.4 to 1.4), with one curve for 52 sites and one for 112 sites.]

Figure 5 [SFQR model.xls, Sheet “fw over fs vs ECpk”]. The ratio $f_w/f_s$, normalized to $S_w$ and plotted against $EC_{pk}$, assuming 52 sites per wafer (appropriate for 200 mm wafers, including partial sites) and 112 sites per wafer (appropriate for 300 mm wafers, including partial sites).

However, as can be seen from both the figure and the table, the results do depend on the number of sites per wafer, $S_w$, because the function $f_w$ depends on this parameter. Specifically, for large values of $EC_{pk}$, the limiting value of $f_w/f_s$ is equal to the number of sites per wafer. The mathematical proof of why this is true, involving as it does the integrals of both lognormal and binomial distributions, is left to the interested reader, but the following discussion gives an indication of why it must be so.

Table 2. The Ratio of ($f_w/f_s$) to $S_w$ for Different Values of $EC_{pk}$ and $S_w$. [SFQR_model.xls, Sheet “Ratio Calculations”]

| $EC_{pk}$ | $f_s$ | 200 mm ($S_w$ = 52): $f_w$ | 200 mm: $f_w/f_s$ | 200 mm: $(f_w/f_s)/S_w$ | 300 mm ($S_w$ = 112): $f_w$ | 300 mm: $f_w/f_s$ | 300 mm: $(f_w/f_s)/S_w$ |
|---|---|---|---|---|---|---|---|
| 0.65 | 2.559% | 74.02% | 28.93 | 55.63% | 94.52% | 36.94 | 32.98% |
| 0.85 | 0.539% | 24.48% | 45.46 | 87.42% | 45.39% | 84.26 | 75.24% |
| 1.00 | 0.135% | 6.78% | 50.25 | 96.63% | 14.04% | 104.01 | 92.87% |
| 1.07 | 0.066% | 3.39% | 51.13 | 98.33% | 7.17% | 107.97 | 96.40% |
| 1.13 | 0.035% | 1.80% | 51.54 | 99.11% | 3.84% | 109.85 | 98.08% |
| 1.20 | 0.016% | 0.82% | 51.79 | 99.60% | 1.77% | 111.02 | 99.12% |
| 1.33 | 33 ppm | 0.172% | 51.96 | 99.92% | 0.370% | 111.79 | 99.82% |
| 1.50 | 3.4 ppm | 0.018% | 51.995 | 99.99% | 0.038% | 111.98 | 99.98% |

If $N$ is the total number of wafers being inspected, the fraction of wafers rejected is given by

\[ f_w = \frac{\text{number of wafers rejected}}{N} \tag{14} \]

and the fraction of sites rejected is given by

\[ f_s = \frac{\text{number of sites rejected}}{\text{total number of sites}} = \frac{\text{number of sites rejected}}{S_w N} \tag{15} \]

Therefore,

\[ \frac{f_w}{f_s} = S_w \, \frac{\text{number of wafers rejected}}{\text{number of sites rejected}} \tag{16} \]

Now consider what happens when the site reject rate is very low, i.e., when $EC_{pk}$ is large. Each time that a site is rejected, a wafer is rejected, and in the absence of any kind of systematic defect pattern, it would be rare for a rejected wafer to have more than one failed site. This discussion will be returned to later, but for now, keep in mind that the models developed up to this point are based on assumptions that support this conclusion. In particular, the binomial model predicts only one rejected site per wafer until the overall site reject rate becomes large enough for the number of wafers with two rejected sites to be significant. Therefore, it can be taken as a limit that the ratio of the number of wafers rejected to the number of sites rejected is equal to one, and thus

\[ \lim_{f_s \to 0} \frac{f_w}{f_s} = S_w \tag{17} \]

Recalling from Equation (13) that the critical die price, $P_{d,c}$, is

\[ P_{d,c} = \frac{P_w f_w}{f_s n_{ds} S_w} \tag{13} \]

substituting Equation (17) in Equation (13) gives

\[ P_{d,c} = \frac{P_w}{n_{ds}} \tag{18} \]

Assuming, as was done when calculating the tradeoff model curves (Figure 4), that $n_{ds} = 1$, i.e., when a site fails it causes one and only one die failure, the critical die price is equal to the lost wafer revenue:

\[ P_{d,c} = P_w \tag{19} \]

The implications of this result are discussed first, followed by how the conclusions change if any of the assumptions are violated.

As was stated earlier, if the user's die prices are lower than this critical die price, it is more economical to drop the inspection and sort, and if the user's die prices are higher than this critical die price, the inspection and sort should be continued. With Equation (19) and the preceding discussion, the following conclusion can be stated:

It is less economical to perform a wafer sort based on 100% site flatness inspection than to omit it if the selling price per die is lower than the critical die price, $P_{d,c}$. Subject to certain assumptions, the critical die price is approximately equal to the wafer revenue loss. In mathematical notation, the inspection and sort should be omitted if

\[ P_d < P_{d,c} \approx P_w \]

The assumptions that lead to this conclusion can now be examined to see how any deviations from them might affect the conclusion.

(1) First, as has been stated, the maximum value of ($f_w/f_s$) is $S_w$, because a value larger than $S_w$ would imply that one site failure causes more than one wafer to fail, which is logically impossible. Anything that lowers the ratio of ($f_w/f_s$) to $S_w$ below 1 reduces the critical die price (see Equation (13)), so that under these circumstances it is economically beneficial to carry out the inspection and sort step for even less expensive devices.

(2) From Figure 5 it can be seen that $f_w/f_s \approx S_w$ when, and only when, $EC_{pk}$ is large.
(Table 2 shows the ratio of ($f_w/f_s$) to $S_w$ for selected values of $EC_{pk}$.) Taking this into account, the following conclusions can be drawn:

• Equation (13) shows that, if $n_{ds} = 1$, the critical die price is lowered from its maximum value ($P_w$) by the ratio of ($f_w/f_s$) to $S_w$. This reduction is minor as long as $EC_{pk}$ is above 1.0 (the ratio stays at or above about 93% in Table 2), but it becomes noticeable if $EC_{pk}$ is as small as 0.85 (about 87% for 200 mm wafers and 75% for 300 mm wafers).

• As would be expected, the wafer loss ($f_w$) also becomes significant, and probably unacceptable, for $EC_{pk}$ values below 1.0 (or perhaps even 1.1 or higher).

• From both Figure 5 and Table 2, it is clear that $EC_{pk}$ has to be higher for 300 mm wafers than for 200 mm wafers to obtain the same results. In fact, $f_w$ is larger, and the reduction in the critical die price is more significant, for a larger number of sites per wafer.

(3) If more than one die fails for each site that fails ($n_{ds} > 1$), the critical die price is also lowered. Note that $n_{ds}$ is not the number of dice per site, only the number of dice that fail when a site fails. For example, if a site contains 16 dice and each site failure results in an average of two die failures, the critical die price becomes equal to $P_w/2$. The reduction in the critical die price is potentially worse for smaller die sizes (more dice/site), and in the absence of evidence to the contrary, the user is most likely to assume that every die in a site fails when the site fails.

(4) If site failures are not spatially random (random meaning that every site has the same probability of failure, regardless of location), $f_w/f_s$ is smaller than otherwise, so the critical die price is lowered. To visualize this, imagine that a given number of site failures is “clustered” on a small number of wafers instead of being evenly (i.e., binomially) distributed. Then the number of wafers rejected is smaller than expected from Equation (16), so ($f_w/f_s$) is smaller.
Data received from a wafer supplier support the conclusion that site failures are not, in fact, randomly distributed. Table 3 shows some of these data and the impact they have on the ratio $(f_w/f_s)/S_w$. Although somewhat sparse, these data do show that the observed $f_w/f_s$ ratio is considerably lower than the model would predict. In particular, when $EC_{pk}$ approaches 1, $(f_w/f_s)/S_w$ should approach 100%, but it appears to be about 40% at best.

Table 3. Modeled and Actual Data for a 200 mm Wafer with 25 mm × 32 mm Sites ($S_w$ = 36)

| $EC_{pk}$ | $f_w$ (modeled) | $f_s$ (modeled) | $f_w/f_s$ (modeled) | $(f_w/f_s)/S_w$ (modeled) | $f_w/f_s$ (actual) | $(f_w/f_s)/S_w$ (actual) |
|---|---|---|---|---|---|---|
| 0.63 | 65.8% | 2.94% | 22.40 | 62.2% | 9.7 | 26.9% |
| 0.79 | 27.5% | 0.89% | 30.92 | 85.9% | 14.6 | 40.6% |
| 0.93 | 9.06% | 0.26% | 34.39 | 95.5% | 13.0 | 36.1% |

(5) On the other hand, the critical die price is increased if the number of dice/wafer is less than $S_w D_s$ because of partial sites. Going back to some of the equations used to derive the linear tradeoff model and rearranging the terms demonstrates this. Equating $\$_s$ from Equation (6) to $\$_u$ from Equation (7) and solving for the critical die price, $P_{d,c}$, yields

\[ P_{d,c} = \frac{P_w f_w}{f_d D_w} \tag{20} \]

Recalling from Equation (9) that $f_d = f_s n_{ds}/D_s$, after some rearranging, the critical die price is

\[ P_{d,c} = P_w \left(\frac{D_s}{D_w}\right) \left(\frac{f_w}{f_s}\right) \left(\frac{1}{n_{ds}}\right) \tag{21} \]

If it is assumed that $f_w/f_s = S_w$ and that $n_{ds} = 1$, the critical price becomes

\[ P_{d,c} = P_w \left(\frac{D_s S_w}{D_w}\right) \tag{22} \]

Table 4 shows the ratio ($D_s S_w/D_w$) for several scenarios. The dice per wafer ($D_w$) values shown in the table are the number of full dice that fit within the FQA. Sometimes dice fall within the FQA even though they lie within a site that is not considered a partial site. These cases were also counted and included in the calculated value of $D_w$.

Table 4. Ratio of Die Selling Price to Wafer Revenue Loss ($D_s S_w/D_w$) for Several Scenarios

| Wafer Diameter, mm | Site Size, mm | Die Array | $D_s$ | $D_w$ | $D_s S_w$ | $D_s S_w/D_w$ |
|---|---|---|---|---|---|---|
| 200 | 25 × 25 | 1 × 1 | 1 | 32 | 52 | 1.625 |
| 200 | 25 × 25 | 2 × 2 | 4 | 156 | 208 | 1.33 |
| 200 | 25 × 25 | 5 × 5 | 25 | 1108 | 1300 | 1.17 |
| 200 | 25 × 32 | 1 × 1 | 1 | 20 | 36 | 1.80 |
| 200 | 25 × 32 | 2 × 2 | 4 | 120 | 144 | 1.20 |
| 200 | 25 × 32 | 5 × 8 | 40 | 1396 | 1440 | 1.03 |
| 300 | 25 × 25 | 1 × 1 | 1 | 88 | 112 | 1.27 |
| 300 | 25 × 25 | 2 × 2 | 4 | 392 | 448 | 1.14 |
| 300 | 25 × 25 | 5 × 5 | 25 | 2592 | 2800 | 1.08 |
| 300 | 25 × 32 | 1 × 1 | 1 | 64 | 88 | 1.375 |
| 300 | 25 × 32 | 2 × 2 | 4 | 296 | 352 | 1.19 |
| 300 | 25 × 32 | 5 × 8 | 40 | 3264 | 3520 | 1.08 |

(6) In the preceding discussions, the possibility of having outliers was ignored. Outliers are abnormally high SFQR values due to “special causes” that are distinct from the system of causes that result in a predictable lognormal distribution. If outliers are observed in the flatness data, they are indicative of a lack of statistical control and can be taken as evidence that 100% inspection is required.

CONCLUSIONS

In summary, all of the deviations from the original model assumptions cause the critical die price to be lowered from the maximum value, $P_w$, except for point (5). Some examples of the degree by which the critical die price can be lowered are as follows:

• If the supplier's process capability is not adequate compared to the customer specification, it is economically impractical to omit the inspection even when the die price is relatively low. For example, if $EC_{pk}$ is only 0.85 on 300 mm wafers (25 mm × 25 mm sites), the critical die price is reduced to 0.75$P_w$.

• The amount by which non-random (clustered) site failures lower the critical die price is, in general, unknown, because knowing it would require a generalization from a wealth of quantitative data. However, the small amount of data collected suggests that the critical die price could be reduced to 0.4$P_w$ or less.

• Probably the factor that has the most impact is the number of die failures that a site failure causes. If the user's die array has four dice/site and the user assumes that a site failure causes all four dice to fail, the critical die price is reduced to 0.25$P_w$.

All of these reductions in the critical die price appear to feed back into the equations multiplicatively.
For example, if $EC_{pk}$ were 0.85, the site failures clustered as they did in the data shown above, and the user had four dice/site, all of which are assumed to fail if a site fails, the critical die price would be reduced to (0.75)(0.4)(0.25)$P_w$ = 0.075$P_w$.

Returning to the issue of metrology accuracy, one may ask whether these conclusions could be substantially altered by poor measurement accuracy. To fundamentally alter the overall conclusion, poor metrology would have to be the primary cause of a supplier's site flatness failures, and while this is certainly possible, it does appear to be unlikely.

As a final point in this section, one comment can be made about specifications that allow one or more nonconforming sites within the FQA before the wafer is rejected (i.e., PUA < 100%). The linear tradeoff model compares the relative cost of losing dice and losing wafers. Therefore, if the model indicates that the revenue lost by die failure overwhelms the revenue lost by wafer rejection when the strictest criteria are used to screen out wafers that cause die failure, it does not make sense to loosen the reject criteria to allow an even larger number of dice to fail.

In all this discussion, however, perfect metrology is assumed. This is not completely correct; anecdotal evidence suggests that metrology failures occur most frequently at the edges of the wafer, usually in partial sites. The impact of such metrology errors has not been fully factored into the present model. However, it should be noted that the general approach used by the practice for determining the cost component due to misclassification as a result of incorrect measurement, under development in the SEMI Standards Program, might be used to quantitatively determine the cost impact of metrology errors.
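The multiplicative combination of reductions described in this section can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the capability factor is computed from the binomial model using $f_s = 1 - \Phi(3\,EC_{pk})$ (which follows from the definition of $EC_{pk}$ for a lognormal distribution), and the clustering factor of 0.4 is simply the empirical estimate quoted above, applied as a plain multiplier.

```python
from statistics import NormalDist


def capability_factor(ecpk, sites_per_wafer):
    """(f_w/f_s)/S_w for independent (binomial) site failures."""
    f_s = 1.0 - NormalDist().cdf(3.0 * ecpk)        # site reject fraction
    f_w = 1.0 - (1.0 - f_s) ** sites_per_wafer      # wafers with >= 1 failed site
    return f_w / (f_s * sites_per_wafer)


def combined_factor(capability, clustering, n_ds):
    """Multiplier applied to P_w to obtain the critical die price."""
    return capability * clustering / n_ds


# 300 mm wafers, 25 mm x 25 mm sites, ECpk = 0.85.
cap = capability_factor(0.85, 112)
# Clustering estimate from the text (0.4) and four failing dice per failed site.
total = combined_factor(cap, clustering=0.4, n_ds=4)

print(f"capability factor = {cap:.2f}, combined factor = {total:.3f}")
```

Under these assumptions the capability factor comes out near 0.75 and the combined factor near 0.075, in line with the example given at the start of this passage.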
FUTURE DIRECTIONS

The linear tradeoff model appears to be quite useful because it is relatively easy to understand and can be manipulated to reach meaningful conclusions by examining the assumptions that were used to formulate it. According to the model, when the die price exceeds the wafer value (or revenue loss), it is cost-effective to conduct 100% inspection and sort. On the other hand, when the die price is less than the wafer value, it may or may not be cost-effective to inspect and sort. The critical die price decreases below the wafer value as $EC_{pk}$ decreases. It also decreases when more than one die fails per site or when site failures cluster.

The assumption is made that both the metrology is accurate and the user specification is correct, i.e., that (1) the die always fails when the site flatness exceeds the critical SFQR value and never fails if the site flatness is less than this value, and (2) the user has correctly identified the critical SFQR value that marks the onset of die loss and made its value the USL. It is not at all clear that the knowledge to support either of these assumptions exists within the semiconductor industry. However, for reasons outlined below, it is believed that the effort necessary to gain this knowledge would be better spent by adopting a new approach to site flatness metrology.

Although strictly speaking it is outside the scope set for the investigation leading to this report, the discussions and ideas that led to the above models raise significant questions concerning the value of the present metric (SFQR) for site flatness. A brief summary of some of the limitations inherent in the presently used metric follows:

• Placement of the site pattern on the wafer does not, and cannot, in every case match the user's stepper field placement. Therefore, the topography that affects printing is not necessarily being measured by the wafer supplier, so the measured topography frequently does not represent the topography seen by the stepper.

• It is possible to print yielding dice in a location outside the site pattern. This problem becomes worse for large sites coupled with smaller-diameter wafers. For example, for a 200 mm wafer with a 3 mm edge exclusion, a 25 mm × 32 mm site size with zero offsets does not account for 7.16% of the area inside the FQA. (Recall that in the Introduction a case was mentioned where a wafer was rejected because ~0.65% of the area inside the FQA was nonconforming.) In addition, the amount of unmeasured FQA varies in unpredictable ways as the site size and pattern offsets are changed.

• Information about the relationship between topographic feature height and feature size on a particular wafer is absent from standard site flatness measurements and site flatness maps.

In view of such concerns about the current implementation of site flatness metrology, it is suggested that a new metric be given serious consideration within the industry. One such possibility is the “flying-site” measurement that has been proposed, which addresses the above issues and is currently under evaluation. This measurement is made by determining SFQR for every possible position on a wafer of a 25 mm × 8 mm site (which mimics the slit size of a scanning stepper), to a resolution of 1 mm, without utilizing partial sites. Flying-site measurements thus produce an information density of 100 points/cm², and provide a map of wafer topography that is more likely to be relevant to stepper performance, independent of how the stepper exposure fields are arranged on the wafer. Whether a new metric such as the flying site will improve the industry-wide cost of ownership remains to be determined.

It has also been noted that front surface site flatness appears to be confounded with the topography of both the wafer back surface and the chuck, which makes measurements of front surface site flatness of doubtful value as a model for the performance of steppers that use low-contact chucking.
However, in view of the great number of variations in wafer-chuck interactions, it is unreasonable to expect the wafer manufacturer to account fully for these interactions. It is recommended that greater understanding of these interactions be developed. Nevertheless, when consistently applied, front surface site flatness metrology has provided evidence of improving flatness process capability without considering the wafer-chuck interactions.

Appendix. Table of Symbols Used in Report

| Symbol | Meaning | First Introduced |
|---|---|---|
| $\$_s$ | potential revenue lost by supplier from doing 100% inspection and sort for flatness | Eq (6) |
| $\$_u$ | potential revenue lost by wafer user from not requiring 100% inspection and sort for flatness | Eq (7) |
| $\mu$ | mean value of the distribution of the natural logs of the SFQR values [$\ln(x_i)$] | Eq (1) |
| $\sigma$ | standard deviation of the distribution of the natural logs of the SFQR values [$\ln(x_i)$] | Eq (1) |
| $D_s$ | average number of dice per site | Eq (8) |
| $D_w$ | number of dice per wafer | Eq (7) |
| $EC_{pk}$ | effective process capability coefficient | Unnumbered Eq on p 11 |
| FALSE | logical argument indicating that the distribution produced by the spreadsheet function is a probability density function (PDF) | Eq (2) |
| $f_d$ | fraction of dice that fail due to out-of-specification wafer flatness | Eq (7) |
| $f_i$ | frequency of occurrence of the SFQR value with index $i$ | Unnumbered Eqs on p 4 |
| FQA | fixed quality area (see SEMI M1) | p 3 |
| $f_s$ | fraction of sites in a large sample that have SFQR > USL | Eq (3) |
| $f_w$ | fraction of wafers that have one or more out-of-specification sites | Eq (5) |
| $N$ | number of sites in sample | Eq (4) |
| $n_{ds}$ | average number of dice that fail in each failed site | Eq (9) |
| $P_d$ | die selling price | Eq (7) |
| $P_{d,c}$ | critical die price | Eq (12) |
| PUA | percent usable area | p 16 |
| $P_w$ | supplier's revenue loss due to failure of a wafer to meet flatness specification (SFQR > USL on one or more sites) | Eq (6) |
| $r$ | general argument for binomial distribution indicating maximum number of failed sites in an acceptable wafer | Footnote 6 |
| SFQR | a measure of site flatness (see SEMI M1) | p 2 |
| $S_w$ | total number of sites on a wafer | Eq (5) |
| TRUE | logical argument indicating that the distribution produced by the spreadsheet function is a cumulative distribution function (CDF) | Eq (4) |
| USL | upper specification limit for SFQR in micrometers | Eq (3) |
| $x_i$ | SFQR value with index $i$ | Eq (1) |
