Cost Tradeoff Model for Determining the Economic Viability ITRS

A Linear Tradeoff Model for Determining the Economic Viability
       of 100% Wafer Flatness Inspection and Sorting
                 David Myers, Texas Instruments, Dallas, Texas; Larry Beckwith, National
                      Semiconductor, Santa Clara, California; Murray Bullis, Materials &
                      Metrology, Sunnyvale, California; Laszlo Fabry, Wacker Siltronic,
                   Burghausen, Germany; Howard Huff, International Sematech, Austin,
                     Texas; Bill Hughes, MEMC Electronic Materials, St. Peters, Missouri;
                 Mototaka Kamoshida, NEC TOKIN, Sendai, Japan; Paul Langer, Komatsu
                 Silicon America, Allentown, Pennsylvania; Don McCormack, International
                   Sematech, Austin, Texas; Noel Poduje, ADE Corporation, Westwood,

    A linear tradeoff model is developed by making various assumptions about the capability of
    flatness measuring equipment, the ability of the flatness specification to predict die failure due to
    flatness deviations, the characteristic distribution of site flatness values, and the revenue losses
    associated with wafer testing and die yield loss due to flatness deviation. It is found that if the
    die price exceeds a critical value, it is more cost-effective to 100% inspect and sort, while if the
    die price is less than this value, it is more cost-effective not to inspect and sort. If the process
    capability is high and each rejected site causes only one die failure and if the defective sites are
    randomly distributed, the critical value is equal to the value of the wafer. Poor process capability,
    multiple die failure for each defective site, and non-random site failures all serve to decrease the
    critical value below the value of the wafer. The model calculations imply that specifications
    allowing less than 100% usable area are not appropriate when 100% inspection and sorting is
    required. When there is only one die per site, the inspection of partial sites would appear to be
    not appropriate. However, their inclusion allows more of the wafer surface area to be inspected,
    albeit at the possible expense of introducing additional metrology errors. Both incorrect
    specification limits and metrology inaccuracies would affect the results obtained by the model,
    but neither of these factors has yet been fully evaluated.


Wafer flatness is a critical parameter for silicon wafers because of the effect it could have on the ability of
photolithographic systems in a wafer fab to accurately print circuit features during the manufacture of
integrated circuits. The photolithographic systems are known as steppers because they expose a fixed
area of the wafer to a circuit feature image, and then “step” to an adjacent area and repeat the process
until the entire wafer is exposed. The fixed area is known as an exposure field, or stepper field, and its
size in current (2002) steppers is typically 25 mm × 25 mm, although the exact dimensions depend on
the stepper manufacturer. The image contained within the exposure field consists of circuit features that
ultimately produce one or more integrated circuits, or die, when wafer processing is completed.
Therefore, each stepper field contains one or more die. Modern steppers tilt the wafer and focus the
image to remove as much dependency on the wafer topography as possible when exposing each field.
However, topographic extremes that are within the exposure field (i.e., peaks and valleys on the wafer), if
large enough, distort the circuit feature image and might cause die yield loss.

To reduce this risk of loss, wafer fab engineers specify a maximum peak-to-valley height (range) on the
silicon wafers they purchase. It is not practical for silicon wafer manufacturers to own steppers for wafer
characterization. Instead they use measurement instruments designed for the specific purpose of wafer
inspection to measure the range. The measurement technology of these tools does not operate in the
same way as stepper exposures, so the wafer metrology only approximates stepper performance.
However, attempts are made to collect and report wafer data in a form that is as relevant as possible to
stepper performance. For example, the magnitude of the range is strongly dependent on the size of the
area measured, so wafer data are reported on the basis of 25 mm × 25 mm sites (or other dimensions
that mimic the field sizes of current steppers), and the corresponding metric is called site flatness.
Photolithography engineers may shift the layout of stepper fields left-right or up-down on a wafer to
maximize the number of die that fit on the wafer; site flatness measurements sometimes are also
specified with an x (left-right) or y (up-down) layout shift, or offset, relative to the center of the wafer.
Finally, since multiple die are often contained within a single exposure field, integrated circuit
manufacturers often expose a field near the edge of the wafer so that some of the die are inside the
wafer, where they can yield, and some are not. Data that are relevant to this practice are collected on
wafers by including partial sites (defined as a site with some of its area outside the Fixed Quality Area
(FQA), the area inside the nominal edge exclusion [1], but having at least its centerpoint inside this
area).

Silicon wafer suppliers typically inspect 100% of their wafers for site flatness and use the results of the
inspection to screen the product wafers. Any wafers that do not conform to the customer’s specified
maximum range requirement are discarded before the remaining wafers are shipped. Typically the
discarded wafers have only one nonconforming site, which could represent as little as ~0.65% of the
fixed quality area [2]. With so much usable silicon apparently being wasted, it is reasonable to ask whether
the 100% inspection strategy is really the most rational course of action. Consequently, an investigation
of various models related to wafer flatness inspection was carried out [3] to determine whether it is more
economical from an industry-wide cost of ownership perspective to perform a sample inspection or to
continue 100% inspection and sort. The results of that investigation are reported here.

During the development of the linear tradeoff model described below, many of the authors frequently
expressed skepticism about two basic assumptions: (1) current metrology gives accurate results and (2)
flatness specifications are strictly correct. These assumptions imply that all sites that fail inspection result
in die yield loss, and none of the sites that pass inspection result in die yield loss for flatness-related loss
mechanisms. Nevertheless, it was decided to develop the model as completely as possible using these
assumptions and then return to the issue after the model was completed to determine how (or if) the
conclusions would be altered by changing the assumptions. Under these conditions, the linear tradeoff
model for site flatness was developed as follows:

1. Determine the distribution of site flatness typically found in a wafer population and the resulting site
   failure distribution (Site Loss model).

2. Develop a model for the fraction of wafers that are discarded because of sites failing (Wafer Loss
   model).

3. Estimate the revenue losses incurred by the supplier as a result of inspecting the wafers, i.e., the loss
   due to discarding or downgrading the failed wafers (Supplier Revenue Loss model).

4. Calculate the number of dice that fail to yield because of out-of-specification site flatness (Die Loss model).

[1] Fixed quality area, nominal edge exclusion, and other wafer parameters related to flatness measurement are defined in SEMI® M1,
Specifications for Polished Monocrystalline Silicon Wafers. SEMI is a registered trademark of Semiconductor Equipment and
Materials International, San Jose, CA. Website:
[2] If a 2-mm nominal edge exclusion and a measurement site size of 25 mm × 25 mm with zero offsets and inclusion of partial sites
are used on a 300 mm wafer, some of the partial sites have as little as 72% of their area inside the FQA. Failure of such a site
would imply a maximum nonconforming area of 0.72 × (25)² = 450 mm². Because the total FQA is π(148)² = 68,813 mm², such a
failed site would occupy only a little more than 0.65% of the FQA.
[3] This investigation was conducted by the Physical Models and Statistical Distributions team of the Starting Materials sub technical
working group of the International Technology Roadmap for Semiconductors throughout 2001 and 2002.

5. Develop assumptions involved in converting die yield loss to wafer fab revenue losses (User Revenue
   Loss model).

6. Combine the Supplier Revenue Loss and the User Revenue Loss models into a Linear Tradeoff model.

Each of these models can be elaborated in terms of their sensitivity to various parameters (or
assumptions about parameters). In fact, a full understanding of the revenue-loss tradeoffs between
silicon wafer suppliers and users (models #3 and #5 listed above) cannot be attained unless the models
encompass at least a rudimentary sensitivity analysis.

This report is accompanied by a Microsoft® Excel [4] spreadsheet that contains the calculations used for all
of the following figures. The caption for each figure gives the applicable worksheet (tab). The person
who uses this report therefore can insert his or her own data in order to apply the model to his or her
particular circumstances.


First, consider the model for site loss, which is derived from the distribution of site flatness typically found
in a wafer population. If the site flatness specification (usually abbreviated SFQR [1]) is plotted on the
x-axis, the distribution can be expressed as the probability for site failure, or the fraction of sites that will
fail in the wafer population.

The distribution for site flatness is usually described as being lognormal [5]. Microsoft® Excel has two
functions that are useful in probability calculations involving lognormal distributions. The function used to
generate a lognormal cumulative distribution function is
                                                CDF = LOGNORMDIST(xi, μ, σ).                                                    (1)
Note that the parameters μ and σ are the mean and standard deviation, respectively, of ln(xi) and thus
do not have the same meaning as the usual symbols for the mean and standard deviation of a normal
(Gaussian) distribution.
Excel does not have a corresponding built-in lognormal Probability Density Function (PDF), but the
function for a normal distribution can be used:
                                        PDF = (1/xi) NORMDIST(ln(xi), μ, σ, FALSE).                                             (2)
(The argument FALSE means the distribution is a PDF and not a CDF; and again, the parameters μ and σ
are the mean and standard deviation, respectively, of ln(xi).) An example of a lognormal PDF generated
with this function is shown in Figure 1. The function LOGNORMDIST(xi, μ, σ) gives the integral of this
distribution, as is also shown in the example marked CDF in the figure. Please note, however, that the
integral of the PDF gives the fraction of sites equal to or less than a given value of SFQR. For example,
almost all of the sites on the wafers in the present example have an SFQR <0.3 µm, and almost none
have an SFQR <0.01 µm. The fraction of sites that have SFQR greater than a given SFQR is
1 − LOGNORMDIST(xi, μ, σ), as is also shown in the example in the figure.

The function (1 − CDF) can be interpreted as the probability of a site failing, or alternatively, the fraction
of sites, fs, that fail in a large sample (e.g., as is obtained by measuring every site on a large number of
wafers):
                                       fs = 1 − CDF = 1 − LOGNORMDIST(USL, μ, σ),                                               (3)
where USL is the upper specification limit for acceptable SFQR.

[4] Microsoft is a registered trademark of Microsoft Corporation, Redmond, WA. Excel is copyrighted by Microsoft Corporation.
[5] For example, see Appendix 3 of SEMI M32, Guide to Statistical Specifications, published in September 1998.

[Figure 1 plot: x-axis SFQR (µm), 0.00 to 0.40; right-hand axis Fraction of Sites, 0.0 to 1.0; curves labeled
PDF, <SFQR (CDF), and >SFQR.]

Figure 1 [SFQR model.xls, sheet “Lognormal”]. SFQR process distributions modeled by standard
Excel spreadsheet functions. The cumulative distribution function (CDF), which shows the fraction
of sites with SFQR less than or equal to the value on the x-axis, is modeled by Eq (1). The
probability density function (PDF) is modeled by Eq (2). The fraction of sites with SFQR greater
than the value on the x-axis is given by 1 − CDF [Eq (3)]. The diamond on the PDF curve shows the
value of the arithmetic mean of the ln(xi) values, and the crosses on the same curve represent the
ln(xi) values that are integral multiples of the standard deviation of the ln(xi) distribution away
from the mean. For this example, μ was taken as −2.5 and σ was taken as 0.5. Although these
values were derived from experimental data, they should not be considered representative of
production capability.

For example, if the specification for maximum SFQR is 0.1 µm, approximately 35% of the sites in the
illustrated distribution fail. All three of these functions are illustrated in the “Lognormal” worksheet of the
Excel spreadsheet, “SFQR model.xls.”
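
As an illustration, Eqs (1) to (3) can be reproduced outside the spreadsheet. The following Python sketch is not part of the accompanying workbook; the function names are hypothetical, and it assumes that μ = −2.5 and σ = 0.5 are the mean and standard deviation of ln(SFQR) with SFQR in µm, which are the values that reproduce the roughly 35% failure fraction quoted for a 0.1 µm limit.

```python
import math


def lognormal_cdf(x, mu, sigma):
    """Fraction of sites with SFQR <= x, the quantity LOGNORMDIST(x, mu, sigma) returns."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lognormal_pdf(x, mu, sigma):
    """Lognormal density as in Eq (2): (1/x) * normal density evaluated at ln(x)."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def site_failure_fraction(usl, mu, sigma):
    """Eq (3): fs = 1 - CDF(USL)."""
    return 1.0 - lognormal_cdf(usl, mu, sigma)


# Assumed example values: mu = -2.5, sigma = 0.5, USL = 0.1 um.
fs = site_failure_fraction(0.1, -2.5, 0.5)
print(f"fraction of failing sites at USL = 0.1 um: {fs:.2f}")
```

The CDF is written with the error function so that only the standard library is needed; a statistics package could be substituted without changing the result.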

Assumptions of the Model

The lognormal model for site flatness distribution is an empirical model based on a long period of
industrial experience. There may be other types of multi-parameter, skewed distributions that would fit a
given set of data better (e.g., Weibull distributions), but the assumption that site flatness is distributed
lognormally is a convenient tradeoff between accuracy of fit and model simplicity.

Parameter Estimation

Given an experimentally observed SFQR distribution (frequency of occurrence vs. SFQR), and assuming
the distribution is lognormal, the parameters μ and σ can be estimated by calculating

                                             μ = Σ ln(xi) fi / Σ fi ,

                                    σ² = Σ fi [ln(xi) − μ]² / [(Σ fi) − 1],

where the xi are the SFQR values and the fi are the frequencies of occurrence of the SFQR value with the
same index, i. An example of this procedure is given in the “Parameter Estimation” worksheet of the
Excel spreadsheet “SFQR model.xls.” This worksheet also shows the experimental data from which the
example parameters, μ and σ, were derived.
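
The estimation step can also be sketched in Python. This is a minimal illustration of the two sums above, not a transcription of the “Parameter Estimation” worksheet; the function name and the binned data are hypothetical and are not taken from the paper.

```python
import math


def estimate_lognormal_params(sfqr_values, frequencies):
    """Frequency-weighted mean and sample standard deviation of ln(x_i)."""
    n = sum(frequencies)
    if n <= 1:
        raise ValueError("need more than one observation")
    logs = [math.log(x) for x in sfqr_values]
    mu = sum(lx * f for lx, f in zip(logs, frequencies)) / n
    # Divide by (sum of f_i) - 1, as in the expression for sigma squared.
    variance = sum(f * (lx - mu) ** 2 for lx, f in zip(logs, frequencies)) / (n - 1)
    return mu, math.sqrt(variance)


# Hypothetical histogram: SFQR bin centers (um) and number of sites per bin.
bin_centers = [0.04, 0.06, 0.08, 0.10, 0.14, 0.20]
site_counts = [5, 20, 30, 25, 15, 5]
mu, sigma = estimate_lognormal_params(bin_centers, site_counts)
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
```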


If the probability of one site failing is fs, the probability of finding one or more failed sites in a sample of
N sites can be calculated using the binomial probability distribution [6]. In Excel, that probability is given
by
                              Probability of ≥1 failed site = 1 − BINOMDIST(0, N, fs, TRUE)                                        (4)
where the argument “TRUE” makes it a cumulative binomial distribution.

If N, the size of the sample, is equal to the total number of sites on the wafer, Sw, the equation can be
written as
                                             fw = 1 − BINOMDIST(0, Sw, fs, TRUE)                                                    (5)
where fw is the probability of a wafer failing because one or more sites failed.

This model can be coupled with the lognormal site loss model to give a distribution of wafer losses as a
function of SFQR value. Figure 2 shows the cumulative fraction of both failed sites and failed wafers,
using the following parameter values: μ = −2.5, σ = 0.5, Sw = 52 (25 mm × 25 mm full and partial sites
on a 200 mm wafer with 2 mm nominal edge exclusion).

Several interesting insights emerge from this analysis. Firstly, it is seen that a very small site failure rate
results in a very large wafer failure rate. In effect, the site failure rate is “magnified” because every site
on a wafer has an independent opportunity to fail. Secondly, the only other parameter besides site
failure probability that affects the wafer failure rate is the number of sites per wafer. All other things
being equal, the larger the number of sites, the greater the probability of failing a wafer. This leads to a
rather complex dependence of wafer reject rate on site size. For a given SFQR, smaller sites have a
lower probability of site rejection, which results in a lower probability of wafer rejection, but this tendency
competes with the larger number of sites per wafer resulting in a higher probability of wafer rejection.
Thirdly, the ratio of fw to fs is always greater than 1 (until the reject rates of both wafers and sites reach
100%, when the ratio is exactly 1.0).

The development of this model allows the calculation of fw as a function of fs. Figure 3 shows this
relationship for several values of Sw, the number of sites per wafer. The following numbers of sites per
wafer were chosen to correspond to 25 mm × 25 mm sites on 200 mm (52) and 300 mm (112) wafers,
and 25 mm × 32 mm sites on 200 mm (36) and 300 mm (88) wafers. In all cases, the offsets are zero
and partial sites are included. These are the most common site sizes chosen for the 130-nm technology
node and beyond. From this graph, one can understand the concern of the wafer makers. A site
rejection rate of only 0.2% causes a wafer rejection rate from 7% to 20% of the wafers, depending on
the site size and wafer diameter.
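
The magnification of site loss into wafer loss can be checked numerically. The sketch below uses the fact that BINOMDIST(0, Sw, fs, TRUE) is the probability of zero failures in Sw independent trials, (1 − fs)^Sw; the function name is hypothetical, and the independence of sites is the same assumption the binomial model makes.

```python
def wafer_failure_fraction(fs, sites_per_wafer):
    """Eq (5): probability that at least one of Sw independent sites fails."""
    return 1.0 - (1.0 - fs) ** sites_per_wafer


# Site rejection rate of 0.2% for the four site counts used in Figure 3.
for sw in (36, 52, 88, 112):
    fw = wafer_failure_fraction(0.002, sw)
    print(f"Sw = {sw:3d}: fw = {fw:.3f}, fw/fs = {fw / 0.002:.1f}")
```

For small fs the ratio fw/fs approaches Sw, which is why the wafer rejection rate spans roughly 7% to 20% across these layouts.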

[6] D. W. McCormack, Jr., “A Simple Approach for Comparing Costs Under No Inspection and 100% Inspection for Site Level Data,”
Proceedings of the 2002 Conference on Modeling and Analysis of Semiconductor Manufacturing, Arizona State University, Tempe,
AZ, April 10-12, 2002.
[7] In more general terms, fw = 1 − BINOMDIST(r, Sw, fs, TRUE), where r is the maximum number of failed sites on an acceptable wafer.
For example, set r = 0 to find the fraction of wafers that fail with one or more failed sites, r = 1 to find the fraction of wafers that
fail with two or more failed sites, etc. For purposes of this development, it is assumed that 100% of the wafer is within
specification (i.e., r = 0).
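
The more general form, in which up to r failed sites are tolerated on an acceptable wafer, can be sketched in Python as follows. The helper names are hypothetical; the cumulative binomial sum stands in for BINOMDIST(r, Sw, fs, TRUE) and again assumes independent sites.

```python
from math import comb


def binomial_cdf(r, n, p):
    """Probability of at most r failures in n independent trials with failure probability p."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(r + 1))


def wafer_failure_fraction_r(fs, sites_per_wafer, r=0):
    """Fraction of wafers with more than r failed sites."""
    return 1.0 - binomial_cdf(r, sites_per_wafer, fs)


# Effect of tolerating one failed site per wafer (Sw = 52, fs = 0.2%).
strict = wafer_failure_fraction_r(0.002, 52, r=0)
relaxed = wafer_failure_fraction_r(0.002, 52, r=1)
print(f"r = 0: {strict:.4f}   r = 1: {relaxed:.4f}")
```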

[Figure 2 plot: y-axis Fractional Loss; x-axis SFQR Limit (µm) for 100% Inspection & Sort, 0.00 to 0.35;
one curve for site loss (labeled SITES) and one for wafer loss.]

Figure 2 [SFQR model.xls, Sheet “Loss Models”]. Site loss and wafer loss models shown on the
same graph. The site loss model is lognormal [Eq (3)]; the wafer loss model is binomial [Eq (5)].
The parameters μ and σ for this example are the same as in Figure 1.

Assumptions of the model
In this model it is assumed that the probability of rejecting any given site is the same as rejecting any
other site, or in other words, that there is no pattern or spatial dependence of the probability of site
rejection. This is almost certainly untrue, because it is known that partial sites, located on the wafer
edge, have a higher probability of failing than full sites that are located in the interior of the wafer. This
violates the assumption and makes the model estimate of wafer rejection rate too high. However, it is
still possible to continue the analysis to its conclusion, then (once the framework is established) return to
this point and see how the conclusions would be altered when the assumptions of the binomial model are

[Figure 3 plot: y-axis Fractional Wafer Loss, fw; x-axis Fractional Site Loss, fs, 0% to 10%; one curve for
each row of the legend below.]

     Sites per Wafer, Sw     Wafer Diameter     Site Size
             36                 200 mm          25 mm × 32 mm
             52                 200 mm          25 mm × 25 mm
             88                 300 mm          25 mm × 32 mm
            112                 300 mm          25 mm × 25 mm

Figure 3 [SFQR model.xls, Sheet “fw vs fs”]. Fraction of rejected wafers (fw) as a function of the
fraction of rejected sites (fs), for four different numbers of sites per wafer (Sw).

The models discussed above provide a way to predict the number of wafers rejected because of site
flatness inspection. Now consider how that information might be used to help build a supplier revenue-
loss model, which is based on the costs incurred by inspecting the wafers for site flatness, i.e., the cost of
discarding or downgrading the failed wafers. The costs of performing the inspection, including hidden
costs such as misclassification, are also considered.

Unfortunately, no cost model for silicon wafer manufacturing is publicly available. Wafer price
information obviously is made available privately to each customer by each wafer supplier, and estimates
of wafer pricing may be available from industry reports and newsletters. Supplier revenue losses based
on wafer prices are used in this report to illustrate trends and relative costs, but it always should be kept
in mind that the revenue losses stated in this paper are arbitrary and are used for illustration only. A
user of the model can verify the conclusions reached in this paper by replacing the illustrative revenue-
loss numbers in the “fw vs fs” worksheet of the Excel spreadsheet, “SFQR model.xls,” with privately
obtained data to construct a user-specific model.

With those caveats in mind, the following observations can be made:
     • It can be assumed that the price of a wafer currently includes the cost of inspection, since
       current practice is to inspect 100% of the prime wafers sold to manufacturers of advanced
     • Flatness inspection is done only after processing is irreversible, i.e., any attempt to rework the
       failed wafers degrades the flatness rather than improving it.
     • Rejected polished CZ wafers can be downgraded and sold as test wafers, and in principle so
       could p/p− wafers, although the practice does not appear to be common at present. In these
       cases, the revenue loss due to rejecting a wafer may be taken as the price difference between a
       prime and downgraded wafer.
     • Some wafers that are rejected cannot be downgraded and sold into a different category, but can
       only be discarded. In particular, this is true of p/p+ epitaxial wafers. In this case, the supplier’s
       revenue loss associated with inspecting and rejecting a wafer can be taken as the full price of the
       wafer.
     • As a reminder, for now the assumption is that the metrology is completely accurate, so there is
       no risk of incorrectly rejecting a good site or of incorrectly accepting a bad site.
The revenue losses in Table 1 are used for purposes of illustrating the application of the model. Wafer
prices change over time and vary depending on user and supplier, so once again each user of the model
is urged to substitute values appropriate to his or her application and determine how it affects the results
of the model analysis.


Table 1. Illustrative Wafer Revenue Loss Estimates.

     Prime Wafer Type \ Wafer Diameter                                       200 mm   300 mm
     p/p+ epitaxial wafer                                                     $110     $500
     polished wafer (price difference between prime and downgrade)            $15      $90
     p/p− epitaxial wafer (price difference between prime and downgrade)      $60      $215

The Die Loss model describes the number of dice that fail to yield because of site flatness failures. Site
flatness failures are caused when the metrology asserts that the site does not conform to a specified SFQR
criterion. Die failures related to out-of-specification site flatness are caused when local
flatness deviations of the wafer interact with the photolithographic system to cause a pattern distortion
large enough to result in parametric or functional loss of the die. The metrology attempts to model the
photolithographic system. Therefore, development of a valid die loss model requires an understanding of
how the metrology interprets local flatness deviations, and how the photolithographic pattern exposed by
the stepper is affected by these local flatness deviations. Both of these topics are rather complex
subjects, beyond the scope of this paper. The following two issues, however, can be considered: the
number of potentially good dice inside partial sites, and the flatness deviations of the wafer surface inside
failed sites (viz., the size of the feature that causes the site to fail). Each of these is considered in turn.

The number of potentially good dice inside partial sites is a function of the exact positioning of each
partial site and the way in which the full site is subdivided into subsites ( i.e., the number of dice per site),
so meaningful model assumptions would not only be user-specific, but application-specific. In addition,
further analysis requires the flatness metrology “site map” to be exactly congruent with the stepper “shot
map”, or a 1:1 correlation cannot be made between subsite locations and die locations. Even assuming
the site map and shot map match one another, each one has placement uncertainties on the wafer that
may cause them to diverge from true congruity.

As for the feature size that causes a site to fail, when one site includes only one die, any feature that
causes a site to fail causes the die to fail, so the feature size that causes the failure is, in a sense,
irrelevant. When one site includes multiple dice, however, the feature size does matter. For example,
which is the more plausible assumption: that a majority of sites fail because of features (wafer
topography) that are greater than ¾ of the site size, or because of features that are less than ¼ of the
site size? Until these questions can be answered, it is only speculation as to whether it is more
reasonable to assume that one die or four dice will fail, if a site is subdivided into four dice (subsites).
Unfortunately, virtually nothing appears to be known about the relationship between feature size and site
failure, except in very broad terms.

Because of all these uncertainties, it apparently is not possible to relate the number of dice that fail to
yield to the number of site failures in any meaningful way. However, the model is developed by first
assuming that one die fails for each failed site, and then, after the model is developed, returning to see
how the conclusions might be affected if a site failure causes more than one die failure.


The conversion of die yield loss to wafer fab revenue loss is simply a matter of assigning a unit-selling
price per die and multiplying it by the die yield loss. Die selling price appears to be the best metric
because it represents lost revenue to the wafer user, which is what the user was trying to avoid by
imposing a specification on site flatness. However, it should be noted that this approach involves several
implicit assumptions. These assumptions are as follows:

1)    The amount of revenue lost by a wafer user as a result of a site failing for site flatness is assumed
      to be the selling price of the packaged die and not its wafer fab manufacturing cost. This keeps
      both the supplier and user terms of all equations in units of lost revenue.

2)    It is assumed that site flatness failures cause loss in multiprobe yield instead of die reliability
      issues, so revenue loss due to reliability failures can be ignored. It is recognized that, in principle,
      any failure mechanism that can cause multiprobe loss could also generate die that marginally pass
      multiprobe and become reliability risks, but in the present model it is assumed that the number of
      such marginal escapes caused by site flatness is negligibly small.

3)    Each site failure is assumed to result in a single die failure.

4)    The spatial distribution of site flatness failures is assumed to be random (no clustering). This
      assumption is probably valid as long as the site failure rate (fs) is low, because empirical data at
      present indicate that most failures are one site per wafer until the specification causes a fairly large
      number of wafers to be rejected. (The actual value of "fairly large" can be determined quantitatively
      from supplier data.) When the number of rejected sites per wafer becomes
      significantly larger than 1.0, present empirical data suggest that defective sites do tend to cluster
      around the edge of the wafer.


During the above development, several mathematical cost tradeoff models were considered, but a rather
simple one has proven to be the most useful and it is derived as follows:

     Compare the revenue lost by a supplier when a sort is performed to the revenue lost by a user
     when the sort is not performed. The sort is assumed to be done on the results of a 100%
     inspection, using a sort limit equal to the SFQR specification. The revenue lost by the supplier (if
     the sort is done) is taken to be equal to the value of the wafers rejected by doing the 100%
     inspection and sort. The revenue lost by the user (if the sort were not done) is taken to be the
     value of all the dice that fail because of sites greater than the SFQR specification limit in the
     entire shipment. Assume that all the supplier’s costs related to site flatness inspection and
     screening are spread across the remaining wafers and raise their sales price to absorb the loss.

The revenue lost by the supplier from doing the sort would be
                                                             $s = Pw fw N                                                            (6)
where Pw is the supplier’s revenue loss due to failure of a wafer to meet the flatness (SFQR) specification,
fw is the fraction of wafers rejected by the sort, and N is the number of wafers in the shipment.

The revenue lost by the user from not doing the sort would be
                                                            $u = PdfdDwN                                                             (7)
where Pd is the selling price of a die, fd is the fraction of dice that fail because of wafer flatness, Dw is the
number of dice/wafer, and N is the number of wafers in the shipment.

This needs to be expanded somewhat because only the fraction of rejected sites is known from the
inspection, not the fraction of rejected dice. First, we have
                                                              Dw = SwDs                                                              (8)
where Sw is the number of sites per wafer and Ds is the (average) number of dice per site.

(Note that the last equation is not strictly true unless Ds is allowed to be a non-integer number, because
all partial sites do not contain the same number of subsites, or dice, as full sites; and different partial
sites will even differ in their number of dice per site).

It also can be shown that
                                                            fd = fsnds / Ds                                                          (9)
where fs is the fraction of sites with SFQR greater than the specification limit (all the dice that fail are in
these sites), and nds is the number of dice that fail for each site that fails (initially assumed to be 1;
see footnote 8).

  Footnote 8: The validity of Equation (9) is not obvious, so the following example may be useful. Suppose we have 100 sites
  with 4 dice/site, for a total of 400 dice. Further suppose that 2 sites fail the SFQR limit, and the topography that led to the
  site failures causes 2 dice on one of the sites to fail and 3 dice on the other site to fail. Then fd = 5 die failures / 400 dice
  = 0.0125 by direct calculation. To calculate fd from Equation (9), we need fs = 2 site failures / 100 sites = 0.02, and
  nds = 5 die failures / 2 site failures = 2.5 die failures/site failure, so we have fd = (0.02)(2.5)/4 = 0.0125, which matches
  the result of the direct calculation.

Substitute Equations (8) and (9) into Equation (7) to give
                                              $u = (PdfsndsSwDsN)/Ds = PdfsndsSwN.                                                      (10)
If a wafer fab uses the same wafer type (wafer specification) to manufacture multiple devices with
different selling prices, the highest die price should be used in the model, because it is the one that
causes the largest amount of lost revenue. Usually the highest-priced die will also be one of the fab's
largest dice, and therefore, among the most sensitive to site flatness.

Before continuing, consider the relative magnitudes of the supplier and user revenue losses.
Remembering that $s is the supplier’s revenue loss when the sort is done and $u is the user’s revenue
loss when the sort is not done, if
                                                                $s > $u,
doing the inspection and sort causes more lost revenue overall than omitting the inspection and sort. In
other words, from a systems perspective it is more cost-effective to stop doing the sort. On the other
hand, if
                                                                $s < $u,
doing the inspection and sort causes less lost revenue overall than omitting it, so it is more cost-effective
to retain the sort.

Equation (6) states that $s = PwfwN, and Equation (10) has $u = PdfsndsSwN. For any given set of wafers
that are ready to be inspected, sorted, and shipped, the revenue loss per wafer (Pw) and the number of
wafers (N) are fixed. And since the wafers have a given topographic feature height distribution, once the
inspection criteria (site size and placement; SFQR sort limit) are decided, everything else (fw, fs, nds, and
Sw) is also fixed except the die price, Pd. Since $u is directly proportional to the die price, this implies that
a lower die price favors omitting the sort, and a higher die price favors retaining the sort. This makes
sense, because yield loss (or gain) is more important for more expensive dice.

Therefore, a critical die price can be defined for which
                                                                $s = $u.
If the user’s die prices are lower than this critical die price, it is more economical to drop the inspection
and sort, and if the user’s die prices are higher than this critical die price, the inspection and sort should
be continued. Of course the value for the critical die price depends on the fixed values for all the other
parameters given in Equations (6) and (10).

These relationships can be visualized by plotting the quantity ($s - $u) vs. Pd, as shown for 200 mm and
300 mm wafers in Figure 4. These curves are obtained by calculating
                                               ($s - $u) = (PwfwN) - (PdfsndsSwN).                                                      (11)
The details of the calculation are as follows. The number of wafers per shipment, N, is common to both
terms of the equation and so can be omitted (or taken as equal to one). The number of sites per wafer,
Sw, is taken as 52 for 200 mm wafers and 112 for 300 mm wafers (see Figure 3). The fractions of
rejected sites, fs, and rejected wafers, fw, are computed from the lognormal model with µ = -2.5 and
σ = 0.5 and from the binomial distribution, respectively, as was done earlier. For the time being, nds is
assumed to be 1, i.e., when a site fails it causes one and only one die failure. The implications of changing
this assumption will be discussed later.

[Figure 4: two panels plotting Supplier Loss - User Loss (y-axis) against Die Selling Price (x-axis). Panel (a) runs from $0 to $150 on the x-axis and -$10 to $10 on the y-axis; panel (b) runs from $0 to $600 and -$40 to $60. In both panels the region above $0 is labeled DO NOT INSPECT and the region below $0 is labeled INSPECT.]

Figure 4 [SFQR model.xls, Sheet “Linear Model”]. Supplier revenue loss - user revenue loss ($s - $u) as
a function of die selling price (Pd) for (a) 200 mm wafers and (b) 300 mm wafers. Illustrative wafer
prices are taken from Table 1 (curves for p/p+, p/p, and polished wafers). Note that it is more economical
to drop the wafer inspection and sort step for positive values of $s - $u, but retention of the wafer
inspection and sort step is more economical for negative values of $s - $u.

The lines are parallel because for a given wafer diameter, all the parameters are fixed (held constant)
except for wafer value (revenue loss) and die price. Changing the wafer value moves the line up and
down; changing the die price generates a linear graph because die price is the independent variable
plotted on the x-axis, and the function being plotted is of the form
                                                                            y = mx + b.
The lines cross the x-axis [(Supplier Revenue Loss) - (User Revenue Loss) = $0] at the critical die price,
Pd,c, when, from Equation (11)
                                                                      PwfwN = Pd,c fsndsSwN.                                                                 (12)
Solving for Pd,c gives the equation for the critical die price:
                                                                      Pd,c = Pwfw/(fsndsSw).                                                                 (13)
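
The relationships in Equations (6), (9), (10), and (13) can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, and the $100 wafer value is a hypothetical input (the report takes its wafer prices from Table 1). The values of fw and fs are the rounded Table 2 entries for ECpk = 1.00 and Sw = 52.

```python
def supplier_loss(Pw, fw, N=1):
    """Eq. (6): revenue lost by the supplier when the sort is done."""
    return Pw * fw * N


def die_failure_fraction(fs, nds, Ds):
    """Eq. (9): fraction of dice that fail, from the site failure fraction."""
    return fs * nds / Ds


def user_loss(Pd, fs, nds, Sw, N=1):
    """Eq. (10): revenue lost by the user when the sort is omitted."""
    return Pd * fs * nds * Sw * N


def critical_die_price(Pw, fw, fs, nds, Sw):
    """Eq. (13): die price at which supplier and user losses are equal."""
    return Pw * fw / (fs * nds * Sw)


def should_inspect(Pd, Pw, fw, fs, nds, Sw):
    """True when $s < $u, i.e. retaining the sort loses less revenue."""
    return supplier_loss(Pw, fw) < user_loss(Pd, fs, nds, Sw)


# Footnote 8 example: 100 sites, 4 dice/site, 2 failed sites, 5 failed dice.
fd = die_failure_fraction(fs=2 / 100, nds=5 / 2, Ds=4)

# Hypothetical wafer value; fw and fs are rounded Table 2 values (ECpk = 1.00).
Pw, fw, fs, nds, Sw = 100.0, 0.0678, 0.00135, 1, 52
Pdc = critical_die_price(Pw, fw, fs, nds, Sw)
print(f"fd = {fd:.4f}, critical die price = ${Pdc:.2f}")
print("inspect at $200/die:", should_inspect(200.0, Pw, fw, fs, nds, Sw))
print("inspect at $50/die: ", should_inspect(50.0, Pw, fw, fs, nds, Sw))
```

In the y = mx + b form used above, the intercept is `supplier_loss(Pw, fw)` and the slope is `-fs * nds * Sw`, so changing the wafer value shifts the line without changing its slope.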

Effect of supplier process capability

It is very instructive to look now at the ratio of fw to fs, and in particular, how this ratio varies with the
wafer supplier’s process capability. When dealing with normal distributions, process capability is typically
characterized with a Cpk value. The corresponding statistic for lognormal distributions is the effective Cpk
(ECpk), calculated from
                                                                    ECpk = {ln(USL) - µ}/(3σ)
where µ and σ are, again, the mean and standard deviation of the natural log of the site flatness.

Figure 5 and Table 2 show fw /fs normalized to Sw, the number of 25 mm × 25 mm sites per wafer, as a
function of ECpk for wafers with 52 (200 mm) and 112 (300 mm) sites, including partial sites. Because
the value of fs depends only on ECpk, independent of the particular values of USL, µ, and σ from
which ECpk is calculated, the results depend only on ECpk.
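
The statement that fs depends only on ECpk can be checked numerically. The sketch below assumes the lognormal model described above (ln(SFQR) normally distributed with mean µ and standard deviation σ), so that fs = 1 - Φ(3 ECpk), where Φ is the standard normal CDF. The two parameter sets are hypothetical and were chosen only so that both give ECpk = 1.0; the USL values are not taken from the report.

```python
from math import exp, log
from statistics import NormalDist


def ecpk(usl, mu, sigma):
    """Effective Cpk for a lognormal SFQR distribution."""
    return (log(usl) - mu) / (3 * sigma)


def site_failure_fraction(usl, mu, sigma):
    """fs = P(SFQR > USL) when ln(SFQR) is normal with mean mu, std sigma."""
    return 1 - NormalDist(mu, sigma).cdf(log(usl))


# Two hypothetical parameter sets that share ECpk = 1.0.
case_a = dict(usl=exp(-1.0), mu=-2.5, sigma=0.5)    # mu, sigma as in Figure 4
case_b = dict(usl=exp(-2.25), mu=-3.0, sigma=0.25)  # invented for comparison

fs_a = site_failure_fraction(**case_a)
fs_b = site_failure_fraction(**case_b)
fs_direct = 1 - NormalDist().cdf(3 * 1.0)           # fs = 1 - Phi(3 * ECpk)

print(ecpk(**case_a), ecpk(**case_b))
print(fs_a, fs_b, fs_direct)
```

All three fs values agree to within floating-point error and round to the 0.135% listed in Table 2 for ECpk = 1.00.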

[Figure 5: plot of (fw /fs)/Sw (y-axis, 0.4 to 1.0) against ECpk (x-axis, 0.4 to 1.4), with one curve for 52 sites and one curve for 112 sites.]

Figure 5 [SFQR model.xls, Sheet “fw over fs vs ECpk”]. The ratio, fw/fs, normalized to Sw and
plotted against ECpk, assuming 52 sites per wafer (appropriate for 200 mm wafers, including
partial sites) and 112 sites per wafer (appropriate for 300 mm wafers, including partial sites).

However, as can be seen from both the figure and the table, the results do depend on the number of
sites per wafer, Sw, because the function fw depends on this parameter. Specifically, for large values of
ECpk, the limiting value of fw /fs is equal to the number of sites per wafer. The mathematical proof of why
this is true, involving as it does the integrals of both lognormal and binomial distributions, is left to the
interested reader, but the following discussion gives an indication of why it necessarily must be so.

Table 2. The Ratio of (fw/fs) to Sw for Different Values of ECpk and Sw.
 [SFQR_model.xls, Sheet “Ratio Calculations”]

                                   200 mm Wafers (Sw = 52)                 300 mm Wafers (Sw = 112)
     ECpk        fs             fw        fw/fs     (fw/fs)/Sw          fw        fw/fs     (fw/fs)/Sw
     0.65      2.559%         74.02%      28.93       55.63%          94.52%      36.94       32.98%
     0.85      0.539%         24.48%      45.46       87.42%          45.39%      84.26       75.24%
     1.00      0.135%          6.78%      50.25       96.63%          14.04%     104.01       92.87%
     1.07      0.066%          3.39%      51.13       98.33%           7.17%     107.97       96.40%
     1.13      0.035%          1.80%      51.54       99.11%           3.84%     109.85       98.08%
     1.20      0.016%          0.82%      51.79       99.60%           1.77%     111.02       99.12%
     1.33      33 ppm         0.172%      51.96       99.92%          0.370%     111.79       99.82%
     1.50      3.4 ppm        0.018%      51.995      99.99%          0.038%     111.98       99.98%

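
The entries of Table 2 can be approximately regenerated with a short script. This is a sketch under two assumptions: fs = 1 - Φ(3 ECpk) from the lognormal model, and a binomial model in which a wafer is rejected if any one of its Sw independent sites fails, so that fw = 1 - (1 - fs)^Sw. The function names are invented for illustration and are not those of the spreadsheet referenced in the table.

```python
from statistics import NormalDist


def fs_from_ecpk(ecpk):
    """Site reject fraction for a lognormal SFQR distribution."""
    return 1 - NormalDist().cdf(3 * ecpk)


def fw_from_fs(fs, Sw):
    """Wafer reject fraction: at least one of Sw independent sites fails."""
    return 1 - (1 - fs) ** Sw


def table2_row(ecpk, Sw):
    """Return fs, fw, fw/fs, and (fw/fs)/Sw for one ECpk and site count."""
    fs = fs_from_ecpk(ecpk)
    fw = fw_from_fs(fs, Sw)
    ratio = fw / fs
    return fs, fw, ratio, ratio / Sw


for ecpk in (0.65, 0.85, 1.00, 1.20, 1.50):
    for Sw in (52, 112):
        fs, fw, ratio, normalized = table2_row(ecpk, Sw)
        print(f"ECpk={ecpk:.2f} Sw={Sw:3d} fs={fs:.5%} fw={fw:.3%} "
              f"fw/fs={ratio:.2f} (fw/fs)/Sw={normalized:.2%}")
```

The printed values can be compared line by line with the table; small differences in the last digit are possible because the table was produced with spreadsheet functions.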
If N is the total number of wafers being inspected, the fraction of wafers rejected is given by
                                                  fw = number of wafers rejected / N,                                    (14)
and the fraction of sites rejected is given by
                       fs = number of sites rejected / total number of sites = number of sites rejected / (SwN).         (15)
Dividing Equation (14) by Equation (15) gives
                                fw /fs = Sw × (number of wafers rejected / number of sites rejected).                    (16)
Now consider what happens when the site reject rate is very low, i.e., when ECpk is large. Each time that
a site is rejected, a wafer is rejected, and in the absence of any kind of systematic defect pattern, it
would be rare for a rejected wafer to have more than one failed site. This discussion will be returned to
later, but for now, keep in mind that the models developed up to this point are based on assumptions
that support this conclusion. In particular, the binomial model predicts only one rejected site per wafer
until the overall site reject rate becomes large enough for the number of wafers with two rejected sites to
be significant. Therefore, it can be taken as a limit that the ratio of the (number of wafers rejected) to
the (number of sites rejected) is equal to one, and thus
                                               lim (fs → 0) fw/fs = Sw.                                       (17)
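
The limit in Equation (17) can also be motivated algebraically. The short derivation below assumes that the binomial model reduces to fw = 1 - (1 - fs)^Sw, i.e., that a wafer is rejected whenever at least one of its Sw independent sites fails; it is offered as a sketch of the argument rather than as a complete proof.

```latex
\begin{align*}
f_w &= 1-(1-f_s)^{S_w}
     = S_w f_s - \binom{S_w}{2} f_s^{2} + \binom{S_w}{3} f_s^{3} - \cdots \\
\frac{f_w}{f_s} &= S_w - \binom{S_w}{2} f_s + O\!\left(f_s^{2}\right)
\quad\Longrightarrow\quad
\lim_{f_s \to 0} \frac{f_w}{f_s} = S_w .
\end{align*}
```

To first order, (fw/fs)/Sw ≈ 1 - (Sw - 1)fs/2. For Sw = 52 and fs = 0.135% this gives about 96.6%, in line with the Table 2 entry for ECpk = 1.00.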
Recalling from Equation (13) that the critical die price, Pd,c, is
                                              Pd,c = Pwfw/(fsndsSw),                                          (13)
substituting Equation (17) in Equation (13) gives
                                                  Pd,c = Pw/nds.                                              (18)
Assuming, as was done when calculating the tradeoff model curves (Figure 4), that nds = 1, i.e., when a
site fails it causes one and only one die failure, the critical die price is equal to the lost wafer revenue:
                                                    Pd,c = Pw.                                                (19)
The implications of this result are discussed first, followed by the ways in which the conclusions would
change if any of the assumptions were violated.

As was stated earlier, if the user’s die prices are lower than this critical die price, it is more economical to
drop the inspection and sort, and if the user’s die prices are higher than this critical die price, the
inspection and sort should be continued. With Equation (19) and the preceding discussion, the following
conclusion can be stated:

  It is less economical to perform a wafer sort based on 100% site flatness inspection than
  to omit it if the selling price per die is lower than the critical die price, Pd,c. Subject to
  certain assumptions, the critical die price is approximately equal to the wafer revenue
  loss. In mathematical notation, the inspection and sort should be omitted if
                                           Pd < Pd,c ≈ Pw

The assumptions that lead to this conclusion can now be examined to see how any deviations from the
assumptions might affect the conclusion.

    (1) First, as has been stated, the maximum value of (fw /fs) is Sw, because a larger value than Sw
        would imply that one site failure causes more than one wafer to fail, which is logically impossible.
        Anything that causes the ratio of (fw /fs) to Sw to be lowered below 1 reduces the critical die price
        (see Equation (13)), so that it is economically beneficial to carry out the inspection and sort step
        for even less expensive devices under these circumstances.

    (2) From Figure 5 it can be seen that fw /fs ≈ Sw when, and only when, ECpk is large. (Table 2 shows
        the ratio of (fw /fs) to Sw for selected values of ECpk.) Taking this into account, the following
        conclusions can be drawn:

            • Equation (13) shows that, if nds = 1, the critical die price is lowered from its maximum value
              (Pw) by the ratio of (fw /fs) to Sw. This reduction in the critical die price is not too significant as
              long as ECpk is above 1.0, but it becomes noticeable if ECpk becomes as small as 0.85.

            • As would be expected, the wafer loss (fw) also becomes significant, and probably
              unacceptable, for ECpk values below 1.0 (or perhaps even 1.1 or higher).

            • From both Figure 5 and Table 2, it is clear that ECpk has to be higher for 300 mm wafers than
              for 200 mm wafers to obtain the same results. In fact, fw is larger, and the reduction in the
              critical die price is more significant, for a larger number of sites per wafer.

(3) If more than one die fails for each site that fails (nds > 1), it also lowers the critical die price.
    Note that nds is not the number of dice per site, only the number of dice that fail when a site
    fails. For example, if a site contains 16 dice and each site failure results in an average of two die
    failures, the critical die price becomes equal to Pw /2. The reduction in the critical die price is
    potentially worse for smaller die size (more dice/site), and in the absence of evidence to the
    contrary, the user is most likely to assume that every die in a site fails when the site fails.

(4) If site failures are not spatially random (random meaning that every site has the same probability
    of failure, regardless of location), fw /fs is smaller than otherwise, so the critical die price is lowered. To
    visualize this, imagine that a given number of site failures is “clustered” on a small number of
    wafers instead of being evenly (i.e., binomially) distributed. Then the number of wafers rejected
    is smaller than expected from Equation (16), so (fw /fs) is smaller. Data received from a wafer
    supplier support the conclusion that site failures are not, in fact, randomly distributed. Table 3
    shows some of these data and the impact they have on the ratio (fw /fs)/Sw. Although somewhat
    sparse, these data do show that the observed fw /fs ratio is considerably lower than the model
    would predict. In particular, when ECpk approaches 1, (fw /fs)/Sw should approach 100%, but it
    appears to be about 40% at best.

 Table 3. Modeled and Actual Data for a 200 mm Wafer with 25 mm × 32 mm Sites (Sw = 36)
   ECpk         fw              fs            fw / fs       (fw /fs)/Sw     fw / fs      (fw /fs)/Sw
             modeled         modeled          modeled         modeled       actual         actual
   0.63       65.8%          2.94%             22.40           62.2%         9.7           26.9%
   0.79       27.5%          0.89%             30.92           85.9%        14.6           40.6%
   0.93       9.06%          0.26%             34.39           95.5%        13.0           36.1%

(5) On the other hand, the critical die price is increased if the number of dice/wafer is less than SwDs
    because of partial sites. Going back to some of the equations used to derive the linear tradeoff
    model and rearranging the terms demonstrates this. Equating $s from Equation (6) to $u from
    Equation (7) and solving for the critical die price, Pd,c, yields:
                                             Pd,c = Pwfw /(fdDw)                                       (20)
    Now recalling from Equation (9) that fd = fsnds /Ds, after some rearranging, the critical die price is
                                     Pd,c = Pw(Ds / Dw) (fw /fs)(1/nds)                                (21)
    If it is assumed that fw /fs = Sw and that nds = 1, the critical price becomes:
                                           Pd,c = Pw (DsSw /Dw)                                        (22)
    Table 4 shows the ratio (DsSw /Dw) for several scenarios. The dice per wafer (Dw) values shown
    in the table are the number of full dice that fit within the FQA. Sometimes dice fall within the
    FQA even though they lie within a site that is not considered a partial site. These cases were
    also counted and included in the calculated value of Dw.

(6) In the preceding discussions, the possibility of having outliers was ignored. Outliers are
    abnormally high SFQR values due to “special causes” that are distinct from the system of causes
    that result in a predictable lognormal distribution. If outliers are observed in the flatness data
    they are indicative of a lack of statistical control and can be taken as evidence that 100%
    inspection is required.
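
Points (3) and (5) above both follow from the general form of the critical die price in Equation (21), which can be sketched as a single function. The function name and the unit wafer value (Pw = 1) are choices made here for illustration, so the results read directly as multiples of Pw. The first example uses the first row of Table 4 (Ds = 1, Sw = 52, Dw = 32); the second reproduces the example in point (3), with 16 dice per site, two die failures per failed site, and no partial-site correction (Dw = SwDs).

```python
def critical_die_price(Pw, fw_over_fs, Ds, Dw, nds=1):
    """Eq. (21): Pd,c = Pw * (Ds / Dw) * (fw / fs) * (1 / nds)."""
    return Pw * (Ds / Dw) * fw_over_fs / nds


Sw = 52  # sites per 200 mm wafer, 25 mm x 25 mm sites

# Point (5): partial sites reduce Dw below Sw * Ds (Table 4, first row).
partial_site_case = critical_die_price(Pw=1.0, fw_over_fs=Sw, Ds=1, Dw=32)

# Point (3): 16 dice per site, 2 die failures per failed site, Dw = Sw * Ds.
multi_die_case = critical_die_price(Pw=1.0, fw_over_fs=Sw, Ds=16,
                                    Dw=Sw * 16, nds=2)

print(partial_site_case)  # multiple of Pw; compare DsSw/Dw = 1.625 in Table 4
print(multi_die_case)     # multiple of Pw; compare Pw/2 in point (3)
```

Both examples take fw /fs = Sw, i.e., the high-ECpk limit; substituting a smaller ratio from Table 2 or Table 3 lowers the result proportionally.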

Table 4. Ratio of Die Selling Price to Wafer Revenue Loss (DsSw /Dw) for Several Scenarios

   Wafer Diameter    Site Size, mm × mm    Die Array     Ds      Dw     DsSw    DsSw /Dw
   200 mm            25 × 25               1              1      32       52     1.625
                                           2 × 2          4     156      208     1.33
                                           5 × 5         25    1108     1300     1.17
                     25 × 32               1              1      20       36     1.80
                                           2 × 2          4     120      144     1.20
                                           5 × 8         40    1396     1440     1.03
   300 mm            25 × 25               1              1      88      112     1.27
                                           2 × 2          4     392      448     1.14
                                           5 × 5         25    2592     2800     1.08
                     25 × 32               1              1      64       88     1.375
                                           2 × 2          4     296      352     1.19
                                           5 × 8         40    3264     3520     1.08

CONCLUSIONS

In summary, all of the deviations from the original model assumptions cause the critical die price to be
lowered from the maximum value, Pw, except for point (5). Some examples of the degree by which the
critical die price can be lowered are as follows:

       • If the supplier’s process capability is not adequate compared to the customer specification, it is
         economically impractical to omit the inspection even when the die price is relatively low. For
         example, if ECpk is only 0.85 on 300 mm wafers (25 mm × 25 mm sites), the critical die price is
         reduced to 0.75Pw.

       • The amount by which non-random (clustered) site failures lower the critical die price is, in
         general, unknown because knowing it would require a generalization from a wealth of
         quantitative data. However, the small amount of data collected suggests that the critical die
         price could be reduced to 0.4Pw or less.

       • Probably the factor that has the most impact is the number of die failures that a site failure
         causes. If the user’s die array has four dice/site and the user assumes that a site failure causes
         all four dice to fail, the critical die price is reduced to 0.25Pw.

       • All of these reductions in the critical die price appear to feed back into the equations
         multiplicatively. For example, if the ECpk were 0.85, the site failures clustered as they did in the
         data shown above, and the user has four dice/site which are all assumed to fail if a site fails, the
         critical die price is reduced to (0.75)(0.4)(0.25)Pw = 0.075Pw.

       • Returning to the issue of metrology accuracy, one may ask whether these conclusions could be
         substantially altered by poor measurement accuracy. To fundamentally alter the overall
         conclusion, poor metrology would have to be the primary cause of a supplier’s site flatness
         failures, and while this is certainly possible, it does appear to be unlikely.
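
The multiplicative example in the list above, and the degree of clustering implied by Table 3, can be reproduced with the sketch below. Both helper functions are hypothetical names introduced here. Treating the three reductions as independent factors follows the report's own worked example and should be read as an approximation. The second helper simply inverts Equation (16) to estimate the average number of rejected sites per rejected wafer.

```python
def price_reduction(capability_ratio=1.0, clustering_ratio=1.0, nds=1):
    """Critical die price as a fraction of Pw, with multiplicative factors."""
    return capability_ratio * clustering_ratio / nds


def sites_per_rejected_wafer(fw_over_fs, Sw):
    """Invert Eq. (16): rejected sites / rejected wafers = Sw / (fw / fs)."""
    return Sw / fw_over_fs


# Worked example from the list: ECpk = 0.85 on 300 mm wafers (0.75),
# clustering as in Table 3 (0.4), and four failing dice per failed site.
combined = price_reduction(capability_ratio=0.75, clustering_ratio=0.4, nds=4)

# Table 3, first row (Sw = 36): modeled fw/fs = 22.40, actual fw/fs = 9.7.
modeled_sites = sites_per_rejected_wafer(22.40, 36)
actual_sites = sites_per_rejected_wafer(9.7, 36)

print(f"combined reduction = {combined:.3f} Pw")
print(f"rejected sites per rejected wafer: modeled {modeled_sites:.2f}, "
      f"actual {actual_sites:.2f}")
```

For that Table 3 row the supplier data imply roughly twice as many failed sites per rejected wafer as the random model, which is the clustering effect described in point (4).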

As a final point in this section, one comment can be made about specifications that allow one or more
nonconforming sites within the FQA before the wafer is rejected (i.e., PUA < 100%). The linear tradeoff
model compares the relative cost of losing dice and losing wafers. Therefore, if the model indicates that
the revenue lost by die failure overwhelms the revenue lost by wafer rejection, when the strictest criteria
are used to screen out wafers that cause die failure, it does not make sense to loosen the reject criteria
to allow an even larger number of dice to fail. In all this discussion, however, perfect metrology is
assumed. This is not completely correct; anecdotal evidence suggests that metrology failures occur most
frequently at the edges of the wafer, usually in partial sites. The impact of such metrology errors has not
been fully factored into the present model. However, it should be noted that the general approach used
by the practice for determining the cost component due to misclassification as a result of incorrect
measurement, under development in the SEMI Standards Program, might be used to quantitatively
determine the cost impact of metrology errors.


The linear tradeoff model appears to be quite useful because it is relatively easy to understand and can
be manipulated to reach meaningful conclusions by examining the assumptions that were used to
formulate it. According to the model, when the die price exceeds the wafer value (or revenue loss), it is
cost-effective to conduct 100% inspection and sort. On the other hand, when the die price is less than
the wafer value, it may or may not be cost-effective to inspect and sort. The critical die price decreases
below the wafer value as ECpk decreases. It also decreases when more than one die fails per site or
when site failures cluster. The model assumes both that the metrology is accurate and that the user
specification is correct, i.e., that (1) the die always fails when the site flatness exceeds the critical SFQR
value and never fails if the site flatness is less than this value and (2) the user has correctly identified the
critical SFQR value that marks the onset of die loss and made its value the USL. It is not at all clear that
the knowledge to support either of these assumptions exists within the semiconductor industry.
However, for reasons outlined below, it is believed the effort necessary to gain this knowledge would be
better spent by adopting a new approach to site flatness metrology.

Although strictly speaking it is outside the scope set for the investigation leading to this report, the
discussions and ideas that led to the above models raise significant questions concerning the value of
the present metric (SFQR) for site flatness. A brief summary of some of the limitations inherent in the
presently used metric follows:

       • Placement of the site pattern on the wafer does not, and cannot, in every case match the user’s
         stepper field placement. Therefore, the topography that affects printing is not necessarily being
         measured by the wafer supplier, so the measured topography frequently does not represent the
         topography seen by the stepper.

       • It is possible to print yielding dice in a location outside the site pattern. This problem becomes
         worse for large sites coupled with smaller-diameter wafers. For example, for a 200 mm wafer
         with a 3-mm edge exclusion, a 25 mm × 32 mm site size with zero offsets does not account for
         7.16% of the area inside the FQA. (Recall that in the Introduction a case was mentioned where
         a wafer was rejected because ~0.65% of the area inside the FQA was nonconforming). In
         addition, the amount of unmeasured FQA varies in unpredictable ways as the site size and
         pattern offsets are changed.

       • Information about the relationship between topographic feature height and feature size on a
         particular wafer is absent from standard site flatness measurements and site flatness maps.

In view of such concerns about the current implementation of site flatness metrology, it is suggested that
a new metric be given serious consideration within the industry. One such possibility is the “flying-site”
measurement that has been proposed, which addresses the above issues, and is currently under
evaluation. This measurement is made by determining SFQR for every possible position on a wafer of a
25 mm × 8 mm site (which mimics the slit size of a scanning stepper), to a resolution of 1 mm without
utilizing partial sites. Flying-site measurements thus produce an information density of 100 points/cm²,
and provide a map of wafer topography that is more likely to be relevant to stepper performance,
independent of how the stepper exposure fields are arranged on the wafer. Whether a new metric such
as the flying site will improve the industry-wide cost of ownership remains to be determined.
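
The flying-site idea described above can be illustrated with a minimal pure-Python sketch. Several things here are assumptions rather than part of the report: SFQR is taken to be the range of height deviations about the least-squares plane fitted to the site (the usual reading of the SEMI M1 naming), heights are stored as a dictionary on a 1 mm grid, the 25 mm side of the site runs along x, and the function names and the synthetic 40 mm × 20 mm patch are invented for the example. A real implementation would operate on full wafer maps with a circular FQA.

```python
def sfqr(points):
    """Range of height residuals about the least-squares plane through points."""
    n = len(points)
    mx = sum(x for x, _, _ in points) / n
    my = sum(y for _, y, _ in points) / n
    mz = sum(z for _, _, z in points) / n
    sxx = sum((x - mx) ** 2 for x, _, _ in points)
    syy = sum((y - my) ** 2 for _, y, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y, _ in points)
    sxz = sum((x - mx) * (z - mz) for x, _, z in points)
    syz = sum((y - my) * (z - mz) for _, y, z in points)
    det = sxx * syy - sxy ** 2
    a = (sxz * syy - syz * sxy) / det      # plane slope along x
    b = (syz * sxx - sxz * sxy) / det      # plane slope along y
    resid = [(z - mz) - a * (x - mx) - b * (y - my) for x, y, z in points]
    return max(resid) - min(resid)


def flying_site_map(height, in_fqa, width=25, length=8):
    """SFQR for every full width x length site position on a 1 mm grid.

    height: dict mapping integer (x, y) positions in mm to surface height.
    in_fqa: function (x, y) -> bool, True inside the fixed quality area.
    """
    result = {}
    for (x0, y0) in height:
        cells = [(x, y) for x in range(x0, x0 + width)
                 for y in range(y0, y0 + length)]
        # No partial sites: every cell must be measured and inside the FQA.
        if all(c in height and in_fqa(*c) for c in cells):
            result[(x0, y0)] = sfqr([(x, y, height[x, y]) for x, y in cells])
    return result


# Synthetic patch (hypothetical data): a tilted plane, then one 0.1 um bump.
flat = {(x, y): 0.001 * x + 0.002 * y for x in range(40) for y in range(20)}
bumped = dict(flat)
bumped[(20, 10)] += 0.1

flat_map = flying_site_map(flat, lambda x, y: True)
bump_map = flying_site_map(bumped, lambda x, y: True)
print(len(flat_map), max(flat_map.values()), max(bump_map.values()))
```

Because the site position advances in 1 mm steps in both directions, each square centimeter of wafer yields 100 SFQR values, which matches the information density quoted above. A pure tilt is removed by the plane fit, while the single bump shows up in every site position that contains it.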

It has also been noted that front surface site flatness appears to be confounded with the topography of
both the wafer back surface and the chuck, which makes measurements of front surface site flatness of
doubtful value as a model for the performance of steppers that use low-contact chucking. However, in
view of the great number of variations in wafer-chuck interactions, it is unreasonable to expect the wafer
manufacturer to account fully for these interactions. It is recommended that greater understanding of
these interactions be developed. Nevertheless, when consistently applied, front surface site flatness
metrology has provided evidence of improving flatness process capability without considering the wafer-
chuck interactions.

Appendix. Table of Symbols Used in Report
   Symbol                                      Meaning                                     First Introduced
      $s      potential revenue lost by supplier from doing 100% inspection and        Eq (6)
              sort for flatness
      $u      potential revenue lost by wafer user from not requiring 100%             Eq (7)
              inspection and sort for flatness
      µ       mean value of the distribution of the natural logs of the SFQR values    Eq (1)
      σ       standard deviation of the distribution of the natural logs of the SFQR   Eq (1)
              values [ln(xi)]
      Ds      average number of dice per site                                          Eq (8)
      Dw      number of dice per wafer                                                 Eq (7)
     ECpk     effective process capability coefficient                                 Unnumbered Eq on p 11
   FALSE      logical argument indicating that the distribution produced by the        Eq (2)
              spreadsheet function is a probability density function (PDF)
      fd      fraction of dice that fail due to out of specification wafer flatness    Eq (7)
       fi     frequency of occurrence of the SFQR value with index i                   Unnumbered Eqs on p 4
     FQA      fixed quality area (see SEMI M1)                                         p3
       fs     fraction of sites in a large sample that have SFQR > USL                 Eq (3)
      fw      fraction of wafers that have one or more out of specification sites      Eq (5)
       N      number of sites in sample                                                Eq (4)
      nds     average number of dice that fail in each failed site                     Eq (9)
      Pd      die selling price                                                        Eq (7)
     Pd,c     critical die price                                                       Eq (12)
     PUA      percent usable area                                                      p 16
      Pw      supplier’s revenue loss due to failure of a wafer to meet flatness       Eq (6)
              specification (SFQR>USL on one or more sites)
       r      general argument for binomial distribution indicating maximum            Footnote 6
              number of failed sites in an acceptable wafer
    SFQR      a measure of site flatness (see SEMI M1)                                 p2
      Sw      total number of sites on a wafer                                         Eq (5)
    TRUE      logical argument indicating that the distribution produced by the        Eq (4)
              spreadsheet function is a cumulative distribution function (CDF)
     USL      upper specification limit for SFQR in micrometers                        Eq (3)
       xi     SFQR value with index i                                                  Eq (1)
