3       Uncertainty Propagation and Linear Least-Squares Fitting: A Tutorial

3.1     Uncertainty Propagation
3.1.1    Uncertainties in Measurement
When a measurement is made in an experiment, the measured quantity cannot be determined
to infinite precision. Consequently, a very important aspect of experimental science deals
with an understanding of “uncertainty” (sometimes given the misnomer “error”, suggesting
that a mistake has been made) in measurements, and how these uncertainties propagate when
different quantities with associated uncertainties are algebraically combined. Here we present
the basic rules of uncertainty propagation, and derive an important general propagation rule
for functions of one variable.

3.1.2    Rules for Uncertainty Propagation: Specific Cases
Last year, in first-year physics, you were told, for example, that when you add two quantities,
each with an associated uncertainty, the uncertainty in the sum is calculated according
to the following prescription:

        Given x ± δx and y ± δy, and given the definition of z = x + y, the uncertainty
        δz is given by

                                      δz = δx + δy
   It turns out that this is an overly pessimistic estimate of δz. From a basic study of
statistics, one can show that a better estimate is


                                  δz = √( (δx)² + (δy)² )

    Note that this estimate of δz is always smaller than the previous one. We call this
combination “adding the absolute uncertainties in quadrature”. We will not derive this rule
here: the derivation would take more time than this course allows. If you take a course in
statistics later, all will be made clear.
    The important rules of uncertainty propagation for addition, subtraction, multiplication,
and division that you will need for the laboratory component of this course are summarized
below.


                           Uncertainty in Derived Quantities

     Operation          Derived Quantity                   Uncertainty

     Addition or        Z = A ± B ± C ± ...                δZ = √( (δA)² + (δB)² + (δC)² + ... )
     Subtraction

     Multiplication     Z = (A × B × ...)/(C × D × ...)    δZ/Z = √( (δA/A)² + (δB/B)² + (δC/C)² + (δD/D)² + ... )
     or Division
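
In code, these two rules take only a few lines. A minimal Python sketch (the function
names are our own, purely illustrative):

```python
import math

def add_sub_uncertainty(*deltas):
    """Uncertainty in Z = A +/- B +/- C +/- ...:
    the absolute uncertainties add in quadrature."""
    return math.sqrt(sum(d**2 for d in deltas))

def mul_div_uncertainty(z, *pairs):
    """Uncertainty in a product/quotient z: the relative
    uncertainties add in quadrature.  Each pair is (value, uncertainty)."""
    rel = math.sqrt(sum((dv / v)**2 for v, dv in pairs))
    return abs(z) * rel

# z = x + y with x = 3.0 +/- 0.1 and y = 2.0 +/- 0.2:
print(add_sub_uncertainty(0.1, 0.2))                  # 0.224, not 0.3

# z = x * y:
print(mul_div_uncertainty(3.0 * 2.0, (3.0, 0.1), (2.0, 0.2)))
```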


3.1.3     Uncertainties and Significant Figures
Consider the statement of a measured quantity and its uncertainty:

                                  x = 3.9297 ± 0.61 m.

There are two problems with this statement. First, we generally only quote an uncertainty
to one significant figure. Thus, we correct the statement to be:

                                   x = 3.9297 ± 0.6 m.

However, there is still a problem. The number of significant figures for x implies that we
are certain of the value out to the fourth decimal place. However, this is inconsistent with
the stated uncertainty. The number of significant figures of a measured quantity (or one
derived from measured quantities) should be consistent with the uncertainty. Thus, the
proper statement of the value and uncertainty of x should be the following:

                                    x = 3.9 ± 0.6 m.

    Another example which combines the arguments above with scientific notation is the
following: the statement

                              x = 0.0062851 ± 0.0000728 m

should, at the very least, be rewritten,

                                x = 0.00629 ± 0.00007 m


but, more preferably, in the form

                              x = (6.29 ± 0.07) × 10⁻³ m

    Keep these points in mind when propagating uncertainties to derived quantities.
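
This rounding convention is easy to automate for lab write-ups. A short Python sketch
(the rounding policy coded here is just the convention described above):

```python
import math

def round_measurement(x, dx):
    """Round the uncertainty dx to one significant figure,
    then round x to the same decimal place."""
    place = math.floor(math.log10(abs(dx)))   # decimal position of leading digit of dx
    return round(x, -place), round(dx, -place)

print(round_measurement(3.9297, 0.61))          # (3.9, 0.6)
print(round_measurement(0.0062851, 0.0000728))  # (0.00629, 7e-05)
```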

3.1.4   Examples
Answer the following questions:

  1. The three dimensions of a rectangular block of lead are measured to be 5.2 cm, 11.8 cm
     and 7.3 cm. The uncertainty in each of the measurements is ±0.1 cm. Find the volume
     and the uncertainty in the volume of the lead block. Express your answer in scientific
     notation. (A numerical check of both examples follows Example 2.)


  2. The four quantities a, b, c, and d are measured to be a = 10.1 ± 0.1 cm,
     b = (6.2 ± 0.2) × 10⁻⁶ C, c = 0.52 ± 0.03 cm², and d = 0.5 ± 0.1 cm. Find the value
     and the uncertainty in the derived quantity

                                  z = b (ad/c + 1)

     where the “1” is an exact number. Express your answer in scientific notation.
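
Both examples can be checked numerically with the rules from the table. A sketch (the
intermediate variable names are ours; the structure mirrors a step-by-step hand calculation):

```python
import math

# Example 1: V = product of the three dimensions of the block.
dims   = [5.2, 11.8, 7.3]     # cm
deltas = [0.1, 0.1, 0.1]      # cm
V = math.prod(dims)
# Multiplication rule: relative uncertainties add in quadrature.
dV = V * math.sqrt(sum((u / v)**2 for v, u in zip(dims, deltas)))
print(V, dV)    # about 447.9 and 11.2, i.e. V = (4.5 +/- 0.1) x 10^2 cm^3

# Example 2: z = b*(a*d/c + 1), with the "1" exact.
a, da = 10.1, 0.1             # cm
b, db = 6.2e-6, 0.2e-6        # C
c, dc = 0.52, 0.03            # cm^2
d, dd = 0.5, 0.1              # cm

w = a * d / c                                            # mult/div rule
dw = w * math.sqrt((da/a)**2 + (dd/d)**2 + (dc/c)**2)
s, ds = w + 1.0, dw                                      # exact "1": uncertainty unchanged
z = b * s                                                # mult rule again
dz = z * math.sqrt((db/b)**2 + (ds/s)**2)
print(z, dz)    # about 6.6e-05 and 1.3e-05, i.e. z = (7 +/- 1) x 10^-5 C
```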


3.1.5   Uncertainty Propagation: Derivation for Functions of One Variable
Say we have measured a quantity x to have a value x = xm ± δx. Further, let us say that
we are interested in a derived quantity, y, which is an arbitrary, but known, function of x:
i.e. y = y(x). How can we find the uncertainty in the derived quantity, i.e. δy?
    First, we note that the propagation rules of the previous section do not help us here.
However, there is a straightforward procedure to find δy. Note that if we make a statement
                                      x = xm ± δx
then we are actually saying that x likely lies in the range
                                x ∈ [ xm − δx, xm + δx ]
  However, if y = y(x), then it is clear that y must lie in the range
                             y ∈ [ y(xm − δx), y(xm + δx) ]
If we define
                                δy1 = y(xm ) − y(xm − δx)

                                δy2 = y(xm + δx) − y(xm )
then we can rewrite the range for y as
                            y ∈ [ y(xm ) − δy1 , y(xm ) + δy2 ]
In general, δy1 is different from δy2 .
   Let us assume that the uncertainty δx is a small quantity. This implies that y(x) is
approximately a straight line in the range x ∈ [ xm − δx, xm + δx ]. Thus,
                  δy1/δx = [ y(xm) − y(xm − δx) ] / δx ≈ (dy/dx)|xm

                  ⇒  δy1 ≈ (dy/dx)|xm δx

Likewise,

                  δy2/δx = [ y(xm + δx) − y(xm) ] / δx ≈ (dy/dx)|xm

                  ⇒  δy2 ≈ (dy/dx)|xm δx

Thus, δy1 = δy2, and so we shall just re-label these quantities δy. The basic idea of this
derivation is illustrated in the figure below.
[Figure: the curve y(x) near x = xm. The measured interval [xm − δx, xm + δx] on the
x-axis maps, through the local slope of y(x), onto the interval [y − δy, y + δy] on the
y-axis.]
   There is one other complication: we have implicitly assumed that the slope dy/dx at
x = xm is positive. If the slope were negative, the formula above would give an uncertainty
δy < 0, which doesn’t make sense. To correct this problem, we simply modify the formula
by putting absolute value bars around the derivative. Thus,

                                  δy = |dy/dx| δx

where dy/dx is evaluated at the measured value of x.
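
This rule is also easy to apply numerically, even when the derivative is tedious to write
down. A sketch using a central-difference estimate of dy/dx (the step size h is an arbitrary
small number, our choice):

```python
def propagate(y, xm, dx, h=1e-6):
    """delta_y = |dy/dx| * dx, with dy/dx estimated numerically at xm."""
    dydx = (y(xm + h) - y(xm - h)) / (2 * h)   # central difference
    return abs(dydx) * dx

# y = x**2 at x = -10.0 +/- 0.3 (Example 1 below): dy/dx = 2x = -20
print(propagate(lambda x: x**2, -10.0, 0.3))   # about 6.0
```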

3.1.6    Examples
  1. x is measured to be −10.0 ± 0.3 cm. Find the uncertainty in the derived quantity
     y = x².


  2. x is measured to be 5.03 ± 0.07 m. Find the value and the uncertainty of the derived
     quantity y = A ln(x/a), where A = 1 m, exactly, and a = 2 cm, exactly.
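
A quick numerical check of both examples with the rule δy = |dy/dx| δx (note the unit
conversion a = 2 cm = 0.02 m in the second one):

```python
import math

# Example 1: y = x^2, x = -10.0 +/- 0.3 cm; dy/dx = 2x
x, dx = -10.0, 0.3
print(x**2, abs(2 * x) * dx)        # 100.0 +/- 6.0, so y = 100 +/- 6 cm^2

# Example 2: y = A*ln(x/a), x = 5.03 +/- 0.07 m, A = 1 m, a = 0.02 m; dy/dx = A/x
A, a = 1.0, 0.02
x, dx = 5.03, 0.07
print(A * math.log(x / a), abs(A / x) * dx)   # about 5.527 +/- 0.014, so y = 5.53 +/- 0.01 m
```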


3.2     Fitting Data to a Straight Line

3.2.1   Introduction
Suppose we carry out N measurements of some quantity, y1, y2, y3, ..., yN, at N values
of another variable, x1, x2, x3, ..., xN. Furthermore, suppose that we know that y is a
linear function of x,
                                              y = A + Bx                                       (1)
where A and B are constants. If we could measure y to infinite precision, then the data points
would all fall on a straight line if we were to plot y vs. x. However, in a real experiment,
there is some limit to the precision with which we can make the measurement. Consequently,
the data points tend to be “scattered” around the “true” linear function. You will observe
this behaviour in a few of the experiments you will perform in this course. Suppose that
you are asked to “draw the best straight line” through these data points in order to obtain,
for example, the slope of the line. In previous courses, you have done this by “eye-balling”
the data and drawing the line that looked the most reasonable. There is, however, a
quantitatively rigorous approach to “fitting” lines to data.

3.2.2   Example: Measuring a Spring Constant
Here we consider a simple physical example to introduce the idea of fitting data to a straight
line. Consider a spring with spring constant k, and equilibrium length l0 . The magnitude of
the restoring force when the spring is stretched to length l is

                                         F = k(l − l0 )                                       (2)

   Let us conduct an experiment where we suspend a series of masses from the spring in
order to determine k. Given that the magnitude of the gravitational force on a mass m is
F = mg, it is easy to show that
                                  l = l0 + (g/k)m                              (3)
where g is the acceleration due to gravity. Comparing eqs. (1) and (3), we can make the
identification:

                              Equation (1)           Equation (3)

                                        y     ⇐⇒       l
                                        x     ⇐⇒       m
                                        A     ⇐⇒       l0
                                        B     ⇐⇒       g/k


Thus, if we plot l vs. m, we expect a straight line with a slope of g/k and a y-intercept of l0 .
   Let’s say that we make eight measurements of the length of the spring for eight different
suspended masses:




     mass m (kg)     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9
     length l (m)    0.051   0.055   0.059   0.068   0.074   0.075   0.086   0.094
   Let’s plot the data:

[Figure: the measured length l (m) plotted against the suspended mass m (kg).]

   Although we expect the points to lie on a straight line, they do not. Why not? Because
the measurements of l have not been made to 100 % precision. In other words, there is an
uncertainty associated with the measurement of l.
   In the present example, let us say that the uncertainty in the length l is 0.3 cm. For any
given measurement of l, we should write, for example, l = 0.051 ± 0.003 m. We can illustrate
this uncertainty by including error bars:
[Figure: the same l vs. m data, now drawn with error bars of ±0.003 m on each length
measurement.]
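
In the lab you will usually make plots like this with software. For reference, a minimal
sketch using Python’s matplotlib library (the plot styling is our own choice):

```python
import matplotlib.pyplot as plt

m = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]                    # kg
l = [0.051, 0.055, 0.059, 0.068, 0.074, 0.075, 0.086, 0.094]    # m

# Error bars of +/- 0.003 m on each length measurement.
plt.errorbar(m, l, yerr=0.003, fmt='o', capsize=3)
plt.xlabel('m (kg)')
plt.ylabel('l (m)')
plt.show()
```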

   We would like to determine the spring constant k. How do we do this with the present
data? If the data really conformed perfectly to the straight-line prediction of equation (3),
then we could simply measure the slope of the line, which is (g/k). As we know that
g = 9.80 m/s², knowledge of the slope yields k easily. However, the raw data do not lie on a
straight line due to uncertainties in the measurements. We therefore would like to construct
a “best fit” line, and use the slope of that line to determine the spring constant k.

3.2.3   Criterion for the “Best Fit” of a Straight Line
But how do we obtain this “best fit” straight line? To answer this question, let’s go back
to the original notation: i.e. we have made N measurements of the variable y, namely, y1 ,
y2 , y3 ,...,yN , at N values of the variable x, x1 , x2 , x3 ,...,xN . Further, suppose that we can
measure each yi to within an uncertainty of ±σi .
    Consider the quantity:

                                         ∆yi ≡ |yi − (A + Bxi )|

where

                                            y(xi ) = A + Bxi

is the equation for the “best” straight line. ∆yi is just the magnitude of the difference
between the y value of the data point and that of the straight line. Clearly, each ∆yi would
be zero if every point were on the line.


   Now consider the sum:

                  Σ_{i=1}^{N} ∆yi = Σ_{i=1}^{N} |yi − (A + Bxi)|

One criterion for the best fit would be to choose A and B such that the sum is minimized.
It turns out that a better criterion is to choose to minimize the related sum:

                  χ² ≡ Σ_{i=1}^{N} (∆yi)² / σi²

                     = Σ_{i=1}^{N} [ yi − (A + Bxi) ]² / σi²

Let us only consider the case that all the uncertainties are equal: σi = σ for all i = 1, ..., N .
Thus, the expression for χ² becomes:

                  χ² = (1/σ²) Σ_{i=1}^{N} [ yi − (A + Bxi) ]²

That is, we want to “minimize the sum of the squares of the differences” between the mea-
sured and fit y values. The quantity χ² (“chi-squared”, where χ is a Greek letter) is an
important quantity that plays a significant role in statistics.
   Does this criterion make sense? Clearly, if the ∆yi are large, then χ² will be large. Thus,
minimizing χ² will tend to reduce each of the ∆yi.
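
Before worrying about how to minimize it, note that χ² itself is only a couple of lines of
code. A sketch for the equal-uncertainty case:

```python
def chi2(A, B, x, y, sigma):
    """Sum of squared differences between the data and the line
    y = A + B*x, divided by sigma^2 (equal uncertainties)."""
    return sum((yi - (A + B * xi))**2 for xi, yi in zip(x, y)) / sigma**2
```

A poor choice of A and B gives a large χ²; the best fit line is the one whose A and B make
this number as small as possible.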
   How can we do this? You know from calculus that a function can be maximized or
minimized by taking a derivative and setting it equal to zero. In this case, we want to
minimize the function χ² – but with respect to what? The data points yi? No – these are
fixed quantities. Instead, we want to minimize χ² with respect to the parameters A and B.
   However, this looks a little strange. How can we minimize a function with respect to two
variables? The answer is simple. One simply takes two derivatives, and sets them each to
zero:


                                  ∂χ²/∂A = 0                                   (4)

                                  ∂χ²/∂B = 0                                   (5)
   Note the different notation: ∂χ²/∂A instead of dχ²/dA. These are called “partial
derivatives”. The meaning is simple: ∂χ²/∂A means to take the derivative with respect
to A while treating B as a constant. Likewise, ∂χ²/∂B means to take the derivative with
respect to B while treating A as a constant. You will learn all about partial derivatives in
calculus this year.
   The equations above imply the following:
                  Σ_{i=1}^{N} (yi − A − Bxi) = 0                               (6)

and

                  Σ_{i=1}^{N} xi (yi − A − Bxi) = 0                            (7)

   Proof of eq. (6):

                  ∂χ²/∂A = (∂/∂A) [ (1/σ²) Σ_{i=1}^{N} (yi − (A + Bxi))² ]

                         = (1/σ²) Σ_{i=1}^{N} (∂/∂A) (yi − (A + Bxi))²

   Using the chain rule to evaluate the derivative, we find

                  ∂χ²/∂A = (1/σ²) Σ_{i=1}^{N} 2 (yi − (A + Bxi)) (∂/∂A)(yi − (A + Bxi))

                         = (1/σ²) Σ_{i=1}^{N} 2 (yi − (A + Bxi)) (−1)

                         = (−2/σ²) Σ_{i=1}^{N} (yi − (A + Bxi))

   Now, imposing one of the conditions for minimization, i.e., ∂χ²/∂A = 0, we find:

                  (−2/σ²) Σ_{i=1}^{N} (yi − (A + Bxi)) = 0

   Dividing each side by −2/σ², we complete the proof:

                  Σ_{i=1}^{N} (yi − (A + Bxi)) = 0
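
If you would like to check this minimization without doing the calculus by hand, a computer
algebra sketch (using the sympy library, on a tiny made-up data set) reproduces the same
conditions:

```python
import sympy as sp

A, B = sp.symbols('A B')
xs = [1, 2, 3]
ys = [2.1, 3.9, 6.2]          # made-up data, purely for illustration

# chi-squared with sigma = 1
chi2 = sum((y - (A + B * x))**2 for x, y in zip(xs, ys))

# Setting both partial derivatives to zero and solving gives the best fit.
eqs = [sp.Eq(sp.diff(chi2, A), 0), sp.Eq(sp.diff(chi2, B), 0)]
print(sp.solve(eqs, [A, B]))
```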


  Exercise: Derive equation (7)




   From these equations, one can easily show that:

                  A = [ (Σ_i xi²)(Σ_i yi) − (Σ_i xi)(Σ_i xi yi) ] / ∆          (8)

                  B = [ N (Σ_i xi yi) − (Σ_i xi)(Σ_i yi) ] / ∆                 (9)

where

                  ∆ = N (Σ_i xi²) − (Σ_i xi)²                                  (10)


These results follow from solving the two equations (eqs. (6) and (7)) for the two unknowns
(A and B).
   While we do not derive it here, one can also show that the uncertainties in the “fitting”
coefficients A and B are given by the following:

                  σ_A² = σ² (Σ_i xi²) / ∆                                      (11)

and

                  σ_B² = N σ² / ∆                                              (12)
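
Equations (8)-(12) translate directly into a short program. Here is a sketch applied to the
spring data from the table, which you can use to check the hand calculation in the next
section (the approximate outputs in the comments are our own computation):

```python
import math

def linear_fit(x, y, sigma):
    """Least-squares fit of y = A + B*x with equal uncertainties sigma,
    following eqs. (8)-(12)."""
    N = len(x)
    Sx  = sum(x)
    Sy  = sum(y)
    Sxx = sum(xi**2 for xi in x)
    Sxy = sum(xi * yi for xi, yi in zip(x, y))
    Delta = N * Sxx - Sx**2                        # eq. (10)
    A = (Sxx * Sy - Sx * Sxy) / Delta              # eq. (8)
    B = (N * Sxy - Sx * Sy) / Delta                # eq. (9)
    sigma_A = math.sqrt(sigma**2 * Sxx / Delta)    # eq. (11)
    sigma_B = math.sqrt(N * sigma**2 / Delta)      # eq. (12)
    return A, B, sigma_A, sigma_B

m = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]                    # kg
l = [0.051, 0.055, 0.059, 0.068, 0.074, 0.075, 0.086, 0.094]    # m
A, B, sA, sB = linear_fit(m, l, 0.003)
print(A, B, sA, sB)    # about 0.0369, 0.0607, 0.0028, 0.0046
```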


3.2.4        Back to the Spring Constant Example

Now let’s go back to our example with the measurement of the spring constant. Use the
data in the table for l and m, together with the equations for the best-fit A and B, to
determine the spring constant k for this physical system.


        NOTE: In the following example, and in uncertainty calculations for labs in
        this course generally, keep extra significant figures until the very end of the
        calculation, i.e. until you reach the final calculated value of k and its
        uncertainty δk.

   Exercise: Evaluate the following quantities. Note that the variable x in the equations
is the mass m, and the variable y is the length of the spring, l.

Σ_i xi =

Σ_i yi =

Σ_i xi² =

Σ_i xi yi =

Now evaluate the quantity

∆ ≡ N (Σ_i xi²) − (Σ_i xi)² =

Now use the equations above to find the best fit A and B coefficients:

A = [ (Σ_i xi²)(Σ_i yi) − (Σ_i xi)(Σ_i xi yi) ] / ∆ =


B = [ N (Σ_i xi yi) − (Σ_i xi)(Σ_i yi) ] / ∆ =





Now we know the equation, y = A + Bx, describing the “best fit” line through the data
points. Let’s add this line to the plot of the raw data points. To do this, let’s consider
two points, at x = 0.20 and x = 1.00. Using the equation of the best fit line, calculate:

y(x = 0.20) =

y(x = 1.00) =


Now, we can add the best fit line to the figure. Plot the two points you’ve just calculated,
and then draw a line through them.

[Figure: the l vs. m data with error bars, axes l (m) vs. m (kg); draw your best fit line
here.]

   Are we done yet? No! There is still the matter of the uncertainties in our fitted A and B.
We can obtain these using equations (11) and (12):

σ_A² = σ² (Σ_i xi²) / ∆ =

σ_B² = N σ² / ∆ =


     Therefore,

σA =

σB =


     Thus, we can say that

A=                                    ±


B=                           ±


    Now let’s think about the meaning of these quantities. Recall from equation (3) that A
is just the equilibrium length of the spring, l0 , and B is the quantity (g/k). Okay, now the
purpose of this whole exercise is to use the original raw data to calculate the spring constant
k. Since B = (g/k), it is clear that

                                  k = g/B                                      (13)
Since we know g = 9.80 m/s², and we know that B =                          ±                    ,
we can easily calculate k, and find that

k=

   What about the uncertainty in k? Easy – just use the standard uncertainty propagation
rule:

                                  δ(f(x)) = |df/dx| δx                         (14)


  Thus, for arbitrary B ± δB, δk =                           .

For the calculated B =                      ±                     ,

δk =                     .

   The “final answer” to this whole problem is that the spring constant has been determined
from an analysis of the experimental data to be:

k=                       ±

(Don’t forget units, and be careful with significant figures!)
   Congratulations! You have just fit your first straight line to “experimental” data. You
can see that it is not a trivial exercise, and that the calculations can be quite tedious –
especially if you have lots of data points.
   You will be doing lots of “line fitting” in the lab component of this course – and quite
probably in other courses later on. However, don’t despair: you will have a computer program
do all the tedious calculations for you. The program will print out the coefficients of the
best fit line and their associated uncertainties. You will just have to enter the raw data
points and their uncertainties.


3.2.5   A Useful Tip for Future Labs
Finally, there is just one more matter. Let’s say that we collect some data in an experiment
where the measured quantity y is not related to the independent variable x in a linear
manner. For example, in the RC circuit experiment that you will do later in the course,
the measured voltage V across a resistor of resistance R in the circuit changes with time
according to
                                  V(t) = V0 e^(−t/RC)                          (15)
where V0 is the voltage at time t=0, and where C is a quantity called “capacitance”. Let’s
say that we measure a dozen or so values of V at different t, and we want to use this data to
find C. Can we use what we have learned about line fitting? It seems not, since equation (15)
is not a linear equation. However, consider the following newly defined quantity: y ≡ ln(V ).
Using the equation above, it is clear that

                  y = ln(V0 e^(−t/RC)) = ln(V0) + (−1/RC) t
                     ≡ A + Bt                                                  (16)

where A = ln(V0) and B = −1/(RC). Get the idea? We’ve transformed the non-linear
equation into a linear equation. Now you should know what to do: fit ln(V ) vs. t to a
straight line, calculate the slope B, and use this to calculate the capacitance C.
   One other thing to remember: be careful about the uncertainties in the raw data. If you
know the uncertainties δV , you will need to transform these to new uncertainties δ(ln V ),
using the usual uncertainty propagation rule,

                                  δ(f(x)) = |df/dx| δx                         (17)
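
Putting the whole procedure together, a sketch with made-up voltage readings (the numbers
below are ours, purely for illustration):

```python
import math

# Hypothetical RC-circuit data: times in s, voltages in V, with dV = 0.05 V.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
V = [5.00, 3.03, 1.84, 1.11, 0.67]

y  = [math.log(v) for v in V]    # transformed variable y = ln(V)
dy = [0.05 / v for v in V]       # eq. (17): delta(ln V) = delta(V)/V

# Now fit y = A + B*t to a straight line.  (Note the transformed
# uncertainties are no longer all equal, so a weighted fit, like the one
# the lab's fitting program performs, is the right tool.)  The slope is
# B = -1/(RC), so with a known R the capacitance is C = -1/(R*B).
```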

   One last comment: the type of fitting you have just done is called “linear least-squares
fitting”. It has this name since the criterion for determining the best fit is to minimize the
sum of the squares of the differences between the fitting line and the raw data points. It is
also sometimes called “linear regression”.
   Okay, now you’re all set for all the line fitting you will have to do in this lab.