# TOPICS IN PRECISION AGRICULTURE

Document Sample

```					            BIOLOGICAL AND AGRICULTURAL
ENGINEERING DEPARTMENT
INTRODUCTION TO PRECISION AGRICULTURE
ABT 175

Geostatistics – Semivariograms and Kriging
In the interpolation techniques discussed so far, neither there is a clear guideline to select
the radius of influence, nor there is a method to determine how good the interpolation is.
Kriging offers an elegant way to deal with these concerns. Kriging is an interpolation
technique that uses statistical nature of the variability of geo-spatial data. The spatial
variability in geological data consists of three sources – (1) a structural part consisting of
a trend, (2) a random, spatially correlated component, and (3) a random error term, i.e.

P(x)  m(x)  g(x)                                   (1)

where m(x) is the spatially dependent mean, g(x) is the random, spatially correlated part,
and  is the Gaussian error term (figure 1). The nature of m(x) can be determined by
techniques such as trend surface analysis and will not be discussed here. Once the
structural component, m(x), is removed, we are left with two random terms one of which
is spatially correlated [g(x)] and the other is a pure noise term ().

Y



g(x)

P(x)

m(x)

x
X
2

Figure 1. Three major components in geo-spatial data.

In the following discussion we make the following assumptions1:

1) Once the structural component is removed, the mean or expectation of the parameter
is independent of the location, i.e. E(Pi)=E(Pi+h),

2) Variation of parameter values separated by a distance h depends only on the
separation distance h but not on their location, i.e. C[Pi, Pi+h] is independent of
location x,

where E stands for expectation or mean and C stands for covariance. These assumptions
are known as second order stationarity assumptions.

Semivariogram, Covariance, and Correlogram: Consider the variation of Pi+h versus
Pi as shown in figure 2.

1
Review of basic statistics:
If Z is a random variable then its mean or expectation,  = E(Z) and its variance, =E(Z-)2 =E(Z2)-2.
Covariance between any two random variables, U and V is C(U,V)=E(UV-UV) =E(UV) -UV. If Uand V
are drawn from the same distribution, then U=V=and C(UV)=E(UV)-2
3

O’
(Pi+ h , Pi)

A

C
Pi+h
45o
B
(Pi, Pi)

45o

O
Pi

Figure 2. Variation of Pi+h versus Pi in geo-spatial data

If the separation distance, h is zero we expect all data to fall on a 45 degree line, OO’. An
observed data Pi+h which is separated by a nearby data point Pi by a distance h may be a
distance AC=AB cos(45o )= (Pi+h –Pi))/ 2 away from OO’. The moments of all such
data points which are separated by a distance h is given by:
n
1 n
M   (P h  P )2 / 2 
i      i             (Pi  h  Pi )2                  (2)
i1                  2 i 1

The average value of M given in equation (2) is a measure of dissimilarity between
nearby points separated by a distance h and is known as semi-variance, (h),

1 n
 (h)          (P  Pi )2
2n(h) i1 i h
(3)

The separation distance, h, is often referred to as lag. As separation distance approaches
zero, one would expect the semi-variance to approach zero. However, the semi-variance
4

usually approaches a non-zero value, known as nugget as h->0. This is due to the
Gaussian error component of the geo-spatial data. As lag increases, (h) also increases
and becomes relatively constant for lag values greater than certain distance, a, known as
range. This constant value of semi-variance for h > a is known as sill. Figure (3) shows
a typical semivariogram along with nugget, sill, and range for this case.

Range , a
(h)

Sill

Nugge t

Lag , h

Figure 3. A typical semivariogram which shows a clear nugget, sill, and range.

Instead of the measure of dissimilarity of nearby data points, we could have looked at the
similarity or correlation between points separated by a distance h. Covariance functions
provide a measure of similarity between points separated by lag, h. It is defined as:

1 n                                    1 n
C(h)    =        (Pi   Pi )(Pi h   Pi 1 )  n(h)  (Pi Pi  h   2 )
n(h) i 1                                   i1                    (4)

1 N
where        P which is the mean of all data points N.
N k 1 k
5

Sometimes an alternative to covariance function known as correlation function is used
(note that Pi refers to the mean of Pis). Correlation function is essentially a normalized
version of covariance function and ranges between 0 and 1. The covariance function
tends to be high when h=0 (i.e. correlation function is 1) and goes to zero for points
which are separated by distances greater or equal to the range (i.e. uncorrelated). Figure
4 is schematic of the covariance function, C(h) and correlogram (h). These are related
by following relationships2:

C(h)  C(0)   (h)                                           (5)

and

Correlogram,  (h)  C(h) / C(0)  1   (h) / C(0)           (6)

1
 (h)   E(Pi  Pi 1 )
2

2
1      2               2
 E(Pi  2Pi Pi 1  Pi 1 )
2
1
  i 2 )  2E(Pi Pi 1 )  E(Pi 1 )
2
2
E(P
2
 E(Pi2 )  E(PPi 1 )                  E(P
  i
2
)  E(Pi 2 )
1

 i )                        ) 
2         2                   2
E(P                    E(PP
i 1

    C(0)  C(h)
6

C(h)
(h)

1
C(0)

Range , a

0
0

O
h

Figure 4. Covariance and correlogram representation of geo-spatial variability.

As distance increases C(h) decreases and approaches zero and (h) increases and becomes
asymptotic to the sill value which is equal to the variance,2, of the random function.
From equation (5), we get:

C()  0  C(0)   ()
But  (h)   2
(7)
 C(0)   2
and C(h)   2   (h)

Therefore semivariogram, covariance, and correlogram are similar and for historical
reasons semivariogram is widely used in geostatistics.
7

Models of Semivariogram:

Spherical model:

3h 1 h 3 
 (h)  c0  c1 
     ( )                   for 0  h  a
2a 2 a 

 c 0  c1                                       for h  a     (8)
 (0)  0

Exponential model:

               h 
 (h)  c0  c1   exp
1                                                 (9)
                 a 

Linear model:

(h) = c0 + c1h                                                      (10)

The linear model (equation 10) is used when the semivariogram does not appear to have a
clear sill. Note c0 and c1 are constants obtained from curve fitting the experimental data.
Moreover, c0 is the nugget and (c0 + c1) is the sill or variance, 2.

Kriging: Interpolation technique in which weighting functions are derived from the
^
geostatistical spatial analysis of the sample variogram is known as kriging.            Let P 0 be
the estimated value of the geological data at point O. It can be written as:

^      n
P 0   wi Pi                                                        (11)
i 1
n
where   w
i 1
i    1 . The error or residual, R, in interpolation is given by:

^                 n
R  P  P 0  P   wi P
0         0        i                                            (12)
i 1

Variance of this residual can be expressed as:
n
Var(R)  E(P   wi P ) 2
0        i                                               (13)
i 1
8

since mean of the residual is zero3. The weights wi s are obtained by minimizing the
n
variance of residual subject to the constraint               w
i 1
i    1 with respect to wis and the

Lagrange multiplier, i.e.
n                 n
  E(P   wi P ) 2  2 ( wi  1)
0        i                                                            (14)
i 1              i 1
 
and            0                             for all i                              (15)
wi 
where  is the objective function.

This minimization procedure results in the following matrix equation for the weights4:

       n
           n
E(R)  E  0   wi P  E(P )   wi E(P )
P           i       0            i
3
     i 1             i 1

n
 n       
     wi      0                          wi  1

              i 1                                   i1     
4
Consider a simple two parameter case. The objective function  becomes:
9

w      C
 1   10 
C
 11     C12     C13      .        .        .        C1n     1 
w  C20 
 21
C        C22     C23      .        .        .        C2n     1  2
w  C 

.        .       .        .        .        .        .       .  3   30 
                                                                   
.       .
.        .       .        .        .        .        .       .                  (16)
                                                                 
.       .
C        Cn2     Cn3      .        .        .        Cnn     1
 n1                                                           w  C 
1
       1        1        .        .        .       1        0  n   n0 
      1
   

where Cij is the covariance between parameter values Pi and Pj. Similarly Cio is the
covariance between the parameter value at the unvisited point O (Po) and the parameter
value at point i (Pi). The procedure to find the weights are as follows:

Var(R)            E(P0  w1 P  w2 P2 )2
1

 E(P02  w12 P12  w2 P22  2w1 P0 P1  2w2 P0 P2  2w1w2 P P2 )
2
1

 E(P02 )  w12 E(P2 )  w2 E(P22 )  2w1 E(P0 P1 )  2w2 E(P0 P) 2  2w1w2 E(P1 P2 )
1
2

 C00   2  w12 (C11   2 )  w2 (C22   2 )  2w1 (C10   2 )  2w2 (C20   2 )  2w1 w2 (C12   2 )
2

2        2
 C00  w1 C11  w2 C22  2w1C10  2w2 C20  2w1w2 C12
  2   2 (w1  w1  2w1 w2 )  2  2 (w1  w2 )
2
2

 C00  w1 C11  w2 C22  2w1C10  2w2 C20  2w1w2 C12     (w1  w2 )  2  (w1  w2 )
2       2                                          2      2           2      2

 C00  w1 C11  w2 C22  2w1C10  2w2 C20  2w1w2 C12  ( 2   2  2 2 )
2        2

2       2
 C00  w1 C11  w2 C22  2w1C10  2w2 C20  2w1w2 C12
 The objective function  is given by :
  C00  w1 C11  w2 C22  2w1 C10  2w2 C20  2w1w2 C12  2 (w1  w2  1)
2
2


 2w1C11  2C10  2w2 C12  2   0 or w1C11  w2 C12    C10
w1

 2w2 C22  2C20  2w1C12  2   0 or w1C12  w2 C22    C20
w2

 2(w1  w2  1)  0 or w1  w2  1

Or in the matrix form:
 11
C        C12     1  1   10 
w       C
   
 12
C        C22     1  2   20 
w       C


1        1       0   w 
   
10

1)   From the experimental data determine the experimental variogram,
2)   Determine the variogram model,
3)   Use the model to determine Cij(h) and Ci0(h) for use in equation (16),
4)   Solve equation 16 for weights,
5)   Use equation (11) to predict desired value, Po.

The prediction obtained by kriging has the property of minimum variance. Moreover,
range,a, provides an effective radius of influence. If geological data shows anisotropy,
which is often the case, the technique discussed here can be extended to two dimensions
(Issaaks and Srivastava, 1989).

```
DOCUMENT INFO
Shared By:
Categories:
Stats:
 views: 41 posted: 1/3/2010 language: English pages: 10
How are you planning on using Docstoc?