Demonstration of variogram analysis by using a hypothetical example



                Demonstration of variogram analysis by using a

                                    hypothetical example




                                                   By



                                   C.P. Gunasena¹ / R.P.De. Silva²

¹ Department of Agric. Engineering, Faculty of Agriculture, University of Ruhuna
² Department of Agric. Engineering, Faculty of Agriculture, University of Peradeniya

                                  INTRODUCTION

Gold mining in African countries was very popular in the early 1960s. The concentration,
or density, of gold in a mine varies continuously through space and time. Economical
excavation needs sound estimation techniques to assess the amount of gold at a
particular location. To capture this continuous spatial variability, several statistical
techniques were developed, and these attempts led to the introduction of a new branch
of statistics named geostatistics.



The estimation techniques developed in the early stages apply wherever a continuous
measurement is made on a sample at a particular location in space or time and the
sample value is expected to be affected by its position and its relationship with its
neighbours.



During the early stages, histograms were used to show the variability between sample
points, which was time-consuming and laborious work with a large number of
observation points. Several problems were encountered when predicting the
concentration or density within the area of a mine to be investigated from a limited
number of peripheral samples.



Furthermore, when the variable of interest is plotted in the form of a histogram, it
often shows a highly skewed distribution with a very long tail towards the values where
the variable is concentrated (Figure 1).

[Figure 1: histogram; vertical axis: relative frequency; horizontal axis: thickness of a gold deposit in meters]

Figure 1 Hypothetical skewed relative frequency distribution of gold in a mine

Normal (Gaussian) statistical theory cannot be applied to such highly skewed
distributions unless a transformation is applied. In view of this, H. S. Sichel
applied a log-normal distribution to analyse such variables and achieved encouraging
results. Three major drawbacks exist in the application of Sichel's (t) estimator:
the 'background' probability distribution must be log-normal, the samples must be
independent, and no account is taken of the position of the samples.



Later attempts were made to incorporate location and spatial relationships into the
estimation procedure. It was recognised that, within a study area, the variable of
interest is more concentrated in some areas and less concentrated in others.



This pattern was captured by Trend Surface Analysis, introduced in the 1950s and
early 1960s. In this analysis trends were picked out by forming a 'rolling mean',
which produced a smoothed map on which areas of high and low concentration could be
distinguished.

The estimation procedure used in the early stages to evaluate the value at an unsampled
location can be described as follows. To demonstrate the procedure, a set of data was
formulated from a hypothetical distribution of a variable that varies continuously.



To get a holistic picture of the distribution, several samples of the variable were
taken and analysed. The spatial distribution of the sample points is shown in Figure 2.
The value at point (A) has to be estimated using the surrounding points numbered one
to five.




[Figure 2: sketch of point A surrounded by sample points numbered 1 to 5]

                  Figure 2. Hypothetical sample collecting points

The value at point A can be estimated using the sample values at the various locations
(1 to 5) shown in Figure 2. It is clear that the sample value at point 1 must be given
higher priority than the value at point 5 in any estimation procedure.



Furthermore, as Figure 2 suggests, the concentration of the variable at position 5 may
be 'very different' from the value at point (A), whereas the value of sample 1 is
unlikely to be 'very different' from the value at (A). In this sense an assumption has
to be made that the difference in value between two locations depends only on the
distance between them and their relative orientation.

If enough such pairs were taken, a histogram of the differences could be developed and
the distribution from which they were drawn could be investigated. In this manner a
histogram can be developed for every distance and direction within a given study area.



To develop a comprehensive picture of the study area, differences in sample values
should be calculated for as many distances and directions as possible. It would be
time-consuming and laborious to construct and investigate a histogram for each distance
and direction.



This procedure can be simplified by summarising each histogram with a few useful
statistical parameters, such as the arithmetic mean (average) and the variance or,
equivalently, the standard deviation.



This can be expressed with some notation. The distance between the samples and their
relative orientation are denoted together by (h), and the difference in value between
the two samples depends only on (h). Hence the same is true of the mean and the
variance of the differences. The mean difference in sample values is denoted m(h) and
the variance of these differences 2γ(h).




Experimental value for mean differences m(h)

Using a set of pairs of samples for a specific (h), an 'experimental' value for m(h)
can be calculated using the following statistical expression.

$$ m^*(h) = \frac{1}{n} \sum \left[ g(x) - g(x+h) \right] $$

The sample value is denoted by (g); (x) denotes the location of one sample in the pair
and (x+h) the location of the other. The number of pairs is denoted by (n). The
asterisk (*) indicates that the quantity has been calculated from the data rather than
derived theoretically.
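
As a minimal sketch of the mean difference described above, the Python below averages g(x) - g(x+h) over a list of pairs. The function name `mean_difference` and the four sample pairs are hypothetical, chosen only for illustration.

```python
def mean_difference(pairs):
    """m*(h): average of g(x) - g(x+h) over n sample pairs separated by h."""
    if not pairs:
        raise ValueError("at least one pair is required")
    return sum(g_x - g_xh for g_x, g_xh in pairs) / len(pairs)


# Hypothetical sample values for four pairs at the same separation h
pairs = [(3.2, 3.7), (3.7, 3.5), (3.5, 3.0), (3.0, 2.0)]
m_star = mean_difference(pairs)
```

Because positive and negative differences cancel in the sum, m*(h) can be close to zero even when the individual differences are large.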



It can be shown that this is not a very good way of estimating m(h), and a good
estimate involves intensive mathematical computation. The mean difference m(h)
represents the average, or 'expected', difference in value between two samples. If m(h)
is zero, we 'expect' no difference between sample values a distance h apart, but in the
real world this will not be an appropriate estimate. Variograms were therefore
developed later, based on the variance of the differences between sample points.


Experimental values for the variance of the differences 2γ(h)

The variance of the differences is denoted by 2γ(h) and is usually defined as the
variogram. The variogram indicates how the parameter of interest varies with distance
and direction (h). Assuming that there is no trend between point (A) and the sample
points concerned, the variance of the differences can be calculated using the
following statistical expression.

$$ 2\gamma^*(h) = \frac{1}{n} \sum \left[ g(x) - g(x+h) \right]^2 $$

The quantity γ(h) is called the semi-variogram (often loosely the variogram), and
γ*(h) denotes the calculated, experimental semi-variogram. The separation in distance
and direction between the samples is denoted by (h). The semi-variogram is measured in
the square of the units of the sample values, (unit)², and the experimental
semi-variogram can be calculated for as many different values of (h) as possible.
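
The definition can be illustrated with a short Python sketch that takes half the mean squared difference over a list of pairs. The function name `experimental_semivariogram` and the sample pairs are hypothetical, and the sketch assumes there is no trend, as stated above.

```python
def experimental_semivariogram(pairs):
    """gamma*(h): half the mean squared difference over n pairs separated by h."""
    if not pairs:
        raise ValueError("at least one pair is required")
    squared = [(g_x - g_xh) ** 2 for g_x, g_xh in pairs]
    return sum(squared) / (2 * len(pairs))


# Hypothetical sample values for four pairs at the same separation h
pairs = [(3.2, 3.7), (3.7, 3.5), (3.5, 3.0), (3.0, 2.0)]
gamma_star = experimental_semivariogram(pairs)

# A sample paired with itself (h = 0) gives a semi-variogram of zero
gamma_zero = experimental_semivariogram([(3.0, 3.0)])
```

If the sample values are in mg/litre, the result is in (mg/litre)².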



Values of a semi-variogram are conveniently displayed as a graph. The distance between
the pairs of samples is plotted along the horizontal axis and the value of the
semi-variogram along the vertical axis.

                                   OBJECTIVES

To discuss the application of variogram analysis using a data set formulated from a
hypothetical sample survey of the distribution of fluoride content in an aquifer
within the Anuradhapura district.


                           MATERIALS AND METHODS

A sample survey was conducted, and a set of data on the fluoride concentration in a
groundwater aquifer in the Anuradhapura district was obtained. Concentration is given
in mg/litre. Samples were collected on a grid in which each cell measures 500 meters by
500 meters, as shown in Figure 3. The aquifer was stratified, a set of drill-holes was
bored, and water samples were collected for analysis.

The value marked at each location denotes the average fluoride content in mg/litre at
that grid intersection. Since the problem is two-dimensional, the (h) in the definition
of the semi-variogram depends on both the distance between a pair of samples and their
relative orientation in the two-dimensional plane. For the calculation of the
semi-variogram, the following statistical expression was used.

$$ \gamma^*(h) = \frac{1}{2n} \sum \left[ g(x) - g(x+h) \right]^2 $$

Samples were taken in the west – east, south – north and diagonally south west – north

east directions. Experimental semi-variograms were constructed with respect to the

relative orientation.


Since the sample collection points lie on a 500 meter by 500 meter grid, the
experimental semi-variogram γ* can be calculated along the grid directions only for
distances that are multiples of 500 meters. At zero distance the value of γ*(0) is
equal to zero.



           4.5      -      4.0     1.5     3.8     4.5     0.6      -      3.0

           3.8      -      3.5     2.5     2.8     4.5     4.8     4.3     2.5

           3.2     3.7     3.5     3.0     2.0     3.6     3.8     3.3     3.4

           3.5     2.8     4.2     3.7     3.8     4.2     2.5     3.7     3.5

           2.6     3.5     2.8     3.8     2.6     4.8     3.5     3.0     4.2

           2.0     3.0     2.6     2.8     3.5     2.0     2.6     3.5     4.6

  Figure 3 Sample collecting points marked on the 500 m grid, indicating the fluoride
  content in mg/litre (a dash marks a grid node with no recorded value)


                          RESULTS AND DISCUSSION

Following the expression for the experimental semi-variogram given above, the
calculation takes the difference between each pair of observations, squares it, adds
all the squared differences, and divides the sum by twice the number of pairs to give
the value of the semi-variogram.
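
The procedure can be sketched for a gridded data set as follows. This is an illustrative Python sketch, not the hand tabulation used in the tables: the function name `directional_semivariogram` is hypothetical, unsampled nodes are assumed to be stored as `None`, and every available pair in the chosen direction is used. The small grid is the north-west corner of Figure 3.

```python
def directional_semivariogram(grid, d_row, d_col):
    """Return (gamma*, n) for pairs separated by the step (d_row, d_col).

    grid is a list of rows (northernmost row first); None marks a node
    with no sample. The step is counted in grid nodes, so (0, 1) pairs
    each node with its eastern neighbour and (1, 0) with the node to
    the south of it.
    """
    squared = []
    for r, row in enumerate(grid):
        for c, a in enumerate(row):
            r2, c2 = r + d_row, c + d_col
            if not (0 <= r2 < len(grid) and 0 <= c2 < len(grid[r2])):
                continue  # partner would fall outside the grid
            b = grid[r2][c2]
            if a is None or b is None:
                continue  # skip pairs with a missing sample
            squared.append((a - b) ** 2)
    if not squared:
        raise ValueError("no pairs in this direction")
    return sum(squared) / (2 * len(squared)), len(squared)


# North-west corner of the Figure 3 grid (mg/litre)
grid = [
    [4.5, None, 4.0],
    [3.8, None, 3.5],
    [3.2, 3.7, 3.5],
]
gamma_we, n_we = directional_semivariogram(grid, 0, 1)  # west-east, 500 m
gamma_sn, n_sn = directional_semivariogram(grid, 1, 0)  # south-north, 500 m
```

Returning the number of pairs alongside the value makes it easy to judge how much weight each point of the semi-variogram deserves.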


Table 1 Calculation procedure for the analysis of samples taken 500 meters apart
from West to East direction

 γ*(500) = [ (4.0-1.5)² + (2.5-3.8)² + (3.8-4.5)² + (4.5-0.6)² +
             (3.5-2.5)² + (2.5-2.8)² + (2.8-4.5)² + (4.5-4.8)² +
             (4.8-4.3)² + (4.5-2.5)² + (3.2-3.7)² + (3.7-3.5)² +
             (3.5-3.0)² + (3.0-2.0)² + (2.0-3.6)² + (3.6-3.8)² +
             (3.8-3.3)² + (3.3-3.4)² + (3.5-2.8)² + (2.8-4.2)² +
             (4.2-3.7)² + (3.7-3.8)² + (3.8-4.2)² + (4.2-2.5)² +
             (2.5-3.7)² + (3.7-3.5)² + (2.6-3.5)² + (3.5-2.8)² +
             (2.8-3.8)² + (3.8-2.6)² + (2.6-4.8)² + (4.8-3.5)² +
             (3.5-3.0)² + (3.0-4.2)² + (2.0-3.0)² + (3.0-2.6)² +
             (2.6-2.8)² + (2.8-3.5)² + (3.5-2.0)² + (2.0-2.6)² +
             (2.6-3.5)² + (3.5-4.6)² ] / (2 × 42)
         = 36.46/84
         = 0.4340 (mg/l)²

The value calculated for a distance of 500 meters (Table 1) measures the variability of
the fluoride concentration between points 500 meters apart over the study area, and it
can be plotted on the experimental semi-variogram γ* against the distance between the
samples (h). By altering the sample distance, further values can be obtained for the
construction of the semi-variogram.
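
As a hedged illustration of altering the sample distance, the Python sketch below evaluates the experimental semi-variogram at one, two and three grid spacings along a single west-east row of Figure 3. The function name `semivariogram` is hypothetical, and the sketch uses every available pair at each lag, which is an assumption and may differ from the pair selection in the hand-calculated tables.

```python
def semivariogram(values, lag):
    """gamma*(h) along an equally spaced transect; lag is counted in spacings."""
    squared = [(values[i] - values[i + lag]) ** 2
               for i in range(len(values) - lag)]
    if not squared:
        raise ValueError("no pairs at this lag")
    return sum(squared) / (2 * len(squared))


SPACING_M = 500  # grid spacing stated in the text
# Third row of Figure 3, west to east (mg/litre)
transect = [3.2, 3.7, 3.5, 3.0, 2.0, 3.6, 3.8, 3.3, 3.4]

# One (distance in meters, gamma*, number of pairs) entry per lag
points = [(lag * SPACING_M, semivariogram(transect, lag), len(transect) - lag)
          for lag in (1, 2, 3)]
```

Each entry of `points` corresponds to one plotted point of an experimental semi-variogram for that row alone.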

Table 2 Calculation procedure for the analysis of samples taken 1000 meters
apart from West to East direction

 γ*(1000) = [ (4.5-4.0)² + (4.0-3.8)² + (3.8-0.6)² + (0.6-3.4)² +
              (3.8-3.5)² + (3.5-3.8)² + (3.8-4.8)² + (4.8-3.5)² +
              (3.2-3.2)² + (3.2-2.0)² + (2.0-3.8)² + (3.8-3.4)² +
              (3.5-3.0)² + (3.0-3.2)² + (3.2-2.5)² + (2.5-3.5)² +
              (2.6-2.8)² + (2.8-2.6)² + (2.6-3.5)² + (3.5-3.2)² +
              (2.0-3.6)² + (3.6-3.0)² + (3.0-2.6)² + (2.6-3.0)² ] / (2 × 24)
          = 31.92/48
          = 0.665 (mg/l)²


Furthermore, samples 1500 meters apart in the same direction can be used to
investigate the distribution of the phenomenon.

Table 4-3 Calculation procedure for the analysis of samples taken 1500 meters
apart from West to East direction

 γ*(1500) = [ (4.5-4.5)² + (4.5-0.6)² + (3.8-2.5)² + (2.5-4.8)² +
              (3.2-3.0)² + (3.0-3.8)² + (3.5-3.7)² + (3.7-2.5)² +
              (2.6-3.5)² + (3.5-3.5)² + (2.0-2.0)² + (2.0-2.6)² ] / (2 × 12)
          = 28.76/24
          = 1.198 (mg/l)²


Similarly, values of the semi-variogram can be calculated for the south-to-north
direction as well as diagonally for the south-west to north-east direction.

Table 4-4 Calculation procedure for the analysis of samples taken 500 meters
apart from South to North direction

 γ*(500) = [ (2.0-2.6)² + (2.6-3.5)² + (3.5-3.2)² + (3.2-3.8)² +
             (3.8-4.5)² + (3.0-2.5)² + (2.5-3.8)² + (3.8-3.7)² +
             (3.6-2.8)² + (2.8-3.0)² + (3.0-3.2)² + (3.2-3.5)² +
             (3.5-4.0)² + (2.0-3.5)² + (3.5-3.7)² + (3.7-3.0)² +
             (3.0-2.5)² + (2.5-4.5)² + (3.0-2.6)² + (2.6-3.2)² +
             (3.2-2.0)² + (2.0-3.8)² + (3.8-3.8)² + (3.0-2.8)² +
             (2.8-3.2)² + (3.2-3.0)² + (3.0-4.8)² + (4.8-4.5)² +
             (2.6-3.5)² + (3.5-2.5)² + (2.5-3.8)² + (3.8-4.8)² +
             (4.8-0.6)² + (3.5-3.0)² + (3.0-3.7)² + (3.7-3.3)² +
             (3.3-4.5)² + (3.0-3.2)² + (3.2-3.5)² + (3.5-3.4)² +
             (3.4-3.5)² + (3.5-3.4)² ] / (2 × 42)
         = 45.56/84
         = 0.5297 (mg/l)²

Table 4-5 Calculation procedure for the analysis of samples taken 1000 meters
apart from South to North direction

 γ*(1000) = [ (2.0-3.5)² + (3.5-3.8)² + (3.0-3.8)² + (3.6-3.0)² +
              (3.0-3.5)² + (2.0-3.7)² + (3.7-2.5)² + (3.0-3.2)² +
              (3.2-3.8)² + (3.0-3.2)² + (3.2-4.8)² + (2.6-2.5)² +
              (2.5-4.8)² + (3.5-3.7)² + (3.7-4.5)² + (3.0-3.5)² +
              (3.5-3.5)² ] / (2 × 17)
          = 42.25/34
          = 1.2426 (mg/l)²

Table 4-6 Calculation procedure for the analysis of samples taken 1500 meters
apart from South to North direction

 γ*(1500) = [ (2.0-3.2)² + (3.0-3.7)² + (3.6-3.2)² + (2.0-3.0)² +
              (3.0-2.0)² + (3.0-3.0)² + (2.6-3.8)² + (3.5-3.3)² +
              (3.0-3.4)² ] / (2 × 9)
          = 31.54/18
          = 1.7522 (mg/l)²


To calculate the distance between adjacent sample points along the diagonal, the
Pythagorean theorem was used, giving a diagonal distance of about 707 meters.
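
The diagonal distances used in the following tables can be checked with a small Python sketch. The function name `lag_distance` is hypothetical; the 500 meter spacing is the grid spacing stated in the text.

```python
import math

SPACING_M = 500.0  # grid spacing stated in the text


def lag_distance(steps_east, steps_north, spacing=SPACING_M):
    """Distance in meters between two grid nodes a given number of steps apart."""
    return math.hypot(steps_east * spacing, steps_north * spacing)


# One, two and three steps along the south-west to north-east diagonal
diagonal_m = [round(lag_distance(k, k)) for k in (1, 2, 3)]
```

Rounded to the nearest meter this gives 707, 1414 and 2121 meters, the three diagonal separations analysed below.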

Table 4-7 Calculation procedure for the analysis of samples taken 707 meters
apart from South West to North East direction diagonally

 γ*(707) = [ (3.5-3.7)² + (3.7-3.5)² + (3.5-4.5)² + (2.6-3.8)² +
             (3.8-3.2)² + (3.2-2.5)² + (2.5-3.8)² + (2.0-2.5)² +
             (2.5-3.0)² + (3.0-3.0)² + (3.0-3.8)² + (3.8-4.5)² +
             (3.0-2.8)² + (2.8-3.7)² + (3.7-2.0)² + (2.0-4.8)² +
             (4.8-0.6)² + (3.6-3.5)² + (3.5-3.2)² + (3.2-3.0)² +
             (3.0-4.8)² + (2.0-2.6)² + (2.6-3.2)² + (3.2-3.8)² +
             (3.8-4.5)² + (4.5-3.4)² + (3.0-2.8)² + (2.8-2.5)² +
             (2.5-3.3)² + (3.3-3.5)² + (3.0-3.5)² + (3.5-3.7)² +
             (3.7-3.4)² + (2.6-3.0)² + (3.0-3.5)² + (3.5-3.2)² ] / (2 × 36)
         = 43.76/72
         = 0.60777 (mg/l)²


Table 4-8 Calculation procedure for the analysis of samples taken 1414 meters
apart from South West to North East direction diagonally

 γ*(1414) = [ (3.2-4.0)² + (3.5-3.5)² + (2.6-3.2)² + (3.2-3.8)² +
              (2.0-3.0)² + (3.0-3.8)² + (3.0-3.7)² + (3.7-4.8)² +
              (3.6-3.2)² + (3.2-4.8)² + (2.0-3.2)² + (3.2-4.5)² +
              (3.0-2.5)² + (2.5-3.5)² + (3.0-3.7)² + (2.6-3.5)² ] / (2 × 16)
          = 13.1/32
          = 0.4093 (mg/l)²


Table 4-9 Calculation procedure for the analysis of samples taken 2121 meters
apart from South West to North East direction diagonally

 γ*(2121) = [ (3.5-4.5)² + (2.6-2.5)² + (2.0-3.0)² + (3.0-2.0)² +
              (3.6-3.0)² + (2.0-3.8)² + (3.0-3.3)² + (3.0-3.4)² ] / (2 × 8)
          = 6.86/16
          = 0.4287 (mg/l)²


Using the calculated data, Table 4-10 can be developed to summarise the values of the
semi-variogram.

Table 4-10. Calculated values for the semi-variogram

 Direction                   Distance between   Experimental semi-    Number of
                             samples (m)        variogram (mg/l)²     pairs
 West to East                 500               0.4340                42
                             1000               0.6650                24
                             1500               1.1980                12
 South to North               500               0.5297                42
                             1000               1.2426                17
                             1500               1.7522                 9
 South West to North East     707               0.6077                36
                             1414               0.4093                16
                             2121               0.4287                 8


A graph can then be drawn of the experimental semi-variogram versus the distance
between sample points, as shown in Figure 4.


[Figure 4: line graph; vertical axis: experimental semi-variogram in (mg/l)², 0 to 1.8; horizontal axis: distance between sample points in meters, 0 to 1500; series: N to S, W to E, SW to NE]

Figure 4 Constructed semi-variogram for the distribution of fluoride content in
groundwater in Anuradhapura district



The semi-variograms developed from the original data show a clear difference between
the three directions. The south-north semi-variogram rises more sharply than the
west-east semi-variogram.



This implies greater continuity of the fluoride concentration in the west-east
direction. To examine this tendency, another semi-variogram was constructed diagonally
in the south-west to north-east direction and plotted on the same graph. According to
the values in Table 4-10, it lies between the other two semi-variograms at 707 meters
and below both of them at 1414 meters.


The reliability of an experimental semi-variogram depends on the number of sample
pairs used. A value calculated from fewer pairs is a less reliable estimate. The
south-north and west-east semi-variograms both behave consistently up to 1000 meters.



The south-north semi-variogram is almost a straight line up to a distance of 1000 m,
with slope [1.24 (mg/litre)²] / [1000 m] = 0.00124 (mg/litre)² per meter. Thus, for
the south-north direction the semi-variogram can be written as

$$ \gamma(h) = 0.00124\,h \ \text{(mg/l)}^2 $$

The west-east semi-variogram is almost a straight line up to a distance of 1000 m,
with slope [0.7 (mg/litre)²] / [1000 m] = 0.0007 (mg/litre)² per meter. Thus, for the
west-east direction the semi-variogram can be written as

$$ \gamma(h) = 0.0007\,h \ \text{(mg/l)}^2 $$

Since the semi-variogram shows graphically how the variance of the parameter changes
with distance, taking the square root of a semi-variogram value gives a standard
deviation. For samples one meter apart, the south-north model gives √0.00124 ≈ 0.035
mg/litre and the west-east model gives √0.0007 ≈ 0.026 mg/litre. Hence the fluoride
content of the groundwater is expected to differ by about 0.035 mg/litre over one
meter in the south-to-north direction and by about 0.026 mg/litre over one meter in
the west-to-east direction.
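
The arithmetic behind the two linear models and the quoted standard deviations can be checked with a short Python sketch. The function names are hypothetical, and the model is assumed to be a straight line through the origin fitted only to the value read at 1000 m, as in the text.

```python
import math


def linear_slope(gamma_at_h, h):
    """Slope of a linear semi-variogram through the origin, in (mg/l)^2 per meter."""
    return gamma_at_h / h


def gamma_linear(slope, h):
    """Linear model gamma(h) = slope * h, in (mg/l)^2."""
    return slope * h


slope_sn = linear_slope(1.24, 1000.0)  # south-north, value read at 1000 m
slope_we = linear_slope(0.7, 1000.0)   # west-east, value read at 1000 m

# Square root of the modelled semi-variogram for samples one meter apart (mg/l)
sd_sn = math.sqrt(gamma_linear(slope_sn, 1.0))
sd_we = math.sqrt(gamma_linear(slope_we, 1.0))
```

Because the model is linear in h, the square root grows with the square root of the distance rather than in proportion to it; for example, `math.sqrt(gamma_linear(slope_sn, 100.0))` is ten times `sd_sn`, not one hundred times.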



                                   CONCLUSION

Variogram analysis helps to build up a picture of the fluctuations of Fluoride content

in the groundwater within the study area and a simple model can be constructed to

describe the differences in Fluoride content.




References

http://uk.geocities.com/drisobelclark/PG1979/index.htm

http://www.surpac.com/refman/default/stats/geostat.htm

                                      ABSTRACT


To capture continuous spatial variability, several statistical techniques have been
developed. During the early stages, histograms were used to show the variability
between sample points, which was time-consuming and laborious work with a large number
of observation points. Since most of the distributions were skewed, Normal (Gaussian)
statistical theory could not be applied. Later attempts were made to incorporate
location and spatial relationships into the estimation procedure.



The variance of the differences between sample values at a given separation is defined
statistically as the variogram, and it indicates how the parameter of interest varies
with distance and direction. The present paper discusses the application of variogram
analysis using a data set formulated from a hypothetical sample survey of the
distribution of fluoride content in an aquifer within the Anuradhapura district.
Samples were collected on a grid in which each cell measures 500 meters by 500 meters.
Samples were taken in the west-east, south-north and diagonal south-west to north-east
directions. Experimental semi-variograms were constructed with respect to the relative
orientation. The results revealed that the variability of fluoride content in the
groundwater aquifer is high in the south-to-north direction and lower in the
west-to-east direction.



Variogram analysis helps to build up a picture of the fluctuations of Fluoride content

in the groundwater within the study area and a simple model can be constructed to

describe the differences in Fluoride content.

				