Sampling
Methodology Glossary Tier 2

Tier 1 showed that sampling is extremely useful when you want to find out
information about a large group of people but do not have time/resources to
speak to each person individually. We will now look at some sampling
techniques in more detail.

Simple Random Sampling

Theoretically, simple random sampling is the ideal method to use as it gives
an equal probability of being selected to every unit in the population.
Obtaining a simple random sample is relatively straightforward. A unique
number is assigned to each unit and a random number generator is then
used to choose which data will be included in the sample.

For example, assume a researcher has 10 individual measurements for
height. The researcher wishes to choose 5 for an experiment, so each
measurement is assigned a number from 0 to 9. On the first draw, each
measurement has a 10% chance of being selected; overall, each has a 5 in 10
chance of being included in the sample.

Person     A       B      C      D     E      F      G     H      I      J
Height(cm) 175     165    150    178   173    155    169   180    153    172
Number     0       1      2      3     4      5      6     7      8      9

Computer programs such as Excel can generate a list of random numbers,
e.g. 5, 9, 6, 4, 1, 2, 3, 5, 7, 8, 6, 4, 0. Taking the first five distinct
numbers (5, 9, 6, 4 and 1) means that Person F will be chosen first,
followed by Persons J, G, E and B.
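
The selection above can be sketched in Python. This is a minimal
illustration, not a prescribed procedure: the helper names first_distinct
and simple_random_sample are assumptions made for this example. The first
helper replays the list of random numbers given above; the second lets
Python's standard random module do the drawing.

```python
import random

people = "ABCDEFGHIJ"  # Person A is number 0, Person J is number 9
heights_cm = dict(zip(people, [175, 165, 150, 178, 173, 155, 169, 180, 153, 172]))


def first_distinct(draws, n):
    """Return the first n distinct numbers from a stream of random draws."""
    picked = []
    for number in draws:
        if number not in picked:  # a repeated number is skipped
            picked.append(number)
        if len(picked) == n:
            break
    return picked


def simple_random_sample(units, n, seed=None):
    """Draw n distinct units; every unit is equally likely to be included."""
    rng = random.Random(seed)
    return rng.sample(list(units), n)


# Replaying the random numbers from the example in the text.
draws = [5, 9, 6, 4, 1, 2, 3, 5, 7, 8, 6, 4, 0]
picked_people = [people[i] for i in first_distinct(draws, 5)]
print(picked_people)  # ['F', 'J', 'G', 'E', 'B']

# Letting the computer draw the sample directly (result depends on the seed).
random_pick = simple_random_sample(heights_cm, 5, seed=1)
print(random_pick, [heights_cm[p] for p in random_pick])
```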

In practice, however, simple random sampling can be time-consuming and
may not return a sample which contains the specific members of interest,
so other methods are often used.

Stratified Random Sampling

It is common for people who are conducting statistical analysis to want to
obtain information about key subsections of a population. Stratified random
sampling is a technique which involves dividing a population or sampling
frame into several, non-overlapping ‘strata’ (subgroups) according to a
particular characteristic which reflects the variables of interest. Once the
population or sampling frame is divided appropriately, simple random
samples would then be selected from within each stratum. This is
preferable to simple random sampling when results are needed for each stratum.
Examples of usual stratification characteristics are age-group, gender,
income bracket and ethnic origin.
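
As a rough sketch of the procedure just described, the Python below groups
a sampling frame into strata and then draws a simple random sample within
each one. The function name stratified_sample, the age groups and the
sample sizes are all hypothetical, chosen only to illustrate the idea.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Simple random sample of n_per_stratum[s] units within each stratum s."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        # Strata do not overlap: each unit is placed in exactly one stratum.
        strata[stratum_of(unit)].append(unit)
    return {name: rng.sample(members, n_per_stratum[name])
            for name, members in strata.items()}


# Hypothetical sampling frame of 100 respondents: (respondent id, age group).
frame = ([(i, "16-34") for i in range(0, 50)]
         + [(i, "35-64") for i in range(50, 90)]
         + [(i, "65+") for i in range(90, 100)])

n_per_stratum = {"16-34": 5, "35-64": 4, "65+": 1}  # assumed sample sizes
sample = stratified_sample(frame, stratum_of=lambda unit: unit[1],
                           n_per_stratum=n_per_stratum, seed=0)
for age_group, members in sample.items():
    print(age_group, members)
```

Every age group is guaranteed to appear in the sample, which a single
simple random sample of 10 respondents would not guarantee.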

For instance, one may be particularly interested in survey responses by the
ages of the respondents. If simple random sampling was used to select the
sample, it would be possible that one/some groups may be over-represented
and one/some may be under-represented. To ensure that adequate numbers
of people from each age group are included in the sample, it would be
necessary to conduct stratified sampling, with each age-group forming one
stratum.

The advantages of stratified random sampling are that it ensures better, more
representative coverage of the population of interest and enables the
important subgroups to be properly accounted for. Stratification will allow
greater precision than simple random sampling so long as (i) members of the
same stratum are as similar as possible to one another in relation to the
stratification characteristic; and (ii) the differences between each separate
stratum are as big as possible.

This technique can also make the sampling strategy more efficient. Without
stratification it may be necessary, and therefore more costly, to have a very
large sample in order to ensure that each subgroup of the population is
included in the analysis. By stratifying and ensuring that each group will have
sufficient representation in the sample, it is possible to achieve more precise
and reliable results.

However, a disadvantage of stratified random sampling is that it can take
longer to prepare and develop samples, because identifying appropriate
strata can be difficult, and the analysis can be quite complex. Furthermore,
a good knowledge of the population is required.

Proportionate Allocation

Normally sample sizes are proportionate to the size of the stratum which
means that each stratum has the same sampling fraction. The proportionate
allocation method of stratification is used to ensure that sample sizes for
strata are of their expected size in relation to the population.


A company employs 180 people and wishes to conduct an employee
satisfaction survey on a sample of 40. The company is made up of staff as
shown in the table below.

                           Type of Staff        Number of Staff
                           male, full time            90
                           male, part time            18
                           female, full time           9
                           female, part time          63

It would make sense for the company to select a stratified random sample to
ensure that they are able to capture members of staff from each of the
distinct subgroups. To do this, they first need to establish the percentage of
staff who are in each stratum. The table below shows that 50% of the
company’s staff are male and full time. This means that in the sample of 40
employees, 20 respondents (50%) should be full time males. Similarly, the
sample should contain 4 part time males (10%), 2 full time females (5%) and
14 part time females (35%).

                      No. Staff    % of Total No. Staff    No. in Sample
  Male, full time        90                 50                    20
  Male, part time        18                 10                     4
  Female, full time        9                 5                     2
  Female, part time      63                 35                    14
  Total                 180                100                    40
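
The arithmetic behind the table can be reproduced with a short Python
sketch. The function name proportionate_allocation is an assumption made
for this illustration; the staff numbers are those of the example.

```python
def proportionate_allocation(stratum_sizes, sample_size):
    """Share the sample across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    return {name: sample_size * size / total
            for name, size in stratum_sizes.items()}


staff = {
    "male, full time": 90,
    "male, part time": 18,
    "female, full time": 9,
    "female, part time": 63,
}

allocation = proportionate_allocation(staff, sample_size=40)
for name, n in allocation.items():
    share = 100 * staff[name] / sum(staff.values())
    print(f"{name:<18} {share:5.1f}%  {n:g} in sample")
```

In this example every stratum receives a whole number of respondents. In
general the results are not whole numbers and have to be rounded, in which
case the rounded figures may need adjusting so that they still add up to
the intended sample size.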

The sampling fraction in this example was the same for each of the strata – the
sample sizes for each stratum were of their expected size in relation to the
population. The sampling fraction is calculated by dividing the number of
people from each stratum in the sample by the total number of people in that
stratum.

                               Calculation    Sampling Fraction
          male, full time     20/90           0.22
          male, part time     4/18            0.22
          female, full time   2/9             0.22
          female, part time   14/63           0.22
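
The sampling fractions in the table can be checked exactly with Python's
standard fractions module. This is only a verification sketch; the
variable names are assumptions made for the example.

```python
from fractions import Fraction

in_population = {"male, full time": 90, "male, part time": 18,
                 "female, full time": 9, "female, part time": 63}
in_sample = {"male, full time": 20, "male, part time": 4,
             "female, full time": 2, "female, part time": 14}

# Sampling fraction = number sampled from a stratum / number in that stratum.
sampling_fraction = {name: Fraction(in_sample[name], in_population[name])
                     for name in in_population}

for name, fraction in sampling_fraction.items():
    print(f"{name:<18} {fraction} = {float(fraction):.2f}")
```

Each stratum reduces to the same fraction, 2/9 (about 0.22), which is also
the overall fraction 40/180. That equality is what makes the allocation
proportionate.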

Disproportionate Allocation

With disproportionate stratification, the sampling fraction will vary from one
stratum to another. This method is often used in cases when there is one (or
more) minority group(s) within the population which are likely to be
particularly under-represented or omitted from a simple random sample,
unless specific attention is paid to them. It therefore gives larger than
proportionate sample sizes for one or more strata to ensure that separate
analyses by sub-group will be possible.


Suppose a local magazine has 2000 readers, of whom 100 are female and
1900 are male, and that it wishes to survey a sample of these. If we were to
select a simple random sample of 100 readers we would expect by chance
alone to get 5 females and 95 males. In order that both groups of readers be
adequately represented within our sample we may decide to split the
readership list into two strata (male and female) and select a separate
sample from each stratum. It is
decided to sample 50 male readers and 50 female readers. This means that
the sampling fraction to be applied to the male stratum would be 1 in 38
(50/1900) and the sampling fraction to be applied to the female stratum would
be 1 in 2 (50/100). Clearly, each stratum has a different sampling
fraction – they are disproportionate. By adopting this approach we have
ensured that the minority group (females) has been more fully represented
in the sample survey than it otherwise would have been.

In order to ensure that inferences made from the survey are representative of
the whole population, it is necessary for the survey estimates to be weighted.
Calculating sample weights is quite straightforward – the sample weight is
simply the inverse of the sampling fraction that was applied to the stratum.
So for the ‘male’ stratum, where the sampling fraction was 1 in 38, all males
would be given a weight of 38. Similarly, for the ‘female’ stratum, where the
sampling fraction was 1 in 2, all females would be given a weight of 2. This
ensures that any analysis carried out using survey estimates is representative
of the entire population. Further information about weighting can be found at:
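
To see why weighting matters, the sketch below computes the weights for the
magazine example and applies them to a made-up survey result. The stratum
sizes and sample sizes come from the example; the numbers of "yes" answers
are purely hypothetical and are there only to show the calculation.

```python
population = {"male": 1900, "female": 100}
sample_size = {"male": 50, "female": 50}

# Weight = inverse of the sampling fraction = stratum size / sample size.
weights = {s: population[s] / sample_size[s] for s in population}
print(weights)  # {'male': 38.0, 'female': 2.0}

# HYPOTHETICAL responses: how many sampled readers answered "yes".
yes_in_sample = {"male": 30, "female": 10}

# Unweighted: treats the sample as if it mirrored the readership.
unweighted = sum(yes_in_sample.values()) / sum(sample_size.values())

# Weighted: each respondent stands in for `weight` readers of the same sex.
weighted = (sum(weights[s] * yes_in_sample[s] for s in population)
            / sum(weights[s] * sample_size[s] for s in population))

print(f"unweighted estimate: {unweighted:.2f}")  # 0.40
print(f"weighted estimate:   {weighted:.2f}")    # 0.58
```

With these invented answers the unweighted figure gives females half the
influence even though they are 5% of readers. The weighted figure restores
each stratum to its true share of the readership.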

Cluster Sampling

The main reason for the development of the cluster sampling technique was
to increase the efficiency of survey administration by reducing things like cost
and travel time. A sample derived through simple random sampling can
result in sample units which are widely dispersed geographically, meaning
that interviewers must travel great distances to conduct a survey. This
means that expensive travel costs are incurred and it will take longer to
complete all the interviews required.

Cluster sampling involves splitting the population of interest into clusters.
These could be geographical areas (e.g. towns, postcode sectors or local
authorities) or they could be natural clusters (e.g. industries, schools or
hospitals). After the population is divided, several clusters are chosen at
random to form the sampling frame. Ideally, the chosen clusters should be
dissimilar from one another to ensure that the sample is as representative of
the population as possible. Clusters provide a more localised way of
conducting the survey and, whilst some may be in different geographical
locations, there will be less widespread dispersion and it would be possible to
assign one interviewer to each cluster. Two forms of cluster sampling are
described below:

One-stage Cluster Sampling

This involves splitting the population into suitable clusters, then randomly
selecting (via simple random sampling) a proportion of those to be included
for further analysis. All units contained within the sample clusters would then
be chosen to participate in the survey.

For example, the Scottish
Government wishes to find out information about the diets of primary one
pupils. Clearly we must create a sample, as it would be expensive and time
consuming to survey all primary one pupils in Scotland. We may decide that
each school in Scotland represents one cluster and then select a random
sample of 30 schools. In one-stage cluster sampling, we would now visit
each of the 30 schools (clusters) and interview all of the primary one pupils in
those schools.
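
A minimal Python sketch of one-stage cluster sampling follows. The school
and pupil identifiers, the total of 200 schools and the class size of 25
are invented for illustration; only the sample of 30 schools comes from
the example.

```python
import random


def one_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# HYPOTHETICAL population: 200 schools with 25 primary one pupils each.
schools = {f"school_{i:03d}": [f"pupil_{i:03d}_{j:02d}" for j in range(25)]
           for i in range(200)}

surveyed = one_stage_cluster_sample(schools, n_clusters=30, seed=0)
n_pupils = sum(len(pupils) for pupils in surveyed.values())
print(len(surveyed), "schools,", n_pupils, "pupils")  # 30 schools, 750 pupils
```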

Two-stage Cluster Sampling

This involves splitting the population into suitable clusters and, once more,
selecting a proportion of those to be included for further analysis. However,
in this method, the units within each sample cluster would be subject to a
further round of simple random sampling so that only a proportion of units in
each cluster would actually be surveyed.

For example, the Government now wishes to find out information on the sleep
patterns of all primary school pupils which will require one to one interviews
with school pupils and their parents/guardians. Obviously it is not feasible to
survey pupils in all schools or even to study all the students in a sample of
schools. Instead, we would select a sample of schools (clusters) and then
select a random sample of pupils from those schools to participate in the
study.
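
The two stages can be sketched in Python as below. As before, the school
and pupil identifiers and all of the sizes (200 schools, 25 pupils per
school, 30 schools sampled, 10 pupils per sampled school) are hypothetical
values chosen only to illustrate the method.

```python
import random


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: sample units within each of them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name],
                             min(n_per_cluster, len(clusters[name])))
            for name in chosen}


# HYPOTHETICAL population: 200 schools with 25 pupils each.
schools = {f"school_{i:03d}": [f"pupil_{i:03d}_{j:02d}" for j in range(25)]
           for i in range(200)}

surveyed = two_stage_cluster_sample(schools, n_clusters=30,
                                    n_per_cluster=10, seed=0)
n_pupils = sum(len(pupils) for pupils in surveyed.values())
print(len(surveyed), "schools,", n_pupils, "pupils")  # 30 schools, 300 pupils
```

Compared with the one-stage design, the same number of schools is visited
but far fewer interviews are needed in each.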

In all cases, we expect the sampled units to represent the population as a
whole whether the method used is one-stage or two-stage cluster sampling.
We have established that the main and overriding advantage of using cluster
sampling is that it saves travel time and reduces costs. However, there are a
number of disadvantages which mean that this technique should be
approached with a degree of caution. For example, units which are quite
close (geographically) to one another may be relatively similar and therefore
less likely to represent the wider population. Furthermore, cluster sampling is
generally less precise than simple/stratified random sampling and produces a
larger standard error than these methods.

Despite its disadvantages, clustering usually enables the selection of larger
samples than simple/stratified random sampling. Consequently, if it is
possible to target a large enough sample that offsets the loss of precision,
then cluster sampling may be the most appropriate choice of sampling
method. The selection process effectively comes down to a trade-off
between two factors – precision and cost. If you require a very accurate and
precise sample, then simple/stratified random sampling is more appropriate
whereas, if cost and time are the more important factors in your
considerations, then cluster sampling is likely to be more suitable, so long as
the sample is large enough to offset the loss of precision.

Note: The difference between Strata and Clusters

      - All strata are represented in a sample but only a subset of clusters
        are.
      - With stratified sampling, the best survey results occur when units
        within strata are internally homogeneous (i.e. members of one
        stratum are similar to one another). However, with cluster sampling,
        the best results occur when units within clusters are internally
        heterogeneous (i.e. members of a cluster are dissimilar).

 Further Information

 Tier 1 Sampling | Social Survey Design
