# week4

```
Survey Sampling and
Stephen Fisher and Robert Andersen

stephen.fisher@sociology.ox.ac.uk

http://malroy.econ.ox.ac.uk/fisher/survey

9th February 2005

• Survey Process

• Sampling

• Problems: Sources of Error and Bias

Stephen Fisher, Survey Research Methods week 4, HT05
Survey Process

1. Define the population you want to learn about

2. Sample the population
• Obtain a sampling frame (if possible)
• Choose a sample

3. Decide on a mode of administration

4. Questionnaire design
• Search for existing measures
• Pre-testing and piloting
• Re-work final questionnaire

6. Analyse the data using statistical methods

Validity and Sampling Bias

External Validity

• The degree to which the conclusions of a study
would hold for other persons in other places and
at other times.

Sampling Bias

• Those selected for the sample are not “typical” or
“representative” of the population.

• Undercoverage
– Groups in the population are systematically left
out of the sample

• Non-response
– When individuals are left out because they can’t
be reached or refuse to cooperate.

Non-probability Samples

• Haphazard and Convenience samples
– Pick cases without a plan, typically whatever is
easiest; usually non-representative

• Quota samples
– Match the proportions of selected groups in the sample
to those in the population
– Acceptable in exploratory research or when random
samples are not possible

• Snowball samples (network or reputational
samples)
– Used in special situations when it is difficult to
obtain a list of the population, but people know
one another

• Others used in Qualitative research

Probability or Random Sampling

• In a random sample each case has an equal chance
of being selected

• In a probability sample each case has a known
probability of being selected

• With probability samples we can quantify how close
a sample statistic is likely to be to the true
population parameter

• Central Limit Theorem
– Based on the idea of repeated random sampling
– Across repeated samples, the distribution of the
sample mean tends to a Normal distribution as the
sample size grows.

• Law of Large Numbers
– If we repeat a random process many times, the
average value tends to get closer to the
population parameter as the number of repetitions
grows.
– The larger the sample, the more likely a statistic
is close to the true population parameter.

Determining Sample Size

How much error are you willing to accept?

Required sample size, by confidence level and margin of error:

Confidence level:        95%             90%
Margin of error:       5%     3%       5%     3%
Population size
100                    79     92       73     88
1,000                 278    521      216    434
10,000                370    982      268    711
100,000               383   1077      275    760
1,000,000             384   1088      275    765

Types of Probability Sample

• Simple Random Sample
– Each case has an equal chance of being selected
and is chosen completely at random from a
sampling frame.

• Systematic Random Sample
– Pick a number between 1 and k at random to
start and then pick every kth case.

• Stratified Sample
– Divide frame into homogeneous groups and pick
a sample (either systematic or random) from
within each group.

• Cluster Sample
– Used when cases are geographically sparse or
when population cannot be easily listed.
– Almost standard for national surveys

Cluster Sample Process

1. Identify the primary sampling unit (PSU) (e.g. constituency)

2. Pick a sample of PSUs with probability proportional
to the size of the PSU

3. Pick a sample within each of the PSUs

Cluster Sampling: An Example

Goal: Study the attitudes of Catholic women in
England. Want a sample of 1000
Problem: No population list
Solution: Multi-Stage Cluster Sample

1. Obtain a list of all Catholic churches

2. Randomly select 10 geographic regions

3. Randomly select 10 churches from each region

4. Randomly select 10 women from each congregation
list

5. Add all the clusters together (N = 1000)

Caution: Multi-stage cluster sampling can lead to
strong “design effects”, so we need to account
for the intra-cluster correlation (e.g. using multilevel
models, or in Stata using the svy estimators or the
cluster() option)
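
Illustration (a rough sketch with assumed figures, not from the
original study): one common approximation is Kish's design effect,
deff = 1 + (m - 1) * rho, where m is the number of cases per cluster
and rho is the intra-cluster correlation. Treating each church as a
cluster of m = 10 women and assuming rho = 0.05 (a hypothetical value):
  deff = 1 + 9 * 0.05 = 1.45
  effective sample size = 1000 / 1.45, roughly 690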

Sample Weighting

Over-sampling small populations

• Used in stratified samples to ensure representation
of small groups (e.g. Scots or ethnic minorities in
the UK)

• Before analysis, correct for over-sampling by
weighting downwards.

Known demographic attributes

• Information exists on some demographic variables
of interest but you can’t sample on them directly

• Compare the sample to the population along
demographic lines for which you have information

• Post-weight people in the sample upward or
downward so that the sample matches the population

Warning: Possible bias in SPSS if the weights do not
average 1 (Insalaco, SMB 45 7/99)
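
Illustration (hypothetical figures): a weight can be computed as
population share / sample share for each group. If a group makes up
10% of the population but 20% of the sample, its weight is
0.10 / 0.20 = 0.5; the remaining 80% of the sample, representing 90%
of the population, gets 0.90 / 0.80 = 1.125. The weights then average
0.2 * 0.5 + 0.8 * 1.125 = 1.0 across the sample.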

Postal Questionnaires

• Low cost, therefore large samples

• Can easily cover isolated or large areas

• Perhaps better for sensitive topics

• Low response rate (typically less than 20%);
needs a motivated population

• Unable to use complicated skips, filters or probes

• Slow

• Biased towards the educated

Face-to-face Interviews

• Less self-selection bias

• Higher response rates

• Possibility of more complex filters, skips and
probes

• Costly (the most expensive mode)

• Call backs often required

• Possible interviewer bias

• Extensive interviewer training required

• Probably not good for sensitive topics

Telephone Interviews

• Quick field time

• Sampling process relatively simple (Random Digit
Dialing, RDD)

• Possibility of complex filters, skips and probes

• Call backs often required

• Some households don’t have land lines (no phone
or cell phone only)

• Declining response rates (partly due to
telemarketing)

Computer Assisted Interviewing

Known as either CAPI or CATI (Computer Assisted
Personal or Telephone Interviewing)

• Efficient: Questions appear on monitor

• Accurate: Data entered immediately into a computer

• Flexible: Allows incredibly complex interview
structure

• Cost Effective

Internet Surveys

• Extremely quick field time

• Very cheap

• Easy maintenance of panels allowing weighting on
past responses

• Possible to survey particular populations easily

• Completely non-random self-selection
– But there are methods of correcting for this
(e.g. Heckman models and Propensity Score
Matching)

• Limited control over who respondents actually are,
and strong incentives for respondents to pretend they
are rare respondents (e.g. old women)

Major problems of surveys

• Low response rates
– only a problem if they introduce bias

• Selection biases

• Sampling error

• Measurement error
– will introduce noise and maybe bias too

• Inappropriate causal inferences
– Cross-sectional surveys give no information on
causal direction
– Correlations may be spurious

Most of these are not limited to survey research

Survey Non-Response

Types of Non-Response

• Failure to participate at all in the survey (unit
non-response)

• Failure to complete a survey once started

• Failure to answer specific questions (item
non-response)

• Failure to carefully follow instructions or fully
answer questions

Problems created by non-response:

• Cannot generalize to the entire population

• Results can be false or misleading if
non-respondents differ significantly from respondents

Respondent-Interviewer Errors

Interviewer Errors

• Unintentional errors in coding

• Intentional changes to responses

• Failure to probe

Respondent Errors affected by Interviewers

• Influence of the characteristics of the interviewer
– Physical appearance, language, social position

• Respondents telling interviewers what they think
the interviewers want to hear

Interviewer effects:
Experimental Evidence

Could you tell me who two or three of your favourite actors or
entertainers are? (Black respondents only)

                       Black interviewers     White interviewers
Race of entertainer    Form A     Form B      Form A     Form B
Blacks only             45.5%      39.1%       14.8%      19.2%
Whites only              7.9%       5.7%       22.2%      23.1%
Blacks and Whites       46.6%      55.2%       63.0%      57.7%
N                         101         87          54         52

Form A: Question follows non-racial questions on friends and
neighbours
Form B: Question follows questions dealing with discrimination,
distrust of whites, and black consciousness

Interviewer effect is statistically significant, but the
order effect and interaction are not.

This week and next

Next week: Practical session on Scaling

• IT room upstairs

• Using either SPSS or Stata

• Bring your own data if you have it

Group work this week should focus on choosing a
sampling scheme and a mode of administration.

```
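
The required sample sizes in the "Determining Sample Size" table can be approximated with a standard formula for estimating a proportion. The sketch below assumes the most conservative proportion (p = 0.5), a finite population correction, and conventional z-values (1.96 for 95%, 1.645 for 90%). The function name and these defaults are illustrative choices, not taken from the slides. Results are close to the table but not identical, since the original source may have used slightly different z-values or rounding.

```python
import math

# Conventional z-values for two-sided confidence levels (assumed here).
Z_VALUES = {0.90: 1.645, 0.95: 1.96}


def required_sample_size(population, margin, confidence=0.95, p=0.5):
    """Approximate sample size needed to estimate a proportion.

    First computes the infinite-population size
        n0 = z^2 * p * (1 - p) / margin^2
    and then applies the finite population correction
        n = n0 / (1 + (n0 - 1) / population).
    The result is rounded up to the next whole case.
    """
    z = Z_VALUES[confidence]
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


if __name__ == "__main__":
    # Print the 95% confidence, 5% margin column for the table's population sizes.
    for size in (100, 1_000, 10_000, 100_000, 1_000_000):
        print(size, required_sample_size(size, margin=0.05))
```

Note how the required size levels off as the population grows: beyond a few tens of thousands, the population size has very little influence on the sample needed.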