Surveys by dfhdhdhdhjr


									       Lecture 4 - Survey design
•   Sampling
•   Sample size/precision
•   Data collection issues
•   Sources of bias
             Why do surveys?
• Information on particular population
   – prevalence of a disease
   – behaviour, knowledge, attitude
• Planning of services
• Collect information on data not routinely collected
   – e.g., mental health status, health behaviours
• Repeat surveys to monitor trends (serial cross-
  sectional studies)
 Bias and precision of the survey
• Bias:
  – selection bias relates to sample selection
  – information bias relates to information collected
• Precision
  – relates to sample size
          Reasons to sample
• Reduce cost
• Increase accuracy and quality of data
• Sampling unit
  – person or group (e.g., household)
• Sampling frame
  – list of sampling units in the population
     •   censuses
     •   electoral lists
     •   telephone lists
     •   are institutional populations excluded (e.g., prisons,
         nursing homes)?
   Target and study population
• Target population:
  – population for generalization of results
• Study population:
  – population for collection of data
  – may be total target population or a sample
             Types of sample
• Non-representative
  – convenience
  – volunteers
• Representative
  –   simple random
  –   systematic
  –   cluster
  –   multistage
       Simple random sample
• Each sampling unit in the population has
  equal probability of being included
• Sampling with replacement:
  – each unit placed back in pool
• Sampling without replacement (usual):
  – each unit selected is kept out of pool
 Simple random sample (cont’d)
• Methods:
  – manual
  – tables of random numbers
  – computer-generated random numbers
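
A minimal sketch of the computer-generated method, using Python's standard random module. The function name, the frame of unit IDs 1 to 100, and the seed are illustrative assumptions, not part of the lecture.

```python
import random

def simple_random_sample(frame, n, replace=False, seed=None):
    """Draw n sampling units from a sampling frame with equal probability."""
    rng = random.Random(seed)
    if replace:
        # With replacement: each unit goes back in the pool and may be drawn again.
        return [rng.choice(frame) for _ in range(n)]
    # Without replacement (usual): a selected unit is kept out of the pool.
    return rng.sample(frame, n)

frame = list(range(1, 101))  # hypothetical sampling frame: unit IDs 1..100
without_repl = simple_random_sample(frame, 10, seed=1)
with_repl = simple_random_sample(frame, 10, replace=True, seed=1)
```

Fixing the seed makes the draw reproducible, which helps when the selection has to be documented or audited.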
          Systematic sample
• Select every nth individual from a list
  – can use existing numbers
  – e.g., patient appointments, medical records
• Advantages:
  – Does not require complete sampling frame
  – Simple to carry out
• Disadvantages:
  – May be unsuitable for cyclic or ordered data
    (e.g., every 5th patient when only 5/day)
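
A short sketch of systematic selection with a random start, followed by the cyclic-data problem from the example above. The appointment list (10 days, 5 slots per day) is hypothetical.

```python
import random

def systematic_sample(units, k, seed=None):
    """Select every k-th unit from a list, starting at a random position."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start among the first k units
    return units[start::k]

# Hypothetical appointment list: 10 days with 5 appointment slots per day.
appointments = [(day, slot) for day in range(1, 11) for slot in range(1, 6)]
picked = systematic_sample(appointments, 5, seed=0)

# Because the list cycles every 5 units, every selected patient
# comes from the same time slot of the day.
slots_seen = {slot for _, slot in picked}
```

Here slots_seen contains a single slot, so patients seen at other times of day are never sampled.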
          Stratified sampling
• Separate sample selected from different
  strata of population
• Requires separate sampling frame for each stratum
• Useful if there are small but important
  subgroups of the population (e.g., very old,
  very young, institutionalized, sick)
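
A minimal sketch of stratified selection, assuming each stratum has its own sampling frame stored in a dictionary. The stratum names, frame sizes, and sample sizes are hypothetical; the small "85_plus" stratum is deliberately oversampled.

```python
import random

def stratified_sample(frames, sizes, seed=None):
    """Draw a separate simple random sample from each stratum's frame."""
    rng = random.Random(seed)
    return {
        stratum: rng.sample(frames[stratum], sizes[stratum])
        for stratum in frames
    }

# Hypothetical frames: a large stratum and a small but important one.
frames = {
    "under_85": list(range(0, 900)),
    "85_plus": list(range(900, 1000)),
}
sizes = {"under_85": 50, "85_plus": 50}  # oversample the small stratum
samples = stratified_sample(frames, sizes, seed=2)
```

Because the strata are sampled at different rates, estimates for the whole population need weighting at the analysis stage.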
           Cluster sampling
• Sampling frame comprises groups
  (households, villages, schools)
• Step 1: Simple random sample of groups
• Step 2: All individuals in each group
  included in survey
• Advantages:
  – enumeration of population not needed
  – more efficient use of resources
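
The two steps can be sketched as follows, assuming the sampling frame is a dictionary mapping each group to its members. The household IDs and member names are made up for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Step 1: simple random sample of groups. Step 2: include everyone in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    people = [person for c in chosen for person in clusters[c]]
    return chosen, people

# Hypothetical frame of households; only the list of households is needed,
# not a list of every individual in the population.
households = {
    "H1": ["Ann", "Bob"],
    "H2": ["Cal"],
    "H3": ["Dee", "Eli", "Fay"],
    "H4": ["Gus", "Hal"],
}
chosen, people = cluster_sample(households, 2, seed=3)
```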
         Multistage sampling
• Larger units sampled in first stage, smaller
  units later
• e.g.:
  – stage 1 - sample of towns
  – stage 2 - sample of city blocks or census tracts
  – stage 3 - sample of households
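
A sketch of the three stages, assuming a nested frame of towns, blocks, and households. All names and sizes are hypothetical, and each stage here uses simple random sampling, which is only one possible choice.

```python
import random

def multistage_sample(towns, n_towns, n_blocks, n_households, seed=None):
    """Sample towns, then blocks within each town, then households within each block."""
    rng = random.Random(seed)
    selected = []
    for town in rng.sample(sorted(towns), n_towns):                     # stage 1
        blocks = towns[town]
        for block in rng.sample(sorted(blocks), n_blocks):              # stage 2
            for household in rng.sample(blocks[block], n_households):   # stage 3
                selected.append((town, block, household))
    return selected

# Hypothetical frame: 4 towns, 5 blocks per town, 10 households per block.
towns = {
    f"town{t}": {
        f"block{b}": [f"hh{t}-{b}-{h}" for h in range(10)]
        for b in range(5)
    }
    for t in range(4)
}
selected = multistage_sample(towns, n_towns=2, n_blocks=2, n_households=3, seed=4)
```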
 Sampling for “hidden populations”

• Homosexual men:
  – gay bars, newspapers
• Injection drug users:
  – convenience sample (e.g., treatment facilities)
  – snowball sampling (through networks)
• Capture-recapture methods
  – identify biases of sampling method
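
The slide does not name a specific capture-recapture method. As one illustration, the two-source Lincoln-Petersen estimator is sketched below; it assumes the two sources are independent and the population is closed. The counts are hypothetical.

```python
def lincoln_petersen(n1, n2, m):
    """Estimate total population size from two overlapping samples.

    n1: number identified by the first source
    n2: number identified by the second source
    m:  number identified by both sources
    """
    if m == 0:
        raise ValueError("no overlap between sources; estimate is undefined")
    return n1 * n2 / m

# Hypothetical counts: 200 people found through treatment facilities,
# 150 through a second source, 30 appearing in both.
estimate = lincoln_petersen(200, 150, 30)  # 1000.0
```

Comparing the estimated total with the number actually reached gives a rough idea of how much of the hidden population a sampling method misses.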
             Planning a survey
• Define target population
• Select method of sampling
   – sampling unit, sampling frame, etc
• Calculate sample size
• Define survey data collection methods
• Non-respondents
   – number of attempts to reach
   – different days, times
      Sample size estimations
• Requirements:
  – level of precision (width of confidence interval)
  – expected variability (estimated from previous
    studies, pilot study, or literature)
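
A sketch of how the two requirements combine, using the usual normal-approximation formulas for a simple random sample. Here d is taken to be the half-width of the confidence interval (margin of error) and z = 1.96 corresponds to 95% confidence; both are assumptions, as are the example inputs.

```python
import math

def n_for_proportion(p, d, z=1.96):
    """Sample size to estimate a proportion p to within +/- d."""
    return math.ceil(z**2 * p * (1 - p) / d**2)

def n_for_mean(sd, d, z=1.96):
    """Sample size to estimate a mean to within +/- d, given standard deviation sd."""
    return math.ceil((z * sd / d) ** 2)

# Expected prevalence of 50% (the most conservative guess), precision +/- 5 points.
n_prev = n_for_proportion(p=0.5, d=0.05)  # 385
# Expected standard deviation of 10 units, precision +/- 2 units.
n_mean = n_for_mean(sd=10, d=2)           # 97
```

These figures apply to simple random sampling only; cluster or multistage designs generally need larger samples for the same precision.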
        Design of questionnaires
•   List study variables
•   Collect existing questions and instruments
•   Adapt and/or develop new questions
•   Format questionnaire
•   Pre-testing (timing, responses, clarity, etc.)
•   Revise, determine priorities, shorten
      Question wording: clarity
• Use concrete rather than abstract terms, e.g.,
   – During a typical week, how many hours do you
     spend doing vigorous exercise?
   – Not: How much exercise do you get?
• Avoid jargon, technical terms, slang
• Avoid double-negatives (Do you disagree that
  doctors should not make house calls?)
• Use active vs passive voice (Has a doctor ever told
  you vs Have you ever been told by a doctor?)
   Question wording: clarity
– Break long sentences into short ones (20 words
  or fewer)
– Use good grammar but use informal style
– Avoid hypothetical questions
– Evaluate reading level (normally not more than
  8th grade)
   Question wording: neutrality
• Do not suggest desirable response, e.g.:
  – Not: do you ever drink alcohol?
  – Better: how often do you drink alcohol?
• Give permission to give undesirable response e.g.:
  – Sometimes people forget to take medications
    their doctor prescribes. Do you ever forget (or
    how often do you forget) to take your medications?
           Question wording
• Introduce attitude questions, e.g.:
  – People have different opinions about their
    medical care. We are interested in your opinion.
• Avoid double-barreled questions
  – How much coffee or tea do you drink each day?
• Avoid assumptions
  – How much help do you get from your family?
            Response wording
• Make them short
• Use as few options as possible
• Consider different types of non-response:
  –   refuse
  –   don’t know
  –   no opinion
  –   not applicable
  –   omission by subject or interviewer
          Response wording
• Make sure responses are mutually exclusive
  (or give instructions to “check all that apply”)
• Consider use of response card for multiple
  questions with same set of responses
  Organization of questionnaire
• Group questions by subject matter
• Introduce each group with short descriptive
  statement (e.g., now I am going to ask you
  some questions about your use of health services)
• Begin with more emotionally neutral questions
• More sensitive questions (e.g., income,
  sexual function) near end of questionnaire
  Organization of questionnaire
• interviewer-administered: repeat time frame
  fairly frequently
• self-administered: repeat time frame at top
  of each page or each set of questions, e.g.:
     During the past year, how many times have you:
        – Visited a doctor?
        – Been a patient in an emergency department?
        – Been admitted to hospital?
  Organization of questionnaires
• Group questions with similar response scale
• Format skip patterns
  – screener questions
  – branching questions
• Time frame
  – group questions that ask about same time frame
  – “usual” behavior vs specified time period
  – assist respondent with milestones to help define
    reference time frame
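
A small sketch of how a screener question and its skip pattern might be encoded for a computer-assisted questionnaire. The question IDs, the ordering, and the rule are hypothetical.

```python
# Hypothetical question order for one section of a questionnaire.
ORDER = ["Q1_ever_smoked", "Q2_smoke_now", "Q3_cigarettes_per_day", "Q4_alcohol"]

# Skip rules: (question, answer) -> question to jump to.
# Q1 is a screener: respondents who never smoked skip the smoking items.
SKIP_RULES = {
    ("Q1_ever_smoked", "no"): "Q4_alcohol",
    ("Q2_smoke_now", "no"): "Q4_alcohol",
}

def next_question(current, answer):
    """Return the next question ID, or None at the end of the section."""
    if (current, answer) in SKIP_RULES:
        return SKIP_RULES[(current, answer)]
    position = ORDER.index(current)
    return ORDER[position + 1] if position + 1 < len(ORDER) else None
```

For example, next_question("Q1_ever_smoked", "no") jumps straight to "Q4_alcohol", while a "yes" continues to "Q2_smoke_now".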
           Questionnaire mode
•   Face-to-face
•   Telephone
•   Mail
•   Other:
    – diaries
• Mixed mode
       Face-to-face interviews: advantages
• reduce items with no response
• easier for older, less educated, lack of
  fluency in language
• some formats easier to administer:
  – skip patterns to avoid irrelevant questions
  – open-ended questions - can probe for more
    complete response
      Face-to-face interviews: disadvantages
• cost
• time
• effort (interviewer training, evaluation of
  inter-rater reliability)
• interviewer biases
• differences in sociodemographic
  characteristics of interviewer and subject
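
The slide does not say how inter-rater reliability is evaluated. One common measure for categorical items is Cohen's kappa, sketched here for two interviewers coding the same ten subjects. The ratings are invented for illustration.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per category.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n**2
    # Note: undefined (division by zero) if expected agreement equals 1.
    return (observed - expected) / (1 - expected)

# Hypothetical codings of the same 10 subjects by two interviewers.
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "no"]
rater_b = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "no", "yes"]
kappa = cohens_kappa(rater_a, rater_b)  # observed 0.8, expected 0.5, kappa about 0.6
```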
          Telephone interviews: advantages

• less expensive than face-to-face
• reduce items with non-response
• some formats easier to administer:
   – skip patterns to avoid irrelevant questions
   – open-ended questions - can probe for more complete response
• large, representative samples can be organized from one office
• avoids bias associated with appearance of interviewer
         Telephone interviews: disadvantages
•   misses households without telephone
•   misses those with unlisted phone numbers
•   bias when calls made during day
•   multiple calls may be needed
•   perceived as intrusive by some
•   difficult to administer items with multiple
    response options
         Mailed questionnaires: advantages
•   least expensive
•   can be coordinated from one office
•   social desirability minimized
•   inconsistent results on completeness of
    reporting (e.g., for number of physician visits)
        Mailed questionnaires: disadvantages
• relatively low response rates
   – multiple mailings, cover letter, letterhead,
     advance warning, token of appreciation, stamped
     self-addressed envelope (SSAE)
• difficult to get information on non-respondents
   – differences between early and late responders
• items may be omitted: 5-10% may be unusable
• cannot control order of questions
• postal strikes
         Analysis of surveys
• Missing data
  – exclude
  – imputation: e.g., based on characteristics of
  – sensitivity of estimate to method of imputation
• Weighting of estimates
  – for stratified samples
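
A sketch of weighting for a stratified sample, where each stratum's estimate is weighted by that stratum's share of the population. The strata, population sizes, and counts are hypothetical.

```python
def weighted_proportion(strata):
    """Population proportion from stratum-level results.

    Each stratum dict holds: population (N_h), sampled (n_h), cases (count in sample).
    """
    total_population = sum(s["population"] for s in strata)
    return sum(
        (s["population"] / total_population) * (s["cases"] / s["sampled"])
        for s in strata
    )

# Hypothetical survey in which the small older stratum was oversampled.
strata = [
    {"name": "under_85", "population": 900, "sampled": 50, "cases": 5},   # 10%
    {"name": "85_plus", "population": 100, "sampled": 50, "cases": 20},   # 40%
]
weighted = weighted_proportion(strata)  # 0.9 * 0.10 + 0.1 * 0.40, about 0.13
unweighted = sum(s["cases"] for s in strata) / sum(s["sampled"] for s in strata)  # 0.25
```

The unweighted figure overstates prevalence because the oversampled stratum counts for half the sample but only a tenth of the population.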
  Analysis of surveys (cont’d)
• Crude estimates, confidence intervals
  – Continuous data: mean, median, quartiles
  – Categorical data: proportion
  – Confidence intervals to describe precision
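
A sketch of crude estimates with 95% confidence intervals, using the normal approximation and assuming a simple random sample. The data values are invented; for small samples or proportions near 0 or 1, other interval methods are usually preferred.

```python
import math
import statistics

def proportion_ci(cases, n, z=1.96):
    """Proportion with a normal-approximation (Wald) confidence interval."""
    p = cases / n
    se = math.sqrt(p * (1 - p) / n)
    return p, p - z * se, p + z * se

def mean_ci(values, z=1.96):
    """Mean with a normal-approximation confidence interval."""
    m = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return m, m - z * se, m + z * se

# Hypothetical categorical result: 50 of 200 respondents report the behaviour.
p, p_low, p_high = proportion_ci(50, 200)  # 0.25, roughly 0.19 to 0.31

# Hypothetical continuous data: number of doctor visits in the past year.
visits = [2, 4, 4, 4, 5, 5, 7, 9]
m, m_low, m_high = mean_ci(visits)
median = statistics.median(visits)
quartiles = statistics.quantiles(visits, n=4)  # three cut points
```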
 Bias and precision of the survey
• Bias:
  – selection bias relates to sample selection
  – information bias relates to information collected
• Precision
  – relates to sample size
      Selection bias in surveys
• Does the final analysis sample represent the
  original target population?
• Sources of bias:
  – sampling method
  – non-response
  – missing data
      Information bias in surveys
• Bias in measurement of outcomes
• Sources of information bias:
  –   non-validated measurement instrument
  –   unblinded or poorly trained data collectors
  –   response set
  –   etc.
