Sampling rare and hard-to-reach groups in the

Patten Smith
Ipsos MORI
Session outline

 Briefly summarise major ways in which samples can be hard-
 Outline approaches that can be taken for each
 Emphasis will be on methods which reflect principles of random
  probability sampling
 Will focus mainly on methods for identifying Ethnic Minority
What makes a group hard to sample?

 Three main reasons:
   1 Group is wholly or partly excluded from available sampling frames
   2 Group is covered by a sampling frame that is hard to access
   3 Group is on a sampling frame but is relatively rare and not separately
Group is excluded from sample frame

  May be because do not appear on useable frame at all,
   usually because of mobility - eg:
    – Rough sleepers
    – Travellers
    – Hostel residents

  Or because potential frames not available at time needed –
   eg locally administered Govt. funded projects
If excluded from sample frame

 What to do:
   – Create new sample frame
   – Use non-probability methods (last resort)
Creating own frame

 How to do this will depend on sample type:
   – eg for rough sleepers in a town, might divide town into grid
     squares and draw sample of these: then enumerate all eligibles
     identified in sampled squares
   – For museum visitors cover all exits and take every nth; no list
     used – but implied sample frame is all people passing through
     the exits
   – For hostel residents list bed spaces and sample occupants via
   – For young people having appointments with a Connexions
     Personal Advisor (PA) – recruit PAs and ask them to complete
     short sampling form for each young person they have an
     appointment with during a defined reference period
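
The "take every nth" approach for museum exits can be sketched as a systematic sample with a random start. This is a minimal illustration only: the function name, the visitor labels and the interval of 10 are assumptions, not part of the original method description.

```python
import random

def systematic_sample(stream, n, rng=None):
    """Take every nth item from a stream, starting at a random offset in [0, n)."""
    rng = rng or random.Random()
    start = rng.randrange(n)  # random start gives each item a 1-in-n chance
    return [item for i, item in enumerate(stream) if i % n == start]

# Hypothetical stream of 100 people leaving through the exits, in order
visitors = [f"visitor_{i}" for i in range(1, 101)]
sample = systematic_sample(visitors, n=10, rng=random.Random(1))
print(len(sample))
```

No list of visitors is needed in advance: the implied frame is simply everyone who passes the counting point while it is staffed.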
Group is covered by a sampling frame
that is hard to access 1
 Examples:
  – hospital patients (requirement for ethical clearance)
  – University students – ethical concerns by Higher Ed. institutions
  – children in care (lists held by local authorities)
  – Convicted offenders
  – Members of support groups - eg AA, Narcotics Anonymous
  – Residents of old people’s homes
Group is covered by a sampling frame
that is hard to access 2
  What to do:
    – Depends on the frame, who holds it and why it is hard to access
    – Requires negotiation!
    – Sometimes can use frame if use opt-out procedure
    – Sometimes frame holder will do the sampling and initial contacting
      on your behalf
    – Sometimes can gain access with suitable government department
      / other high-level support - eg use of Child Benefit records and IDBR
    – NHS work: formal ethical clearance - can take some months
Group is covered by a sampling frame
that is hard to access 3
 If still no access, use screening procedures or resort to
  non-probability methods
Group is on sample frame but rare and not
separately identified
 Probably the most common situation in practice
 Examples:
   – ethnic minority groups
   – unemployed people
   – males aged 85 and over
   – people with low qualifications only
Group is rare, on sample frame but not
separately identified

  Requires screening
  Need to start with a sampling frame with good coverage of
   the population to be sampled
  Select large sample of units from frame and screen into
   eligible and ineligible units
  Several screening methods can be used
  Mostly focus on screening for general population samples of
   ethnic minorities, but much of this can be generalised
Screening methods

 Use earlier survey, omnibus survey, access panel, etc
 For some ethnic / religious groups can use list name matching
  (but no longer feasible for national samples)
 Screen in the field:
   – post/telephone
   – door to door
   – focused enumeration
Using earlier surveys

 If you have access to recent survey that identifies eligible
  people can follow these up – for example:
   – using British Crime Survey to identify victims of domestic violence
   – using 1999 Health Survey for England respondents to identify sample for
     EMPIRIC survey (Ethnic Minority Psychiatric Illness Rates in the
Use omnibus surveys or access panels

 Add a question to identify eligible people - eg vegetarians,
  parents of children who live apart from the other parent
 But omnibus surveys / access panels often do not use random
  probability sampling methods
Screening names on lists 1

 Scan lists for names that are associated with particular ethnic /
  religious groups
 Effective for some groups - eg those of Indian, Pakistani and
  Bangladeshi origin
Screening names on lists 2

 But:
   – Only works for groups with distinguishable names (eg not for those of
     Caribbean origin)
   – Assumes minimal intermarriage involving name change
   – Nowadays, cannot use this method for samples taken from population at
     large; no suitable list of names since Electoral Registers' use for surveys
     was restricted
Screening in the field: postal/telephone

 Postal:
   – Post short screening questionnaires to large start sample (eg PAF)
   – Often not done because low response rates: but can be used in
     combination with other methods
   – May prove difficult for some types of screening (eg for ethnic
     minority people) because no interviewer to reassure/explain

 Telephone screening:
   – Although can now draw random digit dialling (RDD) population
     samples, response rates generally low
   – Telephone methods might work better on special populations – eg
     membership lists
Door to door screening 1

 Generate general population sample from PAF
 Interviewer visits each sampled address to establish eligibility of
 Either interview those identified as eligible there and then or
  return on another occasion (latter allows sub-sampling,
  interviewer matching, etc)
Door to door screening 2

 Safest way of identifying eligible people
 But expensive - especially in low concentration areas: eg to
  obtain sample representative of non-white HHs in Britain would
  require approx. 30 or more addresses to be issued for
  screening for each achieved HH interview
 Improve cost-effectiveness in two ways:
   – taking advantage of concentration
   – focused enumeration
Taking advantage of concentration 1

 Costs of door to door screening less if higher eligibility rate
 For example, with 10% deadwood addresses, 80% screen
  response rate, 75% main interview response rate:
   – if 5% of HHs eligible, issue 38 to achieve 1
   – if 20% of HHs eligible, issue 9 to achieve 1
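
The arithmetic behind these figures can be sketched as below. It assumes the four rates simply multiply (live addresses x screen response x eligibility x main interview response); the function and parameter names are illustrative. On that assumption the results are roughly 37 and 9 addresses per interview, close to the 38 and 9 quoted above; how the quoted figures were rounded is not stated.

```python
def addresses_per_interview(eligible_rate, deadwood=0.10,
                            screen_rr=0.80, interview_rr=0.75):
    """Addresses to issue for one achieved interview, assuming rates multiply."""
    # Expected interviews achieved per issued address
    yield_per_address = (1 - deadwood) * screen_rr * eligible_rate * interview_rr
    return 1 / yield_per_address

for rate in (0.05, 0.20):
    needed = addresses_per_interview(rate)
    print(f"{rate:.0%} eligible: about {needed:.1f} addresses per interview")
```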
Taking advantage of concentration 2
Taking advantage of concentration 3

 Table shows, for example, that 86% of ethnic minority
  individuals lived in wards in which 5%+ of population ethnic
 If prepared to limit findings to 86% of the ethnic minority
  population, can reduce number of addresses screened from 30
  to 12 times achieved sample
 But sample biased: no coverage of the 14% living in low
  concentration areas
 Also method only as good as concentration figures: 2001
  Census out-of-date
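
The trade-off between coverage and screening cost can be illustrated with a small sketch. The ward counts below are invented for illustration and are not Census figures; the function name and the 5% threshold are likewise assumptions.

```python
# Hypothetical wards: (households, eligible households). Not real data.
wards = [(1000, 400), (1000, 150), (1000, 60), (1000, 20), (1000, 5)]

def coverage_and_cost(wards, min_rate):
    """Restrict sampling to wards with eligibility rate >= min_rate.

    Returns (share of all eligible households covered,
             households screened per eligible household found).
    """
    total_eligible = sum(e for _, e in wards)
    kept = [(h, e) for h, e in wards if e / h >= min_rate]
    kept_eligible = sum(e for _, e in kept)
    if kept_eligible == 0:
        raise ValueError("no eligible households in the retained wards")
    kept_households = sum(h for h, _ in kept)
    return kept_eligible / total_eligible, kept_households / kept_eligible

full = coverage_and_cost(wards, 0.0)         # screen everywhere
restricted = coverage_and_cost(wards, 0.05)  # only wards at 5%+ concentration
print(full, restricted)
```

With these made-up counts, dropping the low concentration wards cuts the screening ratio while losing a small share of the eligible population, which is the same pattern as the move from 30 to 12 addresses described above.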
Taking advantage of concentration 4

 In principle can apply this logic to any characteristic which
  varies in concentration across different areas
 For example: sample from high unemployment / high
  deprivation areas to identify low income families
Focused enumeration 1

  Involves screening by proxy - from neighbouring
  Significantly cheaper than door-to-door screening in areas
   of lower concentration
  Can be used for any visible minority; mostly on ethnic
   minorities, but could be used for (say) households
   containing children
  Used in a number of high profile surveys, notably the
   Fourth National Survey of Ethnic Minorities, the British
   Crime Survey and the Home Office Citizenship Survey
  Various versions of the method have now been used
Focused enumeration 2

 4th National Survey method:
   – Draw sample comprising large clusters of adjacent addresses
   – Visit every nth (eg 6th) address (“location” addresses) and ask about
     ethnic origins of people living (i) at location addresses (ii) the n-1
     addresses to the left and the n-1 to the right
   – Substitutions for location addresses allowed under defined
Focused enumeration 3

 If positive enumeration given for any address to the left or to the
  right, the interviewer calls at all intervening addresses in the
  relevant direction
 Each address screened twice, once from each of two location
 Visit intervening addresses if positive identification from either
  address or if two “don’t knows”
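
One reading of the visiting rule, sketched per intervening address: each address has two proxy reports, one from each neighbouring location address, and is visited if either report is positive or both are "don't know". The function name and the three report codes are assumptions made for this illustration.

```python
def should_visit(report_a, report_b):
    """Decide whether to call at an intervening address.

    Each report is 'yes', 'no' or 'dont_know', given by one of the two
    location addresses that were asked about this address.
    """
    reports = (report_a, report_b)
    if "yes" in reports:
        return True  # positive identification from either location address
    return reports == ("dont_know", "dont_know")  # two "don't knows"

for pair in [("yes", "no"), ("no", "dont_know"), ("dont_know", "dont_know")]:
    print(pair, should_visit(*pair))
```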
Focused enumeration 4
Focused enumeration 5

 Once addresses containing eligible people identified, more
  detailed information collected about occupants
 Special rules for street corners, flats, rural areas, etc
Focused enumeration 6

 Basic method adapted to allow ethnic minority boost sample to
  be added to an existing survey – now much more commonly
 This involves asking at main survey sample address about
  eligibility of those living at the n addresses to the left and
  the n addresses to the right (n is commonly 2 – eg on BCS)
 Either interviewers identify neighbouring addresses or pre-
  select from PAF
 Note, each address only asked about once - not twice
Focused enumeration 7

Example of focused enumeration from a main sample address

Interviewer screens addresses 3, 4, 6 and 7 from main sample
address 5
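
A minimal sketch of the same example, assuming addresses are numbered consecutively along the street (a simplification that ignores the special rules for corners, flats and rural areas). The function name is illustrative.

```python
def neighbours_to_screen(main_address, n=2):
    """Addresses asked about from a main sample address: n to the left, n to the right."""
    left = list(range(main_address - n, main_address))
    right = list(range(main_address + 1, main_address + n + 1))
    return left + right

print(neighbours_to_screen(5))  # [3, 4, 6, 7]
```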
Focused enumeration 8

 Independent analyses by me and by NatCen indicate that
  focused enumeration fails to identify c. 30% of eligible
 Still probably better than not covering ethnic minorities at all in
  low concentration areas
Non-probability methods

 If group not on frame, or frame is not useable, may have to
  resort to non-probability methods
 Three main approaches:
   – Quota sampling
   – Snowballing
   – Sampling at known points of congregation / through organisations
Quota sampling 1

 Use group defining feature (eg ethnic origin) in conjunction with
  other demographic characteristics (eg age, sex, working status)
  to set quotas
 Common to select clusters randomly and then select
  respondents using quotas
Quota sampling 2

 Relatively easy and cheap to implement, but problems:
   – Risk of unquantifiable bias as with all quota samples
   – Bias depends on (mainly unknown) correlations between quota variables,
     survey variables and propensity to be interviewed
   – Population totals (upon which quotas based) not always available and
     can quickly become out of date (but can use survey data - eg LFS as
Snowballing 1

 Interview eligible individuals obtained from any source
 At end of interview ask for names and contact details of other
 Add any newly identified people to sample
 Continue until interviews attempted with everyone and no new
  names identified
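
The procedure above can be sketched as a queue that is worked through until no new names appear. The `referrals` dictionary is a hypothetical stand-in for the names collected at the end of each interview, and all names are invented.

```python
from collections import deque

def snowball(seeds, referrals):
    """Interview seeds, then everyone they name, until no new names arise.

    referrals maps each person to the names they give at the end of
    their interview (a stand-in for real fieldwork).
    """
    seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    seen = set(seeds)
    queue = deque(seeds)
    interviewed = []
    while queue:
        person = queue.popleft()
        interviewed.append(person)
        for name in referrals.get(person, []):
            if name not in seen:  # cross-check against names already held
                seen.add(name)
                queue.append(name)
    return interviewed

referrals = {"A": ["B", "C"], "B": ["C", "D"], "D": ["A"], "E": ["A"]}
print(snowball(["A"], referrals))  # ['A', 'B', 'C', 'D']
```

In this made-up network, person E is eligible but is never named by anyone, so is never reached from seed A.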
Snowballing 2

 Problems:
   – Because can only ask for names and details after interview, only works
     if prepared to interview whole eligible population in area - cannot be
     used to generate frame from which sample selected
   – (If stopped snowballing after enough interviews achieved, sample
     would be grossly biased)
   – Cross-checking names administratively complex
   – Bias against those who do not mix with other eligible people
 But:
   – Work on "Respondent Driven Sampling" in past few years offers
     possibility of better estimates using snowball-like method
Sampling at known points of congregation

  Examples:
   – Visit gay bars to find sample of gay men
   – Visit Muslim community centres to obtain sample of Muslims
   – Visit organisations for people with disabilities for sample of people
     with disabilities

  Problems:
   – Those who visit points of congregation will be different from those
     who don’t
General conclusion

 General moral: there is no cheap and easy way to get good
  quality minority samples - unless they are pre-identified on a
  sample frame
 Which is why so many surveys of minorities use poor quality
  samples – with potentially very misleading results