
Methodological Review of the Survey of English Housing


Annex F

Part I: Overview

1.   Description of the Survey of English Housing

     1.1   Background and history

     1.2   Sampling
           Population definition
           Sampling frame
           Sample selection
           Sample stratification
           Balancing the sample over time

     1.3   Fieldwork and data collection

     1.4   Questionnaire
           Content and structure
           Stability and change

     1.5   Response to the survey

     1.6   Grossing of results

     1.7   Reporting of results

2.   A general critique of the existing survey design

     2.1   Sampling

     2.2   Fieldwork and data collection
           Simplicity and robustness
           Choosing household respondents

     2.3   Questionnaire: structure and change
           Questionnaire structure
           Response burden and sensitive topics
           Questionnaire forward planning

     2.4   Some important definitions
           Exclusions from the target populations
           Permanent residence
           Main residence
           Dwelling or household space
           Tenancy group
           Household reference person
           Family unit
           Concealed household
           Vacant accommodation unit

     2.5   Response rate

     2.6   Grossing and analysis

     2.7   Timeliness and flexibility
           Who are the users of the SEH?
           Factors that control the reporting timetable
           Shortening turn-round time

Part II: Special Topics

3.   ODPM Housing survey strategy

     3.1   The English House Condition Survey

     3.2   Similarities and differences between the EHCS and the SEH
           Operational links between the surveys

4.   SEH sample size and sample design

     4.1   The importance of estimates for population subgroups

     4.2   Sample stratification and clustering
           Stratification of primary sampling units
           Area deprivation scores
           Area typologies based on clustering
           Sample clustering and clustering design effects
           The effect of stratifying the selection of primary sampling units
           Calculating sampling errors for complex sample designs

     4.3   Other types of design effect
           Design effects due to correlated coding or assessment errors
           Design effects due to post-hoc weighting of the data

5.   Increasing effective sample size for important estimates

     5.1   Straightforward scaling up of the sample

     5.2   Boosting the representation of rare tenure groups

     5.3   Boosting the representation of particular geographical areas

     5.4   A larger periodic SEH

     5.5   Periodic SEH sample boosts

     5.6   Ad hoc sample boosting

6.   Pooling results from the SEH and other surveys

7.   Rotation of primary sampling units

     7.1   Rotation patterns

8.   Dwellings

     8.1   Address outcomes and the enumeration of dwellings

     8.2   Collecting and compiling additional information about dwellings

9.   Imputation for missing data

     9.1   Item non-response
           Sources of item non-response
           Effects of missing data on derived analysis variables
           Item non-response and data quality

     9.2   Item non-response and imputation

     9.3   Methods of imputation

10.  Grossing and weighting methods

     10.1  Simple expansion estimator

     10.2  Non-response weighting

     10.3  Calibration weighting

     10.4  Limitations of grossing and weighting systems

     10.5  Census check studies

     10.6  Other weighting and grossing systems and software for household

Part III Recommendations

     1.1.  Introduction

     1.2.  Simplicity and robustness

     1.3.  Single household respondent

     1.4.  Household reference person

     1.5.  Exclusions from the target population

     1.6.  Response rates

     1.7.  Sample size

     1.8.  Selective sample boosting

     1.9.  Sample stratification

     1.10. Rotation of primary sampling units

     1.11. Clustering and other design effects

     1.12. The grossing and weighting system

     1.13. Enumeration of dwellings

     1.14. Imputation for missing data

     1.15. Forward planning and survey timetable

     1.16. Dissemination of results and data sets

     1.17. The SEH and the EHCS


           A Methodological Review of the Survey of English Housing

This design review of the Survey of English Housing (SEH) has been carried out for
the Office of the Deputy Prime Minister (ODPM) by the Survey Methods Centre of the
National Centre for Social Research. The National Centre currently holds the SEH
contract, so both the team responsible for carrying out the review and the SEH
implementation team are staff of the same organisation. The National Centre may
again be an interested party when the SEH contract is re-tendered in 2003. The
review, however, focuses on the design of the SEH, which was due to Mr Denis
Down, the Statistician formerly in charge at ODPM (formerly DOE). There are no new
design features or changes to the survey specification that have been introduced
specifically by, or at the request of, the National Centre. The Survey Methods Centre
review team have therefore taken an independent view, without any possibility of
clashes of interest.

Part I: Overview

1.   Description of the Survey of English Housing

1.1 Background and history
The Survey of English Housing (SEH) is a large continuous household survey. It is
designed to provide ODPM with information on the housing arrangements and
preferences of private households and thus to answer a range of information needs
that arise in developing and monitoring housing policies.

The survey was set up in 1993 by the Housing part of the (then) DOE. It replaced
arrangements that had been made with the (then) Employment Department to run
sets of housing questions and housing “trailers” on the large Labour Force Survey.
An independent purpose-designed housing survey was set up at this time because
the design of the Labour Force Survey was due to change in ways that made it no
longer possible to run housing trailers and because there were limitations of the data
that DOE could obtain through trailers, given that the design and conduct of the LFS
were naturally dominated by labour force research priorities that differ from those of
housing research.

Other precursors of the SEH were the very large National Dwelling and Household
Survey mounted (with various follow-ups) in 1977-78 and a long series of ad hoc
surveys focusing on various aspects of housing, such as vacant properties, private
lettings, households that had recently moved house and others. The SEH does not in
itself adequately cover all the issues addressed by the more narrowly focused ad hoc
surveys.

For the first five years to 1998 the SEH was carried out by Social Survey Division of
the Office for National Statistics (ONS). In 1998, after competitive tendering, the
second five-year contract was awarded to the National Centre for Social Research.

1.2 Sampling
Population definition
The SEH is based on independent annual probability samples of private addresses
and households in England. The main population to be sampled consists of
“households”, with an individual household member (normally the Household
Reference Person1 or his/her spouse or partner) used as respondent. Institutions and
other communal households are excluded, as are individuals and households who
have no permanent home. In its present form the SEH is therefore a survey of the
population of private households, but not a survey of the population of dwellings or
household spaces, even though the sample of addresses would in principle provide a
basis for such a survey. The target size for the analysis sample is about 20,000
households.

Sampling frame
No sampling frame exists for private households as such, but a sampling frame does
exist of addresses at which under 50 items of mail are delivered daily by the Post
Office. This is the Postcode Address File (Small Users) (PAF), which has excellent
coverage of private domestic addresses. The PAF is arranged in a geographically
hierarchical order, so that addresses are listed within postcodes, postcodes are listed
within postal sectors, sectors are listed within postal areas and so on up to the
national level.

PAF also contains some addresses that for most household survey purposes are
treated as ineligible, such as hostels, small hotels and boarding houses and business
and other non-residential premises. Unfortunately no reliable method exists of
identifying and deleting these ineligible addresses in the office, so this has to be done
in the field. Another very important limitation for SEH sample design purposes is that
the address entries in PAF contain no information about the size, age or type of the
dwelling(s) to be found at the address, nor about the current residents of the dwelling
or their tenure arrangements.

Sample selection
SEH sampling from the PAF proceeds in two stages. At the first stage the entities
selected are postal sectors. The arrangement of the PAF file enables postal sectors
to be distinguished and counts of addresses in each sector to be made. 2 The number
of addresses per postal sector is about 2500 on average. Postal sectors are in effect
geographical entities; in densely populated urban areas they are quite
compact, but in thinly populated parts of the country they often cover much larger
areas.
At the first stage 1176 postal sectors are selected from the stratified list with
probability proportional to the number of addresses that they contain. At the second
stage 25 addresses are selected within each of the sectors selected at the first stage,
giving 29,400 addresses in all. The balanced probabilities of selection at the two
stages give each PAF address in England an equal overall chance of selection, while
providing interviewer address assignments of convenient and equal size. At each
stage strict probability sampling is used and selection is by a systematic procedure
from a random start in the stratified population listing.

1 The concept and definition of a Household Reference Person are discussed at section 2.4 below.
2 A small number of sectors are judged to have too small a population to be suitable as sampling units.
These sectors are identified beforehand and amalgamated for sampling purposes with an adjacent
sector. Wherever possible the amalgamation is carried out with a sector in the same sampling stratum.

On first making contact with a resident of an address interviewers check how many
different households reside at it. There is normally just one household per address,
but where there is more than one all resident households are included in the sample
(i.e. no selection at this stage). If any households prove at interview to contain
separate tenancy groups (see section 2.4), these tenancy groups are also treated for
most purposes as though they were extra households.
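
The two-stage selection described above can be illustrated with a minimal sketch. It is not the procedure actually programmed for the SEH: the sector sizes are invented, the function names are hypothetical, and a small number of sectors is used in place of 1176. It shows only why selecting sectors with probability proportional to size and then taking a fixed 25 addresses per sector gives every address the same overall chance of selection.

```python
import math
import random
from itertools import accumulate

ADDRESSES_PER_SECTOR = 25  # second-stage take quoted in the text


def select_sectors_pps(sizes, n_sectors, rng):
    """Systematic PPS selection: step through the cumulated address
    counts at a fixed interval from a random start."""
    cumulative = list(accumulate(sizes))
    interval = cumulative[-1] / n_sectors
    point = rng.random() * interval
    chosen, idx = [], 0
    for _ in range(n_sectors):
        while idx < len(sizes) - 1 and cumulative[idx] <= point:
            idx += 1
        chosen.append(idx)
        point += interval
    return chosen


def select_addresses(size, rng, take=ADDRESSES_PER_SECTOR):
    """Systematic equal-probability selection of address positions in a sector."""
    step = size / take
    start = rng.random() * step
    return [int(start + j * step) for j in range(take)]


def overall_probability(size, total, n_sectors, take=ADDRESSES_PER_SECTOR):
    """P(sector selected) multiplied by P(address selected given its sector)."""
    return (n_sectors * size / total) * (take / size)


rng = random.Random(2003)  # arbitrary seed so the sketch is repeatable
sizes = [rng.randint(1500, 3500) for _ in range(200)]  # hypothetical sectors
n_sectors = 20
chosen = select_sectors_pps(sizes, n_sectors, rng)
sample = {s: select_addresses(sizes[s], rng) for s in chosen}
probs = [overall_probability(n, sum(sizes), n_sectors) for n in sizes]
```

In the sketch the sector size cancels between the two stages, so every entry of `probs` equals `n_sectors * 25 / total addresses`. With the SEH figures the same arithmetic gives 1176 x 25 = 29,400 issued addresses.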

Sample stratification
In SEH sampling, stratification is applied to the selection of sectors, so that the
sample distribution of sectors matches the population in terms of: allocation to
Government Office Regions and sub-regions; proportions of privately rented
dwellings and of local authority housing that they contain; and the proportion of
resident household heads classified to certain socio-demographic groups (higher
social classes). The source of information for the last three stratifiers is the most
recent National Census of Population for which results are available. This is possible
because 1991 Census addresses were post-coded and as a result the characteristics
of census households falling within each postal sector can be summarised as
proportions and means. These summary values can then be linked to the current
version of the PAF and used in sample stratification and analysis of survey results. 3

Stratification at area level is not equivalent to stratification at the level of addresses or
households. In particular, use of differential sampling fractions within area strata is
quite weak as a means of boosting the proportion of households within the obtained
sample that have particular housing or other characteristics, except where there is an
unusual and extreme, but also stable, concentration of population units of interest
within the stratum. 4
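
A small worked example may help to show why differential sampling fractions at area level are a weak boosting device. The figures are invented for illustration and are not SEH or Census values: one area stratum is assumed to hold 20% of households, of which 20% rent privately, while the remaining areas hold 80% of households, of which 8% rent privately.

```python
def target_share(strata, boost):
    """Expected share of the target group in the sample when each stratum's
    sampling fraction is multiplied by the factor given in `boost`."""
    weighted = {
        name: pop_share * boost.get(name, 1.0)
        for name, (pop_share, _) in strata.items()
    }
    total = sum(weighted.values())
    return sum(
        weighted[name] / total * rate for name, (_, rate) in strata.items()
    )


# Hypothetical strata: (share of all households, proportion renting privately)
strata = {"high_renting_areas": (0.20, 0.20), "other_areas": (0.80, 0.08)}

baseline = target_share(strata, {})                          # equal fractions
boosted = target_share(strata, {"high_renting_areas": 2.0})  # doubled fraction
```

Under these assumptions, doubling the sampling fraction in the high-renting areas raises the expected proportion of private renters in the sample only from 10.4% to 12%, because most private renters live outside the boosted stratum.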

1.3 Fieldwork and data collection
The prime unit of data collection, analysis and reporting is the household (see section
2.4). Some information is collected about addresses and household spaces, but the
survey does not compile information on the basis of dwellings (see section 8.2). If a
whole address, or a household space within an address, is found to be vacant, or to
be a second or holiday home, its presence is noted in field returns, but it is then
treated as out of scope and no further effort is made to collect information about it or
its residents (if any).

The data for analysis are collected by trained field interviewers, who record details of
the household spaces found at each address and then approach each household
with a request for a face-to-face interview with the head of household or partner. If
neither is available or competent an interview may be conducted with another
responsible adult. 5 If it is found that a household contains two or more tenancy
groups (see section 2.4) each tenancy group is treated for sampling and interviewing
purposes as a separate household.

3 There is also a tendency for Census-based stratification to decline in effectiveness as the Census
results become more out of date.
4 An example from the 1991 Census would be the concentration of persons of Bangladeshi origin in
certain parts of East London.

Information is normally obtained from one adult household member. A few items of
information are collected about each member of the household, for the purpose of
classifying households according to size, demographic structure, economic activity
etc. The respondent is asked to provide factual answers to the survey questions
about the household’s (or tenancy group’s) housing circumstances and his or her
opinions are treated as the corporate opinions of the household or tenancy group.
For some purposes a single household member, identified as the main earner, is
treated as the household reference person (see section 2.4).

Interview length varies according to the number of individuals in the household or
tenancy group whose personal details have to be ascertained and the number of
different sections of housing-related questions that apply. 6 Most interviews last 30-40
minutes. The survey information is elicited and recorded using a computer-assisted
personal interview (CAPI) program running on a lap-top computer. The CAPI
programming package used is BLAISE.

1.4 Questionnaire
Content and structure
The design of the SEH questionnaire owes much to that of the LFS housing trailers
of 1980-92. Some important content domains normally covered are:
   Description of the household. Demographic structure and identification of
    families. Economic activity and disability status of adult members. Estimated
    income of household as a whole and of head of household. Access to vehicles.
    Occupational and employment details of head of household (as basis for socio-
    economic classification of household). Some questions about second homes.
   Details of tenure and accommodation occupied by the household, including
    any lodging or sub-tenancy arrangements.
   Housing history of the household as a unit, with reference to tenure,
    geographical mobility etc.
   Views of households (in effect, of a single household respondent) about the
    local area. Household and area characteristics of the most dissatisfied. Views on
    local services. How area and services could be improved.
   For owner-occupier households, the process of becoming an owner-occupier,
    mortgages and housing finance, leaseholders. Whether they own second or
    holiday homes.
   For households that have recently ceased to be owner-occupiers,
    circumstances of the change.
   For social renters their household characteristics, movements into, within and
    out of the sector. Tenants’ views and experiences in being offered
    accommodation. Rent and rent arrears. Receipt of Housing Benefit. Views about
    landlords and attitudes to transfers from Councils to RSLs.
   For private renters characteristics of the main tenancy types and of households
    that hold such tenancies. Recent moves into, within and out of the tenure
    category. Rents charged and receipt of Housing Benefit. Views about landlords.

5 Exceptionally the respondent may be someone who is not a household member, as for example
where a close relative responds on behalf of a frail elderly person living alone.
6 From this point onward the term “household” should be taken to cover both households and tenancy
groups, unless otherwise indicated in the text.

In particular years special trailer questionnaires or follow-ups may be included to
collect extra detail about sub-groups of interest, such as private renters.

Stability and change
Questionnaire content is reviewed each year. It is part of the rationale of a
continuous survey that question modules that have long-term relevance to housing
policy and series of statistics that track important changes in the national and
regional housing scene should be repeated. Thus the SEH has a substantial core of
standard question modules and in that respect it broadly resembles the Labour Force
Survey (LFS), the Family Resources Survey (FRS) and other large government
continuous surveys. Nevertheless in most years new question modules are
developed, pilot-tested and included in the main survey, typically for periods of 1-2
years. There may also be modifications to existing question modules, made to reflect
e.g. changes in housing policy or legislation, or to improve the performance of the
questions in obtaining complete and good-quality data.

1.5 Response to the survey
At SEH addresses, or household spaces within addresses, that are believed to be
occupied the rate of response is currently around 73% and the rate of non-response
is thus around 27%. There has been a decline in the rate of response over the life of
the survey which mirrors that experienced by other government continuous surveys.
The main reasons for non-response are that residents cannot be contacted in spite of
repeated calls by the interviewer (about 4.5%), or that residents when contacted
refuse to take part in the survey (about 18%). The remaining 4.5% or so of cases fall
into the category “Other non-productive”, which means that some contact was made
and no direct refusal was received, but the survey procedures could not be
completed successfully. This would include, for example, households that repeatedly
broke interview appointments and eventually had to be abandoned because fieldwork
time ran out. This level of non-response is not excessively high compared with that
experienced on other household surveys imposing similar response burden, but
nevertheless leaves scope for bias due to mean differences between responding and
non-responding households in terms of their accommodation, circumstances,
behaviour, or opinions.

In the case of non-responding household spaces and household spaces treated as
ineligible, all that is recorded is the existence of the household space and its
classification (vacant, derelict, second or holiday home, non-contact, refusal etc).
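
The response figures quoted above can be checked with a few lines of arithmetic. The only assumption added here is the annual target of about 20,000 responding households, taken as two half-years of roughly 10,000 (see section 1.7); the variable names are illustrative.

```python
# Percentages quoted in the text, for addresses or household spaces
# believed to be occupied.
outcomes = {
    "responding": 73.0,
    "non_contact": 4.5,
    "refusal": 18.0,
    "other_non_productive": 4.5,
}
non_response = sum(
    share for name, share in outcomes.items() if name != "responding"
)

# Assumed annual analysis sample of about 20,000 responding households.
target_responding = 20_000
occupied_needed = round(target_responding / (outcomes["responding"] / 100))
```

The three non-response components sum to the 27% quoted, and at a 73% response rate an analysis sample of about 20,000 would require roughly 27,400 occupied household spaces to be approached.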

1.6 Grossing and weighting of results
The results of the SEH are presented in reports as population estimates. The method
used to produce the estimates, starting from the raw survey results, is quite complex.

It involves the use of a set of grossing-up factors calculated using control totals
supplied by ONS Current Population Estimates and also incorporates adjustments to
compensate for bias caused by non-response. The grossing strategy is discussed in
more detail in section 10.
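
As a rough illustration of what grossing to control totals involves, the sketch below applies a simple cell-based expansion. It is a generic textbook device and should not be read as the SEH grossing system, which is more complex. The cells, control totals and sample counts are all invented, and the function names are hypothetical.

```python
def grossing_factors(control_totals, respondents):
    """Grossing factor per cell = control total / responding households."""
    factors = {}
    for cell, total in control_totals.items():
        n = respondents.get(cell, 0)
        if n == 0:
            raise ValueError(f"no responding households in cell {cell!r}")
        factors[cell] = total / n
    return factors


def grossed_estimate(sample_counts, factors):
    """Population estimate = sum over cells of sample count x grossing factor."""
    return sum(count * factors[cell] for cell, count in sample_counts.items())


# Hypothetical figures, invented purely for illustration.
control_totals = {"cell_A": 6_000_000, "cell_B": 9_000_000}  # population totals
respondents = {"cell_A": 7_500, "cell_B": 12_500}            # responding households
private_renters = {"cell_A": 750, "cell_B": 1_500}           # renters among respondents

factors = grossing_factors(control_totals, respondents)
estimate = grossed_estimate(private_renters, factors)
```

In this example each responding household in cell A stands for 800 households and each in cell B for 720, giving a grossed estimate of 1,680,000 private renters. A cell with a lower response rate receives a larger factor, which compensates for differences in response between cells but not for differences between respondents and non-respondents within a cell.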

1.7 Reporting of results
Thus far, SEH practice has been to publish annually a paper report containing tables,
commentary and a set of technical appendices (over 300 pages in all) about 12
months after the completion of a year’s fieldwork. This report is the main channel
through which external users learn about the results of the survey and it is probably
well used internally also. Some results are supplied to users within ODPM to a faster
timetable where policy priorities require this.

The SEH sample design is approximately nationally representative per calendar
quarter and results for a half year, based on approximately 10,000 responding
households, are for many purposes quite robust statistically. It is possible and useful
to base some analyses on data for (say) 6 months, but even then a period of at least
12 months is required between the identification of a policy need for new data, or for
some other change to the survey questionnaire, and the publication of a full set of
interpreted results.

2.   General critique of the existing survey design

2.1 Sampling
The SEH uses a sampling frame and a type of multistage, equal-probability
household sample design that is common to a number of other government
household interview surveys. It has a uniform selection probability for addresses in all
strata, no sample rotation and no built-in longitudinal features.

Various changes to the sample design that might be made to meet user requirements
are discussed in Part II.

2.2 Fieldwork and data collection
Simplicity and robustness
All the SEH data are collected in the course of a single interview per household
(separate tenancy groups found at an address are treated as extra households). 7 The
survey involves no fieldwork complications such as the need to interview several
members, or one particular member, of each household, or to fill in diaries, or to
organise a separate data collection visit by a physical surveyor (cf. the EHCS). The
efficiency with which the SEH is conducted in the field is enhanced by survey
contractors’ depth of experience in designing and operating other large government
surveys of broadly similar design and also by the stability of its main content and
procedures, with which interviewers become familiar.

7 Exceptions to the “one respondent, one interview” pattern are cases where a partial interview is
taken with someone other than the HRP or spouse and supplemented by telephoning the HRP later;
and the special data collection procedures undertaken to assist sampling for the English House
Condition Survey.

Choosing household respondents
One reason why the data collection design is relatively straightforward is the reliance
on a single household informant, who will normally be either the person designated
as HRP (the main income earner), or his/her spouse or partner. 8 In this it resembles
the Labour Force Survey, but differs from, for example, the General Household
Survey, the Health Survey for England, the Family Resources Survey and the
Expenditure and Food Survey, which require interviews with all adult members of the
household (and in the case of the GHS and the HSE sometimes interviews with
children also).

The GHS and the HSE aim wherever possible to interview individual members of
multi-person households in conditions of privacy, which helps when personal and
potentially sensitive questions are being asked about such topics as alcohol
consumption, cohabitation relationships or contraceptive practices. The FRS and the
EFS, on the other hand, encourage simultaneous interviewing of two or more adult
household members, and interviewer skills and the CAPI software used must be
equal to this. Simultaneous interviewing is done partly to reduce the amount of time
that the interviewer has to spend with each household, but also to assist in the
eliciting of detailed financial information by allowing respondents to confer and agree
and avoid duplications and omissions of amounts.

In the case of the SEH it seems reasonable to rely on a single responsible adult (a
“householder”) when asking factual questions about tenure, housing financial
arrangements and the like as they relate to that particular household. Where more
than one household member is qualified to be the respondent it also seems
reasonable to leave it to the interviewer to make a choice based on availability and
survey convenience.

However, when sampling opinions and attitudes it is less clear that a single
“convenient” household respondent can adequately represent the views of “the
household”, each member of which may actually have different views and
preferences. For example, it is not clear that the views of a male resident, who
spends much of his time working elsewhere, on the good and bad features of the
neighbourhood will adequately represent those of his partner, who spends most of
her time at home looking after young children. At best, therefore, the use of a single
“convenient” respondent adds to the variance of measures of “household opinions”
and at worst it may introduce substantial biases because the views of less readily
available household members will be under-represented. Another relevant
consideration here is the fact that some household member(s) might really prefer to
live as a separate household. Such concealed households are often the subject of
policy interest, but the views of the individual(s) concerned are effectively masked by
those of the current household respondent.

2.3 Questionnaire: structure and change
Questionnaire structure
The SEH questionnaire consists mainly of a core that is broadly constant from year to
year. Questionnaire design for an upcoming year therefore involves only a limited
number of changes and does not require radical rethinking. The questionnaire
structure can be seen as modular, with well-defined modules on, for example,
personal attributes of household members, the household’s housing history, housing
finance and so on. Some modules apply only to particular types of household, with
others being routed past them. The main structural complication results from the
need to capture detail about housing histories and transitions (i.e. about the
dynamics of the housing market), so that households in each tenure have to be
asked retrospectively about transitions from other tenures and other accommodation.

8 In exceptional cases an interview may be taken with a person who is not a member of the household.
This can occur, for example, where a relative responds as proxy for a frail elderly person living alone.
The smooth flow of the questionnaire is a result of careful thinking that has gone into
the structure and ordering of the modules and the logical interdependencies between
data items. A questionnaire and interview in which the ordering of modules was
changed, or new modules simply “slotted in” to replace old modules, would not
necessarily work well. The current model contrasts with that of an Omnibus survey,
where the operational vehicle remains constant from year to year but the question
content changes to meet short-notice ad hoc requests from customers.

Response burden and sensitive topics
Although quite demanding on the respondent, the SEH questionnaire is not grossly
overloaded. The most sensitive questions are probably those on income and housing
finance (particularly repossessions and rent arrears) and, for some, those which
require questioning about the circumstances of past housing events involving
relationship breakdown and change of partners. These are, however, vital topics for
the survey to cover.

Questionnaire forward planning
So far as we know the SEH does not have an explicit forward plan or strategy, similar
to that which developed over time on the General Household Survey, of treating
some question blocks as permanent and mandatory, inserting others periodically
(say every third year) and treating others again as temporary ad hoc insertions (and
therefore requiring to be piloted). Also, so far as we know, there have been no
cases where questions have been inserted for parts of a calendar year and then
withdrawn or replaced by others for the remainder of the year. We understand that
there have been instances where, in spite of piloting, it is only after revision on the
basis of data obtained in the first year of inclusion that questions have been deemed
to work as intended in the second year.

2.4 Some important definitions
In collecting the SEH data interviewers are instructed to apply a number of important
definitions that affect the inferences that can be made from the results. A selection of
key definitions of special importance to housing surveys is discussed below.

Exclusions from the target population
The SEH aims to provide information about private households and their housing
arrangements. However, the populations of households and dwellings thus defined
constantly gain members from and lose members to “fringe” populations, which are
excluded from the SEH as ineligible. In the case of households the “fringe” includes
residents of institutions such as hostels, elderly residential and children’s homes,
prisons and military establishments and individuals and households in temporary

accommodation, i.e. all households and individuals that have no permanent private
residence (see below under definition of “permanent residence”).

In the case of dwellings the “fringe” includes premises that have become
temporarily or permanently unfit for habitation (though some of these actually have
occupants such as vagrants or squatters), and vacant accommodation (see below
under definition of “vacant accommodation unit”). 9 These exclusions simplify SEH
fieldwork, but at the cost of making the survey database less suitable as a basis for
studying the dynamics of movement between the main and “fringe” populations.

Permanent residence
For survey purposes each individual is assumed to have just one identifiable
permanent address and residence. Special definitions are applied where the dwelling
at the address is a caravan or other moveable structure, but is nevertheless the
permanent home of a household. One aim of the definition is to avoid counting as
residents at an address any individuals found there who are “visitors” and therefore
assumed to have a permanent home address elsewhere (no check is made to
establish whether they actually do have such a permanent home). Auxiliary
definitions are then needed to ensure that individuals who are on extended visits (but
still have a permanent private address elsewhere) are excluded, and that,
conversely, individuals who constantly travel but nevertheless have the address as
their only permanent home, are included. Residence at the address for at least six
months of the past twelve is used as an operational criterion for identifying
permanent residence, but it is generally assumed that the permanent residence of
one spouse is also the permanent residence of her or his partner, even if the latter
does not satisfy the residence criteria.

If the aim is to account for all members of the survey populations of households and
individuals, these assumptions that each person has a unique address of permanent
residence and that there is for each individual some address at which he or she
satisfies the “permanent residence” rules, are questionable. It seems likely that
applying them (particularly the second) results in substantial under-coverage of
certain population subgroups, such as individuals and families in temporary
accommodation and single adults, often young, who have a vagrant lifestyle and
spend time at a number of different addresses, but never for long enough to qualify
as a permanent resident. There are some alternative sources of information for
households placed in temporary accommodation by local housing authorities, but not
for the second group, which is probably much larger. 10 Both groups are important
from a housing policy viewpoint.

Main residence
The assumption that each household and individual has just one permanent home is
enforced through rules that require each household and individual to have a unique
“main” residence. The main residence is defined prima facie as the address which
the household respondent names as such in answer to a direct question, or, if there
is any doubt, as the address at which the household spends most of its time. A
corollary is then that dwellings, with any households or persons who occupy them for
the time being, are excluded as ineligible if they are not anyone’s “main” address and
people who have no permanent home according to the definition discussed above
become statistically invisible so far as the SEH is concerned. It is, of course, likely
that the excluded households and individuals, even if treated as eligible, would have
high survey non-contact and refusal rates in practice.

  9 Second and holiday homes are also excluded, but some information about these dwellings is
collected when the owners are picked up as members of the eligible SEH sample.
  10 It seems likely that a substantial part of the Census undercount is also accounted for by this group.

Households and individuals who actually have several permanent addresses can
only be treated as eligible if contacted at their “main” address. This rule is intended to
avoid giving households that have several homes multiple chances of selection. A
corollary is that households or individuals who spend much of their time at a second
or holiday home (which may be outside the UK) have a lower chance than others of
being included in the SEH analysis sample (since if the call were to be made at their
“main” residence they would be classified as non-contacts).

Dwelling or household space
For responding SEH households a household space is effectively defined as the
accommodation that they occupy (allowing appropriately for any accommodation
assigned to sub-tenants or lodgers). Interviewers also seek to identify and enumerate
all other household spaces at sample addresses. However, the concept of “a
dwelling” includes in addition a criterion of “self-containedness” which is not satisfied
by all occupied (or vacant) “household spaces”, for example in housing in multiple
occupation. The three major sources of housing information, namely, the SEH, the
EHCS and the Census, use definitions of a dwelling that are not identical and can
lead to different ways of allocating accommodation to separate dwellings and to non-
identical dwelling counts.

A household is defined as one person living alone, or a group of people who have the
sample address as their only or main residence and who either normally share at
least one meal a day, or share a living room. This is the standard definition used in
censuses and surveys. It subsumes other rules mentioned above that define
permanent residents of the accommodation.

Tenancy group
A tenancy group is defined as a group of persons resident at an address who occupy
their accommodation under a common formal arrangement, which could be owner
occupation or some form of tenancy, sub-tenancy or rent-free agreement. In the
great majority of cases a tenancy group is coterminous with a household, but the
cases where this is not so need to be catered for and are of policy interest. The
tenancy group is an important concept in housing policy and research, since an
estimate of the number of tenancy groups is used in estimating total housing
demand. As mentioned above, separate tenancy groups at an address, once
identified in the SEH, are for most purposes treated as though they were additional
household units.

Household reference person
A further definition uniquely identifies a household reference person (HRP) for each
household. It is useful to identify a HRP as a means of classifying households
socially (using the occupation of the HRP) and also in estimating trends in household
formation via “headship rates”. Until 2000 the HRP was identified with the Head of
Household (HoH), using a definition which selected a person who owned the
household accommodation or was legally responsible for paying the rent (a
“householder”) 11. Where two household members of opposite sex (usually spouses)
qualified according to this criterion the male was selected and where two of the same sex
qualified, the elder. These partly gender-based tie-breaking rules were used to
ensure that one individual only was selected and also because, of two spouses, the
man was (and still is) likely to be the higher earner.

From April 2001 the older definition was abandoned in the face of objections that the
tie-breaker rule was sexist and replaced with a definition of the HRP as the
household member with the highest income. It is difficult to see much merit in this
change in the case of housing surveys. The title “Head of Household” could
reasonably have been replaced as it is no longer used colloquially and could convey
an out-of-date implication that the HoH is dominant within the household in
sociological or legal senses. However, choosing the “highest income” criterion
seems equally if not more unfair to women and also likely to cause doubts and
anomalies in practice. How, for example, should it operate in a household where the
husband, a qualified heating engineer, is unemployed and the wife has a part-time
cleaning job? Is the wife then HRP until the husband gets a job, or is the husband to
be selected because he is potentially the higher earner?
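
The contrast between the two selection rules can be sketched in code. The fragment below is a minimal illustration, not the survey's actual routing: the Member fields and the income figures are hypothetical, the extension of the tie-break to more than two householders is an assumption, and the HRP rule is coded exactly as it is described in this section (highest income among household members).

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    sex: str           # "M" or "F"
    age: int
    householder: bool  # owns the accommodation or is responsible for the rent
    income: float      # hypothetical annual income figure


def head_of_household(members):
    """Older rule: among householders, a male is preferred, then the elder."""
    householders = [m for m in members if m.householder]
    if not householders:
        return None
    # Sort key: males first (False sorts before True), then oldest first.
    return min(householders, key=lambda m: (m.sex != "M", -m.age))


def household_reference_person(members):
    """Newer rule as described in the text: the member with the highest income.

    Ties are not resolved here; max() simply returns the first such member.
    """
    return max(members, key=lambda m: m.income)


# The example from the text: an unemployed husband and a wife with a
# part-time job (the ages and income figures are invented).
husband = Member("husband", "M", 45, True, 0.0)
wife = Member("wife", "F", 43, True, 6000.0)
household = [wife, husband]

print(head_of_household(household).name)
print(household_reference_person(household).name)
```

Under these assumptions the two rules pick different people for the same household, and the second choice would switch as soon as the husband's income overtook the wife's.
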

Family unit
In the course of the SEH household interview other units of interest, such as family
units, are identified for analysis and also for questionnaire routing purposes. Family
units are usually defined as two or more adults, married or cohabiting, with or without
dependent children for whom they are responsible. Most households containing
families consist of just one such unit, but amongst certain ethnic minorities, for
example, large households containing more than one family unit are quite common.
On the other hand, the SEH does not recognise arrangements where two individuals
operate as a couple but retain their own separate permanent dwellings.

Concealed household
Some individuals or groups who reside with others in a single dwelling would
count as concealed households. These are usually defined as single adults or groups
of adults, with or without children, who occupy the accommodation with, but do not
belong to, the household reference person’s family unit. There are various ways in
which this kind of situation can arise (e.g. extended families, unrelated single adults
sharing accommodation, elderly parents in “granny flats” etc) and doing justice to
each of these is a complex area of housing research. A full account of the residents
of a dwelling, with their ages, sexes and the relationships between them, is required
to identify such concealed households with certainty.

Vacant accommodation unit
Conceptually, vacant accommodation units (household spaces) are defined as those
that are fit for, or are in course of being repaired or altered so as to be fit for, human
habitation, but currently have no permanent residents. As in the case of second or
holiday homes, interviewers will sometimes have difficulty in distinguishing “vacant”
units from units the residents of which are seldom at home. By the same token, there
will often be no readily accessible respondent who is able and willing to supply
accurate information about important dwelling attributes such as tenure, nature of
accommodation, amenities etc. Use of casual third-party respondents also raises
issues of confidentiality.

   11 To deal with cases of rent-free tenure this was extended to “the person had the accommodation by
virtue of some relationship to the owner in cases where the owner or tenant was not a household

2.5 Response rate
Maintaining high rates of response to household interview surveys is a perennial
challenge which is becoming more severe over time. The SEH has not escaped the
downward trend in rates of response which has affected all the continuous household
surveys conducted by the National Centre and by the Office for National Statistics,
particularly over the past 10 years. The reasons for this trend are likely to include:
   less public tolerance of unsolicited approaches (some made for survey purposes,
    but most for purposes of direct selling);
   more use of barriers to such approaches, such as entry-phone systems;
   more widespread fear of crime, particularly in some areas;
   more people being in full-time employment and leading over-busy lives, with less
    spare time and energy to take part in surveys;
   growing prevalence of lifestyles that cause people to be frequently away from
    home;
   members of the public becoming blasé and less flattered and intrigued at being
    invited to take part in a survey;
   less public faith in and support of the institutions of government (evinced also in
    lower electoral participation rates);
   less deferential acceptance that government has legitimate reasons for collecting
    information about households and individuals.

The above interpretation is difficult to confirm and quantify scientifically without
strictly comparable evidence from non-response studies conducted recently and in
earlier times, but if correct it suggests that there is no easy way (other than perhaps
the judicious use of costly response incentives) in which survey sponsors and
practitioners can revive the willingness of members of the public to take part in
government surveys.

2.6 Grossing and analysis
The most innovative and complex feature of the survey is probably the grossing and
weighting. This is further discussed in Part II.

2.7 Service to users, timeliness and flexibility
Who are the users of the SEH?
The SEH is a major government continuous household survey. It is justified within
government as a means of meeting the housing policy information needs of ODPM.
However, in common with its peer surveys, it is also a vital source of information and
data for local government, academic, commercial and other users outside central
government who are interested in housing and in other topics that are covered by
large household surveys.

The view we take is that, whereas ODPM and other government uses are
paramount, account should also be taken of the requirements of external users.
These requirements are not limited to getting speedy access to those sets of SEH
results to which the Department attaches greatest importance and priority, important
though that is. They also include having access to the household-level micro-data so
as to carry out new analyses defined by themselves, subject to data protection
safeguards, and to survey methodological information, such as complete
specifications of the sampling and data collection designs and the fieldwork
documents, procedures and outcomes. The annual published reports, as well as
presenting tabulated results and commentary, have hitherto supplied much of this
methodological information.

Factors that control the reporting timetable
Like all the large and complex continuous surveys conducted for government, the
SEH operates on a long time cycle. For example, any significant change to the
questionnaire must be agreed with sponsors and users well before it goes into the
field as part of the main survey, so as to allow time for pilot tests to be set up and the
details of the main survey questionnaire and field procedures to be finalised in the
light of pilot results. Proposed content changes often involve net additions to the
questionnaire and internal negotiation within the department may be required to
decide whether a longer interview can be funded or, if not, what can be dropped from
the existing questionnaire so as to hold interview length and other cost-related
features of the design constant. Internal content review processes may therefore
need to start a good deal earlier still.

If one measures from the point at which a new set of question topics is internally
approved within the sponsoring department for one of these surveys, it is necessary
to allow for:
   drafting the new questions in consultation with users and integrating them into the
    questionnaire to give a realistic context;
   piloting the questions in that context;
   digesting pilot feed-back and making any required modifications to question
    wording etc. (which may involve negotiating with the originators);
   integrating the questions into the Computer-Assisted Personal Interviewing
    (CAPI) programs for the main SEH round to start in the following April;
   running the questions for 12 months;
   processing the data and preparing them for analysis;
   running, interpreting and writing up the required analyses.
The time lapse from start to finish can in some circumstances be as long as two
years. Given all this, the SEH has so far had quite a good record for publishing
reports reliably to the sort of timetable just described.

Shortening turn-round time
The above cycle of activities operates on the SEH to a rhythm that is no slower than
on other comparable large continuous household surveys, but policy users within
ODPM are keen that the processes should be speeded up and the timetable
shortened. Results required for internal policy purposes (not necessarily for external
dissemination) do not have to wait for the published report or for publication on the
ODPM website. The delay in providing results internally can therefore be as
short as 18 months for full-year results and 12 months for half-year results. Half-year
results can be very useful, bearing in mind that the sample size for six months is
around 10,000 households and that the results for each calendar quarter are
approximately nationally representative. However, time-lags of this order are still long
in relation to the normal timetable for the policy-making process.

Currently SEH managers at ODPM are moving to a system in which the principal
output is a set of tables published on the ODPM website (the main batch will be
scheduled for publication before Christmas for the fieldwork year ending on 31 March
of that calendar year). At least one further batch will be published in the spring. There
will probably be other small batches made up of tables that have been produced in
response to ad hoc requests but are felt to be of wider interest. The tables on the
website will not be confined to the results from the latest 12 months of fieldwork, but
will include results from earlier years in the case of questions that are not asked
every year. There will still be an interpretive report, but this will be smaller than in the
past, and will contain only selected tables.

Apart from this the main focus of SEH management has been on shortening the
amount of time that must be allowed between internal notification of a requirement
for new questions and the launch of the main survey. This period has been reduced to
about 15 weeks, so that, with the main survey fieldwork cycle beginning on 1st April,
some requests for new questions can be taken on board as late as the end of the
previous November. This assumes, of course, that the generation of demand for new
questions always occurs in the autumn, but the exigencies of the policy process may
mean that the demand surfaces in the spring, adding another six months to the
timetable. Another point to remember is that putting pressure on the time available
for testing and improving the questions can result in question forms being inserted
in the main survey which are not fully tested and do not perform as well as they

A survey such as the SEH could be speeded up more radically only by changing
the design so fundamentally as to turn it into a different kind of survey. For example,
to run a quarterly as well as an annual cycle of questionnaire reviews and changes,
with corresponding cycles of data collection, data processing, analysis and reporting
(effectively four surveys a year) would require far more resources, different
organisation and some watering down of statistical and data quality standards. We
believe that thinking about the content of the SEH needs to be long-range, focusing
on aspects of housing that are of durable significance and that a sufficient number of
topics in this category exist to justify the survey. Information needs generated by
short-notice policy changes or innovations should be addressed in other ways. On
the other hand it is right that there should be regular reviews of content and the
survey should on no account be allowed to fossilise or to become dominated by
requirements to continue annual statistical series which are not really justified.

Part II: Special Topics

In Part I we have given a general description of the design and execution of the SEH
(Section 1) and an overall review and critique of the design (Section 2). In Part II
Sections 3-10 we provide further commentary on a number of important design
issues and problems that have either been flagged in the brief for this consultancy, or
have emerged from the critique (or both).

3. ODPM housing survey strategy
Housing policy is about matching the supply of housing units of various types, within
geographical areas, that meet minimum standards in terms of physical condition and
amenities, with the housing demand generated by persons in those areas who wish
either to live together in multi-person households of various types and sizes, or to live
alone. To provide information on which policy can be based ODPM conducts two
large surveys, the English House Condition Survey and the SEH.

3.1   English House Condition Survey
The EHCS has been running as a regularly repeated survey since the nineteen
seventies. It samples from the population of all dwellings, including those that are
vacant 12. It aims to collect sample information (a) about private dwellings and
household spaces and their physical condition and (b) about the households
inhabiting those dwellings and is thus able to investigate which types and conditions
of dwelling are inhabited by which types of household. The EHCS interview with
dwelling residents overlaps considerably in content with the SEH interview, but there
are some differences in definitions and question wording. The EHCS is now run
continuously, like the SEH, but the SEH has a substantially larger annual sample

3.2   Similarities and differences between the EHCS and the SEH
The EHCS can relate survey-measured attributes of the resident household(s) to
surveyor-assessed condition and other physical features of the dwelling, including
e.g. adaptation to the needs of the disabled. Its follow-up surveys expand the
information available about the physical attributes of the dwelling. The content of the
EHCS household interview overlaps heavily, but by no means completely, with that of
the SEH and within common sections includes details not covered by the SEH. Even
where there are topics in common, definitions and question wordings sometimes

The SEH can relate attributes of the resident household to the type, but not the
condition, of the dwelling. It does not systematically collect sample information about
the population of all existing dwellings and in reporting covers only those that are

  12 For brevity we will in the remainder of this discussion use the term “dwelling” to include the idea of
“household spaces” that may not be structurally separate, except when structural separateness is an

The PAF address sampling frame used by both surveys in principle covers all private
housing units. Neither the SEH nor the EHCS covers the complete population of
households and individuals that require accommodation, since it excludes those that
have no permanent residence.

Two important differences between the sample designs of the two surveys are:
   EHCS samples differentially, assigning much higher selection probabilities to
    dwellings in rare tenures and older housing compared with other parts of the
    housing stock;
   in the EHCS a high proportion of the dwellings and households included in one
    year are also included in at least one subsequent year (sample rotation at the
    level of dwellings).

This rotating sample design reflects the high priority attached in the EHCS to
reducing the standard errors of measures of change in the mean condition in the
housing stock. Another advantage of retaining addresses and dwellings in the
sample for more than one round of the EHCS is that it offers the possibility of
conducting longitudinal studies at the level of individual dwellings and household
spaces. Such studies of micro-change at the level of dwellings have considerable
attractions in theory, but have often been disappointing in practice. The main
reasons for this are:
   confusion of addresses, leading to errors in identifying on recall the housing unit
    that was surveyed on the earlier occasion;
   the fact that changes of occupancy make it hard to find reliable respondents who
    can give details of how and when changes or repairs were made;
   the fact that there are many ways in which dwellings may disappear or be
    transformed that are hard to track unambiguously;
   unreliability and error in the items of information collected at successive stages
    (which generates spurious “change”).

Operational link between the surveys
In 2001-02 an operational link was put in place between the SEH and the EHCS. The
EHCS now uses a “shadow sample”, drawn at the same time as the SEH sample, to
provide sufficient numbers of cases in certain strata. The SEH addresses and their
“shadows” are normally next-door or closely adjacent to one another. The method
depends partly on the correlation in terms of type and tenure that exists between
closely adjacent dwellings and partly on procedures built into the SEH fieldwork to
check tenure and other details of each “shadow” address. In sampling terms,
therefore, the EHCS partly depends upon the SEH. So far as we are aware no further
use has as yet been made of the close co-ordination of the two survey samples.

4. SEH sample size and sample design
4.1 The importance of estimates for population subgroups
As with most large-scale surveys, the main issue in discussions of the optimum SEH
sample size is the representation of population subgroups. Housing policy in England
is often directed to minorities, such as households and dwellings in rare tenures,
households that have recently changed their tenure arrangements, ethnic minority
households and others.

Separate estimates for these groups are therefore desired and an important function
of the SEH is to screen a large sample of residential addresses in order to identify
sufficient numbers from minority sub-populations, defined in terms of housing and
demographic variables, to provide stable estimates for those sub-populations. This
requirement is reinforced by demands from users to improve the detail, precision and
reliability of estimates for separate regions and sub-regions of the country. Detecting
and measuring change in the distribution of housing variables over time is another
particular concern, which interacts with the demand for finer geographical

4.2 Sample stratification and clustering
The most powerful determinant of the precision of estimates of level and change in
variables measured by the survey, and also of the power of tests of relationships
between variables, is usually sample size. However, (sub-)sample size is not the only
factor that affects the precision of estimates. Two other features that affect the
statistical performance of sample designs are stratification and clustering.
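
The dependence of precision on (sub-)sample size can be illustrated with a small calculation. The sketch below assumes simple random sampling, so it ignores the stratification and clustering effects discussed next. The annual sample of 20,000 households is an illustrative round figure (twice the half-year figure of around 10,000 quoted in Part I) and the 2% subgroup prevalence is hypothetical.

```python
import math


def srs_standard_error(p, n):
    """Standard error of an estimated proportion p from a simple random sample of n."""
    return math.sqrt(p * (1.0 - p) / n)


def expected_subgroup_size(total_n, prevalence):
    """Expected number of sample households that fall in a subgroup."""
    return total_n * prevalence


total_n = 20_000      # illustrative annual sample size (assumption)
prevalence = 0.02     # hypothetical share of a minority subgroup

n_sub = expected_subgroup_size(total_n, prevalence)
se_full = srs_standard_error(0.5, total_n)
se_sub = srs_standard_error(0.5, n_sub)

print(f"subgroup households: {n_sub:.0f}")
print(f"SE of a 50% estimate, full sample: {100 * se_full:.2f} percentage points")
print(f"SE of a 50% estimate, subgroup:    {100 * se_sub:.2f} percentage points")
```

Because the standard error falls only with the square root of the sample size, a subgroup that makes up one fiftieth of the sample has a standard error roughly seven times that of the full sample. This is why a large screening sample is needed to obtain stable estimates for minorities.
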

Stratification of primary sampling units
Stratifying the selection of sampling units tends to reduce the variance of estimates,
particularly if based on the total sample, but only in proportion to the degree to which
the selected stratifiers are correlated with those variables the level or change of
which the survey is designed to estimate.

Like almost all the other large government household surveys, the SEH issues to the
field a sample of addresses drawn from the PAF. The PAF has good coverage of
addresses, but a limitation is that it contains no information about the households that
reside in them, other than their geographical location as indicated by the postcode.
Over the years arrangements have been made to link the geographical referencing
system of PAF with that of outputs from successive Censuses of Population. Census
confidentiality prevents the making of links at the level of individual addresses or
postcodes 13, and in any case information about residents rapidly becomes inaccurate
unless regularly and frequently updated. Therefore, links are made at small area
level. As a result, where postal sectors are used as primary sampling units (as in the
SEH) it is possible to stratify them by reference to the averaged attributes of the
households and individuals found to be living in the sector at the time of the last
census. Stratification at the small area level is useful, since there are some marked
contrasts between areas that are relevant to surveys. Housing attributes, such as
tenure, type of dwelling and age of dwelling, provide some good examples of such
area contrasts.

  13 The exception is the linking of census and survey addresses carried out after each census by the
Office for National Statistics, using its uniquely privileged position vis-à-vis the census. This linkage
enables responding and non-responding survey households to be compared in terms of their census-measured
characteristics.

Five points should be noted.

1. Because the largest component in the population variance of survey variables is
   usually between households within small areas, rather than between area
   averages, area stratification can operate to control only a relatively small
   component of variance. It is thus much weaker in its effects than household-level
   stratification would be. Thus in looking at evidence on the scope for improving
   stratification schemes it must always be remembered that any improvement
   achieved affects only the smaller, between-areas, component of total between-
   households variance.
2. There is a tendency, varying according to which survey variable is looked at, for
   households of particular types to cluster together geographically. This produces
   unfavourable design effects in address samples that are themselves drawn in two
   stages, with small areas selected at the first stage (see next section). If all small
   areas were identical in terms of the survey variables such design effects would
   not arise; but nor would there be any point in stratifying the selection of primary
   sampling units. As it is, the sample designer tries to recover through area
   stratification the losses in precision that result from address clustering. Sadly, it
   almost always turns out that the effect of clustering in increasing variance is
   stronger than the effect of stratification in reducing variance.
3. Both clustering effects and stratification effects tend to decrease in magnitude for
   estimates based on sub-samples (which in practice most estimates are).
   However, for within-region estimates the regional stratifier has no effect, whereas
   the clustering effect is undiminished.
4. Clustered sample designs, such as that of the SEH, that allocate address
   selections to many primary sampling units (areas) are better, but gain less from
   stratification, than designs that use fewer areas and larger clusters. Nevertheless,
   within those limitations area stratification can make a useful contribution to
   making estimates as precise as possible.
5. Household-related results from a given census remain in use until the small area
   data from the next census become available about 12-13 years later. Over that
   period their effectiveness in stratifying area samples is likely to be reduced by
   changes in area composition over time. This factor does not affect the power of
   regional stratifiers.
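
Points 2 to 4 can be made concrete with the standard approximation for the design effect due to clustering, deff = 1 + (b - 1) * rho, where b is the average number of interviews per primary sampling unit and rho is the intra-cluster correlation of the survey variable. The sketch below is illustrative: it ignores stratification and weighting, and the values of rho, the total sample size and the cluster sizes are hypothetical rather than SEH figures.

```python
def design_effect(cluster_size, rho):
    """Approximate design effect due to clustering: 1 + (b - 1) * rho."""
    return 1.0 + (cluster_size - 1) * rho


def effective_sample_size(n, cluster_size, rho):
    """Size of the simple random sample that would give the same precision."""
    return n / design_effect(cluster_size, rho)


n = 20_000   # hypothetical total number of interviews
rho = 0.05   # hypothetical intra-cluster correlation

# Same total sample spread over many small clusters or fewer large ones.
for n_psus in (1000, 400):
    b = n / n_psus
    print(
        f"{n_psus} PSUs of {b:.0f}: deff = {design_effect(b, rho):.2f}, "
        f"effective n = {effective_sample_size(n, b, rho):.0f}"
    )

# A subgroup spread evenly across PSUs has a smaller take per cluster,
# so its design effect is smaller (point 3 above).
print(f"subgroup, 2 per PSU: deff = {design_effect(2, rho):.2f}")
```

The same formula shows why within-region estimates keep the full clustering effect: restricting attention to one region reduces the number of clusters but not the number of interviews per cluster.
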

Large surveys with many primary sampling units (PSUs) give scope to distinguish
many strata in selecting PSUs (areas). This is likely to make stratification more
effective, provided that all distinctions made between strata mark genuine area
contrasts that are relevant to the survey variables. In practice most sample designs
make a trade-off between number of stratifying variables and fineness of
discrimination on particular stratifiers. A rule-of-thumb empirical guide sometimes
used is that it is better to use several stratifiers that contrast areas in different ways
(provided that all are relevant to the survey variables) than to make very fine
distinctions on just one stratifier.

Large national household surveys are often designed to provide a wide range of
different estimates and consequently tend to select a general-purpose set of
relatively independent census-based stratifiers. 14 The selection often includes:

     14 A list can be found in Barton (1996).

     a classification of the country into geographical regions;
     measure(s) that distinguish conurbations, other urban areas, suburban areas and
      rural areas (or a measure of population density);
     measure(s) that take account of the preponderant form of housing in each area;
     measure(s) of “average area social class” (interpreted as the proportion of
      household heads or reference persons within the area that were at the time of the
      census in “middle class” rather than “working class” occupations);
     a measure of the average level of affluence of area residents, such as car

Variables such as housing tenure are, of course, of direct relevance to housing
policy, but also function partly as measures of relative material deprivation and for
that reason are used in stratifying samples for surveys not primarily focused on
housing topics. Social class reflects cultural and lifestyle differences, but also relative
material affluence or deprivation. The last two variables mentioned above are the
nearest the census and other available area information sources can get to
stratification by average area level of household income or relative material
deprivation. The choice of stratifiers often owes something to political as well as
statistical considerations. For example, survey users may wish to be assured that the
proportion of the sample assigned to each region, or to each ethnic group, is controlled.
These frequently-used stratifiers are all to some extent intercorrelated across the
population of areas.

In the case of the SEH, the most important survey estimates relate to households. As
already described in Part I, the design is quite elaborately stratified at the PSU
(postal sector) level. The population of postal sectors is classified into 17 regions and
sub-regions, and the stratification takes account of the proportions of households per sector in private
renting and in local authority renting tenures and the proportion of households per
sector in higher versus lower social classes (based on occupations of heads of
households). These factors are balanced over the year within the sampling scheme.
It would not be possible to incorporate additional PSU stratifiers into the current
design without either dropping one or more of the existing ones or reducing the detail
of the classifications.
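
One common way of implementing area stratification of this general kind is to group PSUs by region, order them within region by the remaining stratifiers and then select systematically from the ordered list. The sketch below illustrates that general technique only and is not a description of the SEH selection procedure. The field names, the 10-point banding and the equal-probability selection are all assumptions made for the example (designs of this type often select PSUs with probability proportional to size instead).

```python
import random
from collections import defaultdict


def band(pct, width=10):
    """Collapse a percentage into bands of the given width (hypothetical banding)."""
    return int(pct // width)


def select_psus(sectors, n_per_region, seed=0):
    """Systematic selection of PSUs within region from a stratified ordering.

    sectors: list of dicts with the hypothetical keys 'id', 'region',
    'pct_la_rent', 'pct_private_rent' and 'pct_higher_class'.
    """
    rng = random.Random(seed)
    by_region = defaultdict(list)
    for sector in sectors:
        by_region[sector["region"]].append(sector)

    selected = []
    for region in sorted(by_region):
        # Order the sectors so that those with similar tenure mix and
        # social class profile sit next to one another in the list.
        group = sorted(
            by_region[region],
            key=lambda s: (
                band(s["pct_la_rent"]),
                band(s["pct_private_rent"]),
                s["pct_higher_class"],
            ),
        )
        k = min(n_per_region, len(group))
        if k == 0:
            continue
        # Fixed sampling interval with a random start spreads the
        # selections evenly across the ordered list.
        interval = len(group) / k
        start = rng.random() * interval
        for i in range(k):
            index = min(int(start + i * interval), len(group) - 1)
            selected.append(group[index])
    return selected
```

Because the selections are spread evenly through the ordered list, each region's sample reflects the range of tenure and social class profiles among its sectors. Adding a further stratifier means either adding another sort key or making the existing bands coarser, which is the trade-off noted in the text.
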

Compared with other large household survey designs the SEH design gives more
scope to the overtly housing-related census-based area stratifiers. The regional
stratifier and the two tenure-related stratifiers are concise and satisfy the requirement
that they be directly related to important survey estimates, so we do not think they
should necessarily be changed (see also below on area deprivation scores and area
typologies based on clustering). It is possible that the stratification based on area
social class indicators could be improved upon and it will be necessary to consider
this issue when the small-area results of the 2001 Census become available and
sampling frames are updated, because ONS has switched, in classifying Census
data on occupation and status in employment, from the old Socio-Economic Groups
(SEG) classification to a new National Statistics Socio-Economic Classes (NS-SEC)
classification. The 100% processing of 2001 Census occupational data should
strengthen this form of stratification 15 and analyses carried out in the course of
developing NS-SEC suggest that it may perform slightly better than SEG in
distinguishing disadvantaged types of household, which is probably one of the aims
of SEH stratification. These improvements should carry through when the
classification is carried up to small area level via the proportion of household heads
(or of employed persons) whose occupations are assigned to particular NS-SEC
categories. Another important set of sampling options likely to be opened up when
the results of the 2001 Census become available is the ability to distinguish area
entities below the level of postal sectors that may be particularly meaningful in
housing and environmental terms, such as “neighbourhoods”.

15 Formerly social class information was coded for only 10% of households.

During the past decade trials have been reported aimed at optimising the
stratification designs of the Family Resources Survey (using the results of the Family
Expenditure Survey as a test-bed) (Bruce 1993, Barton 1996) and the General
Household Survey (Insalaco 2000). In principle the optimum way of combining
stratifiers could be different for each outcome variable, so the approach has been to
define a small set of important outcome variables and to look for stratifiers that
perform well in reducing the variance of most or all of these important variables. The
criterion set of variables chosen tends to vary from one survey to another and the
same variable may be used in a different form from one survey to another. Certain stratifiers
(e.g. the social class measures) tend to recur, but it is not safe to generalise from one
survey to another. Nevertheless, results pointing to worthwhile improvements in the
way stratifiers are defined, selected and used have been obtained. This type of test
requires only a survey data set in which the allocation of cases to primary sampling
units and hence to strata is defined and access to the required census area-level data.

To predict for the SEH which exact combination will produce the greatest reduction in
the between-areas component of sampling variance, empirical trials are needed
using a customised list of key output variables. We therefore recommend that a
similar exercise be undertaken for the SEH. The best opportunity to do this work
would probably be as part of the redesign of the SEH sample that will in any case be
required when the 2001 Census small area data come on stream. However, in our
view not too much should be expected by way of improvements in statistical
efficiency. Some of the papers referenced (e.g. Barton 1996) give a misleading
impression in interpreting results showing how area stratification impacts on the
variance of household-based estimates. The improvements claimed affect only the
minor component of variance, that is, variance between area averages. Thus, for
example, if a reduction in variance of 15% is predicted for a new set of stratifiers as
compared with the old, but the between-areas component accounts for only 10% of
total between-households variance, then the actual reduction in the variance of
household-based estimates is only 1.5%.
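
The arithmetic in this example can be checked with a short calculation. The sketch below is illustrative only: the function name is ours, and it assumes that restratification leaves the within-area component of variance unchanged.

```python
def overall_variance_reduction(between_share, between_reduction):
    """Proportional reduction in total variance when only the
    between-areas component of variance is reduced.

    between_share:     fraction of total variance lying between areas (0 to 1)
    between_reduction: fractional reduction achieved in that component (0 to 1)
    """
    for value in (between_share, between_reduction):
        if not 0.0 <= value <= 1.0:
            raise ValueError("arguments must lie between 0 and 1")
    # The within-area component is assumed to be untouched by restratification.
    return between_share * between_reduction


# Figures quoted in the text: a 15% reduction applied to a component
# that accounts for 10% of total between-households variance.
reduction = overall_variance_reduction(0.10, 0.15)
print(f"Reduction in total variance: {reduction:.1%}")
```

Because standard errors vary with the square root of the variance, the corresponding fall in the standard error would be smaller still.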

Area deprivation scores
In addition to single Census variables there exist area-level deprivation indicators
derived from sets of variables, the development of which was sponsored by the
(then) DOE (DOE 2000). These indicators are based partly on Census variables and
partly on other small area statistics that are regularly updated, such as registered
unemployment rates. There is a family of these deprivation indicators, each
corresponding to a different domain (e.g. the Housing Domain, the Income and
Employment Domain), as well as an overall indicator. These will need to be updated
on the basis of the 2001 Census results, but we do not know what the timetable for
that will be. There has been some technical controversy about methods by which the
indicators were derived.

From the SEH viewpoint a drawback to using these indicators to stratify the area
sample is that they are available for electoral wards only. The matching with
postcode sectors, which are the SEH primary sampling units, is necessarily rough
and this would weaken the power of area deprivation scores as stratifiers. SEH
designers have tended instead to use individual census small area measures,
because they are more easily interpreted and more clearly related to the aims of the
survey. A more thorough review of advantages and disadvantages of the scores in
SEH sampling would require more time and effort than we can afford to give within
the scope of the present review. Our feeling is, however, that there are no great
gains to be made.

Area typologies based on multivariate cluster analysis
Another way of reprocessing Census Small Area Statistics, often used in market
research and sometimes in social research, has been to subject a set of about fifty
area indicators to a form of cluster analysis 16, which seeks to assign every small area
(for example wards) to an area cluster. This is another type of area classification; one
such system is known as ACORN and another as MOSAIC. Impressionistic labels
are given to the clusters on the basis of the indicators on which they score notably
above average. The labels tend to refer to demographic, socio-economic and
housing attributes of areas, along the lines of “Pebble-dashed Subtopia”, “Bohemian
Melting Pot” and the like. At the most detailed level there may be around 40 area
clusters (quite uneven in size), but these can be recombined hierarchically into a
more manageable number, perhaps 11, for use in sample stratification. However,
recombination necessarily robs the classifications of much of their precision and
interpretability and hence their intuitive appeal.

We have been asked to consider whether the SEH sample stratification should be re-
based, wholly or in part, on one of these area classification systems. We believe that
that would not be justified. The labelling of area clusters is designed to maximise
customer appeal by appearing to “capture” many different dimensions of variation
between areas and their residents. It also gives an impression that the clusters are
much more sharply distinguished statistically than is actually the case. The clustering
algorithm, by its nature, cannot be optimally focused on any one area of application,
such as housing. We believe that the existing set of stratifiers is more focused, more
transparent and likely to control more of the variance in SEH target variables than the
area clustering systems.

It is worth noting here that the performance of different sets of stratifiers could be
tested and compared empirically if indicators of deprivation score(s) and the area
cluster assignment for each PSU were attached to a recent SEH data set (see above
under stratification).

16 Note that in this section we are using the terms “cluster” and “clustering” to refer to a form of
multivariate data analysis, not to a sample design feature.

Sample clustering and clustering design effects
In the SEH sample design addresses are selected in two stages. This produces a
two-stage clustered design. The clusters correspond to primary sampling units
(PSUs) and each consists of the set of households for which data are obtained within
a postal sector. Sample clustering is one important source of unfavourable sample
design effects, which increase the variance of estimates and hence the
corresponding standard errors of estimation. The size of the clustering design effect
(DEFF) can be approximately estimated as a joint function of the mean intra-cluster
correlation (rho) and the number of cases in each cluster (M):

DEFF = 1 + (M – 1) × rho

The design effect is the factor by which the variance of the estimate for the complex
design is greater or less than the variance for a simple random sample of the same
size. Design effect coefficients greater than one indicate that the sample design is
less statistically efficient than a simple random sample (so the effective sample size
is smaller than the actual sample size).

Even if rho is so small that it
would be considered negligible in many applications of correlational analysis (say
0.02 or less), if M is large (say 100 or greater) the design effect will also be large.
In housing surveys rho values for many variables tend to be
relatively high because of the tendency for blocks of housing to have been built at
much the same time and to similar designs. Being aware of this, designers aim to
keep down the number of cases per cluster, which for fixed total sample size entails
having a larger number of smaller clusters, rather than a smaller number of larger
ones. The SEH thus selects 1176 first stage units and the average cluster size
(number of responding households/tenancy groups per first stage unit) is
approximately 17. This is also a convenient number for a single SEH interviewer to
cover in one month.
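
The trade-off between cluster size and design effect can be illustrated with a short calculation. The sketch below uses the usual approximation DEFF = 1 + (M - 1) × rho. The function names are ours and the rho value of 0.05 is purely hypothetical, since actual SEH rho values differ from one variable to another.

```python
def clustering_deff(rho, m):
    """Approximate clustering design effect, 1 + (m - 1) * rho.

    rho: mean intra-cluster correlation for the variable of interest
    m:   average number of responding cases per cluster (PSU)
    """
    if m < 1:
        raise ValueError("cluster size must be at least 1")
    return 1.0 + (m - 1) * rho


def effective_sample_size(n, deff):
    """Size of the simple random sample giving the same precision."""
    return n / deff


# A rho that looks negligible still matters when clusters are large.
print(f"rho=0.02, M=100: DEFF = {clustering_deff(0.02, 100):.2f}")

# SEH-like design: 1176 PSUs of about 17 responding households each.
# The rho of 0.05 is a hypothetical value chosen for illustration.
n_total = 1176 * 17
deff = clustering_deff(0.05, 17)
print(f"M=17, rho=0.05: DEFF = {deff:.2f}, "
      f"effective n = {effective_sample_size(n_total, deff):.0f} of {n_total}")
```

For a fixed total sample size, halving M roughly halves the excess of DEFF over 1, which is why a larger number of smaller clusters is preferred when fieldwork costs allow.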

Calculating sampling errors for complex sample designs
In the case of simple random sampling it is possible to calculate directly, from a
knowledge of the sample means and standard deviations of variables and base
numbers, the values of the standard errors of estimates presented in tables.
However, in the case of complex sample designs the calculation is more laborious
and requires rho values, cluster sizes and stratification to be taken into account. This
can be done through the use of algorithms embodied in special software, which
enables true variances and standard errors for estimates based on samples with
complex designs to be calculated. In general the standard errors are larger than
those estimated by assuming simple random sampling. In the SEH reports, rho
values and true standard errors are published for a range of important estimates. 17

In specifying the SEH design the unfavourable clustering design effect has been
carefully balanced against practical arguments in favour of clustering. Appendices to
the SEH reports show standard errors and design factors for quite a wide range of
estimates based on the grossed-up sample (or sub-samples of it). The largest design
factor value shown is around 1.88, but the majority are under 1.2 and many are close
to 1.0. The largest design factors observed reflect geographical clustering in the
population, as occurs for example for households living in high-rise buildings and
members of certain ethnic minorities. We therefore agree that the balance of
advantage is in favour of a two stage clustered design because it makes fieldwork so
much less costly and more manageable.

17 We are not sure whether or not these estimates, in addition to taking account of the sample design,
also allowed for the complex grossing and weighting scheme applied to the SEH results, but we think

Where the subgroup for which estimates are required is spread across most of the
sampling strata and area sampling units, the number of cases per area cluster and
thus the clustering design effect is reduced. However, it is not reduced when the
subgroup is defined to include certain classes of area only, such as those located in
a particular geographical region. Therefore the precision of estimates for regions is
reduced not only because the sample size is (obviously) smaller than for national
estimates, but also because the clustering design effect is the same as for national
estimates.

4.3 Other types of design effect
Design effects due to correlated coding or assessment errors
While clustering design effects due to the sample design are pervasive in the surveys
we are considering, they are not necessarily the only serious type of clustering
design effect that influences the precision of results. For example, interviewers may
differ, slightly but systematically, in the codes they record at particular questions,
perhaps because they interpret their instructions differently. That will produce a rho
value greater than zero for certain variables within the cluster of households
interviewed by the same interviewer. Each SEH interviewer typically deals with only a
small number of responding cases 18 and there are no obvious cases where coding of
responses depends on interviewer judgement 19, so the effect on standard errors is
likely to be negligible. 20

Design effects due to post-hoc weighting of the data
In the case of the SEH selection of households is with equal probability, but the
results presented are based on data that have, as an inherent part of the grossing
procedure, been differentially weighted to reduce biases due to non-response. Such
differential weighting tends in most circumstances to generate unfavourable design
effects and inflate standard errors, the magnitude of the effect being proportional to
the sample variance of the weighting factors applied. Because of the complexity of
the SEH weighting and grossing system it may be that true variances and sampling
errors could only be estimated through some balanced repeated replications
algorithm. It seems possible that Elliott (1997) carried out some such estimations, but
if so he does not report results for the SEH.
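
A rough indication of the variance inflation due to differential weighting can be obtained from the weights alone. The sketch below uses Kish's well-known approximation, in which the design effect equals one plus the squared coefficient of variation of the weights. This approximation is our choice for illustration and is not the method used in the SEH reports. It ignores any variance reduction from calibration and any correlation between the weights and the survey variable. The weights shown are hypothetical.

```python
def weighting_deff(weights):
    """Kish's approximate design effect due to unequal weighting.

    Computed as n * sum(w^2) / (sum(w))^2, which is algebraically
    equal to 1 + (coefficient of variation of the weights)^2.
    """
    n = len(weights)
    if n == 0:
        raise ValueError("at least one weight is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    sum_of_squares = sum(w * w for w in weights)
    return n * sum_of_squares / (total * total)


# Hypothetical per-case weights: half the cases carry three times
# the weight of the other half.
weights = [1.0, 1.0, 3.0, 3.0]
print(f"Approximate weighting design effect: {weighting_deff(weights):.2f}")
```

Equal weights give a value of exactly 1, and the value rises as the weights become more variable.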

18 It must be remembered, however, that some interviewers cover more than one quota of SEH
addresses in the course of a year.
19 Apart, perhaps, from interviewer assessment of “shadow sample” addresses.
20 However, effects due to systematic inter-assessor differences cannot automatically be dismissed as
negligible. In the EHCS surveyors responsible for assessing the condition of dwellings tend (in spite of
special training) to differ systematically from each other in the standards they apply. The rho value is
still small in absolute terms, but when it interacts with the fact that the average number of dwellings
assessed by each surveyor is large, the resulting unfavourable design effect is also uncomfortably
large.

The SEH weighting and grossing system consolidates several conceptually different
stages and it seems unlikely that the effects on variance of different weighting
elements (non-response weighting, calibration weighting) could be separately
calculated using existing data sets. To investigate these effects for a selection of
SEH estimates a special data set would be required that incorporated not only the
raw survey results for some suitable set of variables but also full meta-data on both
the sample design and the weighting scheme, including per-case values of the
weights applied at each stage (see section 10). Specifying and obtaining such a data
set, carrying out appropriate analyses and interpreting the results is a larger task
than could be attempted within the limits of the present review.

5. Increasing effective sample size for important estimates

5.1 Straightforward scaling up of the sample
The simplest response to demands for more detail and greater precision in estimates
based on the SEH would be to increase the total set sample size for the survey as a
whole, while holding the sample design constant. This solution, as well as being
conceptually simple in itself, would have the merit of maintaining the relative
simplicity of the design. However, because the precision of estimates is proportional
to the square root of sample size, total sample size would probably have to be at
least doubled and quite possibly quadrupled to satisfy a reasonable proportion of the
demands. That would have the obvious disadvantage of raising data collection and
data processing costs very substantially. It would also be difficult to justify, since it
would boost the representation of large population subgroups already well covered,
as well as that of small but important subgroups.
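
The square-root relationship can be made concrete as follows. This is a minimal sketch with function names of our own choosing, and it assumes that the design effect is unchanged when the sample is scaled up.

```python
import math


def se_ratio_after_scaling(factor):
    """Ratio of new to old standard error when the sample size is
    multiplied by `factor`, with the design held constant."""
    if factor <= 0:
        raise ValueError("scaling factor must be positive")
    return 1.0 / math.sqrt(factor)


def factor_for_se_ratio(ratio):
    """Sample-size multiplier needed to bring standard errors down
    to `ratio` times their current value."""
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in the interval (0, 1]")
    return 1.0 / ratio ** 2


for factor in (2, 4):
    print(f"Sample x{factor}: standard errors x{se_ratio_after_scaling(factor):.3f}")
print(f"To halve standard errors the sample must be multiplied by "
      f"{factor_for_se_ratio(0.5):.0f}")
```

Doubling the sample therefore narrows confidence intervals by less than a third, while halving them requires a fourfold increase.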

5.2 Boosting the representation of rare tenure groups
A strategy that might in principle be more efficient is weighted sample selection. This
could be applied whether or not total sample size was increased. For example, a key
feature of the current English housing scene is that approximately 70% of private
households currently own or are buying their accommodation, so that all other tenure
arrangements have become relatively rare. There is much of importance to be
studied in the owner-occupied sector and in movements of households into, out of
and within it. However, developments in other tenure sectors where the units are
relatively few and sparsely distributed are also of great policy interest and it would be
more efficient if representation of the rare tenures could be boosted relative to that of
owner-occupiers.

It would in principle be straightforward to achieve this if at least one complete and up-
to-date list sampling frame of private housing units existed which contained
information on tenure. In the past something that roughly corresponded to that ideal
existed in the shape of the Rating and Valuation Lists, a master copy of which was
held by the Inland Revenue Department and edited copies of which were held by
Local Rating Offices. The use of the lists for revenue collection encouraged local
authorities to keep their copies up to date. Authorised staff were able to gain access
to the Local Rating Office lists for sampling purposes, but in practice sampling had to
be done on local authority premises. The lists were held in non-standard formats and
as a result sample stratification and selection was a laborious and error-prone
procedure. Nevertheless, this was the source from which a very large weighted
sample was drawn for the purposes of the National Dwelling and Household Survey
in 1977. Even if such an administrative source were somehow to be recreated, it
seems doubtful, in the light of personal privacy legislation that has since been
enacted, whether access to it would now be granted for survey sampling purposes.

When the domestic rating system was replaced by the Council Tax the accessibility
of the lists for sampling purposes lapsed and since then most national housing
surveys have relied either on straight PAF samples or on follow-ups of other large
surveys, such as the LFS or the GHS, which bore the cost of screening address
samples to identify households and dwellings in particular categories (e.g. private
renters, recent movers etc).

Currently, therefore, there are no sampling frames available that contain information
about the housing characteristics of every dwelling in England. However, there are
two commercial data-bases known to us which estimate the “average” characteristics
of a dwelling in any particular postcode. One is called Residata and the other is
developed and maintained by the commercial agency Experian.

Residata is maintained by the Building Research Establishment and uses
information from multiple sources, including Census data and insurance applications,
to estimate the most dominant type (flat or house) and age group for dwellings in
each unit postcode in the United Kingdom. Residata also predicts whether dwellings
in the postcode are predominately owner occupied, privately rented, in the social
sector or of mixed tenure (i.e. some private sector and some social sector). The
correlation between the characteristics of a particular dwelling and the housing
characteristics attributed to its postcode by Residata could in principle be exploited
when selecting samples for the SEH. To be useful, such stratification would need to
control the representation of housing units in rare tenures, considered separately.

From comparisons of dwelling characteristics predicted by Residata with those
measured in the EHCS 1996, it was concluded that Residata was fairly good at
identifying the age and type of dwelling, but was not so accurate for tenure (see Lynn
et al, 2000). Using the information from Residata would correctly predict the age
group for about 71% of dwellings and the type of dwelling for about 87% of dwellings.
For tenure, the figure would be about 64%. Although the correlation for tenure might
seem encouraging at first sight, it only actually measures the ability of Residata to
differentiate between private and social housing – Residata is not able to differentiate
within these tenure groups. In particular, a very small proportion (13%) of dwellings
that are private rented would actually be identified as such by Residata – and those
identified would be in areas of high density of rented accommodation and hence
would be atypical of private sector dwellings in general. Also, Residata is unable to
distinguish between local authority and RSL housing. As over-sampling the rare
tenure groups was a key requirement of the EHCS, it was decided not to use
Residata.

Therefore Residata could provide a simple and relatively cheap method of assessing
for sampling purposes likely dwelling age, whether the dwelling is a flat or a house,
and also whether a dwelling is in the social or private sector. Sampling strata could
be distinguished and selection probabilities between strata could be varied to boost
the representation of the housing attributes mentioned above. However, it would not
be possible to use it to boost the sample representation of rare tenures.

The fundamental problem therefore remains that there is no cheap way of selectively
boosting the representation in the SEH sample of the rarer tenure groups, or, indeed,
of particular demographic groups. The only practical option is through screening a
large additional sample of addresses, which is costly and inherently uneconomic,
since most of the extra addresses screened would then be discarded.
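
The cost of screening can be gauged from a simple yield calculation. In the sketch below the function name, the target of 1,000 extra households and the 10% prevalence are all hypothetical values chosen for illustration. The calculation assumes that every eligible household found by screening goes on to be interviewed unless a lower response rate is supplied.

```python
import math


def addresses_to_screen(target_cases, prevalence, response_rate=1.0):
    """Number of addresses that must be screened to yield
    `target_cases` interviews in a group found at a fraction
    `prevalence` of addresses, given a screening/interview
    `response_rate`."""
    if not (0 < prevalence <= 1 and 0 < response_rate <= 1):
        raise ValueError("prevalence and response_rate must be in (0, 1]")
    return math.ceil(target_cases / (prevalence * response_rate))


# Hypothetical boost: 1,000 extra households in a tenure group
# found at one address in ten.
target = 1000
screened = addresses_to_screen(target, prevalence=0.10)
print(f"Addresses to screen: {screened}")
print(f"Addresses screened but not used for the boost: {screened - target}")
```

The rarer the group, the larger the share of screened addresses that is discarded, which is the source of the high unit cost.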

5.3 Boosting the representation of particular geographical areas
The sponsors of the SEH might also wish to consider oversampling particular
geographical areas, such as smaller regions, or parts of the country that are subject
to chronic housing difficulty or stress. The weighted selection would, of course, be
done in a controlled way, such that unbiased estimates for the total population could
still be computed. This is quite easy to implement in sampling terms, but for fixed
total cost would entail reducing the representation of other areas and would also
generate unfavourable design effects for national estimates. It seems unlikely that
such a scheme would commend itself, unless the total sample size were to be
increased at the same time, so that the sub-sample sizes for smaller regions or sub-
regions were brought up to some agreed minimum level.
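
The way in which unbiased national estimates are recovered from a deliberately unequal regional allocation can be sketched as follows. The data structure, function name and all figures are hypothetical. The sketch treats each area as a stratum with a known household count and ignores non-response adjustment and clustering.

```python
def stratified_proportion(strata):
    """Population-weighted estimate of a proportion from a sample in
    which strata (e.g. regions) are sampled at different rates.

    strata: list of dicts with keys
        'N' - households in the stratum population
        'n' - responding households sampled in the stratum
        'y' - sampled households with the characteristic of interest
    """
    total_population = sum(s["N"] for s in strata)
    if total_population <= 0:
        raise ValueError("population total must be positive")
    # Each stratum's sample proportion is weighted by its population share.
    weighted_sum = sum(s["N"] * s["y"] / s["n"] for s in strata)
    return weighted_sum / total_population


# Hypothetical design: the small region is sampled at five times the
# rate of the rest of the country.
strata = [
    {"N": 900_000, "n": 900, "y": 450},  # rest of country, 1 in 1000
    {"N": 100_000, "n": 500, "y": 100},  # oversampled region, 1 in 200
]
weighted = stratified_proportion(strata)
unweighted = sum(s["y"] for s in strata) / sum(s["n"] for s in strata)
print(f"Weighted national estimate:   {weighted:.3f}")
print(f"Unweighted (biased) estimate: {unweighted:.3f}")
```

The weighted figure corrects the over-representation of the boosted region, at the price of the unequal weights that give rise to the design effect for national estimates mentioned above.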

5.4 A larger periodic SEH
It can be seen by comparing the figures in successive SEH annual reports that the
national distributions of many housing variables do not change rapidly, even in
response to specific policy interventions. It can therefore be argued that measuring
them every year is not cost-effective. One might then suggest a design where the
available funding was devoted to a survey held (say) once every five years, but with
a sample size increased by a factor of five. The intervening years would not
necessarily be without survey activity. As in the case of the NDHS, the large periodic
base survey could be used to identify sub-samples of special interest and of
statistically viable size, which could then be used for follow-ups in depth, focused
longitudinal studies etc.

On reflection, however, there are a number of powerful objections to this option.
From the user viewpoint, some key housing variables, such as house prices and
housing finance arrangements at household level, are subject to rapid change. Policy
users of the SEH might well feel too exposed by a system that did not give the kind of
reliable annual nation-wide fix on housing market trends that the full-sized SEH
provides.

From an operational viewpoint a complex survey with a six-figure sample size, held
every five years, is difficult to manage and quality-control. It would require a
consortium of survey organisations to be formed to carry it out and tendering and
managing the required contracts would in itself constitute a substantial extra
management task for ODPM. The prospective contractors would, inevitably, be
carrying out large and complex surveys simultaneously for other demanding clients
and for them an enormous one-off housing survey would be rather like a cuckoo in
the nest. Because the last “big boost” would have been five years earlier, it would be
unlikely that the contractors (even if they continued to bid for and win the contract)
could again allocate many key staff who had previously carried out this unusual type
and scale of operation. In some very real senses, therefore, the contractors and their
staff would always be doing this enormous and demanding survey for the first time
and that is a recipe for sub-optimal performance. The operation of a periodic “big
boost” design would certainly contrast unfavourably with the current relatively
smooth, well-oiled running of the continuous SEH.

5.5 Periodic SEH sample boosts
Compromise designs could be suggested in which the continuous survey continued
to run at a lower annual sample size than at present, with the size of the whole
sample heavily boosted every Nth year. N might be a number other than 5 and
the ratio of the “steady” to the “boost” sample size could be varied. Obviously the
attitude of the Department to limiting the overall cost over the N-year cycle would be
a key factor. The range of possible compromises would need to be gone through in
detail with survey customers to see whether there was one or more where the
advantages clearly outbalanced the disadvantages.

5.6 Ad hoc sample boosting
Cases might be made for boosting in an ad hoc way the representation of
geographical areas deemed to be under housing stress 21, of ethnic minorities and so
on. This idea is logically the same as that of an SEH design in which units in different
population strata are selected with differential probability. However, it tends to come
up in contexts where the emphasis is not so much the definition of a fixed long-term
sampling strategy for a continuous SEH, but rather a desire to make the SEH more
flexible and adaptable to changing policy priorities, both in terms of questionnaire
content and in terms of sample design. For purposes of the following discussion it is
assumed that the aim is to add more members of sub-population X to the main SEH
sample, where they would answer only the standard SEH questions applicable to
them. Whether it would be feasible to incorporate into the SEH interview a module of
extra questions to obtain more detail about members of the boosted group, thus
significantly increasing average interview length, is a separate issue.

From a sampling viewpoint a boosting strategy is straightforward to design and
implement where the groups to be boosted are geographically defined (e.g. regional
sub-samples). In other cases it is usually necessary to carry out an address
screening operation to locate and select members of the group(s) to be boosted.
Such screening usually has to be done face-to-face by knocking on the requisite
number of doors and that is really a survey in itself. The groups being by definition
minorities, the screening operation is inherently uneconomical, since most of the
households screened will prove to be ineligible for the boost. To provide a
manageable field coverage task the sampling for the screening operation has to be
organised in such a way that the screened-in sample is clustered on the ground, but
it is seldom possible to arrange evenly-sized clusters. Each of these problems can be
solved, but solving them is demanding technically and generates high unit costs.

21 For example, because of shortages of affordable accommodation or poor quality of the housing
stock.

A second major problem that arises where several subgroups are to be boosted is
that each subgroup is likely to need a different sampling strategy. The strategy must
provide the required additional cases in a way that is efficient in fieldwork terms and
provides a statistical basis for combining the cases from population X that form part
of the main SEH sample with the boost sample. This
implies knowledge of selection probabilities for each unit in the population. 22 As a
result attempts to combine several boosts in the same design tend rapidly to
generate severe statistical and practical complications.
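
One way of seeing why selection probabilities must be known for every unit is to write down the weight a household would receive in the combined sample. The sketch below is illustrative only. It assumes that the main and boost samples are drawn independently and that a household selected in both is interviewed once. The function names and probabilities are hypothetical.

```python
def combined_inclusion_probability(p_main, p_boost):
    """Probability that a household is selected by at least one of
    two independently drawn samples."""
    for p in (p_main, p_boost):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return 1.0 - (1.0 - p_main) * (1.0 - p_boost)


def combined_weight(p_main, p_boost=0.0):
    """Design weight (inverse inclusion probability) in the pooled
    sample. Households outside the boosted group have p_boost = 0."""
    pi = combined_inclusion_probability(p_main, p_boost)
    if pi == 0.0:
        raise ValueError("household has no chance of selection")
    return 1.0 / pi


# Hypothetical rates: 1 in 1000 in the main sample, with an extra
# 1 in 250 chance through the boost for members of sub-population X.
print(f"Weight, not in X: {combined_weight(0.001):.1f}")
print(f"Weight, in X:     {combined_weight(0.001, 0.004):.1f}")
```

With several boosts, each household's weight depends on its membership of every boosted group, which is where the complications referred to above begin.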

In practice, therefore, it would probably be best, if pursuing a boosting strategy, to
take the sub-populations to be boosted one at a time and to deal with a different one
each year, or in rotation. An exception could be made if and when two sub-
populations that require a similar boosting strategy were to be targeted. A
hypothetical example would be households containing young people and households
consisting of elderly people. In that case the two groups do not overlap and together
make a larger target, so that screening actually becomes more efficient. An example
of two groups difficult to boost at the same time would be (say) ethnic minority
households and households renting from RSLs. 23

6. Pooling results from the SEH and other surveys

General social surveys sponsored by government, such as the GHS and the ONS
Omnibus Survey, tend routinely to include at least one housing variable (usually
housing tenure). Large surveys sponsored by social policy departments other than the
ODPM, such as the LFS, the FRS and the EFS, usually contain small sections of
questions on housing, because the housing background of individuals interlocks with
and illuminates so many other topics. 24 From an ODPM viewpoint, therefore, various
large survey data sets provide scope to analyse tenure and sometimes other housing
variables by other standard variables measured by the survey. Even if no housing
variables were included in a particular large government survey, the extra cost of
including simple and easy-to-measure ones such as housing tenure would be quite
small and it would sit naturally with other classificatory variables.

Against that background we have been asked to consider whether housing data
collected by a range of different government continuous surveys could be
aggregated. It would be particularly valuable if an acceptable way of sample pooling
were devised and implemented that enabled annual estimates for important housing
variables of a useful degree of precision to be provided at regional and sub-regional
levels of aggregation. This would get round a major limitation of the SEH and could,
prima facie, be achieved at no additional data collection cost and without imposing
extra burdens on the public as respondents. This is one of the key benefits offered by
the proposals for an Integrated Social Survey recently circulated by ONS.

22 Many of these problems are illustrated and discussed in the reports on the various follow-up surveys
to the 1977 NDHS.
23 The EHCS has in the past boosted the representation of RSL housing units in its sample by
approaching a small sample of housing associations, requesting from each access to its list of
properties and drawing a sample from the lists supplied. The rate of co-operation by housing
associations was not very good and the resulting sample would have had high clustering rho values
for housing attributes due to the tendency for properties owned by a particular housing association to
be similar in age, location and design.
24 Conversely, the SEH and the EHCS contain sections on topics such as economic activity and
disability (to name just two) that are of central interest to other departments and to some external
users. Recently a joint team from the National Centre for Social Research and National Statistics
conducted a review of survey sources on individual disability, the report of which illustrates both this
point and others to be made below.

7. Rotation of primary sampling units

From preliminary discussions with Mr Kafka we understand that he wishes to explore
the idea of rotating primary sampling units (PSUs) within the SEH sample design.
This is to be distinguished from rotation at the level of addresses as practised by the

7.1 Rotation patterns
The aim of PSU rotation is to reduce the sampling variance of change in key survey
variables from one year to the next. A simple and not unusual PSU rotation pattern
is one in which the PSUs in Year N are split into replicate halves (preserving the
stratification scheme) and one half is retained in Year (N+1) but then dropped and
replaced by new selections from the same sampling stratum in Year (N+2). Thus
each PSU remains in the survey sample for two successive years. If longer-range
change is seen to be more important, other rotation patterns can be used in which
less than half of the PSUs are changed each year and each PSU remains in the
sample for more than two years. It is possible also to devise and operate more
complex patterns where sets of PSUs reappear in the sample at set intervals (say
every five years), but these are seldom used in practice.

PSU rotation makes the measurement of aggregate change more precise because
the component of the sampling variance of measures of change between Year N and
Year (N+1) that is due to replacing half the PSUs is eliminated. 25 The effect of
rotation in controlling variance in measures of change would be maximal if all PSUs
were retained from one period to the next. However, the larger the proportion of
PSUs retained, the more any random peculiarities of the Year N sample would then
tend to be perpetuated. 26 The power of the sample to provide effective up-to-date
cross-sectional representation of the population as a whole would also be reduced.
With a moderate degree of year-on-year overlap these effects might not be too
serious, given the large number of PSUs and the careful stratification used by the
SEH. However, severe practical and statistical difficulties would clearly arise if all or
most of the PSUs were retained over a longer period.
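
The trade-off described above can be illustrated with a standard textbook approximation, which is not taken from the SEH documentation. It assumes that each year's estimate has the same variance V, that a proportion P of PSUs is retained, and that the values for a retained PSU in the two years have correlation R. The variance of the estimated change is then roughly 2V(1 - PR). The function name and the values of V and R below are hypothetical, and the sketch ignores the fact that fresh addresses are drawn within retained PSUs.

```python
def variance_of_change(v_year, overlap, corr):
    """Approximate variance of (Year N+1 estimate minus Year N estimate).

    v_year  -- variance of a single year's estimate (assumed equal in both years)
    overlap -- proportion of PSUs retained from Year N to Year N+1 (0 to 1)
    corr    -- assumed correlation between years for a retained PSU
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must lie between 0 and 1")
    return 2.0 * v_year * (1.0 - overlap * corr)


# Hypothetical illustration: V = 1.0 and R = 0.8 are invented values.
V, R = 1.0, 0.8
for p in (0.0, 0.5, 1.0):
    print(f"overlap {p:.1f}: variance of change {variance_of_change(V, p, R):.2f}")
```

Under these assumptions the gain from rotation depends entirely on the product PR, so a high overlap buys little if the year-on-year correlation at PSU level is weak.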

A second problem with rotational (overlapping) designs is that they reduce the
effectiveness of aggregating samples year on year to obtain a large enough total
sample to support finer-grained geographical or other estimates. This is because,
with a design that rotates half the PSUs annually, the number of PSUs that results
from year-on-year aggregation is not double the number of PSUs used in any given
year, but only one and a half times the number.
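
The arithmetic behind this point can be sketched as follows. The figure of 1,000 PSUs per year is a hypothetical round number chosen for illustration, not the SEH's actual PSU count, and the function name is likewise invented.

```python
def distinct_psus(psus_per_year, years, retained_fraction):
    """Distinct PSUs in a pooled sample of consecutive years, assuming a
    fixed fraction of each year's PSUs is carried over to the next year."""
    new_each_year = psus_per_year * (1.0 - retained_fraction)
    return psus_per_year + (years - 1) * new_each_year


# Hypothetical design with 1,000 PSUs per year, pooled over two years.
no_overlap = distinct_psus(1000, 2, 0.0)    # fresh PSUs every year
half_overlap = distinct_psus(1000, 2, 0.5)  # half the PSUs retained
print(no_overlap, half_overlap, half_overlap / 1000)
```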

   Note that, even in the retained half, different addresses would be selected in Year (N+1) from those
used in Year N.
   An example of such a peculiarity might be that a particular ethnic sub-group was markedly over- or
underrepresented relative to the population. Stratification gives only partial protection against this.

A third objection to heavily overlapping rotation designs at the level of quite small
units such as postal sectors is that PSUs would tend to become “over-surveyed” and
measures might have to be taken for PR reasons to avoid reselecting individual
addresses. From a statistical viewpoint, also, it would not be desirable to double the
sample size in overlapped PSUs.

These drawbacks might be tolerated if the reduction in the variance of measures of
change was likely to be a large one. This is prima facie more likely in the case of
housing surveys than in the case of surveys on other topics because of the uneven
sector-level distribution of many housing variables in the population. That in turn
results from the tendency for blocks of housing to have been constructed at much the
same date and to be of similar design and quality. Thus variables measuring (for
example) the state of repair of houses within the same postal sector tend to be more
strongly correlated than (for example) variables measuring the behaviour or opinions
of the inhabitants. With two- (or multi-) stage sampling of the kind used in the SEH
this lumpiness generates unusually high sample design effects due to clustering. On
the other hand, the design effects are reduced by judicious stratification of the PSU
sample such as that used in the SEH (see above).
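
The effect of this "lumpiness" can be illustrated with the usual approximation for the design effect due to clustering, deff = 1 + (b - 1) x rho, where b is the number of interviews per PSU and rho is the intra-cluster correlation. The values of b and rho below are hypothetical and are not SEH estimates. The sketch also ignores the offsetting gains from stratification.

```python
def design_effect(cluster_take, rho):
    """Approximate design effect due to clustering: 1 + (b - 1) * rho."""
    return 1.0 + (cluster_take - 1) * rho


def effective_sample_size(n, cluster_take, rho):
    """Size of a simple random sample giving the same precision."""
    return n / design_effect(cluster_take, rho)


# Hypothetical values: 20 interviews per PSU; rho of 0.10 for a "lumpy"
# housing variable and 0.02 for an attitude variable.
for label, rho in (("housing variable", 0.10), ("attitude variable", 0.02)):
    deff = design_effect(20, rho)
    print(f"{label}: deff {deff:.2f}, "
          f"effective n {effective_sample_size(20000, 20, rho):.0f}")
```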

A balance must therefore be struck between:
     optimising the sensitivity of measures of change but risking the perpetuation of
      random sampling peculiarities (large year-on-year overlap) and reducing the
      power of sample aggregation across years;
     and optimising the ability of the sample to represent the population cross-
      sectionally and when aggregated across years, but having no reduction in the
      variance of measures of change (no overlap, as at present).
It would be possible to examine these issues (e.g. the reduction in variance of
measures of change associated with particular rotation patterns) empirically using
special SEH data sets which contained variables identifying PSUs and strata.

8.     Dwellings

The SEH is based on a sample of households, but it also collects some information
on dwellings. The definition of a dwelling quoted in the introduction to the English
House Condition Survey report runs as follows:
 A dwelling is a self contained unit of accommodation where all rooms and
 facilities available for the use of the occupants are behind a front door. For the
 most part a dwelling will contain one household, but may contain none (vacant
 dwelling), or may contain more than one (HMO)27.

     The full definition of an HMO is much wider than this.

We understand that a count of dwellings is of value to ODPM as a measure of the
housing stock potentially available for occupation by households, whether or not it is
currently so occupied. The above definition of a dwelling focuses on the criterion of
self-containment and privacy. Analysts and policy makers are in the best position to
judge what value to attach to this measure in its own right, rather than as a stage on
the way to counting dwellings that do and do not satisfy criteria both of
self-containment and of fitness for habitation. Additional fitness criteria would presumably
include -
   having (behind the front door, and for the exclusive use of a single resident
    household) a minimum set of housing amenities, including hot and cold water
    supplies, a bath or shower and an indoor WC; and
   being in an adequate state of repair.

If the EHCS were completely successful in obtaining a physical survey at every
household space selected as part of its sample, it would be able to refine the
enumeration of dwellings so as to distinguish those that had minimum amenities and
were fit for habitation from those which failed either or both tests, and to provide
estimates on that basis. In practice, of course, it suffers from differential non-
response and almost certainly significant non-response bias with respect to coverage
of certain types of dwelling, and it has a sample size too small to yield estimates that
can be geographically disaggregated to the extent required.

A question therefore arises as to whether the SEH can contribute more than it has so
far to providing estimates of the population of dwellings. Estimates of the number of
dwellings in the housing stock, of the numbers, types and locations of vacant
dwellings, of the numbers of households that occupy sub-standard dwellings, and of
the numbers of households or tenure groups that share a dwelling with another
household, are all of interest. We were asked to consider whether use of information
already recorded by interviewers on Address Record Forms (ARFs), plus small
amounts of additional information that could be collected relatively cheaply, would
enable useful estimates relating to a wider population of dwellings to be provided.

8.1 Address outcomes and the enumeration of dwellings
The SEH starts from a random sample of all addresses listed in the Postcode
Address File (Small Users). Interviewers are required to enumerate household
spaces at each address before selecting a household to be approached for interview.
The vast majority of households are sole occupiers of the only dwelling covered by
an address. If the SEH interviewer finds more than one occupied household space at
an address and any of the additional household spaces is the current main residence
of at least one household, (s)he is instructed to add the extra household(s) to her/his
assignment. As an example, in the 1998-1999 SEH the breakdown of outcomes at
each selected address was as shown in Table 1.

The source of the information summarised in Table 1 is the Address Record Form
completed by interviewers for every address. Completion of ARFs is standard
practice in outcome accounting for surveys based on randomly pre-selected address

Table 1 Outcome classification of SEH addresses issued to the field
                                                                      %    Number
     Selected addresses                                             100    28,224
     Untraceable in the field                                       1.2       345
     Premises used for business purposes only                       1.8       497
     Demolished or judged to be derelict                            0.5       136
     Vacant (including not yet fully built)                         4.7     1,322
     Temporary accommodation only / second or holiday               0.8       213
     Other ineligible, including institution or communal            0.8       221
     Eligible addresses found                                      90.3    25,490
     Extra ineligible households found at multi-household                      98
     Extra eligible households found at multi-household                       490
     Total sample of eligible households found                             25,979

 Source: Report: Housing in England 1998/01, DTLR and National Statistics 2000
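
The outcome accounting in Table 1 can be checked with a few lines of arithmetic. The sketch below takes the counts from the table as given and recomputes each percentage of issued addresses. The variable names are illustrative only.

```python
# Counts of issued addresses by outcome, as printed in Table 1.
selected = 28224
ineligible = {
    "Untraceable in the field": 345,
    "Premises used for business purposes only": 497,
    "Demolished or judged to be derelict": 136,
    "Vacant (including not yet fully built)": 1322,
    "Temporary accommodation only / second or holiday": 213,
    "Other ineligible": 221,
}

# Eligible addresses are what remains after removing all ineligible outcomes.
eligible = selected - sum(ineligible.values())

# Each outcome as a percentage of all issued addresses, to one decimal place.
percentages = {name: round(100 * count / selected, 1)
               for name, count in ineligible.items()}
eligible_pct = round(100 * eligible / selected, 1)

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
print(f"Eligible addresses found: {eligible} ({eligible_pct}%)")
```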

It will be seen that about 3% of the issued addresses (the first two categories) were
probably outside the scope of any household survey, but about a further 7% were
rejected as ineligible because they did not satisfy the eligibility criteria for the SEH,
namely, that they should be the permanent main address of one or more households.
Of the ineligible addresses the majority (4.7% of all issued addresses) were judged to
be vacant, but in addition there were about 0.8% that were determined or judged to
be temporary or secondary accommodation only 28, another 0.5% that were thought to
have been demolished or were judged to be derelict and another 0.2% that proved
not to be private addresses as defined, but the addresses of institutions or communal

8.2 Collecting and compiling information about dwellings
The SEH procedures thus identify as potential analysis units:
a) occupied household spaces at each sample address that are ascertained to be
   the main or only residence of one or more households (these comprise the
   existing main analysis sample);
b) household spaces enumerated at an address and identified as occupied, that did
   not respond to the interview approach.

   The distinction between “vacant” and “holiday or secondary” accommodation may not be particularly

In addition, other types of household space are identified by interviewers and
recorded on Address Record Forms (ARFs), but then rejected as ineligible for the
main analysis sample. By drawing on data already recorded on ARFs it would be
procedurally simple to add to the cases available for analysis:
c) household spaces identified in the field as vacant;
d) household spaces identified in the field as second or holiday homes.

We understand that arrangements have been made to modify the 2003-04 fieldwork
procedures so as to enumerate household spaces of types (c) and (d), but not to distinguish
them separately. The effect of this will be that there is in principle a complete enumeration,
available for analysis, of household spaces at those addresses where at least one household
responds to the SEH. This will require a change to interviewer instructions.29 This is not
unprecedented. In a number of surveys conducted by the National Centre similar moves
have been made to collect data about units in category (b) (identified but non-responding),
the object being to obtain information useful in non-response weighting. On the SEH itself
procedures have recently been put in place to enhance the “shadow sampling” information
collected. If the address appeared to be occupied, but the occupants did not respond
to the SEH because of non-contact or refusal, interviewers could attempt to code the
type of accommodation by observation. This information might enable those types
of accommodation that are more likely not to be self-contained or to lack amenities to
be identified, but not with any certainty.

At households that are interviewed the SEH collects all the information needed to
determine the dwelling status of the accommodation. On the other hand, the
interviewers will still be under pressure to get the main contacting and interviewing
job done and to control their time-use and costs. They cannot take on the role of
detectives in the very common situation where there is no respondent available and
qualified to answer questions about units in categories (b)-(d) above. Our main
concern, therefore, is whether any dwelling-level data set could be:
    sufficiently complete in terms of its coverage of dwellings,
    sufficiently accurate in its identification of dwellings,
    sufficiently full in terms of the information recorded about each dwelling,
    of sufficiently good quality in terms of the validity and reliability of the information
to justify the cost of the extra interviewer effort involved. The conclusion reached on
the basis of experience in 2003-04 will no doubt turn on what specific analytic
enhancements to the SEH the extension of the fieldwork procedures is expected to
deliver and what weight will be placed upon the data obtained in policy applications.
The situation may be one where some extra information on dwellings is deemed to
be better than none and therefore worthwhile at the price.

   In standard survey fieldwork practice the purpose of the ARF document is field control, rather than
collection of substantive survey data. It enables controllers to check that the interviewer has made
adequate and appropriate efforts to identify households at each address and to secure an interview
with each household. In the case of the SEH, the ARF record of calls at the address is also used as
part of the grossing and weighting procedure. Otherwise the ARF data have not hitherto been used in
the analysis of the substantive results.

9. Imputation for missing data

9.1 Item non-response
Sources of item non-response
The ideal for all surveys is, of course, that each respondent should give an
acceptable answer to every question put to them. It is also true that virtually no
surveys literally achieve the ideal. Traditionally there have been a number of different
sources of such item non-response in interview surveys. They include:
a) interviewer routing errors (sometimes leading to a whole section of questions
   being omitted in error);
b) interviewer failure to record a response that was given by the respondent;
c) out-of-range codes recorded;
d) respondent declines to give an answer;
e) respondent says she/he does not know the answer;
f) an answer is given, but found at the data editing stage to be unacceptable (say,
   an implausibly high amount specified in answer to a question about rent

The situation is less complex with questions that invite the respondent to choose one
(or more) of a predetermined list of alternatives, than with questions that invite a
verbatim answer. In the former case it is simple to define in formal terms what
constitutes an acceptable answer, but in the latter case it may only be at the stages
of office data coding and editing that an answer is ruled to be out-of-scope, too vague
to be usable or effectively a refusal to answer. This can happen, for example, in the
case of questions designed to elicit a codable job-title and description.

The introduction of computer-assisted interviewing (CAPI) has virtually eliminated
components (a), (b) and (c) of item non-response, which are due to interviewer
mistakes. This is because the CAPI computer program requires a response within
the permitted range to be recorded before the next question can be displayed on the
screen. CAPI also reduces item non-response of type (f), since internal plausibility
checks can be included in the program which inform the interviewer if a response
outside the plausible (or logically possible) range has been entered and require that it
be amended to an acceptable code. As regards (d) and (e), CAPI programs provide
codes that the interviewer can enter for “Refusal” and “Don’t know”, so in the formal
sense CAPI normally ensures that some permissible answer is entered at each
question. Questionnaire designers take particular pains to ensure that response
problems are avoided at key filter questions, which determine on the basis of the
response given what route the questioning should take through the remainder of the

Effects of missing data on derived analysis variables
In many surveys the analysis of data is conducted largely using variables that do not
relate directly to what is recorded as the response to a particular question, but are
derived from several different responses using logical or arithmetical procedures. If
the source items are affected by non-response (missing data), decisions are needed
on how to proceed with the derivation. A conservative policy would be to set the
derived variable value to “missing” if any of the source items are “missing”, but in
some situations it may be thought justifiable to effectively impute values for some
missing responses. These are very real and important issues in household surveys

concerned with detailed income and expenditure, for example. What the analyst sees
is affected by the editing and imputation policies adopted.
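
The two policies can be contrasted in a short sketch. The item names ("earnings", "benefits") and the function are hypothetical and do not correspond to actual SEH variables. Missing responses are represented by None.

```python
def derive_total(items, fill=None):
    """Derive a total from several source items.

    items -- dict mapping source item name to a value, or None if missing
    fill  -- optional dict of imputed values to use for missing items

    Conservative policy (fill=None): any missing source item makes the
    derived variable missing. With fill supplied, imputed values are
    substituted where available.
    """
    total = 0.0
    for name, value in items.items():
        if value is None:
            if fill is None or name not in fill:
                return None  # derived variable is set to "missing"
            value = fill[name]
        total += value
    return total


# Hypothetical household with one missing income component.
household = {"earnings": 300.0, "benefits": None}
print(derive_total(household))                          # conservative policy
print(derive_total(household, fill={"benefits": 50.0}))  # with imputation
```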

Item non-response and data quality
The “tidiness” enforced by CAPI does not necessarily mean that the data obtained
are valid and reliable, as well as being formally complete. In the first place a refusal
to answer or a “Don’t know” is often tantamount to missing data from the point of
view of the analyst. In the second place, by presenting a limited list of alternative
responses at a question (precoding) the questionnaire designer may fail to provide
for answers that respondents legitimately wish to give. All precoding, even instances
that might appear to the designer and the data analyst to be very simple and
unproblematic, entails a degree of forcing of responses into a predetermined frame.
For example, where a question designer might assume that the alternatives “Yes”
and “No” (with perhaps provision for a “Don’t know”, if volunteered) cover all cases,
some respondents might wish (but would not be allowed) to say “Yes from one point
of view, no from another”. There is also the pre-emptive “Have you stopped beating
your wife? Yes or no?” type of question.

A key hidden skill of question design is to make sure that there is no undue forcing of
responses that distorts the data for the purposes for which they will be used in
analysis. In general, it is probable that respondents who, wishing to help and not
wishing to appear inadequate, hazard inaccurate guesses when they really do not
know the answer to a question, do more harm to data quality than respondents who
say they don’t know. But of course the inaccuracy of the guesses is invisible to data
analysts, whereas the presence of missing data is obvious.

9.2 Item non-response and imputation
In large household interview surveys item (question-level) non-response rates of 0-
3% are commonly found. Often the highest rate of item non-response (perhaps 10%)
occurs at questions asking for details of income or savings, because some otherwise
co-operative respondents are unwilling to give these. It is important to recognise that
all methods of imputing for missing data are very much a second best to finding ways
of reducing missing data rates at source. If there is a particular problem, one should
look first at question design and data collection procedures. In general, case-level
(household) non-response is a much more serious statistical problem than item non-
response (see Section 10 below), but item non-response, even at low levels, can be
an irritation to data analysts because it leads to minor discrepancies in the data set,
depending on which set of variables is used in a particular analysis.

Our enquiries suggest that there are few if any data items in the SEH that suffer from
rates of non-response that are unusually high, compared with other surveys, and that
most are in the 0-3% range. It would, of course, be good if a simple, cheap and
statistically effective method existed for replacing missing data with valid responses.
We draw a distinction here between imputation for particular missing values within
case records and grossing and weighting of the data set as a whole. Technically the
distinction is quite clear, but if the main concern is to correct biases in the data
obtained from the responding survey sample, then a particular grossing and
weighting strategy is likely to have effects which overlap with those of case-level

9.3 Methods of imputation
Methods of imputation range from the highly particularistic to the highly generalised.
Some key factors affecting decisions on whether imputation is worthwhile and which
methods to employ are:
a) the likelihood that imputing data will significantly improve the quality of the data
   set as a whole;
b) whether any independent data sources exist that might help to supply or infer
   values that are missing;
c) whether imputation for missing values is to be integrated into a general data
   editing strategy;
d) the time and money costs and complications of applying imputation.
Two examples may illuminate this.

For many years an expert was employed in editing the data collected by the (former)
Family Expenditure Survey, an important part of whose job was:
    to identify cases where certain key item(s) of information were missing and
     arrange for them to be reissued to the interviewer, so that (s)he could revisit the
     household to fill in the gaps (thus avoiding the need to impute); and
    to consult Benefit reference books and, using survey information on the circumstances
     of households that failed to answer questions on amounts of benefit received,
     make a best guess at the rates of benefit received by respondents. 30

The FES reported household-level response rates, but not missing data rates, so it is
not possible to say either how large the missing data problem was or what the
effect of imputation was. The expert guesses made are likely to have been fairly
good, but their effect on the bias and variance of the estimates, relative to those
that would have resulted without imputation, is not known.

At the other extreme, statistical imputation methods have been developed for
censuses and some very large surveys. It should first be said that data collection
methods used in these cases are usually much weaker than those employed by the
SEH and suffer from much higher levels of missing data. “Hot deck” and “cold deck”
imputation methods rely fundamentally on the idea that, in very large household
survey data sets, it will usually be possible, for any given case record C, to find other
case records (i.e. other households) which closely match record C in terms of some
basic and influential variables (say household structure variables, car ownership,
economic activity profile etc). If case C is missing an item of data D, one can then
proceed by randomly selecting a record M from within the other cases processed
which matches C in the specified respects and substituting the D value found in
record M into record C, thus filling the gap in the data. The more exactly cases can
be matched, the more successful the method is likely to be.
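
A minimal sketch of the matching-and-substitution idea follows. It illustrates the "hot deck" variant only, in which donors come from the same data set. It is not a description of any procedure actually used on the SEH or elsewhere. The function, the matching variables ("hh_size", "has_car") and the item name ("rent") are all hypothetical.

```python
import random


def hot_deck_impute(records, item, match_keys, rng=None):
    """Fill missing values of `item` from a randomly chosen matching donor.

    records    -- list of dicts, one per case; a missing item is None
    item       -- name of the item D to be imputed
    match_keys -- names of the variables on which cases must match exactly
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable

    # Build pools of observed values, one pool per matching class.
    donors = {}
    for rec in records:
        if rec.get(item) is not None:
            key = tuple(rec[k] for k in match_keys)
            donors.setdefault(key, []).append(rec[item])

    completed = []
    for rec in records:
        rec = dict(rec)  # copy, so the input records are left untouched
        if rec.get(item) is None:
            pool = donors.get(tuple(rec[k] for k in match_keys))
            if pool:  # if no donor matches, the value stays missing
                rec[item] = rng.choice(pool)
        completed.append(rec)
    return completed


cases = [
    {"hh_size": 1, "has_car": False, "rent": 60},
    {"hh_size": 1, "has_car": False, "rent": 70},
    {"hh_size": 1, "has_car": False, "rent": None},  # case C, missing item D
    {"hh_size": 4, "has_car": True, "rent": None},   # no matching donor
]
result = hot_deck_impute(cases, "rent", ["hh_size", "has_car"])
```

In this toy data set the third case receives one of the two observed rents from its matching class, while the fourth stays missing because no other case matches it. This illustrates why the method depends on very large data sets.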

Because the case-matching criteria are pre-specified and the selection from matched
cases is random, the procedure satisfies statistical criteria of being unbiased and

  The cases so treated had either been unable to say how much they received, or had given an
implausible answer which suggested that they were confusing or conflating different benefits.

having calculable effects on variance of estimates. It can also be implemented by a
computer program. The problems lie in the selection of matching criteria and in the
integration of the imputation procedure with other editing and grossing procedures
that may be applied. Employment of this type of method requires not only a very
large sample, but also substantial investment of statistical design and development
and computer programming resources.

Subject to comments from ODPM, our view is that missing data problems in the SEH
are at a level where the extra complications and costs of addressing them
systematically as part of data editing are unlikely to be justified by any benefits
obtained. It is not sufficient or correct to embark on such a programme simply
because small amounts of missing data are an irritation. It is possible that SEH
analysts have in mind particular variables where missing data is a problem. In that
case we should consider first ways of reducing it at source.

10. Grossing and weighting of results
This topic has been well discussed in two papers by David Elliott: “Software to weight
and gross survey data” (1997) and “Report of the Task Force on Weighting and
Estimation” (1999) (particularly Appendix D on the SEH). The present section is
indebted to Elliott’s work and does not attempt to duplicate it. We do however make a
number of points about grossing and weighting methods, with particular reference to
methods that are currently used and might in future be used in the case of the SEH.

The term “grossing” is used to describe the process of converting sample numbers
into numeric population estimates. In general the aims of a grossing strategy should be:
a) to bring up the sample numbers so that they relate to an (independently known)
   population total;
b) to minimise the variance of grossed-up survey-based estimates;
c) to remove or reduce any biases arising in the survey sampling and data collection
   processes (or in the grossing process itself).

The results of the SEH are presented in reports as population estimates and the
grossing scheme applied to the raw survey data combines three main elements: an
expansion estimator, adjustments for non-response and calibration weighting. In
detail the method is quite complicated and the effects of successive stages of
grossing interact, so it is not easy to see analytically what the net effects of any
particular element on the grossed-up estimates are likely to be.

Elliott (1997) brings together some main findings regarding the net effect of grossing
and weighting the SEH and results given in an appendix to the annual SEH reports
compare the weighted and unweighted distributions of some important variables,
such as housing tenure, household type, household size and employment status of
the head of household. The most obvious (but of course not the only) effect of
weighting is to increase the representation in the weighted sample of single-person
households. This probably reflects the experience of all general population
household surveys that response tends to be lower than average amongst single-person
households and particularly amongst young single people living alone.

In what follows we trace and comment on the main stages of the grossing process.

10.1 Simple expansion estimator
Simple expansion estimation strictly requires only values for the effective sample size
and for the size of the target population. Since the SEH sample of households is
selected with equal probability, a standard grossing factor, calculated as the
population size divided by the responding sample size, might be applied to the
responses of each household in the responding sample, so as to make the grossed
up total match an externally available population total. Such an approach would
compensate not only for the set sample being a fraction of the population and for the
achieved sample being smaller than the set sample (because of non-response and
other losses), but also for shortfalls in the survey sampling frame and sampling
procedures. 31
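
As a minimal sketch of the calculation just described, the grossing factor is simply the external population total divided by the number of responding households. The population and sample figures below are hypothetical round numbers, not SEH values, and the function names are invented for illustration.

```python
def grossing_factor(population_total, responding_sample):
    """Standard grossing factor for an equal-probability sample."""
    if responding_sample <= 0:
        raise ValueError("responding sample must be positive")
    return population_total / responding_sample


def grossed_estimate(sample_count, factor):
    """Population estimate for a group counted in the responding sample."""
    return sample_count * factor


# Hypothetical figures: 20 million households in the population,
# 20,000 responding households, of which 6,000 fall in some group.
factor = grossing_factor(20_000_000, 20_000)
print(factor, grossed_estimate(6_000, factor))
```

Because every household receives the same factor, the grossed-up total matches the external control but the internal composition of the sample is left unchanged.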

On the other hand a simple expansion estimator does not in itself correct any
imbalances within the sample, such that some groups are over-represented and
others under-represented relative to the population. These can arise through random
sampling variability, particularly with respect to the representation of rare population
subgroups that are unevenly distributed geographically, but the major cause of
imbalances is often differential non-response bias. In practice most grossing systems
address such internal imbalances to some extent.

10.2 Non-response weighting
In the case of the SEH non-response weighting is integrated with the application of
the expansion estimator. It uses an unusual method based on the following two
    Some SEH households are contacted at the first interviewer call, others only at
     the second or later call and others again (about 6%) are not contacted at all and
     have to be abandoned after a number of calls. Thus for every sample household,
     responding and non-responding, a number of calls is recorded.
    The measured characteristics of “harder-to-contact” households that do
     eventually respond differ systematically, on average, from those of “easier-to-
     contact” households that do respond; for example “harder-to-contact” households
     tend on average to contain fewer individuals.

The weighting strategy is then applied as follows.
    Both responding and non-responding households are sorted into groups that
     received 1-2, 3, 4-5 and 6 or more calls. 32 Households that were never contacted

   Such shortfalls arise not only because a small proportion (estimated to be about 2%) of inhabited
addresses do not appear in the PAF, but also because the survey process is “leaky”. This can be seen
from the fact that, if one attempts on any large household survey to gross up to the population of
individuals by using the inverse of the effective sampling fraction, the population size at which one
arrives is always well short of the population size as shown in (for example) ONS current population
estimates, even when PAF under-coverage is discounted. The “leakage” is no doubt due to a variety
of factors, but an important one is probably that household surveys miss numbers of individuals who
do not qualify as “residents” at any address.
   The grouping of values produces a smaller number of larger weighting groups, which is likely to
provide more stable weighting factors.

    are placed in a group according to the number of calls made before the case was
   A separate response rate is then calculated for each group of households. A
    weighting factor is calculated for each group as the inverse of the response rate
    for the group.
   Each responding member of the weighting group is multiplied by this factor to
    compensate for the absence of the non-responding members.
   At a later stage of grossing the weights are effectively scaled so that the final
    totals match the population controls.
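
The steps above can be sketched as follows. This is an illustrative reconstruction from the description in the text, not the production SEH procedure. The function names and the toy data are hypothetical, and the later scaling to population controls is omitted.

```python
def call_group(calls):
    """Assign a household to one of the four call-count groups in the text."""
    if calls <= 2:
        return "1-2"
    if calls == 3:
        return "3"
    if calls <= 5:
        return "4-5"
    return "6+"


def nonresponse_weights(households):
    """Weight for each call group = 1 / response rate = issued / responded.

    households -- list of (number_of_calls, responded) pairs covering both
                  responding and non-responding households
    """
    issued, responded = {}, {}
    for calls, did_respond in households:
        group = call_group(calls)
        issued[group] = issued.get(group, 0) + 1
        if did_respond:
            responded[group] = responded.get(group, 0) + 1
    # Groups with no respondents cannot be weighted up and are left out.
    return {g: issued[g] / responded[g] for g in issued if responded.get(g)}


# Toy data: 10 households in the 1-2 call group (8 respond) and
# 2 households in the 6+ group (1 responds).
toy = [(1, True)] * 8 + [(2, False)] * 2 + [(6, True), (7, False)]
weights = nonresponse_weights(toy)
```

In the toy data the easy-to-contact group receives a weight of 1.25 and the hardest-to-contact group a weight of 2.0, so responding households that needed many calls stand in for those never reached.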

A limitation of this method is that it actually adjusts for the effects of failure to contact
households, rather than for household non-response as a whole. The refusal
component of non-response is actually much larger than the non-contact component
(about 25% as compared with about 6%) and studies of other surveys have shown
that non-response due to refusal is a different phenomenon from non-response due
to non-contact (e.g. Foster (1998), Lynn et al. (2002), Groves and Couper (1998)).
Therefore assuming, in effect, that all non-responding households have the same
characteristics as non-contacted households seems fallacious and could quite
possibly make net non-response bias for some estimates worse. The design of the
current round of survey non-response checks using the 2001 Census takes account
of this. See

It seems that the SEH non-response weighting procedure makes a significant impact
in adjusting the results. For example, it tends to increase the proportion of smaller
and reduce the proportion of larger households in the weighted sample and also to
increase the proportion of households consisting of young people. However, without
being able to inspect the weights and measure their effect on distributions at each
stage of the grossing process it is impossible to be specific regarding the effects of
this method of adjusting for non-response, since they undoubtedly interact with the
effects of calibration weighting.

10.3 Calibration weighting
Calibration weighting (post-stratification) is in origin an extension of classical
probability sampling theory. Its aim is to remove (some of) the random variance in
survey estimates that results from random sampling. It also implicitly addresses
imbalances arising from other causes including, in particular, non-response bias,
though it is not necessarily the best method of adjusting for non-response.

Calculation of post-stratification weights depends on having available trusted external
estimates of relevant population parameters (control distributions) and in that respect
resembles expansion estimation. In fact the one merges into the other where the
population control totals are available for sub-groups, such as age by sex groups,
since then it is natural to gross up to the separate group totals, thus forcing the
grossed-up sample distribution to match the corresponding population distribution
and removing any imbalances in the distribution of the sample by age and sex. In the
case of the SEH the external sources used in calibration are the ONS current
population estimates for age-groups by sex and by standard region of England. The
standard region adjustment is applied as the final step in the grossing process, but
the age by sex distributions are applied earlier, using the method described next.
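
The basic idea of grossing up to separate group totals can be shown in a few lines. This is a sketch of simple post-stratification only, not of the SEH method described next; the cell labels, sample counts and control totals are invented.

```python
def poststratify(sample_counts, control_totals):
    """Return {cell: grossing factor}, where factor = control total / sample count."""
    return {cell: control_totals[cell] / n for cell, n in sample_counts.items()}

# Invented figures: young men are under-represented relative to young women.
sample_counts = {("16-29", "M"): 50, ("16-29", "F"): 80}
control_totals = {("16-29", "M"): 100_000, ("16-29", "F"): 100_000}

factors = poststratify(sample_counts, control_totals)
# Grossed-up sample counts now match the control distribution cell by cell.
grossed = {cell: sample_counts[cell] * factors[cell] for cell in sample_counts}
print(factors)
print(grossed)
```

The under-represented cell receives the larger factor, so the grossed-up sample reproduces the control totals for the weighting variables, whatever it may do for other variables.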

There are various different statistical methods of implementing post-stratification-
style weighting, which can produce somewhat different results. The method used by
the SEH is unusual. The prime aim is to correct for the known and consistent under-
representation in household surveys generally, and in the SEH in particular, of young
adults in households consisting only of young adults. Thus, although the method is
discussed here under the heading of calibration weighting because external control
totals are used, the motivation for using this special “cascade” method of adjustment
is largely to correct for non-response bias.

The key feature is a “cascade” procedure, whereby a weighting factor based on age
and sex control totals is first calculated for the weighting class “youngest member of
each household if aged under five years”. This obviously deals only with those
households that include at least one member aged under five. Then another
weighting factor is calculated for the class “youngest member of each household if
aged 5-15 years”. This excludes the first weighting class, where the weight already
calculated is allowed to stand. Then other weighting factors are successively
calculated, moving up the grouped age range but in each case allowing the weights
already calculated to stand, until all households have been dealt with. The final
weighting class therefore consists of “elderly (as defined) persons who are either the
youngest or the only member of their household”. From age 30 upwards the age
classes used for “youngest person” are broader (30-44 years for example). This is
because response does not vary sharply with age at ages above 30. A refinement
from age 20 upward is to introduce a further division into separate weighting classes
for households that consist entirely of people in the youngest adult age group and
those that also contain older persons.

A crucial feature of this method is that the household member defined at each stage
is unique within the household (multiple births can be treated as one individual). In
this way weighting factors calculated for individuals can be applied to households
(with all household members receiving the same weight).
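
The following sketch shows one possible reading of the cascade, offered to make the logic concrete rather than to document the SEH system, whose exact formulae are not given here. It assumes three hypothetical age bands, ignores sex, uses "all persons in the band" as the control total, and sets each class's factor so that the weighted number of persons in that band matches the control while weights fixed at earlier stages are left to stand.

```python
BANDS = [(0, 4), (5, 15), (16, 200)]  # hypothetical age bands, youngest first

def band_of(age):
    """Index of the age band containing this age."""
    for k, (lo, hi) in enumerate(BANDS):
        if lo <= age <= hi:
            return k
    raise ValueError(f"age out of range: {age}")

def cascade_weights(households, controls):
    """households: list of lists of member ages. controls: persons per band.

    Returns one weight per household, shared by all its members.
    """
    # Each household belongs to the class of its youngest member.
    classes = [band_of(min(ages)) for ages in households]
    weights = [None] * len(households)
    for k in range(len(BANDS)):
        # Weighted persons in band k living in households settled earlier.
        settled = sum(
            weights[i] * sum(1 for a in ages if band_of(a) == k)
            for i, ages in enumerate(households) if classes[i] < k
        )
        # Persons in band k in the households being weighted at this stage.
        current = sum(
            sum(1 for a in ages if band_of(a) == k)
            for i, ages in enumerate(households) if classes[i] == k
        )
        if current == 0:
            continue  # no household falls in this class
        factor = (controls[k] - settled) / current
        for i in range(len(households)):
            if classes[i] == k:
                weights[i] = factor
    return weights

# Invented sample of five households and invented control totals.
households = [[2, 30, 32], [3, 7, 35], [8, 40], [25, 27], [70]]
controls = [100, 150, 850]
weights = cascade_weights(households, controls)
print(weights)
```

Under this reading each later class absorbs whatever discrepancy the earlier stages leave behind, and nothing prevents a factor from becoming very small or even negative if the earlier weights already exceed a control total.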

As described above the method proceeds by weighting one class of household at a
time. A consequence of this is that, as each successive weighting class is “forced” to
conform to the population distribution, there is a tendency for the remaining
discrepancies to become concentrated in the classes still to be dealt with and, in
particular, in the last class to be dealt with. The total variance of the weights required
across the whole sample is not affected, but the more extreme weights will tend to be
concentrated in those households forming the last class to be weighted. This will
have effects on the variance of some estimates based on the grossed-up sample, but
without special analyses which pick apart the components of the weighting system it
is not possible to give examples.

Another weakness of this method as applied to the SEH is that it requires external
control totals for each weighting class. Thus there should be a control total for
“youngest members of households if aged under five years”, for “youngest members
of households if aged 5-15” and so on. In fact Current Population Estimates do not
provide such separate totals, so the weighting calculation at each stage uses as its
control total “All persons aged under 5”, “All persons aged 5-15” and so on. It is not
clear what effect these substitutions would be likely to have on the validity of the
weights and the variance and bias of survey estimates.

In the case of the SEH, straightforwardly correcting distortions in the sample of
individual household members in terms of age, sex and region might to some extent
reduce, but would certainly not remove, the variance and bias of estimates of survey
outcome variables such as housing type, housing tenure and housing attitudes. Age
and sex distributions for household members, for which control distributions are
available, seem unlikely to be very closely correlated with the household-level
variables in which the SEH is most interested. The way in which they are actually
used in the grossing system represents an ingenious attempt to bring them to bear in
a way that improves their power to reduce the bias in particular of household-level

There does, however, appear to be some cause for concern, on the side of variance,
about the reliance of the grossing system on small groups, such as households that
are not contacted and relatively small weighting classes, in calculating values for the
weights. This would show up in the results of true standard error calculations for a
range of survey estimates (i.e. calculations that take full account of the weighting and
grossing system as described above, as well as of the clustered sampling design),
but it is not clear to us whether such calculations have ever been carried out. It
seems possible that Elliott did so when preparing his 1997 paper.

Our comments above on certain potential weaknesses in the SEH grossing and
weighting system should not be allowed to obscure our view that the system is an
impressive, though complicated, one that appears to serve a number of the key aims
reasonably well. A thorough empirical assessment is, however, severely hampered
by the lack of diagnostic information on how the different elements of the system
perform separately and together in reducing both bias and variance.

10.4 Limitations of grossing and weighting systems
A key point about calibration-style weighting (which in practice also addresses non-
response bias) is that external control distributions are usually unavailable for the
most important survey outcome variables. 33 Calibration weighting therefore relies on
correcting the distributions of other variables (such as age and sex) for which
population parameter estimates are available and which can be shown (using sample
or other data) to be correlated with the key survey-measured variables.

The effectiveness of calibration (post-stratification) weighting therefore depends
crucially on the nature and strength of the statistical relationship between the control
variables used in weighting and these key survey-measured variables. In general the
power of the variables used in weighting, taken together, to predict what the true
distributions of the key survey variables should be is likely to be modest, so we
should not expect weighting to remove all, or even most, of the bias in survey
estimates. There is in fact no “gold standard” for judging the performance of
weighting systems, so usually we only have plausibility as a criterion. It is an illusion
to suppose that, because the grossed-up sample matches population distributions in
terms of control variables such as age, sex and region, all must be well.

     If such information were available the case for carrying out a survey at all might be undermined.

Another general point to bear in mind about both calibration and non-response
weighting is that, while they correct the distribution of the variables explicitly used in
weighting (e.g. age and sex), their effects in removing distortions in the distribution of
other survey variables can be difficult to predict. This is because in certain cases the
relationship between control and target variables incorporates complex interactions.

This is easiest to understand in the case of non-response weighting. Let us take the
simple case where males are under-represented (and females correspondingly over-
represented) in a responding sample. It is simple to devise and apply weighting that
corrects the sex ratio, but what this in effect does is to weight up males who did
respond to “replace” males who did not respond. However, it is likely that the sub-
population of males not disposed to respond differs from the sub-population of males
who are disposed to respond in their behaviour and lifestyle (e.g. tending to be out
rather than at home at times when interviewers call), in their demographic attributes
(e.g. tending to be older and married rather than younger and single) and in their
attitudes. Thus weighting up responding males may not compensate satisfactorily for
the absence from the sample of non-responding males. This warns us that weighting
is not a panacea for nullifying the effects of non-response on survey results and that
we should always be aware of the processes that are likely to generate
response/non-response in surveys.
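
A small numerical illustration may help. All the figures below are invented: a population of 500 males and 500 females, in which the males who respond differ from those who do not on some survey attribute. The "true" counts would of course be unknown in practice.

```python
# All figures are invented for illustration.
pop = {"M": 500, "F": 500}         # population counts by sex (the controls)
resp = {"M": 300, "F": 500}        # responding sample by sex
resp_yes = {"M": 60, "F": 150}     # respondents who have the attribute
true_yes = {"M": 180, "F": 150}    # true population counts with the attribute

# Weight each sex up to its population total.
weights = {s: pop[s] / resp[s] for s in pop}

unweighted = sum(resp_yes.values()) / sum(resp.values())
weighted = sum(weights[s] * resp_yes[s] for s in pop) / sum(pop.values())
truth = sum(true_yes.values()) / sum(pop.values())
weighted_males = weights["M"] * resp["M"]

print(f"weighted male total: {weighted_males:.0f}")
print(f"unweighted {unweighted:.4f}, weighted {weighted:.4f}, true {truth:.4f}")
```

In this constructed case the weighting restores the sex ratio exactly, yet the weighted estimate is no closer to the true value than the unweighted one, because the responding males are not representative of the non-responding males.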

As already noted, in implementing weighting schemes an optimum balance needs to
be struck between reducing the bias of estimates and increasing their variance. The
use of differential weights is in general likely to increase variance and the effect
increases as a function of the variance across the analysis (sub-)sample of the
weighting values applied. For example, in certain cases it may be possible to form
elaborate contingency tables both from the survey data and from the trusted external
data source (say a Census). In the case of the SEH this might be a table of age in
years by sex by region. It is then possible to generate a separate weighting factor for
each cell of the table (“cell weighting”). However, such a method, although it may
look to be maximally effective in removing bias, is prone to increase variance
unacceptably. 34 This is because correction factors calculated using small sample cell
totals are likely to be unstable from one year of the survey to the next and to produce
extreme and extremely variable weights that in turn increase the variance of survey
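
The link between the variability of weights and the variance of estimates can be illustrated with Kish's approximation to the design effect due to unequal weighting, which equals one plus the squared coefficient of variation of the weights. The sketch below is illustrative only: the three sets of weights are invented, and the approximation assumes that the weights are unrelated to the variable being estimated.

```python
def kish_design_effect(weights):
    """Kish's approximation: deff = n * sum(w^2) / (sum(w))^2 = 1 + cv(w)^2."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)

# Three invented sets of 100 weights, each summing to 100.
equal = [1.0] * 100
mild = [0.8] * 50 + [1.2] * 50
extreme = [0.2] * 90 + [8.2] * 10

for name, w in [("equal", equal), ("mild", mild), ("extreme", extreme)]:
    deff = kish_design_effect(w)
    # Effective sample size is the nominal size divided by the design effect.
    print(f"{name}: deff = {deff:.2f}, effective n = {len(w) / deff:.1f}")
```

A handful of very large weights is enough to reduce the effective sample size to a small fraction of the nominal one, which is the risk attached to weighting on many small cells.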

   In practice there would be empty cells in the sample table which would enforce the need to form
larger age categories, but the detrimental effect of using too many small weighting cells goes well
beyond that.

10.5 Census check studies
A great and obvious problem in devising and testing weighting systems to
compensate for non-response bias is that the same factors that lead to non-response
in the first place (the elusiveness and reluctance to take part in the survey of some
sample members) make it expensive and difficult to collect any information helpful in
weighting from a sufficient proportion of non-responders. 35 One exception in this
country is the series of studies of non-response in large continuous household
surveys that ONS have been able to conduct in the wake of successive Censuses of
Population, thanks to their privileged ability to match both responding and non-
responding survey households with Census records for the same households. This
enables comparisons to be made, in terms of Census-measured characteristics,
between those households that responded to the survey and those that did not (see
for example Kemsley (1975), Redpath (1986), Foster (1998)). In some of these
studies non-responding households that were not contacted by the survey are
distinguished from households that refused to take part.

Using these methods it has been found by successive Census check studies that:
    households consisting of young adults and, in particular, of a young adult living
     alone, have lower-than-average rates of response due to non-contact;
    households headed by self-employed persons have lower-than-average rates of
     response, mainly due to refusals;
    households with young children have higher-than-average rates of response;
    elderly households have lower-than-average rates of response because of
     refusals, particularly in the case of burdensome surveys such as those that
     require diary-keeping. 36

So far there has not been an opportunity to include the SEH in such Census check
studies (since the survey was not in existence at the time of the 1991 Census), but
we understand that it will be included in the 2001 Census matching exercise and we
would expect this exercise to provide important opportunities to research both non-
response to the SEH as such and the effectiveness of the weighting and grossing
system. As noted above, non-response due to non-contact and non-response due to
refusal will be distinguished. Census-matching studies are not, of course, a panacea,
since the range of variables included in the Census is limited, but it does allow non-
response bias as it affects several important housing measures and allow the
attributes of housing spaces and their inhabitants to be looked at together to be

10.6 Other weighting and grossing systems and software for household surveys
Because of the above considerations and because sometimes only marginal control
distributions are available, grossing and weighting methods that use marginal
population distributions as controls are often preferred. In order to generate a
consistent set of weights it is then necessary to use one of a set of statistical
methods usually known as “raking estimation”. This name is used because the
method involves “raking” through a contingency table iteratively, adjusting cell values
to each set of marginal totals in turn, until a unique stable convergence that satisfies
all the marginal constraints is reached.
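
The iterative adjustment described above can be sketched for a two-way table as follows. This is a generic illustration of raking (iterative proportional fitting), not the algorithm of any particular package; the sample table and marginal controls are invented, and the sketch assumes that no row or column of the sample table is empty and that the two sets of controls sum to the same total.

```python
def rake(table, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Iterative proportional fitting on a 2-D table given as a list of lists."""
    cells = [row[:] for row in table]
    for _ in range(max_iter):
        # Scale each row to its target total.
        for i, target in enumerate(row_targets):
            s = sum(cells[i])
            cells[i] = [v * target / s for v in cells[i]]
        # Scale each column to its target total.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in cells)
            for row in cells:
                row[j] *= target / s
        # Columns now match exactly; stop once the rows match as well.
        if all(abs(sum(cells[i]) - t) < tol for i, t in enumerate(row_targets)):
            break
    return cells

# Invented sample counts (rows: two age groups, columns: two regions).
sample = [[30.0, 20.0], [10.0, 40.0]]
row_targets = [600.0, 400.0]   # marginal controls for age group
col_targets = [450.0, 550.0]   # marginal controls for region

fitted = rake(sample, row_targets, col_targets)
# The weight for each cell is the fitted total divided by the sample count.
cell_weights = [[f / s for f, s in zip(frow, srow)]
                for frow, srow in zip(fitted, sample)]
print(cell_weights)
```

Only the margins are forced to agree with the controls; the interior of the fitted table retains the association pattern found in the sample.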

   The current SEH grossing and weighting system bypasses this problem by assuming that the
“difficulty in contacting” variable is an effective proxy for direct measures of the attributes of
non-responding households.
   This is a general finding, but Groves and Couper (op. cit.) argue that the tendency of elderly persons
to refuse tends to disappear when household size is allowed for.

Since the early 1990s these methods have been further developed and refined for
household surveys, in particular by a group at INSEE (Deville and Sarndal 1992;
Deville, Sarndal and Sautory 1993; see Elliott 1997 for more detail). Their methods
have been embodied in a generalised computing package known as CALMAR. Over
the same period other packages for performing the same types of adjustment have
been developed for ONS (GROSSWGT aka G-UP), by Statistics Netherlands
(BASCULA) and by Statistics Canada (GES). Elliott (1997) reviewed and trialled
each of these packages (in the versions current at the time) and recommended
CALMAR as the most versatile and usable.

We have been asked to comment on whether the present method of grossing and
weighting the SEH results should be replaced by CALMAR. There is a problem here
in that, whereas CALMAR is a generalised set of statistical algorithms, the current
system is very much a customised product. To provide an unambiguous answer one
would need to apply both systems to a common SEH data set to check, first, whether
they led to notably different weighted distributions for key survey variables. If so, it
might be possible to use the data set provided by the Census check exercise to
compare their effects in reducing bias. It would also be important to compare the
systems in terms of the variance of the grossed estimates to which they lead. We
believe that methods exist for providing variance estimates using CALMAR. In the
case of the current SEH system it may be necessary to estimate variances using
some type of balanced repeated replications procedure.

11. Recommendations

11.1 Introduction
We were asked to review technical aspects of the design and execution of the
Survey of English Housing. Mr Kafka briefed us about current methodological issues
to which we should give attention and we are confident that we have understood the
long-term aims and design priorities of the survey and the way it is designed to meet
them. However, making recommendations regarding possible design or operational
changes requires assumptions about the value and acceptable cost of options to
survey customers and to ODPM as a whole. The assumptions we have made may or
may not be correct and the following paragraphs should be read with that in mind.

11.2 Simplicity and robustness
The SEH uses a sampling frame and a type of household sample design that is
common to other government household interview surveys. It has a uniform selection
probability for addresses in all strata, no sample rotation and no built-in longitudinal
features. All the SEH data are collected in the course of a single interview per
household/tenancy group. The survey does not require the collection of information
from more than one member of each household, or on more than one occasion. The
efficiency with which the SEH is conducted in the field is enhanced by the stability of
its main content and procedures. The questionnaire is not grossly overloaded, as has
happened in the past with some other government surveys. The design is of a
standard kind and is well adapted to the aims of the survey (subject to certain
detailed points made in following sections). Sound basic design, relative simplicity
and operational robustness are all strong points and an insurance against the
unforeseen. They should not be compromised without good reason.

Recommendation: The basic design and current operational procedures
should be retained.

11.3 Single household respondent
The way in which a single household respondent is selected conflicts with the use of
the survey to measure housing attitudes, aspirations and satisfaction. These are
attributes of individuals on which household members may well not agree. That could
be accommodated if a random member of each multi-adult household were selected
to answer attitude questions. However, it is not technically acceptable for this
purpose to choose the household respondent on the basis of convenience, as
happens at present.
Recommendation: Consideration should be given to selecting a random
member of each multi-adult household to answer attitude questions.

11.4 Household reference person
From April 2001 the older procedure for identifying uniquely a “Head of Household”
was abandoned as being sexist and replaced with a definition of the Household
Reference Person as the household member with the highest income. In the case of
housing surveys it is difficult to see much merit in this change. The title “Head of
Household” could reasonably have been replaced, but the “highest income” criterion
seems equally if not more unfair to women and also likely to cause doubts and
anomalies in practice (see also recommendation 11.3). How, for example, should it
operate in a household where the husband, a qualified heating engineer, is
unemployed and the wife has a part-time cleaning job? Is the wife then HRP until the
husband gets a job, or is the husband to be selected because he is potentially the
higher earner?

11.5 Exclusions from the target population
The SEH aims to provide information about private households and their housing
arrangements. However, certain “fringe” populations are excluded from the sample
as ineligible. In the case of households and individuals these include residents of
institutions such as hostels, elderly residential and children’s homes, prisons and
military establishments, families and individuals in temporary accommodation,
vagrants and squatters. In the case of dwellings the excluded “fringe” includes
premises that have become temporarily or permanently unfit for habitation
(though some of these actually have occupants), and vacant accommodation.
Second and holiday homes are partly covered in the main questionnaire when
households owning such homes are found and respond at their main address, but
are excluded as ineligible when encountered in field sampling.

For SEH purposes each individual is assumed to have just one identifiable
permanent address. This assumption is enforced through eligibility rules that require
each household and individual to have a unique “main” residence. A corollary is then
that dwellings, with any households or persons who occupy them for the time being,
are excluded as ineligible if they are not anyone’s “main” address; and people who
have no permanent home according to the definition discussed above become
statistically invisible so far as the SEH is concerned.

The assumptions that each person has a unique address of permanent residence
and that there is for each individual some address at which he or she satisfies the
“permanent residence” rules are questionable. It seems very probable that applying
them (particularly the second) results in substantial under-coverage of certain
population subgroups, such as individuals and families in temporary accommodation
and single adults, often young, who have a vagrant lifestyle and spend time at a
number of different addresses, but never for long enough to qualify as a permanent
resident. There are some alternative sources of information for households placed in
temporary accommodation by local housing authorities, but not for the second group,
which is probably much larger.

Households and individuals who actually have several permanent addresses can
only be treated as eligible if contacted at their “main” address. This rule is intended to
avoid giving households that have several homes multiple chances of selection. A
corollary is that households or individuals who spend much of their time at a second
or holiday home (which may be outside the UK) have a lower chance than others of
being included in the SEH analysis sample (since if the call were to be made at their
“main” residence they would often be classified as non-contacts).

The excluded and under-sampled groups, particularly those who have no permanent
private residence, seem important from a housing policy viewpoint. On the other
hand it is likely that the excluded households and individuals, even if treated as
eligible, would have high survey non-contact and refusal rates in practice and the unit
costs of attempting to cover these groups would be high, but we nevertheless
recommend that some attention be given to these rather fundamental issues.
Recommendation: Attention should be given to the possibility of extending the
coverage to bring in excluded and under-sampled groups.

11.6 Response rates
Maintaining high rates of response to household interview surveys is a perpetual
challenge which is becoming more severe over time. The SEH has not escaped the
downward trend in rates of response which has affected all the continuous household
surveys conducted by the National Centre and by the Office for National Statistics,
particularly over the past 10 years. No survey contractor can supply a magic,
costless ingredient that will enable surveys to defy this trend (reasons discussed in
the main text). The time may have come for the SEH, along with other major
government surveys, to consider the payment of response incentives in cash or other
forms, since experiments on other surveys have shown that these do have a
significant impact in raising response.
Recommendation: some experiments with response incentives should be

11.7 Sample size
As in the case of every other important government survey, there are users and uses
of the SEH that statistically demand a larger sample size than is available at present.
In particular, there is demand for reliable results for small areas, for households in
the rarer tenures (considered separately) and for other population subgroups.

Ultimately, determining the appropriate size/cost/benefit balance must be a matter for
the Department, but a number of technical recommendations can be made that bear
on the decision.

Since the SEH selects an independent sample of addresses to the same design each
year, a low-cost and technically straightforward way of boosting the numbers of
cases (households) available for analysis is to aggregate results from several
consecutive years. There are no statistical disadvantages, but of course it is inherent
that the time-reference of the results becomes blurred. This matters more for some
types of analysis than for others and we accept that for the former there is a high
premium on being able to use up-to-date results.
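
The gain from aggregating years can be gauged with the usual standard error formula for a proportion. The figures below are hypothetical (a subgroup yielding 400 households a year and an estimate of 10%), and the sketch assumes independent annual samples of equal size, an unchanged design effect and a stable underlying proportion.

```python
import math

def se_proportion(p, n, deff=1.0):
    """Standard error of a proportion, inflated by an assumed design effect."""
    return math.sqrt(deff * p * (1 - p) / n)

p = 0.10            # hypothetical estimate for a rare-tenure subgroup
n_per_year = 400    # hypothetical subgroup sample size in a single year

se_one_year = se_proportion(p, n_per_year)
se_three_years = se_proportion(p, 3 * n_per_year)

print(f"one year:    SE = {se_one_year:.4f}")
print(f"three years: SE = {se_three_years:.4f}")
print(f"ratio = {se_one_year / se_three_years:.3f}")
```

Under these assumptions pooling k years reduces the standard error by a factor of the square root of k, so three years are needed to reduce it by a little over 40%.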

Boosting of the continuous SEH address sample to provide useful and reliable results
for any set of geographical entities down to local authority size is probably too costly
for a single department (it was done for the NDHS in 1977, but on a one-off basis).
Boosting with equal probability would not be cost-effective, because the sampling
shortfall affects households in minority tenures and other small subgroups, rather
than owner-occupier households. The need to selectively boost certain subgroups
has long been recognised by the SEH’s sister survey, the EHCS, which uses designs
with widely different sampling probabilities between address/dwelling strata defined
by area, tenure and age of structure. Ironically, however, a recent review of the
EHCS carried out by the present consultants showed that, because of the dearth of
good stratifiable address sampling frames, the best way to maintain this feature of
the EHCS design was to rely on the larger SEH to effectively create an auxiliary
stratified sampling frame for it through “shadow sampling”.

ONS is currently promoting the idea of a very large Integrated Social Survey (ISS),
conducted on behalf of a range of user departments by ONS. In principle this could
deliver, for a very limited number of key housing variables, the very large sample
sizes that some desired uses of the SEH require. It also appears to offer a sample
considerably larger than that of the SEH for a wider range of housing variables,
though it is not clear from the proposals so far circulated exactly what that would
mean. This ambitious project is still at an early stage of development and raises
many technical and survey ownership issues.

It has been pointed out that a simpler low-cost version of this approach might be put
together by aligning with the SEH the results of housing questions already included in
other large government surveys such as the LFS and the GHS, so as to be able to
provide results for a few key variables based on a large aggregated sample. The
proposition appears feasible and cost-effective, but would of course require
interdepartmental agreement and collaboration and currently the ISS initiative holds
the interdepartmental stage.
Recommendation: If the ISS fails to make progress, pursue the possibility of
achieving a larger sample by combining housing data from other government
surveys with SEH data.

11.8 Selective sample boosting
In principle a preferable alternative to global SEH sample boosting would be to boost
the representation of rare subgroups of interest to housing policy (we refer here to
groups that are not geographically defined). Two likely examples out of several would
be tenure groups and ethnic groups. It must be remembered that the requirement is
not just to find an additional N households belonging to a particular tenure or ethnic
group, nor even to do that and be able to contact and interview them at affordable
cost and acceptable rate of response. The extra cases must also be added to those
covered by the existing SEH in such a way that the aggregate sample can support
national estimates that can be shown to be unbiased and to have calculable and
acceptable margins of sampling variance.
This approach once again runs up against the lack of complete and reliable national
address sampling frames in which the subgroups of particular interest can be pre-
identified. To overcome this an address screening design would almost certainly be
needed. This can be and has been successful as a means of providing statistically
valid samples of minority populations, but it is inherently uneconomical because a
large screening sample is needed of which, by definition, the majority will prove to be
out of scope. Also, it usually results in samples that require weighting, which reduces
their statistical efficiency, and screening simultaneously for several different sets of
subgroups (e.g. ethnic groups and tenure groups) rapidly becomes too complicated.
Recommendation: If the SEH sample is to be boosted selectively only one set
of non-overlapping subgroups at a time should be addressed in this way.
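
The scale of a screening exercise can be gauged with simple arithmetic. The sketch below is illustrative only: the function name, the target of 500 interviews, the 5% prevalence and the two response rates are all hypothetical.

```python
import math

def addresses_to_screen(target_interviews, prevalence, screen_rate, interview_rate):
    """Addresses to issue so that the expected number of achieved interviews
    with in-scope households reaches the target."""
    yield_per_address = prevalence * screen_rate * interview_rate
    return math.ceil(target_interviews / yield_per_address)

# Hypothetical: subgroup found at 5% of addresses, 90% of addresses
# successfully screened, 70% of identified households interviewed.
needed = addresses_to_screen(500, prevalence=0.05, screen_rate=0.9, interview_rate=0.7)
print(needed)
```

On these assumptions well over fifteen thousand addresses would have to be issued to achieve 500 interviews, and some 95% of those screened would prove to be out of scope.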

11.9 Sample stratification
In the SEH non-geographical stratification of the sample can be applied only at the
PSU (sector) level. This is because the PAF sampling frame contains locational
information only. The aim of stratification is to reduce the variance of sample-based
estimates, but in the case of large, well-spread samples such as that of the SEH the
marginal effect of such stratification is usually quite small. Against that background
we have considered various suggestions for improvement but do not recommend any
changes in the present stratification scheme (described in Section 1). We note,
however, that several of the stratifiers are Census-based and will in any case need to be
reviewed when the 2001 Census small area statistics become available.
Recommendation: No change in the present stratification scheme, but see
11.11 below.

11.10 Rotation of primary sampling units
Mr Kafka asked us to consider the merits of introducing into the SEH a sampling
feature whereby the postal sectors used in any one fieldwork year would be split into
(say) two replicate subgroups, with one being retained for the next year or years and
the other replaced by a new replicate selection. In the third year the retained sectors
would be replaced and so on. This is known as rotation of PSUs. Rotation at the level
of PSUs should be clearly distinguished from rotation at the level of addresses, which
is a feature of the EHCS. Because most of the variation within the sample in terms of
important housing variables is between addresses within sectors, rather than
between sector means, the effects of PSU rotation on the statistical performance of
the sample are much smaller than those of address rotation.

The aim of PSU rotation is to reduce the variance and standard errors of measures of
aggregate change over time (in the simple example given above, year-on-year
change). Experience of other surveys with similar sample designs suggests that the
beneficial effects of such rotation would be very small. On the other hand rotation
reduces the effectiveness of year-on-year aggregation of samples to obtain a larger
sample size, which seems rather important. It also has a complication cost, since the
Post Office constantly reviews the post-coding system and sectors may change their
boundaries or even be radically reorganised from one year to the next.
Recommendation: PSU rotation should not be introduced.
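
The trade-off discussed in this section can be illustrated with a simplified textbook approximation for the variance of a year-on-year change. This is a sketch only: the function name, the parameters and the numbers are hypothetical and are not SEH estimates. It assumes that the covariance between the two years' estimates arises solely from the retained sectors, so that `overlap` is the proportion of sectors retained and `rho` is the correlation between the two years' estimates attributable to those sectors.

```python
import math

def change_variance(v1, v2, overlap, rho):
    """Approximate variance of (year-2 estimate minus year-1 estimate).

    v1, v2  : sampling variances of the two annual estimates
    overlap : proportion of PSUs retained from year 1 (0 = independent samples)
    rho     : assumed correlation induced by the retained PSUs (hypothetical)
    """
    covariance = overlap * rho * math.sqrt(v1 * v2)
    return v1 + v2 - 2.0 * covariance

# Independent annual samples: the variances simply add.
independent = change_variance(1.0, 1.0, overlap=0.0, rho=0.2)

# Half the sectors retained, with a small sector-level correlation.
rotated = change_variance(1.0, 1.0, overlap=0.5, rho=0.2)
```

When `rho` is small, as would be expected where most variation lies between addresses within sectors, the gain from rotation is correspondingly small.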

11.11 Clustering and other design effects
Like most national household interview surveys the SEH has a sample drawn in two
stages. At the first stage areas (postal sectors) are selected systematically (with
stratification and probability proportional to size) from a list of all sectors. At the
second stage an equal number of addresses is selected by a random systematic
procedure from those listed within the sectors selected at the first stage. This
produces an equal-probability sample made up of standard-sized clusters of
addresses. The number and size of clusters is balanced so that each provides a
convenient interviewer workload.
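
The two-stage procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the SEH selection program: the function names and the sector sizes are hypothetical, the list of sectors is assumed to be already sorted by the stratifiers (so that systematic selection gives implicit stratification), and no sector is assumed to be larger than the sampling interval.

```python
import random

def pps_systematic(sizes, n_psu, rng):
    """Select n_psu units with probability proportional to size, using a
    random start and a fixed interval along the cumulated sizes."""
    total = sum(sizes)
    interval = total / n_psu
    start = rng.uniform(0, interval)
    points = [start + k * interval for k in range(n_psu)]
    selected = []
    cumulative = 0
    next_point = 0
    for index, size in enumerate(sizes):
        cumulative += size
        # A unit is selected each time a sampling point falls inside it.
        while next_point < n_psu and points[next_point] < cumulative:
            selected.append(index)
            next_point += 1
    return selected

def overall_probability(size, total, n_psu, per_psu):
    """Overall selection probability of an address in a sector of a given size."""
    first_stage = n_psu * size / total   # sector chosen with PPS
    second_stage = per_psu / size        # fixed number of addresses per sector
    return first_stage * second_stage

# Hypothetical address counts for six sectors.
sector_sizes = [500, 1500, 1000, 2000, 3000, 2000]
chosen = pps_systematic(sector_sizes, n_psu=2, rng=random.Random(1))
```

Because the sector size cancels between the two stages, `overall_probability` returns the same value for every sector, which is what makes the design equal-probability.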

The practical and cost advantages of two-stage sampling over simple random
sampling of addresses are enormous, but a price is paid in that the statistical
efficiency of clustered samples, as measured by the standard errors of estimates
based on them, is lower than that of a simple random sample. Two features of the
SEH design help to reduce this unfavourable design effect to a minimum: one is
stratification in the selection of sectors and the other is limitation on the number of
addresses per cluster. In spite of these features the design effect is still significant for
many housing estimates based on the survey. In our view the balance struck in the
case of the SEH design is a good one. Nevertheless, empirical work on other surveys
(FRS, GHS) comparing the effects of different stratification designs has suggested
that ways of increasing the variance-reducing effects of stratification might be found.
Recommendation: The number and size of clusters should not be changed. It
would, however, be worth carrying out empirical work to check whether the
existing stratification scheme could be improved by fine tuning. A suitable
opportunity for doing such work will be the general updating of the design that
will be necessary in order to make use of the 2001 Census data for small areas.
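
The dependence of the clustering design effect on cluster size can be illustrated with the standard approximation deff = 1 + (b - 1) x roh, where b is the number of achieved interviews per cluster and roh is the intra-cluster correlation for the variable concerned. The sketch below is illustrative only: the function names are hypothetical, the values used are not SEH estimates, and the approximation ignores the variance-reducing effect of stratification.

```python
def cluster_design_effect(cluster_size, roh):
    """Approximate design effect (variance ratio relative to simple random
    sampling) for equal-sized clusters: 1 + (b - 1) * roh."""
    return 1.0 + (cluster_size - 1) * roh

def effective_sample_size(n, design_effect):
    """Size of the simple random sample giving the same precision."""
    return n / design_effect

# Hypothetical values: 20 interviews per sector, roh of 0.05.
deff = cluster_design_effect(20, 0.05)
n_eff = effective_sample_size(20000, deff)
```

Variables that are strongly clustered geographically, such as tenure, would be expected to have a higher roh and therefore a larger design effect than variables that vary mainly within sectors.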

Another source of design effects in the SEH is differential weighting applied to the
results. Differential weighting tends in most circumstances to generate unfavourable
design effects that inflate standard errors of estimates, the magnitude of the effect
being proportional to the variance across the sample of the weighting factors applied.
To investigate the effects on the variance and bias of estimates a special data set
would need to be produced, incorporating not only the raw survey results for some
suitable set of variables but also full meta-data on both the sample design and the
weighting scheme, including per-case values of the weights applied at each stage
(see section 10). We recommend that this be done, but specifying and obtaining such
a data set, carrying out appropriate analyses and interpreting the results is a larger
task than could be attempted within the limits of the present review.
Recommendation: Establish a special data set and use this to investigate the
effects on variance and bias of the differential weighting used currently. This
data set would also be used for a comparison of the current grossing method
with the CALMAR weighting and grossing approach (see below).
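
As a first indication of the effect of differential weighting on standard errors, a widely used approximation takes the design effect due to weighting to be 1 plus the relative variance of the weights, which equals n x sum(w^2) / (sum(w))^2. The sketch below assumes that the weights are unrelated to the survey variable, and it says nothing about bias; the function name and the example weights are hypothetical.

```python
def weighting_design_effect(weights):
    """Approximate design effect due to unequal weights:
    n * sum(w^2) / (sum(w))^2, i.e. 1 + relative variance of the weights."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)

# Equal weights give no loss of precision.
equal = weighting_design_effect([2.0, 2.0, 2.0, 2.0])

# Hypothetical per-case weights with some variation.
unequal = weighting_design_effect([1.0, 1.0, 3.0, 3.0])
```

With per-case weights available for each stage of the weighting scheme, the same calculation could be repeated stage by stage to see where most of the loss of precision arises.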

11.12 The grossing and weighting system
Another reason for making the recommendation at the end of the previous paragraph
is the need for better empirical and theoretical understanding of the current grossing
and weighting system, which is both complex and unique. We believe we have
understood the rationale and in principle how the system works, but a separate
project would be required to compare it in detail with, say, the CALMAR weighting
and grossing approach, which is more widely used in handling government survey
data and is more fully documented. Such a comparison would need to compare
important estimates produced by the two (or more) systems tested in terms of their
bias and variance. D. Elliott (1997, 1999) has explored and reported on this area and
evidently looked at the SEH system, but did not publish specific results. He should
certainly be consulted.
Recommendation: Undertake a project to compare the results of the current
weighting and grossing approach with the equivalent using CALMAR, taking
into account the work carried out by D. Elliott (see references).
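
For readers unfamiliar with calibration weighting, the sketch below shows raking (iterative proportional fitting) to two sets of control totals, which is one of the calibration methods of the kind offered by CALMAR. It is not CALMAR itself and it is not the SEH grossing system: the function name, the categories and the control totals are all hypothetical.

```python
def rake(weights, groups_a, groups_b, targets_a, targets_b, iterations=50):
    """Adjust weights so that weighted totals match the control totals
    for two classifications in turn, repeating until stable."""
    w = list(weights)
    for _ in range(iterations):
        for groups, targets in ((groups_a, targets_a), (groups_b, targets_b)):
            # Current weighted total in each category.
            totals = {}
            for wi, g in zip(w, groups):
                totals[g] = totals.get(g, 0.0) + wi
            # Scale each case so its category matches the control total.
            w = [wi * targets[g] / totals[g] for wi, g in zip(w, groups)]
    return w

# Four hypothetical cases classified by tenure and by region.
tenure = ["own", "own", "rent", "rent"]
region = ["north", "south", "north", "south"]
calibrated = rake(
    [1.0, 1.0, 1.0, 1.0],
    tenure, region,
    {"own": 60.0, "rent": 40.0},
    {"north": 70.0, "south": 30.0},
)
```

A comparison of the kind recommended would run both systems on the same data set and compare the resulting estimates and their standard errors.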

11.13 Enumeration of dwellings
It would enhance the value of the SEH to users in ODPM if it could be made to yield
useful estimates relating to the population of dwellings, as distinct from the
population of households. The first step towards this would be to take steps to ensure
that all actual or potential private dwellings forming part of each PAF address
included in the sample are identified by the survey. As a start, one could take
advantage of the fact that SEH interviewers, as part of the existing field procedures,
already identify and list on Address Record Forms (ARFs) all household spaces
found at an address. The information recorded is likely to be more reliable where
they are able to gain access to the address and (optimally) find someone able to tell
them what households live there. Emphasis will need to be given to identifying and
counting vacant as well as occupied household spaces. For a complete enumeration
it is also desirable that dwellings used as holiday or second homes be included in the
count, even though the occupants for the time being of such addresses are treated
as ineligible for the household sample under current rules. That can also be arranged
by using information already recorded on ARFs, but again there may be problems in
the field in gaining access and/or in finding a reliable informant. It has been agreed to
implement these changes for fieldwork year 2003-04.
Recommendation: The results of the new dwelling enumeration procedures to
be introduced in 2003-04 should be monitored and checks should be made,
possibly by revisiting a sample of problematic addresses, on the accuracy of
the dwelling count obtained. Policy-makers at ODPM should consider the value
of the dwelling count obtained and whether there is a case for enhancing the
survey procedures (at higher cost) in order to obtain more information about
dwellings other than those occupied by responding households.

11.14 Imputation for missing data
On Mr Kafka's instructions we have looked at this aspect of data processing. We
concluded that the missing data problems on the SEH were more of a minor
nuisance (as on most other well-conducted surveys) than a serious threat and that no
general method of imputation would be cost effective.
Recommendation: Imputation should not be introduced into SEH processing.

11.15 Forward planning and survey timetable
So far as we know the SEH does not have an explicit forward planning cycle that
extends beyond the upcoming fieldwork year. On the General Household Survey, by
way of comparison, a planning system developed in which some question blocks
were formally classified as permanent and mandatory (“core”), others were covered
periodically (say every third year) and others again were treated as temporary ad hoc
insertions. New sections and sections requiring substantial redevelopment, which will
generally require questionnaire development and piloting, were identified well in
advance, appropriate resources were costed in and timetables prepared. There may
be scope for some similar system on the SEH.

We are aware of the demand from SEH users within ODPM for a survey able to
respond more quickly to changing user needs and priorities. The cycle of operations
from articulation of a new data request to data delivery will normally require a year at
minimum, even where a part year's results will suffice (the sample for each quarter is
for practical purposes nationally representative and large enough for many analysis
purposes). In response, we know that work has been done to cut out unnecessary
delays in the planning, data processing, data analysis and reporting phases. We do
not think it possible to make a large continuous survey like the SEH double as an
agile “Omnibus” or ad hoc instrument and we would not like to see either robustness
and simplicity (see first recommendation) or dissemination of results (see below)
sacrificed to vain efforts in that direction (see next section).
Recommendation: Consider the establishment of a formal planning cycle
extending over several survey years.

11.16 Dissemination of results and data sets
Whereas the needs of ODPM and other government users of the SEH are
paramount, account should also be taken of the needs of external users. These are
not limited to getting speedy access to the latest SEH results, but also include having
early access to the household-level micro-data and to survey methodological
information, such as complete specifications of the sampling and data collection
designs and the fieldwork documents, procedures and outcomes. Annual published
reports, as well as presenting tabulated results and commentary, have hitherto
supplied these needs. There is a tendency to assume that detailed paper reports,
containing well-selected tables etc and detailed expert commentary, are rapidly
becoming obsolete, so that the resources put into writing and publishing such
documents can be reduced. However, every consultation that is conducted amongst
the wider population of users of important government surveys contradicts this.
Recommendation: Continue to produce paper reports containing selected
tables and commentary.

11.17 The SEH and the EHCS
In the course of this consultancy we became aware that ODPM were also reviewing
the possibility of in some way merging the SEH and the EHCS. This was not within
our remit, but our main strategic comments would be the following. EHCS sampling
currently relies upon the SEH “shadow-sampling” and the household questionnaires
of the two surveys have topics in common, but otherwise they have quite different
designs. Specifically, the SEH has an equal-probability design while the EHCS uses
sharply different sampling fractions between strata; and the SEH draws independent
address samples each year, whereas the EHCS has a sample that rotates at the
address/household level. Unless some important features of the EHCS were
sacrificed, a merged survey would be very complicated to design and implement. On
the other hand if the task were to produce a single design from scratch from a brief
which incorporated the key aims of both surveys and looked to a sample similar in
size to that of the current SEH, innovative thinking might possibly be able to address
some of the problems and shortcomings of both existing surveys.

Barton, J. (1996) Selecting Stratifiers for the Family Expenditure Survey. Survey
Methodology Bulletin, 32, pp21-26, ONS
Bruce, S. (1993) Selecting Stratifiers for the Family Resources Survey. Survey
Methodology Bulletin, 32, pp20-25, ONS
DETR (2000) “Multiple Deprivation at the Small Area Level – Indices of Deprivation”
HMSO: ISBN 1 851 124 535
Elliott, D. (1997) “Software to weight and gross survey data” GSS Methodology
Series No 1
Elliott, D. (1999) “Report of the Task Force on Weighting and Estimation” GSS
Methodology Series No 16
Foster, K. (1998) Evaluating non-response on household surveys. GSS Methodology
Series no. 8, London: Government Statistical Service.
Insalaco, F. (2000) “Choosing stratifiers for the General Household Survey”. Survey
Methodology Bulletin, 46, pp6-14, ONS
Groves, R.M. and Couper, M.P. (1998) Non-response in household interview
surveys. Chichester: John Wiley. Groves and Couper argue the need to separate out
non-contacts and refusals.
Kemsley, WFF (1975) “Family Expenditure Survey: A study of differential non-
response based on a comparison of the 1971 sample with the Census”. Statistical
News 31 pp3-8
Lynn, P.; Clarke, P.; Martin, J. and Sturgis, P. (2002) The effects of extended
interviewer efforts on non-response bias. In: Survey Nonresponse (eds. Groves,
R.M.; Dillman, D.A.; Eltinge, J.L. and Little, R.J.A.), chapter 9, Chichester: John
Redpath, R (1986) “Family Expenditure Survey: A second study of differential non-
response, comparing census characteristics of FES respondents and non-
respondents” Statistical News 72 pp13-16

