13th International Roundtable on Business Survey Frames
Paris; 27 September-1 October 1999
Session No 6, Paper No 3
Tom WOODHOUSE, Statistics New Zealand

The Quest for Quality – Establishing Industrial Classification, Life Cycle and Coverage Quality Measurements in Statistics New Zealand’s Business Register

1. Introduction
This paper outlines three quality investigations that Statistics New Zealand has undertaken to establish existing levels of quality, and discusses some of the issues associated with establishing quality benchmarks. As with other business registers, the main factors affecting the quality of Statistics New Zealand’s Business Frame are accuracy, coverage and timeliness. Although these business register quality issues are recognised as being important, there are currently no generally accepted benchmarks that can be used to determine the optimal allocation of frame maintenance resources in dealing with them.

There is probably no ideal proportion of the economic statistics budget that should be invested in a business register. Conditions vary between statistical agencies, both in the range of outputs produced and in the quality and accessibility of frame maintenance data. However, the role of the register must be seen in terms of its contribution to the quality of final statistical outputs. Hence the quality benchmarks that apply to the business register should be established in terms of the quality standards that have been set for these statistics. By this means a well-focused register maintenance strategy can be implemented within the context of increased efficiency for the statistical agency, as well as consideration of the demands made on respondents.

A well managed business register must therefore have quality standards that have been defined in consultation with the users, as well as appropriate measurement methods to monitor achievement against the quality benchmarks.
The benefits of such an approach include:
- Robust measurement methodologies for regularly monitoring quality achieved against defined standards;
- More cost effective allocation of frame maintenance resources;
- Faster response in adapting to changed statistical output priorities;
- More effective evaluation of frame development initiatives, including new technologies and alternative data sources;
- Improved management of respondent load; and
- Improved quality of final statistical outputs.
Timeliness
Business Frame timeliness is the length of time that elapses between the occurrence of a relevant real world event and its being recorded on the register. There are two broad components to timeliness:
I. The elapsed time between the occurrence of a relevant event and its detection by the Frame maintenance operators.
II. The elapsed time between the event’s detection and its being reflected on the Frame.

Coverage
Business Frame coverage in the context of these studies can be defined as its degree of completeness in recording the population of economic transactors from which surveys can be selected to produce statistics on the economy that meet users' needs. Coverage in the core Business Frame is restricted to economically significant enterprises (refer Appendix B).

Undercoverage occurs when:
- an economically significant enterprise has escaped detection by the monitoring processes and is not recorded on the Business Frame; or
- an economically significant enterprise is still trading but has been incorrectly recorded on the Business Frame as having ceased operation; or
- an economically significant enterprise is on the Business Frame but one or more of the locations at which it is operating are not represented by associated Geographic Units on the Frame.
Overcoverage occurs when:
- an enterprise that has become economically insignificant (or has ceased) has escaped detection by the monitoring processes and remains live on the Business Frame; or
- a Geographic Unit that belongs to an economically significant enterprise is still recorded as ‘live’ on the Business Frame although it has ceased operating.
Detection is the key factor that distinguishes timeliness from coverage. For example, a recently established enterprise that is not on the Business Frame because of a tax monitoring time lag is a timeliness issue. However, an enterprise that has escaped detection from tax monitoring is a coverage issue.

Duplication
Duplication occurs when an enterprise or geographic unit is represented more than once on the Business Frame.

Data Accuracy
Data accuracy refers to how accurately data items about a particular statistical unit are held on the Business Frame. Inaccurate data does not refer to data which is out of date, but rather to:
- incorrect or ambiguous data supplied by the respondent; or
- incorrect loading or coding of information by a Business Frame maintenance operator.
3. The Studies
Below are summaries of three studies, carried out over the last three years, that have independently measured aspects of the quality of the Business Frame. The full investigations are available on request. The studies focus on coverage, accuracy and timeliness issues. The three studies that were designed to indicate levels of quality of Statistics New Zealand's Business Frame are:
1. The Household Labour Force Survey Investigation (Coverage) 1997;
2. The Annual Frame Updating Survey (AFUS) Industry Classification Check (Accuracy) 1998; and
3. The Life Cycle Investigation (Timeliness) 1999.

Study 1: The Household Labour Force Survey Investigation (Coverage) 1997 - Brief Description
Why is coverage so important?
Statistics New Zealand’s survey methodologists assume that the Business Frame holds all businesses that are economically significant. In reality, this is not the case. The Household Labour Force Survey (HLFS) Investigation compared responses to the independently sourced HLFS ‘place of work’ question with the Business Frame, and established that the Business Frame had 93.5 percent coverage of economically significant businesses with employment. The investigation also established that GST registrations are a robust source of Business Frame births.

Objectives of this Study:
This study’s objectives were to:
- identify weaknesses and gaps in the coverage of the database;
- give survey designers and analysts a truer indication of frame coverage and quality;
- allow more effective use of database maintenance resources by focusing them on our key problem areas; and
- send a positive message to clients that the Business Frame section is serious about quality improvement.
Methodology:
The study took a sample of respondents who had filled in the 'place of work' question from the March 1996 quarter of the Household Labour Force Survey (HLFS). For each response in this sample, it was determined whether the business (including both companies and self-employed individuals) could be found on the Business Frame as at 31 March 1996.

Outcomes from the Investigation:
The study found that:
- the Business Frame covered 93.5 percent of New Zealand businesses with employment that were in the sample;
- the remaining 6.5 percent comprised:
  - units which had not yet been birthed to the Business Frame (2.3 percent); and
  - units which were ceased on the Business Frame but had an annual Goods and Services Tax (GST) turnover of greater than $30,000 (4.2 percent).
Units not yet birthed on the Business Frame
Units accounting for 2.3 percent of the sample were missing from the Business Frame as at 31 March 1996 because of the time lag between a business registering with the New Zealand Tax Office, known as the IRD, and being recorded on the Business Frame. When checked against IRD's client registration file, all of these units could be found, i.e. they would have been birthed on to the Frame, generally within 5 months.
Units recorded as ceased on the Business Frame
The remaining undercoverage (4.2 percent) consisted of businesses that had been ceased on the Business Frame as a result of survey response feedback but had subsequently restarted using the same IRD GST registration number. Changes to these units were more difficult to detect within the old mainframe system. However, following the changeover to the Windows-based LAN system, a specific facility was designed to report on restarts.

Conclusions from the Investigation:
- That the quality of coverage of economically significant businesses with employment on the Business Frame at the time of the study was probably over 90 percent;
- That the primary source of undercoverage (non-activated restarts) could be detected and remedied provided resources were made available;
- That the secondary source of undercoverage (timing issues) was a function of the current maintenance strategy policy;
- That GST registrations are a robust source of Business Frame births.
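The coverage breakdown reported for Study 1 can be illustrated with a minimal Python sketch. The outcome labels and the function name are hypothetical, and the sample of 1,000 responses is purely illustrative: the paper reports percentages only, so the counts below were chosen to reproduce those percentages and are not the study's actual sample counts.

```python
from collections import Counter

# Hypothetical labels for the result of matching one sampled HLFS
# 'place of work' response against the Business Frame.
ON_FRAME = "on_frame"
NOT_YET_BIRTHED = "not_yet_birthed"        # registered with IRD, not yet on the Frame
CEASED_BUT_ACTIVE = "ceased_but_active"    # ceased on the Frame, but restarted

OUTCOMES = (ON_FRAME, NOT_YET_BIRTHED, CEASED_BUT_ACTIVE)


def coverage_breakdown(outcomes):
    """Return the percentage of sampled businesses falling in each outcome."""
    counts = Counter(outcomes)
    total = sum(counts[name] for name in OUTCOMES)
    if total == 0:
        raise ValueError("no sampled businesses to summarise")
    return {name: round(100.0 * counts[name] / total, 1) for name in OUTCOMES}


# Illustrative sample only (NOT the study's real counts).
sample = [ON_FRAME] * 935 + [NOT_YET_BIRTHED] * 23 + [CEASED_BUT_ACTIVE] * 42
breakdown = coverage_breakdown(sample)
undercoverage = round(breakdown[NOT_YET_BIRTHED] + breakdown[CEASED_BUT_ACTIVE], 1)

print(breakdown)
print("undercoverage:", undercoverage)
```

Keeping the two undercoverage sources as separate outcomes mirrors the paper's distinction between a timing issue (units not yet birthed) and a detection issue (restarts that were not reactivated).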
Study 2: The Annual Frame Updating Survey (AFUS) Industry Classification Check (Accuracy) 1998 - Brief Description
Why is the Industry Code so important?
The industry code is an important characteristic for the Business Frame, both in terms of coverage and in terms of collection management functions. In terms of coverage, it is used to reflect the population for the various groups of activities into which the economy is classified. With respect to the collection function, the industry code has an effect on the survey designs and ultimately on the quality of the statistical outputs that are used to measure the economy. For the Australian and New Zealand Standard Industrial Classification (ANZSIC) codes, the risk is greatest if the enterprise has been placed into the wrong division, and reduces progressively for incorrect coding at the subdivision, group, and class levels.

Objectives of this Study:
- To provide an ANZSIC code quality benchmark for the future measurement of the quality of ANZSIC coding on the Business Frame.
- To identify and develop a typology of incorrect ANZSIC coding and investigate ways of improving the quality of ANZSIC coding on the Business Frame.
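The graded risk described above (greatest for a wrong division, least for a wrong class) can be expressed as a small Python sketch that reports the broadest level at which two codes disagree. The code layout assumed here, a division letter followed by a four-digit class code such as "C2111", is an assumption for illustration, as are the function name and the example codes; the paper does not specify how codes are stored on the Business Frame.

```python
# ANZSIC hierarchy levels, broadest first, with the number of leading
# characters that identify each level under the ASSUMED layout
# "<division letter><4-digit class>", e.g. "C2111".
LEVELS = (
    ("division", 1),
    ("subdivision", 3),
    ("group", 4),
    ("class", 5),
)


def broadest_mismatch(frame_code, correct_code):
    """Return the broadest ANZSIC level at which the codes differ, or None."""
    a = frame_code.strip().upper()
    b = correct_code.strip().upper()
    for level, width in LEVELS:
        if a[:width] != b[:width]:
            return level
    return None


# Hypothetical code pairs, used only to exercise the function.
print(broadest_mismatch("C2111", "C2111"))  # codes agree
print(broadest_mismatch("C2111", "C2112"))  # differ only at class level
print(broadest_mismatch("C2111", "F4711"))  # differ at division level
```

Because a division-level error implies an error at every finer level, a tally of `broadest_mismatch` results gives mutually exclusive counts; cumulative figures such as "incorrect at class level" are then the sum over all levels.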
Methodology:
Survey methodologists were consulted on the sample size and method of questionnaire selection for this study. In response to their recommendation, a sample of 2000 questionnaires was randomly selected. Each selected questionnaire was then checked against the Business Frame, and a judgement was made as to the quality of the ANZSIC code.
Outcomes from the Investigation:
The AFUS Industry Classification (ANZSIC) investigation indicated that there was about a 94 percent level of accuracy with respect to the industry codes assigned on the Business Frame.

Analysis of Incorrect Coding
An analysis of the causes of the 5.6 percent error rate (116 enterprises) in ANZSIC code assignment within the 2000 sampled enterprises revealed that:
(i) an incorrect change was actioned in 10 enterprises (0.5 percent);
(ii) a change was required and not actioned in the case of 106 enterprises (5.1 percent).

Furthermore, in terms of degree of error:
- 3.0 percent of units investigated were incorrectly coded at the Division level. (This is the broadest level and would affect industry specific survey populations for Quarterly Manufacturing Survey, Wholesale Trade, Retail Trade, and Accommodation.)
- 5.6 percent of the units investigated were coded to an incorrect ANZSIC Class. (This is the finest level and may contribute to errors in Business Demography statistics and in some Annual Enterprise Survey industries.)
Refer Appendix A for a full list of the 17 ANZSIC Divisions.

Conclusions from the Investigation:
- That the level of accuracy of ANZSIC coding on the Business Frame at the time of the study was probably more than 90 percent;
- That maintenance resources be focused on ensuring that ANZSIC codes are assigned correctly when new enterprises are birthed; and
- That a cost benefit analysis be carried out to evaluate any potential changes to our ANZSIC coding procedures, including future post-coding checks, resourcing implications, system enhancements (including the automation of the coder) and training requirements.

Study 3: The Annual Enterprise Survey (AES) Lifecycle Check (Timeliness) 1999 - Investigation into the Quality of the Business Frame Enterprise Life Cycle Code – Pilot Study
Why is the Lifecycle Code so important?
The Business Frame must endeavour to reflect current business activity in terms of the enterprises that are live on the Business Frame. The Business Frame is updated monthly with GST births. However, there are time delays between receipt of data from IRD and the respondent. Equally, the ceasing of units may be less timely again, as the Business Frame, in some cases, relies on six-monthly GST return information to identify whether businesses have fallen below the economic significance threshold (refer Appendix B).

A concern within survey processing sections is that the Business Frame may be overstating or understating the number of live businesses in New Zealand at any given point in time. Surveys need the Business Frame to have a high level of accuracy and timeliness. A timely Business Frame ensures that only those units that are in scope are selected for surveys. However, almost every piece of information the Business Frame receives is historic, making it unlikely that a frame is 100 percent correct at any one time. The current Maintenance Strategy ensures that the Business Frame is updated annually as at February.
The Annual Enterprise Survey (AES) is an annual survey designed to collect financial data from most sectors of the economy. The collection unit is the Kind of Activity Unit (KAU) and the selection unit is the Enterprise (ENT). All multi-KAU Enterprises are placed into full-coverage strata. Of the more than 30,000 Enterprises in the AES97 sample, 282 are designated as Key Firms. For the purposes of this pilot investigation only these Key Firms have been investigated. Key Firms represent 1 percent of the AES population while making up 31 percent of the AES total income.

Objectives of this Study:
- Establish the impact on the Annual Enterprise Survey's Key Firms, in terms of the number of untimely Business Frame life cycle changes.
- Establish a robust investigation methodology that can be used to investigate all enterprises in the AES97 sample.
Methodology:
The Business Frame holds two dates, the Machine date and the Real World date. The Machine date is the date on which the system records the change. The Real World date is the date on which the event actually took place. This investigation:
- Isolated all the life cycle changes (cease, birth, reactivation) that were recorded after the AES97 selection date of 11 July 1997 (machine date); and then
- Identified those where the Real World date indicated that the event had taken place prior to the survey sample selection date of 11 July 1997.

Note: For the purposes of this investigation, and because we endeavour to maintain the frame annually, any life cycle change that has been recorded within one year of the real world date is considered to have been updated within an acceptable time frame.

Outcomes from the Investigation:
Of the 282 Key Firms, 42 enterprises were identified as having machine dates between 11 July 1997 (the survey selection date) and the date of the study (30 June 1999) that indicated that there had been a life cycle change. Of these 42 enterprises:
- 27 enterprises had a real world date subsequent to 11 July 1997.
- The remaining 15 enterprises had a life cycle change with a real world date prior to 11 July 1997.
- 3 of these 15 enterprises had not been updated within the currently acceptable time frame of one year.
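The two-date comparison described in the methodology can be sketched in Python as follows. This is an illustration under stated assumptions, not the investigation's actual procedure: the category names and function name are hypothetical, "within one year" is taken to mean 365 days, and an event whose real world date falls exactly on the selection date is treated as occurring after selection.

```python
from datetime import date, timedelta

SELECTION_DATE = date(1997, 7, 11)      # AES97 sample selection date (from the paper)
ACCEPTABLE_LAG = timedelta(days=365)    # "within one year"; the day count is an assumption


def classify_change(machine_date, real_world_date):
    """Classify one life cycle change relative to the AES97 selection date."""
    if machine_date < SELECTION_DATE:
        # Already on the Frame when the sample was selected: out of scope here.
        return "recorded_before_selection"
    if real_world_date >= SELECTION_DATE:
        # The event itself happened after selection, so the Frame was
        # correct at the time of selection.
        return "event_after_selection"
    # The event predates selection but was recorded afterwards.
    if machine_date - real_world_date <= ACCEPTABLE_LAG:
        return "late_but_within_one_year"
    return "untimely"


# Hypothetical date pairs, used only to exercise each branch.
print(classify_change(date(1997, 9, 1), date(1997, 8, 1)))
print(classify_change(date(1997, 9, 1), date(1997, 3, 1)))
print(classify_change(date(1999, 1, 15), date(1997, 3, 1)))
```

Under this classification, the paper's 27 enterprises correspond to `event_after_selection`, and the 15 enterprises with earlier real world dates are split between the last two categories.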
Therefore, 94.7 percent of Key Firms had a correct life cycle code at the time of selection, and 98.9 percent of Key Firms were considered to have been updated on the Business Frame in a timely manner (within one year). Of the three units with an incorrect life cycle status, two (0.7 percent) were enterprises that should not have been selected into AES97. Although the third enterprise had not been updated in a timely manner, it was still eligible for selection into AES97.
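The percentages above follow directly from the counts reported for the pilot study. A short Python sketch of the arithmetic, using only figures stated in the paper (the variable and function names are illustrative):

```python
KEY_FIRMS = 282              # Key Firms in the AES97 sample
LATE_RECORDED = 15           # real world date before selection, recorded afterwards
UNTIMELY = 3                 # of those, not recorded within one year
INELIGIBLE = 2               # should not have been selected into AES97


def pct(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


correct_at_selection = pct(KEY_FIRMS - LATE_RECORDED, KEY_FIRMS)  # 267 of 282
updated_timely = pct(KEY_FIRMS - UNTIMELY, KEY_FIRMS)             # 279 of 282
ineligible_share = pct(INELIGIBLE, KEY_FIRMS)                     # 2 of 282

print(correct_at_selection, updated_timely, ineligible_share)  # 94.7 98.9 0.7
```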
Conclusions from the Investigation:
AES Key Firms, by their very nature, are different from the AES population as a whole. So even though we can say with some certainty that about 1 percent of Key Firms (two of 282, or 0.7 percent) were ineligible for AES at the time of selection, we cannot extend the statement to the whole AES population. However, as Key Firms make up 31 percent of the AES total income estimate, it is accepted that a higher standard of quality should be applied to these units than to the rest of the AES population.
The pilot study concluded:
- That 94.7 percent of Key Firms had a correct life cycle code at the time of selection into AES97, and 98.9 percent of Key Firms were updated on the Business Frame in a timely manner;
- That the pilot study be expanded to cover all enterprises in AES97, with Survey Methods endorsing the methodology;
- That Business Frame clients be asked to define "timeliness" for the purpose of establishing a benchmark for this attribute;
- That when ‘timeliness’ is defined, the relative impact on surveys of different types of firms be considered.
3. Conclusions from the three studies
- That the coverage of economically significant businesses with employment recorded on the Business Frame at the time of the study was more than 90 percent;
- That the accuracy of ANZSIC coding on the Business Frame at the time of the study was also more than 90 percent; and
- That about 95 percent of Key Firms had a correct life cycle code at the time of selection into AES97, and approximately 99 percent of Key Firms were updated on the Business Frame in a timely manner.

These studies have provided previously unavailable information on aspects of Business Frame quality. However, the acceptability of these quality levels can only be determined after some benchmarks have been set. The studies do nevertheless provide measurement methodologies that can form part of a broader Business Frame quality strategy.
4. Investigations in Progress
The Business Frame section is currently undertaking a fourth investigation, which is to establish the volatility of Business Frame units over time. This investigation will be completed by February 2000. It is considered that measuring the relative volatility of different segments within the survey populations will add a further dimension when establishing appropriate quality benchmarks.
5. Conclusion and Future Direction
The above investigations have only dealt with one facet of benchmarking, namely establishing the existing level of quality of various aspects of the business register. However at this stage we have not addressed whether these quality levels meet, fail to meet or exceed our clients’ requirements. This is a phase of a wider study that, in itself, presents considerable challenges. Getting users to clearly define and agree on their minimum requirements can often be difficult to achieve. However this broader study must be embarked upon because of the pressures within a statistical agency to balance quality and efficiency. For Statistics New Zealand this broader study is contained within the framework of the Economic Statistics Strategy that is currently being developed. The aims of this strategy are to meet the needs of statistical output users for information within the context of increased efficiency and minimisation of respondent load. The Economic Statistics Strategy has identified a number of principles, practices and infrastructural issues that must be addressed if these aims are to be achieved. As a consequence one of the key projects that has recently commenced is the review of the business survey population maintenance strategy.
The Business Frame ultimately exists in order that relevant data is efficiently collected so that agreed needs of the users for statistical output information are met. However it should strive to do this within the aims of overall efficiency. Essentially this means adopting a business case approach in terms of identifying the costs and benefits of frame maintenance efforts. Part of this process involves an assessment of the risks to the established quality standards of final statistical outputs from adopting particular update options. Within this context quality benchmarks can best be determined by relating the various types and levels of error within the business frame with their effects on the quality of those final statistical outputs. It is suspected that there are likely to be areas where excess effort is being applied at the expense of under resourcing some other more rewarding frame maintenance issues. This suggests that further research and analysis should be undertaken before we can conclude that the reported quality levels should be considered as anything more than default benchmarks. However the investigations do provide methodologies for measuring quality and can be repeated at intervals to establish whether various frame maintenance practices have raised or lowered quality in relation to these default benchmarks.
Appendix A - ANZSIC (Australian & New Zealand Standard Industrial Classification) Divisions

A  Agriculture, Forestry and Fishing
B  Mining
C  Manufacturing
D  Electricity, Gas and Water Supply
E  Construction
F  Wholesale Trade
G  Retail Trade
H  Accommodation, Cafes and Restaurants
I  Transport and Storage
J  Communication Services
K  Finance and Insurance
L  Property and Business Services
M  Government Administration and Defence
N  Education
O  Health and Community Services
P  Cultural and Recreational Services
Q  Personal and Other Services
Appendix B - Economic Significance

Only those Enterprises that are defined as being ‘economically significant’ are maintained on the Core Business Frame. These are Enterprises that meet any one of the following criteria:
- Enterprises with greater than $30,000 annual GST expenses or sales
- Enterprises with more than 2 full-time equivalent paid employees
- Enterprises in GST exempt industries, other than residential property leasing/rental
- Enterprises that are part of a group
- Enterprises that are new GST registrations, registered as compulsory/special/forced
- Enterprises registered for GST with an activity unit classified to agriculture/forestry
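The "any one of the following criteria" rule above can be read as a simple predicate. The Python sketch below is illustrative only: the `Enterprise` record and all of its field names are hypothetical and do not describe the Business Frame's actual data model, and the thresholds are taken directly from the list above.

```python
from dataclasses import dataclass
from typing import Optional

GST_THRESHOLD = 30_000   # annual GST expenses or sales, in dollars
FTE_THRESHOLD = 2        # full-time equivalent paid employees

# Registration types that qualify a NEW GST registration.
QUALIFYING_NEW_REGISTRATIONS = {"compulsory", "special", "forced"}


@dataclass
class Enterprise:
    """Hypothetical record; field names are assumptions for illustration."""
    annual_gst_sales: float = 0.0
    annual_gst_expenses: float = 0.0
    fte_paid_employees: float = 0.0
    gst_exempt_industry: bool = False
    residential_property_rental: bool = False
    part_of_group: bool = False
    new_gst_registration_type: Optional[str] = None  # None if not a new registration
    gst_registered: bool = False
    has_agriculture_forestry_unit: bool = False


def is_economically_significant(e: Enterprise) -> bool:
    """True if the enterprise meets at least one Appendix B criterion."""
    return any((
        max(e.annual_gst_sales, e.annual_gst_expenses) > GST_THRESHOLD,
        e.fte_paid_employees > FTE_THRESHOLD,
        e.gst_exempt_industry and not e.residential_property_rental,
        e.part_of_group,
        e.new_gst_registration_type in QUALIFYING_NEW_REGISTRATIONS,
        e.gst_registered and e.has_agriculture_forestry_unit,
    ))


small_voluntary = Enterprise(annual_gst_sales=12_000,
                             new_gst_registration_type="voluntary",
                             gst_registered=True)
small_farm = Enterprise(annual_gst_sales=12_000, gst_registered=True,
                        has_agriculture_forestry_unit=True)
print(is_economically_significant(small_voluntary))  # False
print(is_economically_significant(small_farm))       # True
```

Expressing the criteria as independent conditions combined with `any` keeps each rule separately testable, which suits a threshold that enterprises may cross in either direction over time.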
Enterprises with Voluntary GST Registration status and Enterprises with less than $30,000 annual GST expenses or sales are monitored for changes in size. Enterprises moving above the threshold would progress from IRD’s database into the Business Frame.

Appendix C - Statistical Unit Rules

An Enterprise Unit (ENT) is a business or service entity operating in New Zealand, such as a company, partnership, trust, estate, incorporated society, producer board, local or central government organisation, voluntary organisation or self-employed individual.

A Kind-of-Activity Unit (KAU) is an institutional unit, or part of an institutional unit, which engages in one or predominantly one kind of economic activity without being restricted to a geographic area. Value added statistics must be able to be produced for a KAU, or be able to be readily or meaningfully imputed. Economic activities are the lowest level categories of the industrial classification in use. The KAU comprises one or more geographic units at one or more places for which a single set of accounting records is available. The KAU should be as industrially homogeneous as possible. However, data availability and respondent burden issues often affect the quality of information in this area. Generally, the accounting set is not broken to achieve this except in significant cases.

A Geographic Unit (GEO) is a separate operating unit engaged in New Zealand in one, or predominantly one, kind of economic activity from a single geographic location or base.