Seasonal Adjustment in a Mass Production Environment

Duncan Elliott¹
¹ Office for National Statistics, Cardiff Road, Newport NP10 8XG, United Kingdom

Abstract

A common problem in seasonal adjustment is the sheer number of series to be analyzed. Combine this with a manually intensive analysis process for each series, and resources can become stretched. This paper reports how the Office for National Statistics in the UK is automating its seasonal adjustment process in order to deal with a massive increase in the number of series to be analyzed (a consequence of a new compilation scheme for the National Accounts). Efficiency benefits are already being felt, in advance of the anticipated avalanche of new work, as a result of the ‘proof of concept’ stages of this project. The paper takes the audience from mapping the current process, through identifying potential areas for automation and initial automation methods, to finding a long-term solution for this perennial problem.

Key Words: seasonal adjustment, infrastructure, R

1. Introduction

The Office for National Statistics (ONS) is the largest provider of socio-economic and demographic information in the United Kingdom. This information plays an important part in policy making and is accessible to a wide range of users from other government departments, private sector industry, academia and the public.

ONS surveys commonly collect information over time to create time series of estimates, so that movements in the economy or in social phenomena can be tracked. When series are released in a raw, unadjusted state, seasonal patterns can cloud their interpretation. Seasonal adjustment removes these patterns and as such is an important part of the production process for many official statistics. The seasonally adjusted series are in fact often the primary focus of users, because estimating and removing systematic seasonal movements is perceived to give a clearer interpretation of movements over time.
Responsibility for the quality of seasonally adjusted estimates in ONS rests with a small team of time series experts who advise production areas on appropriate methods for seasonal adjustment, using variants of the X-11 process to estimate and remove seasonal components in a time series (Dagum et al., 1996 and Findley et al., 1998). In this paper we distinguish between the production environment, which involves data collection, processing and publication, and the analysis environment, where experts decide on appropriate methods for these processes. Our focus is on the efficiency benefits in analysis concerning seasonal adjustment; in particular, the choice of appropriate models and filters for high quality seasonal adjustment (as espoused in the ESS Guidelines on Seasonal Adjustment).

Section 2 provides background information on production and the changes that have led to increased pressures on analysis. Section 3 discusses analysis in more detail: current limitations, planned improvements to increase efficiency, and the benefits already seen from a proof of concept written in R, the software environment for statistical computing.

2. The Production Environment

2.1 Seasonal Adjustment in the UK

The ONS publishes in the region of ten thousand seasonally adjusted time series annually. The working definition of seasonal adjustment used in the UK is the ‘estimation and removal of effects associated with the time of year and the arrangement of the calendar’. The unadjusted estimate of a time series ($Y_t$) is a function $f$ of a number of component series: trend ($C_t$), seasonal ($S_t$), irregular ($I_t$), trading day effects ($td_t$), moving holiday effects ($hol_t$), and other time series effects such as changes in the level of a series (level shifts) or changes to the seasonal pattern (seasonal breaks). Hence

$$Y_t = f(C_t, S_t, I_t, td_t, hol_t, \ldots)$$

where $t$ denotes the time period. The seasonally adjusted estimate of $Y_t$ is

$$SA(Y_t) = f^*(C_t, S_t, I_t, td_t, hol_t, \ldots)$$
where the function $f^*$ estimates and removes the components defined as seasonal, i.e. $(S_t, td_t, hol_t, \ldots)$.

Time series are either directly seasonally adjusted, as given by $f^*$, or indirectly seasonally adjusted, as given by the function $g$, by aggregating seasonally adjusted component series:

$$SA(Y_t) = g(SA(Y_{1t}), SA(Y_{2t}), \ldots)$$

where $SA(Y_{1t}), SA(Y_{2t}), \ldots$ are component series of $SA(Y_t)$ that have been directly seasonally adjusted.

However simple the methodology, the process in the UK is not straightforward. Different production areas have different software solutions for seasonal adjustment (although all are based, at heart, on the X-11 algorithm), and the number of series involved in the production systems is huge. Although production areas actually run seasonal adjustment, the analysis areas are responsible for the quality of models and prior adjustments. The quality and parameters are checked at annual reviews conducted by time series experts, who also provide ad hoc support to production throughout the year. Because analysis is undertaken in more up-to-date software, the results then need to be translated to fit each individual production area. Given the non-standard nature of this review-and-update process, it is highly resource intensive.
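The distinction between direct and indirect adjustment in Section 2.1 can be illustrated with a minimal Python sketch. It does not use X-11: the crude ratio-to-mean seasonal factors below are an assumed stand-in for $f^*$, the function names are hypothetical, and $g$ is taken to be simple summation of the adjusted components.

```python
from statistics import mean

def seasonal_factors(series, period):
    """Crude multiplicative factors: mean of each season over the overall mean.
    An illustrative stand-in for f*, not the X-11 algorithm."""
    overall = mean(series)
    return [mean(series[s::period]) / overall for s in range(period)]

def adjust_direct(series, period):
    """Direct adjustment: divide each observation by its own season's factor."""
    factors = seasonal_factors(series, period)
    return [y / factors[t % period] for t, y in enumerate(series)]

def adjust_indirect(components, period):
    """Indirect adjustment: adjust each component series directly,
    then aggregate (g is assumed here to be a plain sum)."""
    adjusted = [adjust_direct(c, period) for c in components]
    return [sum(values) for values in zip(*adjusted)]

# Toy data with two seasons per year: one seasonal and one flat component.
y1 = [10, 20, 10, 20, 10, 20]
y2 = [30, 30, 30, 30, 30, 30]
total = [a + b for a, b in zip(y1, y2)]

direct = adjust_direct(total, period=2)
indirect = adjust_indirect([y1, y2], period=2)
```

On this toy data, with perfectly stable seasonality and no trend, the two routes agree. With real series and a non-linear adjustment method they need not, which is why the choice between direct and indirect adjustment is a genuine decision for the analyst.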
2.2 The Production Environment

2.2.1 Increased requirements

ONS is facing increased demands to publish seasonally adjusted time series

• at a finer level of detail (for example, regional time series to meet European Union (EU) requirements, and at a more detailed level of industrial classification for UK users)
• more quickly and/or more frequently (for example, short-term indicators for the EU, and monthly Gross Domestic Product (GDP) for UK users)
• under a new compilation scheme for UK National Accounts (due to re-engineering of current practices, see Compton 2008)
• with an increased level of metadata (for example, explanatory notes and information on statistical quality)

These pressures are magnified further by existing issues, such as the decision whether to seasonally adjust directly or indirectly, and the challenges raised by benchmarking and balancing of series.

2.2.2 System Changes

Simultaneous with the increased demands outlined in 2.2.1, production areas are being migrated to X-12-ARIMA in preparation for the launch of a corporate project to centralise data holdings and subsequent processing. Migration has knock-on effects in terms of training requirements, and the corporate project itself requires software specification and testing. All of these system requirements (migration, training and software) fall on the same time series analysis experts who are responsible for the annual quality reviews.

2.2.3 Automation of Seasonal Adjustment

Although in a production setting seasonal adjustment is automatic, in analysis automation is a dangerous option: the computer makes decisions based solely on the data, whereas humans often have access to complementary information. However, manual intervention is resource intensive, so prioritising resources to focus attention on key or problematic series is often the optimal choice.

3. The Analysis Environment

3.1 Current Procedures

ONS conducted a review of the annual review process in October 2006 because of concerns about resources consumed. The main findings of the review were as follows.

• Demand for expert time series analysis and advice will increase (to cover around 8000 series, a four-fold increase). In order to maintain quality, a system capable of automatically running and evaluating the seasonal adjustment from a methodological perspective for large batches of series in a short time period was required.
• The analysis could be made more efficient, and less error-prone, by systematising many of the repeated manual tasks.
• Transfer of data and parameter files between analysis and production areas could be made more efficient and less error-prone through standardisation and automation.
• The new corporate production system offered opportunities for efficiencies both within analysis and in the interaction between analysis and production areas due to consistent software and standard methodology.
• The quality review process did not guarantee the quality of final ONS outputs because processes after seasonal adjustment, such as aggregation, balancing and application of revision policies, incurred a risk of publishing time series estimates for high level aggregates containing seasonal features.
• Quick analysis of large volumes of series was not possible without further automation of processes and reduction in the range of graphical/analytical software involved, and the information required to effectively balance speed versus detail was lacking.
• Reporting was ad hoc and time consuming, with non-standard text and manual extraction of results and graphics.

The conclusion from these findings was that although some improvements could be made to the current system, there was a need to develop an expert seasonal adjustment infrastructure to deliver an efficient, high quality, low risk quality assurance programme.
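The kind of batch evaluation called for in these findings can be sketched in Python as a simple triage step that scores every adjusted series and surfaces only the problematic ones for expert attention. Everything here is an assumption made for illustration: the screening statistic (the share of variance in period-to-period changes explained by the season), the threshold, and the function and series names are hypothetical, and the statistic is not one of the X-12-ARIMA diagnostics.

```python
from statistics import mean, pvariance

def residual_seasonality_share(adjusted, period):
    """Share of the variance of period-to-period changes that is explained
    by the season alone (hypothetical screening statistic, between 0 and 1)."""
    changes = [b - a for a, b in zip(adjusted, adjusted[1:])]
    if len(changes) < period:
        raise ValueError("need at least one full period of changes")
    total = pvariance(changes)
    if total == 0:
        return 0.0  # a perfectly smooth series shows no residual seasonality
    season_means = [mean(changes[s::period]) for s in range(period)]
    fitted = [season_means[i % period] for i in range(len(changes))]
    return pvariance(fitted) / total

def triage(batch, period, threshold=0.5):
    """Score every adjusted series in the batch and return the names of those
    above the (assumed) threshold, worst first, together with all scores."""
    scores = {name: residual_seasonality_share(series, period)
              for name, series in batch.items()}
    flagged = sorted((name for name, score in scores.items() if score > threshold),
                     key=lambda name: -scores[name])
    return flagged, scores

# Toy batch with two seasons per year: one clean series, one still seasonal.
batch = {
    "clean": [100, 101, 102, 103, 104, 105, 106, 107, 108],
    "seasonal": [100, 110, 100, 110, 100, 110, 100, 110, 100],
}
flagged, scores = triage(batch, period=2)
```

A design of this shape keeps the expert in the loop: the automated pass only ranks series, and the decision on any flagged series remains with the analyst.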
The requirements of this infrastructure were split into a number of parts of the overall analysis process, which we have classified as pre-analysis, data preparation, analysis, and post-analysis.

3.1.1 Pre-Analysis

The pre-analysis stage consists of joint production/analysis planning: timetabling the review, allocating resources and agreeing the scope. This stage worked well already and was deemed outside the scope of the infrastructure project.

3.1.2 Data Preparation

Data preparation covers the creation of input files in order to run analyses. Input files include the time series to be seasonally adjusted, associated prior adjustment time series, parameter settings from different production systems, and multiple X-12-ARIMA specification files for each time series (for example, a specification file to replicate the production system and various user-defined specification files used in analysis). An improved infrastructure could reduce the potential for user error by automating the handling of the variety of production systems and software in use, and could speed up decision making in the analysis area by reducing manual input.

3.1.3 Analysis

Analysis of large volumes of series is needed to allow quick identification of problematic series, production of summary diagnostics and graphics, a reduction in the number of different software packages used during analysis, increased speed of re-estimation (i.e. after alteration of parameters) and general interaction of the analyst with the generated results.

3.1.4 Post-Analysis

Post-analysis processes are needed to automatically extract parameter file settings and graphics, and to improve the efficiency and user-friendliness of the reported results so that they fit well with business area production systems.

3.2 Proof of Concept for Analysis Infrastructure

The initial proof of concept for the new infrastructure was written in R and provides a series of improvements:
• Greater automation of data preparation (input and spec files)
• Summary output tables in HTML for all series, with links to graphics, key files and key indicators
• Generation of output data and parameter file settings

The improvements are already proving so tangible that this proof of concept is in use as part of the analysis process. However, there are risks: the software was written by statisticians, not computer programmers, and the automation, whilst it saves time, might reduce quality. Further improvements are planned to the interface layout/options, and the interactivity of the infrastructure.

3.3 The future solution for analysis infrastructure

The next stage of the project involves evaluation of potential infrastructure solutions: existing packages bought “off the shelf” versus an internally developed system versus a further developed proof of concept. Once a decision has been reached, based on assessment of each solution against statistical and technical criteria, the new solution will then need to be put into place and tested. That project is for the future. For the present, thanks to expert review, research and only limited development of current processes, analysis is stronger than before, and more prepared to handle the demands of the future.

Acknowledgements

The authors would like to thank all past and present colleagues in the Time Series Analysis Branch for the work that this paper attempts to summarise. The views presented in this paper are those of the authors and do not necessarily represent those of the Office for National Statistics.

References

Compton, S. (2008) ‘Populating Quarterly Constant Price Supply and Use Tables with Seasonally Adjusted Data’ International Association for Official Statistics Conference, Shanghai, October 2008. http://www.iaos2008conf.cn/c_paper.html

Dagum, E. B., N. Chhab and K. Chiu (1996) ‘Derivation and Properties of the X11ARIMA and Census X11 Linear Filters’ Journal of Official Statistics, 12, No. 4, pp. 329-347.

ESS Guidelines on Seasonal Adjustment. http://epp.eurostat.ec.europa.eu/pls/portal/docs/PAGE/PGP_RESEARCH/PGE_RESEARCH_04/ESS%20GUIDELINES%20ON%20SA.PDF

Findley, D. F., B. C. Monsell, W. R. Bell, M. C. Otto and B. Chen (1998) ‘New Capabilities and Methods of the X-12-ARIMA Seasonal-Adjustment Program’ Journal of Business and Economic Statistics, 16, No. 2, pp. 127-152.
