					   Data Sharing: the Economic and Social Data Service

           Dr. Sharon Bolton, Data Services Manager, UK Data Archive

 Paper presented at the GSS Methodology Conference, London, 23 June 2008

The Economic and Social Data Service (ESDS) was established in January 2003.
Funded by the Economic and Social Research Council (ESRC) and the Joint Information Systems
Committee (JISC), ESDS serves the UK higher education (HE) and further education (FE)
sectors, providing access to and support for key economic and social data. The ESDS is
a distributed service, bringing together four centres of expertise in data creation,
dissemination, preservation and use, to encourage the sharing and secondary use of
data via seamless and easy access to a range of disparate resources. The ESDS
collection currently includes around 6,000 datasets. Approximately 300 datasets are
added to the collection annually, and usage is high; around 50,000 dataset downloads
are made by registered users each year.

Data acquisition
Data are acquired from a range of sources, including official agencies and central
government; individual academic research; and research centres/survey companies (for
example the National Centre for Social Research). The collection includes a wide variety
of data, for example, qualitative, quantitative and mixed-methods microdata and time
series macrodata. Key data held by ESDS include Office for National Statistics (ONS)
series such as the Labour Force Survey and General Household Survey; large-scale
longitudinal studies such as the British Household Panel Survey and Millennium Cohort
Study; the cross-national Eurobarometer survey series; the IMF International Financial
Statistics, OECD Main Economic Indicators and other macrodata provided by
international organizations; and key qualitative social research collections, such as
the interview-based Family Life and Work Experience Before 1918, 1870-1973 (the
‘Edwardians’) study.

ESDS specialist services
The four centres that collaborate to provide the ESDS service comprise the UK Data
Archive (UKDA) and the Institute of Social and Economic Research (ISER), both based
at the University of Essex, and the Cathie Marsh Centre for Census and Survey
Research (CCSR) and Manchester Information and Associated Services (MIMAS), both
based at the University of Manchester. These centres work together to provide core
archiving facilities plus four specialist data services that offer enhanced support for
particular types of data: ESDS Government, ESDS International, ESDS Longitudinal and
ESDS Qualidata. The specialist services provide dedicated help desks and web pages
for each of the studies they cover, develop value-added documentation, and hold
workshops, conferences and seminars to bring together data creators and users.

Taking each service in turn: ESDS Government is led by CCSR and provides user
support for a range of large-scale government data. Activities undertaken include
training courses on key topics of interest, on specific statistical packages and on
methods of statistical analysis. Topic-related online course materials are produced, and
a range of teaching datasets has been developed, based on large-scale
government data such as the British Crime Survey and General Household Survey. The
overall aim of ESDS Government is to promote and facilitate the increased and more
effective use of government datasets in research, learning and teaching across a range
of disciplines.

ESDS International, jointly run by MIMAS at Manchester and the UKDA at Essex,
disseminates and supports both international macrodata and survey microdata. The
service provides web-based access to regularly updated international aggregate
datasets; help for users in locating and acquiring international survey data from other archives;
comprehensive support materials for macro datasets and teaching and learning
resources; case studies detailing data use; introductory courses to raise awareness of
international datasets and their research potential; an annual conference on issues
relating to international data research; and interactive visualisation interfaces for
international macro datasets.

The work of ESDS Longitudinal is undertaken jointly by UKDA and ISER, and links to
specialist support provided by the Centre for Longitudinal Studies at the Institute of
Education. The service offers training and workshops, and produces a range of
value-added data enhancements for the following collections: the 1970 British Cohort Study;
British Household Panel Survey; English Longitudinal Study of Ageing; Families and
Children Study; Longitudinal Study of Young People in England; Millennium Cohort
Study; and the National Child Development Study.

ESDS Qualidata is led by the UKDA at Essex and provides support for a range of social
science qualitative datasets. Value is added to data by enhancing user-oriented
information; allowing users to browse the content via web-based samplers of key
qualitative datasets; and by providing methodological information, e.g. a guide to
interviews providing information on how to use them as data resources. The service
focuses on acquiring digital data collections from purely qualitative and mixed-methods
contemporary research and from UK-based ‘classic studies’. Qualidata Online, the
interactive face of the service, has also been developed to help users move beyond
catalogue searching and data download, allowing web-based free-text and filtered
searching, browsing and retrieval of research data in real time. Increasingly, data in
the online system include not only traditional interview transcripts, but also audio and
image files. Interviews and related materials from four of ESDS Qualidata's classic
sociology collections are currently available online: Edwardians, Mothers and Daughters,
100 Families, and Mothers Alone.

Other ESDS resources
Beyond the specialist services, ESDS also provides support for a range of other data
resources, which, whilst not currently covered by one of the four services, are still
important and well-used resources. These include series such as the Workplace
Employee Relations Survey, the International Passenger Survey, and the British Election
Studies. The ESDS is also very keen to work with data creators, to obtain data series not
yet archived, in order to broaden the scope of the collection. Whether or not a resource
is covered by a specialist service, it will be securely preserved, its quality
maintained, and its use properly supported.

Preparation of data for secondary use
Core data archiving services, including the ‘ingest’ processing and preparation of data
for secondary use, are based at the UKDA (with the exception of the international macro
data series archived by ESDS International at MIMAS). Once a dataset is acquired for
the ESDS collection, the UKDA archiving team process the data and accompanying
documentation files in a systematic way, to ensure consistency and standardization of
product quality. For example, any data errors discovered are resolved in collaboration
with the data creators; value-added data enhancements may be produced (such as the
addition of variable-level metadata, and the compilation of extra materials to help users
navigate through the dataset).

Once ingest processing is complete, the dataset is released for secondary use. Each
dataset (including the international macrodata held by ESDS International) has a record
in the online ESDS data catalogue, which includes standard descriptive information,
links to user guides and documentation, an online variable list where appropriate, and
download/online access links. Information about available data may be found on the
ESDS website using a variety of means: users can browse by subject; major series
studies have dedicated webpages with a link to the record for each dataset in the series;
and pages detailing new releases are available. Once data have been located, they are
generally downloadable (by registered ESDS users) in a range of popular formats, and
some data may be explored using online browsing tools, including the aforementioned
Qualidata Online and the ESDS Nesstar online system. Nesstar allows registered users
to perform a variety of analyses on selected quantitative data; they can subset, tabulate,
compute new variables, create graphics and use a mapping tool for data visualization.

Accessing data and ensuring responsible use
Potential users who require access to data must register with ESDS, and will first need
an Athens account. (By August 2008 most universities will in fact have switched from
Athens to Federated Access Management (Shibboleth) for user authentication, though
ESDS will retain facilities for those who stay with Athens.) There are currently
more than 40,000 registered ESDS users. The majority of those users are within the UK HE and
FE sectors, but anyone can register with ESDS; however, some access restrictions are
in place for particular data/users. Users are also required to register the purpose for
which data will be used, including project details. With permission, we can share these
details with other users, so researchers (and data creators) can gain a good idea of the kind
of research taking place in the field. In addition, when users from UK HE and FE register
with ESDS they also gain access to the ESRC Census Programme, which provides
access to the 1971, 1981, 1991 and 2001 UK Censuses.

In order to complete registration, the user must agree to an End User Licence (EUL),
which outlines the terms and conditions of data use. As part of the EUL, users agree to
preserve the confidentiality of, and not attempt to identify, individuals, households or
organisations in the data, and not to pass the data on to any person who is not also a
registered ESDS user. There are strict sanctions in place if these conditions are
breached. All users are encouraged to follow the procedures laid down in the ‘Guide to good
practice: microdata handling and security’ document, available on the ESDS website, and
it is a primary aim of the service to encourage data users to act responsibly and observe
security at all times. The stringent confidentiality checks undertaken on all datasets
prior to ingest processing, and the fact that ESDS works closely with data creators at all
stages of the deposit process, also help to minimise the risk of disclosure.

Approved Researcher and Special Licence arrangements
Over the past couple of years, there has been increased concern in many circles over
data confidentiality. The ONS Microdata Release Panel was established to examine
disclosure risk with regard to ONS data releases, which led to some reduction in detail
for data held at ESDS, including age banding, and the aggregation of geographical
levels and employment categories. In order to ensure that sufficiently detailed data
remain available to support research, ONS and ESDS together developed the ‘Special
Licence’ system in 2005, under which data creators can share more detailed and sensitive
data with ‘trusted’ users. ESDS was included in the ONS Approved Researcher and
Statistical Disclosure Control Working Groups, in order to represent data users and to
help find the correct balance between data needs and the reduction of disclosure risk. In
practical terms, ESDS now holds two copies of some ONS datasets: one that includes
more detailed data and is available only under Special Licence conditions; and one with
less risk of disclosure (albeit with the reduced levels of detail indicated above) available
to registered users under the standard End User Licence. Other organisations have also
begun to set up special licensing arrangements with ESDS, based on the ONS model.

When the Statistics and Registration Service Act 2007 came into force on 1 April 2008, the
ONS Special Licence system changed accordingly. Access to ONS Special Licence data
is now provided via a new legal framework requiring prior accreditation by the Statistics Board
as an ‘Approved Researcher’. Registered ESDS users, who already abide by the End
User Licence agreement, can order Special Licence data via the online ESDS Data
Catalogue, where a link will take them first to the Approved Researcher forms for
accreditation. Potential users must fulfil several stringent conditions and provide:
evidence that they are ‘fit and proper’ persons; details about the purpose of the
research; a signed declaration that they understand the confidentiality obligations owed
to these data, including their physical security; evidence of previous research projects
and publications, and possibly details of a senior researcher who can vouch for them;
information about the intended use of the data; a justification for access; and a summary
of planned outputs. Once accredited as Approved Researchers, users may
access Special Licence data. Whilst the Special Licence system has added an
administrative burden to the work of both ONS and ESDS, increasing automation is
under development. Future technologies to provide streamlined access to sensitive data
are currently being investigated, including the provision of secure data services using the
Virtual Microdata Laboratory model, the subsetting of larger datasets, and variable-level
access control.

Research and development – the future of data services
This paper has provided an introduction to the wide range of services that ESDS
currently provides. In an environment where the needs of data users and researchers
are changing fast, ESDS has to respond to developments and remain at the cutting edge
of data service provision. With this in mind, ESDS undertakes an extensive and ongoing
programme of research. For example, a self-archiving service for researchers is
currently in development, led by the UKDA, and technology transfer from that project
will enable a host of other developments to feed into the enhancement of services to
users. Also, the development of Data Exchange Tools (the DExT project) for data format
transfer and preservation will enable the UKDA to carry out more efficient and
streamlined ingest processing of datasets acquired for the ESDS collection; digitization
and data visualization advances are also being investigated. Beyond purely technical
developments, ESDS is also committed to finding innovative ways to improve its service
to users, by exploring new formats and Web 2.0 technologies to provide enhanced
learning and teaching aids. Guidance and support for researchers prior to data deposit,
and help to manage their data throughout the project lifecycle, are also now available;
much ground-breaking work on these concepts continues under the UKDA’s Rural
Economy and Land Use Data Support Service (RELU-DSS) programme, and is
finding wider application across the service. The ESDS aims to remain at the forefront of
data service provision, and works tirelessly to ensure first-class support will continue to
be available to data users.

For further information on ESDS, and the services and developments covered in this
paper, readers should visit .
