Data quality assessment guidelines

Draft

Department of Infrastructure, Planning and Natural Resources
Natural Resource Information Systems

Data quality assessment guidelines

Information Management Framework (IMF)
IMP02/10.1.3 Data quality assurance and management standards
IMF701
June 2003

Document Release Information

Reviewers
Name: Jane Deller-Smith, Mick Dwyer, Grant Robinson, Trev Mount, Jonathan Doig, Sue Irvine, Fred De Closey
Role: SIP Project Representative Data Custodian Representative CNR Data Administrator Representative NRIC Editorial Reviewer Regional Reviewer
Signature / Date: 7/3/2003

Project approvals
Name: Neil Bennett
Role: GM, Natural Resource Information Systems; GM, Information Management & Technology (Chief Information Officer); Group GM, Natural Resource Products
Signature / Date:

Audience
Role: Operational Data Custodian and Data Managers
Responsibility: To help in planning and execution of Data Quality Assessments
Role: Corporate Data Administrator
Responsibility: To review and maintain this document

Related documents
Document Name: How to write a business rule
Location: http://imf.dsnr/binarydata/IMF703.pdf

History
Version: v2d2
Date: 16 June 2003
Author-Editor(s): Adrian Richardson
Notes: Changes made after ALTIS comments in carrying out first DQA.

File details
Filename: IMF701_DatQltyAssessGuide_v2d2.doc
File server location: \\PAR01\DATA1\GROUP\IMTIIC\CORPORAT\IMT\COORD\NRIS\Projects\SIP\10.1_Data_Management\10.1.3 Data Quality Assurance\Guidelines and Templates\
Online location: http://imf.dsnr/binarydata/IMF701.pdf

Contents

1 Background
1.1 Purpose of document
1.2 What is a Data Quality Assessment?
1.3 Supporting the business process
1.4 Describing quality
2 Data Quality Assessment processes
2.1 Pre-assessment
2.2 Planning phase
2.2.1 Define Assessment
2.2.2 Design tests and expected results
2.2.3 Access Data
2.3 Execution phase
2.3.1 Compare expected with actual
2.3.2 Collate and Analyse Findings
2.3.3 Produce Recommendations
2.4 Closure phase
2.4.1 Project Closure
2.5 Post assessment
2.5.1 Assessment outcomes
3 Glossary
4 References

Department of Infrastructure, Planning & Natural Resources Data Quality Assessment Guide Draft

1 Background

1.1 Purpose of Document

This document is aimed at Data Managers who want to design and carry out a Data Quality Assessment (DQA). It is a guideline to be used during the implementation of the DSNR Information Management Framework (IMF). In addition to this document, a number of work instructions and templates are available from the IMF Intranet site to help carry out a DQA.

1.2 What is a Data Quality Assessment?

Information has a number of operational lifecycle activities, referred to as the information lifecycle (OIT, 2002 p4). The Data Quality Assessment is a separate process from any Quality Assurance (QA) or data verification. The latter are undertaken within each lifecycle phase, such as the QA during data collection, while the Data Quality Assessment is carried out to ascertain the quality of the data currently in ‘stock’ (see Figure 1).

A Data Quality Assessment:
• Compares the expected quality of the data, as set out in the Data Quality Plan, with the actual data quality.
• Provides recommendations on how to improve data quality when quality has failed to meet the benchmark.
• Is separate from any verification process carried out within an individual phase of the information lifecycle, such as the QA process used in the collection of data.
• Ideally is carried out by an independent body external to information processes.
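The comparison of expected against actual quality described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `QualityResult`, the function `compare_with_benchmarks`, the element names and all percentage values are hypothetical and do not come from the IMF templates or any Data Quality Plan.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class QualityResult:
    """Expected (benchmark) versus actual quality for one quality element."""
    element: str
    expected: float  # benchmark from the Data Quality Plan, in percent
    actual: float    # value measured during the assessment, in percent

    @property
    def gap(self) -> float:
        # Positive gap: actual quality falls short of the benchmark.
        return self.expected - self.actual

    @property
    def meets_benchmark(self) -> bool:
        return self.actual >= self.expected


def compare_with_benchmarks(benchmarks: Dict[str, float],
                            measured: Dict[str, float]) -> List[QualityResult]:
    """Pair every benchmark with its measured value (hypothetical helper)."""
    missing = [name for name in benchmarks if name not in measured]
    if missing:
        raise ValueError(f"no measurement for: {missing}")
    return [QualityResult(name, expected, measured[name])
            for name, expected in benchmarks.items()]


# Hypothetical figures, for illustration only.
benchmarks = {"attribute accuracy": 95.0, "completeness": 90.0}
measured = {"attribute accuracy": 97.5, "completeness": 82.0}

results = compare_with_benchmarks(benchmarks, measured)
failing = [r.element for r in results if not r.meets_benchmark]
print(failing)
```

Elements listed in `failing` are the ones for which the assessment would need to produce recommendations.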
IMF701_DatQltyInvestGuide_v2d2.doc Page 4 of 13

Figure 1. Data Quality Assessment within Information Lifecycle Phases (diagram: the Data Quality Audit and Remediation Process, comprising Data Remediation, Part 3 Data Management Plan, Approval / Priority Process, Data Quality Assessment and Audit Recommendations, shown against Data Collection, Data Store, Data Access and Historic data and the information lifecycle phases Collection, Storage, Access & Use and Archive/Disposal)

1.3 Supporting the business process

Regular DQAs will benefit business units by ensuring the following:
• Data custodians can properly assess the quality of the data they provide.
• The gap between actual and expected quality is known, allowing plans to be developed to close it.
• Management and clients understand the current quality and limitations of any dataset.

When implemented, the Information Management Framework (IMF) will ensure that information supports business requirements. Figure 2 shows the IMF modules and processes that are designed to ensure that data quality meets the business requirements.

Some business requirements for collecting and managing data are as follows:

Increased need for data quality practices
• Routine Work: covers routine monitoring programs and management of natural resources (including land, water, soil, vegetation etc.)
• Research and Development: includes research projects and the development and implementation of new techniques, ideas and equipment
• Management and Analysis: used for tasks not involving data collection; includes system management tasks and tasks collating data from other tasks for future planning
• Supports, or is required to carry out, a statutory function

The business requirements are defined by the use of the data and will change over time. Data might have any number of uses but, when defining the expected quality for the Assessment criteria, the primary use should be applied.

Figure 2. Data Quality Assessment within business processes (diagram: Data Quality management cycle; labels: Start, Original use, Define Business Requirements, Obtain Feedback, Use/analyse data, Define Purpose of Data, Quality Investigation Module, Update Metadata, Quality Plan (Benchmarks), Define Business/data rules, Data Quality Assessment processes, Update Data Management Plan, Update Benchmarks, System Spec, Standards, Prior audits)

1.4 Describing quality

The Data Quality elements in this assessment guideline align with the ANZLIC metadata guidelines (ANZLIC 2001 p85). This is to facilitate up-to-date and accurate ANZLIC metadata for datasets. Accordingly, all tests should use the terminology of the ANZLIC metadata quality elements, and each element should be addressed by at least one test. Help in deciding which Quality Element a business rule relates to can be found in ‘How to write a business rule’ (IMF703).

1. Lineage
   1.1. Source of data
   1.2. Processing steps
2. Positional accuracy
   2.1. Determining accuracy requires comparison of a recorded position against the actual position as defined by a known datum. Positional accuracy is determined by how close the represented position of a feature is to its actual position on the earth. This can be done through comparisons with aerial photographs or similar methods.
3. Attribute accuracy
   3.1. Determining accuracy requires comparison of the recorded entry against the actual value as defined by predefined standards. Attribute accuracy can include the classification method used to assign values and how well attributes conform to the classification, described as a percentage:
        % accuracy = 100 x (Number of accurate instances / Total number of instances)
4. Logical consistency
   4.1. How well the data fits within the logical rules of the data structure.
   4.2. Attribute logical consistency entails the testing of two or more functionally related attributes. The value of one attribute determines the valid values for its related attributes (If X then Y).
   4.3. Feature logical consistency is the testing for feature-to-feature relationships that are consistent with known or expected rules. For instance, Dryland Salinity must occur on land.
5. Completeness
   5.1. Completeness of spatial coverage tests expected spatial coverage against actual coverage, either as areas missed or stations covered.
   5.2. Completeness of temporal coverage applies to time series data and tests for gaps in the time recordings.
   5.3. Completeness of classification examines how exhaustive the classification system is and whether it contains generalisations.
   5.4. Completeness of verification examines the verification method for the data.
   5.5. Completeness of attribution examines whether each record is complete.

Other classifications that can be used:

6. Currency
   6.1. Beginning date
   6.2. End date (processing time between collection and storage if ongoing collection)
7. Status
   7.1. Maintenance and update frequency
   7.2. Progress

The final report might also present these broken up by regions or individual data collectors if appropriate.
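Three of the quality elements above lend themselves to simple automated tests. The Python sketch below is a minimal illustration, not the Department's test scripts: the function names, the field names (`site`, `surface`, `salinity`), the allowed classification codes and the sample records are all assumptions made for the example. Attribute accuracy is simplified here to conformance with an allowed set of codes; a real test would compare against the actual values defined by the relevant standard.

```python
from typing import Callable, Iterable, Mapping, Optional, Sequence, Set

Record = Mapping[str, Optional[str]]


def percent_accuracy(accurate: int, total: int) -> float:
    """% accuracy = 100 x (number of accurate instances / total number of instances)."""
    if total <= 0:
        raise ValueError("total number of instances must be positive")
    return 100.0 * accurate / total


def attribute_accuracy(records: Iterable[Record], field: str,
                       allowed: Set[str]) -> float:
    """Element 3: share of records whose value conforms to the classification."""
    rows = list(records)
    return percent_accuracy(sum(1 for r in rows if r.get(field) in allowed), len(rows))


def logical_consistency(records: Iterable[Record],
                        condition: Callable[[Record], bool],
                        consequence: Callable[[Record], bool]) -> float:
    """Element 4: 'If X then Y'. A record passes when X is false or Y is true."""
    rows = list(records)
    passed = sum(1 for r in rows if not condition(r) or consequence(r))
    return percent_accuracy(passed, len(rows))


def attribution_completeness(records: Iterable[Record],
                             required: Sequence[str]) -> float:
    """Element 5.5: share of records with every required field filled in."""
    rows = list(records)
    complete = sum(1 for r in rows
                   if all(r.get(f) not in (None, "") for f in required))
    return percent_accuracy(complete, len(rows))


# Hypothetical sample records, for illustration only.
records = [
    {"site": "A1", "surface": "land", "salinity": "dryland"},
    {"site": "A2", "surface": "water", "salinity": "dryland"},  # breaks the land rule
    {"site": "A3", "surface": "land", "salinity": "none"},
    {"site": "A4", "surface": "land", "salinity": None},        # incomplete record
    {"site": "A5", "surface": "land", "salinity": "xx"},        # code not in classification
]

accuracy = attribute_accuracy(records, "salinity", {"dryland", "none"})
# Rule from the text: Dryland Salinity must occur on land.
consistency = logical_consistency(
    records,
    condition=lambda r: r.get("salinity") == "dryland",
    consequence=lambda r: r.get("surface") == "land",
)
completeness = attribution_completeness(records, ["site", "surface", "salinity"])
print(accuracy, consistency, completeness)
```

Each function returns a percentage, so the results can be set directly against the benchmarks in the Quality Plan and, if needed, computed separately per region or data collector.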
IMF701_DatQltyInvestGuide_v2d2.doc Page 7 of 13 Department of Infrastructure, Planning & Natural Resources Data Quality Assessment Guide Draft 2 Data Quality Assessment Processes Each Data Quality Assessment (DQA) is an entirely separate project from all other projects. Establish a DQA according to Departmental Project Management Processes (PMP) http://projectoffice.imt.dlwc/guide/guide-default.cfm which must be followed. The internal processes of a Data Quality Assessment project are shown in Figure 3 below. Data Quality Assessment process diagram 2.2 Planning 2.1 Pre-Audit 2.3 Execution 2.4 Closure 2.2.1 Define audit 2.3.1 Compare items with actual 2.4.1 Project closure 2.5 Post Audit 2.2.2 Determine tests and expected results 2.3.2 Collate and analyse findings 2.2.3 Access data 2.3.3 Produce recommendations Figure 3. Data Quality Assessment Process diagram 3 2.1 Pre-Assessment The quality benchmark, or expectations, should have been established based on the defined purpose for the data. It is recommended that the Data Management Plan (DMP) be completed before any DQA to ensure that the purpose of the data better understood. IMF701_DatQltyInvestGuide_v2d2.doc Page 8 of 13 Department of Infrastructure, Planning & Natural Resources Data Quality Assessment Guide Draft 2.2 Planning phase 2.2.1 Define Assessment A DQA will only reveal the quality of a single dataset at the date of extraction or analysis. Only actively managed (or live) data should be investigated. 1. Inputs 1.1. Dataset name 1.2. Data Management Plan 1.3. Prior plans and lessons learnt 1.4. Data Quality Assessment Guidelines 2. Tools and Techniques 2.1. Project management processes 2.2. Tools or systems to be utilised for data extraction/mining 3. Outputs 3.1. Project Plan Dataset name Prior plans 2.2.1 Define assessment Project Plan DQA Guideline DMP 2.2.2 Design tests and expected results Document a specific test for each business rule before data extraction. 
The guideline Developing data quality testing scripts provides information on this process. This includes some generic tests which can be applied to all data. 1. Inputs 1.1. Prior test scripts 1.2. Quality Plan (benchmarks) 1.3. How to develop data quality test scripts & process discovery 2. Tools and Techniques 2.1. Assessment test template 3. Outputs 3.1. Test script How to Develop DQ Test Scripts Prior test scripts Quality Plan benchmarks 2.2.2 Determine tests and expected results Test Scripts 2.2.3 Access Data How the data is to be queried/retrieved/extracted is fundamental to the Assessment. The tool is defined in the Assessment plan. 1. Inputs 1.1. Assessment Plan 1.2. Test Scripts 2. Tools and Techniques 2.1. SQL Scripting 2.2. Data extraction or direct query 3. Outputs 3.1. Access to data IMF701_DatQltyInvestGuide_v2d2.doc Page 9 of 13 Assessm ent Plan 2.2.3 Access data Test Scripts Data for Asses Department of Infrastructure, Planning & Natural Resources Data Quality Assessment Guide Draft 2.3 Execution phase 2.3.1 Compare expected with actual 1. Inputs 1.1. Data 1.2. Test script 2. Tools and Techniques 2.1. Data manipulation tool/spreadsheet 2.2. Execute testing 3. Outputs 3.1. Test results Data 2.3.1 Compare items with actual Test Scripts Test Results 2.3.2 Collate and Analyse Findings 1. Inputs 1.1. Test results 2. Tools and Techniques 2.1. Review where actual fails to meet expected 2.2. Review where improvement/decline from previous Assessment 2.3. Review outstanding actions from previous Assessment 2.4. Meetings with Assessment team and data custodian to decide corrective action 3. Outputs 3.1. Action Sheets (DMP Part 3 Template) 2.3.3 Produce Recommendations 1. Inputs 1.1. Action Sheets 1.2. Assessment Report Template 2. Tools and Techniques 2.1. Priority estimating/setting 2.2. Cost estimating 2.3. Peer review 2.4. Management signoff (including Executive Data Custodian signoff) 3. Outputs 3.1. 
Assessment Report IMF701_DatQltyInvestGuide_v2d2.doc Page 10 of 13 Department of Infrastructure, Planning & Natural Resources Data Quality Assessment Guide Draft 2.4 Closure Phase 2.4.1 Project Closure 4. Inputs 4.1. Assessment Plan 4.2. Actual resources used 5. Tools and Techniques 5.1. Compare expected vs Actual Costs 5.2. Compare expected vs Actual Time 5.3. Compare expected vs Actual Resources used 5.4. Peer review 5.5. Management sign off (including Executive Data Custodian sign off) 6. Outputs 6.1. Lessons Learnt/Hindsight Report 2.5 Post Assessment 2.5.1 Assessment outcomes The Assessment provides recommendations on how to improve data quality. Each business unit must then develop the recommendations into action/project plans for implementation. Actions to be carried out post assessment include: 1. Update metadata records. 2. Update Data Management Plan (Part 3) with approved actions rising from Data Quality Assessment 3. Update Benchmarks and business rules 4. File records of Assessment (ensuring later access) 5. Provide/store Approved Assessment Report on intranet site IMF701_DatQltyInvestGuide_v2d2.doc Page 11 of 13 Department of Infrastructure, Planning & Natural Resources Data Quality Assessment Guide Draft 3 Glossary A full Glossary of terms can be found http://imf.dsnr/glossary/glossary-terms.cfm Archive Disposal DSNR DMP DQA Information lifecycle IMF Instance KRA Live data System Post Project, post actively management data stored for posterity and later reuse (cf disposal). Post Project, post actively management data which is no longer recoverable Department of Sustainable Natural Resources Data Management Plan Data Quality Assessment Comprises several phases: Collection, Storage, Access, Use and Disposal of information and is a continual process. Information Management Framework A single record within a dataset Key Result Area Refers to data where it is being actively managed. 
Taken from the point of where data is stored after verification until it is archived. A group of applications, including input/output devices and a database, used to help manage data through the information lifecycle (see above). Verification is a comparison against standards but is usually carried out at the end of the process to ensure quality of individual lifecycle phases. Verification Data IMF701_DatQltyInvestGuide_v2d2.doc Page 12 of 13 Department of Infrastructure, Planning & Natural Resources Data Quality Assessment Guide Draft 4 References ANZLIC, ANZLIC Metadata Guidelines Version 2,Feb 2001) [online] available from http://www.anzlic.org.au/asdi/metgidv2.pdf [accessed 30 Jan 2003] International Organisation for Standardisation. ISO 8402-1994. Quality Management and Quality Assurance, Geneva, Switzerland: ISO Press Natural Resource Information Management Strategy (NRIMS) Data Management Planning Guidelines. [online] Available from: http://www.nrims.nsw.gov.au/policies/plan_guide.html [accessed 28 Jan 2003] NSW Department of Information Technology and Management, Office of Information Technology.(OIT), Information Management Framework Guideline, (May 2002) [online] Available from: http://www.oit.nsw.gov.au/pages/4.3.14-IM-Framework.htm [accessed 30 Dec 2002] IMF701_DatQltyInvestGuide_v2d2.doc Page 13 of 13
