An Applied Data Quality Analysis for Traffic Crash Location Information

Deanna A. Peabody, Michael A. Knodler Jr., Ph.D.
College of Engineering, University of Massachusetts Amherst

Abstract

In an effort to reduce the frequency and severity of motor vehicle crashes, it is imperative that high crash locations are accurately identified. This project serves as an in-depth analysis of a sample of crash reports that were not successfully geo-located. Specifically, this research will include both a quantitative and a qualitative review of crash reports to: 1) characterize why reports are not geo-located, 2) attempt to locate these crashes using additional methods, 3) categorize common problems that prevented these crashes from being located originally, and 4) provide recommendations for changes to the crash data collection process to increase the percentage of crashes that are successfully located. Initial results indicate that many reports do not provide sufficient information in the various location fields on a crash report form. For example, one common error is the recording of a roadway name with no other indication of the specific crash location. An additional element being investigated with this research effort is the extent to which the crash narrative and diagram can be employed to help locate the crash.

The specific reasons for non-geo-located crashes are being identified, which in turn will translate into improved data quality for traffic crash location information. Possible outcomes include changes to the crash report form, new training for police completing the form and for Crash Data System (CDS) data entry staff at the Registry of Motor Vehicles, and new technologies to scan information at the crash scene and electronically submit the crash data to the Registry of Motor Vehicles.

Research Objective

Since transportation safety professionals use the data collected on crash reports to improve traffic safety, it is crucial that officers, serving as the front lines of data collection, fill out the form accurately and in sufficient detail at the scene of every crash. By analyzing a sample of 2005 non-geo-located reports, this research aims to identify inconsistencies and data shortcomings in crash report data, which in turn will lead to improved data quality for traffic crash location information.

Crash Report Location Information

Although the information in the location section of the crash report form is valid and in the right section, there is not enough information to locate exactly where the crash occurred. The narrative and diagram do, however, provide additional information on the location of the crash by including a landmark. By searching Google Earth, the address of DeSantis Garage was found and the location of the crash was pinpointed. See the map below.

Progress and Results

A concern that has become apparent with crash data quality in Massachusetts is poor location information. Since information sometimes varies or is vague, about 25% of crashes cannot be successfully geo-located, making the exact location of these crashes unknown. One hundred non-geo-located crashes were reviewed to identify trends as to why the exact location could not be pinpointed.

Enough information was included in the location section: Yes 52%, No 48%
Additional information was presented in the narrative section of the crash report form and/or diagram: Yes 53%, No 47%
All the information in the location section of the crash report is valid: Yes 44%, No 56%
The crash location was determined: Yes 57%, No 43%

Analysis

Results show that 48% of non-geo-located crash reports simply do not provide enough information. With some frequency (53%), additional information is presented in the narrative and diagram of the crash report; however, this information is not entered into the Crash Data System.
Of the reports that contained additional information in the narrative and/or diagram, 34% still could not be located, 45% were found using this additional information, and 21% could be found without it. In addition, 56% of the crash reports contained invalid information, but generally the information was still understandable and clear. Although 57% of these locations were found using Google Maps, it is significant that these locations cannot currently be systematically geo-located by MassHighway. Systematic geo-location is needed to determine which specific locations have a high number of crashes and what controls and safety devices already exist there, so that additional improvements can be made.

Future Work

MassHighway uses crash data to better understand all elements of traffic crashes. In addition to poor location information, other issues include a high rate of missing injury severity data, poor data quality for engineering-related fields, and data entry errors. To enhance crash data quality in Massachusetts, MassSAFE is conducting an audit of a randomly selected sample of 2005 police crash reports to identify which fields are being completed incorrectly and result in poor data. A panel of experts on crash data, including representatives from state police, local police, and the Registry of Motor Vehicles, will perform a manual review of each crash report. The final report will serve as a basis for understanding data quality issues in Massachusetts.

This work is supported in part by the National Science Foundation under NSF award number 0552548. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the National Science Foundation.