TIER ONE PERFORMANCE SCREEN

Tonia S. Heffner*, Len White, and Kimberly S. Owens
U.S. Army Research Institute
Arlington, VA 22202

ABSTRACT

This research was designed to identify the most promising non-cognitive measures to screen Army applicants. Soldiers were administered a non-cognitive test battery at Army Reception Battalions. Performance, attitudinal, and attrition data were captured from these same Soldiers at the end of Initial Military Training, in their first unit of assignment, and from Army databases. The results demonstrate that non-cognitive measures increase the prediction of the outcomes beyond that which can be achieved with the existing selection tools. Preliminary results from an initial operational test and evaluation continue to support these conclusions. The results have implications for tailoring applicant selection to current Army needs.

1. INTRODUCTION

To meet current and future missions, the Army needs flexibility within the personnel system to recruit and access applicants with the greatest potential to succeed in the Army. Such flexibility allows the Army to adapt to changing economic, social, and global conditions which may impact the recruiting environment. In more favorable recruiting markets, the number of applicants exceeds the number of Soldiers needed and the emphasis may be to “screen out” applicants with the lowest potential. When the recruiting environment is very challenging, selection tools can help expand the recruiting market and “screen in” high potential individuals. As conditions change, a flexible personnel system must continue to maintain the high standards currently in place for applicant selection.

To predict a recruit’s potential for lower attrition and higher performance, the Army uses educational attainment and the Armed Services Vocational Aptitude Battery (ASVAB) as screening tools. Educational Tier 1 applicants, primarily high school diploma graduates, are more desirable recruits because analyses have shown that they have lower attrition than Tier 2, or non-high school diploma graduates (Strickland, 2004; Trent & Lawrence, 1993). Although educational attainment does predict attrition, it has a much weaker relationship with other motivational outcomes; e.g., physical fitness, effort, leadership.

The ASVAB was developed to predict an applicant’s likelihood of being trained to proficiency on the necessary knowledge and skills to perform an Army job. Extensive research demonstrates that the ASVAB performs exactly as it is intended to perform (Campbell & Knapp, 2001); it is an excellent predictor of the “can-do” (proficiency) aspects of performance. However, the ASVAB is not a strong predictor of the “will-do” or motivational outcomes; e.g., non-academic attrition, effort, physical fitness. To improve the selection of new Soldiers and to increase flexibility within the personnel management system, the Army needs to predict which applicants have not only the aptitude to become technically proficient, but also the motivation to diligently perform at a high standard.

Findings from multiple research efforts have demonstrated that ARI’s non-cognitive measures add to the capability of educational attainment and the ASVAB for predicting attrition and Army performance components such as job effort, leadership, and personal discipline (Campbell & Knapp, 2001; Ingerick, Diaz, & Putka, 2009; Knapp, McCloy, & Heffner, 2004; Knapp & Tremble, 2007). Non-cognitive is a comprehensive term which encompasses a broad spectrum of assessments. For this research, non-cognitive measures were limited to temperament, or personality, and vocational interest assessments. The purpose of the research described in this paper was to conduct a large scale, longitudinal examination of a battery of state-of-the-art non-cognitive measures for predicting valued Army outcomes. The focus of this research is Tier 1 nonprior service applicants. The Tier 2 attrition screen (TTAS), which uses a non-cognitive measure, already is in operation for selection of Tier 2 applicants (White, Young, Heggestad, Stark, Drasgow, & Piskator, 2004). Prior service Soldiers were excluded because they have different knowledge about and expectations of military service as well as an established record of success.

2. METHOD

2.1 Reception Battalion Predictor Testing

A total of 8,103 Tier 1 nonprior service Soldiers participated in the research. The vast majority of the Soldiers had 2-5 days of time in service when they completed the test battery. The sample was mostly male (78%). Of the sample, 76% reported their race/ethnicity as White, 16% reported African-American, and 15% reported Hispanic. (The numbers sum to greater than 100% because race and ethnicity are two distinct questions.) An emphasis was placed on gaining participation from Soldiers assigned to the military occupational specialties (MOS) of Infantryman (11B), Armor Crewman (19K), Military Police (31B), Health Care Specialist (68W), Motor Transport Operator (88M), and Wheeled Vehicle Mechanic (91B) and from all components (Regular Army [RA], Army Reserve [USAR] and Army National Guard [ARNG]).

The Soldiers were administered the paper and pencil non-cognitive test battery in large classrooms at the Reception Battalions. They were informed that the research was voluntary and that the U.S. Army Research Institute (ARI) would track their progress through their first term of enlistment to include administering measures at the end of Initial Military Training (IMT), at about 18 months time in service (TIS), and at about 36 months TIS. The test administration took about two hours.

Table 1. Predictor Sample Size by MOS and Component

MOS       RA    ARNG    USAR
11B     1177     612       0
19K      447     113       0
31B      616     580     288
68W      114     148      45
88M      162     262      88
91B      186     181     105
Other   2668    1873    1113

3. PREDICTOR MEASURES

3.1 Non-cognitive Test Battery

The non-cognitive test battery consisted of six measures (see Knapp & Heffner, 2009) but this discussion will be limited to one temperament and one vocational interest measure, described below.

TAPAS. The Tailored Adaptive Personality Assessment System (TAPAS) is a 15 dimension personality measure of which 7 are described in Table 2 (Stark, Chernyshenko, & Drasgow, 2010). These dimensions were selected based on past research and job analyses as likely to be predictive of enlisted performance outcomes. The TAPAS is normally administered as an adaptive test and was designed to be administered on the same computer platform as the ASVAB so the transition to TAPAS would be seamless to the applicant. The advantage of a computer-adaptive test is that it allows each test to be tailored to an individual’s level of a particular attribute for more precise measurement while simultaneously increasing testing efficiency and providing improved test security. The TAPAS uses a paired forced-choice approach. Each applicant is required to select one of two statements that is most like him or her. The item bank of more than 600 statements and the delivery software have been carefully developed so that no pair of displayed statements has one that is discernibly more socially desirable than the other. The statements are independent so each applicant will be presented with different pairs of statements, which makes each applicant’s test virtually unique. The determination of the subsequent pairings is regulated by the previous responses of the applicant. A sample item is presented in Figure 1.

Table 2. Definitions of TAPAS Dimensions

Achievement: Individuals scoring high are hard working, ambitious, confident, or resourceful.
Non-delinquency: Persons scoring high tend to comply with current rules and expectations; they dislike change and do not challenge authority.
Even-tempered: Persons scoring high tend to be calm, level headed, and stable.
Intellectual efficiency: High scoring individuals process information quickly and are described by others as knowledgeable, astute, or intellectual.
Optimism: Persons scoring high have a general emotional tone reflecting joy or happiness.
Physical Conditioning: Persons scoring high are interested in physical activities.
Attention-seeking: Individuals scoring high seek social stimulation; they are loud, loquacious, entertaining, and even boastful.

Computer-adaptive testing was not feasible at the Reception Battalions, so a carefully constructed paper and pencil version of the TAPAS was created for this research. Care was taken to capitalize on the advances of computer adaptive testing such as pairing statements from different dimensions, balancing statements for social desirability, and ensuring that statements reflected equal levels of the underlying dimensions.

Dimension    Statements (always displayed in pairs)
Dominance    I am not one to volunteer to be group leader, but I would serve if asked.
Optimism     My life has had about an equal share of ups and downs.

Figure 1. Sample TAPAS items

The TAPAS provides two composite scores based on the seven dimensions listed above. The “can do” score reflects learning-based aspects of performance. The “will do” score reflects the motivational aspects of performance.

Work Preferences Assessment. The Work Preferences Assessment (WPA) is a vocational interest measure designed to assess preferences for various work activities, work environments, and learning opportunities. It is designed to assess “fit” between an applicant and the types of jobs available in the Army. The content is based on Holland’s (1997) theory of personality and work environments which posits that jobs can be rated and profiled on 6 dimensions: Realistic, Investigative, Artistic, Social, Enterprising, and Conventional (see Figure 2 for sample statements). The Soldier rates each statement based on how important it is to his or her ideal job. Although the WPA has some potential utility for selection, its real strength lies in the potential to improve classification.

Dimension       Statement
Realistic       A job that requires me to get my hands dirty.
Investigative   A job that requires me to research topics and write reports about what I find.
Artistic        A job that requires me to come up with creative ideas.
Social          A job in which I can learn how to communicate better with people.
Enterprising    A job in which advancement in the organization is valued.
Conventional    A job in which I can learn more about managing an office.

Figure 2. Sample WPA statements.

3.2 Administrative Data

From Army databases, demographic information was acquired including gender, race, MOS, ASVAB scores, and Armed Forces Qualification Test (AFQT) scores. The AFQT is a subscore of the ASVAB which is used to determine enlistment eligibility and other enlistment considerations.

Applicants are assigned to one of six categories based on their AFQT scores. Those above the mean are assigned to Categories I (highest scorers), II (above average), and IIIA (slightly above average). Applicants scoring in Categories I-IIIA are given priority for accession over those scoring below the mean. Applicants below the mean are assigned to Category IIIB (slightly below average) and Category IV (lowest acceptable category). Category IV enlistments are greatly restricted and Category V applicants are not enlistment eligible.

4. OUTCOME ASSESSMENT

4.1 IMT Participants and Procedure

A subset of the Soldiers in the target MOS who were participants in the predictor data collection (n = 2,294) also participated in the outcome data collection approximately two weeks before graduation from IMT (see Table 3). The sample was mostly male (91%), 86% reported their race/ethnicity as White, 7% reported African-American, and 7% reported Hispanic. Males were slightly overrepresented because 11B and 19K are restricted to males only. For AFQT category, 32% were Category I or II, 28% were Category IIIA, 36% were Category IIIB, and 3% were Category IV. The Soldiers were tracked by training company rather than individually and the entire company took the outcome measures. The assessments were administered, in groups no larger than 40 Soldiers, via computers using ARI’s Interform software. Outcome data collection took 90 to 120 minutes per group and was proctored by project staff.

Table 3. IMT Sample Size by MOS and Component

MOS      RA    ARNG    USAR
11B     551     122       0
19K     354     113       0
31B     316     269     132
68W      42      71      22
88M      23      35      15
91B     102      78      40

4.2 In-unit Participants and Procedure

A subset of the Soldiers (n = 1,233) were administered the computerized outcome measures at about 18 months TIS (see Table 4). The sample was mostly male (78%), 77% reported their race/ethnicity as White, 14% reported African-American, and 16% reported Hispanic. For AFQT category, 41% were Category I or II, 20% were Category IIIA, 34% were Category IIIB, and 5% were Category IV. With the assistance of the Human Resources Command, we were able to track the Soldiers to their posts and requested them, by name, to participate in the 90-120 minute testing sessions, which were proctored by project staff.

Table 4. In-unit Sample Size by MOS and Component

MOS       RA    ARNG    USAR
11B      184      21       0
19K       60       3       0
31B      108      49      28
68W       13      14       7
88M       23      14      10
91B       18      10      14
Other    374     160     123

4.3 Outcome Measures

Job Knowledge. The Soldiers took one or two knowledge tests depending on when they were tested and on their MOS. Both tests consisted of multiple choice, matching, ordering, and drag and drop (i.e., moving items around with a mouse on the computer screen) type questions. Graphics were liberally included in the tests to reflect the procedural aspects of some tasks and to decrease the reading demand. Those Soldiers who were in one of the target MOS listed above took an MOS-specific test. The number of possible points on these tests ranged from 90 to 168. For those Soldiers assessed at the end of training, the test content was restricted to content from the program of instruction (POI) for the course whereas the tests administered at about 18 months time in service included questions related to a broader scope of skill level 10 tasks. Soldiers tested in IMT only took the MOS-specific test. The second knowledge test, which had 126 possible points, assessed warrior tasks and battle drills (WTBD), or Soldier common tasks, and was administered to every Soldier tested in-unit. Those in a target MOS also took the MOS-specific test.

Army Life Questionnaire. The Soldiers in IMT and in-unit completed a survey of their attitudes including commitment, attrition thoughts, career intentions, satisfaction with their jobs and the Army, and their perceived fit with their MOS and the Army. The Soldiers self-reported their Army Physical Fitness Test (APFT) and Basic Rifle Marksmanship (BRM) scores. The Soldiers also reported their disciplinary incidents. Discipline was very broadly defined to include formal counseling or being placed on restriction as well as Article 15s.

Ratings of Performance. Ratings of Soldier performance at the end of IMT came from two different sources, the Soldier’s peers (up to four raters) and the Soldier’s Drill Sergeants or Platoon Sergeants (two raters). For the in-unit performance ratings, only first line supervisors provided ratings.

All Soldiers, in both IMT and in-unit, were rated on 8 to 14 dimensions of common performance such as peer leadership, commitment & adjustment, job-specific task performance, and common task performance. For those Soldiers in the target MOS, the raters also were asked to complete the job-specific task performance rating scales for their Soldiers. The number of dimensions ranged from 5 to 9.

4.4 Administrative Data

Administrative data were collected from a variety of Army databases. The administrative data were collected for all Soldiers who participated in the predictor assessment regardless of whether they participated in the IMT and/or in-unit data collections.

Attrition. Attrition data for the Regular Army Soldiers were provided at 3 month intervals by the U.S. Army Accessions Command. Attrition data for this sample were not available for Army Reserve and National Guard Soldiers.

Training Performance. We collected training performance/test scores from the Resident Integrated Training Management System (RITMS). Each Advanced Individual Training (AIT) course is divided into performance blocks which represent training for a specific topic area. Either a performance-based or test-based assessment was completed at the end of each performance block. To determine the training performance score, we averaged across the performance blocks. Training performance was not available for all MOS.

Training Completion. From the Army Training Requirements and Resources System (ATRRS), we collected data on the number of course restarts a Soldier had and his or her graduation status.

5. RESULTS

Overall, the results demonstrated that TAPAS can contribute to the prediction of Army performance and attrition at the end of IMT (Knapp & Heffner, 2009; Knapp & Heffner, 2010) and in-unit (Knapp, Owens, & Allen, 2010).

5.1 IMT Results

Figures 3-6 provide a sample of the IMT results. Figures 3 and 4 illustrate how TAPAS and WPA, respectively, can improve the prediction of attrition and performance over and above the AFQT. TAPAS incremented the AFQT only a small amount for training exam grades, but incremented the AFQT significantly for the other outcome measures, which are largely motivation-based. Likewise, the WPA did little to increment the AFQT for the MOS-specific job knowledge test, but added significantly to the AFQT for the prediction of the motivation-based outcomes. Of particular note is the large increment that the WPA provided over AFQT for Army fit. This result supports the proposition that the WPA can aid with classification.

Figure 3. Initial research findings for IMT: TAPAS increases prediction of potential beyond AFQT.

Figure 4. Initial research findings for IMT: WPA increases prediction of potential beyond AFQT.

To better illustrate the impact of non-cognitive assessment to improve applicant selection and function as a recruiting market expander, the results will be presented within AFQT categories. TAPAS “passing” scores were defined by two different cut scores. Passing at the 50th percentile means that the Soldier scored above the mean on both the “can do” and “will do” composites. Approximately 38% of the Soldiers were labeled as passing. The 50th percentile reflects use of TAPAS as a market expander; i.e., does the applicant have the potential to perform like a Soldier in a higher AFQT category? Scoring above the 10th percentile on both composites resulted in 87% of the Soldiers “passing.” The 10th percentile reflects use of TAPAS as a “screen-out” tool; i.e., is the applicant likely to perform poorly?

Soldiers in AFQT Category IIIB who passed the TAPAS at the 50th percentile had significantly lower attrition rates than the Soldiers in AFQT Category IIIB who failed (7.8% vs. 14.2%; see Figure 5). Further, those in IIIB who passed at the 50th percentile had lower attrition than Soldiers in any other AFQT Category. For Soldiers in the lowest acceptable AFQT Category, IV, those who passed TAPAS at the 10th percentile had training exam scores similar to Soldiers in higher AFQT Categories and markedly higher scores than those who failed TAPAS (see Figure 6).

Figure 5. Comparison of attrition by AFQT Category and TAPAS score (50th percentile).

Figure 6. Comparison of training course grades by AFQT Category and TAPAS scores (10th percentile).

5.2 In-unit Results

The results for outcomes assessed in-unit were consistent with the results in IMT. A low sample size for AFQT Category IV Soldiers precluded any analysis for this group. For Soldiers in AFQT Category IIIB, those who passed TAPAS at the 50th percentile had lower attrition rates at 21 months TIS than those who failed TAPAS (see Figure 7). Likewise, Soldiers in AFQT Category IIIB who passed the TAPAS at the 50th percentile had fewer reported disciplinary incidents than those who failed TAPAS or Soldiers in any other AFQT Category.

Figure 7. Comparison of 21 month attrition by AFQT category and TAPAS score (50th percentile).

Figure 8. Disciplinary incidents by AFQT category and TAPAS score (50th percentile).

5.3 Results Summary

This research supports earlier findings that non-cognitive measures contribute to prediction of “can do” performance and are strong predictors of “will do” performance (Campbell & Knapp, 2001; Ingerick, Diaz, & Putka, 2009; Knapp, McCloy, & Heffner, 2004; Knapp & Tremble, 2007). Further, these results demonstrate that non-cognitive measures, in combination with the AFQT, can be used to “screen-in” applicants likely to perform better and have lower attrition than their AFQT category alone would predict and to “screen-out” low-motivated applicants who are likely to be low performers and high attrition risks.

6. INITIAL OPERATIONAL TEST AND EVALUATION

In response to these findings, the Deputy Chief of Staff, G-1, implemented TAPAS for an initial operational test and evaluation (IOT&E). The TAPAS is being used as a Tier 1 enlistment eligibility test. To evaluate the operational use of TAPAS, ARI and US Army Accessions Command are conducting an evaluation. Testing at all Military Entrance Processing Stations (MEPS) began FY10 Q2. To date, more than 100,000 applicants have taken the TAPAS. The WPA is undergoing final testing by the Defense Manpower Data Center (DMDC) to be added to the ASVAB platform and testing is expected to begin FY11 Q2.

Outcome data are being collected to validate TAPAS as an enlistment eligibility test, to define appropriate pass-fail scores, and to examine use as an assignment tool. The Soldiers who took TAPAS at the MEPS are being tracked and a subset will be assessed four (4) times during their first and second terms of enlistment using a research design similar to the one described above. The four changes to the research design are: 1) the outcome measures are being administered by the Drill Sergeants or Platoon Sergeants, 2) Signal Support Specialist (25U) and Human Resource Specialist (42A) were added to the list of target MOS, 3) the WTBD knowledge test is being administered in IMT along with the MOS test, and 4) peer ratings are not being collected in IMT to reduce the burden on the Drill Sergeants and Platoon Sergeants.

6.1 Preliminary Results

As of FY10 Q4, a small number of Soldiers (n = 429) took the TAPAS and have completed IMT. The length of the Delayed Entry Program, i.e., the time from signing a contract to beginning training, has delayed collection of the outcome data. Of the participants who have completed IMT, the sample was mostly male (81%), 65% reported their race/ethnicity as White, 11% reported African-American, 14% reported Hispanic, and 20% did not respond. For AFQT category, 39% were Category I or II, 20% were Category IIIA, 24% were Category IIIB, and 16% were Category IV. The large number of Category IV Soldiers is an intentional overrepresentation to ensure an analyzable sample size.

Table 5. IOT&E Sample Size by MOS

MOS          n
11B/C/X    198
19K          1
25U          7
31B         43
42A         28
68W         64
88M         70
91B         18

The following results should be interpreted very cautiously as they are based on very small sample sizes. A second caution applies to the attrition results. In order to have a larger sample size, the data were analyzed at 3 months time in service. For most Soldiers, this represents the completion of Basic Combat Training and little to no MOS training.

The pattern of results for 3 month attrition parallels what was found previously (see Figures 5 & 7). Soldiers in AFQT Category IIIB who passed TAPAS had lower attrition than those who failed TAPAS. Attrition rates for those who passed TAPAS in AFQT Category IIIB were similar to those in other AFQT Categories.

Figure 9. IOT&E 3 month attrition by AFQT score and TAPAS score

Training grades for those AFQT Category IIIB Soldiers who passed TAPAS were slightly higher than the Soldiers who failed TAPAS and similar to the training grades of IIIB Soldiers. The sample size, however, is too small to draw any conclusions.

Figure 10. Training grades by AFQT category and TAPAS score (50th percentile).

CONCLUSION

Our results indicate that TAPAS improves the prediction of Soldier attrition and performance beyond what is possible with the current enlistment screens, the ASVAB and education credentials. Further, implementation of TAPAS has the potential to provide numerous benefits to the Army. From a “screen-out” perspective, applicants who have a lower likelihood of completing training or the first term of enlistment can be excluded from consideration, thus reducing training costs. From a “screen-in” perspective, the results show that TAPAS has potential to help identify applicants who will perform like applicants in the next higher AFQT category, thus better identifying which applicants from a large group should be permitted to access and allowing for market expansion.

The use of non-cognitive assessments to improve selection is relatively new, but the potential is great. In addition to applicant selection, TAPAS is currently being evaluated for in-service selection for special assignments, officer selection, and applicant MOS assignment. TAPAS also is being evaluated for Air Force applicant selection.

ACKNOWLEDGEMENTS

The authors would like to thank Drasgow Consulting Group, Dr. Fritz Drasgow, Project Director, for their efforts under an SBIR Contract to support the development of the Tailored Adaptive Personality Assessment System. We also thank the many contractors from the Human Resources Research Organization, Dr. Deirdre Knapp, Project Director, for their support for the measure development, data collection, and data analyses. We appreciate the data collection assistance provided by numerous ARI colleagues for this large effort. Finally, we want to acknowledge the tens of thousands of Soldiers, noncommissioned officers, Army civilians, and officers, particularly in the target MOS, who provided their time, patience, and organizational skills to make this research possible.

REFERENCES

Campbell, J. P. & Knapp, D. J. (2001). Exploring the limits in personnel selection and classification. Mahwah, NJ: Lawrence Erlbaum.

Ingerick, M., Diaz, T., & Putka, D. (2009). Investigations into Army enlisted classification systems: Concurrent validation report (Technical Report 1244). Arlington, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Knapp, D. J., & Heffner, T. S. (2009). Expanded Enlistment Eligibility Metrics (EEEM): Recommendations on a non-cognitive screen for new Soldier selection (Technical Report 1267). Arlington, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Knapp, D. J., & Heffner, T. S. (2009). Validating Future Force Performance Measures (Army Class): End of training longitudinal validation (Technical Report 1257). Arlington, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Knapp, D. J., McCloy, R. A., & Heffner, T. S. (2004). Validation of measures designed to maximize 21st century Army NCO performance (Technical Report 1145). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Knapp, D. J., Owens, K. S., & Allen, M. T. (2010). Validating Future Force Performance Measures (Army Class): In-unit performance longitudinal validation (Final Report FR-10-38). Alexandria, VA: Human Resources Research Organization.

Knapp, D. J., & Tremble, T. R. (2007). Concurrent Validation of Experimental Army Enlisted Personnel Selection and Classification Measures (Technical Report 1205). Arlington, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Moriarty, K. O., Campbell, R. C., Heffner, T. S., & Knapp, D. J. (2009). Validating future force performance measures (Army Class): Reclassification test and criterion development (Research Product 2009-11). Arlington, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Stark, S., Chernyshenko, O. S., & Drasgow, F. (2010). Tailored Adaptive Personality Assessment System (TAPAS-95s). In D. J. Knapp & T. S. Heffner (Eds.), Expanded Enlistment Eligibility Metrics (EEEM): Recommendations on a non-cognitive screen for new Soldier selection (Technical Report 1267). Arlington, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.

Strickland, W. J. (2004). A longitudinal study of first term attrition and reenlistment among FY99 enlisted accessions (FR-04-14). Alexandria, VA: Human Resources Research Organization.

Trent, T. & Lawrence, J. H. (1993). Adaptability screening for the Armed Forces. Washington, D.C.: Office of the Assistant Secretary of Defense (Force Management and Personnel).

White, L. A., Young, M. C., Heggestad, E. D., Stark, S., Drasgow, F., & Piskator, G. (2004). Development of a non-high school diploma graduate pre-enlistment screening model to enhance the future force. Paper presented at the 23rd meeting of the Army Science Conference, Orlando, FL.