             The Reliability of GRE Scores
        in Predicting Graduate School Success:
    A Meta-Analytic, Cross-Functional, Regressive,
 Unilateral, Post-Kantian, Hyper-Empirical, Quadruple
             Blind, Verbiage-Intensive and
               Hemorrhoid-Inducing Study

                               John Orlando, PhD
                                Research Director
                   Norwich University Online Graduate Programs
                                  May 19, 2005

Studies of GRE Reliability
There have been numerous studies of the reliability of GRE scores as predictors of
graduate school success (See Bibliography). Unfortunately, results span the range from
finding little if any predictive validity (Morrison, Monahan, Wesche), to finding a strong
correlation between GRE scores and graduate school achievement (Kingston, Harvancik,

The situation is complicated by the common tendency to use the results to advance
political agendas. The National Association of Scholars, a conservative group fighting
political correctness on campus, cites a study finding a strong correlation to press universities to
use test scores in admissions, while Fairness in Testing cites numerous studies against the
correlation to fight the use of test scores on grounds that they are discriminatory.

The issue is further muddled by questions about the criteria used to measure academic
success. Some studies use first-year graduate GPAs. Others use final GPA, and still
others use the percentage of students who complete a program. Even this last criterion
creates problems. One study that measured success by completion of the graduate
program did not distinguish people who were dismissed for academic failure from
those who left on their own for reasons such as job commitments or family emergencies
(Nelson and Nelson). In other words, it is likely that academically competent persons
who left for non-academic reasons were lumped into the group used to gauge failure.
We know that this is not a trivial group, since adult students in particular face a far
greater range of outside commitments that can interfere with their education.

Mich Kabay notes that studies often compare GREs to the GPAs of those who complete
graduate programs. This restricts the range of subjects being tested by ignoring those
who were never admitted in the first place, as well as those who drop out. This is an
important oversight. If the issue were just distinguishing between those who will be
highly successful in a graduate program and those who will be just successful, then the
tested group would be sufficient. But because the test is being used to make admissions
decisions in the first place, lack of information about applicants not admitted means that
we cannot tell if the test is being used to deny admission to those who would have been
successful. Because of this weakness, the Educational Testing Service itself flatly states
that “a cutoff score based only on GRE scores should never be used as a sole criterion for
denial of admission” (
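
The range-restriction problem described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the scores are synthetic standard-normal values, the "true" correlation of 0.5 and the 30 percent admission rate are hypothetical, and none of the numbers come from the studies cited here. It shows that when only high scorers are admitted, the correlation observed among admitted students tends to be noticeably lower than the correlation in the full applicant pool.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_range_restriction(n=20000, true_r=0.5, admit_fraction=0.3, seed=0):
    """Return (correlation among all applicants, correlation among admitted only).

    All parameters are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # Standardized test scores for every applicant.
    gre = [rng.gauss(0, 1) for _ in range(n)]
    # Graduate performance built to correlate with the score at roughly true_r.
    noise_sd = math.sqrt(1 - true_r ** 2)
    gpa = [true_r * g + noise_sd * rng.gauss(0, 1) for g in gre]
    # Admit only the top admit_fraction of scorers.
    cutoff = sorted(gre)[int(n * (1 - admit_fraction))]
    admitted = [(g, p) for g, p in zip(gre, gpa) if g >= cutoff]
    r_all = pearson(gre, gpa)
    r_admitted = pearson([g for g, _ in admitted], [p for _, p in admitted])
    return r_all, r_admitted


r_all, r_admitted = simulate_range_restriction()
print(f"all applicants: r = {r_all:.2f}")
print(f"admitted only:  r = {r_admitted:.2f}")
```

A study that sees only the admitted group would measure the second, weaker figure and would have no data at all on how the rejected applicants might have performed.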

Results of Studies
Taken as a whole, the evidence suggests that there is some correlation between GRE
scores and graduate achievement. But there is widespread disagreement about the degree
of correlation. The Educational Testing Service, which funds a considerable amount of
research into the validity of the GRE, asserts only that “GRE General Test scores tend to
show moderate correlations with first-year [GPA] averages” (ETS 1990). It also admits
that there are “critical skills associated with scholarly and professional competence that
are not currently measured by graduate admissions tests” (ETS 1989).

The devil is in the details when it comes to GRE validity studies, as relevant correlations
are often embedded within distinctions in the data. For instance, there is a considerable
amount of variability in predictive validity of GRE scores between disciplines (Braun and
Jones). This can partly be explained by the fact that the GRE is actually three separate
tests: analytic, verbal, and quantitative. Different disciplines demand these skills in
different degrees. Thus, predictive validity tends to improve when a particular test is
matched to a particular discipline. One study concluded that the verbal score is the best
predictor of success in majors that are “descriptive in nature,” while the quantitative score
is the best predictor for “symbol-oriented disciplines” (Kaiser). See Appendix I for a
study that separates results by test and discipline. There are also subject matter tests
offered by the ETS that are rarely required by admissions committees, but tend to be
better predictors of success than any of the regular tests (ETS 1990). Due to this
variation, the Educational Testing Service advises against using a composite score to
judge applicants (

Another interesting variable is type of admission. One study looked at the GRE as a
predictor of success for regular admissions versus probationary admissions, that is, students
admitted with academic credentials below what is normally required for admission. Not
surprisingly, those admitted on probation tended to do less well (with success defined, in
this case, as completing the program) than regular admissions. But for probationary
students, quantitative and analytical scores best predict academic success, while verbal
scores best predict success for regular admissions (Nelson and Nelson).

This study also split subjects by discipline to get an even more fine-grained analysis.
Even here anomalies arose. For instance, among regularly admitted students, there was
not a significant difference in GRE scores between those who succeeded and those who
did not. In fact, in many of the disciplines those who scored better on the GRE were more
likely to fail than those who scored worse (Nelson and Nelson: p. 9, see Appendix II).
These disciplines include Applied Sciences, Education, the Humanities, Life
Sciences, and Social Sciences. The authors never speculate as to the source of this odd
result, but it raises the possibility that the aforementioned inclusion of students who left
the programs voluntarily within the set of “failures” might be skewing the results.
Perhaps students who do very well in certain areas of the GRE tend to have more
demanding jobs that are more likely to interfere with their studies, or are more likely to
change jobs in the middle of their studies.
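
The hypothesis above can be made concrete with a toy simulation. It is a sketch only: the score scale (mean 500, standard deviation 100) and every probability in it are hypothetical values chosen to illustrate the mechanism, not figures from Nelson and Nelson. It assumes high scorers fail academically less often but leave voluntarily more often, and then compares group averages.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate_attrition(n=20000, seed=1):
    """Sort simulated students into three outcome groups by hypothetical rules."""
    rng = random.Random(seed)
    groups = {"graduate": [], "academic_failure": [], "voluntary_leaver": []}
    for _ in range(n):
        score = rng.gauss(500, 100)  # hypothetical score scale
        high_scorer = score >= 500
        # Assumed rates: high scorers fail less but leave voluntarily more.
        p_fail = 0.03 if high_scorer else 0.10
        p_leave = 0.25 if high_scorer else 0.05
        if rng.random() < p_fail:
            groups["academic_failure"].append(score)
        elif rng.random() < p_leave:
            groups["voluntary_leaver"].append(score)
        else:
            groups["graduate"].append(score)
    return groups


groups = simulate_attrition()
non_completers = groups["academic_failure"] + groups["voluntary_leaver"]
print(f"graduates:               {mean(groups['graduate']):.0f}")
print(f"academic failures only:  {mean(groups['academic_failure']):.0f}")
print(f"all non-completers:      {mean(non_completers):.0f}")
```

Under these assumptions the academic failures score lower than graduates on average, yet the combined non-completer group scores higher than graduates, which is the kind of reversal a study would report if it did not separate the two reasons for leaving.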

Finally, there are few studies on the difference between older and younger students in the
predictive validity of GRE scores. One might hypothesize that older students will do less
well on GRE tests than younger students because they have been away from the academic
world longer and have gotten out of the practice of taking tests. One study found that
there was actually little difference in the predictive validity of test scores across different
age groups (Clark). But another found that there “was a significant underprediction of
first-year grade average for older females in all graduate fields. Although it had been
predicted that they would do less well than younger students and about as well as males,
they in fact earned considerably higher grades than all other groups” (Swinton). It merits
note that the Educational Testing Service specifically cautions against giving too much
weight to the test for those students “who are returning to school after an extended
absence” (

The Educational Testing Service has a variety of guidelines about the use of the GRE. If
it is to be used, a school is advised to make a careful examination of these guidelines.
ETS publishes a Guide to the Use of Scores that can be downloaded free of charge. The most
important guidelines inform the recommendations that follow.

Given the well-documented problems with the GRE, if it is to be used at all in
graduate school admission decisions, it should only be used as part of an “all things
considered” judgment. As mentioned above, the Educational Testing Service advises
against using it as a floor for admissions. ETS instead says that:

       Regardless of the decision to be made, multiple sources of information should be
       used to ensure fairness and balance the limitations of any single measure of
       knowledge, skills, or abilities. These sources may include undergraduate grade
       point average, letters of recommendation, personal statement, samples of
       academic work, and professional experience related to proposed graduate study.

ETS also advises against using a composite GRE score. It instead suggests that
universities choose those scores from the three separate tests (analytical, quantitative,
and verbal) that best map to the discipline the student is interested in entering. ETS
even suggests that each department conduct its own validity studies on the use of the
GRE in light of the variation observed in results, and will provide advice on the design of
these studies without charge.

ETS also states that “small differences in GRE scores (as defined by the standard error of
measurement) should not be used to make distinctions among examinees.” All tests have
some standard error of measurement, and ETS reports it separately for each test. Details can again be
found in the Guide to the Use of Scores.
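
As a rough illustration of the point about small differences, the sketch below compares the gap between two scores with the standard error of the difference, which for two scores with the same standard error of measurement (SEM) and independent errors is the SEM multiplied by the square root of 2. The SEM of 30 points is a hypothetical placeholder, and the rule used here is one conventional reading of the guideline, not ETS's own procedure; the actual figures should be taken from the Guide to the Use of Scores.

```python
import math


def differ_meaningfully(score_a, score_b, sem, multiplier=1.0):
    """Return True if two scores differ by more than `multiplier`
    standard errors of the difference.

    Assumes both scores share the same SEM and have independent errors.
    """
    se_difference = math.sqrt(2) * sem
    return abs(score_a - score_b) > multiplier * se_difference


HYPOTHETICAL_SEM = 30  # placeholder value, not an ETS figure

print(differ_meaningfully(560, 580, HYPOTHETICAL_SEM))  # False: 20 points is within the error band
print(differ_meaningfully(500, 600, HYPOTHETICAL_SEM))  # True: 100 points exceeds it
```

A stricter committee could raise `multiplier` so that only larger gaps count as real differences.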

It is clear that the GRE is not required as a measure of likely graduate school success, and
ETS never says that it is. Many schools do not require the test. Some studies suggest
that undergraduate GPA is a better predictor of graduate achievement than GRE
(Monahan). Given the wide disagreement in studies of GRE predictive validity, the best
advice that emerges from the literature is that if the GRE is used, it should only be used
as one measure among many.

Statistical Data
Appendices I and II contain the data from Nelson and Nelson’s study of the GRE as a
predictor of graduation rates for regular and probationary students in different disciplines.
The legend is as follows:

Graduates: Students who completed their program.
Non-Graduate: Students who did not complete their program.
9-Hr GPA: GPA after the first 9 hours of graduate work.
Final GGPA: Final graduate grade point average.
GRE-V: GRE verbal score.
GRE-Q: GRE quantitative.
GRE-A: GRE analytical.

(It’s not clear what “final graduate GPA” means for students who did not graduate, unless
it’s the student’s GPA at the time of leaving the program.)

Note the wide variation in predictive power of GRE scores between disciplines, with
some disciplines actually showing that students who score lower on the GRE do better
than those who score higher.


Bibliography
“ERIC” refers to the Education Resources Information Center.

Boldt, Robert F. (1986). Generalization of GRE General Test Validity across
Departments. ERIC Document No. 281 865.

Bornheimer, D.G. (1984). Predicting Success in Graduate School Using GRE and PAEG
Aptitude Test Scores. College and University, v. 60 (no. 1) pp. 54-62.

Braun, Henry I. & Jones, Douglas H. (1985). Use of Empirical Bayes Methods in the
Study of the Validity of Academic Predictors of Graduate School Performance.
ERIC Document No. 255 545.

Clark, Mary Joe. (1986). Test Scores and the Graduate Admission of Older Students.
ERIC Document No. 271 498.

Enright, M. K. & Gitomer, D. (1989). Toward a description of successful graduate
students. Princeton, NJ: Educational Testing Service.

Educational Testing Service. (1998). GRE Guide to the Use of Scores, 1998-1999.
Princeton, NJ.

Educational Testing Service. (1989). Toward a Description of Successful Graduate
Students. Princeton, NJ.

Fairtest (2001). Examining the GRE: Myths, Misuses, and Alternatives.

GRE Validity Study Service (1990). Validity of the GRE: 1988-1989 Summary Report.
Educational Testing Service,

Goldberg, Edith L. & Alliger, George M. (1992). Assessing the Validity of the GRE for
Students in Psychology. Educational and Psychological Measurement, v52, n4, p1019-27,
Win 1992.

Harvancik, Mark J. & Golsan, Gordon. (1986). Graduate Record Exam Scores and
Grade Point Average: Is There a Relationship? ERIC Document No. 270 682.

Hartnett, R. & Payton, B.F. (1977). Minority Admissions and Performance in Graduate
Study: Preliminary Study of Fellowship Programs of the Ford and Danforth
Foundations. New York: Ford Foundation.

Hebert, David J. & Holmes, Alan F. (1979). Graduate Record Examinations Aptitude
Test Scores as a Predictor of Graduate Grade Point Average. Educational and
Psychological Measurement, v39, n2, p415-20, Sum 1979.

Jacobson, R. L. (1993). Critics Say Graduate Record Exam does not measure qualities
needed for success and is often misused. The Chronicle of Higher Education, March, pp.

Kaiser, Javaid. (1982). The Predictive Validity of GRE Aptitude Test. ERIC Document
No. 227 174, abstract.

Kingston, Neal M. (1985). The Incremental Validity of the GRE Analytical Measure for
Predicting Graduate First-Year Grade-Point Average. ERIC Document No. 226 021.

Kuncel, Nathan R.; Hezlett, Sarah A., & Ones, Deniz S. (2001). A Comprehensive
Meta-Analysis of the Predictive Validity of the Graduate Record Examinations:
Implications for Graduate Student Selection and Performance. Psychological Bulletin
127 (1), 162-181.

Milner, M., McNeil, J. & King, S.W. (1984). The GRE: A Question of Validity in
Predicting Performance in Professional Schools of Social Work. Educational and
Psychological Measurement, vol. 44, pp. 945-950.

Monahan, Thomas C. (1991). Using Graduate Record Examination Scores in the
Graduate Admissions Process at Glassboro State College. ERIC Document No. 329

Morrison, T. & Morrison, M. (1995). A Meta-Analytic Assessment of the Predictive
Validity of the Quantitative and Verbal Components of the Graduate Record Examination
with Graduate Grade Point Averages Representing the Criterion of Graduate Success.
Educational and Psychological Measurement, v. 55 (no. 2) pp. 309-316.

National Association of Scholars, (2002). The Validity of GRE Subject Tests.

Nelson, Jacquelyn & Nelson, C. Van. (1995). Predictors of Success for Students
Entering Graduate School on a Probationary Basis. ERIC Document No. 388 206.

Onasch, C. (1994). Undergraduate Grade Point Average and Graduate Record Exam
Scores as Predictors of Length of Enrollment in Completing a Master of Science Degree.
ERIC Document No. 375 739.

Oltman, P.K. & Hartnett, R.T. (1984). The Role of the GRE General and Subject Test
Scores in Graduate Program Admission. Princeton, NJ: Educational Testing Service.

Pennock-Roman, M. (1994). Background Characteristics and Future Plans of
High-Scoring GRE General Test Examinees. Research report ETS-RR9412 submitted to
EXXON Education Foundation, Princeton, NJ: Educational Testing Service.

Scott, R.R. & Shaw, M.E. (1985). Black and White Performance in Graduate School and
Policy Implications For Using GRE Scores in Admission. Journal of Negro Education, v.
54, no.1, pp.14-23.

Sternberg, R. & Williams, W. (1997). Does the Graduate Record Examination Predict
Meaningful Success in the Graduate Training of Psychologists? American Psychologist,
v. 52 (no. 6), pp. 630-641.

Swinton, Spencer S. (1987). The Predictive Ability of the Restructured GRE with
Particular Attention to Older Students. ETS Research Report 87-22.

Thornell, John G. & McCoy, Anthony. (1985). The Predictive Validity of the Graduate
Record Examinations for Subgroups of Students in Different Academic Disciplines.
Educational and Psychological Measurement, v45, n2, p415-19, Sum 1985.

Wesche, Lilburn E., et al. (1984). A Study of the MAT and GRE as Predictors of Success
in M.Ed. Programs. ERIC Document No. 310 150.

Wilson, Kenneth M. (1986). The Relationship of GRE General Test Scores to First-Year
Grades for Foreign Graduate Students: Report of a Cooperative Study. ERIC Document
No. 281 862.

Wilson, Kenneth M. (1982). A Study of the Validity of the Restructured GRE Aptitude
Test for Predicting First-Year Performance in Graduate Study. ERIC Document No. 240

Source: Ubiquity, Volume 6, Issue 21 (June 8 - 15, 2005)

