The Behavior of Linking Items in Test Equating

May 2004

Edward H. Haertel

Large-scale testing programs often require multiple forms to maintain test security over time or to enable the measurement of change without repeating the identical questions. The comparability of scores across forms is consequential: Students are admitted to colleges based on their test scores, and the meaning of a given scale score one year should be the same as for the previous year. Agencies set scale-score cut points defining passing levels for professional certification, and fairness requires that these standards be held constant over time. Large-scale evaluations or comparisons of educational programs may require pretest and posttest scale scores in a common metric. In short, to allow interchangeable use of alternate forms of tests built to the same content and statistical specifications, scores based on different sets of items must often be placed on a common scale, a process called test equating (AERA, APA, NCME, 1999).

Haertel, E. H. (2004). The behavior of linking items in test equating (CSE Report 630). Los Angeles: University of California, Los Angeles, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).