Validating Standards-Referenced Science Assessments

Oct 2000

Bokhee Yoon and Michael J. Young

As states and large districts propose and develop standards with accompanying assessments as instruments for raising academic achievement, the validity of standards-referenced assessments in shaping educational reform demands attention. In this paper, we examine the construct validity of the New Standards middle school Science Reference Examination, focusing on evidence related to the internal and external structure of the assessment, the reliability of the assessment scores, and the generalizability of the assessment results. The data were taken from the spring 1998 field test. Results related to the internal structure of the assessment suggest that although the assessment tasks measured a single common factor, this did not detract from the usefulness of scientific thinking or science concept subscores for instructional purposes. With respect to the external structure of the assessment, moderate correlations between the New Standards total scores and scores on the Stanford Achievement Test (9th edition) and the Otis-Lennon School Ability Test (7th edition) provided evidence that these assessments rank student performance in similar ways. However, these correlations do not indicate that the assessments measure the same construct. With respect to the reliability of the assessment scores and of decisions based on them, the results of the generalizability studies imply that reader variance could be made negligible by training readers with well-defined scoring rubrics. The high rates of decision consistency and accuracy at different total score cutpoints provide evidence that the New Standards Science Reference Examination could be used reliably to classify student performance on the basis of a total test score. For subscores, reporting a single cutpoint that serves as a reference point for meeting the standards would be instructionally informative.

