On the Development and Scoring of Classification and Observation Science Performance Assessments

Jul 1997

Guillermo Solano-Flores, Richard J. Shavelson, Maria Araceli Ruiz-Primo, Susan Elise Schultz, and Edward W. Wiley

We have developed a framework for conceptualizing science performance assessments. According to this framework, science performance assessments can be classified by type of science investigation, and performance on those assessments can be scored according to the scientific defensibility of the approaches students use. In this study we constructed two assessments to explore the nature and psychometric characteristics of classification and observation task-based performance assessments. Sink and Float, a classification assessment, consisted of four problems, each intended to address a different type of knowledge. Daytime Astronomy, an observation investigation assessment, consisted of six problems. We found reasonably high interrater reliabilities for both assessments. Moreover, we found that the problems presented in Sink and Float and Daytime Astronomy distinguished different aspects of knowledge within the domain addressed by each assessment. Unfortunately, we found no evidence that Sink and Float and Daytime Astronomy were as sensitive to differences in instruction as expected. Additional development work and research are needed before classification and observation performance assessments can be considered ready for use in practice.

Solano-Flores, G., Shavelson, R. J., Ruiz-Primo, M. A., Schultz, S. E., & Wiley, E. W. (1997). On the development and scoring of classification and observation science performance assessments (CSE Report 458). Los Angeles: University of California, Los Angeles, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).