Scholarly Communications Project



Stephanie Hildegarde Zadro Jacobson

PhD Dissertation submitted to the Faculty of Virginia Tech in partial fulfillment of the requirements for the degree of

Doctor of Philosophy


Educational Research and Evaluation


Marvin G. Cline, Chair
Victoria Fu
Barbara Hutson
Javaid Kaiser
Ronald McKeen

April 17, 1997
Blacksburg, Virginia


A COMPARISON OF EARLY CHILDHOOD ASSESSMENTS AND A STANDARDIZED MEASURE FOR PROGRAM EVALUATION

by Stephanie Hildegarde Zadro Jacobson

Marvin G. Cline, Chairman

Educational Research and Evaluation

(ABSTRACT)

Traditionally, standardized achievement tests have been used to monitor program effectiveness. Recently, however, educators have questioned the appropriateness of standardized tests for this purpose, especially for programs designed for young children. Early childhood advocates suggest using developmentally appropriate assessments instead of standardized achievement tests for making classroom-level decisions about children and for program evaluation. Proponents, however, have not fully identified the psychometric properties of the assessments, certainly not for the purposes of program evaluation. Although developmentally appropriate assessments have been implemented in a number of classrooms across the country, few studies have verified their ability to discriminate among developmental levels. In addition, even fewer studies have addressed their use for evaluating program effectiveness. Using the records of 293 students from the local site of a National Transition Project and both classical test theory (CTT) and item response theory (IRT) procedures, three assessment instruments and a standardized test were examined. It was shown that the Concepts about Print portion of the Early Childhood Assessment Package, the Language Arts component of the kindergarten developmental progress reports, and the first grade Early Literacy Scale tasks are, in fact, developmental assessments. Additionally, IRT procedures located students on the developmental continuum underlying the assessments.
Although classical ANCOVAs were unable to identify Treatment or Head Start program effects beyond the kindergarten year, IRT procedures showed that the expected proportion of students at the highest latent ability levels tended to be greater for students in Demonstration schools and for Head Start graduates than for their counterparts throughout kindergarten and first grade. A standardized reading achievement measure administered to the students in second grade was unable to differentiate program effects through either classical or IRT procedures. This suggests that the concepts underlying standardized tests differ from those underlying developmentally appropriate assessments. As a result, the key issue to be resolved is which type of measure is more valid, that is, more appropriate, for evaluating early childhood programs.

Full text (PDF) 713,172 Bytes

The author grants to Virginia Tech or its agents the right to archive and display their thesis or dissertation in whole or in part in the University Libraries in all forms of media, now or hereafter known. The author retains all proprietary rights, such as patent rights. The author also retains the right to use in future works (such as articles or books) all or part of this thesis or dissertation.
