Assessment, Measurement and Evaluation

Citation: Huitt, W. (2007, October). Assessment, measurement, and evaluation: Overview. Educational Psychology Interactive. Valdosta, GA: Valdosta State University. Retrieved [date], from


Previously we discussed some concepts related to "Science: A way of knowing," which provided an overview of the processes of assessment, measurement, evaluation, and research as they are applied in educational and developmental psychology. You may want to review that material before going any further.

Remember that we defined learning as "a relatively permanent change in behavior (or behavior potential) as a result of experience or practice." Remember also that we said maturation, or relatively permanent change as a result of biology and genetics, is the only other explanation for relatively permanent change. That is, when a person is different today than he or she was yesterday, last week, last month, or last year, it is because that person either matured or learned something, or both. When both of these occur simultaneously, in a way that is difficult to study independently, we refer to that as development. In addition, development has the connotation of improvement or increase in complexity or level. In reality, most of what we do in education has both learning (i.e., providing experiences for students) and maturational (i.e., appropriate for a given level of biological functioning) aspects. However, in this course we will focus mainly on learning with less emphasis on development. The major exception is when we study the cognitive theory of Jean Piaget.

Another important issue in the area of teaching and learning is the difference between aptitude and achievement. By aptitude we mean the potential one has to learn and/or develop. Aptitude is a measurement in the present that is intended to predict future events, especially the capacity of an individual to learn (i.e., change as a result of experience or practice). IQ is a measure of academic aptitude. Students with higher academic aptitude will show more achievement than students with lower academic aptitude if both are given the same quality and amount of experience or practice. Other examples of aptitude include manual dexterity and spatial, mechanical, and musical aptitude. Achievement, on the other hand, refers to capacities, skills, or knowledge one has now that one did not have in the past. That is, achievement is a measurement in the present that refers to behavior, effort, or learning done in the past. It is especially important that our educational system not mix the two. Achievement is likely the result of specific training or experiences, whereas aptitude is more likely the result of genetic inheritance and general experiences that may or may not have been explicitly provided to all students. If some students have been provided appropriate learning experiences while others have not, we should not use achievement measures as measures of aptitude. Rather, we need to provide students with specific learning experiences, making sure that they have the prerequisite skills necessary to learn, before we make judgments about aptitude.

Another important distinction is among instruction, assessment, and evaluation. Remember that assessment refers to the collection of data to describe or better understand an issue, whereas evaluation refers to the comparison of data to a standard for the purpose of judging worth or quality. Instruction is defined as "the purposeful direction of student learning" and is one of the three major categories of teacher classroom behavior (the other two are planning and management). Instruction, then, is providing experiences for learners that will allow them to successfully move from their present level of knowledge, skill, etc. to a desired level. It is an active process involving both teachers and students moving towards specific desired end results. In relation to instruction, assessment is the collection of data before, during, and after instruction. Evaluation is the process of making judgments at these specified times. Before instruction, we might decide whether a student has the prerequisite skills or the necessary aptitude to learn the material in the time allotted. During instruction, we might make decisions about the appropriateness of the student's effort or learning. After instruction, we can decide whether or not the student has mastered the material sufficiently to go on to the next learning experience.

In future presentations we will discuss the concept of "What You Measure Is What You Get" (WYMIWYG), which highlights the tendency to focus during instruction on those outcomes you intend to measure or that will be measured by some outside individual or agency. In addition, we will discuss how the changing requirements of the digital/information/conceptual age demand that educators revise curricula to focus on important outcomes such as critical thinking and social skills in addition to achievement in basic skills.

Another issue already discussed is related to reliability and validity. Both are important for every aspect of evaluation and research. Reliability refers to the consistency or the dependability of the data, whereas validity refers to the truthfulness or correctness of the data. As previously stated, assessment data and evaluation decisions may be reliable (consistent) without being valid (correctly judging the truth of the data relative to intentions). However, the reverse is not true (i.e., data cannot be valid if they are not reliable).
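The idea that data can be reliable without being valid can be illustrated with a small sketch. The example below is only an analogy and is not taken from the course materials: it assumes a hypothetical scale that reads about 5 units too high every time. The function names (consistency, systematic_error) and all of the numbers are made up for illustration.

```python
from statistics import mean, pstdev


def consistency(readings):
    """Spread of repeated readings; a smaller spread means more reliable data."""
    return pstdev(readings)


def systematic_error(readings, true_value):
    """Average reading minus the true value; nearer zero means more valid data."""
    return mean(readings) - true_value


# Hypothetical data: the true weight is 50 units, and the scale is miscalibrated.
TRUE_WEIGHT = 50.0
miscalibrated = [55.0, 55.1, 54.9, 55.0]

spread = consistency(miscalibrated)                  # small: readings agree with each other
bias = systematic_error(miscalibrated, TRUE_WEIGHT)  # about 5: readings are consistently wrong

print(f"spread = {spread:.3f}, bias = {bias:.3f}")
```

The readings agree closely with one another (reliable), yet every one of them misses the true value by roughly the same amount (not valid).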

Three issues are important for classroom assessment that is under the control of the teacher. The first relates to what data (assessments and measures) we will use for making judgments. The issue of traditional versus performance assessment is important in this regard. By traditional assessment we are referring to the types of assessments generally found in classrooms: multiple-choice, true-false, or matching objective exams, or fill-in-the-blank, short-answer, or essay exams. Performance assessment, on the other hand, refers to data collected during the actual accomplishment or demonstration of a particular set of skills or knowledge. A major difference is that traditional assessment is almost always done out of context (i.e., separate and apart from the situation in which the knowledge or skill will be used). Performance assessment, on the other hand, includes as much of the context as possible. It may be that a student fixes an engine, or perhaps gives a speech to a group of parents. In any case, the situation is as real as possible.

A second issue, that of formative versus summative assessment, is an issue of timing of assessment and evaluation. Formative assessment refers to any data collection that occurs before or during instruction. If it is done prior to instruction for the purpose of determining a student's readiness for instruction, we call that diagnostic assessment. Summative assessment is done at the end of instruction to determine what has been learned. There is no expectation of further instruction at that point.

The third issue is one of the most important issues in measurement and evaluation: the difference between criterion- and norm-referenced testing. In general, criterion-referenced testing is done when we want to know how much a student knows vis-a-vis a pre-set standard, and norm-referenced testing is done when we want to know how one student or group of students compares to other students in terms of the content or skills being tested. It is important to be knowledgeable regarding these issues so that you can explain test results to students and parents.
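The difference between the two kinds of interpretation can be sketched with one set of scores read in two ways. This is a minimal illustration under stated assumptions, not a standard scoring procedure: the student names, the scores, the cutoff of 80, and the function names are all hypothetical, and percentile rank is defined here simply as the percentage of the group scoring strictly below a given score.

```python
from typing import Dict, List


def criterion_referenced(score: float, cutoff: float) -> bool:
    """Compare a score to a pre-set standard, ignoring other students."""
    return score >= cutoff


def percentile_rank(score: float, group_scores: List[float]) -> float:
    """Compare a score to the group: percent of scores strictly below it."""
    if not group_scores:
        raise ValueError("group_scores must not be empty")
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)


def report(scores: Dict[str, float], cutoff: float) -> Dict[str, dict]:
    """Give both interpretations for every student in the group."""
    group = list(scores.values())
    return {
        name: {
            "met_standard": criterion_referenced(s, cutoff),
            "percentile_rank": percentile_rank(s, group),
        }
        for name, s in scores.items()
    }


# Hypothetical class scores and a hypothetical mastery cutoff.
scores = {"Ana": 62, "Ben": 71, "Cai": 78, "Dee": 85, "Eli": 93}
CUTOFF = 80

for name, result in report(scores, CUTOFF).items():
    print(name, result)
```

In this made-up class, Cai scores higher than 40 percent of the group but has not met the standard. The norm-referenced result describes standing relative to classmates, while the criterion-referenced result describes mastery relative to the cutoff.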

In this course, we will be integrating the issues of assessment, measurement, and evaluation into the study of learning theory and classroom applications. It is not enough simply to know these terms; we must also be able to apply them in the classroom.


All materials on this website [] are, unless otherwise stated, the property of William G. Huitt. Copyright and other intellectual property laws protect these materials. Reproduction or retransmission of the materials, in whole or in part, in any manner, without the prior written consent of the copyright holder, is a violation of copyright law.