Test scores should be information at a teacher’s disposal, not information used to dispose of teachers.
The New York State Board of Regents passed the new regulations, which allow scores on state standardized tests to account for up to 40 percent of a teacher’s evaluation. Before the vote, several researchers expressed concern about this in a letter. At least two board members spoke up against the change, and three voted against it. Kathleen Cashin said that it would lead to even more reliance on test prep. Roger Tilles pointed out that districts that can’t afford to develop local assessments will be forced to use state assessments for the full 40 percent of the evaluation.
We need tests, including standardized tests. As a teacher, I want to know promptly how my students did on a given test. (Often the results don’t come back until the following year.) I would like to look at the questions and my students’ answers, instead of relying on diagnostic reports that tell me that such-and-such a student needs to work on “finding the main idea.”
The tests are one way of verifying that students have learned what they are supposed to learn. But they cannot be the only way, or even close. In English language arts, the tests can be especially misleading, as they are generally rather weak in “content.” That is, they do not presume that students have read anything in particular. They test generic skills–sometimes accurately, sometimes not.
They are even less reliable as indicators of teachers’ performance. For reasons that have been brought up again and again, reasons given by scholars, teachers, policymakers, and others, test scores should not decide a teacher’s fate or override human judgment. There are simply too many unstable factors–the tests themselves, the students’ lives, conditions on the day of the test–that make the scores inaccurate indicators of what a teacher is accomplishing.
In an op-ed in the New York Daily News, Arthur Goldstein points out that students’ efforts are not uniform: “For example, how much television does a student watch? … If my students don’t know how to read, haven’t been in school for the past six years or refuse to put a mark on a piece of paper, is it my fault? If a kid was dragged to the U.S. against his will and simply won’t learn English, should I be penalized?” (Having taught ESL, I have seen these situations.)
Value-added formulas are just as problematic as test scores, if not more so. They control for all sorts of factors, but the various controls create their own problems and distortions. Value-added ratings can provide useful information about schools, over time. But in teacher evaluations and tenure decisions, they should be regarded carefully and critically. And there should be room to “unpack” them–to figure out what a teacher’s rating might have been under this or that different condition.
All of this has been said, many times. Of course, human judgment is also fallible. “Multiple measures” can also be misleading. Don’t get me started on portfolios–it is often the teacher, not the student, who puts time and effort into these portfolios, and they may not reflect what a student can do independently.
What, then, should constitute teacher evaluations? Well, as in government, I prefer a system of thoughtful checks and balances. Consider test scores, but don’t give them too much power. Consider the principal’s judgment, but don’t let that override all else. Consider student work, but look carefully at it–don’t just check off items on a checklist. Consider a teacher’s lesson plans, assignments, and contributions to the school. Yes, this comes down to “multiple measures,” but the point isn’t just that they are multiple. The point is that each one is regarded carefully.
When one measure (not to mention a flawed one) is given too much power, it is bad for schools through and through. It tells teachers (and, indirectly, students) that exercising one’s judgment isn’t that valuable after all.