Good teaching, poor test scores

Evaluating teachers based partly on student test scores is unreliable, concludes a study in Educational Evaluation and Policy Analysis. Researchers analyzed a subsample of 327 fourth- and eighth-grade mathematics and English-language-arts teachers across six school districts.

“Some teachers who were well-regarded based on student surveys, classroom observances by principals and other indicators of quality had students who scored poorly on tests,” reports the Washington Post. Some poorly regarded teachers had students who did well.

Thirty-five states and the District of Columbia require student achievement to be a “significant” or the “most significant” factor in teacher evaluations. Just 10 states do not require student test scores to be used in teacher evaluations.

Most states are using “value-added models” — or VAMs — which are statistical algorithms designed to figure out how much teachers contribute to their students’ learning, holding constant factors such as demographics.
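To make the idea concrete, here is a toy sketch of the simplest possible value-added calculation: predict each student's current score from the prior-year score, then average the leftover (residual) per teacher. This is only an illustration under stated assumptions. The function name `value_added`, the teacher labels, and all the scores are invented, and real VAMs control for many more factors and use more elaborate statistics than this.

```python
from collections import defaultdict
from statistics import mean


def value_added(records):
    """Toy value-added estimate (hypothetical helper, not any state's model).

    records: list of (teacher, prior_score, current_score) tuples.
    Fits one least-squares line current ~ prior across all students,
    then returns each teacher's average residual from that line.
    """
    priors = [r[1] for r in records]
    currents = [r[2] for r in records]
    mean_prior, mean_current = mean(priors), mean(currents)

    # Ordinary least squares with a single predictor (the prior score).
    sxx = sum((x - mean_prior) ** 2 for x in priors)
    sxy = sum((x - mean_prior) * (y - mean_current)
              for x, y in zip(priors, currents))
    slope = sxy / sxx
    intercept = mean_current - slope * mean_prior

    # Residual = actual score minus the score predicted from the prior score.
    residuals = defaultdict(list)
    for teacher, prior, current in records:
        residuals[teacher].append(current - (intercept + slope * prior))

    return {teacher: mean(vals) for teacher, vals in residuals.items()}


# Invented data: two teachers whose students start at the same level.
records = [
    ("A", 50, 60), ("A", 70, 80),
    ("B", 50, 50), ("B", 70, 70),
]
effects = value_added(records)
print(effects)
```

In this made-up data, teacher A's students finish above the prediction and teacher B's below it, so A gets a positive estimate and B a negative one. With only a handful of students per teacher, estimates like these swing widely on chance alone, which is one reason the reliability of such scores is disputed.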

Last month, the American Statistical Association warned against using VAMs, saying that “recent studies have found that teachers account for a maximum of about 14 percent of a student’s test score.”

“We need to slow down or ease off completely for the stakes for teachers, at least in the first few years, so we can get a sense of what do these things measure, what does it mean,” said Morgan S. Polikoff, a USC assistant professor of education and co-author of the study. “We’re moving these systems forward way ahead of the science in terms of the quality of the measures.”


  1. Could this be explained by some teachers using an approach that the principal considers appropriate and the students enjoy, but which isn’t actually effective? Likewise, could some teachers be teaching in a ‘boring’ manner that is evaluated poorly, even though their students do well?

    As somebody who tends to think that teaching methods should vary based on the teachers’ abilities, the content of the course, and the students that are being taught, my first thoughts would be that there is some sort of mismatch, so that what ‘looks good’ and is fun isn’t actually working.

  2. Deirdre Mundy says:

    Could it be that even a bad teacher can teach a kid who is willing to learn, either because of innate curiosity or parental expectations? So test scores are less about the teacher, and more about the kid and the home environment?

  3. Roger Sweeny says:

    I see two separate things here:

    1) the argument that evaluating teachers on the basis of student test scores is unreliable because it disagrees with evaluations based on “student surveys, classroom observances by principals and other indicators of quality.” Of course, it is just as logical to say that “student surveys, classroom observances by principals and other indicators of quality” are unreliable because they disagree with test scores. If the tests really are valid measures of student achievement, then I find the second argument more persuasive.

    2) The argument that differences between teachers don’t make much difference at all, “that teachers account for a maximum of about 14 percent of a student’s test score.” In that case, it doesn’t make much sense to try to rate teachers and pay better ones a higher salary. But if 86% of student achievement results from non-teacher factors, and if that is the main purpose of school, then teachers are overpaid, maybe substantially overpaid.

    • 1) I guess you can write off the possibility of being invited to give the keynote address at the next American Statistical Association convention.

      2) That is the logical conclusion to draw from this statistical analysis. Trouble is, the outliers are where the interesting stuff resides and a statistical analysis provides a fine excuse to disregard them.

      Also, why would anyone interested in determining how much a teacher can add to a student’s attainment *not* include the principal as an element in that determination? That “14%” number’s hardly persuasive unless one accepts the implicit assumption that one teacher’s as good as another and the further assumption that the atmosphere of the school, as set by the principal, is immaterial.