Poorly trained part-timers determine the test scores that loom so large in education, writes Todd Farley in a New York Times op-ed. Farley, the author of Making the Grades, was a graduate student in 1994 when he was hired to score fourth-grade statewide reading comprehension tests.
One of the tests I scored had students read a passage about bicycle safety. They were then instructed to draw a poster that illustrated a rule that was indicated in the text. We would award one point for a poster that included a correct rule and zero for a drawing that did not.
The first poster I saw was a drawing of a young cyclist, a helmet tightly attached to his head, flying his bike over a canal filled with flaming oil, his two arms waving wildly in the air. I stared at the response for minutes. Was this a picture of a helmet-wearing child who understood the basic rules of bike safety? Or was it meant to portray a youngster killing himself on two wheels?
Some fellow scorers wanted to give full marks for understanding bicycle safety; others wanted to give a zero.
I realized then — an epiphany confirmed over a decade and a half of experience in the testing industry — that the score any student would earn mostly depended on which temporary employee viewed his response.
This is why multiple-choice tests can be more reliable than subjectively graded tests that rely on drawing (or writing) skills to measure reading comprehension.
I have a review copy of Farley’s book, which I plan to read very soon — along with the four other review books waiting for me. Maybe today! Anyhow, I vowed not to mention other people’s books without promoting my own book, Our School: The Inspiring Story of Two Teachers, One Big Idea and the Charter School That Beat the Odds.