Testing: Threat or menace?

Student achievement, as measured by test scores, is meaningless, writes Karl Wheatley, a Cleveland State education professor in a Plain Dealer op-ed.

. . . most of what matters in life is simply not on the tests. Many key subjects and skills that form the backbone of people’s careers are not being tested. Also, many of the top goals that parents and employers have for American students are not on the tests, including teamwork, independence, creativity, love of learning, risk-taking, problem-solving, critical thinking, confidence, initiative, persistence, and to be caring, happy and healthy. Even in the subjects tested, researchers repeatedly find that standardized tests overemphasize low-level outcomes and underemphasize higher-level skills.

. . . focusing education on test scores creates collateral damage in every corner of education: dumbed-down curriculum, motivation problems for students and teachers, higher teacher attrition, mind-numbing scripted instruction, increased mental health problems, more kids put on drugs to pay attention and increased alienation, behavioral problems and dropouts.

Testing isn’t the problem, responds Jamie Davies O’Leary on Flypaper. The problem is low achievement in, for example, Cleveland.

The most glaring of Wheatley’s arguments is his contradiction that testing is bad because it doesn’t focus on soft skills like teamwork, personal management, creativity, etc. Even if we shifted toward teaching those “skills” in lieu of core content (reading, math, science and history), how would we know that students are progressing appropriately unless we assess their learning? Regardless of what schools teach, that content has to be tested somehow in order for us to know a) that students are learning it and b) that teachers are doing a decent job of teaching it. Furthermore, no one is arguing that self-sufficiency, creativity, etc. are not important, just that they aren’t going to be that useful if students reach high school reading at a sixth-grade level and still can’t tell time on an analog clock.

Testing enables us to diagnose learning problems — and teaching problems, O’Leary argues. For example, “only 10 percent of Cleveland’s fourth-graders were proficient in mathematics according to the 2007 NAEP, and only 8 percent were proficient readers.”

Bad tests and bad test prep lead to bad results. But no testing lets us pretend that those Cleveland students are doing OK.  Poor readers may have a “love of learning” that will kick in some day.  Kids who can’t add or multiply may be strong in problem-solving, critical thinking, initiative or persistence. Pigs may have wings.

In Why she quit teaching, an ex-teacher talks about trying to teach 10th graders who’ve been passed along without learning to read. These are not happy kids.


  1. As another analogy, take medicine. As the medical sector has, over generations, gotten more and more serious about testing, they’ve discovered a lot of things that they thought worked don’t. For example, surgery often has a placebo effect.
    In my own experience in labs at school and university, it was easy to, without realising it, get some stage of the preparation wrong (eg putting the same solution into both sides of a titration). And of course engineers are obsessive over testing.
    With all due respect to the teaching profession, it seems implausible to me that its members are perfect when the members of the engineering, science and medical professions are not.

  2. The emphasis on process over content seems to be typical of the education world, from ed schools to administrators to teachers. Beyond that alternate reality, it is accepted that it is impossible to think critically without domain-specific knowledge. Only the education world seems to think otherwise.

  3. Elizabeth says:

    “Caring, happy, healthy” you have to be joking. And educators wonder why homeschooling has become such a growth industry.

  4. We have to remember that the purpose of these tests is to assess student learning, and these assessments are not explicitly designed to draw conclusions about individual teachers. As for Tracy W.’s and momof4’s assertions that teachers think they are perfect or better than other “worlds,” please don’t make that assumption until you’ve been in every classroom in the nation. Most teachers know that the system is horribly flawed, based on faulty premises, and many, like myself, are working to reform not funding or testing, but teaching.

    Consider this: would you assess the effectiveness of an emergency room in a hospital only on the number of patients who survive their visit? Such a binary assessment of data disregards the myriad of external factors which are relevant to that department’s capacity for success. Should the effectiveness of law enforcement be based only on the number of people locked up? The problem with testing is that it often represents a binary, black-and-white, representation of a complicated situation.

    For example, I had a young woman in my HS English class a few years ago whose grandmother died two days before the state tests, but because she knew the state tests were required for graduation, she came and sat for them. She bombed them, since one of the writing prompts had to do with identifying and describing an important influential person the student would like to meet again. She was an A/B student and that one day was an off day.

    Sure, that example is the exception, but is not figured into the data. By your discussion above, not only would that test be assumed as evidence of her lack of learning, but also my lack of teaching.

    I teach in an intervention program where many kids enter with significant deficits in skills. My teaching team has worked hard and innovated strategies to get those kids where they need to be. Yet, a good chunk still don’t pass the state test. By your logic, we are failures. However, what isn’t shown in the data is that those kids rose from an average 4th grade reading level to an average 8th grade reading level from September to March (test time)…but the state test is in the 10th grade. Clearly, the data proves that we are failures as teachers.

    Wrong…but the data can help us understand where we did something right and where we can improve future instruction. I bristle at the premise, from comments above, that because teachers see bad data as a spur to implement change…rather than as a call to throw our hands up and admit defeat…that we somehow think we are “better” or “perfect” in comparison to other fields. Good teachers use bad data to make better instruction. If you’d rather we just say “we’re failures,” I don’t see how that would be more productive. I don’t know a single teacher who thinks our profession is populated with perfects.

    We are working with people here. There is much evidence about all the factors which contribute to test performance, from the time of day when it is administered to whether the child has eaten in the previous 24 hours…and especially, that child’s parents’ views of education and schools. I am not saying testing is evil, nor that the data provided by testing is evil. I am just calling for caution when considering what data represents…taking those binary pass-fail measures and drawing absolute conclusions…such as “teachers think they are perfect and blame the test” is quite counterproductive.

  5. We once had a 4th grader whose father murdered his mother and then killed himself. The child heard the shots. He still had to take the state tests.

    Tests in and of themselves are OK; it’s what they’re used for nowadays that is the problem. I don’t even see the test results of my students until the last week of school, so how am I supposed to use them to improve instruction?

  6. Mark G: would you assess the effectiveness of an emergency room in a hospital only on the number of patients who survive their visit?

    We assess the effectiveness of emergency treatments based on how many patients survive (plus a range of quality-of-life measures). It has often turned out that a medical treatment that sounded logical was worse than doing nothing.

    More generally, I think I wrote poorly. I was intending a defence of testing in response to Karl Wheatley’s attacks. Take your intervention programme – if you didn’t test the kids (and hopefully test them independently), how could you know that you’d improved their reading level from an average of 4th grade to 8th grade? You might have thought that you’d improved their reading level but in fact not had any actual impact at all, or possibly even worsened their reading levels. (I am not questioning your results here; I am pointing out that the value of testing is that you know you had such results. If you didn’t test, you’d be flying blind.)

    I don’t know a single teacher who thinks our profession is populated with perfects.

    I give you Karl Wheatley. At least, that’s the implication I take away from his attack on testing. Admittedly he might not be a teacher.

  7. Margo/Mom says:

    Wheatley has based his editorial on some questionable assumptions. One assumption is that the driving force behind measuring student achievement is to make the US the highest scoring country in the world. Personally, I was not aware of this goal, and I wouldn’t necessarily endorse it as a meaningful goal. On the other hand, it is troubling that so many assume that the US still occupies that spot–when we actually fall somewhere around the middle. He then attempts to point out that test-measured achievement is unimportant by referring to a study that followed adults with the highest IQ scores through life, and finding that their accomplishments were not astounding. I don’t find that surprising, but IQ is not the same thing as achievement.

    When learning (or knowledge at a point in time–which can be used to evaluate learning)–the stuff that achievement tests are designed to measure is shown to be tightly inter-related with such things as who graduates high school and who is then able to go on to and succeed in college or the workplace–we are dealing with an important gatekeeper. It is intensely discomforting to look at the reality of the difference between the Cleveland achievement scores and those in nearby Beachwood or Bay Village. It is even more discomforting to see that the response of places like Cleveland is not to emulate anything that happens in Beachwood or Bay Village, but rather to institute dress codes, increase drills and eliminate recess, the arts and physical education.

    But the solution is not to make the measures of achievement, however limited they may be, go away. We certainly have had ample time doing the wrong things (and hoping that the tests would just go away). For the most part, they haven’t helped much. There are too many successful models (in this country and elsewhere) for improving education. Odds are that districts like Cleveland have drawers full of recommendations based on sound practice rather than cosmetic changes and eliminating the things that support a well-rounded education. We really have to confront the inequality in the outcomes of education–and work to change them, in real ways.

  8. Roger Sweeny says:

    I give you Karl Wheatley. At least, that’s the implication I take away from his attack on testing. Admittedly he might not be a teacher.

    Karl Wheatley is an education professor. Education professors are to teachers as blood-letters are to doctors.

  9. Per Wheatley:

    “What do the researchers know that the policy-makers don’t?”

    Extremely funny, though I don’t think he meant it to be.

  10. Tracy W., you’re right about the testing used with my students–the key is that testing happened in my classroom and is not what gets used to assess my performance. My goal with those assessments had two prongs: first, figure out my students’ entry skills (assess students) and second, in the post-assessment, determine if my strategies had been effective. What is used to assess my performance is the external measure, the state standardized test, not the measures employed within my classroom. That outside test, to me, is not a reasonable way to draw conclusions about my effectiveness as an educator.

    If the argument then becomes that I still did not fulfill my duties as a teacher since my kids did not pass the state test, well I suppose that is one way to interpret the data. Hey, if it means teachers get fired for taking on kids with skills deficiencies because those test scores are lower, I’ll fight for Honors courses next year where my kids will arrive already having mastered much of what the state test will ask of them. The fact is, those kids will exist as long as social promotion exists and as long as our education system remains fundamentally broken. Until the big fix, it is important to divorce snapshot, one-time-testing from the assessment of teacher effectiveness.

  11. tim-10-ber says:

    Mark G: I am a parent. I have seen the ill effects of a poor teacher on students. I have had to explain to my son, who got the right answer in his head but did not show his work, why some “dumb” teacher was yelling at him and why showing “his work” was “important”. I have seen the top-down approach (from either management or professors) not work, as they are too far removed from the client or the students. I know students have bad test days and I know some students just don’t do well on standardized tests. I know there is grade inflation. The list goes on…

    As a parent I want to know my student is mastering the subject matter. I want to know my child’s teachers are effective (it is easy to tell when they are not). How do I do this without relying on data?


  12. tim-10-ber, the key is to consider the data and what it really measures, and what conclusions can be drawn from that data… teacher effectiveness and student performance on standardized tests do not have a direct, predictable, consistent relationship. I have probably the worst state standardized test scores in my building, so if that data were used to determine my effectiveness, no parent would want their kid in my class. But there are other factors to consider: for one, I teach in an intervention program, the only one in my building. I’ve been lucky to receive recognition at the state and national level for what I do, and I have data I gather from within my classroom to indicate my successes and shortcomings with my students, so I do have parents who want their kid in my class. When my parents come and ask to see the data (not necessarily in those words) I never show them state test scores (mainly because I don’t have them until the last week of school); I walk them through the kid’s portfolio of work and talk about the progression I see or do not see therein. It also involves broadening the definition of what is data: there is more data about your kid than just a one-shot-and-done state assessment or district test. A good teacher is constantly gathering “data” on their kids and should be able to articulate to you what that data means.

    I cannot defend bad teaching or bad teachers. There are far more bad teachers out there than most principals probably want to admit, and unfortunately their employment is vigorously defended by certain groups. It is far too hard to get rid of ineffective teachers, partly because there is no consistently applied and irrefutable, reliable measure of teacher effectiveness. THIS is the big problem. Because there is no way for parents to determine effective teachers and because districts often have no means of designating the most effective teachers with any degree of verifiable reliability, parents are left to rely on data which actually says very little about teacher effectiveness and is often an out-of-context snapshot.

    I guess, the best you have is to talk to other parents, talk to other teachers, and make sure to have a relationship with the teacher in order to assess that teacher’s effectiveness. Unfortunately, that approach is the best we have at the moment until better data directly linked to teacher effectiveness is available.

  13. tim-10-ber says:

    Mark G, thank you!! I can understand your situation as you are trying to bring kids to grade level that other teachers allowed to be promoted. I applaud you for working with these kids! They need you. From what you say you are doing well!! We need more teachers like you!!

    There has to be an effective way to measure teachers and their performance. Yes, I talked with the principal when a teacher forced all the kids to do “write offs” when there were 9 out of the 31 the teacher knew did nothing wrong. That process ended fairly quickly. Still…the teacher was dumbing down the class by that one exercise rather than challenging the 22 other kids to raise their standards to or above those of the kids who were doing the right thing.

    Another teacher told me she would have to tutor my son through Algebra I as he had an 8 stanine on a NRT but she had not prepared him for Algebra I in what I thought was a pre-algebra class. It turned out that when he was tested by his new private school for math placement he could not do half of what was on the assessment. For my son this teacher failed him. For others she might have been great. What was clear was she had no clue how to do differentiated learning. UGH!!

    Someone has to figure out how to do effective 360 feedback in education. Teachers know who the poor teachers are. Students know, too. So do administrators. So…how to make this work to get these teachers out before they do more harm…

    Thanks again —

  14. Mark G – I think, for effective teaching, there does need to be external tests. Not necessarily the ones currently in place, but something.
    In medicine, the gold-standard for testing is double-blinding, where neither the patients nor the doctors directly administering the treatment know whether the patients are getting the placebo or the actual treatment, in order to rule out the possibility of the doctor even subconsciously biasing the results. Engineering companies hire specialised test engineers, whose job is to check what the other engineers (and technicians and so forth) have produced, because there’s a lot of experience that engineers often fail to see the faults in their own work, even though the thing about engineering is that as areas of human endeavour go, it’s one where it’s really really hard to cover up faults.
    With teaching, it strikes me as plausible that an ordinary teacher could fail to notice that he was accidentally teaching kids to memorise texts or guess meanings from pictures rather than actually read (I’m not saying that you’re doing this, from what you report of your students’ results on external tests it sounds like you’re not at all, I’m just saying that it’s a possibility well within the human range of error, the equivalent of say bloodletting in medicine). And if a school teaches critical thinking or creativity, that strikes me as even easier for the school to go off the rails and start teaching “critical thinking” as “agreeing with the teachers all the time” or “parroting the school’s attitudes”.
    Of course we then face the question of how reliable the external tests are. NZ has addressed that issue by releasing the test questions publicly after the tests, and sending each student their test papers back with marking. It’s imperfect, and it’s expensive, but at least when I went through a bad test question was front-page news.

  15. I agree that there need to be external tests. In fact, in my home state, I am part of a task force developing exactly that. Some would also suggest that National Board Certification is a measure of teacher effectiveness– I tend to agree that it is the best measure we have but not yet a perfect measure. However, a measure of teacher effectiveness should be designed to specifically assess teacher effectiveness, not just student achievement on state assessments… which is just one indicator but not a consistent or necessarily reliable indicator of teacher effectiveness.

  16. Homeschooling Granny says:

    Mark G. wrote:
    I guess, the best you have is to talk to other parents, talk to other teachers, and make sure to have a relationship with the teacher in order to assess that teacher’s effectiveness. Unfortunately, that approach is the best we have at the moment until better data directly linked to teacher effectiveness is available.

    How about talking to the people who really know who is an effective teacher and who is not: the students. They always know and if given a proper opportunity, will tell you.

  17. Homeschooling Granny–that’s absolutely right… as long as such conversation is used with a good filter about whether a teacher is “popular” versus “effective.” I’d also add that if possible, asking students who had the teacher several years ago (and therefore have perhaps gained maturity to guide their insights) would be useful as well.