Clinton claims ‘no evidence’ for value-added

Democratic front-runner Hillary Clinton has been endorsed by both major teachers' unions. Photo: AP

Hillary Clinton is “saying everything teachers unions want to hear,” writes Lauren Camera on U.S. News.

“I have for a very long time also been against the idea that you tie teacher evaluation and even teacher pay to test outcomes,” she told New Hampshire teachers. “There’s no evidence. There’s no evidence.”

Is she right on the “no evidence” claim? asks Stephen Sawchuk on Teacher Beat.

There have been a number of empirical studies showing that value-added measures, which are based on test scores, do pick up on differences in teacher performance.

Whether value-added measures should be used to evaluate or pay teachers is another question, Sawchuk writes. In addition to “technical challenges,” there is a risk of encouraging test prep and ignoring all the non-tested things that make up a good education.

Research on whether performance pay improves learning is mixed.

One recent study of a federal initiative showed a small effect in reading, but that stands in notable contrast to other studies that have found virtually no effects.

He concludes that Clinton “glossed over” what studies say about teacher effectiveness. 

Teacher evaluation sticker shock in Florida

With hundreds of mentors and “peer evaluators,” big raises for teachers and consultants’ fees, teacher evaluation has become a budget buster in Hillsborough County, Florida, reports Marlene Sokol for the Tampa Bay Times.

The Gates Foundation offered $100 million to fund Empowering Effective Teachers if the district paid the other half. Although other foundations also contributed, the district’s share has ballooned to $124 million.

Frank Hannaway teaches music at MacFarlane Elementary in Hillsborough County, Florida. Credit: Willie J. Allen, Jr., Tampa Bay Times

“With $200 million in private and public money to play with, it was as if the district dined out nightly, ordered lobster and never kept track of the mounting tab,” writes Sokol.

Teachers got raises for performance — and for seniority. Most of the big raises went to veteran teachers in suburban schools, while high-poverty schools continued to get the least experienced, lowest-paid teachers.

Test scores rose, but the district continues to lag on graduation rates.

Hillsborough may cut back on peer evaluators, instead asking high-performing teachers to provide “non-evaluative” feedback to colleagues.

Valerie Strauss is leading the chorus of sneers, writing, “Another Bill Gates-funded education reform project, starting with mountains of cash and sky-high promises, is crashing to Earth.”

Forty-three states require that student achievement and growth be included in teacher evaluations, according to a National Council on Teacher Quality report. In 35 states, it’s a significant factor.

Only Alabama, New Hampshire and Texas have teacher effectiveness policies that exist only in waiver promises made to the U.S. Department of Education.

Endless testing? High stakes? Not really

U.S. schools don’t test as much as people think and the stakes “aren’t really that high,” argues Kevin Huffman, a New America fellow, in a Washington Post commentary.

“In an apparent about-face from his administration’s education policy over the past seven years,” President Obama said last week he wants to “fix” over-testing, writes Huffman. The administration wants to limit testing to 2 percent of classroom time.

Testing averages 1.6 percent of class time, according to a Center for American Progress analysis. In Tennessee, where Huffman was education commissioner, state-mandated tests took seven to 10 hours per student per year, less than 1 percent of class time.

“Where students spend too much time taking tests, local schools and districts — not federal or state policies — tend to be the culprits,” he adds.

Due to federal pressure, more states now evaluate teachers based partially on their students’ test scores. All use “multiple measures” and “nearly all teachers perform at or above expectations.”

When schools are evaluated, “significant interventions” are targeted at the bottom 5 percent of campuses, he writes.

“Many schools spend too much time on mind-numbing test prep, sitting kids at their desks and going over endless multiple-choice questions,” Huffman concedes. There’s little evidence it improves scores.

Too much testing

Schools are giving too many tests, President Obama has declared in a Facebook video. He wants to help schools to spend no more than 2 percent of instructional time on testing, while retaining “smart, strategic” tests.

Eighth-graders — the most tested students — spend 4.22 days or 2.34 percent of school time taking mandated tests, estimates a Council of the Great City Schools study of 66 urban school districts. That doesn’t include time devoted to test prep.

No Child Left Behind requires an annual math and reading exam in grades 3 through 8 and once in high school in math. Science is tested in some grades. But states and districts have added many other tests — often to qualify for federal grants and waivers, notes the Washington Post.

To win a grant under the competitive Race to the Top program, or to receive a waiver from No Child Left Behind, states had to evaluate teachers based in part on student test scores. Since federal law required standardized tests only in math and reading in certain grades, states added tests in social studies, science, languages — even physical education — to have scores they could use to evaluate teachers.

“Many of the appalling things reported on here are the direct result of the way the federal government has approached this,” said Marc Tucker, president of the National Center on Education and the Economy. “The accountability system is what’s driving this and it’s fundamentally flawed.”

The average urban student takes roughly 112 tests between pre-K and grade 12, the Great City Schools report finds. There’s lots of duplication: Some districts require a “summative” exam and an “end-of-course” exam in the same subject. In addition, most tests “don’t actually assess students on any particular content knowledge.”

Often, results aren’t used to improve teaching, the report found. Results come months late and teachers aren’t trained in how to use the results.

Eliminating duplication makes sense, of course. But the most effective way to cut testing time is to give up on evaluating teachers by their students’ test scores. (The new Student Learning Objective assessments for non-NCLB subjects are unreliable and low-quality, says Great City Schools.) Is the Obama administration ready to back away from that policy?

Common Core-aligned tests take much longer because they require students to do more writing and less bubbling. I assume that would fall under “smart” and “strategic.”

Training mirage: Most teachers don’t improve

Despite heavy spending on professional development, most district teachers don’t improve their skills after the first few years, according to The Mirage. The TNTP study analyzed three large public school districts and a mid-sized charter school network.

The three districts spend an average of $18,000 per teacher each year on development efforts, which take nearly 10 percent of the school year. Yet “only three out of 10 teachers in the districts we studied improved substantially over several years, even though many have not yet mastered critical instructional skills.” Five stayed the same and two declined in effectiveness.

Teachers improved in 95 percent of schools studied, but no approach to training or amount of training appeared to be more effective.

The vast majority of district teachers “received high marks on their evaluations” and less than half of teachers surveyed thought their teaching skills needed improvement.

“Even the few teachers who did earn low ratings seemed to reject them,” the report found. “More than 60 percent of low-rated teachers still gave themselves high performance ratings.” Among teachers whose observation scores declined “substantially” over the previous two years, 80 percent said their teaching had improved.

By contrast, 70 percent of the charter network’s teachers improved significantly, showing more growth than district teachers at every experience level. Their students also improved more than students in neighboring schools.

Charter teachers were much more critical of their own skills: Only 4 percent gave their teaching a 5 on a 1-5 scale, compared to 30 percent of district teachers. Eighty-one percent of charter teachers said their teaching skills had weaknesses; only 47 percent of district teachers acknowledged room for improvement.

The charter network spent $33,000 per teacher for “a tight loop of observation, feedback, and implementation,” notes Alyssa Schwenk on Gadfly.

Two factors — “openness to feedback” and “ratings alignment” — were linked to improved teaching, writes Catherine Brown, vice president for education policy at the Center for American Progress, in U.S. News.

In other words, teachers who were open to hearing ways to get better got better. “Ratings alignment,” which means teachers rated themselves the same as their evaluators, embodies a similar concept: These teachers were clear-eyed about the deficiencies and bright spots in their own practice.

. . . By establishing early on that teachers are going to get a lot of feedback and having master teachers with dedicated time in their schedules to helping other teachers improve, the high-performing charter network in The Mirage was able to create a culture where it’s OK to say you have room for improvement.

Some charters screen for openness to feedback when interviewing new teachers, Brown writes. Candidates teach a lesson, receive feedback and then teach again to show whether they can use the feedback to improve. (I believe New Teachers for New Schools, now known as New Leaders, developed this model. It’s used in two high-performing San Jose charters I visited in the spring.)

The Madison Metropolitan School District also selects candidates based on their ability to “reflect on strengths and growth areas regularly, and seek support, feedback, and mentors to improve,” writes Brown.

She suggests spending more on recruiting and selecting feedback-friendly achievers would make later training more effective.

Accountability fail

A highly rated New York City teacher who moves to a low-rated school will get an asterisk on her new ratings, writes teacher Arthur Goldstein in an open letter to Chancellor Carmen Fariña.

“Doesn’t that indicate that the test scores are determined more by students themselves as opposed to teachers?” he asks.

Goldstein teaches English as a Second Language to immigrant students who tend to do badly on standardized tests. It would be “irresponsible of me to neglect . . . basic conversation and survival skills,” yet the test focuses on academic English.

Teaching ESL or special education is a high-risk specialty, Goldstein argues.

Attaching high stakes to test scores places undue pressure on high-needs kids to pass tests for which they are unsuited. For years I’ve been hearing about differentiation in instruction. I fail to see how this approach can be effectively utilized when there is no differentiation whatsoever in assessment. It’s as though we’re determined to punish both the highest needs children and their teachers.

Teacher morale has “taken a nose dive” because of high-stakes evaluations, he writes.

Accountability can backfire, writes Marc Tucker in Ed Week.

When states decided to track and publish surgeons’ success rates, the very best surgeons took fewer high-risk cases, according to several studies.

Rating teachers by their students’ performance poses the same risk, argues Tucker. Instead of rewarding good teachers, it may reward teachers with good students and penalize those who teach the most challenging students.

He imagines a top teacher who leaves her suburban school for a high-poverty school. The work is much harder. “Your students’ scores on the state tests may not go up much, but you know what you have done for a number of these kids has spelled the difference between a chance for a future and none at all,” Tucker writes. But the teacher earns a very low rating and other experienced teachers decide that teaching the neediest kids is too much of a risk.

Value-added measures are supposed to compare students’ scores with their own past performance, so teachers aren’t penalized for teaching low-performing kids. But it’s not clear that the measures are reliable — especially for the many teachers who don’t teach subjects that are tested.

Evaluation isn’t about firing bad teachers

Nearly all teachers receive high ratings in most districts. Teachers are in short supply in some parts of the country, writes Paul Bruno for the Brookings Institution. “The extent to which a principal is willing to dismiss (or give a poor evaluation to) a teacher will likely depend in part upon her beliefs about the probability of finding a superior replacement in a reasonable period of time.”

Teacher evaluation systems should be seen as a way to help teachers improve, not as a system to “dismiss teachers,” responds Bellwether’s Kaitlin Pennington.

New evaluation systems were meant to be a tool to reward excellent instruction, provide opportunities for targeted professional development, and create systems of support in schools in districts. Unfortunately, new teacher evaluation systems in many places were sold as ways to “get rid of bad teachers,” which greatly hurt implementation efforts.

Effective evaluation systems let a principal who’s hiring know “what effective teaching looks like and how it is measured,” writes Pennington.

Teacher evaluation isn’t included in either version of the Elementary and Secondary Education Act, Pennington points out. “States will not have the political cover from federal policy to move forward with teacher evaluation.” And if it’s seen as just a way to fire teachers, it will not survive.

Judging a music teacher by reading, math scores

Music, art, P.E. and other non-academic teachers are being judged based on reading and math scores, writes Alexandria Neason on Slate.

Nick Prior teaches music at Albuquerque’s Eisenhower Middle School. His choirs have won state and national competitions. He won a statewide teaching award from the New Mexico Music Educators Association in 2014.

Nicholas D Prior’s teacher evaluation form.

But Prior was rated “minimally effective” on his annual evaluation. He earned 33.25 points out of a possible 100 in the “student achievement” category that made up half of the document. “Achievement” had nothing to do with music. It was based on reading and math scores of his school’s lowest performing quarter of students, many of whom hadn’t taken one of his classes.

Prior earned average or above-average ratings in classroom observations, teacher attendance, and student and parent surveys, but it wasn’t enough to balance the low reading and math scores.

Forty-two states across the country have moved in recent years to evaluate all teachers at least in part on student test score growth, according to the National Council on Teacher Quality. But tens of thousands of teachers work with students in grades that aren’t tested (like kindergarten) or subjects in which standardized tests typically don’t exist (like art, music, and physical education).

Officials in Nevada are even considering how they might hold support staff—like school nurses and counselors—responsible for student test results, arguing that they impact student achievement by keeping students healthy and able to learn.

Some are creating new tests to measure music, art and PE achievement, writes Neason. Others argue that all school staffers are responsible for teaching foundational literacy and math skills.

The percentage of a teacher’s evaluation that rests on schoolwide scores varies from 5 percent for Chicago high school teachers to 25 percent in Tennessee, as high as 40 percent in Florida, and 50 percent in New Mexico, according to Neason.

Prior, who makes $30,000 as a “level one,” or beginning, teacher, had hoped for an advanced rating that would raise his pay to $40,000. Instead, he could lose his teaching license if his score doesn’t improve next year.

Core tests aren’t really ‘high stakes’

Forty-four states plus the District of Columbia are giving Common Core-aligned tests this spring, but the stakes are low for students and only slightly higher for teachers, according to a Hechinger Report survey.



Test boycotts spread

What Happens When Students Boycott a Standardized Test? asks Laura McKenna in The Atlantic.

Anti-testing politics are evolving in New Jersey, she writes. Resistance to new Core-aligned tests started with a small number of parents and grew as “teachers unions helped parents organize.”

. . . weeks before the March segment of the PARCC, the NJEA, New Jersey’s largest teachers union, aired a series of widely viewed television commercials that denounced the exam. One ad features a middle-aged dad with a goatee telling a group of fellow parents that his first-grader cried when he came home from school, apparently too tired to go to karate practice. The goateed dad despairs, “What are we doing to our kids?”

. . . Some of the unions’ local branches even arranged parties to view the film Standardized (Lies, Money, & Civil Rights: How Testing Is Ruining Public Education) or set up websites informing parents how to complete the necessary paperwork to release their children from the testing.

Now students are asking their parents to exempt them from testing, McKenna writes. Her 15-year-old son “used every weapon in his teenage arsenal—eye rolls, deep sighs, guilt-tripping, and even logic—to pressure my husband and me to write a letter to the school opting him out of the test.” None of his friends were taking the PARCC exam, he claimed. (It didn’t work.)

Parents protest PARCC in Northampton, MA.

In New Jersey, 5 percent of students “are estimated to have opted out of the first installment of the PARCC test, which was conducted in March; greater numbers are expected to refuse to take the second one in May,” McKenna writes. Opt-outs are most common in affluent communities, which means students likely to do well are the most likely to sit out the exams.

Opting out has become a “movement of conscience” for parents and teachers, argues Carol Burris, a principal who sees Common Core standards narrowing what’s taught. In her New York district, 30 percent of students already have asked to sit out the tests.

“In the majority of classrooms, where opt-out appears likely to remain at low levels,” students “sitting out of standardized testing will have only a trivial impact on the ratings received by their teachers,” writes Matthew Chingos at Brookings’ Chalkboard.