Gates: Mix measures to evaluate teachers

Combining growth in students’ test scores, student feedback and classroom observations produces accurate information on teacher effectiveness, according to Gates Foundation research.

A composite measure of teacher effectiveness drawing on all three sources, and tested through a random-assignment experiment, predicted fairly accurately how much high-performing teachers would boost their students’ standardized-test scores, concludes a series of new papers from the massive Measures of Effective Teaching study launched three years ago.

No more than half of a teacher’s evaluation should be based on growth in student achievement, researchers concluded. In addition, teachers’ classroom performance should be observed by more than one person.

Of course, the controversy on how to evaluate teachers — and what to do with the information — is not over.

The ever-increasing federal role in education makes no sense, writes Marc Tucker, who complains that U.S. Education Secretary Arne Duncan is forcing states to evaluate teachers based on student performance in order to get No Child Left Behind waivers.  Most researchers don’t think value-added measures of teacher performance are reliable, writes Tucker.

The study is a “political document and not a research document,” Jay Greene tells the Wall Street Journal. Classroom observations aren’t a strong predictor of student performance, says Greene, a professor of education policy at the University of Arkansas. “But the Gates Foundation knows that teachers and others are resistant to a system that is based too heavily on student test scores, so they combined them with other measures to find something that was more agreeable to them,” he said.

LA study: New teachers get worst students

In Los Angeles Unified, new teachers get the weakest students, reports a six-year study by the Strategic Data Project.

The study also found “significant disparities in effectiveness among the district’s elementary and middle school teachers, as measured by students’ standardized test scores,” notes EdSource Today.

Researchers found that the difference between a math teacher in the 75th percentile of effectiveness – one whose students outperformed the students of three quarters of other teachers – and a teacher in the 25th percentile was roughly equivalent, for a student, to eight additional months of instruction in a calendar year (technically, one quarter of a standard deviation).
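That “months of instruction” figure is just a rescaling of the effect size. Here is a minimal back-of-envelope sketch of the conversion in Python; the annual growth rate used below is a hypothetical value inferred from the article’s own numbers, not a figure reported by the researchers:

```python
# Illustrative back-of-envelope conversion, not the study's actual model.
# If a typical student gains `annual_gain_sd` standard deviations of math
# achievement over a 12-month calendar year, then an effect of `effect_sd`
# corresponds to (effect_sd / annual_gain_sd) * 12 months of instruction.

def effect_to_months(effect_sd: float, annual_gain_sd: float) -> float:
    """Translate an effect size (in standard deviations) into months of instruction."""
    return effect_sd / annual_gain_sd * 12

# The article's figures (0.25 SD ~ 8 months) imply an annual gain of roughly
# 0.375 SD; that number is inferred here, not stated in the study.
print(effect_to_months(0.25, 0.375))  # ~8.0 months
```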

New teachers hired through Teach for America and through the district’s Career Ladder program, which helps aides become teachers, were more effective in math than other novice teachers: by two months of instruction for TFA corps members and one month for former aides. However, most TFA teachers leave after two years, while Career Ladder teachers usually stay for the long haul.

Forty-five percent of laid-off teachers ranked in the top two quartiles in effectiveness, the study found. All layoffs are based on seniority.

Los Angeles teachers with advanced academic degrees earn more, but are no more effective, the study found. However, “teachers with a National Board Certification outperform other teachers, by roughly two months of additional math instruction and one month of additional ELA instruction over a year.”  Most board-certified teachers in Los Angeles work in high-performing schools.

How federal rules block innovation

Federal education funding is supporting the status quo, argues a new Center on Reinventing Public Education report, Federal Barriers to Innovation. Authors Raegen Miller and Robin Lake focus on Title I funding for disadvantaged students and  IDEA funding for disabled students.

The Title I comparability loophole, for instance, prevents districts from adopting promising new technology-based school models. If a district has a high-poverty school staffed with inexperienced, lower-paid teachers, and an affluent school of the same size staffed with the same number of more experienced, higher-paid teachers, those schools are considered to have comparable staffing levels, even though far more salary dollars flow to the affluent school. The loophole masks the true educational costs of schools, reinforces a traditional compensation system that favors tenure and post-graduate education, and prevents districts from differentiating pay in strategic ways.

IDEA’s maintenance of effort requirement forces districts to keep spending money “without regard for its efficiency or effectiveness.” That blocks innovative teaching methods and technologies.

Instead, IDEA needs a “challenge waiver” system, Miller and Lake write.

Districts could be granted waivers for the 100 percent spending threshold on special education and related services “provided they furnish a coherent, strategic special education plan documenting the rationale for a lower threshold.” Such a system would encourage more data-driven decision-making, while random audits would ensure fidelity of implementation.

In addition, they call for “redirecting Title II funds (an amalgam of funding streams supporting ineffectual professional development and class-size reduction programs)” toward effective new instructional technologies.

Good principals are great

Good principals are very, very good for teachers and students, concludes a study in Education Next. “For student outcomes, greater attention to the selection and retention of high-quality principals would have a very high payoff,” write Gregory F. Branch, Eric A. Hanushek and Steven G. Rivkin.

. . . highly effective principals raise the achievement of a typical student in their schools by between two and seven months of learning in a single school year; ineffective principals lower achievement by the same amount. These impacts are somewhat smaller than those associated with having a highly effective teacher. But teachers have a direct impact on only those students in their classroom; differences in principal quality affect all students in a given school.

Less-effective teachers are more likely to leave schools run by highly effective principals, the study found. “Good principals are likely to make more personnel changes in grade levels where students are under-performing.”

Unsuccessful principals aren’t weeded out, especially in high-poverty schools. Those who leave go to other schools.

The value-added analysis looked at “the extent to which math achievement in a school is higher or lower than would be expected based on the characteristics of students in that school, including their achievement in the prior year.”
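As a rough illustration of that logic, here is a minimal value-added sketch in Python. It is not the authors’ actual model (real analyses add many more controls, multiple years of data and statistical shrinkage), and the file and column names are hypothetical:

```python
# Minimal value-added sketch (illustrative only). Hypothetical columns:
# 'math_score', 'prior_math_score', 'frl' (free/reduced-price lunch status),
# 'school_id'.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("students.csv")  # hypothetical student-level data file

# Predict this year's math score from last year's score and a student characteristic.
model = smf.ols("math_score ~ prior_math_score + frl", data=df).fit()
df["residual"] = df["math_score"] - model.predict(df)

# A school's "value added" is how much its students over- or under-perform
# that prediction, on average.
value_added = df.groupby("school_id")["residual"].mean().sort_values()
print(value_added)
```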

Merit mandate = $1 bonus for top teachers

Some Michigan school districts think their best teachers are worth $1 more than their worst, reports Michigan Capitol Confidential.

That’s the amount the Davison Community Schools in Genesee County, and the Stephenson Area Public Schools in Menominee County, pay to be in compliance with the state’s merit pay law, which was put in place when Jennifer Granholm was governor. The Gladstone Area Public Schools in Delta County pays its top-notch teachers $3 more than the worst.

Job performance must be “a significant factor in determining compensation,” according to state law. In Davison and Stephenson schools, that means a $1 bonus for  “highly effective” teachers. Gladstone pays a $3 bonus to “highly effective” teachers, $2 to those rated “effective” and an extra $1 to any teacher who “meets goals.”

Eighty percent of Michigan districts are ignoring the merit pay law, estimates the Mackinac Center for Public Policy.  Teachers are paid based on years of experience and credits earned past a bachelor’s degree. There’s no monetary reward for teaching well.

. . .in the Troy School District in Oakland County, seven gym teachers made more money in 2011 than a biology teacher who was selected as a national teacher of the year.

A measure on the November ballot, Proposal 2, would end the merit pay mandate by letting government union contracts  overrule state laws.

A few districts have replaced the old salary scales with performance pay without spending more overall on salaries, says Michael Van Beek, education policy director at Mackinac.

When students grade teachers

When students evaluate their teachers, they’re remarkably good at identifying who’s effective and who’s not, writes Amanda Ripley in The Atlantic. Student evaluations have proved to be “more reliable than any other known measure of teacher performance—including classroom observations and student test-score growth,” researchers have found, Ripley writes.

Some 250,000 students participated in a Gates Foundation study of student evaluations, using a survey developed by Harvard economist Ronald Ferguson.

The responses did indeed help predict which classes would have the most test-score improvement at the end of the year. In math, for example, the teachers rated most highly by students delivered the equivalent of about six more months of learning than teachers with the lowest ratings. (By comparison, teachers who get a master’s degree—one of the few ways to earn a pay raise in most schools —delivered about one more month of learning per year than teachers without one.)

Students were better than trained adult observers in evaluating teacher effectiveness, probably because students spend a lot more time with each teacher. And there are more of them.

Five items were linked strongly with student learning:

1. Students in this class treat the teacher with respect.

2. My classmates behave the way my teacher wants them to.

3. Our class stays busy and doesn’t waste time.

4. In this class, we learn a lot almost every day.

5. In this class, we learn to correct our mistakes.

Teachers were surprised that caring about students was less important than controlling the classroom and challenging students, Ripley writes.

At McKinley Technology High School in Washington D.C., the same students “gave different teachers wildly different reviews” on Control and Challenge.

For Control, which reflects how busy and well-behaved students are in a given classroom, teachers’ scores ranged from 16 to 90 percent favorable; for Challenge, the range stretched from 18 to 88 percent. Some teachers were clearly respected for their ability to explain complex material or keep students on task, while others seemed to be boring their students to death.

Memphis now counts student survey results as 5 percent of a teacher’s evaluation in the annual review; 35 percent is linked to students’ test scores and 40 percent to classroom observations.
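Mechanically, such a composite is just a weighted average of the component scores. Here is a minimal sketch using the Memphis weights reported above; the remaining 20 percent is not broken down in the article, so it appears only as a placeholder component:

```python
# Illustrative weighted composite for a teacher evaluation. The 5/35/40 weights
# are the ones reported for Memphis; 'other' stands in for the remaining 20
# percent, which the article does not break down.
WEIGHTS = {
    "student_survey": 0.05,
    "test_score_growth": 0.35,
    "classroom_observation": 0.40,
    "other": 0.20,
}

def composite_score(component_scores: dict) -> float:
    """Weighted average of component scores, each on the same 0-100 scale."""
    return sum(WEIGHTS[name] * score for name, score in component_scores.items())

# Hypothetical teacher: strong observations, weaker test-score growth.
print(composite_score({
    "student_survey": 80,
    "test_score_growth": 60,
    "classroom_observation": 90,
    "other": 75,
}))  # 76.0
```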

The use of student surveys is spreading to Georgia and Chicago — and possibly Pittsburgh — Ripley writes.

Study: Some ‘alternate’ teachers do well

Florida’s alternatively certified teachers have better qualifications but vary in classroom effectiveness, concludes a study in Education Research reported by Ed Week‘s Teacher Beat.

Georgia State researcher Tim R. Sass compared the growth in test scores of students taught by teachers certified through community colleges’ Educator Preparation Institute (EPI) option, through district-run alternative certification and through the American Board for Certification of Teacher Excellence (ABCTE). He then added traditionally certified teachers to the comparison.

Compared to graduates of Florida’s teacher colleges, alt-cert teachers “graduated on average from more competitive colleges, tended to pass the licensing tests the first time, and had higher SAT scores.” They also had taken two additional science courses in college.

. . . The EPI completers tended to do worse than traditionally prepared teachers, or about 3 to 4 percent of a standard deviation lower. By contrast, the ABCTE teachers boosted math achievement on average by 6 to 11 percent of a standard deviation more than traditionally prepared teachers. They were only slightly better in reading, however.

District-certified teachers did about the same as traditionally trained teachers.

In a 2009 study, ABCTE teachers performed worse in math, notes Teacher Beat, which adds that the sample sizes are small.

Movin’ and improvin’

Teacher-effectiveness data should be used to help teachers improve, not just to fire incompetents, argues Movin’ It and Improvin’ It! by Craig Jerald, an education policy consultant, on the Center for American Progress site.

. . . districts are missing an opportunity to … help leverage their highest performers and help teachers with strong potential grow into solid contributors.

The “movin’ it” strategy uses “selective recruitment, retention, and ‘deselection’ to attract and keep teachers with higher effectiveness while removing teachers with lower effectiveness.”

In contrast, “improvin’ it” policies treat teachers’ effectiveness as a mutable trait that can be improved with time. When reformers talk about providing all teachers with useful feedback following classroom observations or using the results of evaluation to individualize professional development for teachers, they are referring to “improvin’ it” strategies. If enough teachers improved their effectiveness, then the accumulated gains would boost the average effectiveness in the workforce.

Smart districts will do both, Jerald argues.

Professional development rarely improves teaching effectiveness and student learning, research shows. “The nation’s school systems spend billions of dollars annually on wasteful and ineffective professional development,” Jerald writes. Yet some forms of training have shown “substantial improvements in teaching and learning” in the last two years.

The uses (and misuses) of value-added research

Value-added research, which uses “sophisticated statistical techniques to attempt to isolate a teacher’s effect on student test score growth,”  makes sense, writes Matt DiCarlo in a thoughtful analysis on Shanker Blog. What’s troubling is how the models are used.

For example, the most prominent conclusion of this body of evidence is that teachers are very important, that there’s a big difference between effective and ineffective teachers, and that whatever is responsible for all this variation is very difficult to measure (see here, here, here and here). These analyses use test scores not as judge and jury, but as a reasonable substitute for “real learning,” with which one might draw inferences about the overall distribution of “real teacher effects.”

And then there are all the peripheral contributions to understanding that this line of work has made.

What the research “does not show is that it’s a good idea to use value-added and other growth model estimates as heavily-weighted components in teacher evaluations or other personnel-related systems,” DiCarlo concludes.

As has been discussed before, there is a big difference between demonstrating that teachers matter overall – that their test-based effects vary widely, and in a manner that is not just random – and being able to accurately identify the “good” and “bad” performers at the level of individual teachers.

Most districts and states use value-added models poorly, concludes DiCarlo.

Teach for America outperforms in Tennessee

Teach for America teachers in Memphis and Nashville outperformed both experienced and new teachers, according to a state report card on teacher training. Teachers trained at Nashville’s Lipscomb University also did well.

Nine teacher training programs, including Tennessee State University, the University of Tennessee-Martin, Middle Tennessee State and the Memphis Teacher Residency, were cited because their new teachers failed to match the quality of new teachers from other programs.

Memphis Teacher Residency, which recruits college graduates from other careers, posted low scores for high school teachers but relatively high scores for teachers in grades four through eight.