Will new tests live up to the hype?

Muslim Alkurdi, 18, of Albuquerque High School, joins hundreds of classmates in Albuquerque, N.M., Monday, March 2, 2015, as students staged a walkout to protest a new standardized test they say isn't an accurate measure of their education. Students frustrated over the exam walked out of schools across the state as it was being given. The backlash came as millions of U.S. students began taking more rigorous exams aligned with Common Core standards.

In 2010, U.S. Secretary of Education Arne Duncan promised teachers that Common Core-aligned Assessments 2.0 would be the tests they had “longed for.”

Millions of students are taking those new tests this spring, writes Emmanuel Felton on the Hechinger Report. Enthusiasm for the new tests has waned.

The federal government put $360 million into the Partnership for Assessment of Readiness for College and Careers (PARCC) and the Smarter Balanced Assessment Consortium, which developed Core-aligned tests.

This spring, of the original 26 states that signed up for PARCC, just 11 plus Washington, D.C. are giving the test. Of the original 31 signed up for Smarter Balanced, only 18 are still on board. (In the early years, some states were members of both coalitions.) Several of the states will give the PARCC or Smarter Balanced test for one year only, before switching to their own state-based exams next year. Another Common Core exam, known as Aspire, produced by ACT, has stolen away some states from the federally sponsored groups; this spring students in South Carolina and Alabama will take that test.

On the old state tests, only 2 percent of math questions and 21 percent of English questions assessed “higher-order skills,” such as abstract thinking and the ability to draw inferences, concluded a 2012 RAND study of 17 state tests.

Two-thirds of PARCC and SBAC questions call for higher-order skills, according to a 2013 analysis by the National Center for Research on Evaluation, Standards, and Student Testing.

“In the old tests a student would just get a vocabulary word by itself and would be asked to find a synonym,” said Andrew Latham, director of Assessment & Standards Development Services at WestEd, a nonprofit that worked with Smarter Balanced and PARCC on the new tests. “Now you will get that word in a sentence. Students will have to read the sentence and be able to find the right answers through context clues.”

The new tests require students to answer open-ended questions, which takes more time.  Smarter Balanced will take eight and a half hours, while some PARCC tests will take over ten hours.

Duncan had promised teachers would get quick feedback from the new tests, but it takes time to grade students’ writing. The only way to get fast feedback is to use robo-graders instead of humans.

No profit left behind

Pearson, the British publishing behemoth, sells billions of dollars’ worth of textbooks, tests, software and online courses in North America, reports Politico’s Stephanie Simon in “No profit left behind.”

“Public officials often commit to buying from Pearson because it’s familiar, even when there’s little proof its products and services are effective,” writes Simon.

Its software grades student essays, tracks student behavior and diagnoses — and treats — attention deficit disorder. The company administers teacher licensing exams and coaches teachers once they’re in the classroom. It advises principals. It operates a network of three dozen online public schools. It co-owns the for-profit company that now administers the GED.

Pearson’s interactive tutorials on subjects from algebra to philosophy form the foundation of scores of college courses. It builds online degree programs for a long list of higher education clients, including George Washington University, Arizona State and Texas A&M. The universities retain authority over academics, but Pearson will design entire courses, complete with lecture PowerPoints, discussion questions, exams and grading rubrics.

In peak years, the company has “spent about $1 million lobbying Congress and perhaps $1 million more on the state level,” writes Simon. But, she adds, the National Education Association spent $2.5 million lobbying Congress in 2013.

I think this is the key point:

“The policies that Pearson is benefiting from may be wrongheaded in a million ways, but it strikes me as deeply unfair to blame Pearson for them,” said Jonathan Zimmerman, an education historian at New York University. “When the federal government starts doing things like requiring all states to test all kids, there’s going to be gold in those hills.”

The real question is whether schools need the products and services they’re buying from Pearson and its competitors. As long as Pearson has competitors, it can’t jack up its prices or lower its quality without losing business. For example, it’s losing GED customers like crazy because the new test is too expensive and too difficult. I predict they’ll announce a new new GED or lower prices to regain business.

How hard are Core math problems?

Math teachers in Maryland analyzed a Core-aligned fourth-grade math performance task from PARCC, reports Liana Heitin on Ed Week. Several were surprised at how much it required.

[Image: PARCC sample fourth-grade math item about the deer population in a rectangular park]

Teachers listed what students need to know and be able to do to solve the problem:

The definitions of perimeter and area
How to find perimeter and area
The definition of a square mile
The properties of a rectangle
How to solve for an unknown in a perimeter
Multiplication (up to multi-digit)
Addition and subtraction (up to multi-digit)

Some might need division, depending on how they approach the problem.

And everyone will need reading and writing skills.

Students earn credit for finding the missing side length, for finding the area of the park, and for calculating the final number of deer. They also can get partial credit for each piece if they make minor calculation errors. That means the problem must be scored by a person, not a machine.
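For the curious, the arithmetic itself is modest once broken into steps. Here is a rough sketch in Python, using made-up numbers (the actual figures are in the released PARCC item, which isn't reproduced here), of the missing-side, area and deer-count steps:

```python
# Made-up numbers standing in for the released PARCC item.
perimeter_mi = 16     # perimeter of the rectangular park, in miles
known_side_mi = 5     # the one side length the item gives, in miles
deer_per_sq_mi = 9    # assumed deer density, deer per square mile

# Solve for the unknown side: P = 2 * (l + w), so w = P / 2 - l
missing_side_mi = perimeter_mi / 2 - known_side_mi

# Area of a rectangle, in square miles
area_sq_mi = known_side_mi * missing_side_mi

# Final step: deer density times area
total_deer = deer_per_sq_mi * area_sq_mi

print(missing_side_mi, area_sq_mi, total_deer)  # 3.0 15.0 135.0
```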

Here are some fifth-grade math questions released by New York. (Here are third- through eighth-grade questions for English and math.)

The next one involves the (gasp!) metric system.

Could I have solved these in fifth grade? I think so.

Promises, ineptitude and overreach

Race to the Top was a loser, writes Rick Hess on the fifth anniversary of the Obama administration’s $4.35 billion education competition. RTTT has become “a monument to paper promises, bureaucratic ineptitude, and federal overreach.”

Instead of letting states come up with reform ideas, the administration created a list of 19 “priorities.” States could “ace three of the 19 priorities if they promised to adopt the brand-new Common Core and its federally-funded tests.”

 Applicants produced hundreds of jargon-laden pages in an attempt to convince the Department-selected reviewers that they would do what the administration asked. As one reviewer described it to me, “We knew the states were lying. The trick was figuring out who was lying the least.”

. . . States promised to adopt “scalable and sustained strategies for turning around clusters of low-performing schools” and “clear, content-rich, sequenced, spiraled, detailed curricular frameworks.”

. . . winning states relied heavily on outside consultants funded by private foundations. This meant that in-house commitment to the promised reforms could be pretty thin.

At the height of the Great Recession, dangling billions in federal dollars encouraged state education leaders to dream up new spending programs, Hess writes. Yet the value for grant winners amounted to “about one percent of a state’s annual K-12 budget.”

The Common Core might have been “a collaborative effort of 15 or so enthusiastic states,” writes Hess. RTTT transformed it into “a quasi-federal initiative with lots of half-hearted participants who signed on only for federal dollars.”

Given that Race to the Top also pushed states to hurriedly adopt new teacher evaluation systems and specifically to use test results to gauge teachers, not-ready-for-primetime evaluation systems are now entangled with the Common Core and new state tests.

Now, states are running from their Race to the Top promises, threatening the Common Core enterprise.

Test answers are in the (missing) book

Pennsylvania’s state exams can be “gamed” by a “shockingly low-tech strategy,” writes Meredith Broussard, a Temple professor of data journalism. All it takes is reading “the textbooks created by the test makers.”

Poor Schools Can’t Win at Standardized Testing because they don’t have the right books, she writes in The Atlantic.

On the 2009 Pennsylvania exam, third-grade students were asked to write down an even number with three digits and explain how they know it’s even.

Here’s an example of a correct answer from a testing supplement put out by the Pennsylvania Department of Education:

This partially correct answer earned one point instead of two:

Everyday Math’s third-grade study guide tells teachers to drill students on the rules for odd and even factors and to have them explain how they know a rule is true, Broussard writes. “A third-grader without a textbook can learn the difference between even and odd numbers, but she will find it hard to guess how the test-maker wants to see that difference explained.”
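The math fact itself is trivial; what the rubric rewards is the explanation. Here is a minimal Python sketch of the rule a third-grader is expected to articulate (my illustration, not the test-maker's rubric):

```python
def is_even(n: int) -> bool:
    """A whole number is even if dividing it by 2 leaves no remainder;
    equivalently, its ones digit is 0, 2, 4, 6 or 8."""
    return n % 2 == 0

print(is_even(354))  # True: 354 = 2 * 177, and its ones digit is 4
print(is_even(731))  # False: dividing 731 by 2 leaves a remainder of 1
```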

I’m not shocked that tests are aligned to textbooks. What’s truly disturbing is Broussard’s research into whether Philadelphia schools have the right books. She found district administrators don’t know what curriculum each school is using, what books they have or what they need.

According to district policy, every school is supposed to record its book inventory in a centralized database called the Textbook Storage System. “If you give me that list of books in the Textbook Storage System, I can reverse-engineer it and make you a list of which curriculum each school uses,” I told the curriculum officer.

“Really?” she said. “That would be great. I didn’t know you could do that!”
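Her point is that the inference is mechanical: once you know which titles belong to which published curriculum, a school's book list gives away the curriculum it uses. A hedged sketch of the idea, with made-up schools and inventory (my illustration, not Broussard's actual code):

```python
from collections import Counter

# Map textbook titles to the curriculum they belong to.
TITLE_TO_CURRICULUM = {
    "Everyday Math Grade 3": "Everyday Mathematics",
    "Everyday Math Grade 4": "Everyday Mathematics",
    "enVisionMATH Grade 3": "enVisionMATH",
}

# Made-up Textbook Storage System records: (school, title)
inventory = [
    ("School A", "Everyday Math Grade 3"),
    ("School A", "Everyday Math Grade 4"),
    ("School B", "enVisionMATH Grade 3"),
]

curriculum_by_school = {}
for school in sorted({s for s, _ in inventory}):
    counts = Counter(
        TITLE_TO_CURRICULUM[title]
        for s, title in inventory
        if s == school and title in TITLE_TO_CURRICULUM
    )
    # The curriculum whose titles dominate the school's shelves
    curriculum_by_school[school] = counts.most_common(1)[0][0]

print(curriculum_by_school)
# {'School A': 'Everyday Mathematics', 'School B': 'enVisionMATH'}
```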

Principals use their own systems for tracking supplies and books. Short of support staff, schools stack books in closets and forget they’re there. Teachers scavenge materials from closed schools and spend their own money to supplement their $100 a year supplies budget.

Broussard built a program, Stacked Up, which found the average Philadelphia school has 27 percent of the books it needs. But that’s just a guess because nobody really knows who’s got what.
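The 27 percent figure is just a have-versus-need calculation, once the inventory and enrollment data exist at all. Another hedged sketch, again with made-up numbers rather than Stacked Up's real data or code:

```python
from collections import defaultdict

# Copies needed per (school, title), e.g. one per enrolled student (made up)
required = {
    ("School A", "Everyday Math Grade 3"): 100,
    ("School A", "Everyday Math Grade 4"): 100,
    ("School B", "Everyday Math Grade 3"): 110,
}

# Copies actually on the shelves, per (school, title) (made up)
on_hand = {
    ("School A", "Everyday Math Grade 3"): 60,
    ("School B", "Everyday Math Grade 3"): 120,
}

have = defaultdict(int)
need = defaultdict(int)
for (school, title), needed in required.items():
    need[school] += needed
    # Extra copies of one title can't cover a shortage of another
    have[school] += min(on_hand.get((school, title), 0), needed)

for school in sorted(need):
    print(f"{school}: {100 * have[school] / need[school]:.0f}% of needed books")
# School A: 30% of needed books
# School B: 100% of needed books
```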

Why they cheated

Christopher Waller, the principal of Parks, was lauded in Atlanta and became a minor celebrity of the school-reform movement.

A former math teacher at a high-poverty Atlanta middle school explains why the principal and teachers cheated in a sympathetic New Yorker profile.

Students who’d passed a competency test in fifth grade arrived at Parks Middle School with first-grade reading levels. The elementary schools were cheating, Principal Christopher Waller concluded. And his supervisors didn’t care.

Waller recruited Damany Lewis to lead a team of teachers willing to change wrong answers. He told them the school would close if it didn’t meet Superintendent Beverly Hall’s unreachable targets.

During testing week, after students had completed the day’s section, Waller distracted the testing coördinator, Alfred Kiel, by taking him out for leisurely lunches in downtown Atlanta. On their way, Waller called the reading coördinator to let her know that it was safe to enter Kiel’s office. She then paged up to six teachers and told them to report to the room. While their students were at recess, the teachers erased wrong answers and filled in the right ones. Lewis took photographs of the office with his cell phone so that he could make sure he left every object, even the pencils on Kiel’s desk, exactly as he’d found them.

As the school’s scores soared, Parks was lauded for a turnaround attributed to a “relentless focus on data.” Waller got much of the credit.

In the spring of 2008, Parks’s scores were almost as high as those of a middle school in Inman Park, a gentrified neighborhood with yoga studios, bike paths, and million-dollar houses. Waller thought the results seemed obviously false, and he called his supervisor, Michael Pitts, to warn him.

Nothing happened. Year after year, improbable numbers were accepted as valid. Complaints were ignored.

Parks attracted so many visitors who were eager to understand the school’s turnaround that teachers had to come up with ways to explain it. At Waller’s direction, they began maintaining what they called “standard-based mastery folders,” an index of all the objectives that each student needed to grasp in order to comprehend a given lesson. Lewis, who was taking night classes at the School of Education at Clark Atlanta University, wrote his master’s thesis on the technique. “It was a wonderful system,” he said. “But we only put it in place to hide the fact that we were cheating.”

Believing the tests weren’t valid, teachers saw cheating as a “victimless crime.”

Confused by Core tests

Kids have been field-testing new Common Core exams — and parents have been trying practice tests posted online. The verdict: The new tests are much harder — partly because of poorly worded questions.

Carol Lloyd, executive editor at GreatSchools, is a fan of the new standards, but worried about the test. She went online to try practice questions for both major common-core assessment consortia—Smarter Balanced and PARCC (the Partnership for Assessment of Readiness for College and Careers)—for her daughter’s grade.

Many of the questions were difficult but wonderful. Others were in need of a good editor.

A few, however, were flat-out wrong. One Smarter Balanced question asked students to finish an essay that began with a boy waking up and going down the hall to talk to his mother. Then, in the next paragraph, he’s suddenly jumping out of bed.

A PARCC reading-comprehension question asked students to pick a synonym for “constantly” out of five possible sentence options. I reread the sentences 10 times before I realized that no words or phrases in those sentences really meant “constantly,” but that the test-writer had confused “constantly” with “repeatedly.” Any student who really understood the language would be as confused as I was.

If these are the test questions they’re sharing with the public, “what are they doing in the privacy of my daughter’s test?” asks Lloyd.

Natalie Wexler, a writing tutor at a high-poverty D.C. high school, took the PARCC English Language Arts practice test for 10th-graders. “A number of questions were confusing, unrealistically difficult, or just plain wrong,” she writes.

Question 1 starts with a brief passage:

I was going to tell you that I thought I heard some cranes early this morning, before the sun came up. I tried to find them, but I wasn’t sure where their calls were coming from. They’re so loud and resonant, so it’s sometimes hard to tell.

Part A asked for the meaning of “resonant” as used in this passage:

A. intense B. distant C. familiar D. annoying

Looking at the context — it was hard to tell where the calls were coming from — Wexler chose “distant.”  The official correct answer was “intense.” Which is not what “resonant” means. 

Another passage described fireflies as “sketching their uncertain lines of light down close to the surface of the water.” What was implied by the phrase “uncertain lines of light”?

She chose: “The lines made by the fireflies are difficult to trace.” The correct answer? “The lines made by the fireflies are a trick played upon the eye.”

Wexler did better on a section where all the questions were based on excerpts from a majority and a dissenting opinion in a Supreme Court case about the First Amendment. “But then again, I have a law degree, and, having spent a year as a law clerk to a Supreme Court Justice, I have a lot of experience interpreting Supreme Court opinions,” she writes.

The average D.C. 10th grader won’t be able to demonstrate critical thinking skills, Wexler fears.

. . .  if a test-taker confronts a lot of unfamiliar concepts and vocabulary words, she’s unlikely to understand the text well enough to make any inferences. In just the first few paragraphs of the majority opinion, she’ll confront the words “nascent,” “undifferentiated,” and “apprehension.”

Most D.C. students “will either guess at the answers or just give up,” Wexler predicts.

Common Core tests may not pass

The two Common Core testing groups — Smarter Balanced Assessment Consortium and the Partnership for Assessment of Readiness for College and Careers (PARCC) — made big promises when they bid for $350 million in federal funding, notes Education Week. The vision has “collided with reality.” Due to “political, technical, and financial constraints,” some ambitious plans have been scaled back.

. . . most students will take the exams on computers, rather than use bubble sheets, for instance. The Smarter Balanced assessment will adapt in difficulty to each student’s skill level, potentially providing better information about strengths and weaknesses.

In addition, students taking the PARCC test will write essays drawing on multiple reading sources. And to a level not seen since the 1990s, students taking both exams will be engaged in “performance” items that ask them to analyze and apply knowledge, explain their mathematical reasoning, or conduct research.
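For readers wondering what “adapt in difficulty” means in practice, here is a toy sketch of the general computer-adaptive idea (my simplification, not Smarter Balanced’s actual engine): the next item gets harder after a correct answer and easier after a miss.

```python
def run_adaptive_test(answer_item, num_items=10, start_difficulty=5):
    """answer_item(difficulty) -> True/False simulates a student's response."""
    difficulty = start_difficulty
    history = []
    for _ in range(num_items):
        correct = answer_item(difficulty)
        history.append((difficulty, correct))
        # Step up after a correct answer, down after a miss, on a 1-10 scale.
        difficulty = min(10, difficulty + 1) if correct else max(1, difficulty - 1)
    return history

# Example: a student who reliably handles items up to difficulty 7
print(run_adaptive_test(lambda d: d <= 7))
```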

Performance-based assessment requires “longer, more expensive exams,” reports Ed Week. That’s a tough sell. Both exams have reduced the length or complexity of some test elements.

Both groups will continue to use some multiple-choice or machine-scored questions, but many of those items have been enhanced — allowing students to select multiple answers, for instance, or to drag and drop text from reading passages to cite evidence.

Both consortia will hire teachers to score written answers after deciding that robot scorers aren’t yet up to the job.

Both consortia promised to develop tools and supports for teachers, but help for teachers has “lagged,” reports Ed Week.

Why Common Core is doomed to fail

Common Core standards are doomed, writes Jay P. Greene. The political backlash “will undo or neuter Common Core.”

With the U.S. Education Department, D.C.-based reform groups and state school chiefs on board, Common Core supporters thought they’d won a “clear and total victory.” (He compares it to the early victories by opponents of gay marriage.)

(They) failed to consider how the more than 10,000 school districts, more than 3 million teachers, and the parents of almost 50 million students would react.  For standards to actually change practice, you need a lot of these folks on board. Otherwise Common Core, like most past standards, will just be a bunch of empty words in a document.

It’s too late for supporters to convince the public to “love” the core, Greene writes. Reforms like the Common Core have a fatal flaw.

Trying to change the content and practice of the entire nation’s school system requires a top-down, direct, and definitive victory to get adopted.  If input and deliberation are sought, or decisions are truly decentralized, then it is too easy to block standards reforms, like Common Core.  

But the brute force and directness required for adopting national standards makes its effective implementation in a diverse, decentralized and democratic country impossible.

Common Core didn’t need to start as national standards. It’s a shame the feds got involved instead of letting the standards truly be voluntary. I think some states will drop the core, weaken the standards or fudge the tests. But if half a dozen states implement the standards and tests well, that will be educational.

The Federalist Debate features Fordham’s Mike Petrilli and Heartland’s Joy Pullman discussing the Common Core standards — without getting nasty.

Getting started with core standards

Fordham’s Common Core in the Districts: An Early Look at Early Implementers examines how school leaders and teachers are implementing new standards “in a high-performing suburb, a trailblazer, an urban bellwether, and a creative implementer.”

“In the absence of externally vetted, high-quality Common Core materials, districts are striving—with mixed success—to devise their own,” the report finds.

Delivering quality CCSS-aligned professional development also is “crucial” and “patchy.”

Core-aligned tests aren’t ready either. 

Seventy-three percent of teachers in Common Core states say they’re enthusiastic about the new standards, but think implementation will be challenging, according to a survey by Scholastic and the Gates Foundation.

Many teachers say they need more training and resources, especially for low-achieving students.

Fifty-seven percent of teachers believe the new standards will be positive for most students; only 8 percent predict a negative impact.