Thousands rallied in Albany, N.Y., against the standardized testing that is required by New York state, in June 2013. Shannon DeCelle/AP

MONTCLAIR, N.J. — This upscale, racially diverse suburb isn’t the first place you’d expect to see a pitched battle break out over school reform. In recent years, Montclair has become a frequent landing site for middle-class families looking to recreate Manhattan’s Upper West Side on a lawn-bedecked hilltop. It is home to New Jersey’s first medical marijuana dispensary as well as a healthy slice of The New York Times’ editorial staff and is, as one resident puts it, the place you flee to “so you don’t have to fight school wars anymore.”

Yet the last few months have seen everything from shouting matches at school board meetings to subpoenas leveled at parents for allegedly leaking new standardized tests online. The tests were imposed, over parent and teacher protests, by a new district superintendent who declared them necessary to prepare Montclair schools for an even bigger change: the new multistate standardized tests being prepared by the Partnership for Assessment of Readiness for College and Careers.

PARCC, as it’s universally known, is at the forefront of the push to reinvent the standardized test. Starting March 24 and running through early June, the consortium will conduct a series of three-hour field tests with about 1.2 million students in 14 states plus the District of Columbia, with plans to roll out the full assessments for 22 million students in 2015.

The promise is to sweep away the panoply of fill-in-the-bubble exams that states currently use. They would be replaced by computerized assessments that, PARCC says, will provide schools with more complete data and better reflect the new national Common Core curriculum, which has been adopted in 45 states since 2010. But as the recent dustup over the revised SAT has shown, nothing is that simple in the world of schools and tests.
Many in the growing crowd of PARCC critics, in Montclair and beyond, worry that the new assessments will only impose longer tests, divert money from classroom instruction and increase stresses on children and teachers alike without necessarily providing better information about what and whether students are learning. “There’s kind of a belief in a town like Montclair that the more we test, the more we can be sure that our teachers are delivering a quality curriculum,” says Michelle Fine, a City University of New York psychology professor who is a member of the parent group Montclair Cares About Schools. “I think that’s magical thinking.”


The roots of PARCC go back to 2001, when President George W. Bush’s No Child Left Behind (NCLB) Act promised to require that school systems be evaluated on how they educated all students, not just a rarefied few at the top. When it became clear that NCLB’s all-stick approach was doomed to fail (schools were threatened with closure if they didn’t meet an escalating series of benchmarks, culminating in the impossible goal of 100 percent of students scoring as proficient by 2014), President Barack Obama introduced a plan that was more carrot.

Obama’s Race to the Top provided $4.3 billion of new federal funding to be doled out to states that best met certain criteria, including improved teacher performance and student evaluation, which in practice means testing. Of that funding, $350 million would go to the development of new multistate standardized tests for grades 3 through 11, which would, as Education Secretary Arne Duncan said at the time, “close the data gap that now handcuffs districts from tracking growth in student learning and improving classroom instruction.”

The money ended up being split between two consortiums: PARCC, made up of 24 states mostly in the East and South, and Smarter Balanced, with 28, primarily in the West and Midwest. (Six states hedged their bets and joined both groups: Alabama, Colorado, Kentucky, North Dakota, Pennsylvania and South Carolina. Some states joined neither group.)

Fans of the new PARCC assessments say they will be a much-needed reimagining of what standardized tests should look like. “You look at the assessments that were in place in so many of these states before, and so many of them were of such poor quality,” says Michael Brickman, policy director of the Thomas B. Fordham Institute, a conservative education think tank.
“They were getting a low-quality assessment at a low price.” Price, however, has proved one of the early sticking points of the new tests, as states face higher costs not only for the assessments but also for the technology needed to implement them. Over the past year, Georgia and Kentucky have announced they were dropping out of PARCC and seeking a cheaper alternative, and the New York State Board of Regents quietly put off its adoption of the new tests, largely over concerns that schools wouldn’t have the hardware to administer them. In New Jersey, which is going full speed ahead, educators and parent advocates are worried about what the new tests will mean for already strapped school budgets. Tina Weishaus of a Highland Park, N.J., parent group says her school district recently defended layoffs as necessary to fund new technology that, she charges, “pertains to the PARCC tests.” Other New Jersey towns have raised similar concerns over PARCC’s imposing “unfunded mandates” on their communities.


Jeff Nellhaus, who designed Massachusetts’ highly regarded state tests and now serves as PARCC’s policy, research and design director, promises that the shift to more complex questions will provide deeper data about student knowledge. The PARCC English tests, he says, will require students to read and consolidate multiple texts, then write an essay drawing on evidence from them. Math assessments, meanwhile, will place greater emphasis on real-world problems — one sample question asks third-graders to calculate how an art teacher can best use tiles to cover a wall — and extended mathematical thinking. He says, “I think it’s going to be much more engaging for kids. This is the 21st century, and they’ll be in a 21st century environment. It won’t be, ‘Oh, here’s another bubble test.’”

Testing experts, however, have long warned that more elaborate questions come with a price. One common problem is construct-irrelevant variance — the risk of testing for things, such as the ability to navigate the new computer interface, that have nothing to do with academic skills. It doesn’t help that Common Core testing materials use language that can be confusing even to educators. “They’re written like Ikea instructions for putting together a desk,” says the Montclair parent group’s Fine.

Even the complexity of the questions can degrade the quality of test results. As Harvard education professor Daniel Koretz explained in his 2008 book “Measuring Up,” multiple-choice questions have one built-in advantage: You can ask lots of them, assessing students across a wide range of knowledge. More involved questions provide depth but not breadth — which means less opportunity for an especially good or poor score on any one topic to be evened out.

The new tests offer more open-ended responses, which raises yet another dilemma: How do you ensure that students’ varied responses are graded objectively?
Some former workers at testing companies such as Pearson and ETS warn that this is a bigger challenge than most people realize. In his book “Making the Grades,” longtime test-scoring leader Todd Farley described a testing industry dominated by low-paid temp scorers, typically earning $11 to $13 an hour, who are forced to puzzle out illegible handwriting and unclear scoring rubrics on a tight schedule. (If a student is assigned a reading on “The Lion and the Squirrel” but calls the squirrel a mouse, should he get full credit?) As often as not, Farley wrote, scorers end up just picking a score that they all agree on, which keeps their reliability ratings up without making the scores accurate. Nellhaus promises a “very rigorous” scoring process for PARCC essay questions, with a certain percentage of questions read over by a second reader to confirm the scores. Farley, though, is unconvinced. “If there are a billion answers that need to be scored in a two-month span, how is it possible without hiring every person off the street you can and cutting corners?” he asks.

It’s been found in all kinds of fields, from doctors to bus drivers: If you put high stakes on a particular set of indicators, you distort everything else. Monty Neill National Center for Fair and Open Testing