These are issues that experts have been debating since well before the original replication study appeared last August. “On some level, I suppose it is appealing to think everything is fine and there is no reason to change the status quo,” said Sanjay Srivastava, a psychologist at the University of Oregon, who was not a member of either team. “But we know too much, from many other sources, to put too much credence in an analysis that supports that remarkable conclusion.”

One issue the critique raised was how faithfully the replication team had adhered to the original design of the 100 studies it retested. Small alterations in design can make the difference between whether a study replicates or not, scientists say. To address this, Dr. Nosek and his many collaborators consulted closely with the authors of the studies they were trying to reproduce. Afterward, independent researchers — that is, neither from the original study team nor the replication one — evaluated how closely the study designs matched.

But Dr. Wilson and other authors of the critique — Daniel T. Gilbert, Gary King, and Stephen Pettigrew, all of Harvard — pointed out that authors of 31 of the original studies had not explicitly endorsed the design of the retest. They noted that, for example, one study on race initially run at Stanford was replicated in Amsterdam, a different cultural context.

The critique found that the explicitly endorsed studies were nearly four times more likely to replicate than the nonendorsed ones.

Dr. Nosek said he planned to rerun the replications of 11 studies whose authors had raised concerns, to determine whether design differences accounted for the differing results.

Another issue the critique raised had to do with statistical methods. When Dr. Nosek began his study, there was no agreed-upon protocol for crunching the numbers. He and his team settled on five measures, including the strength of the effect and the result of combining the original and replication data, to evaluate the results together.