The problem, according to David Freedman, a statistician at the University of California, Berkeley, who studies the design and analysis of medical studies, is not so much the differences that are known. Instead, it is the differences that scientists are not aware of.

Cynthia Pearson, executive director of the National Women’s Health Network, has a favorite example of how easy it is to be fooled. Study after study found that women taking estrogen had less heart disease than women who did not. But, Ms. Pearson says, it turns out that women who faithfully take any medication for years, even a sugar pill, are different from women who don’t. The compliant pill-takers tend to be healthier, perhaps because they follow doctor’s orders. So when scientists said they were comparing two equal populations, the estrogen users and the nonestrogen users, they may actually have been comparing the health of women who conscientiously take pills with that of women who don’t, or who do so less rigorously.

The advantage of randomized clinical trials is that you have to worry far less about whether your groups are alike. Treatments are assigned by the statistical equivalent of a coin toss, the idea being that differences among individuals will be randomly distributed across the groups. Faithful pill-takers will be as likely to show up in the beta carotene group, for example, as in the placebo group.
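A small simulation can make the coin-toss idea concrete. In this sketch the numbers are invented: assume 30 percent of a population are "faithful pill takers," and assign everyone to a group at random. The trait ends up in both groups at nearly the same rate, without anyone ever measuring it.

```python
import random

random.seed(0)

# Hypothetical population: 30% are "faithful pill takers" (assumed figure).
population = ["faithful" if random.random() < 0.3 else "other"
              for _ in range(10_000)]

# Assign each person to treatment or placebo by a virtual coin toss.
treatment, placebo = [], []
for person in population:
    (treatment if random.random() < 0.5 else placebo).append(person)

frac_t = treatment.count("faithful") / len(treatment)
frac_p = placebo.count("faithful") / len(placebo)
print(f"faithful pill takers -- treatment: {frac_t:.1%}, placebo: {frac_p:.1%}")
```

The point is that randomization balances traits the researchers never thought to measure, not just the ones they know about.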

The second basic principle is that the bigger the group studied, the more reliable the conclusions. That’s because the real result of a study is not a single number, like a 20 percent reduction in risk. Instead, it’s a range of numbers that represent a so-called margin of error, like a 5 to 35 percent reduction in risk. The larger the sample size, the smaller the margin of error. Small studies have large uncertainties in results, making it difficult to know where the truth lies. Also, in a small study, randomization may not balance things well.
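The shrinking margin of error can be shown with the standard normal-approximation formula for a proportion (1.96 standard errors on either side of the observed rate); the 20 percent figure and study sizes here are illustrative, not from any particular trial.

```python
import math

def margin_of_error(p, n):
    """Approximate 95% margin of error for an observed proportion p in a
    study of n people, using the normal approximation: 1.96 * standard error."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

# The same observed 20% rate looks very different at different study sizes.
for n in (100, 1_000, 10_000):
    m = margin_of_error(0.20, n)
    print(f"n = {n:>6}: 20% plus or minus {100 * m:.1f} percentage points")
```

With 100 people, the true rate could plausibly be anywhere in a band roughly 8 points wide on each side; with 10,000, the band narrows to under a point.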

The third principle, Dr. Goodman says, “is often off the radar of even many scientists.” But it can be a deciding factor in whether a result can be believed. It’s a principle that comes from statistics, called Bayes’ theorem. As Dr. Goodman explains it, the question to ask is: “What is the strength of all the supporting evidence separate from the study at hand?”

A clinical trial that randomly assigns groups to an intervention, like beta carotene or a placebo, Dr. Goodman notes, “is typically at the top of a pyramid of research.” Large and definitive clinical trials can be hugely expensive and take years, so they usually are undertaken only after a large body of evidence indicates that a claim is plausible enough to be worth the investment. Supporting evidence can include laboratory studies indicating a biological reason for the effect, animal studies, observational studies of human populations and even other clinical trials.

But if one clinical trial tests something that is plausible, with a lot of supporting evidence to back it up, and another tests something implausible, the trial testing a plausible hypothesis is more credible even if the two studies are similar in size, design and results. The guiding principle, Dr. Goodman says, is that “things that have a good reason to be true and that have good supporting evidence are likely to be true.”
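Dr. Goodman’s guiding principle can be illustrated with a back-of-the-envelope Bayes’ theorem calculation. All the numbers here are invented for illustration: assume a trial detects a real effect 80 percent of the time and produces a false positive 5 percent of the time. The same positive result then means very different things depending on how plausible the hypothesis was going in.

```python
def posterior(prior, power=0.8, false_positive=0.05):
    """Bayes' theorem: probability the hypothesis is true given a positive
    trial. The trial is assumed to detect a real effect with probability
    `power` and to give a false positive with probability `false_positive`
    (both figures invented for illustration)."""
    p_positive = power * prior + false_positive * (1 - prior)
    return power * prior / p_positive

# A well-supported claim (50% prior) vs. a long shot (5% prior),
# each backed by an identical positive trial.
print(f"plausible claim: posterior {posterior(0.50):.0%}")
print(f"long-shot claim: posterior {posterior(0.05):.0%}")
```

Under these assumptions, the plausible claim comes out over 90 percent likely to be true after a positive trial, while the long shot remains less likely true than false, even though the two trials are identical.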