Bayesian E. T. Jaynes, in “Chapter 5: Queer uses for probability theory”, discusses the probabilistic generalization of the reasoning we are engaged in when we choose whether to modus ponens or modus tollens, with early ESP experiments as an example, pointing out that from a Bayesian perspective, all claims are being evaluated in a larger Bayesian-model-comparison context where issues like experimenter error or bias are always possibilities:

What probability would you assign to the hypothesis that Mr. Smith has perfect extrasensory perception (ESP)? He can guess right every time which number you have written down. To say zero is too dogmatic…We take this man who says he has extrasensory perception, and we will write down some numbers from 1 to 10 on a piece of paper and ask him to guess which numbers we’ve written down. We’ll take the usual precautions to make sure against other ways of finding out. If he guesses the first number correctly, of course we will all say “you’re a very lucky person, but I don’t believe it.” And if he guesses two numbers correctly, we’ll still say “you’re a very lucky person, but I don’t believe it.” By the time he’s guessed four numbers correctly—well, I still wouldn’t believe it. So my state of belief is certainly lower than −40 db. How many numbers would he have to guess correctly before you would really seriously consider the hypothesis that he has extrasensory perception? In my own case, I think somewhere around 10. My personal state of belief is, therefore, about −100 db. You could talk me into a ±10 change, and perhaps as much as ±30, but not much more than that.

But on further thought we see that, although this result is correct, it is far from the whole story. In fact, if he guessed 1000 numbers correctly, I still would not believe that he has ESP, for an extension of the same reason that we noted in Chapter 4 when we first encountered the phenomenon of resurrection of dead hypotheses. An hypothesis A that starts out down at −100 db can hardly ever come to be believed whatever the data, because there are almost sure to be alternative hypotheses ($B_1, B_2, \ldots$) above it, perhaps down at −60 db. Then when we get astonishing data that might have resurrected A, the alternatives will be resurrected instead. Let us illustrate this by two famous examples, involving telepathy and the discovery of Neptune.
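The decibel bookkeeping in this passage, and the way an alternative hypothesis soaks up the evidence, can be sketched numerically. The Python below is a minimal illustration and not part of Jaynes's text. The function names are hypothetical, and it rests on three loudly stated assumptions: perfect ESP predicts every hit with probability 1, a single alternative hypothesis (standing in for $B_1, B_2, \ldots$) predicts the hits equally well, and priors of −100 db and −60 db are approximated as probabilities of 1e-10 and 1e-6.

```python
import math


def odds_to_db(odds):
    """Express an odds ratio as evidence in decibels: 10 * log10(odds)."""
    return 10.0 * math.log10(odds)


def evidence_after_guesses(prior_db, n_correct, n_choices=10):
    """Evidence (db) for perfect ESP versus pure chance after n_correct hits.

    Assumption: ESP predicts each hit with probability 1 and chance predicts
    it with probability 1/n_choices, so each hit adds 10*log10(n_choices) db.
    """
    return prior_db + n_correct * odds_to_db(n_choices)


def posteriors(n_correct, prior_esp=1e-10, prior_alt=1e-6, n_choices=10):
    """Posterior over chance, ESP, and one alternative after n_correct hits.

    Assumption: the alternative (e.g. some trick) also predicts every hit
    with probability 1, so it shares the evidence with ESP.
    """
    prior_chance = 1.0 - prior_esp - prior_alt
    weights = {
        "chance": prior_chance * (1.0 / n_choices) ** n_correct,
        "esp": prior_esp * 1.0,
        "alternative": prior_alt * 1.0,
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}
```

Under these assumptions, `evidence_after_guesses(-100.0, 10)` comes out at 0 db, i.e. even odds against chance alone, matching the "somewhere around 10" figure. Once the alternative is included, `posteriors(1000)["esp"]` stays near 1e-4 however many hits are recorded, because the alternative starts 40 db ahead and gains exactly as much from each hit.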

…on the basis of such a result [as Mrs. Stewart’s experimental results in _Modern Experiments In Telepathy_, Soal & Bateman 1954], ESP researchers would proclaim a virtual certainty that ESP is real. …it hardly matters what these prior probabilities are; in the view of an ESP researcher who does not consider the prior probability $P_f = P(H_f|X)$ particularly small, $P(H_f|D,X)$ is so close to unity that its decimal expression starts with over a hundred 9’s. He will then react with anger and dismay when, in spite of what he considers this overwhelming evidence, we persist in not believing in ESP. Why are we, as he sees it, so perversely illogical and unscientific? The trouble is that the above calculations (5-9) and (5-12) represent a very naive application of probability theory, in that they consider only $H_p$ and $H_f$; and no other hypotheses. If we really knew that $H_p$ and $H_f$ were the only possible ways the data (or more precisely, the observable report of the experiment and data) could be generated, then the conclusions that follow from (5-9) and (5-12) would be perfectly all right. But in the real world, our intuition is taking into account some additional possibilities that they ignore.

…When we are dealing with some extremely implausible hypothesis, recognition of a seemingly trivial alternative possibility can make orders of magnitude difference in the conclusions. Taking note of this, let us show how a more sophisticated application of probability theory explains and justifies our intuitive doubts.

Let $H_p$, $H_f$, and $L_p$, $L_f$, $P_p$, $P_f$ be as above; but now we introduce some new hypotheses about how this report of the experiment and data might have come about, which will surely be entertained by the readers of the report even if they are discounted by its writers. These new hypotheses ($H_1, H_2, \ldots, H_k$) range all the way from innocent possibilities such as unintentional error in the record keeping, through frivolous ones (perhaps Mrs. Stewart was having fun with those foolish people, with the aid of a little mirror that they did not notice), to less innocent possibilities such as selection of the data (not reporting the days when Mrs. Stewart was not at her best), to deliberate falsification of the whole experiment for wholly reprehensible motives. Let us call them all, simply, “deception”. For our purposes it does not matter whether it is we or the researchers who are being deceived, or whether the deception was accidental or deliberate. Let the deception hypotheses have likelihoods and prior probabilities $L_i, P_i$, $i = (1, 2, \ldots, k)$. There are, perhaps, 100 different deception hypotheses that we could think of and are not too far-fetched to consider, although a single one would suffice to make our point. In this new logical environment, what is the posterior probability of the hypothesis $H_f$ that was supported so overwhelmingly before? Probability theory now tells us: (5-13)

$$P(H_f|D,X) = \frac{P_f L_f}{P_f L_f + P_p L_p + \sum_i P_i L_i}$$

Introduction of the deception hypotheses has changed the calculation greatly; in order for $P(H_f|D,X)$ to come anywhere near unity it is now necessary that: (5-14)

$$P_p L_p + \sum_i P_i L_i \ll P_f L_f$$

From (5-7), $P_p L_p$ is completely negligible so (5-14) is not greatly different from: (5-15)

$$\sum_i P_i \ll P_f$$

But each of the deception hypotheses is, in my judgment, more likely than $H_f$, so there is not the remotest possibility that inequality (5-15) could ever be satisfied.

Therefore, this kind of experiment can never convince me of the reality of Mrs. Stewart’s ESP; not because I assert $P_f = 0$ dogmatically at the start, but because the verifiable facts can be accounted for by many alternative hypotheses.

…Indeed, the very evidence which the ESPers throw at us to convince us, has the opposite effect on our state of belief; issuing reports of sensational data defeats its own purpose. For if the prior probability of deception is greater than that of ESP, then the more improbable the alleged data are on the null hypothesis of no deception and no ESP, the more strongly we are led to believe, not in ESP, but in deception. For this reason, the advocates of ESP (or any other marvel) will never succeed in persuading scientists that their phenomenon is real, until they learn how to eliminate the possibility of deception in the mind of the reader.
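Equation (5-13) and the claim that sensational data favor deception can be checked with a few lines of arithmetic. The following Python is a rough sketch under stated assumptions, not Jaynes's own calculation: the function name `posterior_split` is hypothetical, the priors are assumed to sum to 1 (so the null prior is whatever is left over), and the numbers in the usage note are invented for illustration.

```python
def posterior_split(P_f, L_f, L_p, deception):
    """Posterior probabilities of ESP, the null, and deception per (5-13).

    P_f, L_f  : prior and likelihood of the ESP hypothesis
    L_p       : likelihood of the data under the null (no ESP, no deception)
    deception : list of (P_i, L_i) pairs, one per deception hypothesis

    Assumption: priors sum to 1, so the null prior P_p is the remainder.
    Returns a tuple (esp, null, deception_total).
    """
    P_p = 1.0 - P_f - sum(P_i for P_i, _ in deception)
    w_f = P_f * L_f                                  # ESP term
    w_p = P_p * L_p                                  # null term
    w_d = sum(P_i * L_i for P_i, L_i in deception)   # summed deception terms
    total = w_f + w_p + w_d
    return w_f / total, w_p / total, w_d / total
```

As a hypothetical example, take `P_f = 1e-10`, `L_f = 1.0`, and a single deception hypothesis `(1e-6, 1.0)`. Lowering `L_p` from `1e-2` to `1e-100`, which makes the reported data ever more improbable under the null, pushes the deception posterior toward 1 while the ESP posterior never rises above roughly `P_f / (P_f + P_i)`, about 1e-4. With an empty deception list and the same tiny `L_p`, the ESP posterior is essentially 1, which is the naive two-hypothesis result.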

It is interesting that Laplace perceived this phenomenon long ago. His Essai Philosophique sur les probabilités (1819) has a long chapter on the “Probabilities of Testimonies”, in which he calls attention to “the immense weight of testimonies necessary to admit a suspension of natural laws”. He notes that those who make recitals of miracles, “decrease rather than augment the belief which they wish to inspire; for then those recitals render very probable the error or the falsehood of their authors. But that which diminishes the belief of educated men often increases that of the uneducated, always avid for the marvelous.”

We observe the same phenomenon at work today, not only in the ESP enthusiast, but in the astrologer, reincarnationist, exorcist, fundamentalist preacher or cultist of any sort, who attracts a loyal following among the uneducated by claiming all kinds of miracles; but has zero success in converting educated people to his teachings. Educated people, taught to believe that a cause-effect relation requires a physical mechanism to bring it about, are scornful of arguments which invoke miracles; but the uneducated seem actually to prefer them. [see also David Hume’s “Of Miracles”]