First published Wed Mar 21, 2018

In this article, we will first examine Hume’s own argument, provide a reconstruction of it, and then survey different responses to the problem which it poses.

Hume’s argument is one of the most famous in philosophy. A number of philosophers have attempted solutions to the problem, but a significant number have embraced his conclusion that it is insoluble. There is also a wide spectrum of opinion on the significance of the problem. Some have argued that Hume’s argument does not establish any far-reaching skeptical conclusion, either because it was never intended to, or because the argument is in some way misformulated. Yet many have regarded it as one of the most profound philosophical challenges imaginable since it seems to call into question the justification of one of the most fundamental ways in which we form knowledge. Bertrand Russell, for example, expressed the view that if Hume’s problem cannot be solved, “there is no intellectual difference between sanity and insanity” (Russell 1946: 699).

Hume asks on what grounds we come to our beliefs about the unobserved on the basis of inductive inferences. He presents an argument in the form of a dilemma which appears to rule out the possibility of any reasoning from the premises to the conclusion of an inductive inference. There are, he says, two possible types of arguments, “demonstrative” and “probable”, but neither will serve. A demonstrative argument produces the wrong kind of conclusion, and a probable argument would be circular. Therefore, for Hume, the problem remains of how to explain why we form any conclusions that go beyond the past instances of which we have had experience (T. 1.3.6.10). Hume stresses that he is not disputing that we do draw such inferences. The challenge, as he sees it, is to understand the “foundation” of the inference—the “logic” or “process of argument” that it is based upon (E. 4.2.21). The problem of meeting this challenge, while evading Hume’s argument against the possibility of doing so, has become known as “the problem of induction”.

The original source of what has become known as the “problem of induction” is in Book 1, part iii, section 6 of A Treatise of Human Nature by David Hume, published in 1739. In 1748, Hume gave a shorter version of the argument in Section iv of An Enquiry Concerning Human Understanding. Throughout this article we will give references to the Treatise as “T”, and the Enquiry as “E”.

We generally think that the observations we make are able to justify some expectations or predictions about observations we have not yet made, as well as general claims that go beyond the observed. For example, the observation that bread of a certain appearance has thus far been nourishing seems to justify the expectation that the next similar piece of bread I eat will also be nourishing, as well as the claim that bread of this sort is generally nourishing. Such inferences from the observed to the unobserved, or to general laws, are known as “inductive inferences”.

1. Hume’s Problem

Hume introduces the problem of induction as part of an analysis of the notions of cause and effect. Hume worked with a picture, widespread in the early modern period, in which the mind was populated with mental entities called “ideas”. Hume thought that ultimately all our ideas could be traced back to the “impressions” of sense experience. In the simplest case, an idea enters the mind by being “copied” from the corresponding impression (T. 1.1.1.7/4). More complex ideas are then created by the combination of simple ideas (E. 2.5/19). Hume took there to be a number of relations between ideas, including the relation of causation (E. 3.2; for more on Hume’s philosophy in general, see Morris & Brown 2014).

For Hume, the relation of causation is the only relation by means of which “we can go beyond the evidence of our memory and senses” (E. 4.1.4, T. 1.3.2.3/74). Suppose we have an object present to our senses: say gunpowder. We may then infer to an effect of that object: say, the explosion. The causal relation links our past and present experience to our expectations about the future (E. 4.1.4/26).

Hume argues that we cannot make a causal inference by purely a priori means (E. 4.1.7). Rather, he claims, it is based on experience, and specifically experience of constant conjunction. We infer that the gunpowder will explode on the basis of past experience of an association between gunpowder and explosions.

Hume wants to know more about the basis for this kind of inference. If such an inference is made by a “chain of reasoning” (E. 4.2.16), he says, he would like to know what that reasoning is. In general, he claims that the inferences depend on a transition of the form:

I have found that such an object has always been attended with such an effect, and I foresee, that other objects, which are, in appearance, similar, will be attended with similar effects. (E. 4.2.16)

In the Treatise, Hume says that

if Reason determin’d us, it would proceed upon that principle that instances, of which we have had no experience, must resemble those, of which we have had experience, and that the course of nature continues always uniformly the same. (T. 1.3.6.4)

For convenience, we will refer to this claim of similarity or resemblance between observed and unobserved regularities as the “Uniformity Principle (UP)”. Sometimes it is also called the “Resemblance Principle”, or the “Principle of Uniformity of Nature”.

Hume then presents his famous argument to the conclusion that there can be no reasoning behind this principle. The argument takes the form of a dilemma. Hume makes a distinction between relations of ideas and matters of fact. Relations of ideas include geometric, algebraic and arithmetic propositions, “and, in short, every affirmation, which is either intuitively or demonstratively certain”. “Matters of fact”, on the other hand, are empirical propositions which can readily be conceived to be other than they are. Hume says that

All reasonings may be divided into two kinds, namely, demonstrative reasoning, or that concerning relations of ideas, and moral reasoning, or that concerning matter of fact and existence. (E. 4.2.18)

Hume considers the possibility of each of these types of reasoning in turn, and in each case argues that it is impossible for it to supply an argument for the Uniformity Principle.

First, Hume argues that the reasoning cannot be demonstrative, because demonstrative reasoning only establishes conclusions which cannot be conceived to be false. And, he says,

it implies no contradiction that the course of nature may change, and that an object seemingly like those which we have experienced, may be attended with different or contrary effects. (E. 4.2.18)

It is possible, he says, to clearly and distinctly conceive of a situation where the unobserved case does not follow the regularity so far observed (E. 4.2.18, T. 1.3.6.5/89).

Second, Hume argues that the reasoning also cannot be “such as regard matter of fact and real existence”. He also calls this “probable” reasoning. All such reasoning, he claims, “proceed upon the supposition, that the future will be conformable to the past”, in other words on the Uniformity Principle (E. 4.2.19).

Therefore, if the chain of reasoning is based on an argument of this kind, it will again be relying on this supposition, “and taking that for granted, which is the very point in question” (E. 4.2.19; see also T. 1.3.6.7/90). The second type of reasoning then fails to provide a chain of reasoning which is not circular.

In the Treatise version, Hume concludes

Thus, not only our reason fails us in the discovery of the ultimate connexion of causes and effects, but even after experience has inform’d us of their constant conjunction, ’tis impossible for us to satisfy ourselves by our reason, why we shou’d extend that experience beyond those particular instances, which have fallen under our observation. (T. 1.3.6.11/91–2)

The conclusion then is that our tendency to project past regularities into the future is not underpinned by reason. The problem of induction is to find a way to avoid this conclusion, despite Hume’s argument.

After presenting the problem, Hume does offer his own “solution” to the doubts he has raised (E. 5, T. 1.3.7–16). This consists of an explanation of what the inductive inferences are driven by, if not reason. In the Treatise Hume raises the problem of induction in an explicitly contrastive way. He asks whether the transition involved in the inference is produced

by means of the understanding or imagination; whether we are determin’d by reason to make the transition, or by a certain association and relation of perceptions? (T. 1.3.6.4)

And he goes on to summarize the conclusion by saying

When the mind, therefore, passes from the idea or impression of one object to the idea or belief of another, it is not determin’d by reason, but by certain principles, which associate together the ideas of these objects, and unite them in the imagination. (T. 1.3.6.12)

Thus, it is the imagination which is taken to be responsible for underpinning the inductive inference, rather than reason.

In the Enquiry, Hume suggests that the step taken by the mind,

which is not supported by any argument, or process of the understanding … must be induced by some other principle of equal weight and authority. (E. 5.1.2)

That principle is “custom” or “habit”. The idea is that if one has seen similar objects or events constantly conjoined, then the mind is inclined to expect a similar regularity to hold in the future. The tendency or “propensity” to draw such inferences is the effect of custom:

… having found, in many instances, that any two kinds of objects, flame and heat, snow and cold, have always been conjoined together; if flame or snow be presented anew to the senses, the mind is carried by custom to expect heat or cold, and to believe, that such a quality does exist and will discover itself upon a nearer approach. This belief is the necessary result of placing the mind in such circumstances. It is an operation of the soul, when we are so situated, as unavoidable as to feel the passion of love, when we receive benefits; or hatred, when we meet with injuries. All these operations are a species of natural instincts, which no reasoning or process of the thought and understanding is able, either to produce, or to prevent. (E. 5.1.8)

Hume argues that the fact that these inferences do follow the course of nature is a kind of “pre-established harmony” (E. 5.2.21). It is a kind of natural instinct, which may in fact be more effective in making us successful in the world than if we relied on reason to make these inferences.

2. Reconstruction

Hume’s argument has been presented and formulated in many different versions. There is also an ongoing lively discussion over the historical interpretation of what Hume himself intended by the argument. It is therefore difficult to provide an unequivocal and uncontroversial reconstruction of Hume’s argument. Nonetheless, for the purposes of organizing the different responses to Hume’s problem that will be discussed in this article, the following reconstruction will serve as a useful starting point.

Hume’s argument concerns specific inductive inferences such as:

All observed instances of A have been B.
The next instance of A will be B.

Let us call this “inference I”. Inferences which fall under this type of schema are now often referred to as cases of “simple enumerative induction”.

Hume’s own example is:

All observed instances of bread (of a particular appearance) have been nourishing.
The next instance of bread (of that appearance) will be nourishing.

Hume’s argument then proceeds as follows (premises are labeled as P, and subconclusions and conclusions as C):

P1. There are only two kinds of arguments: demonstrative and probable (Hume’s fork).

P2. Inference I presupposes the Uniformity Principle (UP).

1st horn:

P3. A demonstrative argument establishes a conclusion whose negation is a contradiction.

P4. The negation of the UP is not a contradiction.

C1. There is no demonstrative argument for the UP (by P3 and P4).

2nd horn:

P5. Any probable argument for UP presupposes UP.

P6. An argument for a principle may not presuppose the same principle (Non-circularity).

C2. There is no probable argument for the UP (by P5 and P6).

C3. There is no argument for the UP (by P1, C1 and C2).

Consequences:

P7. If there is no argument for the UP, there is no chain of reasoning from the premises to the conclusion of any inference that presupposes the UP.

C4. There is no chain of reasoning from the premises to the conclusion of inference I (by P2, C3 and P7).

P8. If there is no chain of reasoning from the premises to the conclusion of inference I, the inference is not justified.

C5. Inference I is not justified (by C4 and P8).

There have been different interpretations of what Hume means by “demonstrative” and “probable” arguments. Sometimes “demonstrative” is equated with “deductive”, and probable with “inductive” (e.g., Salmon 1966). Then the first horn of Hume’s dilemma would eliminate the possibility of a deductive argument, and the second would eliminate the possibility of an inductive argument. However, under this interpretation, premise P3 would not hold, because it is possible for the conclusion of a deductive argument to be a non-necessary proposition. Premise P3 could be modified to say that a demonstrative (deductive) argument establishes a conclusion that cannot be false if the premises are true. But then it becomes possible that the supposition that the future resembles the past, which is not a necessary proposition, could be established by a deductive argument from some premises, though not from a priori premises (in contradiction to conclusion C1).

Another common reading is to equate “demonstrative” with “deductively valid with a priori premises”, and “probable” with “having an empirical premise” (e.g., Okasha 2001). This may be closer to the mark, if one thinks, as Hume seems to have done, that premises which can be known a priori cannot be false, and hence are necessary. If the inference is deductively valid, then the conclusion of the inference from a priori premises must also be necessary. What the first horn of the dilemma then rules out is the possibility of a deductively valid argument with a priori premises, and the second horn rules out any argument (deductive or non-deductive), which relies on an empirical premise.

However, recent commentators have argued that in the historical context that Hume was situated in, the distinction he draws between demonstrative and probable arguments has little to do with whether or not the argument has a deductive form (Owen 1999; Garrett 2002). In addition, the class of inferences that establish conclusions whose negation is a contradiction may include not just deductively valid inferences from a priori premises, but any inferences that can be drawn using a priori reasoning (that is, reasoning where the transition from premises to the conclusion makes no appeal to what we learn from observations). It looks as though Hume does intend the argument of the first horn to rule out any a priori reasoning, since he says that a change in the course of nature cannot be ruled out “by any demonstrative argument or abstract reasoning a priori” (E. 5.2.18). On this understanding, a priori arguments would be ruled out by the first horn of Hume’s dilemma, and empirical arguments by the second horn. This is the interpretation that I will adopt for the purposes of this article.

In Hume’s argument, the UP plays a central role. As we will see in section 4.2, various authors have been doubtful about this principle. Versions of Hume’s argument have also been formulated which do not make reference to the UP. Rather they directly address the question of what arguments can be given in support of the transition from the premises to the conclusion of the specific inductive inference I. What arguments could lead us, for example, to infer that the next piece of bread will nourish from the observations of nourishing bread made so far? For the first horn of the argument, Hume’s argument can be directly applied. A demonstrative argument establishes a conclusion whose negation is a contradiction. The negation of the conclusion of the inductive inference is not a contradiction. It is not a contradiction that the next piece of bread is not nourishing. Therefore, there is no demonstrative argument for the conclusion of the inductive inference. In the second horn of the argument, the problem Hume raises is a circularity. Even if Hume is wrong that all inductive inferences depend on the UP, there may still be a circularity problem, but as we shall see in section 4.1, the exact nature of the circularity needs to be carefully considered. But the main point at present is that the Humean argument is often formulated without invoking the UP.

Since Hume’s argument is a dilemma, there are two main ways to resist it. The first is to tackle the first horn and to argue that there is after all a demonstrative argument (here taken to mean an argument based on a priori reasoning) that can justify the inductive inference. The second is to tackle the second horn and to argue that there is after all a probable (or empirical) argument that can justify the inductive inference. We discuss the different variants of these two approaches in sections 3 and 4.

There are also those who dispute the consequences of the dilemma. For example, some recent commentators on Hume interpret him as drawing only conclusion C4, and not the normative conclusion C5 (we discuss these interpretations in section 5.1). There are also approaches which take issue with premise P8 and argue that providing a chain of reasoning from the premises to the conclusion is not a necessary condition for justification of an inductive inference (sections 5.2 and 5.3). Finally, there are some philosophers who do accept the skeptical conclusion C5 and attempt to accommodate it. For example, there have been attempts to argue that inductive inference is not as central to scientific inquiry as is often thought (section 6). It is also possible to argue that even though Hume’s argument does establish that inductive inferences are not justified in the sense that we have reasons to think their conclusions true, nonetheless a weaker kind of justification is possible. This is based on the idea that we can establish that following inductive procedures is a means to certain epistemic ends. We examine the tradition associated with this approach in section 7.

3. Tackling the First Horn of Hume’s Dilemma

The first horn of Hume’s argument, as formulated above, is aimed at establishing that there is no demonstrative argument for the UP. A number of philosophers have thought that this does not definitively rule out the possibility of a justification of inductive inferences based on a demonstrative argument. There are two main potential escape routes from the first horn of Hume’s dilemma. The first is to deny premise P3, which amounts to admitting the possibility of synthetic a priori propositions. The second is to accept the conclusion C1, that there is no demonstrative argument for the UP, but to argue that such an argument is not necessary for justification. Indeed, one could say that it is not even necessary to have a demonstrative argument for the conclusion of the inductive inference. Rather, the thought is, it will be sufficient for justification to have an argument to the proposition that the conclusion of the inductive inference is probable. We address each of these approaches in the next two sections.

3.1 Synthetic a priori

As we have seen in section 1, Hume takes demonstrative arguments to have conclusions which are “relations of ideas”, whereas “probable” or “moral” arguments have conclusions which are “matters of fact”. Hume’s distinction between “relations of ideas” and “matters of fact” anticipates the distinction drawn by Kant between “analytic” and “synthetic” propositions (Kant 1781). A classic example of an analytic proposition is “Bachelors are unmarried men”, and a synthetic proposition is “My bike tyre is flat”. For Hume, demonstrative arguments, which are based on a priori reasoning, can establish only relations of ideas, or analytic propositions. The association between a prioricity and analyticity underpins premise P3, which states that a demonstrative argument establishes a conclusion whose negation is a contradiction.

One possible response to Hume’s problem is to deny premise P3, by allowing the possibility that a priori reasoning could give rise to synthetic propositions. Kant famously argued in response to Hume that such synthetic a priori knowledge is possible (Kant 1781, 1783). He does this by a kind of reversal of the empiricist programme espoused by Hume. Whereas Hume tried to understand how the concept of a causal or necessary connection could be based on experience, Kant argued instead that experience only comes about through the concepts or “categories” of the understanding. On his view, one can gain a priori knowledge of these concepts, including the concept of causation, by a transcendental argument concerning the necessary preconditions of experience. A more detailed account of Kant’s response to Hume can be found in de Pierris and Friedman 2013.

3.2 Arguing for a Probable Conclusion

The first horn of Hume’s dilemma implies that there cannot be a demonstrative argument to the conclusion of an inductive inference because it is possible to conceive of the negation of the conclusion. For instance, it is quite possible to imagine that the next piece of bread I eat will poison me rather than nourish me. However, this does not rule out the possibility of a demonstrative argument that establishes only that the bread is highly likely to nourish, not that it definitely will. There are several approaches that attempt to produce a demonstrative argument that the conclusion of an inductive inference is probable, though not certain. If this succeeds, a chain of reasoning based on demonstrative arguments from the premises of inference I to the proposition that the conclusion is probable is not ruled out by Hume’s argument. One might then challenge premise P8, by saying that it is not necessary for justification of an inductive inference to have a chain of reasoning from its premises to its conclusion. Rather it would suffice if we had an argument from the premises to the claim that the conclusion is probable or likely. Then an a priori justification of the inductive inference would have been provided.

3.2.1 The Nomological-Explanatory solution

The first of these approaches is the “Nomological-explanatory” solution, which has been put forward by Armstrong, BonJour and Foster (Armstrong 1983; BonJour 1998; Foster 2004). This solution appeals to Inference to the Best Explanation (IBE), which says that we should infer that the hypothesis which provides the best explanation of the evidence is probably true. Proponents of this approach take Inference to the Best Explanation to be a mode of inference which is distinct from the type of “extrapolative” inductive inference that Hume was trying to justify. They also regard it as a type of inference which although non-deductive, is justified a priori. For example, Armstrong says “To infer to the best explanation is part of what it is to be rational. If that is not rational, what is?” (Armstrong 1983: 59).

The a priori justification is taken to proceed in two steps. First, it is argued that we should recognize that certain observed regularities require an explanation in terms of some underlying law. For example, if a coin persistently lands heads on repeated tosses, then it becomes increasingly implausible that this occurred just because of “chance”. Rather, we should infer to the better explanation that the coin has a certain bias. Saying that the coin lands heads not only for the observed cases, but also for the unobserved cases, does not provide an explanation of the observed regularity. Thus, mere Humean constant conjunction is not sufficient. What is needed for an explanation is a “non-Humean, metaphysically robust conception of objective regularity” (BonJour 1998), which is thought of as involving actual natural necessity (Armstrong 1983; Foster 2004).

Once it has been established that there must be some metaphysically robust explanation of the observed regularity, the second step is to argue that out of all possible metaphysically robust explanations, the “straight” inductive explanation is the best one, where the straight explanation extrapolates the observed frequency to the wider population. For example, given that a coin has some objective chance of landing heads, the best explanation of the fact that \(m/n\) heads have been so far observed, is that the objective chance of the coin landing heads is \(m/n\). And this objective chance determines what happens not only in observed cases but also in unobserved cases.

The Nomological-Explanatory solution relies on taking IBE as a rational, a priori form of inference which is distinct from inductive inferences like inference I. However, one might alternatively view inductive inferences as a special case of IBE (Harman 1968), or take IBE to be merely an alternative way of characterizing inductive inference (Henderson 2014). If either of these views is right, IBE does not have the necessary independence from inductive inference to provide a non-circular justification of it.

One may also object to the Nomological-Explanatory approach on the grounds that regularities do not necessarily require an explanation in terms of necessary connections or robust metaphysical laws. The viability of the approach also depends on the tenability of a non-Humean conception of laws. There have been several serious attempts to develop such an account (Armstrong 1983; Tooley 1977; Dretske 1977), but also much criticism (see J. Carroll 2016).

Another critical objection is that the Nomological-Explanatory solution simply begs the question, even if it is taken to be legitimate to make use of IBE in the justification of induction. In the first step of the argument we infer to a law or regularity which extends beyond the spatio-temporal region in which observations have been thus far made, in order to predict what will happen in the future. But why could a law that only applies to the observed spatio-temporal region not be an equally good explanation? The main reply seems to be that we can see a priori that laws with temporal or spatial restrictions would be less good explanations. Foster argues that the reason is that this would introduce more mysteries:

For it seems to me that a law whose scope is restricted to some particular period is more mysterious, inherently more puzzling, than one which is temporally universal. (Foster 2004)

3.2.2 Bayesian solution

Another way in which one can try to construct an a priori argument that the premises of an inductive inference make its conclusion probable, is to make use of the formalism of probability theory itself. At the time Hume wrote, probabilities were used to analyze games of chance. And in general, they were used to address the problem of what we would expect to see, given that a certain cause was known to be operative. This is the so-called problem of “direct inference”. However, the problem of induction concerns the “inverse” problem of determining the cause or general hypothesis, given particular observations.

One of the first and most important methods for tackling the “inverse” problem using probabilities was developed by Thomas Bayes. Bayes’s essay containing the main results was published after his death in 1764 (Bayes 1764). However, it is possible that the work was done significantly earlier and was in fact written in direct response to the publication of Hume’s Enquiry in 1748 (see Zabell 1989: 290–93, for discussion of what is known about the history).

We will illustrate the Bayesian method using the problem of drawing balls from an urn. Suppose that we have an urn which contains white and black balls in an unknown proportion. We draw a sample of balls from the urn by removing a ball, noting its color, and then putting it back before drawing again.

Consider first the problem of direct inference. Given the proportion of white balls in the urn, what is the probability of various outcomes for a sample of observations of a given size? Suppose the proportion of white balls in the urn is \(\theta = 0.6\). The probability of drawing one white ball in a sample of one is then \(p(W; \theta = 0.6) = 0.6\). We can also compute the probability for other outcomes, such as drawing two white balls in a sample of two, using the rules of the probability calculus (see section 1 of Hájek 2011). Generally, the probability that \(n_w\) white balls are drawn in a sample of size N, is given by the binomial distribution:

\[ p(n_w;\theta=x) = \binom{N}{n_w} x^{n_w} (1-x)^{(N-n_w)} \]

This is a specific example of a “sampling distribution”, \(p(E\mid H)\), which gives the probability of certain evidence E in a sample, on the assumption that a certain hypothesis H is true. Calculation of the sampling distribution can in general be done a priori, given the rules of the probability calculus.
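To make the direct inference concrete, here is a minimal Python sketch of this calculation (the sample size of 2 and the value θ = 0.6 follow the example above; the function name is illustrative, not from the text):

```python
from math import comb

def sampling_prob(n_w: int, N: int, theta: float) -> float:
    """Binomial sampling distribution p(n_w; theta) for N draws with replacement."""
    return comb(N, n_w) * theta**n_w * (1 - theta)**(N - n_w)

theta = 0.6   # known proportion of white balls in the urn (value used in the text)
N = 2         # sample size
for n_w in range(N + 1):
    print(f"p({n_w} white of {N} | theta = {theta}) = {sampling_prob(n_w, N, theta):.3f}")
```

Running this prints 0.16, 0.48 and 0.36 for zero, one and two white balls respectively, which is the kind of a priori calculation the direct inference involves.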

However, the problem of induction is the inverse problem. We want to infer not what the sample will be like, given a known hypothesis, but rather a hypothesis about the general situation or population, based on the observation of a limited sample. The probabilities of the candidate hypotheses can then be used to inform predictions about further observations. In the case of the urn, for example, we want to know what the observation of a particular sample frequency of white balls, \(\frac{n_w}{N}\), tells us about \(\theta\), the proportion of white balls in the urn.

The idea of the Bayesian approach is to assign probabilities not only to the events which constitute evidence, but also to hypotheses. One starts with a “prior probability” distribution over the relevant hypotheses \(p(H)\). On learning some evidence E, the Bayesian updates the prior \(p(H)\) to the conditional probability \(p(H\mid E)\). This update rule is called the “rule of conditionalisation”. The conditional probability \(p(H\mid E)\) is known as the “posterior probability”, and is calculated using Bayes’ rule:

\[ p(H\mid E) = \frac{p(E\mid H) p(H)}{p(E)} \]

Here the sampling distribution can be taken to be a conditional probability \(p(E\mid H)\), which is known as the “likelihood” of the hypothesis H on evidence E.

One can then go on to compute the predictive distribution for as yet unobserved data \(E'\), given observations E. The predictive distribution in a Bayesian approach is given by

\[ p(E'\mid E) = \sum_{H} p(E'\mid H) p(H\mid E) \]

where the sum becomes an integral in cases where H is a continuous variable.
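As an illustration of how conditionalisation and the predictive distribution fit together, here is a minimal sketch using a toy discrete hypothesis space (the two candidate values of θ and the uniform prior over them are illustrative assumptions, not taken from the urn example above):

```python
# Discrete Bayesian update for the urn: two candidate hypotheses about theta.
hypotheses = {0.6: 0.5, 0.3: 0.5}   # prior p(H): assumed uniform over two candidate proportions

def likelihood(white: bool, theta: float) -> float:
    """p(E | H): probability of drawing a white (or non-white) ball given theta."""
    return theta if white else 1 - theta

# Observe one white ball and conditionalise: p(H | E) = p(E | H) p(H) / p(E)
evidence = True
p_E = sum(likelihood(evidence, th) * prior for th, prior in hypotheses.items())
posterior = {th: likelihood(evidence, th) * prior / p_E for th, prior in hypotheses.items()}

# Predictive probability that the next ball is white: p(E' | E) = sum_H p(E' | H) p(H | E)
p_next_white = sum(likelihood(True, th) * post for th, post in posterior.items())

print("posterior:", posterior)                                   # {0.6: 0.667, 0.3: 0.333}
print("p(next white | one white seen):", round(p_next_white, 3))  # 0.5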

For the urn example, we can compute the posterior probability \(p(\theta\mid n_w)\) using Bayes’ rule, and the likelihood given by the binomial distribution above. In order to do so, we also need to assign a prior probability distribution to the parameter \(\theta\). One natural choice, which was made early on by Bayes himself and by Laplace, is to put a uniform prior over the parameter \(\theta\). Bayes’ own rationale for this choice was that it makes each possible value for the number of whites in the sample equally probable, before any data is observed. Laplace had a different justification, based on the Principle of Indifference. This principle states that if you don’t have any reason to favor one hypothesis over another, you should assign them all equal probabilities.

With the choice of uniform prior, the posterior probability and predictive distribution can be calculated. It turns out that the probability that the next ball will be white, given that \(n_w\) of N draws were white, is given by

\[ p(w\mid n_w) = \frac{n_w + 1}{N+2} \]

This is Laplace’s famous “rule of succession” (1814). Suppose, on the basis of observing 90 white balls out of 100, we calculate by the rule of succession that the probability of the next ball being white is \(91/102=0.89\). It is quite conceivable that the next ball might be black. Even in the case where all 100 balls have been white, so that the probability of the next ball being white is 0.99, there is still a small probability that the next ball is not white. What the probabilistic reasoning supplies, then, is not an argument to the conclusion that the next ball will be a certain color, but an argument to the conclusion that certain future observations are very likely given what has been observed in the past.
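The rule of succession can be checked numerically. The following sketch approximates the uniform prior by a fine grid over θ (the grid, and the use of numpy, are implementation conveniences, not part of the Bayes-Laplace argument) and recovers the figures quoted above:

```python
import numpy as np

def prob_next_white(n_w: int, N: int, grid_size: int = 100_000) -> float:
    """Posterior predictive p(next ball white | n_w of N draws were white),
    with a uniform prior over theta, approximated on a grid."""
    theta = np.linspace(0, 1, grid_size)
    # unnormalised posterior: uniform prior times binomial likelihood
    posterior = theta**n_w * (1 - theta)**(N - n_w)
    posterior /= posterior.sum()
    # predictive probability: expectation of theta under the posterior
    return float(np.sum(theta * posterior))

print(prob_next_white(90, 100))   # ~0.892, i.e. (90 + 1) / (100 + 2)
print(prob_next_white(100, 100))  # ~0.990, i.e. (100 + 1) / (100 + 2)
```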

Overall, the Bayes-Laplace argument in the urn case provides an example of how probabilistic reasoning can take us from evidence about observations in the past to a prediction for how likely certain future observations are. The question is what kind of solution, if any, this type of calculation provides to the problem of induction. At first sight, since it is just a mathematical calculation, it looks as though it does indeed provide an a priori argument from the premises of an inductive inference to the proposition that a certain conclusion is probable.

However, in order to establish this definitively, one needs to argue that all the components and assumptions of the argument are a priori, and this requires further examination of at least three important issues.

First, the Bayes-Laplace argument relies on the rules of the probability calculus. What is the status of these rules? Does following them amount to a priori reasoning? The answer to this depends in part on how probability itself is interpreted. Broadly speaking, there are prominent interpretations of probability according to which the rules plausibly have a priori status and could form the basis of a demonstrative argument. These include the classical interpretation originally developed by Laplace (1814), the logical interpretation which had its heyday in the work of Keynes (1921), Johnson (1921), Jeffreys (1939), and Carnap (1950), and the subjectivist interpretation of Ramsey (1926), Savage (1954), and de Finetti (1964). Attempts to argue for a probabilistic a priori solution to the problem of induction have been primarily associated with these interpretations.

Secondly, in the case of the urn, the Bayes-Laplace argument is based on a particular probabilistic model—the binomial model. This involves the assumption that there is a parameter describing the unknown proportion \(\theta\) of white balls in the urn, and that the data amount to independent draws from a distribution governed by that parameter. What is the basis of these assumptions? Do they generalize to other cases beyond the actual urn case—i.e., can we see observations in general as analogous to draws from an “Urn of Nature”? There has been a persistent worry that these types of assumptions, while reasonable when applied to the case of drawing balls from an urn, will not hold for other cases of inductive inference. Thus, the probabilistic solution to the problem of induction might be of relatively limited scope. At the least, there are some assumptions going into the choice of model here that need to be made explicit.

Thirdly, the Bayes-Laplace argument relies on a particular choice of prior probability distribution. What is the status of this assignment, and can it be based on a priori principles? Historically, the Bayes-Laplace choice of a uniform prior, as well as the whole concept of classical probability, relied on the Principle of Indifference. This principle has been regarded by many as an a priori principle. However, it has also been subjected to much criticism on the grounds that it can give rise to inconsistent probability assignments (Bertrand 1888; Borel 1909; Keynes 1921). Such inconsistencies are produced by there being more than one way to carve up the space of alternatives, and different choices give rise to conflicting probability assignments. One attempt to rescue the Principle of Indifference has been to appeal to explanationism, and argue that the principle should be applied only to the carving of the space at “the most explanatorily basic level”, where this level is identified according to an a priori notion of explanatory priority (Huemer 2009).

The quest for an a priori argument for the assignment of the prior has been largely abandoned. For many, the subjectivist foundations developed by Ramsey, de Finetti and Savage provide a more satisfactory basis for understanding probability. From this point of view, it is a mistake to try to introduce any further a priori constraints on the probabilities beyond those dictated by the probability rules themselves. Rather the assignment of priors may reflect personal opinions or background knowledge, and no prior is a priori an unreasonable choice.

So far, we have considered probabilistic arguments which place probabilities over hypotheses in a hypothesis space as well as observations. There is also a tradition of attempts to determine what probability distributions we should have, given certain observations, from the starting point of a joint probability distribution over all the observable variables. One may then postulate axioms directly on this distribution over observables, and examine the consequences for the predictive distribution. Much of the development of inductive logic, including the influential programme by Carnap, proceeded in this manner (Carnap 1950, 1952).

This approach helps to clarify the role of the assumptions behind probabilistic models. One fundamental assumption that one can make about the observations is that they are “exchangeable”. This means that the joint distribution of the random variables is invariant under permutations. Informally, this means that the order of the observations does not affect the probability. For instance, in the urn case, this would mean that drawing first a white ball and then a black ball is just as probable as first drawing a black and then a white. De Finetti proved a general representation theorem that if the joint probability distribution of an infinite sequence of random variables is assumed to be exchangeable, then it can be written as a mixture of distribution functions from each of which the data behave as if they are independent random draws (de Finetti 1964). In the case of the urn example, the theorem shows that it is as if the data are independent random draws from a binomial distribution over a parameter \(\theta\), which itself has a prior probability distribution.

The assumption of exchangeability may be seen as a natural formalization of Hume’s assumption that the past resembles the future. This is intuitive because assuming exchangeability means thinking that the order of observations, both past and future, does not matter to the probability assignments.
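This order-invariance can be illustrated numerically. Under the model used above (uniform prior over θ, draws independent given θ), a white-then-black sequence and a black-then-white sequence receive the same marginal probability; the grid approximation below is again only an implementation convenience:

```python
import numpy as np

def sequence_prob(sequence, grid_size: int = 100_000) -> float:
    """Marginal probability of a particular sequence of draws (True = white)
    under a uniform prior over theta: the likelihood integrated over theta."""
    theta = np.linspace(0, 1, grid_size)
    likelihood = np.ones_like(theta)
    for white in sequence:
        likelihood *= theta if white else (1 - theta)
    return float(np.mean(likelihood))  # approximates the integral over [0, 1]

# White-then-black and black-then-white are equally probable: exchangeability.
print(sequence_prob([True, False]))   # ~0.167
print(sequence_prob([False, True]))   # ~0.167
```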

However, the development of the programme of inductive logic revealed that many generalizations are possible. For example, Johnson proposed to assume an axiom he called the “sufficientness postulate”. This states that outcomes can be of a number of different types, and that the conditional probability that the next outcome is of type i depends only on the number of previous trials and the number of previous outcomes of type i (Johnson 1932). Assuming the sufficientness postulate for three or more types gives rise to a general predictive distribution corresponding to Carnap’s “continuum of inductive methods” (Carnap 1952). This predictive distribution takes the form:

\[ p(i\mid N_1,N_2,\ldots N_t)= \frac{N_i + k}{N_1 +N_2 + \cdots + N_t + kt} \]

for some positive number k. This reduces to Laplace’s rule of succession when \(t=2\) and \(k=1\).
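Here is a minimal sketch of this generalized rule of succession (the particular counts are illustrative), showing that it agrees with Laplace’s rule when \(t=2\) and \(k=1\):

```python
def carnap_predictive(i: int, counts: list, k: float) -> float:
    """Carnap-style predictive probability that the next outcome is of type i,
    given counts N_1, ..., N_t of previous outcomes and parameter k."""
    t = len(counts)
    return (counts[i] + k) / (sum(counts) + k * t)

# Two types (white / black) with k = 1: reduces to Laplace's rule of succession.
print(carnap_predictive(0, [90, 10], k=1))   # (90 + 1) / (100 + 2) = 0.892...

# Three types with a different k give a different member of the continuum.
print(carnap_predictive(0, [5, 3, 2], k=2))  # (5 + 2) / (10 + 6) = 0.4375
```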

Generalizations of the notion of exchangeability, such as “partial exchangeability” and “Markov exchangeability”, have been explored, and these may be thought of as forms of symmetry assumption (Zabell 1988; Skyrms 2012). As less restrictive axioms on the probabilities for observables are assumed, the result is that there is no longer a unique result for the probability of a prediction, but rather a whole class of possible probabilities, mapped out by a generalized rule of succession such as the above. Therefore, in this tradition as in the Bayes-Laplace approach, we have moved away from producing an argument which produces a unique a priori probabilistic answer to Hume’s problem.

One might think then that the assignment of the prior, or the relevant corresponding postulates on the observable probability distribution, is precisely where empirical assumptions enter into inductive inferences. The probabilistic calculations are empirical arguments, rather than a priori ones. If this is correct, then the probabilistic framework has not in the end provided an a priori solution to the problem of induction, but it has rather allowed us to clarify what could be meant by Hume’s claim that inductive inferences rely on the Uniformity Principle.

Some think that although the problem of induction is not solved, there is in some sense a partial solution, which has been called a “logical solution”. Howson, for example, argues that “Inductive reasoning is justified to the extent that it is sound, given appropriate premises” (Howson 2000: 239, his emphasis). According to this view, there is no getting away from an empirical premise for inductive inferences, but we might still think of Bayesian conditioning as functioning like a kind of logic or “consistency constraint” which “generates predictions from the assumptions and observations together” (Romeijn 2004: 360). Once we have an empirical assumption, instantiated in the prior probability, and the observations, Bayesian conditioning tells us what the resulting predictive probability distribution should be.

3.2.3 Combinatorial approach

An alternative attempt to use probabilistic reasoning to produce an a priori justification for inductive inferences is the so-called “combinatorial” solution. This was first put forward by Donald C. Williams (1947) and later developed by David Stove (1986).

Like the Bayes-Laplace argument, the solution relies heavily on the idea that straightforward a priori calculations can be done in a “direct inference” from population to sample. As we have seen, given a certain population frequency, the probability of getting different frequencies in a sample can be calculated straightforwardly based on the rules of the probability calculus. The Bayes-Laplace argument relied on inverting the probability distribution using Bayes’ rule to get from the sampling distribution to the posterior distribution. Williams instead proposes that the inverse inference may be based on a certain logical syllogism: the proportional (or statistical) syllogism.

The proportional, or statistical, syllogism is the following:

Of all the things that are M, \(m/n\) are P.
a is an M.

Therefore, a is P, with probability \(m/n\).

For example, if 90% of rabbits in a population are white and we observe a rabbit a, then the proportional syllogism says that we infer that a is white with a probability of 90%. Williams argues that the proportional syllogism is a non-deductive logical syllogism, which effectively interpolates between the syllogism for entailment

All Ms are P.
a is an M.

Therefore, a is P.

And the syllogism for contradiction

No M is P.
a is an M.

Therefore, a is not P.

This syllogism can be combined with an observation about the behavior of increasingly large samples. From calculations of the sampling distribution, it can be shown that as the sample size increases, the probability that the sample frequency is in a range which closely approximates the population frequency also increases. In fact, Bernoulli’s law of large numbers states that the probability that the sample frequency approximates the population frequency tends to one as the sample size goes to infinity. Williams argues that such results support a “general over-all premise, common to all inductions, that samples ‘match’ their populations” (Williams 1947: 78).
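The following sketch illustrates this behavior of increasingly large samples: for a fixed population frequency (0.6 here, an illustrative value) and a fixed tolerance, the probability that the sample frequency falls within the tolerance of the population frequency grows with the sample size.

```python
from math import comb

def prob_sample_matches(pop_freq: float, N: int, tolerance: float) -> float:
    """Probability that the sample frequency lies within `tolerance`
    of the population frequency, for a sample of N independent draws."""
    return sum(
        comb(N, k) * pop_freq**k * (1 - pop_freq)**(N - k)
        for k in range(N + 1)
        if abs(k / N - pop_freq) <= tolerance
    )

for N in (10, 100, 1000):
    print(N, round(prob_sample_matches(0.6, N, tolerance=0.05), 3))
# The matching probability increases with N, as Bernoulli's law of large numbers indicates.
```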

We can then apply the proportional syllogism to samples from a population, to get the following argument:

Most samples match their population.
S is a sample.

Therefore, S matches its population, with high probability.

This is an instance of the proportional syllogism, and it uses the general result about samples matching populations as the first major premise.

The next step is to argue that if we observe that the sample contains a proportion of \(m/n\) Fs, then we can conclude that since this sample with high probability matches its population, the population, with high probability, has a population frequency that approximates the sample frequency \(m/n\). Both Williams and Stove claim that this amounts to a logical a priori solution to the problem of induction.

A number of authors have expressed the view that the Williams-Stove argument is only valid if the sample S is drawn randomly from the population of possible samples—i.e., that any sample is as likely to be drawn as any other (Brown 1987; Will 1948; Giaquinto 1987). Sometimes this is presented as an objection to the application of the proportional syllogism. The claim is that the proportional syllogism is only valid if a is drawn randomly from the population of Ms. However, the response has been that there is no need to know that the sample is randomly drawn in order to apply the syllogism (Maher 1996; Campbell 2001; Campbell & Franklin 2004). Certainly if you have reason to think that your sampling procedure is more likely to draw certain individuals than others—for example, if you know that you are in a certain location where there are more of a certain type—then you should not apply the proportional syllogism. But if you have no such reasons, the defenders claim, it is quite rational to apply it. Certainly it is always possible that you draw an unrepresentative sample—meaning one of the few samples in which the sample frequency does not match the population frequency—but this is why the conclusion is only probable and not certain.

The more problematic step in the argument is the final step, which takes us from the claim that samples match their populations with high probability to the claim that having seen a particular sample frequency, the population from which the sample is drawn has frequency close to the sample frequency with high probability. The problem here is a subtle shift in what is meant by “high probability”, which has formed the basis of a common misreading of Bernoulli’s theorem. Hacking (1975: 156–59) puts the point in the following terms. Bernoulli’s theorem licenses the claim that much more often than not, a small interval around the sample frequency will include the true population frequency. In other words, it is highly probable in the sense of “usually right” to say that the sample matches its population. But this does not imply that the proposition that a small interval around the sample will contain the true population frequency is highly probable in the sense of “credible on each occasion of use”. This would mean that for any given sample, it is highly credible that the sample matches its population. It is quite compatible with the claim that it is “usually right” that the sample matches its population to say that there are some samples which do not match their populations at all. Thus one cannot conclude from Bernoulli’s theorem that for any given sample frequency, we should assign high probability to the proposition that a small interval around the sample frequency will contain the true population frequency. But this is exactly the slide that Williams makes in the final step of his argument. Maher (1996) argues in a similar fashion that the last step of the Williams-Stove argument is fallacious. In fact, if one wants to draw conclusions about the probability of the population frequency given the sample frequency, the proper way to do so is by using the Bayesian method described in the previous section. But, as we saw there, this requires the assignment of prior probabilities, and this explains why many people have thought that the combinatorial solution somehow illicitly presupposed an assumption like the principle of indifference. The Williams-Stove argument does not in fact give us an alternative way of inverting the probabilities which somehow bypasses all the issues that Bayesians have faced.

4. Tackling the Second Horn of Hume’s Dilemma

So far we have considered ways in which the first horn of Hume’s dilemma might be tackled. But it is of course also possible to take on the second horn instead.

One may argue that a probable argument would not, despite what Hume says, be circular in a problematic way (we consider responses of this kind in section 4.1). Or, one might attempt to argue that probable arguments are not circular at all (section 4.2).

4.1 Inductive Justifications of Induction

One way to tackle the second horn of Hume’s dilemma is to reject premise P6, which rules out circular arguments. Some have argued that certain kinds of circular arguments would provide an acceptable justification for the inductive inference. Since the justification would then itself be an inductive one, this approach is often referred to as an “inductive justification of induction”.

First, we should examine how exactly the Humean circularity supposedly arises. Take the simple case of an enumerative inductive inference that follows this pattern (X):

Most observed Fs have been Gs.
Therefore: Most Fs are Gs.

Hume claims that such arguments presuppose the Uniformity Principle (UP). According to premises P7 and P8, this supposition also needs to be supported by an argument in order that the inductive inference be justified. A natural idea is that we can argue for the Uniformity Principle on the grounds that “it works”. We know that it works, because past instances of arguments which relied upon it were found to be successful. This alone however is not sufficient unless we have reason to think that such arguments will also be successful in the future. That claim must itself be supported by an inductive argument (S):

Most arguments of form X that rely on UP have succeeded in the past.
Therefore, most arguments of form X that rely on UP succeed.

But this argument itself depends on the UP, which is the very supposition which we were trying to justify.

As we have seen in section 2, some reject Hume’s claim that all inductive inferences presuppose the UP. However, the argument that basing the justification of the inductive inference on a probable argument would result in circularity need not rely on this claim. The circularity concern can be framed more generally. If argument S relies on something which is already presupposed in inference X, then argument S cannot be used to justify inference X. The question though is what precisely the something is.

Some authors have argued that in fact S does not rely on any premise or even presupposition that would require us to already know the conclusion of X. S is then not a “premise circular” argument. Rather, they claim, it is “rule-circular”—it relies on a rule of inference in order to reach the conclusion that that very rule is reliable. Suppose we adopt the rule R which says that when most observed Fs have been Gs, we should infer that most Fs are Gs. Then inference X relies on rule R. We want to show that rule R is reliable. We could appeal to the fact that R worked in the past, and so, by an inductive argument, it will also work in the future. Call this argument S*:

Most inferences following rule R have been successful.
Therefore, most inferences following R are successful.

Since this argument itself uses rule R, using it to establish that R is reliable is rule-circular.

Some authors have then argued that although premise-circularity is vicious, rule-circularity is not (Van Cleve 1984; Papineau 1992). One reason for thinking rule-circularity is not vicious would be if it is not necessary to know or even justifiably believe that rule R is reliable in order to move to a justified conclusion using the rule. This is a claim made by externalists about justification (Van Cleve 1984). They say that as long as R is in fact reliable, one can form a justified belief in the conclusion of an argument relying on R, as long as one has justified belief in the premises.

If one is not persuaded by the externalist claim, one might attempt to argue that rule circularity is benign in a different fashion. For example, the requirement that a rule be shown to be reliable without any rule-circularity might appear unreasonable when the rule is of a very fundamental nature. As Lange puts it:

It might be suggested that although a circular argument is ordinarily unable to justify its conclusion, a circular argument is acceptable in the case of justifying a fundamental form of reasoning. After all, there is nowhere more basic to turn, so all that we can reasonably demand of a fundamental form of reasoning is that it endorse itself. (Lange 2011: 56)

Proponents of this point of view point out that even deductive inference cannot be justified deductively. Consider Lewis Carroll’s dialogue between Achilles and the Tortoise (Carroll 1895). Achilles is arguing with a Tortoise who refuses to perform modus ponens. The Tortoise accepts the premise that p, and the premise that p implies q, but he will not accept q. How can Achilles convince him? He manages to persuade him to accept another premise, namely “if p and p implies q, then q”. But the Tortoise is still not prepared to infer to q. Achilles goes on adding more premises of the same kind, but to no avail. It appears then that modus ponens cannot be justified to someone who is not already prepared to use that rule.

It might seem odd if premise circularity were vicious, and rule circularity were not, given that there appears to be an easy interchange between rules and premises. After all, a rule can always, as in the Lewis Carroll story, be added as a premise to the argument. But what the Carroll story also appears to indicate is that there is indeed a fundamental difference between being prepared to accept a premise stating a rule (the Tortoise is happy to do this), and being prepared to use that rule (this is what the Tortoise refuses to do).

Suppose that we grant that an inductive argument such as S (or S*) can support an inductive inference X without vicious circularity. Still, a possible objection is that the argument simply does not provide a full justification of X. After all, less sane inference rules such as counterinduction can support themselves in a similar fashion. The counterinductive rule is CI:

Most observed As are Bs.
Therefore, it is not the case that most As are Bs.

Consider then the following argument CI*:

Most CI arguments have been unsuccessful.
Therefore, it is not the case that most CI arguments are unsuccessful, i.e., many CI arguments are successful.

This argument therefore establishes the reliability of CI in a rule-circular fashion (see Salmon 1963).

Argument S can be used to support inference X, but only for someone who is already prepared to infer inductively by using S. It cannot convince a skeptic who is not prepared to rely upon that rule in the first place. One might think then that the argument is simply not achieving very much.

The response to these concerns is that, as Papineau puts it, the argument is “not supposed to do very much” (Papineau 1992: 18). The fact that a counterinductivist counterpart of the argument exists is true, but irrelevant. It is conceded that the argument cannot persuade either a counterinductivist, or a skeptic. Nonetheless, proponents of the inductive justification maintain that there is still some added value in showing that inductive inferences are reliable, even when we already accept that there is nothing problematic about them. The inductive justification of induction provides a kind of important consistency check on our existing beliefs.

4.2 No Rules

It is possible to go even further in an attempt to dismantle the Humean circularity. Maybe inductive inferences do not even have a rule in common. What if every inductive inference is essentially unique? Okasha, for example, argues that Hume’s circularity problem can be evaded if there are “no rules” behind induction (Okasha 2005a,b). Norton puts forward the similar idea that all inductive inferences are material, and have nothing formal in common (Norton 2003).

Proponents of such views have attacked Hume’s claim that there is a UP on which all inductive inferences are based. There have long been complaints about the vagueness of the Uniformity Principle (Salmon 1953). The future only resembles the past in some respects, but not others. Suppose that on all my birthdays so far, I have been under 40 years old. This does not give me a reason to expect that I will be under 40 years old on my next birthday. There seems then to be a major lacuna in Hume’s account. He might have explained or described how we draw an inductive inference, on the assumption that it is one we can draw. But he leaves untouched the question of how we distinguish between cases where we extrapolate a regularity legitimately, regarding it as a law, and cases where we do not.

Nelson Goodman is often seen as having made this point in a particularly vivid form with his “new riddle of induction” (Goodman 1955: 59–83). Suppose we define a predicate “grue” in the following way. An object is “grue” when it is green if observed before time t and blue otherwise. Goodman considers a thought experiment in which we observe a number of green emeralds before time t. We could describe our results by saying that all the observed emeralds are green. Using a simple enumerative inductive schema, we could infer from the result that all observed emeralds are green, that all emeralds are green. But equally, we could describe the same results by saying that all observed emeralds are grue. Then using the same schema, we could infer from the result that all observed emeralds are grue, that all emeralds are grue. In the first case, we expect an emerald observed after time t to be green, whereas in the second, we expect it to be blue. Thus the two predictions are incompatible. Goodman claims that what Hume omitted to do was to give any explanation of why we project predicates like “green”, but not predicates like “grue”. This is the “new riddle”, which is often taken to be a further problem of induction that Hume did not address.
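A schematic illustration may make the point concrete. The following sketch (the cutoff value, the sample data, and the function name are illustrative assumptions introduced here, not anything in Goodman’s text) shows that the very same observations satisfy both “all observed emeralds are green” and “all observed emeralds are grue”, even though the two generalizations issue incompatible predictions after the cutoff:

```python
# Illustrative sketch of Goodman's "grue" predicate; T and the data are
# arbitrary choices made for the example.

T = 100  # hypothetical cutoff time t

def is_grue(colour, observation_time):
    """Grue: green if observed before T, blue otherwise."""
    return colour == "green" if observation_time < T else colour == "blue"

# Emeralds observed before the cutoff: all of them green.
observations = [("green", time) for time in range(10)]

print(all(colour == "green" for colour, _ in observations))         # True
print(all(is_grue(colour, time) for colour, time in observations))  # True

# "All emeralds are green" predicts a green emerald after T;
# "All emeralds are grue" predicts a blue one.
```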

One moral that could be taken from Goodman is that there is not one general Uniformity Principle that all probable arguments rely upon (Sober 1988; Norton 2003; Okasha 2001, 2005a,b). Rather, each inductive inference presupposes some more specific empirical presupposition. A particular inductive inference depends on some specific way in which the future resembles the past. It can then be justified by another inductive inference which depends on some quite different empirical claim. This will in turn need to be justified by yet another inductive inference. The nature of Hume’s problem in the second horn is thus transformed. There is no circularity. Rather, there is a regress of inductive justifications, each relying on its own empirical presuppositions (Sober 1988; Norton 2003; Okasha 2001, 2005a,b).

One way to put this point is to say that Hume’s argument rests on a quantifier shift fallacy (Sober 1988; Okasha 2005a). Hume says that there exists a general presupposition for all inductive inferences, whereas he should have said that for each inductive inference, there is some presupposition. Different inductive inferences then rest on different empirical presuppositions, and the problem of circularity is evaded.

What will then be the consequence of supposing that Hume’s problem should indeed have been a regress, rather than a circularity? Here different opinions are possible. On the one hand, one might think that a regress still leads to a skeptical conclusion. So although the exact form in which Hume stated his problem was not correct, the conclusion is not substantially different (Sober 1988). Another possibility is that the transformation mitigates or even removes the skeptical problem. For example, Norton argues that the upshot is a dissolution of the problem of induction, since the regress of justifications benignly terminates (Norton 2003). And Okasha more mildly suggests that even if the regress is infinite, “Perhaps infinite regresses are less bad than vicious circles after all” (Okasha 2005b: 253).

A dissolution of Hume’s circularity does not depend only on arguing that the UP should be replaced by empirical presuppositions which are specific to each inductive inference. It is also necessary to establish that inductive inferences share no common rules; otherwise there will still be at least some rule-circularity. Okasha suggests that the Bayesian model of belief-updating is an illustration of how induction can be characterized in a rule-free way, but this is problematic, since in this model all inductive inferences still share the common rule of Bayesian conditionalisation. Norton’s material theory of induction more genuinely promises a rule-free characterization of induction, but it is not clear whether it really can avoid any role for general rules (Achinstein 2010; Worrall 2010).

5. The Necessary Conditions for Justification

Hume is usually read as delivering a negative verdict on the possibility of justifying inference I, via a premise such as P8. There are however some who question whether Hume is best interpreted as drawing a conclusion about justification of inference I at all (we will discuss these interpretations in section 5.1). There are also those who question in different ways whether premise P8 really does give a valid necessary condition for justification of inference I (sections 5.2 and 5.3).

5.1 Interpretation of Hume’s Conclusion

Some scholars have denied that Hume should be read as invoking a premise such as P8 at all. The reason, they claim, is that he was not aiming for an explicitly normative conclusion about justification such as C5. Hume certainly is seeking a “chain of reasoning” from the premises of the inductive inference to the conclusion, and he thinks that an argument for the UP is necessary to complete the chain. However, one could think that there is no further premise regarding justification, and so the conclusion of his argument is simply C4: there is no chain of reasoning from the premises to the conclusion of an inductive inference. Hume could then be, as Don Garrett and David Owen have argued, advancing a “thesis in cognitive psychology”, rather than making a normative claim about justification (Owen 1999; Garrett 2002). The thesis is about the nature of the cognitive process underlying the inference. According to Garrett, the main upshot of Hume’s argument is that there can be no reasoning process that establishes the UP. For Owen, the message is that the inference is not drawn through a chain of ideas connected by mediating links, as would be characteristic of the faculty of reason.

There are also interpreters who have argued that Hume is merely trying to exclude a specific kind of justification of induction, based on a conception of reason predominant among rationalists of his time, rather than a justification in general (Beauchamp & Rosenberg 1981; Baier 2009). In particular, it has been claimed that it is “an attempt to refute the rationalist belief that at least some inductive arguments are demonstrative” (Beauchamp & Rosenberg 1981: xviii). Under this interpretation, premise P8 should be modified to read something like:

If there is no chain of reasoning based on demonstrative arguments from the premises to the conclusion of inference I, then inference I is not justified.

Such interpretations do however struggle with the fact that Hume’s argument is explicitly a two-pronged attack, which concerns not just demonstrative arguments, but also probable arguments.

The question of how expansive a normative conclusion to attribute to Hume is a complex one. It depends in part on the interpretation of Hume’s own solution to his problem. As we saw in section 1, Hume attributes the basis of inductive inference to principles of the imagination in the Treatise, and in the Enquiry to “custom” and “habit”, conceived as a kind of natural instinct. The question is then whether this alternative provides any kind of justification for the inference, even if not one based on reason. On the face of it, it looks as though Hume is suggesting that inductive inferences proceed on an entirely arational basis. He clearly does not deny that they succeed in producing good outcomes. In fact, Hume even suggests that this operation of the mind may be less “liable to error and mistake” than if it were entrusted to “the fallacious deductions of our reason, which is slow in its operations” (E. 5.2.22). It is also not clear that he sees the workings of the imagination as completely devoid of rationality. For one thing, Hume talks about the imagination as governed by principles. Later in the Treatise, he even gives “rules” and a “logic” for characterizing what should count as a good causal inference (T. 1.3.15). He also clearly sees it as possible to distinguish between better and worse forms of such “reasoning”, as he continues to call it. Thus, there may be grounds to argue that Hume was not trying to argue that inductive inferences have no rational foundation whatsoever, but merely that they do not have the specific type of rational foundation which is rooted in the faculty of Reason.

All this indicates that there is room for debate over the intended scope of Hume’s own conclusion. And thus there is also room for debate over exactly what form a premise (such as premise P8) that connects the rest of his argument to a normative conclusion should take. No matter who is right about this however, the fact remains that Hume has throughout history been predominantly read as presenting an argument for inductive skepticism.

5.2 Postulates and Hinges

Even if one does attribute a normative conclusion to Hume, one may question his argument by asking whether premise P8 is true. This can prompt general reflection on what is needed for justification of an inference in the first place, and what Hume is even asking for.

For example, Wittgenstein raised doubts over whether it is even meaningful to ask for the grounds for inductive inferences.

If anyone said that information about the past could not convince him that something would happen in the future, I should not understand him. One might ask him: what do you expect to be told, then? What sort of information do you call a ground for such a belief? … If these are not grounds, then what are grounds?—If you say these are not grounds, then you must surely be able to state what must be the case for us to have the right to say that there are grounds for our assumption…. (Wittgenstein 1953: 481)

One might not, for instance, think that there even needs to be a chain of reasoning in which each step or presupposition is supported by an argument. Wittgenstein took it that there are some principles so fundamental that they do not require support from any further argument. They are the “hinges” on which enquiry turns.

Out of Wittgenstein’s ideas has developed a general notion of “entitlement”, which is a kind of rational warrant to hold certain propositions which does not come with the same requirements as “justification”. Entitlement provides epistemic rights to hold a proposition, without responsibilities to base the belief in it on an argument. Crispin Wright (2004) has argued that there are certain principles, including the Uniformity Principle, that we are entitled in this sense to hold.

Some philosophers have set themselves the task of determining a set or sets of postulates which form a plausible basis for inductive inferences. Bertrand Russell, for example, argued that five postulates lay at the root of inductive reasoning (Russell 1948). Arthur Burks, on the other hand, proposed that the set of postulates is not unique, but there may be multiple sets of postulates corresponding to different inductive methods (Burks 1953, 1955).

The main objection to all these views is that they do not really solve the problem of induction in a way that adequately secures the pillars on which inductive inference stands. As Salmon puts it, “admission of unjustified and unjustifiable postulates to deal with the problem is tantamount to making scientific method a matter of faith” (Salmon 1966: 48).

5.3 Ordinary Language Dissolution

Rather than allowing undefended empirical postulates to give normative support to an inductive inference, one could instead argue for a completely different conception of what is involved in justification. Like Wittgenstein, later ordinary language philosophers, notably P.F. Strawson, also questioned what exactly it means to ask for a justification of inductive inferences (Strawson 1952). This has become known as the “ordinary language dissolution” of the problem of induction.

Strawson points out that we could ask for a deductive justification of inductive inferences. But it is not clear that this would be helpful, since it is effectively “a demand that induction shall be shown to be really a kind of deduction” (Strawson 1952: 230). Rather, Strawson says, when we ask whether a particular inductive inference is justified, we are typically judging whether it conforms to our usual inductive standards. Suppose, he says, someone has formed the belief by inductive inference that All f’s are g. Strawson says that if that person is asked for their grounds or reasons for holding that belief,

I think it would be felt to be a satisfactory answer if he replied: “Well, in all my wide and varied experience I’ve come across innumerable cases of f and never a case of f which wasn’t a case of g”. In saying this, he is clearly claiming to have inductive support, inductive evidence, of a certain kind, for his belief. (Strawson 1952)

That is just because inductive support, as it is usually understood, simply consists of having observed many positive instances in a wide variety of conditions.

In effect, this approach denies that producing a chain of reasoning is a necessary condition for justification. Rather, an inductive inference is justified if it conforms to the usual standards of inductive justification. But is there more to it? Might we not ask what reason we have to rely on those inductive standards?

It surely makes sense to ask whether a particular inductive inference is justified. But the answer to that is fairly straightforward. Sometimes people have enough evidence for their conclusions and sometimes they do not. Does it also make sense to ask whether inductive procedures generally are justified? Strawson draws an analogy with the question of whether a particular act is legal. We may answer such a question, he says, by referring to the law of the land.

But it makes no sense to inquire in general whether the law of the land, the legal system as a whole, is or is not legal. For to what legal standards are we appealing? (Strawson 1952: 257)

According to Strawson,

It is an analytic proposition that it is reasonable to have a degree of belief in a statement which is proportional to the strength of the evidence in its favour; and it is an analytic proposition, though not a proposition of mathematics, that, other things being equal, the evidence for a generalisation is strong in proportion as the number of favourable instances, and the variety of circumstances in which they have been found, is great. So to ask whether it is reasonable to place reliance on inductive procedures is like asking whether it is reasonable to proportion the degree of one’s convictions to the strength of the evidence. Doing this is what “being reasonable” means in such a context. (Strawson 1952: 256–57)

Thus, according to this point of view, there is no further question to ask about whether it is reasonable to rely on inductive inferences.

The ordinary language philosophers do not explicitly argue against Hume’s premise P8. But effectively what they are doing is offering a whole different story about what it would mean to be justified in believing the conclusion of inductive inferences. What is needed is just conformity to inductive standards, and there is no real meaning to asking for any further justification for those.

The main objection to this view is that conformity to the usual standards is insufficient to provide the needed justification. What we need to know is whether belief in the conclusion of an inductive inference is “epistemically reasonable or justified in the sense that … there is reason to think that it is likely to be true” (BonJour 1998: 198). The problem Hume has raised is whether, despite the fact that inductive inferences have tended to produce true conclusions in the past, we have reason to think that the conclusion of an inductive inference we now make is likely to be true. Arguably, establishing that an inductive inference is rational in the sense that it follows inductive standards is not sufficient to establish that its conclusion is likely to be true. In fact, Strawson allows that there is a question about whether “induction will continue to be successful”, which is distinct from the question of whether induction is rational. This question he does take to hinge on a “contingent, factual matter” (Strawson 1952: 262). But if it is this question that concerned Hume, it is no answer to establish that induction is rational, unless that claim is understood to involve or imply that an inductive inference carried out according to rational standards is likely to have a true conclusion.

6. Living with Inductive Skepticism

So far we have considered the various ways in which we might attempt to solve the problem of induction by resisting one or other premise of Hume’s argument. Some philosophers have however seen his argument as unassailable, and have thus accepted that it does lead to inductive skepticism, the conclusion that inductive inferences cannot be justified. The challenge then is to find a way of living with such a radical-seeming conclusion. We appear to rely on inductive inference ubiquitously in daily life, and it is also generally thought that it is at the very foundation of the scientific method. Can we go on with all this, whilst still seriously thinking none of it is justified by any rational argument?

One option here is to argue, as does Nicholas Maxwell, that the problem of induction is posed in an overly restrictive context. Maxwell argues that the problem does not arise if we adopt a different conception of science than the “standard empiricist” one, which he denotes “aim-oriented empiricism” (Maxwell 2017).

Another option here is to think that the significance of the problem of induction is somehow restricted to a skeptical context. Hume himself seems to have thought along these lines. For instance he says:

Nature will always maintain her rights, and prevail in the end over any abstract reasoning whatsoever. Though we should conclude, for instance, as in the foregoing section, that, in all reasonings from experience, there is a step taken by the mind, which is not supported by any argument or process of the understanding; there is no danger, that these reasonings, on which almost all knowledge depends, will ever be affected by such a discovery. (E. 5.1.2)

Hume’s purpose is clearly not to argue that we should not make inductive inferences in everyday life, and indeed his whole method and system of describing the mind in naturalistic terms depends on inductive inferences through and through. The problem of induction then must be seen as a problem that arises only at the level of philosophical reflection.

Another way to mitigate the force of inductive skepticism is to restrict its scope. Karl Popper, for instance, regarded the problem of induction as insurmountable, but he argued that science is not in fact based on inductive inferences at all (Popper 1935 [1959]). Rather he presented a deductivist view of science, according to which it proceeds by making bold conjectures, and then attempting to falsify those conjectures. In the simplest version of this account, when a hypothesis makes a prediction which is found to be false in an experiment, the hypothesis is rejected as falsified. The logic of this procedure is fully deductive. The hypothesis entails the prediction, and the falsity of the prediction refutes the hypothesis by modus tollens. Thus, Popper claimed that science was not based on the extrapolative inferences considered by Hume. The consequence then is that it is not so important, at least for science, if those inferences would lack a rational foundation.

Popper’s account appears to be incomplete in an important way. There are always many hypotheses which have not yet been refuted by the evidence, and these may contradict one another. According to the strictly deductive framework, since none are yet falsified, they are all on an equal footing. Yet, scientists will typically want to say that one is better supported by the evidence than the others. We seem to need more than just deductive reasoning to support practical decision-making (Salmon 1981). Popper did indeed appeal to a notion of one hypothesis being better or worse “corroborated” by the evidence. But arguably, this took him away from a strictly deductive view of science. It appears doubtful then that pure deductivism can give an adequate account of scientific method.

7. Means-ends Solutions

Hume’s argument might be taken as having definitively ruled out the kind of justification for inductive inferences that he was looking for. That is, it may preclude a justification which gives reason to believe the conclusion of a particular inductive inference is correct, or even likely to be correct. However, it is also possible to move away from the focus on justifying particular inductive inferences, and to consider inductive methods more generally. In simple cases of enumerative induction, the “inductive method”, or “inductive principle”, as it is sometimes called, is a rule for how to extrapolate from the observed instances. For example, it might be the rule that one should infer to a universal generalization after a certain number of positive instances, and reject the universal generalization after observation of a counter-instance. Or it might be formulated as the so-called “straight rule”, which says that one should project the observed frequency of an attribute to the population as a whole, including future instances. Might it be the case that the general properties of an inductive method give grounds for employing that method, even when we have no reason to think that the method will result in a correct answer in any particular application? Given a particular inductive problem, we can look for an optimal method, or means, for providing a solution. Such a means-ends argument may then form the basis for following the method, even in the absence of reasons to believe in its success in particular instances.
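For concreteness, here is a toy rendering of the two rules just mentioned; the threshold of positive instances and the representation of observations as booleans are illustrative assumptions rather than part of any standard formulation:

```python
# Toy versions of two simple enumerative inductive rules.

def universal_generalization(observations, threshold=5):
    """Accept "all As are Bs" after `threshold` positive instances;
    reject it as soon as a counter-instance is observed."""
    if not all(observations):
        return "reject the generalization"
    if len(observations) >= threshold:
        return "accept: all As are Bs"
    return "suspend judgement"

def straight_rule(observations):
    """Project the observed frequency of the attribute onto the
    population as a whole, including future instances."""
    return sum(observations) / len(observations)

print(universal_generalization([True] * 6))           # accept
print(universal_generalization([True, True, False]))  # reject
print(straight_rule([True, True, True, False]))       # 0.75
```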

7.1 Pragmatic Vindication

One of the main early attempts in this direction was the “pragmatic” approach of Reichenbach (1938 [2006]). Reichenbach did think Hume’s argument unassailable, but he nonetheless attempted to provide a weaker kind of justification for induction. In order to emphasize the difference from the kind of justification Hume sought, some refer to Reichenbach’s solution as a “vindication”, rather than a justification, of induction (Feigl 1950; Salmon 1963).

According to this approach, we have a certain aim in making inductive inferences. Even if we cannot be sure we can achieve the aim, we can still argue that if the aim can be met, it will be by following the usual principles of inductive inference. This provides a reason for making those usual inductive inferences. Reichenbach makes a comparison to the situation where a man is suffering from a disease, and the physician says “I do not know whether an operation will save the man, but if there is any remedy, it is an operation” (Reichenbach 1938 [2006: 349]). This provides some kind of justification for operating on the man, even if one does not know that the operation will succeed.

Reichenbach applied the strategy to a general form of “statistical induction” in which we observe the relative frequency \(f_n\) of a particular event in n observations and then form expectations about the frequency that will arise when more observations are made. The “inductive principle” then states that if, after a certain number of instances, a frequency of \(m/n\) has been observed, then for any prolongation of the series of observations, the frequency will continue to fall within a small interval of \(m/n\). Cases such as Hume considered are a special case of this principle, where the observed frequency is 1. For example, in Hume’s bread case, suppose bread was observed to nourish n times out of n (i.e., an observed frequency of 100%); then according to the principle of induction, we expect that as we observe more instances, the frequency of nourishing ones will continue to be within a very small interval of 100%. Following this inductive principle is also sometimes referred to as following the “straight rule”. The problem then is to justify the use of this rule.
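Stated schematically (with \(\delta\) a small margin of approximation, a piece of notation introduced here for convenience rather than taken from Reichenbach), the principle says:

\[ \text{If } f_n = \frac{m}{n} \text{ in the first } n \text{ observations, posit that } f_{n'} \in \left[\frac{m}{n} - \delta,\ \frac{m}{n} + \delta\right] \text{ for any prolongation to } n' > n \text{ observations.} \]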

Reichenbach argued that even if Hume is right to think that we cannot be justified in thinking for any particular application of the rule that the conclusion is likely to be true, for the purposes of practical action we do not need to establish this. We can instead regard the inductive rule as resulting in a “posit”, or statement that we deal with as if it is true. We posit a certain frequency f on the basis of our evidence, and this is like making a wager or bet that the frequency is in fact f.

The aim of inductive inference, according to Reichenbach, is “to find series of events whose frequency of occurrence converges towards a limit” (1938 [2006: 350]). It is possible that the world is so disorderly that we cannot construct series with such limits. But if there is a limit, there is some element of a series of observations, beyond which the principle of induction will lead to the true value of the limit. Although the inductive rule may give quite wrong results early in the sequence, as it follows chance fluctuations in the sample frequency, it is guaranteed to eventually approximate the limiting frequency, if such a limit exists. Therefore, the rule of induction is justified as an instrument of positing because it is a method of which we know that if it is possible to make statements about the future we shall find them by means of this method (Reichenbach 1949: 475). This justification is taken to be a pragmatic one, since though it does not supply knowledge of a future event, it supplies a sufficient reason for action (Reichenbach 1949: 481).

There are several problems with this pragmatic approach. One concern is that the kind of justification it offers is too much tied to the long run, while allowing essentially no constraint on what can be posited in the short-run. Yet it is in the short run that inductive practice actually occurs and where it really needs justification (BonJour 1998: 194; Salmon 1966: 53).

Related to this is the worry that the justification is weak in the sense that it applies to many other rules of inference as well as the so-called “straight rule” (Salmon 1966: 53). It applies, in fact, to any method which converges asymptotically to the straight rule. An easily specified class of such rules consists of those which add to the inductive rule a correction term \(c_n\), where the \(c_n\) converge to zero as \(n\) increases.
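The following sketch illustrates the worry; the particular correction term and the sample data are arbitrary illustrative choices:

```python
# The straight rule and a "deviant" rival that differs in the short run
# but converges to the same limiting frequency.

def straight_rule(m, n):
    return m / n

def deviant_rule(m, n):
    # adds a correction c_n = -1 / (n + 1), which goes to zero as n grows
    return m / n - 1 / (n + 1)

for n in (10, 100, 10_000):
    m = round(0.8 * n)  # e.g., 80% of observed instances had the attribute
    print(n, straight_rule(m, n), deviant_rule(m, n))

# The two rules give noticeably different posits for small n, yet both
# converge to the true limiting frequency (if such a limit exists).
```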

Reichenbach makes two suggestions aimed at avoiding this problem. First, he claims that since we have no real way to pick between the convergent methods, we might as well just use the inductive rule, since it is “easier to handle, owing to its descriptive simplicity”. Second, he claims that the method which embodies the “smallest risk” is following the inductive rule (Reichenbach 1938 [2006: 355–356]).

Another problem is whether Reichenbach has really established that there could not be a better rule than the straight rule. For instance, for all that has been said, there might be a soothsayer or psychic who is able to predict future events reliably. Here Reichenbach argues that by using induction we could recognize the reliability of the alternative method, by examining its track record. This thought was later picked up and developed into the suggestion that a “meta-inductivist” who applies induction not only at the “object” level to observations, but also to the success of others’ methods, might by those means be able to do as well predictively as the alternative method (Schurz 2008; see section 7.3 for more discussion of meta-induction).

One might also question whether a pragmatic argument can really deliver an all-purpose, general justification for following the inductive rule. Surely a pragmatic solution should be sensitive to differences in pay-offs that depend on the circumstances. For example, Reichenbach offers the following analogue to his pragmatic justification:

We may compare our situation to that of a man who wants to fish in an unexplored part of the sea. There is no one to tell him whether or not there are fish in this place. Shall he cast his net? Well, if he wants to fish in that place, I should advise him to cast the net, to take the chance at least. It is preferable to try even in uncertainty than not to try and be certain of getting nothing. (Reichenbach 1938 [2006: 362–363])

As Lange points out, the argument here “presumes that there is no cost to trying”. In such a situation, “the fisherman has everything to gain and nothing to lose by casting his net” (Lange 2011: 77). But if there is some significant cost to making the attempt, it may not be so clear that the most rational course of action is to cast the net. Similarly, whether or not it would make sense to adopt the policy of making no predictions, rather than the policy of following the inductive rule, may depend on what the practical penalties are for being wrong. A pragmatic solution may not be capable of offering a rationale for following the inductive rule which is applicable in all circumstances.

7.2 Formal Learning Theory

As we saw above, one of the problems for Reichenbach was that there are too many rules which converge in the limit to the true frequency. Which one should we then choose in the short-run? It is possible to broaden Reichenbach’s general strategy by considering what happens if we have other epistemic goals besides long-run convergence. Might other goals place constraints on which methods should be used in the short-run? The field of formal learning theory has developed answers to these questions (Kelly 1996; Schulte 1999; also see Schulte 2017).

In particular, formal learning theorists have considered the goal of getting to the truth as efficiently, or quickly, as possible, as well as the goal of minimizing the number of mind-changes, or retractions along the way. It has then been shown that the usual inductive method, which is characterized by a preference for simpler hypotheses (Occam’s razor), can be justified since it is the unique method which meets the standards for getting to the truth in the long run as efficiently as possible, with a minimum number of retractions (Schulte 1999).
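A toy example may convey the flavour of the mind-change criterion. Suppose, purely for illustration (this particular problem and hypothesis language are not drawn from the cited literature), that the task is to identify how many 1s a binary data stream will ever contain, and that the “Occam” learner always conjectures the smallest number consistent with what it has seen so far:

```python
# A toy learner that always conjectures the simplest hypothesis consistent
# with the data, together with a count of its mind-changes (retractions).

def occam_learner(stream):
    ones_seen = 0
    conjectures = []
    retractions = 0
    for bit in stream:
        ones_seen += bit
        conjecture = ones_seen  # simplest hypothesis: no further 1s will occur
        if conjectures and conjecture != conjectures[-1]:
            retractions += 1
        conjectures.append(conjecture)
    return conjectures, retractions

# If the stream in fact contains three 1s, this learner retracts exactly
# three times (once per new 1) and then stabilizes on the true answer.
print(occam_learner([0, 1, 0, 1, 0, 0, 1, 0, 0]))
```

On this toy problem, any learner guaranteed to reach the correct answer can be forced, by a suitably chosen stream, to retract at least once for each additional 1, so the simplicity-preferring learner’s performance matches that worst-case bound.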

Formal learning theory can be regarded as a kind of extension of the Reichenbachian programme. It does not offer justifications for inductive inferences, in the sense of giving reasons why they should be taken as likely to produce a true conclusion. Rather it offers reasons for following particular methods based on their optimality in achieving certain desirable epistemic ends, even if there is no guarantee that at any given stage of inquiry the results they produce are at all close to the truth. Recently, however, Steel (2010) has suggested that formal learning theory offers more, and does provide a solution to the problem of induction. This claim is based on a rather restrictive interpretation of “Hume’s problem” as the problem: “What is the justification for making inductive generalizations at all?” (2010: 182), rather than as the problem of giving the grounds for a given inductive inference. Steel’s claims have been disputed by Colin Howson (2011).

7.3 Meta-induction

Another approach to pursuing a broadly Reichenbachian programme is to move to the level of meta-induction. We can draw a distinction between applying inductive methods at the level of events—so-called “object-level” induction, and applying inductive methods at the level of competing prediction methods—so-called “meta-induction”. Whereas object-level inductive methods make predictions based on the events which have been observed to occur, meta-inductive methods make predictions based on aggregating the predictions of different available prediction methods according to their success rates. Here, the success rate of a method is defined according to some precise way of scoring success in making predictions.

The question is then whether there can be a meta-inductive method which is “predictively optimal” in the sense that following that method succeeds best in predictions among all competing methods, no matter what data is received. Gerhard Schurz has drawn on results from the regret-based learning framework of Cesa-Bianchi and Lugosi to argue that there is a meta-inductive strategy that is predictively optimal among all predictive methods accessible to an epistemic agent (Cesa-Bianchi & Lugosi 2006; Schurz 2008, forthcoming). This meta-inductive strategy, which Schurz calls “wMI”, predicts a weighted average of the predictions of the accessible methods, where the weights are “attractivities”, which measure the difference between the method’s own success rate and the success rate of wMI.
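The following is a minimal sketch of how such a weighted meta-inductive predictor might work. The loss function (absolute error), the fallback to equal weights when no method has positive attractivity, and the data format are simplifying assumptions made for illustration; they are not Schurz’s exact definitions:

```python
# A schematic weighted meta-inductive predictor in the spirit of wMI.

def meta_induction(method_predictions, outcomes):
    """method_predictions: dict mapping a method name to its list of
    predictions in [0, 1]; outcomes: the actual events in [0, 1].
    Returns the meta-method's predictions, round by round."""
    names = list(method_predictions)
    success = {name: 0.0 for name in names}  # cumulative success of each method
    meta_success = 0.0                        # cumulative success of the meta-method
    meta_predictions = []
    for n, outcome in enumerate(outcomes):
        # attractivity: how much better each method has done than the meta-method
        attractivity = {name: max(success[name] - meta_success, 0.0) for name in names}
        total = sum(attractivity.values())
        if total > 0:
            weights = {name: attractivity[name] / total for name in names}
        else:
            weights = {name: 1.0 / len(names) for name in names}  # fallback assumption
        prediction = sum(weights[name] * method_predictions[name][n] for name in names)
        meta_predictions.append(prediction)
        # score this round: success = 1 minus absolute prediction error
        for name in names:
            success[name] += 1.0 - abs(method_predictions[name][n] - outcome)
        meta_success += 1.0 - abs(prediction - outcome)
    return meta_predictions
```

Because the weights track which methods have so far been more successful than the meta-method itself, the meta-method’s success rate is pulled toward that of the best accessible method as the number of rounds grows, which is the intuitive content of the long-run optimality result described below.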

The main result is that the wMI strategy is long-run optimal in the sense that it converges to the maximum success rate of the accessible prediction methods. Worst-case bounds for short-run performance can also be derived. The optimality result forms the basis for an a priori means-ends justification for the use of wMI. Namely, the thought is, it is reasonable to use wMI, since it achieves the best success rate possible in the long run out of the given methods.

Schurz also claims that this a priori justification of wMI, together with the contingent fact that inductive methods have so far been much more successful than non-inductive methods, gives rise to an a posteriori justification of induction. Since wMI will achieve in the long run the maximal success rate of the available prediction methods, it is reasonable to use it. But as a matter of fact, the maximal success rate is achieved by inductive methods. Therefore, since it is a priori justified to use wMI, it is also a priori justified to use the maximally successful method at the object level. Since it turns out that the maximally successful method is induction, it is reasonable to use induction.

Schurz’s theorems on the optimality of wMI apply to the case where there are finitely many predictive methods. One point of discussion is whether this amounts to an important limitation on its claims to provide a full solution of the problem of induction (Eckhardt 2010).