This is a file in the archives of the Stanford Encyclopedia of Philosophy.

Stanford Encyclopedia of Philosophy

Probabilistic Causation

"Probabilistic Causation" designates a group of philosophical theories that aim to characterize the relationship between cause and effect using the tools of probability theory. A primary motivation for the development of such theories is the desire for a theory of causation that does not presuppose physical determinism. The central idea behind these theories is that causes raise the probabilities of their effects, all else being equal. As we shall see, a great deal of the work that has been done in this area has been concerned with making the ceteris paribus clause more precise. Issues within, and objections to, probabilistic theories of causation will also be discussed.

1. Introduction

According to David Hume, causes are sufficient conditions for their effects: "We may define a cause to be an object, followed by another, and where all the objects similar to the first, are followed by objects similar to the second." (1748, section VII.) Later writers refined Hume’s theory, but still characterized the causal relation in terms of necessary and sufficient conditions. One of the best known approaches is Mackie’s theory of inus conditions. An inus condition for some effect is an insufficient but non-redundant part of an unnecessary but sufficient condition. Suppose, for example, that a lit match causes a forest fire. The lighting of the match, by itself, is not sufficient; many matches are lit without ensuing forest fires. The lit match is, however, a part of some constellation of conditions that are jointly sufficient for the fire. Moreover, given that this set of conditions occurred, rather than some other set sufficient for fire, the lighting of the match was necessary: without it, the fire would not have occurred.

The necessity/sufficiency approach makes causation incompatible with indeterminism: if an event is not determined to occur, then no event can be a part of a sufficient condition for that event. (An analogous point may be made about necessity.) The recent success of quantum mechanics -- and to a lesser extent, other theories employing probability -- has shaken our faith in determinism. Thus it has struck many philosophers as desirable to develop a theory of causation that does not presuppose determinism. The central idea behind probabilistic theories of causation is that causes raise the probability of their effects; an effect may still occur in the absence of a cause or fail to occur in its presence.

Suggested Readings: Hume (1748), especially section VII; Mackie (1974), especially chapter 3.

2. Is Probabilistic Causation an Oxymoron?

Many philosophers find the idea of indeterministic causation counterintuitive. Indeed, the word "causality" is sometimes used as a synonym for determinism. A strong case for indeterministic causation can be made by considering the epistemic warrant for causal claims. There is now very strong empirical evidence that smoking causes lung cancer. Yet the question of whether there is a deterministic relationship between smoking and lung cancer is wide open. The formation of cancer cells depends upon mutation, which is a strong candidate for being an indeterministic process. Moreover, whether an individual smoker develops lung cancer or not depends upon a host of additional factors, such as whether or not she is hit by a bus before cancer cells begin to form. Thus the price of preserving the intuition that causation presupposes determinism is agnosticism about even our best supported causal claims.

Suggested Readings: Suppes (1970) makes the case for a probabilistic theory of causation in his introduction. Humphreys (1989) contains a sensitive treatment of issues involving indeterminism and causation; see especially sections 10 and 11.

3. Main Developments

3.1 The Central Idea

The central idea that causes raise the probability of their effects can be expressed formally using the apparatus of conditional probability. Let A, B, C,... represent entities that potentially stand in causal relations. Depending upon the account, these may be particular events, such as the assassination of Archduke Ferdinand, or event types, such as exposure to ultraviolet radiation. We will discuss this issue at greater length in section 4.3 below. For now we will adopt the generic word "factor" to describe the relevant entities. Let P be a probability function, satisfying the normal rules of the probability calculus, such that P(A) represents the empirical probability that factor A occurs or is instantiated (and likewise for the other factors). The issue of how empirical probability is to be interpreted will not be addressed here. (See the entry under probability). The probability of B, given A, is represented as a conditional probability:
P(B|A) = P(A & B)/P(A).
One natural way of understanding the idea that A raises the probability of B is that P(B|A) > P(B|not-A). Thus a first attempt at a probabilistic theory of causation would be:
PR: A causes B if and only if P(B|A) > P(B|not-A).
This formulation is labeled PR for "Probability-Raising".

There are two central problems with this theory. The first is that probability-raising is symmetric: if P(B|A) > P(B|not-A), then P(A|B) > P(A|not-B). The causal relation, however, is asymmetric: if A causes B, then typically B does not cause A. The problem of causal asymmetry arises for virtually every theory of causation, and probabilistic theories of causation are no exception.
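
The symmetry claim can be checked directly from the definition of conditional probability. The following is a sketch of the derivation, assuming that P(A) and P(B) lie strictly between 0 and 1 so that every conditional probability involved is defined:

```latex
\begin{align*}
P(B \mid A) > P(B \mid \neg A)
&\iff \frac{P(A \wedge B)}{P(A)} > \frac{P(B) - P(A \wedge B)}{1 - P(A)} \\
&\iff P(A \wedge B)\,\bigl(1 - P(A)\bigr) > P(A)\,\bigl(P(B) - P(A \wedge B)\bigr) \\
&\iff P(A \wedge B) > P(A)\,P(B).
\end{align*}
```

The last inequality is unchanged when A and B are exchanged, so running the same chain with the roles of A and B swapped yields P(A|B) > P(A|not-B).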

The second problem concerns spurious correlations. If, for example, A and B are both caused by some third event, say C, then it may be that P(B|A) > P(B|not-A) even though A does not cause B. For example, let A be an individual’s having yellow-stained fingers, and B that individual’s having lung cancer. Then we would expect that P(B|A) > P(B|not-A). The reason that those with yellow-stained fingers are more likely to suffer from lung cancer is that smoking tends to produce both effects. Because individuals with yellow-stained fingers are more likely to be smokers, they are also more likely to suffer from lung cancer. Intuitively, the way to address this problem is to require that causes raise the probabilities of their effects ceteris paribus. The history of probabilistic causation is to a large extent a history of attempts to resolve these two central problems.

3.2 Spurious Correlations

Hans Reichenbach introduced the terminology of "screening off" to apply to a particular type of probabilistic relationship. If P(B|A & C) = P(B|C), then C is said to screen A off from B. Intuitively, C renders A probabilistically irrelevant to B. With this notion in hand, we can attempt to avoid the problem of spurious correlations by adding a ‘no screening off’ condition to the basic probability-raising condition:
NSO: Factor A, occurring at time t, is a cause of the later factor B if and only if:
1. P(B|A) > P(B|not-A)
2. There is no factor C, occurring earlier than or simultaneously with A, that screens A off from B.
We will call this the NSO, or ‘No Screening Off’ formulation. Suppose, as in our example above, that smoking (C) causes both yellow-stained fingers (A) and lung cancer (B). Then smoking will screen yellow-stained fingers off from lung cancer: given that an individual smokes, his yellow-stained fingers have no impact upon his probability of developing lung cancer.
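
The screening-off relation in this example can be illustrated numerically. The sketch below is not drawn from any data: the probabilities are hypothetical, and the model simply assumes that yellow-stained fingers (A) and lung cancer (B) are independent once smoking (C) is given. Under those assumptions, A raises the probability of B, yet C screens A off from B.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration.
P_C = 0.3                               # P(C): probability of being a smoker
P_A_GIVEN_C = {True: 0.8, False: 0.05}  # P(A | C) and P(A | not-C)
P_B_GIVEN_C = {True: 0.2, False: 0.02}  # P(B | C) and P(B | not-C)

# Joint distribution over worlds (a, b, c); A and B are independent given C.
JOINT = {}
for a, b, c in product([True, False], repeat=3):
    p_c = P_C if c else 1 - P_C
    p_a = P_A_GIVEN_C[c] if a else 1 - P_A_GIVEN_C[c]
    p_b = P_B_GIVEN_C[c] if b else 1 - P_B_GIVEN_C[c]
    JOINT[(a, b, c)] = p_c * p_a * p_b


def prob(event):
    """Probability of the set of worlds (a, b, c) for which event is true."""
    return sum(p for world, p in JOINT.items() if event(*world))


def cond(event, given):
    """Conditional probability P(event | given); assumes P(given) > 0."""
    return prob(lambda a, b, c: event(a, b, c) and given(a, b, c)) / prob(given)


p_b_given_a = cond(lambda a, b, c: b, lambda a, b, c: a)
p_b_given_not_a = cond(lambda a, b, c: b, lambda a, b, c: not a)
p_b_given_a_and_c = cond(lambda a, b, c: b, lambda a, b, c: a and c)
p_b_given_c = cond(lambda a, b, c: b, lambda a, b, c: c)

# Condition 1 of NSO: A raises the probability of B.
print(p_b_given_a > p_b_given_not_a)
# Screening off: P(B | A & C) equals P(B | C) up to rounding error.
print(abs(p_b_given_a_and_c - p_b_given_c) < 1e-12)
```

With these numbers, condition 1 of NSO is satisfied while condition 2 fails, so NSO does not count yellow-stained fingers as a cause of lung cancer.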

The second condition of NSO does not suffice to resolve the problem of spurious correlations, however. This condition was added to eliminate cases where spurious correlations give rise to factors that raise the probability of other factors without causing them. Spurious correlations can also give rise to cases where a cause does not raise the probability of its effect. So genuine causes need not satisfy the first condition of NSO. Suppose, for example, that smoking is highly correlated with exercise: those who smoke are much more likely to exercise as well. Smoking is a cause of heart disease, but suppose that exercise is an even stronger preventative of heart disease. Then it may be that smokers are, overall, less likely to suffer from heart disease than non-smokers. That is, letting A represent smoking, C exercise, and B heart disease, P(B|A) < P(B|not-A). Note, however, that if we conditionalize on whether one exercises or not, this inequality is reversed: P(B|A & C) > P(B|not-A & C), and P(B|A & not-C) > P(B|not-A & not-C).
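
A small numerical sketch may make the reversal concrete. All of the probabilities below are hypothetical and were picked only so that the pattern described above appears: smoking raises the probability of heart disease among exercisers and among non-exercisers, yet lowers it overall because smokers are assumed to exercise far more often.

```python
# Hypothetical conditional probabilities, chosen only to exhibit the reversal.
P_EXERCISE = {True: 0.9, False: 0.1}  # P(C | A) and P(C | not-A), keyed by smoking status
P_DISEASE = {                         # P(B | A, C), keyed by (smokes, exercises)
    (True, True): 0.10,
    (False, True): 0.05,
    (True, False): 0.50,
    (False, False): 0.40,
}


def p_disease_given_smoking(smokes: bool) -> float:
    """P(B | A) or P(B | not-A), obtained by averaging over exercise (C)."""
    p_c = P_EXERCISE[smokes]
    return p_c * P_DISEASE[(smokes, True)] + (1 - p_c) * P_DISEASE[(smokes, False)]


# Within each exercise group, smoking raises the probability of heart disease.
print(P_DISEASE[(True, True)] > P_DISEASE[(False, True)])
print(P_DISEASE[(True, False)] > P_DISEASE[(False, False)])
# Overall, smokers are nevertheless less likely to suffer from heart disease.
print(p_disease_given_smoking(True) < p_disease_given_smoking(False))
```

The overall figures come from the law of total probability: P(B|A) = P(C|A)P(B|A & C) + P(not-C|A)P(B|A & not-C), and likewise for not-A.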

The next step is to replace conditions 1 and 2 with the requirement that causes must raise the probability of their effects in test situations:

TS: A causes B if P(B|A & T) > P(B|not-A & T) for every test situation T.
A test situation is a conjunction of factors. When such a conjunction of factors is conditioned on, those factors are said to be "held fixed". To specify what the test situations will be, then, we must specify what factors are to be held fixed. In the previous example, we saw that the true causal relevance of smoking for heart disease was revealed when we held exercise fixed, either positively (conditioning on C) or negatively (conditioning on not-C). This suggests that in evaluating the causal relevance of A for B, we need to hold fixed other causes of B, either positively or negatively. This suggestion is not entirely correct, however. Let A and B now be smoking and lung cancer, respectively. Suppose C is a causal intermediary, say the presence of tar (and other carcinogens) in the lungs. If A causes B exclusively via C, then C will screen A off from B: given the presence (absence) of carcinogens in the lungs, the probability of lung cancer is not affected by whether those carcinogens got there by smoking (are absent despite smoking). Thus we will not want to hold fixed any causes of B that are themselves caused by A. Let us call the set of all factors that are causes of B, but are not caused by A, the set of independent causes of B. A test situation for A and B will then be a maximal conjunction, each of whose conjuncts is either an independent cause of B, or the negation of an independent cause of B.
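
The TS condition can be sketched as a procedure: enumerate every maximal conjunction of the factors to be held fixed, and check the probability-raising inequality within each. The code below is only an illustration. The function names, the representation of a distribution as a list of (world, probability) pairs, and the numbers in the example are all assumptions introduced here, and the list of independent causes must be supplied by the user, since probabilities alone do not determine it.

```python
from itertools import product


def conditional(dist, target, given):
    """P(target | given) for a joint distribution over binary factors.

    dist is a list of (world, probability) pairs, each world a dict mapping
    factor names to truth values. Assumes P(given) > 0.
    """
    def matches(world, required):
        return all(world[name] == value for name, value in required.items())

    denominator = sum(p for w, p in dist if matches(w, given))
    numerator = sum(p for w, p in dist if matches(w, given) and matches(w, target))
    return numerator / denominator


def raises_in_every_test_situation(dist, cause, effect, held_fixed):
    """Check P(B | A & T) > P(B | not-A & T) for every test situation T."""
    for values in product([True, False], repeat=len(held_fixed)):
        situation = dict(zip(held_fixed, values))
        with_cause = conditional(dist, {effect: True}, {**situation, cause: True})
        without_cause = conditional(dist, {effect: True}, {**situation, cause: False})
        if not with_cause > without_cause:
            return False
    return True


# Hypothetical distribution: A = smoking, C = exercise, B = heart disease.
P_C_GIVEN_A = {True: 0.9, False: 0.1}
P_B_GIVEN_AC = {(True, True): 0.10, (False, True): 0.05,
                (True, False): 0.50, (False, False): 0.40}
DIST = []
for a, c, b in product([True, False], repeat=3):
    p_c = P_C_GIVEN_A[a] if c else 1 - P_C_GIVEN_A[a]
    p_b = P_B_GIVEN_AC[(a, c)] if b else 1 - P_B_GIVEN_AC[(a, c)]
    DIST.append(({"A": a, "C": c, "B": b}, 0.5 * p_c * p_b))

print(raises_in_every_test_situation(DIST, "A", "B", ["C"]))  # exercise held fixed
print(raises_in_every_test_situation(DIST, "A", "B", []))     # nothing held fixed
```

When no factors are held fixed, the only test situation is the empty conjunction, so the check reduces to the unconditional comparison of P(B|A) with P(B|not-A).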

Note that the specification of factors that need to be held fixed appeals to causal relations. This appears to rob the theory of its status as a reductive analysis of causation. We will see in section 4.4 below, however, that the issue is substantially more complex than that. In any event, even if there is no reduction of causation to probability, a theory detailing the systematic connections between causation and probability would be of great philosophical interest.

TS can be generalized in a number of ways. For example, one could define a ‘negative cause’ or ‘preventer’ or ‘inhibitor’ as a factor that lowers the probability of its ‘effect’ in all test situations, and a ‘mixed’ or ‘interacting’ cause as one that affects the probability of its ‘effect’ in different ways in different test situations. Or one could define causal relationships between variables that are non-binary, such as caloric intake and blood pressure. In principle, there are infinitely many ways in which one variable might depend probabilistically on another, even holding fixed some particular test situation, so this approach abandons any neat classification of causal factors into causes and preventers. These generalizations will also suggest revisions of the method for constructing test situations, since they suggest different sorts of factors to be held fixed.

An alternative approach to the problem of spurious correlations is through counterfactuals. According to a probabilistic counterfactual theory of causation (PC), A causes B if both occur and the probability that B would occur, at the time of A’s occurrence, was much higher than it would have been at the corresponding time if A had not occurred. This counterfactual is to be understood in terms of possible worlds: it is true if, in the nearest possible world(s) where A does not occur, the probability of B is much lower than it was in the actual world. On this account, one does not compare conditional probabilities, but unconditional probabilities in different possible worlds. The test situation is not some specified conjunction of factors, but the sum total of all that remains unchanged in moving to the nearest possible world(s) where A does not occur. Obviously a great deal hinges here upon the account of what makes some worlds nearer than others; for more on this issue, see the entry under "causation, counterfactual theories."

Suggested Readings: This section more or less follows the main developments in the history of probabilistic theories of causation. Versions of the NSO theory are found in Reichenbach (1956, section 23), and Suppes (1970, chapter 2). Salmon (1980) is an influential critique of these theories. The first version of TS was presented in Cartwright (1979). Eells (1991, chapters 2, 3, and 4) and Hitchcock (1993) carry out the two generalizations of TS described. Lewis (1986) is the locus classicus for PC. Good (1961, 1962) is an early essay on probabilistic causation that is rich in insights, but has had surprisingly little influence on the formulation of later theories.

3.3 Asymmetry

The other major problem with the basic probability-raising idea was that the relationship of probability-raising is symmetrical. One way of cutting through the Gordian knot is to require that causes precede their effects in time. This has several systematic disadvantages. First, it rules out the possibility of backwards-in-time causation a priori, whereas many believe that it is only a contingent fact that causes precede their effects in time. This is less of a worry if one is not concerned to give a conceptual analysis of causation. Second, this approach rules out the possibility of developing a causal theory of temporal order (on pain of vicious circularity), a theory that has seemed attractive to some philosophers. Note also that while assigning temporal locations to particular events is entirely coherent, it is not so clear what it means to say that one property or event type occurs before another. For example, what does it mean to say that smoking precedes lung cancer? There have been many episodes of smoking, and many of lung cancer, and not all of the former occurred prior to all of the latter. This will be a problem for those who are interested in providing a probabilistic theory of causal relations among properties or event types.

A more ambitious approach to the problem of causal asymmetry is to try to characterize that asymmetry in terms of probability relations alone. The best-known proposal of this sort is due to Hans Reichenbach. Suppose that factors A and B are positively correlated:

1. P(A & B) > P(A)P(B)
It is easy to see that this will hold exactly when A raises the probability of B and vice versa. Suppose, moreover, that there is some factor C having the following properties:
2. P(A & B|C) = P(A|C)P(B|C)

3. P(A & B|not-C) = P(A|not-C)P(B|not-C)

4. P(A|C) > P(A|not-C)

5. P(B|C) > P(B|not-C).

In this case, the trio ACB is said to form a conjunctive fork. Conditions 2 and 3 stipulate that C and not-C screen off A from B. As we have seen, this sometimes occurs when C is a common cause of A and B. Conditions 2 through 5 entail 1, so in some sense C explains the correlation between A and B. If C occurs earlier than A and B, and there is no event satisfying 2 through 5 that occurs later than A and B, then ACB is said to form a conjunctive fork open to the future. Analogously, if there is a future factor satisfying 2 through 5, but no past factor, we have a conjunctive fork open to the past. If a past factor C and a future factor D both satisfy 2 through 5, then ACBD forms a closed fork. Reichenbach’s proposal was that the direction from cause to effect is the direction in which open forks predominate. In our world, there are many forks open to the future, few or none open to the past.
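
The claim that conditions 2 through 5 entail 1 can be verified with a short calculation. The abbreviations c, a, a', b and b' below are introduced only for this sketch, and the derivation assumes 0 < P(C) < 1 so that the conditional probabilities are defined:

```latex
\begin{align*}
&\text{Let } c = P(C),\quad a = P(A \mid C),\quad a' = P(A \mid \neg C),\quad
  b = P(B \mid C),\quad b' = P(B \mid \neg C). \\[4pt]
&P(A \wedge B) = c\,ab + (1 - c)\,a'b' \qquad \text{(by conditions 2 and 3)} \\
&P(A)\,P(B) = \bigl(c\,a + (1 - c)\,a'\bigr)\bigl(c\,b + (1 - c)\,b'\bigr) \\[4pt]
&P(A \wedge B) - P(A)\,P(B)
  = c(1 - c)\,\bigl(ab + a'b' - ab' - a'b\bigr)
  = c(1 - c)\,(a - a')(b - b').
\end{align*}
```

Conditions 4 and 5 say that a > a' and b > b', so the right-hand side is positive, which is condition 1.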

It is not clear, however, that this asymmetry between forks open to the past and open to the future will be as pervasive as this proposal seems to presuppose. In quantum mechanics, there are correlated effects that are believed to have no common cause that screens them off. Moreover, if ACB forms a conjunctive fork in which C precedes A and B, but C has a deterministic effect D which occurs after A and B, then ACBD will form a closed fork. A further difficulty with this proposal is that since it provides a global ordering of causes and effects, it seems to rule out a priori the possibility that some effects might precede their causes. More complex attempts to derive the direction of causation from probabilities have been offered; the issues here intersect with the problem of reduction, discussed in section 4.4 below.

Proponents of counterfactual theories of causation attempt to derive the asymmetry of causation from a corresponding asymmetry in the truth values of counterfactuals. For details see the entry for "causation, counterfactual theories."

Suggested Readings: Suppes (1970, chapter 2) and Eells (1991, chapter 5) define causal asymmetry in terms of temporal asymmetry. Reichenbach’s proposal is presented in his (1956, chapter IV). Some difficulties with this proposal are discussed in Arntzenius (1993). Papineau (1993) is a good overall discussion of the problem of causal asymmetry within probabilistic theories.

4. Further Issues and Problems

4.1 Context-unanimity

According to TS, a cause must raise the probability of its effect in every test situation. This has been called the requirement of context-unanimity. This requirement is vulnerable to the following sort of counterexample. Suppose that there is a gene that has the following unusual effect: those that possess the gene have their chances of contracting lung cancer lowered when they smoke. This gene is very rare, let us imagine -- indeed, it need not exist at all in the human population, so long as humans have some non-zero probability of possessing this gene (perhaps as a result of a very improbable mutation). In this scenario, there would be test situations (those that hold fixed the presence of the gene) in which smoking lowers the probability of lung cancer: thus smoking would not be a cause of lung cancer according to the context-unanimity requirement. Nonetheless, it seems unlikely that the discovery of such a gene (or of the mere possibility of its occurrence) would lead us to abandon the claim that smoking causes lung cancer.

This line of objection is surely right about our ordinary use of causal language. It is nonetheless open to the defender of context-unanimity to respond that she is interested in supplying a precise concept to replace the vague notion of causation that corresponds to our everyday usage. In a population consisting of individuals lacking the gene, smoking causes lung cancer. In a population consisting entirely of individuals who possess the gene, smoking prevents lung cancer. In contexts where one desires causal information for purposes of deliberation (say concerning whether to smoke), it is this more precise type of information that is desired.

Suggested Readings: Dupré (1984) presents this challenge to the context-unanimity requirement, and offers an alternative. Eells (1991, chapters 1 and 2) defends context-unanimity using the idea that causal claims are made relative to a population.

4.2 Potential Counterexamples

Given the basic probability-raising idea, one would expect putative counterexamples to probabilistic theories of causation to be of two basic types: cases where causes fail to raise the probabilities of their effects, and cases where non-causes raise the probabilities of non-effects. The discussion in the literature has focused almost entirely on the first sort of example. Consider the following example, due to Deborah Rosen. A golfer badly slices a golf ball, which heads toward the rough, but then bounces off a tree and into the cup for a hole in one. The golfer’s slice lowered the probability that the ball would wind up in the cup, yet nonetheless caused this result. One way of avoiding this problem is to attend to the probabilities that are being compared. If we label the slice A, not-A is a disjunction of several alternatives. One such alternative is a clean shot -- compared to this alternative, the slice lowered the probability of a hole-in-one. Another alternative is no shot at all, relative to which the slice increases the probability of a hole-in-one. By making the latter sort of comparison, we can recover our original intuitions about the example.
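
The role of the contrast class can be made explicit. As a sketch, suppose (an assumption made only for this illustration) that not-A divides into mutually exclusive alternatives A_1, ..., A_n, such as a clean shot or no shot at all, which jointly exhaust not-A. Then the probability of B given not-A is a weighted average of the probabilities of B given each alternative:

```latex
\[
P(B \mid \neg A) \;=\; \sum_{i=1}^{n} P(A_i \mid \neg A)\, P(B \mid A_i).
\]
```

Whether A counts as raising or lowering the probability of B can therefore depend on which alternative A_i it is compared with: P(B|A) may be lower than P(B|A_i) for one alternative and higher for another.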

For an example of the second type, suppose that two gunmen shoot at a target. Each has a certain probability of hitting, and a certain probability of missing. Assume that none of the probabilities are one or zero. As a matter of fact, the first gunman hits, and the second gunman misses. Nonetheless, the second gunman did fire, and by firing, increased the probability that the target would be hit, which it was. While it is obviously wrong to say that the second gunman’s shot caused the target to be hit, it would seem that a probabilistic theory of causation is committed to this consequence. A natural approach to this problem would be to try to strengthen the probabilistic theory of causation with a requirement of spatiotemporal connection between cause and effect (see the entry on "causation, causal processes"), but to date, no successful proposal along these lines has been proffered.
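
The probabilistic structure of the example can be spelled out. In this sketch, H is the target's being hit and F_1 and F_2 are the two gunmen's firings; p_1 and p_2 are their respective probabilities of hitting, and the two shots are assumed to be probabilistically independent. These symbols and the independence assumption are introduced here for illustration only.

```latex
\begin{align*}
P(H \mid F_1 \wedge F_2) &= 1 - (1 - p_1)(1 - p_2) = p_1 + (1 - p_1)\,p_2, \\
P(H \mid F_1 \wedge \neg F_2) &= p_1.
\end{align*}
```

Since neither probability is one or zero, (1 - p_1)p_2 > 0, so the second gunman's firing raises the probability of a hit even in the case where his shot misses.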

Suggested Readings: Salmon (1980) presents several examples of probability-lowering causes. Hitchcock (1995) presents a response. Woodward (1990) describes the structure that is instantiated in the example of the two gunmen. Humphreys (1989, section 14) responds. Menzies (1989, 1996) discusses examples involving causal pre-emption where non-causes raise the probabilities of non-effects.

4.3 Singular and General Causation

We make at least two different kinds of causal claim. Singular causal claims, such as "Jill’s heavy smoking during the ‘80’s caused her to develop lung cancer," relate particular events that have spatiotemporal locations. General causal claims, such as "smoking causes lung cancer," relate event types or properties. With this distinction in mind, we may note that the counterexamples mentioned above are both formulated in terms of singular causation. The examples do not undermine the general causal claims that a probabilistic theory of causation would appear to license in these cases: slices prevent (are negative causes of) holes-in-one; shooting at targets causes them to be hit. So one possible reaction to the counterexamples of the previous section would be to maintain that the probabilistic theory of causation whose development was sketched in section 3 above is a theory of general causation only, and that singular causation requires a distinct philosophical theory. One consequence of this move is that there are (at least) two distinct species of causal relation, each requiring its own philosophical account -- not an altogether happy predicament.

Suggested Readings: The need for distinct theories of singular and general causation is defended in Good (1961, 1962), Sober (1985), and Eells (1991, introduction and chapter 6). Eells (1991, chapter 6) offers a distinct probabilistic theory of singular causation in terms of the temporal evolution of probabilities. Carroll (1991) and Hitchcock (1995) offer two quite different lines of response.

4.4 Reduction and Circularity

Returning to the theories outlined in section 3, recall that theory NSO was an attempt at a reductive analysis of causation in terms of probabilities (and perhaps also temporal order). By contrast, TS defines causal relations in terms of probabilities conditional upon specifications of test situations, which are themselves characterized in causal terms. Thus it appears that the latter theories cannot be analyses of causation, since causation appears in the analysans. Given that TS contains much-needed improvements over NSO, it looks as though there can be no reduction of causation to probabilities. This may be giving up too soon, however. In order to determine whether a probabilistic reduction of causation is possible, the central issue is not whether the word ‘cause’ appears in both the analysandum and the analysans; rather, the key question should be whether, given an assignment of probabilities to a set of factors, there is a unique set of causal relations among those factors compatible with the probability assignment and the theory in question. Suppose that a set of factors, and a system of causal relations among those factors, is given: call this the causal structure CS. Let T be a theory connecting causal relations among factors with probabilistic relations among factors. Then the causal structure CS will be probabilistically distinguishable relative to T if, for every assignment of probabilities to the factors in CS that is compatible with CS and T, CS is the unique causal structure compatible with T and those probabilities. (One could formulate a weaker sense of distinguishability by requiring that only some assignment of probabilities uniquely determines CS.) Intuitively, T allows you to infer that the causal structure is in fact CS given the probability relations between factors. Given a probabilistic theory of causation T, it is possible to imagine many different properties it might have. Here are some possibilities:
1. All causal structures are probabilistically distinguishable relative to T

2. All causal structures having some interesting property are probabilistically distinguishable relative to T

3. Any causal structure can be embedded in a causal structure that is probabilistically distinguishable relative to T

4. The actual causal structure of the world (assuming there is such a thing) is probabilistically distinguishable relative to T.

It is not obvious which type of distinguishability properties a theory must have in order to constitute a reduction of causation to probabilities. This sort of approach to the question of probabilistic reduction is quite new, and currently an active area of investigation.
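
To see why distinguishability is not automatic, consider a theory T that, by assumption, looks only at screening-off relations and uses no information about temporal order. The sketch below builds a hypothetical probability assignment for a common-cause structure, in which C causes both A and B, and then shows that exactly the same probabilities can be factorized as a causal chain from A through C to B. The parameter values and helper names are illustrative assumptions, not part of any particular theory in the literature.

```python
from itertools import product

WORLDS = list(product([True, False], repeat=3))  # each world is (a, c, b)


def bern(p, value):
    """Probability that a binary factor with P(True) = p takes the given value."""
    return p if value else 1 - p


# Hypothetical parameters for the common-cause structure A <- C -> B.
P_C = 0.3
P_A_GIVEN_C = {True: 0.8, False: 0.05}
P_B_GIVEN_C = {True: 0.2, False: 0.02}

fork = {
    (a, c, b): bern(P_C, c) * bern(P_A_GIVEN_C[c], a) * bern(P_B_GIVEN_C[c], b)
    for a, c, b in WORLDS
}


def marginal(dist, keep):
    """Marginal distribution over the coordinates whose indices are in keep."""
    out = {}
    for world, p in dist.items():
        key = tuple(world[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


# Read off the ingredients of a chain A -> C -> B from the same probabilities.
p_a = marginal(fork, [0])
p_ac = marginal(fork, [0, 1])
p_c = marginal(fork, [1])
p_cb = marginal(fork, [1, 2])

# Chain factorization: P(a) * P(c | a) * P(b | c).
chain = {
    (a, c, b): p_a[(a,)] * (p_ac[(a, c)] / p_a[(a,)]) * (p_cb[(c, b)] / p_c[(c,)])
    for a, c, b in WORLDS
}

same = all(abs(fork[w] - chain[w]) < 1e-12 for w in WORLDS)
print(same)
```

Because the two factorizations agree on every world, a theory restricted in the way assumed here could not single out either structure from these probabilities alone; further constraints, such as temporal order, would be needed.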

Suggested Readings: The most detailed treatment of probabilistic distinguishability is given in Spirtes, Glymour and Scheines (1993); see especially chapter 4. Spirtes, Glymour and Scheines prove (theorem 4.6) a result along the lines of 3 for a theory that they propose. This work is very technical. An accessible presentation is contained in Papineau (1993), which defends a position along the lines of 4.


Related Entries

causation: causal processes | causation: counterfactual theories of | cause and effect | conditionals: counterfactual | determinism, causal | events | Hume, David | physics: Reichenbach’s common cause principle | probability calculus: interpretations of | quantum mechanics | time

Copyright © 1997 by
Christopher Hitchcock

First published: July 11, 1997
Content last modified: July 17, 1997