This is a file in the archives of the Stanford Encyclopedia of Philosophy.

- 1. Introduction and Motivation
- 2. Preliminaries
- 3. Main Developments
- 4. Counterfactual Approaches
- 5. Causal Modeling and Probabilistic Causation
- 6. Further Issues and Problems
- Bibliography
- Other Internet Resources
- Related Entries

**Suggested Readings:** Hume (1748), especially section VII.

The problem of imperfect regularities does not tell decisively
against the regularity approach to causation. Successors of Hume,
especially John Stuart Mill and John Mackie, have attempted to offer
more refined accounts of the regularities that underwrite causal
relations. Mackie introduced the notion of an *inus*
condition: an inus condition for some effect is an insufficient but
non-redundant part of an unnecessary but sufficient condition.
Suppose, for example, that a lit match causes a forest fire. The
lighting of the match, by itself, is not sufficient; many matches are
lit without ensuing forest fires. The lit match is, however, a part
of some constellation of conditions that are jointly sufficient for
the fire. Moreover, given that this set of conditions occurred,
rather than some other set sufficient for fire, the lighting of the
match was necessary: fires do not occur in such circumstances when
lit matches are not present.
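Mackie's four clauses can be checked mechanically once the sufficient conditions are written down. The sketch below assumes, purely for illustration, that forest fires have just two minimal sufficient constellations; the factor names and the helper functions `effect_occurs` and `is_inus` are hypothetical and do not come from Mackie.

```python
# Hypothetical minimal sufficient conditions for a forest fire.
# Each frozenset is a conjunction of factors that jointly suffices for the effect.
SUFFICIENT_SETS = [
    frozenset({"lit_match", "dry_leaves", "oxygen"}),
    frozenset({"lightning", "dry_leaves", "oxygen"}),
]


def effect_occurs(present, sufficient_sets):
    """The effect occurs when every member of some sufficient set is present."""
    return any(s <= present for s in sufficient_sets)


def is_inus(factor, sufficient_sets):
    """Check the four inus clauses for `factor` against the given sets."""
    for s in sufficient_sets:
        if factor not in s:
            continue
        # Insufficient: the factor alone does not bring about the effect.
        insufficient = not effect_occurs({factor}, sufficient_sets)
        # Non-redundant: the rest of the set, without the factor, no longer suffices.
        non_redundant = not effect_occurs(s - {factor}, sufficient_sets)
        # Unnecessary: some other sufficient set can produce the effect without s.
        others = [t for t in sufficient_sets if t != s]
        unnecessary = any(not (s <= t) for t in others)
        if insufficient and non_redundant and unnecessary:
            return True
    return False


print(is_inus("lit_match", SUFFICIENT_SETS))  # True
print(is_inus("rain", SUFFICIENT_SETS))       # False
```

On these assumed conditions the lit match is an inus condition, while a factor that belongs to no sufficient constellation is not.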

There are, however, disadvantages to this type of approach. The
regularities upon which a causal claim rests now turn out to be much
more complicated than we had previously realized. In particular, this
complexity raises problems for the epistemology of causation. One
appeal of Hume's regularity theory is that it seems to provide a
straightforward account of how we come to know what causes what: we
learn that *A* causes *B* by observing that *A*s
are invariably followed by *B*s. Consider again the case of
smoking and lung cancer: on the basis of what evidence do we believe
that the one is a cause of the other? It is not that all smokers
develop lung cancer, for we do not observe this to be true. But
neither have we observed some constellation of conditions *C*,
such that smoking is invariably followed by lung cancer in the
presence of *C*, while lung cancer never occurs in non-smokers
meeting condition *C*. Rather, what we observe is that smokers
develop lung cancer *at much higher rates* than non-smokers;
this is the *prima facie* evidence that leads us to think that
smoking causes lung cancer. This fits very nicely with the
probabilistic approach to causation.

As we shall see in Section 3.2 below, however, the basic idea that causes raise the probability of their effects has to be qualified in a number of ways. By the time these qualifications are added, it appears that probabilistic theories of causation have to make a move that is quite analogous to Mackie's appeal to constellations of background conditions. Thus it is not clear that the problem of imperfect regularities, by itself, offers any real reason to prefer probabilistic approaches to causation over regularity approaches.

**Suggested Readings:** Refined versions of the
regularity analysis are found in Mill (1843), Volume I, chapter V,
and in Mackie (1974), chapter 3. The introduction of Suppes (1970)
presses the problem of imperfect regularities.

Many philosophers find the idea of indeterministic causation counterintuitive. Indeed, the word “causality” is sometimes used as a synonym for determinism. A strong case for indeterministic causation can be made by considering the epistemic warrant for causal claims. There is now very strong empirical evidence that smoking causes lung cancer. Yet the question of whether there is a deterministic relationship between smoking and lung cancer is wide open. The formation of cancer cells depends upon mutation, which is a strong candidate for being an indeterministic process. Moreover, whether an individual smoker develops lung cancer or not depends upon a host of additional factors, such as whether or not she is hit by a bus before cancer cells begin to form. Thus the price of preserving the intuition that causation presupposes determinism is agnosticism about even our best supported causal claims.

Since probabilistic theories of causation require only that a cause raise the probability of its effect, these theories are compatible with indeterminism. This seems to be a potential advantage over regularity theories. It is unclear, however, to what extent this potential advantage is actual. In the realm of microphysics, where we have strong (but still contestable) evidence of indeterminism, our ordinary causal notions do not easily apply. This is brought out especially clearly in the famous Einstein, Podolsky and Rosen thought experiment. On the other hand, it is unclear to what extent quantum indeterminism ‘percolates up’ to the macroworld of smokers and cancer victims, where we do seem to have some clear causal intuitions.

**Suggested Readings:** Humphreys (1989) contains a
sensitive treatment of issues involving indeterminism and causation;
see especially sections 10 and 11. Earman (1986) is a thorough
treatment of issues of determinism in physics.

Some proponents of probabilistic theories of causation follow Hume in identifying causal direction with temporal direction. Others have attempted to use the resources of probability theory to articulate a substantive account of the asymmetry of causation, with mixed success. We will discuss these proposals at greater length in Section 3.3 below.

**Suggested Readings:** Hausman (1998) contains a
detailed discussion of issues involving the asymmetry of
causation. Mackie (1974), chapter 3, shows how the problem of
asymmetry can arise for his inus condition theory. Lewis (1986)
contains a very brief but clear statement of the problem of
asymmetry.

Figure 1

The ability to handle such spurious correlations is probably the greatest success of probabilistic theories of causation, and remains a major source of attraction for such theories. We will discuss this issue in greater detail in Section 3.2 below.

**Suggested Readings:** Mackie (1974), chapter 3, shows
how the problem of spurious regularities can arise for his inus
condition theory. Lewis (1986) contains a very brief but clear
statement of the problem of spurious regularities.

**Suggested Readings:** Mill (1843) contains the classic
discussion of “the cause” and “a cause.” Bennett
(1988) is an excellent discussion of facts and events.

P(*B* | *A*) = P(*A* & *B*)/P(*A*).

As an illustration, suppose that we toss a fair die.

If P(*A*) is 0, then the ratio in the definition of
conditional probability is undefined. There are, however, other
technical developments that will allow us to define P(*B* |
*A*) when P(*A*) is 0. The simplest is simply to take
conditional probability as a primitive, and to define unconditional
probability as probability conditional on a tautology.
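The ratio definition can be tried out on a fair die. The sketch below is only an illustration: the particular events (an even face, a face greater than three, and the impossible face seven) are assumptions chosen for the example, and the helper names `prob` and `cond_prob` are hypothetical. Exact fractions are used so that the results are not blurred by rounding.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # faces of a fair six-sided die


def prob(event):
    """Unconditional probability of `event`, a predicate on die faces."""
    favourable = sum(1 for o in OUTCOMES if event(o))
    return Fraction(favourable, len(OUTCOMES))


def cond_prob(b, a):
    """P(B | A) = P(A & B) / P(A); returns None when P(A) is 0."""
    p_a = prob(a)
    if p_a == 0:
        return None  # the ratio is undefined
    return prob(lambda o: a(o) and b(o)) / p_a


def is_even(o):
    return o % 2 == 0


def greater_than_three(o):
    return o > 3


def is_seven(o):
    return o == 7


print(cond_prob(is_even, greater_than_three))  # 2/3
print(cond_prob(is_even, is_seven))            # None
```

Learning that the face is greater than three raises the probability of an even face from 1/2 to 2/3, while conditioning on an event of probability 0 leaves the ratio undefined, which is the case that treating conditional probability as a primitive is meant to handle.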

One
natural way of understanding the idea that *A* raises the
probability of *B* is that P(*B* | *A*) >
P(*B* | not-*A*). Thus a first attempt at a
probabilistic theory of causation would be:

**PR:** *A* causes *B* if and only if P(*B* | *A*) > P(*B* | not-*A*).

This formulation is labeled *PR*.

PR addresses the problems of imperfect regularities and
indeterminism, discussed above. But it does not address the other two
problems discussed in section 1 above. First, probability-raising is
symmetric: if P(*B* | *A*) > P(*B* |
not-*A*), then P(*A* | *B*) > P(*A* |
not-*B*). The causal relation, however, is typically
*asymmetric.*

Figure 2

Second, PR has trouble with *spurious correlations*. If
*A* and *B* are both caused by some third factor,
*C*, then it may be that P(*B* | *A*) >
P(*B* | not-*A*) even though *A* does not cause
*B*. This situation is shown schematically in Figure 2. For
example, let *A* be an individual's having yellow-stained
fingers, and *B* that individual's having lung cancer. Then
we would expect that P(*B* | *A*) > P(*B* |
not-*A*). The reason that those with yellow-stained fingers
are more likely to suffer from lung cancer is that smoking tends to
produce both effects. Because individuals with yellow-stained
fingers are more likely to be smokers, they are also more likely to
suffer from lung cancer. Intuitively, the way to address this
problem is to require that causes raise the probabilities of their
effects *ceteris paribus.* The history of probabilistic
causation is to a large extent a history of attempts to resolve these
two central problems.

**Suggested Readings**: For a primer on basic
probability theory, see the entry for “probability calculus:
interpretations of.” This entry also contains a discussion of
the interpretation of probability claims.

**NSO:** Factor *A*, occurring at time *t*, is a cause of the later factor *B* if and only if:

1. P(*B* | *A*) > P(*B* | not-*A*)
2. There is no factor *C*, occurring earlier than or simultaneously with *A*, that screens *A* off from *B*.

We will call this the *NSO* formulation.
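The two conditions can be checked on a small joint distribution. The sketch below assumes the standard definition on which *C* screens *A* off from *B* when P(*B* | *A* & *C*) = P(*B* | *C*), and it uses invented numbers for the stained-fingers case: *C* (smoking) is a common cause of *A* (yellow-stained fingers) and *B* (lung cancer). All parameter values and helper names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical numbers: C is a common cause of A and B,
# and A and B are independent once C is fixed.
P_C = F(1, 2)
P_A_GIVEN = {True: F(4, 5), False: F(1, 5)}   # P(A | C), P(A | not-C)
P_B_GIVEN = {True: F(3, 5), False: F(1, 10)}  # P(B | C), P(B | not-C)

JOINT = {}
for a, b, c in product([True, False], repeat=3):
    p_c = P_C if c else 1 - P_C
    p_a = P_A_GIVEN[c] if a else 1 - P_A_GIVEN[c]
    p_b = P_B_GIVEN[c] if b else 1 - P_B_GIVEN[c]
    JOINT[(a, b, c)] = p_c * p_a * p_b


def prob(event):
    """Probability of `event`, a predicate on (a, b, c)."""
    return sum(p for world, p in JOINT.items() if event(*world))


def cond(event, given):
    """P(event | given), assuming P(given) > 0."""
    both = prob(lambda a, b, c: event(a, b, c) and given(a, b, c))
    return both / prob(given)


# Condition 1: does A raise the probability of B?
p_b_given_a = cond(lambda a, b, c: b, lambda a, b, c: a)
p_b_given_not_a = cond(lambda a, b, c: b, lambda a, b, c: not a)
raises_probability = p_b_given_a > p_b_given_not_a

# Condition 2: does C screen A off from B?
c_screens_off = (
    cond(lambda a, b, c: b, lambda a, b, c: a and c)
    == cond(lambda a, b, c: b, lambda a, b, c: c)
)

print(p_b_given_a, p_b_given_not_a)  # 1/2 1/5
print(c_screens_off)                 # True
```

With these numbers *A* satisfies the first condition but fails the second, so *A* does not count as a cause of *B*.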

The second condition of *NSO* does not suffice to resolve the
problem of spurious correlations, however. This condition was added
to eliminate cases where spurious correlations give rise to factors
that raise the probability of other factors without causing them.
Spurious correlations can also give rise to cases where a cause does
not raise the probability of its effect. So genuine causes need not
satisfy the *first* condition of *NSO*. Suppose, for
example, that smoking is highly correlated with exercise: those who
smoke are much more likely to exercise as well. Smoking is a cause
of heart disease, but suppose that exercise is an even stronger
preventative of heart disease. Then it may be that smokers are, over
all, less likely to suffer from heart disease than non-smokers. That
is, letting *A* represent smoking, *C* exercise, and
*B* heart disease, P(*B* | *A*) < P(*B* |
not-*A*). Note, however, that if we conditionalize on whether
one exercises or not, this inequality is reversed: P(*B* |
*A* & *C*) > P(*B* | not-*A* &
*C*), and P(*B* | *A* & not-*C*) >
P(*B* | not-*A* & not-*C*). Such reversals of
probabilistic inequalities are instances of “Simpson's
Paradox.”
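A numerical instance makes the reversal easy to verify. The figures below are hypothetical and were chosen only to produce the pattern described, with *A* for smoking, *C* for exercise and *B* for heart disease; the function name `p_b_given_a` is likewise an assumption of this sketch.

```python
from fractions import Fraction as F

# Hypothetical numbers, chosen only to produce the reversal.
P_C_GIVEN_A = {True: F(4, 5), False: F(1, 5)}  # P(C | A), P(C | not-A)
P_B_GIVEN_AC = {                                # P(B | A=a, C=c), keyed by (a, c)
    (True, True): F(1, 10),
    (False, True): F(1, 20),
    (True, False): F(1, 2),
    (False, False): F(2, 5),
}


def p_b_given_a(a):
    """P(B | A=a), averaging over exercise by the law of total probability."""
    p_c = P_C_GIVEN_A[a]
    return p_c * P_B_GIVEN_AC[(a, True)] + (1 - p_c) * P_B_GIVEN_AC[(a, False)]


print(p_b_given_a(True), p_b_given_a(False))  # 9/50 33/100
```

Overall, smokers have the lower rate of heart disease (9/50 against 33/100), yet among exercisers and among non-exercisers alike the smokers have the higher rate.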

The next step is to replace conditions 1 and 2 with the
requirement that causes must raise the probability of their effects
in *test situations*:

**TS:** *A* causes *B* if P(*B* | *A* & *T*) > P(*B* | not-*A* & *T*) for every test situation *T*.

A test situation is a conjunction of factors. When such a conjunction of factors is conditioned on, those factors are said to be “held fixed.” To specify what the test situations will be, then, we must specify what factors are to be held fixed. In the previous example, we saw that the true causal relevance of smoking for heart disease was revealed when we held exercise fixed, either positively (conditioning on *C*) or negatively (conditioning on not-*C*).

Note that the specification of factors that need to be held fixed
appeals to causal relations. This appears to rob the theory of its
status as a *reductive analysis* of causation. We will see in
Section 6.4
below, however, that the issue is substantially more complex than
that. In any event, even if there is no reduction of causation to
probability, a theory detailing the systematic connections between
causation and probability would be of great philosophical
interest.

The move from the basic idea of *PR* to the complex
formulation of *TS* is rather like the move from Hume's
original regularity theory to Mackie's theory of inus conditions. In
both cases, the move substantially complicates the epistemology of
causation. In order to know whether *A* is a cause of
*B*, we need to know what happens in the presence and absence
of *A*, while holding fixed a complicated conjunction of
further factors. The hope that a probabilistic theory of causation
would enable us to handle the problem of imperfect regularities
without appealing to such constellations of background conditions
seems not to have been borne out. Nonetheless, *TS* does seem
to provide us with a theory that is compatible with indeterminism and
that can distinguish causation from spurious correlation.

*TS* can be generalized in at least two important ways.
First, we can define a ‘negative cause’ or
‘preventer’ or ‘inhibitor’ as a factor that
lowers the probability of its ‘effect’ in all test
situations, and a ‘mixed’ or ‘interacting’ cause
as one that affects the probability of its ‘effect’ in
different ways in different test situations. It should be apparent
that when constructing test situations for *A* and *B*
one should also hold fixed preventers and mixed causes of *B*
that are independent of *A*. Generalizing even further, one
could define causal relationships between variables that are
non-binary, such as caloric intake and blood pressure. In evaluating
the causal relevance of *X* for *Y*, we will need to
hold fixed the values of variables that are independently causally
relevant to *Y*. In principle, there are infinitely many ways
in which one variable might depend probabilistically on another, even
holding fixed some particular test situation. Thus, once the theory
is generalized to include non-binary variables, it will not be
possible to provide any neat classification of causal factors into
causes and preventers.

These two generalizations bring out an important distinction. It is
one thing to ask whether *A* is causally relevant to
*B* *in some way*; it is another to ask *in which
way* is *A* causally relevant to *B*. To say that
*A* causes *B* is then potentially ambiguous: it might
mean that *A* is causally relevant to *B* in some way
or other; or it could mean that *A* is causally relevant for
*B* in a particular way, that *A* promotes *B*
or is a positive factor for the occurrence of *B*. For
example, if *A* prevents *B*, then *A* will
count as a cause of *B* in the first sense, but not in the
second. Probabilistic theories of causation can be used to answer
both types of question. *A* is causally relevant to *B*
if *A* makes some difference for the probability of *B*
in some test situation; whereas *A* is a positive or promoting
cause of *B* if *A* *raises* the probability of
*B* in *all* test situations.
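The two questions can be separated in a short sketch. It takes as input, for each test situation *T*, the pair P(*B* | *A* & *T*) and P(*B* | not-*A* & *T*). The function names and the example numbers are hypothetical, and treating every non-uniform pattern (for instance, raising in some situations and making no difference in others) as "mixed" is an assumption of this sketch rather than something the theory settles.

```python
from fractions import Fraction as F


def classify(test_situations):
    """Classify A's relevance to B.

    `test_situations` is a list of pairs (P(B | A & T), P(B | not-A & T)),
    one pair per test situation T.
    """
    diffs = [with_a - without_a for with_a, without_a in test_situations]
    if all(d == 0 for d in diffs):
        return "irrelevant"
    if all(d > 0 for d in diffs):
        return "positive"
    if all(d < 0 for d in diffs):
        return "negative"
    return "mixed"  # assumption: any non-uniform pattern counts as mixed


def is_causally_relevant(test_situations):
    """A is relevant to B if it makes a difference in some test situation."""
    return classify(test_situations) != "irrelevant"


# Hypothetical probabilities for two test situations each.
raises_everywhere = [(F(1, 10), F(1, 20)), (F(1, 2), F(2, 5))]
reverses = [(F(3, 10), F(1, 10)), (F(1, 20), F(1, 5))]

print(classify(raises_everywhere))  # positive
print(classify(reverses))           # mixed
```

A preventer comes out as causally relevant under `is_causally_relevant` but not as a positive cause under `classify`, which mirrors the ambiguity just described.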

The problem of spurious correlations also plagues certain versions
of decision theory. This can happen when one's choice of action is
symptomatic of certain good or bad outcomes, without causing those
outcomes. (The best-known example of this sort is Newcomb's Problem.)
In cases like this, some versions of decision theory appear to
recommend that one act so as to receive good news about events beyond
one's control, rather than act so as to bring about desirable events
that are *within* one's control. In response, many decision
theorists have advocated versions of causal decision theory. Some
versions closely resemble *TS*.

**Suggested Readings:** This section more or less
follows the main developments in the history of probabilistic
theories of causation. Versions of the *NSO* theory are found
in Reichenbach (1956, section 23), and Suppes (1970, chapter 2). Good
(1961, 1962) is an early essay on probabilistic causation that is
rich in insights, but has had surprisingly little influence on the
formulation of later theories. Salmon (1980) is an influential
critique of these theories. The first versions of *TS* were
presented in Cartwright (1979) and Skyrms (1980). Eells (1991,
chapters 2, 3, and 4) and Hitchcock (1993) carry out the two
generalizations of *TS* described. Skyrms (1980) presents a
version of causal decision theory that is very similar to
*TS*. See also the entry for “decision theory:
causal.”

Some defenders of manipulability or agency theories of causation
have argued that the necessary asymmetry is provided by our
perspective as agents. In assessing whether *A* is a cause of
*B*, we must ask whether *A* increases the probability
of *B*, where the relevant conditional probabilities are
*agent probabilities*: the probabilities that *B* would
have were *A* (or not-*A*) to be realized by the choice
of a free agent. Critics have wondered just what these agent
probabilities are.

Other approaches attempt to locate the asymmetry between cause and
effect within the structure of the probabilities themselves. One very
simple proposal would be to refine the way in which the test
situations are constructed. (See the
previous section
for discussion of test situations.) In evaluating whether *A*
is a cause of *B*, we should hold fixed not only the
independent causes of *B*, but also the causes of
*A*. Thus if *B* is a cause of *A*, rather than
vice versa, *A* will not raise the probability of *B*
in the appropriate test situation, since the presence or absence of
*B* will already be held fixed. This idea is built into the
*Causal Markov Condition* discussed in
Section 5
below. Proponents of traditional probabilistic theories of causation
have not adopted this strategy. This may be because they feel that
this refinement would take the theory too close to vicious
circularity: in order to assess whether *A* causes *B*,
we would need to know already whether *B* causes
*A*.

A more ambitious approach to the problem of causal asymmetry is due
to Hans Reichenbach. Suppose that factors *A* and *B*
are positively correlated:

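1. P(*A* & *B*) > P(*A*)P(*B*)

It is easy to see that this will hold exactly when *A* raises the probability of *B* and vice versa. Suppose, moreover, that there is some factor *C* with the following properties:

2. P(*A* & *B* | *C*) = P(*A* | *C*)P(*B* | *C*)
3. P(*A* & *B* | not-*C*) = P(*A* | not-*C*)P(*B* | not-*C*)
4. P(*A* | *C*) > P(*A* | not-*C*)
5. P(*B* | *C*) > P(*B* | not-*C*).

In this case, the trio *ACB* is said to form a conjunctive fork.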
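Conditions 2 through 5 are known to entail condition 1, and this can be checked on a concrete case. The sketch below builds a joint distribution over (*A*, *B*, *C*) in which *A* and *B* are independent given *C* and given not-*C*, and then tests the five conditions. The parameter values and all function names are hypothetical, and the code assumes 0 < P(*C*) < 1 so that the conditional probabilities are defined.

```python
from fractions import Fraction as F
from itertools import product


def make_joint(p_c, p_a, p_b):
    """Joint over (A, B, C) with A and B independent given C and given not-C.

    p_a and p_b map the truth value of C to P(A | C=c) and P(B | C=c).
    """
    joint = {}
    for a, b, c in product([True, False], repeat=3):
        pc = p_c if c else 1 - p_c
        pa = p_a[c] if a else 1 - p_a[c]
        pb = p_b[c] if b else 1 - p_b[c]
        joint[(a, b, c)] = pc * pa * pb
    return joint


def prob(joint, event):
    return sum(p for world, p in joint.items() if event(*world))


def cond(joint, event, c_value):
    """P(event | C = c_value)."""
    both = prob(joint, lambda a, b, c: event(a, b, c) and c == c_value)
    return both / prob(joint, lambda a, b, c: c == c_value)


def ev_a(a, b, c):
    return a


def ev_b(a, b, c):
    return b


def ev_ab(a, b, c):
    return a and b


def satisfies_2_to_5(joint):
    """Conditions 2-5: screening off by C and by not-C, plus both inequalities."""
    return (
        cond(joint, ev_ab, True) == cond(joint, ev_a, True) * cond(joint, ev_b, True)
        and cond(joint, ev_ab, False) == cond(joint, ev_a, False) * cond(joint, ev_b, False)
        and cond(joint, ev_a, True) > cond(joint, ev_a, False)
        and cond(joint, ev_b, True) > cond(joint, ev_b, False)
    )


def positively_correlated(joint):
    """Condition 1: P(A & B) > P(A) P(B)."""
    return prob(joint, ev_ab) > prob(joint, ev_a) * prob(joint, ev_b)


joint = make_joint(
    F(1, 2),
    {True: F(4, 5), False: F(1, 5)},
    {True: F(3, 5), False: F(1, 10)},
)
print(satisfies_2_to_5(joint), positively_correlated(joint))  # True True
```

Here P(*A* & *B*) = 1/4 while P(*A*)P(*B*) = 7/40, so the correlation between *A* and *B* is fully accounted for by their common dependence on *C*.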

It is not clear, however, that this asymmetry between forks open to
the past and forks open to the future will be as pervasive as this
proposal seems to presuppose. In quantum mechanics, there are
correlated effects that are believed to have no common cause that
screens them off. Moreover, if *ACB* forms a conjunctive fork
in which *C* precedes *A* and *B*, but
*C* has a deterministic effect *D* which occurs after *A*
and *B*, then *ACBD* will form a closed fork. A further
difficulty with this proposal is that since it provides a global
ordering of causes and effects, it seems to rule out *a
priori* the possibility that some effects might precede their
causes. More complex attempts to derive the direction of causation
from probabilities have been offered; the issues here intersect with
the problem of reduction, discussed in
Section 6.4 below.

**Suggested Readings:** Suppes (1970, chapter 2) and
Eells (1991, chapter 5) define causal asymmetry in terms of temporal
asymmetry. Price (1991) defends an account of causal asymmetry in
terms of agent probabilities; see also the entry for “causation
and manipulation.” Reichenbach's proposal is presented in his
(1956, chapter IV). Some difficulties with this proposal are
discussed in Arntzenius (1993); see also his entry to this
encyclopedia under “physics: Reichenbach's common cause
principle.” Papineau (1993) is a good overall discussion of the
problem of causal asymmetry within probabilistic theories. Hausman
(1998) is a detailed study of the problem of causal asymmetry.

Causal dependence, as defined in the previous paragraph, is
sufficient, but not necessary, for causation. Causation is defined to
be the *ancestral* of causal dependence; that is, *A*
causes *B* just in case there is a sequence of events
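*C*₁, *C*₂, …, *C*ₙ such that *C*₁ causally depends upon *A*, each later member of the sequence causally depends upon its predecessor, and *B* causally depends upon *C*ₙ.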

Proponents of counterfactual theories of causation attempt to derive the asymmetry of causation from a corresponding asymmetry in the truth values of counterfactuals. For instance, it may be true that if Mary had not smoked, she would have been less likely to develop lung cancer, but we would not normally agree that if Mary had not developed lung cancer, she would have been less likely to smoke. Ordinary counterfactuals do not ‘backtrack’ from effects to causes. This proscription against backtracking also solves the problem of spurious correlations: we would not say that if the column of mercury had not risen, then the drop in atmospheric pressure would have been less likely, and so the storm would have been less likely as well.

One important question is whether the counterfactuals that appear in the analysis of causation can be characterized without reference to causation. In order to do this, one would have to say what makes some worlds closer than others without making reference to any causal notions. Despite some interesting attempts, it is not clear whether this can be done. If not, then it will not be possible to provide a reductive PC analysis of causation, although it may still be possible to articulate interesting interconnections between causation, probability and counterfactuals.

The philosopher Igal Kvart has been a persistent critic of the claim that it is possible to analyze counterfactuals without using causation. He has developed a probabilistic theory of singular causation that does not use counterfactuals. Nonetheless, his theory has a number of features in common with counterfactual theories: it is an attempt to analyze singular causation among events; it elaborates on the basic probability-raising idea in an attempt to avoid some of the problems raised in Section 6.2 below; and it aspires to be a reductive analysis of causation, making no reference to causal relations in the analysans.

**Suggested Readings:** Lewis (1986a) is the *locus
classicus* for *PC*. Lewis (1986b) is an attempt to
explicate the notion of proximity among possible worlds. Recent
attempts to analyze causation in terms of probabilistic
counterfactuals have become quite intricate; see for example Noordhof
(1999). For further discussion of counterfactual theories of
causation, see the entry under “causation, counterfactual
theories.” For Kvart's theory, see for example Kvart (1997).

Our concern here will not be with the efficacy of these methods of causal inference, but rather with their philosophical underpinnings. We will here follow the developments of SGS, as these bear a stronger resemblance to the probabilistic theories of causation described in Section 3 above. (Pearl's approach, at least in its more recent development, bears a stronger connection to counterfactual approaches.)

**Suggested Readings**: Pearl (2000) and Spirtes,
Glymour and Scheines (2000) are the most detailed presentations of
the two research programs discussed. Both works are quite technical,
although the epilogue of Pearl (2000) provides a very readable
historical introduction to Pearl's work. Pearl (1999) also contains a
reasonably accessible introduction to some of Pearl's more recent
developments. Scheines (1997) is a non-technical introduction to some
of the ideas in SGS (2000). McKim and Turner (1997) is a collection
of papers on causal modeling, including some important critiques of
SGS.

The directed acyclic graph **G** over **V**
may be related to the probability distribution in a number of
ways. One important condition that the two might satisfy is the
so-called *Markov Condition*:

**MC:** For every *X* in **V**, and every set **Y** of variables in **V** \ **DE**(*X*), P(*X* | **PA**(*X*) & **Y**) = P(*X* | **PA**(*X*)); where **DE**(*X*) is the set of descendants of *X*, and **PA**(*X*) is the set of parents of *X*.

The notation needs a little clarification. Consider, for example, the first term in the equality.
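A small worked case may help fix the idea. The sketch below assumes a three-variable chain, *X* → *Z* → *Y*, with binary variables and invented parameters. For the variable *Y*, the only parent is *Z* and *X* is a non-descendant, so the Markov Condition requires that P(*Y* | *Z* & *X*) = P(*Y* | *Z*) for every combination of values. The graph, the numbers and the helper `p_y` are all assumptions of this example.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical parameters for the chain X -> Z -> Y (all variables binary).
P_X = F(1, 2)
P_Z_GIVEN_X = {True: F(4, 5), False: F(1, 5)}
P_Y_GIVEN_Z = {True: F(3, 5), False: F(1, 10)}

JOINT = {}
for x, z, y in product([True, False], repeat=3):
    px = P_X if x else 1 - P_X
    pz = P_Z_GIVEN_X[x] if z else 1 - P_Z_GIVEN_X[x]
    py = P_Y_GIVEN_Z[z] if y else 1 - P_Y_GIVEN_Z[z]
    JOINT[(x, z, y)] = px * pz * py


def p_y(**fixed):
    """P(Y = True | the variables named in `fixed` take the given values)."""
    def matches(x, z):
        world = {"x": x, "z": z}
        return all(world[name] == value for name, value in fixed.items())

    num = sum(p for (x, z, y), p in JOINT.items() if y and matches(x, z))
    den = sum(p for (x, z, y), p in JOINT.items() if matches(x, z))
    return num / den


# Markov Condition for Y: adding X to the parent Z changes nothing.
markov_holds_for_y = all(
    p_y(z=z, x=x) == p_y(z=z) for x, z in product([True, False], repeat=2)
)

print(markov_holds_for_y)            # True
print(p_y(x=True), p_y(x=False))     # 1/2 1/5
```

Unconditionally *X* still makes a difference to *Y* (1/2 against 1/5); the condition says only that this difference disappears once the parent *Z* is held fixed.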

As stated, the Markov Condition describes a purely formal relation
between abstract entities. Suppose, however, that we give the graph
and probability distribution empirical interpretations. The graph
will represent the causal relationships among the variables in a
population, and the probability distribution will represent the
empirical probability that an individual in the population will
possess certain values of the relevant variables. When the directed
graph is given a causal interpretation, it is called a *causal
graph*. We will return shortly to the question of what, exactly,
the arrows in a causal graph represent.

The *Causal Markov Condition* (*CMC*) asserts that
*MC* holds of a population when the directed graph and
probability distribution are given these
interpretations. *CMC* does not hold in general, but only when
certain further conditions are satisfied. For instance,
**V** must include all common causes of variables that
are included in **V**. Suppose, for example, that
**V** = {*X, Y*}, that neither variable is a
cause of the other, and that *Z* is a common cause of
*X* and *Y* (the true causal structure is shown in
Figure 3 below). The correct causal graph on **V** will
include no arrows, since neither *X* nor *Y* causes the
other. But *X* and *Y* will be probabilistically
correlated, because of the underlying common cause. This is a
violation of *CMC*. Since the correct causal graph on {*X,
Y*} has no arrows, *X* has no parents or descendants; thus
*CMC* entails that P(*X* | *Y*) =
P(*X*). This equality is false, since *X* and
*Y* are in fact correlated. *CMC* can also fail for
certain types of heterogeneous populations composed of subpopulations
with different causal structures. And *CMC* will fail for
certain quantum systems. One area of controversy concerns the extent
to which actual populations satisfy *CMC* with respect to the
sorts of variable sets that are typically employed in empirical
investigations. For purposes of further discussion, we will assume
that *CMC* holds.

Figure 3

The Causal Markov Condition is a generalization of Reichenbach's Common Cause Principle, discussed in Section 3.3 above. Here are a few illustrations of how it works.

Figure 4

In Figures 3 and 4, *CMC* entails that the values of
*Z* screen off the values of *X* from the values of
*Y*.

Figure 5

Figure 6

In Figures 5 and 6, *CMC* again entails that the values of
*Z* screen off the values of *X* from the values of
*Y*. However, *CMC* does not entail that the values of
*W* screen off the values of *X* from the values of
*Y* in Figure 5, whereas it does entail that the values of
*W* screen off the values of *X* from the values of
*Y* in Figure 6. This shows that being a common cause of
*X* and *Y* is neither necessary nor sufficient for
screening off the values of those variables.

Figure 7

In Figure 7, both *Z* and *W* are common causes of
*X* and *Y*, yet *CMC* does not entail that
either one of them, by itself, suffices to screen off the values of
*X* and *Y*. This seems reasonable: if we hold fixed
the value of *Z*, we should expect *X* and *Y*
to remain correlated due to the action of *W*. *CMC*
*does* entail that *Z* and *W* jointly screen off
*X* and *Y*; that is, when we condition on the values
of *Z* and *W*, there will be no residual correlation
between *X* and *Y*.
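The point can be illustrated numerically. The sketch below assumes two independent binary common causes, *Z* and *W*, each of which raises the probability of both *X* and *Y*, with *X* and *Y* independent once both causes are fixed. The probability table and the helper names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

BOOLS = [True, False]

# Hypothetical table used for both effects: P(X | Z, W) and P(Y | Z, W).
P_EFFECT = {
    (True, True): F(9, 10),
    (True, False): F(1, 2),
    (False, True): F(1, 2),
    (False, False): F(1, 10),
}

# Z and W are independent fair coins; X and Y are independent given (Z, W).
JOINT = {}
for z, w, x, y in product(BOOLS, repeat=4):
    px = P_EFFECT[(z, w)] if x else 1 - P_EFFECT[(z, w)]
    py = P_EFFECT[(z, w)] if y else 1 - P_EFFECT[(z, w)]
    JOINT[(z, w, x, y)] = F(1, 4) * px * py


def prob(event):
    return sum(p for world, p in JOINT.items() if event(*world))


def screened_off(given):
    """True if X and Y are independent conditional on the event `given`."""
    p_g = prob(given)
    p_x = prob(lambda z, w, x, y: x and given(z, w, x, y)) / p_g
    p_y = prob(lambda z, w, x, y: y and given(z, w, x, y)) / p_g
    p_xy = prob(lambda z, w, x, y: x and y and given(z, w, x, y)) / p_g
    return p_xy == p_x * p_y


z_alone_screens = screened_off(lambda z, w, x, y: z)
jointly_screen = all(
    screened_off(lambda z, w, x, y, zv=zv, wv=wv: z == zv and w == wv)
    for zv, wv in product(BOOLS, repeat=2)
)

print(z_alone_screens, jointly_screen)  # False True
```

Holding *Z* fixed leaves a residual correlation between *X* and *Y* (53/100 against 49/100 for the product of the marginals), which vanishes only when *W* is held fixed as well.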

A second important relation between a directed graph and probability
distribution is the *Minimality Condition*. Suppose that the
directed graph **G** on variable set **V**
satisfies the Markov condition with respect to the probability
distribution *P*. The Minimality Condition asserts that no
sub-graph of **G** over **V** also
satisfies the Markov Condition with respect to *P*. The
*Causal Minimality Condition* asserts that the Minimality
Condition holds when **G** and *P* are given their
empirical interpretations. As an illustration, consider the variable
set {*X*, *Y*}, let there be an arrow from *X*
to *Y*, and suppose that *X* and *Y* are
probabilistically independent of each other in *P*. This graph
would satisfy the Markov Condition with respect to *P*: none
of the independence relations mandated by *MC* are absent (in
fact, *MC* mandates no independence relations). But this graph
would violate the Minimality Condition with respect to *P*,
since the subgraph that omits the arrow from *X* to *Y*
would also satisfy the Markov Condition.

**Suggested Readings**: Spirtes, Glymour and Scheines
(2000) and Scheines (1997). Hausman and Woodward (1999) provide a
detailed discussion of the Causal Markov Condition.

P(*Y* = *y* | *X* = *x*) ≠ P(*Y* = *y* | *X* = *x*′).

This says nothing about how *Y* depends upon *X*.

Figure 8

Consider Figure 8. Note that it differs from Figure 4 in that there
is an additional arrow running directly from *X* to
*Y*. What does this arrow from *X* to *Y*
indicate? It does not merely indicate that *X* is causally
relevant to *Y*; in Figure 4, it is natural to expect that
*X* will be relevant to *Y* via its effect on
*Z*. Applying the Causal Markov and Minimality Conditions, the
arrow from *X* to *Y* indicates that *Y* is
probabilistically dependent on *X*, even when we hold fixed
the value of *Z*. That is, *X* makes a probabilistic
difference for *Y*, over and above the difference it makes in
virtue of its effect on *Z*. Figure 8 thus indicates that
*X* has an effect on *Y* via two different routes: one
route that runs through the variable *Z* and the other route
which is *direct*, i.e., unmediated by any other variable in
**V**. As an illustration, consider a well-known example
due to Germund Hesslow. Consumption of birth control pills
(*X*) is a risk factor for thrombosis (*Y*). On the
other hand, birth control pills are an effective preventer of
pregnancy (*Z*), which is in turn a powerful risk factor for
thrombosis. The use of birth control pills may thus affect one's
chances of suffering from thrombosis in two different ways, one
‘direct’, and one via the effect of pills on one's chances of
becoming pregnant. Whether birth control pills raise or lower the
probability of thrombosis overall will depend upon the relative
strengths of these two routes. The probabilistic theories of
causation described in
Section 3
above are suited to analyze the total or net effect of one factor or
variable on another, whereas the causal modeling techniques discussed
in this section are primarily geared toward decomposing a causal
system into individual routes of causal influence.
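How the two routes combine into a net effect can be shown with a short calculation. The sketch below uses invented probabilities for Hesslow's example, with *X* for pill use, *Z* for pregnancy and *Y* for thrombosis. In both scenarios the pills raise the probability of thrombosis at each fixed value of *Z*; the scenarios differ only in how strong a risk factor pregnancy is. All numbers and function names are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical P(Z | X): pills make pregnancy much less likely.
P_PREGNANT = {True: F(1, 100), False: F(1, 5)}

# Hypothetical P(Y | X=x, Z=z), keyed by (x, z).
STRONG_PREGNANCY_RISK = {
    (True, True): F(12, 100), (True, False): F(2, 100),
    (False, True): F(11, 100), (False, False): F(1, 100),
}
WEAK_PREGNANCY_RISK = {
    (True, True): F(3, 100), (True, False): F(2, 100),
    (False, True): F(2, 100), (False, False): F(1, 100),
}


def p_thrombosis(pills, p_pregnant, p_y):
    """P(Y | X=pills), summing over pregnancy Z."""
    pz = p_pregnant[pills]
    return pz * p_y[(pills, True)] + (1 - pz) * p_y[(pills, False)]


def net_effect(p_pregnant, p_y):
    """P(Y | X) - P(Y | not-X): the total effect of pills on thrombosis."""
    return p_thrombosis(True, p_pregnant, p_y) - p_thrombosis(False, p_pregnant, p_y)


print(net_effect(P_PREGNANT, STRONG_PREGNANCY_RISK))  # -9/1000
print(net_effect(P_PREGNANT, WEAK_PREGNANCY_RISK))    # 81/10000
```

When pregnancy is a strong risk factor the indirect, preventive route dominates and pills lower the overall probability of thrombosis; when it is a weak one, the direct route dominates and pills raise it.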

**Suggested Readings**: The birth control pill example
was originally presented in Hesslow (1976). Hitchcock (2001a)
discusses the distinction between total or net effect, and causal
influence along individual routes.

Figure 9

The Faithfulness Condition implies that the causal influences of one
variable on another along multiple causal routes do not
‘cancel’. For example, suppose that Figure 8 correctly
represents the underlying causal structure. Then the Faithfulness
Condition implies that *X* and *Y* cannot be
unconditionally independent of one another in the empirical
distribution. In Hesslow's example, this means that the tendency of
birth control pills to cause thrombosis along the direct route cannot
be exactly canceled by the tendency of birth control pills to prevent
thrombosis by preventing pregnancy. This ‘no canceling’
condition seems implausible as a metaphysical or conceptual
constraint upon the connection between causation and
probabilities. Why can't competing causal paths cancel one another
out? Indeed, Newtonian physics provides us with an example: the
downward force on my body due to gravity triggers an equal and
opposite upward force on my body from the floor. My body responds as
if neither force were acting upon it. The Faithfulness Condition
seems rather to be a *methodological* principle. Given a
distribution on {*X*, *Y*, *Z*} in which
*X* and *Y* are independent, we should infer that the
causal structure is that depicted in Figure 9, rather than Figure
8. This is not because Figure 8 is conclusively ruled out by the
distribution, but rather because it is gratuitously complex: it
postulates causal connections that are not necessary to explain the
underlying pattern of probabilistic dependencies. The Faithfulness
Condition is thus a formal version of Ockham's razor.
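Exact cancellation is possible but requires finely tuned parameters, as a small calculation shows. The sketch below assumes the structure of Figure 8 with binary variables, invented probabilities, and a direct effect of *X* that simply adds a fixed amount to the probability of *Y* at each value of *Z*. The numbers, the additive form of the direct effect, and the function names are all assumptions made for illustration.

```python
from fractions import Fraction as F

P_Z = {True: F(1, 10), False: F(1, 2)}    # P(Z | X), P(Z | not-X)
BASE = {True: F(3, 10), False: F(1, 10)}  # P(Y | not-X, Z), P(Y | not-X, not-Z)


def p_y_given_x(x, boost):
    """P(Y | X=x) when X adds `boost` to P(Y) at each value of Z."""
    extra = boost if x else 0
    return P_Z[x] * (BASE[True] + extra) + (1 - P_Z[x]) * (BASE[False] + extra)


def canceling_boost():
    """Direct effect at which the two routes from X to Y cancel exactly."""
    return p_y_given_x(False, 0) - p_y_given_x(True, 0)


d = canceling_boost()
print(d)                                          # 2/25
print(p_y_given_x(True, d), p_y_given_x(False, d))  # 1/5 1/5
```

With a direct effect of exactly 2/25, *X* and *Y* are unconditionally independent even though *X* raises the probability of *Y* at each fixed value of *Z*. Any other value of the direct effect restores the dependence, which is why a distribution of this kind looks like a coincidence that the simpler graph explains more economically.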

SGS use the Causal Markov, Minimality, and Faithfulness Conditions
to prove a variety of *statistical indistinguishability*
theorems. These theorems tell us when two distinct causal structures
can or cannot be distinguished on the basis of the probability
distributions to which they give rise. We will return to this issue
in
Section 6.4 below.

**Suggested Readings**: Spirtes, Glymour and Scheines
(2000) and Scheines (1997).

This line of objection is surely right about our ordinary use of causal language. It is nonetheless open to the defender of context-unanimity to respond that she is interested in supplying a precise concept to replace the vague notion of causation that corresponds to our everyday usage. In a population consisting of individuals lacking the gene, smoking causes lung cancer. In a population consisting entirely of individuals who possess the gene, smoking prevents lung cancer.

Note that this dispute only arises in the context of a heterogeneous population. Restricting ourselves to one particular test situation, both parties can agree that smoking causes lung cancer in that test population just in case it increases the probability of lung cancer in that test situation.

One's position in this debate will depend, in part, on how one wants to use general causal claims such as “smoking causes lung cancer”. If one conceives of them as causal laws, then the context-unanimity requirement may seem attractive. If “smoking causes lung cancer” is a kind of law, then its truth should not be contingent upon the scarcity of the gene that reverses the effects of smoking. By contrast, one may understand the causal claim in a more practical way, by treating it as a kind of policy-guiding principle. Since the gene in question is very rare, it would still be rational for public health organizations to promote policies that would reduce the incidence of smoking.
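
The contrast between the two readings can be made concrete with a small sketch. The gene frequency and the risk figures below are hypothetical, as is the assumption that possession of the gene is independent of smoking; the names `GENE_FREQUENCY`, `P_CANCER`, `raises_in` and `population_risk` are introduced only for illustration.

```python
from fractions import Fraction as F

GENE_FREQUENCY = F(1, 1000)   # hypothetical: the gene is very rare

# Hypothetical P(lung cancer | smokes?) in each test situation.
P_CANCER = {
    "no gene": {True: F(1, 5), False: F(1, 50)},    # smoking raises the risk
    "gene":    {True: F(1, 100), False: F(3, 10)},  # smoking lowers the risk
}

def raises_in(context: str) -> bool:
    """Does smoking raise the probability of cancer in this test situation?"""
    return P_CANCER[context][True] > P_CANCER[context][False]

def population_risk(smokes: bool) -> F:
    """Average over gene status, assuming it is independent of smoking."""
    return ((1 - GENE_FREQUENCY) * P_CANCER["no gene"][smokes]
            + GENE_FREQUENCY * P_CANCER["gene"][smokes])

context_unanimous = all(raises_in(c) for c in P_CANCER)
raises_overall = population_risk(True) > population_risk(False)
print(context_unanimous, raises_overall)   # False True
```

On these figures the context-unanimity requirement is violated, so the law-like reading withholds the claim that smoking causes lung cancer in the mixed population. Smoking nonetheless raises the overall incidence of lung cancer, which is what matters on the policy-guiding reading.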

**Suggested Readings:** Dupré (1984) presents
this challenge to the context-unanimity requirement, and offers an
alternative. Eells (1991, chapters 1 and 2) defends
context-unanimity using the idea that causal claims are made relative
to a population. Hitchcock (2001b) contains further discussion and
develops the idea of treating general causal claims as policy-guiding
principles.

A different sort of counterexample involves *causal
preemption*. Suppose that an assassin puts a weak poison in the
king's drink, resulting in a 30% chance of death. The king drinks the
poison and dies. If the assassin had not poisoned the drink, her
associate would have spiked the drink with an even deadlier elixir
(70% chance of death). In the example, the assassin caused the king
to die by poisoning his drink, even though she lowered his chance of
death (from 70% to 30%). Here the cause lowered the probability of
death, because it preempted an even stronger cause.

One approach to this problem, built into the counterfactual approach described in Section 4 above, is to invoke the principle of the transitivity of causation. The assassin's action increased the probability of, and hence caused, the presence of weak poison in the king's drink. The presence of weak poison in the king's drink raised the probability of, and hence caused, the king's death. (By this time, it is already determined that the associate will not poison the drink.) By transitivity, the assassin's action caused the king's death. The claim that causation is transitive is highly controversial, however, and is subject to many persuasive counterexamples.

Another approach would be to invoke a distinction introduced in
Section 5.3 above. The assassin's action affects the king's chances of death in
two distinct ways: first, it introduces the weak poison into the
king's drink; second, it prevents the introduction of a stronger
poison. The net effect is to reduce the king's chance of
death. Nonetheless, we can isolate the first of these effects (which
would be indicated by an arrow in a causal graph). We do this by
holding fixed the inaction of the associate: given that the associate
did not in fact poison the drink, the assassin's action *increased
the king's chance of death* (from near zero to .3). We count the
assassin's action as a cause of death because it increased the chance
of death along one of the routes connecting the two events.
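
The decomposition can be sketched numerically as follows. The chances of .3 and .7 come from the example; the baseline of .01 is a hypothetical stand-in for "near zero", and the function names are introduced only for illustration.

```python
# Chance of the king's death for each combination of
# (assassin poisons, associate poisons). The .3 and .7 figures come from the
# example; the baseline of .01 is a hypothetical stand-in for "near zero".
# The case in which both poison the drink is not part of the example.
CHANCE_OF_DEATH = {
    (True, False): 0.3,
    (False, True): 0.7,
    (False, False): 0.01,
}

def associate_poisons(assassin_poisons: bool) -> bool:
    """The associate acts only if the assassin does not."""
    return not assassin_poisons

def net_chance(assassin_poisons: bool) -> float:
    """Let the associate respond to the assassin's action."""
    return CHANCE_OF_DEATH[(assassin_poisons, associate_poisons(assassin_poisons))]

def chance_along_direct_route(assassin_poisons: bool) -> float:
    """Hold fixed the associate's actual inaction."""
    return CHANCE_OF_DEATH[(assassin_poisons, False)]

# Chance of death (without, with) the assassin's action.
net_change = (net_chance(False), net_chance(True))              # (0.7, 0.3)
route_change = (chance_along_direct_route(False),
                chance_along_direct_route(True))                # (0.01, 0.3)
```

The first comparison shows the net lowering of the king's chance of death; the second, which holds the associate's inaction fixed, shows the increase along the direct route.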

For a counterexample of the second type, suppose that two gunmen shoot at a target. Each has a certain probability of hitting, and a certain probability of missing. Assume that none of the probabilities are one or zero. As a matter of fact, the first gunman hits, and the second gunman misses. Nonetheless, the second gunman did fire, and by firing, increased the probability that the target would be hit, which it was. While it is obviously wrong to say that the second gunman's shot caused the target to be hit, it would seem that a probabilistic theory of causation is committed to this consequence. A natural approach to this problem would be to try to combine the probabilistic theory of causation with a requirement of spatiotemporal connection between cause and effect, although it is not at all clear how this hybrid theory would work.
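
A short calculation shows why the second gunman's shot counts as probability-raising. The hit probabilities below are hypothetical, and the sketch assumes that the two shots are probabilistically independent, which the example does not explicitly state.

```python
from fractions import Fraction as F

def p_hit(hit_probabilities) -> F:
    """P(target is hit by at least one shot), assuming independent shots."""
    p_all_miss = F(1)
    for p in hit_probabilities:
        p_all_miss *= 1 - p
    return 1 - p_all_miss

P_FIRST, P_SECOND = F(1, 2), F(1, 4)   # hypothetical, strictly between 0 and 1

only_first_fires = p_hit([P_FIRST])        # 1/2
both_fire = p_hit([P_FIRST, P_SECOND])     # 5/8
```

So long as neither probability is one or zero, the second shot raises the probability that the target is hit, whether or not that shot in fact connects.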

**Suggested Readings:** The example of the golf ball,
due to Deborah Rosen, is first presented in Suppes (1970). Salmon
(1980) presents several examples of probability-lowering causes.
Hitchcock (1995) presents a response. Lewis (1986a) discusses cases
of preemption; see also the entry for “causation: counterfactual
theories.” Hitchcock (2001a) presents the solution in terms of
decomposition into component causal routes. Woodward (1990)
describes the structure that is instantiated in the example of the
two gunmen. Humphreys (1989, section 14) responds. Menzies (1989,
1996) discusses examples involving causal pre-emption where
non-causes raise the probabilities of non-effects. Hitchcock (2002)
provides a general discussion of these counterexamples. For a
discussion of attempts to analyze cause and effect in terms of
contiguous processes, see the entry for “causation: causal
processes.”

**Suggested Readings:** The need for distinct theories
of singular and general causation is defended in Good (1961, 1962),
Sober (1985), and Eells (1991, introduction and chapter 6). Eells
(1991, chapter 6) offers a distinct probabilistic theory of singular
causation in terms of the temporal evolution of probabilities.
Carroll (1991) and Hitchcock (1995) offer two quite different lines
of response. Hitchcock (2001b) argues that there are really (at
least) two different distinctions at work here.

The most important work along these lines has been carried out by
Spirtes, Glymour and Scheines. Rather than report on the details of
their results, we present here a more generalized discussion.
Suppose that a set of factors and a system of causal relations among
those factors are given: call this the *causal structure CS*.
Let *T* be a theory connecting causal relations among factors
with probabilistic relations among factors. Then the causal
structure *CS* will be *probabilistically
distinguishable* relative to *T* if, for every assignment
of probabilities to the factors in *CS* that is compatible
with *CS* and *T*, *CS* is the unique causal structure
compatible with *T* and those probabilities. (One could
formulate a weaker sense of distinguishability by requiring only that
some assignment of probabilities uniquely determine *CS*.)
Intuitively, *T* allows you to infer that the causal structure
is in fact *CS* given the probability relations between
factors. Given a probabilistic theory of causation *T*, it is
possible to imagine many different properties it might have. Here
are some possibilities:

1. All causal structures are probabilistically distinguishable relative to *T*.
2. All causal structures having some interesting property are probabilistically distinguishable relative to *T*.
3. Any causal structure can be embedded in a causal structure that is probabilistically distinguishable relative to *T*.
4. The actual causal structure of the world (assuming there is such a thing) is probabilistically distinguishable relative to *T*.
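
A simple illustration of a failure of distinguishability may help. Suppose, purely for the sake of the example, that *T* requires only that the probabilities factorize along the arrows of the causal structure, in the manner of the Causal Markov Condition. The sketch below takes a chain *X* → *Y* → *Z* with hypothetical parameters and shows that a fork *X* ← *Y* → *Z* can reproduce exactly the same probabilities. The factor names and all numerical values are assumptions made for illustration; this is not a reconstruction of any result of Spirtes, Glymour and Scheines.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical parameters for the chain X -> Y -> Z over binary factors.
P_X = {0: F(2, 3), 1: F(1, 3)}
P_Y_GIVEN_X = {0: {0: F(3, 4), 1: F(1, 4)}, 1: {0: F(1, 5), 1: F(4, 5)}}
P_Z_GIVEN_Y = {0: {0: F(1, 2), 1: F(1, 2)}, 1: {0: F(1, 10), 1: F(9, 10)}}

# Joint distribution generated by the chain: P(x) P(y|x) P(z|y).
chain = {
    (x, y, z): P_X[x] * P_Y_GIVEN_X[x][y] * P_Z_GIVEN_Y[y][z]
    for x, y, z in product((0, 1), repeat=3)
}

# Read off the parameters the fork X <- Y -> Z would need: P(y) and P(x|y).
p_y = {y: sum(p for (_, y2, _), p in chain.items() if y2 == y) for y in (0, 1)}
p_xy = {(x, y): sum(p for (x2, y2, _), p in chain.items() if (x2, y2) == (x, y))
        for x, y in product((0, 1), repeat=2)}
p_x_given_y = {y: {x: p_xy[(x, y)] / p_y[y] for x in (0, 1)} for y in (0, 1)}

# Joint distribution generated by the fork: P(y) P(x|y) P(z|y).
fork = {
    (x, y, z): p_y[y] * p_x_given_y[y][x] * P_Z_GIVEN_Y[y][z]
    for x, y, z in product((0, 1), repeat=3)
}

same_distribution = chain == fork   # True: the probabilities alone cannot decide
```

Relative to a theory of this minimal kind, then, the chain is not probabilistically distinguishable: the same assignment of probabilities is compatible with a different causal structure.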

**Suggested Readings:** The most detailed treatment of
probabilistic distinguishability is given in Spirtes, Glymour and
Scheines (2000); see especially chapter 4. Spirtes, Glymour and
Scheines prove (theorem 4.6) a result along the lines of 3 for a
theory that they propose. This work is very technical. An
accessible presentation is contained in Papineau (1993), which
defends a position along the lines of 4.

- Arntzenius, Frank. (1993) “The Common Cause Principle,” in Hull, Forbes, and Okruhlik (1993), pp. 227-237.
- Bennett, Jonathan. (1988) *Events and Their Names*. Indianapolis and Cambridge: Hackett.
- Carroll, John. (1991) “Property-level Causation?” *Philosophical Studies* **63**: 245-70.
- Cartwright, Nancy. (1979) “Causal Laws and Effective Strategies,” *Noûs* **13**: 419-437.
- Dupré, John. (1984) “Probabilistic Causality Emancipated,” in Peter French, Theodore Uehling, Jr., and Howard Wettstein, eds., (1984) *Midwest Studies in Philosophy IX* (Minneapolis: University of Minnesota Press), pp. 169-175.
- Earman, John. (1986) *A Primer on Determinism*. Dordrecht: Reidel.
- Eells, Ellery. (1991) *Probabilistic Causality*. Cambridge, U.K.: Cambridge University Press.
- Good, I. J. (1961) “A Causal Calculus I,” *British Journal for the Philosophy of Science* **11**: 305-18.
- -----. (1962) “A Causal Calculus II,” *British Journal for the Philosophy of Science* **12**: 43-51.
- Hausman, Daniel. (1998) *Causal Asymmetries*. Cambridge: Cambridge University Press.
- Hausman, Daniel, and Woodward, James. (1999) “Independence, Invariance, and the Causal Markov Condition,” *British Journal for the Philosophy of Science* **50**: 1-63.
- Hesslow, Germund. (1976) “Discussion: Two Notes on the Probabilistic Approach to Causality,” *Philosophy of Science* **43**: 290-292.
- Hitchcock, Christopher. (1993) “A Generalized Probabilistic Theory of Causal Relevance,” *Synthese* **97**: 335-364.
- -----. (1995) “The Mishap at Reichenbach Fall: Singular vs. General Causation,” *Philosophical Studies* **78**: 257-291.
- -----. (2001a) “A Tale of Two Effects,” *Philosophical Review* **110**: 361-396.
- -----. (2001b) “Causal Generalizations and Good Advice,” *Monist* **84**: 218-241.
- -----. (2002) “Do All and Only Causes Raise the Probabilities of Effects?” in John Collins, Ned Hall, and L.A. Paul (eds.), *Causation and Counterfactuals* (Cambridge MA: MIT Press, 2002).
- Hull, David, Mickey Forbes, and Kathleen Okruhlik, eds. (1993) *PSA 1992, Volume Two*. East Lansing: Philosophy of Science Association.
- Hume, David. (1748) *An Enquiry Concerning Human Understanding*.
- Humphreys, Paul. (1989) *The Chances of Explanation: Causal Explanations in the Social, Medical, and Physical Sciences*. Princeton: Princeton University Press.
- Kvart, Igal. (1997) “Cause and Some Positive Causal Impact,” *Philosophical Perspectives* **11**: 401-432.
- Lewis, David. (1986a) “Causation” and “Postscripts to ‘Causation’,” in Lewis (1986c), pp. 172-213.
- -----. (1986b) “Counterfactual Dependence and Time's Arrow” and “Postscripts to ‘Counterfactual Dependence and Time's Arrow’,” in Lewis (1986c), pp. 32-66.
- -----. (1986c) *Philosophical Papers, Volume II*. Oxford: Oxford University Press.
- Mackie, John. (1974) *The Cement of the Universe*. Oxford: Clarendon Press.
- McKim, Vaughn, and Stephen Turner, eds. (1997) *Causality in Crisis?* Notre Dame: University of Notre Dame Press.
- Menzies, Peter. (1989) “Probabilistic Causation and Causal Processes: A Critique of Lewis,” *Philosophy of Science* **56**: 642-63.
- Menzies, Peter. (1996) “Probabilistic Causation and the Pre-emption Problem,” *Mind* **105**: 85-117.
- Mill, John Stuart. (1843) *A System of Logic, Ratiocinative and Inductive*. London: Parker and Son.
- Noordhof, Paul. (1999) “Probabilistic Causation, Preemption and Counterfactuals,” *Mind* **108**: 95-125.
- Papineau, David. (1993) “Can We Reduce Causal Direction to Probabilities?” in Hull, Forbes and Okruhlik (1993), pp. 238-252.
- Pearl, Judea. (1999) “Reasoning with Cause and Effect,” in *Proceedings of the International Joint Conference on Artificial Intelligence* (San Francisco: Morgan Kaufmann), pp. 1437-1449.
- -----. (2000) *Causality: Models, Reasoning, and Inference*. Cambridge: Cambridge University Press.
- Price, Huw. (1991) “Agency and Probabilistic Causality,” *British Journal for the Philosophy of Science* **42**: 157-76.
- Reichenbach, Hans. (1956) *The Direction of Time*. Berkeley and Los Angeles: University of California Press.
- Salmon, Wesley. (1980) “Probabilistic Causality,” *Pacific Philosophical Quarterly* **61**: 50-74.
- Scheines, Richard. (1997) “An Introduction to Causal Inference,” in McKim and Turner (1997), pp. 185-199.
- Skyrms, Brian. (1980) *Causal Necessity*. New Haven and London: Yale University Press.
- Sober, Elliott. (1985) “Two Concepts of Cause,” in Peter Asquith and Philip Kitcher, eds., *PSA 1984, Vol. II* (East Lansing: Philosophy of Science Association), pp. 405-424.
- Spirtes, Peter, Clark Glymour, and Richard Scheines. (2000) *Causation, Prediction and Search*, Second edition. Cambridge, MA: M.I.T. Press.
- Suppes, Patrick. (1970) *A Probabilistic Theory of Causality*. Amsterdam: North-Holland Publishing Company.
- Woodward, James. (1990) “Supervenience and Singular Causal Claims,” in Dudley Knowles, ed., *Explanation and its Limits* (Cambridge, U.K.: Cambridge University Press), pp. 211-246.

*First published: July 11, 1997*

*Content last modified: September 6, 2002*