A cause is regularly followed by its effect. This idea is at the core of regularity theories of causation. The most influential regularity theory can be found in Hume (1739). The theory was refined by Mill (1843), who insisted that the relevant regularities are laws of nature. Further refinements enjoyed popularity until David Lewis (1973) criticized the regularity theory and proposed an analysis of causation in terms of counterfactuals (see the entry on counterfactual theories of causation). Since then, counterfactual theories of causation have risen in prominence and regularity theories have increasingly fallen into disuse.
Regularities are closely tied to inferences. When a cause is invariably followed by its effect, it is reasonable to infer the effect from the cause. In the wake of logical empiricism and logical approaches to belief change, theories have cropped up that explicitly analyse causation in terms of inference relations. The basic idea is that effects are inferable from corresponding causes in the context of a suitable background theory using a suitable logic. Inferential theories offer solutions to the problems which were decisive for the decline of regularity theories. Such theories may thus be seen as successors of the regularity theories.
The core idea of regularity theories of causation is that causes are regularly followed by their effects. A genuine cause and its effect stand in a pattern of invariable succession: whenever the cause occurs, so does its effect. This regular association is to be understood by contrast to a relation of causal power or efficacy. On a regularity theory, a cause and its effect only instantiate a regularity. No causal powers are posited in virtue of which a cause brings about its effect. Moreover, no metaphysical entities or connections are posited which would ground the regularities in the world. Causation is not conceived of as a metaphysically thick relation of production, but rather as the simple instantiation of a regularity.
The most influential regularity theory is due to Hume. He defines a cause to be
[a]n object precedent and contiguous to another, and where all the objects resembling the former are plac’d in like relations of priority and contiguity to those objects, that resemble the latter. (Hume 1739: book I, part III, section XIV [1978: 169])
This definition contains three conditions for a cause. First, a cause temporally precedes its effect. Second, a cause is contiguous to its effect. That is, a cause is in spatiotemporal vicinity to its effect. Third, all objects similar to the cause are in a “like relation” to objects similar to the effect. This third resemblance condition says that cause and effect instantiate a regularity. To see this, note the presupposition of the resemblance condition: causes and effects can be sorted into types. The objects which are like the cause are of a certain type, and so are the objects which are like the effect. The third condition thus says that all objects of a certain type are followed by an effect of a certain type.
A Humean Regularity Theory (HRT), which may or may not coincide with Hume’s genuine analysis of causation, can be summarized as follows: where \(c\) and \(e\) denote token events of the types \(C\) and \(E\), respectively,
\(c\) is a cause of \(e\) iff

(i) \(c\) is spatiotemporally contiguous to \(e\),

(ii) \(c\) temporally precedes \(e\), and

(iii) all events of type \(C\) are followed by events of type \(E\).
Causation is thus analysed in terms of spatiotemporal contiguity, temporal precedence, and regular connection. The causal relation is reduced to non-causal entities: the two particular facts about spatiotemporal contiguity and temporal precedence, and the general regularity. There is nothing more to causation according to HRT. In particular, causation does not involve a necessary connection, a productive relation, causal powers, or the like—not even to ground the regularities. This stance against metaphysically thick conceptions of causation is characteristic of the regularity theory (see, e.g., Dowe 2000: Ch. 2; Psillos 2009).
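To make the three conditions concrete, here is a minimal computational sketch of HRT over a toy "world" of dated, located events. The representation (an `Event` record with a type label, a time, and a one-dimensional location) and the contiguity threshold are my own illustrative assumptions, not part of Hume's apparatus.

```python
from dataclasses import dataclass

@dataclass
class Event:
    typ: str      # event type, e.g., "C" or "E"
    time: float   # time of occurrence
    place: float  # one-dimensional location, for simplicity

def contiguous(c, e, eps=1.0):
    # condition (i): spatiotemporal vicinity, close in space and time
    return abs(c.place - e.place) <= eps and abs(c.time - e.time) <= eps

def regularity(world, ctype, etype):
    # condition (iii): every event of the cause's type is followed
    # (somewhere in the world) by an event of the effect's type
    return all(
        any(f.typ == etype and f.time > c.time for f in world)
        for c in world if c.typ == ctype
    )

def hrt_cause(c, e, world):
    # c causes e iff (i) contiguity, (ii) precedence, (iii) regularity
    return contiguous(c, e) and c.time < e.time and regularity(world, c.typ, e.typ)

world = [Event("C", 0, 0), Event("E", 1, 0.5),
         Event("C", 5, 3), Event("E", 6, 3.5)]
print(hrt_cause(world[0], world[1], world))  # True: the regularity holds
```

Note how holistic condition (iii) is: whether this `C`-event causes this `E`-event depends on every other `C`-event in the world, which is exactly the feature exploited by the meteor objection below.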
Let us explain some of the corollaries of HRT. The spatiotemporal contiguity of cause and effect excludes causation at a distance: a cause is proximate to its effect, either directly or via a chain of contiguous events (Hume 1739: book I, part III, section II [1978: 75]). The temporal precedence of the cause and the direction of time imply that causation is asymmetric: if \(c\) causes \(e\), then it is not the case that \(e\) causes \(c\). Temporal precedence thus explains the asymmetry of causation, and how we can distinguish between cause and effect. However, it also rules out the possibility of simultaneous and backwards-in-time causation; and it dampens the prospects of a non-circular theory of time defined in terms of causation.
Condition (iii) relates a particular instance of causation, \(c\) is a cause of \(e\), to the class of all like cases, the types of \(c\) and \(e\). Hume takes causation to be primarily a relation between particular matters of fact. Yet the causal relation between these actual particulars holds in virtue of a certain regularity. Any causal relation thus supervenes on a regularity in the sense that there can be no change in the causal relation without a change in the regularity. Any regularity, in turn, supervenes on particular matters of fact: there can be no change in the regularity without a change of particular matters of fact. Whether a regularity is true is thus determined by which particular matters of fact obtain. Likewise, whether a causal relation obtains is determined by which regularity obtains. In the end, only the particular matters of fact determine whether or not a causal relation obtains, since the relation of supervenience is transitive. (For details, see the entries on supervenience and David Lewis, section on Humean supervenience.)
According to condition (iii), a cause is a sufficient condition for its effect in this sense: all \(c\)-like events occurring in the actual world are in fact followed by an \(e\)-like event, where the actual world is understood as the collection of all particular spatiotemporal matters of fact. The sufficiency of the cause for the effect is not to be understood in a modally stronger sense (Dowe 2000: 20). In particular, Hume denies that a cause necessitates its effect. In the world, there is—to use Hume’s phrase—just a “constant conjunction” between cause and effect.
On HRT, every effect has a sufficient cause. This does not imply that every event has a sufficient cause, but it says that, if there is a causal relation, it is deterministic. Probabilistic causation is thus not covered by HRT. And while the theory is compatible with a probabilistic analog of causation, we follow Hume in restricting ourselves to deterministic causation in this entry. (See the entry on probabilistic causation for an overview of theories of causation where the underlying processes may be indeterministic.)
HRT allows us to discover causal relations. We may observe spatiotemporal contiguity, temporal precedence, and that a certain type of event is generally followed by another type of event. Upon repeated observation of an invariable pattern of succession, we begin to expect that this particular event of a certain type is followed by that particular event of a certain type. We develop an inferential habit due to the experienced constant conjunction of types. Or as Hume puts it:
A cause is an object precedent and contiguous to another, and so united with it, that the idea of the one determines the mind to form the idea of the other. (Hume 1739: book I, part III, section XIV [1978: 170])
Our mind forms a habit to expect or to infer the effect from a cause upon repeatedly experiencing events which occur in sequence. Thereby, we produce a connection between objects or events that goes “beyond what is immediately present to the senses” (1739: book I, part III, section II [1978: 73]). Our senses and our memory provide us only with finitely many instances of objects or events which occur jointly, and from which our minds form inferential habits. Still, these inferential habits apply also to future, as yet unobserved instances of the regularity. In Hume’s words,
after the discovery of the constant conjunction of any objects, we always draw an inference from one object to another, […]. Perhaps ’twill appear in the end, that the necessary connexion depends on the inference, instead of the inference’s depending on the necessary connexion. (Hume 1739: book I, part III, section VI [1978: 88])
And indeed, Hume says that “[n]ecessity is something that exists in the mind, not in objects” (1739: book I, part III, section XIV [1978: 165]). So there is no necessary connection in the world which would go beyond regular association. However, the inferential habits we develop based on the perceived regularities make us feel that the connection between cause and effect is stronger than mere invariable succession. The necessary connection exists in our minds but not in the objects (see the section on the constructive phase in the entry on Hume on necessary connection).
According to the standard interpretation, the regularities themselves are real. There really are jointly and repeatedly occurring events of certain types that are spatially contiguous, and some temporally precede others. And the regularities are mind-independent: even if there were no minds around, the regularities would exist as patterns in the Humean mosaic of particular facts.
Is causation in the world, as expressed by HRT, extensionally equivalent to causation in the mind? The difference between the two definitions seems to be that condition (iii) of HRT is replaced by the following condition: \(c\) of type \(C\) makes us infer \(e\) of type \(E\). While HRT does not require the presence of an epistemic agent, the other definition seems to require such an agent and is thus mind-dependent. And an epistemic agent may infer an event from another event even though there is no regularity connecting the two events. So, as stated, the two definitions are not extensionally equivalent. Still, they can be read in ways which make them coextensive. The epistemic agent of the second definition, for example, can be seen as a hypothetical omniscient observer who always and only infers in accordance with all and only true regularities (for details, see Garrett 1997: Ch. 5).
What is Hume’s view on causation? We will not enter this exegetical debate here. Suffice it to say that the interpretations of Hume are manifold. Psillos (2009: 132) says that HRT “was Hume’s view, too, give or take a bit” and argues for it in Psillos (2002: Ch. 1). But there are many deviating interpretations (Beebee 2006; Garrett 2009; Strawson 1989; J. Wright 1983). The relation between causation in the world and causation in the mind, in particular, has remained a matter of debate among Hume scholars (see, e.g., Garrett 1993).
Let’s go back to the Humean Regularity Theory. HRT enjoys two benefits. First, as we have observed, no substantive metaphysics is needed to explain causation. No entities like causal powers or a necessary connection between cause and effect need to be posited. Causation is rather reduced to spatiotemporal contiguity, temporal precedence, and regular association. Second, Hume’s regularity theory explains how we can figure out what causes what. A cause is simply the event which temporally precedes and is spatiotemporally contiguous to another event such that the former and latter event instantiate a regularity. Since temporal precedence and spatiotemporal contiguity can be observed, the problem of identifying causes reduces to the problem of identifying regularities.
However, HRT also faces serious problems. Recall that it presupposes that causes and effects can be sorted into types which are connected by a regularity. But it is unclear how the sorting can be achieved. What makes an event “like” another? Which aspects of the objects or events are relevant for the sorting? Many contemporary philosophers explicate the relevant notion of resemblance or similarity in terms of laws of nature. If there is a law of the form \(\forall x (A(x) \rightarrow B(x))\), then events similar to a token event \(a\) of type \(A\) are simply other instantiations of \(A\), and events similar to a token event \(b\) of type \(B\) are likewise just other instantiations of \(B\). However, it is a non-trivial task to give a satisfying account of what a law of nature is—to say the least.
A second problem runs as follows. Assume, plausibly, that a giant meteor caused the dinosaurs to go extinct. For the meteor to be a cause on HRT, there must be a regularity saying that events like giant meteors hitting the Earth are followed by events like the extinction of the dinosaurs. But it seems as if such a regularity does not exist: the next giant meteor hitting the Earth will not cause the dinosaurs to go extinct. The dinosaurs are already extinct. The problem is that the meteor is only a cause of the extinction on a single occasion. But according to HRT, this one meteor hitting the Earth is only a cause if the other meteors hitting the Earth also satisfy the regularity. Why would events far away in space and time determine whether or not this giant meteor caused the extinction of the dinosaurs on this occasion? It seems as if we can have causation without regularity on a single occasion. We will refer to this problem as the problem of singular causal relations.
A third problem is that there can be regularity without causation. The cry of the cock precedes the rise of the sun and the two events are regularly associated. And yet the cry of the cock is not a cause of the sunrise. A special case of this problem arises from joint effects of a common cause. Suppose an instantiation \(c\) of \(C\) is a common cause of the instantiations \(a\) of \(A\) and \(b\) of \(B\), where the token event \(a\) precedes the token event \(b\) in time (see Figure 1). Further suppose \(C\) is instantiated whenever \(A\) is. That is, whenever a token event of type \(A\) occurs, a token event of type \(C\) occurs. Then there is a regular connection between \(A\) and \(B\), which we normally do not count as a causal relation. This problem remains a central challenge for regularity theories to date.
Figure 1
Setting aside the possibilities of remote and backwards causation for now, a regularity theory of causation is only tenable if the three problems can be addressed. The next section focuses on how laws of nature may help overcome the problem of sorting events into appropriate types and how they may exclude the cry of the cock as a cause of the sunrise.
Mill (1843) refined the Humean Regularity Theory of causation. An effect usually occurs only due to several instantiated factors, that is, instantiated event types. He distinguishes between positive and negative factors. Let \(C_1,\) \(C_2,\) …, \(C_n\) be a set of positive factors and \(\overline{D}_1,\) \(\overline{D}_2,\) …, \(\overline{D}_m\) a set of negative factors which are together sufficient for an effect \(E\). Mill says that the totality of present positive and absent negative factors sufficient for an effect to occur is the cause of this effect. In symbols, the totality of the token events \(c_1,\) \(c_2,\) …, \(c_n\) and the absence of any token events of the types \(D_1,\) \(D_2,\) …, \(D_m\) is a cause of an occurring event \(e\) iff \(C_1,\) \(C_2,\) …, \(C_n,\) \(\overline{D}_1,\) \(\overline{D}_2,\) …, \(\overline{D}_m\) is sufficient for \(E\).
Furthermore, Mill claims that causation requires lawlike regularities. Mere regular association is not enough. The cry of a cock (as part of a totality of conditions) precedes the rise of the sun and the former is invariably followed by the latter, and yet the cry of a cock does not count as a cause of the sunrise. For the regularity is not a law of nature; it is a mere accidental regularity.
The Millian Regularity Theory (MRT) can be summarized as follows:
The totality of present positive factors \(\{ C_{1-n} \}\) and absent negative factors \(\{ \overline{D}_{1-m} \}\) is a cause of an event \(e\) of type \(E\) iff
there are events of type \(C_1,\) \(C_2,\) …, \(C_n\) which are spatiotemporally contiguous to \(e\) and there are no such events of type \(D_1,\) \(D_2,\) …, \(D_m\),
the events of type \(C_1,\) \(C_2,\) …, \(C_n\) precede \(e\) in time, and
it is a law of nature that an instantiation of all positive factors \(\{ C_{1-n} \}\) is followed by an instantiation of factor \(E\) when none of the negative factors \(\{ D_{1-m} \}\) is instantiated.
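The MRT clauses can likewise be sketched as a toy check. Representing factors as type labels, a causal scenario as the set of instantiated types, and the stock of laws as a set of (positive factors, negative factors, effect) triples is my own simplification; contiguity and precedence are not modelled here.

```python
def mrt_cause(positives, negatives, effect, instantiated, laws):
    """Millian cause: all positive factors are instantiated, no negative
    factor is instantiated, and a law links the totality to the effect.
    (Contiguity and precedence are assumed, not checked, in this toy.)"""
    if not positives <= instantiated:       # a positive factor is missing
        return False
    if negatives & instantiated:            # a negative factor is present
        return False
    # lawlike sufficiency of the totality for the effect
    return (frozenset(positives), frozenset(negatives), effect) in laws

# one toy law: C1 and C2 present, D1 absent, is sufficient for E
laws = {(frozenset({"C1", "C2"}), frozenset({"D1"}), "E")}
print(mrt_cause({"C1", "C2"}, {"D1"}, "E", {"C1", "C2", "E"}, laws))  # True
```

On this rendering, the cause is the whole totality, not any single factor, which is exactly the point where Mackie's INUS analysis below refines the picture.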
What are laws of nature? Well, on a regularity theory of causation, a law is a regularity understood as a stable pattern of events—nothing more, nothing less. And yet, laws are still special regularities. According to Mill (1843), regularities which subsume other true regularities are more general than the subsumed ones. The laws of nature are those regularities which are the most general. They thus organise our knowledge of the world in an ideal deductive system which strikes the best balance between simplicity and strength (see, e.g., 1843: Book III, Ch. XII). This best system account of laws has been further developed by Ramsey and Lewis. Ramsey puts it as follows,
even if we knew everything, we should still want to systematize our knowledge as a deductive system, and the general axioms in that system would be the fundamental laws of nature. The choice of axioms is bound to some extent to be arbitrary, but what is less likely to be arbitrary if any simplicity is to be preserved is a body of fundamental generalizations, some to be taken as axioms and others deduced. (1928 [1978: 131])
Everything else being equal, a system with fewer axioms is simpler. Simplicity thus demands the pruning of inessential elements from the system of laws. A deductive system is stronger when it entails more truths. Strength demands that the deductive system of laws should be as informative as possible. Note that simplicity and strength compete. A system consisting of only one uncomplicated axiom is simple but not very informative. A system which has one axiom for any truth is very strong but far from simple. To avoid a deductive system that is too simple to be informative and one that is too complicated, we need to balance simplicity and strength. The proper balance between simplicity and strength makes the deductive system best.
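The trade-off described here can be illustrated with a deliberately crude toy model. Nothing below is Ramsey's or Lewis's formal proposal: the facts, the candidate axioms, and the linear scoring rule (strength minus a weighted axiom count) are all invented for illustration.

```python
from itertools import combinations

# a toy "world": the truths are facts 0..9; each candidate axiom
# entails a fixed set of those facts
truths = set(range(10))
axioms = {
    "A": {0, 1, 2, 3, 4},   # one broad generalization
    "B": {5, 6, 7, 8, 9},   # another broad generalization
    "C": {0, 1},            # a narrow one, subsumed by A
}

def score(system, weight=2):
    # strength: how many truths the system entails
    strength = len(set().union(*(axioms[a] for a in system)) & truths)
    # simplicity: fewer axioms is better
    simplicity = -len(system)
    return strength + weight * simplicity

candidates = [set(s) for r in range(1, len(axioms) + 1)
              for s in combinations(axioms, r)]
best = max(candidates, key=score)
print(sorted(best))  # ['A', 'B']: the balance drops the subsumed axiom C
```

The narrow axiom `C` adds no strength that `A` does not already provide, so any balanced scoring excludes it, mirroring Mill's point that the laws are the most general regularities.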
There is no guarantee that there is a unique best deductive system. If there is none, the laws of nature are those axioms and theorems that are common to all deductive systems that tie with respect to simplicity and strength (or are at least good enough). D. Lewis thus says that
a contingent generalization is a law of nature if and only if it appears as a theorem (or axiom) in each of the true deductive systems that achieves a best combination of simplicity and strength. (1973: 73)
As a consequence, laws are simple and yet highly informative as part of each best deductive system. The regularity about the cry of the cock and the sunrise is not part of a best deductive system. And whichever regularity is not part of any best system is accidental: it is not a law of nature. The dividing line between laws and accidental regularities is determined by being a part of every best system. Note that no regularity can be regarded as a law of nature in the absence of a full system.
There is a worry that the best system account of laws is infected by epistemic considerations. Of course, what truths a best system entails is an objective matter; it does not depend on our knowledge of it. Whether a truth qualifies as a law, however, depends on the way this system is organized. And there may be considerable leeway in choosing the axioms to obtain a simple system. The worry is that the notion of simplicity is not fully objective, and even if it were, striking a balance between simplicity and strength is not. Hence, which regularities are laws depends on epistemic criteria, namely our desiderata to organize our knowledge as simply as possible in a deductive system. This is not to say that the lawlike regularities are mind-dependent. Laws and regularities are still patterns in the world independent of our knowledge of them. What makes some of the regularities laws and others not, however, is not only determined by the world. The law-making feature, striking a proper balance between simplicity and strength, seems to have an epistemic component. This is presumably the cost of being free of substantive metaphysical commitments. (For details on the best system account of laws, see the entry on laws of nature.)
What does the epistemic component mean for causation as lawlike regularity? The regularity is determined solely by the world, but the lawlikeness of a regularity rests, in part, on epistemic criteria. So, while causation is not arbitrary—there must still be a regularity for causation—there is an epistemic element in choosing which regularities constitute causation. In brief, the regularities are mind-independent; what causes what is not.
The Humean regularity theory, where causation is the instantiation of a regularity, presupposes that events can be sorted into types. A type is a class of resembling events. Based on the best systems account, we may now appeal to laws of nature in order to determine which events are similar and which are dissimilar. Suppose it is a law that all events \(a\) of type \(A\) are followed by events \(b\) of type \(B\). Then the events or objects similar to \(a\) are just the other instantiations of type \(A\), and likewise for type \(B\). But the law-making feature depends on epistemic considerations, and so the types determined by those laws depend on epistemic considerations. Falling back on a Humean regularity theory without laws does not solve the issue either: we still need a way to sort events into types. But which events resemble each other comes in degrees and depends on the respects of comparison. The degrees of similarity and respects of comparison are not fixed by the world understood as the collection of all particular spatiotemporal matters of fact. The degrees and respects are rather epistemic components which depend on the categories and classification schemes we use. A related problem is posed by gerrymandered categories or properties like “grue”, as studied by Goodman (1955) in his new riddle of induction.
Some defenders of the regularity theory see the dependence on epistemic components as a problem. To solve it, one may want to posit that there are natural properties which carve nature at its joints. The idea is that those natural properties provide a fully objective classification and thereby ensure that the lawlike regularities are free of epistemic components. However, what is natural is notoriously difficult to define. Natural properties, which are objectively similar and dissimilar to each other, are thus often simply posited as primitives which are not analysed further, as Lewis (1986: 60–2), for example, does.
On the Millian Regularity Theory, a cause is nomologically sufficient for its effect: there is a law of nature such that the occurrence of a cause suffices for the occurrence of its effect. A cause is taken to be a totality of positive and negative factors which is sufficient and necessary for the effect. However, one and the same effect can have many causes understood as sufficient and necessary totalities (as has already been observed by Venn 1889). Each of these distinct totalities, or clusters of factors, is sufficient to bring about the effect, but no single one is necessary if another cluster is actual.
Mackie (1974: 63) conceives of each cluster of factors as a conjunction, for example \(C \land \neg A\), where \(C\) is a type of event and \(\neg A\) is the absence of any event of type \(A\). Consider, for instance, a house which burns to the ground. There is a variety of distinct clusters which can cause the burning of the house. One cluster includes a short-circuit, the presence of oxygen, the absence of a sprinkler system, and so on. Another cluster includes an arsonist, the use of petrol, again the presence of oxygen, and so on. Yet another cluster includes a burning tree falling on the house, and so on. Each cluster is a conjunction of single factors. The disjunction of all clusters then represents the plurality of causes (see Mackie 1965: 245 and 1974: 37–38 for details).
The disjunction of all clusters for an effect is relative to a causal field. A causal field is “a background against which the causing goes on” (1974: 63). Roughly, a causal field captures the circumstances which are kept fixed and so cannot even be considered as (part of) a cause. Or, as Mackie puts it, what is caused is
not just an event, but an event-in-a-certain-field, and some conditions can be set aside as not causing this-event-in-this-field simply because they are part of the chosen field. (1974: 35)
Suppose a causal field includes that a certain person was born and alive for a while. Being born thus cannot be a cause of the person’s death relative to this causal field. However, with enough ingenuity, the person being born is a cause of the person’s death relative to a causal field which does not include, as part of the fixed background, that the person was born and alive for a while (see 1974: 37).
This suggests that the regularity for an effect is a disjunction of conjunctions which is necessary and sufficient for the effect relative to a causal field. Relative to this field, each cluster is sufficient for the burning of the house, but no single one is necessary; for another cluster can be sufficient for the burning. (That \(C\) is sufficient for \(E\) means that, whenever an event of type \(C\) occurs, so does an event of type \(E\). That \(C\) is necessary for \(E\) means that an event of type \(E\) only occurs if an event of type \(C\) occurs.) For simplicity, suppose the complex regularity (relative to a particular causal field) has the form:
\[(C_1 \land C_2) \lor D_1 \leftrightarrow E.\]

On the one hand, any cluster, for example \(C_1 \land C_2\), is minimally sufficient for \(E\) (and so is the “cluster” \(D_1\)). That \(C_1 \land C_2\) is minimally sufficient for \(E\) means that the cluster is sufficient for \(E\) and each conjunct of the cluster is necessary for its sufficiency. So no conjunct of any cluster is redundant for bringing about the effect. On the other hand, neither cluster is necessary for the effect \(E\). \(C_1 \land C_2\) is minimally sufficient but not necessary for \(E\): \(E\) occurs if \(D_1\) does. Now, focus on \(C_1\), which is on its own insufficient to bring about \(E\). \(C_1\) is a single factor of the cluster \(C_1 \land C_2\), which is a sufficient but unnecessary condition for \(E\). Hence, \(C_1\) is an insufficient but non-redundant part of an unnecessary but sufficient condition for \(E\) (relative to a causal field). Or simply, \(C_1\) is an INUS condition for \(E\).
In general, a complex regularity has the form:
\[(C_{1,1} \land \ldots \land C_{1, n}) \lor \ldots \lor (C_{k, 1} \land \ldots \land C_{k, m}) \leftrightarrow E.\]

Each single factor \(C_{i, j}\) of a cluster is a potential cause. Hence, a cause is at least an INUS condition (relative to a particular causal field). “At least” because there are limiting cases: a cluster may consist of a single factor, which is then sufficient on its own, and there may be only one cluster, whose factors are then also necessary for the effect.
On Mackie’s theory, a factor \(C\) is a cause of \(E\) iff \(C\) is at least an INUS condition of \(E\), and each factor of the cluster that contains \(C\) and is sufficient for \(E\) is instantiated.
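Mackie's definitions lend themselves to a truth-functional sketch. Assuming the complex regularity is given in the disjunctive form above—as a list of clusters, each a set of literals—sufficiency, minimal sufficiency, and INUS status can be checked by brute force over truth assignments. The encoding is my own; Mackie of course gave no algorithm.

```python
from itertools import product

def models(atoms):
    # all truth assignments over the given atoms
    for vals in product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, vals))

def holds(literal, m):
    # a literal is an atom "C1" or its negation "~C1"
    return not m[literal[1:]] if literal.startswith("~") else m[literal]

def sufficient(cluster, effect_clusters, atoms):
    # whenever all literals of the cluster hold, some cluster of the
    # regularity holds, and hence the effect occurs
    return all(any(all(holds(l, m) for l in c) for c in effect_clusters)
               for m in models(atoms) if all(holds(l, m) for l in cluster))

def minimally_sufficient(cluster, effect_clusters, atoms):
    if not sufficient(cluster, effect_clusters, atoms):
        return False
    # no conjunct is redundant: dropping any literal destroys sufficiency
    return all(not sufficient(cluster - {l}, effect_clusters, atoms)
               for l in cluster)

def inus(factor, effect_clusters, atoms):
    # an Insufficient but Non-redundant part of an (Unnecessary but)
    # Sufficient condition: the factor sits in a minimally sufficient
    # cluster with more than one conjunct, and is not sufficient alone
    in_cluster = any(factor in c and len(c) > 1
                     and minimally_sufficient(c, effect_clusters, atoms)
                     for c in effect_clusters)
    return in_cluster and not sufficient({factor}, effect_clusters, atoms)

# the regularity (C1 & C2) v D1 <-> E
reg = [{"C1", "C2"}, {"D1"}]
atoms = ["C1", "C2", "D1"]
print(inus("C1", reg, atoms))  # True: C1 is an INUS condition for E
print(inus("D1", reg, atoms))  # False: D1 is sufficient on its own
```

`D1` illustrates one of the limiting cases: it is a cause-candidate without being an INUS condition, since it is minimally sufficient all by itself.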
How does Mackie’s theory of causation go beyond the regularity theories of Hume and Mill? Unlike HRT, Mackie’s theory captures causal scenarios where several different events are necessary to bring about a certain effect, and so qualify as a cause of the latter. Take the regularity
\[(C_1 \land C_2) \lor D_1 \leftrightarrow E.\]

Suppose \(C_1, C_2,\) and \(D_1\) occur independently of one another. There is neither a strict positive nor a strict negative correlation among these events. There are cases where \(C_1\) does not occur but \(D_1\) does, and so \(E\) will occur. There are also cases where \(C_1\) occurs but \(C_2\) and \(D_1\) do not, and so \(E\) will not occur. And there will be cases where \(C_1\) and \(C_2\) occur together so that an instance of \(C_1\) qualifies as a cause of an instance of \(E\). An event of type \(C_1\) may therefore be a cause of an event of type \(E\) without there being a strict regular succession between \(C_1\)-events and \(E\)-events. This being said, Hume (1739: book I, part III, section XV) considers causation without strict regular succession. There, he seems to propose a regularity theory that goes beyond HRT and may cover causation based on non-strict succession.
The advantages of Mackie’s theory over Mill’s are less obvious. Recall that for Mill, a cause is a totality of positive and negative factors such that, in the respective causal scenario, this totality is sufficient for \(E\). On this theory, events of type \(C\) may be members of a cause for an effect of type \(E\) without there being a strict regular succession between \(C\)-events and \(E\)-events. Moreover, the notion of a totality of factors is consistent with there being several totalities that are sufficient to bring about an event of type \(E\). In other words, several totalities may serve as instances of some cluster in one and the same Mackie-style regularity. These totalities may be represented by conjunctions and joined by disjunctions. If we amend the Millian regularity theory with a requirement of minimal sufficiency, it seems as if we can go back and forth between a Mill-style and a Mackie-style representation of causes, with the qualification that causes in the sense of Mackie are mere factors of a cause in the sense of Mill. In this sense, the two theories seem to be intertranslatable. It should be pointed out, however, that Mackie’s theory gives us a more explicit and concise representation of the several totalities, or clusters, which are minimally sufficient to bring about a certain type of effect. The complex regularities and their elegant logical representation were not part of Mill’s conceptual repertoire.
Another merit of Mackie’s theory is that it provides an explanation for causal inference. If we know that an effect of type \(E\) occurred, and that \(D_1\) was not present, we may infer that the cluster \((C_1 \land C_2)\) occurred. In particular, we may infer that \(C_1\) caused \(E\).
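This inference can be checked mechanically: among all truth assignments consistent with the regularity \((C_1 \land C_2) \lor D_1 \leftrightarrow E\), every one in which \(E\) occurs and \(D_1\) is absent has both \(C_1\) and \(C_2\) instantiated. A brute-force sketch (the encoding is my own):

```python
from itertools import product

# the regularity: (C1 & C2) v D1  <->  E
def regularity(m):
    return ((m["C1"] and m["C2"]) or m["D1"]) == m["E"]

atoms = ["C1", "C2", "D1", "E"]
worlds = [dict(zip(atoms, vals))
          for vals in product([False, True], repeat=len(atoms))]
consistent = [m for m in worlds if regularity(m)]

# observe: E occurred and D1 was absent -- infer the cluster C1 & C2
compatible = [m for m in consistent if m["E"] and not m["D1"]]
print(all(m["C1"] and m["C2"] for m in compatible))  # True
```

In this small example the observation even pins down a unique compatible assignment, which is why the inference from the effect back to the cluster is deductively valid here.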
Mackie’s regularity theory of causation still fails to distinguish between genuine causes and mere joint effects of a common cause. To see this, consider an example adapted from Russell (1921 [1961: 289]): the sounding of factory hooters in Manchester is regularly followed by London workers leaving their work. And yet the former is not a cause of the latter. However, as Mackie (1974: 81–84) himself points out, the sounding of the hooters in Manchester counts as a cause of the workers going home in London on his theory:
In more concrete terms, the sounding of the Manchester factory hooters, plus the absence of whatever conditions would make them sound when it wasn’t five o’clock, plus the presence of whatever conditions are, along with its being five o’clock, jointly sufficient for the Londoners to stop work a moment later—including, say, automatic devices for setting off the London hooters at five o’clock, is a conjunction of features which is unconditionally followed by the Londoners stopping work. (1974: 84)
The structure of the example can be given as follows. There are two instantiated effects \(E_1\) (the sounding of the hooters in Manchester) and \(E_2\) (the workers going home in London) such that \(C_1\) (knocking-off time at five o’clock) is an instantiated common INUS condition:
\[\begin{align} (C_1 \land C_2) & \lor D \leftrightarrow E_1, \text{ and}\\ (C_1 \land C_3) & \lor B \leftrightarrow E_2. \end{align}\]

Furthermore, let’s say that \(C_2\) and \(C_3\) are instantiated on this particular occasion. Well then, \(C_1\) counts as a common cause of \(E_1\) and \(E_2\) on Mackie’s theory.
In the presence of the complex regularities, the cluster
\[E_1 \land \neg D \land C_3\]
is sufficient for \(E_2\). Indeed, the cluster of the sounding of the hooters in Manchester (\(E_1\)), the absence of any conditions that would make the Manchester hooters sound when it was not five o’clock (\(\neg D\)), and the presence of all conditions (\(C_3\)) that are, together with its being five o’clock (\(C_1\)), jointly sufficient for the workers going home in London (\(E_2\))—this cluster is sufficient for the workers going home in London. Hence, \(E_1\) is at least an INUS condition of \(E_2\). If \(D\) is not instantiated, but \(C_2,\) \(C_3,\) \(E_1,\) and \(E_2\) are, then the instantiation of \(E_1\) counts as a cause of that of \(E_2\). More generally, an effect of a cause \(C_1\) may be part of an instantiated cluster sufficient for another effect of \(C_1\). This shows that, on Mackie’s theory, an effect of a common cause may falsely count as a cause of another effect of said common cause.
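That \(E_1 \land \neg D \land C_3\) is sufficient for \(E_2\) given the two complex regularities can be verified truth-functionally. The following sketch enumerates all assignments consistent with both biconditionals and confirms the derived sufficiency (the encoding is my own):

```python
from itertools import product

atoms = ["C1", "C2", "C3", "D", "B", "E1", "E2"]

def laws(m):
    # (C1 & C2) v D <-> E1   and   (C1 & C3) v B <-> E2
    return (((m["C1"] and m["C2"]) or m["D"]) == m["E1"]
            and ((m["C1"] and m["C3"]) or m["B"]) == m["E2"])

worlds = [dict(zip(atoms, vals))
          for vals in product([False, True], repeat=len(atoms))]
consistent = [m for m in worlds if laws(m)]

# the derived cluster E1 & ~D & C3 is sufficient for E2
print(all(m["E2"] for m in consistent
          if m["E1"] and not m["D"] and m["C3"]))  # True
```

The derivation goes through because, given \(\neg D\), \(E_1\) forces \(C_1 \land C_2\), and \(C_1\) together with \(C_3\) forces \(E_2\)—which is exactly how one effect of the common cause comes to count as sufficient for the other.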
For Mackie (1974: 85–86), this type of problem indicates that his regularity theory is at best incomplete. And he tells us what is missing: a genuine cause-effect relation is marked by causal priority. As an approximation, a token event \(c\) was causally prior to another token event \(e\) if an agent could have—in principle—prevented \(e\) by “(directly or indirectly) preventing, or failing to bring about or to do \(c\)” (1974: 190). To prevent the sounding of the Manchester hooters does not prevent the workers from going home in London. Hence, the instance of \(E_1\) was not causally prior to the associated instance of \(E_2\) (though it is temporally prior). By contrast, preventing the sounding of the London hooters does prevent the workers from going home in London.
The approximate relation of causal priority is interventionist. Causes are seen as devices for an (ideal) agent to manipulate an effect. However, Mackie did not propose an interventionist theory of causation. His final notion of causal priority is independent of a notion of agency. \(c\) was causally prior to \(e\) if there was a time at which \(c\) was fixed while \(e\) was unfixed (see 1974: 180–183 and 189–192). It should be noted, however, that Mackie himself does not think that his analysis of causal priority succeeds (1974: xiv).
Another problem for Mackie’s regularity theory is that the complex regularities seem to be symmetric. This symmetry blurs the asymmetry between cause and effect, and leads to the problem of the direction of causation. Like Hume, one could stipulate that causes precede their effects in time. The direction of causation is thus simply the direction of time. However, this stipulation excludes both the possibility of (a) backwards causation and (b) an analysis of “time’s arrow” in terms of causation. Since—to the best of our knowledge—there is no account of causation which has fully solved the problem of the causal direction, the objections (a) and (b) do not speak decisively against the regularity theory.
Mackie’s theory of causation continues to be influential. Richard Wright (1985, 2011) proposed a similar account for identifying causes which can be, and indeed is, used in legal contexts. Building on Hart and Honoré (1985), he defined a cause as follows: a condition \(c\) was a cause of \(e\) iff \(c\) was necessary for the causal sufficiency of a set of existing conditions that was jointly sufficient for the occurrence of \(e\). Causal sufficiency is here different from lawful sufficiency: the set of conditions must instantiate the antecedent of a causal law whose consequent is then instantiated by the effect. Wright’s account requires that we have causal laws—as opposed to laws—at our disposal, and so it is only reductive if we have a reduction of the causal laws. Nevertheless, the idea that a cause is a necessary element of a sufficient set for an effect has been applied in legal cases to determine which event has caused another. In fact, this NESS test for causation has itself become influential in legal theory (see the entry on causation in the law).
Strevens (2007) develops Mackie’s theory further. On the way, he likewise gives up its reductive character by replacing sufficiency with causal sufficiency. The latter is to be understood in terms of causal entailment, which in turn is supposed to represent a causal process. (We explain Strevens’s theory of causation in greater detail in §2.4.) Notably, Strevens (2007) argues that Mackie’s original theory already solves not only the causal scenario known as overdetermination—as everyone agrees—but also (early and late) preemption. (For an overview of problematic causal scenarios, see Paul & Hall 2013.)
Baumgartner (2008, 2013) developed Mackie’s theory further while keeping its reductive character. He observes that complex regularities like
\[(C_1 \land C_2) \lor D_1 \leftrightarrow E\]
are not symmetric in the following sense: an instantiation of a cluster is sufficient for \(E\), but an instantiation of \(E\) is generally not sufficient to determine which cluster is instantiated. An instantiation of \(E\) does not determine whether \(C_1 \land C_2\) or \(D_1\) is instantiated. An instantiation of \(E\) only determines the whole disjunction of minimally sufficient clusters. This non-symmetry may be exploited to establish a notion of causal priority or the direction of causation.
There is, of course, a problem when a single cluster is necessary and sufficient for an effect:
\[(C_1 \land C_2) \leftrightarrow E.\]
Here the instantiation of \(E\) is sufficient to determine the instantiation of the cluster. Baumgartner’s suggestion requires an effect to have at least two alternative sufficient clusters in order to establish the direction of causation. And even then, it seems that the problem of joint effects of a common cause is still unsolved. Consider Figure 2, where \(A\) and \(B\) are the joint effects of a common cause \(C\), and \(D\) and \(E\) are alternative causes for \(A\) and \(B\), respectively.
Figure 2
Whenever \(A \land \neg D\) is actual, so is \(C\)—for no effect occurs without any of its causes. Furthermore, whenever \(C\) occurs, so does \(B\). Hence, \(A \land \neg D\) is minimally sufficient for \(C\), and thus for \(B\). And \(A \land \neg D\) is a minimally sufficient cluster in a necessary condition for \(B\):
But \(A \land \neg D\) should not count as a cause of \(B\). After all, the causes of \(B\) are only \(C\) and \(E\) in this causal scenario.
Baumgartner proposes a solution. The complex regularity for an effect must be necessary for that effect in a minimal way. \(B\) is instantiated only if \(C\) or \(E\) is instantiated. Hence, \(C \lor E\) is necessary for \(B\). And the left side of the complex regularity (1) contains no other disjunction—obtained by removing one disjunct—which is necessary for \(B\). There are only two candidates. \((A \land \neg D) \lor C\) is not necessary because \(B\) can be instantiated along with \(E,\) \(\neg C,\) and \(A \land D\). Similarly, \((A \land \neg D) \lor E\) is not necessary because \(B\) can be instantiated along with \(C,\) \(\neg E,\) and \(A \land D\). Hence, \(B \rightarrow C \lor E\) and no disjunct can be removed from \(C \lor E\) such that the implication still holds. In this sense, \(C \lor E\) is a minimally necessary disjunction of minimally sufficient clusters for \(B\).
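These necessity checks can be carried out by enumeration. The following sketch encodes the Figure 2 scenario under an illustrative assumption about its structure (namely, \(A \leftrightarrow C \lor D\) and \(B \leftrightarrow C \lor E\)) and tests which candidate disjunctions are necessary for \(B\):

```python
from itertools import product

# Figure 2 scenario (assumed structure): C is a common cause of A and B,
# D is an alternative cause of A, and E an alternative cause of B.
def models():
    for c, d, e in product([False, True], repeat=3):
        yield dict(C=c, D=d, E=e, A=c or d, B=c or e)

def necessary_for_B(disjunction):
    """Is the disjunction true in every model in which B is instantiated?"""
    return all(disjunction(m) for m in models() if m["B"])

print(necessary_for_B(lambda m: m["C"] or m["E"]))                   # True
print(necessary_for_B(lambda m: (m["A"] and not m["D"]) or m["C"]))  # False
print(necessary_for_B(lambda m: (m["A"] and not m["D"]) or m["E"]))  # False
```

The two `False` results correspond to the counterexamples in the text: \(B\) can be instantiated along with \(E, \neg C, A \land D\), and along with \(C, \neg E, A \land D\), respectively.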
The notion of a minimally necessary disjunction can also be characterized in terms of more formal concepts. Suppose \(\mathbf{C}\) is a set of clusters, where each cluster is given by a conjunction of factors. \(\mathbf{C}\) is a minimally necessary condition for \(E\) iff
(\(\bigvee \mathbf{C}\) designates some disjunction of the members of \(\mathbf{C}\).)
The underlying reason why \(A \land \neg D\) can be pruned from the left side of (1) is that \(A \land \neg D\) is sufficient for \(C \lor E\), while the latter is not sufficient for the former in the given scenario. Now, an instantiation of \(C \lor E\) makes a difference as to whether or not \(B\) is instantiated—independently of \(A \land \neg D\). The converse is not true: an instantiation of \(A \land \neg D\) makes a difference as to whether or not \(B\) is instantiated, but not independently of \(C \lor E\) (see Baumgartner 2008: 340–346 and 2013: 90–96). This suggests the following requirement: a complex regularity for an effect must be a minimally necessary disjunction of minimally sufficient clusters for said effect. Indeed, given that each effect has at least two alternative causes, this seems to solve the problem of joint effects of a common cause.
Baumgartner (2013: 95) defines a cause on the type level as follows. Relative to a factor set containing \(C\) and \(E\), \(C\) is a type-level cause of \(E\) iff
A type-level cause is thus permanently non-redundant for its effect. He then goes on to define a cause on the token level (for details, see Baumgartner 2013: 98). Relative to a factor set containing \(C\) and \(E\), an instance \(c\) of \(C\) is an actual cause of an instance \(e\) of \(E\) iff \(C\) is a type-level cause of \(E\), and, for all suitable expansions of the factor set including the original factor set, \(C\) is on an active causal route to \(E\). An active causal route is roughly a sequence of factors \(\langle Z_1, \ldots, Z_n \rangle\), where each element \(Z_i\) is a factor of a minimally necessary disjunction of minimally sufficient clusters for its successor element \(Z_{i+1}\), and each \(Z_i\) is co-instantiated with a set of factors \(X_i\) such that \(Z_i \land X_i\) form a minimally sufficient cluster for \(Z_{i+1}\). This amounts to a quite powerful analysis of actual causation. It solves many scenarios which are troublesome for counterfactual accounts, including overdetermination, early and late preemption, and scenarios known as “switches” and “short-circuits”. For a detailed study of these scenarios and an explanation of why they are troublesome for the counterfactual approach to causation, see Paul and Hall (2013). Finally, it should be noted that Baumgartner’s regularity theory faces a problem relating to what Baumgartner and Falk (forthcoming) call structural redundancies. The latter explain this problem and offer a solution based on a causal uniqueness assumption.
Inferential theories of causation are driven by the idea that causal relations can be characterized by inferential relations. In essence, \(C\) is a cause of \(E\) iff
Inferential theories may be seen as refinements and generalizations of regularity theories. For example, one can explain INUS conditions in terms of relatively simple inferences between the presumed cause and the putative effect. In this explanation, the distinctive feature of Mackie’s theory is explained by the logical form of the background theory. In a propositional setting, this background theory has the logical form of a biconditional
\[(C_{1,1} \wedge \ldots \wedge C_{1,n}) \vee \ldots \vee (C_{k,1} \wedge \ldots \wedge C_{k,m}) \leftrightarrow E.\]
Each \(C_{i,j}\) that co-occurs with all conjuncts of one conjunction on the left-hand side of this biconditional is at least an INUS condition for the effect \(E\). Obviously, we can infer \(E\) from such a \(C_{i,j}\) in the context of the biconditional and the other conjuncts that form, together with \(C_{i,j}\), a sufficient condition for \(E\). However, the exposition of the INUS account in Mackie (1965, 1974) is not explicitly inferential. Likewise, Hume’s original regularity theory can be interpreted inferentially, but it is not an explicitly logical or inferential account.
We understand the notion of an inference relation in a broad way. In essence, an inference relation maps sets of sentences to sets of sentences such that this mapping is guided by some idea of truth preservation: if all members of a given set of premises are true, then all sentences inferable from this set must be true as well. Moreover, there are inference relations that are guided by a weaker requirement: if all members of a given set of premises are believed, then it is rational to believe a certain set of conclusions. Nonmonotonic logics and formal theories of belief revision define inference relations in this sense. We shall say more about such inference systems in Section 2.2. Suffice it to say for now that an inference relation understood in our broad way may be defined syntactically in terms of relations of derivability, or semantically in terms of possible worlds and model-theoretic interpretations, or both ways.
Inferential theories seek to improve on regularity theories. Recall from Section 1.1 that virtually all regularity theories face two problems of extension. First, in the case of singular causal relations, we have a causal relation without regularity. Second, in the case of spurious causation, we have a regularity that is not considered to be causal. A sophisticated inferential theory may allow us to infer singular events, such as the nuclear disaster in Fukushima and the extinction of the dinosaurs, from a rich background theory and certain further hypotheses about presumed causes and the context of the presumed causal process. At least some inferential theories hold the promise to distinguish between spurious and genuine causal relations. The idea is that genuine causal relations may be characterized by distinctive inferential relations, and so be distinguished from spurious causal relations.
Inherent in the philosophy of logical empiricism, broadly construed, is a strong opposition to traditional metaphysics. Not surprisingly, many of the logical empiricists were guided by Humean empiricist principles. Even some concepts of scientific language were suspected to be metaphysical in nature. Wittgenstein (1922: 5.136) held that belief in a general law of causality is superstition. He also denied that laws of nature could explain natural phenomena (6.371n). Frank (1932: Ch. V, IX) pointed out several difficulties in giving a precise formulation of a general law of causality. Russell (1913) argued that the notion of cause is beset with confusion, and had better be excluded from the philosophical vocabulary.
Ramsey (1931) was less sceptical than Russell (1913) and Wittgenstein (1922) about the prospect of an analysis of causation. His account of causation seems to be the logical empiricist analysis that comes closest to Hume’s original analysis. It is centred on the notion of a variable hypothetical. Such a hypothetical is a universally quantified implication that has instances whose truth or falsity we are unable to verify for now. A case in point is the sentence saying that all humans are mortal. This sentence goes beyond our finite experience. For there is no way to conclusively establish that all humans—past, present, and future—are mortal.
To believe a universal hypothetical consists in “a general enunciation” and “a habit of singular belief” (Ramsey 1931: 241). That is, we are willing to believe instances of the universal sentence, even though we are not able to verify them at the moment. Ramsey here builds Humean ideas about the formation of a habit or custom into an informal semantics of universal sentences.
Ramsey’s account, then, amounts to the following analysis. \(C\) is a cause of \(E\) iff
Condition (i) remains implicit, but it would be confusing to omit it here. Strictly speaking, condition (ii) is redundant since it follows from condition (iv).
Note that a regularity by itself need not give rise to a variable hypothetical that we believe or are led to believe by future observations. A case in point is the regular connection between the sound of the hooters at a factory in Manchester and the workers going home at a factory in London (see §1.3). Ramsey (1931: 242) literally speaks of trust in order to characterize our epistemic attitude towards causal laws. Ramsey’s account of such laws, which supersedes his earlier best systems analysis, is explicitly epistemic. However, Ramsey’s account does not seem to have the resources to address the problem of singular causal relations. Under which causal law could we subsume the causal hypothesis that dinosaurs became extinct by a collision of the Earth with a meteor?
More work on causation and explanation has been done within the logical empiricist program. The deductive nomological model of explanation by Hempel and Oppenheim (1948)—also referred to as the DN model of explanation—seems to give us an exemplar of a conceptual analysis without metaphysics. Let us briefly recall the basic elements of this model. The explanans of an empirical phenomenon \(E\) consists of two types of statements: first, a set \(C\) of antecedent conditions \(C_1, \ldots, C_k\); second, a set \(L\) of laws \(L_1, \ldots, L_r\). These two sets form the explanans, while \(E\) is called the explanandum. \(C\) and \(L\) explain \(E\) iff
The explanandum must thereby be subsumed under at least one law.
While Hempel and Oppenheim were not primarily interested in an analysis of causation, they thought that the DN model may yield such an analysis:
The type of explanation which has been considered here so far is often referred to as causal explanation. If \(E\) describes a particular event, then the antecedent circumstances described in the sentences \(C_1, C_2, \ldots, C_k\) may be said jointly to “cause” that event, in the sense that there are certain empirical regularities, expressed by the laws \(L_1, L_2, \ldots, L_r\), which imply that whenever conditions of the kind indicated by \(C_1, C_2, \ldots, C_k\) occur, an event of the type described in \(E\) will take place. Statements such as \(L_1, L_2, \ldots, L_r\), which assert general and unexceptional connections between specified characteristics of events, are customarily called causal, or deterministic, laws. (Hempel & Oppenheim 1948: 139)
This passage nicely indicates how inferential approaches to causation parallel regularity approaches. The regular connection between the presumed cause and the putative effect is expressed by certain laws. The putative effect must be inferable from the presumed cause using these laws if the latter is an actual cause of the former. Causation implies that there are covering laws, according to which the cause is sufficient for its effect (cf. Pap 1952 and Davidson 1967, 1995). At the same time, the above passage reveals logical empiricist scruples about talk of causation. Hempel and Oppenheim are hesitant to set forth the DN model as an analysis of causation and to say that the antecedent conditions of a DN explanation are causes of the explanandum if the latter is a particular event.
More than a decade after the first publication of the DN model, Hempel (1965: 351n) suggested replacing the notion of cause by the notion of antecedent conditions along the lines of the DN model. Moreover, he then distinguishes more carefully between laws of succession and laws of coexistence (1965: 352). Only the former give rise to causal explanations. If a DN explanation is based on laws of coexistence, this explanation does not qualify as causal. A case in point is the law that connects the length of a pendulum with its period:
\[T=2 \pi \sqrt{l/g},\]
where \(T\) is the period of the pendulum, \(l\) its length, and \(g\) the gravitational acceleration near the surface of the Earth.
Arguably, even laws of succession do not always give rise to causal explanations. Such laws allow not only for predictions, but also for retrodictions. That is, we can derive statements about events that are prior to the events described by the antecedent conditions \(C_1,\) \(C_2,\) …, \(C_k\) in a DN explanation. Hempel (1965: 353) himself observed that Fermat’s principle may be used in a DN explanation of an event that precedes an event described by an antecedent condition.
In sum, there are at least two problems with a simple DN account of causation. First, DN explanations using laws of coexistence do not seem to qualify as causal. Second, in some DN explanations, the explanandum is an event that occurs prior to an event described in the antecedent conditions. In neither case do we view the antecedent conditions of the DN explanation as causes of the event to be explained.
These observations anticipate later criticisms of the DN model, which are commonly referred to as symmetry problems. In the case of the infamous tower-shadow example, most people have come to view only the derivation of the length of the shadow as properly explanatory, but not the derivation of the height of the tower (Bromberger 1966). Hempel (1965: 353n) himself admits that causal DN explanations seem “more natural and plausible” than non-causal ones. Woodward (2003) goes a step further by expounding a causal account of explanation using causal models.
We may wonder why Hempel did not define that \(C\) is a cause of a particular event \(E\) iff there is a DN explanation of \(E\) such that \(E\) is the explanandum, \(C\) is among the antecedent conditions, all antecedent conditions precede the occurrence of \(E\), and all laws are laws of succession. Presumably, his interest in an analysis of causation was limited.
Spohn (2006, 2012) expounds a broadly Humean analysis of causation. \(C\) is a cause of \(E\) iff
Condition (iii) says that \(C\) is a reason for \(E\), given the circumstances. This relation is analysed in terms of ranking functions. Spohn views this analysis as an alternative to counterfactual approaches. At the end of this section, we will understand why the analysis qualifies as an inferential approach to causation.
Spohn develops the ranking-theoretic analysis in two steps. First, he explains what it is for a proposition to be a reason for another proposition. Second, causal relations are characterized as specific types of reason relations. Let us go a bit further into the details. As regards notation, we follow Spohn (2006). The notation in Spohn (2012) is more refined, but slightly more complex as well.
A ranking function represents a belief state. More specifically, a ranking function \(\kappa\) assigns each possible world a non-negative natural number such that \(\kappa(w)=0\) for at least one possible world \(w\). Intuitively, the rank \(\kappa(w)\) of a possible world \(w\) expresses a degree of disbelief. \(\kappa(w)=0\) means that the degree of disbelief is zero; \(w\) is not disbelieved in this case. Otherwise, it is disbelieved with rank \(\kappa(w)=n > 0\). A ranking function thus defines a Grovian system of spheres, used to represent AGM belief revision operators (Alchourrón, Gärdenfors, & Makinson 1985; Grove 1988). A Grovian system of spheres is basically a set of nested sets. At the center of the system is its smallest member, a non-empty set of worlds. The smallest set is surrounded by its supersets, as shown in Figure 3.
Figure 3
The ranks 0, 1, 2, 3, … are understood as cardinal numbers. For example, if world \(w_1\) has rank 5 and world \(w_2\) rank 10, then \(w_2\) is disbelieved twice as strongly as \(w_1\). Some ranks may even be empty in the sense that there is no world that has a specific rank \(n\). For example, ranks 3 and 4 may be without any possible worlds, while there are possible worlds at ranks 0, 1, 2, and 5. The cardinal interpretation of the ranks is needed for certain arithmetical operations, such as lifting the ranks of all possible worlds in a certain subset by a certain cardinal number. Because of the possibility of empty ranks, a Grovian system of spheres does not define a unique ranking function.
At each possible world, either a proposition or its negation is true. We call a world at which a proposition \(A\) is true an \(A\)-world, and identify the proposition \(A\) with the set of \(A\)-worlds. The rank \(\kappa(A)\) of a proposition \(A\) is the rank of those possible worlds \(w\) whose rank \(\kappa(w)\) is minimal among the \(A\)-worlds. In Figure 3, the rank of \(A\) is thus \(0\) and the rank of \(\neg A\) is \(1\). A proposition \(A\) is believed iff \(\kappa(\neg A)>0\).
Ranking functions can be used not only to express degrees of disbelief, but also to express degrees of belief. This is accomplished by the belief function \(\beta\). Where \(A\) is a proposition,
\[\beta(A)= \kappa(\neg A) - \kappa(A).\]
That is, \(\beta(A)\) is the difference between the minimal rank of the \(\neg A\)-worlds and the minimal rank of the \(A\)-worlds. For example, if \(\kappa(\neg A)\) is rather high, we strongly believe \(A\). \(\beta(A)>0\) means that \(A\) is believed to be true, and \(\beta(A)<0\) that \(A\) is believed to be false. \(\beta(A)=0\) means that \(A\) is neither believed nor disbelieved. In the latter case, we have both \(A\)-worlds and \(\neg A\)-worlds whose rank is zero.
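A minimal computational sketch may make these definitions concrete. The world set and the particular ranks below are illustrative assumptions; worlds are represented by the tuples of atomic propositions true at them:

```python
# A toy ranking function: each world is mapped to a degree of disbelief,
# with kappa(w) = 0 for at least one world.
kappa = {
    ("A", "B"): 0,   # rank 0: not disbelieved
    ("A",): 1,
    ("B",): 1,
    (): 3,
}

def rank(prop):
    """kappa(A): the minimal rank among the worlds where prop holds."""
    return min(r for w, r in kappa.items() if prop(w))

def belief(prop):
    """beta(A) = kappa(not-A) - kappa(A)."""
    return rank(lambda w: not prop(w)) - rank(prop)

holds_A = lambda w: "A" in w
print(rank(holds_A), belief(holds_A))  # → 0 1
```

Here \(\kappa(A)=0\) and \(\kappa(\neg A)=1\), so \(\beta(A)=1>0\): the agent believes \(A\).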
We are almost done with the core of ranking theory. It remains to explain conditionalization. Conditionalization of a ranking function on a proposition represents the belief change upon coming to believe the proposition. The conditionalization of \(\kappa\) on \(A\) is defined as follows:
\[\kappa(w\mid A)= \kappa(w) - \kappa(A)\]
for all worlds \(w\) at which \(A\) is true, while \(\kappa(w\mid A)= \infty\) for all worlds \(w\) at which \(A\) is false. In less formal terms, if our agent comes to believe \(A,\) then the ranks of all \(A\)-worlds are lowered by the rank of \(A\). And the rank of all \(\neg A\)-worlds is set to infinity, which means that she definitely disbelieves those worlds. The conditionalization of a belief function \(\beta\) is defined analogously:
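The definition of conditionalization can be sketched directly in code. The ranking below and the labels `w1`–`w4` are illustrative assumptions:

```python
import math

# kappa(w|A) = kappa(w) - kappa(A) for A-worlds; non-A-worlds get rank infinity.
kappa = {"w1": 0, "w2": 2, "w3": 1, "w4": 4}
A = {"w2", "w4"}                      # the proposition the agent comes to believe

rank_A = min(kappa[w] for w in A)     # kappa(A) = 2

conditional = {
    w: (kappa[w] - rank_A) if w in A else math.inf
    for w in kappa
}
print(conditional)  # {'w1': inf, 'w2': 0, 'w3': inf, 'w4': 2}
```

Note that the result is again a ranking function: the minimal rank among the remaining (finite-ranked) worlds is 0, as required.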
\[\beta(B\mid A) = \kappa(\neg B \mid A) - \kappa(B \mid A).\]
With these ranking-theoretic concepts at hand, Spohn defines what it is for a proposition to be a reason for another proposition. \(A\) is a reason for \(B\) in the context \(C\)—relative to \(\beta\)—iff
\[\beta(B \mid A \cap C) > \beta(B \mid \neg A \cap C).\]
Moreover, he introduces subtle distinctions between an additional, a sufficient, a necessary, and a weak reason. These distinctions give rise to the notions of an additional, a sufficient, a necessary, and a weak cause. We leave out these subtleties for simplicity.
Now that we know what a ranking-theoretic reason is, it remains to explain which types of reasons are causes. In essence, \(C\) is a direct cause of \(E\) at the possible world \(w\)—relative to the belief function \(\beta\)—iff
Spohn thinks that we should understand causation relative to small worlds such that the context is given in a certain frame of possible events, but not by the global past of \(E\). Causation is thus doubly relative to an epistemic perspective: first, it is relative to a belief function, and second, to a frame of possible events.
A final step is needed to also capture non-direct causal relations. \(C\) is a cause of \(E,\) possibly non-direct, iff the ordered pair \((C,E)\) is in the transitive closure of the relation of direct causation. That is, \(C\) is a cause of \(E\) iff there is a sequence \(\langle C, C_1, \ldots, C_n, E \rangle\) \((n\geq 0)\) such that each element of the sequence is a direct cause of its successor (if there is one).
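Taking the transitive closure of direct causation is a simple reachability computation. In this sketch, the edge set `direct` is an illustrative assumption standing in for a relation of direct causation:

```python
# Causes as the transitive closure of direct causation.
direct = {("C", "C1"), ("C1", "C2"), ("C2", "E")}

def is_cause(c, e, edges):
    """c is a cause of e iff there is a chain of direct-causation links."""
    frontier, seen = {c}, set()
    while frontier:
        nxt = {y for (x, y) in edges if x in frontier} - seen
        if e in nxt:
            return True
        seen |= nxt
        frontier = nxt
    return False

print(is_cause("C", "E", direct))   # → True
print(is_cause("E", "C", direct))   # → False
```

The asymmetry of the output reflects that the closure is taken over an ordered relation: reachability runs only along the direction of direct causation.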
Most notably, the ranking-theoretic account of causation is able to solve the problem of spurious causation, arising from common cause scenarios. The solution is quite subtle. Suppose \(C\) is a common cause of \(E\) and \(F,\) where \(F\) precedes \(E\). The crucial question then is whether or not \(F\) is a reason for \(E\) in the context \(C\). Spohn (2006: 106) argues that it is not. Recall that \(F\) is a reason for \(E\) in the context \(C\) iff
\[\beta(E\mid C \cap F) > \beta(E\mid C \cap \neg F),\]
where
\[\beta(E\mid C \cap F)=\kappa(\neg E\mid C \cap F) - \kappa(E\mid C \cap F),\]
and
\[\beta(E\mid C \cap \neg F)=\kappa(\neg E\mid C \cap \neg F) - \kappa(E\mid C \cap \neg F).\]
Spohn determines the difference between the ranks \(\kappa(\neg E \mid C \cap F)\) and \(\kappa(E\mid C \cap F)\) by the difference in the number of violations of causal regularities between the world where \(\neg E \land C \land F\) is true and the world where \(E \land C \land F\) is true. In the former world, the causal regularity between \(C\) and \(E\) is violated, while the causal regularity between \(C\) and \(F\) is respected. No causal regularity is violated in the world where \(E \land C \land F\) is true. Hence,
\[\beta(E\mid C \cap F)=1.\]
As for the calculation of \(\beta(E\mid C \cap \neg F),\) we have two violations of a causal regularity in the world of \(\neg E \land C \land \neg F\), but only one such violation in the world of \(E \land C \land \neg F\). Hence,
\[\beta(E\mid C \cap \neg F)=1.\]
It therefore holds that
\[\beta(E\mid C \cap F) = \beta(E\mid C \cap \neg F).\]
Hence, \(F\) is neither a reason for nor a cause of \(E\). It is easy to show that \(C\) is a reason for \(E\), as desired.
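This calculation can be reproduced in a few lines. Following the text, a world’s rank is taken to be its number of violated causal regularities (\(C \to E\) and \(C \to F\)), with \(C\) holding throughout; the encoding itself is our own sketch:

```python
from itertools import product

# Worlds vary over E and F; C is fixed as true. Rank = number of violated
# causal regularities C→E and C→F at that world.
kappa = {(e, f): (not e) + (not f)
         for e, f in product([True, False], repeat=2)}

def rank(prop):
    return min(r for w, r in kappa.items() if prop(*w))

def beta_given(prop, cond):
    """beta(prop | cond) = kappa(¬prop | cond) - kappa(prop | cond)."""
    base = rank(cond)  # kappa(cond), subtracted by conditionalization
    k_pos = rank(lambda e, f: prop(e, f) and cond(e, f)) - base
    k_neg = rank(lambda e, f: not prop(e, f) and cond(e, f)) - base
    return k_neg - k_pos

E = lambda e, f: e
print(beta_given(E, lambda e, f: f))        # beta(E | C∩F)  → 1
print(beta_given(E, lambda e, f: not f))    # beta(E | C∩¬F) → 1
```

Both conditional beliefs come out equal, so \(F\) makes no difference to \(E\) in the context \(C\): it is not a reason for \(E\).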
Spohn (2006, 2012) considers a number of further causal scenarios, and so he offers ranking-theoretic solutions to the problems of overdetermination, trumping, and various scenarios of preemption. The account of overdetermination parallels probabilistic accounts of this type of causal scenario. Spohn does not explicitly address singular causal relations that lack a corresponding regularity. But there might be a way to argue that a collision of the Earth with a huge meteor—at the time when dinosaurs were roaming the Earth—raises the ranks of those worlds where the dinosaurs stay alive. Such a rise amounts to believing that the dinosaurs could have died because of such a collision.
Why is Spohn’s analysis of causation inferential? Recall that a ranking function represents the epistemic state of an agent, which implies that it determines which beliefs are held by the agent. Moreover, the conditionalization of a ranking function defines how such a function changes if the agent comes to believe a certain proposition. Hence, conditionalization defines an inference relation that maps a set of propositions to a set of propositions. The intended meaning of this inference relation is that, if we come to believe a certain proposition, then it is rational to believe a certain set of propositions. In other words, proposition \(B\) is inferable from proposition \(A\) in the context of a ranking function \(\kappa\) iff \(\beta(B\mid A)>0\), where \(\kappa\) underlies the belief function \(\beta\). Put into symbolic notation:
\[A \infd_\kappa B \;\text{ iff }\; \beta(B\mid A)>0.\]
This notation is not used by Spohn (2012), but it may be helpful to grasp the inferential nature of Spohn’s condition that a cause must be a reason for its effect. Notice that \(\infd_\kappa\) is an enthymematic inference relation in the sense that certain premises are hidden in the background.
The inferential nature of the condition that a cause must be a ranking-theoretic reason for its effect can also be seen from the following two results. First, the conditionalization of a belief function \(\beta\) defines a belief revision scheme (Spohn 1988). Such a scheme tells us how a rational agent should change his or her beliefs in light of new information (Gärdenfors 1988). Second, any belief revision scheme—in the sense of the AGM theory by Alchourrón, Gärdenfors, and Makinson (1985)—defines a nonmonotonic and enthymematic inference relation (Gärdenfors 1991). This result motivates the above definition of the inference relation \(A \infd_\kappa B\).
Spohn’s theory of causation is first expounded in terms of ranking functions which model the epistemic states of some agents. However, it should be noted that Spohn (2012: Ch. 14) aims to spell out the objective counterpart to ranking functions. He seeks to objectify ranking functions by saying what it means that a ranking function represents objective features of the world. The notion of truth may serve as an introductory example. Given \(w\) is the actual world, a ranking function \(\kappa\) is an objective representation of this world iff \(\beta(A)>0\) for all propositions \(A\) such that \(w\in A\). (Proposition \(A\) is construed as a set of possible worlds; \(\beta\) is the belief function of \(\kappa\), as defined above.) This explanation, however, does not yet capture modal and causal features of our world. And so Spohn goes on to explain how modal and causal properties—defined by a given ranking function \(\kappa\)—can be understood as objective features of the world. The details are intricate and complex. They are spelled out in (2012: Ch. 15); the basic ideas are summarized in Spohn (2018). Suffice it to say that Spohn’s objectification of ranking functions is not only a promising attempt at an objective account of causation and modality, but also a way to resolve the tension between an epistemic and a non-epistemic theory of causation.
Drawing on Spohn (2006), Andreas and Günther (2020) analyse causal relations in terms of certain reason relations. This is the basic schema of the analysis: \(C\) is a cause of \(E\) iff
Andreas and Günther specify the first condition by a strengthened Ramsey Test, which emerges from an analysis of the word “because” in natural language. Ramsey (1931) proposed to evaluate conditionals in terms of belief change. Roughly, you accept “if \(A\) then \(C\)” when you believe \(C\) upon supposing \(A\). This Ramsey Test has been pointedly expressed by Stalnaker:
First, add the antecedent (hypothetically) to your stock of beliefs; second, make whatever adjustments are required to maintain consistency (without modifying the hypothetical belief in the antecedent); finally, consider whether or not the consequent is then true. (1968: 102)
Gärdenfors (1988) expressed the Ramsey Test in more formal terms. Where \(K\) denotes an agent’s set of beliefs and \(K \ast A\) the operation of changing \(K\) on the supposition of \(A,\) \(A > C \in K\) if and only if \(C \in K \ast A.\) In the wake of work by Rott (1986), Andreas and Günther suggest strengthening the Ramsey Test as follows:
\(A \gg C\) iff, after suspending judgment about \(A\) and \(C\), an agent can infer \(C\) from the supposition of \(A\) (in the context of further beliefs in the background). (2019: 1230)
This conditional is then used to give a simple analysis of the word “because” in natural language:
\[\text{Because } A, C \text{ (relative to }K)\;\;\;\text{ iff }\;\;\;A \gg C \in K \;\;\; \text{ and }\;\;\; A, C \in K\]
where \(K\) designates the belief set of the agent. This analysis may well be incomplete, but proves useful for analysing causation. The next step toward such an analysis is to impose further constraints on the inferential relations between the antecedent and the consequent. For the expert reader, it may be worth noting that the conditional \(\gg\) is defined in terms of finite belief bases rather than logically closed sets. Inferential relations between antecedent and consequent can therefore be defined syntactically in terms of derivability.
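The strengthened Ramsey Test and the “because” analysis can be illustrated with a toy propositional sketch. Here a belief base is a set of atoms plus Horn-style rules, suspension of judgment simply drops the two atoms in question, and inference is forward chaining. All names (`Rule`, `infer`, `because`) are illustrative assumptions, not the authors' formal apparatus.

```python
# A toy model of the strengthened Ramsey Test A >> C and the analysis
# "Because A, C iff A >> C is accepted and A, C are believed".
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    premises: frozenset  # atoms that must all hold
    conclusion: str

def infer(facts, rules):
    """Forward-chain: close a set of atomic facts under Horn-style rules."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for r in rules:
            if r.premises <= derived and r.conclusion not in derived:
                derived.add(r.conclusion)
                changed = True
    return derived

def ramsey_gg(a, c, beliefs, rules):
    """A >> C: suspend judgment about A and C, suppose A, try to infer C."""
    background = beliefs - {a, c}                 # suspension of judgment
    return c in infer(background | {a}, rules)    # supposition of A

def because(a, c, beliefs, rules):
    """Because A, C (relative to K): A >> C holds, and A and C are believed."""
    return ramsey_gg(a, c, beliefs, rules) and a in beliefs and c in beliefs

# Illustrative example: the agent believes it rains and the street is wet,
# and accepts the generalization "rain => wet".
rules = [Rule(frozenset({"rain"}), "wet")]
beliefs = {"rain", "wet"}

print(because("rain", "wet", beliefs, rules))  # True
print(because("wet", "rain", beliefs, rules))  # False: "rain" is not inferable
```

Note that this toy suspension removes only \(A\) and \(C\) themselves, and the rules stand in for the non-redundant generalizations; the authors' belief-base machinery is considerably more refined.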
How may the conditional \(\gg\) be used to analyse causation? Suppose the antecedent \(C\) designates a presumed cause and the consequent \(E\) a putative effect. It is first required by Andreas and Günther (2020) that we can infer \(E\) from \(C\) in a forward-directed manner. That is, there is an inferential path between \(C\) and \(E\) such that, for every inferential step, whenever we infer the occurrence of an event, no premise asserts the occurrence of an event that is temporally later than the inferred event. This condition is a refined and slightly modified variant of Hume’s requirement according to which a cause always precedes its effect.
Second, the inferential path between \(C\) and \(E\) must be such that every intermediate conclusion is consistent with our beliefs about the actual world. The idea is that there must be an inferential path between the presumed cause and the putative effect that tells us how the effect is brought about by the cause. This inferential representation of how the effect is produced by the cause must be consistent with what we believe the world is like.
Third, every generalization—used in the inferential path between \(C\) and \(E\)—must be non-redundant in the following sense: it must not be possible to derive the generalization from other explicit beliefs. (A generalization is simply a universal sentence, which may represent a strict or non-strict law.) By means of this condition, Andreas and Günther (2020) try to solve the problem of spurious causation in common cause scenarios. Suppose there is a thunderstorm. Then, we have an electrical discharge that causes a flash which is followed by thunder. The flash precedes the thunder, and there is a regular connection between these two types of events. But we would not say that the flash is a cause of the thunder. So, we need to discriminate between forward-directed inferences with a causal meaning and other forward-directed inferences that lack such a meaning. Note that we can derive the event of the flash and the event of the thunder from the electrical discharge in a forward-directed manner using only non-redundant generalizations, that is, universal sentences that we think are relatively fundamental laws of nature. Among these are the laws of electrodynamics, atomic and acoustic theory, and optics. However, the only inferential path from the flash to the thunder that is forward-directed requires redundant generalizations: universal sentences that we can derive from more fundamental laws (see Andreas 2019 for details). The strategy is that spurious causal relations can be excluded by requiring the generalizations used in the inferential path to be non-redundant. This strategy works in the mentioned common cause scenario. It remains to be seen whether this strategy works in full generality.
In sum, \(C\) is a cause of \(E\)—relative to a belief state—iff
Obviously, this analysis is epistemic, just as Hume’s analysis of causation in the mind and Ramsey’s and Spohn’s analyses are. Condition (ii) is needed because this condition is not implied by the requirement that the inferential path between \(C\) and \(E\) is forward-directed. (Such an implication does not hold because, strictly speaking, forward-directedness of the inferential path means that no inferential step is backward-directed.) The problem of instantaneous causation is addressed in Andreas (2019), at least for some causal scenarios. Andreas and Günther (2020) deliver rigorous solutions to a number of causal scenarios, including overdetermination, early and late preemption, switches, and some scenarios of prevention. Singular causal relations without a regular connection among \(C\)-type and \(E\)-type events are not addressed.
The kairetic account by Strevens (2004, 2008) analyses causation in terms of causal models. It is driven by the idea that every causal claim is grounded in a causal-explanatory claim. Causation and explanation go hand in hand. Strevens uses the notion of a causal model to define a relation of entailment with a causal meaning. A set of propositions entails an explanandum \(E\) in a causal model only if this entailment corresponds to a “real causal process by which \(E\) is causally produced” (Strevens 2004: 165). Causal models are assumed to be founded in physical facts about causal influence, and it is assumed that these facts “can be read off the true theory of everything” (Strevens 2004: 165).
The kairetic account parallels the DN account of causation in some respects. The logical form of the explanans is basically the same as in the DN model. We have laws and propositions about events. A causal model is here simply a set of causal laws taken together with a set of propositions about potential causal factors. Unlike the DN account of causation, the kairetic analysis uses causal notions in the analysans. More specifically, the notion of entailment with a causal meaning is taken as antecedently given. In the previous section, we mentioned Andreas’s (2019) attempt to further specify the relation of logical entailment with a causal meaning using non-modal first-order logic, belief revision theory, and some further non-logical concepts. This specification may be taken to supplement the kairetic account.
In Strevens (2004), the kairetic account is essentially motivated by three types of problems: first, causation by conditions that are most of the time, but not always, sufficient to produce a corresponding effect; second, problems of preemption; third, the problem of how fine-grained the description of a cause should be. Consideration of these problems leads to a sophisticated and powerful analysis of causation, which is based on the notion of an explanatory kernel for an event. Such a kernel contains laws and propositions about events. Given the veridical causal model \(M\), the explanatory kernel for an event \(E\) is the causal model \(K\) such that (i) \(M\) entails \(K\), (ii) \(K\) entails \(E\), and (iii) \(K\) optimizes two further properties called generality and cohesion. Generality means that \(K\) encompasses as many physical systems as possible. Cohesion is a measure of the degree to which potential causal factors are active in all systems satisfying \(K\) (Strevens 2004: 171). Eventually, Strevens defines \(C\) to be a causal factor for \(E\) iff it is a member of some explanatory kernel for \(E\). Causes, in the kairetic account, are difference-makers with respect to the effect. If we were to take an actual cause out of an explanatory kernel for \(E\), the remainder of the kernel would not produce the putative effect \(E\).
Let us try to understand why the description of the cause should be maximally general. There are three reasons for this. First, this requirement allows us to eliminate potential causal factors that do not contribute to the causal production of the effect. A kernel \(K\) that does mention such a factor—in addition to the “active” causal factors—is less general than a kernel \(K'\) that mentions only the “active” causal factors. Hence, \(K'\) should be preferred over \(K\), other things being equal.
Put simply, some elements of a veridical causal model are not involved in the production of the effect. Other elements are involved, but their influence can be neglected, provided the description of the effect is not overly fine-grained. For example, the gravitational forces of other planets may influence the trajectory of a rock thrown near the Earth’s surface. But these forces do not make a difference to whether or not the rock hits a fragile object. Hence, the kernel should not contain such forces. More generally, a kernel for \(E\) should not contain elements of \(M\) that are involved in the production of \(E\) without actually making a difference to it. This gives us a second reason for the requirement of generality.
These two reasons for optimizing generality are motivated by a consideration of difference-making: each member of the kernel should make a difference to the production of the effect. If some member of the kernel is absent, the remainder of the kernel does not suffice to bring about the effect. Otherwise, it is not a proper kernel for the effect.
The third reason for optimizing generality of the kernel is a bit more intricate. Suppose a brick of four kilograms is thrown against a window such that the window shatters. Then, for this effect to be brought about, it is often not necessary that the brick has a weight of exactly four kilograms. A brick that is slightly lighter and thrown with the same velocity would also do. Nor does it have to be a brick, as we know from the rock-throwing examples. So it seems more accurate to say that the shattering of the window was caused by the fact or event that a rigid, non-elastic, and sharp object was thrown against the window in such a manner that the object’s impact was within a certain interval. The latter description is obviously more general than the former in the sense that more physical systems satisfy it. In brief, the kernel for \(E\) should not contain aspects of causally relevant factors that do not make a difference to \(E\).
The requirement of cohesion is needed to exclude disjunctions of causal factors from a kernel. Suppose \(C_1\) is a potential, but inactive causal factor, while \(C_2\) is active. That is, \(C_2\) is actually contributing to the causal production of \(E\), and \(C_1\) is not. Then neither \(C_1\) nor \(C_1 \vee C_2\) should be part of the kernel. We have just seen how \(C_1\) is excluded from the kernel by the requirement to maximize generality. \(C_1 \vee C_2\) is excluded by requiring that the kernel should be maximally cohesive. Other things being equal, kernels with \(\{C_1 \vee C_2\}\) and \(\{C_1, C_1 \vee C_2\}\) are less cohesive than the kernel containing just \(\{C_2\}\).
Strevens (2008) is more comprehensive, and addresses a number of further topics, most of which are primarily concerned with the notion of an explanation. Notably, an account is offered for singular causal relations without a corresponding regular connection between the presumed cause and the putative effect (Ch. 11).
In his seminal book on causality, Pearl (2000) expounded a formal framework of causal models, which is centred on structural equations. A structural equation
\[v_i=f_i(pa_i, u_i),\;\; i=1, \ldots, n\]
tells us that the values \(pa_i\) of certain parent variables determine the value \(v_i\) of a child variable according to a function \(f_i\) in the context of a valuation \(u_i\) of background variables. At bottom, a causal model is a set of structural equations. Each equation stands for some causal law or mechanism. We can view these equations as representing elementary causal dependencies. The notion of a parent variable relies on causal graphs, which represent elementary causal relations by directed edges between nodes. Each node stands for a distinct variable of the causal model. Causal models have been used to study both deterministic and probabilistic causation. (For details, see the entry on causal models.)
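How a set of structural equations determines a valuation can be sketched as follows: each endogenous variable is computed from its parents and an exogenous background value, mirroring \(v_i = f_i(pa_i, u_i)\). The variable names and mechanisms are illustrative assumptions, and the sketch assumes the causal graph is acyclic.

```python
# A minimal structural-model sketch: equations maps each variable to
# (parents, f), where f computes the variable's value from the parents'
# values and an exogenous background value u.
def evaluate(equations, u):
    """Return the valuation determined by the equations for background u."""
    values = {}
    def value_of(v):
        if v not in values:
            parents, f = equations[v]
            values[v] = f([value_of(p) for p in parents], u.get(v))
        return values[v]
    for v in equations:
        value_of(v)
    return values

# Illustrative model: an electrical discharge causes both flash and thunder.
equations = {
    "discharge": ([], lambda pa, u: u),
    "flash":     (["discharge"], lambda pa, u: pa[0]),
    "thunder":   (["discharge"], lambda pa, u: pa[0]),
}
print(evaluate(equations, {"discharge": 1}))
# {'discharge': 1, 'flash': 1, 'thunder': 1}
```

The asymmetry of the equations shows up in the code: parents are read, the child is written, never the other way around.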
A distinctive feature of the semantics of structural equations is that it encodes some notion of asymmetric determination. The values of the parent variables determine the value of the child variable, but not the other way around. Together with Halpern, Pearl devised several counterfactual analyses of causation in the framework of causal models. These analyses incorporate elements of the inferential approach to causation. To be more precise, condition AC2(a) of the original and the updated Halpern-Pearl definition of actual causality is counterfactual, while AC2(b) is inferential. The latter condition requires that a given effect must be a consequence of its cause for a range of background conditions. (See Halpern (2016) for details.) Notably, the framework of causal models has also been used to develop inferential approaches to causation that work without a counterfactual condition along the lines of Lewis (1973). We explain the basic concepts of the accounts by Beckers and Vennekens (2018) and Bochman (2018).
Beckers and Vennekens (2018) begin by explaining what it is for the value of a variable to determine the value of another variable—according to a structural equation—in a context \(L\). This explanation gives us the notion of \(C\) being a direct possible contributing cause of \(E\). Such a direct possible contributing cause is actual if \(C\), \(E\), and the context \(L\) are actual. For \(C\) to be an actual contributing cause of \(E\), there must be a sequence
\[\langle C, C_1, \ldots, C_n, E\rangle\quad (n\geq 1)\]
such that each element of the sequence is a direct actual contributing cause of its successor (if there is one). Finally, \(C\) is a cause of \(E\) iff
The temporal constraint basically requires that the events of the sequence \(\langle C, C_1, \ldots, C_n, E\rangle\) are temporally ordered, i.e., no element in the sequence occurs later than its successor.
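The chaining-plus-temporal condition can be sketched as a reachability check along direct-cause edges. The sketch glosses over how direct actual contributing causes are determined from the structural equations and over the exact length requirement on the sequence; the edges and timestamps below are illustrative assumptions.

```python
# Is there a temporally ordered chain of direct actual contributing
# causes from c to e? direct maps an event to the set of events it
# directly contributes to; time gives each event's occurrence time.
def is_cause(c, e, direct, time):
    stack, seen = [c], {c}
    while stack:
        x = stack.pop()
        for y in direct.get(x, ()):
            if time[x] > time[y]:   # no element may occur later than its successor
                continue
            if y == e:
                return True
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False

# Illustrative common cause scenario: discharge contributes to both
# flash and thunder, but flash does not contribute to thunder.
direct = {"discharge": {"flash", "thunder"}}
time = {"discharge": 0, "flash": 1, "thunder": 2}
print(is_cause("discharge", "thunder", direct, time))  # True
print(is_cause("flash", "thunder", direct, time))      # False: no direct link
```

Because the chains are built only from direct causal dependencies encoded in the model, the spurious flash-to-thunder relation never arises, which matches the point made above about common cause scenarios.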
Beckers and Vennekens (2018) deliver rigorous solutions to a number of causal problems, including overdetermination, early and late preemption, switches, and some scenarios of prevention. The account is non-reductive since structural equations represent elementary causal dependencies. The framework of causal models does not give us a further analysis of such dependencies. The benefit of this strategy is that the problem of spurious causal relations in common cause scenarios does not arise in the first place. The problem of singular causal relations is not explicitly addressed, but it allows for a trivial solution in the framework of causal models. We can build a causal model with a structural equation saying that a collision of the Earth with a huge meteor leads to the extinction of the dinosaurs. A non-trivial solution is as challenging as in the other inferential approaches. It seems very difficult to cite all the different causal laws and background conditions that causally determine the dinosaurs to die.
The analysis by Beckers and Vennekens (2018) is inferential in spirit, but not explicitly inferential in the sense that they define a logical inference system with structural equations. Bochman (2018) goes a step further in this respect. His analysis is centred on what computer scientists call a production inference relation. This inference relation is characterized by certain metalogical axioms, such as Strengthening (of the antecedent), Weakening (of the consequent), Cut, and Or. To get some understanding of a production inference relation, it suffices to know that it can be reduced to a set of rules
\[A_1 \wedge \ldots \wedge A_k \Rightarrow B_1 \vee \ldots \vee B_n\]
where \(A_1, \ldots, A_k\) and \(B_1, \ldots, B_n\) are propositional literals, i.e., propositional atoms or negations thereof. Such a rule represents an elementary causal dependency in a manner akin to a structural equation. Accordingly, the symbol \(\Rightarrow\) stands for some notion of asymmetric determination. Bochman (2018) shows how structural equations using only binary variables may be translated into a set of rules.
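The translation of a binary structural equation into rules of this form can be sketched as follows: one rule per valuation of the parent variables, with the child literal as head. This is an illustrative reconstruction under simplifying assumptions, not Bochman's exact construction.

```python
# Translate a binary structural equation var = f(parents) into rules
# A1 & ... & Ak => B, writing the negation of an atom p as "~p".
from itertools import product

def equation_to_rules(var, parents, f):
    rules = []
    for vals in product([False, True], repeat=len(parents)):
        body = tuple(p if v else "~" + p for p, v in zip(parents, vals))
        head = var if f(*vals) else "~" + var
        rules.append((body, head))
    return rules

# Illustrative equation: flash = discharge.
for body, head in equation_to_rules("flash", ["discharge"], lambda d: d):
    print(" & ".join(body), "=>", head)
# ~discharge => ~flash
# discharge => flash
```

Each line of output is one production rule; the asymmetry of \(\Rightarrow\) reflects that the parents' literals occur only in rule bodies and the child's literal only in heads.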
It seems as if we could now define \(C\) to be a cause of \(E\) iff we can infer \(E\) from \(C\) in the context \(K\). But things are more complicated. The challenge is that the inferential path between \(C\) and \(E\) must also represent some active causal path, i.e., a sequence of events such that each event actually caused its successor (if there is one). In other words, we need some notion of an inferential path that corresponds to a causal path in the actual setting of events. Once such a notion is established, Bochman defines \(C\) to be a cause of \(E\) iff (i) we can infer \(E\) from \(C\) in the context \(K\), while (ii) we cannot infer \(E\) from the context \(K\) alone. The details of this inferential account are a bit intricate, and so we must refer the reader to Bochman (2018) for a full understanding. There, we can also find rigorous solutions to a number of causal problems, including overdetermination, early and late preemption, and some scenarios of prevention. It is worth noting, finally, that Bochman presents his analysis as an inferential explication of NESS conditions.
Regularity theories of causation have lost acceptance among philosophers to a considerable extent. One reason is David Lewis’s (1973) influential attack on the regularity approach which concludes that its prospects are dark. And indeed, the problem of spurious causation—to pick just one—was severe for the regularity and inferential theories back then. Mackie’s INUS account and simple deductive-nomological accounts alike cannot properly distinguish genuine from spurious causes, for example in common cause scenarios. In the meantime, however, regularity and inferential theories have been developed that offer at least tentative solutions to the problem of spurious causation, for instance Baumgartner (2013), Spohn (2012), and Andreas and Günther (2020) (which are presented in §1.4, §2.2, and §2.3, respectively).
Another challenge is the direction of causation. David Lewis (1973, 1979) thought that counterfactuals themselves yield an account of the direction of causation. This account, however, turned out to be rather controversial and is rarely adopted (see, e.g., Frisch 2014: 204n). To determine the direction of causation, regularity and inferential theories tend to rely on broadly Humean constraints on the temporal order between cause and effect. This move comes at the cost of excluding backwards-in-time causation. We must wonder how severe this problem is. So far, there is neither a commonly agreed understanding of the very idea of backwards causation nor actual empirical evidence for it (cf. the entry on backwards causation). The direction of causation remains a point of controversy within regularity, inferential, and counterfactual approaches.
To conclude, the contemporary regularity and inferential theories have made some remarkable progress. Valid criticisms have been largely overcome and many causal scenarios—which continue to be challenging for counterfactual accounts of causation—have been solved. In particular, Baumgartner (2013), Beckers and Vennekens (2018), and Andreas and Günther (2020) all solve the set of scenarios including overdetermination, early and late preemption, and switches. In light of these recent developments, it should not be considered evident anymore that counterfactual theories of causation have a clear edge over regularity and inferential theories.
causal models | causation: and manipulability | causation: backward | causation: counterfactual theories of | causation: in the law | causation: probabilistic | Hume, David | laws of nature | Lewis, David | quantum mechanics: retrocausality | supervenience
We would like to thank Michael Baumgartner, Sander Beckers, Wolfgang Spohn, Michael Strevens, and an anonymous referee for very valuable advice on an earlier version of this entry. Of course, we remain responsible for the final version.