1. We write variables \(X\), \(Y\), etc., in italics and their (binary) realizations \(\r{X}\), \(\neg\r{X}\), etc., in roman script.
2. Whereas theorists of probabilistic causality viewed events as the causal relata, throughout this article we use variables, which can represent a range of relata. We write binary and numerical variables \(V\) in italics and their instantiations \(\r{V}\) and \(\neg\r{V}\) in roman letters.
3. Reichenbach’s Principle of the Common Cause (1956: 163) states that if two events are probabilistically dependent but neither is the cause of the other, the dependence must be explained by a common cause. This principle has been generalized to the causal Markov condition in graphical approaches. Malinas (2001: 277) falsely claims that Simpson’s Paradox leads to counterexamples to Reichenbach’s principle, since any probabilistic dependency will be screened off by an indefinite number of partitioning variables, many of which are not common causes. But the principle does not entail that all screening-off variables are common causes.
4. Skyrms (1980) gives the weaker requirement that causes must raise the probabilities of their effects in some contexts and lower them in none. In the econometrics literature, this assumption has been studied extensively under the label “monotonicity” (Imbens & Angrist 1994).
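One way to state this requirement formally is sketched here. The symbols are assumptions introduced for illustration, not Skyrms’s own notation: \(\r{C}\) is the candidate cause, \(\r{E}\) the effect, and \(B_1, \ldots, B_n\) a partition into background contexts.

\[
p(\r{E} \mid \r{C}, B_i) \geq p(\r{E} \mid \neg\r{C}, B_i) \text{ for all } i,
\quad\text{and}\quad
p(\r{E} \mid \r{C}, B_j) > p(\r{E} \mid \neg\r{C}, B_j) \text{ for some } j.
\]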
5. In practical contexts, knowing the average effect of a treatment in a population may be of limited use for determining how an individual in the population would respond to the treatment. But this is an issue with averages generally, and does not make average effects any less genuinely causal (cf. Hausman 2010). See Pearl (2000 [2009: 396–400]) for further results about the quantitative relationship between the effects in populations and the effects for individuals.
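The gap between population-level and individual-level effects can be illustrated with a minimal sketch. Everything in it is hypothetical: the two populations, their counts, and the representation of each individual as a pair of potential outcomes `(y0, y1)` (outcome without and with treatment) are assumptions made for illustration and are not drawn from the sources cited.

```python
from fractions import Fraction

# Hypothetical populations (illustrative counts only). Each individual is a
# pair (y0, y1): the outcome without treatment and the outcome with treatment.
population_a = [(0, 1)] * 2 + [(0, 0)] * 8                  # 2 helped, 8 unaffected
population_b = [(0, 1)] * 5 + [(1, 0)] * 3 + [(0, 0)] * 2   # 5 helped, 3 harmed, 2 unaffected


def average_effect(population):
    """Mean of y1 - y0 over the population."""
    total = sum(y1 - y0 for y0, y1 in population)
    return Fraction(total, len(population))


def share_harmed(population):
    """Fraction of individuals whose outcome is worse under treatment."""
    harmed = sum(1 for y0, y1 in population if y1 < y0)
    return Fraction(harmed, len(population))


if __name__ == "__main__":
    for name, pop in [("A", population_a), ("B", population_b)]:
        print(name, average_effect(pop), share_harmed(pop))
```

Both hypothetical populations have the same average effect, yet in one nobody is harmed by the treatment while in the other a sizable minority is. The average alone does not distinguish the two.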
6. Here (and below) the qualification “typically” is a placeholder for “assuming the causal Faithfulness condition” (see Weinberger 2018).
7. Note that while the back-door criterion is sufficient for identifiability, it is not necessary. For example, Pearl’s front-door criterion (2000 [2009: 82]) licenses identifiability in certain scenarios in which one cannot block all back-door paths. In such a case (and many others) the probabilistic formula identifying the effect will be more complicated, and the relationship between the effect in the population and subpopulations will be less transparent from looking at the formula.
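For comparison, the two identifying formulas in their standard textbook form are given here. The symbols are generic assumptions for illustration: \(x\) and \(y\) are values of the treatment and outcome, \(z\) ranges over values of a set \(Z\) satisfying the back-door criterion, and \(w\) ranges over values of a variable \(W\) satisfying the front-door criterion.

\[
\text{Back-door:}\quad p(y \mid \do(x)) = \sum_{z} p(y \mid x, z)\, p(z)
\]

\[
\text{Front-door:}\quad p(y \mid \do(x)) = \sum_{w} p(w \mid x) \sum_{x'} p(y \mid x', w)\, p(x')
\]

In the back-door formula the population effect is a weighted average of the associations within the subpopulations defined by \(z\). The front-door formula has no such direct reading, because it nests a second sum over treatment values \(x'\).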
8. Here we follow Pearl in assuming that in decision-theoretic contexts, actions should be modeled as interventions. See Stern 2019 for critical discussion.
9. Whether sampling assumptions in fact have “nothing” to do with causality is non-trivial, and has not been addressed in the literature. For instance, if, following Malinas (2001), the variables \(T\), \(R\), and \(M\) refer to letters on balls in an urn in the same proportions as in table 1, then sorting the balls into two urns based on whether they have an \(M\) or not prior to drawing from one of the urns would be probabilistically equivalent to intervening on \(M\).
10. Fitelson (2017: 305 fn. 17) conjectures that there are cases in which \(T\) has a minor influence on \(M\), and thus \(p(\r{T}\mid \do(\r{M})) \neq p(\r{T})\), but where Simpson’s reversals would still seem paradoxical. These test cases suggest a basis for empirically distinguishing between Fitelson’s and Pearl’s explanations of the paradox.
11. Cases such as this one, in which certain factors change systematically with time, need to be modeled using non-stationary time-series. The methods for establishing probabilistic association among non-stationary time-series are distinct from those for stationary time-series, and thus pose additional problems for causal inference from probabilities (Hoover 2003). These methods are beyond the scope of the present entry.
12. The discussion here follows Sober (2000 [2018]).
The Stanford Encyclopedia of Philosophy is copyright © 2023 by The Metaphysics Research Lab, Department of Philosophy, Stanford University
Library of Congress Catalog Data: ISSN 1095-5054