Stanford Encyclopedia of Philosophy

Logic and Probability

First published Thu Mar 7, 2013; substantive revision Thu Aug 17, 2023

Logic and probability theory are two of the main tools in the formal study of reasoning, and have been fruitfully applied in areas as diverse as philosophy, artificial intelligence, cognitive science and mathematics. This entry discusses the major proposals to combine logic and probability theory, and attempts to provide a classification of the various approaches in this rapidly developing field.


1. Combining Logic and Probability Theory

The very idea of combining logic and probability might look strange at first sight (Hájek 2001). After all, logic is concerned with absolutely certain truths and inferences, whereas probability theory deals with uncertainties. Furthermore, logic offers a qualitative (structural) perspective on inference (the deductive validity of an argument is based on the argument’s formal structure), whereas probabilities are quantitative (numerical) in nature. However, as will be shown in the next section, there are natural senses in which probability theory presupposes and extends classical logic. Furthermore, historically speaking, several distinguished theorists such as De Morgan (1847), Boole (1854), Ramsey (1926), de Finetti (1937), Carnap (1950), Jeffrey (1992) and Howson (2003, 2007, 2009) have emphasized the tight connections between logic and probability, or even considered their work on probability as a part of logic itself.

By integrating the complementary perspectives of qualitative logic and numerical probability theory, probability logics are able to offer highly expressive accounts of inference. It should therefore come as no surprise that they have been applied in all fields that study reasoning mechanisms, such as philosophy, artificial intelligence, cognitive science and mathematics. The downside to this cross-disciplinary popularity is that terms such as ‘probability logic’ are used by different researchers in different, non-equivalent ways. Therefore, before moving on to the actual discussion of the various approaches, we will first delineate the subject matter of this entry.

The most important distinction is that between probability logic and inductive logic. Classically, an argument \(A\) is said to be (deductively) valid if and only if it is impossible that the premises of \(A\) are all true, while its conclusion is false. In other words, deductive validity amounts to truth preservation: in a valid argument, the truth of the premises guarantees the truth of the conclusion. In some arguments, however, the truth of the premises does not fully guarantee the truth of the conclusion, but it still renders it highly likely. A typical example is the argument with premises ‘The first swan I saw was white’, …, ‘The 1000th swan I saw was white’, and conclusion ‘All swans are white’. Such arguments are studied in inductive logic, which makes extensive use of probabilistic notions, and is therefore considered by some authors to be related to probability logic. There is some discussion about the exact relation between inductive logic and probability logic, which is summarized in the introduction of Kyburg (1994). The dominant position (defended by Adams and Levine (1975), among others), which is also adopted here, is that probability logic entirely belongs to deductive logic, and hence should not be concerned with inductive reasoning. Still, most work on inductive logic falls within the ‘probability preservation’ approach, and is thus closely connected to the systems discussed in Section 2. For more on inductive logic, the reader can consult Jaynes (2003), Fitelson (2006), Romeijn (2011), and the entries on the problem of induction and inductive logic of this encyclopedia.

We will also steer clear of the philosophical debate over the exact nature of probability. The formal systems discussed here are compatible with all of the common interpretations of probability, but obviously, in concrete applications, certain interpretations of probability will fit more naturally than others. For example, the modal probability logics discussed in Section 4 are, by themselves, neutral about the nature of probability, but when they are used to describe the behavior of a transition system, their probabilities are typically interpreted in an objective way, whereas modeling multi-agent scenarios is accompanied most naturally by a subjective interpretation of probabilities (as agents’ degrees of belief). This topic is covered in detail in Gillies (2000), Eagle (2010), and the entry on interpretations of probability of this encyclopedia.

A recent trend in the literature has been to focus less on integrating or combining logic and probability theory into a single, unified framework, but rather to establish bridges between the two disciplines. This typically involves trying to capture the qualitative notions of logic in the quantitative terms of probability theory, or the other way around. We will not be able to do justice to the wide variety of approaches in this booming area, but interested readers can consult Leitgeb (2013, 2014), Lin and Kelly (2012a, 2012b), Douven and Rott (2018), and Harrison-Trainor, Holliday and Icard (2016, 2018). A ‘contemporary classic’ in this area is Leitgeb (2017), while van Benthem (2017) offers a useful survey and some interesting programmatic remarks.

Finally, although the success of probability logic is largely due to its various applications, we will not deal with these applications in any detail. For example, we will not assess the use of probability as a formal representation of belief in philosophy (Bayesian epistemology) or artificial intelligence (knowledge representation), and its advantages and disadvantages with respect to alternative representations, such as generalized probability theory (for quantum theory), \(p\)-adic probability, and fuzzy logic. For more information about these topics, the reader can consult Gerla (1994), Vennekens et al. (2009), Hájek and Hartmann (2010), Hartmann and Sprenger (2010), Ilić-Stepić et al. (2012), and the entries on formal representations of belief, Bayesian epistemology, defeasible reasoning, quantum logic and probability theory, and fuzzy logic of this encyclopedia.

With these clarifications in place, we are now ready to look at what will be discussed in this entry. The most common strategy to obtain a concrete system of probability logic is to start with a classical (propositional/modal/etc.) system of logic and to ‘probabilify’ it in one way or another, by adding probabilistic features to it. There are various ways in which this probabilification can be implemented. One can study probabilistic semantics for classical languages (which do not have any explicit probabilistic operators), in which case the consequence relation itself gets a probabilistic flavor: deductive validity becomes ‘probability preservation’, rather than ‘truth preservation’. This direction will be discussed in Section 2. Alternatively, one can add various kinds of probabilistic operators to the syntax of the logic. In Section 3 we will discuss some initial, rather basic examples of probabilistic operators. The full expressivity of modal probabilistic operators will be explored in Section 4. Finally, languages with first-order probabilistic operators will be discussed in Section 5.

2. Propositional Probability Logics

In this section, we will present a first family of probability logics, which are used to study questions of ‘probability preservation’ (or dually, ‘uncertainty propagation’). These systems do not extend the language with any probabilistic operators, but rather deal with a ‘classical’ propositional language \(\mathcal{L}\), which has a countable set of atomic propositions, and the usual truth-functional (Boolean) connectives.

The main idea is that the premises of a valid argument can be uncertain, in which case (deductive) validity imposes no conditions on the (un)certainty of the conclusion. For example, the argument with premises ‘if it will rain tomorrow, I will get wet’ and ‘it will rain tomorrow’, and conclusion ‘I will get wet’ is valid, but if its second premise is uncertain, its conclusion will typically also be uncertain. Propositional probability logics represent such uncertainties as probabilities, and study how they ‘flow’ from the premises to the conclusion; in other words, they do not study truth preservation, but rather probability preservation. The following three subsections discuss systems that deal with increasingly more general versions of this issue.

2.1 Probabilistic Semantics

We begin by recalling the notion of a probability function for the propositional language \(\mathcal{L}\). (In mathematics, probability functions are usually defined for a \(\sigma\)-algebra of subsets of a given set \(\Omega\), and required to satisfy countable additivity; cf. Section 4.3. In logical contexts, however, it is often more natural to define probability functions ‘immediately’ for the logic’s object language (Williamson 2002). Because this language is finitary—all its formulas have finite length—it also suffices to require finite additivity.) A probability function (for \(\mathcal{L}\)) is a function \(P: \mathcal{L}\to\mathbb{R}\) satisfying the following constraints:

Non-negativity. \(P(\phi)\geq 0\) for all \(\phi\in\mathcal{L}.\)

Tautologies. If \(\models\phi\), then \(P(\phi)=1.\)

Finite additivity. If \(\models\neg(\phi\wedge\psi)\), then \(P(\phi\vee\psi) = P(\phi)+P(\psi).\)

In the second and third constraint, the \(\models\)-symbol denotes (semantic) validity in classical propositional logic. The definition of probability functions thus requires notions from classical logic, and in this sense probability theory can be said to presuppose classical logic (Adams 1998, 22). It can easily be shown that if \(P\) satisfies these constraints, then \(P(\phi)\in[0,1]\) for all formulas \(\phi\in\mathcal{L}\), and \(P(\phi) = P(\psi)\) for all formulas \(\phi,\psi\in\mathcal{L}\) that are logically equivalent (i.e. such that \(\models\phi\leftrightarrow\psi\)).
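These constraints can be checked concretely when a probability function is induced by a mass distribution over classical valuations. The following sketch is our own illustration, not part of the entry: formulas are encoded as Python predicates on valuations, and \(P(\phi)\) is defined as the total mass of the valuations satisfying \(\phi\), so the three constraints hold by construction.

```python
from itertools import product

# Illustrative sketch (our own example): induce P from an arbitrary
# mass distribution over the four valuations of two atoms p, q.
ATOMS = ["p", "q"]
VALS = [dict(zip(ATOMS, bits)) for bits in product([True, False], repeat=2)]
MASS = [0.4, 0.3, 0.2, 0.1]            # non-negative, sums to 1

def prob(formula):
    """P(phi): total mass of the valuations where phi is true."""
    return sum(m for v, m in zip(VALS, MASS) if formula(v))

phi = lambda v: v["p"]                  # the formula p
psi = lambda v: v["q"] and not v["p"]   # the formula q & ~p, disjoint from p

# Non-negativity:
assert prob(phi) >= 0
# Tautologies: p v ~p is valid, so it receives probability 1.
assert abs(prob(lambda v: v["p"] or not v["p"]) - 1) < 1e-9
# Finite additivity: phi and psi are mutually exclusive.
assert abs(prob(lambda v: phi(v) or psi(v)) - (prob(phi) + prob(psi))) < 1e-9
```

Any assignment of non-negative masses summing to 1 yields, in this way, a function satisfying the three constraints above.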

We now turn to probabilistic semantics, as defined in Leblanc (1983). An argument with premises \(\Gamma\) and conclusion \(\phi\)—henceforth denoted as \((\Gamma,\phi)\)—is said to be probabilistically valid, written \(\Gamma\models_p\phi\), if and only if:

for all probability functions \(P:\mathcal{L}\to\mathbb{R}\):
if \(P(\gamma) = 1\) for all \(\gamma\in\Gamma\), then also \(P(\phi) = 1\).

Probabilistic semantics thus replaces the valuations \(v:\mathcal{L}\to\{0,1\}\) of classical propositional logic with probability functions \(P:\mathcal{L}\to \mathbb{R}\), which take values in the real unit interval \([0,1]\). The classical truth values of true (1) and false (0) can thus be regarded as the endpoints of the unit interval \([0,1]\), and likewise, valuations \(v:\mathcal{L}\to\{0,1\}\) can be regarded as degenerate probability functions \(P:\mathcal{L}\to[0,1]\). In this sense, classical logic is a special case of probability logic, or equivalently, probability logic is an extension of classical logic.

It can be shown that classical propositional logic is (strongly) sound and complete with respect to probabilistic semantics:

\[\Gamma \models_p \phi \text{ if and only if } \Gamma \vdash\phi.\]

Some authors interpret probabilities as generalized truth values (Reichenbach 1949, Leblanc 1983). According to this view, probability logic is just a particular kind of many-valued logic, and probabilistic validity boils down to ‘truth preservation’: truth (i.e. probability 1) carries over from the premises to the conclusion. Other logicians, such as Tarski (1936) and Adams (1998, 15), have noted that probabilities cannot be seen as generalized truth values, because probability functions are not ‘extensional’; for example, \(P(\phi\wedge\psi)\) cannot be expressed as a function of \(P(\phi)\) and \(P(\psi)\). More discussion on this topic can be found in Hailperin (1984).
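The failure of extensionality can be exhibited with a two-atom model. In the sketch below (our own example, not from the entry), two probability functions agree on \(P(p)\) and \(P(q)\) yet disagree on \(P(p\wedge q)\), so the probability of a conjunction cannot be a function of the probabilities of its conjuncts.

```python
from itertools import product

# Two probability functions over the valuations of p, q that agree on
# the marginals P(p) and P(q) but differ on P(p & q).
VALS = [dict(zip("pq", bits)) for bits in product([True, False], repeat=2)]

def prob(mass, formula):
    return sum(m for v, m in zip(VALS, mass) if formula(v))

independent = [0.25, 0.25, 0.25, 0.25]   # p and q independent
correlated  = [0.50, 0.00, 0.00, 0.50]   # p and q perfectly correlated

p_ = lambda v: v["p"]
q_ = lambda v: v["q"]
both = lambda v: v["p"] and v["q"]

for mass in (independent, correlated):
    assert prob(mass, p_) == 0.5 and prob(mass, q_) == 0.5

print(prob(independent, both), prob(correlated, both))  # 0.25 0.5
```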

Another possibility is to interpret a sentence’s probability as a measure of its (un)certainty. For example, the sentence ‘Jones is in Spain at the moment’ can have any degree of certainty, ranging from 0 (maximal uncertainty) to 1 (maximal certainty). (Note that 0 is actually a kind of certainty, viz. certainty about falsity; however, in this entry we follow Adams’ terminology (1998, 31) and interpret 0 as maximal uncertainty.) According to this interpretation, the following theorem follows from the strong soundness and completeness of probabilistic semantics:

Theorem 1. Consider a deductively valid argument \((\Gamma,\phi)\). If all premises in \(\Gamma\) have probability 1, then the conclusion \(\phi\) also has probability 1.

This theorem can be seen as a first, very partial clarification of the issue of probability preservation (or uncertainty propagation). It says that if there is no uncertainty whatsoever about the premises, then there cannot be any uncertainty about the conclusion either. In the next two subsections we will consider more interesting cases, when there is non-zero uncertainty about the premises, and ask how it carries over to the conclusion.

Finally, it should be noted that although this subsection only discussed probabilistic semantics for classical propositional logic, there are also probabilistic semantics for a variety of other logics, such as intuitionistic propositional logic (van Fraassen 1981b, Morgan and Leblanc 1983), modal logics (Morgan 1982a, 1982b, 1983, Cross 1993), classical first-order logic (Leblanc 1979, 1984, van Fraassen 1981b), relevant logic (van Fraassen 1983) and nonmonotonic logic (Pearl 1991). All of these systems share a key feature: the logic’s semantics is probabilistic in nature, but probabilities are not explicitly represented in the object language; hence, they are much closer in nature to the propositional probability logics discussed here than to the systems presented in later sections.

Most of these systems are not based on unary probabilities \(P(\phi)\), but rather on conditional probabilities \(P(\phi,\psi)\). The conditional probability \(P(\phi,\psi)\) is taken as primitive (rather than being defined as \(P(\phi\wedge\psi)/P(\psi)\), as is usually done) to avoid problems when \(P(\psi)=0\). Goosens (1979) provides an overview of various axiomatizations of probability theory in terms of such primitive notions of conditional probability.

2.2 Adams’ Probability Logic

In the previous subsection we discussed a first principle of probability preservation, which says that if all premises have probability 1, then the conclusion also has probability 1. Of course, more interesting cases arise when the premises are less than absolutely certain. Consider the valid argument with premises \(p\vee q\) and \(p\to q\), and conclusion \(q\) (the symbol ‘\(\to\)’ denotes the truth-conditional material conditional). One can easily show that

\[P(q) = P(p\vee q) + P(p\to q) - 1.\]

In other words, if we know the probabilities of the argument’s premises, then we can calculate the exact probability of its conclusion, and thus provide a complete answer to the question of probability preservation for this particular argument (for example, if \(P(p \vee q) = 6/7\) and \(P(p\to q) = 5/7\), then \(P(q) = 4/7\)). In general, however, it will not be possible to calculate the exact probability of the conclusion, given the probabilities of the premises; rather, the best we can hope for is a (tight) upper and/or lower bound for the conclusion’s probability. We will now discuss Adams’ (1998) methods to compute such bounds.
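The identity \(P(q) = P(p\vee q) + P(p\to q) - 1\) can also be checked mechanically. The following sketch (our own illustration) verifies it on randomly generated probability functions over the four valuations of \(p\) and \(q\):

```python
import random
from itertools import product

# Check P(q) = P(p v q) + P(p -> q) - 1 on random probability
# functions over the four valuations of p, q.
VALS = [dict(zip("pq", bits)) for bits in product([True, False], repeat=2)]

def prob(mass, formula):
    return sum(m for v, m in zip(VALS, mass) if formula(v))

random.seed(0)
for _ in range(1000):
    raw = [random.random() for _ in VALS]
    mass = [x / sum(raw) for x in raw]   # a random distribution
    lhs = prob(mass, lambda v: v["q"])
    rhs = (prob(mass, lambda v: v["p"] or v["q"])
           + prob(mass, lambda v: (not v["p"]) or v["q"]) - 1)
    assert abs(lhs - rhs) < 1e-9
```

The identity follows from finite additivity: \(P(p\vee q)\) and \(P(p\to q)\) together count every valuation once, plus the \(q\)-valuations a second time.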

Adams’ results can be stated more easily in terms of uncertainty rather than certainty (probability). Given a probability function \(P:\mathcal{L}\to [0,1]\), the corresponding uncertainty function \(U_P\) is defined as

\[U_P:\mathcal{L}\to[0,1]: \phi\mapsto U_P(\phi):= 1-P(\phi).\]

If the probability function \(P\) is clear from the context, we will often simply write \(U\) instead of \(U_P\). In the remainder of this subsection (and in the next one as well) we will assume that all arguments have only finitely many premises (which is not a significant restriction, given the compactness property of classical propositional logic). Adams’ first main result, which was originally established by Suppes (1966), can now be stated as follows:

Theorem 2. Consider a valid argument \((\Gamma,\phi)\) and a probability function \(P\). Then the uncertainty of the conclusion \(\phi\) cannot exceed the sum of the uncertainties of the premises \(\gamma\in\Gamma\). Formally:

\[U(\phi) \leq \sum_{\gamma\in\Gamma}U(\gamma).\]

First of all, note that this theorem subsumes Theorem 1 as a special case: if \(P(\gamma) = 1\) for all \(\gamma\in\Gamma\), then \(U(\gamma)=0\) for all \(\gamma\in\Gamma\), so \(U(\phi)\leq \sum U(\gamma) = 0\) and thus \(P(\phi) = 1\). Furthermore, note that the upper bound on the uncertainty of the conclusion depends on \(|\Gamma|\), i.e. on the number of premises. If a valid argument has a small number of premises, each of which only has a small uncertainty (i.e. a high certainty), then its conclusion will also have a reasonably small uncertainty (i.e. a reasonably high certainty). Conversely, if a valid argument has premises with small uncertainties, then its conclusion can only be highly uncertain if the argument has a large number of premises (a famous illustration of this converse principle is Kyburg’s (1965) lottery paradox, which is discussed in the entry on epistemic paradoxes of this encyclopedia). To put the matter more concretely, note that if a valid argument has three premises which each have uncertainty 1/11, then adding a premise which also has uncertainty 1/11 will not influence the argument’s validity, but it will raise the upper bound on the conclusion’s uncertainty from 3/11 to 4/11—thus allowing the conclusion to be more uncertain than was originally the case. Finally, the upper bound provided by Theorem 2 is optimal, in the sense that (under the right conditions) the uncertainty of the conclusion can coincide with its upper bound \(\sum U(\gamma)\):
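Theorem 2 lends itself to the same kind of empirical check. The sketch below (again our own encoding, not from the entry) verifies the uncertainty bound for the valid argument with premises \(p\) and \(p\to q\) and conclusion \(q\):

```python
import random
from itertools import product

# Empirical check of Theorem 2 for the valid argument p, p -> q / q:
# the conclusion's uncertainty never exceeds the sum of the premises'.
VALS = [dict(zip("pq", bits)) for bits in product([True, False], repeat=2)]

def prob(mass, formula):
    return sum(m for v, m in zip(VALS, mass) if formula(v))

def U(mass, formula):
    """Uncertainty: U(phi) = 1 - P(phi)."""
    return 1 - prob(mass, formula)

random.seed(1)
for _ in range(1000):
    raw = [random.random() for _ in VALS]
    mass = [x / sum(raw) for x in raw]
    premises = [lambda v: v["p"], lambda v: (not v["p"]) or v["q"]]
    bound = sum(U(mass, g) for g in premises)
    assert U(mass, lambda v: v["q"]) <= bound + 1e-9
```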

Theorem 3. Consider a valid argument \((\Gamma,\phi)\), and assume that the premise set \(\Gamma\) is consistent, and that every premise \(\gamma\in\Gamma\) is relevant (i.e. \(\Gamma-\{\gamma\}\not\models\phi\)). Then there exists a probability function \(P:\mathcal{L}\to[0,1]\) such that

\[U_P(\phi) = \sum_{\gamma\in\Gamma}U_P(\gamma).\]

The upper bound provided by Theorem 2 can also be used to define a probabilistic notion of validity. An argument \((\Gamma,\phi)\) is said to be Adams-probabilistically valid, written \(\Gamma\models_a\phi\), if and only if

for all probability functions \(P:\mathcal{L}\to\mathbb{R}\): \(U_P(\phi)\leq \sum_{\gamma\in\Gamma}U_P(\gamma)\).

Adams-probabilistic validity has an alternative, equivalent characterization in terms of probabilities rather than uncertainties. This characterization says that \((\Gamma,\phi)\) is Adams-probabilistically valid if and only if the conclusion’s probability can get arbitrarily close to 1 if the premises’ probabilities are sufficiently high. Formally: \(\Gamma\models_a\phi\) if and only if

for all \(\epsilon>0\) there exists a \(\delta>0\) such that for all probability functions \(P\):
if \(P(\gamma)>1-\delta\) for all \(\gamma\in\Gamma\), then \(P(\phi)> 1-\epsilon\).

It can be shown that classical propositional logic is (strongly) sound and complete with respect to Adams’ probabilistic semantics:

\[\Gamma \models_a \phi \text{ if and only if } \Gamma \vdash\phi.\]

Adams (1998, 154) also defines another logic for which his probabilistic semantics is sound and complete. However, this system involves a non-truth-functional connective (the probability conditional), and therefore falls outside the scope of this section. (For more on probabilistic interpretations of conditionals, the reader can consult the entries on conditionals and the logic of conditionals of this encyclopedia.)

Consider the following example. The argument \(A\) with premises \(p,q,r,s\) and conclusion \(p\wedge(q\vee r)\) is valid. Assume that \(P(p) = 10/11, P(q) = P(r) = 9/11\) and \(P(s) = 7/11\). Then Theorem 2 says that

\[\begin{align}&U(p\wedge(q\vee r)) \leq \\&\quad\frac{1}{11} + \frac{2}{11} + \frac{2}{11} + \frac{4}{11} = \frac{9}{11}.\end{align}\]

This upper bound on the uncertainty of the conclusion is rather disappointing, and it exposes the main weakness of Theorem 2. One of the reasons why the upper bound is so high is that to compute it we took into account the premise \(s\), which has a rather high uncertainty (\(4/11\)). However, this premise is irrelevant, in the sense that the conclusion already follows from the other three premises. Hence we can regard \(p\wedge (q\vee r)\) not only as the conclusion of the valid argument \(A\), but also as the conclusion of the (equally valid) argument \(A'\), which has premises \(p,q,r\). In the latter case Theorem 2 yields an upper bound of \(1/11 + 2/11 + 2/11 = 5/11\), which is already much lower.

The weakness of Theorem 2 is thus that it takes into account (the uncertainty of) irrelevant or inessential premises. To obtain an improved version of this theorem, a more fine-grained notion of ‘essentialness’ is necessary. In argument \(A\) in the example above, premise \(s\) is absolutely irrelevant. Similarly, premise \(p\) is absolutely relevant, in the sense that without this premise, the conclusion \(p\wedge(q\vee r)\) is no longer derivable. Finally, the premise subset \(\{q,r\}\) is ‘in between’: together \(q\) and \(r\) are relevant (if both premises are left out, the conclusion is no longer derivable), but each of them separately can be left out (while keeping the conclusion derivable).

The notion of essentialness is formalized as follows:

Essential premise set. Given a valid argument \((\Gamma,\phi)\), a set \(\Gamma' \subseteq \Gamma\) is essential iff \(\Gamma - \Gamma' \not\models\phi\).

Degree of essentialness. Given a valid argument \((\Gamma,\phi)\) and a premise \(\gamma\in\Gamma\), the degree of essentialness of \(\gamma\), written \(E(\gamma)\), is \(1/|S_\gamma|\), where \(|S_\gamma|\) is the cardinality of the smallest essential premise set that contains \(\gamma\). If \(\gamma\) does not belong to any minimal essential premise set, then the degree of essentialness of \(\gamma\) is 0.
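For small arguments, both definitions can be computed directly by brute force over truth tables. The following sketch (our own implementation of the definitions above, not from the entry) recovers the degrees of essentialness for the argument with premises \(p,q,r,s\) and conclusion \(p\wedge(q\vee r)\):

```python
from itertools import product, combinations

# Degrees of essentialness for the argument p, q, r, s / p & (q v r),
# computed by brute force over the 16 valuations of the four atoms.
ATOMS = "pqrs"
VALS = [dict(zip(ATOMS, bits)) for bits in product([True, False], repeat=4)]

premises = {
    "p": lambda v: v["p"],
    "q": lambda v: v["q"],
    "r": lambda v: v["r"],
    "s": lambda v: v["s"],
}
conclusion = lambda v: v["p"] and (v["q"] or v["r"])

def entails(names, concl):
    """Do the named premises (semantically) entail concl?"""
    return all(concl(v) for v in VALS if all(premises[n](v) for n in names))

def essential(subset):
    """Gamma' is essential iff Gamma - Gamma' does not entail phi."""
    return not entails(set(premises) - set(subset), conclusion)

subsets = [set(c) for k in range(1, 5) for c in combinations(premises, k)]
essentials = [s for s in subsets if essential(s)]
minimal = [s for s in essentials if not any(t < s for t in essentials)]

def E(name):
    """Degree of essentialness per the definition above."""
    if not any(name in s for s in minimal):
        return 0
    return 1 / min(len(s) for s in essentials if name in s)

print({n: E(n) for n in premises})  # {'p': 1.0, 'q': 0.5, 'r': 0.5, 's': 0}
```

The printed values are exactly those used in the worked example: \(E(p)=1\), \(E(q)=E(r)=1/2\), \(E(s)=0\).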

With these definitions, a refined version of Theorem 2 can be established:

Theorem 4. Consider a valid argument \((\Gamma,\phi)\). Then the uncertainty of the conclusion \(\phi\) cannot exceed the weighted sum of the uncertainties of the premises \(\gamma\in\Gamma\), with the degrees of essentialness as weights. Formally:

\[U(\phi) \leq \sum_{\gamma\in\Gamma}E(\gamma)U(\gamma).\]

The proof of Theorem 4 is significantly more difficult than that of Theorem 2: Theorem 2 requires only basic probability theory, whereas Theorem 4 is proved using methods from linear programming (Adams and Levine 1975; Goldman and Tucker 1956). Theorem 4 subsumes Theorem 2 as a special case: if all premises are relevant (i.e. have degree of essentialness 1), then Theorem 4 yields the same upper bound as Theorem 2. Furthermore, Theorem 4 does not take into account irrelevant premises (i.e. premises with degree of essentialness 0) to compute this upper bound; hence if a premise is irrelevant for the validity of the argument, then its uncertainty will not carry over to the conclusion. Finally, note that since \(E(\gamma)\in [0,1]\) for all \(\gamma\in\Gamma\), it holds that

\[\sum_{\gamma\in\Gamma}E(\gamma)U(\gamma) \leq\sum_{\gamma\in\Gamma}U(\gamma),\]

i.e. Theorem 4 yields in general a tighter upper bound than Theorem 2. To illustrate this, consider again the argument with premises \(p,q,r,s\) and conclusion \(p \wedge (q\vee r)\). Recall that \(P(p)=10/11, P(q) = P(r)=9/11\) and \(P(s)=7/11\). One can calculate the degrees of essentialness of the premises: \(E(p) = 1, E(q) = E(r) = 1/2\) and \(E(s) = 0\). Hence Theorem 4 yields that

\[\begin{align}&U(p \wedge (q\vee r))\leq \\&\quad\left(1\times \frac{1}{11}\right) + \left(\frac{1}{2} \times \frac{2}{11}\right) + \left(\frac{1}{2} \times \frac{2}{11}\right) + \left(0 \times \frac{4}{11}\right) = \frac{3}{11},\end{align}\]

which is a tighter upper bound for the uncertainty of \(p\wedge(q\vee r)\) than either of the bounds obtained above via Theorem 2 (viz. \(9/11\) and \(5/11\)).

2.3 Further Generalizations

Given the uncertainties (and degrees of essentialness) of the premises of a valid argument, Adams’ theorems allow us to compute an upper bound for the uncertainty of the conclusion. Of course these results can also be expressed in terms of probabilities rather than uncertainties; they then yield a lower bound for the probability of the conclusion. For example, when expressed in terms of probabilities rather than uncertainties, Theorem 4 looks as follows:

\[P(\phi)\geq 1 - \sum_{\gamma\in\Gamma}E(\gamma)(1 - P(\gamma)).\]

Adams’ results are restricted in at least two ways:

  • They only provide a lower bound for the probability of the conclusion (given the probabilities of the premises). In a sense this is the most important bound: it represents the conclusion’s probability in the ‘worst-case scenario’, which might be useful information in practical applications. However, in some applications it might also be informative to have an upper bound for the conclusion’s probability. For example, if one knows that this probability has an upper bound of 0.4, then one might decide to refrain from certain actions (that one would have performed if this upper bound were (known to be) 0.9).

  • They presuppose that the premises’ exact probabilities are known. In practical applications, however, there might only be partial information about the probability of a premise \(\gamma\): its exact value is not known, but it is known to have a lower bound \(a\) and an upper bound \(b\) (Walley 1991). In such applications it would be useful to have a method to calculate (optimal) lower and upper bounds for the probability of the conclusion in terms of the upper and lower bounds of the probabilities of the premises.

Hailperin (1965, 1984, 1986, 1996) and Nilsson (1986) use methods from linear programming to show that these two restrictions can be overcome. Their most important result is the following:

Theorem 5. Consider an argument \((\Gamma,\phi)\), with \(|\Gamma| = n\). There exist functions \(L_{\Gamma,\phi}: \mathbb{R}^{2n} \to \mathbb{R}\) and \(U_{\Gamma,\phi}: \mathbb{R}^{2n} \to \mathbb{R}\) such that for any probability function \(P\), the following holds: if \(a_i \leq P(\gamma_i) \leq b_i\) for \(1\leq i\leq n\), then:

  1. \(L_{\Gamma,\phi}(a_1,\dots,a_n,b_1,\dots,b_n) \leq P(\phi) \leq U_{\Gamma,\phi}(a_1,\dots,a_n,b_1,\dots,b_n)\).

  2. The bounds in item 1 are optimal, in the sense that there exist probability functions \(P_L\) and \(P_U\) such that \(a_i \leq P_L(\gamma_i), P_U(\gamma_i) \leq b_i\) for \(1\leq i\leq n\), and

    \(L_{\Gamma,\phi}(a_1,\dots,a_n,b_1,\dots,b_n) = P_L(\phi)\)

    and

    \(P_U(\phi) = U_{\Gamma,\phi}(a_1,\dots,a_n,b_1,\dots,b_n)\).
  3. The functions \(L_{\Gamma,\phi}\) and \(U_{\Gamma,\phi}\) are effectively determinable from the Boolean structure of the sentences in \(\Gamma \cup \{\phi\}\).
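For small premise sets, the optimal bounds of Theorem 5 can be found by brute force rather than by linear programming. The sketch below (our own illustration, with hypothetical interval constraints \(P(p)\geq 0.8\) and \(P(p\to q)\geq 0.9\)) enumerates grid distributions over the four valuations of \(p,q\) and reports the extremal values of \(P(q)\); the true optima \(7/10\) and \(1\) happen to lie on this grid, so the search is exact here.

```python
from itertools import product
from fractions import Fraction

# Brute-force approximation of the optimal bounds L and U: enumerate
# all distributions over the four valuations of p, q on a grid with
# step 1/20, keep those satisfying P(p) >= 8/10 and P(p -> q) >= 9/10,
# and track the minimum and maximum of P(q).
STEP = Fraction(1, 20)
n = 20
VALS = [dict(zip("pq", bits)) for bits in product([True, False], repeat=2)]

def prob(mass, formula):
    return sum(m for v, m in zip(VALS, mass) if formula(v))

lo, hi = Fraction(1), Fraction(0)
for a in range(n + 1):
    for b in range(n + 1 - a):
        for c in range(n + 1 - a - b):
            d = n - a - b - c
            mass = [a * STEP, b * STEP, c * STEP, d * STEP]
            if prob(mass, lambda v: v["p"]) < Fraction(8, 10):
                continue
            if prob(mass, lambda v: (not v["p"]) or v["q"]) < Fraction(9, 10):
                continue
            pq = prob(mass, lambda v: v["q"])
            lo, hi = min(lo, pq), max(hi, pq)

print(lo, hi)  # 7/10 1
```

The lower bound \(7/10\) agrees with the identity of Section 2.2: \(P(q) = P(p\vee q)+P(p\to q)-1 \geq P(p)+P(p\to q)-1\).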

This result can also be used to define yet another probabilistic notion of validity, which we will call Hailperin-probabilistic validity or simply h-validity. This notion is not defined with respect to formulas, but rather with respect to pairs consisting of a formula and a subinterval of \([0,1]\). If \(X_i\) is the interval associated with premise \(\gamma_i\in\Gamma\) and \(Y\) is the interval associated with the conclusion \(\phi\), then the argument \((\Gamma,\phi)\) is said to be h-valid, written \(\Gamma\models_h\phi\), if and only if for all probability functions \(P\):

\[\text{ if } P(\gamma_i) \in X_i \text{ for } 1\leq i\leq n, \text{ then } P(\phi)\in Y.\]

In Haenni et al. (2011) this is written as

\[\gamma_1^{X_1},\dots,\gamma_n^{X_n}|\!\!\!\approx \phi^Y\]

and called the standard probabilistic semantics.

Nilsson’s work on probabilistic logic (1986, 1993) has sparked a lot of research on probabilistic reasoning in artificial intelligence (Hansen and Jaumard 2000; chapter 2 of Haenni et al. 2011). However, it should be noted that although Theorem 5 states that the functions \(L_{\Gamma,\phi}\) and \(U_{\Gamma,\phi}\) are effectively determinable from the sentences in \(\Gamma\cup\{\phi\}\), the computational complexity of this problem is quite high (Georgakopoulos et al. 1988, Kavvadias and Papadimitriou 1990), and thus finding these functions quickly becomes computationally unfeasible in real-world applications. Contemporary approaches based on probabilistic argumentation systems and probabilistic networks are better capable of handling these computational challenges. Furthermore, probabilistic argumentation systems are closely related to Dempster-Shafer theory (Dempster 1968; Shafer 1976; Haenni and Lehmann 2003). However, an extended discussion of these approaches is beyond the scope of (the current version of) this entry; see (Haenni et al. 2011) for a recent survey.

3. Basic Probability Operators

In this section we will study probability logics that extend the propositional language \(\mathcal{L}\) with rather basic probability operators. They differ from the logics in Section 2 in that the logics here involve probability operators in the object language. Section 3.1 discusses qualitative probability operators; Section 3.2 discusses quantitative probability operators.

3.1 Qualitative Representations of Uncertainty

There are several applications in which qualitative theories of probability might be useful, or even necessary. In some situations there are no frequencies available to use as estimates for the probabilities, or it might be practically impossible to obtain those frequencies. Furthermore, people are often willing to compare the probabilities of two statements (‘\(\phi\) is more probable than \(\psi\)’), without being able to assign explicit probabilities to each of the statements individually (Szolovits and Pauker 1978, Halpern and Rabin 1987). In such situations qualitative probability logics will be useful.

One of the earliest qualitative probability logics is Hamblin’s (1959). The language is extended with a unary operator \(\Box\), which is to be read as ‘probably’. Hence a formula such as \(\Box\phi\) is to be read as ‘probably \(\phi\)’. This notion of ‘probable’ can be formalized as sufficiently high (numerical) probability (i.e. \(P(\phi)\geq t\), for some threshold value \(1/2 < t \leq 1\)), or alternatively in terms of plausibility, which is a non-metrical generalization of probability. Burgess (1969) further develops these systems, focusing on the ‘high numerical probability’-interpretation. Both Hamblin and Burgess introduce additional operators into their systems (expressing, for example, metaphysical necessity and/or knowledge), and study the interaction between the ‘probably’-operator and these other modal operators. However, the ‘probably’-operator already displays some interesting features on its own (independent from any other operators). If it is interpreted as ‘sufficiently high probability’, then it fails to satisfy the principle \((\Box\phi\wedge\Box\psi) \to \Box(\phi\wedge\psi)\). This means that it is not a normal modal operator, and cannot be given a Kripke (relational) semantics. Herzig and Longin (2003) and Arló Costa (2005) provide weaker systems of neighborhood semantics for such ‘probably’-operators, while Yalcin (2010) discusses their behavior from a more linguistically oriented perspective.

Another route is taken by Segerberg (1971) and Gärdenfors (1975a, 1975b), who build on earlier work by de Finetti (1937), Kraft, Pratt and Seidenberg (1959) and Scott (1964). They introduce a binary operator \(\geq\); the formula \(\phi\geq\psi\) is to be read as ‘\(\phi\) is at least as probable as \(\psi\)’ (formally: \(P(\phi)\geq P(\psi)\)). The key idea is that one can completely axiomatize the behavior of \(\geq\) without having to use the ‘underlying’ probabilities of the individual formulas. It should be noted that with comparative probability (a binary operator), one can also express some absolute probabilistic properties (unary operators). For example, \(\phi\geq \top\) expresses that \(\phi\) has probability 1, and \(\phi\geq\neg\phi\) expresses that \(\phi\) has probability at least 1/2. More recently, Delgrande and Renne (2015) and Delgrande, Renne, and Sack (2019) further extend the qualitative approach, by allowing the arguments of \(\geq\) to be finite sequences of formulas (of potentially different lengths). The formula \((\phi_1,\dots,\phi_n) \geq (\psi_1,\dots,\psi_m)\) is informally to be read as ‘the sum of the probabilities of the \(\phi_i\)’s is at least as high as the sum of the probabilities of the \(\psi_j\)’s’. The resulting logic can be axiomatized completely, and can capture any rational quantity, making it as expressive as some quantitative probability logics. However, it is still distinct from quantitative probability logics, as there are no numbers in the language. In the following sections, we direct our attention to quantitative probability logics.

3.2 Sums and Products of Probability Terms

Propositional probability logics are extensions of propositional logic that express numerical relationships among probability terms \(P(\varphi)\). A simple propositional probability logic adds to propositional logic formulas of the form \(P(\varphi)\ge q\), where \(\varphi\) is a propositional formula and \(q\) is a number; such a formula asserts that the probability of \(\varphi\) is at least \(q\). The semantics is formalized using models consisting of a probability function \(\mathcal{P}\) over a set \(\Omega\), whose elements are each given a truth assignment to the atomic propositions of the propositional logic. Thus a propositional formula is true at an element of \(\Omega\) if the truth assignment for that element makes the propositional formula true. The formula \(P(\varphi)\ge q\) is true in the model if and only if the probability \(\mathcal{P}\) of the set of elements of \(\Omega\) for which \(\varphi\) is true is at least \(q\). See Chapter 3 of Ognjanović et al. (2016) for an overview of such a propositional probability logic.
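This semantics is straightforward to make concrete. The following sketch (a toy model of our own, with two atoms \(p\) and \(q\) and a three-element \(\Omega\)) evaluates a formula \(P(\varphi)\ge q\) by summing the probabilities of the elements whose truth assignments satisfy \(\varphi\):

```python
from fractions import Fraction

# A minimal sketch (the model is our own toy example) of the semantics of a
# propositional probability logic: a probability function over a finite set
# Omega, each of whose elements carries a truth assignment to the atoms.
omega = [
    ({"p": True,  "q": True},  Fraction(1, 4)),
    ({"p": True,  "q": False}, Fraction(1, 4)),
    ({"p": False, "q": True},  Fraction(1, 2)),
]

def prob(phi):
    """P(phi): total probability of the elements whose assignment satisfies phi."""
    return sum(pr for (val, pr) in omega if phi(val))

p = lambda v: v["p"]
q = lambda v: v["q"]

# P(p) >= 1/2 is true in this model, since P(p) = 1/4 + 1/4 = 1/2.
print(prob(p) >= Fraction(1, 2))      # True
print(prob(lambda v: p(v) and q(v)))  # 1/4
```

Formulas \(\varphi\) are represented here simply as predicates on truth assignments; a fuller implementation would parse a formula syntax, but the measure-of-the-truth-set idea is the same.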

Some propositional probability logics include other types of formulas in the object language, such as those involving sums and products of probability terms. The appeal of involving sums can be clarified by the additivity condition of probability functions (see Section 2.1), which can be expressed as \(P(\phi \vee \psi) = P(\phi)+P(\psi)\) whenever \(\neg (\phi \wedge \psi)\) is a tautology, or equivalently as \(P(\phi \wedge \psi) + P(\phi \wedge \neg \psi) = P(\phi)\). Probability logics that explicitly involve sums of probabilities tend to more generally include linear combinations of probability terms, such as in Fagin et al. (1990). Here, propositional logic is extended with formulas of the form \(a_1P(\phi_1) + \cdots + a_n P(\phi_n) \ge b\), where \(n\) is a positive integer that may differ from formula to formula, and \(a_1,\ldots,a_n\) and \(b\) are all rational numbers. Here are some examples of what can be expressed.

  • \(P(\phi) \le q\) by \(-P(\phi) \ge -q\),

  • \(P(\phi) < q\) by \(\neg (P(\phi) \ge q)\),

  • \(P(\phi) = q\) by \(P(\phi)\ge q \wedge P(\phi) \le q\),

  • \(P(\phi) \ge P(\psi)\) by \(P(\phi)-P(\psi) \ge 0\).

Expressive power with and without linear combinations: Although linear combinations provide a convenient way of expressing numerous relationships among probability terms, a language without sums of probability terms is still very powerful. Consider the language restricted to formulas of the form \(P(\phi) \ge q\) for some propositional formula \(\phi\) and rational \(q\). We can define

\[P(\phi) \le q \text{ by } P(\neg\phi) \ge 1-q,\]

which is reasonable considering that the probability of the complement of a proposition is equal to 1 minus the probability of the proposition. The formulas \(P(\phi) < q\) and \(P(\phi) = q\) can be defined without linear combinations as we did above. Using this restricted probability language, we can reason about additivity in a less direct way. The formula

\[[P(\phi \wedge \psi) = a \wedge P(\phi \wedge \neg \psi) = b] \to P(\phi) = a+b\]

states that if the probability of \(\phi \wedge \psi\) is \(a\) and the probability of \(\phi\wedge \neg \psi\) is \(b\), then the probability of the disjunction of the formulas (which is equivalent to \(\phi\)) is \(a+b\). However, while the use of linear combinations allows us to assert that the probabilities of \(\varphi\wedge\psi\) and \(\varphi\wedge\neg\psi\) are additive by using the formula \(P(\varphi\wedge \psi)+P(\varphi\wedge\neg\psi) = P(\varphi)\), the formula without linear combinations above only does so if we choose the correct numbers \(a\) and \(b\). A formal comparison of the expressiveness of propositional probability logic with linear combinations and without is given in Demey and Sack (2015). While any two models agree on all formulas with linear combinations if and only if they agree on all formulas without (Lemma 4.1 of Demey and Sack (2015)), it is not the case that any class of models definable by a single formula with linear combinations can be defined by a single formula without (Lemma 4.2 of Demey and Sack (2015)). In particular, the class of models defined by the formula \(P(p)- P(q)\ge 0\) cannot be defined by any single formula without the power of linear combinations.

Probabilities belonging to a given subset: Ognjanović and Rašković (1999) extend the language of probability logic by means of a new type of operator: \(Q_F\). Intuitively, the formula \(Q_F\phi\) means that the probability of \(\phi\) belongs to \(F\), for some given set \(F \subseteq [0,1]\). This \(Q_F\)-operator cannot be defined in terms of formulas of the form \(P(\phi) \ge a\). Ognjanović and Rašković (1999) provide a sound and complete axiomatization of this type of logical system. The key bridge principles, which connect the \(Q_F\)-operator to the more standard \(P\)-operator, are the axioms \(P(\phi) = a \to Q_F\phi\) for all \(a \in F\), as well as the infinitary rule that specifies that from \(P(\phi) = a \to \psi\) for all \(a \in F\), one can infer \(Q_F\phi\to\psi\).

Polynomial weight formulas: Logics with polynomial weight formulas (involving both weighted sums and products of probability terms) can allow for formulas of the form \(P(\phi)P(\psi)-P(\phi\wedge \psi) = 0\), that is, the probability of both \(\phi\) and \(\psi\) is equal to the product of the probabilities of \(\phi\) and \(\psi\). This formula captures what it means for \(\phi\) and \(\psi\) to be statistically independent. Such logics were investigated in Fagin et al. (1990), but mostly with first-order logic features included, and then again in a simpler context (without quantifiers) in Perović et al. (2008).
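As an illustration, the polynomial weight formula for independence can be checked numerically in a small model (a four-element toy example of our own, with each combination of two atoms given probability \(1/4\)):

```python
from fractions import Fraction

# Sketch (toy model, ours): evaluating the polynomial weight formula
# P(phi)P(psi) - P(phi ∧ psi) = 0, i.e. statistical independence of phi and psi.
omega = [
    ({"p": True,  "q": True},  Fraction(1, 4)),
    ({"p": True,  "q": False}, Fraction(1, 4)),
    ({"p": False, "q": True},  Fraction(1, 4)),
    ({"p": False, "q": False}, Fraction(1, 4)),
]

def prob(phi):
    return sum(pr for (val, pr) in omega if phi(val))

p = lambda v: v["p"]
q = lambda v: v["q"]

# The polynomial weight formula holds here: 1/2 * 1/2 - 1/4 = 0.
independent = prob(p) * prob(q) - prob(lambda v: p(v) and q(v)) == 0
print(independent)  # True
```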

Compactness and completeness: Compactness is the property of a logic whereby a set of formulas is satisfiable whenever every finite subset of it is satisfiable. Propositional probability logics lack the compactness property: every finite subset of \(\{P(p)>0\}\cup\{P(p)\leq a \,|\, a>0\}\) is satisfiable, but the entire set is not.

Without compactness, a logic might be weakly complete (every valid formula is provable in the axiomatic system), but not strongly complete (for every set \(\Gamma\) of formulas, every logical consequence of \(\Gamma\) is provable from \(\Gamma\) in the axiomatic system). In Fagin et al. (1990), a proof system involving linear combinations was given and the logic was shown to be both sound and weakly complete. In Ognjanović and Rašković (1999), a sound and strongly complete proof system is given for propositional probability logic without linear combinations. In Heifetz and Mongin (2001), a proof system was given for a variation of the logic without linear combinations that uses a system of types to allow for iteration of probability formulas (we will see in Section 4 how such iteration can be achieved using possible worlds), and the logic was shown to be sound and weakly complete. They also observe that no finitary proof system for such a logic can be strongly complete. Ognjanović et al. (2008) present some qualitative probabilistic logics with infinitary derivation rules (which require a countably infinite number of premises), and prove strong completeness. Goldblatt (2010) presents a strongly complete proof system for a related coalgebraic logic. Perović et al. (2008) give a proof system and proof of strong completeness for propositional probability logic with polynomial weight formulas. Finally, another strategy for obtaining strong completeness involves restricting the range of the probability functions to a fixed, finite set of numbers; for example, Ognjanović et al. (2008) discuss a qualitative probabilistic logic in which the range of the probability functions is not the full real unit interval \([0,1]\), but rather the ‘discretized’ version \(\{0,\frac{1}{n},\frac{2}{n},\dots,\frac{n-1}{n},1\}\) (for some fixed number \(n\in\mathbb{N}\)). See Chapter 7 of Ognjanović et al. (2016) for an overview of completeness results.

4. Modal Probability Logics

Many probability logics are interpreted over a single, but arbitrary, probability space. Modal probability logic makes use of many probability spaces, each associated with a possible world or state. This can be viewed as a minor adjustment to the relational semantics of modal logic: rather than associating with every possible world a set of accessible worlds, as is done in modal logic, modal probability logic associates with every possible world a probability distribution, a probability space, or a set of probability distributions. The language of modal probability logic allows for the embedding of probabilities within probabilities; that is, it can for example reason about the probability that (possibly a different) probability is \(1/2\). This modal setting involving multiple probabilities has generally been given (1) a stochastic interpretation, concerning different probabilities over the next states a system might transition into (Larsen and Skou 1991), and (2) a subjective interpretation, concerning different probabilities that different agents may have about a situation or each other’s probabilities (Fagin and Halpern 1988). Both interpretations can use exactly the same formal framework.

A basic modal probability logic adds to propositional logic formulas of the form \(P(\phi)\ge q\), where \(q\) is typically a rational number, and \(\phi\) is any formula of the language, possibly a probability formula. The reading of such a formula is that the probability of \(\phi\) is at least \(q\). This general reading does not reflect any difference between modal probability logic and other probability logics with the same formula; the difference lies in the ability to embed probabilities in the arguments of probability terms, and in the semantics. The following subsections provide an overview of the variations of how modal probability logic is modeled. In one case the language is altered slightly (Section 4.2), and in other cases, the logic is extended to address interactions between qualitative and quantitative uncertainty (Section 4.4 and Section 4.5) or dynamics (Section 4.6).

4.1 Basic Finite Modal Probability Models

Formally, a Basic Finite Modal Probabilistic Model is a tuple \(M=(W,\mathcal{P},V)\), where \(W\) is a finite set of possible worlds or states, \(\mathcal{P}\) is a function associating a distribution \(\mathcal{P}_w\) over \(W\) with each world \(w\in W\), and \(V\) is a ‘valuation function’ assigning a set of atomic propositions (taken from a given set \(\Phi\)) to each world. The distribution is additively extended from individual worlds to sets of worlds: \(\mathcal{P}_w(S) = \sum_{s\in S}\mathcal{P}_w(s)\). The first two components of a basic modal probabilistic model are effectively the same as a Kripke frame whose relation is decorated with numbers (probability values). Such a structure has different names, such as a directed graph with labelled edges in mathematics, or a probabilistic transition system in computer science. The valuation function, as in a Kripke model, allows us to assign properties to the worlds.

The semantics for formulas is given on pairs \((M,w)\), where \(M\) is a model and \(w\) is a world of the model. A formula \(P(\phi)\ge q\) is true at a pair \((M,w)\), written \((M,w)\models P(\phi)\ge q\), if and only if \(\mathcal{P}_w(\{w'\mid (M,w')\models \phi\}) \ge q\).
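This truth clause can be sketched directly. The two-world model below is our own toy example; the point is only the shape of the semantics, where each world carries its own distribution over all worlds:

```python
from fractions import Fraction

# Sketch of the Section 4.1 semantics (toy model, ours):
# (M, w) |= P(phi) >= q  iff  P_w({w' : (M, w') |= phi}) >= q.
W = ["u", "v"]
P = {  # P[w][w2] = the probability that world w assigns to world w2
    "u": {"u": Fraction(1, 3), "v": Fraction(2, 3)},
    "v": {"u": Fraction(0),    "v": Fraction(1)},
}
V = {"u": {"p"}, "v": set()}  # atomic propositions true at each world

def prob_at(w, phi):
    """P_w of the set of worlds satisfying phi."""
    return sum(P[w][w2] for w2 in W if phi(w2))

holds_p = lambda w2: "p" in V[w2]
print(prob_at("u", holds_p))                    # 1/3
print(prob_at("u", holds_p) >= Fraction(1, 3))  # True: (M,u) |= P(p) >= 1/3
```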

4.2 Indexing and Interpretations

The first generalization, which is most common in applications of modal probabilistic logic, is to allow the distributions to be indexed by two sets rather than one. The first set is the set \(W\) of worlds (the base set of the model), but the other is an index set \(A\), often taken to be a set of actions, agents, or players of a game. Formally, \(\mathcal{P}\) associates a distribution \(\mathcal{P}_{a,w}\) over \(W\) with each \(w\in W\) and \(a\in A\). For the language, rather than involving formulas of the form \(P(\phi)\ge q\), we have \(P_a(\phi)\ge q\), and \((M,w)\models P_a(\phi)\ge q\) if and only if \(\mathcal{P}_{a,w}(\{w'\mid (M,w')\models \phi\}) \ge q\).

Example: Suppose we have an index set \(A = \{a,b\}\) and a set \(\Phi = \{p,q\}\) of atomic propositions. Consider \((W,\mathcal{P},V)\), where

  • \(W = \{w,x,y,z\}\)

  • \(\mathcal{P}_{a,w}\) and \(\mathcal{P}_{a,x}\) map \(w\) to \(1/2\), \(x\) to \(1/2\), \(y\) to \(0\), and \(z\) to \(0\).

    \(\mathcal{P}_{a,y}\) and \(\mathcal{P}_{a,z}\) map \(y\) to \(1/3\), \(z\) to \(2/3\), \(w\) to \(0\), and \(x\) to \(0\).

    \(\mathcal{P}_{b,w}\) and \(\mathcal{P}_{b,y}\) map \(w\) to \(1/2\), \(y\) to \(1/2\), \(x\) to \(0\), and \(z\) to \(0\).

    \(\mathcal{P}_{b,x}\) and \(\mathcal{P}_{b,z}\) map \(x\) to \(1/4\), \(z\) to \(3/4\), \(w\) to \(0\), and \(y\) to \(0\).

  • \(V(p) = \{w,x\}\)

    \(V(q) = \{w,y\}\).

We depict this example with the following diagram. Inside each circle is a labeling of the truth of each proposition letter for the world whose name is labelled right outside the circle. The arrows indicate the probabilities. For example, an arrow from world \(x\) to world \(z\) labeled by \((b,3/4)\) indicates that from \(x\), the probability of \(z\) under label \(b\) is \(3/4\). Probabilities of 0 are not labelled.

[Figure: four circles, one for each world \(w, x, y, z\), each showing the truth values of \(p\) and \(q\) at that world, with probability-labelled arrows between them.]

Stochastic Interpretation: Consider the elements \(a\) and \(b\) of \(A\) to be actions, for example, pressing buttons on a machine. In this case, pressing a button does not have a certain outcome. For instance, if the machine is in state \(x\), there is a \(1/2\) probability it will remain in the same state after pressing \(a\), but a \(1/4\) probability of remaining in the same state after pressing \(b\). That is,

\[(M,x) \models P_a(p\wedge \neg q) = 1/2 \wedge P_b(p\wedge \neg q) = 1/4.\]

A significant feature of modal logics in general (and this includes modal probabilistic logic) is the ability to support higher-order reasoning, that is, reasoning about probabilities of probabilities. The importance of higher-order probabilities is clear from the role they play in, for example, Miller’s principle, which states that \(P_1(\phi\mid P_2(\phi) = b) = b\). Here, \(P_1\) and \(P_2\) are probability functions, which can have various interpretations, such as the probabilities of two agents, logical and statistical probability, or the probabilities of one agent at different moments in time (Miller 1966; Lewis 1980; van Fraassen 1984; Halpern 1991). Higher-order probability also occurs, for instance, in the Judy Benjamin Problem (van Fraassen 1981a), where one conditionalizes on probabilistic information. Whether one agrees with the principles proposed in the literature on higher-order probabilities or not, the ability to represent them forces one to investigate the principles governing them.

To illustrate higher-order reasoning more concretely, we return to our example and see that at \(x\), there is a \(1/2\) probability that after pressing \(a\), there is a \(1/2\) probability that after pressing \(b\), it will be the case that \(\neg p\) is true, that is,

\[(M,x)\models P_a(P_b(\neg p)= 1/2)=1/2.\]
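This higher-order claim can be verified mechanically. The sketch below encodes the distributions and valuation of the example and evaluates the nested formula by treating the inner probability statement as a property of worlds:

```python
from fractions import Fraction

# Direct check of the example's formulas, using its distributions and valuation.
W = ["w", "x", "y", "z"]
half, third = Fraction(1, 2), Fraction(1, 3)
P = {  # P[(agent, world)][w2] = probability of w2 under that index
    ("a", "w"): {"w": half, "x": half, "y": 0, "z": 0},
    ("a", "x"): {"w": half, "x": half, "y": 0, "z": 0},
    ("a", "y"): {"w": 0, "x": 0, "y": third, "z": 2 * third},
    ("a", "z"): {"w": 0, "x": 0, "y": third, "z": 2 * third},
    ("b", "w"): {"w": half, "x": 0, "y": half, "z": 0},
    ("b", "y"): {"w": half, "x": 0, "y": half, "z": 0},
    ("b", "x"): {"w": 0, "x": Fraction(1, 4), "y": 0, "z": Fraction(3, 4)},
    ("b", "z"): {"w": 0, "x": Fraction(1, 4), "y": 0, "z": Fraction(3, 4)},
}
V = {"p": {"w", "x"}, "q": {"w", "y"}}

def prob(agent, world, phi):
    """P_{agent,world} of the set of worlds satisfying phi."""
    return sum(pr for w2, pr in P[(agent, world)].items() if phi(w2))

not_p = lambda w2: w2 not in V["p"]
# The inner formula P_b(¬p) = 1/2, viewed as a property of worlds:
inner = lambda w2: prob("b", w2, not_p) == half
print(prob("a", "x", inner))  # 1/2, so (M,x) |= P_a(P_b(¬p)=1/2)=1/2
```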

Subjective Interpretation: Suppose the elements \(a\) and \(b\) of \(A\) are players of a game. \(p\) and \(\neg p\) are strategies for player \(a\), and \(q\) and \(\neg q\) are both strategies for player \(b\). In the model, each player is certain of her own strategy; for instance at \(x\), player \(a\) is certain that she will play \(p\) and player \(b\) is certain that she will play \(\neg q\), that is,

\[(M,x)\models P_a(p) = 1 \wedge P_b(\neg q) = 1.\]

But the players randomize over their opponents. For instance at \(x\), the probability that \(b\) assigns to \(a\)’s probability of \(q\) being \(1/2\) is \(1/4\), that is,

\[(M,x)\models P_b(P_a(q)=1/2)=1/4.\]

4.3 Probability Spaces

Probabilities are generally defined as measures in a measure space. A measure space is a set \(\Omega\) (the sample space) together with a \(\sigma\)-algebra (also called a \(\sigma\)-field) \(\mathcal{A}\) over \(\Omega\), which is a non-empty set of subsets of \(\Omega\) such that \(A\in \mathcal{A}\) implies that \(\Omega-A\in \mathcal{A}\), and \(A_i\in \mathcal{A}\) for all natural numbers \(i\) implies that \(\bigcup_i A_i\in \mathcal{A}\). A measure is a function \(\mu\) defined on the \(\sigma\)-algebra \(\mathcal{A}\), such that \(\mu(A) \ge 0\) for every set \(A \in\mathcal{A}\) and \(\mu(\bigcup_i A_i) = \sum_i\mu(A_i)\) whenever the \(A_i\) are pairwise disjoint, that is, \(A_i\cap A_j = \emptyset\) for each \(i\neq j\).

The effect of the \(\sigma\)-algebra is to restrict the domain so that not every subset of \(\Omega\) need have a probability. This is crucial for some probabilities to be defined on uncountably infinite sets; for example, a uniform distribution over a unit interval cannot be defined on all subsets of the interval while also maintaining the countable additivity condition for probability measures.

The same basic language as was used for the basic finite probability logic need not change, but the semantics is slightly different: for every state \(w\in W\), the component \(\mathcal{P}_w\) of a modal probabilistic model is replaced by an entire probability space \((\Omega_w,\mathcal{A}_w,\mu_w)\), such that \(\Omega_w\subseteq W\) and \(\mathcal{A}_w\) is a \(\sigma\)-algebra over \(\Omega_w\). The reason we may want entire spaces to differ from one world to another is to reflect uncertainty about what probability space is the right one. For the semantics of probability formulas, \((M,w)\models P(\phi)\ge q\) if and only if \(\mu_w(\{w'\mid (M,w')\models \phi\})\ge q\). Such a definition is not well defined in the event that \(\{w'\mid (M,w')\models \phi\}\not\in \mathcal{A}_w\). Thus constraints are often placed on the models to ensure that such sets are always in the \(\sigma\)-algebras.

4.4 Combining Quantitative and Qualitative Uncertainty

Although probabilities reflect quantitative uncertainty at one level, there can also be qualitative uncertainty about probabilities. We might want to have qualitative and quantitative uncertainty because we may be so uncertain about some situations that we do not want to assign numbers to the probabilities of their events, while there are other situations where we do have a sense of the probabilities of their events; and these situations can interact.

There are many situations in which we might not want to assign numerical values to uncertainties. One example is where a computer selects a bit 0 or 1, and we know nothing about how this bit is selected. Results of coin flips, on the other hand, are often used as examples of where we would assign probabilities to individual outcomes.

An example of how these might interact is where the result of the bit determines whether a fair coin or a weighted coin (say, heads with probability \(2/3\)) will be used for a coin flip. Thus there is qualitative uncertainty as to whether the action of flipping a coin yields heads with probability \(1/2\) or \(2/3\).

One way to formalize the interaction between probability and qualitative uncertainty is by adding another relation to the model and a modal operator to the language, as is done in Fagin and Halpern (1988, 1994). Formally, we add to a basic finite probability model a relation \(R\subseteq W^2\). Then we add to the language a modal operator \(\Box\), such that \((M,w)\models \Box\phi\) if and only if \((M,w')\models \phi\) whenever \(w R w'\).

Consider the following example:

  • \(W = \{(0,H),(0,T),(1,H),(1,T)\}\),

  • \(\Phi = \{h,t\}\) is the set of atomic propositions,

  • \(R = W^2\),

  • \(\mathcal{P}\) associates with \((0,H)\) and \((0,T)\) the distribution mapping \((0,H)\) and \((0,T)\) each to \(1/2\), and associates with \((1,H)\) and \((1,T)\) the distribution mapping \((1,H)\) to \(2/3\) and \((1,T)\) to \(1/3\),

  • \(V\) maps \(h\) to the set \(\{(0,H),(1,H)\}\) and \(t\) to the set \(\{(0,T),(1,T)\}\).

Then the following formula is true at \((0,H)\): \(\neg \Box h \wedge \neg \Box (P(h)= 1/2) \wedge \Diamond (P(h) = 1/2)\). This can be read as: it is not known that \(h\) is true, and it is not known that the probability of \(h\) is \(1/2\), but it is possible that the probability of \(h\) is \(1/2\).
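Each conjunct can be checked against the model directly. The sketch below encodes the four worlds, the two distributions, and the total relation \(R = W^2\), and evaluates the three conjuncts at \((0,H)\):

```python
from fractions import Fraction

# Check of ¬□h ∧ ¬□(P(h)=1/2) ∧ ◇(P(h)=1/2) at world (0,H) in the model above.
W = [(0, "H"), (0, "T"), (1, "H"), (1, "T")]
half = Fraction(1, 2)
P = {w: ({(0, "H"): half, (0, "T"): half, (1, "H"): 0, (1, "T"): 0}
         if w[0] == 0 else
         {(0, "H"): 0, (0, "T"): 0,
          (1, "H"): Fraction(2, 3), (1, "T"): Fraction(1, 3)})
     for w in W}
h = lambda w: w[1] == "H"            # V(h) = {(0,H), (1,H)}

def prob(w, phi):
    return sum(pr for w2, pr in P[w].items() if phi(w2))

def box(phi, w):
    # R = W^2, so every world sees every world and w itself plays no role.
    return all(phi(w2) for w2 in W)

w0 = (0, "H")
prob_half = lambda w2: prob(w2, h) == half
print(not box(h, w0))                   # True: h fails at (0,T)
print(not box(prob_half, w0))           # True: P(h) = 2/3 at (1,H)
print(any(prob_half(w2) for w2 in W))   # True: P(h) = 1/2 at (0,H)
```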

4.5 Constraints on Quantitative and Qualitative Interaction

We go into more detail about how to relate quantitative and qualitative uncertainties. Much of this is similar to the goal of the entry on formal representations of belief, but here we focus on the connections between the qualitative and quantitative tools, such as the epistemic relations \(R\) and probability spaces \(\mathcal{P}_w = (\Omega_w,\mathcal{A}_w,\mu_w)\) discussed above.

Rather than allowing quantitative and qualitative uncertainty to interact freely, it may be realistic to impose constraints between the two. Some constraints were proposed in Fagin and Halpern (1994). One constraint, called consistency, ensures that the locally defined sample space \(\Omega_w\) is contained in the set of all worlds the agent qualitatively considers possible. Another, called uniformity, ensures that all worlds within a sample space \(\Omega_w\) agree on the probability space. These two constraints are satisfied in the example in Section 4.4 concerning qualitative uncertainty about the probability assignment.

Another approach to constraining the relationship between quantitative and qualitative uncertainty is to define qualitative belief from quantitative uncertainty. A natural approach is to define belief in \(A\) to be quantitative certainty in \(A\) (assigning \(A\) probability 1). This can arise from an epistemic relation \(R\) defined by

\[(w,v)\in R \text{ if and only if } \mu_w(\{v\})\gt 0.\]

Rather than having the sample space \(\Omega_w\) be contained in the set of qualitative possibilities, it is more the other way around; the sample space here is often the set of all possible worlds, and the set an agent considers possible is a subset of that. For the rest of this subsection, we assume all worlds agree on a single probability space \((W,\mathcal{A},\mu)\).

We may wish to allow for a form of belief that is weaker than having probability 1. For example, someone might ‘believe’ her positive medical test result while acknowledging a non-zero chance of a false positive, and hence fall short of certainty that she has the condition being tested. Such belief, although weaker than probabilistic certainty, may be particularly relevant if it leads to decisions and actions. A natural way to define such a weak belief is the ‘Lockean thesis’, which defines belief in an event \(A\) as having \(\mu(A)\ge r\), where \(r\) is some threshold less than 1 (see the entry on formal representations of belief for more details about the Lockean thesis), though it is not always possible for such belief to arise from an epistemic relation. Alternatively, one can define an epistemic relation \(R\) from the probability function \(\mu\) and a threshold \(q\) by

\[(w,v) \in R \text{ if and only if } \mu(\{v\})\ge q.\]

As we are assuming every world agrees on the probability space, the relation \(R\) satisfies the KD45 axioms of belief (see the entry on epistemic logic for the KD45 axioms). The threshold \(q\) for possibility is typically very small; it is the lowest likelihood for a possibility to be considered reasonably likely and to be incorporated into the epistemic relation \(R\). In contrast, the Lockean threshold \(r\) is between 0.5 and 1. While a belief operator defined according to the Lockean thesis need not correspond to any epistemic relation, there are special cases where it does. If there is a \(q\) such that \(K = \{w\mid \mu(\{w\})\gt q\}\) is \(P\)-stable (where a subset \(K\) of \(W\) is \(P\)-stable if for all \(w\in K\), \(\mu(\{w\})\gt\mu(W\setminus K)\)), then the epistemic relation \(R\), defined by \((w,v)\in R\) if and only if \(v\in K\), gives rise to a modal belief operator that satisfies the Lockean thesis for threshold \(r=\mu(K)\) (see Leitgeb 2013 and Delgrande 2022).
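The \(P\)-stability construction can be illustrated with a small worked example (the numbers below are our own). We take a four-world space, extract \(K\) with a threshold \(q\), verify that \(K\) is \(P\)-stable, and then check the Lockean equivalence at threshold \(r=\mu(K)\) against every event:

```python
from fractions import Fraction
from itertools import combinations

# Toy example (our own numbers) of belief derived from a P-stable set.
W = ["w1", "w2", "w3", "w4"]
mu = {"w1": Fraction(5, 10), "w2": Fraction(3, 10),
      "w3": Fraction(1, 10), "w4": Fraction(1, 10)}

def measure(A):
    return sum(mu[w] for w in A)

q = Fraction(2, 10)
K = {w for w in W if mu[w] > q}        # K = {w1, w2}

# P-stability: each world in K outweighs all of W \ K together.
assert all(mu[w] > measure(set(W) - K) for w in K)

r = measure(K)                         # Lockean threshold r = mu(K) = 4/5
# Belief-as-K-inclusion matches the Lockean thesis at threshold r,
# checked over every event A ⊆ W:
for n in range(len(W) + 1):
    for A in map(set, combinations(W, n)):
        assert (K <= A) == (measure(A) >= r)

print(sorted(K), r)  # ['w1', 'w2'] 4/5
```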

4.6 Dynamics

We have discussed two views of modal probability logic. One is temporal or stochastic, where the probability distribution associated with each state determines the likelihood of transitioning into other states; another is concerned with the subjective perspectives of agents, who may reason about probabilities of other agents. A stochastic system is dynamic in that it represents probabilities of different transitions, and this can be conveyed by the modal probabilistic models themselves. But from a subjective view, the modal probabilistic models are static: the probabilities are concerned with what currently is the case. Although static in their interpretation, the modal probabilistic setting can be put in a dynamic context.

Dynamics in a modal probabilistic setting is generally concerned with simultaneous changes to probabilities in potentially all possible worlds. Intuitively, such a change may be caused by new information that invokes a probabilistic revision at each possible world. The dynamics of subjective probabilities is often modeled using conditional probabilities, such as in Kooi (2003), Baltag and Smets (2008), and van Benthem et al. (2009). The probability of \(E\) conditional on \(F\), written \(P(E\mid F)\), is \(P(E\cap F)/P(F)\). When updating by a set \(F\), a probability distribution \(P\) is replaced by the probability distribution \(P'\), such that \(P'(E)= P(E \mid F)\), so long as \(P(F)\neq 0\). Let us assume for the remainder of this dynamics subsection that every relevant set considered has positive probability.
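The update just described is ordinary conditionalization, and can be sketched directly (the distribution below is a toy example of our own):

```python
from fractions import Fraction

# Sketch of probabilistic update by conditionalization: revising P by F
# replaces P with P', where P'(E) = P(E | F) = P(E ∩ F) / P(F).
P = {"w1": Fraction(1, 2), "w2": Fraction(1, 4),
     "w3": Fraction(1, 8), "w4": Fraction(1, 8)}

def measure(P, A):
    return sum(P[w] for w in A if w in P)

def update(P, F):
    """The revised distribution after learning F (assumes P(F) > 0)."""
    pF = measure(P, F)
    return {w: (pr / pF if w in F else Fraction(0)) for w, pr in P.items()}

F = {"w1", "w2"}
E = {"w1", "w3"}
P2 = update(P, F)
print(measure(P2, E))                     # 2/3
print(measure(P, E & F) / measure(P, F))  # 2/3: agrees with P(E | F)
```

After the update, \(F\) itself receives probability 1, which is the sense in which revision by \(F\) makes the new information certain.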

Using a probability logic with linear combinations, we can abbreviate the conditional probability formula \(P(\phi\mid \psi)\ge q\) by \(P(\phi \wedge \psi) - qP(\psi)\ge 0\). In a modal setting, an operator \([!\psi]\) can be added to the language, such that \(M,w\models [!\psi]\phi\) if and only if \(M',w\models \phi\), where \(M'\) is the model obtained from \(M\) by revising the probabilities of each world by \(\psi\). Note that \([!\psi](P(\phi)\ge q)\) differs from \(P(\phi\mid \psi)\ge q\): in \([!\psi](P(\phi)\ge q)\), the interpretation of probability terms inside \(\phi\) is affected by the revision by \(\psi\), whereas in \(P(\phi\mid \psi)\ge q\), it is not, which is why \(P(\phi\mid \psi)\ge q\) nicely unfolds into another probability formula. However, \([!\psi]\phi\) does unfold too, but in more steps:

\[[!\psi](P(\phi)\ge q) \leftrightarrow (\psi\to P([!\psi]\phi \mid \psi) \ge q).\]

For other overviews of modal probability logics and their dynamics, see Demey and Kooi (2014), Demey and Sack (2015), and appendix L, on probabilistic update in dynamic epistemic logic, of the entry on dynamic epistemic logic.

5. First-order Probability Logic

In this section we will discuss first-order probability logics. As was explained in Section 1 of this entry, there are many ways in which a logic can have probabilistic features. The models of the logic can have probabilistic aspects, the notion of consequence can have a probabilistic flavor, or the language of the logic can contain probabilistic operators. In this section we will focus on those logical operators that have a first-order flavor. The first-order flavor is what distinguishes these operators from the probabilistic modal operators of the previous section.

Consider the following example from Bacchus (1990):

More than 75% of all birds fly.

There is a straightforward probabilistic interpretation of this sentence, namely that when one randomly selects a bird, the probability that the selected bird flies is more than 3/4. First-order probabilistic operators are needed to express this sort of statement.

There is another type of sentence, such as the following sentence discussed in Halpern (1990):

The probability that Tweety flies is greater than \(0.9\).

This sentence considers the probability that Tweety (a particular bird) can fly. These two types of sentences are addressed by two different types of semantics, where the former involves probabilities over a domain, while the latter involves probabilities over a set of possible worlds that is separate from the domain.

5.1 An Example of a First-order Probability Logic

In this subsection we will have a closer look at a particular first-order probability logic, whose language is as simple as possible, in order to focus on the probabilistic quantifiers. The language is very much like the language of classical first-order logic, but rather than the familiar universal and existential quantifiers, the language contains a probabilistic quantifier.

The language is built on a set of individual variables (denoted by \(x, y, z, x_1, x_2, \ldots\)), a set of function symbols (denoted by \(f, g, h, f_1, \ldots\)), where an arity is associated with each symbol (nullary function symbols are also called individual constants), and a set of predicate letters (denoted by \(R, P_1, \ldots\)), where an arity is associated with each symbol. The language contains two kinds of syntactical objects, namely terms and formulas. The terms are defined inductively as follows:

  • Every individual variable \(x\) is a term.

  • Every function symbol \(f\) of arity \(n\) followed by an \(n\)-tuple of terms \((t_1,\ldots,t_n)\) is a term.

Given this definition of terms, the formulas are defined inductively as follows:

  • Every predicate letter \(R\) of arity \(n\) followed by an \(n\)-tuple of terms \((t_1,\ldots,t_n)\) is a formula.

  • If \(\phi\) is a formula, then so is \(\neg \phi\).

  • If \(\phi\) and \(\psi\) are formulas, then so is \((\phi \wedge\psi)\).

  • If \(\phi\) is a formula and \(q\) is a rational number in the interval \([0,1]\), then so is \(Px (\phi) \geq q\).

Formulas of the form \(Px (\phi) \geq q\) should be read as: “the probability of selecting an \(x\) such that \(x\) satisfies \(\phi\) is at least \(q\)”. The formula \(Px(\phi) \leq q\) is an abbreviation of \(Px(\neg \phi) \geq 1-q\), and \(Px(\phi)=q\) is an abbreviation of \(Px(\phi) \geq q \wedge Px(\phi) \leq q\). Every free occurrence of \(x\) in \(\phi\) is bound by the operator.

This language is interpreted on very simple first-order models, which are triples \(M=(D,I,P)\), where the domain of discourse \(D\) is a finite nonempty set of objects, the interpretation \(I\) associates an \(n\)-ary function on \(D\) with every \(n\)-ary function symbol occurring in the language, and an \(n\)-ary relation on \(D\) with every \(n\)-ary predicate letter. \(P\) is a probability function that assigns a probability \(P(d)\) to every element \(d\) in \(D\) such that \(\sum_{d \in D} P(d)=1\).

In order to interpret formulas containing free variables one also needs an assignment \(g\), which assigns an element of \(D\) to every variable. The interpretation \([\![t]\!]_{M,g}\) of a term \(t\) given a model \(M=(D,I,P)\) and an assignment \(g\) is defined inductively as follows:

  • \([\![ x ]\!]_{M,g}=g(x)\)

  • \([\![ f (t_1,\ldots,t_n)]\!]_{M,g}= I(f) ([\![t_1]\!], \ldots,[\![t_n]\!])\)

Truth is defined as a relation \(\models\) between models with assignments and formulas:

  • \(M,g \models R(t_1,\ldots,t_n)\) iff \(([\![t_1]\!], \ldots,[\![t_n]\!]) \in I(R)\)

  • \(M,g \models \neg \phi\) iff \(M,g \not \models \phi\)

  • \(M,g \models (\phi \wedge \psi)\) iff \(M,g \models \phi\) and \(M,g\models \psi\)

  • \(M,g \models Px(\phi) \geq q\) iff \(\sum_{d :M,g[x \mapsto d]\models \phi} P(d) \geq q\)

As an example, consider a model of a vase containing nine marbles: five are black and four are white. Let us assume that \(P\) assigns a probability of 1/9 to each marble, which captures the idea that one is equally likely to pick any marble. Suppose the language contains a unary predicate \(B\) whose interpretation is the set of black marbles. The sentence \(Px(B(x)) = 5/9\) is true in this model regardless of the assignment.
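The vase model can be computed directly. The sketch below (marble names are our own) implements the truth clause for \(Px(\phi)\) by summing \(P(d)\) over the objects satisfying \(\phi\):

```python
from fractions import Fraction

# The vase model: nine equiprobable marbles, the first five black.
D = [f"m{i}" for i in range(9)]
P = {d: Fraction(1, 9) for d in D}
B = set(D[:5])                      # interpretation of the predicate B

def prob_x(phi):
    """Semantics of Px(phi): sum of P(d) over the d in D satisfying phi."""
    return sum(P[d] for d in D if phi(d))

print(prob_x(lambda d: d in B))     # 5/9, so Px(B(x)) = 5/9 is true
```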

The logic that we just presented is too simple to capture many forms of reasoning about probabilities. We will discuss three extensions here.

5.1.1 Quantifying over More than One Variable

First of all, one would like to reason about cases where more than one object is selected from the domain. Consider for example the probability of first picking a black marble, putting it back, and then picking a white marble from the vase. This probability is \(5/9 \times 4/9 = 20/81\), but we cannot express this in the language above. For this we need one operator that deals with multiple variables simultaneously, written as \(Px_1,\ldots, x_n (\phi) \geq q\). The semantics for such operators will then have to provide a probability measure on subsets of \(D^n\). The simplest way to do this is by simply taking the product of the probability function \(P\) on \(D\), which can be taken as an extension of \(P\) to tuples, where \(P(d_1,\ldots, d_n)= P(d_1) \times \cdots \times P(d_n)\), which yields the following semantics:

  • \(M,g \models Px_1\ldots x_n (\phi) \geq q\) iff \(\sum_{(d_1,\ldots,d_n) : M,g[x_1 \mapsto d_1, \ldots, x_n \mapsto d_n] \models \phi} P(d_1,\ldots,d_n) \geq q\)

This approach is taken by Bacchus (1990) and Halpern (1990), and corresponds to the idea that selections are independent and with replacement. With these semantics the example above can be formalized as \(Px,y (B(x) \wedge \neg B(y))= 20/81\). There are also more general approaches to extending the measure on the domain to tuples from the domain, such as those of Hoover (1978) and Keisler (1985).
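Under the product-measure semantics, the clause for \(Px_1\ldots x_n(\phi) \geq q\) can be evaluated by enumerating \(n\)-tuples of domain elements. The sketch below (with the same hypothetical marble names as before) checks the two-marble example:

```python
from fractions import Fraction
from itertools import product

# Same illustrative model as before: nine marbles, uniform measure, five black.
domain = [f"m{i}" for i in range(9)]
P = {d: Fraction(1, 9) for d in domain}
B = set(domain[:5])

def prob_xs(phi, domain, P, n):
    """Px_1...x_n(phi): sum of the product measure over the satisfying n-tuples.
    The product measure models independent selections with replacement."""
    total = Fraction(0)
    for tup in product(domain, repeat=n):
        if phi(*tup):
            measure = Fraction(1)
            for d in tup:
                measure *= P[d]
            total += measure
    return total

# Px,y(B(x) & ~B(y)) = 5/9 * 4/9 = 20/81: a black marble first, then a white one.
assert prob_xs(lambda x, y: x in B and y not in B, domain, P, 2) == Fraction(20, 81)
```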

5.1.2 Conditional Probability

When one considers the initial example that more than 75% of all birds fly, one finds that this cannot be adequately captured in a model where the domain contains objects that are not birds. These objects should not matter to what one wishes to express, but the probability quantifiers quantify over the whole domain. In order to restrict quantification, one must add conditional probability operators \(Px(\phi \mid \psi) \geq q\) with the following semantics:

  • \(M,g \models Px (\phi \mid \psi) \geq q\) iff, if there is a \(d \in D\) such that \(M,g[x \mapsto d] \models \psi\), then

    \[\frac{\sum_{d : M,g[x\mapsto d] \models \phi \wedge \psi} P(d)} {\sum_{d: M,g [x \mapsto d] \models \psi} P(d)} \geq q.\]

With these operators, the formula \(Px(F(x) \mid B(x)) > 3/4\) expresses that more than 75% of all birds fly.
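The conditional operator restricts both sums in the truth clause to the objects satisfying the condition, so non-birds in the domain drop out. A sketch with a hypothetical ten-object domain in which eight objects are birds and seven of those fly:

```python
from fractions import Fraction

# Hypothetical model: ten objects with the uniform measure; eight are birds
# and seven of those fly. The two non-birds are ignored by conditioning.
domain = list(range(10))
P = {d: Fraction(1, 10) for d in domain}
birds = set(range(8))
flies = set(range(7))  # seven of the eight birds fly

def cond_prob_x(phi, psi, domain, P):
    """Px(phi | psi): the probability of phi among the objects satisfying psi.
    Returns None when psi has no instance (the truth clause is then vacuous)."""
    denom = sum((P[d] for d in domain if psi(d)), Fraction(0))
    if denom == 0:
        return None
    num = sum((P[d] for d in domain if phi(d) and psi(d)), Fraction(0))
    return num / denom

# Px(F(x) | B(x)) = 7/8 > 3/4: more than 75% of all birds fly in this model.
assert cond_prob_x(lambda d: d in flies, lambda d: d in birds, domain, P) == Fraction(7, 8)
```

Note that the unconditional \(Px(F(x))\) would be only 7/10 here, which is why conditioning on \(B(x)\) is needed to express the claim about birds.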

5.1.3 Probabilities as Terms

When one wants to compare the probability of different events, say of selecting a black ball and selecting a white ball, it may be more convenient to consider probabilities to be terms in their own right. That is, an expression \(Px(\phi)\) is interpreted as referring to some rational number. Then one can extend the language with arithmetical operations such as addition and multiplication, and with operators such as equality and inequalities to compare probability terms. One can then express that one is twice as likely to select a black ball as a white ball with \(Px(B(x))=2 \times Px (W(x))\). Such an extension requires that the language contain two separate classes of terms: one for probabilities, numbers and the results of arithmetical operations on such terms, and one for the domain of discourse which the probabilistic operators quantify over. We will not present such a language and semantics in detail here. One can find such a system in Bacchus (1990).
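In a model, such a comparison amounts to computing both probability terms and comparing the resulting rational numbers. A sketch with a hypothetical vase of six black and three white balls, where the equality \(Px(B(x))=2 \times Px(W(x))\) holds:

```python
from fractions import Fraction

# Hypothetical vase: six black and three white balls, uniform measure.
# The names and the helper prob_x are illustrative, not from the entry.
domain = [f"b{i}" for i in range(9)]
P = {d: Fraction(1, 9) for d in domain}
black, white = set(domain[:6]), set(domain[6:])

def prob_x(phi):
    """Evaluate the probability term Px(phi) as a rational number."""
    return sum((P[d] for d in domain if phi(d)), Fraction(0))

# With probability terms in the language, Px(B(x)) = 2 * Px(W(x)) expresses
# that selecting a black ball is twice as likely as selecting a white one.
assert prob_x(lambda d: d in black) == 2 * prob_x(lambda d: d in white)
```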

5.2 Possible World First-order Probability Logic

In this subsection, we consider a first-order probability logic with a possible-world semantics (which we abbreviate FOPL). The language of FOPL is similar to the language of Section 5.1 related to that of Bacchus, except that here we have full quantifier formulas of the form \((\forall x)\phi\) for any formula \(\phi\), and instead of probability formulas of the form \(Px(\phi)\ge q\), we have probability formulas of the form \(P(\phi)\ge q\) (similar to the probability formulas in propositional probability logic).

The models of FOPL are of the form \(M = (W,D,I,P)\), where \(W\) is a set of possible worlds, \(D\) is a domain of discourse, \(I\) is a localized interpretation function mapping every \(w\in W\) to an interpretation function \(I(w)\) that associates to every function and predicate symbol a function or predicate of appropriate arity, and \(P\) is a probability function that assigns a probability \(P(w)\) to every \(w\) in \(W\).

Similarly to the simple example before, we involve an assignment function \(g\) mapping each variable to an element of the domain \(D\). To interpret terms, for every model \(M\), world \(w\in W\), and assignment function \(g\), we map each term \(t\) to domain elements as follows:

  • \([\![ x ]\!]_{M,w,g} =g(x)\)
  • \([\![ f (t_1,\ldots,t_n)]\!]_{M,w,g} = I(w)(f) ([\![t_1]\!]_{M,w,g}, \ldots, [\![t_n]\!]_{M,w,g})\)

Truth is defined according to a relation \(\models\) between pointed models (models with designated worlds) with assignments and formulas as follows:

  • \(M,w,g \models R(t_1,\ldots,t_n)\) iff \(([\![t_1]\!]_{M,w,g}, \ldots, [\![t_n]\!]_{M,w,g}) \in I(w)(R)\)

  • \(M,w,g \models \neg \phi\) iff \(M,w,g \not \models \phi\)

  • \(M,w,g \models (\phi \wedge \psi)\) iff \(M,w,g \models \phi\) and \(M,w,g \models \psi\)

  • \(M,w,g\models (\forall x)\varphi\) iff \(M,w,g[x \mapsto d]\models \varphi\) for all \(d\in D\), where \(g[x \mapsto d]\) is the same as \(g\) except that it maps \(x\) to \(d\).

  • \(M,w,g\models P(\varphi)\ge q\) iff \(P(\{w'\mid (M,w',g)\models\varphi\})\ge q\).

As an example, consider a model where there are two possible vases: 4 white marbles and 4 black marbles were put in both possible vases. But then another marble, called \(\mathsf{last}\), was placed in the vase; in one possible vase \(\mathsf{last}\) was white, and in the other it was black. Thus in the end, there are two possible vases: one with 5 black marbles and 4 white marbles, and the other with 4 black marbles and 5 white marbles. Suppose \(P\) assigns probability \(1/2\) to each of the two possible vases. Then \(P(B(\mathsf{last})) = 1/2\) is true regardless of the variable assignment (since \(\mathsf{last}\) is a constant), and the formula \((\exists x)\, P(B(x)) = 1/2\) is true as well.
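The possible-world clause sums \(P\) over worlds rather than over domain elements. The sketch below evaluates the two-vase example; the world names and the interpretation table are illustrative stand-ins for \(W\) and \(I(w)(B)\).

```python
from fractions import Fraction

# Two possible worlds, one per possible vase; the marble "last" is black
# in w1 and white in w2. Names (w1, w2, "last") are hypothetical labels.
worlds = ["w1", "w2"]
P = {"w1": Fraction(1, 2), "w2": Fraction(1, 2)}

# I(w)(B): the extension of the predicate "black" at each world.
base_black = {f"b{i}" for i in range(4)}          # the four original black marbles
I_B = {"w1": base_black | {"last"},               # in w1, "last" is also black
       "w2": base_black}                          # in w2, it is white

def prob(phi, worlds, P):
    """P(phi): the total probability of the worlds where phi holds."""
    return sum((P[w] for w in worlds if phi(w)), Fraction(0))

# P(B(last)) = 1/2: "last" is black in exactly one of the two equiprobable worlds.
assert prob(lambda w: "last" in I_B[w], worlds, P) == Fraction(1, 2)
```

The contrast with Section 5.1 is that uncertainty here lives over worlds, not over which object is selected: each original black marble is black in both worlds, so \(P(B(b)) = 1\) for each of them.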

5.3 Metalogic

Generally it is hard to provide proof systems for first-order probability logics, because the validity problem for these logics is generally undecidable. Unlike in classical first-order logic, it is not even the case that if an inference is valid, then one can establish this in finite time (see Abadi and Halpern (1994)).

Nonetheless there are many results for first-order probability logic. For instance, Hoover (1978) and Keisler (1985) study completeness results. Bacchus (1990) and Halpern (1990) also provide complete axiomatizations, as well as combinations, of first-order probability logics and possible-world first-order probability logics respectively. In Ognjanović and Rašković (2000), an infinitary complete axiomatization is given for a more general version of the possible-world first-order probability logic presented here.

Bibliography

  • Abadi, M. and Halpern, J. Y., 1994, “Decidability and Expressiveness for First-Order Logics of Probability,” Information and Computation, 112: 1–36.
  • Adams, E. W. and Levine, H. P., 1975, “On the UncertaintiesTransmitted from Premisses to Conclusions in DeductiveInferences,”Synthese, 30: 429–460.
  • Adams, E. W., 1998,A Primer of Probability Logic,Stanford, CA: CSLI Publications.
  • Arló Costa, H., 2005, “Non-Adjunctive Inference andClassical Modalities,”Journal of Philosophical Logic,34: 581–605.
  • Bacchus, F., 1990, Representing and Reasoning with Probabilistic Knowledge, Cambridge, MA: The MIT Press.
  • Baltag, A. and Smets, S., 2008, “Probabilistic DynamicBelief Revision,”Synthese, 165: 179–202.
  • van Benthem, J., 2017, “Against all odds: when logic meetsprobability”, inModelEd, TestEd, TrustEd. Essays Dedicatedto Ed Brinksma on the Occasion of His 60th Birthday, J. P.Katoen, R. Langerak and A. Rensink (eds.), Cham: Springer, pp.239–253.
  • van Benthem, J., Gerbrandy, J., and Kooi, B., 2009, “DynamicUpdate with Probabilities,”Studia Logica, 93:67–96.
  • Boole, G., 1854,An Investigation of the Laws of Thought, onwhich are Founded the Mathematical Theories of Logic andProbabilities, London: Walton and Maberly.
  • Burgess, J., 1969, “Probability Logic,”Journal ofSymbolic Logic, 34: 264–274.
  • Carnap, R., 1950,Logical Foundations of Probability,Chicago, IL: University of Chicago Press.
  • Cross, C., 1993, “From Worlds to Probabilities: AProbabilistic Semantics for Modal Logic,”Journal ofPhilosophical Logic, 22: 169–192.
  • Delgrande, J. and Renne, B., 2015, “The Logic ofQualitative Probability,” inProceedings of theTwenty-Fourth International Joint Conference on ArtificialIntelligence (IJCAI 2015), Q. Yang and M. Wooldridge (eds.), PaloAlto, CA: AAAI Press, pp. 2904–2910.
  • Delgrande, J., Renne, B., and Sack, J., 2019, “The logic ofqualitative probability,”Artificial Intelligence, 275:457–486.
  • Delgrande, J., Sack, J., Lakemeyer, G., and Pagnucco, M., 2022, “Epistemic Logic of Likelihood and Belief,” in Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), published by the International Joint Conferences on Artificial Intelligence, pp. 2599–2605.
  • Demey, L. and Kooi, B., 2014, “Logic and ProbabilisticUpdate,” in A. Baltag and S. Smets (eds.),Johan van Benthemon Logic and Information Dynamics, pp. 381–404.
  • Demey, L. and Sack, J., 2015, “Epistemic ProbabilisticLogic,” in theHandbook of Epistemic Logic. H. vanDitmarsch, J. Halpern, W. van der Hoek and B. Kooi (eds.), London:College Publications, pp. 147–202.
  • Dempster, A., 1968, “A Generalization of BayesianInference,”Journal of the Royal Statistical Society,30: 205–247.
  • De Morgan, A., 1847,Formal Logic, London: Taylor andWalton.
  • de Finetti, B., 1937, “La Prévision: Ses LoisLogiques, Ses Sources Subjectives”,Annales del’Institut Henri Poincaré, 7: 1–68; translatedas “Foresight. Its Logical Laws, Its Subjective Sources,”inStudies in Subjective Probability, H. E. Kyburg, Jr. andH. E. Smokler (eds.), Malabar, FL: R. E. Krieger Publishing Company,1980, pp. 53–118.
  • Douven, I. and Rott, H., 2018, “From probabilities tocategorical beliefs: Going beyond toy models,”Journal ofLogic and Computation, 28: 1099–1124.
  • Eagle, A., 2010,Philosophy of Probability: ContemporaryReadings, London: Routledge.
  • Fagin, R. and Halpern, J. Y., 1988, “Reasoning aboutKnowledge and Probability,” inProceedings of the 2ndconference on Theoretical aspects of reasoning about knowledge,M. Y. Vardi (ed.), Pacific Grove, CA: Morgan Kaufmann, pp.277–293.
  • –––, 1994, “Reasoning about Knowledge andProbability,”Journal of the ACM, 41:340–367.
  • Fagin, R., Halpern, J. Y., and Megiddo, N., 1990, “A Logicfor Reasoning about Probabilities,”Information andComputation, 87: 78–128.
  • Fitelson, B., 2006, “Inductive Logic,” inThePhilosophy of Science: An Encyclopedia, J. Pfeifer and S. Sarkar(eds.), New York, NY: Routledge, pp. 384–394.
  • van Fraassen, B., 1981a, “A Problem for Relative InformationMinimizers in Probability Kinematics,”British Journal forthe Philosophy of Science, 32:375–379.
  • –––, 1981b, “Probabilistic SemanticsObjectified: I. Postulates and Logics,”Journal ofPhilosophical Logic, 10: 371–391.
  • –––, 1983, “Gentlemen’s Wagers:Relevant Logic and Probability,”Philosophical Studies,43: 47–61.
  • –––, 1984, “Belief and the Will,”Journal of Philosophy, 81: 235–256.
  • Gärdenfors, P., 1975a, “Qualitative Probability as anIntensional Logic,”Journal of Philosophical Logic, 4:171–185.
  • –––, 1975b, “Some Basic Theorems ofQualitative Probability,”Studia Logica, 34:257–264.
  • Georgakopoulos, G., Kavvadias, D., and Papadimitriou, C. H., 1988,“Probabilistic Satisfiability,”Journal ofComplexity, 4: 1–11.
  • Gerla, G., 1994, “Inferences in Probability Logic,” Artificial Intelligence, 70: 33–52.
  • Gillies, D., 2000,Philosophical Theories of Probability,London: Routledge.
  • Goldblatt, R., 2010, “Deduction Systems for Coalgebras over Measurable Spaces,” Journal of Logic and Computation, 20(5): 1069–1100.
  • Goldman, A. J. and Tucker, A. W., 1956, “Theory of LinearProgramming,” inLinear Inequalities and Related Systems.Annals of Mathematics Studies 38, H. W. Kuhn and A. W. Tucker(eds.), Princeton: Princeton University Press, pp. 53–98.
  • Goosens, W. K., 1979, “Alternative Axiomatizations ofElementary Probability Theory,”Notre Dame Journal of FormalLogic, 20: 227–239.
  • Hájek, A., 2001, “Probability, Logic, and ProbabilityLogic,” inThe Blackwell Guide to Philosophical Logic,L. Goble (ed.), Oxford: Blackwell, pp. 362–384.
  • Hájek, A. and Hartmann, S., 2010, “BayesianEpistemology,” inA Companion to Epistemology, J.Dancy, E. Sosa, and M. Steup (eds.), Oxford: Blackwell, pp.93–106.
  • Haenni, R. and Lehmann, N., 2003, “ProbabilisticArgumentation Systems: a New Perspective on Dempster-ShaferTheory,”International Journal of Intelligent Systems,18: 93–106.
  • Haenni, R., Romeijn, J.-W., Wheeler, G., and Williamson, J., 2011,Probabilistic Logics and Probabilistic Networks, Dordrecht:Springer.
  • Hailperin, T., 1965, “Best Possible Inequalities for theProbability of a Logical Function of Events,”AmericanMathematical Monthly, 72: 343–359.
  • –––, 1984, “Probability Logic,”Notre Dame Journal of Formal Logic, 25: 198–212.
  • –––, 1986,Boole’s Logic andProbability, Amsterdam: North-Holland.
  • –––, 1996,Sentential Probability Logic:Origins, Development, Current Status, and Technical Applications,Bethlehem, PA: Lehigh University Press.
  • Halpern, J. Y. and Rabin, M. O., 1987, “A Logic to Reasonabout Likelihood”,Artificial Intelligence, 32:379–405.
  • Halpern, J. Y., 1990, “An Analysis of First-Order Logics of Probability”, Artificial Intelligence, 46: 311–350.
  • –––, 1991, “The Relationship betweenKnowledge, Belief, and Certainty,”Annals of Mathematics andArtificial Intelligence, 4: 301–322. Errata appeared inAnnals of Mathematics and Artificial Intelligence, 26 (1999):59–61.
  • –––, 2003,Reasoning about Uncertainty,Cambridge, MA: The MIT Press.
  • Hamblin, C.L., 1959, “The modal‘probably’”,Mind, 68: 234–240.
  • Hansen, P. and Jaumard, B., 2000, “ProbabilisticSatisfiability,” inHandbook of Defeasible Reasoning andUncertainty Management Systems. Volume 5: Algorithms for Uncertaintyand Defeasible Reasoning, J. Kohlas and S. Moral (eds.),Dordrecht: Kluwer, pp. 321–367.
  • Harrison-Trainor M., Holliday, W. H., and Icard, T., 2016,“A note on cancellation axioms for comparativeprobability”,Theory and Decision, 80:159–166.
  • –––, 2018, “Inferring probabilitycomparisons”,Mathematical Social Sciences, 91:62–70.
  • Hartmann, S. and Sprenger J., 2010, “BayesianEpistemology,” inRoutledge Companion to Epistemology,S. Bernecker and D. Pritchard (eds.), London: Routledge, pp.609–620.
  • Heifetz, A. and Mongin, P., 2001, “Probability Logic forType Spaces”,Games and Economic Behavior, 35:31–53.
  • Herzig, A. and Longin, D., 2003, “On Modal Probabilityand Belief,” inProceedings of the 7th European Conferenceon Symbolic and Quantitative Approaches to Reasoning with Uncertainty(ECSQARU 2003), T.D. Nielsen and N.L. Zhang (eds.), Lecture Notesin Computer Science 2711, Berlin: Springer, pp. 62–73.
  • Hoover, D. N., 1978, “Probability Logic,” Annals of Mathematical Logic, 14: 287–313.
  • Howson, C., 2003, “Probability and Logic,”Journalof Applied Logic, 1: 151–165.
  • –––, 2007, “Logic with Numbers,”Synthese, 156: 491–512.
  • –––, 2009, “Can Logic be Combined withProbability? Probably,”Journal of Applied Logic, 7:177–187.
  • Ilić-Stepić, A., Ognjanović, Z., Ikodinović, N., and Perović, A., 2012, “A \(p\)-adic Probability Logic,” Mathematical Logic Quarterly, 58(4–5): 263–280.
  • Jaynes, E. T., 2003,Probability Theory: The Logic ofScience, Cambridge: Cambridge University Press.
  • Jeffrey, R., 1992,Probability and the Art of Judgement,Cambridge: Cambridge University Press.
  • Jonsson, B., Larsen, K., and Yi, W., 2001 “ProbabilisticExtensions of Process Algebras,” inHandbook of ProcessAlgebra, J. A. Bergstra, A. Ponse, and S. A. Smolka (eds.),Amsterdam: Elsevier, pp. 685–710.
  • Kavvadias, D. and Papadimitriou, C. H., 1990, “A LinearProgramming Approach to Reasoning about Probabilities,”Annals of Mathematics and Artificial Intelligence, 1:189–205.
  • Keisler, H. J., 1985, “Probability Quantifiers,” in Model-Theoretic Logics, J. Barwise and S. Feferman (eds.), New York, NY: Springer, pp. 509–556.
  • Kooi B. P., 2003, “Probabilistic Dynamic EpistemicLogic,”Journal of Logic, Language and Information, 12:381–408.
  • Kraft, C. H., Pratt, J. W., and Seidenberg, A., 1959,“Intuitive Probability on Finite Sets,”Annals ofMathematical Statistics, 30: 408–419.
  • Kyburg, H. E., 1965, “Probability, Rationality, and the Ruleof Detachment,” inProceedings of the 1964 InternationalCongress for Logic, Methodology, and Philosophy of Science, Y.Bar-Hillel (ed.), Amsterdam: North-Holland, pp. 301–310.
  • –––, 1994, “Uncertainty Logics, ” inHandbook of Logic in Artificial Intelligence and LogicProgramming, D. M. Gabbay, C. J. Hogger, and J. A. Robinson(eds.), Oxford: Oxford University Press, pp. 397–438.
  • Larsen, K. and Skou, A., 1991, “Bisimulation throughProbabilistic Testing,”Information and Computation,94: 1–28.
  • Leblanc, H., 1979, “Probabilistic Semantics for First-OrderLogic,”Zeitschrift für mathematische Logik undGrundlagen der Mathematik, 25: 497–509.
  • –––, 1983, “Alternatives to StandardFirst-Order Semantics,” inHandbook of Philosophical Logic,Volume I, D. Gabbay and F. Guenthner (eds.), Dordrecht: Reidel,pp. 189–274.
  • Leitgeb, H., 2013, “Reducing belief simpliciter to degreesof belief,”Annals of Pure and Applied Logic, 164:1338–1389.
  • –––, 2014, “The stability theory ofbelief,”Philosophical Review, 123: 131–171.
  • –––, 2017,The Stability of Belief. HowRational Belief Coheres with Probability, Oxford: OxfordUniversity Press.
  • Lewis, D., 1980, “A Subjectivist’s Guide to ObjectiveChance,” inStudies in Inductive Logic and Probability.Volume 2, R. C. Jeffrey (ed.), Berkeley, CA: University ofCalifornia Press, pp. 263–293; reprinted inPhilosophicalPapers. Volume II, Oxford: Oxford University Press, 1987, pp.83–113.
  • Lin, H. and Kelly, K. T., 2012a, “A geo-logicalsolution to the lottery paradox, with applications to conditionallogic,”Synthese, 186: 531–575.
  • –––, 2012b, “Propositional reasoning thattracks probabilistic reasoning,”Journal of PhilosophicalLogic, 41: 957–981.
  • Miller, D., 1966, “A Paradox of Information,”British Journal for the Philosophy of Science, 17:59–61.
  • Morgan, C., 1982a, “There is a Probabilistic Semantics forEvery Extension of Classical Sentence Logic,”Journal ofPhilosophical Logic, 11: 431–442.
  • –––, 1982b, “Simple ProbabilisticSemantics for Propositional K, T, B, S4, and S5,”Journal ofPhilosophical Logic, 11: 443–458.
  • –––, 1983, “Probabilistic Semantics forPropositional Modal Logics”. inEssays in Epistemology andSemantics, H. Leblanc, R. Gumb, and R. Stern (eds.), New York,NY: Haven Publications, pp. 97–116.
  • Morgan, C. and Leblanc, H., 1983, “Probabilistic Semanticsfor Intuitionistic Logic,”Notre Dame Journal of FormalLogic, 24: 161–180.
  • Nilsson, N., 1986, “Probabilistic Logic,”Artificial Intelligence, 28: 71–87.
  • –––, 1993, “Probabilistic LogicRevisited,”Artificial Intelligence, 59:39–42.
  • Ognjanović, Z. and Rašković, M., 1999,“Some probability logics with new types of probabilityoperators,”Journal of Logic and Computation 9 (2):181–195.
  • Ognjanović, Z. and Rašković, M., 2000, “Some First-Order Probability Logics,” Theoretical Computer Science, 247(1–2): 191–212.
  • Ognjanović, Z., Rašković, M., and Marković,Z., 2016,Probability Logics: Probability-Based Formalization ofUncertain Reasoning, Springer International Publishing AG.
  • Ognjanović, Z., Perović, A., andRašković, M., 2008, “Logics with the QualitativeProbability Operator,”Logic Journal of the IGPL 16(2): 105–120.
  • Paris, J. B., 1994,The Uncertain Reasoner’s Companion,A Mathematical Perspective, Cambridge: Cambridge UniversityPress.
  • Parma, A. and Segala, R., 2007, “Logical Characterizationsof Bisimulations for Discrete Probabilistic Systems,” inProceedings of the 10th International Conference on Foundations ofSoftware Science and Computational Structures (FOSSACS), H. Seidl(ed.), Lecture Notes in Computer Science 4423, Berlin: Springer, pp.287–301.
  • Pearl, J., 1991, “Probabilistic Semantics for NonmonotonicReasoning,” inPhilosophy and AI: Essays at theInterface, R. Cummins and J. Pollock (eds.), Cambridge, MA: TheMIT Press, pp. 157–188.
  • Perović, A., Ognjanović, Z., Rašković, M.,Marković, Z., 2008, “A probabilistic logic with polynomialweight formulas”. In Hartmann, S., Kern-Isberner, G. (eds.)Proceedings of the Fifth International Symposium Foundations ofInformation and Knowledge Systems, FoIKS 2008, Pisa, Italy,11–15 February 2008.Lecture Notes in Computer Science,vol. 4932, pp. 239–252. Springer.
  • Ramsey, F. P., 1926, “Truth and Probability”, inFoundations of Mathematics and other Essays, R. B.Braithwaite (ed.), London: Routledge and Kegan Paul, 1931, pp.156–198; reprinted inStudies in SubjectiveProbability, H. E. Kyburg, Jr. and H. E. Smokler (eds.), 2nd ed.,Malabar, FL: R. E. Krieger Publishing Company, 1980, pp. 23–52;reprinted inPhilosophical Papers, D. H. Mellor (ed.)Cambridge: Cambridge University Press, 1990, pp. 52–94.
  • Reichenbach, H., 1949,The Theory of Probability,Berkeley, CA: University of California Press.
  • Romeijn, J.-W., 2011, “Statistics as Inductive Logic,”inHandbook for the Philosophy of Science. Vol. 7: Philosophy ofStatistics, P. Bandyopadhyay and M. Forster (eds.), Amsterdam:Elsevier, pp. 751–774.
  • Scott, D., 1964, “Measurement Structures and LinearInequalities,”Journal of Mathematical Psychology, 1:233–247.
  • Segerberg, K., 1971, “Qualitative Probability in a ModalSetting”, inProceedings 2nd Scandinavian LogicSymposium, E. Fenstad (ed.), Amsterdam: North-Holland, pp.341–352.
  • Shafer, G., 1976,A Mathematical Theory of Evidence,Princeton, NJ: Princeton University Press.
  • Suppes, P., 1966, “Probabilistic Inference and the Conceptof Total Evidence,” inAspects of Inductive Logic, J.Hintikka and P. Suppes (eds.), Amsterdam: Elsevier, pp.49–65.
  • Szolovits, P. and Pauker S.G., 1978, “Categorical andProbabilistic Reasoning in Medical Diagnosis,”ArtificialIntelligence, 11: 115–144.
  • Tarski, A., 1936, “Wahrscheinlichkeitslehre und mehrwertigeLogik”,Erkenntnis, 5: 174–175.
  • Vennekens, J., Denecker, M., and Bruynooghe, M., 2009,“CP-logic: A Language of Causal Probabilistic Events and itsRelation to Logic Programming,”Theory and Practice of LogicProgramming, 9: 245–308.
  • Walley, P., 1991,Statistical Reasoning with ImpreciseProbabilities, London: Chapman and Hall.
  • Williamson, J., 2002, “Probability Logic,” inHandbook of the Logic of Argument and Inference: the Turn Towardthe Practical, D. Gabbay, R. Johnson, H. J. Ohlbach, and J. Woods(eds.), Amsterdam: Elsevier, pp. 397–424.
  • Yalcin, S., 2010, “Probability Operators,”Philosophy Compass, 5: 916–937.

Other Internet Resources

[Please contact the author with suggestions.]

Acknowledgments

We would like to thank Johan van Benthem, Joe Halpern, Jan Heylen, Jan-Willem Romeijn and the anonymous referees for their comments on this entry.

Copyright © 2023 by
Lorenz Demey
Barteld Kooi
Joshua Sack <joshua.sack@gmail.com>


The Stanford Encyclopedia of Philosophy is copyright © 2023 by The Metaphysics Research Lab, Department of Philosophy, Stanford University

Library of Congress Catalog Data: ISSN 1095-5054

