



Probability space

From Wikipedia, the free encyclopedia
Mathematical concept
This article is about the mathematical concept. For the novel, see Probability Space (novel).

In probability theory, a probability space or a probability triple $(\Omega, \mathcal{F}, P)$ is a mathematical construct that provides a formal model of a random process or "experiment". For example, one can define a probability space which models the throwing of a die.

A probability space consists of three elements:[1][2]

  1. A sample space, $\Omega$, which is the set of all possible outcomes of a random process under consideration.
  2. An event space, $\mathcal{F}$, which is a set of events, where an event is a subset of outcomes in the sample space.
  3. A probability function, $P$, which assigns, to each event in the event space, a probability, which is a number between 0 and 1 (inclusive).

In order to provide a model of probability, these elements must satisfy the probability axioms.

In the example of the throw of a standard die,

  1. The sample space $\Omega$ is typically the set $\{1,2,3,4,5,6\}$ where each element in the set is a label which represents the outcome of the die landing on that label. For example, $1$ represents the outcome that the die lands on 1.
  2. The event space $\mathcal{F}$ could be the set of all subsets of the sample space, which would then contain simple events such as $\{5\}$ ("the die lands on 5"), as well as complex events such as $\{2,4,6\}$ ("the die lands on an even number").
  3. The probability function $P$ would then map each event to the number of outcomes in that event divided by 6; so for example, $\{5\}$ would be mapped to $1/6$, and $\{2,4,6\}$ would be mapped to $3/6 = 1/2$.
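
The die example can be sketched in a few lines of Python. This is only an illustration under the uniform assumption stated above; the names `omega`, `power_set`, `event_space` and `prob` are hypothetical helpers chosen here, not part of any library.

```python
from fractions import Fraction
from itertools import combinations

# Sample space for one throw of a standard die.
omega = frozenset({1, 2, 3, 4, 5, 6})

def power_set(s):
    """Return every subset of s as a set of frozensets."""
    items = list(s)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}

# Event space: all subsets of the sample space (2**6 = 64 events).
event_space = power_set(omega)

def prob(event):
    """Uniform measure: number of outcomes in the event divided by 6."""
    return Fraction(len(event), len(omega))

print(len(event_space))            # 64
print(prob(frozenset({5})))        # 1/6
print(prob(frozenset({2, 4, 6})))  # 1/2
```

Exact fractions are used instead of floats so that the values match the ones quoted in the text.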

When an experiment is conducted, it results in exactly one outcome $\omega$ from the sample space $\Omega$. All the events in the event space $\mathcal{F}$ that contain the selected outcome $\omega$ are said to "have occurred". The probability function $P$ must be defined so that, if the experiment were repeated arbitrarily many times, the number of occurrences of each event, as a fraction of the total number of experiments, would most likely tend towards the probability assigned to that event.
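
The frequency interpretation can be illustrated with a small simulation. The sketch below assumes that Python's pseudo-random generator is an acceptable stand-in for a fair die; `relative_frequency` is a hypothetical helper name, and the fixed seed is there only to make runs repeatable.

```python
import random

def relative_frequency(event, trials, seed=0):
    """Fraction of simulated die throws whose outcome lies in `event`."""
    rng = random.Random(seed)  # fixed seed: repeatable pseudo-random throws
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) in event)
    return hits / trials

even = {2, 4, 6}  # the event "the die lands on an even number"
for n in (100, 10_000, 100_000):
    # The printed frequencies should drift towards P(even) = 1/2.
    print(n, relative_frequency(even, n))
```

The exact numbers depend on the generator and the seed; only the trend towards 1/2 is the point.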

The Soviet mathematician Andrey Kolmogorov introduced the notion of a probability space and the axioms of probability in the 1930s. In modern probability theory, there are alternative approaches for axiomatization, such as the algebra of random variables.


Probability space for throwing a die twice in succession: The sample space $\Omega$ consists of all 36 possible outcomes; three different events (colored polygons) are shown, with their respective probabilities (assuming a discrete uniform distribution).

A probability space is a mathematical triplet $(\Omega, \mathcal{F}, P)$ that presents a model for a particular class of real-world situations. As with other models, its author ultimately defines which elements $\Omega$, $\mathcal{F}$, and $P$ will contain.

  • The sample space $\Omega$ is the set of all possible outcomes. An outcome is the result of a single execution of the model. Outcomes may be states of nature, possibilities, experimental results and the like. Every instance of the real-world situation (or run of the experiment) must produce exactly one outcome. If outcomes of different runs of an experiment differ in any way that matters, they are distinct outcomes. Which differences matter depends on the kind of analysis we want to do. This leads to different choices of sample space.
  • The σ-algebra $\mathcal{F}$ is a collection of all the events we would like to consider. This collection may or may not include each of the elementary events. Here, an "event" is a set of zero or more outcomes; that is, a subset of the sample space. An event is considered to have "happened" during an experiment when the outcome of the latter is an element of the event. Since the same outcome may be a member of many events, it is possible for many events to have happened given a single outcome. For example, when the trial consists of throwing two dice, the set of all outcomes with a sum of 7 pips may constitute an event, whereas outcomes with an odd number of pips may constitute another event. If the outcome is the element of the elementary event of two pips on the first die and five on the second, then both of the events, "7 pips" and "odd number of pips", are said to have happened.
  • The probability measure $P$ is a set function returning an event's probability. A probability is a real number between zero (impossible events have probability zero, though probability-zero events are not necessarily impossible) and one (the event happens almost surely, with almost total certainty). Thus $P$ is a function $P : \mathcal{F} \to [0,1]$. The probability measure function must satisfy two simple requirements: First, the probability of a countable union of mutually exclusive events must be equal to the countable sum of the probabilities of each of these events. For example, the probability of the union of the mutually exclusive events $\text{Head}$ and $\text{Tail}$ in the random experiment of one coin toss, $P(\text{Head} \cup \text{Tail})$, is the sum of the probability for $\text{Head}$ and the probability for $\text{Tail}$, $P(\text{Head}) + P(\text{Tail})$. Second, the probability of the sample space $\Omega$ must be equal to 1 (which accounts for the fact that, given an execution of the model, some outcome must occur). In the previous example the probability of the set of outcomes $P(\{\text{Head}, \text{Tail}\})$ must be equal to one, because it is entirely certain that the outcome will be either $\text{Head}$ or $\text{Tail}$ (the model neglects any other possibility) in a single coin toss.

Not every subset of the sample space $\Omega$ must necessarily be considered an event: some of the subsets are simply not of interest, others cannot be "measured". This is not so obvious in a case like a coin toss. In a different example, one could consider javelin throw lengths, where the events typically are intervals like "between 60 and 65 meters" and unions of such intervals, but not sets like the "irrational numbers between 60 and 65 meters".
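
On a finite sample space the requirements on the event collection and on $P$ can be checked mechanically. The sketch below assumes the standard definition of a σ-algebra (it contains $\Omega$ and is closed under complements and unions; for a finite collection, pairwise unions suffice) and tests the two requirements on $P$ described above for the coin toss. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def is_sigma_algebra(omega, events):
    """Finite check: contains omega, closed under complement and union."""
    if omega not in events:
        return False
    if any(omega - a not in events for a in events):
        return False
    return all(a | b in events for a, b in combinations(events, 2))

def is_probability_measure(omega, events, P):
    """Finite check: values in [0, 1], P(omega) = 1, additivity on disjoint pairs."""
    if any(not 0 <= P[a] <= 1 for a in events):
        return False
    if P[omega] != 1:
        return False
    return all(P[a | b] == P[a] + P[b]
               for a, b in combinations(events, 2) if not (a & b))

omega = frozenset({"Head", "Tail"})
head, tail = frozenset({"Head"}), frozenset({"Tail"})
events = {frozenset(), head, tail, omega}
P = {frozenset(): Fraction(0), head: Fraction(1, 2),
     tail: Fraction(1, 2), omega: Fraction(1)}

print(is_sigma_algebra(omega, events))           # True
print(is_probability_measure(omega, events, P))  # True

# Dropping {"Tail"} breaks closure under complements.
not_closed = {frozenset(), head, omega}
print(is_sigma_algebra(omega, not_closed))       # False
```

These checks only make sense for finite collections; countable additivity in general cannot be verified by enumeration.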



In short, a probability space is a measure space such that the measure of the whole space is equal to one.

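The expanded definition is the following: a probability space is a triple $(\Omega, \mathcal{F}, P)$ consisting of:

  • the sample space $\Omega$, an arbitrary non-empty set;
  • the σ-algebra $\mathcal{F} \subseteq 2^{\Omega}$, a set of subsets of $\Omega$, called events, such that $\mathcal{F}$ contains the sample space ($\Omega \in \mathcal{F}$), is closed under complements (if $A \in \mathcal{F}$, then $\Omega \setminus A \in \mathcal{F}$), and is closed under countable unions (if $A_i \in \mathcal{F}$ for $i = 1, 2, \dots$, then $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$);
  • the probability measure $P : \mathcal{F} \to [0,1]$, a function on $\mathcal{F}$ that is countably additive (if $A_1, A_2, \dots \in \mathcal{F}$ are pairwise disjoint, then $P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$) and assigns measure one to the entire sample space ($P(\Omega) = 1$).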

Discrete case


Discrete probability theory needs only at most countable sample spaces $\Omega$. Probabilities can be ascribed to points of $\Omega$ by the probability mass function $p : \Omega \to [0,1]$ such that $\sum_{\omega \in \Omega} p(\omega) = 1$. All subsets of $\Omega$ can be treated as events (thus, $\mathcal{F} = 2^{\Omega}$ is the power set). The probability measure takes the simple form

$$P(A) = \sum_{\omega \in A} p(\omega) \quad \text{for all } A \subseteq \Omega.$$
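
As a minimal sketch of this formula, the snippet below sums a probability mass function over an event. The loaded-die values in `pmf` are a hypothetical example chosen here, not taken from the text.

```python
from fractions import Fraction

# Hypothetical probability mass function of a loaded die; the values sum to 1.
pmf = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
       4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)}

def P(event):
    """Probability of an event: the sum of p(w) over the outcomes w in it."""
    return sum((pmf[w] for w in event), Fraction(0))

print(sum(pmf.values()))  # 1
print(P({2, 4, 6}))       # 7/10
print(P(set()))           # 0
```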

The greatest σ-algebra $\mathcal{F} = 2^{\Omega}$ describes the complete information. In general, a σ-algebra $\mathcal{F} \subseteq 2^{\Omega}$ corresponds to a finite or countable partition $\Omega = B_1 \cup B_2 \cup \dots$, the general form of an event $A \in \mathcal{F}$ being $A = B_{k_1} \cup B_{k_2} \cup \dots$. See also the examples.
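
For a finite partition, the corresponding σ-algebra can be enumerated by taking every union of blocks. The following is a sketch only; the function name and the three-block partition of a die's outcomes are illustrative assumptions.

```python
from itertools import combinations

def sigma_algebra_from_partition(blocks):
    """All unions of blocks of a finite partition, including the empty union."""
    blocks = [frozenset(b) for b in blocks]
    return {frozenset().union(*chosen)
            for r in range(len(blocks) + 1)
            for chosen in combinations(blocks, r)}

# Hypothetical partition of {1, ..., 6} into low, middle and high outcomes.
partition = [{1, 2}, {3, 4}, {5, 6}]
events = sigma_algebra_from_partition(partition)

print(len(events))                        # 8
print(frozenset({1, 2, 5, 6}) in events)  # True
print(frozenset({1}) in events)           # False
```

A partition with $k$ non-empty blocks yields $2^k$ events, which is why the count above is 8.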

The case $p(\omega) = 0$ is permitted by the definition, but rarely used, since such $\omega$ can safely be excluded from the sample space.

General case


If $\Omega$ is uncountable, it may still happen that $P(\omega) \neq 0$ for some $\omega$; such $\omega$ are called atoms. They form an at most countable (possibly empty) set, whose probability is the sum of the probabilities of all atoms. If this sum is equal to 1, then all other points can safely be excluded from the sample space, returning us to the discrete case. Otherwise, if the sum of the probabilities of all atoms is between 0 and 1, then the probability space decomposes into a discrete (atomic) part (possibly empty) and a non-atomic part.

Non-atomic case


If $P(\omega) = 0$ for all $\omega \in \Omega$ (in this case, $\Omega$ must be uncountable, because otherwise $P(\Omega) = 1$ could not be satisfied), then the discrete-case formula $P(A) = \sum_{\omega \in A} p(\omega)$ fails: the probability of a set is not necessarily the sum over the probabilities of its elements, as summation is only defined for countable numbers of elements. This makes the theory of probability spaces much more technical. A formulation stronger than summation, measure theory, is applicable. Initially the probabilities are ascribed to some "generator" sets (see the examples). Then a limiting procedure allows assigning probabilities to sets that are limits of sequences of generator sets, or limits of limits, and so on. All these sets together form the σ-algebra $\mathcal{F}$. For technical details see Carathéodory's extension theorem. Sets belonging to $\mathcal{F}$ are called measurable. In general they are much more complicated than generator sets, but much better behaved than non-measurable sets.

Complete probability space


A probability space $(\Omega, \mathcal{F}, P)$ is said to be a complete probability space if for all $B \in \mathcal{F}$ with $P(B) = 0$ and all $A \subset B$ one has $A \in \mathcal{F}$. Often, the study of probability spaces is restricted to complete probability spaces.



Discrete examples


Example 1


If the experiment consists of just one flip of a fair coin, then the outcome is either heads or tails: $\Omega = \{\text{H}, \text{T}\}$. The σ-algebra $\mathcal{F} = 2^{\Omega}$ contains $2^2 = 4$ events, namely: $\{\text{H}\}$ ("heads"), $\{\text{T}\}$ ("tails"), $\{\}$ ("neither heads nor tails"), and $\{\text{H}, \text{T}\}$ ("either heads or tails"); in other words, $\mathcal{F} = \{\{\}, \{\text{H}\}, \{\text{T}\}, \{\text{H}, \text{T}\}\}$. There is a fifty percent chance of tossing heads and fifty percent for tails, so the probability measure in this example is $P(\{\}) = 0$, $P(\{\text{H}\}) = 0.5$, $P(\{\text{T}\}) = 0.5$, $P(\{\text{H}, \text{T}\}) = 1$.

Example 2


The fair coin is tossed three times. There are 8 possible outcomes: $\Omega = \{\text{HHH}, \text{HHT}, \text{HTH}, \text{HTT}, \text{THH}, \text{THT}, \text{TTH}, \text{TTT}\}$ (here "HTH", for example, means that the first time the coin landed heads, the second time tails, and the last time heads again). The complete information is described by the σ-algebra $\mathcal{F} = 2^{\Omega}$ of $2^8 = 256$ events, where each of the events is a subset of $\Omega$.

Alice knows the outcome of the second toss only. Thus her incomplete information is described by the partition $\Omega = A_1 \sqcup A_2 = \{\text{HHH}, \text{HHT}, \text{THH}, \text{THT}\} \sqcup \{\text{HTH}, \text{HTT}, \text{TTH}, \text{TTT}\}$, where $\sqcup$ is the disjoint union, and the corresponding σ-algebra $\mathcal{F}_{\text{Alice}} = \{\{\}, A_1, A_2, \Omega\}$. Bryan knows only the total number of tails. His partition contains four parts: $\Omega = B_0 \sqcup B_1 \sqcup B_2 \sqcup B_3 = \{\text{HHH}\} \sqcup \{\text{HHT}, \text{HTH}, \text{THH}\} \sqcup \{\text{TTH}, \text{THT}, \text{HTT}\} \sqcup \{\text{TTT}\}$; accordingly, his σ-algebra $\mathcal{F}_{\text{Bryan}}$ contains $2^4 = 16$ events.

The two σ-algebras are incomparable: neither $\mathcal{F}_{\text{Alice}} \subseteq \mathcal{F}_{\text{Bryan}}$ nor $\mathcal{F}_{\text{Bryan}} \subseteq \mathcal{F}_{\text{Alice}}$ holds; both are sub-σ-algebras of $2^{\Omega}$.
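
The counts and the incomparability claim in this example can be verified by brute force. The sketch below builds each σ-algebra as the set of all unions of partition blocks; the helper `generated_sigma_algebra` and the variable names are illustrative, not standard API.

```python
from itertools import combinations, product

# The 8 outcomes of three coin tosses, written as strings such as "HTH".
omega = frozenset("".join(t) for t in product("HT", repeat=3))

def generated_sigma_algebra(blocks):
    """All unions of blocks of a finite partition."""
    return {frozenset().union(*chosen)
            for r in range(len(blocks) + 1)
            for chosen in combinations(blocks, r)}

# Alice: outcomes grouped by the result of the second toss.
alice_blocks = [frozenset(w for w in omega if w[1] == s) for s in "HT"]
# Bryan: outcomes grouped by the total number of tails.
bryan_blocks = [frozenset(w for w in omega if w.count("T") == k)
                for k in range(4)]

F_alice = generated_sigma_algebra(alice_blocks)
F_bryan = generated_sigma_algebra(bryan_blocks)

print(len(F_alice), len(F_bryan))              # 4 16
print(F_alice <= F_bryan, F_bryan <= F_alice)  # False False
```

Alice's event "second toss is heads" splits Bryan's block "exactly one tail", so it is not a union of his blocks; that is what the first `False` reflects.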

Example 3


If 100 voters are to be drawn randomly from among all voters in California and asked whom they will vote for as governor, then the set of all sequences of 100 Californian voters would be the sample space $\Omega$. We assume that sampling without replacement is used: only sequences of 100 different voters are allowed. For simplicity an ordered sample is considered, that is, a sequence (Alice, Bryan) is different from (Bryan, Alice). We also take for granted that each potential voter knows exactly their future choice, that is, they do not choose randomly.

Alice knows only whether or not Arnold Schwarzenegger has received at least 60 votes. Her incomplete information is described by the σ-algebra $\mathcal{F}_{\text{Alice}}$ that contains: (1) the set of all sequences in $\Omega$ where at least 60 people vote for Schwarzenegger; (2) the set of all sequences where fewer than 60 vote for Schwarzenegger; (3) the whole sample space $\Omega$; and (4) the empty set $\emptyset$.

Bryan knows the exact number of voters who are going to vote for Schwarzenegger. His incomplete information is described by the corresponding partition $\Omega = B_0 \sqcup B_1 \sqcup \cdots \sqcup B_{100}$, and the σ-algebra $\mathcal{F}_{\text{Bryan}}$ consists of $2^{101}$ events.

In this case, Alice's σ-algebra is a subset of Bryan's: $\mathcal{F}_{\text{Alice}} \subset \mathcal{F}_{\text{Bryan}}$. Bryan's σ-algebra is in turn a subset of the much larger "complete information" σ-algebra $2^{\Omega}$ consisting of $2^{n(n-1)\cdots(n-99)}$ events, where $n$ is the number of all potential voters in California.

Non-atomic examples


Example 4


A number between 0 and 1 is chosen at random, uniformly. Here $\Omega = [0,1]$, $\mathcal{F}$ is the σ-algebra of Borel sets on $\Omega$, and $P$ is the Lebesgue measure on $[0,1]$.

In this case, the open intervals of the form $(a,b)$, where $0 < a < b < 1$, could be taken as the generator sets. Each such set can be ascribed the probability $P((a,b)) = b - a$, which generates the Lebesgue measure on $[0,1]$ and the Borel σ-algebra on $\Omega$.
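
A Monte Carlo estimate gives a rough numerical check that an interval $(a,b)$ receives probability $b - a$ under the uniform model. This sketch assumes that `random.random()` approximates a uniform draw from $[0,1)$; the helper name and the sample size are arbitrary choices.

```python
import random

def estimate_interval_probability(a, b, trials=100_000, seed=0):
    """Estimate P((a, b)) as the fraction of uniform draws falling in (a, b)."""
    rng = random.Random(seed)  # fixed seed for repeatable results
    hits = sum(1 for _ in range(trials) if a < rng.random() < b)
    return hits / trials

a, b = 0.2, 0.5
# The estimate should be close to b - a = 0.3, up to sampling error.
print(estimate_interval_probability(a, b), b - a)
```

The simulation only approximates the measure; the exact statement is the one given by the Lebesgue measure.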

Example 5


A fair coin is tossed endlessly. Here one can take $\Omega = \{0,1\}^{\infty}$, the set of all infinite sequences of numbers 0 and 1. Cylinder sets $\{(x_1, x_2, \dots) \in \Omega : x_1 = a_1, \dots, x_n = a_n\}$ may be used as the generator sets. Each such set describes an event in which the first $n$ tosses have resulted in a fixed sequence $(a_1, \dots, a_n)$, and the rest of the sequence may be arbitrary. Each such event can be naturally given the probability $2^{-n}$.

These two non-atomic examples are closely related: a sequence $(x_1, x_2, \dots) \in \{0,1\}^{\infty}$ leads to the number $2^{-1}x_1 + 2^{-2}x_2 + \cdots \in [0,1]$. This is not a one-to-one correspondence between $\{0,1\}^{\infty}$ and $[0,1]$, however: it is an isomorphism modulo zero, which allows for treating the two probability spaces as two forms of the same probability space. In fact, all non-pathological non-atomic probability spaces are the same in this sense. They are so-called standard probability spaces. Basic applications of probability spaces are insensitive to standardness. However, non-discrete conditioning is easy and natural on standard probability spaces; otherwise it becomes obscure.
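
The link between the two examples can be made concrete for cylinder sets: fixing the first $n$ tosses confines the associated number to a dyadic interval whose length equals the cylinder's probability. The sketch below works with finite prefixes only; both function names are hypothetical.

```python
from fractions import Fraction

def cylinder_probability(prefix):
    """Probability 2**(-n) of the cylinder set fixing the first n tosses."""
    return Fraction(1, 2 ** len(prefix))

def cylinder_interval(prefix):
    """Endpoints of the interval of numbers 2**-1*x1 + 2**-2*x2 + ...
    obtainable from sequences that start with `prefix`."""
    lo = sum((Fraction(bit, 2 ** (i + 1)) for i, bit in enumerate(prefix)),
             Fraction(0))
    return lo, lo + Fraction(1, 2 ** len(prefix))

prefix = (1, 0, 1)
lo, hi = cylinder_interval(prefix)
print(lo, hi)                                 # 5/8 3/4
print(hi - lo, cylinder_probability(prefix))  # 1/8 1/8
```

The interval length and the cylinder probability agree, which is the sense in which the coin-tossing measure corresponds to the Lebesgue measure on $[0,1]$.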

Related concepts


Probability distribution

Main article: Probability distribution

Random variables

Main article: Random variable

A random variable $X$ is a measurable function $X : \Omega \to S$ from the sample space $\Omega$ to another measurable space $S$ called the state space.

If $A \subset S$, the notation $\Pr(X \in A)$ is a commonly used shorthand for $P(\{\omega \in \Omega : X(\omega) \in A\})$.
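
The shorthand can be unpacked in code: $\Pr(X \in A)$ is the probability of the preimage of $A$ under $X$. The sketch assumes two fair dice with the uniform measure and takes $X$ to be the sum of the pips; all names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs of results from two dice.
omega = list(product(range(1, 7), repeat=2))

def P(event):
    """Uniform measure on the 36 outcomes."""
    return Fraction(len(event), len(omega))

def X(w):
    """Random variable: the sum of the two dice."""
    return w[0] + w[1]

def pr_X_in(A):
    """Pr(X in A), computed as P of the preimage {w : X(w) in A}."""
    preimage = {w for w in omega if X(w) in A}
    return P(preimage)

print(pr_X_in({7}))      # 1/6
print(pr_X_in({2, 12}))  # 1/18
```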

Defining the events in terms of the sample space


If $\Omega$ is countable, we almost always define $\mathcal{F}$ as the power set of $\Omega$, i.e. $\mathcal{F} = 2^{\Omega}$, which is trivially a σ-algebra and the biggest one we can create using $\Omega$. We can therefore omit $\mathcal{F}$ and just write $(\Omega, P)$ to define the probability space.

On the other hand, if $\Omega$ is uncountable and we use $\mathcal{F} = 2^{\Omega}$, we get into trouble defining our probability measure $P$ because $\mathcal{F}$ is too "large", i.e. there will often be sets to which it will be impossible to assign a unique measure. In this case, we have to use a smaller σ-algebra $\mathcal{F}$, for example the Borel algebra of $\Omega$, which is the smallest σ-algebra that makes all open sets measurable.

Conditional probability

Main article: Conditional probability

Kolmogorov's definition of probability spaces gives rise to the natural concept of conditional probability. Every set $A$ with non-zero probability (that is, $P(A) > 0$) defines another probability measure $P(B \mid A) = \frac{P(B \cap A)}{P(A)}$ on the space. This is usually pronounced as the "probability of $B$ given $A$".

For any event $A$ such that $P(A) > 0$, the function $Q$ defined by $Q(B) = P(B \mid A)$ for all events $B$ is itself a probability measure.
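
A small sketch shows the conditional measure in action on a fair die. The helper `conditional` is hypothetical; it returns the function $Q$ for a given conditioning event and rejects events of probability zero.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))

def P(event):
    """Uniform measure on one throw of a fair die."""
    return Fraction(len(event), len(omega))

def conditional(A):
    """Return Q with Q(B) = P(B and A) / P(A); requires P(A) > 0."""
    A = frozenset(A)
    if P(A) == 0:
        raise ValueError("conditioning event must have positive probability")
    def Q(B):
        return P(frozenset(B) & A) / P(A)
    return Q

even = {2, 4, 6}
Q = conditional(even)

print(Q({5, 6}))                           # 1/3
print(Q(omega))                            # 1
print(Q({2}) + Q({4, 6}) == Q({2, 4, 6}))  # True
```

The last two lines check, on this example, the normalization and additivity that make $Q$ a probability measure.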


Independence

Main article: Statistical independence

Two events, $A$ and $B$, are said to be independent if $P(A \cap B) = P(A)P(B)$.

Two random variables, $X$ and $Y$, are said to be independent if any event defined in terms of $X$ is independent of any event defined in terms of $Y$. Formally, they generate independent σ-algebras, where two σ-algebras $G$ and $H$, which are subsets of $\mathcal{F}$, are said to be independent if any element of $G$ is independent of any element of $H$.
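
The product criterion for events is easy to test by enumeration. The sketch below assumes two fair dice with the uniform measure; the event names are illustrative.

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))

def P(event):
    """Uniform measure on the 36 outcomes of two dice."""
    return Fraction(len(event), len(omega))

def independent(A, B):
    """Check the defining identity P(A and B) == P(A) * P(B)."""
    return P(A & B) == P(A) * P(B)

first_even = {w for w in omega if w[0] % 2 == 0}
sum_is_7 = {w for w in omega if sum(w) == 7}
sum_is_8 = {w for w in omega if sum(w) == 8}

print(independent(first_even, sum_is_7))  # True
print(independent(first_even, sum_is_8))  # False
```

Knowing that the first die is even says nothing about whether the sum is 7, but it does change the chance that the sum is 8.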

Mutual exclusivity

Main article: Mutual exclusivity

Two events, $A$ and $B$, are said to be mutually exclusive or disjoint if the occurrence of one implies the non-occurrence of the other, i.e., their intersection is empty. This is a stronger condition than the probability of their intersection being zero.

If $A$ and $B$ are disjoint events, then $P(A \cup B) = P(A) + P(B)$. This extends to a (finite or countably infinite) sequence of events. However, the probability of the union of an uncountable set of events is not the sum of their probabilities. For example, if $Z$ is a normally distributed random variable, then $P(Z = x)$ is 0 for any $x$, but $P(Z \in \mathbb{R}) = 1$.
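
The difference between being disjoint and having an intersection of probability zero can be seen on a three-point space in which one outcome carries zero mass. The mass function below is a hypothetical example.

```python
from fractions import Fraction

# Hypothetical mass function: outcome "b" is possible in the model
# but has probability zero.
pmf = {"a": Fraction(1, 2), "b": Fraction(0), "c": Fraction(1, 2)}

def P(event):
    """Sum of the point masses of the outcomes in the event."""
    return sum((pmf[w] for w in event), Fraction(0))

A, B = {"a", "b"}, {"b", "c"}

print(sorted(A & B))            # ['b']  (A and B are not disjoint)
print(P(A & B))                 # 0
print(P(A | B) == P(A) + P(B))  # True

C, D = {"a"}, {"c"}             # genuinely disjoint events
print(P(C | D) == P(C) + P(D))  # True
```

Additivity holds in both cases here, but only `C` and `D` are mutually exclusive in the set-theoretic sense.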

The event $A \cap B$ is referred to as "$A$ and $B$", and the event $A \cup B$ as "$A$ or $B$".

References



  1. Loève, Michel (1955). Probability Theory, Vol. 1. New York: D. Van Nostrand Company.
  2. Stroock, D. W. (1999). Probability theory: an analytic view. Cambridge University Press.




