In probability theory, a pairwise independent collection of random variables is a set of random variables any two of which are independent.[1] Any collection of mutually independent random variables is pairwise independent, but some pairwise independent collections are not mutually independent. Pairwise independent random variables with finite variance are uncorrelated.
A pair of random variables X and Y are independent if and only if the random vector (X, Y) with joint cumulative distribution function (CDF) $F_{X,Y}(x,y)$ satisfies

$$F_{X,Y}(x,y) = F_X(x)\,F_Y(y) \quad \text{for all } x, y,$$
or equivalently, their joint density $f_{X,Y}(x,y)$ satisfies

$$f_{X,Y}(x,y) = f_X(x)\,f_Y(y) \quad \text{for all } x, y.$$
That is, the joint distribution is equal to the product of the marginal distributions.[2]
Unless otherwise made clear by context, in practice the modifier "mutual" is usually dropped, so that independence means mutual independence. A statement such as "X, Y, Z are independent random variables" means that X, Y, Z are mutually independent.
Pairwise independence does not imply mutual independence, as shown by the following example attributed to S. Bernstein.[3]
Suppose X and Y are two independent tosses of a fair coin, where we designate 1 for heads and 0 for tails. Let the third random variable Z be equal to 1 if exactly one of those coin tosses resulted in "heads", and 0 otherwise (i.e., $Z = X \oplus Y$, the exclusive or). Then jointly the triple (X, Y, Z) has the following probability distribution:

$$(X,Y,Z) = \begin{cases} (0,0,0) & \text{with probability } 1/4, \\ (0,1,1) & \text{with probability } 1/4, \\ (1,0,1) & \text{with probability } 1/4, \\ (1,1,0) & \text{with probability } 1/4. \end{cases}$$
Here the marginal probability distributions are identical:

$$f_X(0) = f_Y(0) = f_Z(0) = \tfrac{1}{2} \quad \text{and} \quad f_X(1) = f_Y(1) = f_Z(1) = \tfrac{1}{2}.$$

The bivariate distributions also agree:

$$f_{X,Y} = f_{X,Z} = f_{Y,Z}, \quad \text{where} \quad f(0,0) = f(0,1) = f(1,0) = f(1,1) = \tfrac{1}{4}.$$
Since each of the pairwise joint distributions equals the product of their respective marginal distributions, the variables are pairwise independent: X and Y are independent, X and Z are independent, and Y and Z are independent.
However, X, Y, and Z are not mutually independent, since

$$f_{X,Y,Z}(x,y,z) \neq f_X(x)\,f_Y(y)\,f_Z(z);$$

for example, the left side equals 1/4 for (x, y, z) = (0, 0, 0), while the right side equals 1/8 for (x, y, z) = (0, 0, 0). In fact, any one of X, Y, Z is completely determined by the other two (any of X, Y, Z is the sum, modulo 2, of the others). That is as far from independence as random variables can get.
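Bernstein's example is small enough to check exhaustively. The following Python sketch (the helper names are illustrative, not from any library) enumerates the four equally likely outcomes and confirms pairwise independence together with the failure of mutual independence:

```python
from itertools import product
from fractions import Fraction

# Joint distribution: each (x, y) pair has probability 1/4, and z = x XOR y.
joint = {(x, y, x ^ y): Fraction(1, 4) for x, y in product((0, 1), repeat=2)}

def marginal(idx, val):
    return sum(p for outcome, p in joint.items() if outcome[idx] == val)

def pair_prob(i, j, vi, vj):
    return sum(p for o, p in joint.items() if o[i] == vi and o[j] == vj)

# Pairwise independence: P(A=a, B=b) == P(A=a) P(B=b) for every pair.
for i, j in [(0, 1), (0, 2), (1, 2)]:
    for vi, vj in product((0, 1), repeat=2):
        assert pair_prob(i, j, vi, vj) == marginal(i, vi) * marginal(j, vj)

# Mutual independence fails: P(X=0, Y=0, Z=0) = 1/4, but the product of the
# three marginals is 1/8.
assert joint.get((0, 0, 0), Fraction(0)) == Fraction(1, 4)
assert marginal(0, 0) * marginal(1, 0) * marginal(2, 0) == Fraction(1, 8)
```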
Bounds on the probability that the sum of Bernoulli random variables is at least one, commonly known as the union bound, are provided by the Boole–Fréchet[4][5] inequalities. While these bounds assume only univariate information, several bounds that use knowledge of general bivariate probabilities have been proposed too. Denote by $\{A_i\}_{i \in \{1, 2, \ldots, n\}}$ a set of $n$ Bernoulli events with probability of occurrence $\mathbb{P}(A_i) = p_i$ for each $i$. Suppose the bivariate probabilities are given by $\mathbb{P}(A_i \cap A_j) = p_{ij}$ for every pair of indices $(i, j)$. Kounias[6] derived the following upper bound:

$$\mathbb{P}\left(\bigcup_i A_i\right) \leq \sum_{i=1}^{n} p_i - \max_{j \in \{1, \ldots, n\}} \sum_{i \neq j} p_{ij},$$

which subtracts the maximum weight of a star spanning tree on a complete graph with $n$ nodes (where the edge weights are given by $p_{ij}$) from the sum of the marginal probabilities $\sum_i p_i$.
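As a concrete illustration, the sketch below evaluates the Kounias bound by scanning all candidate star centres; the function name and the input format (marginals as a list, intersection probabilities as a dict keyed by index pairs with $i < j$) are assumptions made for this example, not notation from the cited paper:

```python
def kounias_bound(p, p_ij):
    """Upper bound on P(union of events) from marginals p[i] and
    pairwise intersection probabilities p_ij[(i, j)] for i < j."""
    n = len(p)
    def w(i, j):
        return p_ij[(min(i, j), max(i, j))]
    # For each candidate centre j, sum the weights of the star rooted at j;
    # the heaviest star is subtracted from the sum of the marginals.
    best_star = max(sum(w(i, j) for i in range(n) if i != j) for j in range(n))
    return min(sum(p) - best_star, 1.0)  # a probability never exceeds 1

# Example: three pairwise independent events, so p_ij = p_i * p_j.
p = [0.3, 0.4, 0.5]
p_ij = {(i, j): p[i] * p[j] for i in range(3) for j in range(i + 1, 3)}
print(kounias_bound(p, p_ij))  # 1.2 - max star weight = 1.2 - 0.35 = 0.85
```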
Hunter and Worsley[7][8] tightened this upper bound by optimizing over all spanning trees as follows:

$$\mathbb{P}\left(\bigcup_i A_i\right) \leq \sum_{i=1}^{n} p_i - \max_{\tau \in T} \sum_{(i,j) \in \tau} p_{ij},$$

where $T$ is the set of all spanning trees on the graph. These bounds are not the tightest possible with general bivariates $p_{ij}$, even when feasibility is guaranteed, as shown in Boros et al.[9] However, when the variables are pairwise independent ($p_{ij} = p_i p_j$), Ramachandra and Natarajan[10] showed that the Kounias–Hunter–Worsley[6][7][8] bound is tight, by proving that the maximum probability of the union of events admits a closed-form expression given as

$$\mathbb{P}\left(\bigcup_i A_i\right) = \min\left\{ p_n + (1 - p_n) \sum_{i=1}^{n-1} p_i,\; 1 \right\}, \qquad (1)$$
where the probabilities are sorted in increasing order as $0 \leq p_1 \leq p_2 \leq \cdots \leq p_n \leq 1$. The tight bound in Eq. 1 depends only on the sum of the $n-1$ smallest probabilities, $\sum_{i=1}^{n-1} p_i$, and the largest probability, $p_n$. Thus, while the ordering of the probabilities plays a role in the derivation of the bound, the ordering among the $n-1$ smallest probabilities $\{p_1, \ldots, p_{n-1}\}$ is inconsequential, since only their sum is used.
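The closed form in Eq. 1 is cheap to evaluate; here is a minimal sketch with an illustrative function name. For pairwise independent events it matches the Kounias–Hunter–Worsley value (capped at 1), since with edge weights $p_i p_j$ the heaviest spanning tree is the star centred at the largest probability $p_n$:

```python
def tight_pairwise_bound(p):
    """Eq. 1: min{ p_n + (1 - p_n) * (sum of the other probabilities), 1 }."""
    q = sorted(p)                      # sort marginals in increasing order
    s, p_max = sum(q[:-1]), q[-1]      # sum of the n-1 smallest, and largest
    return min(p_max + (1.0 - p_max) * s, 1.0)

p = [0.3, 0.4, 0.5]
print(tight_pairwise_bound(p))  # 0.5 + 0.5 * 0.7 = 0.85, same as the KHW value
```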
It is useful to compare the smallest bounds on the probability of the union with arbitrary dependence and pairwise independence respectively. The tightest Boole–Fréchet upper union bound (assuming only univariate information) is given as

$$\mathbb{P}\left(\bigcup_i A_i\right) \leq \min\left\{ \sum_{i=1}^{n} p_i,\; 1 \right\}. \qquad (2)$$
As shown in Ramachandra–Natarajan,[10] it can be easily verified that the ratio of the two tight bounds in Eq. 2 and Eq. 1 is upper bounded by $4/3$, where the maximum value of $4/3$ is attained when $\sum_{i=1}^{n-1} p_i = 1/2$ and $p_n = 1/2$, with the probabilities sorted in increasing order as $0 \leq p_1 \leq p_2 \leq \cdots \leq p_n \leq 1$. In other words, in the best-case scenario the pairwise independence bound in Eq. 1 provides an improvement of 25% over the univariate bound in Eq. 2.
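A quick numerical check of the stated worst case (illustrative values; with n = 2, taking $p_1 = p_2 = 1/2$ satisfies the condition above):

```python
p = [0.5, 0.5]
univariate = min(sum(p), 1.0)               # Eq. 2: min{1.0, 1} = 1.0
pairwise = min(0.5 + (1 - 0.5) * 0.5, 1.0)  # Eq. 1: 0.5 + 0.25 = 0.75
print(univariate / pairwise)                # 4/3 ≈ 1.3333, a 25% improvement
```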
More generally, we can talk about k-wise independence, for any k ≥ 2. The idea is similar: a set of random variables is k-wise independent if every subset of size k of those variables is independent. k-wise independence has been used in theoretical computer science, where it was used to prove a theorem about the problem MAXEkSAT.
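For intuition, one classical construction of k-wise independent variables (a standard textbook device, not tied to the MAXEkSAT result above) evaluates a uniformly random polynomial of degree less than k over a prime field at distinct points; any k of the resulting values are then independent and uniform over the field:

```python
import random

def k_wise_independent_sample(n, k, prime=101):
    """Return n values over GF(prime) that are k-wise independent."""
    assert n <= prime, "need n distinct evaluation points in the field"
    coeffs = [random.randrange(prime) for _ in range(k)]  # random poly, deg < k
    def poly(x):
        acc = 0
        for c in reversed(coeffs):   # Horner evaluation modulo the prime
            acc = (acc * x + c) % prime
        return acc
    return [poly(x) for x in range(n)]  # evaluate at n distinct points

print(k_wise_independent_sample(n=10, k=3))
```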
k-wise independence is used in the proof that k-independent hashing functions are secure unforgeable message authentication codes.