
Convergence of measures

From Wikipedia, the free encyclopedia
Mathematical concept
Not to be confused with Convergence in measure.

In mathematics, more specifically measure theory, there are various notions of the convergence of measures. For an intuitive general sense of what is meant by convergence of measures, consider a sequence of measures $\mu_n$ on a space, sharing a common collection of measurable sets. Such a sequence might represent an attempt to construct 'better and better' approximations to a desired measure $\mu$ that is difficult to obtain directly. The meaning of 'better and better' is subject to all the usual caveats for taking limits: for any error tolerance $\varepsilon > 0$ we require that there be $N$ sufficiently large that, for all $n \geq N$, the 'difference' between $\mu_n$ and $\mu$ is smaller than $\varepsilon$. Various notions of convergence specify precisely what the word 'difference' should mean in that description; these notions are not equivalent to one another, and vary in strength.

Three of the most common notions of convergence are described below.

Informal descriptions

This section attempts to provide a rough intuitive description of three notions of convergence, using terminology developed in calculus courses; this section is necessarily imprecise, and the reader should refer to the formal clarifications in subsequent sections. In particular, the descriptions here do not address the possibility that the measure of some sets could be infinite, or that the underlying space could exhibit pathological behavior, and additional technical assumptions are needed for some of the statements. The statements in this section are, however, all correct if $\mu_n$ is a sequence of probability measures on a Polish space.

The various notions of convergence formalize the assertion that the 'average value' of each 'sufficiently nice' function should converge:

$$\int f\,d\mu_n \to \int f\,d\mu$$

To formalize this requires a careful specification of the set of functions under consideration and how uniform the convergence should be.

The notion of weak convergence requires this convergence to take place for every continuous bounded function $f$. This notion treats convergence for different functions $f$ independently of one another, i.e., different functions $f$ may require different values of $N \leq n$ to be approximated equally well (thus, convergence is non-uniform in $f$).

The notion of setwise convergence formalizes the assertion that the measure of each measurable set should converge:

$$\mu_n(A) \to \mu(A)$$

Again, no uniformity over the set $A$ is required. Intuitively, considering integrals of 'nice' functions, this notion provides more uniformity than weak convergence. As a matter of fact, when considering sequences of measures with uniformly bounded variation on a Polish space, setwise convergence implies the convergence $\int f\,d\mu_n \to \int f\,d\mu$ for any bounded measurable function $f$ [citation needed]. As before, this convergence is non-uniform in $f$.

The notion of total variation convergence formalizes the assertion that the measure of all measurable sets should converge uniformly, i.e. for every $\varepsilon > 0$ there exists $N$ such that $|\mu_n(A) - \mu(A)| < \varepsilon$ for every $n > N$ and for every measurable set $A$. As before, this implies convergence of integrals against bounded measurable functions, but this time convergence is uniform over all functions bounded by any fixed constant.

Total variation convergence of measures

This is the strongest notion of convergence shown on this page and is defined as follows. Let $(X, \mathcal{F})$ be a measurable space. The total variation distance between two (positive) measures $\mu$ and $\nu$ is then given by

$$\left\|\mu - \nu\right\|_{\text{TV}} = \sup_{f} \left\{ \int_X f\,d\mu - \int_X f\,d\nu \right\}.$$

Here the supremum is taken over $f$ ranging over the set of all measurable functions from $X$ to $[-1, 1]$. This is in contrast, for example, to the Wasserstein metric, where the definition is of the same form, but the supremum is taken over $f$ ranging over the set of functions from $X$ to $\mathbb{R}$ which have Lipschitz constant at most 1; and also in contrast to the Radon metric, where the supremum is taken over $f$ ranging over the set of continuous functions from $X$ to $[-1, 1]$. In the case where $X$ is a Polish space, the total variation metric coincides with the Radon metric.

If $\mu$ and $\nu$ are both probability measures, then the total variation distance is also given by

$$\left\|\mu - \nu\right\|_{\text{TV}} = 2 \cdot \sup_{A \in \mathcal{F}} |\mu(A) - \nu(A)|.$$

The equivalence between these two definitions can be seen as a particular case of the Monge–Kantorovich duality. From the two definitions above, it is clear that the total variation distance between probability measures is always between 0 and 2.
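The agreement of the two formulas can be checked by hand on a finite space. The following is a minimal sketch, assuming probability measures given as dictionaries of point masses; the names `tv_via_functions` and `tv_via_sets` and the example measures are illustrative and do not come from the article. On a finite space the supremum over $f$ is attained at $f = \operatorname{sign}(\mu - \nu)$, while the supremum over sets is found here by brute force over all subsets.

```python
from itertools import chain, combinations


def tv_via_functions(mu, nu):
    """Supremum over f: X -> [-1, 1] of (sum f dmu - sum f dnu).

    On a finite space the supremum is attained at f = sign(mu - nu),
    which gives the sum of absolute differences of the point masses.
    """
    points = set(mu) | set(nu)
    return sum(abs(mu.get(x, 0.0) - nu.get(x, 0.0)) for x in points)


def tv_via_sets(mu, nu):
    """2 * sup over all subsets A of |mu(A) - nu(A)|, by brute force."""
    points = sorted(set(mu) | set(nu))
    subsets = chain.from_iterable(
        combinations(points, r) for r in range(len(points) + 1)
    )
    best = 0.0
    for subset in subsets:
        gap = sum(mu.get(x, 0.0) - nu.get(x, 0.0) for x in subset)
        best = max(best, abs(gap))
    return 2.0 * best


# Hypothetical probability measures on the three-point space {a, b, c}.
mu = {"a": 0.5, "b": 0.3, "c": 0.2}
nu = {"a": 0.2, "b": 0.3, "c": 0.5}

print(tv_via_functions(mu, nu))
print(tv_via_sets(mu, nu))
```

Both calls return approximately 0.6 for this pair, which lies in the range from 0 to 2 stated above. The brute-force search is exponential in the number of points and is meant only as a check on very small spaces.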

To illustrate the meaning of the total variation distance, consider the following thought experiment. Assume that we are given two probability measures $\mu$ and $\nu$, as well as a random variable $X$. We know that $X$ has law either $\mu$ or $\nu$ but we do not know which one of the two. Assume that these two measures have prior probabilities 0.5 each of being the true law of $X$. Assume now that we are given a single sample distributed according to the law of $X$ and that we are then asked to guess which one of the two distributions describes that law. The quantity

$$\frac{2 + \|\mu - \nu\|_{\text{TV}}}{4}$$

then provides a sharp upper bound on the prior probability that our guess will be correct.
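For discrete measures the bound can be verified directly. The sketch below assumes the natural decision rule "guess whichever law gives the observed point the larger mass"; the function names and the example measures are illustrative assumptions, not part of the article. With prior 1/2 on each law, this rule is correct with probability $\tfrac{1}{2}\sum_x \max(\mu(x), \nu(x))$, and the identity $\max(a, b) = \tfrac{1}{2}(a + b + |a - b|)$ turns that sum into $(2 + \|\mu - \nu\|_{\text{TV}})/4$.

```python
def tv_distance(mu, nu):
    """Total variation distance (range 0 to 2) between discrete measures."""
    points = set(mu) | set(nu)
    return sum(abs(mu.get(x, 0.0) - nu.get(x, 0.0)) for x in points)


def best_guess_probability(mu, nu):
    """Success probability of the rule 'pick the law with larger mass at
    the observed sample', with prior probability 1/2 on each law."""
    points = set(mu) | set(nu)
    return sum(0.5 * max(mu.get(x, 0.0), nu.get(x, 0.0)) for x in points)


# Hypothetical probability measures on the three-point space {a, b, c}.
mu = {"a": 0.5, "b": 0.3, "c": 0.2}
nu = {"a": 0.2, "b": 0.3, "c": 0.5}

bound = (2.0 + tv_distance(mu, nu)) / 4.0
print(best_guess_probability(mu, nu), bound)
```

For this pair both numbers are approximately 0.65, so the bound is attained, which is what "sharp" means here.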

Given the above definition of total variation distance, a sequence $\mu_n$ of measures defined on the same measurable space is said to converge to a measure $\mu$ in total variation distance if for every $\varepsilon > 0$, there exists an $N$ such that for all $n > N$, one has that[1]

$$\|\mu_n - \mu\|_{\text{TV}} < \varepsilon.$$

Setwise convergence of measures

For $(X, \mathcal{F})$ a measurable space, a sequence $\mu_n$ is said to converge setwise to a limit $\mu$ if

$$\lim_{n \to \infty} \mu_n(A) = \mu(A)$$

for every set $A \in \mathcal{F}$.

Typical arrow notations are $\mu_n \xrightarrow{sw} \mu$ and $\mu_n \xrightarrow{s} \mu$.

For example, as a consequence of the Riemann–Lebesgue lemma, the sequence $\mu_n$ of measures on the interval $[-1, 1]$ given by $\mu_n(dx) = (1 + \sin(nx))\,dx$ converges setwise to Lebesgue measure, but it does not converge in total variation.
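A numerical look at this example may help. The sketch below rests on two observations that are not spelled out in the article: on an interval $[a, b]$ the measure has the closed form $\mu_n([a, b]) = (b - a) + (\cos(na) - \cos(nb))/n$, and the total variation distance to Lebesgue measure equals $\int_{-1}^{1} |\sin(nx)|\,dx$, since the difference of the two measures has density $\sin(nx)$. The function names, the test interval and the midpoint-rule step count are illustrative choices. The code checks intervals only; convergence on every measurable set is what the Riemann–Lebesgue lemma provides.

```python
import math


def mu_n_interval(n, a, b):
    """Exact value of mu_n([a, b]), the integral of 1 + sin(n x) over [a, b]."""
    return (b - a) + (math.cos(n * a) - math.cos(n * b)) / n


def tv_to_lebesgue(n, steps=200_000):
    """Midpoint-rule approximation of the integral of |sin(n x)| over [-1, 1],
    which is the total variation distance between mu_n and Lebesgue measure."""
    h = 2.0 / steps
    return h * sum(
        abs(math.sin(n * (-1.0 + (i + 0.5) * h))) for i in range(steps)
    )


a, b = -0.3, 0.5  # hypothetical test interval inside [-1, 1]
for n in (1, 10, 100, 1000):
    gap_on_interval = abs(mu_n_interval(n, a, b) - (b - a))
    print(n, gap_on_interval, tv_to_lebesgue(n))
```

The gap on the interval is at most $2/n$ and so tends to 0, while the total variation distance stays bounded away from 0 (it approaches $4/\pi$ for large $n$).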

In a measure-theoretic or probabilistic context, setwise convergence is often referred to as strong convergence (as opposed to weak convergence). This can lead to some ambiguity because in functional analysis, strong convergence usually refers to convergence with respect to a norm.

Weak convergence of measures

In mathematics and statistics, weak convergence is one of many types of convergence relating to the convergence of measures. It depends on a topology on the underlying space and thus is not a purely measure-theoretic notion.

There are several equivalent definitions of weak convergence of a sequence of measures, some of which are (apparently) more general than others. The equivalence of these conditions is sometimes known as the Portmanteau theorem.[2]

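Definition. Let $S$ be a metric space with its Borel $\sigma$-algebra $\Sigma$. A bounded sequence of positive probability measures $P_n\,(n = 1, 2, \dots)$ on $(S, \Sigma)$ is said to converge weakly to a probability measure $P$ (denoted $P_n \Rightarrow P$) if any of the following equivalent conditions is true (here $\operatorname{E}_n$ denotes expectation or the integral with respect to $P_n$, while $\operatorname{E}$ denotes expectation or the integral with respect to $P$):

  1. $\operatorname{E}_n[f] \to \operatorname{E}[f]$ for all bounded, continuous functions $f$;
  2. $\operatorname{E}_n[f] \to \operatorname{E}[f]$ for all bounded and Lipschitz functions $f$;
  3. $\limsup \operatorname{E}_n[f] \leq \operatorname{E}[f]$ for every upper semi-continuous function $f$ bounded from above;
  4. $\liminf \operatorname{E}_n[f] \geq \operatorname{E}[f]$ for every lower semi-continuous function $f$ bounded from below;
  5. $\limsup P_n(C) \leq P(C)$ for all closed sets $C$ of the space $S$;
  6. $\liminf P_n(U) \geq P(U)$ for all open sets $U$ of the space $S$;
  7. $\lim P_n(A) = P(A)$ for all continuity sets $A$ of the measure $P$.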

In the case where $S$ and $\mathbf{R}$ (with its usual topology) are homeomorphic, if $F_n$ and $F$ denote the cumulative distribution functions of the measures $P_n$ and $P$, respectively, then $P_n$ converges weakly to $P$ if and only if $\lim_{n \to \infty} F_n(x) = F(x)$ for all points $x \in \mathbf{R}$ at which $F$ is continuous.

For example, the sequence where $P_n$ is the Dirac measure located at $1/n$ converges weakly to the Dirac measure located at 0 (if we view these as measures on $\mathbf{R}$ with the usual topology), but it does not converge setwise. This is intuitively clear: we only know that $1/n$ is "close" to $0$ because of the topology of $\mathbf{R}$.
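The Dirac example can be made concrete with a few lines of code. This is a sketch under stated assumptions: the test function `math.cos`, the set $A = \{0\}$ and the helper names are illustrative choices, and a single test function only illustrates weak convergence rather than proving it. Integrating against a Dirac measure is just evaluation at its location, and the cumulative distribution function of a Dirac measure at $p$ is the step function $x \mapsto 1$ if $x \geq p$, else $0$.

```python
import math


def dirac_integral(f, point):
    """Integral of f with respect to the Dirac measure located at `point`."""
    return f(point)


def dirac_mass(contains, point):
    """Measure of a set (given by its membership test) under a Dirac measure."""
    return 1.0 if contains(point) else 0.0


def dirac_cdf(x, point):
    """Cumulative distribution function of the Dirac measure at `point`."""
    return 1.0 if x >= point else 0.0


ns = (1, 10, 100, 1000)

# Weak convergence: integrals of a bounded continuous f converge.
f = math.cos
weak_gaps = [abs(dirac_integral(f, 1.0 / n) - dirac_integral(f, 0.0)) for n in ns]

# No setwise convergence: the set A = {0} has mass 0 under every P_n
# but mass 1 under the limit.
in_A = lambda x: x == 0.0
masses = [dirac_mass(in_A, 1.0 / n) for n in ns]
limit_mass = dirac_mass(in_A, 0.0)

# CDF criterion: convergence holds at x = 0.5 (a continuity point of F)
# and fails only at x = 0, where F jumps.
cdf_at_half = [dirac_cdf(0.5, 1.0 / n) for n in ns]
cdf_at_zero = [dirac_cdf(0.0, 1.0 / n) for n in ns]

print(weak_gaps, masses, limit_mass, cdf_at_half, cdf_at_zero)
```

The failure of $F_n(0) \to F(0)$ does not contradict weak convergence, because the criterion above only asks for convergence at points where $F$ is continuous.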

This definition of weak convergence can be extended for $S$ any metrizable topological space. It also defines a weak topology on $\mathcal{P}(S)$, the set of all probability measures defined on $(S, \Sigma)$. The weak topology is generated by the following subbasis of open sets:

$$\left\{ U_{\varphi, x, \delta} \;\middle|\; \varphi : S \to \mathbf{R} \text{ is bounded and continuous, } x \in \mathbf{R} \text{ and } \delta > 0 \right\},$$

where

$$U_{\varphi, x, \delta} := \left\{ \mu \in \mathcal{P}(S) \;\middle|\; \left| \int_S \varphi\,\mathrm{d}\mu - x \right| < \delta \right\}.$$

If $S$ is also separable, then $\mathcal{P}(S)$ is metrizable and separable, for example by the Lévy–Prokhorov metric. If $S$ is also compact or Polish, so is $\mathcal{P}(S)$.

If $S$ is separable, it naturally embeds into $\mathcal{P}(S)$ as the (closed) set of Dirac measures, and its convex hull is dense.

There are many "arrow notations" for this kind of convergence: the most frequently used are $P_n \Rightarrow P$, $P_n \rightharpoonup P$, $P_n \xrightarrow{w} P$ and $P_n \xrightarrow{\mathcal{D}} P$.

Weak convergence of random variables

Main article: Convergence of random variables

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space and $\mathbf{X}$ be a metric space. If $X_n : \Omega \to \mathbf{X}$ is a sequence of random variables, then $X_n$ is said to converge weakly (or in distribution or in law) to the random variable $X : \Omega \to \mathbf{X}$ as $n \to \infty$ if the sequence of pushforward measures $(X_n)_*(\mathbb{P})$ converges weakly to $X_*(\mathbb{P})$ in the sense of weak convergence of measures on $\mathbf{X}$, as defined above.
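As a small deterministic illustration (an example chosen here, not taken from the article), let $X_n$ be uniformly distributed on the grid $\{1/n, 2/n, \dots, 1\}$ and let $X$ be uniformly distributed on $[0, 1]$. The expectation of $f(X_n)$ is then a Riemann sum of $f$, which converges to $\int_0^1 f(x)\,dx$ for every continuous $f$, so $X_n$ converges in distribution to $X$. The sketch checks this for one test function only, $f(x) = \sin(\pi x)$, whose integral over $[0, 1]$ is $2/\pi$; the function name is hypothetical.

```python
import math


def expectation_under_law_of_xn(f, n):
    """E[f(X_n)] where the law of X_n is uniform on {1/n, 2/n, ..., 1}."""
    return sum(f(k / n) for k in range(1, n + 1)) / n


def f(x):
    """Bounded continuous test function with integral 2/pi over [0, 1]."""
    return math.sin(math.pi * x)


limit = 2.0 / math.pi  # E[f(X)] for X uniform on [0, 1]
errors = [abs(expectation_under_law_of_xn(f, n) - limit) for n in (10, 100, 1000)]
print(errors)
```

Note that each law of $X_n$ is purely atomic while the limit law has no atoms, so this sequence converges weakly but not setwise: the set of all grid points has mass 1 under every $X_n$ and mass 0 under the limit.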

Comparison with vague convergence

Let $X$ be a metric space (for example $\mathbb{R}$ or $[0, 1]$). The following spaces of test functions are commonly used in the convergence of probability measures.[3]

  1. $C_c(X)$, the class of continuous functions $f$ each vanishing outside a compact set.
  2. $C_0(X)$, the class of continuous functions $f$ vanishing at infinity.
  3. $C_B(X)$, the class of continuous bounded functions.
  4. $C(X)$, the class of all continuous functions.

We have $C_c \subset C_0 \subset C_B \subset C$. Moreover, $C_0$ is the closure of $C_c$ with respect to uniform convergence.[3]

Vague Convergence

A sequence of measures $(\mu_n)_{n \in \mathbb{N}}$ converges vaguely to a measure $\mu$ if for all $f \in C_c(X)$, $\int_X f\,d\mu_n \to \int_X f\,d\mu$.

Weak Convergence

A sequence of measures $(\mu_n)_{n \in \mathbb{N}}$ converges weakly to a measure $\mu$ if for all $f \in C_B(X)$, $\int_X f\,d\mu_n \to \int_X f\,d\mu$.

In general, these two convergence notions are not equivalent.

In a probability setting, vague convergence and weak convergence of probability measures are equivalent assuming tightness. That is, a tight sequence of probability measures $(\mu_n)_{n \in \mathbb{N}}$ converges vaguely to a probability measure $\mu$ if and only if $(\mu_n)_{n \in \mathbb{N}}$ converges weakly to $\mu$.

The weak limit of a sequence of probability measures, provided it exists, is a probability measure. In general, if tightness is not assumed, a sequence of probability (or sub-probability) measures may not necessarily converge vaguely to a true probability measure, but rather to a sub-probability measure (a measure such that $\mu(X) \leq 1$).[3] Thus, vague convergence $\mu_n \overset{v}{\to} \mu$ of a sequence of probability measures $(\mu_n)_{n \in \mathbb{N}}$, where $\mu$ is not specified to be a probability measure, is not guaranteed to imply weak convergence.
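A standard example of this loss of mass (supplied here for illustration; it is not in the article) is the sequence of Dirac measures $\delta_n$ located at $n = 1, 2, 3, \dots$ on $\mathbb{R}$. For any $f \in C_c(\mathbb{R})$ the integral $\int f\,d\delta_n = f(n)$ is eventually 0, so $\delta_n$ converges vaguely to the zero measure, which is a sub-probability measure. The sequence is not tight and does not converge weakly, since the constant function 1 lies in $C_B(\mathbb{R})$ and integrates to 1 for every $n$. The sketch below uses a hypothetical tent function supported on $[-5, 5]$ as the compactly supported test function.

```python
def tent(x):
    """Continuous function with compact support [-5, 5] (an element of C_c)."""
    return max(0.0, 1.0 - abs(x) / 5.0)


def one(x):
    """Constant function 1: bounded and continuous, but not in C_c or C_0."""
    return 1.0


def integral_against_dirac(f, location):
    """Integral of f with respect to the Dirac measure at `location`."""
    return f(location)


ns = (1, 2, 10, 100)
vague_values = [integral_against_dirac(tent, float(n)) for n in ns]
total_masses = [integral_against_dirac(one, float(n)) for n in ns]

print(vague_values)   # vanishes once n leaves the support of the tent
print(total_masses)   # stays at 1 for every n
```

The integrals against the tent function reach 0, matching the zero measure, while the total mass stays at 1, so the vague limit cannot be a weak limit.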

Weak convergence of measures as an example of weak-* convergence

Despite having the same name as weak convergence in the context of functional analysis, weak convergence of measures is actually an example of weak-* convergence. The definitions of weak and weak-* convergences used in functional analysis are as follows:

Let $V$ be a topological vector space or Banach space.

  1. A sequence $x_n$ in $V$ converges weakly to $x$ if $\varphi(x_n) \to \varphi(x)$ as $n \to \infty$ for all $\varphi \in V^*$. One writes $x_n \mathrel{\stackrel{w}{\rightarrow}} x$ as $n \to \infty$.
  2. A sequence of $\varphi_n \in V^*$ converges in the weak-* topology to $\varphi$ provided that $\varphi_n(x) \to \varphi(x)$ for all $x \in V$. That is, convergence occurs in the point-wise sense. In this case, one writes $\varphi_n \mathrel{\stackrel{w^*}{\rightarrow}} \varphi$ as $n \to \infty$.

To illustrate how weak convergence of measures is an example of weak-* convergence, we give an example in terms of vague convergence (see above). Let $X$ be a locally compact Hausdorff space. By the Riesz representation theorem, the space $M(X)$ of Radon measures is isomorphic to a subspace of the space of continuous linear functionals on $C_0(X)$. Therefore, for each Radon measure $\mu_n \in M(X)$, there is a linear functional $\varphi_n \in C_0(X)^*$ such that $\varphi_n(f) = \int_X f\,d\mu_n$ for all $f \in C_0(X)$. Applying the definition of weak-* convergence in terms of linear functionals, the characterization of vague convergence of measures is obtained. For compact $X$, $C_0(X) = C_B(X)$, so in this case weak convergence of measures is a special case of weak-* convergence.

Notes and references

  1. ^ Madras, Neil; Sezer, Deniz (25 Feb 2011). "Quantitative bounds for Markov chain convergence: Wasserstein and total variation distances". Bernoulli. 16 (3): 882–908. arXiv:1102.5245. doi:10.3150/09-BEJ238. S2CID 88518773.
  2. ^ Klenke, Achim (2006). Probability Theory. Springer-Verlag. ISBN 978-1-84800-047-6.
  3. ^ a b c Chung, Kai Lai (1974). A course in probability theory. Internet Archive. New York, Academic Press. pp. 84–99. ISBN 978-0-12-174151-8.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Convergence_of_measures&oldid=1323363888"