Fourier transform of the probability density function
The characteristic function of a uniform U(−1,1) random variable. This function is real-valued because it corresponds to a random variable that is symmetric around the origin; however, characteristic functions may in general be complex-valued.
In addition to univariate distributions, characteristic functions can be defined for vector- or matrix-valued random variables, and can also be extended to more general cases.
The characteristic function always exists when treated as a function of a real-valued argument, unlike the moment-generating function. There are relations between the behavior of the characteristic function of a distribution and properties of the distribution, such as the existence of moments and the existence of a density function.
The characteristic function is a way to describe a random variable X. The characteristic function

φX(t) = E[e^(itX)],

a function of t, determines the behavior and properties of the probability distribution of X. It is equivalent to a probability density function or cumulative distribution function, since knowing one of these functions allows computation of the others, but they provide different insights into the features of the random variable. In particular cases, one or another of these equivalent functions may be easier to represent in terms of simple standard functions.
If a random variable admits a density function, then the characteristic function is its Fourier dual, in the sense that each of them is a Fourier transform of the other. If a random variable has a moment-generating function MX(t), then the domain of the characteristic function can be extended to the complex plane, and

φX(−it) = MX(t).

Note, however, that the characteristic function of a distribution is well defined for all real values of t, even when the moment-generating function is not well defined for all real values of t.
The characteristic function approach is particularly useful in the analysis of linear combinations of independent random variables: a classical proof of the central limit theorem uses characteristic functions and Lévy's continuity theorem. Another important application is to the theory of the decomposability of random variables.
For a scalar random variable X the characteristic function is defined as the expected value of e^(itX), where i is the imaginary unit, and t ∈ R is the argument of the characteristic function:

φX(t) = E[e^(itX)] = ∫_R e^(itx) dFX(x) = ∫_R e^(itx) fX(x) dx = ∫_0^1 e^(itQX(p)) dp
Here FX is the cumulative distribution function of X, fX is the corresponding probability density function, QX(p) is the corresponding inverse cumulative distribution function, also called the quantile function,[2] and the integrals are of the Riemann–Stieltjes kind. If a random variable X has a probability density function, then the characteristic function is its Fourier transform with sign reversal in the complex exponential.[3][4] This convention for the constants appearing in the definition of the characteristic function differs from the usual convention for the Fourier transform.[5] For example, some authors[6] define φX(t) = E[e^(−2πitX)], which is essentially a change of parameter. Other notation may be encountered in the literature: p̂ as the characteristic function for a probability measure p, or f̂ as the characteristic function corresponding to a density f.
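As a quick numerical sanity check of the definition (not part of the original text; the helper names below are illustrative), the integral φX(t) = ∫ e^(itx) fX(x) dx can be evaluated for the uniform U(−1,1) distribution, whose characteristic function has the closed form sin(t)/t:

```python
import numpy as np

def cf_numeric(pdf, t, lo, hi, n=200_001):
    # Approximate phi_X(t) = integral of exp(itx) * f_X(x) over [lo, hi]
    # with the composite trapezoidal rule.
    x = np.linspace(lo, hi, n)
    vals = np.exp(1j * t * x) * pdf(x)
    dx = x[1] - x[0]
    return dx * (vals.sum() - 0.5 * (vals[0] + vals[-1]))

def uniform_pdf(x):
    # Density of U(-1, 1) on its support.
    return np.full_like(x, 0.5)

for t in [0.5, 1.0, 3.0]:
    exact = np.sin(t) / t          # known closed form for U(-1, 1)
    assert abs(cf_numeric(uniform_pdf, t, -1.0, 1.0) - exact) < 1e-8
```

The agreement with sin(t)/t also illustrates the figure caption above: the characteristic function of this symmetric distribution is real-valued.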
The notion of characteristic functions generalizes to multivariate random variables and more complicated random elements. The argument of the characteristic function will always belong to the continuous dual of the space where the random variable X takes its values. For common cases such definitions are listed below:
The characteristic function of a real-valued random variable always exists, since it is an integral of a bounded continuous function over a space whose measure is finite.
It is non-vanishing in a region around zero: φ(0) = 1.
It is bounded: |φ(t)| ≤ 1.
It is Hermitian: φ(−t) = φ(t)*, where * denotes complex conjugation. In particular, the characteristic function of a symmetric (around the origin) random variable is real-valued and even.
There is a bijection between probability distributions and characteristic functions. That is, for any two random variables X1, X2, both have the same probability distribution if and only if φX1 = φX2.[11]
If a random variable X has moments up to k-th order, then the characteristic function φX is k times continuously differentiable on the entire real line. In this case

E[X^k] = (−i)^k φX^(k)(0).
If a characteristic function φX has a k-th derivative at zero, then the random variable X has all moments up to k if k is even, but only up to k − 1 if k is odd.[12]
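The relation E[X^k] = (−i)^k φ^(k)(0) can be checked numerically for the standard normal distribution, whose characteristic function is e^(−t²/2). This is only an illustrative sketch using central finite differences at the origin:

```python
import numpy as np

phi = lambda t: np.exp(-t**2 / 2)      # characteristic function of N(0, 1)

h = 1e-4
# Central-difference approximations of the first two derivatives at t = 0.
phi1 = (phi(h) - phi(-h)) / (2 * h)              # ~ phi'(0)
phi2 = (phi(h) - 2 * phi(0.0) + phi(-h)) / h**2  # ~ phi''(0)

mean = (phi1 / 1j).real            # E[X]   = phi'(0) / i
second_moment = (phi2 / 1j**2).real  # E[X^2] = phi''(0) / i^2

assert abs(mean - 0.0) < 1e-8      # N(0,1) has mean 0
assert abs(second_moment - 1.0) < 1e-6  # and second moment 1
```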
If X1, ..., Xn are independent random variables, and a1, ..., an are some constants, then the characteristic function of the linear combination S = a1X1 + ⋯ + anXn of the Xi variables is

φS(t) = φX1(a1t) ⋯ φXn(ant).

One specific case is the sum of two independent random variables X1 and X2, in which case one has

φX1+X2(t) = φX1(t) φX2(t).
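The product rule for sums of independent variables can be verified empirically. The following sketch (sample sizes and tolerances are arbitrary choices) compares the empirical characteristic function of X + Y against the product of the two exact characteristic functions, using the facts that N(0,1) has characteristic function e^(−t²/2) and Exp(1) has characteristic function 1/(1 − it):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
X = rng.normal(0.0, 1.0, n)        # N(0,1) sample
Y = rng.exponential(1.0, n)        # Exp(1) sample, drawn independently of X

def ecf(sample, t):
    # Empirical characteristic function: sample average of exp(i t x).
    return np.mean(np.exp(1j * t * sample))

for t in [0.3, 1.0]:
    exact = np.exp(-t**2 / 2) / (1 - 1j * t)   # product of the two exact cfs
    assert abs(ecf(X + Y, t) - exact) < 0.02   # Monte Carlo tolerance
```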
Let X and Y be two random variables with characteristic functions φX and φY. X and Y are independent if and only if φ(X,Y)(s, t) = φX(s) φY(t) for all (s, t).
The tail behavior of the characteristic function determines the smoothness of the corresponding density function.
Let Y = aX + b be a linear transformation of a random variable X. The characteristic function of Y is φY(t) = e^(itb) φX(at). For random vectors X and Y = AX + B (where A is a constant matrix and B a constant vector), we have φY(t) = e^(it·B) φX(Aᵀt).[13]
The bijection stated above between probability distributions and characteristic functions is sequentially continuous. That is, whenever a sequence of distribution functions Fj(x) converges (weakly) to some distribution F(x), the corresponding sequence of characteristic functions φj(t) will also converge, and the limit φ(t) will correspond to the characteristic function of the law F. More formally, this is stated as
Lévy’s continuity theorem: A sequence Xj of n-variate random variables converges in distribution to a random variable X if and only if the sequence φXj converges pointwise to a function φ which is continuous at the origin, where φ is the characteristic function of X.[14]
There is a one-to-one correspondence between cumulative distribution functions and characteristic functions, so it is possible to find one of these functions if we know the other. The formula in the definition of characteristic function allows us to compute φ when we know the distribution function F (or density f). If, on the other hand, we know the characteristic function φ and want to find the corresponding distribution function, then one of the following inversion theorems can be used.
Theorem. If the characteristic function φX of a random variable X is integrable, then FX is absolutely continuous, and therefore X has a probability density function. In the univariate case (i.e. when X is scalar-valued) the density function is given by

fX(x) = (1/2π) ∫_R e^(−itx) φX(t) dt.
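This inversion formula lends itself to direct numerical evaluation. The sketch below (function names and the truncation bound are illustrative choices) recovers the standard normal density from its characteristic function e^(−t²/2) by truncating the integral to a finite interval:

```python
import numpy as np

phi = lambda t: np.exp(-t**2 / 2)   # integrable cf of N(0, 1)

def density_from_cf(x, T=12.0, n=20_001):
    # f(x) = (1/2pi) * integral of exp(-itx) * phi(t) dt,
    # truncated to [-T, T] and evaluated by the trapezoidal rule.
    t = np.linspace(-T, T, n)
    vals = np.exp(-1j * t * x) * phi(t)
    dt = t[1] - t[0]
    integral = dt * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
    return integral.real / (2 * np.pi)

for x in [0.0, 1.0, 2.0]:
    exact = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # N(0,1) density
    assert abs(density_from_cf(x) - exact) < 1e-6
```

Truncating at T = 12 is harmless here because e^(−t²/2) is negligible beyond that point; for heavier-tailed characteristic functions the truncation bound must be chosen more carefully.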
Theorem (Lévy).[note 1] If φX is the characteristic function of a distribution function FX, and two points a < b are such that {x | a < x < b} is a continuity set of μX (in the univariate case this condition is equivalent to continuity of FX at points a and b), then
If X is scalar:

FX(b) − FX(a) = (1/2π) lim(T→∞) ∫_(−T)^T [(e^(−ita) − e^(−itb)) / (it)] φX(t) dt.

This formula can be re-stated in a form more convenient for numerical computation as[15]

(FX(x + h) − FX(x − h)) / (2h) = (1/2π) ∫_R [sin(ht)/(ht)] e^(−itx) φX(t) dt.

For a random variable bounded from below one can obtain FX(b) by taking a such that FX(a) = 0. Otherwise, if a random variable is not bounded from below, the limit for a → −∞ gives FX(b), but is numerically impractical.[15]
If X is a vector random variable:

μX({a < x < b}) = (1/2π)^n lim(T→∞) ∫_([−T,T]^n) ∏(k=1..n) [(e^(−it_k a_k) − e^(−it_k b_k)) / (it_k)] φX(t) dt.
Theorem. If a is (possibly) an atom of X (in the univariate case this means a point of discontinuity of FX), then

P[X = a] = lim(T→∞) (1/2T) ∫_(−T)^T e^(−ita) φX(t) dt.
The set of all characteristic functions is closed under certain operations:
A convex linear combination ∑n an φn(t) (with an ≥ 0 and ∑n an = 1) of a finite or a countable number of characteristic functions is also a characteristic function.
The product of a finite number of characteristic functions is also a characteristic function. The same holds for an infinite product provided that it converges to a function continuous at the origin.
If φ is a characteristic function and α is a real number, then φ* (the complex conjugate of φ), Re(φ), |φ|^2, and φ(αt) are also characteristic functions.
It is well known that any non-decreasing càdlàg function F with limits F(−∞) = 0, F(+∞) = 1 corresponds to a cumulative distribution function of some random variable. There is also interest in finding similar simple criteria for when a given function φ could be the characteristic function of some random variable. The central result here is Bochner’s theorem, although its usefulness is limited because the main condition of the theorem, non-negative definiteness, is very hard to verify. Other theorems also exist, such as Khinchine’s, Mathias’s, or Cramér’s, although their application is just as difficult. Pólya’s theorem, on the other hand, provides a very simple convexity condition which is sufficient but not necessary. Characteristic functions which satisfy this condition are called Pólya-type.[19]
Bochner’s theorem. An arbitrary function φ : R^n → C is the characteristic function of some random variable if and only if φ is positive definite, continuous at the origin, and φ(0) = 1.
Khinchine’s criterion. A complex-valued, absolutely continuous function φ, with φ(0) = 1, is a characteristic function if and only if it admits the representation

φ(t) = ∫_R g(t + θ) g(θ)* dθ.
Mathias’ theorem. A real-valued, even, continuous, absolutely integrable function φ, with φ(0) = 1, is a characteristic function if and only if
for n = 0, 1, 2, ..., and all p > 0. Here H2n denotes the Hermite polynomial of degree 2n.
Pólya’s theorem can be used to construct an example of two random variables whose characteristic functions coincide over a finite interval but are different elsewhere.
Pólya’s theorem. If φ is a real-valued, even, continuous function which satisfies the conditions φ(0) = 1, φ is convex for t > 0, and φ(∞) = 0, then φ(t) is the characteristic function of an absolutely continuous distribution symmetric about 0.
Because of the continuity theorem, characteristic functions are used in the most frequently seen proof of the central limit theorem. The main technique involved in making calculations with a characteristic function is recognizing the function as the characteristic function of a particular distribution.
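This proof strategy can be illustrated numerically. For i.i.d. U(−1,1) variables (mean 0, variance 1/3), the characteristic function of the standardized sum is [sin(s)/s]^n with s = t√(3/n), and by the continuity theorem it should approach the N(0,1) characteristic function e^(−t²/2) as n grows. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def cf_standardized_sum(t, n):
    # X_i ~ U(-1,1): mean 0, variance 1/3; Z = sum(X_i) / sqrt(n/3).
    # cf of Z is [phi_X(t * sqrt(3/n))]^n with phi_X(s) = sin(s)/s.
    s = t * np.sqrt(3.0 / n)
    return (np.sin(s) / s) ** n

for t in [0.5, 1.0, 2.0]:
    limit = np.exp(-t**2 / 2)          # cf of the standard normal limit
    assert abs(cf_standardized_sum(t, 2000) - limit) < 1e-3
```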
Characteristic functions are particularly useful for dealing with linear functions of independent random variables. For example, if X1, X2, ..., Xn is a sequence of independent (and not necessarily identically distributed) random variables, and

Sn = ∑(i=1..n) ai Xi,

where the ai are constants, then the characteristic function for Sn is given by

φSn(t) = φX1(a1t) φX2(a2t) ⋯ φXn(ant).
In particular, φX+Y(t) = φX(t)φY(t). To see this, write out the definition of characteristic function:

φX+Y(t) = E[e^(it(X+Y))] = E[e^(itX) e^(itY)] = E[e^(itX)] E[e^(itY)] = φX(t)φY(t).
The independence of X and Y is required to establish the equality of the third and fourth expressions.
Another special case of interest for identically distributed random variables is when ai = 1/n and then Sn is the sample mean. In this case, writing X̄ for the mean,

φX̄(t) = (φX(t/n))^n.
Characteristic functions can also be used to find moments of a random variable. Provided that the n-th moment exists, the characteristic function can be differentiated n times:

E[X^n] = (−i)^n φX^(n)(0) = (−i)^n [d^n φX(t) / dt^n] at t = 0.
This can be formally written using the derivatives of the Dirac delta function, which allows a formal solution to the moment problem. For example, suppose X has a standard Cauchy distribution. Then φX(t) = e^(−|t|). This is not differentiable at t = 0, showing that the Cauchy distribution has no expectation. Also, the characteristic function of the sample mean X̄ of n independent observations is φX̄(t) = (e^(−|t|/n))^n = e^(−|t|), using the result from the previous section. This is the characteristic function of the standard Cauchy distribution: thus, the sample mean has the same distribution as the population itself.
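This striking property of the Cauchy distribution is easy to observe by simulation. The sketch below (sample sizes and tolerance are arbitrary choices) checks that the empirical characteristic function of sample means of standard Cauchy draws still matches e^(−|t|), i.e. averaging does not concentrate the distribution:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 100_000, 50                   # m sample means, each of n Cauchy draws
means = rng.standard_cauchy((m, n)).mean(axis=1)

for t in [0.5, 1.0]:
    # Empirical cf of the sample-mean distribution vs cf of standard Cauchy.
    ecf = np.mean(np.exp(1j * t * means))
    assert abs(ecf - np.exp(-abs(t))) < 0.02
```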
A similar calculation using derivatives of the characteristic function is often easier to carry out than applying the definition of expectation directly and using integration by parts.
The logarithm of a characteristic function is a cumulant generating function, which is useful for finding cumulants; some instead define the cumulant generating function as the logarithm of the moment-generating function, and call the logarithm of the characteristic function the second cumulant generating function.
Characteristic functions can be used as part of procedures for fitting probability distributions to samples of data. Cases where this provides a practicable option compared to other possibilities include fitting the stable distribution, since closed-form expressions for the density are not available, which makes implementation of maximum likelihood estimation difficult. Estimation procedures are available which match the theoretical characteristic function to the empirical characteristic function, calculated from the data. Paulson et al. (1975)[20] and Heathcote (1977)[21] provide some theoretical background for such an estimation procedure. In addition, Yu (2004)[22] describes applications of empirical characteristic functions to fit time series models where likelihood procedures are impractical. Empirical characteristic functions have also been used by Ansari et al. (2020)[23] and Li et al. (2020)[24] for training generative adversarial networks.
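As a toy illustration of this idea (not the procedure of Paulson et al.), the scale parameter γ of a centered Cauchy distribution, whose characteristic function is e^(−γ|t|), can be estimated by matching the empirical characteristic function at a single point t = 1; all names and sample sizes below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
gamma = 2.0                                  # true scale of Cauchy(0, gamma)
data = gamma * rng.standard_cauchy(1_000_000)

# The cf of Cauchy(0, gamma) is exp(-gamma * |t|); match it at t = 1:
ecf = np.mean(np.exp(1j * 1.0 * data))       # empirical cf at t = 1
gamma_hat = -np.log(abs(ecf))                # invert |ecf| = exp(-gamma)

assert abs(gamma_hat - gamma) < 0.05
```

In practice one would match the empirical characteristic function at many points and weight them, but the single-point version already conveys why this works for distributions with no closed-form density.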
As defined above, the argument of the characteristic function is treated as a real number; however, certain aspects of the theory of characteristic functions are advanced by extending the definition into the complex plane by analytic continuation, in cases where this is possible.[25]
For a random variable X with density p(x), the characteristic function satisfies φX(t) = P(−t), where P(t) denotes the continuous Fourier transform of the probability density function p(x), with the convention P(t) = ∫ e^(−itx) p(x) dx. Likewise, p(x) may be recovered from φX(t) through the inverse Fourier transform:

p(x) = (1/2π) ∫_R e^(−itx) φX(t) dt.
Indeed, even when the random variable does not have a density, the characteristic function may be seen as the Fourier transform of the measure corresponding to the random variable.
Shaw, W. T.; McCabe, J. (2009). "Monte Carlo sampling given a Characteristic Function: Quantile Mechanics in Momentum Space". arXiv:0903.1592 [q-fin.CP].
Andersen, H.H.; Højbjerre, M.; Sørensen, D.; Eriksen, P.S. (1995). Linear and graphical models for the multivariate complex normal distribution. Lecture Notes in Statistics 101. New York: Springer-Verlag. ISBN 978-0-387-94521-7.
Billingsley, Patrick (1995). Probability and measure (3rd ed.). John Wiley & Sons. ISBN 978-0-471-00710-4.
Bisgaard, T. M.; Sasvári, Z. (2000). Characteristic functions and moment sequences. Nova Science.
Bochner, Salomon (1955). Harmonic analysis and the theory of probability. University of California Press.
Oberhettinger, Fritz (1973). Fourier transforms of distributions and their inverses; a collection of tables. New York: Academic Press. ISBN 978-0-12-523650-8.
Paulson, A.S.; Holcomb, E.W.; Leitch, R.A. (1975). "The estimation of the parameters of the stable laws". Biometrika. 62 (1): 163–170. doi:10.1093/biomet/62.1.163.
Pinsky, Mark (2002). Introduction to Fourier analysis and wavelets. Brooks/Cole. ISBN 978-0-534-37660-4.