A whitening transformation or sphering transformation is a linear transformation that transforms a vector of random variables with a known covariance matrix into a set of new variables whose covariance is the identity matrix, meaning that they are uncorrelated and each have variance 1.[1] The transformation is called "whitening" because it changes the input vector into a white noise vector.
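Several other transformations are closely related to whitening: the decorrelation transform removes only the correlations but leaves the variances intact; the standardization transform sets the variances to 1 but leaves the correlations intact; and a coloring transformation transforms a vector of white random variables into a random vector with a specified covariance matrix.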
Suppose $X$ is a random (column) vector with non-singular covariance matrix $\Sigma$ and mean $0$. Then the transformation $Y = W X$ with a whitening matrix $W$ satisfying the condition $W^T W = \Sigma^{-1}$ yields the whitened random vector $Y$ with unit diagonal covariance, that is, $\operatorname{Cov}(Y) = I$.
If $X$ has non-zero mean $\mu$, then whitening can be performed by $Y = W (X - \mu)$.
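The reason the condition on the whitening matrix gives identity covariance can be spelled out in a short derivation. This is a sketch using the conventional symbols $X$, $Y$, $W$, $\Sigma$ and $I$ for the random vector, the whitened vector, the whitening matrix, the covariance matrix and the identity; it relies only on $\Sigma$ being non-singular, which makes $W$ invertible.

```latex
\begin{aligned}
\operatorname{Cov}(Y) &= \operatorname{Cov}(W X) = W \Sigma W^{T}, \\
W^{T} W = \Sigma^{-1} \;&\Longrightarrow\; \Sigma = (W^{T} W)^{-1} = W^{-1} W^{-T}, \\
\operatorname{Cov}(Y) &= W \, W^{-1} W^{-T} \, W^{T} = I .
\end{aligned}
```

Subtracting the mean $\mu$ first does not change the covariance, so the same argument applies to $Y = W (X - \mu)$.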
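There are infinitely many possible whitening matrices $W$ that all satisfy the above condition. Commonly used choices are $W = \Sigma^{-1/2}$ (Mahalanobis or ZCA whitening), $W = L^T$ where $L$ is the Cholesky factor of $\Sigma^{-1}$, so that $\Sigma^{-1} = L L^T$ (Cholesky whitening),[3] or $W = \Lambda^{-1/2} U^T$ built from the eigen-system $\Sigma = U \Lambda U^T$ (PCA whitening).[4]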
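The three common choices can be compared numerically. The following is a minimal NumPy sketch, not a reference implementation: the function name `whitening_matrices` and the example covariance matrix are illustrative assumptions, and the covariance is assumed to be symmetric positive definite.

```python
import numpy as np

def whitening_matrices(sigma):
    """Return ZCA, Cholesky and PCA whitening matrices for a covariance `sigma`.

    Hypothetical helper for illustration; assumes `sigma` is symmetric
    positive definite.
    """
    # Eigen-decomposition sigma = U diag(lam) U^T.
    lam, U = np.linalg.eigh(sigma)
    inv_sqrt_lam = np.diag(lam ** -0.5)

    W_zca = U @ inv_sqrt_lam @ U.T   # Sigma^{-1/2} (symmetric)
    W_pca = inv_sqrt_lam @ U.T       # Lambda^{-1/2} U^T

    # np.linalg.cholesky returns lower-triangular L with L @ L.T == inv(sigma).
    L = np.linalg.cholesky(np.linalg.inv(sigma))
    W_chol = L.T

    return {"zca": W_zca, "cholesky": W_chol, "pca": W_pca}

# Example covariance matrix (made up for illustration).
sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])

for name, W in whitening_matrices(sigma).items():
    # Every choice satisfies the whitening condition W^T W = Sigma^{-1} ...
    assert np.allclose(W.T @ W, np.linalg.inv(sigma))
    # ... and therefore maps the covariance to the identity.
    assert np.allclose(W @ sigma @ W.T, np.eye(2))
```

The three matrices differ from one another only by a rotation applied after whitening, which is why all of them pass the same two checks.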
Optimal whitening transforms can be singled out by investigating the cross-covariance and cross-correlation of $X$ and $Y$.[3] For example, the unique optimal whitening transformation achieving maximal component-wise correlation between the original $X$ and the whitened $Y$ is produced by the whitening matrix $W = P^{-1/2} V^{-1/2}$, where $P$ is the correlation matrix and $V$ the diagonal variance matrix, so that $\Sigma = V^{1/2} P V^{1/2}$.
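The correlation-based whitening matrix can be computed directly from a covariance matrix. The sketch below assumes NumPy and a symmetric positive definite covariance; the function name `zca_cor_whitening_matrix` and the example matrix are hypothetical and chosen only for illustration.

```python
import numpy as np

def zca_cor_whitening_matrix(sigma):
    """Return W = P^{-1/2} V^{-1/2} for a covariance matrix `sigma`.

    Hypothetical helper for illustration; assumes `sigma` is symmetric
    positive definite.
    """
    sd = np.sqrt(np.diag(sigma))      # standard deviations, i.e. diag of V^{1/2}
    P = sigma / np.outer(sd, sd)      # correlation matrix
    theta, G = np.linalg.eigh(P)      # P = G diag(theta) G^T
    P_inv_sqrt = G @ np.diag(theta ** -0.5) @ G.T
    return P_inv_sqrt @ np.diag(1.0 / sd)

# Example covariance matrix (made up for illustration).
sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])
W = zca_cor_whitening_matrix(sigma)

# W satisfies the whitening condition W^T W = Sigma^{-1}.
assert np.allclose(W.T @ W, np.linalg.inv(sigma))
```

This amounts to standardizing each variable first and then applying ZCA whitening to the correlation matrix, which keeps each whitened component aligned with its original variable.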
Whitening a data matrix follows the same transformation as for random variables. An empirical whitening transform is obtained by estimating the covariance (e.g. by maximum likelihood) and subsequently constructing a corresponding estimated whitening matrix (e.g. by Cholesky decomposition).
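The empirical procedure can be sketched in a few lines of NumPy. The example assumes a data matrix with one observation per row, uses the maximum likelihood covariance estimate (dividing by the sample size) and builds the whitening matrix by Cholesky decomposition; the function name `whiten_data` and the simulated data are illustrative assumptions.

```python
import numpy as np

def whiten_data(X):
    """Whiten the rows of X (n observations x d variables).

    Hypothetical helper for illustration; assumes the estimated
    covariance is non-singular (in particular, n > d).
    """
    mu = X.mean(axis=0)
    Xc = X - mu                               # centre the data
    sigma_hat = Xc.T @ Xc / X.shape[0]        # ML covariance estimate (divide by n)
    L = np.linalg.cholesky(np.linalg.inv(sigma_hat))
    W = L.T                                   # Cholesky whitening matrix
    Y = Xc @ W.T                              # row i is W @ (x_i - mu)
    return Y, W, mu

# Simulated data (parameters made up for illustration).
rng = np.random.default_rng(0)
X = rng.multivariate_normal([1.0, -2.0], [[4.0, 1.2], [1.2, 1.0]], size=500)
Y, W, mu = whiten_data(X)

# The whitened sample has identity covariance under the same (divide-by-n) estimator.
assert np.allclose(Y.T @ Y / len(Y), np.eye(2))
```

The identity holds exactly only for the estimator used to build the whitening matrix; with the unbiased (divide by n - 1) estimator the whitened covariance would differ from the identity by a factor of n / (n - 1).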
This modality is a generalization of the pre-whitening procedure extended to more general spaces where $X$ is usually assumed to be a random function or other random object in a Hilbert space $H$. One of the main issues of extending whitening to infinite dimensions is that the covariance operator has an unbounded inverse in $H$, so only partial standardization is possible in infinite dimensions. A whitening operator can then be defined from the factorization of a degenerate covariance operator. High-dimensional features of the data can be exploited through kernel regressors or basis function systems.[5]
An implementation of several whitening procedures in R, including ZCA whitening and PCA whitening but also CCA whitening, is available in the "whitening" R package[6] published on CRAN. The R package "pfica"[7] allows the computation of high-dimensional whitening representations using basis function systems (B-splines, Fourier basis, etc.).