In probability theory and statistics, the mathematical concepts of covariance and correlation are very similar.[1][2] Both describe the degree to which two random variables or sets of random variables tend to deviate from their expected values in similar ways.
If X and Y are two random variables, with means (expected values) μX and μY and standard deviations σX and σY, respectively, then their covariance and correlation are as follows:

$$\operatorname{cov}(X,Y) = \operatorname{E}[(X-\mu_X)(Y-\mu_Y)]$$

$$\operatorname{corr}(X,Y) = \frac{\operatorname{E}[(X-\mu_X)(Y-\mu_Y)]}{\sigma_X \sigma_Y}$$

so that

$$\operatorname{corr}(X,Y) = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y},$$

where E is the expected value operator. Notably, correlation is dimensionless while covariance is in units obtained by multiplying the units of the two variables.
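The relationship above — correlation is just covariance rescaled by the two standard deviations — can be checked numerically. The following is a minimal sketch using NumPy (the article itself names no software; the simulated data and sample sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 2.0 * x + rng.normal(size=1000)  # y is built to covary with x

# Population-style estimates of covariance and correlation
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
corr_xy = cov_xy / (x.std() * y.std())

# Rescaling by the standard deviations makes the result dimensionless,
# and it agrees with NumPy's built-in correlation coefficient.
print(corr_xy)
print(np.corrcoef(x, y)[0, 1])
```

Note that multiplying x or y by a constant changes cov_xy but leaves corr_xy unchanged, which is the dimensionless property described above.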
If Y always takes on the same values as X, we have the covariance of a variable with itself (i.e. $\operatorname{cov}(X,X)$), which is called the variance and is more commonly denoted as $\sigma_X^2$, the square of the standard deviation. The correlation of a variable with itself is always 1 (except in the degenerate case where the two variances are zero because X always takes on the same single value, in which case the correlation does not exist since its computation would involve division by 0). More generally, the correlation between two variables is 1 (or −1) if one of them always takes on a value that is given exactly by a linear function of the other with respectively a positive (or negative) slope.
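The claim about exact linear relationships can be illustrated directly: a positive-slope linear function of X has correlation 1 with X, and a negative-slope one has correlation −1. A small sketch with NumPy (the specific slopes and intercepts are arbitrary choices):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pos = 3.0 * x + 7.0   # positive slope: correlation is +1
y_neg = -2.0 * x + 1.0  # negative slope: correlation is -1

# Both correlations hit the extreme values (up to floating-point rounding)
print(np.corrcoef(x, y_pos)[0, 1])
print(np.corrcoef(x, y_neg)[0, 1])
```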
Although the values of the theoretical covariances and correlations are linked in the above way, the probability distributions of sample estimates of these quantities are not linked in any simple way and they generally need to be treated separately.
With more than one random variable, the variables can be stacked into a random vector whose i-th element is the i-th random variable. Then the variances and covariances can be placed in a covariance matrix, in which the (i, j) element is the covariance between the i-th random variable and the j-th one. Likewise, the correlations can be placed in a correlation matrix.
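The matrix forms can be sketched as follows, again with NumPy as an illustrative choice: the correlation matrix is the covariance matrix with each (i, j) entry divided by the product of the i-th and j-th standard deviations.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(500, 3))   # 500 samples of a 3-dimensional random vector
data[:, 2] += data[:, 0]           # make components 0 and 2 covary

C = np.cov(data, rowvar=False)       # covariance matrix: C[i, j] = cov(X_i, X_j)
R = np.corrcoef(data, rowvar=False)  # correlation matrix: R[i, j] = corr(X_i, X_j)

# R is C rescaled entrywise by the standard deviations on C's diagonal
sd = np.sqrt(np.diag(C))
print(np.allclose(R, C / np.outer(sd, sd)))
```

The diagonal of R is all 1s, since each variable has correlation 1 with itself; the diagonal of C holds the variances.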
In the case of a time series which is stationary in the wide sense, both the means and variances are constant over time (E(X_{n+m}) = E(X_n) = μX and var(X_{n+m}) = var(X_n), and likewise for the variable Y). In this case the cross-covariance and cross-correlation are functions of the time difference:

$$\sigma_{XY}(m) = \operatorname{E}[(X_n-\mu_X)(Y_{n+m}-\mu_Y)]$$

$$\rho_{XY}(m) = \frac{\sigma_{XY}(m)}{\sigma_X \sigma_Y}$$
If Y is the same variable as X, the above expressions are called the autocovariance and autocorrelation:

$$\sigma_{XX}(m) = \operatorname{E}[(X_n-\mu_X)(X_{n+m}-\mu_X)]$$

$$\rho_{XX}(m) = \frac{\sigma_{XX}(m)}{\sigma_X^2}$$
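A sample version of autocovariance and autocorrelation can be sketched in a few lines. The estimator below and the AR(1)-style test series are illustrative assumptions, not a method prescribed by the article:

```python
import numpy as np

def autocovariance(x, m):
    """Sample autocovariance at lag m for a wide-sense stationary series x."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    n = len(x)
    # Average the products of deviations separated by m time steps
    return np.mean((x[: n - m] - mu) * (x[m:] - mu))

rng = np.random.default_rng(2)
# AR(1)-style series: each value depends on the previous one,
# so values at nearby lags are correlated
x = np.empty(2000)
x[0] = rng.normal()
for t in range(1, len(x)):
    x[t] = 0.8 * x[t - 1] + rng.normal()

var = autocovariance(x, 0)  # lag 0 recovers the variance
autocorr = [autocovariance(x, m) / var for m in range(4)]
print(autocorr)  # starts at 1.0 at lag 0, then decays toward 0
```

Dividing each autocovariance by the lag-0 value (the variance) yields the autocorrelation, mirroring the normalisation in the formula above.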