numpy.cov
- numpy.cov(m, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None, *, dtype=None)
Estimate a covariance matrix, given data and weights.
Covariance indicates the level to which two variables vary together. If we examine N-dimensional samples, \(X = [x_1, x_2, \ldots, x_N]^T\), then the covariance matrix element \(C_{ij}\) is the covariance of \(x_i\) and \(x_j\). The element \(C_{ii}\) is the variance of \(x_i\).
See the notes for an outline of the algorithm.
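As a minimal sketch of the definition above, the unweighted covariance matrix can be reproduced by centering each variable and taking products of the centered rows. The array `X` and the names `centered` and `C_manual` are illustrative assumptions, not part of the NumPy API.

```python
import numpy as np

# Hypothetical data: 2 variables (rows), 5 observations (columns).
X = np.array([[2.0, 4.0, 6.0, 8.0, 10.0],
              [1.0, 3.0, 2.0, 5.0, 4.0]])

n_obs = X.shape[1]

# Subtract each variable's mean from its row.
centered = X - X.mean(axis=1, keepdims=True)

# Element (i, j) is the covariance of variable i and variable j,
# using the default (N - 1) normalization.
C_manual = centered @ centered.T / (n_obs - 1)

C_numpy = np.cov(X)
print(np.allclose(C_manual, C_numpy))
```

The diagonal of either matrix holds the per-variable variances, which should agree with `X.var(axis=1, ddof=1)`.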
- Parameters:
- m : array_like
A 1-D or 2-D array containing multiple variables and observations. Each row of m represents a variable, and each column a single observation of all those variables. Also see rowvar below.
- y : array_like, optional
An additional set of variables and observations. y has the same form as m.
- rowvar : bool, optional
If rowvar is True (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations.
- bias : bool, optional
Default normalization (False) is by (N - 1), where N is the number of observations given (unbiased estimate). If bias is True, then normalization is by N. These values can be overridden by using the keyword ddof in numpy versions >= 1.5.
- ddof : int, optional
If not None, the default value implied by bias is overridden. Note that ddof=1 will return the unbiased estimate, even if both fweights and aweights are specified, and ddof=0 will return the simple average. See the notes for the details. The default value is None.
- fweights : array_like, int, optional
1-D array of integer frequency weights; the number of times each observation vector should be repeated.
- aweights : array_like, optional
1-D array of observation vector weights. These relative weights are typically large for observations considered “important” and smaller for observations considered less “important”. If ddof=0, the array of weights can be used to assign probabilities to observation vectors.
- dtype : data-type, optional
Data-type of the result. By default, the return data-type will have at least numpy.float64 precision.
New in version 1.20.
- Returns:
- out : ndarray
The covariance matrix of the variables.
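The following sketch illustrates how rowvar, bias, and ddof interact, under the assumption of a small made-up dataset with observations in rows. The variable names are hypothetical and chosen only for this illustration.

```python
import numpy as np

# Hypothetical data: 4 observations (rows) of 3 variables (columns).
data = np.array([[1.0, 2.0, 0.5],
                 [2.0, 1.0, 1.5],
                 [3.0, 4.0, 1.0],
                 [4.0, 3.0, 3.0]])
n = data.shape[0]

# rowvar=False treats columns as variables, giving a 3x3 matrix.
c_cols = np.cov(data, rowvar=False)
# Transposing the data and keeping the default rowvar=True is equivalent.
c_rows = np.cov(data.T)

# Default normalization divides by (N - 1); bias=True divides by N.
c_unbiased = np.cov(data, rowvar=False)
c_biased = np.cov(data, rowvar=False, bias=True)

# An explicit ddof overrides whatever bias implies.
c_ddof0 = np.cov(data, rowvar=False, bias=False, ddof=0)

print(c_cols.shape)
print(np.allclose(c_biased, c_unbiased * (n - 1) / n))
print(np.allclose(c_biased, c_ddof0))
```

Passing ddof explicitly is usually the less ambiguous choice when both bias and ddof could apply, since ddof takes precedence.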
See also
corrcoef : Normalized covariance matrix
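To make "normalized" concrete, the sketch below divides each covariance by the product of the two standard deviations and compares the result with corrcoef. The input array and the names `std` and `R_manual` are assumptions made for this illustration.

```python
import numpy as np

# Hypothetical data: 3 variables (rows), 4 observations (columns).
X = np.array([[0.0, 1.0, 2.0, 4.0],
              [2.0, 1.5, 0.0, -1.0],
              [1.0, 3.0, 2.0, 5.0]])

C = np.cov(X)

# Per-variable standard deviations come from the diagonal (the variances).
std = np.sqrt(np.diag(C))

# R_ij = C_ij / sqrt(C_ii * C_jj)
R_manual = C / np.outer(std, std)

R_numpy = np.corrcoef(X)
print(np.allclose(R_manual, R_numpy))
```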
Notes
Assume that the observations are in the columns of the observation array m and let f = fweights and a = aweights for brevity. The steps to compute the weighted covariance are as follows:
>>> m = np.arange(10, dtype=np.float64)
>>> f = np.arange(10) * 2
>>> a = np.arange(10) ** 2.
>>> ddof = 1
>>> w = f * a
>>> v1 = np.sum(w)
>>> v2 = np.sum(w * a)
>>> m -= np.sum(m * w, axis=None, keepdims=True) / v1
>>> cov = np.dot(m * w, m.T) * v1 / (v1**2 - ddof * v2)
Note that when a == 1, the normalization factor v1 / (v1**2 - ddof * v2) goes over to 1 / (np.sum(f) - ddof) as it should.
Examples
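As a rough check on the steps outlined in the Notes, the sketch below wraps them in a helper for 2-D input and compares the result with np.cov. The function name `weighted_cov_sketch` and the test data are hypothetical; this is an illustration of the outlined steps for variables stored in rows, not the library's implementation.

```python
import numpy as np

def weighted_cov_sketch(m, f, a, ddof=1):
    """Weighted covariance following the outlined steps (variables in rows)."""
    # Copy into a 2-D float array so the caller's data is not modified.
    m = np.array(m, dtype=np.float64, ndmin=2)
    w = f * a
    v1 = np.sum(w)
    v2 = np.sum(w * a)
    # Subtract the weighted mean of each variable (each row).
    m = m - np.sum(m * w, axis=1, keepdims=True) / v1
    return np.dot(m * w, m.T) * v1 / (v1**2 - ddof * v2)

rng = np.random.default_rng(0)
m = rng.normal(size=(3, 8))                       # 3 variables, 8 observations
f = np.array([1, 2, 1, 3, 1, 1, 2, 1])            # integer frequency weights
a = np.array([0.5, 1.0, 2.0, 1.0, 0.25, 1.5, 1.0, 3.0])  # observation weights

sketch = weighted_cov_sketch(m, f, a, ddof=1)
reference = np.cov(m, fweights=f, aweights=a, ddof=1)
print(np.allclose(sketch, reference))

# With a == 1 the factor reduces to 1 / (np.sum(f) - ddof).
ones = np.ones_like(a)
print(np.allclose(weighted_cov_sketch(m, f, ones), np.cov(m, fweights=f)))
```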
>>> import numpy as np
Consider two variables, \(x_0\) and \(x_1\), which correlate perfectly, but in opposite directions:
>>> x = np.array([[0, 2], [1, 1], [2, 0]]).T
>>> x
array([[0, 1, 2],
       [2, 1, 0]])
Note how \(x_0\) increases while \(x_1\) decreases. The covariance matrix shows this clearly:
>>> np.cov(x)
array([[ 1., -1.],
       [-1.,  1.]])
Note that element \(C_{0,1}\), which shows the correlation between \(x_0\) and \(x_1\), is negative.
Further, note how x and y are combined:
>>> x = [-2.1, -1, 4.3]
>>> y = [3, 1.1, 0.12]
>>> X = np.stack((x, y), axis=0)
>>> np.cov(X)
array([[11.71      , -4.286     ], # may vary
       [-4.286     ,  2.144133]])
>>> np.cov(x, y)
array([[11.71      , -4.286     ], # may vary
       [-4.286     ,  2.144133]])
>>> np.cov(x)
array(11.71)
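The weight parameters can be explored in the same way. The sketch below, using made-up weights `f` and `a`, suggests that fweights behaves like repeating each observation and that only the relative size of aweights matters.

```python
import numpy as np

# Same two variables as above, stacked with variables in rows.
x = np.array([[-2.1, -1.0, 4.3],
              [3.0, 1.1, 0.12]])

# Hypothetical frequency weights: observation i counts f[i] times.
f = np.array([1, 3, 2])
c_weighted = np.cov(x, fweights=f)
c_repeated = np.cov(np.repeat(x, f, axis=1))  # repeat columns explicitly
print(np.allclose(c_weighted, c_repeated))

# Hypothetical observation weights: rescaling them by a constant
# leaves the estimate unchanged.
a = np.array([0.2, 1.0, 0.5])
c_a = np.cov(x, aweights=a)
c_a_scaled = np.cov(x, aweights=10 * a)
print(np.allclose(c_a, c_a_scaled))
```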