numpy.cov

numpy.cov(m, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None, *, dtype=None)

Estimate a covariance matrix, given data and weights.

Covariance indicates the level to which two variables vary together. If we examine N-dimensional samples, \(X = [x_1, x_2, ... x_N]^T\), then the covariance matrix element \(C_{ij}\) is the covariance of \(x_i\) and \(x_j\). The element \(C_{ii}\) is the variance of \(x_i\).

See the notes for an outline of the algorithm.
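As a minimal sketch of the definition above, each element can be computed by hand as the summed product of deviations from the mean, divided by N-1 (the default normalization). The names `samples`, `centered`, and `manual_cov` are illustrative and not part of NumPy:

```python
import numpy as np

# Illustrative data: 3 variables (rows), 5 observations (columns).
samples = np.array([[1.0, 2.0, 4.0, 7.0, 11.0],
                    [2.0, 1.0, 0.0, -1.0, -3.0],
                    [0.5, 0.5, 1.5, 1.0, 2.0]])

n_obs = samples.shape[1]

# Subtract each variable's mean from its observations.
centered = samples - samples.mean(axis=1, keepdims=True)

# C[i, j] = sum_k centered[i, k] * centered[j, k] / (N - 1)
manual_cov = centered @ centered.T / (n_obs - 1)

# The diagonal should hold the per-variable (unbiased) variances.
variances = samples.var(axis=1, ddof=1)

print(np.allclose(manual_cov, np.cov(samples)))
print(np.allclose(np.diag(manual_cov), variances))
```

Both comparisons are expected to agree up to floating-point rounding.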

Parameters:
m : array_like

A 1-D or 2-D array containing multiple variables and observations. Each row of m represents a variable, and each column a single observation of all those variables. Also see rowvar below.

y : array_like, optional

An additional set of variables and observations. y has the same form as that of m.

rowvar : bool, optional

If rowvar is True (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations.

bias : bool, optional

Default normalization (False) is by (N-1), where N is the number of observations given (unbiased estimate). If bias is True, then normalization is by N. These values can be overridden by using the keyword ddof in numpy versions >= 1.5.

ddof : int, optional

If not None, the default value implied by bias is overridden. Note that ddof=1 will return the unbiased estimate, even if both fweights and aweights are specified, and ddof=0 will return the simple average. See the notes for the details. The default value is None.

fweights : array_like, int, optional

1-D array of integer frequency weights; the number of times each observation vector should be repeated.

aweights : array_like, optional

1-D array of observation vector weights. These relative weights are typically large for observations considered “important” and smaller for observations considered less “important”. If ddof=0 the array of weights can be used to assign probabilities to observation vectors.

dtype : data-type, optional

Data-type of the result. By default, the return data-type will have at least numpy.float64 precision.

New in version 1.20.
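The following sketch illustrates how rowvar, bias, ddof, and dtype interact, as described in the parameter list above. The array `obs_in_rows` and the result names are illustrative assumptions, not part of the NumPy API:

```python
import numpy as np

# Illustrative data with observations in rows: 4 observations, 2 variables.
obs_in_rows = np.array([[1.0, 8.0],
                        [2.0, 6.0],
                        [4.0, 5.0],
                        [7.0, 1.0]])

# rowvar=False treats columns as variables; same as transposing first.
cov_cols = np.cov(obs_in_rows, rowvar=False)
cov_transposed = np.cov(obs_in_rows.T)

# bias=True normalizes by N, which matches ddof=0.
cov_biased = np.cov(obs_in_rows, rowvar=False, bias=True)
cov_ddof0 = np.cov(obs_in_rows, rowvar=False, ddof=0)

# An explicit ddof overrides whatever bias implies.
cov_override = np.cov(obs_in_rows, rowvar=False, bias=True, ddof=1)

# dtype requests a specific result type (NumPy >= 1.20).
cov_f32 = np.cov(obs_in_rows, rowvar=False, dtype=np.float32)

print(np.allclose(cov_cols, cov_transposed))
print(np.allclose(cov_biased, cov_ddof0))
print(np.allclose(cov_override, cov_cols))
print(cov_f32.dtype)
```

With N = 4 observations, the biased and unbiased estimates differ by the factor (N-1)/N = 3/4.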

Returns:
out : ndarray

The covariance matrix of the variables.

See also

corrcoef

Normalized covariance matrix
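A short sketch of what "normalized" means here: dividing each covariance element by the product of the two corresponding standard deviations should reproduce the output of corrcoef. The array `data` and the intermediate names are illustrative:

```python
import numpy as np

# Illustrative data: 3 variables (rows), 4 observations (columns).
data = np.array([[0.0, 1.0, 2.0, 4.0],
                 [3.0, 1.0, 2.0, 0.0],
                 [1.0, 1.5, 1.0, 3.0]])

c = np.cov(data)

# Standard deviations are the square roots of the diagonal (the variances).
std = np.sqrt(np.diag(c))

# Divide each element C[i, j] by std[i] * std[j].
normalized = c / np.outer(std, std)

print(np.allclose(normalized, np.corrcoef(data)))
```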

Notes

Assume that the observations are in the columns of the observation array m and let f = fweights and a = aweights for brevity. The steps to compute the weighted covariance are as follows:

>>> m = np.arange(10, dtype=np.float64)
>>> f = np.arange(10) * 2
>>> a = np.arange(10) ** 2.
>>> ddof = 1
>>> w = f * a
>>> v1 = np.sum(w)
>>> v2 = np.sum(w * a)
>>> m -= np.sum(m * w, axis=None, keepdims=True) / v1
>>> cov = np.dot(m * w, m.T) * v1 / (v1**2 - ddof * v2)

Note that when a == 1, the normalization factor v1 / (v1**2 - ddof * v2) goes over to 1 / (np.sum(f) - ddof) as it should.
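The steps above can be wrapped in a small function and checked against np.cov for 2-D input. This is a sketch under stated assumptions, not NumPy's actual implementation: `weighted_cov` is a hypothetical helper, and the random test data is purely illustrative.

```python
import numpy as np

def weighted_cov(m, fweights, aweights, ddof=1):
    """Hypothetical helper following the steps in the Notes (not a NumPy API).

    `m` holds variables in rows and observations in columns.
    """
    m = np.array(m, dtype=np.float64, ndmin=2)  # copy, so the input is untouched
    f = np.asarray(fweights)
    a = np.asarray(aweights, dtype=np.float64)
    w = f * a            # combined weight per observation
    v1 = np.sum(w)       # sum of the combined weights
    v2 = np.sum(w * a)   # enters the ddof correction term
    # Subtract the weighted mean of each variable.
    m -= np.sum(m * w, axis=1, keepdims=True) / v1
    return np.dot(m * w, m.T) * v1 / (v1**2 - ddof * v2)

rng = np.random.default_rng(0)
m = rng.normal(size=(3, 10))   # 3 variables, 10 observations
f = np.arange(1, 11)           # integer frequency weights
a = np.arange(1, 11) ** 2.0    # observation weights

ours = weighted_cov(m, f, a, ddof=1)
reference = np.cov(m, fweights=f, aweights=a, ddof=1)
print(np.allclose(ours, reference))

# With a == 1 the factor v1 / (v1**2 - ddof*v2) reduces to 1 / (sum(f) - ddof).
ones = np.ones(10)
w = f * ones
v1, v2, ddof = np.sum(w), np.sum(w * ones), 1
print(np.isclose(v1 / (v1**2 - ddof * v2), 1.0 / (np.sum(f) - ddof)))
```

The helper copies its input before centering, whereas the doctest above modifies m in place.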

Examples

>>> import numpy as np

Consider two variables, \(x_0\) and \(x_1\), which correlate perfectly, but in opposite directions:

>>> x = np.array([[0, 2], [1, 1], [2, 0]]).T
>>> x
array([[0, 1, 2],
       [2, 1, 0]])

Note how \(x_0\) increases while \(x_1\) decreases. The covariance matrix shows this clearly:

>>> np.cov(x)
array([[ 1., -1.],
       [-1.,  1.]])

Note that element \(C_{0,1}\), which shows the correlation between \(x_0\) and \(x_1\), is negative.

Further, note how x and y are combined:

>>> x = [-2.1, -1, 4.3]
>>> y = [3, 1.1, 0.12]
>>> X = np.stack((x, y), axis=0)
>>> np.cov(X)
array([[11.71      , -4.286     ], # may vary
       [-4.286     ,  2.144133]])
>>> np.cov(x, y)
array([[11.71      , -4.286     ], # may vary
       [-4.286     ,  2.144133]])
>>> np.cov(x)
array(11.71)
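The weight parameters can be illustrated the same way. In this sketch, frequency weights are compared against literally repeating observations, and aweights with ddof=0 are compared against a covariance computed from probabilities. The arrays `counts` and `probs` are illustrative values chosen for this example:

```python
import numpy as np

# Two variables (rows), three observations (columns).
obs = np.array([[-2.1, -1.0, 4.3],
                [3.0, 1.1, 0.12]])

# fweights: passing counts should match repeating each observation
# (column) that many times.
counts = np.array([1, 2, 3])
repeated = np.repeat(obs, counts, axis=1)
cov_counts = np.cov(obs, fweights=counts)
cov_repeated = np.cov(repeated)

# aweights with ddof=0: the weights act as probabilities of each observation.
probs = np.array([0.2, 0.3, 0.5])
cov_probs = np.cov(obs, aweights=probs, ddof=0)
mean = obs @ probs                        # E[x]
second_moment = (obs * probs) @ obs.T     # E[x x^T]
cov_expected = second_moment - np.outer(mean, mean)

print(np.allclose(cov_counts, cov_repeated))
print(np.allclose(cov_probs, cov_expected))
```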