GraphicalLasso

class sklearn.covariance.GraphicalLasso(alpha=0.01, *, mode='cd', covariance=None, tol=0.0001, enet_tol=0.0001, max_iter=100, verbose=False, eps=np.float64(2.220446049250313e-16), assume_centered=False)

Sparse inverse covariance estimation with an l1-penalized estimator.

For a usage example, see Visualizing the stock market structure.

Read more in the User Guide.

Changed in version 0.20: GraphLasso has been renamed to GraphicalLasso.

Parameters:
alpha : float, default=0.01

The regularization parameter: the higher alpha, the more regularization and the sparser the inverse covariance. Range is (0, inf].

mode : {'cd', 'lars'}, default='cd'

The Lasso solver to use: coordinate descent or LARS. Use LARS for very sparse underlying graphs, where p > n (more features than samples). Otherwise prefer cd, which is more numerically stable.

covariance : "precomputed", default=None

If covariance is "precomputed", the input data in fit is assumed to be the covariance matrix. If None, the empirical covariance is estimated from the data X.

Added in version 1.3.

tol : float, default=1e-4

The tolerance to declare convergence: if the dual gap goes below this value, iterations are stopped. Range is (0, inf].

enet_tol : float, default=1e-4

The tolerance for the elastic net solver used to calculate the descent direction. This parameter controls the accuracy of the search direction for a given column update, not of the overall parameter estimate. Only used for mode='cd'. Range is (0, inf].

max_iter : int, default=100

The maximum number of iterations.

verbose : bool, default=False

If verbose is True, the objective function and dual gap are printed at each iteration.

eps : float, default=eps

The machine-precision regularization in the computation of the Cholesky diagonal factors. Increase this for very ill-conditioned systems. Default is np.finfo(np.float64).eps.

Added in version 1.3.

assume_centered : bool, default=False

If True, data are not centered before computation. Useful when working with data whose mean is almost, but not exactly, zero. If False, data are centered before computation.
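As a rough illustration of how alpha drives sparsity, the sketch below fits the estimator at a few penalty levels and counts the exactly-zero off-diagonal entries of precision_. The synthetic data, the alpha grid, and the helper count_offdiag_zeros are assumptions made for illustration only.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Illustrative data: 200 samples drawn from a known 4x4 covariance.
rng = np.random.RandomState(0)
true_cov = np.array([[0.8, 0.0, 0.2, 0.0],
                     [0.0, 0.4, 0.0, 0.0],
                     [0.2, 0.0, 0.3, 0.1],
                     [0.0, 0.0, 0.1, 0.7]])
X = rng.multivariate_normal(mean=np.zeros(4), cov=true_cov, size=200)


def count_offdiag_zeros(precision):
    """Hypothetical helper: number of exactly-zero off-diagonal entries."""
    off_diag = precision[~np.eye(precision.shape[0], dtype=bool)]
    return int(np.sum(off_diag == 0))


# Fit once per penalty level and record the sparsity of the precision matrix.
zeros_by_alpha = {}
for alpha in (0.01, 0.1, 1.0):
    model = GraphicalLasso(alpha=alpha).fit(X)
    zeros_by_alpha[alpha] = count_offdiag_zeros(model.precision_)

print(zeros_by_alpha)
```

Larger alpha values typically yield more zeros; the exact counts depend on the data and the installed scikit-learn version.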

Attributes:
location_ : ndarray of shape (n_features,)

Estimated location, i.e. the estimated mean.

covariance_ : ndarray of shape (n_features, n_features)

Estimated covariance matrix.

precision_ : ndarray of shape (n_features, n_features)

Estimated pseudo-inverse matrix.

n_iter_ : int

Number of iterations run.

costs_ : list of (objective, dual_gap) pairs

The list of values of the objective function and the dual gap at each iteration. Returned only if return_costs is True.

Added in version 1.3.

n_features_in_ : int

Number of features seen during fit.

Added in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

Added in version 1.0.
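A minimal sketch of inspecting the fitted attributes listed above. The random data and the choice alpha=0.05 are assumptions for illustration.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Illustrative data: 200 samples, 4 features.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 4))

model = GraphicalLasso(alpha=0.05, max_iter=100).fit(X)

print(model.location_.shape)    # one mean per feature
print(model.covariance_.shape)  # square, n_features x n_features
print(model.precision_.shape)   # same shape as covariance_
print(model.n_iter_)            # iterations actually run
print(model.n_features_in_)     # number of columns seen in fit
```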

See also

graphical_lasso

L1-penalized covariance estimator.

GraphicalLassoCV

Sparse inverse covariance with cross-validated choice of the l1 penalty.

Examples

>>> import numpy as np
>>> from sklearn.covariance import GraphicalLasso
>>> true_cov = np.array([[0.8, 0.0, 0.2, 0.0],
...                      [0.0, 0.4, 0.0, 0.0],
...                      [0.2, 0.0, 0.3, 0.1],
...                      [0.0, 0.0, 0.1, 0.7]])
>>> np.random.seed(0)
>>> X = np.random.multivariate_normal(mean=[0, 0, 0, 0],
...                                   cov=true_cov,
...                                   size=200)
>>> cov = GraphicalLasso().fit(X)
>>> np.around(cov.covariance_, decimals=3)
array([[0.816, 0.049, 0.218, 0.019],
       [0.049, 0.364, 0.017, 0.034],
       [0.218, 0.017, 0.322, 0.093],
       [0.019, 0.034, 0.093, 0.69 ]])
>>> np.around(cov.location_, decimals=3)
array([0.073, 0.04 , 0.038, 0.143])

error_norm(comp_cov, norm='frobenius', scaling=True, squared=True)

Compute the Mean Squared Error between two covariance estimators.

Parameters:
comp_cov : array-like of shape (n_features, n_features)

The covariance to compare with.

norm : {"frobenius", "spectral"}, default="frobenius"

The type of norm used to compute the error. Available error types:
- 'frobenius' (default): sqrt(tr(A^t.A))
- 'spectral': sqrt(max(eigenvalues(A^t.A)))
where A is the error (comp_cov - self.covariance_).

scaling : bool, default=True

If True (default), the squared error norm is divided by n_features. If False, the squared error norm is not rescaled.

squared : bool, default=True

Whether to compute the squared error norm or the error norm. If True (default), the squared error norm is returned. If False, the error norm is returned.

Returns:
result : float

The Mean Squared Error (in the sense of the Frobenius norm) between self and comp_cov covariance estimators.
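The following sketch compares error_norm against a direct NumPy computation of the Frobenius formulas described above. The synthetic data and variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Illustrative data drawn from a known covariance.
rng = np.random.RandomState(0)
true_cov = np.array([[0.8, 0.0, 0.2, 0.0],
                     [0.0, 0.4, 0.0, 0.0],
                     [0.2, 0.0, 0.3, 0.1],
                     [0.0, 0.0, 0.1, 0.7]])
X = rng.multivariate_normal(mean=np.zeros(4), cov=true_cov, size=200)
model = GraphicalLasso().fit(X)

# Default settings: squared Frobenius norm divided by n_features.
mse = model.error_norm(true_cov)
error = true_cov - model.covariance_
manual_mse = np.sum(error ** 2) / error.shape[0]

# Unscaled, unsquared variant: the plain Frobenius norm of the error.
frob = model.error_norm(true_cov, scaling=False, squared=False)
manual_frob = np.linalg.norm(error, ord="fro")

print(mse, manual_mse)
print(frob, manual_frob)
```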

fit(X, y=None)

Fit the GraphicalLasso model to X.

Parameters:
X : array-like of shape (n_samples, n_features)

Data from which to compute the covariance estimate.

y : Ignored

Not used, present for API consistency by convention.

Returns:
self : object

Returns the instance itself.

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

get_precision()

Getter for the precision matrix.

Returns:
precision_ : array-like of shape (n_features, n_features)

The precision matrix associated with the current covariance object.

mahalanobis(X)

Compute the squared Mahalanobis distances of given observations.

For a detailed example of how outliers affect the Mahalanobis distance, see Robust covariance estimation and Mahalanobis distances relevance.

Parameters:
X : array-like of shape (n_samples, n_features)

The observations whose Mahalanobis distances we compute. Observations are assumed to be drawn from the same distribution as the data used in fit.

Returns:
dist : ndarray of shape (n_samples,)

Squared Mahalanobis distances of the observations.
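As a sketch, the squared Mahalanobis distance can be reproduced by hand as (x - location_)^T precision_ (x - location_). The random data and variable names below are assumptions for illustration.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Illustrative data: 200 samples, 4 features.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 4))
model = GraphicalLasso(alpha=0.05).fit(X)

# Squared distances as returned by the estimator.
sq_dist = model.mahalanobis(X)

# Manual computation: center each row, then evaluate the quadratic form
# with the estimated precision matrix.
centered = X - model.location_
manual_sq_dist = np.einsum("ij,jk,ik->i", centered, model.get_precision(), centered)

print(sq_dist[:3])
print(manual_sq_dist[:3])
```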

score(X_test, y=None)

Compute the log-likelihood of X_test under the estimated Gaussian model.

The Gaussian model is defined by its mean and covariance matrix, which are represented respectively by self.location_ and self.covariance_.

Parameters:
X_test : array-like of shape (n_samples, n_features)

Test data of which we compute the likelihood, where n_samples is the number of samples and n_features is the number of features. X_test is assumed to be drawn from the same distribution as the data used in fit (including centering).

y : Ignored

Not used, present for API consistency by convention.

Returns:
res : float

The log-likelihood of X_test with self.location_ and self.covariance_ as estimators of the Gaussian model mean and covariance matrix respectively.
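The sketch below compares score with a hand-computed Gaussian log-likelihood. It assumes that score returns the average per-sample log-likelihood evaluated with the estimated precision matrix, which is how current scikit-learn releases appear to compute it; the train/test split and variable names are illustrative.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Illustrative data, split into a training part and a held-out part.
rng = np.random.RandomState(0)
X = rng.normal(size=(300, 4))
X_train, X_test = X[:200], X[200:]

model = GraphicalLasso(alpha=0.05).fit(X_train)
sk_score = model.score(X_test)

# Average Gaussian log-density:
# -0.5 * (mean squared Mahalanobis distance - log det(precision) + p * log(2*pi))
n_features = X_test.shape[1]
_, logdet_precision = np.linalg.slogdet(model.get_precision())
mean_sq_dist = model.mahalanobis(X_test).mean()
manual_score = -0.5 * (
    mean_sq_dist - logdet_precision + n_features * np.log(2 * np.pi)
)

print(sk_score, manual_score)
```

Because the value is a per-sample average, scores from test sets of different sizes remain comparable.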

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.
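A brief sketch of get_params and set_params, including the <component>__<parameter> form on a nested object. The step names "scale" and "glasso" are arbitrary labels chosen for this illustration.

```python
from sklearn.covariance import GraphicalLasso
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Update parameters on a standalone estimator and read them back.
model = GraphicalLasso()
model.set_params(alpha=0.5, max_iter=200)
params = model.get_params()
print(params["alpha"], params["max_iter"])

# Update a parameter of a nested estimator through its step name.
pipe = Pipeline([("scale", StandardScaler()), ("glasso", GraphicalLasso())])
pipe.set_params(glasso__alpha=0.2)
print(pipe.named_steps["glasso"].alpha)
```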