OAS
- class sklearn.covariance.OAS(*, store_precision=True, assume_centered=False)
Oracle Approximating Shrinkage Estimator.
Read more in the User Guide.
- Parameters:
- store_precision : bool, default=True
Specify if the estimated precision is stored.
- assume_centered : bool, default=False
If True, data will not be centered before computation. Useful when working with data whose mean is almost, but not exactly, zero. If False (default), data will be centered before computation.
- Attributes:
- covariance_ : ndarray of shape (n_features, n_features)
Estimated covariance matrix.
- location_ : ndarray of shape (n_features,)
Estimated location, i.e. the estimated mean.
- precision_ : ndarray of shape (n_features, n_features)
Estimated pseudo-inverse matrix (stored only if store_precision is True).
- shrinkage_ : float
Coefficient in the convex combination used to compute the shrunk estimate. Range is [0, 1].
- n_features_in_ : int
Number of features seen during fit.
Added in version 0.24.
- feature_names_in_ : ndarray of shape (n_features_in_,)
Names of features seen during fit. Defined only when X has feature names that are all strings.
Added in version 1.0.
See also
EllipticEnvelope : An object for detecting outliers in a Gaussian distributed dataset.
EmpiricalCovariance : Maximum likelihood covariance estimator.
GraphicalLasso : Sparse inverse covariance estimation with an l1-penalized estimator.
GraphicalLassoCV : Sparse inverse covariance with cross-validated choice of the l1 penalty.
LedoitWolf : LedoitWolf Estimator.
MinCovDet : Minimum Covariance Determinant (robust estimator of covariance).
ShrunkCovariance : Covariance estimator with shrinkage.
Notes
The regularised covariance is:
(1 - shrinkage) * cov + shrinkage * mu * np.identity(n_features),
where mu = trace(cov) / n_features and shrinkage is given by the OAS formula (see [1]).
The shrinkage formulation implemented here differs from Eq. 23 in [1]. In the original article, formula (23) states that 2/p (p being the number of features) is multiplied by Trace(cov*cov) in both the numerator and denominator, but this operation is omitted because for a large p, the value of 2/p is so small that it doesn't affect the value of the estimator.
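As a minimal sketch of the convex combination described in the Notes, the snippet below rebuilds the regularised covariance from the empirical covariance and the fitted shrinkage_ coefficient, then compares it with covariance_. The data and the variable names (shrunk, mu) are illustrative assumptions; the shrinkage coefficient itself is taken from the fitted estimator rather than recomputed.

```python
import numpy as np
from sklearn.covariance import OAS, empirical_covariance

rng = np.random.RandomState(0)
X = rng.multivariate_normal(mean=[0, 0], cov=[[0.8, 0.3], [0.3, 0.4]], size=500)

oas = OAS().fit(X)
n_features = X.shape[1]

# Maximum-likelihood (empirical) covariance of the data, centered by its mean.
cov = empirical_covariance(X)
mu = np.trace(cov) / n_features

# Convex combination of the empirical covariance and a scaled identity.
shrunk = (1 - oas.shrinkage_) * cov + oas.shrinkage_ * mu * np.identity(n_features)

# Should agree with the estimator's own covariance_ up to rounding.
matches = np.allclose(shrunk, oas.covariance_)
```

Because mu is the average eigenvalue of cov, the shrinkage target keeps the trace of the estimate unchanged while pulling the eigenvalues toward each other.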
References
Examples
>>> import numpy as np
>>> from sklearn.covariance import OAS
>>> from sklearn.datasets import make_gaussian_quantiles
>>> real_cov = np.array([[.8, .3],
...                      [.3, .4]])
>>> rng = np.random.RandomState(0)
>>> X = rng.multivariate_normal(mean=[0, 0],
...                             cov=real_cov,
...                             size=500)
>>> oas = OAS().fit(X)
>>> oas.covariance_
array([[0.7533, 0.2763],
       [0.2763, 0.3964]])
>>> oas.precision_
array([[ 1.7833, -1.2431],
       [-1.2431,  3.3889]])
>>> oas.shrinkage_
np.float64(0.0195)
See also Shrinkage covariance estimation: LedoitWolf vs OAS and max-likelihood and Ledoit-Wolf vs OAS estimation for more detailed examples.
- error_norm(comp_cov, norm='frobenius', scaling=True, squared=True)
Compute the Mean Squared Error between two covariance estimators.
- Parameters:
- comp_cov : array-like of shape (n_features, n_features)
The covariance to compare with.
- norm : {"frobenius", "spectral"}, default="frobenius"
The type of norm used to compute the error. Available error types:
- 'frobenius' (default): sqrt(tr(A^t.A))
- 'spectral': sqrt(max(eigenvalues(A^t.A)))
where A is the error (comp_cov - self.covariance_).
- scaling : bool, default=True
If True (default), the squared error norm is divided by n_features. If False, the squared error norm is not rescaled.
- squared : bool, default=True
Whether to compute the squared error norm or the error norm. If True (default), the squared error norm is returned. If False, the error norm is returned.
- Returns:
- result : float
The Mean Squared Error (in the sense of the Frobenius norm) between self and comp_cov covariance estimators.
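The following sketch illustrates how the norm, scaling, and squared options combine, by comparing error_norm against a hand-computed value. The manual formulas follow the definitions listed above; the data and the names manual_mse and manual_spectral are assumptions made for illustration.

```python
import numpy as np
from sklearn.covariance import OAS

real_cov = np.array([[0.8, 0.3], [0.3, 0.4]])
rng = np.random.RandomState(0)
X = rng.multivariate_normal(mean=[0, 0], cov=real_cov, size=500)
oas = OAS().fit(X)

# Default call: squared Frobenius norm of the error, divided by n_features.
mse = oas.error_norm(real_cov)

# Manual equivalent: tr(A^T A) is the sum of squared entries of A.
A = real_cov - oas.covariance_
manual_mse = np.sum(A**2) / A.shape[0]

# Unscaled, unsquared spectral norm: the largest singular value of A.
spectral = oas.error_norm(real_cov, norm="spectral", scaling=False, squared=False)
manual_spectral = np.linalg.norm(A, ord=2)
```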
- fit(X, y=None)
Fit the Oracle Approximating Shrinkage covariance model to X.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Training data, where n_samples is the number of samples and n_features is the number of features.
- y : Ignored
Not used, present for API consistency by convention.
- Returns:
- self : object
Returns the instance itself.
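A short sketch of fit with and without centering follows. It assumes a synthetic dataset whose mean is close to, but not exactly, zero; the variable names are illustrative only.

```python
import numpy as np
from sklearn.covariance import OAS

rng = np.random.RandomState(0)
X = rng.multivariate_normal(
    mean=[0.05, -0.05], cov=[[0.8, 0.3], [0.3, 0.4]], size=500
)

# fit returns the estimator itself, so construction and fitting can be chained.
centered = OAS(assume_centered=False).fit(X)
uncentered = OAS(assume_centered=True).fit(X)

# With centering, location_ is the sample mean; without it, location_ is zero.
mean_location = centered.location_
zero_location = uncentered.location_
```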
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
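For illustration, get_params on an OAS instance returns the two constructor arguments documented above. This sketch sets one of them to a non-default value to show that the dictionary reflects the instance's configuration.

```python
from sklearn.covariance import OAS

oas = OAS(store_precision=False)

# Maps each constructor parameter name to its current value.
params = oas.get_params()
```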
- get_precision()
Getter for the precision matrix.
- Returns:
- precision_ : array-like of shape (n_features, n_features)
The precision matrix associated with the current covariance object.
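As a hedged sketch, the getter can be used even when the precision was not stored at fit time. Since the precision is described above as a pseudo-inverse of the covariance, multiplying the two should give (approximately) the identity when the covariance is well conditioned, as it is for this illustrative dataset.

```python
import numpy as np
from sklearn.covariance import OAS

rng = np.random.RandomState(0)
X = rng.multivariate_normal(mean=[0, 0], cov=[[0.8, 0.3], [0.3, 0.4]], size=500)

# store_precision=False skips storing the precision at fit time.
oas = OAS(store_precision=False).fit(X)

# get_precision() still returns the precision matrix on demand.
precision = oas.get_precision()
product = precision @ oas.covariance_
```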
- mahalanobis(X)
Compute the squared Mahalanobis distances of given observations.
For a detailed example of how outliers affect the Mahalanobis distance, see Robust covariance estimation and Mahalanobis distances relevance.
- Parameters:
- X : array-like of shape (n_samples, n_features)
The observations whose Mahalanobis distances we compute. Observations are assumed to be drawn from the same distribution as the data used in fit.
- Returns:
- dist : ndarray of shape (n_samples,)
Squared Mahalanobis distances of the observations.
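The squared Mahalanobis distance of an observation x is (x - location)^T precision (x - location). The sketch below checks mahalanobis against that formula, using the fitted location_ and precision_; X_new and manual are names assumed for this example.

```python
import numpy as np
from sklearn.covariance import OAS

rng = np.random.RandomState(0)
X = rng.multivariate_normal(mean=[0, 0], cov=[[0.8, 0.3], [0.3, 0.4]], size=500)
oas = OAS().fit(X)

# A few new observations, including one far from the bulk of the data.
X_new = np.array([[0.0, 0.0], [1.0, 0.5], [4.0, -3.0]])
dist = oas.mahalanobis(X_new)

# Manual computation of the quadratic form, one value per row of X_new.
diff = X_new - oas.location_
manual = np.einsum("ij,jk,ik->i", diff, oas.precision_, diff)
```

Note that the returned values are squared distances, so take a square root if distances on the original scale are needed.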
- score(X_test, y=None)
Compute the log-likelihood of X_test under the estimated Gaussian model.
The Gaussian model is defined by its mean and covariance matrix, which are represented respectively by self.location_ and self.covariance_.
- Parameters:
- X_test : array-like of shape (n_samples, n_features)
Test data of which we compute the likelihood, where n_samples is the number of samples and n_features is the number of features. X_test is assumed to be drawn from the same distribution as the data used in fit (including centering).
- y : Ignored
Not used, present for API consistency by convention.
- Returns:
- res : float
The log-likelihood of X_test with self.location_ and self.covariance_ as estimators of the Gaussian model mean and covariance matrix respectively.
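One way to sanity-check score is to compare it with the Gaussian log-density from SciPy. The sketch below assumes that the returned value is the log-likelihood averaged over the test samples, which the comparison with scipy.stats.multivariate_normal is meant to confirm; the train/test split is an arbitrary choice for illustration.

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.covariance import OAS

rng = np.random.RandomState(0)
X = rng.multivariate_normal(mean=[0, 0], cov=[[0.8, 0.3], [0.3, 0.4]], size=600)
X_train, X_test = X[:500], X[500:]

oas = OAS().fit(X_train)
score = oas.score(X_test)

# Per-sample Gaussian log-density under the fitted mean and covariance,
# averaged over the test set.
manual = multivariate_normal.logpdf(
    X_test, mean=oas.location_, cov=oas.covariance_
).mean()
```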
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
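A brief sketch of both usages follows: setting a parameter directly on an OAS instance, and setting it through a Pipeline with the <component>__<parameter> form. The step names "scale" and "cov" are hypothetical and chosen only for this example.

```python
from sklearn.covariance import OAS
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simple estimator: set_params returns the same instance.
oas = OAS()
returned = oas.set_params(assume_centered=True)

# Nested object: address the OAS step through its (hypothetical) name "cov".
pipe = Pipeline([("scale", StandardScaler()), ("cov", OAS())])
pipe.set_params(cov__store_precision=False)
nested_value = pipe.named_steps["cov"].store_precision
```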
Gallery examples
Normal, Ledoit-Wolf and OAS Linear Discriminant Analysis for classification
Shrinkage covariance estimation: LedoitWolf vs OAS and max-likelihood
