LinearRegression

class sklearn.linear_model.LinearRegression(*, fit_intercept=True, copy_X=True, tol=1e-06, n_jobs=None, positive=False)

Ordinary least squares Linear Regression.

LinearRegression fits a linear model with coefficients w = (w1, …, wp) to minimize the residual sum of squares between the observed targets in the dataset, and the targets predicted by the linear approximation.

Parameters:
fit_intercept : bool, default=True

Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered).

copy_X : bool, default=True

If True, X will be copied; else, it may be overwritten.

tol : float, default=1e-6

The precision of the solution (coef_) is determined by tol, which specifies a different convergence criterion for the lsqr solver. tol is set as atol and btol of scipy.sparse.linalg.lsqr when fitting on sparse training data. This parameter has no effect when fitting on dense data.

Added in version 1.7.

n_jobs : int, default=None

The number of jobs to use for the computation. This will only provide speedup in case of sufficiently large problems, that is if firstly n_targets > 1 and secondly X is sparse or if positive is set to True. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.

positive : bool, default=False

When set to True, forces the coefficients to be positive. This option is only supported for dense arrays.

For a comparison between a linear regression model with positive constraints on the regression coefficients and a linear regression without such constraints, see Non-negative least squares.

Added in version 0.24.
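
A minimal sketch of how fit_intercept and positive change the fitted model. The toy data and variable names are illustrative assumptions, not part of the reference above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data generated from y = 2 * x_0 - 1 * x_1 + 5 (no noise); illustrative only.
X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 3.0], [3.0, 1.0], [4.0, 2.0]])
y = X @ np.array([2.0, -1.0]) + 5.0

# Default: an intercept is estimated alongside the coefficients.
with_intercept = LinearRegression().fit(X, y)

# fit_intercept=False: no intercept is used, and intercept_ is reported as 0.0.
no_intercept = LinearRegression(fit_intercept=False).fit(X, y)

# positive=True: the coefficients are constrained to be non-negative (dense X only).
non_negative = LinearRegression(positive=True).fit(X, y)

print(with_intercept.coef_, with_intercept.intercept_)
print(no_intercept.coef_, no_intercept.intercept_)
print(non_negative.coef_, non_negative.intercept_)
```

Because the generating coefficient of x_1 is negative, the constrained model cannot reproduce the data exactly, whereas the default model can.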

Attributes:
coef_ : array of shape (n_features,) or (n_targets, n_features)

Estimated coefficients for the linear regression problem. If multiple targets are passed during the fit (y 2D), this is a 2D array of shape (n_targets, n_features), while if only one target is passed, this is a 1D array of length n_features.

rank_ : int

Rank of matrix X. Only available when X is dense.

singular_ : array of shape (min(n_samples, n_features),)

Singular values of X. Only available when X is dense.

intercept_ : float or array of shape (n_targets,)

Independent term in the linear model. Set to 0.0 if fit_intercept=False.

n_features_in_ : int

Number of features seen during fit.

Added in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

Added in version 1.0.
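
The attribute shapes listed above can be checked on a small dense problem. This is a sketch on randomly generated data; the names single and multi are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))                      # dense: n_samples=20, n_features=3
y_single = X @ np.array([1.0, 2.0, 3.0]) + 0.5    # one target (1D y)
y_multi = np.column_stack([y_single, -y_single])  # two targets (2D y)

single = LinearRegression().fit(X, y_single)
multi = LinearRegression().fit(X, y_multi)

print(single.coef_.shape)        # 1D: one coefficient per feature
print(multi.coef_.shape)         # 2D: (n_targets, n_features)
print(multi.intercept_.shape)    # one intercept per target
print(single.n_features_in_)
print(single.rank_, single.singular_.shape)

# X is a plain NumPy array without column names, so feature_names_in_ is not set.
print(hasattr(single, "feature_names_in_"))
```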

See also

Ridge

Ridge regression addresses some of the problems of Ordinary Least Squares by imposing a penalty on the size of the coefficients with l2 regularization.

Lasso

The Lasso is a linear model that estimates sparse coefficients with l1 regularization.

ElasticNet

Elastic-Net is a linear regression model trained with both l1 and l2 -norm regularization of the coefficients.

Notes

From the implementation point of view, this is just plain Ordinary Least Squares (scipy.linalg.lstsq) or Non Negative Least Squares (scipy.optimize.nnls) wrapped as a predictor object.
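
To illustrate the note, the sketch below reproduces a dense fit by calling scipy.linalg.lstsq directly. It assumes the intercept can be recovered by centering X and y before solving; that is one way to arrive at the same solution, not necessarily the exact internal code path.

```python
import numpy as np
from scipy.linalg import lstsq
from sklearn.linear_model import LinearRegression

X = np.array([[1.0, 1.0], [1.0, 2.0], [2.0, 2.0], [2.0, 3.0]])
y = X @ np.array([1.0, 2.0]) + 3.0

reg = LinearRegression().fit(X, y)

# Assumption: handle the intercept by centering, then solve plain least squares.
X_mean, y_mean = X.mean(axis=0), y.mean()
coef, residues, rank, singular_values = lstsq(X - X_mean, y - y_mean)
intercept = y_mean - X_mean @ coef

print(coef, intercept)            # manual solution
print(reg.coef_, reg.intercept_)  # estimator solution
```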

Examples

>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
>>> # y = 1 * x_0 + 2 * x_1 + 3
>>> y = np.dot(X, np.array([1, 2])) + 3
>>> reg = LinearRegression().fit(X, y)
>>> reg.score(X, y)
1.0
>>> reg.coef_
array([1., 2.])
>>> reg.intercept_
np.float64(3.0)
>>> reg.predict(np.array([[3, 5]]))
array([16.])

fit(X, y, sample_weight=None)

Fit linear model.

Parameters:
X : {array-like, sparse matrix} of shape (n_samples, n_features)

Training data.

y : array-like of shape (n_samples,) or (n_samples, n_targets)

Target values. Will be cast to X’s dtype if necessary.

sample_weight : array-like of shape (n_samples,), default=None

Individual weights for each sample.

Added in version 0.17: parameter sample_weight support to LinearRegression.

Returns:
self : object

Fitted Estimator.
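
One way to build intuition for sample_weight: for ordinary least squares, an integer weight has the same effect as repeating the sample that many times. The sketch below checks this on made-up data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 1.5, 4.0])

# Give the last sample a weight of 3 ...
weighted = LinearRegression().fit(X, y, sample_weight=[1, 1, 1, 3])

# ... which should match a fit where that sample simply appears three times.
X_rep = np.vstack([X, X[-1:], X[-1:]])
y_rep = np.concatenate([y, y[-1:], y[-1:]])
repeated = LinearRegression().fit(X_rep, y_rep)

print(weighted.coef_, weighted.intercept_)
print(repeated.coef_, repeated.intercept_)
```

Since fit returns the fitted estimator itself, the call can be chained directly onto the constructor, as done here.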

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.
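
A short sketch of what the returned dictionary looks like and how it can be reused; the variable names are illustrative.

```python
from sklearn.linear_model import LinearRegression

reg = LinearRegression(fit_intercept=False, positive=True)

# Constructor arguments, keyed by parameter name.
params = reg.get_params()
print(params["fit_intercept"], params["positive"])

# The dictionary can be passed back to the constructor to build an
# unfitted estimator with the same configuration.
same_config = LinearRegression(**params)
```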

predict(X)

Predict using the linear model.

Parameters:
X : array-like or sparse matrix, shape (n_samples, n_features)

Samples.

Returns:
C : array, shape (n_samples,)

Returns predicted values.
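
For a fitted model, the prediction is the linear combination of the inputs with coef_ plus intercept_. The sketch below compares that manual computation with predict, using the data from the class example and two made-up query points.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = X @ np.array([1, 2]) + 3          # y = 1 * x_0 + 2 * x_1 + 3
reg = LinearRegression().fit(X, y)

X_new = np.array([[3, 5], [0, 0]])    # illustrative query points
manual = X_new @ reg.coef_ + reg.intercept_
predicted = reg.predict(X_new)

print(manual)
print(predicted)
```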

score(X, y, sample_weight=None)

Return coefficient of determination on test data.

The coefficient of determination, \(R^2\), is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a \(R^2\) score of 0.0.

Parameters:
X : array-like of shape (n_samples, n_features)

Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True values for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:
score : float

\(R^2\) of self.predict(X) w.r.t. y.
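
The definition of \(R^2\) given for this method can be written out by hand. The helper r2 below is a hypothetical, unweighted, pure-Python sketch of that formula, not the library's implementation; it does not handle the degenerate case where all true values are equal.

```python
def r2(y_true, y_pred):
    """Return 1 - u / v for the residual (u) and total (v) sums of squares."""
    mean_true = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1 - u / v


y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r2(y_true, y_true)               # exact predictions
constant = r2(y_true, [2.5] * 4)           # always predict the mean of y_true
worse = r2(y_true, [4.0, 3.0, 2.0, 1.0])   # predictions in reversed order

print(perfect, constant, worse)            # 1.0 0.0 -3.0
```

The three cases match the statements above: a perfect fit scores 1.0, the constant mean predictor scores 0.0, and a sufficiently bad model scores below zero.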

Notes

The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of r2_score. This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
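
With several targets, uniform averaging means the reported score is the plain mean of the per-target \(R^2\) values. A sketch on synthetic data (the data and variable names are assumptions made for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
# Two targets with different noise levels, so their R^2 values differ.
Y = np.column_stack([
    X @ np.array([1.0, -2.0]) + rng.normal(scale=0.1, size=30),
    X @ np.array([0.5, 0.5]) + rng.normal(scale=1.0, size=30),
])

reg = LinearRegression().fit(X, Y)
Y_pred = reg.predict(X)

per_target = r2_score(Y, Y_pred, multioutput="raw_values")
averaged = r2_score(Y, Y_pred, multioutput="uniform_average")

print(per_target)
print(averaged, reg.score(X, Y))
```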

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> LinearRegression

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in fit.

Returns:
self : object

The updated object.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.
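
The <component>__<parameter> form can be sketched with a two-step Pipeline. The step names "scale" and "reg" are arbitrary labels chosen for this example.

```python
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([("scale", StandardScaler()), ("reg", LinearRegression())])

# Update one parameter on each step using <component>__<parameter> names.
pipe.set_params(reg__fit_intercept=False, scale__with_mean=False)

print(pipe.named_steps["reg"].fit_intercept)
print(pipe.named_steps["scale"].with_mean)

# On a simple estimator the parameter name is used directly.
reg = LinearRegression().set_params(positive=True)
print(reg.positive)
```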

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> LinearRegression

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns:
self : object

The updated object.

Gallery examples

Principal Component Regression vs Partial Least Squares Regression

Plot individual and voting regression predictions

Failure of Machine Learning to infer causal effects

Comparing Linear Bayesian Regressors

Logistic function

Non-negative least squares

Ordinary Least Squares and Ridge Regression

Quantile regression

Robust linear model estimation using RANSAC

Robust linear estimator fitting

Theil-Sen Regression

Isotonic Regression

Metadata Routing

Face completion with a multi-output estimators

Plotting Cross-Validated Predictions

Underfitting vs. Overfitting

Using KBinsDiscretizer to discretize continuous features