In statistics, the residual sum of squares (RSS), also known as the sum of squared residuals (SSR) or the sum of squared estimate of errors (SSE), is the sum of the squares of residuals (deviations of predicted values from actual empirical values of the data). It is a measure of the discrepancy between the data and an estimation model, such as a linear regression. A small RSS indicates a tight fit of the model to the data. It is used as an optimality criterion in parameter selection and model selection.
In a model with a single explanatory variable, RSS is given by:[1]

\operatorname{RSS} = \sum_{i=1}^{n} \left(y_i - f(x_i)\right)^2
where y_i is the ith value of the variable to be predicted, x_i is the ith value of the explanatory variable, and f(x_i) is the predicted value of y_i (also termed \hat{y}_i). In a standard linear simple regression model, y_i = \alpha + \beta x_i + \varepsilon_i, where \alpha and \beta are coefficients, y and x are the regressand and the regressor, respectively, and \varepsilon is the error term. The sum of squares of residuals is the sum of squares of \hat{\varepsilon}_i = y_i - \hat{\alpha} - \hat{\beta} x_i; that is

\operatorname{RSS} = \sum_{i=1}^{n} \hat{\varepsilon}_i^{\,2} = \sum_{i=1}^{n} \left(y_i - (\hat{\alpha} + \hat{\beta} x_i)\right)^2
where \hat{\alpha} is the estimated value of the constant term \alpha and \hat{\beta} is the estimated value of the slope coefficient \beta.
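To make the formula concrete, the following sketch fits a simple linear regression by ordinary least squares and evaluates the RSS; the data values and variable names are illustrative assumptions, not taken from this article.

```python
# Minimal sketch: OLS fit of y_i = alpha + beta * x_i + eps_i and its RSS.
# The data below are made up for illustration.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed-form OLS estimates of the slope and intercept
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

# Residuals eps_hat_i = y_i - (alpha_hat + beta_hat * x_i) and their sum of squares
residuals = y - (alpha_hat + beta_hat * x)
rss = np.sum(residuals ** 2)
print(f"alpha_hat={alpha_hat:.3f}, beta_hat={beta_hat:.3f}, RSS={rss:.4f}")
```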
Matrix expression for the OLS residual sum of squares
The general regression model with n observations and k explanators, the first of which is a constant unit vector whose coefficient is the regression intercept, is

y = X\beta + e
where y is an n × 1 vector of dependent variable observations, each column of the n × k matrix X is a vector of observations on one of the k explanators, \beta is a k × 1 vector of true coefficients, and e is an n × 1 vector of the true underlying errors. The ordinary least squares estimator for \beta is

\hat{\beta} = (X^{\mathsf{T}} X)^{-1} X^{\mathsf{T}} y
The residual vector is \hat{e} = y - X\hat{\beta} = y - X(X^{\mathsf{T}} X)^{-1} X^{\mathsf{T}} y; so the residual sum of squares is:

\operatorname{RSS} = \hat{e}^{\mathsf{T}} \hat{e} = \|\hat{e}\|^2
(equivalent to the square of the norm of residuals). In full:

\operatorname{RSS} = y^{\mathsf{T}} y - y^{\mathsf{T}} X(X^{\mathsf{T}} X)^{-1} X^{\mathsf{T}} y = y^{\mathsf{T}} \left[I - X(X^{\mathsf{T}} X)^{-1} X^{\mathsf{T}}\right] y = y^{\mathsf{T}} [I - H] y
where H is the hat matrix, or the projection matrix in linear regression.
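As a check on the matrix expressions, the sketch below forms the OLS estimator and the hat matrix H, and computes the RSS both as \hat{e}^{\mathsf{T}}\hat{e} and as y^{\mathsf{T}}(I - H)y; the simulated data and seed are assumptions for illustration only.

```python
# Minimal sketch of the matrix form: beta_hat = (X'X)^{-1} X'y,
# H = X (X'X)^{-1} X', and RSS = e_hat' e_hat = y' (I - H) y.
# The design matrix and response are simulated for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3                                   # n observations, k explanators
X = np.column_stack([np.ones(n),               # first column: constant unit vector
                     rng.normal(size=(n, k - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y                   # OLS estimator
H = X @ XtX_inv @ X.T                          # hat (projection) matrix

e_hat = y - X @ beta_hat                       # residual vector
rss_direct = e_hat @ e_hat                     # e_hat' e_hat
rss_projection = y @ (np.eye(n) - H) @ y       # y' (I - H) y

print(rss_direct, rss_projection)              # agree up to floating-point rounding
```

In practice one would solve for \hat{\beta} with np.linalg.lstsq or np.linalg.solve rather than inverting X^{\mathsf{T}}X explicitly; the explicit inverse is used here only to mirror the formula above.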
Relation with Pearson's product-moment correlation
[1] Archdeacon, Thomas J. (1994). Correlation and Regression Analysis: A Historian's Guide. University of Wisconsin Press. pp. 161–162. ISBN 0-299-13650-7. OCLC 27266095.
Draper, N.R.; Smith, H. (1998). Applied Regression Analysis (3rd ed.). John Wiley. ISBN 0-471-17082-8.