Coefficient of determination

From Wikipedia, the free encyclopedia

Indicator for how well data points fit a line or curve

Not to be confused with Coefficient of variation.

Ordinary least squares regression of Okun's law. Since the regression line does not miss any of the points by very much, the R² of the regression is relatively high.

In statistics, the coefficient of determination, denoted R² or r² and pronounced "R squared", is the proportion of the variation in the dependent variable that is predictable from the independent variable(s). It is a statistic used in the context of statistical models whose main purpose is either the prediction of future outcomes or the testing of hypotheses, on the basis of other related information. It provides a measure of how well observed outcomes are replicated by the model, based on the proportion of total variation of outcomes explained by the model.^[1]^[2]^[3]
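
The "proportion of variation explained" is commonly written as below. This is a sketch in standard notation; the symbols are assumptions of this note rather than part of the passage above: y_i are the observed outcomes, ŷ_i the model's predictions, and ȳ the mean of the observed outcomes.

```latex
R^2 = 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{tot}}}, \qquad
SS_{\mathrm{res}} = \sum_i (y_i - \hat{y}_i)^2, \qquad
SS_{\mathrm{tot}} = \sum_i (y_i - \bar{y})^2
```

A model whose predictions match every observation has SS_res = 0 and therefore R² = 1; a model that always predicts ȳ has SS_res = SS_tot and therefore R² = 0.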

There are several definitions of R² that are only sometimes equivalent. In simple linear regression (which includes an intercept), r² is simply the square of the sample correlation coefficient (r) between the observed outcomes and the observed predictor values.^[4] If additional regressors are included, R² is the square of the coefficient of multiple correlation. In both such cases, the coefficient of determination normally ranges from 0 to 1.
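
The equivalence for simple linear regression can be checked numerically. The following is a minimal sketch using only the Python standard library; the data values and all variable names are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical paired observations (illustrative values only).
x = [2.0, 4.0, 5.0, 7.0, 9.0, 10.0]
y = [3.1, 4.8, 6.2, 7.9, 10.5, 10.9]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
s_xx = sum((a - mean_x) ** 2 for a in x)
s_yy = sum((b - mean_y) ** 2 for b in y)
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

# Sample (Pearson) correlation coefficient between predictor and outcome.
r = s_xy / math.sqrt(s_xx * s_yy)

# Simple linear regression with an intercept, fitted by least squares.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
y_hat = [slope * a + intercept for a in x]

# Coefficient of determination from the residual and total sums of squares.
ss_res = sum((b - p) ** 2 for b, p in zip(y, y_hat))
r_squared = 1.0 - ss_res / s_yy

print(f"r^2 = {r ** 2:.6f}, R^2 = {r_squared:.6f}")
```

The two printed values agree up to floating-point rounding. The agreement depends on the line being a least-squares fit that includes an intercept.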

There are cases where R² can yield negative values. This can arise when the predictions that are being compared to the corresponding outcomes have not been derived from a model-fitting procedure using those data. Even if a model-fitting procedure has been used, R² may still be negative, for example when linear regression is conducted without including an intercept,^[5] or when a non-linear function is used to fit the data.^[6] In cases where negative values arise, the mean of the data provides a better fit to the outcomes than do the fitted function values, according to this particular criterion.
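
The no-intercept case can be illustrated with a small sketch. The data are hypothetical (a large offset with a weak downward trend), and the helper name `r_squared` is an assumption of this example, not a reference to any particular library.

```python
def r_squared(y_obs, y_pred):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken about the mean of y_obs."""
    mean_y = sum(y_obs) / len(y_obs)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_obs, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_obs)
    return 1.0 - ss_res / ss_tot

# Hypothetical data: values near 9.5 that drift slightly downward.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 9.5, 9.8, 9.1, 9.0]

# Least-squares line forced through the origin: slope = sum(x*y) / sum(x*x).
slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
through_origin = [slope * a for a in x]

# Baseline that always predicts the mean of y.
mean_only = [sum(y) / len(y)] * len(y)

r2_origin = r_squared(y, through_origin)
r2_mean = r_squared(y, mean_only)

print(f"through origin: {r2_origin:.2f}, mean only: {r2_mean:.2f}")
```

Here the line through the origin cannot reproduce the offset, so its residual sum of squares far exceeds the total sum of squares and R² is strongly negative, while the mean-only baseline scores 0.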

The coefficient of determination can be more intuitively informative than MAE, MAPE, MSE, and RMSE in regression analysis evaluation, as the former can be expressed as a percentage, whereas the latter measures have arbitrary ranges. It also proved more robust for poor fits compared to SMAPE on certain test datasets.^[7]

When evaluating the goodness-of-fit of simulated (Y_pred) versus measured (Y_obs) values, it is not appropriate to base this on the R² of the linear regression (i.e., Y_obs = m·Y_pred + b).^[citation needed] The R² quantifies the degree of any linear correlation between Y_obs and Y_pred, while for the goodness-of-fit evaluation only one specific linear correlation should be taken into consideration: Y_obs = 1·Y_pred + 0 (i.e., the 1:1 line).^[8]^[9]
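
The difference between the two evaluations can be sketched as follows. The simulated values are hypothetical and deliberately biased (Y_pred = 2·Y_obs + 3), and the helper names `r_squared` and `ols_fit` are assumptions of this example.

```python
def r_squared(y_obs, y_pred):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken about the mean of y_obs."""
    mean_y = sum(y_obs) / len(y_obs)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_obs, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_obs)
    return 1.0 - ss_res / ss_tot

def ols_fit(x, y):
    """Slope and intercept of the least-squares line y = m*x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    m = s_xy / s_xx
    return m, mean_y - m * mean_x

# Hypothetical measurements and a simulation that is perfectly
# correlated with them but biased: Y_pred = 2 * Y_obs + 3.
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [2.0 * v + 3.0 for v in y_obs]

# R^2 of the regression Y_obs = m * Y_pred + b (any linear relation counts).
m, b = ols_fit(y_pred, y_obs)
r2_regression = r_squared(y_obs, [m * p + b for p in y_pred])

# R^2 against the 1:1 line, i.e. using the simulated values as they are.
r2_one_to_one = r_squared(y_obs, y_pred)

print(f"regression R^2 = {r2_regression:.3f}, 1:1 R^2 = {r2_one_to_one:.3f}")
```

The regression absorbs the bias into m and b and reports an R² of essentially 1, whereas the comparison against the 1:1 line yields a negative value, which reflects how far the simulated values actually are from the measurements.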

x	1	2	3	4	5
y	1.9	3.7	5.8	8.0	9.6
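
The source gives no context for the values tabulated above. Assuming they are meant as sample data for a simple linear regression of y on x with an intercept (an assumption of this note, as are the symbols S_xx, S_xy, and the fitted coefficients), the arithmetic works out roughly as follows:

```latex
\bar{x} = 3, \qquad \bar{y} = 5.8, \qquad
S_{xx} = \sum_i (x_i - \bar{x})^2 = 10, \qquad
S_{xy} = \sum_i (x_i - \bar{x})(y_i - \bar{y}) = 19.7

\hat{\beta} = \frac{S_{xy}}{S_{xx}} = 1.97, \qquad
\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x} = -0.11

SS_{\mathrm{tot}} = \sum_i (y_i - \bar{y})^2 = 38.9, \qquad
SS_{\mathrm{res}} = \sum_i (y_i - \hat{\alpha} - \hat{\beta} x_i)^2 = 0.091

R^2 = 1 - \frac{0.091}{38.9} \approx 0.9977
```

Under that assumption, the fitted line accounts for about 99.8% of the variation in y.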

Definitions

Relation to unexplained variance

As explained variance

As squared correlation coefficient

Interpretation

In a multiple linear model

Inflation of R²

Caveats

Extensions

Adjusted R²

Coefficient of partial determination

Generalizing and decomposing R²

R² in logistic regression

Comparison with residual statistics

History

See also

Notes

Further reading
