
Mean squared prediction error

From Wikipedia, the free encyclopedia
Statistics concept

In statistics the mean squared prediction error (MSPE), also known as mean squared error of the predictions, of a smoothing, curve fitting, or regression procedure is the expected value of the squared prediction errors (PE), the squared differences between the fitted values implied by the predictive function $\widehat{g}$ and the values of the (unobservable) true value $g$. It is an inverse measure of the explanatory power of $\widehat{g}$ and can be used in the process of cross-validation of an estimated model. Knowledge of $g$ would be required in order to calculate the MSPE exactly; in practice, MSPE is estimated.[1]

Formulation


If the smoothing or fitting procedure has projection matrix (i.e., hat matrix) $L$, which maps the observed values vector $y$ to the predicted values vector $\hat{y} = Ly$, then PE and MSPE are formulated as:

$$\operatorname{PE}_i = g(x_i) - \widehat{g}(x_i),$$
$$\operatorname{MSPE} = \operatorname{E}\left[\operatorname{PE}_i^2\right] = \sum_{i=1}^{n} \operatorname{PE}_i^2 / n.$$

The MSPE can be decomposed into two terms: the squared bias (mean error) of the fitted values and the variance of the fitted values:

$$\operatorname{MSPE} = \operatorname{ME}^2 + \operatorname{VAR},$$
$$\operatorname{ME} = \operatorname{E}\left[\widehat{g}(x_i) - g(x_i)\right],$$
$$\operatorname{VAR} = \operatorname{E}\left[\left(\widehat{g}(x_i) - \operatorname{E}\left[\widehat{g}(x_i)\right]\right)^2\right].$$

The quantity $\operatorname{SSPE} = n \cdot \operatorname{MSPE}$ is called the sum squared prediction error. The root mean squared prediction error is the square root of MSPE: $\operatorname{RMSPE} = \sqrt{\operatorname{MSPE}}$.
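The decomposition above can be illustrated by simulation. The following is a minimal sketch, assuming a sine curve as the true $g$, Gaussian noise, and a straight-line least-squares fit as the smoother; none of these choices come from the article, and all variable names are illustrative. Because $g$ is known in a simulation, the MSPE can be computed directly and compared with the sum of squared bias and variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup (not from the article): sine curve as true g, Gaussian noise
n, sigma, reps = 50, 0.5, 2000
x = np.linspace(0.0, 1.0, n)
g = np.sin(2.0 * np.pi * x)  # true values g(x_i), known only because we simulate

# Hat matrix L of a straight-line least-squares fit, so that y_hat = L @ y
X = np.column_stack([np.ones(n), x])
L = X @ np.linalg.solve(X.T @ X, X.T)

# Each row of `fits` is one realisation of the fitted values g_hat(x_i)
Y = g + sigma * rng.standard_normal((reps, n))
fits = Y @ L.T

pe = g - fits                 # prediction errors PE_i
mspe = np.mean(pe ** 2)       # averaged over replications and design points

me = fits.mean(axis=0) - g    # mean error (bias) at each x_i
var = fits.var(axis=0)        # variance of the fitted values at each x_i

# Squared bias plus variance, averaged over the n design points
decomposed = np.mean(me ** 2) + np.mean(var)

sspe = n * mspe               # sum squared prediction error
rmspe = np.sqrt(mspe)         # root mean squared prediction error

print(f"MSPE          = {mspe:.6f}")
print(f"ME^2 + VAR    = {decomposed:.6f}")
print(f"SSPE, RMSPE   = {sspe:.6f}, {rmspe:.6f}")
```

With empirical moments taken over the same replications, the two printed quantities agree up to floating-point rounding, since the decomposition is an algebraic identity.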

Computation of MSPE over out-of-sample data

Further information: Cross-validation (statistics)

The mean squared prediction error can be computed exactly in two contexts. First, with a data sample of length $n$, the data analyst may run the regression over only $q$ of the data points (with $q < n$), holding back the other $n - q$ data points with the specific purpose of using them to compute the estimated model's MSPE out of sample (i.e., not using data that were used in the model estimation process). Since the regression process is tailored to the $q$ in-sample points, the in-sample MSPE will normally be smaller than the out-of-sample one computed over the $n - q$ held-back points. If the MSPE increases only slightly from in sample to out of sample, the model is viewed favorably. And if two models are to be compared, the one with the lower MSPE over the $n - q$ out-of-sample data points is viewed more favorably, regardless of the models' relative in-sample performances. The out-of-sample MSPE in this context is exact for the out-of-sample data points that it was computed over, but is merely an estimate of the model's MSPE for the mostly unobserved population from which the data were drawn.

Second, as time goes on more data may become available to the data analyst, and then the MSPE can be computed over these new data.
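The hold-out procedure described in this section can be sketched as follows. This is an illustrative example only: the synthetic data, the split sizes, and the choice of polynomial models fitted with `numpy.polyfit` are assumptions made for the sketch, not part of the article. Prediction errors here are measured against the observed held-back responses.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed synthetic data: a quadratic signal plus Gaussian noise
n, q = 100, 70
x = rng.uniform(-1.0, 1.0, n)
y = 1.0 + 2.0 * x - 1.5 * x ** 2 + 0.3 * rng.standard_normal(n)

# First q points are used for estimation, the other n - q are held back
x_in, y_in = x[:q], y[:q]
x_out, y_out = x[q:], y[q:]


def mspe(coefs, x_pts, y_pts):
    """Mean squared prediction error of a fitted polynomial on given points."""
    return float(np.mean((y_pts - np.polyval(coefs, x_pts)) ** 2))


# Compare polynomial models of increasing flexibility
results = {}
for degree in (1, 2, 8):
    coefs = np.polyfit(x_in, y_in, degree)  # fitted on in-sample data only
    results[degree] = (mspe(coefs, x_in, y_in), mspe(coefs, x_out, y_out))

for degree, (mspe_in, mspe_out) in results.items():
    print(f"degree {degree}: in-sample {mspe_in:.4f}, out-of-sample {mspe_out:.4f}")

# The model preferred by this criterion has the lowest out-of-sample MSPE
best_degree = min(results, key=lambda d: results[d][1])
print("preferred degree:", best_degree)
```

Because the polynomial models are nested, the in-sample MSPE cannot increase with the degree, which is why only the out-of-sample column is used for the comparison.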

Estimation of MSPE over the population

This article's factual accuracy is disputed. Relevant discussion may be found on the talk page. Please help to ensure that disputed statements are reliably sourced. (May 2018)

When the model has been estimated over all available data with none held back, the MSPE of the model over the entire population of mostly unobserved data can be estimated as follows.

For the model $y_i = g(x_i) + \sigma \varepsilon_i$ where $\varepsilon_i \sim \mathcal{N}(0, 1)$, one may write

$$n \cdot \operatorname{MSPE}(L) = g^{\text{T}}(I - L)^{\text{T}}(I - L)g + \sigma^2 \operatorname{tr}\left[L^{\text{T}}L\right].$$

Using in-sample data values, the first term on the right side is equivalent to

$$\sum_{i=1}^{n}\left(\operatorname{E}\left[g(x_i) - \widehat{g}(x_i)\right]\right)^2 = \operatorname{E}\left[\sum_{i=1}^{n}\left(y_i - \widehat{g}(x_i)\right)^2\right] - \sigma^2 \operatorname{tr}\left[(I - L)^{\text{T}}(I - L)\right].$$

Expanding the trace as $\operatorname{tr}\left[(I - L)^{\text{T}}(I - L)\right] = n - 2\operatorname{tr}[L] + \operatorname{tr}\left[L^{\text{T}}L\right]$, the $\sigma^2 \operatorname{tr}\left[L^{\text{T}}L\right]$ terms cancel. Thus,

$$n \cdot \operatorname{MSPE}(L) = \operatorname{E}\left[\sum_{i=1}^{n}\left(y_i - \widehat{g}(x_i)\right)^2\right] - \sigma^2\left(n - 2\operatorname{tr}[L]\right).$$

If $\sigma^2$ is known or well-estimated by $\widehat{\sigma}^2$, it becomes possible to estimate MSPE by

$$n \cdot \widehat{\operatorname{MSPE}}(L) = \sum_{i=1}^{n}\left(y_i - \widehat{g}(x_i)\right)^2 - \widehat{\sigma}^2\left(n - 2\operatorname{tr}[L]\right).$$
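The two closed-form expectations used in this derivation, the expression for $n \cdot \operatorname{MSPE}(L)$ and the one for the expected residual sum of squares, can be checked by simulation. The sketch below assumes a sine curve as the true $g$ and a quadratic least-squares fit as the linear smoother; both are illustrative choices, as are all variable names.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed setup (not from the article)
n, sigma, reps = 40, 0.7, 20000
x = np.linspace(0.0, 1.0, n)
g = np.sin(2.0 * np.pi * x)  # true values, known only in simulation

# Hat matrix L of a quadratic least-squares fit
X = np.column_stack([np.ones(n), x, x ** 2])
L = X @ np.linalg.solve(X.T @ X, X.T)
R = np.eye(n) - L            # residual-forming matrix I - L

# Closed-form quantities from the derivation
bias_term = g @ R.T @ R @ g                                   # g^T (I-L)^T (I-L) g
n_mspe_exact = bias_term + sigma ** 2 * np.trace(L.T @ L)     # n * MSPE(L)
rss_exact = bias_term + sigma ** 2 * np.trace(R.T @ R)        # E[sum (y_i - g_hat(x_i))^2]

# Monte Carlo counterparts under y = g + sigma * eps, eps ~ N(0, 1)
Y = g + sigma * rng.standard_normal((reps, n))
fits = Y @ L.T
n_mspe_mc = np.mean(np.sum((g - fits) ** 2, axis=1))
rss_mc = np.mean(np.sum((Y - fits) ** 2, axis=1))

print(f"n*MSPE : exact {n_mspe_exact:.4f}, simulated {n_mspe_mc:.4f}")
print(f"E[RSS] : exact {rss_exact:.4f}, simulated {rss_mc:.4f}")
```

The simulated averages should match the closed forms up to Monte Carlo error, which shrinks as `reps` grows.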

Colin Mallows advocated this method in the construction of his model selection statistic $C_p$, which is a normalized version of the estimated MSPE:

$$C_p = \frac{\sum_{i=1}^{n}\left(y_i - \widehat{g}(x_i)\right)^2}{\widehat{\sigma}^2} - n + 2p,$$

where $p$ is the number of estimated parameters and $\widehat{\sigma}^2$ is computed from the version of the model that includes all possible regressors. This completes the derivation.
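As an illustration of the $C_p$ formula, the following sketch computes the statistic for a few nested ordinary least-squares models. The synthetic data, the candidate subsets, and the helper names `rss` and `mallows_cp` are hypothetical. It also assumes the common choice $\widehat{\sigma}^2 = \mathrm{RSS}_{\text{full}} / (n - p_{\text{full}})$ for the full-model variance estimate, which the article does not specify.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed synthetic data: intercept plus 4 candidate regressors,
# of which only the first two actually enter the true model
n = 80
X_full = np.column_stack([np.ones(n), rng.standard_normal((n, 4))])
beta = np.array([1.0, 2.0, -1.0, 0.0, 0.0])
y = X_full @ beta + 0.5 * rng.standard_normal(n)


def rss(X, y):
    """Residual sum of squares of an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


# Variance estimate from the model with all regressors (assumed estimator)
p_full = X_full.shape[1]
sigma2_hat = rss(X_full, y) / (n - p_full)


def mallows_cp(columns):
    """C_p = RSS / sigma2_hat - n + 2p for the submodel using `columns`."""
    X = X_full[:, columns]
    p = X.shape[1]
    return rss(X, y) / sigma2_hat - n + 2 * p


candidates = {
    "intercept only": [0],
    "x1": [0, 1],
    "x1, x2": [0, 1, 2],
    "full": [0, 1, 2, 3, 4],
}
cp = {name: mallows_cp(cols) for name, cols in candidates.items()}

for name, value in cp.items():
    print(f"{name:15s} C_p = {value:.3f}")
```

With this variance estimate, $C_p$ for the full model equals its parameter count exactly, so the statistic is informative only for comparing the smaller submodels against that baseline.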


References

  1. Pindyck, Robert S.; Rubinfeld, Daniel L. (1991). "Forecasting with Time-Series Models". Econometric Models & Economic Forecasts (3rd ed.). New York: McGraw-Hill. pp. 516–535. ISBN 0-07-050098-3.