Degrees of freedom (statistics)

From Wikipedia, the free encyclopedia
Number of values in the final calculation of a statistic that are free to vary
For other uses, see Degrees of freedom.

In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.[1]

Estimates of statistical parameters can be based upon different amounts of information or data. The number of independent pieces of information that go into the estimate of a parameter is called the degrees of freedom. In general, the degrees of freedom of an estimate of a parameter are equal to the number of independent scores that go into the estimate minus the number of parameters used as intermediate steps in the estimation of the parameter itself. For example, if the variance is to be estimated from a random sample of $N$ independent scores, then the degrees of freedom is equal to the number of independent scores ($N$) minus the number of parameters estimated as intermediate steps (one, namely, the sample mean) and is therefore equal to $N-1$.[2]
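A minimal sketch of this $N-1$ correction in NumPy (the data values are purely illustrative); the same correction is exposed through the "delta degrees of freedom" argument `ddof`:

```python
import numpy as np

x = np.array([4.0, 7.0, 6.0, 5.0, 8.0])  # illustrative sample of N = 5 scores
N = x.size

# Estimating the mean uses up one degree of freedom, leaving N - 1 for the variance.
sample_mean = x.mean()
sample_var = np.sum((x - sample_mean) ** 2) / (N - 1)

# NumPy applies the same correction when ddof=1 is requested.
assert np.isclose(sample_var, x.var(ddof=1))
```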

Mathematically, degrees of freedom is the number of dimensions of the domain of a random vector, or essentially the number of "free" components (how many components need to be known before the vector is fully determined).

The term is most often used in the context of linear models (linear regression, analysis of variance), where certain random vectors are constrained to lie in linear subspaces, and the number of degrees of freedom is the dimension of the subspace. The degrees of freedom are also commonly associated with the squared lengths (or "sum of squares" of the coordinates) of such vectors, and the parameters of chi-squared and other distributions that arise in associated statistical testing problems.

While introductory textbooks may introduce degrees of freedom as distribution parameters or through hypothesis testing, it is the underlying geometry that defines degrees of freedom, and is critical to a proper understanding of the concept.

History


Although the basic concept of degrees of freedom was recognized as early as 1821 in the work of German astronomer and mathematician Carl Friedrich Gauss,[3] its modern definition and usage were first elaborated by English statistician William Sealy Gosset in his 1908 Biometrika article "The Probable Error of a Mean", published under the pen name "Student".[4] While Gosset did not actually use the term 'degrees of freedom', he explained the concept in the course of developing what became known as Student's t-distribution. The term itself was popularized by English statistician and biologist Ronald Fisher, beginning with his 1922 work on chi squares.[5]

Notation


In equations, the typical symbol for degrees of freedom is ν (lowercase Greek letter nu). In text and tables, the abbreviation "d.f." is commonly used. R. A. Fisher used n to symbolize degrees of freedom, but modern usage typically reserves n for sample size. When reporting the results of statistical tests, the degrees of freedom are typically noted beside the test statistic, either as a subscript or in parentheses.[6]

Of random vectors


Geometrically, the degrees of freedom can be interpreted as the dimension of certain vector subspaces. As a starting point, suppose that we have a sample of independent normally distributed observations,

$$X_1, \dots, X_n.$$

This can be represented as an $n$-dimensional random vector:

$$\begin{pmatrix} X_1 \\ \vdots \\ X_n \end{pmatrix}.$$

Since this random vector can lie anywhere in $n$-dimensional space, it has $n$ degrees of freedom.

Now, let $\bar{X}$ be the sample mean. The random vector can be decomposed as the sum of the sample mean plus a vector of residuals:

$$\begin{pmatrix} X_1 \\ \vdots \\ X_n \end{pmatrix} = \bar{X} \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} + \begin{pmatrix} X_1 - \bar{X} \\ \vdots \\ X_n - \bar{X} \end{pmatrix}.$$

The first vector on the right-hand side is constrained to be a multiple of the vector of 1's, and the only free quantity is $\bar{X}$. It therefore has 1 degree of freedom.

The second vector is constrained by the relation $\sum_{i=1}^{n}(X_i - \bar{X}) = 0$. The first $n-1$ components of this vector can be anything. However, once the first $n-1$ components are known, the $n$th component is determined. Therefore, this vector has $n-1$ degrees of freedom.

Mathematically, the first vector is the orthogonal, or least-squares, projection of the data vector onto the subspace spanned by the vector of 1's. The 1 degree of freedom is the dimension of this subspace. The second residual vector is the least-squares projection onto the $(n-1)$-dimensional orthogonal complement of this subspace, and has $n-1$ degrees of freedom.
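A small numerical sketch of this decomposition (the observations are illustrative):

```python
import numpy as np

x = np.array([2.0, 5.0, 4.0, 9.0])           # n = 4 observations
n = x.size
xbar = x.mean()

mean_part = xbar * np.ones(n)                 # multiple of the vector of 1's: 1 degree of freedom
residual_part = x - mean_part                 # lies in the (n - 1)-dimensional orthogonal complement

assert np.allclose(mean_part + residual_part, x)
assert np.isclose(residual_part.sum(), 0.0)         # the single linear constraint on the residuals
assert np.isclose(mean_part @ residual_part, 0.0)   # the two components are orthogonal
```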

In statistical testing applications, often one is not directly interested in the component vectors, but rather in their squared lengths. In the example above, the residual sum-of-squares is

$$\sum_{i=1}^{n} (X_i - \bar{X})^2 = \begin{Vmatrix} X_1 - \bar{X} \\ \vdots \\ X_n - \bar{X} \end{Vmatrix}^2.$$

If the data points $X_i$ are normally distributed with mean 0 and variance $\sigma^2$, then the residual sum of squares has a scaled chi-squared distribution (scaled by the factor $\sigma^2$), with $n-1$ degrees of freedom. The degrees of freedom, here a parameter of the distribution, can still be interpreted as the dimension of an underlying vector subspace.

Likewise, the one-sample t-test statistic,

$$\frac{\sqrt{n}\,(\bar{X} - \mu_0)}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 / (n-1)}}$$

follows a Student's t distribution with $n-1$ degrees of freedom when the hypothesized mean $\mu_0$ is correct. Again, the degrees of freedom arise from the residual vector in the denominator.
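A short sketch checking this statistic against a library implementation (SciPy's `ttest_1samp`; the data and hypothesized mean are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=12)   # n = 12 observations
mu0 = 5.0                                      # hypothesized mean
n = x.size

t_manual = np.sqrt(n) * (x.mean() - mu0) / np.sqrt(np.sum((x - x.mean()) ** 2) / (n - 1))
t_scipy, p = stats.ttest_1samp(x, popmean=mu0)

assert np.isclose(t_manual, t_scipy)
# The p-value is computed from a Student's t distribution with n - 1 = 11 degrees of freedom.
```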

In structural equation models


When the results of structural equation models (SEM) are presented, they generally include one or more indices of overall model fit, the most common of which is a χ² statistic. This forms the basis for other indices that are commonly reported. Although it is these other statistics that are most commonly interpreted, the degrees of freedom of the χ² are essential to understanding model fit as well as the nature of the model itself.

Degrees of freedom in SEM are computed as a difference between the number of unique pieces of information that are used as input into the analysis, sometimes called knowns, and the number of parameters that are uniquely estimated, sometimes called unknowns. For example, in a one-factor confirmatory factor analysis with 4 items, there are 10 knowns (the six unique covariances among the four items and the four item variances) and 8 unknowns (4 factor loadings and 4 error variances), for 2 degrees of freedom. Degrees of freedom are important to the understanding of model fit if for no other reason than that, all else being equal, the fewer degrees of freedom, the better indices such as χ² will be.
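A minimal sketch of this counting rule for the example above (only the arithmetic, not an SEM fit; the parameter counts follow the one-factor model described in the text):

```python
# Knowns: unique variances and covariances among p observed items, p*(p+1)/2.
# Unknowns: one loading and one error variance per item in a one-factor model.
p = 4                          # items in the confirmatory factor analysis
knowns = p * (p + 1) // 2      # 10 = 6 unique covariances + 4 variances
unknowns = p + p               # 4 factor loadings + 4 error variances
df = knowns - unknowns         # 2 degrees of freedom
print(df)
```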

It has been shown that degrees of freedom can be used by readers of papers that contain SEMs to determine if the authors of those papers are in fact reporting the correct model fit statistics. In the organizational sciences, for example, nearly half of papers published in top journals report degrees of freedom that are inconsistent with the models described in those papers, leaving the reader to wonder which models were actually tested.[7]

Of residuals

Further information: Residuals (statistics)

A common way to think of degrees of freedom is as the number of independent pieces of information available to estimate another piece of information. More concretely, the number of degrees of freedom is the number of independent observations in a sample of data that are available to estimate a parameter of the population from which that sample is drawn. For example, if we have two observations, when calculating the mean we have two independent observations; however, when calculating the variance, we have only one independent observation, since the two observations are equally distant from the sample mean.

In fitting statistical models to data, the vectors of residuals are constrained to lie in a space of smaller dimension than the number of components in the vector. That smaller dimension is the number of degrees of freedom for error, also called residual degrees of freedom.

Example


Perhaps the simplest example is this. Suppose

$$X_1, \dots, X_n$$

are random variables, each with expected value (weighted average) $\mu$, and let

$$\overline{X}_n = \frac{X_1 + \cdots + X_n}{n}$$

be the "sample mean." Then the quantities

$$X_i - \overline{X}_n$$

are residuals that may be considered estimates of the errors $X_i - \mu$. The sum of the residuals (unlike the sum of the errors) is necessarily 0. If one knows the values of any $n-1$ of the residuals, one can thus find the last one. That means they are constrained to lie in a space of dimension $n-1$. One says that there are $n-1$ degrees of freedom for errors.

An example which is only slightly less simple is that of least squares estimation of $a$ and $b$ in the model

$$Y_i = a + b x_i + e_i \quad \text{for } i = 1, \dots, n$$

where $x_i$ is given, but $e_i$ and hence $Y_i$ are random. Let $\widehat{a}$ and $\widehat{b}$ be the least-squares estimates of $a$ and $b$. Then the residuals

$$\widehat{e}_i = y_i - (\widehat{a} + \widehat{b} x_i)$$

are constrained to lie within the space defined by the two equations

$$\widehat{e}_1 + \cdots + \widehat{e}_n = 0,$$
$$x_1 \widehat{e}_1 + \cdots + x_n \widehat{e}_n = 0.$$

One says that there are $n-2$ degrees of freedom for error.

Notationally, the capital letter $Y$ is used in specifying the model, while lower-case $y$ is used in the definition of the residuals; that is because the former are hypothesized random variables and the latter are actual data.

We can generalise this to multiple regression involving $p$ parameters and covariates (e.g. $p-1$ predictors and one mean, i.e. the intercept in the regression), in which case the cost in degrees of freedom of the fit is $p$, leaving $n-p$ degrees of freedom for errors.
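A minimal sketch of the two residual constraints for simple linear regression, fitted with NumPy's `lstsq` (the data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
x = np.linspace(0.0, 9.0, n)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=n)

X = np.column_stack([np.ones(n), x])           # design matrix: intercept and one predictor, p = 2
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares estimates (a_hat, b_hat)
resid = y - X @ beta

# The residuals satisfy two linear constraints, leaving n - 2 degrees of freedom for error.
assert np.isclose(resid.sum(), 0.0)
assert np.isclose(x @ resid, 0.0)
```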

In linear models


The demonstration of the t and chi-squared distributions for one-sample problems above is the simplest example where degrees of freedom arise. However, similar geometry and vector decompositions underlie much of the theory of linear models, including linear regression and analysis of variance. An explicit example based on comparison of three means is presented here; the geometry of linear models is discussed in more complete detail by Christensen (2002).[8]

Suppose independent observations are made for three populations, $X_1, \ldots, X_n$, $Y_1, \ldots, Y_n$ and $Z_1, \ldots, Z_n$. The restriction to three groups and equal sample sizes simplifies notation, but the ideas are easily generalized.

The observations can be decomposed as

$$\begin{aligned} X_i &= \bar{M} + (\bar{X} - \bar{M}) + (X_i - \bar{X}) \\ Y_i &= \bar{M} + (\bar{Y} - \bar{M}) + (Y_i - \bar{Y}) \\ Z_i &= \bar{M} + (\bar{Z} - \bar{M}) + (Z_i - \bar{Z}) \end{aligned}$$

where $\bar{X}, \bar{Y}, \bar{Z}$ are the means of the individual samples, and $\bar{M} = (\bar{X} + \bar{Y} + \bar{Z})/3$ is the mean of all $3n$ observations. In vector notation this decomposition can be written as

$$\begin{pmatrix} X_1 \\ \vdots \\ X_n \\ Y_1 \\ \vdots \\ Y_n \\ Z_1 \\ \vdots \\ Z_n \end{pmatrix} = \bar{M} \begin{pmatrix} 1 \\ \vdots \\ 1 \\ 1 \\ \vdots \\ 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix} + \begin{pmatrix} \bar{X} - \bar{M} \\ \vdots \\ \bar{X} - \bar{M} \\ \bar{Y} - \bar{M} \\ \vdots \\ \bar{Y} - \bar{M} \\ \bar{Z} - \bar{M} \\ \vdots \\ \bar{Z} - \bar{M} \end{pmatrix} + \begin{pmatrix} X_1 - \bar{X} \\ \vdots \\ X_n - \bar{X} \\ Y_1 - \bar{Y} \\ \vdots \\ Y_n - \bar{Y} \\ Z_1 - \bar{Z} \\ \vdots \\ Z_n - \bar{Z} \end{pmatrix}.$$

The observation vector, on the left-hand side, has $3n$ degrees of freedom. On the right-hand side, the first vector has one degree of freedom (or dimension) for the overall mean. The second vector depends on three random variables, $\bar{X} - \bar{M}$, $\bar{Y} - \bar{M}$ and $\bar{Z} - \bar{M}$. However, these must sum to 0 and so are constrained; the vector therefore must lie in a 2-dimensional subspace, and has 2 degrees of freedom. The remaining $3n - 3$ degrees of freedom are in the residual vector (made up of $n-1$ degrees of freedom within each of the populations).

In analysis of variance (ANOVA)


In statistical testing problems, one usually is not interested in the component vectors themselves, but rather in their squared lengths, or Sum of Squares. The degrees of freedom associated with a sum-of-squares is the degrees-of-freedom of the corresponding component vectors.

The three-population example above is an example of one-way analysis of variance. The model, or treatment, sum-of-squares is the squared length of the second vector,

$$\text{SST} = n(\bar{X} - \bar{M})^2 + n(\bar{Y} - \bar{M})^2 + n(\bar{Z} - \bar{M})^2$$

with 2 degrees of freedom. The residual, or error, sum-of-squares is

$$\text{SSE} = \sum_{i=1}^{n} \left[ (X_i - \bar{X})^2 + (Y_i - \bar{Y})^2 + (Z_i - \bar{Z})^2 \right]$$

with 3(n−1) degrees of freedom. Of course, introductory books on ANOVA usually state formulae without showing the vectors, but it is this underlying geometry that gives rise to SS formulae, and shows how to unambiguously determine the degrees of freedom in any given situation.

Under the null hypothesis of no difference between population means (and assuming that standard ANOVA regularity assumptions are satisfied) the sums of squares have scaled chi-squared distributions, with the corresponding degrees of freedom. The F-test statistic is the ratio, after scaling by the degrees of freedom. If there is no difference between population means this ratio follows an F-distribution with 2 and $3n - 3$ degrees of freedom.
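A short sketch building the F statistic from these sums of squares and checking it against SciPy's one-way ANOVA (the data are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 8
x, y, z = (rng.normal(loc=10.0, scale=1.5, size=n) for _ in range(3))

grand_mean = np.mean(np.concatenate([x, y, z]))
sst = n * sum((g.mean() - grand_mean) ** 2 for g in (x, y, z))   # treatment SS, 2 degrees of freedom
sse = sum(np.sum((g - g.mean()) ** 2) for g in (x, y, z))        # error SS, 3(n - 1) degrees of freedom

f_manual = (sst / 2) / (sse / (3 * (n - 1)))
f_scipy, p = stats.f_oneway(x, y, z)
assert np.isclose(f_manual, f_scipy)   # F is referred to an F-distribution with 2 and 3n - 3 df
```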

In some complicated settings, such as unbalanced split-plot designs, the sums-of-squares no longer have scaled chi-squared distributions. Comparison of sum-of-squares with degrees-of-freedom is no longer meaningful, and software may report certain fractional 'degrees of freedom' in these cases. Such numbers have no genuine degrees-of-freedom interpretation, but are simply providing an approximate chi-squared distribution for the corresponding sum-of-squares. The details of such approximations are beyond the scope of this page.

In probability distributions


Several commonly encountered statistical distributions (Student's t, chi-squared, F) have parameters that are commonly referred to as degrees of freedom. This terminology simply reflects that in many applications where these distributions occur, the parameter corresponds to the degrees of freedom of an underlying random vector, as in the preceding ANOVA example. Another simple example is: if $X_i,\; i = 1, \ldots, n$ are independent normal $(\mu, \sigma^2)$ random variables, the statistic

$$\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2}$$

follows a chi-squared distribution with $n-1$ degrees of freedom. Here, the degrees of freedom arise from the residual sum-of-squares in the numerator, and in turn the $n-1$ degrees of freedom of the underlying residual vector $\{X_i - \bar{X}\}$.
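A Monte Carlo sketch of this fact (sample size, mean and variance are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, sigma = 6, 2.0
reps = 200_000

samples = rng.normal(loc=1.0, scale=sigma, size=(reps, n))
rss = np.sum((samples - samples.mean(axis=1, keepdims=True)) ** 2, axis=1)

# RSS / sigma^2 should behave like a chi-squared variable with n - 1 = 5 degrees of freedom.
print(np.mean(rss / sigma**2), stats.chi2(df=n - 1).mean())   # both close to 5
print(np.var(rss / sigma**2), stats.chi2(df=n - 1).var())     # both close to 10
```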

In the application of these distributions to linear models, the degrees of freedom parameters can take only integer values. The underlying families of distributions allow fractional values for the degrees-of-freedom parameters, which can arise in more sophisticated uses. One set of examples is problems where chi-squared approximations based on effective degrees of freedom are used. In other applications, such as modelling heavy-tailed data, a t or F-distribution may be used as an empirical model. In these cases, there is no particular degrees of freedom interpretation to the distribution parameters, even though the terminology may continue to be used.

In non-standard regression


Many non-standard regression methods, including regularized least squares (e.g., ridge regression), linear smoothers, smoothing splines, and semiparametric regression, are not based on ordinary least squares projections, but rather on regularized (generalized and/or penalized) least-squares, and so degrees of freedom defined in terms of dimensionality is generally not useful for these procedures. However, these procedures are still linear in the observations, and the fitted values of the regression can be expressed in the form

$$\hat{y} = Hy,$$

where $\hat{y}$ is the vector of fitted values at each of the original covariate values from the fitted model, $y$ is the original vector of responses, and $H$ is the hat matrix or, more generally, smoother matrix.

For statistical inference, sums-of-squares can still be formed: the model sum-of-squares is $\|Hy\|^2$; the residual sum-of-squares is $\|y - Hy\|^2$. However, because $H$ does not correspond to an ordinary least-squares fit (i.e. is not an orthogonal projection), these sums-of-squares no longer have (scaled, non-central) chi-squared distributions, and dimensionally defined degrees-of-freedom are not useful.

The effective degrees of freedom of the fit can be defined in various ways to implement goodness-of-fit tests, cross-validation, and other statistical inference procedures. Here one can distinguish between regression effective degrees of freedom and residual effective degrees of freedom.

Regression effective degrees of freedom


For the regression effective degrees of freedom, appropriate definitions can include the trace of the hat matrix,[9] $\operatorname{tr}(H)$, the trace of the quadratic form of the hat matrix, $\operatorname{tr}(H'H)$, the form $\operatorname{tr}(2H - HH')$, or the Satterthwaite approximation, $\operatorname{tr}(H'H)^2 / \operatorname{tr}(H'HH'H)$.[10] In the case of linear regression, the hat matrix $H$ is $X(X'X)^{-1}X'$, and all these definitions reduce to the usual degrees of freedom. Notice that

$$\operatorname{tr}(H) = \sum_i h_{ii} = \sum_i \frac{\partial \hat{y}_i}{\partial y_i},$$

the regression (not residual) degrees of freedom in linear models are "the sum of the sensitivities of the fitted values with respect to the observed response values",[11] i.e. the sum of leverage scores.
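A minimal sketch of these trace definitions for a ridge-type smoother matrix (the design matrix and penalty value are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, lam = 30, 4, 2.0
X = rng.normal(size=(n, p))

# Ordinary least squares: H is an orthogonal projection, so tr(H) equals p exactly.
H_ols = X @ np.linalg.solve(X.T @ X, X.T)
print(np.trace(H_ols))                                   # 4.0

# Ridge regression: H is a shrunken smoother, so the effective degrees of freedom fall below p.
H_ridge = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
print(np.trace(H_ridge))                                 # somewhere between 0 and 4
print(np.trace(2 * H_ridge - H_ridge @ H_ridge.T))       # the alternative tr(2H - HH') definition
```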

One way to help conceptualize this is to consider a simple smoothing matrix like a Gaussian blur, used to mitigate data noise. In contrast to a simple linear or polynomial fit, computing the effective degrees of freedom of the smoothing function is not straightforward. In these cases, it is important to estimate the degrees of freedom permitted by the $H$ matrix so that the residual degrees of freedom can then be used in statistical tests such as $\chi^2$.

Residual effective degrees of freedom


There are corresponding definitions of residual effective degrees-of-freedom (redf), with $H$ replaced by $I - H$. For example, if the goal is to estimate error variance, the redf would be defined as $\operatorname{tr}((I - H)'(I - H))$, and the unbiased estimate is (with $\hat{r} = y - Hy$),

$$\hat{\sigma}^2 = \frac{\|\hat{r}\|^2}{\operatorname{tr}\left((I-H)'(I-H)\right)},$$

or:[12][13][14][15]

$$\hat{\sigma}^2 = \frac{\|\hat{r}\|^2}{n - \operatorname{tr}(2H - HH')} = \frac{\|\hat{r}\|^2}{n - 2\operatorname{tr}(H) + \operatorname{tr}(HH')}$$
$$\hat{\sigma}^2 \approx \frac{\|\hat{r}\|^2}{n - 1.25\operatorname{tr}(H) + 0.5}.$$

The last approximation above[13] reduces the computational cost from $O(n^2)$ to only $O(n)$. In general the numerator would be the objective function being minimized; e.g., if the hat matrix includes an observation covariance matrix, $\Sigma$, then $\|\hat{r}\|^2$ becomes $\hat{r}'\Sigma^{-1}\hat{r}$.
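A sketch comparing these residual-effective-degrees-of-freedom variance estimates on a ridge-type smoother (the model, penalty, and noise level are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, lam, sigma = 50, 3, 1.0, 0.5
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=sigma, size=n)

H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)    # ridge smoother matrix
r = y - H @ y                                               # residual vector

I = np.eye(n)
var_exact = (r @ r) / np.trace((I - H).T @ (I - H))         # tr((I-H)'(I-H)) denominator
var_trace = (r @ r) / (n - 2 * np.trace(H) + np.trace(H @ H.T))
var_cheap = (r @ r) / (n - 1.25 * np.trace(H) + 0.5)        # O(n) approximation from the text

print(var_exact, var_trace, var_cheap)                      # all close to sigma**2 = 0.25
```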

General


Note that unlike in the original case, non-integer degrees of freedom are allowed, though the value must usually still be constrained between 0 and $n$.[16]

Consider, as an example, the k-nearest neighbour smoother, which is the average of the k nearest measured values to a given point. Then, at each of the n measured points, the weight of the original value on the linear combination that makes up the predicted value is just 1/k. Thus, the trace of the hat matrix is n/k, so the smooth costs n/k effective degrees of freedom.
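A small sketch building the k-nearest-neighbour smoother matrix and checking its trace (the measurement locations are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 20, 4
x = np.sort(rng.uniform(0.0, 10.0, size=n))        # measurement locations

# H[i, j] = 1/k if x_j is among the k nearest neighbours of x_i (each point is its own neighbour).
H = np.zeros((n, n))
for i in range(n):
    nearest = np.argsort(np.abs(x - x[i]))[:k]
    H[i, nearest] = 1.0 / k

# Every diagonal entry is 1/k, so tr(H) = n/k effective degrees of freedom.
print(np.trace(H), n / k)                           # both 5.0
```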

As another example, consider the existence of nearly duplicated observations. Naive application of the classical formula, $n - p$, would lead to over-estimation of the residual degrees of freedom, as if each observation were independent. More realistically, though, the hat matrix $H = X(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}$ would involve an observation covariance matrix $\Sigma$ indicating the non-zero correlation among observations.

The more general formulation of effective degrees of freedom would result in a more realistic estimate for, e.g., the error variance $\sigma^2$, which in turn scales the unknown parameters' a posteriori standard deviation; the degrees of freedom will also affect the expansion factor necessary to produce an error ellipse for a given confidence level.

Other formulations


Similar concepts are the equivalent degrees of freedom in non-parametric regression,[17] the degree of freedom of signal in atmospheric studies,[18][19] and the non-integer degree of freedom in geodesy.[20][21]

The residual sum-of-squares $\|y - Hy\|^2$ has a generalized chi-squared distribution, and the theory associated with this distribution[22] provides an alternative route to the answers provided above.[further explanation needed]

References

  1. ^"Degrees of Freedom".Glossary of Statistical Terms. Animated Software. Retrieved2008-08-21.
  2. ^Lane, David M."Degrees of Freedom".HyperStat Online. Statistics Solutions. Retrieved2008-08-21.
  3. ^Walker, H. M. (April 1940)."Degrees of Freedom"(PDF).Journal of Educational Psychology.31 (4):253–269.doi:10.1037/h0054588.
  4. ^Student (March 1908)."The Probable Error of a Mean".Biometrika.6 (1):1–25.doi:10.2307/2331554.JSTOR 2331554.
  5. ^Fisher, R. A. (January 1922)."On the Interpretation of χ2 from Contingency Tables, and the Calculation of P".Journal of the Royal Statistical Society.85 (1):87–94.doi:10.2307/2340521.JSTOR 2340521.
  6. ^Cichoń, Mariusz (2020-06-01)."Reporting statistical methods and outcome of statistical analyses in research articles".Pharmacological Reports.72 (3):481–485.doi:10.1007/s43440-020-00110-5.ISSN 2299-5684.PMID 32542585.
  7. ^Cortina, J. M., Green, J. P., Keeler, K. R., & Vandenberg, R. J. (2017). Degrees of freedom in SEM: Are we testing the models that we claim to test?. Organizational Research Methods, 20(3), 350-378.
  8. ^Christensen, Ronald (2002).Plane Answers to Complex Questions: The Theory of Linear Models (Third ed.). New York: Springer.ISBN 0-387-95361-2.
  9. ^Trevor Hastie,Robert Tibshirani, Jerome H. Friedman (2009),The elements of statistical learning: data mining, inference, and prediction, 2nd ed., 746 p.ISBN 978-0-387-84857-0,doi:10.1007/978-0-387-84858-7,[1] (eq.(5.16))
  10. ^Fox, J. (2000).Nonparametric Simple Regression: Smoothing Scatterplots. Quantitative Applications in the Social Sciences. Vol. 130. SAGE Publications. p. 58.ISBN 978-0-7619-1585-0. Retrieved2020-08-28.
  11. ^Ye, J. (1998), "On Measuring and Correcting the Effects of Data Mining and Model Selection",Journal of the American Statistical Association, 93 (441), 120–131.JSTOR 2669609 (eq.(7))
  12. ^Catherine Loader (1999),Local regression and likelihood,ISBN 978-0-387-98775-0,doi:10.1007/b98858, (eq.(2.18), p. 30)
  13. ^abTrevor Hastie, Robert Tibshirani (1990),Generalized additive models, CRC Press, (p. 54) and (eq.(B.1), p. 305))
  14. ^Simon N. Wood (2006),Generalized additive models: an introduction with R, CRC Press, (eq.(4,14), p. 172)
  15. ^David Ruppert, M. P. Wand, R. J. Carroll (2003),Semiparametric Regression, Cambridge University Press (eq.(3.28), p. 82)
  16. ^James S. Hodges (2014),Richly Parameterized Linear Models, CRC Press.[2]
  17. ^Peter J. Green, B. W. Silverman (1994),Nonparametric regression and generalized linear models: a roughness penalty approach, CRC Press (eq.(3.15), p. 37)
  18. ^Clive D. Rodgers (2000),Inverse methods for atmospheric sounding: theory and practice, World Scientific (eq.(2.56), p. 31)
  19. ^Adrian Doicu, Thomas Trautmann, Franz Schreier (2010),Numerical Regularization for Atmospheric Inverse Problems, Springer (eq.(4.26), p. 114)
  20. ^D. Dong, T. A. Herring and R. W. King (1997), Estimating regional deformation from a combination of space and terrestrial geodetic data,J. Geodesy, 72 (4), 200–214,doi:10.1007/s001900050161 (eq.(27), p. 205)
  21. ^H. Theil (1963), "On the Use of Incomplete Prior Information in Regression Analysis",Journal of the American Statistical Association, 58 (302), 401–414JSTOR 2283275 (eq.(5.19)–(5.20))
  22. ^Jones, D.A. (1983) "Statistical analysis of empirical models fitted by optimisation",Biometrika, 70 (1), 67–88
