Unbiased estimation of standard deviation

From Wikipedia, the free encyclopedia
Procedure to estimate standard deviation from a sample

In statistics and in particular statistical theory, unbiased estimation of a standard deviation is the calculation from a statistical sample of an estimated value of the standard deviation (a measure of statistical dispersion) of a population of values, in such a way that the expected value of the calculation equals the true value. Except in some important situations, outlined later, the task has little relevance to applications of statistics, since its need is avoided by standard procedures such as the use of significance tests and confidence intervals, or by using Bayesian analysis.

However, for statistical theory, it provides an exemplar problem in the context of estimation theory which is both simple to state and for which results cannot be obtained in closed form. It also provides an example where imposing the requirement for unbiased estimation might be seen as just adding inconvenience, with no real benefit.

Motivation


In statistics, the standard deviation of a population of numbers is often estimated from a random sample drawn from the population. This is the sample standard deviation, which is defined by

$$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \overline{x})^2}{n-1}},$$

where $\{x_1, x_2, \ldots, x_n\}$ is the sample (formally, realizations from a random variable X) and $\overline{x}$ is the sample mean.

One way of seeing that this is a biased estimator of the standard deviation of the population is to start from the result that $s^2$ is an unbiased estimator for the variance $\sigma^2$ of the underlying population, if that variance exists and the sample values are drawn independently with replacement. The square root is a nonlinear function, and only linear functions commute with taking the expectation. Since the square root is a strictly concave function, it follows from Jensen's inequality that the square root of the sample variance is an underestimate.

The use of n − 1 instead of n in the formula for the sample variance is known as Bessel's correction, which corrects the bias in the estimation of the population variance, and some, but not all, of the bias in the estimation of the population standard deviation.

It is not possible to find an estimate of the standard deviation which is unbiased for all population distributions, as the bias depends on the particular distribution. Much of the following relates to estimation assuming a normal distribution.
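The bias described above can be seen exactly in a tiny discrete example. The following sketch (plain Python, standard library only) enumerates every size-2 sample drawn with replacement from the two-point population {0, 1}: the average of $s^2$ over all samples recovers $\sigma^2$ exactly, while the average of $s$ falls short of $\sigma$, as Jensen's inequality predicts.

```python
import itertools
import math

# Population {0, 1}: mu = 0.5, sigma^2 = 0.25, sigma = 0.5.
population = [0.0, 1.0]
mu = sum(population) / len(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

# Enumerate all 4 equally likely size-2 samples drawn with replacement.
s2_values, s_values = [], []
for sample in itertools.product(population, repeat=2):
    m = sum(sample) / 2
    s2 = sum((x - m) ** 2 for x in sample) / (2 - 1)  # Bessel's correction
    s2_values.append(s2)
    s_values.append(math.sqrt(s2))

E_s2 = sum(s2_values) / len(s2_values)  # equals sigma^2 = 0.25: unbiased
E_s = sum(s_values) / len(s_values)     # less than sigma = 0.5: biased low
```

Here $\operatorname{E}[s^2]$ comes out to exactly 0.25, but $\operatorname{E}[s] = \sqrt{0.5}/2 \approx 0.354$, well below $\sigma = 0.5$.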

Bias correction


Results for the normal distribution

Figure: correction factor $c_4(n)$ versus sample size $n$.

When the random variable is normally distributed, a minor correction exists to eliminate the bias. To derive the correction, note that for normally distributed X, Cochran's theorem implies that $(n-1)s^2/\sigma^2$ has a chi-squared distribution with $n-1$ degrees of freedom, and thus its square root, $\sqrt{n-1}\,s/\sigma$, has a chi distribution with $n-1$ degrees of freedom. Consequently, calculating the expectation of this last expression and rearranging constants,

$$\operatorname{E}[s] = c_4(n)\,\sigma$$

where the correction factor $c_4(n)$ is the scale mean of the chi distribution with $n-1$ degrees of freedom, $\mu_1/\sqrt{n-1}$ (the mean $\mu_1$ of that chi distribution divided by $\sqrt{n-1}$). It depends on the sample size n, and is given as follows:[1]

$$c_4(n) = \sqrt{\frac{2}{n-1}}\,\frac{\Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{n-1}{2}\right)} = 1 - \frac{1}{4n} - \frac{7}{32n^2} - \frac{19}{128n^3} + O(n^{-4})$$

where Γ(·) is the gamma function. An unbiased estimator of σ can be obtained by dividing $s$ by $c_4(n)$. As $n$ grows large, $c_4(n)$ approaches 1, and even for smaller values of $n$ the correction is minor. The figure shows a plot of $c_4(n)$ versus sample size. The table below gives numerical values of $c_4(n)$ and algebraic expressions for some values of $n$; more complete tables may be found in most textbooks[2][3] on statistical quality control.

| Sample size $n$ | Expression for $c_4(n)$ | Numerical value |
|---|---|---|
| 2 | $\sqrt{\frac{2}{\pi}}$ | 0.7978845608 |
| 3 | $\frac{\sqrt{\pi}}{2}$ | 0.8862269255 |
| 4 | $2\sqrt{\frac{2}{3\pi}}$ | 0.9213177319 |
| 5 | $\frac{3}{4}\sqrt{\frac{\pi}{2}}$ | 0.9399856030 |
| 6 | $\frac{8}{3}\sqrt{\frac{2}{5\pi}}$ | 0.9515328619 |
| 7 | $\frac{5\sqrt{3\pi}}{16}$ | 0.9593687891 |
| 8 | $\frac{16}{5}\sqrt{\frac{2}{7\pi}}$ | 0.9650304561 |
| 9 | $\frac{35\sqrt{\pi}}{64}$ | 0.9693106998 |
| 10 | $\frac{128}{105}\sqrt{\frac{2}{\pi}}$ | 0.9726592741 |
| 100 | | 0.9974779761 |
| 1000 | | 0.9997497811 |
| 10000 | | 0.9999749978 |
| $2k$ | $\sqrt{\frac{2}{\pi(2k-1)}}\,\frac{2^{2k-2}(k-1)!^{2}}{(2k-2)!}$ | |
| $2k+1$ | $\sqrt{\frac{\pi}{k}}\,\frac{(2k-1)!}{2^{2k-1}(k-1)!^{2}}$ | |
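The closed form above is easy to evaluate directly. A short sketch (Python standard library only; the function name is illustrative) uses the log-gamma function so the ratio of gamma functions stays numerically stable even for large $n$:

```python
import math

def c4(n):
    # c4(n) = sqrt(2 / (n - 1)) * Gamma(n / 2) / Gamma((n - 1) / 2).
    # The gamma ratio is computed via lgamma to avoid overflow at large n.
    return math.sqrt(2.0 / (n - 1)) * math.exp(
        math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0)
    )
```

For example, `c4(2)` recovers $\sqrt{2/\pi} \approx 0.7979$ and `c4(10000)` $\approx 0.99997$, matching the tabulated values.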

It is important to keep in mind that this correction only produces an unbiased estimator for normally and independently distributed X. When this condition is satisfied, another result about s involving $c_4(n)$ is that the standard error of s is[4][5] $\sigma\sqrt{1-c_4^2}$, while the standard error of the unbiased estimator is $\sigma\sqrt{c_4^{-2}-1}$.

Rule of thumb for the normal distribution


If calculation of the function $c_4(n)$ appears too difficult, there is a simple rule of thumb:[6] take the estimator

$$\hat{\sigma} = \sqrt{\frac{1}{n-1.5}\sum_{i=1}^{n}(x_i - \overline{x})^2}.$$

The formula differs from the familiar expression for $s^2$ only by having n − 1.5 instead of n − 1 in the denominator. This expression is only approximate; in fact,

$$\operatorname{E}[\hat{\sigma}] = \sigma\cdot\left(1 + \frac{1}{16n^2} + \frac{3}{16n^3} + O(n^{-4})\right).$$

The bias is relatively small: for $n=3$, for example, it equals 2.3%, and for $n=9$ it is already down to 0.1%.
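As a sketch (the function name is illustrative, not from the source), the rule of thumb amounts to a one-character change relative to the usual sample standard deviation:

```python
import math

def sigma_hat_rule_of_thumb(xs):
    # Near-unbiased estimate of sigma for normal data: divide the
    # sum of squared deviations by n - 1.5 instead of n - 1.
    n = len(xs)
    m = sum(xs) / n
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1.5))
```

For the sample [1, 2, 3] this gives $\sqrt{2/1.5} \approx 1.155$, slightly above the ordinary $s = 1$.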

Other distributions


In cases where statistically independent data are modelled by a parametric family of distributions other than the normal distribution, the population standard deviation will, if it exists, be a function of the parameters of the model. One general approach to estimation would be maximum likelihood. Alternatively, it may be possible to use the Rao–Blackwell theorem as a route to finding a good estimate of the standard deviation. In neither case would the estimates obtained usually be unbiased. Notionally, theoretical adjustments might be obtainable to lead to unbiased estimates but, unlike those for the normal distribution, these would typically depend on the estimated parameters.

If the requirement is simply to reduce the bias of an estimated standard deviation, rather than to eliminate it entirely, then two practical approaches are available, both within the context of resampling. These are jackknifing and bootstrapping. Both can be applied either to parametrically based estimates of the standard deviation or to the sample standard deviation.

For non-normal distributions an approximate (up toO(n−1) terms) formula for the unbiased estimator of the standard deviation is

$$\hat{\sigma} = \sqrt{\frac{1}{n - 1.5 - \tfrac{1}{4}\gamma_2}\sum_{i=1}^{n}\left(x_i - \overline{x}\right)^2},$$

where $\gamma_2$ denotes the population excess kurtosis. The excess kurtosis may be either known beforehand for certain distributions, or estimated from the data.
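The kurtosis-adjusted formula above can be sketched as follows (illustrative function name; `gamma2` is the population excess kurtosis, assumed known or separately estimated):

```python
import math

def sigma_hat_excess_kurtosis(xs, gamma2):
    # Approximate (to O(1/n)) unbiased estimate of sigma for non-normal
    # data: the denominator is n - 1.5 - gamma2 / 4, where gamma2 is the
    # population excess kurtosis (gamma2 = 0 for a normal population).
    n = len(xs)
    m = sum(xs) / n
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1.5 - 0.25 * gamma2))
```

With `gamma2 = 0` this reduces to the normal-distribution rule of thumb; heavier-tailed populations (positive excess kurtosis) shrink the denominator and enlarge the estimate.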

Effect of autocorrelation (serial correlation)


The material above, to stress the point again, applies only to independent data. However, real-world data often does not meet this requirement; it is autocorrelated (also known as serial correlation). As one example, the successive readings of a measurement instrument that incorporates some form of “smoothing” (more correctly, low-pass filtering) process will be autocorrelated, since any particular value is calculated from some combination of the earlier and later readings.

Estimates of the variance, and standard deviation, of autocorrelated data will be biased. The expected value of the sample variance is[7]

$$\operatorname{E}\left[s^2\right] = \sigma^2\left[1 - \frac{2}{n-1}\sum_{k=1}^{n-1}\left(1 - \frac{k}{n}\right)\rho_k\right]$$

where n is the sample size (number of measurements) and $\rho_k$ is the autocorrelation function (ACF) of the data. (Note that the expression in the brackets is simply one minus the average expected autocorrelation for the readings.) If the ACF consists of positive values then the estimate of the variance (and its square root, the standard deviation) will be biased low. That is, the actual variability of the data will be greater than that indicated by an uncorrected variance or standard deviation calculation. It is essential to recognize that, if this expression is to be used to correct for the bias, by dividing the estimate $s^2$ by the quantity in brackets above, then the ACF must be known analytically, not via estimation from the data. This is because the estimated ACF will itself be biased.[8]
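The bracketed factor is straightforward to compute when the ACF is known analytically. A sketch (illustrative function name; `rho` is any callable returning $\rho_k$ for lag k):

```python
def variance_bias_factor(n, rho):
    # 1 - (2 / (n - 1)) * sum_{k=1}^{n-1} (1 - k/n) * rho(k):
    # multiplies sigma^2 to give E[s^2] for autocorrelated data.
    return 1.0 - (2.0 / (n - 1)) * sum(
        (1.0 - k / n) * rho(k) for k in range(1, n)
    )
```

Independent data ($\rho_k = 0$) give a factor of 1 (no bias), while perfectly correlated data ($\rho_k = 1$) give a factor of 0: every reading is identical, so the sample variance collapses.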

Example of bias in standard deviation


To illustrate the magnitude of the bias in the standard deviation, consider a dataset that consists of sequential readings from an instrument that uses a specific digital filter whose ACF is known to be given by

$$\rho_k = (1-\alpha)^k$$

where α is the parameter of the filter, and it takes values from zero to unity. Thus the ACF is positive and geometrically decreasing.

Bias in standard deviation for autocorrelated data.

The figure shows the ratio of the estimated standard deviation to its known value (which can be calculated analytically for this digital filter), for several settings of α as a function of sample size n. Changing α alters the variance reduction ratio of the filter, which is known to be

$$\mathrm{VRR} = \frac{\alpha}{2-\alpha}$$

so that smaller values of α result in more variance reduction, or “smoothing.” The bias is indicated by values on the vertical axis different from unity; that is, if there were no bias, the ratio of the estimated to known standard deviation would be unity. Clearly, for modest sample sizes there can be significant bias (a factor of two, or more).
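For this geometric ACF, the expected low bias in the standard deviation can be sketched directly (illustrative function name; this takes the square root of the bracketed variance-bias factor given earlier, so it ignores the small additional transformation bias discussed later):

```python
def std_bias_ratio(n, alpha):
    # Approximate ratio E[s] / sigma for rho_k = (1 - alpha)**k.
    r = 1.0 - alpha
    g1 = 1.0 - (2.0 / (n - 1)) * sum(
        (1.0 - k / n) * r ** k for k in range(1, n)
    )
    return g1 ** 0.5
```

With α = 1 (no smoothing, ρk = 0) the ratio is 1; heavier smoothing (smaller α) pushes the ratio well below 1 at modest n, reproducing the qualitative behavior shown in the figure.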

Variance of the mean


It is often of interest to estimate the variance or standard deviation of an estimated mean rather than the variance of a population. When the data are autocorrelated, this has a direct effect on the theoretical variance of the sample mean, which is[9]

$$\operatorname{Var}\left[\overline{x}\right] = \frac{\sigma^2}{n}\left[1 + 2\sum_{k=1}^{n-1}\left(1 - \frac{k}{n}\right)\rho_k\right].$$

The variance of the sample mean can then be estimated by substituting an estimate of $\sigma^2$. One such estimate can be obtained from the equation for $\operatorname{E}[s^2]$ given above. First define the following constants, assuming, again, a known ACF:

$$\gamma_1 := 1 - \frac{2}{n-1}\sum_{k=1}^{n-1}\left(1-\frac{k}{n}\right)\rho_k = \frac{n-\gamma_2}{n-1}$$

$$\gamma_2 := 1 + 2\sum_{k=1}^{n-1}\left(1-\frac{k}{n}\right)\rho_k = n-(n-1)\gamma_1$$

so that

$$\operatorname{E}\left[s^2\right] = \sigma^2\gamma_1 \quad\Rightarrow\quad \operatorname{E}\left[\frac{s^2}{\gamma_1}\right] = \sigma^2$$
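Both constants share the same weighted ACF sum, so they are cheap to compute together. A sketch (illustrative function name) that also exposes the identity $\gamma_2 = n - (n-1)\gamma_1$:

```python
def gamma_factors(n, rho):
    # gamma1 and gamma2 for a known ACF rho(k). Both are built from the
    # weighted sum S = sum_{k=1}^{n-1} (1 - k/n) * rho(k), and they
    # satisfy gamma2 = n - (n - 1) * gamma1.
    s = sum((1.0 - k / n) * rho(k) for k in range(1, n))
    gamma1 = 1.0 - 2.0 * s / (n - 1)
    gamma2 = 1.0 + 2.0 * s
    return gamma1, gamma2
```

For independent data both factors are exactly 1, recovering the familiar results.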

This says that the expected value of the quantity obtained by dividing the observed sample variance by the correction factor $\gamma_1$ gives an unbiased estimate of the variance. Similarly, re-writing the expression above for the variance of the mean,

$$\operatorname{Var}\left[\overline{x}\right] = \frac{\sigma^2}{n}\gamma_2$$

and substituting the estimate for $\sigma^2$ gives[10]

$$\operatorname{Var}\left[\overline{x}\right] = \operatorname{E}\left[\frac{s^2}{\gamma_1}\cdot\frac{\gamma_2}{n}\right] = \operatorname{E}\left[\frac{s^2}{n}\cdot\frac{n-1}{\frac{n}{\gamma_2}-1}\right]$$

which is an unbiased estimator of the variance of the mean in terms of the observed sample variance and known quantities. If the autocorrelations $\rho_k$ are identically zero, this expression reduces to the well-known result for the variance of the mean for independent data. The effect of the expectation operator in these expressions is that the equality holds in the mean (i.e., on average).
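Putting the pieces together, the estimate of $\operatorname{Var}[\overline{x}]$ from an observed $s^2$ and a known ACF can be sketched as (illustrative names):

```python
def var_of_mean_estimate(s2, n, rho):
    # s^2 / gamma1 estimates sigma^2 without bias; multiplying by
    # gamma2 / n then estimates Var[x-bar] for autocorrelated data.
    s = sum((1.0 - k / n) * rho(k) for k in range(1, n))
    gamma1 = 1.0 - 2.0 * s / (n - 1)
    gamma2 = 1.0 + 2.0 * s
    return (s2 / gamma1) * (gamma2 / n)
```

With $\rho_k \equiv 0$ this reduces to the familiar $s^2/n$; a positive ACF both inflates the corrected $\sigma^2$ estimate and inflates the variance of the mean, so the result exceeds $s^2/n$.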

Estimating the standard deviation of the population


Having the expressions above involving the variance of the population, and of an estimate of the mean of that population, it would seem logical to simply take the square root of these expressions to obtain unbiased estimates of the respective standard deviations. However, because the square root does not commute with the expectation (the same Jensen's-inequality issue noted earlier),

$$\operatorname{E}[s] \neq \sqrt{\operatorname{E}\left[s^2\right]} = \sigma\sqrt{\gamma_1}.$$

Instead, assume a function $\theta$ exists such that an unbiased estimator of the standard deviation can be written

$$\operatorname{E}[s] = \sigma\,\theta\sqrt{\gamma_1} \quad\Rightarrow\quad \hat{\sigma} = \frac{s}{\theta\sqrt{\gamma_1}}$$

and $\theta$ depends on the sample size n and the ACF. In the case of NID (normally and independently distributed) data, the radicand is unity and $\theta$ is just the $c_4$ function given in the first section above. As with $c_4$, $\theta$ approaches unity as the sample size increases (as does $\gamma_1$).

It can be demonstrated via simulation modeling that ignoring $\theta$ (that is, taking it to be unity) and using

$$\operatorname{E}[s] \approx \sigma\sqrt{\gamma_1} \quad\Rightarrow\quad \hat{\sigma} \approx \frac{s}{\sqrt{\gamma_1}}$$

removes all but a few percent of the bias caused by autocorrelation, making this a reduced-bias estimator, rather than an unbiased estimator. In practical measurement situations, this reduction in bias can be significant, and useful, even if some relatively small bias remains. The figure above, showing an example of the bias in the standard deviation vs. sample size, is based on this approximation; the actual bias would be somewhat larger than indicated in those graphs since the transformation bias $\theta$ is not included there.
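The reduced-bias estimator is a one-liner once $\gamma_1$ is available. A sketch (illustrative function name; `rho` is the known analytic ACF):

```python
def sigma_hat_reduced_bias(s, n, rho):
    # Reduced-bias estimate of sigma for autocorrelated data:
    # divide the observed s by sqrt(gamma1), ignoring theta.
    gamma1 = 1.0 - (2.0 / (n - 1)) * sum(
        (1.0 - k / n) * rho(k) for k in range(1, n)
    )
    return s / gamma1 ** 0.5
```

For independent data ($\gamma_1 = 1$) the estimator returns s unchanged; a positive ACF makes $\gamma_1 < 1$ and scales s up to compensate for the low bias.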

Estimating the standard deviation of the sample mean


The unbiased variance of the mean in terms of the population variance and the ACF is given by

$$\operatorname{Var}\left[\overline{x}\right] = \frac{\sigma^2}{n}\gamma_2$$

and since there are no expected values here, in this case the square root can be taken, so that

$$\sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\gamma_2}$$

Using the unbiased estimate expression above for $\sigma$, an estimate of the standard deviation of the mean will then be

$$\hat{\sigma}_{\overline{x}} = \frac{s}{\theta\sqrt{n}}\,\frac{\sqrt{\gamma_2}}{\sqrt{\gamma_1}}$$

If the data are NID, so that the ACF vanishes, this reduces to

$$\hat{\sigma}_{\overline{x}} = \frac{s}{c_4\sqrt{n}}$$

In the presence of a nonzero ACF, ignoring the function $\theta$ as before leads to the reduced-bias estimator

$$\hat{\sigma}_{\overline{x}} \approx \frac{s}{\sqrt{n}}\,\frac{\sqrt{\gamma_2}}{\sqrt{\gamma_1}} = \frac{s}{\sqrt{n}}\sqrt{\frac{n-1}{\frac{n}{\gamma_2}-1}}$$

which again can be demonstrated to remove a useful majority of the bias.
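This final estimator can be sketched the same way as the others (illustrative function name; `rho` is the known analytic ACF):

```python
def sigma_xbar_reduced_bias(s, n, rho):
    # Reduced-bias estimate of the standard deviation of the sample
    # mean: (s / sqrt(n)) * sqrt(gamma2 / gamma1), ignoring theta.
    t = sum((1.0 - k / n) * rho(k) for k in range(1, n))
    gamma1 = 1.0 - 2.0 * t / (n - 1)
    gamma2 = 1.0 + 2.0 * t
    return (s / n ** 0.5) * (gamma2 / gamma1) ** 0.5
```

With a vanishing ACF this is the familiar $s/\sqrt{n}$; a positive ACF enlarges $\gamma_2$ and shrinks $\gamma_1$, so the estimated standard deviation of the mean grows, reflecting the reduced information content of correlated readings.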

References

1. Ben W. Bolch, "More on unbiased estimation of the standard deviation", The American Statistician, 22(3), p. 27 (1968).
2. Duncan, Acheson J. (1974). Quality Control and Industrial Statistics. Irwin. p. 968.
3. Committee E-11 on Statistical Control (2002). Manual on Presentation of Data and Control Chart Analysis. ASTM Manual Series, Vol. MNL 7. ASTM International. p. 67. ISBN 0-8031-1289-0.
4. Duncan, A. J., Quality Control and Industrial Statistics, 4th ed., Irwin (1974), ISBN 0-256-01558-9, p. 139.
5. N. L. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distributions, Volume 1, 2nd edition, Wiley and Sons, 1994. ISBN 0-471-58495-9. Chapter 13, Section 8.2.
6. Richard M. Brugger, "A Note on Unbiased Estimation of the Standard Deviation", The American Statistician, 23(4), p. 32 (1969).
7. Law and Kelton, Simulation Modeling and Analysis, 2nd ed., McGraw-Hill (1991), p. 284, ISBN 0-07-036698-5. This expression can be derived from its original source in Anderson, The Statistical Analysis of Time Series, Wiley (1971), ISBN 0-471-04745-7, p. 448, Equation 51.
8. Law and Kelton, p. 286. This bias is quantified in Anderson, p. 448, Equations 52–54.
9. Law and Kelton, p. 285. This equation can be derived from Theorem 8.2.3 of Anderson. It also appears in Box, Jenkins, Reinsel, Time Series Analysis: Forecasting and Control, 4th ed., Wiley (2008), ISBN 978-0-470-27284-8, p. 31.
10. Law and Kelton, p. 285.
- Douglas C. Montgomery and George C. Runger, Applied Statistics and Probability for Engineers, 3rd edition, Wiley and Sons, 2003.

External links


This article incorporates public domain material from the National Institute of Standards and Technology.
