In statistics, efficiency is a measure of quality of an estimator, of an experimental design,[1] or of a hypothesis testing procedure.[2] Essentially, a more efficient estimator needs fewer input data or observations than a less efficient one to achieve the Cramér–Rao bound. An efficient estimator is characterized by having the smallest possible variance, indicating that there is a small deviance between the estimated value and the "true" value in the L2 norm sense.[1]
The relative efficiency of two procedures is the ratio of their efficiencies, although often this concept is used where the comparison is made between a given procedure and a notional "best possible" procedure. The efficiencies and the relative efficiency of two procedures theoretically depend on the sample size available for the given procedure, but it is often possible to use the asymptotic relative efficiency (defined as the limit of the relative efficiencies as the sample size grows) as the principal comparison measure.
The efficiency of an unbiased estimator, T, of a parameter θ is defined as[3]

e(T) = \frac{1/\mathcal{I}(\theta)}{\operatorname{var}(T)}

where \mathcal{I}(\theta) is the Fisher information of the sample. Thus e(T) is the minimum possible variance for an unbiased estimator divided by its actual variance. The Cramér–Rao bound can be used to prove that e(T) ≤ 1.
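As a simple illustration of this definition (a standard textbook-style calculation, included here only for illustration and not drawn from the cited sources), consider estimating the mean of n i.i.d. normal observations using only the first of them:

```latex
% Efficiency of the unbiased but wasteful estimator T = X_1 for the mean of
% n i.i.d. observations X_1, ..., X_n ~ N(theta, sigma^2):
\mathcal{I}(\theta) = \frac{n}{\sigma^2}, \qquad
\operatorname{var}(T) = \sigma^2, \qquad
e(T) = \frac{1/\mathcal{I}(\theta)}{\operatorname{var}(T)}
     = \frac{\sigma^2/n}{\sigma^2} = \frac{1}{n}.
% T discards n - 1 observations and its efficiency decays as 1/n; the sample
% mean, by contrast, attains e = 1 (see the normal-distribution example below).
```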
An efficient estimator is an estimator that estimates the quantity of interest in some “best possible” manner. The notion of “best possible” relies upon the choice of a particular loss function — the function which quantifies the relative degree of undesirability of estimation errors of different magnitudes. The most common choice of the loss function is quadratic, resulting in the mean squared error criterion of optimality.[4]
In general, the spread of an estimator around the parameter θ is a measure of estimator efficiency and performance. This performance can be calculated by finding the mean squared error. More formally, let T be an estimator for the parameter θ. The mean squared error of T is the value MSE(T) = E[(T − θ)²], which can be decomposed as a sum of its variance and the square of its bias:

\operatorname{MSE}(T) = \operatorname{E}\big[(T - \theta)^2\big] = \operatorname{var}(T) + \big(\operatorname{E}[T] - \theta\big)^2
An estimator T1 performs better than an estimator T2 if MSE(T1) < MSE(T2).[5] For a more specific case, if T1 and T2 are two unbiased estimators for the same parameter θ, then the variance can be compared to determine performance. In this case, T2 is more efficient than T1 if the variance of T2 is smaller than the variance of T1, i.e. var(T2) < var(T1) for all values of θ. This relationship can be determined by simplifying the more general case above for mean squared error; since the expected value of an unbiased estimator is equal to the parameter value, E[T] = θ. Therefore, for an unbiased estimator, MSE(T) = var(T), as the (E[T] − θ)² term drops out for being equal to 0.[5]
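The decomposition can be checked numerically. Below is a minimal Python sketch (not from the source; the shrinkage factor 0.9 is an arbitrary illustrative choice) that estimates the MSE, variance, and squared bias of two estimators of a normal mean by simulation:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 5.0          # true parameter (population mean)
n, reps = 20, 100_000

# Draw many samples and apply two estimators of theta:
# T1 = sample mean (unbiased), T2 = shrunken mean (biased, lower variance).
samples = rng.normal(theta, 2.0, size=(reps, n))
t1 = samples.mean(axis=1)
t2 = 0.9 * t1        # a deliberately biased estimator, for illustration

for name, t in [("T1 (sample mean)", t1), ("T2 (shrunken mean)", t2)]:
    mse = np.mean((t - theta) ** 2)
    var = t.var()
    bias2 = (t.mean() - theta) ** 2
    print(f"{name}: MSE = {mse:.4f}, var + bias^2 = {var + bias2:.4f}")
```

Up to Monte Carlo error, the two printed quantities agree for each estimator, illustrating MSE = var + bias².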
If an unbiased estimator of a parameter θ attains e(T) = 1 for all values of the parameter, then the estimator is called efficient.[3]
Equivalently, the estimator achieves equality in the Cramér–Rao inequality for all θ. The Cramér–Rao lower bound is a lower bound of the variance of an unbiased estimator, representing the "best" an unbiased estimator can be.
An efficient estimator is also the minimum variance unbiased estimator (MVUE). This is because an efficient estimator maintains equality on the Cramér–Rao inequality for all parameter values, which means it attains the minimum variance for all parameters (the definition of the MVUE). The MVUE, even if it exists, is not necessarily efficient, because "minimum" does not mean equality holds on the Cramér–Rao inequality.
Thus an efficient estimator need not exist, but if it does, it is the MVUE.
Suppose { Pθ | θ ∈ Θ } is a parametric model and X = (X1, …, Xn) are the data sampled from this model. Let T = T(X) be an estimator for the parameter θ. If this estimator is unbiased (that is, E[T] = θ), then the Cramér–Rao inequality states the variance of this estimator is bounded from below:

\operatorname{var}[T] \ \ge\ \mathcal{I}(\theta)^{-1}
where \mathcal{I}(\theta) is the Fisher information matrix of the model at point θ. Generally, the variance measures the degree of dispersion of a random variable around its mean. Thus estimators with small variances are more concentrated: they estimate the parameters more precisely. We say that the estimator is a finite-sample efficient estimator (in the class of unbiased estimators) if it reaches the lower bound in the Cramér–Rao inequality above, for all θ ∈ Θ. Efficient estimators are always minimum variance unbiased estimators. However, the converse is false: there exist point-estimation problems for which the minimum-variance mean-unbiased estimator is inefficient.[6]
Historically, finite-sample efficiency was an early optimality criterion. However, this criterion has some limitations; notably, finite-sample efficient estimators exist only for a narrow class of parametric families.
As an example, among the models encountered in practice, efficient estimators exist for: the mean μ of the normal distribution (but not the variance σ²), parameter λ of the Poisson distribution, the probability p in the binomial or multinomial distribution.
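As a worked instance of the Poisson case (a standard calculation, included here only for illustration):

```latex
% Fisher information and the Cramér–Rao bound for the Poisson mean.
% For X_1, \dots, X_n \overset{iid}{\sim} \mathrm{Poisson}(\lambda):
\log L(\lambda) = \sum_{i=1}^{n} \left( x_i \log\lambda - \lambda - \log x_i! \right),
\qquad
\mathcal{I}(\lambda) = -\operatorname{E}\!\left[\frac{\partial^2 \log L}{\partial \lambda^2}\right] = \frac{n}{\lambda}.
% The sample mean \bar{X} is unbiased with
% \operatorname{var}(\bar{X}) = \lambda/n = 1/\mathcal{I}(\lambda),
% so e(\bar{X}) = 1: the sample mean is an efficient estimator of \lambda.
```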
Consider the model of a normal distribution with unknown mean but known variance: { Pθ = N(θ, σ²) | θ ∈ R }. The data consists of n independent and identically distributed observations from this model: X = (x1, …, xn). We estimate the parameter θ using the sample mean of all observations:

T(X) = \frac{1}{n} \sum_{i=1}^{n} x_i
This estimator has mean θ and variance σ²/n, which is equal to the reciprocal of the Fisher information from the sample. Thus, the sample mean is a finite-sample efficient estimator for the mean of the normal distribution.
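A short simulation sketch (illustrative, not from the source; the parameter values are arbitrary) confirming that the Monte Carlo variance of the sample mean matches the bound 1/\mathcal{I}(\theta) = σ²/n:

```python
import numpy as np

rng = np.random.default_rng(1)
theta, sigma, n, reps = 0.0, 1.5, 25, 200_000

# Monte Carlo variance of the sample mean vs. the Cramér–Rao bound.
means = rng.normal(theta, sigma, size=(reps, n)).mean(axis=1)
crlb = sigma**2 / n            # 1 / I(theta) for this model
print(f"var(sample mean) ~ {means.var():.5f}, CRLB = {crlb:.5f}")
# The two numbers agree up to Monte Carlo error: efficiency e = 1.
```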
Asymptotic efficiency requires consistency, an asymptotically normal distribution of the estimator, and an asymptotic variance-covariance matrix no worse than that of any other estimator.[9]
Consider a sample of size N drawn from a normal distribution of mean μ and unit variance, i.e., X_n \sim \mathcal{N}(\mu, 1).
The sample mean, \overline{X}, of the sample X_1, …, X_N is defined as

\overline{X} = \frac{1}{N} \sum_{n=1}^{N} X_n
The variance of the mean, 1/N (the square of the standard error), is equal to the reciprocal of the Fisher information from the sample and thus, by the Cramér–Rao inequality, the sample mean is efficient in the sense that its efficiency is unity (100%).
Now consider the sample median, \widetilde{X}. This is an unbiased and consistent estimator for μ. For large N the sample median is approximately normally distributed with mean μ and variance[10]

\operatorname{var}\big(\widetilde{X}\big) = \frac{\pi}{2N}
The efficiency of the median for large N is thus

e\big(\widetilde{X}\big) = \frac{1/N}{\pi/(2N)} = \frac{2}{\pi} \approx 0.64
In other words, the relative variance of the median will be π/2 ≈ 1.57, or 57% greater than the variance of the mean – the standard error of the median will be 25% greater than that of the mean.[11]
Note that this is the asymptotic efficiency — that is, the efficiency in the limit as sample size N tends to infinity. For finite values of N the efficiency is higher than this (for example, a sample size of 3 gives an efficiency of about 74%).[citation needed]
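These figures can be reproduced by simulation. A minimal sketch (illustrative; the replication counts and sample sizes are arbitrary choices) comparing the variances of the sample mean and the sample median under normal data:

```python
import numpy as np

rng = np.random.default_rng(2)
reps = 20_000  # Monte Carlo replications

# Efficiency of the sample median relative to the sample mean for N(mu, 1).
for n in (3, 301):
    x = rng.normal(0.0, 1.0, size=(reps, n))
    eff = x.mean(axis=1).var() / np.median(x, axis=1).var()
    print(f"N={n}: var(mean)/var(median) ~ {eff:.3f}")

# For N=3 this is about 0.74; as N grows it approaches 2/pi:
print(f"2/pi = {2 / np.pi:.3f}")
```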
The sample mean is thus more efficient than the sample median in this example. However, there may be measures by which the median performs better. For example, the median is far more robust to outliers, so that if the Gaussian model is questionable or approximate, there may be advantages to using the median (see Robust statistics).
If T1 and T2 are estimators for the parameter θ, then T1 is said to dominate T2 if its mean squared error is never larger than that of T2 and is strictly smaller for at least some value of θ. Formally, T1 dominates T2 if

\operatorname{E}\big[(T_1 - \theta)^2\big] \ \le\ \operatorname{E}\big[(T_2 - \theta)^2\big]

holds for all θ, with strict inequality holding somewhere.
The relative efficiency of two unbiased estimators is defined as[12]

e(T_1, T_2) = \frac{\operatorname{E}\big[(T_2 - \theta)^2\big]}{\operatorname{E}\big[(T_1 - \theta)^2\big]} = \frac{\operatorname{var}(T_2)}{\operatorname{var}(T_1)}

Although e is in general a function of θ, in many cases the dependence drops out; if this is so, e being greater than one would indicate that T1 is preferable, regardless of the true value of θ.
An alternative to relative efficiency for comparing estimators is the Pitman closeness criterion. This replaces the comparison of mean-squared-errors with comparing how often one estimator produces estimates closer to the true value than another estimator.
In estimating the mean of uncorrelated, identically distributed variables we can take advantage of the fact that the variance of the sum is the sum of the variances. In this case efficiency can be defined as the square of the coefficient of variation, i.e.,[13]

e \equiv \left(\frac{\sigma}{\mu}\right)^2
Relative efficiency of two such estimators can thus be interpreted as the relative sample size of one required to achieve the certainty of the other. Proof:

e(T_1, T_2) = \frac{s_2^2}{s_1^2}

Now because s_1^2 = \frac{\sigma^2}{n_1} and s_2^2 = \frac{\sigma^2}{n_2}, we have e(T_1, T_2) = \frac{n_1}{n_2}, so the relative efficiency expresses the relative sample size of the first estimator needed to match the variance of the second.
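A short numerical illustration of this interpretation (the numbers are chosen purely for illustration):

```latex
% Two sample means of the same variable with \sigma^2 = 4:
% T_1 uses n_1 = 50 observations, T_2 uses n_2 = 100.
\operatorname{var}(T_1) = \frac{4}{50} = 0.08, \qquad
\operatorname{var}(T_2) = \frac{4}{100} = 0.04, \qquad
e(T_1, T_2) = \frac{\operatorname{var}(T_2)}{\operatorname{var}(T_1)}
            = \frac{n_1}{n_2} = \tfrac{1}{2}.
% To match the variance of T_2, the first estimator would need its sample
% size scaled by 1/e = 2, i.e. n_1 = 100 observations.
```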
Efficiency of an estimator may change significantly if the distribution changes, often dropping. This is one of the motivations of robust statistics – an estimator such as the sample mean is an efficient estimator of the population mean of a normal distribution, for example, but can be an inefficient estimator of a mixture distribution of two normal distributions with the same mean and different variances. For example, if a distribution is a combination of 98% N(μ, σ) and 2% N(μ, 10σ), the presence of extreme values from the latter distribution (often "contaminating outliers") significantly reduces the efficiency of the sample mean as an estimator of μ. By contrast, the trimmed mean is less efficient for a normal distribution, but is more robust to (i.e., less affected by) changes in the distribution, and thus may be more efficient for a mixture distribution. Similarly, the shape of a distribution, such as skewness or heavy tails, can significantly reduce the efficiency of estimators that assume a symmetric distribution or thin tails.
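A minimal simulation sketch of this contamination effect (illustrative; it uses SciPy's trim_mean, and the 5% trimming fraction and sample sizes are arbitrary choices):

```python
import numpy as np
from scipy.stats import trim_mean

rng = np.random.default_rng(3)
mu, sigma, n, reps = 0.0, 1.0, 100, 20_000

# Each observation is N(mu, sigma) with prob. 0.98, N(mu, 10*sigma) with prob. 0.02.
scale = np.where(rng.random((reps, n)) < 0.02, 10 * sigma, sigma)
x = rng.normal(mu, scale)

print(f"var(sample mean)  ~ {x.mean(axis=1).var():.5f}")
print(f"var(trimmed mean) ~ {trim_mean(x, 0.05, axis=1).var():.5f}")
# Under contamination the trimmed mean has the smaller variance, i.e. it is
# the more efficient estimator of mu here, despite being less efficient
# than the sample mean for purely normal data.
```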
Efficiency in statistics is important because it allows the performance of various estimators to be compared. Although an unbiased estimator is usually favored over a biased one, a more efficient biased estimator can sometimes be more valuable than a less efficient unbiased estimator. For example, this can occur when the values of the biased estimator gather around a number closer to the true value. Thus, estimator performance can be assessed easily by comparing mean squared errors or variances.
While efficiency is a desirable quality of an estimator, it must be weighed against other considerations, and an estimator that is efficient for certain distributions may well be inefficient for other distributions. Most significantly, estimators that are efficient for clean data from a simple distribution, such as the normal distribution (which is symmetric, unimodal, and has thin tails) may not be robust to contamination by outliers, and may be inefficient for more complicated distributions. In robust statistics, more importance is placed on robustness and applicability to a wide variety of distributions, rather than efficiency on a single distribution. M-estimators are a general class of estimators motivated by these concerns. They can be designed to yield both robustness and high relative efficiency, though possibly lower efficiency than traditional estimators for some cases. They can be very computationally complicated, however.
A more traditional alternative is the class of L-estimators, which are very simple statistics that are easy to compute and interpret, in many cases robust, and often sufficiently efficient for initial estimates. See applications of L-estimators for further discussion. Inefficient statistics in this sense are discussed in detail in The Atomic Nucleus by R. D. Evans, written before the advent of computers, when efficiently estimating even the arithmetic mean of a sorted series of measurements was laborious.[14]
For comparing significance tests, a meaningful measure of efficiency can be defined based on the sample size required for the test to achieve a given power.[15]
Pitman efficiency[16] and Bahadur efficiency (or Hodges–Lehmann efficiency)[17][18][19] relate to the comparison of the performance of statistical hypothesis testing procedures.
For experimental designs, efficiency relates to the ability of a design to achieve the objective of the study with minimal expenditure of resources such as time and money. In simple cases, the relative efficiency of designs can be expressed as the ratio of the sample sizes required to achieve a given objective.[20]