Average absolute deviation

From Wikipedia, the free encyclopedia
Summary statistic of variability

The average absolute deviation (AAD) of a data set is the average of the absolute deviations from a central point. It is a summary statistic of statistical dispersion or variability. In the general form, the central point can be a mean, median, mode, or the result of any other measure of central tendency or any reference value related to the given data set. AAD includes the mean absolute deviation and the median absolute deviation (both abbreviated as MAD).

Measures of dispersion

Several measures of statistical dispersion are defined in terms of the absolute deviation. The term "average absolute deviation" does not uniquely identify a measure of statistical dispersion, as there are several measures that can be used to measure absolute deviations, and there are several measures of central tendency that can be used as well. Thus, to uniquely identify the absolute deviation it is necessary to specify both the measure of deviation and the measure of central tendency. The statistical literature has not yet adopted a standard notation: both the mean absolute deviation around the mean and the median absolute deviation around the median have been denoted by their initials "MAD", which may lead to confusion, since the two generally have considerably different values.

Mean absolute deviation around a central point

For arbitrary differences (not around a central point), see Mean absolute difference.
For paired differences (also known as mean absolute deviation), see Mean absolute error.

The mean absolute deviation of a set $X = \{x_1, x_2, \ldots, x_n\}$ is:

$$\frac{1}{n}\sum_{i=1}^{n}|x_i - m(X)|.$$

The choice of measure of central tendency, $m(X)$, has a marked effect on the value of the mean deviation. For example, for the data set {2, 2, 3, 4, 14}:

| Measure of central tendency $m(X)$ | Mean absolute deviation |
|---|---|
| Arithmetic mean = 5 | $\frac{\lvert 2-5\rvert + \lvert 2-5\rvert + \lvert 3-5\rvert + \lvert 4-5\rvert + \lvert 14-5\rvert}{5} = 3.6$ |
| Median = 3 | $\frac{\lvert 2-3\rvert + \lvert 2-3\rvert + \lvert 3-3\rvert + \lvert 4-3\rvert + \lvert 14-3\rvert}{5} = 2.8$ |
| Mode = 2 | $\frac{\lvert 2-2\rvert + \lvert 2-2\rvert + \lvert 3-2\rvert + \lvert 4-2\rvert + \lvert 14-2\rvert}{5} = 3.0$ |
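
The three values for the data set {2, 2, 3, 4, 14} can be recomputed with a short Python sketch. The helper name `mean_abs_dev` is introduced here for illustration only and is not part of any library; the central points come from the standard `statistics` module.

```python
from statistics import mean, median, mode


def mean_abs_dev(data, center):
    """Average of the absolute deviations of `data` from the point `center`."""
    return sum(abs(x - center) for x in data) / len(data)


data = [2, 2, 3, 4, 14]

# Same data, three different choices of the central point m(X).
results = {
    "mean": mean_abs_dev(data, mean(data)),
    "median": mean_abs_dev(data, median(data)),
    "mode": mean_abs_dev(data, mode(data)),
}

for name, value in results.items():
    print(f"about the {name}: {value}")
```

The only thing that changes between the three results is the `center` argument, which is why the term "mean absolute deviation" is ambiguous until the central point is named.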

Mean absolute deviation around the mean

The mean absolute deviation (MAD), also referred to as the "mean deviation" or sometimes "average absolute deviation", is the mean of the data's absolute deviations around the data's mean: the average (absolute) distance from the mean. "Average absolute deviation" can refer to either this usage, or to the general form with respect to a specified central point (see above).

MAD has been proposed as a replacement for the standard deviation on the grounds that it corresponds better to real life.[1] Because the MAD is a simpler measure of variability than the standard deviation, it can be useful in school teaching.[2][3]

As a measure of forecast accuracy, MAD is closely related to the mean squared error (MSE), which is the average squared error of the forecasts. Although the two measures are closely related, MAD is more commonly used because it is both easier to compute (avoiding the need for squaring)[4] and easier to understand.[5]

Relation to standard deviation

See also: Median absolute deviation § Relation to standard deviation

For the normal distribution, the ratio of mean absolute deviation from the mean to standard deviation is $\sqrt{2/\pi} = 0.79788456\ldots$. Thus if $X$ is a normally distributed random variable with expected value 0 then, see Geary (1935):[6]

$$w = \frac{\operatorname{E}\left[|X|\right]}{\sqrt{\operatorname{E}\left[X^{2}\right]}} = \sqrt{\frac{2}{\pi}}.$$

In other words, for a normal distribution, mean absolute deviation is about 0.8 times the standard deviation. In a finite Gaussian sample of size $n$, however, the measured ratio of mean absolute deviation to standard deviation, $w_n$, is bounded by $w_n \in [0,1]$ and is biased for small $n$.[7]
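
A quick Monte Carlo sketch can illustrate the ratio numerically. The function name `mad_to_sd_ratio`, the seed, and the sample size are arbitrary choices made for this example; the standard deviation here divides by $n$ (the population form), which is an assumption of the sketch.

```python
import math
import random


def mad_to_sd_ratio(sample):
    """Ratio of mean absolute deviation (about the mean) to standard deviation.

    Both quantities are computed in-sample and divide by n.
    """
    n = len(sample)
    mu = sum(sample) / n
    mad = sum(abs(x - mu) for x in sample) / n
    sd = math.sqrt(sum((x - mu) ** 2 for x in sample) / n)
    return mad / sd


rng = random.Random(0)  # fixed seed so the run is repeatable
sample = [rng.gauss(0.0, 1.0) for _ in range(200_000)]

ratio = mad_to_sd_ratio(sample)
theory = math.sqrt(2 / math.pi)

print(f"simulated ratio: {ratio:.4f}")
print(f"sqrt(2/pi):      {theory:.4f}")
```

With a large sample the simulated ratio should land close to $\sqrt{2/\pi}$; rerunning with a very small sample size shows the in-sample ratio scattering more widely within $[0, 1]$.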

The mean absolute deviation from the mean is less than or equal to the standard deviation; one way of proving this relies on Jensen's inequality.

Proof

Jensen's inequality is $\varphi\left(\operatorname{E}[Y]\right) \leq \operatorname{E}\left[\varphi(Y)\right]$, where $\varphi$ is a convex function. Taking $\varphi(y) = y^{2}$ and $Y = |X - \mu|$, this implies:

$$\left(\operatorname{E}\left[|X-\mu|\right]\right)^{2} \leq \operatorname{E}\left[|X-\mu|^{2}\right] = \operatorname{Var}(X)$$

Since both sides are nonnegative, and the square root is a monotonically increasing function on the nonnegative reals:

$$\operatorname{E}\left[|X-\mu|\right] \leq \sqrt{\operatorname{Var}(X)}$$

For a general case of this statement, see Hölder's inequality.

Mean absolute deviation around the median

The median is the point about which the mean deviation is minimized. The MAD median offers a direct measure of the scale of a random variable around its median:

$$D_{\text{med}} = \operatorname{E}\left[|X - \text{median}|\right]$$

This is the maximum likelihood estimator of the scale parameter $b$ of the Laplace distribution.

Since the median minimizes the average absolute distance, we have $D_{\text{med}} \leq D_{\text{mean}}$: the mean absolute deviation from the median is less than or equal to the mean absolute deviation from the mean. In fact, the mean absolute deviation from the median is always less than or equal to the mean absolute deviation from any other fixed number.
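
The minimizing property can be seen from a short calculus sketch. It makes the added assumption (not needed for the general result) that $X$ has a finite mean and a continuous density $f$ with cumulative distribution function $F$; the symbol $D(c)$, the mean absolute deviation about a fixed point $c$, is introduced here only for illustration.

```latex
\begin{aligned}
D(c) &= \operatorname{E}\left[|X - c|\right]
      = \int_{-\infty}^{c} (c - x)\, f(x)\, dx
      + \int_{c}^{\infty} (x - c)\, f(x)\, dx, \\
D'(c) &= F(c) - \bigl(1 - F(c)\bigr) = 2F(c) - 1, \\
D''(c) &= 2 f(c) \geq 0 .
\end{aligned}
```

The derivative vanishes exactly when $F(c) = 1/2$, that is, when $c$ is a median, and the nonnegative second derivative shows that $D$ is convex, so this stationary point is a global minimum.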

By using the general dispersion function, Habib (2011) defined MAD about median as

$$D_{\text{med}} = \operatorname{E}\left[|X - \text{median}|\right] = 2\operatorname{Cov}(X, I_{O})$$

where the indicator function is

$$I_{O} := \begin{cases} 1 & \text{if } x > \text{median}, \\ 0 & \text{otherwise}. \end{cases}$$

This representation allows for obtaining MAD median correlation coefficients.[citation needed]

Median absolute deviation around a central point

Main article: Median absolute deviation

While in principle the mean or any other central point could be taken as the central point for the median absolute deviation, most often the median value is taken instead.

Median absolute deviation around the median

Main article: Median absolute deviation

The median absolute deviation (also MAD) is the median of the absolute deviation from the median. It is a robust estimator of dispersion.

For the example {2, 2, 3, 4, 14}: 3 is the median, so the absolute deviations from the median are {1, 1, 0, 1, 11} (reordered as {0, 1, 1, 1, 11}) with a median of 1, in this case unaffected by the value of the outlier 14, so the median absolute deviation is 1.
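
The worked example can be reproduced with a minimal Python sketch. The function name `median_abs_dev` is a label chosen for this illustration; the second data set, with the outlier inflated to 1400, is a made-up variation used to show the robustness mentioned above.

```python
from statistics import median


def median_abs_dev(data):
    """Median of the absolute deviations of `data` from its own median."""
    center = median(data)
    return median(abs(x - center) for x in data)


data = [2, 2, 3, 4, 14]
inflated = [2, 2, 3, 4, 1400]  # hypothetical variant with a far larger outlier

print(median_abs_dev(data))
print(median_abs_dev(inflated))
```

Both calls give the same result, because only the middle of the sorted absolute deviations matters and the outlier never reaches it.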

For a symmetric distribution, the median absolute deviation is equal to half the interquartile range.

Maximum absolute deviation

The maximum absolute deviation around an arbitrary point is the maximum of the absolute deviations of a sample from that point. While not strictly a measure of central tendency, the maximum absolute deviation can be found using the formula for the average absolute deviation as above with $m(X) = \max(X)$, where $\max(X)$ is the sample maximum.

Minimization

The measures of statistical dispersion derived from absolute deviation characterize various measures of central tendency as minimizing dispersion: the median is the measure of central tendency most associated with the absolute deviation. Some location parameters can be compared as follows:

  • $L^2$ norm statistics: the mean minimizes the mean squared error,
  • $L^1$ norm statistics: the median minimizes the average absolute deviation,
  • $L^\infty$ norm statistics: the mid-range minimizes the maximum absolute deviation,
  • trimmed $L^\infty$ norm statistics: for example, the midhinge (average of first and third quartiles), which minimizes the median absolute deviation of the whole distribution, also minimizes the maximum absolute deviation of the distribution after the top and bottom 25% have been trimmed off.
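
The first three correspondences in the list can be illustrated with a brute-force search over candidate central points. This is a rough numerical sketch rather than a proof; the grid spacing of 0.01, the search range 0 to 16, and all function names are assumptions made for the example.

```python
data = [2, 2, 3, 4, 14]


def mean_squared(c):
    """L2-type criterion: average squared deviation from c."""
    return sum((x - c) ** 2 for x in data) / len(data)


def mean_absolute(c):
    """L1-type criterion: average absolute deviation from c."""
    return sum(abs(x - c) for x in data) / len(data)


def max_absolute(c):
    """L-infinity-type criterion: largest absolute deviation from c."""
    return max(abs(x - c) for x in data)


# Candidate central points 0.00, 0.01, ..., 16.00.
candidates = [i / 100 for i in range(0, 1601)]

best_l2 = min(candidates, key=mean_squared)     # expected near the mean
best_l1 = min(candidates, key=mean_absolute)    # expected near the median
best_linf = min(candidates, key=max_absolute)   # expected near the mid-range

mid_range = (min(data) + max(data)) / 2

print("L2 minimizer:", best_l2)
print("L1 minimizer:", best_l1)
print("L-infinity minimizer:", best_linf, "mid-range:", mid_range)
```

For this data set the mean is 5, the median is 3, and the mid-range is 8, so the three minimizers found on the grid can be compared directly against those values.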

Estimation

The mean absolute deviation of a sample is a biased estimator of the mean absolute deviation of the population. In order for the absolute deviation to be an unbiased estimator, the expected value (average) of all the sample absolute deviations must equal the population absolute deviation. However, it does not. For the population 1, 2, 3 both the population absolute deviation about the median and the population absolute deviation about the mean are 2/3. The average of all the sample absolute deviations about the mean of size 3 that can be drawn from the population is 44/81, while the average of all the sample absolute deviations about the median is 4/9. Therefore, the absolute deviation is a biased estimator.
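
The figures in this example can be checked by exhaustive enumeration. The sketch below assumes that "all samples of size 3" means the 27 ordered samples drawn with replacement from the population {1, 2, 3}, which is the reading that reproduces the quoted values; exact fractions are used to avoid rounding, and the helper name `abs_dev` is illustrative.

```python
from fractions import Fraction
from itertools import product
from statistics import median

population = [1, 2, 3]


def abs_dev(values, center):
    """Mean absolute deviation of `values` about `center`, as an exact fraction."""
    return sum(abs(Fraction(x) - center) for x in values) / len(values)


# All ordered samples of size 3 drawn with replacement (3**3 = 27 of them).
samples = list(product(population, repeat=3))

avg_about_mean = sum(
    abs_dev(s, Fraction(sum(s), len(s))) for s in samples
) / len(samples)

avg_about_median = sum(abs_dev(s, median(s)) for s in samples) / len(samples)

population_value = abs_dev(population, Fraction(sum(population), len(population)))

print("population absolute deviation:", population_value)
print("average sample deviation about the mean:", avg_about_mean)
print("average sample deviation about the median:", avg_about_median)
```

Both sample averages fall below the population value, which is the bias described above.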

However, this argument is based on the notion of mean-unbiasedness. Each measure of location has its own form of unbiasedness (see entry on biased estimator). The relevant form of unbiasedness here is median unbiasedness.

References

  1. Taleb, Nassim Nicholas (2014). "What scientific idea is ready for retirement?". Edge. Archived from the original on 2014-01-16. Retrieved 2014-01-16.
  2. Kader, Gary (March 1999). "Means and MADS". Mathematics Teaching in the Middle School. 4 (6): 398–403. doi:10.5951/MTMS.4.6.0398. Archived from the original on 2013-05-18. Retrieved 2013-02-20.
  3. Franklin, Christine, Gary Kader, Denise Mewborn, Jerry Moreno, Roxy Peck, Mike Perry, and Richard Scheaffer (2007). Guidelines for Assessment and Instruction in Statistics Education (PDF). American Statistical Association. ISBN 978-0-9791747-1-1. Archived (PDF) from the original on 2013-03-07. Retrieved 2013-02-20.
  4. Nahmias, Steven; Olsen, Tava Lennon (2015), Production and Operations Analysis (7th ed.), Waveland Press, p. 62, ISBN 9781478628248, "MAD is often the preferred method of measuring the forecast error because it does not require squaring."
  5. Stadtler, Hartmut; Kilger, Christoph; Meyr, Herbert, eds. (2014), Supply Chain Management and Advanced Planning: Concepts, Models, Software, and Case Studies, Springer Texts in Business and Economics (5th ed.), Springer, p. 143, ISBN 9783642553097, "the meaning of the MAD is easier to interpret."
  6. Geary, R. C. (1935). "The ratio of the mean deviation to the standard deviation as a test of normality". Biometrika. 27 (3/4): 310–332.
  7. See also Geary's 1936 and 1947 papers: Geary, R. C. (1936). "Moments of the ratio of the mean deviation to the standard deviation for normal samples". Biometrika. 28 (3/4): 295–307; and Geary, R. C. (1947). "Testing for normality". Biometrika. 34 (3/4): 209–242.