Moving average

From Wikipedia, the free encyclopedia
Type of statistical measure over subsets of a dataset
For other uses, see Moving-average model and Moving average (disambiguation).
Smoothing of a noisy sine (blue curve) with a moving average (red curve).

In statistics, a moving average (also called a rolling average, running average, moving mean,[1] or rolling mean) is a calculation to analyze data points by creating a series of averages of different selections of the full data set. Variations include simple, cumulative, and weighted forms.

Mathematically, a moving average is a type of convolution. Thus, in signal processing it is viewed as a low-pass finite impulse response filter. Because its filter coefficients trace out a boxcar function, it is also called a boxcar filter. It is sometimes followed by downsampling.

Given a series of numbers and a fixed subset size, the first element of the moving average is obtained by taking the average of the initial fixed subset of the number series. Then the subset is modified by "shifting forward"; that is, excluding the first number of the series and including the next value in the series.

A moving average is commonly used with time series data to smooth out short-term fluctuations and highlight longer-term trends or cycles; in this case the calculation is sometimes called a time average. The threshold between short-term and long-term depends on the application, and the parameters of the moving average will be set accordingly. It is also used in economics to examine gross domestic product, employment, or other macroeconomic time series. When used with non-time series data, a moving average filters higher frequency components without any specific connection to time, although typically some kind of ordering is implied. Viewed simplistically, it can be regarded as smoothing the data.

Simple moving average


In financial applications a simple moving average (SMA) is the unweighted mean of the previous $k$ data points. However, in science and engineering, the mean is normally taken from an equal number of data on either side of a central value. This ensures that variations in the mean are aligned with the variations in the data rather than being shifted in time.

An example of a simple equally weighted running mean is the mean over the last $k$ entries of a data set containing $n$ entries. Let those data points be $p_1, p_2, \dots, p_n$. This could be the closing prices of a stock. The mean over the last $k$ data points (days in this example) is denoted as $\textit{SMA}_k$ and calculated as:

$$
\textit{SMA}_k = \frac{p_{n-k+1} + p_{n-k+2} + \cdots + p_n}{k} = \frac{1}{k} \sum_{i=n-k+1}^{n} p_i
$$
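
As a concrete illustration, here is a minimal Python sketch of this formula (the function name `sma` and the example prices are illustrative, not from any particular library):

```python
def sma(prices, k):
    """Simple moving average of the last k values of `prices`.

    Returns the unweighted mean of prices[-k:], mirroring the formula above;
    raises ValueError if fewer than k values are available.
    """
    if len(prices) < k:
        raise ValueError("need at least k data points")
    window = prices[-k:]          # the last k data points p_{n-k+1}, ..., p_n
    return sum(window) / k


# Example: closing prices over 5 days, 3-day SMA
print(sma([10.0, 11.0, 12.0, 13.0, 14.0], 3))   # (12 + 13 + 14) / 3 = 13.0
```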

When calculating the next mean $\textit{SMA}_{k,\text{next}}$ with the same sampling width $k$, the range from $n-k+2$ to $n+1$ is considered. A new value $p_{n+1}$ comes into the sum and the oldest value $p_{n-k+1}$ drops out. This simplifies the calculation by reusing the previous mean $\textit{SMA}_{k,\text{prev}}$:

$$
\begin{aligned}
\textit{SMA}_{k,\text{next}} &= \frac{1}{k} \sum_{i=n-k+2}^{n+1} p_i \\
&= \frac{1}{k} \Big( \underbrace{p_{n-k+2} + p_{n-k+3} + \dots + p_n + p_{n+1}}_{\sum_{i=n-k+2}^{n+1} p_i} + \underbrace{p_{n-k+1} - p_{n-k+1}}_{=0} \Big) \\
&= \underbrace{\frac{1}{k} \Big( p_{n-k+1} + p_{n-k+2} + \dots + p_n \Big)}_{=\,\textit{SMA}_{k,\text{prev}}} - \frac{p_{n-k+1}}{k} + \frac{p_{n+1}}{k} \\
&= \textit{SMA}_{k,\text{prev}} + \frac{1}{k} \Big( p_{n+1} - p_{n-k+1} \Big)
\end{aligned}
$$

This means that the moving-average filter can be computed quite cheaply on real-time data with a FIFO/circular buffer and only 3 arithmetic steps.
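
A minimal Python sketch of this streaming update, using a FIFO buffer as described (the class and method names are illustrative assumptions):

```python
from collections import deque

class RollingSMA:
    """Streaming simple moving average over a fixed window of size k.

    Uses a FIFO buffer and the update
        SMA_next = SMA_prev + (p_new - p_oldest) / k,
    so each new sample costs a constant number of arithmetic steps.
    """
    def __init__(self, k):
        self.k = k
        self.buffer = deque()
        self.mean = 0.0

    def update(self, value):
        self.buffer.append(value)
        if len(self.buffer) <= self.k:
            # Window not yet full: fall back to a cumulative average.
            self.mean += (value - self.mean) / len(self.buffer)
        else:
            oldest = self.buffer.popleft()
            self.mean += (value - oldest) / self.k
        return self.mean


sma3 = RollingSMA(3)
for p in [10.0, 11.0, 12.0, 13.0, 14.0]:
    print(sma3.update(p))   # last value printed: 13.0
```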

During the initial filling of the FIFO/circular buffer, the sampling window is equal to the data-set size, thus $k = n$, and the average calculation is performed as a cumulative moving average.

The period selected ($k$) depends on the type of movement of interest, such as short, intermediate, or long-term.

If the data used are not centered around the mean, a simple moving average lags behind the latest datum by half the sample width. An SMA can also be disproportionately influenced by old data dropping out or new data coming in. One characteristic of the SMA is that if the data has a periodic fluctuation, then applying an SMA of that period will eliminate that variation (the average always containing one complete cycle). But a perfectly regular cycle is rarely encountered.[2]

For a number of applications, it is advantageous to avoid the shifting induced by using only "past" data. Hence a central moving average can be computed, using data equally spaced on either side of the point in the series where the mean is calculated.[3] This requires using an odd number of points in the sample window.
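
A minimal sketch of such a central moving average in Python (the function name is illustrative); it assumes an odd window so each output aligns with a central sample, and it simply omits the edge points whose windows would be incomplete:

```python
def central_moving_average(data, k):
    """Central moving average with an odd window size k.

    Each output value is the mean of k samples centred on the corresponding
    input sample; the first and last (k - 1) // 2 points are left out
    because their windows would be incomplete.
    """
    if k % 2 == 0:
        raise ValueError("window size k must be odd")
    half = k // 2
    return [sum(data[i - half:i + half + 1]) / k
            for i in range(half, len(data) - half)]


print(central_moving_average([1, 2, 4, 8, 16], 3))   # [2.33..., 4.66..., 9.33...]
```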

A major drawback of the SMA is that it lets through a significant amount of signal components with periods shorter than the window length. This can lead to unexpected artifacts, such as peaks in the smoothed result appearing where there were troughs in the data. It also leads to the result being less smooth than expected, since some of the higher frequencies are not properly removed.

Its frequency response is that of a low-pass filter of the sinc-in-frequency type.

Continuous moving average


The continuous moving average is defined by the following integral. The $\varepsilon$-environment $[x_0 - \varepsilon, x_0 + \varepsilon]$ around $x_0$ defines the intensity of smoothing of the graph of the function

$$
f\colon \mathbb{R} \to \mathbb{R}, \qquad x \mapsto f(x).
$$

The continuous moving average of the function $f$ is defined as:

$$
\mathit{MA}_f\colon \mathbb{R} \to \mathbb{R}, \qquad x_0 \mapsto \frac{1}{2\varepsilon} \int_{x_0-\varepsilon}^{x_0+\varepsilon} f(t)\,dt
$$

A larger $\varepsilon > 0$ smooths the source function $f$ (blue) more strongly. The animations below show the moving average for different values of $\varepsilon > 0$. The factor $\frac{1}{2\varepsilon}$ is used because $2\varepsilon$ is the width of the integration interval.

  • Continuous moving average of a sine and a polynomial: smoothing with a small interval of integration
  • Continuous moving average of a sine and a polynomial: smoothing with a larger interval of integration
  • Animation showing the impact of the interval width on the smoothing by the moving average
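
The definition above can also be approximated numerically. The following sketch (the function name, sample count, and test function are illustrative assumptions) estimates the integral by averaging $f$ over finely spaced points in $[x_0 - \varepsilon,\ x_0 + \varepsilon]$:

```python
import math

def continuous_moving_average(f, x0, eps, samples=1001):
    """Approximate (1 / 2eps) * integral of f(t) dt over [x0 - eps, x0 + eps].

    The integral is estimated by averaging f over `samples` equally spaced
    midpoints in the interval (a simple midpoint rule).
    """
    step = 2 * eps / samples
    total = sum(f(x0 - eps + (i + 0.5) * step) for i in range(samples))
    return total / samples


# Smoothing sin(x) at x0 = 1 barely changes it for a small epsilon ...
print(continuous_moving_average(math.sin, 1.0, 0.1))       # ~0.840, close to sin(1) ~ 0.841
# ... while a large epsilon averages over a full period and flattens it.
print(continuous_moving_average(math.sin, 1.0, math.pi))   # ~0.0
```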

Cumulative average


In a cumulative average (CA), the data arrive in an ordered datum stream, and the user would like to get the average of all of the data up until the current datum. For example, an investor may want the average price of all of the stock transactions for a particular stock up until the current time. As each new transaction occurs, the average price at the time of the transaction can be calculated for all of the transactions up to that point using the cumulative average, typically an equally weighted average of the sequence of $n$ values $x_1, \ldots, x_n$ up to the current time:

$$
\textit{CA}_n = \frac{x_1 + \cdots + x_n}{n}\,.
$$

The brute-force method to calculate this would be to store all of the data and calculate the sum and divide by the number of points every time a new datum arrived. However, it is possible to simply update the cumulative average as a new value $x_{n+1}$ becomes available, using the formula

$$
\textit{CA}_{n+1} = \frac{x_{n+1} + n \cdot \textit{CA}_n}{n+1}.
$$
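
A minimal sketch of this online update in Python (the class name is illustrative):

```python
class CumulativeAverage:
    """Online cumulative average: CA_{n+1} = (x_{n+1} + n * CA_n) / (n + 1)."""

    def __init__(self):
        self.n = 0          # number of data points seen so far
        self.value = 0.0    # current cumulative average CA_n

    def update(self, x):
        self.value = (x + self.n * self.value) / (self.n + 1)
        self.n += 1
        return self.value


ca = CumulativeAverage()
for price in [10.0, 20.0, 30.0]:
    print(ca.update(price))   # 10.0, 15.0, 20.0
```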

Thus the current cumulative average for a new datum is equal to the previous cumulative average, times $n$, plus the latest datum, all divided by the number of points received so far, $n+1$. When all of the data arrive ($n = N$), the cumulative average will equal the final average. It is also possible to store a running total of the data as well as the number of points, and divide the total by the number of points to get the CA each time a new datum arrives.

The derivation of the cumulative average formula is straightforward. Using

$$
x_1 + \cdots + x_n = n \cdot \textit{CA}_n
$$

and similarly for $n + 1$, it is seen that

$$
\begin{aligned}
x_{n+1} &= (x_1 + \cdots + x_{n+1}) - (x_1 + \cdots + x_n) \\
&= (n+1) \cdot \textit{CA}_{n+1} - n \cdot \textit{CA}_n
\end{aligned}
$$

Solving this equation for $\textit{CA}_{n+1}$ results in

$$
\begin{aligned}
\textit{CA}_{n+1} &= \frac{x_{n+1} + n \cdot \textit{CA}_n}{n+1} \\[6pt]
&= \frac{x_{n+1} + (n + 1 - 1) \cdot \textit{CA}_n}{n+1} \\[6pt]
&= \frac{(n+1) \cdot \textit{CA}_n + x_{n+1} - \textit{CA}_n}{n+1} \\[6pt]
&= \textit{CA}_n + \frac{x_{n+1} - \textit{CA}_n}{n+1}
\end{aligned}
$$

Weighted moving average


A weighted average is an average that has multiplying factors to give different weights to data at different positions in the sample window. Mathematically, the weighted moving average is the convolution of the data with a fixed weighting function. One application is removing pixelization from a digital graphical image. This is also known as anti-aliasing.[citation needed]

In the financial field, and more specifically in the analyses of financial data, a weighted moving average (WMA) has the specific meaning of weights that decrease in arithmetical progression.[4] In an $n$-day WMA the latest day has weight $n$, the second latest $n-1$, etc., down to one.

$$
\text{WMA}_M = \frac{n p_M + (n-1) p_{M-1} + \cdots + 2 p_{M-n+2} + p_{M-n+1}}{n + (n-1) + \cdots + 2 + 1}
$$

WMA weights, $n = 15$

The denominator is a triangle number equal to $\frac{n(n+1)}{2}$. In the more general case the denominator will always be the sum of the individual weights.

When calculating the WMA across successive values, the difference between the numerators of $\text{WMA}_{M+1}$ and $\text{WMA}_M$ is $n p_{M+1} - p_M - \dots - p_{M-n+1}$. If we denote the sum $p_M + \dots + p_{M-n+1}$ by $\text{Total}_M$, then

$$
\begin{aligned}
\text{Total}_{M+1} &= \text{Total}_M + p_{M+1} - p_{M-n+1} \\[3pt]
\text{Numerator}_{M+1} &= \text{Numerator}_M + n p_{M+1} - \text{Total}_M \\[3pt]
\text{WMA}_{M+1} &= \frac{\text{Numerator}_{M+1}}{n + (n-1) + \cdots + 2 + 1}
\end{aligned}
$$
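
A minimal sketch of the $n$-day WMA using this incremental update (class and variable names are illustrative; the ramp-up behaviour before the window is full is not specified by the formulas above and is handled here by weighting the values seen so far):

```python
from collections import deque

class RollingWMA:
    """n-day weighted moving average with linearly decreasing weights.

    Maintains Total (plain sum of the window) and Numerator (weighted sum)
    with the incremental update above, so each new value in steady state
    costs a constant number of arithmetic operations.
    """
    def __init__(self, n):
        self.n = n
        self.window = deque()
        self.total = 0.0       # p_M + ... + p_{M-n+1}
        self.numerator = 0.0   # n*p_M + (n-1)*p_{M-1} + ... + 1*p_{M-n+1}
        self.denominator = n * (n + 1) / 2

    def update(self, p):
        if len(self.window) < self.n:
            # Window not yet full: rebuild from scratch (cheap, keeps the sketch simple).
            self.window.append(p)
            k = len(self.window)
            self.total = sum(self.window)
            self.numerator = sum((i + 1) * v for i, v in enumerate(self.window))
            return self.numerator / (k * (k + 1) / 2)
        oldest = self.window.popleft()
        self.window.append(p)
        self.numerator += self.n * p - self.total   # uses the old Total_M
        self.total += p - oldest
        return self.numerator / self.denominator


wma = RollingWMA(3)
for p in [1.0, 2.0, 3.0, 4.0]:
    print(wma.update(p))   # last value: (1*2 + 2*3 + 3*4) / 6 = 20/6 = 3.33...
```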

The plot of WMA weights above ($n = 15$) shows how the weights decrease, from the highest weight for the most recent data down to zero. They can be compared to the weights in the exponential moving average, which follows.

Exponential moving average

Main article: Exponential smoothing
Further information: EWMA chart

An exponential moving average (EMA), also known as an exponentially weighted moving average (EWMA),[5] is a first-order infinite impulse response filter that applies weighting factors which decrease exponentially. The weighting for each older datum decreases exponentially, never reaching zero. A standard formulation is given by Hunter (1986).[6]
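
A minimal sketch of the common exponential-smoothing recurrence $S_t = \alpha x_t + (1 - \alpha) S_{t-1}$, with smoothing factor $\alpha \in (0, 1]$ (initialisation conventions vary; the function name is illustrative):

```python
def ema(values, alpha):
    """Exponentially weighted moving average with smoothing factor alpha in (0, 1].

    Uses the common recurrence  S_t = alpha * x_t + (1 - alpha) * S_{t-1},
    initialised with the first observation (other initialisations exist).
    A larger alpha discounts older observations faster.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    it = iter(values)
    s = next(it)               # S_1 = x_1
    smoothed = [s]
    for x in it:
        s = alpha * x + (1 - alpha) * s
        smoothed.append(s)
    return smoothed


print(ema([1.0, 2.0, 3.0, 4.0], alpha=0.5))   # [1.0, 1.5, 2.25, 3.125]
```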

There is also a multivariate implementation of EWMA, known as MEWMA.[7]

Other weightings


Other weighting systems are used occasionally – for example, in share trading a volume weighting will weight each time period in proportion to its trading volume.

A further weighting, used by actuaries, is Spencer's 15-Point Moving Average[8] (a central moving average). Its symmetric weight coefficients are [−3, −6, −5, 3, 21, 46, 67, 74, 67, 46, 21, 3, −5, −6, −3], which factors as [1, 1, 1, 1] × [1, 1, 1, 1] × [1, 1, 1, 1, 1] × [−3, 3, 4, 3, −3]/320, and leaves samples of any quadratic or cubic polynomial unchanged.[9][10]
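
A minimal sketch applying Spencer's weights as a central moving average (the function name is illustrative); the final example checks the cubic-preservation property quoted above:

```python
SPENCER_WEIGHTS = [-3, -6, -5, 3, 21, 46, 67, 74, 67, 46, 21, 3, -5, -6, -3]

def spencer_15_point(data):
    """Spencer's 15-point (central) moving average.

    Each output is the weighted sum of the 15 samples centred on the
    corresponding input sample, divided by 320 (the sum of the weights);
    the first and last 7 points are omitted.
    """
    half = len(SPENCER_WEIGHTS) // 2          # 7
    return [sum(w * data[i + j - half] for j, w in enumerate(SPENCER_WEIGHTS)) / 320
            for i in range(half, len(data) - half)]


# A cubic is reproduced exactly in the interior of the series.
cubic = [t**3 - 2 * t for t in range(20)]
print(spencer_15_point(cubic)[:3])    # matches cubic[7:10] = [329, 496, 711]
```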

Outside the world of finance, weighted running means have many forms and applications. Each weighting function or "kernel" has its own characteristics. In engineering and science the frequency and phase response of the filter is often of primary importance in understanding the desired and undesired distortions that a particular filter will apply to the data.

A mean does not just "smooth" the data. A mean is a form of low-pass filter. The effects of the particular filter used should be understood in order to make an appropriate choice. On this point, the French version of this article discusses the spectral effects of 3 kinds of means (cumulative, exponential, Gaussian).

Moving median


From a statistical point of view, the moving average, when used to estimate the underlying trend in a time series, is susceptible to rare events such as rapid shocks or other anomalies. A more robust estimate of the trend is the simple moving median over $n$ time points:

$$
\widetilde{p}_{\text{SM}} = \text{Median}(p_M, p_{M-1}, \ldots, p_{M-n+1})
$$

where the median is found by, for example, sorting the values inside the brackets and finding the value in the middle. For larger values of $n$, the median can be efficiently computed by updating an indexable skiplist.[11]
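
A minimal sketch of the moving median (the function name is illustrative); it keeps the window in sorted order with bisect rather than an indexable skiplist, which is simpler but slower for large $n$:

```python
from bisect import insort
from collections import deque

def moving_median(data, n):
    """Simple moving median over a window of n points.

    Keeps the current window sorted with bisect.insort; removal is linear
    in n, so for large windows an indexable skiplist (as noted above) or a
    similar structure is preferable.
    """
    window = deque()
    ordered = []          # sorted copy of the window
    result = []
    for x in data:
        window.append(x)
        insort(ordered, x)
        if len(window) > n:
            ordered.remove(window.popleft())
        if len(window) == n:
            mid = n // 2
            median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
            result.append(median)
    return result


print(moving_median([1, 100, 2, 3, 500, 4], 3))   # [2, 3, 3, 4]
```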

Statistically, the moving average is optimal for recovering the underlying trend of the time series when the fluctuations about the trend are normally distributed. However, the normal distribution does not place high probability on very large deviations from the trend, which explains why such deviations have a disproportionately large effect on the trend estimate. It can be shown that if the fluctuations are instead assumed to be Laplace distributed, then the moving median is statistically optimal.[12] For a given variance, the Laplace distribution places higher probability on rare events than does the normal, which explains why the moving median tolerates shocks better than the moving mean.

When the simple moving median above is central, the smoothing is identical to the median filter, which has applications in, for example, image signal processing. Because it responds to the order of the values in the window rather than to their magnitudes, the moving median gives a more stable trend estimate than the moving average when the series contains large, short-lived deviations.

Moving average regression model

Main article: Moving-average model

In a moving-average regression model, a variable of interest is assumed to be a weighted moving average of unobserved independent error terms; the weights in the moving average are parameters to be estimated.
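
For reference, a moving-average model of order $q$ is commonly written as

$$
X_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q},
$$

where the $\varepsilon_t$ are unobserved white-noise error terms and the coefficients $\theta_1, \dots, \theta_q$ are the weights to be estimated (notation varies by source).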

The two concepts are often confused because of their names, but while they share many similarities, they represent distinct methods and are used in very different contexts.

See also

Wikimedia Commons has media related to Moving averages.

References

  1. ^ Hydrologic Variability of the Cosumnes River Floodplain. Booth et al., San Francisco Estuary and Watershed Science, Volume 4, Issue 2, 2006.
  2. ^ Ya-lun Chou. Statistical Analysis. Holt International, 1975, ISBN 0-03-089422-0, section 17.9.
  3. ^ The derivation and properties of the simple central moving average are given in full at Savitzky–Golay filter.
  4. ^ "Weighted Moving Averages: The Basics". Investopedia.
  5. ^ "DEALING WITH MEASUREMENT NOISE - Averaging Filter". Archived from the original on 2010-03-29. Retrieved 2010-10-26.
  6. ^ NIST/SEMATECH e-Handbook of Statistical Methods: Single Exponential Smoothing, at the National Institute of Standards and Technology.
  7. ^ Yeh, A.; Lin, D.; Zhou, H.; Venkataramani, C. (2003). "A multivariate exponentially weighted moving average control chart for monitoring process variability" (PDF). Journal of Applied Statistics. 30 (5): 507–536. Bibcode:2003JApSt..30..507Y. doi:10.1080/0266476032000053655. ISSN 0266-4763. Retrieved 16 January 2025.
  8. ^ Spencer's 15-Point Moving Average, from Wolfram MathWorld.
  9. ^ Rob J Hyndman. "Moving averages". 2009-11-08. Accessed 2020-08-20.
  10. ^ Aditya Guntuboyina. "Statistics 153 (Time Series): Lecture Three". 2012-01-24. Accessed 2024-01-07.
  11. ^ "Efficient Running Median using an Indexable Skiplist". ActiveState Code, Python recipes.
  12. ^ Arce, G. R. Nonlinear Signal Processing: A Statistical Approach. Wiley: New Jersey, US, 2005.