P–P plot

From Wikipedia, the free encyclopedia

Probability plot which compares two cumulative distribution functions

Not to be confused with Q–Q plot.

In statistics, a P–P plot (probability–probability plot, percent–percent plot, or P value plot) is a probability plot for assessing how closely two data sets agree, or how closely a data set fits a particular model. It works by plotting the two cumulative distribution functions against each other; if they are similar, the plotted points will lie close to a straight line. This behavior is similar to that of the more widely used Q–Q plot, with which it is often confused.

Definition

A P–P plot plots two cumulative distribution functions (cdfs) against each other:^[1] given two probability distributions with cdfs F and G, it plots $(F(z),G(z))$ as z ranges from $-\infty$ to $\infty.$ As a cdf has range [0,1], the domain of this parametric graph is $(-\infty ,\infty )$ and the range is the unit square $[0,1]\times [0,1].$

Thus for input z the output is the pair of numbers giving what percentage of the distribution F and what percentage of the distribution G fall at or below z.

The comparison line is the 45° line from (0,0) to (1,1), and the distributions are equal if and only if the plot falls on this line. The degree of deviation makes it easy to visually identify how different the distributions are, but because of sampling error, even samples drawn from identical distributions will not appear identical.^[2]
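The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `pp_points` and `max_deviation`, the grid `zs`, and the choice of normal distributions are all hypothetical and are not part of the definition.

```python
from statistics import NormalDist

def pp_points(F, G, zs):
    """Return the parametric P-P curve (F(z), G(z)) for each z in zs."""
    return [(F(z), G(z)) for z in zs]

def max_deviation(points):
    """Largest vertical distance between the curve and the 45-degree line."""
    return max(abs(g - f) for f, g in points)

# Hypothetical grid standing in for z ranging over the real line.
zs = [i / 10 for i in range(-60, 61)]

# Identical distributions: every point lies on the 45-degree line.
same = pp_points(NormalDist(0, 1).cdf, NormalDist(0, 1).cdf, zs)

# Assumed example of a small location shift: the curve leaves the line.
shifted = pp_points(NormalDist(0, 1).cdf, NormalDist(0.5, 1).cdf, zs)

print(max_deviation(same))
print(max_deviation(shifted))
```

The maximum vertical deviation is used here only as a convenient numerical stand-in for the visual comparison against the 45° line.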

Example


As an example, if the two distributions do not overlap, say F lies entirely below G, then the P–P plot moves from left to right along the bottom of the square: as z moves through the support of F, the cdf of F goes from 0 to 1 while the cdf of G stays at 0. It then moves up the right side of the square: the cdf of F is now 1, since all points of F lie below all points of G, and the cdf of G goes from 0 to 1 as z moves through the support of G.
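In place of a graph, the path described in this example can be traced numerically. The sketch below assumes two uniform distributions on disjoint intervals, chosen purely for illustration; `uniform_cdf` is a hypothetical helper.

```python
def uniform_cdf(a, b):
    """Return the cdf of a uniform distribution on [a, b]."""
    def cdf(z):
        if z <= a:
            return 0.0
        if z >= b:
            return 1.0
        return (z - a) / (b - a)
    return cdf

# Assumed example: F is supported on [0, 1], G on [2, 3], so F lies below G.
F = uniform_cdf(0.0, 1.0)
G = uniform_cdf(2.0, 3.0)

# z from -1.0 to 4.0 in steps of 0.25.
zs = [i / 4 for i in range(-4, 17)]
path = [(F(z), G(z)) for z in zs]

# Every point is on the bottom edge (G = 0) or the right edge (F = 1).
on_bottom_or_right = all(g == 0.0 or f == 1.0 for f, g in path)
print(on_bottom_or_right)
```

The curve never enters the interior of the unit square, which is why such a plot says little beyond the fact that the distributions are separated.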

Use


As the above example illustrates, if two distributions are separated in space, the P–P plot gives very little information; it is only useful for comparing probability distributions that have nearby or equal location. Notably, it will pass through the point (1/2, 1/2) if and only if the two distributions have the same median.
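The median property can be checked with a short sketch. The three normal distributions below are assumptions chosen for illustration: two share a median but differ in spread, and the third has a different median.

```python
from statistics import NormalDist

F = NormalDist(mu=0.0, sigma=1.0)
G = NormalDist(mu=0.0, sigma=2.0)  # same median as F, different spread
H = NormalDist(mu=1.0, sigma=1.0)  # different median from F

# Evaluate both cdfs at the median of F.
z = F.inv_cdf(0.5)
point_same_median = (F.cdf(z), G.cdf(z))
point_diff_median = (F.cdf(z), H.cdf(z))

print(point_same_median)
print(point_diff_median)
```

In the first pair both coordinates equal 1/2, so the curve of F against G passes through the centre of the square; in the second pair the G-coordinate falls well below 1/2.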

P–P plots are sometimes limited to comparisons between two samples, rather than comparison of a sample to a theoretical model distribution.^[3] However, they are of general use, particularly where observations are not all modelled with the same distribution.

P–P plots have nonetheless found some use in comparing a sample distribution with a known theoretical distribution. Given n samples, plotting the continuous theoretical cdf against the empirical cdf would yield a stairstep (a step each time z hits a sample), and would hit the top of the square when the last data point was reached. Instead one plots only points, plotting the kth observed point (in order; formally, the kth order statistic) against the k/(n + 1) quantile of the theoretical distribution.^[3] This choice of "plotting position" (choice of quantile of the theoretical distribution) has occasioned less controversy than the choice for Q–Q plots. The resulting goodness of fit of the 45° line gives a measure of the difference between a sample set and the theoretical distribution.
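A minimal sketch of the point-based construction follows. It works on the probability scale, which is an assumption about how to read the description above: the plotting position k/(n + 1) is paired with the theoretical cdf evaluated at the kth order statistic, so both coordinates lie in the unit square. The function name `pp_plot_points` and the sample values are made up for illustration.

```python
from statistics import NormalDist

def pp_plot_points(sample, cdf):
    """Pair plotting position k/(n + 1) with cdf(x_(k)) for each order statistic."""
    xs = sorted(sample)  # order statistics x_(1) <= ... <= x_(n)
    n = len(xs)
    return [(k / (n + 1), cdf(x)) for k, x in enumerate(xs, start=1)]

# Made-up illustrative data, compared against a standard normal model.
sample = [0.6, -1.3, 1.5, -0.4, 0.1]
points = pp_plot_points(sample, NormalDist(0, 1).cdf)

# Largest vertical distance from the 45-degree line.
max_gap = max(abs(p - f) for p, f in points)

for p, f in points:
    print(round(p, 3), round(f, 3))
```

Using k/(n + 1) keeps every plotting position strictly between 0 and 1, so no point is forced onto the edge of the square.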

A P–P plot can be used as a graphical adjunct to tests of the fit of probability distributions,^[4]^[5] with additional lines included on the plot to indicate either specific acceptance regions or the range of expected departure from the 1:1 line. An improved version of the P–P plot, called the SP or S–P plot, is available,^[4]^[5] which makes use of a variance-stabilizing transformation to create a plot on which the variations about the 1:1 line should be the same at all locations.
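The text does not say which variance-stabilizing transformation is used. The sketch below assumes the arcsine square-root transform, applied to both coordinates of each P–P point; this choice, the helper names `stabilize` and `sp_points`, and the input values are all assumptions made for illustration.

```python
import math

def stabilize(p):
    """Arcsine square-root transform, mapping [0, 1] onto [0, 1] (assumed form)."""
    return (2.0 / math.pi) * math.asin(math.sqrt(p))

def sp_points(pp_points):
    """Apply the transform to both coordinates of each P-P point."""
    return [(stabilize(p), stabilize(f)) for p, f in pp_points]

# Made-up P-P points (plotting position, theoretical cdf value).
pp = [(0.1, 0.08), (0.3, 0.35), (0.5, 0.52), (0.7, 0.66), (0.9, 0.93)]

for r, s in sp_points(pp):
    print(round(r, 3), round(s, 3))
```

The transform fixes 0, 1/2 and 1, so the 1:1 reference line is unchanged, while points near the corners of the square are spread out relative to those near the centre.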

References

Citations


1. Gibbons, Jean Dickinson; Chakraborti, Subhabrata (9 May 2003) [1970]. Nonparametric Statistical Inference. Statistics: A Series of Textbooks and Monographs (Fourth Edition: Revised and Expanded ed.). New York: CRC Press. p. 145. ISBN 978-0824740528. LCCN 2004266776. OCLC 54454874. OL 3317690M.
2. Derrick, B; Toher, D; White, P (2016). "Why Welch's test is Type I error robust". The Quantitative Methods for Psychology. 12 (1): 30–38. doi:10.20982/tqmp.12.1.p030.
3. Testing for Normality, by Henry C. Thode, CRC Press, 2002, ISBN 978-0-8247-9613-6, Section 2.2.3, Percent–percent plots, p. 23.
4. Michael, J.R. (1983). "The stabilized probability plot". Biometrika, 70(1), 11–17. JSTOR 2335939.
5. Shorack, G.R.; Wellner, J.A. (1986). Empirical Processes with Applications to Statistics, Wiley. ISBN 0-471-86725-X. pp. 248–250.

Sources


Davidson, Russell; MacKinnon, James (January 1998). "Graphical Methods for Investigating the Size and Power of Hypothesis Tests". The Manchester School. 66 (1): 1–26. CiteSeerX 10.1.1.57.4335. doi:10.1111/1467-9957.00086.


Retrieved from "https://en.wikipedia.org/w/index.php?title=P–P_plot&oldid=1286931055"
