
Posterior probability

From Wikipedia, the free encyclopedia
Conditional probability used in Bayesian statistics

The posterior probability is a type of conditional probability that results from updating the prior probability with information summarized by the likelihood via an application of Bayes' rule.[1] From an epistemological perspective, the posterior probability contains everything there is to know about an uncertain proposition (such as a scientific hypothesis, or parameter values), given prior knowledge and a mathematical model describing the observations available at a particular time.[2] After the arrival of new information, the current posterior probability may serve as the prior in another round of Bayesian updating.[3]

In the context of Bayesian statistics, the posterior probability distribution usually describes the epistemic uncertainty about statistical parameters conditional on a collection of observed data. From a given posterior distribution, various point and interval estimates can be derived, such as the maximum a posteriori (MAP) estimate or the highest posterior density interval (HPDI).[4] But while conceptually simple, the posterior distribution is generally intractable and therefore needs to be approximated either analytically or numerically.[5]

Definition in the distributional case


In Bayesian statistics, the posterior probability is the probability distribution of the parameters $\theta$ given the evidence $X$, and is denoted $p(\theta \mid X)$.

It contrasts with the likelihood function, which is the probability of the evidence given the parameters: $p(X \mid \theta)$.

The two are related as follows:

Given a prior belief that a probability distribution function is $p(\theta)$ and that the observations $x$ have a likelihood $p(x \mid \theta)$, the posterior probability is defined as

$p(\theta \mid x) = {\frac {p(x \mid \theta)\,p(\theta)}{p(x)}}$,[6]

where $p(x)$ is the normalizing constant, calculated as

$p(x) = \int p(x \mid \theta)\,p(\theta)\,d\theta$

for continuous $\theta$, or by summing $p(x \mid \theta)\,p(\theta)$ over all possible values of $\theta$ for discrete $\theta$.[7]

The posterior probability is therefore proportional to the product Likelihood · Prior probability.[8]
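To make this relationship concrete, the following minimal Python sketch (assuming a hypothetical biased-coin setting that is not part of the article) evaluates prior × likelihood on a discrete grid of parameter values and normalizes by their sum, which plays the role of the evidence $p(x)$:

    import numpy as np

    # Hypothetical example: infer a coin's heads-probability theta after
    # observing 7 heads in 10 flips, using a uniform prior on a grid.
    theta = np.linspace(0.01, 0.99, 99)        # candidate parameter values
    prior = np.ones_like(theta) / theta.size   # uniform prior p(theta)

    heads, flips = 7, 10
    likelihood = theta**heads * (1 - theta)**(flips - heads)  # p(x | theta)

    # The posterior is proportional to likelihood * prior; the sum over
    # the grid approximates the evidence p(x) and normalizes the result.
    unnormalized = likelihood * prior
    posterior = unnormalized / unnormalized.sum()

    print(theta[np.argmax(posterior)])   # posterior mode, here 0.7

Because the normalizing constant does not depend on $\theta$, dropping it leaves the shape of the posterior unchanged, which is why the proportionality statement above suffices for many computations.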

Example


Suppose a school's students are 60% boys and 40% girls. The girls wear trousers or skirts in equal numbers; all boys wear trousers. An observer sees a (random) student from a distance; all the observer can see is that this student is wearing trousers. What is the probability this student is a girl? The correct answer can be computed using Bayes' theorem.

The event $G$ is that the student observed is a girl, and the event $T$ is that the student observed is wearing trousers. To compute the posterior probability $P(G \mid T)$, we first need to know:

  • $P(G)$, or the probability that the student is a girl regardless of any other information. Since the observer sees a random student, meaning that all students have the same probability of being observed, and the percentage of girls among the students is 40%, this probability equals 0.4.
  • $P(B)$, or the probability that the student is not a girl (i.e. a boy) regardless of any other information ($B$ is the complementary event to $G$). This is 60%, or 0.6.
  • $P(T \mid G)$, or the probability of the student wearing trousers given that the student is a girl. As girls are as likely to wear skirts as trousers, this is 0.5.
  • $P(T \mid B)$, or the probability of the student wearing trousers given that the student is a boy. This is given as 1.
  • $P(T)$, or the probability of a (randomly selected) student wearing trousers regardless of any other information. Since $P(T) = P(T \mid G)P(G) + P(T \mid B)P(B)$ (via the law of total probability), this is $P(T) = 0.5 \times 0.4 + 1 \times 0.6 = 0.8$.

Given all this information, the posterior probability of the observer having spotted a girl given that the observed student is wearing trousers can be computed by substituting these values in the formula:

$P(G \mid T) = {\frac {P(T \mid G)\,P(G)}{P(T)}} = {\frac {0.5 \times 0.4}{0.8}} = 0.25.$

An intuitive way to solve this is to assume the school has $N$ students. The number of boys is 0.6N and the number of girls is 0.4N. If $N$ is large enough that rounding errors can be ignored, the total number of trouser wearers is 0.6N + 50% of 0.4N, and the number of girls wearing trousers is 50% of 0.4N. Therefore, among trouser wearers, the proportion of girls is (50% of 0.4N)/(0.6N + 50% of 0.4N) = 25%. In other words, if you separated out the group of trouser wearers, a quarter of that group would be girls. Therefore, if you see trousers, the most you can deduce is that you are looking at a single sample from a subset of students of which 25% are girls; by definition, the chance of this random student being a girl is 25%. Every Bayes'-theorem problem can be solved in this way.[9]
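The same arithmetic can be checked with a few lines of Python (a direct transcription of the numbers above, not code from the cited source):

    # School example: P(girl | trousers) via Bayes' theorem.
    p_girl = 0.4            # P(G): 40% of students are girls
    p_boy = 0.6             # P(B): complement of P(G)
    p_trousers_girl = 0.5   # P(T|G): girls wear trousers half the time
    p_trousers_boy = 1.0    # P(T|B): all boys wear trousers

    # Law of total probability: P(T) = P(T|G)P(G) + P(T|B)P(B)
    p_trousers = p_trousers_girl * p_girl + p_trousers_boy * p_boy  # 0.8

    # Bayes' theorem: P(G|T) = P(T|G)P(G) / P(T)
    print(p_trousers_girl * p_girl / p_trousers)   # 0.25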

Calculation


The posterior probability distribution of one random variable given the value of another can be calculated with Bayes' theorem by multiplying the prior probability distribution by the likelihood function, and then dividing by the normalizing constant, as follows:

$f_{X \mid Y=y}(x) = {\frac {f_{X}(x)\,{\mathcal {L}}_{X \mid Y=y}(x)}{\int_{-\infty}^{\infty} f_{X}(u)\,{\mathcal {L}}_{X \mid Y=y}(u)\,du}}$

gives the posterior probability density function for a random variable $X$ given the data $Y = y$, where

  • $f_{X}(x)$ is the prior density of $X$,
  • ${\mathcal {L}}_{X \mid Y=y}(x) = f_{Y \mid X=x}(y)$ is the likelihood function as a function of $x$,
  • $\int_{-\infty}^{\infty} f_{X}(u)\,{\mathcal {L}}_{X \mid Y=y}(u)\,du$ is the normalizing constant, and
  • $f_{X \mid Y=y}(x)$ is the posterior density of $X$ given the data $Y = y$.[10]
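As a minimal numerical sketch (assuming, purely for illustration, a Normal prior on $X$ and a single observation $y$ drawn from a Normal distribution centred at $x$), the density above can be approximated on a grid in Python:

    import numpy as np
    from scipy.stats import norm

    # Hypothetical model: X ~ N(0, 2^2) a priori; given X = x, Y ~ N(x, 1).
    x = np.linspace(-6.0, 6.0, 2001)
    dx = x[1] - x[0]
    prior = norm.pdf(x, loc=0.0, scale=2.0)      # f_X(x)
    y = 1.5
    likelihood = norm.pdf(y, loc=x, scale=1.0)   # L_{X|Y=y}(x) = f_{Y|X=x}(y)

    # Normalizing constant: the integral of f_X(u) L(u) du, here
    # approximated by a simple Riemann sum over the grid.
    evidence = np.sum(prior * likelihood) * dx
    posterior = prior * likelihood / evidence    # f_{X|Y=y}(x)

    print(np.sum(posterior) * dx)   # ~1.0: the posterior is a proper density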

Credible interval


The posterior probability is a conditional probability conditioned on randomly observed data; hence it is a random variable. For a random variable, it is important to summarize its amount of uncertainty. One way to achieve this goal is to provide a credible interval of the posterior probability.[11]
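For example, an equal-tailed credible interval can be read off from draws of the posterior distribution. The sketch below assumes a hypothetical Beta(8, 4) posterior (e.g. from 7 successes and 3 failures under a uniform prior) and takes the central 95% of the samples:

    import numpy as np

    # Draws from a hypothetical Beta(8, 4) posterior.
    rng = np.random.default_rng(0)
    samples = rng.beta(8, 4, size=100_000)

    # Equal-tailed 95% credible interval: central 95% of posterior mass.
    lower, upper = np.percentile(samples, [2.5, 97.5])
    print(f"95% credible interval: [{lower:.3f}, {upper:.3f}]")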

Classification


In classification, posterior probabilities reflect the uncertainty of assigning an observation to a particular class; see also class-membership probabilities. While statistical classification methods by definition generate posterior probabilities, machine learning models usually supply membership values that do not induce any probabilistic confidence. It is desirable to transform or rescale membership values into class-membership probabilities, since these are comparable and additionally more easily applicable for post-processing; one such rescaling is sketched below.[12]
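The following Python sketch shows one common rescaling: mapping raw per-class scores to nonnegative values that sum to one via a softmax. This is only illustrative; properly calibrated class-membership probabilities are usually obtained by fitting a mapping such as Platt scaling or isotonic regression on held-out data, and the scores here are hypothetical:

    import numpy as np

    def softmax(scores: np.ndarray) -> np.ndarray:
        # Map raw scores to values that are nonnegative and sum to 1.
        z = scores - scores.max()   # shift by the max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    raw_scores = np.array([2.0, 0.5, -1.0])   # hypothetical per-class scores
    print(softmax(raw_scores))                # approx. [0.786, 0.175, 0.039]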


References

  1. ^ Lambert, Ben (2018). "The posterior – the goal of Bayesian inference". A Student's Guide to Bayesian Statistics. Sage. pp. 121–140. ISBN 978-1-4739-1636-4.
  2. ^ Grossman, Jason (2005). Inferences from observations to simple statistical hypotheses (PhD thesis). University of Sydney. hdl:2123/9107.
  3. ^ Etz, Alex (2015-07-25). "Understanding Bayes: Updating priors via the likelihood". The Etz-Files. Retrieved 2022-08-18.
  4. ^ Gill, Jeff (2014). "Summarizing Posterior Distributions with Intervals". Bayesian Methods: A Social and Behavioral Sciences Approach (Third ed.). Chapman & Hall. pp. 42–48. ISBN 978-1-4398-6248-3.
  5. ^ Press, S. James (1989). "Approximations, Numerical Methods, and Computer Programs". Bayesian Statistics: Principles, Models, and Applications. New York: John Wiley & Sons. pp. 69–102. ISBN 0-471-63729-7.
  6. ^ Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning. Springer. pp. 21–24. ISBN 978-0-387-31073-2.
  7. ^ Gelman, Andrew; Carlin, John B.; Stern, Hal S.; Dunson, David B.; Vehtari, Aki; Rubin, Donald B. (2014). Bayesian Data Analysis. CRC Press. p. 7. ISBN 978-1-4398-4095-5.
  8. ^ Ross, Kevin. Chapter 8 Introduction to Continuous Prior and Posterior Distributions | An Introduction to Bayesian Reasoning and Methods.
  9. ^ "Bayes' theorem - C o r T e x T". sites.google.com. Retrieved 2022-08-18.
  10. ^ "Posterior probability - formulasearchengine". formulasearchengine.com. Retrieved 2022-08-19.
  11. ^ Clyde, Merlise; Çetinkaya-Rundel, Mine; Rundel, Colin; Banks, David; Chai, Christine; Huang, Lizzy. Chapter 1 The Basics of Bayesian Statistics | An Introduction to Bayesian Thinking.
  12. ^ Boedeker, Peter; Kearns, Nathan T. (2019-07-09). "Linear Discriminant Analysis for Prediction of Group Membership: A User-Friendly Primer". Advances in Methods and Practices in Psychological Science. 2 (3): 250–263. doi:10.1177/2515245919849378. ISSN 2515-2459. S2CID 199007973.

Further reading

  • Lancaster, Tony (2004). An Introduction to Modern Bayesian Econometrics. Oxford: Blackwell. ISBN 1-4051-1720-6.
  • Lee, Peter M. (2004). Bayesian Statistics: An Introduction (3rd ed.). Wiley. ISBN 0-340-81405-5.