In Bayesian statistics, a credible interval is an interval used to characterize a probability distribution. It is defined such that an unobserved parameter value has a particular probability γ to fall within it. For example, in an experiment that determines the distribution of possible values of the parameter μ, if the probability that μ lies between 35 and 45 is 0.95, then 35 ≤ μ ≤ 45 is a 95% credible interval.
Credible intervals are typically used to characterize posterior probability distributions or predictive probability distributions.[1] Their generalization to disconnected or multivariate sets is called a credible set or credible region.
Credible intervals are a Bayesian analog to confidence intervals in frequentist statistics.[2] The two concepts arise from different philosophies:[3] Bayesian intervals treat their bounds as fixed and the estimated parameter as a random variable, whereas frequentist confidence intervals treat their bounds as random variables and the parameter as a fixed value. Also, Bayesian credible intervals use (and indeed, require) knowledge of the situation-specific prior distribution, while frequentist confidence intervals do not.
Credible sets are not unique, as any given probability distribution has an infinite number of γ-credible sets, i.e. sets of probability γ. For example, in the univariate case, there are multiple definitions for a suitable interval or set:
The smallest credible interval (SCI), sometimes also called the highest density interval. This interval necessarily contains the median whenever γ ≥ 0.5. When the distribution is unimodal, this interval also contains the mode.
The smallest credible set (SCS), sometimes also called the highest density region. For a multimodal distribution, this is not necessarily an interval, as it can be disconnected. This set always contains the mode.
A quantile-based credible interval, which is computed by taking an inter-quantile interval for some predefined γ. For instance, the median credible interval (MCI) of probability γ is the interval for which the probability of being below the interval is as likely as being above it, that is to say the interval [q_{(1−γ)/2}, q_{(1+γ)/2}]. It is sometimes also called the equal-tailed interval, and it always contains the median. Other quantile-based credible intervals can be defined, such as the lowest credible interval (LCI), which is [q_0, q_γ], or the highest credible interval (HCI), which is [q_{1−γ}, q_1]. These intervals may be better suited for bounded variables.
One may also define an interval for which the mean is the central point, assuming that the mean exists.
γ-smallest credible sets (γ-SCS) can easily be generalized to the multivariate case, where they are bounded by probability density contour lines.[4] They always contain the mode, but not necessarily the mean, the coordinate-wise median, or the geometric median.
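As an illustrative sketch (not from the source, and with a hypothetical posterior), the smallest credible interval can be approximated from posterior draws by scanning every interval that contains a γ fraction of the sorted samples and keeping the narrowest one:

```python
import random

random.seed(0)
# Hypothetical skewed posterior: 50,000 draws from an Exponential(1)
# distribution, where the SCI differs visibly from the equal-tailed interval.
draws = sorted(random.expovariate(1.0) for _ in range(50_000))

def smallest_credible_interval(sorted_draws, gamma=0.95):
    """Among all intervals covering a gamma fraction of the sorted draws,
    return the narrowest one (an empirical highest density interval)."""
    n = len(sorted_draws)
    k = int(gamma * n)  # number of draws the interval must cover
    start = min(range(n - k), key=lambda i: sorted_draws[i + k] - sorted_draws[i])
    return sorted_draws[start], sorted_draws[start + k]

lo, hi = smallest_credible_interval(draws)
print(f"95% smallest credible interval: [{lo:.3f}, {hi:.3f}]")
```

Because the exponential density is strictly decreasing, the interval hugs zero on the left, illustrating that the SCI always contains the mode.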
Credible intervals can also be estimated through the use of simulation techniques such as Markov chain Monte Carlo.[5]
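As a hedged sketch of this simulation approach (the Normal(40, 2.5) posterior and all numbers are hypothetical stand-ins for real MCMC output), an equal-tailed credible interval can be read directly off sorted posterior draws:

```python
import random

random.seed(42)
# Pretend an MCMC sampler returned 100,000 draws from a Normal(40, 2.5)
# posterior for the parameter of interest (hypothetical values).
draws = sorted(random.gauss(40.0, 2.5) for _ in range(100_000))

def equal_tailed_interval(sorted_draws, gamma=0.95):
    """Equal-tailed (median) credible interval from sorted posterior draws:
    cut off (1 - gamma)/2 probability mass in each tail."""
    n = len(sorted_draws)
    lo = sorted_draws[int(n * (1 - gamma) / 2)]
    hi = sorted_draws[int(n * (1 + gamma) / 2) - 1]
    return lo, hi

lo, hi = equal_tailed_interval(draws)
print(f"95% equal-tailed credible interval: [{lo:.2f}, {hi:.2f}]")
```

With enough draws this converges to the analytic interval 40 ± 1.96 × 2.5, i.e. roughly [35.1, 44.9].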
A frequentist 95% confidence interval means that with a large number of repeated samples, 95% of such calculated confidence intervals would include the true value of the parameter. In frequentist terms, the parameter is fixed (it cannot be considered to have a distribution of possible values) and the confidence interval is random (as it depends on the random sample).
Bayesian credible intervals differ from frequentist confidence intervals by two major aspects:
Credible intervals are intervals whose values have a (posterior) probability density, representing the plausibility that the parameter has those values, whereas confidence intervals regard the population parameter as fixed and therefore not the object of probability. Within confidence intervals, "confidence" refers to the randomness of the confidence interval itself under repeated trials, whereas credible intervals quantify the uncertainty of the target parameter given the data at hand.
Credible intervals and confidence intervals treat nuisance parameters in radically different ways.
For the case of a single parameter and data that can be summarised in a single sufficient statistic, it can be shown that the credible interval and the confidence interval coincide if the unknown parameter is a location parameter (i.e. the forward probability function has the form p(x|μ) = f(x − μ)), with a prior that is a uniform flat distribution;[6] and also if the unknown parameter is a scale parameter (i.e. the forward probability function has the form p(x|s) = f(x/s)/s), with a Jeffreys' prior p(s) ∝ 1/s[6], the latter following because taking the logarithm of such a scale parameter turns it into a location parameter with a uniform distribution. But these are distinctly special (albeit important) cases; in general no such equivalence can be made.
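The location-parameter case can be checked numerically. In this sketch (all numbers hypothetical), the flat-prior posterior for the mean μ of a Normal with known σ is Normal(x̄, σ/√n), so the 95% credible interval x̄ ± 1.96 σ/√n is algebraically identical to the frequentist confidence interval, and repeated sampling shows it attains ~95% frequentist coverage:

```python
import math
import random

random.seed(1)

mu_true, sigma, n = 40.0, 2.5, 25  # hypothetical true location and known scale
z = 1.959963984540054              # 97.5% standard-normal quantile
trials = 2_000
covered = 0

for _ in range(trials):
    xbar = sum(random.gauss(mu_true, sigma) for _ in range(n)) / n
    half = z * sigma / math.sqrt(n)
    # Under a flat prior the posterior is Normal(xbar, sigma/sqrt(n)), so the
    # 95% equal-tailed credible interval is xbar +/- half, which is exactly
    # the frequentist 95% confidence interval for this model.
    if xbar - half <= mu_true <= xbar + half:
        covered += 1

coverage = covered / trials
print(f"Empirical frequentist coverage of the credible interval: {coverage:.3f}")
```

The observed coverage lands near 0.95, illustrating the coincidence; for models without this location/scale structure no such agreement is guaranteed.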
^ O'Hagan, A. (1994). Kendall's Advanced Theory of Statistics, Vol. 2B: Bayesian Inference, Section 2.51. Arnold. ISBN 0-340-52922-9.
^ Chen, Ming-Hui; Shao, Qi-Man (1 March 1999). "Monte Carlo Estimation of Bayesian Credible and HPD Intervals". Journal of Computational and Graphical Statistics. 8 (1): 69–92. doi:10.1080/10618600.1999.10474802.
^ a b Jaynes, E. T. (1976). "Confidence Intervals vs Bayesian Intervals", in Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science (W. L. Harper and C. A. Hooker, eds.). Dordrecht: D. Reidel. pp. 175 et seq.
Bolstad, William M.; Curran, James M. (2016). "Comparing Bayesian and Frequentist Inferences for Mean". Introduction to Bayesian Statistics (Third ed.). John Wiley & Sons. pp. 237–253. ISBN 978-1-118-09156-2.